“Obtainium” is slang for found materials used to create works of art or other made objects. It is just what the artist needs, and is ideally free. The same concept exists in the digital (or virtual) twin of our world, and in this post we describe a simple example which, ideally, helps us understand why we make things (digital or otherwise).
“Digital Obtainium” is already widely available to coders. If I’m looking for how to use a particular Python method then a web search of the method name plus “examples” turns up what I need. Similarly with JavaScript, Java, or any other language you want to use. The coding community deserves a large amount of thanks for setting up this sharing ecosystem.
However, the further you climb up the slopes of “Mount Science”, the scarcer the digital obtainium becomes (which is probably why many of us choose to live at those elevations). There is also the issue of assembling the found bits into the artwork, and here language-neutral developer tools are a must.
In what follows we describe our assembly of a digital artefact which allows a formulation or process scientist to understand and optimize both the mixture and process parts of their product, in the most effective way possible.
KCV Models
The first chunk of obtainium is the work by Kowalski, Cornell and Vining [1], who proposed a new set of mixture-process models which are much simpler than those used previously. This is worthy of an entire blog post on its own, but the essence is that many of the model cross-terms are left out, and the value is that you can fit the model from far fewer experiments (often fewer than half as many). Saving experimentation is at the core of the value of the digital twin.
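To get a feel for the saving, here is a minimal Python sketch that counts model terms, which sets the minimum number of runs needed to fit each model. It assumes the full model crosses every Scheffé quadratic mixture term with the process intercept, main effects and two-factor interactions, and that the KCV form keeps the Scheffé quadratic terms, the process two-factor interactions and the linear-mixture by process-main-effect crosses. That reading of the model forms is our assumption and should be checked against [1] and [2]; the function names are purely illustrative.

```python
from math import comb


def scheffe_quadratic_terms(q):
    """Terms in a Scheffe quadratic mixture model: q linear + C(q, 2) blending."""
    return q + comb(q, 2)


def full_crossed_terms(q, n):
    """Assumed full crossed model: every mixture term times every process term.

    The process side is taken as intercept + n main effects + C(n, 2)
    two-factor interactions.
    """
    return scheffe_quadratic_terms(q) * (1 + n + comb(n, 2))


def kcv_terms(q, n, quadratic_process=False):
    """Assumed KCV-style model: mixture terms + process interactions + x_i * z_k.

    Set quadratic_process=True to also count the n pure quadratic
    process terms.
    """
    terms = scheffe_quadratic_terms(q) + comb(n, 2) + q * n
    if quadratic_process:
        terms += n
    return terms


if __name__ == "__main__":
    q, n = 3, 2  # three mixture components, two process variables
    print("full crossed:", full_crossed_terms(q, n))
    print("KCV-style   :", kcv_terms(q, n))
```

Under these assumptions, three mixture components and two process variables give 24 terms for the full crossed model against 13 for the KCV-style model. Real designs need more runs than terms, so these are lower bounds rather than actual design sizes.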
Now most of the commercial DOE software providers have already included these designs and models in their software, but a key property of obtainium, that the material be free, is thankfully met by the R “mixexp” package [2].
The package also has model-fitting routines for the KCV forms, and allows you to make mixture-specific response-surface plots such as the one below:
The authors of [2] also provide some example data sets (another chunk of obtainium), which we’ve used in testing our code. The “Burn” data set from [2] comes from making a rocket propellant using combustible materials and a twin-screw extruder. It is inherently a process where both mixture and process factors are important, and interact.
The Artwork
What do we want our digital artwork to look like, and to do? We don’t simply want to re-run existing code. We want to create an object which is “alive” in some way. Currently, the most compelling way to breathe life into a digital object is to imbue it with Artificial Intelligence (AI). This fits nicely with what we are trying to create, in that we can use machine learning to create a “model”, which is artificially intelligent. It can predict the performance (e.g. burn rate) of a formulation which only exists in silico.
The tool of choice for this artist in doing the above is Pipeline Pilot. This is a data pipelining framework which uses components arranged in pipelines to accomplish specific tasks. It is language-agnostic, as we have discussed in an earlier blog. It fits well with data science and machine learning tasks; indeed, a model is simply another component which is added to the system library available for re-use.
This is captured at a high level in the following screenshot, where the simplicity of the component model for data science becomes clear. Inside the selected “Learner” component there is a lot of clever R-script, most of it coming from [2]. However, components “encapsulate” these details, and can be used without having to worry about them.
Glue and Solder
So is that it? Are we done? No, on a couple of counts. First, the digital object should be usable by a formulator, not just a data scientist. Luckily, Pipeline Pilot also functions as a “RAD” (rapid application development) tool, so a web app is easily created.
More instructive, however, is the final part of the assembly. The KCV model is a deliberate simplification of a full crossed mixture-process model. Accordingly, there is a chance that it suffers from “Lack of Fit” (the model form just isn’t flexible enough to pass close to all of the data points) and we need to convey the LOF-statistic to the end-user so that they can judge the validity of the model.
More digital obtainium, surely? But no! There is no ready-made LOF function in base R. This key bit of the artefact must be built by hand. In the end this can be done with a few dozen lines of R-script.
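To show what such a hand-built piece involves, here is a minimal sketch of the classical lack-of-fit F-statistic. It is written in Python rather than R and is not the script used in our component. It assumes the design contains replicated runs, that replicates share a group label and a fitted value, and that the number of fitted model parameters is known. The function and argument names are hypothetical.

```python
from collections import defaultdict


def lack_of_fit(y, fitted, groups, n_params):
    """Classical lack-of-fit F-statistic from replicated design points.

    y        : observed responses
    fitted   : model predictions (identical within a replicate group)
    groups   : label identifying the design point of each run
    n_params : number of parameters in the fitted model
    """
    by_group = defaultdict(list)
    for yi, g in zip(y, groups):
        by_group[g].append(yi)
    means = {g: sum(v) / len(v) for g, v in by_group.items()}

    # Pure error: scatter of replicates around their own group mean.
    ss_pe = sum((yi - means[g]) ** 2 for yi, g in zip(y, groups))
    # Lack of fit: distance between group means and the model prediction.
    ss_lof = sum((means[g] - fi) ** 2 for fi, g in zip(fitted, groups))

    df_pe = len(y) - len(means)
    df_lof = len(means) - n_params
    if df_pe <= 0 or df_lof <= 0:
        raise ValueError("need replicates and more design points than parameters")
    if ss_pe == 0:
        raise ValueError("zero pure error: F-statistic is undefined")

    f_stat = (ss_lof / df_lof) / (ss_pe / df_pe)
    return {"ss_lof": ss_lof, "ss_pe": ss_pe,
            "df_lof": df_lof, "df_pe": df_pe, "F": f_stat}


if __name__ == "__main__":
    # Made-up data: three design points, two replicates each, straight-line fit.
    y = [0.0, 2.0, 3.0, 5.0, 3.0, 5.0]
    fitted = [1.5, 1.5, 3.0, 3.0, 4.5, 4.5]
    groups = ["a", "a", "b", "b", "c", "c"]
    print(lack_of_fit(y, fitted, groups, n_params=2))
```

The resulting F value is compared against an F distribution with (df_lof, df_pe) degrees of freedom; a large value suggests the model form is too rigid for the data.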
Having to do this coding work in fact makes the whole process of assembling your artefact more satisfying rather than less. That is simply because we make things (art, code, whatever) in order to encapsulate a small part of ourselves within those things. Purely assembling obtainium is nice, but much nicer is actually making part of the artwork yourself.
References
[1] Kowalski SM, Cornell JA, Vining GG (2000). “A New Model and Class of Designs for Mixture Experiments with Process Variables.” Communications in Statistics – Theory and Methods, 29, 2255–2280. doi:10.1080/03610920008832606.
[2] Lawson J, Willden C (2016). “Mixture Experiments in R Using mixexp.” Journal of Statistical Software, 72, Code Snippet 2. https://www.jstatsoft.org/index.php/jss/article/view/v072c02