The present disclosure generally relates to training a surrogate machine learning model to predict mixing quality.
Mixing is a critical step in developing drug substances. Low quality of mixing may lead to out-of-specification downstream drug substance and/or wasted capital during the operation. High levels of mixing are required to ensure downstream product quality. This is achieved by mixing inlet streams (e.g., buffer flow, retentate flow, etc.) in a tank while continuously stirring the tank at constant speed using an agitator. The resulting volume exits the tank and goes to the subsequent steps. Several configurations are usually utilized to enhance the degree of mixing, such as use of internal baffles, placing the agitator at an angle to the tank, use of tangential or radial blades, and so on. The quality of mixing, usually represented by the “standard deviation” of trace concentration in the tank and at the tank exit, is a complex function of tank geometry, working volume, inlet and outlet configuration and flow rates, agitation speed, and drug substance properties that is typically understood only after long and expensive in-situ studies. In-silico models based on computational fluid dynamics (CFD) have been used in order to reduce the cost and time associated with these studies; however, while reducing experimentation, CFD modeling is processor-intensive, time consuming, and sometimes expensive (e.g., if third party vendors are required).
The present disclosure provides a surrogate model to predict the quality of mixing at steady state condition with the ability to vary all the input parameters, without expensive and time-consuming computational fluid dynamics (CFD) simulations. Surrogate modeling is a novel predictive tool that provides an understanding of mixing qualities in agitated tanks. The surrogate model can be used to predict the best working volume and impeller speed for a certain operation and significantly reduce the characterization time. The surrogate model provided herein allows for consistent and reliable prediction of mixing characteristics of a mixing process, helping to ensure supply to every patient every time. The surrogate model also allows engineering teams to make rapid and reliable science-based decisions when selecting and recommending a mixing system for a given product, following a right first time (predict and prevent) development philosophy. Moreover, compared to time consuming and computationally intensive CFD modeling, the surrogate model provided herein can output instant results responsive to changes in process variables (flowrates, mixer RPM, fluid properties, etc.).
Advantageously, predictions using the surrogate model provided herein provide immediate insight into the mixing quality, with the flexibility of assessing any combination of variables, without expensive and time-consuming CFD or in-situ studies. Moreover, by making predictions using the surrogate model, significant time and money can be saved due to reduced third party involvement.
In an aspect, a method is provided, comprising: generating, by one or more processors, a plurality of training CFD models for a plurality of training steady state mixing configurations in which inlet streams are mixed in tanks, wherein each training CFD model is generated based on a plurality of steady state mixing factors associated with each training steady state mixing configuration; calculating, by the one or more processors, a mixing quality for each training steady state mixing configuration using each respective training CFD model; generating, by the one or more processors, a training dataset that includes the steady state mixing factors associated with each training steady state mixing configuration, and the calculated mixing quality for each training steady state mixing configuration; and training, by the one or more processors, a machine learning model, using the training dataset, to predict mixing qualities for steady state mixing configurations based on steady state mixing factors associated with the steady state mixing configurations.
In another aspect, a computer system is provided, comprising: one or more processors; and a non-transitory program memory communicatively coupled to the one or more processors and storing executable instructions that, when executed by the one or more processors, cause the processors to: generate a plurality of training CFD models for a plurality of training steady state mixing configurations in which inlet streams are mixed in tanks, wherein each training CFD model is generated based on a plurality of steady state mixing factors associated with each training steady state mixing configuration; calculate a mixing quality for each training steady state mixing configuration using each respective training CFD model; generate a training dataset that includes the steady state mixing factors associated with each training steady state mixing configuration, and the calculated mixing quality for each training steady state mixing configuration; and train a machine learning model, using the training dataset, to predict mixing qualities for steady state mixing configurations based on steady state mixing factors associated with the steady state mixing configurations.
In still another aspect, a non-transitory computer readable storage medium is provided, storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to: generate a plurality of training CFD models for a plurality of training steady state mixing configurations in which inlet streams are mixed in tanks, wherein each training CFD model is generated based on a plurality of steady state mixing factors associated with each training steady state mixing configuration; calculate a mixing quality for each training steady state mixing configuration using each respective training CFD model; generate a training dataset that includes the steady state mixing factors associated with each training steady state mixing configuration, and the calculated mixing quality for each training steady state mixing configuration; and train a machine learning model, using the training dataset, to predict mixing qualities for steady state mixing configurations based on steady state mixing factors associated with the steady state mixing configurations.
Referring now to the drawings,
The training set may then be used to calibrate (train) the model. At steady state, (number of inlets + number of outlets − 1) independent flowrates may be used as numerical features. In addition, impeller speed, tank working volume, and fluid Reynolds number may be used as numerical features. The tank and stirrer geometries may be used as categorical features. The values of the standard deviation of mean age may be used as numerical labels. Polynomial combinations of the numerical features up to order 9 may be added, and the input data may be standardized to a mean of zero and a standard deviation of one. A multi-layer perceptron (MLP) neural network may then be constructed with several possible layer sizes, learning rates, and L2 regularizer coefficients. Additionally, a grid search may be performed using a cross-validation method to find the best model parameters using the training set. Cross-validating all of the parameters on the training set helps select the most reliable calibrated model. The best validated model may then be applied to the test set to report the model quality, including the mean absolute error.
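The calibration procedure described above can be sketched as follows. This is a minimal illustration only, assuming the scikit-learn library and synthetic placeholder data rather than CFD results. The column layout, the grid values, and the use of degree 2 instead of order 9 (chosen here only to keep the feature count small) are assumptions made for the sketch, not details taken from the disclosure.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, PolynomialFeatures, StandardScaler

# Synthetic placeholder data standing in for CFD-derived records (not real results).
rng = np.random.default_rng(0)
n_cases = 120
# Hypothetical numerical features: flowrate, impeller speed, working volume, Reynolds number (all scaled to 0..1).
numeric = rng.uniform(0.0, 1.0, size=(n_cases, 4))
# Hypothetical categorical feature: a code for the tank/stirrer geometry.
geometry = rng.integers(0, 3, size=(n_cases, 1)).astype(float)
X = np.hstack([numeric, geometry])
# Made-up label standing in for the mixing-quality measure.
y = 0.5 * numeric[:, 0] - 0.3 * numeric[:, 1] ** 2 + 0.1 * geometry[:, 0]

# Hold back 10% of the cases as a test set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=0)

# Numerical columns: polynomial combinations, then zero-mean / unit-variance scaling.
numeric_steps = Pipeline([
    ("poly", PolynomialFeatures(degree=2, include_bias=False)),
    ("scale", StandardScaler()),
])
preprocess = ColumnTransformer(
    [("num", numeric_steps, [0, 1, 2, 3]),
     ("cat", OneHotEncoder(handle_unknown="ignore"), [4])],
    sparse_threshold=0.0,  # always return a dense array
)
model = Pipeline([
    ("prep", preprocess),
    ("mlp", MLPRegressor(max_iter=2000, random_state=0)),
])

# Grid over layer sizes, learning rates, and the regularization coefficient (alpha).
param_grid = {
    "mlp__hidden_layer_sizes": [(16,), (32, 16)],
    "mlp__learning_rate_init": [1e-3, 1e-2],
    "mlp__alpha": [1e-4, 1e-2],
}
search = GridSearchCV(model, param_grid, cv=3, scoring="neg_mean_absolute_error")
search.fit(X_train, y_train)

# Report the quality of the best cross-validated model on the held-out test set.
test_mae = mean_absolute_error(y_test, search.predict(X_test))
print(search.best_params_, test_mae)
```

In this sketch the grid search refits the best parameter combination on the full training set before it is scored on the test set, so the test cases play no part in model selection.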
The system 700 may include a computational fluid dynamics (CFD) computing device 702, and a surrogate model computing device 704, as well as one or more other computing devices 706 in some examples. The computing devices 702, 704, and 706 may communicate with one another via a network 708, which may be a wired or wireless network.
Generally speaking, the CFD computing device 702 may include one or more processors 710 and a memory 712 (e.g., volatile memory, non-volatile memory) accessible by the one or more processors 710 (e.g., via a memory controller). The one or more processors 710 may interact with the memory 712 to obtain, for example, computer-readable instructions stored in the memory 712. The computer-readable instructions stored in the memory 712 may cause the one or more processors 710 to execute one or more applications, including a CFD modeling application 714. Executing the CFD modeling application 714 may include receiving steady state mixing factors associated with a plurality of various steady state mixing configurations, generating CFD models for each of the steady state mixing configurations based on the steady state mixing factors, and calculating a measure of mixing quality for each steady state mixing configuration based on the CFD model for each steady state mixing configuration. Executing the CFD modeling application 714 may further include storing the determined measurements of mixing quality and CFD models for each steady state mixing configuration in a training CFD model database 716 and/or a test CFD model database 718. For instance, in some examples, 90% of the data generated by the CFD modeling application 714 may be stored in the training CFD model database 716 while 10% of the data generated by the CFD modeling application 714 is stored in the test CFD model database 718. In other examples, different percentages of the data generated by the CFD modeling application 714 may be stored in each of the databases 716 and 718, or all of the data generated by the CFD modeling application 714 may be stored in the training CFD model database 716. Furthermore, in some examples, the computer-readable instructions stored on the memory 712 may include instructions for carrying out any of the steps of the method 900, described in greater detail below with respect to
Generally speaking, the surrogate model computing device 704 may include one or more processors 720 and a memory 722 (e.g., volatile memory, non-volatile memory) accessible by the one or more processors 720 (e.g., via a memory controller). The one or more processors 720 may interact with the memory 722 to obtain, for example, computer-readable instructions stored in the memory 722. The computer-readable instructions stored in the memory 722 may cause the one or more processors 720 to execute one or more applications, including a surrogate model training application 724, a surrogate machine learning model 726, and a steady state mixing quality predictor application 728. Executing the surrogate model training application 724 may include accessing the training CFD model database 716 and using the steady state mixing factors and calculated measure of mixing quality for each steady state mixing configuration as training data to train a surrogate machine learning model 726 to predict mixing quality based on steady state mixing factors for a given steady state mixing configuration, as discussed in greater detail with respect to
In examples in which other computing devices 706 are included in the system 700, these other computing devices 706 may each include one or more processors 730 and a memory 732 (e.g., volatile memory, non-volatile memory) accessible by the one or more processors 730 (e.g., via a memory controller). The one or more processors 730 may interact with the memory 732 to obtain, for example, computer-readable instructions stored in the memory 732. The computer-readable instructions stored in the memory 732 may cause the one or more processors 730 to receive steady state mixing factors for a given steady state mixing configuration, send the steady state mixing factors for the given steady state mixing configuration to the surrogate model computing device 704, and receive a predicted measure of mixing quality from the surrogate model computing device 704 (i.e., based on the steady state mixing quality predictor application 728 of the surrogate model computing device 704 applying the trained surrogate machine learning model 726 to the steady state mixing factors from the other computing device 706). Furthermore, in some examples, the computer-readable instructions stored on the memory 732 may include instructions for carrying out any of the steps of the method 900, described in greater detail below with respect to
Now referring to
The surrogate model training application 724 can receive various input signals, including steady state mixing factors 802 for a new steady state mixing configuration (i.e., for which a mixing quality is to be predicted), as well as training data 804 generated using CFD models for a plurality of training steady state mixing configurations. The training data 804 may include training steady state mixing factors 806 for each training steady state mixing configuration, as well as training measurements of mixing quality 808 for each training steady state mixing configuration as calculated using CFD models. The steady state mixing factors 802 and 806 may include one or more of: tank geometry, stirrer geometry, working volume, inlet configuration, outlet configuration, inlet flow rates for each inlet, outlet flow rates for each outlet, agitation speed, impeller speed, fluid Reynolds number for each substance, and/or other chemical and pharmaceutical properties for each substance involved in the mixing, or any other suitable steady state mixing factors associated with each steady state mixing configuration in which inlet streams are mixed in a tank. The training measurements of mixing quality 808 calculated using the CFD models may include measures of the standard deviation of trace concentration in the tank for each steady state mixing configuration, or any other suitable measure of mixing quality for each steady state mixing configuration.
Generally speaking, the feature extraction functions 810 can operate on at least some of these input signals to generate feature vectors, or logical groupings of parameters associated with various steady state mixing factors for each steady state mixing configuration. For example, the feature extraction functions 810 may generate a feature vector that indicates that for a higher agitation speed, the result corresponds to a higher quality of mixing. As another example, the feature extraction functions 810 may generate a feature vector that indicates that when working volume is increased for substances having certain chemical or pharmaceutical properties, the result corresponds to a lower quality of mixing. These results can be used as a set of labels for the feature vectors.
Accordingly, the feature extraction functions 810 can generate feature vectors 812 using the training steady state mixing factors 806 for each training steady state mixing configuration and the training measurements of mixing quality 808 for each training steady state mixing configuration, as calculated using CFD models. In general, the surrogate model training application 724 can train the surrogate machine learning model 726 using supervised learning, unsupervised learning, reinforcement learning, or any other suitable technique. Moreover, the surrogate model training application 724 can train the surrogate machine learning model 726 as a standard regression model.
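As a rough illustration of how steady state mixing factors might be turned into a feature vector, the sketch below combines numerical factors with one-hot encoded categorical factors. The class name, field names, category lists, and units are hypothetical. The sketch also assumes that total inflow equals total outflow at steady state, so the last outlet flowrate is implied by the others and is left out, leaving (number of inlets + number of outlets − 1) independent flowrates.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical category lists; a real study would enumerate its own geometries.
TANK_GEOMETRIES = ("cylindrical", "conical")
STIRRER_GEOMETRIES = ("radial", "tangential")


@dataclass
class MixingFactors:
    """Steady state mixing factors for one configuration (illustrative fields only)."""
    inlet_flowrates: Sequence[float]
    outlet_flowrates: Sequence[float]
    impeller_speed: float
    working_volume: float
    reynolds_number: float
    tank_geometry: str
    stirrer_geometry: str


def one_hot(value: str, categories: Sequence[str]) -> List[float]:
    """Encode a categorical value as a list with a single 1.0 entry."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]


def to_feature_vector(f: MixingFactors) -> List[float]:
    """Build a flat feature vector: independent flowrates, other numerics, one-hot categories."""
    # Assumed steady-state balance: the last outlet flowrate is dependent, so it is dropped.
    independent_flows = list(f.inlet_flowrates) + list(f.outlet_flowrates)[:-1]
    numeric = independent_flows + [f.impeller_speed, f.working_volume, f.reynolds_number]
    return (numeric
            + one_hot(f.tank_geometry, TANK_GEOMETRIES)
            + one_hot(f.stirrer_geometry, STIRRER_GEOMETRIES))


example = MixingFactors(
    inlet_flowrates=[2.0, 1.0],
    outlet_flowrates=[3.0],
    impeller_speed=120.0,
    working_volume=50.0,
    reynolds_number=10000.0,
    tank_geometry="cylindrical",
    stirrer_geometry="tangential",
)
features = to_feature_vector(example)
print(features)  # [2.0, 1.0, 120.0, 50.0, 10000.0, 1.0, 0.0, 0.0, 1.0]
```

Each such vector would be paired with the CFD-calculated measure of mixing quality for the same configuration, which serves as its label during supervised training.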
Over time, as the surrogate model training application 724 trains the surrogate machine learning model 726, the trained surrogate machine learning model 726 may learn to predict a measure of mixing quality 814 for a given steady state mixing configuration based on steady state mixing factors 802 associated with the steady state mixing configuration. For instance, the steady state mixing quality predictor application 728 may receive steady state mixing factors 802 for a new steady state mixing configuration as inputs (e.g., via a user interface of the surrogate model computing device 704), and may apply the trained surrogate machine learning model 726 to the steady state mixing factors 802 for the new steady state mixing configuration. The trained surrogate machine learning model 726 may then generate a predicted measure of mixing quality 814 for the new steady state mixing configuration using the steady state mixing factors 802, and may send an indication of the predicted measure of mixing quality 814 to the steady state mixing quality predictor application 728, which may display the predicted measure of mixing quality 814 to a user, or may send the predicted measure of mixing quality 814 to another device (such as the other computing device 706), or may store the predicted measure of mixing quality 814, etc.
In some examples, a CFD model may be generated for a testing steady state mixing configuration using the steady state mixing factors of the testing steady state mixing configuration, and the CFD model may calculate a mixing quality for the testing steady state mixing configuration. The trained surrogate machine learning model 726 may then be applied to the steady state mixing factors of the testing steady state mixing configuration, and may predict a measure of mixing quality using the steady state mixing factors of the testing steady state mixing configuration. This predicted measure of mixing quality may then be compared to the measure of mixing quality calculated for the same testing steady state mixing configuration by the CFD model. Any difference between the predicted and calculated measure of mixing quality may be used in subsequent training of the surrogate machine learning model 726, i.e., for fine-tuning to improve the performance of the surrogate machine learning model 726.
The method may begin when a plurality of training CFD models are generated (block 902) for a plurality of training steady state mixing configurations in which inlet streams are mixed in tanks. Each training CFD model may be generated based on a plurality of steady state mixing factors associated with each training steady state mixing configuration. For instance, the steady state mixing factors may include one or more of: tank geometry, stirrer geometry, working volume, inlet configuration, outlet configuration, inlet flow rates for each inlet, outlet flow rates for each outlet, agitation speed, impeller speed, fluid Reynolds number for each substance, and/or other chemical and pharmaceutical properties for each substance involved in the mixing, or any other suitable steady state mixing factors associated with a given steady state mixing configuration in which inlet streams are mixed in a tank.
A mixing quality may be calculated (block 904) for each steady state mixing configuration using each respective training CFD model. For instance, the mixing quality may be a measure of the standard deviation of trace concentration in the tank.
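The standard deviation measure mentioned above can be illustrated with a short sketch. It assumes the CFD result is available as a list of tracer concentrations, one per mesh cell, optionally with cell volumes. The volume weighting and the function name are assumptions of this sketch, since the text above specifies only a standard deviation of trace concentration.

```python
import math


def mixing_quality(concentrations, volumes=None):
    """Volume-weighted standard deviation of tracer concentration across cells.

    A value near zero indicates a nearly uniform (well-mixed) tank.
    """
    if volumes is None:
        volumes = [1.0] * len(concentrations)  # equal weighting if no volumes are given
    if not concentrations or len(concentrations) != len(volumes):
        raise ValueError("concentrations and volumes must be non-empty and of equal length")
    total = sum(volumes)
    mean = sum(c * v for c, v in zip(concentrations, volumes)) / total
    variance = sum(v * (c - mean) ** 2 for c, v in zip(concentrations, volumes)) / total
    return math.sqrt(variance)


print(mixing_quality([1.0, 1.0, 1.0, 1.0]))  # 0.0 (perfectly uniform)
print(mixing_quality([0.0, 2.0]))            # 1.0 (fully segregated halves)
```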
A training dataset that includes the steady state mixing factors associated with each training steady state mixing configuration, and the calculated mixing quality for each training steady state mixing configuration may be generated (block 906).
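One possible way to assemble the dataset while holding back a test portion, following the 90%/10% example given earlier for the training and test CFD model databases, is sketched below. The record layout, the function name, and the fixed random seed are hypothetical choices for illustration.

```python
import random


def split_cfd_records(records, train_fraction=0.9, seed=0):
    """Shuffle CFD-derived records and split them into training and test lists."""
    if not 0.0 < train_fraction <= 1.0:
        raise ValueError("train_fraction must be in (0, 1]")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder records; real records would hold mixing factors and CFD-calculated quality.
records = [{"case_id": i, "mixing_quality": 0.01 * i} for i in range(20)]
training_records, test_records = split_cfd_records(records)
print(len(training_records), len(test_records))  # 18 2
```

Setting `train_fraction=1.0` places every record in the training list, which corresponds to the case in which all generated data is stored for training.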
Using the training dataset, a machine learning model may be trained (block 908) to predict mixing qualities for steady state mixing configurations based on steady state mixing factors associated with the steady state mixing configurations. In some examples, the machine learning model may be a deep learning model.
In some examples, the method 900 may additionally include applying (block 910) the trained machine learning model to new steady state mixing factors associated with a new steady state mixing configuration; and predicting (block 912) a mixing quality for the new steady state mixing configuration based on applying the trained machine learning model to the steady state mixing factors associated with the new steady state mixing configuration.
Additionally, in some examples, the method 900 may further include generating at least one testing CFD model for a testing steady state mixing configuration in which inlet streams are mixed in a tank. Like the plurality of training CFD models generated at block 902, the testing CFD model may be generated based on a plurality of steady state mixing factors associated with the testing steady state mixing configuration. A mixing quality may then be calculated for the testing steady state mixing configuration using the testing CFD model. The machine learning model trained at block 908 may then be applied to the steady state mixing factors associated with the testing steady state mixing configuration, and a quality of mixing may be predicted for the testing steady state mixing configuration using the trained machine learning model. The trained machine learning model may then be evaluated based on comparing the mixing quality calculated for the testing steady state mixing configuration using the testing CFD model to the mixing quality predicted for the testing steady state mixing configuration using the trained machine learning model. For instance, the trained machine learning model may be evaluated based on how closely the mixing quality calculated by the testing CFD model for the testing steady state mixing configuration matches the mixing quality predicted by the trained machine learning model for the testing steady state mixing configuration. In some examples the trained machine learning model may be modified based on the evaluation, e.g., if the predicted mixing quality differs from the calculated mixing quality by greater than a threshold amount.
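The evaluation step described above can be sketched as a comparison between CFD-calculated and model-predicted mixing qualities. The function name, the returned fields, and the example values are hypothetical, and the absolute-difference criterion is one assumed reading of "differs by greater than a threshold amount".

```python
def evaluate_surrogate(cfd_values, predicted_values, threshold):
    """Compare CFD-calculated and surrogate-predicted mixing qualities.

    Flags the model for modification if any prediction differs from the
    CFD value by more than the threshold.
    """
    if not cfd_values or len(cfd_values) != len(predicted_values):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(c - p) for c, p in zip(cfd_values, predicted_values)]
    return {
        "mean_absolute_error": sum(errors) / len(errors),
        "max_absolute_error": max(errors),
        "needs_modification": any(e > threshold for e in errors),
    }


# Made-up values for illustration only.
cfd = [0.10, 0.20, 0.30]
predicted = [0.11, 0.18, 0.30]
report = evaluate_surrogate(cfd, predicted, threshold=0.05)
print(report["needs_modification"])  # False
```

With a tighter threshold of 0.015, the second test case (a difference of about 0.02) would flag the model for modification.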
Embodiments of the techniques described in the present disclosure may include any number of the following aspects, either alone or in combination:
This application claims priority to Provisional Application No. 63/117,789, entitled “COMPUTER SURROGATE MODEL TO PREDICT THE SINGLE-PHASE MIXING QUALITY IN STEADY STATE MIXING TANKS”, filed Nov. 24, 2020, the disclosure of which is incorporated herein by reference in its entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US21/52501 | 9/29/2021 | WO |
Number | Date | Country
---|---|---
63117789 | Nov 2020 | US