Several types of specialty chemicals, such as demulsifiers, corrosion inhibitors, scale inhibitors, and defoamers, are used during oil and/or gas production. Due to complicated formulation and application scenarios, selection and development of the specialty chemicals is typically an empirical process.
According to one aspect of the disclosure, a computing device for specialty chemical development testing includes a tester interface and a pre-test recommendation module. The tester interface is to receive a test description indicative of a test parameter for a test of a chemical formulation, wherein the test parameter comprises an oil field process parameter. The pre-test recommendation module is to search a database of historical test results based on similarity to the test parameter of the test description to generate a plurality of search results, and generate a plurality of candidate chemical formulations in response to a search of the database. Each of the plurality of candidate chemical formulations is associated with a search result of the plurality of search results.
In an embodiment, the chemical formulation comprises an oil field specialty chemical. In an embodiment, the oil field specialty chemical comprises a demulsifier, a dispersant, a corrosion inhibitor, or a defoamer. In an embodiment, the oil field process parameter comprises a geographical location, a treating temperature, a treating pressure, a reservoir type, a crude oil pump method parameter, or a crude oil characterization. In an embodiment, to search the database comprises to perform a multidimensional distance search of the historical test results based on the test parameter.
In an embodiment, the computing device further includes a formulation cluster module to cluster the plurality of candidate chemical formulations with an unsupervised machine learning algorithm to generate a plurality of formulation clusters; and select a representative chemical formulation for each of the plurality of formulation clusters. In an embodiment, the unsupervised machine learning algorithm comprises a k-means clustering algorithm.
In an embodiment, the tester interface is further to receive a plurality of test results in response to selection of the representative chemical formulation, wherein each of the plurality of test results is indicative of a performance indicator for a corresponding representative chemical formulation. In an embodiment, the performance indicator comprises turbidity, top oil total water content, or water recovery speed.
In an embodiment, the computing device further includes a formulation optimizer module to train a predictor with the plurality of test results using a supervised machine learning algorithm. In an embodiment, the predictor comprises a regressor. In an embodiment, the predictor comprises a random forest classifier.
In an embodiment, the formulation optimizer module is further to generate a plurality of virtual formulation candidates, wherein each of the plurality of virtual formulation candidates is indicative of a proportion of a chemical; and predict a plurality of predicted results with the predictor in response to training of the predictor, wherein each of the plurality of predicted results is indicative of the performance indicator for a corresponding virtual formulation candidate of the plurality of virtual formulation candidates.
In an embodiment, the tester interface is further to receive a plurality of second test results in response to prediction of the plurality of predicted results, wherein each of the plurality of second test results is indicative of a performance indicator for a corresponding virtual formulation candidate of the plurality of virtual formulation candidates; and the formulation optimizer module is further to train the predictor with the plurality of second test results using the supervised machine learning algorithm.
According to another aspect, a method for specialty chemical development testing includes receiving, by a computing device, a test description indicative of a test parameter for a test of a chemical formulation, wherein the test parameter comprises an oil field process parameter; searching, by the computing device, a database of historical test results based on similarity to the test parameter of the test description to generate a plurality of search results; and generating, by the computing device, a plurality of candidate chemical formulations in response to searching the database, wherein each of the plurality of candidate chemical formulations is associated with a search result of the plurality of search results.
In an embodiment, the chemical formulation comprises an oil field specialty chemical. In an embodiment, the oil field specialty chemical comprises a demulsifier, a dispersant, a corrosion inhibitor, or a defoamer. In an embodiment, the oil field process parameter comprises a geographical location, a treating temperature, a treating pressure, a reservoir type, a crude oil pump method parameter, or a crude oil characterization. In an embodiment, searching the database comprises performing a multidimensional distance search of the historical test results based on the test parameter.
In an embodiment, the method further includes clustering, by the computing device, the plurality of candidate chemical formulations with an unsupervised machine learning algorithm to generate a plurality of formulation clusters; and selecting, by the computing device, a representative chemical formulation for each of the plurality of formulation clusters. In an embodiment, the unsupervised machine learning algorithm comprises a k-means clustering algorithm.
In an embodiment, the method further includes receiving, by the computing device, a plurality of test results in response to selecting the representative chemical formulation, wherein each of the plurality of test results is indicative of a performance indicator for a corresponding representative chemical formulation. In an embodiment, the performance indicator comprises turbidity, top oil total water content, or water recovery speed.
In an embodiment, the method further includes training, by the computing device, a predictor with the plurality of test results using a supervised machine learning algorithm. In an embodiment, the predictor comprises a regressor. In an embodiment, the predictor comprises a random forest classifier.
In an embodiment, the method further includes generating, by the computing device, a plurality of virtual formulation candidates, wherein each of the plurality of virtual formulation candidates is indicative of a proportion of a chemical; and predicting, by the computing device, a plurality of predicted results with the predictor in response to training the predictor, wherein each of the plurality of predicted results is indicative of the performance indicator for a corresponding virtual formulation candidate of the plurality of virtual formulation candidates.
In an embodiment, the method further includes receiving, by the computing device, a plurality of second test results in response to predicting the plurality of predicted results, wherein each of the plurality of second test results is indicative of a performance indicator for a corresponding virtual formulation candidate of the plurality of virtual formulation candidates; and training, by the computing device, the predictor with the plurality of second test results using the supervised machine learning algorithm.
The concepts described herein are illustrated by way of example and not by way of limitation in the accompanying figures. For simplicity and clarity of illustration, elements illustrated in the figures are not necessarily drawn to scale. Where considered appropriate, reference labels have been repeated among the figures to indicate corresponding or analogous elements.
While the concepts of the present disclosure are susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and will be described herein in detail. It should be understood, however, that there is no intent to limit the concepts of the present disclosure to the particular forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives consistent with the present disclosure and the appended claims.
References in the specification to “one embodiment,” “an embodiment,” “an illustrative embodiment,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may or may not necessarily include that particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described. Additionally, it should be appreciated that items included in a list in the form of “at least one of A, B, and C” can mean (A); (B); (C); (A and B); (A and C); (B and C); or (A, B, and C). Similarly, items listed in the form of “at least one of A, B, or C” can mean (A); (B); (C); (A and B); (A and C); (B and C); or (A, B, and C).
The disclosed embodiments may be implemented, in some cases, in hardware, firmware, software, or any combination thereof. The disclosed embodiments may also be implemented as instructions carried by or stored on a transitory or non-transitory machine-readable (e.g., computer-readable) storage medium, which may be read and executed by one or more processors or processing units (e.g., graphics processing units (GPUs) or tensor processing units (TPUs)). A machine-readable storage medium may be embodied as any storage device, mechanism, or other physical structure for storing or transmitting information in a form readable by a machine (e.g., a volatile or non-volatile memory, a media disc, or other media device).
In the drawings, some structural or method features may be shown in specific arrangements and/or orderings. However, it should be appreciated that such specific arrangements and/or orderings may not be required. Rather, in some embodiments, such features may be arranged in a different manner and/or order than shown in the illustrative figures. Additionally, the inclusion of a structural or method feature in a particular figure is not meant to imply that such feature is required in all embodiments and, in some embodiments, may not be included or may be combined with other features.
Referring now to
The computing device 102 may be embodied as any type of device capable of performing the functions described herein. For example, a computing device 102 may be embodied as, without limitation, a server, a rack-mounted server, a blade server, a workstation, a network appliance, a web appliance, a desktop computer, a laptop computer, a tablet computer, a smartphone, a consumer electronic device, a distributed computing system, a multiprocessor system, and/or any other computing device capable of performing the functions described herein. Additionally, in some embodiments, the computing device 102 may be embodied as a “virtual server” formed from multiple computing devices distributed across the network 104 and operating in a public or private cloud. Accordingly, although each computing device 102 is illustrated in
The processor 120 may be embodied as any type of processor or compute engine capable of performing the functions described herein. For example, the processor 120 may be embodied as a single or multi-core processor(s), digital signal processor, microcontroller, or other processor or processing/controlling circuit. Similarly, the memory 124 may be embodied as any type of volatile or non-volatile memory or data storage capable of performing the functions described herein. In operation, the memory 124 may store various data and software used during operation of the computing device 102 such as operating systems, applications, programs, libraries, and drivers. The memory 124 is communicatively coupled to the processor 120 via the I/O subsystem 122, which may be embodied as circuitry and/or components to facilitate input/output operations with the processor 120, the memory 124, and other components of the computing device 102. For example, the I/O subsystem 122 may be embodied as, or otherwise include, memory controller hubs, input/output control hubs, firmware devices, communication links (e.g., point-to-point links, bus links, wires, cables, light guides, printed circuit board traces, etc.) and/or other components and subsystems to facilitate the input/output operations. In some embodiments, the I/O subsystem 122 may form a portion of a system-on-a-chip (SoC) and be incorporated, along with the processor 120, the memory 124, and other components of the computing device 102, on a single integrated circuit chip.
The data storage device 126 may be embodied as any type of device or devices configured for short-term or long-term storage of data such as, for example, memory devices and circuits, memory cards, hard disk drives, solid-state drives, or other data storage devices. The communication subsystem 128 of the computing device 102 may be embodied as any communication circuit, device, or collection thereof, capable of enabling communications between the computing device 102 and other remote devices. The communication subsystem 128 may be configured to use any one or more communication technologies (e.g., wireless or wired communications) and associated protocols (e.g., Ethernet, InfiniBand®, Bluetooth®, WiMAX, 3G LTE, 5G, etc.) to effect such communication.
As discussed in more detail below, the computing devices 102 may be configured to transmit and receive data with each other and/or other devices of the system 100 over the network 104. The network 104 may be embodied as any number of various wired and/or wireless networks. For example, the network 104 may be embodied as, or otherwise include, a wired or wireless local area network (LAN), a wired or wireless wide area network (WAN), a cellular network, and/or a publicly-accessible, global network such as the Internet. As such, the network 104 may include any number of additional devices, such as additional computers, routers, stations, and switches, to facilitate communications among the devices of the system 100.
Referring now to
The tester interface 202 is configured to receive a test description indicative of one or more test parameters for a test of a chemical formulation. The chemical formulation may be an oil field specialty chemical, such as a demulsifier, a dispersant, a corrosion inhibitor, a scale inhibitor and/or a defoamer. The one or more test parameters may include an oil field process parameter such as a geographical location, a treating temperature, a treating pressure, a reservoir type, a crude oil pump method parameter, or a crude oil characterization. The tester interface 202 may be further configured to receive multiple test results that are each indicative of a performance indicator for a corresponding chemical formulation. The performance indicator may include turbidity, top oil total water content, or water recovery speed. As described further below, in some embodiments, the tester interface 202 may be configured to generate or otherwise output reports including a pre-test recommendation, a list of representative candidate formulations, and/or predicted results for virtual formulations. In some embodiments, the tester interface 202 may import one or more parameters for machine learning, such as predictor/algorithm selection, a pruning parameter for random forest to avoid overfitting, or other machine learning parameters.
The pre-test recommendation module 204 is configured to search a database of historical test results based on similarity to the one or more test parameters of the test description to generate search results. Searching the database may include performing a multidimensional distance search of the historical test results based on the one or more test parameters. The pre-test recommendation module 204 is further configured to generate multiple candidate chemical formulations in response to a search of the database. Each candidate chemical formulation is associated with a search result.
The formulation cluster module 206 is configured to cluster candidate chemical formulations with an unsupervised machine learning algorithm to generate formulation clusters, and to select a representative chemical formulation for each of the formulation clusters. The unsupervised machine learning algorithm may be embodied as a k-means clustering algorithm.
The formulation optimizer module 208 is configured to train a predictor with the test results using a supervised machine learning algorithm. The predictor may be embodied as a regressor or a classifier such as a random forest classifier. The formulation optimizer module 208 is further configured to generate multiple virtual formulation candidates. Each virtual formulation candidate is indicative of a proportion of one or more chemicals. The formulation optimizer module 208 is further configured to predict multiple predicted results using the predictor in response to training the predictor. Each predicted result is indicative of the performance indicator for a corresponding virtual formulation candidate. The formulation optimizer module 208 may be further configured to continue training the predictor with additional test results using the supervised machine learning algorithm.
Referring now to
In block 304, the computing device 102 searches a database of historical test results for similarity to the test parameters received from the user. The historical test results may be stored in a relational database, an object database, a data lake, a SQL or NoSQL database (e.g., MongoDB), or other data store accessible by the computing device 102. Each search result may be associated with a historical test of a specialty chemical and thus may include information related to the test parameters of the historical test, the historical formulation that was tested, historical test result performance data, including values for key performance indicators (KPIs), or other information related to the historical test. The computing device 102 may use any appropriate technique to search the historical test results for similarity. In some embodiments, in block 306, the computing device 102 performs a multidimensional distance search to identify similar historical test results. For example, the computing device 102 may process each of the supplied test parameters as a value in a particular dimension, and then calculate a Euclidean distance from the supplied test parameters to the historical test results. In some embodiments, the test parameters may be weighted when performing the search. In some embodiments, the tester may also provide custom weights for the test parameters.
In block 308, the computing device 102 generates a pre-test recommendation based on the search results. The pre-test recommendation may be embodied as a web page or other report that may be provided to the tester or other user. The pre-test recommendation includes information derived from the historical test results located by the search. For example, the pre-test recommendation may include the most-related testing methods for a similar process, the best-performing product for a similar process and similar crude oil characterization, any commercial products and formulations that have never been tested in a similar process, or other relevant information. Thus, the pre-test recommendation may include a list of chemical formulations that are candidates for further testing. The tester may use the pre-test recommendation to select chemical formulations for testing, adjust test methods or other parameters, or otherwise prepare for specialty chemical testing.
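The weighted multidimensional distance search of block 306 can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen only for illustration: the record layout, the parameter names (`temperature`, `pressure`, `api_gravity`), the formulation codes, and the weights. The sketch also assumes that each parameter has already been scaled to a comparable numeric range, and that categorical parameters such as reservoir type would be encoded numerically beforehand.

```python
import math

def weighted_distance(query, record, weights):
    """Weighted Euclidean distance over the parameters named in `weights`."""
    return math.sqrt(
        sum(w * (query[name] - record[name]) ** 2 for name, w in weights.items())
    )

def search_similar(query, history, weights, top_n=3):
    """Return the `top_n` historical tests whose parameters are closest to `query`."""
    ranked = sorted(
        history,
        key=lambda rec: weighted_distance(query, rec["params"], weights),
    )
    return ranked[:top_n]

# Hypothetical historical tests; parameters are assumed pre-scaled to 0..1.
history = [
    {"formulation": "F-001",
     "params": {"temperature": 0.20, "pressure": 0.50, "api_gravity": 0.30}},
    {"formulation": "F-002",
     "params": {"temperature": 0.80, "pressure": 0.40, "api_gravity": 0.90}},
    {"formulation": "F-003",
     "params": {"temperature": 0.25, "pressure": 0.55, "api_gravity": 0.35}},
]

# Test description supplied by the tester, plus custom weights
# (temperature counted twice as heavily as the other parameters).
query = {"temperature": 0.22, "pressure": 0.50, "api_gravity": 0.30}
weights = {"temperature": 2.0, "pressure": 1.0, "api_gravity": 1.0}

results = search_similar(query, history, weights, top_n=2)
print([r["formulation"] for r in results])  # ['F-001', 'F-003']
```

A linear scan is used here for clarity. For a large database, an index structure or a database-side nearest-neighbor query may be more appropriate.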
In block 310, the computing device 102 receives a shortlist of candidate formulations for further testing. The shortlist may be received, for example, from the tester or other user via a web interface of the computing device 102. Continuing that example, the tester may prepare the shortlist based on the pre-test recommendation that is generated as described above. Additionally or alternatively, in some embodiments, the computing device 102 may receive the shortlist of candidate formulations automatically or otherwise without additional user input. For example, a certain number of top search results determined as described above may be included in the shortlist of candidate formulations.
In block 312, the computing device 102 clusters the candidate formulations into multiple clusters using an unsupervised machine learning algorithm. Each cluster includes a grouping of similar candidate formulations selected from the shortlist of candidate formulations. That is, the chemical formulations included in a cluster may be more similar to each other than to formulations included in other clusters. The chemical formulations may be clustered based on one or more features of the formulation, such as a chemical type, a molecular weight, a chemical code, a numeric feature, or other feature. The computing device 102 may use any appropriate technique to cluster the candidate formulations. In some embodiments, in block 314 the computing device 102 may select a particular number of clusters based on available testing equipment. For example, if 12 samples may be tested in a particular batch, the computing device 102 may cluster the candidate formulations into 11 clusters, leaving one testing position open for an incumbent chemical or other control. In some embodiments, in block 316 the computing device 102 may cluster the candidate formulations using a k-means clustering algorithm. Of course, in other embodiments, the computing device 102 may use any other appropriate unsupervised clustering algorithm, such as density-based spatial clustering of applications with noise (DBSCAN), hierarchical clustering, distribution models, density models, support vector clustering, or other clustering algorithm. In some embodiments, selection of the clustering algorithm may be guided or otherwise determined by the type of data associated with the candidate formulations. For example, certain algorithms may be better suited for input data with a Gaussian distribution.
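As a rough sketch of blocks 312 to 316, the following standard-library example clusters formulation compositions with k-means (Lloyd's algorithm) and derives the cluster count from the number of positions in a test batch, reserving one position for a control. The formulation data and the batch size of four are hypothetical, and the deterministic farthest-first initialization is an assumption made here so that the example is reproducible; the disclosure does not specify an initialization scheme.

```python
import math

def farthest_first(points, k):
    """Deterministic initialization: start at points[0], then repeatedly add
    the point that is farthest from its nearest already-chosen centroid."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return [list(c) for c in centroids]

def kmeans(points, k, max_iter=100):
    """Plain k-means; returns (centroids, labels)."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical shortlist: percentage of three intermediates per formulation.
compositions = [
    (90, 10, 0), (85, 15, 0),
    (10, 90, 0), (15, 85, 0),
    (0, 10, 90), (5, 5, 90),
]

batch_positions = 4            # samples that fit in one test batch (assumed)
k = batch_positions - 1        # leave one position open for a control
centroids, labels = kmeans(compositions, k)
print(labels)
```

Only numeric features are clustered in this sketch. Categorical features such as chemical type would need to be encoded first, and a different algorithm may suit such data better.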
In block 318, the computing device 102 generates a representative formulation report. To generate the report, the computing device 102 selects a representative formulation from each cluster determined as described above. The representative formulation may be, for example, a formulation that is closest to the center or centroid of each cluster, a formulation that is closest to the average of each cluster, or other formulation selected from the chemical formulations included in the cluster. The representative formulation report may be embodied as a web page or other report that may be provided to the tester or other user. The tester may use the representative formulation report to perform additional testing. For example, the tester may perform tests using each of the representative formulations and collect corresponding test results. Continuing the example described above, the representative formulation report may list 11 representative formulations, one representative formulation for each cluster. The tester may prepare a test including those 11 representative formulations plus a control formulation.
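One of the selection rules mentioned above, choosing the formulation closest to the cluster centroid, might be sketched as follows. The cluster contents and formulation codes are hypothetical, and the clusters are assumed to have been computed already.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def select_representatives(clusters):
    """For each cluster, return the name of the member nearest its centroid."""
    representatives = {}
    for cluster_id, members in clusters.items():
        center = centroid([composition for _, composition in members])
        name, _ = min(members, key=lambda m: math.dist(m[1], center))
        representatives[cluster_id] = name
    return representatives

# Hypothetical clusters: (formulation code, percentages of three intermediates).
clusters = {
    0: [("F-101", (90, 10, 0)), ("F-102", (80, 20, 0)), ("F-103", (84, 16, 0))],
    1: [("F-201", (10, 90, 0)), ("F-202", (20, 70, 10)), ("F-203", (16, 80, 4))],
}

representatives = select_representatives(clusters)
print(representatives)  # {0: 'F-103', 1: 'F-203'}
```

Selecting an actual cluster member, rather than the centroid itself, keeps every entry in the report a formulation that exists and can be tested.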
After generating the representative formulation report, the method 300 advances to block 320, shown in
In block 322, the computing device 102 trains a predictor with the test results using a supervised machine learning algorithm. The predictor may be embodied as a regressor, a classifier, or any other supervised machine learning prediction model. The predictor may be trained to predict one or more predicted results (e.g., one or more predicted KPI values) for each input chemical formulation, for example all untested but commercially available formulations, or other input features. Test results received as described above may be used as labels or other training data. In some embodiments, in block 324, the computing device 102 may train a random forest classifier. In some embodiments, in block 326 the computing device 102 may build a decision tree to perform predictions. Additionally or alternatively, in some embodiments the computing device 102 may train any appropriate supervised learning model, such as an artificial neural network, support vector machine, linear regression, logistic regression, or other predictor or combination of predictors (e.g., a combination of random forest and gradient boosting classifiers such as XGBoost).
In block 328, the computing device 102 generates multiple virtual formulation candidates. Each virtual formulation candidate identifies one or more constituent chemicals or other components of the formulation, and a corresponding proportion of each constituent chemical. In some embodiments, in block 330 the computing device 102 may generate combinations of commercially available specialty chemical intermediates. Thus, the virtual formulations may include blends of intermediates or other chemicals, whether or not those blends are themselves commercially available. As an illustrative example, the computing device 102 may generate all potential virtual formulations given a certain number of intermediates or other components, an available percentage range, and a percentage accuracy (i.e., the step between successive proportions). Continuing that example, in an illustrative embodiment virtual formulations may be generated for blends of two chemicals, labeled intermediate A and intermediate B. The percentage range may be from zero to 100%, and the percentage accuracy may be 20%. In that example, the computing device 102 may generate six virtual formulations (A/B proportions of 0/100, 20/80, 40/60, 60/40, 80/20, and 100/0) as shown below in Table 1. Of course, as the number of potential chemical intermediates increases, the number of virtual formulations may also increase. In some embodiments, the computing device 102 may generate hundreds or thousands of virtual formulation candidates.
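To make the training step concrete, the sketch below fits a small bagged ensemble of depth-one regression trees (stumps), each grown on a bootstrap sample and a randomly chosen feature. This is a deliberately simplified, standard-library stand-in for a random forest regressor, not the predictor of any particular embodiment; a production system would more likely rely on an established machine learning library. The compositions, the KPI values, and the function names are all hypothetical.

```python
import random
from statistics import mean

def fit_stump(X, y, feature):
    """Fit a depth-one regression tree on a single feature.
    Returns (feature, threshold, left_value, right_value)."""
    best = None
    values = sorted(set(row[feature] for row in X))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [yi for row, yi in zip(X, y) if row[feature] <= threshold]
        right = [yi for row, yi in zip(X, y) if row[feature] > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((v - left_mean) ** 2 for v in left)
               + sum((v - right_mean) ** 2 for v in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # The feature is constant in this sample: always predict the mean.
        overall = mean(y)
        return (feature, float("inf"), overall, overall)
    return (feature, best[1], best[2], best[3])

def train_forest(X, y, n_trees=25, seed=0):
    """Bagging: each stump sees a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    n_samples, n_features = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n_samples) for _ in range(n_samples)]
        feature = rng.randrange(n_features)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest

def predict(forest, row):
    """Average the predictions of all stumps."""
    return mean(left if row[f] <= t else right for f, t, left, right in forest)

# Hypothetical test results: (% intermediate A, % intermediate B) -> KPI value.
X = [(100, 0), (75, 25), (50, 50), (25, 75), (0, 100)]
y = [55.0, 70.0, 82.0, 74.0, 60.0]

forest = train_forest(X, y)
prediction = predict(forest, (60, 40))
print(round(prediction, 1))
```

Because every stump predicts an average of observed KPI values, the ensemble cannot extrapolate beyond the range of the training labels, which is a known characteristic of tree-based regressors.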
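The enumeration described above can be sketched in a few lines. The function name and its parameters are hypothetical, and the sketch assumes that the proportions of each candidate must total 100%. With two intermediates, a range of 0 to 100%, and a 20% step, it yields six candidates, consistent with the count stated in the example.

```python
from itertools import product

def virtual_formulations(components, step=20, low=0, high=100):
    """Enumerate every blend whose per-component percentages lie on the grid
    low, low + step, ..., high and sum to exactly 100."""
    levels = range(low, high + 1, step)
    return [
        dict(zip(components, combo))
        for combo in product(levels, repeat=len(components))
        if sum(combo) == 100
    ]

blends = virtual_formulations(["A", "B"], step=20)
for blend in blends:
    print(blend)
print(len(blends))  # 6

# The candidate count grows quickly with the number of intermediates.
print(len(virtual_formulations(["A", "B", "C"], step=20)))  # 21
```

This brute-force filter examines every grid combination, so for many intermediates or a fine step a generator that enumerates only valid compositions would scale better.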
In block 332, the computing device 102 predicts performance of the virtual formulation candidates using the trained predictor. The computing device 102 may, for example, predict the values of one or more KPIs such as turbidity, top oil total water content, water recovery speed, or other indicators of performance. In some embodiments, the computing device 102 may classify the virtual formulation candidates or otherwise predict performance of the virtual formulation candidates using the predictor.
In block 334, the computing device 102 generates a report with the predicted results. The predicted result report may be embodied as a web page or other report that may be provided to the tester or other user. The user may use the predicted results to identify particular virtual formulation candidates for further testing. For example, the tester may identify certain virtual formulations having the best predicted performance for additional testing.
In block 336, the computing device 102 determines whether to refine the predictor. The computing device 102 may refine the predictor, for example, in response to additional testing that may be performed by the tester. In some embodiments, the computing device 102 may be connected to MLOps tools and workflows such as automated continuous integration, continuous delivery, and continuous training systems to perform further model calibration, data governance, and ML lifecycle operations. If the computing device 102 determines to refine the predictor, the method 300 loops back to block 320 to receive additional test results and continue training the predictor. If the computing device 102 determines not to refine the predictor, the method 300 loops back to block 302 shown in
Although illustrated in
Referring now to
Referring now to
As shown in
Referring now to
In the illustrative formulation clustering 702, each chemical formulation is identified by a name (e.g., formulation name, formulation code, trade name, or other identifier) as well as a composition. The composition of each formulation is illustratively shown as different percentages of each of nine chemical intermediates, labeled as molecules G, H, M, K, I, A, C, L, and F. In other embodiments, each chemical formulation may include additional information, such as chemical intermediate type and/or code name. For example, each intermediate may be embodied as a particular resin, polymer, solvent, or other chemical intermediate that may be combined to form a formulation for testing as a demulsifier.
As shown in
Referring now to
As shown, the predicted results 802 include a predicted KPI score 806 for each virtual formulation. The KPI score 806 corresponds to a performance result generated by the predictor for that virtual formulation. For example, the KPI 806 may be embodied as a predicted score for turbidity, top oil total water content, water recovery speed, or other indicators of performance for a demulsifier or other specialty chemical. The KPI 806 may be reported in appropriate units or may be scaled. Illustratively, the KPIs 806 shown in
Number | Date | Country | Kind |
---|---|---|---|
3148574 | Feb 2022 | CA | national |