The present disclosure relates generally to detecting anomalies in total gas measurements associated with the exploration of one or more hydrocarbon wells and, more particularly, to systems and/or methods that employ machine learning models to detect the anomalies and generate alerts.
Hydrocarbon wells are used to extract oil and/or gas from subsurface hydrocarbon reservoirs. The process of drilling a hydrocarbon well involves drilling a borehole into the Earth's subsurface and evaluating the rock formations encountered to determine whether they contain hydrocarbon deposits. Typically, estimates are made attempting to predict the characteristics of a potential subsurface hydrocarbon reservoir to guide the implementation of exploratory wells. These estimates are then manually compared to observed measurements as the exploratory well is drilled. For example, hydrocarbon gas values for a target location may be predicted based on geological characteristics of the region and/or the operation of proximate hydrocarbon wells (e.g., hydrocarbon wells in established fluid communication with the subsurface reservoir). While an exploratory hydrocarbon well is drilled at the target location, deviation from the predicted hydrocarbon gas values can be an indication of a safety concern and/or a change in the economic viability of the proposed hydrocarbon well.
Various details of the present disclosure are hereinafter summarized to provide a basic understanding. This summary is not an extensive overview of the disclosure and is neither intended to identify certain elements of the disclosure, nor to delineate the scope thereof. Rather, the primary purpose of this summary is to present some concepts of the disclosure in a simplified form prior to the more detailed description that is presented hereinafter.
According to an embodiment consistent with the present disclosure, a method is provided. The method can comprise collecting well feature data characterizing operation of an exploratory hydrocarbon well. The well feature data includes a gas measurement value. The method can also comprise applying a machine learning model to predict gas reading values associated with the exploratory hydrocarbon well. The method can further comprise comparing the gas measurement value with the gas reading value predicted by the machine learning model to detect a gas reading anomaly.
In another embodiment, a system is provided. The system can comprise memory to store computer executable instructions. The system can also comprise one or more processors, operatively coupled to the memory, that execute the computer executable instructions to implement a data collector configured to collect well feature data characterizing operation of an exploratory hydrocarbon well, where the well feature data includes a gas measurement value. Further, the system can comprise a machine learning engine having a training stage and an inference stage. The inference stage can be configured to, based on a machine learning model, predict gas reading values associated with the exploratory hydrocarbon well. Also, the machine learning model can be configured to compare the gas measurement value with the gas reading value predicted by the machine learning model to detect a gas reading anomaly.
In a further embodiment, a computer program product for monitoring gas readings for anomalies is provided. The computer program product can comprise a computer readable storage medium having computer executable instructions embodied therewith. The computer executable instructions can be executable by one or more processors to cause the one or more processors to collect well feature data characterizing operation of an exploratory hydrocarbon well, wherein the well feature data includes a gas measurement value. Also, the computer executable instructions can cause the one or more processors to apply a machine learning model to predict gas reading values associated with the exploratory hydrocarbon well. Further, the computer executable instructions can cause the one or more processors to compare the gas measurement value with the gas reading value predicted by the machine learning model to detect a gas reading anomaly.
Any combinations of the various embodiments and implementations disclosed herein can be used in a further embodiment, consistent with the disclosure. These and other aspects and features can be appreciated from the following description of certain embodiments presented herein in accordance with the disclosure and the accompanying drawings and claims.
Embodiments of the present disclosure will now be described in detail with reference to the accompanying figures. Like elements in the various figures may be denoted by like reference numerals for consistency. Further, in the following detailed description of embodiments of the present disclosure, numerous specific details are set forth in order to provide a more thorough understanding of the claimed subject matter. However, it will be apparent to one of ordinary skill in the art that the embodiments disclosed herein may be practiced without these specific details. In other instances, well-known features have not been described in detail to avoid unnecessarily complicating the description. Additionally, it will be apparent to one of ordinary skill in the art that the scale of the elements presented in the accompanying Figures may vary without departing from the scope of the present disclosure.
Conventional gas monitoring techniques employed with exploratory hydrocarbon wells can be time-consuming and labor-intensive, requiring subject matter experts to derive performance trends and identify deviations in performance data. Further, periodic evaluations may not provide a complete picture of the exploratory well's real-time, or near real-time, performance. Thus, conventional monitoring and/or detection methodologies can be ill-suited for real-time anomaly detection.
Embodiments in accordance with the present disclosure generally relate to systems and/or computer-implemented methods that can employ machine learning engines to detect anomalies in total gas measurements associated with one or more exploratory wells. For example, one or more embodiments described herein can be employed to compare observed gas measurements to one or more predicted gas values, where the predictions can be made by the one or more machine learning models based on training datasets that include historic and/or synthetic data. Further, various embodiments described herein can identify deviations from the predicted gas value trend as gas measurement anomalies and generate alerts characterizing the anomalies and/or well performance. In one or more embodiments, the alerts can serve to notify one or more well technicians of the identified anomalies and trigger a performance investigation. Where the performance investigation determines that the alert is a false anomaly detection, the measurement data associated with the detection, and the false anomaly detection status, can be used to re-train the machine learning model to improve the accuracy of subsequent evaluations.
In one or more embodiments, the one or more machine learning models can execute one or more ensemble learning algorithms (e.g., an extremely randomized trees algorithm) to determine gas measurement predictions based on various well feature input data. Additionally, various embodiments described herein can pre-process the well feature input data via one or more data conditioning operations, such as: data imputation, data standardization, and/or data sequencing. For exploratory wells in hydrocarbon fields that lack historical performance data, the pre-processed well feature input data can be utilized to generate synthetic training data to populate one or more training datasets for training the one or more machine learning models. Further, well feature input data can be analyzed by the trained machine learning model executing the one or more ensemble learning algorithms, which can approach anomaly detection as a regression problem. For example, a total gas measurement anomaly can be identified based on a machine learning engine determining that a root mean squared error is higher than a normal value by a variable percentage that can be adjusted by one or more users. Detected total gas measurement anomalies can be the subject of one or more generated alerts shared amongst designated users of the system.
Moreover, various embodiments described herein can constitute one or more technical improvements over conventional exploratory well gas anomaly detection by utilizing machine learning models to perform real-time, or near real-time, measurement evaluations (e.g., unsupervised anomaly detection). For instance, various embodiments described herein can utilize machine learning models (e.g., executing ensemble learning algorithms) to predict patterns and/or trends in the total gas measurements of one or more exploratory wells. Additionally, one or more embodiments described herein can have a practical application by utilizing observed well feature input data to generate an expansive training dataset of synthetic training data to facilitate the training of machine learning models in the performance of exploratory wells in hydrocarbon fields associated with minimal historic data. Additionally, one or more embodiments described herein can control the generation and/or distribution of one or more alerts based on gas measurement anomalies detected by one or more machine learning engines.
As used herein, the term “machine learning” can refer to an application of artificial intelligence technologies to automatically and/or autonomously learn and/or improve from an experience (e.g., training data) without explicit programming of the lesson learned and/or improved upon. Machine learning as used herein can include, but is not limited to, deep learning techniques. Various system components described herein can utilize machine learning (e.g., via supervised, unsupervised, and/or reinforcement learning techniques) to perform tasks such as classification, regression, and/or clustering. Execution of machine learning tasks can be facilitated by one or more machine learning models trained on one or more training datasets in accordance with one or more model configuration settings.
As used herein, the term “machine learning model” can refer to a computer model used to facilitate one or more machine learning tasks (e.g., regression and/or classification tasks). For example, a machine learning model can represent relationships (e.g., causal or correlation relationships) between parameters and/or outcomes within the context of a specified domain. For instance, machine learning models can represent the relationships via probabilistic determinations that can be adjusted, updated, and/or redefined based on historic data and/or previous executions of a machine learning task. In various embodiments described herein, machine learning models can simulate a number of interconnected processing units that can resemble abstract versions of neurons. For example, the processing units can be arranged in a plurality of layers (e.g., one or more input layers, hidden layers, and/or output layers) connected by varying connection strengths (e.g., which can be commonly referred to within the art as “weights”).
Machine learning models can learn through training with one or more training datasets; where data with known outcomes is inputted into the machine learning model, outputs regarding the data are compared to the known outcomes, and/or the weights of the machine learning model are autonomously adjusted based on the comparison to replicate the known outcomes. As the one or more machine learning models train (e.g., utilize more training data), the machine learning models can become increasingly accurate; thus, trained machine learning models can accurately analyze data with unknown outcomes, based on lessons learned from training data and/or previous executions, to facilitate one or more machine learning tasks.
Example types of machine learning models can include, but are not limited to: artificial neural network (“ANN”) models, perceptron (“P”) models, feed forward (“FF”) models, radial basis network (“RBF”) models, deep feed forward (“DFF”) models, recurrent neural network (“RNN”) models, long/short-term memory (“LSTM”) models, gated recurrent unit (“GRU”) models, auto encoder (“AE”) models, variational AE (“VAE”) models, denoising AE (“DAE”) models, sparse AE (“SAE”) models, Markov chain (“MC”) models, Hopfield network (“HN”) models, Boltzmann machine (“BM”) models, deep belief network (“DBN”) models, convolutional neural network (“CNN”) models, deep convolutional network (“DCN”) models, deconvolutional network (“DN”) models, deep convolutional inverse graphics network (“DCIGN”) models, generative adversarial network (“GAN”) models, liquid state machine (“LSM”) models, extreme learning machine (“ELM”) models, echo state network (“ESN”) models, deep residual network (“DRN”) models, Kohonen network (“KN”) models, support vector machine (“SVM”) models, and/or neural Turing machine (“NTM”) models.
As shown in
The one or more processing units 110 can comprise any commercially available processor. For example, the one or more processing units 110 can be a general purpose processor, an application-specific system processor (“ASSP”), an application-specific instruction set processor (“ASIP”), or a multiprocessor. For instance, the one or more processing units 110 can comprise a microcontroller, microprocessor, a central processing unit, and/or an embedded processor. In one or more embodiments, the one or more processing units 110 can include electronic circuitry, such as: programmable logic circuitry, field-programmable gate arrays (“FPGA”), programmable logic arrays (“PLA”), an integrated circuit (“IC”), and/or the like.
The one or more computer readable storage media 112 can include, but are not limited to: an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, a combination thereof, and/or the like. For example, the one or more computer readable storage media 112 can comprise: a portable computer diskette, a hard disk, a random access memory (“RAM”) unit, a read-only memory (“ROM”) unit, an erasable programmable read-only memory (“EPROM”) unit, a CD-ROM, a DVD, a Blu-ray disc, a memory stick, a combination thereof, and/or the like. In one or more embodiments, the computer readable storage media 112 can be tangible and/or non-transitory (e.g., not transitory signals per se). In various embodiments, the one or more computer readable storage media 112 can store the one or more computer executable instructions 114 and/or one or more other software applications, such as: a basic input/output system (“BIOS”), an operating system, program modules, executable packages of software, and/or the like.
The one or more computer executable instructions 114 can be program instructions for carrying out one or more operations described herein. For example, the one or more computer executable instructions 114 can be, but are not limited to: assembler instructions, instruction-set architecture (“ISA”) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, configuration data, source code, object code, a combination thereof, and/or the like. For instance, the one or more computer executable instructions 114 can be written in one or more procedural programming languages. Although
The one or more networks 108 can comprise one or more wired and/or wireless networks, including, but not limited to: a cellular network, a wide area network (“WAN”), a local area network (“LAN”), a combination thereof, and/or the like. One or more wireless technologies that can be comprised within the one or more networks 108 can include, but are not limited to: wireless fidelity (“Wi-Fi”), a WiMAX network, a wireless LAN (“WLAN”) network, BLUETOOTH® technology, a combination thereof, and/or the like. For instance, the one or more networks 108 can include the Internet and/or the IoT. In various embodiments, the one or more networks 108 can comprise one or more transmission lines (e.g., copper, optical, or wireless transmission lines), routers, gateway computers, and/or servers. Further, the one or more anomaly detectors 102, input devices 104, and/or well sites 106 can comprise one or more network adapters and/or interfaces (not shown) to facilitate communications via the one or more networks 108.
In various embodiments, the one or more input devices 104 can be employed to supply data to the system 100 and/or interact with the one or more anomaly detectors 102. In various embodiments, the one or more input devices 104 can include and/or display one or more input interfaces (e.g., a user interface) to facilitate entry of data into the system 100. Also, in one or more embodiments the one or more input devices 104 can be employed to display one or more outputs from the one or more anomaly detectors 102.
The one or more input devices 104 can include one or more computer devices, including, but not limited to: desktop computers, servers, laptop computers, smart phones, smart wearable devices (e.g., smart watches and/or glasses), computer tablets, keyboards, touch pads, mice, augmented reality systems, virtual reality systems, microphones, remote controls, stylus pens, biometric input devices, a combination thereof, and/or the like. Additionally, the one or more input devices 104 can include one or more displays that can present one or more outputs generated by, for example, the one or more anomaly detectors 102. Example displays can include, but are not limited to: cathode ray tube display (“CRT”), light emitting diode display (“LED”), electroluminescent display (“ELD”), plasma display panel (“PDP”), liquid crystal display (“LCD”), organic light-emitting diode display (“OLED”), a combination thereof, and/or the like.
The one or more well sites 106 can include one or more exploratory wells performing one or more drilling operations to establish fluid communication with a subsurface hydrocarbon reservoir. For example, the well sites 106 can be employed to explore the economic viability of a proposed hydrocarbon well in accordance with one or more estimated performance metrics, such as predicted hydrocarbon gas values. In various embodiments, the well sites 106 can include a variety of sensors (e.g., flow sensors, pressure sensors, temperature sensors, and/or the like) to monitor operation of the associated exploratory well. For example, the well site 106 can include one or more sensors located at a well head of the exploratory well and/or configured to observe and/or measure a wellbore of the exploratory well.
In various embodiments, the data collector 116 can receive and/or retrieve well feature data from the one or more input devices 104 and/or well sites 106 characterizing: operation of an exploratory well associated with the well site 106, and/or geological features of a hydrocarbon field in which the well site 106 is located. Example operational data of the exploratory well that can be included in the well feature data includes, but is not limited to: rate of penetration (“ROP”) values, weight on bit (“WOB”) values, drill torque, bit size, mud weight, mud type, lithology percentages, depths (e.g., total vertical depth and/or depth above sea level), gas measurement values, a combination thereof, and/or the like. Example geological data of the given hydrocarbon field that can be included in the well feature data includes, but is not limited to: latitude values, longitude values, formation top prognosis values, a combination thereof, and/or the like. Additionally, one or more of the well feature data can be associated with defined depth values (e.g., total vertical depth or vertical depth above sea level). In accordance with various embodiments described herein, the machine learning engine 120 can utilize the well feature data to train one or more machine learning models 122, where trained machine learning models 122 can be utilized to predict gas values associated with the exploratory well.
In various embodiments, the data collector 116 can receive and/or retrieve the well feature data in real-time, or near real-time, from the one or more input devices 104 and/or well sites 106 characterizing observed performance of an exploratory well being monitored by the one or more anomaly detectors 102. For example, the data collector 116 can receive well feature data automatically loaded via a socket connection with the one or more input devices 104 and/or well sites 106. For instance, the data collector 116 can monitor a defined network port for incoming well feature data from one or more defined remote sources (e.g., an input device 104 and/or well site 106). Auto-loading well feature data via a socket connection with the one or more input devices 104 and/or well sites 106 can enable real-time data processing, reducing the need for manual data input and the likelihood of data entry errors. In accordance with various embodiments described herein, the machine learning engine 120 can compare the gas measurement with a gas value prediction generated by the trained machine learning model 122 to detect one or more gas reading anomalies.
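For illustration, the following is a minimal Python sketch of such a socket-based auto-loading operation, assuming a hypothetical newline-delimited JSON feed from the well site; the host, port, and record fields are illustrative assumptions rather than a prescribed protocol.

```python
import json
import socket

HOST, PORT = "0.0.0.0", 5555  # assumed listening address for the well site feed

def listen_for_well_feature_data(handle_record):
    """Accept a well site connection and stream incoming records to a handler."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind((HOST, PORT))
        server.listen(1)
        conn, _addr = server.accept()
        with conn, conn.makefile("r") as stream:
            # One JSON record per line, e.g. {"depth": 1010.0, "rop": 42.1, "total_gas": 85.0}
            for line in stream:
                handle_record(json.loads(line))
```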
In one or more embodiments, the data collector 116 can generate a well feature dataset 123 with the well feature data received by the one or more input devices 104 and/or well sites 106. For example, the well feature dataset 123 can include one or more tables comprising rows associated with depth increments and columns associated with the well feature data values, with each column labeled according to a respective category of well feature data (e.g., an ROP column, a WOB column, etc.). In various embodiments, the data collector 116 can generate a respective well feature dataset 123 for each well site 106 monitored by the one or more anomaly detectors 102.
The data conditioner 118 can pre-process the well feature data collected by the data collector 116 by, for example, one or more data imputation, standardization, and/or sequencing operations. For example, the data conditioner 118 can initially clean the well feature data by removing values based on one or more pre-defined (e.g., via the one or more input devices 104) criteria. For instance, the data conditioner 118 can remove ROP values that are below zero. Additionally, the data conditioner 118 can treat lithology data as multiple variables with percentage-based values. Moreover, the data conditioner 118 can treat the defined bit size and/or mud weight as categorical values. In one or more embodiments, the data conditioner 118 can further execute a standard scaler across the columns of the well feature dataset 123, whereby the well feature data can be scaled to have a mean of zero and a standard deviation of one. For instance, the data conditioner 118 can subtract the mean from each data point and then divide by the standard deviation.
In various embodiments, the data conditioner 118 can further generate one or more data sequence datasets 124 from the pre-processed well feature dataset 123. For example, the data conditioner 118 can employ one or more data sequencing techniques to generate a single data sequence from multiple rows of the well feature dataset 123. The data conditioner 118 can utilize a sliding window approach, in which a window of defined size (e.g., 10 rows from the well feature dataset 123, with each row corresponding to 10 feet of vertical depth) is slid along the depth-ordered series to create overlapping data sequences. Thereby, the data conditioner 118 can generate data sequences from multiple rows of well feature data to condition the data for trend analysis by the machine learning engine 120.
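For illustration, a minimal Python sketch of the conditioning pipeline described above (cleaning, standard scaling, and sliding-window sequencing), assuming hypothetical column names and at least one full window of depth-ordered rows:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

def condition_well_features(df: pd.DataFrame, window: int = 10) -> np.ndarray:
    # Clean: drop physically impossible values (e.g., negative ROP).
    df = df[df["rop"] >= 0].copy()

    # Standardize: scale each numeric column to zero mean and unit standard deviation.
    numeric_cols = ["rop", "wob", "torque", "mud_weight"]  # assumed column names
    df[numeric_cols] = StandardScaler().fit_transform(df[numeric_cols])

    # Sequence: slide a fixed-size window over the depth-ordered rows so each
    # sample captures a local trend rather than a single observation.
    values = df[numeric_cols].to_numpy()
    sequences = [values[i : i + window] for i in range(len(values) - window + 1)]
    return np.stack(sequences)  # shape: (num_windows, window, num_features)
```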
In one or more embodiments, the machine learning engine 120 can utilize the one or more well feature datasets 123 and/or data sequence datasets 124 to generate one or more training datasets 125, which can be used to train the one or more machine learning models 122. For example, a model trainer 130 can generate the one or more training datasets 125 with synthetic data to implement a training stage for the machine learning models 122. Once the one or more machine learning models 122 are trained, the machine learning engine 120 can implement an inference stage of the one or more trained machine learning models 122 to detect deviations between the real-time gas measurement values and the predicted gas values.
In various embodiments, the model trainer 130 can execute a conditional tabular generative adversarial network (“CTGAN”) to analyze the one or more well feature datasets 123 and generate one or more training datasets 125 with synthetic data. For example, the CTGAN can learn the underlying data distributions of a given tabular dataset (e.g., a well feature dataset 123, where each column can represent well feature data and each row can represent depth as an observation value) and generate synthetic data that has similar statistical properties to the original data. For instance, the CTGAN algorithm can generate tabular data by modeling the conditional probability distribution of each well feature value given the values of the other well features. Thereby, the model trainer 130 can generate synthetic training data that preserves the correlations and dependencies inherent to the well feature data. Further, the model trainer 130 can generate the one or more training datasets 125 comprising the synthetic training data.
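For illustration, a minimal Python sketch of synthetic data generation using the open-source `ctgan` package, assuming a tabular well feature DataFrame; the discrete column names and row count are illustrative assumptions:

```python
import pandas as pd
from ctgan import CTGAN

def synthesize_training_data(well_features: pd.DataFrame, n_rows: int = 10_000) -> pd.DataFrame:
    # Categorical well features (e.g., mud type, bit size) must be declared so
    # the CTGAN models their conditional distributions rather than treating
    # them as continuous values.
    discrete_columns = ["mud_type", "bit_size"]  # assumed column names
    model = CTGAN(epochs=300)
    model.fit(well_features, discrete_columns)
    # Sample synthetic rows with statistical properties similar to the originals.
    return model.sample(n_rows)
```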
During a training stage, the machine learning engine 120 can train the one or more machine learning models 122 on the one or more training datasets 125. For example, the machine learning engine 120 can execute one or more supervised learning and/or reinforcement learning techniques to train the one or more machine learning models 122 based on the one or more training datasets 125. Additionally, during the training stage the machine learning engine 120 can train a plurality of respective machine learning models 122 having, for example, different software architectures (e.g., to facilitate a subsequent implementation of an ensemble of machine learning models 122 during the inference stage). In one or more embodiments, the data conditioner 118 can perform one or more data sequencing techniques to sequence the one or more training datasets 125. In various embodiments, the one or more machine learning models 122 can execute one or more ensemble learning algorithms, such as an extremely randomized trees algorithm.
For example, during a first iteration of training, the one or more machine learning models 122 can execute an extremely randomized trees algorithm to learn one or more correlations between well features (e.g., ROP, WOB, drill torque, bit size, mud weight, mud type, lithology percentages) and gas values at various depths based on synthetic training data of the one or more training datasets 125. During a second iteration of training, a first subset of feature data sequences of the well feature data, absent the gas measurement values, can be inputted into the machine learning model 122 to predict gas values. Further, the machine learning engine 120 can execute a loss function algorithm (e.g., based on mean squared error) to analyze the accuracy of the predicted gas values given the known gas measurement values associated with the feature data sequences. Based on the loss value computed by the loss function algorithm, the machine learning engine 120 can tune one or more hyperparameters of the machine learning model 122. During a third iteration of training, a second subset of feature data sequences of the well feature data, absent the gas measurement values, can be inputted into the tuned machine learning model 122 to predict gas values. Subsequently, the machine learning engine 120 can execute the loss function algorithm to evaluate the predictions of the tuned machine learning model 122. Where a loss value computed by the loss function algorithm is greater than or equal to a defined threshold (e.g., thereby indicating an undesirable amount of inaccuracy), the machine learning engine 120 can repeat the features of the second and/or third iterations of training to further tune the machine learning model 122 until the computed loss value is below the defined threshold (e.g., thereby indicating a desirable amount of accuracy).
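For illustration, a minimal Python sketch of such a training stage using scikit-learn's extremely randomized trees regressor, assuming flattened feature sequences and a hypothetical validation split and loss threshold; tuning a single hyperparameter (tree count) stands in for the broader tuning described above:

```python
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import mean_squared_error

def train_gas_model(X_synth, y_synth, X_val, y_val, loss_threshold: float = 0.05):
    """Fit on synthetic data, then tune until the validation loss is acceptable."""
    model = None
    # Candidate values for one hyperparameter, tried in order of increasing capacity.
    for n_estimators in (100, 200, 400, 800):
        model = ExtraTreesRegressor(n_estimators=n_estimators, random_state=0)
        model.fit(X_synth, y_synth)  # rows of X are flattened feature sequences
        loss = mean_squared_error(y_val, model.predict(X_val))
        if loss < loss_threshold:  # desirable accuracy reached; stop tuning
            break
    return model
```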
In one or more embodiments, the machine learning engine 120 can store trained machine learning models 122 in one or more trained model libraries 126. For example, a trained machine learning model 122 can be associated with the hydrocarbon field associated with the training dataset 125 utilized during the training stage. For instance, well feature data can be associated with a given hydrocarbon field in which an exploratory well characterized by the well feature data is located. Further, a training dataset 125 can be associated with the same hydrocarbon field associated with the well feature data from which it is derived. Additionally, a trained machine learning model 122 can be associated with the same hydrocarbon field associated with the training dataset 125 utilized during the given machine learning model's 122 training stage. The trained model library 126 can sort trained machine learning models 122 based on the associated hydrocarbon field such that gas reading anomaly evaluation of exploratory wells in the same hydrocarbon field can be conducted using a common machine learning model 122. Thereby, lessons learned by a trained machine learning model 122 regarding the performance of one well within a given hydrocarbon field can be applied to evaluating the performance of another well within the given hydrocarbon field. Additionally, characteristics unique to one hydrocarbon field can be withheld from machine learning models 122 employed to evaluate well performance in another hydrocarbon field with different characteristics.
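For illustration, a minimal Python sketch of a trained model library keyed by hydrocarbon field and persisted with `joblib`; the directory layout and file naming are illustrative assumptions:

```python
from pathlib import Path
import joblib

LIBRARY_DIR = Path("trained_model_library")  # assumed storage location

def save_model(model, field_name: str) -> None:
    """Persist a trained model under the hydrocarbon field it was trained for."""
    LIBRARY_DIR.mkdir(exist_ok=True)
    joblib.dump(model, LIBRARY_DIR / f"{field_name}.joblib")

def load_model_for_field(field_name: str):
    """Return the field's trained model, or None to signal that a training stage is needed."""
    path = LIBRARY_DIR / f"{field_name}.joblib"
    return joblib.load(path) if path.exists() else None
```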
During an inference stage, the machine learning engine 120 can utilize the data sequences of the one or more data sequence datasets 124 as feature inputs to the trained machine learning model 122 to predict one or more gas value estimates (e.g., estimate a gas value trend). In one or more embodiments, the feature inputs can skip data sequences associated with an initial pre-defined depth (e.g., first 100 feet of vertical depth). For example, the trained machine learning model 122 can execute an extremely randomized trees algorithm to analyze the sequenced well feature data in the context of a regression problem to generate one or more gas value estimates that characterize a predicted gas value trend.
In various embodiments, the machine learning engine 120 can determine a difference between subsequently collected gas measurement values (e.g., collected in real-time) and the estimated gas values predicted by the one or more trained machine learning models 122. For example, the machine learning engine 120 can execute a root mean square error (“RMSE”) algorithm to measure an amount of difference between the observed gas measurement values (e.g., collected by the data collector 116 and/or processed by the data conditioner 118) and the predicted gas value estimates (e.g., predicted by the one or more trained machine learning models 122). In one or more embodiments, the machine learning engine 120 can also utilize an RMSE algorithm to compare the performance of multiple trained machine learning models 122 to tune one or more hyperparameters.
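For illustration, a minimal Python sketch of the inference-stage comparison: the trained model predicts a gas value trend from the feature sequences, and the deviation of the observed readings is scored with RMSE; the flattened-sequence input format is an illustrative assumption:

```python
import numpy as np

def score_gas_deviation(model, feature_sequences, observed_gas):
    """Return the RMSE between observed gas readings and the predicted trend."""
    predicted_gas = model.predict(feature_sequences)  # predicted gas value trend
    observed_gas = np.asarray(observed_gas, dtype=float)
    rmse = float(np.sqrt(np.mean((observed_gas - predicted_gas) ** 2)))
    return rmse, predicted_gas
```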
In various embodiments, the alert generator 121 can generate one or more alerts based on the computed RMSE value indicating a substantial deviation between the observed gas measurement value and the predicted gas value estimate. For example, as the RMSE value (e.g., computed as a result of the RMSE algorithm) increases, the amount of deviation between the observed gas measurement value and the predicted gas value estimate can also increase. The alert generator 121 can be configured to generate the one or more alerts based on the RMSE value being greater than or equal to a pre-defined RMSE threshold (e.g., which can be defined by one or more users of the system 100 via the one or more input devices 104). For instance, a deviation between the observed gas measurement value and the predicted gas value estimate characterized by an RMSE value greater than or equal to the pre-defined RMSE threshold can be indicative of a gas reading anomaly. The alert generator 121 can generate one or more alerts to notify one or more users of the system 100 of the detected gas reading anomaly. For example, the one or more alerts can be shared with one or more users of the system 100 via the one or more input devices 104. In various embodiments, the alert generator 121 can generate the alerts as one or more emails and/or text messages to be sent to pre-defined addresses (e.g., email addresses or telephone numbers).
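For illustration, a minimal Python sketch of threshold-based email alerting using only the standard library; the SMTP host, sender, and recipient addresses are placeholder assumptions:

```python
import smtplib
from email.message import EmailMessage

def maybe_send_alert(rmse: float, rmse_threshold: float, depth_ft: float) -> None:
    """Email designated users when the RMSE meets or exceeds the pre-defined threshold."""
    if rmse < rmse_threshold:
        return  # deviation within the user-defined tolerance; no alert
    msg = EmailMessage()
    msg["Subject"] = f"Gas reading anomaly detected (RMSE={rmse:.2f})"
    msg["From"] = "anomaly-detector@example.com"    # placeholder sender
    msg["To"] = "well-technicians@example.com"      # placeholder recipients
    msg.set_content(
        f"Observed total gas deviates from the predicted trend near {depth_ft:.0f} ft "
        f"(RMSE {rmse:.2f} >= threshold {rmse_threshold:.2f}). Please investigate."
    )
    with smtplib.SMTP("smtp.example.com") as server:  # placeholder SMTP host
        server.send_message(msg)
```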
In view of the foregoing structural and functional features described above, example methods will be better appreciated with reference to
At 202, the computer-implemented method 200 can comprise collecting (e.g., via the data collector 116), by a system 100 operatively coupled to one or more processing units 110, real-time, near real-time, and/or static well feature data. In accordance with one or more embodiments described herein, well feature data can be automatically loaded (e.g., via data collector 116) to an anomaly detector 102 from one or more input devices 104 and/or well sites 106 via one or more socket connections. For example, the well feature data collected at 202 can be included in one or more well feature datasets 123 characterizing the operations of one or more exploratory wells within a hydrocarbon field and/or one or more geological features of the hydrocarbon field.
At 204, the computer-implemented method 200 can comprise performing (e.g., via the data conditioner 118), by the system 100, one or more data conditioning operations. In accordance with the one or more embodiments described herein, example conditioning operations can include one or more data imputation operations, data standardizations (e.g., via a standard scaler), and/or data sequencing operations. For example, the data conditioning operations at 204 can be utilized to pre-process the one or more well feature datasets 123 and/or generate one or more feature data sequences (e.g., included in one or more data sequence datasets 124).
At 206, the computer-implemented method 200 can comprise determining (e.g., via the machine learning engine 120), by the system 100, whether a trained machine learning model 122 exists for the hydrocarbon field of interest. In accordance with one or more embodiments described herein, the determination at 206 can be made (e.g., via the machine learning engine 120) by referencing one or more trained model libraries 126, which can include previously trained machine learning models 122 sorted based on associated hydrocarbon fields. Where none of the trained machine learning models 122 of the trained model library 126 are associated with the hydrocarbon field of interest, the computer-implemented method 200 can proceed to 208 to implement a training stage for a machine learning model 122. Where one or more of the trained machine learning models 122 of the trained model library 126 are associated with the hydrocarbon field of interest, the computer-implemented method 200 can proceed to 210 to implement an inference stage of a machine learning engine 120.
At 208, the computer-implemented method 200 can comprise generating (e.g., via the model trainer 130), by the system 100, one or more training datasets 125. In accordance with one or more embodiments described herein, the one or more training datasets 125 can be generated at 208 using one or more CTGAN algorithms to generate synthetic training data based on the well feature data collected at 202.
At 212, the computer-implemented method 200 can comprise training (e.g., via the machine learning engine 120), by the system 100, one or more machine learning models 122 based on the one or more training datasets 125. In accordance with one or more embodiments described herein, the one or more machine learning models 122 can analyze the one or more training datasets 125 by executing one or more ensemble learning algorithms, such as an extremely randomized trees algorithm, to predict gas value readings expected to be exhibited by the given exploratory well in the hydrocarbon field of interest. For instance, during a training stage of the machine learning engine 120, the machine learning models 122 can execute one or more extremely randomized trees algorithms to learn one or more correlations between well features (e.g., ROP, WOB, drill torque, bit size, mud weight, mud type, lithology percentages) and gas values at various depths based on synthetic training data of the one or more training datasets 125. Further, during the training stage, the one or more machine learning models 122 can be tuned (e.g., via the machine learning engine 120) based on one or more subsets of the well feature data collected at 202 to minimize a loss function in accordance with one or more embodiments described herein.
At 210, the computer-implemented method 200 can comprise selecting (e.g., via the machine learning engine 120), by the system 100, a relevant trained machine learning model 122. In accordance with one or more embodiments described herein, a trained machine learning model 122 can be selected at 210 from a trained model library 126 based on the trained machine learning model 122 being trained on well feature data associated with other hydrocarbon wells (e.g., exploratory wells or production wells) of the hydrocarbon field of interest or historic well feature data of the exploratory well being monitored by the given implementation of the computer-implemented method 200.
At 214, the computer-implemented method 200 can comprise analyzing (e.g., via the machine learning engine 120), by the system 100, the well feature data collected at 202 using the one or more trained machine learning models 122 associated with the hydrocarbon field of interest. In accordance with one or more embodiments described herein, the machine learning engine 120 can execute one or more RMSE algorithms to measure a deviation between the gas measurement values of the well feature data and the gas value estimates predicted by the one or more trained machine learning models 122.
At 216, the computer-implemented method 200 can comprise determining (e.g., via the machine learning engine 120), by the system 100, whether an anomaly in the gas measurement values is detected. In accordance with one or more embodiments described herein, a gas reading anomaly (e.g., an anomaly in the gas measurement values) can be determined based on the RMSE value computed by the RMSE algorithm at 214 being greater than or equal to a pre-defined threshold (e.g., where the pre-defined threshold characterizes an undesirable amount of deviation between the gas measurement values and the gas value estimates). Where a gas reading anomaly is not detected, the computer-implemented method 200 can proceed back to 202 and well feature data can be continually collected (e.g., in real-time or near real-time) for further gas reading anomaly monitoring. Where a gas reading anomaly is detected, the computer-implemented method 200 can proceed to 218.
At 218, the computer-implemented method 200 can comprise generating (e.g., via the alert generator 121), by the system 100, one or more alerts characterizing the detected gas reading anomaly. In accordance with one or more embodiments described herein, the generated alert can be shared (e.g., via the one or more networks 108) with one or more users of the computer-implemented method 200 (e.g., via the one or more input devices 104).
At 220, the computer-implemented method 200 can comprise determining (e.g., via the alert generator 121), by the system 100, whether the one or more generated alerts are false. For example, the alert generator 121 can generate one or more queries accompanying the one or more alerts. One or more users can investigate the alerts and respond (e.g., via the one or more input devices 104) to the queries indicating the validity of the gas reading anomaly detection. Where the alert is indicated as a true alert (e.g., where the gas reading anomaly determination is determined to be correct), the computer-implemented method 200 can proceed back to 202 and well feature data can be continually collected (e.g., in real-time or near real-time) for further gas reading anomaly monitoring. Where the alert is indicated as a false alert (e.g., where the gas reading anomaly determination is determined to be in error), the computer-implemented method 200 can proceed to 222.
At 222, the computer-implemented method 200 can comprise re-training (e.g., via the model trainer 130 and/or machine learning engine 120), by the system 100, the one or more machine learning models 122. For example, the re-training at 222 can comprise the features of 208 and/or 212 using the well feature data associated with the false gas reading anomaly determination and the non-anomaly status.
In view of the foregoing structural and functional description, those skilled in the art will appreciate that portions of the embodiments may be embodied as a method, data processing system, or computer program product. Accordingly, these portions of the present embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware, such as shown and described with respect to the computer system of
Certain embodiments have also been described herein with reference to block illustrations of methods, systems, and computer program products. It will be understood that blocks of the illustrations, and combinations of blocks in the illustrations, can be implemented by computer-executable instructions. These computer-executable instructions may be provided to one or more processors of a general purpose computer, special purpose computer, or other programmable data processing apparatus (or a combination of devices and circuits) to produce a machine, such that the instructions, which execute via the processor, implement the functions specified in the block or blocks.
These computer-executable instructions may also be stored in computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory result in an article of manufacture including instructions which implement the function specified in the flowchart block or blocks. The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart block or blocks.
In this regard,
Computer system 300 includes processing unit 302, system memory 304, and system bus 306 that couples various system components, including the system memory 304, to processing unit 302. Dual microprocessors and other multi-processor architectures also can be used as processing unit 302. System bus 306 may be any of several types of bus structure including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. System memory 304 includes read only memory (ROM) 310 and random access memory (RAM) 312. A basic input/output system (BIOS) 314 can reside in ROM 310 containing the basic routines that help to transfer information among elements within computer system 300.
Computer system 300 can include a hard disk drive 316, magnetic disk drive 318, e.g., to read from or write to removable disk 320, and an optical disk drive 322, e.g., for reading CD-ROM disk 324 or to read from or write to other optical media. Hard disk drive 316, magnetic disk drive 318, and optical disk drive 322 are connected to system bus 306 by a hard disk drive interface 326, a magnetic disk drive interface 328, and an optical drive interface 330, respectively. The drives and associated computer-readable media provide nonvolatile storage of data, data structures, and computer-executable instructions for computer system 300. Although the description of computer-readable media above refers to a hard disk, a removable magnetic disk and a CD, other types of media that are readable by a computer, such as magnetic cassettes, flash memory cards, digital video disks and the like, in a variety of forms, may also be used in the operating environment; further, any such media may contain computer-executable instructions for implementing one or more parts of embodiments shown and described herein.
A number of program modules may be stored in the drives and RAM 312, including operating system 332, one or more application programs 334, other program modules 336, and program data 338. In some examples, the application programs 334 can include data collector 116, data conditioner 118, machine learning engine 120, and/or alert generator 121, and the program data 338 can include the one or more well feature datasets 123, data sequence datasets 124, training datasets 125, and/or trained model libraries 126. The application programs 334 and program data 338 can include functions and methods programmed to detect gas reading anomalies associated with the operation of one or more exploratory hydrocarbon wells, such as shown and described herein.
A user may enter commands and information into computer system 300 through one or more input devices 340, such as a pointing device (e.g., a mouse, touch screen), keyboard, microphone, joystick, game pad, scanner, and the like. For instance, the user can employ input device 340 to edit or modify one or more loss function thresholds and/or RMSE thresholds. These and other input devices 340 are often connected to processing unit 302 through a corresponding port interface 342 that is coupled to the system bus, but may be connected by other interfaces, such as a parallel port, serial port, or universal serial bus (USB). One or more output devices 344 (e.g., display, a monitor, printer, projector, or other type of displaying device) is also connected to system bus 306 via interface 346, such as a video adapter.
Computer system 300 may operate in a networked environment using logical connections to one or more remote computers, such as remote computer 348. Remote computer 348 may be a workstation, computer system, router, peer device, or other common network node, and typically includes many or all the elements described relative to computer system 300. The logical connections, schematically indicated at 350, can include a local area network (LAN) and a wide area network (WAN). When used in a LAN networking environment, computer system 300 can be connected to the local network through a network interface or adapter 352. When used in a WAN networking environment, computer system 300 can include a modem, or can be connected to a communications server on the LAN. The modem, which may be internal or external, can be connected to system bus 306 via an appropriate port interface. In a networked environment, application programs 334 or program data 338 depicted relative to computer system 300, or portions thereof, may be stored in a remote memory storage device 354.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, for example, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “contains”, “containing”, “includes”, “including,” “comprises”, and/or “comprising,” and variations thereof, when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Terms of orientation are used herein merely for purposes of convention and referencing and are not to be construed as limiting. However, it is recognized these terms could be used with reference to an operator or user. Accordingly, no limitations are implied or to be inferred. In addition, the use of ordinal numbers (e.g., first, second, third, etc.) is for distinction and not counting. For example, the use of “third” does not imply there must be a corresponding “first” or “second.” Also, as used herein, the terms “coupled” or “coupled to” or “connected” or “connected to” or “attached” or “attached to” may indicate establishing either a direct or indirect connection, and is not limited to either unless expressly referenced as such.
While the disclosure has described several exemplary embodiments, it will be understood by those skilled in the art that various changes can be made, and equivalents can be substituted for elements thereof, without departing from the spirit and scope of the invention. In addition, many modifications will be appreciated by those skilled in the art to adapt a particular instrument, situation, or material to embodiments of the disclosure without departing from the essential scope thereof. Therefore, it is intended that the invention not be limited to the particular embodiments disclosed, or to the best mode contemplated for carrying out this invention, but that the invention will include all embodiments falling within the scope of the appended claims. Moreover, reference in the appended claims to an apparatus or system or a component of an apparatus or system being adapted to, arranged to, capable of, configured to, enabled to, operable to, or operative to perform a particular function encompasses that apparatus, system, or component, whether or not it or that particular function is activated, turned on, or unlocked, as long as that apparatus, system, or component is so adapted, arranged, capable, configured, enabled, operable, or operative.
The present disclosure is also directed to the following exemplary embodiments, which can be practiced in any combination thereof.
Embodiment 1: A method, comprising: collecting well feature data characterizing operation of an exploratory hydrocarbon well, wherein the well feature data includes a gas measurement value; applying a machine learning model to predict gas reading values associated with the exploratory hydrocarbon well; and comparing the gas measurement value with the gas reading value predicted by the machine learning model to detect a gas reading anomaly.
Embodiment 2: The method of embodiment 1, wherein the applying the machine learning model comprises executing an ensemble learning algorithm.
Embodiment 3: The method of any of embodiments 1 or 2, wherein the comparing the gas measurement value with the gas reading value comprises executing a root mean square error algorithm.
Embodiment 4: The method of any of embodiments 1-3, further comprising generating an alert based on the root mean square error algorithm computing a value that is greater than or equal to a defined threshold of deviation.
Embodiment 5: The method of any of embodiments 1-4, wherein the collecting the well feature data is performed via an automatic loading operation over a socket connection between an anomaly detector employing the machine learning model and a well site monitoring the operation of the exploratory hydrocarbon well.
Embodiment 6: The method of any of embodiments 1-5, further comprising: executing a standard scaler algorithm to standardize the well feature data; and executing a data sequencing algorithm that utilizes a sliding window of a defined size to sequence the well feature data.
Embodiment 7: The method of any of embodiments 1-6, further comprising: executing a conditional tabular generative adversarial network to generate synthetic training data; and training the machine learning model based on the synthetic training data.
Embodiment 8: A system, comprising: memory to store computer executable instructions; and one or more processors, operatively coupled to the memory, that execute the computer executable instructions to implement: a data collector configured to collect well feature data characterizing operation of an exploratory hydrocarbon well, wherein the well feature data includes a gas measurement value; and a machine learning engine having a training stage and an inference stage, wherein the inference stage is configured to, based on a machine learning model, predict gas reading values associated with the exploratory hydrocarbon well, and wherein the machine learning model is configured to compare the gas measurement value with a gas reading value predicted by the machine learning model to detect a gas reading anomaly.
Embodiment 9: The system of embodiment 8, wherein the machine learning model is configured to execute an ensemble learning algorithm.
Embodiment 10: The system of any of embodiments 8 or 9, wherein the machine learning engine is configured to execute a root mean square error algorithm to compare the gas measurement value with a gas reading value.
Embodiment 11: The system of any of embodiments 8-10, further comprising an alert generator configured to generate an alert based on the root mean square error algorithm computing a value that is greater than or equal to a defined threshold of deviation.
Embodiment 12: The system of any of embodiments 8-11, further comprising a data conditioner configured to execute a standard scaler algorithm to standardize the well feature data, wherein the data conditioner is further configured to execute a data sequencing algorithm that utilizes a sliding window of a defined size to sequence the well feature data.
Embodiment 13: The system of any of embodiments 8-12, further comprising: a model trainer configured to execute a conditional tabular generative adversarial network to generate synthetic training data, wherein the training stage of the machine learning engine is configured to train the machine learning model based on the synthetic training data.
Embodiment 14: A computer program product for monitoring gas readings for anomalies, the computer program product comprising a computer readable storage medium having computer executable instructions embodied therewith, the computer executable instructions executable by one or more processors to cause the one or more processors to: collect well feature data characterizing operation of an exploratory hydrocarbon well, wherein the well feature data includes a gas measurement value; apply a machine learning model to predict gas reading values associated with the exploratory hydrocarbon well; and compare the gas measurement value with a gas reading value predicted by the machine learning model to detect a gas reading anomaly.
Embodiment 15: The computer program product of embodiment 14, wherein the computer executable instructions further cause the one or more processors to apply the machine learning model using an extremely randomized tree algorithm.
Embodiment 16: The computer program product of any of embodiments 14 or 15, wherein the computer executable instructions further cause the one or more processors to execute a root mean square error algorithm to compare the gas measurement value with a gas reading value.
Embodiment 17: The computer program product of any of embodiments 14-16, wherein the computer executable instructions further cause the one or more processors to generate an alert based on the root mean square error algorithm computing a value that is greater than or equal to a defined threshold of deviation.
Embodiment 18: The computer program product of any of embodiments 14-17, wherein the well feature data is collected via an automatic loading operation over a socket connection between an anomaly detector employing the machine learning model and a well site monitoring the operation of the exploratory hydrocarbon well.
Embodiment 19: The computer program product of any of embodiments 14-18, wherein the computer executable instructions further cause the one or more processors to: execute a standard scaler algorithm to standardize the well feature data; and execute a data sequencing algorithm that utilizes a sliding window of a defined size to sequence the well feature data.
Embodiment 20: The computer program product of any of embodiments 14-19, wherein the computer executable instructions further cause the one or more processors to: execute a conditional tabular generative adversarial network to generate synthetic training data; and train the machine learning model based on the synthetic training data.