This application claims the benefit of Brazilian Patent Application No. 1020230003630, filed Jan. 9, 2023, the entire contents of which are explicitly incorporated by reference herein.
The present invention falls within the field of electric digital data processing, specifically machine learning, and more specifically methods implemented with artificial intelligence.
With the increase in well instrumentation, especially in wells equipped with intelligent completion, which have multiple pressure and temperature recording points, a large amount of data has become available as input for reservoir management and characterization.
This mass of data can reveal previously inaccessible information about the well-reservoir system, but the investment of time and effort into its analysis can quickly become prohibitive. It is estimated, for example, that more than 50% of specialists' working time is dedicated solely to data preparation—a significant amount of time that could be dedicated to the more intellectually challenging task of analyzing this data to diagnose or mitigate problems.
And even if data preparation time were reduced to zero, the growing number of highly instrumented wells would require an increasing number of professionals dedicated to their analysis.
It is in this context, aiming to maximize the automation of tasks and to redirect the focus of the reservoir professional towards more effective analyses, that the invention reported herein arises.
The application of Artificial Intelligence (AI) techniques to the large mass of data generated by permanent pressure and temperature sensors installed in wells with intelligent completion in multiple zones seeks to identify deviations from a behavior considered normal and alert the responsible professionals preventively. The objective is to drastically reduce the need for constant human monitoring, alerting and directing professionals' time only to those wells that exhibit an apparently anomalous behavior.
In this way, the specialist's saved working time can be used in more detailed and efficient analyses of the detected anomalies, speeding up diagnoses and preventing relevant events from going unnoticed. The premature detection of scale, for example, can reduce the costs associated with descaling or even avoid some treatment operations entirely.
Some state-of-the-art documents disclose technologies whose context is similar to that of the present invention in at least one respect; however, as detailed below, there are fundamental differences in approach, or unresolved deficiencies remain.
RU2624863 discloses a method for determining the internal structure of massive fractured petroleum deposits, comprising the preliminary determination of a reference temperature profile, followed by sequential well field surveys in stationary filtration modes; downhole surveys measuring temperature, pressure and flow rate of the well to obtain a real temperature record; comparison of the real temperature record with the reference one; identification of abnormal well profiles by temperature comparison; and determination of the allowable range of possible parameter values for each fracture intersecting the well, from the condition of minimum deviation between calculated temperature values and real log parameters within a predetermined confidence level.
RU2624863, based on a stochastic analysis of the results, determines the most likely parameter values (tilt angle, length, opening, width) and determines the range of possible parameter values, with a given level of confidence, for at least one fracture (flow).
Said method of RU2624863 differs essentially from the present invention, as it focuses on identifying abnormal well temperature, pressure and flow rate profiles and on analyzing fractures intersecting a well, based on the condition of minimum deviation between calculated temperature values and real log parameters within a predetermined confidence level.
Although the input variables are of a similar nature (pressure, flow rate and temperature), the treatment used is totally different. In the present invention, the objective is to construct signatures of anomalies at reservoir depths, especially scale and valve cycling, and to use such signatures in calculating the probability of occurrence of an anomaly, given conditions considered normal for a given well. Furthermore, the method is not specific to fractured reservoirs and can be applied to any type of reservoir. Regarding the treatment given to the variables, although both methods are stochastic in nature, they fundamentally differ from each other.
The method proposed in the present invention is said to be stochastic because it calculates the probability of a given signature being anomalous, considering previously provided historical signatures, while the method proposed in RU2624863 treats the input variables themselves in a stochastic manner. In other words, in the present invention the method focuses on the analysis of anomaly indicators that can identify anomalies generated by scale and valve cycling, for example, which have physical signatures and effects that are not easily detectable by traditional stochastic methods. Furthermore, the method of the present invention considers not only well data but data profiles from reservoir depths rather than from shallower depths.
IN201841045703 discloses an automated system for monitoring the condition of pipes, which can be applied in the technical field related to oil exploration and can notify an operator if there is any anomaly in their operation, such as leakage, obstruction or imperfections. The system detects variations in hydrostatic and atmospheric pressure and identifies anomalies based on this pressure variation. The anomalies are identified with the help of the Internet of Things (IoT) and Artificial Intelligence.
Said system of IN201841045703 comprises at least five main modules: electronic sensors, pressure anomaly detection module, central system notification module, central server and reservoir distribution module.
It should first be noted that the method of IN201841045703 addresses problems that are not specifically related to the problem solved by the present invention. The present invention uses pressure, flow rate and temperature data combined with features that define the general nature of the reservoir. Wells from different reservoirs may have different physical characteristics, and this affects the detector. With regard to artificial intelligence, the technique described in the reference differs from the model used for feature engineering in the present invention, since signatures are first sought in the input data, which are then fed to a processor implementing the method of the present invention to run the dimensionality reduction and outlier detection algorithms.
In short, in the present invention the data is not used directly as input for the algorithm implemented with artificial intelligence, but rather curve signatures extracted from it. In addition, the reference also differs with regard to artificial intelligence tools, as it uses a regression model dependent on annotated data, unlike the strategy of the present invention, which automatically searches for usual and anomalous signatures.
CN109784539 discloses a method and apparatus that use a machine learning algorithm to train a long-term objective prediction model for an oil/gas well group, and a method and apparatus that adjust the operation of the well group by using the prediction model.
The model training method of CN109784539 includes: obtaining a set of historical data about various well groups; performing feature extraction on the historical dataset to obtain a training sample feature set based on the operation and status data of each well group, together with the corresponding long-term target label data; and training, with a predefined machine learning algorithm, on the training sample feature set and the corresponding label data to obtain the long-term objective prediction model of the oil/gas well group. The obtained model can predict the long-term objective of each well group, so that the practical operation of the well group can be adjusted according to the predicted long-term objective.
However, the technique described in CN109784539 requires labeled data, that is, a large volume of data previously annotated and used by a machine learning algorithm in its training. The present invention, on the other hand, in addition to being computationally less intensive, does not require previously labeled data, since it is an unsupervised learning model, and it reduces the human effort required to adjust Artificial Intelligence models.
In the paper “Innovative Artificial Intelligence Approach in Vaca Muerta Shale Oil Wells for Real Time Optimization”, Quishpe et al. (2019) describe an approach using artificial intelligence in oil wells for real-time optimization. The paper explains that natural-flow well production is regulated using surface constraints to control the production rate, such that the well's overall performance is a function of several variables.
Examples of these variables are pipe size, choke size, wellhead pressure, flowline size and drilling density. This implies that changing any of these variables will modify well performance. In the paper in question, wellhead pressure curves are analyzed using data science, with the aim of predicting in real time anomalies that may occur for timely correction. The data corresponds to 130 drainage wells in the Loma Campana Field. The study began with a process of filtering the pressure curve, with two specific objectives: firstly, to eliminate atypical values from the time series and, secondly, to smooth the curve so that future predictions can be made.
Then, the Prophet methodology was applied with the aim of predicting the values of the curve. It relies on historical values of the time series to predict future values; the characteristic trend of the curve was used to apply this methodology. Then, to identify the anomaly, a model was designed based on the slope of the curve. The pressure curve is a decreasing exponential function, so its first and second derivatives indicate the trend (ascending or descending) and the curvature (concave or convex) of the curve. Once these values are available, they are classified according to the anomaly: paraffin, scale or obstruction.
However, the model used in said paper differs essentially from the artificial intelligence model used in the present invention, and mainly in the nature of the anomalies, since the technique in said paper does not extract signatures from the analyzed curves but classifies them directly. In addition, that technique uses forecasting algorithms to predict future curve behavior. In the present invention, online learning models are not used, and instead of a classification model, a stochastic outlier detection model is employed. Furthermore, the algorithms are essentially different from the ARIMA and LOESS models used in that work.
In the paper “Fault detection and classification in oil wells and production/service lines using random forest”, Marins et al. (2020) address the automatic detection and classification of failure events during the practical operation of oil and gas wells and lines. The events considered therein are part of the publicly available 3W database, developed by Petrobras. Seven classes of failures are considered, with distinct dynamics and patterns, as well as several instances of normal operation. A random forest classifier is employed with different statistical measures to identify each type of failure. Three experiments are designed to evaluate the system performance in different classification scenarios. An accuracy rate of 94% indicates a successful performance of the proposed system in detecting real events. In addition, the system detection time averaged 12% of the transient period preceding the steady state of the fault.
The database reported in the aforementioned paper consists of approximately 2,000 operational events representing different well states, ranging from normal operation (Class 0) to eight distinct failures (Classes 1 to 8): Abrupt Increase of Basic Sediment and Water; Spurious Closure of the Downhole Safety Valve; Severe Slugging; Flow Instability; Rapid Loss of Productivity; Quick Restriction in the Production Choke (PCK); Scaling in the PCK; and Hydrate in the Production Line. Each event is a temporal data series consisting of n=8 tags acquired by 8 different sensors, chosen according to their availability and relevance to the failures in question.
The method presented in the aforementioned paper is essentially different from that presented in the present invention, both due to the nature of the method for identifying faults and to the type of algorithm used for classification. The work of Marins et al. (2020) uses the random forest method, which requires annotated data to build decision trees.
On the other hand, in the present invention, during the preliminary analysis of the provided data, it was found that there is a large disparity between data with and without anomalies, as well as between the types of anomalies present, which leads to a sparse classification space (the so-called curse of dimensionality). Naturally, the largest fraction of the data is made up of periods without anomalies, with few representatives for each anomaly that may occur. The indiscriminate use of this data in training a machine learning model can lead to serious training bias, leading the model to return every result as “no anomaly” while keeping hit rates high, as this data forms the largest fraction of the base. Given this, we chose to employ an unsupervised model, more specifically stochastic outlier selection. This way, there is no need to use previously annotated data, and the algorithm is capable of identifying anomalies based on signatures automatically detected in a historical data pre-processing step. These signatures naturally change over time and as the field matures, making this method especially effective, since it is not necessary to deploy a team of engineers to re-annotate sections of anomalies as their behavior evolves.
In short, all or most of the documents mentioned above disclose technologies that aim at remotely monitoring one or more oil wells by means of the processing of sensor data, which allows anomalies to be identified. However, all the proposals presented in documents D1-D5 feed data directly into artificial intelligence (AI) models for classification or forecasting.
On the other hand, in the present invention the data processing pipeline follows a flow in which, based on curves of pressure, flow rate, temperature and other reservoir features, signatures of these curves are calculated by subdividing them into periods of T days each. Such features are calculated using wavelet transforms, Fourier feature mapping, entropy, among other methods described previously. These features are then processed by dimensionality reduction with principal component analysis and, finally, the processed datapoints are sent to the module that executes the outlier detection model, which looks for anomalies in each time sample T. The main point is that at no time is it necessary to have annotated data or identification tags; the model is based on a statistical analysis of the distribution of the pre-processed features themselves. This drastically reduces the time required for training and fitting models.
In this way, it appears that a person skilled in the art, in possession of any of the documents mentioned above or combinations thereof, would not have the elements needed to arrive at the dimensionality reduction step via principal component analysis, which can be considered a main step in solving the technical problem in question, since it effectively assists in the processing of a large volume of data and has a direct effect on the decision-making time after the detection of anomalies; that is, it is guided by the resolution of the technical problem.
The main objective of the present invention is to enable continuous monitoring and to detect anomalous behavior in wells equipped with intelligent completion automatically, by means of a method implemented with artificial intelligence. The present invention applies AI techniques to monitor wells in an oil field and has the ability to learn what the usual behavior of each well is, based on temperature, pressure and flow rate sensors, and then to identify, by means of a stochastic outlier selection technique, what constitutes a deviation from usual behavior. From the outlier detection, it is possible to quantify an anomaly probability and associate it with a possible event, such as: sensor failure and loss of data, closure of one of the producing intervals, scale deposition, among others.
To assist in identifying the main features of the present invention, the figures to which references are made are presented, as follows:
The present invention refers to a method of detecting anomalies in the oil well and reservoir system using artificial intelligence that comprises the steps of:
In step (A), the method begins by collecting pressure, temperature and daily flow rate data, preferably from the PI System and SIP (Production Information System) servers for the registered wells, although the data can also be collected from other historian software, such as AVEVA Historian. This data is stored in main memory in tabular format.
In step (B), pre-processing takes place by standardizing the time units obtained to the format YYYY-MM-DD hh:mm:ss.mmmm+TZ, corresponding to the format determined by the ISO-8601 standard, where TZ refers to the time zone, and further by interpolating missing data (for example, due to the occasional failure of one of the sensors) and interpolating the daily flow rate data to a 5 s step. The data resulting from step (B) is enriched with features such as: logarithmic derivatives of pressure and temperature, as well as the pressure (and temperature) differences between the annular sensors and the string sensor, in addition to their logarithmic derivatives.
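By way of illustration, a minimal sketch of how step (B) could be implemented with pandas is shown below, assuming the collected data has been consolidated into a DataFrame; the column names are hypothetical placeholders for the real historian tags and do not correspond to any specific implementation of the invention.

```python
import pandas as pd

def preprocess(raw: pd.DataFrame) -> pd.DataFrame:
    """Illustrative sketch of step (B): ISO-8601 timestamps and 5 s interpolation.

    `raw` is assumed to have a 'timestamp' column plus one column per sensor
    series (pressures, temperatures) and a daily 'flow_rate' column; these
    names are hypothetical.
    """
    df = raw.copy()
    # Parse and normalize timestamps (ISO-8601 with an explicit time zone).
    df["timestamp"] = pd.to_datetime(df["timestamp"], utc=True)
    df = df.set_index("timestamp").sort_index()

    # Bring every series to a common 5-second grid.
    df = df.resample("5s").mean()

    # Interpolate gaps caused by occasional sensor failures and spread the
    # daily flow rate values over the 5 s grid.
    return df.interpolate(method="time", limit_direction="both")
```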
In step (C), referring to dimensionality reduction via principal component analysis, for wells with intelligent completion there is a total of nine raw series, that is, series obtained directly from sensors and databases: four pressure series, four temperature series, and one daily flow rate series. From them, the differences between the annulus and the string are calculated (+3 pressure series and another 3 temperature series), as well as their derivatives (+14 series, one for each series previously obtained, except the flow rate) and, finally, the 4 series proposed by Tian and Horne (2019). This results in a total of 33 time series for each well. The use of automatic algorithms for calculating time series characteristics increases this number to hundreds of series per well, making it essential to apply dimensionality reduction techniques, more specifically principal component analysis, as it retains the general variance of the dataset in a reduced dimensional space, which speeds up processing and reduces the noise associated with less relevant features.
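The construction of the derived series described above can be sketched as follows; this is only an illustrative outline, and column names such as 'p_ann_1' (annulus pressure of zone 1) and 'p_str' (string pressure) are hypothetical.

```python
import numpy as np
import pandas as pd

def enrich(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative construction of derived series for step (C):
    annulus-string differences and logarithmic (Bourdet-style) derivatives."""
    out = df.copy()
    # Elapsed time since the start of the window, shifted to avoid log(0).
    elapsed = (out.index - out.index[0]).total_seconds() + 1.0

    for zone in ("1", "2", "3"):
        # Differences between each annulus sensor and the string sensor.
        out[f"dp_{zone}"] = out[f"p_ann_{zone}"] - out["p_str"]
        out[f"dt_{zone}"] = out[f"t_ann_{zone}"] - out["t_str"]

    # Logarithmic derivatives d(series)/d(ln t) for every series except flow rate.
    base_cols = [c for c in out.columns if c != "flow_rate"]
    for col in base_cols:
        out[f"dlog_{col}"] = np.gradient(out[col].to_numpy(), np.log(elapsed))
    return out
```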
Various techniques that seek to identify signatures in time series can be used to reduce their dimensionality, preferably chosen from: shapelet transforms, Benford correlation, the continuous wavelet transform with the Ricker wavelet, coefficients of the fast Fourier transform, Fourier entropy, Friedrich coefficients, kurtosis, number of peaks, the spectral centroid (mean), variance, slope and kurtosis of the absolute spectrum of the Fourier transform, and autocorrelation.
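As a simplified, non-exhaustive sketch of the kind of signature features listed above, a few of them can be computed for one block of a series with numpy/scipy; in practice, an automatic feature-extraction library is typically used to obtain hundreds of such features per series, and the selection below is only illustrative.

```python
import numpy as np
from scipy.stats import kurtosis
from scipy.signal import find_peaks

def block_signature(x: np.ndarray) -> dict:
    """Compute a small subset of the cited signature features for one block."""
    centered = x - x.mean()
    spectrum = np.abs(np.fft.rfft(centered))
    freqs = np.fft.rfftfreq(x.size)
    total = spectrum.sum()
    power = spectrum / total if total > 0 else spectrum

    nonzero = power[power > 0]
    return {
        "fourier_entropy": float(-(nonzero * np.log(nonzero)).sum()),
        "spectral_centroid": float((freqs * power).sum()),
        "kurtosis": float(kurtosis(x)),
        "number_of_peaks": int(find_peaks(x)[0].size),
        "autocorrelation_lag1": float(np.corrcoef(x[:-1], x[1:])[0, 1]),
        "variance": float(np.var(x)),
    }
```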
The main problem in this case is that, as dimensionality increases, the spatial volume of the data grows so quickly that the available relevant data can become sparse. Without dimensionality reduction, the amount of data required to obtain a reliable result often grows exponentially with dimensionality. In this sense, much of the data calculated in the attempt to identify signatures may be of little relevance and have a negative influence on the result, in addition to consuming significant processing time. To mitigate this effect, the principal component analysis (PCA) technique calculates correlations between all available data, allowing us to distinguish which of these data have the greatest impact on the system's variability. This way, it is possible to filter out the data that has the least effect on the AI result and use only the most relevant information in the AI training.
In step (D), after consolidating the data appropriately in the previous step, it is used for training the AI, which involves a set of instructions for detecting outliers, mainly by means of a stochastic method.
The set of instructions (algorithm) is capable of identifying extreme variations and reporting them as anomalies in an unsupervised manner. The advantage of this approach is that the unsupervised outlier selection set of instructions takes as input only a matrix of data characteristics coming from the PCA, generating an outlier probability for each datapoint. Therefore, it is not necessary for an expert to have exhaustively classified and annotated a prior training set for the AI.
Given a user-specified period (for example, 6 months of well history), the system implemented by the method of the present invention subdivides the data into series with a defined block size, for example, 4 days. This series of 4-day blocks is used for each well, so that the AI is able to calibrate the behavior considered normal for each well and associate a score with each block representing the probability that it contains an anomaly.
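A minimal sketch of this subdivision into blocks, assuming the enriched history is indexed by time in a pandas DataFrame, could be:

```python
import pandas as pd

def split_into_blocks(history: pd.DataFrame, block_days: int = 4) -> list[pd.DataFrame]:
    """Split a well history into fixed-size blocks (default 4 days).

    Each non-empty block becomes one datapoint: its signature features are
    later computed, reduced by PCA and scored by the outlier selection model.
    """
    grouped = history.groupby(pd.Grouper(freq=f"{block_days}D"))
    return [block for _, block in grouped if not block.empty]
```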
The most recent period is the one used in step (E), which uses the knowledge acquired by the AI from the historical data of each well to determine the probability that an anomaly is occurring at this moment, or in the near past (the last four days of the period), in each well.
In short, it should be noted that the pressure, temperature and flow rate data, as well as the differences between the annular and string sensor data, the derivatives of these data, the Horne characteristics and other measurements related to the time series, are processed by a set of instructions for signature extraction. After that, the generated set has its dimensionality reduced and is given as input to the outlier identification algorithm.
In step (F), the results are sent to the PI servers, so that the operators are notified, can inspect the values and then determine the appropriate course of action.
To carry out said method, the invention preferably further uses a system comprising:
The data access and collection module depends on one or more temperature and pressure sensors for each monitored well, as well as a processor, main memory and hard disk, and a network connection with the bases that store said data (PI and SIP, for example).
The model training module comprises a software architecture implementing a set of instructions for processing a large volume of information in the form of time series, through the parallel processing of data curves and identification of time series signatures. The software architecture consists of sets of instructions for extracting and identifying signatures from data and AI algorithms for selecting anomalies. The model training module comprises a machine with a processor, main memory and hard disk.
The writing module is responsible for recording the resulting anomaly probability as a time series in a historian. Taking the PI System as an example, the PI tag writing module comprises a remote storage server (PI Server), as well as the computer that runs the invention, with processor, main memory, hard disk, and access to the network where the PI server is located. This module inserts tags into PI that identify a time series as containing a possible anomalous behavior in some period of its history, and is used as an alarm for responsible operators.
The first step towards the appropriate application of the invention is the survey of the databases, as well as obtaining and pre-processing them. Given that, as previously stated, this step is the most manual and consumes the most man-hours, its automation represents a considerable saving of resources.
Obtaining and pre-processing this data takes place in the data access and collection module (1), which obtains previously stored pressure, temperature and flow rate data and consolidates it in tabular format, so that it can feed the Artificial Intelligence models implemented in the model training module (2).
However, pressure and temperature data come from sensors sampled in a window of seconds, while flow rate data is calculated in a daily window, raising the need to standardize the sampling intervals. As the set of instructions of the method (algorithm) must be applied to anomalies that can occur both in intervals of minutes/hours and in intervals of days, and given that the expected flow rate from the wells should not vary abruptly over an interval of minutes to hours, it was decided to interpolate the flow rate data to obtain a sampling similar to that of the pressure data. The data, consolidated from that moment on, can be consumed by the other modules of the invention.
When monitoring wells, a series of anomalies of different types can occur; however, there are few records representing each type of anomaly. This finding makes the unsupervised approach, where the model would detect an anomaly, regardless of the type, the most suitable for the desired monitoring.
The present invention preferably uses the stochastic method for identifying outliers, modified with the insertion of physical features, to detect the occurrence of anomalies at reservoir depths in oil wells. Outliers are data that are radically different from all the others, values that deviate from normality and that can cause imbalance in the results obtained.
Considering the characteristics of the problem, an empirical study was carried out in which, analyzing the time series of well data over certain time intervals, different features were considered for the detection process. For each feature, pressure and temperature data from PDG (permanent downhole gauge) sensors, for example string and annular PDGs, and flow rate measurements were considered. These data were used in an initial step to build the input features of the present method.
Other data were calculated, such as the pressure delta (δp) and temperature delta (δt) values, as well as the pressure and temperature derivatives. Further, as additional features, the data proposed by Liu and Horne (2013) were calculated. The set of features proposed by that work is described below:
It should be noted that the pressure and temperature differences described are related to the recording differences between the sensors located in the well annuli and the string sensor.
In the aforementioned determination of the set of features, the following conditions are preferably respected: 1) the flow rate (q), time, temperature (temp), pressure derivative (well-testing Bourdet derivative) and temperature derivative are considered, in addition to the deltas; 2) the features are calculated for each instant i of the well data; 3) to define the analysis set, the well dataset is subdivided into user-defined intervals (4 days, for example); 4) the data is sampled every 10 min; 5) the calculation is done for each of the PDGs (annular and string).
After calculating all the information and establishing the time series used for each well, each period analyzed to estimate outliers corresponds to a total period of 4 days. That is, the well data histories and additional calculated data are divided into 4-day intervals, with each of these intervals being a datapoint to be considered. With all well data divided into time intervals, specific characteristics of each curve are analyzed for each interval. For each of the generated series, shapelets are used in an attempt to extract signatures from the curves, as well as the continuous wavelet transform, entropy, statistics on the autocorrelation of the time series, variance, linear trend and Fourier entropy, among others, such as the mean of the differences.
As an output from this step, we have a considerable number of features for each well in the period and their sensor data. To reduce the complexity of the process, a dimensionality reduction technique is applied, via Principal Component Analysis (PCA), where the information is reduced to a set of components that represent 90% of the data variance.
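A sketch of this reduction step with scikit-learn, assuming the per-block features have already been assembled into a numerical matrix (one row per 4-day block), could be:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def reduce_features(feature_matrix: np.ndarray) -> np.ndarray:
    """Keep only the principal components that together explain 90% of the variance."""
    scaled = StandardScaler().fit_transform(feature_matrix)
    # Passing a float in (0, 1) to n_components keeps just enough components
    # to reach that fraction of the explained variance.
    return PCA(n_components=0.90).fit_transform(scaled)
```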
At the end of the data preparation, the set of instructions present in the “Stochastic Outlier Selection” (SOS) algorithm is used to actually detect which of the analyzed periods correspond to a real anomaly. SOS is an unsupervised outlier selection algorithm that takes as input a feature matrix or a dissimilarity matrix and generates, for each datapoint, a probability of that datapoint being an outlier.
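One way to run this step is through the SOS implementation available in the PyOD library; the sketch below assumes that library and treats the perplexity value as a tunable parameter rather than a value prescribed by the method.

```python
import numpy as np
from pyod.models.sos import SOS  # Stochastic Outlier Selection (PyOD implementation)

def score_blocks(reduced: np.ndarray, perplexity: float = 4.5) -> np.ndarray:
    """Return, for each block (row of the PCA-reduced matrix), its outlier score,
    which SOS interprets as the probability of that block being an outlier."""
    detector = SOS(perplexity=perplexity)
    detector.fit(reduced)
    return detector.decision_scores_
```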
With the dissimilarity matrix given as input, the algorithm calculates an affinity matrix, then a binding probability matrix relating the datapoints, and finally the outlier probability vector. Each entry in the generated vector corresponds to the probability of the respective interval being a discrepant value and representing, in fact, an anomaly.
Intuitively, a datapoint is considered an outlier when the other datapoints have insufficient “affinity” with it. The affinity that a given datapoint (or interval of days) has with another datapoint decreases in a Gaussian fashion with their dissimilarity. Each datapoint has a variance associated with it. The variance depends on the density of the neighborhood: higher density implies lower variance. In fact, the variance is defined so that each datapoint effectively has the same number of neighbors. This number is controlled through the only SOS parameter, called perplexity. The perplexity can be interpreted as the k of the k-nearest-neighbors algorithm. The difference is that in SOS, being a neighboring point is not a binary property, but a probabilistic one.
With this approach, the set of instructions (algorithm) generates a probability matrix taking into account this concept of affinity. The binding probability matrix is the row-normalized affinity matrix, so that each of its rows sums to 1. To obtain the outlier probability of a datapoint, the joint probability that the other datapoints do not bind to it is calculated according to the following equation:
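The equation itself is not reproduced in this text; in the published formulation of Stochastic Outlier Selection, with $d_{ij}$ denoting the dissimilarity, $a_{ij}$ the affinity and $b_{ij}$ the binding probability between datapoints $x_i$ and $x_j$, $\sigma_i$ the per-point variance set by the perplexity, and $\mathcal{O}$ the set of outliers, the quantities described above are usually written as (notation assumed here):

```latex
a_{ij} = \exp\!\left(-\frac{d_{ij}^{2}}{2\sigma_i^{2}}\right),\quad a_{ii}=0;
\qquad
b_{ij} = \frac{a_{ij}}{\sum_{k \neq i} a_{ik}};
\qquad
p(x_i \in \mathcal{O}) = \prod_{j \neq i}\left(1 - b_{ji}\right)
```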
Each row corresponds to a datapoint and its respective relations with the other points. This simple equation matches the intuition behind SOS mentioned previously: a datapoint is considered an outlier when the other points have insufficient affinity with it.
Through tests on real data, it was found that probabilities greater than 55% could be considered real anomalies occurring at the bottom of the well (at depths consistent with those of the reservoir) and/or in the reservoir. In this way, in the context of the aforementioned method, it was observed that:
Said system implemented by the method of the present invention works as a kind of alarm, in which, for each outlier detection, an event is written to a database and an event window is highlighted for user analysis.
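A minimal sketch of this alarm step, using the 55% threshold mentioned above and hypothetical names for the block start times and event list, could be:

```python
ANOMALY_THRESHOLD = 0.55  # probabilities above 55% are treated as real anomalies

def flag_anomalies(block_starts, probabilities, threshold=ANOMALY_THRESHOLD):
    """Pair each block start time with its outlier probability and keep only
    those above the threshold, to be written to the historian as alarm events
    and highlighted as event windows for user analysis."""
    return [(start, p) for start, p in zip(block_starts, probabilities) if p > threshold]
```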
The method is also much more sensitive than the previously described models since, as it is an unsupervised outlier selection model based on the statistical distribution of the features, it is capable of identifying anomalies even if there are few samples in the training data or even if similar events were not represented in the training.
In short, the present method innovates by providing probabilities for the anomalies of concern in the reservoir management routine, with the great advantage of having the wells monitored by artificial intelligence, enabling the specialist engineer to direct their attention only to the wells that present a significant probability of anomaly. The invention has the advantage of presenting itself as a methodology that allows all wells in a field to be constantly monitored by AI, reducing the possibility of a problem going unidentified due to a lack of time for a human to monitor all the wells.
Those skilled in the art will value the knowledge presented herein and will be able to reproduce the model in the presented embodiments and in other variants, encompassed by the scope of the attached claims.