The present invention relates to a computer-implemented method of predicting corrosion rate in a section of a pipe. The present invention further relates to a computer system configured to execute this method. The present invention further relates to training a computer-implemented machine learning model for said computer-implemented method.
Pipelines are widely used for transmission of any substance comprising hydrocarbon fluids (oil and gas). Corrosion is one of the leading causes of pipeline failure, both in onshore and offshore transmission pipelines. Corrosion is caused by oxidation and electrochemical breakdown of the structure of a pipeline section, used in the pipeline to convey the substance. Typically in these pipelines, internal corrosion, caused by the substance being transmitted, presents the dominant corrosion failure mode. Its mitigation requires extensive and reliable predictive modeling.
Various models for pipeline corrosion prediction have been developed. Particularly, physics-based mechanistic corrosion models have demonstrated reliable results. The physics-based mechanistic corrosion models are increasingly enabled by computational fluid dynamics (CFD) simulation, to incorporate the many mechanisms such as chemical kinetics and hydrodynamics. CFD-based modeling allows for flexible numerical characterization of a broad range of corrosion scenarios including different gas species, pipeline geometries and flow conditions, which are practically infeasible in laboratory and field experiments. However, application in practice tends to be limited by its high computational cost.
To accelerate the corrosion prediction process, it has been proposed to replace the CFD model with a machine-learning enabled surrogate model. Such a surrogate model, sometimes also referred to as a proxy model or meta model, is a statistically defined model (or function) that replicates the CFD model output over a multidimensional parameter space of selected input parameters. In the SPE paper “Machine Learning Based Predictive Models for CO2 Corrosion in Pipelines With Various Bending Angles” (SPE-201275-MS; published in Proceedings of the Annual Technical Conference & Exhibition, October 2020), Huihui Yang et al. present machine learning surrogate models, based on Light Gradient Boosting Machine (LightGBM) and Multiple Layer Perceptron Neural Network (MLPNN), for the prediction of CO2 corrosion in aqueous pipelines with different pipe bending angles. A total of seven variables, including flow velocity, pH value, and CO2 concentration of the substance flowing through the pipe, and pipe inner diameter, pipe bend angle, bend radius, and temperature, are taken as input variables with the corrosion rate as target output variable. A CFD model was used to compute the electrochemical processes occurring at the metal inner surface of the pipe to predict the corrosion rate. As these features have a non-linear relationship with the target output, LightGBM and MLPNN were chosen to statistically map the input variable space to the corrosion rate.
In one aspect, there is provided a computer-implemented method of estimating corrosion rate in a section of a pipe transmitting a substance comprising an aqueous phase and corrosive particles, said method comprising:
In another aspect, there is provided a computer system for estimating corrosion rate in a section of a pipe transmitting a substance comprising an aqueous phase and corrosive particles, said computer system comprising:
In still another aspect, there is provided a computer-implemented method of training a computer model for estimating corrosion rate in a section of a pipe transmitting a substance comprising an aqueous phase and corrosive particles, said method comprising:
Optionally, non-transitory computer-readable memory of the computer system may contain further computer-readable instructions capable of causing the computer system to execute one or more other processing steps as set forth herein, including those specified in the appended claims.
The drawing figures depict one or more implementations in accord with the present teachings, by way of example only, not by way of limitation. In the figures, like reference numerals refer to the same or similar elements.
one adaptive sampling iteration;
The person skilled in the art will readily understand that, while the detailed description of the invention will be illustrated making reference to one or more embodiments, each having specific combinations of features and measures, many of those features and measures can be equally or similarly applied independently in other embodiments or combinations.
In the present specification, the term “inflow velocity” (Vin) is defined as the average flow velocity across the inlet area. Mathematically, this corresponds to the volumetric flow rate of the substance through the pipe divided by the cross-sectional area available for flow in the pipe section. In case of a circular cross section of the pipe section, the cross-sectional area is equal to 0.25×π×d², wherein d denotes an inner diameter of the circular cross section.
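The definition of inflow velocity above can be illustrated with a minimal sketch. The function name, argument names, and numerical values are illustrative assumptions and do not appear in the specification; SI units are assumed.

```python
import math


def inflow_velocity(volumetric_flow_rate_m3_s: float, inner_diameter_m: float) -> float:
    """Average inlet velocity Vin = Q / A for a circular cross section.

    A = 0.25 * pi * d**2, with d the inner diameter of the pipe section.
    """
    if inner_diameter_m <= 0:
        raise ValueError("inner diameter must be positive")
    area_m2 = 0.25 * math.pi * inner_diameter_m ** 2
    return volumetric_flow_rate_m3_s / area_m2


# Illustrative values only (assumed, not from the specification).
v_in = inflow_velocity(volumetric_flow_rate_m3_s=0.05, inner_diameter_m=0.2)
print(f"Vin = {v_in:.3f} m/s")
```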
The term “near-wall velocity” is defined as the local free stream velocity directly adjacent to the boundary layer which is in contact with the inside wall of the pipe section. In case of CFD, it is approximated by the first cell average velocity of a cell bound by the inside wall, provided y+ exceeds 30. The “maximum near-wall velocity” (Vmax) is the highest value of location-resolved near-wall velocities within the pipe section.
The term “physics-based simulation” is used herein to describe any type of simulation that uses a physics model, as distinct from a data-driven model. A physics model is built on laws and equations of physics, and it generally uses differential equations that are based on conservation laws or other physical principles. In case of fluid flow simulation, physics models may use Navier-Stokes equations and/or Euler equations. Physics-based simulations can be implemented in various forms, including computational fluid dynamics (CFD) software and finite element calculation software. Lattice Boltzmann methods are a recognized class of CFD methods.
A new computer-implemented approach has been developed to estimate corrosion rate in a section of a pipe transmitting a corrosive substance. A trained surrogate model is provided to output an estimated value of maximum near-wall velocity of the substance in the pipe section. The estimated value of maximum near-wall velocity is then fed into a computerized electrochemical model, together with electrochemical parameters associated with the corrosive substance, which electrochemical model then determines an estimated corrosion rate imposed on the pipe section by the corrosive substance.
The surrogate model is trained using results of a full physics-based simulation. Once it has been trained, the surrogate model can generate the estimated value of maximum near-wall velocity much faster than the full physics-based simulation can.
The present invention is based on the surprising discovery that there is a much stronger and more reliable correlation between corrosion rate and the near-wall velocity of a corrosive fluid in a pipe section than, for example, between corrosion rate and shear stress at the wall. The present invention is also based on the finding that a surrogate model can be trained to deliver the near-wall velocity in a given pipe section configuration. The parameter space of input parameters can be much smaller than would be the case for a machine-learning enabled surrogate model that predicts the corrosion rate as output, such as the one published in, for example, the above-mentioned 2020 SPE paper by Huihui Yang et al. As a result, the model can be simpler (fewer layers, for example) and training can be done much faster with smaller training data sets.
An additional advantage of the presently proposed approach, which de-couples the fluid flow simulation from the electrochemical model, is that it can fairly easily be applied to other types of corrosive substances and/or pipe materials. Alternatively, it is relatively less time-consuming to train a new surrogate model for the flow aspects for other pipeline section parameters (other shapes, other types of fluids, etc.) than it would be to retrain an entire machine learning enabled model that outputs corrosion rate directly. The surrogate model can be combined with any type of corrosion mechanism.
In the example described hereinafter, a Bayesian active learning method has been developed to efficiently and automatically collect CFD samples and construct the predictive model. The term “predictive model” is used herein to describe the combined model of the surrogate model and the electrochemical model. The model in this example is based on Gaussian process regression (GPR). It not only predicts the corrosion rate for a given pipeline design but also provides uncertainty quantification associated with the prediction. An adaptive sampling strategy is applied to automate the sampling process in order to reduce the predictive uncertainty. The exploration technique currently proposed for adaptive sampling is geared to find the next sampling position that will reduce the quantified uncertainty the most. In addition, both physics-based and data-driven dimension reduction methods are employed to simplify the corrosion modeling sample space by orders of magnitude to enable practical applications of CFD-based corrosion predictive modeling.
The following sections introduce the CFD-based corrosion model. The active learning framework is described in detail, and the proposed method is demonstrated in a case involving the prediction of O2-dominated pipeline corrosion. However, it is to be understood that these are by way of example only. The method can readily be applied to other corrosive particles and pipes, or using other physics-based models to replace CFD.
The specific pipe geometry under consideration consists of a pipe section with two straight sections connected by an elbow section, as illustrated in
The overall learning task is to construct the predictive model for pipeline corrosion induced by gases (e.g. CO2 and O2) dissolved in aqueous medium. Specifically, a function is sought after that maps from the input variables (including the pipeline design parameters and operating conditions) to an output quantity of interest (i.e. the pipeline corrosion rate), which falls under the broad category of multivariate regression problems. The foremost distinguishing feature of the problem is the tension between the high cost of sampling from the underlying CFD model for the electrochemical process of corrosion and the need to scale up and speed up predictive modeling with budgeted resources in an industrial setting. Hence, a data-scarce scenario should be assumed instead of data being virtually free and unlimited. Additionally, the safety-critical nature of the application under consideration demands uncertainty quantification and/or error estimation of the model prediction, to facilitate engineering and business decision making.
The predictive model combines electrochemistry and near-wall mass transport for aqueous flow inside a steel pipeline of an elbow-shaped geometry. The model has been developed and validated for various corrosion-inducing gases such as CO2 and O2. The corrosion is assumed to be mainly driven by processes involving chemical reactions related to the solution and electrochemistry. For example, the following reactions dominate corrosion for water flow inside a steel pipeline with dissolved CO2 and O2, including solution chemistry:
CO₂ + H₂O ⇔ H₂CO₃
H₂CO₃ ⇔ HCO₃⁻ + H⁺
HCO₃⁻ ⇔ CO₃²⁻ + H⁺
anodic reaction:
Fe → Fe²⁺ + 2e⁻
and cathodic reactions:
2H₂CO₃ + 2e⁻ → H₂ + 2HCO₃⁻
2HCO₃⁻ + 2e⁻ → H₂ + 2CO₃²⁻
O₂ + 2H₂O + 4e⁻ → 4OH⁻
These reactions are combined with surface mass transport limitation equations to determine the rate of corrosion. In the CFD model, the Reynolds-averaged Navier-Stokes equations are employed with a k-ϵ turbulence model for fully turbulent flow inside the pipe section. This study focuses on single-phase flow, while two-phase oil-water flow is also supported albeit at a higher computational cost. The model may suitably be implemented and executed using ANSYS Fluent 18.1 (Ansys, Inc. Canonsburg, PA 15317, U.S.A).
Samples of CFD-generated data, i.e. the simulated maximum near-wall velocity (one output) together with the specific values of the input parameters (the set of geometric parameters and inflow velocity) which led to the simulated maximum near-wall velocity, are then passed to a machine learning (ML) model for training at 40. This will be the surrogate model.
Still referring to
Any artificial neural network may be employed as the ML model for the surrogate model. A feed forward artificial neural network will typically suffice. The neural network may be fully connected, and comprise two or more hidden layers. Preferably, however, a Gaussian Process Regression (GPR) model is employed, one of its advantages being that it has the ability to provide uncertainty quantification on the predictions. Uncertainty estimates may be advantageously employed to drive an adaptive data sampling strategy, as will be further explained below. GPR is a suitable regression method, but other types of non-parametric regression may be employed instead, preferably those that provide uncertainty quantification with the predictions.
At 80, the estimated maximum near-wall velocity value(s) are fed to the electrochemical model, together with input parameters 90 relevant for the electrochemical model. These additional electrochemical input parameters may be included in the input query vector, but they are not used by the surrogate model at 60. The electrochemical input parameters 90 generally may include parameters associated with the nature of the substance in the pipe section. This typically includes the temperature of the substance, the type of corrosive particle in the substance, the partial pressure of each type of corrosive particle present in the substance, and the pH of the substance. At 100, the electrochemical model outputs the estimated corrosion rate corresponding to the specific query vector.
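The two-stage structure described above (surrogate model first, electrochemical model second) can be sketched as follows. This is only a structural illustration: all names are hypothetical, the query layout (inflow velocity as the last element) is an assumption, and the two toy models are arbitrary placeholders with no physical meaning. They are not the surrogate or electrochemical models of the specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence


@dataclass(frozen=True)
class ElectrochemicalInputs:
    """Parameters used only by the electrochemical model (hypothetical container)."""
    temperature_c: float
    ph: float
    partial_pressures_bar: Dict[str, float]  # corrosive species -> partial pressure


def estimate_corrosion_rate(
    flow_query: Sequence[float],
    chem: ElectrochemicalInputs,
    surrogate: Callable[[Sequence[float]], float],
    electrochemical_model: Callable[[float, ElectrochemicalInputs], float],
) -> float:
    """Chain the two models: flow inputs -> Vmax -> corrosion rate."""
    v_max = surrogate(flow_query)              # surrogate sees flow inputs only
    return electrochemical_model(v_max, chem)  # chemistry enters only here


# Placeholder models (assumptions for illustration, arbitrary coefficients).
def toy_surrogate(flow_query: Sequence[float]) -> float:
    *_geometry, v_in = flow_query  # assumed layout: geometry..., inflow velocity
    return 1.5 * v_in


def toy_electrochemical_model(v_max: float, chem: ElectrochemicalInputs) -> float:
    return 0.1 * v_max * chem.partial_pressures_bar.get("O2", 0.0)


chem = ElectrochemicalInputs(temperature_c=25.0, ph=7.0, partial_pressures_bar={"O2": 0.2})
rate = estimate_corrosion_rate([90.0, 1.0, 2.0], chem, toy_surrogate, toy_electrochemical_model)
print(rate)
```

Because the chemistry enters only in the second stage, either callable can be swapped without retraining the other, which is the de-coupling the description relies on.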
An active learning approach is adopted to sample training data from the costly physics-based CFD model in a manner that optimizes the value of collected data given limited resources and minimizes human effort for modeling and sampling. An example workflow of the learning process is illustrated in
The active learning method used in this example employs a predictive model based on Gaussian process regression, combined with a sampling strategy driven by greedy exploration and user interaction.
Gaussian process regression (GPR) models are nonparametric kernel-based probabilistic models. GPR is associated with flexible kernel formulation, allowing for representing complex nonlinear multivariate functions. In addition, an underlying Bayesian formalism quantifies the uncertainty associated with the model prediction, and a measure of merit for the predictive model is provided, which can drive further sampling as part of the active learning framework. The GPR formulation employed is briefly described herein. A more detailed exposition is available in open references including Rasmussen, C. E. and Williams, C. K. I., “Gaussian Processes for Machine Learning”, MIT Press, Cambridge, Massachusetts, USA (ISBN 026218253X) (2006).
The Gaussian process (GP) is a stochastic process that characterizes the distribution of functions in a function space determined by the kernel (a symmetric positive definite function k(⋅,⋅): ℝ^d × ℝ^d → ℝ⁺). Regression based on GP (i.e. GPR) amounts to homing in on a specific distribution of functions that accounts for observed samples (i.e. a posterior distribution conditioned upon data). For a true model mapping x→y from input x ∈ ℝ^d (of dimension d) to output y ∈ ℝ and a corresponding dataset 𝒟 = {(x_i, y_i)}, i = 1 to n, of n samples, GPR seeks a function approximation f: x→f(x). The symbol ℝ indicates the set of real numbers, ℝ^d is the real number space of dimension d, and ℝ⁺ is the set of positive real numbers. By the definition of GP, the labeled data and a pointwise prediction f(x*) (for any x* ∈ ℝ^d) jointly follow a multivariate Gaussian distribution (assuming a zero mean without loss of generality),
where y ∈ ℝ^n, X ∈ ℝ^(n×d) and
It can be derived that the model prediction f(x*) is conditionally distributed given the data (X and y) as follows:
in which 𝒩 denotes the normal Gaussian operator and k is a kernel (covariance matrix). The mean μ* is the regression output predicted by GPR for input x*, and is endowed with uncertainty quantified in terms of the covariance matrix Σ* (i.e. the variance σ²(x*) for scalar-valued prediction f(x*)).
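The displayed equations for the joint and conditional distributions are not reproduced in this text. For orientation only, the standard zero-mean Gaussian process relations, as found in the Rasmussen and Williams reference cited above, take the following form. The symbols K and k_* are notation assumed here and may differ from the original equations; the kernel k is taken to include any noise term.

```latex
\begin{aligned}
\begin{bmatrix} \mathbf{y} \\ f(\mathbf{x}_*) \end{bmatrix}
&\sim \mathcal{N}\!\left( \mathbf{0},\;
\begin{bmatrix} K & \mathbf{k}_* \\ \mathbf{k}_*^{\top} & k(\mathbf{x}_*, \mathbf{x}_*) \end{bmatrix} \right),
\qquad K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j), \quad (\mathbf{k}_*)_i = k(\mathbf{x}_i, \mathbf{x}_*), \\[4pt]
f(\mathbf{x}_*) \mid X, \mathbf{y} &\sim \mathcal{N}(\mu_*, \Sigma_*), \\
\mu_* &= \mathbf{k}_*^{\top} K^{-1} \mathbf{y}, \\
\Sigma_* &= k(\mathbf{x}_*, \mathbf{x}_*) - \mathbf{k}_*^{\top} K^{-1} \mathbf{k}_* .
\end{aligned}
```

In this form the posterior variance Σ* never exceeds the prior variance k(x*, x*), and it shrinks towards the noise level near inputs that have already been sampled.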
The GP kernel k(⋅,⋅;θ), parametrized by θ ∈ ℝ^p (of dimension p), is defined as a sum of Matérn 5/2 (with anisotropic length scales) and white noise kernels. The Matérn 5/2 kernel component models twice-differentiable functions and reflects the assumption of smoothness in the underlying true model being approximated, while anisotropic length scales allow for multiscale features attributed to different input dimensions. The white noise kernel component accounts for random noises in the sampling process. For example, the numerical noise arising from post-processing the CFD results (which are dependent on the underlying numerical discretization and computational grid) to obtain the output data is considered random. The GP model parameter θ of the specific kernel employed here encodes the length scales and the variance of the Matérn 5/2 kernel and the noise level of the white noise kernel. For each GP model, the parameter θ is tuned using maximum likelihood estimation (MLE), intended to best explain the given dataset 𝒟.
The GPR prediction in Eq. (2) uses the kernel with optimized parameter θ*.
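A minimal NumPy sketch of a GPR prediction with a Matérn 5/2 kernel (anisotropic length scales) plus white noise is given below, assuming a zero-mean prior. The function names, the toy data, and the fixed hyperparameter values are assumptions for illustration. In particular, the maximum likelihood tuning of θ is not performed here; the hyperparameters are simply set by hand.

```python
import numpy as np


def matern52(A, B, length_scales, variance):
    """Anisotropic Matern 5/2 kernel between rows of A (n, d) and B (m, d)."""
    diff = (A[:, None, :] - B[None, :, :]) / length_scales
    r = np.sqrt(np.sum(diff ** 2, axis=-1))  # scaled Euclidean distance
    s = np.sqrt(5.0) * r
    return variance * (1.0 + s + s ** 2 / 3.0) * np.exp(-s)


def gpr_predict(X, y, X_star, length_scales, variance, noise):
    """Zero-mean GP posterior mean and variance of the latent function at X_star."""
    K = matern52(X, X, length_scales, variance) + noise * np.eye(len(X))
    K_star = matern52(X, X_star, length_scales, variance)  # shape (n, m)
    L = np.linalg.cholesky(K)                              # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))    # K^{-1} y
    V = np.linalg.solve(L, K_star)                         # L^{-1} K_star
    mean = K_star.T @ alpha
    var = variance - np.sum(V ** 2, axis=0)                # k(x*, x*) = variance
    return mean, np.maximum(var, 0.0)                      # clip round-off negatives


# Toy data (assumed): two inputs per sample, one scalar output per sample.
X_train = np.array([[0.0, 0.5], [90.0, 1.0], [180.0, 1.5], [45.0, 0.2]])
y_train = np.array([1.0, 1.4, 1.9, 1.2])
length_scales = np.array([60.0, 0.5])  # one length scale per input dimension

mean, var = gpr_predict(X_train, y_train, np.array([[120.0, 1.2]]),
                        length_scales, variance=1.0, noise=1e-6)
print(mean, var)
```

The Cholesky factorization is used instead of an explicit matrix inverse for numerical stability, which is the usual choice for GP regression.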
In the active learning framework described herein, new samples are collected adaptively and are used to continually update the model. This process forms a feedback loop as illustrated in
y′ is collected (by executing the physics-based CFD model) for the input x′ that maximizes the utility function:
where σ²(x) is the variance for GPR prediction f(x), {x_j}, j = 1 to N, are the queries specified by the user, and λ serves as a tunable parameter. The deviation from user queries in the (normalized) input space is measured by the Euclidean distance (i.e. an L2 norm). For λ=0, the sampling strategy reduces to an exploration-without-exploitation approach in Bayesian optimization.
The utility function u(x) requires only negligible cost to evaluate with the GPR model compared to the full physics-based (e.g. CFD) model, and hence allows for a greedy search for the utility-maximizing input x′ at which the subsequent sample is taken.
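The greedy search can be sketched for the special case λ = 0, in which the utility is assumed here to reduce to the predictive variance σ²(x). The term involving the user queries is deliberately left out because its exact form is not reproduced in this text. All names are hypothetical, and the variance function below is a simple stand-in (squared distance to the nearest existing sample in a normalized input space), not a GPR variance.

```python
from itertools import product


def select_next_sample(candidates, predictive_variance):
    """Greedy exploration: return the candidate with the largest variance."""
    return max(candidates, key=predictive_variance)


def make_standin_variance(samples):
    """Stand-in for sigma^2(x): grows with distance from existing samples."""
    def variance(x):
        return min(sum((a - b) ** 2 for a, b in zip(x, s)) for s in samples)
    return variance


# Candidate grid over a normalized two-dimensional input space [0, 1] x [0, 1].
grid = [i / 10 for i in range(11)]
candidates = list(product(grid, grid))

existing_samples = [(0.0, 0.0), (1.0, 1.0)]
x_next = select_next_sample(candidates, make_standin_variance(existing_samples))
print("next input to simulate:", x_next)
```

In a full loop, the expensive physics-based simulation would be run at `x_next`, the new sample appended to the dataset, and the regression model refitted before the next selection.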
Dynamical similarity in the fluid dynamics problem can be leveraged to accomplish a further reduction in the number of dimensions needed in the input space to train the surrogate model. Three of the input parameters (d, R, and Vin) and the output parameter (Vmax) are dimensional in the sense that they have units in the dimensions length and time. Another parameter which is used in the physics-based model, ν, which stands for kinematic viscosity of the substance, also has a dimension (units of m²/s). Based on the Buckingham π theorem for dimensional analysis, the original physics-based problem can be formulated equivalently using only three dimensionless variables by using ratios of dimensional variables. The following functional dependence is used:
In this dependence, d/R, Re_d and Vmax/Vin are postulated dimensionless variables respectively corresponding to non-dimensionalizing R, ν and Vmax. Re_d corresponds to the Reynolds number based on the pipe inner diameter, defined to be d·Vmax/ν. The input parameter space thus reduces to three-dimensional using the dimensionless formulation.
Data-driven methods have been employed to explore additional opportunities for further dimension reduction, and it has been found for the case described below that Re_d had a relatively small influence on Vmax/Vin. Hence, it was possible to simplify the input parameters to β and d/R only. Similar data-driven sensitivity analysis may be employed for other cases and geometries as well.
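The displayed functional dependence is not reproduced in this text. A form consistent with the surrounding definitions is sketched below as an assumption; f and g are generic function symbols introduced here, and β is the angular input parameter that appears in the search space, which is already dimensionless. The counting follows the Buckingham π theorem: five dimensional quantities (d, R, V_in, ν, V_max) with two base dimensions (length, time) yield 5 − 2 = 3 dimensionless groups.

```latex
\begin{aligned}
\frac{V_{\max}}{V_{\mathrm{in}}} &= f\!\left(\beta,\ \frac{d}{R},\ Re_d\right)
&&\text{(dimensionless formulation, three inputs)} \\[4pt]
\frac{V_{\max}}{V_{\mathrm{in}}} &\approx g\!\left(\beta,\ \frac{d}{R}\right)
&&\text{(after dropping the weak dependence on } Re_d\text{)}
\end{aligned}
```

The dimensional output is then recovered by multiplying the predicted ratio by the known inflow velocity V_in.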
The methodology presented herein has been tested. The training of the surrogate CFD model was initiated with only four initial samples 110, which were randomly selected within a search space formulated as {(β, d/R) | β ∈ [0°, 180°], d/R ∈ [0, 2]}. After running the initial four samples, the GPR and adaptive sampling procedures were executed iteratively until the terminal condition at 140 was met. To determine the convergence of the learning process, a relative uncertainty margin defined as
was used. Here, f(x) refers to the output of interest (i.e. the corrosion rate, which is strictly positive) and σ(x) refers to the associated standard deviation. With this equation, 1.96σ(x) measures the half-range of the 95% confidence interval.
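A convergence check of this kind can be sketched as follows. The displayed definition of the margin is not reproduced in this text, so the form 1.96·σ(x)/f(x), the use of the maximum over the evaluated inputs, the function names, the tolerance, and the numerical values are all assumptions made for illustration.

```python
def relative_uncertainty_margin(mean: float, std: float, z: float = 1.96) -> float:
    """Half-range of the 95% confidence interval relative to the prediction."""
    if mean <= 0:
        raise ValueError("the predicted output must be strictly positive")
    return z * std / mean


def has_converged(means, stds, tolerance):
    """Return (converged, worst margin) over all evaluated inputs."""
    worst = max(relative_uncertainty_margin(m, s) for m, s in zip(means, stds))
    return worst <= tolerance, worst


# Illustrative predictions at three query inputs (assumed values).
means = [20.0, 15.0, 28.0]
stds = [1.0, 0.5, 2.0]
converged, worst = has_converged(means, stds, tolerance=0.10)
print(converged, round(worst, 3))
```

Dividing by the prediction is only meaningful because the output of interest is strictly positive, which is why the sketch rejects non-positive means.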
The active learning method was executed for thirty learning-sampling iterations. The convergence of the uncertainty margin as defined in Eq. (7) is illustrated in
Results from the three selected iterations labeled with stars in of d/R on the vertical axis and β on the horizontal axis. The numbers inserted in the grade shaded zones indicate the lower limit of ranges of estimated corrosion rates. For example, the zone labeled 14 corresponds to the area in the heat map where the corrosion rate is estimated to be in the interval [14, 16), wherein a “[”-type bracket indicates that the lower value is included and a “)”-type bracket indicates that the upper value is excluded. The zone with the highest estimated corrosion rate corresponds to [28, 30). Heat map (b) on the right indicates the maximum relative uncertainty in the estimations of heat map (a) in %. The same convention is used for the intervals as in heat map (a). In
The dark dots indicate the existing samples on which the GPR model was trained. The four initial samples 181 are seen in heat map (b) of
In
All steps and machine learning models may suitably be integrated under one common user interface and automatically executable in the computer system so that manual execution of subsequent models is not necessary. All sequential deep learning models are applied to the data by the computer system without human intervention.
The methodologies and systems described above can be used practically in pipeline design to minimize the corrosion risk. They reduce the CFD simulation needs during the design process and are thus capable of exploring larger design spaces in a given amount of time. They can also be used in predictive maintenance to monitor or predict whether some spots in the pipeline have significant corrosion risk. Herewith it can be avoided that operations are stopped prematurely for inspection while at the same time reducing the risk of acute corrosion-induced failures. Accordingly, pipeline sections can be built and/or maintained in accordance with results provided by operating the methodologies and systems described herein.
The pipe sections discussed above may be included in any kind of piping that conveys corrosive substances, including pipelines, such as oil and gas pipelines, subsea cross-over lines, piping in a chemical plant or refinery, etc.
The person skilled in the art will understand that the present invention can be carried out in many various ways without departing from the scope of the appended claims.
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/EP2023/055684 | 3/7/2023 | WO | |

| Number | Date | Country |
|---|---|---|
| 63318514 | Mar 2022 | US |