METHOD FOR OBTAINING GEOLOGICAL HETEROGENEITY TRENDS OF A GEOLOGICAL FORMATION

Information

  • Patent Application
  • 20240361490
  • Publication Number
    20240361490
  • Date Filed
    March 23, 2022
    2 years ago
  • Date Published
    October 31, 2024
    16 days ago
Abstract
A method for obtaining geological heterogeneity trends of a geological formation, including the steps: drilling wells (w1, w2, w3, w4, w5) that penetrate the formation, acquiring well logs (log 1, log 2) for each well (w1, w2, w3, w4, w5) as function of depth inter Vals (Dk) of the respective well, determining a third degree tensor (Tk, m, n), where a z-dimension denotes the depths, a x-dimension denotes the well logs, and a y-dimension denotes the wells, extracting matrices (L1k,m, L2k,m, . . . , LMk,n) from the tensor (Tk, m, n), clustering the matrices (L1k,n, L2k,n, . . . , LMk,n) based on the characteristics of the corresponding well logs (log 1, log 2) to a clustering result matrix, aggregating the clustering result matrix to a cluster ensemble (π1, π2, . . . , πM), and spatial partitioning the cluster ensemble (π1, π2, . . . , πM) to a map that shows the geological heterogeneity trends associated with cluster types of the wells (w1, w2, w3, w4, w5).
Description
BACKGROUND

A geological formation is a rock with geological properties that characterize the formation. To obtain the geological properties of the formation, a well is drilled that penetrates the formation. A detailed record of the well is obtained by well logging.


Geologic data from well logging (well log data) are used to explore wells and oil production in the petroleum industry. Well log data are obtained by running various logging devices in wells to detect the geological properties of the formation and/or properties of a fluid within the formation. The properties of a formation may be naturally occurring radioactivity, e.g. gamma ray, or other natural and induced formation signals such as spontaneous potential, bulk density, neutron porosity, acoustic, and resistivity. The well log data is used for interpretations for quantitative formation evaluation and commonly annotations on logs such as stratigraphic tops, which are widely used as the standard graphic base for subsurface cross sections of properties.


Heterogeneity is the variation of rock properties as function of the location in a reservoir or formation. The qualitative and quantitative analysis of the properties in the well logs may effectively reflect the gross geologic variations and thus the heterogeneous characters of the formation. Due to the horizontal and vertical geologic variations in a formation, there is a need to group the wells with those variations based on the nature of the properties in the well logs. Different groups of wells may reflect different underground heterogeneous zones or formations.


Accordingly, there exists a need for a method for obtaining geological heterogeneity trends of a geological formation.


SUMMARY

This summary is provided to introduce a selection of concepts that are further described below in the detailed description. This summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used as an aid in limiting the scope of the claimed subject matter.


In one aspect, embodiments disclosed herein relate to a method for obtaining geological heterogeneity trends of a geological formation, comprising the steps: drilling wells that penetrate the formation, acquiring well logs for each well as function of depth intervals of the respective well, determining a third degree tensor, where a z-dimension denotes the depths, a x-dimension denotes the well logs, and a y-dimension denotes the wells, extracting matrices from the tensor, clustering the matrices based on the characteristics of the corresponding well logs to a clustering result matrix, aggregating the clustering result matrix to a cluster ensemble, and spatial partitioning the cluster ensemble to a map that shows the geological heterogeneity trends associated with cluster types of the wells.


Other aspects and advantages of the claimed subject matter will be apparent from the following description and the appended claims.





BRIEF DESCRIPTION OF DRAWINGS


FIG. 1 shows a reservoir penetrated by five wells in accordance with one or more embodiments.



FIG. 2 illustrates a flowchart for obtaining geological heterogeneity trends of a geological formation in accordance with one or more embodiments.



FIG. 3 shows a cube comprising the log curves (log 1 or L1, log 2 or L2, etc.) for wells as function of the depth (D1, D2, etc.) of the respective well in accordance with one or more embodiments.



FIG. 4 shows a third degree tensor (Tk,m,n) in accordance with one or more embodiments.



FIGS. 5A-5B show flowcharts for obtaining geological heterogeneity trends of a geological formation in accordance with one or more embodiments (K is the total number of depths, M is the number of logs, and N is the number of wells).



FIG. 6 shows a flowchart of the procedure for clustering the matrices in accordance with one or more embodiments.



FIG. 7 shows the procedure for folding a vector into a fold-matrix in accordance with one or more embodiments.



FIG. 8 shows an example of a matrix generated by folding a vector in accordance with one or more embodiments.



FIG. 9 shows a flowchart for generating a cluster ensemble in accordance with one or more embodiments.



FIG. 10 shows a procedure for obtaining a cluster ensemble in accordance with one or more embodiments.



FIG. 11 shows the result of the cluster ensemble in accordance with one or more embodiments.



FIG. 12 shows a diagram of indicator transformation in accordance with one or more embodiments.



FIG. 13 shows a diagram of a cluster indicator interpolation in accordance with one or more embodiments.



FIG. 14 shows a diagram of maximum probability threshold map in accordance with one or more embodiments.



FIG. 15 illustrates the ML engine for performing the method for obtaining geological heterogeneity trends of a geological formation





DETAILED DESCRIPTION

Embodiments disclosed herein relate to a clustering ensemble method for quick well clustering based on well log data. Clustering wells based on their similar log curve features can reveal hidden heterogeneous characteristics. The results can facilitate further reservoir studies, such as petrophysical and geological modeling and reservoir simulation. Multiple logs or log curves are usually measured along the well path. Extracting patterns from these log curves for a certain formation is a challenge using the traditional clustering methods.


In embodiments disclosed herein, wells are clustered based on multiple well logs data along certain well path intervals defined by stratigraphic tops. The method for doing this involves three major steps: base clustering, clustering ensemble and spatial partitioning at the reservoir scale. Base clustering is to cluster all the wells based on their individual well log curve. Then, the clustering ensemble clusters the wells by considering multiple well log curves together. The spatial partitioning is to perform a 3D spatial compartmentalization of the underlying domains according to well cluster types. Finally, clustering of all the wells in a target area may be obtained.


In one or more embodiments, the method for obtaining geological heterogeneity trends of a geological formation is performed by computer software programs for fast and robust well clustering. Such computer software programs may be executed on any suitable device, such as shown in FIG. 15.



FIG. 1 shows a reservoir 102 penetrated by five wells w1, w2, w3, w4, w5. Two different types of well logs log 1 and log 2 are obtained from each well w1, w2, w3, w4, w5. Each well may have several different measured well log curves for a certain target formation as shown in FIG. 2. The qualitative and quantitative log feature analysis of the logs effectively reflect the gross geologic variations and thus the heterogeneous characters of the formation. Due to the horizontal and vertical geologic variations, there is a need to group those variations based on the nature of geologic information carried by the logs, acquired with wireline or other means such as logging while drilling (LWD). Different groups of wells may reflect different underground heterogeneous zones or formations.



FIG. 2 illustrates a flowchart 200 of the method steps for obtaining geological heterogeneity trends of a geological formation. The geological heterogeneity trends have a wide range of applications, including reservoir characterization, geological modeling, petrophysical modeling, reservoir performance dynamic simulation, as well as reservoir engineering and management.


In step 202, N wells w1, w2, . . . , wN are drilled that penetrate the formation. FIG. 2 shows a reservoir 202 penetrated by five wells w1, w2, w3, w4, w5.


In step 204, well log curves log 1, log 2, . . . , log M are acquired for each well w1, w2, . . . , wN as function of a depth D1, D2, . . . , DK of the respective well w1, w2, . . . , wN. The well log curves are measured along the depth of each well, even if a well is embodied vertical, deviated, or horizontal. The value of a property of the formation is recorded as a function of the depth of the logging device in the well. The well log data are then plotted in well log curves that show the value of the properties versus the depth. FIG. 2 shows different types of well log curves L1, L2, . . . , LM obtained from each well w1, w2, w3, w4, w5. In one or more embodiments, the well log curves L1, L2, . . . , LM are acquired by wireline or logging while drilling (LWD).


In step 206, a third-degree Tensor Tk,m,n is determined, where k=1, 2, . . . , K is an index of the depths with D1, D2, . . . , DK, m=1, 2, . . . , M is an index for the well logs L1, L2, . . . , LM, and n=1, 2, . . . , N is an index of the wells w1, w2, . . . , wN. The well logs L1, L2, . . . , LM are values of the well log curve log 1, log 2, . . . , log M at the depth D1, D2, . . . , DK of the respective well wn. FIG. 4 shows a third degree tensor Tk,m,n.


In step 208, matrices L1k,n, L2k,n, . . . , LMk,n are extracted from the tensor Tk,m,n. Each matrix Lmk,n is extracted from the tensor Tk,m,n. For example, Lin is equal to Tk,1,n, L2k,n is equal to Tk,2,n, . . . , and LMk,n is equal to Tk,M,n.


In step 210, the matrices L1k,n, L2k,n, . . . , LMk,n are clustered based on the characteristics of the corresponding well logs L1, L2, . . . , LM (base clustering) to a clustering result matrix. The base clustering of the matrices L1k,n, L2k,n, . . . , LMk,n, is a matrix feature extraction and clustering task, which is a challenge to the clustering. In one or more embodiments, the base clustering comprises k-means algorithm. The base clustering of the matrices L1k,n, L2k,n, . . . , LMk,n is based on the similarity of the well logs and reveals hidden heterogeneous characteristics. The results facilitate further reservoir studies, such as petrophysical and geological modeling and reservoir simulation.


Extracting patterns from the well logs while drilling the well is a challenge for the base clustering. The wells are clustered based on multiple well logs along certain depth intervals defined by historic analysis of layers of sedimentary rock called strata (stratigraphic analysis) of markers of geologic layers (tops).


The base clustering of the wells is conducted based on the analysis of the type of the well log along the depth. The challenge is that the number N of wells is much smaller than the depth intervals Dk. For example, a value of a well log is measured at depths of every 0.125 meter (0.5 feet) of the well. In one or more embodiments, the depth intervals Dk is around 1000s, while the well number is around 100s. An unsupervised multivariate data reduction, such as principal component analysis (PCA), is used to reduce the dimension of the depth intervals Dk. Furthermore, the matrices are huge mathematical matrices, even for vertical wells. Furthermore, the number N of the wells is much smaller than the numbers of the well logs M.


Since the base clustering is done according to the different types of well logs Lm, the N×K matrix LmK,n is to be to be base clustered for each type of well log Lm. Each of the N wells comprises K depths.


In one or more embodiments, the base clustering is a machine learning (ML) technique. ML clusters the matrices Lmk,n according to the well logs Lm as unsupervised learning for statistical data analysis. Data points in the same group have similar properties and/or features, while data points in different groups should have highly dissimilar properties and/or features.


Table 1 shows an example of a N×K matrix Lmk,n of a well log Lm, where m is the type of the well log, e.g., gamma radiation (GR), N=15 is the number of wells to be clustered, and K=89 is the depth of the wells.









TABLE 1







An example of a N × K matrix Lk, nm, with N = 15 and K = 89.














n
D1
D2
D3
D4
D5

D89

















1
8.936
9.143153
9.177482
9.146391
9.073243
. . .
7.959


2
7.644
7.646959
7.597866
7.56815
7.621289
. . .
8.193


3
8.352
8.43842
8.464696
8.459282
8.452808
. . .
7.782


4
8.229
8.176308
8.181622
8.258484
8.256728
. . .
8.804


5
8.211
8.078801
7.999436
7.951944
8.006787
. . .
8.225


6
7.757
7.804761
7.821874
7.798978
7.754996
. . .
9.299


7
8.808
8.686874
8.669926
8.799988
8.819263
. . .
8.553


8
7.363
7.497409
7.564005
7.593902
7.574305
. . .
8.53


9
9.086
9.179404
8.960364
8.943797
8.984507
. . .
8.796


10
8.26
8.421
8.544
8.645212
8.529
. . .
8.644


11
8.328
8.260026
8.150307
8.200241
8.145491
. . .
8.808


12
9.366
9.192261
9.319
9.315
9.252
. . .
8.822


13
8.939
8.793447
8.724068
8.595155
8.829165
. . .
8.262


14
9.16
9.023854
8.899685
8.765882
8.768025
. . .
8.421


15
9.032
9.467102
9.258101
8.861262
8.941784
. . .
9.012









The N×K matrix Lmk,n, according to table 1 needs to be clustered as a base for final clustering (for base clustering at a later stage). In one or more embodiments, the base clustering comprises K-means algorithm. Table 2 shows a matrix resulting from base clustering of the N×K matrix Lmk,n, according to table 1 (with M=1) for M=4 different well logs L1, L2, L3, L4 using K-means algorithm.









TABLE 2







Base clustering result matrix resulting from clustering


the matrices Lk, nm using K-means algorithm.












L1
L2
L3
L4


n
Gamma Ray
Spontaneous Potential
ML1
ML2














1
3
4
1
3


2
1
3
3
3


3
3
4
4
4


4
3
4
1
1


5
1
1
3
3


6
4
4
3
3


7
3
1
1
1


8
4
3
1
1


9
2
2
2
2


10
2
2
2
2


11
2
2
2
2


12
2
2
2
2


13
2
2
2
2


14
2
2
2
2


15
2
2
2
2









In step 212, the clustering result matrix is aggregated to a cluster ensemble π1, π2, . . . , πM. The cluster ensemble combines multiple clustering result matrices of the well logs to yield a single overall clustering. The cluster ensemble π1, π2, . . . , πM clusters the well logs by considering multiple well logs together.


In step 214, the cluster ensemble π1, π2, . . . , πM is spatial partitioned resulting in a map that shows the geological heterogeneity trends associated with cluster types of the wells. The domain of the well log data, acquired from the cluster ensemble, is extended to a 2D spatial domain. The 2D spatial domain reveals an insight in how similar the wells w1, w2, . . . , wN are clustered. The spatial partitioning comprises a 3D spatial compartmentalization of the underlying domains according to the types of the well logs. Finally, target areas are obtained for integrated reservoir studies by clustering the wells.



FIG. 3 shows a cube comprising the log curves log 1, log 2, . . . , log M for N wells w1, w2, . . . , wN as function of the depth K of the respective well. There are N wells, each well has M types of well logs, and each well comprises K measurement points along the depth of the well.



FIG. 4 shows a third degree tensor Tk,m,n with the dimension K×M×N. The tensor Tk,m,n has three indices k, m, n, where k is an index for the depth with D1, D2, . . . , DK, m is an index for the well log L1, L2, . . . , LM, and n is an index for the wells w1, w2, . . . , wN. In one or more embodiments, the third degree tensor Tk,m,n is a massive tensor.



FIG. 5A shows a flowchart 500 of the method steps for obtaining geological heterogeneity trends of a geological formation.


In step 502, the tensor Tk,m,n is obtained (see description of FIG. 4 above).


In step 504, matrices L1k,n, L2k,n, . . . , LMk,n are extracted from the Tensor Tk,m,n. The matrices L1k,n, L2k,n, . . . , LMk,n represent the heterogeneous physical properties of the formation.


In step 506, the matrices L1k,n, L2k,n, . . . , LMk,n are base clustered to a base clustering result. The base clustering results are shown in FIG. 5B and described below.



FIG. 5B shows a continuation the method steps of FIG. 5A.


In step 508, the base clustering results are obtained from the base clustering.


In step 510, the cluster ensemble π1, π2, . . . , πM is combined to a single overall cluster.


In step 512, a 2D spatial map from the overall cluster is obtained. The 2D spatial map shows the geological heterogeneity trends associated with cluster types of all the wells w1, w2, . . . , wN. The 2D spatial map reveals the heterogeneity not only within the same well, but also in the spatial domain between the wells. Thereby, it provides a pseudo-2D heterogeneity reservoir model for enhanced geological and engineering studies. Each measured type of well log reveals its heterogeneity.



FIG. 6 shows a flowchart 600 of the steps of the procedure for clustering the matrices L1k,n, L2k,n, . . . , LMk,n based on the characteristics of the corresponding well logs L1, L2, . . . , LM to a clustering result matrix.


In step 602, the well logs L1, L2, . . . , LM are transformed to vectors Dm,n.


In step 604, the vectors Dm,n are folded to folding matrices. The description of FIGS. 7 and 8 describe the procedure for folding a vector Dm,n to a folding-matrix.


In step 606, the matrices are 2D EM (2D expectation maximization) clustered.


One of the base clustering techniques is 2D EM clustering. The 2D EM clustering characterizes the features of the well logs. The 2D EM clustering is especially suited for small well numbers N and the high numbers of depth K>>N.


In one or more embodiments, the 2D EM clustering comprises spectral clustering because the number N of the wells is much smaller than the number of types (including the total number of logs, excluding the total number of log types) of well logs M. Spectral clustering uses the spectrum (eigenvalues) of a similarity matrix to perform dimensionality reduction before clustering in fewer dimensions. The 2D EM clustering is performed for all the wells N based on the characteristics of each type of well log L1, L2, . . . , LM.


The 2D EM clustering is unsupervised and determines factors of the probability distribution by a maximum likelihood estimation. Specifically, random values are selected by the maximum likelihood estimation to estimate the best fit for the petrophysical well logs and wells. The maximum likelihood estimation is then obtained using the 2D EM clustering.


Assuming a vector Y={y1, y2, . . . , yk} represents unlabeled wells with a number k of the unlabeled wells. Let the class label of the n-th cluster be denoted as Yn for (n=1, . . . , C) being αn.


The probability of the mixture component is:










P

(

n




"\[LeftBracketingBar]"


Y
k



)

=



π
n



P

(


Y
k





"\[LeftBracketingBar]"



α
n

,

β
n




)









n
=

1



c





π
n



P

(


Y
k





"\[LeftBracketingBar]"



α
n

,

β
n




)







(
1
)









    • where αn, βn are the estimates of mean and standard deviation respectively of m component. The denominator in Eq. (1) normalizes the probability of the mixture component P(n|Yk) based on:












0


P

(

n




"\[LeftBracketingBar]"


Y
k



)


1




(
2
)




















n
=
1

c



P

(

n




"\[LeftBracketingBar]"


Y
k



)



1






(
3
)








In the maximization step, the probabilities are used to perform re-estimation of the parameter. The likelihood clusters are evaluated using Eq. (4) to (6).










α
n

=








k
=
1

N



P

(

n




"\[LeftBracketingBar]"


Y
k



)



Y
k









k
=
1

N



P

(

n




"\[LeftBracketingBar]"


Y
k



)







(
4
)













β
n
2

=








k
=
1

N



P

(

n




"\[LeftBracketingBar]"


Y
k



)







Y
k

-

μ
k




2









k
=
1

N



P

(

n




"\[LeftBracketingBar]"


Y
k



)







(
5
)













π
n

=


1
N








k
=
1

N



P

(

n




"\[LeftBracketingBar]"


Y
k



)



Y
k






(
6
)







After a vector is folded into a matrix, the above introduced maximum likelihood estimation of EM algorithm is implemented for matrix clustering. The core of the method for obtaining geological heterogeneity trends of a geological formation penetrated by the wells Y={y1, y2, . . . , yk} is to cluster a set of 2D matrices through using this 2D EM clustering.


During clustering, the distance between each matrix pairs y; and y; is calculated using a Hausdorff distance. The Hausdorff distance measures how large a metric space is from each other.


More formally, the Hausdorff distance from set A to set B is a maximum function, defined as:










h

(

A
,
B

)

=


max

a

A



{


min

b

B



{

d

(

a
,
b

)

}


}






(
7
)









    • where a and b are values from matrix A and B respectively, and d(a,b) is any distance calculated between a and b. For simplicity, Euclidian distance between a and b is used in the calculation.





The 2D EM clustering is performed for all types of well logs m that are different from each other. Therefore, the matrices used for 2D EM clustering are N×K matrices for each type of well log clustering. In other words, each well w1, w2, . . . , wN has K measuring points along the depth of the well wn.


The results from the 2D EM clustering characterize the vertical shape information of the well logs L1, L2, . . . , LM.


Entering different data of well logs in the 2D EM clustering results in a matrix of different wells with different types of well logs.


In step 608, the 2D EM clustering outputs the clustering results. The space of the clustering results is a latent space which has no spatial meaning and no relationship with the well locations. The clustering results are just for easy understanding. Actually, there is no need to do such a 2D projection from a hyper-dimensional latent space.


Using different well log data as the input for clustering of the matrices L1k,n, L2k,n, . . . , LMk,n, the result is a table of different wells with different types of well logs.



FIG. 7 shows the procedure for folding a vector Dm,n into a fold-matrix. Each column of the tensor Tk,m,n is a vector Dm,n={D1, D2, . . . , DK}. The vector Dm,n describes k observed real values along the well. In a first step of the procedure for folding a vector, the vector Dm,n is rescaled, such that all values Dk of the vector Dm,n fall within the interval [−1, 1]. The values Dk of the vector Dm,n are rescaled by:











D
~

k

=


(


D
k

-

max

(

D

m
,
n


)

+

(


D
k

-

min

(

D

m
,
n


)


)





max

(

D

m
,
n


)

-

min

(

D

m
,
n


)







(
8
)







In the second step of the procedure for folding a vector, the vector {tilde over (D)}k is transformed from Cartesian coordinate system to polar coordinate system. The rescaled vector {tilde over (D)}x follows polar coordinates by encoding the values Dk as the angular cosine and depth step as the radius with the equation below:









{






ϕ
=

arccos

(


x
~

i

)


,


-
1




x
~

i


1

,



x
~

i



X
˜









r
=


t
i

N


,


t
i








,





(
9
)









    • where ti are the depth steps, and custom-character is a constant factor to regularize the span of the polar coordinate system.





The transformation from Cartesian coordinate to polar coordinate through Eq. (9) has two important properties. The first property is that Eq. (9) is bi-ejective, because cos (ϕ) is monotonic when ϕ∈[0, 1]. The character encoding map produces only one result in the polar coordinate system with a unique inverse function. The second property is that as opposed to Cartesian coordinates, polar coordinates preserve absolute temporal relations of the vector. Thus, the corresponding area from depth step ti to depth step tj is not only dependent on the depth interval |ti−tj|, but also determined by the absolute value of ti and tj.


After transforming the rescaled vector into the polar coordinate system, the angular cosine is easily exploited by considering the trigonometric sum between each point to identify the temporal correlation of the well log values within different measured depth intervals.


In case the radius r is known, the fold-matrix is defined as follows:









G
=

[




cos



(


ϕ

(

x
1

)

+

ϕ

(

x
1

)


)








cos



(


ϕ


(

x
1

)


+

ϕ


(

x
n

)



)







cos



(


ϕ


(

x
2

)


+

ϕ


(

x
1

)



)








cos



(


ϕ


(

x
2

)


+

ϕ


(

x
n

)



)


















cos



(


ϕ


(

x
n

)


+

ϕ


(

x
1

)



)








cos



(


ϕ


(

x
n

)


+

ϕ


(

x
n

)



)





]





(
10
)







The fold-matrix in Eq. (10) has several advantages. First advantage is that the matrix G provides a way to preserve the temporal dependency. The third-degree matrix G contains temporal correlations because G (i,j∥i−j|=k) represents the relative correlation by superposition of directions with respect to depth interval k. The main diagonal Gi,i of the matrix G is the special case when k=0, which contains the original value/angular information.


For example, assuming the vector Dm,n describes k=10 observed real values along the well. In a first step the vector is normalized to an interval [−1,1] (see Eq. (8)). In a second step the angular coordinates are calculated (see Eq. (9)). In a third step, the matrix G is calculated by Eq. (10). The vector is transformed into a data 10×10 matrix. In a fourth step a 2D image is determined. FIG. 8 shows an example of a matrix generated from a vector.



FIG. 9 illustrates the procedure of the cluster ensemble. The cluster ensemble comprises generating a cluster ensemble, creating link-based similarity matrices, consensus functions and evaluating clustering results, respectively. The Link-based technique is to ensemble the based clustering result into a final cluster result. The similarity matrices are between different base clustering results. For example, if the base clustering from L1 is the same as L2, then, the similarity distance between them is 0.


Generating a cluster ensemble is described in the description of FIG. 10. Creating similarity matrices is defined as follows: Two matrices A and B are called similar if there exists an invertible n-by-n matrix P such that B=P−1AP. Creating the consensus functions is also described in the description of FIG. 10.



FIG. 10 shows a procedure for obtaining a cluster ensemble. The 2D EM cluster ensemble π1, π2, . . . , πM are obtained by combining the matrices resulted from the 2D EM clustering via a special cluster algorithm that generates the well log based cluster ensemble.


In step 1002, the matrices L1k,n, L2k,n, . . . , LMk,n are extracted from the tensor Tk,m,n.


In step 1004, the matrices L1k,n, L2k,n, . . . , LMk,n are aggregated into a cluster ensemble π1, π2, . . . , πM.


An algorithm of the cluster ensemble combines different datasets with various clustering algorithms to achieve better accuracy than the individual clustering algorithms. The matrices Lmk,n={L1k,n, L2k,n, . . . , LMk,n} are a set of M data points Lmk,n and Π={π1, π2, . . . , πM} is the cluster ensemble with M ensemble members πm. Each ensemble member πm of the cluster ensemble Π={π1, π2, . . . , πM} returns a set of cluster ensembles πi={C1i, C2i, . . . , Ckii}, such that Uj=1kiCji=Lk,nm, where ki is the number of clusters in the i-th clustering. Each Lk,nm∈Tk,m,n, C(Lk,nm) denotes the cluster label to which the data point Dk belongs. In the i-th clustering, C(Lk,nm)=j if x∈Cji, the problem is to find a new partition π* of a data set Lk,nm that summarizes the information from the cluster ensemble. In other words, the special cluster ensemble combines the clustering with the same dataset with various clustering algorithms.


In step 1006, the cluster ensemble π1, π2, . . . , πM is entered into a consensus function that performs spatial partitioning of the cluster ensemble π1, π2, . . . , πM resulting in a map that shows the geological heterogeneity trends associated with cluster types of the wells.


The ensemble members π1, π2, . . . , πm are inputted in the consensus function. The ensemble members π1, π2, . . . , πm are aggregated to form a final data partition. There are two main stages: (i) generating the similarity matrix through cluster ensemble, and (ii) producing the final partition by a consensus function.


Consensus clustering is a method of ensemble clustering the clustering results from multiple clustering algorithms. Also called aggregation of partitions, the consensus clustering refers to the situation in which a number of different clusterings are obtained for a particular dataset and it is desired to find a single (consensus) clustering which is a better fit in some sense than the existing clusterings. Consensus clustering is the problem of reconciling clustering information about the same data set coming from different sources or from different runs of the same algorithm. When cast as an optimization problem, consensus clustering is known as median partition. Consensus clustering for unsupervised learning is analogous to ensemble learning in supervised learning.


There are many clustering techniques, either supervised or unsupervised learning, which are widely used in the case of unlabeled data, i.e., data without defined categories or groups.


After the Ensemble operation, the Base clustering result matrix in table 2 is spatial partitioned into one final clustering, as shown in table 3.









TABLE 3







Cluster ensemble from multiple well logs.













L1
L2
L3
L4
Cluster


n
Gamma Ray
Spontaneous Potential
ML1
ML2
Ensemble















1
3
4
1
3
3


2
1
3
3
3
3


3
3
4
4
4
3


4
3
4
1
1
1


5
1
1
3
3
3


6
4
4
3
3
3


7
3
1
1
1
1


8
4
3
1
1
1


9
2
2
2
2
2


10
2
2
2
2
2


11
2
2
2
2
2


12
2
2
2
2
2


13
2
2
2
2
2


14
2
2
2
2
2


15
2
2
2
2
2










FIG. 11 shows the result of the cluster ensemble using the data in table 3. On the left side of FIG. 11, four clustered matrices L1k,n, L2k,n, L3k,n, L4k,n are depicted. The four clustered matrices L1k,n, L2k,n, L3k,n, L4k,n are aggregated to a final clustering result π*.


The base clustering of well logs only accounts for the well log data and is, therefore, one dimensional. However, the spatial partitioning of the well logs is two-dimensional. In one or more embodiments, the spatial partitioning is based on a simple inverse distance algorithm or an advanced spatial interpolation method, such as indicator kriging using an indicator function, is implemented.


Multivariate interpolation is interpolation on functions when the variates are spatial coordinates (spatial interpolation). Multivariate interpolation creates a digital elevation model from a set of points on the Earth's surface.


The indicator function of a subset A of a set X is a function defined from X to the two-element set {0,1}, denoted as 1A:X→{0,1}. The indicator function indicates whether an element in X belongs to A or not. The indicator function of a subset A of a set X is a function 1A:X→{0,1}, defined as








1
A

:
X

:=

{




1




if


x


A





0




if


x


A




.






The clustering results in ensemble result column of table 3 using the 15 wells, are used as an example to illustrate the 2D spatial clustering algorithm. The first step is to prepare data files for 15 wells, comprising well logs, spatial x and y coordinates, and lowest depth D1 and highest depth DK of the respective well. Table 4 lists well number, spatial x and y coordinates of the well, lowest depth D1, and highest depth DK of the respective well.









TABLE 4







Spatial coordinates and top and bottom depth data of N = 15 wells.











well nr. n
x
y
D1
DK














1
663.70
278.40
1480
1487


2
940.68
326.68
1488
1496


3
571.23
906.21
1417
1523


4
819.02
986.84
1454
1461


5
983.75
916.28
1490.25
1586


6
449.47
54.76
1492
1497.25


7
686.65
526.78
1443
1459.75


8
151.67
314.85
1594
1670.5


9
641.00
79.61
1567.25
1580.25


10
747.99
210.07
1497.5
1514


11
119.85
899.28
1423.75
1535


12
327.74
489.98
1436.75
1553.25


13
372.05
143.87
1462
1579.75


14
852.69
110.41
1498
1516


15
190.62
32.63
1480.25
1601










FIG. 12 shows a diagram of indicator transformation calculated by Eq. (11). The results are an indicator variable from each type of base clustering. For example, if there are four types of base clustering. The result will be four variables from each type four types of base clustering. On the left side of FIG. 12, the final clustering result π* of FIG. 11 is depicted. An indicator transformation is applied to the final clustering result π*.


The second step for spatial partitioning is to perform an indicator transformation to the clustering results for the wells in table 4. Assuming the locations are uα, α=1, . . . , N, the cluster results for those locations are denoted as z(uα)=k, α=1, . . . , N, z=1, . . . , K. The indicator transformation is:











p
k

(

u
a

)

=

{




1
,







if



z

(

u
a

)


=
k

;

k
=


,


,
K






0
,



otherwise








(
11
)







From information theory, it is derivable that the indicator transformation results as the probability of each cluster at the well sites. If the indicator value is one, it means the probability to find the specific cluster at this site is 100 percent. Otherwise, the probability is zero.


The third step is to do an indicator interpolation. FIG. 13 shows a diagram of a cluster indicator interpolation calculated by Eq. (11). The cluster indicator interpolation transforms the indicators and comprises Inverse distance weighting (IDW). The IDW is a spatial interpolation to estimate an unknown value at a location using some known values with weighted distances between the locations of the known value and the location of the unknown value. An unknown value at a location to be determined is denoted as pk(u*). pk(uα) is the probability of certain type for an offset well locations. The weight is inverse to the distance of an estimated location to each offset locations.


Higher distances result in lower weight. So, for each cluster, the inverse distance weighted interpolation is calculated as:












p
k

(

u
*

)

=







α
=
1

N



1

d
α





p
k

(

u
α

)



,

α
=
1

,


,

N
;

k
=
1


,


,
K




(
12
)









    • where dα is the distance from current estimation spatial location to all the offset wells.






FIG. 14 shows a diagram of a maximum probability threshold map. As the indicator transformation is interpreted as probability, the final map of this indicator variable interpolation will be a probability of each cluster type. But the importance lies on a type, so the maximum probability threshold is used to get a final cluster type based on Eq. (13).


After performing spatial indicator interpolation for all the clusters, a maximum probability threshold algorithm is implemented as given in the following equation:












z

(

u
*

)

=
k

;

max


{


p
k

(

u
*

)

}


;

k
=
1


,


,
K




(
13
)







It assigns a specific cluster to each cell based on the probability of each cluster.



FIG. 15 illustrates the ML engine 1500 for performing the method for obtaining geological heterogeneity trends of a geological formation. In one or more embodiments, the ML engine 1500 is a high performance computing (HPC) device, server, desktop computer, laptop/notebook computer, wireless data port, smart phone, personal data assistant (PDA), tablet computing device, one or more computer processors within these devices, or any other suitable processing device, comprising both physical or virtual instances (or both) of the computing device. Additionally, the ML engine 1500 comprises a computer that comprises an input device, such as a keypad, keyboard, touch screen, or other device that can accept user information, and an output device that conveys information associated with the operation of the ML engine 1500, comprising digital data, visual, or audio information (or a combination of information), or a GUI.


The ML engine 1500 also comprises an interface 1504. The interface 1504 comprises software supporting one or more communication protocols. The interface 1504 further comprises hardware that receives the well log curves.


In one or more embodiments, the interface 1504 is wirelessly connected to ML engine 1500. In other embodiments, the interface 1504 comprises a wired connection to the ML engine 1500.


Furthermore, the ML engine 1500 comprises one or more ML algorithms 1508 for performing the method steps for determining properties of a formation. The ML algorithm 1508 is a software component of the ML engine 1500. Although illustrated as an internal part of the ML engine 1500, in alternative embodiments, the ML algorithm 1508 is an external component of the ML engine 1500.


The ML engine 1500 comprises a processor 1506. The processor 1506 executes instructions according to the ML algorithm 1508 and manipulates the well logs to perform the method for obtaining geological heterogeneity trends of a geological formation, according to the ML algorithm 1508.


The ML engine 1500 further comprises a database 1520. The existing well log curves are stored in the database 1520. While the database 1520 is illustrated as an integral component of the ML engine 1500, in alternative embodiments, the database 1520 is external to the ML engine 1500. The database 1520 may be any repository capable of storing data, including but not limited to data structures such as tables, lists, arrays, etc.


The interface 1504, the processor 1506, the ML algorithm 1508, and the database 1520 communicate via a system bus 1514. In one or more embodiments, any or all of the interface 1504, the processor 1506, the ML algorithm 1508, and the database 1520, communicate with each other over the system bus 1514 using an application programming interface (API) 1510 or a service layer 1512 or a combination of the API 1510 and service layer 1512.


In one or more embodiments, the ML algorithm 1508 creates a ML model with an artificial neural network (ANN). The ANN comprises neurons, wherein each neuron is connected to every other neuron in the ANN. A neuron receives data then processes it and sends the data to all the other neurons. The neurons are aggregated and organized into layers. The neurons of a layer are connected to all the neurons of the neighboring layers. A first layer is the input layer that receives the existing data logs. The last layer is the output layer that outputs the estimated pore pressure log. The mineral composition of the formation is predicted from the digital images of the drill cuttings using ANN.


Although only a few example embodiments have been described in detail above, those skilled in the art will readily appreciate that many modifications are possible in the example embodiments without materially departing from this invention. Accordingly, all such modifications are intended to be included within the scope of this disclosure as defined in the following claims. In the claims, means-plus-function clauses are intended to cover the structures described herein as performing the recited function and not only structural equivalents, but also equivalent structures. Thus, although a nail and a screw may not be structural equivalents in that a nail employs a cylindrical surface to secure wooden parts together, whereas a screw employs a helical surface, in the environment of fastening wooden parts, a nail and a screw may be equivalent structures. It is the express intention of the applicant not to invoke 35 U.S.C. § 112 (f) for any limitations of any of the claims herein, except for those in which the claim expressly uses the words ‘means for’ together with an associated function.

Claims
  • 1. A method for obtaining geological heterogeneity trends of a geological formation, comprising the steps: drilling wells that penetrate the formation,acquiring well logs for each well as function of depth intervals of the respective well,determining a third degree tensor, where a z-dimension denotes the depths, a x-dimension denotes the well logs, and a y-dimension denotes the wells,extracting matrices from the tensor,clustering the matrices based on the characteristics of the corresponding well logs to a clustering result matrix,aggregating the clustering result matrix to a cluster ensemble, andspatial partitioning the cluster ensemble to a map that shows the geological heterogeneity trends associated with cluster types of the wells.
  • 2. The method according to claim 1, wherein the clustering of the matrices comprises: extracting vectors from the tensor,folding the vectors to matrices,clustering the matrices to a cluster ensemble by two dimension expectation maximization (2D EM) clustering to a final clustering result by a consensus function.
  • 3. The method according to claim 2, wherein 2D EM clustering comprises determining a maximum likelihood estimate.
  • 4. The method according to claim 2, wherein the 2D EM clustering comprises spectral clustering.
  • 5. The method according to claim 1, wherein spatial partitioning of the cluster ensemble comprises generating a similarity matrix through cluster ensemble, and producing a final partition (consensus function).
  • 6. The method according to claim 1, wherein a well log Lm is acquired every 0.125 meter (0.5 feet).
  • 7. The method according to claim 1, wherein well logs are acquired by wireline or logging while drilling (LWD).
  • 8. The method according to claim 2, wherein folding the vector comprises the steps: rescaling the vector to a rescaled vector, such that all values fall within an interval [−1, 1], by
  • 9. The method according to claim 1, wherein a multivariate data reduction is used to reduce the dimension of the depth intervals.
  • 10. The method according to claim 9, wherein multivariate data reduction comprises principal component analysis (PCA).
  • 11. The method according to claim 2, wherein the 2D EM clustering comprises a filtering part, and a clustering part.
  • 12. The method according to claim 1, wherein clustering the matrices comprises k-means algorithm.
  • 13. The method according to claim 1, wherein clustering the matrices is performed by machine learning (ML).
  • 14. The method according to claim 1, wherein the spatial partitioning of the cluster ensemble comprises an inverse distance algorithm or an advanced spatial interpolation method
  • 15. The method according to claim 14, wherein, such inverse distance algorithm comprises indicator kriging using an indicator function.
PCT Information
Filing Document Filing Date Country Kind
PCT/CN2022/082500 3/23/2022 WO