This invention relates to the field of geophysical prospecting and, more particularly, to processing geophysical data. Specifically, the invention is a method for inferring properties of the subsurface based on information contained in geophysical data acquired in field experiments.
During a seismic, electromagnetic, or similar survey of a subterranean region, geophysical data are typically acquired by positioning a source at a chosen shot location and measuring the seismic, electromagnetic, or other back-scattered energy generated by the source with receivers placed at selected locations. The measured reflections are referred to as a single “shot record”. Many shot records are measured during a survey by moving the source and receivers to different locations and repeating the process. The survey can then be used to perform inversion, e.g., Full Waveform/Wavefield Inversion (FWI) in the case of seismic data, which uses the information contained in the shot records to determine physical properties of the subterranean region (e.g., speed of sound in the medium, density distribution, resistivity, etc.). Inversion is an iterative process; each iteration comprises forward modeling to create simulated (model) data and objective function computation to measure the similarity between simulated and field data. Physical properties of the subsurface are adjusted at each iteration to ensure progressively better agreement between simulated and field data. The invention will be described primarily in the context of Full Waveform Inversion of seismic data, but it can be applied to inversion of other types of geophysical data.
Multi-parameter inversion involves simultaneous updating of at least two medium properties. A typical strategy is to formulate an objective (cost) function E(m) measuring the misfit between modeled and field data, where m is a vector of medium properties whose components can be compressional and shear-wave velocities, Vp and Vs, density ρ, Thomsen anisotropy parameters ε and δ (Tsvankin, 2001, p. 18), etc. The gradient of the objective function with respect to individual components of m indicates the direction in which medium parameters can be updated so that the objective function is minimized and a progressively better fit of modeled and field data is obtained. The basis of this approach is the well-known Taylor series:
where Δm is the desired update; ∇mE and ∇mmE are the gradient and the Hessian of the objective function, respectively. The gradient ∇mE is a vector containing first-order derivatives of the objective function E with respect to each individual component mi of the model vector m:
The Hessian ∇mmE is a matrix containing second-order derivatives of the objective function E with respect to individual components mi, mj:
Clearly, if we neglect quadratic terms (the ones with the Hessian) of this expansion and set Δm=−α∇mE, with α>0, then the objective function will decrease:
E(m+Δm) ≈ E(m) + (∇mE)ᵀΔm = E(m) − α∥∇mE∥² < E(m).
Optimal α can be determined with the help of line search, which typically involves evaluating the objective (cost) function for strategically chosen values of α so as to find the best one.
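By way of illustration only (not part of the claimed method), the gradient-descent update with a crude line search over a few candidate values of α can be sketched in Python; the objective here is a toy quadratic whose badly scaled diagonal mimics gradient components of very different magnitudes:

```python
import numpy as np

# Toy quadratic objective E(m) = 0.5 * m^T A m, minimum at m = 0.
A = np.diag([1.0, 100.0])
E = lambda m: 0.5 * m @ A @ m
grad_E = lambda m: A @ m

def steepest_descent_step(E, grad_E, m, alphas=(1e-3, 1e-2, 1e-1)):
    """One update m -> m + alpha * dm with dm = -grad(E), choosing alpha
    by evaluating the objective at a few strategically chosen values."""
    dm = -grad_E(m)                     # descent direction: negative gradient
    best_alpha = min(alphas, key=lambda a: E(m + a * dm))
    return m + best_alpha * dm

m = np.array([1.0, 1.0])
m_new = steepest_descent_step(E, grad_E, m)
```

Even this simple example exhibits the scaling problem discussed next: the single step size α cannot serve both the well-scaled and the poorly scaled component at once.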
The drawback of this approach is that the gradient does not usually provide the best possible descent direction. Different components of the gradient can be of vastly different magnitudes (especially when they correspond to different types of medium properties, e.g., Vp and ε) and may exhibit leakage from one component to another due to the interdependence of the medium parameters.
A better descent direction can be obtained if the quadratic terms are taken into account. Various approaches of this type are called Newton's method, Newton-CG, and Gauss-Newton and are based on inverting the Hessian:
Δm =−(∇mmE)−1∇mE.
Due to its size (typically 10⁹×10⁹ in 3D), the Hessian has to be inverted iteratively, each iteration involving application of the Hessian to a vector. Depending on the problem, the Hessian-vector products (an equivalent term for application of the Hessian to a vector) can be computed analytically, numerically using finite differences, or using the adjoint state method (Heinkenschloss, 2008). Since only a few (usually 10-20) iterations of this iterative process can be afforded in practice, the resulting approximations to the inverse Hessian are usually not very accurate and may not be able to eliminate the leakage (cross-talk) between various medium parameters or provide the correct scaling between different components of the gradient. Moreover, the inversion algorithm may lead to accumulation of artifacts in Δm, resulting in a suboptimal solution.
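As an illustrative sketch (assumed helper names, not the claimed method), a finite-difference Hessian-vector product and a matrix-free conjugate-gradient solve of the Newton system can be written as:

```python
import numpy as np

def hessian_vector_product(grad_E, m, v, eps=1e-6):
    """Approximate Hv by finite-differencing the gradient:
    Hv ~ (grad(m + eps*v) - grad(m)) / eps."""
    return (grad_E(m + eps * v) - grad_E(m)) / eps

def newton_cg_direction(grad_E, m, n_iters=10):
    """Approximately solve H dm = -g with a few conjugate-gradient
    iterations; only Hessian-vector products are needed (matrix-free),
    never the full Hessian matrix."""
    g = grad_E(m)
    dm = np.zeros_like(m)
    r = -g.copy()                       # residual of H dm = -g at dm = 0
    p = r.copy()
    for _ in range(n_iters):
        Hp = hessian_vector_product(grad_E, m, p)
        alpha = (r @ r) / (p @ Hp)
        dm = dm + alpha * p
        r_new = r - alpha * Hp
        if np.linalg.norm(r_new) < 1e-10:   # converged early
            break
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return dm
```

Truncating the loop at 10-20 iterations, as the text notes, yields only an approximate inverse-Hessian action.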
A cheaper way to ensure proper relative scaling of the gradient components is to apply the subspace method (Kennett et al., 1988). The key idea behind this method is to represent the model perturbation as a sum of basis vectors:
Δm=αs1+βs2+ . . .
For example, for two different types of medium parameters (e.g., Vp and ε) a customary choice (Sambridge et al., 1991) is:
where one typically sets Δm1 ∼ (−∇m1E, 0)ᵀ and Δm2 ∼ (0, −∇m2E)ᵀ, i.e., each basis vector carries the gradient with respect to one parameter class and zeros in the components belonging to the other.
Thus, each component of the gradient can be scaled independently so that the resulting search direction is improved. The scaling factors α and β are chosen so that the quadratic approximation to the objective function is minimized:
It is easy to show that the minimum of the objective function will be obtained if we set
The cost of determining the values of α and β (which provide the desired scaling of the gradient components) is equal to two applications of the Hessian to a vector (Δm1 and Δm2), making this method far cheaper than Newton/Newton-CG/Gauss-Newton.
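The mechanics of the subspace solve can be sketched as follows (an illustrative Python fragment under the stated two-parameter setup; a diagonal stand-in Hessian is assumed so the result can be checked by hand):

```python
import numpy as np

def subspace_update(g, basis, hvp):
    """Solve the small projected system (s_i^T H s_j) c = -s_i^T g for the
    coefficients of the update dm = sum_k c_k s_k.  `hvp(v)` applies the
    Hessian to a vector -- one application per basis vector, which is the
    dominant cost."""
    k = len(basis)
    Hs = [hvp(s) for s in basis]            # k Hessian-vector products
    M = np.array([[basis[i] @ Hs[j] for j in range(k)] for i in range(k)])
    b = np.array([-(basis[i] @ g) for i in range(k)])
    c = np.linalg.solve(M, b)
    return sum(ci * si for ci, si in zip(c, basis))

# Two-parameter example: each basis vector holds one gradient block.
A = np.diag([1.0, 100.0])                   # stand-in Hessian
m = np.array([1.0, 1.0])
g = A @ m                                   # gradient of 0.5 m^T A m
s1 = np.array([-g[0], 0.0])                 # (-grad wrt parameter 1, 0)
s2 = np.array([0.0, -g[1]])                 # (0, -grad wrt parameter 2)
dm = subspace_update(g, [s1, s2], lambda v: A @ v)
```

With two basis vectors the solve returns the scaling factors α and β of the text; because this stand-in Hessian is diagonal (no cross-talk), the result coincides with the full Newton direction.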
However, the limitation is that the leakage (cross-talk) cannot be handled effectively, since all the subspace method does is scale each component of the gradient up or down (by α and β).
In one embodiment, the invention is, referring to the reference numbers in the
In a preferred variation of the foregoing embodiment, the degree of mixing between gradient (search direction) components may be adjusted by scaling the off-diagonal components of the mixing matrix, i.e., a matrix whose elements are the coefficients of the basis vector expansion of the search direction.
The dimensionality of the extended subspace of the present invention, i.e., the number of basis vectors M, can in principle be any number greater than N, the number of unknown parameters that are being inverted for. Selecting M=N2 allows for leakage between each parameter and all of the others during the inversion process. However, it may be that not all parameters leak into all other parameters. It may be possible to decide based on empirical or theoretical evidence which parameters may potentially have cross-talk among them, and then choose M accordingly. For example, if one is inverting for compressional velocity Vp, shear wave velocity Vs, and anisotropy parameter ε, one might reasonably expect leakage/cross-talk between Vp and Vs, Vp and ε, but not between Vs and ε. So one could have 3 basis vectors for the Vp search direction (gradients w.r.t. Vp, Vs, ε), but only two basis vectors for the Vs and ε search directions, for a total of 7 basis vectors (instead of 9). As an alternative example, one might follow Kennett's approach described above, in which case there would be N(N+1) basis vectors, i.e. 12 for the case of N=3.
In another embodiment of the invention, referring to the flow chart of
The advantages of the present invention are better understood by referring to the following detailed description and the attached drawings, in which:
The invention will be described in connection with example embodiments.
However, to the extent that the following detailed description is specific to a particular embodiment or a particular use of the invention, this is intended to be illustrative only, and is not to be construed as limiting the scope of the invention. On the contrary, it is intended to cover all alternatives, modifications and equivalents that may be included within the scope of the invention, as defined by the appended claims.
The present invention extends the traditional subspace method in a way that explicitly accounts for possible leakage between gradient components. This can be achieved by picking additional basis vectors. Once again, the concept may be illustrated for the case of two different parameters:
Similarly to the original subspace method, one can get optimal scaling coefficients from:
where the superscript T denotes matrix transpose.
The key novelty is that explicit mixing is performed between gradient components corresponding to different medium parameters, e.g., Vp and ε or Vp and ρ. The scaling/mixing coefficients αi and βi are determined automatically from Equation 1 at a cost (measured in Hessian-vector products) equal to the square of the cost of the traditional subspace method. The coefficients α1 and β2 are the ones that would have been computed in the traditional subspace method, while α2 and β1 correspond to the extended set of basis vectors introduced in this invention. An important limitation of the method is that curvature information obtained from the Hessian may not be accurate far away from the global minimum, yielding scaling coefficients that would not lead to an improved search direction Δm. Thus, the method as presented so far would be unlikely to work consistently in practice.
Kennett et al. (1988) proposed an alternative approach to selecting an extended set of basis vectors in the subspace method:
However, the cost of this method is much higher (it grows as the third power of the cost of the conventional subspace method) due to the need to compute four additional Hessian-vector products. In this case, the matrix in Eqn. (1) would look different, because of the choice of the extended subspace basis vectors, with elements of the form Δmiᵀ(∇mmE)Δmj built from the extended set.
The theory underlying the subspace method assumes that the Hessian correctly captures the behavior of the objective function. As mentioned above, when we are dealing with models that are far from the “true” ones, the objective function may not be locally quadratic. In this case Equation (1) may produce inaccurate estimates of αi and βi. Moreover, it is customary to replace the Hessian with its “reduced” version—so-called Gauss-Newton Hessian—which itself becomes inaccurate away from the global minimum. Thus, to make the method work in practice, several modifications are helpful.
The first modification is an application of the well-known “trust region” concept. If the values of αi and βi turn out to be too large (e.g., requiring more than a 10% update of medium parameters at any given iteration), they need to be scaled down (clipped). Rewriting the vector of αi and βi as a mixing matrix,
we can conveniently scale down either row of the matrix, depending on which parameter update exceeds a predefined threshold.
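A minimal sketch of this row-wise clipping (illustrative names; the 10% threshold is the example value from the text):

```python
import numpy as np

def clip_mixing_matrix(W, updates, models, max_rel=0.10):
    """Trust-region-style clipping of the mixing matrix.  `updates[i]` is
    the model update produced by row i of W for the i-th parameter; if it
    exceeds `max_rel` (10%) of the current model values, the whole row is
    scaled down uniformly, which scales that update by the same factor."""
    W = W.astype(float).copy()
    for i, (dm, m) in enumerate(zip(updates, models)):
        rel = np.max(np.abs(dm) / np.maximum(np.abs(m), 1e-12))
        if rel > max_rel:
            W[i, :] *= max_rel / rel
    return W
```

Because the update depends linearly on the row, scaling the row by max_rel/rel brings the update exactly onto the threshold.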
The second modification represents a second key novel step and has to do with adjusting the degree of mixing between gradient (search direction) components. The mixing can be adjusted by scaling the off-diagonal components of the mixing matrix by (γα,γβ):
Then a line search is performed, i.e., a series of objective function values is evaluated,
and select the values of (γα,γβ) corresponding to the best (i.e. minimum or maximum, depending upon how the objective function is formulated) objective function. (Note that the γi are introduced for convenience; we could just as well have found optimal values of the off-diagonal elements of the mixing matrix). There are many known ways to perform the line search, but for purposes of the present invention, in order to minimize the computation cost, it is preferable to fit a quadratic form in (γα,γβ) to the objective function above and then find optimal values of (γα,γβ):
E(m+Δm̃; γα, γβ) = α0 + α1γα + α2γβ + α3γα² + α4γβ² + α5γαγβ.
The objective function is evaluated at six different points (γα, γβ), e.g., (1,1), (0.75,1), (1,0.75), (0.5,1), (1,0.5), (0.5,0.5), and the resulting system of linear equations is solved for the coefficients αi. When the quadratic form is not positive definite, an end point (either 0 or 1) can be chosen for each γ. Note that this line search is different from the traditional one and serves a different purpose. Conventionally, the line search is performed to determine the best possible step size (scaling of the model update), while it is used here to determine the best possible set of mixing coefficients that minimize leakage/cross-talk between different inversion parameters. Once the mixing coefficients are determined and updated search directions are obtained, a conventional line search can be applied to further scale the updated search directions.
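This six-point quadratic fit can be sketched as follows (illustrative code; `eval_obj` is an assumed wrapper around the objective function evaluated for a given pair of off-diagonal scalings):

```python
import numpy as np

def fit_quadratic_line_search(eval_obj, points):
    """Fit E = a0 + a1*ga + a2*gb + a3*ga^2 + a4*gb^2 + a5*ga*gb through
    six samples, then return the minimizer of the quadratic form (clipped
    to [0, 1]); if the form is not positive definite, fall back to the
    best sampled point, mirroring the end-point rule in the text."""
    P = np.array([[1.0, ga, gb, ga**2, gb**2, ga*gb] for ga, gb in points])
    vals = np.array([eval_obj(ga, gb) for ga, gb in points])
    a = np.linalg.solve(P, vals)                     # six coefficients
    H = np.array([[2*a[3], a[5]], [a[5], 2*a[4]]])   # Hessian of the form
    if np.all(np.linalg.eigvalsh(H) > 0):            # positive definite?
        ga, gb = np.linalg.solve(H, [-a[1], -a[2]])  # stationary point
        return float(np.clip(ga, 0.0, 1.0)), float(np.clip(gb, 0.0, 1.0))
    return points[int(np.argmin(vals))]

points = [(1, 1), (0.75, 1), (1, 0.75), (0.5, 1), (1, 0.5), (0.5, 0.5)]
```

These six sample points do not lie on a common conic, so the 6×6 interpolation system is nonsingular.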
The third key novel step addresses the situation in which the level of cross-talk is spatially varying, so that scaling factors (γα,γβ) need to be spatially varying as well. The line search can be performed separately for each shot, producing a spatially varying set of scaling factors. Note that the cost of performing the line search for each shot individually is the same as the cost of traditional spatially invariant line search. The only difference is that instead of summing all individual objective functions computed for each shot record and then selecting the values of (γα,γβ) that correspond to the best cumulative objective function, the selection is performed shot-by-shot, skipping the summation. Each shot is assigned a spatial location and the selected optimal value of (γα,γβ) is also assumed to occur at that location. Finally, interpolation may be performed to obtain a spatially varying distribution of optimal scaling factors (γα,γβ), followed by optional smoothing to avoid introducing artifacts into the inversion.
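The shot-by-shot selection and subsequent interpolation can be sketched as follows (illustrative one-dimensional layout; the data structure for per-shot objective values is an assumption):

```python
import numpy as np

def per_shot_scaling(shot_objectives, shot_x, grid_x):
    """Pick the best (ga, gb) pair for each shot individually -- skipping
    the summation over shots -- then linearly interpolate each factor onto
    the model grid.  `shot_objectives` is a list of {(ga, gb): value}
    dicts, one per shot; optional smoothing could be applied afterwards."""
    ga_best, gb_best = [], []
    for obj in shot_objectives:
        ga, gb = min(obj, key=obj.get)       # shot-by-shot selection
        ga_best.append(ga)
        gb_best.append(gb)
    return np.interp(grid_x, shot_x, ga_best), np.interp(grid_x, shot_x, gb_best)
```

Each shot contributes one sample at its assigned location, and the interpolated fields supply the spatially varying (γα, γβ).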
Incorporating Well Constraints
The idea of using gradients as basis vectors for forming an improved update (search direction) in inversion can be extended to the case in which well logs or other reliable information regarding the subsurface is available, representing another key novel step. Similarly to the methodology described in the previous sections, an improved update (search direction) can be obtained by setting
Δm̃i = wi1Δm1 + wi2Δm2 + wi3e   (3),
where i=1,2; e is a vector with all components set to “1”. The unknown coefficients wi1, wi2, wi3 can be determined by requiring that the improved model update fit the “true” well-log-based update
Δmi^true = mi^well-log − mi^current
in some norm:
∥Δmi^true − Δm̃i∥Ln → min.
In general, optimal coefficients wi1, wi2, wi3 can be found numerically. If n=2, i.e., the L2 norm is used, the solution to this minimization problem is given by
The Δmi can be set proportional (or equal) to the gradients of E, their preconditioned/modified versions, or the improved search directions coming from the extended subspace method described in the previous sections.
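For the L2 case, the fit reduces to a tiny least-squares problem, sketched below (illustrative only; the helper name is an assumption):

```python
import numpy as np

def well_constrained_coeffs(dm1, dm2, dm_true):
    """L2 fit of w1*dm1 + w2*dm2 + w3*e to the well-log-based update
    dm_true; only a 3-unknown least-squares problem is solved, with no
    Hessian-vector products required."""
    e = np.ones_like(dm1)                   # background ("DC") basis vector
    B = np.column_stack([dm1, dm2, e])
    w, *_ = np.linalg.lstsq(B, dm_true, rcond=None)
    return w
```

The normal-equations matrix here is only 3×3, which is why this step adds essentially no computational cost.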
There are two key differences from the extended subspace method described previously. First, there is effectively no additional computational cost incurred in computing an improved search direction based on the well log information, because Hessian-vector products need not be computed and only a small 3×3 matrix has to be inverted. Secondly, the set of basis vectors was extended even further by including the vector e. This vector allows us to determine the background (“DC”) component of the update. It is well known that FWI cannot correctly compute the background update when seismic data are missing low frequencies, as is the case for most datasets acquired to date. For some parameters, such as Thomsen's anisotropy parameter δ, this is impossible under any circumstances based on surface seismic data alone. Thus, the vector e was not included previously because it would have been difficult to obtain it reliably. (The availability of a direct measurement of subsurface medium parameters at well locations changes the situation.) Of course, e can be more general than a vector consisting of 1's. For example, it could be a depth-varying function.
If more than one well is available, optimal coefficients wi1, wi2, wi3 should preferably be found at each well location and spatially interpolated between wells and extrapolated away from the wells.
In a typical application the extended subspace method based on the surface seismic data might be used first to produce an improved model update, i.e. search direction, followed by a further modification based on the well log information. Basic steps in this embodiment of the invention are shown in the self-explanatory
Additionally, application of the extended subspace method could be skipped and well log information used directly to obtain an improved search direction. Basic steps in this embodiment of the invention are shown in the self-explanatory
The present inventive method was tested using synthetic data generated by assuming the “true” models for the parameters Vp and ε shown in
Next, a two-parameter inversion was performed for Vp and ε using the initial model shown in
The foregoing description is directed to particular embodiments of the present invention for the purpose of illustrating it. It will be apparent, however, to one skilled in the art, that many modifications and variations to the embodiments described herein are possible. All such modifications and variations are intended to be within the scope of the present invention, as defined by the appended claims.
This application claims the benefit of U.S. Provisional Patent Application 61/830,537, filed Jun. 3, 2013, entitled “Extended Subspace Method for Cross-Talk Mitigation in Multi-Parameter Inversion,” the entirety of which is incorporated by reference herein.