Krylov-space-based quasi-Newton preconditioner for full-wavefield inversion

Information

  • Patent Grant
  • Patent Number
    10,838,093
  • Date Filed
    Thursday, June 23, 2016
  • Date Issued
    Tuesday, November 17, 2020
Abstract
A method, including: storing, in a computer memory, seismic data acquired from a seismic survey of a subsurface region; and generating, with a computer, a final subsurface physical property model of the subsurface region by processing the seismic data with an iterative full wavefield inversion method, wherein the iterative full wavefield inversion method generates the final subsurface physical property model by iteratively applying a linear solver with a preconditioner that is generated from information from one or more previous iterations of the linear solver.
Description
FIELD OF THE INVENTION

Exemplary embodiments described herein pertain generally to the field of geophysical prospecting, and more particularly to geophysical data processing. An exemplary embodiment can increase the speed of convergence of full wavefield inversion (FWI).


BACKGROUND

This section is intended to introduce various aspects of the art, which may be associated with exemplary embodiments of the present technological advancement. This discussion is believed to assist in providing a framework to facilitate a better understanding of particular aspects of the present technological advancement. Accordingly, it should be understood that this section should be read in this light, and not necessarily as admissions of prior art.


Seismic inversion is a process of extracting subsurface information from the data measured at the surface of the earth during a seismic survey. In a typical seismic survey, seismic waves are generated by a source positioned at desired locations. The source-generated wave propagates through the subsurface and travels back to the receiver locations, where it is recorded.


Full waveform inversion (FWI) is a seismic processing method which can potentially exploit the full seismic record, including events that are treated as “noise” by standard seismic processing algorithms. FWI iteratively minimizes an objective function based on a comparison of simulated and measured seismic records. Even with today's available high-performance computing resources, one of the biggest challenges to FWI is still the computational cost. Nevertheless, the benefit of inferring a detailed representation of the subsurface using this method is expected to outweigh this cost, and development of new algorithms and workflows that lead to faster turnaround time is a key step towards making this technology feasible for field-scale applications, allowing users to solve larger-scale problems faster. The most computationally intensive component of FWI is the simulation of the forward and adjoint wavefields. The total number of forward and adjoint simulations is proportional to the number of iterations, which is typically on the order of hundreds to thousands. Any method that reduces the number of FWI iterations will reduce the number of forward and adjoint simulation calls and the computational run time.


The crux of any FWI algorithm can be described as follows: using a given starting subsurface physical property model, synthetic seismic data are generated, i.e. modeled or simulated, by solving the wave equation using a numerical scheme (e.g., finite-difference, finite-element, or spectral-element methods), which typically divides the model domain into a set of nonoverlapping cells (also referred to as elements or blocks). The term velocity model or geophysical property model as used herein refers to an array of numbers, where each number, which may also be called a model parameter, is a value of velocity or another geophysical property in a cell. The synthetic seismic data are compared with the field seismic data, and an error or objective function is calculated using a norm. Using this objective function and an optimization algorithm, a modified subsurface model is generated, which is used to simulate a new set of synthetic seismic data. This new set of synthetic seismic data is compared with the field data to generate a new objective function. This process is repeated until the optimization algorithm satisfactorily minimizes the objective function and the final subsurface model is generated. A global or local optimization method is used to minimize the objective function and to update the subsurface model. Further details regarding FWI can be found in U.S. Patent Publication 2011/0194379 to Lee et al., the entire contents of which are hereby incorporated by reference.


Common FWI methods iteratively minimize the objective function subject to the wavefield propagation (the physics of the problem). A (nonlinear) iteration i of FWI involves the following two steps: (1) compute a search direction d(m_i) for the current model m_i; and (2) search for an update to the current model that is a perturbation along the search direction and that reduces the objective function. The FWI processing starts from a given starting model m_0 provided by the user. FWI algorithms iteratively improve this starting model using an optimization technique,

m_{i+1} = m_i + α_i d_i,  (1)

where α_i is a scalar step-length parameter, d_i is the search direction, and i is the nonlinear iteration number. The search direction is chosen according to a globalization strategy [1,2]. For second-order optimization methods, the search direction is obtained by solving

H_i d_i = −g_i,  (2)

where H_i can be the Newton Hessian or the Gauss-Newton Hessian. For large-scale optimization problems, the Hessian is prohibitively expensive both to store and to compute explicitly. Instead, an approximate inverse Hessian H_i^{-1} is used to calculate the search direction. There are several choices for this approximation, such as (i) quasi-Newton methods and (ii) truncated Newton or Gauss-Newton methods (note that “(Gauss)-Newton” is used herein to refer to both Newton and Gauss-Newton methods).


SUMMARY

In an exemplary embodiment, a method can include: storing, in a computer memory, seismic data acquired from a seismic survey of a subsurface region; and generating, with a computer, a final subsurface physical property model of the subsurface region by processing the seismic data with an iterative full wavefield inversion method, wherein the iterative full wavefield inversion method generates the final subsurface physical property model by iteratively applying a linear solver with a preconditioner that is generated from information from one or more previous iterations of the linear solver.


In an exemplary embodiment, the linear solver is a Krylov-space method.


In an exemplary embodiment, the linear solver is a conjugate gradient method.


In an exemplary embodiment, the method can further include generating the preconditioner with a limited-memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) method.


In an exemplary embodiment, the method can further include generating the preconditioner with a quasi-Newton method.


In an exemplary embodiment, the method can further include: storing, in a computer memory, a change in an optimization parameter of the full wavefield inversion method and a change in a gradient of a cost function used in the full wavefield inversion method for each of a plurality of iterations of the linear solver; generating the preconditioner based on the change in the optimization parameter and the change in the gradient of the cost function from each of the plurality of iterations of the linear solver; and applying the preconditioner to a subsequent iteration of the linear solver, relative to the plurality of iterations.


In an exemplary embodiment, the preconditioner is a fixed preconditioner, and the preconditioner does not change when solving a linear system and it is only based on changes in the optimization parameter and changes in the gradient of the cost function from previous iterations of the linear solver.


In an exemplary embodiment, the preconditioner is a variable preconditioner, and the preconditioner can change when solving a linear system, and is based on changes in the optimization parameter and changes in the gradient of the cost function from previous iterations of the linear solver and a current linear iteration of the linear solver.


In an exemplary embodiment, the linear solver is non-flexible.


In an exemplary embodiment, the linear solver is flexible.


In an exemplary embodiment, the change in the optimization parameter for a standard quasi-Newton algorithm is replaced with a change in a search direction of a linear system.


In an exemplary embodiment, the change in the gradient for the standard quasi-Newton algorithm is replaced with a change in a residual of the linear solver.


In an exemplary embodiment, the optimization parameter is the search direction and the gradient is a residual of a linear system.


In an exemplary embodiment, the linear solver is a Krylov-space method.


In an exemplary embodiment, the method can further include managing hydrocarbons based on the final subsurface physical property model of the subsurface region.


In an exemplary embodiment, the method can further include: creating, with a processor, an image of the subsurface region from the final subsurface physical property model.


In an exemplary embodiment, the method can further include: using the final subsurface physical property model in interpreting a subsurface region for hydrocarbon exploration or production.


In an exemplary embodiment, the method can further include drilling for hydrocarbons at a location determined using the final subsurface physical property model of the subsurface region.


In an exemplary embodiment, the linear solver is a generalized minimal residual method.





BRIEF DESCRIPTION OF THE DRAWINGS

While the present disclosure is susceptible to various modifications and alternative forms, specific example embodiments thereof have been shown in the drawings and are herein described in detail. It should be understood, however, that the description herein of specific example embodiments is not intended to limit the disclosure to the particular forms disclosed herein, but on the contrary, this disclosure is to cover all modifications and equivalents as defined by the appended claims. It should also be understood that the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating principles of exemplary embodiments of the present invention. Moreover, certain dimensions may be exaggerated to help visually convey such principles.



FIG. 1 illustrates the linear and nonlinear iterative methods that are included in the present technological advancement.



FIG. 2 illustrates an exemplary method embodying the present technological advancement.



FIG. 3 describes an exemplary application of the present technological advancement.





DESCRIPTION OF THE INVENTION



The present technological advancement can increase the speed of convergence of FWI by several factors when second-order methods are used as the optimization technique. The present technological advancement can uniquely combine two known optimization techniques: quasi-Newton methods such as L-BFGS (the first method) and the truncated Newton (or Gauss-Newton) method (the second method). In the present technological advancement, the second method can be used as the optimization algorithm and the first method can be used as a preconditioner to speed up the convergence of the second method.


Quasi-Newton Methods


Quasi-Newton methods replace H_i^{-1} in equation (2) with an approximation when solving for d_i. These methods approximate the inverse Hessian operator using gradient and model-parameter changes over the nonlinear iterations (this can be contrasted with the (Gauss)-Newton method, which does not approximate the Hessian, but (approximately) solves equation (2)).


Not all quasi-Newton methods are directly applicable to FWI problems due to the large-scale nature of FWI. However, quasi-Newton methods have been modified and extended in several ways to make them suitable for large-scale optimization problems. The members of the quasi-Newton family suitable for large-scale optimization problems are the so-called limited-memory quasi-Newton methods. The limited-memory BFGS (Broyden-Fletcher-Goldfarb-Shanno, L-BFGS) algorithm is the most common member suitable for FWI, as it is robust, computationally inexpensive, and easy to implement [2]. All of the preconditioner approaches introduced in the rest of the discussion are based on the L-BFGS algorithm. However, the present technological advancement is not limited to the use of the L-BFGS algorithm; other quasi-Newton algorithms, such as the limited-memory SR1 method, could also be used [2].


The inverse Hessian approximation H_{i+1}^{-1} will be dense, so the cost of storing and manipulating it is computationally prohibitive for an FWI problem. To circumvent this problem, the limited-memory BFGS approximation uses {s_i, y_i} pairs to approximate the action of H_{i+1}^{-1} on a given vector. The vector pairs {s_i, y_i} are defined as

s_i = m_{i+1} − m_i, and  (3)
y_i = g_{i+1} − g_i.  (4)


In other words, vector s_i is the change in the optimization parameter and vector y_i is the change in the gradient g_i at nonlinear iteration i.


The recursive algorithm that computes the application of the approximate inverse Hessian to a vector q by the L-BFGS approach, using m pairs {s_i, y_i}, is given in Algorithm 1 below. Note that, given the vector q and the m vector pairs {s_i, y_i}, the algorithm returns the vector p, which is the multiplication of the approximate inverse Hessian with the vector q.












Algorithm 1: Two-loop L-BFGS algorithm [2]

FOR (k = m, m − 1, ..., 1)   (m is the total number of pairs and k is the loop iterator)
    α_k = (s_k^T q) / (y_k^T s_k)
    q = q − α_k y_k
END FOR
p = (H_i^0)^{−1} q
FOR (k = 1, 2, ..., m)
    β = (y_k^T p) / (y_k^T s_k)
    p = p + (α_k − β) s_k
END FOR
RETURN p









To complete the given L-BFGS algorithm (Algorithm 1), an initial estimate (H_i^0)^{−1} needs to be provided to the algorithm. A method for choosing (H_i^0)^{−1} that has proven to be effective in practice is to set

(H_i^0)^{−1} = (s_{i−1}^T y_{i−1} / y_{i−1}^T y_{i−1}) I,  (5)

where the multiplier in front of the identity matrix I is a scaling factor that attempts to estimate the size of the true Hessian matrix along the most recent search direction. The choice of scaling factor is important to ensure that the initial Hessian approximation is accurately scaled. Alternatively, the initial estimate (H_i^0)^{−1} can be set to an available geophysics-based preconditioner (such as in U.S. Patent Publication No. 2015/0073755, the entirety of which is incorporated by reference). This mechanism enables different types of preconditioners to be combined with the present technological advancement.
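By way of non-limiting illustration only, the following is a minimal Python sketch (assuming NumPy) of the two-loop recursion of Algorithm 1 together with the scaled-identity initial estimate of equation (5). The function name apply_lbfgs_inverse_hessian and the list-based storage of the {s_k, y_k} pairs are illustrative assumptions, not the patented implementation.

import numpy as np

def apply_lbfgs_inverse_hessian(q, s_list, y_list):
    # Two-loop L-BFGS recursion (Algorithm 1): returns p, an approximation of H^{-1} q.
    # s_list, y_list hold the m stored vector pairs {s_k, y_k}.
    if not s_list:
        # No pairs stored yet: fall back to the identity (no preconditioning).
        return q.copy()
    q = np.asarray(q, dtype=float).copy()
    m = len(s_list)
    alphas = [0.0] * m
    # First loop: k = m, m-1, ..., 1
    for k in reversed(range(m)):
        alphas[k] = np.dot(s_list[k], q) / np.dot(y_list[k], s_list[k])
        q -= alphas[k] * y_list[k]
    # Initial estimate (H_i^0)^{-1} = (s^T y / y^T y) I, i.e., the scaling of equation (5)
    gamma = np.dot(s_list[-1], y_list[-1]) / np.dot(y_list[-1], y_list[-1])
    p = gamma * q
    # Second loop: k = 1, 2, ..., m
    for k in range(m):
        beta = np.dot(y_list[k], p) / np.dot(y_list[k], s_list[k])
        p += (alphas[k] - beta) * s_list[k]
    return p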


Truncated (Gauss)-Newton Method


Another approach for solving the (Gauss)-Newton system (2) is to use an iterative method. In contrast to the quasi-Newton method, this approach uses the Hessian operator (or its approximation, as in the Gauss-Newton method) directly. However, due to the difficulty of storing and explicitly computing the Hessian operator, one can only employ an iterative method to solve (2), since such methods do not explicitly require the Hessian operator in (2) but rather require the application of the Hessian operator to a given vector. A preferred approach is to use one of the so-called Krylov-space methods. A Krylov-space method for solving a linear system Ax = b is an iterative method that starts from some initial approximation x_0 and the corresponding residual r_0 = b − Ax_0, and iterates until the process possibly finds the exact solution or a stopping criterion is satisfied. These methods only require the application of the Hessian operator to a given vector. For FWI, applying the Hessian to a vector can require at least one forward and one adjoint simulation of the wavefields. The linear iterations usually terminate using inner convergence criteria to improve the speed of the nonlinear convergence of FWI [1, 2, 6]. The following discussion will utilize the conjugate gradient method as a non-limiting example of a Krylov-space method.


After the search direction is computed by approximately solving (2) with an iterative linear Krylov-space solver (such as the conjugate gradient method), the FWI model is updated using a line search strategy. This procedure is repeated until convergence (see FIG. 1). The linear iterations for solving the system (2) with the conjugate gradient method are referred to as inner iterations 101, and the nonlinear iterations for updating the optimization parameter m_i are referred to as outer iterations 103; the relationship between the two is depicted in FIG. 1.
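As a non-limiting illustration of this outer-iteration structure, the following Python sketch abstracts the inner Krylov solve behind a user-supplied callable. The names objective, gradient, solve_gauss_newton_system, and the simple backtracking line search are illustrative assumptions rather than the patented implementation.

import numpy as np

def truncated_gauss_newton_fwi(m0, objective, gradient, solve_gauss_newton_system,
                               n_outer=10):
    # Outer (nonlinear) iterations 103 of a truncated (Gauss)-Newton FWI loop.
    # solve_gauss_newton_system(m, g) is assumed to approximately solve H_i d = -g
    # with a Krylov-space method (inner iterations 101) using only Hessian-vector
    # products (each costing roughly one forward plus one adjoint simulation).
    m = np.asarray(m0, dtype=float).copy()
    for i in range(n_outer):
        g = gradient(m)                              # FWI gradient g_i
        d = solve_gauss_newton_system(m, g)          # search direction, equation (2)
        alpha, f0 = 1.0, objective(m)                # simple backtracking line search
        while objective(m + alpha * d) > f0 and alpha > 1e-6:
            alpha *= 0.5
        m = m + alpha * d                            # model update, equation (1)
    return m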


The accompanying Appendix provides additional information regarding implementation of the conjugate gradient method.


Preconditioning


When using the truncated (Gauss)-Newton method, the performance of linear solvers used to solve (2) can be improved by using a preconditioner. In this case, instead of solving equation (2), the following system is solved with the Krylov space methods

B_i^{−1} H_i d_i = −B_i^{−1} g_i,  (6)

where B_i^{−1} is the preconditioner for equation system (2), which for optimization problems is typically an approximation to the inverse of the Hessian. One of the roles of the preconditioner is to reduce the condition number of the Hessian, so that equation system (6) can be solved more efficiently with fewer linear iterations. Note that in (6) the preconditioner is applied from the left side of the operator H_i. There are alternative ways of applying the preconditioner, and the present technological advancement is not limited to the example provided here (see Appendix, [2,6]).


Exemplary Embodiments

In the following non-limiting exemplary embodiments of the present technological advancement, three methods are combined in a unique way: (i) the quasi-Newton method, (ii) the truncated (Gauss)-Newton method, and (iii) preconditioning. These exemplary embodiments use the truncated (Gauss)-Newton method as the optimization algorithm. In addition, the quasi-Newton approximation of the inverse Hessian is used as a preconditioner to the (Gauss)-Newton system. In other words, Algorithm 1 is used as a preconditioner when solving the system (2) using a Krylov-space method. To create the preconditioner of Algorithm 1, either information from the outer nonlinear iterations 103 or information from the inner linear iterations 101 can be used [1]. The main difference between these approaches (and one example of where the present technological advancement differentiates itself from the state of the art [1]) is the way the vector pairs {s_i, y_i} are created and used in the application of Algorithm 1 by the present technological advancement.


The state-of-the-art quasi-Newton preconditioning approach essentially approximates the inverse Hessian using the information captured from the outer iterations. The present technological advancement, on the contrary, introduces a quasi-Newton preconditioner which approximates the inverse Hessian using the history of the inner iterations 101. The present technological advancement may also use any additional preconditioner as the starting initial estimate of equation (5) in Algorithm 1. The present technological advancement significantly improves the convergence speed of FWI relative to the state-of-the-art preconditioning methods at negligible additional computational cost.


First, it is observed that the solution of equation (2) via an iterative Krylov-space method is equivalent to minimization of the following unconstrained quadratic optimization problem of the form

min_{d_i} f(d_i) = (1/2) d_i^T H_i d_i + d_i^T g_i.  (7)

It is noted that solving (2) with a Krylov subspace method, such as the conjugate gradient method, is equivalent to minimizing an objective function in the form of (7). This idea is combined with the quasi-Newton method to create a preconditioner for the truncated Newton (or truncated Gauss-Newton) method.


In solving (2) with a Krylov-subspace method, the optimization parameter is the search direction d_i and the gradient is equivalent to the residual of the linear system (2). At a given inner linear iteration l, the gradient of the objective function (7) (also the residual of linear system (2)) is

r_l = H_i d_i^l + g_i.  (8)


In a given linear iteration of the conjugate gradient method, the solution of (2) is updated with

d_i^{l+1} = d_i^l + μ_l γ_l.  (9)


And the residual (gradient) is updated with

r_{l+1} = r_l + μ_l H_i γ_l,  (10)

where μ_l is the step length and γ_l is the search direction for the linear system (see [2] and the Appendix for details).


The present technological advancement constructs the quasi-Newton preconditioner with the information captured from this inner optimization, i.e., the {s_l, y_l} pairs for the minimization in equation (7). Using equations (3) and (4) along with (9) and (10), we get

y_l = r_{l+1} − r_l = μ_l H_i γ_l, and  (11)
s_l = d_i^{l+1} − d_i^l = μ_l γ_l.  (12)

In contrast, the standard quasi-Newton method uses the outer nonlinear iterations to create the {s_i, y_i} pairs.


Note that the scaling factor μ_l in (11) and (12) can be omitted because it cancels out in the application of the preconditioner, leading to

s_l = γ_l and y_l = H_i γ_l.  (13)


To generate the preconditioner, the vectors in (13) are substituted into Algorithm 1, wherein the output is the multiplication of the approximate inverse Hessian with vector q, wherein this vector q can be the residual vector r in the preconditioned conjugate gradient method (see Appendix).


These choices of vectors (13), and their use, are examples of ways in which the present technological advancement distinguishes over conventional techniques. In other words, when using the inner iterations 101 of a Krylov-space method (the conjugate gradient method) as the linear solver, it is recognized that “the change in the residual of the inner iterations” plays the same role as “the change in the gradient of the outer iterations”, and “the change in the solution of the linear system” plays the same role as “the change in the solution of the outer optimization system”.
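By way of non-limiting illustration, the Python sketch below shows inner conjugate gradient iterations 101 for H_i d = −g_i that record the {γ_l, H_i γ_l} pairs of equation (13) while they solve the linear system. The function name cg_collect_pairs, the optional fixed-preconditioner argument, and the NumPy-based storage are illustrative assumptions; the returned pairs can be supplied as the {s_k, y_k} inputs of the apply_lbfgs_inverse_hessian sketch given earlier to form the preconditioner for the next outer iteration.

import numpy as np

def cg_collect_pairs(hessian_vec, g, precondition=None, n_inner=20, tol=1e-3):
    # Inner CG iterations 101 for H d = -g that also store the pairs of equation (13).
    # If `precondition` is supplied, it is applied as a fixed preconditioner B^{-1}
    # (as in Algorithm A2); otherwise plain CG is used.
    d = np.zeros_like(g)
    r = -g.copy()                                   # residual of H d = -g at d = 0
    z = r.copy() if precondition is None else precondition(r)
    p = z.copy()
    s_list, y_list = [], []
    for l in range(n_inner):
        Hp = hessian_vec(p)
        s_list.append(p.copy())                     # s_l = gamma_l        (equation 13)
        y_list.append(Hp.copy())                    # y_l = H_i gamma_l    (equation 13)
        mu = np.dot(r, z) / np.dot(p, Hp)           # linear step length mu_l
        d += mu * p                                 # update of equation (9)
        r_new = r - mu * Hp                         # residual update (sign per r = b - Ax)
        if np.linalg.norm(r_new) < tol * np.linalg.norm(g):
            break
        z_new = r_new.copy() if precondition is None else precondition(r_new)
        beta = np.dot(z_new, r_new) / np.dot(z, r)
        p = z_new + beta * p                        # next linear search direction gamma_{l+1}
        r, z = r_new, z_new
    return d, s_list, y_list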



FIG. 1 is a depiction of the outer and inner iterations that are part of the truncated (Gauss)-Newton method, and of how they relate to generating an updated physical property subsurface model. The linear or inner iterations 101i are repeated for equations (7)-(12) until a predetermined stopping criterion is reached, at which point d_i is determined. As part of these iterations, L {s_l, y_l} pairs are generated in the linear iterations 101i, stored, and used to create the preconditioner for use in subsequent iterations. The output of the inner iterations 101i in FIG. 1, d_i, is used to update the model m at the outer iteration 103i, and this is repeated for the subsequent outer iterations. The outer 103 and inner 101 iterations are repeated until the updated model converges or some other predetermined stopping criterion is satisfied.



FIG. 2 is a flow chart for an exemplary method of implementing the present technological advancement. The exemplary method of FIG. 2 builds on the truncated (Gauss)-Newton method of FIG. 1 by combining it with quasi-Newton methods and preconditioning.


In step 201, in the first non-linear iteration 103i, equation (2) is solved using the conjugate gradient method, without preconditioning, by performing several linear iterations 101i of the conjugate gradient method. Since (sk, yk) pairs are not available for the first iteration, the preconditioner of the present technological advancement is not used in the first nonlinear iteration. However, it is also possible to use another preconditioner when performing step 201.


In step 203, in the first nonlinear iteration 103i, when solving equation (2), the vectors γ_l and their products with the Hessian operator, H_i γ_l, are stored in computer memory. Effectively, the rightmost terms in equations (9) and (10) can be stored for each iteration l. As noted above, the scaling factor cancels out, so effectively it does not matter whether the scaling factor is stored along with the vectors γ_l and their products H_i γ_l.


In step 205, the model is updated in outer iteration 103i. The output of the inner iterations 101i is di, which is used to update the model in the outer iteration 103i.


In step 207, in the second nonlinear iteration 103i+1, the stored pairs of {γ_l, H_i γ_l} from the previous nonlinear iteration 103i are used to construct the quasi-Newton “inverse Hessian mat-vec (matrix-free) operator” (Algorithm 1). This operator is used for preconditioning the conjugate gradient algorithm as given in equation (6). Only the pairs stored in the previous outer iteration(s) are used in the preconditioner application when a non-flexible conjugate gradient method is used. However, a flexible version of the conjugate gradient method can also be used with the present technological advancement. The flexible version allows for a variable preconditioner within the inner iterations 101. In this case, a slight modification in the application of the preconditioner is possible. Accordingly, the flexible version of the conjugate gradient method can use all pairs of {γ_l, H_i γ_l} vectors (i.e., those from the previous nonlinear iteration 103i and those generated during the current nonlinear iteration 103i+1). Further details of this flexible preconditioner approach are included in the Appendix.


The L-BFGS preconditioner requires a starting inverse Hessian, which above is approximated by the scaled identity of equation (5). The correct scaling can be crucial for the performance of the preconditioner. When Algorithm 1 is used as part of the present technological advancement to generate the preconditioner, the state-of-the-art preconditioner (5) constructed from the information at the outer iterations can be used as the starting inverse Hessian. Thus, the present technological advancement can combine information obtained from both the outer and inner iterations.


Provided that H_i in (2) is positive definite, as for a Gauss-Newton system, it can be shown that the resultant algorithm produces a positive definite operator, and guarantees a descent direction and a robust algorithm. If one uses a truncated Newton's method, additional care must be taken to preserve positive definiteness [2].


In step 209, in the second linear iterations 101i+1, the process continues to store the pairs of {γl, Hiγl} vectors generated from the current iteration (i+1) during the solving of equation (6) using the conjugate gradient method.


In step 211, the model is updated in outer iteration 103i+1. The output of the inner iterations 101i+1 is di+1, which is used to update the model in the outer iteration 103i+1.


In step 213, it is determined whether the convergence criteria or other predetermined stopping criteria have been satisfied for the updated physical property model. If not, the process returns to step 207 for another iteration. Iterations can continue until the convergence or stopping criteria are satisfied.


When the convergence or stopping criteria is satisfied, the process proceeds to step 215, in which a final physical property subsurface model is generated.
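For illustration only, the following Python sketch ties the steps of FIG. 2 together, reusing the illustrative helper functions apply_lbfgs_inverse_hessian and cg_collect_pairs from the earlier sketches. It shows the non-flexible variant, in which the preconditioner is held fixed while each linear system is solved; all names and the simplified loop structure are assumptions rather than the patented implementation.

import numpy as np

def fwi_with_inner_iteration_preconditioner(m0, gradient, hessian_vec, line_search,
                                            n_outer=10):
    # Skeleton of the FIG. 2 workflow (steps 201-215): the {gamma_l, H_i gamma_l}
    # pairs stored during the inner iterations of outer iteration i are used to
    # precondition the inner iterations of outer iteration i+1.
    m = np.asarray(m0, dtype=float).copy()
    s_prev, y_prev = [], []
    for i in range(n_outer):
        g = gradient(m)
        hv = lambda v, m=m: hessian_vec(m, v)       # Hessian-vector product at current m
        if s_prev:                                  # step 207: L-BFGS preconditioner
            s_fix, y_fix = list(s_prev), list(y_prev)
            precond = lambda r: apply_lbfgs_inverse_hessian(r, s_fix, y_fix)
        else:                                       # step 201: first outer iteration, no pairs yet
            precond = None
        # steps 203/209: the inner solve also returns the newly stored pairs
        d, s_prev, y_prev = cg_collect_pairs(hv, g, precondition=precond)
        alpha = line_search(m, d)                   # steps 205/211: update the model
        m = m + alpha * d
    return m                                        # step 215: final model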


In step 217, the final physical property subsurface model can be used to manage hydrocarbon exploration. As used herein, hydrocarbon management includes hydrocarbon extraction, hydrocarbon production, hydrocarbon exploration, identifying potential hydrocarbon resources, identifying well locations, determining well injection and/or extraction rates, identifying reservoir connectivity, acquiring, disposing of and/or abandoning hydrocarbon resources, reviewing prior hydrocarbon management decisions, and any other hydrocarbon-related acts or activities.


The preconditioner of the present technological advancement can be used along with alternative iterative methods. The above examples use the conjugate gradient method as the linear solver. The conjugate gradient method can be replaced with another iterative method, such as the generalized minimal residual method (GMRES). To be applicable to FWI, such alternative methods need to be iterative like the conjugate gradient method.


Construction of the preconditioner of the present technological advancement is not limited to Algorithm 1. Other alternative iterative methods, such as limited-memory SR1, can be used. The update vectors of Algorithm 1 are the change in the residual at each iteration and the change in the iterate (solution) of the linear solver.



FIG. 3 illustrates a comparison between the present technological advancement and conventional technology. The comparison is based on data synthetically generated using a geologically appropriate model, and uses a state-of-the-art truncated Gauss-Newton method as the baseline algorithm. FIG. 3 displays the convergence speed-up obtained with the present technological advancement when it is used for a sequential-source [5] FWI problem to construct the model.


In all practical applications, the present technological advancement must be used in conjunction with a computer, programmed in accordance with the disclosures herein. Preferably, in order to efficiently perform FWI, the computer is a high-performance computer (HPC), as known to those skilled in the art. Such high-performance computers typically involve clusters of nodes, each node having multiple CPUs and computer memory that allow parallel computation. The models may be visualized and edited using any interactive visualization programs and associated hardware, such as monitors and projectors. The architecture of the system may vary and may be composed of any number of suitable hardware structures capable of executing logical operations and displaying the output according to the present technological advancement. Those of ordinary skill in the art are aware of suitable supercomputers available from Cray or IBM.


The present techniques may be susceptible to various modifications and alternative forms, and the examples discussed above have been shown only by way of example. However, the present techniques are not intended to be limited to the particular examples disclosed herein. Indeed, the present techniques include all alternatives, modifications, and equivalents falling within the spirit and scope of the appended claims.


REFERENCES

The following references are each incorporated by reference in their entirety: [1] L. Metivier, R. Brossier, J. Virieux, and S. Operto, “Full Waveform Inversion and Truncated Newton Method”, SIAM J. Sci. Comput., 35(2), B401-B437; [2] J. Nocedal and J. Wright, “Numerical Optimization”, 2nd Edition, Springer; [3] V. Akcelik, G. Biros, O. Ghattas, J. Hill, D. Keyes, and B. van Bloemen Waanders, “Parallel algorithms for PDE constrained optimization, in Parallel Processing for Scientific Computing”, SIAM, 2006; [4] J. L. Morales and J. Nocedal, “Automatic Preconditioning by Limited Memory Quasi-Newton Updating”, SIAM J. Optim, 10(4), 1079-1096; [5] J. R. Krebs, J. E. Anderson, D. Hinkley, R. Neelamani, S. Lee, A. Baumstein and M. D. Lacasse, “Fast Full-wavefield Seismic Inversion Using Encoded Sources”, Geophysics, 74; and [6] D. A. Knoll and D. E. Keyes, “Jacobian-free Newton-Krylov Methods: A Survey of Approaches and Applications”, SIAM J. Sci. Comp. 24:183-200, 2002.


APPENDIX

The conjugate gradient method is an algorithm for finding the numerical solution of symmetric and positive-definite systems of linear equations. The conjugate gradient method is often implemented as an iterative algorithm, applicable to sparse systems that are too large to be handled by a direct implementation or other direct methods such as the Cholesky decomposition. Large sparse systems often arise when numerically solving partial differential equations or optimization problems. The conjugate gradient method can also be used to solve unconstrained optimization problems such as the least-squares minimization problem.


Description of the Method


Suppose solving the following system of linear equations

Ax=b,  (A1)

for the vector x, where b is the known vector, and A is the known n-by-n matrix which is symmetric (i.e., A^T = A), positive definite (x^T A x > 0 for all non-zero vectors x in ℝ^n), and real. The unique solution of this system is given by x*.


The Conjugate Gradient Method as a Direct Method


Two non-zero vectors u and v are conjugate with respect to A if

u^T A v = 0.  (A2)

Since A is symmetric and positive definite, the left-hand side of equation (A2) defines an inner product

⟨u, v⟩_A := ⟨Au, v⟩ = ⟨u, A^T v⟩ = ⟨u, Av⟩ = u^T A v,  (A3)

where ⟨·, ·⟩ is the inner product operator of two vectors. The two vectors are conjugate if and only if they are orthogonal with respect to this inner product. Being conjugate is a symmetric relation: if u is conjugate to v, then v is conjugate to u.


Suppose that

P = {p_k : ∀ i ≠ k, k ∈ [1, n], and ⟨p_i, p_k⟩_A = 0}  (A4)

is a set of n mutually conjugate directions. Then P is a basis of ℝ^n, so the solution x* of Ax = b can be expanded in P as

x* = Σ_{i=1}^{n} α_i p_i,  (A5)

which leads to

b = Ax* = Σ_{i=1}^{n} α_i A p_i.  (A6)

For any p_k ∈ P,

p_k^T b = p_k^T A x* = Σ_{i=1}^{n} α_i p_k^T A p_i = α_k p_k^T A p_k,  (A7)

because p_i and p_k are mutually conjugate for all i ≠ k. Therefore,

α_k = (p_k^T b) / (p_k^T A p_k) = ⟨p_k, b⟩ / ⟨p_k, p_k⟩_A.  (A8)

This gives the following method for solving the equation Ax = b: find a sequence of n conjugate directions, and then compute the coefficients α_k.


The Conjugate Gradient Method as an Iterative Method


An iterative conjugate gradient method makes it possible to approximately solve systems of linear equations where n is so large that the direct method is computationally intractable. Denote an initial guess for x* by x_0, and assume without loss of generality that x_0 = 0. Starting with x_0, while searching for the solution, each iteration needs a metric to determine whether the current iterate is closer to x* (which is unknown). This metric comes from the fact that the solution x* is the unique minimizer of the following quadratic function:

f(x) = (1/2) x^T A x − x^T b,   x ∈ ℝ^n,  (A9)

and as this function f becomes smaller, the solution x gets closer to x*. The search (descent) direction for the function f in (A9) equals the negative gradient b − Ax. Starting from a guessed solution x_0 (x_0 = 0 if no guess is available), at the kth step this descent direction is

r_k = b − A x_k.  (A10)

The conjugation constraint described previously is an orthonormality-type constraint, and hence the algorithm bears a resemblance to Gram-Schmidt orthonormalization. This gives the following expression for the conjugate direction obtained from r_k:

p_k = r_k − Σ_{i<k} (p_i^T A r_k / p_i^T A p_i) p_i.  (A11)

Following this direction, the next optimal location is

x_{k+1} = x_k + α_k p_k,  (A12)

where

α_k = (p_k^T b) / (p_k^T A p_k) = (p_k^T r_{k−1}) / (p_k^T A p_k),  (A13)

where the last equality holds because p_k and x_{k−1} are conjugate.


Conjugate Gradient Algorithm


The above derivation gives a straightforward explanation of the conjugate gradient method. Seemingly, the algorithm as stated requires storage of all previous search directions and residual vectors, as well as many matrix-vector multiplications, and thus can be computationally expensive. However, a closer analysis of the algorithm shows that r_{k+1} is conjugate to p_i for all i < k, and therefore only r_k, p_k, and x_k are needed to construct r_{k+1}, p_{k+1}, and x_{k+1}. Furthermore, only one matrix-vector multiplication is needed in each iteration.


A modified algorithm is detailed below for solving Ax=b where A is a real, symmetric, positive-definite matrix, with an input vector x0 (a guessed solution otherwise 0).












Algorithm A1: A conjugate gradient algorithm.

r_0 = b − A x_0
p_0 = r_0
k = 0
REPEAT
    α_k = (r_k^T r_k) / (p_k^T A p_k)
    x_{k+1} = x_k + α_k p_k
    r_{k+1} = r_k − α_k A p_k
    IF r_{k+1} is sufficiently small, EXIT REPEAT
    β_k = (r_{k+1}^T r_{k+1}) / (r_k^T r_k)
    p_{k+1} = r_{k+1} + β_k p_k
    k = k + 1
END REPEAT
RETURN x_{k+1}
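By way of non-limiting illustration, Algorithm A1 can be transcribed into a short Python routine as follows (assuming NumPy and a matrix A supporting the @ operator; the function name and tolerance handling are illustrative assumptions):

import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=1000):
    # Minimal transcription of Algorithm A1 for a symmetric positive-definite A.
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x
    p = r.copy()
    rr = np.dot(r, r)
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rr / np.dot(p, Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rr_new = np.dot(r, r)
        if np.sqrt(rr_new) < tol:        # exit when the residual is sufficiently small
            break
        beta = rr_new / rr
        p = r + beta * p
        rr = rr_new
    return x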









Preconditioned Conjugate Gradient Method


Preconditioning speeds up convergence of the conjugate gradient method. A preconditioned conjugate gradient algorithm is given in Algorithm A2, which requires an application of preconditioner operator B−1 on a given vector in addition to the steps in Algorithm A1.












Algorithm A2: A preconditioned conjugate gradient algorithm.

r_0 = b − A x_0
z_0 = B^{−1} r_0
p_0 = z_0
k = 0
REPEAT
    α_k = (r_k^T z_k) / (p_k^T A p_k)
    x_{k+1} = x_k + α_k p_k
    r_{k+1} = r_k − α_k A p_k
    IF r_{k+1} is sufficiently small, EXIT REPEAT
    z_{k+1} = B^{−1} r_{k+1}
    β_k = (z_{k+1}^T r_{k+1}) / (z_k^T r_k)
    p_{k+1} = z_{k+1} + β_k p_k
    k = k + 1
END REPEAT
RETURN x_{k+1}










The preconditioner matrix B has to be symmetric positive-definite and fixed, i.e., it cannot change from iteration to iteration. If any of these assumptions on the preconditioner is violated, the behavior of Algorithm A2 becomes unpredictable and its convergence cannot be guaranteed.


Flexible Preconditioned Conjugate Gradient Method


For some numerically challenging applications, Algorithm A2 can be modified to accept variable preconditioners, changing between iterations, in order to improve its convergence performance. For instance, the Polak-Ribière formula

β_k = z_{k+1}^T (r_{k+1} − r_k) / (z_k^T r_k),  (A14)

instead of the Fletcher-Reeves formula used in Algorithm A2,

β_k = (z_{k+1}^T r_{k+1}) / (z_k^T r_k),  (A15)

may dramatically improve the convergence of the preconditioned conjugate gradient method. This version of the preconditioned conjugate gradient method can be called flexible, as it allows for variable preconditioning. The implementation of the flexible version requires storing an extra vector. For a fixed preconditioner, z_{k+1}^T r_k = 0, so the Polak-Ribière and Fletcher-Reeves formulas are equivalent. The mathematical explanation of the better convergence behavior of the method with the Polak-Ribière formula is that the method is locally optimal in this case; in particular, it does not converge more slowly than the locally optimal steepest descent method.
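For illustration only, the sketch below is a flexible preconditioned conjugate gradient loop in Python using the Polak-Ribière formula (A14), with a user-supplied callable apply_preconditioner(r, k) that may change from iteration to iteration; the function and argument names are assumptions. With a preconditioner that does not change between iterations, the loop behaves like Algorithm A2.

import numpy as np

def flexible_pcg(A, b, apply_preconditioner, x0=None, tol=1e-10, max_iter=1000):
    # Flexible preconditioned CG: the preconditioner may vary with k.
    # apply_preconditioner(r, k) should return z, an approximation of B_k^{-1} r.
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x
    z = apply_preconditioner(r, 0)
    p = z.copy()
    for k in range(max_iter):
        Ap = A @ p
        alpha = np.dot(r, z) / np.dot(p, Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        if np.linalg.norm(r_new) < tol:
            break
        z_new = apply_preconditioner(r_new, k + 1)
        # Polak-Ribiere formula (A14); for a fixed preconditioner it coincides
        # with the Fletcher-Reeves formula (A15)
        beta = np.dot(z_new, r_new - r) / np.dot(z, r)
        p = z_new + beta * p
        r, z = r_new, z_new
    return x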

Claims
  • 1. A method, comprising: storing, in a computer memory, seismic data acquired from a seismic survey of a subsurface region; generating, with a computer, a final subsurface physical property model of the subsurface region by processing the seismic data with an iterative full wavefield inversion method, wherein the iterative full wavefield inversion method includes a non-linear outer iteration process comprising a plurality of outer iterations, each of which updates physical property values m of the physical property model and a nested linear inner iteration process, wherein: (1) each iteration i of the plurality of non-linear outer iterations includes a nested linear inner iteration process for determining a search direction di to be used in updating the physical property values mi of said non-linear outer iteration; and (2) each of the second and subsequent outer iterations of the plurality of non-linear outer iterations further includes applying a preconditioner to the nested linear inner iteration process of such respective second and subsequent non-linear outer iteration, wherein the preconditioner at such non-linear outer iteration i+1 is generated based at least in part upon the nested linear inner iteration process of the immediately previous outer iteration i and wherein the preconditioner is generated using vectors {γl, Hiγl} solved for in the nested inner iteration process, where γl is the search direction for the linear system and where Hiγl is the product of the vector γl with a Hessian operator Hi; and creating, with a processor, an image of the subsurface region from the final subsurface physical property model.
  • 2. The method of claim 1, comprising using a Krylov-space method as a linear solver for the nested inner iteration processes.
  • 3. The method of claim 1, comprising using a conjugate gradient method as a linear solver for the nested inner iteration processes.
  • 4. The method of claim 2, wherein each preconditioner is generated based on the nested inner iteration process of the immediately previous outer iteration using a limited-memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) method.
  • 5. The method of claim 2, wherein each preconditioner is generated based on the nested inner iteration process of the immediately previous outer iteration using a quasi-Newton method.
  • 6. The method of claim 1, wherein the preconditioner of each of the second and subsequent outer iterations is a variable preconditioner such that the preconditioner can change when solving a linear system, and the preconditioner is generated based on the nested iteration process of the then-current outer iteration in addition to the nested inner iteration process of the immediately previous outer iteration.
  • 7. The method of claim 1, further comprising managing hydrocarbons based on the final subsurface physical property model of the subsurface region.
  • 8. The method of claim 1, further comprising: using the final subsurface physical property model in interpreting a subsurface region for hydrocarbon exploration or production.
  • 9. The method of claim 1, further comprising drilling for hydrocarbons at a location determined using the final subsurface physical property model of the subsurface region.
  • 10. The method of claim 1, wherein a generalized minimal residual method is used as a linear solver for the nested inner iteration processes.
CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of U.S. Provisional Patent Application 62/188,063 filed Jul. 2, 2015 entitled KRYLOV-SPACE-BASED QUASI-NEWTON PRECONDITIONER FOR FULL WAVEFIELD INVERSION, the entirety of which is incorporated by reference herein.

US Referenced Citations (224)
Number Name Date Kind
3812457 Weller May 1974 A
3864667 Bahjat Feb 1975 A
4159463 Silverman Jun 1979 A
4168485 Payton et al. Sep 1979 A
4545039 Savit Oct 1985 A
4562650 Nagasawa et al. Jan 1986 A
4575830 Ingram et al. Mar 1986 A
4594662 Devaney Jun 1986 A
4636957 Vannier et al. Jan 1987 A
4675851 Savit et al. Jun 1987 A
4686654 Savit Aug 1987 A
4707812 Martinez Nov 1987 A
4715020 Landrum, Jr. Dec 1987 A
4766574 Whitmore et al. Aug 1988 A
4780856 Becquey Oct 1988 A
4823326 Ward Apr 1989 A
4924390 Parsons et al. May 1990 A
4953657 Edington Sep 1990 A
4969129 Currie Nov 1990 A
4982374 Edington et al. Jan 1991 A
5260911 Mason et al. Nov 1993 A
5469062 Meyer, Jr. Nov 1995 A
5583825 Carrazzone et al. Dec 1996 A
5677893 de Hoop et al. Oct 1997 A
5715213 Allen Feb 1998 A
5717655 Beasley Feb 1998 A
5719821 Sallas et al. Feb 1998 A
5721710 Sallas et al. Feb 1998 A
5790473 Allen Aug 1998 A
5798982 He et al. Aug 1998 A
5822269 Allen Oct 1998 A
5838634 Jones et al. Nov 1998 A
5852588 de Hoop et al. Dec 1998 A
5878372 Tabarovsky et al. Mar 1999 A
5920838 Norris et al. Jul 1999 A
5924049 Beasley et al. Jul 1999 A
5999488 Smith Dec 1999 A
5999489 Lazaratos Dec 1999 A
6014342 Lazaratos Jan 2000 A
6021094 Ober et al. Feb 2000 A
6028818 Jeffryes Feb 2000 A
6058073 Verwest May 2000 A
6125330 Robertson et al. Sep 2000 A
6219621 Hornbostel Apr 2001 B1
6225803 Chen May 2001 B1
6311133 Lailly et al. Oct 2001 B1
6317695 Zhou et al. Nov 2001 B1
6327537 Ikelle Dec 2001 B1
6374201 Grizon et al. Apr 2002 B1
6381543 Guerillot et al. Apr 2002 B1
6388947 Washbourne et al. May 2002 B1
6480790 Calvert et al. Nov 2002 B1
6522973 Tonellot et al. Feb 2003 B1
6545944 de Kok Apr 2003 B2
6549854 Malinverno et al. Apr 2003 B1
6574564 Lailly et al. Jun 2003 B2
6593746 Stolarczyk Jul 2003 B2
6662147 Fournier et al. Dec 2003 B1
6665615 Van Riel et al. Dec 2003 B2
6687619 Moerig et al. Feb 2004 B2
6687659 Shen Feb 2004 B1
6704245 Becquey Mar 2004 B2
6714867 Meunier Mar 2004 B2
6735527 Levin May 2004 B1
6754590 Moldoveanu Jun 2004 B1
6766256 Jeffryes Jul 2004 B2
6826486 Malinverno Nov 2004 B1
6836448 Robertsson et al. Dec 2004 B2
6842701 Moerig et al. Jan 2005 B2
6859734 Bednar Feb 2005 B2
6865487 Charron Mar 2005 B2
6865488 Moerig et al. Mar 2005 B2
6876928 Van Riel et al. Apr 2005 B2
6882938 Vaage et al. Apr 2005 B2
6882958 Schmidt et al. Apr 2005 B2
6901333 Van Riel et al. May 2005 B2
6903999 Curtis et al. Jun 2005 B2
6905916 Bartsch et al. Jun 2005 B2
6906981 Vaage Jun 2005 B2
6927698 Stolarczyk Aug 2005 B2
6944546 Xiao et al. Sep 2005 B2
6947843 Fisher et al. Sep 2005 B2
6970397 Castagna et al. Nov 2005 B2
6977866 Huffman Dec 2005 B2
6999880 Lee Feb 2006 B2
7046581 Calvert May 2006 B2
7050356 Jeffryes May 2006 B2
7069149 Goff et al. Jun 2006 B2
7027927 Routh et al. Jul 2006 B2
7072767 Routh et al. Jul 2006 B2
7092823 Lailly et al. Aug 2006 B2
7110900 Adler et al. Sep 2006 B2
7184367 Yin Feb 2007 B2
7230879 Herkenoff et al. Jun 2007 B2
7271747 Baraniuk et al. Sep 2007 B2
7330799 Lefebvre et al. Feb 2008 B2
7337069 Masson et al. Feb 2008 B2
7373251 Hamman et al. May 2008 B2
7373252 Sherrill et al. May 2008 B2
7376046 Jeffryes May 2008 B2
7376539 Lecomte May 2008 B2
7400978 Langlais et al. Jul 2008 B2
7436734 Krohn Oct 2008 B2
7480206 Hill Jan 2009 B2
7584056 Koren Sep 2009 B2
7599798 Beasley et al. Oct 2009 B2
7602670 Jeffryes Oct 2009 B2
7616523 Tabti et al. Nov 2009 B1
7620534 Pita et al. Nov 2009 B2
7620536 Chow Nov 2009 B2
7646924 Donoho Jan 2010 B2
7672194 Jeffryes Mar 2010 B2
7672824 Dutta et al. Mar 2010 B2
7675815 Saenger et al. Mar 2010 B2
7679990 Herkenhoff et al. Mar 2010 B2
7684281 Vaage et al. Mar 2010 B2
7710821 Robertsson et al. May 2010 B2
7715985 Van Manen et al. May 2010 B2
7715986 Nemeth et al. May 2010 B2
7725266 Sirgue et al. May 2010 B2
7791980 Robertsson et al. Sep 2010 B2
7835072 Izumi Nov 2010 B2
7840625 Candes et al. Nov 2010 B2
7940601 Ghosh May 2011 B2
8121823 Krebs et al. Feb 2012 B2
8190405 Appleyard May 2012 B2
8248886 Neelamani et al. Aug 2012 B2
8428925 Krebs et al. Apr 2013 B2
8437998 Routh et al. May 2013 B2
8547794 Gulati et al. Oct 2013 B2
8688381 Routh et al. Apr 2014 B2
8781748 Laddoch et al. Jul 2014 B2
9601109 Horesh Mar 2017 B2
20020049540 Beve et al. Apr 2002 A1
20020099504 Cross et al. Jul 2002 A1
20020120429 Ortoleva Aug 2002 A1
20020183980 Guillaume Dec 2002 A1
20040199330 Routh et al. Oct 2004 A1
20040225438 Okoniewski et al. Nov 2004 A1
20060235666 Assa et al. Oct 2006 A1
20070036030 Baumel et al. Feb 2007 A1
20070038691 Candes et al. Feb 2007 A1
20070274155 Ikelle Nov 2007 A1
20080175101 Saenger et al. Jul 2008 A1
20080306692 Singer et al. Dec 2008 A1
20090006054 Song Jan 2009 A1
20090067041 Krauklis et al. Mar 2009 A1
20090070042 Birchwood et al. Mar 2009 A1
20090083006 Mackie Mar 2009 A1
20090164186 Haase et al. Jun 2009 A1
20090164756 Dokken et al. Jun 2009 A1
20090187391 Wendt et al. Jul 2009 A1
20090248308 Luling Oct 2009 A1
20090254320 Lovatini et al. Oct 2009 A1
20090259406 Khadhraoui et al. Oct 2009 A1
20100008184 Hegna et al. Jan 2010 A1
20100018718 Krebs et al. Jan 2010 A1
20100039894 Abma et al. Feb 2010 A1
20100054082 McGarry et al. Mar 2010 A1
20100088035 Etgen et al. Apr 2010 A1
20100103772 Eick et al. Apr 2010 A1
20100118651 Liu et al. May 2010 A1
20100142316 Keers et al. Jun 2010 A1
20100161233 Saenger et al. Jun 2010 A1
20100161234 Saenger et al. Jun 2010 A1
20100185422 Hoversten Jul 2010 A1
20100208554 Chiu et al. Aug 2010 A1
20100212902 Baumstein et al. Aug 2010 A1
20100246324 Dragoset, Jr. et al. Sep 2010 A1
20100265797 Robertsson et al. Oct 2010 A1
20100270026 Lazaratos et al. Oct 2010 A1
20100286919 Lee et al. Nov 2010 A1
20100299070 Abma Nov 2010 A1
20110000678 Krebs et al. Jan 2011 A1
20110040926 Donderici et al. Feb 2011 A1
20110051553 Scott et al. Mar 2011 A1
20110075516 Xia et al. Mar 2011 A1
20110090760 Rickett et al. Apr 2011 A1
20110131020 Meng Jun 2011 A1
20110134722 Virgilio et al. Jun 2011 A1
20110182141 Zhamikov et al. Jul 2011 A1
20110182144 Gray Jul 2011 A1
20110191032 Moore Aug 2011 A1
20110194379 Lee et al. Aug 2011 A1
20110222370 Downton et al. Sep 2011 A1
20110227577 Zhang et al. Sep 2011 A1
20110235464 Brittan et al. Sep 2011 A1
20110238390 Krebs et al. Sep 2011 A1
20110246140 Abubakar et al. Oct 2011 A1
20110267921 Mortel et al. Nov 2011 A1
20110267923 Shin Nov 2011 A1
20110276320 Krebs et al. Nov 2011 A1
20110288831 Tan et al. Nov 2011 A1
20110299361 Shin Dec 2011 A1
20110320180 Ai-Saleh Dec 2011 A1
20120010862 Costen Jan 2012 A1
20120014215 Saenger et al. Jan 2012 A1
20120014216 Saenger et al. Jan 2012 A1
20120051176 Liu Mar 2012 A1
20120073824 Routh Mar 2012 A1
20120073825 Routh Mar 2012 A1
20120082344 Donoho Apr 2012 A1
20120143506 Routh et al. Jun 2012 A1
20120215506 Rickett et al. Aug 2012 A1
20120218859 Soubaras Aug 2012 A1
20120275264 Kostov et al. Nov 2012 A1
20120275267 Neelamani et al. Nov 2012 A1
20120290214 Huo et al. Nov 2012 A1
20120314538 Washbourne et al. Dec 2012 A1
20120316790 Washbourne et al. Dec 2012 A1
20120316791 Shah Dec 2012 A1
20120316844 Shah et al. Dec 2012 A1
20130060539 Baumstein Mar 2013 A1
20130081752 Kurimura et al. Apr 2013 A1
20130238246 Krebs et al. Sep 2013 A1
20130279290 Poole Oct 2013 A1
20130282292 Wang et al. Oct 2013 A1
20130311149 Tang Nov 2013 A1
20130311151 Plessix Nov 2013 A1
20140350861 Wang et al. Nov 2014 A1
20140358504 Baumstein et al. Dec 2014 A1
20140372043 Hu et al. Dec 2014 A1
20150073755 Tang et al. Mar 2015 A1
20160238729 Warner Aug 2016 A1
Foreign Referenced Citations (21)
Number Date Country
2 796 631 Nov 2011 CA
1 094 338 Apr 2001 EP
1 746 443 Jan 2007 EP
2 390 712 Jan 2004 GB
2 391 665 Feb 2004 GB
WO 2006037815 Apr 2006 WO
WO 2007046711 Apr 2007 WO
WO 2008042081 Apr 2008 WO
WO 2008123920 Oct 2008 WO
WO 2009067041 May 2009 WO
WO 2009117174 Sep 2009 WO
WO 2010085822 Jul 2010 WO
WO 2011040926 Apr 2011 WO
WO 2011091216 Jul 2011 WO
WO 2011093945 Aug 2011 WO
WO 2012024025 Feb 2012 WO
WO 2012041834 Apr 2012 WO
WO 2012083234 Jun 2012 WO
WO 2012134621 Oct 2012 WO
WO 2012170201 Dec 2012 WO
WO 2013081752 Jun 2013 WO
Non-Patent Literature Citations (13)
Entry
Nocedal, J. “Updating Quasi-Newton Matrices with Limited Storage” Mathematics of Computation, vol. 35, No. 151, pp. 773-782 (1980) (Year: 1980).
Nocedal, J. “Numerical Optimization” 2nd Ed. (2006) (Year: 2006).
U.S. Appl. No. 14/329,431, filed Jul. 11, 2014, Krohn et al.
U.S. Appl. No. 14/330,767, filed Jul. 14, 2014, Tang et al.
Akcelik, V., et al. (2006) “Parallel algorithms for PDE constrained optimization, in Parallel Processing for Scientific Computing”, SIAM, 2006; Ch 16, pp. 291-322.
Knoll, D.A. and Keyes, D.E., (2004) “Jacobian-free Newton-Krylov Methods: A Survey of Approaches and Applications”, Journal of Computational Physics 193, 357-397.
Krebs, J.R., et al. (2009) “Fast Full-wavefield Seismic Inversion Using Encoded Sources”, Geophysics, 74, No. 6, pp. WCC177-WCC188.
Metivier, L. et al. (2013) “Full Waveform Inversion and Truncated Newton Method”, SIAM J. Sci. Comput., 35(2), B401-B437.
Morales, J.L. and Nocedal, J. (2000) “Automatic Preconditioning by Limited Memory Quasi-Newton Updating”, SIAM J. Optim, 10(4), 1079-1096.
Nocedal, J. and Wright, J. (2006) “Numerical Optimization”, 2nd Edition, Springer; pp. I-Xii, 101-192.
Akcelik, V., et al. (2002) “Parallel Multiscale Gauss-Newton-Krylov Methods for Inverse Wave Propagation”, Supercomputing, ACM/IEEE 2002 Conference, Nov. 16-22, 2002.
Metivier, L. et al. (2014) “Full Waveform Inversion and Truncated Newton Method: quantitative imaging of complex subsurface structures”, Geophysical Prospecting, vol. 62, pp. 1353-1375.
Virieux, J. et al. (2009) “An overview of full-waveform inversion in exploration geophysics”, Geophysics, vol. 74, No. 6, pp. WCC127-WCC152.
Related Publications (1)
Number Date Country
20170003409 A1 Jan 2017 US
Provisional Applications (1)
Number Date Country
62188063 Jul 2015 US