The present invention relates to a method and device for processing images such as Synthetic Aperture Radar (SAR) images from Earth observation satellites, and more particularly, Interferometric SAR (InSAR) images. The present invention applies to extraction from such images of useful information related notably to ground deformations caused by natural phenomena, including tectonic processes and volcanic activity, or caused by human activities, including aquifer depletion, water injection (including wastewater injection and geothermal injection), and extraction activities (including oil, gas, ore). More particularly, the invention applies to detection of slow slip events along faults, detection of precursory slips along active faults.
The relative motion between tectonic plates is accommodated by slip on active faults that can, in principle, be measured by InSAR. The classic paradigm is that most faults are locked most of the time. Accumulating elastic stress may be released during large earthquakes, while slip on some faults or portions of faults may be dominantly aseismic and thereby pose little seismic hazard. Studies over the last two decades revealing occurrences of unexpected slip events have shown this picture is simplistic in important ways. Indeed, different kinds of slip phenomena have recently been identified, including low frequency earthquakes, slow slip events and associated tectonic tremor as well as continuous aseismic slip.
Earthquakes pose a significant threat to densely populated regions and, so far, it is impossible to provide any short- or long-term forecast for the occurrence of individual earthquakes. It has been shown theoretically that earthquakes should be preceded by a nucleation phase, during which a tectonic fault starts to slip slowly and accelerates before rupturing rapidly. Detecting such nucleation phase would allow for short-term forecasting of large earthquakes. However, some faults slip slowly, episodically, without generating large earthquakes and, without a dense enough set of observations, it is difficult to efficiently differentiate a harmless slow slip event from a nucleation phase precursory to a devastating event. Moreover, aseismic and seismic slip are closely related: fault creep is thought to precede and often follows earthquakes, underscoring the need for a major advance in the mechanical and physical description of fault slip in general.
Synthetic aperture radar interferometry (InSAR) developed in the early 1990's is now routinely used to measure ground deformations due to hydrologic, volcanic, and tectonic processes. Orbiting satellites now provide daily images of the surface of the Earth and, using the concept of interferometry with a radar system, maps of the evolution of the apparent distance between the ground and the satellite system can be derived. These maps can be assembled to provide continuous monitoring of such distance. This apparent distance is directly related to ground motion over time, hence it is possible today to detect any slow slip event, precursory to a large earthquake or not, anywhere on Earth. The Sentinel 1 InSAR satellite constellation has the potential to enable a systematic mapping of all deforming regions worldwide, revealing mechanisms behind all slip modes of tectonic faults, provided that a robust and automatic method for separating atmospheric signals from fault deformation can be developed. Previous to Sentinel 1 constellation, the typical extent of a SAR image was 100×100 km for the classic C- and L-band satellites (ERS, Envisat, ALOS . . . ), with a pixel size on the order of tens of meters (depending on the wavelength of the Radar and the antenna setup). With the Sentinel constellation, the size of the images is on average 3 times larger and acquisitions are, at minimum, 5 times more frequent. The launch of the Sentinel 1 satellite constellation is transformative in that it provides systematic radar mapping of all actively deforming regions in the world with a 6-day return period. Upcoming satellite launches (NISAR, etc.) will provide even higher resolution and higher frequency InSAR data, furthering the need for automatic detection tools.
InSAR technique has been successfully applied to monitor large displacements due to earthquakes, ice sheet motion, smaller displacements related to aquifer depletion, interseismic deformation, slow moving landslides, and slow slip events. Rapid, large-amplitude deformation signals such as coseismic displacement fields or volcano-tectonic episodes are also routinely analyzed by InSAR. Similarly, the slow but steady accumulation of deformation over relatively long periods of time is also easily detected using InSAR, as the resulting deformation is also sufficiently large. However, these measurements generally have been successful only through painstaking manual exploration of the deformation data sets by experts, thus preventing large scale studies.
The combination of two SAR images into an interferogram (InSAR) allows ground motion to be measured over a wide area. The phase of the interferogram varies as a function of any physical phenomenon that affects the two-way travel time of the radar wave between the radar carrier and the ground. These effects include orbit changes, as well as changes in the atmosphere and ground deformation between acquisitions. Because orbits are well-known, the most important source of noise obscuring measurements of deformation is the spatial and temporal variation of temperature, pressure and water vapor in the atmosphere. Tropospheric and ionospheric delays generate large errors in InSAR processing, often amounting to errors of 10-20 cm which can be considerably larger than the signals of interest. Delays in radar signals are primarily due to water vapor present in the troposphere. These delays have two main identified causes: 1) turbulences in the troposphere, and 2) a component highly connected to local topography. Two primary approaches have been developed in the literature to alleviate this issue and reduce noise. The first approach consists of smoothing or filtering in time, with the assumption that atmospheric noise is stochastic in nature and can be averaged out. These approaches include stacking interferogram time series analysis based upon filtering, and correlation methods. However, atmospheric noise is in fact non-stochastic both in time and space, and smoothing leads to a decrease in the resolution of the signals from the ground. In other words, these methods have been shown to partially mask out the signals of interest.
The second approach relies on additional datasets to estimate delays and mitigate noise in ground deformation estimates, including meteorological data, water vapor maps from spectroradiometer and spectrometer data, geographic position data (from a Global Navigation Satellite system), and estimates from weather models and simulations. These different datasets all have their merits and drawbacks. Atmospheric models may suffer from systematic or localized errors. Water vapor maps have a high spatial resolution but a relatively poor time resolution, and are limited when clouds are present. In contrast, geographic position data provide the highest time resolution estimates of atmospheric delays, but requires stations on Earth's surface, and the data are limited spatially to available stations (interpolation between stations is not obvious and it is difficult to assess its quality). At present there exists no reliable, universally used approach to mitigate the impact of atmosphere-induced errors on InSAR ground motion estimations.
Thus, the detection of low-amplitude, long-wavelength deformation fields such as those due to interseismic strain accumulations or postseismic motions remains challenging because of interferometric decorrelation, inaccurate orbits of the satellites, and atmospheric propagation delays: if the atmosphere slows down the satellite signal, the distance between the ground and the satellite appears to increase, as if the ground had moved. In order to attain this goal, significant advances in InSAR measurement of slow deformation must be made.
Therefore, the systematic detection of these slip events along tectonic faults is not yet possible, although the data are available in the InSAR images. In fact, existing solutions rely on human-based detection through visual inspection of InSAR time series of images of ground motion, where the signal is a mix of atmospheric noises and tectonic signals.
Accordingly, there is a need for automatically processing massive time series of image sets such as InSAR images time series to remove or attenuate atmospheric perturbations, with a view to highlighting the smallest possible deformation signals of all kinds. There is also a need for a systematic and global detection and characterization of low-amplitude, and/or long-wavelength ground deformations at different scales, with a view to confidently differentiating between harmless slow slip events and small events related to precursory phases to earthquakes. it may also be desirable to identify fault deformation without requiring expert interpretation.
A method is described for processing time series of images of a same area, subjected to noise. The method may include: executing a plurality of successive filtering steps comprising a first filtering step receiving an input image time series, a last filtering step providing an output image, and intermediary filtering steps, each of the first filtering step and the intermediary filtering steps transforming a set of time series of filtered images initially including the input image time series, the set being transformed by generating a number of time series of filtered images by combining by linear combinations each pixel of each image of each time series of the set with selected neighboring pixels in the image and in an adjacent image of the image time series of the set, each linear combinations using a respective set of weighting coefficients and resulting in a pixel of one of the generated filtered images; and executing one or more combination operations, each combination operation being performed between two successive of the intermediary filtering steps, to reduce a number of images in each of the time series of filtered images of the set, by combining images of subsets of adjacent images in each of the time series of filtered images of the set, a last one of the combination operations reducing each time series of filtered images of the set to a single filtered image.
According to an embodiment, the method further comprises, after the last combination operation, introducing a model image of the area in the set as an additional filtered image, the introduction of the model image being followed by one or more of the filtering steps.
According to an embodiment, each of the linear combinations includes a bias coefficient which is added to a result of the linear combination.
According to an embodiment: each of the weighting coefficients has a value depending on a sign, positive or negative of a pixel value to which it is multiplied, or the result of each of the linear combinations is transformed by a rectified linear unit function.
According to an embodiment, the method further comprises: generating first image time series from a ground deformation model using randomly selected parameters; generating training image time series, from the first image time series, using models of different noise signals; and using the generated training time series to adjust the values of the weighting coefficients.
According to an embodiment, the values of the weighting coefficients are iteratively adjusted by applying an iterative gradient-based descent minimization method using the training image time series and a model cost function, so as to minimize the result of the model cost function.
According to an embodiment, the generation of the set of times series of filtered images from the input image time series are performed using the following equation:
wherein PX[i, j, t, f] is one pixel of one of the filtered images of the time series f of filtered images, PX[i, j, t] is one pixel of one image of the input image time series, W[l, m, u, f] is one of the first coefficients for the time series f of filtered images, B[f] is a bias coefficient for the time series f, and LR( ) is a rectified linear unit function.
According to an embodiment, each linear combination of first filtering operations of the filtering operations applies the following equation:
wherein PX[i, j, t, f′] is one pixel of one of the filtered images of the time series of filtered images f′, W[s, l, m, u, f, f′] is one of the second weighting coefficients for the time series f′ of filtered images for the filtering operation s, B[s, f′] is a bias coefficient for the time series f′ and the filtering operation s, and LR( ) is a leaky rectified linear unit function.
According to an embodiment, each linear combination of second filtering operations of the filtering operations applies the following equation:
wherein PX[i, j, f′] is one pixel of one of the filtered images f′, W[s, l, m, f, f′] is one of the second weighting coefficients for the filtered images f′ for the filtering operation s, B[s, f′] is a bias coefficient for the filtered images f′ and the filtering operation s, and LR( ) is a leaky rectified linear unit function.
According to an embodiment, each pixel of the output image is computed by the following equation:
wherein PX[i, j] is one pixel of the output image, PX[i, j, f] is one pixel of the filtered image f, W[l, m, f] is one of the third coefficients, B is a bias coefficient, and LR( ) is a leaky rectified linear unit function.
According to an embodiment, the leaky rectified linear unit function LR is such that LR(x)=x if x≥0 and LR(x)=0.5x if x<0.
According to an embodiment, each of the image combination operations applies the following equation:
wherein PX[i, j, t, f] is one pixel of one of the filtered images of the time series of filtered images f, and MAX(PX[i, j, t+u, f]) is a function providing the maximum value among the pixel values PX[i, j, t+u, f] with u=−1, 0 and 1.
Embodiments may also relate to a computer comprising a processor, a memory, the processor being configured to carry out the above-disclosed method.
According to an embodiment, the computer further comprises a graphic processor controlled by the processor, the processor being configured to configure the graphic processor to carry out some of the operations of the above-disclosed method.
Embodiments may also relate to a computer program product loadable into a computer memory and comprising code portions which, when carried out by a computer, configure the computer to carry out the above-disclosed method.
The method and/or device may be better understood with reference to the following drawings and description. Non-limiting and non-exhaustive descriptions are described with the following drawings. In the figures, like referenced signs may refer to like parts throughout the different figures unless otherwise specified.
A method for processing images such as InSAR images is disclosed. This method comprises:
An embodiment of software on a computer is described below as a set of computer readable program components that cooperate to control the performance of operations of data processing when loaded and executed on the computer. It will be apparent to a person skilled in the art that the individual steps of methods of the present invention can be implemented in computer program code and that a variety of programming languages and coding implementations may be used to implement the methods described herein. Moreover, computer programs included in the software are not intended to be limited to the specific control flows described herein, and one or more of the steps of the computer programs may be performed in parallel or sequentially. One or more of the operations described in the context of a computer-program-controlled implementation could alternatively be implemented as a hardware electronics component.
According to an embodiment, each pixel PX[i, j, t, f] (i=1, . . . X, j=1, . . . Y, t=1, . . . T, f=1, . . . F) of each filtered image FI[0, t, f] is computed by the corresponding filtering component FT[0, f], according to the following equation:
where LR is the leaky ReLU function, B[0] is a one-dimensional table of bias coefficients B[0, f], each representing a constant value used by the filtering component FT[0, f], W[0] is a four-dimensional table of weighting coefficients W[0, l, m, u, f] having the following dimensions (1 . . . L, 1 . . . M, 1 . . . U, 1 . . . F), the weighting coefficients W[0, l, m, u, f] having real values, and a, b, c are coefficients which can be adjusted with the parameters L, M and U to select in the table PX the neighboring pixels which are involved in the linear combinations. The pixels outside the images RI[t], i.e. when i+a.l>X, and/or j+b.m>Y, and/or t+c.u>T are set to 0.
According to an example, LR(x)=x if x≥0 and LR(x)=0.5x if x<0. In the example of
According to an embodiment, each pixel PX[i, j, t, f′] of each filtered image FI[s, t, f′] (f′ equal to 1 to F) is computed by the corresponding filtering component FT[s, f], according to the following equation:
where B[s, f′] is a two-dimensional table of bias coefficients B[s, f′], each having a constant value attributed to the filtering component FT[s, f′], W[s] (s=1, 2, 3 or 5) is a five-dimensional table of weighting coefficients W[s, l, m, u, f, f′] having the following dimensions (1 . . . L, 1 . . . M, 1 . . . U, 1 . . . F, 1 . . . F), the weighting coefficients W[s, l, m, u, f, f] having real values, and a, b, c are coefficients which can be adjusted with the parameters L, M and U to select in the table PX the neighboring pixels which are involved in the linear combinations.
According to an example, LR(x)=x if x≥0 and LR(x)=0.5x if x<0. In the example of
According to an embodiment, each pixel PX[i, j, t, f] of each filtered image FI[s, t, f] (f equal to 1 to F) is computed from three filtered images FI[s−1, t], FI[s−1, t+1] and FI[s−1, t+2] by the corresponding combining component CB[s, f], according to the following equation:
where MAX is a function providing the greatest value in each group of three pixels PX[i, j, 3(t−1)+1], PX[i, j, 3(t−1)+2] and PX[i, j, 3(t−1)+3], t being equal to 1, 2, . . . T/3. As a result, each of the time series of filtered images FI(s, f) provided by the processing stages SL4 and SL6 comprises T/3 images, T being a multiple of 3.
After the processing stage SL6, only one image FI(s, f) remains in each of the time series of filtered images FI(s, f, t).
According to an embodiment, each pixel PX[i, j, f′] of each filtered image FI[s, f′] (f′ equal to 1 to F) is computed by the corresponding filtering component FT[s, f], according to the following equation:
where B[s] is a one-dimensional table of bias coefficients B[s, f′], each having a constant value attributed to the filtering component FT[s, f′], W[s] are four-dimensional tables of weighting coefficients W[s, l, m, f, f′], with s=7, 8, 9, 10 or 11, having the following dimensions (1 . . . L, 1 . . . M, 1 . . . F, 1 . . . F), the weighting coefficients W[s, l, m, f, f′] having real values, and a and b are coefficients which can be adjusted with the parameters L and M to select in the table PX the neighboring pixels which are involved in the linear combinations. In the example of
The image EM of the elevation model of the ground area imaged by the image time series RI[t] is introduced in the processing stage FP7 as a filtered image having the index F+1. Therefore, in the computations performed at processing stage FP7, the summation on index fin equation (4) is performed between 1 and F+1, and the four-dimensional table W[7] has the following dimensions (1 . . . L, 1 . . . M, 1 . . . F+1, 1 . . . F). In the processing stages FP8-FP11, the four-dimensional tables W[s] have the following dimensions (1 . . . L, 1 . . . M, 1 . . . F, 1 . . . F).
According to an embodiment, each pixel PX[i, j] of the output image PI is computed by the filtering component FT[11], according to the following equation:
where W[12] is a three-dimensional table of weighting coefficients W[12, l, m, f] having the following dimensions (1 . . . L, 1 . . . M, 1 . . . F], the weighting coefficients W[12, l, m, f] having real values and a and b are coefficients which can be adjusted with the parameters L and M to select in the table PX the neighboring pixels which are involved in the linear combinations. In the example of
The choice of the number F of filtering components FT[s, f] in the stages FPs (s=1-3, 5, 7-11) results from a trade-off between efficiency of the neural network to remove the noise from the input images and computation time. Generally, the number F is set between 10 and 100. In the above examples, F is set to 64. Therefore:
W[0] includes 2×3×3×64(=1162) weighting coefficients, and B[0, f] includes 64 bias coefficients,
W[s] and B[s, f], with s=1, 2, 3 and 5, include respectively 2×3×3×64×64(=73728) weighting coefficients, and 64 bias coefficients,
W[7] includes 3×3×64×65(=37440) weighting coefficients, and B[7, f] includes 64 bias coefficients,
W[s] and B[s, f], with s=8, 9, 10 and 11, include respectively 3×3×64×64(=36864) weighting coefficients, and 64 bias coefficients, and
W[12] includes 3×3×64(=576) weighting coefficients, and B[12] represents one bias coefficient.
According to an embodiment, all the coefficients W and B are determined by means of a learning phase using training data. According to an embodiment, the training data comprise of millions or billions of time series of images that are generated from simulations of ground motion using simple elastic models of faulting. The times series of images are for example generated by randomly generating a slowly sleeping fault with random latitude, longitude, depth, strike angle, dip angle and width. Then the ground deformation time series are corrupted with a variety of noise signals. For example, a spatially correlated Gaussian noise simulates delays introduced in the InSAR images by the atmospheric turbulences of various length scales. Further, a variety of additional spurious signals can be used to further corrupt the ground deformation time series. For example, transient faulting signals can be added so as to simulate the worst case scenario of very sharp weather related signals that look like faulting signals, but without correlation over time. Wrong pixels or pixel patches simulating pixel decorrelation and unwrapping errors can also be introduced in the time series of images.
Each training image time series is associated with the final image to be provided by the neural network CNN. Training the neural network comprises finding a solution to a system of equations where the unknowns are the coefficients W and B, the pixels PX[i, j] of the image time series RI and of the resulting image PI being known from a training case. According to an embodiment a solution is computed by using an iterative gradient-based descent minimization method using a model cost function, initial values for the coefficients W and B being randomly selected between −1 and 1, such as to have a uniform distribution.
According to an embodiment, the model cost function MCF used is the error norm of the deformation reconstruction, e.g. the sum of the absolute values of the reconstruction errors for each pixel:
where CPX[i, j] represents a computed pixel in the image PI provided by the neural network CNN from a training image time series, and MPX[i, j] represents the corresponding pixel in the final image to be provided, associated with the training image time series PI. The coefficients W and B are iteratively adjusted so as to minimize the result of the model cost function MCF on groups of training data. When using such a cost function which is not convex, it is almost certain to never meet a global minimum, but only local minimums depending on initial values. Thus it is ensured to reach a unique set of coefficients W, B depending on the training data.
According to an embodiment, coefficients W and B are updated at each iteration, based on the gradient of the cost function calculated from a batch of data, following the ADAM rule, according to the following equation:
wherein:
w is one of the coefficients W and B, wstp and ε are constant values, respectively set to 0.001 and 10−6,
m=β1 m+g(1−β1), v=β2 v+g2(1−β2), m and v being initially set to 0, β1 and β2 being constant values respectively set to 0.9 and 0.999, and t being the iteration number,
g=gradient(x, y), x being the set of temporal series used in one iteration, y is the corresponding true output deformation, and g can be set to the mean value of the gradient of the cost function as a function of the model parameters,
It is apparent to those of ordinary skill in the art that other gradient descent algorithms can equally be employed to adjust the coefficients W and B of the model.
An advantage of purely convolutional neural networks is that the size of the images to be processed do not depend on the size of the training images: all the above equations (1) to (5) compute a single pixel and are used to compute each pixel of the images, and thus do not depend on the number of pixels of these images.
According to an embodiment, the neural network is trained using synthetic training time series comprising nine images of 40×40 pixels generated from modeled ground deformations and modeled noise.
To test the efficiency of the trained convolutional network CNN, it is used to process InSAR time series of images from two regions that have already been ‘manually’ analyzed by experts: the North Anatolian fault in Turkey (COSMO-SkyMed data), and the Chaman fault at the border between Afghanistan and Pakistan (Sentinel 1 data). The InSAR images from Chaman comprise time series of 9 images of 7024×2488 pixels covering an area of roughly 180 000 km2. Both in Turkey and in Chaman, signals stronger than signals identified by experts as faulting remain in the data even after conventional atmospheric corrections. In both cases, a priori knowledge of fault location and manual inspection of the data has been necessary to produce fault deformation time series. The neural network CNN trained using synthetic training data and with no further fine tuning on real data, automatically isolates and recovers clean deformation signals in Turkey and Chaman where expert analysis also found signal. On the North Anatolian fault time series, the neural network CNN found a 1.5 cm line of sight slip, with virtually no noise signal remaining elsewhere than on the fault, without human intervention, and without having the knowledge of the location of the fault.
It should be observed that all the calculations performed when executing the neural network CNN to extract ground deformation from the time series of images from Chaman can be done on a graphics processor of a conventional personal computer, with a minimal computation time, for example, around 4 minutes on a single Nvidia® RTX 6000 graphic processor to extract ground deformation from a time series of 9 acquisitions of 7024×2488 pixels.
The above description of various embodiments of the present invention is provided for purposes of description to one of ordinary skill in the related art. It is not intended to be exhaustive or to limit the invention to a single disclosed embodiment. Numerous alternatives and variations to the present invention will be apparent to those skilled in the art of the above teaching. Accordingly, while some alternative embodiments have been discussed specifically, other embodiments will be apparent or relatively easily developed by those of ordinary skill in the art.
In this respect, it is apparent to a person skilled in the art to perform the operations performed by the processing stages disclosed in
In addition, the number and respective positions of the neighboring pixels and the number of adjacent images in the time series taken into account to compute the linear combinations can also be varied depending on the type of images to process and the type of noise signals affecting the images to process. Adjustment of the parameters L, M and U and the coefficients a, b, c that selects the adjacent pixels involved in the linear combinations FT[s, f] can be performed as a function of the amplitude of the perturbations affecting the images to be processed. Generally, these parameters and coefficients are selected between 1 and 4.
The addition of the bias coefficients B[s, f] in the filtering operations FPs is also optional although it introduces other degrees of freedom in the design of the neural network CNN. Removal of this bias coefficient could be easily compensated in terms of degree of freedom by adding one or more filtering operation FP which introduces a huge number of weighting coefficients W, and therefore a large number of degrees of freedom.
Application of the rectified linear unit function RL is also optional, and depends on the type of images to be processed. In the above examples, and more particularly when the images to be processed are InSAR images corrupted by atmospheric turbulence, it is desirable to give more importance to the positive values than to the negative values of the pixels. In other applications, other rectified linear unit functions could be more adapted. In addition, instead of using such a function, different weighting coefficients W could be defined as a function of the pixel value by which the weighting coefficient is multiplied.
In addition, the training phase for computing the coefficients (weighting coefficients W and bias coefficients B) of the neural network CNN does not need to be performed before each true image processing since these coefficients only depend on the type of images and signals to process. Accordingly, a computer implementing a neural network designed to process one type of images does not need to implement the training method and the method for generating the training data.
Further, the above-described method does not exclusively apply to InSAR images of ground acquired by Earth observation satellites. This method can easily adapted to process noisy image time series in which noise models are known, to extract motion signals between the images of the time series.
The above description is intended to embrace all alternatives, modifications and variations of the present invention that have been discussed herein, and other embodiments that fall within the spirit and scope of the above description. Limitations in the claims should be interpreted broadly based on the language used in the claims, and such limitations should not be limited to specific examples described herein.
Number | Date | Country | Kind |
---|---|---|---|
20157709.5 | Feb 2020 | EP | regional |
This application is a 371 National Stage of International Application No. PCT/EP2021/053628, filed Feb. 15, 2021, which claims priority to European Patent Application No. 20157709.5, filed Feb. 17, 2020, the disclosures of which are herein incorporated by reference in their entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2021/053628 | 2/15/2021 | WO |