The invention belongs to the technical field of optical measurement, and in particular relates to a single-frame fringe pattern analysis method based on a multi-scale generative adversarial network.
With the development of computer technology, information technology and optoelectronic technology, optical 3D measurement technology has developed rapidly. Optical 3D measurement technology is based on modern optics and further integrates technologies such as optoelectronics, signal processing, image processing, computer graphics and pattern recognition. It uses optical images as a means to detect and transmit information. Its purpose is to extract useful signals from captured images and complete the reconstruction of 3D models. According to the illumination mode, optical 3D measurement technology is generally divided into two categories: passive 3D measurement and active 3D measurement. Passive 3D measurement technology estimates depth from two-dimensional images acquired by one or more cameras to generate the 3D profile of the measured object. Generally, such methods have low measuring accuracy and are not suitable for industrial applications. Active 3D measurement technology uses structured light illumination to project patterns that are encoded according to certain rules onto the measured object. The projected patterns are deformed owing to the modulation of the object surface, and the deformed patterns are captured by a camera. The 3D shape of the object can then be calculated by using the positional relationship between the camera and the projector and the degree of the pattern deformation. Structured light illumination 3D measurement technology has attracted more and more attention because of its non-contact nature, high sensitivity, high accuracy, and high degree of automation.
Fringe pattern analysis is an indispensable step in the implementation of structured light illumination 3D measurement. Its main purpose is to analyze the fringe image and obtain the phase information hidden in it, which is related to the 3D contour of the target. According to the number of projected images, fringe pattern analysis methods are often divided into multi-frame methods and single-frame methods. The N-step phase-shifting method is a widely used method that applies the strategy of multi-frame fringe image analysis. This method adds several constant phase shifts into the phase component of the projected pattern, so as to obtain a series of phase-shifted images and solve the phase from these images (Chao Zuo, Lei Huang, Minliang Zhang, Qian Chen, and Anand Asundi. "Temporal phase unwrapping algorithms for fringe projection profilometry: A comparative review." Optics and Lasers in Engineering 85 (2016): 84-103). The advantages of this method are high accuracy and high fidelity to the phase details of the object. However, a series of fringe images has to be collected for analysis, so the efficiency is low, and it is difficult to meet the requirements of 3D measurement of moving objects.
Compared with the multi-frame methods, the single-frame method shows a clear advantage in terms of efficiency. With this kind of method, the phase is encoded in a single fringe image, so only one image is needed to decode the phase information. Fourier transform fringe image analysis is the most representative single-frame method (Mitsuo Takeda, Hideki Ina, and Seiji Kobayashi, "Fourier-transform method of fringe-pattern analysis for computer-based topography and interferometry," J. Opt. Soc. Am. 72, 156-160 (1982)). This method is based on spatial filtering, and its principle is to use the phase information of light to encode the spatial height of the object. The phase is extracted by selecting an appropriate filter window in the frequency domain and filtering out the suitable frequency component, and the 3D reconstruction is realized according to the mapping relationship between the phase and the height. Since the phase can be obtained from only one deformed fringe pattern, this method has the advantages of high efficiency and fast measurement speed, which makes it suitable for 3D measurement of dynamic, fast-moving objects. However, its accuracy is low and the fidelity of contour details is poor for complex surfaces. Based on the traditional Fourier transform fringe image analysis, the windowed Fourier transform fringe image analysis can retain more phase details by introducing a windowed Fourier transform (Kemao Qian. "Two-dimensional windowed Fourier transform for fringe pattern analysis: principles, applications and implementations." Optics and Lasers in Engineering 45, no. 2 (2007): 304-317). However, its implementation is complex, the adjustment of the involved parameters is not easy, and the time cost of the phase calculation is high.
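For illustration only, a minimal sketch of the classical Fourier-transform fringe analysis described above is given below (Python/NumPy). The assumption of roughly vertical fringes, the rectangular band-pass window, and the automatic carrier detection are simplifying assumptions of this sketch, not details of the cited methods.

```python
# Minimal sketch of Fourier-transform fringe analysis (Takeda et al., 1982).
# Assumptions: fringes are roughly vertical, so the carrier lies on the
# horizontal frequency axis; the band-pass window is a simple rectangle.
import numpy as np

def fourier_phase(fringe, carrier_col=None, half_width=20):
    """Extract the wrapped phase from a single fringe image."""
    H, W = fringe.shape
    F = np.fft.fftshift(np.fft.fft2(fringe))          # centered 2D spectrum
    if carrier_col is None:
        # Locate the +1 order: strongest peak to the right of the DC column.
        mag = np.abs(F).copy()
        mag[:, : W // 2 + 5] = 0                      # suppress DC and -1 order
        carrier_col = np.unravel_index(np.argmax(mag), mag.shape)[1]
    # Band-pass filter around the +1 order (rectangular window, an assumption).
    mask = np.zeros_like(F)
    lo, hi = max(0, carrier_col - half_width), min(W, carrier_col + half_width)
    mask[:, lo:hi] = 1.0
    # Shift the selected order to the spectrum center to remove the carrier.
    F_filtered = np.roll(F * mask, W // 2 - carrier_col, axis=1)
    analytic = np.fft.ifft2(np.fft.ifftshift(F_filtered))
    return np.angle(analytic)                         # wrapped phase in [-pi, pi]
```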
The object of the invention is to provide a single-frame fringe pattern analysis method based on multi-scale generative adversarial network.
A technical solution for achieving the object of the invention is as follows: a single-frame fringe pattern analysis method based on a multi-scale generative adversarial network, comprising the following steps:
Step 1: A multi-scale generative adversarial neural network model is constructed and it involves a multi-scale image generator and an image discriminator.
Step 2: A comprehensive loss function L is constructed for the multi-scale generative adversarial neural network.
Step 3: The training data are collected to train the multi-scale generative adversarial network.
Step 4: During the prediction, a fringe pattern is fed into the trained multi-scale network where the generator outputs the sine term, cosine term, and the modulation image of the input pattern. Then, the arctangent function is applied to compute the phase.
Preferably, the mentioned multi-scale image generator includes four data processing paths of the same structure termed as (1) to (4). Each data processing path consists of a convolutional layer, 4 residual blocks, and a convolutional layer that is linearly activated.
Preferably, the input of the mentioned data processing path (4) is I4(x, y), whose image size is (H/8)×(W/8), where H×W denotes the size of the original fringe image.
The input of the mentioned data processing path (3) is I3(x, y) together with the output of the data processing path (4), i.e., the sine term, the cosine term and the modulation image, upsampled to (H/4)×(W/4).
The input of the mentioned data processing path (2) is I2(x, y) together with the output of the data processing path (3), i.e., the sine term, the cosine term and the modulation image, upsampled to (H/2)×(W/2).
The input of the mentioned data processing path (1) is I1(x, y) together with the output of the data processing path (2), i.e., the sine term, the cosine term and the modulation image, upsampled to H×W.
Preferably, the mentioned image discriminator is built from 6 sequentially connected residual blocks and a fully connected layer, and the last fully connected layer is activated by the sigmoid function.
Preferably, the mentioned comprehensive loss function L can be expressed as
L = αLimage + βLGAN
where α and β are the weights, Limage the loss function regarding the image content, and LGAN the adversarial loss function.
Compared with traditional methods, the present invention has the following advantages: (1) Compared with the multi-frame fringe image analysis method, the present invention only needs one fringe image as input, and can obtain phase information quickly and efficiently; (2) Compared with representative single-frame methods such as the Fourier transform fringe image analysis, the accuracy of the present invention is higher; (3) Once the network is trained, its parameters are fixed and do not need to be set manually during the calculation, making the implementation easy.
A single-frame fringe pattern analysis method based on multi-scale generative adversarial network can analyze a single fringe image and obtain high-precision phase information. The principle is as follows:
According to the fringe image analysis, the fringe image I(x, y) can be expressed as
I(x, y) = A(x, y) + B(x, y)cos[ϕ(x, y)]
where (x, y) is the pixel coordinate, A(x, y) the background image, B(x, y) the modulation image, and ϕ(x, y) the phase to be calculated. The phase can be calculated by the arctangent formula ϕ(x, y) = arctan[M(x, y)/D(x, y)], where M(x, y) = B(x, y)sin[ϕ(x, y)] is the sine term and D(x, y) = B(x, y)cos[ϕ(x, y)] is the cosine term.
Generally, ϕ(x, y) is called the wrapped phase as it is discontinuous; its values lie within [−π, π].
According to the above formula for phase calculation, the fringe image is fed into the multi-scale generative adversarial network. Firstly, the network is trained to estimate the sine term M(x, y) = B(x, y)sin[ϕ(x, y)], the cosine term D(x, y) = B(x, y)cos[ϕ(x, y)], and the modulation image B(x, y). Then, the sine and cosine terms are substituted into the arctangent formula to calculate the phase ϕ(x, y). Although the modulation image B(x, y) does not directly participate in the phase calculation, it is estimated as an output of the neural network, which helps to constrain the training process and is thus beneficial to improving the accuracy of the estimated sine and cosine terms.
As shown in the figure, the specific steps are as follows:
Step 1: A multi-scale generative adversarial neural network model is constructed and it involves a multi-scale image generator and an image discriminator.
Further, the multi-scale image generator is used to generate the sine term, cosine term and modulation image with the original size of H×W.
Further, the input of the multi-scale image generator is the original fringe image I1(x, y) of size H×W together with its downsampled versions. As shown in the figure, these are the image I2(x, y) of size (H/2)×(W/2), the image I3(x, y) of size (H/4)×(W/4), and the image I4(x, y) of size (H/8)×(W/8).
In some embodiments, the image pyramid method can be used to generate the above images.
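As one possible implementation of this step (an assumption, not prescribed by the invention), the factor-of-two image pyramid can be generated with OpenCV's pyrDown, which applies Gaussian smoothing before 2× downsampling:

```python
# Sketch of building the multi-scale inputs I1..I4 with an image pyramid.
# cv2.pyrDown performs Gaussian smoothing followed by 2x downsampling.
import cv2

def build_pyramid(fringe, levels=4):
    """Return [I1, I2, I3, I4]: the fringe image at H x W, H/2 x W/2, ..."""
    pyramid = [fringe]
    for _ in range(levels - 1):
        pyramid.append(cv2.pyrDown(pyramid[-1]))
    return pyramid
```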
As shown in the figure, the data flow through the four data processing paths is as follows:
The input of the data processing path (4) is I4(x, y), whose image size is (H/8)×(W/8). It is processed by a convolutional layer, four residual blocks and a convolutional layer that is linearly activated, giving the sine term, the cosine term and the modulation image of size (H/8)×(W/8).
The input of the data processing path (3) is I3(x, y) together with the output of the data processing path (4), i.e., the sine term, the cosine term and the modulation image, upsampled to (H/4)×(W/4).
The input of the data processing path (2) is I2(x, y) together with the output of the data processing path (3), i.e., the sine term, the cosine term and the modulation image, upsampled to (H/2)×(W/2).
The input of the data processing path (1) is I1(x, y) together with the output of the data processing path (2), i.e., the sine term, the cosine term and the modulation image, upsampled to H×W. The output of the data processing path (1) is the estimated sine term, cosine term and modulation image at the original size of H×W.
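A minimal PyTorch sketch of one possible realization of the four data processing paths and their connections is given below; the channel widths, kernel sizes, activation functions inside the paths, and the bilinear upsampling are illustrative assumptions that are not fixed by the description above.

```python
# Sketch of the multi-scale image generator (PyTorch). Channel counts, kernel
# sizes and the bilinear upsampling are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        return x + self.conv2(F.relu(self.conv1(x)))   # skip connection

class ProcessingPath(nn.Module):
    """Conv layer -> 4 residual blocks -> linearly activated conv layer."""
    def __init__(self, in_channels, features=64, out_channels=3):
        super().__init__()
        self.head = nn.Conv2d(in_channels, features, 3, padding=1)
        self.body = nn.Sequential(*[ResidualBlock(features) for _ in range(4)])
        self.tail = nn.Conv2d(features, out_channels, 3, padding=1)  # linear activation

    def forward(self, x):
        return self.tail(self.body(F.relu(self.head(x))))

class MultiScaleGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.path4 = ProcessingPath(in_channels=1)   # input: I4 only
        self.path3 = ProcessingPath(in_channels=4)   # input: I3 + upsampled output of path (4)
        self.path2 = ProcessingPath(in_channels=4)
        self.path1 = ProcessingPath(in_channels=4)

    def forward(self, I1, I2, I3, I4):
        up = lambda t: F.interpolate(t, scale_factor=2, mode='bilinear', align_corners=False)
        out4 = self.path4(I4)                                  # H/8 x W/8
        out3 = self.path3(torch.cat([I3, up(out4)], dim=1))    # H/4 x W/4
        out2 = self.path2(torch.cat([I2, up(out3)], dim=1))    # H/2 x W/2
        out1 = self.path1(torch.cat([I1, up(out2)], dim=1))    # H x W: sine, cosine, modulation
        return out1, out2, out3, out4
```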
As shown in the figure, the image discriminator is built from six sequentially connected residual blocks and a fully connected layer activated by the sigmoid function.
The input of the image discriminator includes two kinds of data whose image resolution is H×W. One type of data is the group of ground-truth sine term, cosine term and modulation image. They are calculated by a high-accuracy standard method, such as the 7-step phase-shifting algorithm. The label of these ground-truth data is set to 1. The other type of data is the output of the multi-scale image generator, i.e., the estimated sine term, cosine term and modulation image. The label of these estimated data is set to 0.
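A corresponding PyTorch sketch of the image discriminator is given below; the internal structure of the residual blocks, the downsampling between blocks, and the global average pooling before the fully connected layer are illustrative assumptions.

```python
# Sketch of the image discriminator: 6 residual blocks followed by a fully
# connected layer with sigmoid activation. Downsampling between blocks and
# the channel widths are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiscriminatorBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
        self.skip = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        y = self.conv2(F.relu(self.conv1(x)))
        y = y + self.skip(x)                      # residual connection
        return F.avg_pool2d(F.relu(y), 2)         # downsample by 2

class ImageDiscriminator(nn.Module):
    def __init__(self, in_channels=3, width=32):
        super().__init__()
        chs = [in_channels, width, width * 2, width * 4, width * 4, width * 4, width * 4]
        self.blocks = nn.Sequential(*[DiscriminatorBlock(chs[i], chs[i + 1]) for i in range(6)])
        self.fc = nn.Linear(chs[-1], 1)

    def forward(self, x):
        h = self.blocks(x)                        # spatially downsampled feature map
        h = h.mean(dim=[2, 3])                    # global average pooling before the FC layer
        return torch.sigmoid(self.fc(h))          # probability in (0, 1)
```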
Step 2: The comprehensive loss function L is constructed for the multi-scale generative adversarial network.
In a further embodiment, the mentioned comprehensive loss function L can be expressed as
L = αLimage + βLGAN
where α and β are the weights, Limage the loss function regarding the image content, and LGAN the adversarial loss function.
The image content loss Limage is written as
Limage = γLf + ηLm
where γ and η are the weights, Lf represents the loss of the sine and cosine terms, and Lm the loss of the modulation image. Lf comprehensively sums the errors of the output sine term and cosine term over the four different scales S of the multi-scale image generator, where Hs is the height and Ws the width of the image at scale S, G denotes the ground-truth data, P the predicted results generated by the multi-scale image generator, and the subscripts sin and cos indicate the sine term and the cosine term, respectively.
Similarly, the loss of the modulation image Lm comprehensively sums the errors of the output modulation image over the four different scales, where the subscript mod indicates the modulation image.
The adversarial loss function LGAN is
LGAN = E_{T~p(T)}[log d(T)] + E_{I~p(I)}[log(1 − d(g(I)))]
where E represents the expectation, I is the input fringe image, T is the ground-truth data (the ground-truth sine term, cosine term and modulation image corresponding to the input fringe image), p represents the probability distribution, g represents the multi-scale image generator, g(I) denotes the estimated sine term, cosine term and modulation image produced by the multi-scale image generator, and d is the image discriminator.
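To make the structure of the comprehensive loss concrete, a sketch is given below. The mean-squared-error form of Lf and Lm, the generator-side adversarial term, and the weight values are assumptions for illustration, since only the overall structure of the loss is specified above.

```python
# Sketch of the comprehensive loss L = alpha*L_image + beta*L_GAN, with
# L_image = gamma*L_f + eta*L_m summed over the four scales. The squared-error
# norm and the weight values are assumptions for illustration.
import torch
import torch.nn.functional as F

def comprehensive_loss(preds, truths, disc_fake, alpha=1.0, beta=0.01, gamma=1.0, eta=1.0):
    """preds/truths: lists of 4 tensors (one per scale, full resolution first),
    each with channels [sine, cosine, modulation]; disc_fake: d(g(I))."""
    L_f, L_m = 0.0, 0.0
    for P, G in zip(preds, truths):                       # loop over the 4 scales
        L_f = L_f + F.mse_loss(P[:, 0:2], G[:, 0:2])      # sine and cosine terms
        L_m = L_m + F.mse_loss(P[:, 2:3], G[:, 2:3])      # modulation image
    L_image = gamma * L_f + eta * L_m
    # Generator-side adversarial term: minimized when d(g(I)) approaches 1.
    L_GAN = torch.mean(torch.log(1.0 - disc_fake + 1e-8))
    return alpha * L_image + beta * L_GAN
```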
Step 3: The training data are collected to train the multi-scale generative adversarial network.
The fringe images of v different scenes are measured, and 7 phase-shifting fringe images are taken for each scene. The acquired fringe images are expressed as It(x, y) (t = 1, 2, ..., K), where K = 7v is the total number of acquired fringe images.
Using the 7-step phase-shifting algorithm (Bruning, J. H., Donald R. Herriott, Joseph E. Gallagher, Daniel P. Rosenfeld, Andrew D. White, and Donald J. Brangaccio. "Digital wavefront measuring interferometer for testing optical surfaces and lenses." Applied Optics 13, no. 11 (1974): 2693-2703), the ground-truth sine term, cosine term, and modulation image corresponding to each fringe image are calculated.
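A sketch of this ground-truth computation with the standard N-step (N = 7) phase-shifting formulas is shown below; an equal phase shift of 2π/7 between frames and the fringe model It = A + B·cos(ϕ + δt) are assumptions of the sketch.

```python
# Sketch of the standard N-step (here N = 7) phase-shifting computation of the
# ground-truth sine term M, cosine term D, and modulation image B, for the
# assumed model I_t = A + B*cos(phi + delta_t) with delta_t = 2*pi*t/N.
import numpy as np

def seven_step_ground_truth(frames):
    """frames: array of shape (7, H, W) with the 7 phase-shifted fringe images."""
    N = frames.shape[0]
    deltas = 2 * np.pi * np.arange(N) / N                                     # phase shift of each frame
    M = -(2.0 / N) * np.sum(frames * np.sin(deltas)[:, None, None], axis=0)  # sine term B*sin(phi)
    D = (2.0 / N) * np.sum(frames * np.cos(deltas)[:, None, None], axis=0)   # cosine term B*cos(phi)
    B = np.sqrt(M**2 + D**2)                                                  # modulation image
    phase = np.arctan2(M, D)                                                  # wrapped phase in [-pi, pi]
    return M, D, B, phase
```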
Then, the fringe image It(x, y) is fed into the multi-scale image generator, which estimates the sine term, cosine term and modulation image with a size of H×W.
The image discriminator alternately extracts a group of sine term, cosine term and modulation image from the ground-truth dataset and from the dataset generated by the multi-scale image generator as input. The input data then successively pass through six residual blocks and a fully connected layer. Finally, the data pass through the sigmoid activation function, which outputs a probability value between 0 and 1. The role of the image discriminator is to learn, through training, to distinguish the group of ground-truth sine term, cosine term and modulation image (labeled 1) from the group output by the image generator (labeled 0). During training, the multi-scale image generator learns to generate sine terms, cosine terms and modulation images with increasing fidelity until the adversarial loss approaches a preset threshold, at which point the estimated data can "fool" the image discriminator.
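One possible alternating training step consistent with this description is sketched below; the binary cross-entropy discriminator loss, the optimizer handling, and the reuse of the comprehensive_loss sketch above are illustrative assumptions rather than details specified by the invention.

```python
# Sketch of one alternating adversarial training step. The labels (1 = ground
# truth, 0 = generated) follow the text; everything else is an assumption.
import torch
import torch.nn.functional as F

def train_step(generator, discriminator, opt_g, opt_d, pyramids, truths):
    """pyramids: (I1, I2, I3, I4); truths: 4 ground-truth tensors, full resolution first."""
    # --- discriminator update: real data labeled 1, generated data labeled 0 ---
    with torch.no_grad():
        fakes = generator(*pyramids)
    d_real = discriminator(truths[0])                 # full-resolution ground truth
    d_fake = discriminator(fakes[0])
    loss_d = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) + \
             F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # --- generator update: image-content loss plus adversarial loss ---
    fakes = generator(*pyramids)
    d_fake = discriminator(fakes[0])
    loss_g = comprehensive_loss(list(fakes), truths, d_fake)   # see sketch above
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```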
In some embodiments, 80% of the collected training data is used for training, and the remaining 20% is used for validation. The specific training implementation can be carried out by referring to works such as Isola, Phillip, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. "Image-to-image translation with conditional adversarial networks." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1125-1134, 2017, and Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. "Generative adversarial networks." Communications of the ACM 63, no. 11 (2020): 139-144.
Step 4: When the multi-scale generative adversarial network is trained, a fringe image is fed into the image generator, which outputs the sine term M(x, y), the cosine term D(x, y), and the modulation image B(x, y). The sine term M(x, y) and cosine term D(x, y) are substituted into the arctangent function to calculate the phase ϕ(x, y):
ϕ(x, y) = arctan[M(x, y)/D(x, y)]
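A one-line sketch of this final step is given below; using arctan2 instead of a plain arctangent, so that the wrapped phase covers the full [−π, π] range, is an implementation detail assumed here.

```python
# Sketch of the final phase calculation from the generator outputs. Using
# numpy.arctan2 (rather than plain arctan) yields the wrapped phase over the
# full [-pi, pi] range.
import numpy as np

def compute_phase(M, D):
    """M: estimated sine term B*sin(phi); D: estimated cosine term B*cos(phi)."""
    return np.arctan2(M, D)    # wrapped phase phi(x, y)
```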
Since the input of the neural network is only a single fringe pattern, the invention provides an efficient and high-precision phase calculation method for moving objects.
In order to verify the effectiveness of the present invention, a fringe projection 3D measurement system is constructed with a camera (model acA640-750, Basler), a projector (model LightCrafter 4500, TI), and a computer. Firstly, the multi-scale generative adversarial network is constructed by using steps 1 and 2. Secondly, the training data collected in step 3 are used to train the multi-scale generative adversarial network. In this embodiment, v = 150 different scenes are selected, and 1050 training fringe images It(x, y) (t = 1, 2, ..., 1050) are taken by using the 7-step phase-shifting method. The 7-step phase-shifting method is then used to generate the ground-truth sine term Mt(x, y), cosine term Dt(x, y), and modulation image Bt(x, y) for each It(x, y).
When the neural network is trained, an unseen scene is measured. The fringe image of this scene is shown in the figure.
In order to quantify the phase measurement accuracy of the present invention, the fringe image of this scene is further analyzed for quantitative comparison.