The present invention relates to three-dimensional imaging. More specifically, it is concerned with a system and a method for high-speed dual-view band-limited illumination profilometry.
Three-dimensional (3D) surface imaging has been extensively applied in a number of fields in industry, entertainment, and biomedicine. Among developed methods, structured-light profilometry has gained increasing popularity in measuring dynamic 3D objects due to its high measurement accuracy and high imaging speeds. Phase-shifting fringe projection profilometry (PSFPP), for instance, uses a set of sinusoidal fringe patterns as the basis for coordinate encoding, and, in contrast to other methods such as binary pattern projection, the pixel-level information carried by the phase of the fringe patterns is insensitive to variations in reflectivity across the object's surface, which results in high accuracy in 3D measurements. The sinusoidal fringes are typically generated using digital micromirror devices (DMDs). Each micromirror on a digital micromirror device can be independently tilted to either +12° or −12° from the normal to its surface to generate binary patterns at up to tens of kilohertz. Although digital micromirror devices are binary amplitude spatial light modulators, it has been shown that they can be used to generate grayscale fringe patterns at high speeds. The average reflectance of each micromirror can be controlled by a conventional dithering method to form a grayscale image. However, the projection rate of fringe patterns is then limited to hundreds of hertz. To improve the projection speed, binary defocusing methods have been developed to produce a quasi-sinusoidal pattern by slightly defocusing a single binary digital micromirror device pattern. Nonetheless, the image is generated at a plane unconjugated to the digital micromirror device, which compromises the depth-sensing range and is less convenient to operate with fringe patterns of different frequencies. Recently, band-limited illumination was developed to control the system bandwidth by placing a pinhole as a low-pass filter at the Fourier plane of a 4f imaging system. Both the binary defocusing method and the band-limited illumination scheme allow generating one grayscale sinusoidal fringe pattern from a single binary digital micromirror device pattern. Thus, the fringe projection speed matches the refreshing rate of the digital micromirror device.
High-speed image acquisition is indispensable to digital micromirror device-based phase-shifting fringe projection profilometry. In standard phase-shifting fringe projection profilometry methods, extra calibration patterns must be used to avoid phase ambiguity, which reduces the overall 3D imaging speed. A solution to this problem is to use multiple cameras to simultaneously capture the full sequence of fringe patterns. The enriched observation of the 3D object eliminates the need for calibration patterns in data acquisition and phase unwrapping. This advancement, along with the steadily increasing imaging speeds of cameras, has endowed multi-view phase-shifting fringe projection profilometry systems with image acquisition rates that keep up with the refreshing rates of digital micromirror devices.
Current multi-view phase-shifting fringe projection profilometry systems are still limited, mainly in two aspects. First, each camera must capture the full sequence of fringe patterns. This requirement imposes redundancy in data acquisition, which ultimately caps the imaging speeds of these systems. Given the finite readout rates of camera sensors, a sacrifice of the field of view (FOV) is inevitable for higher imaging speeds. Advanced signal processing approaches applied to mitigate this trade-off, such as image interpolation and compressed sensing, typically involve high computational complexity and reduced image quality. Second, the cameras are generally placed on different sides of the projector, an arrangement that may induce a large intensity difference due to directionally scattered light and a shadow effect due to occlusion by local surface features, both of which reduce the reconstruction accuracy and preclude application to non-Lambertian surfaces.
There is a need in the art for a method and a system for high-speed dual-view band-limited illumination profilometry.
More specifically, in accordance with the present invention, there is provided a system for 3D imaging of an object, the system comprising a projection unit and at least two cameras; the projection unit comprising a light source and a projector; the cameras being positioned on a same side of the projector; wherein the projection unit projects sinusoidal fringe patterns onto the object and the cameras alternately capture, point by point, fringe patterns deformed by the object, depth information being encoded into the phase of the deformed fringe patterns, and the object being recovered by phase demodulation and reconstruction.
There is further provided a method for 3D imaging of an object, comprising projecting sinusoidal fringe patterns onto the object using a projecting unit and capturing fringe patterns deformed by the object, alternately by at least a first camera and a second camera, and recovering a 3D image of the object pixel by pixel from mutually incomplete images provided by the first camera and the second camera, by locating a point in the images of the second camera that matches a selected pixel of the first camera; determining estimated 3D coordinates and a wrapped phase based on calibration of the cameras; determining a horizontal coordinate on the plane of a projector of the projecting unit based on calibration of the projector; and using a wrapped phase value to recover a 3D point of 3D coordinates (x, y, z).
Other objects, advantages and features of the present invention will become more apparent upon reading of the following non-restrictive description of specific embodiments thereof, given by way of example only with reference to the accompanying drawings.
In the appended drawings:
The present invention is illustrated in further details by the following non-limiting examples.
A system for band-limited illumination profilometry (BLIP) with temporally interlaced acquisition (TIA) according to an embodiment of an aspect of the present invention generally comprises a projection unit to project pre-defined fringe patterns onto the surface of the measured object, the fringe patterns being distorted and reflected by the object surface, point by point, and cameras capturing the distorted fringe images, point by point.
In
After expansion and collimation by a beam expander 12, the laser beam from a 200-mW continuous-wave laser source 10, of wavelength λ = 671 nm (MRL-III-671, CNI Lasers), is directed by mirrors 14 and 16 to a 0.45″ digital micromirror device 18 (AJD-4500, Ajile Light Industries) at an incident angle of about 24° to the normal of the surface of the digital micromirror device 18, for sinusoidal fringe generation, using four phase-shifting binary patterns, generated by an error diffusion algorithm from their corresponding grayscale sinusoidal patterns, loaded onto the digital micromirror device 18. The pattern conversion unit 21, comprising a 4f imaging system 25 with a low-pass filter 24 such as a pinhole, converts the binary patterns to grayscale fringes at the intermediate image plane 28.
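By way of non-limiting illustration, the conversion of phase-shifted grayscale sinusoids into binary digital micromirror device patterns by error diffusion may be sketched as follows; the Floyd-Steinberg kernel, the 912×1140 pixel count, and the fringe period in pixels are assumptions made for this example only, as the present description does not prescribe a particular error diffusion kernel.

    import numpy as np

    def binary_fringe_patterns(width=912, height=1140, period=42, shifts=4):
        """Dither phase-shifted grayscale sinusoids into binary DMD
        patterns by error diffusion (Floyd-Steinberg kernel assumed).
        `period` is the fringe period in DMD pixels (illustrative)."""
        u = np.arange(width)
        patterns = []
        for k in range(shifts):
            # Grayscale sinusoid in [0, 1], phase-shifted by k*pi/2.
            gray = 0.5 + 0.5 * np.cos(2 * np.pi * u / period + k * np.pi / 2)
            img = np.tile(gray, (height, 1))
            binary = np.zeros_like(img)
            for y in range(height):  # plain raster scan, not optimized
                for x in range(width):
                    binary[y, x] = 1.0 if img[y, x] >= 0.5 else 0.0
                    err = img[y, x] - binary[y, x]
                    # Push the quantization error onto unvisited neighbors.
                    if x + 1 < width:
                        img[y, x + 1] += err * 7 / 16
                    if y + 1 < height:
                        if x > 0:
                            img[y + 1, x - 1] += err * 3 / 16
                        img[y + 1, x] += err * 5 / 16
                        if x + 1 < width:
                            img[y + 1, x + 1] += err * 1 / 16
            patterns.append(binary.astype(np.uint8))
        return patterns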
The minimal pinhole diameter D for all spatial frequency content of the sinusoidal fringe pattern to pass through the system is determined by the system bandwidth as follows:
D = λf1/pf, (1)
where pf = 324 μm is the period of the fringes composed by the digital micromirror device pixels and f1 is the focal length of lens 22. With lenses 22 and 26 of the 4f imaging system 25 having focal lengths f1 = 120 mm and f2 = 175 mm, respectively, the minimal pinhole diameter is D = 248.52 μm. In an experiment, a 300 μm-diameter pinhole was selected.
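As a non-limiting numerical check, Relation (1) may be evaluated with the values given above:

    wavelength = 671e-9  # m, laser wavelength
    f1 = 120e-3          # m, focal length of lens 22
    p_f = 324e-6         # m, fringe period at the digital micromirror device
    D = wavelength * f1 / p_f  # Relation (1)
    print(f"Minimal pinhole diameter: {D * 1e6:.2f} um")  # -> 248.52 um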
A projector lens 30 (AF-P DX NIKKOR, Nikon) projects the output fringe patterns on a 3D object 32.
Deformed structured images are captured alternately by two high-speed CMOS cameras 34, 36 (CP70-1HS-M-1900, Optronis) placed side by side, i.e., on a same side of the projector. Depending on their roles in image reconstruction, the cameras are referred to as the main camera 34 and the auxiliary camera 36, respectively, as will be described hereinbelow. Synchronized by the trigger signal of the digital micromirror device, each camera captures half of the sequence (
The light source 10 is a highly coherent light source with a power of at least 50 mW, selected depending on the sensitivity of the cameras 34 and 36, with a laser wavelength comprised in the range between about 380 and about 750 nm in the case of visible-light cameras, and in the range between about 800 and about 1100 nm in the case of near-infrared (NIR) cameras.
The high-speed cameras 34, 36 may be cameras with a global shutter, with an imaging speed of at least about 2,000 frames per second and an image resolution of at least about 1000×800 pixels.
The spatial light modulator 18 has a refreshing rate of at least about 4 kHz and an on-board memory of at least about 1 Mb, and is selected to work at the corresponding wavelength of the light source. It may be a liquid crystal display or a binary fringe mask with a motorized translation stage, for example.
The pattern conversion unit 21 may comprise a 4f imaging system 25 with lenses of different focal lengths, and the low-pass filter 24 may be a slit. The focal lengths of the two lenses are selected with a ratio (focal length of the first lens/focal length of the second lens) comprised in the range between about 0.75 and about 1.5. The diameter of the low-pass filter is selected in the range between about 150 μm and about 300 μm.
The projecting optics 30 is selected with a focal length in the range between about 18 and about 55 mm, an f-number in the range between about 3.5 and about 5.6, and a magnification ratio in a range between about 5 and about 10 times.
The imaging speed and field of view may be further improved by using more than two cameras, in such a way to separate the workload to an array of cameras, for example to trace and recognize hand gesture in 3D space to provide information for human-computer interaction.
The system thus projects sinusoidal fringe patterns onto the object and captures the corresponding deformed patterns modulated by the object surfaces. The depth information is encoded into the phase of the distorted fringe images. For phase demodulation and reconstruction of the 3D object, the retrieved phase distribution corresponding to the object height is mathematically wrapped to the principal values of the arctangent function, ranging between −π and π; consequently, phase discontinuities occur at these limits every time the unknown true phase changes by 2π. This is referred to as the phase ambiguity problem and results from the periodic nature of the sinusoidal signal. A unique pixel correspondence between the cameras and the projector is obtained by phase unwrapping.
According to an aspect of the present disclosure, a method to recover the 3D image of the object pixel by pixel from the mutually incomplete images provided by the cameras generally comprises locating a point (u′a, v′a) in the images of the auxiliary camera 36 that matches a selected pixel (um, vm) of the main camera 34; determining estimated 3D coordinates and a wrapped phase from knowledge of the camera calibration; determining the horizontal coordinate on the plane of the projector from knowledge of the projector calibration; and using the wrapped phase value to recover the 3D point of 3D coordinates (x, y, z) with the coordinate-based method.
System Calibration
To recover the object's 3D information, the method relies on a coordinate-based understanding of the spatial relationship between the projector 30 and the cameras 34, 36 in image formation. The projection of the 3D coordinates (x, y, z) of a 3D point onto the camera coordinates (u, v) is described in a pinhole model using extrinsic parameters R and T, describing the rotation and translation of coordinates, respectively, and intrinsic parameters characterizing the properties of the cameras in image formation, with fu and fv the effective focal lengths along each axis of the camera sensor, upp and vpp the coordinates of the principal point of the camera, and a accounting for pixel skewness, as follows:
s[u, v, 1]T = A[R T][x, y, z, 1]T, (2)
where A = [fu, a, upp; 0, fv, vpp; 0, 0, 1] is the matrix of intrinsic parameters.
Column vectors [u, v, 1]T and [x, y, z, 1]T represent the camera coordinates (u, v) and the 3D coordinates (x, y, z) in homogeneous coordinates, which allow for the numerical extraction of the camera coordinates (u, v) from Relation (2) through a scalar factor s.
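By way of non-limiting illustration, Relation (2) may be written in code as follows; the intrinsic and extrinsic values below are placeholders rather than calibration results.

    import numpy as np

    def project_point(K, R, T, xyz):
        """Pinhole projection of Relation (2): returns the camera
        coordinates (u, v) of a 3D point after dividing out the
        scalar factor s."""
        p = K @ (R @ np.asarray(xyz, dtype=float) + T)
        return p[0] / p[2], p[1] / p[2]

    # Intrinsic matrix built from fu, fv, upp, vpp and the skew term a
    # (placeholder values).
    fu, fv, upp, vpp, a = 1600.0, 1600.0, 590.0, 430.0, 0.0
    K = np.array([[fu, a, upp],
                  [0.0, fv, vpp],
                  [0.0, 0.0, 1.0]])
    R, T = np.eye(3), np.zeros(3)  # placeholder extrinsics
    u, v = project_point(K, R, T, (0.1, 0.05, 1.0))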
The cameras and the projector are calibrated to determine the values of the extrinsic and intrinsic parameters using a checkerboard. Since direct image acquisition is not possible for a projector, projector-centered images of the calibration object, obtained by the phase-based mapping method, are sent to a calibration toolbox and calibrated in the same manner as for the cameras.
Coordinate-Based 3D Point Determination
3D information is recovered from the calibrated imaging system using a coordinate-based method. To a point on the 3D object with the 3D coordinates (x, y, z) correspond two independent coordinates, (u, v) for the cameras and (u″, v″) for the projector.
In a calibrated phase-shifting fringe projection profilometry system, any three of these coordinates {u, v, u″, v″} can be determined and a linear system of the form E=M [x, y, z]T is derived. The elements of E and M are obtained by using the calibration parameters of each device, the scalar factors and the three determined coordinates among u, v, u″ and v″. Thus, 3D information of an object point can be extracted via matrix inversion.
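By way of non-limiting illustration, one way of realizing this extraction is sketched below: each known coordinate contributes one row of M and one entry of E from the 3×4 projection matrix P = A[R T] of the corresponding device, and the resulting 3×3 system is solved for (x, y, z). This sketch is illustrative and not necessarily the exact formulation used in practice.

    import numpy as np

    def recover_xyz(rows):
        """Solve E = M [x, y, z]^T from three projection constraints.
        `rows` is a list of (P, c, axis) triples: P is a device's 3x4
        projection matrix, c a measured coordinate, and axis 0 or 1
        for a u-type or v-type coordinate. Each triple yields one row
        (P[axis] - c*P[2]) . [x, y, z, 1]^T = 0 of the linear system."""
        M, E = [], []
        for P, c, axis in rows:
            r = P[axis] - c * P[2]
            M.append(r[:3])
            E.append(-r[3])
        xyz, *_ = np.linalg.lstsq(np.asarray(M), np.asarray(E), rcond=None)
        return xyz

    # Example: the triple (um, vm, u''_p) combines two rows from the
    # main camera and one horizontal-coordinate row from the projector:
    # rows = [(P_main, um, 0), (P_main, vm, 1), (P_proj, u_pp, 0)]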
Returning to the system discussed in relation to
Data Acquisition
For data acquisition, four fringe patterns with phases equally shifted by π/2 illuminate the 3D object. The intensity value Ik(u, v) for the pixel (u, v) in the kth image acquired by the calibrated main camera 34 is obtained as follows:
Ik(u, v) = Ib(u, v) + Iva(u, v)cos[φ(u, v) + kπ/2], (3)
where k = 0, 1, 2, 3; Ib(u, v) is the background intensity; Iva(u, v) is the variation of intensity; and φ(u, v) is the depth-dependent phase.
Relation (3) allows analyzing two types of intensity matching conditions for the order of pattern projection shown in
I0(um, vm) + I2(u′a, v′a) = I1(um, vm) + I3(u′a, v′a). (4)
Rearrangement of Relation (4) leads to the equivalent relation, selected as the intensity matching condition:
I0(um, vm) − I1(um, vm) = I3(u′a, v′a) − I2(u′a, v′a). (5)
Each side of Relation (5) contains images captured by the same camera and represents a residual fringe component of sinusoidal characteristics, which makes it possible to increase the efficiency of line-constrained searches by regularizing local maxima and minima in the patterns and by including additional phase information. Moreover, by considering the right-hand side as a continuously varying function along the epipolar line determined on the calibrated auxiliary camera 36, Relation (5) and bi-linear interpolation allow for the selection of discrete candidates with sub-pixel accuracy.
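As a non-limiting check, the following snippet shows why Relations (4) and (5) hold at a matched pixel pair under the intensity model of Relation (3), using the cosine phase-shift convention assumed there:

    import numpy as np

    # Synthetic check at a matched pixel pair: both residual fringes
    # reduce to Iva*(cos(phi) + sin(phi)) when the two pixels share
    # the same background, modulation, and depth-dependent phase.
    Ib, Iva, phi = 0.5, 0.4, 1.2  # illustrative values
    I = [Ib + Iva * np.cos(phi + k * np.pi / 2) for k in range(4)]
    lhs = I[0] - I[1]  # main-camera residual, I0 - I1
    rhs = I[3] - I[2]  # auxiliary-camera residual, I3 - I2
    assert np.isclose(lhs, rhs)                  # Relation (5)
    assert np.isclose(I[0] + I[2], I[1] + I[3])  # Relation (4)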
In a quality map determination step (see Step I in
In a candidate discovery step (see Step II in
The quality map constraint requires that the candidates (u′ai, v′ai) for the matching point in the auxiliary images fall within the quality map of the auxiliary camera.
The transformation constraint requires that candidates occur within a segment of the epipolar line determined by a fixed two-dimensional projective transformation or homography that approximates the location of the matching point (u′e, v′e) within the images of the auxiliary camera as follows:
s′[u′e,v′e,1]T=H[um,vm,1]T, (6)
where s′ is a scalar factor representing the extraction of the pair of coordinates of the estimated corresponding point (u′e, v′e) from its homogeneous coordinates. H is obtained by applying Relation (6) to four points chosen as the corners of a flat rectangular plane imaged by both cameras at the approximate center of the measurement volume. [um, vm, 1]T are the homogeneous coordinates of the selected pixel (um, vm) of the main camera. Once the coordinates of the estimated corresponding point (u′e, v′e) are determined, the search along the epipolar line is confined to the segment occurring over the horizontal interval [u′e − r0, u′e + r0], where r0 is an experiment-dependent constant. In general, r0 is selected as small as possible while still covering the targeted depth range. For the presently described experiments, the value of r0 was set to 40 pixels.
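By way of non-limiting illustration, the transformation constraint may be sketched as follows, assuming the homography H has already been estimated from the four corner correspondences:

    import numpy as np

    def search_segment(H, um, vm, r0=40):
        """Apply Relation (6) to estimate the matching point and return
        the horizontal search interval along the epipolar line.
        r0 = 40 pixels follows the experiments reported above."""
        p = H @ np.array([um, vm, 1.0])
        u_e, v_e = p[0] / p[2], p[1] / p[2]  # divide out the factor s'
        return (u_e - r0, u_e + r0), (u_e, v_e)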
The phase sign constraint requires that the selected point (um, vm) of the main camera and the candidates (u′ai, v′ai) have the same sign of their wrapped phases ωm and ω′ai, respectively. Estimates of the wrapped phases are obtained using Fourier transform profilometry. In particular, the intensity If(um, vm) of the selected pixel (um, vm) of the main camera in the filtered image is obtained by band-pass filtering the left-hand side of Relation (5), I0 − I1, as follows:
If(um, vm) = F−1{W·F[I0(um, vm) − I1(um, vm)]}, (7)
where F and F−1 denote the Fourier transform and its inverse, and W is a band-pass window selecting the fundamental frequency of the fringes.
The wrapped phase estimation ωm of the selected point (um, vm) is obtained as follows:
ωm(um, vm) = arctan{ℑ[If(um, vm)]/ℜ[If(um, vm)]}, (8)
where ℑ[⋅] and ℜ[⋅] denote the imaginary and real parts of a complex variable, respectively. Applying the same band-pass filtering to the right-hand side of Relation (5), I3 − I2, as follows:
I′f(u′ai, v′ai) = F−1{W·F[I3(u′ai, v′ai) − I2(u′ai, v′ai)]}, (9)
yields the estimate of the wrapped phase ω′ai of the candidate (u′ai, v′ai), as follows:
ω′ai(u′ai, v′ai) = arctan{ℑ[I′f(u′ai, v′ai)]/ℜ[I′f(u′ai, v′ai)]}. (10)
The phase sign constraint requires that the wrapped phase estimation ωm of the selected point (um, vm) and the wrapped phase estimation ω′ai of the candidate (u′ai, v′ai) have the same sign in the interval (−π, π].
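By way of non-limiting illustration, the Fourier transform profilometry estimate of Relations (7) to (10) may be sketched as follows; the carrier frequency f0 and the band width bw are assumptions to be set from the projected fringe period:

    import numpy as np

    def wrapped_phase(residual, f0, bw):
        """Wrapped phase of a residual fringe image (I0 - I1 or
        I3 - I2): keep a one-sided band around the carrier frequency
        f0 (cycles/pixel) along the fringe axis, inverse transform,
        and take arctan of the imaginary over the real part."""
        F = np.fft.fft(residual, axis=1)
        fx = np.fft.fftfreq(residual.shape[1])
        W = (np.abs(fx - f0) < bw).astype(float)  # band-pass window
        filtered = np.fft.ifft(F * W[None, :], axis=1)
        return np.arctan2(filtered.imag, filtered.real)  # in (-pi, pi]

    # The phase sign constraint keeps the candidates whose wrapped
    # phase has the same sign as that of the selected main-camera pixel.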
Other Fourier transform profilometry methods may also be used for wrapped phase value extraction.
The output of the candidate discovery step is a pool of candidates for further evaluation and the method proceeds to matching point selection. If no candidate is found, the candidate discovery step is re-initiated for the next pixel in the main camera, until a candidate is obtained, and the method proceeds to the matching point selection.
In the matching point selection step (see Step III in
Meanwhile, for each candidate (u′ai, v′ai), the coordinate triple (um, vm, u′ai) and knowledge of the camera calibration allow determining an estimated 3D point Pi by using the stereo vision method. In addition, with the knowledge of the projector calibration, a point with coordinates (u″pi, v″pi) on the plane of the projector is determined for each candidate. Then, an unwrapped phase value φ″pi is obtained by:
φ″pi = 2π(u″pi − u″d)/p, (11)
where u″d is a horizontal datum coordinate on the plane of the projector associated with the zero phase, and p is the fringe period in units of projector pixels. Since these independently obtained phase values must agree if the candidate correctly matches (um, vm), a penalty score Ai is computed as a normalized difference of the two phase values, using a rewrapping function R(⋅) that computes the wrapped difference between the wrapped and unwrapped phase values.
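Since the exact normalization of Ai is not reproduced above, the following is an assumed form consistent with the description: the rewrapping function maps a phase difference back into (−π, π], and the score normalizes its magnitude.

    import numpy as np

    def rewrap(x):
        """Map a phase value or difference into (-pi, pi]."""
        return np.angle(np.exp(1j * x))

    def penalty_A(phi_pp_i, omega_a_i):
        """Assumed form of the score A_i: normalized magnitude of the
        rewrapped disagreement between the unwrapped projector phase
        of Relation (11) and the candidate's wrapped phase of
        Relation (10)."""
        return np.abs(rewrap(phi_pp_i - omega_a_i)) / np.pi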
To improve the robustness of the method, two additional criteria are implemented using data available from the candidate discovery step. Bi is a normalized distance score favoring candidates located closer to the estimated matching point (u′e, v′e).
Moreover, Ci is a normalized difference of wrapped phase values, obtained from the wrapped phases ωm and ω′ai.
A total penalty score Si for each candidate is then determined as a weighted linear combination of three individual scores as follows:
Si = η1Ai + η2Bi + η3Ci, (15)
where the normalized weights [η1, η2, η3] = [0.73, 0.09, 0.18] are selected empirically to yield the results most consistent with physical reality. Finally, the candidate with the minimum total penalty score Si is selected as the matching point (u′a, v′a), and its phase values, obtained by using Relations (10) and (11), are denoted φ′a and φ″p, respectively.
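By way of non-limiting illustration, the selection by Relation (15) reduces to a weighted sum and an argmin:

    import numpy as np

    def select_candidate(A, B, C, weights=(0.73, 0.09, 0.18)):
        """Total penalty Si = eta1*Ai + eta2*Bi + eta3*Ci of
        Relation (15); returns the index of the candidate with the
        minimum total penalty."""
        S = (weights[0] * np.asarray(A)
             + weights[1] * np.asarray(B)
             + weights[2] * np.asarray(C))
        return int(np.argmin(S))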
In a final step of 3D point recovery (see Step IV in
u″p = u″d + p(φ′a/2π + q), (16)
from which the final 3D coordinates (x, y, z) are obtained using calibration information associated with the coordinate triple (um, vm, u″p).
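By way of non-limiting illustration, this final step reduces to Relation (16) followed by the coordinate-based extraction sketched earlier; the integer fringe order q is assumed to have been determined from the estimated 3D point Pi.

    import numpy as np

    def projector_coordinate(u_d, p, phi_a, q):
        """Relation (16): absolute horizontal projector coordinate
        from the datum u''_d, the fringe period p (in projector
        pixels), the wrapped phase phi'_a, and the fringe order q."""
        return u_d + p * (phi_a / (2.0 * np.pi) + q)

    # The triple (um, vm, u''_p) then feeds the coordinate-based
    # solver (see recover_xyz above) to yield (x, y, z).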
Results
To examine the feasibility of the method, various static 3D objects were imaged. First, two sets of 3D-distributed letter toys forming the words "LACI" and "INRS" were imaged.
Imaging of Dynamic 3D Objects
To verify high-speed 3D surface profilometry, the method was used to image two dynamic scenes: a moving hand and three bouncing balls. The fringe patterns were projected at 4 kHz. The exposure times of both cameras were te = 250 μs. Under these experimental conditions, a 3D imaging speed of 1 thousand frames per second (kfps), a field of view (FOV) of 150 mm×130 mm, corresponding to 1180×860 pixels in the captured images, and a depth resolution of 0.24 mm were achieved.
In the second experiment, three white balls, each of which was marked by a different letter on its surface, bounced in an inclined transparent container.
Application to the Study of Sound-Induced Vibration on Glass
To highlight the broad utility of the method, sound-induced vibration on glass was imaged. In an experiment (
Time histories of averaged depth displacements under different sound frequencies were further analyzed.
Application to the Study of Glass Breakage
To further apply the method to recording non-repeatable 3D dynamics, the process of glass breaking by a hammer was imaged. As displayed in
There is thus presented a method with a kfps-level 3D imaging speed over a field of view of up to 150 mm×130 mm. The method implements temporally interlaced acquisition in multi-view 3D phase-shifting fringe projection profilometry systems, which allows each camera to capture half of the sequence of phase-shifting fringes. Leveraging the characteristics indicated in the intensity matching condition [Relation (5)], the method applies constraints in geometry and phase to find the matching pair of points in the main and auxiliary cameras and guides phase unwrapping to extract the depth information. The method was shown to allow the 3D visualization of glass vibration induced by sound and of glass breakage by a hammer.
There is thus presented a system and a method for high-speed dual-view band-limited illumination profilometry using temporally interlaced acquisition. As people in the art will now be in a position to appreciate, temporally interlaced acquisition eliminates the redundant capture of fringe patterns in data acquisition. The roles of the main camera and the auxiliary camera are interchangeable, and the present method may be adapted to a range of multi-view phase-shifting fringe projection profilometry systems. Moreover, temporally interlaced acquisition reduces the workload of both cameras by half. For a given bandwidth of the camera interface, this more efficient use of the cameras can either increase the 3D imaging speed for a fixed field of view or enlarge the field of view at a maintained 3D imaging speed. Both advantages shed light on implementing the present method with an array of cameras to simultaneously accomplish high-accuracy and high-speed 3D imaging over a larger field of view. Also, the two cameras deployed in the present method are placed on a same side relative to the projector, which circumvents the intensity difference induced by directional scattering of light from the 3D object and reduces the shadow effect caused by occlusion that occurs when placing the cameras on different sides of the projector. As a result, robust pixel matching in the image reconstruction algorithm allows recovering 3D information on non-Lambertian surfaces.
The imaging speed and field of view may be further optimized by separating the workload among four cameras, by using a faster digital micromirror device, and by using a more powerful laser. The image reconstruction may be brought toward real-time operation by further adapting the 3D point recovery method to four cameras and by using parallel computing to accelerate the calculation.
The present method may be integrated in structured illumination microscopy and frequency-resolved multi-dimensional imaging. The present method may also be implemented in the study of the dynamic characterization of glass in its interaction with external forces in non-repeatable safety test analysis. As another example, the present method may be used to trace and recognize hand gestures in 3D space to provide information for human-computer interaction. Furthermore, in robotics, the present method may provide dual-view 3D vision for object tracking and reaction guidance. Finally, the present method can be used as an imaging accelerometer for vibration monitoring in rotating machinery and for behavior quantification in biological science.
Temporally interlaced acquisition, thus integrated in a dual-view phase-shifting fringe projection profilometry system, allows each camera to capture half of the sequence of phase-shifting fringes. Leveraging the characteristics indicated in the intensity matching condition, the method applies constraints in geometry and phase to find the matching pair of points in the main and auxiliary cameras and guides phase unwrapping to extract the depth information.
The present method and system eliminate the redundant capture of fringe patterns in data acquisition, which lifts the long-standing limitation in imaging speed for multi-view phase-shifting fringe projection profilometry and reduces the workload of the cameras, enabling the enhancement of either the 3D imaging speed or the imaging field of view. Dynamic 3D imaging at over 1 thousand frames per second over a field of view of up to 150×130 mm2, corresponding to 1180×860 pixels in the captured images, was demonstrated. Moreover, by placing the two cameras side by side on a same side of the projector, the present method and system circumvent the influence of directional scattering light and the occlusion effect for more robust reconstruction, thereby expanding the application range of multi-view phase-shifting fringe projection profilometry to non-Lambertian surfaces.
The present method and system may be adapted to other multi-view 3D profilometers, thus opening new opportunities for blur-free 3D optical inspection and characterization with high speeds, large fields of view, and high accuracy. The present method and system provide a versatile tool for dynamic 3D metrology with potential applications in advanced manufacturing, such as the characterization of glass in non-repeatable safety tests and high-speed vibration monitoring in rotating machinery. The present compact and symmetric system may be embedded in the vision system of robots to track objects, to recognize gestures for human-computer interaction, and to guide reactions.
The scope of the claims should not be limited by the embodiments set forth in the examples, but should be given the broadest interpretation consistent with the description as a whole.
This application claims benefit of U.S. provisional application Ser. No. 63/060,630, filed on Aug. 3, 2020. All documents above are incorporated herein in their entirety by reference.