The present invention generally relates to a structured-light three-dimensional (3D) scanning system, and more particularly to a structured-light 3D scanning system with a depth fusion device.
A structured-light scanning system projects a known pattern (e.g., grids or horizontal bars) onto an object in a scene. The deformation resulting from the light reflection may be analyzed to calculate the depth and surface information of the objects in the scene.
The conventional structured-light scanning system, while analyzing the deformation resulting from the light reflection, usually encounters a dilemma: adopting a larger kernel of pixels gives greater depth accuracy but less resolution, whereas adopting a smaller kernel of pixels gives higher resolution but less depth accuracy.
A need has thus arisen to propose a novel scheme to overcome drawbacks of the conventional structured-light scanning system and to solve the dilemma as mentioned above.
In view of the foregoing, it is an object of the embodiments of the present invention to provide a structured-light 3D scanning system capable of generating a fused depth map that is more accurate than the original depth maps, thereby enhancing image quality.
According to one embodiment, a structured-light three-dimensional (3D) scanning system includes a projector, an image capture device, a depth decoder and a depth fusion device. The projector emits a projected light with a predetermined pattern onto an object. The image capture device generates a captured image according to a reflected light reflected from the object, the predetermined pattern of the projected light being distorted due to the 3D shape of the object, thereby resulting in a distorted pattern. The depth decoder converts the distorted pattern into a depth map representing the 3D shape of the object. The depth fusion device generates a fused depth map according to at least two different depth maps associated with the object.
Specifically, the structured-light 3D scanning system 100 (“system” hereinafter) may include a projector 11 configured to emit a (visible or invisible) projected light with a predetermined pattern (or ground truth) onto an object 10. The system 100 may include an image capture device 12, such as a camera, operatively coupled to receive a reflected light reflected from the object 10 and configured to capture an image, thereby generating a captured image (or raw image). It is noted that, while being reflected from the object 10, the predetermined pattern of the projected light may be geometrically distorted due to the 3D shape of the object 10, thereby resulting in a distorted pattern.
In the embodiment, the system 100 may include a depth decoder 13 operatively coupled to receive the distorted pattern (on the captured image) and configured to convert (i.e., decode) the distorted pattern into an (original) depth map representing a 3D shape of the object 10.
The system 100 of the embodiment may optionally include a post-processor 14 configured to process the captured image after decoding by the depth decoder 13. For example, the post-processor 14 may be configured to remove noise from the captured image by applying (conventional) noise reduction algorithms, details of which are omitted for brevity. The post-processor 14 may also be configured to fill holes inside hollow regions of the captured image by applying (conventional) hole filling algorithms, details of which are likewise omitted for brevity.
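By way of a hedged illustration only, the sketch below shows one conventional choice for each post-processing step (a median filter for noise reduction and nearest-valid-pixel hole filling). The function post_process, the hole_value convention and the use of numpy/scipy are assumptions made for this sketch and are not part of the disclosed embodiment.

```python
import numpy as np
from scipy import ndimage

def post_process(image: np.ndarray, hole_value: float = 0.0) -> np.ndarray:
    """Hypothetical post-processing stage: conventional noise reduction
    followed by conventional hole filling (both left unspecified in the text)."""
    holes = image == hole_value                       # invalid / hollow pixels

    # Noise reduction: a small median filter suppresses speckle-like noise.
    denoised = ndimage.median_filter(image, size=3)

    # Hole filling: copy each hole pixel from its nearest valid neighbor.
    if holes.any():
        _, (rows, cols) = ndimage.distance_transform_edt(holes, return_indices=True)
        denoised = denoised[rows, cols]
    return denoised
```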
According to one aspect of the embodiment, the system 100 may include a depth fusion device 15 operatively coupled to receive at least two (original) depth maps associated with the object 10 and configured to generate a fused depth map, which is more accurate than the (original) depth maps (without sacrificing resolution), thereby enhancing image quality of the system 100. It is appreciated that more than two depth maps may be received and processed by the depth fusion device 15 in other embodiments.
In step 21, the depth decoder 13 decodes the captured image with a first kernel (also called window or mask) of pixels, thereby generating a first depth map. Next, in step 22, the depth decoder 13 decodes the captured image with a second kernel (of pixels) that is different from the first kernel in size, thereby generating a second depth map.
Specifically, the kernel, which is one type of spatial filter, may include a matrix. While decoding the captured image, a convolution may be performed between the kernel and the data (of the captured image). The convolution is a process of adding each element of the captured image to its local neighbors, weighted by the kernel, as is conventionally done; details are omitted for brevity.
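As a minimal sketch of the kernel-based decoding described above, the following code convolves the same captured image with two kernels of different sizes, mirroring steps 21 and 22. The averaging kernel and the decode_depth stub are assumptions standing in for the actual depth decoder 13, which would match the distorted pattern against the projected (ground-truth) pattern.

```python
import numpy as np
from scipy import ndimage

def decode_depth(captured: np.ndarray, kernel_size: int) -> np.ndarray:
    """Hypothetical stand-in for the depth decoder 13: convolve the captured
    image with a kernel (spatial filter) of the given size, i.e. each output
    pixel is a weighted sum of its local neighbors."""
    # An averaging kernel as a placeholder; a real decoder would correlate the
    # distorted pattern against the projected (ground-truth) pattern.
    kernel = np.full((kernel_size, kernel_size), 1.0 / kernel_size**2)
    aggregated = ndimage.convolve(captured, kernel, mode="nearest")
    # Placeholder: the aggregated response is returned directly in lieu of a
    # real pattern-to-depth mapping.
    return aggregated

captured_image = np.random.rand(240, 320)           # stand-in for the raw image
first_depth_map = decode_depth(captured_image, 7)   # larger kernel: better accuracy, less resolution
second_depth_map = decode_depth(captured_image, 3)  # smaller kernel: higher resolution, less accuracy
```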
In the embodiment, the depth fusion device 15 may include a depth difference device 151 configured to determine a depth difference between the first depth map and the second depth map at a pixel (under determination) (step 23).
The depth fusion device 15 of the embodiment may include a comparator 152 configured to compare the depth difference with a predetermined threshold, thereby resulting in a comparison result (step 24). In one exemplary embodiment, the threshold is fixed. In another exemplary embodiment, the threshold is dynamically determined. For example, the threshold may be determined as a percentage (e.g., 1%) of a depth value of the pixel (under determination) on a corresponding depth map.
The depth fusion device 15 of the embodiment may include a selector 153A (e.g., a multiplexer) configured to select between the first depth map and the second depth map at the pixel (under determination) according to the comparison result (of the comparator 152), thereby generating a fused depth value for the fused depth map at the pixel under determination (step 25A). In the embodiment, the selector 153A selects a depth value of the (first or second) depth map with the larger kernel as the fused depth value for the fused depth map when the depth difference is greater than the predetermined threshold.
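The following is a minimal sketch of steps 23 through 25A, assuming two depth maps decoded with a larger and a smaller kernel (such as those from the earlier sketch) and the percentage-based dynamic threshold mentioned above (1%). The function name, the choice of the smaller-kernel map as the "corresponding" map for the threshold, and the use of numpy are assumptions.

```python
import numpy as np

def fuse_depth_maps(larger_kernel_map: np.ndarray,
                    smaller_kernel_map: np.ndarray,
                    threshold_percent: float = 0.01) -> np.ndarray:
    """Hypothetical depth fusion device 15: depth difference (151),
    comparison against a dynamic threshold (152), selection (153A)."""
    # Step 23: depth difference at each pixel under determination.
    depth_difference = np.abs(larger_kernel_map - smaller_kernel_map)

    # Step 24: dynamic threshold, e.g. 1% of the pixel's depth value on a
    # corresponding depth map (the smaller-kernel map is assumed here).
    exceeds = depth_difference > threshold_percent * smaller_kernel_map

    # Step 25A: where the difference exceeds the threshold, take the depth
    # value decoded with the larger (more accurate) kernel; otherwise keep
    # the smaller-kernel (higher-resolution) value.
    return np.where(exceeds, larger_kernel_map, smaller_kernel_map)

# Usage (with the maps from the earlier sketch):
# fused_depth_map = fuse_depth_maps(first_depth_map, second_depth_map)
```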
In addition to the kernel size, the selector 153A may perform selection based on other criteria. For example, the selector 153A may perform selection further based on confidence levels that are derived from the post-processor 14.
In another embodiment, the depth fusion device 15 may include a weighting device 153B (that replaces the selector 153A of the preceding embodiment) configured to generate the fused depth value for the fused depth map at the pixel under determination by combining depth values of the first depth map and the second depth map with respective weights determined according to the comparison result (of the comparator 152).
Similar to the embodiment described above, the weighting device 153B may perform weighting further based on other criteria, such as confidence levels derived from the post-processor 14.
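Since the weighting details are not spelled out here, the sketch below is only one plausible reading: a per-pixel blend of the two depth maps whose weights are set from the comparison result. The weight values and the function name are assumptions.

```python
import numpy as np

def fuse_by_weighting(larger_kernel_map: np.ndarray,
                      smaller_kernel_map: np.ndarray,
                      threshold_percent: float = 0.01,
                      weight_if_exceeds: float = 0.75) -> np.ndarray:
    """Hypothetical weighting device 153B: instead of selecting one map per
    pixel, blend the two maps with weights chosen from the comparison result."""
    depth_difference = np.abs(larger_kernel_map - smaller_kernel_map)
    exceeds = depth_difference > threshold_percent * smaller_kernel_map

    # Give the (more accurate) larger-kernel value a heavier weight where the
    # difference exceeds the threshold, and a lighter weight elsewhere.
    w = np.where(exceeds, weight_if_exceeds, 1.0 - weight_if_exceeds)
    return w * larger_kernel_map + (1.0 - w) * smaller_kernel_map
```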
Although specific embodiments have been illustrated and described, it will be appreciated by those skilled in the art that various modifications may be made without departing from the scope of the present invention, which is intended to be limited solely by the appended claims.