The present invention relates to an information processing device, an information processing method, and a program.
A three-dimensional measurement technique using a Time of Flight (ToF) method is known. In this method, reference light such as an infrared pulse is projected toward a subject, and the depth of the subject is detected based on information of the time until the reflected light is received.
The above method has an issue in that, when ambient light having the same wavelength as the reference light is present in the measurement environment, the ambient light acts as noise and degrades depth detection accuracy.
In view of this, the present disclosure proposes an information processing device, an information processing method, and a program capable of enhancing the depth detection accuracy.
According to the present disclosure, an information processing device is provided that comprises: a signal acquisition unit that extracts, from light reception data, active information indicating a flight time of reference light and passive information indicating two-dimensional image information of a subject obtained from ambient light; a passive depth estimation unit that generates passive depth information based on the passive information; and a fusion unit that fuses the passive depth information with active depth information generated based on the active information to generate depth information of the subject. According to the present disclosure, an information processing method in which an information process of the information processing device is executed by a computer, and a program for causing a computer to execute the information process of the information processing device are also provided.
Embodiments of the present disclosure will be described below in detail with reference to the drawings. In each of the following embodiments, the same parts are denoted by the same reference symbols, and a repetitive description thereof will be omitted.
Note that the description will be given in the following order.
[1. Distance Measurement Technique Using ToF Camera]
The ToF camera 1 includes a light projection unit 10 and a light reception unit 20. The light projection unit 10 projects reference light PL toward a subject SU. The reference light PL is pulsed infrared light, for example. The light reception unit 20 receives the reference light PL (reflected light RL) reflected by the subject SU. The ToF camera 1 detects a depth d of the subject SU based on a time t from the moment the reference light PL is projected to the moment it is received by the light reception unit 20 after being reflected by the subject SU. The depth d can be expressed as d=t×c/2 using the speed of light c.
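The relation d=t×c/2 can be illustrated with a minimal Python sketch. The function name and the use of SI units (seconds, meters) are assumptions made for illustration and do not appear in the disclosure.

```python
# Speed of light in vacuum, in meters per second.
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def depth_from_flight_time(t_seconds: float) -> float:
    """Return the depth d = t * c / 2 for a round-trip flight time t.

    The division by 2 accounts for the light travelling to the subject
    and back. The name and units are illustrative assumptions.
    """
    if t_seconds < 0:
        raise ValueError("flight time must be non-negative")
    return t_seconds * SPEED_OF_LIGHT_M_PER_S / 2.0


if __name__ == "__main__":
    # A round trip of 20 ns corresponds to a depth of roughly 3 m.
    print(depth_from_flight_time(20e-9))
```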
When an ambient light source ELS such as the sun or an electric light exists in the image capturing environment, the ambient light EL emitted from the ambient light source ELS is incident on the light reception unit 20 in addition to the reference light PL reflected by the subject SU. When ambient light EL having the same wavelength as the reference light PL exists in the image capturing environment, the ambient light EL may act as noise and degrade the depth detection accuracy. Although Patent Literatures 2 to 4 eliminate the ambient light EL that produces noise, the present disclosure actively utilizes the ambient light EL to enhance the depth detection accuracy. This point will be described below.
The light reception unit 20 includes a plurality of pixels arrayed in the X direction and the Y direction. The Z direction orthogonal to the X direction and the Y direction is a depth direction. Each pixel is provided with an infrared sensor that detects infrared radiation. The infrared sensor receives infrared radiation at a preset sampling period. The infrared sensor detects the number of photons received within one sampling term (sampling period) as luminance.
The light reception unit 20 repeatedly measures the luminance within a preset measurement term (one frame). The ToF camera 1 stores the light reception data LRD for one measurement term measured by the light reception unit 20 as LiDAR input data ID. The LiDAR input data ID is time-series luminance data of each pixel measured in one measurement term. The ToF camera 1 converts the time from a measurement start time point into a distance (depth). The ToF camera 1 generates a depth map DM of the subject SU based on the LiDAR input data ID.
The lower part of
The position of the reference light component RC in the time axis direction reflects the distance to the subject SU. An average signal value of the ambient light component EC indicates brightness of the pixel. The brightness of each pixel provides information regarding a two-dimensional image of the subject SU (two-dimensional image information). In the present disclosure, active information indicating a flight time of the reference light PL and passive information indicating two-dimensional image information of the subject SU, obtained from the ambient light EL, are extracted from the LiDAR input data ID. For example, the active information includes information indicating the position and magnitude of the peak attributed to the reference light PL of each pixel. The passive information includes information indicating an average signal value of each pixel attributed to the ambient light EL.
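As a rough illustration of the extraction described above, the following Python sketch splits one pixel's time-series luminance into a peak (position and magnitude) and an average ambient level. It is a naive stand-in, not the method of the disclosure, which relies on a trained waveform separation model. The function name, the `guard` window, and the choice of the global maximum as the peak are all assumptions.

```python
from statistics import fmean


def split_active_passive(luminance, guard=2):
    """Naively split one pixel's luminance samples into active and passive parts.

    Assumptions (not from the source): the reference light peak is the
    global maximum, and `guard` samples on each side of it are excluded
    when averaging the ambient light component.
    Returns (peak_index, peak_magnitude, passive_level).
    """
    samples = list(luminance)
    if not samples:
        raise ValueError("luminance signal is empty")
    # Position of the peak on the time axis (active information).
    peak_index = max(range(len(samples)), key=samples.__getitem__)
    lo = max(0, peak_index - guard)
    hi = min(len(samples), peak_index + guard + 1)
    # Average signal value outside the peak window (passive information).
    ambient = samples[:lo] + samples[hi:]
    passive_level = fmean(ambient) if ambient else 0.0
    # Peak height measured above the ambient level.
    peak_magnitude = samples[peak_index] - passive_level
    return peak_index, peak_magnitude, passive_level


if __name__ == "__main__":
    signal = [10, 10, 10, 10, 10, 12, 40, 12, 10, 10]
    print(split_active_passive(signal, guard=1))
```

The peak index would then be converted to a depth through the sampling period, while the passive level of every pixel together forms the two-dimensional image.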
The left part of
For example,
Furthermore, the depth can be estimated by stereo imaging using a plurality of two-dimensional images IM having different viewpoints. By fusing the depth information estimated by the stereo imaging with the depth information estimated based on the active information AI, it is possible to obtain accurate depth information. The depth detection accuracy is enhanced by combining the active distance measurement technique using the active information AI and the passive distance measurement technique using the passive information PI.
[2. LiDAR Input Data]
The LiDAR input data ID includes the ToF data MD at a plurality of time points measured in the sampling period SP. The ToF data MD includes luminance data of each pixel measured at the same time point. The number of samples N of the LiDAR input data ID (the number of measurements that can be performed in one measurement term FR) is determined by the size of the memory of the ToF camera 1 and the capacity of the ToF data MD. The distance measurement range (depth measurement interval) indicating the resolution of the depth varies according to the sampling period SP. The shorter the sampling period SP, the shorter the distance measurement range, leading to higher depth resolution.
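The dependence of the depth measurement interval on the sampling period SP follows from d=t×c/2: one sampling period corresponds to a depth step of c×SP/2. The sketch below is an illustration under that reading; the function names are hypothetical, and the maximum depth simply assumes that all N samples are spent on a single measurement term.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def depth_interval(sampling_period_s: float) -> float:
    """Depth covered by one sampling period (depth resolution), in meters."""
    return SPEED_OF_LIGHT_M_PER_S * sampling_period_s / 2.0


def max_measurable_depth(sampling_period_s: float, n_samples: int) -> float:
    """Depth reachable with n_samples consecutive samples (illustrative)."""
    return n_samples * depth_interval(sampling_period_s)


if __name__ == "__main__":
    # Halving the sampling period halves the depth interval (finer
    # resolution) but, for a fixed N, also halves the reachable depth.
    for sp in (2e-9, 1e-9):
        print(sp, depth_interval(sp), max_measurable_depth(sp, 1000))
```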
[3. Configuration of ToF Camera]
The ToF camera 1 includes a light projection unit 10, a light reception unit 20, a processing unit 30, a motion detection unit 60, and a storage unit 70.
Examples of the light projection unit 10 include a laser or a projector that projects the reference light PL. The light reception unit 20 is, for example, an image sensor including a plurality of pixels for infrared detection. The processing unit 30 is an information processing device that processes various types of information. The processing unit 30 generates the depth map DM based on the LiDAR input data ID acquired from the light reception unit 20. The motion detection unit 60 detects motion information of the light reception unit 20. The motion detection unit 60 includes an inertial measurement unit (IMU) and a GPS sensor, for example. The storage unit 70 stores a waveform separation model 71, setting information 72, and a program 79 necessary for information processing performed by the processing unit 30.
The processing unit 30 includes a signal acquisition unit 31, a filter unit 32, a position/posture estimation unit 33, a passive depth estimation unit 34, a fusion unit 35, and a planar angle estimation unit 36.
The signal acquisition unit 31 acquires the LiDAR input data ID from the light reception unit 20. The signal acquisition unit 31 extracts the luminance signal BS of each pixel from the LiDAR input data ID. The signal acquisition unit 31 extracts the active information AI and the passive information PI from the luminance signal BS of each pixel.
The signal acquisition unit 31 generates active depth information ADI indicating an estimation value of the depth of each pixel from the active information AI. The signal acquisition unit 31 supplies the active depth information ADI to the filter unit 32 and the planar angle estimation unit 36. The signal acquisition unit 31 supplies the passive information PI to the filter unit 32, the position/posture estimation unit 33, the passive depth estimation unit 34, and the planar angle estimation unit 36.
The filter unit 32 corrects the active depth information ADI (first active depth information ADI1) acquired from the signal acquisition unit 31 based on the passive information PI. For example, the filter unit 32 estimates the shape of the subject SU based on the two-dimensional image information of the subject SU extracted from the passive information PI. The filter unit 32 corrects the first active depth information ADI1 so that the depth of the subject SU matches the shape of the subject SU estimated based on the passive information PI. The filter unit 32 supplies the corrected active depth information ADI (second active depth information ADI2) to the position/posture estimation unit 33 and the fusion unit 35.
[4. Active Depth Information Correction Processing Using Passive Information]
For example, the filter unit 32 generates the two-dimensional image IM of the subject SU based on the passive information PI. The filter unit 32 extracts a planar portion PN of the subject SU from the two-dimensional image IM using a known object recognition technology. The planar angle estimation unit 36 estimates an angle θ of the planar portion PN of the subject SU based on the first active depth information ADI1.
As illustrated on the left part of
Returning to
The passive depth estimation unit 34 generates passive depth information PDI based on the passive information PI. For example, the passive depth estimation unit 34 extracts, from the passive information PI, the plurality of two-dimensional images IM of the subject SU acquired from different viewpoints at different time points. The passive depth estimation unit 34 detects a plurality of viewpoints corresponding to the plurality of two-dimensional images IM based on the position/posture information PPI of the light reception unit 20 estimated by the position/posture estimation unit 33. The passive depth estimation unit 34 uses stereo imaging to generate the passive depth information PDI from the plurality of two-dimensional images IM.
The fusion unit 35 generates the depth information DI of the subject SU by fusing the passive depth information PDI with the second active depth information ADI2 generated based on the active information AI. The fusion ratio between the second active depth information ADI2 and the passive depth information PDI is set based on an energy ratio between the active signal and the passive signal, for example.
[5. Waveform Separation Processing]
As illustrated in
The waveform conversion unit 41 converts a peak attributed to the reference light PL into a pulsed waveform in the luminance signal BS. The waveform conversion unit 41 supplies a corrected luminance signal MBS obtained by performing the waveform conversion processing on the luminance signal BS to the peak acquisition unit 42.
The waveform conversion processing is performed based on the waveform separation model 71. The waveform separation model 71 is, for example, a trained neural network obtained by performing machine learning on the relationship between the reference light PL and the luminance signal BS. The luminance signal BS detected for a subject SU whose distance is known in advance is used as input data for the machine learning. As illustrated in
The peak acquisition unit 42 acquires a signal of a portion corresponding to the pulsed waveform as the active signal AS attributed to the reference light PL from the luminance signal BS (corrected luminance signal MBS) having a waveform converted by the waveform conversion unit 41. The peak acquisition unit 42 calculates the depth based on the position on the time axis of the active signal AS. The peak acquisition unit 42 generates the first active depth information ADI1 based on the depth calculated for each pixel.
The section averaging unit 43 calculates an average signal value of the corrected luminance signal MBS of the portion attributed to the ambient light EL. The section averaging unit 43 acquires the calculated average signal value as a signal value of a passive signal PS attributed to the ambient light EL.
A method of calculating the signal value of the passive signal PS varies according to a ratio between the energy values of the active signal AS and the passive signal PS. Based on a preset determination criterion, the section averaging unit 43 sets various sections on the time axis in which the signal values of the corrected luminance signal MBS are averaged. For example, the determination criterion is set as a condition that the energy value of the passive signal PS is larger than the energy value of the active signal AS, with a significant difference between these energy values.
For example, as illustrated in
[6. Fusion Ratio Setting Processing]
Returning to
For example, a pixel in which the fusion ratio FUR of the second active depth information ADI2 is larger than the fusion ratio FUR of the passive depth information PDI is set as an active dominant pixel in which the active signal AS is dominant. In addition, a pixel in which the fusion ratio FUR of the passive depth information PDI is larger than the fusion ratio FUR of the second active depth information ADI2 is set as a passive dominant pixel in which the passive signal PS is dominant.
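The classification above can be sketched in Python as follows. The source states only that the fusion ratio FUR is set based on the energy ratio between the active signal AS and the passive signal PS; the normalization used here (each ratio is the share of the total energy) and the handling of an exact tie are assumptions made for illustration.

```python
def fusion_ratios(active_energy: float, passive_energy: float):
    """Return (active_ratio, passive_ratio) as shares of the total energy.

    Assumption: the ratios are proportional to the energy values and
    sum to 1. The source does not give the exact mapping.
    """
    total = active_energy + passive_energy
    if total <= 0:
        raise ValueError("total energy must be positive")
    active_ratio = active_energy / total
    return active_ratio, 1.0 - active_ratio


def dominant_signal(active_energy: float, passive_energy: float) -> str:
    """Classify a pixel as 'active' or 'passive' dominant."""
    active_ratio, passive_ratio = fusion_ratios(active_energy, passive_energy)
    if active_ratio > passive_ratio:
        return "active"
    if passive_ratio > active_ratio:
        return "passive"
    # The source does not define the equal case; reported separately here.
    return "undetermined"


if __name__ == "__main__":
    print(fusion_ratios(3.0, 1.0), dominant_signal(3.0, 1.0))
```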
The fusion unit 35 extracts an active dominant pixel group in which the active signal AS is dominant from a plurality of pixels included in the light reception unit 20. The fusion unit 35 calculates, for each pixel, a difference between the passive depth estimated from the passive information PI and the active depth estimated from the active information AI. The fusion unit 35 performs bundle adjustment of the passive depth of all the pixels of the light reception unit 20 such that the difference between the passive depth and the active depth satisfies a predetermined criterion in the active dominant pixel group.
For example, the bundle adjustment is performed so as to satisfy a criterion represented by the following Formula (1). Formula (1) represents a condition of minimizing the sum, over all the pixels of the active dominant pixel group, of the difference between the passive depth and the active depth.
\min \sum_{i \in D} \left| depth_{Passive}(i) - depth_{Active}(i) \right| \quad (1)
The fusion unit 35 sets the depth of the passive dominant pixel group in which the passive signal PS is dominant, as a passive depth after the bundle adjustment. The fusion unit 35 sets the depth of the pixel group other than the passive dominant pixel group, as an active depth.
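The adjust-then-compose flow described above can be sketched as follows. This is a deliberately simplified stand-in: instead of a real bundle adjustment, a single scale factor is fitted to the passive depth by least squares over the active dominant pixels, so the squared difference is minimized rather than whatever norm the disclosure's criterion uses. The function names and the flat per-pixel lists are assumptions.

```python
def fit_scale(passive, active, active_dominant):
    """Least-squares scale s minimizing sum((s*p - a)^2) over active dominant pixels.

    A one-parameter stand-in for bundle adjustment (assumption).
    """
    num = sum(p * a for p, a, m in zip(passive, active, active_dominant) if m)
    den = sum(p * p for p, m in zip(passive, active_dominant) if m)
    if den == 0:
        raise ValueError("no usable active dominant pixels")
    return num / den


def fuse_depth(passive, active, active_dominant, passive_dominant):
    """Compose the final depth per pixel.

    Passive dominant pixels take the adjusted passive depth; all other
    pixels take the active depth, as described in the text.
    """
    scale = fit_scale(passive, active, active_dominant)
    return [
        scale * p if is_passive else a
        for p, a, is_passive in zip(passive, active, passive_dominant)
    ]


if __name__ == "__main__":
    passive = [1.0, 2.0, 4.0, 5.0]   # passive depth, off by a scale factor
    active = [2.0, 4.0, 0.0, 0.0]    # active depth, unreliable at far pixels
    active_dom = [True, True, False, False]
    passive_dom = [False, False, True, True]
    print(fuse_depth(passive, active, active_dom, passive_dom))
```

The adjustment is estimated only where the active depth is trustworthy but is applied to the passive depth of every pixel, which is what lets the passive dominant pixels benefit from it.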
[7. Setting of Measurement Term and Sampling Period]
The measurement term re-setting unit 45 calculates a minimum measurement term FR in which the ratio of the energy values of the passive signal PS satisfies a preset high ratio condition. The sampling period re-setting unit 46 sets the sampling period SP of the light reception unit 20 based on the maximum depth value of the subject SU, the number N of measurable samples, and the minimum measurement term FR satisfying the high ratio condition.
In
P_{Psv}(T_{Psv} - T_{Act}) \geq 100\,P_{Act}\,T_{Act} \quad (2)
The high ratio condition is satisfied by increasing the measurement term FR (with long TPsv). However, due to the upper limit of the number of samples N acquirable as the LiDAR input data ID, increasing the measurement term FR prolongs the sampling period. The prolonged sampling period increases the distance measurement range, leading to a lowered depth resolution. To handle this, the measurement term re-setting unit 45 sets the length TPsv of the measurement term FR as the minimum length that satisfies Formula (2).
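Solving Formula (2) for TPsv gives the minimum length directly: TPsv ≥ TAct×(1+100×PAct/PPsv). The Python sketch below evaluates this bound. The parameter names mirror the symbols of Formula (2); the function name and the `factor` parameter (defaulting to the constant 100 from the formula) are assumptions added for illustration.

```python
def min_measurement_term(p_act: float, t_act: float, p_psv: float,
                         factor: float = 100.0) -> float:
    """Smallest T_Psv satisfying P_Psv * (T_Psv - T_Act) >= factor * P_Act * T_Act.

    Rearranged: T_Psv >= T_Act * (1 + factor * P_Act / P_Psv).
    Units are arbitrary but must be consistent between arguments.
    """
    if p_psv <= 0:
        raise ValueError("passive signal level must be positive")
    return t_act * (1.0 + factor * p_act / p_psv)


if __name__ == "__main__":
    t_psv = min_measurement_term(p_act=2.0, t_act=1.0, p_psv=4.0)
    print(t_psv)
    # The bound meets Formula (2) with equality.
    print(4.0 * (t_psv - 1.0) >= 100 * 2.0 * 1.0)
```

A weaker ambient level (smaller PPsv) pushes the minimum term up, which is why the measurement term cannot simply be shortened to gain depth resolution.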
The sampling period re-setting unit 46 adaptively determines the sampling period SP for each frame in accordance with the distance to the subject SU so as to maximize the depth resolution. The sampling period re-setting unit 46 extracts the maximum value of depth (maximum depth value) from both the second active depth information ADI2 and the passive depth information PDI, and calculates the sampling period SP using the maximum depth value with higher level of reliability.
For example, in an active distance measurement technique using the reference light PL, the light reception intensity regarding the reference light PL (reflected light) varies according to the distance to the subject SU. The shorter the distance, the higher the light reception intensity and the higher the depth detection accuracy. In contrast, the longer the distance, the lower the light reception intensity and the lower the depth detection accuracy. In an active distance measurement technique, the measurable distance is shorter than in a passive measurement technique using stereo imaging. Accordingly, the sampling period re-setting unit 46 determines the priority of the second active depth information ADI2 and the passive depth information PDI based on the distance to the subject SU. The sampling period re-setting unit 46 sets the sampling period SP using the maximum depth value extracted from the information determined to have higher priority.
For example, the sampling period re-setting unit 46 extracts the maximum depth (maximum depth value) among the depths of the subject SU from both the second active depth information ADI2 and the passive depth information PDI. The sampling period re-setting unit 46 determines the priority of the passive depth information PDI based on a maximum depth value (first maximum depth value) extracted from the second active depth information ADI2 and a maximum depth value (second maximum depth value) extracted from the passive depth information PDI.
When having determined to prioritize the passive depth information PDI, the sampling period re-setting unit 46 sets the sampling period SP using the second maximum depth value. When having determined not to prioritize the passive depth information PDI, the sampling period re-setting unit 46 sets the sampling period SP using a maximum depth value (third maximum depth value) extracted from the depth information DI.
For example, assume that the magnitude of the first maximum depth value is max(depthActive), the magnitude of the second maximum depth value is max(depthPassive), the length of the sampling period SP is ΔT, the number of measurable samples is N, the speed of light is c, and the number of buffer samples is buffer. As indicated in the following Formula (3), in a case where the second maximum depth value is sufficiently larger than the first maximum depth value and it is determined that the depth cannot be accurately measured by the active measurement technique, the sampling period re-setting unit 46 sets the sampling period SP using measurement information (second maximum depth value) of the passive measurement technique, which is capable of accurately measuring distant positions.
As indicated in the following Formula (4), in a case where the second maximum depth value and the first maximum depth value are equal, and it is determined that the depth can be measured with high accuracy by the active measurement technique, the sampling period re-setting unit 46 sets the sampling period SP using measurement information (third maximum depth value=max(depthFusion)) with high certainty obtained by fusion.
However, as indicated in the following Formula (5), when the sampling period SP becomes short, the measurement term FR, which is calculated from the sampling period SP and the number of samples N, becomes shorter than the minimum measurement term FR calculated by Formula (2) in some cases. In this case, the sampling period re-setting unit 46 sets the sampling period SP based on the minimum measurement term FR calculated by Formula (2).
\min(T_{Psv}) > \Delta T \times N \Rightarrow \Delta T = \min(T_{Psv}) / N \quad (5)
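Formula (5) acts as a lower clamp on the sampling period. A minimal Python sketch is given below; the function and parameter names are hypothetical, and the candidate sampling period passed in is assumed to have been obtained beforehand from the maximum depth value.

```python
def clamp_sampling_period(delta_t: float, n_samples: int,
                          min_term: float) -> float:
    """Apply Formula (5).

    If the measurement term delta_t * n_samples would be shorter than the
    minimum measurement term, lengthen the sampling period so that the
    term equals the minimum; otherwise keep the candidate value.
    """
    if n_samples <= 0:
        raise ValueError("number of samples must be positive")
    if min_term > delta_t * n_samples:
        return min_term / n_samples
    return delta_t


if __name__ == "__main__":
    # Candidate term 1.0 * 100 = 100 is below the minimum of 200,
    # so the sampling period is raised.
    print(clamp_sampling_period(1.0, 100, 200.0))
    # Candidate term already exceeds the minimum of 50, so it is kept.
    print(clamp_sampling_period(1.0, 100, 50.0))
```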
The measurement term re-setting unit 45 and the sampling period re-setting unit 46 control the measurement term FR and the sampling period SP of the light reception unit 20 based on the calculation result. The light reception unit 20 performs subsequent time (subsequent frame) measurement based on the set measurement term FR and sampling period SP.
The information regarding the above-described various conditions and criteria is included in the setting information 72. The waveform separation model 71, the setting information 72, and the program 79 used for the above-described processing are stored in the storage unit 70. The program 79 is a program that causes a computer to execute information processing according to the present embodiment. The processing unit 30 performs various processes in accordance with the program 79 stored in the storage unit 70. The storage unit 70 may be used as a work area for temporarily storing a processing result of the processing unit 30. The storage unit 70 includes, for example, any non-transitory storage medium such as a semiconductor storage medium and a magnetic storage medium. The storage unit 70 includes an optical disk, a magneto-optical disk, or flash memory, for example. The program 79 is stored in a non-transitory computer-readable storage medium, for example.
The processing unit 30 is a computer including a processor and memory, for example. The memory of the processing unit 30 includes random access memory (RAM) and read only memory (ROM). By executing the program 79, the processing unit 30 functions as the signal acquisition unit 31, the filter unit 32, the position/posture estimation unit 33, the passive depth estimation unit 34, the fusion unit 35, the planar angle estimation unit 36, the waveform conversion unit 41, the peak acquisition unit 42, the section averaging unit 43, the fusion ratio setting unit 44, the measurement term re-setting unit 45, and the sampling period re-setting unit 46.
[8. Information Processing Method]
In step S1, the light projection unit 10 projects the reference light PL toward the subject SU. In step S2, the light reception unit 20 receives the reflected light RL. In step S3, the processing unit 30 acquires the LiDAR input data ID as the light reception data LRD from the light reception unit 20.
In step S4, the processing unit 30 performs processing for enhancing the quality of the active depth information ADI. For example, the processing unit 30 extracts, from the LiDAR input data ID, the active information AI indicating the flight time of the reference light PL and the passive information PI indicating the two-dimensional image information of the subject SU obtained from the ambient light EL. The processing unit 30 corrects the active depth information ADI so that the depth of the subject SU matches the shape of the subject SU estimated based on the passive information PI.
In parallel with step S4, the processing unit 30 estimates the depth of the subject SU from the passive information PI using a passive distance measurement technique. For example, in step S5, the processing unit 30 estimates the position and posture of the light reception unit 20 based on the two-dimensional image information of the subject SU extracted from the passive information PI, the active depth information ADI, and the motion information acquired from the motion detection unit 60.
In step S6, the processing unit 30 extracts, from the passive information PI, the plurality of two-dimensional images IM of the subject SU acquired from different viewpoints at different time points. For each two-dimensional image IM, the processing unit 30 detects the viewpoint of the light reception unit 20 from which the two-dimensional image IM has been acquired, based on the information of the position and posture of the light reception unit 20. The processing unit 30 uses stereo imaging to generate passive depth information PDI from the plurality of two-dimensional images IM having different viewpoints.
In step S7, the processing unit 30 fuses the passive depth information PDI with the active depth information ADI. This leads to composition of the active depth information ADI and the passive depth information PDI.
In step S8, the processing unit 30 determines whether to end the processing. For example, the processing unit 30 determines to end the processing when having received an operation on an image capture end button or the like from the user operating the ToF camera 1. In a case where it is determined in step S8 to end the processing (step S8: Yes), the depth estimation processing is ended. In a case where it is determined in step S8 not to end the processing (step S8: No), the processing returns to step S1, and the above-described steps are repeated until it is determined to end the processing.
[9. Effects]
The processing unit 30 includes the signal acquisition unit 31, the passive depth estimation unit 34, and the fusion unit 35. The signal acquisition unit 31 extracts, from the light reception data LRD, active information AI indicating the flight time of the reference light PL and passive information PI indicating the two-dimensional image information of the subject SU obtained from the ambient light EL. The passive depth estimation unit 34 generates the passive depth information PDI based on the passive information PI. The fusion unit 35 generates the depth information DI of the subject SU by fusing the passive depth information PDI with the active depth information ADI generated based on the active information AI. The information processing method of the present embodiment includes execution of the processing of the processing unit 30 described above by a computer. The program 79 of the present embodiment causes the computer to implement the processing of the processing unit 30 described above.
With this configuration, the depth detection accuracy is enhanced using the ambient light EL. Therefore, high accuracy detection of the depth is performed even in an environment with strong ambient light such as outdoors.
The passive depth estimation unit 34 extracts, from the passive information PI, the plurality of two-dimensional images IM of the subject SU acquired from different viewpoints at different time points. The passive depth estimation unit 34 uses stereo imaging to generate the passive depth information PDI from the plurality of two-dimensional images IM.
With this configuration, the passive depth information PDI is generated without preparing the light reception unit 20 different for each viewpoint.
The passive depth estimation unit 34 detects a plurality of viewpoints corresponding to the plurality of two-dimensional images IM based on the position/posture information PPI estimated using the active depth information ADI.
With this configuration, the passive depth information PDI with high depth estimation accuracy can be obtained.
The processing unit 30 includes the filter unit 32. The filter unit 32 corrects the active depth information ADI so that the depth of the subject SU matches the shape of the subject SU estimated based on the passive information PI.
With this configuration, active depth information with high depth estimation accuracy can be obtained.
The processing unit 30 includes the planar angle estimation unit 36. The planar angle estimation unit 36 estimates the angle θ of the planar portion PN of the subject SU based on the active depth information ADI (first active depth information ADI1) before correction. In a case where the planar portion PN is orthogonal to the depth direction, the filter unit 32 corrects the active depth information ADI such that the planar portion PN has an identical depth. In a case where the planar portion PN is inclined with respect to the depth direction, the filter unit 32 corrects the active depth information ADI such that the depth of the planar portion PN varies according to the angle θ.
With this configuration, the depth of the planar portion PN is accurately estimated based on the angle θ of the planar portion PN.
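The planar correction can be illustrated in one dimension: a straight line is fitted by least squares to the noisy active depth along one row of the planar portion PN, the slope gives the inclination, and the depths are replaced by the fitted values. This is only a sketch under assumptions not found in the source: the row positions `xs` are taken to be in the same metric unit as the depth, and the angle is measured from the plane orthogonal to the depth direction (so a plane of identical depth has angle 0).

```python
import math


def correct_planar_row(xs, depths):
    """Fit depth = a + slope * x by least squares and snap depths to the line.

    Returns (angle_rad, corrected_depths). A 1D stand-in for the plane
    fit; the angle convention is an assumption.
    """
    n = len(xs)
    if n < 2 or n != len(depths):
        raise ValueError("need at least two matching samples")
    mean_x = sum(xs) / n
    mean_d = sum(depths) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x positions must not all be equal")
    sxd = sum((x - mean_x) * (d - mean_d) for x, d in zip(xs, depths))
    slope = sxd / sxx
    # Inclination of the planar portion relative to a constant-depth plane.
    angle = math.atan(slope)
    # Corrected depth varies linearly according to the estimated slope.
    corrected = [mean_d + slope * (x - mean_x) for x in xs]
    return angle, corrected


if __name__ == "__main__":
    # Fronto-parallel surface with symmetric noise: depths collapse to 2.0.
    print(correct_planar_row([0, 1, 2, 3], [2.1, 1.9, 1.9, 2.1]))
    # Inclined surface: depth varies linearly with position.
    print(correct_planar_row([0, 1, 2, 3], [1.0, 2.0, 3.0, 4.0]))
```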
The signal acquisition unit 31 includes the waveform conversion unit 41, the peak acquisition unit 42, and the section averaging unit 43. The waveform conversion unit 41 converts a peak attributed to the reference light PL into a pulsed waveform. The peak acquisition unit 42 acquires a signal of a portion corresponding to the pulsed waveform as the active signal AS attributed to the reference light PL. The section averaging unit 43 calculates an average signal value of a portion attributed to the ambient light EL. The section averaging unit 43 acquires the calculated average signal value as a signal value of the passive signal PS attributed to the ambient light EL.
This configuration makes it possible to separate the active signal AS and the passive signal PS from each other with high accuracy.
The signal acquisition unit 31 includes the fusion ratio setting unit 44. The fusion ratio setting unit 44 sets the fusion ratio FUR of the active depth information ADI and the passive depth information PDI based on the ratio between the energy value of the active signal AS and the energy value of the passive signal PS.
With this configuration, the appropriate fusion ratio FUR according to the reliability of the active depth information ADI is set.
The fusion unit 35 extracts an active dominant pixel group in which the active signal AS is dominant. The fusion unit 35 performs bundle adjustment of the passive depth of all the pixels such that a difference between the passive depth estimated from the passive information PI and the active depth estimated from the active information AI satisfies a predetermined criterion in the active dominant pixel group. The fusion unit 35 sets the depth of the passive dominant pixel group in which the passive signal PS is dominant as the passive depth after the bundle adjustment, and sets the depth of the pixel group other than the passive dominant pixel group as the active depth.
This configuration makes it possible to enhance the passive depth estimation accuracy.
The signal acquisition unit 31 includes the measurement term re-setting unit 45. The measurement term re-setting unit 45 calculates the minimum measurement term FR in which the ratio of the energy values of the passive signal PS satisfies a preset high ratio condition.
This configuration makes it possible to stably acquire the light reception data LRD satisfying the high ratio condition.
The signal acquisition unit 31 includes the sampling period re-setting unit 46. The sampling period re-setting unit 46 sets the sampling period SP based on the maximum depth value of the subject SU, the number N of measurable samples, and the minimum measurement term FR satisfying the high ratio condition.
This configuration makes it possible to perform measurement with high depth resolution.
The sampling period re-setting unit 46 determines the priority of the active depth information ADI and the passive depth information PDI based on the distance to the subject SU. The sampling period re-setting unit 46 sets the sampling period SP using the maximum depth value extracted from the information determined to have higher priority.
With this configuration, the maximum depth value, which is the basis of the calculation of the sampling period SP, is appropriately determined based on the reliability of the measurement information according to the distance to the subject SU.
The sampling period re-setting unit 46 determines the priority of the passive depth information PDI based on the first maximum depth value extracted from the active depth information ADI and the second maximum depth value extracted from the passive depth information PDI. When having determined to prioritize the passive depth information PDI, the sampling period re-setting unit 46 sets the sampling period SP using the second maximum depth value. When having determined not to prioritize the passive depth information PDI, the sampling period re-setting unit 46 sets the sampling period SP using the third maximum depth value extracted from the depth information DI.
With this configuration, an appropriate sampling period is calculated based on the maximum depth value with high accuracy.
The effects described in the present specification are merely examples, and thus, there may be other effects, not limited to the exemplified effects.
Note that the present technique can also have the following configurations.
(1)
An information processing device comprising:
The information processing device according to (1),
The information processing device according to (2),
The information processing device according to any one of (1) to (3), further comprising
The information processing device according to (4), further comprising
The information processing device according to any one of (1) to (5),
The information processing device according to (6),
The information processing device according to (6) or (7),
The information processing device according to any one of (6) to (8),
The information processing device according to (9),
The information processing device according to (10),
The information processing device according to (11),
An information processing method to be executed by a computer, the method comprising:
A program for causing a computer to execute:
| Number | Date | Country | Kind |
|---|---|---|---|
| 2021-033442 | Mar 2021 | JP | national |

| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/JP2022/002517 | 1/25/2022 | WO | |