The present disclosure generally relates to methods and apparatus for use in optical devices, and more particularly, for improving processing capabilities for computer vision applications.
A stereoscopic camera arrangement is an apparatus made of two camera units assembled in a stereoscopic module. Stereoscopy (also referred to as “stereoscopics” or “3D imaging”) is a technique for creating or enhancing the illusion of depth in an image by means of stereopsis. In other words, it is the impression of depth that is perceived when a scene is viewed with both eyes by someone having normal binocular vision: because the two eyes (or cameras) are at slightly different locations, each receives a slightly different image of the scene.
Combining 3D information derived from stereoscopic images, particularly for video streams, requires that a search and comparison of a large number of pixels be carried out for each pair of images, each image being derived from a different image capturing device.
The present invention seeks to provide a computational module that enables improving the robustness of depth calculation based on information received from one or more image capturing devices.
The disclosure may be summarized by referring to the appended claims.
It is an object of the present disclosure to provide a method and apparatus that implement an innovative depth calculation process based on information comprised in an image captured by one or more image capturing devices.
It is another object of the present disclosure to provide a method and apparatus that enable improving the robustness of depth map results obtained from information received from one or more image capturing devices.
It is another object of the present disclosure to provide a method and apparatus that enable distinguishing between areas included in the captured image that comprise details that are implementable by a matching algorithm and areas that do not have such details.
Other objects of the present invention will become apparent from the following description.
According to a first embodiment of the disclosure, there is provided a computational platform for use in a depth calculation process based on information comprised in an image captured by one or more image capturing devices (sensors), wherein the computational platform enables distinguishing between areas included in the captured image that comprise details that are implementable by a matching algorithm and areas that do not have such details, wherein the computational platform comprises:
The term “computational platform” as used herein throughout the specification and claims, is used to denote a number of distinct but interrelated units for carrying out a computational process. Such a computational platform can be a computer, or a computational module such as an Application-Specific Integrated Circuit (“ASIC”), or a Field Programmable Gate Array (“FPGA”), or any other applicable processing device.
The terms “image capturing device”, “image capturing sensor” and “image sensor”, as used interchangeably herein throughout the specification and claims, are used to denote a sensor that detects and conveys information used to make an image. Typically, it does so by converting the variable attenuation of waves (as they pass through or reflect off objects) into signals. The waves can be light waves or other electromagnetic radiation. An image sensor may be used in robotic devices, AR/VR glasses, drones, digital cameras, smartphones, medical imaging equipment, night vision equipment and the like.
According to another embodiment, the metric is based on a physical model of a signal representing the captured image.
In accordance with another embodiment, the at least one processor is further configured to remove outlier values from each of the at least one selected matching window.
By yet another embodiment, the at least one processor is further configured to apply a normalization function to compensate for the metric's dependency on the number of pixels comprised in the selected matching window.
According to still another embodiment, the at least one processor is further configured to select the metric from among a group consisting of:
According to another aspect of the disclosure, there is provided a method for use in a depth calculation process based on information comprised in an image captured by one or more image capturing devices (sensors), that enables distinguishing between areas included in the captured image that comprise details that are implementable by a matching algorithm and areas that do not have such details, wherein the method comprises the steps of:
According to another embodiment of the present aspect of the disclosure the metric is based on a physical model of a signal representing the captured image.
By yet another embodiment of the present aspect of the disclosure, the method further comprises a step of removing outlier values from each of the at least one selected matching window.
In accordance with another embodiment of the present aspect of the disclosure, the method further comprises a step of applying a normalization function to compensate for the metric's dependency on the number of pixels comprised in the selected matching window.
According to still another embodiment of the present aspect of the disclosure, the method further comprises a step of selecting the metric from among a group consisting of:
According to another aspect of the present disclosure there is provided a method for use in a depth calculation process based on information comprised in an image captured by one or more image capturing devices, that enables distinguishing between areas included in the captured image that comprise details that are implementable by a matching algorithm and areas that do not have such details, wherein said method comprises the steps of:
According to another aspect of the present disclosure, there is provided an image capturing sensor comprising a computational platform as described hereinabove.
For a more complete understanding of the present invention, reference is now made to the following detailed description taken in conjunction with the accompanying drawings wherein:
In this disclosure, the term “comprising” is intended to have an open-ended meaning so that when a first element is stated as comprising a second element, the first element may also include one or more other elements that are not necessarily identified or described herein, or recited in the claims.
In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a better understanding of the present invention by way of examples. It should be apparent, however, that the present invention may be practiced without these specific details.
The present invention seeks to provide an improved solution for obtaining depth results retrieved by using one or more stereoscopic capturing devices, a solution which enables distinguishing between areas in the captured image that do not comprise any informative details (e.g., details that are implementable by a matching algorithm) and areas that do have such informative details.
Let us first consider an example illustrated in
In this example, an active approach (i.e., an approach by which a pattern is projected onto the scene) is implemented.
In this
Numeral “2” presents dark areas associated with a low signal. If a stereo matching algorithm were applied to these areas, the results thus obtained might have a large error, due to the difficulty of distinguishing between real texture and the noise signal.
Numeral “3” presents areas wherein the projected patterns appear normally. In these areas, a stereo matching algorithm may be successfully used, as all the information required for the results to be accurate is included in these areas.
The solution provided by the present invention relies on using a metric that allows quantitatively evaluating the existence of real information in one or more selected windows comprised in a stereo image.
In order to illustrate the present solution, let us consider the following steps.
First,
When comparing these histograms with the respective images, one may conclude that the farther the two peaks are from each other, and the smaller the intersection between the signals representing light and dark patterns, the easier it would be for the matching algorithm to produce a robust result.
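The relationship between peak separation and histogram intersection can be illustrated numerically. The sketch below is a hypothetical illustration, not part of the disclosure: it assumes the light-pattern and dark-pattern signals each follow a normal distribution with equal sigma, and computes the intersection area of the two unit-area curves. The function names and the equal-sigma assumption are choices made here for illustration only.

```python
import math

def normal_cdf(x):
    # Standard normal cumulative distribution function via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def peak_overlap(mu_dark, mu_light, sigma):
    """Intersection area of two unit-area normal curves with equal sigma.

    Equal-variance curves cross midway between their means, so the
    overlap equals twice the tail area beyond that crossing point.
    """
    half_distance = abs(mu_light - mu_dark) / 2.0
    return 2.0 * normal_cdf(-half_distance / sigma)
```

For example, with sigma = 10, peaks 20 units apart overlap by roughly 32%, whereas peaks 80 units apart overlap by only about 0.006%: the farther apart the peaks, the smaller the intersection, and the more robust the matching.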
In order to produce a mathematical metric that would allow evaluating the robustness of the matching results for a specific region, one must be aware of the nature of the noise.
A signal outputted from an image capturing sensor has many noise components. Some of these noise components are related to the physics of light, while others relate to the image capturing sensor's architecture. The most important noise components are shot noise and readout (RO) noise.
Shot noise is a type of noise that can be modeled by a Poisson process. It occurs in photon counting in optical devices, where shot noise is associated with the particle nature of light. For the signal levels that we are interested in, it can be described with good accuracy by a normal distribution: the noise behavior of a signal with an average value of X electrons may be formulated as a normal distribution characterized by a standard deviation equal to the square root of the mean value X.
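The shot-noise model just described can be verified with a short simulation. The sketch below is an illustration, not part of the disclosure: it draws Poisson-distributed photoelectron counts and confirms that their empirical standard deviation approaches the square root of the mean; the function names are assumptions made for illustration.

```python
import math
import random

def poisson_sample(lam, rng):
    # Knuth's method: multiply uniform draws until the running product
    # drops below e^-lam; the number of successful draws is Poisson(lam).
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def shot_noise_std(mean_electrons, samples, seed=0):
    # Empirical standard deviation of simulated photoelectron counts.
    rng = random.Random(seed)
    counts = [poisson_sample(mean_electrons, rng) for _ in range(samples)]
    mean = sum(counts) / samples
    variance = sum((c - mean) ** 2 for c in counts) / samples
    return math.sqrt(variance)
```

For a mean signal of X = 100 electrons, shot_noise_std(100, 20000) comes out close to the square root of the mean, i.e., 10 electrons.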
RO noise follows a normal distribution and usually does not depend on the signal level. This noise is important only for low-signal analysis.
As a next step, let us consider an image comprising an area having no texture, which is equally illuminated at each point included in this area. As was discussed above, the measured signals of such an image can be represented by a normal distribution with a standard deviation that depends on the average signal level.
Knowing the nature of the noise, a few metrics, which will be further discussed, may be proposed.
First, let us consider a metric that is calculated using the formula (maxValue−minValue)/std where:
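The symbol definitions that follow the formula are not reproduced in this excerpt. The sketch below adopts one plausible reading, used purely for illustration: maxValue and minValue are taken as the extreme pixel values within the selected matching window, and std as the standard deviation of that window's pixel values; the function name is an assumption.

```python
import statistics

def window_metric(pixels):
    """(maxValue - minValue) / std over the pixels of one matching window.

    Assumed reading (the disclosure's symbol definitions are not
    reproduced here): maxValue and minValue are the extreme pixel values
    in the window; std is the standard deviation of the window's pixels.
    """
    std = statistics.pstdev(pixels)
    if std == 0.0:
        return 0.0  # a perfectly flat window has no spread at all
    return (max(pixels) - min(pixels)) / std
```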
In order to remove outliers, a simple procedure was used for the removal of the N highest and the N lowest values from the calculation. The value of N depends on the number of pixels, and is calculated as:
This reflects the known fact that 99.7% of all points of a normal distribution fall within a range of 6 sigma.
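A trimming step along these lines can be sketched as follows. The exact formula for N is not reproduced in this excerpt; as a hypothetical instantiation consistent with the 99.7%/6-sigma remark, the sketch takes N as 0.3% of the pixel count, rounded up. Both the function name and that choice of N are assumptions made for illustration.

```python
import math

def remove_outliers(pixels):
    # Drop the N highest and N lowest values before computing the metric.
    # Hypothetical choice of N: 0.3% of the pixel count, rounded up,
    # mirroring the 99.7%-within-6-sigma property of a normal distribution.
    n = math.ceil(0.003 * len(pixels))
    ordered = sorted(pixels)
    return ordered[n:len(ordered) - n]
```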
To compensate for insufficient statistics in cases of a small number of pixels, one can introduce a compensating polynomial function, one that depends on the number of pixels.
In the following description, the simulation results are presented. From this description and the associated drawings, the metric's dependency on different simulation parameters may be understood.
While aggregating the results obtained from carrying out the various simulations, the following characteristics can be identified:
In addition, other metrics that can optionally be used, based on the nature of the noise, are:
The parameters included in the above metrics are:
The above metrics may optionally be used in specific use cases.
Now, all the simulation results discussed above were achieved by using a physical model of the signal as stored at the image capturing sensor's analog part (the photodiode). A digitization procedure usually has a linear characteristic, so that when applying the selected metric equation, the result does not depend on the scaling factor introduced while converting the data from the electron form to the digital number (“DN”) form; hence the equation remains essentially the same as before, with some minor adaptations:
wherein:
There are various ways of utilizing the metric map discussed above as part of the depth determination process. For example:
According to the method demonstrated in this example, images captured by two image capturing devices are provided (step 100). For each pixel, a processor selects a window for matching a corresponding part included in each image captured by the two image capturing devices (step 110).
Based on the selected window, metric(i,j) is calculated (step 120).
Next, an information map is generated (step 130). The map is generated in the following way: if the value of metric(i,j) is greater than a predefined threshold, then InfoMap(i,j)=0; otherwise InfoMap(i,j)=1.
In step 140, depth(i,j) is calculated based on the corresponding stereo image pairs (or, alternatively, based on a mono image, if applicable), and for all i, j values where InfoMap(i,j)=0, the value of depth(i,j) is set to unknown.
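Steps 130 and 140 can be sketched as follows, assuming the metric and depth values are held in simple row-major lists of lists; the function and variable names, the use of None for an unknown depth, and the choice of threshold are all assumptions made for illustration. The sketch follows the document's convention whereby a metric(i,j) value above the threshold yields InfoMap(i,j)=0 and the corresponding depth value is discarded.

```python
def filter_depth(metric_map, depth_map, threshold):
    """Build InfoMap from a metric map (step 130) and mask out depth
    values wherever InfoMap is 0 (step 140)."""
    rows, cols = len(metric_map), len(metric_map[0])
    # Step 130: metric(i, j) > threshold  =>  InfoMap(i, j) = 0.
    info_map = [[0 if metric_map[i][j] > threshold else 1
                 for j in range(cols)] for i in range(rows)]
    # Step 140: depth(i, j) is set to unknown (None) where InfoMap is 0.
    filtered = [[depth_map[i][j] if info_map[i][j] == 1 else None
                 for j in range(cols)] for i in range(rows)]
    return info_map, filtered
```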
In summary, the present solution provides a variety of metrics, each of which is based on the physical model of the signals as captured by the image capturing sensor. Each of the metrics constructed in accordance with the present invention is configured to distinguish areas that comprise information sufficient to allow a robust matching procedure for calculating depth from the stereo images, from areas that do not include such information.
The method provided herein relies on a step of removing outlier values from each selected matching window, thereby achieving more robust results. Optionally, a normalization function is applied to compensate for the metric's dependency on the number of pixels in the selected matching window. Finally, the generated metric map may be used in the process of filtering depth results.
In the description and claims of the present application, each of the verbs “comprise”, “include” and “have”, and conjugates thereof, is used to indicate that the object or objects of the verb are not necessarily a complete listing of members, components, elements or parts of the subject or subjects of the verb.
The present invention has been described using detailed descriptions of embodiments thereof that are provided by way of example and are not intended to limit the scope of the invention in any way. The described embodiments comprise different features, not all of which are required in all embodiments of the invention. Some embodiments of the present invention utilize only some of the features or possible combinations of the features. Variations of the described embodiments of the present invention, and embodiments comprising different combinations of the features noted in the described embodiments, will occur to persons skilled in the art. The scope of the invention is limited only by the following claims.