The present disclosure relates generally to image reconstruction technologies.
Recently, three-dimensional (3-D) images have become increasingly popular due to growing demand for applications that utilize 3-D features of an image. In line with that trend, determination of 3-D data from images is of central importance, e.g., in the fields of image reconstruction, machine vision, and the like. Machine vision has a wide range of potential applications including, but not limited to, three-dimensional map building, data visualization, and robot pick-and-place operations.
Typical techniques for implementing 3-D vision include geometric stereo and photometric stereo. In geometric stereo, images of an object are captured by employing, e.g., two cameras disposed at different positions, and the disparity between corresponding points of the two images is measured, thereby building a depth map of the object. Meanwhile, photometric stereo involves using a camera to take pictures of an object while varying the position of a light source. The pictures are then processed to obtain features of the object, such as the slope and albedo at each pixel, from which an image is reconstructed, thereby implementing 3-D vision of the object. Photometric stereo can have varying results depending on the surface characteristics of the object.
Upon comparing the two methods, it is generally known that the geometric stereo method outperforms the photometric stereo method for an object having a complex and non-continuous surface (i.e., having a high texture component), while the photometric stereo method tends to be superior to the geometric stereo method for an object having a relatively simple surface whose reflective characteristics are Lambertian (i.e., a surface that may comply with a diffuse reflection model). This limitation, i.e., that the performance of each of the two methods depends on certain qualities (e.g., surface characteristics) of an object, is the fundamental problem when the photometric or geometric stereo method is used alone.
The foregoing and other features of the present disclosure will become more fully apparent from the following description and appended claims, taken in conjunction with the accompanying drawings. Understanding that these drawings depict only several embodiments in accordance with the disclosure and are, therefore, not to be considered limiting of its scope, the disclosure will be described with additional specificity and detail through use of the accompanying drawings in which:
Embodiments of image reconstructing systems and image reconstructing techniques are disclosed herein. In accordance with one embodiment, an image reconstructing system includes one or more cameras configured to capture images of an object, one or more light sources to emit light to the object, and an image processor configured to process the images to generate a first representation and a second representation of the object, and to generate a reconstructed image of the object based on the first and second representations.
The Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
It will be readily understood that the components of the present disclosure, as generally described and illustrated in the Figures herein, could be arranged and designed in a wide variety of different configurations. Thus, the following more detailed description of the embodiments of apparatus and methods in accordance with the present disclosure, as represented in the Figures, is not intended to limit the scope of the disclosure, as claimed, but is merely representative of certain examples of embodiments in accordance with the disclosure. The presently described embodiments will be best understood by reference to the drawings, wherein like parts are designated by like numerals throughout.
Referring to
In an example use of the camera 120, a user may use the camera 120 to take pictures of an object by operating the controller 110 to adjust the light source 140. The camera 120 delivers the pictures to the image processor 160 by using various interfaces between the camera 120 and the image processor 160, including but not limited to a wired line, a cable, a wireless connection or any other interfaces, under the command or instructions of the controller 110. The image processor 160 may process the pictures to generate a reconstructed image of the object and deliver the reconstructed image to the display 180. The display 180 displays the reconstructed image, for example, for the user's reference.
In selected embodiments where the image processor 160 is installed on a separate device detachable from the camera 120, the image may be transmitted from the camera 120 to the image processor 160 using a wired communication protocol, a wireless communication protocol, or a combination thereof. For example, the wired or wireless communication protocol may be implemented by employing a digital interface protocol, such as a serial port, parallel port, PS/2 port, universal serial bus (USB) link, FireWire/IEEE 1394 link, or high definition multimedia interface (HDMI) link, optionally with high-bandwidth digital content protection (HDCP), or a wireless interface connection, such as an infrared interface, Bluetooth, ZigBee, wireless fidelity (Wi-Fi), or the like. In one embodiment, the image reconstructing system 100 may be configured with a suitable operating system to install and run executable code, programs, etc., from one or more computer readable media, such as a floppy disk, CD-ROM, DVD, a detachable memory, a USB memory, a memory stick, a memory card or the like.
Referring to
For example, the first processing unit 220 may perform a geometric stereo method to generate the first representation (i.e., a depth map) of the object.
where L is a distance between center points of the two images, f is a focal length of a lens of the camera, and dl and dr are distances to the center points from the corresponding points A and A′, respectively, as indicated by
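The triangulation described above can be sketched in code. The equation itself is not reproduced in this text, so the sketch below assumes the common parallel-camera relation z = L·f/(dl + dr), in which dl and dr are measured in opposite directions from their respective image centers so that their sum is the disparity. The function name `depth_from_disparity` and the sample values are hypothetical and chosen only for illustration.

```python
def depth_from_disparity(baseline, focal_length, d_left, d_right):
    """Return the depth z of one pair of corresponding points A and A'.

    Assumption (not taken from the disclosure): z = L * f / (dl + dr),
    the standard relation for two parallel cameras, where
      baseline     -- L, distance between the center points of the two images
      focal_length -- f, focal length of the camera lens
      d_left       -- dl, distance from point A to its image center
      d_right      -- dr, distance from point A' to its image center
    All lengths must be expressed in the same unit.
    """
    disparity = d_left + d_right
    if disparity <= 0:
        raise ValueError("disparity must be positive to give a finite depth")
    return baseline * focal_length / disparity


if __name__ == "__main__":
    # Hypothetical numbers: 10 cm baseline, 50 mm lens, 5 mm total disparity.
    z = depth_from_disparity(baseline=0.1, focal_length=0.05,
                             d_left=0.002, d_right=0.003)
    print("depth in meters:", z)
```

Applying such a function to every pair of corresponding points would yield one depth value per pixel, i.e., a depth map of the kind the first processing unit 220 produces.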
In selected embodiments, in order to create a reconstructed image of an object by applying the geometric stereo, the first processing unit 220 may generate more than one depth map, each under a different setting of the light source 140. For example, the first processing unit 220 may use two sets of images captured under two different settings of the light source 140 to generate two depth maps of geometric images. In other words, after obtaining one depth map as described above, the image reconstructing system 100 may change the settings, e.g., brightness, position, etc., of the light source 140. The first processing unit 220 may create another depth map under the changed settings of the light source 140. Based on the two different depth maps, the first processing unit 220 may determine the fidelity of the geometric stereo, which may indicate the level of reliability of the geometric stereo at each pixel of the pictures. For example, the first processing unit 220 may calculate the fidelity of the geometric stereo as given by the equation below:
where A(x,y) and B(x,y) are the brightness under a first and a second setting of the light source 140, respectively, at a certain pixel (x,y), and $\hat{z}(x,y)$ and $\hat{w}(x,y)$ are the depth map values under the first and the second setting of the light source 140, respectively, at that pixel.
In one embodiment, the second processing unit 240 may perform a photometric stereo operation to generate a slope map (a so-called p-q map) of a photometric image, which is referred to as a second representation of the object. The image reconstructing system 100 may use the camera 120 to take one or more pictures of the object while varying the position and direction of the light source. By using the one or more pictures, the second processing unit 240 may perform the photometric stereo method to obtain the slope of the 3-D object at each pixel of the picture. In this way, the second processing unit 240 can output feature information indicating the slopes of the 3-D object and the albedo (i.e., the surface reflectance), together with reliability levels of the coefficients at each pixel of the pictures.
where |X| indicates an absolute value of “X.” The second processing unit 240 may detect the brightness at each pixel of the picture and the brightness can be represented as a combination of a Lambertian component and a total reflection component as follows:
where LD and LS indicate the Lambertian component and the total reflection component of the brightness L, respectively, and ρD and ρS are Lambertian and total reflection albedo, respectively.
For example, upon assuming the Lambertian model, the brightness at each pixel (x,y) of the picture taken by the camera 120 under the i-th setting (e.g., position) of the light source 140 may be given by the equation below:
$L_i = E_i \, \rho \, (\vec{s}_i \cdot \vec{n}), \quad i = 1, 2, 3$
where $L_i$ is the brightness of the surface of the object under the i-th position of the light source 140, ρ is the albedo value, $E_i$ is the light power of the i-th setting of the light source 140, $\vec{s}_i = [s_{ix}, s_{iy}, s_{iz}]^T$ is the direction vector of the i-th setting of the light source 140, the product of $\vec{s}_i$ and $\vec{n}$ in the equation above is an inner product between vectors, and $\vec{n} = [n_x, n_y, n_z]^T$ is the normal vector. Although three light source positions are given in the embodiment (i.e., i=1, 2, 3), the operations can be performed with any number of positions of the light sources. In the example, the same light power can be used for each setting of the light source 140, so that $E_1 = E_2 = E_3$. Then, the brightness of the surface of the object by the light source 140 can be represented as follows:
and $n_x$, $n_y$, and $n_z$ are the x, y, and z components of the normal vector $\vec{n}$, respectively.
The second processing unit 240 calculates the normal vector $\vec{n}$ from the above equation as follows:
where $\vec{S}^{-1}$ represents the inverse of the matrix $\vec{S}$. For the slopes p and q at each pixel of the photometric image, the normal vector is related to the slopes as follows:
From the above relationship, the second processing unit 240 may calculate the slopes p(x,y) and q(x,y) at pixel (x,y) by using the equation below:
where “dz/dx” indicates the partial derivative of z with respect to x.
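The photometric steps above (inverting the Lambertian model to obtain the normal vector, then converting the normal into slopes) can be sketched for a single pixel as follows. This is a minimal sketch under stated assumptions, not the disclosure's own equations, which are not reproduced in this text: it assumes equal light power for all three settings, takes the rows of `light_dirs` to be the three light direction vectors, and uses the sign convention p = -n_x/n_z and q = -n_y/n_z. The function names `solve_3x3` and `normal_albedo_slopes` are hypothetical.

```python
import math


def solve_3x3(m, rhs):
    """Solve the 3x3 linear system m * x = rhs with Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

    d = det(m)
    if abs(d) < 1e-12:
        raise ValueError("light directions must be linearly independent")
    solution = []
    for col in range(3):
        # Replace one column of m by rhs and take the determinant ratio.
        replaced = [[rhs[r] if c == col else m[r][c] for c in range(3)]
                    for r in range(3)]
        solution.append(det(replaced) / d)
    return solution


def normal_albedo_slopes(light_dirs, brightness, light_power=1.0):
    """Recover (normal, albedo, p, q) at one pixel from three brightness values.

    Assumes the Lambertian model L_i = E * rho * (s_i . n) with the same
    light power E for every setting of the light source.
    """
    # g = rho * n, obtained by inverting the stacked light directions.
    g = solve_3x3(light_dirs, [b / light_power for b in brightness])
    albedo = math.sqrt(sum(v * v for v in g))
    if albedo == 0.0:
        raise ValueError("zero brightness: the normal is undefined")
    normal = [v / albedo for v in g]
    if abs(normal[2]) < 1e-12:
        raise ValueError("surface is edge-on: slopes are unbounded")
    # Assumed sign convention: the normal is proportional to (-p, -q, 1).
    p = -normal[0] / normal[2]
    q = -normal[1] / normal[2]
    return normal, albedo, p, q


if __name__ == "__main__":
    # Hypothetical light directions and brightness values for one pixel.
    dirs = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
    print(normal_albedo_slopes(dirs, [0.60, 0.30, 0.57]))
```

Repeating this calculation at every pixel would give the values p(x,y) and q(x,y) that make up the p-q map, along with a per-pixel albedo estimate.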
In this way, the second processing unit 240 may generate the p-q map (i.e., the second representation) having the values p and q that indicate the slope at each pixel of the photometric image of the object. The second processing unit 240 may determine the brightness of the photometric image at each pixel thereof. For example, the second processing unit 240 may calculate the brightness of the photometric image by using the equation given earlier, repeated below:
where LD and LS indicate the Lambertian component and the total reflection component of the brightness L, respectively, and ρD and ρS are Lambertian and total reflection albedo, respectively.
In selected embodiments, the second processing unit 240 may create one or more additional p-q maps under changed settings (i.e., different combinations of positions) of the light source 140. Based on the brightness detected by the camera 120 and the brightness of the reconstructed image, the second processing unit 240 may obtain the fidelity of the photometric stereo, which may indicate the level of reliability of the photometric stereo. For example, the second processing unit 240 may calculate the fidelity of the photometric stereo as given by the equation below:
where C(x,y,j) and RC(x,y,j) are brightness of the captured image and the reconstructed image under the j-th setting of the light source 140, respectively, at a certain pixel (x,y).
Referring back to
$d = \iint \left[ a \left( |z_x - \hat{z}_x|^2 + |z_y - \hat{z}_y|^2 \right) + b \, |z - \hat{z}|^2 \right] dx \, dy$
where a and b indicate the fidelity values of the photometric stereo and the geometric stereo at pixel (x,y), respectively, $z_x$ and $z_y$ indicate the partial derivatives of z with respect to x and y, respectively, $\hat{z}_x$ and $\hat{z}_y$ indicate the p-q map values at pixel (x,y), and $\hat{z}$ represents the depth map value at pixel (x,y). In order to find the optimum surface function, the image reconstructing unit 260 represents the surface function z(x,y) by using the given basis functions $\phi(x,y,\omega)$ as follows:
where $\omega = (\omega_x, \omega_y)$ is a two-dimensional index, Ω is a finite set of indexes, and $C(\omega)$ is an expansion coefficient. The image reconstructing unit 260 may calculate the coefficients $\hat{C}(\omega)$ of the depth map $\hat{z}(x,y)$ for each basis function to represent the depth map as follows:
For the p-q map values $\hat{z}_x(x,y)$ and $\hat{z}_y(x,y)$, the image reconstructing unit 260 may calculate the coefficients $\hat{C}_1(\omega)$ and $\hat{C}_2(\omega)$ for each basis function to represent the p-q map values as follows:
where $\phi_x(x,y,\omega)$ and $\phi_y(x,y,\omega)$ are the partial derivatives of $\phi(x,y,\omega)$ with respect to x and y, respectively. The image reconstructing unit 260 may find the surface function z(x,y) that minimizes the distance d by calculating the expansion coefficients of the surface function as follows:
where $P_x(\omega)$, $P_y(\omega)$, and $P(\omega)$ can be given as follows: $P_x(\omega) = \iint |\phi_x(x,y,\omega)|^2 \, dx \, dy$, $P_y(\omega) = \iint |\phi_y(x,y,\omega)|^2 \, dx \, dy$, and $P(\omega) = \iint |\phi(x,y,\omega)|^2 \, dx \, dy$, respectively. In this way, the image reconstructing unit 260 inserts the above-calculated coefficients of the surface function into the equation
to generate the surface function, thereby reconstructing the image of the object.
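The coefficient calculation can be illustrated with a short sketch. Because the disclosure's coefficient equation is not reproduced in this text, the formula below is a derivation under stated assumptions rather than the original: it assumes that the weights a and b are constant over the image and that the basis functions and their partial derivatives each form an orthogonal set, in which case the distance d separates into one quadratic term per index ω and each term is minimized independently. The function name `fuse_coefficient` and its parameter names are hypothetical.

```python
def fuse_coefficient(c_depth, c_p, c_q, px, py, p0, a=1.0, b=1.0):
    """Return the expansion coefficient C(w) that minimizes, for one index w,

        a * (px * |C - c_p|**2 + py * |C - c_q|**2) + b * p0 * |C - c_depth|**2

    where
      c_depth  -- coefficient of the depth map for this basis function
      c_p, c_q -- coefficients of the p-q map values for this basis function
      px, py   -- energies of the x- and y-derivatives of the basis function
      p0       -- energy of the basis function itself
      a, b     -- fidelity weights of the photometric and geometric stereo
                  (assumed constant over the image in this sketch)
    Coefficients may be real or complex numbers.
    """
    denominator = a * (px + py) + b * p0
    if denominator == 0:
        # This term is unconstrained by either measurement; falling back to
        # the depth-map coefficient is an arbitrary choice for this sketch.
        return c_depth
    return (a * (px * c_p + py * c_q) + b * p0 * c_depth) / denominator


if __name__ == "__main__":
    # Hypothetical coefficients for a single basis function.
    print(fuse_coefficient(2.0, 4.0, 6.0, px=1.0, py=1.0, p0=1.0))
```

With b = 0 the result depends only on the p-q map coefficients, and with a = 0 it reduces to the depth-map coefficient, which matches the role of a and b as weights on the two representations.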
Referring to
At operation 540, the first processing unit 220 generates a first representation of the object from the pictures of the object, each picture taken by the one or more cameras having a different camera position. For instance, the first processing unit 220 may apply a geometric method to the first and the second pictures that are taken by one or more cameras located at two different positions under the first setting of the one or more light sources. The first processing unit 220 processes the first and the second pictures according to the geometric method, thereby generating the first representation (e.g., a first depth map) of the object. In one embodiment, the first processing unit 220 may apply a geometric method to the third and the fourth pictures that are taken by one or more cameras in the two different positions under the second setting of the one or more light sources, thereby additionally generating a third representation (e.g., a second depth map) of the object. The first processing unit 220 delivers the first and the third representations of the object together with the pictures taken by the camera 120 to the image reconstructing unit 260.
At operation 560, the second processing unit 240 generates a second representation of the object from the pictures of the object, each picture taken by the one or more cameras having a different setting of the light source 140. For instance, the second processing unit 240 may apply a photometric method to the first and the third pictures that are taken by one or more cameras under two different settings (e.g., different positions) of the light source 140. The second processing unit 240 processes the first and the third pictures according to the photometric method, thereby generating the second representation (e.g., a first p-q map) of the object. In one embodiment, the second processing unit 240 may apply a photometric method to another set of two pictures that are taken by one or more cameras at the same positions as those used for the first and the third pictures, respectively, under two different settings (e.g., different positions) of the light source 140, thereby generating a fourth representation (e.g., a second p-q map) of the object. The second processing unit 240 delivers the second and the fourth representations of the object to the image reconstructing unit 260.
At operation 580, the image reconstructing unit 260 receives the representations of the object from the first and the second processing units 220 and 240 to generate a reconstructed image. The image reconstructing unit 260 may apply a numerical operation, such as a least-squares minimization algorithm, to the first and the second representations of the object, thereby finding an optimum surface function z(x,y) of the object that has the least distance from the first and the second representations of the object. In this case, the image reconstructing unit 260 can determine the fidelity values (a and b) of the photometric stereo and the geometric stereo at pixel (x,y) in the above equation $d = \iint \left[ a \left( |z_x - \hat{z}_x|^2 + |z_y - \hat{z}_y|^2 \right) + b \, |z - \hat{z}|^2 \right] dx \, dy$ to be, e.g., both 1. The image reconstructing unit 260 determines the surface function z(x,y) that minimizes the distance d by performing the least-squares algorithm as described above. Alternatively, the image reconstructing unit 260 may calculate a first fidelity level from the first and the third representations of the object, and a second fidelity level from the second and the fourth representations of the object. For example, the image reconstructing unit 260 may calculate the fidelity of the geometric image by using the above equation,
where A(x,y) and B(x,y) are the brightness of the first and the second picture, respectively, at a certain pixel (x,y), and $\hat{z}(x,y)$ and $\hat{w}(x,y)$ are the depth map values of the first and the third representations of the object, respectively, at that pixel. The image reconstructing unit 260 may calculate the fidelity of the photometric image by using the above equation,
where C(x,y,j) is the brightness of the first and the second picture and RC(x,y,j) is the brightness of the second and the fourth representation of the object, respectively, at a certain pixel (x,y) (j=1 and 2 in this example). The image reconstructing unit 260 determines the surface function z(x,y) that minimizes the distance d by performing the least-squares algorithm, using the fidelities of the photometric stereo and the geometric stereo as the weights a and b, respectively. In this way, the image reconstructing unit 260 may find the optimum surface function z(x,y) that minimizes the distance d, thereby generating the reconstructed image of the object.
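The weighted least-squares fusion of operation 580 can be illustrated on a simplified, discrete problem. The sketch below is not the disclosed implementation: it reduces the surface to a one-dimensional profile, replaces the derivative by a forward difference, and solves the resulting normal equations by Gauss-Seidel sweeps, a solver swapped in here for brevity. The function name `fuse_profile`, the per-sample weight lists `a` and `b`, and the sample data are hypothetical.

```python
def fuse_profile(depth, slope, a, b, iterations=500):
    """Fuse a 1-D depth profile with slope measurements by weighted least squares.

    Minimizes
        sum_i a[i] * (z[i+1] - z[i] - slope[i])**2
      + sum_i b[i] * (z[i] - depth[i])**2
    where
      depth -- depth-map values (geometric stereo), length n
      slope -- slope values between neighboring samples (photometric stereo),
               length n - 1
      a     -- photometric fidelity weights, length n - 1
      b     -- geometric fidelity weights, length n
    """
    n = len(depth)
    if len(slope) != n - 1 or len(a) != n - 1 or len(b) != n:
        raise ValueError("inconsistent input lengths")
    z = list(depth)  # start from the geometric estimate
    for _ in range(iterations):
        for i in range(n):
            # Setting the derivative of the cost with respect to z[i] to zero
            # gives z[i] as a weighted average of three predictions.
            numerator = b[i] * depth[i]
            denominator = b[i]
            if i > 0:
                numerator += a[i - 1] * (z[i - 1] + slope[i - 1])
                denominator += a[i - 1]
            if i < n - 1:
                numerator += a[i] * (z[i + 1] - slope[i])
                denominator += a[i]
            if denominator > 0:
                z[i] = numerator / denominator
    return z


if __name__ == "__main__":
    # Hypothetical data: the third depth sample is unreliable, so its
    # geometric fidelity weight is set to zero and the slopes fill it in.
    fused = fuse_profile(depth=[0.0, 2.0, 10.0, 6.0],
                         slope=[2.0, 2.0, 2.0],
                         a=[1.0, 1.0, 1.0],
                         b=[1.0, 1.0, 0.0, 1.0])
    print(fused)
```

Setting a weight to zero removes the corresponding measurement from the cost, which mirrors how a low fidelity value lets the other representation dominate at that pixel.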
In light of the present disclosure, those skilled in the art will appreciate that the apparatus and methods described herein may be implemented in hardware, software, firmware, middleware, or combinations thereof and utilized in systems, subsystems, components, or sub-components thereof. For example, a method implemented in software may include computer code to perform the operations of the method. This computer code may be stored in a machine-readable medium, such as a processor-readable medium or a computer program product, or transmitted as a computer data signal embodied in a carrier wave, or a signal modulated by a carrier, over a transmission medium or communication link. The machine-readable medium or processor-readable medium may include any medium capable of storing or transferring information in a form readable and executable by a machine (e.g., by a processor, a computer, etc.).
The present disclosure may be embodied in other specific forms without departing from its basic features or characteristics. Thus, the described embodiments are to be considered in all respects only as illustrative, and not restrictive. The scope of the disclosure is, therefore, indicated by the appended claims, rather than by the foregoing description. All changes within the meaning and range of equivalency of the claims are to be embraced within their scope.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/KR2008/002033 | 4/10/2008 | WO | 00 | 9/3/2010 |