This disclosure relates to systems and methods for refining three-dimensional images, including systems and methods for refining three-dimensional images using circuit model operation or optimization.
Multimedia technologies, including those for video- and image-related applications, are widely used in various fields, such as entertainment, education, medical diagnosis, and business presentations. For example, the entertainment industry is presenting more and more content with three-dimensional ("3D") images and videos. Various image-based rendering methods are used to render the 3D presentations. For example, a 3D image can be rendered based on a set of two-dimensional (2D) images and their associated depth images that indicate how far each pixel of the respective 2D images is from a viewpoint.
Three-dimensional rendering relies on depth images of sufficient quality. As an illustrative and non-restrictive example, poor quality in a depth image may cause conflicts between monocular and binocular cues, which may cause viewers' visual discomfort. For instance, viewers may experience eyestrain, headaches, or other painful 3D-related sickness. Depth Image Based Rendering (DIBR) methods that are currently available do not correct such conflicts of depth cues caused by poor or incorrect depth images.
Therefore, it may be desirable to have systems, methods, or a combination thereof that refine the depth images before the three-dimensional image is rendered based thereon.
The disclosed embodiments provide a computer-implemented method for refining a three-dimensional image. The method identifies a depth image of the three-dimensional image, and establishes a simulation circuit model. The simulation circuit model includes data nodes, diffusion nodes and connection devices. The connection devices connect the data nodes and the diffusion nodes. The simulation circuit model assigns emulation voltage signals to the data nodes corresponding to at least a portion of the data points in the depth image. The assigned emulation voltage signals are substantially correlated to depth data of the at least a portion of the data points. The method further applies an optimization operation to generate diffused voltage signals for the diffusion nodes due to at least a redistribution of at least some of the emulation voltage signals to the diffusion nodes through the connection devices. The method also updates the depth data of the depth image based on the diffused voltage signals.
The disclosed embodiments further provide another computer-implemented method for refining a three-dimensional image. The method identifies a depth image of the three-dimensional image. The method further determines an energy including a first energy portion corresponding to a depth constraint, a second energy portion corresponding to a distortion constraint, and a third energy portion corresponding to an edge bending constraint. The depth constraint, the distortion constraint, and the edge bending constraint are each a function of depth data of the depth image. The method then applies an optimization operation to refine the depth data of the depth image by at least one of reducing or minimizing the energy.
The disclosed embodiments also provide a system for refining a three-dimensional image. The system includes a storage device storing a depth image of the three-dimensional image. The depth image includes depth data. The system further includes a processor coupled with the storage device. The processor is configured to establish a simulation circuit model. The simulation circuit model includes data nodes, diffusion nodes and connection devices. The connection devices connect the data nodes and the diffusion nodes. The simulation circuit model assigns emulation voltage signals to the data nodes corresponding to at least a portion of the data points in the depth image. The assigned emulation voltage signals are substantially correlated to depth data of the at least a portion of the data points. The processor is further configured to apply an optimization operation to generate diffused voltage signals for the diffusion nodes due to at least a redistribution of at least some of the emulation voltage signals to the diffusion nodes through the connection devices. The processor is also configured to update the depth data of the depth image based on the diffused voltage signals.
The disclosed embodiments further provide a system for refining a three-dimensional image. The system includes a storage device storing a depth image of the three-dimensional image. The depth image includes depth data. The system further includes a processor coupled with the storage device. The processor is configured to determine an energy including a first energy portion corresponding to a depth constraint, a second energy portion corresponding to a distortion constraint, and a third energy portion corresponding to an edge bending constraint. The depth constraint, the distortion constraint, and the edge bending constraint are each a function of depth data of the depth image. The processor is further configured to apply an optimization operation to refine the depth data of the depth image by at least one of reducing or minimizing the energy.
The disclosed embodiments further provide a non-transitory computer-readable medium with an executable program stored thereon, wherein the program instructs a processor to perform a method for refining a three-dimensional image. The method identifies a depth image of the three-dimensional image, and establishes a simulation circuit model. The simulation circuit model includes data nodes, diffusion nodes and connection devices. The connection devices connect the data nodes and the diffusion nodes. The simulation circuit model assigns emulation voltage signals to the data nodes corresponding to at least a portion of the data points in the depth image. The assigned emulation voltage signals are substantially correlated to depth data of the at least a portion of the data points. The method further applies an optimization operation to generate diffused voltage signals for the diffusion nodes due to at least a redistribution of at least some of the emulation voltage signals to the diffusion nodes through the connection devices. The method also updates the depth data of the depth image based on the diffused voltage signals.
The disclosed embodiments further provide another non-transitory computer-readable medium with an executable program stored thereon, wherein the program instructs a processor to perform a method for refining a three-dimensional image. The method identifies a depth image of the three-dimensional image. The method further determines an energy including a first energy portion corresponding to a depth constraint, a second energy portion corresponding to a distortion constraint, and a third energy portion corresponding to an edge bending constraint. The depth constraint, the distortion constraint, and the edge bending constraint are each a function of depth data of the depth image. The method then applies an optimization operation to refine the depth data of the depth image by at least one of reducing or minimizing the energy.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are intended to provide further explanation of the invention as claimed.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate disclosed embodiments described below.
Reference will now be made in detail to the exemplary embodiments of the disclosure, examples of which are illustrated in the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts.
Consistent with some embodiments, 3D image processing system 100 may include a processor 110, a memory or memory module 120, a user input device 130, a display device 140, and a communication interface 150. Processor 110 can be a general processor (such as one of the ARM® processors), a central processing unit (“CPU”), an application-specific integrated circuit (“ASIC”), a graphic processing unit (“GPU”), or any combination thereof. Depending on the type of hardware being used, processor 110 can include or be coupled with one or more printed circuit boards that have one or more microprocessor chips. Processor 110 can execute sequences of computer program instructions to perform various methods, including the exemplary ones described below.
Memory 120 can include, among other things, a random access memory (“RAM”), a read-only memory (“ROM”), a flash memory, or any combination thereof. The computer program instructions can be accessed and read from a ROM, a flash memory, or any other suitable location and loaded into the RAM for execution by processor 110. For example, memory 120 may store one or more software applications. Software applications stored in memory 120 may comprise operating system 121 for one or more processors or one or more common computer systems as well as for one or more software-controlled devices. Further, memory 120 may store the software application or only a part of the software application that is executable by processor 110. In some embodiments, memory 120 may store image processing software 122 that may be executed by processor 110. For example, image processing software 122 may be executed to refine the depth images.
Image processing software 122 or portions of it may be stored on a removable computer readable medium, such as a hard drive, computer disk, CD-ROM, DVD ROM, CD±RW or DVD±RW, flash memory, USB flash drive, memory stick, or any other suitable medium, and may run on any suitable component of 3D image processing system 100. For example, portions of applications to perform image processing may reside on a removable computer readable medium and be read and acted upon by processor 110 using routines that have been copied to memory 120.
In some embodiments, memory 120 may also store master data, user data, application data, program code, or any combination thereof. For example, memory 120 may store a database 123 having various image data such as the depth data of the depth images and the pixel values of the 2D images.
In some embodiments, input device 130 and display device 140 may be coupled to processor 110, such as through an appropriate interface or interface circuitry. In some embodiments, input device 130 may be a hardware keyboard, a keypad, a touch screen, or any combination thereof, through which an authorized user may input information to 3D image processing system 100. Display device 140 may include one or more display screens that display the various images or any related information to the user. For example, display device 140 may display the rendered 3D images, and/or the intermediate 2D images and depth images.
Communication interface 150, in some embodiments, may enable image processing system 100 to exchange data with one or more external devices. Consistent with some embodiments, communication interface 150 may include a network interface, a universal serial bus (USB) port, an HDMI port, etc., and may be configured to receive 3D images, such as 2D image data and depth data, from an image acquisition device (not shown), such as a 3D camera. Consistent with some embodiments, communication interface 150 may also be configured to send 3D image data to a remote display device.
One or more components of 3D image processing system 100 may be used to implement a process related to 3D image processing. For example,
In some embodiments, the depth data may have a larger value when the object is nearer to the viewpoint. For example, the m×n original depth data Di(1,1)-Di(m,n) may each include 8 bits, so that each of the original depth data Di(1,1)-Di(m,n) has a numeric value ranging between 0 and 255. The greater the value of an original depth datum Di(i,j), the lesser the depth of the corresponding pixel value I(i,j); conversely, the smaller the value, the greater the depth. When the depth data is represented by a grayscale image, a nearer object may be represented by a lighter gray.
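As a minimal sketch of this 8-bit convention (the array values below are illustrative assumptions, not from the disclosure):

```python
import numpy as np

# Hypothetical 2x2 depth image with 8-bit values (0-255).  Under the
# convention above, larger values mean the pixel is nearer to the
# viewpoint (lesser depth).
depth = np.array([[255, 128],
                  [64, 0]], dtype=np.uint8)

# Normalized "nearness" in [0, 1]: 1.0 is nearest, 0.0 is farthest.
nearness = depth.astype(np.float64) / 255.0

# Rendered as a grayscale image, nearer objects appear lighter.
print(nearness[0, 0])  # nearest pixel -> 1.0
print(nearness[1, 1])  # farthest pixel -> 0.0
```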
In step 202, a simulation circuit model may be established. For example,
In some embodiments, the number of data nodes 310 and diffusion nodes 320 may be equal to, less than, or more than the number of original depth data values. In one embodiment, m×n data nodes 310 and m×n diffusion nodes 320 may be established in simulation circuit model 300 based on the m×n original depth data values Di(1,1)-Di(m,n) received in step 201.
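The node layout above, with one data node and one diffusion node per depth datum, can be sketched as follows; the grid size, depth values, and variable names are illustrative assumptions, and the voltage assignment anticipates the emulation voltages and zero-potential diffusion nodes described below:

```python
import numpy as np

# Minimal sketch of the simulation circuit model layout (illustrative).
m, n = 3, 3
rng = np.random.default_rng(0)
original_depth = rng.integers(0, 256, size=(m, n))  # Di(1,1)-Di(m,n)

# One data node NS(i,j) and one diffusion node ND(i,j) per depth datum.
# The emulation voltage on each data node tracks its depth datum, while
# each diffusion node starts at a zero voltage potential.
data_node_voltage = original_depth.astype(np.float64)  # SV(i,j)
diffusion_node_voltage = np.zeros((m, n))              # ND(i,j) at 0 V

# The potential difference across each connection device drives the
# initial diffusion current from data node to diffusion node.
initial_drive = data_node_voltage - diffusion_node_voltage
```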
Accordingly, the simulation circuit model 300, as illustrated in
As shown in
Referring back to
In step 204, the emulation voltage signals may be supplied to the data nodes. For example, emulation voltage signals SV(1,1), SV(1,2), . . . , SV(m,n) may be supplied to data nodes NS(1,1), NS(1,2), . . . , NS(m,n), respectively. Because diffusion nodes ND(1,1), ND(1,2), . . . , ND(m,n) each have a zero voltage potential, a diffusion current may occur between each data node NS(i,j) and its corresponding diffusion node ND(i,j). For example, as shown in
Referring back to
where, α and β denote predetermined parameters; Ct denotes the color information of the corresponding pixel data DV(i,j) of the original depth data Di(i,j); Cn denotes the color information of the corresponding pixel data of each of the original depth data on the diffusion nodes (e.g., ND(i−1,j), ND(i,j−1), ND(i,j+1) and ND(i+1,j)) coupled by the diffused connection devices RD1-RD4. Consistent with some embodiments, the color information Ct and Cn of the pixel data can be obtained from the sum of the absolute values of the sub-pixel data of each color of the corresponding pixel data.
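The color-dependent resistance described above can be sketched as follows. The disclosure states only that the resistance depends on α, β, Ct, and Cn; the exponential form and the parameter values below are assumptions chosen to illustrate the intended behavior, namely that diffusion is easier between similarly colored pixels:

```python
import numpy as np

def color_info(pixel):
    """Sum of the absolute values of the sub-pixel (e.g., R, G, B)
    data, as described above for Ct and Cn."""
    return np.sum(np.abs(np.asarray(pixel, dtype=np.float64)))

def connection_resistance(pixel_t, pixel_n, alpha=1.0, beta=0.05):
    """Illustrative resistance that grows with the color difference.
    The exact formula and the alpha/beta values are assumptions."""
    ct, cn = color_info(pixel_t), color_info(pixel_n)
    return alpha * np.exp(beta * abs(ct - cn))

# Similar colors -> small resistance -> strong diffusion.
r_similar = connection_resistance((10, 20, 30), (12, 18, 31))
# Dissimilar colors -> large resistance -> weak diffusion.
r_dissimilar = connection_resistance((10, 20, 30), (200, 210, 220))
print(r_similar < r_dissimilar)  # True
```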
Accordingly, as part of step 205, diffusion currents may be determined using Ohm's law, as the potential difference across the respective connection devices divided by the respective resistance values of the connection devices. For example, diffusion current I1-I5 of
where, V410 is the supplied emulation voltage signal on data node 410, and V421-V425 are unknown diffused voltage signals.
According to Kirchhoff's current law, at any node in an electrical circuit, the sum of currents flowing into that node is equal to the sum of currents flowing out of that node. Taking diffusion node 421 as an example, diffusion currents I1, I3, and I5 flow into the node, while I2 and I4 flow out of it. Therefore, under Kirchhoff's current law, the diffusion currents should satisfy the diffusion current equation as follows:
I1+I3+I5−I2−I4=0 (7)
Consistent with some embodiments, diffusion current equations similar to Equation (7) may be determined for all the diffusion nodes in the simulation circuit model.
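Kirchhoff's current law at a single diffusion node can be sketched as follows; the voltages and resistances are assumed values, and the node's diffused voltage is solved so that the signed branch currents balance, in the spirit of Equation (7):

```python
# One diffusion node connected to its data node and four neighbors
# (all values below are illustrative assumptions).
v_data = 5.0                          # emulation voltage on the data node
v_neighbors = [1.0, 3.0, 0.5, 2.5]    # assumed neighbor diffused voltages
r_data = 1.0                          # data-node connection resistance
r_neighbors = [1.0, 2.0, 1.0, 2.0]    # neighbor connection resistances

# Setting the sum of branch currents into the node to zero,
#   (v_data - V)/r_data + sum_k (v_k - V)/r_k = 0,
# and solving for the node's diffused voltage V:
num = v_data / r_data + sum(v / r for v, r in zip(v_neighbors, r_neighbors))
den = 1.0 / r_data + sum(1.0 / r for r in r_neighbors)
v_node = num / den

# Verify KCL: the signed currents now balance at the node.
i_total = (v_data - v_node) / r_data + sum(
    (v - v_node) / r for v, r in zip(v_neighbors, r_neighbors))
print(abs(i_total) < 1e-9)  # True
```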
Referring back to
x=(AA^H)^(−1)b (8)
In the exemplary solution as shown in Equation (8), the diffused voltage signals may be determined as a least square matrix solution. However, it is contemplated that other solutions may also be used.
For example, the matrix equation Ax=b may also be solved using iterative methods, or optimization methods that minimize the difference between Ax and b. Iterative methods solve the optimization iteratively: in each iteration, the unknowns are adjusted using a gradient of the cost function (e.g., |Ax−b|), and the cost function is then updated based on the adjusted unknowns. In each iteration, the value of the cost function may be reduced compared to the previous iteration. The iterations continue until the cost function satisfies a predetermined criterion; for example, iterations may stop when |Ax−b| falls below a predetermined threshold. Various iterative methods, such as Newton methods, conjugate gradient methods, gradient descent methods, subgradient methods, Gauss-Seidel methods, multi-grid methods, etc., may be used.
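Both solution styles can be sketched on a small toy system; the matrix A and vector b below are illustrative assumptions, not the actual KCL system of the disclosure:

```python
import numpy as np

# Small assumed system A x = b (symmetric positive definite, as
# KCL/Laplacian-type systems are).
rng = np.random.default_rng(1)
M = rng.standard_normal((6, 6))
A = M @ M.T + 6.0 * np.eye(6)
b = rng.standard_normal(6)

# Direct least-squares solution (cf. the closed form of Equation (8)).
x_direct = np.linalg.lstsq(A, b, rcond=None)[0]

# Iterative solution: gradient descent on the cost |Ax - b|^2, with a
# step size safely below 2 / lambda_max(A)^2, stopping on a threshold.
step = 1.0 / np.linalg.norm(A, 2) ** 2
x = np.zeros(6)
for _ in range(100000):
    residual = A @ x - b
    if np.linalg.norm(residual) < 1e-10:
        break
    x -= step * (A.T @ residual)  # gradient of |Ax - b|^2 is 2 A^T (Ax - b)

print(np.allclose(x, x_direct, atol=1e-6))  # True
```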
As a result of the optimization operation, diffused voltage signals on the respective diffusion nodes 320 may be obtained. The diffused voltage signals reflect the redistribution of the emulation voltage signals supplied to the data nodes 310 in step 204. For example, in a simulation circuit model that has m×n diffusion nodes ND(1,1)-ND(m,n), m×n diffused voltage signals SVD(1,1), SVD(1,2), . . . , SVD(m,n) may be respectively obtained.
Referring back to
Consistent with some embodiments, process 200 may sharpen the edges of scene objects due to the diffusion effect of the simulation circuit model. For example, using the simulation circuit model, voltages within a color boundary are trapped inside the boundary, while voltages outside it diffuse, which sharpens the depth at object edges. Accordingly, the foreground fattening effect may be reduced, and scattered local errors may also be smoothed.
Although
In step 502, a depth constraint may be determined for the warping map. Ideally, the difference between the warping map and the corresponding axis value equals the depth. Using the x-axis as an example, the depth constraint may be determined as:
wx−x−di,j=0 (9)
where wx is the x-axis component of a warping form of a refined depth, x is the corresponding value of x axis, and di,j is the original depth data of the (i,j) image pixel. Depth constraints for other axes may be determined similarly.
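The x-axis depth constraint of Equation (9) can be sketched numerically; the coordinate and depth arrays below are illustrative assumptions:

```python
import numpy as np

# Pixel x coordinates and original depth data d_{i,j} for one row
# (illustrative values).
x = np.arange(5, dtype=np.float64)
d = np.array([2.0, 2.0, 3.0, 3.0, 4.0])

# A warping map that satisfies Equation (9) exactly: w_x = x + d.
w_x = x + d

# The constraint residual w_x - x - d is zero when (9) holds.
depth_residual = w_x - x - d
print(np.all(depth_residual == 0))  # True
```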
In step 503, a distortion constraint may be determined for the warping map. The distortion constraint is to limit the edge distortions. Again, using x-axis as an example, the distortion constraint may be determined as:
where wx is the x-axis component of a warping form of a refined depth, and x is the corresponding value of x axis. Distortion constraints for other axes may be determined similarly.
In step 504, an edge bending constraint may be determined for the warping map. The edge bending constraint is to limit the edge bendings, and thus involves at least two axes. Ideally, the gradient of the warping map over another dimension is zero. Using the x-axis as an example, the edge bending constraint may be determined as:
where wx is the x-axis component of a warping form of a refined depth, and y is the corresponding value of the y axis. Edge bending constraints for other axes may be determined similarly.
In step 505, a combined energy may be determined based on at least one of the depth constraint, distortion constraint, and edge bending constraint. In some embodiments, the combined energy may be determined as a total energy including three portions:
Etotal=λDED+λdEd+λbEb (12)
ED=∫x(wx−x−di,j)^2dx (13)
where Etotal is the combined energy of the constraints, ED is the energy from the depth constraint, Ed is the energy from the image distortion constraint, Eb is the energy from the edge bending constraint, and λD, λd, λb are the weights of ED, Ed, and Eb, respectively.
In some embodiments, each of the energy ED, Ed and Eb may be determined as sum of squares (i.e., second order norm) of the respective constraints, as in Equations (13)-(15). However, it is contemplated that any other order norms may also be used.
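A discrete sketch of the combined energy of Equation (12), with each term taken as a sum of squares of its constraint as described above; the finite-difference forms of the distortion and bending terms, the weights, and the data are assumptions for illustration:

```python
import numpy as np

def combined_energy(w_x, x, d, lam_D=1.0, lam_d=0.5, lam_b=0.5):
    # E_D: squared depth-constraint residual w_x - x - d.
    e_depth = np.sum((w_x - x - d) ** 2)
    # E_d: squared deviation of the x-gradient of w_x from 1
    # (an undistorted map advances one pixel per pixel).
    e_distort = np.sum((np.diff(w_x, axis=1) - 1.0) ** 2)
    # E_b: squared y-gradient of w_x (edges should not bend across rows).
    e_bend = np.sum(np.diff(w_x, axis=0) ** 2)
    return lam_D * e_depth + lam_d * e_distort + lam_b * e_bend

# A 3x4 toy: constant depth, warping map that satisfies all three
# constraints exactly, so the combined energy vanishes.
x = np.tile(np.arange(4, dtype=np.float64), (3, 1))
d = np.full((3, 4), 2.0)
w_x = x + d
print(combined_energy(w_x, x, d))  # 0.0
```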
In some embodiments, the weights λD, λd, λb may be fixed and pre-programmed into processor 110. In some embodiments, processor 110 may adaptively determine the weights λD, λd, λb based on the relative significances of depth distortion, edge distortion, and edge bending in each particular synthetic image. For example, λd may be set relatively larger if significant edge distortion occurs in the synthetic image. In some cases, one or more of the weights λD, λd, λb may be set to zero such that the respective constraints are removed from the combined energy. In some other embodiments, the weights may be determined and input by the user via input device 130.
In step 506, the warping map may be solved by reducing or minimizing the combined energy determined in step 505. In some embodiments, the following optimization operation may be performed for the warping map that results in a minimum energy Etotal:
arg minD(Etotal=λDED+λdEd+λbEb) (16)
Like the optimization operation of Equation (8), the optimization operation of Equation (16) may also be performed using various algorithms known to those skilled in the art. As a result of the optimization, a warping map of refined depth may be obtained. Process 500 may conclude after step 506.
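One such algorithm can be sketched on a 1-D toy: because the sum-of-squares energy is quadratic in the warping map, its minimizer solves a linear system (the normal equations). The weights and data below are illustrative assumptions, and the edge bending term is omitted since a single row has no y direction:

```python
import numpy as np

# Toy 1-D row: pixel coordinates x and original depth data d.
x = np.arange(6, dtype=np.float64)
d = np.array([1.0, 1.0, 2.0, 2.0, 3.0, 3.0])
lam_D, lam_d = 1.0, 0.5  # assumed weights

def energy(w):
    e_depth = np.sum((w - x - d) ** 2)            # E_D term
    e_distort = np.sum((np.diff(w) - 1.0) ** 2)   # E_d term
    return lam_D * e_depth + lam_d * e_distort

# Forward-difference matrix Dm such that Dm @ w == np.diff(w).
Dm = np.diff(np.eye(6), axis=0)

# Setting the gradient of the energy to zero gives the normal equations
#   (lam_D I + lam_d Dm^T Dm) w = lam_D (x + d) + lam_d Dm^T 1.
lhs = lam_D * np.eye(6) + lam_d * Dm.T @ Dm
rhs = lam_D * (x + d) + lam_d * Dm.T @ np.ones(5)
w_opt = np.linalg.solve(lhs, rhs)

# The optimum trades off matching x + d against keeping unit spacing,
# so it does no worse than satisfying the depth constraint alone.
print(energy(w_opt) <= energy(x + d))  # True
```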
In step 606, a combined energy may be determined based on at least one of the depth constraint, distortion constraint, and edge bending constraint. In some embodiments, the combined energy may be determined as a total energy including three portions:
Etotal=λDED+λdEd+λbEb+λcEc (17)
where energy
where d is the depth and n denotes the n-th pixel, ED, Ed and Eb are as defined in Equations (13)-(15), respectively, and λc, λD, λd, λb are the weights of Ec, ED, Ed and Eb, respectively. Although Equation (17) shows all four energy portions, it is contemplated that the combined energy may include fewer energy portions by adjusting the weights λc, λD, λd, λb.
In step 607, the warping map may be solved by reducing or minimizing the combined energy determined in step 606. In some embodiments, the following optimization operation may be performed for the warping map that results in a minimum energy Etotal:
arg minD(Etotal=λDED+λdEd+λbEb+λcEc) (19)
Methods for performing optimization operation of Equation (19) are similar to those described for solving Equations (8) and (16). As a result of the optimization, a warping map of refined depth may be obtained and process 600 may conclude after step 607.
Process 600 essentially combines process 200 and process 500 in a manner such that the optimization cost function is a combination of the cost functions used in the respective processes. As a result, an optimization based on the simulation circuit model and an optimization based on saliency constraints are performed in one combined optimization step. Consistent with some embodiments, process 600 may smooth the depth errors by diffusion within similar colors, while also limiting distortion and bending.
Although process 700 is described with step 702 performed before step 703, it is contemplated that the order of these two steps may be switched. By performing the two optimization operations sequentially and independently, the benefits of the diffusion current constraint and the saliency constraints are obtained without either restricting the other.
Although the above descriptions are made in connection with refining a depth map of a three-dimensional image, it is contemplated that the disclosed systems and methods can be adapted for various other applications. For example, the processes described in connection with FIGS. 2 and 5-7 may be readily applied to refine a sparse correspondence of the depth map. In some cases, the depth map may include only a subset of pixels that carry significant information and the rest of the pixels may carry only nominal depth information. Consistent with some embodiments, only the pixels that carry significant information (i.e., a sparse correspondence of the depth map) may be included in calculating the one or more energy terms that are described above. As a result, computational complexity for solving the unknowns may be reduced, without losing much important information in the depth map.
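The sparse-correspondence idea above can be sketched with a boolean mask that restricts the energy to the significant pixels; the data and the significance criterion below are illustrative assumptions:

```python
import numpy as np

# Depth map where only a few pixels carry significant information
# (illustrative values).
depth = np.array([[10.0, 0.0, 0.0],
                  [0.0, 20.0, 0.0],
                  [0.0, 0.0, 30.0]])
significant = depth > 0  # sparse correspondence mask (assumed criterion)

def sparse_depth_energy(w, x, d, mask):
    # Sum the squared depth residual only over the significant pixels,
    # reducing the number of terms (and unknowns) in the optimization.
    residual = (w - x - d)[mask]
    return np.sum(residual ** 2)

x = np.zeros_like(depth)
w = x + depth  # satisfies the depth constraint at every pixel
print(sparse_depth_energy(w, x, depth, significant))  # 0.0
```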
It will be apparent to those skilled in the art that various modifications and variations can be made in the disclosed embodiments without departing from the scope or spirit of those disclosed embodiments. Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosed embodiments being indicated by the following claims.
Number | Date | Country | Kind |
---|---|---|---|
99145926 | Dec 2010 | TW | national |
This application is a continuation-in-part of commonly-assigned, co-pending application U.S. patent application Ser. No. 13/152,093, filed Jun. 2, 2011, which claims the benefit of priority from U.S. Provisional Application No. 61/374,735, filed Aug. 18, 2010, and Taiwan Patent Application Serial No. 99145926, filed Dec. 24, 2010. These applications are hereby incorporated by reference into this application in their entirety.
Number | Date | Country | |
---|---|---|---|
61374735 | Aug 2010 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 13152093 | Jun 2011 | US |
Child | 13339136 | US |