The present invention relates generally to an imaging system, and more particularly to an imaging system for vanishing point detection.
Modern consumer and industrial electronics, especially devices with a graphical imaging capability, such as cameras, televisions, projectors, cellular phones, and combination devices, are providing increasing levels of functionality to support modern life including three-dimensional display services. Research and development in the existing technologies can take a myriad of different directions.
As users become more empowered with the growth of three-dimensional display devices, new and old paradigms begin to take advantage of this new device space. There are many technological solutions to take advantage of this new display device opportunity. One existing approach is to display three-dimensional images on consumer, industrial, and mobile electronics such as video projectors, televisions, monitors, gaming systems, or a personal digital assistant (PDA).
Due to projective imaging in digital cameras, the projections of parallel lines in the 3D world converge at single points in the 2D image plane, which are called vanishing points. Vanishing point detection is an important problem in computer vision. Given a single 2D image, finding the vanishing points in the image can be used as a step in geometric structure analysis of the scene, building 3D models of the scene, and depth estimation from 2D images and videos.
Three-dimensional imaging systems have been incorporated in cameras, projectors, televisions, notebooks, and other portable products. Today, these systems aid users by capturing and displaying available relevant information, such as diagrams, maps, or videos. The display of three-dimensional images provides invaluable relevant information.
However, displaying information in three-dimensional form has become a paramount concern for the consumer. Displaying a three-dimensional image that does not correlate with the real world decreases the benefit of using the tool.
Thus, a need still remains for better imaging systems to capture and display three-dimensional images. In view of the ever-increasing commercial competitive pressures, along with growing consumer expectations and the diminishing opportunities for meaningful product differentiation in the marketplace, it is increasingly critical that answers be found to these problems. Additionally, the need to reduce costs, improve efficiencies and performance, and meet competitive pressures adds an even greater urgency to the critical necessity for finding answers to these problems.
Solutions to these problems have been long sought but prior developments have not taught or suggested any solutions and, thus, solutions to these problems have long eluded those skilled in the art.
The present invention provides a method of operation of an imaging system, including: providing a source image having image metadata; calculating a segment image from the source image; calculating a compass angle for producing a maximum value of an a posteriori probability of the compass angle, the a posteriori probability based on the segment image and the image metadata; calculating an x-axis vanishing point, a y-axis vanishing point, and a z-axis vanishing point based on the compass angle and the image metadata; and calculating a display image for displaying on a display unit, the display image based on the source image, the x-axis vanishing point, the y-axis vanishing point, and the z-axis vanishing point.
The present invention provides an imaging system, including: an image sensor for capturing a source image having image metadata; a segment image calculated from the source image; a compass angle calculated for producing a maximum value of an a posteriori probability of the compass angle, the a posteriori probability of the compass angle based on the segment image and the image metadata; an x-axis vanishing point, a y-axis vanishing point, and a z-axis vanishing point calculated based on the compass angle and the image metadata; and a display unit for displaying a display image, the display image based on the source image, the x-axis vanishing point, the y-axis vanishing point, and the z-axis vanishing point.
Certain embodiments of the invention have other steps or elements in addition to or in place of those mentioned above. The steps or elements will become apparent to those skilled in the art from a reading of the following detailed description when taken with reference to the accompanying drawings.
The following embodiments are described in sufficient detail to enable those skilled in the art to make and use the invention. It is to be understood that other embodiments would be evident based on the present disclosure, and that system, process, or mechanical changes may be made without departing from the scope of the present invention.
In the following description, numerous specific details are given to provide a thorough understanding of the invention. However, it will be apparent that the invention may be practiced without these specific details. In order to avoid obscuring the present invention, some well-known circuits, system configurations, and process steps are not disclosed in detail.
The drawings showing embodiments of the system are semi-diagrammatic and not to scale and, particularly, some of the dimensions are for the clarity of presentation and are shown exaggerated in the drawing FIGs. Similarly, although the views in the drawings for ease of description generally show similar orientations, this depiction in the FIGs. is arbitrary for the most part. Generally, the invention can be operated in any orientation.
The same numbers are used in all the drawing FIGs. to relate to the same elements. The embodiments have been numbered first embodiment, second embodiment, etc. as a matter of descriptive convenience and are not intended to have any other significance or provide limitations for the present invention.
The term “image” is defined as a pictorial representation of an object. An image can include a two-dimensional image, a three-dimensional image, a video frame, a computer file representation, an image from a camera, or a combination thereof. For example, the image can be a machine readable digital file, a physical photograph, a digital photograph, a motion picture frame, a video frame, an x-ray image, a scanned image, or a combination thereof. The image can be formed by pixels arranged in a rectangular array. The image can include an x-axis along the direction of the rows and a y-axis along the direction of the columns.
The horizontal direction is the direction parallel to the x-axis of an image. The vertical direction is the direction parallel to the y-axis of an image. The diagonal direction is the direction non-parallel to the x-axis and non-parallel to the y-axis.
The term “module” referred to herein can include software, hardware, or a combination thereof. For example, the software can be machine code, firmware, embedded code, or application software. Also for example, the hardware can be circuitry, a processor, a computer, an integrated circuit, integrated circuit cores, or a combination thereof.
Referring now to
The imaging device 102 can form the source image 104 in a variety of ways. For example, the source image 104 can be formed by capturing a visual representation of a physical scene with an optical sensor. In another example, the source image 104 can be formed by a computer, an infrared imaging device, an ultraviolet imaging device, a scanning device, or a combination thereof.
The source image 104 can include image metadata 106. The image metadata 106 is information about the source image 104. For example, the image metadata 106 can include information about the physical properties of the imaging device when the source image 104 was created. In another example, the image metadata 106 can be the picture information recorded with the digital image in a digital camera.
The image metadata 106 can include information such as photographic properties, imaging device orientation, imaging device location, optical parameters, settings, light levels, lens information, or a combination thereof. For example, the image metadata 106 can include a focal length 120. The focal length 120 is the length between the lens and the focus point of the imaging device 102.
The imaging device 102 can form a segment image 108 from the source image 104. The segment image 108 is an image having directional information extracted from the source image 104. For example, the segment image 108 can include a line image, an edge image, a gradient image, a vector image, or a combination thereof. In another example, the segment image 108 can be a line image formed by performing line detection on the source image 104. In yet another example, the segment image 108 can be an edge image formed by performing edge detection on the source image 104.
The segment image 108 and the source image 104 having image metadata 106 can be transferred from the imaging device 102 to a display device 112 over a communication link 118. The display device 112 is a unit capable of displaying a display image 116 on a display unit 114. For example, the display device 112 can be a handheld device with a liquid crystal display unit for viewing images.
The communication link 118 is a mechanism for transferring information. For example, the communication link 118 can be an internal computer bus, an inter-device bus, a network link, or a combination thereof. Although the imaging device 102 and the display device 112 are depicted as separate devices, it is understood that the imaging device 102 and the display device 112 may be implemented as a single integrated device.
Referring now to
Referring now to
The uv coordinate system 210 can depict the plane of the image sensor 110 of the imaging device 102. The uv coordinate system 210 indicates the two-dimensional plane of the image sensor 110. A u-axis 212 can depict the horizontal dimension of the image sensor 110. A v-axis 214 can depict the vertical dimension of the image sensor 110. The uv coordinate system 210 depicts a three-dimensional block and the plane of the image sensor 110.
The abc coordinate system 216 can depict the coordinate system relative to the imaging device, such as a camera, a computer, a video recorder, or a combination thereof. The abc coordinate system 216 is illustrated by a three-dimensional cube. An a-axis 218 can depict the horizontal dimension. A b-axis 220 can depict the vertical dimension. A c-axis 222 can depict the depth dimension. The abc coordinate system 216 depicts another three-dimensional block.
The u-axis 212 is aligned to the a-axis 218, and the v-axis 214 is aligned to the b-axis 220. However, the abc coordinate system 216 is usually not aligned with the Manhattan world coordinate system, the xyz coordinate system 202, due to camera orientation changes.
It has been found that there are three vanishing points corresponding to lines in the three orthogonal directions of the Manhattan world model. The Manhattan world model assumes that scenes consist of piece-wise planar surfaces with dominant directions.
Vanishing point locations and camera orientation are closely related. The first coordinate system is the xyz coordinate system 202 of the Manhattan world model. The second is the abc coordinate system 216 of the imaging device 102, such as a camera. The third is the uv coordinate system 210 of the plane of the image sensor 110. The first two are 3D coordinate systems, and the last one is a 2D coordinate system.
Referring now to
The abc coordinate system 216 of
Referring now to
One of the line segments 410, also designated “i”, has an angle theta(i) (θi) relative to a horizon line 408. The horizon line 408 is a horizontal line indicating the horizon of the source image 104 of
The horizon line 408 and the line connecting the center of one of the line segments 410 to the x-axis vanishing point 402 form the line angle theta(i,1) (θi,1). The horizon line 408 and the line connecting the center of one of the line segments 410 to the y-axis vanishing point 404 form the line angle theta(i,2) (θi,2). The horizon line 408 and the line connecting the center of one of the line segments 410 to the z-axis vanishing point 406 form the line angle theta(i,3) (θi,3).
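These angle relationships can be sketched in code. The following is an illustrative sketch only; the function names and the atan2-based geometry are assumptions for exposition, not the patented implementation.

```python
import math

def angle_to_vanishing_point(cx, cy, vp):
    """Angle, relative to the horizontal, of the ray from a segment's
    midpoint (cx, cy) to a vanishing point vp = (u, v)."""
    return math.atan2(vp[1] - cy, vp[0] - cx)

def segment_angles(segment, vanishing_points):
    """For a segment ((x1, y1), (x2, y2)), return its own angle theta_i
    and the angles theta_{i,k} toward each vanishing point."""
    (x1, y1), (x2, y2) = segment
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    theta_i = math.atan2(y2 - y1, x2 - x1)
    theta_ik = [angle_to_vanishing_point(cx, cy, vp) for vp in vanishing_points]
    return theta_i, theta_ik
```

A line that truly points at vanishing point k has θi close to θi,k, which is what the probability model of the embodiment exploits.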
It has been found that the three vanishing points, the x-axis vanishing point 402, the y-axis vanishing point 404, and the z-axis vanishing point 406, can be calculated given the focal length 120 of
The focal length 120, the elevation angle 306 of
The x-axis vanishing point 402 can be described as:
The y-axis vanishing point 404 can be described as:
The z-axis vanishing point 406 can be described as:
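As a hedged sketch of these calculations, the three vanishing points can be obtained by rotating the world axes of the xyz coordinate system 202 into the abc coordinate system 216 and projecting them onto the uv plane using the focal length 120. The rotation-axis order and sign conventions below are assumptions and may differ from the exact equations of the embodiment.

```python
import numpy as np

def rotation_matrix(alpha, beta, gamma):
    """Camera rotation built from compass (alpha), elevation (beta),
    and twist (gamma) angles, in radians. The axis order and signs here
    are one common convention, assumed for illustration."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    Rz = np.array([[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]])   # twist
    Rx = np.array([[1, 0, 0], [0, cb, -sb], [0, sb, cb]])   # elevation
    Ry = np.array([[ca, 0, sa], [0, 1, 0], [-sa, 0, ca]])   # compass
    return Rz @ Rx @ Ry

def vanishing_points(f, alpha, beta, gamma):
    """Project the three world axes through the rotated camera to get
    the x-, y-, and z-axis vanishing points in uv sensor coordinates."""
    R = rotation_matrix(alpha, beta, gamma)
    vps = []
    for k in range(3):
        d = R @ np.eye(3)[k]          # world axis k in camera coordinates
        if abs(d[2]) < 1e-12:
            vps.append((np.inf, np.inf))  # axis parallel to the image plane
        else:
            vps.append((f * d[0] / d[2], f * d[1] / d[2]))
    return vps
```

With all angles zero, the z-axis vanishing point falls at the image center and the x- and y-axis vanishing points recede to infinity, matching the intuition that only tilted axes produce finite vanishing points.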
Referring now to
The vanishing points pseudo code 502 can receive the source image 104 of
For example, the segment image 108 can be a line image having line segments 410 of
The vanishing points pseudo code 502 can then calculate a coarse probability 504 for the compass angle 304 of
The coarse angle maximum probability 506 indicates a coarse compass angle estimate 508. The coarse compass angle estimate 508 is the angle which produces the coarse angle maximum probability 506.
The coarse probability 504 can be calculated at 5 degree intervals over the range of −45 degrees to +45 degrees. The coarse probability 504 can also be calculated at 5 degree intervals over the range of about −45 degrees to about +45 degrees. The coarse probability 504 can be calculated using the FindProb function, which is a function to calculate the conditional probability of the compass angle 304 alpha for one of the line segments 410, Li.
The vanishing points pseudo code 502 can refine the estimate for the compass angle 304 by calculating a refined probability 510 for the compass angle 304 for each of the line segments 410 to identify a refined compass angle estimate 512. The refined compass angle estimate 512 is the angle which produces the maximum of the refined probability 510 for each of the values tested for the compass angle 304.
The refined probability 510 can be calculated at 1 degree intervals around the previously determined value of the coarse compass angle estimate 508. The refined probability 510 can be calculated over a variety of ranges around the coarse compass angle estimate 508. For example, the refined probability 510 can be calculated over −3 to +3 degrees, −5 to +5 degrees, about −5 to about +5 degrees, or a combination thereof.
The compass angle 304 can be set to the value of the refined compass angle estimate 512. The x-axis vanishing point 402, the y-axis vanishing point 404, and the z-axis vanishing point 406, can all be calculated based on the compass angle 304, the focal length 120, the elevation angle 306, and the twist angle 308.
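The coarse-to-fine search described above can be sketched as follows. The scoring callback stands in for the a posteriori probability computation, and the function name is an illustrative assumption.

```python
def coarse_to_fine_search(score, coarse_step=5, fine_step=1, lo=-45, hi=45):
    """Two-stage search for the compass angle maximizing a scoring
    function (e.g. the a posteriori probability of the angle given the
    detected segments). 'score' maps an angle in degrees to a float."""
    # Coarse pass: every 5 degrees over (-45, +45).
    coarse = max(range(lo, hi + 1, coarse_step), key=score)
    # Fine pass: every 1 degree in a window around the coarse estimate.
    window = range(max(lo, coarse - coarse_step),
                   min(hi, coarse + coarse_step) + 1, fine_step)
    return max(window, key=score)

# A toy unimodal score peaking at 17 degrees stands in for the real
# a posteriori probability.
best = coarse_to_fine_search(lambda a: -abs(a - 17))  # best == 17
```

The coarse pass evaluates only 19 angles and the fine pass about 10 more, far fewer than an exhaustive 1 degree sweep over the full range.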
The process for calculating the x-axis vanishing point 402, the y-axis vanishing point 404, and the z-axis vanishing point 406 can be described in more detail below. It has been found that a maximum value of an a posteriori probability 516 estimate of the compass angle 304 (α) of the imaging device 102 of
where P(α) is the prior probability of the compass angle 304 (α), and {Li} represents all detected lines in the segment image 108.
The segment image 108 can be generated from the source image 104 using a line detection process. For example, the line detection process can include Canny edge detection, Hough transforms, the von Gioi method, or a combination thereof.
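As an illustrative sketch of one such line detection step, a minimal Hough transform can map edge pixels to line parameters. A production system would more likely use a library detector (for example, OpenCV's Hough functions or the von Gioi LSD detector); the simplified implementation and its parameters below are assumptions.

```python
import numpy as np

def hough_lines(edge_mask, n_theta=180, peak_frac=0.5):
    """Minimal Hough transform: accumulate votes in (rho, theta) space
    from a boolean edge mask and return parameters of strong lines,
    where each line satisfies x*cos(theta) + y*sin(theta) = rho."""
    ys, xs = np.nonzero(edge_mask)
    h, w = edge_mask.shape
    diag = int(np.ceil(np.hypot(h, w)))          # max possible |rho|
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((2 * diag, n_theta), dtype=int)
    for x, y in zip(xs, ys):
        # Each edge pixel votes for every (rho, theta) line through it.
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int) + diag
        acc[rhos, np.arange(n_theta)] += 1
    peaks = np.argwhere(acc >= peak_frac * acc.max())
    return [(rho - diag, thetas[t]) for rho, t in peaks]
```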
The compass angle 304 (α) can be found by maximizing the a posteriori probability 516 as follows in Equation (3):
Note that P({Li}) is not a function of α, so it can be removed from the maximization, and α* can represent the estimated value of the compass angle 304.
It has been found that the maximum can be determined by first doing a coarse search by computing the a posteriori probability 516 on a 5 degree interval of α in the range of (−45 degrees, 45 degrees). Then the result can be refined by computing the a posteriori probability 516 on a 1 degree interval around the coarse compass angle estimate 508.
The prior probability P(α) is the probability of the compass angle 304 based on prior knowledge. The prior probability P(α) is considered uniform over the range (−45 degrees, 45 degrees). Because the prior probability is uniform, it can be considered to be a scalar constant and has no effect on the determination of the maximum probability for the tested values of the compass angle 304.
The next step is to calculate the conditional probability of P(Li|α). Consider each line i to be in one of the following four groups denoted by gi where gi=1 means that line i points to vanishing point (ux,vx), gi=2 means that line i points to vanishing point (uy,vy), gi=3 means that line i points to vanishing point (uz,vz), and gi=4 means that line i does not point to any vanishing points.
Then, the conditional probability of P(Li|α) can be calculated using Equation (4).
The angles of the line segments 410 can be modeled with respect to the horizon line 408 (θi) as a Gaussian random variable when gi ∈ {1, 2, 3}. The variable gi represents the group assignment of one of the line segments 410.
The angle θi can be treated as uniformly distributed when gi=4. So the conditional probability of P(Li,gi|α) can be calculated using Equation (5):
where σi2 is inversely proportional to line length, and θi,gi is the angle between the horizon line 408 and the line connecting the center of line i to the vanishing point of group gi.
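Under these assumptions, the mixture in Equation (5) can be sketched as follows. The group priors and the exact variance scaling are illustrative assumptions; only the Gaussian-plus-uniform structure comes from the description above.

```python
import math

def line_conditional_probability(theta_i, theta_targets, length,
                                 p_group=(0.3, 0.3, 0.3, 0.1)):
    """P(L_i | alpha): a mixture over the four group assignments g_i.
    For g in {1, 2, 3} the segment angle theta_i is modeled as Gaussian
    around the ideal angle toward that group's vanishing point; for
    g = 4 (an outlier line) the angle is uniform over an interval of
    width pi. The priors p_group and variance model are assumptions."""
    sigma2 = 1.0 / max(length, 1e-6)   # variance inversely prop. to length
    prob = p_group[3] * (1.0 / math.pi)          # uniform outlier term
    for g in range(3):
        diff = theta_i - theta_targets[g]
        gauss = (math.exp(-diff * diff / (2 * sigma2))
                 / math.sqrt(2 * math.pi * sigma2))
        prob += p_group[g] * gauss
    return prob
```

A long segment aligned with one of the three ideal directions receives a much higher probability than a misaligned one, which is what drives the maximization over the compass angle.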
Given the elevation angle 306 (β) and the twist angle 308 (γ), as retrieved from the image metadata 106 of
Referring now to
The line detection module 604 can receive the source image 104 and generate the segment image 108 of
The find line parameters module 606 can determine line angles 614 and line lengths 618 for each of the line segments 410 in the segment image 108. The find line parameters module 606 can iterate through the set of the line segments 410 in the segment image 108 and determine the individual values for the line angles 614 and the line lengths 618. The line angles 614 can be passed to the calculate differences module 610. The line angles 614 and the line lengths 618 can be passed to the calculate probability module 612.
The calculate vanishing points module 608 can calculate the x-axis vanishing point 402, the y-axis vanishing point 404, and the z-axis vanishing point 406 based on the focal length 120, the compass angle 304, the elevation angle 306, and the twist angle 308. The calculate vanishing points module 608 can pass the x-axis vanishing point 402, the y-axis vanishing point 404, and the z-axis vanishing point 406 for the source image 104 to the calculate differences module 610.
The calculate differences module 610 can receive the x-axis vanishing point 402, the y-axis vanishing point 404, the z-axis vanishing point 406, and the line angles 614 to determine line angle differences 616 between each of the line angles 614 with respect to the x-axis vanishing point 402, the y-axis vanishing point 404, and the z-axis vanishing point 406. The calculate differences module 610 can pass the line angle differences 616 to the calculate probability module 612.
The calculate probability module 612 can receive the line angle differences 616 from the calculate differences module 610 and the line angles 614 and the line lengths 618 of the segment image 108 from the find line parameters module 606. The calculate probability module 612 can calculate the a posteriori probability 516 based on the line angle differences 616, the line angles 614, and the line lengths 618 for the lines in the segment image 108.
The block diagram represents the vanishing points pseudo code 502 of
It has been discovered that calculating the compass angle 304 based on the Manhattan world from lines model and the image metadata 106 can increase accuracy of the compass angle 304 estimate. For example, the accuracy can be increased by around four percent. Using the Manhattan world from lines model and the image metadata 106 of
Referring now to
The vanishing points edge pseudo code 702 can perform edge detection to generate the segment image 108, such as an edge image, from the source image 104. The edge image is a representation of the source image 104 based on edge segments 710 of the pixels in the source image 104. The segment image 108 of
The vanishing points edge pseudo code 702 can evaluate a cost function 704 (ξ) for the segment image 108. Calculating the cost function 704 corresponds to the edge detection for generating the edge image from the source image 104. The segment image 108 can include the edge segments 710 calculated based on the gradient of the source image 104. The cost function 704 is equivalent to determining the a posteriori probability 516 of
The vanishing points edge pseudo code 702 can then calculate a coarse cost minimum 706 for the cost function 704 for each of the edge segments 710 of
The vanishing points edge pseudo code 702 can refine the estimate for the compass angle 304 by calculating a fine cost minimum 708 for the compass angle 304 for each of the edge segments 710. The fine cost minimum 708 can be calculated at 1 degree intervals around the minimum coarse angle 712 for the coarse cost minimum 706. The fine cost minimum 708 can be calculated over a variety of ranges around the minimum coarse angle 712. For example, the fine cost minimum 708 can be calculated over −3 to +3 degrees, −5 to +5 degrees, about −5 to about +5 degrees, or a combination thereof. The fine cost minimum 708 represents the minimum cost at a minimum fine angle 714.
The compass angle 304 can be set to the value of the minimum fine angle 714 at the fine cost minimum 708. The x-axis vanishing point 402 of
It has been discovered that calculating the compass angle 304 based on the regularized Manhattan world model and the image metadata 106 can increase accuracy of the estimate of the compass angle 304. For example, the accuracy can be increased by around five percent or more. Using the regularized Manhattan world model and the image metadata 106 of
In the vanishing points edge pseudo code 702, one of the edge segments 710 (ei) is defined at pixel i, and the angle θi is the edge angle at pixel i. The edge direction is orthogonal to the gradient direction, so θi = atan(−du/dv), where du and dv are the image gradients along the u and v directions. The group assignment gi is similarly defined by simply replacing line i with edge i.
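The per-pixel edge angle computation can be sketched as follows. The gradient operator (numpy.gradient rather than, say, a Sobel filter) and the threshold parameter are assumptions for illustration.

```python
import numpy as np

def edge_angles(image, threshold=10.0):
    """Per-pixel edge angles from image gradients. The edge direction is
    orthogonal to the gradient, so theta_i = atan(-du/dv). Pixels whose
    gradient magnitude falls below 'threshold' are not treated as edges."""
    du = np.gradient(image.astype(float), axis=1)  # gradient along u (columns)
    dv = np.gradient(image.astype(float), axis=0)  # gradient along v (rows)
    magnitude = np.hypot(du, dv)
    theta = np.arctan2(-du, dv)                    # edge angle per pixel
    mask = magnitude >= threshold                  # valid edge pixels
    return theta, magnitude, mask
```

For a vertical step edge the gradient is horizontal, so the recovered edge angle is ±π/2, i.e. the edge itself runs vertically.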
The maximum a posteriori estimate of α and {gi} is calculated using Equation (6):
It has been found that the estimate of the compass angle 304 is calculated with improved accuracy because the process jointly optimizes the set of edges, {gi}. Optimizing each of the set of edges pointing to each of the x-axis vanishing point 402, the y-axis vanishing point 404, and the z-axis vanishing point 406 improves accuracy of the compass angle 304.
It has also been found that the accuracy of the compass angle 304 is improved because the spatial dependency of the edges is taken into account, whereas different lines are typically spatially independent. Grouping the edges to take advantage of the spatial dependency of similar edges increases the likelihood that the grouped edges describe an artifact in the source image 104.
The a posteriori probability 516 of
Further, the conditional probability term P(ei|α,gi) of Equation (7) can be calculated using Equation (8):
where τ is a pre-defined constant.
The joint prior probability of {gi} can be calculated as
where W is a normalizing constant, λ is a pre-specified constant scalar, δ( ) is a Kronecker delta function, and C is the set of cliques ((i, j) ∈ C indicates that pixels i and j are neighbors). The Kronecker delta function δ(gi, gj) is a piecewise function for gi and gj whose value is 1 when gi=gj and 0 otherwise. The set of cliques (i, j) represents the set of edge segments 710 that are neighbors in a 3×3 window.
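The clique structure and the Kronecker-delta regularization can be sketched as follows. The four-offset enumeration (so each unordered neighbor pair within the 3×3 window is counted once) is an implementation assumption.

```python
import numpy as np

def clique_pairs(height, width):
    """Enumerate neighboring pixel pairs (i, j) in C, using the four
    right/down offsets so each unordered pair appears exactly once."""
    pairs = []
    for r in range(height):
        for c in range(width):
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < height and 0 <= cc < width:
                    pairs.append(((r, c), (rr, cc)))
    return pairs

def regularization_penalty(groups, lam=1.0):
    """Potts-style penalty: lam for every neighboring pair whose group
    labels differ (the Kronecker delta is 1 only when g_i == g_j)."""
    h, w = groups.shape
    return lam * sum(groups[i] != groups[j] for i, j in clique_pairs(h, w))
```

A uniform labeling costs nothing, while every disagreement between neighbors adds λ, which is exactly how the prior favors spatially coherent group assignments.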
The prior probability P(α) is considered to be uniform within its range. Because the prior probability P(α) is constant, it can be factored out when determining the minimum value of the cost function 704.
Thus, maximizing the probability in Equation (7) is equivalent to minimizing the following cost function. The cost function 704 can be expressed as in Equation (10):
where θi,gi is the angle toward the vanishing point of group gi at pixel i, as in the line-based formulation above.
Therefore, the compass angle 304 can be determined using Equation (11):
Minimizing the cost function in Equation (10) can be done in a variety of ways. For example, the cost function 704 can be minimized using graph-cut or other optimization approaches.
It has been found that calculating the cost function can increase accuracy of calculation of the compass angle 304 by using the regularization term. The second term of Equation (10) can be considered as a regularization term, which is why the process is named regularized Manhattan world. The regularization term penalizes neighboring edges pointing to different vanishing points and results in the calculation of a more accurate value for the compass angle 304.
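As a hedged sketch of this minimization, iterated conditional modes (ICM) is a simple stand-in for graph-cut: each pixel's label is repeatedly set to the cheapest choice given its current neighbors. ICM only finds a local minimum; graph-cut, as noted above, is the stronger option. The 8-neighborhood and data-cost layout below are assumptions.

```python
import numpy as np

def icm_minimize(data_cost, lam=1.0, iterations=10):
    """Approximately minimize a cost of the form
        sum_i data_cost[i, g_i] + lam * sum_{(i,j) in C} [g_i != g_j]
    by iterated conditional modes. data_cost has shape (H, W, 4) for
    the four group labels."""
    h, w, n_labels = data_cost.shape
    labels = np.argmin(data_cost, axis=2)          # start from best data term
    for _ in range(iterations):
        changed = False
        for r in range(h):
            for c in range(w):
                best, best_cost = labels[r, c], np.inf
                for g in range(n_labels):
                    cost = data_cost[r, c, g]
                    # Regularization: penalize disagreement with neighbors.
                    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1),
                                   (-1, -1), (-1, 1), (1, -1), (1, 1)):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < h and 0 <= cc < w and labels[rr, cc] != g:
                            cost += lam
                    if cost < best_cost:
                        best, best_cost = g, cost
                if labels[r, c] != best:
                    labels[r, c] = best
                    changed = True
        if not changed:
            break
    return labels
```

A lone pixel whose data term weakly favors a different vanishing point gets smoothed over to agree with its neighbors, illustrating how the regularization term penalizes neighboring edges pointing to different vanishing points.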
It has been discovered that determining the x-axis vanishing point 402, the y-axis vanishing point 404, and the z-axis vanishing point 406, based on the calculated value of the compass angle 304 and the retrieved values of the elevation angle 306 and the twist angle 308 can increase processing speed and reduce computational complexity.
Referring now to
The a posteriori probability 516 can be calculated for the source image 104 based on the compass angle 304, the elevation angle 306, the twist angle 308, and the focal length 120. The block diagram can include a calculate image gradient module 804, a find gradient parameters module 806, a find edge pixels module 808, a define pixel neighborhood module 810, a calculate vanishing points edge module 812, a calculate differences edge module 814, and a minimizing module 816.
The calculate image gradient module 804 can receive the source image 104 and calculate the segment image 108 of
The find gradient parameters module 806 can determine gradient angles 818 and gradient magnitudes 820 for the segment image 108. The gradient angles 818 can be passed to the calculate differences edge module 814. The gradient angles 818 and the gradient magnitudes 820 can be passed to the find edge pixels module 808.
The find edge pixels module 808 can iterate through the set of the gradient angles 818 and the gradient magnitudes 820 from the segment image 108 and determine which edge pixels 824 represent valid edges using the gradient threshold 822. The gradient threshold 822 is a minimum difference between the values of two pixels to determine whether an edge exists. The edge pixels 824 can be passed to the define pixel neighborhood module 810.
The define pixel neighborhood module 810 can determine values for an array of pixels for defining the set of cliques where pixels i and j are neighbors. The clique is the set of neighboring pixels. Pixels are neighbors if they are included within an array, such as a 3×3 pixel array. The pixel neighborhood information can be passed to the minimizing module 816.
The calculate vanishing points edge module 812 can calculate the x-axis vanishing point 402, the y-axis vanishing point 404, and the z-axis vanishing point 406 based on the focal length 120, the compass angle 304, the elevation angle 306, and the twist angle 308. The calculate vanishing points module 608 of
The calculate differences edge module 814 can receive the x-axis vanishing point 402, the y-axis vanishing point 404, the z-axis vanishing point 406, and the gradient angles 818 to determine gradient angle differences between each of the gradient angles 818 with respect to the x-axis vanishing point 402, the y-axis vanishing point 404, and the z-axis vanishing point 406. The calculate differences edge module 814 can pass the gradient angle differences to the minimizing module 816.
The minimizing module 816 can receive the gradient angle differences from the calculate differences edge module 814 and the gradient angles 818, the gradient magnitudes 820, and the pixel neighborhood information from the define pixel neighborhood module 810. The minimizing module 816 can calculate the a posteriori probability 516, the cost function 704 of
The block diagram represents a portion of the vanishing points edge pseudo code 702 of
It has been discovered that when the x-axis vanishing point 402, the y-axis vanishing point 404, and the z-axis vanishing point 406 are calculated using the compass angle 304 based on edges instead of lines, the estimation of the compass angle 304 becomes more robust and less dependent on line detection algorithms, since line detection is usually more difficult than simple edge detection.
It has been discovered that calculating the compass angle 304 using the image metadata 106 of
It has been discovered that calculating the compass angle 304 based on the Manhattan world from lines model and the image metadata 106 can increase accuracy of the compass angle 304 estimate. Compared to the original Manhattan world method plus metadata, the accuracy can be increased by around four percent. Using the Manhattan world from lines model and the image metadata 106 provides additional information to increase the accuracy of the estimate of the compass angle 304 and provide more accurate calculations of the x-axis vanishing point 402, the y-axis vanishing point 404, and the z-axis vanishing point 406.
It has been discovered that calculating the compass angle 304 based on the regularized Manhattan world model and the image metadata 106 can increase accuracy of the estimate of the compass angle 304. Compared to Manhattan world from lines model, the accuracy can be increased by around five percent or more. Using the regularized Manhattan world model and the image metadata 106 provides additional information to increase the accuracy of the estimate of the compass angle 304 and provide more accurate calculations of the x-axis vanishing point 402, the y-axis vanishing point 404, and the z-axis vanishing point 406.
Referring now to
Referring now to
The segment image 108 can include lines that point toward the x-axis vanishing point 402 of
Referring now to
The segment image 108 can include lines that point toward the x-axis vanishing point 402 of
Referring now to
It has been discovered that the present invention thus has numerous aspects. The present invention valuably supports and services the historical trend of reducing costs, simplifying systems, and increasing performance. These and other valuable aspects of the present invention consequently further the state of the technology to at least the next level.
Thus, it has been discovered that the imaging system of the present invention furnishes important and heretofore unknown and unavailable solutions, capabilities, and functional aspects for efficiently detecting vanishing points and displaying three-dimensional images. The resulting processes and configurations are straightforward, cost-effective, uncomplicated, highly versatile, accurate, sensitive, and effective, can be surprisingly and unobviously implemented by adapting known technologies, and are thus readily suited for efficiently and economically manufacturing imaging devices fully compatible with conventional manufacturing processes and technologies.
While the invention has been described in conjunction with a specific best mode, it is to be understood that many alternatives, modifications, and variations will be apparent to those skilled in the art in light of the foregoing description. Accordingly, it is intended to embrace all such alternatives, modifications, and variations that fall within the scope of the included claims. All matters hitherto set forth herein or shown in the accompanying drawings are to be interpreted in an illustrative and non-limiting sense.