The invention relates generally to the field of stereoscopic capture, processing, and display systems. More specifically, the invention relates to a stereoscopic system that provides a way to compensate for spatial misalignment in source images and in the display system using image-processing algorithms.
The normal human visual system provides two separate views of the world through our two eyes. Each eye has a horizontal field of view of about 60 degrees on the nasal side and 90 degrees on the temporal side. A person with two eyes not only has a broader overall field of view but also has two slightly different images formed on the two retinas, and thus two different viewing perspectives. In normal human binocular vision, the disparity between the two views of each object is used as a cue by the human brain to derive the relative depth between objects. This derivation is accomplished by comparing the relative horizontal displacement of corresponding objects in the two images.
Stereoscopic displays are designed to provide the visual system with the horizontal disparity cue by displaying a different image to each eye. Known stereoscopic displays typically display a different image to each of the observer's two eyes by separating the images in time, wavelength, or space. These systems include liquid crystal shutters to separate the two images in time; lenticular screens, barrier screens, or autostereoscopic projection to separate the two images in space; and color filters or polarizers to separate the two images based on optical properties.
It is to be understood that while the two eyes are generally displaced in the horizontal direction, they are generally not displaced in the vertical direction. Therefore, while horizontal disparities are expected, vertical disparities are not expected and can significantly degrade the usefulness of a stereoscopic display system. For example, vertical displacement or misalignment between corresponding objects in the two images will reduce the viewer's ability to fuse the two images into a single perceived image, and the viewer is likely to experience visual fatigue and other undesirable side effects. When the amount of misalignment is small, the presence of vertical disparity results in eyestrain, degraded depth, and partial loss of depth perception. When the amount of vertical misalignment is large, vertical disparity may result in binocular rivalry and the total loss of depth perception.
Vertical misalignment can be introduced into stereoscopic images at various stages, including during image capture and image display. During image capture, a stereo image pair is typically recorded either with each image of the pair captured through a different optical system, and these optical systems may not always be aligned vertically, or with a single camera that is shifted laterally between captures, during which the vertical position of the camera can change. When the capture system is vertically misaligned, all pixels of the stereo pair may be offset vertically by a certain amount. Keystone distortion can also be created if the cameras are not positioned parallel to one another, as is often required to capture objects that are close to the capture system. This keystone distortion often reduces the vertical size of objects that are positioned at opposite sides of the scene, and it results in a vertical misalignment of a different amount for different pixels in the stereo pair. The vertical misalignment due to keystone distortion can, therefore, be much larger at the corners of the images than at the center of the images. The two captures can also have rotational or magnification differences, causing vertical misalignment in the stereo images. The vertical misalignment from rotational and magnification differences is generally larger at the corners of the images and smaller near the center of the images. Usually the vertical misalignment of the stereo images results from a combination of the factors mentioned above. A scanning process can also cause this type of vertical misalignment if the images are captured or stored on an analog medium, such as film, and a scanner is used to convert the analog images to digital.
Vertical disparity can also be produced by a vertical misalignment or rotation or magnification of the display optics. Many stereoscopic display systems have two independent imaging channels, each consisting of numerous optical and display components. It would be very difficult to manufacture two identical components to use for the two channels. In addition, it is also very difficult to assemble the system so that the two imaging channels are identical to each other in vertical position and offset precisely in horizontal position. As a result, various spatial mismatches can be introduced between the two imaging channels. Those spatial mismatches in display systems are manifested as spatial displacement in the stereo images. In the stereo images horizontal displacement can generally be interpreted as differences in depth while vertical displacement can lead to user discomfort. Stereoscopic systems that may present images with some degree of vertical displacement (e.g., helmet-mounted displays) typically have a very tight tolerance for relative display. The presence of this tight tolerance often complicates the manufacture and increases the cost of producing such devices.
Image-processing algorithms have been used to correct for the spatial misalignment created in stereoscopic capture systems. U.S. Pat. No. 6,191,809 and EP 1 235 439 A2 discuss a means for electronically correcting for misalignment of stereo images generated by stereoscopic capture devices, in particular, by stereo endoscopes. A target in the capture space is used for calibration. From the captured left and right images of the target, magnification and rotational errors of the capture device are estimated in sequence and used to correct the captured images. The horizontal and vertical offsets are estimated based on a second set of captured images of the target that have been corrected for magnification and rotational errors.
U.S. Patent Application Publication No. 2003/0156751 A1 describes a method for determining a pair of rectification transformations to rectify the two captured images to substantially eliminate vertical disparity from the rectified image pair. The goal of rectification is to transform the stereo image pair from a non-parallel camera setup to a virtual parallel camera setup. This method takes as inputs both the captured images and the statistics of parameters of the stereoscopic image capture device. The parameters may include intrinsic parameters, such as the focal length and principal point of a single camera, and extrinsic parameters, such as the rotation and translation between the two cameras. A warping method is used to apply the rectification transformation to the stereo image pair. Each of the references mentioned above either requires information about the capture devices or requires the image-processing system to be linked to the capture process. When the image source is unknown, the methods described above will not function properly.
It has also been recognized that there is a need to align certain components of a stereoscopic display system. U.S. Patent Application Publication No. 2004/0263970 A1 discloses a method of aligning an array of lenticular lenses to a display using software means. The software consists of a program that will provide test patterns to aid in positioning the lenticular array over the array of pixels on the display. In the alignment phase, the user would use some input means to indicate the rotational positions of test patterns shown on the display relative to the lenticular screen. The information determined by the alignment phase of the installation is subsequently stored in the computer, allowing rendering algorithms to compensate for the rotation of the lenticular screen with respect to the underlying pixel pattern on the display. While the actual algorithm of doing software processing to compensate for the rotational alignment of the lenticular screen is not described in the document, it would be expected that the misalignment of the lenticular screen would result primarily in horizontal shifts in the location of the pixels that will be seen by the left versus the right eye, and this algorithm would be expected to compensate for this artifact. Therefore, this reference does not provide a method for compensating for vertical misalignment within the stereoscopic display system.
There is a need, therefore, for creating a stereoscopic display system that can minimize overall spatial misalignment between the two stereo images without knowledge of the capture system. There is further a need for a method to compensate for the vertical and horizontal spatial misalignment in the display system. This method should further be robust, require a minimal processing time such that it may be performed in real time, and require minimal user interaction.
The present invention is directed to overcoming one or more of the problems set forth above. According to one aspect of the present invention, an image-processing algorithm is developed to correct the vertical misalignment introduced in the image capturing/producing process without prior knowledge of the causes. This image-processing algorithm compares the two images and registers one image to the other. The image registration process creates two displacement maps, one for the horizontal direction and one for the vertical direction. The algorithm applies the vertical displacement to one or both of the images to make the two images well aligned in the vertical direction. The method of the present invention also generates a display displacement map using a pair of test targets, a twin video camera set, a video mixer, and a video monitor. This displacement map can be further used by an image warping algorithm to pre-process the stereo images, and hence to compensate for any spatial misalignment introduced in the display system. Overall, the present invention provides an integrated solution to minimize the spatial misalignment caused by either the source or the display device in a stereoscopic display system.
The above and other objects, features, and advantages of the present invention will become more apparent when taken in conjunction with the following description and drawings wherein identical reference numerals have been used, where possible, to designate identical features that are common to the figures, and wherein:
a is a flow chart showing the method of image vertical misalignment correction of the present invention;
b shows a system using the method introduced in
a is a flow chart showing the method of display misalignment correction of the present invention; and
b shows a system using the method introduced in
To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures.
The present description is directed in particular to elements forming part of, or cooperating more directly with, apparatus in accordance with the invention. It is to be understood that elements not specifically shown or described may take various forms well known to those skilled in the art.
The present invention is directed towards a method for rectifying misalignment in a stereoscopic display system comprising: providing an input image to an image processor; creating an image source displacement map; obtaining a display displacement map; and applying the image source displacement map and the display displacement map to the input image to create a rectified stereoscopic image pair. The image source displacement map and the display displacement map may be combined to form a system displacement map, and this map may be applied to the input image in a single step. Alternatively, the image source displacement map and the display displacement map may be applied to the input image in separate steps. Further provided is a system employing the method of the present invention. Further methods are provided for forming and applying the image source displacement map based upon an analysis of the input image and for forming and applying the display displacement map.
The present invention is useful when applied within a stereoscopic imaging system in which one or more components of the system introduce some degree of spatial misalignment that can create discomfort for a human observer. The vertical misalignment of source images is compensated for by computing image transformation functions for a pair of stereo images, indicating the degree to which one image must be transformed to align to a second image; applying the vertical compensation to generate vertical displacement maps; computing working displacement maps for at least one of the stereo images; and correcting for the vertical displacement by deforming the stereo images using the computed working displacement maps. Such a processing chain may additionally consider display attributes by forming displacement maps that contain both vertical and horizontal displacements to compensate for vertical or horizontal displacements formed by misalignment of the display. The spatial misalignment of the display system is compensated by creating a display system displacement map, and applying a warping algorithm to pre-process one or more of the images so that the viewer will perceive stereo image pairs with minimal system introduced spatial misalignment.
Such an image processing chain may improve the comfort and the quality of the stereoscopic image viewing experience. This invention is based on the research results by the authors in which images containing vertical disparities were shown to induce discomfort. This improvement in viewing experience will often result in increased user comfort or enhanced viewing experience in terms of increasing user enjoyment, engagement and/or presence. This improvement may also be linked to the improvement in the performance of the user during the completion of a task such as the estimation of distances or depths within the images represented by the stereoscopic image pairs.
A system useful in practicing the present invention is shown in
The image source 110 may be any device or combination of devices that are capable of providing stereoscopic image information. For example, this image source may include a pair of still or video cameras capable of capturing the stereoscopic image information. Alternately, the image source 110 may be a server that is capable of storing one or more stereoscopic images. The image source 110 may also consist of a memory device capable of providing definitions of a computer generated graphics environment and textures that can be used by the image processor to render a stereoscopic view of a three dimensional graphical environment.
The image processor 120 may be any processor capable of performing the calculations that are necessary to determine the misalignment between a pair of stereoscopic images that have been retrieved from the image source 110. For example, this processor may be any application specific integrated circuit (ASIC), programmable integrated circuit or general-purpose processor. The image processor 120 performs the needed calculations based on information from the image source 110.
The rendering processor 130 may be any processor capable of performing the calculations that are necessary to apply a warping algorithm to a pair of input images to compensate for the spatial misalignment in the display system. The calculation is based on information from image processor 120 and from storage device 160. The rendering processor 130 and the image processor 120 may be two separate devices, or may be the same device.
The stereoscopic display device 140 may be any display capable of providing a stereoscopic pair of images to a user. For example, the stereoscopic display device 140 may be a direct view device that presents an image at the surface of the display (i.e., has a point of accommodation and convergence at the plane of the display surface), such as a barrier screen liquid crystal display device, a CRT with liquid crystal shutters and shutter glasses, a polarized projection system with linearly or circularly polarized glasses, a display employing lenticules, a projected autostereoscopic display, or any other device capable of presenting a pair of stereographic images to each of the left and right eyes at the surface of the display. The stereoscopic display device 140 may also be a virtual image display that displays the image at a virtual location, having adjustable points of accommodation and convergence, such as an autostereoscopic projection display device, a binocular helmet-mounted display device, or a retinal laser projection display.
The means for obtaining a display displacement map 150 may include a display device to display a stereoscopic image pair having a known spatial arrangement of points, a pair of stereoscopic cameras to capture the left and right images, and a processor to compare the two images to derive the display displacement map. The capture can be obtained with any still digital cameras or with video cameras as long as the spatial alignment of the two cameras is known. Alternately, the means for obtaining a display displacement map may include a display device to display a stereoscopic image pair having a known spatial arrangement, a user input device for allowing the user to move at least one of the images in the stereoscopic image pair to obtain correspondence between two points, and a method for determining the displacement of the images when the user indicates that correspondence is achieved. It should be noted that targets useful for automated alignment may not be adequate when the display displacement map is obtained based upon user alignment. Because the eyes of the user cannot be aligned in a fixed location, and because the human brain will attempt to align targets which have similar spatial structure on the stereoscopic display, the targets presented on the left and right screens must be designed to have little spatial correlation. One method to achieve this is to display primarily horizontal lines to one eye and vertical lines to the other eye. By using targets in which a horizontal or vertical line is displayed to one eye and asking the user to align this line to a gap in a line shown to the other eye, little spatial correlation exists between the two eye images, allowing the targets to be adjusted to fall at the same place on the two human retinas when the user's eyes are near their natural resting point.
The display displacement map will be stored in storage device 160, and will be used as input to the rendering processor 130. This map will be used to process the input images from image processor 120 to compensate for the horizontal as well as vertical misalignment of the display device.
Referring now to
In image registration terminology, the two images involved in stereoscopic visualization are referred to as a source image 220 and a reference image 222. Denote the source image and the reference image by $I(x_t, y_t, t)$ and $I(x_{t+1}, y_{t+1}, t+1)$, respectively. The notations x and y are the horizontal and vertical coordinates of the image coordinate system, and t is the image index (image 1, image 2, etc.). The origin, (x=0, y=0), of the image coordinate system is defined at the center of the image plane. It should be pointed out that the image coordinates, x and y, are not necessarily integers.
For the convenience of implementation, the image (or image pixel) is also indexed as I(i, j) where i, and j are strictly integers and parameter t is ignored for simplicity. This representation aligns with indexing a matrix in the discrete domain. If the image (matrix) has a height of h and a width of w, the corresponding image plane coordinates x and y at location (i, j) can be computed as x=i−(w−1)/2.0, and y=(h−1)/2.0−j. The column index i runs from 0 to w−1. The row index j runs from 0 to h−1.
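The index-to-coordinate relation stated above can be illustrated with a minimal sketch. The function names `index_to_plane` and `plane_to_index` are hypothetical and are introduced here only for illustration; the inverse mapping is simply the algebraic inverse of the two stated formulas.

```python
def index_to_plane(i, j, w, h):
    """Map integer column/row indices (i, j) to centered image-plane
    coordinates (x, y), using x = i - (w - 1)/2 and y = (h - 1)/2 - j."""
    x = i - (w - 1) / 2.0
    y = (h - 1) / 2.0 - j
    return x, y


def plane_to_index(x, y, w, h):
    """Algebraic inverse of index_to_plane; the result may be non-integer."""
    i = x + (w - 1) / 2.0
    j = (h - 1) / 2.0 - y
    return i, j


if __name__ == "__main__":
    # For a 5-wide, 3-high image the center pixel (i=2, j=1) is the origin.
    print(index_to_plane(2, 1, w=5, h=3))
    # The top-left pixel lies at negative x and positive y.
    print(index_to_plane(0, 0, w=5, h=3))
```

Note that the row index j increases downward while the plane coordinate y increases upward, which is why the sign of j is flipped in the second formula.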
In general, the registration process involves finding an optimal transformation function $\Phi_{t+1}(x_t, y_t)$ (see step 202) such that
$$[x_{t+1},\ y_{t+1},\ 1]^T = \Phi_{t+1}(x_t, y_t)\,[x_t,\ y_t,\ 1]^T \qquad (10\text{-}1)$$
The transformation function of Equation (10-1) is a 3×3 matrix with elements shown in Equation (10-2).
In fact, the transformation matrix consists of two parts, a rotation sub-matrix
and a translation vector
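For illustration only, and assuming the transformation takes the common homogeneous affine form, a 3×3 matrix of this kind, its rotation sub-matrix, and its translation vector could be written as follows. The element names $\phi_{mn}$ and the symbols $R$ and $\mathbf{t}$ are assumptions made for this sketch and are not taken from the text.

```latex
\Phi =
\begin{bmatrix}
\phi_{00} & \phi_{01} & \phi_{02} \\
\phi_{10} & \phi_{11} & \phi_{12} \\
0 & 0 & 1
\end{bmatrix},
\qquad
R =
\begin{bmatrix}
\phi_{00} & \phi_{01} \\
\phi_{10} & \phi_{11}
\end{bmatrix},
\qquad
\mathbf{t} =
\begin{bmatrix}
\phi_{02} \\
\phi_{12}
\end{bmatrix}
```

Under this assumed form, Equation (10-1) expands to $x_{t+1} = \phi_{00} x_t + \phi_{01} y_t + \phi_{02}$ and $y_{t+1} = \phi_{10} x_t + \phi_{11} y_t + \phi_{12}$.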
Note that the transformation function Φ is either a global function or a local function. A global function Φ transforms every pixel in an image in the same manner. A local function Φ transforms each pixel in an image differently based on the location of the pixel. For the task of image registration, the transformation function Φ could be a global function or a local function or a combination of the two.
In practice, the transformation function Φ generates two displacement maps, X(i, j), and Y(i, j), which contain the information that could bring pixels in the source image to new positions that align with the corresponding pixel positions in the reference image. In other words, the source image is to be spatially corrected.
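As a minimal sketch of how a global transformation could yield the two displacement maps, the following assumes that Φ is a single 3×3 homogeneous matrix applied to every pixel and that each map stores the difference between the transformed and original plane coordinates. The function name `displacement_maps` and the (row, column) array layout are assumptions of this example, not part of the described method.

```python
import numpy as np


def displacement_maps(phi, w, h):
    """Return X and Y displacement maps for a global 3x3 transform phi.

    Arrays have shape (h, w) and are indexed [j, i] (row, column), so the
    entry [j, i] corresponds to X(i, j) or Y(i, j) in the text's notation.
    Displacements are expressed in image-plane units (x right, y up).
    """
    phi = np.asarray(phi, dtype=float)
    i, j = np.meshgrid(np.arange(w), np.arange(h))  # column and row grids
    x = i - (w - 1) / 2.0          # centered plane coordinates
    y = (h - 1) / 2.0 - j
    pts = np.stack([x, y, np.ones_like(x)], axis=0).reshape(3, -1)
    mapped = phi @ pts             # apply Equation (10-1) to every pixel
    x_new = mapped[0].reshape(h, w)
    y_new = mapped[1].reshape(h, w)
    return x_new - x, y_new - y


if __name__ == "__main__":
    theta = np.deg2rad(1.0)        # a small rotation, chosen for illustration
    rot = [[np.cos(theta), -np.sin(theta), 0.0],
           [np.sin(theta),  np.cos(theta), 0.0],
           [0.0, 0.0, 1.0]]
    X, Y = displacement_maps(rot, w=5, h=3)
    print(np.round(Y, 4))          # zero at the center, larger toward corners
```

A pure vertical translation gives a constant Y map, whereas a rotation gives a Y map that grows toward the image corners, consistent with the behavior described for rotational misalignment.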
It is clear that in the case of stereoscopic visualization for human viewers, only the vertical direction displacement map Y(i, j) (step 204) is needed to bring the pixels in the source image to new positions that align, in the vertical direction, with the corresponding pixels in the reference image. This vertical alignment will correct the discomfort caused by the varying vertical misalignment due to, for example, perspective distortion. For the displacement map Y(i, j), the column index i runs from 0 to w−1 and the row index j runs from 0 to h−1.
In practice, to generalize the correction of vertical misalignment using the displacement Y(i, j), a working displacement map Yα(i, j) is introduced. The working displacement map Yα(i, j) is computed with a pre-determined factor α of a particular value (step 206) as
Yα(i,j)=αY(i,j).
where 0≦α≦1. The generated working displacement map Yα(i, j) is then used to deform the source image (step 208) to obtain a vertical misalignment corrected source image 224. The introduction of a working displacement Yα(i, j) facilitates the correction of vertical misalignment for both images (left and right) when the need arises. The process of correction of vertical misalignment for both images (left and right) is explained below.
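The scaling and deformation steps can be sketched as follows. This is an illustration under stated assumptions rather than the method itself: the map is assumed to be expressed in row units and used in backward (pull) form, so that the corrected pixel at row j, column i is sampled from source row j + Yα; linear interpolation along each column stands in for whichever interpolation algorithm is actually used. The function names `working_map` and `deform_vertical` are hypothetical.

```python
import numpy as np


def working_map(Y, alpha):
    """Scale a vertical displacement map by a factor alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must satisfy 0 <= alpha <= 1")
    return alpha * np.asarray(Y, dtype=float)


def deform_vertical(image, Y_alpha):
    """Resample a grayscale image column by column.

    Assumed convention: out[j, i] = image[j + Y_alpha[j, i], i], evaluated
    with linear interpolation; samples outside the image take the value of
    the nearest edge row.
    """
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    rows = np.arange(h, dtype=float)
    out = np.empty_like(image)
    for i in range(w):
        sample_rows = rows + Y_alpha[:, i]
        out[:, i] = np.interp(sample_rows, rows, image[:, i])
    return out


if __name__ == "__main__":
    img = np.tile(np.arange(5.0)[:, None], (1, 3))  # value equals row index
    Y = np.ones((5, 3))                             # one-row misalignment
    half = deform_vertical(img, working_map(Y, 0.5))
    print(half)                                     # shifted by half a row
```

With α = 1 the source image absorbs the full correction; with a smaller α only part of the displacement is applied to this image, which leaves room for the remainder to be applied to the other image of the pair.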
It is clear that the roles of source and reference images are exchangeable for the two images (left and right images) involved in the context of correction of vertical misalignment in stereoscopic visualization.
In general, to correct the discomfort caused by the varying vertical misalignment due to, for example, perspective distortion, both the left and right images could be spatially corrected with working displacement maps Yα(i, j) computed with a pre-determined factor α of particular values.
As shown in
An exemplary result of vertical misalignment correction is shown in
Note that the registration algorithm used in computing the image transformation function Φ could be a rigid registration algorithm, a non-rigid registration algorithm, or a combination of the two. People skilled in the art understand that there are numerous registration algorithms that are typically used to register images that are captured at different times or to assess the horizontal disparity of different objects in order to determine depth or distance from stereoscopic image pairs. However, these same algorithms can carry out the task of finding the transformation function Φ that generates the needed displacement maps for the correction of the vertical misalignment in stereoscopic visualization by performing this registration in the vertical dimension for left and right eye images. Exemplary registration algorithms can be found in “Medical Visualization with ITK”, by Ibanez, L., et al. at http://www.itk.org. People skilled in the art also understand that spatially correcting an image with a displacement map could be realized by using any suitable image interpolation algorithm (see “Robot Vision” by Horn, B., The MIT Press, pp. 322 and 323).
Having discussed a method for creating an image source displacement map, a method for determining a display displacement map can be addressed. Referring to
An exemplar measurement system is shown in
Exemplar measurement results of the display displacement map are shown in
Referring now to
Persons skilled in the art will recognize that numerous warping algorithms exist to generate a displacement map based on a series of source and destination anchor points. An exemplar method is to connect the anchor points within each image into a grid of line segments and to employ the method for warping based on line segments that is described in Beier, T. and Neely, S., “Feature-Based Image Metamorphosis,” Computer Graphics, Annual Conference Series, ACM SIGGRAPH, 1992, pp. 35-42. Alternate methods have been developed that are based directly on the positions of the anchor points. An exemplar technique is described in Lee, S., Wolberg, G., and Shin, S. Y., “Scattered Data Interpolation with Multilevel B-Splines,” IEEE Transactions on Visualization and Computer Graphics, Vol. 3, No. 3, 1997, pp. 228-244.
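To make the idea of turning anchor points into a dense displacement map concrete, the sketch below uses inverse-distance weighting. This is a deliberately simple stand-in and is neither the line-segment method of Beier and Neely nor the multilevel B-spline method of Lee et al. The function name `dense_displacement_from_anchors`, its parameters, and the (column, row) point layout are assumptions of this example.

```python
import numpy as np


def dense_displacement_from_anchors(src_pts, dst_pts, w, h, power=2.0, eps=1e-12):
    """Interpolate anchor displacements (dst - src) over a w-by-h pixel grid.

    src_pts, dst_pts: sequences of (column, row) anchor positions.
    Returns (Zx, Zy), each of shape (h, w), in pixel units.
    Each pixel receives a weighted mean of the anchor displacements, with
    weights proportional to 1 / distance**power.
    """
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    disp = dst - src                                    # (n, 2)
    cols, rows = np.meshgrid(np.arange(w, dtype=float),
                             np.arange(h, dtype=float))
    grid = np.stack([cols, rows], axis=-1)              # (h, w, 2)
    diff = grid[:, :, None, :] - src[None, None, :, :]  # (h, w, n, 2)
    dist2 = np.sum(diff ** 2, axis=-1)                  # squared distances
    weights = 1.0 / (dist2 ** (power / 2.0) + eps)      # eps avoids divide by zero
    weights /= weights.sum(axis=-1, keepdims=True)
    Zx = np.sum(weights * disp[:, 0], axis=-1)
    Zy = np.sum(weights * disp[:, 1], axis=-1)
    return Zx, Zy


if __name__ == "__main__":
    # Four corner anchors of a 5x4 grid; two of them are displaced vertically.
    src = [(0, 0), (4, 0), (0, 3), (4, 3)]
    dst = [(0, 1), (4, 0), (0, 3), (4, 2)]
    Zx, Zy = dense_displacement_from_anchors(src, dst, w=5, h=4)
    print(np.round(Zy, 3))
```

Inverse-distance weighting reproduces each anchor's displacement at the anchor itself and blends smoothly in between, but it does not preserve straight lines as well as the cited methods, so it is best read as a placeholder for them.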
As in the case of vertical misalignment correction, to generalize the correction of display misalignment using the displacement map Z(i, j), a working displacement map Zα(i, j) is introduced. The working displacement map Zα(i, j) is computed with a pre-determined factor α of a particular value (step 830) as
Zα(i,j)=αZ(i,j).
where 0≦α≦1. The generated working displacement map Zα(i, j) is then used to deform the source image (step 840) to obtain a warped source image 850. As an alternate embodiment, the working displacement maps 206 and 830 could be combined and the deformation operations 208 and 840 could be reduced to a single operation in order to improve the efficiency of the method. The introduction of the working displacement map Zα(i, j) facilitates the correction of display misalignment for both images (left and right) when the need arises. The process of correction of display misalignment for both images (left and right) is explained below.
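One way the combined embodiment might look is sketched below. It assumes that the display map has horizontal and vertical components (called `Zx_alpha` and `Zy_alpha` here), that all maps are in pixel units and in backward (pull) form, and that the maps are combined by simple addition. Addition is exact only for spatially constant maps and is a first-order approximation for slowly varying ones; an exact composition would resample one map through the other. All function and variable names are hypothetical.

```python
import numpy as np


def combine_maps(Y_alpha, Zx_alpha, Zy_alpha):
    """Add the source (vertical only) and display maps component-wise."""
    Sx = np.asarray(Zx_alpha, dtype=float)
    Sy = np.asarray(Y_alpha, dtype=float) + np.asarray(Zy_alpha, dtype=float)
    return Sx, Sy


def warp_bilinear(image, Sx, Sy):
    """Single-pass deformation: out[j, i] = image[j + Sy, i + Sx].

    Sample positions are clamped to the image bounds and evaluated with
    bilinear interpolation.
    """
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    cols, rows = np.meshgrid(np.arange(w, dtype=float),
                             np.arange(h, dtype=float))
    c = np.clip(cols + Sx, 0, w - 1)
    r = np.clip(rows + Sy, 0, h - 1)
    c0 = np.floor(c).astype(int)
    r0 = np.floor(r).astype(int)
    c1 = np.minimum(c0 + 1, w - 1)
    r1 = np.minimum(r0 + 1, h - 1)
    fc = c - c0                      # fractional column offset
    fr = r - r0                      # fractional row offset
    top = (1 - fc) * image[r0, c0] + fc * image[r0, c1]
    bottom = (1 - fc) * image[r1, c0] + fc * image[r1, c1]
    return (1 - fr) * top + fr * bottom


if __name__ == "__main__":
    h, w = 5, 6
    jj, ii = np.mgrid[0:h, 0:w]
    img = 10.0 * jj + ii             # linear ramp, easy to check by eye
    Sx, Sy = combine_maps(np.full((h, w), 0.25),   # source map
                          np.full((h, w), 1.0),    # display map, horizontal
                          np.full((h, w), 0.25))   # display map, vertical
    print(warp_bilinear(img, Sx, Sy))
```

Performing one interpolation pass instead of two avoids the extra blurring that repeated resampling introduces, in addition to saving computation.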
As shown in
By applying both the image source displacement map, discussed earlier, and the display displacement map, vertical misalignment in source images and both vertical and horizontal misalignment due to imperfections in the display system can be virtually eliminated. Although the two maps may each be applied separately, it is desirable that both be enabled and applied within a system. It is also possible to apply the display displacement map as described herein together with image source displacement maps that are created based on other means, such as those included in U.S. Pat. No. 6,191,809 and EP 1 235 439 A2, both of which are herein included by reference.
The invention has been described with reference to a preferred embodiment. However, it will be appreciated that variations and modifications can be effected by a person of ordinary skill in the art without departing from the scope of the invention.