The present disclosure generally relates to image processing and, more particularly, to image processing apparatuses, image processing methods, and storage mediums.
There is known a display apparatus that displays a three-dimensional (3D) image (video image) by using a parallax image composed of a left-eye image and a right-eye image. In such a display apparatus, the following problem is known. Specifically, when a viewer views a 3D image with his/her head tilted to the right or the left, parallax is produced in a direction different from the direction of a line segment connecting the eyes of the viewer. Such parallax may cause eye fatigue for the viewer or may make the fusion of the parallax image difficult.
The technique disclosed in Japanese Patent Laid-Open No. 2012-244453 is one of the techniques for reducing the parallax described above. According to Japanese Patent Laid-Open No. 2012-244453, the display positions of the right and left video images (display images) are each shifted in accordance with the tilt of the viewer's head, and the display images are displayed on a planar 3D display. According to Japanese Patent Laid-Open No. 2012-244453, the image display positions are shifted in accordance with the tilt of the viewer's head acquired with a posture sensor provided in 3D glasses.
However, when an image (video image) is displayed in a head mounted display, there is a case in which processing for largely deforming the image needs to be carried out when the image is displayed (for example, an image captured by a fisheye camera, a 360-degree spherical image, or the like is displayed). If an image that has been subjected to the processing including large deformation is simply shifted, a large portion of the peripheral region of the display image is trimmed, and thus the viewing angle obtained when the image is viewed with the head mounted display is reduced. In addition, if the display image is simply shifted when the distortion correction of the eyepiece lens in the head mounted display is employed along with the deformation processing, the distortion of the eyepiece lens cannot be corrected properly.
The present disclosure is related to providing image processing for generating a display image with appropriate parallax without reducing the viewing angle.
An image processing apparatus according to one or more aspects of the present disclosure is an image processing apparatus configured to generate a display image to be displayed in a display apparatus from a parallax image, and the image processing apparatus includes a first acquiring unit configured to acquire posture information of the display apparatus, a second acquiring unit configured to acquire a parallax amount in the parallax image, an adjusting unit configured to adjust postures of virtual cameras disposed virtually on viewpoints corresponding to the parallax image based on the posture information of the display apparatus and the parallax amount in the parallax image, and a generating unit configured to generate the display image from the parallax image based on the posture information of the display apparatus and adjusted postures of the virtual cameras.
Further features of the present disclosure will become apparent from the following description of exemplary embodiments with reference to the attached drawings.
Hereinafter, one or more aspects of the present disclosure will be described with reference to the drawings. It is to be noted that the following exemplary embodiments are not intended to limit the present disclosure and that not all the combinations of the features described in one or more aspects of the present disclosure are essential in the present disclosure. Identical configurations are given identical reference characters in the description.
Described in one or more aspects of the present disclosure is an image processing apparatus that displays an image with appropriate parallax without reducing the viewing angle even if the viewer tilts the head when a parallax image composed of a left-eye image and a right-eye image is displayed in a display apparatus for displaying a stereoscopic image, such as a head mounted display.
First, a configuration example of an image processing system that includes an image processing apparatus according to one or more aspects of the present disclosure will be described with reference to
As illustrated in
The ROM 103 and the HDD 105 store program(s) for the operation of the image processing apparatus 113.
The HDD I/F 104 connects the HDD 105 to the image processing apparatus 113. The HDD I/F 104 may be an interface of serial ATA (SATA) or the like, for example. The HDD 105 is an example of a secondary storage device to be connected to the image processing apparatus 113. In addition to an HDD (or in place of an HDD), a different secondary storage device, such as an optical disc drive, may be connected to the image processing apparatus 113. ATA stands for Advanced Technology Attachment.
The CPU 101, which may include one or more processors and one or more memories, may execute the program(s) stored in the ROM 103 and (or) the HDD 105 while using the RAM 102 as a work memory to control each configuration unit of the image processing apparatus 113 via the system bus 112. Thus, the various processes described later are executed.
The CPU 101 can read out data from the HDD 105 and can write data into the HDD 105 via the HDD I/F 104. The CPU 101 can expand the data stored in the HDD 105 onto the RAM 102. The CPU 101 can store the data expanded on the RAM 102 into the HDD 105. The CPU 101 can execute the programs expanded on the RAM 102.
The input I/F 106 may connect the input device 107, such as a keyboard, a mouse, a digital camera, or a scanner, to the image processing apparatus 113. The input I/F 106 may be, for example, a serial bus interface compliant with the standard such as USB or IEEE 1394. The CPU 101 can retrieve data from the input device 107 via the input I/F 106.
The output I/F 108 may connect the output device 109 to the image processing apparatus 113. The output device 109 may include an image display surface. The output device 109 may be a head mounted display in one or more aspects of the present disclosure. The head mounted display may include an eyepiece lens. The output I/F 108 may be, for example, a video out interface compliant with the standard such as DVI or HDMI (registered trademark). DVI stands for Digital Visual Interface. HDMI stands for High Definition Multimedia Interface. The CPU 101 can transmit data to the output device 109 via the output I/F 108 to cause the output device 109 to display the data.
The posture detecting I/F 110 may connect the posture detecting device 111, such as an acceleration sensor or an angular velocity sensor, to the image processing apparatus 113. The posture detecting I/F 110 may be, for example, a serial bus interface compliant with the standard such as USB or IEEE 1394. In one or more aspects of the present disclosure, the posture detecting device 111 may be mounted to the image display surface of the output device 109 or to the vicinity thereof. The CPU 101 can retrieve posture information of the image display surface of the output device 109 from the posture detecting device 111 via the posture detecting I/F 110. The posture information of the image display surface of the output device 109 can also be input to the CPU 101 with a mouse, a keyboard, a camera, or the like.
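As a purely illustrative sketch of how a roll angle could be derived from an acceleration sensor such as the posture detecting device 111, the gravity components measured in the display plane can be converted to an angle. The function name and the axis convention (the y-axis pointing up along the display surface when the head is upright) are assumptions of this sketch and are not taken from the disclosure; a practical device would typically also fuse in angular velocity measurements.

```python
import numpy as np

def roll_from_accel(ax, ay):
    """Estimate the roll angle alpha [radian] of the display from the
    gravity components (ax, ay) measured by an acceleration sensor in
    the display plane. The axis convention is an assumption of this
    sketch: ay points up along the display when the head is upright,
    so an upright head yields alpha = 0."""
    return np.arctan2(ax, ay)
```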
Hereinafter, the flow of the processing carried out by the image processing apparatus 113 according to one or more aspects of the present disclosure will be described with reference to
The units described throughout the present disclosure are exemplary and/or preferable modules for implementing processes described in the present disclosure. The modules can be hardware units (such as circuitry, a field programmable gate array, a digital signal processor, an application specific integrated circuit or the like) and/or software modules (such as a computer readable program or the like). The modules for implementing the various steps are not described exhaustively above. However, where there is a step of performing a certain process, there may be a corresponding functional module or unit (implemented by hardware and/or software) for implementing the same process. Technical solutions by all combinations of steps described and units corresponding to these steps are included in the present disclosure.
The CPU 101 illustrated in
First, in S1, the posture information acquiring unit 201 (
In S2, the image data acquiring unit 202 acquires image data to be displayed next from the HDD 105 or the input device 107. The image data acquiring unit 202 outputs the acquired image data to the parallax amount acquiring unit 203. Of the acquired image data, the image data to be displayed for the left eye is designated by IL, and the image data to be displayed for the right eye is designated by IR (
Examples of 360-degree spherical images are illustrated in
In S3 and thereafter, display images (display parallax image) as illustrated in
In S3, the parallax amount acquiring unit 203 acquires a parallax amount between the image data IL and the image data IR in a region of interest on the basis of the posture information and the image data (or on the basis of the image data). The parallax amount acquiring unit 203 outputs the acquired parallax amount to the virtual camera posture adjusting unit 204.
In a head mounted display that displays a video image while following the movement of the head, the viewer looks around while moving the head, and thus the view is likely to focus on the center region in the field of view. Therefore, in one or more aspects of the present disclosure, the center regions in the field of view of the image data IL and the image data IR are set as the regions of interest, and the parallax amount thereof is acquired. The parallax amount may be acquired from a parallax map created in advance from the image data IL and the image data IR or may be acquired through a calculation.
When the parallax amount is acquired through a calculation, the coordinates indicated by the azimuthal angle θd and the elevation angle φd representing the posture of the head mounted display are set as the center of the field of view in the image data. A region of N×M pixels around the center of the field of view in the left-eye image data IL is set as AL. Then, the pixels around the center of the field of view in the right-eye image data IR are scanned in the horizontal direction to carry out matching processing, and a region AR with the highest similarity is obtained. This matching processing is, for example, template matching. Any evaluation index that can evaluate the similarity between regions, such as SSD (Sum of Squared Differences) or SAD (Sum of Absolute Differences), suffices. The deviation amount in the horizontal direction between the center of the obtained region AR and the center of the region AL is designated as the parallax amount d in the target of interest (region of interest). Although the target of interest (region of interest) has been described as the center of the field of view in the image data, the parallax may instead be obtained with a region at the smallest distance in the image data, a region with the highest prominence, or a manually specified region serving as the region of interest.
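The block-matching calculation described above can be sketched in Python/NumPy as follows. The function name, the region size, and the search range are illustrative choices, not values taken from the disclosure; SAD is used as the similarity index, although SSD would serve equally well.

```python
import numpy as np

def estimate_parallax(IL, IR, cx, cy, n=32, m=32, search=64):
    """Estimate the horizontal parallax amount d [pixel] at the center
    of the field of view (cx, cy) by template matching an N x M region
    of the left-eye image IL against horizontally scanned regions of
    the right-eye image IR, using SAD as the similarity index."""
    h, w = IL.shape[:2]
    y0, y1 = cy - m // 2, cy + m // 2
    x0, x1 = cx - n // 2, cx + n // 2
    AL = IL[y0:y1, x0:x1].astype(np.float64)  # region of interest AL
    best_d, best_cost = 0, np.inf
    # scan IR in the horizontal direction around the same position
    for d in range(-search, search + 1):
        xs0, xs1 = x0 + d, x1 + d
        if xs0 < 0 or xs1 > w:
            continue  # candidate region falls outside the image
        AR = IR[y0:y1, xs0:xs1].astype(np.float64)
        cost = np.abs(AL - AR).sum()  # SAD between AL and the candidate
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d
```

A parallax map prepared in advance could of course replace this per-frame search, as the text notes.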
In S4, the virtual camera posture adjusting unit 204 adjusts the postures of the right and left virtual cameras used to generate the right and left display images on the basis of the parallax amount acquired from the parallax amount acquiring unit 203 and the posture information. The virtual camera posture adjusting unit 204 outputs the information on the adjusted postures of the virtual cameras (posture information) to the display image generating unit 206. Herein, of the posture information, the amount of rotation (roll angle) α about an axis extending positively in the front direction of the head mounted display is used to adjust the postures of the virtual cameras.
In a conventional technique, display images such as those illustrated in
Accordingly, in one or more aspects of the present disclosure, the postures of the right and left virtual cameras are adjusted such that appropriate parallax is produced in the horizontal direction relative to the eyes of the viewer and the parallax in the vertical direction is eliminated in the region of interest, and thus an image with an appropriate parallax amount is presented without reducing the viewing angle.
For example, when the parallax is present in the vertical direction (y-axis direction) relative to the display, the virtual cameras are rotated by ΔΦ [radian] about the respective axes extending in the right and left direction of the virtual cameras to adjust the postures. When the parallax in the horizontal direction (x-axis direction) of the display is adjusted, the virtual cameras are rotated by Δθ [radian] about the respective axes extending in the up and down direction of the virtual cameras to adjust the parallax. Thus, an image in which the distortion is corrected properly and appropriate parallax is present can be displayed without narrowing the viewing angle (without reducing the viewing angle). Hereinafter, a method for calculating the amount of rotation of the virtual cameras will be described.
When the head is tilted as illustrated in
The amount of rotation (the amount of adjustment) ΔθL and ΔΦL of the left virtual camera and the amount of rotation ΔθR and ΔΦR of the right virtual camera can be expressed through the following expressions.
The amount of rotation of the left virtual camera:
ΔθL = Sign(cos α) × D/2 × (1 − |cos α|)
ΔΦL = −D sin α/2
The amount of rotation of the right virtual camera:
ΔθR = −Sign(cos α) × D/2 × (1 − |cos α|)
ΔΦR = D sin α/2
Here, the parallax amount D is obtained by converting the parallax amount d [pixel] on the 360-degree spherical image to radians, and D=d×2π/w holds, where w is the width [pixel] of the 360-degree spherical image. In addition, Sign( ) is the signum function, which returns 1 when the variable is greater than 0, 0 when the variable is 0, and −1 when the variable is smaller than 0. The amounts of adjustment of the virtual cameras obtained here are converted to rotation matrices ML and MR. The posture RL of the left virtual camera and the posture RR of the right virtual camera are the rotation matrices obtained by applying ML and MR, respectively, to the posture R of the display. An example of the orientation of the virtual cameras of which the postures have been adjusted is illustrated in
As illustrated in
Although the right and left virtual cameras are rotated by the same amount in the foregoing description, one or more aspects of the present disclosure are not limited to such an adjustment. It suffices that the total amount of adjustment (the amount of shift) between the right and left virtual cameras be the same, and thus the amount of shift need not be equal between the right and left virtual cameras. For example, only the right virtual camera may be adjusted, or only the left virtual camera may be adjusted. In addition, the amount of adjustment of the postures may be scaled by multiplying it by a constant from 0 to 1.
When the parallax amount is small, the parallax mismatch arising when the roll occurs is also small, and thus the postures of the virtual cameras need not be adjusted. In that case, a threshold value dth of the parallax amount is determined in advance. Then, the postures of the virtual cameras are adjusted when the parallax amount is greater than the threshold value dth, and the postures of the virtual cameras are not adjusted when the parallax amount is smaller than the threshold value dth.
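The whole of S4 — converting d to D, applying the threshold dth, computing the rotation amounts, and composing the resulting rotation matrices ML and MR with the display posture R — can be sketched as follows. The decomposition of ML and MR into a yaw rotation followed by a pitch rotation, and the composition order R·M, are assumptions of this sketch; the disclosure states only that ML and MR are applied to R.

```python
import numpy as np

def rot_x(a):
    """Rotation by a [radian] about the left-right (pitch) axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_y(a):
    """Rotation by a [radian] about the up-down (yaw) axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def adjust_virtual_cameras(d, w, alpha, R, d_th=2):
    """Return the adjusted postures (RL, RR) of the left and right
    virtual cameras from the parallax amount d [pixel], the width w
    [pixel] of the 360-degree spherical image, the roll angle alpha
    [radian], and the display posture R (3x3 rotation matrix)."""
    if abs(d) <= d_th:            # small parallax: no adjustment needed
        return R.copy(), R.copy()
    D = d * 2.0 * np.pi / w       # parallax amount in radians
    s = np.sign(np.cos(alpha))    # the Sign() of the expressions above
    dtheta_L = s * D / 2.0 * (1.0 - abs(np.cos(alpha)))
    dphi_L = -D * np.sin(alpha) / 2.0
    # the right camera is rotated by the opposite amounts
    ML = rot_y(dtheta_L) @ rot_x(dphi_L)    # composition order assumed
    MR = rot_y(-dtheta_L) @ rot_x(-dphi_L)
    return R @ ML, R @ MR
```

When the head is not tilted (α = 0), both adjustment amounts vanish and the cameras keep the display posture, as expected from the expressions.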
In S5, the display parameter acquiring unit 205 acquires, from the input device 107, an image display parameter, which is a parameter for displaying the parallax image in the output device 109. The image display parameter includes, for example, the focal length of the eyepiece lens of the head mounted display, the distortion parameter, the center position of the eyepiece lens on the display, the display resolution, and the display size. The display parameter acquiring unit 205 outputs the acquired image display parameter to the display image generating unit 206. Instead of acquiring the image display parameter from the input device 107, the display parameter acquiring unit 205 may acquire the image display parameter stored in the RAM 102 or the HDD 105 from the RAM 102 or the HDD 105.
In S6, the display image generating unit 206 generates the display images to be displayed in the output device 109 from the image display parameter, the image data, and the posture information of the right and left virtual cameras. As illustrated in
In S7, the image data output unit 207 outputs the display images JL and JR generated in S6 to the output device 109 in the form of image data and causes the display images JL and JR to be displayed on the display.
The processing proceeds to S8 from S7. In S8, it is determined whether the processing is to be terminated. If the determination in S8 is Yes, the image processing apparatus 113 terminates the processing. If the determination in S8 is No, the processing returns to S1.
The display images to be displayed in the head mounted display are generated by sampling the pixel values from the parallax image in accordance with the direction in which the head of the viewer (the wearer of the head mounted display) is tilted. In order to sample the pixel values from the parallax image, a light ray of which orientation is to be displayed at each pixel position on the display is calculated on the basis of the posture information of the right and left virtual cameras. The orientation of the light ray can be expressed by the azimuthal angle θ indicating the angle about the axis extending in the up and down direction and the elevation angle φ indicating the angle about the axis extending in the right and left direction. In the following description, the position of a pixel in the coordinate system with its origin lying on the center of the eyepiece lens is designated as (x,y), the orientation (θ,φ) of a light ray is calculated, and the pixel value in the display images is obtained from the calculated orientation.
The details of the display image generating processing will be described with reference to the flowchart illustrated in
In S11, the display image generating unit 206 (
A Lookup table may be created in advance so that the distortion rate can be referenced for each pixel position, and the calculation of the distance r from the center of the eyepiece lens may be omitted. In addition, when the distance r from the center of the eyepiece lens to a given pixel is large and the pixel is outside the area that can be viewed with the eyepiece lens, the pixel value in the display image may be set to black, and the processing on that pixel may be terminated.
In S12, the display image generating unit 206 calculates the pixel position from which it appears as if the light ray from the pixel position (x,y) is emitted due to the refraction caused by the distortion of the eyepiece lens. The calculated pixel position (xc,yc) is referred to as a “distorted position” in the following description. The distorted position (xc,yc) can be calculated by multiplying x and y in the pixel position (x,y) by the distortion rate c, as indicated in the expression (1).
xc = x × c
yc = y × c (1)
The distorted position (xc,yc) is illustrated in
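The calculation of S11 and S12 can be sketched as follows. The disclosure does not specify the distortion model of the eyepiece lens, so a polynomial radial model c = 1 + k1·r² + k2·r⁴ with illustrative coefficients is assumed here; in practice the distortion rate would come from the image display parameters or from a lookup table indexed by pixel position, as the text notes.

```python
def distorted_position(x, y, k1=0.22, k2=0.24):
    """Compute the distorted position (xc, yc) of expression (1) for a
    pixel position (x, y) in the coordinate system centered on the
    eyepiece lens. The radial polynomial model and the coefficients
    k1, k2 are assumptions of this sketch, not disclosed values."""
    r2 = x * x + y * y                  # squared distance r^2 from lens center
    c = 1.0 + k1 * r2 + k2 * r2 * r2    # distortion rate c at (x, y)
    return x * c, y * c                 # expression (1): xc = x*c, yc = y*c
```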
In S13, the display image generating unit 206 converts the distorted position (xc,yc) to a three-dimensional position (X,Y,Z) on the coordinate system with its origin lying at the position of the eye and thus calculates the three-dimensional position (X,Y,Z). An example of the relationship between a point on the image coordinates and a three-dimensional position is illustrated in
[X Y Z]^t = R × [xc yc f]^t (2)
The posture of the virtual camera differs depending on whether the virtual camera is for the left-eye display image or for the right-eye display image. Thus, R in the expression (2) is replaced with RL when the pixel value in the left-eye display image is obtained, and the R in the expression (2) is replaced with RR when the pixel value in the right-eye display image is obtained.
In S14, the display image generating unit 206 calculates the azimuthal angle θ and the elevation angle φ from the three-dimensional position (X,Y,Z) of the pixel through the expression (3), in which L denotes the distance from the origin to the point (X,Y,Z), that is, L=√(X²+Y²+Z²).
φ = asin(Y/L)
θ = asin(Z/(L cos φ)) (3)
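Steps S13 and S14 can be sketched together as follows, with L taken as the length of the vector (X, Y, Z) and f as the focal length entering expression (2); the function name is illustrative.

```python
import numpy as np

def pixel_to_ray(xc, yc, f, R):
    """Convert a distorted pixel position (xc, yc) to the orientation
    (theta, phi) of the light ray via expressions (2) and (3).
    R is the 3x3 posture matrix of the virtual camera; RL is passed
    for the left-eye display image and RR for the right-eye one."""
    X, Y, Z = R @ np.array([xc, yc, f])       # expression (2)
    L = np.sqrt(X * X + Y * Y + Z * Z)        # distance to the point
    phi = np.arcsin(Y / L)                    # elevation angle
    theta = np.arcsin(Z / (L * np.cos(phi)))  # azimuthal angle, expr. (3)
    return theta, phi
```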
In S15, the display image generating unit 206 acquires (samples) the pixel value of the light ray in the direction (θ,φ) from the corresponding image and stores the acquired pixel value into the corresponding pixel position in the display image. In other words, in S15, the display image generating unit 206 samples the pixel value of the light ray in the direction (θ,φ) from the corresponding image and stores the sampled pixel value into the drawing array. The pixel value is sampled from the left-eye image when the left-eye display image is generated, and the pixel value is sampled from the right-eye image when the right-eye display image is generated. The position at which the sampled pixel value is stored is the position moved from the origin by (x,y).
When the pixel value is sampled from the image, the nearest pixel value may be acquired, or the sampling may be carried out through interpolation, such as bilinear or bicubic interpolation, by using surrounding pixel values. In addition, when the roll angle α satisfies |α|>90°, the positional relationship of the right and left eyes is inverted from that when |α|≤90°. Thus, the right and left display images may be switched. In other words, the pixel value for the left-eye display image may be sampled from the right-eye image, and the pixel value for the right-eye display image may be sampled from the left-eye image.
When the pixel value is sampled from an image other than a 360-degree spherical image (for example, an image captured by a fisheye camera or the like), a lookup table that associates the orientations (the azimuthal angles and the elevation angles) of the light rays with the corresponding pixel positions is prepared in advance. The display image generating unit 206 acquires the pixel position (a,b) corresponding to the orientation (θ,φ) of a light ray from the stated lookup table, samples the pixel value at the pixel position (a,b) of the image data, and stores the sampled pixel value at the coordinates (x,y) of the display image, as in the case of the 360-degree spherical image.
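The sampling of S15 from a 360-degree spherical image, with bilinear interpolation as one of the interpolation options mentioned above, can be sketched as follows. The equirectangular mapping from (θ, φ) to pixel coordinates is an assumption of this sketch, since the disclosure does not fix the pixel layout of the spherical image.

```python
import numpy as np

def sample_spherical(image, theta, phi):
    """Sample the pixel value for the ray direction (theta, phi) from a
    360-degree spherical (equirectangular) image by bilinear
    interpolation; theta in [-pi, pi), phi in [-pi/2, pi/2]."""
    h, w = image.shape[:2]
    # continuous pixel coordinates on the assumed equirectangular grid
    u = (theta + np.pi) / (2.0 * np.pi) * w
    v = (phi + np.pi / 2.0) / np.pi * (h - 1)
    x0, y0 = int(np.floor(u)), int(np.floor(v))
    fx, fy = u - x0, v - y0
    x1 = (x0 + 1) % w                 # the image wraps around horizontally
    x0 = x0 % w
    y0 = min(max(y0, 0), h - 1)       # clamp vertically at the poles
    y1 = min(y0 + 1, h - 1)
    # bilinear interpolation from the four surrounding pixels
    return (image[y0, x0] * (1 - fx) * (1 - fy)
            + image[y0, x1] * fx * (1 - fy)
            + image[y1, x0] * (1 - fx) * fy
            + image[y1, x1] * fx * fy)
```

Nearest-neighbor or bicubic sampling could be substituted here without changing the surrounding flow.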
After S15, in S16, it is determined whether all the pixels in the right and left display images have been processed. If not all the pixels have been processed, the processing returns to S11. If all the pixels have been processed, the processing returns to the flowchart illustrated in
In Japanese Patent Laid-Open No. 2012-244453, the display position of the display image is shifted after the display image is generated, which produces regions at the edges of the display (screen), corresponding to the shift amount, in which the display image cannot be displayed, and thus reduces the viewing angle. In addition, shifting the display image produces a mismatch between the center of the eyepiece lens and the center position of the distortion correction, and thus the distortion cannot be corrected properly. In contrast, according to one or more aspects of the present disclosure, the postures of the virtual cameras are adjusted before the display images are generated, and the parallax mismatch is thus suppressed without reducing the viewing angle. In addition, no mismatch is produced between the center of the eyepiece lens and the center position of the distortion correction, and thus the distortion can be corrected properly.
Accordingly, with the image processing apparatus 113 according to one or more aspects of the present disclosure, the parallax mismatch produced when the viewer tilts the head while the parallax image is displayed in the head mounted display is reduced, and an image with appropriate parallax in the region of interest can be displayed.
In the image processing apparatus 113 according to one or more aspects of the present disclosure, the postures of the virtual cameras are adjusted. Thus, the image processing apparatus 113 can be applied to the generation of display images from a 360-degree spherical image or a fisheye image, and the distortion can be corrected properly.
At least some of the functional blocks illustrated in
In the foregoing description, the CPU 101 executes the programs stored in the ROM 103 or the like to thus function as the functional blocks illustrated in
Although the image processing apparatus 113 does not include the HDD 105, the input device 107, the output device 109, and the posture detecting device 111 in the foregoing description, the configuration of the image processing apparatus according to one or more aspects of the present disclosure is not limited to the above. For example, the image processing apparatus 113 may include at least one of the HDD 105, the input device 107, the output device 109, and the posture detecting device 111 as a constituent element of the image processing apparatus 113.
Although the HDD I/F 104 is included in the image processing apparatus 113 in the foregoing description, the image processing apparatus 113 need not include the HDD I/F 104 if the HDD 105 need not be used.
Although the terms “roll,” “pitch,” and “yaw” are used to express the posture in the foregoing description, another term, such as quaternion, may be used.
Although a 360-degree spherical image or an image captured by a fisheye camera has been illustrated as an example in the foregoing description, the present invention can be applied as long as a parallax image having an angle of view of no smaller than a predetermined value is viewed. Although a head mounted display has been illustrated as an example in the foregoing description, the present invention can also be applied to a display apparatus other than a head mounted display.
Although the posture detecting device 111 is mounted to the image display surface of the output device 109 or to the vicinity thereof in the foregoing description, the posture detecting device 111 may be mounted to a position other than the position described above as long as the posture detecting device 111 can detect the tilt of the display.
The present disclosure can also be implemented through processing in which a program that implements one or more functions of one or more aspects of the present disclosure described above is supplied to a system or an apparatus via a network or a storage medium and one or more processors in a computer of the system or the apparatus reads out and executes the program. In addition, the present disclosure can also be implemented by a circuit (for example, ASIC) that implements one or more functions.
Embodiment(s) of the present disclosure can also be realized by a computerized configuration(s) of a system or apparatus that reads out and executes computer executable instructions (e.g., one or more programs) recorded on a storage medium (which may also be referred to more fully as a ‘non-transitory computer-readable storage medium’) to perform the functions of one or more of the above-described embodiment(s) and/or that includes one or more circuits (e.g., application specific integrated circuit (ASIC)) for performing the functions of one or more of the above-described embodiment(s), and by a method performed by the computerized configuration(s) of the system or apparatus by, for example, reading out and executing the computer executable instructions from the storage medium to perform the functions of one or more of the above-described embodiment(s) and/or controlling the one or more circuits to perform the functions of one or more of the above-described embodiment(s). The computerized configuration(s) may comprise one or more processors, one or more memories (e.g., central processing unit (CPU), micro processing unit (MPU)), and may include a network of separate computers or separate processors to read out and execute the computer executable instructions. The computer executable instructions may be provided to the computer, for example, from a network or the storage medium. The storage medium may include, for example, one or more of a hard disk, a random-access memory (RAM), a read only memory (ROM), a storage of distributed computing systems, an optical disk (such as a compact disc (CD), digital versatile disc (DVD), or Blu-ray Disc (BD)™), a flash memory device, a memory card, and the like.
While the present disclosure has been described with reference to exemplary embodiments, it is to be understood that the disclosure is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.
This application claims the benefit of priority from Japanese Patent Application No. 2016-256606 filed Dec. 28, 2016, which is hereby incorporated by reference herein in its entirety.