This application is a national stage application of International Application No. PCT/JP2012/082116, filed Dec. 5, 2012, which claims the benefit of Japanese Patent Application No. 2012-005661, filed Jan. 13, 2012; the entire disclosures of both applications are incorporated herein by reference.
The present invention relates to an image generation method, and an image generation apparatus.
An image capture apparatus has been proposed in which the exit pupil of the imaging lens is divided into multiple pupil areas, and multiple parallax images corresponding to the divided pupil areas can be captured at the same time.
U.S. Pat. No. 4,410,804 discloses an image capture apparatus that uses a two-dimensional image sensor in which one microlens and multiple divided photo-electric converters are formed with respect to one pixel. The divided photo-electric converters are configured so as to receive light from different pupil sub-areas of the exit pupil of the imaging lens via one microlens, and thus pupil division is performed. Multiple parallax images that correspond to the divided pupil sub-areas can be generated from the signals obtained due to light reception in the divided photo-electric converters. Japanese Patent Laid-Open No. 2001-083407 discloses the generation of a captured image by adding together all of the signals obtained due to light reception in the divided photo-electric converters.
The multiple parallax images that are captured are equivalent to light field (LF) data, which is information on the spatial distribution and the angular distribution of light intensity. Stanford Tech Report CTSR 2005-02, 1 (2005) discloses refocusing technology in which the focal position of a captured image is modified after capture by compositing an image at a virtual image forming plane that is different from the image sensing plane, using LF data that has been acquired.
However, although the above-described conventional examples can acquire multiple parallax images at the same time by dividing the exit pupil of the imaging lens into multiple areas, the captured image that is generated from the parallax images suffers from a reduction in spatial resolution.
The present invention was achieved in view of the above-described problems, and generates a captured image that has a high spatial resolution from multiple parallax images.
An image generation method according to a first aspect of the present invention is an image generation method for generating an output image from an input image acquired by an image sensor that has an array of a plurality of pixels, each of which has arranged therein a plurality of sub-pixels that each receive a light beam that passes through a different pupil sub-area of an imaging optical system, the method including: a step of generating a plurality of parallax images that respectively correspond to the different pupil sub-areas based on the input image; a step of generating a plurality of pixel shifted images by performing different non-integral shifting for each of the plurality of parallax images according to a virtual image forming plane of the imaging optical system that is different from an image sensing plane at which the image sensor is arranged; and a step of generating an output image that has a higher resolution than each of the resolutions of the plurality of parallax images from the plurality of pixel shifted images through composition processing.
Also, according to a second aspect of the present invention, a program causes a computer to execute the steps of the above-described image generation method.
Also, according to a third aspect of the present invention, a computer-readable storage medium stores a program for causing a computer to execute the steps of the above-described image generation method.
Also, according to a fourth aspect of the present invention, an image generation apparatus comprises an image sensor configured to acquire an input image, wherein the image sensor has an array of a plurality of pixels, each of which has arranged therein a plurality of sub-pixels that each receive a light beam that passes through a different pupil sub-area of an imaging optical system; a first generation means configured to generate a plurality of parallax images that respectively correspond to the different pupil sub-areas based on the input image; a second generation means configured to generate a plurality of pixel shifted images by performing different shifting for each of the plurality of parallax images according to a virtual image forming plane of the imaging optical system that is different from an image sensing plane at which the image sensor is arranged; and a composition means configured to generate an output image that has a higher resolution than the resolution of the parallax images from the plurality of pixel shifted images through composition processing.
Further features of the present invention will become apparent from the following description of exemplary embodiments with reference to the attached drawings.
Embodiments of the present invention will be described below in detail with reference to the accompanying drawings.
Reference numeral 105 denotes a third lens group that adjusts the focal point by moving forward and backward in the optical axis direction. Reference numeral 106 denotes an optical low-pass filter, which is an optical element for reducing false coloring and moiré that appear in captured images. Reference numeral 107 denotes an image sensor configured by a two-dimensional CMOS photosensor and peripheral circuitry, and this image sensor is arranged at the image forming plane of the imaging optical system.
Reference numeral 111 denotes a zoom actuator that performs a magnification operation by driving elements from the first lens group 101 to the third lens group 105 in the optical axis direction by rotating a cam barrel (not shown). Reference numeral 112 denotes an aperture/shutter actuator that adjusts the amount of captured light by controlling the opening diameter of the aperture/shutter 102, as well as controls the light exposure time in still image capturing. Reference numeral 114 denotes a focus actuator that adjusts the focal point by driving the third lens group 105 forward and backward in the optical axis direction.
Reference numeral 115 denotes an electronic flash for subject illumination in image capturing, and is preferably a flash illumination apparatus that uses a xenon tube, but may be an illumination apparatus that includes a continuous-emission LED. Reference numeral 116 denotes an AF auxiliary light apparatus that projects a mask image having a predetermined pattern of openings into the subject field via a projection lens so as to improve focus detection capability with respect to dark subjects and low-contrast subjects.
Reference numeral 121 denotes a CPU in the camera that performs various types of control with respect to the camera body; it has an arithmetic portion, a ROM, a RAM, an A/D converter, a D/A converter, a communication interface circuit, and the like, and drives various circuits in the camera based on a predetermined program stored in the ROM. This CPU also executes a series of operations such as AF, image capturing, image generation, and recording. The CPU 121 serves as the image generation means, the parallax image generation means, the pixel shifted image generation means, and the super-resolution processing means of the present invention.
Reference numeral 122 denotes an electronic flash control circuit that performs control for lighting the electronic flash 115 in synchronization with an image capturing operation. Reference numeral 123 denotes an auxiliary light driver circuit that performs control for lighting the AF auxiliary light apparatus 116 in synchronization with a focus detection operation. Reference numeral 124 denotes an image sensor driver circuit that controls image capturing operations of the image sensor 107, as well as subjects an acquired image signal to A/D conversion and transmits the converted image signal to the CPU 121. Reference numeral 125 denotes an image processing circuit that performs processing such as γ conversion, color interpolation, and JPEG compression on an image that was acquired by the image sensor 107.
Reference numeral 126 denotes a focus driver circuit that adjusts the focal point by performing control for driving the focus actuator 114 based on a focus detection result so as to move the third lens group 105 forward and backward in the optical axis direction. Reference numeral 128 denotes an aperture/shutter driver circuit that controls the opening diameter of the aperture/shutter 102 by performing control for driving the aperture/shutter actuator 112. Reference numeral 129 denotes a zoom driver circuit that drives the zoom actuator 111 in accordance with a zoom operation that was performed by a photographer.
Reference numeral 131 denotes a display apparatus such as an LCD that displays information regarding the camera shooting mode, a preview image before image capturing, an image for checking after image capturing, an image indicating the focus state in focus detection, and the like. Reference numeral 132 denotes an operation switch group that is configured by a power switch, a release (shooting trigger) switch, a zoom operation switch, a shooting mode selection switch, and the like. Reference numeral 133 denotes a removable flash memory that records captured images.
In the first embodiment, in a 2×2 pixel group 200 shown in
As shown in
The photo-electric converters 301 to 316 may be pin-structure photodiodes in which an intrinsic layer is sandwiched between a p layer and an n layer, or, as necessary, may be pn-junction photodiodes in which the intrinsic layer is omitted.
In each pixel, a color filter 306 is formed between the microlens 305 and the photo-electric converters 301 to 316. Also, for each sub-pixel, the spectral transmittance of the color filter may be changed, or the color filter may be omitted, as necessary.
Light that enters the pixel 200G shown in
In each photo-electric converter, pairs of an electron and a hole are generated according to the amount of received light and separated by a depletion layer, and then negatively charged electrons are accumulated in the n layer (not shown), whereas the holes are discharged outside the image sensor via the p layer, which is connected to a constant voltage source (not shown).
The following describes a pupil division means of the first embodiment.
The image sensor is arranged in the vicinity of the image forming plane of the imaging lens (imaging optical system), and light beams from a subject pass through an exit pupil 400 of the imaging optical system and enter respective pixels. The plane at which the image sensor is arranged is the image sensing plane. Due to the microlens, pupil sub-areas 501 to 516 are in an approximately conjugate relationship with the light receiving faces of the photo-electric converters 301 to 316 (sub-pixels 201 to 216) that are divided into Nθ×Nθ areas (4×4 areas), and these pupil sub-areas represent pupil sub-areas from which light can be received by the corresponding photo-electric converters (sub-pixels). The exit pupil 400 of the imaging optical system is divided into Np (Np=Nθ×Nθ) different pupil sub-areas, where Np is the pupil division count. Letting F be the aperture value of the imaging optical system, the effective aperture value of the pupil sub-areas is approximately NθF. Also, a pupil area 500 is the pupil area from which the entire pixel 200G can receive light when all of the photo-electric converters 301 to 316 (sub-pixels 201 to 216) that are divided into Nθ×Nθ areas (4×4 areas) are combined.
The following describes parallax image generation in the present embodiment.
A parallax image that corresponds to a specified pupil sub-area among the pupil sub-areas 501 to 516 of the imaging optical system can be obtained by, for each pixel, selecting a signal from a specified sub-pixel among the sub-pixels 201 to 216 (photo-electric converters 301 to 316). For example, a parallax image that corresponds to the pupil sub-area 509 of the imaging optical system can be obtained by selecting the signal from the sub-pixel 209 (photo-electric converter 309) for each pixel. The same follows for the other sub-pixels as well. Based on the input image acquired by the image sensor of the present embodiment, multiple (pupil division count Np) parallax images that respectively correspond to the different pupil sub-areas and have a resolution equal to the effective pixel count can be generated.
Also, a captured image with a resolution equal to the effective pixel count can be generated by adding together all of the signals from the sub-pixels 201 to 216 for each pixel.
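These two operations can be illustrated with a short Python sketch. The Nθ = 4 division, the array sizes, and the random sub-pixel data are all hypothetical stand-ins for real sensor signals; the indexing convention `lf[i, j, a, b]` is an assumption made for the example:

```python
import numpy as np

# Hypothetical light-field input: H x W pixels, each containing
# Ntheta x Ntheta sub-pixels (random data stands in for sensor signals).
Ntheta = 4
H, W = 6, 8
rng = np.random.default_rng(0)
lf = rng.random((H, W, Ntheta, Ntheta))      # lf[i, j, a, b] = sub-pixel signal

# Parallax image for pupil sub-area (a, b): select that sub-pixel in every pixel.
def parallax_image(lf, a, b):
    return lf[:, :, a, b]                     # shape (H, W) = effective pixel count

# Captured image: add together all sub-pixel signals of each pixel.
captured = lf.sum(axis=(2, 3))

views = [parallax_image(lf, a, b)
         for a in range(Ntheta) for b in range(Ntheta)]
assert len(views) == Ntheta * Ntheta          # Np parallax images
assert np.allclose(sum(views), captured)      # summing all views gives the capture
```

Each parallax image has the same resolution as the pixel array, and adding all Np views back together reproduces the ordinary captured image.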
The following describes the refocusable range.
After image capturing, it is possible to generate (through refocus processing), based on the LF data (multiple parallax images), an image at a virtual image forming plane that is different from the image sensing plane at which the image sensor is arranged and at which the sub-pixel signals Li,a were acquired. A refocused image can be generated at a virtual image forming plane by translating all of the sub-pixel signals Li,a along the respective angles θa from the image sensing plane to the virtual image forming plane, distributing the signals to virtual pixels in the virtual image forming plane, and then performing weighted addition. The coefficients used in the weighted addition are all positive and sum to 1.
There is a limit to a distance (maximum refocus amount) dmax from the image sensing plane to the virtual image forming plane at which refocusing is possible while maintaining a resolution equal to the effective pixel count NLF, and this maximum refocus amount dmax is approximately determined by Expression (1).
As shown in
In the present embodiment, the exit pupil of the imaging optical system having the aperture value F decreases in area upon being divided into Nθ×Nθ pupil areas, and the effective aperture value of the pupil sub-areas increases to NθF. Because the focal depth increases accordingly, parallax images having a wider in-focus range can be obtained, and an image at a virtual image forming plane can be composited from these parallax images. The third term in Expression (1) shows that refocusing can be performed in the range in which the effective aperture value NθF of the pupil sub-areas increases and the focal depth increases. Although refocus processing from the image sensing plane in the rearward focus direction has been described, the same follows for refocus processing in the forward focus direction.
The following describes an image processing method for generating an output image from an input image of the present embodiment with reference to the flowchart of
In step S100, an input image is acquired by the image sensor that has an array of multiple pixels, each of which has arranged therein multiple sub-pixels (the sub-pixels 201 to 216) that each receive a light beam that passes through a different pupil sub-area of the imaging optical system. It is also possible to use an input image that was captured by the image sensor having the above configuration in advance and stored in a recording medium.
In step S200, a parallax image that corresponds to a specified pupil sub-area among the pupil sub-areas 501 to 516 of the imaging optical system is generated by, for each pixel, selecting a signal from a specified sub-pixel among the sub-pixels 201 to 216 from the input image. Based on the input image, multiple parallax images that respectively correspond to the different pupil sub-areas and have a resolution equal to the effective pixel count are generated.
In step S300, multiple pixel shifted images are generated by, for each of the parallax images generated in step S200, performing different non-integral shifting according to a virtual image forming plane of the imaging optical system that is different from the image sensing plane at which the image sensor is arranged.
Since there is no pixel shift in the parallax images at the image sensing plane, pixel shift super-resolution processing cannot be performed while in this state. In view of this, in the present embodiment, multiple pixel shifted images are generated by performing translation along the angle θa for each of the parallax images to a virtual image forming plane that is different from the image sensing plane. At this time, in order to be able to perform pixel shift super-resolution processing using multiple pixel shifted images, a distance d between the image sensing plane and the virtual image forming plane is set such that the amount of shift in the horizontal direction is a non-integer. Also, in order to prevent a reduction in resolution, it is desirable that the distance d between the image sensing plane and the virtual image forming plane is greater than 0 and less than or equal to the maximum refocus amount dmax=NpFΔx.
In the present embodiment, the distance d between the image sensing plane and the virtual image forming plane is set to d=dmax/Nθ=FΔX. As shown in
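The key property of this choice of d can be checked with a trivial sketch: at d = dmax/Nθ, adjacent parallax images end up displaced relative to one another by 1/Nθ of the pixel period, so every nonzero relative shift is non-integral, which is the condition stated above for pixel shift super-resolution:

```python
# Relative shift of parallax image a, in units of the pixel period, for
# d = d_max / Ntheta (a common offset shared by all views is ignored).
Ntheta = 4
shifts = [a / Ntheta for a in range(Ntheta)]  # 0.0, 0.25, 0.5, 0.75
assert all(s != round(s) for s in shifts[1:]) # every nonzero shift is non-integral
```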
In step S400, super-resolution processing is performed such that an output image whose resolution is higher than the resolution of each of the parallax images is generated from the multiple pixel shifted images that were generated in step S300.
Expression (2) is a relational expression between the super-resolution pixel signal lμ and the sub-pixel signal Lμ arrayed one-dimensionally. Given μ=Nθi+a and ν=Nθj+b (i,j=0 to NLF−1; a,b=0 to Nθ−1), the relational expression of Expression (3) holds between the super-resolution pixel signal lμ,ν and the sub-pixel signal Lμ,ν arrayed two-dimensionally. The matrix Mμ,ν,μ′,ν′ is a sparse matrix. In the present embodiment, the relational expression of Expression (3) corresponds to the generation of multiple pixel shifted images by performing non-integral shifting on each of the parallax images in step S300.
Accordingly, using the inverse matrix M−1μ,ν,μ′,ν′ of the matrix Mμ,ν,μ′,ν′, the relational expression of Expression (4) holds between the super-resolution pixel signal lμ,ν and the sub-pixel signal Lμ,ν.
In step S400, an output image (super-resolution pixel signal lμ,ν) is generated through super-resolution processing in which the inverse matrix M−1μ,ν,μ′,ν′ of the matrix Mμ,ν,μ′,ν′ is obtained and compositing is performed using the relational expression of Expression (4). The inverse matrix M−1μ,ν,μ′,ν′ may be obtained in advance as necessary.
In the present embodiment, the sampling period in the x direction after super-resolution processing is ΔX/Nθ=Δx, which is the same as the sub-pixel period. Accordingly, an output image whose resolution is equal to the effective sub-pixel count (Np=Nθ×Nθ times the resolution equal to the effective pixel count) can be generated through the super-resolution processing.
A configuration is possible in which, as necessary, the super-resolution pixel signal lμ,ν, the inverse matrix M−1μ,ν,μ′,ν′, and the sub-pixel signal Lμ,ν in Expressions (3) and (4) are respectively subjected to Fourier transformation, super-resolution processing is performed in the frequency space, and then inverse Fourier transformation is performed.
As necessary, dark correction, shading correction, demosaicing processing, and the like may be performed on one or a combination of the input image, the parallax images, the pixel shifted images, and the output image.
The output image generated through the above-described image generation method is displayed by the display apparatus 131.
The present embodiment is one example of an image capture apparatus that has an image generation means for performing the above-described image generation method. Also, the present embodiment is one example of a display apparatus that has an image generation means for performing the above-described image generation method.
According to the above configuration, a captured image that has a high spatial resolution can be generated from multiple parallax images.
The following describes an image processing method for generating an output image from an input image according to a second embodiment of the present invention with reference to the flowchart of
The processing up to the generation of multiple parallax images that respectively correspond to the different pupil sub-areas and have a resolution equal to the effective pixel count based on the input image in step S200 is similar to that in the first embodiment.
In the present embodiment, first, super-resolution processing in the x direction is performed, and then super-resolution processing in the y direction is performed. Similarly to the first embodiment, the distance d between the image sensing plane and the virtual image forming plane is set to d=dmax/Nθ.
First, in step S310, translation in the x direction only is performed along the angle θa for each parallax image, and multiple x-direction pixel shifted images are generated by performing x-direction non-integral shifting (shifting by the non-integral factor 1/Nθ of the pixel period ΔX). The relational expression of Expression (5) corresponds to the generation of multiple x-direction pixel shifted images by performing x-direction non-integral shifting on each of the parallax images in step S310.
In step S410, multiple x-direction super-resolution images are generated by solving the simultaneous equation of Expression (5) for the super-resolution pixel signal lμ,ν. Expression (5) can be explicitly described as the recurrence formulas in Expressions (6a) to (6d). The recurrence formulas in Expressions (6a) to (6d) can be sequentially solved for the super-resolution pixel signal lμ,ν, and there is no need to obtain the inverse matrix M−1μ,μ′ of the matrix Mμ,μ′, thus making it possible to simplify the arithmetic processing. In this way, x-direction super-resolution processing is performed through steps S310 and S410.
Next, in step S320, translation in the y direction only is performed along the angle θb for each x-direction pixel shifted image, and multiple y-direction pixel shifted images are generated by performing y-direction non-integral shifting (shifting by the non-integral factor 1/Nθ of the pixel period ΔX). Recurrence formulas similar to those of Expressions (6a) to (6d) hold between the y-direction pixel shifted images and the super-resolution pixel signal lμ,ν as well.
In step S420, the recurrence formulas expressing the relationship between the y-direction pixel shifted images and the super-resolution pixel signal lμ,ν are sequentially solved for the super-resolution pixel signal lμ,ν, and thus an output image (super-resolution pixel signal lμ,ν) is generated.
Similarly to the first embodiment, in the present embodiment as well, the sampling period in the x direction after super-resolution processing is ΔX/Nθ=Δx, which is the same as the sub-pixel period. Accordingly, an output image whose resolution is equal to the effective sub-pixel count (Np=Nθ×Nθ times the resolution equal to the effective pixel count) can be generated through the super-resolution processing. Other aspects are similar to those in the first embodiment.
According to the above configuration, a captured image that has a high spatial resolution can be generated from multiple parallax images.
Aspects of the present invention can also be realized by a computer of a system or apparatus (or devices such as a CPU or MPU) that reads out and executes a program recorded on a memory device to perform the functions of the above-described embodiments, and by a method, the steps of which are performed by a computer of a system or apparatus by, for example, reading out and executing a program recorded on a memory device to perform the functions of the above-described embodiments. For this purpose, the program is provided to the computer for example via a network or from a recording medium of various types serving as the memory device (e.g., computer-readable medium).
While the present invention has been described with reference to exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.
This application claims the benefit of Japanese Patent Application No. 2012-005661, filed Jan. 13, 2012, which is hereby incorporated by reference herein in its entirety.
Number | Date | Country | Kind |
---|---|---|---|
2012-005661 | Jan 2012 | JP | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2012/082116 | 12/5/2012 | WO | 00 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2013/105383 | 7/18/2013 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
4410804 | Stauffer | Oct 1983 | A |
8749620 | Knight | Jun 2014 | B1 |
Number | Date | Country |
---|---|---|
2001-083407 | Mar 2001 | JP |
2007-004471 | Jan 2007 | JP |
2008-294741 | Dec 2008 | JP |
2013-042443 | Feb 2013 | JP |
Entry |
---|
Ng et al., "Light Field Photography with a Hand-held Plenoptic Camera," Stanford Tech Report CTSR 2005-02 (2005). |
Number | Date | Country
---|---|---
20140368690 A1 | Dec 2014 | US