The present invention relates to an image-processing apparatus and a light-field imaging apparatus.
In the related art, there is a known light-field imaging apparatus that is provided with an imaging device in which a plurality of pixels are two-dimensionally disposed and with a microlens array having microlenses that are disposed, closer to an imaging subject than the imaging device is, in correspondence with each of the plurality of pixels of the imaging device, and that images a three-dimensional distribution of the imaging subject (for example, see Japanese Unexamined Patent Application, Publication No. 2010-102230).
Generally, unlike an image acquired by a normal imaging apparatus, an image acquired by a light-field imaging apparatus (hereinafter referred to as a light-field image) itself is an image in which images of numerous three-dimensionally distributed points overlap with each other; therefore, it is not possible to intuitively ascertain basic information such as plane position and distance of the imaging subject on a flat surface unless image processing is applied.
Therefore, the imaging subject is reconstructed by generating a three-dimensional image from the acquired light-field images and a pupil-image function of an imaging optical system that includes the microlenses. The processing for three-dimensionally reconstructing the imaging subject from the light-field images employs a method in which optimization is achieved by performing repeated computations, by means of a computation method such as the Richardson-Lucy method, starting from an appropriately set initial value.
An aspect of the present invention is an image-processing apparatus including: a storing portion that stores a pupil-image function of an imaging optical system; and a reconstructing-processing portion that reconstructs, on the basis of the pupil-image function stored in the storing portion and input light-field images, a three-dimensional image of an imaging subject by means of repeated computations that give an initial value, wherein the reconstructing-processing portion uses the three-dimensional image reconstructed on the basis of, among the light-field images of a plurality of frames acquired in a time series, the light-field image of a preceding one of the frames in a time-axis direction as the initial value.
Another aspect of the present invention is a light-field imaging apparatus including: an imaging optical system that focuses light coming from an imaging subject and forms an image of the imaging subject; a microlens array that has a plurality of microlenses that are two-dimensionally arrayed at a position at which a primary image is formed by the imaging optical system or a conjugate position with respect to the primary image and that focus light coming from the imaging optical system; an imaging device that has a plurality of pixels that receive the light focused by the microlenses and that generates light-field images by performing photoelectric conversion of the light received by the pixels; and any one of the above-described image-processing apparatuses that process the light-field images generated by the imaging device.
An image-processing apparatus 2 and a light-field imaging apparatus 1 according to a first embodiment of the present invention will be described below with reference to the drawings.
As shown in
As shown in
The imaging device 9 is also configured by two-dimensionally arraying the individual pixels 9a in a direction that is orthogonal to the optical axis L of the imaging optical system 3. The plurality of pixels 9a are arrayed in each of regions corresponding to the plurality of microlenses 5a of the microlens array 5 (for example, in an 8×8 arrangement in the above-described example). The plurality of pixels 9a perform photoelectric conversion of the detected light, and output light-intensity signals (pixel values) that serve as light-field-image information of the imaging subject S.
The imaging device 9 sequentially outputs the light-field-image information about a plurality of frames acquired at different times in a time-axis direction. For example, the imaging device 9 performs video recording or time-lapse recording.
The image-processing apparatus 2 is configured by a processor, and includes, as shown in
A pupil-image function [H] is a function that satisfies Expression (1) below:
[b]=[H][g] (1)
[b] denotes a light-field image, and
[g] denotes the intensity of light coming from each portion of the three-dimensional imaging subject S.
In other words, Expression (1) indicates the relationship in which the light coming from the imaging subject S is converted to a light-field image via the imaging optical system 3 and received by the individual pixels 9a of the imaging device 9, and the pupil-image function [H] functions as a transformation matrix. It is possible to determine, in advance, the pupil-image functions of the imaging optical system 3, the microlens array 5, and the relay lens 4, and the pupil-image functions are stored in the storing portion 11.
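The linear relationship of Expression (1) can be illustrated with a minimal sketch. The matrix sizes and the random entries of H below are hypothetical stand-ins chosen only for illustration; an actual pupil-image function would be determined in advance from the optical system.

```python
import numpy as np

# Hypothetical toy sizes: 4 object voxels (flattened x, y, z) and 6 sensor pixels.
rng = np.random.default_rng(0)
H = rng.random((6, 4))               # stand-in for the stored pupil-image function [H]
g = np.array([0.0, 2.0, 0.5, 1.0])   # light intensity of each portion of the subject, [g]
b = H @ g                            # light-field image [b] = [H][g], Expression (1)

# Because the model is linear, doubling the subject intensity doubles every pixel value.
b_double = H @ (2.0 * g)
```

Treating [H] as a matrix in this way is what allows the reconstruction to be posed as the recovery of [g] from a measured [b].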
The imaging optical system 3 includes, for example, as shown in
The reconstructing-processing portion 12 determines [g] that minimizes an error function e, expressed as Expression (2), when the light-field image [b] of Expression (1) is input.
Here, $\|x\|_2$ is the L2 norm of x.
As a method for determining [g] that minimizes Expression (2), for example, repeated computations, such as computations according to the Richardson-Lucy method indicated in Eq. 3, are executed.
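$$g^{(k+1)} = \mathrm{diag}\left(H^{t}\mathbf{1}\right)^{-1}\,\mathrm{diag}\left(H^{t}\,\mathrm{diag}\left(Hg^{(k)}\right)^{-1} b\right)\, g^{(k)} \qquad \{\text{Eq. 3}\}$$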
Here,
g(k) denotes the three-dimensional image of the imaging subject S that is calculated in k-th repeated computation,
b denotes the light-field image output from the imaging device 9,
diag denotes a diagonal matrix,
t denotes a transpose matrix,
−1 denotes an inverse, and
k denotes the number of repetitions.
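One repetition of Eq. 3 can be sketched with element-wise operations, since each diag(...) factor only scales individual entries. The sketch below is illustrative and rests on assumptions: the form of the error function (the L2 norm of b − Hg), the guard value eps, and the small blur matrix H are not taken from the source.

```python
import numpy as np

def rl_update(H, b, g, eps=1e-12):
    """One Richardson-Lucy repetition in the form of Eq. 3."""
    forward = H @ g                          # H g(k): predicted light-field image
    ratio = b / np.maximum(forward, eps)     # diag(H g(k))^-1 b, guarded against division by zero
    norm = H.T @ np.ones(H.shape[0])         # H^t 1
    return g * (H.T @ ratio) / np.maximum(norm, eps)

def error(H, b, g):
    # Assumed form of the error function e: L2 norm of the residual b - H g.
    return np.linalg.norm(b - H @ g)

# Hypothetical toy problem: a tridiagonal blur standing in for the pupil-image function.
n = 8
H = 0.6 * np.eye(n) + 0.2 * np.eye(n, k=1) + 0.2 * np.eye(n, k=-1)
g_true = np.array([1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0])
b = H @ g_true

g = np.ones(n)                # uniform initial value
e_start = error(H, b, g)
for _ in range(50):
    g = rl_update(H, b, g)
e_end = error(H, b, g)
```

Because the update is multiplicative, a non-negative initial value stays non-negative, and an image that already reproduces b exactly is left unchanged by a further repetition.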
More specifically, as shown in
As shown in
Then, whether or not the number of repetitions k is kmax is determined (step S44); in the case in which k is kmax, the processing is ended and the procedure advances to step S5; and, in the case in which k is not kmax, whether or not the error function e is less than a predetermined threshold th is determined (step S45). Here, kmax is a maximum value of the number of repetitions. The threshold th is a constant that varies according to the sizes of x, y, and z, and is experimentally determined.
In the case in which e<th, the repeated computations are ended, and, in the case in which e≥th, a computation result g (x, y, z, t) is input to the initial image g0 (x, y, z, t) (step S46), the number of repetitions k is incremented (step S47), and the steps from step S42 are repeated.
Next, whether or not the frame number t is the final number is determined (step S5); in the case in which the frame number t is the final number, the procedure is ended; and, in the case in which the frame number t is not the final number, the frame number t is incremented (step S6), and the procedure returns to step S2.
In the second frame and thereafter, because t is determined not to be zero in step S2, the three-dimensional image g (x, y, z, t−1) calculated for the immediately preceding frame is set to the initial image g0 (x, y, z, t) (step S7), and the steps from step S4 are repeated.
With the image-processing apparatus 2 and the light-field imaging apparatus 1 according to this embodiment, thus configured, for the second frame and thereafter, the repeated computations are performed by using, as the initial image g0 (x, y, z, t), the three-dimensional image g (x, y, z, t−1) generated by using the light-field image of the immediately preceding frame. Consequently, when there is little change with respect to the light-field image of the immediately preceding frame, the number of repetitions k becomes one in nearly all cases, and thus, there is an advantage in that it is possible to considerably reduce the amount of time required to perform the three-dimensional reconstructing processing.
In the case in which there is a change with respect to the light-field image of the immediately preceding frame, because the error function e becomes equal to or greater than the threshold th, the repeated computations are performed within the range of the maximum value kmax of the number of repetitions k, and thus, it is possible to generate an appropriate three-dimensional image g (x, y, z, t).
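The frame loop described above, in which the result for the preceding frame seeds the next reconstruction and the repetitions stop at kmax or once e falls below th, might be sketched as follows. This is a sketch under stated assumptions: the uniform initial image for the first frame, the L2 form of the error function, the toy matrix H, and the names reconstruct_frame and reconstruct_sequence are all hypothetical.

```python
import numpy as np

def rl_update(H, b, g, eps=1e-12):
    # One Richardson-Lucy repetition (Eq. 3) written element-wise.
    ratio = b / np.maximum(H @ g, eps)
    return g * (H.T @ ratio) / np.maximum(H.T @ np.ones(H.shape[0]), eps)

def reconstruct_frame(H, b, g0, k_max, th):
    """Repeat the update until k reaches k_max or the error drops below th."""
    g, k = g0, 0
    while True:
        k += 1
        g = rl_update(H, b, g)
        e = np.linalg.norm(b - H @ g)    # assumed form of the error function e
        if k == k_max or e < th:
            return g, k

def reconstruct_sequence(H, frames, k_max=5000, th=1e-3):
    """Reconstruct each frame, using the previous result as the initial image."""
    g_prev, images, counts = None, [], []
    for b in frames:
        # Assumption: the first frame starts from a uniform image.
        g0 = np.ones(H.shape[1]) if g_prev is None else g_prev
        g_prev, k = reconstruct_frame(H, b, g0, k_max, th)
        images.append(g_prev)
        counts.append(k)
    return images, counts

# Hypothetical toy sequence: frames 0 and 1 are identical, frame 2 contains a change.
n = 8
H = 0.6 * np.eye(n) + 0.2 * np.eye(n, k=1) + 0.2 * np.eye(n, k=-1)
g_a = np.array([1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0])
g_b = np.array([1.0, 2.0, 3.0, 4.0, 1.0, 3.0, 2.0, 1.0])
frames = [H @ g_a, H @ g_a, H @ g_b]
images, counts = reconstruct_sequence(H, frames)
```

In this toy sequence the unchanged second frame needs fewer repetitions than the first, while the changed third frame falls back on the ordinary repetition loop bounded by k_max.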
In
With the image-processing apparatus 2 and the light-field imaging apparatus 1 according to this embodiment, with regard to the cases in which the frame numbers t=1 and 77, it is possible to obtain, even if the number of repetitions k is set to one, three-dimensional images that are as clear as the three-dimensional images generated by taking time, shown in the Reference Example in
At the frame number t=78, although the three-dimensional image is unclear in the case in which the number of repetitions k is set to one, this is because some kind of change occurred with respect to the light-field image of the immediately preceding frame. In this case, for example, as shown in
Next, an image-processing apparatus 22 and a light-field imaging apparatus according to a second embodiment of the present invention will be described below with reference to the drawings.
In describing this embodiment, portions having the same configurations as those of the image-processing apparatus 2 and the light-field imaging apparatus 1 according to the first embodiment, described above, will be given the same reference signs, and descriptions thereof will be omitted.
As shown in
The evaluation-value calculating portion 23 calculates an event evaluation value A(t) by means of Eq. 4 with respect to a t-th light-field image from the first light-field image in a sequence consisting of the light-field images of the plurality of frames acquired by the imaging device 9 at the predetermined time interval.
The event evaluation value A(t) is a representative value, for each light-field image, with respect to numerical values that indicate divergences from the average (predetermined reference value) of the entire sequence of the pixel values of the individual pixels included in the light-field images.
Here,
ttotal is the total number of frames.
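As an illustration of the event evaluation value, the sketch below sums, for each frame, the absolute deviations of the pixel values from the average image of the entire sequence. The exact form of Eq. 4 is not reproduced in this text, so this formula is an assumption based on the surrounding description; the function name event_values and the test data are hypothetical.

```python
import numpy as np

def event_values(frames):
    """Return A(t) for every frame of a sequence shaped (t_total, rows, cols)."""
    frames = np.asarray(frames, dtype=float)
    mean_image = frames.mean(axis=0)                 # average over the entire sequence
    # Assumed representative value: sum of absolute deviations over all pixels.
    return np.abs(frames - mean_image).sum(axis=(1, 2))

# Hypothetical sequence: ten constant 4x4 frames, with a bright 2x2 patch in frame 6.
frames = np.ones((10, 4, 4))
frames[6, :2, :2] += 5.0
A = event_values(frames)
event_frame = int(np.argmax(A))
```

Note that this variant needs the whole sequence before the average image can be formed, so it is suited to processing after acquisition.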
As a result of arraying, in time series, the calculated event evaluation values A(t) in association with the frames, the graph shown in
In
Specifically, as shown in
With the image-processing apparatus 22 and the light-field imaging apparatus according to this embodiment, thus configured, whether or not an event is present in a light-field image is determined. In the case in which it is determined that an event is not present, the three-dimensional image g (x, y, z, t−1) calculated for the immediately preceding frame is set to the initial image g0 (x, y, z, t) and the three-dimensional image g (x, y, z, t) is generated; therefore, the number of repetitions k becomes one in nearly all cases, and thus, there is an advantage in that it is possible to considerably reduce the amount of time required to perform the three-dimensional reconstructing processing.
In the case in which it is determined that an event is present, by using, as the initial image g0 (x, y, z, t), an image that is constructed in a simple manner from the light-field image, it is possible to reduce the value of the error function e with a number of repetitions k that is less than that for the three-dimensional image g (x, y, z, t−1) for the immediately preceding frame, and, in this case also, there is an advantage in that it is possible to reduce the amount of time required for performing the three-dimensional reconstructing processing.
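The switching of the initial image according to the event determination could look like the following sketch. The source does not specify how the image "constructed in a simple manner" is formed; the normalized back-projection used here is a hypothetical stand-in, as are the function names.

```python
import numpy as np

def simple_initial_image(H, b, eps=1e-12):
    # Hypothetical simple construction: back-projection H^t b normalized by H^t 1.
    return (H.T @ b) / np.maximum(H.T @ np.ones(H.shape[0]), eps)

def choose_initial_image(H, b, g_prev, event_present):
    """Pick the initial image g0 for the current frame."""
    if event_present or g_prev is None:
        # Event present (or no previous result): build g0 directly from the light-field image.
        return simple_initial_image(H, b)
    # No event: reuse the three-dimensional image of the immediately preceding frame.
    return g_prev

# Toy check with an identity matrix, for which the back-projection equals b itself.
H = np.eye(3)
b = np.array([1.0, 2.0, 3.0])
g_prev = np.array([0.9, 2.1, 3.0])
g0_event = choose_initial_image(H, b, g_prev, event_present=True)
g0_quiet = choose_initial_image(H, b, g_prev, event_present=False)
```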
In this embodiment, although the initial image g0 (x, y, z, t) is changed depending on the presence/absence of an event, in addition to this, the processing performed by the reconstructing-processing portion 12 may be changed, as shown in
As shown in
In this embodiment, although the event evaluation value A(t) is calculated by using pixel values of the average image of the entire sequence, as indicated in Eq. 5, the event evaluation value A(t) may be a sum of absolute values, taken for the entire light-field images, with respect to differences between pixel values of corresponding pixels in the light-field images that are adjacent (immediately preceding) in the time-axis direction. In this case, the presence/absence of an event is determined depending on whether or not the absolute values of the differences exceed a predetermined threshold.
Obtaining the differences is suitable for ascertaining movements of and changes in the shape of the imaging subject S. Because it is not necessary to acquire all of the light-field images, there is an advantage in that it is possible to detect the event evaluation value A(t) substantially in real time while imaging.
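The difference-based variant can be computed frame by frame while imaging, as in the sketch below. The threshold value and the convention that the first frame, which has no predecessor, is reported as having no event are assumptions made for illustration.

```python
import numpy as np

def detect_events(frames, threshold):
    """Flag frames whose summed absolute difference from the previous frame exceeds threshold."""
    flags = []
    prev = None
    for frame in frames:                 # frames may be any iterable, e.g. a live stream
        frame = np.asarray(frame, dtype=float)
        if prev is None:
            flags.append(False)          # assumption: no predecessor, so no event is reported
        else:
            a_t = np.abs(frame - prev).sum()   # A(t) from adjacent frames
            flags.append(bool(a_t > threshold))
        prev = frame
    return flags

# Hypothetical sequence: the subject changes between frame 2 and frame 3.
frames = [np.ones((4, 4))] * 3 + [2.0 * np.ones((4, 4))] * 2
flags = detect_events(frames, threshold=1.0)
```

Only the immediately preceding frame has to be kept in memory, which is what makes near-real-time evaluation possible.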
In this embodiment, although the value in which, with respect to the individual pixels of the light-field images of the individual frames, the absolute values of the difference values indicating the divergences from the reference value are added up for the entire sequence is used as the event evaluation value A(t), alternatively, another arbitrary representative value, for example, an arbitrary statistical value such as an average, a maximum value, a minimum value, or a median, may be employed as the event information.
Although this embodiment has been described in terms of an example in which a three-dimensional image g (x, y, z, t−1) is generated by using the light-field image of an immediately preceding frame, alternatively, the three-dimensional image g (x, y, z, t−1) may be generated by using the light-field image of a preceding frame in the time-axis direction.
Although this embodiment has been described in terms of an example in which the pixels 9a of the imaging device 9 and the pixels on the light-field image to be used in event detection coincide with each other, alternatively, the pixels 9a of the imaging device 9 and the pixels on the light-field image need not coincide with each other.
In an actual optical system, there are cases in which a setting error occurs, such as the pitch of the microlens 5a not being an integer multiple of the pixel pitch and the microlenses 5a being disposed in a slightly rotated manner, and thus, there are cases in which calibrating processing is performed, wherein the pixels are rearranged by means of interpolating processing at the beginning of the image processing. In this case, strictly speaking, the pixels 9a of the imaging device 9 and the pixels on the light-field image to be used in event detection do not coincide with each other.
As a result, the following aspects are derived from the above-described embodiments of the present invention.
An aspect of the present invention is an image-processing apparatus including: a storing portion that stores a pupil-image function of an imaging optical system; and a reconstructing-processing portion that reconstructs, on the basis of the pupil-image function stored in the storing portion and input light-field images, a three-dimensional image of an imaging subject by means of repeated computations that give an initial value, wherein the reconstructing-processing portion uses the three-dimensional image reconstructed on the basis of, among the light-field images of a plurality of frames acquired in a time series, the light-field image of a preceding one of the frames in a time-axis direction as the initial value.
With this aspect, as a result of the reconstructing-processing portion performing the repeated computations that give the initial value on the basis of the pupil-image function of the imaging optical system stored in the storing portion and the input light-field images, the three-dimensional image of the imaging subject is reconstructed. In the case in which an event indicating some kind of change between the light-field images of adjacent frames is not so significant, the three-dimensional image, which is reconstructed by using the light-field images of the plurality of frames acquired in a time series, does not have a large difference with respect to the three-dimensional image reconstructed by using the individual light-field images preceding in the time-axis direction. Therefore, by using, as the initial value, the three-dimensional image reconstructed on the basis of the light-field image of the preceding frame in the time-axis direction, it is possible to cause the repeated computations to be completed earlier, and thus, it is possible to perform the three-dimensional reconstructing processing in a short period of time.
In the above-described aspect, the reconstructing-processing portion may use, as the initial value, the three-dimensional image reconstructed on the basis of the light-field image of an immediately preceding one of the frames.
The above-described aspect may further include an event-determining portion that determines the presence/absence of an event in the light-field images, wherein the reconstructing-processing portion may use, as the initial value, the three-dimensional image reconstructed on the basis of the light-field image of the immediately preceding one of the frames in the case in which the event-determining portion determines that the event is not present in the light-field image.
By doing so, in the case in which the event-determining portion determines that the event is not present, by using, as the initial value, the three-dimensional image reconstructed by using the preceding light-field image in the time-axis direction, which has no large difference with respect to the three-dimensional image obtained as a result of the reconstructing processing, it is possible to cause the repeated computations to be completed earlier, and thus, it is possible to perform the three-dimensional reconstructing processing in a short period of time.
In the above-described aspect, the event-determining portion may determine that the event is present in the case in which a difference between the light-field image to be used in reconstruction and the light-field image of the immediately preceding one of the frames exceeds a predetermined threshold.
By doing so, it is possible to determine that the event is present in a simple manner in the case in which the difference between the light-field image to be used in reconstruction and the light-field image of the preceding frame in the time-axis direction exceeds the predetermined threshold.
In the above-described aspect, the reconstructing-processing portion may use, as an initial image, an image created from the light-field image to be used in reconstruction without being subjected to repeated computation in the case in which the event-determining portion determines that the event is present.
By doing so, regarding the light-field image in which it is determined that the event is present, it is possible to use the image created from said light-field image without being subjected to the repeated computations as the initial image, and, in the case in which it is determined that the event is not present, it is possible to switch to the processing in which the three-dimensional image reconstructed by using the preceding light-field image in the time-axis direction is used as the initial value.
In the above-described aspect, the reconstructing-processing portion may perform the repeated computations according to a first number of repetitions set in advance, in the case in which the event-determining portion determines that the event is present, and may perform the repeated computations according to a second number of repetitions that is less than the first number of repetitions, in the case in which the event-determining portion determines that the event is not present.
By doing so, it is possible to keep the number of repetitions low in the case in which it is determined that the event is not present, and it is possible to perform the three-dimensional reconstructing processing in a short period of time.
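Selecting a preset number of repetitions from the event determination might be sketched as follows. The concrete counts (30 and 2) and the function name reconstruct_fixed are hypothetical; the source only requires that the second number be less than the first.

```python
import numpy as np

def rl_update(H, b, g, eps=1e-12):
    # One Richardson-Lucy repetition written element-wise.
    ratio = b / np.maximum(H @ g, eps)
    return g * (H.T @ ratio) / np.maximum(H.T @ np.ones(H.shape[0]), eps)

def reconstruct_fixed(H, b, g0, event_present, first_count=30, second_count=2):
    """Run a preset number of repetitions chosen by the event flag."""
    if second_count >= first_count:
        raise ValueError("second_count must be less than first_count")
    n = first_count if event_present else second_count
    g = g0
    for _ in range(n):
        g = rl_update(H, b, g)
    return g, n

# Toy check with an identity matrix, for which a single repetition already yields b.
H = np.eye(3)
b = np.array([1.0, 2.0, 3.0])
g_quiet, n_quiet = reconstruct_fixed(H, b, np.ones(3), event_present=False)
g_event, n_event = reconstruct_fixed(H, b, np.ones(3), event_present=True)
```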
In the above-described aspect, the repeated computations that give the initial value may be performed in accordance with the Richardson-Lucy method.
Another aspect of the present invention is a light-field imaging apparatus including: an imaging optical system that focuses light coming from an imaging subject and forms an image of the imaging subject; a microlens array that has a plurality of microlenses that are two-dimensionally arrayed at a position at which a primary image is formed by the imaging optical system or a conjugate position with respect to the primary image and that focus light coming from the imaging optical system; an imaging device that has a plurality of pixels that receive the light focused by the microlenses and that generates light-field images by performing photoelectric conversion of the light received by the pixels; and any one of the above-described image-processing apparatuses that process the light-field images generated by the imaging device.
This is a continuation of International Application PCT/JP2017/025590, with an international filing date of Jul. 13, 2017, which is hereby incorporated by reference herein in its entirety.
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/JP2017/025590 | Jul 2017 | US |
Child | 16736890 | | US |