This application claims the benefit, under 35 U.S.C. § 365 of International Application PCT/EP2017/068312, filed Jul. 20, 2017, which was published in accordance with PCT Article 21(2) on Feb. 8, 2018, in English, and which claims the benefit of European Patent Application No. 16306020.5 filed Aug. 05, 2016.
The disclosure relates to the processing of raw data obtained by a plenoptic camera (also named a light field camera).
This section is intended to introduce the reader to various aspects of art, which may be related to various aspects of the present invention that are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present invention. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.
The development of plenoptic cameras, which enable a posteriori refocusing, is an active research subject. In order to achieve such refocusing, it is necessary to perform shifting and adding operations on several sub-aperture images (which correspond to images of a same scene obtained from different acquisition angles at a same time, a sub-aperture image being also named a viewpoint image), as explained for example in the article entitled “Light Field Photography with a Hand-held Plenoptic Camera” by Ren Ng et al., in the Stanford Tech Report CSTR 2005-02. In order to obtain a sub-aperture image from raw data obtained/acquired by a plenoptic camera, the processing usually done consists of obtaining the same pixel under each of the micro-lenses comprised in the plenoptic camera (a micro-lens generating a micro-image, also named a lenslet image), and gathering these obtained pixels in order to define a sub-aperture image. However, such processing for obtaining a set of sub-aperture images from raw data is based on the hypothesis that each sensor pixel positioned behind the micro-lens array only records one viewpoint pixel, as mentioned in Chapter 3.3 of the PhD dissertation entitled “Digital Light Field Photography” by Ren Ng, published in July 2006, due to the fact that the coordinates of the center of a micro-image formed by a micro-lens have only integer values (i.e. there is a perfect match between a micro-image (or lenslet image) and the pixel sensors). From a mathematical point of view, this hypothesis can be written:
Vn,m[k,l]=Rl,k[m,n]
where Vn,m denotes a sub-aperture image, and Rl,k denotes a micro-image (also noted μ-image), with n∈[0, N−1], m∈[0, M−1], l∈[0, L−1], and k∈[0, K−1].
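Under this ideal-alignment hypothesis, the demultiplexing reduces to a pure rearrangement of the raw samples. The following sketch is an illustration only and not part of the claimed method; the assumption that micro-image (k,l) occupies an M×N block of the sensor (so the sensor is L·N rows by K·M columns) is made here for the example:

```python
import numpy as np

def extract_subaperture_images(raw, K, L, M, N):
    """Demultiplex raw plenoptic data into N*M sub-aperture images,
    assuming perfect alignment: V_{n,m}[k,l] = R_{l,k}[m,n].

    raw: array of shape (L*N, K*M); micro-image (k,l) occupies the
    block raw[l*N:(l+1)*N, k*M:(k+1)*M].
    Returns V of shape (N, M, L, K): V[n, m] is the sub-aperture
    image of view (n, m), and V[n, m][l, k] its pixel (k, l).
    """
    assert raw.shape == (L * N, K * M)
    # Split rows into (L, N) blocks and columns into (K, M) blocks,
    # then reorder the axes so the view indices (n, m) come first.
    return raw.reshape(L, N, K, M).transpose(1, 3, 0, 2)
```

The rearrangement is a pure indexing operation, so no interpolation occurs in this ideal case.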
However, it should be noted that the hypothesis previously formulated is not always verified. Indeed, the micro-image Rl,k may be misaligned with the sensor array. Therefore, the sub-aperture image extraction process (such a process is also named a demultiplexing process or a decoding process, as detailed in the article “Accurate Disparity Estimation for Plenoptic Images” by N. Sabater et al., published in ECCV Workshop 2014) from the raw data is not as accurate as it should be. Hence, it is necessary to improve the extraction process in order to determine correctly the set of sub-aperture images. In order to solve this issue, a technique described in the document US 2014/0146184 proposes to perform a calibration for correcting the misalignment.
The proposed technique is an alternative to the one of document US 2014/0146184.
References in the specification to “one embodiment”, “an embodiment”, “an example embodiment”, indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
The present disclosure is directed to a method for obtaining at least one sub-aperture image associated with one view, from raw light field data corresponding to data recorded by an array of pixel sensors positioned behind an array of micro-lenses in a light field camera, each of said pixel sensors recording a linear mixing of up to four different views. The method is remarkable in that it comprises applying a signal separation process on said raw data by using an inverse of a mixing matrix A, said mixing matrix comprising coefficients that convey weighting information of said up to four different views recorded by a pixel sensor.
In a preferred embodiment, the method is remarkable in that said coefficients are defined according to positions, in said array of pixels, of micro-lenses images centers.
In a preferred embodiment, the method is remarkable in that said applying comprises multiplying said recorded data, represented by a column vector, by said inverse of said mixing matrix A.
In a preferred embodiment, the method is remarkable in that said signal separation is a blind signal separation.
Indeed, in one embodiment of the disclosure, it is possible to apply a blind separation technique. Indeed, such kind of technique can be efficient for recovering a sub-aperture image. Blind separation technique has been successfully applied in the processing of image data as explained in the article entitled: “Blind separation of superimposed shifted images using parameterized joint diagonalization” by Be'ery E. and Yeredor A., and published in IEEE Trans Image Process. 2008 March; 17(3):340-53, where blind separation of source images from linear mixtures is done.
In a preferred embodiment, the method is remarkable in that said coefficients are obtained by performing a calibration process on said light field camera.
According to an exemplary implementation, the different steps of the method are implemented by a computer software program or programs, this software program comprising software instructions designed to be executed by a data processor of a relay module according to the disclosure and being designed to control the execution of the different steps of this method.
Consequently, an aspect of the disclosure also concerns a program liable to be executed by a computer or by a data processor, this program comprising instructions to command the execution of the steps of a method as mentioned here above.
This program can use any programming language whatsoever and be in the form of a source code, object code or code that is intermediate between source code and object code, such as in a partially compiled form or in any other desirable form.
The disclosure also concerns an information medium readable by a data processor and comprising instructions of a program as mentioned here above.
The information medium can be any entity or device capable of storing the program. For example, the medium can comprise a storage means such as a ROM (which stands for “Read Only Memory”), for example a CD-ROM (which stands for “Compact Disc-Read Only Memory”) or a microelectronic circuit ROM or again a magnetic recording means, for example a floppy disk or a hard disk drive.
Furthermore, the information medium may be a transmissible carrier such as an electrical or optical signal that can be conveyed through an electrical or optical cable, by radio or by other means. The program can be especially downloaded into an Internet-type network.
Alternately, the information medium can be an integrated circuit into which the program is incorporated, the circuit being adapted to executing or being used in the execution of the method in question.
According to one embodiment, an embodiment of the disclosure is implemented by means of software and/or hardware components. From this viewpoint, the term “module” can correspond in this document both to a software component and to a hardware component or to a set of hardware and software components.
A software component corresponds to one or more computer programs, one or more sub-programs of a program, or more generally to any element of a program or a software program capable of implementing a function or a set of functions according to what is described here below for the module concerned. One such software component is executed by a data processor of a physical entity (terminal, server, etc.) and is capable of accessing the hardware resources of this physical entity (memories, recording media, communications buses, input/output electronic boards, user interfaces, etc.).
Similarly, a hardware component corresponds to any element of a hardware unit capable of implementing a function or a set of functions according to what is described here below for the module concerned. It may be a programmable hardware component or a component with an integrated circuit for the execution of software, for example an integrated circuit, a smart card, a memory card, an electronic board for executing firmware etc. In a variant, the hardware component comprises a processor that is an integrated circuit such as a central processing unit, and/or a microprocessor, and/or an Application-specific integrated circuit (ASIC), and/or an Application-specific instruction-set processor (ASIP), and/or a graphics processing unit (GPU), and/or a physics processing unit (PPU), and/or a digital signal processor (DSP), and/or an image processor, and/or a coprocessor, and/or a floating-point unit, and/or a network processor, and/or an audio processor, and/or a multi-core processor. Moreover, the hardware component can also comprise a baseband processor (comprising for example memory units, and a firmware) and/or radio electronic circuits (that can comprise antennas) which receive or transmit radio signals. In one embodiment, the hardware component is compliant with one or more standards such as ISO/IEC 18092/ECMA-340, ISO/IEC 21481/ECMA-352, GSMA, StoLPaN, ETSI/SCP (Smart Card Platform), GlobalPlatform (i.e. a secure element). In a variant, the hardware component is a Radio-frequency identification (RFID) tag. In one embodiment, a hardware component comprises circuits that enable Bluetooth communications, and/or Wi-fi communications, and/or Zigbee communications, and/or USB communications and/or Firewire communications and/or NFC (for Near Field) communications.
It should also be noted that a step of obtaining an element/value in the present document can be viewed either as a step of reading such element/value in a memory unit of an electronic device or a step of receiving such element/value from another electronic device via communication means.
In another embodiment of the disclosure, an electronic device is proposed for obtaining at least one sub-aperture image associated with one view, from raw light field data corresponding to data recorded by an array of pixel sensors positioned behind an array of micro-lenses in a light field camera, each of said pixel sensors recording a linear mixing of up to four different views. The electronic device comprises a memory and at least one processor coupled to the memory, and the at least one processor is remarkable in that it is configured to apply a signal separation process on said raw data by using an inverse of a mixing matrix A, said mixing matrix comprising coefficients that convey weighting information of said up to four different views recorded by a pixel sensor.
In a variant, the electronic device is remarkable in that said coefficients are defined according to positions, in said array of pixels, of micro-lenses images centers.
In a variant, the electronic device is remarkable in that said at least one processor is further configured to multiply said recorded data, represented by a column vector, by said inverse of said mixing matrix A.
In a variant, the electronic device is remarkable in that said signal separation is a blind signal separation.
In a variant, the electronic device is remarkable in that said at least one processor is further configured to perform a calibration of said light field camera in order to obtain said coefficients.
The above and other aspects of the invention will become more apparent by the following detailed description of exemplary embodiments thereof with reference to the attached drawings in which:
The
More precisely, a plenoptic camera comprises a main lens referenced 101, and a sensor array (i.e., an array of pixel sensors (for example a sensor based on CMOS technology)), referenced 104. Between the main lens 101 and the sensor array 104, a micro-lens array referenced 102, that comprises a set of micro-lenses referenced 103, is positioned. It should be noted that, optionally, some spacers might be located between the micro-lens array, around each lens, and the sensor, to prevent light from one lens from overlapping with the light of other lenses at the sensor side. In one embodiment, all the micro-lenses have the same focal length. In another embodiment, the micro-lenses can be classified into at least three groups of micro-lenses, each group being associated with a given focal length, different for each group. Moreover, in a variant, the focal length of a micro-lens is different from the ones positioned in its neighborhood; such a configuration enables the enhancement of the plenoptic camera's depth of field. It should be noted that the main lens 101 can be a more complex optical system than the one depicted in
More details related to plenoptic cameras can be found in Section 4, entitled “Image formation of a Light field camera”, in the article entitled “The Light Field Camera: Extended Depth of Field, Aliasing, and Super resolution” by Tom E. Bishop and Paolo Favaro, published in the IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 34, No. 5, in May 2012.
The
As a reminder, a sub-aperture image corresponds to an image of the object space from a given view (i.e. it can be viewed as a sampling of the pupil). In theory, when the micro-lens array and the pixel sensor array are perfectly aligned, the pixels from the raw data and the pixels from the sub-aperture images are linked by the following equation:
Vn,m[k,l]=Rl,k[m,n]
where Vn,m denotes a sub-aperture image positioned at row referenced n and column referenced m in the matrix of views referenced 300, and Rl,k denotes a micro-image (also noted μ-image). Hence Vn,m[k, l] corresponds to the pixel located at position (k,l) in the sub-aperture image Vn,m.
It should be noted that rearranging μ-images into sub-aperture images requires knowing precisely the location of the μ-images. In the following, we denote by (c^x_{k,l}, c^y_{k,l}) the coordinates of the μ-center c_{k,l}, i.e. the center of the μ-image (k,l).
In the literature, most approaches propose de-mosaicking the raw sensor image as a first step, before having any insight into the scene geometry. This induces irrelevant interpolations between samples within μ-images. To circumvent this, disparity-guided de-mosaicking has been proposed in the article entitled “Accurate Disparity Estimation for Plenoptic Images” by Neus Sabater et al., published in the conference proceedings of the Workshop on Light Fields for Computer Vision, ECCV 2014, but that solution relies on nearest integer coordinates, which lessens the accuracy of the reconstructed sub-aperture images.
A new approach to the generation of matrices of views is proposed, which handles sub-pixel positions for μ-centers and plenoptic samples, while keeping interpolations consistent with the physics.
The
Indeed, in view of the
coordinates (i,j) correspond to horizontal and vertical integer coordinates in the raw sensor picture;
coordinates (k,l) correspond to horizontal and vertical indices of a μ-image;
coordinates (m,n) correspond to horizontal and vertical indices of a sub-aperture image;
coordinates (x,y) correspond to horizontal and vertical real (a priori non-integer) coordinates in a μ-image.
Besides, the following integers are defined:
K and L respectively denote the width and height of the μ-lens array. In the case of hexagonal patterns, one dimension is doubled so that every lens presents integer indices.
W and H respectively denote the width and height of the sensor.
M and N respectively denote the width and height of a μ-image. In the case of a square pattern, M=└W/K┘ and N=└H/L┘, where └.┘ denotes the floor function. In the case of a row-major hexagonal pattern, note that N=└2H/L┘; in the case of a column-major hexagonal pattern, note that M=└2W/K┘.
Usually, camera calibration provides the positions of the μ-centers {(c^x_{k,l}, c^y_{k,l}) ∈ ℝ²} for 1≤k≤K and 1≤l≤L. These positions are a priori not integers.
Pixel positions (i,j) can be turned into:
the indices (k,l) of the μ-image to which they belong;
their relative (a priori non-integer) position (x,y) ∈ ℝ² with regard to the corresponding μ-center (c^x_{k,l}, c^y_{k,l}).
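The conversion of a pixel position (i,j) into μ-image indices (k,l) and relative coordinates (x,y) can be sketched as follows. This is an illustration only: the helper name, the square-pattern assumption, and the layout of the calibration array are assumptions, not part of the disclosure:

```python
import numpy as np

def pixel_to_mu_coords(i, j, centers, M, N):
    """Turn a sensor pixel position (i, j) into the indices (k, l) of
    the mu-image it belongs to, plus its relative (a priori non-integer)
    position (x, y) with regard to the mu-center (c^x_{k,l}, c^y_{k,l}).

    centers: array of shape (L, K, 2) holding (c^x, c^y) per mu-image,
    typically provided by camera calibration (non-integer values).
    Assumes a square pattern with M x N pixel mu-images.
    """
    # Micro-image index, assuming mu-image (k, l) roughly covers
    # columns k*M .. k*M+M-1 and rows l*N .. l*N+N-1 of the sensor.
    k, l = int(i // M), int(j // N)
    cx, cy = centers[l, k]
    # Relative position with regard to the mu-center.
    return k, l, i - cx, j - cy
```

A pixel's color channel and intensity travel alongside these coordinates unchanged.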
Pixels also have a color channel (Red or Green or Blue, or Lightness, or Infra-red or whatever) and an intensity.
In the ideal case, each pixel of the sensor is associated with a unique view as detailed for example in the
Therefore, by formalizing and generalizing this observation, it appears that a relationship (also called equation 1 in the following) linking the raw data (i.e. the data recorded by the pixels) and sub-aperture images can be established:
R(i,j)=(1−α)(1−β)·V└y┘,└x┘(k,l)+α(1−β)·V└y┘,┌x┐(k,l)+(1−α)β·V┌y┐,└x┘(k,l)+αβ·V┌y┐,┌x┐(k,l)
R denoting the raw sensor picture;
(i,j)∈ℕ² the pixel position in the raw picture;
(k,l)∈ℕ² the corresponding μ-image indices in the raw picture;
(x,y)∈ℝ² the relative (a priori non-integer) pixel coordinates in the μ-image with regard to the μ-center position (c^x_{k,l}, c^y_{k,l})∈ℝ²;
└.┘ and ┌.┐ respectively denoting the floor and ceiling functions, and where α=x−└x┘ and β=y−└y┘.
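The four coefficients appearing in equation 1 can be computed, for illustration, as bilinear weights of the relative coordinates (x,y). The helper below is a hypothetical sketch, not the claimed method; it also handles the degenerate case where x or y is an integer (fewer than four views are mixed):

```python
import math

def mixing_weights(x, y):
    """Bilinear coefficients of the up-to-four views mixed into one
    sensor pixel, for relative coordinates (x, y) in a mu-image.
    Returns a dict mapping view indices (n, m), with n in
    {floor(y), ceil(y)} and m in {floor(x), ceil(x)}, to their
    coefficients; the coefficients sum to 1.
    """
    alpha, beta = x - math.floor(x), y - math.floor(y)
    m0, m1 = math.floor(x), math.ceil(x)
    n0, n1 = math.floor(y), math.ceil(y)
    weights = {}
    for (n, m), c in [((n0, m0), (1 - alpha) * (1 - beta)),
                      ((n0, m1), alpha * (1 - beta)),
                      ((n1, m0), (1 - alpha) * beta),
                      ((n1, m1), alpha * beta)]:
        # Accumulate so that integer x or y (duplicate indices) is handled.
        weights[(n, m)] = weights.get((n, m), 0.0) + c
    return weights
```

When (x,y) are both integers, a single weight of 1 remains, which recovers the ideal one-pixel-one-view case.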
Now let us consider both the raw image and the matrix of views as KLMN-row vectors.
Let also m and n respectively denote the integer parts of x and y: m=└x┘ and n=└y┘.
We can write down equation 1 as a matrix product R = A·V, with:
R[i,j] being the (j·KM+i)th line of vector R;
Vn,m[k,l] being the ((n·L+l)·KM+(m·K+k))th line of vector V;
and A being a square KLMN×KLMN matrix.
Some remarks concerning the matrix A can be made:
In the monochrome case, sub-aperture images can be recovered straightforwardly:
A−1·R=V
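For illustration, a dense toy version of this recovery can be sketched as follows. In practice A is large and sparse (at most four non-zero coefficients per row), so a solver would be preferred over forming A⁻¹ explicitly; the helper names below are assumptions:

```python
import numpy as np

def build_mixing_matrix(weights_per_pixel, size):
    """Assemble the (size x size) mixing matrix A: row p, associated
    with one raw pixel, holds the at most four non-zero bilinear
    coefficients of the views mixed into that pixel.
    weights_per_pixel: dict {raw_row_index: {view_column_index: coeff}}.
    """
    A = np.zeros((size, size))
    for p, row in weights_per_pixel.items():
        for q, c in row.items():
            A[p, q] = c
    return A

def recover_views(A, R):
    """Monochrome case: recover V from R = A.V by solving the linear
    system (numerically preferable to computing A^{-1} explicitly)."""
    return np.linalg.solve(A, R)
```

On a tiny synthetic example, rebuilding R = A·V from a known V and then recovering it round-trips exactly, up to floating-point precision.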
In the RGB case, R can be considered a KLMN×3 vector, whose coefficients are only partially known:
In this case, the color planes of the sub-aperture images can only be partially recovered, and de-mosaicking must then be performed.
In one embodiment of the disclosure, the coefficients of the mixing matrix A can be obtained from a calibration process.
The
In a step referenced 601, an electronic device obtains either a mixing matrix A or an inverse of said mixing matrix, said mixing matrix comprising coefficients that convey weighting information (related to different views) detailing the proportions of the views recorded by a pixel sensor.
In a step referenced 602, the electronic device executes a signal separation process on said raw data by using an inverse of the mixing matrix A. Hence, in the case where only the mixing matrix is obtained in step 601, an inversion step has to be performed by said electronic device.
Then, in a step referenced 603, the electronic device generates a matrix of views or a set of sub-aperture images to be stored in a memory unit and/or to be transmitted to another electronic device.
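The steps 601 to 603 can be sketched, in the monochrome case and with hypothetical names, as:

```python
import numpy as np

def demultiplex(raw_vector, A=None, A_inv=None):
    """Steps 601-603 in the monochrome case: given either the mixing
    matrix A or its inverse (step 601), apply the signal separation
    (step 602) and return the vectorized matrix of views (step 603).
    """
    if A_inv is None:
        # Step 601 provided only A: the inversion is performed here.
        A_inv = np.linalg.inv(A)
    # Step 602: signal separation as a matrix-vector product.
    return A_inv @ raw_vector
```

Either form of the input (A or its precomputed inverse) yields the same matrix of views.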
The
Such device referenced 700 comprises a computing unit (for example a CPU, for “Central Processing Unit”), referenced 701, and one or more memory units (for example a RAM (for “Random Access Memory”) block in which intermediate results can be stored temporarily during the execution of instructions of a computer program, or a ROM block in which, among other things, computer programs are stored, or an EEPROM (“Electrically-Erasable Programmable Read-Only Memory”) block, or a flash block) referenced 702. Computer programs are made of instructions that can be executed by the computing unit. Such device 700 can also comprise a dedicated unit, referenced 703, constituting an input-output interface to allow the device 700 to communicate with other devices. In particular, this dedicated unit 703 can be connected with an antenna (in order to perform contactless communications), or with serial ports (to carry “contact” communications). It should be noted that the arrows in the
In an alternative embodiment, some or all of the steps of the method previously described, can be implemented in hardware in a programmable FPGA (“Field Programmable Gate Array”) component or ASIC (“Application-Specific Integrated Circuit”) component.
In an alternative embodiment, some or all of the steps of the method previously described, can be executed on an electronic device comprising memory units and processing units as the one disclosed in the
Number | Date | Country | Kind |
---|---|---|---|
16306020 | Aug 2016 | EP | regional |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2017/068312 | 7/20/2017 | WO | 00 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2018/024490 | 2/8/2018 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
8890865 | Park et al. | Nov 2014 | B2 |
20140146184 | Meng et al. | May 2014 | A1 |
20170330339 | Seifi | Nov 2017 | A1 |
20180047185 | Boisson | Feb 2018 | A1 |
20180144492 | Vandame | May 2018 | A1 |
Number | Date | Country |
---|---|---|
1319415 | Jun 1993 | CA |
3094076 | Nov 2016 | EP |
2488905 | Dec 2013 | GB |
WO 2013167758 | Nov 2013 | WO |
Entry |
---|
Sabater et al., “Accurate Disparity Estimation for Plenoptic Images”, European Conference on Computer Vision Workshops 2014, Zurich, Switzerland, Sep. 6, 2014, pp. 548-560. |
Georgiev et al., “Superresolution with the Focused Plenoptic Camera”, IS&T/SPIE Electronic Imaging, San Francisco, California, USA, Jan. 23, 2011, 13 pages. |
Ng, R., “Digital Light Field Photography”, Stanford University, Department of Computer Science, Doctor of Philosophy Thesis, Jul. 2006, pp. 1-203. |
Bishop et al., “The Light Field Camera: Extended Depth of Field; Aliasing; and Superresolution”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, No. 5, May 2012, pp. 972-986. |
Cho et al., “Modeling the Calibration Pipeline of the Lytro Camera for High Quality Light-Field Image Reconstruction”, 2013 IEEE International Conference on Computer Vision, Sydney, Australia, Dec. 1, 2013, pp. 3280-3287. |
Bok et al., “Geometric Calibration of Micro-Lens-Based Light-Field Cameras Using Line Features”, European Conference on Computer Vision Workshops 2014, Zurich, Switzerland, Sep. 6, 2014, pp. 47-61. |
Ng et al., “Light Field Photography with a Hand-held Plenoptic Camera”, Stanford University Computer Science Technical Report, CSTR 2005-02, Apr. 2005, pp. 1-11. |
Be'Ery et al., “Blind Separation of Superimposed Shifted Images Using Parameterized Joint Diagonalization”, IEEE Transactions on Image Processing, vol. 17, No. 3, Mar. 2008, pp. 340-353. |
Number | Date | Country | |
---|---|---|---|
20190191142 A1 | Jun 2019 | US |