This disclosure relates to sensors and methods for acquiring image data.
Most cameras provide images in multiple different colours, such as red, green and blue (RGB). Each colour relates to a particular frequency band in the visible spectrum from about 400 nm to about 700 nm and an image sensor detects the intensity of light at these frequency bands. More particularly, the image sensor comprises an array of imaging elements and each imaging element is designated for one of the colours red, green and blue by placing a corresponding filter in front of that imaging element.
Imaging element 206 comprises a photodiode 208, a column selection transistor 210, an amplifier transistor 212 and a row activation transistor 214. The current through photodiode 208 depends on the amount of light that reaches it. Amplifier transistor 212 amplifies this current, and an image processor (not shown), connected to the amplifier output via row and column lines, measures the amplified signal and A/D converts it into a digital intensity signal representing the intensity of light reaching the photodiode. This digital intensity signal is referred to as a colour value for one pixel. A pixel is defined as a group of imaging elements, such as imaging element 206, such that each colour is represented at least once. The individual imaging elements are also referred to as sub-pixels. In many RGB sensors, there are two green, one red and one blue sub-pixels per pixel, making up one 2×2 square of imaging elements as indicated by the thick rectangle 108 in
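The grouping of sub-pixels into pixels described above can be sketched as follows. This is an illustrative example only, assuming an RGGB mosaic layout stored as a 2D array of digital intensity signals; the function name and layout are not taken from the specification.

```python
import numpy as np

def group_subpixels(raw):
    """Group a raw RGGB mosaic into per-pixel colour values.

    raw: 2D array of digital intensity signals, one per imaging element
    (sub-pixel). Each 2x2 block forms one output pixel; the two green
    sub-pixels are averaged. Layout is illustrative.
    """
    r = raw[0::2, 0::2].astype(float)               # red sub-pixel
    g = (raw[0::2, 1::2] + raw[1::2, 0::2]) / 2.0   # two green sub-pixels
    b = raw[1::2, 1::2].astype(float)               # blue sub-pixel
    return np.stack([r, g, b], axis=-1)             # shape (H/2, W/2, 3)

mosaic = np.arange(16).reshape(4, 4)
pixels = group_subpixels(mosaic)
print(pixels.shape)  # (2, 2, 3)
```

Each output pixel carries one red, one (averaged) green and one blue colour value, matching the 2×2 sub-pixel square described above.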
As can be seen in
A problem with the arrangement shown in
Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present disclosure as it existed before the priority date of each claim of this application.
Throughout this specification the word “comprise”, or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
A sensor for acquiring image data comprises:
It is an advantage that the alignment between lenses and filters is simplified because each lens and filter combination is associated with multiple imaging elements.
The sensor may further comprise a focusing element in front of the array of multiple lenses.
The focusing element may be configured such that when the sensor captures an image of a scene each of the multiple lenses projects the scene onto the multiple imaging elements associated with that lens.
The sensor may further comprise a processor to determine multispectral image data based on the intensity signals, the multispectral image data comprising, for each of multiple pixels of an output image, wavelength indexed image data.
The sensor may further comprise a processor to determine depth data indicative of a distance of an object from the sensor based on the intensity signals.
The more than one imaging elements associated with each of the multiple lenses may create an image associated with that lens and the processor may be configured to determine the depth data based on spatial disparities between images associated with different lenses.
All the intensity signals representing a part of the hyperspectral image data may be created by exactly one filter and exactly one of the multiple lenses.
The exactly one filter associated with each of the multiple lenses of the array may be a single integrated filter for all of the multiple lenses and the filter has a response that is variable across the filter.
The filter may be a colour filter and the part of the image data may be a spectral band of hyperspectral image data.
The filter may be a polariser and the part of the image data may be a part that is polarised in a direction of the polariser.
A method for acquiring image data comprises:
A method for determining image data comprises:
Software that, when installed on a computer, causes the computer to perform the above method.
A computer system for determining image data comprises:
Optional features described in relation to any aspect of the method, computer readable medium or computer system, where appropriate, similarly apply to the other aspects also described here.
An example will now be described with reference to:
There is disclosed herein a camera concept, which not only avoids this need for alignment but also delivers a depth estimate together with a hyperspectral image. Hyperspectral in this disclosure means more than three bands, that is, more than the three bands of red, green and blue that can be found in most cameras.
It is worth noting that the system presented here differs from other approaches in a number of ways. For example, the filters are not to be on the sensor but rather on the microlens array. This alleviates the alignment problem of the other cameras. Moreover, the configuration presented here can be constructed using any small form-factor sensor and does not require complex foundry or on-chip filter arrays.
Moreover, the system presented here is a low-cost, compact alternative to current hyperspectral imagers, which are often expensive and cumbersome to operate. This setting can also deliver the scene depth and has no major operating restrictions, as it has a small form factor and it is not expected to be limited to structured light or indoor settings. These are major advantages over existing systems.
Sensor 300 further comprises an array 304 of multiple lenses and a filter layer 306. An array is a two-dimensional structure that may follow a rectangular or quadratic grid pattern or other layouts, such as a hexagonal layout. Each of the multiple lenses is associated with more than one of the multiple imaging elements 302. In this example, each lens of array 304 is associated with nine (3×3) imaging elements as indicated by thick rectangle 308.
Further, each of the multiple lenses of the array 304 is associated with exactly one colour filter of filter layer 306 such that the intensity signals generated by the more than one of the multiple imaging elements associated with that lens represent a spectral band of the hyperspectral image data. Being associated in this context means that light that passes through the lens also passes through the filter associated with that lens and is then incident on the imaging element also associated with that lens.
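The association between imaging elements, lenses and filters described above can be sketched as a simple coordinate mapping. This is a hypothetical illustration: the 3×3 block size follows the example above, but the lens-array layout and the filter band labels are placeholders, not values from the specification.

```python
def lens_and_filter_for_element(row, col, elems_per_lens=3, filters=None):
    """Map an imaging-element coordinate to its lens and colour filter.

    Each lens covers an elems_per_lens x elems_per_lens block of imaging
    elements, and each lens is associated with exactly one filter, so all
    intensity signals under one lens represent one spectral band.
    """
    lens_row = row // elems_per_lens
    lens_col = col // elems_per_lens
    if filters is None:
        # hypothetical 2x2 lens array with one spectral band per lens
        filters = [["550nm", "600nm"], ["650nm", "700nm"]]
    return (lens_row, lens_col), filters[lens_row][lens_col]

lens, band = lens_and_filter_for_element(4, 1)
print(lens, band)  # (1, 0) 650nm
```

Because the block size is a parameter, the same mapping covers other lens-to-element ratios than the 3×3 example.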
In the example of
In another example, the colour filters of layer 306 are realised as an integrated filter layer for all of the multiple lenses of layer 304. The integrated filter may have a colour response that is variable across the colour filter, such as a gradient in the colour wavelengths from near infrared at one end of the filter to ultra-violet at the opposite end. Such a variable filter also has a unique response at any point. In one example, the integrated filter is a spatially variable response coating, such as provided by Research Electro-Optics, Inc. Boulder, Colo. (REO).
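A linearly variable filter of this kind can be modelled as a simple interpolation from position to centre wavelength. The end-point wavelengths below are illustrative values for a near-infrared to ultra-violet gradient, not data from the specification; real coatings differ.

```python
def filter_wavelength(x, width, lambda_start=1000.0, lambda_end=350.0):
    """Centre wavelength (nm) passed at horizontal position x of a
    linearly variable filter spanning `width` lens positions.

    lambda_start/lambda_end are assumed NIR and UV end points.
    """
    t = x / float(width - 1)           # 0.0 at one end, 1.0 at the other
    return lambda_start + t * (lambda_end - lambda_start)

# centre of a filter spanning nine lens positions
print(filter_wavelength(4, 9))  # 675.0
```

Because the response varies monotonically, every lens position samples a unique spectral band, which is what gives the filter its unique response at any point.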
Since the filters 306 are to be on the microlens array 304, each of these replicated views is wavelength resolved. These replicas are not identical, but rather shifted with respect to each other. The shift in the views is such that two pixels corresponding to the same image feature are expected to show paraxial shifts.
For example, the light beams from single point 402 all reach imaging elements 406, 408, 410, 412 and 414, which results in five different views of the same point 402 for five different wavelengths. It is noted that the optical paths illustrated by arrows in FIG. 4 are simplified for illustrative purposes. More accurate paths are provided further below.
The focal length equations of the system are those corresponding to a convex lens and a plano-convex lens in
where Equation 1 corresponds to the lens L1 508 and Equation 2 accounts for either of the two lenslets L2 504 or L3 506, respectively. The variables in the equations above correspond to the annotations in
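The equations themselves are given only by reference to the figure. A plausible form, assuming the standard lensmaker's equation for a biconvex lens and its plano-convex special case (an assumption, since the original equations are not reproduced in the text), is:

```latex
% Eq. 1 (assumed): biconvex lens L1 with surface radii R_1, R_2 and
% refractive index n
\frac{1}{f_1} = (n-1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)

% Eq. 2 (assumed): plano-convex lenslets L2, L3, where one surface is
% flat (R_2 \to \infty)
\frac{1}{f_2} = \frac{n-1}{R}
```

Under this reading, Equation 2 is the limit of Equation 1 as the second surface becomes flat, which is consistent with the lenslets being plano-convex.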
The wavefront simulation for a single lenslet is shown in
Note that the wavefront in
In another example, the proposed system also acquires depth information. Note that, as the lenslets shift off-centre, a point in the scene also shifts over the respective tiled views on the image sensor.
In one example, the sensor comprises two different cameras. The first of these may be a Flea 1 FireWire camera. The second one may be a Flea 2. Both are manufactured by Point Grey.
Note that, in
In one example, the computer system 1100 is integrated into a handheld device such as a consumer or surveillance camera and the scene 1105 may be any scene on the earth, such as a tourist attraction or a person, or a remote surveillance scenario.
The computer 1104 receives intensity signals from the sensor 1102 via a data port 1106 and processor 1110 stores the signals in data memory 1108(b). The processor 1110 uses software stored in program memory 1108(a) to perform the method of receiving intensity signals and determining hyperspectral image data based on the received intensity signals. The program memory 1108(a) is a non-transitory computer readable medium, such as a hard drive, a solid state disk or CD-ROM.
Software stored on program memory 1108(a) may cause processor 1110 to generate a user interface that can be presented to the user on a monitor 1112. The user interface is able to accept input from the user (e.g. via a touch screen). The monitor 1112 provides the user input to the input/output port 1106 in the form of interrupt and data signals. The sensor data and the multispectral or hyperspectral image data may be stored in memory 1108(b) by the processor 1110. In this example the memory 1108(b) is local to the computer 1104, but alternatively could be remote to the computer 1104.
The processor 1110 may receive data, such as sensor signals, from data memory 1108(b) as well as from the communications port 1106. In one example, the processor 1110 receives sensor signals from the sensor 1102 via communications port 1106, such as by using a Wi-Fi network according to IEEE 802.11. The Wi-Fi network may be a decentralised ad-hoc network, such that no dedicated management infrastructure, such as a router, is required, or a centralised network with a router or access point managing the network.
In one example, the processor 1110 receives and processes the sensor data in real time. This means that the processor 1110 determines multispectral or hyperspectral image data every time the image data is received from sensor 1102 and completes this calculation before the sensor 1102 sends the next sensor data update. This may be useful in a video application with a framerate of 60 fps.
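The real-time constraint above amounts to finishing each reconstruction within one frame period. A minimal sketch of that budget check, with an assumed function name and illustrative timings:

```python
def meets_realtime_budget(processing_seconds, fps=60):
    """True if one hyperspectral reconstruction finishes within a frame
    period, i.e. before the sensor sends the next update."""
    frame_period = 1.0 / fps  # ~16.7 ms at 60 fps
    return processing_seconds < frame_period

print(meets_realtime_budget(0.012))  # True: fits within the frame period
print(meets_realtime_budget(0.020))  # False: misses the next update
```

At 60 fps the frame period is about 16.7 ms, so a 12 ms reconstruction keeps up while a 20 ms one does not.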
Although communications port 1106 is shown as single entity, it is to be understood that any kind of data port may be used to receive data, such as a network connection, a memory interface, a pin of the chip package of processor 1110, or logical ports, such as IP sockets or parameters of functions stored on program memory 1108(a) and executed by processor 1110. These parameters may be stored on data memory 1108(b) and may be handled by-value or by-reference, that is, as a pointer, in the source code.
The processor 1110 may receive data through all these interfaces, which includes memory access of volatile memory, such as cache or RAM, or non-volatile memory, such as an optical disk drive, hard disk drive, storage server or cloud storage. The computer system 1104 may further be implemented within a cloud computing environment, such as a managed group of interconnected servers hosting a dynamic number of virtual machines.
It is to be understood that any receiving step may be preceded by the processor 1110 determining or computing the data that is later received. For example, the processor 1110 determines the sensor data, such as by filtering the raw data from sensor 1102, and stores the filtered sensor data in data memory 1108(b), such as RAM or a processor register. The processor 1110 then requests the data from the data memory 1108(b), such as by providing a read signal together with a memory address. The data memory 1108(b) provides the data as a voltage signal on a physical bit line and the processor 1110 receives the sensor data via a memory interface.
Processor 1110 receives image sensor signals, which relate to the raw data from the sensor 1102. After receiving the image sensor signals, processor 1110 may perform different image processing and computer vision techniques to recover the scene depth and reconstruct the hyperspectral image cube. That is, processor 1110 performs these techniques to determine, for each pixel location, multiple wavelength indexed image values. In addition to these image values that make up the hyperspectral image cube, processor 1110 may also determine for each pixel a distance value indicative of the distance of the object from the sensor 1102. In other words, the distance values of all pixels may be seen as a greyscale depth map where white indicates very near objects and black indicates very far objects.
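The greyscale depth-map convention above (white near, black far) can be sketched as a normalisation step. This is an illustrative helper, not a method from the specification:

```python
import numpy as np

def depth_to_greyscale(depth):
    """Normalise per-pixel distance values to an 8-bit greyscale map in
    which white (255) is nearest and black (0) is farthest."""
    depth = np.asarray(depth, dtype=float)
    near, far = depth.min(), depth.max()
    if far == near:
        return np.full(depth.shape, 255, dtype=np.uint8)
    # invert so that small distances map to bright values
    return np.round(255 * (far - depth) / (far - near)).astype(np.uint8)

dmap = depth_to_greyscale([[1.0, 2.0], [3.0, 5.0]])
print(dmap)
```

The nearest pixel (distance 1.0) maps to 255 (white) and the farthest (distance 5.0) to 0 (black), as in the convention described above.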
The image processing techniques may range from deblurring methods, where the mask is adapted from tile to tile, to image enhancement through the use of the centre tiles (which do not suffer from serious blur) with methods such as gradient transfer as described in P. Perez, M. Gangnet, and A. Blake. Poisson image editing. ACM Trans. Graph., 22(3):313-318, 2003, which is incorporated herein by reference. Processor 1110 may also perform super-resolution techniques or determine depth estimates to improve photometric parameter recovery methods such as that in C. P. Huynh and A. Robles-Kelly. Simultaneous photometric invariance and shape recovery. In International Conference on Computer Vision, 2009, which is incorporated herein by reference. Further, as noted earlier, the proposed configuration is a low-cost alternative to other hyperspectral cameras. The Flea cameras may have a resolution between 0.7 and 5 MP. These can be substituted with cameras with a much greater resolution, such as the Basler Ace USB 3.0 camera with a 15 MP resolution.
In one example, processor 1110 may apply machine learning, data driven approaches to determine or learn parameters of a known transformation. That is, processor 1110 solves equations over a large amount of data, that is, data from the image sensor representing the parallax shift of a known object. In other words, the disparities between the images in tiles of different wavelengths relate to respective equations, and processor 1110 solves these equations or optimises error functions to determine the best fit of the camera parameters to the observed data.
In particular, it may be difficult to manufacture lenslets with a particular focal length. So instead, processor 1110 may learn the focal length and thereby account for manufacturing variation in each individual image sensor. The learned result may include inverted depth, pitch, thickness and index of refraction, and based on f1 processor 1110 determines f2. With these camera parameters, processor 1110 can apply inverted-parallax algorithms to take advantage of the depth-dependent disparity between image tiles. That is, processor 1110 uses the disparity between two different images and recovers depth from that disparity based on triangulation.
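The triangulation step can be sketched with the standard stereo relation Z = f·b/d. This is a minimal illustration under assumed parameter names; the focal length would be the learned value discussed above, and the baseline the spacing between lenslet views:

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_mm):
    """Recover depth by triangulation from the disparity between two
    lenslet views: Z = f * b / d.

    focal_length_px: (possibly learned) focal length in pixels.
    baseline_mm: assumed spacing between the two lenslet views.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_mm / disparity_px

# e.g. a 4-pixel shift with f = 800 px and a 2 mm lenslet baseline
print(depth_from_disparity(4.0, 800.0, 2.0))  # 400.0 mm
```

Note the inverse relation: nearer objects produce larger disparities between tiles, which is the depth-dependent disparity the processor exploits.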
Processor 1110 may further perform a stereo vision method as described in L. Boyer, A. C. Kak, Structural stereopsis for 3-D vision, IEEE Trans. Pattern Anal. Machine Intell. 10 (1988), 144-16, which is incorporated herein by reference. This is applicable since each of the imaging elements acquires a displaced image whose parameters are determined by the lens equation and the position of the lenses with respect to the camera plane. Thus, each of the acquired scenes is one of the displaced views that processor 1110 then uses in a manner akin to stereo vision to recover depth.
The spectra for each pixel may be stored in a compact form as described in PCT/AU2009/000793.
Processor 1110 may recover the illumination spectrum from the hyperspectral image data as described in PCT/AU2010/001000, which is incorporated herein by reference, or may determine colour values as described in PCT/AU2012/001352, which is incorporated herein by reference, or cluster the image data as described in PCT/AU2014/000491, which is incorporated herein by reference. Processor 1110 may also process the hyperspectral image data as described in PCT/AU2015/050052, which is incorporated herein by reference.
Processor 1110 may decompose the image data into material spectra as described in U.S. Pat. No. 8,670,620, which is incorporated herein by reference.
It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the specific embodiments without departing from the scope as defined in the claims.
It should be understood that the techniques of the present disclosure might be implemented using a variety of technologies. For example, the methods described herein may be implemented by a series of computer executable instructions residing on a suitable computer readable medium. Suitable computer readable media may include volatile (e.g. RAM) and/or non-volatile (e.g. ROM, disk) memory, carrier waves and transmission media. Exemplary carrier waves may take the form of electrical, electromagnetic or optical signals conveying digital data streams along a local network or a publicly accessible network such as the internet.
It should also be understood that, unless specifically stated otherwise as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms such as “estimating” or “processing” or “computing” or “calculating”, “optimizing” or “determining” or “displaying” or “maximising” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that processes and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.
Number | Date | Country | Kind
---|---|---|---
2016900098 | Jan 2016 | AU | national

Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/AU2017/050020 | 1/12/2017 | WO | 00