The present invention relates to the field of compressive imaging, and more particularly, to mechanisms for accelerating the rate at which compressive imaging devices can acquire and reconstruct sequences of images.
According to Nyquist theory, a signal x(t) whose signal energy is supported on the frequency interval [−B,B] may be reconstructed from samples {x(nTS)} of the signal x(t), provided the rate fS=1/TS at which the samples are captured is sufficiently high, i.e., provided that fS is greater than 2B. Similarly, for a signal whose signal energy is supported on the frequency interval [A,B], the signal may be reconstructed from samples captured with sample rate greater than B−A. A fundamental problem with any attempt to capture a signal x(t) according to Nyquist theory is the large number of samples that are generated, especially when B (or B−A) is large. The large number of samples is taxing on memory resources and on the capacity of transmission channels.
Nyquist theory is not limited to functions of time. Indeed, Nyquist theory applies more generally to any function of one or more real variables. For example, Nyquist theory applies to functions of two spatial variables such as images, to functions of time and two spatial variables such as video, and to the functions used in multispectral imaging, hyperspectral imaging, medical imaging and a wide variety of other applications. In the case of an image I(x,y) that depends on spatial variables x and y, the image may be reconstructed from samples of the image, provided the samples are captured with sufficiently high spatial density. For example, given samples {I(nΔx,mΔy)} captured along a rectangular grid, the horizontal and vertical densities 1/Δx and 1/Δy should be respectively greater than 2Bx and 2By, where Bx and By are the highest x and y spatial frequencies occurring in the image I(x,y). The same problem of overwhelming data volume is experienced when attempting to capture an image according to Nyquist theory. The modern theory of compressive sensing is directed to such problems.
Compressive sensing relies on the observation that many signals (e.g., images or video sequences) of practical interest are not only band-limited but also sparse or approximately sparse when represented using an appropriate choice of transformation, for example, a transformation such as a Fourier transform, a wavelet transform or a discrete cosine transform (DCT). A signal vector v is said to be K-sparse with respect to a given transformation T when the transformation of the signal vector, Tv, has no more than K non-zero coefficients. A signal vector v is said to be sparse with respect to a given transformation T when it is K-sparse with respect to that transformation for some integer K much smaller than the number L of components in the transformation vector Tv.
A signal vector v is said to be approximately K-sparse with respect to a given transformation T when the coefficients of the transformation vector, Tv, are dominated by the K largest coefficients (i.e., largest in the sense of magnitude or absolute value). In other words, if the K largest coefficients account for a high percentage of the energy in the entire set of coefficients, then the signal vector v is approximately K-sparse with respect to transformation T. A signal vector v is said to be approximately sparse with respect to a given transformation T when it is approximately K-sparse with respect to the transformation T for some integer K much less than the number L of components in the transformation vector Tv.
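The definition of approximate K-sparsity can be illustrated with a minimal sketch. The function names, the coefficient values and the 99% energy threshold below are assumptions chosen for illustration; the text itself only requires that the K largest coefficients account for "a high percentage" of the energy.

```python
def energy_fraction(coeffs, k):
    """Fraction of total energy carried by the k largest-magnitude coefficients."""
    energies = sorted((abs(c) ** 2 for c in coeffs), reverse=True)
    total = sum(energies)
    if total == 0:
        return 1.0  # an all-zero vector is trivially sparse
    return sum(energies[:k]) / total


def is_approximately_k_sparse(coeffs, k, threshold=0.99):
    # threshold is an illustrative assumption, not a value from the text
    return energy_fraction(coeffs, k) >= threshold


# Hypothetical transform coefficients Tv: two dominant values, six small ones.
tv = [10.0, -8.0, 0.1, 0.05, -0.1, 0.02, 0.0, 0.01]
print(energy_fraction(tv, 2))
print(is_approximately_k_sparse(tv, 2))
print(is_approximately_k_sparse(tv, 1))
```

Under these assumed values the vector is approximately 2-sparse but not approximately 1-sparse, since the single largest coefficient carries only about 61% of the energy.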
Given a sensing device that captures images with N samples per image and in conformity to the Nyquist condition on spatial rates, it is often the case that there exists some transformation and some integer K very much smaller than N such that the transform of each captured image will be approximately K-sparse. The set of K dominant coefficients may vary from one image to the next. Furthermore, the value of K and the selection of the transformation may vary from one context (e.g., imaging application) to the next. Examples of typical transforms that might work in different contexts include the Fourier transform, the wavelet transform, the DCT, the Gabor transform, etc.
Compressive sensing specifies a way of operating on the N samples of an image so as to generate a much smaller set of samples from which the N samples may be reconstructed, given knowledge of the transform under which the image is sparse (or approximately sparse). In particular, compressive sensing invites one to think of the N samples as a vector v in an N-dimensional space and to imagine projecting the vector v onto each vector in a series of M vectors {R(i): i=1, 2, . . . , M} in the N-dimensional space, where M is larger than K but still much smaller than N. Each projection gives a corresponding real number S(i), e.g., according to the expression
S(i)=<v,R(i)>,
where the notation <v,R(i)> represents the inner product (or dot product) of the vector v and the vector R(i). Thus, the series of M projections gives a vector U including M real numbers: Ui=S(i). Compressive sensing theory further prescribes methods for reconstructing (or estimating) the vector v of N samples from the vector U of M real numbers. For example, according to one method, one should determine the vector x that has the smallest length (in the sense of the L1 norm) subject to the condition that ΦTx=U, where Φ is a matrix whose rows are the transposes of the vectors R(i), and where T is the transformation under which the image is K-sparse or approximately K-sparse.
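For reference, the L1 minimization described above may be written compactly as follows. This is only a restatement in the notation of the text; the symbol \hat{x} for the minimizer is a label introduced here.

```latex
\hat{x} \;=\; \arg\min_{x} \; \|x\|_{1}
\quad \text{subject to} \quad \Phi T x = U,
\qquad \text{where} \quad \|x\|_{1} = \sum_{j} |x_{j}|
\quad \text{and} \quad U_{i} = S(i) = \langle v, R(i) \rangle, \; i = 1, 2, \ldots, M .
```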
Compressive sensing is important because, among other reasons, it allows reconstruction of an image based on M measurements instead of the much larger number of measurements N recommended by Nyquist theory. Thus, for example, a compressive sensing camera would be able to capture a significantly larger number of images for a given size of image store, and/or, transmit a significantly larger number of images per unit time through a communication channel of given capacity.
As mentioned above, compressive sensing operates by projecting the image vector v onto a series of M vectors. As discussed in U.S. Pat. No. 8,199,244, issued Jun. 12, 2012 (invented by Baraniuk et al.) and illustrated in
The compressive sensing is implemented by driving the orientations of the micromirrors through a series of spatial patterns. Each spatial pattern specifies an orientation state for each of the micromirrors. The output signal of the photodiode is digitized by an A/D converter 70. In this fashion, the imaging device is able to capture a series of measurements {S(i)} that represent inner products (dot products) between the incident light field and the series of spatial patterns without first acquiring the incident light field as a pixelized digital image. The incident light field corresponds to the vector v of the discussion above, and the spatial patterns correspond to the vectors R(i) of the discussion above.
The incident light field may be modeled by a function I(x,y,t) of two spatial variables and time. Assuming for the sake of discussion that the DMD comprises a rectangular array, the DMD implements a spatial modulation of the incident light field so that the light field leaving the DMD in the direction of the lens 50 might be modeled by
{I(nΔx,mΔy,t)*M(n,m,t)}
where m and n are integer indices, where I(nΔx,mΔy,t) represents the portion of the light field that is incident upon that (n,m)th mirror of the DMD at time t. The function M(n,m,t) represents the orientation of the (n,m)th mirror of the DMD at time t. At sampling times, the function M(n,m,t) equals one or zero, depending on the state of the digital control signal that controls the (n,m)th mirror. The condition M(n,m,t)=1 corresponds to the orientation state that reflects onto the path that leads to the lens 50. The condition M(n,m,t)=0 corresponds to the orientation state that reflects away from the lens 50.
The lens 50 concentrates the spatially-modulated light field
{I(nΔx,mΔy,t)*M(n,m,t)}
onto a light sensitive surface of the photodiode. Thus, the lens and the photodiode together implement a spatial summation of the light portions in the spatially-modulated light field:
S(t)=Σn,m I(nΔx,mΔy,t)*M(n,m,t),
where the sum is taken over all mirror indices (n,m).
Signal S(t) may be interpreted as the intensity at time t of the concentrated spot of light impinging upon the light sensing surface of the photodiode. The A/D converter captures measurements of S(t). In this fashion, the compressive sensing camera optically computes an inner product of the incident light field with each spatial pattern imposed on the mirrors. The multiplication portion of the inner product is implemented by the mirrors of the DMD. The summation portion of the inner product is implemented by the concentrating action of the lens and also the integrating action of the photodiode.
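The optical inner product described above can be mimicked numerically. The following is a minimal sketch with hypothetical intensity values and a hypothetical 2x3 mirror array; the function names are illustrative and do not correspond to any element of the device.

```python
def modulate(light_field, pattern):
    """Element-wise product I(n,m) * M(n,m): the light sent toward the lens."""
    return [[i * m for i, m in zip(row_i, row_m)]
            for row_i, row_m in zip(light_field, pattern)]


def detector_signal(light_field, pattern):
    """Spatial summation performed by the lens and the photodiode."""
    return sum(sum(row) for row in modulate(light_field, pattern))


# Hypothetical light field (arbitrary intensity units) at one sampling time.
light_field = [[1.0, 2.0, 3.0],
               [4.0, 5.0, 6.0]]
# One spatial pattern: 1 = mirror reflects toward the lens, 0 = away from it.
pattern = [[1, 0, 1],
           [0, 1, 0]]

print(detector_signal(light_field, pattern))  # 1 + 3 + 5 = 9.0
```

The multiplication step corresponds to the action of the mirrors, and the summation step corresponds to the lens and photodiode.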
In a compressive imaging system, an incident light stream from a scene under observation is modulated with a time sequence of spatial patterns, and the modulated light stream is sensed with a light detector. The electrical signal generated by the light detector is sampled by an analog-to-digital converter to acquire a sequence of samples. The sequence of samples may comprise a compressed representation of a sequence of images carried by the incident light stream. The samples may be partitioned into non-overlapping subsets of M samples each. As shown in
One of the challenges in compressive imaging is acquiring the sample subsets fast enough to compete with conventional cameras that employ array-based light sensors (such as focal plane arrays, CCD arrays, etc.), and therefore, are able to non-compressively acquire images at video rates. In a compressive imaging system, the rate of acquisition of the sample subsets may be limited by the rate at which the light modulator can be reconfigured. For example, one of the digital micromirror devices (DMDs) supplied by Texas Instruments has a maximum pattern modulation rate of about 32,000 patterns per second. Under the assumption of a 10% compressive sensing ratio, the compressive imaging system would have to collect 100,000 samples in order to effectively capture a one megapixel image. The 100,000 samples would be algorithmically processed to reconstruct the one megapixel image. However, approximately three seconds (100,000 samples at 32,000 patterns per second) are required to collect the 100,000 samples, assuming the 32 kHz pattern modulation rate. Thus, there exists a need for mechanisms capable of increasing the rate at which reconstructed images can be generated.
It should be understood that the specific numbers (32 kHz, 100,000 samples, one megapixel, 10% compressive sensing ratio) given above are only for the sake of illustration and are not meant to be limiting to the scope of the inventions herein claimed.
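The acquisition-time arithmetic in the example above can be reproduced with a short sketch. The function names are illustrative, and the numbers are the same illustrative figures used in the text.

```python
def samples_per_image(n_pixels, sensing_ratio):
    """Number of compressive samples needed for an image of n_pixels."""
    return round(n_pixels * sensing_ratio)


def acquisition_time_s(n_samples, pattern_rate_hz):
    """Time to collect n_samples when one sample is taken per pattern."""
    return n_samples / pattern_rate_hz


n_samples = samples_per_image(1_000_000, 0.10)       # one megapixel, 10% ratio
seconds = acquisition_time_s(n_samples, 32_000)      # 32,000 patterns per second
print(n_samples, seconds)  # 100000 3.125
```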
A method for reconstructing a sequence of images from a compressively-acquired sequence of measurements may involve the following operations.
The method may include modulating an incident light stream with a sequence of spatial patterns to obtain a modulated light stream, where the modulation includes applying the spatial patterns to the incident light stream successively in time. A sequence of measurements is acquired from a light sensing device. The sequence of measurements represents intensity of the modulated light stream over time. Each of the measurements is acquired in response to the application of a respective one of the spatial patterns to the incident light stream.
The method may also include generating a sequence of subsets of the intensity measurements. Each consecutive pair of the subsets overlap by a nonzero amount. Each of the subsets corresponds to a respective group of the spatial patterns.
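One way to picture the overlapping subsets is as a sliding window over the measurement sequence. The sketch below is an assumption-laden illustration: the function name, the fixed window length m and the fixed stride are choices made here, not requirements stated in the text.

```python
def overlapping_subsets(measurements, m, stride):
    """Return windows of m consecutive measurements, advancing by stride.

    With 0 < stride < m, each consecutive pair of windows shares
    m - stride measurements.
    """
    if not 0 < stride < m:
        raise ValueError("stride must satisfy 0 < stride < m for overlap")
    return [measurements[s:s + m]
            for s in range(0, len(measurements) - m + 1, stride)]


samples = list(range(10))  # stand-ins for ten intensity measurements
windows = overlapping_subsets(samples, m=4, stride=2)
print(windows)
# [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```

Applying the same index windows to the sequence of spatial patterns yields the group of patterns that corresponds to each subset. In this sketch a new subset becomes available every `stride` measurements rather than every `m` measurements, which is how overlap can raise the rate at which images are reconstructed.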
The method may also include reconstructing a sequence of images, where each of the images is reconstructed from a respective input data set including a respective one of the subsets of intensity measurements and a respective one of the groups of the spatial patterns. The sequence of images may be displayed using a display device.
In some embodiments, the input data set for the reconstruction of a current image of the image sequence may also include a previously-reconstructed image of the image sequence. The previously-reconstructed image serves as an initial estimate for the current image, and, may allow the reconstruction algorithm to converge faster to its final estimate for the current image than if no initial estimate were provided.
In some embodiments, the input data set for the reconstruction of a current image of the image sequence may also include a partially-reconstructed version of a previous image of the image sequence.
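The benefit of supplying an initial estimate can be illustrated with a toy iterative solver. The sketch below uses a plain Landweber (gradient) iteration on a small least-squares problem; it is not a compressive sensing reconstruction algorithm and is not taken from the text. The matrix, step size and starting points are hypothetical. It only shows that an iterative method started near the solution may need fewer iterations than one started from zero.

```python
def landweber(a, b, x0, step, tol=1e-6, max_iters=10_000):
    """Iterate x <- x + step * A^T (b - A x) until the residual norm < tol.

    Returns (estimate, number of iterations used).
    """
    rows, cols = len(a), len(a[0])
    x = list(x0)
    for k in range(max_iters):
        residual = [b[i] - sum(a[i][j] * x[j] for j in range(cols))
                    for i in range(rows)]
        if sum(r * r for r in residual) ** 0.5 < tol:
            return x, k
        for j in range(cols):
            x[j] += step * sum(a[i][j] * residual[i] for i in range(rows))
    return x, max_iters


a = [[2.0, 0.0],
     [0.0, 1.0]]
b = [2.0, 2.0]            # consistent with the solution x = [1, 2]

# Cold start: no initial estimate.
_, cold_iters = landweber(a, b, x0=[0.0, 0.0], step=0.2)
# Warm start: a hypothetical "previous image" that is close to the current one.
_, warm_iters = landweber(a, b, x0=[1.05, 1.9], step=0.2)

print(cold_iters, warm_iters)
```

In a video setting, consecutive images are often similar, so the previous reconstruction plays the role of the warm starting point.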
Various additional embodiments are described in U.S. Provisional Application No. 61/502,153, filed on Jun. 28, 2011, entitled “Various Compressive Sensing Mechanisms”.
A better understanding of the present invention can be obtained when the following detailed description of the preferred embodiments is considered in conjunction with the following drawings.
While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and are herein described in detail. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the present invention as defined by the appended claims.
Terminology
A memory medium is a non-transitory medium configured for the storage and retrieval of information. Examples of memory media include: various kinds of semiconductor-based memory such as RAM and ROM; various kinds of magnetic media such as magnetic disk, tape, strip and film; various kinds of optical media such as CD-ROM and DVD-ROM; various media based on the storage of electrical charge and/or any of a wide variety of other physical quantities; media fabricated using various lithographic techniques; etc. The term “memory medium” includes within its scope of meaning the possibility that a given memory medium might be a union of two or more memory media that reside at different locations, e.g., on different chips in a system or on different computers in a network.
A computer-readable memory medium may be configured so that it stores program instructions and/or data, where the program instructions, if executed by a computer system, cause the computer system to perform a method, e.g., any of the method embodiments described herein, or, any combination of the method embodiments described herein, or, any subset of any of the method embodiments described herein, or, any combination of such subsets.
A computer system is any device (or combination of devices) having at least one processor that is configured to execute program instructions stored on a memory medium. Examples of computer systems include personal computers (PCs), workstations, laptop computers, tablet computers, mainframe computers, server computers, client computers, network or Internet appliances, hand-held devices, mobile devices, personal digital assistants (PDAs), computer-based television systems, grid computing systems, wearable computers, computers implanted in living organisms, computers embedded in head-mounted displays, computers embedded in sensors forming a distributed network, etc.
A programmable hardware element (PHE) is a hardware device that includes multiple programmable function blocks connected via a system of programmable interconnects. Examples of PHEs include FPGAs (Field Programmable Gate Arrays), PLDs (Programmable Logic Devices), FPOAs (Field Programmable Object Arrays), and CPLDs (Complex PLDs). The programmable function blocks may range from fine grained (combinatorial logic or look up tables) to coarse grained (arithmetic logic units or processor cores).
As used herein, the term “light” is meant to encompass within its scope of meaning any electromagnetic radiation whose spectrum lies within the wavelength range [λL, λU], where the wavelength range includes the visible spectrum, the ultra-violet (UV) spectrum, infrared (IR) spectrum and the terahertz (THz) spectrum. Thus, for example, visible radiation, or UV radiation, or IR radiation, or THz radiation, or any combination thereof is “light” as used herein.
In some embodiments, a computer system may be configured to include a processor (or a set of processors) and a memory medium, where the memory medium stores program instructions, where the processor is configured to read and execute the program instructions stored in the memory medium, where the program instructions are executable by the processor to implement a method, e.g., any of the various method embodiments described herein, or, any combination of the method embodiments described herein, or, any subset of any of the method embodiments described herein, or, any combination of such subsets.
System 100 for Operating on Light
A system 100 for operating on light may be configured as shown in
The light modulation unit 110 is configured to modulate a received stream of light L with a series of spatial patterns in order to produce a modulated light stream (MLS). The spatial patterns of the series may be applied sequentially to the light stream so that successive time slices of the light stream are modulated, respectively, with successive ones of the spatial patterns. (The action of sequentially modulating the light stream L with the spatial patterns imposes the structure of time slices on the light stream.) The light modulation unit 110 includes a plurality of light modulating elements configured to modulate corresponding portions of the light stream. Each of the spatial patterns specifies an amount (or extent or value) of modulation for each of the light modulating elements. Mathematically, one might think of the light modulation unit's action of applying a given spatial pattern as performing an element-wise multiplication of a light field vector (xij) representing a time slice of the light stream L by a vector of scalar modulation values (mij) to obtain a time slice of the modulated light stream: (mij)*(xij)=(mij*xij). The vector (mij) is specified by the spatial pattern. Each light modulating element effectively scales (multiplies) the intensity of its corresponding light stream portion by the corresponding scalar factor.
The light modulation unit 110 may be realized in various ways. In some embodiments, the LMU 110 may be realized by a plurality of mirrors (e.g., micromirrors) whose orientations are independently controllable. In another set of embodiments, the LMU 110 may be realized by an array of elements whose transmittances are independently controllable, e.g., as with an array of LCD shutters. An electrical control signal supplied to each element controls the extent to which light is able to transmit through the element. In yet another set of embodiments, the LMU 110 may be realized by an array of independently-controllable mechanical shutters (e.g., micromechanical shutters) that cover an array of apertures, with the shutters opening and closing in response to electrical control signals, thereby controlling the flow of light through the corresponding apertures. In yet another set of embodiments, the LMU 110 may be realized by a perforated mechanical plate, with the entire plate moving in response to electrical control signals, thereby controlling the flow of light through the corresponding perforations. In yet another set of embodiments, the LMU 110 may be realized by an array of transceiver elements, where each element receives and then immediately retransmits light in a controllable fashion. In yet another set of embodiments, the LMU 110 may be realized by a grating light valve (GLV) device. In yet another embodiment, the LMU 110 may be realized by a liquid-crystal-on-silicon (LCOS) device.
In some embodiments, the light modulating elements are arranged in an array, e.g., a two-dimensional array or a one-dimensional array. Any of various array geometries are contemplated. For example, in some embodiments, the array is a square array or rectangular array. In another embodiment, the array is hexagonal. In some embodiments, the light modulating elements are arranged in a spatially random fashion.
Let N denote the number of light modulating elements in the light modulation unit 110. In various embodiments, the number N may take a wide variety of values. For example, in different sets of embodiments, N may be, respectively, in the range [64, 256], in the range [256, 1024], in the range [1024, 4096], in the range [2^12, 2^14], in the range [2^14, 2^16], in the range [2^16, 2^18], in the range [2^18, 2^20], in the range [2^20, 2^22], in the range [2^22, 2^24], in the range [2^24, 2^26], in the range from 2^26 to infinity. The particular value used in any given embodiment may depend on one or more factors specific to the embodiment.
The light sensing device 130 may be configured to receive the modulated light stream MLS and to generate an analog electrical signal IMLS(t) representing intensity of the modulated light stream as a function of time.
The light sensing device 130 may include one or more light sensing elements. The term “light sensing element” may be interpreted as meaning “a transducer between a light signal and an electrical signal”. For example, a photodiode is a light sensing element. In various other embodiments, light sensing elements might include devices such as metal-semiconductor-metal (MSM) photodetectors, phototransistors, phototubes and photomultiplier tubes.
In some embodiments, the light sensing device 130 includes one or more amplifiers (e.g., transimpedance amplifiers) to amplify the analog electrical signals generated by the one or more light sensing elements.
The ADC 140 acquires a sequence of samples {IMLS(k)} of the analog electrical signal IMLS(t). Each of the samples may be interpreted as an inner product between a corresponding time slice of the light stream L and a corresponding one of the spatial patterns. The set of samples {IMLS(k)} comprises an encoded representation, e.g., a compressed representation, of an image (or a video sequence) and may be used to reconstruct the image (or video sequence) based on any reconstruction algorithm known in the field of compressive sensing. (For video sequence reconstruction, the samples may be partitioned into contiguous subsets, and then the subsets may be processed to reconstruct corresponding images.)
In some embodiments, the samples {IMLS(k)} may be used for some purpose other than, or in addition to, image (or video) reconstruction. For example, system 100 (or some other system) may operate on the samples to perform an inference task, such as detecting the presence of a signal or object, identifying a signal or an object, classifying a signal or an object, estimating one or more parameters relating to a signal or an object, tracking a signal or an object, etc. In some embodiments, an object under observation by system 100 may be identified or classified by virtue of its sample set {IMLS(k)} (or parameters derived from that sample set) being similar to one of a collection of stored sample sets (or parameter sets).
In some embodiments, the light sensing device 130 includes exactly one light sensing element. (For example, the single light sensing element may be a photodiode.) The light sensing element may couple to an amplifier, e.g., a transimpedance amplifier (TIA) or a multi-stage amplifier.
In some embodiments, the light sensing device 130 may include a plurality of light sensing elements (e.g., photodiodes). Each light sensing element may convert light impinging on its light sensing surface into a corresponding analog electrical signal representing intensity of the impinging light as a function of time. In some embodiments, each light sensing element may couple to a corresponding amplifier so that the analog electrical signal produced by the light sensing element can be amplified prior to digitization. System 100 may be configured so that each light sensing element receives, e.g., a corresponding spatial portion (or spectral portion) of the modulated light stream.
In one embodiment, the analog electrical signals produced, respectively, by the light sensing elements may be summed to obtain a sum signal. The sum signal may then be digitized by the ADC 140 to obtain the sequence of samples {IMLS(k)}. In another embodiment, the analog electrical signals may be individually digitized, each with its own ADC, to obtain corresponding sample sequences. The sample sequences may then be added to obtain the sequence {IMLS(k)}. In another embodiment, the analog electrical signals produced by the light sensing elements may be sampled by a smaller number of ADCs than light sensing elements through the use of time multiplexing. For example, in one embodiment, system 100 may be configured to sample two or more of the analog electrical signals by switching the input of an ADC among the outputs of the two or more corresponding light sensing elements at a sufficiently high rate.
In some embodiments, the light sensing device 130 may include an array of light sensing elements. Arrays of any of a wide variety of sizes, configurations and material technologies are contemplated. In one embodiment, the light sensing device 130 includes a focal plane array coupled to a readout integrated circuit. In one embodiment, the light sensing device 130 may include an array of cells, where each cell includes a corresponding light sensing element and is configured to integrate and hold photo-induced charge created by the light sensing element, and to convert the integrated charge into a corresponding cell voltage. The light sensing device may also include (or couple to) circuitry configured to sample the cell voltages using one or more ADCs.
In some embodiments, the light sensing device 130 may include a plurality (or array) of light sensing elements, where each light sensing element is configured to receive a corresponding spatial portion of the modulated light stream, and each spatial portion of the modulated light stream comes from a corresponding sub-region of the array of light modulating elements. (For example, the light sensing device 130 may include a quadrant photodiode, where each quadrant of the photodiode is configured to receive modulated light from a corresponding quadrant of the array of light modulating elements. As another example, the light sensing device 130 may include a bi-cell photodiode. As yet another example, the light sensing device 130 may include a focal plane array.) Each light sensing element generates a corresponding signal representing intensity of the corresponding spatial portion as a function of time. Each signal may be digitized (e.g., by a corresponding ADC, or perhaps by a shared ADC) to obtain a corresponding sequence of samples. Thus, a plurality of sample sequences are obtained, one sample sequence per light sensing element. Each sample sequence may be processed to reconstruct a corresponding sub-image (or sub-video sequence). The sub-images may be joined together to form a whole image (or whole video sequence). The sample sequences may be captured in response to the modulation of the incident light stream with a sequence of M spatial patterns, e.g., as variously described above. By employing any of various reconstruction algorithms known in the field of compressive sensing, the number of pixels (voxels) in each reconstructed image (sub-video sequence) may be greater than (e.g., much greater than) M. To reconstruct each sub-image (sub-video), the reconstruction algorithm uses the corresponding sample sequence and the restriction of the spatial patterns to the corresponding sub-region of the array of light modulating elements.
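The phrase "restriction of the spatial patterns to the corresponding sub-region" can be made concrete with a small sketch. The 4x4 array, the quadrant layout and the helper names below are assumptions for illustration. Each hypothetical quadrant detector sees the inner product of its part of the light field with the restricted pattern, and the quadrant measurements sum to the measurement a single full-field detector would record.

```python
def restrict(array2d, rows, cols):
    """Restriction of a 2-D array to a sub-region given by row and column ranges."""
    return [[array2d[r][c] for c in cols] for r in rows]


def inner_product(a, b):
    """Sum of element-wise products of two equally sized 2-D arrays."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical 4x4 light field (integer intensities) and one binary spatial pattern.
light = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
pattern = [[1, 0, 0, 1],
           [0, 1, 1, 0],
           [1, 1, 0, 0],
           [0, 0, 1, 1]]

quadrants = {
    "top-left": (range(0, 2), range(0, 2)),
    "top-right": (range(0, 2), range(2, 4)),
    "bottom-left": (range(2, 4), range(0, 2)),
    "bottom-right": (range(2, 4), range(2, 4)),
}

# One measurement per quadrant detector for this pattern.
per_quadrant = {
    name: inner_product(restrict(light, rows, cols), restrict(pattern, rows, cols))
    for name, (rows, cols) in quadrants.items()
}
print(per_quadrant)
print(sum(per_quadrant.values()) == inner_product(light, pattern))  # True
```

Reconstructing each sub-image would then pair each quadrant's sample sequence with the restricted patterns for that quadrant.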
In some embodiments, the light sensing device 130 includes a small number of light sensing elements (e.g., in respective embodiments, one, two, less than 8, less than 16, less than 32, less than 64, less than 128, less than 256). Because the light sensing device of these embodiments includes a small number of light sensing elements (e.g., far fewer than in a typical modern CCD-based or CMOS-based camera), an entity interested in producing any of these embodiments may afford to spend more per light sensing element to realize features that are beyond the capabilities of modern array-based image sensors of large pixel count, e.g., features such as higher sensitivity, extended range of sensitivity, new range(s) of sensitivity, extended dynamic range, higher bandwidth/lower response time. Furthermore, because the light sensing device includes a small number of light sensing elements, an entity interested in producing any of these embodiments may use newer light sensing technologies (e.g., based on new materials or combinations of materials) that are not yet mature enough to be manufactured into focal plane arrays (FPA) with large pixel count. For example, new detector materials such as super-lattices, quantum dots, carbon nanotubes and graphene can significantly enhance the performance of IR detectors by reducing detector noise, increasing sensitivity, and/or decreasing detector cooling requirements.
In one embodiment, the light sensing device 130 is a thermo-electrically cooled InGaAs detector. (InGaAs stands for “Indium Gallium Arsenide”.) In other embodiments, the InGaAs detector may be cooled by other mechanisms (e.g., liquid nitrogen or a Stirling engine). In yet other embodiments, the InGaAs detector may operate without cooling. In yet other embodiments, different detector materials may be used, e.g., materials such as MCT (mercury-cadmium-telluride), InSb (Indium Antimonide) and VOx (Vanadium Oxide).
In different embodiments, the light sensing device 130 may be sensitive to light at different wavelengths or wavelength ranges. In some embodiments, the light sensing device 130 may be sensitive to light over a broad range of wavelengths, e.g., over the entire visible spectrum or over the entire range [λL, λU] as defined above.
In some embodiments, the light sensing device 130 may include one or more dual-sandwich photodetectors. A dual sandwich photodetector includes two photodiodes stacked (or layered) one on top of the other.
In one embodiment, the light sensing device 130 may include one or more avalanche photodiodes.
In one embodiment, the light sensing device 130 may include one or more photomultiplier tubes (PMTs).
In some embodiments, a filter may be placed in front of the light sensing device 130 to restrict the modulated light stream to a specific range of wavelengths or specific polarization. Thus, the signal IMLS(t) generated by the light sensing device 130 may be representative of the intensity of the restricted light stream. For example, by using a filter that passes only IR light, the light sensing device may be effectively converted into an IR detector. The same principle may be applied to effectively convert the light sensing device into a detector for red or blue or green or UV or any desired wavelength band, or, a detector for light of a certain polarization.
In some embodiments, system 100 includes a color wheel whose rotation is synchronized with the application of the spatial patterns to the light modulation unit. As it rotates, the color wheel cyclically applies a number of optical bandpass filters to the modulated light stream MLS. Each bandpass filter restricts the modulated light stream to a corresponding sub-band of wavelengths. Thus, the samples captured by the ADC 140 will include samples of intensity in each of the sub-bands. The samples may be de-multiplexed to form separate sub-band sequences. Each sub-band sequence may be processed to generate a corresponding sub-band image. (As an example, the color wheel may include a red-pass filter, a green-pass filter and a blue-pass filter to support color imaging.)
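The de-multiplexing step can be sketched as follows, under the simplifying assumption (not stated in the text) that the color wheel advances by exactly one filter per captured sample, so that sample k was taken through filter k mod 3. The function name and the sample values are hypothetical.

```python
def demultiplex(samples, n_bands):
    """Split an interleaved sample stream into n_bands sub-band sequences.

    Assumes sample k was captured through filter number k % n_bands.
    """
    return [samples[band::n_bands] for band in range(n_bands)]


# Hypothetical interleaved samples: red, green, blue, red, green, blue, ...
samples = [10, 20, 30, 11, 21, 31, 12, 22, 32]
red, green, blue = demultiplex(samples, 3)
print(red, green, blue)  # [10, 11, 12] [20, 21, 22] [30, 31, 32]
```

Each resulting sub-band sequence would then be processed, together with the spatial patterns that were active when its samples were captured, to generate the corresponding sub-band image.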
In some embodiments, the system 100 may include a memory (or a set of memories of one or more kinds).
In some embodiments, system 100 may include a processing unit 150, e.g., as shown in
The system 100 (e.g., the processing unit 150) may store the samples {IMLS(k)} in a memory, e.g., a memory resident in the system 100 or in some other system.
In one embodiment, processing unit 150 is configured to operate on the samples {IMLS(k)} to generate the image or video sequence. In this embodiment, the processing unit 150 may include a microprocessor configured to execute software (i.e., program instructions), especially software for performing an image/video reconstruction algorithm. In one embodiment, system 100 is configured to transmit the compensated samples to some other system through a communication channel. (In embodiments where the spatial patterns are randomly-generated, system 100 may also transmit the random seed(s) used to generate the spatial patterns.) That other system may operate on the samples to reconstruct the image/video. System 100 may have one or more interfaces configured for sending (and perhaps also receiving) data through one or more communication channels, e.g., channels such as wireless channels, wired channels, fiber optic channels, acoustic channels, laser-based channels, etc.
In some embodiments, processing unit 150 is configured to use any of a variety of algorithms and/or any of a variety of transformations to perform image/video reconstruction. System 100 may allow a user to choose a desired algorithm and/or a desired transformation for performing the image/video reconstruction.
In some embodiments, the system 100 is configured to acquire a set ZM of samples from the ADC 140 so that the sample set ZM corresponds to M of the spatial patterns applied to the light modulation unit 110, where M is a positive integer. The number M is selected so that the sample set ZM is useable to reconstruct an n-pixel image or n-voxel video sequence that represents the incident light stream, where n is a positive integer less than or equal to the number N of light modulating elements in the light modulation unit 110. System 100 may be configured so that the number M is smaller than n. Thus, system 100 may operate as a compressive sensing device. (The number of “voxels” in a video sequence is the number of images in the video sequence times the number of pixels per image, or equivalently, the sum of the pixel counts of the images in the video sequence.)
In various embodiments, the compression ratio M/n may take any of a wide variety of values. For example, in different sets of embodiments, M/n may be, respectively, in the range [0.9,0.8], in the range [0.8,0.7], in the range [0.7,0.6], in the range [0.6,0.5], in the range [0.5,0.4], in the range [0.4,0.3], in the range [0.3,0.2], in the range [0.2,0.1], in the range [0.1,0.05], in the range [0.05,0.01], or in the range [0.01,0.001].
Superpixels for Modulation at Lower Spatial Resolution
As noted above, the image reconstructed from the sample subset ZM may be an n-pixel image with n≦N. The spatial patterns may be designed to support a value of n less than N, e.g., by forcing the array of light modulating elements to operate at a lower effective resolution than the physical resolution N. For example, the spatial patterns may be designed to force each 2×2 cell of light modulating elements to act in unison. At any given time, the modulation state of the four elements in a 2×2 cell will agree. Thus, the effective resolution of the array of light modulating elements is reduced to N/4. This principle generalizes to any cell size, to cells of any shape, and to collections of cells with non-uniform cell size and/or cell shape. For example, a collection of cells of size kH×kV, where kH and kV are positive integers, would give an effective resolution equal to N/(kHkV). In one alternative embodiment, cells near the center of the array may have smaller sizes than cells near the periphery of the array.
The “cells” of the above discussion are referred to herein as “superpixels”. When the reconstruction algorithm generates an image (video frame) from the acquired sample data, each superpixel corresponds to one pixel in the reconstructed image (video frame).
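As an illustration of the superpixel principle, a full-resolution spatial pattern may be derived from a lower-resolution pattern by replicating each coarse element over a kV×kH cell. The following is a minimal sketch under that assumption; the function name `expand_superpixels` and the nested-list representation of a pattern are hypothetical choices made for the example.

```python
def expand_superpixels(coarse, k_v, k_h):
    """Replicate each entry of a coarse pattern over a k_v x k_h cell.

    The returned pattern has k_v * k_h times as many elements as `coarse`,
    and all elements within a cell share the same modulation state.
    """
    fine = []
    for row in coarse:
        # Repeat each value k_h times horizontally.
        fine_row = [value for value in row for _ in range(k_h)]
        # Repeat the widened row k_v times vertically.
        fine.extend(list(fine_row) for _ in range(k_v))
    return fine


coarse_pattern = [[1, 0],
                  [0, 1]]
# A 4x4 array (N = 16) driven at an effective resolution of N/4 = 4.
full_pattern = expand_superpixels(coarse_pattern, 2, 2)
```

In this example the reconstructed image would have four pixels, one per superpixel, even though sixteen light modulating elements are driven.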
Restricting the Spatial Patterns to a Subset of the Modulation Array
Another way the spatial patterns may be arranged to support the reconstruction of an n-pixel image with n less than N is to allow the spatial patterns to vary only within a subset (or region) of the array of light modulating elements. In this mode of operation, the spatial patterns are null (take the value zero) outside the subset. (Control unit 120 may be configured to implement this restriction of the spatial patterns.) Light modulating elements corresponding to positions outside of the subset do not send any light (or send only the minimum amount of light attainable) to the light sensing device. Thus, the reconstructed image is restricted to the subset. In some embodiments, each spatial pattern (e.g., of a measurement pattern sequence) may be multiplied element-wise by a binary mask that takes the one value only in the allowed subset, and the resulting product pattern may be supplied to the light modulation unit. In some embodiments, the subset is a contiguous region of the array of light modulating elements, e.g., a rectangle or a circular disk or a hexagon. In some embodiments, the size and/or position of the region may vary (e.g., dynamically). The position of the region may vary in order to track a moving object. The size of the region may vary in order to dynamically control the rate of image acquisition or video frame rate. In some embodiments, the size of the region may be determined by user input. For example, system 100 may provide an input interface (GUI and/or mechanical control device) through which the user may vary the size of the region over a continuous range of values (or alternatively, a discrete set of values), thereby implementing a digital zoom function. Furthermore, in some embodiments, the position of the region within the field of view may be controlled by user input.
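The element-wise multiplication by a binary mask may be sketched as follows. This is an illustrative example only, assuming a rectangular region and a nested-list pattern representation; the names `rectangular_mask` and `apply_mask` are hypothetical.

```python
def rectangular_mask(rows, cols, top, left, height, width):
    """Binary mask that is one inside the given rectangle and zero outside."""
    return [
        [1 if top <= r < top + height and left <= c < left + width else 0
         for c in range(cols)]
        for r in range(rows)
    ]


def apply_mask(pattern, mask):
    """Multiply a spatial pattern element-wise by a binary mask."""
    return [
        [p * m for p, m in zip(pattern_row, mask_row)]
        for pattern_row, mask_row in zip(pattern, mask)
    ]


pattern = [[1, 0, 1, 1],
           [0, 1, 1, 0],
           [1, 1, 0, 1],
           [0, 1, 1, 1]]
mask = rectangular_mask(4, 4, top=1, left=1, height=2, width=2)
masked = apply_mask(pattern, mask)
```

Moving the rectangle (e.g., to track an object) or resizing it (e.g., for a digital zoom) only requires regenerating the mask; the underlying measurement patterns are unchanged.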
Oversampling Relative to Pattern Modulation Rate
In some embodiments, the A/D converter 140 may oversample the electrical signal generated by the light sensing device 130, i.e., acquire samples of the electrical signal at a rate higher than (e.g., a multiple of) the pattern modulation rate. The pattern modulation rate is the rate at which the spatial patterns are applied to the incident light stream L by the light modulation unit 110. Thus, the A/D converter may generate a plurality of samples per spatial pattern. The plurality of samples may be averaged to obtain a single averaged sample per spatial pattern. The averaging tends to reduce noise, and thus, to increase quality of image reconstruction. The averaging may be performed by processing unit 150 or some other processing agent. The oversampling ratio may be controlled by setting the pattern modulation rate and/or setting the A/D sampling rate.
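The averaging of oversampled data may be sketched as follows. The example assumes, for simplicity, that the oversampling ratio is an integer and that the samples belonging to each spatial pattern are contiguous in the sample stream; the function name `average_per_pattern` is hypothetical.

```python
def average_per_pattern(adc_samples, oversampling_ratio):
    """Reduce an oversampled stream to one averaged sample per pattern.

    Assumption: exactly `oversampling_ratio` consecutive samples were
    acquired while each spatial pattern was being applied.
    """
    if oversampling_ratio < 1 or len(adc_samples) % oversampling_ratio != 0:
        raise ValueError("sample count must be a multiple of the ratio")
    return [
        sum(adc_samples[i:i + oversampling_ratio]) / oversampling_ratio
        for i in range(0, len(adc_samples), oversampling_ratio)
    ]


# Four A/D samples per spatial pattern, two spatial patterns.
adc_samples = [1.0, 3.0, 2.0, 2.0, 5.0, 7.0, 6.0, 6.0]
averaged = average_per_pattern(adc_samples, 4)
```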
In one embodiment, system 100 may include a light transmitter configured to generate a light beam (e.g., a laser beam), to modulate the light beam with a data signal and to transmit the modulated light beam into space or onto an optical fiber. System 100 may also include a light receiver configured to receive a modulated light beam from space or from an optical fiber, and to recover a data stream from the received modulated light beam.
In one embodiment, system 100 may be configured as a low-cost sensor system having minimal processing resources, e.g., processing resources insufficient to perform image (or video) reconstruction in user-acceptable time. In this embodiment, the system 100 may store and/or transmit the samples {IMLS(k)} so that another agent, more plentifully endowed with processing resources, may perform the image/video reconstruction based on the samples.
In some embodiments, system 100 may include an optical subsystem 105 that is configured to modify or condition the light stream L before it arrives at the light modulation unit 110, e.g., as shown in
In some embodiments, system 100 may include an optical subsystem 117 to direct the modulated light stream MLS onto a light sensing surface (or surfaces) of the light sensing device 130.
In some embodiments, the optical subsystem 117 may include one or more lenses, and/or, one or more mirrors.
In some embodiments, the optical subsystem 117 is configured to focus the modulated light stream onto the light sensing surface (or surfaces). The term “focus” implies an attempt to achieve the condition that rays (photons) diverging from a point on an object plane converge to a point (or an acceptably small spot) on an image plane. The term “focus” also typically implies continuity between the object plane point and the image plane point (or image plane spot); points close together on the object plane map respectively to points (or spots) close together on the image plane. In at least some of the system embodiments that include an array of light sensing elements, it may be desirable for the modulated light stream MLS to be focused onto the light sensing array so that there is continuity between points on the light modulation unit LMU and points (or spots) on the light sensing array.
In some embodiments, the optical subsystem 117 may be configured to direct the modulated light stream MLS onto the light sensing surface (or surfaces) of the light sensing device 130 in a non-focusing fashion. For example, in a system embodiment that includes only one photodiode, it may not be so important to achieve the “in focus” condition at the light sensing surface of the photodiode since positional information of photons arriving at that light sensing surface will be immediately lost.
In one embodiment, the optical subsystem 117 may be configured to receive the modulated light stream and to concentrate the modulated light stream into an area (e.g., a small area) on a light sensing surface of the light sensing device 130. Thus, the diameter of the modulated light stream may be reduced (possibly, radically reduced) in its transit from the optical subsystem 117 to the light sensing surface (or surfaces) of the light sensing device 130. For example, in some embodiments, the diameter may be reduced by a factor of more than 1.5 to 1. In other embodiments, the diameter may be reduced by a factor of more than 2 to 1. In yet other embodiments, the diameter may be reduced by a factor of more than 10 to 1. In yet other embodiments, the diameter may be reduced by a factor of more than 100 to 1. In yet other embodiments, the diameter may be reduced by a factor of more than 400 to 1. In one embodiment, the diameter is reduced so that the modulated light stream is concentrated onto the light sensing surface of a single light sensing element (e.g., a single photodiode).
In some embodiments, this feature of concentrating the modulated light stream onto the light sensing surface (or surfaces) of the light sensing device allows the light sensing device to sense at any given time the sum (or surface integral) of the intensities of the modulated light portions within the modulated light stream. (Each time slice of the modulated light stream comprises a spatial ensemble of modulated light portions due to the modulation unit's action of applying the corresponding spatial pattern to the light stream.)
In some embodiments, the modulated light stream MLS may be directed onto the light sensing surface of the light sensing device 130 without concentration, i.e., without decrease in diameter of the modulated light stream, e.g., by use of a photodiode having a large light sensing surface, large enough to contain the cross section of the modulated light stream without the modulated light stream being concentrated.
In some embodiments, the optical subsystem 117 may include one or more lenses.
In some embodiments, the optical subsystem 117 may include one or more mirrors. In one embodiment, the optical subsystem 117 includes a parabolic mirror (or spherical mirror) to concentrate the modulated light stream onto a neighborhood (e.g., a small neighborhood) of the parabolic focal point. In this embodiment, the light sensing surface of the light sensing device may be positioned at the focal point.
In some embodiments, system 100 may include an optical mechanism (e.g., an optical mechanism including one or more prisms and/or one or more diffraction gratings) for splitting or separating the modulated light stream MLS into two or more separate streams (perhaps numerous streams), where each of the streams is confined to a different wavelength range. The separate streams may each be sensed by a separate light sensing device. (In some embodiments, the number of wavelength ranges may be, e.g., greater than 8, or greater than 16, or greater than 64, or greater than 256, or greater than 1024.) Furthermore, each separate stream may be directed (e.g., focused or concentrated) onto the corresponding light sensing device as described above in connection with optical subsystem 117. The samples captured from each light sensing device may be used to reconstruct a corresponding image (or video sequence) for the corresponding wavelength range. In one embodiment, the modulated light stream is separated into red, green and blue streams to support color (R,G,B) measurements. In another embodiment, the modulated light stream may be separated into IR, red, green, blue and UV streams to support five-channel multi-spectral imaging: (IR, R, G, B, UV). In some embodiments, the modulated light stream may be separated into a number of sub-bands (e.g., adjacent sub-bands) within the IR band to support multi-spectral or hyper-spectral IR imaging. In some embodiments, the number of IR sub-bands may be, e.g., greater than 8, or greater than 16, or greater than 64, or greater than 256, or greater than 1024. In some embodiments, the modulated light stream may experience two or more stages of spectral separation. For example, in a first stage the modulated light stream may be separated into an IR stream confined to the IR band and one or more additional streams confined to other bands. 
In a second stage, the IR stream may be separated into a number of sub-bands (e.g., numerous sub-bands) (e.g., adjacent sub-bands) within the IR band to support multispectral or hyper-spectral IR imaging.
In some embodiments, system 100 may include an optical mechanism (e.g., a mechanism including one or more beam splitters) for splitting or separating the modulated light stream MLS into two or more separate streams, e.g., where each of the streams has the same (or approximately the same) spectral characteristics or wavelength range. The separate streams may then pass through respective bandpass filters to obtain corresponding modified streams, where each modified stream is restricted to a corresponding band of wavelengths. Each of the modified streams may be sensed by a separate light sensing device. (In some embodiments, the number of wavelength bands may be, e.g., greater than 8, or greater than 16, or greater than 64, or greater than 256, or greater than 1024.) Furthermore, each of the modified streams may be directed (e.g., focused or concentrated) onto the corresponding light sensing device as described above in connection with optical subsystem 117. The samples captured from each light sensing device may be used to reconstruct a corresponding image (or video sequence) for the corresponding wavelength band. In one embodiment, the modulated light stream is separated into three streams which are then filtered, respectively, with a red-pass filter, a green-pass filter and a blue-pass filter. The resulting red, green and blue streams are then respectively detected by three light sensing devices to support color (R,G,B) acquisition. In another similar embodiment, five streams are generated, filtered with five respective filters, and then measured with five respective light sensing devices to support (IR, R, G, B, UV) multi-spectral acquisition. In yet another embodiment, the modulated light stream of a given band may be separated into a number of (e.g., numerous) sub-bands to support multi-spectral or hyper-spectral imaging.
In some embodiments, system 100 may include an optical mechanism for splitting or separating the modulated light stream MLS into two or more separate streams. The separate streams may be directed to (e.g., concentrated onto) respective light sensing devices. The light sensing devices may be configured to be sensitive in different wavelength ranges, e.g., by virtue of their different material properties. Samples captured from each light sensing device may be used to reconstruct a corresponding image (or video sequence) for the corresponding wavelength range.
In some embodiments, system 100 may include a control unit 120 configured to supply the spatial patterns to the light modulation unit 110, as shown in
In some embodiments, the control unit 120 may supply the spatial patterns to the light modulation unit in a periodic fashion.
The control unit 120 may be a digital circuit or a combination of digital circuits. For example, the control unit may include a microprocessor (or system of interconnected microprocessors), a programmable hardware element such as a field-programmable gate array (FPGA), an application specific integrated circuit (ASIC), or any combination of such elements.
In some embodiments, the control unit 120 may include a random number generator (RNG) or a set of random number generators to generate the spatial patterns or some subset of the spatial patterns.
In some embodiments, system 100 is battery powered. In some embodiments, the system 100 includes a set of one or more solar cells and associated circuitry to derive power from sunlight.
In some embodiments, system 100 includes its own light source for illuminating the environment or a target portion of the environment.
In some embodiments, system 100 may include a display (or an interface configured for coupling to a display) for displaying reconstructed images/videos.
In some embodiments, system 100 may include one or more input devices (and/or, one or more interfaces for input devices), e.g., any combination or subset of the following devices: a set of buttons and/or knobs, a keyboard, a keypad, a mouse, a touch-sensitive pad such as a trackpad, a touch-sensitive display screen, one or more microphones, one or more temperature sensors, one or more chemical sensors, one or more pressure sensors, one or more accelerometers, one or more orientation sensors (e.g., a three-axis gyroscopic sensor), one or more proximity sensors, one or more antennas, etc.
Regarding the spatial patterns that are used to modulate the light stream L, it should be understood that there are a wide variety of possibilities. In some embodiments, the control unit 120 may be programmable so that any desired set of spatial patterns may be used.
In some embodiments, the spatial patterns are binary valued. Such an embodiment may be used, e.g., when the light modulating elements are two-state devices. In some embodiments, the spatial patterns are n-state valued, where each element of each pattern takes one of n states, where n is an integer greater than two. (Such an embodiment may be used, e.g., when the light modulating elements are each able to achieve n or more modulation states). In some embodiments, the spatial patterns are real valued, e.g., when each of the light modulating elements admits a continuous range of modulation. (It is noted that even a two-state modulating element may be made to effectively apply a continuous range of modulation by duty cycling the two states during modulation intervals.)
Coherence
The spatial patterns may belong to a set of measurement vectors that is incoherent with a set of vectors in which the image/video is approximately sparse (“the sparsity vector set”). (See “Sparse Signal Detection from Incoherent Projections”, Proc. Int. Conf. Acoustics, Speech Signal Processing—ICASSP, May 2006, Duarte et al.) Given two sets of vectors A={ai} and B={bi} in the same N-dimensional space, A and B are said to be incoherent if their coherence measure μ(A,B) is sufficiently small. Assuming that the vectors {ai} and {bi} each have unit L2 norm, then the coherence measure may be defined as:
The number of compressive sensing measurements (i.e., samples of the sequence {IMLS(k)}) needed to reconstruct an N-pixel image (or N-voxel video sequence) that accurately represents the scene being captured is a strictly increasing function of the coherence between the measurement vector set and the sparsity vector set. Thus, better compression can be achieved with smaller values of the coherence.
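By way of illustration, the coherence between two vector sets may be computed numerically. The sketch below assumes the definition commonly used in the compressive sensing literature, namely that μ(A,B) is the largest absolute inner product between a unit-norm vector of A and a unit-norm vector of B; that definition is an assumption made for the example. The function names `normalize` and `coherence`, and the choice of a 4-point Hadamard set and the standard basis, are likewise illustrative.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def coherence(A, B):
    """Largest absolute inner product between a vector of A and one of B.

    Assumes every vector in A and B already has unit L2 norm.
    """
    return max(
        abs(sum(x * y for x, y in zip(a, b)))
        for a in A
        for b in B
    )


# Candidate sparsity vector set: the standard basis of a 4-dimensional space.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# Candidate measurement vector set: normalized rows of a 4x4 Hadamard matrix.
hadamard = [normalize(row) for row in (
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
)]

mu_incoherent = coherence(hadamard, identity)  # every inner product is +/- 1/2
mu_coherent = coherence(identity, identity)    # identical sets give 1.0
```

Under this definition the Hadamard set and the standard basis attain a coherence of 1/2 in four dimensions, whereas comparing a set with itself gives the maximum possible value of one.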
In some embodiments, the measurement vector set may be based on a code. Any of various codes from information theory may be used, e.g., codes such as exponentiated Kerdock codes, exponentiated Delsarte-Goethals codes, run-length limited codes, LDPC codes, Reed Solomon codes and Reed Muller codes.
In some embodiments, the measurement vector set corresponds to a randomized or permuted basis, where the basis may be, for example, the DCT basis (DCT is an acronym for Discrete Cosine Transform) or Hadamard basis.
In some embodiments, the spatial patterns may be random or pseudo-random patterns, e.g., generated according to a random number generation (RNG) algorithm using one or more seeds. In some embodiments, the elements of each pattern are generated by a series of Bernoulli trials, where each trial has a probability p of giving the value one and probability 1−p of giving the value zero. (For example, in one embodiment p=½.) In some embodiments, the elements of each pattern are generated by a series of draws from a Gaussian random variable.
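The generation of seeded Bernoulli patterns may be sketched as follows. The example uses Python's standard `random` module as a stand-in for whatever RNG algorithm a given embodiment employs; the function name `bernoulli_patterns` and the seed value are hypothetical.

```python
import random


def bernoulli_patterns(seed, num_patterns, num_elements, p=0.5):
    """Generate binary patterns from independent Bernoulli(p) trials.

    The same seed always reproduces the same sequence of patterns.
    """
    rng = random.Random(seed)
    return [
        [1 if rng.random() < p else 0 for _ in range(num_elements)]
        for _ in range(num_patterns)
    ]


# The acquiring system and a remote reconstructing system can regenerate
# identical patterns as long as they share the seed.
acquired_with = bernoulli_patterns(seed=1234, num_patterns=3, num_elements=8)
regenerated = bernoulli_patterns(seed=1234, num_patterns=3, num_elements=8)
```

Because the patterns are fully determined by the seed, only the seed (rather than the patterns themselves) needs to accompany the samples when reconstruction is performed elsewhere.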
The system 100 may be configured to operate in a compressive fashion, where the number of the samples {IMLS(k)} captured by the system 100 is less than (e.g., much less than) the number of pixels in the image (or video) to be reconstructed from the samples. In many applications, this compressive realization is very desirable because it saves on power consumption, memory utilization and transmission bandwidth consumption. However, non-compressive realizations are contemplated as well.
In some embodiments, the system 100 is configured as a camera or imager that captures information representing an image (or a series of images) from the external environment, e.g., an image (or a series of images) of some external object or scene. The camera system may take different forms in different application domains, e.g., domains such as visible light photography, infrared photography, ultraviolet photography, high-speed photography, low-light photography, underwater photography, multi-spectral imaging, hyper-spectral imaging, etc. In some embodiments, system 100 is configured to operate in conjunction with (or as part of) another system, e.g., in conjunction with (or as part of) a microscope, a telescope, a robot, a security system, a surveillance system, a fire sensor, a node in a distributed sensor network, etc.
In some embodiments, system 100 is configured as a spectrometer.
In some embodiments, system 100 is configured as a multi-spectral or hyper-spectral imager.
In some embodiments, system 100 may be configured as a single integrated package, e.g., as a camera.
In some embodiments, system 100 may also be configured to operate as a projector. Thus, system 100 may include a light source, e.g., a light source located at or near a focal point of optical subsystem 117. In projection mode, the light modulation unit 110 may be supplied with an image (or a sequence of images), e.g., by control unit 120. The light modulation unit may receive a light beam generated by the light source, and modulate the light beam with the image (or sequence of images) to obtain a modulated light beam. The modulated light beam exits the system 100 and is displayed on a display surface (e.g., an external screen).
In one embodiment, the light modulation unit 110 may receive the light beam from the light source and modulate the light beam with a time sequence of spatial patterns (from a measurement pattern set). The resulting modulated light beam exits the system 100 and is used to illuminate the external scene. Light reflected from the external scene in response to the modulated light beam is measured by a light sensing device (e.g., a photodiode). The samples captured by the light sensing device comprise compressive measurements of the external scene. Those compressive measurements may be used to reconstruct an image or video sequence as variously described above.
In some embodiments, system 100 includes an interface for communicating with a host computer. The host computer may send control information and/or program code to the system 100 via the interface. Furthermore, the host computer may receive status information and/or compressive sensing measurements from system 100 via the interface.
In one realization 200 of system 100, the light modulation unit 110 may be realized by a plurality of mirrors, e.g., as shown in
In some embodiments, the mirrors 110M are arranged in an array, e.g., a two-dimensional array or a one-dimensional array. Any of various array geometries are contemplated. For example, in different embodiments, the array may be a square array, a rectangular array, a hexagonal array, etc. In some embodiments, the mirrors are arranged in a spatially-random fashion.
The mirrors 110M may be part of a digital micromirror device (DMD). For example, in some embodiments, one of the DMDs manufactured by Texas Instruments may be used.
The control unit 120 may be configured to drive the orientation states of the mirrors through the series of spatial patterns, where each of the patterns of the series specifies an orientation state for each of the mirrors.
The light sensing device 130 may be configured to receive the light portions reflected at any given time onto the sensing path 115 by the subset of mirrors in the first orientation state and to generate an analog electrical signal IMLS(t) representing a cumulative intensity of the received light portions as a function of time. As the mirrors are driven through the series of spatial patterns, the subset of mirrors in the first orientation state will vary from one spatial pattern to the next. Thus, the cumulative intensity of light portions reflected onto the sensing path 115 and arriving at the light sensing device will vary as a function of time. Note that the term “cumulative” is meant to suggest a summation (spatial integration) over the light portions arriving at the light sensing device at any given time. This summation may be implemented, at least in part, optically (e.g., by means of a lens and/or mirror that concentrates or focuses the light portions onto a concentrated area as described above).
System realization 200 may include any subset of the features, embodiments and elements discussed above with respect to system 100. For example, system realization 200 may include the optical subsystem 105 to operate on the incoming light L before it arrives at the mirrors 110M, e.g., as shown in
In some embodiments, system realization 200 may include the optical subsystem 117 along the sensing path as shown in
In some embodiments, the optical subsystem 117 may include one or more mirrors, e.g., a mirror 117M as shown in
In some embodiments, there may be one or more optical elements intervening between the optical subsystem 105 and the mirrors 110M. For example, as shown in
Rolling Window Reconstruction
In one set of embodiments, a method 600 for reconstructing a sequence of images from a stream of compressive-imaging measurements may involve the operations shown in
Operation 610 involves modulating an incident light stream with a sequence of spatial patterns {aj} to obtain a modulated light stream, e.g., as variously described above in connection with system 100 and system realization 200. (Index j is the sequence index.) The action of modulating the incident light stream includes applying the spatial patterns successively in time to the incident light stream.
Operation 615 involves acquiring a sequence of measurements {I(j)} representing intensity of the modulated light stream over time. (See the above description of the light sensing device 130 in connection with system 100 and system realization 200.) Each of the measurements I(j) is acquired in response to the application of a respective one of the spatial patterns to the incident light stream, and represents the intensity of the modulated light stream during the application of the respective spatial pattern. In some embodiments, the acquisition of the sequence of measurements may include oversampling and averaging as described above in the section entitled “Oversampling Relative to Pattern Modulation Rate”.
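For purposes of illustration, operations 610 and 615 may be simulated numerically. The sketch below assumes an idealized, noise-free model in which each measurement I(j) equals the sum of the scene intensities at the elements where pattern aj takes the value one; the flattened-list representation and the function name `measure` are hypothetical.

```python
def measure(patterns, scene):
    """Simulate one intensity measurement per spatial pattern.

    Idealized model (an assumption): measurement j is the inner product of
    pattern j with the scene, i.e., the summed intensity of the light
    portions passed by the pattern.
    """
    return [
        sum(a * x for a, x in zip(pattern, scene))
        for pattern in patterns
    ]


# A four-element scene (arbitrary intensity units) and three binary patterns.
scene = [2, 5, 1, 9]
patterns = [
    [1, 0, 1, 0],
    [0, 1, 1, 1],
    [1, 1, 1, 1],
]
measurements = measure(patterns, scene)
```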
Operation 620 involves generating a sequence of subsets of the measurements, where each consecutive pair of the subsets overlaps by a nonzero amount. Each of the subsets corresponds to a respective group of the spatial patterns. Because each measurement I(j) of the measurement sequence corresponds to a respective pattern aj of the sequence of spatial patterns, the spatial pattern groups will overlap in time by the same amount that the measurement subsets do.
More generally,
In some embodiments, the action of generating the subsets may involve receiving each measurement from the light sensing device, and storing the measurement into a buffer, e.g., a FIFO buffer. The subsets may be generated periodically in time, with each subset being composed from the most recent M measurements in the buffer. (FIFO is an acronym for “first in, first out”.)
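The FIFO-based generation of overlapping subsets may be sketched as follows. The example assumes that a new subset is emitted every `step` measurements once the buffer holds M measurements, so that consecutive subsets overlap by M − `step` measurements; the function name `rolling_subsets` and the parameter `step` are hypothetical.

```python
from collections import deque


def rolling_subsets(measurements, subset_size, step):
    """Yield overlapping subsets of the most recent `subset_size` measurements.

    A subset is emitted once the buffer is full and then again after every
    `step` new measurements, giving an overlap of subset_size - step.
    """
    if not 1 <= step <= subset_size:
        raise ValueError("step must be between 1 and subset_size")
    buffer = deque(maxlen=subset_size)  # oldest measurement drops out first
    since_last = 0
    for value in measurements:
        buffer.append(value)
        since_last += 1
        if len(buffer) == subset_size and since_last >= step:
            yield list(buffer)
            since_last = 0


# Ten measurements, M = 4, a new subset every 2 measurements (overlap of 2).
subsets = list(rolling_subsets(range(10), subset_size=4, step=2))
```

With M=4 and a step of two measurements, each consecutive pair of subsets shares two measurements, which corresponds to an overlap of one half of each subset.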
Referring once again to
The reconstruction of each image may be started as soon as the corresponding subset of measurements becomes available. For example, as shown in
While the subsets shown in
The reconstruction algorithm used to reconstruct each image of the image sequence may be any reconstruction algorithm known in the field of compressive sensing.
Operations 610 through 625 may be performed as a pipeline process, i.e., each of the operations may be performed continuously, and each of the operations continuously consumes the output of the previous operation.
Operation 630 involves displaying the images of the image sequence using a display device. The display device may include a display screen conforming to any desired type of display technology. In some embodiments, the display device may include a video projector for projecting the image sequence on a display surface.
In some embodiments, the input data set used for the reconstruction of the current image of the image sequence also includes a previously-reconstructed image of the image sequence. The image reconstruction algorithm may be configured to converge upon an estimate of the current image in an iterative fashion. Thus, a previously-reconstructed image of the image sequence may provide a warm start for the reconstruction of the current image, i.e., may decrease the time to convergence for the reconstruction of the current image.
If the reconstruction algorithm takes more than (1−r)MTS units of time to reconstruct a given image, where TS is the measurement period (i.e., the reciprocal of the measurement rate fS), then the reconstruction of the next image may start before the reconstruction of the current image is completed. For example, in
As many as $N_R = \lceil T_{MAX} / (M(1-r)T_S) \rceil$ image reconstructions may be executing simultaneously at any given time, where TMAX denotes the maximum latency of an image reconstruction.
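The number of simultaneously executing reconstructions may be evaluated numerically as follows. This is a minimal sketch of the expression ceiling{TMAX/(M(1−r)TS)}; the function name and the numerical values are hypothetical and chosen only for illustration.

```python
import math


def max_concurrent_reconstructions(t_max, subset_size, overlap_ratio, t_s):
    """Evaluate ceiling{t_max / (subset_size * (1 - overlap_ratio) * t_s)}.

    t_max: maximum reconstruction latency (seconds)
    subset_size: number M of measurements per subset
    overlap_ratio: overlap ratio r between consecutive subsets
    t_s: measurement period (seconds)
    """
    # Time between the availability of one subset and the next.
    subset_period = subset_size * (1.0 - overlap_ratio) * t_s
    return math.ceil(t_max / subset_period)


# Example: 10 kHz measurement rate, M = 1000, r = 0.75, 0.26 s latency.
# A new subset arrives every 25 ms, so 0.26 / 0.025 = 10.4 rounds up to 11.
n_r = max_concurrent_reconstructions(
    t_max=0.26, subset_size=1000, overlap_ratio=0.75, t_s=1e-4
)
```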
In some embodiments, the action of reconstructing the images of the image sequence includes invoking execution of a current instance of a reconstruction algorithm in order to reconstruct a current image of the image sequence before one or more previous instances of the reconstruction algorithm (corresponding respectively to one or more previous images of the image sequence) have completed execution. The current instance and the one or more previous instances of the reconstruction algorithm may execute at least partially in parallel.
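The invocation of a new reconstruction instance before earlier instances have completed may be sketched as follows. The example uses a thread pool as one possible mechanism; the function `reconstruct` is a hypothetical placeholder (it merely averages its input) standing in for an actual reconstruction algorithm, and the pool size plays the role of the number of reconstructions permitted to execute at once.

```python
from concurrent.futures import ThreadPoolExecutor


def reconstruct(subset):
    """Hypothetical placeholder for a real image reconstruction algorithm."""
    return sum(subset) / len(subset)


subsets = [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7]]

with ThreadPoolExecutor(max_workers=2) as pool:
    # Each submission returns immediately, so a later instance can start
    # while earlier instances are still executing.
    futures = [pool.submit(reconstruct, subset) for subset in subsets]
    # Results are collected in submission order to preserve image order.
    images = [future.result() for future in futures]
```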
In some embodiments, the input data set used for the reconstruction of the current image of the image sequence also includes a partially-reconstructed version of a previous image of the image sequence. As described above, the reconstruction of a previous image may be still in progress when the reconstruction of the current image is started. However, the partially-completed previous image may nevertheless be useful information for the current image reconstruction, i.e., useful in decreasing the time to convergence for the current image reconstruction.
In some embodiments, the amount of overlap (between each consecutive pair of subsets in the sequence of subsets) may be set to achieve a target image rate for the image sequence. If the reconstruction of each image has maximum latency TMAX and each image reconstruction is started within a maximum latency of TRS after the corresponding subset of the sequence of measurements becomes available, then the image rate fimage of the reconstructed image sequence will be governed by the rate at which the measurement subsets are generated, i.e., by the product (1−r)M and the measurement rate fS:
$$f_{image} = \frac{f_S}{(1-r)M},$$
where r is the overlap ratio, and M is the number of measurements in each measurement subset. Thus, the image rate fimage is controllable by selection of the overlap ratio r and/or the subset size M. In the extreme r=1−(1/M), the image rate fimage equals the measurement rate fS. In the opposite extreme r=0, the image rate fimage=fS/M. Any image rate between those two extremes may be achieved by appropriate choice of overlap ratio r.
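The relationship between overlap and image rate can be checked numerically with a short Python sketch. To avoid rounding issues, the sketch expresses the overlap as an integer count of shared measurements (so that r = overlap/M); that parameterization, the function name, and the sample values are assumptions made for illustration.

```python
def image_rate(f_s, m, overlap):
    """Image rate f_image = f_S / ((1 - r) * M), with r = overlap / m."""
    if not 0 <= overlap < m:
        raise ValueError("overlap must satisfy 0 <= overlap < m")
    # (1 - r) * M is the number of new measurements needed per image.
    return f_s / (m - overlap)


# Hypothetical 10 kHz measurement rate and 1000 measurements per subset.
no_overlap = image_rate(10_000.0, 1000, 0)      # r = 0
max_overlap = image_rate(10_000.0, 1000, 999)   # r = 1 - 1/M
three_quarters = image_rate(10_000.0, 1000, 750)
```

In this example the two extremes give 10 images per second and 10,000 images per second respectively, and a 75% overlap gives 40 images per second.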
In some embodiments, the image rate fimage may be set to a value sufficiently large so that the displayed image sequence achieves flicker fusion, i.e., appears to be continuous in time to the human observer.
In some embodiments, the image rate fimage may be set to a value sufficiently large so that the displayed image sequence can be regarded as a video sequence.
In some embodiments, the target image rate is dynamically controllable by user input. A user input device may be monitored to determine if the user has made any changes to a target image rate value. If so, the amount of overlap (or the overlap ratio r) may be changed dynamically in response to each such change in the target image rate value.
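One way such a dynamic adjustment might be computed is sketched below in Python. The function name and the policy of rounding so that the achieved rate is never below the target (when the target is attainable) are assumptions, not details drawn from the embodiments.

```python
import math


def overlap_for_target_rate(f_s, m, target_rate):
    """Return the number of shared measurements for a target image rate.

    Solves f_S / (m - overlap) >= target_rate for the smallest overlap,
    clamped so that at least one and at most m new measurements are used
    per image.
    """
    if target_rate <= 0:
        raise ValueError("target_rate must be positive")
    step = math.floor(f_s / target_rate)  # new measurements per image
    step = max(1, min(m, step))
    return m - step


# Hypothetical values: 10 kHz measurement rate, 1000 measurements per subset.
overlap = overlap_for_target_rate(10_000.0, 1000, 30.0)
achieved_rate = 10_000.0 / (1000 - overlap)
```

A target below fS/M yields zero overlap, and a target above fS is capped at the measurement rate, matching the two extremes described above.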
In some embodiments, the method 600 may also include transmitting the sequence of measurements to a remote computer, where the action 620 of generating the subsets and the action 625 of reconstructing the image sequence are performed on the remote computer.
Each of the subsets of the sequence of measurements preferably has fewer measurements than the number of pixels in the respective image of the image sequence. Thus, each subset comprises a compressed representation of the respective image.
As described above, the reconstruction of each image of the image sequence from the corresponding subset of measurements and the corresponding group of spatial patterns may be performed with latency less than or equal to a maximum value TMAX. In some embodiments, the reconstruction algorithm may output its current estimate of the image if the TMAX deadline is reached before convergence has been achieved. In some embodiments, the latency bound TMAX is determined by user input.
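The deadline behavior described above might be organized as in the following Python sketch. It is only a schematic: the scalar Newton iteration stands in for an actual image reconstruction step, and the function and parameter names are hypothetical.

```python
import time


def reconstruct_with_deadline(step, estimate, t_max, tol=1e-9, clock=time.monotonic):
    """Iterate `step` until convergence or until t_max seconds have elapsed.

    Returns the current estimate and a flag telling whether it converged.
    """
    deadline = clock() + t_max
    converged = False
    while clock() < deadline:
        new_estimate = step(estimate)
        converged = abs(new_estimate - estimate) <= tol
        estimate = new_estimate
        if converged:
            break
    return estimate, converged


def newton_sqrt2(x):
    """Stand-in iterative update (hypothetical): Newton's method for sqrt(2)."""
    return 0.5 * (x + 2.0 / x)


finished, ok = reconstruct_with_deadline(newton_sqrt2, 1.0, t_max=1.0)
cut_short, ok_short = reconstruct_with_deadline(newton_sqrt2, 1.0, t_max=0.0)
```

With a generous deadline the loop exits on convergence; with a zero deadline it returns the starting estimate unchanged and reports that convergence was not reached.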
As described above, the incident light stream is modulated with a sequence of spatial patterns. The spatial patterns are preferably drawn from a measurement pattern set that is incoherent with respect to a dictionary of patterns in which each image of the image sequence is sparse (or compressible).
In a playback mode (or replay mode), operations 620, 625 and 630 may be started after operations 610 and 615 have completed (or alternatively, after operations 610 and 615 have started). Thus, the sequence of measurements acquired by operation 615 may be stored in a memory for later access. In some embodiments, the method 600 may include: accessing the sequence of measurements from the memory (e.g., in response to a user request for playback of the image sequence); generating 620 the overlapping subsets from the accessed measurement sequence; performing the reconstruction 625 on the overlapping subsets; and displaying 630 the reconstructed image sequence. The accessing, generating, reconstructing and displaying operations may form a pipeline process, where each operation continuously consumes the output of the preceding operation. The frame rate of the displayed image sequence may be controlled by adjusting the overlap ratio, e.g., in response to user input.
In some embodiments, the sequence of measurements may be accessed from memory at a rate higher than the rate fM at which the measurements were acquired from the light sensing device. This super-speed access of the measurement sequence may be used to reconstruct and display the images of the image sequence at a rate higher than would be possible if reconstruction and display were performed in real time (i.e., while the modulation 610 was being performed.)
In one set of embodiments, a system 1000 for reconstructing a sequence of images from compressively-acquired measurements of an incident light stream may be configured as shown in
The light modulation unit 1010 may be configured to modulate an incident light stream L with a sequence of spatial patterns to obtain a modulated light stream MLS, e.g., as variously described above. The light modulation unit is configured to apply the spatial patterns to the incident light stream L successively in time. In some embodiments, the light modulation unit 1010 may be realized by the light modulation unit 110 or mirrors 110M as variously described above.
The light sensing device 1020 may be configured to acquire a sequence of measurements representing intensity of the modulated light stream MLS over time. Each of the measurements is acquired in response to the application of a respective one of the spatial patterns to the incident light stream L. In some embodiments, the light sensing device 1020 may be realized by the light sensing device 130 as variously described above.
The processing unit 1030 may be configured to generate a sequence of subsets of the intensity measurements, e.g., as described above in connection with method 600. Each consecutive pair of the subsets overlap by a nonzero amount. Each of the subsets corresponds to a respective group of the spatial patterns. The processing unit 1030 may also reconstruct a sequence of images, where each of the images is reconstructed from a respective input data set including a respective one of the subsets of intensity measurements and a respective one of the groups of the spatial patterns.
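The generation of overlapping subsets from a continuously arriving measurement stream can be sketched in Python as a sliding window. The generator below is a minimal sketch under the assumption that each stream item pairs a measurement with its spatial pattern; the names are hypothetical.

```python
from collections import deque


def rolling_subsets(stream, m, overlap):
    """Yield windows of m consecutive items; consecutive windows share `overlap` items."""
    if not 0 <= overlap < m:
        raise ValueError("overlap must satisfy 0 <= overlap < m")
    step = m - overlap           # new items needed before the next window
    window = deque(maxlen=m)     # oldest items fall out automatically
    since_last = 0
    for item in stream:
        window.append(item)
        since_last += 1
        if len(window) == m and since_last >= step:
            since_last = 0
            yield list(window)


# Hypothetical stream of (measurement, pattern) pairs.
windows = list(rolling_subsets(zip(range(10), "abcdefghij"), m=4, overlap=2))
```

Because the generator consumes the stream lazily, it can sit between acquisition and reconstruction in a pipeline; each yielded window carries both the measurement subset and the corresponding group of patterns.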
In some embodiments, the processing unit 1030 may be configured to execute up to NR image reconstructions in parallel, where NR is greater than one. For example, the processing unit may include NR parallel processing channels (or pipelines), each configured to execute an image reconstruction. The NR parallel processing channels may be realized by one or more processors that are configured to execute program instructions, by dedicated digital circuitry such as one or more application-specific integrated circuits (ASICs), by one or more programmable hardware elements such as FPGAs, or by any combination of the foregoing. In one embodiment, the processing unit may include NR processor cores each capable of executing an independent stream of program code.
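In software, the NR parallel channels might be approximated with a worker pool, as in the Python sketch below. The `reconstruct` function is a placeholder that returns a subset mean rather than an image, and the use of a thread pool is an assumption; a process pool or dedicated hardware could serve the same role.

```python
from concurrent.futures import ThreadPoolExecutor


def reconstruct(subset):
    """Placeholder reconstruction (hypothetical): returns the subset mean."""
    return sum(subset) / len(subset)


def reconstruct_sequence(subsets, n_r):
    """Run up to n_r reconstructions concurrently; results keep subset order."""
    with ThreadPoolExecutor(max_workers=n_r) as pool:
        # Each reconstruction is submitted without waiting for earlier ones.
        futures = [pool.submit(reconstruct, s) for s in subsets]
        return [f.result() for f in futures]


images = reconstruct_sequence([[1, 2, 3], [2, 3, 4], [3, 4, 5]], n_r=2)
```

Collecting the results in submission order keeps the reconstructed images in sequence even when a later reconstruction finishes before an earlier one.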
The display device 1040 may be configured to display the sequence of images. The display device 1040 may be realized by any desired type of display technology. In some embodiments, the display device 1040 may be a computer monitor, or the display of a mobile device or handheld device, or a video projector, or the display of a laptop or notebook computer, or a head-mounted display.
In some embodiments, system 1000 may be realized as a single integrated package, e.g., as a camera device.
In some embodiments, system 1000 may include the optical subsystem 105 described above in connection with system 100 and system realization 200. The optical subsystem 105 may be configured to allow adjustable focus, e.g., in response to user input. In one embodiment, the processing unit 1030 may be configured to increase the amount of overlap (or the overlap ratio), and thereby, increase the image rate fimage when the system 1000 is being focused. A user may prefer to have a higher image rate while adjusting the focus setting, especially in contexts where the baseline image rate (i.e., the image rate without any overlap) is on the order of seconds per image.
In some embodiments, the processing unit 1030 is located remotely from the light modulation unit 1010 and the light sensing device 1020, in which case the system 1000 may also include a transmitter 1022 and a receiver 1024, e.g., as shown in
In different embodiments, the transmitter 1022 may be configured to transmit respectively different kinds of signals over respectively different kinds of communication channel. For example, in some embodiments, the transmitter transmits electromagnetic signals (e.g., radio signals or optical signals) through a wireless or wired channel. In one embodiment, the transmitter transmits electromagnetic signals through an electrical cable. In another embodiment, the transmitter transmits electromagnetic waves through free space (e.g., the atmosphere). In yet another embodiment, the transmitter transmits through free space or through an optical fiber using modulated light signals or modulated laser signals. In yet another embodiment, the transmitter transmits acoustic signals through an acoustic medium, e.g., a body of water. The transmitter may be any type of transmitter known in the art of telecommunications. Likewise, the receiver may be any type of receiver known in the art of telecommunications.
In some embodiments, the amount of overlap (or the overlap ratio) may be set to achieve a target image rate for the image sequence. In some embodiments, the system 1000 may include a user interface 1050 for receiving user input that controls the target image rate, e.g., as illustrated in
In some embodiments, system 1000 includes a snapshot mode in which a single image is compressively acquired in response to the user's action of pressing a shutter button. (The button may be a mechanical button or a graphical user interface element/icon.) The user may wish to acquire a number of images in quick succession, and thus, may press the shutter button repeatedly. The methods described herein for increasing the rate of image sequence acquisition and reconstruction are useful even when the sequence of images is acquired in the snapshot mode. Each press of the shutter button may initiate the acquisition of a corresponding subset of measurements for the reconstruction of a corresponding image. The overlap ratio between successive images may be determined by the time between successive presses of the shutter button. In some embodiments, the processing unit 1030 (or processing unit 150 of system 100) may capture timestamps indicating times when the user presses the shutter button (or otherwise initiates the acquisition of images). Each timestamp may be used to control the start time of a corresponding subset of the sequence of measurements. The number of measurements in each subset may be determined by an image quality setting, e.g., one controlled by user input.
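The mapping from shutter-press timestamps to subset start positions, and the overlap that results, might look like the following Python sketch. The function names, the rounding of timestamps to the nearest measurement index, and the sample values are all assumptions made for illustration.

```python
def subset_starts(press_times, f_s):
    """Index of the first measurement of each subset, one per shutter press."""
    return [round(t * f_s) for t in press_times]


def shared_measurements(starts, m):
    """Number of measurements shared by each consecutive pair of length-m subsets."""
    return [max(0, m - (b - a)) for a, b in zip(starts, starts[1:])]


# Hypothetical presses at 0 s, 30 ms and 200 ms with a 10 kHz measurement rate.
starts = subset_starts([0.0, 0.03, 0.2], f_s=10_000.0)
shared = shared_measurements(starts, m=1000)
```

Here the first two presses are 300 measurements apart, so their 1000-measurement subsets share 700 measurements (an overlap ratio of 0.7), while the third press comes late enough that its subset shares nothing with the second.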
In some embodiments, the light sensing device 1020 may include a plurality (e.g., an array) of light sensing elements. Each light sensing element may be configured to receive a corresponding spatial portion (or spectral portion) of the modulated light stream and to generate a corresponding electrical signal representing intensity of the spatial portion (or spectral portion) as a function of time. The light sensing device 1020 (i.e., A/D conversion circuitry of the light sensing device 1020) may then sample each of the electrical signals to obtain a plurality of measurement sequences, i.e., one measurement sequence per electrical signal. In these embodiments, each of the measurement sequences may be treated as variously described above. In particular, each of the measurement sequences may be processed to reconstruct a corresponding image sequence representing a corresponding spatial portion (or spectral portion) of the field of view. The reconstruction of each image sequence may employ the overlapping-subset methodology as variously described above. The plurality of image sequences may be concatenated together to obtain a global image sequence representing the entirety of the field of view. For more information on compressive imaging systems including such light sensing devices as just described, please refer to U.S. patent application Ser. No. 13/197,304 (U.S. Pub. No. 20120038786), filed on Aug. 3, 2011, entitled “Decreasing Image Acquisition Time for Compressive Imaging Devices”, which is hereby incorporated by reference in its entirety.
Rolling Reconstruction from a Mathematical Viewpoint
Suppose that one is interested in obtaining compressive measurements of a signal $x^t$. Here the superscript t highlights the fact that the signal may be different at different moments of time. For simplicity, assume that the signal of interest is in the form of a column vector with N elements. The column vector $x^t$ represents the light intensities of portions of the incident light stream incident upon the array of light modulating elements at time t. See the above discussion of modulation given in connection with system 100 and system realization 200. At time t, the signal of interest $x^t$ is measured by taking its inner product with a test function $a^t$ (also referred to herein as a “spatial pattern”) as variously described above. The test function $a^t$ may be represented as a row vector of length N. The measurement $y^t$ is the scalar value resulting from this inner product plus a noise term. (Below we assume that the signals under discussion are real. However, the principles discussed here naturally extend to complex-valued signals and measurements.) Mathematically, the measurement can be represented as
$$y^t = a^t x^t + e^t,$$
where $e^t$ is the noise.
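The measurement model can be simulated in a few lines of Python. This is a minimal sketch assuming a binary pattern and Gaussian noise; neither assumption is required by the description above, and the function name is hypothetical.

```python
import random


def measure(pattern, scene, noise_std=0.0, rng=random):
    """Return one scalar measurement: inner product of pattern and scene plus noise."""
    inner = sum(a * x for a, x in zip(pattern, scene))
    noise = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
    return inner + noise


# Hypothetical 4-element scene and a binary pattern selecting elements 0 and 2.
scene = [0.2, 0.5, 0.3, 0.9]
pattern = [1, 0, 1, 0]
y = measure(pattern, scene)
```

In the noiseless case the measurement is simply the sum of the scene intensities selected by the pattern.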
Assume that the acquisition system (e.g., system 100 or system 1000) can take a measurement every TM seconds. (TM is the reciprocal of the rate fM at which the light modulation unit can apply the test functions, i.e., the spatial patterns.) The measurements may be continuously collected into a running list:
$$\{y^{jT_M}\}_{j=1,2,\ldots}$$
Given this list of measurements, suppose that we want to use M of them to reconstruct the original signal/scene, or at least, a close approximation. Suppose that we supply an algorithm ALG with M measurements, and that it takes TALG seconds for the algorithm to reconstruct an image.
Note: If we assume that the reconstruction algorithm achieves the latency TALG for each frame, then the reconstruction algorithm doesn't affect the frame rate. (The “frame rate” referred to here is a synonym for the image rate fimage referred to above in connection with method 600 and system 1000.) This is illustrated, e.g., in
For k=1, 2, . . . , the conventional compressive sensing (CS) methodology gathers both the kth set of M measurements
$$y_M^k = \{y^{((k-1)M+1)T_M},\ y^{((k-1)M+2)T_M},\ \ldots,\ y^{((k-1)M+M)T_M}\} \in \mathbb{R}^M$$
and the corresponding set of M test functions
$$A_M^k = \{a^{((k-1)M+1)T_M},\ a^{((k-1)M+2)T_M},\ \ldots,\ a^{((k-1)M+M)T_M}\} \in \mathbb{R}^{M \times N}$$
and supplies them to the reconstruction algorithm ALG to generate the kth image
$$\hat{x}^k = \mathrm{ALG}(y_M^k, A_M^k).$$
Moreover, since we have to wait for M new measurements to generate each new image, the frame rate fframe=1/(MTM). This frame rate is the rate at which groups of measurements are delivered to the reconstruction algorithm.
According to the present invention, we can speed up the frame rate by removing the necessity of waiting for M completely new measurements. Suppose that we allow the sets of M measurements to overlap, or have redundancy r where 0≦r≦1. This means we keep the most recent rM measurements and then only have to wait for (1−r)M new measurements before reconstructing a new image. Now the frame rate has been increased to
$$f_{frame} = \frac{1}{M(1-r)\,T_M}.$$
For large r, this can result in a significant increase in speed. We refer to this process as a “rolling reconstruction”. It is depicted, e.g., in
In rolling reconstruction, the kth set of M measurements and the corresponding set of test functions can be represented, respectively, as
where, as before, the kth reconstructed image is $\hat{x}^k = \mathrm{ALG}(y_M^k, A_M^k)$.
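One indexing that is consistent with the definitions above can be written out as follows. It assumes that the number of new measurements per image, S = (1−r)M, is an integer; the symbol S is introduced here only for illustration.

$$y_M^k = \{y^{((k-1)S+1)T_M},\ \ldots,\ y^{((k-1)S+M)T_M}\}, \qquad A_M^k = \{a^{((k-1)S+1)T_M},\ \ldots,\ a^{((k-1)S+M)T_M}\}, \qquad S = (1-r)M.$$

With r=0 this reduces to the non-overlapping sets of the conventional methodology, and for r>0 each consecutive pair of sets shares M−S = rM measurements.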
Because of the redundancy in each $y_M^k$ and $A_M^k$, we can accelerate the algorithm ALG (i.e., reduce TALG) by supplying to it the last reconstructed image. Hence, the new reconstructed image $\hat{x}^k$ will also be a function of the previous reconstructed image $\hat{x}^{k-1}$, i.e.,
$$\hat{x}^k = \mathrm{ALG}(y_M^k, A_M^k, \hat{x}^{k-1}).$$
The rationale here is that since the measurements and corresponding test functions between two successive image reconstructions have much in common, then so too must the reconstructed images. Therefore, rather than have the algorithm start from scratch each time, it can be accelerated by providing the previous reconstructed image as a good guess for a “warm start.” Other aspects of the algorithm can also be optimized in a similar way (e.g., by saving certain tuning and convergence parameters for the algorithm, etc.).
This acceleration technique may be instrumental in reducing TALG to less than a desired maximum latency. For example, it may be desirable for TALG to be less than or equal to 1/fframe.
We can generalize the acceleration technique described above to say that
$$\hat{x}^k = \mathrm{ALG}(y_M^k, A_M^k, \hat{x}^\kappa),$$
where κ (Greek kappa) is a real-valued index, where κ<k. This includes the case of κ=k−1 described above, as well as other values of κ. The case of κ=k−2 would utilize the reconstructed image from two frames before, etc. In particular, κ=k−½ describes the case where we take a “half-finished” reconstructed image that the algorithm is still working on (from the previous frame), and feed that in as a best guess for the present frame based on the new set of measurements. Other, more complicated, scenarios are possible too.
Acceleration of Matching Pursuit
In this section, we describe how the previous reconstructed image (or partially reconstructed image) $\hat{x}^\kappa$ may be incorporated into the structure of a matching pursuit algorithm to implement the above-described acceleration given by:
$$\hat{x}^k = \mathrm{ALG}(y_M^k, A_M^k, \hat{x}^\kappa).$$
Inputs: Current measurement vector $y_M^k \in \mathbb{R}^M$; sparsity basis Ψ; measurement matrix $A_M^k$; and the previous reconstructed image $\hat{x}^\kappa$.
1. Initialize the residual vector $r_0 = y_M^k - A_M^k \hat{x}^\kappa$ and the coefficient vector $\hat{\alpha} = \Psi^{-1} \hat{x}^\kappa \in \mathbb{R}^N$. Initialize an iteration counter t=1.
2. Select the vector $\theta_i$ in the holographic basis $\Theta = A_M^k \Psi$ that maximizes the projection of the residual vector $r_{t-1}$ onto $\theta_i$:
3. Update the residual vector and the estimate of the coefficient vector $\hat{\alpha}$ using the maximizing vector $\theta_n$
where $\hat{\alpha}_n$
4. Increment t. If t<T and $\|r_t\|_2 > \varepsilon \|y\|_2$, then go to Step 2; otherwise go to Step 5. Parameter ε is a predetermined small positive constant.
Note that the iteration limit T may be different in different applications or for different types of images. Furthermore, T may depend on the amount of overlap r. In some embodiments, T may be set to 2K, where K is the sparsity of the images being reconstructed. While T is set to 2K, the average number of iterations through the loop comprising steps 2-4 is expected to be significantly smaller than K, due to the acceleration provided by use of the previous reconstructed image.
5. Estimate the current image $\hat{x}^k$ as $\hat{x}^k = \Psi \hat{\alpha}$.
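Steps 1 through 4 can be sketched in plain Python as shown below. This is a minimal sketch, not a definitive implementation: the selection rule (largest normalized inner product) and the coefficient and residual updates are the textbook matching pursuit formulas, assumed here to correspond to Steps 2 and 3. The function takes the columns of Θ and the warm-start coefficients (Ψ⁻¹ applied to the previous image) directly, checks the stopping rule before the first pass as well, and uses hypothetical names throughout. Step 5 (multiplying by Ψ) is left to the caller.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def warm_start_matching_pursuit(y, theta_cols, alpha0, max_iter, eps=1e-6):
    """Matching pursuit started from a prior coefficient estimate.

    theta_cols[i] is column theta_i of Theta = A * Psi (assumed nonzero);
    alpha0 holds the coefficients of the previous (or partial) image.
    Returns the coefficient estimate and the number of iterations used.
    """
    alpha = list(alpha0)
    # Step 1: residual of the measurements relative to the warm start.
    residual = list(y)
    for coeff, col in zip(alpha, theta_cols):
        if coeff != 0.0:
            residual = [r - coeff * c for r, c in zip(residual, col)]
    y_norm = math.sqrt(dot(y, y))
    norms_sq = [dot(col, col) for col in theta_cols]
    iterations = 0
    # Step 4 (stopping rule): iteration limit and relative residual size.
    while iterations < max_iter and math.sqrt(dot(residual, residual)) > eps * y_norm:
        # Step 2: column with the largest normalized projection of the residual.
        n = max(range(len(theta_cols)),
                key=lambda i: abs(dot(residual, theta_cols[i])) / math.sqrt(norms_sq[i]))
        # Step 3: update the selected coefficient and the residual.
        gain = dot(residual, theta_cols[n]) / norms_sq[n]
        alpha[n] += gain
        residual = [r - gain * c for r, c in zip(residual, theta_cols[n])]
        iterations += 1
    return alpha, iterations


# Hypothetical 3x3 example with orthogonal columns; true coefficients [0, 2, -1].
theta = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 4.0]]
y_meas = [0.0, 2.0, -4.0]
cold_alpha, cold_iters = warm_start_matching_pursuit(y_meas, theta, [0.0, 0.0, 0.0], max_iter=4)
warm_alpha, warm_iters = warm_start_matching_pursuit(y_meas, theta, [0.0, 2.0, -0.5], max_iter=4)
```

In this toy case the cold start needs one iteration per nonzero coefficient, while the warm start, whose initial coefficients are already close to the answer, needs only a single correction.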
Compressive Imaging System 1300
In one set of embodiments, a compressive imaging system 1300 may be configured as shown in
The optical system 1310 focuses an incident light stream onto the spatial light modulator 1315, e.g., as variously described above. See the discussion above regarding optical subsystem 105. The incident light stream carries an image (or a spectral ensemble of images) that is to be captured by the CI system in compressed form.
The spatial light modulator 1315 modulates the incident light stream with a sequence of spatial patterns to obtain a modulated light stream, e.g., as variously described above.
Each of the detectors 1320 generates a corresponding electrical signal that represents the intensity of a corresponding portion of the modulated light stream, e.g., a spatial portion or a spectral portion of the modulated light stream.
Each of the amplifiers 1325 (e.g., transimpedance amplifiers) amplifies the corresponding detector signal to produce a corresponding amplified signal.
Each of the ADCs 1330 acquires samples of the corresponding amplified signal.
The processing element 1340 may operate on the sample sets obtained by the respective ADCs to reconstruct respective images. The images may represent spatial portions or spectral slices of the incident light stream. Alternatively, or additionally, the processing element may send the sample sets to a remote system for image reconstruction.
The processing element 1340 may include one or more microprocessors configured to execute program instructions stored in a memory medium.
The processing element 1340 may be configured to control one or more other elements of the CI system. For example, in one embodiment, the processing element may be configured to control the spatial light modulator 1315, the transimpedance amplifiers 1325 and the ADCs 1330.
The processing element 1340 may be configured to perform any subset of the above-described methods on any or all of the detector channels.
Compressive Imaging System 1400
In one set of embodiments, a compressive imaging system 1400 may be configured as shown in
The light modulation unit 110 receives an incident light stream and modulates the incident light stream with a sequence of spatial patterns to obtain a modulated light stream MLS, e.g., as variously described above.
The optical subsystem 1410 delivers portions (e.g., spatial portions or spectral portions) of the modulated light stream to corresponding ones of the light sensing devices LSD1 through LSDL.
For information on various mechanisms for delivering spatial subsets of the modulated light stream to respective light sensing devices, please see U.S. patent application Ser. No. 13/197,304, filed on Aug. 3, 2011, titled “Decreasing Image Acquisition Time for Compressive Imaging Devices”, invented by Woods et al., which is hereby incorporated by reference in its entirety.
In some embodiments, the optical subsystem 1410 includes one or more lenses and/or one or more mirrors arranged so as to deliver spatial portions of the modulated light stream onto respective ones of the light sensing devices. For example, in one embodiment, the optical subsystem 1410 includes a lens whose object plane is the plane of the array of light modulating elements and whose image plane is a plane in which the light sensing devices are arranged. (The light sensing devices may be arranged in an array.)
In some embodiments, optical subsystem 1410 is configured to separate the modulated light stream into spectral components and deliver the spectral components onto respective ones of the light sensing devices. For example, optical subsystem 1410 may include a grating, a spectrometer, or a tunable filter such as a Fabry-Perot Interferometer to achieve the spectral separation.
Each light sensing device LSDj generates a corresponding electrical signal vj(t) that represents intensity of the corresponding portion MLSj of the modulated light stream.
Each signal acquisition channel Cj acquires a corresponding sequence of samples {Vj(k)} of the corresponding electrical signal vj(t). Each signal acquisition channel may include a corresponding amplifier (e.g., a TIA) and a corresponding A/D converter.
The sample sequence {Vj(k)} obtained by each signal acquisition channel may be used to reconstruct a corresponding sub-image which represents a spatial portion or a spectral slice of the incident light stream. The number of samples m in each sample sequence {Vj(k)} may be less than (typically much less than) the number of pixels in the corresponding sub-image. Thus, each signal acquisition channel Cj may operate as a compressive sensing camera for a spatial portion or spectral portion of the incident light.
Each of the signal acquisition channels may include any subset of the embodiments, features, and elements described above.
The principles of the present invention are not limited to light. Various embodiments are contemplated where the signals being processed are, e.g., electromagnetic waves or particle beams or seismic waves or acoustic waves or surface waves on a boundary between two fluids or gravitational waves. In each case, a space-time signal is directed to an array of signal-modulating elements whose transmittances or reflectances are individually varied so as to modulate the space-time signal with a time sequence of spatial patterns. The modulated space-time signal may be sensed by a transducer to generate an electrical signal that represents intensity of the modulated space-time signal as a function of time. The electrical signal is sampled to obtain measurements. The measurements may be processed as variously described above to reconstruct the image or sequence of images carried by the original space-time signal.
Any of the various embodiments described herein may be combined to form composite embodiments. Furthermore, any of the various features, embodiments and elements described in U.S. Provisional Application No. 61/502,153 may be combined with any of the various embodiments described herein.
Although the embodiments above have been described in considerable detail, numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.
This application claims the benefit of priority to U.S. Provisional Application No. 61/502,153, filed on Jun. 28, 2011, entitled “Various Compressive Sensing Mechanisms”, invented by Tidman, Weston, Bridge, McMackin, Chatterjee, Woods, Baraniuk and Kelly, which is hereby incorporated by reference in its entirety as though fully and completely set forth herein.
Number | Date | Country
---|---|---
61502153 | Jun 2011 | US