The subject matter disclosed herein relates in general to signal processing and in particular to temporal resolution of signals.
Resolution in a digital signal is generally related to its frequency content. High-resolution (HR) signals are band-limited to a more extensive frequency range than low-resolution (LR) signals. Resolution is generally limited by two factors: physical device limitations and the sampling rate. For example, digital image resolution is typically limited by the imaging device's optics (i.e., diffraction limit) and the sensor's pixel density (i.e., sampling rate).
A technique used to increase resolution is temporal super-resolution (TSR). This technique is based on increasing the effective temporal sampling frequency, and thereby the Nyquist frequency, beyond the limit imposed by the native sampling rate. Different approaches may be used in applying TSR, also commonly referred to as “up-sampling”, in particular for image signals. Some rely on hardware to increase temporal frequency detection, others on software, others on deep learning models, and others on some combination of the former.
In various embodiments there is provided a method for imaging a target, the method comprising transmitting a plurality of N pulses of electromagnetic (EM) waves to illuminate the target, receiving a pulse of EM waves that is reflected by the target from each of the transmitted pulses at an imager sensitive to the EM waves, integrating energy in the plurality of received pulses during a same exposure period of the imager to provide a measure of the integrated energy, and processing the measure of integrated energy to provide N images of the target.
In various embodiments, there is provided an imaging system operable to image a target, the imaging system comprising a source of EM waves controllable to transmit a plurality of pulses of EM waves to illuminate the target, a sensor sensitive to the EM waves controllable to have an exposure period during which the sensor is enabled to receive and integrate energy in EM waves reflected by the target from the transmitted EM waves to provide a measure of the integrated energy, and a controller configured to control the source of EM waves and the sensor, and to process the measure of integrated energy to provide N images of the target.
In some embodiments, the sensor integrates energy for each of the M different characterizing features independently of integrating energy for the other characterizing features to provide a measure of integrated energy for each of the M characterizing features.
In some embodiments, the controller processes the measure of integrated energy for each of the M features to provide N images of the target for each of the M features for a total of N×M images of the target.
In some embodiments, the controller processes the integrated energy to provide the N images by minimizing a cost function.
In some embodiments, the transmitted pulses of EM energy comprise EM waves characterized by M different distinguishing features.
In some embodiments, the M different distinguishing features comprise different wavelength bands of EM energy.
In some embodiments, the M different distinguishing features comprise different directions of polarization.
In some embodiments, integrating energy comprises integrating energy for each of the M different characterizing features independently of integrating energy for the other characterizing features to provide a measure of integrated energy for each of the M characterizing features.
In some embodiments, processing the integrated energy comprises processing the measure of integrated energy for each of the M features to provide N images of the target for each of the M features for a total of N×M images of the target.
In some embodiments, processing the integrated energy to provide the N images comprises minimizing a cost function.
In some embodiments, the cost function comprises a temporal cost function.
In some embodiments, the cost function comprises a spatiotemporal cost function.
In some embodiments, the cost function comprises a Lagrangian cost function.
In some embodiments, the EM waves comprise visible light waves.
In some embodiments, the EM waves comprise infrared (IR) waves.
In some embodiments, the EM waves comprise ultraviolet (UV) waves.
Non-limiting examples of embodiments disclosed herein are described below with reference to figures attached hereto. The drawings and descriptions are meant to illuminate and clarify embodiments disclosed herein and should not be considered limiting in any way. Like elements in different drawings may be indicated by like numerals. Elements in the drawings are not necessarily drawn to scale.
Applicant has realized that TSR supported by hardware (e.g., optics or sensor) has the potential to increase the temporal sampling frequency to a much higher rate, and with greater reliability, than the other approaches. A drawback, Applicant further realized, is the complexity of known systems and the price associated with such systems.
As a result, Applicant has developed a novel approach for TSR that allows use of an imaging system of low complexity and provides a high temporal sampling frequency with high reliability of spectral reconstruction. The method uses the optical reflection properties of a “target” (an object or an entity, or an area therein), such as its surface polarization reflection and/or its spectral reflectivity (the target's color). The imaging system (which may also be referred to hereinafter simply as “system”) combines an imager, a high-frequency illumination source (illuminator) which transmits electromagnetic pulses, and a controller which processes optical coding signals (electromagnetic waves reflected from the transmitted electromagnetic pulses) received by the imager at a fixed sampling rate. Optionally, the imaging system includes a neural network.
An aspect of an embodiment of the disclosure relates to a TSR method for up-sampling a sampling rate of an imaging system, optionally to enhance the system's sensitivity to high-frequency features of a target, the image of which is captured by the imager. The method includes operating the imager to acquire an image of the target for each of a sequence of exposure periods having a duration T and exposure period repetition frequency fe substantially equal to 1/T, while simultaneously illuminating the target with a temporally periodic illumination pattern of EM waves (transmitted EM pulses). T may be in a range from 1 ms to 1 second, although it may optionally be greater than 1 second, for example, 1.3 seconds, 1.5 seconds, 1.8 seconds, 2 seconds, or even greater. The illumination pattern may have a temporal period equal to about T/N, where N is an integer greater than 1, and includes EM waves characterized by M different distinguishing features that the system processes in M different respective imaging channels. N may be in the range from 2 to 10, although it may optionally be greater than 10, for example, 12, 15, 20, 30, 45, 60, or even greater. M may be in the range from 1 to 10, although it may optionally be greater than 10, for example, 12, 15, 20, 30, 45, 60, or even greater.
EM waves characterized by a characterizing feature m, 1≤m≤M, may be referred to as EM waves in channel m or imaging channel m. The different distinguishing features may, by way of example, be different wavelength bands or different directions of polarization. Different wavelength bands may be different wavelength bands of visible light, of infrared (IR) light, or of ultraviolet (UV) light. For each exposure period while the illumination pattern illuminates the target at an illumination period frequency fl equal to about N/T, the imager acquires one image of the target for each imaging channel, for a total of M images of the target. Each of the M images acquired for the target during a single exposure period is generated by integrating energy in EM waves reflected by the target and collected by the system in the corresponding m-th imaging channel from all the N periods of the illumination pattern that illuminate the target during the exposure period.
In some embodiments, data in the M images is processed to generate an image of the target for each of the N illumination periods that occur during the exposure period. The result is a total of N×M images of the target. The generated images provide a sequence of N images of the target at an effective image acquisition rate and a corresponding sampling frequency of EM waves reflected by the target equal to the illumination period frequency fl=N/T, which is greater than the exposure period frequency by a factor of N. At the sampling frequency N/T the sequence of images encodes temporal features of the target up to an upper bound frequency about equal to the Nyquist frequency fl/2=N/(2T), which is greater by a factor of N (the up-sampling factor) than the upper bound Nyquist frequency fe/2=1/(2T) associated with the exposure period repetition frequency fe.
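By way of a worked numeric illustration, consider the exemplary RGB parameters given further below (T=33 ms, N=6); the Python sketch below simply evaluates the frequencies defined above:

```python
# Worked example of the up-sampling arithmetic above, using the
# exemplary RGB parameters given further below (T = 33 ms, N = 6).
T = 33e-3                 # exposure period duration, seconds
N = 6                     # illumination periods per exposure period

f_e = 1 / T               # exposure period repetition frequency, ~30 Hz
f_l = N / T               # illumination period frequency, ~182 Hz
nyquist_native = f_e / 2  # ~15 Hz: upper bound frequency without TSR
nyquist_tsr = f_l / 2     # ~91 Hz: upper bound with TSR, greater by N

print(f_e, f_l, nyquist_native, nyquist_tsr)
```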
In some embodiments, for N=M, data from the M images is sufficient to determine the N images exactly, and in accordance with an embodiment, data from the M images may be processed to determine N “exact” images of the target. For N>M, the N images are underdetermined by data in the M images, and data from the M images is processed to satisfy a constraint based on a cost function to determine approximations for the N images.
Imager 102 may include an RGB camera or other imaging device suitable to receive EM waves 116 (for example, light) reflected from target 112 and to acquire images of the target while it is temporally illuminated by pulsed EM waves 114 (for example, pulsed light) from illuminator 104. For convenience hereinafter, light received by the imager (i.e., light 116) may also be referred to as “received light” or “reflected light”, and light transmitted by the illuminator (i.e., light 114) may be referred to as “transmitted light”, “pulsed light”, or “transmitted pulsed light”. Imager 102 may acquire images of target 112 during a sequence of exposure periods having a duration T and exposure period repetition frequency fe substantially equal to 1/T. Imager 102 may additionally acquire M images associated with M different distinguishing features in pulsed light 114 originating from illuminator 104 and reflected back in light 116, which may be associated with a polarization and/or color of the light, optionally RGB light. Illuminator 104 may transmit pulsed light 114 having a temporal period equal to T/N. For exemplary purposes, pulsed light 114 may be RGB light with M=3, N=6, and T=33 ms. For polarized light, exemplary parameters may be M=2, N=5, and T=10 ms. It is noted, as previously stated, that pulsed light 114 may be IR or UV light; exemplary parameters for IR or UV may be M=1, N=3, and T=20 ms.
Controller 106 includes a processor 108 and a memory 110. Optionally, controller 106 includes a neural network 111. Processor 108 controls illuminator 104 to transmit and illuminate the target with pulsed light 114 for each of the M different features characterizing the light according to the temporal period T/N. Processor 108 additionally controls imager 102 to receive light 116 reflected by target 112 from transmitted light 114 during a sequence of exposure periods having duration T and exposure period repetition frequency fe equal to about 1/T, and to register the received imaging information in M imaging channels. Processor 108 further processes the received imaging information by applying a TSR algorithm as described further below.
Memory 110 may store all executable instructions required for the operation of processor 108. These may include instructions associated with the execution of the TSR algorithm. Memory 110 may additionally store the imaging information associated with the M channels generated by imager 102 from reflected light 116 for each of the M distinguishing features in pulsed light 114, as well as combined images following application of TSR. It is noted that memory 110, although shown as a single unit in controller 106, may include more than one storage unit in the controller and/or one or more storage units external to the controller and/or one or more storage units in processor 108.
Neural network (NN) 111, optionally included in controller 106, may optionally be an unsupervised NN. An exemplary NN 111 architecture may be based on U-Net, and may include a first stage which may serve as an encoder and a second stage which may serve as a decoder. In the encoding stage, NN 111 may use down-sampling, optionally non-linear down-sampling such as, for example, max pooling, to extract the maximum value associated with each one of the M characterizing features in the M imaging channels for all the N periods. In the decoding stage, up-sampling may be applied to transfer the mapping resulting from the first stage to a larger pixel space. Optionally, non-linear filtering using a ReLU activation function may be applied.
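A minimal sketch of such an encoder–decoder follows, assuming a PyTorch implementation; the layer widths, kernel sizes, and class name are hypothetical, since the description fixes only the U-Net-style encoder/decoder structure, max-pool down-sampling, ReLU activation, and up-sampling to a larger pixel space.

```python
import torch
import torch.nn as nn

class TSRNet(nn.Module):
    """Hypothetical sketch of NN 111: an encoder that max-pools the
    M-channel input and a decoder that up-samples back to a larger
    pixel space, emitting N images per channel (N*M outputs)."""

    def __init__(self, m_channels: int = 3, n_periods: int = 5):
        super().__init__()
        # Encoding stage: convolution + ReLU, then non-linear
        # down-sampling (max pooling) to extract maximal responses.
        self.encoder = nn.Sequential(
            nn.Conv2d(m_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Decoding stage: up-sampling back to the input pixel space,
        # followed by a convolution and ReLU non-linear filtering.
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(32, m_channels * n_periods, kernel_size=3, padding=1),
            nn.ReLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

# M=3 integrated channel images in, N*M=15 up-sampled images out.
net = TSRNet()
out = net(torch.randn(1, 3, 64, 64))   # out.shape == (1, 15, 64, 64)
```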
At block 202, illuminator 104 transmits pulse trains LPm of transmitted light 114 comprising pulses cnm having pulse widths τ=T/N at illumination frequency fl. For exemplary purposes, the pulsed light 114 has M=3 pulse trains LPm, 1≤m≤3, of EM energy, each optionally comprising N=5 pulses cnm, 1≤n≤N=5, of EM energy for each exposure period 220 of imager 102. Pulse trains LPm are therefore configured having an illumination period frequency fl equal to about N·fe. Pulse trains LPm are optionally visible light pulse trains comprising pulses cn1, cn2, cn3 of R, G, and B light, respectively. Pulse trains LP1, LP2, LP3 are schematically shown along timelines 212, 214, and 216, respectively.
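The pulse-train timing may be sketched as follows; the particular binary on/off pattern is hypothetical, as the description does not fix a specific coded pattern:

```python
import numpy as np

# Hypothetical binary illumination codes: row m is pulse train LPm
# (R, G, B), column n is sub-interval n of one exposure period; a 1
# means pulse c_nm is on during that sub-interval.
codes = np.array([[1, 0, 1, 0, 1],   # LP1 (R)
                  [0, 1, 0, 1, 0],   # LP2 (G)
                  [1, 1, 0, 0, 1]])  # LP3 (B)
M, N = codes.shape

T = 33e-3                    # exposure period duration (s), exemplary
tau = T / N                  # pulse width τ = T/N
f_l = N / T                  # illumination frequency, about N·f_e
onsets = np.arange(N) * tau  # pulse onset times within an exposure
```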
At block 204, imager 102 receives N pulses of reflected light 116 from target 112 associated with each of the pulse trains LPm. The reflected light 116 is received and registered by imager 102 during the exposure period 220. The exposure periods 220 are shown along timeline 218. During each exposure period 220, imager 102 collects and images reflected light 116 from target 112 from the N light pulses cn1, cn2, cn3, 1≤n≤N=5 in pulse trains LP1, LP2, LP3 of pulsed light 114 respectively, on pixels of a photosensor (not shown) comprised in the imager.
At block 206, each pixel integrates energy from the reflected light pulses imaged on the pixel in each pulse train during exposure period 220 on different respective imaging channels Cm, 1≤m≤3, of the pixel to register the light. Typically, an imaging channel of a pixel for registering R, G, or B light includes a light-sensitive region overlaid by an R, G, or B filter, respectively, and electronics for integrating and converting energy in incident light that passes through the filter into an electronic signal. Let C1, C2, and C3 represent the electronic signals that a pixel generates responsive to pulses of reflected light 116, from transmitted light pulses cn1, cn2, cn3, reflected by a region of target 112 that is imaged on the pixel during an exposure period 220. Signals C1, C2, and C3 may be thought of, and are optionally referred to, as images of the region imaged on the pixel.
Images C1, C2, and C3 may be acquired at a sampling frequency equal to fe, and the images encode data from the area on target 112 characterized by temporal frequencies in a bandwidth limited by a cutoff frequency equal to about the Nyquist frequency fe/2. The images may therefore be blind to high-frequency features, for example ephemeral features (not shown) that are exhibited for very short periods of time.
At block 208, to increase the temporal cutoff frequency of the images acquired by imager 102, controller 106 processes images C1, C2, and C3 to generate images of target 112 for each pulse cn1, cn2, cn3 that illuminates the target during each exposure period 220, and provides N images of the target for each exposure period. At N images per exposure period, imager 102 operates at an effective temporal cutoff frequency equal to about N·fe/2=fl/2.
The method of processing images C1, C2, and C3 by controller 106 to generate up-sampled images of target 112 for each pulse cn1, cn2, cn3 is described with reference to a flow chart 300. Also described therein is the integration method employed by the sensors to generate C1, C2, and C3. It is noted that the method is described generically, for M imaging channels Cm (i.e., images Cm).
At block 302, to determine Cm, an assumption may be made that imager 102 generates images of target 112 responsive to reflected light 116 for each of M channels respectively defined by sensitivity to light in a different wavelength band represented by λm (1≤m≤M). Linear optics may also be assumed, so that the reflected light does not undergo any changes as it optionally passes through the channels. Let Cm(T,t) represent an image that a pixel in imager 102 generates for a particular exposure period having duration T that begins at a given time t. Let Qm(λ) represent the sensitivity of a pixel in imager 102, as a function of wavelength λ, to intensity of incident light in wavelength band λm, and let cm(λ,t) represent the intensity of light in an illumination pattern that illuminator 104 transmits at time t as a function of wavelength in wavelength band λm. If R(λ,t) represents the reflectivity of regions in target 112 at time t as a function of wavelength λ, then the pixel generates an image Cm(T,t), responsive to incident light reflected by a region of target 112 imaged on the pixel, that may be expressed by

C_m(T,t) = ∫_t^{t+T} ∫_{−∞}^{+∞} Q_m(λ) c_m(λ,t′) R(λ,t′) dλ dt′.  (1)
Assuming that cm(λ,t) is separable and may be written cm(λ,t)=cm(t)cm(λ), Qm(λ) may be redefined to include the wavelength dependence cm(λ) of cm(λ,t), and equation (1) may be written as

C_m(T,t) = ∫_t^{t+T} c_m(t′) ∫_{−∞}^{+∞} Q′_m(λ) R(λ,t′) dλ dt′,  (2)

where Q′m(λ)=Qm(λ)cm(λ).
At block 304, controller 106 may determine Cm from equation (2) for discrete conditions in which pulsed light 114 changes in time between two modes, off and on. A further assumption may be made that cm(t) is substantially constant over each of N sub-intervals of the exposure period,

c_m(t) = y_{m,k} for t_k ≤ t < t_{k+1}, 1 ≤ k ≤ N,  (3)

where ym,k are constants, and that pulsed light 114 that illuminator 104 transmits includes a pulse train having N substantially discrete pulses of light during an exposure period T, so that equation (2) may be rewritten as

C_m(T,t) = Σ_{n=1}^{N} c_{nm} i_n,  (4)

where i_n = ∫_{−∞}^{+∞} Q′_m(λ) R(λ, t_n) dλ. From equation (4), it may be appreciated that Cm has been determined with N being the up-sampling factor, and in represents the average value of the image at sub-time step n.
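For intuition, equation (4) taken over all M channels is a linear system C⃗ = A·I⃗, where A is the M×N matrix of binary pulse codes cnm; when N=M and A is invertible, the sub-interval intensities in are recovered exactly. A minimal sketch, with hypothetical code values:

```python
import numpy as np

# Measurement model of equation (4): C_m = sum_n c_nm * i_n.
A = np.array([[1, 0, 1],
              [0, 1, 1],
              [1, 1, 0]], dtype=float)  # hypothetical binary codes c_nm
i_true = np.array([0.2, 0.7, 0.4])      # sub-interval intensities i_n
C = A @ i_true                          # integrated pixel values C_m

# With N == M and an invertible code matrix, recovery is exact.
i_recovered = np.linalg.solve(A, C)
assert np.allclose(i_recovered, i_true)
```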
At block 306, controller 106 optionally applies a cost function. It is noted that, in equation (4), extracting the values of in is equivalent to up-sampling by a factor N in the time domain. This may pose a problem, as for M channels the equation may only be solved exactly for an up-sampling factor of N=M. As, in practice, the number of channels is relatively low and a high rate of TSR is desired, a cost function may optionally be introduced.
Controller 106 may define an image of a region of target 112 imaged on a given pixel for an n-th light pulse cnm in the m-th channel of imager 102 as IMnm=cnm·in (1≤n≤N, 1≤m≤M). Controller 106 may then operate to determine in (1≤n≤N), and thereby N images IMnm of the imaged region for a given exposure period at a time t and channel m, by optionally selecting scene smoothness in time as a cost function, optionally Lagrangian, which may be given by

L = Σ_{n=1}^{N−1} (i_{n+1} − i_n)² + Σ_{m=1}^{M} γ_m (C_m − Σ_{n=1}^{N} c_{nm} i_n),  (5)

where γm are Lagrange multipliers. In matrix notation the solution to equation (5) may be written as the linear system

( 2S  −Aᵀ ) ( I⃗ )   ( 0⃗ )
( A    0  ) ( M⃗ ) = ( C⃗ ),  (6)

where the vectors I⃗ and C⃗, the matrices S and A, and the multiplier vector M⃗ are defined as follows: I⃗ is the intensity vector of size N for each exposure period; C⃗ is of size M and holds the captured value in each of the channels for a single exposure period; A is the M×N matrix whose binary entries cnm are 1 or 0 according to whether the pulse of channel m is on or off at sub-interval n; S is the N×N matrix of the temporal smoothness term, such that I⃗ᵀSI⃗ = Σn (i_{n+1} − i_n)²; and M⃗ holds the Lagrange multiplier for each of the channels.
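A sketch in Python of solving the block system of equation (6) as reconstructed above, with hypothetical binary codes for N=5 and M=3 (the exemplary values used earlier):

```python
import numpy as np

N, M = 5, 3
A = np.array([[1, 0, 1, 0, 1],   # hypothetical binary codes c_nm,
              [0, 1, 0, 1, 0],   # one row per channel m
              [1, 1, 0, 0, 1]], dtype=float)

# S encodes the smoothness term: I^T S I = sum_n (i_{n+1} - i_n)^2.
D = np.diff(np.eye(N), axis=0)   # (N-1) x N first-difference operator
S = D.T @ D

i_true = np.array([0.20, 0.30, 0.35, 0.40, 0.42])  # slowly varying scene
C = A @ i_true                                     # measured channel values

# Stationarity of equation (5): 2S·I = A^T·M and A·I = C, solved jointly.
K = np.block([[2 * S, -A.T],
              [A, np.zeros((M, M))]])
rhs = np.concatenate([np.zeros(N), C])
sol = np.linalg.solve(K, rhs)
i_est, multipliers = sol[:N], sol[N:]   # smoothest I consistent with C
```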
In the above description, controller 106 determines values for in and therefrom IMnm responsive to the Lagrangian cost function defined by equation (5). However, the cost function to be applied is not limited to equation (5), which may be considered a temporal cost function that provides N images for a given pixel for each channel m as a function of a temporal sequence of images Cm(T,t) provided only by the given pixel. For example, an alternative cost function may provide N images for a given pixel as a function of images provided by pixels in a pixel neighborhood “P” of the given pixel.
Let an image provided by a given pixel at pixel coordinates x, y for an exposure period T that begins at a given time t be denoted by C^m_{x,y}(T,t). A Lagrangian cost function that controller 106 may process to determine images IMnm for a given pixel may be a spatiotemporal cost function responsive not only to a temporal sequence of images provided by the given pixel but also to images provided by pixels in a pixel neighborhood of the given pixel. Optionally, the pixel neighborhood may be a 4-neighborhood. An optional spatiotemporal Lagrangian 4-neighborhood cost function, by way of example, may be given by the expression

L_{x,y} = w^t_{x,y} Σ_{n=1}^{N−1} (i_{n+1,x,y} − i_{n,x,y})²
        + w^s_{x,y} Σ_{n=1}^{N} Σ_{(x′,y′)∈P} (i_{n,x,y} − i_{n,x′,y′})²
        + Σ_{m=1}^{M} γ_m (C^m_{x,y} − Σ_{n=1}^{N} c_{nm} i_{n,x,y}),  (7)

where w^t_{x,y} and w^s_{x,y} are temporal and spatial weights, respectively.
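A compact sketch of minimizing such a spatiotemporal cost over a small pixel grid, assuming the reconstructed form above; here the channel constraints are folded in as quadratic penalties rather than hard Lagrangian constraints (a common relaxation), and the weights are hypothetical:

```python
import numpy as np
from scipy.optimize import minimize

H, W, N, M = 4, 4, 5, 3
rng = np.random.default_rng(0)
A = rng.integers(0, 2, size=(M, N)).astype(float)  # binary codes c_nm
I_true = rng.random((H, W, N))                     # ground-truth i_{n,x,y}
C = I_true @ A.T                                   # measured images C^m_{x,y}
w_t, w_s, w_c = 1.0, 0.5, 100.0                    # hypothetical weights

def cost(flat):
    I = flat.reshape(H, W, N)                      # candidate i_{n,x,y}
    temporal = w_t * np.sum(np.diff(I, axis=2) ** 2)
    # 4-neighborhood spatial smoothness via differences along x and y.
    spatial = w_s * (np.sum(np.diff(I, axis=0) ** 2)
                     + np.sum(np.diff(I, axis=1) ** 2))
    data = w_c * np.sum((I @ A.T - C) ** 2)        # penalized constraints
    return temporal + spatial + data

res = minimize(cost, np.zeros(H * W * N), method="L-BFGS-B")
I_est = res.x.reshape(H, W, N)   # N up-sampled images per pixel
```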
Applicant conducted a number of tests to evaluate the efficacy of the disclosed method for TSR, which allows use of an imaging system of low complexity and provides a high temporal sampling frequency with high reliability of spectral reconstruction. A description of the tests and the results obtained is given below.
The test setup included a commercial CMOS camera with adjustable speed as the imager, a smartphone set at a refresh rate of 60 Hz as the illuminator, and a rotating home fan with its blades covered in white paper as the target. The camera was set at different frame speeds: 10 Hz, 20 Hz, and 80 Hz. The rotating speed of the fan was approximately 21.5 Hz. For every N, the same coded pattern was used. The temporal illumination was RGB light.
To avoid noise artifacts, white-noise filtering was applied to all the measured signals during testing.
Illumination correction was introduced because the actual signal (captured in the high frame-per-second recording) was compared with the same signal captured at a low frame-per-second rate (and up-sampled). A compensation gain was applied to the high frame-per-second signal to overcome the illumination difference due to the different exposure times. An additional correction was made for the object color (the gamma factors), representing the reflections for R, G, and B. To detect the gamma factors and balance the intensities for all colors, a reference measurement of a white target (the center of the fan) was used, and the intensity values were calibrated relative to it.
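A sketch of the gamma-factor calibration, with hypothetical array names; the white reference region (the fan center) supplies per-channel factors that balance the R, G, and B intensities:

```python
import numpy as np

# Hypothetical inputs: an H x W x 3 RGB frame and a boolean mask that
# marks the white reference target (the center of the fan).
rng = np.random.default_rng(0)
frame = rng.random((480, 640, 3))
white_mask = np.zeros((480, 640), dtype=bool)
white_mask[200:280, 280:360] = True

# A truly white target should reflect R, G, and B equally, so the
# per-channel means over the reference give the gamma factors.
white_means = frame[white_mask].mean(axis=0)  # shape (3,)
gamma = white_means / white_means.max()       # relative R, G, B factors

balanced = frame / gamma                      # intensity-balanced frame
```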
The experimental results are shown in the accompanying graphs.
The imaging results are shown in the accompanying figures.
To evaluate the SNR for different α factors, a clean white paper located 40 cm in front of the camera and the illuminator was used. Different environmental illumination levels, produced using a white-light projector, were also used. The illumination values were measured using a lux meter. The results are shown in the accompanying graphs.
It may be appreciated that the SNR is improved with the use of the illuminator, as it increases the light in the scene.
An additional experiment measured the signal reconstruction performance (angular error) versus the α factor. The cosine similarity was determined for x1 and x3 for different values of α. The results are shown in the accompanying graphs.
One fundamental task in computer vision is motion estimation, or optical flow estimation. Given the image's spatial and temporal derivatives, one can calculate the velocity of a pixel in the x-y plane. Estimation of the temporal derivative relies heavily on the camera's frame-per-second rate. Since high temporal frequencies cannot be detected with a low frame-per-second camera, using the disclosed TSR method to increase the camera's effective frame-per-second rate can improve the temporal derivative estimation. The rotating fan's blade velocity (in the x-y plane) was measured at each pixel and compared to the ground truth, which was detected using a high frame-per-second camera. The result is shown in the accompanying figure.
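For reference, the standard least-squares estimate of pixel velocity from spatial and temporal derivatives (the Lucas–Kanade formulation) is sketched below as a generic illustration, not the specific pipeline used in the test:

```python
import numpy as np

def flow_from_derivatives(Ix, Iy, It):
    """Least-squares velocity (vx, vy) over a window, given spatial
    derivatives Ix, Iy and the temporal derivative It; the accuracy
    of It, and hence of the flow, improves with the effective frame
    rate made available by TSR."""
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)  # one row per pixel
    b = -It.ravel()
    v, *_ = np.linalg.lstsq(A, b, rcond=None)
    return v

# Toy usage: synthetic derivatives consistent with v = (0.8, 0.2).
rng = np.random.default_rng(0)
Ix, Iy = rng.standard_normal((2, 5, 5))
It = -(0.8 * Ix + 0.2 * Iy)
print(flow_from_derivatives(Ix, Iy, It))  # approximately [0.8, 0.2]
```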
It may be appreciated that there is a substantial improvement in the error by using the disclosed up-sampling method.
Some stages (steps) of the aforementioned method(s) may also be implemented in a computer program for running on a computer system, at least including code portions for performing steps of the relevant method when run on a programmable apparatus, such as a computer system, or enabling a programmable apparatus to perform functions of a device or system according to the disclosure. Such methods may also be implemented in a computer program for running on the computer system, at least including code portions that make a computer execute the steps of a method according to the disclosure.
A computer program is a list of instructions such as a particular application program and/or an operating system. The computer program may for instance include one or more of: a subroutine, a function, a procedure, a method, an implementation, an executable application, an applet, a servlet, a source code, code, a shared library/dynamic load library and/or other sequence of instructions designed for execution on a computer system.
The computer program may be stored internally on a non-transitory computer readable medium. All or some of the computer program may be provided on computer readable media permanently, removably or remotely coupled to an information processing system. The computer readable media may include, for example and without limitation, any number of the following: magnetic storage media including disk and tape storage media; optical storage media such as compact disk media (e.g., CD-ROM, CD-R, etc.) and digital video disk storage media; nonvolatile memory storage media including semiconductor-based memory units such as FLASH memory, EEPROM, EPROM, ROM; ferromagnetic digital memories; MRAM; volatile storage media including registers, buffers or caches, main memory, RAM, etc.
A computer process typically includes an executing (running) program or portion of a program, current program values and state information, and the resources used by the operating system to manage the execution of the process. An operating system (OS) is the software that manages the sharing of the resources of a computer and provides programmers with an interface used to access those resources. An operating system processes system data and user input, and responds by allocating and managing tasks and internal system resources as a service to users and programs of the system.
The computer system may for instance include at least one processing unit, associated memory and a number of input/output (I/O) devices. When executing the computer program, the computer system processes information according to the computer program and produces resultant output information via I/O devices.
Unless otherwise stated, the use of the expression “and/or” between the last two members of a list of options for selection indicates that a selection of one or more of the listed options is appropriate and may be made.
It should be understood that where the claims or specification refer to “a” or “an” element, such reference is not to be construed as there being only one of that element.
All references mentioned in this specification are herein incorporated in their entirety by reference into the specification, to the same extent as if each individual reference was specifically and individually indicated to be incorporated herein by reference. In addition, citation, or identification of any reference in this application shall not be construed as an admission that such reference is available as prior art to the present disclosure.
While this disclosure has been described in terms of certain embodiments and generally associated methods, alterations and permutations of the embodiments and methods will be apparent to those skilled in the art. The disclosure is to be understood as not limited by the specific embodiments described herein, but only by the scope of the appended claims.
This application claims priority from U.S. Provisional Patent Application No. 63/219,378 filed Jul. 8, 2021, which is expressly incorporated herein by reference in its entirety.