This disclosure relates generally to electronically-implemented methods and systems for computer image processing, and more particularly to image fusion.
Image fusion is the process of combining information from different source images into a single image. The purpose of image fusion is not only to reduce the amount of image data but also to construct a fused image that is more appropriate and understandable for human and machine perception. In computer vision, multi-sensor image fusion is the process of combining relevant information from two or more images into a single, fused image. The fused image may contain more information than any of the input images.
One active research area in image fusion is fusing color images with Near-Infrared (NIR) images. In general, efforts in this research area aim to increase the detail of the color image using the extra information in the NIR image while preserving the color and brightness of the color image. For example, color and NIR image fusion is used to de-haze a scene so as to see through fog and/or haze captured in the original color image.
An innovative concept called a conjugate image is provided herein to fix deviations in a fused image that is based on an NIR image. In some embodiments, the conjugate image is applied to a fused image in the form of a weight or weighting function to preserve the colors and details of vegetation in the fused image. In those embodiments, vegetation colors, brightness, and/or any other details in the fused image are weighted towards the color image, while non-vegetation parts of the fused image are weighted towards the NIR image.
In some embodiments, a device can be configured to perform particular operations or actions by virtue of having software, firmware, hardware, or a combination of them installed on the device that in operation causes the device to perform the actions. One or more computer programs can be configured to perform particular operations or actions by virtue of including instructions that, when executed by the device, cause the device to perform the actions. One general aspect in those embodiments includes an electronically-implemented image processing method. The electronically-implemented image processing method may include obtaining a first image and a second image, where both the first image and the second image include a scene. The first image comprises a first pixel and the second image comprises a second pixel. The first pixel corresponds to the second pixel such that both pixels correspond to a same part of the scene. The electronically-implemented image processing method may include obtaining a conjugate image based on the second image. A third pixel in the conjugate image corresponds to the second pixel in the second image, and a luminance of the third pixel is less than a luminance of the second pixel.
In those embodiments, the electronically-implemented image processing method may include obtaining a weight based on the conjugate image, and fusing the first and second images to produce a fused image based on the weight. In those embodiments, as a result of the weight, the fusing is biased towards the third pixel in the conjugate image in color, luminance, and/or any other aspects rather than towards the second pixel in the second image. Other embodiments may include corresponding computer systems, apparatus, and computer programs stored on one or more computer storage devices, each configured to perform the electronically-implemented image processing method.
Various embodiments may include one or more of the following features. In some embodiments, the first image is a color image and the second image is an NIR image. In some embodiments, the electronically-implemented image processing method may include converting the first image to a L*a*b* color space, where L* represents the luminance of the first image, and a* and b* represent the colors of the first image. In some embodiments, at the third pixel in the conjugate image, the luminance of the third pixel is obtained by dividing a square of the luminance of the first pixel in the first image by the luminance of the second pixel.
In some embodiments, for the part in the scene, a difference (conjugateDiff) between the color image and conjugate image in luminance is obtained. In those embodiments, the fusing includes: for the part in the scene in the fused image, using conjugateDiff as the luminance. In those embodiments, the conjugateDiff is obtained by subtracting the luminance of the third pixel in the conjugate image from the luminance of the first pixel in the first image.
In some embodiments, for obtaining the weight based on the conjugate image, an infrared emission difference (irDiff) between the first image and the second image is obtained; a difference (conjugateDiff) between the color image and conjugate image in luminance is obtained; and a difference (irConjugateDiff) between conjugateDiff and irDiff is obtained. In those embodiments, the fusing includes applying the irConjugateDiff as a weight.
In some embodiments, an inverted irConjugateDiff is obtained. In those embodiments, the fusing includes using irConjugateDiff and inverted irConjugateDiff as weights according to the following formula:
In the formula, L1 represents the luminance of the first pixel in the first image, and L2 represents the luminance of the second pixel in the second image.
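For instance, in one non-limiting illustrative reading of such a weighting, the fused luminance at a pixel may be computed as L_fused = (1 - irConjugateDiff) × L1 + irConjugateDiff × L2, with irConjugateDiff clamped to the range [0, 1]; high IR emission pixels (irConjugateDiff = 0) then retain the color image luminance L1, while other pixels are pulled towards the NIR luminance L2. The exact weighting formula may differ across embodiments.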
These illustrative embodiments are mentioned not to limit or define the disclosure, but to provide examples to aid understanding thereof. Additional embodiments are discussed in the Detailed Description, and further description is provided there.
Features, embodiments, and advantages of the present disclosure are better understood when the following Detailed Description is read with reference to the accompanying drawings.
As used herein, a color image refers to an image captured by an image sensor and created using a color filter. An RGB (red, green, blue) image is one type of color image. However, color images are not necessarily limited to RGB images. Other types of color images are also contemplated and within the scope of the present disclosure.
As used herein, a fused image refers to an output image produced from two or more images. It is understood that a fused image in accordance with the disclosure is not necessarily limited to a final output image, for example one to be perceived by a human. Any output image, including intermediate images used for producing a final output image, is within the scope of a fused image in accordance with the present disclosure so long as it is created by fusing two or more images.
As used herein, a near-infrared (NIR) image refers to an image captured by an NIR sensor. NIR is a subset of the infrared band of the electromagnetic spectrum. These wavelengths are just outside the range of what humans can see and can sometimes offer clearer details than what is achievable with visible light imaging. A specific range of NIR wavelengths is not intended to be limited by the present disclosure. Several benefits of NIR imaging are described below, and thus image fusion methods that use various principles to enhance the fused image as described herein are within the scope of the present disclosure.
As mentioned above, NIR is very close to human vision but removes the color wavelengths, which results in most objects in an NIR image looking very similar to an image that has been converted to black and white. One exception is trees and plants, which are highly reflective at NIR wavelengths and thus appear much brighter than they do in color. That difference in reflectivity of certain objects, in combination with reduced atmospheric haze and distortion at NIR wavelengths, means that detail and visibility are often improved at long ranges for NIR-enhanced images.
One benefit of NIR imaging is that the longer wavelengths of the NIR spectrum are able to penetrate haze, light fog, smoke and other atmospheric conditions better than visible light. For long-distance imaging, this often results in a sharper, less distorted image with better contrast than what can be seen with visible light.
Another benefit of NIR imaging is that, unlike thermal energy, which displays objects quite differently from visual perception, NIR is reflected energy that behaves similarly to visible light, which means that it can capture things like printed information on signs, vehicles, and vessels that thermal imaging usually cannot. Faces, clothing, and many other objects will also look more natural and recognizable than they do in thermal imagery.
In various embodiments, devices with imaging capabilities may be configured to capture color images and NIR images of a same scene (or more or less the same scene) simultaneously (or near simultaneously) and separately. For example, such devices may include a smartphone equipped with a color image sensor and an NIR sensor. In that example, the color image sensor and the NIR sensor may be controlled (for example, by a camera app on the smartphone) to capture a color image and an NIR image of the scene simultaneously. Typically, the captured color image and NIR image differ in color, brightness, details, and/or any other aspects for various reasons explained herein. In that example, the camera app on the smartphone may be configured to employ one or more image fusion methods in accordance with the present disclosure to enhance the color image by fusing the color image and the NIR image. For example, details provided by the NIR image may be added to the color image through the image fusion.
However, one drawback with many existing image fusion methods that fuse a color image and an NIR image is that the resulting color in the fused image typically deviates from that of the color image. Under those existing image fusion methods, the fused image thus may not look natural, or its color may appear to be wrong when perceived by a human.
To address this problem, the inventor(s) of the present disclosure has come up with a number of innovative ways to fuse color images and NIR images to reduce the color/brightness deviation in the fused image. In U.S. Patent Application No. 63/113,151, entitled “Color Image & Near-Infrared Image Fusion with Base-Detail Decomposition and Flexible Color and Details Adjustment”, the inventor(s) describes a way of computing IR emission strength in the NIR image to adjust the color appearance of the fused image. In that application, the IR emission strength is derived from how much the NIR image deviates from the brightness of the color image’s L channel. U.S. Patent Application No. 63/113,151 is incorporated herein by reference in its entirety.
In accordance with the present disclosure, the inventor(s) has come up with an innovative concept called a Conjugate Image to fix the deviations in a fused image that is based on NIR. In accordance with the present disclosure, a Conjugate Image refers to an image having opposite brightness characteristics as compared to a corresponding NIR image. In some embodiments, the Conjugate Image is coupled with an IR Emission Strength image and is applied to a fused image in the form of a weighting function to preserve the colors and details of vegetation in the fused image. In those embodiments, vegetation colors in the fused image are weighted towards their colors in the color image, while non-vegetation parts of the fused image may be weighted towards their counterparts in the NIR image.
With an inventive concept in accordance with the present disclosure having been generally described, attention is now directed to
It is understood that the arrangement of the sensor 202 and sensor 204 on a same device (e.g., the device 200) is not necessarily the only arrangement for color and NIR image sensors in accordance with the present disclosure. In some other embodiments, the sensor 202 (e.g., color) may be arranged on one device and the sensor 204 (e.g., NIR) may be arranged on another device. For example, in one embodiment, the sensor 202 may be mounted on an unmanned aerial vehicle (UAV) and the sensor 204 may be mounted on a vision enhancing device separate and distinct from the UAV. In that embodiment, the UAV and the vision enhancing device may be controlled to capture the same (or more or less the same) field of view at the same or different times.
As mentioned above, in this example, the sensor 202 and sensor 204 are positioned on the device 200 to cover the same (or more or less the same) field of view so that no view cropping is needed for the images captured by them when fused. In some implementations, sensor 202 and sensor 204 may be set to capture images of the same (or more or less the same) resolution. However, this is not intended to be limiting. It is understood that matching resolutions for images captured by the sensor 202 and sensor 204 are not required by the present disclosure. As also mentioned, the processor 206 may be configured to generate an instruction to cause the sensor 202 and sensor 204 to capture images of the field of view at the same (or more or less the same) time. After the images are captured, they can be processed and fused into a fused image using a Conjugate Image fusion method in accordance with the present disclosure, which will be described in greater detail in the following sections.
The following non-limiting example is provided to introduce some embodiments. In this example, the processor 206 may be configured to execute an image processing application, which can receive a first image and a second image for fusing and generate a fused image from these two images. For instance, the first image and the second image can both be digital photographs. The first image may be an RGB color image of a real-world scene captured by a regular color camera and the second image may be an NIR image of the real-world scene captured by an NIR camera. The two images may have overlapping fields of view of the real-world scene. In one embodiment, the two images may be captured simultaneously or nearly simultaneously by image sensors mounted in proximity on a device such as the device 200 shown in
With an example device in accordance with the present disclosure having been generally described, attention is now directed to
In some embodiments, method 300 may be implemented by a device including one or more processors, such as the ones shown in
At 302, a color image and an NIR image of a scene can be obtained. As described herein, in some embodiments, the color image and the NIR image of the scene may be captured by sensors on a device (such as a smartphone) at the same time or nearly at the same time. However, this is not necessarily the only case. In some embodiments, the color image and the NIR image may be obtained from a database where those images of the scene are stored. Those images, in those embodiments, may or may not be captured at the same time. For example, it is contemplated that the color image of the scene may be captured at a first time, and the NIR image of the scene may be captured at a second time separate and distinct from the first time. For instance, color images of a deep sea scene may be captured at the first time, and NIR images of the scene (or more or less of the scene) may be captured at a second time for enhancing the details captured in the color images.
At 304, the color image is converted into a color space that separates the colors in the color image from the luminance in the color image. One example of such a color space is the CIELAB (International Commission on Illumination Lab) color space, also referred to as the L*a*b* color space. It expresses color as three values: L* for perceptual luminance, and a* and b* for the four unique colors of human vision: red, green, blue, and yellow. However, it should be understood that other color spaces are also contemplated.
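As a non-limiting illustration of the conversion at 304, the following sketch uses OpenCV (the function name to_lab_channels and the assumption of an 8-bit, BGR-ordered input are illustrative choices, not requirements of the present disclosure):

```python
import cv2
import numpy as np

def to_lab_channels(color_bgr: np.ndarray):
    """Convert an 8-bit BGR color image to L*a*b* and return the channels.

    The L channel is rescaled to floats in [0, 1] so it can be compared
    directly with a normalized NIR image in later steps.
    """
    lab = cv2.cvtColor(color_bgr, cv2.COLOR_BGR2LAB)
    L, a, b = cv2.split(lab)
    L = L.astype(np.float32) / 255.0  # OpenCV stores 8-bit L* in [0, 255]
    return L, a, b
```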
At 306, a conjugate image is obtained based on the NIR image. An example of the conjugate image is illustrated in
In various embodiments, the conjugate image is obtained by dividing the luminance L of the color image obtained at 304 by the NIR image at pixel level. In those embodiments, the conjugate image thus has an inverse relationship with the NIR image: at a given pixel, the higher the luminance of that pixel in the NIR image (for example, a vegetation pixel), the lower the luminance of that pixel in the conjugate image as compared to some other pixels in the NIR image. As an example, the bright vegetation in the NIR image, as can be seen, becomes darker as compared to other objects, such as the paved road, in the conjugate image shown in
In one embodiment, the conjugate image is defined as L²/NIR at pixel level. That is, in that embodiment, for a given pixel in the conjugate image such as pixel 404 shown, the square of the luminance L of that pixel in the color image is divided by the value of that pixel in the NIR image. In this embodiment, L squared is selected for determining the conjugate image mainly in consideration of the attenuation of L in the color image at each pixel. In this embodiment, the color image is in float values normalized between [0, 1]. At a given pixel such as pixel 404, the L squared term can attenuate the luminance of that pixel (e.g., pixel 502) in the color image when the L value for that pixel is small. On the other hand, when the L value for that pixel is large (for example, close to 1), L squared does not attenuate the luminance of that pixel by very much. A result of this choice of using L squared is that when such an attenuated version of the color image's luminance is added to the NIR image at pixel level, the pixels will not saturate or become so bright that the color deviates, because the bright pixels in the color image are not attenuated by very much, as explained above. It should be understood that L squared is merely a design choice for this embodiment according to the aforementioned principles. The choice of L squared in this embodiment should not be construed as limiting the present disclosure. Other forms (formulas) of the conjugate image obtained based on L and NIR in accordance with the present disclosure are also within the scope of the present disclosure. For example, in some other embodiments, L cubed, the square root of L, L plus some weight, or any other formula involving L may be used instead of L squared.
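As a purely numerical illustration of this attenuation property (assuming, as above, that L is normalized to [0, 1]): for a dark pixel with L = 0.2, L² = 0.04, a strong attenuation, whereas for a bright pixel with L = 0.9, L² = 0.81, only a slight attenuation. Dark regions of the color image therefore contribute little extra luminance when combined with the NIR image, while bright regions remain close to their original level.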
In various embodiments, pixels in the conjugate image are normalized to values between 0 and 1. Other ranges of the conjugate image are contemplated. In those embodiments, since NIR is the denominator, at pixels where the NIR value is 0 the above division is invalid due to division by zero. In those embodiments, for those pixels, the above division is skipped and their values are set to 0 in the conjugate image. Below is an example of pseudo code for this embodiment:
For each pixel in the conjugate image:
    If the pixel in the NIR image is 0:
        set the pixel to 0
    Otherwise:
        obtain L of that pixel in the color image;
        set the pixel to L²/NIR and normalize the pixel to a value between 0 and 1
It should be understood that this formula is just one embodiment of obtaining the conjugate image in accordance with the present disclosure. Other embodiments are contemplated. For example, in some other embodiments, a different formula may be used to create at least one opposing color/brightness characteristic in the conjugate image as compared to the NIR image.
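As a non-limiting illustration of the pseudo code above, a vectorized sketch in Python/NumPy may look as follows (assuming both L and NIR are float arrays normalized to [0, 1]; the function name is illustrative):

```python
import numpy as np

def conjugate_image(L: np.ndarray, nir: np.ndarray) -> np.ndarray:
    """Compute the conjugate image as L**2 / NIR at pixel level.

    Pixels where the NIR value is 0 are set to 0 to avoid division by
    zero; the result is then normalized to values between 0 and 1.
    """
    conj = np.zeros_like(L, dtype=np.float32)
    nonzero = nir > 0
    conj[nonzero] = (L[nonzero] ** 2) / nir[nonzero]
    max_val = conj.max()
    if max_val > 0:
        conj = conj / max_val  # normalize to [0, 1]
    return conj
```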
At 308, a weight can be obtained based on the conjugate image obtained at 306. As mentioned above, the weight obtained at 308 is for reducing certain aspects of the NIR image during the image fusion that uses the NIR image.
Attention is now directed to
In accordance with the present disclosure, conjugateDiff can be used to preserve details for high IR emission parts in the color image so that they are less biased towards the NIR image, while adding details from the NIR image for non-vegetation parts. Such a high IR emission part may include a vegetation part (e.g., trees, grasses, plants), bright color objects (e.g., a red roof, bright color clothing), and/or any other high emission part. For example, using a given pixel in the fused image as an illustration, such as pixel 406 shown in
At 604, conjugateDiff is obtained for a given pixel as explained above. Details are not repeated.
At 606, a difference (referred to as irConjugateDiff herein) between conjugateDiff and irDiff can be obtained for a given pixel, such as pixel 502 in the color image shown in
One motivation behind obtaining irConjugateDiff is that such a value can help reduce the brightness/color deviation brought by high IR emission parts in the NIR image. Typically, for a high emission pixel, the following holds: NIR > L > conjugate image. That is, the luminance for this type of pixel is higher in the NIR image than in the color image, which in turn is higher than in the conjugate image. Thus, for this type of pixel, irDiff is typically larger than or equal to conjugateDiff, which can result in irConjugateDiff being less than or equal to 0. For this type of pixel, then, irConjugateDiff is used during the image fusion to cause the NIR image to contribute nothing to this pixel. On the other hand, for a low IR emission pixel, irDiff is low, typically close to 0. For this type of pixel, irConjugateDiff therefore approximates conjugateDiff, which, as explained above, can be used, for example, to preserve the pixel's brightness from the color image during image fusion.
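As a non-limiting sketch of the differences discussed above (the simple definition of irDiff below, the NIR luminance minus the color image's L floored at 0, is an illustrative stand-in; the exact IR emission strength computation is described in the incorporated application):

```python
import numpy as np

def emission_weights(L: np.ndarray, nir: np.ndarray, conj: np.ndarray):
    """Compute irConjugateDiff and its inverse from irDiff and conjugateDiff.

    All inputs are assumed to be float arrays normalized to [0, 1].
    """
    ir_diff = np.clip(nir - L, 0.0, 1.0)      # high for strong IR emitters
    conjugate_diff = L - conj                 # L minus the conjugate image
    ir_conjugate_diff = conjugate_diff - ir_diff
    # For high IR emission pixels ir_conjugate_diff tends to be <= 0; setting
    # it to 0 lets the NIR image contribute nothing at those pixels.
    ir_conjugate_diff = np.clip(ir_conjugate_diff, 0.0, 1.0)
    inverted = 1.0 - ir_conjugate_diff
    return ir_conjugate_diff, inverted
```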
In some embodiments, operation(s) at 308 for obtaining a weight may involve obtaining an inverted irConjugateDiff, such as at step 608 shown in
Having described different embodiments for step 308, attention is now directed back to
In various embodiments, irConjugateDiff may also be applied as a weight during the image fusion. In those embodiments, irConjugateDiff may produce a better result for the fused image (in terms of less color/brightness deviation in high IR emission parts) than conjugateDiff, due to the further processing applied to irConjugateDiff to differentiate high IR emission pixels from low IR emission pixels, as explained above.
such that the vegetation parts in the fused image are weighted towards those of the color image, and dehazing is achieved on the non-vegetation parts by weighting those zones towards the NIR image. As explained above, for high IR emission pixels (e.g., vegetation), NIR effects are undesired and can be zeroed out by using irConjugateDiff as a weight (i.e., irConjugateDiff can be set to 0 for those pixels). For those pixels, since irConjugateDiff is set to 0, the inverted irConjugateDiff (i.e., 1 - irConjugateDiff) is 1, which can be used to preserve the brightness of those pixels from the color image when producing the fused image. On the other hand, for low IR emission pixels, irConjugateDiff approximates conjugateDiff, as explained above, which together with the inverted irConjugateDiff can be used to weight those pixels towards NIR in the luminance channel so that the brightness of the NIR image is used for those pixels when producing the fused image. As can be seen, the vegetation in the fused images shown in
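As a non-limiting sketch of how such weights may be applied in the luminance channel (the blend below is one illustrative form; the exact fusion formula may differ across embodiments, and the chroma channels a and b are simply carried over from the color image):

```python
import numpy as np
import cv2

def fuse_with_weights(L, a, b, nir, ir_conjugate_diff):
    """Fuse luminance using irConjugateDiff and its inverse as weights.

    L and nir are float arrays in [0, 1]; a and b are the 8-bit chroma
    channels of the color image.  Pixels with ir_conjugate_diff == 0
    (high IR emission, e.g. vegetation) keep the color image's luminance,
    while the remaining pixels are weighted towards the NIR luminance.
    """
    w = np.clip(ir_conjugate_diff, 0.0, 1.0)
    fused_L = (1.0 - w) * L + w * nir
    fused_L8 = np.clip(fused_L * 255.0, 0, 255).astype(np.uint8)
    lab = cv2.merge([fused_L8, a, b])   # keep the original chroma channels
    return cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)
```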
It should be understood, while the examples shown in
With an example Conjugate Image method having been described, attention is now directed to
The conjugate image module 704 may be configured to obtain a conjugate image based on the second image. For achieving this, in various embodiments, the conjugate image module 704 may be configured to execute operations described in association with step 306 shown in
The weight determination module 706 can be configured to obtain a weight based on the conjugate image. For achieving this, in various embodiments, the weight determination module 706 may be configured to execute operations described in association with step 308 shown in
In some embodiments, weight determination module 706 may be configured to obtain a conjugateDiff based on the first image and the conjugate image. For achieving this, in various embodiments, the weight determination module 706 may be configured to execute operations described in association with step 604 shown in
In some embodiments, the weight determination module 706 may be configured to obtain an irDiff, irConjugateDiff and/or an inverted irConjugateDiff. For achieving this, in various embodiments, the weight determination module 706 may be configured to execute operations described in association with steps shown in
The image fusion module 708 may be configured to fuse the first and second images to produce a fused image based on irDiff, conjugateDiff, irConjugateDiff, inverted irConjugateDiff, and/or any other weights. For achieving this, in various embodiments, the image fusion module 708 may be configured to execute operations described in association with step 310 shown in
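As a non-limiting illustration, the modules described above may be composed as a simple pipeline (the helper functions are the illustrative sketches given earlier in this description and are assumptions, not a definitive implementation of modules 704, 706, and 708):

```python
def process(color_bgr, nir):
    """Illustrative end-to-end pipeline mirroring modules 704, 706, and 708."""
    L, a, b = to_lab_channels(color_bgr)       # color space conversion (304)
    conj = conjugate_image(L, nir)             # conjugate image module 704 (306)
    w, _ = emission_weights(L, nir, conj)      # weight determination module 706 (308)
    return fuse_with_weights(L, a, b, nir, w)  # image fusion module 708 (310)
```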
Any suitable computing system can be used for performing the operations described herein. For example,
The memory 814 can include any suitable non-transitory computer-readable medium. The computer-readable medium can include any electronic, optical, magnetic, or other storage device capable of providing a processor with computer-readable instructions or other program code. Non-limiting examples of a computer-readable medium include a magnetic disk, memory chip, ROM, RAM, an ASIC, a configured processor, optical storage, magnetic tape or other magnetic storage, or any other medium from which a computer processor can read instructions. The instructions may include processor-specific instructions generated by a compiler and/or an interpreter from code written in any suitable computer-programming language, including, for example, C, C++, C#, Visual Basic, Java, Python, Perl, JavaScript, and ActionScript.
The computing device 800 can also include a bus 816. The bus 816 can communicatively couple one or more components of the computing device 800. The computing device 800 can also include a number of external or internal devices such as input or output devices. For example, the computing device 800 is shown with an input/output (“I/O”) interface 818 that can receive input from one or more input devices 820 or provide output to one or more output devices 822. The one or more input devices 820 and one or more output devices 822 can be communicatively coupled to the I/O interface 818. The communicative coupling can be implemented via any suitable manner (e.g., a connection via a printed circuit board, connection via a cable, communication via wireless transmissions, etc.). Non-limiting examples of input devices 820 include a touch screen (e.g., one or more cameras for imaging a touch area or pressure sensors for detecting pressure changes caused by a touch), a mouse, a keyboard, or any other device that can be used to generate input events in response to physical actions by a user of a computing device. Non-limiting examples of output devices 822 include an LCD screen, an external monitor, a speaker, or any other device that can be used to display or otherwise present outputs generated by a computing device.
The computing device 800 can execute program code that configures the processor 812 to perform one or more of the operations described above with respect to
The computing device 800 can also include at least one network interface device 824. The network interface device 824 can include any device or group of devices suitable for establishing a wired or wireless data connection to one or more data networks 828. Non-limiting examples of the network interface device 824 include an Ethernet network adapter, a modem, and/or the like. The computing device 800 can transmit messages as electronic or optical signals via the network interface device 824.
The computing device 800 can also include image capturing device(s) 830, such as a camera or other imaging device that is capable of capturing a photographic image. The image capturing device(s) 830 can be configured to capture still images and/or video. The image capturing device(s) 830 may utilize a charge coupled device (“CCD”) or a complementary metal oxide semiconductor (“CMOS”) image sensor to capture images. Settings for the image capturing device(s) 830 may be implemented as hardware or software buttons. In some examples, the computing device 800 can include a regular color camera configured for capturing RGB color images and an NIR camera configured for capturing NIR images. The regular color camera and the NIR camera can be configured so that the fields of view of the two cameras are substantially the same. In addition, the two cameras may have matching resolutions and may capture images synchronously.
Numerous specific details are set forth herein to provide a thorough understanding of the claimed subject matter. However, those skilled in the art will understand that the claimed subject matter may be practiced without these specific details. In other instances, methods, apparatuses, or systems that would be known by one of ordinary skill have not been described in detail so as not to obscure claimed subject matter.
Unless specifically stated otherwise, it is appreciated that throughout this specification discussions utilizing terms such as “processing,” “computing,” “calculating,” “determining,” and “identifying” or the like refer to actions or processes of a computing device, such as one or more computers or a similar electronic computing device or devices, that manipulate or transform data represented as physical electronic or magnetic quantities within memories, registers, or other information storage devices, transmission devices, or display devices of the computing platform.
The system or systems discussed herein are not limited to any particular hardware architecture or configuration. A computing device can include any suitable arrangement of components that provide a result conditioned on one or more inputs. Suitable computing devices include multi-purpose microprocessor-based computer systems accessing stored software that programs or configures the computing system from a general purpose computing apparatus to a specialized computing apparatus implementing one or more embodiments of the present subject matter. Any suitable programming, scripting, or other type of language or combinations of languages may be used to implement the teachings contained herein in software to be used in programming or configuring a computing device.
Embodiments of the methods disclosed herein may be performed in the operation of such computing devices. The order of the blocks presented in the examples above can be varied—for example, blocks can be re-ordered, combined, and/or broken into sub-blocks. Certain blocks or processes can be performed in parallel.
The use of “adapted to” or “configured to” herein is meant as open and inclusive language that does not foreclose devices adapted to or configured to perform additional tasks or steps. Additionally, the use of “based on” is meant to be open and inclusive, in that a process, step, calculation, or other action “based on” one or more recited conditions or values may, in practice, be based on additional conditions or values beyond those recited. Headings, lists, and numbering included herein are for ease of explanation only and are not meant to be limiting.
While the present subject matter has been described in detail with respect to specific embodiments thereof, it will be appreciated that those skilled in the art, upon attaining an understanding of the foregoing, may readily produce alterations to, variations of, and equivalents to such embodiments. Accordingly, it should be understood that the present disclosure has been presented for purposes of example rather than limitation, and does not preclude the inclusion of such modifications, variations, and/or additions to the present subject matter as would be readily apparent to one of ordinary skill in the art.
This application is a continuation of International Application No. PCT/US2021/032486, filed on May 14, 2021, which claims priority to U.S. Provisional Application No. 63/113,151, entitled “Color Image & Near-Infrared Image Fusion with Base-Detail Decomposition and Flexible Color and Details Adjustment”, filed on Nov. 12, 2020. The disclosures of the aforementioned applications are hereby incorporated by reference in their entireties.
Related application data: U.S. Provisional Application No. 63/113,151, filed November 2020 (US); parent International Application No. PCT/US2021/032486, filed May 2021 (WO); child U.S. Application No. 18139061 (US).