1. Field of the Invention
Embodiments of the present invention generally relate to a method and apparatus for combining images. More particularly, the present invention relates to image fusion techniques.
2. Description of the Related Art
Image fusion is the process of combining two or more source images of a given scene in order to construct a new image with enhanced information content for presentation to a human observer. For example, the source images may be infrared (IR) and visible camera images of the scene obtained from approximately the same vantage point.
There are two broad classes of image fusion algorithms: color fusion and feature selective fusion.
Both classes of image fusion have strengths as well as limitations. Color fusion makes use of human color vision to convey more information to an observer than can be provided in a comparable monochrome display. Color fusion also allows intuitive perception of materials, e.g., vegetation, roads, vehicles, and the like. However, color fusion often results in reduced contrast of some features in the scene, making those features more difficult to see. Feature selective fusion preserves selected scene features at full contrast. Feature selective fusion also provides a more general framework for combining images than does color fusion. However, feature selective fusion may discard information that is still useful to the observer.
Therefore, there is a need in the art for an image fusion approach that maintains full contrast and allows for intuitive perception while reducing the amount of relevant information that is discarded.
The present invention generally relates to a method and apparatus for combining a plurality of images. In one embodiment, at least one signal component is determined from a plurality of source images using feature selective fusion. At least one color component is determined from the plurality of source images using color fusion. An output image is formed from the at least one signal component and the at least one color component.
In another embodiment, at least one image component is determined from a plurality of source images using feature selective fusion. An output image is formed from the at least one image component using color fusion.
So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments.
The present invention discloses a method and apparatus for image fusion that combines the basic color and feature selective methods outlined above to achieve the beneficial qualities of both while avoiding the shortcomings of each.
In color fusion, multiple images are combined to form a single output image. One example of color fusion is color fusion as a direct mapping, in which each source image feeds one or more display channels directly. This type of color fusion is shown in the appended drawings.
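For concreteness, the following is a minimal sketch (in Python with NumPy) of direct-mapping color fusion, assuming a registered IR image and a monochrome visible (EO) image of the same size, scaled to [0, 1]; the particular channel assignments are illustrative assumptions, not a mapping prescribed by the invention.

```python
import numpy as np

def direct_map_color_fusion(ir, eo):
    """Direct-mapping color fusion: each source feeds display channels directly.

    ir, eo: registered 2-D float arrays scaled to [0, 1].
    Returns an (H, W, 3) RGB image. Sending IR to red and EO to green
    and blue is one illustrative convention, chosen so that IR-bright
    (warm) objects appear reddish against a gray EO background.
    """
    rgb = np.stack([ir, eo, eo], axis=-1)
    return np.clip(rgb, 0.0, 1.0)
```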
In feature selective fusion, images are combined in a pyramid or wavelet image transform domain, and the combination is achieved by selecting one image source or another at each sample position in the transform. Selection may be binary or by weighted average. This method is also called feature fusion, pattern selective fusion, contrast selective fusion, or “choose best” fusion. Feature fusion selects, at any image location, the source that has the best image quality, e.g., best contrast, best resolution, best focus, or best coverage. An example of feature fusion (e.g., “choose best” selection) is illustrated in the appended drawings.
At each location (e.g., sample position) i, j and scale k:
LC(ijk)=LA(ijk) if SA(ijk)>SB(ijk); otherwise LC(ijk)=LB(ijk),
where LA, LB comprise transformed images from sources A and B, and SA, SB comprise the salience of each transformed image. Salience may be determined as follows:
At each location (e.g., sample position) i, j and scale k:
Salience measures for fusion based on contrast may be represented as
SI(ijk)=|LI(ijk)|.
Salience measures for merging based on support may be represented as
SI(ijk)=GM(ijk),
where M is a mask indicating a support area for image I.
A combined salience measure may be represented as
SI(ijk)=GM(ijk)|LI(ijk)|.
The output transformed image LC is then inverse transformed by inverse transformer 445 to provide combined image IC.
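For concreteness, here is a minimal sketch of binary “choose best” fusion in a Laplacian pyramid domain, using the contrast salience |LI(ijk)| defined above; the pyramid depth, the smoothing kernel, and the averaging of the low-pass residuals are illustrative assumptions rather than parameters taken from the disclosure.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def laplacian_pyramid(img, levels=4):
    """Forward transform: decompose img into band-pass levels L(ijk)
    plus a low-pass residual (the top of a Gaussian pyramid)."""
    pyr, cur = [], img.astype(np.float64)
    for _ in range(levels):
        low = gaussian_filter(cur, sigma=1.0)
        down = low[::2, ::2]
        up = zoom(down, 2.0, order=1)[: cur.shape[0], : cur.shape[1]]
        pyr.append(cur - up)          # band-pass detail at this scale
        cur = down
    pyr.append(cur)                   # low-pass residual
    return pyr

def collapse(pyr):
    """Inverse transform: upsample and sum the levels back into an image."""
    cur = pyr[-1]
    for lap in reversed(pyr[:-1]):
        cur = zoom(cur, 2.0, order=1)[: lap.shape[0], : lap.shape[1]] + lap
    return cur

def choose_best_fusion(a, b, levels=4):
    """At each (i, j, k), keep the coefficient from the source whose
    contrast salience |L| is greater; average the low-pass residuals."""
    pa, pb = laplacian_pyramid(a, levels), laplacian_pyramid(b, levels)
    fused = [np.where(np.abs(la) >= np.abs(lb), la, lb)
             for la, lb in zip(pa[:-1], pb[:-1])]
    fused.append(0.5 * (pa[-1] + pb[-1]))
    return collapse(fused)
```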
The method and apparatus of the present invention disclose color plus feature fusion (CFF), in which multiple source images may be combined to form an image for viewing. In one embodiment, the multiple source images include both monochrome and color images and are combined to form a color image for viewing. The output image may be defined in terms of the three standard spectral bands used in display devices, typically red, green, and blue component images. Alternatively, the output image may be described in terms of a three-channel coordinate system in which one channel represents intensity (or brightness or luminance) and the other two represent color. For example, the color channels may be hue and saturation, opponent colors such as red-green and blue-yellow, or color difference signals, e.g., Red-Luminance and Blue-Luminance. In one embodiment, CFF may operate in one color space format, e.g., Hue, Saturation, Intensity (HSI), and provide an output in another color space format, e.g., Red, Green, Blue (RGB).
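As an operational reading of this paragraph: the intensity (signal) channel can come from feature selective fusion and the two color channels from color fusion, with a final conversion to RGB for display. The sketch below assembles such an output in a YUV-style luminance/color-difference space; the choice of that space, and the BT.601 conversion constants, are assumptions made for the example.

```python
import numpy as np

def cff_assemble(y_fused, u_color, v_color):
    """Form a CFF output image: the luminance (signal) component comes
    from feature selective fusion (e.g., choose_best_fusion above) and
    the two color-difference components from color fusion.

    y_fused:          (H, W) fused luminance in [0, 1].
    u_color, v_color: (H, W) color-difference channels, e.g. derived
                      from a direct color mapping of the sources.
    Returns an (H, W, 3) RGB image via the BT.601 YUV -> RGB transform.
    """
    r = y_fused + 1.13983 * v_color
    g = y_fused - 0.39465 * u_color - 0.58060 * v_color
    b = y_fused + 2.03211 * u_color
    return np.clip(np.stack([r, g, b], axis=-1), 0.0, 1.0)
```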
In one embodiment, mapping element 1020 may be implemented as follows:
At each point (ijk):
where SA comprises a salience of IIR and SB comprises a salience of IEO, LA comprises the transformed image of IIR and LB comprises the transformed image of IEO, and R, G, and B respectively comprise red, green, and blue channels.
In one embodiment, mapping element 1020 may be implemented as follows:
where SIR comprises a salience of the infrared source image, SEO comprises a salience of the electro-optical source image, and R, G, and B respectively comprise red, green, and blue channels.
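Purely as a hypothetical illustration of a salience-driven mapping of the kind described, the sketch below weights each source's contribution to the display channels by its normalized per-pixel salience; the weighting scheme and channel assignments are assumptions for illustration, not the disclosed mapping.

```python
import numpy as np

def salience_weighted_map(l_ir, l_eo, s_ir, s_eo, eps=1e-6):
    """Hypothetical mapping: blend IR and EO into RGB with per-pixel
    weights proportional to each source's salience.

    l_ir, l_eo: IR and EO images (or reconstructed components) in [0, 1].
    s_ir, s_eo: non-negative salience maps, e.g. |L| summed over scales.
    """
    w_ir = s_ir / (s_ir + s_eo + eps)   # normalized IR weight per pixel
    w_eo = 1.0 - w_ir
    r = w_ir * l_ir + w_eo * l_eo       # red favors the more salient source
    g = l_eo                            # EO detail carries the green channel
    b = w_eo * l_eo                     # blue recedes where IR dominates
    return np.clip(np.stack([r, g, b], axis=-1), 0.0, 1.0)
```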
Thus, image processing device or system 1400 comprises a processor (CPU) 1410, a memory 1420, e.g., random access memory (RAM) and/or read only memory (ROM), a color plus feature fusion (CFF) module 1440, and various input/output devices 1430 (e.g., storage devices, including but not limited to, a tape drive, a floppy drive, a hard disk drive or a compact disk drive, a receiver, a transmitter, a speaker, a display, an image capturing sensor, e.g., those used in a digital still camera or digital video camera, a clock, an output port, a user input device (such as a keyboard, a keypad, a mouse, and the like), or a microphone for capturing speech commands).
It should be understood that the CFF module 1440 can be implemented as one or more physical devices that are coupled to the CPU 1410 through a communication channel. Alternatively, the CFF module 1440 can be represented by one or more software applications (or even a combination of software and hardware, e.g., using application specific integrated circuits (ASIC)), where the software is loaded from a storage medium (e.g., a magnetic or optical drive or diskette) or realized in a field programmable gate array (FPGA) and operated by the CPU in the memory 1420 of the computer. As such, the CFF module 1440 (including associated data structures) of the present invention can be stored on a computer readable medium, e.g., RAM memory, magnetic or optical drive or diskette, and the like.
In one embodiment, an enhancement is performed in combination with color plus feature fusion. Enhancement may involve point methods in the image domain. Point methods may include contrast stretching, e.g., using histogram specification. Enhancement may involve region methods in the pyramid domain, e.g., using Gaussian and Laplacian transforms. Region methods may include sharpening, e.g., using spectrum specification. Enhancement may also involve temporal methods during the alignment process. Temporal methods may be utilized for stabilization and noise reduction.
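As one concrete instance of a point-method enhancement, the sketch below performs a percentile-based contrast stretch; full histogram specification (remapping the image histogram onto a reference distribution) would follow the same pattern, and the percentile limits here are illustrative defaults.

```python
import numpy as np

def contrast_stretch(img, lo_pct=2.0, hi_pct=98.0):
    """Point-method enhancement: linearly map the lo_pct..hi_pct
    percentile range of img onto [0, 1], clipping the tails."""
    lo, hi = np.percentile(img, [lo_pct, hi_pct])
    return np.clip((img - lo) / max(hi - lo, 1e-6), 0.0, 1.0)
```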
In one embodiment, color plus feature fusion (CFF) may be utilized in a video surveillance system. Fusion and enhancement may be provided using position and scale invariant basis functions. Analysis may be provided using multi-scale feature sets and fast hierarchical search. Compression may be provided using a compact representation that retains salient structure.
CFF maintains the contrast of feature fusion and provides intuitive perception of materials. CFF also provides a general framework for image combination and for video processing systems. Where processing latency is important, CFF embodiments may achieve sub-frame latency.
The present invention has been described using just two source cameras. It should be understood that the method and apparatus may be applied to any number of source cameras, just as standard color and feature fusion methods may be. Also, the source images may originate from any image source and need not be limited to cameras.
Example apparatus embodiments of the present invention are described with only one presentation format shown. It should be apparent to one skilled in the art that a signal component or a color component may be a band in a color space (e.g., R, G, and B bands in the RGB domain; Hue, Saturation, and Intensity in the HSI domain; Luminance, Color U, and Color V in the YUV space; and so on). Each source image may contain only one band, as in IR, or multiple bands, as in EO.
While the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.
This application claims benefit of U.S. provisional patent application Ser. No. 60/540,100, filed Jan. 27, 2004, which is herein incorporated by reference.