The present invention provides methods and apparatus for visual sub-band decomposition of an image signal (S) by imitating and using Human Visual System Characteristics (HVSC). A base image (S) is used as an input to generate sub-band decomposed images (Sbi, i=1, 2 . . . M) that may include variable level sub-band decomposition. The resulting sequence of sets of sub-images may be processed independently and may be transmitted, analyzed, and displayed separately. The sub-band decomposed images can be stored and retrieved to be applied to any number of different processing cycles and can be incorporated with other decomposition methods. The decomposed images may then be fused back together to reconstruct an output image (S′).
Another aspect of the present invention is that the resulting sets of sub-images can be used or incorporated as input to a variety of image processing techniques such as enhancement, edge detection, filtering, compression, recognition, denoising, restoration, digital forensics, security, and information hiding. It is understood that exemplary embodiments of the invention are useful in a wide variety of applications including photography, video, medical imaging, and digital film processing, whereby the illustrated method may be used to improve image contrast. Embodiments of the invention can process the images in real time and can operate in a wired and/or wireless transmission environment or be embedded in hardware, firmware, or software which is itself embedded in another system or application.
In one aspect of the invention, a computer-implemented method comprises performing visual sub-band decomposition of an image using human visual system characteristics to generate a plurality of sub-band decomposed images, independently processing the plurality of sub-band decomposed images with at least one application on a computer, and fusing the independently processed sub-band decomposed images to reconstruct an output image.
The method can include one or more of the following features: the plurality of sub-band decomposed images include variable level sub-band decomposition, the application includes image enhancement and/or edge detection, performing decomposition of the image not including human visual system characteristics for inclusion in the fusing step, fusing is performed to optimize a particular application, transmitting, analyzing, and displaying the sub-band decomposed images separately, and performing on the image one or more of edge detection, compression, filtering, enhancement, edge enhancement, recognition, denoising, restoration, digital forensics, security, information hiding, fault detection, static hazard free circuit design, a method for simplifying a plurality of gates in circuits, image interpolation, and image resizing.
In another aspect of the invention, an article comprises a machine-readable medium that stores executable instructions to enable a computer to perform the steps of: performing visual sub-band decomposition of an image using human visual system characteristics to generate a plurality of sub-band decomposed images, independently processing the plurality of sub-band decomposed images with at least one application on a computer, and fusing the independently processed sub-band decomposed images to reconstruct an output image.
In a further aspect of the invention, a system comprises: a first decomposition module to decompose an image into a first sub-band based on human visual system characteristics, a second decomposition module to decompose the image into a second sub-band based on human visual system characteristics, a third decomposition module to decompose the image into a third sub-band based on human visual system characteristics, a processing module to process the decomposed images from the first, second, and third sub-band modules, and a fusion module to fuse the processed decomposed images from the processing module.
The system can further include one or more of the following features: the plurality of sub-band decomposed images include variable level sub-band decomposition, the application includes image enhancement and/or edge detection, decomposition of the image not including human visual system characteristics for inclusion in the fusing step, and fusing is performed to optimize a particular application.
The foregoing features of this invention, as well as the invention itself, may be more fully understood from the following description of the drawings in which:
The visual sub-band modules 102 provide respective decomposed images 104a-M, 106a-M, 108a-M to an image processing module 110. The image processing module 110 can provide various processing depending upon the selected application. Exemplary processing includes edge detection, filtering, data hiding, etc. The image processing module 110 outputs processed images 112a-M, 114a-M, 116a-M for the sub-bands that can be stored by a first visual sub-band output module 118a, a second visual sub-band output module 118b, and a further visual sub-band output module 118M. The processed images for the sub-bands are then fused by a fusion module 124 to generate an enhanced image 20.
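The decompose, process, and fuse flow described above can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the names `decompose_by_masks`, `run_pipeline`, and `fuse_by_sum` are hypothetical, and the intensity-threshold decomposition is only a stand-in for the visual sub-band modules so that the example is self-contained.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]  # row-major grayscale image


def decompose_by_masks(image: Image, thresholds: Sequence[float]) -> List[Image]:
    """Hypothetical stand-in for the visual sub-band modules.

    Splits the image into len(thresholds) + 1 sub-images by pixel intensity.
    Each pixel is kept in exactly one sub-image and set to 0 in the others.
    """
    bounds = [float("-inf"), *thresholds, float("inf")]
    bands = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        bands.append([[p if lo <= p < hi else 0.0 for p in row] for row in image])
    return bands


def fuse_by_sum(bands: List[Image]) -> Image:
    """Assumed fusion rule: pixel-wise sum of the processed sub-images."""
    rows, cols = len(bands[0]), len(bands[0][0])
    return [[sum(b[r][c] for b in bands) for c in range(cols)] for r in range(rows)]


def run_pipeline(
    image: Image,
    decompose: Callable[[Image], List[Image]],
    process: Callable[[Image], Image],
    fuse: Callable[[List[Image]], Image],
) -> Image:
    """Decompose, process every sub-band independently, then fuse."""
    sub_bands = decompose(image)
    processed = [process(band) for band in sub_bands]
    return fuse(processed)
```

Because each sub-band is processed by an independent call, the processing step could be swapped per application (edge detection, filtering, and so on) without changing the decomposition or fusion stages.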
In general, the process of decomposing images into multiple sub-bands allows improved local enhancement for each sub-band, as opposed to utilizing a single method, which may not work well for certain areas of the image. For example, dark areas and bright areas of a photo may benefit from different processing. Depending on the choice of the sub-bands and fusion method, one can achieve a fast efficient hardware implementation or software system for imaging applications. This allows imaging systems to run faster and provides low-cost implementation choices. This enables the use of imaging hardware in applications that are currently cost prohibitive due to the implementation complexity. As an example, an exemplary embodiment of the invention can be directed to low-cost cameras for applications where the cameras will not or cannot be retrieved. The cameras should be sufficiently inexpensive so that they can be discarded, yet efficient enough to provide accurate crucial information quickly. Conventional computational methods for detection are not cost-effective and are impractical.
Exemplary embodiments of the invention can process images in real time on-the-fly or in a batch mode, and can operate in a wired and/or wireless transmission environment or be embedded in hardware, firmware, or software which is itself embedded in another system or application. The system can be a stand-alone system or a server/client network system, etc.
A number of known models of the human visual system have been introduced for various applications. One known method is to attempt to model the transfer functions of the parts of the human visual system, such as the optic nerve, cortex, etc. This method then attempts to implement filters which recreate these processes to model human vision. Another known method uses a single channel to model the entire system, processing the image with a global algorithm. Another known method uses knowledge of the human visual system to detect edges using a collection of known one-dimensional curves. Along with the gradient information along those curves, there is a complete representation of the image.
In exemplary embodiments of the invention, an image is segmented into a collection of two-dimensional images with similar internal properties. To perform the two-dimensional segmentation, gradient information is used in conjunction with background illumination.
As shown in
This model enables an automated method for decomposing an image into four, three, or two regions/sub-bands. These different regions are characterized by the minimum difference between two pixel intensities for the human visual system to register a difference. The parameters x1, x2, and x3 can be calculated using different methods in a manner known in the art. Particularly, the four visual sub-band images can be calculated as described below in detail:
where B(x, y) is the background intensity at each pixel, X(x, y) is the input image, Q is the set of pixels directly above, below, left, and right of the pixel, Q′ is the set of pixels diagonally one pixel away, and l and q are constants:
where ⊕ denotes PLIP addition, Θ denotes PLIP difference, and ⊗ denotes PLIP multiplication by a scalar, as set forth below in PLIP Arithmetic and disclosed in Panetta, K. A., Wharton, E. J., and Agaian, S. S. (Tufts Univ., Medford), “Human Visual System-Based Image Enhancement and Logarithmic Contrast Measure,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 38, no. 1, pp. 174-188, February 2008.
The thresholding parameters are Bxi, i=1, 2, 3, for the background illumination thresholds and Ki, i=1, 2, 3, for the gradient thresholds.
where α1, α2, α3 are parameters based upon the three different regions of response characteristics displayed by the human eye. As α1 is the lower saturation level, it is effective to set this to 0. The values of α2 and α3 must be determined experimentally or by a rule or measure.
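The defining equations for the background intensity and the thresholds are not reproduced in this text, so the following is only a sketch of the general idea under stated assumptions. It assumes ordinary arithmetic instead of PLIP operators, a simple weighted mean for B(x, y), background thresholds of the form α·max(B), and a single gradient-to-background ratio threshold `k`. The function names and the default values of `alphas` and `k` are hypothetical.

```python
CROSS = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # Q: up, down, left, right
DIAG = [(-1, -1), (-1, 1), (1, -1), (1, 1)]   # Q': diagonal neighbours


def _at(img, r, c):
    """Pixel lookup with edge replication."""
    rows, cols = len(img), len(img[0])
    return img[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]


def _mean(img, r, c, offsets):
    return sum(_at(img, r + dr, c + dc) for dr, dc in offsets) / len(offsets)


def background_intensity(img, l=1.0, q=1.0):
    """Assumed form: weighted mean of the pixel, its Q mean, and its Q' mean."""
    return [
        [(img[r][c] + l * _mean(img, r, c, CROSS) + q * _mean(img, r, c, DIAG)) / (1.0 + l + q)
         for c in range(len(img[0]))]
        for r in range(len(img))
    ]


def gradient_magnitude(img):
    """Sum of absolute horizontal and vertical central differences."""
    return [
        [abs(_at(img, r, c + 1) - _at(img, r, c - 1)) + abs(_at(img, r + 1, c) - _at(img, r - 1, c))
         for c in range(len(img[0]))]
        for r in range(len(img))
    ]


def segment(img, alphas=(0.0, 0.3, 0.9), k=0.05):
    """Label each pixel 1, 2, or 3 by background illumination, or 0 when its
    gradient is too small relative to the background (the remaining pixels)."""
    b = background_intensity(img)
    g = gradient_magnitude(img)
    b_max = max(max(row) for row in b)
    bx1, bx2, bx3 = (a * b_max for a in alphas)  # assumed threshold form
    labels = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            if g[r][c] < k * max(b[r][c], 1e-9):
                row.append(0)
            elif b[r][c] >= bx3:
                row.append(3)
            elif b[r][c] >= bx2:
                row.append(2)
            elif b[r][c] >= bx1:
                row.append(1)
            else:
                row.append(0)
        labels.append(row)
    return labels
```

The label map can then be turned into sub-images by masking the input with each label, which keeps the segmentation step separate from whatever enhancement is applied afterwards.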
In some cases it is effective to combine the saturation region and the fourth image. Experimental results show that this does not produce noticeable changes in the output image and has the benefit of reducing computation time and simplifying the process. This is demonstrated in
These three images are then enhanced separately and recombined to form the enhanced image.
In multi-histogram equalization image enhancement algorithm, the decomposition process 102 of
b shows RMSHE using four sub-images. It can be seen that this improves brightness preservation and produces a more visually natural image than DSHE; however, the segmented regions still do not always correspond to the actual physical regions because the thresholding is performed directly on the pixel intensities.
In accordance with exemplary embodiments of the invention, the parameter selection can be done by using the Logarithmic AMEE measure or other measure as described below in Contrast Measure.
Selection of the α constants, while slightly different for different images, is efficiently done using the Logarithmic AMEE measure, as demonstrated in
The following example presents an exemplary enhancement process used in conjunction with Human Visual System Based Image Enhancement. An Edge Preserving Contrast Enhancement (EPCE) process is disclosed. It is well known that human eyes have a much larger dynamic range than current imaging and display devices. Traditional methods for image enhancement such as gamma adjustment, logarithmic compression, histogram equalization, and level or curve type methods are generally limited because they process images globally. To solve this, more advanced methods have been introduced based on better understanding of the human visual system which is more capable of handling scenes with high dynamic ranges. Many of these methods make use of spatially dependent processing methods where each pixel is determined by both local and global image information.
The inventive Edge Preserving Contrast Enhancement (EPCE) process is a contrast enhancement algorithm which is designed to preserve edges while improving contrast on a local scale by combining the output of an edge detection algorithm with the original spatial image information. This achieves a more robust enhancement algorithm that is tunable to perform edge detection or enhancement. This enhancement process can work with any suitable edge detection algorithm. It uses pre-processing steps to standardize image brightness and several post-processing steps to enhance the edges.
The inventive measure shows the results of computer simulations with the presented image enhancement algorithms. Results are compared to the known Retinex algorithm and histogram equalization, since these are comparable fully automated image enhancement algorithms. For the basis of comparison, the Logarithmic AME is used to assess the enhancement quality.
In one aspect of the invention, edge detection can be used for processing the sub-band images. Edge detection can include, for example, conventional edge detection and conventional edge detection with PLIP Arithmetic and different shapes (see, e.g., Shahan Nercessian, Karen Panetta, and Sos Agaian, “Generalized Rectangular Edge Detection Kernels and Edge Fusion For Biomedical Applications,” in Biomedical Imaging: From Nano to Macro, 2009. ISBI 2009. 6th IEEE International Symposium on, 2009; and Shahan Nercessian, Karen Panetta, and Sos Agaian, “A Generalized Set of Kernels for Edge and Line Detection,” in SPIE Electronic Imaging 2009, San Jose, Calif., USA, 2009). It is understood that in scene analysis systems the use of edge detection to determine the boundary between foreground and background objects is important. From military and robotics to security and data hiding/watermarking, the accurate detection of edges allows improved object recognition and image attribute identification. To do so, the balance between good detection and reducing the number of errors (false positives) is essential, although it is often a difficult task because unwanted noise is frequently emphasized by derivative-based techniques.
Referring again to
The decomposed images may then be fused back together to reconstruct an output image. In exemplary embodiments, a particular fusion procedure can be selected based upon certain criteria. For example, two visual sub-images I1 and I2 can be fused by using the following procedure:
where * denotes PLIP convolution
A=A1⊕A2; d1=I1ΘA1; d2=I2ΘA2;
D=max(abs(d1),abs(d2))
Edge_Map=(αA)⊕(βD)
where α and β are parameters, and where ⊕ denotes PLIP addition, Θ denotes PLIP difference, and ⊗ denotes PLIP multiplication by a scalar. Alternatively, ⊕, Θ, and ⊗ can be the commonly used operations: ⊕ ordinary addition, Θ ordinary difference, and ⊗ ordinary multiplication by a scalar.
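The fusion steps above can be sketched using the commonly used (ordinary) arithmetic variant. The definitions of A1 and A2 are not reproduced in this text; the sketch assumes they are smoothed approximations of I1 and I2 obtained with a 3×3 mean filter, which is an illustrative choice only. The function names and the default values of `alpha` and `beta` are hypothetical.

```python
def box_blur(img):
    """3x3 mean filter with edge replication (assumed way to obtain A1, A2)."""
    rows, cols = len(img), len(img[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    total += img[rr][cc]
            out_row.append(total / 9.0)
        out.append(out_row)
    return out


def fuse_edge_map(i1, i2, alpha=0.5, beta=1.0):
    """Edge_Map = (alpha * A) + (beta * D) with ordinary arithmetic, where
    A = A1 + A2, d1 = I1 - A1, d2 = I2 - A2, D = max(|d1|, |d2|)."""
    a1, a2 = box_blur(i1), box_blur(i2)
    rows, cols = len(i1), len(i1[0])
    edge_map = []
    for r in range(rows):
        row = []
        for c in range(cols):
            a = a1[r][c] + a2[r][c]
            d1 = i1[r][c] - a1[r][c]
            d2 = i2[r][c] - a2[r][c]
            d = max(abs(d1), abs(d2))
            row.append(alpha * a + beta * d)
        edge_map.append(row)
    return edge_map
```

Taking the larger of the two detail magnitudes keeps the strongest local edge response from either sub-image, while the summed approximation term carries the smooth content of both.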
Using visual sub-band composition with edge detection quantitatively outperforms conventional edge detection as shown in
The PLIP model was introduced by Panetta, Wharton, and Agaian to provide a non-linear framework for image processing that addresses these five requirements. It is designed to both maintain the pixel values inside the range (0, M] as well as to more accurately process images from a human visual system point of view. To accomplish this, images are processed as absorption filters using the gray tone function. This gray tone function is as follows:
g(i,j)=M−f(i,j) (1)
where f(i, j) is the original image function, g(i, j) is the output gray tone function, and M is the maximum value of the range. It can be seen that this gray tone function is much like a photo negative. The PLIP model operator primitives can be summarized as follows:
where ⊕ denotes PLIP addition, Θ denotes PLIP subtraction, ⊗ denotes PLIP multiplication by a scalar, and * denotes PLIP multiplication of two images. Also, a and b are any gray tone pixel values, c is a constant, M is the maximum value of the range, and β is a constant. γ(M), k(M), and λ(M) are all arbitrary functions. We use the linear case, such that they are functions of the type γ(M)=AM+B, where A and B are integers; however, any arbitrary function will work. In general, a and b correspond to the same pixel in two different images that are being added, subtracted, or multiplied. The best values of A and B were determined to be any combination such that γ(M)=k(M)=λ(M)=1026, and the best value of β was determined to be β=2.
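The operator definitions themselves are not reproduced in this text. As a hedged illustration, the sketch below uses the forms commonly given for these primitives in the published PLIP literature, which is an assumption about what the omitted equations contain: a ⊕ b = a + b − ab/γ(M), a Θ b = k(M)(a − b)/(k(M) − b), and c ⊗ a = γ(M) − γ(M)(1 − a/γ(M))^c. The function names are hypothetical, and the value 1026 is taken from the paragraph above.

```python
GAMMA = 1026.0  # gamma(M) = k(M) = 1026, the value reported in the text


def gray_tone(f, m):
    """Gray tone function of equation (1): g = M - f."""
    return m - f


def plip_add(a, b, gamma=GAMMA):
    """Assumed PLIP addition: a + b - a*b/gamma(M)."""
    return a + b - (a * b) / gamma


def plip_sub(a, b, k=GAMMA):
    """Assumed PLIP subtraction: k(M) * (a - b) / (k(M) - b)."""
    return k * (a - b) / (k - b)


def plip_scalar_mul(c, a, gamma=GAMMA):
    """Assumed PLIP scalar multiplication: gamma - gamma * (1 - a/gamma)**c."""
    return gamma - gamma * (1.0 - a / gamma) ** c
```

Under these assumed forms, subtraction undoes addition, and multiplying by the scalar 2 gives the same result as adding a value to itself, which is a quick consistency check on any implementation.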
Many enhancement techniques are based on enhancing the contrast of an image. There have been many differing definitions of an adequate measure of performance based on contrast. Gordon and Rangayyan used local contrast defined by the mean gray values in two rectangular windows centered on a current pixel. Beghdadi and Le Negrate defined an improved version of the aforementioned measure by basing their method on local edge information of the image. In the past, attempts at statistical measures of gray level distribution of local contrast enhancement, such as those based on mean, variance, or entropy, have not been particularly useful or meaningful. A number of images which show an obvious contrast improvement showed no consistency, as a class, when using these statistical measurements. Morrow introduced a measure based on the contrast histogram, which has a much greater consistency than statistical measures (see Sos Agaian, Blair Silver, and Karen Panetta, “Transform Coefficient Histogram Based Image Enhancement Algorithms Using Contrast Entropy,” IEEE Trans. Image Processing, 16(3), pp. 751-758, March 2007). Measures of enhancement based on the human visual system have been previously proposed. Algorithms based on the human visual system are fundamental in computer vision. Two definitions of contrast measure have traditionally been used for simple patterns: Michelson for periodic patterns like sinusoidal gratings and Weber for large uniform luminance backgrounds with small test targets. However, these measures are not effective when applied to more difficult scenarios, such as images with non-uniform lighting or shadows. The first practical use of a Weber's law based contrast measure, the AWC or contrast measure, was developed by Agaian (see Sos S. Agaian, “Visual Morphology,” Proceedings of SPIE, Nonlinear Image Processing X, San Jose, Calif., vol. 3646, pp. 139-150, March 1999).
This contrast measure was later developed into the EME, or measure of enhancement, and the EMEE, or measure of enhancement by entropy. Finally, the Michelson Contrast Law was included to further improve the measures. These were called the AME and AMEE. These are summarized in Table I, and are calculated by dividing an image into k1×k2 blocks, calculating the measure for each block, and averaging the results as shown in the formula definitions.
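The block-wise calculation described above can be sketched for the EME. Table I is not reproduced in this text, so the per-block term below, 20·ln(Imax/Imin), follows the form commonly published for the EME and should be read as an assumption. The function name `eme` and the small `eps` guard against division by zero are hypothetical additions, and the image dimensions are assumed to be divisible by k1 and k2.

```python
import math


def eme(img, k1, k2, eps=1e-6):
    """Average of 20 * ln(Imax / Imin) over a k1 x k2 grid of blocks."""
    rows, cols = len(img), len(img[0])
    bh, bw = rows // k1, cols // k2  # block height and width
    total = 0.0
    for bi in range(k1):
        for bj in range(k2):
            block = [
                img[r][c]
                for r in range(bi * bh, (bi + 1) * bh)
                for c in range(bj * bw, (bj + 1) * bw)
            ]
            # eps keeps the ratio finite when a block contains zeros
            total += 20.0 * math.log((max(block) + eps) / (min(block) + eps))
    return total / (k1 * k2)
```

A block with no intensity variation contributes zero, so the measure rises only where local contrast is present.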
In accordance with exemplary embodiments of the invention, PLIP operator primitives are used to improve image processing for a more accurate processing of the contrast information, as PLIP subtraction has been shown to be consistent with Weber's Contrast Law. With these modifications, the inventive Logarithmic AME and Logarithmic AMEE measures are disclosed, which are better able to assess images and thus more accurately select parameters.
where ⊕ denotes PLIP addition, Θ denotes PLIP difference, and ⊗ denotes PLIP multiplication by a scalar, as set forth in PLIP Arithmetic and disclosed in Panetta, K. A., Wharton, E. J., and Agaian, S. S. (Tufts Univ., Medford), “Human Visual System-Based Image Enhancement and Logarithmic Contrast Measure,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 38, no. 1, pp. 174-188, February 2008. To use the measures for selection of parameters, the algorithm is first run for all practical values of the parameters. The measure is calculated for each output image, and this data is organized into a graph of measure versus parameters. Finally, the best parameters are located at the local extrema.
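The parameter selection procedure just described (run the algorithm over the practical parameter values, compute the measure for each output, and look for local extrema) can be sketched for a single parameter as follows. The names `sweep` and `local_maxima` are hypothetical, and `enhance` and `measure` are placeholders for whichever enhancement algorithm and contrast measure are in use.

```python
def sweep(image, enhance, measure, values):
    """Run the algorithm for each candidate value and record the measure.

    Returns a list of (value, measure) pairs: the 'measure versus
    parameters' curve for a single parameter.
    """
    return [(v, measure(enhance(image, v))) for v in values]


def local_maxima(curve):
    """Parameter values whose measure exceeds the previous sample and is at
    least as large as the next one (interior points only)."""
    return [
        curve[i][0]
        for i in range(1, len(curve) - 1)
        if curve[i][1] > curve[i - 1][1] and curve[i][1] >= curve[i + 1][1]
    ]
```

For example, with a toy gain-based `enhance` and a `measure` that peaks when the mean intensity reaches 128, sweeping gains 1 through 6 over an image of constant intensity 32 yields a single local maximum at gain 4. Whether maxima or minima are the relevant extrema depends on the measure being used.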
It is understood that exemplary embodiments of the invention described herein can be implemented in a variety of hardware, software, and hardware/software combinations. In one embodiment, the processing for the system of
The processing described herein is not limited to use with the hardware and software of
The system may be implemented, at least in part, via a computer program product (e.g., in a machine-readable storage device) for execution by, or to control the operation of, data processing apparatus (e.g., a programmable processor, a computer, or multiple computers). Each such program may be implemented in a high level procedural or object-oriented programming language to communicate with a computer system. However, the programs may be implemented in assembly or machine language. The language may be a compiled or an interpreted language and it may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program may be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a communication network. A computer program may be stored on a storage medium or device (e.g., CD-ROM, hard disk, or magnetic diskette) that is readable by a general or special purpose programmable computer for configuring and operating the computer when the storage medium or device is read by the computer to perform processing. Processing may also be implemented as a machine-readable storage medium, configured with a computer program, where upon execution, instructions in the computer program cause the computer to operate.
The processing associated with implementing the system may be performed by one or more programmable processors executing one or more computer programs to perform the functions of the system. All or part of the system may be implemented as special purpose logic circuitry (e.g., an FPGA (field programmable gate array) and/or an ASIC (application-specific integrated circuit)). Elements of different embodiments described herein may be combined to form other embodiments not specifically set forth above. Other embodiments not specifically described herein are also within the scope of the following claims.
Having described exemplary embodiments of the invention, it will now become apparent to one of ordinary skill in the art that other embodiments incorporating their concepts may also be used. The embodiments contained herein should not be limited to disclosed embodiments but rather should be limited only by the spirit and scope of the appended claims. All publications and references cited herein are expressly incorporated herein by reference in their entirety.
The present application claims the benefit of U.S. Provisional Patent Application No. 61/039,889, filed on Mar. 27, 2008, which is incorporated herein by reference.
Number | Date | Country
---|---|---
61/039,889 | Mar 2008 | US