1. Field of the Invention
This invention relates to color image sensors and, more particularly, to integrated color image sensors having image processing circuitry for information extraction integrated therewith.
2. Description of the Prior Art
CMOS integrated circuit technology readily allows the incorporation of photodetector arrays and image processing circuits on the same silicon die. This has led to the recent proliferation of cheap and compact digital cameras, system-on-a-chip video processors, and many other cutting-edge commercial and research imaging products. The concept of using CMOS technology to combine sensing and processing was not spearheaded by the imaging community. It actually emerged in the mid-1980s from the neuromorphic engineering community, developed by Carver Mead and collaborators, as discussed in "A Sensitive Electronic Photoreceptor" by C. Mead, Proc. 1985 Chapel Hill Conf. VLSI, pp. 463-471, Computer Science Press, Maryland, 1985. Mead's motivation was to mimic the information processing capabilities of biological organisms; biology tends to optimize information extraction by introducing processing at the sensing epithelium. This approach to sensory information processing, which was later captured with terms such as "sensory processing" and "computational sensors," produced a myriad of vision chips whose functionalities include edge detection, motion detection, stereopsis and many others (examples can be found in "Neuromorphic Electronic Systems" by C. Mead, Proc. IEEE, Vol. 78, pp. 1629-1636, 1990).
The preponderance of the work on neuromorphic vision has focused on spatiotemporal processing of the intensity of light (gray-scale images) because the intensity can be readily transformed into a voltage or current using basic integrated circuit components: photodiodes, photogates, and phototransistors. These devices are easily implemented in CMOS technologies using no additional lithography layers. On the other hand, color image processing has been limited primarily to the commercial camera arena because three additional masks are required to implement red (R), green (G) and blue (B) filters. The additional masks make fabrication of color-sensitive photodetector arrays expensive and, therefore, not readily available to researchers. Nonetheless, a large part of human visual perception is based on color information processing. Consequently, neuromorphic vision systems should not ignore this obviously important cue for scene analysis and understanding.
There has been a limited amount of previous work on neuromorphic color processing. The vast majority of color processing literature addresses standard digital image processing techniques. That is, such systems consist of a camera connected to a frame-grabber that contains an analog-to-digital converter (ADC). The ADC interfaces with a digital computer, where software algorithms are executed. Among the few biologically inspired hardware papers, there are clearly two approaches. The first approach uses separate imaging chips and processing chips, as discussed in "A Real-Time Neural System for Color Constancy" by A. Moore et al., IEEE Transactions on Neural Networks, Vol. 2, No. 2, pp. 237-247, 1991, while the second approach integrates a handful of photodetectors and analog processing circuitry, as discussed in "Towards Color Image Segmentation in Analog VLSI: Algorithms and Hardware" by F. Perez et al., Int. J. Computer Vision, Vol. 12, No. 1, pp. 17-42, 1994. In the former example, standard cameras are connected directly to analog VLSI chips that demultiplex the video stream and store the pixel values as voltages on arrays of capacitors. Arrays as large as 50×50 pixels have been realized to implement various algorithms for color constancy, as noted in Moore et al., cited above. As can be expected, the system is large and clumsy, but real-time performance is possible. The second set of chips investigates a particular biologically inspired problem, such as RGB (red, green, blue color)-to-HSI (Hue, Saturation and Intensity) conversion using biologically plausible color opponents and HSI-based image segmentation, using a very small number of photodetectors and integrated analog VLSI circuits, as noted in Perez et al., cited above. Clearly, the goal of the latter is to demonstrate a concept and not to develop a practical system for useful image sizes.
One persistent problem in attempts at neuromorphic color processing is encountered in seeking to model a degree of subjectivity in human color perception, particularly in regard to the color temperature of illumination of objects or a scene. For example, a human observer will tend to perceive the same color of an object regardless of whether it is illuminated by direct sunlight, indirect light from the sky, incandescent light, etc. (although some differences in color perception may occur under fluorescent illumination or other illumination having strong, narrow peaks in the light spectrum, such as mercury vapor illumination). It has been hypothesized that human language encodes particular colors into broad linguistic categories corresponding to about eleven basic colors which are fairly stable in regard to color temperature variation of illumination. In contrast, machine vision systems are extremely sensitive to changes in the spectral content of light, and small changes in illumination may cause large shifts in the apparent color detected by an imager which, in turn, lead to errors in any form of image processing which relies on color matching.
It is therefore an object of the present invention to address the gap in the silicon vision literature by providing an integrated, large array of color photodetectors together with on-chip processing.
It is another object of the invention to provide an integrated chip for the recognition of objects based on their color signatures.
It is a further object of the invention to provide a color image sensor integrated together with image processing circuitry sufficient to support information extraction on a single chip.
It is yet another object of the invention to provide a generalized, rule-based processing embodiment for extraction of color signatures from raw image data which can also be integrated with an image sensor array on an integrated circuit chip.
It is yet another object of the invention to provide image processing which tends to emulate human insensitivity to color temperature variation which can be integrated on a single chip with a color imaging array.
In order to accomplish these and other objects of the invention, a color-based visual sensor is provided, comprising a sensor array for sensing component colors of light in a scene and providing image data in accordance with an arbitrary color space, an arrangement for defining opponencies in the color space, a processor for transforming said image data in accordance with the opponencies to provide transformed image data, comparators for dividing the transformed image data into families and sub-families in accordance with respective opponencies and combinations of opponencies, respectively, and for dividing the families and sub-families in accordance with variation of the transformed image data corresponding to respective opponencies to produce divided transformed image data, registers for region of interest selection, an analyzer for analyzing the divided transformed image data over the region of interest to develop a color signature, and a calculator for matching the color signature with a template.
The foregoing and other objects, aspects and advantages will be better understood from the following detailed description of a preferred embodiment of the invention with reference to the drawings, in which:
Referring now to the drawings, and more particularly to
This chip represents the first self-contained color processing imager with focal-plane segmentation, histogramming and template matching capabilities. The principal functional blocks of this chip combining an imager and image information extraction circuitry will now be described.
The Imager: In the imager array 20, three currents, corresponding to the R, G and B values, are sampled and held for each pixel (a color filter wheel is used in this prototype). To facilitate processing, a current-mode imaging approach is adopted. This approach provides more than 120 dB of dynamic range, as noted in "A Sensitive Electronic Photoreceptor", cited above, allows RGB scaling for white correction using a multiplying DAC, and allows RGB normalization using a translinear circuit as described in "Translinear Circuits: 25 Years On, Part I: The Foundations" by B. Gilbert, Electronic Engineering (London), Vol. 65, No. 800, August 1993. The normalization guarantees that RGB currents spanning a large dynamic range are resized so that the HSI or other color space transformer, if used, can operate robustly. However, if used, normalization and/or color space transformation limits the speed of operation to approximately 30 fps because the transistors must operate in the sub-threshold region. For read-out, the pixels can be grouped into blocks of 1×1 (a single pixel) to 128×64 (the entire array). The blocks can be advanced across the array in single or multiple pixel intervals. The organization of the pixels and the scanning methods are programmable by loading bit patterns into two scanning registers for each coordinate direction, one (12, 18) for scanning pixels within blocks and the other (14, 16) for scanning the blocks across the array.
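By way of illustration only, the programmable block read-out described above may be modeled in software as in the following sketch; the array dimensions, function name and step parameters are merely exemplary, and the sketch does not represent the actual scanning-register hardware, which is configured by bit patterns loaded into registers 12, 14, 16 and 18.

```python
# Illustrative software model of the programmable block read-out
# (assumption: Python sketch only, not the on-chip scanning registers).
import numpy as np

ARRAY_COLS, ARRAY_ROWS = 128, 64  # full imager array dimensions

def scan_blocks(image, block_w=1, block_h=1, step_x=1, step_y=1):
    """Yield (x, y, block) tuples, sweeping a block_w x block_h window
    across the array in steps of step_x, step_y pixels."""
    rows, cols = image.shape[:2]
    for y in range(0, rows - block_h + 1, step_y):
        for x in range(0, cols - block_w + 1, step_x):
            yield x, y, image[y:y + block_h, x:x + block_w]

# Example: sweep 8x8 blocks across a simulated RGB frame in 4-pixel steps.
frame = np.random.rand(ARRAY_ROWS, ARRAY_COLS, 3)  # stand-in image data
for x, y, block in scan_blocks(frame, block_w=8, block_h=8, step_x=4, step_y=4):
    pass  # downstream processing (normalization, histogramming, ...) goes here
```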
RGB-to-HSI and Color Segmentation: The preferred RGB-to-HSI transformer 20 uses an opponent color formulation, reminiscent of biological color processing as discussed in the Gilbert article, cited above. The intensity (I) is obtained before normalization by summing the RGB components (see
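Although the exact on-chip transformation is not reproduced here, one common opponent-color formulation of the RGB-to-HSI conversion may be sketched in software as follows; the particular opponent axes and the angle/magnitude definitions of hue and saturation are illustrative assumptions rather than the precise circuit behavior.

```python
# Sketch of an opponent-color RGB-to-HSI computation (assumption: an
# illustrative formulation, not necessarily the exact on-chip transform).
import math

def rgb_to_hsi_opponent(r, g, b):
    intensity = r + g + b              # intensity summed before normalization
    if intensity == 0:
        return 0.0, 0.0, 0.0
    # Normalization bounds the operating range seen by the opponent stage.
    rn, gn, bn = r / intensity, g / intensity, b / intensity
    red_green = rn - gn                # red-green opponent channel
    blue_yellow = 2 * bn - rn - gn     # blue-yellow opponent channel
    hue = math.atan2(red_green, blue_yellow)         # angle in opponent plane
    saturation = math.hypot(red_green, blue_yellow)  # distance from gray axis
    return hue, saturation, intensity

print(rgb_to_hsi_opponent(0.8, 0.2, 0.1))  # a reddish-orange sample
```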
HSI Histogramming and Template Matching: The HSI histogramming step is preferably performed using thirty-six 12-bit counters to measure the number of pixels that fall within each prescribed HSI interval. The number of histogram intervals is not critical to the practice of the invention, but approximately 30 to 36 intervals is preferred to yield adequate color resolution while limiting errors due to illumination-spectrum-induced color shifts and suitably limiting space requirements on the chip. It should also be understood that histogramming is not necessary to the successful practice of the invention and that other computational expedients, such as a vector, could be used. Nevertheless, histogramming has been found to be convenient in regard to computation, signature coding and required chip space. After the scanning of the imager is completed, the counters hold the color signature of the scene. During the learning phase, the signature is transferred to one of the 32 on-chip arrays of SRAM template cells. During the matching phase, newly acquired signatures are compared to the stored templates, using 8 serial presentations of 4 parallel templates, with the sum-of-absolute-differences (SAD) cells. The resultant errors for the respective templates are presented to best-match selection logic, either on or off the chip, where they can be sorted using a simple micro-controller, such as a PIC, to find the best-matching template.
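A minimal software sketch of the histogramming and SAD-based template matching described above is given below; the counter and template counts follow the text, while the function names and the random example data are purely illustrative.

```python
# Sketch of the histogram-signature / SAD matching step (assumption:
# software model of the 36-counter histogram and template comparison).
import numpy as np

N_BINS = 36          # one 12-bit counter per HSI interval on the chip
N_TEMPLATES = 32     # on-chip SRAM template storage

def color_signature(bin_indices):
    """Count how many pixels fall in each of the 36 HSI intervals."""
    hist = np.zeros(N_BINS, dtype=np.int64)
    for idx in bin_indices:
        hist[idx] += 1
    return hist

def best_match(signature, templates):
    """Return (index, error) of the stored template with the smallest
    sum-of-absolute-differences (SAD) to the new signature."""
    errors = [np.abs(signature - t).sum() for t in templates]
    best = int(np.argmin(errors))
    return best, errors[best]

# Learning phase: store signatures; matching phase: compare a new one.
templates = [np.random.randint(0, 100, N_BINS) for _ in range(N_TEMPLATES)]
new_sig = color_signature(np.random.randint(0, N_BINS, size=5000))
print(best_match(new_sig, templates))
```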
The prototype of the arrangement described above demonstrates that a real-time color segmentation and recognition system can be implemented using a small silicon area and a small power budget. By using a fabrication technology with RGB filters, the entire system can be realized with a tiny footprint for compact imaging/processing applications. The above arrangement using RGB-to-HSI conversion is preferred since it is believed that signals corresponding to colors in the HSI color space most closely model human color perception sensitivity. However, it has been found that the invention can be successfully practiced using signals corresponding to any color space (e.g. CMY, YIQ, RGB, etc.) and that color space conversion is unnecessary to the practice of the invention. In other words, the invention, in a generalized form, need not rely on specific computations but may be implemented in a rule-based form which is at least as robust as the preferred embodiment described above, while also enhancing stability under conditions of variable color temperature of illumination. The more general form of the invention can be integrated even more easily on a single chip since color space conversion is not necessary, leaving additional chip space for a more extensive imager array (although high spatial resolution is not necessary for obtaining color signatures of identifiable objects) and/or storage of a greater number of templates to support recognition, tracking and the like of a larger number of objects. On the other hand, if desired, any desired color space conversion can be provided, allowing greater flexibility in data capture in accordance with any desired color space as well as allowing exploitation of particular properties of particular color spaces, such as more linear processing for noise removal, simplicity of computation of opponent color values, modeling of biological systems and the like. As with the preferred hue-based embodiment described above, special-purpose gate arrays and circuits can be provided for relatively simple calculations, and any complex calculations can be approximated sufficiently for histogramming, vector computation and the like, with a suitable degree of color resolution, as discussed above, by using a look-up table in a manner similar to that used for RGB-to-HSI conversion, as also discussed above.
Specifically, a rule-based embodiment of the invention is preferably implemented with processing which falls into two groups: a.) preprocessing and normalization and b.) rule application. Basically, the preprocessing and normalization captures the image data, standardizes the form of the data and determines color opponencies. The application of the rule sets then develops color families, refines the color categorization into sub-families, and finally analyzes the amount of variation in the data within a sub-family to develop a color signature for learning or for recognition of objects/scene segments in newly captured data.
As with the preferred, hue-based embodiment described above, the raw data sampling can be performed in numerous ways, such as by using a color filter wheel in the optical system projecting a scene onto the image sensor or by illuminating the scene with alternating wavelengths of light. Color filters or differences in color sensitivity can also be implemented on the chip, but such alternatives generally cause substantial increases in cost and chip space (since plural pixel areas are then required for the respective colors of an elemental area of a scene). Illumination or sensing in at least three wavelength bands covering the spectral region of interest is preferred. The light incident on the imager array is measured/quantized and an N-dimensional vector (where N equals the number of spectral bands measured) is produced. These measurements are then normalized. A chart with black, white and three intervening gray levels, such as the commercially available Macbeth Color Checker Chart commonly used in color photographic printing, has been found convenient for normalization. It is preferred to normalize the image values such that the value of each element is between zero and one in each color band measured. Once normalization has been accomplished, color opponencies are determined.
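For illustration, the per-band normalization may be sketched as a simple linear rescaling between the measured black and white chart patches, as follows; the function name is hypothetical, and the three intermediate gray levels of the chart, not used in this simplified sketch, can be employed to refine the mapping.

```python
# Sketch of the per-band normalization step (assumption: a simple linear
# rescaling using measured black and white chart patches).
import numpy as np

def normalize_bands(raw, black_ref, white_ref):
    """Map raw sensor values to [0, 1] in each spectral band, using the
    responses measured on the black and white patches of the chart."""
    raw = np.asarray(raw, dtype=float)
    black = np.asarray(black_ref, dtype=float)
    white = np.asarray(white_ref, dtype=float)
    scaled = (raw - black) / (white - black)
    return np.clip(scaled, 0.0, 1.0)

# Example with N = 3 bands (e.g. R, G, B):
print(normalize_bands([120, 200, 60], black_ref=[10, 12, 9], white_ref=[240, 250, 235]))
```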
The concept of color opponencies is as important to the practice of the generalized form of the invention as it is to the preferred embodiment. Color opponencies, as the term suggests, represent the basic distinctions which are made between colors or color component values in order to develop at least three color channels to which rule sets can later be applied. In theory, color opponencies may be freely chosen, but careful selection of color opponencies can greatly improve color error tolerance (e.g. a misidentification of a color between purple and blue is much more tolerable, in terms of extracting a color signature of a portion of a scene, than a gross error such as between yellow and blue; therefore color opponencies should preferably be defined between colors which are most nearly opposite, and the opponencies chosen should be as nearly orthogonal as possible, as shown in
Three or more sets of color opponencies are preferred for practice of the invention, and three sets of color opponencies are entirely sufficient for its successful practice. It is also preferred that one of the opponencies be a black-white opponency. Exemplary opponencies for some commonly used color spaces are:
for RGB, red-green (=R−G), blue-yellow (=2*B−R−G), and black-white (=(R+B+G)/3);
for CMY, an M−C channel is equivalent to the R−G channel in an RGB system (and the equivalency appears as the R−G and M−C opponencies being parallel vectors in
for YIQ, the YIQ values are simply the RGB values multiplied by a suitable 3×3 matrix; and so on. In general, good choices of color opponencies will allow distinctions between colors to be made in accordance with each opponency by analog differential amplifiers or by relatively simple (and small chip area) digital circuits for addition, subtraction, and multiplication or division by two and by three, suitable forms of which are known in the art. It should be noted from the above examples that good choices for color opponencies are often somewhat similar, due to the desirability of substantial mutual orthogonality of the opponencies (e.g. as depicted in
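The RGB opponencies listed above may be computed, for example, as in the following sketch; the formulas follow the text, while the function and variable names are merely illustrative.

```python
# Sketch of the RGB opponency computation given above (the three formulas
# come from the text; the function and variable names are illustrative).
def rgb_opponencies(r, g, b):
    red_green = r - g                  # red-green opponency
    blue_yellow = 2 * b - r - g        # blue-yellow opponency
    black_white = (r + g + b) / 3.0    # black-white (luminance) opponency
    return red_green, blue_yellow, black_white

# Example on normalized values in [0, 1]:
print(rgb_opponencies(0.9, 0.4, 0.1))   # strongly red, weakly blue, mid luminance
```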
Once the captured raw data has been transformed in terms of defined opponencies three preferred rule sets can be applied in turn. The first rule set applied classifies the transformed data into families such as red, green, blue, yellow and gray (which may be visualized as five volumes arranged in a cruciform configuration as shown in solid lines of
The second rule set develops sub-families to characterize the relative color vector lengths along the respective opponencies. For example, a color which could be described as orange could, depending on the particular shade thereof, be classified into either a red or a yellow family, which correspond to different opponencies. Application of this second rule set determines whether the color is a red-orange or a yellow-orange. The sub-families can be visualized as bifurcated volumes (depicted with dashed lines) in the interstices or corners between arms of the cruciform arrangement of volumes in
Application of the third rule set analyzes the amount of variation in the respective color channels corresponding to the opponencies discussed above. Low variation corresponds to de-saturated colors, which are identified as "light", while wide variation corresponds to saturated colors. Low luminosity colors are referred to as "dark". While the definitions of light and dark are not entirely symmetric, results have nevertheless been found to be quite acceptable. Average variation is also discriminated which, together with the light and dark classifications, multiplies the number of families and sub-families by a factor of three (e.g. dark, average and light; more subdivisions of families and sub-families may be provided if desired), generally corresponding to the preferred thirty-six bins of the histogram described above in connection with the preferred, hue-based embodiment of the invention, and can thus clearly be accommodated by the chip in accordance with the invention as described above in connection with the hue-based preferred embodiment.
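By way of example only, the sequential application of the three rule sets may be sketched in software as follows; the inputs are the opponency channel values from the preceding stage, and the particular thresholds, family boundaries and naming used here are illustrative assumptions, since the invention leaves such choices to the particular opponencies and color resolution desired.

```python
# Sketch of the three rule sets applied in sequence (assumption: the
# thresholds and family/sub-family boundaries below are illustrative only).
def classify_pixel(red_green, blue_yellow, black_white,
                   family_thresh=0.1, dark_thresh=0.2, light_thresh=0.15):
    # Rule set 1: pick a family from the dominant opponency channel.
    if abs(red_green) < family_thresh and abs(blue_yellow) < family_thresh:
        family = "gray"
    elif abs(red_green) >= abs(blue_yellow):
        family = "red" if red_green > 0 else "green"
    else:
        family = "blue" if blue_yellow > 0 else "yellow"

    # Rule set 2: refine into a sub-family from the relative channel lengths
    # (e.g. distinguishing a red-orange from a yellow-orange).
    secondary = blue_yellow if abs(red_green) >= abs(blue_yellow) else red_green
    subfamily = family if abs(secondary) < family_thresh else family + "-mixed"

    # Rule set 3: grade variation/luminosity as dark, average or light.
    chroma = max(abs(red_green), abs(blue_yellow))
    if black_white < dark_thresh:
        grade = "dark"           # low luminosity
    elif chroma < light_thresh:
        grade = "light"          # low variation -> de-saturated colors
    else:
        grade = "average"
    return family, subfamily, grade

print(classify_pixel(0.5, -0.3, 0.6))   # a saturated, red-leaning sample
```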
In view of the above discussion of the preferred, hue-based embodiment of the invention and the rule-based, generalized implementation of the invention, those skilled in the art will recognize that the pre-processing followed by application of the first and second rule sets is comparable to, and serves much the same function as, the hue computer of the preferred embodiment, but does not require (although it can include) a color space transformation and is generalized to any color space; similarly, application of the first rule set followed by application of the third rule set is comparable to the saturation computation in the preferred, hue-based embodiment. Accordingly, it is seen that a generalized, rule-based embodiment of the invention is broadly applicable to any arbitrary color space, is robust and stable in practice under varying illumination conditions, does not require a color space conversion, and has processing circuitry which is at least as well accommodated, together with an imager array, on a single integrated circuit chip.
While the invention has been described in terms of several embodiments and approaches, those skilled in the art will recognize that the invention can be practiced with modification within the spirit and scope of the appended claims.
This application is a Continuation-in-Part of U.S. patent application Ser. No. 10/214,123, filed Aug. 8, 2002, priority of which as to common subject matter with the present application is hereby claimed and the entire disclosure thereof is hereby fully incorporated by reference.
| Relation | Number | Date | Country |
|---|---|---|---|
| Parent | 10/214,123 | Aug 2002 | US |
| Child | 11/118,485 | May 2005 | US |