The present invention relates to digital video signal processing, and more particularly to architectures and methods for digital camera front-ends.
Imaging and video capabilities have become the trend in consumer electronics. Digital cameras, digital camcorders, and video cellular phones are common, and many other new gadgets are evolving in the market. Advances in large-resolution CCD/CMOS sensors coupled with the availability of low-power digital signal processors (DSPs) have led to the development of digital cameras with both high-resolution image and short audio/visual clip capabilities. The high resolution (e.g., a sensor with a 2560×1920 pixel array) provides quality comparable to that of traditional film cameras.
FIG. 2a is a typical functional block diagram for digital camera control and image processing (the "image pipeline"). The automatic focus, automatic exposure, and automatic white balancing are referred to as the 3A functions; and the image processing includes functions such as color filter array (CFA) interpolation, gamma correction, white balancing, color space conversion, and JPEG/MPEG compression/decompression (JPEG for single images and MPEG for video clips). Note that the typical color CCD consists of a rectangular array of photosites (pixels) with each photosite covered by a filter (the CFA): typically red, green, or blue. In the commonly-used Bayer pattern CFA, one-half of the photosites are green, one-quarter are red, and one-quarter are blue.
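For concreteness, a minimal C sketch of addressing one common Bayer phase (the actual phase ordering varies by sensor; this particular GR/BG mapping is an illustrative assumption):

```c
/* Color at (row, col) in a Bayer CFA, assuming a GR/BG phase:
 * even lines run G R G R ..., odd lines run B G B G ....
 * Half the sites are green, a quarter red, a quarter blue. */
typedef enum { RED, GREEN, BLUE } cfa_color;

static cfa_color bayer_color(int row, int col)
{
    if ((row & 1) == 0)                      /* even line: G R G R ... */
        return (col & 1) ? RED : GREEN;
    else                                     /* odd line:  B G B G ... */
        return (col & 1) ? GREEN : BLUE;
}
```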
Typical digital cameras provide a capture mode with full resolution image or audio/visual clip processing plus compression and storage, a preview mode with lower resolution processing for immediate display, and a playback mode for displaying stored images or audio/visual clips.
A digital signal processing device that provides the imaging and video computation and data flow faces multiple challenges.
The present invention provides a digital camera video processing front-end architecture of multi-interconnected autonomous processing modules for efficient operation.
FIGS. 1a-1d illustrate functional blocks of a preferred embodiment front-end, a buffer interface, a video processing subsystem, and a digital camera processor.
FIGS. 2a-2b are functional block diagrams for a generic digital camera image pipeline and a generic network connection.
FIGS. 4a-4c illustrate data flow and a reformatter in a preferred embodiment CCD/CMOS controller.
FIG. __ shows a horizontal median filter.
FIGS. 17a-17b show resampling.
FIGS. 18a-18b show resampling.
Preferred embodiment video processing front-end (VPFE) architectures include multiple processing modules (e.g., CCD controller, preview engine, 3A functions, histogram, resizer) interfaced together in such a way that complicated data flow can be realized and managed.
FIG. 1a also shows the processing modules tied to a configuration/MMR (memory-mapped registers) bus central resource; the configuration bus can connect the processing modules to a program controller (e.g., the ARM RISC processor in FIG. 1b).
FIG. 1b shows an integrated circuit processor for a digital camera which includes a preferred embodiment VPFE (upper left in FIG. 1b).
FIG. 1c (and section 2 below) shows more detail of the connections of the VPFE processing modules with the external memory read/write buffers, together with bus priorities and port bit widths. Note that a processing module reads from the external memory through the read buffer on a bus with VBUSM protocol, whereas a processing module writes to the external memory through the write buffer on a bus with VBUSP protocol. Essentially, the VBUSM protocol provides non-blocking split-transaction reads, whereas the VBUSP protocol provides single-transaction posted writes. That is, reads are split into request and read-data transactions, so a pending read does not block subsequent read requests; writes are posted, so a pending write is buffered while subsequent writes can still be accepted.
FIG. 1d shows the VPFE together with a video processing back-end (VPBE) which shares the read buffers and bus for reads from the external memory. Note that the CCDC can send data directly to the video encoder (VENC) for output with minimal processing.
The control mechanism for each module is autonomous to allow chain-regulated as well as concurrent dataflow; for example, one module's output can be chained directly into the next module's input while other transfers proceed concurrently.
The ability to chain processing steps and to allow multiple concurrent autonomous threads of computation adds significant flexibility and power efficiency to digital processing devices that incorporate the VPFE architecture.
The inter-module video port interface (VPI) is a bus that carries video data as well as video clock, data enable, horizontal synchronization (HSYNC), and vertical synchronization (VSYNC) signals. With synchronization information incorporated into the interface, modules can easily be connected in different configurations in alternative chip designs.
The video port interface is also used inside the CCD Controller and the Preview engine modules to connect processing stages. This allows a modular design methodology that enables reconfiguration of the processing stages in CCDC or Preview, and allows reuse of these processing stages in other modules.
The CCD Controller's video signal output is transmitted over two instances of the video port interface (VPI in FIG. 1a).
Preferred embodiment systems (digital still cameras, digital camcorders, video cell phones, netcams, et cetera) include a preferred embodiment VPFE with any of several types of additional hardware: digital signal processors (DSPs), general purpose programmable processors, application specific circuits, or systems on a chip (SoC) such as multicore processor arrays or combinations of a DSP and a RISC processor together with various specialized programmable accelerators; see FIG. 1b.
The shared buffer logic/memory is a unique block that is tailored for seamlessly integrating the VPSS into an image/video processing system. It acts as the primary source or sink to all the VPFE and VPBE modules that are either requesting or transferring data from/to the SDRAM/DDRAM. In order to efficiently utilize the external SDRAM/DDRAM bandwidth, the shared buffer logic/memory interfaces with the direct memory access (DMA) system via a high bandwidth bus (64-bit wide). The shared buffer logic/memory also interfaces with all the VPFE and VPBE modules via a 128-bit wide bus. The shared buffer logic/memory (divided into the read and write buffers plus arbitration logic) is capable of performing the following functions.
The shared buffer logic is capable of arbitrating between all the VPFE and VPBE modules and the DMA SCR0 based on fixed priorities. It is designed to maximize the SDRAM/DDRAM bandwidth even though each of the individual VPFE/VPBE modules makes data transfers/requests in smaller sizes. Based on the bandwidth analysis, the arbitration scheme for the buffer memory between all the VPFE modules, VPBE, and the DMA SCR0 (DDR EMIF) interface needs to be customized for each system. It is important to note that the VPSS requests to the DMA SCR0 interface should be treated as the highest priority on the system to guarantee correct functionality. It is possible to lower the priority of the VPSS requests to the DDR EMIF by a register setting.
The shared buffer logic/memory comprises the read buffers, the write buffers, and arbitration logic to achieve its functionality.
There are several registers available for debugging the transfer of data between the VPSS modules and the SDRAM/DDRAM. The debug registers are divided into two categories:
(a) 8 global request registers to capture information about any of the 56 individual module request registers (each register provides information about one data-unit) at a given time. The number 8 corresponds to the maximum number of EMIF command queue entries plus one.
Each of the global request registers provides information about one captured request.
(b) 56 individual module request registers (either read or write information; each register corresponds to one data-unit)
The VPSS has a single central resource (a BCG SCR 1-to-n generator) that generates all the individual MMR/config bus signals to the various VPFE/VPBE modules. The MMR/config bus port for each module is used to program the individual registers. The central resource itself has an input MMR/config bus port on the VPSS boundary.
Module starting addresses in the MMR/config space could be assigned per module.
There are various embedded memories in the processing modules and in the read/write buffers for external memory.
The main processing done by the CCDC module on the raw data (from the CCD/CMOS sensor) is optical black clamping followed by fault pixel correction; see the upper portion of FIG. 4a.
The data reformatter converts nonstandard imager data format to the standard raster-scan format for processing. The imager data format, particularly in video mode (lower resolution but high frame rate, usually 30 frames/sec), varies among imager vendors and is still evolving. A programmable data reformatter architecture (see cross-referenced application Ser. No. 10/888,701, hereby incorporated by reference) comprehends many data formats today, and should support many more future data formats.
FIG. 4b shows the processing flow for YCbCr data. Control registers provide format information so that the video signal can be properly recognized and processed.
The data reformatter memory is efficiently utilized by functional reorganization of the memory.
The fault pixel table must contain entries in ascending order (pixel read-out order) in terms of the line and pixel count. In the case of interlaced sensors, the programmer can program multiple tables (one for each field of the frame) and switch the starting address in the SDRAM when the corresponding field is clocked into the CCDC. Note that the number of fault pixels should also be modified appropriately. The fault pixel correction can also be applied to movie-mode sensors (note that each fault pixel position is determined by the pixel's offset from VD/VSYNC and HD/HSYNC).
The CCDC requests the fault pixel entries from the read buffer interface in the VPSS. The read buffer is capable of buffering up to a total of N fault entries internally (for example, N = 128 for this discussion). The 128 entries (a parameter that can vary per chip/design) are arranged as two 64-entry blocks in a ping-pong scheme. On every new frame, the read buffer logic issues a request to the system DMA controller to transfer 64 entries into the internal buffer; a second request is sent immediately after that. Further requests are issued only upon the complete utilization of a 64-entry block. Because time is needed to fetch the fault pixels from the SDRAM/DDRAM, the number of fault pixels that can be corrected in a given time is limited by the system DMA bandwidth and latency. At a minimum, the time to transfer 64 entries from the external location (typically SDRAM/DDRAM) must be less than the time to exhaust (fault-pixel correct) the 64 entries residing in the other block. If this requirement is not met at any instant, the fault-pixel correction circuitry in the CCDC flags an error bit and halts processing for that frame; no error recovery is implemented, so the circuitry does not continue correcting the remaining fault pixels after failing to correct one due to bandwidth/latency issues.
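A minimal C sketch of the ping-pong scheme, assuming a synchronous stand-in for the asynchronous DMA request (all names are illustrative, not the actual hardware interface):

```c
#include <stdint.h>
#include <stdbool.h>

#define BLOCK_ENTRIES 64   /* half of the 128-entry fault read buffer */

/* Stand-in for the asynchronous request to the system DMA controller;
 * in hardware, its completion sets the other-block-ready flag. */
static void dma_fetch(uint32_t *dst, int n) { (void)dst; (void)n; }

typedef struct {
    uint32_t entries[2][BLOCK_ENTRIES];
    int      active;        /* block currently being consumed */
    int      next;          /* next entry within the active block */
    bool     other_ready;   /* has the inactive block been refilled? */
} fault_pingpong;

/* Returns false when the other block was not refilled in time, which
 * corresponds to the CCDC flagging the error bit and halting the frame. */
static bool next_fault_entry(fault_pingpong *pp, uint32_t *out)
{
    *out = pp->entries[pp->active][pp->next++];
    if (pp->next == BLOCK_ENTRIES) {
        if (!pp->other_ready)
            return false;              /* DMA too slow: error for this frame */
        pp->active ^= 1;               /* swap to the refilled block */
        pp->next = 0;
        pp->other_ready = false;
        dma_fetch(pp->entries[pp->active ^ 1], BLOCK_ENTRIES);
    }
    return true;
}
```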
Following the fault pixel correction, the raw data can be stored into the SDRAM/DDRAM for software image processing (e.g., by the DSP and coprocessor subsystem in FIG. 1b).
The CCDC module is capable of transforming movie mode readout patterns (such as Sony, Fuji, Sharp, Matsushita) into Bayer readout patterns. The advantage of such a conversion is that the remaining VPFE modules need not be designed to handle formats other than Bayer and Foveon patterns. This vastly simplifies the design effort in those modules. Following the fault pixel correction, the CCDC module utilizes the data reformatter memory and logic for this transformation. Data from the reformatter memory is stored as the Bayer pattern and this is in turn the input to the various VPFE modules.
The basic idea behind the data reformatter is to convert a single line of movie-mode sensor output into multiple Bayer lines.
The preview engine receives raw image/video data from either the video port interface via the CCDC block (which is interfaced to the external CCD/CMOS sensor) or from the read buffer interface via the SDRAM/DDRAM. The input data is 10 bits wide if the source is the video port interface. When the input source is the read buffer interface, the data can be either 8 or 10 bits wide, and 8-bit data can be either linear or non-linear. In addition, the preview engine can optionally fetch a dark frame from the SDRAM/DDRAM, with each pixel being 8 bits wide.
The starting input SDRAM/DDRAM address should be on a 32-byte boundary. Even though the address is programmed as 32 bits, the 5 LSBs are treated as zeroes. The 16-bit line offset register must also be programmed on a 32-byte boundary; as with the starting address, its 5 LSBs are treated as zeroes. Furthermore, the dark frame subtract input and the preview engine output addresses and line offsets must be on a 32-byte boundary.
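A short C sketch of the alignment behavior (helper names are illustrative):

```c
#include <stdint.h>
#include <stdbool.h>

/* Addresses and line offsets must lie on 32-byte boundaries. */
static bool is_32byte_aligned(uint32_t v)
{
    return (v & 0x1F) == 0;
}

/* What the hardware effectively does: the 5 LSBs are treated as zero. */
static uint32_t force_align_32(uint32_t v)
{
    return v & ~0x1FU;
}
```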
When the input source is the SDRAM/DDRAM, the preview engine always operates in the one-shot mode; the enable bit is turned off and it is up to the firmware to re-enable it to process the next frame from the SDRAM/DDRAM.
The preview engine can only output at most 1280 pixels in each horizontal line due to the line memory width restrictions in the noise filter and CFA interpolation blocks. In order to support sensors that output more than 1280 pixels per line, an averager is incorporated to downsample by factors of 1 (no averaging), 2, 4, or 8 in the horizontal direction. The horizontal distance between two consecutive pixels to be averaged is selectable between 1, 2, 3, or 4, and can be programmed separately for even and odd lines. The valid output of the input formatter/averager is either 8 or 10 bits wide. Alternatively, a wide image could be partitioned into panels of at most 1280 pixels, each panel processed without averaging, and the processed panels stitched together.
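A minimal C sketch of the averager, assuming 16-bit pixel storage (function and buffer names are illustrative):

```c
#include <stdint.h>

/* Downsample a line horizontally by averaging `factor` pixels
 * (1, 2, 4, or 8) spaced `dist` apart (1..4). In a raw Bayer stream,
 * dist = 2 keeps same-color pixels together. */
static void average_line(const uint16_t *in, uint16_t *out,
                         int in_len, int factor, int dist)
{
    int step = factor * dist;   /* input pixels consumed per output pixel */
    for (int i = 0, o = 0; i + (factor - 1) * dist < in_len; i += step, o++) {
        uint32_t sum = 0;
        for (int k = 0; k < factor; k++)
            sum += in[i + k * dist];
        out[o] = (uint16_t)(sum / factor);
    }
}
```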
The preview engine is capable of writing a dark frame to the SDRAM/DDRAM instead of performing the conventional processing steps. This dark frame can later be used for subtracting from the raw image data. Each input pixel is written out as an 8-bit value; if the input pixel value is greater than 255, it is saturated to 255. The idea here is that if a dark pixel is greater than 255, it is more likely to be a fault pixel and can be corrected by the fault pixel correction module in the CCDC.
In order to save capacity and bandwidth when the input source to the preview engine is the SDRAM/DDRAM, data could be stored in an A-law compressed (non-linear) space by the CCDC. The inverse A-law block decompresses the 8-bit non-linear data to 10-bit linear data if enabled. If the A-law block is not enabled, but the input is still 8-bits, the data is left shifted by 2 to make it 10-bit data. If the input is 10-bits wide in the first place, no operation is performed on the data.
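A minimal C sketch of this input formatting, assuming a firmware-programmed decompression table (table contents and names are illustrative placeholders):

```c
#include <stdint.h>

/* Placeholder; the real 8-to-10-bit decompression table is programmed
 * by firmware. */
static const uint16_t alaw_decode_lut[256] = {0};

/* 8-bit A-law data expands to 10-bit linear via the LUT; plain 8-bit
 * data is left-shifted by 2; 10-bit data passes through unchanged. */
static uint16_t format_input(uint16_t pix, int bits, int alaw_enabled)
{
    if (bits == 8)
        return alaw_enabled ? alaw_decode_lut[pix & 0xFF]
                            : (uint16_t)((pix & 0xFF) << 2);
    return pix & 0x3FF;   /* already 10-bit linear */
}
```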
The preview engine is capable of optionally fetching a dark frame containing 8-bit values from the SDRAM/DDRAM and subtracting it pixel-by-pixel from the incoming input frame. The output of the dark frame subtract is 10 bits wide (U10Q0). The firmware is responsible for allocating enough SDRAM/DDRAM bandwidth to the preview engine if this feature is enabled. At its peak (operating at 75 MP/s), the dark frame subtract read bandwidth is 75 MB/s.
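A one-function C sketch of the subtraction, clamping the result to the unsigned 10-bit (U10Q0) range:

```c
#include <stdint.h>

/* Subtract an 8-bit dark pixel from a 10-bit input pixel; the result
 * is clamped at 0 (it cannot exceed 1023, since the input is 10-bit). */
static uint16_t dark_subtract(uint16_t in10, uint8_t dark8)
{
    int32_t d = (int32_t)(in10 & 0x3FF) - dark8;
    return (uint16_t)(d < 0 ? 0 : d);
}
```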
The preview engine contains a horizontal median filter that is useful for reducing temperature-induced noise effects. The horizontal median filter, shown in FIG. __, operates on same-color pixels within each line.
If the horizontal median filter is enabled, the preview engine will reduce the output of this stage by 4 pixels in each line (2 starting pixels at the left edge and 2 ending pixels at the right edge). For example, if the input size is 656×490 pixels, the output will be 652×490 pixels. There will be no chopping of data if this block is disabled.
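A minimal C sketch, under the assumption (inferred from the 2-pixel crop at each edge and the Bayer layout; the patent figure defines the actual window) that the median is taken over three same-color pixels spaced two apart:

```c
#include <stdint.h>

static uint16_t median3(uint16_t a, uint16_t b, uint16_t c)
{
    if (a > b) { uint16_t t = a; a = b; b = t; }
    if (b > c) { uint16_t t = b; b = c; c = t; }
    return (a > b) ? a : b;   /* middle value of the three */
}

/* Same-color neighbors in a Bayer line sit 2 pixels apart, so a
 * 3-point same-color median spans pixels i-2, i, i+2 -- matching the
 * 2-pixel crop at each line edge described above. */
static void hmedian_line(const uint16_t *in, uint16_t *out, int len)
{
    for (int i = 2; i < len - 2; i++)
        out[i - 2] = median3(in[i - 2], in[i], in[i + 2]);
}
```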
Following the horizontal median filter, a programmable filter that operates on a 3×3 grid of same-color pixels reduces the noise in the image/video data. This filter always operates on nine pixels of the same color, identifying neighborhood same-color pixels that are close in value.
If the noise filter is enabled, the preview engine will reduce the output of this stage by 4 pixels in each line (2 starting pixels at the left edge and 2 ending pixels at the right edge) and 4 lines in each frame (2 starting lines at the top edge and 2 ending lines at the bottom edge). For example, if the input size is 656×490 pixels, the output will be 652×486 pixels. There will be no chopping of data if this block is disabled.
The white balance module has two gain adjusters, a digital gain adjuster and a white balance adjuster. In the digital gain adjuster, the raw data is multiplied by a fixed gain regardless of the color of the pixel to be processed. In the white balance gain adjuster, the raw data is multiplied by a selected gain corresponding to the color of the processed pixel. The white balance gain can be selected from four 8-bit values depending on the position of the current pixel modulo 4 or 3 (selectable via a control register setting). Firmware can assign any combination of up to 4 pixels in the horizontal and vertical directions (up to 16 total locations). For example, the white balance gain selected for pixel #0 of line #0 can be different from that for pixel #2 of line #0.
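A minimal C sketch of the gain selection (the position-to-gain map and the U8Q5 gain scaling are assumptions; names are illustrative):

```c
#include <stdint.h>

/* One of four 8-bit gains is applied, selected by the pixel's line and
 * column position modulo 4 (or 3) through a firmware-programmed map. */
static uint16_t wb_apply(uint16_t pix, int line, int col,
                         const uint8_t gain[4],
                         const uint8_t sel[4][4],  /* position -> gain index */
                         int modulus /* 4 or 3 */)
{
    uint8_t g = gain[sel[line % modulus][col % modulus] & 3];
    uint32_t v = ((uint32_t)pix * g) >> 5;   /* assumed U8Q5: gain/32 */
    return (v > 0x3FF) ? 0x3FF : (uint16_t)v; /* clamp to 10 bits */
}
```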
The CFA interpolation block is responsible for populating the missing colors at each pixel location, resulting in a 3-color RGB pixel. The CFA interpolation module is bypassed in the case of the Foveon sensor, since the image is already fully populated with all three primary colors. In the case of a Bayer pattern, the CFA interpolation works for primary color, complementary color, or four-color sensors.
The CFA interpolation is implemented using programmable filter coefficients, each 8 bits wide. Each of the three output colors (R, G, and B) has its own coefficients, with 9 coefficients per output color (to accommodate a 3×3 fully populated grid). In addition, there are 4 phases for each color representing the position in the 2×2 grid. Furthermore, different sets of filter coefficients are provided depending on the tendency (horizontal, vertical, or neutral), as shown in FIG. __.
The horizontal and vertical gradients are computed as:
Gradient = ABS(X[-1] - X)/2 + ABS(X[+1] - X)/2 + ABS(X[-1] - X[+1]) + ABS(X[+2] - X) + ABS(X[-2] - X)

where X is the current pixel and X[±k] denotes the pixel offset by k positions in the corresponding (horizontal or vertical) direction.
Based on the phase, color, and tendency, the 9 selected filter coefficients are used to compute the output pixel by performing 2-D 3×3 FIR filtering. Since the preview engine can be clocked at at least twice the incoming raw input data rate, only 14 multipliers are required to implement the CFA interpolation: 9 of the 14 multipliers compute either the red or blue color while the remaining 5 compute a partial green; in the next cycle, 9 of the 14 compute the other of blue or red and the other 5 complete the green color.
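A minimal C sketch of the datapath for one output color at one pixel (the coefficient fraction width, tendency threshold, and selection logic are assumptions):

```c
#include <stdint.h>
#include <stdlib.h>

enum { TEND_H, TEND_V, TEND_N };

/* 3x3 FIR over a window of same-grid pixels with 8-bit signed
 * coefficients; the >>6 fraction width is an assumption. */
static uint16_t cfa_fir3x3(const uint16_t win[3][3], const int8_t coef[3][3])
{
    int32_t acc = 0;
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < 3; c++)
            acc += (int32_t)win[r][c] * coef[r][c];
    acc >>= 6;
    if (acc < 0) acc = 0;
    if (acc > 0x3FF) acc = 0x3FF;   /* clamp to 10 bits */
    return (uint16_t)acc;
}

/* Pick the tendency: neutral when the gradients are close, otherwise
 * interpolate along the direction with the smaller gradient. */
static int tendency(int hgrad, int vgrad, int thresh)
{
    if (abs(hgrad - vgrad) < thresh)
        return TEND_N;
    return (hgrad < vgrad) ? TEND_H : TEND_V;
}
```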
The CFA filter coefficients are stored in an internal memory inside the preview engine. Firmware is responsible for programming the table entries.
The CFA interpolation step can be optionally disabled; in this case, the input stream is duplicated into 3 streams to represent the red, green, and blue colors. If the CFA interpolation is enabled, the preview engine will reduce the output of this stage by 4 pixels in each line (2 starting pixels at the left edge and 2 ending pixels at the right edge) and 4 lines in each frame (2 starting lines at the top edge and 2 ending lines at the bottom edge). For example, if the input size is 656×490 pixels, the output will be 652×486 pixels for each of the three output colors. There will be no chopping of data if this block is disabled.
The CFA interpolation architecture provides directional information and allows the firmware to configure filter coefficients for each direction tendency. By providing orthogonally programmable coefficients, the CFA interpolation stage can deal with different sensor characteristics and different lighting/scene characteristics, and can implement special effects such as sharpening and softening essentially for free. For example, a complementary color sensor can be supported with the same architecture, but with filter coefficients selected to comprehend the color space transformation.
The output of the CFA interpolation is three pixel values (red, green, and blue), which feed the black adjustment module. The black adjuster module performs the following calculation to adjust each color level:
data_out = data_in + b1_offset
The RGB2RGB blending module uses a general 3×3 matrix to redefine the RGB data from the CFA interpolation module, which can function as a color correction. The input is signed 11 bits and the output is unsigned 10 bits. In this module, the following calculation is made.
Each of the gains is 12-bit data with a range of −8 to +8 (with 8-bit fraction).
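A minimal C sketch of the blend, assuming the gains are S12Q8 (12-bit signed, 8 fraction bits, range -8 to +8) per the description above:

```c
#include <stdint.h>

/* 3x3 color-correction matrix multiply: signed 11-bit inputs, S12Q8
 * gains, result clamped to unsigned 10 bits. */
static void rgb2rgb(const int16_t in[3], uint16_t out[3],
                    const int16_t m[3][3] /* S12Q8 gains */)
{
    for (int i = 0; i < 3; i++) {
        int32_t acc = 0;
        for (int j = 0; j < 3; j++)
            acc += (int32_t)m[i][j] * in[j];
        acc >>= 8;                      /* drop the Q8 fraction */
        if (acc < 0) acc = 0;
        if (acc > 0x3FF) acc = 0x3FF;
        out[i] = (uint16_t)acc;
    }
}
```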
The gamma correction is performed on each of the R, G, and B pixels separately using a RAM-based lookup. Each table has 1024 entries, each 8 bits wide, and is programmed by the firmware. The input data value is used to index into the table, and the table content is the output. The host processor can only write the gamma RAM (via registers) while the preview engine is disabled.
The RGB2YCbCr conversion module has a 3×3 square matrix and converts the RGB color space of the image data into the YCbCr color space. In addition to the conversion matrix operation, offset, contrast, brightness and chroma suppression are performed in this module.
For the non-linear luminance enhancement, the luminance is first high-pass filtered:

hpy(i) = y(i) - (y(i-1) + y(i+1))/2;
The high-passed value is then fed to a lookup table with interpolation (optionally, the luminance value y itself can be fed instead of the high-passed version of Y).
The interpolated output is then added to the original Y to complete the luminance enhancement:

enh_y(i) = clip(y(i) + interpolated(i));
If the non-linear luminance enhancer is enabled, the preview engine will reduce the output of this stage by 2 pixels in each line (1 starting pixel at the left edge and 1 ending pixel at the right edge). For example, if the input size is 656×490 pixels, the output will be 654×490 pixels. There will be no chopping of data if the non-linear luminance enhancer is disabled.
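A minimal C sketch of the enhancement path under stated assumptions (the LUT size, index/fraction split, and sign handling are illustrative; the real table is programmed by firmware):

```c
#include <stdint.h>

static const int16_t enh_lut[32] = {0};  /* placeholder firmware table */

static uint8_t clip8(int32_t v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* High-pass each luma sample, map |hp| through the LUT with linear
 * interpolation between entries, restore the sign, then add back to Y
 * with clipping: enh_y(i) = clip(y(i) + interpolated(i)). */
static void luma_enhance(const uint8_t *y, uint8_t *out, int len)
{
    for (int i = 1; i < len - 1; i++) {
        int32_t hp  = y[i] - (y[i - 1] + y[i + 1]) / 2;   /* hpy(i) */
        int32_t mag = hp < 0 ? -hp : hp;
        int idx  = (mag >> 3) & 31;     /* assumed table index split */
        int frac = mag & 7;             /* assumed interpolation bits */
        int32_t g = enh_lut[idx];
        if (idx < 31)
            g += ((enh_lut[idx + 1] - enh_lut[idx]) * frac) >> 3;
        if (hp < 0)
            g = -g;
        out[i - 1] = clip8(y[i] + g);
    }
}
```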
The resizer module performs either upsampling (digital zoom) or downsampling on image/video data. The input source can be either the preview engine or SDRAM/DDRAM and the output is sent to the SDRAM/DDRAM.
The resizer module performs horizontal resizing and then vertical resizing, with an optional edge enhancement feature in between. Processing flow and data precision at each stage are shown in FIG. __.
The line buffer is functionally either 3 lines of 1280 pixels×16-bit or 6 lines of 640 pixels×16-bit, depending on the vertical resizing being 4-tap or 7-tap mode. In hardware implementation, the line buffer is intended to be a single block of memory organized as 640×96-bit.
The resizer module has the ability to upsample or downsample image data with independent resizing factors in the horizontal and vertical directions (HRSZ and VRSZ). The same resampling algorithm is applied in both directions; for the rest of this section, the horizontal direction is used in describing the resampling algorithm. The HRSZ and VRSZ parameters can range from 64 to 1024 to give a resampling range of 0.25x to 4x (the factor is 256/RSZ). There are 32 programmable coefficients available for the horizontal direction and another 32 programmable coefficients for the vertical direction. The 32 programmable coefficients are arranged as either 4 taps and 8 phases for the resizing range of 1/2x to 4x, or 7 taps and 4 phases for the resizing range of 1/4x to ~1/2x (upper end not included). Table 2 shows the arrangement of the 32 filter coefficients. Each tap is in S10Q8 format (a signed 10-bit value with 8 fraction bits).
FIGS. 17a-17b show the resizer method in the 4-tap/8-phase mode.
A standard implementation of resampling requires the number of phases to equal the numerator of the resampling factor, in this case 256. The resizer module is architected with an approximation scheme that reduces the number of phases to 4 or 8, reducing coefficient storage by a factor of up to 64. This approach reduces hardware cost while providing fine-grain resampling factor control (compared with providing just 4/D resampling), with minimal quality impact on the resized images.
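A minimal C sketch of the 4-tap/8-phase horizontal pass under these assumptions: the output position advances by RSZ/256 input pixels per output pixel, and the phase is approximated by the top 3 bits of the 8-bit fractional position (coefficients shown are placeholders):

```c
#include <stdint.h>

#define NPHASE 8   /* 4-tap/8-phase mode: resize range 1/2x to 4x */
#define NTAP   4

/* Placeholder S10Q8 coefficients; row 0 is an identity tap (256 = 1.0). */
static const int16_t coef[NPHASE][NTAP] = {{0, 256, 0, 0}};

/* Caller must supply base + NTAP - 1 valid input pixels for the last
 * output pixel (i.e., the input line is assumed padded). */
static void resize_line(const uint8_t *in, uint8_t *out,
                        int out_len, int rsz /* 64..1024 */)
{
    for (int o = 0; o < out_len; o++) {
        uint32_t pos   = (uint32_t)o * rsz;         /* Q8 input position */
        int      base  = pos >> 8;                  /* integer part */
        int      phase = (pos >> 5) & (NPHASE - 1); /* top 3 fraction bits */
        int32_t  acc   = 0;
        for (int t = 0; t < NTAP; t++)
            acc += (int32_t)coef[phase][t] * in[base + t];
        acc >>= 8;                                  /* drop Q8 scale */
        out[o] = (uint8_t)(acc < 0 ? 0 : (acc > 255 ? 255 : acc));
    }
}
```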
Chroma inputs, Cb and Cr, are 8-bit unsigned values that represent 128-biased 8-bit signed numbers. Before the resizing computation, the 128 bias should be subtracted from the chroma to convert back to 8-bit signed format (strictly speaking, the signed chroma components are called U and V rather than Cb and Cr). During resizing, chroma should be processed as 8-bit signed numbers. After vertical resizing, the 128 bias should be added back to return to 8-bit unsigned format.
Edge enhancement can be optionally applied to the horizontally resized luminance component before the output of the horizontal stage is sent to the line memories and the vertical stage. Either a 3-tap or a 5-tap horizontal high-pass filter can be selected for the luminance enhancement, as shown below. If the edge enhancement is selected, the two leftmost and two rightmost pixels in each line will not be output to the line memories and the vertical stage. The edge enhancement algorithm is as follows.
Basically, the high-pass gain is computed by mapping the absolute value of the high-passed luma through the curve of FIG. __.
CORE is in U8Q0 (unsigned 8-bit integer) format. SLOP is in U4Q4 (unsigned 4-bit fraction) format. GAIN is in U4Q4 (unsigned 4-bit fraction) format. Hpgain is computed with sign/integer bits plus 4 bits of fraction, but is saturated to 0..15 (representing 0/16 to 15/16) before being clipped by GAIN.
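A sketch of one piecewise-linear reading of this gain curve (the exact curve is defined by the figure; this coring-plus-slope form is an assumption):

```c
#include <stdint.h>

/* Map |high-passed luma| to a U4Q4 gain: zero below CORE, rising with
 * slope SLOP above it, saturated to 15/16, then clipped by GAIN. */
static int32_t hp_gain(int32_t abs_hp, uint8_t core /* U8Q0 */,
                       uint8_t slop /* U4Q4 */, uint8_t gain /* U4Q4 */)
{
    int32_t g = (abs_hp <= core) ? 0 : (abs_hp - core) * slop;  /* Q4 */
    if (g > 15)
        g = 15;            /* saturate to 15/16 */
    if (g > gain)
        g = gain;          /* final clip by GAIN */
    return g;              /* U4Q4 gain applied to the hp signal */
}
```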
The selectable high-pass filter kernel allows different degrees of sharpening. The 3-tap filter offers general-purpose sharpening, while the 5-tap filter has a frequency characteristic that amplifies a wider spectrum of the input image. The 5-tap filter works well with large downsampling factors (2 to 4), where a larger portion of the spectrum is attenuated by the resampling filter.
The resizer should support multiple passes of processing for larger resizing operations; "larger" here has several meanings.
As shown in the high-level block diagram in FIG. __, the h3A module contains the auto focus (AF) and auto exposure/auto white balance (AE/AWB) data paths.
Prior to directing the image/video data to the AF and AE/AWB data paths, the h3A module preprocesses the input data. The necessary preprocessing steps are a horizontal median filtering step and a 10-bit to 8-bit A-law compression step.
The horizontal median filter, shown in FIG. __, operates as in the preview engine.
The A-law conversion routine compresses the 10-bit value to an 8-bit value. When the A-law table is enabled, the output is still carried as 10 bits, with the upper two bits set to 0.
The Auto Focus engine works by extracting each red, green, and blue pixel from the video stream and subtracting a fixed offset of 128 or 512 (depending on whether the A-law is enabled or disabled) from the pixel value. The offset-subtracted value is then passed through an IIR filter, and the absolute value of the filter output is the focus value (FV). The focus values can either be summed, or the maximum FV of each line can be accumulated; the maximum FV of each line in a paxel is acquired if the FV mode is set to 'Peak mode'. The values of the red, green, and blue pixels and either the accumulated FV or the maximum FV are accumulated per paxel and sent out over the data interface.
The red, green, and blue pixel extraction is controlled by a register setting that specifies which of the six possible modes is to be used, as shown in FIG. __.
The focus value calculator takes the unsigned red/green/blue extracted data and subtracts 128 or 512 (depending on whether the A-law is enabled or disabled) to place the data in the range -128 to 127 or -512 to 511, respectively. After removing the offset, the data is sent through two IIR filters, each with a unique set of 11 coefficients; see FIG. __.
The FV accumulator takes the FV values from the filter and accumulates them for each paxel. The size and number of paxels are configurable by registers. In Peak mode, the maximum value per line is accumulated; in Sum mode, all FVs in a paxel are accumulated; see FIG. __.
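A minimal C sketch of the two accumulation modes (structure and names are illustrative):

```c
#include <stdint.h>

typedef enum { FV_SUM, FV_PEAK } fv_mode;

typedef struct {
    uint32_t acc;       /* accumulated FV for the paxel */
    uint32_t line_max;  /* running maximum within the current line */
} paxel_fv;

/* Sum mode adds every FV; Peak mode tracks the per-line maximum. */
static void fv_update(paxel_fv *p, uint32_t fv, fv_mode mode)
{
    if (mode == FV_SUM)
        p->acc += fv;
    else if (fv > p->line_max)
        p->line_max = fv;
}

/* In Peak mode, fold the line maximum into the paxel at end of line. */
static void fv_end_of_line(paxel_fv *p, fv_mode mode)
{
    if (mode == FV_PEAK) {
        p->acc += p->line_max;
        p->line_max = 0;
    }
}
```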
The AE/AWB engine starts by sub-sampling each frame into windows and further sub-sampling each window into 2×2 blocks. For each sub-sampled 2×2 block, each pixel is accumulated. Each pixel is also compared to a limit set in a register: if any pixel in the 2×2 block is greater than or equal to the limit, the block is not counted by the unsaturated block counter. All pixels greater than the limit are replaced by the limit, and the resulting pixel value is accumulated.
The sub-sampler module takes its settings from registers: the starting position of the windows is set by WINSH (horizontal) and WINSV (vertical); the width of each window is set by WINW and the height by WINH; the number of windows in the horizontal direction is set by WINHC, while WINVC sets the number of windows in the vertical direction.
Each window is further sampled down to a set of 2×2 blocks. The horizontal distance between the start of blocks is set by AEWINCH. The vertical distance between the start of blocks is set by AEWINCV.
The saturation check module takes the data from the sub-sampler and compares it to the value in the limit register. A pixel value greater than the limit is replaced with the value in the limit register. If all 4 pixels in the 2×2 block are below the limit, the unsaturated block counter is incremented; there is one unsaturated block counter per window.
The data output from the saturation check module and from the sub-sampler module are each accumulated for each pixel position, for a total of 8 accumulators per window.
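A minimal C sketch of one 2×2 block update (the 4 raw + 4 clipped accumulator split is one reading of "8 accumulators per window"; names are illustrative):

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint32_t sum_raw[4];   /* sub-sampler output accumulators */
    uint32_t sum_clip[4];  /* saturation-check output accumulators */
    uint32_t unsat_blocks; /* one unsaturated block counter per window */
} aewb_window;

/* Pixels at or above the limit are replaced by the limit before
 * accumulation; the unsaturated-block counter increments only when
 * all four pixels are below the limit. */
static void aewb_block(aewb_window *w, const uint16_t px[4], uint16_t limit)
{
    bool unsat = true;
    for (int i = 0; i < 4; i++) {
        uint16_t v = px[i];
        if (v >= limit) {
            v = limit;
            unsat = false;
        }
        w->sum_raw[i]  += px[i];
        w->sum_clip[i] += v;
    }
    if (unsat)
        w->unsat_blocks++;
}
```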
In addition to the 128 vertical paxels/windows, the AE/AWB module provides support for an additional vertical row of paxels/windows for black data. The black row of paxels/windows can either be before or after the 128 regular vertical paxels/windows. The vertical start setting for the black row of paxels is specified by a separate register setting. Furthermore, the height of the black row of paxels is specified separately from the regular vertical rows of paxels/windows.
The VBUSP DMA interface module is responsible for taking the data from the AF engine and the AE/AWB engine and building packets to be sent out to the SDRAM/DDRAM. The data interface has separate start and end pointers for both the AF and AE/AWB engines. It continuously loops through this data as it builds the packets.
The histogram module accepts raw image/video data from the CCDC, performs a color-separate gain (white/channel balance) on each pixel, and bins the pixels according to amplitude, color, and region, all specified via register settings. It can support either 3 or 4 colors and up to 4 regions simultaneously.
The histogram function supports several configurations of colors, regions, and bins.
The histogram RAM is 1024×20-bit in size, so the user can attempt to select conditions that require more memory (for example, 4 active regions and 128 bins per color). The manual shall call these out as illegal conditions, but the hardware shall not fail if the user applies these illegal settings: the hardware shall limit the number of bins so that the RAM capacity is not exceeded.
The histogram RAM is 20 bits wide. If incrementing a histogram bin would cause the value to exceed what a RAM word can hold, the value shall be saturated to the maximum value a RAM word can hold, which is 2^20 - 1.
The input data width is 10 bits (bits 9..0) and the data to be histogrammed is 8 bits wide. Therefore, if the input value is larger than the highest bin location, the result shall be clipped to the highest bin location. This allows data above the bin range to be included in the uppermost bin.
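A minimal C sketch of one histogram update, combining the gain, the clip to the top bin, and the 20-bit saturation (the gain fixed-point format and bin-index shift are assumptions):

```c
#include <stdint.h>

#define HIST_MAX ((1u << 20) - 1)   /* 20-bit RAM word saturates here */

/* Apply the per-color white/channel balance gain, reduce 10-bit data
 * to the 8-bit bin index, clip out-of-range values to the top bin, and
 * saturate the 20-bit bin count. */
static void hist_update(uint32_t *bins, int nbins,
                        uint16_t pix10, uint16_t gain /* assumed U8Q8 */)
{
    uint32_t v   = ((uint32_t)pix10 * gain) >> 8;   /* balance gain */
    uint32_t idx = v >> 2;                          /* 10-bit -> 8-bit bins */
    if (idx >= (uint32_t)nbins)
        idx = (uint32_t)nbins - 1;  /* above-range data lands in top bin */
    if (bins[idx] < HIST_MAX)
        bins[idx]++;                /* saturate at 2^20 - 1 */
}
```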
Starting addresses of the regions for the various numbers-of-bins configurations are shown in the next table:

Offsets of the colors within each region of the RAM are shown in the next table:
The preferred embodiments can be modified in various ways while retaining one or more of the features of video processing front-end modules connected for data transfers under autonomous operations.
For example, the vertical auto focus and the horizontal auto focus could be put into a common processing module (either part of h3A or a separate module); the various parameters such as bus widths, filter coefficients, et cetera could be varied; processing modules for additional image pipeline functions could be added, such as the white balance, lens shading compensation, lens distortion compensation, adaptive fault pixel correction (the hardware does not require a calibrated/capture fault list), and video stabilization.
This application is a divisional of non-provisional application Ser. No. 11/219,925 filed Sep. 6, 2005, which claims priority from provisional application Nos. 60/606,944 and 60/607,380, both filed Sep. 3, 2004, which are all herein incorporated by reference.
Provisional applications:

| Number | Date | Country |
|---|---|---|
| 60/606,944 | Sep. 3, 2004 | US |
| 60/607,380 | Sep. 3, 2004 | US |

Related U.S. application data:

| Relation | Number | Date | Country |
|---|---|---|---|
| Parent | 11/219,925 | Sep. 6, 2005 | US |
| Child | 12/689,071 | | US |