A portion of the material in this patent document is subject to copyright protection under the copyright laws of the United States and of other countries. The owner of the copyright rights has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the United States Patent and Trademark Office publicly available file or records, but otherwise reserves all copyright rights whatsoever. The copyright owner does not hereby waive any of its rights to have this patent document maintained in secrecy, including without limitation its rights pursuant to 37 C.F.R. §1.14.
1. Technological Field
This technology pertains generally to image coding, and more particularly to an apparatus and method for utilizing intra and inter-plane prediction during RGB image coding.
2. Technological Background
The RGB color model is an additive color model in which red (R), green (G) and blue (B) are added in various amounts to reproduce a desired color. The RGB model is extensively utilized in various electronic image systems, in particular televisions and computers, although it is also utilized in some instances for digital photography. In the RGB model, zero intensity for each color component yields black, while full intensity for each yields white, although the quality of that white, in relation to true white, depends on the nature of the primary RGB light sources. Each of the RGB colors is quantized to a desired color depth which is expressed as a number of bits, such as from 1 to 24 bits or more, although most typically 8 to 24 bits, depending on the application and its attendant need for color accuracy.
Conventional RGB encoding utilizes a combination of pulse code modulation (PCM) and differential pulse code modulation (DPCM) when encoding an RGB image block. In some encoding systems, predictions are performed based on intra-plane correlations, toward increasing coding efficiency. However, it would be beneficial to enhance RGB color encoding to further increase coding efficiency.
Accordingly, the present technology provides additional RGB coding benefits and overcomes shortcomings of previous approaches.
RGB image encoding is enhanced by the disclosed technology by exploiting inter-plane correlations to enhance the performance of intra-plane prediction, and thus to increase coding efficiency. RGB color images have red (Rn), blue (Bn), and green (Gn) pixels associated with each pixel group number n. Quantization is performed on the pixel data for multiple quantization level (qn) values. Two prediction levels are performed prior to entropy coding. In a first level prediction, intra-plane prediction is performed on each of the colors R, G and B to generate residuals, ΔRn, ΔGn, and ΔBn. Intra-prediction involves neighboring pixels in the same color plane, that is of the same color. Results from the first level prediction are utilized to perform a second level prediction on one or more of the colors, in which inter-plane correlation is taken into account. Inter-prediction involves neighboring pixels in different color planes, that is of different colors. In a preferred embodiment, this second prediction level is performed on the red and blue colors, such as ΔΔRn=ΔRn−ΔGn, and ΔΔBn=ΔBn−ΔGn. Entropy coding is performed on the quantized residuals for each color, preferably the residuals for red and blue from level 2 prediction and the residual for green from level 1 prediction, to generate an entropy coded RGB bitstream.
In at least one embodiment, multiple encoding modes are generated and one is selected for encoding the RGB image block. In these modes, colors from the RGB image are coded with pulse code modulation (PCM) or differential pulse code modulation (DPCM) at a given quantization level (qn). In one embodiment a first mode codes all colors in PCM, a second mode codes colors in a mixture of PCM and DPCM, while a third mode codes all colors in DPCM. In the exemplified case, mode 2 codes green (G) in PCM with red (R) and blue (B) coded in DPCM. The qn selection and mode decision are preferably performed to optimize coding, such as based on a metric such as bit-coverage, although other metrics can be utilized.
The coding is preferably performed by encoder programming executing on a computer processor and associated memory, although at least portions are amenable to hardware acceleration.
Further aspects of the technology will be brought out in the following portions of the specification, wherein the detailed description is for the purpose of fully disclosing preferred embodiments of the disclosed technology without placing limitations thereon.
The disclosed technology will be more fully understood by reference to the following drawings which are for illustrative purposes only:
Motivation for the present technology includes a realization that inter-plane correlation exists within RGB images as well as intra-plane correlation. That is to say, not only do correlations between adjacent pixels of the same color (intra-plane) have significance, but correlation also exists with other colors (inter-plane) within the RGB image. Additional inter-plane predictions are performed, according to the present technology, to enhance RGB image encoding toward compressing the data more efficiently.
Intra-plane correlation, which is a correlation between neighboring pixels of the same color, exploits spatial correlation. In this correlation, it will be appreciated that neighboring pixel samples with the same color tend to have more similar values compared to distant samples. For example, it is more likely that |R1−R2|<|R1−R4|.
Inter-plane correlation, which is a correlation between neighboring pixels across different colors, exploits inter-color correlation. This inter-plane correlation relies on the observation that spatial value changes and/or fluctuations tend to be similar for neighboring R, G, and B pixels. For example, if R1<R2, it is likely that a similar change also occurs for the other colors, such as B1<B2, and G1<G2.
The following describes the double level prediction process being performed for a specific pixel group (group n), with processing steps as follows.
A first level of prediction is performed for each pixel group of every color, by computing a spatial prediction residual (Δ) as follows:
ΔRn=Rn−SP(Rn)
ΔGn=Gn−SP(Gn)
ΔBn=Bn−SP(Bn)
where SP(Xn) means “spatial predictor” of color X in pixel group n. Examples of SP include SP(X6)=X5, SP(X6)=X2, SP(X6)=X5+X2−X1, and so forth. It will be appreciated that Xn needs to be predicted from a neighboring pixel Xm, that may be found in any desired direction (e.g., above, left, right, and below) when considering a 2D block, or left or right in a 1D block. However, prediction is preferably performed in relation to neighboring pixels that have already been coded. So for example, in a 1D block (e.g., 8×1), there may be only one preferred choice of a left-side neighbor (e.g., Xn−1), which has already been coded and can be utilized.
A second level of prediction is performed for each pixel group on one or more of the colors, by determining a prediction residual which incorporates inter-plane correlations. In the following, ΔRn and ΔBn are again predicted, this time from the neighboring ΔGn:
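The example predictors listed above can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: it assumes a 4×2 block stored in raster order, so that sample 6 has sample 5 to its left, sample 2 above it, and sample 1 above-left. The function names, the WIDTH constant, and the sample values are all hypothetical.

```python
# Illustrative sketch only; the 4-wide raster layout and all names are assumptions.
WIDTH = 4  # assumed block width, so X5 is left of X6 and X2 is above X6


def sp_left(block, n):
    """SP(Xn) = X(n-1): predict from the left neighbor (n is 1-based)."""
    return block[n - 2]


def sp_above(block, n):
    """SP(Xn) = X(n-WIDTH): predict from the neighbor directly above."""
    return block[n - 1 - WIDTH]


def sp_gradient(block, n):
    """SP(Xn) = left + above - above_left, e.g. SP(X6) = X5 + X2 - X1."""
    return block[n - 2] + block[n - 1 - WIDTH] - block[n - 2 - WIDTH]


# One color plane of the assumed 4x2 block, samples X1..X8 in raster order.
plane = [10, 12, 15, 15, 11, 14, 16, 17]
residual_left = plane[5] - sp_left(plane, 6)          # X6 - X5
residual_above = plane[5] - sp_above(plane, 6)        # X6 - X2
residual_gradient = plane[5] - sp_gradient(plane, 6)  # X6 - (X5 + X2 - X1)
```

Whichever predictor is used, it should draw only on samples the decoder has already reconstructed, which is why a 1D block typically leaves only the left-neighbor form.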
ΔΔRn=ΔRn−ΔGn
ΔΔBn=ΔBn−ΔGn
It should be noted that G was chosen as a predictor color for inter-color prediction, although R or B can be alternatively chosen without departing from the teachings of the present technology.
Finally, the residuals ΔΔRn, ΔΔBn, ΔGn are entropy coded to generate a coded output bitstream.
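The double level prediction described above can be sketched for a 1D block as follows. This is a minimal sketch under stated assumptions: the left neighbor is used as the spatial predictor, the first sample of each color is kept as-is (it would be PCM coded), and entropy coding is omitted. The function names and sample values are hypothetical.

```python
# Illustrative sketch; function names and the left-neighbor predictor are assumptions.

def level1_residuals(samples):
    """Intra-plane prediction in a 1D block: delta[n] = X[n] - X[n-1].

    The first sample has no coded left neighbor, so it is not part of
    the returned residuals.
    """
    return [samples[i] - samples[i - 1] for i in range(1, len(samples))]


def double_level_prediction(r, g, b):
    """Return (ddR, dG, ddB), the residuals that would be entropy coded."""
    d_r, d_g, d_b = (level1_residuals(p) for p in (r, g, b))
    dd_r = [x - y for x, y in zip(d_r, d_g)]  # ddR = dR - dG
    dd_b = [x - y for x, y in zip(d_b, d_g)]  # ddB = dB - dG
    return dd_r, d_g, dd_b


def reconstruct(first, dd_r, d_g, dd_b):
    """Invert both prediction levels given the first (R, G, B) samples."""
    r, g, b = [first[0]], [first[1]], [first[2]]
    for ddr, dg, ddb in zip(dd_r, d_g, dd_b):
        g.append(g[-1] + dg)
        r.append(r[-1] + ddr + dg)  # dR = ddR + dG
        b.append(b[-1] + ddb + dg)  # dB = ddB + dG
    return r, g, b


r = [100, 104, 109, 107]
g = [120, 125, 129, 128]
b = [90, 93, 99, 96]
dd_r, d_g, dd_b = double_level_prediction(r, g, b)
```

In this made-up block the three colors rise and fall together, so the second level residuals for R and B are smaller in magnitude than the first level residuals, which is the effect the inter-plane step is intended to exploit.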
For each qn value, from 0 through depth-1, a quantization or bit-truncation is performed in which R, G, B for the qn value results in generating Rq, Gq, and Bq. In this example, quantization produces a first output 38 with Gq, and a second output 40 with Rq and Bq. There is a single qn for all colors which means if qn=3 is chosen, we truncate 3 LSBs of each color; whereby G is not quantized any differently than R and B. It should be appreciated that the colors can be grouped in alternative ways without departing from the technology (e.g., grouping GqRq, or Gq, Bq, instead of Rq, Bq as exemplified). The first sample of each color (R, G, B) is PCM coded 56, producing output 66 for mode 3. It should be appreciated that when PCM mode is chosen, coding is performed in an MSB to LSB order as long as the bit-budget allows. For example, for 50% compression of an 8-bit RGB case, PCM mode codes 4 MSBs of each R, G, B.
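The bit-truncation described above can be sketched as a right shift. This is an illustrative sketch only, assuming 8-bit samples; the function names and sample values are hypothetical.

```python
# Illustrative sketch; names are hypothetical and 8-bit samples are assumed.

def quantize(samples, qn):
    """Truncate the qn least significant bits of each sample."""
    return [s >> qn for s in samples]


def pcm_msbs(sample, depth, bits_kept):
    """Keep only the bits_kept most significant bits of a depth-bit sample."""
    return sample >> (depth - bits_kept)


r, g, b = [200, 203], [97, 101], [14, 9]
qn = 3  # a single qn shared by all three colors
rq, gq, bq = (quantize(p, qn) for p in (r, g, b))

# 50% compression of 8-bit data in PCM mode: keep the 4 MSBs of a sample.
r_pcm = pcm_msbs(r[0], depth=8, bits_kept=4)
```

Because one qn applies to all colors, the same shift is used for R, G and B in this sketch.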
PCM coding is performed 54 on one of the colors, exemplified as Gq, as received at sum junction 58 for use in mode 2. Residuals are generated as previously described for DPCM coding. In particular, a predictor is shown generated 42 for G (SP(Gn)), with residual ΔGq produced from minus sum junction 46 (difference) of Gq 38 and SP(Gn), which is entropy coded 50, for connection to sum junction 60 for use with mode 1. The predictor of Gq=SP(Gq).
Similarly, predictors for R and B are determined 44 which include inter-plane prediction by receiving ΔGq from minus sum junction 46 (difference). The predictor of Rq=SP(Rq)+Gq−SP(Gq), with the predictor of Bq=SP(Bq)+Gq−SP(Gq). Residuals ΔΔRq and ΔΔBq are output from minus sum (difference) junction 48 to be entropy coded 52 for receipt at sum junctions 58 and 60 for use in mode 1 and mode 2 encoding.
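The predictor form given above yields the same residual as the double level prediction defined earlier, which can be checked by substitution. The following worked step assumes that ΔRq and ΔGq denote the first level residuals Rq−SP(Rq) and Gq−SP(Gq):

```latex
\begin{aligned}
\Delta\Delta R_q &= R_q - \bigl(SP(R_q) + G_q - SP(G_q)\bigr) \\
                 &= \bigl(R_q - SP(R_q)\bigr) - \bigl(G_q - SP(G_q)\bigr) \\
                 &= \Delta R_q - \Delta G_q
\end{aligned}
```

The same substitution with B in place of R gives ΔΔBq=ΔBq−ΔGq.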
Thus, it is seen that mode 1 encoding comprises DPCM encoding of Rq, Gq, and Bq. In particular, ΔGq from block 50 is summed with ΔΔRq and ΔΔBq from block 52 at sum junction 60 to produce the mode 1 output 64. The output in mode 1 is entropy coding (DPCM) of ΔGq, ΔΔRq, ΔΔBq and the refinement bits available because each of these is DPCM coded.
It is also seen that mode 2 encoding comprises PCM coding of one color, herein exemplified as Gq, with the other colors DPCM coded. In particular, Gq from block 54 is summed 58 with ΔΔRq and ΔΔBq from block 52 to produce mode 2 output 62. The output in mode 2 is PCM coding of Gq, entropy coding of ΔΔRq, ΔΔBq, and the refinement bits available because Rq and Bq are DPCM coded.
It was already discussed that mode 3 provides for PCM coding of all colors, such as utilized when coding the first samples of each color. In mode 3, the output is merely the PCM coding of R, G, and B, which is independent of qn value, and for which no refinement bits are available.
During the encoding process decisions are made as to the optimum qn value to utilize for quantizing the blocks, and for selecting the best mode. By way of example, these decisions are made on the basis of bit-coverage, although other metrics can be utilized without departing from the teachings of the present technology.
It should be appreciated that the disclosed apparatus and method can be utilized for both “non-random access” and “random access” conditions. In “random access” conditions, when encoding a given block, the encoder and decoder do not need access to other blocks. This means that the decoder can still decode any given block without having to know information of the other blocks. In contrast to this, under “non-random access” conditions, the ability is needed to access other blocks, such as neighboring blocks, as the predictor is computed based on pixels of the other blocks. This means that, in non-random access conditions, the decoder can only decode a given block if it can access those other blocks.
It can be seen for mode 1, at the top of the figure, that the first bit (at the right) indicates the bitstream is not all PCM coded. The second bit indicates if one of the colors (e.g., G in this case) is PCM coded. For mode 1, in the top bitstream diagram, G is not PCM coded, whereby the second bit is set equal to 0. It should be appreciated that the bit states described above may be selected as desired for indicating conditions, so for example the second bits for mode 1 and mode 2 can be reversed using a “0” for PCM mode 3, while using a “1” to indicate that G is PCM coded, or vice-versa.
In the example of mode 2, seen as the middle bitstream of the figure, G is PCM coded and this second bit is exemplified as equal to 1. There is no need for the all-zero signaling bit for G, but one is seen for the combination of R and B. Data is then seen for PCM coding of G up to qn, then followed by data for PCM encoding of R and B first samples, VLCs for R and B, and the refinement bits. It should be appreciated that the bitstream syntax figure shows the same length of signaling bits for mode 1 and mode 2, although they are usually different. It is not necessary, however, that mode 1 has more refinement bits than other modes, as it depends on qn and many other factors. The St (stuffed bits) length can also be different.
It will be appreciated that the decoder side is performed in the reverse order of the encoder, using inverse functions. This process is straightforward, with inverse entropy coding performed to reconstruct residuals, followed by adding appropriate predictors to the residuals. Then a bit shift by qn is performed followed by reconstructing refinement bits. Finally, middle point reconstruction of the uncoded bit-planes is performed.
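The final decoder steps named above (bit shift by qn, refinement bits, middle point reconstruction) can be sketched as follows. This is a sketch under assumptions that the text does not state: refinement bits are taken to be the most significant of the truncated bits, and the middle point of the remaining uncoded bit-planes is taken as a single 1 bit in the highest uncoded position. The function name and parameters are hypothetical.

```python
# Illustrative sketch; bit layout and midpoint convention are assumptions.

def reconstruct_sample(xq, qn, refinement=0, n_refined=0):
    """Rebuild a sample from its quantized value.

    xq          quantized value (predictor already added to the residual)
    qn          number of LSBs truncated at the encoder
    refinement  value of the n_refined refinement bits, assumed to be
                the most significant of the truncated bits
    """
    uncoded = qn - n_refined                 # bit-planes never transmitted
    value = (xq << qn) | (refinement << uncoded)
    if uncoded > 0:
        value |= 1 << (uncoded - 1)          # midpoint of the uncoded range
    return value


# Original sample 203 (0b11001011) quantized with qn = 3 gives xq = 25.
no_refinement = reconstruct_sample(25, 3)
one_refinement_bit = reconstruct_sample(25, 3, refinement=0, n_refined=1)
fully_refined = reconstruct_sample(25, 3, refinement=0b011, n_refined=3)
```

Under these assumptions, each additional refinement bit narrows the range covered by the midpoint guess, so the reconstruction error shrinks as more refinement bits fit within the bit budget.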
The present technology can be incorporated within an encoder and decoder, such as for integration with various devices configured for RGB color image capture (e.g., digital cameras, camcorders, and scanners), or in response to receiving color image information from an image capture device.
Embodiments of the present technology may be described with reference to flowchart illustrations of methods and systems according to embodiments of the technology, and/or algorithms, formulae, or other computational depictions, which may also be implemented as computer program products. In this regard, each block or step of a flowchart, and combinations of blocks (and/or steps) in a flowchart, algorithm, formula, or computational depiction can be implemented by various means, such as hardware, firmware, and/or software including one or more computer program instructions embodied in computer-readable program code logic. As will be appreciated, any such computer program instructions may be loaded onto a computer, including without limitation a general purpose computer or special purpose computer, or other programmable processing apparatus to produce a machine, such that the computer program instructions which execute on the computer or other programmable processing apparatus create means for implementing the functions specified in the block(s) of the flowchart(s).
Accordingly, blocks of the flowcharts, algorithms, formulae, or computational depictions support combinations of means for performing the specified functions, combinations of steps for performing the specified functions, and computer program instructions, such as embodied in computer-readable program code logic means, for performing the specified functions. It will also be understood that each block of the flowchart illustrations, algorithms, formulae, or computational depictions and combinations thereof described herein, can be implemented by special purpose hardware-based computer systems which perform the specified functions or steps, or combinations of special purpose hardware and computer-readable program code logic means.
Furthermore, these computer program instructions, such as embodied in computer-readable program code logic, may also be stored in a computer-readable memory that can direct a computer or other programmable processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the block(s) of the flowchart(s). The computer program instructions may also be loaded onto a computer or other programmable processing apparatus to cause a series of operational steps to be performed on the computer or other programmable processing apparatus to produce a computer-implemented process such that the instructions which execute on the computer or other programmable processing apparatus provide steps for implementing the functions specified in the block(s) of the flowchart(s), algorithm(s), formula(e), or computational depiction(s).
From the description herein, it will be appreciated that the present disclosure encompasses multiple embodiments which include, but are not limited to, the following:
1. An apparatus for RGB image encoding, comprising: (a) a computer processor configured for receiving color pixel data of an input RGB image containing colors red (Rn), blue (Bn), and green (Gn) pixels associated with each pixel group number n; and (b) programming in a non-transitory computer readable medium and executable on the computer processor for performing steps comprising: (b)(i) performing quantization for multiple quantization level (qn) values; (b)(ii) performing intra-plane prediction on each of the colors R, B and G to generate residuals, ΔRn, ΔGn, and ΔBn, from this first level of prediction in which prediction is performed for a given color based on neighboring pixels of a same color; (b)(iii) performing inter-plane prediction on one or more of the colors to generate residuals from a second level of prediction in which prediction of a given color is based on neighboring pixels of different colors, toward augmenting intra-plane prediction; and (b)(iv) entropy coding the quantized residual to generate an entropy coded RGB bitstream.
2. The apparatus of any preceding embodiment, wherein said programming executable on a non-transitory computer readable medium further comprises performing intra-plane prediction on each of the colors R, G and B to generate residuals, ΔRn, ΔGn, and ΔBn in response to ΔRn=Rn−SP(Rn), ΔGn=Gn−SP(Gn), and ΔBn=Bn−SP(Bn); wherein SP describes performing an intra-color spatial predictor (SP).
3. The apparatus of any preceding embodiment, wherein said one or more colors on which inter-plane prediction is performed in said second level of prediction comprises the red (R) and blue (B) colors.
4. The apparatus of any preceding embodiment, wherein said residuals from said second level of prediction comprise ΔΔRn and ΔΔBn determined in response to ΔΔRn=ΔRn−ΔGn, and ΔΔBn=ΔBn−ΔGn.
5. The apparatus of any preceding embodiment, wherein said programming executable on a non-transitory computer readable medium further comprises generating multiple encoding modes in which different colors from the input RGB image are differently coded with pulse code modulation (PCM), or differential pulse code modulation (DPCM) at a given quantization level (qn).
6. The apparatus of any preceding embodiment, wherein said programming executable on a non-transitory computer readable medium further comprises generating multiple encoding modes having one mode in which all colors are PCM coded, another mode in which all colors are DPCM coded and one or more intermediate modes in which a combination of PCM and DPCM coding is performed.
7. The apparatus of any preceding embodiment, wherein said programming executable on a non-transitory computer readable medium further comprises generating multiple encoding modes having one mode in which all colors are PCM coded, another mode in which all colors are DPCM coded and one mode in which green (G) is PCM coded while red (R) and blue (B) are DPCM coded.
8. The apparatus of any preceding embodiment, further comprising a decoding process of inverse entropy coding performed to reconstruct residuals, followed by adding appropriate predictors to the residuals, after which a bit shift by qn is performed followed by reconstructing refinement bits, and then middle point reconstruction of the uncoded bit-planes.
9. The apparatus of any preceding embodiment, wherein said apparatus can be utilized for coding in both random access and non-random access conditions.
10. An apparatus for RGB image encoding, comprising: (a) a computer processor configured for receiving color pixel data of an input RGB image containing colors red (Rn), blue (Bn), and green (Gn) pixels associated with each pixel group number n; and (b) programming in a non-transitory computer readable medium and executable on the computer processor for performing steps comprising: (b) (i) performing quantization for multiple quantization level (qn) values; (b) (ii) performing intra-plane prediction on each of the colors R, B and G to generate residuals, ΔRn, ΔGn, and ΔBn, from this first level of prediction in which prediction is performed for a given color based on neighboring pixels of a same color; (b) (iii) performing inter-plane prediction on one or more of the colors to generate residuals from a second level of prediction in which prediction of a given color is based on neighboring pixels of different colors, toward augmenting intra-plane prediction; wherein residuals are generated by performing intra-plane prediction on each of the colors R, G and B to generate residuals, ΔRn, ΔGn, and ΔBn in response to ΔRn=Rn−SP(Rn), ΔGn=Gn−SP(Gn), and ΔBn=Bn−SP(Bn), with SP describing an intra-color spatial predictor (SP); (b) (iv) entropy coding the quantized residual to generate an entropy coded RGB bitstream.
11. The apparatus of any preceding embodiment, wherein said one or more colors on which inter-plane prediction is performed in said second level of prediction comprises the red (R) and blue (B) colors.
12. The apparatus of any of the preceding embodiments, wherein said residuals from said second level of prediction comprise ΔΔRn and ΔΔBn determined in response to ΔΔRn=ΔRn−ΔGn, and ΔΔBn=ΔBn−ΔGn.
13. The apparatus of any preceding embodiment, wherein said programming executable on a non-transitory computer readable medium further comprises generating multiple encoding modes in which different colors from the input RGB image are differently coded with pulse code modulation (PCM), or differential pulse code modulation (DPCM) at a given quantization level (qn).
14. The apparatus of any preceding embodiment, wherein said programming executable on a non-transitory computer readable medium further comprises generating multiple encoding modes having one mode in which all colors are PCM coded, another mode in which all colors are DPCM coded and one or more intermediate modes in which a combination of PCM and DPCM coding is performed.
15. The apparatus of any preceding embodiment, wherein said programming executable on a non-transitory computer readable medium further comprises generating multiple encoding modes having one mode in which all colors are PCM coded, another mode in which all colors are DPCM coded and one mode in which green (G) is PCM coded while red (R) and blue (B) are DPCM coded.
16. The apparatus of any preceding embodiment, further comprising inverse entropy coding performed to reconstruct residuals, followed by adding appropriate predictors to the residuals, after which a bit shift by qn is performed followed by reconstructing refinement bits, and then middle point reconstruction of the uncoded bit-planes.
17. The apparatus of any preceding embodiment, wherein said apparatus can be utilized for coding in both random access and non-random access conditions.
18. A method for RGB image encoding, comprising: (a) receiving color pixel data of an input RGB image containing colors red (Rn), blue (Bn), and green (Gn) pixels associated with each pixel group number n within a processing device configured for image coding; and (b) performing quantization for multiple quantization level (qn) values; (c) performing intra-plane prediction on each of the colors R, B and G to generate residuals, ΔRn, ΔGn, and ΔBn, from this first level of prediction in which prediction is performed for a given color based on neighboring pixels of a same color; (d) performing inter-plane prediction on one or more of the colors to generate residuals from a second level of prediction in which prediction of a given color is based on neighboring pixels of different colors, toward augmenting intra-plane prediction; and (e) entropy coding the quantized residual to generate an entropy coded RGB bitstream.
19. The method of any of the preceding embodiments, further comprising performing intra-plane prediction on each of the colors R, G and B to generate residuals, ΔRn, ΔGn, and ΔBn in response to ΔRn=Rn−SP(Rn), ΔGn=Gn−SP(Gn), and ΔBn=Bn−SP(Bn); wherein SP describes performing an intra-color spatial predictor (SP).
20. The method of any preceding embodiment, wherein said residuals from said second level of prediction comprise ΔΔRn and ΔΔBn determined in response to ΔΔRn=ΔRn−ΔGn, and ΔΔBn=ΔBn−ΔGn.
Although the description herein contains many details, these should not be construed as limiting the scope of the disclosure but as merely providing illustrations of some of the presently preferred embodiments. Therefore, it will be appreciated that the scope of the disclosure fully encompasses other embodiments which may become obvious to those skilled in the art.
In the claims, reference to an element in the singular is not intended to mean “one and only one” unless explicitly so stated, but rather “one or more.” All structural, chemical, and functional equivalents to the elements of the disclosed embodiments that are known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the present claims. Furthermore, no element, component, or method step in the present disclosure is intended to be dedicated to the public regardless of whether the element, component, or method step is explicitly recited in the claims. No claim element herein is to be construed as a “means plus function” element unless the element is expressly recited using the phrase “means for”. No claim element herein is to be construed as a “step plus function” element unless the element is expressly recited using the phrase “step for”.
This application claims priority to, and the benefit of, U.S. provisional patent application Ser. No. 61/925,972 filed on Jan. 10, 2014, incorporated herein by reference in its entirety.
Number | Date | Country
---|---|---
61925972 | Jan 2014 | US