This disclosure is related to color video and/or image coding.
Color images and/or videos are usually visually appealing and have been found at times to convey more information than gray scale images or video. Therefore, efficient color image and/or video compression schemes are desirable. One issue in coding video, for example, including color video, is motion estimation. Motion estimation is typically computationally intensive and may also affect the amount of compression achieved.
The subject matter is particularly pointed out and distinctly claimed in the concluding portion of the specification. The claimed subject matter, however, both as to organization and method of operation, together with objects, features, and advantages thereof, may best be understood by reference to the following detailed description when read with the accompanying drawings in which:
In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the claimed subject matter. However, it will be understood by those skilled in the art that the claimed subject matter may be practiced without these specific details. In other instances, well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the claimed subject matter.
In one embodiment, an efficient color coding scheme for color video using three dimensional (3-D) discrete wavelet transforms (DWT) employing correlation among different color components is described. Of course, the claimed subject matter is not limited in scope to employing a DWT. Other wavelet transforms may be employed within the scope of the claimed subject matter. However, experimental results for such an embodiment are also provided. Furthermore, to produce experimental results, several aspects of the coding were specified; however, this is merely an example embodiment and the claimed subject matter is not limited in scope to these specified coding aspects or to this particular embodiment.
One aspect of the following embodiment is the notion that a sequence of still images or video may be treated as 3-D data or as a 3-D image. One advantage of this particular embodiment is that motion estimation is not employed to perform coding. It shall be noted that, for this embodiment, this produced levels of compression that typically are not available with conventional approaches. Furthermore, this embodiment is typically less computationally intensive than such conventional approaches.
Since color models (RGB, YUV, YCbCr, etc.) are usually inter-convertible and color images are frequently provided in red-green-blue (RGB) format, this embodiment employs the RGB format. However, the claimed subject matter is not limited in scope to this particular format. Likewise, here, the green component may be taken as a reference component for the red and blue components. This may be at least in part because the human eye is more sensitive to luminance and much of the luminance information may be contained in the green component. However, again, this is merely an example embodiment.
One embodiment of a method of color coding a sequence of frames may include the following. A three-dimensional wavelet transform may be applied to the color components of the sequence of frames or color video. At least one of the color components may be coded. Additionally, an adjustment to at least some remaining color components may be coded relative to the at least one coded color component. Likewise, an embodiment of a method of decoding and reconstructing a sequence of coded color frames includes the following. At least a color component that has been coded is decoded. Adjustments to the at least one decoded color component are decoded to produce remaining color components. An inverse 3-D wavelet transform is applied to the decoded color components to reconstruct the sequence of color frames.
Although the claimed subject matter is not limited in scope in this respect, one approach to applying a DWT in three dimensions is described in the aforementioned “Method and Apparatus for Three-Dimensional Wavelet Transform,” U.S. patent application Ser. No. 09/867,784.
This particular embodiment, here applied to color video, includes the following.
In this particular embodiment, a multi-resolution wavelet representation, such as for a DWT, may be employed to provide a simple hierarchical framework for interpreting an image. At different resolutions, the details of an image generally characterize different physical structures of a scene. A coarse to fine approach may be employed to assist in the coding of the transformed image, as well as to result in effective compression. When the approach is applied to a video sequence, it may be modified from a 2-D to a 3-D transform.
The procedure followed to perform the 3-D wavelet transform may be explained using the diagram shown in
The input video sequence, here designated V, may be treated as a 3-D block with the different frames arranged substantially according to time position. This sequence, in this particular embodiment may be fed to two paths, designated P1 and P2. Along one path, here P1, filtering along the time axis may be applied, in this embodiment with filter function g(n). The filtered data, again, in this particular embodiment may be sub-sampled, here by 2. Thus, in this embodiment, alternative frames of the block may be retained. The frames from this reduced block may be again fed into two paths, here P3 and P4, as illustrated in
Along one of the paths or sub-paths, such as here P3, filtering may be applied along the rows, again with filter function g(n). The filtered data, again, in this particular embodiment may be sub-sampled, here by 2. Here, alternative columns of the matrix or frame may be retained. This reduced matrix may be fed into two paths, P5 and P6 as illustrated in
Along direction P5, here, filtering may be applied along the columns with filter function g(n). The filtered data may be sub-sampled by 2. Alternative rows of the matrix may be retained. This may produce a detail signal, D1.
Along the other direction, here P6, filtering may be applied along the columns with filter function h(n), in this particular embodiment. The filtered data may be sub-sampled by 2, again, for this particular embodiment. Alternative rows of the matrix may be retained. This may produce a detail signal, D2.
In the other sub-path, here P4, filtering may be applied along the rows with filter function h(n). The filtered data may be sub-sampled by 2. Alternative columns of the matrix may be retained. This reduced matrix may be again split into two paths, P7 and P8 in
In one direction P7, filtering may be applied along the columns, here with filter function g(n). The filtered data may be sub-sampled by 2. Here, alternative rows of the matrix may be retained. This may produce a detail signal, D3.
In the other direction P8, filtering may be applied along the columns, here with filter function h(n). The filtered data may be sub-sampled by 2. Alternative rows of the matrix may be retained. This may produce a detail signal, D4.
In the other path, here P2, filtering may be applied along the time axis, here with filter function h(n) in this embodiment. The filtered data may be sub-sampled by 2, in this embodiment. Alternative frames of the block may be retained. The frames from this reduced block may be again fed into two paths, P9 and P10 in
In one sub-path P9, filtering may be applied along the rows, with filter function g(n) in this embodiment. The filtered data may be sub-sampled by 2. Thus, alternative columns of the matrix or frame may be retained. This reduced matrix may be again fed into two paths, P11 and P12 in
In one direction, here P11, filtering may be applied along the columns, here with filter function g(n). The filtered data may be sub-sampled by 2. Thus, alternative rows of the matrix may be retained. This may produce a detail signal, D5.
In the other direction, here P12, filtering may be applied along the columns, here with filter function h(n). The filtered data may be sub-sampled by 2. Thus, alternative rows of the matrix may be retained. This may produce a detail signal, D6.
In the other sub-path P10, filtering may be applied along the rows, here using h(n). The filtered data may be sub-sampled by 2. Alternative columns of the matrix may be retained. This reduced matrix may again be split into two paths, P13 and P14 in this embodiment.
In one direction, here P13, filtering may be applied along the columns with filter function g(n). The filtered data may be sub-sampled, here by 2. Alternative rows may be retained. This may produce a detail signal, D7.
In the other direction P14, filtering may be applied along the columns with filter function h(n) in this embodiment. The filtered data may be sub-sampled by 2. Therefore, alternative rows of the matrix may be retained. This may produce a detail signal, V′.
Thus, seven detail subblocks may be extracted that provide the variations of the edge information, e.g., horizontal, vertical and diagonal, with time. The other, or eighth, subblock or component, in this embodiment, may be the applied video sequence at a lower resolution, due to low-pass filtering, such as by h(n) in this embodiment. Applying compression to these subblocks, such as described in more detail hereinafter, may therefore produce 3-D coding.
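For illustration only, one level of the decomposition just described may be sketched in a few lines of code. The filter functions here are simple Haar average/difference pairs used as stand-ins for h(n) and g(n), which the text does not specify, and the video block is represented as a nested list v[t][r][c]:

```python
# Illustrative sketch only: one level of the 3-D wavelet decomposition.
# h(n) and g(n) are ASSUMED to be Haar average/difference filters
# (stand-ins; the text does not specify the filter taps).

h = lambda a, b: (a + b) / 2.0   # low-pass stand-in for h(n)
g = lambda a, b: (a - b) / 2.0   # high-pass stand-in for g(n)

def split_time(v, f):
    # filter along the time axis, then retain alternate frames (sub-sample by 2)
    return [[[f(v[t][r][c], v[t + 1][r][c])
              for c in range(len(v[0][0]))]
             for r in range(len(v[0]))]
            for t in range(0, len(v) - 1, 2)]

def split_rows(v, f):
    # filter along the rows, then retain alternate columns
    return [[[f(fr[r][c], fr[r][c + 1])
              for c in range(0, len(fr[0]) - 1, 2)]
             for r in range(len(fr))]
            for fr in v]

def split_cols(v, f):
    # filter along the columns, then retain alternate rows
    return [[[f(fr[r][c], fr[r + 1][c])
              for c in range(len(fr[0]))]
             for r in range(0, len(fr) - 1, 2)]
            for fr in v]

def dwt3d_level(v):
    hi_t = split_time(v, g)                    # path P1
    lo_t = split_time(v, h)                    # path P2
    d1 = split_cols(split_rows(hi_t, g), g)    # P3 -> P5
    d2 = split_cols(split_rows(hi_t, g), h)    # P3 -> P6
    d3 = split_cols(split_rows(hi_t, h), g)    # P4 -> P7
    d4 = split_cols(split_rows(hi_t, h), h)    # P4 -> P8
    d5 = split_cols(split_rows(lo_t, g), g)    # P9 -> P11
    d6 = split_cols(split_rows(lo_t, g), h)    # P9 -> P12
    d7 = split_cols(split_rows(lo_t, h), g)    # P10 -> P13
    v_lo = split_cols(split_rows(lo_t, h), h)  # P10 -> P14: low-resolution V'
    return d1, d2, d3, d4, d5, d6, d7, v_lo
```

Each split halves the block along one axis, so one level turns a T×R×C block into eight (T/2)×(R/2)×(C/2) subblocks, the seven detail signals D1-D7 plus the low-resolution sequence V′.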
Similarly, an inverse 3D discrete wavelet transform or reconstruction approach may be explained using a diagram, such as shown in
This approach is described and illustrated with reference to
Detail signal D3 may be up-sampled. For example, a row of zeros may be inserted between rows. This sub-block may then be filtered along the columns with the filter function g(n). Detail signal D4 may be up-sampled. For example, a row of zeros may be inserted between rows. This sub-block may then be filtered along the columns with the filter function h(n). The resultant output signals from applying the foregoing processes to D3 and D4 may be added. The resultant sub-block may be up-sampled. For example, a column of zeros may be inserted between columns. This matrix may then be filtered along the rows with the filter function h(n). The resultant output signals here may be added with interim signals I1. The resultant sub-block may be up-sampled. For example, a frame of zeros may be inserted between frames. This matrix may then be filtered along the frames with the filter function g(n) to produce interim signals I2.
Detail signal D5 may be up-sampled. For example, a row of zeros may be inserted between adjacent rows. This sub-block may then be filtered along the columns with the filter function g(n). Detail signal D6 may be up-sampled. For example, a row of zeros may be inserted between adjacent rows. This sub-block may then be filtered along the columns with the filter function h(n). The resulting output signals from applying the foregoing processes to D5 and D6 may be added, as illustrated in
Detail signal D7 may be up-sampled. For example, a row of zeros may be inserted between rows. This sub-block may then be filtered along the columns with the filter function g(n). Detail signal V′ may be up-sampled. For example, a row of zeros may be inserted between rows. This sub-block may then be filtered along the columns with the filter function h(n). The resultant output signals may be added. The resultant sub-block may be up-sampled. For example, a column of zeros may be inserted between columns. This matrix may then be filtered along the rows with the filter function h(n). The resultant output signals may be added with interim signals I3. The resultant sub-block may be up-sampled. For example, a frame of zeros may be inserted between frames. This matrix may then be filtered along the frames with the filter function h(n). The resultant output signals may be added with interim signals I2. The resultant sub-block may be multiplied by 8 to get the sub-matrix to the next level of resolution.
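The per-axis reconstruction step used repeatedly above may be sketched in one dimension for illustration; in 3-D the same operation is applied along columns, rows, and frames in turn. Haar filters are again assumed as stand-ins for h(n) and g(n); note that because the analysis filters here carry a factor of 1/2 per axis, the amplitude that the text restores with the final multiply-by-8 is already folded into the filters in this sketch:

```python
# Illustrative 1-D sketch of the reconstruction step: each detail /
# approximation pair is up-sampled (zeros inserted between samples),
# filtered, and the two filtered signals are added.

def analyze(a):
    # forward step, for reference: filter, then sub-sample by 2
    lo = [(a[i] + a[i + 1]) / 2.0 for i in range(0, len(a) - 1, 2)]
    hi = [(a[i] - a[i + 1]) / 2.0 for i in range(0, len(a) - 1, 2)]
    return lo, hi

def upsample(seq):
    # insert a zero after each sample (a row/column/frame of zeros in 3-D)
    out = []
    for x in seq:
        out.extend([x, 0.0])
    return out

def fir(seq, taps):
    # causal FIR filter with zero padding at the start
    return [sum(taps[k] * seq[i - k] for k in range(len(taps)) if i - k >= 0)
            for i in range(len(seq))]

def synthesize(lo, hi):
    # up-sample each branch, filter with the synthesis filters, and add
    low_branch = fir(upsample(lo), [1.0, 1.0])    # synthesis mate of h(n)
    high_branch = fir(upsample(hi), [1.0, -1.0])  # synthesis mate of g(n)
    return [l + h for l, h in zip(low_branch, high_branch)]
```

Applying this synthesis along columns, then rows, then frames (the analysis order in reverse) reconstructs the sub-matrix at the next level of resolution.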
Quantization factors that may be applied to different blocks at the highest level of 3-D wavelet coefficients are given in table 1 (refer to
A DWT transformed G color plane in a 3D sequence may be encoded as described, for example, in “Method and Apparatus for Coding of Wavelet Transformed Coefficients” U.S. patent application Ser. No. 09/867,781. One particular video embodiment is provided below, although the claimed subject matter is not limited in scope to this particular embodiment.
A 3-D wavelet transform decomposes an image sequence into eight subbands: one low-frequency subband (LLL) and seven high-frequency subbands (LLH, LHL, LHH, HLL, HLH, HHL, HHH). The LLL subband has the characteristics of the original image and may be further decomposed in multiple levels. In an example application, the decomposition may be applied up to 4 levels and, in this example, levels of the transform are numbered as in
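As a small illustrative aid (not from the text), the per-level subband labels follow from choosing the low-pass (L) or high-pass (H) branch along each of the three axes, and each further level replaces the current LLL band with eight new subbands:

```python
# Enumerate the eight per-level subband labels: one L/H choice per axis.
# With the LLL band decomposed recursively, each level contributes 7 new
# detail subbands, so a 4-level decomposition yields 7 * 4 + 1 subbands.
from itertools import product

labels = [''.join(p) for p in product('LH', repeat=3)]

def total_subbands(levels):
    # each level replaces the previous LLL band with 8 subbands
    return 7 * levels + 1
```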
Coefficients of R and B planes may be coded using a technique described in “Method of Compressing a Color Image” U.S. patent application Ser. No. 09/411,697; although, of course, the claimed subject matter is not limited in scope in this respect. Applying this coding scheme to video, 3-D wavelet transformed coefficient planes for different color planes are shown in
coeff_r = r/g and coeff_b = b/g
where r, g and b are calculated from the coefficient values of the shaded regions shown in
For the R component, the following may be performed. From the R coefficient values, a value based at least in part on the corresponding G values, in this particular embodiment, may be subtracted for the coefficients in the shaded region in
R_error(i,j) = R(i,j) − coeff_r * G(i,j)
The values of the error in the shaded region as shown in
For the B component a similar approach may be applied in this embodiment. From the B coefficient values, again, appropriate values, in this embodiment based at least in part on the corresponding G values, may be subtracted, for the coefficients in the shaded region of
B_error(i,j) = B(i,j) − coeff_b * G(i,j)
The values in the shaded region, again, may be entropy coded and transmitted.
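The encoder side of this inter-component scheme may be sketched as follows for a single subband region. How the scalars r, g and b are derived from the shaded-region coefficients is not spelled out above; the sketch assumes they are mean absolute coefficient values, which is an assumption rather than the specified method:

```python
# Encoder-side sketch for one subband region.  The scalars r, g, b are
# ASSUMED to be mean absolute coefficient values; the text does not
# specify how they are calculated.

def mean_abs(plane):
    vals = [abs(x) for row in plane for x in row]
    return sum(vals) / len(vals)

def encode_errors(R, G, B):
    # coeff_r = r / g and coeff_b = b / g, then per-coefficient residuals
    coeff_r = mean_abs(R) / mean_abs(G)
    coeff_b = mean_abs(B) / mean_abs(G)
    R_err = [[R[i][j] - coeff_r * G[i][j] for j in range(len(R[0]))]
             for i in range(len(R))]
    B_err = [[B[i][j] - coeff_b * G[i][j] for j in range(len(B[0]))]
             for i in range(len(B))]
    return coeff_r, coeff_b, R_err, B_err  # residuals go to the entropy coder
```

When the R and B coefficients track the G coefficients closely, the residual matrices are near zero and entropy code compactly, which is the point of predicting from the reference component.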
For decoding, an inverse scheme may be applied in this embodiment. From the G coefficients and the coeff_r, coeff_b values of different subbands, estimated R′, B′ coefficient value matrices may be constructed as follows. For the R component, the values G′(i,j) * coeff_r may be stored in the corresponding places of the R matrix.
R′(i,j) = G′(i,j) * coeff_r
Note that the values of coeff_r and coeff_b are different for different subbands.
For the B component, the values G′(i,j) * coeff_b may be stored in the corresponding places of the B matrix. To put it mathematically,
B′(i,j) = G′(i,j) * coeff_b
For the R component, the partial error matrix may be entropy decoded and then corresponding values may be added to previously estimated R′ coefficient values. Mathematically,
new R(i,j) = R′(i,j) + R_error(i,j)
For the B component, again, the partial error matrix may be entropy decoded and then corresponding values may be added to the previously estimated B′ coefficient values.
new B(i,j) = B′(i,j) + B_error(i,j)
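The decoder-side reconstruction just described may be sketched as follows; the function name is illustrative, and the body simply mirrors the equations above:

```python
# Decoder-side sketch: estimate the R (or B) coefficients from the decoded
# G' plane and the per-subband ratio, then add the entropy-decoded errors.
#   X'(i,j)   = G'(i,j) * coeff
#   new X(i,j) = X'(i,j) + X_error(i,j)

def reconstruct(G_dec, coeff, err):
    return [[G_dec[i][j] * coeff + err[i][j]
             for j in range(len(G_dec[0]))]
            for i in range(len(G_dec))]
```

The same function serves both components: pass coeff_r with the R error matrix, or coeff_b with the B error matrix, per subband.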
Once the R, G, B wavelet transformed matrices are thus calculated, an inverse 3-D wavelet transformation may be applied to get the reconstructed frames. A schematic diagram of this embodiment of a color video coding scheme is shown in
Here, results are presented for an RGB-formatted, 24-bits-per-pixel CIF (352×288) sequence named “Bright”. Four cases were considered.
The PSNR values for the R, G, B components for different frames are plotted for case (4) in
Although the claimed subject matter is not limited in scope to the particular embodiments described and shown, nonetheless, these embodiments provide a number of potential advantages. The applied 3-D wavelet transformation technique has been shown to reduce redundancies in the image sequence by taking advantage of spatial as well as temporal redundancies. No computationally complex motion estimation/compensation technique is employed in this particular embodiment. Likewise, since no motion estimation/compensation based DCT technique is applied, the reconstructed video generally has fewer visually annoying blocking artifacts. For the most part, the previously described coding scheme is computationally faster and efficiently codes the 3-D wavelet transformed coefficients by employing fewer bits. Hence, it improves compression performance. Furthermore, by applying bit-plane processing with minor modifications to the previously described technique, parallel execution may be employed. Likewise, a bit-plane coding and decoding approach makes such an embodiment of a video coder suitable for a progressive coding environment.
It will, of course, be understood that, although particular embodiments have just been described, the claimed subject matter is not limited in scope to a particular embodiment or implementation. For example, one embodiment may be in hardware, such as implemented to operate on an integrated circuit chip, for example, whereas another embodiment may be in software. Likewise, an embodiment may be in firmware, or any combination of hardware, software, or firmware, for example. Referring to
While certain features of the claimed subject matter have been illustrated and described herein, many modifications, substitutions, changes and equivalents will now occur to those skilled in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the true spirit of the claimed subject matter.
This patent application is a U.S. Continuation-In-Part Patent Application of “Method and Apparatus for Three-Dimensional Wavelet Transform” by Acharya et al., filed on May 29, 2001, U.S. patent application Ser. No. 09/867,784, now U.S. Pat. No. 6,956,903; “Method and Apparatus for Coding of Wavelet Transformed Coefficients” by Acharya et al., filed on May 29, 2001, U.S. patent application Ser. No. 09/867,781, now U.S. Pat. No. 6,834,123; and “Method of Compressing a Color Image” by Acharya et al., filed on Oct. 1, 1999, U.S. patent application Ser. No. 09/411,697, now U.S. Pat. No. 6,798,901; all of the foregoing assigned to the assignee of the current invention and herein incorporated by reference.
Number | Name | Date | Kind |
---|---|---|---|
5875122 | Acharya | Feb 1999 | A |
5956467 | Rabbani et al. | Sep 1999 | A |
5995210 | Acharya | Nov 1999 | A |
6009201 | Acharya | Dec 1999 | A |
6009206 | Acharya | Dec 1999 | A |
6047303 | Acharya | Apr 2000 | A |
6091851 | Acharya | Jul 2000 | A |
6094508 | Acharya et al. | Jul 2000 | A |
6108453 | Acharya | Aug 2000 | A |
6124811 | Acharya et al. | Sep 2000 | A |
6130960 | Acharya | Oct 2000 | A |
6151069 | Dunton et al. | Nov 2000 | A |
6151415 | Acharya et al. | Nov 2000 | A |
6154493 | Acharya et al. | Nov 2000 | A |
6166664 | Acharya | Dec 2000 | A |
6178269 | Acharya | Jan 2001 | B1 |
6195026 | Acharya | Feb 2001 | B1 |
6215908 | Pazmino et al. | Apr 2001 | B1 |
6215916 | Acharya | Apr 2001 | B1 |
6229578 | Acharya et al. | May 2001 | B1 |
6233358 | Acharya | May 2001 | B1 |
6236433 | Acharya et al. | May 2001 | B1 |
6236765 | Acharya | May 2001 | B1 |
6269181 | Acharya | Jul 2001 | B1 |
6275206 | Tsai et al. | Aug 2001 | B1 |
6285796 | Acharya et al. | Sep 2001 | B1 |
6292114 | Tsai et al. | Sep 2001 | B1 |
6611620 | Kobayashi et al. | Aug 2003 | B1 |
6798901 | Acharya et al. | Sep 2004 | B1 |
6834123 | Acharya et al. | Dec 2004 | B1 |
6850570 | Pesquet-Popescu et al. | Feb 2005 | B1 |
Number | Date | Country | |
---|---|---|---|
20040017952 A1 | Jan 2004 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 09867784 | May 2001 | US |
Child | 10206908 | US | |
Parent | 09867781 | May 2001 | US |
Child | 09867784 | US | |
Parent | 09411697 | Oct 1999 | US |
Child | 09867781 | US |