1. Field of the Invention
The present invention relates to texture compression techniques.
2. Background of the Invention
Compression and decompression aimed at minimizing the memory needed to store 2D textures are a promising field of application for these techniques in the 3D graphics domain. This field of use is becoming more significant as the dimensions and number of textures tend to increase in real applications. The level of detail also tends to increase, as required by applications such as 3D games; without such techniques, memory size and access bandwidth would demand performance levels that are hardly sustainable in mobile, ultra-low-power, handheld systems. In particular, these techniques are becoming increasingly important in wireless phone architectures with 3D game processing capabilities.
For example, assuming a texture of 512×512 pixels at 16 bits per pixel and a depth of 3, the amount of memory needed is 1.5 Mbytes. At 20 to 30 frames per second, the memory bandwidth is 30 to 45 Mbytes/s.
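The figures above can be checked with a short calculation. The following is only a sketch: it assumes that "16 bit/color" means 16 bits per pixel, that every texture is read once per frame, and that 1 Mbyte is 2^20 bytes. The function names are illustrative and do not come from the original text.

```python
MIB = 1024 * 1024  # assumption: 1 Mbyte = 2**20 bytes


def texture_memory_bytes(width, height, bits_per_pixel, depth):
    """Bytes needed to store `depth` textures of width x height pixels."""
    return width * height * (bits_per_pixel // 8) * depth


def bandwidth_bytes_per_second(memory_bytes, frames_per_second):
    """Bandwidth if the whole texture set is read once per frame (assumption)."""
    return memory_bytes * frames_per_second


memory = texture_memory_bytes(512, 512, 16, 3)
low = bandwidth_bytes_per_second(memory, 20)
high = bandwidth_bytes_per_second(memory, 30)
print(memory / MIB, low / MIB, high / MIB)  # 1.5 30.0 45.0
```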
Additional background information on this topic can be gathered from “Real-Time Rendering” by Tomas Akenine-Möller and Eric Haines, A.K. Peters Ltd, 2nd edition, ISBN 1568811829.
A well-known solution in this scenario was developed by the company S3; the related algorithm is designated S3TC (where TC stands for Texture Compression).
This has become a widely used de facto standard and is included in the Microsoft DirectX libraries with ad hoc API support.
Compression is performed off-line at system initialization, and the textures are then stored in the main memory. Decompression processes decompress textures by accessing the memory at run time, when needed by the graphics engine. This means that only decompression is implemented in hardware, while compression is not.
Important parameters for the decompression engine are the number of steps needed to decompress textures, the possibility of parallel operation, and low latency between data access from memory and data output from the decompression engine.
In order to better understand operation of the S3TC algorithm, one may refer to an image in RGB format, where each color component R (Red), G (Green) or B (Blue) is a sub-image composed of N pixels in the horizontal dimension and M pixels in the vertical dimension. If each color component is coded with P bits, the number of bits per image is N*M*3*P.
For example, assuming N=M=256 and P=8, the resulting size is 1,572,864 bits. If each sub-image R, G or B is decomposed into non-overlapping blocks of Q pixels in the horizontal dimension and S pixels in the vertical dimension, the number of blocks per sub-image is (N*M)/(Q*S), the number of blocks per image is 3*(N*M)/(Q*S), and the number of bits per block (over the three color components) is 3*(Q*S)*P. If, for example, Q=S=4 and P=8, the resulting size of each block is 384 bits. If the number of bits per channel is R=5, G=6, B=5, the resulting size of each block is (4*4)*(5+6+5)=256 bits. The S3TC algorithm is able to compress such an amount of data by 6 times when R=8, G=8, B=8 and by 4 times when R=5, G=6, B=5. The resulting compressed block sent to the decompression stage is always composed of 64 bits. This number is the result of the coding steps described below, assuming Q=S=4.
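The bit counts quoted above follow directly from the stated formulas. A minimal sketch of the arithmetic is given here; the helper names are illustrative only.

```python
COMPRESSED_BLOCK_BITS = 64  # size of an S3TC compressed block, as stated above


def bits_per_image(n, m, channel_bits):
    """Bits in an N x M image; channel_bits lists the bits of R, G and B."""
    return n * m * sum(channel_bits)


def bits_per_block(q, s, channel_bits):
    """Bits in a Q x S block over the three color components."""
    return q * s * sum(channel_bits)


rgb888 = (8, 8, 8)
rgb565 = (5, 6, 5)

image_bits = bits_per_image(256, 256, rgb888)
block_888 = bits_per_block(4, 4, rgb888)
block_565 = bits_per_block(4, 4, rgb565)

print(image_bits)                         # 1572864
print(block_888, block_565)               # 384 256
print(block_888 / COMPRESSED_BLOCK_BITS)  # 6.0
print(block_565 / COMPRESSED_BLOCK_BITS)  # 4.0
```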
To sum up, operation of the S3TC algorithm may be regarded as comprised of the following steps:
i) Decompose the R G B image into non-overlapping (Q=4)*(S=4) blocks of R G B colors
ii) Consider the following block composed of 16 pixels, each one composed of R, G and B color components:
Pij=Rij U Gij U Bij (this denotes the pixel at the ij position in the R G B image, and U is the union operator)
iii) Decompose the block above into three sub-blocks called sub-block R, sub-block G and sub-block B as shown herein below, each sub-block including only one color component:
as shown in
iv) Sort in ascending order each sub-block color
v) Detect the black color, which is a pixel made of R=0 and G=0 and B=0
vi) If the black color is not detected, then set a color palette made by
vii) Otherwise, if black color is detected then set a color palette made by
viii) If black color is not detected define the look-up color palette as
If black color is detected define the color palette as
ix) Associate the following 2 bits code (in boldface, under the palette) to each column of the above palette
x) For each Pij=Rij U Gij U Bij (where i ranges from 1 to Q=4 and j ranges from 1 to S=4) compute the Euclidean distance Dist between it and each look-up color as defined above in vi.a,b,c,d or vii.a,b,c,d, depending on whether black color has been detected or not. Essentially, this is the Euclidean distance between two points in a three-dimensional coordinate space, where each difference is taken between homologous color components (R with R, G with G, B with B).
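$$\mathrm{Dist1}=\sqrt{|R_{ij}-\mathrm{MinR}|^{2}+|G_{ij}-\mathrm{MinG}|^{2}+|B_{ij}-\mathrm{MinB}|^{2}}$$
$$\mathrm{Dist2}=\sqrt{|R_{ij}-\mathrm{Int1R}|^{2}+|G_{ij}-\mathrm{Int1G}|^{2}+|B_{ij}-\mathrm{Int1B}|^{2}}$$
$$\mathrm{Dist3}=\sqrt{|R_{ij}-\mathrm{Int2R}|^{2}+|G_{ij}-\mathrm{Int2G}|^{2}+|B_{ij}-\mathrm{Int2B}|^{2}}$$
$$\mathrm{Dist4}=\sqrt{|R_{ij}-\mathrm{MaxR}|^{2}+|G_{ij}-\mathrm{MaxG}|^{2}+|B_{ij}-\mathrm{MaxB}|^{2}}$$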
xi) For each Pij=Rij U Gij U Bij find the minimum distance among Dist1, Dist2, Dist3 and Dist4. For example, let it be Dist1.
xii) Send to a decoder process the code associated with the look-up table color that has the minimum distance. If it is Dist1, then the code is 00.
xiii) The decoder receives for each Q*S block as shown in
xiv) If Min is received before Max by the decoder, then black has been detected by the encoder; otherwise it has not
xv) As shown in
xvi) As shown in
As stated before, the compression ratio is 6:1 or 4:1. This is because if colors are in R=8 G=8 B=8 format then 384 bits are coded with 64 (384/64=6) and if colors are in R=5 G=6 B=5 format then 256 bits are coded with 64 (256/64=4).
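Steps x) to xii) of the S3TC description can be sketched as follows. This is an illustration under stated assumptions rather than the S3TC reference implementation: the palette definitions of steps vi) to ix) are not reproduced in the text above, so the two intermediate colors are assumed to lie at one third and two thirds between Min and Max, and the codes 01, 10 and 11 are assumed to follow the palette order (only the pairing of Dist1 with 00 is stated in the text). The black-detected case is not covered, and all function names are hypothetical.

```python
import math


def build_palette(min_color, max_color):
    """Four look-up colors Min, Int1, Int2, Max.

    Assumption: Int1 and Int2 are per-channel interpolations at 1/3 and 2/3.
    """
    int1 = tuple((2 * lo + hi) // 3 for lo, hi in zip(min_color, max_color))
    int2 = tuple((lo + 2 * hi) // 3 for lo, hi in zip(min_color, max_color))
    return [min_color, int1, int2, max_color]


def nearest_code(pixel, palette):
    """Index of the palette color at minimum Euclidean distance from pixel."""
    distances = [math.dist(pixel, color) for color in palette]
    return distances.index(min(distances))


def encode_indices(block, palette):
    """Two-bit code (as a string) for every pixel of the block."""
    return [format(nearest_code(pixel, palette), "02b") for pixel in block]


palette = build_palette((10, 20, 10), (40, 80, 40))
pixels = [(10, 20, 10), (21, 41, 19), (29, 62, 31), (39, 79, 41)]
codes = encode_indices(pixels, palette)
print(palette)  # [(10, 20, 10), (20, 40, 20), (30, 60, 30), (40, 80, 40)]
print(codes)    # ['00', '01', '10', '11']
```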
As shown in
However satisfactory the prior art solution considered in the foregoing may be, the need is felt for alternative texture compression/decompression techniques of improved quality.
The aim of the present invention is thus to provide such an alternative, improved technique, leading to better performance in terms of quality achieved and complexity needed for its implementation.
According to the present invention, such an object is achieved by means of a method having the features set forth in the claims that follow. The invention also encompasses the decoding process as well as corresponding apparatus in the form of either a dedicated processor or a suitably programmed general-purpose computer (such as a DSP). In that respect the invention also relates to a computer program product directly loadable into the memory of a digital computer such as a processor and including software code portions performing the method of the invention when the product is run on a computer.
The preferred embodiment of the invention provides a significant improvement over prior art solutions such as S3TC from different viewpoints, since it uses the following compression tools:
The invention will now be described, by way of example only, with reference to the annexed figures of drawing, wherein:
FIGS. 7a to 7h are diagrams showing the directions used to scan and predict pixels in the arrangement shown herein; and
A first embodiment of the invention will now be described by using the same approach previously adopted for describing, in the case of Q=S=4, the S3TC arrangement.
This method will first be described by referring to an exemplary embodiment where Q=S=3.
i) Decompose the R G B image into non-overlapping Q×S blocks of R G B colors
ii) Consider the following 3×3 block composed of nine pixels, each one composed of R, G and B components:
Pij=Rij U Gij U Bij (where Pij again denotes the pixel placed in the ij position in the R G B image, and U is the union operator)
iii) Decompose the above block into three sub-blocks called sub-block R, sub-block G and sub-block B, respectively, as shown below, wherein each sub-block includes only one color component:
iv) Define a 1st predictor for each sub-block
v) Compute for each sub-block the following prediction differences:
vi) Sort in ascending order the prediction differences in each sub-block as shown in
vii) Set up a look-up prediction difference palette wherein
(In fact the relationships reported in the foregoing correspond to the presently preferred choice within the general relationships:
x) For each Pij=Rij U Gij U Bij (where i ranges from 1 to Q=3 and j ranges from 1 to S=3) compute the prediction error using P22 as predictor.
xi) For each Eij compute the Euclidean distance between it and each look-up color as defined above in step ix. This is again the Euclidean distance between two points in a three-dimensional coordinate space and the difference is between homologous prediction error components.
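$$\mathrm{Dist1}=\sqrt{|ER_{ij}-\mathrm{Min\_errorR}|^{2}+|EG_{ij}-\mathrm{Min\_errorG}|^{2}+|EB_{ij}-\mathrm{Min\_errorB}|^{2}}$$
$$\mathrm{Dist2}=\sqrt{|ER_{ij}-\mathrm{Int1R}|^{2}+|EG_{ij}-\mathrm{Int1G}|^{2}+|EB_{ij}-\mathrm{Int1B}|^{2}}$$
$$\mathrm{Dist3}=\sqrt{|ER_{ij}-\mathrm{Int2R}|^{2}+|EG_{ij}-\mathrm{Int2G}|^{2}+|EB_{ij}-\mathrm{Int2B}|^{2}}$$
$$\mathrm{Dist4}=\sqrt{|ER_{ij}-\mathrm{Max\_errorR}|^{2}+|EG_{ij}-\mathrm{Max\_errorG}|^{2}+|EB_{ij}-\mathrm{Max\_errorB}|^{2}}$$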
xii) For each Eij=ERij U EGij U EBij find the minimum distance among Dist1, Dist2, Dist3 and Dist4. For example this may be Dist1.
xiii) Compose a bitstream as follows:
xiv) Each 3*3 block is encoded to 16+16+16+(8*2)=64 bits instead of 144 bits, with a compression factor of 2.25, if the RGB source format is 565. The compression factor is 3.375 (216 bits coded with 64) if the RGB source format is 888.
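The 3×3 steps above can be illustrated with a small encoder/decoder pair. This is a hedged sketch, not the claimed method: several step bodies are not reproduced in the text above, so it assumes that the palette extremes are the per-channel minimum and maximum prediction errors, that Int1 and Int2 lie at one third and two thirds between them, that the prediction error is the predictor minus the pixel, and that the three 16-bit fields carry P22, Min_error and Max_error. Bit packing and the sign handling of negative errors are left out, and all names are hypothetical.

```python
import math

CENTER = 4  # index of P22 in a row-major 3x3 block


def interpolate(lo, hi):
    """Assumed intermediate palette entries at 1/3 and 2/3 between lo and hi."""
    int1 = tuple((2 * a + b) / 3 for a, b in zip(lo, hi))
    int2 = tuple((a + 2 * b) / 3 for a, b in zip(lo, hi))
    return int1, int2


def encode_block(block):
    """block: list of nine (R, G, B) tuples in row-major order."""
    predictor = block[CENTER]
    # Prediction error = predictor - pixel, for the eight non-central pixels.
    errors = [tuple(p - c for p, c in zip(predictor, pixel))
              for i, pixel in enumerate(block) if i != CENTER]
    min_error = tuple(min(e[ch] for e in errors) for ch in range(3))
    max_error = tuple(max(e[ch] for e in errors) for ch in range(3))
    palette = [min_error, *interpolate(min_error, max_error), max_error]
    codes = []
    for error in errors:
        distances = [math.dist(error, entry) for entry in palette]
        codes.append(distances.index(min(distances)))
    return predictor, min_error, max_error, codes


def decode_block(predictor, min_error, max_error, codes):
    """Rebuild the nine pixels from the predictor, palette extremes and codes."""
    palette = [min_error, *interpolate(min_error, max_error), max_error]
    decoded = [tuple(round(p - e) for p, e in zip(predictor, palette[code]))
               for code in codes]
    decoded.insert(CENTER, predictor)  # P22 is transmitted as-is
    return decoded


levels = [100, 97, 94, 91, 100, 97, 94, 91, 100]
block = [(v, v, v) for v in levels]
predictor, min_error, max_error, codes = encode_block(block)
bits = 16 * 3 + 2 * len(codes)
print(codes, bits)  # [0, 1, 2, 3, 1, 2, 3, 0] 64
```

In this hand-picked block every prediction error falls exactly on a palette entry, so decoding returns the original pixels; in general the scheme is lossy.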
In the decoding process, the decoder will receive the incoming bitstream and proceed through the following steps:
The arrangement disclosed in the foregoing has been applied to the following standard images, using two formats: RGB 565 and RGB 888, where 5, 6 or 8 is the number of bits per color channel.
1. 256×256 (horizontal×vertical size dimension)
2. 512×512 (horizontal×vertical size dimension)
3. 640×480 (horizontal×vertical size dimension)
These pictures are a representative set of those to which texture compression is typically applied. All pictures are in true-color (888) format, while the 565 pictures are obtained from the 888 pictures by truncating the 3, 2 and 3 lowest bits of the R, G and B channels, respectively. Alternative methods can be used to transform 888 pictures into 565 pictures, such as “rounding to nearest integer”, “Floyd-Steinberg dithering”, etc. These alternatives do not entail changes to the arrangement disclosed herein.
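The truncation from 888 to 565 amounts to a right shift of each channel. The sketch below shows this; the packing of the three fields into one 16-bit word with R in the most significant bits is an assumption (a common convention, not stated in the text), and the function names are illustrative.

```python
def rgb888_to_565(r, g, b):
    """Drop the 3, 2 and 3 least significant bits of R, G and B."""
    return r >> 3, g >> 2, b >> 3


def pack_565(r5, g6, b5):
    """Pack into one 16-bit word; R-high field order is an assumption."""
    return (r5 << 11) | (g6 << 5) | b5


white = rgb888_to_565(255, 255, 255)
sample = rgb888_to_565(200, 100, 50)
print(white, hex(pack_565(*white)))  # (31, 63, 31) 0xffff
print(sample)                        # (25, 25, 6)
```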
To measure the performance of each algorithm, visual assessments and objective measures were performed, by taking two parameters as the reference measures, namely mean square error (MSE) and peak signal/noise ratio (PSNR) for each RGB channel.
Input images IS in the 888 format (called Source888) are converted at 200 into the 565 format (called Source565), then compressed at 201 and further decompressed at 202 to the 565 format. These images are converted back at 203 into the 888 format to generate a first set of output images OS′ (also called Decoded888).
The Source565 images from block 200 are converted back into the 888 format at 204 to generate a second set of output images OS″ to be used as a reference (called Source565to888).
A first set of PSNR values (called PSNR 888) is computed between the Source888 IS and the Decoded888 OS′ images. A second set of PSNR values (called PSNR 565) is computed between the Source565to888 OS″ and the Decoded888 OS′ images.
The 565 images are converted back to the 888 format by simple zero-bit stuffing of the 3, 2 and 3 least significant bit positions of the R, G and B channels, respectively.
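Zero-bit stuffing is the inverse shift, with the dropped positions filled with zeros. A minimal sketch follows (the function name is illustrative); it also shows that the truncated bits are not recovered.

```python
def rgb565_to_888(r5, g6, b5):
    """Shift each channel back up; the low 3, 2 and 3 bits become zero."""
    return r5 << 3, g6 << 2, b5 << 3


restored_white = rgb565_to_888(31, 63, 31)
restored_sample = rgb565_to_888(25, 25, 6)
print(restored_white)   # (248, 252, 248)
print(restored_sample)  # (200, 100, 48)
```

For instance, an 888 pixel (200, 100, 50) truncates to (25, 25, 6) in 565 and comes back as (200, 100, 48), so the blue channel has lost its two lowest set bits.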
How the Source888 IS images are converted to the 565 format and back to the 888 format corresponds to conventional techniques that do not need to be described in detail herein.
Mean squared error (MSE) and peak signal/noise ratio (PSNR) are defined as follows:
$$\mathrm{MSE}=\frac{\sum |P_{ij}-Pa_{ij}|^{2}}{W\cdot h}$$
where: Pij=source color
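The MSE definition above, together with a PSNR computation, can be sketched per color channel as follows. The PSNR expression is not reproduced in the text above, so the conventional definition 10·log10(peak²/MSE) with a peak value of 255 for 8-bit channels is assumed here; Paij is taken to be the decoded color, which is also an assumption. Function names are illustrative.

```python
import math


def mse(source, decoded):
    """Mean squared error between two equal-sized channel planes (2D lists)."""
    height = len(source)
    width = len(source[0])
    total = sum((s - d) ** 2
                for source_row, decoded_row in zip(source, decoded)
                for s, d in zip(source_row, decoded_row))
    return total / (width * height)


def psnr(source, decoded, peak=255):
    """Assumed conventional PSNR in dB; infinite when the planes are identical."""
    error = mse(source, decoded)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


source_plane = [[10, 20], [30, 40]]
decoded_plane = [[12, 20], [30, 36]]
print(mse(source_plane, decoded_plane))  # 5.0
print(round(psnr(source_plane, decoded_plane), 2))
```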
Due to its predictive nature, the arrangement previously described is almost invariably able to achieve better performance, while yielding a lower compression ratio than S3TC.
The proposed arrangement can be easily extended to Q=4×S=4 blocks by simply adding one more column at the right side and one more row at the bottom side of the Q=3×S=3 “chunk”.
The main difference with respect to the embodiment described in the foregoing is related to the possibility of adopting a plurality of different patterns for extending the 3×3 block to a 4×4 block as best shown in
This extension will be described in the following.
i) Decompose the R G B image into non-overlapping Q×S blocks of R G B colors
ii) Consider the following Q=4×S=4 block composed of 16 pixels, each one composed of R, G and B components:
Pij=Rij U Gij U Bij (where Pij is again the pixel at the ij position in the R G B image, and U is the union operator)
iii) Decompose the above block into three sub-blocks called sub-block R, sub-block G and sub-block B, respectively, as shown below, wherein each sub-block includes only one color component
iv) Define a 1st predictor for each sub-block
v) Define
It will be appreciated that other prediction patterns are feasible, as shown in
vi) Compute for each sub-block the following prediction differences:
vii) Sort in ascending order the prediction differences for each sub-block as shown in
viii) Two groups are defined by the sorted prediction differences. The first is composed of the three lowest elements and the second of the three highest, as shown in
ix) Set a look-up prediction differences palette composed as follows:
i. Int1R=(2*min_median_errorR+max_median_errorR)/3, Int1G=(2*min_median_errorG+max_median_errorG)/3, Int1B=(2*min_median_errorB+max_median_errorB)/3
i. Int2R=(min_median_errorR+2*max_median_errorR)/3, Int2G=(min_median_errorG+2*max_median_errorG)/3, Int2B=(min_median_errorB+2*max_median_errorB)/3
In
(In fact the relationships reported in the foregoing correspond to the presently preferred choice within the general relationships:
x) Define the look-up prediction error palette as
xi) Associate the following 2 bits code with each column of the above palette
xii) For each Pij=Rij U Gij U Bij (where i ranges from 1 to Q=4 and j ranges from 1 to S=4) compute the prediction error using predictors as defined in steps v and vi.
p. Define the prediction error Eij=ERij U EGij U EBij=(PredictorRkl−Rij) U (PredictorGkl−Gij) U (PredictorBkl−Bij)
xiii) For each Eij compute the Euclidean distance between it and each look-up color as defined above in step ix. This is again the Euclidean distance between two points in a three-dimensional coordinate space and the difference is between homologous prediction error components.
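$$\mathrm{Dist1}=\sqrt{|ER_{ij}-\mathrm{Min\_median\_errorR}|^{2}+|EG_{ij}-\mathrm{Min\_median\_errorG}|^{2}+|EB_{ij}-\mathrm{Min\_median\_errorB}|^{2}}$$
$$\mathrm{Dist2}=\sqrt{|ER_{ij}-\mathrm{Int1R}|^{2}+|EG_{ij}-\mathrm{Int1G}|^{2}+|EB_{ij}-\mathrm{Int1B}|^{2}}$$
$$\mathrm{Dist3}=\sqrt{|ER_{ij}-\mathrm{Int2R}|^{2}+|EG_{ij}-\mathrm{Int2G}|^{2}+|EB_{ij}-\mathrm{Int2B}|^{2}}$$
$$\mathrm{Dist4}=\sqrt{|ER_{ij}-\mathrm{Max\_median\_errorR}|^{2}+|EG_{ij}-\mathrm{Max\_median\_errorG}|^{2}+|EB_{ij}-\mathrm{Max\_median\_errorB}|^{2}}$$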
xiv) For each Eij=ERij U EGij U EBij find the minimum distance among Dist1, Dist2, Dist3 and Dist4. For example, this may be Dist1, and the two-bit code associated thereto is 00.
xv) Each Q*S block is fully coded in 8 different sessions, where each session uses one of the 8 configurations for the predictions shown in
xvi) Compose a bitstream as follows:
xvii) For each block send the bitstream as defined in step xvi) to a decoder process.
xviii) Each Q×S=4*4 block is encoded to 16+16+16+(15*2)+2=80 bits=16*5 bits=10 bytes instead of 256 bits, which allows a compression factor of 3.2 if the RGB source format is 565. The compression factor is 4.8 (384 bits coded with 80) if the RGB source format is 888.
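Steps viii) and ix) of the 4×4 description can be illustrated for a single color channel as follows. This is a sketch under an explicit assumption: the text names the palette extremes min_median_error and max_median_error but their definition is not reproduced above, so they are taken here to be the medians of the three lowest and the three highest sorted prediction differences. The interpolation formulas are those given in step ix). The function name is hypothetical.

```python
from statistics import median


def median_palette(differences):
    """Palette for one channel from that channel's prediction differences.

    Assumption: the extremes are the medians of the lowest and highest
    groups of three sorted differences.
    """
    ordered = sorted(differences)
    min_median = median(ordered[:3])
    max_median = median(ordered[-3:])
    int1 = (2 * min_median + max_median) / 3
    int2 = (min_median + 2 * max_median) / 3
    return [min_median, int1, int2, max_median]


# Fifteen prediction differences, one per predicted pixel of a 4x4 sub-block.
channel_palette = median_palette(range(-7, 8))
print(channel_palette)  # [-6, -2.0, 2.0, 6]
```

Under this assumption the single lowest and highest differences do not set the palette range, which would make the palette less sensitive to one outlying pixel than a plain minimum/maximum choice.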
The arrangement just described has been applied to the same set of pictures defined previously. The results show that the instant predictive arrangement is able to achieve at least the same performance levels as S3TC, while yielding a compression factor slightly lower than S3TC on 565 sequences.
The proposed arrangement however achieves unquestionably better quality in both the 3×3 and 4×4 versions, in spite of a lower compression ratio (i.e., the 4×4 version reaches 80% of the compression ratio of S3TC on 565 sources). Even when worse quality has been measured, visual assessments showed imperceptible artifacts.
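The compression factors quoted for the two block sizes and for S3TC can be reproduced from the bit budgets given in the text. The sketch below only restates that arithmetic; the meaning of the final "+2" term in the 4×4 budget is not spelled out in the surviving text, so it appears here simply as `extra_bits`. All names are illustrative.

```python
def compressed_bits(coded_pixels, extra_bits=0):
    """Three 16-bit fields plus a 2-bit code per coded pixel, plus extras."""
    return 16 * 3 + 2 * coded_pixels + extra_bits


def factor(pixels, bits_per_pixel, block_bits):
    """Compression factor: source bits divided by compressed bits."""
    return pixels * bits_per_pixel / block_bits


bits_3x3 = compressed_bits(8)                 # 64
bits_4x4 = compressed_bits(15, extra_bits=2)  # 80
bits_s3tc = 64                                # per 4x4 block, as stated earlier

factors = {
    "3x3 predictive": (factor(9, 16, bits_3x3), factor(9, 24, bits_3x3)),
    "4x4 predictive": (factor(16, 16, bits_4x4), factor(16, 24, bits_4x4)),
    "S3TC": (factor(16, 16, bits_s3tc), factor(16, 24, bits_s3tc)),
}
for name, (f565, f888) in factors.items():
    print(name, f565, f888)

# Ratio of the 4x4 predictive factor to the S3TC factor on 565 sources.
relative = factors["4x4 predictive"][0] / factors["S3TC"][0]
```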
Of course, without prejudice to the underlying principle of the invention, the details and embodiments may vary, also significantly, with respect to what has been described and shown by way of example only, without departing from the scope of the invention as defined by the annexed claims.
Number | Date | Country | Kind |
---|---|---|---|
03002728 | Feb 2003 | EP | regional |
Number | Name | Date | Kind |
---|---|---|---|
4319267 | Mitsuya et al. | Mar 1982 | A |
5663764 | Kondo et al. | Sep 1997 | A |
5956431 | Iourcha et al. | Sep 1999 | A |
6393060 | Jeong | May 2002 | B1 |
6757435 | Kondo | Jun 2004 | B2 |
20030161404 | Wu | Aug 2003 | A1 |
Number | Date | Country | |
---|---|---|---|
20040156543 A1 | Aug 2004 | US |