The present invention relates to colorspace interchange and more particularly, to a new color format that can be used for a reference color frame and for internal color space for image processing.
Communication of color information between different devices and industries has become recognized as an important issue. Each industry generally has its own color management history, with its own terminology, standards, and methods for communicating color information. As more users connect different peripheral devices made by different companies and, in addition, communicate with one another over the Internet, it is becoming more urgent to have a standardized scheme that provides consistent color data management. Many different practices and standards are currently being used.
Different phosphor sets are being used to provide the colors of “red”, “green”, and “blue”. For example, where a monitor displays a pink color and the user selects the pink color, the printer printing the selection may print out an ugly purple/lavender. Thus, different values on chromaticity diagrams may represent the same color, causing confusion. Tone reproduction of various systems also differs. Also, viewing conditions may vary, causing colors to appear different to different observers. Due to differing viewing conditions, the color of the illuminant (the white point), the absolute level of the scene irradiance (generally the illuminance), the surrounding colors, etc., all affect the color perception unless the initial and final conditions are identical. Unless the white point and illuminance level are the same, the color interchange data may not be identical.
Previous color data conversion methods have required the use of cube root computation or raising values to the third power. To store data, every pixel had to be converted using a set of power function routines. This process is time-consuming, consumes processing power, and may introduce errors. Other techniques, such as that described in U.S. Pat. No. 5,224,178, by Madden et al., provide for compressing digital code values to provide a set of reduced range digital codes of the same resolution, but having a smaller range of basic image content values than the dynamic range of the digitized image database.
As shown in
Although lasers have virtually monochromatic output and the primaries of the laser would reside on the spectrum locus of the CIE (International Commission on Illumination) diagram of
Cathode ray tubes (CRTs), color flat panels (both active and passive matrix types) and high definition televisions (HDTVs) provide chromaticity diagrams that are similar to the CRT model shown in FIG. 2. However, the sRGB chromaticity diagram lacks a gamut that includes all colors, and conversion of sRGB color data values is non-linear, often producing undesired results.
Advanced graphic systems require anti-aliasing (removing ragged edges) and blending (translucency) effects. Those effects are handled by an extra component, called the alpha channel, in addition to the RGB components. In order to perform the anti-aliasing and blending operations correctly with the alpha channel, the linear color components need to be defined in terms of their intensities. Current systems are, however, limited to intensity values between 0 and 1, which do not provide optimal results in some circumstances.
One aspect of the invention relates to an extended colorspace which has a higher accuracy and a wider gamut than the sRGB color space. The extended color space includes an alpha channel which defines the translucency of the color image. The alpha channel is different from known alpha channels in that the inventive alpha channel can represent “super transparent” and “super opaque” values by allowing the alpha parameter (α) to be greater than 1 or less than 0.
A data structure for storing image information for each component of an image is also disclosed. The data structure has three fields, a sign field, an integer field and a decimal field. The sign field defines whether an integer is negative or positive. The integer field defines the integer, wherein the integer defines the super or under saturated values for color and alpha components. The decimal field defines the fine detailed information for the value of the color and alpha components.
FIGS. 10(a)-10(e) illustrate various stages of color data according to embodiments of the invention.
As shown in
By allowing the component of each primary color to be negative and to extend beyond 1.0 (when normalized to 1.0 in sRGB), the present invention's gamut is larger than the visible color space. The data scheme of the present invention, “XsRGB”, is also known as “sRGB64”. “XsRGB” will be used hereafter to represent the color data scheme of either XsRGB or sRGB64.
Advanced graphic systems require anti-aliasing features (removing ragged edges) and blending (translucency) effects. To achieve these anti-aliasing features and blending effects, an extra component called an “alpha channel” was introduced. To utilize the alpha channel, the linear color components must be expressed in terms of their intensities. However, sRGB and other color management systems typically store color data values as non-linear 8-bit values per channel. The non-linearity is expressed as a “gamma value”. For example, Microsoft's and Apple's color management systems use gamma values of 2.2 and 1.8, respectively. When only 8 bits were available for color data value representation, it was necessary to convert the color data non-linearly. Otherwise, linear storage left large gaps between the lower intensity values, causing the resulting images to show contours. However, when the size of each component is extended to a higher bit depth (12 bits or more), the non-linearity requirement is eliminated. Thus, in an embodiment with 12 or more bits for each component, component values do not have to be non-linearized, avoiding the confusion of different gamma values in different color standards. When super luminous or negative values are allowed, color profiles do not require clipping to a narrower gamut. Since, in this embodiment, color values are standardized, standard images may be stored in the XsRGB format without attaching a standardized profile such as an ICC (International Color Consortium) profile to clarify the colors intended. Where desired, an alpha channel may be implemented to store information on transparency. Also, where selected, the color values may be premultiplied by alpha channel values to provide efficient blending.
It is better to define XsRGB more generally by a 4×4 matrix. Also, there is a conversion rule for XsRGB with a different white point.
XsRGB is linear in the visual intensity of each component. Hence, XsRGB can relate linearly to 1931 CIE XYZ values. Let R0, G0, and B0 denote the normalized red, green, and blue components, respectively. Let X, Y, and Z denote 1931 CIE XYZ values, with Y normalized to 1 instead of 100. The relationship between the normalized XsRGB and XYZ is given by a 4×4 matrix.
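For orientation, a relation of this kind can be written in homogeneous form as sketched below. The layout is an assumption reconstructed from the notation used in the surrounding text (rotational coefficients such as mRZ and translational coefficients such as tR); the bottom row is fixed, which leaves 12 free coefficients.

```latex
\begin{pmatrix} R_0 \\ G_0 \\ B_0 \\ 1 \end{pmatrix}
=
\begin{pmatrix}
m_{RX} & m_{RY} & m_{RZ} & t_R \\
m_{GX} & m_{GY} & m_{GZ} & t_G \\
m_{BX} & m_{BY} & m_{BZ} & t_B \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}
```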
Only 12 coefficients are needed to define XsRGB. In addition to the rotational part (mRZ, etc.), the translational part (tR, etc.) is used. With this notation, the white point may be addressed as well as the black point. Using the inverse of the above matrix, the reverse relation from XsRGB to CIE XYZ space is given by:
A 16 bit definition of RGB components is given by:
In equation (2c), no gamma corrections are required since a sufficient number of bits are available to describe the color data (here, 16 bits).
It is desirable for XsRGB to have a simple transform to sRGB in D65. D50 and D65 are standard illuminants (spectral distributions of the light source) defined by the CIE. D50 and D65 are spectral distributions similar to black body radiation at 5000 and 6500 Kelvin, respectively. Indeed, it is desirable for XsRGB to be identical to sRGB when its value is inside the range of sRGB. From the sRGB specification, the coefficients of Eq. (1b) and Eq. (2b) are determined as:
The white point of D65 is (xD65, yD65)=(0.3127, 0.3291); the corresponding CIE XYZ values are
Note that the Y-value at the white point is 1. When the device has a different white point (Xw, Yw, Zw), the CIE XYZ coordinates for the appearance match must be transformed by the scaling matrix.
and its inverse is
The transformation matrix from XYZ to XsRGB at this white point is given by
Mw = MD65 Sw (6a)
and its inverse matrix is given by
Mw^(−1) = Sw^(−1) MD65^(−1) (6b)
For example, the white point of D50 is (xD50, yD50)=(0.3457, 0.3585). The corresponding CIE XYZ value is (XD50, YD50, ZD50)=(0.9643, 1, 0.8251). Hence the scaling matrices are
The resultant transformation matrices for D50 are:
The appearance match is obtained if the XsRGB values are calculated from the conversion matrix of the device white point. The absolute match may be obtained if the conversion matrix of D65 is used irrespective of the device white point.
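The white-point handling described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the form of the scaling matrix (a diagonal matrix of per-axis white-point ratios) is an assumption, the helper names are hypothetical, and the D65 matrix is the XYZ to linear sRGB matrix published with the sRGB specification.

```python
# Sketch: white-point adaptation by diagonal scaling in XYZ.
# The diagonal form of S_w is an ASSUMPTION; all helper names are hypothetical.

# XYZ -> linear sRGB matrix for D65, as published with the sRGB specification.
M_D65 = [
    [3.2406, -1.5372, -0.4986],
    [-0.9689, 1.8758, 0.0415],
    [0.0557, -0.2040, 1.0570],
]


def xy_to_XYZ(x, y):
    """Convert a chromaticity (x, y) to XYZ with Y normalized to 1."""
    return (x / y, 1.0, (1.0 - x - y) / y)


def scaling_matrix(white, ref_white):
    """Assumed S_w: diagonal matrix that maps the device white onto the reference white."""
    return [[ref_white[i] / white[i] if i == j else 0.0 for j in range(3)]
            for i in range(3)]


def matmul(a, b):
    """3x3 matrix product a * b."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(3))


WHITE_D65 = xy_to_XYZ(0.3127, 0.3291)  # chromaticity as quoted in the text
WHITE_D50 = xy_to_XYZ(0.3457, 0.3585)

S_D50 = scaling_matrix(WHITE_D50, WHITE_D65)
M_D50 = matmul(M_D65, S_D50)  # Eq. (6a): Mw = MD65 Sw

# With this matrix the D50 white maps to approximately (1, 1, 1) in XsRGB.
white_rgb = apply(M_D50, WHITE_D50)
```

Using `M_D50` gives the appearance match for a D50 device, while using `M_D65` directly, irrespective of the device white point, gives the absolute match.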
Let (Rw, Gw, Bw) denote the normalized RGB value obtained with the matrix Mw defined in Eq. (6a) for the specific white point. The (Rw, Gw, Bw) value is used to do the appearance match and is called the appearance RGB value. When the absolute match is needed, the RGB values (R0, G0, B0) are used by using the matrix MD65 defined in Eq. (3a), which is called the absolute RGB value. The absolute RGB value is obtained from the appearance RGB value by the following equation:
The reverse relation is obtained as:
Since the XsRGB space is directly linked to CIE XYZ space, it is possible to produce an XsRGB measuring device. Such a device may be produced by adding the matrix conversion routines to existing colorimeters. XsRGB values may then be measured directly from the device. The device may produce both the appearance RGB values and the absolute RGB values.
The default XsRGB space is the case of D65 that is linked to sRGB. Since there is no translational part, Eq. (1a) with M=MD65 can be written with a 3×3 matrix as:
Allowing each component to range from −4 to 4 covers a range larger than that covered by the XYZ values, where X, Y, and Z denote 1931 CIE XYZ values with Y normalized to 1 instead of 100. Equation (10) provides one embodiment of a floating point format for XsRGB. When the 16 bit version of XsRGB is utilized, a signed 16 bit integer is used and 8192 (=2^13) is interpreted as 1 in the normalized value. Hence, the lowest 13 bits are used for the decimal portion.
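The 16 bit representation can be illustrated with a short fixed-point sketch. The function names are hypothetical, and saturating at the limits of a signed 16 bit integer is an assumption, since the text does not say how out-of-range values are handled. With 1 sign bit and 13 fractional bits, 2 bits remain for the integer part, which is what yields the −4 to 4 range.

```python
SCALE = 8192  # 2**13: the 16 bit code that represents the normalized value 1.0


def to_fixed16(value):
    """Encode a normalized component as a signed 16 bit code with 13 fractional bits."""
    code = round(value * SCALE)  # nearest integer (ties to even, Python's default)
    # ASSUMPTION: saturate at the int16 limits; the text does not specify this.
    return max(-32768, min(32767, code))


def from_fixed16(code):
    """Decode a signed 16 bit code back to a normalized component."""
    return code / SCALE


one = to_fixed16(1.0)            # 8192
lowest = from_fixed16(-32768)    # -4.0
highest = from_fixed16(32767)    # just under 4.0
```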
Conversion from 16 bit color data for the XsRGB format to an 8 bit sRGB format is as follows: Let C16 and C8 denote one of the components in 16 bit XsRGB format and 8 bit sRGB format, respectively. The relationships are:
C0 ≡ C16/8192 (corresponding to the normalized linear XsRGB)
C8 = 0 for C0 < 0 (C16 < 0)
C8 = 12.92 × C0 × 255 for 0 ≤ C0 < 0.00304 (0 ≤ C16 ≤ 24)
C8 = (1.055 × C0^(1.0/2.4) − 0.055) × 255 for 0.00304 ≤ C0 < 1 (25 ≤ C16 < 8192)
C8 = 255 for C0 ≥ 1 (C16 ≥ 8192) (11)
The above conversions correspond to clipping below 0 and above 8192 of the 16 bit XsRGB when converting to 8 bit sRGB. The clipping routine may be further modified as desired.
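The conversion of Eq. (11) can be sketched in a few lines. The function name is hypothetical, and rounding the result to the nearest integer is an assumption; the equations themselves yield real values and do not state a rounding rule.

```python
SCALE = 8192  # 16 bit code for the normalized value 1.0


def xsrgb16_to_srgb8(c16):
    """Convert one 16 bit linear XsRGB component to an 8 bit sRGB component, per Eq. (11)."""
    c0 = c16 / SCALE  # normalized linear value
    if c0 < 0.0:
        return 0      # clip below 0
    if c0 >= 1.0:
        return 255    # clip at and above 1
    if c0 < 0.00304:
        value = 12.92 * c0 * 255.0                             # linear segment
    else:
        value = (1.055 * c0 ** (1.0 / 2.4) - 0.055) * 255.0    # power-law segment
    return int(round(value))  # ASSUMPTION: round to nearest integer


mid_gray = xsrgb16_to_srgb8(4096)  # linear 0.5
```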
The reverse relationships are:
C16 = 2.4865 × C8 for 0 ≤ C8 ≤ 10
C16 = 8192 × [(C8 + 14.025)/269.025]^2.4 for 11 ≤ C8 ≤ 255. (12)
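The reverse mapping can be sketched in the same way. The function name is hypothetical and the rounding is an assumption. The sketch uses a scale factor of 8192 in the power-law branch, so that C8 = 255 maps to the code 8192 (normalized 1.0), which is what inverting Eq. (11) requires.

```python
def srgb8_to_xsrgb16(c8):
    """Convert one 8 bit sRGB component to a 16 bit linear XsRGB component, per Eq. (12)."""
    if not 0 <= c8 <= 255:
        raise ValueError("c8 must be in the range 0..255")
    if c8 <= 10:
        value = 2.4865 * c8                                  # 8192 / (12.92 * 255) = 2.4865
    else:
        value = 8192.0 * ((c8 + 14.025) / 269.025) ** 2.4    # inverse of the power-law segment
    return int(round(value))  # ASSUMPTION: round to nearest integer


white = srgb8_to_xsrgb16(255)
black = srgb8_to_xsrgb16(0)
```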
The extension of sRGB in accordance with the present invention provides a number of advantages. For example, blending operations with an alpha channel may be directly applied to XsRGB since XsRGB is linear. The XsRGB profiles may easily be obtained from the CIE XYZ profiles. When XsRGB is used for color reference, there is no need to rotate color components to display each component on an 8 bit sRGB device. Only the gamma correction described in Eq. (11) above need be used to convert to 8 bit sRGB. Even without an exact calibration, XsRGB yields satisfactory conversion for output to color monitors. Scanned images may generally be stored in XsRGB format without losing bit depth, since most scanners produce data with not more than 12 bits in each color component.
Color operations defined in the RGB/ARGB colorspace may be extended to the expanded RGB/ARGB colorspace. Three examples of color operations in the expanded RGB/ARGB colorspace include:
With reference to
A number of program modules may be stored on the hard disk, magnetic disk 529, optical disk 531, ROM 524, or RAM 525, including an operating system 535, one or more application programs 536, other program modules 537 and program data 538. A user may enter commands and information into the personal computer 520 through input devices such as a keyboard 540 and pointing device 542. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner or the like. These and other input devices are often connected to the processing unit 521 through a serial port interface 546 that is coupled to the system bus, but may be connected by other interfaces, such as a parallel port, game port or a universal serial bus (USB). A monitor 547 or other type of display device is also connected to the system bus 523 via an interface, such as a video adapter 548. In addition to the monitor, personal computers typically include other peripheral output devices (not shown) such as speakers and printers.
The personal computer 520 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 549. The remote computer 549 may be another personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the personal computer 520, although only a memory storage device 550 has been illustrated in FIG. 5. The logical connections depicted in
When used in a LAN networking environment, the personal computer 520 is connected to the local network 551 through a network interface or adapter 553. When used in a WAN networking environment, the personal computer 520 typically includes a modem 554 or other means for establishing communications over the wide area network 552, such as the Internet. The modem 554, which may be internal or external, is connected to the system bus 523 via the serial port interface 546. In a networked environment, program modules depicted relative to the personal computer 520, or portions thereof, may be stored in the remote memory storage device. It will be appreciated that the network connections shown are exemplary and other means of establishing communications between the computers may be used.
According to one embodiment of the invention, an alpha channel is used to convey transparency information about a color, and the resulting format is referred to as “XsARGB”. For XsARGB, an additional 16 bit component is used to store transparency information. First, a normalized alpha channel A0 will be introduced. The values 0 and 1 of A0 are regarded as transparent and opaque, respectively. The four components (A0, R0, G0, B0) constitute one color value. If each component is multiplied by 8192, we obtain the XsARGB component (A16, R16, G16, B16). This component is called the non-premultiplied XsARGB and (A0, R0, G0, B0) is called the normalized non-premultiplied XsARGB. Premultiplied colors will now be discussed.
When an image B is laid on top of an image A, the resultant image P has the pixel value
p=βb+(1−β)a, (14)
where a and b are a color component of the images A and B at a certain pixel and β is the alpha channel of the image B at that pixel. An image C can be laid on top of the image P to create another image. We want to create the overlay formula so that the final result does not depend on the order (associativity).
C⊕(B⊕A)=(C⊕B)⊕A. (15)
Let D denote the image created by laying C on top of B, and let d and δ denote one of its color components and its alpha value, respectively. Equation (15) can be written for each color component as
γc+(1−γ)[βb+(1−β)a]=δd+(1−δ)a, (16)
where c and γ are one of color components and alpha value of the image C at the pixel of interest. Comparing the coefficients of a, the alpha value of the composited image D must be
δ=β+γ−γβ. (17)
Comparing the rest of equation (16), the value of the color component, d, multiplied by its alpha value, δ, is given by
δd=γc+(1−γ)βb. (18)
Notice that in equation (18) the color components are always multiplied by their alpha values. Hence, it is efficient to work with color components that are already multiplied by their alpha values. Those colors are called premultiplied colors.
When blending operations are processed, it is more efficient to use RGB values which are multiplied by the alpha value. The four components (A0, R′0, G′0, B′0), where R′0=A0R0, G′0=A0G0, and B′0=A0B0, are called normalized premultiplied XsARGB. By multiplying each component by 8192, the XsARGB component (A16, R′16, G′16, B′16) is obtained. This component is called the premultiplied XsARGB.
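The compositing rule of equations (17) and (18) can be checked numerically with a short sketch. The function names and sample pixel values are hypothetical; pixels are held as normalized premultiplied (A0, R′0, G′0, B′0) tuples.

```python
def premultiply(a, r, g, b):
    """Return the normalized premultiplied pixel (A0, A0*R0, A0*G0, A0*B0)."""
    return (a, a * r, a * g, a * b)


def over(top, bottom):
    """Lay premultiplied pixel `top` on premultiplied pixel `bottom`.

    Every channel, alpha included, follows result = top + (1 - top_alpha) * bottom,
    which reproduces Eq. (17) for alpha and Eq. (18) for the color components.
    """
    top_alpha = top[0]
    return tuple(t + (1.0 - top_alpha) * b for t, b in zip(top, bottom))


# Hypothetical sample pixels: an opaque background A and two translucent layers B, C.
A = premultiply(1.0, 0.2, 0.4, 0.6)
B = premultiply(0.5, 0.9, 0.1, 0.3)
C = premultiply(0.25, 0.0, 1.0, 0.5)

left = over(C, over(B, A))    # C on (B on A)
right = over(over(C, B), A)   # (C on B) on A
# The two results agree up to floating point error, illustrating Eq. (15).
```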
According to one embodiment of the invention, the alpha value A0 is allowed to go above 1 and below zero. The meaning of the alpha value can be considered in the following way. When a source image, S, is overlaid on the destination image, D, the resultant image, D′, is obtained as
d′=αs+(1−α)d (19)
where s, d, and d′ are one of the normalized color components of the images S, D, and D′ at the corresponding pixels, respectively, and α is the alpha value of the source image S at the pixel under consideration. When α=0, the resultant image remains the same as the destination image. This case is called transparent. When α=1, the resultant image is the same as the source image. This case is called opaque. When α is between 0 and 1, the resultant image is a mixture of the source and destination images. Usually α is a translucency parameter ranging from transparent (=0) to opaque (=1). However, if equation (19) is studied, it will be noticed that equation (19) is nothing but an interpolation equation. Hence, when α<0 or α>1, equation (19) is still well defined, and it extrapolates the source and destination images.
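A short sketch shows how the same blending formula behaves when α lies outside the range 0 to 1. The function name and sample values are hypothetical.

```python
def blend(s, d, alpha):
    """Eq. (19): d' = alpha * s + (1 - alpha) * d for one normalized color component."""
    return alpha * s + (1.0 - alpha) * d


source, destination = 0.8, 0.2

transparent = blend(source, destination, 0.0)        # equals the destination
opaque = blend(source, destination, 1.0)             # equals the source
super_opaque = blend(source, destination, 1.5)       # extrapolates past the source
super_transparent = blend(source, destination, -0.5) # extrapolates past the destination
```

In this example the extrapolated results fall outside the range 0 to 1 (about 1.1 and −0.1), which is why a format that can hold such values is needed to store them without clipping.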
When an XsRGB (16 bits per component) color is put into memory, bits 0-15 are the blue component, bits 16-31 are the green component, and bits 32-47 are the red component, as illustrated in FIG. 7. Each component is a signed 16 bit integer. The value 8192 is interpreted as 1.0. When it is saved into a file under Intel's little-endian convention, the first two bytes hold the blue component, the next two bytes hold the green component, and the subsequent two bytes hold the red component. This color format is called RGB48.
Whether premultiplied or non-premultiplied, the data structure of XsARGB (16 bits per component) remains the same. When an XsARGB (16 bits per component) color is put into memory, bits 0-15 are the blue component, bits 16-31 are the green component, bits 32-47 are the red component, and bits 48-63 are the alpha component, as illustrated in FIG. 8. Each component is a signed 16 bit integer. The value 8192 is interpreted as 1.0. When it is saved into a file under Intel's little-endian convention, the first two bytes hold the blue component, the next two bytes hold the green component, the subsequent two bytes hold the red component, and the last two bytes hold the alpha component. This color format is called ARGB64.
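The byte layout described above can be sketched with Python's standard `struct` module. The function names are hypothetical; the format string `'<4h'` means four little-endian signed 16 bit integers.

```python
import struct


def pack_argb64(a16, r16, g16, b16):
    """Serialize one ARGB64 pixel: blue first, then green, red, and alpha last."""
    return struct.pack('<4h', b16, g16, r16, a16)


def unpack_argb64(data):
    """Parse 8 bytes back into (a16, r16, g16, b16)."""
    b16, g16, r16, a16 = struct.unpack('<4h', data)
    return a16, r16, g16, b16


# Opaque alpha (1.0), red = -0.5, green = 0.0, blue = 1.5 in normalized terms.
pixel = pack_argb64(8192, -4096, 0, 12288)
as_integer = int.from_bytes(pixel, 'little')  # the same pixel viewed as a 64 bit word
blue_bits = as_integer & 0xFFFF               # bits 0-15
alpha_bits = (as_integer >> 48) & 0xFFFF      # bits 48-63
```

Dropping the alpha component and using `'<3h'` gives the corresponding 6 byte RGB48 layout.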
When the data is stored linearly, even the 16 bit scale may not be sufficient to store detailed shades. When more detail is required for a certain range of each component, a special range can be assigned in each component. When each component is normalized, the format allows each component to vary from −4 to +4. However, most values lie within 0 to 1. Special ranges can be assigned in −4 to −3, −3 to −2, 2 to 3, and 3 to 4 to have different scales from the default XsRGB and XsARGB. A user can specify those ranges and define them in the image header. Hence XsRGB and XsARGB can contain multi-color resolution data.
The variation of color format for lower bit depths can be defined. The possible formats are as follows:
As illustrated in
In addition to the higher accuracies, RGB48 and ARGB64 have extra ranges below 0 and beyond 1 (in terms of the normalized values). This gives significant advantages in image filtering. For example, suppose a series of filters, ƒ1, ƒ2, . . . , ƒn is applied. In each filter, the component value may go below 0 or go above 1. If the color value is truncated in each filter, the correct result may not be obtained in the end. For example,
The correct result of ƒ2∘ƒ1(x)≡ƒ2(ƒ1(x)) should be x itself. However, if ƒ1 is truncated between 0 and 1, the result of ƒ2∘ƒ1(x) becomes 0.5 for all x<0.5.
The existing image format forces the resultant image to be clipped in each filter. Hence, the final result is likely to be incorrect. This will cause artifacts to appear in the image. With the excess range in the color space of the invention, the intermediate results are not clipped. Only the final result is clipped to the range of the output device. To further illustrate this point, FIG. 10(a) illustrates the color data for a red component which has a peak value of 200. If the red component is shifted by some operation by 120, some of the color data will be greater than 255. The color data which has a value greater than 255 is clipped to equal 255 as illustrated in FIG. 10(b). If the red component is then shifted back to its original position, the peak values of the red component are now only equal to 135 as illustrated in FIG. 10(c). However, according to one embodiment of the invention, where the color space is not limited to 0 to 255, the color data greater than 255 is not lost or clipped when the red component is shifted by 120 as illustrated in FIG. 10(d). Since the peak values are not lost using the invention, when the red component is shifted back, all of the original information still remains as illustrated in FIG. 10(e).
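The shift example of FIGS. 10(a)-10(e) can be reproduced numerically. This is a minimal sketch; the sample values are hypothetical apart from the peak of 200 and the shift of 120 taken from the text.

```python
def clip(value, low, high):
    """Clamp value to the closed range [low, high]."""
    return max(low, min(high, value))


def shift_clipped(values, offset, low=0, high=255):
    """Shift every sample and clip to an 8 bit range, as a conventional format would."""
    return [clip(v + offset, low, high) for v in values]


def shift_unclipped(values, offset):
    """Shift every sample with no clipping, as a wide-range intermediate format allows."""
    return [v + offset for v in values]


red = [50, 120, 200, 120, 50]  # hypothetical red samples with a peak of 200

# Conventional path: shift by +120 (the peak clips to 255), then shift back.
clipped = shift_clipped(shift_clipped(red, 120), -120)

# Wide-range path: the intermediate peak of 320 is kept, so shifting back is lossless.
wide = shift_unclipped(shift_unclipped(red, 120), -120)
```

The clipped path ends with a peak of 135 instead of 200, while the wide-range path returns the original samples.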
Another clipping example is cubic interpolation. It approximates the sinc function, and the interpolated value can be negative. The RGB48 and ARGB64 formats can store those negative values without clipping.
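A small sketch shows a cubic interpolant undershooting below zero even though all input samples are non-negative. The Catmull-Rom kernel is used here purely as an illustrative assumption; the text does not name a specific cubic kernel, and the function name and sample values are hypothetical.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Catmull-Rom cubic interpolation between p1 and p2 for 0 <= t <= 1.

    ASSUMPTION: this kernel stands in for the unspecified cubic interpolation.
    """
    return 0.5 * (
        2.0 * p1
        + (-p0 + p2) * t
        + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t * t
        + (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t ** 3
    )


# Four non-negative normalized samples: bright, dark, dark, bright.
value = catmull_rom(1.0, 0.0, 0.0, 1.0, 0.5)  # midpoint between the two dark samples

# Stored as a signed 16 bit code (8192 represents 1.0) the undershoot survives;
# an unsigned 8 bit format would have to clip it to 0.
code16 = round(value * 8192)
```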
In general, the intermediate space is made as large as possible, up to complex numbers. Quantum mechanics uses complex numbers to calculate the wave function. However, the final observable result, the probability, is expressed through the absolute value of the wave function. Unless the complex numbers are kept in the intermediate state, the correct results cannot be obtained. The RGB48 and ARGB64 formats are aimed in this direction, so that the intermediate states can have a very wide range. The general Fourier transform produces complex numbers. However, cosine or sine transforms are usually used to produce real numbers. The RGB48 or ARGB64 format can be used to store the coefficients of a sine or cosine transform, which require signed numbers.
In general, n-dimensional variables ξ1, ξ2, . . . , ξn can be stored in the consecutive slots in the RGB48 and ARGB64. Then, the general transform for each variable can be applied.
F(ω1, …, ωn) = ∫dξ1 … ∫dξn g(ω1, …, ωn; ξ1, …, ξn) ƒ(ξ1, …, ξn) (21a)
If there are inverse transforms, the inverse transformation can be written by
ƒ(ξ1, …, ξn) = ∫dω1 … ∫dωn G(ξ1, …, ξn; ω1, …, ωn) F(ω1, …, ωn) (21b)
In both cases, the integral can be replaced with a summation, Σ, if the variable under consideration is discrete. Examples of those transforms are the Fourier transform, (discrete) cosine and sine transforms, Laplace transforms, etc. In addition, box filters and other filters can also be used.
When the variable is a complex number, it can be saved in two components, one for the real part and the other for the imaginary part. The above equations can be used for the complex number and saved in the RGB48 and ARGB64 formats as well as in their variation forms.
JPEG compression uses DCT (Discrete Cosine Transform), quantization, and Huffman encoding. In the lossy mode, either DCT or quantization is used. Since all of the above algorithms are well defined for negative numbers, JPEG compression algorithms can be used to compress RGB48 and ARGB64 images. Since those image formats are not defined in JPEG itself, the encoder and decoder need to be modified to be able to handle them.
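That the cosine transform is well defined for signed input can be illustrated with a plain one-dimensional DCT round trip. This is a sketch only: it uses an unnormalized DCT-II with its matching inverse rather than JPEG's exact 8×8 scaling, it omits quantization and Huffman coding, and the function names and sample values are hypothetical.

```python
import math


def dct2(samples):
    """Unnormalized DCT-II: X[k] = sum_i x[i] * cos(pi * (i + 0.5) * k / n)."""
    n = len(samples)
    return [
        sum(samples[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def idct2(coeffs):
    """Inverse of dct2 (a scaled DCT-III)."""
    n = len(coeffs)
    return [
        (coeffs[0] / 2.0
         + sum(coeffs[k] * math.cos(math.pi * (i + 0.5) * k / n) for k in range(1, n)))
        * 2.0 / n
        for i in range(n)
    ]


# Eight signed 16 bit XsRGB codes, including negative and above-1.0 values.
row = [-8192, 4096, 12288, -2048, 0, 8192, -4096, 1024]

coefficients = dct2(row)          # signed real coefficients
restored = idct2(coefficients)    # matches `row` up to floating point error
```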
Although the invention has been described in connection with the preferred embodiments, it will be understood by those skilled in the art that many modifications can be made thereto within the scope of the claims which follow. Accordingly, it is not intended that the scope of the invention be limited by the above description, but that it be determined by reference to the claims that follow.
This application is a divisional application of and claims priority from U.S. application Ser. No. 09/631,285, filed Aug. 3, 2000, now U.S. Pat. No. 6,748,107 which claims priority to provisional application Ser. No. 60/147,325, filed Aug. 5, 1999, both of which are herein incorporated by reference.
Number | Name | Date | Kind |
---|---|---|---|
4533952 | Norman, III | Aug 1985 | A |
4649470 | Bernstein et al. | Mar 1987 | A |
4855934 | Robinson | Aug 1989 | A |
5224178 | Madden et al. | Jun 1993 | A |
5631859 | Markstein et al. | May 1997 | A |
5696892 | Redmann et al. | Dec 1997 | A |
5761105 | Goddard et al. | Jun 1998 | A |
5886701 | Chauvin et al. | Mar 1999 | A |
5977977 | Kajiya et al. | Nov 1999 | A |
6005582 | Gabriel et al. | Dec 1999 | A |
6122721 | Goddard et al. | Sep 2000 | A |
6411294 | Furuhashi et al. | Jun 2002 | B1 |
6487308 | Ulichney et al. | Nov 2002 | B1 |
6642962 | Lin et al. | Nov 2003 | B1 |
6795868 | Dingman et al. | Sep 2004 | B1 |
20040126010 | Yamazoe | Jul 2004 | A1 |
Number | Date | Country | |
---|---|---|---|
20040174377 A1 | Sep 2004 | US |
Number | Date | Country | |
---|---|---|---|
60147325 | Aug 1999 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 09631285 | Aug 2000 | US |
Child | 10804162 | | US |