The present invention generally relates to systems and methods for using a three dimensional Look Up Table (LUT) to map one color space to another color space. In one aspect, the invention relates to systems and methods for mapping from one color space to another by dividing the color space into three dimensional cubes.
Digital photography involves capturing color images and then converting those color images into numbers. There are many different ways to turn color images into numbers, and these may be termed “color models”. For example, “RGB” is one color model that relies on three primary colors (red, green, and blue) being mixed together in differing amounts to yield all of the remaining colors. Another color model is known as CMYK, which uses cyan (C), magenta (M), yellow (Y), and black (K), the primary colors of pigment, to create all of the necessary colors.
Each of these different color models can be used to define a specific color space. For example, to create a three-dimensional representation of a color space, the amount of magenta color can be assigned to the representation's X axis, the amount of cyan to its Y axis, and the amount of yellow to its Z axis. This forms a three dimensional color space that has one three dimensional position for each possible color in the color space.
However, it is sometimes necessary to convert from one color space to another. For example, computer monitors typically display colors using an RGB color space, although the image being displayed may have been encoded using a different color space. Many current graphics processors include functions for transforming colors. However, in many cases, color transformations involve complex nonlinear functions, thus making it impractical to transform colors for large images in real time on a per pixel basis. Color look-up tables (LUTs) are used to transform input color signal representations into output color signal representations which can be applied to drive a color display. Such transformations are necessary because color displays commonly have non-linear input to output signal transformation characteristics. Ideally, for a given input value, a LUT generates a corresponding output value that precisely cancels the effects of a display's non-linearity so that colors appearing on the display accurately correspond to the colors defined by the input color signal representations. The LUT may be embedded in a hardware imaging system, or may be implemented via image processing software.
A typical LUT contains representations of different input color signals which are preselected to span the range of input drive signals that may be encountered during normal operation of the display. For each input color signal representation, the LUT also stores either a corresponding output color signal representation or information which can be used to derive a corresponding output color signal representation. As explained below, an input color signal representation is processed by extracting its closest corresponding output color signal representation from the LUT, or by using the information stored in the LUT to derive an output color signal representation which most closely corresponds to the input color signal representation. The extracted or derived output color signal representation is applied to drive the display.
Three dimensional look up tables, or “3D LUTs”, have been used to map one color space on a three dimensional cube to another. For example, a 3D LUT may be used to map an sRGB image to the red, green and blue (RGB) signals required for an OLED panel or other display device that does not have the color gamut of sRGB.
One embodiment is a method of mapping an input color space to an output color space. This embodiment includes receiving an input point corresponding to a first pixel to be converted from an input color space to an output color space; providing a plurality of intermediate tables comprising data coordinates corresponding to corners within a plurality of three dimensional cubes in a lattice and color transformation data associated with each corner, wherein each corner data coordinate is represented in only one table; determining which of the plurality of tables in the lattice contains data for the corners of a cube of interest having the input point; and accessing the color transformation data for the cube of interest using the determined tables.
Another embodiment is an integrated circuit for transforming input color space representations into output color space representations. This embodiment includes a plurality of intermediate tables comprising data coordinates corresponding to corners within a plurality of three dimensional cubes in a lattice and color transformation data associated with each corner, wherein each corner data coordinate is represented in only one table; instructions configured to receive an input point corresponding to a first pixel to be converted from an input color space to an output color space; instructions configured to determine which of the plurality of tables in the lattice contains data for the corners of a cube of interest having the input point; and instructions configured to determine color transformation data for the cube of interest using the determined tables.
Still another embodiment is a system for mapping an input color space to an output color space comprising: means for receiving an input point corresponding to a first pixel to be converted from an input color space to an output color space; means for providing a plurality of intermediate tables comprising data coordinates corresponding to corners within a plurality of three dimensional cubes in a lattice and color transformation data associated with each corner, wherein each corner data coordinate is represented in only one table; means for determining which of the plurality of tables in the lattice contains data for the corners of a cube of interest having the input point; and means for accessing the color transformation data for the cube of interest using the determined tables.
Embodiments of the invention relate to systems and methods for mapping from one color space to another color space using reference to a three dimensional lookup table (3DLUT). In some embodiments, the systems and methods described herein are part of an integrated circuit, such as a graphics processing unit. One non-limiting example of such a graphics processing unit is the Adreno® integrated graphics solution that is part of the Snapdragon® line of chipsets offered by Qualcomm (San Diego, Calif.). In these embodiments, the graphics processing unit may include a memory having stored instructions for carrying out the steps described below.
As described below, the 3DLUT is used to store conversion values from a source color space to a destination color space. As described in more detail below, embodiments relate to systems and methods for representing a source color space by dividing the 3DLUT having values for converting from one color space to another color space into (N−1)×(N−1)×(N−1) basic cubes, where N is a number of grid points in each of the three directions (for red, green and blue in an RGB image). The objective is to use the lookup table to convert into a destination, or address color space. Embodiments of the invention relate to the addressing method that is used to represent the data within the 3DLUTs.
A 3DLUT is based on a three-dimensional cube, with the ability to alter a given single red, green or blue output value based on a single red, green or blue input value change. For a 3DLUT, consider an example with three axes: red (“R”), green (“G”) and blue (“B”). The point where all three color planes intersect in a 3DLUT is considered to be the input point, for which an output point also exists. In an 8 bit storage system, there would be 2^8, or 256, values per color axis, which may range in value from 0 to 255. For an input value in the form of (R,G,B) for a single pixel, each axis may range from 0 to 255, so there may be a total of 256^3, or 16,777,216, different input combinations to cover all possible color combinations for a pixel. The objective of the 3DLUT is to map each of the input values (in this case, approximately 16 million) to an output value. Accordingly, for a single pixel that has a possible 16,777,216 different inputs, a storage system may require approximately 16 megabytes of storage space.
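The arithmetic in this example can be checked with a short sketch. The function name `full_lut_entries` is hypothetical, and the figure of one byte per stored entry is an assumption made only to reproduce the 16 megabyte estimate for a single output component.

```python
def full_lut_entries(bits_per_channel: int) -> tuple:
    """Return (values per axis, total input combinations) for a fully
    populated three dimensional table at the given bit depth per channel."""
    values_per_axis = 2 ** bits_per_channel
    return values_per_axis, values_per_axis ** 3


values_per_axis, combinations = full_lut_entries(8)

# Assumption: one byte is stored per entry, for one output component.
mebibytes = combinations / (1024 * 1024)

print(values_per_axis, combinations, mebibytes)
```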
Embodiments of the invention are directed towards reducing the storage space required by reducing the total number of input combinations that are stored (the approximately 16 million in the example discussed above). As discussed below, a mechanism has been found for optimally storing and retrieving look up data for one output component. It is thus applicable to conversion of any 3 dimensional input space to an output space of any dimension (e.g. 1 for gray scale output, 3 for RGB output, 4 for CMYK output) by duplicating the method for each output component.
Instead of storing every input value along each axis (the 256 values from 0 to 255), only a few points are stored along each axis depending on the 3DLUT size. For each of the stored points along each axis, an output value is known. However, for each input value that falls in a region along an axis that does not have an input value stored, an interpolation process is used to calculate the output value. As described in more detail below, the interpolation process may be performed within a sub-cube (Cube of Interest) of the input space. The number of cube corners required for interpolation depends on the interpolation scheme, but at worst case, it is all 8 corners (for example, in a tri-linear interpolation) that would need to be known.
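As an illustration of the worst case, in which all 8 corners are needed, the following is a minimal sketch of tri-linear interpolation inside one cube. The function names, the `corners[db][dg][dr]` layout, and the use of floating point fractions are assumptions chosen for readability; they are not taken from the described embodiments.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def trilinear(corners, fr: float, fg: float, fb: float) -> float:
    """Interpolate one output component inside a cube.

    corners[db][dg][dr] is the stored output at the corner offset
    (dr, dg, db), each offset being 0 or 1 (hypothetical layout).
    fr, fg, fb are the fractional positions within the cube, in [0, 1].
    """
    # Collapse the red axis: 8 corner values become 4.
    c00 = lerp(corners[0][0][0], corners[0][0][1], fr)
    c01 = lerp(corners[0][1][0], corners[0][1][1], fr)
    c10 = lerp(corners[1][0][0], corners[1][0][1], fr)
    c11 = lerp(corners[1][1][0], corners[1][1][1], fr)
    # Collapse the green axis: 4 values become 2.
    c0 = lerp(c00, c01, fg)
    c1 = lerp(c10, c11, fg)
    # Collapse the blue axis: 2 values become the result.
    return lerp(c0, c1, fb)


# Example cube whose corner values vary linearly along each axis.
example = [[[10 * dr + 100 * dg + 1000 * db for dr in (0, 1)]
            for dg in (0, 1)]
           for db in (0, 1)]
print(trilinear(example, 0.5, 0.25, 0.75))
```

Because the example corner values are linear in each axis, the interpolated result equals the linear function evaluated at the fractional position, which makes the sketch easy to check by hand.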
The 3DLUT may be written in the form of N×N×N, where N is an integer designating the size of the 3DLUT, and the number of known points along each axis. The points along each of the three axes may be connected in a manner to create cubes, with the total number of cubes along each axis totaling (N−1). Therefore, for all three axes, the total number of cubes may total (N−1)^3. Since a cube contains 8 corner points, the total number of cube corner points would be 8*(N−1)^3.
Consider an example of a 3×3×3 3DLUT for an 8 bit storage system (256 points along each axis). For a 3×3×3 3DLUT, each axis of R, G and B may store the points 0, 128 and 255 (storing three or “N” points along each axis instead of storing all points from 0 to 255). The total number of input points would be 3*3*3=27 (rather than 256^3). For each of those 27 input points, an output value may be immediately determined. Also, the number of cubes along each axis would be (N−1), or (3−1), or 2 cubes. The total number of cubes for all three axes would be (3−1)^3=8 cubes. The total number of cube corner points (since each cube contains 8 corners) would be 8*(3−1)^3=64.
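The counts in this example follow directly from N. A small sketch (the function name `lattice_counts` and the dictionary keys are hypothetical) reproduces them and can be reused for other table sizes:

```python
def lattice_counts(n: int) -> dict:
    """Counts for an N x N x N table: stored grid points, cubes, and
    cube corner points when every cube lists its own 8 corners."""
    cubes = (n - 1) ** 3
    return {
        "grid_points": n ** 3,
        "cubes": cubes,
        "corner_points": 8 * cubes,
    }


print(lattice_counts(3))
print(lattice_counts(17))
```

For N=3 this gives 27 grid points, 8 cubes and 64 cube corner points, matching the figures above.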
For every point that falls outside those known 27 points (or inside a particular “cube”), an interpolation process is used to determine the proper output value. Since only 27 points per output component are stored for a 3×3×3 3DLUT, rather than all 16,777,216 points, storage requirements are reduced. For a larger sized 3DLUT (e.g., a 17×17×17 point 3DLUT), more points would be stored, and hence less memory would be saved.
Large 3DLUTs require a lot of memory, and thus the storage requirements on a hardware die for a graphics processor using many 3DLUTs quickly become prohibitive. To interpolate RGB color values within a 3D cube in a single clock cycle, up to eight corners of a particular 3D lattice are required simultaneously. Thus, to map one pixel per clock, up to 8 memory locations need to be addressed simultaneously. In current systems, this may be done by using as many as 8 memories on the die so that each memory is accessed during the same clock read cycle. However, these memories often include a lot of common content, resulting in large areas for implementation. An industry standard 17×17×17 interpolated look up table requires 17^3 (4,913) values, times at worst case 8 memories.
Embodiments of the present invention relate to a system that can use a series of three dimensional lookup tables, wherein each table contains data specific to corners of a lattice of cubes, and each cube covers a particular subset of the entire color space. One example is a 2×2×2 lattice of cubes as shown in
As discussed above, for an N×N×N 3DLUT, there would be a total of (N−1)^3 cubes, each with 8 corners. Thus, the total number of cube corner points in this configuration would be 8*(N−1)^3. However, for an adjacent cube, for example, sitting immediately next to another cube, four of the corner points would be in common between the adjacent cubes. In a conventional 3DLUT system, these common points would be stored separately even though they held the same values, wasting storage space by storing redundant data. Embodiments of the present invention therefore also relate to exploiting the shared common points for adjacent cubes, without re-storing them and thus saving storage space in memory.
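The redundancy described here can be made concrete by enumerating corners. In the sketch below, corners are identified by integer lattice coordinates and a cube by the coordinates of its lowest corner; these conventions and the function names are assumptions for illustration only.

```python
from itertools import product


def cube_corners(cr: int, cg: int, cb: int) -> set:
    """Lattice coordinates of the 8 corners of the cube whose lowest
    corner is (cr, cg, cb)."""
    return {(cr + dr, cg + dg, cb + db)
            for dr, dg, db in product((0, 1), repeat=3)}


def unique_corner_count(n: int) -> int:
    """Number of distinct corner points over all cubes of an
    N x N x N lattice."""
    seen = set()
    for origin in product(range(n - 1), repeat=3):
        seen |= cube_corners(*origin)
    return len(seen)


# Two cubes that sit next to each other along the red axis.
shared = cube_corners(0, 0, 0) & cube_corners(1, 0, 0)
print(len(shared))             # corners common to both cubes
print(unique_corner_count(3))  # distinct corners in a 3x3x3 lattice
print(8 * (3 - 1) ** 3)        # corners if each cube stores its own 8
```

For the 3×3×3 case, storing 8 corners per cube holds 64 entries while only 27 of them are distinct, which is the duplication the shared-corner scheme avoids.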
Embodiments of the invention relate to an optimal way of assigning lattice corner data to a series of 8 tables, and a mechanism for mapping an input value to a lattice cube (Cube of Interest) and then determining which table and index within that table the corner data is located, such that all 8 corners are guaranteed to be in different tables, and all corner data is stored only once so there is no redundancy in the stored corner data. Described herein is a very efficient implementation that is possible if the lattice components span 2^n (two to the power of n) input values. For example, in the case used for illustration, an 8 bit 3D input space is represented as a 9×9×9 lattice. Thus each lattice segment spans 256/(9−1)=32 input values. Therefore, n=5 as each lattice segment covers 2^5=32 input values.
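The relationship between the input bit depth, the lattice size and n can be sketched as follows. The function name `segment_bits` and its error handling are hypothetical; the sketch only checks the power-of-two condition stated above and returns n.

```python
def segment_bits(input_bits: int, lattice_points: int) -> int:
    """Return n such that each lattice segment spans 2**n input values.

    Raises ValueError when the lattice does not divide the input range
    into segments whose span is a power of two.
    """
    segments = lattice_points - 1
    if segments < 1:
        raise ValueError("a lattice needs at least 2 points per axis")
    total_values = 2 ** input_bits
    if total_values % segments != 0:
        raise ValueError("segments do not divide the input range evenly")
    span = total_values // segments
    if span & (span - 1):
        raise ValueError("segment span is not a power of two")
    return span.bit_length() - 1


print(segment_bits(8, 9))   # the 9x9x9 lattice of the illustrated example
```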
Referring to
The lattice component, or sub-cube, containing the input RGB value is hereafter referred to as the Cube of Interest (CoI).
In order to properly convert color values, a means to find a corner's value within a table is needed. The indexing scheme for embodiments of the invention is now described.
The input space can be considered to be subdivided into three levels. The smallest are the sub-cubes which span, as described earlier, 2^n input values along each axis. The middle level is the 2×2×2 sub-lattices, each consisting of the eight sub-cubes described previously. The top most, or largest level, is the assemblage of 2×2×2 sub-lattices themselves. Each level of subdivision is directly inferred by bits in the input values. The n least significant bits (bits 0 through n−1) identify where within the CoI the actual input lies. For simplicity, these ranges are referred to as Cr, Cg and Cb (“Cx” generically) and can take on the values of 0 through (2^n−1). In the illustrated example, n=5, therefore Cr, Cg and Cb can range from 0 to 31.
The second level, the position of the CoI within the 2×2×2 lattice, corresponds to the (n+1)st bit. These ranges are referred to as Lr, Lg and Lb (“Lx” generically). They can only take on the values of 0 or 1, and therefore Lr, Lg and Lb can identify one of the 8 sub cubes within the 2×2×2 sub lattice. As described earlier, in the illustrated example, this is the 6th bit.
The top level corresponds to the (n+2)nd and more significant bits. These ranges are referred to as SLr, SLg and SLb (“SLx” generically). They can range from 0 to 2^(m−n−1)−1, where m is the bit size of the entire input space. In the illustrated example, m is 8 bits, n is 5 bits, so these ranges can vary from 0 to 3. Thus the entire illustrated input space can include 64 (4^3) 2×2×2 sub-lattices.
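The three levels of subdivision correspond to three bit fields of each input component. A minimal sketch of that split for one channel is given below; the function name `split_channel` and the order of the returned tuple are hypothetical, while the field widths follow the description above (n bits for Cx, one bit for Lx, the remaining m−n−1 bits for SLx).

```python
def split_channel(value: int, m: int = 8, n: int = 5) -> tuple:
    """Split one m-bit input component into (SLx, Lx, Cx).

    Cx  : the n least significant bits, position inside the cube
    Lx  : the next bit, position of the cube in its 2x2x2 sub-lattice
    SLx : the remaining upper bits, position of the sub-lattice
    """
    if not 0 <= value < (1 << m):
        raise ValueError("value does not fit in m bits")
    cx = value & ((1 << n) - 1)
    lx = (value >> n) & 1
    slx = value >> (n + 1)
    return slx, lx, cx


print(split_channel(100))   # 100 = 1*64 + 1*32 + 4
print(split_channel(255))
```

In the illustrated example (m=8, n=5) the three fields are 2 bits, 1 bit and 5 bits wide, so SLx ranges from 0 to 3, Lx is 0 or 1, and Cx ranges from 0 to 31.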
Indexing is achieved by a simple manipulation of SLx and using the results in a computation which is dependent on m-n (but is fixed for a particular implementation).
For each table, an entry value for red, green and blue inputs is determined. Let these 8 values be ar[i], ag[i], and ab[i] where i ranges from 0-7 to identify a particular table.
The values for ar, ag and ab are determined as follows:
These manipulations are specific to the original choice of corner to table mapping. If a different choice is made, the manipulations to the red, green and blue entry values would be swapped.
This manipulation translates the sub-cube coordinates for a corner to the sub-cube whose table contains the actual data.
An index (or address) into each table is calculated. If the index into table i is represented as index[i], then the mechanism is as follows:
index[0]=(AA×ab[0])+(DD×ag[0])+ar[0]
index[1]=(BB×ab[1])+(EE×ag[1])+ar[1]
index[2]=(CC×ab[2])+(EE×ag[2])+ar[2]
index[3]=(BB×ab[3])+(DD×ag[3])+ar[3]
index[4]=(AA×ab[4])+(DD×ag[4])+ar[4]
index[5]=(BB×ab[5])+(EE×ag[5])+ar[5]
index[6]=(CC×ab[6])+(EE×ag[6])+ar[6]
index[7]=(BB×ab[7])+(DD×ag[7])+ar[7]
Where AA through EE are coefficients that depend on m−n−1, but are fixed for any particular instance of the 3DLUT. Let k=2^(m−n−1), the number of 2×2×2 sub-lattices that span each dimension of the input space. In the illustrated example, m=8 and n=5, so k=4.
The values of the coefficients are determined as follows:
AA=(k+1)(k+1)=25 in the illustrated example
BB=(k+1)k=20 in the illustrated example
CC=k×k=16 in the illustrated example
DD=k+1=5 in the illustrated example
EE=k=4 in the illustrated example
These manipulations are simple and synthesize to small logic areas. Again, if the original mapping of corners to tables is different, then the roles of the red, green and blue inputs are swapped.
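The index equations and coefficients can be exercised with a short sketch. Two things in it are assumptions rather than statements from this description: the mapping from corner parity to table number (`TABLE_OF_PARITY`), which is merely one assignment consistent with the coefficients used for each table, and the derivation of the entry values ar, ag and ab by halving the lattice coordinate of a corner. Here k is taken as the number of 2×2×2 sub-lattices per dimension (4 in the illustrated example), and all names are hypothetical.

```python
from itertools import product

M_BITS, N_BITS = 8, 5
K = 2 ** (M_BITS - N_BITS - 1)      # sub-lattices per dimension (4)

AA, BB, CC, DD, EE = (K + 1) * (K + 1), (K + 1) * K, K * K, K + 1, K

# (blue coefficient, green coefficient) for tables 0..7, copied from
# the index equations index[0] .. index[7].
COEFFS = [(AA, DD), (BB, EE), (CC, EE), (BB, DD),
          (AA, DD), (BB, EE), (CC, EE), (BB, DD)]

# ASSUMPTION: table chosen by the parity of the corner's (red, green,
# blue) lattice coordinates. This choice is illustrative only.
TABLE_OF_PARITY = {
    (0, 0, 0): 0, (1, 0, 0): 1, (1, 1, 0): 2, (0, 1, 0): 3,
    (0, 0, 1): 4, (1, 0, 1): 5, (1, 1, 1): 6, (0, 1, 1): 7,
}


def locate_corner(r: int, g: int, b: int) -> tuple:
    """Map a lattice corner (each coordinate in 0..2K) to (table, index)."""
    table = TABLE_OF_PARITY[(r & 1, g & 1, b & 1)]
    # ASSUMPTION: entry values are the lattice coordinates halved.
    ar, ag, ab = r >> 1, g >> 1, b >> 1
    blue_coeff, green_coeff = COEFFS[table]
    return table, blue_coeff * ab + green_coeff * ag + ar


def cube_corner_locations(cr: int, cg: int, cb: int) -> list:
    """(table, index) for the 8 corners of the cube at (cr, cg, cb)."""
    return [locate_corner(cr + dr, cg + dg, cb + db)
            for dr, dg, db in product((0, 1), repeat=3)]


print(cube_corner_locations(3, 0, 7))
```

Under these assumptions every one of the 9×9×9 corners receives a distinct (table, index) pair, and the 8 corners of any cube fall in 8 different tables, which is the property the addressing scheme is intended to guarantee.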
The outputs of the corner logic modules 615 are passed to a hardware mux block 610, which simply steers the correct table value to the correct corner.
The least significant n bits of the Red, Green and Blue inputs represent the position of the input value within the lattice sub-cube. This data, together with the 8 corner values for the sub-cube, is passed to the interpolation unit, which calculates and returns the final output value.
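One way the interpolation unit could combine the n-bit positions with the 8 corner values is integer tri-linear interpolation, sketched below. The choice of tri-linear weighting, the rounding step, the dictionary keyed by corner offset, and the function name are all assumptions; the description does not fix the interpolation scheme.

```python
from itertools import product


def interpolate_fixed_point(corner_values: dict, cr: int, cg: int,
                            cb: int, n: int = 5) -> int:
    """Integer tri-linear interpolation of one output component.

    corner_values maps a corner offset (dr, dg, db), each 0 or 1, to the
    stored value at that corner (hypothetical layout). cr, cg, cb are
    the n least significant bits of the red, green and blue inputs.
    """
    span = 1 << n
    total = 0
    for dr, dg, db in product((0, 1), repeat=3):
        # Weight along each axis: distance from the opposite corner.
        wr = cr if dr else span - cr
        wg = cg if dg else span - cg
        wb = cb if db else span - cb
        total += corner_values[(dr, dg, db)] * wr * wg * wb
    # The 8 weights sum to span**3, so shift by 3n bits, rounding to
    # the nearest integer.
    return (total + (1 << (3 * n - 1))) >> (3 * n)


# Corner values that reproduce the red input: 64 at the low red face
# of the cube and 96 at the high red face.
identity_red = {(dr, dg, db): 64 + 32 * dr
                for dr, dg, db in product((0, 1), repeat=3)}
print(interpolate_fixed_point(identity_red, 7, 12, 30))
```

Because the weights are powers-of-two fractions, the division reduces to a shift, which is one reason a lattice segment spanning 2^n input values suits a hardware implementation.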
The technology is operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well-known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to, personal computers, server computers, hand-held or laptop devices, multiprocessor systems, processor-based systems, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
As used herein, instructions refer to computer-implemented steps for processing information in the system. Instructions can be implemented in software, firmware or hardware and include any type of programmed step undertaken by components of the system.
A processor may be any conventional general purpose single- or multi-chip processor such as a Pentium® processor, a Pentium® Pro processor, an 8051 processor, a MIPS® processor, a Power PC® processor, or an Alpha® processor. In addition, the processor may be any conventional special purpose processor such as a digital signal processor or a graphics processor. The processor typically has conventional address lines, conventional data lines, and one or more conventional control lines.
The system is comprised of various modules as discussed in detail. As can be appreciated by one of ordinary skill in the art, each of the modules comprises various sub-routines, procedures, definitional statements and macros. Each of the modules is typically separately compiled and linked into a single executable program. Therefore, the description of each of the modules is used for convenience to describe the functionality of the preferred system. Thus, the processes that are undergone by each of the modules may be arbitrarily redistributed to one of the other modules, combined together in a single module, or made available in, for example, a shareable dynamic link library.
The system may be used in connection with various operating systems such as Linux®, UNIX® or Microsoft Windows®.
The system may be written in any conventional programming language such as C, C++, BASIC, Pascal, or Java, and run under a conventional operating system. C, C++, BASIC, Pascal, Java, and FORTRAN are industry standard programming languages for which many commercial compilers can be used to create executable code. The system may also be written using interpreted languages such as Perl, Python or Ruby.
Those of skill will further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
The various illustrative logical blocks, modules, and circuits described in connection with the embodiments disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
In one or more example embodiments, the functions and methods described may be implemented in hardware, software, or firmware executed on a processor, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media include both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available media that can be accessed by a computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The foregoing description details certain embodiments of the systems, devices, and methods disclosed herein. It will be appreciated, however, that no matter how detailed the foregoing appears in text, the systems, devices, and methods can be practiced in many ways. As is also stated above, it should be noted that the use of particular terminology when describing certain features or aspects of the invention should not be taken to imply that the terminology is being re-defined herein to be restricted to including any specific characteristics of the features or aspects of the technology with which that terminology is associated.
It will be appreciated by those skilled in the art that various modifications and changes may be made without departing from the scope of the described technology. Such modifications and changes are intended to fall within the scope of the embodiments. It will also be appreciated by those of skill in the art that parts included in one embodiment are interchangeable with other embodiments; one or more parts from a depicted embodiment can be included with other depicted embodiments in any combination. For example, any of the various components described herein and/or depicted in the Figures may be combined, interchanged or excluded from other embodiments.
With respect to the use of substantially any plural and/or singular terms herein, those having skill in the art can translate from the plural to the singular and/or from the singular to the plural as is appropriate to the context and/or application. The various singular/plural permutations may be expressly set forth herein for sake of clarity.
It will be understood by those within the art that, in general, terms used herein are generally intended as “open” terms (e.g., the term “including” should be interpreted as “including but not limited to,” the term “having” should be interpreted as “having at least,” the term “includes” should be interpreted as “includes but is not limited to,” etc.). It will be further understood by those within the art that if a specific number of an introduced claim recitation is intended, such an intent will be explicitly recited in the claim, and in the absence of such recitation no such intent is present. For example, as an aid to understanding, the following appended claims may contain usage of the introductory phrases “at least one” and “one or more” to introduce claim recitations. However, the use of such phrases should not be construed to imply that the introduction of a claim recitation by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim recitation to embodiments containing only one such recitation, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an” (e.g., “a” and/or “an” should typically be interpreted to mean “at least one” or “one or more”); the same holds true for the use of definite articles used to introduce claim recitations. In addition, even if a specific number of an introduced claim recitation is explicitly recited, those skilled in the art will recognize that such recitation should typically be interpreted to mean at least the recited number (e.g., the bare recitation of “two recitations,” without other modifiers, typically means at least two recitations, or two or more recitations). 
Furthermore, in those instances where a convention analogous to “at least one of A, B, and C, etc.” is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., “a system having at least one of A, B, and C” would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). In those instances where a convention analogous to “at least one of A, B, or C, etc.” is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., “a system having at least one of A, B, or C” would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). It will be further understood by those within the art that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase “A or B” will be understood to include the possibilities of “A” or “B” or “A and B.”
While various aspects and embodiments have been disclosed herein, other aspects and embodiments will be apparent to those skilled in the art. The various aspects and embodiments disclosed herein are for purposes of illustration and are not intended to be limiting.