This disclosure relates generally to the field of data processing and, more particularly, to various techniques to reduce the bit-depth required for satisfactory linearization of data, e.g., data that is to be displayed on a monitor or other type of display surface, saved to a file, printed, played as audio, further processed, transmitted, or analyzed, etc.
Gamma adjustment, or, as it is often simply referred to, “gamma,” is the name given to the nonlinear operation commonly used to encode linear luma values. Gamma, γ, may be defined by the following simple power-law expression: Lout=Lin^γ, where the input and output values, Lin and Lout, respectively, are non-negative real values, typically in a predetermined range, e.g., zero to one. In other embodiments, referred to herein as “extended range” embodiments, the values of Lin and Lout may include positive and negative real numbers in a range that is wider than zero to one, e.g., −0.75, −1.25, etc. In the case of extended range embodiments, the power-law expression may be modified as follows: Lout=(Lin/|Lin|)*|Lin|^γ. A gamma value less than one is sometimes called an encoding gamma, and the process of encoding with this compressive power-law nonlinearity is called gamma compression; conversely, a gamma value greater than one is sometimes called a decoding gamma, and the application of the expansive power-law nonlinearity is called gamma expansion. Gamma encoding maps linear data into a more perceptually uniform domain.
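By way of illustration only, the power-law expression and its extended range variant may be sketched in Python as follows. The function name apply_gamma is hypothetical and is not part of this disclosure; the zero-input branch is an assumption added to avoid dividing zero by zero.

```python
import math

def apply_gamma(l_in: float, gamma: float) -> float:
    """Power law Lout = (Lin/|Lin|) * |Lin|^gamma (extended range form).

    For non-negative inputs this reduces to the simple form Lout = Lin^gamma.
    """
    if l_in == 0.0:
        return 0.0  # assumed convention: zero maps to zero
    return math.copysign(abs(l_in) ** gamma, l_in)

# Applying a gamma of 1/2.2 and then a gamma of 2.2 returns the original value.
encoded = apply_gamma(0.5, 1 / 2.2)
decoded = apply_gamma(encoded, 2.2)

# Extended range input: the sign of the input is preserved.
negative_out = apply_gamma(-0.75, 2.2)
```

The sign-preserving form keeps the function odd-symmetric about zero, so negative extended range values are treated the same way as their positive counterparts.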
Another way to think about the gamma characteristic of a system is as a power-law relationship that approximates the relationship between the encoded luma in the system and the actual desired image luminance on whatever the eventual user display device is. Other uses of gamma may include: encoding between the physical world and media; decoding media data to linear space; and converting display linear data to the display's response space. In existing systems, a computer processor or other suitable programmable control device may perform gamma adjustment computations for a particular display device it is in communication with based on the native luminance response (often called the “EOTF,” or electro-optical transfer function) of the display device, the color gamut of the device, and the device's white point (which information may be stored in an ICC profile), as well as the ICC color profile the source content's author attached to the content to specify the content's “rendering intent.” The ICC profile is a set of data that characterizes a color input or output device, or a medium, according to standards promulgated by the International Color Consortium (ICC). ICC profiles may describe the color attributes of a particular device or viewing requirement by defining a mapping between the device source or target color space and a profile connection space (PCS), usually the CIE XYZ color space.
In some embodiments, image values, e.g., red, green, and blue pixel values or luma values, enter a “framebuffer” having come from an application or applications that have already processed the image values to be encoded with a specific implicit gamma. A framebuffer may be defined as a video output device that drives a video display from a memory buffer containing a complete frame of, in this case, image data. The implicit gamma of the values entering the framebuffer can be visualized by looking at the “Framebuffer Gamma Function,” as will be explained further below. Ideally, this Framebuffer Gamma Function is the exact inverse of the display device's “Native Display Response” function, which characterizes the luminance response of the display to input, to yield unity system response. However, because the Native Display Response isn't always exactly the inverse of the Framebuffer Gamma Function, a “Look Up Table” (LUT), sometimes stored and implemented on a video card, may be used to account for the imperfections in the relationship between the encoding gamma and decoding gamma values, as well as the display's particular luminance response characteristics.
As mentioned above, media is usually encoded by a non-linear transfer function that approximates human perception, and is then quantized in the gamma corrected space to efficiently allocate codes (a linear representation may dramatically over-allocate codes to areas near white that human observers may see as mainly the same, and provide too few in areas near black that are more easily observed as being different). This is true for almost any perceptual system including, but not limited to: vision, hearing, sense of touch, smell, taste, etc.
Many common operations on media must be performed in a linear representation to be correct and artifact-free. These include, but are not limited to: scaling, rotating, compositing, and gamut converting. The transfer functions, as defined by human perception, and as commonly used by content providers in practice, are often pure-power functions (as described above), referred to herein as “gamma” functions.
These pure-power gamma functions asymptotically reach a slope of zero near an input value of zero (i.e., pure black for image data). Consequently, for a given quantized precision input (usually described in terms of bit-depth), dramatically more bits are required for the linear output space to meet the requirement that every unique input value produce a unique output value, such that the original signal may be reproduced by applying the inverse gamma function. Banding artifacts are produced if these criteria are not met. For a common image case, input data may be 8-bit quantized with 2.2 gamma. To avoid banding, as many as 17 bits are needed in the linear-space (this number may be verified empirically).
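One possible way to carry out the empirical verification mentioned above is sketched below in Python. The function name min_linear_bits and the round-to-nearest quantization convention are assumptions made for illustration; other rounding conventions may give a slightly different count.

```python
def min_linear_bits(input_bits: int, gamma: float, max_bits: int = 32) -> int:
    """Smallest linear bit-depth at which every input code maps to a unique output code."""
    in_max = (1 << input_bits) - 1
    for bits in range(input_bits, max_bits + 1):
        out_max = (1 << bits) - 1
        # Linearize each input code and quantize it (round to nearest).
        codes = {int((i / in_max) ** gamma * out_max + 0.5) for i in range(in_max + 1)}
        if len(codes) == in_max + 1:  # no two inputs collapsed onto the same output
            return bits
    raise ValueError("no bit-depth up to max_bits gives unique outputs")

bits_needed = min_linear_bits(8, 2.2)  # 8-bit input, 2.2 gamma
```

Under these assumptions the search stops at 17 bits for 8-bit, 2.2 gamma input, because at 16 bits or fewer the first non-zero input code rounds to the same output code as zero.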
The inventors have realized new and non-obvious ways to reduce the bit-depth required to linearize data for the performance of image processing operations without the introduction of banding artifacts—or with significantly reduced banding artifacts. The inventors have also realized new and non-obvious ways to further reduce the bit depth needed before processing, e.g., via the use of stochastic dithering.
Methods, computer readable media, and systems of reducing the number of bits required in the linear domain are described herein. In some embodiments, a piecewise linear function (e.g., a linear segment followed by an offset curve) is substituted for the pure-power function, such that a slope limit of the linear segment is applied to constrain the number of (additional) linear bits required (over the input precision) to a desired number. In some embodiments, the offset curve following the linear segment may be modeled using a pure-power curve. In still other embodiments, the offset curve may be modeled as a second-order, third-order (or other-order) polynomial function that is solved for using numerical methods and one or more predetermined constraints.
Exemplary constraints that may be applied to the process of modeling the piecewise linear transfer function include: 1) the offset curve intersects the point (1,1) (assuming input and output values are scaled to the range 0 to 1); 2) the slope of the linear segment may be limited based on the number of additional bits, i.e., the number of bits needed beyond the input precision/quantization, that the implementation is willing to use in linear space; 3) the linear segment and offset curve are continuous with one another at the input value where they intersect; 4) the slopes of the linear segment and offset curve are continuous with one another at the input value where they intersect; 5) the area under the piecewise linear transfer function curve is the same as what the area under the ideal “pure power” curve would be; and 6) the mean square error (or other quality metric, e.g., a perceptual quality metric) is minimized between the piecewise linear transfer function curve and the ideal “pure power” curve.
Applying these techniques, the additional number of bits required in the linear space may be reduced from nine to four or fewer—depending on the acceptable level of difference between the curves (differences between the modeled piecewise linear functions and the pure-power curve will appear as subtle tone shifts at reasonable levels of bit conservation).
To further reduce bit requirements for linear-space computations (such as scaling), a stochastic dither may be applied preceding a quantization. For instance, with 8-bit, 2.2 gamma input data linearized using the aforementioned piecewise linear function techniques, only 12 bits may be required in linear space (to avoid banding artifacts), versus the 17 bits that would be required if a pure power transfer function were used. If stochastic noise is also added (e.g., centered at the quantization's least significant bit), the signal may further be quantized to, e.g., 10 bits for further linear-space computation without creating objectionable artifacts. Since scalars and other linear-space computations are expensive, any bit reductions save greatly in terms of transistors, space, power, and computational expense.
Thus, in one embodiment disclosed herein, a non-transitory program storage device, readable by a programmable control device, may comprise instructions stored thereon to cause the programmable control device to: receive non-linear encoded input data, wherein the received non-linear encoded input data has a first quantized bit-depth; determine a first transfer function; and transform the received non-linear encoded input data into linear output data having a second quantized bit-depth according to the first transfer function, wherein the first transfer function comprises a piecewise linear function, the piecewise linear function defined by a first linear segment followed, after a first input value, by an offset curve, wherein the first linear segment is continuous with the offset curve at the first input value, and wherein the slopes of the first linear segment and the offset curve at the first input value are the same.
In still other embodiments, the techniques described herein may be implemented in apparatuses and/or systems, such as electronic devices having memory and programmable control devices.
This disclosure pertains to systems, methods, and computer readable media for reducing the number of bits required in the linear domain for performing operations on encoded input data without producing noticeable artifacts. The techniques disclosed herein are applicable to any number of electronic devices, such as: digital cameras, digital video cameras, mobile phones, personal data assistants (PDAs), portable music players, monitors, televisions, and, of course, desktop, laptop, and tablet computer displays.
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the inventive concept. As part of this description, some of this disclosure's drawings represent structures and devices in block diagram form in order to avoid obscuring the invention. In the interest of clarity, not all features of an actual implementation are described in this specification. Moreover, the language used in this disclosure has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter, resort to the claims being necessary to determine such inventive subject matter. Reference in this disclosure to “one embodiment” or to “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one implementation of the invention, and multiple references to “one embodiment” or “an embodiment” should not be understood as necessarily all referring to the same embodiment.
It will be appreciated that, in the development of any actual implementation (as in any development project), numerous decisions must be made to achieve the developers' specific goals (e.g., compliance with system- and business-related constraints), and that these goals may vary from one implementation to another. It will also be appreciated that such development efforts might be complex and time-consuming, but would nevertheless be a routine undertaking for those of ordinary skill in the design of an implementation of image processing systems having the benefit of this disclosure.
Referring now to
Information relating to the source content 100 and source profile 102 may be sent to viewer 116's device containing the system 112 for performing gamma adjustment utilizing a LUT 110. Viewer 116's device may comprise, for example, a mobile phone, PDA, portable music player, monitor, television, or a laptop, desktop, or tablet computer. Upon receiving the source content 100 and source profile 102, system 112 may perform a color adaptation process 106 on the received data, e.g., utilizing the COLORSYNC® framework. (COLORSYNC® is a registered trademark of Apple Inc.) COLORSYNC® provides several different methods of doing gamut mapping, i.e., color matching across various color spaces. For instance, perceptual matching tries to preserve as closely as possible the relative relationships between colors, even if all the colors must be systematically distorted in order to get them to display on the destination device.
Once the color profiles of the source and destination have been appropriately adapted, image values may enter the framebuffer 108. In some embodiments, the image values entering framebuffer 108 will already have been processed and have a specific implicit gamma, i.e., the Framebuffer Gamma function, as will be described later in relation to
System 112 may then utilize a LUT 110 to perform a so-called “gamma adjustment process.” LUT 110 may comprise a two-column table of positive, real values spanning a particular range, e.g., from zero to one. The first column values may correspond to an input image value, whereas the second column value in the corresponding row of the LUT 110 may correspond to an output image value that the input image value will be “transformed” into before ultimately being displayed on display 114. LUT 110 may be used to account for the imperfections in the display 114's luminance response curve, also known as a transfer function, or “EOTF.” In other embodiments, a LUT may have separate channels for each primary color in a color space, e.g., a LUT may have Red, Green, and Blue channels in the sRGB color space.
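As a minimal sketch of how a two-column LUT of this kind might be applied, the following Python example maps an input image value to an output image value. The function name apply_lut, the table contents, and the use of linear interpolation between rows are all assumptions for illustration; the disclosure does not specify how values falling between rows are handled.

```python
from bisect import bisect_right

def apply_lut(lut, value):
    """Map an input value to an output value using a two-column LUT.

    lut is a list of (input, output) rows sorted by input. Values between
    rows are linearly interpolated (an assumption, see the text above).
    """
    inputs = [row[0] for row in lut]
    if value <= inputs[0]:
        return lut[0][1]
    if value >= inputs[-1]:
        return lut[-1][1]
    hi = bisect_right(inputs, value)          # first row whose input exceeds value
    (x0, y0), (x1, y1) = lut[hi - 1], lut[hi]
    return y0 + (y1 - y0) * (value - x0) / (x1 - x0)

# Hypothetical 5-row LUT spanning zero to one.
example_lut = [(0.0, 0.0), (0.25, 0.2), (0.5, 0.45), (0.75, 0.72), (1.0, 1.0)]
out = apply_lut(example_lut, 0.6)
```

A LUT with separate Red, Green, and Blue channels could be handled the same way by keeping one such table per channel.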
As mentioned above, in some embodiments, the goal of this gamma adjustment system 112 is to have an overall 1.0 gamma boost applied to the content that is being displayed on the display device 114. An overall 1.0 gamma boost corresponds to a linear relationship between the input encoded luma values and the output luminance on the display device 114. Ideally, an overall 1.0 gamma boost will correspond to the source author's intended look of the displayed content.
Referring now to
The x-axis of Native Display Response Function 202 represents input image values spanning a particular range, e.g., from zero to one. The y-axis of Native Display Response Function 202 represents output image values spanning a particular range, e.g., from zero to one. In theory, systems in which the decoding gamma is the inverse of the encoding gamma should produce the desired overall 1.0 gamma boost. However, this system does not take into account the effect on the viewer due to ambient light in the environment around the display device. Thus, the desired overall 1.0 gamma boost may only be achieved in certain ambient lighting environment conditions.
Referring now to
Referring now to
As mentioned above, because of the nature and shape of gamma curves, for a given quantized precision input, e.g., 8-bit input, dramatically more bits, e.g., 17 bits total, are required for the linear output space to meet the requirement that every unique input value produce a unique output value, such that the original signal may be reproduced by the inverse gamma function. Thus, it would be beneficial to be able to reduce the number of bits needed to encode the gamma function. Since scalars and other linear-space computations are expensive, any bit reductions save greatly in terms of transistors, space, power, and computational expense.
Referring now to
The number of bits required in linear space is bounded between:
((1/(2^inputbits−1))^gamma)*2^linearbits=0.5
and
((1/(2^inputbits−1))^gamma)*2^linearbits=1.0.
These equations have been determined by analyzing what occurs at the first quantized step of the curve. Because the signal is steepest at that point, it may be used to characterize the number of bits required.
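The two bounding equations above may be solved for linearbits directly. The following Python sketch does so; the function name linear_bits_bounds is hypothetical and is used here only for illustration.

```python
import math

def linear_bits_bounds(input_bits: int, gamma: float) -> tuple[float, float]:
    """Solve first_step * 2^linearbits = 0.5 and = 1.0 for linearbits."""
    # Linear output value produced by the first non-zero input code.
    first_step = (1.0 / (2 ** input_bits - 1)) ** gamma
    lower = math.log2(0.5 / first_step)
    upper = math.log2(1.0 / first_step)
    return lower, upper

lower, upper = linear_bits_bounds(8, 2.2)  # 8-bit input, 2.2 gamma
```

For 8-bit, 2.2 gamma input, the bounds evaluate to roughly 16.6 and 17.6 bits, which brackets the 17-bit figure discussed above.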
Thus, according to one embodiment, a method to reduce the number of bits required in the linear domain comprises substituting a piecewise linear transfer function (e.g., a line segment followed by an offset curve, as is shown in
According to some embodiments, one or more of the following constraints may be optimized over in order to generate a piecewise linear transfer function defined as:
y(x)=c*x, where x<d
y(x)=(a*x+b)^curve_gamma, where x>=d.
Exemplary Constraints for Determining the Piecewise Linear Transfer Function
Constraint 1.) The piecewise linear transfer function intersects the coordinates (1,1). In other words: 1=(a+b)^curve_gamma.
Constraint 2.) The slope of the linear segment is limited according to the following equation: log2(1/c)=additional_bits.
Constraint 3.) The linear segment and the offset curve are continuous with one another where they intersect. In other words: c*d=(a*d+b)^curve_gamma.
Constraint 4.) The linear segment and the offset curve have equal slopes where they intersect. In other words: diff(c*x, x)=diff((a*x+b)^curve_gamma, x) at x=d, i.e., c=curve_gamma*a*(a*d+b)^(curve_gamma−1).
Constraint 5.) The area under the piecewise linear transfer function is optimized to be equal or substantially equal to the area under an ideal curve (i.e., pure power function). In other words: integrate(x^gamma, x, 0, 1)=integrate(c*x, x, 0, d)+integrate((a*x+b)^curve_gamma, x, d, 1).
Constraint 6.) The mean square error is minimized between the piecewise linear transfer function and the ideal curve (i.e., pure power function).
Constraint 7.) Other error metrics, possibly perceptually based, are applied to correct for differences in human perception of visual and/or auditory signals.
As mentioned above, by applying these techniques, the additional number of bits required in the linear space may be reduced from nine to four or fewer—depending on the acceptable level of difference between the curves (differences between the modeled piecewise linear functions and the pure-power curve will appear as subtle tone shifts at reasonable levels of bit conservation).
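As one hedged illustration of how Constraints 1 through 4 might be solved together, the Python sketch below takes the number of additional bits and a value of curve_gamma as given and returns a, b, c, and d. The function names, the substitution u=a*d+b, and the use of bisection are choices made for this sketch only; curve_gamma is treated as a free parameter that could be tuned further against Constraints 5 through 7. The sketch assumes curve_gamma is greater than one and additional_bits is greater than zero.

```python
def solve_piecewise(additional_bits: float, curve_gamma: float):
    """Return (a, b, c, d) for y = c*x (x < d), y = (a*x + b)^curve_gamma (x >= d).

    Uses Constraint 2 (slope limit), Constraint 1 (passes through (1,1)),
    Constraint 3 (continuity at d) and Constraint 4 (equal slopes at d).
    """
    p = curve_gamma
    c = 2.0 ** (-additional_bits)       # Constraint 2: log2(1/c) = additional_bits
    # With u = a*d + b, Constraints 1, 3 and 4 reduce to
    # (p - (p - 1)*u) * u^(p - 1) = c, whose left side increases with u on [0, 1].
    lo, hi = 0.0, 1.0
    for _ in range(200):                # bisection on u
        u = 0.5 * (lo + hi)
        if (p - (p - 1.0) * u) * u ** (p - 1.0) < c:
            lo = u
        else:
            hi = u
    u = 0.5 * (lo + hi)
    b = u * (p - 1.0) / p
    a = 1.0 - b                         # Constraint 1: a + b = 1
    d = u / (a * p)
    return a, b, c, d

def piecewise(x, a, b, c, d, curve_gamma):
    """Evaluate the piecewise linear transfer function at x."""
    return c * x if x < d else (a * x + b) ** curve_gamma

a, b, c, d = solve_piecewise(additional_bits=4, curve_gamma=2.4)
```

With four additional bits, the linear segment has a slope of 1/16, so the first input step of 8-bit data remains resolvable in 12 linear bits.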
Polynomial Approximations for the Offset Curve
Further, piecewise linear/polynomial approximations may be solved for (instead of the piecewise linear/pure-power functions described above) for the piecewise linear transfer function. Cubic, or even second-order, polynomial approximations may be solved for to meet one or more of the constraints enumerated above, while still providing good approximations of the ideal pure-power function and requiring far fewer computational resources to compute, or a smaller table to store.
According to one embodiment, a cubic polynomial may be solved for of the form:
y(x)=c*x, where x<d
y(x)=k3*x^3+k2*x^2+k1*x+k0, where x>=d.
If the above mentioned constraints that: 1) y(0)=0; 2) y(1)=1; 3) y(d)=d*c; 4) y′(d)=c; and 5) the area under y(x) is equal to the area under an ideal pure power function having a gamma value, g, are also applied in the context of the cubic polynomial approximation of the offset curve, then additional constraints may be implied that:
k3+k2+k1+k0=1; (from constraint 2) (1)
k3*d^3+k2*d^2+k1*d+k0=d*c; (from constraint 3) (2)
3*k3*d^2+2*k2*d+k1=c; (from constraint 4) and (3)
(12*(1−d)*k0+6*(1−d^2)*k1+4*(1−d^3)*k2+3*(1−d^4)*k3)/12+(c*d^2/2)=1/(g+1) (from constraint 5). (4)
Solving the above system of equations for k3, k2, k1, and k0 yields:
The preferred value of d may then be determined numerically, e.g., by minimizing the maximum deviation between the piecewise linear transfer function having the cubic polynomial offset curve and the ideal pure-power function with gamma value, g. Once d is known, the value of c may be solved for trivially by plugging the solved-for value of d into the solved cubic polynomial offset curve.
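The four equations implied by the constraints are linear in k3, k2, k1, and k0, so they may also be solved numerically for any given c, d, and g. The Python sketch below does this and then scans candidate values of d for the smallest maximum deviation from the pure-power curve. It is an illustration under stated assumptions rather than the solution referred to above: the function names are hypothetical, c is fixed in advance from an assumed slope limit of four additional bits, and the candidate grid for d is arbitrary.

```python
def solve_linear(m, rhs):
    """Solve a small dense system m * k = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(m, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def cubic_offset(c, d, g):
    """Return [k3, k2, k1, k0] satisfying the four equations for given c, d, g."""
    m = [
        [1.0, 1.0, 1.0, 1.0],                               # y(1) = 1
        [d**3, d**2, d, 1.0],                               # y(d) = d*c
        [3*d**2, 2*d, 1.0, 0.0],                            # y'(d) = c
        [(1 - d**4)/4, (1 - d**3)/3, (1 - d**2)/2, 1 - d],  # area of cubic over [d, 1]
    ]
    rhs = [1.0, d*c, c, 1.0/(g + 1) - c*d**2/2]
    return solve_linear(m, rhs)

def max_deviation(c, d, g, k, samples=1024):
    """Largest |y(x) - x^g| over a uniform grid on [0, 1]."""
    worst = 0.0
    for i in range(samples + 1):
        x = i / samples
        y = c*x if x < d else ((k[0]*x + k[1])*x + k[2])*x + k[3]
        worst = max(worst, abs(y - x**g))
    return worst

c, g = 2.0**-4, 2.2   # assumed: slope limit of 4 additional bits, 2.2 gamma
candidates = [i / 1000 for i in range(5, 500)]
d = min(candidates, key=lambda t: max_deviation(c, t, g, cubic_offset(c, t, g)))
k3, k2, k1, k0 = cubic_offset(c, d, g)
```

A finer grid, or a one-dimensional minimizer, could be substituted for the coarse scan over d if a more precise breakpoint is needed.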
As mentioned above, second order polynomial approximations (as well as higher-order polynomial approximations) for the offset curve may also be numerically determined using the constraints and methods described above with respect to a third-order polynomial approximation. The choice of what order polynomial to use for a given implementation may depend, e.g., on the system's processing constraints, tolerance for error, and/or tolerance for artifacts in the resulting (i.e., after inversion back to non-linear space) data.
Stochastic Dithering
To further reduce bit requirements for linear-space computations (such as scaling), a stochastic dither may be applied preceding a quantization. For instance, with 8-bit, 2.2 gamma input data linearized using the aforementioned piecewise linear function techniques, only 12 bits may be required in linear space (to avoid banding artifacts), versus the 17 bits that would be required if a pure power transfer function were used.
If stochastic noise is also added (e.g., centered at the quantization's least significant bit), the signal may further be quantized to, e.g., 10 bits for further linear-space computation without creating objectionable artifacts. Adding appropriate noise to the signal (e.g., noise having a triangular distribution centered at the quantization level, etc.) allows for quantization to fewer bits, while preserving the original signal without introducing banding artifacts. Since scalars and other linear-space computations are expensive, any bit reductions save greatly in terms of transistors, space, power, and computational expense. In the examples described above, a linear-space scalar operation may be reduced from requiring 17 bits of precision to just 10 bits, a dramatic savings!
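The effect of dithering before quantization may be illustrated with the following Python sketch, which requantizes a 12-bit linear value to 10 bits with and without noise. The function name quantize, the choice of triangular noise spanning plus or minus one output least significant bit, and the sample count are assumptions made for this example only.

```python
import math
import random

def quantize(value: float, src_bits: int, dst_bits: int, rng=None) -> int:
    """Requantize a src_bits code value to dst_bits, optionally with triangular dither."""
    scaled = value * ((1 << dst_bits) - 1) / ((1 << src_bits) - 1)
    if rng is not None:
        # Sum of two uniform variables: triangular noise spanning +/- 1 output LSB.
        scaled += rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    code = math.floor(scaled + 0.5)                     # round to nearest code
    return min(max(code, 0), (1 << dst_bits) - 1)       # clamp to the valid range

rng = random.Random(0)
value_12bit = 1203.0                                    # falls between two 10-bit codes
exact = value_12bit * 1023 / 4095                       # ideal (fractional) 10-bit value
plain = quantize(value_12bit, 12, 10)                   # always lands on the same code
dithered = [quantize(value_12bit, 12, 10, rng) for _ in range(20000)]
mean_dithered = sum(dithered) / len(dithered)           # averages toward the ideal value
```

Without dither, every sample of the value lands on the same 10-bit code, which is what produces visible bands across smooth gradients. With dither, the quantization error is spread as noise and the local average tracks the original signal.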
Referring now to
Referring now to
Referring now to
Processor 905 may be any suitable programmable control device capable of executing instructions necessary to carry out or control the operation of the many functions performed by device 900 (e.g., the linearization and/or processing of images in accordance with operations in any one or more of the Figures). Processor 905 may, for instance, drive display 910 and receive user input from user interface 915, which can take a variety of forms, such as a button, keypad, dial, a click wheel, keyboard, display screen and/or a touch screen. Processor 905 may be a system-on-chip such as those found in mobile devices and include a dedicated graphics processing unit (GPU). Processor 905 may be based on reduced instruction-set computer (RISC) or complex instruction-set computer (CISC) architectures or any other suitable architecture and may include one or more processing cores. Graphics hardware 920 may be special purpose computational hardware for processing graphics and/or assisting processor 905 in processing graphics information. In one embodiment, graphics hardware 920 may include a programmable graphics processing unit (GPU).
Sensor and camera circuitry 950 may capture still and video images that may be processed to generate images, at least in part, by video codec(s) 955 and/or processor 905 and/or graphics hardware 920, and/or a dedicated image processing unit incorporated within circuitry 950. Images so captured may be stored in memory 960 and/or storage 965. Memory 960 may include one or more different types of media used by processor 905, graphics hardware 920, and image capture circuitry 950 to perform device functions. For example, memory 960 may include memory cache, read-only memory (ROM), and/or random access memory (RAM). Storage 965 may store media (e.g., audio, image and video files), computer program instructions or software, preference information, device profile information, and any other suitable data. Storage 965 may include one or more non-transitory storage media including, for example, magnetic disks (fixed, floppy, and removable) and tape, optical media such as CD-ROMs and digital video disks (DVDs), and semiconductor memory devices such as Electrically Programmable Read-Only Memory (EPROM), and Electrically Erasable Programmable Read-Only Memory (EEPROM). Memory 960 and storage 965 may be used to retain computer program instructions or code organized into one or more modules and written in any desired computer programming language. When executed by, for example, processor 905, such computer program code may implement one or more of the methods described herein.
It is to be understood that the above description is intended to be illustrative, and not restrictive. The material has been presented to enable any person skilled in the art to make and use the invention as claimed and is provided in the context of particular embodiments, variations of which will be readily apparent to those skilled in the art (e.g., some of the disclosed embodiments may be used in combination with each other). In addition, it will be understood that some of the operations identified herein may be performed in different orders. The scope of the invention therefore should be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled. In the appended claims, the terms “including” and “in which” are used as the plain-English equivalents of the respective terms “comprising” and “wherein.”
Number | Date | Country |
---|---|---|
61909215 | Nov 2013 | US |