Many types of computer systems include display devices to display images, video streams, and data. Accordingly, these systems typically include functionality for generating and/or manipulating images and video information. In digital imaging, the smallest item of information in an image is called a “picture element”, more commonly referred to as a “pixel.” To represent a specific color on a typical electronic display, each pixel can have three values, one each for the amounts of red, green, and blue present in the desired color. Some formats for electronic displays may also include a fourth value, called alpha, which represents the transparency of the pixel. This four-value format is commonly referred to as ARGB or RGBA. Another format for representing pixel color is YCbCr, where Y corresponds to the luminance, or brightness, of a pixel and Cb and Cr correspond to two color-difference chrominance components, representing the blue-difference (Cb) and red-difference (Cr).
Luminance is a photometric measure of the luminous intensity per unit area of light travelling in a given direction. Luminance describes the amount of light that is emitted or reflected from a particular area. Luminance indicates how much luminous power will be detected by an eye looking at a surface from a particular angle of view. One unit used to measure luminance is a candela per square meter. A candela per square meter is also referred to as a “nit”.
Based on studies of human vision, a minimum change in luminance is required before humans can detect a difference in luminance. For high dynamic range (HDR) content, video frames are typically encoded using a perceptual quantizer electro-optical transfer function (PQ-EOTF) so that adjacent code words are close to the minimum perceivable step in brightness. Typical HDR displays use 10-bit color depth, meaning each color component can range from values of 0-1023. With a 10-bit encoded PQ EOTF, each of the 1024 code words represents a luminance in the range of 0-10,000 nits, but human perception can differentiate more luminance levels over this range than these 1024 levels provide. With 8-bit color depth per component, there are only 256 code words, so each jump in luminance is even more obvious if only 8 bits are used to describe the entire 0-10,000 nit range. When using the PQ-EOTF to encode video frames, an output pixel value of zero represents a minimum luminance of 0 nits and the maximum output pixel value (e.g., 1023 for a 10-bit output value) represents a maximum luminance of 10,000 nits. However, typical displays in use today are not able to reach that brightness level. Accordingly, such displays are not able to represent some of the luminance values that are encoded in the video frame.
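The PQ mapping described above can be sketched numerically. The following is a minimal implementation of the PQ EOTF with the constants published in SMPTE ST 2084; it maps a 10-bit code word to an absolute luminance in nits.

```python
def pq_eotf(code, bit_depth=10):
    """Map an integer code word to absolute luminance in nits
    using the SMPTE ST 2084 perceptual quantizer EOTF."""
    m1 = 2610 / 16384          # ST 2084 constants
    m2 = 2523 / 4096 * 128
    c1 = 3424 / 4096
    c2 = 2413 / 4096 * 32
    c3 = 2392 / 4096 * 32
    v = code / (2 ** bit_depth - 1)      # normalize code to [0, 1]
    p = v ** (1 / m2)
    return 10000 * (max(p - c1, 0) / (c2 - c3 * p)) ** (1 / m1)
```

With this curve, code word 0 yields 0 nits and code word 1023 yields 10,000 nits, matching the endpoints described above.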
The advantages of the methods and mechanisms described herein may be better understood by referring to the following description in conjunction with the accompanying drawings, in which:
In the following description, numerous specific details are set forth to provide a thorough understanding of the methods and mechanisms presented herein. However, one having ordinary skill in the art should recognize that the various implementations may be practiced without these specific details. In some instances, well-known structures, components, signals, computer program instructions, and techniques have not been shown in detail to avoid obscuring the approaches described herein. It will be appreciated that for simplicity and clarity of illustration, elements shown in the figures have not necessarily been drawn to scale. For example, the dimensions of some of the elements may be exaggerated relative to other elements.
Various systems, apparatuses, and methods for implementing an effective electro-optical transfer function for limited luminance range displays are disclosed herein. A processor (e.g., graphics processing unit (GPU)) detects a request to encode pixel data to be displayed. The processor also receives an indication of the effective luminance range of a target display. In response to receiving the indication, the processor encodes the pixel data in a format which maps to the effective luminance range of the target display. In other words, the format has a lowest output pixel value which maps to the minimum luminance value able to be displayed by the target display, and the format has a highest output pixel value which maps to the maximum luminance value able to be displayed by the target display.
In one implementation, a processor receives pixel data in a first format which has one or more output pixel values which map to luminance values outside of the effective luminance range of the target display. Accordingly, these output pixel values are not able to convey any useful information. The processor converts the pixel data from the first format to a second format which matches the effective luminance range of the target display. In other words, the processor rescales the pixel representation curve, such that all values that are transmitted to the target display are values that the target display can actually output. A decoder then decodes the pixel data of the second format and then the decoded pixel data is driven to the target display.
Referring now to
In one implementation, processor 105A is a general purpose processor, such as a central processing unit (CPU). In one implementation, processor 105N is a data parallel processor with a highly parallel architecture. Data parallel processors include graphics processing units (GPUs), digital signal processors (DSPs), field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), and so forth. In some implementations, processors 105A-N include multiple data parallel processors. In one implementation, processor 105N is a GPU which provides a plurality of pixels to display controller 150 to be driven to display 155.
Memory controller(s) 130 are representative of any number and type of memory controllers accessible by processors 105A-N and I/O devices (not shown) coupled to I/O interfaces 120. Memory controller(s) 130 are coupled to any number and type of memory device(s) 140. Memory device(s) 140 are representative of any number and type of memory devices. For example, the type of memory in memory device(s) 140 includes Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), NAND Flash memory, NOR flash memory, Ferroelectric Random Access Memory (FeRAM), or others.
I/O interfaces 120 are representative of any number and type of I/O interfaces (e.g., peripheral component interconnect (PCI) bus, PCI-Extended (PCI-X), PCIE (PCI Express) bus, gigabit Ethernet (GBE) bus, universal serial bus (USB)). Various types of peripheral devices (not shown) are coupled to I/O interfaces 120. Such peripheral devices include (but are not limited to) displays, keyboards, mice, printers, scanners, joysticks or other types of game controllers, media recording devices, external storage devices, network interface cards, and so forth. Network interface 135 is used to receive and send network messages across a network.
In various implementations, computing system 100 is a computer, laptop, mobile device, game console, server, streaming device, wearable device, or any of various other types of computing systems or devices. It is noted that the number of components of computing system 100 varies from implementation to implementation. For example, in other implementations, there are more or fewer of each component than the number shown in
Turning now to
Network 210 is representative of any type of network or combination of networks, including a wireless connection, local area network (LAN), metropolitan area network (MAN), wide area network (WAN), an Intranet, the Internet, a cable network, a packet-switched network, a fiber-optic network, a storage area network, or other type of network. Examples of LANs include Ethernet networks, Fiber Distributed Data Interface (FDDI) networks, and token ring networks. In various implementations, network 210 further includes remote direct memory access (RDMA) hardware and/or software, transmission control protocol/internet protocol (TCP/IP) hardware and/or software, routers, repeaters, switches, grids, and/or other components.
Server 205 includes any combination of software and/or hardware for rendering video/image frames and encoding the frames into a bitstream. In one embodiment, server 205 includes one or more software applications executing on one or more processors of one or more servers. Server 205 also includes network communication capabilities, one or more input/output devices, and/or other components. The processor(s) of server 205 include any number and type (e.g., graphics processing units (GPUs), CPUs, DSPs, FPGAs, ASICs) of processors. The processor(s) are coupled to one or more memory devices storing program instructions executable by the processor(s). Similarly, client 215 includes any combination of software and/or hardware for decoding a bitstream and driving frames to display 250. In one embodiment, client 215 includes one or more software applications executing on one or more processors of one or more computing devices. Client 215 can be a computing device, game console, mobile device, streaming media player, or other type of device.
Referring now to
In various implementations, computing system 300 executes any of various types of software applications. In one implementation, as part of executing a given software application, a host CPU (not shown) of computing system 300 launches kernels to be performed on GPU 305. Command processor 335 receives kernels from the host CPU and issues kernels to dispatch unit 350 for dispatch to compute units 355A-N. Threads within kernels executing on compute units 355A-N read and write data to global data share 370, L1 cache 365, and L2 cache 360 within GPU 305. Although not shown in
Turning now to
Many displays are not able to generate a maximum luminance of 10000 nits. For example, some displays are only able to generate a maximum luminance of 600 nits. Using the curve shown in graph 400, a luminance of 600 nits corresponds to a 10-bit pixel value of 713. This means that for a display that has a maximum luminance of 600 nits, all output pixel values greater than 713 are wasted because these values will result in a luminance output of 600 nits. In another example, other types of displays are only able to generate a maximum luminance of 1000 nits. A pixel value of 768 corresponds to a luminance of 1000 nits, and so for a display that has a maximum luminance output of 1000 nits, all output pixel values greater than 768 are wasted.
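The wasted-code-word calculation above can be reproduced with the inverse of the ST 2084 PQ EOTF, sketched below. The 713 and 768 values cited above are rounded, so the exact inverse lands within a code word or so of them.

```python
def pq_code_for_nits(nits, bit_depth=10):
    """Return the code word that the ST 2084 inverse PQ EOTF
    assigns to an absolute luminance given in nits."""
    m1 = 2610 / 16384          # ST 2084 constants
    m2 = 2523 / 4096 * 128
    c1 = 3424 / 4096
    c2 = 2413 / 4096 * 32
    c3 = 2392 / 4096 * 32
    y = (nits / 10000) ** m1
    v = ((c1 + c2 * y) / (1 + c3 * y)) ** m2   # normalized code in [0, 1]
    return round(v * (2 ** bit_depth - 1))
```

For a 600-nit display this yields a code near 713, so the roughly 310 code words above it map to luminance values the display cannot produce.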
Referring now to
Turning now to
The PQ curve 605 illustrates a typical PQ EOTF encoding which results in wasted code words which map to luminance values that cannot be displayed by limited luminance range displays. PQ curve 605 illustrates the same curve as the dashed-line curve shown in graph 500 (of
For a target display with a maximum luminance of 600 nits, partial PQ curve 610 is utilized to map 10-bit output pixel values to luminance values. For partial PQ curve 610, the maximum 10-bit output pixel value of 1023 maps to a luminance of 600 nits. This mapping results in the entire range of output pixel values generating luminance values that the target display is actually able to display. In one implementation, partial PQ curve 610 is generated by scaling PQ curve 605 by a factor of 50/3 (10,000 nits divided by 600 nits). In other implementations, other similar types of partial PQ curves (e.g., partial PQ curve 615) are generated to map output pixel values to luminance values for displays with other maximum luminance values, such as 1000 nits.
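As a sketch, and assuming the scaling is applied directly to the EOTF output as described, a partial PQ curve can be produced by dividing the standard ST 2084 PQ luminance by 10000/max_nits (i.e., by 50/3 for a 600-nit display):

```python
def partial_pq_eotf(code, max_nits, bit_depth=10):
    """ST 2084 PQ EOTF rescaled so that the maximum code word maps
    to max_nits rather than 10,000 nits (a 'partial' PQ curve)."""
    m1 = 2610 / 16384          # ST 2084 constants
    m2 = 2523 / 4096 * 128
    c1 = 3424 / 4096
    c2 = 2413 / 4096 * 32
    c3 = 2392 / 4096 * 32
    v = code / (2 ** bit_depth - 1)
    p = v ** (1 / m2)
    full = 10000 * (max(p - c1, 0) / (c2 - c3 * p)) ** (1 / m1)
    return full / (10000 / max_nits)   # e.g., divide by 50/3 for 600 nits
```

Under this curve the top code word of 1023 maps to exactly max_nits, so no code words are wasted on luminance the display cannot produce.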
Referring now to
A processor detects a request to generate pixel data for display (block 705). Depending on the implementation, the pixel data is part of an image to be displayed or part of a video frame of a video sequence to be displayed. Also, the processor determines an effective luminance range of a target display (block 710). In one implementation, the processor receives an indication of the effective luminance range of the target display. In other implementations, the processor determines the effective luminance range of the target display using other suitable techniques. In one implementation, the effective luminance range of the target display is specified as a pair of values indicative of a minimum luminance and a maximum luminance able to be generated by the target display.
Next, the processor encodes pixel data using an electro-optical transfer function (EOTF) to match the effective luminance range of the target display (block 715). In one implementation, encoding pixel data to match the effective luminance range of the target display involves mapping a minimum output pixel value (e.g., 0) to a minimum luminance value of the target display and mapping a maximum output pixel value (e.g., 0x3FF in a 10-bit format) to a maximum luminance value of the target display. The output pixel values between the minimum and maximum are then distributed using any suitable perceptual quantizer transfer function or other type of transfer function. The perceptual quantizer transfer function distributes output pixel values between the minimum and maximum output pixel values to optimize for human eye perception. In one implementation, the processor encodes pixel data in between the minimum and maximum values using a scaled PQ EOTF. After block 715, method 700 ends.
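A minimal sketch of the encoding in block 715, assuming a display minimum of 0 nits and a scaled PQ curve as the transfer function; the function name is illustrative, not from the source. It maps an absolute luminance to a code word so that the display's maximum luminance lands on the top code word.

```python
def encode_scaled_pq(nits, max_display_nits, bit_depth=10):
    """Encode an absolute luminance (nits) as a code word under an
    ST 2084 PQ curve scaled so that the top code word corresponds to
    max_display_nits. Assumes 0 <= nits <= max_display_nits."""
    m1 = 2610 / 16384          # ST 2084 constants
    m2 = 2523 / 4096 * 128
    c1 = 3424 / 4096
    c2 = 2413 / 4096 * 32
    c3 = 2392 / 4096 * 32
    y = (nits / max_display_nits) ** m1    # fraction of full scale
    v = ((c1 + c2 * y) / (1 + c3 * y)) ** m2
    return round(v * (2 ** bit_depth - 1))
```

With this encoding, 0 nits maps to code word 0 and the display maximum maps to code word 1023, so the entire output range is usable.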
Turning now to
Then, the processor converts the received pixel data from the first format to a second format which matches the effective luminance range of the target display (block 820). In one implementation, the second format uses the same number of bits per pixel component value as the first format, or fewer. By matching the effective luminance range of the target display, the second format is a more bandwidth-efficient encoding of the pixel data. In one implementation, the second format is based on a scaled PQ EOTF. In other implementations, the second format is any of various other types of formats. Next, the pixel data encoded in the second format is driven to the target display (block 825). After block 825, method 800 ends. Alternatively, the pixel data in the second format is stored or sent to another unit after block 820 rather than being driven to the target display.
Referring now to
If the first format matches the effective luminance range of a target display (conditional block 920, “yes” leg), then the processor keeps the pixel data in the first format (block 925). After block 925, method 900 ends. Otherwise, if the first format does not match the effective luminance range of the target display (conditional block 920, “no” leg), then the processor converts the received pixel data from the first format to a second format which matches the effective luminance range of the target display (block 930). After block 930, method 900 ends.
Turning now to
Referring now to
In one implementation, encoder 1110 is implemented on a computer with a GPU, with the computer connected directly to display device 1120 through an interface such as DisplayPort or high-definition multimedia interface (HDMI). In this implementation, the bandwidth limitations for the video stream sent from encoder 1110 to display device 1120 would be the maximum bit rate of the DisplayPort or HDMI cable. In a bandwidth limited scenario where the video stream is encoded using low bit depth, the encoding techniques described throughout this disclosure can be advantageous.
In various implementations, program instructions of a software application are used to implement the methods and/or mechanisms described herein. For example, program instructions executable by a general or special purpose processor are contemplated. In various implementations, such program instructions are represented by a high level programming language. In other implementations, the program instructions are compiled from a high level programming language to a binary, intermediate, or other form. Alternatively, program instructions are written that describe the behavior or design of hardware. Such program instructions are represented by a high-level programming language, such as C. Alternatively, a hardware design language (HDL) such as Verilog is used. In various implementations, the program instructions are stored on any of a variety of non-transitory computer readable storage mediums. The storage medium is accessible by a computing system during use to provide the program instructions to the computing system for program execution. Generally speaking, such a computing system includes at least one or more memories and one or more processors configured to execute program instructions.
It should be emphasized that the above-described implementations are only non-limiting examples of implementations. Numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.