1. Field of the Invention
The invention relates generally to display systems and, more specifically, to a method and apparatus for performing burst refresh of a self-refreshing display device.
2. Description of the Related Art
Computer systems typically include some sort of display device, such as a liquid crystal display (LCD) device, coupled to a graphics controller. During normal operation, the graphics controller generates video signals that are transmitted to the display device by scanning-out pixel data from a frame buffer based on timing information generated within the graphics controller. Some recently designed display devices have a self-refresh capability, where the display device includes a local controller configured to generate video signals from a static, cached frame of digital video independently from the graphics controller. When in such a self-refresh mode, the video signals are driven by the local controller, thereby allowing portions of the graphics controller to be turned off to reduce the overall power consumption of the computer system. Once in self-refresh mode, when the image to be displayed needs to be updated, control may be transitioned back to the graphics controller to allow new video signals to be generated based on a new set of pixel data.
When in a self-refresh mode, the graphics controller may be placed in a power-saving state such as a deep sleep state. In addition, the main communications channel between a central processing unit (CPU) and the graphics controller may be turned off to conserve energy. When the image needs to be updated, the computer system “wakes-up” the graphics controller and any associated communications channels. The graphics controller may then process the new image data and transmit the processed image data to the display device for display.
One drawback to updating the image is that “waking-up” the graphics controller and associated communications channels may take a significant amount of time. For example, waking up a PCIe bus may take 70-100 ms or more. Such delays introduce latencies between the time the CPU attempts to update an image and the time the processed image is displayed via the display device. When frequently entering and exiting a self-refresh mode, such delays may become distracting to a user of the computer system. Furthermore, the computer system may consume unnecessary amounts of energy in order to initialize the graphics controller and associated communications channels for relatively minor tasks.
As the foregoing illustrates, what is needed in the art is an improved technique for updating the cached frame of video data in a self-refreshing display device.
One embodiment of the present invention sets forth a method for performing burst refresh of a self-refreshing display device. The method includes the steps of causing a display device to enter a self-refresh mode, where the pixels of the display device are driven via video signals that are generated based on pixel data stored in a local memory associated with the display device, configuring an interface that connects a graphics processing unit (GPU) to the display device and has a data transmission rate greater than the refresh rate of the display device, and transmitting pixel data associated with a first frame of video to the display device via the interface, where the display device stores the pixel data associated with the first frame in the local memory. The method also includes the steps of entering a power-saving state after the pixel data associated with the first frame has been transmitted to the display device, exiting the power-saving state prior to a time when the display device begins to scan out from the local memory pixel data associated with a second frame of video, and transmitting the pixel data associated with the second frame of pixel data to the display device via the interface.
Another embodiment of the present invention sets forth a system for performing burst refresh of a self-refreshing display device. The system comprises a graphics processing unit coupled to the display device via an interface. The graphics processing unit is configured to cause the display device to enter a self-refresh mode, where the pixels of the display device are driven via video signals that are generated based on pixel data stored in a local memory associated with the display device, configure the interface that has a data transmission rate greater than the refresh rate of the display device, and transmit pixel data associated with a first frame of video to the display device via the interface, where the display device stores the pixel data associated with the first frame in the local memory. The graphics processing unit is also configured to enter a power-saving state after the pixel data associated with the first frame has been transmitted to the display device, exit the power-saving state prior to a time when the display device begins to scan out from the local memory pixel data associated with a second frame of video, and transmit the pixel data associated with the second frame of pixel data to the display device via the interface.
Yet another embodiment of the present invention sets forth a computer readable medium including instructions that, when executed by a processor, cause the processor to perform the steps of causing a display device to enter a self-refresh mode, where the pixels of the display device are driven via video signals that are generated based on pixel data stored in a local memory associated with the display device, configuring an interface that connects a graphics processing unit (GPU) to the display device and has a data transmission rate that is greater than the refresh rate of the display device, and transmitting pixel data associated with a first frame of video to the display device via the interface, where the display device stores the pixel data associated with the first frame in the local memory. The steps also include entering a power-saving state after the pixel data associated with the first frame has been transmitted to the display device, exiting the power-saving state prior to a time when the display device begins to scan out from the local memory pixel data associated with a second frame of video, and transmitting the pixel data associated with the second frame of pixel data to the display device via the interface.
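By way of illustration only, the timing relationship underlying the foregoing embodiments may be sketched as follows. The sketch assumes a simplified model in which the interface carries one frame in a fraction of the frame period and the GPU needs a fixed wake-up latency before the next burst. The function name, parameters, and example values are hypothetical and are not taken from the disclosure.

```python
def burst_refresh_window(refresh_hz, link_speedup, wake_latency_s):
    """Return (burst_time, sleep_time) in seconds for one frame period.

    Assumed model: the interface moves a frame `link_speedup` times faster
    than the panel scans it out, and the GPU must begin waking
    `wake_latency_s` before the next burst starts.
    """
    if link_speedup <= 1:
        raise ValueError("interface rate must exceed the refresh rate")
    frame_period = 1.0 / refresh_hz
    burst_time = frame_period / link_speedup
    # Time left for the power-saving state after the burst and the wake-up.
    sleep_time = max(0.0, frame_period - burst_time - wake_latency_s)
    return burst_time, sleep_time


# Hypothetical example: 60 Hz panel, interface 4x faster, 1 ms wake-up.
burst, sleep = burst_refresh_window(60, 4, 0.001)
```

Under this model, a faster interface shortens the burst and lengthens the interval in which the GPU and the interface may remain in the power-saving state.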
One advantage of the disclosed technique is that placing the GPU and video interface in a power-saving state reduces the overall power consumption of the system, which extends the battery life of today's mobile devices. The burst refresh technique may result in power savings of 60% to 70% or more when compared to conventional operating modes. Furthermore, operating in burst refresh mode is completely transparent to a viewer watching the displayed video.
So that the manner in which the above recited features of the invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments.
In the following description, numerous specific details are set forth to provide a more thorough understanding of the invention. However, it will be apparent to one of skill in the art that the invention may be practiced without one or more of these specific details. In other instances, well-known features have not been described in order to avoid obscuring the invention.
In one embodiment, the parallel processing subsystem 112 incorporates circuitry optimized for graphics and video processing, including, for example, video output circuitry, and constitutes a graphics processing unit (GPU). In another embodiment, the parallel processing subsystem 112 incorporates circuitry optimized for general purpose processing, while preserving the underlying computational architecture, described in greater detail herein. In yet another embodiment, the parallel processing subsystem 112 may be integrated with one or more other system elements, such as the memory bridge 105, CPU 102, and I/O bridge 107 to form a system on chip (SoC).
It will be appreciated that the system shown herein is illustrative and that variations and modifications are possible. The connection topology, including the number and arrangement of bridges, the number of CPUs 102, and the number of parallel processing subsystems 112, may be modified as desired. For instance, in some embodiments, system memory 104 is connected to CPU 102 directly rather than through a bridge, and other devices communicate with system memory 104 via memory bridge 105 and CPU 102. In other alternative topologies, parallel processing subsystem 112 is connected to I/O bridge 107 or directly to CPU 102, rather than to memory bridge 105. In still other embodiments, I/O bridge 107 and memory bridge 105 might be integrated into a single chip. Large embodiments may include two or more CPUs 102 and two or more parallel processing subsystems 112. The particular components shown herein are optional; for instance, any number of add-in cards or peripheral devices might be supported. In some embodiments, switch 116 is eliminated, and network adapter 118 and add-in cards 120, 121 connect directly to I/O bridge 107.
GPU 240 may be configured to receive graphics primitives from CPU 102 via communications path 113, such as a PCIe bus. GPU 240 processes the graphics primitives to produce a frame of pixel data for display on display device 110 and stores the frame of pixel data in frame buffers 244. In normal operation, GPU 240 is configured to scan out pixel data from frame buffers 244 to generate video signals for display on display device 110. In one embodiment, GPU 240 is configured to generate a digital video signal and transmit the digital video signal to display device 110 via a digital video interface such as an LVDS, DVI, HDMI, or DisplayPort (DP) interface. In another embodiment, GPU 240 may be configured to generate an analog video signal and transmit the analog video signal to display device 110 via an analog video interface such as a VGA or DVI-A interface. In embodiments where communications path 280 implements an analog video interface, display device 110 may convert the received analog video signal into a digital video signal by sampling the analog video signal with one or more analog to digital converters.
As also shown in
SRC 220 is configured to generate video signals for display on LCD device 216 based on pixel data stored in local frame buffers 224. In normal operation, display device 110 drives LCD device 216 based on the video signals received from parallel processing subsystem 112 over communications path 280. In contrast, when display device 110 is operating in a panel self-refresh mode, display device 110 drives LCD device 216 based on the video signals received from SRC 220.
GPU 240 may be configured to manage the transition of display device 110 into and out of a panel self-refresh mode. Ideally, the overall power consumption of computer system 100 may be reduced by operating display device 110 in a panel self-refresh mode. In one embodiment, to cause display device 110 to enter a panel self-refresh mode, GPU 240 may transmit a message to display device 110 using an in-band signaling method, such as by embedding a message in the digital video signals transmitted over communications path 280. In alternative embodiments, GPU 240 may transmit the message using a side-band signaling method, such as by transmitting the message using an auxiliary communications channel. Various signaling methods for signaling display device 110 to enter or exit a panel self-refresh mode are described below in conjunction with
Returning now to
In order to cause display device 110 to exit the panel self-refresh mode, GPU 240 may transmit a similar message to display device 110 using a method similar to that described above in connection with causing display device 110 to enter the panel self-refresh mode. After receiving the message to exit the panel self-refresh mode, display device 110 may be configured to ensure that the pixel locations associated with the video signals generated by GPU 240 are aligned with the pixel locations associated with the video signals generated by SRC 220 currently being used to drive LCD device 216 in the panel self-refresh mode. Once the pixel locations are aligned, display device 110 may transition control for driving LCD device 216 from the video signals generated by SRC 220 to the video signals generated by GPU 240.
The amount of storage required to implement a self-refreshing capability may be dependent on the size of the uncompressed frame of video used to continuously refresh the image on the display device 110. In one embodiment, display device 110 includes a single local frame buffer 224(0) that is sized to accommodate an uncompressed frame of pixel data for display on LCD device 216. The size of frame buffer 224(0) may be based on the minimum number of bytes required to store an uncompressed frame of pixel data for display on LCD device 216, calculated as the result of multiplying the width by the height by the color depth of the native resolution of LCD device 216. For example, frame buffer 224(0) could be sized for an LCD device 216 configured with a WUXGA resolution (1920×1200 pixels) and a color depth of 24 bits per pixel (bpp). In this case, the amount of storage in local frame buffer 224(0) available for self-refresh pixel data caching should be at least 6750 kB of addressable memory (1920*1200*24 bpp; where 1 kilobyte is equal to 1024 or 2^10 bytes).
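The sizing calculation described above may be checked with a short sketch. The function name is hypothetical; the arithmetic follows the width, height, and color depth multiplication given in the text, with one kilobyte taken as 1024 bytes.

```python
def frame_buffer_kb(width, height, bits_per_pixel):
    """Minimum storage for one uncompressed frame, in kB of 1024 bytes."""
    total_bits = width * height * bits_per_pixel
    total_bytes = (total_bits + 7) // 8  # round up to whole bytes
    return total_bytes / 1024


# WUXGA at 24 bpp, as in the example above.
print(frame_buffer_kb(1920, 1200, 24))  # prints 6750.0
```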
In another embodiment, local frame buffer 224(0) may be of a size that is less than the number of bytes required to store an uncompressed frame of pixel data for display on LCD device 216. In such a case, the uncompressed frame of pixel data may be compressed by SRC 220, such as by run length encoding the uncompressed pixel data, and stored in frame buffer 224(0) as compressed pixel data. In such embodiments, SRC 220 may be configured to decode the compressed pixel data before generating the video signals used to drive LCD device 216. In yet other embodiments, GPU 240 may compress the frame of pixel data prior to encoding the compressed pixel data in the digital video signals transmitted to display device 110. For example, GPU 240 may be configured to encode the pixel data using an MPEG-2 format. In such embodiments, SRC 220 may store the compressed pixel data in local frame buffer 224(0) in the compressed format and decode the compressed pixel data before generating the video signals used to drive LCD device 216.
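As a minimal sketch of the run length encoding mentioned above, the following shows one simple way a line of pixel values might be compressed and later decoded before scan-out. The run format (value, count) and the function names are assumptions made for illustration; the disclosure does not specify the encoding used by SRC 220.

```python
def rle_encode(pixels):
    """Run-length encode a sequence of pixel values into [value, count] runs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1      # extend the current run
        else:
            runs.append([p, 1])   # start a new run
    return runs


def rle_decode(runs):
    """Expand [value, count] runs back into the original pixel sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


# Hypothetical line: five white pixels followed by three black pixels.
line = [0xFFFFFF] * 5 + [0x000000] * 3
assert rle_decode(rle_encode(line)) == line
```

Run length encoding is effective only when the image contains long runs of identical pixels, so the compressed size is content dependent.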
Display device 110 may be capable of displaying 3D video data, such as stereoscopic video data. Stereoscopic video data includes a left view and a right view of uncompressed pixel data for each frame of 3D video. Each view corresponds to a different camera position of the same scene captured approximately simultaneously. Some display devices are capable of displaying three or more views simultaneously, such as in some types of auto-stereoscopic displays.
In one embodiment, display device 110 may include a self-refreshing capability in connection with stereoscopic video data. Each frame of stereoscopic video data includes two uncompressed frames of pixel data for display on LCD device 216. Each of the uncompressed frames of pixel data may be comprised of pixel data at the full resolution and color depth of LCD device 216. In such embodiments, local frame buffer 224(0) may be sized to hold one frame of stereoscopic video data. For example, to store uncompressed stereoscopic video data at WUXGA resolution and 24 bpp color depth, the size of local frame buffer 224(0) should be at least 13500 kB of addressable memory (2*1920*1200*24 bpp). Alternatively, local frame buffers 224 may include two frame buffers 224(0) and 224(1), each sized to store a single view of uncompressed pixel data for display on LCD device 216.
In yet other embodiments, SRC 220 may be configured to compress the stereoscopic video data and store the compressed stereoscopic video data in local frame buffers 224. For example, SRC 220 may compress the stereoscopic video data using Multiview Video Coding (MVC) as specified in the H.264/MPEG-4 AVC video compression standard. Alternatively, GPU 240 may compress the stereoscopic video data prior to encoding the compressed video data in the digital video signals for transmission to display device 110.
In one embodiment, display device 110 may include a dithering capability. Dithering allows display device 110 to display more perceived colors than the hardware of LCD device 216 is capable of displaying. Temporal dithering alternates the color of a pixel rapidly between two approximate colors in the available color palette of LCD device 216 such that the pixel is perceived as a different color not included in the available color palette of LCD device 216. For example, by alternating a pixel rapidly between white and black, a viewer may perceive the color gray. In a normal operating state, GPU 240 may be configured to alternate pixel data in successive frames of video such that the perceived colors in the image displayed by display device 110 are outside of the available color palette of LCD device 216. In a self-refresh mode, display device 110 may be configured to cache two successive frames of pixel data in local frame buffers 224. Then, SRC 220 may be configured to scan out the two frames of pixel data from local frame buffers 224 in an alternating fashion to generate the video signals for display on LCD device 216.
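The alternating scan-out described above may be illustrated as follows. The sketch models each cached frame as a list of intensity values and approximates the perceived color as the per-pixel time average, which is a simplifying assumption about human perception. The function names are hypothetical.

```python
from itertools import cycle, islice


def scan_out_alternating(frame_a, frame_b, num_refreshes):
    """Return the cached frames in the order A, B, A, B, ... for each refresh."""
    return list(islice(cycle([frame_a, frame_b]), num_refreshes))


def perceived_levels(frames):
    """Approximate perceived intensity as the per-pixel average over time."""
    n = len(frames)
    return [sum(px) / n for px in zip(*frames)]


white = [255, 255]
black = [0, 0]
frames = scan_out_alternating(white, black, 4)
print(perceived_levels(frames))  # prints [127.5, 127.5]
```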
The panel enable signal VDD may be connected from GPU 240 to the display device 110 to turn on power in display device 110. The backlight enable and backlight pwm signals control the intensity of the backlight in display device 110 during normal operation. However, when the display device 110 is operating in a panel self-refresh mode, control for these signals must be handled by TCON 210 and may be changed by SRC 220 via control signals received over the auxiliary communication channel (Aux). One of skill in the art will recognize that the intensity of the backlight may be controlled by pulse width modulating a signal via the backlight pwm signal (Backlight_PWM). In some embodiments, communications path 280 may also include a frame lock signal (FRAME_LOCK) that indicates a vertical sync in the video signals generated by SRC 220. The FRAME_LOCK signal may be used to resynchronize the video signals generated by GPU 240 with the video signals generated by SRC 220.
The hot-plug detect signal, HPD, may be a signal connected from the display device 110 to GPU 240 for detecting a hot-plug event or for communicating an interrupt request from display device 110 to GPU 240. To indicate a hot-plug event, display device 110 drives HPD high, signaling that display device 110 has been connected to communications path 280. After display device 110 is connected to communications path 280, display device 110 may signal an interrupt request by quickly pulsing the HPD signal low for between 0.5 and 1 millisecond.
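A receiver of the HPD signal might distinguish an interrupt request from other activity by measuring how long the signal is held low. The following sketch encodes only the 0.5 to 1 millisecond window stated above; the function name is hypothetical, and the treatment of pulses outside that window is not addressed by this sketch.

```python
IRQ_MIN_MS = 0.5  # shortest low pulse treated as an interrupt request
IRQ_MAX_MS = 1.0  # longest low pulse treated as an interrupt request


def is_interrupt_pulse(low_duration_ms):
    """Return True if a low pulse on HPD falls in the interrupt-request window."""
    return IRQ_MIN_MS <= low_duration_ms <= IRQ_MAX_MS
```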
The auxiliary channel, Aux, is a low bandwidth, bidirectional half-duplex data communication channel used for transmitting command and control signals from GPU 240 to display device 110 as well as from display device 110 to GPU 240. In one embodiment, messages indicating that display device 110 should enter or exit a panel self-refresh mode may be communicated over the auxiliary channel. On the auxiliary channel, GPU 240 is a master device and display device 110 is a slave device. In such a configuration, data or messages may be sent from display device 110 to GPU 240 using the following technique. First, display device 110 indicates to GPU 240 that display device 110 would like to send traffic over the auxiliary channel by initiating an interrupt request over the hot-plug detect signal, HPD. When GPU 240 detects an interrupt request, GPU 240 sends a transaction request message to display device 110. Once display device 110 receives the transaction request message, display device 110 then responds with an acknowledgement message. Once GPU 240 receives the acknowledgement message, GPU 240 may read one or more register values in display device 110 to retrieve the data or messages over the auxiliary channel.
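The master and slave exchange described above may be modeled in a few lines. This is a sketch under stated assumptions: the class names, the register address, and the string used for the acknowledgement are hypothetical, and the model ignores the electrical and timing details of the auxiliary channel.

```python
class DisplaySink:
    """Slave side of the auxiliary channel (hypothetical model)."""

    def __init__(self):
        self.registers = {}
        self.irq_pending = False

    def post_message(self, reg, value):
        # Store the message, then request service by pulsing HPD low.
        self.registers[reg] = value
        self.irq_pending = True

    def handle_transaction_request(self):
        # Respond to the master's transaction request.
        return "ACK"


class GpuSource:
    """Master side: only the master initiates auxiliary transactions."""

    def service_irq(self, sink, reg):
        if not sink.irq_pending:
            return None                      # no interrupt request seen on HPD
        if sink.handle_transaction_request() != "ACK":
            return None                      # sink did not acknowledge
        sink.irq_pending = False
        return sink.registers.get(reg)       # read the register over Aux


sink, gpu = DisplaySink(), GpuSource()
sink.post_message(0x10, 0xAB)                # 0x10 is a hypothetical register
value = gpu.service_irq(sink, 0x10)
```

Because the slave cannot initiate a transaction, the interrupt request on HPD is the only means by which display device 110 can prompt GPU 240 to read its registers.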
It will be appreciated by those of skill in the art that communications path 280 may implement a different video interface for transmitting video signals between GPU 240 and display device 110. For example, communications path 280 may implement a high definition multimedia interface (HDMI) or a low voltage differential signal (LVDS) video interface such as open-LDI. The scope of the present invention is not limited to an Embedded DisplayPort video interface.
The format of digital video signals 250 enables secondary data packets to be inserted directly into the digital video signals 250 transmitted to display device 110. In one embodiment, the secondary data packets may include messages sent from GPU 240 to display device 110 that request display device 110 to enter or exit a panel self-refresh mode. Such secondary data packets enable one or more aspects of the invention to be realized over the existing physical layer of the eDP interface. It will be appreciated that this form of in-line signaling may be implemented in other packet based video interfaces and is not limited to embodiments implementing an eDP interface.
Secondary data packets may be inserted into digital video signals 250 during the vertical or horizontal blanking periods of the video frame represented by digital video signals 250. As shown in
Control symbols and secondary data packets may be inserted into digital video signals 250 during the horizontal blanking period. For example, a VB-ID symbol is inserted in the first link symbol clock cycle 255(01) after the BS symbol. The VB-ID symbol provides display device 110 with information such as whether the main video stream is in the vertical blanking period or the vertical display period, whether the main video stream is interlaced or progressive scan, and whether the main video stream is in the even field or odd field for interlaced video. Immediately following the VB-ID symbol, a video time stamp (Mvid7:0) and an audio time stamp (Maud7:0) are inserted at link symbol clock cycles 255(02) and 255(03), respectively. Dummy symbols may be inserted during the remainder of the link symbol clock cycles 255(04) during the horizontal blanking period. Each dummy symbol may be a special reserved symbol indicating that the data in that lane during that link symbol clock cycle is dummy data. Link symbol clock cycles 255(04) may have a duration of a number of link symbol clock cycles such that the frame rate of digital video signals 250 over communications path 280 is equal to the refresh rate of display device 110.
A secondary data packet may be inserted into digital video signals 250 by replacing a plurality of dummy symbols during link symbol clock cycles 255(04) with the secondary data packet. A secondary data packet is framed by the special secondary start (SS) and secondary end (SE) framing symbols. Secondary data packets may include an audio data packet, link configuration information, or a message requesting display device 110 to enter or exit a panel self-refresh mode.
The BE framing symbol is inserted in digital video signals 250 to indicate the start of active pixel data for a horizontal line of the current video frame. As shown, pixel data P0 . . . PN has a RGB format with a per channel bit depth (bpc) of 8-bits. Pixel data P0 associated with the first pixel of the horizontal line of video is packed into the first lane 251 at link symbol clock cycles 255(06) through 255(08) immediately following the BE symbol. A first portion of pixel data P0 associated with the red color channel is inserted into the first lane 251 at link symbol clock cycle 255(06), a second portion of pixel data P0 associated with the green color channel is inserted into the first lane 251 at link symbol clock cycle 255(07), and a third portion of pixel data P0 associated with the blue color channel is inserted into the first lane 251 at link symbol clock cycle 255(08). Pixel data P1 associated with the second pixel of the horizontal line of video is packed into the second lane 252 at link symbol clock cycles 255(06) through 255(08), pixel data P2 associated with the third pixel of the horizontal line of video is packed into the third lane 253 at link symbol clock cycles 255(06) through 255(08), and pixel data P3 associated with the fourth pixel of the horizontal line of video is packed into the fourth lane 254 at link symbol clock cycles 255(06) through 255(08). Subsequent pixel data of the horizontal line of video are inserted into the lanes 251-254 in a similar fashion to pixel data P0 through P3. In the last link symbol clock cycle to include valid pixel data, any unfilled lanes may be padded with zeros. As shown, the third lane 253 and the fourth lane 254 are padded with zeros at link symbol clock cycle 255(13).
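The lane packing described above may be sketched as follows for a four-lane link carrying RGB pixel data at 8 bits per channel. The sketch emits only the pixel symbols and omits the BE framing symbol and the blanking symbols; the function name and data layout are assumptions made for illustration.

```python
NUM_LANES = 4


def pack_line(pixels, num_lanes=NUM_LANES):
    """Pack (R, G, B) pixels into lanes, one symbol per color channel.

    Pixel i is placed in lane i % num_lanes and occupies three consecutive
    link symbol clock cycles. Unfilled lanes in the last group are padded
    with zeros.
    """
    lanes = [[] for _ in range(num_lanes)]
    for start in range(0, len(pixels), num_lanes):
        group = pixels[start:start + num_lanes]
        for lane_idx in range(num_lanes):
            if lane_idx < len(group):
                lanes[lane_idx].extend(group[lane_idx])
            else:
                lanes[lane_idx].extend((0, 0, 0))  # zero padding
    return lanes


# Hypothetical six-pixel line: the third and fourth lanes end with padding.
line = [(i, i + 100, i + 200) for i in range(6)]
lanes = pack_line(line)
```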
The sequence of data described above repeats for each horizontal line of pixel data in the frame of video, starting with the top most horizontal line of pixel data. A frame of video may include a number of horizontal lines at the top of the frame that do not include active pixel data for display on display device 110. These horizontal lines comprise the vertical blanking period and may be indicated in digital video signals 250 by setting a bit in the VB-ID control symbol.
In one embodiment, the secondary data packet 260 may include a header and data indicating that the display device 110 should enter or exit a panel self-refresh mode. For example, the secondary data packet 260 may include a reserved header code that indicates that the packet is a panel self-refresh packet. The secondary data packet may also include data that indicates whether display device 110 should enter or exit a panel self-refresh mode.
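For illustration, a panel self-refresh packet of the kind described above may be built and substituted for dummy symbols in a blanking period as sketched below. The header code, the payload values, and the use of strings to stand in for framing symbols are all hypothetical; the disclosure states only that a reserved header code and enter or exit data may be present.

```python
SS, SE, DUMMY = "SS", "SE", "DUMMY"  # placeholders for link symbols
PSR_HEADER = 0x7F                    # hypothetical reserved header code
PSR_ENTER, PSR_EXIT = 0x01, 0x00     # hypothetical payload values


def build_psr_packet(enter):
    """Frame a self-refresh enter/exit message with SS and SE symbols."""
    return [SS, PSR_HEADER, PSR_ENTER if enter else PSR_EXIT, SE]


def insert_packet(blanking_symbols, packet):
    """Replace the first sufficiently long run of dummy symbols with the packet."""
    out = list(blanking_symbols)
    run = 0
    for i, sym in enumerate(out):
        run = run + 1 if sym == DUMMY else 0
        if run == len(packet):
            start = i - len(packet) + 1
            out[start:i + 1] = packet
            return out
    raise ValueError("not enough dummy symbols in the blanking period")


blanking = ["BS", "VB-ID", "Mvid", "Maud"] + [DUMMY] * 6
with_packet = insert_packet(blanking, build_psr_packet(enter=True))
```

The replacement preserves the number of symbols in the blanking period, so the line timing is unchanged.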
As described above, GPU 240 may send messages to display device 110 via an in-band signaling method, using the existing communications channel for transmitting digital video signals 250 to display device 110. In alternative embodiments, GPU 240 may send messages to display device 110 via a side-band method, such as by using the auxiliary communications channel in communications path 280. In yet other embodiments, a dedicated communications path, such as an additional cable, may be included to provide signaling to display device 110 to enter or exit the panel self-refresh mode.
Computer system 100 may also include multiple display devices 110 such as an internal display panel 110(0) and one or more external display panels 110(1) . . . 110(N). Each of the one or more display devices 110 may be connected to GPU 240 via communication paths 280(0) . . . 280(N). In one embodiment, each of the HPD signals included in communication paths 280 is also connected to EC 310. When one or more display devices 110 are operating in a panel self-refresh mode, EC 310 may be responsible for monitoring HPD and waking-up GPU 240 if EC 310 detects a hot-plug event or an interrupt request from one of the display devices 110.
In one embodiment, a FRAME_LOCK signal is included between internal display device 110(0) and GPU 240. FRAME_LOCK passes a synchronization signal from the display device 110(0) to GPU 240. For example, GPU 240 may synchronize video signals generated from pixel data in frame buffers 244 with the FRAME_LOCK signal. FRAME_LOCK may indicate the start of the active frame such as by passing the vertical sync signal used by TCON 210 to drive LCD device 216 to GPU 240.
EC 310 transmits the GPU_PWR and FB_PWR signals to voltage regulators that provide a supply voltage to the GPU 240 and frame buffers 244, respectively. EC 310 also transmits the WARMBOOT, SELF_REF and RESET signals to GPU 240 and receives a GPUEVENT signal from GPU 240. Finally, EC 310 may communicate with GPU 240 via an I2C or SMBus data bus. The functionality of these signals is described below.
The GPU_PWR signal controls the voltage regulator that provides GPU 240 with a supply voltage. When display device 110 enters a panel self-refresh mode, an operating system executing on CPU 102 may instruct EC 310 to kill power to GPU 240 by making a call to driver 340. Driver 340 will then cause EC 310 to drive the GPU_PWR signal low to kill power to GPU 240 to reduce the overall power consumption of computer system 100. Similarly, the FB_PWR signal controls the voltage regulator that provides frame buffers 244 with a supply voltage. When display device 110 enters the panel self-refresh mode, computer system 100 may also kill power to frame buffers 244 in order to further reduce overall power consumption of computer system 100. The FB_PWR signal is controlled in a similar manner to the GPU_PWR signal. The RESET signal may be asserted during “wake-up” of the GPU 240 to hold GPU 240 in a reset state while the voltage regulators that provide power to GPU 240 and frame buffers 244 are allowed to stabilize.
The WARMBOOT signal is asserted by EC 310 to indicate that GPU 240 should restore an operating state from SPI flash device 320 instead of performing a full, cold-boot sequence. In one embodiment, when display device 110 enters a panel self-refresh mode, GPU 240 may be configured to save a current state in SPI flash device 320 before GPU 240 is powered down. GPU 240 may then restore an operating state by loading the saved state information from SPI flash device 320 upon waking-up. Loading the saved state information reduces the time required to wake-up GPU 240 relative to performing a full, cold-boot sequence. Reducing the time required to wake-up GPU 240 is advantageous during high frequency entry and exit into a low-power sleep state.
The SELF_REF signal is asserted by EC 310 when display device 110 is operating in a panel self-refresh mode. The SELF_REF signal indicates to GPU 240 that display device 110 is currently operating in a panel self-refresh mode and that communications path 280 should be isolated to prevent transients from disrupting the data stored in local frame buffers 224. In one embodiment, GPU 240 may connect communications path 280 to ground through weak, pull-down resistors when the SELF_REF signal is asserted.
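One possible ordering of the signals described above is sketched below as two event lists. The ordering of steps within each list is an assumption made for illustration, as are the function names and the step labels that are not signal names; the text specifies what each signal does but not a complete sequence. A value of 1 denotes a signal that is asserted or driven high, and 0 denotes deasserted or driven low.

```python
def sleep_sequence():
    """Assumed steps when GPU 240 is powered down for panel self-refresh."""
    return [
        ("save_state_to_spi_flash", None),  # GPU saves state before power-down
        ("SELF_REF", 1),                    # isolate communications path 280
        ("GPU_PWR", 0),                     # kill power to the GPU
        ("FB_PWR", 0),                      # kill power to frame buffers 244
    ]


def wake_sequence(warm=True):
    """Assumed steps when GPU 240 is woken up."""
    return [
        ("RESET", 1),                       # hold GPU in reset
        ("WARMBOOT", 1 if warm else 0),     # request state restore, not cold boot
        ("GPU_PWR", 1),
        ("FB_PWR", 1),
        ("wait_regulators_stable", None),   # let supply voltages settle
        ("RESET", 0),                       # release reset
        ("restore_state_from_spi_flash" if warm else "cold_boot", None),
    ]
```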
The GPUEVENT signal allows the GPU 240 to indicate to CPU 102 that an event has occurred, even when the PCIe bus is off. GPU 240 may assert the GPUEVENT to alert system EC 310 to configure the I2C/SMBUS to enable communication between the GPU 240 and the system EC 310. The I2C/SMBUS is a bidirectional communication bus configured as an I2C, SMBus, or other bidirectional communication bus to enable GPU 240 and system EC 310 to communicate. In one embodiment, the PCIe bus may be shut down when display device 110 is operating in a panel self-refresh mode. The operating system may notify GPU 240 of events, such as cursor updates or a screen refresh, through system EC 310 even when the PCIe bus is shut down.
In the wake-up frame buffer state 420, display device 110 wakes-up the local frame buffers 224. If display device 110 cannot initialize the local frame buffers 224, then display device 110 may send an interrupt request to GPU 240 indicating that the display device 110 has failed to enter the panel self-refresh mode and display device 110 returns to normal state 410. In one embodiment, display device 110 may be required to initialize the local frame buffers 224 before the next frame of video is received over communications path 280 (i.e., before the next rising edge of the VSync signal generated by GPU 240). Once display device 110 has completed initializing local frame buffers 224, display device 110 transitions to a cache frame state 430.
In the cache frame state 430, display device 110 waits for the next falling edge of the VSync signal generated by GPU 240 to begin caching one or more frames of video in local frame buffers 224. In one embodiment, GPU 240 may indicate how many consecutive frames of video to store in local frame buffers 224 by writing a value to a control register in display device 110. After display device 110 has stored the one or more frames of video in local frame buffers 224, display device 110 transitions to a self-refresh state 440.
In the self-refresh state 440, the display device 110 enters a panel self-refresh mode where TCON 210 drives the LCD device 216 with video signals generated by SRC 220 based on pixel data stored in local frame buffers 224. Display device 110 stops driving the LCD device 216 based on the video signals generated by GPU 240. Consequently, GPU 240 and communications path 280 may be placed in a low-power sleep state to reduce the overall power consumption of computer system 100. While in the self-refresh state 440, display device 110 may monitor communications path 280 to detect a request from GPU 240 to exit the panel self-refresh mode. If display device 110 receives a panel self-refresh exit request, then display device 110 transitions to a re-sync state 450.
In the re-sync state 450, display device 110 attempts to re-synchronize the video signals generated by GPU 240 with the video signals generated by SRC 220. When display device 110 has completed re-synchronizing the video signals, then display device 110 transitions back to a normal state 410. In one embodiment, display device 110 will cause the local frame buffers 224 to transition into a local frame buffer sleep state 460, where power supplied to the local frame buffers 224 is turned off.
In one embodiment, display device 110 may be configured to quickly exit wake-up frame buffer state 420 and cache frame state 430 if display device 110 receives a panel self-refresh exit request. In both of these states, display device 110 is still synchronized with the video signals generated by GPU 240. Thus, display device 110 may transition quickly back to normal state 410 without entering re-sync state 450. Once display device 110 is in self-refresh state 440, display device 110 is required to enter re-sync state 450 before returning to normal state 410.
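The state transitions described above can be summarized as a small transition table. The following is a minimal sketch for illustration only, not the implementation of display device 110. The event names are hypothetical labels for the conditions described in the text, the trigger for leaving normal state 410 is assumed to be a panel self-refresh entry request, and local frame buffer sleep state 460 is omitted.

```python
from enum import Enum


class State(Enum):
    NORMAL = 410
    WAKE_UP_FRAME_BUFFER = 420
    CACHE_FRAME = 430
    SELF_REFRESH = 440
    RE_SYNC = 450


# Event names are hypothetical labels, not signals defined by the text.
TRANSITIONS = {
    # Assumption: an entry request starts the sequence from the normal state.
    (State.NORMAL, "entry_request"): State.WAKE_UP_FRAME_BUFFER,
    (State.WAKE_UP_FRAME_BUFFER, "buffers_ready"): State.CACHE_FRAME,
    (State.WAKE_UP_FRAME_BUFFER, "init_failed"): State.NORMAL,
    # Quick exit: still synchronized with the GPU, so no re-sync is needed.
    (State.WAKE_UP_FRAME_BUFFER, "exit_request"): State.NORMAL,
    (State.CACHE_FRAME, "exit_request"): State.NORMAL,
    (State.CACHE_FRAME, "frames_cached"): State.SELF_REFRESH,
    # From self-refresh, the display must pass through re-sync.
    (State.SELF_REFRESH, "exit_request"): State.RE_SYNC,
    (State.RE_SYNC, "resync_done"): State.NORMAL,
}


def next_state(state, event):
    """Return the next state, or the current state if the event does not apply."""
    return TRANSITIONS.get((state, event), state)
```

In this sketch, an exit request received in state 420 or 430 returns directly to normal state 410, while the same request in state 440 leads to re-sync state 450 first.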
Panel self-refresh is one method to reduce the overall power consumption of a display subsystem by transitioning control for generating video signals from a high-powered GPU to a low-powered controller embedded in the display device. By exploiting periods of graphical inactivity where GPU processing is not required, the GPU may be turned off and the overall power consumption of the display subsystem is reduced. Panel self-refresh is most effective where there are many consecutive frames of inactivity because there is a delay between when the graphical inactivity begins and when the GPU 240 signals the display device to enter the panel self-refresh mode. However, when the image being displayed is being refreshed at higher frequency, the periods of time where self-refresh can be active may be too short to obtain any power savings.
Conventionally, with both analog video signals and digital video signals, the display device is updated substantially simultaneously with the transmission of the video signal over the video interface. For example, a CRT display may be refreshed based on analog video signals that control the path of the electron beam in the display. Similarly, an LCD display may be refreshed based on digital video signals that include color information used to drive the individual pixels of the display. As a controller in the display receives the incoming video signals, the controller causes a corresponding change in the image being displayed on the display device. Because a conventional display device does not include local storage for the video signal, the display device must adjust the portion of the image corresponding to that portion of the video signals when the display device receives the information over the video interface.
However, a display device that implements self-refreshing capabilities may include local storage, such as local frame buffers 224 included within display device 110, for at least a portion of a frame of pixels encoded in the video signals. By exploiting the local frame buffers 224, display device 110 may refresh the image being displayed asynchronously with the video signals received over the video interface from GPU 240. Consequently, GPU 240 may send one or more frames of video data to display device 110 at a higher data transmission rate than the current refresh rate of the display device 110. Once one or more frames have been sent over the video interface, GPU 240 may be placed in a low-power sleep state until the next subsequent frame is needed by the display device 110.
Returning to
As described above, the link symbol clock rate for the main link of an eDP interface may be either 162 MHz (Reduced Bit Rate; hereinafter “RBR”), 270 MHz (High Bit Rate; hereinafter “HBR”) or 540 MHz (High Bit Rate 2; hereinafter “HBR2”) for 1, 2 or 4 lanes (i.e., differential pairs) of the video interface. In order to change the refresh rate of display device 110, GPU 240 may configure communications path 280 to utilize a particular number of lanes at a particular link symbol clock rate to approximate the necessary bandwidth for video signals 250. In addition, because the eDP interface provides only a few discrete configurations for different bandwidth requirements, GPU 240 must also pad the video signals 250 with enough dummy symbols during the horizontal blanking period (i.e., link symbol clock cycles 255(04)) to approximate the desired refresh rate. As an alternative method for padding the video signals 250, GPU 240 may also insert stuffing symbols into video signals 250 during the active pixel data for a horizontal line (not shown in
For example, GPU 240 may be configured to cause display device 110 to run at a refresh rate of 60 Hz (i.e., 60 frames per second). In other words, GPU 240 transmits one frame of video data to display device 110 approximately every 16.6 ms. In a display device configured for WUXGA resolution and 24 bpp color depth, a full frame of active pixel data requires approximately 6750 kB of data (1920*1200*24 bpp). At a refresh rate of 60 Hz, the minimum required data rate over the video interface for this resolution and color depth is approximately 405 MB/s. The eDP specification states that the maximum link symbol clock rate (1 link clock cycle equals 1 byte per lane) is 540 MHz, which translates to a maximum, 4-lane interface data transmission rate of 2.16 GB/s, or approximately five times as much bandwidth as is needed to transmit the required active pixel data at a refresh rate of 60 Hz. Therefore, GPU 240 may reconfigure the eDP interface to utilize one or two lanes instead of four lanes or may reduce the link symbol clock rate from 540 MHz to 162 MHz or 270 MHz, or some combination thereof, in order to reduce the amount of dummy symbols that GPU 240 stuffs in the horizontal blanking period of video signals 250.
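The bandwidth arithmetic in this example can be reproduced with a short calculation. The sketch below is illustrative only; the function names are hypothetical, and it assumes, as stated above, that one link symbol clock cycle carries one byte per lane. The exact byte count is about 415 million bytes per second, which the text rounds to approximately 405 MB/s.

```python
# Link symbol clock rates (Hz) and lane counts named in the text.
LINK_CLOCKS_HZ = (162_000_000, 270_000_000, 540_000_000)  # RBR, HBR, HBR2
LANE_COUNTS = (1, 2, 4)


def frame_bytes(width, height, bits_per_pixel):
    """Bytes of active pixel data in one uncompressed frame."""
    return width * height * bits_per_pixel // 8


def required_rate(width, height, bits_per_pixel, refresh_hz):
    """Minimum bytes per second needed for active pixel data alone."""
    return frame_bytes(width, height, bits_per_pixel) * refresh_hz


def link_rate(link_symbol_clock_hz, lanes):
    """Bytes per second, assuming 1 byte per lane per link symbol clock cycle."""
    return link_symbol_clock_hz * lanes


def smallest_sufficient_rate(required_bytes_per_s):
    """Lowest discrete link rate that still meets the requirement, or None."""
    candidates = [
        link_rate(clock, lanes)
        for clock in LINK_CLOCKS_HZ
        for lanes in LANE_COUNTS
        if link_rate(clock, lanes) >= required_bytes_per_s
    ]
    return min(candidates, default=None)
```

For WUXGA at 24 bpp and 60 Hz, the lowest sufficient configuration in this sketch provides 540 MB/s (for example, two lanes at 270 MHz), so less padding is needed than with four lanes at 540 MHz.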
Returning now to
As also shown in
Video signals 514 correspond to a refresh rate of 48 Hz. As shown, GPU 240 begins transmitting active pixel data for frame N at time t0 and finishes transmitting active pixel data for frame N at time t2. GPU 240 begins transmitting active pixel data for frame N+1 at time t3 and finishes transmitting active pixel data for frame N+1 at time t5. Then, GPU 240 begins transmitting active pixel data for frame N+2 at time t6, and so forth. Because the refresh rate of video signals 514 is 48 Hz instead of 24 Hz, FRAME_LOCK 520 is received by GPU 240 twice as often for video signals 514 as for video signals 512. As shown, GPU 240 receives FRAME_LOCK 520(0) prior to transmitting active pixel data for frame N, FRAME_LOCK 520(1) prior to transmitting active pixel data for frame N+1, FRAME_LOCK 520(2) prior to transmitting active pixel data for frame N+2, FRAME_LOCK 520(3) prior to transmitting active pixel data for frame N+3, and FRAME_LOCK 520(4) prior to transmitting active pixel data for frame N+4.
As shown, video signals 512 and 514 transmit the same amount of active pixel data per frame (e.g., approximately 6750 kB for WUXGA resolution and 24 bpp color depth). GPU 240 adjusts the refresh rate of the video signals by adjusting the link symbol clock rate of the communications path 280 (e.g., changing the link symbol clock rate from 162 MHz (648 MB/s) to 270 MHz (1.08 GB/s) for 4 lanes), adjusting the number of active lanes of the communications path 280 (e.g., transmitting on 4 lanes instead of 2 lanes), adjusting the amount of dummy symbols added to the horizontal blanking period of the video signals, adjusting the amount of stuffing symbols added during the active pixel region of video signals 250, or some combination thereof. However, these methods fail to utilize the full bandwidth of the communications path 280 and/or waste power by transmitting data that is discarded by the display device 110.
As shown, video signals 516 correspond to a refresh rate of 24 Hz, similar to the refresh rate of video signals 512. However, in contrast with video signals 512, GPU 240 transmits the active pixel data for frame N in a much shorter time (time t0 until time t1) by utilizing the local frame buffers 224 in display device 110. First, GPU 240 causes display device 110 to enter panel self-refresh mode by sending a panel self-refresh entry request to display device 110. GPU 240 then transmits the first frame of active pixel data to display device 110 to cache in the local frame buffers 224. Once display device 110 enters self-refresh state 440, SRC 220 generates the video signals used to drive LCD device 216 based on the active pixel data stored in local frame buffers 224 at a refresh rate set by GPU 240 in a register of display device 110.
Then, for each frame of video, GPU 240 may burst the active pixel data to display device 110 over the communications path 280 at a faster data transmission rate than the refresh rate of video signals 516. SRC 220 may be configured to buffer the pixel data received at the fast data transmission rate of the video interface and scan-out the buffered pixel data at the slower refresh rate of the display device 110. This operation minimizes the number of dummy symbols inserted into the horizontal blanking period of video signals 516. In one embodiment, link symbol clock cycles 255(04) of video signals 516 may include only secondary packets for embedded information, such as subtitle information and audio data, and no dummy symbols. In another embodiment, GPU 240 may only transmit active pixel data (i.e., data P0-PN in link symbol clock cycles 255(06)-255(13)), discarding all other framing symbols. Thus, for a WUXGA resolution, 24 bpp color depth, and 24 Hz video signal, GPU 240 may transmit the active pixel data for a single frame in approximately 6.91×10^6 link symbol clock cycles over a single lane of the eDP, or 1.73×10^6 link symbol clock cycles over four lanes. If communications path 280 is configured to utilize all four lanes in the main eDP link at an HBR2 (540 MHz) link symbol clock rate, then a single frame of WUXGA resolution pixel data may be transmitted from GPU 240 to display device 110 in approximately 3.2 ms. However, at a refresh rate of 24 Hz, GPU 240 is only required to send a frame to display device 110 approximately every 41.6 ms. The difference in time between the minimum time and the maximum time needed to transmit the frame to display device 110 may enable certain power-saving techniques to be utilized when active pixel data is not being transmitted to display device 110.
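The timing figures above follow from the frame size and the link configuration. The following is a rough sketch of that arithmetic, assuming only active pixel data is transmitted and one byte is carried per lane per link symbol clock cycle; the function names are hypothetical.

```python
def link_cycles(frame_size_bytes, lanes):
    """Link symbol clock cycles needed to send a frame (rounded up)."""
    return -(-frame_size_bytes // lanes)


def burst_time_s(frame_size_bytes, link_symbol_clock_hz, lanes):
    """Seconds needed to burst one frame of active pixel data."""
    return link_cycles(frame_size_bytes, lanes) / link_symbol_clock_hz


def idle_fraction(frame_size_bytes, link_symbol_clock_hz, lanes, content_hz):
    """Fraction of each frame period in which no pixel data is on the link."""
    frame_period_s = 1.0 / content_hz
    burst_s = burst_time_s(frame_size_bytes, link_symbol_clock_hz, lanes)
    return 1.0 - burst_s / frame_period_s


# WUXGA at 24 bpp: 1920 * 1200 * 3 bytes of active pixel data per frame.
WUXGA_FRAME_BYTES = 1920 * 1200 * 3
```

Under these assumptions, a four-lane HBR2 link is busy for about 3.2 ms of each 41.6 ms frame period at 24 Hz, leaving the link idle for roughly 92% of the period. Wake-up latency is not modeled here and would reduce the time actually available for a power-saving state.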
In one embodiment, GPU 240 may be placed in a power-saving state between the time when GPU 240 finishes transmitting active pixel data for frame N (time t1) and when GPU 240 is required to begin transmitting active pixel data for frame N+1 (time t6). For example, GPU 240 could cause portions of GPU 240 to be clock-gated or power-gated. Alternatively, EC 310 could turn off power to the voltage regulators that supply power to GPU 240. In addition, GPU 240 may cause communications path 280 to be placed in a power-saving state, such as by reducing the link symbol clock rate to RBR (162 MHz), reducing the active lanes from two or four lanes down to one lane, or turning off and isolating the communications path 280 entirely. Advantageously, the state of GPU 240 and communications path 280 may be stored in frame buffers 244 or system memory 104 temporarily and reloaded when GPU 240 and communications path 280 are brought out of the power-saving state and restored to normal operation.
The operation described above related to video signals 516 is described herein as “burst refresh mode.” In burst refresh mode, GPU 240 “wakes-up” momentarily to burst a portion of the video signals to display device 110 at a transmission rate that exceeds the refresh rate of the display device 110 and then returns to a low-power sleep state to reduce power consumption until the next portion of the video signals must be transmitted to display device 110. In one embodiment, the size of the portion of the video signals transmitted during each burst cycle is determined based on the available size of local frame buffers 224 in display device 110. In some embodiments, display device 110 makes the size of local frame buffers 224 available to GPU 240 via a register in display device 110 that may be read by GPU 240 over the auxiliary communication channel.
In one embodiment, local frame buffers 224 are sized to store a full, uncompressed frame of pixel data, and GPU 240 transmits active pixel data corresponding to one frame of the video signals 516 during each burst cycle. In alternative embodiments, local frame buffers 224 are sized to hold less than a full, uncompressed frame of pixel data. In such embodiments, GPU 240 may be configured to compress the pixel data for a single frame before bursting the pixel data over the video interface to display device 110, the compressed frame capable of being stored in the smaller local frame buffers 224. Alternatively, GPU 240 may be configured to burst only a portion of the uncompressed frame of pixel data to display device 110, the size of the portion corresponding to the size of the local frame buffers 224. For example, if display device 110 includes local frame buffers 224 having a 2 MB capacity, then GPU 240 may transmit portions of the frame of pixel data in 2 MB bursts. In yet other embodiments, local frame buffers 224 may be larger than a single frame of uncompressed pixel data. In such embodiments, GPU 240 may be configured to burst one or more frames of pixel data during each burst cycle, or a stereoscopic frame of pixel data having a left view and a right view.
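The partial-frame case can be illustrated by splitting a frame into bursts no larger than the local buffer. This is a minimal sketch, assuming the buffer size has already been read from the display device and interpreting 2 MB as 2 × 1024 × 1024 bytes; the function name is hypothetical.

```python
def burst_sizes(frame_size_bytes, buffer_size_bytes):
    """Split a frame into burst sizes that each fit in the local buffer."""
    if buffer_size_bytes <= 0:
        raise ValueError("buffer size must be positive")
    full_bursts, remainder = divmod(frame_size_bytes, buffer_size_bytes)
    sizes = [buffer_size_bytes] * full_bursts
    if remainder:
        # The final burst carries whatever is left of the frame.
        sizes.append(remainder)
    return sizes
```

For an uncompressed WUXGA frame at 24 bpp (6,912,000 bytes) and a 2 MB buffer, this yields three full bursts and one shorter final burst.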
In one embodiment, display device 110 is configured with a refresh rate that is a multiple of the content rate corresponding to the video signals 516. For example, display device 110 may be configured to operate using a refresh rate of 48 or 72 Hz even though video signals 516 correspond to a content rate of 24 frames per second. Such operation may be important for video where content rate is low (e.g., feature film cinema may be recorded at 24 fps), but where low refresh rates may result in a noticeable flicker of the LCD device 216. Thus, it may be desirable to refresh the LCD device 216 multiple times per frame in order to reduce the appearance of flicker. In such embodiments, SRC 220 may be configured to drive LCD device 216 using the pixel data in local frame buffers 224 at a high refresh rate that is a multiple N of the refresh rate associated with video signals 516. In other words, video signals 516 update the pixel data in local frame buffers 224 once every N refresh cycles such that, even though LCD device 216 is being refreshed at a fast refresh rate, the image being displayed is updated at the lower content rate.
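The relationship between the panel refresh rate and the content rate can be shown with a short sketch. It assumes the refresh rate is an exact integer multiple of the content rate; the function names are hypothetical.

```python
def refresh_multiple(refresh_hz, content_hz):
    """Return N, the number of panel refreshes per content frame."""
    if content_hz <= 0 or refresh_hz % content_hz != 0:
        raise ValueError("refresh rate must be an integer multiple of content rate")
    return refresh_hz // content_hz


def content_frame_for_refresh(refresh_cycle, n):
    """Index of the cached content frame shown during a given refresh cycle."""
    return refresh_cycle // n
```

With a 72 Hz refresh rate and 24 fps content, N is 3, so each cached frame is scanned out three times before the next burst replaces it.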
During burst refresh mode, display device 110 requires new pixel data to be transmitted at regular intervals between GPU 240 and display device 110. In one embodiment, GPU 240 may determine the time period for each burst refresh cycle based on the desired refresh rate of display device 110 and the data transmission rate of communications path 280. In one embodiment, the control for causing GPU 240 to exit from the low-power sleep state may be implemented via a timing mechanism. As shown in
In other embodiments, timer 610 may be included in GPU 240 and supplied with a separate power source that is not turned off during the low-power sleep state. Alternatively, timer 610 may be included as a separate chip in parallel processing subsystem 112. In yet other embodiments, timer 610 may be included in SRC 220 within display device 110. Although timer 610 may cause EC 310 to “wake-up” GPU 240 via the GPU_PWR signal, in other embodiments, timer 610 may be configured to send a signal to GPU 240 directly, which causes GPU 240 to “wake-up” such as by turning off power-gating or clock-gating of certain portions of GPU 240. The signal generated by timer 610 may also cause GPU 240 to restore communications path 280 to normal operation.
GPU 240 may also control the timing of the recurring signal generated by timer 610 relative to the point at which GPU 240 must begin transmitting new pixel data to display device 110. Depending on the particular low-power sleep state of GPU 240, GPU 240 may require a small amount of time to return to normal operation after receiving the signal from timer 610. For example, GPU 240 may have to load some state variables from non-volatile memory such as SPI flash device 320 and execute a warm-boot routine. Consequently, timer 610 may be configured to expire a short period before GPU 240 is required to begin transmitting the next frame in order to allow GPU 240 to return to normal operation.
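The lead time described above can be sketched as a schedule of timer expirations, each placed ahead of the corresponding frame start by the wake-up latency. This is an illustration only; the function name, the use of integer microseconds, and the example latency value are assumptions rather than values given in the text.

```python
def timer_expiry_times_us(first_frame_start_us, frame_period_us,
                          wake_latency_us, count):
    """Times (in microseconds) at which the timer should fire.

    Each expiry precedes a frame start by the wake-up latency so the GPU
    has returned to normal operation when transmission must begin.
    """
    if not 0 <= wake_latency_us < frame_period_us:
        raise ValueError("wake latency must be shorter than the frame period")
    return [
        first_frame_start_us + i * frame_period_us - wake_latency_us
        for i in range(count)
    ]
```

For example, with a frame period of 41,667 microseconds (about 24 Hz) and a hypothetical 2,000 microsecond warm-boot latency, each expiry falls 2,000 microseconds before the next frame must begin.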
In other embodiments, GPU 240 may use the FRAME_LOCK signal generated by display device 110 to cause GPU 240 to “wake-up.” In such embodiments, the time between when the FRAME_LOCK signal is asserted by display device 110 (during the vertical sync period) until the time display device 110 begins refreshing the active pixels of LCD device 216 may be sufficient to “wake-up” GPU 240 and begin transmitting the next frame to display device 110. In still other embodiments, timer 610 may cause the FRAME_LOCK signal to be pulsed twice—a first time to signal GPU 240 to “wake-up” and a second time to indicate the vertical sync in the refresh timing of LCD device 216.
In one embodiment, GPU 240 is configured to render the next frame substantially simultaneously with transmitting the data for the current frame to display device 110. In other words, after GPU 240 “wakes-up,” GPU 240 begins transmitting data for the current frame, which is stored in frame buffers 244. Substantially simultaneously, GPU 240 may receive commands and data from graphics driver 103 that are processed by one or more processing units within GPU 240 to generate pixel data for the next frame in frame buffers 244. Once GPU 240 has finished transmitting the pixel data for the current frame to display device 110 and generating new pixel data for the next frame, GPU 240 may be placed in a low-power sleep state until timer 610 causes GPU 240 to be “woken-up” to transmit the new frame of pixel data to display device 110. In other embodiments, GPU 240 may be configured to be woken-up to render the new pixel data immediately prior to transmitting the new pixel data to display device 110.
In one embodiment, GPU 240 may be configured to only transmit the active pixel data for a portion of a frame. After the first frame of pixel data has been cached in local frame buffers 224, GPU 240 may only need to transmit pixel data for subsequent frames that are different from the pixel data stored in local frame buffers 224. For example, frame 700 illustrates a current frame cached in local frame buffers 224. To generate pixel data for the next frame, GPU 240 may receive instructions and data for rendering the next frame of pixel data from CPU 102 that cause a plurality of pixels in the next frame of video to be different from the corresponding pixel data stored in local frame buffers 224. As shown in
Instead of transmitting the entire frame of pixel data to display device 110, GPU 240 may be configured to transmit only the plurality of different pixels that correspond to all pixels in frame 700 bound by the rectangle formed by pixel 710 and pixel 711. In one embodiment, GPU 240 may transmit a data packet to display device 110 that includes address data and pixel data associated with the plurality of different pixels. For example, the data packet may include coordinates for the upper-left pixel and the lower-right pixel that identify a range of addresses corresponding to the plurality of different pixels in frame 700. The data packet may also include the pixel data for each of the plurality of different pixels. SRC 220 may be configured to receive the data packet and update the pixel data in local frame buffers 224 corresponding to the plurality of different pixels.
In another embodiment, GPU 240 may be configured to transmit each horizontal line of the new frame of pixel data that includes at least one different pixel compared to the previous frame of pixel data stored in local frame buffers 224. It will be appreciated that other techniques for transmitting only different pixel data may be implemented and are within the scope of the present invention. For example, one or more “dirty” rectangles of different pixels may be transmitted for each frame, which may be a more efficient technique when only a small number of pixels are modified across large parts of frame 700.
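The partial-update techniques described above can be sketched on frames modeled as lists of rows. The code below is an illustrative assumption, not the packet format used over the video interface: the dictionary layout and all function names are hypothetical.

```python
def dirty_rectangle(prev, new):
    """Bounding box ((x0, y0), (x1, y1)) of changed pixels, or None if equal."""
    xs, ys = [], []
    for y, (prev_row, new_row) in enumerate(zip(prev, new)):
        for x, (p, n) in enumerate(zip(prev_row, new_row)):
            if p != n:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys)), (max(xs), max(ys))


def dirty_lines(prev, new):
    """Indices of horizontal lines containing at least one changed pixel."""
    return [y for y, (p, n) in enumerate(zip(prev, new)) if p != n]


def make_packet(new, rect):
    """Hypothetical packet: corner coordinates plus the pixels they bound."""
    (x0, y0), (x1, y1) = rect
    pixels = [row[x0:x1 + 1] for row in new[y0:y1 + 1]]
    return {"upper_left": (x0, y0), "lower_right": (x1, y1), "pixels": pixels}


def apply_packet(local_buffer, packet):
    """Update only the addressed region of the cached frame, in place."""
    x0, y0 = packet["upper_left"]
    for dy, row in enumerate(packet["pixels"]):
        local_buffer[y0 + dy][x0:x0 + len(row)] = row
```

A single bounding rectangle can include many unchanged pixels when the changes are scattered, which is the case where several smaller "dirty" rectangles may be more efficient.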
In burst-refresh state 810, display device 110 is configured to continually refresh the pixels of LCD device 216 based on a cached frame of video in local frame buffers 224. In one embodiment, display device 110 is configured to receive one frame of pixel data for each refresh cycle of LCD device 216. In other embodiments, display device 110 is configured to receive one frame of pixel data for a plurality of refresh cycles of LCD device 216. In yet other embodiments, display device 110 is configured to receive each frame of pixel data sporadically based on whether the next frame of pixel data is different from the previous frame of pixel data. When GPU 240 is ready to send the next frame of pixel data to display device 110, GPU 240 may signal to display device 110 that it is transmitting the next frame and display device 110 transitions to cache frame state 820. In other embodiments, GPU 240 just begins transmitting active pixel data at any point and display device 110 transitions to cache frame state 820 when display device 110 detects the next pixel data on the video interface.
In cache frame state 820, display device 110 is configured to receive pixel data associated with a frame of video at a data transmission rate that exceeds the refresh rate of the LCD device 216. For example, GPU 240 configures communications path 280 to use a link symbol clock rate of 540 MHz in connection with 4 active lanes of the eDP interface, corresponding to a maximum refresh rate of approximately 312 Hz at WUXGA resolution. However, display device 110 is configured to refresh LCD device 216 at 60 Hz. At the next VSync signal, display device 110 transitions back to burst-refresh state 810 and begins refreshing LCD device 216 based on the new frame cached in local frame buffers 224.
GPU 240 may also be configured to transmit a burst-refresh exit request to display device 110 to transition out of burst-refresh mode. In one embodiment, display device 110 may continue to operate in normal panel self-refresh mode. In other embodiments, display device 110 may transition back to normal operation by exiting panel self-refresh mode and re-syncing the video signals generated by SRC 220 with the video signals generated by GPU 240 to return to normal state 410 (not shown).
The method begins at step 910, where GPU 240 causes display device 110 to enter a self-refresh mode. In one embodiment, GPU 240 inserts a message in the video signals transmitted to display device 110 that causes display device 110 to cache the current frame of pixel data in local frame buffers 224 and begin driving LCD device 216 based on the cached frame of pixel data. At step 912, GPU 240 reconfigures communications path 280 to operate with a faster data transmission rate than the refresh rate of display device 110. In one embodiment, GPU 240 reconfigures communications path 280 to operate at the fastest possible data transmission rate available. For some example embodiments where communications path 280 implements an eDP interface, GPU 240 configures communications path 280 to utilize four lanes of the main eDP link with a 540 MHz link symbol clock rate. At step 914, GPU 240 bursts the current frame of pixel data to display device 110 at the fast data transmission rate. SRC 220 stores the current frame of pixel data in local frame buffers 224. The timing of the transmission of the pixel data for the current frame must be coordinated with the refresh rate of LCD device 216 to ensure that pixel data for the current frame does not overwrite valid pixel data for the previous frame that is still being scanned out by SRC 220 to drive LCD device 216.
At step 916, GPU 240 renders the next frame of pixel data based on instructions and data received from graphics driver 103 and stores the next frame of pixel data in frame buffers 244. In one embodiment, GPU 240 renders the next frame of pixel data substantially simultaneously with bursting the current frame of pixel data to display device 110. At step 918, GPU 240 enters a low-power sleep state. In one embodiment, GPU 240 causes EC 310 to turn off the voltage regulator that supplies power to GPU 240. In another embodiment, portions of GPU 240 are clock-gated or power-gated to reduce power consumption.
At step 920, GPU 240 determines whether a signal to “wake-up” has been received. If a signal to “wake-up” has not been received, then GPU 240 waits until the “wake-up” signal is received. However, if GPU 240 has received a “wake-up” signal, then method 900 proceeds to step 922 where GPU 240 exits the low-power sleep state. In one embodiment, a timing mechanism, such as timer 610, causes EC 310 to “wake-up” GPU 240 by controlling the voltage regulators that supply power to GPU 240. In other embodiments, timer 610 may transmit a signal to GPU 240 that causes GPU 240 to cease clock-gating or power-gating portions of GPU 240. At step 922, GPU 240 performs any necessary operations to return to a normal operating state.
At step 924, GPU 240 determines whether to continue operating in a burst refresh mode. If GPU 240 determines to continue operating in a burst refresh mode, then method 900 returns to step 914 where GPU 240 bursts the next frame of pixel data to display device 110. However, if GPU 240 determines not to continue operating in a burst refresh mode, then method 900 proceeds to step 926 where GPU 240 reconfigures communications path 280 to operate with a data transmission rate that matches the refresh rate of display device 110 and method 900 terminates.
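The control flow of steps 910 through 926 can be sketched as a simple loop. This is a simulation for illustration only: it records the step numbers visited rather than driving hardware, and the function name and the frame-count stopping condition are assumptions standing in for the decision made at step 924.

```python
def run_burst_refresh(frames_to_burst):
    """Return the sequence of step numbers visited for a given frame count."""
    if frames_to_burst < 1:
        raise ValueError("at least one frame must be burst")
    steps = [
        910,  # cause the display device to enter self-refresh mode
        912,  # reconfigure the link for the fast data transmission rate
    ]
    remaining = frames_to_burst
    while True:
        steps.append(914)  # burst the current frame to the display device
        steps.append(916)  # render the next frame into the GPU frame buffers
        steps.append(918)  # enter the low-power sleep state
        steps.append(920)  # wait for the wake-up signal
        steps.append(922)  # exit sleep and return to a normal operating state
        remaining -= 1
        steps.append(924)  # decide whether to stay in burst refresh mode
        if remaining == 0:
            break
    steps.append(926)  # restore a link rate matching the display refresh rate
    return steps
```

Steps 910 and 912 run once on entry, steps 914 through 924 repeat for each burst cycle, and step 926 runs once on exit.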
In sum, the disclosed technique enables a GPU to transmit a frame of pixel data to a display device asynchronously with the refresh of the display device. Typically, the bandwidth required for the pixel data is much less than the maximum possible bandwidth of the video interface that couples the GPU with the display device. By maximizing the data transmission rate over the interface, periods of inactivity are created that allow the GPU to be placed in a low-power sleep state.
One advantage of the disclosed technique is that placing the GPU and video interface in a power-saving state reduces the overall power consumption of the system, which extends the battery life of today's mobile devices. The burst refresh technique may result in power savings of 60% to 70% or more when compared to conventional operating modes. Furthermore, operating in burst refresh mode is completely transparent to a viewer watching the displayed video.
While the foregoing is directed to embodiments of the invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof. For example, aspects of the present invention may be implemented in hardware or software or in a combination of hardware and software. One embodiment of the invention may be implemented as a program product for use with a computer system. The program(s) of the program product define functions of the embodiments (including the methods described herein) and can be contained on a variety of computer-readable storage media. Illustrative computer-readable storage media include, but are not limited to: (i) non-writable storage media (e.g., read-only memory devices within a computer such as CD-ROM disks readable by a CD-ROM drive, flash memory, ROM chips or any type of solid-state non-volatile semiconductor memory) on which information is permanently stored; and (ii) writable storage media (e.g., floppy disks within a diskette drive or hard-disk drive or any type of solid-state random-access semiconductor memory) on which alterable information is stored. Such computer-readable storage media, when carrying computer-readable instructions that direct the functions of the present invention, are embodiments of the invention.
In view of the foregoing, the scope of the invention is determined by the claims that follow.