1. Field
The disclosed embodiments relate to techniques for switching between graphics sources to drive a display in a computer system. More specifically, the disclosed embodiments relate to a buffering technique that facilitates switching between graphics sources to drive a display without a visible interruption.
2. Related Art
To operate without interruption, computer displays require a constant video stream from a graphics source. However, modern computer systems often drive a display from different graphics sources. For example, a computer system may include multiple graphics processing units (GPUs), which provide differing levels of graphics-processing performance and consume different amounts of power. This enables the computer system to switch a display between different GPUs in a manner that balances changing graphics-processing requirements and power consumption. Unfortunately, video streams from the different graphics sources are not necessarily synchronized with each other, and the process of starting up a graphics source can take some time. As a consequence, the process of switching between different graphics sources can cause user-visible display glitches.
Hence, what is needed is a technique that facilitates driving a display using different graphics sources without the above-described problems.
The disclosed embodiments provide a system that facilitates driving a display in a computer system. During operation, the system receives an input video stream from a graphics source, wherein the input video stream comprises a sequence of video frames. Next, the system directs the input video stream through a set of two or more memory buffers including a front buffer and a back buffer to produce an output video stream, which is used to drive the display. While directing the input video stream through the set of memory buffers, the system writes a video frame from the input video stream into the back buffer, and concurrently drives the output video stream from a preceding video frame in the front buffer. When the writing of the video frame completes, the system switches buffers so that the back buffer becomes the front buffer, which drives the output video stream, and the front buffer becomes either a spare buffer or the back buffer, which receives a subsequent frame from the input video stream.
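The front/back buffer handoff described above can be illustrated with a minimal sketch. The class name `DoubleBuffer`, the line-by-line write model and the `lines_per_frame` parameter are assumptions made for illustration only; the disclosed embodiments implement this behavior in circuitry, not software.

```python
class DoubleBuffer:
    """Hypothetical model: input lines land in the back buffer while the
    display keeps scanning out the preceding frame from the front buffer."""

    def __init__(self, lines_per_frame, initial_frame):
        self.lines_per_frame = lines_per_frame
        self.front = list(initial_frame)  # frame currently driving the display
        self.back = []                    # frame currently being written

    def write_line(self, line):
        """Write one line of the incoming frame; swap when the frame completes."""
        self.back.append(line)
        if len(self.back) < self.lines_per_frame:
            return False                  # write still in progress, no swap
        # Writing is complete: the back buffer becomes the front buffer, and
        # the old front buffer is reused as the back buffer for the next frame.
        self.front, self.back = self.back, self.front
        self.back.clear()
        return True

    def scan_out(self):
        """Return the frame that drives the output video stream."""
        return list(self.front)
```

In this sketch, a call to `scan_out()` made partway through a write still returns the preceding frame, which is the property that keeps a partially written frame from reaching the display.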
In some embodiments, if the input video stream goes offline, the system temporarily halts the switching of the buffers, and continues to drive the output video stream from the front buffer until the input video stream comes back online.
In some embodiments, the system receives a second input video stream from a second graphics source, and performs a switching operation to direct the second input video stream through the set of memory buffers instead of the input video stream. While the switching operation is in progress, the system temporarily halts the switching of the buffers and continues to drive the output video stream from the front buffer until the switching operation completes.
In some embodiments, after the switching operation completes, if the switching operation introduced a buffering time lag between the input video stream and the output video stream, the system reduces the time lag during successive video frames until the time lag is eliminated.
In some embodiments, while the input video stream is being directed through the set of memory buffers, the system allows a processor to perform direct rendering operations into the back buffer.
In some embodiments, the system switches the input video stream to a live path, which bypasses the set of memory buffers, to produce the output video stream. After the input video stream is switched to the live path, the system can conserve power by removing power from the set of memory buffers.
In some embodiments, receiving the input video stream involves selecting the input video stream from one or more graphics sources.
In some embodiments, the one or more graphics sources include: a graphics processing unit (GPU); a plane within a GPU; or a graphics stream.
Overview
The disclosed embodiments provide a system which is interposed between the graphics sources and the display and has the ability to either pass through a frame or output an internally generated frame that may be based on previously captured frames. One embodiment of the system provides a frame buffer and a multiplexer integrated with digital logic that controls the video stream. This system either directly passes the video stream through to the display or stores a frame from the video stream to the frame buffer. The stored frame can then be retransmitted to the display indefinitely (and independently of the graphics source), thereby enabling the system to power down the graphics source while still refreshing the display using the stored frame.
This system facilitates receiving several video streams, which are not necessarily synchronized, from primary graphics sources (such as GPUs), and then capturing and saving complete or incremental frames to a frame buffer in an internal memory. Note that this capturing process may be turned on or off automatically or semi-automatically under host software control.
The system also facilitates generating an output video stream, which corresponds to one of the input streams or an internally generated stream, and optionally adding a time shift relative to the input streams. The disclosed embodiments also facilitate sustaining the output display using an internally generated stream, thereby permitting one or more of the video sources to be taken offline. Note that by controlling the relative timing of the input, output and internal streams, the system facilitates switching the output between any of the input streams or the internal stream with no user-visible display glitches.
The system also facilitates complete or incremental modification of the internal frame buffer from a host or auxiliary processor, thereby allowing screen updates, graphical user interface (GUI) events and cursor movements to occur even when the primary graphics sources are offline. In this way, the processor can control an internally generated cursor in the internal frame buffer (even when the frame buffers are not switching), which gives the user an indication that the system is still responsive.
The system also provides support for optional transformations, such as quantization, dithering and backlight adaptation, which may be applied to the input and/or output streams. The input and output streams may also have different signaling protocols, such as LVDS or DisplayPort, and the system may convert between stream formats.
The system provides a number of advantages. For example, graphics sources such as GPUs can dissipate significant power, even while displaying a still image or an image with relatively small changes between successive frames. The system facilitates turning off such a graphics source and sustaining the display using an internally generated image stream based on previously captured frames, thereby reducing system power dissipation significantly.
The system can also support multiple graphics sources, such as a high-performance high-power GPU and a low-performance energy-efficient GPU. These sources are often not synchronized with each other, and the process of switching between the sources can cause user-visible display glitches. By using the internal memory to store frame data and adapting the synchronization signals in conjunction with time shifting the video stream, the system can facilitate switching between such graphics sources without causing display glitches.
Moreover, the delay from turning on power for a GPU to the point where the GPU provides valid frames may range from a few hundred milliseconds to several seconds. Hence, without the above-described system, it is impractical to aggressively turn off GPUs to save power, because during the GPU initialization process the user will notice that the display is unresponsive to user actions such as cursor movement.
The system also provides the ability to apply modifications, such as cursor movement under host software control, to previously captured frames when the system internally generates a video stream. In addition, the host software may directly modify captured frames in the internal frame buffer, thereby permitting partial updates to be made to the last frame received before all graphics sources were taken offline. This may, for example, be used to render GUI events, such as the display of a clock in a GUI. In this case, the display appears to be responsive to the user, even while the GPU is offline or in the process of being brought online.
For usage scenarios such as browsing, word processing and full screen movie playback, much of the computational effort for GUI updates happens on the CPU, and the GPU workload is quite low. For example, while browsing, all network activity, parsing of HTML and images, and font rendering happen on the CPU; the GPU is invoked only at the end to update the rendered window area to the frame buffer. In this scenario, the system can be used to take all GPUs offline when the GUI workload is low, thereby allowing the software on the CPU to directly render images into the internal frame buffer, which leads to significant power savings and increased battery life. If the GUI workload increases beyond some threshold, the software can bring a GPU online and can switch over to a video stream from that GPU without display glitches. Hence, the described system facilitates switching between several video streams (or no video stream) without a visible interruption. Also, because the CPU provides some level of GUI rendering and cursor updating during the transition period, the system will appear to be responsive to the user.
Note that capturing frames for later redisplay is itself a cause of power dissipation. To alleviate this problem, the system can use techniques that automatically reduce the bandwidth and power required to capture frames by comparing the differences between successive frames.
The above-described system is described in more detail below, but first we describe the associated computer system hardware.
Computer System
A video stream on direct path 109 feeds through pre-processing circuitry 110 and then into MUX 118, which selects a stream from either the direct path 109 or the indirect path 111 to drive display 122. The selected stream feeds through post-processing circuitry 120 before driving display 122. Note that direct path 109 is useful for applications which are sensitive to the buffering delay through indirect path 111. For example, video games, which require users to quickly react to changes in display output, will not function well with a typical 16 ms delay introduced by frame buffering.
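The trade-off between the two paths can be made concrete with a small calculation. The sketch below assumes a 60 Hz refresh rate, which is consistent with the roughly 16 ms figure above but is not stated in the description; the function names and the latency-budget policy are hypothetical.

```python
def buffering_delay_ms(refresh_hz, buffered_frames=1):
    """Delay added by holding `buffered_frames` whole frames before display."""
    return buffered_frames * 1000.0 / refresh_hz


def select_path(latency_budget_ms, refresh_hz=60.0):
    """Hypothetical policy: use the direct path when the application cannot
    tolerate one frame of buffering delay, otherwise use the indirect path."""
    if latency_budget_ms < buffering_delay_ms(refresh_hz):
        return "direct"
    return "indirect"
```

At 60 Hz one buffered frame costs about 16.7 ms, so an application with a 5 ms latency budget would be routed to the direct path under this policy, while one that tolerates 50 ms could use the indirect path.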
A video stream through indirect path 111 similarly feeds through pre-processing circuitry 108 before feeding through stream-generation-and-timing-control circuitry 112 and memory controller 114, and then into a set of buffers in frame buffer memory 116. Note that stream-generation-and-timing-control circuitry 112 performs various operations, such as generating horizontal and vertical timing signals, fetching data from buffer memory, determining when a next frame is due and determining when to swap between frames. Also note that memory controller 114 is a dedicated frame-buffer memory controller, which is separate from a general system memory controller. Moreover, the set of buffers in frame buffer memory 116 includes a front buffer, which drives the display, and a back buffer, which receives a next frame from the video stream. The set of buffers can also include additional buffers to accommodate additional frames (between the frame stored in the front buffer and the frame stored in the back buffer), which can be used to mask a time lag which is greater than one frame. After the stream is buffered in frame buffer memory 116, the stream feeds back through memory controller 114 and stream-generation-and-timing-control circuitry 112 before feeding into MUX 118.
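As a rough illustration of how timing-control logic can determine when the next frame is due, the sketch below derives the refresh rate from a pixel clock and the total raster size (active area plus blanking). The function names are hypothetical, and the example numbers are the commonly used 1080p timing at 60 Hz, chosen only for illustration; the description does not specify any particular display timing.

```python
def refresh_rate_hz(pixel_clock_hz, h_total, v_total):
    """Frames per second implied by the pixel clock and the total raster
    size, where the totals include horizontal and vertical blanking."""
    return pixel_clock_hz / (h_total * v_total)


def frame_due_time_s(frame_index, pixel_clock_hz, h_total, v_total):
    """Time, in seconds from frame 0, at which frame `frame_index` starts."""
    return frame_index * h_total * v_total / pixel_clock_hz


# Illustrative values only: 2200 x 1125 total raster at 148.5 MHz.
rate = refresh_rate_hz(148_500_000, 2200, 1125)
```

A circuit that knows these three quantities can count pixel clocks to generate horizontal and vertical synchronization signals and to decide when a buffer swap may safely occur.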
Note that pre-processing circuitry 108 and 110 and post-processing circuitry 120 can perform various graphics-processing operations, such as dynamic backlight adaptation, quantization, dithering, gamma correction, format conversion and compression. Post-processing circuitry 120 can also overlay a cursor on a display stream under control of host CPU 102.
During system operation, host CPU 102 interacts with memory controller 114 and stream-generation-and-timing-control circuitry 112. For example, host CPU 102 can incrementally or completely modify a video frame by performing direct-rendering operations into a buffer in frame buffer memory 116. This allows screen updates, GUI events and cursor movements to occur, even when the primary graphics sources are offline.
Buffering Process
While the input video stream is being directed through the set of memory buffers, the system allows a processor to perform direct-rendering operations into the back buffer (step 210). Finally, the system uses the output video stream to drive the display (step 212).
Note that, because the processor generally wakes up more quickly from a sleep state than the GPUs, the direct-rendering operations can be used to improve the user experience during the wake-up period by allowing the user to move the cursor, or by updating the clock while the GPUs are waking up. Note that the system can alternatively leave the GPU in a sleep state while the processor performs updating operations until the graphics-processing load picks up. If there are multiple GPUs, the system can first activate a low-power GPU, and then a high-power GPU if the graphics-processing load increases.
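The staged activation described above can be sketched as a simple policy function. The normalized load value and both thresholds are hypothetical; the description states only that a low-power GPU can be activated first and a high-power GPU later if the load increases.

```python
def choose_renderer(load, low_threshold=0.2, high_threshold=0.6):
    """Pick a rendering resource for a normalized graphics load in [0, 1].

    The thresholds are illustrative assumptions, not values from the
    description.
    """
    if load < low_threshold:
        return "cpu"             # GPUs stay asleep; CPU renders directly
    if load < high_threshold:
        return "low_power_gpu"   # wake the energy-efficient GPU first
    return "high_power_gpu"      # escalate only when the load demands it
```

A practical policy would likely add hysteresis so that a load hovering near a threshold does not cause repeated power cycling.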
Also note that it is possible to incrementally update the frame in the buffer memory. This can save on power involved in writing to the buffer memory. To implement incremental updates, each video frame can be divided into tiles, wherein each tile in a memory buffer can be either a “back tile” or a “front tile,” with a bit indicating which tile is front or back. The system can also store a hash of the tile along with this bit. Whenever the system writes new data, the system computes the hash of the tile. If the hash is the same as the previous hash, the system does not update the tile or change the front/back status. Because false positives may occur, the system periodically overwrites each tile with new data.
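The tile-based incremental update can be sketched as follows. This is a minimal model under stated assumptions: the class name `TiledBuffer`, the use of CRC-32 as the hash, the per-tile flip at write time and the `force` flag for the periodic overwrite are all illustrative choices that the description does not specify.

```python
import zlib


class TiledBuffer:
    """Hypothetical tiled frame store with one front/back bit and one hash
    per tile. Unchanged tiles are detected by hash and not rewritten."""

    def __init__(self, num_tiles):
        self.slots = [[b"", b""] for _ in range(num_tiles)]  # two copies per tile
        self.front_bit = [0] * num_tiles   # which copy is the front tile
        self.hashes = [None] * num_tiles   # hash of the last written data
        self.writes = 0                    # counts actual memory writes

    def write_tile(self, index, data, force=False):
        """Write a tile unless its hash matches the previous one.

        `force=True` models the periodic overwrite that guards against a
        hash collision hiding a real change.
        """
        digest = zlib.crc32(data)
        if not force and digest == self.hashes[index]:
            return False                   # skip the write, keep front/back status
        back = 1 - self.front_bit[index]
        self.slots[index][back] = data     # write into the back tile
        self.front_bit[index] = back       # the back tile becomes the front tile
        self.hashes[index] = digest
        self.writes += 1
        return True

    def read_tile(self, index):
        """Return the tile copy that currently drives the display."""
        return self.slots[index][self.front_bit[index]]
```

For a mostly static screen, `writes` grows only for tiles whose contents changed, which is where the memory-write power saving comes from.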
Switching Video Streams
Note that this is an improvement over existing techniques for switching between unsynchronized graphics sources. These existing techniques ensure synchronization by waiting to switch streams until the precessing of frames from the different graphics sources causes blanking intervals from the unsynchronized graphics sources to align. (For example, see related U.S. patent application Ser. No. 11/499,167, filed 4 Aug. 2006, entitled “Method and Apparatus for Switching Between Graphics Sources,” by inventors David G. Conroy, Michael F. Culbert, William C. Athas and Brian D. Howard.) After this alignment, the switching can take place without causing a user-visible display glitch. Because the precessing can be slow, these existing techniques may have to wait as long as a few seconds before switching.
Finally, after the switching operation completes, if the switching operation introduced a buffering time lag between the input video stream and the output video stream, the system can reduce the time lag during successive video frames until the time lag is eliminated (step 306). For example, the time lag can be reduced gradually between successive frames, so that the time lag is eliminated after about 10 frames.
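One way to picture the gradual reduction is as a per-frame schedule of remaining lag. The sketch below assumes a linear reduction in equal steps; the description says only that the lag is reduced gradually over about 10 frames and does not specify the shape of the reduction or the mechanism used to absorb it. The function name is hypothetical.

```python
def lag_schedule(initial_lag_ms, frames=10):
    """Return the remaining time lag after each of `frames` successive
    frames, assuming the lag shrinks in equal steps and ends at zero."""
    if frames <= 0:
        raise ValueError("frames must be positive")
    return [initial_lag_ms * (frames - i - 1) / frames for i in range(frames)]
```

Spreading the correction over several frames keeps each per-frame timing adjustment small, so the catch-up itself is unlikely to be visible.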
Video Stream Going Offline
Next, when the input video stream comes back online, the system resumes switching the buffers (step 404). This involves writing a next video frame from the input video stream to the back buffer, and when this frame is written, performing a switching operation so that the back buffer becomes the front buffer. Note that, when a video source, such as a GPU, is turned off to save power and is turned on again at a later time, there will typically be a delay of some hundreds of milliseconds or even seconds before the GPU is live again. During this delay period, switching will remain disabled, and the display will be driven by the front buffer.
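The hold-and-resume behavior can be sketched as a loop over display refreshes. In this hypothetical model, each entry of `incoming` is either a completed input frame or `None` for a refresh interval in which the source is offline; the function name and the representation are assumptions made for illustration.

```python
def drive_display(initial_frame, incoming):
    """Return the frame shown at each refresh.

    While the input is offline (`None`), buffer switching is halted and the
    front buffer keeps driving the display. When frames resume, each
    completed frame is swapped to the front.
    """
    front = initial_frame
    shown = []
    for frame in incoming:
        if frame is not None:
            front = frame        # frame fully written to back buffer: swap
        shown.append(front)      # the display is always driven from the front
    return shown
```

For example, with the input sequence `["f1", None, None, "f2"]`, the display shows `f1` for three consecutive refreshes and then `f2`, with no refresh left undriven.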
Switching Between Indirect Path and Live Path
Next, after the input video stream is switched to the live path, the system can conserve power by removing power from the set of frame buffer memories and parts of the memory controller (step 504).
Note that when the video stream is being fed through the direct path, it is possible to concurrently send the stream (or differences between successive frames in the stream) through the indirect path to maintain a full frame in the frame buffer memory. In this case, if the graphics source goes offline, the display can be driven from the frame buffer memory. This also facilitates rapidly switching to the indirect path from the direct path.
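The two options on the live path, removing power from the frame buffer memory or keeping a stored copy up to date, can be contrasted in a short sketch. The class `PathController`, its method names and the `keep_shadow_copy` flag are hypothetical and serve only to illustrate the trade-off.

```python
class PathController:
    """Hypothetical model of the path selection and frame-buffer power state."""

    def __init__(self):
        self.path = "indirect"
        self.buffers_powered = True
        self.stored_frame = None      # last full frame held in buffer memory

    def switch_to_live(self, keep_shadow_copy):
        """Switch to the direct path, optionally keeping the buffers powered
        so that a full frame can still be maintained alongside it."""
        self.path = "direct"
        self.buffers_powered = keep_shadow_copy
        if not keep_shadow_copy:
            self.stored_frame = None  # contents are lost once power is removed

    def on_frame(self, frame):
        """Record an incoming frame if the frame buffer memory is powered."""
        if self.buffers_powered:
            self.stored_frame = frame

    def on_source_offline(self):
        """Fall back to the indirect path and return the frame that will
        sustain the display; fails if no frame was retained."""
        if self.stored_frame is None:
            raise RuntimeError("no stored frame available to drive the display")
        self.path = "indirect"
        return self.stored_frame
```

The sketch makes the cost of each choice explicit: powering down the buffers saves the most energy, while keeping a stored copy preserves the ability to survive a source going offline and to return quickly to the indirect path.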
The foregoing descriptions of embodiments have been presented for purposes of illustration and description only. They are not intended to be exhaustive or to limit the present description to the forms disclosed. Accordingly, many modifications and variations will be apparent to practitioners skilled in the art. Additionally, the above disclosure is not intended to limit the present description. The scope of the present description is defined by the appended claims.
Moreover, the preceding description is presented to enable any person skilled in the art to make and use the disclosed embodiments, and is provided in the context of a particular application and its requirements. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the disclosed embodiments. Thus, the disclosed embodiments are not limited to the embodiments shown, but are to be accorded the widest scope consistent with the principles and features disclosed herein.
The methods and processes described in the detailed description section can be embodied as electrical circuitry, or alternatively as code and/or data, which can be stored in a computer-readable storage medium as described above. When a computer system reads and executes the code and/or data stored on the computer-readable storage medium, the computer system performs the methods and processes embodied as data structures and code and stored within the computer-readable storage medium. Furthermore, the methods and processes described herein can be incorporated into hardware modules. For example, the hardware modules can include, but are not limited to, application-specific integrated circuit (ASIC) chips, field-programmable gate arrays (FPGAs), and other programmable-logic devices now known or later developed. When the hardware modules are activated, the hardware modules perform the methods and processes included within the hardware modules.
Number | Name | Date | Kind |
---|---|---|---|
5259004 | Nakayama | Nov 1993 | A |
5621431 | Harper et al. | Apr 1997 | A |
5727192 | Baldwin | Mar 1998 | A |
RE37508 | Taylor et al. | Jan 2002 | E |
6385267 | Bowen et al. | May 2002 | B1 |
6424320 | Callway | Jul 2002 | B1 |
6487719 | Itoh et al. | Nov 2002 | B1 |
6535208 | Saltchev et al. | Mar 2003 | B1 |
6624816 | Jones, Jr. | Sep 2003 | B1 |
6778187 | Yi | Aug 2004 | B1 |
6807232 | Nicholson et al. | Oct 2004 | B2 |
6850240 | Jones, Jr. | Feb 2005 | B1 |
7068278 | Williams et al. | Jun 2006 | B1 |
7262776 | Wilt et al. | Aug 2007 | B1 |
7522167 | Diard et al. | Apr 2009 | B1 |
7542010 | Lai | Jun 2009 | B2 |
7576745 | de Waal et al. | Aug 2009 | B1 |
20010022587 | Ono | Sep 2001 | A1 |
20020033812 | Van Vugt | Mar 2002 | A1 |
20020126122 | Yet et al. | Sep 2002 | A1 |
20030227460 | Schinnerer | Dec 2003 | A1 |
20040075622 | Shiuan et al. | Apr 2004 | A1 |
20040174367 | Liao | Sep 2004 | A1 |
20040207618 | Williams | Oct 2004 | A1 |
20040246257 | MacInnis et al. | Dec 2004 | A1 |
20050012749 | Gonzalez et al. | Jan 2005 | A1 |
20050035928 | De Greef | Feb 2005 | A1 |
20050083339 | Wilt et al. | Apr 2005 | A1 |
20050093854 | Kennedy et al. | May 2005 | A1 |
20050237327 | Rubinstein et al. | Oct 2005 | A1 |
20050244131 | Uehara | Nov 2005 | A1 |
20050285863 | Diamond | Dec 2005 | A1 |
20050289361 | Sutardja | Dec 2005 | A1 |
20060007203 | Chen et al. | Jan 2006 | A1 |
20060012540 | Logie | Jan 2006 | A1 |
20060132491 | Riach et al. | Jun 2006 | A1 |
20060197768 | Van Hook et al. | Sep 2006 | A1 |
20060284884 | Cahill, III | Dec 2006 | A1 |
20070139445 | Khan et al. | Jun 2007 | A1 |
20070283175 | Marinkovic et al. | Dec 2007 | A1 |
20080030509 | Conroy et al. | Feb 2008 | A1 |
20080055318 | Glen | Mar 2008 | A1 |
20080186319 | Boner | Aug 2008 | A1 |
20090079746 | Howard et al. | Mar 2009 | A1 |
20100007673 | Swic | Jan 2010 | A1 |
20100053177 | Diard et al. | Mar 2010 | A1 |
20100295999 | Li | Nov 2010 | A1 |
20110157202 | Kwa et al. | Jun 2011 | A1 |
Number | Date | Country |
---|---|---|
0497377 | Aug 1992 | EP |
1061434 | Dec 2000 | EP |
5-113785 | May 1993 | JP |
200612126 | Jan 2006 | JP |
02086745 | Oct 2002 | WO |
2006055608 | May 2006 | WO |
2007140404 | Dec 2007 | WO |
Entry |
---|
Gardner, Floyd M., “Charge-Pump Phase-lock Loop”, IEEE Transactions on Communications, vol. Com-28, No. 11, Nov. 1980, pp. 1849-1858. |
Number | Date | Country |
---|---|---|
20110298814 A1 | Dec 2011 | US |