The present invention relates generally to the field of computer graphics, and more particularly to the efficient generation and updating of computer graphics in a computer frame buffer for display on a display device.
There are many approaches to updating graphics on a display device. One classic method, although rarely used, is the brute force approach where changes to the display graphic are rendered by the processor to memory, and the entire updated graphic is then copied directly to the frame buffer for display. However, this method is extremely inefficient because every pixel of the display device is updated in the frame buffer whether the data for that pixel has changed or not, and the processing resources consumed by this approach are enormous.
A second method for updating graphics on a display device is for the processor to use a revision list to track in memory each pixel that is changed, and then copy only the updated pixels from memory to the frame buffer. This approach has the advantage of copying to the frame buffer data pertaining only to those pixels which have changed; however, this approach is also resource intensive with regard to the memory necessary for maintaining the revision list, which in the worst case must record a change to every pixel. This, along with other shortcomings, significantly slows video processing.
A third method for updating graphics on a display device involves a complex algorithmic approach that analyzes individual revisions and groups them geometrically into small but efficient “revision regions” comprising both “dirty” (changed) pixels as well as “clean” (unchanged) pixels. The regions are then merged together for an update to the frame buffer. However, for complex revisions, such as curves and other shapes that can only be broken down into a very large number of small rectangular regions, conducting the merge (among other tasks) is computationally very expensive.
What is needed in the art is a resource-efficient approach to updating graphics on a display device. The present invention addresses these shortcomings.
The method for one embodiment of the present invention is to establish the zone grid at system initialization and, thereafter, track which zones have any pixels revised so that, when the time comes to update the display, only the zones requiring revision (that is, those zones in which any pixel has been revised) are copied from shadow memory to the frame buffer for display on the display device. The memory for tracking these zones can be allocated at initialization and held since it is relatively small. As a result, a significant performance gain may be achieved by avoiding the shortcomings of the existing methods in the art notwithstanding the fact that some “clean” pixels in each zone having even a single changed pixel are also rewritten to the frame buffer.
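The zone-tracking method summarized above can be illustrated with a minimal sketch. The following Python example is not the claimed implementation; the class name `ZoneTracker`, the 8×8 grid, and the use of flat lists to stand in for the shadow memory and the frame buffer are assumptions made purely for illustration.

```python
class ZoneTracker:
    """Illustrative dirty-zone tracker (all names here are hypothetical)."""

    def __init__(self, width, height, zones_x=8, zones_y=8):
        self.width, self.height = width, height
        self.zones_x, self.zones_y = zones_x, zones_y
        # Zone dimensions, rounded up so the grid covers the whole display.
        self.zone_w = -(-width // zones_x)
        self.zone_h = -(-height // zones_y)
        # Allocated once at initialization and held: one flag per zone.
        self.dirty = [False] * (zones_x * zones_y)
        self.shadow = [0] * (width * height)  # stands in for shadow memory
        self.frame = [0] * (width * height)   # stands in for the frame buffer

    def set_pixel(self, x, y, value):
        """Render a pixel to shadow memory and flag its zone as revised."""
        self.shadow[y * self.width + x] = value
        zone = (y // self.zone_h) * self.zones_x + (x // self.zone_w)
        self.dirty[zone] = True

    def update_display(self):
        """Copy only revised zones to the frame buffer; return zones copied."""
        copied = 0
        for zone, is_dirty in enumerate(self.dirty):
            if not is_dirty:
                continue
            zy, zx = divmod(zone, self.zones_x)
            x0, y0 = zx * self.zone_w, zy * self.zone_h
            x1 = min(x0 + self.zone_w, self.width)
            y1 = min(y0 + self.zone_h, self.height)
            # Clean pixels inside a revised zone are rewritten as well.
            for y in range(y0, y1):
                row = y * self.width
                self.frame[row + x0:row + x1] = self.shadow[row + x0:row + x1]
            self.dirty[zone] = False
            copied += 1
        return copied
```

In this sketch, revising two distant pixels on a 64×64 display flags two of the 64 zones, so the update copies 128 pixels rather than all 4,096, and the only tracking state is the fixed-size list of flags.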
The foregoing summary, as well as the following detailed description of preferred embodiments, is better understood when read in conjunction with the appended drawings. For the purpose of illustrating the invention, there is shown in the drawings exemplary constructions of the invention; however, the invention is not limited to the specific methods and instrumentalities disclosed. In the drawings:
The subject matter is described with specificity to meet statutory requirements. However, the description itself is not intended to limit the scope of this patent. Rather, the inventors have contemplated that the claimed subject matter might also be embodied in other ways, to include different steps or combinations of steps similar to the ones described in this document, in conjunction with other present or future technologies. Moreover, although the term “step” may be used herein to connote different elements of methods employed, the term should not be interpreted as implying any particular order among or between various steps herein disclosed unless and except when the order of individual steps is explicitly described.
Computer Environment
Numerous embodiments of the present invention may execute on a computer.
As shown in
A number of program modules may be stored on the hard disk, magnetic disk 29, optical disk 31, ROM 24 or RAM 25, including an operating system 35, one or more application programs 36, other program modules 37 and program data 38. A user may enter commands and information into the personal computer 20 through input devices such as a keyboard 40 and pointing device 42. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner or the like. These and other input devices are often connected to the processing unit 21 through a serial port interface 46 that is coupled to the system bus, but may be connected by other interfaces, such as a parallel port, game port or universal serial bus (USB). A monitor 47 or other type of display device is also connected to the system bus 23 via an interface, such as a video adapter 48. In addition to the monitor 47, personal computers typically include other peripheral output devices (not shown), such as speakers and printers. The exemplary system of
The personal computer 20 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 49. The remote computer 49 may be another personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the personal computer 20, although only a memory storage device 50 has been illustrated in
When used in a LAN networking environment, the personal computer 20 is connected to the LAN 51 through a network interface or adapter 53. When used in a WAN networking environment, the personal computer 20 typically includes a modem 54 or other means for establishing communications over the wide area network 52, such as the Internet. The modem 54, which may be internal or external, is connected to the system bus 23 via the serial port interface 46. In a networked environment, program modules depicted relative to the personal computer 20, or portions thereof, may be stored in the remote memory storage device. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.
While it is envisioned that numerous embodiments of the present invention are particularly well-suited for computerized systems, nothing in this document is intended to limit the invention to such embodiments. On the contrary, as used herein the term “computer system” is intended to encompass any and all devices capable of storing and processing information and/or capable of using the stored information to control the behavior or execution of the device itself, regardless of whether such devices are electronic, mechanical, logical, or virtual in nature.
Graphics Processing
In each system, a graphics processing subsystem comprises a central processing unit 21′ that, in turn, comprises a core processor 214 having an on-chip L1 cache (not shown) and is further directly connected to an L2 cache 212. As is well known and appreciated by those of skill in the art, accessing data and instructions in cache memory is much more efficient for the CPU 21′ than accessing data and instructions in random access memory (RAM 25, referring to
For each system in these examples, and in contrast to the typical system illustrated in
Also common to
In the subsystem of
In the subsystem of
As discussed earlier herein, there are many approaches to updating graphics on a display device. With the brute force approach, and in reference to
A second known method for updating graphics on a display device is for the processor (the CPU 21′ or the GPU 242) to use a revision list to track in memory (VSM 222 or VRAMSM 248) each pixel that is changed, and then copy only the updated pixels from memory to the frame buffer. This approach has the advantage of copying to the frame buffer data pertaining only to those pixels which have changed; however, this approach is also resource intensive with regard to the memory necessary for maintaining the revision list, which in the worst case must record a change to every pixel. At four bytes per pixel (for 32-bit true color) on a 1024×768 display device (having 1024 pixels per row and 768 pixels per column on the display device), this method requires approximately three megabytes (3,145,728 bytes) of memory for the revision list. Since this amount of memory typically cannot be allocated and held by the system because of the negative impact such exclusive use of this memory would have on the processing speed of other, unrelated applications, this memory must be allocated (and, thereafter, released) in real time as the revisions are made. However, this amount of memory may not always be available for immediate use. Consequently, the graphic rendering software must have error-handling routines for out-of-memory conditions that might arise when required memory cannot be allocated. Altogether these shortcomings significantly slow video processing using this method.
A third method for updating graphics on a display device involves a complex algorithmic approach that analyzes individual revisions and groups them geometrically into small but efficient “revision regions” comprising both “dirty” pixels (pixels that have been changed) as well as “clean” pixels (that are unchanged). For efficiency, these regions are dynamically created and tracked in memory by various methods (e.g., by tracking starting point, number of horizontal pixels, and number of vertical pixels to rewrite) and are then merged together for an update to the frame buffer. However, for complex revisions, such as curves and other shapes that can only be broken down into a very large number of small rectangular regions, the combined cost of determining the region size, shape, and location; dynamically allocating memory to track same (and releasing this memory when complete); and conducting the merge of revised regions is computationally very expensive.
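The roughly three-megabyte figure follows directly from the display dimensions and color depth given above. The short Python calculation below reproduces it; the variable names are illustrative only, and the calculation counts pixel data alone, not any coordinates a real revision list might also store.

```python
WIDTH, HEIGHT = 1024, 768   # display dimensions from the text
BYTES_PER_PIXEL = 4         # 32-bit true color

# Worst case: every pixel is revised, so the list holds one entry per pixel.
worst_case_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
worst_case_mib = worst_case_bytes / (1024 * 1024)

print(worst_case_bytes)  # 3145728
print(worst_case_mib)    # 3.0
```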
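To make the cost of the revision-region approach concrete, the following Python sketch tracks each region by starting point, width, and height, as described above, and merges regions by taking their bounding rectangle. The bounding-rectangle merge and the names `Region`, `merge`, and `diagonal_regions` are assumptions chosen for illustration; actual region-merging algorithms may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    x: int       # starting point (left)
    y: int       # starting point (top)
    width: int   # number of horizontal pixels to rewrite
    height: int  # number of vertical pixels to rewrite

    @property
    def area(self):
        return self.width * self.height


def merge(a, b):
    """Merge two regions into the smallest rectangle covering both."""
    x0, y0 = min(a.x, b.x), min(a.y, b.y)
    x1 = max(a.x + a.width, b.x + b.width)
    y1 = max(a.y + a.height, b.y + b.height)
    return Region(x0, y0, x1 - x0, y1 - y0)


def diagonal_regions(length):
    """A one-pixel-wide diagonal decomposes into one 1x1 region per pixel."""
    return [Region(i, i, 1, 1) for i in range(length)]


regions = diagonal_regions(100)
merged = regions[0]
for r in regions[1:]:
    merged = merge(merged, r)

dirty_pixels = sum(r.area for r in regions)
print(len(regions))                # 100
print(merged.area)                 # 10000
print(merged.area - dirty_pixels)  # 9900
```

Even this simple diagonal requires 100 dynamically tracked regions and 99 merge operations, and the naive merge still rewrites 9,900 clean pixels; finer-grained merging reduces the clean-pixel count only at the price of more computation.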
To address these shortcomings, in one embodiment of the present invention, the display area of the display device 47′ (and the corresponding frame buffer 246n and/or shadow memories, VSM 222 and/or VRAMSM 248 respectively in
In alternative embodiments of the present invention, the zones may be established at some time other than startup (not predetermined), and the zones may be dynamic based on algorithms employed to determine the optimal zone size for any particular use (e.g., larger zones for text-based applications, smaller zones for applications that render detailed graphics objects) when the increased overhead necessary may be justified. Likewise, other alternative embodiments may not comprise a square zone grid but, instead, comprise a rectangular grid in which the number of vertical zones is greater or less than the number of horizontal zones.
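The difference between a square and a rectangular zone grid can be sketched as follows. The helper names `zone_size` and `zone_of`, and the particular grid counts used, are hypothetical and chosen only to illustrate how zone dimensions follow from the number of zones in each direction.

```python
def zone_size(width, height, zones_x, zones_y):
    """Return (zone_width, zone_height), rounding up to cover the display."""
    return -(-width // zones_x), -(-height // zones_y)


def zone_of(x, y, width, height, zones_x, zones_y):
    """Return the (column, row) of the zone containing pixel (x, y)."""
    zone_w, zone_h = zone_size(width, height, zones_x, zones_y)
    return x // zone_w, y // zone_h


# Square grid: the same number of zones in each direction.
print(zone_size(1024, 768, 16, 16))           # (64, 48)
# Rectangular grid: fewer vertical zones than horizontal zones.
print(zone_size(1024, 768, 16, 12))           # (64, 64)
print(zone_of(1023, 767, 1024, 768, 16, 12))  # (15, 11)
```

On a 1024×768 display, a square 16×16 grid yields non-square 64×48 zones, whereas a rectangular 16×12 grid yields square 64×64 zones; either way, locating the zone for a pixel is a pair of integer divisions.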
The method for one embodiment of the present invention, as illustrated in
The foregoing method is particularly effective for computer systems utilizing text enhancement technologies (TETs) such as Microsoft's ClearType™. ClearType™ is a “sub-pixel anti-aliaser,” a special type of TET software that dramatically improves the readability of text on LCDs (Liquid Crystal Displays), including without limitation laptop screens, Pocket PC screens, and flat panel monitors. ClearType™ enables the words on a display monitor to appear almost as sharp and clear as those printed on a piece of paper. This particular TET works by accessing the individual vertical color stripe elements (sub-pixels) in every pixel of an LCD screen. Prior to ClearType™, the smallest level of detail that a computer could display was a single pixel, but this TET displays features of text as small as a fraction of a pixel in width. This extra resolution increases the sharpness of the tiny details in text display, making it much easier to read over long durations. However, in operation this TET necessarily renders a very large number of graphic revisions to more clearly display the text, and these revisions are most effectively and efficiently rendered using the method described for the present embodiment.
Conclusion
The various systems, methods, and techniques described herein may be implemented with hardware or software or, where appropriate, with a combination of both. Thus, the methods and apparatus of the present invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention. In the case of program code execution on programmable computers, the computer will generally include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. One or more programs are preferably implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language, and combined with hardware implementations.
The methods and apparatus of the present invention may also be embodied in the form of program code that is transmitted over some transmission medium, such as over electrical wiring or cabling, through fiber optics, or via any other form of transmission, wherein, when the program code is received and loaded into and executed by a machine, such as an EPROM, a gate array, a programmable logic device (PLD), a client computer, a video recorder or the like, the machine becomes an apparatus for practicing the invention. When implemented on a general-purpose processor, the program code combines with the processor to provide a unique apparatus that operates to perform the functionality of the present invention.
While the present invention has been described in connection with the preferred embodiments of the various figures, it is to be understood that other similar embodiments may be used or modifications and additions may be made to the described embodiment for performing the same function of the present invention without deviating therefrom. For example, while exemplary embodiments of the invention are described in the context of digital devices emulating the functionality of personal computers, one skilled in the art will recognize that the present invention is not limited to such digital devices; the methods described in the present application may apply to any number of existing or emerging computing devices or environments, such as a gaming console, handheld computer, portable computer, etc., whether wired or wireless, and may be applied to any number of such computing devices connected via a communications network and interacting across the network. Furthermore, it should be emphasized that a variety of computer platforms, including handheld device operating systems and other application specific hardware/software interface systems, are herein contemplated, especially as the number of wireless networked devices continues to proliferate. Therefore, the present invention should not be limited to any single embodiment, but rather construed in breadth and scope in accordance with the appended claims.
This application is related by subject matter to the inventions disclosed in the following commonly assigned applications: U.S. patent application Ser. No. (not yet assigned) (Atty. Docket No. MSFT-1787), filed on even date herewith, entitled “SYSTEMS AND METHODS FOR EFFICIENTLY DISPLAYING GRAPHICS ON A DISPLAY DEVICE REGARDLESS OF PHYSICAL ORIENTATION”; and U.S. patent application Ser. No. (not yet assigned) (Atty. Docket No. MSFT-1794), filed on even date herewith, entitled “SYSTEMS AND METHODS FOR EFFICIENTLY UPDATING COMPLEX GRAPHICS IN A COMPUTER SYSTEM BY BY-PASSING THE GRAPHICAL PROCESSING UNIT AND RENDERING GRAPHICS IN MAIN MEMORY”.