Notice: More than one reissue application has been filed for the reissue of U.S. Pat. No. 6,680,738. The reissue applications are the present application, which is a broadening reissue of the '738 Patent, and application Ser. No. 12/789,856, which is a divisional broadening reissue of the '738 Patent.
This invention relates to computer-graphics systems, and more particularly to frame buffers split among multiple blocks in memory.
An interesting variety of small consumer devices are appearing. Portable computing and/or communication devices such as the personal digital assistant (PDA), Pocket PC, and smart cellular phones have astonishing computing power for such small devices. These portable, often hand-held, computing devices often use a very-large-scale-integration (VLSI) chip that includes a microprocessor or central processing unit (CPU), memory, and I/O controllers on a single silicon chip known as a System-On-a-Chip (SOC).
These consumer devices run on battery power to achieve portability. The battery must be made small and light to keep the size and weight of the overall device small. Such small batteries necessitate the use of low-power chips including the SOC.
The SOC can include an on-chip static random-access memory (SRAM). Programs running on the SOC's CPU can access data from an on-chip read-only-memory (ROM) and write data to the on-chip SRAM. Using the on-chip SRAM reduces power, since this avoids access cycles to an external dynamic-random-access memory (DRAM) that require more power to drive the larger off-chip capacitances.
Some accesses of the external DRAM may still be needed to load a very large program into the SRAM, or to fetch very large data files. Once these are stored and fetched, the external DRAM can be powered down while the program and frame buffer are located and executed within the on-chip SRAM. Use of the on-chip SRAM also improves performance, as SRAM access times are faster than access times to the external DRAM.
The SOC may include a graphics controller that continuously reads pixel data from a frame buffer and sends these pixels off-chip from the SOC to a display. The display can be a small liquid crystal display (LCD) that requires little power, or other compact display. The frame buffer can be a portion of the on-chip SRAM that is written by the CPU when updating the display. Using the internal SRAM for the frame buffer can further save power, since external accesses of an external frame-buffer memory are avoided.
However, larger, more colorful displays running at higher-resolution modes may require a large frame buffer to store a large number of pixels. Higher-color modes require more storage bits per pixel, and higher resolutions have more pixels to store. The on-chip SRAM may need to be enlarged to provide sufficient capacity for these larger frame buffers. However, larger on-chip SRAMs increase the SOC die size and reduce manufacturing yield. The SOC may even become too expensive for many low-cost consumer devices.
For example, a display of 320×240 pixels having one byte per pixel requires 76,800 bytes, which fits in a 100 Kilo-Byte (KB) SRAM. However, a more colorful display using 16 bits per pixel requires about 150 KB, which is larger than the 100 KB SRAM.
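The capacity comparison in this example can be checked with a short calculation. The following Python sketch is illustrative only; the function name frame_buffer_bytes and the reading of 100 KB as 100 × 1024 bytes are assumptions and are not taken from the disclosure.

```python
def frame_buffer_bytes(width, height, bits_per_pixel):
    """Bytes needed to store one frame, rounded up to whole bytes."""
    return (width * height * bits_per_pixel + 7) // 8


# Assumption: "100 KB" is read as 100 x 1024 bytes.
SRAM_BYTES = 100 * 1024

if __name__ == "__main__":
    for bits_per_pixel in (8, 16):
        size = frame_buffer_bytes(320, 240, bits_per_pixel)
        fits = size <= SRAM_BYTES
        print(bits_per_pixel, "bits per pixel:", size, "bytes, fits in SRAM:", fits)
```

At 8 bits per pixel the frame needs 76,800 bytes and fits; at 16 bits per pixel it needs 153,600 bytes and does not.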
The frame buffer could be split among the on-chip SRAM and the external DRAM. However, software programs running on the CPU expect the frame buffer to be a single, contiguous block of addresses. Re-writing the many programs that can run on the CPU to allow for a split frame buffer is not practical. Programs are written expecting a conventional frame buffer with a contiguous block of addresses.
What is desired is a SOC that supports a frame buffer that can be split among multiple blocks of memory in the internal SRAM and the external DRAM. A graphics controller that can re-assemble pixels from the multiple blocks is desirable. A SOC that has a high-power display mode that splits the frame buffer between the on-chip SRAM and the external DRAM, and with a low-power display mode that only uses the on-chip SRAM is desired. It is further desired that the frame buffer appear to be a single, contiguous block of memory to programs executing on the CPU.
The present invention relates to an improvement in frame buffers. The following description is presented to enable one of ordinary skill in the art to make and use the invention as provided in the context of a particular application and its requirements. Various modifications to the preferred embodiment will be apparent to those with skill in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed.
Peripherals 26 can include a keypad or pointing device that inputs commands or selections from a user. Peripherals 26 can include other devices, such as a speaker, light-emitting diode lights, cable connectors, etc. I/O controller 18 has registers that can be read and written by CPU 12 over internal bus 15 to communicate with and control peripherals 26. Timers, direct-memory access (DMA), or other I/O controller functions may be included in I/O controller 18.
Graphics display controller 24 drives a stream of pixels to external display 29 to refresh the display. These pixels are fetched from a frame buffer in on-chip SRAM 22 for low-power display modes when the entire frame buffer fits inside SRAM 22. However, for higher-resolution, higher-color, higher-power display modes the frame buffer does not completely fit in SRAM 22. Instead, the frame buffer is divided among two or more blocks. At least one block is in SRAM 22, but one or more other blocks are located in external SDRAM 28.
Graphics display controller 24 contains address-generation and boundary-checking logic to fetch some pixels from SRAM 22, while other pixels are fetched from SDRAM 28 using external memory controller 20.
Programs running on CPU 12 update the display by writing pixels to a single-block frame buffer in the logical address space. MMU 14 translates these memory accesses from CPU 12 into the physical addresses of multiple blocks in SRAM 22 and SDRAM 28. Thus programs see a frame buffer with a single memory block, but graphics display controller 24 sees a frame buffer with multiple memory blocks.
Memory writes from CPU 12 are sent to MMU 14 before actually being written to the frame-buffer memory. MMU 14 translates the virtual address from CPU 12 into a physical address for the memory. The physical address is within physical address space 40, which includes both on-chip addresses 90 in the on-chip SRAM, and external addresses 92 in the external SDRAM.
The frame buffer is split among two blocks in this example. Display block A is virtual block 32 in virtual address space 30, but physical block 42 in physical address space 40. Pixels written to block 32 are translated by the MMU to addresses within block 42 so that the pixels are written to the on-chip SRAM.
Display block B is virtual block 34 in virtual address space 30, but physical block 44 in physical address space 40. Pixels written to block 34 are translated by the MMU to addresses within block 44 so that the pixels are written to the external SDRAM. Thus although blocks 32, 34 are contiguous in virtual address space 30 accessed by programs, pixels are stored in separate physical blocks 42, 44 in different physical memory devices.
Graphics display controller 24 reads pixels from physical block 42 with internal addresses for the on-chip SRAM. These pixels are displayed as display block 46 on display 29. Graphics display controller 24 also reads pixels from physical block 44 with external addresses for the external SDRAM. These pixels are displayed as display block 48 on display 29. Thus graphics display controller 24 reads separate physical-memory blocks and combines them into a single frame of display.
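The two views of the frame buffer can be modeled in a few lines of Python. This is a behavioral sketch only, not the described hardware: the block size of one 4 KB page per block, the one-byte pixel, and the names cpu_write_pixel and controller_read_frame are assumptions made for illustration.

```python
BLOCK_SIZE = 4096  # assumption: each physical block is one 4 KB page

sram_block_42 = bytearray(BLOCK_SIZE)    # stands in for the on-chip SRAM block
sdram_block_44 = bytearray(BLOCK_SIZE)   # stands in for the external SDRAM block
PHYSICAL_BLOCKS = (sram_block_42, sdram_block_44)


def cpu_write_pixel(virtual_offset, value):
    """Write a one-byte pixel at an offset into the contiguous virtual frame buffer.

    The divmod plays the role of the MMU: it selects the physical block
    and the offset inside that block.
    """
    block_index, offset = divmod(virtual_offset, BLOCK_SIZE)
    PHYSICAL_BLOCKS[block_index][offset] = value


def controller_read_frame():
    """Read block 42, then block 44, and join them into one frame."""
    return bytes(sram_block_42) + bytes(sdram_block_44)
```

A program writes to offsets 0 through 8191 as if the buffer were one block, while the reader of the frame visits two separate memories.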
Logical address 52 is generated by the CPU and contains a least-significant-bits (LSB) portion known as offset 56. The most-significant-bits (MSB) portion of logical address 52 is page base address 54, or simply the page address.
Page base address 54 of logical address 52 is used to look up an entry in translation table 50. The entry for page base address 54 contains a translated page address, which is output from table 50 as translated page address 55.
Physical address 53 is formed by concatenating translated page address 55 with offset 56 from logical address 52, which is unchanged as offset 57. Offset 56, 57 can be considered the address within a page that is identified by page base address 54 in the logical address space, or translated page address 55 in the physical address space. In this example, offsets 56, 57 are 12 bits, for a page size of 4 KB.
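The translation can be expressed as a shift, a table lookup, and a concatenation. The Python sketch below assumes the 12-bit offset of this example; the dictionary used as the translation table and the page numbers in it are hypothetical values chosen only for illustration.

```python
PAGE_BITS = 12                       # 12-bit offset, 4 KB pages (from the example)
OFFSET_MASK = (1 << PAGE_BITS) - 1


def translate(logical_address, translation_table):
    """Form a physical address from a logical address.

    The upper bits select a table entry; the lower 12 bits pass through
    unchanged as the offset.
    """
    page_base = logical_address >> PAGE_BITS
    offset = logical_address & OFFSET_MASK
    translated_page = translation_table[page_base]
    return (translated_page << PAGE_BITS) | offset


# Hypothetical table: two adjacent logical pages map to unrelated physical pages.
EXAMPLE_TABLE = {0x40000: 0x00010, 0x40001: 0x80000}
```

With this table, logical address 0x40000ABC translates to 0x00010ABC, and logical address 0x40001004 translates to 0x80000004.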
Just before the start of a new screen, VS is activated, and the screen starting address from selector 60 is fed through mux 62 to line start register 64. Otherwise, mux 62 selects the next line starting address, LSA(n+1), when the vertical sync VS is not active. This allows the address for the next horizontal line of pixels to be generated.
The new screen start address (SSA) selected by mux 62 can be loaded into line start register 64 when both HS and VS are active. Then, at the end of the horizontal sync (HS), the screen start address in line start register 64 is latched into memory address counter 66.
Memory address counter 66 is incremented by the pixel size for each pixel clock PCLK during the display time, when both horizontal display active HDA and vertical display active VDA are on. Line start register 64 is loaded with a new value at the beginning of HS, and memory address counter 66 is then loaded with the line start address from line start register 64 towards the end of the HS timing, before the beginning of each line. Memory address counter 66 contains a physical address that is sent to the physical memory to fetch the next pixel. The physical address is in the on-chip SRAM for the first block, but can be in the external SDRAM for other blocks of the frame buffer.
Adder 70 generates the next line's starting address, LSA (n+1), by adding the current line start address from line start register 64 to the screen-image width SIW from screen-width register 68. Screen-width register 68 contains the width of the screen, which is the number of pixels in the displayed line plus the number of pixels in the off-screen (non-displayed) part of the line, multiplied by the pixel width in bytes (or other addressable units of the physical memory). This next-line starting address is selected by mux 62 to be loaded into line start register 64 at the beginning of the next HS.
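The line-address arithmetic can be sketched as follows. This Python fragment is illustrative only and ignores block boundaries; the function names and the sample numbers used in the note after it are assumptions.

```python
def screen_image_width(displayed_pixels, offscreen_pixels, bytes_per_pixel):
    """SIW: displayed plus off-screen pixels per line, in addressable units."""
    return (displayed_pixels + offscreen_pixels) * bytes_per_pixel


def line_start_addresses(ssa, siw, line_count):
    """Return LSA(0) .. LSA(line_count - 1) by repeated addition of SIW."""
    addresses = []
    lsa = ssa
    for _ in range(line_count):
        addresses.append(lsa)
        lsa = lsa + siw   # the role of adder 70
    return addresses
```

For instance, 320 displayed pixels, 32 off-screen pixels, and 2 bytes per pixel give a SIW of 704, so lines starting at address 4096 begin at 4096, 4800, and 5504.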
The physical address of the current pixel from memory address counter 66 is compared by detector 72 to the block end address BEA in register 76. When the current pixel's address matches the block end address, the end of the block has been reached. The next pixel must be fetched from a different physical block, which may not be contiguous with the current block.
Sometimes the block end may occur at the end of a line of pixels, or between the address at the end of a line and the next line's start address, rather than in the middle of the line. Then the block end is not detected by detector 72. Instead, comparator 74 detects that the next line's starting address (which is generated by adder 70) is greater than the block end address from BEA register 76.
When the end of the block is detected by either detector 72 (block ends in the middle of a line) or comparator 74 (block ends at end of line), block end signal BE is activated. The next-line address from adder 70 is no good as it over-runs or exceeds the end of the block. Instead, the start address of the next block is used. This block start address (BSA) is selected to be output from start address selector 60 prior to the end of the current block. Mux 62 selects the new block's start address from start address selector 60 when BE is activated. The new block start address is latched into line start register 64 by the BE signal during the vertical display active time VDA.
Memory address counter 66 is loaded with the new block start address from line start register 64. The next pixel in the line is fetched from the new physical memory block at this block starting address. Memory address counter 66 then continues to count up with the pixel clock, reading pixels from the new physical block. Adder 70 generates the next line's starting address, LSA(n+1) from the block's starting address in line start register 64. At the end of the line, when HS is activated, this new line's address is latched into line start register 64 through mux 62, and pixel fetching continues with the second display line in the new physical block.
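The block-end handling can be followed in a small behavioral model. The Python sketch below is one reading of the description for arbitrary-length blocks, not a definitive implementation: the function name scan_frame, the representation of each block as a (start address, end address) pair with an inclusive end address, the pixel size of one addressable unit, and the guard that stops switching after the last block are all assumptions.

```python
def scan_frame(ssa, blocks, siw, pixels_per_line, lines):
    """Return the physical addresses fetched for one frame.

    blocks is a list of (block start address, block end address) pairs.
    The first block is scanned starting from ssa.
    """
    index = 0
    bea = blocks[0][1]
    line_start = ssa                  # line start register 64
    fetched = []
    for _ in range(lines):
        counter = line_start          # memory address counter 66
        for pixel in range(pixels_per_line):
            fetched.append(counter)
            last_in_line = pixel == pixels_per_line - 1
            more_blocks = index + 1 < len(blocks)
            if counter == bea and not last_in_line and more_blocks:
                # Role of detector 72: block ends in the middle of a line.
                index += 1
                line_start, bea = blocks[index]
                counter = line_start
            else:
                counter += 1
        next_start = line_start + siw  # role of adder 70
        if next_start > bea and index + 1 < len(blocks):
            # Role of comparator 74: next line would start past the block end.
            index += 1
            line_start, bea = blocks[index]
        else:
            line_start = next_start
    return fetched
```

With blocks (0, 9) and (100, 199), a SIW of 4, and 4 pixels per line, the third line fetches addresses 8, 9, 100, 101, and the fourth line starts at 104, since the next-line address is generated from the block start address held in the line start register.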
The screen starting address in start address selector 60 is a full-width address. The full address can be loaded into both the upper (U) and lower (L) portions of line start register 64 and memory address counter 66 as described above. This allows the physical blocks to have an arbitrary length. However, the block start address uses a fixed block size, such as fixed-size pages. For example, the page size (length) is often set to 4K bytes, where the lower 12 address bits are the offset within the page. Then the page's (block's) starting address can be specified by just the upper m−12 address bits, where m is the address width in bits. For example, a 32-bit address can use just the upper 20 bits as the page starting address, since it is assumed that the lower address bits are all zeros for the first address in the page.
Using such fixed-length physical-memory blocks, or pages, is advantageous because smaller-width registers and logic paths can be used. Lower address bits can be quickly zeroed out when a new physical block is started. When a new block address is loaded from start address selector 60 through mux 62 to line start register 64, the upper bits are loaded from start address selector 60 by mux 62 into the upper portion of line start register 64. The lower bits are zeroed by mux 62 and loaded as zeros into the lower portion of line start register 64. Likewise, the lower portion of memory address counter 66 can become zero at the start of a new block, and just the upper bits loaded from line start register 64. If the block end occurs between the end of a line and the next start address, the lower portion of memory address counter 66 can be loaded from the output of adder 70 through mux 62 and line start register 64. This is not the start of a new block, so just the upper bits are loaded from line start register 64.
Since all physical addresses within a block have the same upper address bits, the upper portions of line start register 64 and memory address counter 66 can continue to be loaded from start address selector 60 for each new line, if the next line start address is in the same block. Alternately, the upper portions can be loaded just once at the start of the block, and not re-loaded when the lower-portions are loaded for each new display line.
An overflow or carry-out signal from the lower portion of memory address counter 66 is an input to detector 72. If the block end address register 76 and upper portion of memory address counter 66 are equal, then the BE signal becomes active. Adder 70 outputs the next line's start memory address, which is input to comparator 74 to compare with block end address register 76 to signal the block end BE when the next line's start address is greater than block end address register 76.
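For fixed-size pages, the two block-end conditions reduce to checks on the upper address bits. The Python sketch below is one possible reading, offered under stated assumptions: 4 KB pages, a block end address register that holds only the upper bits of the last page in the block, and hypothetical function names.

```python
PAGE_BITS = 12
LOWER_MASK = (1 << PAGE_BITS) - 1


def lower_carry_out(counter, step):
    """True when adding step overflows the lower 12 bits of the counter."""
    return (counter & LOWER_MASK) + step > LOWER_MASK


def detector_block_end(counter, step, bea_upper):
    """Mid-line block end: lower portion carries out while in the last page."""
    in_last_page = (counter >> PAGE_BITS) == bea_upper
    return lower_carry_out(counter, step) and in_last_page


def comparator_block_end(next_line_start, bea_upper):
    """Line-boundary block end: next line would start beyond the last page."""
    return (next_line_start >> PAGE_BITS) > bea_upper
```

A carry-out in a page other than the last page of the block does not signal a block end; the counter simply continues into the next page of the same block.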
Block Ends in Middle of Display Line
The displayable area occurs for only a portion of the total frame time. Pixels are written to the display during the horizontal display active HDA time of a line, but not when HDA is off at the end of each line. The horizontal sync signal HS occurs when HDA is off at the ends of the lines.
Displayable lines of pixels are written to the display during the vertical display active VDA time, but not after the last line is written and VDA is turned off. Then the vertical sync VS signal occurs. Both the VS and HS signals are active near the end of the display frame timing.
The first physical block A, display block 46, begins with the screen starting address SSA at the beginning of line LSA0, and ends with BEA on line LSA(n). Block 46 ends in the middle of line LSA(n). Block-end signal BE is activated by the detector when the memory address counter matches the block-end address. Then the starting address of the next physical block is loaded into the line start register.
The first pixel in the second physical block B is read from the block start address BSA and written to the display to continue the current line LSA(n). Then other pixels in line LSA(n) in block 48 are written, and the next line LSA(n+1) in block 48 and subsequent lines are written.
Block Ends after End of Display Line in Off-Screen Area
Block-end signal BE is activated by the comparator when the next-line starting address generated by the adder exceeds the block-end address. Then the upper portion (U), the upper m−12 bits, of the next-line address generated by adder 70 is discarded. Instead, the upper portion (U) of the starting address of the next physical block is loaded into the line start register from start address selector 60. The lower part (L), bits 11-0, generated by adder 70 is loaded to memory address counter 66 through mux 62 and line start register 64.
The first pixel in the second physical block B is read and written using the address value from memory address counter 66. It is the first displayable pixel in the next line LSA(n+1). Then other pixels in line LSA(n+1) in block 48 are written, and subsequent lines are written.
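The address formed in this case combines the upper bits of the next block with the lower bits from the adder. A minimal Python sketch, assuming 4 KB pages and hypothetical names and values:

```python
PAGE_BITS = 12
LOWER_MASK = (1 << PAGE_BITS) - 1


def line_start_after_block_end(adder_output, next_block_upper):
    """Keep bits 11-0 of the adder output; replace the upper bits with
    the upper portion of the next block's start address."""
    return (next_block_upper << PAGE_BITS) | (adder_output & LOWER_MASK)
```

For example, an adder output of 0x00011040 that has run past the block end, combined with a next-block upper portion of 0x80000, yields 0x80000040 as the address of the first displayable pixel of the next line.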
Full-Power Mode Fetches Pixels from Both On-Chip and External Memories
The second block B of the frame buffer begins at block start address BSA. Second block 48 continues to the end of the displayable area in this simplified example. This second block 48 is stored in external SDRAM, which requires more power to access than the on-chip SRAM since external lines with larger capacitances have to be driven. Also, the access time may be slower for the external SDRAM than for the internal SRAM.
During display of a frame, power consumption of the graphics functions in the SOC chip is lower during first block 46 than for second block 48 and any subsequent blocks from external SDRAM.
Low-Power Mode Fetches Pixels only from On-Chip Memory
To save power, the size of the display is reduced to a display window that is a fraction of the normal display. Additional counters and comparators count the number of pixels in the current line, and the line number. When the number of pixels in the current line matches or exceeds the horizontal-active-end HAE value, memory fetching ends for the line. Instead, dummy pixel data is written to the display. The dummy pixels could all be black or gray or some other pre-determined color. These dummy pixels form blank data 82 that is displayed after HAE. Active data 80 contains pixels that are fetched from the on-chip SRAM.
When the current line matches or exceeds the vertical-active end (VAE), memory fetching for the frame ends. Dummy pixels are written to the display, forming blank data 84.
The values for HAE and VAE are set so that the display window of active data 80 falls completely within first block 46, before BEA. Only dummy pixel data occurs in the second block 48, and no pixel-fetches of the external memory are required.
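Whether a chosen window stays inside the first block can be checked arithmetically. The Python sketch below is illustrative only; it assumes that the window starts at the screen starting address, that the first block runs contiguously from that address to BEA inclusive, and that HAE and VAE are counts of pixels and lines. The function names are hypothetical.

```python
def is_active(pixels_already_output, lines_already_output, hae, vae):
    """True while pixels are fetched from memory; False once the pixel
    count reaches HAE or the line count reaches VAE (dummy pixels)."""
    return pixels_already_output < hae and lines_already_output < vae


def window_last_address(ssa, siw, hae, vae, pixel_size):
    """Address of the last addressable unit fetched for the window."""
    return ssa + (vae - 1) * siw + hae * pixel_size - 1


def window_fits_first_block(ssa, bea, siw, hae, vae, pixel_size):
    """True when every fetch for the window lands at or before BEA."""
    return window_last_address(ssa, siw, hae, vae, pixel_size) <= bea
```

As a usage example under these assumptions, with 2-byte pixels, a SIW of 640, and a first block ending at address 102,399, a window of 320 pixels by 160 lines fits, while 161 lines would require a fetch beyond the first block.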
During display of a frame in a stand-by or other low-power mode using only the display window, power consumption of the graphics functions in the SOC chip is lower since only pixels from first block 46 are fetched. No fetches occur for second block 48, so power-hungry fetches from external SDRAM are eliminated.
Active data 80 in the smaller display window can contain status information for the hand-held device, or a smaller amount of information than is available for the full-power mode when the entire display is written. The hand-held device can switch to the standby or low-power display mode when the battery is low, or when little activity is occurring. The device can switch to the higher or full-power mode when the user is actively operating the device and needs more information displayed.
Several other embodiments are contemplated by the inventors. For example the address generator of
The entire page (all offset address locations within a page) does not have to be used by displayable pixels. Some overhead storage locations may be located on the page, and the last page on a frame can have only some of the available space used. A typical system has many more pages than shown in the drawings. For example, a frame buffer that stores 1 M-byte of pixels uses about 256 physical pages when each page is 4K in length.
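The page count follows from a ceiling division. In this short Python check the function name pages_needed is an assumption, and 1 M-byte and 4K are read as 2^20 and 4096 bytes.

```python
def pages_needed(frame_bytes, page_bytes=4096):
    """Number of fixed-size pages needed; a partly used last page counts."""
    return (frame_bytes + page_bytes - 1) // page_bytes
```

A frame buffer of exactly 2^20 bytes needs 256 pages; one more byte would require a 257th, partly used, page.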
The physical address of the memory may be specified in units other than bytes. For example, the physical memory may be read in words of 4 or 8 bytes. The memory address counter can be made to increment by one memory word every other pixel clock, to fetch several pixels at a time. Memory address counter 66 could count downward rather than upward to read blocks from top to bottom.
Some systems may not use vertical and horizontal sync signals, or other timing signals described, or may substitute other signals. Flat-panel and LCD displays may not require sync signals, or may use other signals. However, the invention can be modified to use these substitute signals, or dummy sync signals can be generated. The pixel clock may be stopped while the memory address counter is loaded, or pixel buffering or fast address loading may allow the pixel clock to continue uninterrupted.
The hand-held device can switch to a lower-power mode by changing the color resolution of the display. For example, the display can be switched from a 2-byte-per-pixel mode to a 2-bit-per-pixel mode to reduce the frame buffer size and reduce memory fetches. The number of pixels per line, and the number of lines per frame may also be changed to reduce power. When the frame buffer size falls below the size of the on-chip SRAM, then all refresh fetches are to the lower-power on-chip memory. Even when some external memory is needed, reducing the overall frame-buffer size reduces the number of external fetches needed and thus reduces power.
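The effect of color depth on external fetches can be estimated with a simple calculation. The Python sketch below is illustrative only; it assumes a 320×240 display, that the whole on-chip SRAM of 100 × 1024 bytes is available to the frame buffer, and hypothetical function names.

```python
def frame_bytes(width, height, bits_per_pixel):
    """Bytes in one frame, rounded up to whole bytes."""
    return (width * height * bits_per_pixel + 7) // 8


def external_bytes_per_refresh(width, height, bits_per_pixel, sram_bytes):
    """Bytes that must come from external memory on each refresh."""
    return max(0, frame_bytes(width, height, bits_per_pixel) - sram_bytes)
```

Under these assumptions, the 2-byte-per-pixel mode fetches 51,200 bytes per refresh from external memory, while the 2-bit-per-pixel mode needs 19,200 bytes in total and fetches none externally.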
The abstract of the disclosure is provided to comply with the rules requiring an abstract, which will allow a searcher to quickly ascertain the subject matter of the technical disclosure of any patent issued from this disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. 37 C.F.R. §1.72(b). Any advantages and benefits described may not apply to all embodiments of the invention. When the word “means” is recited in a claim element, Applicant intends for the claim element to fall under 35 USC §112, paragraph 6. Often a label of one or more words precedes the word “means”. The word or words preceding the word “means” is a label intended to ease referencing of claim elements and is not intended to convey a structural limitation. Such means-plus-function claims are intended to cover not only the structures described herein for performing the function and their structural equivalents, but also equivalent structures. For example, although a nail and a screw have different structures, they are equivalent structures since they both perform the function of fastening. Claims that do not use the word “means” are not intended to fall under 35 USC §112, paragraph 6. Signals are typically electronic signals, but may be optical signals such as can be carried over a fiber optic line.
The foregoing description of the embodiments of the invention has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. It is intended that the scope of the invention be limited not by this detailed description, but rather by the claims appended hereto.
Relation | Number | Date | Country |
---|---|---|---|
Parent | 09683852 | Feb 2002 | US |
Child | 11337221 | | US |