This invention is related to graphics controller systems and, more particularly, to graphics controller systems with low power dissipation.
As shown in
The video memory interface 15 of the graphics controller integrated circuit 10 has ports dedicated to interface with the video memory 11. The number of ports required for this interface 15 is the sum of the address, data and control signals required to access the video memory 11. The memory 11 has a size which is a function of the video frame buffer required to support the display resolution. While dynamic random access memory (DRAM) is most commonly used for the video frame buffer, some high performance systems use VRAMs (DRAMs with serial data ports added). A typical VGA (Video Graphics Array standard) display in an IBM-compatible mobile computer, often called a notebook computer, with an LCD (liquid crystal display) panel uses a single 256K×16 DRAM integrated circuit as a video frame buffer. A typical SVGA (Super VGA standard) system uses two such DRAMs organized as 256K×32.
Wider data paths between the video memory and the graphics controller allow greater bandwidth for data transfer. However, the wider data paths also increase the pin count of the graphics controller package and the package count of the DRAMs with the accompanying increased manufacturing complexity and costs. A 16-bit data path requires one DRAM package and approximately 30 signal lines to handle the memory address, data, and control signals, while a 32-bit data path requires two DRAM packages and 50 signal lines. Power dissipation is increased as more signal lines are added since each signal line has a parasitic capacitance associated with the package I/O pin, as well as with the conducting trace on the motherboard of the mobile computer system. Therefore, an increase in graphics performance is accompanied by an increase in power dissipation, pin count and package count.
The present invention solves or substantially mitigates these problems with a high performance graphics controller system having low power dissipation, and low pin and package counts.
According to the invention, there is provided a graphics controller system with increased performance simultaneously with a reduction in power dissipation, pin count and package count. The previously external video memory is integrated with the graphics controller system to eliminate the memory interface. The reduction in pin count is used to add the pins associated with a PCMCIA host adapter and thus allow the integration of that function on the same chip, so as to further reduce the package count on the motherboard.
The present invention also provides for particular arrangements for logic circuits and output buffer circuits so that large amounts of logic circuitry sufficient to perform graphics controller system functions may be integrated with the large amounts of memory sufficient to act as a high performance video memory. Furthermore, the present invention provides for a wide bus between the integrated video memory and the functional blocks of the graphics controller system. The present invention has circuits in these blocks for manipulating the video data from the wide bus so that data transfer remains compatible with the various operational VGA modes.
In accordance with the present invention, the graphics controller functions are integrated on the same integrated circuit substrate as the video memory, as shown in
The integrated circuit 20 also has other interfaces, such as an infrared interface 26 for wireless transmission of data between the mobile computer and another device, a PCMCIA host adaptor interface 27 for connections to devices, such as modems, hard disks, etc., which are designed to meet the PCMCIA specification, and a video stream interface 28 for receiving video signals from a variety of sources, such as television or VCR signals. The video stream interface 28 is adapted to the VAFC (VESA Advanced Feature Connector) standards being promulgated by the Video Electronics Standards Association (VESA).
Integrating a large block of DRAM (approximately 7 megabits) on the same substrate with a large block of logic (approximately 40K to 50K logic gates) required for the graphics engine 22 and various interfaces is not simply a matter of placing DRAM and logic circuits on the same integrated circuit substrate. The optimum technologies for a DRAM and for a logic circuit are electrically incompatible. Hence various steps, described below, must be taken to ensure that the performance of the DRAM circuits and the logic circuits is fully maintained and not compromised.
An integrated circuit process used for building logic gates uses one of the external supply voltages (VDD or VSS depending upon the substrate type) as the voltage to bias the substrate. On the other hand, an integrated circuit process used for building a DRAM uses an internally generated substrate bias voltage which is different from the external supply voltages. This is done primarily to lower the junction capacitance of the memory cell bit lines of the DRAM, as well as to improve the memory cell refresh time.
For example, in commonly used CMOS technology today, the substrate material is P-type silicon. A logic process uses the externally supplied VSS, or ground, voltage to connect to the substrate. The VSS line is also used in the circuit areas to provide the ground path for current flow in the pulldown, N-channel transistors in the logic circuits.
Thus the same VSS metal tracks in the logic integrated circuit serve two functions: 1) as a substrate tap every 25-50 microns across the substrate surface, and 2) as a source terminal of the N-channel transistors of the logic circuits. The substrate taps are necessary to protect the circuits from going into a latch-up condition during operation, since each logic gate has both N-type and P-type transistors.
In a CMOS DRAM, however, the typical DRAM array is built with only N-type transistors and capacitors, and a major goal is to minimize the parasitic capacitance of the memory bit line. Since the N-type bit line junction areas contribute a majority of the bit line capacitance and since the junction capacitance is greatly reduced by a reverse junction bias voltage, the P-type substrate is typically biased at −1.5 volts, termed VBB. This voltage is generated from an on-chip charge pump and thus has a limited capacity and a high output impedance. This results in the substrate voltage being relatively “noisy” due to the precharging and discharging of the large junction areas associated with the memory array, which are capacitively coupled to the substrate.
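By way of illustration only, the effect of a reverse bias on junction capacitance can be estimated with the standard depletion-capacitance approximation, C_j = C_j0 / (1 + V_R/φ)^m. The sketch below is not taken from this description: the built-in potential `phi`, the grading coefficient `m`, the normalized zero-bias capacitance and the bit line voltage `V_BITLINE` are all assumed values chosen only to show the trend.

```python
def junction_capacitance(cj0, v_reverse, phi=0.7, m=0.5):
    """Depletion capacitance of a reverse-biased PN junction.

    cj0       -- zero-bias capacitance (result is in the same unit)
    v_reverse -- reverse bias across the junction in volts (>= 0)
    phi, m    -- built-in potential and grading coefficient (assumed values)
    """
    if v_reverse < 0:
        raise ValueError("this model only covers reverse bias")
    return cj0 / (1.0 + v_reverse / phi) ** m


V_BITLINE = 1.65  # assumed bit line level in volts (hypothetical)
CJ0 = 1.0         # normalized zero-bias capacitance (hypothetical)

# N-type bit line junction to a P-substrate tied to VSS (0 V) ...
c_vss = junction_capacitance(CJ0, V_BITLINE - 0.0)
# ... versus the same junction with the substrate pumped to VBB = -1.5 V.
c_vbb = junction_capacitance(CJ0, V_BITLINE - (-1.5))

print(f"substrate at  0.0 V: {c_vss:.3f} x Cj0")
print(f"substrate at -1.5 V: {c_vbb:.3f} x Cj0")
```

Under these assumed parameters the pumped substrate yields a visibly smaller junction capacitance than a grounded substrate, which is the motivation given above for the VBB bias.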
In the DRAM, the small amount of on-chip logic which handles the memory address decoding and the data read and write functions typically uses the VSS metal tracks only to connect to the source terminal of the N-channel, pulldown transistors and not as a substrate tap. As shown by a representative logic gate, an inverter, in
In fact, DRAMs typically do not have any substrate taps in the middle of the circuitry. The substrate taps are only made at the edges of the die. Since most of the logic blocks in a DRAM consist of a few cells, repeated many times, which together form a small portion of the total chip area, large P-to-N diffusion spacings, typically 25 microns, can be maintained in the logic cells to avoid latch-up. In contrast, in a logic circuit, which has many different cell types connected in a relatively random manner, the cell size is very important as it determines the total chip area. The P-to-N diffusion spacings are minimized, typically 5 microns, which requires the use of substrate taps in every cell to avoid latch-up.
To combine a significant amount of random logic to a significant amount of DRAM in a single integrated circuit requires that this problem be overcome. The present invention combines the random logic, the graphics engine 12 and interfaces, with the DRAM 11, in an integrated circuit 20 manufactured in accordance with a DRAM process. The logic circuits of the integrated circuit are redesigned to decouple the VSS line connected to the source terminals of the N-channel, pulldown transistors from the P-substrate tap. As shown in
Additionally, the graphics engine 22 of the integrated circuit 20 has analog circuits. In an exemplary analog circuit, a low-pass filter is often used and an RC circuit is required. Heretofore, the capacitor of the RC circuit has been typically formed by an NMOS transistor with its gate forming one terminal and the shorted source/drains forming the other terminal of the capacitor. Since the body bias of this transistor is the noisy substrate voltage generated from the on-chip charge pump required for the DRAM 21, some of the noise couples inevitably into the low-pass filter circuit. To avoid this problem, the analog circuits according to the present invention are designed with mostly P-channel transistors within independent N-wells which are connected to the positive and relatively quiet reference voltage, VDD, as shown in
On the periphery of the integrated circuit are buffer circuits for transferring signals to and from the external world. Problems arise with the DRAM technology in the driver stage circuit of each output buffer. Shown in
During the switching of output signals, the output signal voltages invariably tend to overshoot the VDDQ and VSSQ supply voltages due to an impedance mismatch between the driver transistors and the external load. When the output signal voltage, DATA OUT, is being driven high in response to an internal signal, dataout*, going low at the input terminal 57, for instance, the prior art design results in a parasitic diode 55, marked by dotted lines, becoming forward-biased when the overshoot exceeds 0.6 volts. The diode 55 is formed by the junction of the drain of the P-channel, pullup transistor 50 and the N-well holding the transistor, which is also connected to VDDQ. This forward-biasing action results in the injection of positive charges, or holes, into the substrate which works against the on-chip substrate bias voltage generator. The amount of hole injection is a function of the severity of the overshoot, the number of output buffers, and the frequency of switching. If the substrate bias voltage generator is overwhelmed by excessive hole injection, functional failures or soft errors occur in the on-chip DRAM.
To avoid this problem, a new output driver circuit, shown in
For each bank of output buffers on the same integrated circuit, a different NVDD bias generator, referencing the VDDQ supply for that bank, is used to generate the higher voltage. This allows the N-well(s) of an output buffer bank which is driven from a +3.3V supply to be biased at +4.3V, while an output buffer bank driven by a +5.0V supply has its N-well(s) biased at +6.0V.
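The benefit of the raised N-well bias can be sketched numerically. The following is a simplified model, not the circuit itself: it treats the parasitic diode as conducting whenever the output exceeds the N-well voltage by more than the 0.6 volt figure given above, infers a 1.0 volt offset from the +3.3V/+4.3V and +5.0V/+6.0V examples, and uses a hypothetical 0.9 volt overshoot.

```python
DIODE_TURN_ON = 0.6  # volts, forward-bias threshold quoted in the description
NWELL_OFFSET = 1.0   # volts above VDDQ, inferred from 3.3 -> 4.3 and 5.0 -> 6.0


def nwell_bias(vddq, boosted=True):
    """N-well voltage for an output buffer bank supplied by vddq."""
    return vddq + NWELL_OFFSET if boosted else vddq


def diode_conducts(v_out, v_nwell):
    """True when the drain-to-N-well parasitic diode is forward-biased."""
    return (v_out - v_nwell) > DIODE_TURN_ON


OVERSHOOT = 0.9  # hypothetical overshoot on DATA OUT, in volts

for vddq in (3.3, 5.0):
    v_peak = vddq + OVERSHOOT
    prior = diode_conducts(v_peak, nwell_bias(vddq, boosted=False))
    raised = diode_conducts(v_peak, nwell_bias(vddq, boosted=True))
    print(f"VDDQ={vddq:.1f} V: well at VDDQ conducts={prior}, "
          f"well at NVDD conducts={raised}")
```

In this model an overshoot would have to exceed roughly 1.6 volts, rather than 0.6 volts, before holes are injected into the substrate.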
As shown in
Integrating the video memory, the DRAM 21, with a graphics controller results in significant power savings compared to present graphics controller systems with external DRAM. Capacitance in present graphics controller systems consists of the capacitance of the I/O pins of the DRAM packages and the controller package plus the capacitance of the traces on the motherboard which carries and connects the DRAM and controller packages. The present invention has a roughly twenty-fold reduction of the video memory address, data and control bus capacitance. This results in an equivalent power savings, since switching power is proportional to the switched capacitance and most of these lines are continuously switching at high frequencies.
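The proportionality between bus capacitance and power can be illustrated with the usual CMOS switching-power relation, P = α·N·C·V²·f. In the sketch below only the twenty-fold capacitance ratio and the 50-line bus width come from the description; the per-line capacitance, supply voltage, frequency and activity factor are assumed values for illustration.

```python
def switching_power(c_farads, v_volts, f_hertz, lines=1, activity=1.0):
    """Dynamic switching power in watts: activity * lines * C * V^2 * f."""
    return activity * lines * c_farads * v_volts ** 2 * f_hertz


LINES = 50               # signal lines of a 32-bit external memory interface
C_EXTERNAL = 20e-12      # assumed capacitance per external line, in farads
C_INTERNAL = C_EXTERNAL / 20  # twenty-fold reduction stated in the description
VDD = 3.3                # assumed supply voltage, in volts
F_BUS = 25e6             # assumed switching frequency, in hertz

p_external = switching_power(C_EXTERNAL, VDD, F_BUS, lines=LINES)
p_internal = switching_power(C_INTERNAL, VDD, F_BUS, lines=LINES)

print(f"external bus: {p_external * 1e3:.1f} mW")
print(f"internal bus: {p_internal * 1e3:.2f} mW")
print(f"ratio: {p_external / p_internal:.1f}x")
```

Whatever absolute values are assumed, the ratio of the two results tracks the capacitance ratio, because voltage, frequency and line count are held equal.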
Another source of power savings is the 128-bit wide memory word which can be transferred between the graphics engine 22 and the video memory 21. The controller, i.e., the graphics engine 22, has 128 bits of data available after one DRAM read cycle. In comparison, the graphics controller system in the prior art requires four or eight read cycles, depending upon a 32-bit or 16-bit wide DRAM organization, respectively. Since a fixed amount of power is consumed every DRAM cycle, the present invention has a savings of three-fourths to seven-eighths of the prior art power dissipation.
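The arithmetic behind the three-fourths and seven-eighths figures can be checked with a short sketch. It assumes, as the paragraph above does, that each DRAM cycle consumes the same fixed amount of power regardless of bus width; the function names are illustrative only.

```python
from fractions import Fraction

WIDE_BUS_BITS = 128  # width of the on-chip memory word


def read_cycles(bus_bits, word_bits=WIDE_BUS_BITS):
    """DRAM read cycles needed to fetch word_bits over a bus_bits-wide path."""
    return -(-word_bits // bus_bits)  # ceiling division


def cycle_power_saving(prior_bus_bits):
    """Fraction of DRAM cycle power saved relative to a narrower bus."""
    wide = read_cycles(WIDE_BUS_BITS)
    narrow = read_cycles(prior_bus_bits)
    return 1 - Fraction(wide, narrow)


for width in (32, 16):
    print(f"{width}-bit DRAM: {read_cycles(width)} cycles, "
          f"saving {cycle_power_saving(width)}")
```

A 32-bit organization needs four cycles for the same 128 bits, so one cycle instead of four saves 3/4; a 16-bit organization needs eight, so the saving is 7/8.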
Furthermore, the present invention uses memory very efficiently. The capacity of the video memory 21 is not required to fall on high order binary boundaries, such as combinations of DRAM integrated circuits forming a memory of 32K×128 bits, or 64K×128 bits, the next larger size. In the present invention, the addition of memory blocks, with a typical size of 256K (2¹⁸) bits each, achieves the required capacity for the video memory 21. Memory capacity can be customized for a particular application. For example, a 1024×768×8 display requires a video memory of approximately 6.3 megabits (6,291,456 bits), which can be built with 24 memory blocks. With external DRAMs, a video memory of 8 megabits is required since the standard DRAM package has 4 megabits. The video memory 21 of the integrated circuit 20 can be organized to be of any width, depth or capacity and need not follow the multiplexed addressing architecture associated with standard DRAMs.
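The sizing example can be reproduced with a few lines of arithmetic. The sketch below takes the 256K-bit block size and the 4-megabit package size from the description and treats "K" and "mega" as binary multiples (2¹⁰ and 2²⁰), which is an assumption about the intended units.

```python
BLOCK_BITS = 2 ** 18              # one on-chip memory block: 256K bits
DRAM_PACKAGE_BITS = 4 * 2 ** 20   # one standard external DRAM: 4 megabits


def frame_buffer_bits(width, height, bits_per_pixel):
    """Bits needed to hold one frame at the given resolution and depth."""
    return width * height * bits_per_pixel


def units_needed(total_bits, unit_bits):
    """Smallest whole number of units that covers total_bits."""
    return -(-total_bits // unit_bits)  # ceiling division


required = frame_buffer_bits(1024, 768, 8)
blocks = units_needed(required, BLOCK_BITS)
packages = units_needed(required, DRAM_PACKAGE_BITS)

print(f"frame buffer: {required} bits")
print(f"on-chip blocks: {blocks} ({blocks * BLOCK_BITS} bits)")
print(f"external DRAMs: {packages} ({packages * DRAM_PACKAGE_BITS} bits)")
```

The 1024×768×8 case works out to 6,291,456 bits, which 24 blocks cover exactly, whereas two external 4-megabit packages provide 8,388,608 bits and leave a quarter of the memory unused.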
With the ability to incorporate large amounts of logic and memory in a single integrated circuit, the present invention provides for a video memory and logic for graphics control operations in one integrated circuit.
The GUI acceleration block 62 has a 128-bit wide register 70, which receives data from the bus 61. The register 70 splits its contents into two parts, 64 bits apiece, and passes them to a 64-bit BIT Block Transfer (BITBLT) operation unit 71 for performing the operations. The output of the unit 71 is fed into an assemble register 72 on a 64-bit wide path. After a BITBLT operation, the register 72 builds up a 128-bit word for transfer back to the bus 61. This organization represents the best compromise between performance and space on the integrated circuit; the transfer rate is maximized between the memory 21 and the operation unit 71, yet an optimum size of 64 bits for the unit 71 is maintained. A 128-bit BITBLT operation unit occupies a very large amount of integrated circuit space, while a unit for 32 bits slows BITBLT operations too much.
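The split, operate and reassemble sequence can be modeled in software as follows. This is a behavioral sketch only: the order in which the two halves are processed, the function names, and the use of XOR as the example raster operation are assumptions and are not specified in this description.

```python
MASK64 = (1 << 64) - 1


def split_128(word):
    """Model of register 70: split a 128-bit word into (low, high) halves."""
    return word & MASK64, (word >> 64) & MASK64


def assemble_128(low, high):
    """Model of assemble register 72: rebuild a 128-bit word from two halves."""
    return ((high & MASK64) << 64) | (low & MASK64)


def bitblt_128(src_word, dst_word, rop):
    """Apply a 64-bit raster operation to each half of a 128-bit word pair."""
    halves = zip(split_128(src_word), split_128(dst_word))
    low, high = (rop(s, d) & MASK64 for s, d in halves)
    return assemble_128(low, high)


src = 0x0123456789ABCDEF_FEDCBA9876543210
dst = 0xFFFFFFFF00000000_00000000FFFFFFFF

result = bitblt_128(src, dst, lambda s, d: s ^ d)  # XOR as a sample operation
print(f"{result:032x}")
```

The 64-bit operation is invoked twice per 128-bit memory word, which mirrors the trade-off described above between memory transfer rate and the area of the operation unit.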
The host bus interface block 63 lies between the bus of the host, i.e., the CPU of the computer system, and the bus 61. The interface block 63 provides a bidirectional data path between the 128-bit bus 61 and a 32-bit bus of the host. The block 63 has a Bus Read Latch 73 which holds a 128-bit wide word from the bus 61. The output of the latch 73 is connected to the input of a multiplexer 74, which selects 32 bits from the 128-bit latch 73 for the host bus. For host Read operations, the selected 32 bits should contain four consecutive bytes in the host bus address space. Depending upon the VGA-compatible format and other extended storage formats in use, these four bytes may be scattered among the 16 bytes (16×8 bits = 128 bits) of data stored in the latch 73.
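The byte selection performed for a host Read can be sketched as picking four byte lanes out of the sixteen held in the latch. The lane numbering (lane 0 as bits 7-0) and the two lane patterns used in the example are hypothetical; the actual patterns depend on the storage format and are defined by the control logic, not by this sketch.

```python
def host_read_word(latch, byte_lanes):
    """Build a 32-bit host word from four byte lanes of a 128-bit latch value.

    latch      -- 128-bit integer modeling the Bus Read Latch contents
    byte_lanes -- four lane indices (0..15), least significant host byte first
    """
    if len(byte_lanes) != 4 or any(not 0 <= lane <= 15 for lane in byte_lanes):
        raise ValueError("expected four byte lane indices in the range 0..15")
    word = 0
    for position, lane in enumerate(byte_lanes):
        byte = (latch >> (8 * lane)) & 0xFF
        word |= byte << (8 * position)
    return word


# Test pattern: byte lane i holds the value i.
latch = int.from_bytes(bytes(range(16)), "little")

adjacent = host_read_word(latch, (0, 1, 2, 3))     # hypothetical packed layout
scattered = host_read_word(latch, (0, 4, 8, 12))   # hypothetical scattered layout

print(f"{adjacent:08x}")
print(f"{scattered:08x}")
```

Each of the four output byte positions corresponds to one byte-wide selection, so the same structure serves both layouts by changing only the lane indices.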
The output of the latch 73 is illustrated in
The logic to generate the control signals, selda(2:0), seldb(2:0), seldc(2:0), and seldd(2:0), for the multiplexers 74A-74D is listed in VHDL code in TABLES 1 and 2. These control signals are derived from VGA standard control bits in programmable control registers which determine the storage format in use, and additional internal state information in the memory controller state machine. The present invention uses the following standard control bits:
SR4[3]=Chain-4
GR5[3]=Read Mode
GR5[4]=Odd/Even
GR4[0]=Read Map Select[0]
GR4[1]=Read Map Select[1]
GR6[0]=APA/Text*, Graphics Mode
and an extended mode control bit:
PACPIX=Packed Pixel Format
As indicated in the code listed in TABLE 1, these control bits are used to generate control signals, tmppack, pack, rdplanar and wrplanar, which are used ultimately in generating the selda(2:0), seldb(2:0), seldc(2:0), and seldd(2:0) control signals. From these control signals, other control signals are generated for each of the multiplexers 74A-74D. For example, the control logic and signals for the first multiplexer 74A are illustrated in
The code in TABLE 2 similarly illustrates the details of the control signals and operation of the multiplexers 74B-74D respectively.
With reference to
The CRT display block 64 provides a data path between the memory 21 and the CRT display which is compatible with the VGA standard. The block 64 has a Data Rotate unit 80, which receives 128 bits from the bus 61. The unit 80 is connected over four 32-bit paths to the input terminals of a CRT FIFO register 81 which has a capacity of 4 words, each word 128 bits wide. Stated differently, the FIFO register 81 is 128 bits wide and four stages deep, and can be filled in four memory fetches. The output terminals of the FIFO register 81 are connected to a VGA Display Path unit 82 over a 32-bit wide path. All VGA compatible graphics controllers for notebook computers today are based on a 16-bit or 32-bit memory bus to an external video memory buffer. The present 128-bit bus architecture, in comparison, allows improved performance while reducing power consumption. However, to achieve VGA compatibility and improve performance, byte swapping is required in transferring data from the memory bus 61 to the FIFO register 81. This swapping is implemented in the Data Rotate unit 80. From the 128 bits of the FIFO register 81, 32 bits are selected and sent to the VGA Display Path unit 82. The code in TABLE 3 specifies the control signals and implementation in terms of the control register bits which define the VGA storage format or extended mode storage format in use, as well as the memory controller control states.
The control signals are:
fontcy is a signal derived from the internal state machine and indirectly from the previously identified control signal, GR6[0]; a control signal, such as fontcy, is found in present VGA compatible controllers to determine a font or ASCII fetch operation in text mode.
swap0, swap1 are the 0 and 1-order bits of the CRT address counters found in VGA compatible controllers; these signals are derived from the Chain-4 and Odd/Even control signals mentioned previously.
rscntb0, rscntb1 are the 0 and 1-order bits of the 5-bit row scan counter found in VGA compatible controllers; the row scan counter is used for tracking the rows of a character in text mode.
Iword is the Chain-4, or SR4[3], control signal identified previously.
crsr_dtct is the cursor detect signal in VGA-compatible controllers; and
TEXT, apa are the true and inverted of the GR6[0] control signal previously identified.
These control signals are used to generate further signals, memc1_dta and memc2_dta. Stated generally, these two signals are either the crsr_dtct signal in text mode, or bit 96 (or bit 32, respectively) of the data bits from the 128-bit word on the bus 61 in graphics mode. The signals, mema_dta, are basically the four 32-bit words of data formed from the 128-bit word on the bus 61. The words for the bit locations, 127-96 and 63-32, are modified so that the bits 96 and 32 are either crsr_dtct in text mode or, respectively, data bits 96 and 32 from the bus 61. Finally, swapa and swapb are the control signals to the multiplexers in the Data Rotate unit 80. It should be noted that the symbol “&” represents a concatenation of signals.
The code listed in TABLE 4 illustrates the operation of the Data Rotate unit 80, which receives the mema_dta signals as input and transmits crt_fin signals as output to the FIFO register 81. The first VGA (32-bit) word, crt_fin(31 DOWNTO 0), may be filled by any one of the four incoming 32-bit words from the bus 61, depending upon the state of control signals swapa. Similarly, the third VGA (32-bit) word, crt_fin(95 DOWNTO 64), may be filled by any one of the four incoming 32-bit words from the bus 61, depending upon the state of control signals swapb. The second and fourth VGA words, crt_fin(63 DOWNTO 32) and crt_fin(127 DOWNTO 96), are respectively filled by the third and first incoming 32-bit words from the bus 61.
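The word routing just described can be modeled behaviorally as follows. This sketch is not the VHDL of TABLE 4: it assumes that the "first" incoming word is bits 31-0 of the bus word, that swapa and swapb are plain 2-bit indices into the four incoming words, and it omits the cursor-detect modification of the mema_dta signals.

```python
MASK32 = 0xFFFFFFFF


def data_rotate(bus_word, swapa, swapb):
    """Return a 128-bit crt_fin value built from a 128-bit bus word.

    swapa, swapb -- word selects in the range 0..3 (encoding is assumed)
    """
    if not (0 <= swapa <= 3 and 0 <= swapb <= 3):
        raise ValueError("swapa and swapb must be in the range 0..3")
    incoming = [(bus_word >> (32 * i)) & MASK32 for i in range(4)]
    outgoing = [
        incoming[swapa],  # crt_fin(31 DOWNTO 0): any incoming word, via swapa
        incoming[2],      # crt_fin(63 DOWNTO 32): third incoming word
        incoming[swapb],  # crt_fin(95 DOWNTO 64): any incoming word, via swapb
        incoming[0],      # crt_fin(127 DOWNTO 96): first incoming word
    ]
    crt_fin = 0
    for i, word in enumerate(outgoing):
        crt_fin |= word << (32 * i)
    return crt_fin


bus = 0x44444444_33333333_22222222_11111111  # incoming words 4, 3, 2, 1
rotated = data_rotate(bus, swapa=1, swapb=3)
print(f"{rotated:032x}")
```

Only the first and third output words are multiplexed; the second and fourth are fixed routes, which keeps the unit to two four-way word multiplexers in this model.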
The CRT FIFO 81 then selectively feeds 32-bit words into a VGA Display Path unit 82 and a Color Palette RAM 83. The RAM 83 is, in turn, connected to a digital-to-analog converter (DAC) 84. The RAM 83 feeds 18 bits of data, 6 bits for each component color, to the DAC 84. The DAC 84 generates the analog signals for a CRT color display.
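The 18-bit palette output can be pictured as three 6-bit codes, one per component color. In the sketch below the packing order (red in the upper six bits, blue in the lower six) and the 0.7 volt full-scale level are assumptions made for illustration; the description states only that 6 bits are provided for each component color.

```python
def split_rgb18(value):
    """Split an 18-bit palette entry into three 6-bit components.

    The red/green/blue bit positions are an assumed packing order.
    """
    if not 0 <= value < (1 << 18):
        raise ValueError("palette entry must fit in 18 bits")
    red = (value >> 12) & 0x3F
    green = (value >> 6) & 0x3F
    blue = value & 0x3F
    return red, green, blue


def dac_level(component, full_scale=0.7):
    """Ideal analog level in volts for a 6-bit code (full_scale is assumed)."""
    if not 0 <= component <= 63:
        raise ValueError("component must be a 6-bit code")
    return full_scale * component / 63


entry = (63 << 12) | (32 << 6) | 0  # hypothetical palette entry
for name, code in zip(("red", "green", "blue"), split_rgb18(entry)):
    print(f"{name}: code {code}, {dac_level(code):.3f} V")
```

With 6 bits per component the palette can address 2¹⁸ distinct colors, each component being converted independently by the DAC.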
The RAM 83 also feeds data into the LCD Display Interface block 65 which is organized for dual scan LCD panel displays. The general operation of the block 65 is that a Shader unit 96 receives the data from the RAM 83. The unit 96 generates the grayscale values for the LCD pixels. In passing, it should be noted that the word, grayscale, implies intensity for a color LCD display. These values are sent to a Formatter unit 92 which, as the name implies, formats the grayscale values for the integrated circuit(s) which drive the electrodes of the LCD display. The Shader unit 96 also sends its grayscale values through several buffer units 95, 94, 93, 90 and 91 (and along the bus 61) before being formatted and transmitted by the Formatter unit 92 for a dual scan operation. Dual scan LCD panels are commonly used today in notebook computers and the buffer units of the block 65 provide for the memory by which, in alternating operation, the display in one LCD panel is updated by the Shader unit 96 while the display in the second panel is maintained from memory.
Therefore, while the description above provides a full and complete disclosure of the preferred embodiments of the present invention, various modifications, alternate constructions and equivalents may be employed without departing from the true scope and spirit of the invention. For example, while the present invention has been described in terms of an integrated circuit with a memory capacity of some 7.3 megabits and some 40-50K logic gates, one could use the present invention to build an integrated circuit of reduced size. An integrated circuit having a memory capacity of 2 megabits, the capacity of basic VGA video memory in graphics cards, with 30K logic gates, the approximate amount of logic in present graphics controller integrated circuits, still realizes the advantages of the present invention. Costs, power dissipation and occupied space are reduced, and performance is enhanced, for instance. The present invention, therefore, should be limited only by the metes and bounds of the appended claims.
This application is a continuation of pending U.S. patent application Ser. No. 11/382,433, filed May 9, 2006, which is a divisional of U.S. patent application Ser. No. 10/908,259, filed May 4, 2005, (now U.S. Pat. No. 7,106,619), which is a divisional of U.S. patent application Ser. No. 10/803,783, filed Mar. 18, 2004, (now U.S. Pat. No. 6,920,077), which is a divisional of U.S. patent application Ser. No. 10/042,952, filed Jan. 7, 2002, (now U.S. Pat. No. 6,771,532), which is a continuation of U.S. patent application Ser. No. 09/467,942, filed Dec. 21, 1999 (now U.S. Pat. No. 6,356,497), which is a continuation of U.S. patent application Ser. No. 08/883,538, filed Jun. 26, 1997 (now U.S. Pat. No. 6,041,010), which is a continuation of U.S. patent application Ser. No. 08/581,086, filed Dec. 29, 1995 (abandoned) which is a divisional application of U.S. patent application Ser. No. 08/262,412, filed Jun. 20, 1994 (abandoned). These applications and patents are incorporated herein by reference, in their entirety, for any purpose.
Relation | Number | Date | Country |
---|---|---|---|
Parent | 10908259 | May 2005 | US |
Child | 11382433 | | US |
Parent | 10803783 | Mar 2004 | US |
Child | 10908259 | | US |
Parent | 10042952 | Jan 2002 | US |
Child | 10803783 | | US |
Parent | 08262412 | Jun 1994 | US |
Child | 08581086 | | US |

Relation | Number | Date | Country |
---|---|---|---|
Parent | 11382433 | May 2006 | US |
Child | 13439585 | | US |
Parent | 09467942 | Dec 1999 | US |
Child | 10042952 | | US |
Parent | 08883538 | Jun 1997 | US |
Child | 09467942 | | US |
Parent | 08581086 | Dec 1995 | US |
Child | 08883538 | | US |