1. Technical Field
The present invention relates to a method of determining the maximum speed at which a PCI or similar bus should be set. In particular, the present invention relates to using a card presence indicator to determine the maximum speed at which a PCI or similar bus should be set.
2. Description of Related Art
In conventional computing systems, when several Peripheral Component Interconnect (PCI) devices may share the same PCI bus, the bus speed is limited by the total number of devices that the bus may carry.
A conventional computing system is composed of many complex components and all of these components need to communicate with each other in a fast and efficient manner. Thus, the conventional computing system contains buses which provide a channel or path between the components within a computer. One of the buses within the conventional computing system is the PCI bus.
A typical computer has two key buses. The first one, known as the system bus or local bus, connects the microprocessor (central processing unit) and the system memory. Other buses, such as the ISA and PCI buses, connect to the system bus through a bridge, which is a part of the computer's chipset and acts as a traffic cop, integrating the data from the other buses to the system bus.
PCI is a synchronous bus architecture with all data transfers being performed relative to a system clock. The initial PCI specification permitted a maximum clock rate of 33 MHz, allowing one bus transfer to be performed every 30 nanoseconds. Later revisions of the PCI specification extended the bus definition to support operation at 66-133 MHz and higher bus speeds.
PCI implements a 32-bit multiplexed Address and Data bus. It architects a means of supporting a 64-bit data bus through a longer connector slot, but most of today's personal computers support only 32-bit data transfers through the base 32-bit PCI connector. At 33 MHz, a 32-bit slot supports a maximum data transfer rate of 132 MBytes/sec, and a 64-bit slot supports 264 MBytes/sec.
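The transfer-rate figures above follow from simple arithmetic: one transfer per clock, multiplied by the bus width in bytes. The following is a minimal sketch of that calculation; the function names are illustrative and do not come from the PCI specification.

```python
def cycle_time_ns(clock_mhz: float) -> float:
    """Length of one clock period in nanoseconds."""
    return 1000.0 / clock_mhz


def peak_transfer_rate_mbytes(clock_mhz: float, bus_width_bits: int) -> float:
    """Peak rate in MBytes/sec, assuming one bus transfer per clock."""
    bytes_per_transfer = bus_width_bits // 8
    return clock_mhz * bytes_per_transfer


# A 33 MHz clock has a period of roughly 30 ns.
print(round(cycle_time_ns(33)))

# 32-bit and 64-bit slots at 33 MHz.
for width in (32, 64):
    print(width, peak_transfer_rate_mbytes(33, width))
```

These are peak figures; sustained throughput on a real bus would be lower because of address phases and wait states.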
PCI supports a rigorous auto configuration mechanism. Each PCI device includes a set of configuration registers that allow identification of the type of device (SCSI, video, Ethernet, etc.) and the company that produced it. Other registers allow configuration of the device's I/O addresses, memory addresses, interrupt levels, etc.
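As an illustration of how the identification registers might be read, the sketch below decodes the first twelve bytes of a device's configuration header (vendor ID, device ID, command, status, revision ID, and the three class-code bytes, stored little-endian). The function name, the dictionary keys, and the small class-name table are assumptions made for this example; the vendor and device IDs in the usage lines are made-up values.

```python
import struct

# Illustrative subset of base-class codes for the device types named above.
BASE_CLASS_NAMES = {
    0x01: "mass storage controller (e.g. SCSI)",
    0x02: "network controller (e.g. Ethernet)",
    0x03: "display controller (e.g. video)",
}


def identify_device(config_header: bytes) -> dict:
    """Decode the identification fields at offsets 0x00-0x0B."""
    (vendor_id, device_id, _command, _status,
     revision, prog_if, subclass, base_class) = struct.unpack_from(
        "<HHHHBBBB", config_header, 0)
    return {
        # A vendor ID of 0xFFFF conventionally means no device responded.
        "present": vendor_id != 0xFFFF,
        "vendor_id": vendor_id,
        "device_id": device_id,
        "revision": revision,
        "class": (base_class, subclass, prog_if),
        "class_name": BASE_CLASS_NAMES.get(base_class, "other"),
    }


# Made-up header for a hypothetical Ethernet adapter.
header = struct.pack("<HHHHBBBB", 0x1234, 0x5678, 0, 0, 1, 0, 0x00, 0x02)
print(identify_device(header))
```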
PCI defines support for both 5 Volt and 3.3 Volt signaling levels. The PCI connector defines pin locations for both the 5 Volt and 3.3 Volt levels. However, most early PCI systems were 5 Volt only, and did not provide active power on the 3.3 Volt connector pins. Over time more use of the 3.3 Volt interface is expected, but add-in boards which must work in older legacy systems are restricted to using only the 5 Volt supply. A “keying” scheme is implemented in the PCI connectors to prevent inserting an add-in board into a system with incompatible supply voltage.
PCI bus architecture is processor independent. PCI signal definitions are generic, allowing the bus to be used in systems based on other processor families. PCI includes strict specifications to ensure the signal quality required for operation at 33 and 133 MHz. Components and add-in boards must include unique bus drivers that are specifically designed for use in a PCI bus environment. Typical transistor-transistor logic devices used in previous bus implementations such as Industry Standard Architecture (ISA) and Extended Industry Standard Architecture (EISA) are not compliant with the requirements of PCI. This restriction, along with the high bus speed, dictates that most PCI devices are implemented as custom Application-Specific Integrated Circuits (ASICs).
The higher speed of PCI limits the number of expansion slots on a single bus to no more than three or four, as compared to six or seven for earlier bus architectures. To permit expansion buses with more than three or four slots, the PCI Special Interest Group has defined a PCI-to-PCI Bridge mechanism. PCI-to-PCI Bridges are ASICs that electrically isolate two PCI buses while allowing bus transfers to be forwarded from one bus to another. Each bridge device has a “primary” PCI bus and a “secondary” PCI bus. Multiple bridge devices may be cascaded to create a system with many PCI buses.
When multiple cards are connected to a single PCI bus, the speed of the bus is currently limited depending on the load imposed on the bus. The normal technique used is to limit the bus speed based on the maximum potential load on the bus.
The present invention provides a mechanism for determining the maximum speed at which a PCI bus should be set. The mechanism uses a card presence pin provided for in the PCI specification to detect the number of devices residing on the PCI bus. The mechanism then sets the PCI bus speed to the highest speed possible for the actual number of devices on the PCI bus and not the maximum number of devices the PCI bus can handle.
The novel features believed characteristic of the invention are set forth in the appended claims. The invention itself, however, as well as a preferred mode of use, further objectives and advantages thereof, will best be understood by reference to the following detailed description of an illustrative embodiment when read in conjunction with the accompanying drawings, wherein:
With reference now to the figures and in particular with reference to
With reference now to
In the depicted example, local area network (LAN) adapter 210, small computer system interface (SCSI) host bus adapter 212, and expansion bus interface 214 are connected to PCI local bus 206 by direct component connection. In contrast, audio adapter 216, graphics adapter 218, and audio/video adapter 219 are connected to PCI local bus 206 by add-in boards inserted into expansion slots. Expansion bus interface 214 provides a connection for a keyboard and mouse adapter 220, modem 222, and additional memory 224. SCSI host bus adapter 212 provides a connection for hard disk drive 226, tape drive 228, and CD-ROM drive 230. Typical PCI local bus implementations will support three or four PCI expansion slots or add-in connectors.
An operating system runs on processor 202 and is used to coordinate and provide control of various components within data processing system 200 in
Those of ordinary skill in the art will appreciate that the hardware in
For example, data processing system 200, if optionally configured as a network computer, may not include SCSI host bus adapter 212, hard disk drive 226, tape drive 228, and CD-ROM 230. In that case, the computer, to be properly called a client computer, includes some type of network communication interface, such as LAN adapter 210, modem 222, or the like. As another example, data processing system 200 may be a stand-alone system configured to be bootable without relying on some type of network communication interface, whether or not data processing system 200 comprises some type of network communication interface.
The processes of the present invention are performed by processor 202 using computer implemented instructions, which may be located in a memory such as, for example, main memory 204, memory 224, or in one or more peripheral devices 226-230.
The present invention provides a mechanism for determining the maximum speed at which a PCI bus should be set. The mechanism uses a card presence pin provided for in the PCI specification to detect the number of devices residing on the PCI bus. The mechanism then sets the PCI bus speed to the highest speed possible for the actual number of devices on the PCI bus and not the maximum number of devices the PCI bus can handle.
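The selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the speed table is hypothetical (this description does not give the speed permitted for each device count), and the per-slot presence flags are assumed to have been derived already from the card presence pins.

```python
from typing import Sequence

# Hypothetical table: highest clock (MHz) assumed safe for a given number
# of devices on the bus. The values are illustrative only.
MAX_SPEED_BY_DEVICE_COUNT = {1: 133, 2: 100, 3: 66, 4: 66}
FALLBACK_SPEED_MHZ = 33  # assumed speed for any larger device count


def speed_for_load(device_count: int) -> int:
    """Look up the highest speed allowed for a given number of devices."""
    # An empty bus is treated like a single load for lookup purposes.
    return MAX_SPEED_BY_DEVICE_COUNT.get(max(device_count, 1),
                                         FALLBACK_SPEED_MHZ)


def conventional_bus_speed(slot_count: int) -> int:
    """Conventional approach: assume every slot could be occupied."""
    return speed_for_load(slot_count)


def select_bus_speed(slot_presence: Sequence[bool]) -> int:
    """Presence-based approach: count only the slots that hold a card."""
    occupied = sum(1 for present in slot_presence if present)
    return speed_for_load(occupied)


# Four slots, one card installed.
slots = [True, False, False, False]
print(conventional_bus_speed(len(slots)), select_bus_speed(slots))
```

With the hypothetical table above, a four-slot bus holding a single card would run at the single-device speed instead of the four-device speed.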
Turning now to
Turning now to
System pins include a clock (CLK) pin, which provides the timing reference for all transfers on the PCI bus, and a reset (RST#) pin, which is driven active low to cause a hardware reset of a PCI device. Address and data pins include the multiplexed address and data pins, the bus command and byte enable pins, and the parity pin. The address and data pins (AD[31:0]) transfer a 32-bit physical address during “address phases” and 32 bits of data during “data phases”. During the address phase of a transaction, the bus command and byte enable pins (C/BE[3:0]#) carry the bus command that defines the type of transfer to be performed; during data phases they carry the byte enables. The parity pin (PAR) provides even parity over the AD[31:0] and C/BE[3:0]# signals. Even parity implies that there is an even number of ‘1’s on the AD[31:0], C/BE[3:0]#, and PAR signals.
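The even-parity rule can be made concrete with a short sketch. The function names are illustrative; the arguments are taken to be the raw signal levels on the AD[31:0] and C/BE[3:0]# lines, packed into integers.

```python
def compute_par(ad: int, cbe: int) -> int:
    """Return the PAR level that makes the total number of 1s even.

    ad  -- levels on AD[31:0] as a 32-bit integer
    cbe -- levels on C/BE[3:0]# as a 4-bit integer
    """
    if not 0 <= ad < 2 ** 32:
        raise ValueError("ad must fit in 32 bits")
    if not 0 <= cbe < 2 ** 4:
        raise ValueError("cbe must fit in 4 bits")
    ones = bin(ad).count("1") + bin(cbe).count("1")
    # PAR is 1 only when the 36 covered bits hold an odd number of 1s.
    return ones & 1


def parity_ok(ad: int, cbe: int, par: int) -> bool:
    """Check a received PAR bit against the AD and C/BE# levels."""
    return compute_par(ad, cbe) == par


print(compute_par(0x00000003, 0x1))
```

A receiver that finds `parity_ok` false for a data phase would be in the situation the parity error pin is intended to report.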
Interface control pins include cycle frame, initiator ready, target ready, stop, lock, initialization device select, and device select. Cycle frame (FRAME#) is driven low by the initiator to signal the start of a new bus transaction. Initiator Ready (IRDY#) is driven low by the initiator as an indication it is ready to complete the current data phase of the transaction. Target Ready (TRDY#) is driven low by the target as an indication it is ready to complete the current data phase of the transaction. Stop (STOP#) is driven low by the target to request the initiator to terminate the current transaction. Lock (LOCK#) may be asserted by an initiator to request exclusive access for performing multiple transactions with a target. Initialization Device Select (IDSEL) is used as a chip select during PCI configuration read and write transactions. Device Select (DEVSEL#) is driven active low by a PCI target when it detects its address on the PCI bus.
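The ready handshake can be illustrated with a small sketch. It relies on the PCI rule that a data phase completes on a clock edge at which both IRDY# and TRDY# are sampled low; the function name and the list-of-samples representation are assumptions made for this example.

```python
from typing import List, Sequence


def completed_data_phases(irdy_n: Sequence[int],
                          trdy_n: Sequence[int]) -> List[int]:
    """Return the clock indices at which a data phase completes.

    Each sequence holds one sampled level per clock edge for an
    active-low signal: 0 means asserted, 1 means deasserted.
    """
    if len(irdy_n) != len(trdy_n):
        raise ValueError("signal traces must have the same length")
    return [clk for clk, (irdy, trdy) in enumerate(zip(irdy_n, trdy_n))
            if irdy == 0 and trdy == 0]


# The initiator is ready from clock 1; the target inserts one wait state.
irdy_trace = [1, 0, 0, 0]
trdy_trace = [1, 1, 0, 0]
print(completed_data_phases(irdy_trace, trdy_trace))
```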
Arbitration pins include a request pin (REQ#), which is used by a PCI device to request use of the bus, and a grant pin (GNT#), which indicates that a PCI device's request to use the bus has been granted. Error reporting pins include a parity error pin (PERR#), which is used for reporting data parity errors during all PCI transactions except a “Special Cycle”, and a system error pin (SERR#), which is used for reporting address parity errors, data parity errors during a Special Cycle, or any other fatal system error. Interrupt pins (INTA#, INTB#, INTC#, and INTD#) are driven low by PCI devices to request attention from their device driver software.
Cache support pins, which are optional, are architected to permit cacheable memory to be implemented on a PCI bus. The cache support pins are rarely if ever implemented in today's PCI systems. Cache support pins include a snoop backoff (SBO#) that indicates a hit to a modified line when asserted and a snoop done (SDONE) that indicates the status of the snoop for the current access.
Other optional pins are the 64-bit bus extension pins and the JTAG/boundary scan pins. The 64-bit bus extension pins include address and data pins (AD[63:32]), which are multiplexed on the same pins and provide 32 additional bits when operating in a 64-bit bus environment, and bus command and byte enable pins (C/BE[7:4]#), which are multiplexed onto the same pins and provide 4 additional bits when operating in a 64-bit bus environment. The 64-bit bus extension pins also include a request 64-bit transfer pin (REQ64#), which is asserted low by the initiator to indicate that it desires a 64-bit transfer; an acknowledge 64-bit transfer pin (ACK64#), which is asserted low by a target to indicate that it has decoded its address as the target of the current access and is capable of performing a 64-bit transfer; and a parity pin (PAR64), which carries the even parity bit that protects AD[63:32] and C/BE[7:4]#. The JTAG/boundary scan pins allow components installed on a PCI add-in board to be exhaustively tested by serially scanning test patterns through each component. The JTAG/boundary scan pins include test clock (TCK), test data input (TDI), test data output (TDO), test mode select (TMS) and test reset (TRST#).
Additional pins that are present on a PCI card are clock running, 66 MHz enable and card present. The clock running pin (CLKRUN#) provides an optional signal used to facilitate stopping of the CLK signal for power saving purposes. The 66 MHz enable pin (M66EN) is left “open” or disconnected on add-in boards that support operation with a 66 MHz CLK, and grounded on add-in boards that support operation with only a 33 MHz CLK. The card present pins (PRSNT[1:2]#), which are used in accordance with a preferred embodiment of the present invention, serve two purposes: 1) to indicate that an add-in board is physically present, and 2) to indicate the power requirements of an add-in board. These are static signals that are either grounded or left open on the add-in board.
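A sketch of how these static pins might be decoded follows. The wattage values are recalled from the encoding table in the PCI Local Bus Specification and are not stated in this description, so they should be checked against the applicable revision. The constant and function names are illustrative.

```python
from typing import Optional, Sequence

GROUND = 0  # pin tied to ground on the add-in board
OPEN = 1    # pin left unconnected on the add-in board

# (PRSNT1#, PRSNT2#) -> maximum board power in watts; None means no board.
# Assumed encoding; verify against the specification.
PRSNT_ENCODING = {
    (OPEN, OPEN): None,
    (GROUND, OPEN): 25.0,
    (OPEN, GROUND): 15.0,
    (GROUND, GROUND): 7.5,
}


def board_present(prsnt1: int, prsnt2: int) -> bool:
    """A board is present if at least one presence pin is grounded."""
    return (prsnt1, prsnt2) != (OPEN, OPEN)


def max_power_watts(prsnt1: int, prsnt2: int) -> Optional[float]:
    """Maximum power requirement signalled by the board, if any."""
    return PRSNT_ENCODING[(prsnt1, prsnt2)]


def clock_from_m66en(m66en_levels: Sequence[int]) -> int:
    """66 MHz only if no installed board grounds the shared M66EN line."""
    return 66 if all(level == OPEN for level in m66en_levels) else 33


print(board_present(GROUND, OPEN), max_power_watts(GROUND, OPEN))
print(clock_from_m66en([OPEN, GROUND]))
```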
Turning now to
In summary, the present invention provides a mechanism for determining the maximum speed at which a PCI bus should be set. The mechanism uses a card presence pin provided for in the PCI specification to detect the number of devices residing on the PCI bus. The mechanism then sets the PCI bus speed to the highest speed possible for the actual number of devices on the PCI bus, rather than for the maximum number of devices the PCI bus can handle.
It is important to note that while the present invention has been described in the context of a fully functioning data processing system, those of ordinary skill in the art will appreciate that the processes of the present invention are capable of being distributed in the form of a computer readable medium of instructions and a variety of forms and that the present invention applies equally regardless of the particular type of signal bearing media actually used to carry out the distribution. Examples of computer readable media include recordable-type media, such as a floppy disk, a hard disk drive, a RAM, CD-ROMs, DVD-ROMs, and transmission-type media, such as digital and analog communications links, wired or wireless communications links using transmission forms, such as, for example, radio frequency and light wave transmissions. The computer readable media may take the form of coded formats that are decoded for actual use in a particular data processing system.
The description of the present invention has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiment was chosen and described in order to best explain the principles of the invention, the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.