1. Field of the Invention
The present invention relates to digital communication systems and, more specifically, to an enhancement for multi-lane digital communications protocols.
2. Description of the Prior Art
Multi-lane digital communications systems allow computer processors to communicate with a variety of other devices in a highly flexible manner. Such systems employ a plurality of different data channels (sometimes referred to as “lanes”) that communicate with all of the devices in a network. A lane is a serial point-to-point connection that connects a “root” device to an “endpoint” device. The lanes can be configured as serial data channels, or they can be grouped together to act as parallel data busses, depending on the requirements of the specific device connected to the system.
One type of multi-lane digital communication system is referred to as “PCI Express.” PCI Express is a digital communications standard that allows expansion cards to be added to a computer system. The first generation of PCI Express allows data transfer over 32 different lanes. Each lane allows a data transfer rate of 250 MB per second and, thus, the total data transfer rate for all lanes is 8 GB per second (other generations will have different data transfer rates). PCI Express also includes a plurality of serial interconnects. A single hub with many pins connects a central unit (such as the motherboard of a computer) to the PCI Express bus.
The PCI Express communications protocol is layered. The layers include: a transaction layer; a data link layer; and a physical layer. The physical layer is divided into a logical sublayer and an electrical sublayer. The logical sublayer is frequently further divided into a physical coding sublayer (PCS) and a media access control (MAC) sublayer. In the electrical sublayer, each lane includes two unidirectional low voltage differential signaling (LVDS) conductor pairs that transmit data at 2.5 gigabits per second. Transmit and receive functions use different LVDS pairs, resulting in four conductors per lane.
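By way of illustration only, the per-lane rate of 250 MB per second can be related to the signaling rate of 2.5 gigabits per second as follows. The sketch assumes 8b/10b line coding (eight payload bits carried in every ten transmitted bits), as used by first-generation PCI Express; that coding is not described above, and all names in the sketch are hypothetical.

```python
# Illustrative arithmetic only. The 8b/10b coding ratio is an assumption
# drawn from first-generation PCI Express, not from the text above.
LINE_RATE_BITS_PER_S = 2_500_000_000  # 2.5 gigabits per second, one direction
PAYLOAD_BITS_PER_SYMBOL = 8           # assumed: 8 payload bits ...
LINE_BITS_PER_SYMBOL = 10             # ... carried in 10 transmitted bits
LANES = 32


def lane_bytes_per_second(line_rate_bits: int) -> int:
    """Usable bytes per second on one lane, in one direction."""
    payload_bits = line_rate_bits * PAYLOAD_BITS_PER_SYMBOL // LINE_BITS_PER_SYMBOL
    return payload_bits // 8


per_lane = lane_bytes_per_second(LINE_RATE_BITS_PER_S)
total = per_lane * LANES
```

Under these assumptions, `per_lane` evaluates to 250,000,000 bytes per second and `total` to 8,000,000,000 bytes per second, which agrees with the figures of 250 MB per second per lane and 8 GB per second for 32 lanes.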
PCI Express sends all control messages, including interrupts, over the same links used for data. The serial protocol can never be blocked. Data transmitted on multiple-lane links is interleaved so that each successive byte is transmitted on a different lane in a process referred to as “data striping.”
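By way of illustration only, data striping may be sketched as a round-robin distribution of successive bytes across the lanes of a link. The function names `stripe` and `unstripe` are hypothetical, and the sketch ignores framing and encoding performed by the physical layer.

```python
def stripe(data: bytes, lanes: int) -> list[bytes]:
    """Distribute successive bytes across lanes in round-robin order.

    Byte 0 goes to lane 0, byte 1 to lane 1, and so on, wrapping back
    to lane 0 after the last lane.
    """
    if lanes < 1:
        raise ValueError("a link needs at least one lane")
    return [data[lane::lanes] for lane in range(lanes)]


def unstripe(per_lane: list[bytes]) -> bytes:
    """Reassemble the original byte order from the per-lane streams."""
    lanes = len(per_lane)
    out = bytearray(sum(len(chunk) for chunk in per_lane))
    for lane, chunk in enumerate(per_lane):
        out[lane::lanes] = chunk
    return bytes(out)


striped = stripe(b"ABCDEFGH", 4)
restored = unstripe(striped)
```

For the eight-byte input above on a four-lane link, lane 0 carries bytes `A` and `E`, lane 1 carries `B` and `F`, and so on; `unstripe` restores the original order.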
The data link layer (DLL) sequences transaction layer packets (TLPs) that are generated by the transaction layer. The DLL also provides data protection via a 32-bit cyclic redundancy check code (referred to as “LCRC”) and an acknowledgement protocol. When a TLP passes an LCRC check and a sequence number check, an acknowledgement (ACK) is returned. When a TLP fails the LCRC check, a negative acknowledgement (NAK) is sent. TLPs that result in a NAK, or timeouts that occur while waiting for an ACK, result in the TLPs being replayed from a buffer in the transmit data path of the DLL. ACK and NAK signals are communicated via a low-level packet known as a data link layer packet, or DLLP. DLLPs are also used to communicate flow control information between the transaction layers of two connected devices, as well as some power management functions.
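By way of illustration only, the acknowledgement and replay behavior described above may be sketched as follows. The sketch uses Python's `zlib.crc32` as a stand-in for the LCRC, a two-byte sequence number without wraparound, and hypothetical class names; none of these details are taken from the PCI Express specification, and timeouts are omitted.

```python
import zlib
from collections import OrderedDict


class Transmitter:
    """Sequences TLPs and keeps each one in a replay buffer until acknowledged."""

    def __init__(self) -> None:
        self.next_seq = 0
        self.replay_buffer: "OrderedDict[int, bytes]" = OrderedDict()

    def send(self, tlp: bytes) -> bytes:
        seq = self.next_seq
        self.next_seq += 1
        body = seq.to_bytes(2, "big") + tlp
        frame = body + zlib.crc32(body).to_bytes(4, "big")  # stand-in for LCRC
        self.replay_buffer[seq] = frame
        return frame

    def on_ack(self, seq: int) -> None:
        # An ACK releases the acknowledged frame and all earlier frames.
        for held in [s for s in self.replay_buffer if s <= seq]:
            del self.replay_buffer[held]

    def on_nak(self) -> list[bytes]:
        # A NAK causes every unacknowledged frame to be replayed in order.
        return list(self.replay_buffer.values())


class Receiver:
    """Checks the CRC and the sequence number of each incoming frame."""

    def __init__(self) -> None:
        self.expected_seq = 0

    def receive(self, frame: bytes) -> tuple[str, int]:
        body, crc = frame[:-4], int.from_bytes(frame[-4:], "big")
        seq = int.from_bytes(body[:2], "big")
        if zlib.crc32(body) != crc or seq != self.expected_seq:
            return ("NAK", self.expected_seq)
        self.expected_seq += 1
        return ("ACK", seq)
```

In use, a frame that arrives intact produces an ACK and is released from the replay buffer, whereas a corrupted frame produces a NAK and is sent again from the buffer.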
PCI Express allows transactions with a request and a response that are separated by time. This enables the link to carry other traffic while the target device gathers data for the response. PCI Express uses credit-based flow control to communicate receive buffer status from a receiver to a transmitter to prevent buffer overflow and to allow transmitter compliance with ordering rules. With flow control, a device at one end of a link advertises an initial number of credits for each of the receive buffers in its transaction layer. When sending transactions to this device, the device at the other end of the link counts the number of credits consumed by each TLP from its account. The sending device may only transmit a TLP when doing so does not result in its consumed credit count exceeding its credit limit. When the receiving device finishes processing the TLP from its buffer, it signals a return of credits to the sending device, which then increases the credit limit by the restored amount.
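By way of illustration only, the credit accounting performed by the sending device may be sketched as follows. The class name `CreditGate`, its method names, and the use of a single credit pool are hypothetical simplifications; an actual device keeps separate accounts for each receive buffer.

```python
class CreditGate:
    """Sender-side accounting for one receive buffer at the far end of a link."""

    def __init__(self, advertised_credits: int) -> None:
        # The receiver advertises an initial number of credits.
        self.credit_limit = advertised_credits
        self.credits_consumed = 0

    def can_send(self, tlp_cost: int) -> bool:
        # A TLP may be sent only if the consumed count stays within the limit.
        return self.credits_consumed + tlp_cost <= self.credit_limit

    def send(self, tlp_cost: int) -> bool:
        if not self.can_send(tlp_cost):
            return False  # the TLP is held back until credits are returned
        self.credits_consumed += tlp_cost
        return True

    def on_credits_returned(self, amount: int) -> None:
        # The receiver has drained its buffer; raise the limit accordingly.
        self.credit_limit += amount


gate = CreditGate(advertised_credits=4)
first = gate.send(3)        # fits within the 4 advertised credits
second = gate.send(2)       # would exceed the limit, so it is held back
gate.on_credits_returned(3)
third = gate.send(2)        # now fits, because the limit was raised to 7
```

Note that the consumed count only ever increases; returned credits raise the limit rather than lowering the count, consistent with the description above.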
In PCI Express, flow control (FC) is used to prevent overflow of receiver buffers and to enable compliance with ordering rules. The flow control mechanism is used by a requester to track the queue or buffer space available in an agent across a link. Flow control is point-to-point (across a link) and not end-to-end.
Flow control is separate from the data integrity mechanisms used to implement reliable information exchange between a transmitter and a receiver. Each virtual channel maintains an independent flow control credit pool. The FC information is conveyed between two sides of the link using DLLPs. The VCID field of the DLLP is used to carry the Virtual Channel Identification that is required for proper flow-control credit accounting.
Flow Control is handled by the transaction layer in cooperation with the data link layer. The Transaction Layer performs flow control accounting functions for Received TLPs and “gates” TLP Transmissions based on available credits for transmission.
Flow control is a function of the Transaction Layer and, therefore, the following types of information transmitted on the interface are not associated with Flow Control Credits: LCRC, Packet Framing Symbols, other Special Symbols, and Data Link Layer to Data Link Layer intercommunication packets. These types of information must be processed by the receiver at the rate they arrive. Any TLPs transferred from the transaction layer to the data link and physical Layers must have first passed the flow control “gate.” Thus, both transmit and receive flow control mechanisms are unaware if the data link layer transmits a TLP repeatedly due to errors on the link.
One example of an existing digital communication system 10 that employs flow control is shown in
Each node (12, 14 & 16) in such a system employs a buffer to facilitate the efficient receipt of data. However, if the rate at which a node receives data is faster than the rate at which the data is taken out of the buffer by the node, data will be lost. To prevent this, the system 10 employs a flow control system in which each node transmits a flow control packet 18 that indicates the capacity of the node's buffer to the nodes with which it is communicating. In this way, a sending node will not send data to a receiving node unless the buffer of the receiving node has the capacity to receive the data.
A typical data stream 20 from such a system is shown in
Therefore, there is a need for a system that employs flow control without adding unnecessary overhead.
The disadvantages of the prior art are overcome by the present invention which, in one aspect, is a method of communicating management information from a completing entity to a requesting entity in a digital communication system, in which the availability of management information to be sent from the completing entity to the requesting entity is detected when generating a data packet that does not have a primary purpose of transmitting management information. The management information is included in a management information data field and the management information data field is appended to the data packet. The data packet and the management information are transmitted from the completing entity to the requesting entity.
In another aspect, the invention is a method of communicating flow control information from a completing entity to a requesting entity in a PCI Express system. A value of a flag that, when in a first state, indicates that flow control information is to be included with a data packet is detected. An extended data packet that includes a non-flow control data field and a flow control field is created when the flag is in the first state. Non-flow control data is written to the non-flow control data field and flow control data is written into the flow control field. The extended data packet is sent to a requesting entity.
In yet another aspect, the invention is a method of managing digital communications, in which a data packet to be sent from an endpoint to a root complex is created. A management information packet is appended to the data packet. The management information packet includes a flow control field for transmitting flow control information indicative of buffer availability at the endpoint. The data packet and the management information packet are transmitted to the root complex.
These and other aspects of the invention will become apparent from the following description of the preferred embodiments taken in conjunction with the following drawings. As would be obvious to one skilled in the art, many variations and modifications of the invention may be effected without departing from the spirit and scope of the novel concepts of the disclosure.
A preferred embodiment of the invention is now described in detail. Referring to the drawings, like numbers indicate like parts throughout the views. As used in the description herein and throughout the claims, the following terms take the meanings explicitly associated herein, unless the context clearly dictates otherwise: the meaning of “a,” “an,” and “the” includes plural reference, the meaning of “in” includes “in” and “on.”
Management information can be transmitted between nodes in a link that is part of a multi-link system with greater efficiency by including the management information with a regular data packet when such management information becomes available, rather than sending the management information in a separate packet. Such management information can include, for example: flow control information, which advertises buffer credit availability; acknowledgements (“ACKS”); and negative acknowledgements (“NACKS”).
In a typical PCI Express embodiment, the system generates an ordinary data packet (a data packet that does not have a primary purpose of transmitting flow control information) and detects when flow control information is available to be sent from the completing entity to a requesting entity. The data packet includes a flag that, when set, indicates that it is configured to include both regular data and flow control information.
The flow control information is added to the data packet as part of either a header or a digest. While a data packet is generated for an endpoint-to-endpoint transmission, the flow control information in the packet can be re-written for each link. For example, an endpoint could generate a data packet that includes flow control information indicative of the buffer capacity of the endpoint and then transmit the packet to a switch, which would use the flow control information from the endpoint in managing the amount of data it sends to the endpoint. The switch would write over the flow control information with flow control information indicative of the buffer capacity of the switch and then send the packet to a root complex, which would use the flow control information from the switch in managing the amount of data it sends to the switch. In cases where a switch receives flow control information from an endpoint, but does not have flow control information to send to the root complex, a null value can be written to the flow control field in the data packet.
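By way of illustration only, the per-link rewriting of the flow control field at a switch may be sketched as follows. The `Packet` type, the `forward` function, and the use of `None` to represent the null value are hypothetical and are not taken from any actual packet format.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:
    payload: bytes
    # Credits advertised by the sender on the current link; None is the null value.
    flow_control: Optional[int]


def forward(packet: Packet, local_credits: Optional[int]) -> tuple[Optional[int], Packet]:
    """Consume the incoming flow control field and rewrite it for the next link.

    Returns the credits learned from the previous hop together with the
    packet to be sent onward. The payload is left untouched.
    """
    learned = packet.flow_control
    outgoing = replace(packet, flow_control=local_credits)
    return learned, outgoing


from_endpoint = Packet(payload=b"data", flow_control=8)

# The switch has its own buffer status to report to the root complex.
learned, to_root = forward(from_endpoint, local_credits=16)

# The switch has nothing to report, so a null value is written instead.
_, to_root_null = forward(from_endpoint, local_credits=None)
```

In this sketch the switch learns the endpoint's value (8) for its own use, while the root complex sees only the value written by the switch (16, or the null value).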
In one embodiment, the system adds one double word into a normal TLP packet. The new double word contains the management information. This improvement will help designs with multiple virtual lanes significantly but will also benefit non-virtual lane designs.
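By way of illustration only, appending one double word of management information to an otherwise ordinary packet may be sketched as follows. The layout used here (one leading 32-bit flags word, the payload, then an optional trailing 32-bit management word) and the flag bit `MGMT_FLAG` are hypothetical; they do not reproduce the TLP format of the PCI Express specification or of any particular embodiment.

```python
import struct
from typing import Optional

MGMT_FLAG = 0x01  # hypothetical flag bit: set when a management word follows


def build_packet(flags: int, payload: bytes, mgmt: Optional[int] = None) -> bytes:
    """Build a packet, appending one 32-bit management word when one is available."""
    if mgmt is not None:
        flags |= MGMT_FLAG
    else:
        flags &= ~MGMT_FLAG
    packet = struct.pack(">I", flags) + payload
    if mgmt is not None:
        packet += struct.pack(">I", mgmt)
    return packet


def parse_packet(packet: bytes) -> tuple[int, bytes, Optional[int]]:
    """Split a packet into flags, payload, and the management word, if present."""
    (flags,) = struct.unpack_from(">I", packet, 0)
    if flags & MGMT_FLAG:
        (mgmt,) = struct.unpack_from(">I", packet, len(packet) - 4)
        return flags, packet[4:-4], mgmt
    return flags, packet[4:], None


payload = b"\x01\x02\x03\x04"
plain = build_packet(0, payload)
extended = build_packet(0, payload, mgmt=0x10)
```

The extended packet is exactly four bytes (one double word) longer than the plain packet, and a receiver needs to inspect only the flag to know whether the trailing word is present.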
As shown in
Conceptually, as shown in
One example of a data packet 400 that would be used to transmit flow control information in a PCI Express embodiment is shown in
In one experimental embodiment, flow control link transmission improved by 50%. It also resulted in faster credit updates and reduced scheduling complexity. The system benefited by not having to wait for a gap in the data transmission to send the flow control information.
The above-described embodiments, while including the preferred embodiment and the best mode of the invention known to the inventor at the time of filing, are given as illustrative examples only. It will be readily appreciated that many deviations may be made from the specific embodiments disclosed in this specification without departing from the spirit and scope of the invention. Accordingly, the scope of the invention is to be determined by the claims below rather than being limited to the specifically described embodiments above.