The accompanying figures, in which like reference numerals refer to identical or functionally similar elements throughout the separate views and which are incorporated in and form a part of the specification, further illustrate the embodiment and, together with the background, brief summary, and detailed description, serve to explain the principles of the illustrative embodiment.
The illustrative embodiment provides a method and a system for transferring packets between devices connected to a PCI Express (PCIe) bus of a computer. The approach permits large packets to be transferred between PCIe bus pairs that have a large Maximum Payload Size (MPS), where performance is important, while still allowing access to busses that have a small MPS, where performance may be less important.
If the source device MPS exceeds the destination device MPS, the packet being sent from the source device is divided into a plurality of sub-packets each having a maximum payload size based on the MPS of the destination device, as indicated in step 106. The sub-packets are then transmitted to the destination device so that the packet can be delivered to the destination device which has a smaller MPS than the source device (step 107). If, however, the source device MPS does not exceed the destination device MPS, then the packet is transmitted as a single unit to the destination device as indicated in step 105. Thereafter the packet transfer is complete (step 108). Those skilled in the art would understand that method steps 101-103 could be performed in a different sequence from that shown in
Because a pair of devices is configured to transmit and receive packets with the respective MPS of the devices, and because the packet being switched is divided into sub-packets based on the destination device MPS whenever the source device MPS exceeds the destination device MPS, devices of the system that support different MPS values are each capable of transmitting and receiving data at their own MPS. They are not limited to transferring data with payload sizes smaller than some of the devices can support.
Thus, the method 100 enables packets of data to be selectively switched between a source device, such as a root complex device, and a destination device, such as an endpoint device, in a manner that enhances the data transfer performance of the PCI-Express bus system.
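The decision made in steps 105 to 107 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the claimed implementation: the function name `transfer` and the modelling of a packet as a bare byte string are hypothetical, and headers are ignored.

```python
def transfer(payload: bytes, source_mps: int, destination_mps: int) -> list[bytes]:
    """Return the payloads delivered to the destination device, in order.

    Hypothetical helper: a packet is modelled as its payload only.
    """
    if len(payload) > source_mps:
        raise ValueError("payload exceeds the source device MPS")

    if source_mps > destination_mps:
        # Steps 106-107: divide the packet into sub-packets whose payloads
        # are bounded by the destination device MPS, then send them in order.
        step = destination_mps
        chunks = [payload[i:i + step] for i in range(0, len(payload), step)]
        return chunks or [payload]  # an empty payload still yields one packet

    # Step 105: the destination can accept anything the source can send,
    # so the packet is transmitted as a single unit.
    return [payload]
```

For instance, `transfer(bytes(1024), 1024, 512)` returns two payloads of 512 bytes each, while `transfer(bytes(256), 512, 1024)` returns the original payload unchanged.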
Method 100 of the illustrative embodiment can be implemented by different PCI Express based bus systems. A system suitable for implementing the method of transferring data between devices connected to a PCI-Express bus according to one embodiment is shown in
In the illustrative embodiment of
Referring now to the switch device 3 in more detail, as best shown in
Switch device 3 provides the PCIe connectivity between an upstream device, for example the root complex device 2, and downstream devices, for example endpoint devices 14, 15, and additionally between downstream devices (peer-to-peer). PCIe Configuration Registers (not shown) allow switch device 3 and endpoints 14, 15 to advertise their MPS capability and to be programmed with an MPS for packet transfers. The root complex device 2 is operable by a flow control program to read the entire configuration space of the switch device 3 and endpoints 14, 15 (the discovery phase), and to program (enumerate) the switch device 3 and endpoints so that the MPS of each switch port 7, 8, 9 matches that of the associated root complex device or endpoint device 2, 14, 15.
The MPS for each switch port and endpoint pair, or each switch port and root complex pair, on a PCIe bus is programmed to the smallest MPS of that pair instead of the smallest MPS of any device in the system. The switch device 3 is then responsible for the management of Read Completion and Posted Write type operations (CP type operations) involving transfer of Read Completion packets, Memory Write Request packets and/or Message Request packets, and guarantees that the MPS of the recipient of the packets is not exceeded. Different payload sizes are therefore available for each of the switch PCIe busses.
The system is responsible for generating multiple Read Completion, Memory Write Request or Message Request packets when the MPS of the recipient of the data is less than the payload size of the source of the data. The switch device 3 manages the Multiple Read Completions that need to be generated when a Read Completion packet exceeds the MPS of the destination device. The rules for Multiple Read Completions are the same as the PCI Express specification rules for completions. Switch device 3 may generate Multiple Read Completions when the device sourcing the Read Completion packet has a payload size that is greater than the MPS of the device receiving the Read Completion data.
The switch device 3 also generates multiple Memory Write Requests when the Memory Write Request packet exceeds the MPS of the destination device. The resulting Multiple Memory Write Requests are divided based on the MPS of the destination device. For example, a Memory Write Request with a payload of 512 bytes targeted to a device with an MPS of 128 bytes is divided into four packets of 128 bytes each.
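The difference between per-pair programming and system-wide programming can be illustrated with a short Python sketch. The device MPS values (512, 1024 and 2048 bytes for devices 2, 14 and 15) are borrowed from the examples in this description; the 4096-byte capability assigned to each switch port, the dictionary layout, and the function names are assumptions made only for illustration.

```python
def per_link_mps(port_caps: dict[int, int],
                 device_caps: dict[int, int],
                 links: dict[int, int]) -> dict[int, int]:
    """Program each switch port to the smallest MPS of its own port/device pair."""
    return {port: min(port_caps[port], device_caps[device])
            for port, device in links.items()}


def system_wide_mps(port_caps: dict[int, int], device_caps: dict[int, int]) -> int:
    """Smallest MPS of any device in the system (the conventional setting)."""
    return min(min(port_caps.values()), min(device_caps.values()))


# Assumed capabilities: every switch port is taken to support 4096 bytes.
port_caps = {7: 4096, 8: 4096, 9: 4096}
device_caps = {2: 512, 14: 1024, 15: 2048}
links = {7: 2, 8: 14, 9: 15}  # switch port -> attached device

programmed = per_link_mps(port_caps, device_caps, links)
```

Under these assumptions each link keeps its own limit (512, 1024 and 2048 bytes on ports 7, 8 and 9), whereas a system-wide setting would hold every link to 512 bytes.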
Message Requests that exceed the MPS of the receiver are divided using the same method as multiple Memory Write Requests. The length field of the header indicates the number of Dwords transferred in the packet payload. All other header fields of the Multiple Message Requests are unmodified from the original header. Once the transfer of a Message Request that has been divided into multiple Message Requests has started, no other transfer may be allowed to pass it.
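The header handling described for divided Message Requests might be sketched as follows in Python. The `Header` fields other than the Dword length are hypothetical placeholders, the length is kept as a plain integer rather than in its on-wire encoding, and the function name `split_message_request` is an assumption.

```python
from dataclasses import dataclass, replace

DWORD_BYTES = 4


@dataclass(frozen=True)
class Header:
    # Hypothetical header model; only length_dw is taken from the description.
    packet_type: str
    requester_id: int
    tag: int
    length_dw: int  # number of Dwords in the packet payload


def split_message_request(header: Header, payload: bytes,
                          destination_mps: int) -> list[tuple[Header, bytes]]:
    """Divide one request into sub-packets sized to the destination MPS."""
    if len(payload) != header.length_dw * DWORD_BYTES:
        raise ValueError("length field does not match the payload")
    if destination_mps <= 0 or destination_mps % DWORD_BYTES:
        raise ValueError("destination MPS must be a positive multiple of 4 bytes")

    sub_packets = []
    for offset in range(0, len(payload), destination_mps):
        chunk = payload[offset:offset + destination_mps]
        # Only the length field changes; every other field is copied as-is.
        sub_header = replace(header, length_dw=len(chunk) // DWORD_BYTES)
        sub_packets.append((sub_header, chunk))
    return sub_packets
```

With a 512-byte payload (128 Dwords) and a destination MPS of 128 bytes, the sketch yields four sub-packets whose headers each carry a length of 32 Dwords and are otherwise identical to the original header.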
Known PCI-Express bus systems are not capable of transferring data according to the method and system of the illustrative embodiments. In such PCI-Express bus systems, all the devices connected to the bus system are limited to transferring data with a payload size which is supportable by all of the devices. The PCI Express (PCIe) specification does not generally allow source and destination devices of PCIe packet transfers to have different Maximum Payload Sizes because this can lead to malformed packet errors. The PCIe specification supports maximum payload sizes from 128 to 4096 bytes and requires that the MPS used between a source and a destination device be equal to the smallest MPS supported by either device.
For example, if device A indicates a supported MPS of 128 bytes and device B indicates a supported MPS of 4096 bytes, then the Root Complex of known PCI-Express bus systems would program devices A and B to an MPS of 128 bytes. Making the MPS of the transferred packet equal to the smallest MPS of the devices prevents the generation of malformed packet errors due to a device receiving a packet with a payload larger than it is capable of handling. However, having all the devices transfer data with a payload size smaller than can be supported by some device(s) in the system degrades the system performance.
Because system 1 configures a selected pair of devices to transfer packets with the respective MPS of the devices, and divides the packet being switched into sub-packets based on the destination device MPS if the source device MPS exceeds the destination device MPS, the system 1 can transfer packets based on the smallest MPS of the pair of devices rather than having to limit the pair of devices to transferring data with the smallest MPS of all the devices in the system. The resultant larger packets carry less overhead than multiple smaller packets, thus improving system performance.
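The overhead argument can be made concrete with a rough calculation. The Python sketch below assumes a fixed per-packet overhead of 24 bytes for framing, header and CRC; that figure is an illustrative assumption, not a value taken from this description, and the function names are hypothetical.

```python
# Assumed per-packet overhead in bytes (illustrative figure only).
ASSUMED_OVERHEAD_BYTES = 24


def packet_count(total_payload: int, mps: int) -> int:
    """Number of packets needed to carry total_payload bytes (ceiling division)."""
    return -(-total_payload // mps)


def link_efficiency(total_payload: int, mps: int,
                    overhead: int = ASSUMED_OVERHEAD_BYTES) -> float:
    """Fraction of transmitted bytes that are payload rather than overhead."""
    packets = packet_count(total_payload, mps)
    return total_payload / (total_payload + packets * overhead)


# Compare carrying 4096 bytes at three different MPS settings.
for mps in (128, 512, 4096):
    print(mps, packet_count(4096, mps), round(link_efficiency(4096, mps), 3))
```

Under this assumption, carrying 4096 bytes at an MPS of 128 bytes takes 32 packets and incurs 32 times the overhead of a single 4096-byte packet, which is the effect the per-pair MPS setting is intended to reduce.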
Methods of operating system 1 of
Thereafter the packet 22 is transferred from the endpoint device 14 to the buffer 11 via associated second switch port 8 (step 205). Since the endpoint device MPS (1024 bytes) exceeds the root complex device MPS (512 bytes), the packet 22 stored in the buffer 11 is divided into a pair of sub-packets 22a, 22b each having a data payload equal to 512 bytes, that is, equal to the MPS of the root complex device. As explained above, formats for the pair of 512 byte sub-packets 22a, 22b vary according to whether the original packet 22 is a read completion packet, write request packet or message request packet. Thereafter, the pair of sub-packets 22a, 22b are consecutively transferred to the root complex device 2 via the first switch port 7 (step 208), completing the packet transfer. Those skilled in the art would appreciate that method steps 201-204 could be performed in a different sequence from that shown in
Now let us assume a 2048 byte packet 32 is being transferred from endpoint device 15 to endpoint device 14 and the MPS supported by the endpoint device 15 and the endpoint device 14 are 2048 bytes and 1024 bytes, respectively, as indicated in the schematic diagram of
Thereafter, packet 32 is transferred from the endpoint device 15 to the buffer 12 via associated second switch port 9 (step 205). Since the MPS of the endpoint device 15 (2048 bytes) exceeds the MPS of the endpoint device 14 (1024 bytes), the packet 32 stored in the buffer 12 is divided into a pair of sub-packets 32a, 32b each having a data payload equal to 1024 bytes, that is, equal to the MPS of the endpoint device 14. Thereafter, the pair of sub-packets 32a, 32b are consecutively transferred to the endpoint device 14 via the first switch port 8 (step 208), completing the packet transfer (step 210).
Those skilled in the art would understand that the method 100 for transferring packets between devices connected to a PCI-Express bus can be implemented in accordance with one or more alternative embodiments. For example, in an alternative embodiment, the method 100 can be implemented in a PCI-Express bus system in which the buffers and/or switch devices are integrated in the root complex device or in which the packet divider is separate from the switch and root complex. In such alternative embodiments of the method 100, the packet divider has a port for receiving the packet from the source device and another port for transmitting the packet or sub-packets to the destination device.
In accordance with additional alternative embodiments, the method described herein can further comprise configuring the MPS of the packet divider port for receiving the packet to equal the MPS of the source device and configuring the MPS of the packet divider port for transmitting the packet or sub-packets to equal the MPS of the destination device. Additionally, the method can comprise storing the packet in the packet divider by transmitting the packet from the source device to the packet divider port for receiving the packet, and transmitting the packet or sub-packets to the destination device from the packet divider port for transmitting the packet.
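A standalone packet divider with separately configured receive and transmit ports might be modelled as in the Python sketch below. The class name `PacketDivider`, its method names, and the use of a simple first-in, first-out buffer are assumptions for illustration; packets are again modelled as bare payloads.

```python
from collections import deque


class PacketDivider:
    """Hypothetical store-and-forward divider with two configured ports."""

    def __init__(self, receive_port_mps: int, transmit_port_mps: int) -> None:
        # Receive port MPS is set to the source device MPS; transmit port
        # MPS is set to the destination device MPS.
        self.receive_port_mps = receive_port_mps
        self.transmit_port_mps = transmit_port_mps
        self._buffer: deque[bytes] = deque()

    def receive(self, payload: bytes) -> None:
        """Store a packet arriving from the source device."""
        if len(payload) > self.receive_port_mps:
            raise ValueError("malformed packet: payload exceeds receive port MPS")
        self._buffer.append(payload)

    def transmit(self) -> list[bytes]:
        """Send the oldest stored packet, divided if it exceeds the transmit MPS."""
        payload = self._buffer.popleft()
        step = self.transmit_port_mps
        if len(payload) <= step:
            return [payload]
        return [payload[i:i + step] for i in range(0, len(payload), step)]


# Values from the endpoint-to-endpoint example: 2048-byte source, 1024-byte destination.
divider = PacketDivider(receive_port_mps=2048, transmit_port_mps=1024)
divider.receive(bytes(2048))
sub_packets = divider.transmit()
```

Here `sub_packets` holds two payloads of 1024 bytes each, corresponding to sub-packets 32a and 32b in the example above.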
It will be appreciated that variations of the above-disclosed and other features, aspects and functions, or alternatives thereof, may be desirably combined into many other different systems or applications.
Also, it will be appreciated that various presently unforeseen or unanticipated alternatives, modifications, variations or improvements therein may be subsequently made by those skilled in the art which are also intended to be encompassed by the following claims.