PCI Express to PCI Express based low latency interconnect scheme for clustering systems

Information

  • Patent Application
  • 20120226835
  • Publication Number
    20120226835
  • Date Filed
    April 08, 2012
  • Date Published
    September 06, 2012
Abstract
PCI Express (PCIE) is a bus or I/O interconnect standard for use inside a computer or embedded system, enabling faster data transfers to and from peripheral devices. The standard is still evolving but has achieved a degree of stability such that other applications can be implemented using PCIE as a basis. A PCIE based interconnect scheme is proposed that enables switching and interconnection between external systems, so that data can be transported between connected systems to form a cluster of systems. These connected systems can be any computing or embedded systems. The scalability of the interconnect allows the cluster to grow the bandwidth between the systems as it becomes necessary, without changing to a different connection architecture.
Description
FIELD OF INVENTION

This invention relates to cluster interconnect architecture for high-speed and low-latency information and data transfer between the systems in the configuration.


BACKGROUND AND PRIOR ART

The need for a high-speed, low-latency cluster interconnect scheme for data and information transport between systems has been recognized as one needing attention in recent times. The growth of interconnected and distributed processing schemes has made it essential that high-speed interconnect schemes be defined and established to speed up processing and data sharing between these systems.


There are interconnect schemes that allow data transfer at high speeds. The most common and fastest in use today is the Ethernet connection, allowing transport speeds from 10 Mb/s to as high as 10 Gb/s. The TCP/IP protocols used with Ethernet have high overhead with inherent latency that makes them unsuitable for some distributed applications. Effort is under way in different areas of data transport to reduce the latency of the interconnect, as this is a limitation on the growth of distributed computing power.


What is Proposed

PCI Express (PCIE) is an emerging I/O interconnect standard for use inside computers or embedded systems that allows serial high-speed data transfer to and from peripheral devices. A typical PCIE link provides a 2.5 Gb/s transfer rate (this may change as the standard and data rates change). Since the PCIE standard is starting to become firm and is used within systems, what is disclosed is the use of PCIE standard based peripherals, connected directly to one another by data links, as an interconnect between individual stand-alone systems, typically through an interconnect module or a network switch (switch). By using only PCIE based protocols for data transfer over direct physical connection links between the PCIE based peripheral devices (see FIG. 1), without any intermediate conversion of the transmitted data stream to other data transmission protocols or encapsulation of the transmitted data stream within other data transmission protocols, this interconnect scheme reduces the latencies of communication in a cluster. The PCIE standard based peripheral at a peripheral endpoint of the system connects directly, over PCIE protocol based data links, to the PCIE standard based peripheral at the switch. The number of links per connection can therefore be increased as bandwidth needs increase, scaling the bandwidth available within any single interconnect or the system of interconnects. This allows the interconnect architecture to remain constant as the interconnect bandwidth need goes from 2.5 Gb/s with a X1 link (single data link) to much higher values of 10 Gb/s with a X4 link (4 data links), 40 Gb/s with a X16 link (16 data links) or 80 Gb/s with a X32 link (32 data links) and so on, providing for easy scaling of the multi-system cluster.
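
The scaling arithmetic in the paragraph above can be sketched as follows. This is only an illustration: it treats the quoted figure of 2.5 per link as a raw rate in Gb/s (an assumption about units), ignores encoding and protocol overhead, and the function name `connection_bandwidth_gbps` is hypothetical.

```python
# Raw per-link rate quoted in the text; treated here as Gb/s (assumption).
PER_LINK_GBPS = 2.5


def connection_bandwidth_gbps(num_links: int,
                              per_link_gbps: float = PER_LINK_GBPS) -> float:
    """Raw aggregate rate of one connection built from num_links data links."""
    if num_links < 1:
        raise ValueError("a connection needs at least one data link")
    return num_links * per_link_gbps


if __name__ == "__main__":
    # Link widths named in the text: X1, X4, X16 and X32.
    for n in (1, 4, 16, 32):
        print(f"X{n}: {connection_bandwidth_gbps(n):g} Gb/s")
```

The architecture of the connection is the same at every width; only the `num_links` argument changes.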


Some Advantages of the Proposed Connection Scheme:





    • 1. Reduced latency of data transfer, as conversion from PCIE to other protocols such as Ethernet is avoided during transfer.

    • 2. The number of links per connection can scale from X1 to larger numbers such as X32, or possibly even X64, based on the bandwidth needed.

    • 3. Minimum change in interconnect architecture is needed with increased bandwidth, enabling easy scaling with need.

    • 4. Standardization of the PCIE based peripheral will make components easily available from multiple vendors, making the implementation of interconnect scheme easier and cheaper.

    • 5. The PCIE based peripheral to PCIE based peripheral links in connections allow ease of software control and provide reliable bandwidth.








DESCRIPTION OF FIGURES

FIG. 1—Typical Interconnected (multi-system) cluster (shown with eight systems connected in a star architecture using direct connected data links between PCIE standard based peripheral to PCIE standard based peripheral)


FIG. 2—A cluster using multiple interconnect modules or switches to interconnect smaller clusters.





EXPLANATION OF NUMBERING AND LETTERING IN FIG. 1



  • (1) to (8): Systems interconnected in FIG. 1

  • (9): Network Switch (switch) sub-system.

  • (10): Software configuration and control input for the switch.

  • (1a) to (8a): PCI Express based peripheral module (PCIE Modules) attached to systems.

  • (1b) to (8b): PCI Express based peripheral modules (PCIE Modules) at switch.

  • (1L) to (8L): PCIE based peripheral module to PCIE based peripheral module connections having n-links (n-data links).



EXPLANATION OF NUMBERING AND LETTERING IN FIG. 2



  • (12-1) and (12-2): clusters

  • (9-1) and (9-2): interconnect modules or switch sub-systems.

  • (10-1) and (10-2): Software configuration inputs

  • (11-1) and (11-2): Switch to switch interconnect module in the cluster

  • (11L): Switch to switch interconnection



DESCRIPTION OF THE INVENTION

PCI Express (PCIE) is a bus or I/O interconnect standard for use inside a computer or embedded system, enabling faster data transfers to and from peripheral devices. The standard is still evolving but has achieved a degree of stability such that other applications can be implemented using PCIE as a basis. A PCIE based interconnect scheme is proposed that enables switching and interconnection between external systems, so that data can be transported between connected systems to form a cluster of systems. These connected systems can be any computing or embedded systems. The scalability of the interconnect allows the cluster to grow the bandwidth between the systems as it becomes necessary, without changing to a different connection architecture.


FIG. 1 is a typical cluster interconnect. The multi-system cluster shown consists of eight units or systems {(1) to (8)} that are to be interconnected. Each system has a PCI Express (PCIE) based peripheral module {(1a) to (8a)} as an IO module at the interconnect port, with n links, built into or attached to the system. (9) is an interconnect module or switch sub-system, which has a number of PCIE based interconnect modules equal to or greater than the number of systems to be interconnected. In the case of FIG. 1 this number is eight {(1b) to (8b)}, and these modules can be interconnected for data transfer through the switch. A software based control input (10) is provided to configure and/or control the operation of the switch. Link connections {(1L) to (8L)} attach the PCIE based peripheral modules on the respective systems to those on the switch with n links. The value of n can vary depending on the connection bandwidth required by the system.
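
The star arrangement of FIG. 1 can be written down as a small data model. The sketch below is an illustration only: the class and function names (`Connection`, `build_star`) are hypothetical, and the labels simply follow the figure's numbering scheme.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Connection:
    system: int          # system number, (1) to (8) in FIG. 1
    system_module: str   # PCIE module at the system, e.g. "1a"
    switch_module: str   # PCIE module at the switch (9), e.g. "1b"
    link: str            # connection label, e.g. "1L"
    n_links: int         # number of data links (n) in this connection


def build_star(num_systems: int = 8, n_links: int = 1) -> Dict[int, Connection]:
    """Return one system-to-switch connection per system, keyed by system number."""
    return {
        i: Connection(i, f"{i}a", f"{i}b", f"{i}L", n_links)
        for i in range(1, num_systems + 1)
    }


if __name__ == "__main__":
    for conn in build_star(n_links=4).values():
        print(conn)
```

Each system has exactly one connection to the switch, so the switch needs at least as many PCIE modules as there are systems.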


When data has to be transferred between, say, system 1 and system 5, in the simple case the control is used to establish an internal link between PCIE based peripheral modules 1b and 5b inside the switch. Handshakes are established between outbound PCIE based peripheral module (PCIE module) 1a and inbound PCIE module 1b, and between outbound PCIE module 5a and inbound PCIE module 5b. This provides a through connection between the PCIE modules 1a to 5b through the switch, allowing data transfer. Data can then be transferred at speed between the modules and hence between the systems. In more complex cases, data can also be transferred to and queued in storage implemented in the switch and then, when links are free, transferred out to the right systems at speed.
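
The two cases described above (a direct internal link, and queuing in switch storage until a link is free) can be sketched as a toy model. This is not the disclosed switch implementation: the `Switch` class, its method names and its one-link-per-port rule are assumptions made for illustration.

```python
from collections import deque


class Switch:
    """Toy model of switch (9); ports are switch-side modules such as "1b"."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.internal = {}                              # port -> linked peer port
        self.queues = {p: deque() for p in self.ports}  # storage per destination

    def connect(self, a, b):
        """Set up an internal link between two ports; False if either is busy."""
        if a not in self.ports or b not in self.ports or a == b:
            raise ValueError("unknown or identical ports")
        if a in self.internal or b in self.internal:
            return False
        self.internal[a] = b
        self.internal[b] = a
        return True

    def disconnect(self, a):
        """Tear down the internal link that port a takes part in, if any."""
        b = self.internal.pop(a, None)
        if b is not None:
            self.internal.pop(b, None)

    def send(self, src, dst, data):
        """Deliver directly when src and dst are linked, otherwise queue for dst."""
        if src not in self.ports or dst not in self.ports:
            raise ValueError("unknown port")
        if self.internal.get(src) == dst:
            return "delivered"
        self.queues[dst].append((src, data))
        return "queued"

    def drain(self, dst):
        """Return and clear everything queued for dst, oldest first."""
        items = list(self.queues[dst])
        self.queues[dst].clear()
        return items
```

For the example in the text, `connect("1b", "5b")` models the control input establishing the internal link, after which `send("1b", "5b", ...)` is delivered directly; traffic from another port to `"5b"` is queued until it is drained.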


Multiple systems can be interconnected at one time to form a multi-system cluster that allows data and information transfer and sharing through the switch. It is also possible to connect smaller clusters together, to take advantage of the growth in system volume, by using an available connection scheme that interconnects the switches, each of which forms a node of the cluster.


If the need for higher bandwidth and low-latency data transfers between systems increases, the connections can grow by increasing the number of links connecting the PCIE modules between the systems in the cluster and the switch, without completely changing the architecture of the interconnect. This scalability is of great importance in retaining flexibility for growth and scaling of the cluster.
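
As a rough illustration of growing a connection by adding links, the sketch below picks the narrowest link width that meets a required bandwidth. The 2.5 per-link figure (taken as Gb/s) and the candidate widths X1, X4, X16 and X32 come from the text; the function name `links_needed` and the choice to consider only those widths are assumptions.

```python
PER_LINK_GBPS = 2.5            # per-link rate from the text, taken as Gb/s
CANDIDATE_WIDTHS = (1, 4, 16, 32)   # only the widths named in the text


def links_needed(required_gbps: float,
                 per_link_gbps: float = PER_LINK_GBPS,
                 widths=CANDIDATE_WIDTHS) -> int:
    """Smallest candidate width whose raw aggregate rate covers the requirement."""
    for n in sorted(widths):
        if n * per_link_gbps >= required_gbps:
            return n
    raise ValueError("requirement exceeds the widest candidate connection")


if __name__ == "__main__":
    for need in (2.0, 8.0, 25.0, 60.0):
        print(f"{need:g} Gb/s -> X{links_needed(need)}")
```

Only the link count changes as the requirement grows; the modules, switch and topology stay as they are.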


It should be understood that the system may consist of peripheral devices, storage devices, processors and any other communication devices. The interconnect is agnostic to the type of device as long as the device has a PCIE module at the port to enable the connection to the switch. This feature reduces the cost of expanding the system, since only the switch interconnect density has to change for the multi-system to grow.


PCIE is currently being standardized, and that will enable existing PCIE modules from different vendors to be used, reducing the overall cost of the system. In addition, using a standardized module in the system as well as in the switch will reduce the cost of software development and, in the long run, allow available software to be used to configure and run the systems.


As the expansion of the cluster in terms of the number of systems connected, bandwidth usage and control will all be cost effective, it is expected that the overall system cost can be reduced and overall performance improved by the use of standardized PCIE modules with standardized software control.


Typical connect operation may be explained with reference to two of the systems, for example system (1) and system (5). System (1) has a PCIE module (1a) at its interconnect port, which is connected by the connection link (also called data link or link) (1L) to a PCIE module (1b) at the IO port of the switch (9). System (5) is similarly connected to the switch through the PCIE module (5a) at its interconnect port, to the PCIE module (5b) at the switch (9) IO port, by link (5L). Each PCIE module transfers data to and from itself using standard PCI Express protocols, provided by the configuration software loaded into the PCIE modules and the switch. The switch operates under the software control and configuration loaded in through the software configuration input.
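
The hop sequence described above can be listed explicitly. The helper below is a hypothetical illustration (the name `trace_path` is not from the disclosure); it only strings together the reference labels of FIG. 1 in the order data would traverse them.

```python
from typing import List


def trace_path(src: int, dst: int) -> List[str]:
    """Ordered elements between two systems of FIG. 1, via switch (9)."""
    if src == dst:
        raise ValueError("source and destination must differ")
    return [
        f"system ({src})",
        f"PCIE module ({src}a)",
        f"link ({src}L)",
        f"PCIE module ({src}b)",
        "internal link in switch (9)",
        f"PCIE module ({dst}b)",
        f"link ({dst}L)",
        f"PCIE module ({dst}a)",
        f"system ({dst})",
    ]


if __name__ == "__main__":
    print(" -> ".join(trace_path(1, 5)))
```

Every element on the path is a PCIE module or a PCIE link, which is the basis for the claim that no protocol conversion takes place along the way.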



FIG. 2 shows a multi-switch cluster. As the need to interconnect a larger number of systems increases, it will be optimum to interconnect multiple switches of the clusters to form a new, larger cluster. Such a connection is shown in FIG. 2. The connection shown is for two smaller clusters (12-1 and 12-2) interconnected using PCIE modules that can be connected together using any low-latency switch to switch connection (11-1 and 11-2), connected using interconnect links (11L) to provide sufficient bandwidth for the connection. The switch to switch connection transmits and receives data and information using any suitable protocol, and the switches provide the interconnection internally through the software configuration loaded into them.
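
A minimal sketch of how a transfer might be routed in the two-cluster arrangement of FIG. 2 follows. The reference labels (12-1, 12-2, 9-1, 9-2, 11-1, 11-2, 11L) are from the figure; the system names such as "A1" and the `route` function are hypothetical.

```python
from typing import Dict, List

SWITCH_OF_CLUSTER = {"12-1": "9-1", "12-2": "9-2"}
INTER_SWITCH_MODULE = {"9-1": "11-1", "9-2": "11-2"}
INTER_SWITCH_LINK = "11L"


def route(src: str, dst: str, cluster_of: Dict[str, str]) -> List[str]:
    """Elements traversed from src to dst; crosses 11L only between clusters."""
    src_switch = SWITCH_OF_CLUSTER[cluster_of[src]]
    dst_switch = SWITCH_OF_CLUSTER[cluster_of[dst]]
    if src_switch == dst_switch:
        return [src, src_switch, dst]
    return [
        src,
        src_switch,
        INTER_SWITCH_MODULE[src_switch],
        INTER_SWITCH_LINK,
        INTER_SWITCH_MODULE[dst_switch],
        dst_switch,
        dst,
    ]


if __name__ == "__main__":
    membership = {"A1": "12-1", "A2": "12-1", "B1": "12-2"}
    print(route("A1", "A2", membership))
    print(route("A1", "B1", membership))
```

Traffic within one small cluster never leaves its own switch, which is why the individual clusters need no architectural change when they are joined.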


The Following are Some of the Advantages of the Disclosed Interconnect Scheme





    • 1. Provides a low latency interconnect for the cluster.

    • 2. Use of PCI Express based protocols for data and information transfer within the cluster.

    • 3. Ease of growth in bandwidth as the system requirements increase by increasing the number of links within the cluster.

    • 4. Standardized PCIE component use in the cluster reduces initial cost.

    • 5. Lower cost of growth due to standardization of hardware and software.

    • 6. Path of expansion from a small cluster to larger clusters as need grows.

    • 7. Future proofed system architecture.





In fact the disclosed interconnect scheme provides advantages for low latency multi-system cluster growth that are not available from any other source.

Claims
  • 1. An interconnected cluster, the cluster comprising:
      a PCI Express enabled interconnect module comprising a plurality of ports, each said port having a PCI Express based peripheral module enabled for interconnection;
      a switching mechanism in said interconnect module enabled to transfer data between a first of said plurality of ports on said interconnect module and any of a rest of said plurality of ports on said interconnect module;
      a plurality of devices and systems that are PCI Express enabled;
      each of said plurality of devices and systems comprising at least a system interconnect port comprising a PCI Express peripheral module enabled for interconnection;
      said at least a system interconnect port of each of said plurality of devices and systems connecting to one of said plurality of ports of said interconnect module using at least a PCI Express link;
      a data transfer mechanism to transfer data to and from each of said plurality of devices and systems to the one of said plurality of ports of said interconnect module connected to it, wherein said data transfer is done using a PCI Express protocol;
      wherein said data transfer mechanism and said switching mechanism together allow data received from a first of said plurality of devices and systems to be sent to any of a rest of said plurality of devices and systems; and
      wherein said data transfer mechanism and said switching mechanism together further allow data received from any of said rest of said plurality of devices and systems to be sent to said first of said plurality of devices and systems;
      thereby enabling the cluster with interconnection and communication capability between interconnected devices and systems using only PCI Express protocol over PCI Express links.
  • 2. The interconnected cluster in claim 1, wherein each of said plurality of devices and systems is connected to said interconnect module by one or more PCI Express data links connected between said PCI Express peripheral module at said system interconnect port and said PCI Express peripheral module at said port of said interconnect module.
  • 3. The interconnected cluster in claim 1, wherein the connection between the PCI Express peripheral module at the interconnect port of each of the devices and systems and the connected PCI Express peripheral module at the port of the interconnect module can use multiple data links as the bandwidth requirements of the interconnect demand.
  • 4. The interconnected cluster in claim 1, wherein the interconnect module is a network switch.
  • 5. The interconnected cluster in claim 1, wherein the switching between the ports inside the interconnect module is controlled by the configuration software loaded into the interconnect module.
  • 6. An interconnected cluster system comprising two or more smaller clusters to form a larger cluster, each of said smaller clusters having multiple devices and systems and an interconnect module; said interconnected cluster system having suitable low latency interconnection between said interconnect modules of the smaller clusters to allow the overall system growth to the interconnected cluster system.
  • 7. The interconnected cluster system of claim 6, comprising two or more smaller clusters each having multiple devices and systems and an interconnect module, wherein the interconnection between interconnect modules is scalable.
  • 8. The interconnected cluster system of claim 6, comprising two or more clusters each having multiple devices and systems and an interconnect module, wherein the cluster growth can take place without changing the architecture of the individual small clusters.
CROSS REFERENCE TO PRIOR APPLICATION

This application is a continuation of Ser. No. 11/242,463 filed on Oct. 4, 2005 by the same inventor.

Continuations (1)
        Number     Date      Country
Parent  11242463   Oct 2005  US
Child   13441883             US