Communication networks tend to be constructed according to various physical and/or logical topologies, which can often depend on the capabilities of the components of the communication network. For example,
Network 100 has a lower layer 110 comprised of servers 112, which are typically rack mounted or otherwise concentrated with regard to physical location. A layer 120 uses layer 2 top-of-the rack (TOR) switches 122 to connect servers 112. A layer 130 is composed of layer 2 and/or layer 3 aggregation switches (AS) 132 to interconnect several TOR switches 122. A layer 140 is the top layer of network 100, and is composed of core routers (CR) 142 that connect aggregation switches 132. Often, core routers 142 also function as a gateway to connect to an Internet 150.
One major drawback of the network architecture of network 100 is that the design is oriented mostly toward network traffic from users to the servers, so-called North-South traffic that travels in a generally vertical direction in network 100. Due to the very high oversubscription ratio from layer 120 to layer 140, which is collectively from about 1:80 to about 1:240, the so-called West-East traffic between servers 112 that travels in a generally horizontal direction in network 100 can be subject to performance issues. For example, such high oversubscription ratios can create a bottleneck for traffic between servers 112, since the traffic typically flows through layers 120, 130 and 140, rather than directly between servers 112.
Several network topologies have been proposed to overcome the above-mentioned drawbacks of network 100, where the architectural aim is to flatten the network topology to promote West-East traffic and reduce the oversubscription ratio to a more reasonable range of from about 1:3 to about 1:1.
Besides fat-tree, other network topologies based on Clos architecture have been proposed, such as the spine and leaf topology of network 300 of
However, fundamentally, both fat-tree and folded Clos architectures are topologically similar to traditional layered networks, in that they are all assembled in a tree-like topology. The difference is that the fat-tree and folded Clos arrangements use a series of switches in the top layer, while the traditional network uses one or more big routers at the top layer. These architectures are often called “scale-out” architectures rather than “scale-up” (bigger router) architectures.
One drawback of fat-tree and folded Clos architectures is the increased number of switches used. In addition, large numbers of cable connections are made between all the switches being used to implement the architectures. The complexity of the cabling connectivity and the sheer number of cables used to implement these architectures make them less attractive from a practicality viewpoint. Moreover, in practice, these architectures tend to scale poorly once the network has been built, due at least in part to the further increased complexity of modifying and adding a relatively large number of cable connections. In addition to the complexity, the costs tend to be driven up by relatively expensive cabling used to implement these architectures.
For example, optical cabling is often used to increase speed and throughput in a data center network. Switch ports are directly connected to other switch ports according to the topology configuration, so careful mapping of ports that may be physically separated by relatively large distances is undertaken. In addition, the physical reach of the optical cables is often expected to be greater than 100 meters. If there is a problem with a cable, or a switch component malfunctions, correction of the problem can be costly as well as complicated to implement, since switches and/or cables may need to be installed and correctly connected in accordance with the complex topology being implemented.
As data centers become more like high performance computing (HPC) platforms, many of the network topologies used in HPC have been proposed for data center networks. However, the topologies employed in an HPC application do not translate well to data center network environments, since the HPC computer processors tend to be densely packed, and the networking connections tend to be restricted to a smaller space, thus limiting complexity and cost for those applications.
Accordingly, the relationship between the number of switches, number of ports on a switch and cabling requirements to implement a desired network topology can present significant challenges in practice. Moreover, problems with scalability and maintenance further increase cost and complexity for scaling up or scaling out and maintaining a desired network topology.
The present disclosure provides a system and method for connectivity of network devices that permits simplified connections for realizing complex networking topologies. The connectivity for the network devices can be achieved using lower cost components. The disclosed system and method permits cabling to be simplified and permits reduced cost cabling to be used to make connections while providing implementations of complex networking topologies. The disclosed system and method assist in simplifying connectivity implementation, so that complex networking topologies can be realized faster and with greater reliability.
Typically, data center network implementation involves connectivity that uses optical technology, which tends to dictate at least a portion of implementation cost. Some of the types of optical technology used for connectivity can include:
The above 850 nm 12 channel module tends to be the lowest cost solution but may be limited to a 100 meter reach. The Silicon Photonics 40G QSFP+ (quad small form factor pluggable) (from Molex) can reach 4 km and the cost can be one quarter of the CWDM (coarse wavelength division multiplexing) SFP+ solution. Although the Silicon Photonics 40G QSFP+ is not CWDM, it can advantageously be used in a low cost solution in accordance with the present disclosure. The present disclosure permits multi-fiber MTP (multi-fiber termination push-on) fiber to be incorporated into various topologies according to user design, and can accommodate topologies such as chordal rings, including mesh rings, such as a mesh ring with 11 or more nodes. A number of other desirable topologies are also possible.
According to an aspect of the present disclosure, a connectivity arrangement is provided at a network node that includes fiber optic transmitters and receivers. The connectivity configuration provides for pass-through fiber connections that are passive and that offer an optical signal path that is offset or shifted by one or more connector positions as the optical signal passes through the node. The connector position offset for pass-through fiber optic connections permits direct optical signal connection between network nodes that are not necessarily physically connected to each other.
For example, using a disclosed connectivity configuration, a fiber optic signal can originate on one node and be transmitted to another node via a direct physical connection. The transmitted fiber optic signal is received at a first connector interface at an incoming connector position and passed through the node via a passive fiber pathway to a second connector interface at an outgoing connector position that is shifted or offset from the incoming connector position. The second connector interface is directly physically connected to a third node that receives the optical signal directly from the first node via the intermediate node. Thus, the third node is not directly physically connected to the first node, but receives the optical signal directly from the first node via the shifted passive optical pathway in the intermediate node.
In the above example, there is a distinction between direct physical connections between nodes, and direct optical connections between nodes. The direct physical connection is in the form of a cable that can be directly connected between two nodes, while direct optical connection can be implemented via an optical connection between two nodes where the path of the direct optical connection includes an intermediate node that passively passes an optical signal that is shifted or offset by at least one connector position. Accordingly, one or more nodes can be “skipped” with the use of the connection offset or shift, which connectivity configuration can be commonly applied to all of the nodes for simplified modularity and construction, while permitting simplified connectivity.
According to another aspect, one or more connector positions can each be coupled to a bidirectional fiber construct. The bidirectional construct can transmit and receive on a single fiber, so that a single connector position is used for transmitting and receiving. This configuration saves connector space and permits relatively complex network topologies to be implemented with fewer connector positions and thus reduce the number of connector positions that are used in the cabling provided to each of the nodes. The connectivity arrangement permits a bidirectional signal transmitted and received between the bidirectional constructs on different nodes to pass through one or more nodes with a passive connection based on a pathway that connects one connector position for one connector (plug) to an offset or shifted connector position for another connector (plug). The connectivity arrangement can be implemented at each node so that a common connectivity configuration can be used at each node to simplify connectivity cabling for the entire network.
The disclosed system and method can reduce the number of cables used to connect switches to implement relatively complex network topologies while providing greater chordal reach. The arrangement for connectivity in accordance with the present disclosure also can eliminate multiplexers/demultiplexers and wavelength division multiplexing lasers in a node to further reduce the component requirements and simplify connectivity solutions.
The present disclosure is described in greater detail below, with reference to the accompanying drawings, in which:
Data center switches and routers can utilize fiber optical interconnections through their network interface ports. In accordance with the present disclosure, standard fiber optical connectors, in conjunction with internal fiber optical interconnections and configurations, can be used to implement desired network topologies.
While the architectures illustrated in
Some or all of the network nodes in a datacenter network can, for example, be configured with the arrangement of network node 600. In such a configuration, each of the connector positions 1-23 is connected to the same numbered connector position in a connected node. So, for example, connector position 1 in set 610 is connected to connector position 1 in a connector of a node to which network node 600 is directly physically connected. In such an instance, connector position 1 of set 610 receives a signal from connector position 1 of a network node physically connected to network node 600 via set 610. Likewise, connector position 1 of set 620 transmits a signal to a network node physically connected to network node 600 via set 620. Since all the network nodes in this exemplary embodiment can be configured with the same connectivity arrangement of network node 600, connector positions 1 and 2 of each set 610, 620 are respectively reserved for direct, one way, single fiber connections between physically connected nodes.
Connector positions 3-6 in sets 610 and 620 illustrate a shifted or offset arrangement for communicating between nodes. This arrangement permits an intermediate node to passively forward an optical signal from an originating node to a receiving node using a standard fiber optic cable. An optical signal launched from connector position 4 in set 620 would arrive on connector position 4 of set 610 at an intermediate network node, and the signal would be output at connector position 3 of set 620 of the intermediate node. The optical signal would then arrive at connector position 3 of set 610 of a receiving node, so that the optical signal is effectively sent directly from a first node to a third node, skipping an intermediate node. This scenario is implemented in an opposite direction using connector positions 5 and 6 of sets 610 and 620. Thus, an optical signal launched from connector position 6 in set 610 will pass through an intermediately connected network node from connector position 6 in set 620 to connector position 5 in set 610 to land on connector position 5 in set 620 of a third node.
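The shifted pass-through described above for connector positions 3-6 can be illustrated with a minimal sketch. The sketch assumes a physical ring in which set 620 of each node is cabled to set 610 of the next node, with like-numbered positions joined by the cable. The names `PASS_THROUGH`, `TERMINATE` and `trace` are hypothetical and introduced only for illustration; only the position mappings themselves are taken from the description of network node 600.

```python
# Internal passive fiber paths of one node, taken from the description of
# connector positions 3-6: (set, position) in -> (set, position) out.
PASS_THROUGH = {
    (610, 4): (620, 3),  # skip-one path in the 620 -> 610 direction
    (620, 6): (610, 5),  # skip-one path in the opposite direction
}

# Positions at which an arriving signal ends at the node's receiver.
TERMINATE = {(610, 1), (610, 3), (620, 5)}


def trace(start_node, out_set, position, num_nodes):
    """Follow a signal launched from start_node on (out_set, position).

    Assumes (for illustration) a ring of num_nodes nodes numbered 0..num_nodes-1,
    where set 620 of node i is cabled to set 610 of node i+1.
    Returns (receiving_node, path), where path lists (node, set, position).
    """
    node = start_node
    path = [(node, out_set, position)]
    for _ in range(num_nodes):  # bound the walk so a bad map cannot loop forever
        # Cross the cable to the neighboring node; the position number is unchanged.
        if out_set == 620:
            node, in_set = (node + 1) % num_nodes, 610
        else:
            node, in_set = (node - 1) % num_nodes, 620
        path.append((node, in_set, position))
        if (in_set, position) in TERMINATE:
            return node, path
        # Passive pass-through inside the node shifts the connector position.
        out_set, position = PASS_THROUGH[(in_set, position)]
        path.append((node, out_set, position))
    raise RuntimeError("signal never reached a terminating position")


if __name__ == "__main__":
    receiver, path = trace(0, 620, 4, num_nodes=5)
    print(receiver, path)
```

Under these assumptions, a signal launched from position 4 of set 620 on node 0 terminates at position 3 of set 610 on node 2, and a signal launched from position 6 of set 610 on node 0 terminates at position 5 of set 620 on node 3, in each case skipping one intermediate node.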
With the configuration of connector positions 1-6, a five node ring mesh network 700 can be constructed, as is illustrated in
The physical cable connections for network 700 can be physically accomplished using five connector cables with six fibers each, in a physical ring topology, as illustrated in
Referring again to
It is noteworthy that such an extension of networks 700, 800 to seven nodes can be achieved with relative ease, since an additional two cables would be connected to the existing network nodes 810 to form a physical ring. Presuming that each node 810 was arranged to have the configuration of network node 600, such additional connections to two additional nodes would readily produce a logical seven node mesh ring topology configuration. If such an extension were contemplated for directly physically connected nodes in a network, an additional eight cables would be used to interconnect all the nodes, and each node would have six physical cable connections. Accordingly, the connectivity configuration of the present disclosure reduces the number of physical cables used, as well as simplifies network extensions.
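The relationship between the physical ring and the logical mesh can be checked with a short sketch. The function names below are hypothetical, and the reach values (two hops for five nodes, three hops for seven nodes) are arithmetic assumptions about how far a direct optical connection must extend on either side of a node to reach every other node in the ring.

```python
def logical_neighbors(node, num_nodes, reach):
    """Nodes that `node` reaches by direct optical connection within `reach` hops
    in either direction around a physical ring of `num_nodes` nodes."""
    neighbors = set()
    for hop in range(1, reach + 1):
        neighbors.add((node + hop) % num_nodes)
        neighbors.add((node - hop) % num_nodes)
    neighbors.discard(node)  # a node is not its own neighbor
    return neighbors


def is_full_mesh(num_nodes, reach):
    """True if every node has a direct optical connection to every other node."""
    return all(
        len(logical_neighbors(n, num_nodes, reach)) == num_nodes - 1
        for n in range(num_nodes)
    )


def ring_cables(num_nodes):
    """A physical ring uses one cable per node."""
    return num_nodes


def full_mesh_degree(num_nodes):
    """Cables attached to each node if every node pair were cabled directly."""
    return num_nodes - 1


if __name__ == "__main__":
    for n, reach in ((5, 2), (7, 3)):
        print(n, reach, is_full_mesh(n, reach), ring_cables(n), full_mesh_degree(n))
```

With these assumptions, growing the ring from five to seven nodes adds two ring cables, whereas a directly cabled seven node mesh would require six cable connections at each node.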
In the above discussion, the optical pathways are described as being unidirectional. However, it is possible to use bidirectional techniques to further improve the efficiency of the disclosed connectivity configuration. For example, bidirectional pathways are implemented with circulators 602 in network node 600 using connector positions 13-21. Circulators 602 are bidirectional fiber constructs that have an input port and an output port to permit optical signals to be sent and received on a single optical fiber. A transmit QSFP 604 and a receive QSFP 604 are coupled to each circulator 602. Each of transmit QSFP 604 and receive QSFP 604 illustrated in network node 600 is specified as QSFP-LR4. The LR4 variant in transmit and receive QSFPs 604 includes four CWDM transmitters and receivers and an optical multiplexer/demultiplexer. The LR4 variant for QSFP permits four channels to be used with one fiber pair for transmit and receive. It is possible to use nominal QSFP configurations, e.g., without an optical multiplexer/demultiplexer, which would occupy additional fibers. In addition, or alternatively, multi-core fibers can be used with such a nominal QSFP configuration to permit the number of connector positions to be less than in an implementation using single core fibers.
In the arrangement shown in network node 600, connector positions 13-16 provide bidirectional pass-through with three offsets or shifts. This arrangement permits circulator 602 on connector position 13 of set 620 to communicate with circulator 602 on connector position 16 of set 610 on a node that is four nodes away, or through three intermediate nodes. The optical signal provided at connector position 13 in set 620 thus transits three pass-through nodes, being offset or shifted one connector position for each node transited, and arrives at connector position 16 at the fourth node. Accordingly, a direct optical connection between connector position 13 of a first node and connector position 16 of a fourth node is established, with the direct optical connection physically passing through three intermediate nodes. In addition, because the connections are made between circulators 602, the communication between connector position 13 on a first node and connector position 16 on a fourth node is bidirectional.
Connector positions 17-21 further expand on the connectivity configuration of network node 600 by offering a direct, bidirectional optical connection between a first node and a fifth node that passes through four intermediate nodes. In total, connector positions 1-21 in sets 610, 620 permit a direct optical connection with five adjacent nodes on either side of a given node with bidirectional communication. This configuration permits ring mesh network 400 illustrated in
It should be understood that a greater than five node reach can be implemented for an extended topology configuration by expanding the number of connector positions in network node 600, for example. In addition, or alternatively, a greater than five node reach can be implemented by coupling a packet switch or crosspoint switch to a node. The packet switch or crosspoint switch can receive traffic from the node in the network ring and redirect traffic back into the ring, which restarts a five node reach for that node.
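The number of connector positions needed for a given reach can be estimated with a simple sketch. It assumes that a channel reaching a node k hops away occupies k consecutive connector positions (one shift per pass-through node), that unidirectional channels need one such group per direction, and that a bidirectional channel on circulators needs a single group. The function name `allocate_positions` is hypothetical, and the assignment of positions 7-12 to a three hop unidirectional reach is an inference from the numbering of the other groups rather than something stated above.

```python
def allocate_positions(unidirectional_reach, bidirectional_reaches):
    """Assign consecutive connector positions to each reach group.

    Assumptions (illustrative only):
      - a unidirectional channel with reach k uses k positions per direction;
      - a bidirectional channel with reach k uses k positions in total.
    Returns (groups, total), where groups is a list of
    (label, first_position, last_position).
    """
    groups = []
    next_pos = 1
    for k in range(1, unidirectional_reach + 1):
        for direction in ("forward", "reverse"):
            groups.append((f"unidirectional reach {k} {direction}",
                           next_pos, next_pos + k - 1))
            next_pos += k
    for k in bidirectional_reaches:
        groups.append((f"bidirectional reach {k}", next_pos, next_pos + k - 1))
        next_pos += k
    return groups, next_pos - 1


if __name__ == "__main__":
    groups, total = allocate_positions(3, [4, 5])
    for label, first, last in groups:
        print(f"{label}: positions {first}-{last}")
    print("total positions:", total)
```

Under these assumptions, the allocation reproduces the groups described for network node 600 (positions 1 and 2 for direct connections, 3-6 for the skip-one paths, 13-16 and 17-21 for the bidirectional paths), and adding a hypothetical six hop bidirectional reach would occupy six further positions.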
The present disclosure provides an advantage in simplified cabling to realize complex topologies that can be extended and be maintained with relative ease. In addition, the use of circulators and/or reduced number of cables significantly reduces fiber count, leading to significant cost savings, to the point where complex topologies become significantly more practical to realize. Moreover, the nodes are not required to multiplex/demultiplex multiple signals to permit reduced fiber count and cable connections, leading to further reductions in complexity and cost. In addition, numerous desirable topologies can be practically realized without prohibitive costs. For example, chordal ring topologies, mesh topologies, torus topologies, Manhattan grid topologies and other desired topologies, each of two, three or arbitrary dimensions, can be constructed quickly, reliably and inexpensively to permit significant advancements in complex network construction and configuration.
The foregoing description has been directed to particular embodiments of the present disclosure. It will be apparent, however, that other variations and modifications may be made to the described embodiments, with the attainment of some or all of their advantages. The scope of the appended claims is therefore not to be limited to the particular embodiments described herein, and is intended to cover all such variations and modifications as come within the true spirit and scope of the present disclosure.
Number | Date | Country | |
---|---|---|---|
20150078746 A1 | Mar 2015 | US |
Number | Date | Country | |
---|---|---|---|
61845040 | Jul 2013 | US |