This invention relates generally to data communications. More particularly, this invention relates to a distributed optical switching architecture for data center networking.
“Big data” is prevalent, and its computing and storage needs are increasing exponentially. Cloud architectures are commonly used to address big data challenges. In a data center, server and storage resources are interconnected with packet switches and routers, which provide the basic internal data center networking functionality. Data centers are also interconnected across wide area networks through routing and transport systems, forming what is known as the cloud.
Data centers can be of three types: private, public, or virtually private. Their sizes also vary. Tier 1 data centers may contain thousands of racks and millions of servers. Tier 2 data centers may host hundreds of thousands of servers, with the number of racks ranging from 250 to 2,000. Tier 3 and 4 data centers have fewer than 250 racks.
A conventional data center network typically has a hierarchical architecture. Each rack of servers connects to a top of rack (TOR) Ethernet switch, which is usually considered an access switch. A plurality of such top of rack switches connect to a higher level of Ethernet switch, which is generally referred to as an aggregation switch. The aggregation switch provides a packet switching function between its downlinks and its uplinks. A plurality of such aggregation switches further connect through their uplinks to a still higher level of Ethernet switch; this type of hierarchy repeats. The highest level of Ethernet switch is generally referred to as the core switch. In addition, a gateway provides inter-data center connectivity and connectivity to the Internet and end users.
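The fan-out of such a hierarchy can be sketched numerically. The following is a minimal illustration only; all port counts are assumed example values, not figures taken from the disclosure.

```python
# Illustrative sketch of the conventional three-tier hierarchy described
# above (TOR -> aggregation -> core). All radix values are assumptions.

def servers_supported(tor_down_ports, agg_down_ports, core_down_ports):
    """Return how many servers one TOR -> aggregation -> core tree can host,
    assuming every downlink port is fully populated."""
    tors_per_agg = agg_down_ports    # TOR switches under one aggregation switch
    aggs_per_core = core_down_ports  # aggregation switches under one core switch
    return tor_down_ports * tors_per_agg * aggs_per_core

# Example: 48-port TORs, 32 TOR-facing aggregation ports,
# 16 aggregation-facing core ports.
print(servers_supported(48, 32, 16))  # 24576 servers under one core switch
```

The multiplication shows why each added layer multiplies both capacity and the amount of cabling, which motivates the flatter architectures discussed next.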
It is becoming increasingly important to reduce the total power consumption inside data centers. To address these problems, large scale electrical switches were developed to handle hundreds or thousands of 10G ports in a single chassis. Such an architecture has the benefit of fewer hierarchical layers, reduced power consumption, and a simpler cabling structure.
Optical networking technology is well known in the telecom and datacom worlds. Optical links support large capacity transmission over long distances. Optical based channel switching or wavelength switching can provide fast switching speed at much lower power consumption. Thus, optical networking technology is well suited to resolve existing challenges in data centers. Two basic approaches have already been proposed based on different optical switching components.
A system has a first rack with a first set of servers and a first top of rack switch and a second rack with a second set of servers and a second top of rack switch. A first optical switch is connected to the first top of rack switch. A second optical switch is connected to the second top of rack switch and the first optical switch. The first optical switch and the second optical switch each employ wavelength selective switching.
The invention is more fully appreciated in connection with the following detailed description taken in conjunction with the accompanying drawings, in which:
Like reference numerals refer to corresponding parts throughout the several views of the drawings.
A multiple-dimension, high-radix optical distributed switching network architecture for internal data center interconnections is disclosed. The link capacity in this distributed switching network is also optically reconfigurable to adapt to the dynamic pattern of internal data center traffic. The solution scales naturally from thousands of servers (e.g., Tier 3 and 4 data centers) to millions of servers (e.g., Tier 1 data centers).
In one embodiment, an optical wavelength switching box is equipped with four multi-fiber ribbons. Each multi-fiber ribbon, named north, south, west, and east, respectively, connects to one of the four neighboring server racks. As shown in
An aspect of the invention is the optical design of the optical wavelength switching node, as depicted in
In another aspect of the invention, the optical switching node includes a passive 1×4 optical splitter in the outbound direction, which broadcasts the DWDM signals to the west, east, north, and south directions. The optical switching node also includes two passive fiber routing blocks. One passive routing block processes the connections for the east and west directions, while the other processes the connections for the north and south directions. Each passive routing block connects to two multi-fiber ribbon cables, where every fiber of the multi-fiber ribbon carries the broadcast DWDM signals. The design of the passive routing blocks is described below.
In a further aspect of the invention, the optical wavelength switching node also contains an optical wavelength switch. The optical wavelength switch dynamically selects (switches) one or a group of DWDM signals from one or a group of neighboring server racks. The optical wavelength switch may also block (disconnect) the unselected DWDM signals from one or a group of neighboring server racks to the TOR switch. Thus, the bandwidth of any rack to rack connection can be dynamically reconfigured at wavelength granularity. Finally, the optical switching node may also include one optical amplifier or a pair of optical amplifiers (e.g., Erbium-Doped Fiber Amplifiers (EDFAs)) to amplify the DWDM optical signals and compensate for the optical insertion loss of the optics.
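The select/block behavior described above can be captured as a small behavioral model. This is a sketch of the switching logic only; the class and port names are illustrative assumptions, and no optical physics is modeled.

```python
# Behavioral sketch of the N x 1 optical wavelength switch described above:
# it passes a selected subset of DWDM channels from each neighbor input to
# the TOR switch and blocks the rest. Names here are illustrative.

class WavelengthSwitch:
    def __init__(self, inputs):
        # selected[port] = set of wavelength channel indices passed through
        self.selected = {port: set() for port in inputs}

    def select(self, port, channels):
        """Pass the given DWDM channels from a neighbor rack to the TOR."""
        self.selected[port].update(channels)

    def block(self, port, channels):
        """Disconnect channels, reclaiming that bandwidth for other links."""
        self.selected[port].difference_update(channels)

    def output(self, incoming):
        """incoming: {port: channels present}. Return channels reaching the TOR."""
        return {port: chans & self.selected[port]
                for port, chans in incoming.items()}

sw = WavelengthSwitch(["west", "east", "north", "south"])
sw.select("east", {1, 2, 3})   # grant three channels to the east neighbor
sw.block("east", {3})          # reconfigure: reclaim one channel
print(sw.output({"east": {1, 2, 3, 4}, "west": {5}}))
# {'east': {1, 2}, 'west': set()}
```

Reconfiguring a rack-to-rack link at wavelength granularity then amounts to calling `select` and `block`, mirroring how the switch reallocates capacity among neighbors.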
The optical wavelength switch may be implemented by a wavelength selective switch (WSS). In that case, the wavelength selective switch is configured as an N×1 switch to select wavelengths from different sources. The optical wavelength switch may also be implemented by an arrayed waveguide grating router (AWGR) with a tunable filter array.
The optical wavelength switch element can also be implemented by an optical multicast switch (MCS) plus a tunable filter array, as shown in
The splitting ratio of each splitter is optimized to balance the optical insertion loss across every node to node connection. The splitting ratio of each splitter follows the rule shown in Table 3-1.
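One common equal-power rule for a chain of taps illustrates the kind of balancing involved: the k-th of N taps diverts 1/(N−k+1) of the power arriving at it, so every drop receives the same share. This is a sketch of that general rule, not a reproduction of the specific values in Table 3-1.

```python
# Equal-power splitting rule for a chain of N taps on a broadcast bus:
# tap k (1-indexed) diverts 1/(N - k + 1) of the remaining power, so each
# drop ends up with P_in / N. Illustrative of the balancing rule only;
# the exact ratios in Table 3-1 may differ.

def tap_ratios(n):
    return [1.0 / (n - k + 1) for k in range(1, n + 1)]

def drop_powers(n, p_in=1.0):
    powers, remaining = [], p_in
    for r in tap_ratios(n):
        powers.append(remaining * r)   # power diverted at this tap
        remaining *= (1 - r)           # power continuing down the bus
    return powers

print(tap_ratios(4))   # approximately [0.25, 0.33, 0.5, 1.0]
print(drop_powers(4))  # every drop receives 0.25 of the input power
```

Balancing the ratios this way equalizes insertion loss across node-to-node connections, which in turn relaxes the dynamic-range requirement on the amplifiers.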
The disclosed design defines unified cabling for every optical wavelength switching node and enables a fully meshed connection among the nodes, as shown in
Thus, a physical two-dimensional torus connection is achieved with two-dimensional cabling.
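The torus cabling can be sketched as a neighbor function with wrap-around at the edges of the rack array. The coordinate convention below is an illustrative assumption.

```python
# Sketch of the two-dimensional torus cabling: a node at (row, col) in an
# N x M rack array connects its north/south/west/east ribbons to its four
# neighbors, with wrap-around at the array edges closing the torus.

def torus_neighbors(row, col, n_rows, n_cols):
    return {
        "north": ((row - 1) % n_rows, col),
        "south": ((row + 1) % n_rows, col),
        "west":  (row, (col - 1) % n_cols),
        "east":  (row, (col + 1) % n_cols),
    }

# Corner node of a 4 x 4 array: edge links wrap to the opposite side.
print(torus_neighbors(0, 0, 4, 4))
# {'north': (3, 0), 'south': (1, 0), 'west': (0, 3), 'east': (0, 1)}
```

Because every node uses the same four-ribbon cabling pattern, the wrap-around links are the only cables whose length depends on the array size.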
The architecture is naturally scalable. A new optical switching node can easily be added at any location adjacent to the existing N×M server rack array.
The network size of the described architecture is defined by N, which is restricted by the optical power budget and by technology limits on high-port-count wavelength selective switching. However, another layer of optical wavelength switching nodes can be added for additional dimensions. Thus, an N-ary, 4-fly optical switching architecture is enabled, or other simplified architectures can be achieved at the cost of longer cabling runs.
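The benefit of adding dimensions can be sketched with the standard size and diameter formulas for a k-ary d-dimensional torus; the specific values below are illustrative, not taken from the disclosure.

```python
# Scaling sketch for a k-ary d-dimensional torus: node count grows as k**d,
# while the worst-case hop distance grows only linearly with d. Values are
# illustrative example parameters.

def torus_size(k, d):
    return k ** d

def torus_diameter(k, d):
    # Longest shortest path: up to floor(k/2) hops in each of d dimensions.
    return d * (k // 2)

print(torus_size(16, 2), torus_diameter(16, 2))  # 256 nodes, 16 hops
print(torus_size(16, 4), torus_diameter(16, 4))  # 65536 nodes, 32 hops
```

Doubling the dimension count from 2 to 4 thus multiplies capacity by k squared while only doubling the worst-case hop count, at the cost of the longer inter-dimension cabling noted above.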
The AWGR based optical switching node of
The disclosed technology provides a novel reconfigurable optical architecture to enable distributed optical switching for data center networking. The solution scales easily to support warehouse-scale data centers with low initial cost and low total cost. The solution is also reconfigurable to support dynamic traffic patterns for intra-data center networking with low information latency. The solution also benefits from the merits of optical switching technology to dramatically reduce power consumption and simplify cabling in the data center.
In the prior art, the core optical switching is centralized, so the switching capacity and scalability are limited; such designs are therefore not suitable for large scale data centers. Also, prior art solutions do not exploit space-division multiplexing (SDM) to simplify the cabling, and thus it is difficult to scale up the data center size. While one prior art approach exploits both SDM and wavelength-division multiplexing (WDM) technology, it does not introduce wavelength selective switching (WSS) in the design and still relies on electrical switching capability to realize a distributed switching system. Thus, this approach suffers from static and limited node to node optical link capacity and does not resolve the power consumption issue when the link rate scales up.
The foregoing description, for purposes of explanation, used specific nomenclature to provide a thorough understanding of the invention. However, it will be apparent to one skilled in the art that specific details are not required in order to practice the invention. Thus, the foregoing descriptions of specific embodiments of the invention are presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the invention to the precise forms disclosed; obviously, many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, to thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as are suited to the particular use contemplated. It is intended that the following claims and their equivalents define the scope of the invention.
This application claims priority to U.S. Provisional Patent Application Ser. No. 61/886,553, filed Oct. 3, 2013, the contents of which are incorporated herein by reference.