Most current high-capacity switches follow an input, switch fabric, output architecture. The input side performs optical-to-electrical (O/E) conversion, forwarding and policy enforcement, and segmentation, queuing, and scheduling. The switch fabric provides the switching path that connects the input and output ports, and usually includes an arbitrator/scheduler to avoid contention. The output side reassembles the packets, performs additional output scheduling, and converts the signal from electrical to optical. Such switches face potential issues such as:
1) Electrical switching requires high-speed electrical connections from the input/output ports to the switch fabric. Such connections have limited reach and require a single-chassis or pizza-box layout, so the switching scale and physical arrangement are limited; and
2) The high-speed electrical connection is usually achieved through a serializer/deserializer (SerDes, also called a transceiver), which provides a parallel interface to the device's internal processing and a serial interface for PCB routing to the fabric interface. Such SerDes and the crossbar switch fabric consume a lot of power, particularly when the system has a large switching capacity.
An optical burst switching network is one solution to the aforementioned problems: the edge router/switch assembles packets into bursts, sets up the path along the network, and then transmits the burst. In one variation, the burst is assembled in the intermediate switching node, which performs all-optical switching plus the burst assembly function.
Optical label switching is another solution. It uses a short optical label to control and configure the switching path, while the accompanying data packet or burst is switched all-optically. In such a switching system, the label is extracted and processed electrically to obtain the destination and policy information; the switch fabric is configured based on the processing result, and then the data packet/burst is fed into the switch fabric and switched to the destination output port.
Another solution uses an external wavelength switch to provide a fixed connection from one switch to another, either to increase the switching scale or to bypass electrical processing for large volumes of traffic.
In one aspect, a switching system uses a centralized arbitrator to resolve output port contention; assembles packets into a burst according to the grant information and the input scheduling result; uses a tunable laser at the input port to modulate the burst of packets onto the specific wavelength that reaches the destination; and uses an optical switch fabric to connect the burst of packets from the input port to the output port.
In one embodiment, the switched optical burst is directly sent to the interface of the connected system or network interface card (NIC). That interface is capable of receiving in burst mode.
In another embodiment, the switched optical burst is timing-adjusted and/or re-framed, and sent in continuous mode to the connected interface.
In yet another embodiment, the system has a hybrid electrical and optical switch fabric, providing burst switching for some ports through the optical fabric and a traditional processing/switching path for other ports.
In a further embodiment, the input port/line card and the switch fabric (and the arbitrator) are physically located in different places, which allows flexibility in physical layout.
Advantages of the above system, or the competitive/commercial value of the solution achieved by the system, may include one or more of the following. The system is lower in cost, power-efficient, and flexible in connection. Optical switching avoids high-speed electrical connections from the input/output ports to the switch fabric and allows long connection distances, so the switching scale and physical arrangement are less constrained than in electrical switches. There is no need for high-power SerDes or a crossbar switch fabric. Additionally, the system has a large switching capacity.
Centralized arbitrator 108 coordinates the burst requests from all the ports. The arbitration can be either asynchronous or based on fixed, synchronous time slots. In the asynchronous case, arbitrator 108 returns the granted start time, the duration, and the granted destination port. In synchronous arbitration, the whole system runs on synchronized time slots, each time slot carries one burst, and the arbitrator responds (either implicitly or explicitly) with the granted time slot plus the destination port. For a granted time slot, the source port tunes the laser to the destination's wavelength and reads from the corresponding buffer.
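The synchronous case can be illustrated with a minimal Python sketch. It is not the arbitration algorithm of the described system: the first-come-first-served policy, the `Grant` record, the `arbitrate_slot` and `laser_setting` helpers, and the wavelength values are all assumptions introduced for illustration. The sketch only shows the constraint that, within one time slot, each source port sends at most one burst and each destination port receives at most one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    slot: int         # granted time slot
    source: int       # source port that may transmit
    destination: int  # destination port the burst is switched to


def arbitrate_slot(slot, requests):
    """Resolve contention for one synchronous time slot.

    requests is a list of (source, destination) pairs in arrival order.
    First-come-first-served is an assumed policy, used only for illustration.
    """
    busy_sources = set()
    busy_destinations = set()
    grants = []
    for source, destination in requests:
        if source in busy_sources or destination in busy_destinations:
            continue  # contention: this request must wait for a later slot
        busy_sources.add(source)
        busy_destinations.add(destination)
        grants.append(Grant(slot, source, destination))
    return grants


# Hypothetical wavelength plan: one wavelength (in nm) per destination port.
WAVELENGTH_NM = {0: 1550.12, 1: 1550.92, 2: 1551.72}


def laser_setting(grant):
    """Wavelength the source port tunes its laser to for a granted slot."""
    return WAVELENGTH_NM[grant.destination]


grants = arbitrate_slot(7, [(0, 2), (1, 2), (1, 0), (2, 0)])
for g in grants:
    print(g, laser_setting(g))
```

In this run, port 0 is granted destination 2 and port 1 is granted destination 0; the requests (1, 2) and (2, 0) lose the contention and would be retried in a later slot.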
In terms of arbitration, in one embodiment the arbitrator uses a priority-based solution, in which priority is considered while scheduling. If the source port does not have enough packets of that priority to fill the complete time slot, it may use sub-scheduling to fill the slot with packets of other priorities. Alternatively, the arbitrator schedules the requests in an aggregated way without considering service class, either distributing the bandwidth equally (as in round-robin mode) or using weight-based scheduling in which the total allocated bandwidth between each pair of source and destination ports serves as the weight. In such a case, the scheduler in the source port further distributes the granted bandwidth among the different queues and assembles the burst based on this distribution. In one embodiment, the source port organizes its queues using the traditional VOQ (Virtual Output Queue) method, and a burst is assembled while it is being read out.
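The sub-scheduling step can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the described implementation: the `assemble_burst` function, the representation of a VOQ as a dictionary of per-priority queues of packet lengths, the slot size in bytes, and the rule that lower numbers mean higher priority are all hypothetical.

```python
from collections import deque


def assemble_burst(voq, granted_priority, slot_bytes):
    """Fill one granted time slot from the VOQ of a single destination.

    voq maps priority -> deque of packet lengths in bytes (assumed layout).
    Packets of the granted priority are read first; any leftover room is
    filled from the other priorities (sub-scheduling), lowest number first.
    Reading from a queue stops at the first packet that no longer fits,
    so packet order within each queue is preserved.
    """
    burst = []
    remaining = slot_bytes
    order = [granted_priority] + sorted(p for p in voq if p != granted_priority)
    for priority in order:
        queue = voq.get(priority, deque())
        while queue and queue[0] <= remaining:
            length = queue.popleft()
            burst.append((priority, length))
            remaining -= length
    return burst, remaining


voq = {0: deque([400]), 1: deque([600, 500]), 2: deque([200, 200])}
burst, remaining = assemble_burst(voq, granted_priority=1, slot_bytes=1500)
print(burst, remaining)
```

Here the two priority-1 packets leave 400 bytes unused, so the slot is topped up with the priority-0 packet; the priority-2 packets stay queued because no room remains.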
The interface of switch 100 can be connected either to another switch or to a network interface card (NIC). The receiver of such an interface works in burst mode and is able to receive bursts of packets originating from different source ports.
In one embodiment, the switch fabric and arbitrator of system 100 can be physically separated from the line cards: multiple line cards can be grouped in one system (called a line card system), and the fabric plus arbitrator can be in another system (called a fabric system) that serves multiple line card systems. As shown in
When the line card and fabric systems are separated, the output interfaces to the network may be connected directly to the fabric output. For operational convenience, in one embodiment, the output ports of the switch fabric are connected back to the line cards, and then to the output of the network interface through the line cards. In one embodiment, the input and output of each fabric port use one fiber. As given in
Further flexibility and/or cost savings can be achieved by moving the line card's function into the network interface card (NIC) that is plugged into a computer. In this case, the computer (or the NIC) has a routing table for its packets, so it can resolve the destination; it assembles the bursts and requests bandwidth from the arbitrator directly; and it has a tunable laser to generate the wavelength corresponding to the given destination.
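As a rough Python sketch of the NIC-side steps, the example below resolves a destination from a routing table and builds a bandwidth request. Everything concrete in it is an assumption made for illustration: the prefixes, the port numbers, the wavelength plan, the use of longest-prefix matching, and the `resolve_port` and `build_request` helpers with their request fields are not specified by the text above.

```python
import ipaddress

# Hypothetical routing table: IP prefix -> fabric destination port.
ROUTES = {
    ipaddress.ip_network("0.0.0.0/0"): 0,    # default route
    ipaddress.ip_network("10.0.0.0/8"): 1,
    ipaddress.ip_network("10.1.0.0/16"): 2,
}

# Hypothetical wavelength plan: one wavelength (in nm) per destination port.
WAVELENGTH_NM = {0: 1550.12, 1: 1550.92, 2: 1551.72}


def resolve_port(address):
    """Resolve the destination port by longest-prefix match (assumed rule)."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in ROUTES if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]


def build_request(source_port, address, burst_bytes):
    """Build the request the NIC would send to the arbitrator (assumed fields)."""
    destination = resolve_port(address)
    return {
        "source": source_port,
        "destination": destination,
        "bytes": burst_bytes,
        # Wavelength to tune to once the request is granted.
        "wavelength_nm": WAVELENGTH_NM[destination],
    }


request = build_request(source_port=5, address="10.1.2.3", burst_bytes=9000)
print(request)
```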
To provide a standard interface such as an 802.3 Ethernet interface, in one embodiment, the output direction of a line card has a mapping and/or retiming module that converts the bursts to the standard signal. As shown in
The system can also be a hybrid of the traditional architecture and those described above. Packets destined for a standard interface follow the traditional processing path, such as input queuing/scheduling, switching, output queuing and scheduling, and output framing; for packets destined for a burst-mode-capable interface, the aforementioned solutions can be applied.
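The path selection in such a hybrid system can be sketched in a few lines of Python. The set of burst-capable ports, the `Path` names, and the `select_path` helper are hypothetical and serve only to illustrate the per-destination decision.

```python
from enum import Enum


class Path(Enum):
    ELECTRICAL = "electrical"        # traditional queuing/switching/framing path
    OPTICAL_BURST = "optical_burst"  # arbitrated burst through the optical fabric


# Hypothetical configuration: output ports whose receivers work in burst mode.
BURST_CAPABLE_PORTS = {4, 5, 6, 7}


def select_path(output_port):
    """Choose the processing path based on the destination interface type."""
    if output_port in BURST_CAPABLE_PORTS:
        return Path.OPTICAL_BURST
    return Path.ELECTRICAL


print(select_path(1), select_path(6))
```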
In terms of applications, the aforementioned system can be used as a ToR (Top of Rack) switch, with standard interfaces connecting to the aggregation switch and burst-mode interfaces connecting to the NICs plugged into the servers. Such a system can also be part of the aggregation switch, with burst-mode interfaces connecting either to the ToR switch or, through extended connections, directly to the servers.
While the invention has been described in its preferred embodiment, it is to be understood that the words which have been used are words of description rather than limitation and that changes may be made within the purview of the appended claims without departing from the true scope and spirit of the invention in its broader aspects.
The present application claims priority to Provisional Application 61/890,438 filed Oct. 14, 2013, the content of which is incorporated by reference.
Number | Date | Country
---|---|---
20150104171 A1 | Apr 2015 | US
Number | Date | Country
---|---|---
61890438 | Oct 2013 | US