Terminology and Bandwidth Accounting
Each switch card 1D, 1E contains a 160 Gbps cross-connect IC. The cross-connect IC on the working switch card 1D is illustrated as object 1B. If, in this example, all line cards 1A are the same, and each supports M/16 inputs and M/16 outputs for some number M, then the cross-connect IC 1B comprises an M×M switch matrix, as shown.
The protection switch card 1E and the working switch card 1D are identical. The traffic across the backplane is 640 Gbps since the line cards send and receive 160 Gbps of data to and from each of the two switch cards.
Suppose that the backplane 2C is designed to accommodate 1,280 Gbps of traffic. Each switch card in the expanded system can then cross-connect at most 320 Gbps of traffic. Several options are available to achieve this bandwidth.
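The bandwidth accounting above can be checked with a short calculation. The following is only a sketch: the function names are hypothetical, and it assumes, as in the example, that every switch card both receives and sends its full cross-connect bandwidth over the backplane.

```python
def backplane_traffic_gbps(switch_card_gbps, num_switch_cards=2):
    """Total backplane traffic for a given per-card cross-connect bandwidth.

    Assumption: each switch card receives and sends its full bandwidth,
    so every card contributes twice its cross-connect bandwidth.
    """
    return 2 * switch_card_gbps * num_switch_cards


def max_switch_card_gbps(backplane_gbps, num_switch_cards=2):
    """Largest per-card cross-connect bandwidth a backplane can carry."""
    return backplane_gbps / (2 * num_switch_cards)


# Original system: two switch cards with 160 Gbps cross-connects.
print(backplane_traffic_gbps(160))
# Expanded system: backplane designed for 1,280 Gbps.
print(max_switch_card_gbps(1280))
```

With two switch cards, 160 Gbps per card corresponds to 640 Gbps on the backplane, and a 1,280 Gbps backplane corresponds to 320 Gbps per card, matching the figures in the text.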
1) A single monolithic 320 Gbps cross-connect IC could be used. However, since the number of crosspoints increases quadratically as a function of bandwidth, such an IC is costly to manufacture. The industry is typically not willing to pay four times the price to get twice the bandwidth. The manufacturing cost of such a monolithic IC is particularly high if advances in IC fabrication technology lag bandwidth growth.
2) Parallel processing techniques, such as bit- or byte-slicing to scale the bandwidth, could be employed. For example, two nibble-sliced 160 Gbps cross-connect ICs can switch 320 Gbps data in parallel. See, for example, McKeown, N., et al., “The Tiny Tera: A Packet Switch Core”, Hot Interconnects V, Stanford University, August 1996, incorporated herein by reference.
Legacy line cards that do not slice and reassemble data would have to be modified or replaced, an option that does not offer backward compatibility. A cross-connect bandwidth upgrade should involve upgrading the switch cards while preserving the line cards to be cost-effective.
If the switch card performs slicing and reassembly, then additional ICs are necessary to slice and reassemble the data there. Since all data links are high-speed links (typically at a 2.5 Gbps line rate in 2002), this option doubles the number of high-speed links on the switch card and triples the number of high-speed ports, because the slicer and the reassembler together have twice as many high-speed ports as the cross-connect ICs. As a result, this approach doubles the amount of high-speed link routing on the switch card and triples the power consumed by the high-speed ports.
3) A 320 Gbps cross-connect could be built from multiple smaller cross-connect ICs arranged as a Clos network, but the resulting system is blocking for arbitrary multicast traffic and requires scheduling. See Clos, C., “A Study of Non-Blocking Switching Networks,” Bell System Technical Journal, vol. 32, 406-424, 1953, incorporated herein by reference.
The present invention addresses three concerns for digital cross-connect system bandwidth upgrade with integrated circuits (ICs): 1) scalability with minimal additional cross-connect IC cost; 2) backward compatibility with existing line cards; and 3) non-blocking switching for arbitrary multicast.
The present invention requires only K identical cross-connect ICs for each switch card to scale K times the bandwidth of the original digital cross-connect system, and maintains non-blocking switching for arbitrary multicast traffic without requiring line card changes or additional ICs on the switch card to preprocess the data. Depending on the IC technology, K such cross-connect ICs are more economical to manufacture than a monolithic cross-connect IC with K times the bandwidth.
A switching system or method, according to an embodiment of the present invention, includes two or more cross-connect ICs. Each IC directly receives some, but not all, of the system inputs, and outputs to some, but not all, outputs. Each cross-connect IC has a switch matrix that has the same number of inputs as the system, and a lesser number of outputs that matches the number of outputs of the IC. Each cross-connect IC provides fanout of its direct inputs to a link to each other cross-connect IC. Thus, each IC receives inputs either directly, or from a fanout on another IC.
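The arrangement described above can be illustrated with a small functional model. This is a minimal sketch and not the patented implementation: the function names are hypothetical, the system inputs are assumed to be split evenly among K ICs, and the inter-IC links are modeled simply as copies of each IC's direct inputs. It shows that, once every IC sees all system inputs (directly or through fanout), the ICs together behave like one full switch, including for multicast.

```python
def monolithic_switch(inputs, conn):
    """Reference model: conn[j] is the system input driving system output j."""
    return [inputs[src] for src in conn]


def partitioned_switch(inputs, conn, k):
    """Model of k cross-connect ICs, each with N matrix inputs and N/k outputs.

    Assumption: N is divisible by k, and IC i directly receives system
    inputs i*N/k .. (i+1)*N/k - 1 and drives the same range of outputs.
    """
    n = len(inputs)
    assert n % k == 0, "this sketch assumes an even split of ports"
    per_ic = n // k

    # Direct inputs of each IC (from its own line cards).
    direct = [inputs[i * per_ic:(i + 1) * per_ic] for i in range(k)]

    outputs = []
    for ic in range(k):
        # The switch matrix of this IC sees all N system inputs: its own
        # direct inputs plus the fanned-out copies sent by every other IC.
        matrix_inputs = []
        for src_ic in range(k):
            matrix_inputs.extend(direct[src_ic])
        # Only the N/k outputs owned by this IC are switched locally.
        local_conn = conn[ic * per_ic:(ic + 1) * per_ic]
        outputs.extend(matrix_inputs[src] for src in local_conn)
    return outputs


if __name__ == "__main__":
    streams = ["s0", "s1", "s2", "s3"]
    multicast = [2, 2, 2, 0]  # input 2 is multicast to three outputs
    print(partitioned_switch(streams, multicast, k=2))
```

Because each IC's matrix has an input for every system input, any output can select any input regardless of what the other outputs select, which is why no scheduling is needed in this model.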
Each cross-connect IC may further include deskewers that deskew or synchronize, after fanout, data streams which are input to the cross-connect's switch matrix, such that all data streams entering the switch matrix are synchronized. The deskewers may be, for example, but are not limited to, first-in-first-out (FIFO) buffers.
Plural input streams may be merged into a merged stream prior to being forwarded to other cross-connect ICs. The merged stream preferably has a higher bandwidth than the individual input streams contained therein, and may be formed by bit-interleaving the input streams.
Unique identifiers may be embedded into unused portions, e.g., overhead bytes, of one or more of the input streams. A cross-connect IC receiving a merged stream can then demultiplex and reconstruct the input streams based on identifiers embedded in the input streams.
Data streams may be, but are not limited to, SONET or SDH data streams.
Different inputs/outputs may have different bandwidth capabilities.
Another embodiment of the invention is a cross-connect integrated circuit (IC), which includes input ports for directly receiving less than all of the inputs to the switching system, as well as output ports for outputting to less than all outputs of the switching system. One or more link receivers receive, over one or more links connected to additional (second) cross-connect ICs, inputs which are directly received by those cross-connect ICs. A switch matrix on the IC has an input for each system input, and a lesser number of outputs matching the outputs of the IC, i.e., the direct outputs. A fanout circuit provides fanout of the directly received inputs, for transmission over the link(s) to the second cross-connect ICs.
The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.
A description of preferred embodiments of the invention follows.
System Overview
An embodiment of the present invention requires K cross-connect ICs to scale the switching system bandwidth K times.
Thus, without the internal fanout of the cross-connect, a system would have to have an external fanout device, since any fanout signal must go through deserialization. (Differential high-speed signals cannot be fanned out, because they are all point-to-point connections). The number of high-speed transmitters and receivers of such an “external fanout” system would thus be higher than that required by the present invention. The present invention is therefore more power-, cost- and space-efficient than a system with external fanout.
The protection switch card 4F is identical to the working switch card 4E. The two cross-connect ICs 4C, 4D shown on the working switch card 4E are identical and together switch 320 Gbps of traffic to and from the line cards 4A. Bidirectional inter-IC links 4G, 160 Gbps in each direction, accomplish non-blocking switching for arbitrary multicast. As a result, each cross-connect IC 4C/4D has 320 Gbps of inputs and 320 Gbps of outputs, half of which are connected to half of the line cards while the other half are connected to the other cross-connect IC.
Port Partitioning
switch matrix 7A. These rectangular cross-connect ICs, in combination, perform the same function as a monolithic N×N cross-connect IC.
Each rectangular cross-connect IC has (K−1) sets of N/K pass-through ports 7C. Each set of pass-through ports is connected to a particular one of the other K−1 rectangular cross-connect ICs. In particular, each set of the N/K pass-through output ports 7C gets its inputs from a unique set of N/K input ports 7E that are directly connected to a set of line cards, without the data going through the switch matrix 7A. The other K−1 sets of input ports 7B are received from other ICs via pass-through input ports. Only one set of N/K output ports 7D is directly connected to the switch matrix 7A.
Those skilled in the art will appreciate that, depending on N, the present invention can reduce the die size relative to that required for an N×N square cross-connect IC. With K equal to 2, for instance, two such rectangular cross-connect ICs can be less costly to manufacture than a monolithic N×N square cross-connect IC.
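The die-size argument can be made concrete by counting crosspoints. The sketch below is illustrative only: it assumes each rectangular IC has N matrix inputs and N/K matrix outputs, it uses the crosspoint count as a rough proxy for switch-matrix area, and it ignores the area of ports, FIFOs and pass-through logic. The function names are hypothetical.

```python
def square_crosspoints(n):
    """Crosspoints in a monolithic n x n switch matrix."""
    return n * n


def rectangular_crosspoints(n, k):
    """Crosspoints in one rectangular IC with n inputs and n/k outputs.

    Assumption: n is divisible by k.
    """
    assert n % k == 0
    return n * (n // k)


if __name__ == "__main__":
    n, k = 128, 2
    mono = square_crosspoints(n)
    rect = rectangular_crosspoints(n, k)
    # Each rectangular IC carries 1/k of the monolithic matrix.
    print(mono, rect, mono // rect)
```

Under these assumptions each rectangular IC contains 1/K of the crosspoints of the monolithic matrix, so the total crosspoint count is unchanged but it is spread over K smaller dies.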
This section discusses how the pass-through paths may be constructed to eliminate external devices to compensate for cross-connect latency. A SONET/SDH cross-connect is assumed, but the description applies to any cross-connect with a synchronous switch matrix. Each input port in a SONET/SDH cross-connect contains a deskew FIFO to absorb clock transients and arrival time mismatch due to differences in path lengths. The pass-through output ports get their data from before the deskew FIFO input; otherwise, additional deskew devices, for instance, SONET/SDH pointer adjusters, would be necessary in the switching system.
Deskew FIFO Overview
A SONET/SDH cross-connect deskew FIFO typically has its write clock in the recovered clock domain for its particular input port. The read clocks of all such FIFOs are derived from the same source so that the data streams going into the switch matrix are synchronous and properly aligned. In particular, the switch matrix accepts the first byte of each frame (start-of-frame) from each input at the same time. Typically, a cross-connect IC has a reference frame pulse and a programmable counter so that reading of the start-of-frame begins some time after the active edge of the reference frame pulse, as specified by the programmable counter. Adjusting the programmable counter changes the latency of the cross-connect IC.
Cross-Connect Circular Latency Dependency
Now consider the path through input port 8D, cross-connect 8B, and link 8H. Again, without loss of generality, suppose that the start-of-frame arrival time at 8D is 0 and that the programmable counter in 8B is set such that the start-of-frame appears at the output of the deskew FIFO 8G at time DF2. Let the delay through the link 8H be DS2. Then the start-of-frame arrival time at the input port of 8A connected to the link 8H is DF2+DC+DS2.
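Note that because the FIFOs must not overflow if they are to deskew the data properly, DF1+DC+DS1 must be less than DF2, and DF2+DC+DS2 must be less than DF1. These conditions cannot both be true: adding the two inequalities gives 2·DC+DS1+DS2 < 0, which is impossible because the delays DC, DS1 and DS2 are all positive.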
Reducing Pin Count for Pass-Through Ports
The primary inputs and outputs preferably run at line rates compatible with the line cards. The pass-through ports, on the other hand, can run at a higher rate to reduce pin count, board traces, and, in the case of multi-shelf applications, connectors and cables.
Multiplexing Pass-Through Data
Without loss of generality, consider the merging of two 2.5 Gbps input streams into one 5 Gbps stream, transmitted through a pass-through output port. Since the two input streams are taken from before the deskew FIFO, the 5 Gbps output stream needs to be retimed. The deskewing/synchronizing FIFOs can be such that the write clock of each FIFO originates in each of the recovered clock domains. The read clock can be in one of the recovered clock domains or can be driven by an external mesochronous reference clock.
The read clock 10K of both FIFOs 10E, 10F must be the same. This clock can be one of the recovered clocks, 10I, 10J, or an external mesochronous reference clock. Preferably, the read and write pointers of each FIFO are kept apart enough so that the FIFO does not overflow or underflow, and that there are enough entries in the FIFO to absorb clock transients in the 2.5 Gbps streams. The two streams from the FIFOs are then bit-interleaved and serialized in 10G before reaching the pass-through output port through the 5 Gbps transmitter 10H.
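The bit-interleaving step can be sketched in software as follows. This is an illustration under simplifying assumptions, not a description of the serializer 10G: both streams are assumed to be already retimed to a common clock and of equal length, bits are taken most-significant first, and the function names are hypothetical.

```python
def bit_interleave(a: bytes, b: bytes) -> bytes:
    """Merge two equal-length byte streams into one stream of twice the rate.

    Bits alternate a, b, a, b, ... starting with the most significant
    bit of each byte (an assumed ordering).
    """
    assert len(a) == len(b), "streams are assumed to be retimed and aligned"
    out = bytearray()
    for x, y in zip(a, b):
        word = 0
        for bit in range(7, -1, -1):
            word = (word << 1) | ((x >> bit) & 1)
            word = (word << 1) | ((y >> bit) & 1)
        out += word.to_bytes(2, "big")
    return bytes(out)


def bit_deinterleave(merged: bytes):
    """Split a merged stream back into its two constituent streams."""
    assert len(merged) % 2 == 0
    a, b = bytearray(), bytearray()
    for i in range(0, len(merged), 2):
        word = int.from_bytes(merged[i:i + 2], "big")
        x = y = 0
        for bit in range(15, -1, -2):
            x = (x << 1) | ((word >> bit) & 1)        # odd positions: stream a
            y = (y << 1) | ((word >> (bit - 1)) & 1)  # even positions: stream b
        a.append(x)
        b.append(y)
    return bytes(a), bytes(b)


if __name__ == "__main__":
    merged = bit_interleave(b"\xff\x00", b"\x00\xff")
    print(merged.hex())
    print(bit_deinterleave(merged))
```

In this sketch the receiver knows which bit position belongs to which stream. A real receiver does not have that knowledge after clock recovery, which is why the streams need to be identified at the receiving end.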
After fan-out, the signals may be further processed by blocks 34 and deskewed by deskew FIFOs 101 before being presented to the switch matrix resident on the particular cross-connect IC.
Demultiplexing Pass-Through Data
Once the 5 Gbps stream reaches the input port of the next cross-connect IC, that input port must have a means to retrieve the original two 2.5 Gbps streams. For SONET/SDH streams, stamping an overhead byte, such as the forty-seventh A2 byte in an STS-48/STM-16 frame, with a uniquely identifiable 2.5 Gbps stream ID can enable the downstream receiver to distinguish between the two bit-interleaved 2.5 Gbps streams. Since the A2 byte is not scrambled in SONET/SDH, the codeword chosen for the stream ID should not disturb DC balance. The forty-seventh A2 byte is used as an example here because it is typically not used for framing. This stamping may be done by the stream-ID insert units 32.
Suppose that the two stream IDs are 0 and 1. Let “N/A” denote an ID not yet available to the 2×2 switch controller 11F. Then there are 9 possible states of the two inputs into 11F. Table 1 illustrates the 9 possible states. In the “Action” column, “straight through” means that the 2×2 switch 11G has input 11H connected to output 11J and input 11I connected to output 11K. “Swap” indicates that the 2×2 switch 11G has input 11H connected to output 11K and input 11I connected to output 11J.
For instance, whenever the stream IDs from the two framers are different and the ID from framer 11D is 0 or the ID from framer 11E is 1, then output 11J receives data from 11H and 11K from 11I. Likewise, whenever the IDs are different and the ID from framer 11D is 1 or the ID from framer 11E is 0, then the output 11J is connected to 11I and 11K to 11H. When both IDs are 0 or both are 1, then the controller reports an error. When both IDs are “N/A”, the framers are still looking for the A1/A2 boundary and the switch is in pass-through mode. Note that this implementation requires only one of the streams to be stamped and ensures that the output port 11J receives the stream stamped with ID “0” and the output port 11K receives the stream stamped with ID “1”.
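The decision rule of the 2×2 switch controller, as described in the preceding paragraph, can be written as a small function. This sketch is derived from the prose description only and is not a reproduction of Table 1. The function name is hypothetical, and Python's None is used here to stand for “N/A”.

```python
NA = None  # stands for an ID not yet available to the controller


def switch_action(id_d, id_e):
    """Return the controller action for the IDs seen by framers 11D and 11E.

    id_d and id_e are each 0, 1 or NA.
    """
    for stream_id in (id_d, id_e):
        if stream_id not in (0, 1, NA):
            raise ValueError("stream ID must be 0, 1 or NA")
    if id_d is NA and id_e is NA:
        return "pass-through"      # framers still searching for A1/A2
    if id_d == id_e:
        return "error"             # both 0 or both 1
    if id_d == 0 or id_e == 1:
        return "straight through"  # 11H -> 11J, 11I -> 11K
    return "swap"                  # 11H -> 11K, 11I -> 11J


if __name__ == "__main__":
    # Enumerate all nine input states.
    for d in (0, 1, NA):
        for e in (0, 1, NA):
            print(d, e, switch_action(d, e))
```

Because a single known ID is enough to decide between “straight through” and “swap”, the rule works even when only one of the two streams carries a stamp.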
Connection Set-up
For instance, to connect a north primary input to a south primary output as in 123 of
Reference numbers 121 of
While this invention has been particularly shown and described with references to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.
This application claims the benefit of U.S. Provisional Application No. 60/414,699, filed Sep. 27, 2002. The entire teachings of the above application are incorporated herein by reference.
Number | Date | Country |
---|---|---|
20040062228 A1 | Apr 2004 | US |

Number | Date | Country |
---|---|---|
60414699 | Sep 2002 | US |