Pipelined multiple issue packet switch

Information

  • Patent Grant
  • Patent Number
    6,831,923
  • Date Filed
    Monday, April 17, 2000
  • Date Issued
    Tuesday, December 14, 2004
Abstract
A pipelined multiple issue architecture for a link layer or protocol layer packet switch, which processes packets independently and asynchronously but reorders them into their original order, thus preserving the original incoming packet order. Each stage of the pipeline waits for the immediately previous stage to complete, making the packet switch self-throttling and thus allowing differing protocols and features to use the same architecture, even if they require differing processing times. The multiple issue pipeline is scalable to greater parallel issue of packets, and tunable to differing switch engine architectures, differing interface speeds and widths, and differing clock rates and buffer sizes. The packet switch comprises a fetch stage, which fetches the packet header into one of a plurality of fetch caches; a switching stage comprising a plurality of switch engines, each of which independently and asynchronously reads from its corresponding fetch cache, makes switching decisions, and writes to a reorder memory; a reorder engine which reads from the reorder memory in the packets' original order; and a post-processing stage, comprising a post-process queue and a post-process engine, which performs protocol-specific post-processing on the packets.
Description




BACKGROUND OF THE INVENTION




1. Field of the Invention




This invention relates to a pipelined multiple issue packet switch.




2. Description of Related Art




When computers are coupled together into networks for communication, it is known to couple networks together and to provide a switching device which is coupled to more than one network. The switching device receives packets from one network and retransmits those packets (possibly in another format) on another network. In general, it is desirable for the switching device to operate as quickly as possible.




However, there are several constraints under which the switching device must operate. First, packets may encapsulate differing protocols, and thus may differ significantly in length and in processing time. Second, when switching packets from one network to another, it is generally required that packets are re-transmitted in the same order as they arrive. Because of these two constraints, known switching device architectures are not able to take advantage of significant parallelism in switching packets.




It is also desirable to account ahead of time for future improvements in processing hardware, such as bandwidth and speed of a network interface, clock speed of a switching processor, and memory size of a packet buffer, so that the design of the switching device is flexible and scalable with such improvements.




The following U.S. Patents may be pertinent:




U.S. Pat. No. 4,446,555 to Devault et al., “Time Division Multiplex Switching Network For Multiservice Digital Networks”;




U.S. Pat. No. 5,212,686 to Joy et al., “Asynchronous Time Division Switching Arrangement and A Method of Operating Same”;




U.S. Pat. No. 5,271,004 to Proctor et al., “Asynchronous Transfer Mode Switching Arrangement Providing Broadcast Transmission”; and




U.S. Pat. No. 5,307,343 to Bostica et al., “Basic Element for the Connection Network of A Fast Packet Switching Node”.




Accordingly, it would be advantageous to provide an improved architecture for a packet switch, which can make packet switching decisions responsive to link layer (ISO level 2) or protocol layer (ISO level 3) header information, which is capable of high speed operation at relatively low cost, and which is flexible and scalable with future improvements in processing hardware.




SUMMARY OF THE INVENTION




The invention provides a pipelined multiple issue link layer or protocol layer packet switch, which processes packets independently and asynchronously but reorders them into their original order, thus preserving the original incoming packet order. Each stage of the pipeline waits for the immediately previous stage to complete, making the packet switch self-throttling and thus allowing differing protocols and features to use the same architecture, even if they require differing processing times. The multiple issue pipeline is scalable to greater parallel issue of packets, and tunable to differing switch engine architectures, differing interface speeds and widths, and differing clock rates and buffer sizes.




In a preferred embodiment, the packet switch comprises a fetch stage, which fetches the packet header into one of a plurality of fetch caches; a switching stage comprising a plurality of switch engines, each of which independently and asynchronously reads from its corresponding fetch cache, makes switching decisions, and writes to a reorder memory; a reorder engine which reads from the reorder memory in the packets' original order; and a post-processing stage, comprising a post-process queue and a post-process engine, which performs protocol-specific post-processing on the packets.











BRIEF DESCRIPTION OF THE DRAWINGS





FIG. 1 shows the placement of a packet switch in an internetwork.

FIG. 2 shows a block diagram of a packet switch. FIG. 2 comprises FIG. 2A and FIG. 2B collectively.

FIG. 3 shows a fetch stage for the packet switch.

FIG. 4 shows a block diagram of a system having a plurality of packet switches in parallel.











DESCRIPTION OF THE PREFERRED EMBODIMENT




In the following description, a preferred embodiment of the invention is described with regard to preferred process steps, data structures, and switching techniques. However, those skilled in the art would recognize, after perusal of this application, that embodiments of the invention may be implemented using a set of general purpose computers operating under program control, and that modification of a set of general purpose computers to implement the process steps and data structures described herein would not require undue invention.




The present invention may be used in conjunction with technology disclosed in the following copending application.




application Ser. No. 08/229,289, filed Apr. 18, 1994, in the name of inventors Bruce A. Wilford, Bruce Sherry, David Tsiang, and Anthony Li, titled “Packet Switching Engine”.




This application is hereby incorporated by reference as if fully set forth herein, and is referred to herein as the “Packet Switching Engine disclosure”.




Pipelined, Multiple Issue Packet Switch





FIG. 1 shows the placement of a packet switch in an internetwork.




A packet switch 100 is coupled to a first network interface 101 coupled to a first network 102 and a second network interface 101 coupled to a second network 102. When a packet 103 is recognized by the first network interface 101 (i.e., the MAC address of the packet 103 is addressed to the packet switch 100 or to an address known to be off the first network 102), the packet 103 is stored in a packet memory 110 and a pointer to a packet header 104 for the packet 103 is generated.




In a preferred embodiment, the packet header 104 comprises a link layer (level 2) header and a protocol layer (level 3) header. The link layer header, sometimes called a "MAC" (media access control) header, comprises information for communicating the packet 103 on a network 102 using particular media, such as the first network 102. The protocol layer header comprises information for level 3 switching of the packet 103 among networks 102. The link layer header comprises information for level 2 switching (i.e., bridging). For example, the link layer header may comprise an ethernet, FDDI, or token ring header, while the protocol layer header may comprise an IP header. There are also hybrid switching techniques which respond to both the level 2 and level 3 headers, as well as techniques which respond to level 4 headers (such as extended access lists). Those skilled in the art will recognize, after perusal of this application, that other types of packet headers or trailers are within the scope and spirit of the invention, and that adapting the invention to switching such packet headers would not involve invention or undue experimentation.
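For concreteness, the following is a minimal sketch in C of the kind of header fields such a switch examines. The layout shown is the standard Ethernet II and IPv4 format, not a structure taken from this disclosure, and the names are illustrative only.

#include <stdint.h>

/* Link layer (level 2) header: standard Ethernet II framing. */
struct mac_header {
    uint8_t  dst[6];        /* destination MAC address            */
    uint8_t  src[6];        /* source MAC address                 */
    uint16_t ethertype;     /* 0x0800 indicates an IP payload     */
};

/* Protocol layer (level 3) header: IPv4, used for level 3 switching. */
struct ip_header {
    uint8_t  ver_ihl;       /* version (4 bits) and header length */
    uint8_t  tos;
    uint16_t total_length;
    uint16_t id;
    uint16_t flags_frag;
    uint8_t  ttl;           /* hop count examined by the switch   */
    uint8_t  protocol;
    uint16_t checksum;      /* recomputed after the hop count changes */
    uint32_t src_addr;
    uint32_t dst_addr;      /* used for the level 3 switching decision */
};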




The packet switch 100 reads the packet header 104 and performs two tasks: (1) it rewrites the packet header 104, if necessary, to conform to protocol rules for switching the packet 103, and (2) it queues the packet 103 for transmission on an output network interface 101 and thus an output network 102. For example, if the output network 102 requires a new link layer header, the packet switch 100 rewrites the link layer header. If the protocol layer header comprises a count of the number of times the packet 103 has been switched, the packet switch 100 increments or decrements that count, as appropriate, in the protocol layer header.





FIG. 2 shows a block diagram of a packet switch. FIG. 2 comprises FIG. 2A and FIG. 2B collectively.




The packet switch 100 comprises a fetch stage 210, a switching stage 220, and a post-processing stage 230.




The pointer to the packet header 104 is coupled to the fetch stage 210. The fetch stage 210 comprises a fetch engine 211 and a plurality of (preferably two) fetch caches 212. Each fetch cache 212 comprises a double buffered FIFO queue.





FIG. 2A shows a preferred embodiment in which there are two fetch caches 212, while FIG. 2B shows an alternative preferred embodiment in which there are four fetch caches 212.




In response to a signal from the switching stage 220, the fetch engine 211 prefetches a block of M bytes of the packet header 104 and stores that block in a selected FIFO queue of a selected fetch cache 212. In a preferred embodiment, the value of M, the size of the block, is independent of the protocol embodied in the protocol link layer, and is preferably about 64 bytes. In alternative embodiments, the value of M may be adjusted, e.g., by software, so that the packet switch 100 operates most efficiently with a selected mix of packets 103 it is expected to switch.




When the block of M bytes does not include the entire packet header 104, the fetch engine 211 fetches, in response to a signal from the fetch cache 212, a successive block of L additional bytes of the packet header 104 and stores those blocks in the selected FIFO queue of the selected fetch cache 212, thus increasing the amount of data presented to the switching stage 220. In a preferred embodiment, the value of L, the size of the additional blocks, is equal to the byte width of an interface to the packet memory 110, and is preferably about 8 bytes.




After storing at least a portion of a packet header 104 in a fetch cache 212, the fetch engine 211 proceeds to read the next packet header 104 and store it in a next selected fetch cache 212. The fetch caches 212 are selected for storage in a round-robin manner. Thus when there are N fetch caches 212, each particular fetch cache 212 receives every Nth packet header 104 for storage; when there are two fetch caches 212, each particular fetch cache 212 receives every other packet header 104 for storage.




Each fetch cache 212 is double buffered, so that the fetch engine 211 may write a new packet header 104 to a fetch cache 212 while the corresponding switch engine 221 is reading from the fetch cache 212. This is in addition to the fetch on demand operation described above, in which the fetch engine 211 writes successive blocks of additional bytes of an incomplete packet header 104 in response to a signal from a switch engine 221. Thus each particular fetch cache 212 pipelines up to two packet headers 104; when there are N fetch caches 212, there are up to 2N packet headers 104 pipelined in the fetch stage 210.




More generally, there may be N fetch caches 212, each of which comprises B buffers, for a total of BN buffers. The fetch engine 211 writes new packet headers 104 in sequence to the N fetch caches 212 in order, and when the fetch engine 211 returns to a fetch cache 212 after writing in sequence to all other fetch caches 212, it writes in sequence to the next one of the B buffers within that fetch cache 212.
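The buffer that receives the k-th packet header therefore depends on both the cache rotation and the per-cache buffer rotation. The following C sketch (illustrative names, not taken from this disclosure) makes the indexing explicit for N caches of B buffers each:

#include <stdio.h>

/* Assumed parameters: N fetch caches, B buffers per cache (B = 2 for a
 * double buffered cache). The fetch engine visits the caches in order,
 * so header k lands in cache (k mod N); a given cache sees every Nth
 * header, so within that cache the buffer advances as (k / N) mod B. */
static void locate_buffer(unsigned long k, int n_caches, int b_buffers,
                          int *cache, int *buffer)
{
    *cache  = (int)(k % (unsigned long)n_caches);
    *buffer = (int)((k / (unsigned long)n_caches) % (unsigned long)b_buffers);
}

int main(void)
{
    for (unsigned long k = 0; k < 8; k++) {
        int c, b;
        locate_buffer(k, 2, 2, &c, &b);   /* N = 2, B = 2 as in FIG. 2A */
        printf("header %lu -> cache %d, buffer %d\n", k, c, b);
    }
    return 0;
}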




As shown below, the switching stage 220 comprises an identical number N of switch engines 221, each of which reads in sequence from one of the B buffers of its designated fetch cache 212, returning to read from a buffer after reading in sequence from all other buffers in that fetch cache 212.




In FIG. 2A, a preferred embodiment in which there are two fetch caches 212, there are four packet headers 104 pipelined in the fetch stage 210, labeled n+3, n+2, n+1, and n. In FIG. 2B, an alternative preferred embodiment in which there are four fetch caches 212, there are eight packet headers 104 pipelined in the fetch stage 210, labeled n+7, n+6, n+5, n+4, n+3, n+2, n+1, and n.




The fetch stage 210 is further described with regard to FIG. 3.




The switching stage 220 comprises a plurality of switch engines 221, one for each fetch cache 212, and a reorder/rewrite engine 222.




Each switch engine 221 is coupled to a corresponding fetch cache 212. Each switch engine 221 independently and asynchronously reads from its corresponding fetch cache 212, makes a switching decision, and writes its results to one of a plurality of (preferably two) reorder/rewrite memories 223 in the reorder/rewrite engine 222. Thus, when there are N fetch caches 212, there are also N switch engines 221, and when there are K reorder/rewrite memories 223 for each switch engine 221, there are KN reorder/rewrite memories 223 in N sets of K.





FIG. 2A shows a preferred embodiment in which there are two switch engines 221 and four reorder/rewrite memories 223, while FIG. 2B shows an alternative preferred embodiment in which there are four switch engines 221 and eight reorder/rewrite memories 223.




In a preferred embodiment, each switch engine 221 comprises a packet switch engine as shown in the Packet Switching Engine disclosure. The switching results and other data (e.g., statistical information) written into the reorder/rewrite memories 223 comprise information regarding how to rewrite the packet header 104 and to which network interface 101 to output the packet 103. Preferably, this information comprises results registers as described in the Packet Switching Engine disclosure, and includes a pointer to the packet header 104 in the packet memory 110.




Preferably, a single integrated circuit chip comprises significant circuits of at least one, and preferably more than one, switch engine 221.




As described in the Packet Switching Engine disclosure, each switch engine 221 reads instructions from a "tree memory" comprising instructions for reading and interpreting successive bytes of the packet header 104. In a preferred embodiment, the tree memory comprises a set of memory registers coupled to the switch engine 221. In an alternative embodiment, at least some of the tree memory may be cached on the integrated circuit chip for the switch engine 221.




The reorder/rewrite engine 222 reads from the reorder/rewrite memories 223 in a preselected order. The N sets of K reorder/rewrite memories 223 are interleaved, so that results from the switch engines 221 are read in a round-robin manner. Thus, output from the reorder/rewrite engine 222 is in the original order in which packets 103 arrived at the packet switch 100.




Thus, each one of the switch engines 221 writes in sequence to its K designated reorder/rewrite memories 223, returning to one of its designated reorder/rewrite memories 223 after writing in sequence to its other designated reorder/rewrite memories 223. In parallel, the reorder/rewrite engine 222 reads in sequence from all the NK reorder/rewrite memories 223, and returns to one of the NK reorder/rewrite memories 223 after reading in sequence from all other reorder/rewrite memories 223.
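A minimal sketch in C of this interleaving (with assumed, illustrative names; the actual memories are hardware buffers, not arrays) shows why reading the NK reorder/rewrite memories in a fixed round-robin order recovers the original packet order: header k was dispatched to switch engine (k mod N), which in turn wrote it to its ((k / N) mod K)-th designated memory, so the memory visited at step k is the same on the write side and on the read side.

#include <stdio.h>

#define N 2   /* switch engines (FIG. 2A)            */
#define K 2   /* reorder/rewrite memories per engine */

/* Global index of the reorder/rewrite memory holding header k.
 * Engine e owns memories e*K .. e*K+K-1 and fills them in sequence. */
static int reorder_slot(unsigned long k)
{
    int engine = (int)(k % N);         /* which engine switched header k  */
    int local  = (int)((k / N) % K);   /* which of its K memories it used */
    return engine * K + local;
}

int main(void)
{
    /* The reorder/rewrite engine reads the memories in this same order,
     * so headers come out as k = 0, 1, 2, ... (original arrival order). */
    for (unsigned long k = 0; k < 8; k++)
        printf("header %lu is read from reorder/rewrite memory %d\n",
               k, reorder_slot(k));
    return 0;
}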




In FIG. 2A, a preferred embodiment in which there are two switch engines 221 and four reorder/rewrite memories 223, there are four packet headers 104 pipelined in the switching stage 220, labeled n+1, n, n−1, and n−2 (now available). In FIG. 2B, an alternative preferred embodiment in which there are four switch engines 221 and eight reorder/rewrite memories 223, there are eight packet headers 104 pipelined in the switching stage 220, labeled n+3, n+2, n+1, n, n−1, n−2, n−3, and n−4.




The reorder/rewrite engine 222, in addition to receiving the packet headers 104 in their original order from the reorder/rewrite memories 223, may also rewrite MAC headers for the packet headers 104 in the packet memory 110, if such rewrite is called for by the switching protocol.




The post-processing stage 230 comprises a post-processing queue 231 and a post-processor 232.




The reorder/rewrite engine 222 writes the packet headers 104 into a FIFO queue of post-processing memories 231 in the order it reads them from the reorder/rewrite memories 223. Because the queue is a FIFO queue, packet headers 104 leave the post-processing stage 230 in the same order they enter, which is the original order in which packets 103 arrived at the packet switch 100.




The post-processor 232 performs protocol-specific operations on the packet header 104. For example, the post-processor 232 increments hop counts and recomputes header checksums for IP packet headers 104. The post-processor 232 then queues the packet 103 for the designated output network interface 101, or, if the packet 103 cannot be switched, discards the packet 103 or queues it for processing by a route server, if one exists.
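For illustration only, the hop-count update and checksum recomputation for an IPv4 header can be sketched in C as follows. This uses the standard one's-complement header checksum of RFC 791 and the IPv4 convention of decrementing the TTL field; it is not code from this disclosure, and it assumes the header is available as a byte buffer.

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Standard IPv4 one's-complement header checksum (RFC 791). */
static uint16_t ip_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);
    if (len & 1)
        sum += (uint32_t)(hdr[len - 1] << 8);
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Update the hop-count field and recompute the checksum in place.
 * For IPv4 the TTL (byte 8) is decremented; bytes 10-11 hold the checksum. */
static int post_process_ipv4(uint8_t *hdr, size_t len)
{
    size_t ihl = (size_t)(hdr[0] & 0x0F) * 4;     /* header length in bytes */
    if (len < ihl || ihl < 20 || hdr[8] == 0)
        return -1;                 /* malformed header or hop count expired */
    hdr[8]--;                      /* hop-count update                      */
    hdr[10] = hdr[11] = 0;         /* zero checksum field before summing    */
    uint16_t ck = ip_checksum(hdr, ihl);
    hdr[10] = (uint8_t)(ck >> 8);
    hdr[11] = (uint8_t)(ck & 0xFF);
    return 0;
}

int main(void)
{
    uint8_t hdr[20] = { 0x45, 0, 0, 20, 0, 0, 0, 0, 64, 6, 0, 0,
                        10, 0, 0, 1, 10, 0, 0, 2 };
    post_process_ipv4(hdr, sizeof hdr);
    printf("TTL now %u, checksum 0x%02X%02X\n",
           (unsigned)hdr[8], (unsigned)hdr[10], (unsigned)hdr[11]);
    return 0;
}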




In FIG. 2A, a preferred embodiment, and in FIG. 2B, an alternative preferred embodiment, there are two post-processing memories 231 in the FIFO queue for the post-processing stage 230. In FIG. 2A there are two packet headers 104 pipelined in the post-processing stage 230, labeled n−3 and n−2. In FIG. 2B there are two packet headers 104 pipelined in the post-processing stage 230, labeled n−6 and n−5.





FIG. 2A, a preferred embodiment, and FIG. 2B, an alternative preferred embodiment, show that there are several packet headers 104 processed in parallel by the packet switch 100. In general, where there are S switch engines 221, there are 3S+2 packet headers 104 processed in parallel by the packet switch 100. Of these, 2S packet headers 104 are stored in the fetch stage 210, S packet headers 104 are stored in the reorder/rewrite memories 223, and 2 packet headers 104 are stored in the post-processing stage 230.




In a preferred embodiment, the packet memory 110 is clocked at about 50 MHz and has a memory fetch path to the fetch stage 210 which is eight bytes wide; there are two switching engines 221, each of which operates at an average switching speed of about 250 kilopackets switched per second, and each stage of the packet switch 100 completes operation within about 2 microseconds. Although each switching engine 221 is individually only about half as fast as the pipeline processing speed, the accumulated effect when using a plurality of switching engines 221 is to add their effect, producing an average switching speed for the packet switch 100 of about 500 kilopackets switched per second when the pipeline is balanced.




In an alternative preferred embodiment, each switching engine 221 operates at an average switching speed of about 125 kilopackets switched per second, producing an average switching speed for the packet switch 100 of about 250 kilopackets switched per second when the pipeline is balanced. Because the pipeline is limited by its slowest stage, the overall speed of the packet switch 100 is tunable by adjustment of parameters for its architecture, including speed of the memory, width of the memory fetch path, size of the cache buffers, and other variables. Such tunability allows the packet switch 100 to achieve satisfactory performance at a reduced cost.
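The relationship between per-engine speed, per-stage latency, and aggregate throughput can be checked with a small model. The C sketch below is not a simulator of the actual hardware; it simply uses the illustrative numbers from the embodiments above, on the assumption that the pipeline rate is bounded by its slowest stage and that S engines each switching at rate r contribute an aggregate of S*r when the pipeline is balanced.

#include <stdio.h>

/* Aggregate switching rate of a balanced pipeline: S parallel switch
 * engines of rate r_engine (packets/s) feed stages whose slowest stage
 * takes t_stage seconds per packet; the pipeline cannot exceed 1/t_stage. */
static double aggregate_rate(int s_engines, double r_engine, double t_stage)
{
    double engines = s_engines * r_engine;    /* combined engine rate    */
    double stage   = 1.0 / t_stage;           /* slowest-stage ceiling   */
    return engines < stage ? engines : stage; /* limited by the smaller  */
}

int main(void)
{
    /* Two engines at 250 kpps, stages completing within 2 microseconds. */
    printf("%.0f packets/s\n", aggregate_rate(2, 250e3, 2e-6));  /* 500000 */
    /* Alternative embodiment: engines at 125 kpps each.                  */
    printf("%.0f packets/s\n", aggregate_rate(2, 125e3, 2e-6));  /* 250000 */
    return 0;
}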




Fetch Engine and Fetch Memories





FIG. 3 shows a fetch stage for the packet switch.




The fetch engine 211 comprises a state machine 300 having signal inputs coupled to the packet memory 110 and to the switching stage 220, and having signal outputs coupled to the switching stage 220.




A packet ready signal 301 is coupled to the fetch engine 211 from the packet memory 110 and indicates whether there is a packet header 104 ready to be fetched. In this description of the fetch engine 211, it is presumed that packets 103 arrive quickly enough that the packet ready signal 301 indicates that there is a packet header 104 ready to be fetched at substantially all times. If the fetch engine 211 fetches packet headers 104 faster than those packet headers 104 arrive, at some times the fetch engine 211 (and the downstream elements of the packet switch 100) will have to wait for more packets 103 to switch.




A switch ready signal 302 is coupled to the fetch engine 211 from each of the switch engines 221 and indicates whether the switch engine 221 is ready to receive a new packet header 104 for switching.




A data available (or cache ready) signal 303 is coupled to each of the switch engines 221 from the fetch engine 211 and indicates whether a packet header 104 is present in the fetch cache 212 for switching.




A cache empty signal 304 is coupled to the fetch engine 211 from each of the fetch caches 212 and indicates whether the corresponding switch engine 221 has read all the data from the packet header 104 supplied by the fetch engine 211. A data not required signal 307 is coupled to the fetch engine 211 from each of the switch engines 221 and indicates whether the switch engine 221 needs further data loaded into the fetch cache 212.
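Summarizing the handshake, the following C sketch models the per-engine signals as flags. It is purely illustrative; in the device these are wires between the fetch engine, the fetch caches, and the switch engines, and the field names are assumptions.

#include <stdbool.h>

/* One fetch cache / switch engine pair, as seen by the fetch engine. */
struct fetch_handshake {
    bool packet_ready;       /* 301: packet memory has a header to fetch   */
    bool switch_ready;       /* 302: switch engine can accept a new header */
    bool data_available;     /* 303: fetch cache holds data for the engine */
    bool cache_empty;        /* 304: engine has consumed the supplied data */
    bool data_not_required;  /* 307: engine needs no further header bytes  */
};

/* Prefetch the first M bytes of a new header into this cache. */
static bool should_prefetch_new(const struct fetch_handshake *hs)
{
    return hs->packet_ready && hs->switch_ready;
}

/* Fetch L more bytes of the current header on demand. */
static bool should_fetch_more(const struct fetch_handshake *hs)
{
    return hs->cache_empty && !hs->data_not_required;
}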




It may occur that the switch engine 221 is able to make its switching decision without need for further data from the packet header 104, even though the switch engine 221 has read all the data from the packet header 104 supplied by the fetch engine 211. In this event, the switch engine 221 sets the data not required signal 307 to inform the fetch engine 211 that no further data should be supplied, even though the cache empty signal 304 has been triggered.




It may also occur that the switch engine 221 is able to determine that it can make its switching decision within the data already available, even if it has not made that switching decision yet. For example, in the IP protocol, it is generally possible to make the switching decision with reference only to the first 64 bytes of the packet header 104. If the switch engine 221 is able to determine that a packet header 104 is an IP packet header, it can set the data not required signal 307.
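As a small illustration of that rule (assumed logic, not taken from the Packet Switching Engine disclosure), a switch engine that has recognized an Ethernet-encapsulated IP header and already holds the first 64 bytes could assert the data not required signal as follows:

#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

#define IP_DECISION_BYTES 64   /* assumed: IP decision fits in first 64 bytes */

/* Decide whether to assert the data not required signal (307): once the
 * header is known to be IP and enough bytes are already cached, no further
 * fetch-on-demand is needed for the switching decision. */
static bool assert_data_not_required(const uint8_t *hdr, size_t bytes_cached)
{
    bool is_ip = bytes_cached >= 14 &&
                 hdr[12] == 0x08 && hdr[13] == 0x00;  /* Ethernet type 0x0800 */
    return is_ip && bytes_cached >= IP_DECISION_BYTES;
}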




A read pointer 305 is coupled to each of the fetch caches 212 from the corresponding switch engine 221 and indicates a location in the fetch cache 212 where the switch engine 221 is about to read a word (of a packet header 104) from the fetch cache 212.




A write pointer 306 is coupled to each of the fetch caches 212 from the fetch engine 211 and indicates a location in the fetch cache 212 where the fetch engine 211 is about to write a word (of a packet header 104) to the fetch cache 212.




A first pair of fetch caches 212 (labeled "0" and "1") and a second pair of fetch caches 212 (labeled "2" and "3") each comprise dual port random access memory (RAM), preferably a pair of 16 word long by 32 bit wide dual port RAM circuits disposed to respond to addresses as a single 16 word long by 64 bit wide dual port RAM circuit.
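A minimal behavioral model in C of that pairing (illustrative only; the device uses dual port RAM circuits, not arrays) shows how two 16 x 32 bit memories answering the same address behave as one 16 x 64 bit memory:

#include <stdint.h>
#include <stdio.h>

#define WORDS 16

static uint32_t ram_hi[WORDS];   /* upper 32 bits of each 64 bit word */
static uint32_t ram_lo[WORDS];   /* lower 32 bits of each 64 bit word */

/* Both RAMs see the same 4 bit address, so together they respond as a
 * single 16 word by 64 bit memory. */
static void write64(unsigned addr, uint64_t data)
{
    ram_hi[addr % WORDS] = (uint32_t)(data >> 32);
    ram_lo[addr % WORDS] = (uint32_t)data;
}

static uint64_t read64(unsigned addr)
{
    return ((uint64_t)ram_hi[addr % WORDS] << 32) | ram_lo[addr % WORDS];
}

int main(void)
{
    write64(3, 0x0123456789ABCDEFULL);
    printf("0x%016llX\n", (unsigned long long)read64(3));
    return 0;
}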




A 64 bit wide data bus 310 is coupled to a data input for each of the fetch caches 212.




The read pointers 305 for the first pair of the fetch caches 212 (labeled as "0" and "1") are coupled to a first read address bus 311 for the fetch caches 212 using a first read address multiplexer 312. The two read pointers 305 are data inputs to the read address multiplexer 312; a select input to the read address multiplexer 312 is coupled to an output of the fetch engine 211. Similarly, the read pointers 305 for the second pair of the fetch caches 212 (labeled as "2" and "3") are coupled to a second read address bus 311 for the fetch caches 212 using a second read address multiplexer 312, and selected by an output of the fetch engine 211.




Similarly, the write pointers 306 for the first pair of the fetch caches 212 (labeled as "0" and "1") are coupled to a first write address bus 313 for the fetch caches 212 using a first write address multiplexer 314. The two write pointers 306 are data inputs to the write address multiplexer 314; a select input to the write address multiplexer 314 is coupled to an output of the fetch engine 211. Similarly, the write pointers 306 for the second pair of the fetch caches 212 (labeled as "2" and "3") are coupled to a second write address bus 313 for the fetch caches 212 using a second write address multiplexer 314, and selected by an output of the fetch engine 211.




An output 315 for the first pair of fetch caches 212 is coupled to a byte multiplexer 316. The byte multiplexer 316 selects one of eight bytes of output data, and is selected by an output of a byte select multiplexer 317. The byte select multiplexer 317 is coupled to a byte address (the three least significant bits of the read pointer 305) for each of the first pair of fetch caches 212, and is selected by an output of the fetch engine 211.




An initial value for the byte address (the three least significant bits of the read pointer 305) may be set by the state machine 300 to allow a first byte of the packet header 104 to be offset from (i.e., not aligned with) an eight-byte block in the packet memory 110. The state machine 300 resets the byte address to zero for successive sets of eight bytes to be fetched from the packet memory 110.
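The byte addressing can be illustrated with a short C sketch (assumed names and byte ordering; the hardware does this with the byte multiplexer 316 rather than with shifts): the three least significant bits of the read pointer select one of the eight bytes in the 64 bit word addressed by the remaining bits.

#include <stdint.h>
#include <stdio.h>

/* Select one byte of a packet header held as 64 bit words, using the low
 * three bits of the read pointer as the byte address within the word. */
static uint8_t read_header_byte(const uint64_t *words, unsigned read_ptr)
{
    unsigned word_index = read_ptr >> 3;     /* upper bits: word address */
    unsigned byte_index = read_ptr & 0x7;    /* three LSBs: byte address */
    return (uint8_t)(words[word_index] >> (8 * byte_index));
}

int main(void)
{
    uint64_t words[2] = { 0x0706050403020100ULL, 0x0F0E0D0C0B0A0908ULL };
    for (unsigned p = 0; p < 16; p++)
        printf("%u ", (unsigned)read_header_byte(words, p));
    printf("\n");   /* prints 0 1 2 ... 15 */
    return 0;
}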




Similarly, an output 315 for the second pair of fetch caches 212 is coupled to a byte multiplexer 316. The byte multiplexer 316 selects one of eight bytes of output data, and is selected by an output of a byte select multiplexer 317. The byte select multiplexer 317 is coupled to a byte address (the three least significant bits of the read pointer 305) for each of the second pair of fetch caches 212, and is selected by an output of the fetch engine 211. The byte multiplexers 316 are coupled to the switching stage 220.




As described with regard to FIG. 2, the fetch engine 211 responds to the switch ready signal 302 from a switch engine 221 by prefetching the first M bytes of the packet header 104 from the packet memory 110 into the corresponding fetch cache 212. To perform this task, the fetch engine 211 selects the write pointer 306 for the corresponding fetch cache 212 using the corresponding write address multiplexer 314, writes M bytes into the corresponding fetch cache 212, and updates the write pointer 306.




As described with regard to FIG. 2, the fetch cache 212 raises the cache empty signal 304 when the read pointer 305 approaches the write pointer 306, such as when the read pointer 305 is within eight bytes of the write pointer 306. The fetch engine 211 responds to the cache empty signal 304 by fetching the next L bytes of the packet header 104 from the packet memory 110 into the corresponding fetch cache 212, unless disabled by the data not required signal 307 from the switch engine 221. To perform this task, the fetch engine 211 proceeds in like manner as when it prefetched the first M bytes of the packet header 104.




In a preferred embodiment, the fetch cache 212 includes a "watermark" register (not shown) which records an address value which indicates, when the read pointer 305 reaches that address value, that more data should be fetched. For example, the watermark register may record a value just eight bytes before the write pointer 306, so that more data will only be fetched when the switch engine 221 is actually out of data, or the watermark register may record a value more bytes before the write pointer 306, so that more data will be fetched ahead of actual need. Too-early values may result in data being fetched ahead of time without need, while too-late values may result in the switch engine 221 having to wait. Accordingly, the value recorded in the watermark register can be adjusted to better match the rate at which data is fetched to the rate at which data is used by the switch engine 221.
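A compact C sketch of that refill policy follows (illustrative only; pointer arithmetic stands in for the hardware comparison): the cache asks for more data whenever the unread bytes remaining fall to the watermark, unless the switch engine has asserted the data not required signal.

#include <stdbool.h>

#define L_BYTES 8   /* assumed fetch-on-demand block, the memory path width */

/* Refill decision for one fetch cache. read_ptr and write_ptr are byte
 * offsets into the current packet header; watermark is how few unread
 * bytes may remain before more data is requested (8 = refill only when
 * nearly empty, larger values = refill further ahead of need). */
static bool need_refill(unsigned read_ptr, unsigned write_ptr,
                        unsigned watermark, bool data_not_required)
{
    unsigned unread = write_ptr - read_ptr;
    return !data_not_required && unread <= watermark;
}

On a refill, the fetch engine would then copy the next block of L bytes (L_BYTES above) from the packet memory 110 and advance the write pointer 306 accordingly.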




While the switch engine 221 reads from the fetch cache 212, the fetch engine 211 prefetches the first M bytes of another packet header 104 from the packet memory 110 into another fetch cache 212 (which may eventually comprise the other fetch cache 212 of the pair). To perform this task, the fetch engine 211 selects the write pointer 306 for the recipient fetch cache 212 using the corresponding write address multiplexer 314, writes M bytes into the recipient fetch cache 212, and updates the corresponding write pointer 306.




The switch engines 221 are each coupled to the read pointer 305 for their corresponding fetch cache 212. Each switch engine 221 independently and asynchronously reads from its corresponding fetch cache 212 and processes the packet header 104 therein. To perform this task, the switch engine 221 reads one byte at a time from the output of the output multiplexer 320 and updates the corresponding byte address (the three least significant bits of the read pointer 305). When the read pointer 305 approaches the write pointer 306, the cache empty signal 304 is raised and the fetch engine 211 fetches L additional bytes "on demand".




Multiple Packet Switches in Parallel





FIG. 4 shows a block diagram of a system having a plurality of packet switches in parallel.




In a parallel system 400, the packet memory 110 is coupled in parallel to a plurality of (preferably two) packet switches 100, each constructed substantially as described with regard to FIG. 1. Each packet switch 100 takes its input from the packet memory 110. However, the output of each packet switch 100 is directed instead to a reorder stage 410, and an output of the reorder stage 410 is directed to the packet memory 110 for output to a network interface 101.




The output of each packet switch 100 is coupled in parallel to the reorder stage 410. The reorder stage 410 comprises a plurality of reorder memories 411, preferably two per packet switch 100 for a total of four reorder memories 411. The reorder stage 410 operates similarly to the reorder/rewrite engine 222 and reorder/rewrite memories 223 of the packet switch 100; the packet switches 100 write their results to the reorder memories 411, whereafter a reorder processor 412 reads their results from the reorder memories 411 and writes them in the original arrival order of the packets 103 to the packet memory 110 for output to a network interface 101.




In a preferred embodiment where each packet switch 100 operates quickly enough to achieve an average switching speed of about 500 kilopackets per second and the reorder stage 410 operates quickly enough so that the pipeline is still balanced, the parallel system 400 produces a throughput of about 1,000 kilopackets switched per second.




Alternative embodiments of the parallel system 400 may comprise larger numbers of packet switches 100 and reorder memories 411. For example, in one alternative embodiment, there are four packet switches 100 and eight reorder memories 411, and the reorder stage 410 is greatly sped up. In this alternative embodiment, the parallel system 400 produces a throughput of about 2,000 kilopackets switched per second.




Alternative Embodiments




Although preferred embodiments are disclosed herein, many variations are possible which remain within the concept, scope, and spirit of the invention, and these variations would become clear to those skilled in the art after perusal of this application.



Claims
  • 1. A system, comprising: a packet memory; N switch engines coupled to said packet memory; NK reorder memories coupled to said N switch engines; and a reorder engine coupled to said plurality of reorder memories and disposed to receive packet headers from said reorder memories in an order in which they were originally received; wherein each one of said switch engines is independently coupled to a corresponding K of said reorder memories.
  • 2. The system of claim 1, wherein K equals 2.
  • 3. The system of claim 1, wherein N equals 2.
  • 4. The system of claim 1, wherein N equals 4.
  • 5. The system of claim 1, wherein each one of said N switch engines writes in sequence to said corresponding K reorder memories.
  • 6. The system of claim 1, wherein each one of said NK reorder memories is processed globally.
  • 7. The system of claim 5, wherein said reorder engine reads in sequence from each of said NK reorder memories.
  • 8. The system of claim 5, wherein said reorder engine reads from each of said NK reorder memories in parallel with said N switch engines writing in sequence to said corresponding K of said reorder memories.
Parent Case Info

This application is a continuation of application Ser. No. 08/511,146, filed Aug. 4, 1995.

US Referenced Citations (190)
Number Name Date Kind
4131767 Weinstein Dec 1978 A
4161719 Parikh et al. Jul 1979 A
4316284 Howson Feb 1982 A
4397020 Howson Aug 1983 A
4419728 Larson Dec 1983 A
4424565 Larson Jan 1984 A
4437087 Petr Mar 1984 A
4438511 Baran Mar 1984 A
4439763 Limb Mar 1984 A
4445213 Baugh et al. Apr 1984 A
4446555 Devault et al. May 1984 A
4456957 Schieltz Jun 1984 A
4464658 Thelen Aug 1984 A
4499576 Fraser Feb 1985 A
4506358 Montgomery Mar 1985 A
4507760 Fraser Mar 1985 A
4532626 Flores et al. Jul 1985 A
4644532 George et al. Feb 1987 A
4646287 Larson et al. Feb 1987 A
4677423 Benvenuto et al. Jun 1987 A
4679189 Olson et al. Jul 1987 A
4679227 Hughes-Hartogs Jul 1987 A
4723267 Jones et al. Feb 1988 A
4731816 Hughes-Hartogs Mar 1988 A
4750136 Arpin et al. Jun 1988 A
4757495 Decker et al. Jul 1988 A
4763191 Gordon et al. Aug 1988 A
4769810 Eckberg, Jr. et al. Sep 1988 A
4769811 Eckberg, Jr. et al. Sep 1988 A
4771425 Baran et al. Sep 1988 A
4819228 Baran et al. Apr 1989 A
4827411 Arrowood et al. May 1989 A
4833706 Hughes-Hartogs May 1989 A
4835737 Herrig et al. May 1989 A
4879551 Georgiou et al. Nov 1989 A
4893306 Chao et al. Jan 1990 A
4903261 Baran et al. Feb 1990 A
4922486 Lidinsky et al. May 1990 A
4933937 Konishi Jun 1990 A
4960310 Cushing Oct 1990 A
4962497 Ferenc et al. Oct 1990 A
4962532 Kasirai et al. Oct 1990 A
4965772 Daniel et al. Oct 1990 A
4970678 Sladowski et al. Nov 1990 A
4979118 Kheradpir Dec 1990 A
4980897 Decker et al. Dec 1990 A
4991169 Davis et al. Feb 1991 A
5003595 Collins et al. Mar 1991 A
5014265 Hahne et al. May 1991 A
5020058 Holden et al. May 1991 A
5033076 Jones et al. Jul 1991 A
5054034 Hughes-Hartog Oct 1991 A
5059925 Weisbloom Oct 1991 A
5072449 Enns et al. Dec 1991 A
5088032 Bosack Feb 1992 A
5095480 Fenner Mar 1992 A
5115431 Williams et al. May 1992 A
5128945 Enns et al. Jul 1992 A
5136580 Videlock et al. Aug 1992 A
5166930 Braff et al. Nov 1992 A
5199049 Wilson Mar 1993 A
5206886 Bingham Apr 1993 A
5208811 Kashio et al. May 1993 A
5212686 Joy et al. May 1993 A
5224099 Corbalis et al. Jun 1993 A
5226120 Brown et al. Jul 1993 A
5228062 Bingham Jul 1993 A
5229994 Balzano et al. Jul 1993 A
5237564 Lespagnol et al. Aug 1993 A
5241682 Bryant et al. Aug 1993 A
5243342 Kattemalalavadi et al. Sep 1993 A
5243596 Port et al. Sep 1993 A
5247516 Bernstein et al. Sep 1993 A
5249178 Kurano et al. Sep 1993 A
5253251 Aramaki Oct 1993 A
5255291 Holden et al. Oct 1993 A
5260933 Rouse Nov 1993 A
5260978 Fleischer et al. Nov 1993 A
5268592 Bellamy et al. Dec 1993 A
5268900 Hluchj et al. Dec 1993 A
5271004 Proctor et al. Dec 1993 A
5274631 Bhardwaj Dec 1993 A
5274635 Rahman et al. Dec 1993 A
5274643 Fisk Dec 1993 A
5280470 Buhrke et al. Jan 1994 A
5280480 Pitt et al. Jan 1994 A
5280500 Mazzola et al. Jan 1994 A
5283783 Nguyen et al. Feb 1994 A
5287103 Kasprzyk et al. Feb 1994 A
5287453 Roberts Feb 1994 A
5291482 McHarg et al. Mar 1994 A
5305311 Lyles Apr 1994 A
5307343 Bostica et al. Apr 1994 A
5309437 Perlman et al. May 1994 A
5311509 Heddes et al. May 1994 A
5313454 Bustini et al. May 1994 A
5313582 Hendel et al. May 1994 A
5317562 Nardin et al. May 1994 A
5319644 Liang Jun 1994 A
5327421 Hiller et al. Jul 1994 A
5331637 Francis et al. Jul 1994 A
5339311 Turner Aug 1994 A
5345445 Hiller et al. Sep 1994 A
5345446 Hiller et al. Sep 1994 A
5359592 Corbalis et al. Oct 1994 A
5361250 Nguyen et al. Nov 1994 A
5361256 Doeringer et al. Nov 1994 A
5361259 Hunt et al. Nov 1994 A
5365524 Hiller et al. Nov 1994 A
5367517 Cidon et al. Nov 1994 A
5371852 Attanasio et al. Dec 1994 A
5386567 Lien et al. Jan 1995 A
5390170 Sawant et al. Feb 1995 A
5390175 Hiller et al. Feb 1995 A
5394394 Crowther et al. Feb 1995 A
5394402 Ross Feb 1995 A
5400325 Chatwani et al. Mar 1995 A
5408469 Opher et al. Apr 1995 A
5412646 Cyr et al. May 1995 A
5414705 Therasse et al. May 1995 A
5416842 Aziz May 1995 A
5422880 Heitkamp et al. Jun 1995 A
5422882 Hiller et al. Jun 1995 A
5423002 Hart Jun 1995 A
5426636 Hiller et al. Jun 1995 A
5428607 Hiller et al. Jun 1995 A
5430715 Corbalis et al. Jul 1995 A
5430729 Rahnema Jul 1995 A
5442457 Najafi Aug 1995 A
5442630 Gagliardi et al. Aug 1995 A
5452297 Hiller et al. Sep 1995 A
5473599 Li et al. Dec 1995 A
5473607 Hausman et al. Dec 1995 A
5477541 White et al. Dec 1995 A
5483523 Nederlof Jan 1996 A
5485455 Dobbins et al. Jan 1996 A
5490140 Abensour et al. Feb 1996 A
5490258 Fenner Feb 1996 A
5491687 Christensen et al. Feb 1996 A
5491804 Heath et al. Feb 1996 A
5497371 Ellis et al. Mar 1996 A
5509006 Wilford et al. Apr 1996 A
5517494 Green May 1996 A
5519704 Farinacci et al. May 1996 A
5519858 Walton et al. May 1996 A
5526489 Nilakantan et al. Jun 1996 A
5530963 Moore et al. Jun 1996 A
5535195 Lee Jul 1996 A
5539734 Burwell et al. Jul 1996 A
5541911 Nilakantan et al. Jul 1996 A
5546370 Ishikawa Aug 1996 A
5548593 Peschi Aug 1996 A
5555244 Gupta et al. Sep 1996 A
5561669 Lenney et al. Oct 1996 A
5583862 Callon Dec 1996 A
5590122 Sandorfi et al. Dec 1996 A
5592470 Rudrapatna et al. Jan 1997 A
5598581 Daines et al. Jan 1997 A
5600798 Cherukuri et al. Feb 1997 A
5604868 Komine et al. Feb 1997 A
5608726 Virgile Mar 1997 A
5608733 Vallee et al. Mar 1997 A
5617417 Sathe et al. Apr 1997 A
5617421 Chin et al. Apr 1997 A
5630125 Zellwegger May 1997 A
5631908 Saxe May 1997 A
5632021 Jennings et al. May 1997 A
5634010 Ciscon et al. May 1997 A
5638359 Peltola et al. Jun 1997 A
5644718 Belove et al. Jul 1997 A
5659684 Giovannoni et al. Aug 1997 A
5666353 Klausmeier et al. Sep 1997 A
5673265 Gupta et al. Sep 1997 A
5678006 Valizadeh et al. Oct 1997 A
5680116 Hashimoto et al. Oct 1997 A
5684797 Aznar Nov 1997 A
5687324 Green et al. Nov 1997 A
5689506 Chiussi et al. Nov 1997 A
5694390 Yamato et al. Dec 1997 A
5724351 Chao et al. Mar 1998 A
5748186 Raman May 1998 A
5748617 McClain, Jr. May 1998 A
5802054 Bellenger Sep 1998 A
5835710 Nagami et al. Nov 1998 A
5854903 Morrison et al. Dec 1998 A
5856981 Voelker Jan 1999 A
5892924 Lyon et al. Apr 1999 A
5898686 Virgile Apr 1999 A
5903559 Acharya et al. May 1999 A
6147996 Laor et al. Nov 2000 A
Foreign Referenced Citations (7)
Number Date Country
0 384 758 Feb 1990 EP
0 431 751 Nov 1990 EP
0567 217 Oct 1993 EP
WO9401828 Jan 1994 WO
WO9520850 Aug 1995 WO
WO9307569 Apr 1996 WO
WO9307692 Apr 1996 WO
Non-Patent Literature Citations (11)
Entry
Esaki, et al., “Datagram Delivery in an ATM-Internet,” IEICE Transactions on Communications vol. E77-B, No. 3, (1994) Mar., Tokyo, Japan.
Chowdhury et al., “Alternative Banddwidth Allocation Algorithms for Packet Video in ATM Networks”, 1992, IEEE Infocom 92, pp. 1061-1068.
Zhang, et al., “Rate-Controlled Static-Priority Queuing”, 1993, IEEE, pp. 227-236.
IBM, “Method and Apparatus for the Statistical Multiplexing of Voice, Data and Image Signals”, Nov. 1992, IBM Technical Data Bulletin n6 11-92, pp. 409-411.
William Stallings, Data and Computer Communications, pp: 329-333.
Allen, M., “Novell IPX Over Various WAN Media (IPXWAN)”, Network Working Group, RFC 1551, Dec. 1993, pp. 1-22.
Becker D., 3c589.c: A 3c589 EtherLink3 ethernet driver for linux, becker@CESDIS.gsfc.nasa.gov, May 3, 1994, pp. 1-13.
Pei, et al., “Putting Routing Tables In Silicon”, IEEE Network Magazine, Jan. 1992, p. 42-50.
Perkins, D., “Requirements for an Internet Standard Point-to-Point Protocol”, Network Working Group, RFC 1547, Dec. 1993, pp. 1-19.
Simpson, W., “The Point-to-Point Protocol (PPP)”, Network Working Group, RFC 1548, Dec. 1993, pp. 1-53.
Tusuchiya, P.F., “A Search Algorithm For Table Entries with NonContinguous Wildcarding”, Abstract, Bellcore.
Continuations (1)
Number Date Country
Parent 08/511146 Aug 1995 US
Child 09/549875 US