The present invention relates to a method and switch for packets and is particularly concerned with broadcasting packets.
The capability to broadcast information, that is to send the same information to multiple nodes, is useful in many applications, for example:
Referring to
RapidIO has defined a standard register interface and behavior for a RapidIO switch to broadcast information, called multicast. The implementation of multicast is vendor specific. When a switch receives a packet that is to be multicast, the packet is replicated one at a time to each output port 16.
This leads to several problems. Multicasting of a packet delays all of the packets behind the packet being multicast in proportion to the size of the packet and the number of times the packet must be replicated. Congested egress ports cause head of line blocking of the multicast packet, further increasing delay. Failure of one port can block further multicast operations, and lead to congestive failure of the switch.
When a switch 10 receives a packet that is to be multicast, the ingress port 12 seizes access to all egress ports 16 and then replicates the packet in parallel to all egress ports.
This can result in the following problems:
An object of the present invention is to provide an improved method and switch for broadcasting packets.
In accordance with an aspect of the present invention there is provided a switch for broadcasting packets comprising: a plurality of input ports; a switch fabric coupled to the input ports; a plurality of output ports coupled to the switch fabric; and a multicast interconnect coupled between the input ports and output ports for routing multicast packets directly therebetween.
In accordance with another aspect of the present invention there is provided a method of broadcasting packets from an input port to a plurality of output ports comprising the steps of: at an input port, replicating broadcast packets; and directly coupling the packets to a plurality of output ports.
The present invention will be further understood from the following detailed description with reference to the drawings in which:
Referring to
In operation, the work queue arbiter 30 decides which ingress port 22 should place its packet to be multicast into the work queue 32 next. The work queue 32 holds packets that are to be multicast. This eliminates head of line blocking attributable to multicast functionality. The broadcast buffer 34 holds copies of the original packet until the egress port 26 can transmit them. The work queue 32 is connected to the broadcast buffers 34 by a dedicated interconnect. The egress port arbiter 36 implements collision resolution between multicast and non-multicast packets.
By way of example a packet to be multicast is received by Input Port 0 (22a).
The multicast work queue arbiter selects Input Port 0 (22a) as the next port from which to receive a packet. The packet is routed to the multicast work queue 32.
The multicast work queue 32 determines which output ports 26 the packet should be sent to. The packet is replicated simultaneously to the broadcast buffers 34 of the output ports 26 selected. Each broadcast buffer, for example the broadcast buffer 34a, then requests to send its packet copy to the corresponding output port, here the output port 26a.
For example, the egress arbiter 36 for the port 26a signals the broadcast buffer 34a to send the packet. In this way, the packet copy is transmitted on each output port.
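The walk-through above can be sketched in software as a toy model. This is only an illustrative sketch, not the hardware implementation: the class name `MulticastPath`, the `mask_table` mapping from a multicast identifier to a set of output ports, and the method names are all hypothetical and do not come from the description or from the RapidIO specification.

```python
from collections import deque


class MulticastPath:
    """Toy model of the work queue 32 and the per-port broadcast buffers 34."""

    def __init__(self, num_ports, mask_table):
        # mask_table is a hypothetical mapping: multicast id -> set of output ports.
        self.mask_table = mask_table
        self.work_queue = deque()
        self.broadcast_buffers = [deque() for _ in range(num_ports)]

    def enqueue(self, input_port, mcast_id, payload):
        # The work queue arbiter has selected input_port; queue its packet.
        self.work_queue.append((input_port, mcast_id, payload))

    def replicate_next(self):
        # Look up the target ports and copy the packet into every target
        # broadcast buffer (hardware would do this in parallel).
        _input_port, mcast_id, payload = self.work_queue.popleft()
        targets = self.mask_table[mcast_id]
        for port in targets:
            self.broadcast_buffers[port].append(payload)
        return targets

    def transmit(self, port):
        # Called once the egress arbiter for this port grants the broadcast buffer.
        return self.broadcast_buffers[port].popleft()
```

For instance, with four ports and a multicast identifier 7 mapped to ports 0, 2 and 3, a packet enqueued from input port 0 ends up as one copy in each of those three broadcast buffers, while the buffer for port 1 stays empty.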
As discussed above, delay due to size/replication time of multicast packets is a serious problem with multicast. Dedicated interconnect 28 and broadcast buffers 34 allow packet multicast to occur without interference from/to unicast traffic using the switch fabric 24. Parallel packet replication has the advantage that multicast operations have the same performance characteristics as unicast packets.
Another problem is delay due to head of line blocking by multicast packets. A separate work queue 32 for broadcast packets means that a packet does not have to wait for a congested egress port 26 to be multicast. Because multicast traffic can impact unicast traffic, and vice versa, the egress arbiter 36 allows the resolution of multicast/unicast contention to be predetermined.
Failure of one output port 26 can block further multicast operations, and lead to congestive failure of the switch. Each broadcast buffer 34 has a timeout that forces forward progress of multicast data on each port. If the timeout expires, the broadcast buffer 34 is flushed, freeing up space to allow forward progress of multicast traffic.
Optionally, a port 26 whose timeout expires can be removed from future multicast operations until software can recover the affected link partner.
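The timeout behavior can be illustrated with a small sketch. It assumes a tick-based timer that counts how long the head packet has waited; the description does not say how the timeout is measured, so `timeout_ticks`, `tick`, `timed_out` and `eligible_ports` are hypothetical names chosen for illustration only.

```python
class BroadcastBuffer:
    """Toy broadcast buffer 34 with a forward-progress timeout."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks  # assumed unit: arbitrary clock ticks
        self.packets = []
        self.waiting = 0        # ticks the head packet has been stalled
        self.timed_out = False  # set once the buffer has been flushed

    def push(self, packet):
        self.packets.append(packet)

    def tick(self, port_can_transmit):
        """Advance one tick; return the packet sent, or None."""
        if not self.packets:
            self.waiting = 0
            return None
        if port_can_transmit:
            self.waiting = 0
            return self.packets.pop(0)
        self.waiting += 1
        if self.waiting >= self.timeout_ticks:
            # Timeout expired: flush so multicast traffic keeps moving.
            self.packets.clear()
            self.waiting = 0
            self.timed_out = True
        return None


def eligible_ports(targets, buffers):
    """Optional step: drop timed-out ports from future multicast operations."""
    return {port for port in targets if not buffers[port].timed_out}
```

In this sketch, software recovery of the link partner would correspond to clearing `timed_out` on the affected buffer, after which the port is again included in the target set.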
Multicast wastes bandwidth in the fabric connecting the ports, since all ports cannot be seized simultaneously. To overcome this problem, multicast packet replication is done separately from the non-multicast traffic, so no fabric bandwidth is wasted. The interconnect to the broadcast buffers 34 can replicate packets as quickly as they can be received from one port. The broadcast buffers 34 can accept and transmit data at the maximum fabric speeds.
The work queue arbiter and egress port arbiters can be implemented using various known arbitration algorithms, including round robin, weighted round robin, arrival time, request based, and priority based.
In a particular implementation for the RapidIO protocol, the work queue arbiter selects packets according to priority: the highest priority packet offered is accepted.
Within a priority, any algorithm that accepts packets from the different ports may be used, such as weighted round robin or simple round robin.
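One way to picture this selection rule is the sketch below, which combines strict priority with simple round robin among ports offering packets of equal priority. It is one possible choice among the algorithms listed above, not the described implementation; the class name `WorkQueueArbiter`, the `offers` dictionary, and the convention that a larger number means higher priority are assumptions.

```python
class WorkQueueArbiter:
    """Strict priority first, then simple round robin within a priority."""

    def __init__(self, num_ports):
        self.num_ports = num_ports
        # Start so that port 0 is the first candidate examined.
        self.last_granted = num_ports - 1

    def select(self, offers):
        """offers maps input port -> priority of the packet it offers.

        Returns the granted port, or None when nothing is offered.
        """
        if not offers:
            return None
        top = max(offers.values())
        # Scan ports in round-robin order, starting after the last grant.
        for step in range(1, self.num_ports + 1):
            port = (self.last_granted + step) % self.num_ports
            if offers.get(port) == top:
                self.last_granted = port
                return port
        return None
```

With ports 1 and 2 both offering priority 3 and port 0 offering priority 1, successive calls grant port 1, then port 2, then port 1 again; port 0 is served only once no higher priority packet is offered.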
The work queue 32 is a memory that holds packets waiting to be replicated. The work queue 32 uses an implementation of the RapidIO standard multicast packet replication selection register interface. An implementation-specific interface that is faster and easier to use includes the following:
The broadcast buffers 34 are connected to the work queue 32 via a dedicated interconnect. The broadcast buffers 34 indicate to the work queue 32 whether or not they are able to accept data. The work queue 32 transmits data when all of the broadcast buffers 34 that a packet must be replicated to indicate that they can accept data. A variety of flow control algorithms can be used here, depending on how the broadcast buffers 34 are managed. Packets can ‘flow through’ from the input port 22, through the work queue 32, to the broadcast buffer 34 to minimize latency. A complete packet must be received by the broadcast buffer 34 before it can be transmitted on the output port 26. This is not necessary if the implementation can handle stomping of packets flowing through the work queue/broadcast buffer, and if the receiving port is at least equal in speed to the transmitting port.
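The gating condition, in which the work queue transmits only when every targeted broadcast buffer can accept data, can be sketched as follows. The description leaves the flow control algorithm open, so this sketch assumes a simple per-port count of free packet slots; `try_replicate` and `free_slots` are hypothetical names.

```python
from collections import deque


def try_replicate(work_queue, free_slots):
    """Replicate the head packet only if all of its targets have room.

    work_queue: deque of (payload, targets) tuples.
    free_slots: dict mapping output port -> free packet slots in its
                broadcast buffer (an assumed, credit-style flow control).
    Returns True if the head packet was replicated, else False.
    """
    if not work_queue:
        return False
    _payload, targets = work_queue[0]
    if not all(free_slots[port] > 0 for port in targets):
        return False  # at least one target buffer is full: the packet waits
    work_queue.popleft()
    for port in targets:
        free_slots[port] -= 1  # one slot consumed in each target buffer
    return True
```

Ports that are not targets of the head packet do not affect the decision, so a full buffer on an unrelated port does not hold up replication.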
The egress port arbiter 36 allows system-specific configuration of contention between multicast and non-multicast traffic. The egress port arbiter 36 must respect strict priority ordering, that is, if the multicast packet has a higher priority than the non-multicast packet, the multicast packet must be sent first. The egress port arbiter 36 can implement any arbitration algorithm, e.g., round robin or weighted round robin. The particular implementation discussed is a limited form of weighted round robin (one of the two weights is restricted to 0).
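A minimal sketch of such an egress arbiter is given below. Strict priority is applied first, and weighted round robin only breaks ties between packets of equal priority. How a weight of 0 behaves is not spelled out in the description; this sketch assumes that the source with weight 0 loses every tie while the other source has a packet pending. The class name `EgressArbiter` and the use of `None` for "no packet pending" are likewise assumptions.

```python
class EgressArbiter:
    """Strict priority, with weighted round robin to break priority ties."""

    def __init__(self, multicast_weight, unicast_weight):
        self.weights = {"multicast": multicast_weight, "unicast": unicast_weight}
        self.credits = dict(self.weights)

    def select(self, multicast_prio, unicast_prio):
        """Each argument is a priority (larger wins) or None if nothing pends.

        Returns "multicast", "unicast", or None.
        """
        if multicast_prio is None and unicast_prio is None:
            return None
        if unicast_prio is None:
            return "multicast"
        if multicast_prio is None:
            return "unicast"
        if multicast_prio != unicast_prio:
            # Strict priority ordering always comes first.
            return "multicast" if multicast_prio > unicast_prio else "unicast"
        # Equal priority: spend credits, refilling when both are exhausted.
        if self.credits["multicast"] == 0 and self.credits["unicast"] == 0:
            self.credits = dict(self.weights)
        for source in ("multicast", "unicast"):
            if self.credits[source] > 0:
                self.credits[source] -= 1
                return source
        return "multicast"  # both weights are 0: arbitrary fallback
```

Under these assumptions, `EgressArbiter(1, 0)` sends the multicast packet on every tie, while `EgressArbiter(2, 1)` alternates two multicast packets with one unicast packet when both sources stay backlogged at the same priority.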
The embodiment of
Referring to
Referring to
Referring to
Referring to
Referring to
The present application claims priority of U.S. Provisional Application Ser. No. 60/740,401 filed 28 Nov. 2005, which is incorporated herein in its entirety by this reference.
Number | Date | Country | |
---|---|---|---|
60740401 | Nov 2005 | US |