The present invention relates generally to digital computer network technology; more particularly, to methods and apparatus for providing Local Area Network (LAN) emulation services over Internet protocol (IP) networks.
A LAN is a high-speed network (typically 10 to 1000 Mbps) that supports many computers connected over a limited distance (e.g., under a few hundred meters). Typically, a LAN spans a single building. U.S. Pat. No. 6,757,286 provides a general description of a LAN segment. A Virtual Local Area Network (VLAN) is a mechanism by which a group of devices on one or more LANs is configured using management software so that the devices can communicate as if they were attached to the same LAN, when in fact they are located on a number of different LAN segments. Because VLANs are based on logical instead of physical connections, they are extremely flexible.
Virtual Private Network (VPN) services provide secure network connections between different locations. A company, for example, can use a VPN to provide secure connections between geographically dispersed sites that need to access the corporate network. VPNs are classified into three types according to the network layer used to establish the connection between the customer and provider network: Layer 1 VPNs, which are simple point-to-point connections using Layer 1 circuits such as SONET; Layer 2 VPNs (L2VPNs), where the provider delivers Layer 2 circuits to the customer (one for each site) and provides switching of the customer data; and Layer 3 VPNs (L3VPNs), where the provider edge (PE) device participates in the customer's routing by managing the VPN-specific routing tables, as well as distributing routes to remote sites. In a Layer 3 IP VPN, customer sites are connected via IP routers, e.g., provider edge (PE) and intermediate provider (P) nodes, that can communicate privately over a shared backbone as if they were using their own private network. Multi-protocol label switching (MPLS) Border Gateway Protocol (BGP) networks are one type of L3VPN solution. An example of an IP-based Virtual Private Network is disclosed in U.S. Pat. No. 6,693,878. U.S. Pat. No. 6,665,273 describes an MPLS system within a network device for traffic engineering.
Virtual Private LAN Service (VPLS) has recently emerged as an L2VPN to meet the need to connect geographically dispersed locations with a protocol-transparent, any-to-any, full-mesh service. VPLS is an architecture that delivers a Layer 2 service that in all respects emulates an Ethernet LAN across a wide area network (WAN) and inherits the scaling characteristics of a LAN. All customer sites in a VPLS appear to be on the same LAN, regardless of their locations. In other words, with VPLS, customers can communicate as if they were connected via a private Ethernet LAN segment. The basic idea behind VPLS is to set up a full mesh of label switched paths (LSPs) between the PE routers so that Media Access Control (MAC) frames received on the customer side can be switched based on their MAC addresses and then encapsulated into MPLS/IP packets on the P node side and sent across the VPLS domain over the full mesh. Conceptually, VPLS can therefore be thought of as an emulated Ethernet LAN segment connected by a set of virtual bridges or virtual Ethernet switches.
In multicast data transmission, data packets originating from a source node are delivered to a group of receiver nodes through a tree structure. (In contrast, unicast communications take place between a single sender and a single receiver.) Various mechanisms, such as the Protocol Independent Multicast (PIM) protocol, have been developed for establishing multicast distribution trees and routing packets across service provider (SP) networks. One commonly used approach uses a dynamic routing algorithm to build the multicast tree by allowing group member receiver nodes to join one-by-one. When a new receiver node attempts to join, it sends a Join request message along a computed path to join the group. The routing algorithm/protocol then connects the new receiver to the existing tree (rooted at the source) without affecting the other tree member nodes.
By way of further background, U.S. Pat. No. 6,078,590 teaches a method of routing multicast packets in a network. Content-based filtering of multicast information is disclosed in U.S. Pat. No. 6,055,364.
Recent VPLS working group drafts (draft-ietf-l2vpn-vpls-ldp-07.txt and draft-ietf-l2vpn-vpls-bgp-05) have no special handling specified for multicast data within a VPLS instance. That is, multicast data within a VPLS instance is treated the same as broadcast data, and it is replicated over all the pseudo-wires (PWs) belonging to that VPLS instance at the ingress provider edge (PE) device. This ingress replication is very inefficient in terms of ingress PE and MPLS/IP core network resources. Furthermore, it is not viable for high bandwidth applications where replicating the multicast data N times may exceed the throughput of the ingress PE trunk. Therefore, SPs are interested in deploying multicast mechanisms in their VPLS-enabled networks that can reduce or eliminate ingress replication, e.g., either replicating the data only over the PWs to the PE devices that are members of the multicast group(s), or sending only one copy of the data over each physical link among PE and P nodes destined to the PE devices that are members of the multicast group(s).
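As a rough illustration of the throughput concern described above, the following sketch compares the load placed on the ingress PE trunk when a stream is replicated once per PW with the load when a single copy is sent. The stream rate, PW count, and trunk capacity are hypothetical figures chosen only for the arithmetic; they are not taken from the drafts or from any deployment.

```python
def ingress_trunk_load_mbps(stream_mbps: float, copies: int) -> float:
    """Load on the ingress PE trunk when the stream is sent `copies` times."""
    return stream_mbps * copies


# Hypothetical figures, for illustration only.
STREAM_MBPS = 50.0    # one customer multicast stream
NUM_PWS = 30          # pseudo-wires in the VPLS instance (N)
TRUNK_MBPS = 1000.0   # capacity of the ingress PE trunk

replicated = ingress_trunk_load_mbps(STREAM_MBPS, NUM_PWS)  # one copy per PW
single_copy = ingress_trunk_load_mbps(STREAM_MBPS, 1)       # one copy per link

print(replicated, replicated > TRUNK_MBPS)    # 1500.0 True
print(single_copy, single_copy > TRUNK_MBPS)  # 50.0 False
```

With these assumed numbers, replication over all 30 PWs would require 1500 Mbps on a 1000 Mbps trunk, whereas a single copy requires only 50 Mbps.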
Two submissions in the Internet Engineering Task Force (IETF) L2VPN Working Group attempt to solve this problem. The first one (specified in draft-serbest-l2vpn-vpls-mcast-03.txt) uses Internet Group Management Protocol (IGMP)/PIM snooping to restrain multicast traffic over a full mesh of PWs belonging to a given VPLS. IGMP is a standard for IP multicasting in the Internet, and is defined in Request For Comments 1112 (RFC1112) for IGMP version 1 (IGMPv1), in RFC2236 for IGMPv2, and in RFC3376 for IGMPv3. (IGMPv3 includes a feature called Source Specific Multicast (SSM) that adds support for source filtering.) By snooping IGMP/PIM messages, the PE (i.e., switch or router) node can populate the Layer 2 (L2) forwarding table based on the content of the intercepted packets. Thus, a PE device can determine which PWs should be included in a multicast group for a given VPLS instance and only replicate the multicast data stream over that subset of PWs.
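The snooping behavior described above can be sketched as a small table that maps each multicast group to the PWs over which membership messages were seen. This is a minimal sketch only: the class and method names are hypothetical, message parsing is omitted, and the handling of groups with no snooped state is left out because the text does not specify it.

```python
from collections import defaultdict


class SnoopingTable:
    """Per-VPLS-instance map from multicast group to interested PWs (hypothetical)."""

    def __init__(self):
        self.members = defaultdict(set)  # group address -> set of PW identifiers

    def on_report(self, pw, group):
        """A snooped IGMP report or PIM Join for `group` was seen on `pw`."""
        self.members[group].add(pw)

    def on_leave(self, pw, group):
        """A snooped leave/prune for `group` was seen on `pw`."""
        self.members[group].discard(pw)
        if not self.members[group]:
            del self.members[group]

    def replication_set(self, group):
        """PWs over which a stream for `group` should be replicated."""
        return set(self.members.get(group, set()))


table = SnoopingTable()
table.on_report("pw-to-PE2", "239.1.1.1")
table.on_report("pw-to-PE4", "239.1.1.1")
print(sorted(table.replication_set("239.1.1.1")))  # ['pw-to-PE2', 'pw-to-PE4']
```

Note that the stream is still copied once per member PW at the ingress PE, so the table narrows the replication set without removing replication itself.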
Although IGMP snooping helps to alleviate replication overhead, it does not completely eliminate the replication problem at the ingress PE. Therefore, this mechanism may not be viable for multicast applications with high bandwidth requirements because the aggregate data throughput after replication may exceed the bandwidth of the physical trunk at the ingress PE.
The second IETF proposal (described in draft-raggarwa-l2vpn-vpls-mcast-01.txt) tries to address the shortcomings of the previous draft by using the multicast tree to transport customer multicast data of a given VPLS service instance. However, because the unicast and multicast paths for a given VPLS instance are different, this approach can result in numerous problems. The first problem involves packet re-ordering, wherein two consecutive frames are sent on two different paths, e.g., a first frame is sent on a multicast path because of an unknown destination unicast MAC address, with a second frame being sent on a unicast path after the path to the destination has been learned. If the unicast path is shorter than the multicast path, the second packet can arrive ahead of the first one.
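The re-ordering scenario can be reproduced with simple arithmetic on send times and path delays. In the sketch below, the function name and all timing values are hypothetical; arrival time is modeled as send time plus a fixed per-path delay.

```python
def arrival_order(frames):
    """Return frame names sorted by arrival time.

    `frames` is a list of (name, send_time, path_delay) tuples; the arrival
    time of a frame is modeled as send_time + path_delay.
    """
    return [name for name, send, delay in sorted(frames, key=lambda f: f[1] + f[2])]


# Non-congruent paths: the first frame floods over a longer multicast path,
# the second follows a shorter unicast path once the MAC address is learned.
non_congruent = [("frame1", 0, 5), ("frame2", 1, 2)]
print(arrival_order(non_congruent))  # ['frame2', 'frame1']

# Congruent paths: both frames see the same delay, so order is preserved.
congruent = [("frame1", 0, 5), ("frame2", 1, 5)]
print(arrival_order(congruent))      # ['frame1', 'frame2']
```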
Another problem with the second IETF proposal is that bridged control packets typically need to take the same path as unicast and multicast data, which means the unicast and multicast paths need to be aligned or congruent. If control packets are sent on unicast paths, any failure in the multicast path can go undetected. This situation is illustrated in
What is needed therefore is a method and apparatus for eliminating ingress replication of multicast data within a VPLS instance that overcomes the aforementioned problems of the prior art.
The present invention will be understood more fully from the detailed description that follows and from the accompanying drawings, which, however, should not be taken to limit the invention to the specific embodiments shown, but are for explanation and understanding only.
A mechanism for aligning unicast and multicast paths in a service provider network, and which thereby achieves shortest path (i.e., optimal) bridging, is described. In the following description specific details are set forth, such as device types, protocols, network configurations, etc., in order to provide a thorough understanding of the present invention. However, persons having ordinary skill in the networking arts will appreciate that these specific details may not be needed to practice the present invention.
A computer network is a geographically distributed collection of interconnected subnetworks for transporting data between nodes, such as intermediate nodes and end nodes. A local area network (LAN) is an example of such a subnetwork; a plurality of LANs may be further interconnected by an intermediate network node, such as a router, bridge, or switch, to extend the effective “size” of the computer network and increase the number of communicating nodes. Examples of the end nodes may include servers and personal computers. The nodes typically communicate by exchanging discrete frames or packets of data according to predefined protocols. In this context, a protocol consists of a set of rules defining how the nodes interact with each other.
As shown in
In a typical networking application, packets are received from a framer, such as an Ethernet media access control (MAC) controller, of the I/O subsystem attached to the system bus. A DMA engine in the MAC controller is provided a list of addresses (e.g., in the form of a descriptor ring in a system memory) for buffers it may access in the system memory. As each packet is received at the MAC controller, the DMA engine obtains ownership of (“masters”) the system bus to access a next descriptor ring to obtain a next buffer address in the system memory at which it may, e.g., store (“write”) data contained in the packet. The DMA engine may need to issue many write operations over the system bus to transfer all of the packet data.
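The descriptor-ring receive path described above can be modeled roughly as follows. This is a simplified sketch, not a description of any particular controller: the class name, the 8-byte bus width, and the buffer sizes are assumptions, and bus arbitration is reduced to counting write operations.

```python
class DescriptorRing:
    """A ring of receive buffers handed to the DMA engine (hypothetical model)."""

    def __init__(self, num_buffers, buffer_size):
        self.buffers = [bytearray(buffer_size) for _ in range(num_buffers)]
        self.next_index = 0

    def next_buffer(self):
        """Return the next buffer in the ring and advance, wrapping at the end."""
        buf = self.buffers[self.next_index]
        self.next_index = (self.next_index + 1) % len(self.buffers)
        return buf


def dma_store(ring, packet, bus_width=8):
    """Copy `packet` into the next ring buffer, one bus-width chunk per write."""
    buf = ring.next_buffer()
    if len(packet) > len(buf):
        raise ValueError("packet does not fit in a single buffer")
    writes = 0
    for offset in range(0, len(packet), bus_width):
        chunk = packet[offset:offset + bus_width]
        buf[offset:offset + len(chunk)] = chunk
        writes += 1
    return buf, writes


ring = DescriptorRing(num_buffers=4, buffer_size=64)
buf, writes = dma_store(ring, bytes(range(20)))
print(writes)  # 3 write operations for a 20-byte packet on an 8-byte bus
```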
According to one embodiment of the present invention, congruent (i.e., aligned) unicast and multicast paths through an MPLS/IP network are achieved, in the presence of either asymmetrical path costs or Equal Cost Multiple Paths (ECMP), through an algorithm in which the unicast paths are first established in a standard manner across an IP network from source to receiver nodes. Trace path messages are then sent hop-by-hop over the network to capture the nodes traversed by each of the unicast paths. The node list information captured by the trace path messages is then provided to the mechanism for building the multicast distribution tree in the opposite direction, i.e., from receiver node to source node.
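The two phases of this algorithm, recording each unicast path hop-by-hop and then building the tree from receiver back to source, can be sketched as shown below. The forwarding table, node names, and topology are invented for illustration and do not reproduce the network of the drawings; the function names are likewise hypothetical.

```python
def record_trace_path(next_hop, source, receiver):
    """Walk hop-by-hop from source toward receiver, recording each node visited.

    `next_hop[(node, receiver)]` is the unicast next hop at `node` for
    traffic destined to `receiver` (a hypothetical forwarding table).
    """
    path = [source]
    while path[-1] != receiver:
        path.append(next_hop[(path[-1], receiver)])
    return path


def build_tree(recorded_paths):
    """Build downstream state by walking each recorded path in reverse."""
    children = {}
    for path in recorded_paths:
        # Pair each node with its upstream neighbor, receiver end first.
        for downstream, upstream in zip(reversed(path), reversed(path[:-1])):
            children.setdefault(upstream, set()).add(downstream)
    return children


# Hypothetical unicast forwarding state for two receivers.
next_hop = {
    ("PE-src", "PE-a"): "P1", ("P1", "PE-a"): "PE-a",
    ("PE-src", "PE-b"): "P1", ("P1", "PE-b"): "P2", ("P2", "PE-b"): "PE-b",
}
paths = [record_trace_path(next_hop, "PE-src", r) for r in ("PE-a", "PE-b")]
tree = build_tree(paths)
print(paths[1])            # ['PE-src', 'P1', 'P2', 'PE-b']
print(sorted(tree["P1"]))  # ['P2', 'PE-a']
```

Because the tree is assembled only from nodes recorded along the unicast paths, each source-to-receiver branch of the tree traverses the same nodes as the corresponding unicast path.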
Once the unicast paths have been built, a trace path message is sent from source node 61 to each of the receiver nodes 62-64. These trace path messages, which are shown in
The aforementioned mechanism may utilize a new trace path PIM message or some other message type (e.g., a standard trace route message) to record the unicast path information from the ingress PE node (PE node 61 in
With continuing reference to the example of
At any point along the route, the path from a given P node toward the receiver node may be determined by inspecting the forward least cost path, and choosing a path from an ECMP point based on the source IP address. Practitioners in the art will appreciate that when an ECMP point in the network is encountered a path may be selected based either on source IP address or a 3-bit ECMP selector field (assuming the maximum number of ECMPs on a given node is eight or fewer).
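Both selection options can be sketched as deterministic choices over the list of equal-cost next hops. The text does not specify how a source IP address is mapped to a path, so the modulo mapping below is an assumption made for illustration, and the function names are hypothetical.

```python
import ipaddress


def select_by_source_ip(next_hops, source_ip):
    """Pick one equal-cost next hop from the source IP address.

    The modulo mapping is an assumed hash; any deterministic function of the
    source address would serve the same purpose.
    """
    index = int(ipaddress.ip_address(source_ip)) % len(next_hops)
    return next_hops[index]


def select_by_selector(next_hops, selector):
    """Pick one equal-cost next hop using a 3-bit selector value (0 to 7)."""
    if len(next_hops) > 8:
        raise ValueError("a 3-bit selector can address at most eight paths")
    if not 0 <= selector < len(next_hops):
        raise ValueError("selector does not name one of the equal-cost paths")
    return next_hops[selector]


hops = ["P1", "P2", "P3"]
# The same source address always yields the same choice, so a trace path
# message and later traffic from that source take the same branch.
print(select_by_source_ip(hops, "10.0.0.5") == select_by_source_ip(hops, "10.0.0.5"))  # True
print(select_by_selector(hops, 2))  # P3
```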
In one embodiment, the egress PE node terminates the propagation of the trace path message and then triggers a Join using the recorded path information. Note that in the presently described embodiment the Join is triggered immediately if the tree is inclusive (either aggregate or non-aggregate). In the event that the multicast tree is selective, then the Join may be triggered when an IGMPv3 or a PIM Join is received over an attachment circuit for the source node of interest. When a node receives a Join message from its downstream node, it uses the unicast trace path information for propagating the Join message to the upstream node.
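The trigger condition at the egress PE and the upstream lookup at each node can be sketched as follows. The function names, the string labels for tree types, and the node names are hypothetical; the recorded path is assumed to be ordered from ingress PE to egress PE.

```python
def join_triggered(tree_kind, customer_join_seen):
    """Decide whether the egress PE sends a Join now.

    `customer_join_seen` indicates that an IGMPv3 or PIM Join for the source
    of interest has been received over an attachment circuit.
    """
    if tree_kind == "inclusive":   # aggregate or non-aggregate
        return True
    if tree_kind == "selective":
        return customer_join_seen
    raise ValueError("unknown tree kind: " + tree_kind)


def upstream_of(node, recorded_path):
    """Return the neighbor to which `node` forwards a Join, or None at the ingress PE."""
    position = recorded_path.index(node)
    return recorded_path[position - 1] if position > 0 else None


recorded = ["PE-src", "P1", "P2", "PE-b"]  # hypothetical recorded unicast path
print(join_triggered("inclusive", customer_join_seen=False))  # True
print(join_triggered("selective", customer_join_seen=False))  # False
print(upstream_of("PE-b", recorded))   # P2
print(upstream_of("PE-src", recorded)) # None
```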
In alternative embodiments in which a source specific multicast tree is built from the source toward the receiver, source-initiated join messages may be utilized instead of receiver-initiated join messages. When unicast traffic associated with the multicast tree is sent, any ECMP selection should be consistent with the one used in choosing the multicast path.
In the case of an MPLS network, the Resource Reservation Protocol with Traffic Engineering extensions (RSVP-TE) may also be used to set up point-to-point (P2P) and point-to-multipoint (P2MP) LSPs to ensure their alignment. RSVP-TE allows the use of source routing, where the ingress router determines the complete path through the network.
It should be understood that elements of the present invention may also be provided as a computer program product which may include a “machine-readable medium” having stored thereon instructions which may be used to program a computer (e.g., a processor or other electronic device) to perform a sequence of operations. A “machine-readable medium” may include any computer program product, apparatus and/or device (e.g., magnetic disks, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. Alternatively, the operations may be performed by a combination of hardware and software. The machine-readable medium may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs, magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards, or other types of machine-readable media suitable for storing electronic instructions.
Although the present invention has been described with reference to specific exemplary embodiments, it should be understood that numerous changes in the disclosed embodiments can be made in accordance with the disclosure herein without departing from the spirit and scope of the invention. The preceding description, therefore, is not meant to limit the scope of the invention. Rather, the scope of the invention is to be determined only by the appended claims and their equivalents.
This application claims the benefit of U.S. Provisional Patent Application No. 60/704,817 filed Aug. 1, 2005, entitled “Multicast Mechanism For VPLS”. The present application is also related to co-pending application entitled, “Congruent Forwarding Paths For Unicast and Multicast Traffic” filed concurrently herewith, which application is assigned to the assignee of the present application.
Number | Name | Date | Kind |
---|---|---|---|
5331637 | Francis et al. | Jul 1994 | A |
5818842 | Burwell et al. | Oct 1998 | A |
5848227 | Sheu | Dec 1998 | A |
6073176 | Baindur et al. | Jun 2000 | A |
6188694 | Fine et al. | Feb 2001 | B1 |
6301244 | Huang et al. | Oct 2001 | B1 |
6304575 | Carroll et al. | Oct 2001 | B1 |
6308282 | Huang | Oct 2001 | B1 |
6373838 | Law et al. | Apr 2002 | B1 |
6424657 | Voit et al. | Jul 2002 | B1 |
6430621 | Srikanth et al. | Aug 2002 | B1 |
6484209 | Momirov | Nov 2002 | B1 |
6502140 | Boivie | Dec 2002 | B1 |
6519231 | Ding et al. | Feb 2003 | B1 |
6611869 | Eschelbeck et al. | Aug 2003 | B1 |
6665273 | Goguen et al. | Dec 2003 | B1 |
6667982 | Christie et al. | Dec 2003 | B2 |
6668282 | Booth, III et al. | Dec 2003 | B1 |
6693878 | Daruwalla et al. | Feb 2004 | B1 |
6732189 | Novaes | May 2004 | B1 |
6763469 | Daniely | Jul 2004 | B1 |
6785232 | Kotser et al. | Aug 2004 | B1 |
6785265 | White et al. | Aug 2004 | B2 |
6789121 | Lamberton et al. | Sep 2004 | B2 |
6798775 | Bordonaro | Sep 2004 | B1 |
6801533 | Barkley | Oct 2004 | B1 |
6813268 | Kalkunte et al. | Nov 2004 | B1 |
6826698 | Minkin et al. | Nov 2004 | B1 |
6829252 | Lewin et al. | Dec 2004 | B1 |
6839348 | Tang et al. | Jan 2005 | B2 |
6850521 | Kadambi et al. | Feb 2005 | B1 |
6850542 | Tzeng | Feb 2005 | B2 |
6852542 | Mandel et al. | Feb 2005 | B2 |
6882643 | Mauger et al. | Apr 2005 | B1 |
6892309 | Richmond et al. | May 2005 | B2 |
6954436 | Yip | Oct 2005 | B1 |
7009983 | Mancour | Mar 2006 | B2 |
7016351 | Farinacci et al. | Mar 2006 | B1 |
7092389 | Chase et al. | Aug 2006 | B2 |
7113512 | Holmgren et al. | Sep 2006 | B1 |
7116665 | Balay et al. | Oct 2006 | B2 |
7173934 | Lapuh et al. | Feb 2007 | B2 |
7277936 | Frietsch | Oct 2007 | B2 |
7281058 | Shepherd et al. | Oct 2007 | B1 |
7310342 | Rouleau | Dec 2007 | B2 |
7333487 | Novaes | Feb 2008 | B2 |
7345991 | Shabtay et al. | Mar 2008 | B1 |
7408936 | Ge et al. | Aug 2008 | B2 |
7466703 | Arunachalam et al. | Dec 2008 | B1 |
7701936 | Hongal et al. | Apr 2010 | B2 |
7855950 | Zwiebel et al. | Dec 2010 | B2 |
7978718 | Farinacci et al. | Jul 2011 | B2 |
7990963 | Aggarwal et al. | Aug 2011 | B1 |
20020032780 | Moore et al. | Mar 2002 | A1 |
20020087721 | Sato et al. | Jul 2002 | A1 |
20020156612 | Schulter et al. | Oct 2002 | A1 |
20020196795 | Higashiyama | Dec 2002 | A1 |
20030012183 | Butler | Jan 2003 | A1 |
20030036375 | Chen et al. | Feb 2003 | A1 |
20030101243 | Donahue et al. | May 2003 | A1 |
20030110268 | Kermarec et al. | Jun 2003 | A1 |
20030112781 | Kermode et al. | Jun 2003 | A1 |
20030142674 | Casey | Jul 2003 | A1 |
20030154259 | Lamberton et al. | Aug 2003 | A1 |
20030177221 | Ould-Brahim et al. | Sep 2003 | A1 |
20040095940 | Yuan et al. | May 2004 | A1 |
20040102182 | Reith et al. | May 2004 | A1 |
20040125809 | Jeng | Jul 2004 | A1 |
20040141501 | Adams et al. | Jul 2004 | A1 |
20040151180 | Hu et al. | Aug 2004 | A1 |
20040158735 | Roese | Aug 2004 | A1 |
20040165525 | Burak | Aug 2004 | A1 |
20040165600 | Lee | Aug 2004 | A1 |
20040172559 | Luo et al. | Sep 2004 | A1 |
20040228291 | Huslak et al. | Nov 2004 | A1 |
20040233891 | Regan | Nov 2004 | A1 |
20040264364 | Sato | Dec 2004 | A1 |
20050007951 | Lapuh et al. | Jan 2005 | A1 |
20050025143 | Chen et al. | Feb 2005 | A1 |
20050030975 | Wright et al. | Feb 2005 | A1 |
20050044265 | Vinel et al. | Feb 2005 | A1 |
20050063397 | Wu et al. | Mar 2005 | A1 |
20050068972 | Burns et al. | Mar 2005 | A1 |
20050089047 | Ould-Brahim et al. | Apr 2005 | A1 |
20050099949 | Mohan et al. | May 2005 | A1 |
20050152370 | Meehan et al. | Jul 2005 | A1 |
20050157664 | Baum | Jul 2005 | A1 |
20050157751 | Rabie et al. | Jul 2005 | A1 |
20050163049 | Yazaki et al. | Jul 2005 | A1 |
20050175022 | Nishimura et al. | Aug 2005 | A1 |
20050190773 | Yang et al. | Sep 2005 | A1 |
20050239445 | Karaogguz et al. | Oct 2005 | A1 |
20050249124 | Elie-Dit-Cosaque et al. | Nov 2005 | A1 |
20050286503 | Oda et al. | Dec 2005 | A1 |
20060007867 | Elie-Dit-Cosaque et al. | Jan 2006 | A1 |
20060092847 | Mohan et al. | May 2006 | A1 |
20060098607 | Zeng | May 2006 | A1 |
20060182037 | Chen et al. | Aug 2006 | A1 |
20060248277 | Pande | Nov 2006 | A1 |
20060285500 | Booth | Dec 2006 | A1 |
20060285501 | Damm | Dec 2006 | A1 |
Number | Date | Country |
---|---|---|
WO 2007031002 | Mar 2007 | WO |
WO 2008089370 | Jul 2008 | WO |
Number | Date | Country |
---|---|---|
20070025277 A1 | Feb 2007 | US |
Number | Date | Country |
---|---|---|
60704817 | Aug 2005 | US |