This disclosure relates generally to the field of digital computer networks; more particularly, to routing of data packets and protocols for scaling of failure detection in network sessions.
A local area network (LAN) is a high-speed network that supports many computers connected over a limited distance (e.g., under a few hundred meters). A Virtual Local Area Network (VLAN) is a mechanism by which a group of devices on one or more LANs is configured using management software so that they can communicate as if they were attached to the same LAN, when in fact they are located on a number of different LAN segments. After a VLAN has been created, individual switch ports (also referred to as “access ports”) are assigned to the VLAN. These access ports provide a connection for end-users or node devices, such as a router or server. A router is a device or, in some cases, software in a computer, that determines the next network point to which a packet should be forwarded toward its destination.
Bidirectional Forwarding Detection (BFD) is a network protocol, standardized in an Internet Engineering Task Force (IETF) working group, which is used to detect faults between two forwarding engines (e.g., routers or switches). In a typical application, BFD may require 50-150 ms to detect a link failure. According to BFD, sessions are explicitly configured between L3 endpoint neighbors—neighbors at Physical layer L1, Logical Layer L2 over switches or IP Datagram Layer over Routers. A session may operate either in asynchronous mode or demand mode. In asynchronous mode, both endpoints periodically send “Hello” packets to each other. (A Hello packet is basically a “keep alive” message sent by one device to another to check that the connectivity—over physical link, hardware and software paths—between the two L3 Neighbors is operating. The BFD hello mechanism provides detection of failures in a path between adjacent L3 Neighbors, linked over physical media, switches, and routers, switching and routing over forwarding engines, or on any kind of path between systems, including virtual circuits and tunnels.) If a number of the hello packets are not received in a timely fashion, a BFD session between L3 neighbors is considered down. In other words, failure of reachability to a neighbor, for whatever reasons, is detected when packets are not being received or sent. In demand mode, no Hello packets are exchanged after the BFD session is established; rather, it is assumed that the endpoints have another way to verify connectivity to each other, perhaps on the underlying physical layer. However, either host may still send Hello packets if deemed necessary. Regardless of which mode is in use, either endpoint may also initiate an “Echo” function. When this function is active, a stream of Echo packets is sent, and the other endpoint then sends these back—loopbacks—to the sender via its forwarding plane. 
This function is used to test the forwarding and receiving paths to and from the remote system. Pairing of neighbors to form BFD sessions between local and remote endpoints is typically done per physical port and per sub-interface, which causes large scaling problems: as the number of sub-interfaces increases (e.g., >1000), the number of BFD sessions multiplies and the CPU computational overhead increases accordingly.
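By way of illustration only, the asynchronous-mode failure detection described above may be sketched as follows. This is a minimal sketch, not a BFD implementation: the class name `HelloMonitor`, the fields `interval_ms` and `detect_mult`, and the rule that the detection window equals the Hello interval multiplied by the number of tolerated missed Hellos are assumptions introduced for this example, as are the numeric values.

```python
from dataclasses import dataclass


@dataclass
class HelloMonitor:
    """Tracks Hello arrivals for one session (hypothetical helper, illustrative only)."""
    interval_ms: int      # assumed interval between expected Hello packets
    detect_mult: int      # assumed number of missed Hellos tolerated before failure
    last_rx_ms: int = 0   # timestamp of the most recent Hello received

    @property
    def detection_time_ms(self) -> int:
        # Assumed detection window: interval multiplied by the miss count.
        return self.interval_ms * self.detect_mult

    def on_hello(self, now_ms: int) -> None:
        # Record the arrival time of a Hello from the neighbor.
        self.last_rx_ms = now_ms

    def is_down(self, now_ms: int) -> bool:
        # The session is considered down once the window passes with no Hello.
        return now_ms - self.last_rx_ms >= self.detection_time_ms


mon = HelloMonitor(interval_ms=50, detect_mult=3)
mon.on_hello(now_ms=1000)
print(mon.is_down(now_ms=1100))  # 100 ms of silence, inside the 150 ms window
print(mon.is_down(now_ms=1150))  # 150 ms of silence, window has expired
```

In this sketch one such monitor would exist per session, which is why the per-port and per-sub-interface pairing noted above multiplies the timer and packet-processing load.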
The present invention will be understood more fully from the detailed description that follows and from the accompanying drawings, which, however, should not be taken to limit the invention to the specific embodiments shown, but are for explanation and understanding only.
In the following description specific details are set forth, such as device types, system configurations, communication methods, etc., in order to provide a thorough understanding of the present invention. However, persons having ordinary skill in the relevant arts will appreciate that these specific details may not be needed to practice the embodiments described.
In the context of the present application, a computer network is a geographically distributed collection of interconnected sub-networks for transporting data between nodes, such as intermediate nodes and end nodes (also referred to as endpoints). A local area network (LAN) is an example of such a sub-network; a plurality of LANs may be further interconnected by an intermediate network node, such as a router, bridge, or switch, to extend the effective “size” of the computer network and increase the number of communicating nodes. Examples of the devices or nodes include servers, mixers, control units, and personal computers. The nodes typically communicate by exchanging discrete frames or packets of data according to predefined protocols.
A sub-interface is any one of a number of logical interfaces associated with a router's physical interface. Once a sub-interface has been created, a router treats this logical interface just like any physical interface. A “link” refers to a connection between adjacent or neighboring nodes of a network. As it is used in the present disclosure, a link is not limited to a direct connection, but may encompass a path that is routed over multiple hops or other paths, e.g., a Multi-Protocol Label Switching (MPLS) Label Switched Path (LSP). An endpoint (i.e., a sender or receiver) device represents any equipment, node, or other device capable of sending and/or receiving data packets, including any other device, component, element, or object capable of sending or receiving BFD packets, or otherwise participating in BFD packet exchanges.
In one embodiment, the concept of hierarchy is introduced into BFD detection as between physical interfaces and sub-interfaces of a network node. For example, a physical port or interface of a node may be designated as a “parent” interface, with all sub-interfaces created under the physical port being designated as “child” interfaces. The parent and child interfaces (e.g., VLANs, Frame Relay (FR), Asynchronous Transfer Mode (ATM), Layer 2 Virtual Private Networks (L2VPNs), and Layer 3 Virtual Private Networks (L3VPNs)) may be configured as a BFD neighbor group that is created either automatically (e.g., all sub-interfaces under a physical port are included) or via configuration (e.g., by a network administrator). Within a group, BFD sessions are run at a higher rate (i.e., shorter failure detection time) on the parent interface to detect failures faster, with child BFD sessions being run at a slower rate (i.e., longer failure detection time) or not being run at all, in which case the parent acts as a proxy for the child.
In one implementation, the parent-child hierarchical relationship confers a BFD policy on the respective interfaces, with certain specific actions taken from the parent to the children in a particular neighbor group. For example, a failure at the parent (e.g., physical interface) level automatically triggers notification to the child (e.g., sub-interface) level. Additionally, failure detection timers for BFD parents are set more aggressively (e.g., by a factor of 10) as compared to BFD children timers (if the parent is not acting as a proxy for the child interface or BFD session). In still other embodiments, a BFD policy may be configured in which all children BFD sessions are automatically brought down whenever the number of children BFD sessions in the DOWN state equals or exceeds a predetermined threshold level. For instance, where the threshold level is set to three, failure of three out of five sub-interfaces results in all five of the sub-interfaces being brought down.
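By way of illustration only, the threshold policy described above may be sketched as follows. The function name `apply_threshold_policy`, the string states "UP" and "DOWN", and the sub-interface names are hypothetical and chosen solely for this example; the threshold of three and the five sub-interfaces follow the instance given in the text.

```python
from typing import Dict


def apply_threshold_policy(child_states: Dict[str, str], threshold: int) -> Dict[str, str]:
    """Return the child session states after applying the threshold policy.

    If the number of children already in the DOWN state equals or exceeds
    the threshold, every child in the group is brought down; otherwise the
    states are returned unchanged.
    """
    down_count = sum(1 for state in child_states.values() if state == "DOWN")
    if down_count >= threshold:
        return {name: "DOWN" for name in child_states}
    return dict(child_states)


# Five sub-interfaces (hypothetical names), three of which have failed.
children = {
    "sub0": "DOWN",
    "sub1": "DOWN",
    "sub2": "DOWN",
    "sub3": "UP",
    "sub4": "UP",
}
result = apply_threshold_policy(children, threshold=3)
print(result)
```

With only two of the five children down, the same call would leave the remaining three sessions up.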
The parent-child relationship configured between the physical interface and the associated sub-interfaces on each router is such that the parent BFD sessions have relatively fast failure detection timers (e.g., 50 ms), while the child sessions have relatively slow or longer failure detection timer settings (e.g., 500 ms). As a logical partition of the parent interface, each child sub-interface inherits the physical characteristics of the parent. Accordingly, parent BFD sessions are run with a shorter (i.e., faster) failure detection time and children BFD sessions are run with a longer (i.e., slower) failure detection time. The parent BFD session may signal each of the children BFD sessions based on a policy configured by the user. For example, the policy may be that once a failure is detected by the parent BFD session, all of the child BFD sessions of the same group (parent+children) are notified and/or shut down.
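By way of illustration only, the signaling from a parent BFD session to its children may be sketched as follows. The names `BfdSession`, `NeighborGroup`, and `shut_children_on_parent_failure`, as well as the interface names, are hypothetical and are not taken from any particular implementation; the 50 ms and 500 ms values follow the examples given above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BfdSession:
    name: str
    detect_time_ms: int   # failure detection time configured for this session
    state: str = "UP"


@dataclass
class NeighborGroup:
    """A parent session plus its child sessions (hypothetical structure)."""
    parent: BfdSession
    children: List[BfdSession] = field(default_factory=list)
    # Assumed user-configured policy: shut children down, or notify only.
    shut_children_on_parent_failure: bool = True

    def on_parent_failure(self) -> List[str]:
        """Mark the parent down and signal every child in the group."""
        self.parent.state = "DOWN"
        notified = []
        for child in self.children:
            notified.append(child.name)      # every child is notified
            if self.shut_children_on_parent_failure:
                child.state = "DOWN"         # policy: also shut the child down
        return notified


group = NeighborGroup(
    parent=BfdSession("port1", detect_time_ms=50),
    children=[BfdSession(f"port1.{i}", detect_time_ms=500) for i in range(1, 4)],
)
notified = group.on_parent_failure()
print(notified, [c.state for c in group.children])
```

Because the parent detects the failure within its shorter window, the children in this sketch learn of it without waiting for their own longer timers to expire.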
Practitioners in the art will appreciate that by configuring the child BFD sessions with a much longer failure detection timer setting relative to the parent interfaces, the overall keep-alive traffic (i.e., BFD packets per second) is reduced considerably for a given group consisting of a physical interface and all associated sub-interfaces. Additionally, the system is able to accommodate a larger number of total child BFD sessions due to decreased overall CPU loading. Session scalability is also improved by having correlated alarms in the hierarchical relationship between the parent and child BFD sessions.
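By way of illustration only, the reduction in keep-alive traffic may be estimated with the following back-of-the-envelope sketch. It assumes, as a simplification not stated in the text, that each session sends one Hello packet per timer interval in one direction, and it treats the 50 ms and 500 ms example timer values as those intervals; the function name `hello_rate_pps` and the count of 1000 sub-interfaces are likewise chosen for this example.

```python
def hello_rate_pps(interval_ms: float, sessions: int) -> float:
    """Hello packets per second sent by `sessions` sessions (one direction)."""
    return sessions * (1000.0 / interval_ms)


SUB_INTERFACES = 1000  # assumed group size: one physical port, 1000 sub-interfaces

# Flat configuration: the parent and every child run at the fast 50 ms setting.
flat = hello_rate_pps(50, 1 + SUB_INTERFACES)

# Hierarchical configuration: parent at 50 ms, children at 500 ms.
hierarchical = hello_rate_pps(50, 1) + hello_rate_pps(500, SUB_INTERFACES)

print(f"flat: {flat:.0f} pps, hierarchical: {hierarchical:.0f} pps")
print(f"reduction: {flat / hierarchical:.1f}x")
```

Under these assumptions the group's Hello traffic drops by roughly an order of magnitude, and it drops further when the parent acts as a proxy and the children run no sessions at all.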
Stated differently, the physical interfaces to each neighbor are treated differently than other types of interfaces (e.g. sub-interfaces) as far as BFD detection is concerned. The physical interfaces are probed faster to get faster error detection without the node getting slowed down due to a large number of interfaces to the neighbors.
Once BFD is up and running, each of the respective interfaces is constantly monitored for a failure, e.g., when the endpoint fails to receive back the Hello or Echo packets previously sent out as part of the keep-alive mechanism of the BFD protocol.
In a typical networking application, packets are received from a framer, such as an Ethernet media access control (MAC) controller, of the I/O subsystem attached to the system bus. A DMA engine in the MAC controller is provided a list of addresses (e.g., in the form of a descriptor ring in a system memory) for buffers it may access in the system memory. As each packet is received at the MAC controller, the DMA engine obtains ownership of (“masters”) the system bus to access a next descriptor ring to obtain a next buffer address in the system memory at which it may, e.g., store (“write”) data contained in the packet. The DMA engine may need to issue many write operations over the system bus to transfer all of the packet data.
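By way of illustration only, the descriptor-ring receive path described above may be modeled as follows. This is a software simulation under stated assumptions, not driver code: the names `DescriptorRing` and `dma_receive`, the dictionary standing in for system memory, the buffer addresses, and the 8-byte width of a single bus write are all hypothetical, and ownership bits and bus arbitration are omitted.

```python
from typing import Dict, List


class DescriptorRing:
    """Circular list of buffer addresses handed to the DMA engine (simplified)."""

    def __init__(self, buffer_addrs: List[int]) -> None:
        self.buffer_addrs = buffer_addrs
        self.next_index = 0

    def next_buffer(self) -> int:
        # Fetch the next descriptor's buffer address and advance, wrapping around.
        addr = self.buffer_addrs[self.next_index]
        self.next_index = (self.next_index + 1) % len(self.buffer_addrs)
        return addr


def dma_receive(ring: DescriptorRing, memory: Dict[int, bytes],
                packet: bytes, bus_width: int = 8) -> int:
    """Store one packet at the next buffer; return the number of bus writes issued."""
    base = ring.next_buffer()
    writes = 0
    for offset in range(0, len(packet), bus_width):
        # Each iteration models one write operation over the system bus.
        memory[base + offset] = packet[offset:offset + bus_width]
        writes += 1
    return writes


ring = DescriptorRing(buffer_addrs=[0x1000, 0x2000])
memory: Dict[int, bytes] = {}
writes = dma_receive(ring, memory, packet=bytes(20))
print(writes, sorted(hex(addr) for addr in memory))
```

The returned count shows why a single packet may require many write operations: the packet is larger than what one bus transaction can carry.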
It should be understood that elements of the present invention may also be provided as a computer program product which may include a machine-readable medium having stored thereon instructions which may be used to program a computer (e.g., a processor or other electronic device) to perform a sequence of operations. Alternatively, the operations may be performed by a combination of hardware and software. The machine-readable medium may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs, magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards, or other types of machine-readable media suitable for storing electronic instructions.
Additionally, although the present invention has been described in conjunction with specific embodiments, numerous modifications and alterations are well within the scope of the present invention. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.
Number | Date | Country |
---|---|---|
20090010171 A1 | Jan 2009 | US |