The present invention generally relates to data communications networks.
The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
In computer networks such as the Internet, packets of data are sent from a source to a destination via a network of elements including links (communication paths such as telephone or optical lines) and nodes (for example, routers directing the packet along one or more of a plurality of links connected to it) according to one of various routing protocols.
One routing protocol used, for example, in the Internet is Border Gateway Protocol (BGP). BGP is used to route data between routing domains such as autonomous systems (AS) comprising networks under a common administrator and sharing a common routing policy. BGP routers exchange full routing information during a connection session, for example using Transmission Control Protocol (TCP), allowing inter-autonomous system routing. The information exchanged includes various attributes including a next-hop attribute. For example, where a BGP router advertises a connection to a network, for example in the form of an IP address prefix, the next-hop attribute comprises the IP address used to reach the BGP router.
Edge or border BGP routers in a first AS (ASBRs) communicate with eBGP peers in a second AS via exterior BGP (eBGP). In addition BGP routers within an AS exchange reachability information using interior BGP (iBGP). As a very large number of routes may be advertised in this manner an additional network component comprising a route reflector is commonly provided which sets up a session with each BGP router and distributes reachability information to each other BGP router.
The border routers in respective AS's can advertise to one another, using eBGP, the prefixes (network destinations) reachable from them. The advertisements carry information such as the AS-path, indicating the AS's through which the route advertisement has passed including the AS in which the advertising border router itself is located, and a BGP Community attribute indicating the manner in which the advertisement is to be propagated. For example, if an eBGP advertisement is received with Community attribute No-Advertise, then the border router receiving the advertisement does not advertise the route information to any of its peers, including other routers in its AS. When the routes are advertised internally using iBGP, additional information such as a local preference and a next-hop field is included. The local preference attribute sets a preference value for use of that particular route, for example for a given set of prefixes, such that where more than one route is available to other border routers in the AS they will select the route with the highest local preference. The next-hop attribute provides the IP address used for the link between the border router in the AS and its eBGP peer.
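The selection by local preference described above can be sketched as follows. This is a minimal illustration only: the `IbgpRoute` type, the `select_route` function, the addresses and the AS numbers are hypothetical, and breaking ties on the shorter AS-path is a simplifying assumption rather than the complete BGP decision process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IbgpRoute:
    """Hypothetical record of one iBGP advertisement for a prefix."""
    prefix: str       # advertised destination prefix
    next_hop: str     # address of the link between the border router and its eBGP peer
    local_pref: int   # higher values are preferred within the AS
    as_path: tuple    # AS numbers through which the advertisement has passed


def select_route(candidates):
    """Return the candidate with the highest local preference.

    Ties are broken in favour of the shorter AS-path (an assumption of
    this sketch). Returns None when there are no candidates.
    """
    if not candidates:
        return None
    return max(candidates, key=lambda r: (r.local_pref, -len(r.as_path)))


routes = [
    IbgpRoute("192.0.2.0/27", "10.0.0.1", 100, (65001, 65010)),
    IbgpRoute("192.0.2.0/27", "10.0.0.2", 200, (65002, 65020, 65010)),
]
best = select_route(routes)
```

Here the second route is selected despite its longer AS-path, because local preference is compared first.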
To reduce the amount of iBGP messages further, route reflectors may only advertise the best path for a given destination to all border routers in an AS. Accordingly all border routers will forward traffic for a given destination to the border router identified in the best path advertisement. Forwarding of packets within the AS may then simply use Interior Gateway Protocol (IGP) as described in more detail below where the IGP forwarding table will ensure that packets destined for the eventual destination will be forwarded within the AS towards the appropriate border router. Alternatively an ingress border router receiving incoming packets may tunnel the packets to the appropriate egress border router, that is, encapsulate the packets to a destination egress border router for example using IP or MPLS tunnels. The packets are then decapsulated at the egress border router and forwarded according to the packet destination header.
BGP is capable of supporting multiple address types for example internet protocol version 4 (IPv4), internet protocol version 6 (IPv6) and so forth, and each type of address is identified using an address family identifier (AFI) and a subsequent address family identifier (SAFI). The destinations reachable via a BGP route, for example the network components whose IP addresses are represented by one IP prefix, are referred to as the network layer reachability information (NLRI) in BGP.
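As a small illustration of the address family identification, the sketch below maps a prefix to an (AFI, SAFI) pair. The values 1 and 2 for the IPv4 and IPv6 AFIs and 1 for the unicast SAFI are the IANA-assigned numbers; the function name `address_family` is hypothetical, and only unicast is considered.

```python
import ipaddress

AFI_IPV4 = 1       # IANA address family number for IPv4
AFI_IPV6 = 2       # IANA address family number for IPv6
SAFI_UNICAST = 1   # IANA subsequent address family identifier for unicast


def address_family(prefix):
    """Return the (AFI, SAFI) pair for a unicast prefix given as a string."""
    network = ipaddress.ip_network(prefix, strict=False)
    afi = AFI_IPV4 if network.version == 4 else AFI_IPV6
    return afi, SAFI_UNICAST
```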
Within each AS the routing protocol typically comprises an interior gateway protocol (IGP) for example a link state protocol such as open shortest path first (OSPF) or intermediate system-intermediate system (IS-IS).
The link state protocol relies on a routing algorithm resident at each node. Each node on the network advertises, throughout the network, links to neighboring nodes and provides a cost associated with each link, which can be based on any appropriate metric such as link bandwidth or delay and is typically expressed as an integer value. A link may have an asymmetric cost, that is, the cost in the direction AB along a link may be different from the cost in a direction BA. Based on the advertised information in the form of a link state packet (LSP) each node constructs a link state database (LSDB), which is a map of the entire network topology, and from that constructs generally a single optimum route to each available node based on an appropriate algorithm such as, for example, a shortest path first (SPF) algorithm. As a result a “spanning tree” (SPT) is constructed, rooted at the node and showing an optimum path including intermediate nodes to each available destination node. The results of the SPF are stored in a routing information base (RIB) and based on these results the forwarding information base (FIB) or forwarding table is updated to control forwarding of packets appropriately. When there is a network change an LSP representing the change is flooded through the network by each node adjacent the change, each node receiving the LSP sending it to each adjacent node.
As a result, when a data packet for a destination node arrives at a node the node identifies the optimum route to that destination and forwards the packet to the next node along that route. The next node repeats this step and so forth.
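The link state computation and the resulting hop-by-hop forwarding can be sketched as below. This is an illustrative shortest path first (Dijkstra) implementation, not the algorithm of any particular router: the `spf` function, the dictionary form of the LSDB and the node names are assumptions made for the example. Costs are stored per direction, so asymmetric links are supported.

```python
import heapq


def spf(lsdb, root):
    """Compute least-cost distances and first hops from root.

    lsdb maps each node to {neighbor: cost}; costs are directional, so the
    cost A->B may differ from B->A. Returns (dist, next_hop), where
    next_hop[d] is the neighbor of root to which packets for d are sent.
    """
    dist = {root: 0}
    next_hop = {}
    visited = set()
    heap = [(0, root, None)]  # (cost so far, node, first hop from root)
    while heap:
        cost, node, first = heapq.heappop(heap)
        if node in visited:
            continue  # stale entry; a cheaper path was already settled
        visited.add(node)
        if first is not None:
            next_hop[node] = first
        for neighbor, link_cost in lsdb.get(node, {}).items():
            new_cost = cost + link_cost
            if neighbor not in visited and new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                # Neighbors of the root are their own first hop.
                hop = neighbor if node == root else first
                heapq.heappush(heap, (new_cost, neighbor, hop))
    return dist, next_hop


# Hypothetical four-node topology with an asymmetric A-B link.
lsdb = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 2, "C": 1},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"C": 1},
}
dist, next_hop = spf(lsdb, "A")
```

In this topology node A reaches C and D through B, since the path A-B-C costs 2 against 5 for the direct link.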
It is important to minimize packet loss in the case of network component failure, both intra-domain (e.g., IGP) and inter-domain (e.g., BGP). For example, in the case of intra-domain link failure, ISPs use various techniques to react quickly to the failure while convergence is taking place, including handling of the failures by other layers or implementing fast reroute techniques, for example of the type described in co-pending patent application Ser. No. 10/340,371, filed 9 Jan. 2003, entitled “Method and Apparatus for Constructing a Backup Route in a Data Communications Network” of Kevin Miles et al., (“Miles et al.”), the entire contents of which are incorporated by reference for all purposes as if fully set forth herein.
In the case of inter-domain failure, for example failure of peering links between AS's, convergence can take several seconds. In one mode of operation, in these circumstances, a BGP router attached to a failed eBGP peering link advertises a new LSP without the destination served by the failed link together with an iBGP withdraw message indicating that the destinations are not reachable. A solution to the problem of inter-domain failure has been described in co-pending patent application Ser. No. 11/254,469, filed Oct. 20, 2005, entitled “A Method of Constructing a Backup Path in an Autonomous System” of Clarence Filsfils et al (“Filsfils et al”), the entire contents of which are incorporated by reference for all purposes as if fully set forth herein. As described in Filsfils et al, in the case where an AS has links with multiple AS's serving respective sets of destinations or prefixes, a backup path is constructed in the case of failure of a link to a respective one of the multiple AS's by identifying alternate links serving the same set of destinations, providing per-prefix route protection.
Currently, when a prefix is advertised in iBGP, the routers in the AS must themselves derive the reliability of (that is, the redundancy in) the external connection to each prefix from the number of advertisements they receive showing connectivity to that prefix. This imposes a computational overhead which slows down iBGP convergence.
The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:
An apparatus and method are described for providing reachability information in a routing domain of an external destination address in a data communications network. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.
Embodiments are described herein according to the following outline:
1.0 General Overview
The needs identified in the foregoing Background, and other needs and objects that will become apparent from the following description, are achieved in the present invention, which comprises an apparatus for providing reachability information, for a destination address external to a routing domain, in the routing domain of a data communications network having as components nodes and links therebetween. The apparatus is arranged to advertise destination address reachability internally to nodes in the routing domain and associate a reachability category with said internal advertisement of said destination address reachability.
In other aspects, the invention encompasses a computer apparatus and a computer-readable medium configured to carry out the foregoing steps.
2.0 Structural and Functional Overview
In overview an apparatus and method for providing reachability information in a routing domain such as an AS according to the approach described herein can be understood with reference to
The network shown in
In order to provide reachability information, at step 200, ASBR1 receives external reachability information for example BGP connectivity information via eBGP. This may be received in the network of
As can be seen from
At step 204 ASBR1 advertises the destination address reachability internally to nodes in the AS for example via iBGP. At step 206 the determined reachability category is associated with the internal advertisement for example as part of the iBGP advertisement in a community string. As a result a node receiving the iBGP advertisement in the network for example node R1 is able to derive additional connectivity information in relation to p/27 for example by comparison to a policy defining a threshold.
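Steps 204 and 206, and the comparison against a policy threshold at the receiving node, might be sketched as follows. Everything named here is an assumption made for illustration: the text does not specify the community values, the category labels, or how categories are ranked, so `CATEGORY_COMMUNITIES`, `REDUNDANCY_LEVEL` and the helper functions are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical community values; no encoding is specified in the text.
CATEGORY_COMMUNITIES = {
    "single-path": "65000:901",
    "diverse": "65000:902",
}
# Hypothetical ranking of categories by the redundancy they indicate.
REDUNDANCY_LEVEL = {"single-path": 1, "diverse": 2}


@dataclass
class IbgpAdvertisement:
    prefix: str
    next_hop: str
    communities: set = field(default_factory=set)


def advertise_with_category(prefix, next_hop, category):
    """Build an internal advertisement tagged with a reachability category."""
    adv = IbgpAdvertisement(prefix, next_hop)
    adv.communities.add(CATEGORY_COMMUNITIES[category])
    return adv


def category_of(adv):
    """Recover the category from the advertisement's communities, if any."""
    for name, community in CATEGORY_COMMUNITIES.items():
        if community in adv.communities:
            return name
    return None


def meets_policy(adv, required_level):
    """Compare the advertised category with a policy-defined threshold."""
    category = category_of(adv)
    return category is not None and REDUNDANCY_LEVEL[category] >= required_level


adv = advertise_with_category("192.0.2.0/27", "10.0.0.1", "single-path")
```

A receiving node whose policy requires level 2 for this prefix would find that `meets_policy(adv, 2)` is false and could then seek further connectivity information.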
Accordingly at step 208 node R1 is able, if necessary, to advertise for further connectivity information. For example, where the category was advertised in the form of an indicator indicating that only a single path is available to p/27 from ASBR1, router R1 may issue an internal advertisement seeking ASBRs within AS1 which also provide connectivity to p/27. For example referring to
Accordingly it will be seen that both an ASBR such as ASBR1 and an internal router such as node R1 can provide reachability information relating to an external destination address, to other internal nodes, and to further ASBR's. In addition one or more further next best alternate paths can be advertised with a corresponding reachability category.
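The request for further connectivity information can be modelled very roughly as a query over what each ASBR can reach externally. The sketch below is not a protocol implementation: `request_alternates`, the table layout and the router names other than ASBR1 are hypothetical, and the message exchange itself is abstracted into a single function call.

```python
def request_alternates(prefix, primary_asbr, asbr_tables):
    """Return the ASBRs, other than the primary, that can reach the prefix.

    asbr_tables maps an ASBR name to the set of prefixes it reaches
    externally (an illustrative stand-in for responses to the request).
    """
    return sorted(
        name
        for name, prefixes in asbr_tables.items()
        if name != primary_asbr and prefix in prefixes
    )


asbr_tables = {
    "ASBR1": {"192.0.2.0/27"},
    "ASBR2": {"192.0.2.0/27", "198.51.100.0/24"},
    "ASBR3": {"198.51.100.0/24"},
}
alternates = request_alternates("192.0.2.0/27", "ASBR1", asbr_tables)
```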
The addition of the BGP community string indicators reduces the computational overhead on iBGP convergence and provides information on suitable FRR paths. Network stability, reliability, redundancy and connectivity in the case of a link or node failure are enhanced by prior knowledge of the existence of alternate paths, by the ability to request help in finding an alternative path, and by the offer of FRR capability.
3.0 Apparatus and Method for Providing Reachability Information in an Autonomous System of an External Destination Address in a Data Communications Network
Reference is made to
Referring firstly to
At step 308 the receiving ASBR advertises the enhanced connectivity information internally within AS1 using iBGP.
The category can indicate, for example, the level of redundancy available in connectivity between the ASBR and the prefix.
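One way an ASBR might derive such a category from its external paths is sketched below. The labels and the classification rule (counting paths and distinct neighboring AS's) are assumptions made for illustration and are not taken from the text.

```python
def reachability_category(external_paths):
    """Classify the redundancy of an ASBR's external connectivity to a prefix.

    external_paths is a list of (peer_address, neighbor_as) tuples, one per
    eBGP path to the prefix. The returned labels are hypothetical.
    """
    if not external_paths:
        return "unreachable"
    if len(external_paths) == 1:
        return "single-path"
    neighbor_ases = {asn for _, asn in external_paths}
    if len(neighbor_ases) == 1:
        # Several links, but all into the same neighboring AS.
        return "multi-path-same-as"
    return "diverse"
```

For example, two paths through different neighboring AS's would be classified as "diverse", while two links into the same AS would not.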
An alternative scenario can be understood with reference to
A further scenario is shown in
It will be noted that the specific form of the indicator can take any appropriate type such as setting of one or more appropriate bits, or any other appropriate coding recognisable by the other components in the AS. It will be noted that an indicator associated with a particular prefix does not necessarily accurately represent the actual network arrangement. To accommodate this, an ASBR may in effect set a particular path to a network address prefix as optimum or non-optimum by setting a corresponding indicator regardless of the actual network arrangement and the router will then take appropriate action in a policy dependent manner as discussed below.
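A bit-based coding of the indicator, together with a policy override that sets an indicator regardless of the actual network arrangement, could look like the following sketch. The flag names, bit positions and the `indicator_for` helper are hypothetical.

```python
from enum import IntFlag


class Indicator(IntFlag):
    """Hypothetical bit assignments for reachability indicators."""
    NONE = 0
    SINGLE_PATH = 1
    DIVERSE = 2
    HELP = 4
    FRR_CAPABLE = 8


def encode(flags):
    """Pack a combination of indicators into an integer field."""
    return int(flags)


def decode(value):
    """Recover the indicators from an integer field."""
    return Indicator(value)


def indicator_for(prefix, measured, overrides):
    """Return the indicator to advertise for a prefix.

    An operator-configured override, if present, takes precedence over the
    indicator derived from the actual network arrangement.
    """
    return overrides.get(prefix, measured)


value = encode(Indicator.DIVERSE | Indicator.FRR_CAPABLE)
```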
Turning to the steps performed at a router R1 in AS1 as shown in
The router R1 may have prior knowledge of the degree of reachability required for each individual network address prefix it has access to for example in the form of a policy and can compare this with the advertised enhanced connectivity information. The steps taken in relation to the reachability category for a given prefix may then be determined according to the policy. For example in all cases the router may require at least one further path as well as additional paths if there is a single point of failure. In the embodiment described herein, however, in the case of a “diverse connectivity” indicator (the scenario at
In the case of the scenario shown in
Referring to the scenario shown in
Reverting to
At step 412 the router receives a response from the ASBR providing alternate connectivity and at step 414 the router holds the information in the RIB (in the case of fast convergence), or the presence of the alternative route can be used to speed up convergence by switching to it immediately without waiting for full BGP convergence. If the alternate route is FRR capable, the node can repair to it immediately and updates its forwarding tables appropriately (in the case of fast re-route), for example by providing the alternate next hop for the alternate computed path for use in the event of notification of withdrawal of the primary route.
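The fast re-route case, in which an alternate next hop is installed in advance and used on withdrawal of the primary route, can be illustrated with the sketch below. The `ForwardingEntry` class and the addresses are hypothetical, and a real forwarding table would carry considerably more state.

```python
class ForwardingEntry:
    """Forwarding entry holding a primary and a pre-installed alternate next hop."""

    def __init__(self, primary, alternate=None):
        self.primary = primary
        self.alternate = alternate

    def active_next_hop(self):
        """Return the next hop currently used for forwarding, or None."""
        return self.primary if self.primary is not None else self.alternate

    def withdraw_primary(self):
        """Promote the alternate on notification of withdrawal of the primary."""
        self.primary, self.alternate = self.alternate, None


fib = {"192.0.2.0/27": ForwardingEntry("10.0.0.1", alternate="10.0.0.2")}
```

Because the alternate is already present in the entry, the switch on withdrawal is a local operation and does not wait for BGP convergence.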
The “help” indicator can be further understood with reference to
As a result, forwarding is improved in the event of a failure whilst reducing iBGP traffic and avoiding approaches such as automatic or policy controlled addition of routes in iBGP using techniques such as an “addpath” attribute.
The approach can be implemented in any appropriate network or environment using any appropriate protocol. The manner in which the method described herein is implemented may be using software, firmware, hardware or any combination thereof and with any appropriate code changes as will be apparent to the skilled reader without the need for detailed description herein.
4.0 Implementation Mechanisms—Hardware Overview
Computer system 140, acting as a router serving as an external advertisement receiving node, implements the above described method of forwarding data. Computer system 140 includes a bus 142 or other communication mechanism for communicating information, and a processor 144 coupled with bus 142 for processing information. Computer system 140 also includes a main memory 146, such as a random access memory (RAM), flash memory, or other dynamic storage device, coupled to bus 142 for storing information and instructions to be executed by processor 144. Main memory 146 may also be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 144. Computer system 140 further includes a read only memory (ROM) 148 or other static storage device coupled to bus 142 for storing static information and instructions for processor 144. A storage device 150, such as a magnetic disk, flash memory or optical disk, is provided and coupled to bus 142 for storing information and instructions.
A communication interface 158 may be coupled to bus 142 for communicating information and command selections to processor 144. Interface 158 is a conventional serial interface such as an RS-232 or RS-422 interface. An external terminal 152 or other computer system connects to the computer system 140 and provides commands to it using the interface 158. Firmware or software running in the computer system 140 provides a terminal interface or character-based command interface so that external commands can be given to the computer system.
A switching system 156 is coupled to bus 142 and has an input interface and a respective output interface (commonly designated 159) to external network elements. The external network elements may include a plurality of additional routers 160 or a local network coupled to one or more hosts or routers, or a global network such as the Internet having one or more servers. The switching system 156 switches information traffic arriving on the input interface to output interface 159 according to pre-determined protocols and conventions that are well known. For example, switching system 156, in cooperation with processor 144, can determine a destination of a packet of data arriving on the input interface and send it to the correct destination using the output interface. The destinations may include a host, server, other end stations, or other routing and switching devices in a local network or Internet.
Computer system 140, acting as a router serving as an internal or external advertisement receiving node, implements the above described method of forwarding data. The implementation is provided by computer system 140 in response to processor 144 executing one or more sequences of one or more instructions contained in main memory 146. Such instructions may be read into main memory 146 from another computer-readable medium, such as storage device 150. Execution of the sequences of instructions contained in main memory 146 causes processor 144 to perform the process steps described herein. One or more processors in a multi-processing arrangement may also be employed to execute the sequences of instructions contained in main memory 146. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the method. Thus, embodiments are not limited to any specific combination of hardware circuitry and software.
The term “computer-readable medium” as used herein refers to any medium that participates in providing instructions to processor 144 for execution. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 150. Volatile media includes dynamic memory, such as main memory 146. Transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 142. Transmission media can also take the form of wireless links such as acoustic or electromagnetic waves, such as those generated during radio wave and infrared data communications.
Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.
Various forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to processor 144 for execution. For example, the instructions may initially be carried on a magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 140 can receive the data on the telephone line and use an infrared transmitter to convert the data to an infrared signal. An infrared detector coupled to bus 142 can receive the data carried in the infrared signal and place the data on bus 142. Bus 142 carries the data to main memory 146, from which processor 144 retrieves and executes the instructions. The instructions received by main memory 146 may optionally be stored on storage device 150 either before or after execution by processor 144.
Interface 159 also provides a two-way data communication coupling to a network link that is connected to a local network. For example, the interface 159 may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, the interface 159 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, the interface 159 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
The network link typically provides data communication through one or more networks to other data devices. For example, the network link may provide a connection through a local network to a host computer or to data equipment operated by an Internet Service Provider (ISP). The ISP in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet”. The local network and the Internet both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on the network link and through the interface 159, which carry the digital data to and from computer system 140, are exemplary forms of carrier waves transporting the information.
Computer system 140 can send messages and receive data, including program code, through the network(s), network link and interface 159. In the Internet example, a server might transmit a requested code for an application program through the Internet, ISP, local network and communication interface 158. One such downloaded application provides for the method as described herein.
The received code may be executed by processor 144 as it is received, and/or stored in storage device 150, or other non-volatile storage for later execution. In this manner, computer system 140 may obtain application code in the form of a carrier wave.
5.0 Extensions and Alternatives
In the foregoing specification, the invention has been described with reference to specific embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
Any appropriate routing protocol and mechanism and forwarding paradigm can be adopted to implement the invention. The method steps set out can be carried out in any appropriate order and aspects from the examples and embodiments described juxtaposed or interchanged as appropriate. For example the method can be implemented using link state protocols such as intermediate system-intermediate system (IS-IS) or open shortest path first (OSPF), or routing vector protocols and any forwarding paradigm, for example MPLS. The method can be applied in any network of any topology and in relation to any component change in the network for example a link or node failure, or the introduction or removal of a network component by an administrator.
Furthermore, the mechanism above of identifying network prefix address connectivity in the Community string of BGP is potentially usable in other protocols where a proportion of the protocol is reserved for the transport of network and connectivity information.
Where reference is made to BGP, eBGP or iBGP it will be appreciated that the approach can be applied in relation to any appropriate exterior or inter-domain protocol. The routing domain may comprise an AS, SRLG, or LAN, or any other network of interconnected components sharing a common routing protocol.
Number | Name | Date | Kind |
---|---|---|---|
5953312 | Crawley et al. | Sep 1999 | A |
6032194 | Gai et al. | Feb 2000 | A |
6438100 | Halpern et al. | Aug 2002 | B1 |
6934763 | Kubota et al. | Aug 2005 | B2 |
6981055 | Ahuja et al. | Dec 2005 | B1 |
7177295 | Sholander et al. | Feb 2007 | B1 |
7181533 | D'Souza et al. | Feb 2007 | B2 |
7197040 | Bressoud et al. | Mar 2007 | B2 |
7209975 | Zang et al. | Apr 2007 | B1 |
7215644 | Wu et al. | May 2007 | B2 |
7233593 | Chavali | Jun 2007 | B2 |
7236575 | Kim et al. | Jun 2007 | B2 |
7355983 | Scudder et al. | Apr 2008 | B2 |
7359393 | Nalawade et al. | Apr 2008 | B1 |
7406035 | Harvey et al. | Jul 2008 | B2 |
7408941 | Martini et al. | Aug 2008 | B2 |
7420958 | Marques | Sep 2008 | B1 |
7480253 | Allan | Jan 2009 | B1 |
7483387 | Guichard et al. | Jan 2009 | B2 |
7502332 | Chen | Mar 2009 | B1 |
7519009 | Fleischman | Apr 2009 | B2 |
7535826 | Cole et al. | May 2009 | B1 |
7590074 | Dondeti et al. | Sep 2009 | B1 |
7697439 | Martini et al. | Apr 2010 | B2 |
7733876 | Davie et al. | Jun 2010 | B2 |
7787396 | Nalawade et al. | Aug 2010 | B1 |
7801030 | Aggarwal et al. | Sep 2010 | B1 |
20020093954 | Weil et al. | Jul 2002 | A1 |
20030007500 | Rombeaut et al. | Jan 2003 | A1 |
20030142682 | Bressoud et al. | Jul 2003 | A1 |
20030233595 | Charny et al. | Dec 2003 | A1 |
20040213233 | Hong et al. | Oct 2004 | A1 |
20040260825 | Agarwal et al. | Dec 2004 | A1 |
20050007950 | Liu | Jan 2005 | A1 |
20050068968 | Ovadia et al. | Mar 2005 | A1 |
20050265258 | Kodialam et al. | Dec 2005 | A1 |
20060029035 | Chase et al. | Feb 2006 | A1 |
20060140190 | Lee | Jun 2006 | A1 |
20060187819 | Bryant et al. | Aug 2006 | A1 |
20060193247 | Naseh et al. | Aug 2006 | A1 |
20060193252 | Naseh et al. | Aug 2006 | A1 |
20060209716 | Previdi et al. | Sep 2006 | A1 |
20060239201 | Metzger et al. | Oct 2006 | A1 |
20060291446 | Caldwell et al. | Dec 2006 | A1 |
20070005784 | Hares et al. | Jan 2007 | A1 |
20070011351 | Bruno et al. | Jan 2007 | A1 |
20070025270 | Sylvain | Feb 2007 | A1 |
20070041379 | Previdi et al. | Feb 2007 | A1 |
20070064702 | Bates et al. | Mar 2007 | A1 |
20070091793 | Filsfils et al. | Apr 2007 | A1 |
20070091794 | Filsfils et al. | Apr 2007 | A1 |
20070091795 | Bonaventure et al. | Apr 2007 | A1 |
20070091796 | Filsfils et al. | Apr 2007 | A1 |
20070180311 | Harvey et al. | Aug 2007 | A1 |
20070214275 | Mirtorabi et al. | Sep 2007 | A1 |
20070214280 | Patel et al. | Sep 2007 | A1 |
20070260746 | Mirtorabi et al. | Nov 2007 | A1 |
20080008104 | Previdi et al. | Jan 2008 | A1 |
20080062986 | Shand et al. | Mar 2008 | A1 |
20080192627 | Lichtwald | Aug 2008 | A1 |
20080219153 | Shand et al. | Sep 2008 | A1 |
20100287305 | Kompella | Nov 2010 | A1 |
Number | Date | Country | |
---|---|---|---|
20080062986 A1 | Mar 2008 | US |