For network administration or network planning, a map of the network infrastructure is typically created. The map identifies devices, such as routers, gateways, etc., in the network and connections between the devices. The map may identify the status of devices or connections, such as whether they are failed or operational, and may identify network metrics about the devices or connections.
In order to generate the map, the devices and their connections need to be discovered. Most Open Systems Interconnection (OSI) layer 2 devices, such as switches, use some variation of the Spanning Tree Protocol to determine which connections to use. As a result, the devices know which other layer 2 devices they connect to. Typically, a management system can poll the devices to find out about the connections they know about. This approach has the fairly obvious failing that only connections between devices supporting compatible versions of the Spanning Tree Protocol can be discovered.
Often, it is more difficult to discover layer 2 devices and their connections. For example, layer 2 connections may not be discoverable or may not be 100% accurate between neighboring devices that implement different (incompatible) layer 2 discovery protocols. A forwarding database for a bridge may be used to determine connections but the information may not be current or a forwarding database may not be available. Thus, it is often difficult to accurately identify devices and connections for a map.
The embodiments are described in detail in the following description with reference to the following figures:
For simplicity and illustrative purposes, the principles of the embodiments are described by referring mainly to examples thereof. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the embodiments. It is apparent that the embodiments may be practiced without limitation to all the specific details. Also, the embodiments may be used together in various combinations.
A topology discovery process determines connections and unknown domains for a topology. In one embodiment, connections for all discovered devices are determined or all possible routes through the network are exhausted. Cached connection data may be used to determine connections and determine updates to the topology. A topology is a map of a network including devices in the network and connections between the devices. Once determined, the topology may be displayed on a user interface and can be used for network administration and planning.
The topology discovery process may include using cached connection data to identify multiple connections from a network interface on a device. However, these connections may not be direct connections. For example, the connections may comprise multi-hop connections between a network interface and another device. According to an embodiment, the topology discovery process identifies, from the connection data, connections comprised of direct bi-directional connections between network interfaces on two devices. The identified direct bi-directional connections, for example, are connections for a spanning tree. These connections are used to form the topology.
The topology generator 101 determines links in the topology of the network from the connection data. To determine the links, a list of connections is determined from the connection data. The connections represent candidate links and one or more of the connections may be selected as actual links in the topology through processing of the connection data. A connection or link is represented by a connection between a pair of network interfaces on different network devices. On the topology a line is drawn between the interfaces to show the link. A connection or link may be bi-directional. The connection or link is a direct connection without any intervening hops. The connection or link may be a connection in a spanning tree and there is one connection between two interfaces on two network devices. A network interface, also referred to herein as interface, may include hardware that takes information packaged by a network protocol and puts it into a format that can be transmitted over some physical medium like Ethernet, a fiber-optical cable, or wirelessly. A switch may include multiple interfaces for sending and receiving data over the network.
The topology generator 101 also identifies a list of unknown domains for which more information is needed. An unknown domain is a group of devices that the topology discovery process can determine are connected, but for which it cannot determine the exact nature of the connections. For example, the process may not be able to determine whether the devices in the group are connected through single-hop or multi-hop connections, or may not be able to determine the interfaces used for the connections. An unknown domain might be represented on a map as a cloud connected to each of the devices in the list.
The topology discovery process is described in further detail below. The topology generator 101 stores topology data gathered from the network devices 102 in a data storage 103 that may be used to determine the topology.
The topology generator 101 may include machine readable instructions executed by a computer system to perform the topology discovery process. The topology generator 101 may include a central computer system, such as a server, capturing topology data to generate the topology or the topology generator 101 may be a distributed application running on multiple computer systems. The topology generator 101 may include an application running on a server in a distributed computing system, such as a cloud computing system, that generates a topology for a network.
The topology generator 101 polls network devices in a network for ARP cache and MAC cache data, and uses this information to determine the topology of the network. The ARP cache is a table which stores mappings between Data Link Layer addresses (e.g., MAC addresses) and Network Layer addresses (e.g., Internet Protocol (IP) addresses). The process iterates across the routers in the network. The links associated with a single router are calculated for devices within the router's domain. The domain shall include, for example, any network devices that could be put into the ARP cache of the discovered router. From time to time a new device shall be discovered, or a device shall be deleted. The process periodically checks for such changes, and when one is found, the process is run on the router that has the device in its domain. Examples for the discovery process are provided below, followed by flowcharts describing the methods for the discovery process.
The following two examples describe the topology discovery process applied to a network. The examples discover connections between network devices shown in
Some examples of virtual interfaces are included. Virtual interfaces are one way of segregating traffic out of a physical interface. The topology discovery process accounts for virtual interfaces because they may show up in interface lists, ARP caches, and MAC caches. D-2.1 is an example of the name of a virtual interface; it is the first virtual interface on interface 2 of device D. The MAC address of a virtual interface is that of its physical interface. For instance, the MAC address of D-2.1 is M.D-2.
Table 1 below is an example of connection data, including input data from caches, from which network connections are determined.
B may be an older device, so MAC addresses for each interface are not available. Nevertheless, the topology is determined.
The first step is to determine the destinations which correspond to the cache entries. This may be done by adding a destination corresponding to each MAC address. Once the destinations are found, the cache entries are no longer needed.
Table 2 shows putting the inputs in standard form.
In this example, there are no destinations from a device to itself, so there is nothing to remove at this step. Destinations are then added so that each destination has a reverse destination; in this example, quite a few destinations are added. The inputs after making these changes are given in Table 3 below. Added destinations are indicated in bold.
B-3, C-1, D-1, E-1
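The two adjustments just described (dropping destinations from a device to itself and adding missing reverse destinations) can be sketched in Python. This is a minimal sketch, not the claimed implementation: the function name `standardize`, the dictionary layout (interface name mapped to a list of destination names), and the rule that the device name is the text before the first hyphen are all assumptions made for illustration, and the sample data is hypothetical.

```python
def standardize(destinations):
    """Drop same-device destinations, then add missing reverse destinations."""

    def device_of(name):
        # Assumed naming convention from the examples: "D-2.1" is on device "D".
        return name.split("-", 1)[0]

    # Step 1: remove destinations that point back at the same device.
    result = {
        iface: [d for d in dests if device_of(d) != device_of(iface)]
        for iface, dests in destinations.items()
    }
    # Step 2: make every destination two-way.
    for iface in list(result):
        for dest in list(result[iface]):
            reverse = result.setdefault(dest, [])
            if iface not in reverse:
                reverse.append(iface)
    return result


# Hypothetical input: A-1 lists B-3, but B-3 does not list A-1.
raw = {"A-1": ["B-3", "A-2"], "B-3": [], "C-1": ["A-1"]}
fixed = standardize(raw)
```

In this hypothetical input, the same-device destination `A-2` is dropped and the reverse destinations from `B-3` and `A-1` are filled in.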
At this point, interfaces with duplicate MAC addresses are removed. The general rule is to remove interfaces with names that sort alphabetically later. Interfaces with unknown MAC addresses are not removed. Typically, virtual interfaces are named similarly to physical interfaces, such as a name consisting of the physical interface name followed by a suffix of some sort. For this reason they come later in the alphabet, and are the interfaces that are removed. Any destinations on removed interfaces are moved to the interface with the same MAC address which is kept. So, for instance, C-1.1 is removed, and destinations D-1 and E-1 are moved to C-1.
In addition, any destination which points to an interface to be removed is changed to point to an interface with the same MAC address which is being kept. So in the case of C, destinations D-1.1 and E-1.1 are changed to D-1 and E-1 respectively when interfaces D-1.1 and E-1.1 are removed. These same destinations were previously added and should not be added twice.
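The duplicate-MAC rule can be illustrated with a short Python sketch. It assumes, purely for illustration, that the inputs are two dictionaries (interface name to MAC address, with `None` for an unknown MAC, and interface name to destination names) and that the device name is the part of the interface name before the first hyphen. The function name `merge_duplicate_macs` and the sample data are hypothetical.

```python
def merge_duplicate_macs(macs, destinations):
    """Fold interfaces that share a MAC on one device into the first by name."""

    def device_of(name):
        # Assumed naming convention: "C-1.1" is on device "C".
        return name.split("-", 1)[0]

    keeper = {}    # (device, MAC) -> interface that is kept
    replaced = {}  # removed interface -> interface that replaces it
    for name in sorted(macs):
        mac = macs[name]
        if mac is None:
            continue  # interfaces with unknown MACs are never removed
        key = (device_of(name), mac)
        if key in keeper:
            replaced[name] = keeper[key]
        else:
            keeper[key] = name

    merged = {}
    for name, dests in destinations.items():
        # Destinations on a removed interface move to the kept interface.
        target = merged.setdefault(replaced.get(name, name), [])
        for dest in dests:
            # Destinations pointing at a removed interface are redirected.
            dest = replaced.get(dest, dest)
            if dest not in target:
                target.append(dest)  # never add the same destination twice
    return merged


macs = {"B-1": None, "C-1": "M.C-1", "C-1.1": "M.C-1",
        "D-1": "M.D-1", "D-1.1": "M.D-1"}
destinations = {"B-1": ["C-1"], "C-1": ["B-1"], "C-1.1": ["D-1.1"],
                "D-1": [], "D-1.1": ["C-1.1"]}
merged = merge_duplicate_macs(macs, destinations)
```

Sorting the names before scanning is what makes the physical interface (the shorter name) the one that survives, which matches the alphabetical rule described above.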
Table 4 shows the inputs after interfaces with duplicate MAC addresses have been removed. The MAC column, which is no longer needed, is not shown.
Each device has at most one destination on each other device. No change is needed to meet this requirement. All interfaces have destinations, and thus no changes are needed. The connection data shown in Table 4 represents connections between network interfaces in network devices. Through processing described in further detail below, links in the topology may be determined from the connections.
Given below are the inputs in normal form, and a map of the devices, with no connections and domains. With each iteration the process adds a connection or unknown domain, and/or removes a destination and/or an interface.
In each iteration the process looks for the interface with the fewest destinations. For the first iteration either B-1, B-3, or D-2 could have been selected. As a rule the first interface found (top to bottom) is used; in this case SELECTED=B-1. Then a link is added from B-1 to C-1. The first iteration can be represented as follows:
The first interface found is SELECTED. The destination interface for SELECTED is REVERSE. The SELECTED device is BEGIN, and the REVERSE device is END. Destinations A-1, B-1, D-1, and E-1 are removed from C-1, along with the corresponding reverse destinations. In addition, interfaces B-1 and C-1 are removed. The result is as follows, with removed items indicated by strike-through.
The interface with the fewest number of destinations is B-3. Add a link from B-3 to A-1. That gives
Remove destinations B-3, D-1, and E-1 from A-1, as well as the reverse destinations. In addition, remove interfaces A-1 and B-3. The result is:
D-1 has one destination. Create a link from D-1 to B-2.
Remove the destinations on B-2, and remove B-2 and D-1. The result is:
D-2 has one destination. Create a link from D-2 to E-1.
Remove destinations on E-1 and remove E-1 and D-2. This yields:
At this point there are no interfaces left, so the process terminates. The result is a spanning tree, as desired.
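The claim that the four links found in this example form a spanning tree can be checked mechanically. The following Python sketch uses a union-find structure to confirm that the links connect devices A through E without a cycle. The helper names `device_of` and `is_spanning_tree` are hypothetical, and the rule that the device name precedes the first hyphen is an assumption based on the interface names used in the example.

```python
def device_of(name):
    # Assumed naming convention: interface "B-3" is on device "B".
    return name.split("-", 1)[0]


def is_spanning_tree(devices, links):
    """True if the links join all devices into one tree (no cycles)."""
    parent = {d: d for d in devices}

    def find(d):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[d] != d:
            parent[d] = parent[parent[d]]
            d = parent[d]
        return d

    for a, b in links:
        root_a, root_b = find(device_of(a)), find(device_of(b))
        if root_a == root_b:
            return False  # this link would close a cycle
        parent[root_a] = root_b
    # A cycle-free graph on n devices is a spanning tree iff it has n - 1 links.
    return len(links) == len(devices) - 1


# The four links created in the iterations of the first example.
links = [("B-1", "C-1"), ("B-3", "A-1"), ("D-1", "B-2"), ("D-2", "E-1")]
result = is_spanning_tree(["A", "B", "C", "D", "E"], links)
```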
The second example has some information missing, and the result is an unknown domain. Consider the same network as above, and the same cache data, except that device B information is missing. If the inputs are put into standard form, the following data for the calculation phase is determined:
D-2 has one destination. Create a link from D-2 to E-1.
Remove destinations on E-1 and remove E-1 and D-2. This yields:
All remaining interfaces have two destinations. Pick the device containing A-1, and the devices containing destinations of A-1 (namely C and D), and connect them with an unknown domain. Remove all destinations on interfaces on A, C, and D. The result is:
There are no destinations left, so the topology discovery process terminates.
Both ARP and MAC caches are typically real caches; their contents can age out and be removed over time if no traffic to the specified MAC address is received. To ensure that the caches contain the necessary data, some traffic may be sent to each of the devices (e.g. ping them) to make sure that their caches contain the needed information. Data collection for collecting the connection data may include polling for the appropriate data, using any appropriate management interface.
The following data structures containing the inputs are defined to allow a clear and precise definition of the topology discovery process. Other data structures may be used. The data structures are presented in a Java-like pseudo-code. The inputs consist of a Collection<Device> object. Collection indicates a generic list, and the name in the angle brackets indicates the type of the objects in the collection, in this case Device.
Instances of the Interface class contain information about a single interface, and its connection to others.
Instances of the Device class contain a collection of Interfaces and, in practical applications, other information.
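As a rough illustration, the two classes might look like the following. The sketch uses Python dataclasses rather than the Java-like pseudo-code, and every field name (`name`, `address`, `addresses`, `destinations`, `interfaces`) is an assumption chosen to match the surrounding description, not a name taken from the original listing.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Interface:
    name: str
    # MAC address of this interface; None when the device does not report it.
    address: Optional[str] = None
    # Addresses seen in the ARP/MAC caches through this interface.
    addresses: List[str] = field(default_factory=list)
    # Interfaces this one can reach. Excluded from repr and comparison
    # because two-way destinations make the object graph cyclic.
    destinations: List["Interface"] = field(
        default_factory=list, repr=False, compare=False
    )


@dataclass
class Device:
    name: str
    interfaces: List[Interface] = field(default_factory=list)


# Hypothetical usage: one two-way connection between D-2 and E-1.
d2 = Interface("D-2", "M.D-2")
e1 = Interface("E-1", "M.E-1")
d2.destinations.append(e1)
e1.destinations.append(d2)
devices = [Device("D", [d2]), Device("E", [e1])]
```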
The inputs can be used to populate a list of devices, each device with a list of interfaces, and the addresses populated from the address caches. The destinations are then populated by creating a Map (index) of addresses to interfaces, and for each address in the addresses, finding the corresponding interface.
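The index-based lookup can be sketched in a few lines of Python. For brevity the sketch works on plain dictionaries rather than full objects. The function name `populate_destinations`, both parameter names, and the choice to silently skip cached addresses that match no known interface are assumptions, and the sample addresses are hypothetical.

```python
def populate_destinations(own_address, cached_addresses):
    """Resolve cached MAC addresses to the interfaces that own them.

    own_address:      interface name -> its MAC address (None if unknown)
    cached_addresses: interface name -> MAC addresses seen in its caches
    """
    # Index of address -> interface. If several interfaces share a MAC
    # (for example virtual interfaces), the last one scanned wins here.
    index = {mac: iface for iface, mac in own_address.items() if mac is not None}
    return {
        iface: [index[mac] for mac in seen if mac in index]
        for iface, seen in cached_addresses.items()
    }


own = {"D-2": "M.D-2", "E-1": "M.E-1"}
cached = {"D-2": ["M.E-1", "M.X-9"], "E-1": ["M.D-2"]}
destinations = populate_destinations(own, cached)
```

The address `M.X-9` belongs to no known interface, so in this sketch it produces no destination.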
Connection data may be standardized and normalized for the topology discovery process. Examples of standardizing and normalizing are described above with respect to tables 1 through 4. For standardizing, the following conditions may be met. Firstly, if interface B is in the destinations of interface A, then interface A must be in the destinations of interface B. In other words, communication is two-way. Secondly, two interfaces on the same device should not have the same address. Sometimes the raw output from a device shows two interfaces with the same MAC address. This is typically because some of the interfaces are “virtual interfaces”. Virtual interface names are typically the physical interface name plus some additional characters. For instance, if “i1” is a physical interface, “i1-1” and “i1-2” are most likely virtual interfaces. One way to select a single interface if more than one has the same address is to pick the first one in alphabetical order, since that is normally the physical interface. Thirdly, for any two devices A and B, there should be at most one interface on A that has an interface on B as a destination (unique interface per destination).
The next section describes how to convert an arbitrary collection of devices into one that meets the above conditions.
The following may be performed to normalize connection data. For each device, remove any destinations that are to interfaces on the same device. Connections between a device and itself are not calculated. Also, for each device, check all the destinations on its interfaces. If interface A has a destination interface B, and A is not a destination interface for B, add A as a destination interface of B. Also, for each device, check to see that each interface on this device has a different address. If two interfaces A and B are found with the same address, find the interface that has the name last in alphabetical order (assume this is B). Then add all destinations of B to A's destinations, and modify all the destinations that go to B to instead go to A. Then remove interface B.
Also, for each pair of devices A and B, find all destinations to B on interfaces on A. If there is more than one such destination, remove all but one of them. Also, remove any interfaces that have no destinations.
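The last two normalization steps might be sketched in Python as follows. The text does not say which of several competing destinations to keep, so this sketch keeps the first one encountered and also removes the matching reverse destination to preserve two-way consistency; both choices are assumptions. The function name and the sample data are hypothetical, and device names are assumed to be the text before the first hyphen.

```python
def one_destination_per_device(destinations):
    """Keep one destination per ordered device pair, then drop empty interfaces."""

    def device_of(name):
        # Assumed naming convention: "A-2" is on device "A".
        return name.split("-", 1)[0]

    result = {iface: list(dests) for iface, dests in destinations.items()}
    kept = {}  # (device A, device B) -> (interface on A, interface on B)
    for iface in list(result):
        for dest in list(result[iface]):
            pair = (device_of(iface), device_of(dest))
            first = kept.setdefault(pair, (iface, dest))
            if first == (iface, dest):
                # Reserve the reverse direction for the same two interfaces.
                kept.setdefault(pair[::-1], (dest, iface))
                continue
            # A different connection between the same devices: remove both ways.
            result[iface].remove(dest)
            if iface in result.get(dest, []):
                result[dest].remove(iface)
    # Interfaces left without destinations are removed.
    return {iface: dests for iface, dests in result.items() if dests}


data = {"A-1": ["B-1"], "A-2": ["B-2"], "B-1": ["A-1"], "B-2": ["A-2"], "C-1": []}
normalized = one_destination_per_device(data)
```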
The flowcharts in
A set of candidate selections of links in use for network interfaces of network devices in the network is determined. This may include connection data that has been standardized and/or normalized. As shown in the methods 300-500 in
As shown in
At 302, the topology generator 101 finds an interface with a minimum number of destinations. This interface is referred to as the SELECTED interface. In example 1, the interface B-1 is SELECTED. The destination interface for SELECTED is REVERSE. The SELECTED device is BEGIN, and the REVERSE device is END.
At 303, the topology generator 101 determines if SELECTED interface has 0 destination interfaces (i.e., destinations). If yes, SELECTED is removed from the connection data at 304. Removing SELECTED from the connection data means that the interface is considered not to be part of any connections that can be selected as a link in the topology based on the current processing. In other words, connections for the removed interfaces may no longer be part of candidate links that can be selected for inclusion in the topology. Removing may include marking, flagging or indicating through another way that the interface may not be used to identify other links in the topology when performing the iterations. SELECTED, however, may not actually be deleted from the data storage, and may be used in future processing to determine links for the topology.
If no at 303, the topology generator 101 determines if SELECTED has 1 destination interface at 305. If yes, a link is added to the topology between SELECTED and REVERSE, which is the destination interface, at 306. This process is also described with respect to example 1 above. For example, a link is added from B-1 to C-1 in the first iteration of example 1. In the first iteration SELECTED=B-1 and REVERSE=C-1. After 306, processing proceeds to the method 400 shown in
At 401, the topology generator 101 identifies REVERSE. For example, REVERSE=C-1 in the first iteration of example 1. At 402, the topology generator 101 identifies the device for SELECTED (i.e., BEGIN) and the device for REVERSE (i.e., END). For example, in the first iteration of example 1, BEGIN=B and END=C.
At 403, the topology generator 101 removes SELECTED and REVERSE from their devices in the connection data. At 404, the topology generator 101 determines if there are any devices that have a destination to both BEGIN and END. These devices are each referred to as MIDDLE. If yes, the topology generator 101 removes destinations from MIDDLE to END at 405. For example, in iteration 1 destinations A-1, B-1, D-1, and E-1 are removed from C-1, and the corresponding reverse links. In addition, interfaces B-1 and C-1 are removed. If no at 404, processing proceeds back to the method 300 as shown.
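The test at 404 can be sketched as a small Python helper that lists the MIDDLE devices for a given BEGIN and END. The function names and the dictionary layout are assumptions, and the input below is only a fragment of the first example's connection data, limited to destinations mentioned in the description.

```python
def device_of(name):
    # Assumed naming convention: interface "C-1" is on device "C".
    return name.split("-", 1)[0]


def middle_devices(destinations, begin, end):
    """Devices, other than BEGIN and END, with destinations on both of them."""
    reach = {}  # device -> set of devices it has destinations on
    for iface, dests in destinations.items():
        reach.setdefault(device_of(iface), set()).update(
            device_of(d) for d in dests
        )
    return sorted(
        dev for dev, seen in reach.items()
        if dev not in (begin, end) and begin in seen and end in seen
    )


# Fragment of the first iteration of example 1: SELECTED=B-1, REVERSE=C-1.
fragment = {
    "A-1": ["B-3", "C-1", "D-1", "E-1"],
    "B-1": ["C-1"],
    "C-1": ["A-1", "B-1", "D-1", "E-1"],
}
middle = middle_devices(fragment, "B", "C")
```

With only this fragment available, device A is the one MIDDLE device found, since A-1 has destinations on both B and C.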
At 305 in the method 300, if SELECTED is determined to have more than one destination, then processing proceeds to the method 500 shown in
The methods 300 through 500 may be performed in a single iteration. The iteration is then repeated until no interface on any device contains a destination; as shown at 301, processing goes to DONE if the answer at 301 is no. Each iteration either adds a link or an unknown domain and may remove one or more destinations and/or one or more interfaces, as described. The number of iterations may be proportional to the number of connections and unknown domains found, which is in turn proportional to the number of network devices, since layer 2 devices are connected via a spanning tree. In a switched Ethernet network with complete information, the discovery process produces a spanning tree solution which corresponds to the actual spanning tree in the network.
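The overall loop can be sketched in Python as follows. This is a simplified sketch that follows the worked examples rather than every branch of the flowcharts: when a link is added, all destinations on REVERSE are removed together with their reverse destinations, and when an unknown domain is added, all destinations on or pointing to the member devices are cleared. The function names, the dictionary layout, and the input data (shaped like the second example, whose table is not reproduced here) are assumptions.

```python
def device_of(name):
    # Assumed naming convention from the examples: "D-2" is on device "D".
    return name.split("-", 1)[0]


def discover(destinations):
    """Return (links, unknown_domains) from normalized connection data."""
    dests = {iface: list(d) for iface, d in destinations.items()}
    links, domains = [], []

    def drop_interface(iface):
        # Remove the interface and every destination that points at it.
        dests.pop(iface, None)
        for others in dests.values():
            if iface in others:
                others.remove(iface)

    while dests:
        # Fewest destinations wins; ties go to the first interface listed.
        selected = min(dests, key=lambda i: len(dests[i]))
        targets = dests[selected]
        if not targets:
            drop_interface(selected)
        elif len(targets) == 1:
            reverse = targets[0]
            links.append((selected, reverse))
            drop_interface(reverse)
            drop_interface(selected)
        else:
            members = {device_of(selected)} | {device_of(t) for t in targets}
            domains.append(sorted(members))
            for iface in dests:
                if device_of(iface) in members:
                    dests[iface] = []
                else:
                    dests[iface] = [
                        d for d in dests[iface] if device_of(d) not in members
                    ]
    return links, domains


# Assumed input resembling the second example (device B missing).
example = {
    "A-1": ["C-1", "D-1", "E-1"],
    "C-1": ["A-1", "D-1", "E-1"],
    "D-1": ["A-1", "C-1"],
    "D-2": ["E-1"],
    "E-1": ["A-1", "C-1", "D-2"],
}
links, domains = discover(example)
```

On this input the sketch adds the link from D-2 to E-1 and then groups A, C, and D into one unknown domain, as in the second example. Iterations that only discard an interface with no destinations add neither a link nor a domain, so the loop always terminates.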
The topology discovery process can be performed in layer 2 networks. The process may also be performed for layer 3 IP networks, as is shown in
The topology discovery process described above can generate the topology even if complete information is not available. If partial information is provided, the process provides accurate network connections at least in parts of the network where sufficient information is available. Also, it identifies unknown domains for which additional information is needed.
The computer system 700 includes a processor 702 that may implement or execute machine readable instructions performing some or all of the methods, functions and other processes described herein. Commands and data from the processor 702 are communicated over a communication bus 709. The computer system 700 also includes a main memory 707, such as a random access memory (RAM), where the machine readable instructions and data for the processor 702 may reside during runtime, and a secondary data storage 708, which may be non-volatile and stores machine readable instructions and data. The memory and data storage are examples of computer readable mediums.
The computer system 700 may include an I/O device 710, such as a keyboard, a mouse, a display, etc. The computer system 700 may include a network interface 712 for connecting to a network. Other known electronic components may be added or substituted in the computer system 700.
While the embodiments have been described with reference to examples, various modifications to the described embodiments may be made without departing from the scope of the claimed embodiments.
The present application claims priority to U.S. provisional patent application Ser. No. 61/467,850, filed Mar. 25, 2011, and entitled “Network Topology Discovery”.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US2011/055749 | 10/11/2011 | WO | 00 | 8/12/2013 |
Number | Date | Country |
---|---|---|
61467850 | Mar 2011 | US |