1. Field of the Invention
The present invention relates generally to adaptive routing systems and methods, as well as link state routing systems and methods. More particularly, the present invention relates to building an independent cognitive learning component into a link state routing protocol.
2. Description of Related Art
Distance-vector and link-state routing protocols are two major classes of distributed routing protocols. Both classes are interior gateway protocols that operate within a routing domain or an autonomous system. Network nodes such as computers are coupled together by intra-domain routers running various types of routing protocols.
Distance-vector routing protocols base routing decisions for a given destination node on the distance to that destination. The distance may be measured in number of hops, delay time, packets lost, etc. Each router that operates using a distance-vector protocol stores a routing table that contains the distance from the router to other routers. Each router advertises its routing table to directly connected neighbors and receives similar advertisements from the neighbors. The received distances are used by each router to update its respective routing table. The advertisement cycle continues until each router converges to stable values. Distance-vector routing protocols typically adopt the Bellman-Ford algorithm.
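The advertisement cycle described above can be illustrated with a minimal sketch. The function names `dv_update` and `converge`, the three-router topology, and the hop-count costs are assumptions made for illustration only; an actual distance-vector protocol would also record the next hop and handle link failures.

```python
from typing import Dict

INF = float("inf")


def dv_update(own: Dict[str, float], link_cost: float,
              advertised: Dict[str, float]) -> bool:
    """Merge one neighbor's advertised distances into our own table.

    Returns True if any entry improved, meaning another round is needed.
    """
    changed = False
    for dest, dist in advertised.items():
        candidate = link_cost + dist  # Bellman-Ford relaxation step
        if candidate < own.get(dest, INF):
            own[dest] = candidate
            changed = True
    return changed


def converge(links: Dict[str, Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Repeat advertisement rounds until every table is stable."""
    tables = {node: {node: 0.0} for node in links}
    changed = True
    while changed:
        changed = False
        # Each round uses the tables as they stood when the round started.
        snapshot = {n: dict(t) for n, t in tables.items()}
        for node, neighbors in links.items():
            for nbr, cost in neighbors.items():
                if dv_update(tables[node], cost, snapshot[nbr]):
                    changed = True
    return tables


# Hypothetical line topology A - B - C with hop-count costs.
links = {"A": {"B": 1}, "B": {"A": 1, "C": 1}, "C": {"B": 1}}
tables = converge(links)
```

In this sketch router A learns its distance to C only in the second round, after B has advertised its own distance to C, which is the gradual convergence behavior noted above.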
Unlike routers that use a distance-vector routing protocol, routers that use a link-state routing protocol possess information about the full network topology. Examples of link state routing protocols (LSRPs) include the Open Shortest Path First (OSPF) and Intermediate System-to-Intermediate System (IS-IS) protocols. Each routing node (link state enabled router) that uses a link state protocol maintains a link state database (LSDB) that stores link state information, which is a tree-like image of the entire network topology. Each routing node runs a flooding algorithm that periodically floods its neighboring nodes with information about its directly connected links. A typical flooding algorithm allows a routing node to send information to all the other routing nodes in the same routing domain, except over the links on which the new piece of information was received. Each node receiving the new information then updates its respective link state database with the new information. The flooding algorithm ensures that all routers within a routing domain converge on the same topological information within a finite period of time.
Nodes that use a link state routing protocol independently calculate the best next hop for every possible destination in the network using the link-state information. The collection of the best next hops forms a routing table for the respective router. The Dijkstra algorithm is often adopted in link state routing protocols to obtain the best path by finding the shortest route through a network from the source to the destination node. In the Dijkstra algorithm, the length of an individual link in the path is described by a cost, which could be assigned by a network management system. The length, or the total cost, of the path is then defined as the sum of this cost over all of the links that make up the path.
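As a non-limiting illustration of the calculation described above, the following sketch computes, for one source node, the total path cost and the best next hop to every reachable destination. The function name `shortest_paths`, the node labels and the link costs are hypothetical and are not taken from any figure.

```python
import heapq
from typing import Dict, Tuple

Graph = Dict[str, Dict[str, float]]


def shortest_paths(graph: Graph, source: str) -> Dict[str, Tuple[float, str]]:
    """Return {destination: (total_cost, next_hop)} using Dijkstra's algorithm."""
    best: Dict[str, Tuple[float, str]] = {}
    # Heap entries: (cost so far, node, first hop taken from the source).
    heap = [(0.0, source, source)]
    while heap:
        cost, node, first_hop = heapq.heappop(heap)
        if node in best:
            continue  # already settled at an equal or lower cost
        best[node] = (cost, first_hop)
        for nbr, link_cost in graph.get(node, {}).items():
            if nbr not in best:
                hop = nbr if node == source else first_hop
                heapq.heappush(heap, (cost + link_cost, nbr, hop))
    del best[source]  # the routing table lists only other nodes
    return best


# Hypothetical four-node topology with assigned link costs.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
routing_table = shortest_paths(graph, "A")
```

Here the path cost is the sum of the costs of the links on the path, so node A reaches D through B and C at a total cost of 4 rather than over the direct but more expensive links.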
Link state routing protocols can provide fast convergence after a link failure and low delay at low load by using minimum hop paths. However, while giving high performance at low load, link state protocols often waste network capacity by forcing all routes to follow the shortest path. With limited capacity, most networks cannot support the maximum possible number of data flows if “optimal paths” are calculated solely by a shortest-path algorithm.
In non-dynamic networks, a centralized approach such as traffic engineering provides a much higher network capacity than distributed routing protocols. Traffic measurements, topology collection and routing parameter configuration are performed externally from the routers, for example, by an operational network or a network management system. Based on a network-wide view of and access to the network metrics, traffic engineering can bring more stability to the routing protocol, consume less bandwidth with less transmission overhead and incorporate more diverse performance constraints. Where multiple shortest paths exist between a pair of routers, traffic engineering may offer better load balancing by selecting the outgoing links to minimize the worst case traffic congestion at any node in the network.
However, such centralized optimization is not suitable to dynamic networks, such as Mobile Ad Hoc Networks. In these networks, cognitive routing may provide higher capacity than distributed routing protocols. In cognitive routing, the routers may adapt to the network environment through learning techniques and improve the route selection. For example, a Q-routing protocol based on the Q-learning framework offers a reinforcement-based routing protocol. In Q-routing, each network node has a separate logic controller that performs independent decision-making and policy selecting.
However, Q-routing protocols also have limitations. These limitations include the inability to fine tune routing policies especially at low network load level and the inability to learn new optimal policies under decreasing load conditions. While offering higher capacity, cognitive routing also takes time to learn the best paths. This results in worse performance than conventional distributed routing protocols at low load, for example, in terms of less path availability, longer convergence time and path delay.
In summary, the prior work in distributed routing generally falls into two distinct categories. The first category of routing solutions focuses on performance (e.g., providing less link delay through using least cost paths) at the cost of network capacity. The second category of routing solutions focuses on optimizing network capacity at the expense of performance (e.g., longer link delay while exploring new topologies). Each category of approaches only partially solves the problem. Furthermore, the second category of approaches focuses only on the less popular distance vector routing protocols. Thus, a holistic solution to the underlying problem, namely, the need for robust, high performance and high capacity routing mechanisms, is desired.
Aspects of the invention include a routing method and system that combine a cognitive learning module with a link state protocol.
In one embodiment of the invention, a method is provided for obtaining routing paths in a communication network employing a link state protocol. The communication network includes a plurality of network nodes and a plurality of communication links connecting the plurality of network nodes. The method comprises receiving periodically, at one of the plurality of the network nodes, link state information of one or more of the plurality of communication links; storing received link state information for a predetermined period of time, wherein the stored link state information includes historical link state information; and determining, through a learning algorithm, routing paths to other network nodes of the plurality of network nodes based on the stored link state information.
In one example, calculating routing paths comprises calculating a shortest network path based on a Dijkstra algorithm.
In another example, the link state information comprises path cost metrics describing link delay and queue lengths at the network nodes.
In a further example, the learning algorithm is a Q-learning algorithm.
In one alternative, the method comprises periodically sampling, through the learning algorithm, the stored link state information.
In another alternative, calculating routing paths further comprises calculating routing paths at a predetermined time interval.
In a further alternative, the predetermined time interval comprises a time corresponding to receipt of link state information.
In yet another example, the method comprises adapting with different learning ratios.
In yet another alternative, the method comprises discovering neighboring nodes by periodically sending a hello message to neighboring nodes within a predetermined hop count.
In another embodiment of the invention, a communication apparatus is provided in a communication network. The communication network includes a plurality of communication devices and a plurality of communication links connecting the plurality of communication devices. The communication apparatus connects to at least one of the plurality of communication devices over at least one of the communication links. The communication apparatus comprises a communication interface for periodically receiving link state information about one or more of the plurality of communication links from other communication devices, a processor in connection with the communication interface, and a first memory coupled to the processor and containing a set of instructions executable by the processor. The set of instructions is executable to execute a link state protocol; store, in the first or a second memory, received link state information for a predetermined period of time, wherein the stored link state information includes current and historical link state information; and determine, through a learning algorithm, routing paths to the connected communication devices based on the stored link state information.
In one example, the communication apparatus comprises instructions to calculate shortest paths based on a Dijkstra algorithm.
In another example, the link state information comprises path cost metrics describing respective link delay and queue lengths at the network nodes.
In a further example, the learning algorithm is a Q-learning algorithm.
In one alternative, the communication apparatus comprises instructions to periodically sample, through the learning algorithm, the stored link state information.
In another alternative, the routing paths are calculated on a predetermined basis regardless of whether a link state database on said network node is updated.
In a further alternative, the predetermined basis is every time link state information is received.
In yet another example, the communication apparatus comprises instructions to adapt with different learning ratios.
In yet another alternative, the communication apparatus comprises instructions to discover neighbor nodes through periodically sending a hello message to neighbor nodes, wherein the neighbor nodes are within a predetermined hop count.
In a further embodiment of the invention, a network node is provided. The network node comprises a communication interface for periodically receiving link state information from a plurality of other network nodes, a memory storing executable instructions, and a processor operable to execute the instructions to store received link state information for a predetermined period of time and process the stored link state information to determine a routing path based on the stored link state information using an adaptive learning process. The stored link state information includes historical link state information.
In one example, the adaptive learning process comprises a Q-learning algorithm.
Aspects, features and advantages of the system and method will be appreciated when considered with reference to the following description of exemplary embodiments and accompanying figures. The same reference numbers in different drawings may identify the same or similar elements. Furthermore, the following description is not limiting; the scope of the invention is defined by the appended claims and equivalents.
In accordance with aspects of the system and method, a network node in a communication network maintains a routing table that contains paths to all reachable destination nodes in the network. The network runs a link state routing protocol. The network node receives periodic disseminations of link state information from neighboring nodes in the network. The link state information includes neighboring node identity and link cost metrics. The network node calculates the initial routing paths based on the received link state information by using a link state routing algorithm. The network node then adapts the calculated paths based on both newly received link state information and past link state information through a reinforcement learning approach. It then selects a routing path to each destination node based on the adaptation and updates the routing table accordingly.
As shown in
The memory 210 stores information accessible by processor 208, including instructions 258 and data 262 that may be executed or otherwise used by the processor 208. The memory 210 may be of any type capable of storing information accessible by the processor, including a computer-readable medium, or other medium that stores data that may be read with the aid of an electronic device, such as a hard-drive, memory card, ROM, RAM, DVD or other optical disks, as well as other write-capable and read-only memories. Systems and methods may include different combinations of the foregoing, whereby different portions of the instructions and data are stored on different types of media.
The instructions 258 may be any set of instructions to be executed directly (such as machine code) or indirectly (such as scripts) by the processor. For example, the instructions may be stored as computer code on the computer-readable medium. In that regard, the terms “instructions” and “programs” may be used interchangeably herein. The instructions may be stored in object code format for direct processing by the processor, or in any other computer language including scripts or collections of independent source code modules that are interpreted on demand or compiled in advance. The instructions may contain various algorithms, including the path finding algorithms 259 used to find a best path from the router's network address to every possible destination node in the network 90. Adaptive learning (or reinforcement learning) algorithms 260, such as Q-learning algorithms, can also be instantiated in the instructions.
The processor 208 may be any conventional processor, such as processors from Intel Corporation or Advanced Micro Devices. Alternatively, the processor may be a dedicated device such as an ASIC. Although
The data 262 may be retrieved, stored or modified by processor 208 in accordance with the instructions 258. For instance, although the system and method is not limited by any particular data structure, the data may be stored in computer registers, in a relational database as a table having a plurality of different fields and records, XML documents or flat files. The data may also be formatted in any computer-readable format, and may comprise any information sufficient to identify the relevant information, such as numbers, descriptive text, proprietary codes, references to data stored in other areas of the same memory or different memories (including other network locations) or information that is used by a function to calculate the relevant data.
Data 262 may include the types of data structures described above in router 202 in
Each routing server also maintains a routing table 266, which stores routing paths that are calculated by the path finding algorithms 259. Neighbor table 268 may be used to store a list of neighbor routing nodes within a predetermined hop count, e.g., 1-hop or 2-hop, to facilitate periodic neighbor discovery. A topology table 270 may be constructed from the link state information and represents the topology of the network domain to which the router belongs. A learning database 272 may be used to store historical link state data to support the learning algorithms on the router, such as adaptive learning algorithm 260.
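By way of illustration only, one way a learning database such as database 272 might retain historical link state data for a predetermined period is sketched below. The class name `LearningDatabase`, its methods, the retention value and the use of plain numeric timestamps are assumptions for this sketch and are not prescribed by the description.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple


class LearningDatabase:
    """Keeps timestamped cost samples per link for a fixed retention window."""

    def __init__(self, retention_seconds: float) -> None:
        self.retention = retention_seconds
        # link identifier -> queue of (timestamp, cost), oldest first
        self._samples: Dict[str, Deque[Tuple[float, float]]] = defaultdict(deque)

    def record(self, link_id: str, cost: float, now: float) -> None:
        """Store a newly received cost value and drop expired samples."""
        self._samples[link_id].append((now, cost))
        self._expire(link_id, now)

    def history(self, link_id: str, now: float) -> List[float]:
        """Return the costs still inside the retention window, oldest first."""
        self._expire(link_id, now)
        return [cost for _, cost in self._samples[link_id]]

    def _expire(self, link_id: str, now: float) -> None:
        queue = self._samples[link_id]
        while queue and now - queue[0][0] > self.retention:
            queue.popleft()


# Hypothetical usage: a 10-second window and three samples for one link.
db = LearningDatabase(retention_seconds=10.0)
db.record("406a", 5.0, now=0.0)
db.record("406a", 7.0, now=6.0)
db.record("406a", 9.0, now=12.0)
recent = db.history("406a", now=12.0)
```

In this usage the first sample has aged out by the time the third arrives, so only the two most recent cost values remain available to the learning algorithm.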
Operations in accordance with one or more aspects of the invention will now be described with reference to
As shown in
In block 306, the receiving node determines whether a learning component is enabled or not. If the learning component is disabled, the hybrid routing system proceeds to block 310, where it calculates the paths based only on a link state routing algorithm. If the learning component is enabled, then the system decides, in block 308, if the relevant destination nodes need their routing paths to be initialized, i.e., a determination is made whether it is the first time that the receiving node is calculating a routing path for a particular destination node. This may happen, for example, during the initialization stage of a network domain. It may also happen when either the receiving node or the destination node or both are newly added to the network.
The system resorts to the conventional path selection algorithm in block 310 if it is calculating for the first time the path for the destination node in question. If at block 308 it is determined that the path is not a new path, the receiving node then proceeds to block 312, where it calculates paths leading to the possible destinations based on both the currently received link state information and the historic values of the respective link paths, which are captured through a cognitive learning process. Finally, in block 314, the routing table is updated with the newly calculated paths. Thus, in the hybrid routing system, past link state information, if available, is used in determining the current path for routing data to a destination point. That path is then added to the routing table for the receiving node.
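The decision flow of blocks 306 through 312 may be summarized, for a single destination, by the following sketch. The function `select_path_cost`, the `blend` helper and the ratio of 0.5 are hypothetical; the learning step is passed in as a parameter because the description does not limit it to any particular algorithm.

```python
from typing import Callable, Optional


def select_path_cost(
    learning_enabled: bool,
    previous_estimate: Optional[float],
    link_state_cost: float,
    learn: Callable[[float, float], float],
) -> float:
    """Return the cost estimate to store for one destination."""
    if not learning_enabled:
        # Block 306 -> block 310: link state routing algorithm only.
        return link_state_cost
    if previous_estimate is None:
        # Block 308 -> block 310: first calculation for this destination.
        return link_state_cost
    # Block 312: combine historic and currently received information.
    return learn(previous_estimate, link_state_cost)


def blend(old: float, new: float) -> float:
    """Assumed learning step: move halfway from the old estimate to the new."""
    return old + 0.5 * (new - old)


standard = select_path_cost(False, 4.0, 10.0, blend)
first_time = select_path_cost(True, None, 10.0, blend)
hybrid = select_path_cost(True, 4.0, 10.0, blend)
```

The result would then be written to the routing table as in block 314; only the third call draws on past link state information.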
Further, in a conventional link state protocol the paths in the routing table are not recalculated unless the information in the link state database is changed. In contrast, in a hybrid routing system, the recalculation is scheduled regardless of changes to the link state database, and may be done each time the database is updated or periodically. This allows capturing of link path changes or stability over time as part of the learning process.
As illustrated, each routing node executes a link state protocol and preferably comprises functional modules such as a neighbor discovery module 410, a topology representation module 412, a topology dissemination module 414, a link state database 416, a route calculation module 418, and a routing/forwarding table 424.
In system 400, each link state enabled routing node periodically builds and sends a Hello message 408 to its neighbors. More specifically, neighbor discovery module 410 on router 402a, based on the network topology map in 412, floods the Hello message to its one-hop neighbors who are listening on a well-known port. For example, in
Each routing node in system 400 may periodically obtain or perform a series of tests to obtain the cost associated with the link to each of its neighbors. For example, node 402c may measure the cost for link 406a leading to node 402a, the cost for link 406b connecting node 402b, and the cost for link 406c connecting node 402d. These costs may be measures of end-to-end delay, throughput, etc.
Link state advertisement messages 404 are also periodically built by each routing node and broadcast to the neighboring nodes in system 400. For example, router 402a receives the link state advertisement message 404a sent from router 402c that describes the costs of link paths 406a-c. Each link state advertisement message 404 may include a link identifier that identifies a connected link 406, a link state field that may be used to identify the state of the link (e.g., active, idle, congested, unavailable, etc.), one or more cost metric fields that contain the cost values of the connected link, and a link neighbor field that may be used to identify adjacent connected routing nodes.
The link state packet may also contain information identifying the initiating node, and a sequence number that monotonically increases each time the initiating node constructs a new link state packet. The link state advertisement may also include age parameters to prevent a packet from wandering in the network for an indefinite period of time. Upon receiving link state advertisements, each node may further disseminate the packets to its neighbors so that all the routing nodes in a network rapidly obtain the most updated link state advertisements. The specific data structures of a link state advertisement may vary among different routing protocols and the details may be found in the standards for each protocol.
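The fields described above, and the use of the sequence number and age parameter when deciding whether to install and further disseminate an advertisement, can be sketched as follows. The field names, the `LinkStateDatabase.accept` method and the simple comparison rule are assumptions for illustration; actual protocols such as OSPF define their own packet formats and freshness rules.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class LinkStateAdvertisement:
    origin: str                    # initiating node
    sequence: int                  # increases with each new packet
    age: int                       # remaining lifetime (assumed unit: hops)
    link_id: str                   # identifies the connected link
    state: str                     # e.g. "active", "idle", "congested"
    costs: Dict[str, float] = field(default_factory=dict)
    neighbors: List[str] = field(default_factory=list)


class LinkStateDatabase:
    def __init__(self) -> None:
        self.entries: Dict[Tuple[str, str], LinkStateAdvertisement] = {}

    def accept(self, lsa: LinkStateAdvertisement) -> bool:
        """Install lsa if it is fresh; return True if it should be re-flooded."""
        if lsa.age <= 0:
            return False  # expired packets are dropped, not forwarded
        key = (lsa.origin, lsa.link_id)
        current = self.entries.get(key)
        if current is not None and current.sequence >= lsa.sequence:
            return False  # duplicate or stale advertisement
        self.entries[key] = lsa
        return True


lsdb = LinkStateDatabase()
first = LinkStateAdvertisement("402c", 1, 5, "406a", "active", {"delay": 3.0}, ["402a"])
newer = LinkStateAdvertisement("402c", 2, 5, "406a", "congested", {"delay": 9.0}, ["402a"])
results = [lsdb.accept(first), lsdb.accept(first), lsdb.accept(newer)]
```

Refusing to re-flood a duplicate is what keeps the dissemination from looping, while the newer sequence number replaces the stored entry for the same link.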
Each link state advertisement message describes the operating conditions of a link path 406 in terms of a cost metric associated with the link. The cost of a link path may be described by static cost metrics of the corresponding link path where the costs are measured at a given time. The cost may also be described by dynamic cost metrics of the corresponding link path where the link performance parameters may be obtained by multiple samplings over a prescribed period of time. The dynamic cost values may be further processed through various statistical processes before being included in the link state advertisement messages. For example, a statistical normalization process may be applied to the sampled cost values over the sample period to obtain a statistical mean or a median value of the sampled data.
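A minimal sketch of reducing sampled cost values to a single dynamic cost metric is given below. The function name `summarize_cost`, the `method` parameter and the sample delays are illustrative assumptions.

```python
from statistics import mean, median
from typing import Sequence


def summarize_cost(samples: Sequence[float], method: str = "mean") -> float:
    """Collapse the cost samples of one sample period into one value."""
    if not samples:
        raise ValueError("at least one sample is required")
    if method == "mean":
        return mean(samples)
    if method == "median":
        return median(samples)
    raise ValueError(f"unknown method: {method}")


# Hypothetical delay samples (in milliseconds) containing one outlier.
delays = [10.0, 12.0, 50.0]
mean_delay = summarize_cost(delays, "mean")
median_delay = summarize_cost(delays, "median")
```

With these samples the mean is pulled up to 24.0 by the single slow measurement while the median stays at 12.0, which illustrates why the choice of statistic affects the advertised cost.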
Embodiments of the invention may utilize various dimensions of static and/or dynamic link cost metrics to express the link costs. For example, the cost metric may describe the costs in terms of available bandwidth, utilized bandwidth, link congestion, queue size at a router, link reliability, one-trip or round-trip transmission delay, etc.
Upon receiving the updated link state advertisements from other routing nodes, the router 402a updates its link state database 416 for each respective link path. Database 416 stores the most up to date information obtained from the link state advertisements 404 and Hello messages 408. This information forms a complete topology of network 412. The topology map represents all the network nodes in a weighted graph, such as the network graph shown in
Based on the current weighted graph of the whole network, the path finding module 418 may be configured to perform link state routing computations for determining optimal paths to transmit packet data to all possible destination nodes. The link state path selections may be performed based on a predetermined routing algorithm and the cost metrics in the link state database. Specific calculation algorithms may be based on, for example, Dijkstra's shortest-path algorithm, where the routing node finds the path with least cost to each destination node. The optimal path selected based on cost may then be used to construct a routing table 424.
As shown in
In one embodiment of the invention, the learning component may be enabled to allow a node in the network to operate in hybrid mode. In hybrid mode, a node or router may determine path selection using reinforcement learning techniques. For example, in system 400, the learning component 422 in router 402a may execute a Q-learning process to select a path based not only on the instant path cost metrics received from the link state advertisements, but also on the historic cost metrics of the respective link paths.
As explained above, the learning resource component 422 may be flexibly combined with any specific link state routing protocol to support diverse applications and smooth migration to new routing protocols. For example, the component may be integrated with OSPF in wired networks and OLSR in wireless MANETs at the same time, or be integrated with any new link state routing protocols in the future. By separating the cognitive process from the standard protocol adopted by the routers, the component-based approach also allows a transparent understanding of the impact of a specific type of learning approach or algorithm. This may be achieved by studying the difference in routing performance between the router operating in the standard mode (without the learning component) and in the hybrid mode (with the learning component). Furthermore, the component-based approach provides a framework for adding and combining different learning modules. Thus, when new learning approaches become available, the framework allows easy upgrade to new hybrid routing protocols.
Source nodes s1 and s2 learn of changes in the link costs of paths 102 and 104 through the link state advertisements received from nodes d1, d2 and x3. The Q-learning process on the source nodes captures the knowledge of the delay in these paths and uses the knowledge to train the path selection accordingly. For example, in a Q-learning process, the value of Qs1(d1, x3) provides an estimate of the time necessary for data to reach destination node d1 from source node s1 through relay node x3. Generally, the Q value of Qs1 may take account of the following transmission delay factors: journey time from x3 to d1 (t) and journey time from source node s1 to relay node x3 (s). In general, the transmission delays s and t may depend on factors such as the raw bandwidth, noise and interference with other transmitters. In addition, packets may be delayed in the queues at source node s1 (delay “p”) and relay node x3 (delay “q”). These queuing delays depend on the number of flows passing through the node, and too many flows (congestion) will increase packet delay. At a given time, a new set of the above delay factors may be assembled from the most updated version of the link state advertisements. A learning ratio α (0<α<1) may then be applied to these factors and the old Qs1 value to obtain the new Qs1 value:
$$Q_{s1}^{new}(d1, x3) = Q_{s1}^{old}(d1, x3) + \alpha \left( t + q + s + p - Q_{s1}^{old}(d1, x3) \right)$$
Over time, the impact of changes in delay along path 102 may gradually take over the previously selected path Qs1
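The update described above, in which a learning ratio α is applied to the newly assembled delay factors t, q, s and p and to the old Q value, can be sketched numerically as follows. The function name `q_update`, the delay values and the chosen ratios are assumptions for illustration, and the sketch uses the standard form in which the old estimate is moved toward the newly observed total delay by the fraction α.

```python
def q_update(q_old: float, t: float, q: float, s: float, p: float,
             alpha: float) -> float:
    """Move the old delay estimate toward the newly observed total delay."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("learning ratio must satisfy 0 < alpha < 1")
    observed = t + q + s + p  # transmission delays plus queuing delays
    return q_old + alpha * (observed - q_old)


# Hypothetical values: the old estimate is 4 slots, the new observation 8.
estimate = 4.0
trace = []
for _ in range(3):
    estimate = q_update(estimate, t=2.0, q=3.0, s=2.0, p=1.0, alpha=0.5)
    trace.append(estimate)

# A smaller learning ratio adapts more slowly to the same observation.
slow = q_update(4.0, t=2.0, q=3.0, s=2.0, p=1.0, alpha=0.25)
```

With a ratio of 0.5 the estimate moves from 4.0 to 6.0, 7.0 and 7.5 over three updates, approaching the observed delay of 8.0; with a ratio of 0.25 the first update reaches only 5.0. This is the sense in which adapting with different learning ratios changes how quickly a persistent change in delay is reflected in path selection.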
The hybrid routing protocol offers an efficient routing solution by making use of the existing link state protocol. In example 500, the initial Q values, i.e., the starting routing paths, are obtained from the conventional link state routing algorithms, e.g., Dijkstra's shortest-path algorithm. In contrast, suboptimal initial routes and long optimization times result from Q-routing or other cognitive routing protocols that build the routing table through the cognitive process from the beginning. Furthermore, the hybrid routing system uses the existing messaging in the link state protocol, with possible small modifications if desired to carry additional link cost metrics (e.g., for OLSR to carry delay and not just hop count). By contrast, the Q-routing system defines its own messaging protocol. By retaining the link state protocol's feature of proactively calculating routing paths, the hybrid protocol offers faster convergence and a more scalable solution than the reactive-based cognitive routing protocols.
In conventional link state routing protocols, routing paths are recalculated and routing tables are updated only when the link state database is changed, e.g., nodes are added or removed, path costs change, etc. In hybrid routing system 400, with learning enabled, the path calculations may be rerun each time a new set of link state advertisements is received, regardless of whether the link state database is updated. Alternatively, the path recalculations may be performed on a periodic basis even if there is no change in the link state database.
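The difference in recalculation triggers between the conventional mode and the hybrid mode may be expressed as a simple predicate. The function `should_recalculate` and its parameters are hypothetical, and treating the periodic option as purely timer-driven is an assumption of this sketch.

```python
from typing import Optional


def should_recalculate(
    learning_enabled: bool,
    lsdb_changed: bool,
    lsa_received: bool,
    elapsed: float,
    period: Optional[float] = None,
) -> bool:
    """Decide whether the routing paths should be recalculated now."""
    if not learning_enabled:
        # Conventional mode: only a change in the database triggers a rerun.
        return lsdb_changed
    if period is not None:
        # Hybrid mode, periodic option: rerun once the interval has elapsed.
        return elapsed >= period
    # Hybrid mode, default option: rerun on every received advertisement set.
    return lsa_received


conventional = should_recalculate(False, lsdb_changed=False, lsa_received=True, elapsed=0.0)
on_receipt = should_recalculate(True, lsdb_changed=False, lsa_received=True, elapsed=0.0)
periodic = should_recalculate(True, lsdb_changed=False, lsa_received=False, elapsed=5.0, period=5.0)
```

In the hybrid cases the rerun happens even though the database content is unchanged, which lets the learning process register that a link cost has remained stable.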
In scenario 600 the network adopts a distributed TDMA MAC algorithm as the transmission protocol. A constant bit rate flow (180 kbps) is scheduled between each source-destination node pair. The channel conditions are assumed to be ideal and the wireless link speed set to 1 Mbps.
In
Under route option 1, the standard routing protocol OLSR only considers path costs in terms of number of hops. Inner paths 602 and 604 are selected as the optimal routes. Both source nodes s1 and s2 select node x3 as their forwarding node. Over time, this leads to congestion at node x3 and link delay along the paths 602 and 604. Furthermore, if every data packet occupies one TDMA data slot, TDMA MAC's fair scheduling policy allows data to be sent from nodes s1 and s2 every 3rd slot. If node x3 is under a full load condition with incoming data from the two nodes s1 and s2, x3 can forward data only every 6th slot. Accordingly, it takes an average of six slots for a packet to reach node d1 or d2 from source node s1 or s2, respectively. Therefore, choosing the shortest path does not always give the best network performance in scenario 600.
Under route option 2, the outer path 606 is selected for the data flow from node s1 to node d1 and the inner path 604 is still used for the data flow from node s2 to node d2. Under this routing scheme, each data packet takes an average of four TDMA data slots traveling from the source to the destination. Under route option 3 where both outer paths 606 and 608 are selected to route the data flows from nodes s1 and s2, the network has the best performance where it takes an average of just three TDMA data slots per packet to travel from the source node to the destination node.
As shown in
In
DTED 0 terrain data of a Texas geographical area (N29.5, W100.5) and TIREM3 propagation parameter set, as shown in
Under OLSR-D, the load between Gateway 1 and Gateway 2 is shared.
It will be further understood that the sample values, types and configurations of data described and shown in the figures are for the purposes of illustration only. In that regard, systems and methods in accordance with aspects of the invention may be based on different link state routing protocols, and be used in different network architectures. The systems and methods may be provided and received at different times (e.g., via different servers or databases) and by different entities (e.g., some values may be pre-suggested or provided from different sources).
As these and other variations and combinations of the features discussed above can be utilized without departing from the invention as defined by the claims, the foregoing description of exemplary embodiments should be taken by way of illustration rather than by way of limitation of the invention as defined by the claims. It will also be understood that the provision of examples of the invention (as well as clauses phrased as “such as,” “e.g.”, “including” and the like) should not be interpreted as limiting the invention to the specific examples; rather, the examples are intended to illustrate only some of many possible aspects.
Unless expressly stated to the contrary, every feature in a given embodiment, alternative or example may be used in any other embodiment, alternative or example herein. For instance, any suitable cognitive learning algorithms may be employed in any configuration herein. Existing or future link state routing protocols may be used in any configuration herein. Any static or dynamic cost metric with parameters of network conditions may be used with any of the configurations herein.