This invention relates to network management activities in communication networks. More particularly, and not by way of limitation, the invention is directed to a system and method for disseminating network management tasks to network nodes in large, complex, and dynamic communication networks, and solving the tasks in a distributed manner.
The management architecture in use today in communication networks is based on an architecture specified by the ITU-T M-series of standards. This seminal work in the field of network management had at its center the simple client-server architecture. In the standard text, this is referred to as the “agent-manager” relationship, where the Agent resides on the network equipment being managed and the Manager is a central entity that interacts with the Agent to retrieve management information and coordinate configuration tasks. Current third generation (3G) Network Management System (NMS) solutions are based on essentially the same paradigm. This architecture relies on a centralized element or server responsible for collecting data from managed devices, aggregating the data, and setting the state information on the devices. The functionality realized in this server is typically divided according to the FCAPS functional taxonomy, as defined by ITU-T in the X.700 specification family.
Communication networks continue to grow in size and complexity, which leads to increased dynamics as individual nodes go on and off line, and links fail and are repaired. These factors introduce a number of challenges to the current centralized NMS architecture. To meet these challenges in part, network management tasks are being distributed down into the network nodes and other network entities themselves in an attempt to increase the availability, performance characteristics, scalability, and correctness guarantees of the network management system.
Finding information without a central lookup table is a difficult task. One technology which enables node and data discovery in a distributed fashion is the Distributed Hash Table (DHT). DHTs (such as Chord, Pastry, Tapestry, CAN, Bamboo, Kademlia, Coral, and Viceroy) are structured peer-to-peer systems in which all nodes participate equally in consuming/providing data and solving distributed tasks. DHTs are built as logical overlays on top of the physical network, and provide a routing mechanism that relies on a very precise naming scheme. The result is a fully distributed system which offers many advantages, such as scalability to millions of peer nodes, efficient lookup algorithms, robustness and automatic reconfiguration in the face of node arrival/departure, and ease of management and deployment.
In essence, all DHTs offer the same functionality (i.e., location of peers/data), with some variations in terms of properties, such as the number of routing neighbors, choice of iterative vs. recursive lookups, choice of routing table creation algorithms, and neighbor selection strategies. Moreover, over time, different DHTs have evolved in the same strategic direction, implementing the best choices as they emerged from studies on existing DHTs. To this end, most current DHTs guarantee that any node can be discovered in an average number of overlay hops of O(log N), with local information stored at each node of O(log N), where N is the number of nodes in the network, thus guaranteeing the scalability of the solution.
DHTs, however, have several disadvantages as well. The disadvantages of DHTs reside primarily in the fact that the mapping between the physical network nodes and the overlay is usually independent of any functionality of the nodes being mapped. Therefore, inefficiencies arise when management tasks are distributed.
In the context of distributed network management tasks, at the application level, it is normally necessary that each network node be able to identify a certain number of “neighbors” that it will be in contact with for completing its part of the assigned task(s). This set of neighbors is dependent on the task to be solved. For example, if the task is to verify the consistency of inter-RNC neighbor-cell relations in a WCDMA-based radio network, each Radio Network Controller (RNC) must initiate contact with the other RNCs that its cells have neighboring relations with, and must request the other RNCs to determine whether the cell neighboring relations are defined symmetrically on the neighbor's side.
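The neighbor-cell consistency check described above can be illustrated with a minimal Python sketch. The data model is an assumption made for illustration only: `relations` maps each RNC to its cells and their defined neighbor cells, and `cell_owner` maps each cell to the RNC that controls it. In a distributed deployment, the lookup of the reverse relation would be a request sent to the owning RNC rather than a local dictionary access.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical data model: RNC name -> cell name -> set of neighbor cells.
RelationTable = Dict[str, Dict[str, Set[str]]]


def find_asymmetric_relations(
    relations: RelationTable, cell_owner: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Return (cell, neighbor) pairs whose reverse relation is missing."""
    missing: List[Tuple[str, str]] = []
    for rnc, cells in relations.items():
        for cell, neighbors in cells.items():
            for neighbor in sorted(neighbors):
                owner = cell_owner[neighbor]
                if owner == rnc:
                    continue  # both cells belong to the same RNC
                # Stand-in for a request to the RNC that owns the neighbor cell.
                reverse = relations.get(owner, {}).get(neighbor, set())
                if cell not in reverse:
                    missing.append((cell, neighbor))
    return missing
```

Each RNC only needs to contact the RNCs that own the neighbor cells of its own cells, which is the task-dependent neighbor set referred to above.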
In general, data existing in the managed network (for example, relations between network nodes) usually define a directed graph that can be used at the application level for propagating the processing request from one network element to another until all nodes that should partake in the distributed task are contacted. If this graph is strongly connected (i.e., there is a directed path between any two nodes in the graph), then requests originating at any network node will eventually be propagated to all other network nodes (presupposing some underlying layer which enables node discovery and addressing).
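The propagation property stated above can be checked with a short Python sketch. The adjacency-list representation and the function names are assumptions chosen for illustration. `reachable_from` follows relations outward from an originating node, and `is_strongly_connected` tests whether every node can reach every other node by searching both the graph and its reverse from one starting node.

```python
from collections import deque
from typing import Dict, List, Set

Graph = Dict[str, List[str]]  # node -> nodes it has relations to


def reachable_from(graph: Graph, start: str) -> Set[str]:
    """Nodes that a request originating at 'start' would eventually reach."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_strongly_connected(graph: Graph) -> bool:
    """True if a request from any node reaches all other nodes."""
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    if not nodes:
        return True
    reversed_graph: Graph = {n: [] for n in nodes}
    for src, targets in graph.items():
        for dst in targets:
            reversed_graph[dst].append(src)
    start = next(iter(nodes))
    return (reachable_from(graph, start) == nodes
            and reachable_from(reversed_graph, start) == nodes)
```

When the check fails, some nodes cannot be reached through application-level relations alone, so a request originating at certain nodes would not cover the whole network.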
In current centralized NM systems, the central managing node's view of the network is used when processing management tasks. In the context of networks of increased size, complexity, and dynamics, the use of central knowledge for deciding whether a request for distributed processing of a network management task has reached all nodes does not provide high guarantees in terms of scalability, performance, availability, and consistency.
Regarding scalability, current solutions have problems handling increases in the number of nodes being managed. The process of data collection, aggregation, and correlation becomes very complex as there is a commensurate increase in the volume of data to be managed relative to the number of devices/network elements which are to be managed. Regarding performance and availability, the 1−n (one manager to many agents) relationship in current solutions creates problems in case of failure of the manager. Similarly, the central node can be overloaded collecting data from the nodes and processing the collected data. In more extreme cases, when a management task is related to an entire network, such as determining whether a property holds true across all nodes in the network where there is shared state information (cell parameters), this workload can be difficult to handle in an efficient manner at one central location.
Finally, current solutions have problems maintaining consistency of data collected by the central management node. When working on a snapshot or copy of information retrieved from the network to support cell planning, for example, the central node performs all data processing on local copies of the actual data. Ensuring strict consistency between the data on the managed node and the data on the OSS node is extremely difficult or impossible in massively distributed systems.
The above issues raise serious and complicated challenges as networks evolve and the volume of entities to be managed grows ever larger. What is needed in the art is a more viable network management architecture and method that helps alleviate the problems associated with the issues outlined above. Such an architecture should enable efficient distribution of network management tasks to nodes throughout the network, and should readily accommodate changes in the architecture graph. The present invention provides such an architecture and method.
The present invention enables direct communication between nodes in a telecommunications or similar network, making possible the distribution of network management tasks within the managed network itself. The invention overcomes the disadvantages of the prior art by utilizing semantic information from the traffic network to build a Data Distribution and Discovery (D3) layer, efficiently dealing with dynamic situations and maintaining several overlays for the different management tasks. The invention thus utilizes functional information when constructing the mapping (in the information hashed for constructing the overlay identity), and constructs a 1-to-n mapping to accommodate different network management functionalities. Network nodes may collaborate in response to network management requests thus balancing the network management load among the nodes in the network, increasing the scalability of the network management solution, and/or using the actual data on the nodes as opposed to cached, possibly outdated copies on a central node, as is traditionally the case in current network management approaches.
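As a rough illustration of hashing functional information into the overlay identity, the following Python sketch derives one identifier per management function for the same node, giving a 1-to-n mapping. The choice of SHA-1, the `function:node` string layout, and the example names are assumptions made for this sketch and are not specified by the description.

```python
import hashlib
from typing import Dict, Iterable


def overlay_id(node_name: str, function: str) -> str:
    """Hash the semantic node name together with a management function.

    SHA-1 and the 'function:node' layout are illustrative assumptions.
    """
    key = f"{function}:{node_name}".encode("utf-8")
    return hashlib.sha1(key).hexdigest()


def overlay_ids(node_name: str, functions: Iterable[str]) -> Dict[str, str]:
    """Map one physical node to one overlay identity per function (1-to-n)."""
    return {function: overlay_id(node_name, function) for function in functions}


# Hypothetical node and function names.
ids = overlay_ids("RNC-17", ["neighbor-cell-check", "alarm-correlation"])
```

Because the function name is part of the hashed input, the same node receives a distinct position in each overlay, so each management task can have its own overlay.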
In one aspect, the present invention is directed to a method of distributing a network management task from a source to a plurality of network nodes in a traffic network having an application layer and a functional management overlay layer. The method includes the steps of receiving the network management task in a network node; utilizing application-layer information regarding the functionality of neighboring nodes to select, by the receiving network node, at least one neighboring node that needs to receive the network management task; and utilizing a functional management overlay layer to distribute the network management task from the receiving network node to the at least one selected neighboring node. The receiving network node then receives responses from the neighboring nodes, aggregates the responses, and sends an aggregated response to the source.
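The method steps above (receive, select, distribute, aggregate) can be sketched in Python as follows. The `Node` class, its fields, and the use of a direct method call in place of an overlay message are all assumptions made for illustration. A real node would send the task through the functional management overlay layer and wait for response messages.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Node:
    name: str
    functions: Set[str]  # management functions relevant to this node
    neighbors: List["Node"] = field(default_factory=list)
    local_value: int = 0  # stand-in for local management information

    def handle(self, task: str, visited: Set[str]) -> Dict[str, int]:
        """Process a task locally, forward it, and aggregate the responses."""
        visited.add(self.name)
        aggregated = {self.name: self.local_value}
        for neighbor in self.neighbors:
            # Select only neighbors that need the task and have not seen it.
            if task not in neighbor.functions or neighbor.name in visited:
                continue
            # Direct call stands in for a request sent over the overlay.
            aggregated.update(neighbor.handle(task, visited))
        return aggregated  # aggregated response returned toward the source
```

The `visited` set is one simple way to stop a request from looping when relations form cycles; other duplicate-suppression schemes would serve equally well.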
In another aspect, the present invention is directed to a system for distributing a network management task from a source to a plurality of network nodes in a traffic network. The system includes means within each network node for selecting at least one neighboring node to receive the network management task. The network node utilizes application-layer knowledge of the functionality of each neighboring node to select only neighboring nodes that need to receive the network management task. The system also includes a functional management overlay layer for directly communicating between each network node and the node's neighboring nodes; and means within each network node for utilizing the functional management overlay layer to distribute the network management task from the network node to the at least one selected neighboring node. The network node then receives responses from the neighboring nodes, aggregates the responses, and sends an aggregated response to the source.
In another aspect, the present invention is directed to a network node for distributing a network management task to a plurality of neighboring nodes in a traffic network. The network node includes means for selecting at least one neighboring node to receive the network management task, wherein the network node utilizes application-layer knowledge of the functionality of each neighboring node to select only neighboring nodes that need to receive the network management task; and means for distributing the task to the at least one selected neighboring node utilizing a functional management overlay layer that provides direct communication between each network node and the node's neighboring nodes.
In another aspect, the present invention is directed to a network node for collecting network management information from a plurality of neighboring nodes in a traffic network in response to a network management request received from an originating node. The network node includes means for determining local management information needed to respond to the request and requesting remote information; means for utilizing application-layer knowledge of the functionality of each neighboring node to identify neighboring nodes where the remote management information is located; and means for utilizing a functional management overlay layer to send request messages to the identified neighboring nodes to request the remote management information. The network node also includes means for receiving the requested remote management information in response messages from the identified neighboring nodes; and means for aggregating the remote management information and the local management information and sending the aggregated information to the originating node.
In the following, the essential features of the invention will be described in detail by showing preferred embodiments, with reference to the attached figures in which:
The present invention provides an architecture for distributing and solving network management tasks in a decentralized manner. The architecture of the present invention distributes management tasks based on an overlay. The roles of the overlay are: (1) to provide direct addressing between the different nodes (i.e., not through a central node), and (2) to provide an alternative way to reach nodes beyond relations defined at the application level. In this manner, the invention provides scalability, performance, availability, and consistency when deciding whether a request for distributed processing of a network management task has reached all nodes.
The architecture of the present invention allows for large growth in the number of network elements being managed. The architecture handles the increased complexity and dynamics which result from distributing the management functions between the managing systems and the managed systems by imposing a small overhead on each of the nodes. As a result, decentralizing the management tasks helps to alleviate the load on the managing system, to improve the efficiency of the management process, and to ensure that the data processing is performed on the actual data, as opposed to potentially inconsistent copies of the data.
In order to enable the distribution of network management tasks, the architecture of the present invention allows for communication of management tasks and requests, not only between the managing system and managed system(s), but also between the managed system(s), when it is more appropriate to do so. This new architectural approach demands that managed systems must be able to locate and communicate with each other without necessarily using a centralized system as an intermediary.
For reliability reasons, automated routing around failures and automatic reconfiguration in the face of node arrival/departure are extremely important in the context of networks spanning many thousands or even tens of thousands of managed systems. As noted, to enable distribution of network management tasks, managed systems must be able to locate and address each other without the use of centralized knowledge. This discovery plane in turn should be scalable and reconfigurable, and logically integrated with the existing network structure, so as to be of maximum use to the management applications. In various embodiments of the present invention, the identifiers used in the discovery plane are logically related to unique semantic information currently defined and used in the managed network.
The present invention introduces a new function overlay (abstraction) layer within the traffic network referred to as the Data Distribution and Discovery (D3) layer. The D3 layer supports effective control and management of network elements (managed systems) by providing a framework and architecture that supports dynamic discovery of the relevant information needed to support managing the traffic network in a distributed manner, and provides the infrastructure needed to support distributed management algorithms which can be used for the creation of an autonomic management system. The invention uses semantic information from the traffic network and network management tasks to build the D3 layer, dynamically maintains the D3 layer when the network configuration or the semantics change, and maintains multiple overlays in the D3 layer for different network management tasks.
The D3 layer is a computational abstraction layer that sits on top of the traffic network and below the classic Network Management “Manager” layer. The D3 layer is used to enable distributed discovery and addressing of nodes, necessary to support distributing the network management tasks across the network elements. The primary objective of the D3 layer is to enable nodes to autonomously locate each other and communicate directly, without the need, support, or central knowledge of a central node to forward requests.
The methodology described herein builds on existing concepts such as peer-to-peer systems. The D3 layer is used for discovering distributed network nodes and management information, and distributing network management tasks to the nodes. These tasks require some form of peer-to-peer architecture, which allows nodes to directly communicate with each other and collaborate together, so as to accomplish specific network management tasks. In peer-to-peer systems, each node has partial knowledge of the network, being therefore able to contact a subset of nodes in the system. The present invention can also exploit this knowledge for extending requests to parts of the network that are not necessarily covered by network management relations at the application level.
In brief, the application-level graph may be viewed as being used to propagate the request, the D3 layer as being used to locate and address nodes, and the physical layer as being used for the actual data communication.
At the D3 layer 12, routing tables and/or neighborhood sets are created according to a pre-defined algorithm, which enables distributed discovery of network nodes 14 and data associated with the network nodes. When a message needs to be sent from one network node to another, the routing information in the overlay node (i.e., local information at the D3 layer) is utilized to discover a route to the target node. The overlay routing works by matching prefixes of nodes from the routing table with the final destination node.
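The prefix-matching step described above can be sketched in Python. The identifiers are treated as strings of digits, and the function names are assumptions for illustration. The sketch picks, from the locally known nodes, the one whose identifier shares the longest prefix with the destination, and returns `None` when no known node is closer than the local node.

```python
from typing import Iterable, Optional


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that two identifiers have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def next_hop(local_id: str, target_id: str,
             routing_table: Iterable[str]) -> Optional[str]:
    """Choose the known node that improves the prefix match with the target."""
    best: Optional[str] = None
    best_len = shared_prefix_len(local_id, target_id)
    for candidate in routing_table:
        match = shared_prefix_len(candidate, target_id)
        if match > best_len:
            best, best_len = candidate, match
    return best
```

Repeating this selection at each hop lengthens the matched prefix until the destination node, or the node closest to it, is reached.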
In one exemplary embodiment, the overlay is implemented utilizing DHT technology, or a variant thereof. Most DHT implementations will guarantee the discovery of the destination node in an average of O(log N) steps, where N is the number of nodes in the D3 layer, with O(log N) information stored in the local routing tables. The performance of the discovery algorithm is related to how much information is stored in the routing tables: the more information stored, the easier it is to find the next node. Therefore, if an average performance of O(log N) is desired, the routing tables must be of O(log N) size.
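To give a feel for the O(log N) bound, the following Python sketch computes an order-of-magnitude hop estimate. It assumes a prefix-routing DHT that resolves one identifier digit per hop, so the logarithm base equals the digit base. The base of 16 and the function name are illustrative assumptions, and constant factors are ignored.

```python
import math


def estimated_hops(num_nodes: int, digit_base: int = 16) -> int:
    """Rough hop estimate: ceil(log_base(N)), ignoring constant factors."""
    if num_nodes < 2:
        return 0
    return math.ceil(math.log(num_nodes) / math.log(digit_base))
```

Under these assumptions, growing the D3 layer from 10,000 to 1,000,000 nodes raises the estimate from 4 hops to 5, which illustrates why logarithmic lookup cost scales to large managed networks.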
The design of the network architecture 10 is based on the following principles:
When a distributed network management function needs to initiate communication between network nodes, the following sequence of activities may be performed:
The following is an example illustrating the architectural approach outlined above, as applied to a UMTS or LTE radio network, using a Distributed Hash Table (DHT) as the underlying solution for communication and discovery. The D3 distribution overlay built on top of the physical network uses a DHT to enable the network nodes to discover each other in a distributed fashion. Each node keeps a partial view of the network and supports a deterministic method for forwarding requests from any node in the distribution overlay to any other node. The example presented here uses the Bamboo algorithm, although any similar implementation would also provide the same basic level of support. In the Bamboo based solution, each node keeps:
The routing table, leafset, and neighborhood set are automatically created and/or updated as a node joins the network, and are also automatically reconfigured when nodes leave the network.
Each of the following steps corresponds to the architectural principle outlined in the previous section.
<type><seq_no><target><type of encoding><application-specific payload>
However, many types of message formats and content may be envisaged within the scope of the present invention.
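One possible realization of the five-field layout shown above is sketched below in Python. The field names, the `|` delimiter, and the text-based encoding are assumptions made for illustration, since the description leaves field sizes and encodings open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class D3Message:
    msg_type: str   # <type>
    seq_no: int     # <seq_no>
    target: str     # <target>
    encoding: str   # <type of encoding>
    payload: str    # <application-specific payload>

    def serialize(self) -> str:
        """Join the fields in the order given by the message format."""
        return "|".join([self.msg_type, str(self.seq_no), self.target,
                         self.encoding, self.payload])

    @classmethod
    def parse(cls, raw: str) -> "D3Message":
        """Split on the first four delimiters; the payload may contain '|'."""
        msg_type, seq_no, target, encoding, payload = raw.split("|", 4)
        return cls(msg_type, int(seq_no), target, encoding, payload)
```

Placing the payload last allows it to carry arbitrary application data, because parsing stops splitting after the fourth delimiter.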
It should also be understood from the above description that the roles of originating and receiving nodes can co-exist in the same node. Thus, the requesting node and the remote network node may be physically co-located in the same node.
The present invention may, of course, be carried out in other specific ways than those herein set forth without departing from the essential characteristics of the invention. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, and all changes coming within the meaning and equivalency range of the appended claims are intended to be embraced therein.
This application claims the benefit of U.S. Provisional Application No. 60/894,085 filed Mar. 9, 2007.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP2008/052418 | 2/28/2008 | WO | 00 | 11/8/2010 |
Number | Date | Country |
---|---|---|
60894085 | Mar 2007 | US |