The present disclosure relates generally to the field of networking, and in particular, to the management of a plurality of forwarding network nodes such as routers and switches in a network cloud.
The term Network Cloud (NC) refers to a cloud that is being used for serving network functionalities such as routing, switching, etc. In other words, it refers to a concept of disaggregating the hardware and software of network entities. The control plane of a network entity is decoupled from its data-path, and is installed on a local server or in a network cloud. An underlying abstraction layer separates the control element and makes it agnostic to the data-path related hardware components. The data-path runs on distributed hardware resources such as servers, network interfaces and white box devices and may be programmed directly.
The network cloud concept uses cloud methodology to serve Software Defined Network (“SDN”) services such as routing, switching, VPN, QoS, DDoS mitigation and the like in a more efficient, centrally controlled and easily programmable way.
The separation that nowadays exists between software and hardware in the networking field has resulted in a new model of a network cloud, wherein an optimized usage of hardware resources is implemented to enable the deployment of a distributed network operating system.
Nowadays, network operators are facing a financial problem: network elements are relatively expensive per device, and consequently expensive on a “per port” basis, whereas the income per subscriber remains mostly constant and, in some cases, has even declined. This affects the profitability of network owners and encourages them to look for ways to reduce costs. Many network operators and large network owners, such as web-scale owners, have adopted the approach of implementing white-boxes in their networks, where a white-box is a hardware element that is manufactured by silicon ODMs (commodity chipset sellers). This approach allows network operators to use different white boxes manufactured by different manufacturers within the same distributed network cloud cluster, and thereby to reduce the hardware price to a model of BOM cost plus an agreed-upon margin. Yet, this approach is rather different from the traditional approach, whereby network elements were purchased as monolithic devices of hardware and software combined together. As mentioned above, the hardware part of the problem (i.e. the hardware part of the network elements) was solved by adopting the white-box approach. Still, the adoption of this approach has created new challenges for the software part of the problem. Since this approach involves multiple software modules and containers, the use of a distributed hardware node solution, which comprises a plurality of hardware white-boxes, requires the software modules and containers to run in synchronization.
When this concept is adopted, different functionalities may be distributed between hardware resources as they can function along the data-path while allowing packet processing. Alternatively, they can function as fabric entities allowing communication between data-path elements, or as network controllers handling routing protocols or while carrying out any other applicable functionality.
Enterprises, as well as communication service providers, who implement a virtualized network cloud, are facing new challenges that relate to the installation, deployment, configuration, orchestration, provisioning and monitoring of the plurality of different entities required in the lifecycle management of the cluster, whether these entities are network cloud routers, switches or any other applicable network element.
Additionally, as the complexity of the distributed network cloud increases, there is an increasing need to enable automatic processes for adding, removing or replacing hardware devices within the network.
Therefore, there is a need for an orchestration model which may be configured for managing a plurality of routing or switching entities in a network cloud. The present invention seeks to fulfill this need.
The disclosure may be summarized by referring to the appended claims.
It is an object of the present disclosure to provide a novel solution for managing a plurality of routing and/or switching entities operating in a network cloud.
It is another object of the present disclosure to provide a system, a method and a software program for managing distributed network nodes operating in a network cloud, based on information related to key performance indicators (KPIs) collected from the plurality of physical network elements.
It is another object of the present disclosure to provide a system, a method and a software program for managing distributed network nodes based on threshold values associated with KPIs and the information collected from a plurality of physical network elements.
Other objects of the present disclosure will become apparent from the following description.
According to a first aspect of the disclosure there is provided a communication system configured to operate in a network cloud, the system comprising a plurality of physical network elements and a server configured to operate as a cloud orchestrator which receives information related to key performance indicators (KPIs) collected from the plurality of physical network elements, and determines whether a pre-defined action that relates to a respective physical network element needs to be executed, based on a) one or more threshold values stored at the cloud orchestrator and associated with the KPIs and b) the information collected from the plurality of physical network elements.
The terms “physical network element” or “physical network node” or “hardware network element” or “network element”, as the case may be, are used interchangeably herein throughout the specification and claims to denote a physical entity such as a packet processor, a CPU, a memory, a network interface, and the like, that can act as a single or multiple entities being a part of a virtual routing entity and supports the routing functionality of the latter.
The term “network cloud”, as used herein throughout the specification and claims, refers to a cloud that is being used for serving network functionalities such as routing, switching, etc.
On the other hand, the term “cloud network” denotes network resources (servers, disks, CPUs, and the like) that are used for providing a cloud functionality (e.g. for hosting services, files, sites, and the like), as web scale companies do.
The terms “cloud orchestrator” or “orchestrator” as used herein throughout the specification and claims refer to a cloud management platform that automates provisioning of cloud services using policy-based tools. It enables the user to configure, provision and integrate service management, and to add management, monitoring, back-up and security, in a short period of time. The platform comprises a collection of customized individual activities that are specific to a product or technology, so that the activities to be performed thereby are integrated with that product. The cloud orchestrator is typically capable of guiding information, providing redundancy, availability, low latency and total transparency among the different communication protocols while offering security, management and capacity to integrate a large number of network elements, and of enabling new members to be aggregated to a network cloud in a fast and secure manner.
According to another embodiment, the cloud orchestrator is further configured to trigger a configuration change at at least one of the plurality of physical network elements, in response to determining that one or more threshold values have been exceeded.
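The threshold-triggered configuration change of this embodiment can be illustrated by the following minimal sketch. It is not the disclosed implementation: the class names `ConfigPatch` and `ThresholdOrchestrator`, the dictionary-based configuration model, and the KPI and key names are all hypothetical and chosen only for illustration. The sketch applies a patch when any guarded KPI exceeds its threshold, logs the change, and keeps the previous values so that the change can be rolled back.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ConfigPatch:
    """Hypothetical configuration patch guarded by KPI thresholds."""
    name: str
    lines: Dict[str, str]          # config key -> new value
    thresholds: Dict[str, float]   # KPI name -> value that must be exceeded


@dataclass
class ThresholdOrchestrator:
    """Hypothetical orchestrator state: per-node config plus a change log."""
    node_configs: Dict[str, Dict[str, str]] = field(default_factory=dict)
    change_log: List[dict] = field(default_factory=list)

    def evaluate(self, node: str, kpis: Dict[str, float], patch: ConfigPatch) -> bool:
        """Apply `patch` to `node` if at least one guarded KPI exceeds its threshold."""
        exceeded = [kpi for kpi, limit in patch.thresholds.items()
                    if kpis.get(kpi, float("-inf")) > limit]
        if not exceeded:
            return False
        config = self.node_configs.setdefault(node, {})
        # Remember the previous values so the change can be rolled back.
        previous: Dict[str, Optional[str]] = {k: config.get(k) for k in patch.lines}
        config.update(patch.lines)
        self.change_log.append({"node": node, "patch": patch.name,
                                "exceeded": exceeded, "previous": previous})
        return True

    def rollback_last(self) -> None:
        """Undo the most recently logged change."""
        entry = self.change_log.pop()
        config = self.node_configs[entry["node"]]
        for key, old in entry["previous"].items():
            if old is None:
                config.pop(key, None)   # key did not exist before the patch
            else:
                config[key] = old
```

In this sketch, a KPI that is missing from the report is treated as not exceeding its threshold; a real system might instead treat missing data as an event in its own right.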
By yet another embodiment, the cloud orchestrator is further configured to identify a new device (e.g. a white-box device or another hardware element, server and the like) when that new device is added to the network cloud, and to associate the new device with a cluster of devices comprised in that network cloud.
According to another embodiment, the cloud orchestrator is operative to configure a hardware device to operate as a network cloud entity by installing one or more software packages and/or software containers at that hardware device, and establishing communications between the hardware device and at least one other network cloud entity, and wherein the one or more software packages and/or software containers installed at the hardware device by the cloud orchestrator depend on a functionality which will be performed by that network cloud entity when operating as part of the network cloud.
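The dependence of the installed packages on the intended functionality can be sketched as a simple lookup. The role names below follow the functionalities mentioned in the background (data-path, fabric, network controller), but the package names, the `ROLE_PACKAGES` table and the `build_install_plan` function are hypothetical and serve only as an illustration under these assumptions.

```python
from typing import Dict, List

# Hypothetical mapping from the role a device will play to the software
# packages/containers it needs. All package names are illustrative only.
ROLE_PACKAGES: Dict[str, List[str]] = {
    "data_path": ["base-os", "forwarding-agent"],
    "fabric": ["base-os", "fabric-agent"],
    "controller": ["base-os", "routing-protocols"],
}


def build_install_plan(device_id: str, role: str, peers: List[str]) -> dict:
    """Return what to install on a device and which entities it must reach."""
    if role not in ROLE_PACKAGES:
        raise ValueError(f"unknown role: {role}")
    if not peers:
        # The embodiment requires communication with at least one other entity.
        raise ValueError("at least one peer network cloud entity is required")
    return {
        "device": device_id,
        "packages": list(ROLE_PACKAGES[role]),
        "connect_to": list(peers),
    }
```

For instance, `build_install_plan("wb-7", "fabric", ["wb-1"])` would select the fabric package set and record `wb-1` as the entity with which communications are to be established.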
According to still another embodiment, the cloud orchestrator is configured to form a new cluster of physical network elements, and determine one or more of the members included in a group that comprises: type of the new cluster, mandatory and non-mandatory modules to be installed at network elements that are members of the new cluster, a setup for at least one of the mandatory modules and a setup for at least one of the non-mandatory modules, as well as required configurations for physical network elements, and the like.
By yet another embodiment, the cloud orchestrator is operative to configure a communication channel for conveying messages exchanged between network elements that are members of a cluster, and/or between network elements that are members of the cluster and the cloud orchestrator. Preferably, these messages comprise at least one type of messages being a member of a group that comprises: keepalive messages, configuration commands forwarded from the cloud orchestrator to modules installed at network elements, messages that are sent every pre-determined time interval to the cloud orchestrator and comprise information that relates to at least one of: current telemetry, statistics, events, KPIs, and the like.
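The three message types named in this embodiment can be modelled, for illustration, as a single envelope with a type tag. The JSON wire format, the field names and the `ClusterMessage` class are assumptions made for this sketch; the disclosure does not specify an encoding.

```python
import json
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict


class MessageType(Enum):
    KEEPALIVE = "keepalive"
    CONFIG_COMMAND = "config_command"    # orchestrator -> module on a network element
    PERIODIC_REPORT = "periodic_report"  # telemetry, statistics, events, KPIs


@dataclass
class ClusterMessage:
    """Hypothetical envelope for messages carried over the cluster channel."""
    msg_type: MessageType
    sender: str
    receiver: str
    payload: Dict[str, Any] = field(default_factory=dict)

    def encode(self) -> str:
        """Serialize to JSON (an assumed wire format)."""
        return json.dumps({"type": self.msg_type.value, "from": self.sender,
                           "to": self.receiver, "payload": self.payload},
                          sort_keys=True)

    @staticmethod
    def decode(raw: str) -> "ClusterMessage":
        """Rebuild a message from its JSON form; unknown types raise ValueError."""
        data = json.loads(raw)
        return ClusterMessage(MessageType(data["type"]), data["from"],
                              data["to"], data["payload"])
```

A periodic report would then carry its KPIs in `payload`, for example `ClusterMessage(MessageType.PERIODIC_REPORT, "ne-1", "orchestrator", {"kpis": {"cpu_util": 0.4}})`.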
According to another embodiment, the cloud orchestrator is configured to add to a cluster a network node that belongs to the network cloud, without interfering with the functionality of that cluster (e.g. without causing a down time to any of the cluster's operative network elements). A network node may be added in any one of the following events:
A network node may be shut down in any one of the following events:
In accordance with still another embodiment, the cloud orchestrator is configured to ensure that a plurality of physical network elements operate as a single virtual routing entity.
In accordance with another aspect of the present disclosure there is provided a method for use in a network cloud, which comprises a plurality of physical network elements and a server configured to operate as a cloud orchestrator and to receive information related to key performance indicators (KPIs) collected from the plurality of physical network elements, wherein the method comprises the steps of determining whether a pre-defined action that relates to a respective physical network element needs to be executed, and wherein the determination is based on threshold values associated with these KPIs and information collected from the plurality of physical network elements.
According to another embodiment of this aspect of the disclosure, the method further comprises the steps of determining whether one or more threshold values have been exceeded, and if in the affirmative, triggering a configuration change at at least one of the plurality of physical network elements.
By still another embodiment, the method further comprises a step of establishing a communication channel for conveying messages exchanged between network elements that are members of the cluster, and/or between network elements that are members of the cluster and the cloud orchestrator.
According to another embodiment the messages comprise at least one type of messages being a member of a group that consists of: keepalive messages, configuration commands forwarded from the cloud orchestrator to modules installed at network elements, messages that are sent every pre-determined time interval to the cloud orchestrator and comprise information that relates to at least one of: current telemetry, statistics, events and KPIs.
In accordance with another embodiment the method further comprises a step of ensuring that a plurality of physical network elements operate as a single virtual routing entity.
By yet another embodiment, the method further comprises a step of monitoring at least one network element and KPIs associated therewith at a pre-defined steady rate, and upon detecting that a malfunction is about to be associated with that network element, determining a time period during which relevant KPIs will be sampled at a rate that is higher than the steady rate applied prior to making that determination, thereby enabling detection of the cause for the possible malfunction.
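The switch from the steady sampling rate to a higher rate for a limited time period can be sketched as follows. How an impending malfunction is detected is not specified here, so the sketch assumes, purely for illustration, that a KPI reaching a `warning_level` is the trigger. The class name `AdaptiveSampler` and all default values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveSampler:
    """Hypothetical sampler that polls faster for a while after a warning."""
    steady_interval_s: float = 60.0    # normal time between samples
    burst_interval_s: float = 5.0      # time between samples during the burst
    burst_duration_s: float = 300.0    # length of the high-rate period
    warning_level: float = 0.8         # assumed trigger for "malfunction expected"
    burst_until: float = float("-inf")

    def next_interval(self, now: float, kpi_value: float) -> float:
        """Return the delay until the next sample, given the latest KPI value."""
        if kpi_value >= self.warning_level:
            # Start (or extend) the high-rate sampling period.
            self.burst_until = now + self.burst_duration_s
        if now < self.burst_until:
            return self.burst_interval_s
        return self.steady_interval_s
```

Once the high-rate period elapses without a further warning, the sampler falls back to the steady interval, which keeps the extra monitoring load bounded.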
According to yet another aspect of the disclosure there is provided a non-transitory computer readable medium storing a computer program for performing a set of instructions to be executed by one or more computer processors, the computer program is adapted to perform a method for use in a network cloud comprising a plurality of physical network elements and a server configured to operate as a cloud orchestrator and to receive information related to key performance indicators (KPIs) collected from the plurality of physical network elements, wherein the method comprises the steps of determining whether a pre-defined action that relates to a respective physical network element needs to be executed, and wherein the determination is based on threshold values associated with said KPIs and information collected from the plurality of physical network elements.
The accompanying drawings, which are incorporated herein and constitute a part of this specification, illustrate several embodiments of the disclosure and, together with the description, serve to explain the principles of the embodiments disclosed herein.
Some of the specific details and values in the following detailed description refer to certain examples of the disclosure. However, this description is provided only by way of example and is not intended to limit the scope of the invention in any way. As will be appreciated by those skilled in the art, the claimed method and device may be implemented by using other methods and/or other devices that are known in the art per se. In addition, the described embodiments comprise different steps, not all of which are required in all embodiments of the invention. The scope of the invention can be summarized by referring to the appended claims.
A cloud orchestrator automates the management, coordination and organization of complicated computer systems, services and middleware. In addition to a reduced requirement for personnel involvement, the orchestration functionality eliminates the potential errors that might be introduced while carrying out provisioning, scaling or other cloud processes.
Once an operating system (OS) is installed at the cloud orchestrator 110 (and at the network controller 120, if the latter is deployed), agents 140-1 to 140-N may be installed at NEs 130-1 to 130-N to support communication from cloud orchestrator 110 either directly or through cloud controller 120. Once these agents have been installed, links are established at the L2 layer and respective tunnels may be configured, thereby enabling two-way communication between the cloud orchestrator and the network elements.
First, the newly added network element, being for example a router or a switch, is identified and a communication link is established between the cloud orchestrator and that new NE (step 200). The NE is then associated by the managing entity with a certain cluster (step 210) and the cloud orchestrator or the cloud controller, as the case may be, forwards images/dockers to the new NE (step 230). Once a keepalive message is sent by the NE to the cloud orchestrator/controller, checking/confirming that the communication link that has been established between the two is operative, a plurality of KPIs will be collected and stored at the cloud orchestrator, preferably at pre-configurable time intervals (step 240).
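The onboarding sequence described above can be sketched as a small state holder in which each method corresponds to one of the numbered steps. The class and field names are hypothetical, and the sketch only tracks state; it does not perform any actual networking.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NetworkElement:
    """Hypothetical record the orchestrator keeps for each onboarded NE."""
    ne_id: str
    link_established: bool = False
    cluster: str = ""
    images: List[str] = field(default_factory=list)
    keepalive_seen: bool = False
    kpi_history: List[Dict[str, float]] = field(default_factory=list)


class OnboardingFlow:
    def __init__(self) -> None:
        self.elements: Dict[str, NetworkElement] = {}

    def identify(self, ne_id: str) -> NetworkElement:
        """Step 200: identify the new NE and establish a link to it."""
        ne = NetworkElement(ne_id, link_established=True)
        self.elements[ne_id] = ne
        return ne

    def associate(self, ne_id: str, cluster: str) -> None:
        """Step 210: associate the NE with a cluster."""
        self.elements[ne_id].cluster = cluster

    def push_images(self, ne_id: str, images: List[str]) -> None:
        """Step 230: forward images to the NE (requires a cluster)."""
        ne = self.elements[ne_id]
        if not ne.cluster:
            raise RuntimeError("NE must be associated with a cluster first")
        ne.images.extend(images)

    def on_keepalive(self, ne_id: str) -> None:
        """A keepalive confirms that the established link is operative."""
        self.elements[ne_id].keepalive_seen = True

    def collect_kpis(self, ne_id: str, sample: Dict[str, float]) -> bool:
        """Step 240: store a KPI sample, but only after a keepalive was seen."""
        ne = self.elements[ne_id]
        if not ne.keepalive_seen:
            return False
        ne.kpi_history.append(dict(sample))
        return True
```

In practice `collect_kpis` would be invoked at the pre-configurable time intervals mentioned above; the scheduling is left out of the sketch.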
The next stage is exemplified in
Another phase of the network cloud operation is exemplified in
This phase starts by retrieving KPIs collected from different NEs (step 400). Then, the cloud orchestrator analyzes the traffic flows that are conveyed via these NEs and identifies traffic trends (such as future possible congestion, etc.) based on these traffic flows that were analyzed (step 410). In view of the identified trends, one or more automatic actions and their associated threshold values are suggested to be carried out in the network cloud (step 420), in order to adequately act on the scenarios predicted based on the trends identified in step 410. Once the changes (the new automated actions) are approved (step 430), the new actions and their respective threshold values are added to the cloud orchestrator storage for automatic execution thereof (step 440), and upon occurrence of the situations for which the new actions were added, the actions will be executed automatically (step 450).
When a configuration change to a network element (e.g. a node) or a plurality of nodes is required, the cloud orchestrator (acting as an administrator) may define a configuration patch (e.g. certain configuration lines or scripts) and set a list of one or more threshold values that will trigger that configuration change. The cloud orchestrator triggers the required configuration change when threshold values are exceeded, logs the executed changes and allows rollbacks. The system may execute actions in response to threshold values being exceeded, and machine learning (Artificial Intelligence) actions can be carried out for configuration, administration and/or orchestration types of activities. Such actions may be, for example, one of the following actions:
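Steps 420 to 450 of this phase can be sketched as a store that separates suggested actions from approved ones, so that only approved actions are executed automatically. The names `SuggestedAction` and `ActionStore`, and the representation of an action as a Python callable, are assumptions made for illustration; the trend analysis of step 410 is outside the scope of the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SuggestedAction:
    """Hypothetical automated action tied to one KPI threshold."""
    name: str
    kpi: str
    threshold: float
    run: Callable[[str], str]   # takes an NE id, returns a description of what was done


class ActionStore:
    def __init__(self) -> None:
        self.pending: Dict[str, SuggestedAction] = {}
        self.approved: Dict[str, SuggestedAction] = {}

    def suggest(self, action: SuggestedAction) -> None:
        """Step 420: record a suggested action and its threshold."""
        self.pending[action.name] = action

    def approve(self, name: str) -> None:
        """Steps 430 and 440: approval moves the action into active storage."""
        self.approved[name] = self.pending.pop(name)

    def on_kpis(self, ne_id: str, kpis: Dict[str, float]) -> List[str]:
        """Step 450: run every approved action whose threshold is exceeded."""
        results = []
        for action in self.approved.values():
            if kpis.get(action.kpi, float("-inf")) > action.threshold:
                results.append(action.run(ne_id))
        return results
```

Keeping pending and approved actions in separate containers means that a suggestion can never fire before it has been approved, which mirrors the ordering of steps 430 to 450.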
During operation, periodic calculations may be carried out using recently retrieved KPIs in order to identify trends in the network cloud operation. A machine learning algorithm may be used to generate hourly and/or daily and/or weekly trends, which may then be displayed visually to the network operator.
Furthermore, based on the calculated trends, predictions can be made and be then translated into relevant threshold values. The threshold values may be saved in a thresholds database which is comprised in this example within the cloud orchestrator, so that an event manager (part of the functionality carried out by the cloud orchestrator server) may trigger events upon exceeding these relevant threshold values.
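The chain from trend to prediction to threshold to event can be sketched as follows. In place of the machine learning algorithm mentioned above, the sketch uses a plain least-squares line, and the rule that converts a prediction into a threshold (a fixed margin below the smaller of the forecast and the capacity) is an assumption introduced only for this example. The function and class names are hypothetical.

```python
from typing import Dict, List, Sequence


def linear_forecast(samples: Sequence[float], steps_ahead: int) -> float:
    """Extrapolate equally spaced samples with a least-squares straight line."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    den = sum((x - mean_x) ** 2 for x in range(n))
    slope = num / den
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + steps_ahead)


def derive_threshold(forecast: float, capacity: float, margin: float = 0.9) -> float:
    """Assumed rule: alert at `margin` times the lower of forecast and capacity."""
    return margin * min(forecast, capacity)


class EventManager:
    """Hypothetical event manager backed by an in-memory thresholds database."""

    def __init__(self) -> None:
        self.thresholds: Dict[str, float] = {}

    def save_threshold(self, kpi: str, value: float) -> None:
        self.thresholds[kpi] = value

    def check(self, kpis: Dict[str, float]) -> List[str]:
        """Return the names of KPIs whose stored threshold is exceeded."""
        return [k for k in sorted(kpis)
                if k in self.thresholds and kpis[k] > self.thresholds[k]]
```

With samples `[10, 20, 30, 40]`, the fitted line rises by 10 per interval, so a forecast two intervals ahead gives 60; the derived threshold is then saved with `save_threshold` and later compared against incoming KPIs by `check`.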
In addition, a list of required or recommended actions may be generated based on the analysis of the collected information and the calculated predictions, and the managing entity of the cloud orchestrator (an administrator) may be used to determine whether a certain action should be executed, or whether to avoid performing a certain action, or whether to automate a certain action, so that when applicable, that action will be carried out automatically.
Moreover, monitoring of failures may be carried out according to the following embodiment:
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/IL2019/051247 | 11/16/2019 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62778897 | Dec 2018 | US |