A robust, disruption-free network with assured quality of service (QoS) is a key performance criterion for any enterprise network. With the growth of edge devices and cloud services, the traffic seen by networking devices is increasing significantly. It is critical that network devices, such as routers, can handle this increasing load and dynamically adapt to new demands. However, hardware resources in such network devices are limited, so these resources need to be managed efficiently. While many operations can be handled in software, a software-only approach may increase latency, for example by consuming additional time and CPU cycles. Thus, it may be desirable to implement a solution that manages both the hardware and software resources available at the network device in order to adapt dynamically to future demands.
The present disclosure, in accordance with one or more various embodiments, is described in detail with reference to the following figures. The drawings are provided for purposes of illustration only and merely depict typical or example embodiments. These drawings are provided to facilitate the reader's understanding of various embodiments and shall not be considered limiting of the breadth, scope, or applicability of the present disclosure. It should be noted that for clarity and ease of illustration these drawings are not necessarily made to scale.
The figures are not intended to be exhaustive or to limit various embodiments to the precise form disclosed. It should be understood that various embodiments can be practiced with modification and alteration.
An example system consistent with this disclosure forecasts and predicts a resource utilization for a network device, such as a router, by monitoring past resource consumption patterns. In certain examples, such resource utilization is forecast and predicted particularly for multicast flows within the network. An Artificial Intelligence (AI) forecasting model, such as an Autoregressive Integrated Moving Average (ARIMA) model, may be leveraged to perform real-time analysis, forecasting, and predicting of the future demands on the network. Based on the real-time predictions, the network device can efficiently reorganize its limited resources, optimizing its performance. Furthermore, the system can perform these adaptations to optimally meet the predicted demands without disrupting the existing multicast flows on the network, thereby ensuring that communication on the network remains reliable and uninterrupted. According to some embodiments, elements that are configured for performing the disclosed multicast resource management can be embedded in a network device, for example incorporating a micro-service (executing some multicast resource management aspects) in the switch software. Furthermore, some elements that are configured for performing the disclosed multicast resource management may be implemented within a cloud-based service, for instance a component of a cloud-based management service which handles policy provisioning aspects of multicast resource management.
The central management server 120 can be implemented as a cloud-based network management solution that enables cloud-based network monitoring and control. As seen in
The networking environment of
The router 130 can be configured to handle the routing of multicast traffic to and/or from the multicast client devices 110A-110C. For example, during use of these multimedia applications at the respective multicast client devices 110A-110C, the router 130 can direct the associated multicast traffic from the multicast source 111 to the multicast clients 110A-110C. However, as alluded to above, as the multicast traffic significantly increases on the network (e.g., as a result of more clients or more applications utilizing multicasting) it becomes increasingly critical that the router 130 can appropriately handle the increased loads and has the capability to dynamically adapt to the demands on the network. For example, key resources of the router 130 that may be consumed by the multicast protocols can include, but are not limited to: 1) Internet Protocol (IP) Multicast table entries in the router's Application Specific Integrated Circuit (ASIC) to program bridge and route entries; and 2) Central Processing Unit (CPU) cycles to handle unknown multicast data and register packets. Even further, dynamically adapting the router's resources in order to meet the demands of multicast traffic is particularly difficult, as the demands are continuously changing on a network. Often, the demands relating to multicasting on the network change extremely quickly, essentially in real-time.
In order to address these challenges, the disclosed embodiments implement a network device that is distinctly configured to not only predict future demands on the network for multicast traffic in real-time, but also to dynamically adjust management and utilization of its resources in a manner that is optimal to meet the demands on the network (based on the predictions). According to the embodiments, the router 130 can include several elements that support the distinct capabilities to achieve multicast resource management, including: data monitoring, forecast modeling, and real-time prediction of multicast flows. In the example of
As illustrated, the flow prediction controller 131 resides on the router 130 and is configured to monitor resource consumption and demands on the network, for example monitoring the multicast traffic and other data that is communicated from the multicast client devices 110A-110C. The flow prediction controller 131 can use this monitored data to train the AI forecasting model 134, such that AI algorithms can predict the future resource demands.
Additionally, the router 130 is configured to include a resource optimizer 132. The resource optimizer 132 is employed to optimize and re-organize critical resources of the router 130, in order to accommodate new flows without disturbing existing flows. As alluded to above, the disclosed embodiments realize advantages over other resource management approaches by achieving this optimization without disrupting the existing multicast flows, and without trading off increased optimization against reduced reliability (e.g., increased dropped packets, greater multicast traffic latency, lost or interrupted multicast flows). Thus, the router 130 can dynamically adjust management of its resources and direct multicast traffic amongst the multicast clients 110A-110C in a manner that is optimized for the current and predicted future demands of these devices on the network, without disrupting other multicast flows on the network.
It should be understood that the disclosed embodiments are described with respect to a router 130 for purposes of illustration. The description is not intended to be limited to the configuration of
Referring now to
A key function of the flow prediction controller 231 is data monitoring. As illustrated, data 205 can be received by the flow prediction controller 231 while monitoring the network for traffic and indications of demands. Table 260, shown in FIG. 2B, indicates the data sets monitored and the inferences and classifications that can be made by monitoring them over a period of time.
As shown in table 260, data 270 can include multicast routing table changes, and the corresponding classification/inferences 271 can include detecting the stable flows, detecting flows that are added and removed periodically, detecting interfaces that are subscribing and leaving periodically, and detecting random flows. Data 272 can include IGMP and MLD membership changes, and the corresponding classification/inference 273 can include detecting stable membership joins, detecting random membership joins, detecting periodic membership joins and leaves, determining the number of ports joined for a given group and the total ports in the VLAN ratio, and determining the number of active groups in the VLAN and a “joined_ports_to_VLAN_ports_ratio” for each group. Also, data 274 can include maximum multicast route/bridge entries seen and the duration, and the corresponding classification/inference 275 can include detecting the peak resource consumption and its duration. Data 276 can include the maximum IGMP/MLD joins seen, and the corresponding classification/inference 277 can include detecting the peak resource consumption and its duration. Additionally, data 278 can include multicast data and register packets seen in the CPU, and the corresponding classification/inference 279 can include detecting continuous packet redirection to the CPU and spikes in the CPU consumption. In some embodiments, the table 260 can be stored in a memory of flow prediction controller 231, where the table 260 is employed to govern the data that is analyzed and actions of the flow prediction controller 231 during data monitoring.
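For illustration, the "joined_ports_to_VLAN_ports_ratio" inference described for data 272 can be sketched as a small Python helper. The function name and the use of plain sets for port identifiers are assumptions made for this sketch; the disclosure does not prescribe a particular data model.

```python
def joined_ports_to_vlan_ports_ratio(joined_ports, vlan_ports):
    """Fraction of a VLAN's ports that have joined a given multicast group.

    joined_ports: iterable of port identifiers that joined the group.
    vlan_ports:   iterable of all port identifiers in the VLAN.
    """
    vlan = set(vlan_ports)
    if not vlan:
        return 0.0
    # Only count joined ports that actually belong to this VLAN.
    return len(set(joined_ports) & vlan) / len(vlan)
```

A high ratio for a group could, for instance, suggest the group is widely subscribed on that VLAN, informing the classifications in table 260.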
In order to proactively manage the multicast resources, real-time prediction of the multicast flows is another key function that is performed by the flow prediction controller 231 shown in
The AI forecasting model 234 analyzes the data 205 obtained from data monitoring, for forecasting and predicting the multicast flows and their resource utilization. Further analysis of the predictions is also key to improving the performance of the AI forecasting model 234. Accordingly, in some embodiments, the flow prediction controller 231 can also perform a prediction performance analysis, for instance using a Root Mean Square Error (RMSE) metric, in order to determine a quantitative indication of performance for the AI forecasting model 234.
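The RMSE metric mentioned above can be computed directly from paired predicted and observed values; a minimal sketch in Python:

```python
import math

def rmse(predicted, actual):
    """Root Mean Square Error between predicted and observed values:
    sqrt(mean((predicted - actual)^2)). A lower RMSE indicates the
    forecasting model's predictions track the observed demand more closely."""
    if len(predicted) != len(actual):
        raise ValueError("series must be the same length")
    if not actual:
        raise ValueError("series must be non-empty")
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )
```

Tracking this value over successive prediction windows gives the quantitative indication of performance that the flow prediction controller 231 can use to evaluate the AI forecasting model 234.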
As previously described, the flow prediction controller 231 collects and prepares data 205. The data samples that are required for modelling are collected at regular time intervals, as specified by the policy engine. Examples of features that are collected from the data 205 are listed in table 260 of
In
The resource optimizer 232 is configured to handle resource allocations and re-organization of key resources of the network device in order to meet higher (or dynamically changing) demands, based on the real-time prediction. As seen, the resource optimizer 232 can receive predictions of multicast flows 220 that have been previously generated by the flow prediction controller (shown in
In the example, the process 300 begins at operation 305 where real-time forecasts and predictions of a resource utilization for multicast flows on a network device are derived. As described previously in detail, these predictions can relate to future demands on the network, and can involve applying an ARIMA model to the data that is collected from the network communicating multicast traffic, in order to generate a prediction as a result.
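In practice an ARIMA model would typically be fitted with a statistics library; as a self-contained sketch, the fragment below forecasts the next sample of a resource-utilization series using a random walk with drift, which corresponds to the ARIMA(0,1,0)-with-constant special case. The function name and the windowed drift estimate are assumptions of this sketch, not details from the disclosure.

```python
def forecast_next(series, window=4):
    """Forecast the next observation as the last value plus the average of
    the most recent first differences (drift). This is a simple stand-in
    for a fitted ARIMA model, usable on e.g. counts of multicast table
    entries sampled at regular intervals."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # First differences of the series, limited to the trailing window.
    diffs = [b - a for a, b in zip(series, series[1:])][-window:]
    drift = sum(diffs) / len(diffs)
    return series[-1] + drift
```

A prediction exceeding a resource threshold could then trigger the conditional check of operation 310.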
Thereafter, at operation 310, a conditional check is performed to determine whether the prediction indicates that a multicast flow is expected on the network device, or that the prediction indicates that a multicast join is expected. As referred to herein, a multicast join can refer to a message transmitted in order for a client to join a multicast group. In order to join a multicast group, a host sends a join message, for instance using IGMP, to its first-hop router. Multicast groups are identified by a single class D IP address (e.g., in the range 224.0.0.0 to 239.255.255.255). In this way, messages destined for a multicast group are addressed to the appropriate IP address, similar to other non-multicast group messages. In the case where the check determines that a multicast flow is predicted (shown as “FLOW” in
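The class D range noted above is exactly the IPv4 block 224.0.0.0/4, so a group address can be validated with the standard library:

```python
import ipaddress

def is_multicast_group(addr):
    """True when addr falls within the IPv4 class D multicast range
    224.0.0.0 - 239.255.255.255 (i.e., the 224.0.0.0/4 block)."""
    return ipaddress.ip_address(addr) in ipaddress.ip_network("224.0.0.0/4")
```

For example, 239.1.1.1 is a valid group address, while 10.0.0.1 is not.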
At operation 315, resource utilization for one or more multicast flows, based on the predicted multicast flow, is proactively programmed at the network device. In some cases, the prediction of a multicast flow also determines an expected duration that the predicted multicast flow is foreseen to be active. Proactively programming can involve programming hardware and software resources of the network device for one or more multicast flows, such as IP multicast table entries, CPU cycles, and the like. Furthermore, programming resource utilization for the one or more multicast flows is performed prior to the multicast traffic associated with the predicted multicast flow arriving at the network device. By proactively programming the network device's resources for one or more multicast flows (based on the predicted multicast flow) before the actual multicast traffic arrives at the network device, various advantages can be realized. For example, by proactively programming resources for the one or more multicast flows in operation 315, unknown multicast miss punting to the CPU can be avoided, and CPU cycles may be saved. This resource management action of operation 315 can also enable faster convergence of new multicast flows.
Next, at operation 320, the hardware tables are provisioned for the one or more flows (based on the predicted multicast flow). As alluded to above, the hardware tables are provisioned as a predictive resource management action, being completed before the multicast traffic actually arrives at the network device. In some cases, operation 320 can involve removing any hardware table entries that have been provisioned for a predicted flow after the expected duration, if the flow is not active.
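Operations 315 and 320 can be sketched together as a small table model that pre-provisions an entry for a predicted flow with its expected duration, and later purges entries that expired without the flow becoming active. The class and method names are hypothetical; actual ASIC table programming is device-specific.

```python
import time

class McastHwTable:
    """Sketch of a hardware multicast route/bridge table that is
    pre-programmed for predicted flows and cleaned up after their
    expected duration if the flow never materialized."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}  # (source, group) -> expiry timestamp

    def provision(self, source, group, expected_duration_s, now=None):
        """Program an entry before the traffic arrives. Returns False
        (leaving existing flows undisturbed) when the table is full."""
        now = time.time() if now is None else now
        key = (source, group)
        if key not in self.entries and len(self.entries) >= self.capacity:
            return False
        self.entries[key] = now + expected_duration_s
        return True

    def purge_expired(self, active_flows, now=None):
        """Remove provisioned entries whose expected duration has passed
        and whose flow is not active (per operation 320)."""
        now = time.time() if now is None else now
        stale = [k for k, exp in self.entries.items()
                 if now >= exp and k not in active_flows]
        for key in stale:
            del self.entries[key]
```

Because the entry already exists when the first packet of the predicted flow arrives, the unknown-multicast miss punt to the CPU described above is avoided.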
Referring back to operation 310, in the case where the conditional check determines that a join is predicted (shown as “JOIN” in
At operation 325, the resource utilization for one or more multicast joins, based on the predicted multicast join, is proactively programmed at the network device. In other words, network resources are proactively programmed for multicast joins, when a multicast join is predicted based on the network demands. According to the embodiments, the resources for the network device are proactively programmed before clients send a multicast join. This resource management action of operation 325 allows a faster response to the new clients. Next, at operation 330, multicast joins can be proactively simulated. Thereafter, at operation 335, o-lists (outgoing interface lists) or joined ports can be proactively populated. Accordingly, the resources of the network device are dynamically allocated in a manner that is optimized for a multicast join, thereby efficiently handling new multicast clients on the network.
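Operations 330 and 335 can be sketched as pre-populating the per-group joined-port set before the corresponding IGMP joins actually arrive. The dictionary-of-sets data model and function name are assumptions made for illustration.

```python
def simulate_joins(group_olists, group, predicted_ports):
    """Proactively populate the o-list (joined-port set) for a multicast
    group with ports predicted to join, so forwarding state already
    exists when the real IGMP join messages arrive.

    group_olists: dict mapping group address -> set of joined ports.
    """
    olist = group_olists.setdefault(group, set())
    olist.update(predicted_ports)
    return group_olists
```

When a predicted client then sends its join, the port is already in the group's o-list and traffic can be forwarded without waiting on new table programming.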
The computer system 400 also includes a main memory 406, such as a random-access memory (RAM), cache and/or other dynamic storage devices, coupled to fabric 402 for storing information and instructions to be executed by processor 404. Main memory 406 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 404. Such instructions, when stored in storage media accessible to processor 404, render computer system 400 into a special-purpose machine that is customized to perform the operations specified in the instructions.
The computer system 400 further includes storage devices 410 such as a read only memory (ROM) or other static storage device coupled to fabric 402 for storing static information and instructions for processor 404. A storage device 410, such as a magnetic disk, optical disk, or USB thumb drive (Flash drive), etc., is provided and coupled to fabric 402 for storing information and instructions.
The computer system 400 may be coupled via fabric 402 to a display 412, such as a liquid crystal display (LCD) (or touch screen), for displaying information to a computer user. An input device 414, including alphanumeric and other keys, is coupled to fabric 402 for communicating information and command selections to processor 404. Another type of user input device is cursor control 416, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 404 and for controlling cursor movement on display 412. In some embodiments, the same direction information and command selections as cursor control may be implemented via receiving touches on a touch screen without a cursor.
The computing system 400 may include a user interface module to implement a GUI that may be stored in a mass storage device as executable software codes that are executed by the computing device(s). This and other modules may include, by way of example, components, such as software components, object-oriented software components, class components and task components, processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables.
In general, the words “component,” “engine,” “system,” “database,” “data store,” and the like, as used herein, can refer to logic embodied in hardware or firmware, or to a collection of software instructions, possibly having entry and exit points, written in a programming language, such as, for example, Java, C or C++. A software component may be compiled and linked into an executable program, installed in a dynamic link library, or may be written in an interpreted programming language such as, for example, BASIC, Perl, or Python. It will be appreciated that software components may be callable from other components or from themselves, and/or may be invoked in response to detected events or interrupts. Software components configured for execution on computing devices may be provided on a computer readable medium, such as a compact disc, digital video disc, flash drive, magnetic disc, or any other tangible medium, or as a digital download (and may be originally stored in a compressed or installable format that requires installation, decompression or decryption prior to execution). Such software code may be stored, partially or fully, on a memory device of the executing computing device, for execution by the computing device. Software instructions may be embedded in firmware, such as an EPROM. It will be further appreciated that hardware components may be comprised of connected logic units, such as gates and flip-flops, and/or may be comprised of programmable units, such as programmable gate arrays or processors.
The computer system 400 may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 400 to be a special-purpose machine. According to one embodiment, the techniques herein are performed by computer system 400 in response to processor(s) 404 executing one or more sequences of one or more instructions contained in main memory 406. Such instructions may be read into main memory 406 from another storage medium, such as storage device 410. Execution of the sequences of instructions contained in main memory 406 causes processor(s) 404 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.
As used herein, a circuit might be implemented utilizing any form of hardware, software, or a combination thereof. For example, one or more processors, controllers, ASICs, PLAs, PALs, CPLDs, FPGAs, logical components, software routines or other mechanisms might be implemented to make up a circuit. In implementation, the various circuits described herein might be implemented as discrete circuits or the functions and features described can be shared in part or in total among one or more circuits. Even though various features or elements of functionality may be individually described or claimed as separate circuits, these features and functionality can be shared among one or more common circuits, and such description shall not require or imply that separate circuits are required to implement such features or functionality. Where a circuit is implemented in whole or in part using software, such software can be implemented to operate with a computing or processing system capable of carrying out the functionality described with respect thereto, such as computer system 400.
As used herein, the term “or” may be construed in either an inclusive or exclusive sense. Moreover, the description of resources, operations, or structures in the singular shall not be read to exclude the plural. Conditional language, such as, among others, “can,” “could,” “might,” or “may,” unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain embodiments include, while other embodiments do not include, certain features, elements and/or steps.
Terms and phrases used in this document, and variations thereof, unless otherwise expressly stated, should be construed as open ended as opposed to limiting. Adjectives such as “conventional,” “traditional,” “normal,” “standard,” “known,” and terms of similar meaning should not be construed as limiting the item described to a given time period or to an item available as of a given time, but instead should be read to encompass conventional, traditional, normal, or standard technologies that may be available or known now or at any time in the future. The presence of broadening words and phrases such as “one or more,” “at least,” “but not limited to” or other like phrases in some instances shall not be read to mean that the narrower case is intended or required in instances where such broadening phrases may be absent.