The present disclosure relates generally to computer network traffic. More particularly, the present disclosure relates to a system and method for capacity planning and congestion management planning in a computer network.
Service providers, including Internet Service Providers (ISPs), want to know what types of traffic flows are on their networks and how much traffic there is overall. Further, to provide subscribers with appropriate levels of Quality of Experience (QoE) and/or Quality of Service (QoS), it is beneficial to understand the applications, content, congestion levels and the like on the network to determine whether the network is appropriately meeting the subscribers' needs.
Users may experience various levels of QoE which may differ based on application, the content delivered, the congestion levels in a network, and the like. Operators of computer networks try to provide high levels of QoE across various applications, but as some applications may be able to provide a variety of different types of traffic flows (for example, different content within the different traffic flows), some traffic flows may be more affected by latency, loss, or other issues. Operators may wish to provide traffic management and provide capacity planning given the various uses of the resources on the network.
As such, there is a need for an improved method and system for capacity planning and congestion management on a network.
The above information is presented as background information only to assist with an understanding of the present disclosure. No determination has been made, and no assertion is made, as to whether any of the above might be applicable as prior art with regard to the present disclosure.
In a first aspect, there is provided a method for capacity planning in a computer network, the method including: determining traffic flow, application and network parameters; collecting the parameters for a predetermined time period; determining the features or parameters associated with various growth models; and combining the growth models to determine network capacity planning and congestion management planning for the network.
In some cases, the parameters are related to application throughput, subscriber throughput and node throughput.
In some cases, the parameters are related to application QoE, subscriber QoE and node QoE.
In some cases, high frequency parameters are aggregated over a longer time period.
In some cases, the growth models are machine learning models.
In some cases, the capacity planning is in a period of months in the future.
In some cases, the capacity planning is performed for one of node, link, or network device.
In some cases, the capacity planning is performed for node growth based on node capacity/throughput, node growth based on QoE, and node growth based on subscriber throughput.
In another aspect, there is provided a system for capacity planning in a computer network, the system including: a data collection module configured to collect data and parameters associated with the traffic flow, applications and nodes and links on the network; a forecasting module having a prediction engine configured to predict growth based on the QoE determined from the collected parameters; and a traffic action module configured to provide growth results and/or congestion management options.
In some cases, the data collection module is configured to collect parameters related to application throughput, subscriber throughput and node throughput.
In some cases, the data collection module is configured to collect parameters related to application QoE, subscriber QoE and node QoE.
In some cases, the data collection module is configured to aggregate high frequency parameters over a longer time period.
In some cases, the forecasting module is configured to use machine learning models for growth models.
In some cases, the traffic action module is configured to provide capacity planning in a period of months in the future.
In some cases, the traffic action module is configured to provide capacity planning for one of node, link, or network device.
In some cases, the traffic action module is configured to provide capacity planning for node growth based on node capacity/throughput, node growth based on QoE, and node growth based on subscriber throughput.
Other aspects and features of the present disclosure will become apparent to those ordinarily skilled in the art upon review of the following description of specific embodiments in conjunction with the accompanying figures.
Embodiments of the present disclosure will now be described, by way of example only, with reference to the attached Figures.
In the following, various example systems and methods will be described to provide example embodiment(s). It will be understood that no embodiment described below is intended to limit any claimed invention. The claims are not limited to systems, apparatuses or methods having all of the features of any one embodiment or to features common to multiple or all of the embodiments described herein. A claim may include features taken from any embodiment as would be understood by one of skill in the art. The applicants, inventors or owners reserve all rights that they may have in any invention disclosed herein, for example the right to claim such an invention in a continuing or divisional application and do not intend to abandon, disclaim or dedicate to the public any such invention by its disclosure in this document.
Generally, the present disclosure provides a method and system for capacity planning. The system and method are configured to collect high frequency data at routine intervals to store as historical data. The data may then be reviewed in various areas to be correlated and combined to predict growth of the network. The predicted growth may be used for capacity planning and congestion management planning for the network.
The system 100 is configured to be transparent to users. It will be understood that
A system 100 for capacity planning is intended to reside in the core network or receive data from the core network. In particular, the system 100 is intended to be in a location where the system is able to access the data noted herein. It will be understood that in some cases the system may be a physical network device or may be a virtual network device. The system may also be distributed over a number of physical or virtual devices. It will be understood that the system may be used on any IP based networking system, for example, Wi-Fi based, mobile data networks like GPRS, CDMA, 4G, 5G, LTE, satellite based, WLAN based networks, fixed line broadband fiber optic networks as well as on virtual private networks.
Conventionally, network operators have a challenge in knowing and accurately predicting growth in capacity. There have been difficulties determining what is really driving capacity growth. Typically, it has been difficult to know when it is time to upgrade or to enable congestion management to improve QoE, extend upgrade time, or to save Capital Expenditure (CAPEX).
In conventional capacity planning systems, the timing of a node upgrade is typically calculated using linear or simple regression math on a utilization metric. Utilization is typically a function of usage compared to capacity. These systems suffer from lack of accuracy, leading to either upgrading capacity too soon, which wastes CAPEX (upgrade costs tend to decrease over time, so investing early has a higher cost), or upgrading too late which leads to a loss of QoE. As such, there is a desire to have a more accurate system for capacity planning.
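As a non-limiting illustration of the conventional approach described above, the following minimal sketch fits a straight line to a monthly utilization metric and estimates how many months remain until a utilization threshold is reached. The function names, the 80% threshold and the sample history are assumptions made for illustration only and are not taken from any particular conventional system.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def months_until_threshold(utilization, threshold=0.8):
    """Months after the last sample until the fitted line reaches threshold.

    Returns None when the fitted trend is flat or falling.
    The 0.8 default threshold is an assumed value.
    """
    months = list(range(len(utilization)))
    slope, intercept = fit_line(months, utilization)
    if slope <= 0:
        return None
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - months[-1])


# Hypothetical monthly peak utilization (usage / capacity) for one node.
history = [0.50, 0.52, 0.54, 0.56, 0.58, 0.60]
print(months_until_threshold(history))
```

Because a single line is fitted to a single metric, such a sketch cannot distinguish growth caused by new subscribers from growth caused by changing application usage, which is one reason the estimate may be too early or too late.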
The data collection module 110 is configured to collect data associated with the traffic flows, for example, external inputs such as node capacity, cost of node and the like. The data collection module 110 is further configured to collect node level Key Performance Indicators (KPIs), for example, node throughput, number of subscribers, average round trip time and the like, and application and application category KPIs, for example, peak throughput, peak subscribers, application popularity, throughput for application and content categories and the like. The collected data is provided to the forecasting module 120. Provisioning such as node capacity may be performed by the service provider or may be determined by monitoring historical node throughput or the like.
The forecasting module 120 includes a prediction engine, as shown in
The traffic action module 130 is configured to determine whether congestion management traffic actions may be beneficial given the prediction from the forecasting module.
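The relationship between the three modules may be sketched, purely for illustration, as follows. All class names, fields, the compound-growth placeholder and the 80% utilization threshold are hypothetical and are not part of the described system; the prediction engine in particular is reduced to a stand-in so that the data flow from collection to forecasting to traffic action can be seen.

```python
from dataclasses import dataclass


@dataclass
class NodeSample:
    node_id: str
    peak_throughput_mbps: float
    capacity_mbps: float


class DataCollectionModule:
    """Stores collected node-level KPIs (hypothetical in-memory store)."""

    def __init__(self):
        self.samples = []

    def collect(self, sample):
        self.samples.append(sample)


class ForecastingModule:
    """Stand-in for the prediction engine."""

    def predict_peak(self, samples, monthly_growth, months):
        # Placeholder model: compound growth applied to the latest peak.
        latest = samples[-1].peak_throughput_mbps
        return latest * (1 + monthly_growth) ** months


class TrafficActionModule:
    """Turns a predicted peak into a planning suggestion."""

    def recommend(self, predicted_peak_mbps, capacity_mbps, threshold=0.8):
        # The 0.8 utilization threshold is an assumed value.
        if predicted_peak_mbps / capacity_mbps >= threshold:
            return "consider congestion management or upgrade"
        return "no action"


collector = DataCollectionModule()
collector.collect(NodeSample("node-1", 700.0, 1000.0))
forecaster = ForecastingModule()
predicted = forecaster.predict_peak(collector.samples, 0.03, 12)
print(TrafficActionModule().recommend(predicted, 1000.0))
```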
By breaking down the network throughput usage into components, and analyzing each component's trend, the forecasting accuracy of these models is intended to be improved. These improvements may be further enhanced by identifying application usage and QoE in the network and the nodes. Using this data, the system may be able to determine what is causing the network growth and predict when capacity needs to be upgraded or when to enable congestion management. The components may be, for example, a node, link, network device or the like. A node, as a point of communication, can receive, send, forward, and process data within the network. A few examples are a port, an interface or the like.
The data collection module is configured to collect QoE Key Performance Indicators (KPIs) by way of an inline or offline network element that is in the network data path. The data is intended to be collected at a high frequency (for example, every 1 millisecond (ms), every 2 ms, or the like) to capture peak data and to eliminate any outliers.
Network throughput is a function of transfer of data volume over a time interval. If the time interval is sufficiently short, then a network path may be considered to be 100% used when sending a single packet. If the time interval is sufficiently long, then the network path may never be considered to be 100% used because sooner or later there will be idle time (e.g. when a packet is not travelling on the network). The time interval useful to serve as a utilization function for capacity planning should be chosen such that it captures noticeable QoE impact of high utilization. In this document, the system and method are described as using 5 seconds as a time interval, but 1 second, or even 5 minutes, or any time in between could also be used with reasonable results. It will be understood that different time intervals may be used in parallel in the forecasting model with good results.
A longer interval reduces the likelihood of detecting the peak throughput of a link. For instance, consider a scenario where the volume on a link is measured as 10 million bytes over 5 seconds. This translates to an average throughput of 16 Mbps (10,000 kilobytes * 8 / 5 seconds). However, if 5 million of those bytes are sent within the first second, the throughput measured for that second would be 40 Mbps (5,000 kilobytes * 8 / 1 second).
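The effect of the measurement interval in the above example may be reproduced with the following minimal sketch. The function name and the per-second byte counts after the first second are assumptions chosen so that the total is 10 million bytes over 5 seconds.

```python
def peak_throughput_mbps(bytes_per_second, window_s):
    """Highest average throughput over aligned windows of window_s seconds.

    A trailing partial window is still divided by window_s, so it can
    only under-report, never over-report, the throughput.
    """
    peaks = []
    for start in range(0, len(bytes_per_second), window_s):
        window = bytes_per_second[start:start + window_s]
        bits = sum(window) * 8
        peaks.append(bits / window_s / 1_000_000)
    return max(peaks)


# 10 million bytes over 5 seconds, 5 million of them in the first second.
samples = [5_000_000, 1_250_000, 1_250_000, 1_250_000, 1_250_000]
print(peak_throughput_mbps(samples, 5))  # 16.0
print(peak_throughput_mbps(samples, 1))  # 40.0
```

The same traffic therefore yields 16 Mbps when measured over 5 seconds and 40 Mbps when measured per second, which is why the choice of interval affects whether a peak is detected.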
Embodiments of the system and method detailed herein are intended to determine what makes up the time interval, in this example, 5 seconds of traffic on a network segment. For a forecasting model to be useful, the system and method are intended to find leading indicators or components of growth. In research that has been completed by the applicant, it has been found that the following components when used give good accuracy in forecasting:
The system may further determine the parameters or circumstances with respect to the features associated with the network, network components, applications and application and content categories, at 230. The results are intended to be stored for historical input into the prediction engine, 240. The results may be stored in a database, at 250.
The peak data for each 5-minute interval is stored in the database. In a particular example, the peak for the first 5 minutes, calculated at a 1-second granularity, is 10 Gbps, and the peak for the next 5 minutes, calculated at the same granularity, is 11 Gbps. The application peak is also calculated: in this example, the application peak for Netflix is 400 Mbps for the first 5 minutes and 450 Mbps for the next 5 minutes. The data regarding node peak and application peak throughput is stored in the database. This information is necessary for computing the peaks of both nodes and applications over a 1-day interval.
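One way the stored 5-minute peaks could be reduced to 1-day peaks is sketched below. The record layout, the function name and the date are hypothetical; only the peak values are taken from the example above, expressed in Mbps.

```python
from collections import defaultdict


def daily_peaks(interval_peaks):
    """Reduce (day, series, peak_mbps) records to one peak per day and series."""
    result = defaultdict(float)
    for day, series, peak in interval_peaks:
        key = (day, series)
        result[key] = max(result[key], peak)
    return dict(result)


# Two consecutive 5-minute intervals; the date is a placeholder.
records = [
    ("2023-01-01", "node", 10_000.0),    # 10 Gbps node peak
    ("2023-01-01", "node", 11_000.0),    # 11 Gbps node peak
    ("2023-01-01", "Netflix", 400.0),    # application peak
    ("2023-01-01", "Netflix", 450.0),    # application peak
]
print(daily_peaks(records))
```

Taking the maximum of the interval peaks preserves the true daily peak, whereas averaging the intervals would hide it.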
Once a database has been created, the system may predict growth of the network. The historical data may be read from the database, at 260. The system may determine whether to predict per node and/or link QoE, at 270, per application and category QoE, at 280, per subscriber tier plan QoE, at 290, per device QoE, at 300, or the like. In some cases, to determine per node and/or link QoE, at 270, the features or parameters used may include peak throughput, 95th percentile throughput, average throughput and total number of subscribers, and a node or link growth may be predicted over a chosen time period, for example, growth over a particular number of months, at 310.
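The per node or link features named above may be derived from stored throughput samples as in the following sketch. The nearest-rank percentile definition, the function names and the sample values are assumptions; the disclosure does not specify how the 95th percentile is computed.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile (an assumed definition)."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def node_features(throughput_mbps, subscribers):
    """Feature set for one node or link over one collection period."""
    return {
        "peak": max(throughput_mbps),
        "p95": percentile(throughput_mbps, 95),
        "average": sum(throughput_mbps) / len(throughput_mbps),
        "subscribers": subscribers,
    }


# Hypothetical samples: 1, 2, ..., 100 Mbps, with 2500 subscribers.
samples = list(range(1, 101))
features = node_features(samples, 2500)
print(features)
```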
For per application and category QoE, at 280, the system may be configured to determine the total subscribers using the application in each high frequency data collection period and an average throughput for all applications for the same interval. The system may then predict the application and category growth in a chosen time period, at 320. For subscriber tier plan QoE, at 290, the system may determine the total subscribers using the tier plan within each high frequency data collection period and the average throughput for the same interval. The system may then predict subscriber tier plan growth in the chosen time period, at 330. For device QoE, at 300, the system may determine the total subscribers using the device in each high frequency data collection period and the average throughput for the same interval. The device's growth may then be predicted in the chosen time period, at 340. Once the predictions have been made, the system may correlate all the features and predictions to determine an overall growth and growth contribution to the network, at 350.
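As one possible reading of the growth contribution determined at 350, the following sketch attributes the total predicted growth to each component in proportion to that component's predicted increase. This simple additive decomposition, the function name and the category values are assumptions for illustration; the correlation performed by the system may be more involved.

```python
def growth_contributions(current, forecast):
    """Share of total forecast growth attributed to each component."""
    deltas = {name: forecast[name] - current[name] for name in current}
    total = sum(deltas.values())
    if total == 0:
        return {name: 0.0 for name in deltas}
    return {name: delta / total for name, delta in deltas.items()}


# Hypothetical current and predicted peak throughput per category, in Mbps.
current = {"video": 400.0, "gaming": 100.0, "web": 200.0}
forecast = {"video": 550.0, "gaming": 125.0, "web": 225.0}
print(growth_contributions(current, forecast))
```

In this hypothetical case, video accounts for three quarters of the predicted growth, which would point to video traffic as the main driver of a capacity upgrade.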
Embodiments of the system and method are configured to predict upgrade time for the network's capacity and determine when congestion management should be enabled. Once a regression analysis on each component is done and a forecast is produced, the system is also intended to be configured to determine the results of a what-if analysis on each component. For example:
What if the network attracted 3000 more net new subscribers?
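Such a what-if question may be approximated as in the following sketch, which assumes that new subscribers contribute the same peak throughput per subscriber as existing subscribers. That assumption, the function name and the forecast values are hypothetical and serve only to illustrate the calculation.

```python
def what_if_new_subscribers(forecast_peak_mbps, forecast_subscribers,
                            extra_subscribers):
    """Adjust a forecast peak for additional subscribers.

    Assumes each new subscriber adds the same peak throughput as the
    average existing subscriber.
    """
    per_subscriber = forecast_peak_mbps / forecast_subscribers
    return forecast_peak_mbps + per_subscriber * extra_subscribers


# Hypothetical forecast: 12,000 Mbps peak with 30,000 subscribers.
adjusted = what_if_new_subscribers(12_000.0, 30_000, 3_000)
print(adjusted)
```

With these hypothetical inputs, 3000 additional subscribers raise the forecast peak by about 10%, to roughly 13,200 Mbps.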
Prediction is carried out using various machine learning methods. The selected method analyzes multiple features and considers seasonality to forecast node growth. In some cases, the operator or service provider can choose from three prediction models:
In order to predict 12 months in advance, it is beneficial to have sufficient historical data as shown in
In the preceding description, for purposes of explanation, numerous details are set forth in order to provide a thorough understanding of the embodiments. However, it will be apparent to one skilled in the art that these specific details may not be required. In other instances, well-known structures may be shown in block diagram form in order not to obscure the understanding. For example, specific details are not provided as to whether the embodiments or elements thereof described herein are implemented as a software routine, hardware circuit, firmware, or a combination thereof.
Embodiments of the disclosure or elements thereof can be represented as a computer program product stored in a machine-readable medium (also referred to as a computer-readable medium, a processor-readable medium, or a computer usable medium having a computer-readable program code embodied therein). The machine-readable medium can be any suitable tangible, non-transitory medium, including magnetic, optical, or electrical storage medium including a diskette, compact disk read only memory (CD-ROM), memory device (volatile or non-volatile), or similar storage mechanism. The machine-readable medium can contain various sets of instructions, code sequences, configuration information, or other data, which, when executed, cause a processor to perform steps in a method according to an embodiment of the disclosure. Those of ordinary skill in the art will appreciate that other instructions and operations necessary to implement the described implementations can also be stored on the machine-readable medium. The instructions stored on the machine-readable medium can be executed by a processor or other suitable processing device and can interface with circuitry to perform the described tasks.
The above-described embodiments are intended to be examples only. Alterations, modifications and variations can be effected to the particular embodiments by those of skill in the art without departing from the scope, which is defined solely by the claims appended hereto.
The present disclosure claims priority to U.S. Provisional Patent Application No. 63/476,819 filed Dec. 22, 2022, which is hereby incorporated herein in its entirety.
Number | Date | Country
---|---|---
63476819 | Dec 2022 | US