The disclosure generally relates to electronic communication techniques (e.g., CPC class H04) and arrangements for maintenance or administration of packet switching networks (e.g., CPC subclass H04L 41/00).
The terms wide area network (WAN) and local area network (LAN) identify communications networks of different geographic scope. For a LAN, the geographic area can range from a residence or office to a university campus. For a WAN, the geographic area is defined relative to a LAN: it is greater than the area of a LAN. In the context of telecommunications, a circuit refers to a discrete path that carries a signal through a network between two remote locations. A circuit through a WAN can be a physical circuit or a virtual/logical circuit. A physical WAN circuit refers to a fixed, physical path through a network. A dedicated or leased line arrangement uses a physical WAN circuit. A logical WAN circuit refers to a path between endpoints that appears fixed but is one of multiple paths through the WAN that can be arranged. A logical circuit is typically implemented according to a datalink and/or network layer protocol, although a transport layer protocol (e.g., transmission control protocol (TCP)) can support a logical circuit.
The Software-defined Network (SDN) paradigm decouples a network management control plane from the data plane. An SDN controller that implements the control plane imposes rules on switches and routers (physical or virtual) that handle Internet Protocol (IP) packet forwarding in the data plane. The limitations of managing traffic traversing a WAN invited application of the SDN paradigm in WANs.
Embodiments of the disclosure may be better understood by referencing the accompanying drawings.
The description that follows includes example systems, methods, techniques, and program flows that embody aspects of the disclosure. However, it is understood that this disclosure may be practiced without these specific details. For instance, this disclosure refers to scoring a path based on circuit data in illustrative examples. Data used for scoring a path will depend upon configuration of the measuring network devices. Aspects of this disclosure can also be applied to tunnels provisioned on a circuit. In other instances, well-known instruction instances, protocols, structures and techniques have not been shown in detail in order not to obfuscate the description.
A network path scoring system is disclosed herein that scores “health” of network paths in terms of packet loss. The system scores health of a network path based on packet loss of the network path, bandwidth capacity (“bandwidth”) of a corresponding software-defined WAN (SD-WAN) circuit (“network circuit” or “circuit”), and bandwidth utilization (“load”) of the circuit. The scoring is done for the ingress and egress packet loss and occurs in nearly real-time (NRT) to aid with detection of network problems, including transient or ephemeral problems which can impact application performance and possibly violate a service level agreement.
The scoring uses a “dynamic packet loss threshold” that is based on benchmarks of “good” packet loss behavior of network paths associated with circuits of different bandwidths and recent behavior of the path being scored. The observations for good packet loss behavior are bucketized by corresponding circuit load. For the path being scored, observations are also bucketized and aggregated into a moving average per load bucket. The moving averages represent recent behavior of the path by load bucket. The scoring system scores a path as a function of the current time interval packet loss of the network path being scored and the dynamic packet loss threshold of the current time interval. The dynamic packet loss threshold of the current time interval is a function of a good packet loss benchmark and the packet loss moving average for the load of the current time interval.
At stage A, the edge device 105 obtains packet loss data of a network path for a current time interval. A “current” time interval refers to a time interval that has most recently elapsed. An NRT scoring system can be implemented as a network appliance with a hardware or software form factor. In
At stage B, the edge device 105 selects a good behavior benchmark for a circuit load bucket of the current time interval from a benchmark table 131. The benchmark table 131 is a structure that associates defined good behavior benchmarks with buckets of circuit bandwidth utilization (“circuit load”). The edge device 105 computes or retrieves the circuit load over the time interval. Circuit load is determined based on the circuit capacity, which is defined/configured, and amount of received data over the time interval for scoring based on ingress circuit data. For egress scoring, the circuit load will be based on amount of transmitted data. The time granularity for determining circuit load aligns with the scoring time interval. Use of circuit load as a percent of capacity allows scoring to be agnostic with respect to circuit capacity, which allows the scoring to be with respect to the good behavior benchmark. Assuming the network path being scored is the network path 113, then the scoring system would determine ingress load for the circuit 114 for ingress scoring and egress load for the circuit 114 for egress scoring. The packet loss data would be based on probes transmitted between the path endpoints 105, 107.
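The load determination described above can be illustrated with a minimal sketch. The function name `circuit_load_percent`, its parameters, and the example figures (a 100 Mbps circuit and a 60-second interval) are assumptions made for illustration and are not taken from the disclosure. For ingress scoring the byte count would be the amount of received data; for egress scoring it would be the amount of transmitted data.

```python
def circuit_load_percent(bytes_transferred: int,
                         capacity_bps: float,
                         interval_seconds: float) -> float:
    """Return circuit load as a percentage of configured circuit capacity.

    bytes_transferred: received bytes (ingress) or transmitted bytes (egress)
                       observed during the scoring time interval.
    capacity_bps:      configured circuit capacity in bits per second.
    interval_seconds:  length of the scoring time interval.
    """
    if capacity_bps <= 0 or interval_seconds <= 0:
        raise ValueError("capacity and interval must be positive")
    bits_transferred = bytes_transferred * 8
    bits_possible = capacity_bps * interval_seconds
    return 100.0 * bits_transferred / bits_possible


# Hypothetical figures: 337.5 MB received in 60 s on a 100 Mbps circuit.
ingress_load = circuit_load_percent(337_500_000, 100e6, 60)
# Hypothetical figures: 75 MB transmitted over the same interval.
egress_load = circuit_load_percent(75_000_000, 100e6, 60)
```

Because the result is a percentage of capacity, the same benchmark table can be consulted regardless of the capacity of the circuit.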
Returning to
As shown above, the dynamic packet loss upper threshold when the moving average is 3.0% is capped at the upper threshold of 2.92% when the load is 45%. Embodiments can compute the dynamic packet loss upper threshold differently with the constraints that the dynamic upper threshold not exceed the upper threshold and not fall below the lower threshold and that the dynamic upper threshold capture the dynamic behavior of the network path being scored. As an example, the dynamic packet loss upper threshold can be computed as a sum of the lower threshold defined for the current load and a square of the moving average. This is expressed as
dynamic_upper_threshold = lower_threshold + (moving_average * moving_average)
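A minimal sketch of this computation, with the stated constraints applied as a clamp, is given here. The upper threshold of 2.92% and the moving average of 3.0% come from the example at 45% load; the lower threshold of 0.73% is a hypothetical value chosen only for illustration, and the function name is likewise an assumption. All values are treated as percentages.

```python
def dynamic_upper_threshold(lower_threshold: float,
                            upper_threshold: float,
                            moving_average: float) -> float:
    """Lower threshold plus the square of the moving average, clamped so the
    result never falls below the lower threshold or exceeds the upper one."""
    candidate = lower_threshold + (moving_average * moving_average)
    return max(lower_threshold, min(candidate, upper_threshold))


# Hypothetical lower threshold of 0.73% for the 45% load bucket.
capped = dynamic_upper_threshold(0.73, 2.92, 3.0)    # candidate exceeds the cap
uncapped = dynamic_upper_threshold(0.73, 2.92, 1.0)  # candidate within range
```

With a moving average of 3.0% the candidate value exceeds the upper threshold, so the result is capped at 2.92%, consistent with the example at 45% load.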
At stage D, the edge device 105 computes an NRT network path score based on the packet loss of the current time interval and the dynamic packet loss upper threshold.
The edge device 105 computes the NRT score according to the expression:
NRT_Score = (Dynamic_Packet_Loss_Upper_Threshold - Packet_Loss) * 100 / Dynamic_Packet_Loss_Upper_Threshold
Table 2 below indicates the scores that would result from the example dynamic packet loss upper thresholds in Table 1.
The scoring is on a scale of 0-100 with allowance for negative scores depending upon implementation. As shown above in Table 2, the NRT circuit scores get worse with the increasing packet loss at the 45% load.
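The score computation can be sketched as follows. The function name `nrt_score` and the `allow_negative` flag are assumptions for illustration; the flag models the implementation-dependent choice of whether negative scores are permitted or floored at 0. The sample packet loss values are hypothetical and do not reproduce Table 2.

```python
def nrt_score(packet_loss: float,
              dynamic_upper_threshold: float,
              allow_negative: bool = True) -> float:
    """Score packet loss against the dynamic packet loss upper threshold.

    Zero packet loss scores 100, packet loss equal to the threshold scores 0,
    and packet loss beyond the threshold scores below 0 unless floored.
    """
    if dynamic_upper_threshold <= 0:
        raise ValueError("dynamic upper threshold must be positive")
    score = ((dynamic_upper_threshold - packet_loss) * 100.0
             / dynamic_upper_threshold)
    return score if allow_negative else max(0.0, score)


# Hypothetical packet loss values scored against a 2.92% dynamic threshold.
for loss in (0.0, 1.46, 2.92, 4.38):
    print(f"loss={loss:.2f}%  score={nrt_score(loss, 2.92):.1f}")
```

Scores decrease linearly as packet loss grows, so a path at half the dynamic threshold scores about 50.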
The edge device 105 can then update a visual representation 151 of an NRT score series with the path score for the current time interval. The circuit score visual representation 151 depicts, at each scored time, a smoothed NRT score as a descending line with the NRT score as a dot. The smoothed score smooths out dips and identifies intervals with sustained low scores.
At block 501, a scoring system detects packet loss for a current time interval for a network path. The scoring system can detect the packet loss for the current time interval by various means depending upon the monitoring infrastructure and application organization. A process or thread of the scoring system can detect that packet loss for a time interval is written to a monitored location or receive the percent packet loss over the time interval as calculated by another entity (e.g., program, process, etc.) collecting packet loss data and calculating statistical information. When the time interval elapses, the scoring system can query a repository or application for the percent packet loss of the last minute or at a specified time for an identified path.
At block 503, the scoring system determines a percent utilization of circuit bandwidth (“load”) of a circuit corresponding to the network path for the current time interval. As with the percent packet loss for a time interval, the scoring system can interact or query another system or application to obtain the current load on the circuit. Implementations of the scoring system may include functionality for computing load on the circuit for the currently elapsed time interval.
At block 505, the scoring system selects a packet loss lower threshold defined for the determined load. The scoring system accesses a structure that associates circuit load buckets with defined packet loss lower thresholds. The structure is not unique to the network path being scored and has been determined based on observations of packet loss of numerous network paths with good application performance. The scoring system will identify a circuit load bucket of the structure that encompasses the determined circuit load and select the packet loss lower threshold defined for the circuit load bucket.
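One way to picture the bucket selection is the sketch below. The bucket boundaries, the `Benchmark` type, and every threshold value except the 2.92% upper threshold for the bucket containing 45% load are hypothetical; the disclosure does not specify how many buckets exist or where their edges fall.

```python
from bisect import bisect_right
from typing import List, NamedTuple


class Benchmark(NamedTuple):
    lower: float  # packet loss lower threshold, in percent
    upper: float  # packet loss upper threshold, in percent


# Hypothetical load buckets: bucket i covers loads from BUCKET_STARTS[i]
# up to (not including) the next start; the last bucket extends to 100%.
BUCKET_STARTS: List[float] = [0.0, 20.0, 40.0, 60.0, 80.0]

# Hypothetical benchmark values, one entry per bucket.
BENCHMARKS: List[Benchmark] = [
    Benchmark(0.25, 1.00),
    Benchmark(0.50, 2.00),
    Benchmark(0.73, 2.92),
    Benchmark(1.00, 4.00),
    Benchmark(1.50, 6.00),
]


def select_bucket(load_percent: float) -> int:
    """Return the index of the load bucket that encompasses the load."""
    clamped = min(max(load_percent, 0.0), 100.0)
    return bisect_right(BUCKET_STARTS, clamped) - 1


def select_lower_threshold(load_percent: float) -> float:
    """Return the packet loss lower threshold defined for the load's bucket."""
    return BENCHMARKS[select_bucket(load_percent)].lower
```

For a determined load of 45%, `select_bucket` returns the index of the 40% to 60% bucket and `select_lower_threshold` returns the lower threshold stored for that bucket.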
At block 507, the scoring system updates a packet loss moving average for the determined load based on the packet loss of the current time interval. As previously discussed, the scoring system maintains a packet loss moving average for each circuit load bucket indicated in the benchmark structure. The scoring system reads the packet loss moving average of the bucket corresponding to the current circuit load and updates the moving average to incorporate current packet loss (i.e., packet loss of the most recently elapsed time interval). The moving average may be a weighted or smoothed moving average, for example an exponential moving average with a defined alpha (e.g., 0-0.3, exclusive of 0).
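A minimal sketch of a per-bucket exponential moving average follows. The class name `BucketedLossAverage`, the default alpha of 0.2, and the choice to seed an empty bucket with its first observation are assumptions for illustration rather than details of the disclosure.

```python
from typing import List, Optional


class BucketedLossAverage:
    """Maintains one exponential moving average of packet loss per load bucket."""

    def __init__(self, num_buckets: int, alpha: float = 0.2) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in the range (0, 1]")
        self.alpha = alpha
        # None marks a bucket that has not yet received an observation.
        self.averages: List[Optional[float]] = [None] * num_buckets

    def update(self, bucket: int, packet_loss: float) -> float:
        """Fold the current interval's packet loss into the bucket's average."""
        previous = self.averages[bucket]
        if previous is None:
            # Assumption: the first observation seeds the average.
            updated = packet_loss
        else:
            updated = self.alpha * packet_loss + (1.0 - self.alpha) * previous
        self.averages[bucket] = updated
        return updated
```

Only the bucket matching the current circuit load is updated, so each average reflects recent packet loss observed at comparable load.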
At block 509, the scoring system computes a sum of the updated packet loss moving average and the packet loss lower threshold. The packet loss lower threshold was selected based on the current circuit load (505).
At block 510, the scoring system determines whether the computed sum exceeds a packet loss upper threshold defined for the load bucket. The scoring system can retrieve the packet loss upper threshold defined for the bucket of the current circuit load from the benchmark structure. The scoring system can instead use the coefficient that relates the upper and lower thresholds to determine the packet loss upper threshold. If the sum exceeds the packet loss upper threshold, then operational flow continues to block 511. If the sum does not exceed the packet loss upper threshold, then operational flow continues to block 513.
At block 511, the scoring system sets the dynamic packet loss upper threshold as the packet loss upper threshold. The scoring system uses the packet loss upper threshold as a cap to reduce the impact of packet loss that can be considered noise or extreme deviations. Operational flow continues to block 515.
At block 513, the scoring system sets the dynamic packet loss upper threshold as the computed sum of the updated packet loss moving average and the packet loss lower threshold. This allows the circuit to be scored based on a range of acceptable packet loss below an upper threshold that accounts for recent behavior of the network path as represented by the moving average. Operational flow continues to block 515.
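Blocks 509 through 513 can be sketched as a single function. The function name and the sample lower threshold of 0.73% are hypothetical; the upper threshold of 2.92% is taken from the earlier example at 45% load.

```python
def dynamic_threshold_from_sum(moving_average: float,
                               lower_threshold: float,
                               upper_threshold: float) -> float:
    """Return the dynamic packet loss upper threshold for the current interval."""
    # Block 509: sum the updated moving average and the lower threshold.
    candidate = moving_average + lower_threshold
    # Block 510: compare the sum against the bucket's upper threshold.
    if candidate > upper_threshold:
        # Block 511: cap at the upper threshold.
        return upper_threshold
    # Block 513: use the sum as the dynamic upper threshold.
    return candidate


capped = dynamic_threshold_from_sum(3.0, 0.73, 2.92)
uncapped = dynamic_threshold_from_sum(1.0, 0.73, 2.92)
```

The conditional is equivalent to taking the minimum of the sum and the upper threshold; it is written out here to mirror the operational flow.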
At block 515, the scoring system computes an NRT packet loss score for the network path based on the current packet loss and the dynamic packet loss upper threshold. The score corresponds to where current packet loss for the network path falls within a range of acceptable packet loss defined from 0 to the dynamic upper threshold. The expression used in
Embodiments can compare each score against a configurable threshold for alarm or notification. For example, a threshold can be defined at 20. If a score falls below the threshold (or is less than or equal to the threshold), then a notification can be generated (e.g., text message sent, graphical display updated with an indication of a low score, etc.) and/or an alarm triggered. Different thresholds can be set for different levels of urgency.
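The threshold comparison can be sketched as follows. The threshold of 20 is taken from the example above; the second threshold of 10, the level names, and the `inclusive` flag (which selects between "falls below" and "less than or equal to") are assumptions for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical urgency levels as (score threshold, level name) pairs.
ALERT_THRESHOLDS: List[Tuple[float, str]] = [(10.0, "critical"), (20.0, "warning")]


def alert_level(score: float,
                thresholds: List[Tuple[float, str]] = ALERT_THRESHOLDS,
                inclusive: bool = False) -> Optional[str]:
    """Return the most urgent level whose threshold the score violates,
    or None when the score violates no threshold."""
    # Check the lowest threshold first so the most urgent level wins.
    for limit, level in sorted(thresholds):
        if score < limit or (inclusive and score == limit):
            return level
    return None


level = alert_level(15.0)
if level is not None:
    # Placeholder for sending a text message, updating a display, etc.
    print(f"path score 15.0 triggered a {level} notification")
```

Negative scores, where an implementation permits them, fall below every threshold and therefore map to the most urgent level.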
While the above examples refer to scoring a network path with ingress packet loss data, a network path score can be based on one of the egress and ingress scores (e.g., the lowest of the two scores) or based on both the ingress and egress scores (e.g., a sum of the scores). Accordingly, the example operations of
The flowcharts are provided to aid in understanding the illustrations and are not to be used to limit scope of the claims. The flowcharts depict example operations that can vary within the scope of the claims. Additional operations may be performed; fewer operations may be performed; the operations may be performed in parallel; and the operations may be performed in a different order. For example, the operations depicted in
As will be appreciated, aspects of the disclosure may be embodied as a system, method or program code/instructions stored in one or more machine-readable media. Accordingly, aspects may take the form of hardware, software (including firmware, resident software, micro-code, etc.), or a combination of software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” The functionality presented as individual modules/units in the example illustrations can be organized differently in accordance with any one of platform (operating system and/or hardware), application ecosystem, interfaces, programmer preferences, programming language, administrator preferences, etc.
Any combination of one or more machine readable medium(s) may be utilized. The machine readable medium may be a machine readable signal medium or a machine readable storage medium. A machine readable storage medium may be, for example, but not limited to, a system, apparatus, or device, that employs any one of or combination of electronic, magnetic, optical, electromagnetic, infrared, or semiconductor technology to store program code. More specific examples (a non-exhaustive list) of the machine readable storage medium would include the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a machine readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. A machine readable storage medium is not a machine readable signal medium.
A machine readable signal medium may include a propagated data signal with machine readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A machine readable signal medium may be any machine readable medium that is not a machine readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a machine readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
The program code/instructions may also be stored in a machine readable medium that can direct a machine to function in a particular manner, such that the instructions stored in the machine readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
Plural instances may be provided for components, operations or structures described herein as a single instance. Finally, boundaries between various components, operations and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of the disclosure. In general, structures and functionality presented as separate components in the example configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements may fall within the scope of the disclosure.
Use of the phrase “at least one of” preceding a list with the conjunction “and” should not be treated as an exclusive list and should not be construed as a list of categories with one item from each category, unless specifically stated otherwise. A clause that recites “at least one of A, B, and C” can be infringed with only one of the listed items, multiple of the listed items, and one or more of the items in the list and another item not listed.
Number | Date | Country
---|---|---
63261571 | Sep 2021 | US