The invention relates to methods for monitoring the performance of networks, and more particularly to the determination of packet loss rates.
The performance of data networks is sensitive to the loss of packets. To be able to optimize the performance of a network, the operator needs to have information about packet losses on various links of the network. For example, one major reason for packet losses is buffer overflow. By identifying and quantifying the loss rate attributable to buffer overflow at a node of the network, the network operator may be able to apply policies that relieve the traffic load at the overloaded link or overloaded node of the network.
The danger of overloads that lead to packet loss has become especially acute in 3G and 4G wireless networks. One reason is that on such networks there is competition for network resources among data applications, voice applications, and video and other applications, each of which has different bandwidth requirements and different delivery modes. This competition is exacerbated by the limited bandwidth available for content delivery over the air interface, and by the high volume of signaling overhead that is typically required to enable wireless communication.
As a consequence, there is an especially great need to monitor packet losses in advanced wireless networks. Particular advantage would be gained by monitoring most or all links between, e.g., the GGSN of a GPRS-supported network such as a W-CDMA mobile telecommunications network and each of the base stations with which the GGSN is associated. To do this by conventional methods, however, would require a monitoring device to be deployed on each of the links that are to be monitored. Because advanced wireless networks, among others, may have hundreds, or even thousands, of such links, such a wide deployment of monitoring devices is not usually feasible.
Therefore, there remains a need for methods of monitoring packet losses that can be deployed from a limited number of locations on the network and still obtain information on the loss rates on many individual links.
We have discovered such a method. Our method involves collecting data on downstream packet losses at a single point in a network, and from the collected data, estimating packet loss rates on at least two subnetworks downstream of the collection point, wherein the subnetworks differ by at least one link.
In particular embodiments, the data collection is performed by a dedicated hardware monitoring device. In some embodiments, such a device is deployed on the first link below the GGSN (Gateway GPRS Support Node) of a GPRS core network.
In particular embodiments, the collected data relate to packet losses on the core network, or the portion thereof, extending from the monitoring point to a plurality of base stations, and in some embodiments even extending to the mobile stations supported by the base stations.
Schematically illustrated in
It will be evident from
Our method can be applied to any packetized communication network that may be represented by a tree graph. As noted above, wireless GPRS networks such as the network of
Our method may be deployed on a computer or digital processor running at a network node such as the GGSN or other node of
In some embodiments, the machine on which the method is deployed will collect packet loss information by diverting a copy of packet traffic through a tap installed at an intermediate point on a link. For example,
In general, tap 70 may be situated at any point at or below the GGSN, because at such locations it will generally be able to obtain the network topology from signaling information. The monitoring device needs to know the topology of the tree network downstream of its location in order to be able to infer packet loss rates according to the method to be described.
We will now describe an illustrative embodiment of our method with reference to
Turning back to
It should be noted in this regard that in at least some implementations, it will be desirable to include the mobile stations as the leaves of the graph, but the mobile stations may be so numerous that it is not computationally feasible to consider individual air-interface links. In such cases, it will be advantageous to group the mobile stations into lumped, equivalent nodes in such a way that each base station serves at least two equivalent nodes. Although such an approach does not yield loss rates on individual air links, it will often provide useful information about loss rates as averaged over populations of air links.
In
One important observation we have made is that information collected at one link of the network, relating to losses of correlated pairs of packets, may be probative of loss rates on other, downstream links of the network. To put this observation into practice, we have defined a “pair” as two packets, destined for different end nodes, that were both transmitted from the root node within a time interval δ. The value of δ may be specified by the operator, or it may be determined adaptively. In many networks, there will be a high correlation between the losses of packets transmitted sufficiently close in time, not least because if one packet is dropped due to buffer overflow, a packet following a short time later is likely to meet the same fate. Although useful values for δ will depend on the properties of the specific network of interest, typical values in a W-CDMA network could lie in the range 50-100 ms.
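The pair definition above can be sketched in code. This is a minimal illustration, not the monitoring device's actual logic: the function name `find_pairs`, the `(time, destination)` tuple layout, and the choice to count every qualifying combination as a pair (so that one packet may belong to several pairs) are assumptions made only for this example.

```python
from typing import List, Tuple

# Hypothetical record layout: (transmit time at the root in seconds, destination end node)
Packet = Tuple[float, str]


def find_pairs(packets: List[Packet], delta: float) -> List[Tuple[Packet, Packet]]:
    """Return every two-packet combination sent within `delta` seconds of
    each other and destined for different end nodes."""
    ordered = sorted(packets)  # sort by transmit time
    pairs = []
    for a in range(len(ordered)):
        t_a, dst_a = ordered[a]
        for b in range(a + 1, len(ordered)):
            t_b, dst_b = ordered[b]
            if t_b - t_a > delta:
                break  # later packets are even further away in time
            if dst_a != dst_b:
                pairs.append((ordered[a], ordered[b]))
    return pairs
```

With δ of 0.08 s, a trace with packets to n4 at 0.00 s and to n7 at 0.03 s yields one pair; two packets to the same end node never form a pair, however close in time they are.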
There are currently available monitoring devices that can determine the packet loss rate, i.e., the fraction of packets that are lost over a suitable time-averaging interval, on the end-to-end path from the monitoring point to a network element that serves as a leaf node. One such device is the Alcatel-Lucent 9900 Wireless Network Guardian, which is available from Alcatel-Lucent Inc., having an office at 600 Mountain Avenue, Murray Hill, N.J. When such a device is situated, for example, at or just below the GGSN of a GPRS core network, it can readily measure the end-to-end packet loss rates between the GGSN and the mobile stations (individually or as grouped into equivalent nodes) served by the base stations associated with that GGSN.
Accordingly, one quantity that is measurable in many networks is the end-to-end packet loss rate Fi from a root node n0 to an end node ni. Given two distinct end nodes ni, nj, another often-measurable quantity is the probability Fi,j;δ that a packet destined for ni and a packet destined for nj will both be lost, given that the two packets belong to a pair as defined above.
Another quantity that is measurable in many networks is the fraction Wij of all packet pairs (counted over a suitable averaging period) transmitted from the root node that are destined for a given pair of end nodes ni, nj. That is, for all end nodes ni, nj, i≠j, let Nij represent the total count, over the averaging period, of packet pairs destined for (ni, nj). Then in general, Wij=Nij/ΣNlm, wherein the summation (i.e., over indices l, m) is taken over all pairs of distinct end nodes. For implementations of the method to be described below, the summation is taken only over all pairs of distinct outer nodes, which are defined below.
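The pair fractions Wij=Nij/ΣNlm can be computed directly from a list of observed pair destinations. The sketch below assumes that pairs are unordered, i.e., that (ni, nj) and (nj, ni) are counted together; the function name `pair_fractions` and the input format are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def pair_fractions(
    pair_destinations: Iterable[Tuple[str, str]]
) -> Dict[Tuple[str, str], float]:
    """Compute W_ij = N_ij / sum(N_lm) over all pairs of distinct end nodes.

    Assumption: pairs are unordered, so each key is stored in sorted order.
    """
    counts = Counter(
        tuple(sorted(p)) for p in pair_destinations if p[0] != p[1]
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: n / total for key, n in counts.items()}
```

By construction the returned fractions sum to one over the pairs that were observed during the averaging period.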
We will now describe a procedure, referred to herein as Algorithm 1, for estimating the packet loss rate f0i from a root node n0 to a selected intermediate node ni. Thus, for example, Algorithm 1 might be applied to estimate the loss rate from the root node to node n2 of
There is a criterion that an intermediate node must meet in order for it to be eligible as an inner node. To be eligible, the selected intermediate node must be a root node relative to at least two end nodes via distinct branches that intersect at the selected node. For example, node n2 of
We also introduce here the concept of an outer node, which was mentioned above. Given a selected inner node, a pair of end nodes are outer nodes if: (1) the selected inner node is a root relative to the selected end nodes, and (2) at least two distinct branches intersect at the selected inner node, each of which terminates at a respective one of the selected end nodes. In the example of
Turning now to
The next step 120 of the procedure is to select an inner node ni. In the example of
The next step 130 is to select two end nodes nj, nk which qualify as outer nodes. The next step 140 is to compute an estimate of the packet loss rate f0i from the root node n0 to the selected inner node ni using the information obtained in step 110. Formulas for making the computation are provided below. In the example of
We will now describe a procedure, referred to here as Algorithm 2, for estimating the packet loss rate on a path Pjk from a selected intermediate node nj to a selected intermediate node nk lying below nj in the tree graph. In order for the selected intermediate nodes nj and nk to be eligible for application of Algorithm 2, each of them must qualify as an inner node relative to at least one pair of outer nodes, as defined above in connection with Algorithm 1. Algorithm 2 will be described with reference to
In the example of
Turning now to
The next step 220 is to compute the loss rates f0j and f0k, using Algorithm 1. In the example of
The next step 230 is to compute the loss rate fjk between the selected intermediate nodes from the rates obtained in step 220. In the example of
It will be evident from
The formula used in step 230 is fjk=(f0k−f0j)/(1−f0j). In the example of
By repeatedly applying Algorithms 1 and 2, the packet loss rate can be estimated on every qualifying link. A link is qualifying if (a) it terminates on an end node, or (b) it terminates on an eligible inner node.
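The eligibility criterion for inner nodes and the definition of a qualifying link can be checked mechanically on a tree graph. The following sketch assumes the tree is given as a hypothetical parent-to-children mapping, and it reads the eligibility criterion as "an intermediate node with at least two child branches"; all function names are illustrative.

```python
from typing import Dict, List, Tuple

Tree = Dict[str, List[str]]  # hypothetical representation: parent -> children


def is_end_node(tree: Tree, node: str) -> bool:
    """An end node (leaf) has no children."""
    return not tree.get(node)


def is_eligible_inner(tree: Tree, root: str, node: str) -> bool:
    """An intermediate node is eligible if at least two distinct branches
    intersect at it; in a finite tree each branch ends at some end node."""
    return node != root and len(tree.get(node, [])) >= 2


def qualifying_links(tree: Tree, root: str) -> List[Tuple[str, str]]:
    """Links that terminate on an end node or on an eligible inner node."""
    links = []
    for parent, children in tree.items():
        for child in children:
            if is_end_node(tree, child) or is_eligible_inner(tree, root, child):
                links.append((parent, child))
    return links
```

A link that terminates on an intermediate node with only one child is not qualifying under this reading, because no pair of outer nodes branches at that node.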
For example, we will now describe with reference to
The end-to-end loss rate from n0 to n4 is measurable. Consequently, f04 is taken as F4.
The end-to-end loss rate f03 from n0 to the intermediate node n3 is computed by Algorithm 1 using the measured values of the end-to-end loss rates from the root node n0 to nodes n4 and n7 and the measured value of the pair-loss probability F4,7;δ. Then Algorithm 2 is used to compute the loss rate f34 on link l4.
Algorithm 1 is then used to compute the loss rate f02 from n0 to intermediate node n2 using the measured values of the end-to-end loss rates from the root node n0 to nodes n4 and n6 and the measured value of the pair-loss probability F4,6;δ. Algorithm 2 is then used to compute the loss rate f23 on link l3, using the determined values of f02 and f03.
Algorithm 1 is then used to compute the loss rate f01 from n0 to n1 using the measured values of the end-to-end loss rates from n0 to n4 and n8 and the measured value of the pair-loss probability F4,8;δ. Algorithm 2 is then used to compute the loss rate f12 on link l2 from f01 and f02.
The loss rate from n0 to n1, i.e., the loss rate on link l1, is set to rate f01, which has been computed using Algorithm 1.
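The repeated use of the Algorithm 2 formula fjk=(f0k−f0j)/(1−f0j) along a path can be sketched as follows. The sketch takes the root-to-node loss rates (for example f01, f02, f03, f04) as given, however they were obtained, and converts them to per-link rates; the function names are hypothetical.

```python
from typing import List


def segment_loss(f0j: float, f0k: float) -> float:
    """Loss rate between node j and a node k below it, given the loss
    rates from the root to each: f_jk = (f0k - f0j) / (1 - f0j)."""
    if f0j >= 1.0:
        raise ValueError("no packets reach the upstream node")
    return (f0k - f0j) / (1.0 - f0j)


def link_losses_along_path(cumulative: List[float]) -> List[float]:
    """Convert root-to-node loss rates, ordered from the node nearest the
    root downward, into per-link loss rates."""
    rates = []
    upstream = 0.0  # loss rate from the root to itself
    for f in cumulative:
        rates.append(segment_loss(upstream, f))
        upstream = f
    return rates
```

The first returned value equals the first cumulative rate, which matches setting the loss rate on link l1 to f01. Because the inputs are estimates, a computed per-link rate may come out slightly negative in practice; the sketch does not clamp such values.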
Mathematical Details
Let nk be a selected inner node, and let ni and nj be corresponding outer nodes. Let f0k be the packet loss rate from the root node n0 to node nk, and let fki and fkj be the respective packet loss rates from nk to ni, and from nk to nj. Under the assumptions that the two packets in a packet pair will experience the same loss event on the path they share, i.e., either both will succeed or both will fail to arrive at nk, and that loss events on different links are independent, it can be shown that
$$F_{ij} - F_i F_j = f_{0k}\,(1 - f_{0k})\,(1 - f_{ki})\,(1 - f_{kj})$$
and hence that
the sum being taken over all possible pairs of outer nodes.
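For a single pair of outer nodes, the stated assumptions can be turned into a direct point estimate. Under those assumptions, Fi = 1 − (1 − f0k)(1 − fki), Fj = 1 − (1 − f0k)(1 − fkj), and Fij = f0k + (1 − f0k)·fki·fkj, from which simple algebra gives f0k = (Fij − Fi·Fj)/(1 − Fi − Fj + Fij). The sketch below implements only this single-pair consequence of the model; it is an illustration and not necessarily the weighted computation or the bounds used in Algorithm 1, and the function name is hypothetical.

```python
def shared_path_loss(F_i: float, F_j: float, F_ij: float) -> float:
    """Estimate f0k, the loss rate on the path shared by two outer nodes.

    Assumptions (labeled, not taken from a formula in the text): both
    packets of a pair share fate above the inner node, and losses on
    different links are independent.
    F_i, F_j: end-to-end loss rates to the two outer nodes.
    F_ij: probability that both packets of a pair are lost.
    """
    neither_lost = 1.0 - F_i - F_j + F_ij  # probability that both packets arrive
    if neither_lost <= 0.0:
        raise ValueError("measurements inconsistent with the model")
    return (F_ij - F_i * F_j) / neither_lost
```

With noisy measurements, estimates from different pairs of outer nodes will generally differ, which is one reason to combine them in a weighted fashion over many pairs.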
Now define the weighted average Fpair of the probability that both packets of the pair will be lost by
where the sum is taken over all possible pairs of outer nodes. Since fki cannot be greater than Fi, it follows that
It follows that f0k has lower and upper bounds as expressed by
According to a useful approximation for simplifying the above formula, it is assumed that the occurrence of a packet pair for a given leaf pair is independent of the leaves themselves; that is, a packet goes to a branch i with probability wi, and thus that
Under the above assumption,
Hence, it may be sufficient to track the individual packet fractions wi, rather than the pair fractions Wij, to get a useful approximation of the upper and lower bounds.
Now define the average end-to-end loss rate FL to the leaves, by
the sum being taken over all leaves, i.e., end nodes, that are possible outer nodes. Hence,
using the preceding approximation. If there are many leaves but packet arrivals are sufficiently uniform among the leaves that none of them dominates, the sum
will be small. If the sum is taken as negligibly small, there follows the approximation
which yields approximate lower and upper bounds according to the expression,
As noted above, a computational simplification can be achieved in the event there are many leaf branches by grouping the branches into a plurality of lumped pseudo-branches. This grouping is advantageously performed randomly, and it can be done without a substantial loss in estimation efficiency. That is, the only information that is effaced by the grouping is that relating to packet pairs whose respective destinations fall into the same group. Therefore, provided there are, e.g., at least 10 groups carrying comparable shares of the traffic, information on no more than about 10% of the packet pairs will be lost.
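The effect of random grouping on the available pair information can be illustrated with a small sketch. It assigns leaves to groups at random (here in equal-sized groups, which is an assumption of the example) and measures the fraction of packet pairs whose two destinations fall into the same group. The function names, the seed parameter, and the pair-count input format are hypothetical.

```python
import random
from typing import Dict, Iterable, Tuple


def random_groups(leaves: Iterable[str], num_groups: int, seed: int = 0) -> Dict[str, int]:
    """Shuffle the leaves and deal them round-robin into `num_groups` groups."""
    rng = random.Random(seed)
    shuffled = list(leaves)
    rng.shuffle(shuffled)
    return {leaf: idx % num_groups for idx, leaf in enumerate(shuffled)}


def same_group_pair_fraction(
    pair_counts: Dict[Tuple[str, str], int], group_of: Dict[str, int]
) -> float:
    """Fraction of packet pairs whose information is effaced by the grouping,
    i.e., pairs whose two destinations land in the same group."""
    total = sum(pair_counts.values())
    if total == 0:
        return 0.0
    hidden = sum(
        n for (a, b), n in pair_counts.items() if group_of[a] == group_of[b]
    )
    return hidden / total
```

For 100 leaves with uniform pair traffic and 10 equal groups, 450 of the 4950 leaf pairs fall within a group, so roughly 9% of the pair information is effaced.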
As described above, the various summations involved in the computations for Algorithm 1 are carried out over all possible pairs of outer nodes. In an alternative approach for applying Algorithm 1, we do the following:
For a given inner node, consider each of its branches and all of the end nodes that terminate those branches. Group all end nodes that terminate a given branch into an artificial, lumped end node. By treating the lumped end nodes as single nodes, the portion of the tree graph extending from the given inner node to the lumped end nodes is transformed into a two-level tree with, typically, many branches, each branch terminated by one of the lumped end nodes. When this alternative approach is applied, the summations are taken over all possible pairs of the (distinct) lumped end nodes.
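The lumping of end nodes by branch can be sketched as follows, assuming the tree is available as a hypothetical parent-to-children mapping. Each child of the given inner node identifies one branch, and all end nodes below that child form one lumped end node; the function names are illustrative.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Tree = Dict[str, List[str]]  # hypothetical representation: parent -> children


def leaves_below(tree: Tree, node: str) -> List[str]:
    """All end nodes in the subtree rooted at `node` (the node itself if it is a leaf)."""
    children = tree.get(node, [])
    if not children:
        return [node]
    found: List[str] = []
    for child in children:
        found.extend(leaves_below(tree, child))
    return found


def lump_by_branch(tree: Tree, inner: str) -> Dict[str, List[str]]:
    """Map each branch of `inner` (keyed by its first node) to the end
    nodes that are merged into that branch's lumped end node."""
    return {child: leaves_below(tree, child) for child in tree.get(inner, [])}


def lumped_pairs(tree: Tree, inner: str) -> List[Tuple[str, str]]:
    """All pairs of distinct lumped end nodes, over which the sums are taken."""
    return list(combinations(lump_by_branch(tree, inner).keys(), 2))
```

An inner node with b branches then contributes b(b−1)/2 pairs to each summation, regardless of how many end nodes lie below each branch.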
In at least some cases, the alternative approach described above may be advantageous both by reducing the amount of computation and by increasing accuracy. The reason is that there are fewer effective outer nodes over which to perform the summations, and the consolidation of outer nodes mitigates the inaccuracies that might occur in determinations of end-to-end loss rates when too few packets are destined for some end nodes.