Embodiments of the present invention relate generally to monitoring the operation of communication networks. This is achieved utilizing end-point stations as probes that communicate with suitably deployed time-servers on a continual basis for purposes of establishing performance metrics that are used to quantify network loading and identify fault conditions. Historical analysis of network loading is necessary to optimize network equipment deployment and growth strategies.
Traditional approaches to monitoring communication networks assume that the network elements comprising the network can monitor their own performance and report loading/over-loading conditions to a network management system. In many cases specialized test equipment is utilized in the field for troubleshooting purposes. However, such testing methods are useful for establishing network performance parameters only for the limited duration of the equipment deployment and only apply to the conditions that exist during the test.
Most network elements such as routers that are deployed in communications networks maintain an estimate of the occupancy of their communications links. For instance, if an Ethernet interface provides the capability of transmitting 1 Gbit/s and information traffic consumes, on average, 500 Mbit/s, the link is considered to be loaded at 50%; the remaining transmission bandwidth comprises idle signal or fill-in information that can be replaced by traffic if necessary. If the link loading is 100% then the link cannot carry any additional traffic, which can result in congestion whereby information traffic can be delayed or even discarded. This delay and/or discard operation represents an impairment of the traffic carrying capability of the network element. Often routers maintain queues for scheduling transmission of traffic packets and can estimate loading by examining the fill level of the queues. Coordinating the information from multiple network elements can provide a partial picture of the network loading conditions.
For troubleshooting wireless network access issues, specialized equipment is deployed, on a temporary basis, in the vicinity of the base-station suspected of sub-par performance or mounted in a vehicle such as a car or truck that is driven around in the vicinity of the base-station. This manual/semi-manual approach suffices to address static problems that persist regardless of time-of-day or demand for network resources. Problems that manifest themselves in one geographical area but have a root cause involving multiple geographical areas (base-stations) may not be uncovered by this approach. Observations of network conditions made by such deployed test equipment are available only on a temporary basis while the test equipment is in operation and cannot be used for continual monitoring purposes.
A series of nodes are connected over a communication network using bi-directional transmission links. For convenience the network is logically separated into segments. Server nodes, referred to here as time-servers, that derive time from a common reference source such as GPS are deployed at judicious locations within the network.
Client nodes are dispersed around the network edge. For example, in a wireless network the client nodes can be the mobile stations such as phones and tablets. In a wired network such as that of an enterprise, the client nodes can be the desktop computers on a local area segment of the network or mobile computers accessing the local area network using wireless communications.
The client nodes interact with the server nodes using a time-transfer protocol such as NTP or PTP or similar protocol suitable for exchanging time-stamps of events between client and server. The events correspond to the time-of-arrival and time-of-departure of designated packets exchanged by the server and client. The exchange of time-stamps can be the basis for the client nodes setting their internal time-clock. The client nodes may also have alternative time sources including, but not limited to, GPS, to set their time-clock. The time-stamps associated with the time-of-departure and time-of-arrival of a particular packet provide an estimate of the transit delay of the packet from the server (or client) to the client (or server).
The time-stamps exchanged are also reported to a centralized network management server that includes these time-stamps in a database along with particulars of the client and server and additional ancillary information including the identities of the server and client; the geographical location of the client if it is a location-enabled mobile wireless device; geographical location of the intermediate network elements such as, in the case of wireless networks, cellular base-stations or WiFi access points; RF (radio frequency) signal strength parameters; and particulars of the route taken by the packet through the network.
Computing suitable metrics from the time-stamps and analyzing the historical trend thereof can be used to identify network issues including, but not limited to, over-loading and under-utilization. Data mining techniques and graphical depiction of performance metrics derived from the data can be used by operators to better understand and analyze network performance. The time-stamps provide a way to analyze the metrics in terms of the temporal evolution of performance as well as ascertain simultaneity of events that may occur in different parts of the network, physical and/or logical.
So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments.
For clarity, identical reference numbers have been used, where applicable, to designate identical elements that are common between figures. It is contemplated that features of one embodiment may be incorporated in other embodiments without further recitation.
In one embodiment of the invention depicted in
Server nodes are referenced to global Coordinated Universal Time (UTC) via a satellite time reference, e.g., Global Navigation Satellite System (GNSS) such as Global Positioning System (GPS), GLObal NAvigation Satellite System (GLONASS), Galileo, Compass/Beidou, Wide Area Augmentation System (WAAS) or similar, or via a terrestrial RF broadcast time reference, e.g., WWVB, JJY or similar, or via mobile wireless base-station signals, e.g., CDMA, GSM, WiMAX or similar. Server nodes may include a client node to derive UTC from other servers over the network in hierarchical fashion in cases where the primary satellite or RF reference is unavailable. Client nodes derive absolute time from one or more server nodes which distribute timing packets over the network. Client nodes may also derive time from satellite and RF references. Server and Client Nodes may also derive position from GNSS, ground-based RF navigation systems (e.g., LORAN), RF triangulation techniques including TDOA and Signals of Opportunity, inferred from connected cell tower identification (position lookup from cell tower database) or, in the case of fixed assets, known from a previous survey. Network topology is unconstrained.
There are several methods for distributing time over the network, e.g., IEEE 1588 Precision Time Protocol (PTP) and Network Time Protocol (NTP). The method used is a network design choice of the operator and also depends upon the application. PTP is often the protocol of choice for network operators to distribute time to their mobile backhaul infrastructure. NTP is a common choice for distribution of time to endpoint devices over IP and the Internet. RTP (Real-time Transport Protocol) is typically used to synchronize real-time services over IP such as VoIP and video-conferencing and can, with some modifications, be used for time-transfer applications as well.
Each of these protocols involves the time-stamping of packets upon creation of the packet, representing the time-of-departure and the time-stamping of the packet upon the reception of the packet representing the time-of-arrival. In NTP the typical sequence of events follows the progression depicted in
The accuracy of the time-stamps depends upon many factors in the network such as network delay, jitter and packet loss. In general, implementations attempt to time-stamp packets as accurately as possible and attempt to reduce or eliminate delay variation between the time the transmitted packet was generated (TS) and the time it is transmitted on the network, and similarly between the time the received packet physically entered from the network and the time the packet was time-stamped (TR). (TR−TS) for any particular packet is the estimate of the one-way delay. Several algorithms exist for deriving timing over the network and all require a two-way exchange of time-stamped packets from the client to the server (upstream) and time-stamped packets from the server to the client (downstream). For each protocol the time-stamp format may be different. It is well known that for network-based time distribution, the accuracy is limited to half the difference in transit delay between the two directions (accuracy ≈ TASYM/2, where TASYM denotes that difference) and also depends upon the jitter (transit delay variation from nominal) and packet loss in the network. Very good accuracy can be obtained by the server or client device when connected to GPS or similar. In these cases, the network protocol may still be used to calculate network delays but uses the GPS reference time instead of using the protocol's time derivation algorithm.
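The relationship between the four time-stamps of a two-way exchange, the one-way delay estimates of the form (TR−TS), and the TASYM/2 accuracy limit can be illustrated with a minimal sketch. It assumes the conventional NTP-style labeling T1 to T4 and the standard two-way offset and round-trip formulas; the class and function names are hypothetical and are not taken from this description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exchange:
    """Four time-stamps of one two-way exchange, in seconds (hypothetical container)."""
    t1: float  # client time-of-departure of the upstream packet
    t2: float  # server time-of-arrival of the upstream packet
    t3: float  # server time-of-departure of the downstream packet
    t4: float  # client time-of-arrival of the downstream packet


def one_way_delays(x):
    """Return (upstream, downstream) delay estimates, each of the form TR - TS.

    These are only as good as the agreement between the two clocks.
    """
    return x.t2 - x.t1, x.t4 - x.t3


def offset_and_round_trip(x):
    """Standard two-way estimates.

    offset     : estimated server clock minus client clock
    round_trip : total network transit time, excluding server turnaround
    """
    offset = ((x.t2 - x.t1) + (x.t3 - x.t4)) / 2.0
    round_trip = (x.t4 - x.t1) - (x.t3 - x.t2)
    return offset, round_trip


def asymmetry_error(true_up_delay, true_down_delay):
    """Offset error caused by unequal path delays, i.e. TASYM / 2."""
    return (true_up_delay - true_down_delay) / 2.0
```

For example, with a true upstream delay of 10 ms, a true downstream delay of 4 ms, and a server clock 0.5 s ahead of the client, the offset estimate comes out at 0.503 s: the 3 ms error is exactly half the 6 ms asymmetry.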
Once (system) time is established by the client running on the mobile device (e.g. MS 130), the mobile client can act as a monitor of delay, jitter and packet loss based on packets exchanged with the time servers (e.g. Server 201). This exchange between mobile clients and time servers is conducted on a continual basis. Each exchange is reported back to the network management center. The network management computer maintains a data-base with entries exemplified by
Clients (and Servers that include client functions) can sample delays from multiple servers simultaneously, or over time in sequence at the same or different rates. Information can also be gathered for various packet sizes and various COS or TOS packet markings. The client can also collect delay and jitter data for multiple protocols, multiple logical connections, and multiple qualities of service, and may or may not be application aware. The client stores the raw upstream and downstream delays and timestamps in a persistent or dynamic database associated with the device. The delay and timestamp information may be further processed by the device itself to generate statistical information such as moving or windowed averages, maximums, minimums, differences, and jitter, as well as to generate threshold crossing alerts such as when mean delay exceeds a set threshold for a given period of time. The client can also track packet loss rates with the various protocols. The statistics may be further processed to form metrics such as Mean Opinion Score (MOS) and R-Factor for digital voice, or ITU-T Y.1541 network performance parameters.
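The on-device processing described above (windowed averages, minimums, maximums, jitter and threshold crossing alerts) can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the class name, the parameter names, and the choice of the population standard deviation as the jitter measure are all assumptions.

```python
from collections import deque
from statistics import mean, pstdev


class DelayMonitor:
    """Windowed delay statistics with a threshold crossing alert (hypothetical)."""

    def __init__(self, window, threshold_s, hold):
        self.samples = deque(maxlen=window)  # most recent delay samples, seconds
        self.threshold_s = threshold_s       # alert threshold on the windowed mean
        self.hold = hold                     # consecutive samples the mean must stay above
        self._over = 0                       # current run length above the threshold

    def add(self, delay_s):
        """Add one delay sample; return True while the alert condition holds."""
        self.samples.append(delay_s)
        if mean(self.samples) > self.threshold_s:
            self._over += 1
        else:
            self._over = 0
        return self._over >= self.hold

    def stats(self):
        """Summary statistics over the current window."""
        s = list(self.samples)
        return {
            "mean": mean(s),
            "min": min(s),
            "max": max(s),
            "jitter": pstdev(s),  # assumption: jitter taken as the standard deviation
        }
```

Requiring the windowed mean to stay above the threshold for `hold` consecutive samples is one simple way to express "for a given period of time"; a wall-clock duration could be used instead.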
Some network timing nodes consist of both client and servers and operate in a hierarchy. In the parlance of PTP, these nodes are known as Boundary clocks. The client in the boundary clock may derive timing from a grandmaster that has GPS as its absolute reference. The choice of which grandmaster or boundary clock any client function references at any time is outside the scope of this description, however, in general the timing protocol will qualify the clock source and will use the “best” master clock that is available. For instance, in NTP, there is the concept of a Stratum hierarchy with the lower the Stratum number, the better the reference. The reference quality among servers of the same stratum may be determined by NTP using metrics of reachability, delay, offset and dispersion.
In mobile networks, when a handoff occurs in an operator network between towers sharing similar backhaul paths, the delay changes may be on the order of microseconds or tens of microseconds. However, in intra-operator handoffs to towers with different backhaul paths, instantaneous delay changes can be in the hundreds to thousands of microseconds. Inter-operator or inter-technology handoffs between carrier networks and public networks can experience substantial delay changes into the tens, perhaps hundreds, of milliseconds. The quality of the connection for voice or video conferencing can be severely impacted, if not impaired, by these changes in delay.
In wireless networks the gathering of the delay data collected by the mobile device is accompanied by the association of the delay data with relevant physical and logical information such as device position, cell tower ID, cell sector, and hardware and software make, model and revision for the infrastructure including the mobile device itself. All of the above information may be monitored by the centralized network monitoring system over the network as shown in
Delay and jitter statistics and metrics may be associated with, but are not limited to, the following:
Association of the delays with any of the above may be done by the client device itself in combination with a stored database. For instance, the delay data may be annotated with GPS position from the device itself along with the cell tower ID and sector information. The make and model of the cell tower may be later associated with the delay information through a query to a database.
In
This method permits monitoring of delay and jitter for individual mobile devices and for time varying ensembles of mobile clients connected to base-stations that change as mobile clients are handed to, or handed from the base-station.
For example, the operator may wish to query the database for ensemble call quality for all 3G voice connections for every Friday in the past year in the city of Phoenix for those users with Android-based cellular devices manufactured by Motorola. Such a query can be further constrained to the period of 8:00 AM-12:00 PM in the downtown area, further constrained again to evaluate delay metrics for the access portion of the network as opposed to full end-to-end delays, and further categorized as those connections made over a particular base-station make and model, such as Ericsson BTS 2111 or RBS 3202.
Delay metrics can also be collected based on subscriber such as delays for the month of May for subscriber n. This can be further subdivided to all 3G connections for any service, or by a particular service class, such as voice, video, data. For instance the operator may want to examine the delays for UDP packets of sizes ranging from 576 Bytes to 1518 Bytes.
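The kind of constrained query described above can be illustrated with a small in-memory sketch. The record layout, field names and function names below are hypothetical; the description does not specify a schema or a query language, and a deployed system would more likely use a database engine.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DelayRecord:
    """One reported exchange (hypothetical fields)."""
    t2: datetime        # server time-stamp used to index the record
    subscriber: str
    base_station: str
    service: str        # e.g. "voice", "video", "data"
    packet_bytes: int
    delay_s: float      # one-way delay estimate, seconds


def query(records, subscriber=None, service=None, month=None,
          min_bytes=None, max_bytes=None):
    """Return the records matching every constraint that is not None."""
    out = []
    for r in records:
        if subscriber is not None and r.subscriber != subscriber:
            continue
        if service is not None and r.service != service:
            continue
        if month is not None and r.t2.month != month:
            continue
        if min_bytes is not None and r.packet_bytes < min_bytes:
            continue
        if max_bytes is not None and r.packet_bytes > max_bytes:
            continue
        out.append(r)
    return out


def mean_delay(records):
    """Mean one-way delay of the selected records, or None if there are none."""
    if not records:
        return None
    return sum(r.delay_s for r in records) / len(records)
```

For instance, `mean_delay(query(records, subscriber="n", month=5))` would give the mean delay for subscriber n in May, and `query(records, min_bytes=576, max_bytes=1518)` would select the packet-size range mentioned above.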
Typically cellular devices are within 1-2 km of the cell tower of the base-station. The precise position of the cell tower is known, and therefore the device is within roughly 3 µs to 7 µs of the cell tower in terms of propagation time (about 3.3 µs per kilometer). If the precise positions of the device and the connected tower are known through survey, GNSS or other RF techniques, then the time-of-flight delay can be estimated to 100 ns or better. This delay can then be distinguished from the network delays. Delays can be further associated with the cell sector. In cellular networks, some base-stations may be single sector, but they are often multi-sector. The data rate may vary with signal strength, which depends upon the method of delivering data, the position of the devices in the network, local interference, weather, and distance from the base-station.
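The time-of-flight arithmetic above can be checked with a short sketch. It assumes free-space propagation at the speed of light along the straight-line distance between device and tower; the function names are hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def time_of_flight_s(distance_m):
    """Propagation time over the air interface, assuming line-of-sight free space."""
    return distance_m / SPEED_OF_LIGHT_M_PER_S


def network_delay_s(measured_delay_s, distance_m):
    """Measured one-way delay with the over-the-air time-of-flight removed."""
    return measured_delay_s - time_of_flight_s(distance_m)
```

Under these assumptions, 1 km corresponds to about 3.3 µs and 2 km to about 6.7 µs, while a position uncertainty of about 30 m corresponds to roughly 100 ns of time-of-flight uncertainty.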
In addition to the monitoring of statistics, the cellular device tracks the number of timing packets transmitted and received, and the operator can discount these packets from the data plans so that the subscriber is not charged for the timing packets used for the operator's monitoring of the network. The same applies to requests for raw or processed delay data.
As indicated above, associating mobile client delay data with various physical and logical information enables a mobile network monitoring method for mobile service providers that is not available in the prior art.
In one application a particular mobile station may be monitored as it moves around within an extended geographical area. Consider a mobile 130 that collects and reports data regarding its TS and TR time-stamps related to its communication with server 201. The delay estimate is computed as (TR−TS).
In another application suppose the goal is to monitor the performance of the access network segment 114 between RNC 120 and router R 126. For this the data used to develop the metrics involves packet exchanges between all mobiles that are associated with RNC 120 and time servers 201 and 202. With reference to
In
A. Mean delay increases with load.
B. Standard deviation of delay increases with load.
Consequently, the network management system can establish loading estimates using these one-way delay estimates. For example, with reference to
In another application, the measurements made from mobiles connected to a particular base-station to a particular server can be used to characterize the behavior of the base-station.
The literature in metrology provides additional metrics that can be computed over selected data. First, the database can be searched using a particular set of parameters. For example, the search parameters could be all records associated with base station “X” (e.g. 104) and server “Y” (e.g. 201). Suppose the time-stamp data extracted is for the transit delay from a mobile to the server. That is, the value of T1 (401) is subtracted from T2 (402) to give “ρ”. The server time is generally considered to be the most accurate and stable, so this value of “ρ” is associated with time t=T2 (402). This procedure creates a sequence of numbers that can be expressed as {ρ(t); t=T2} corresponding to the entries in the database. For convenience the data may be restricted to a particular time period such as a day or week or month; the value of T2 can be used to restrict the data to this chosen interval. The values of T2 in this set may not be uniformly spaced in time. A common approximation is to decide on a suitable sampling interval τ0 and then construct an equivalent sequence that is representative of a uniformly spaced sampling-time grid of τ0 by establishing
That is, the new sequence {x(nτ0)} represents the average of the values of “ρ” whose time-index value (T2) is within one-half sampling time unit from n·τ0. This new sequence corresponds to a uniform sampling-time grid and conventional formulae for timing metrics such as TDEV/TVAR, MTIE/MRTIE, etc. can be applied. For reference, the following formulas apply for a data sequence of N points. Suppressing the “τ0” in the sequence index for notational simplicity, the MTIE formula is
or, equivalently,
The formula for TDEV is
The formula for TDEV is shown without the square-root on the right-hand-side; this is the formula for the square of TDEV, namely TVAR.
The importance of TDEV and MTIE, in addition to the simple mean and standard deviation is that they provide metrics as a function of “observation time” that in turn provides information regarding persistence, periodicity, and duration of congestion that is bursty in nature.
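The gridding step and the two metrics can be sketched as follows. The sketch averages the ρ values that fall within half a sampling interval of each grid point, then applies the standard estimator forms of MTIE and TVAR/TDEV as defined in ITU-T G.810; it is assumed, not confirmed, that these correspond to the formulas referred to above. Holding the previous value for empty grid points and treating each bin as half-open are further assumptions, and all function names are hypothetical.

```python
import math


def resample(samples, tau0, origin=0.0):
    """Average (t, rho) pairs onto the uniform grid origin + n*tau0.

    A sample belongs to grid point n when origin + (n - 0.5)*tau0 <= t <
    origin + (n + 0.5)*tau0. Empty grid points repeat the previous value
    (an assumption; the description does not say how gaps are handled).
    """
    bins = {}
    for t, rho in samples:
        n = math.floor((t - origin) / tau0 + 0.5)
        bins.setdefault(n, []).append(rho)
    x, last = [], None
    for n in range(min(bins), max(bins) + 1):
        if n in bins:
            last = sum(bins[n]) / len(bins[n])
        x.append(last)
    return x


def mtie(x, n):
    """MTIE(n*tau0): largest peak-to-peak excursion over any window of n+1 samples."""
    if not 1 <= n <= len(x) - 1:
        raise ValueError("n must satisfy 1 <= n <= N-1")
    return max(max(x[k:k + n + 1]) - min(x[k:k + n + 1])
               for k in range(len(x) - n))


def tvar(x, n):
    """TVAR(n*tau0), the square of TDEV, for 1 <= n <= N // 3."""
    m = len(x) - 3 * n + 1  # number of terms in the outer sum
    if n < 1 or m < 1:
        raise ValueError("n must satisfy 1 <= n <= N // 3")
    total = 0.0
    for j in range(m):
        # Sum of second differences of x at lag n over n consecutive samples.
        inner = sum(x[i + 2 * n] - 2 * x[i + n] + x[i] for i in range(j, j + n))
        total += inner * inner
    return total / (6.0 * n * n * m)


def tdev(x, n):
    """TDEV(n*tau0) = sqrt(TVAR(n*tau0))."""
    return math.sqrt(tvar(x, n))
```

Evaluating `mtie(x, n)` and `tdev(x, n)` over a range of n gives the metrics as a function of observation time n·τ0. The direct MTIE computation shown here is quadratic in the worst case and is meant for illustration rather than for large data sets.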
Various substitutions, modifications, additions and/or rearrangements of the features of embodiments of the present disclosure may be made without deviating from the scope of the underlying inventive concept. All the disclosed elements and features of each disclosed embodiment can be combined with, or substituted for, the disclosed elements and features of every other disclosed embodiment except where such elements or features are mutually exclusive. The scope of the underlying inventive concept as defined by the appended claims and their equivalents covers all such substitutions, modifications, additions and/or rearrangements.
The appended claims are not to be interpreted as including means-plus-function limitations, unless such a limitation is explicitly recited in a given claim using the phrase(s) “means for” or “mechanism for” or “step for”. Sub-generic embodiments of the invention are delineated by the appended independent claims and their equivalents. Specific embodiments of the invention are differentiated by the appended dependent claims and their equivalents.
This application claims a benefit of priority under 35 U.S.C. 119(e) from co-pending provisional patent application U.S. Ser. No. 61/718,598, filed Oct. 25, 2012, the entire contents of which are hereby expressly incorporated herein by reference for all purposes.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2013/066950 | 10/25/2013 | WO | 00 |
Number | Date | Country |
---|---|---|
61718598 | Oct 2012 | US |