The present invention relates to a method of sampling datagrams of the type, for example, that samples a datagram at a first time and at a first point in a communications link and the datagram at a second time and at a second point in the communications link. The present invention also relates to an apparatus for sampling a datagram, for example a network probe. The present invention further relates to a monitoring system of the type, for example, that samples a datagram at a first time and at a first point in a communications link and the datagram at a second time and at a second point in the communications link.
In the field of monitoring performance of a communications network, it is known to provide a passive monitoring system to monitor predetermined flows of datagrams communicated over a communications link spanning one or more communications networks. In such monitoring systems, a first probe is located at a first point along the monitored link and a second probe is located at a second point along the monitored link. The probes are coupled respectively to the first and second points along the link by respective taps that “siphon off” a copy of the electrical or optical signals passing the first and second points. The data tapped by the first and second probes are correlated to derive diagnostic data that can be used for managing and troubleshooting the communications network.
One particular known method that employs the first and second probes as described above captures a set of packets respectively at the first and second probes, i.e. at the first and second points along the communications link. As each packet is captured, a timestamp is generated as well as a hash signature, derived from the packet data, which uniquely identifies the packet, the hash signature and associated timestamp being stored on the probe. After a predetermined number of packets have been captured, the hash signature and timestamp pairs are sent to a separate processor, known as a correlator, for correlation. At the correlator, the hash signatures are used to identify and match the two observations of a given packet, which passed the two separate points in the communications network. The timestamps of the matched instances of the given packet are then used to calculate jitter and average delay of the set of packets constituting a measured network flow. However, this approach can generate enough additional traffic to drive a heavily loaded network into instability.
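By way of illustration only, the known hash-based correlation described above can be sketched as follows. The use of SHA-256, the function names and the record layout are assumptions made for this sketch and are not details of the known method.

```python
import hashlib


def probe_records(observations):
    """Build (hash signature, timestamp) pairs from (packet bytes, timestamp) observations."""
    return [(hashlib.sha256(data).hexdigest(), ts) for data, ts in observations]


def correlate_by_hash(first_records, second_records):
    """Match the two observations of each packet by signature and return per-packet delays."""
    second_by_signature = dict(second_records)
    return {
        signature: second_by_signature[signature] - ts
        for signature, ts in first_records
        if signature in second_by_signature
    }
```

Every record sent to the correlator carries a signature in addition to a timestamp, which is the source of the additional traffic noted above.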
Another method of performing loss measurements does not involve forwarding of the hash signature/timestamp pair to a separate correlator. Instead, by filtering at the two probes for a predetermined specific set of matching packet instances corresponding to predetermined flows, only the set of timestamps need be sent for comparison at the correlator. However, this method requires careful selection of a filter algorithm to ensure spurious packets are not selected. If the filter is too wide then not only are too many packets selected, incorrect packets that are part of alternate flows, not part of the measured flow on the monitored link, may also be incorrectly selected for correlation, resulting in spurious timestamps being sent to the correlator and overall results being compromised. Without any mechanism to correlate timestamps with actual packet instances, the above mentioned spurious results cannot be eliminated and a final statistical grooming to remove the spurious results is therefore required. However, even use of complex statistical techniques does not completely eliminate the possibility of pollution of results data by false matches. Additionally, on links with high levels of jitter, the statistical grooming mentioned above may eliminate measurements that are, in fact, correct, thus causing further false results. Additionally, an absence of the hash signature to match timestamps results in the correlator being unable to identify out of order packets.
According to a first aspect of the present invention, there is provided a method of sampling datagrams, the method comprising the steps of: sampling a plurality of datagrams from a predetermined flow of datagrams associated with a first point in a communications link, the plurality of datagrams being sampled with reference to a first respective plurality of sampling intervals; generating respective first time record data corresponding to a predetermined number of the plurality of datagrams; sampling the plurality of datagrams from the predetermined flow of datagrams associated with a second point in the communications link, the plurality of datagrams being sampled with reference to a second respective plurality of sampling intervals; generating respective second time record data corresponding to the predetermined number of the plurality of datagrams; and correlating the first and second time record data; wherein the first respective plurality of sampling intervals is consistent with the second respective plurality of sampling intervals.
The method may further comprise the steps of: obtaining a copy of datagrams passing the first point and extracting datagrams relating to the predetermined flow of datagrams therefrom; and obtaining a copy of the datagrams passing the second point and extracting datagrams relating to the predetermined flow of datagrams therefrom.
The sampling may be passive. The first and second respective plurality of sampling intervals may be intervals of datagram sequence numbers.
The first time record data may comprise a first time record entry and the second time record data may comprise a second time record entry, the first time record entry respectively corresponding to the second time record entry, the first and second time record entries relating to a same sampled datagram from the predetermined flow of datagrams.
The first time record entry may correspond to a first sampling interval of the first respective plurality of sampling intervals, the same sampled datagram having a sequence number numerically closest to a lower limit of the first sampling interval.
The second time record entry may correspond to a second sampling interval of the second respective plurality of sampling intervals, the same sampled datagram having a sequence number numerically closest to a lower limit of the second sampling interval.
The step of correlating the first and second time record data may comprise the steps of: receiving the first time record data in respect of the first point, the first time record data comprising a first plurality of time record entries associated with the sampled plurality of datagrams; and receiving the second time record data in respect of the second point, the second time record data comprising a second plurality of time record entries associated with the sampled plurality of datagrams.
The first time record data and the second time record data may each comprise flow-identifying data to identify the predetermined flow of datagrams.
The step of sampling the plurality of datagrams from the predetermined flow of datagrams associated with the first point in the communications link may comprise the steps of: providing a plurality of individually testable threshold values, the plurality of threshold values delineating the first respective plurality of sampling intervals; and comparing a first sequence number of a datagram from the predetermined flow of datagrams with each of a first number of the plurality of threshold values so as to identify a first threshold value from the first plurality of threshold values equal to or less than the first sequence number of the datagram and numerically closest to the first sequence number.
The step of generating the respective first time record data may comprise the step of: recording a first time record in respect of the datagram from the predetermined flow of datagrams in response to the first threshold value identified; wherein the first time record may be recorded in the first time record data, a position of the first time record within the first time record data relative to other time record entries corresponding uniquely to the first threshold value identified from the first number of the plurality of threshold values.
The step of sampling the plurality of datagrams from the predetermined flow of datagrams associated with the second point in the communications link may comprise the steps of: providing a plurality of individually testable threshold values, the plurality of threshold values delineating the second respective plurality of sampling intervals; and comparing a second sequence number of a datagram from the predetermined flow of datagrams with each of a second number of the plurality of threshold values so as to identify a second threshold value from the second plurality of threshold values equal to or less than the second sequence number of the received datagram and numerically closest to the second sequence number.
The step of generating the respective second time record data may comprise the step of: recording a second time record in respect of the datagram from the predetermined flow of datagrams in response to the second threshold value identified; wherein the second time record is recorded in the second time record data, a position of the second time record within the second time record data relative to other time record entries corresponding uniquely to the second threshold value identified from the second number of the plurality of threshold values.
Each time record entry of the first time record data may respectively correspond to each of the first respective plurality of sampling intervals. Each time record entry of the second time record data may respectively correspond to each of the second respective plurality of sampling intervals.
The first threshold value may be unavailable for subsequent comparisons in response to the first threshold value being equal to or less than the first sequence number of the datagram from the predetermined flow of datagrams and numerically closest to the first sequence number. The second threshold value may be unavailable for subsequent comparisons in response to the second threshold value being equal to or less than the second sequence number of the datagram from the predetermined flow of datagrams and numerically closest to the second sequence number.
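By way of illustration only, the threshold comparison and the subsequent unavailability of a used threshold value can be sketched in software as follows. The function name, the list-based representation of the threshold values and the availability flags are assumptions made for this sketch; wrap-around of sequence numbers is ignored.

```python
def select_threshold(thresholds, available, sequence_number):
    """Return the index of the threshold to sample against, or None.

    thresholds: ascending threshold values delineating the sampling intervals.
    available:  one flag per threshold; cleared once that threshold has been used.
    """
    # Compare the sequence number with each threshold value.
    candidates = [i for i, t in enumerate(thresholds) if t <= sequence_number]
    if not candidates:
        return None
    # The highest threshold not exceeding the sequence number is numerically closest.
    index = max(candidates)
    if not available[index]:
        return None  # this sampling interval has already produced a time record
    available[index] = False
    return index
```

For example, with threshold values 100, 200 and 300, a datagram with sequence number 150 selects the first threshold, and a later datagram with sequence number 160 is not sampled because that threshold is no longer available.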
The first and second threshold values may be available for comparison in response to the first and second sequence number being out-of-sequence.
The predetermined flow of datagrams may comprise an out-of-sequence datagram, the method further comprising the step of: identifying the out-of-sequence datagram as being out of sequence; and identifying one of the first number of the plurality of threshold values less than or equal to a sequence number of the out-of-sequence datagram and numerically closest to the sequence number of the out-of-sequence datagram.
The method may further comprise the step of: substituting a time record entry for a datagram sampled in the absence of the out-of-sequence datagram with a time record associated with the out-of-sequence datagram.
The predetermined flow of datagrams may comprise an out-of-sequence datagram, the method further comprising the step of: identifying the out-of-sequence datagram as being out of sequence; and identifying one of the second number of the plurality of threshold values less than or equal to a sequence number of the out-of-sequence datagram and numerically closest to the sequence number of the out-of-sequence datagram.
The method may further comprise the step of: substituting a time record entry for a datagram sampled in the absence of the out-of-sequence datagram with a time record associated with the out-of-sequence datagram.
The method may further comprise the step of: temporarily storing sequence numbers of datagrams corresponding to time records stored in the first time record data and/or second time record data; and using the temporarily stored sequence numbers to identify in the first time record data and/or the second time record data the time record entry for the datagram sampled in the absence of the out-of-sequence datagram.
The step of using the temporarily stored sequence numbers may comprise the steps of: comparing from the temporarily stored sequence numbers a temporarily stored sequence number associated with the time record entry recorded in the absence of the out-of-sequence datagram with the sequence number of the out-of-sequence datagram; and determining whether the sequence number of the out-of-sequence datagram is less than the temporarily stored sequence number.
The method may further comprise the step of: sending the first time record data and/or the second time record data to a correlator without the temporarily stored sequence numbers.
The first plurality of threshold values may comprise a predetermined separation therebetween; and the second plurality of threshold values may comprise substantially the predetermined separation therebetween.
The first plurality of thresholds and/or the second plurality of thresholds may comprise an initial threshold, the method may further comprise the step of: setting the initial threshold with respect to a Transmission Control Protocol (TCP) synchronise (SYN) value.
The method may further comprise the step of: obtaining the TCP SYN value from a TCP SYN datagram.
The first and second pluralities of time record entries may be arrival times associated with the sampled plurality of datagrams at the first and second points, respectively.
According to a second aspect of the present invention, there is provided a computer program element comprising computer program code means to make a computer execute the method as set forth above in relation to the first aspect of the invention.
The computer program element may be embodied on a computer readable medium.
According to a third aspect of the present invention, there is provided a method of calculating datagram jitter comprising the method as set forth above in relation to the first aspect of the invention.
According to a fourth aspect of the present invention, there is provided a method of calculating datagram delay comprising the method of sampling datagrams as set forth above in relation to the first aspect of the invention.
According to a fifth aspect of the present invention, there is provided a datagram sampling apparatus comprising: a sampler for sampling a plurality of datagrams from a predetermined flow of datagrams associated with a point in a communications link, the plurality of datagrams being sampled with reference to a respective plurality of sampling intervals; and a time record generator for generating respective time record data corresponding to a predetermined number of the plurality of datagrams; wherein the respective plurality of sampling intervals is in accordance with a shared predetermined sampling interval regime so that the time record data comprises a plurality of time record entries corresponding respectively to the respective plurality of sampling intervals.
The time record data may be contained in a data structure.
According to a sixth aspect of the present invention, there is provided a time record correlator apparatus for a communications network, the apparatus comprising: a processing resource arranged to receive first time record data and second time record data, and correlate the first and second time record data; wherein the first time record data comprises a first plurality of time record entries and the second time record data comprises a second plurality of time record entries, a position of a data record entry in the first time record data having a corresponding known counterpart position in the second time record data.
According to a seventh aspect of the present invention, there is provided a datagram sampling system comprising: a first sampler for sampling a plurality of datagrams from a predetermined flow of datagrams associated with a first point in a communications link, the plurality of datagrams being sampled with reference to a first respective plurality of sampling intervals; a first time record generator for generating respective first time record data corresponding to a predetermined number of the plurality of datagrams; a second sampler for sampling the plurality of datagrams from the predetermined flow of datagrams associated with a second point in the communications link, the plurality of datagrams being sampled with reference to a second respective plurality of sampling intervals; a second time record generator for generating respective second time record data corresponding to the predetermined number of the plurality of datagrams; and a correlator for correlating the first and second time record data; wherein the first respective plurality of sampling intervals is consistent with the second respective plurality of sampling intervals.
It is thus possible to provide a method, system and apparatus that does not require identifying patterns to be sent with timestamps of sampled packets, thereby reducing the amount of data that needs to be communicated to a correlator.
At least one embodiment of the invention will now be described, by way of example only, with reference to the accompanying drawings, in which:
Throughout the following description identical reference numerals will be used to identify like parts.
Referring to
Turning to
The packet recogniser and timestamp generator unit 210 is also coupled to a current sequence number capture unit 232 via a seventh interconnecting data bus 234, the current sequence number capture unit 232 being coupled to a sequence number compare unit 236 via an eighth interconnecting data bus 235. The sequence number compare unit 236 is also coupled to the actual sequence number sample point generation unit 222 via a ninth interconnecting data bus 238 and to a sample record storage unit 240 via a tenth interconnecting data bus 242.
The sample record storage unit 240 is also coupled to the packet sample rate adjustment unit 218 via an eleventh interconnecting data bus 244, to the packet recogniser and timestamp generator unit 210 via a twelfth interconnecting data bus 246 as well as to an output interface 248 via a thirteenth interconnecting data bus 250. The output interface 248 has an output 252 coupled to an output data bus 254. Although not shown in
Although the above units have been described in terms of hardware, the skilled person will, of course, appreciate that the functionality can be implemented in software or a combination of both hardware and software.
Turning to
Similarly, a second output 316 of a second sequence number register 318 is coupled to a second inverting input 320 of a second comparator 322 via a second comparator data bus 324. A third output 326 of a third sequence number register 328 is coupled to a third inverting input 330 of a third comparator 342 via a third comparator data bus 344. The remaining sequence number registers 302 and comparators 304 of the N comparator modules 300 are similarly configured and so to avoid over-complicating the description of the sequence number compare unit 236, the remaining comparator modules will not be described further herein.
A current Packet Data Unit (PDU) or datagram sequence number register 346 is coupled to each non-inverting input 348 of the comparators 304 via a common comparator data bus 350.
In operation, packets originating from a source Internet Protocol (IP) address (not shown) pass, in this example, past the first point 110 and the second point 114 on the way to a destination IP address (not shown), the route being taken being considered a communications link. Referring to
In relation to the IP Header 406 (
Referring to
Thereafter (
If the received packet is a TCP SYN packet, the TCP SYN packet is passed to the initial sequence number capture unit 214 and the initial sequence number capture unit 214 extracts (Step 712) a sequence number of the SYN packet that serves as an Initial Sequence Number (ISN) value and stores the ISN value. The ISN value is then passed (Step 714) by the initial sequence number capture unit 214 to the actual sequence number sample point generation unit 222, where the actual sequence number sample point generation unit 222 calculates Sequenced Comparator Chain (SCC) sample points. The SCC sample points are sequence number threshold values calculated using the ISN value and adding increasing offset values to generate each SCC sample point. The offset values are adjustable by the packet sample rate adjustment unit 218 in the event that the sample record storage unit 240 is filling up too rapidly, determined by comparison to the number of matching packets identified by the packet recogniser and timestamp generator unit 210. The actual sequence number sample point generation unit 222 obtains the offset values from the base sequence number sample point storage unit 228, the base offset values stored by the base sequence number sample point storage unit 228 being, for example, predefined and having a random distribution and being in respect of a flow having an initial sequence number of 0. The offset values are passed from the base sequence number sample point storage unit 228 to the actual sequence number sample point generation unit 222, where each successive SCC sample point is then generated by adding an increasing offset value to the ISN value:
SCC_sample_point_number(n) = ISN_value + offset_value(n)
where n = 1, 2, 3, ...
For example, a first SCC sample point is the ISN value, a second SCC sample point is the sum of the ISN value and a second offset value, a third SCC sample point is the sum of the ISN value and a third offset value, a fourth SCC sample point is the sum of the ISN value and fourth offset value, and so on. However, it should be appreciated that the precise manner in which the offset values are calculated can vary.
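By way of illustration only, the generation of the SCC sample points from the ISN value can be sketched as follows. The function name and the example offset values are hypothetical, a first offset value of 0 is assumed so that the first SCC sample point is the ISN value, and wrap-around of sequence numbers is ignored.

```python
def scc_sample_points(isn_value, base_offsets):
    """Compute SCC_sample_point_number(n) = ISN_value + offset_value(n) for each n."""
    return [isn_value + offset for offset in base_offsets]


# Hypothetical base offsets, defined for a flow whose initial sequence number is 0.
BASE_OFFSETS = [0, 500, 1200, 2600]

# Two probes that observe the same ISN and share the same base offsets
# arrive at identical sample points without exchanging any data.
first_probe_points = scc_sample_points(1000, BASE_OFFSETS)
second_probe_points = scc_sample_points(1000, BASE_OFFSETS)
```

Because the sample points depend only on the ISN value and the shared base offsets, the sampling intervals at the two points of the communications link are consistent with each other.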
The first, second, third, . . . , nth sample points are then passed to the sequence number compare unit 236 where they are stored in the first, second, third, . . . , nth sequence number registers 302, respectively. The ISN value is also passed (Step 716) to the sequence number compare unit 236 and stored as an initial Largest Sequence Number (LSN) value. Thereafter, the packet recogniser and timestamp generator unit 210 continues awaiting (Step 700) another packet. It will, of course, be appreciated by the skilled person that this process takes place in parallel to the processing described above in relation to the SYN packet.
If the received packet is determined (Step 710) not to be a TCP SYN packet, the packet recogniser and timestamp generator unit 210 passes the received packet to the current sequence number capture unit 232, whereupon the current sequence number capture unit 232 extracts (Step 718) a sequence number of the received packet and passes (Step 720) the extracted sequence number to the sequence number compare unit 236, the sequence number compare unit 236 storing the extracted sequence number in the datagram sequence number register 346 as a Current Sequence Number (CSN) value.
Referring to
In order to identify the threshold value that is numerically closest to the CSN value, the sequence number compare unit 236 simply selects (Step 806) the numerically highest of the triggered comparator modules from among the N comparator modules 300. Thereafter (
If, through analysis of the flag associated with the selected comparator module 300, the selected comparator module 300 is deemed to have been previously selected, the received packet and the associated timestamp are discarded and the packet sampling unit 202 returns to awaiting (Step 700) receipt of copies of packets. However, if the selected comparator module 300 has been determined not to have been previously selected, the flag associated with the selected comparator module is set to indicate that the comparator module 300 has been selected (Step 810). The timestamp generated by the packet recogniser and timestamp generator 210 for the received packet is then stored (Step 812) in a data structure 900 (
In addition to the timestamp for the received packet, the CSN being tested is temporarily stored (Step 812) in a Stored Sequence Number (SSN) field of the time record entry 910 associated with the comparator module selected. The sequence number compare unit 236 then stores a Highest Triggered Comparator (HTC) identifier to record (Step 814) the selected comparator module 300 as being a highest selected and triggered comparator module so far. Thereafter (
The above operation of the packet sampling unit 202 is repeated for subsequently received packets until, as mentioned above, the timeout period expires or the data structure 900 becomes full. Consequently, the data structure 900 is gradually populated with time record entries relating to the predetermined flow of packets. However, in order to fully illustrate operation of the packet sampling unit 202, it is now assumed that one of the packets received by the packet sampling unit 202 is an out-of-sequence packet that is not a SYN packet, but is from the predetermined flow of packets being monitored. Consequently, the processing in relation to the out-of-sequence packet reaches the stage where the sequence number compare unit 236 identifies (Step 800) the out-of-sequence packet as being out-of-sequence, whereafter operation of the packet sampling unit 202 is different to that described above. In this case, the sequence number compare unit 236 determines that the CSN value of the out-of-sequence packet is less than the LSN value and so the out-of-sequence packet is deemed (Step 816) out-of-sequence, and the sequence number compare unit 236 begins a “reverse search” of the already selected comparator modules 300.
Consequently, the sequence number compare unit 236 retrieves (Step 818) the HTC identifier and determines (Step 820) whether the CSN value of the out-of-sequence packet is greater than or equal to the sequence number stored in the sequence number register 302 of the comparator module 300 associated with the HTC identifier. If the CSN value of the out-of-sequence packet is not greater than or equal to the sequence number stored in the sequence number register 302 associated with the HTC identifier, the sequence number compare unit 236, using the flags for each comparator module 300, identifies (Step 822) a next numerically highest comparator module 300 that has previously been triggered and selected, and repeats the above comparison of the CSN value with the sequence number stored in the sequence number register 302 of the next numerically highest comparator module 300 that has been triggered and selected. This sub-process of finding next highest triggered and selected comparator modules 300 is repeated until one is found where the CSN value of the out-of-sequence packet is greater than or equal to the sequence number stored in the sequence number register 302 of the triggered and selected comparator module 300 found.
The sequence number compare unit 236 then accesses the time record entry 910 corresponding to the triggered and previously selected comparator module 300 found and determines (Step 824) whether the CSN value of the out-of-sequence packet is less than the SSN value stored in the SSN field 920 of the time record entry 910 accessed. If the CSN value of the out-of-sequence packet is not less than the SSN value stored in the SSN field 920 of the time record entry 910 accessed, the sequence number compare unit 236 concludes (Step 826) that packet re-ordering has taken place, but that the out-of-sequence packet is not required to be sampled in order to monitor the predetermined flow of packets. Thereafter (
Alternatively, if the CSN value of the out-of-sequence packet is found to be less than the SSN value stored in the SSN field 920 of the time record entry 910 accessed, the sequence number compare unit 236 concludes (Step 828) that the time record entry 910 accessed has to be re-written and so the timestamp associated with the out-of-sequence packet is stored (Step 830) in place of the timestamp of the time record entry 910 accessed. Additionally, the CSN value of the out-of-sequence packet is stored in the SSN field 920 of the time record entry 910 amended. Thereafter (
Prior to sending the data structure 900 to the correlator, the sequence number compare unit 236 removes all SSN fields 920, and hence entries, from the data structure 900.
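By way of illustration only, the in-sequence sampling, the reverse search for an out-of-sequence packet and the removal of the SSN fields before export can be sketched in software as follows. The class, method and field names are hypothetical, the updating of the LSN value on every in-sequence packet is an assumption of this sketch, and wrap-around of sequence numbers is ignored.

```python
class PacketSampler:
    """Hypothetical software model of the sampling logic for one monitored flow."""

    def __init__(self, thresholds, isn):
        self.thresholds = list(thresholds)            # ascending SCC sample points
        self.selected = [False] * len(self.thresholds)  # per-comparator flags
        self.entries = [None] * len(self.thresholds)    # time record entries
        self.lsn = isn                                # Largest Sequence Number so far
        self.htc = None                               # Highest Triggered Comparator

    def observe(self, csn, timestamp):
        if csn >= self.lsn:
            self.lsn = csn  # assumption: LSN tracks the largest CSN seen so far
            self._in_sequence(csn, timestamp)
        else:
            self._out_of_sequence(csn, timestamp)

    def _in_sequence(self, csn, timestamp):
        triggered = [i for i, t in enumerate(self.thresholds) if csn >= t]
        if not triggered:
            return
        index = max(triggered)  # numerically highest triggered comparator
        if self.selected[index]:
            return  # already sampled: discard packet and timestamp
        self.selected[index] = True
        self.entries[index] = {"timestamp": timestamp, "ssn": csn}
        self.htc = index

    def _out_of_sequence(self, csn, timestamp):
        if self.htc is None:
            return
        # Reverse search over previously selected comparators, highest first.
        for index in range(self.htc, -1, -1):
            if self.selected[index] and csn >= self.thresholds[index]:
                entry = self.entries[index]
                if csn < entry["ssn"]:
                    # The late packet is closer to the threshold: re-write the entry.
                    entry["timestamp"] = timestamp
                    entry["ssn"] = csn
                return

    def export(self):
        """Return the timestamps only; the temporary SSN values are not sent."""
        return [e["timestamp"] if e else None for e in self.entries]
```

In this sketch, a late packet replaces a stored time record only if its sequence number is lower than the stored SSN value, so that both probes converge on the same packet for each sampling interval even when packets are re-ordered between the two points.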
The above described operation is performed by both the first and second network probes 106, 116 and so for every data structure generated by the first network probe 106, a counterpart data structure is generated by the second network probe 116. Moreover, each time record entry of the data structure generated by the first network probe 106 has a counterpart time record entry in the counterpart data structure generated by the second network probe 116.
At the correlator (not shown), the data structure generated by the first network probe 106 and the data structure generated by the second network probe 116 are received and, using the data in the Desired Source IP Address field 902, the Desired Destination IP Address field 904, the Desired Source Port field 906 and the Desired Destination Port field 908, are identified as relating to the predetermined flow of packets to be monitored. Further, due to the correspondence between the time record entry positions in each of the data structures generated by the first and second network probes 106, 116, the correlator is able to match timestamp data for sampled packets and use the matched timestamp data to perform jitter and/or delay calculations in relation to the predetermined flow of packets.
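By way of illustration only, positional correlation of the two data structures can be sketched as follows. The dictionary layout, the function name and the choice of jitter measure (the mean absolute difference between consecutive delays) are assumptions made for this sketch, since no particular jitter formula is prescribed above; timestamps are taken to be integers in microseconds.

```python
from statistics import mean


def correlate(first, second):
    """Match time record entries by position and return (average delay, jitter).

    Each argument is a dict with a "flow" key holding the tuple
    (source IP, destination IP, source port, destination port)
    and a "times" key holding one timestamp, or None, per sampling interval.
    """
    if first["flow"] != second["flow"]:
        raise ValueError("data structures relate to different flows")
    delays = [
        t2 - t1
        for t1, t2 in zip(first["times"], second["times"])
        if t1 is not None and t2 is not None  # skip intervals not sampled at both points
    ]
    if not delays:
        return None
    if len(delays) > 1:
        jitter = mean(abs(b - a) for a, b in zip(delays, delays[1:]))
    else:
        jitter = 0
    return mean(delays), jitter
```

No hash signature is needed to pair the timestamps, because the position of an entry in one data structure identifies its counterpart in the other.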
It should be understood that reference herein to the processing resource is intended to embrace either a single data processing entity or a plurality of data processing entities either co-located or distributed.
It should also be appreciated that although the above-described sampling functionality is implemented within a probe, it can alternatively be implemented within a network element or distributed between a probe and a network element, or between a number of network elements.
Although the above examples have been described in the context of packet communication, it should be appreciated that the term “message” is intended to be construed as encompassing packets, datagrams, frames, cells, and protocol data units and so these terms should be understood to be interchangeable.
Alternative embodiments of the invention can be implemented as a computer program product for use with a computer system, the computer program product being, for example, a series of computer instructions stored on a tangible data recording medium, such as a diskette, CD-ROM, ROM, or fixed disk, or embodied in a computer data signal, the signal being transmitted over a tangible medium or a wireless medium, for example, microwave or infrared. The series of computer instructions can constitute all or part of the functionality described above, and can also be stored in any memory device, volatile or non-volatile, such as semiconductor, magnetic, optical or other memory device.
Number | Date | Country | Kind |
---|---|---|---|
0501174.7 | Jan 2005 | GB | national |