This invention relates to the field of network management, and in particular to an interactive system and method for capturing and analyzing network traffic.
The complexities of network managing continue to increase, along with the corresponding need for efficient and effective network monitoring to detect and troubleshoot problems, or potential problems.
A variety of tools are available for capturing network traffic, including, for example, discrete hardware devices termed ‘network sniffers’ that monitor traffic on selected channels, and software modules that are embedded within routers or other network switching systems. Generally, these tools are configured to record a portion of the contents of each message that is communicated over the channel(s) being monitored. Depending upon the capabilities of the tool, some filtering may be applied based on contents of the message, to selectively record only the information related to particular messages or particular types of messages.
Network monitoring tools had conventionally been used to create a record of network traffic to facilitate fault analysis and/or fault isolation when a problem was detected or suspected. These monitoring tools had also conventionally been used to characterize traffic flow through the network to facilitate network modeling and simulation. As the need for rapid response and maximum ‘up-time’ has increased, these tools are being used to monitor network traffic in a more active manner, to potentially recognize problems as they are developing, before they lead to outages or other failures.
Although the available tools are effective for recording information related to each monitored message, the sheer volume of messages over a monitored channel reduces the tool's effectiveness for on-line, or real-time, analysis. U.S. patent applications 2004/0093413, 2004/0098611, and 2004/0133733 filed 13 May 2004, 20 May 2004, and 8 Jul. 2004 for Bean et al. and incorporated by reference herein, disclose techniques for organizing captured network data to facilitate an interactive display of the volume of data communicated through a router over time. Summary information, in the form of histogram data, is stored for each defined time period, with pointers to the detailed message data corresponding to this histogram data. The user is provided options to pan and zoom through this volume data, including the ability to view multiple time lines at different time scales. Because this data is summarized as histogram data, these panning and zooming actions can be performed quickly.
Although the display of the volume of data flowing through a router over time can facilitate an analysis of traffic flow, it does not, per se, facilitate the analysis of traffic patterns, and additional analysis of the underlying detailed data is required to identify the causes of the traffic. That is, in the prior art systems such as taught by Bean et al., there is no distinction among the messages at the summary level, and therefore any analysis that is based on characteristics of the messages requires a subsequent analysis of the underlying detailed data.
It would be advantageous to organize captured message traffic by categories, to facilitate real-time data-capture control and analysis based on such categorization. It would be advantageous if such categorization distinguished among the sources and/or destinations of the messages. It would also be advantageous if a user were able to customize and control the data capture tools while performing this network traffic analysis.
These advantages, and others, can be realized by a network monitoring system and method for processing captured message data to create a plurality of categories, providing summary data corresponding to each category, and displaying the categorized summary data. The categories preferably include an identification of the source node and destination node of each message, and the summary data includes the amount of traffic communicated between each pair of source-destination nodes. The display of this summary data includes a graphic display that provides a visual indication of each pair and the volume of traffic between the nodes of the pair.
The invention is explained in further detail, and by way of example, with reference to the accompanying drawings wherein:
Throughout the drawings, the same reference numerals indicate similar or corresponding features or functions. The drawings are included for illustrative purposes and are not intended to limit the scope of the invention.
In the following description, for purposes of explanation rather than limitation, specific details are set forth such as the particular architecture, interfaces, techniques, etc., in order to provide a thorough understanding of the concepts of the invention. However, it will be apparent to those skilled in the art that the present invention may be practiced in other embodiments, which depart from these specific details. In like manner, the text of this description is directed to the example embodiments as illustrated in the Figures, and is not intended to limit the claimed invention beyond the limits expressly included in the claims. For purposes of simplicity and clarity, detailed descriptions of well-known devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
The invention is presented herein using the generic term of ‘message’ to identify a communication from a source node of a network to one or more destination nodes. Depending upon the technologies used within the network, and within the collection tools, a message may be a discrete unit, such as a packet or frame, a set of discrete units, a continuous stream of finite length, or any other identifiable segments or sets of segments of related data items sent by the source node.
The interface at
The window 230 at the right of
As each agent captures the message data, the agent extracts information from each message, typically from the header information, and processes the information so as to create categorized summary data. In a preferred embodiment, the source and destination of each message is extracted, so that the message data can be categorized as a function of one or the other, or both. A particularly effective categorization uses tier-pairs, each pair corresponding to the source and destination nodes of a message, without regard to which node is source or destination; i.e. without regard to the direction of traffic flow. That is, for example, messages associated with the tier-pair N1-N4 of
The monitoring system MON receives the categorized summary data from one or more of the network monitors, and displays it in one or more formats. As a summarization of the message data, the summary data is generally much smaller in size than the raw message data. Accordingly, transferring the summary data from the network monitors to the monitoring system advantageously takes significantly less time than transferring the raw message data, thereby enabling a user to more quickly analyze a given set of network traffic.
One of ordinary skill in the art will recognize that many alternative display formats may be used for a given set of categories, and that alternative sets of categories may be used to create different organizations of summary data. For example, the same data that is used to generate the display of
In
In
The tab “Tier-Pair Circle” 421 is illustrated as having been selected in
A selection of the Tier-Pair Table tab 422 will effect the display of the same data in a tabular form, as a list of each tier-pair and the corresponding amount of traffic for the pair, in either text or bar-graph form. Optionally, a matrix of tiers can be displayed, in which some or all of the tiers are listed on both the horizontal and vertical axis, and the intersecting box for any two tiers will identify the corresponding amount of traffic between those tiers.
The window 460 provides a timing diagram of the amount of traffic data over time. The example window 460 illustrates the traffic flow for the entire network and any selected tier pairs. For example, if a tier-pair chord 435, or a group of tier-pair chords is selected in window 430, the window 460 will display the traffic flow for that particular selection in conjunction with the traffic flow for the entire network. The two flows are preferably distinguished via different colors, but could alternatively be distinguished using different line styles (e.g. dotted, dashed, etc.). In an alternative embodiment, if multiple tier-pair chords are selected, each corresponding traffic flow is displayed separately using a variety of colors or line styles. In another alternative embodiment, multiple windows 460 are displayed simultaneously, such that each window displays a separate data flow. Other options may also be provided, including, for example, displaying the traffic flow among the N most active nodes or tier-pairs.
Using conventional graphic interface techniques, the user can control the content of each window 460 by creating a zoom-box about a segment of the displayed timing diagram. In response, the monitoring system expands the selected segment across the span of the window 460, and redisplays the summary data with additional detail. Alternatively, an explicit timespan-control window 470 can be used to select the start and end times of the displayed information. In this window, the entire time-span of summary data is displayed, and a start-time slide pointer 471 and an end-time slide pointer 472 allow the user to zoom into selected times of the summary data. Optional text-input windows 473, 474 are also provided to facilitate this selection. This window 470 is preferably linked to a timing window 460 that is configured to display the total network traffic, and ‘goalpost’ lines 461 or other indicators are used to identify the selected time-span relative to the entire time-span of the summary data.
If the selected time-span is locked 475, the length of the time-span, or the distance between the goalposts 461, is fixed, and changing either the start time or stop time changes the other. In a preferred embodiment, when the time-span is locked 475, backward button 476 and forward button 477 appear, thereby enabling the user to step through the entire time-span at intervals equal to the amount of time between the goalposts 461. For example, if the time-span is locked and the selected duration of time is 20 seconds, any subsequent selection of the backward button 476 or forward button 477 will advance each of the slide pointers 471, 472, and consequently the goalposts 461, in the corresponding direction by 20 seconds.
Another control window 480 provides options for controlling the update of the summary data being displayed based on the selected time-span. If the auto-update option 481 is enabled, the tier-pair information displayed in the window 430 is automatically updated as the selected time-span is changed. Otherwise, the updating can be manually controlled, using the update button 482. The download option 485 allows the user to download from the network monitor only the detailed message data that corresponds to the time interval indicated by the goalposts. This advantageously eliminates the extra and often lengthy amount of time it would take to download all of the message data. The downloaded message data of interest can subsequently be analyzed in further detail with a network traffic analysis tool.
As noted above, the summary data can be selected from both active and inactive captures. In the event that an active capture is selected, the invention can be configured to continually collect new summary data from the capture agent so that analysis occurs in real-time. If, for example, the capture agent is configured to write summary data every 10 seconds, the system may be configured to check for new data every 10 seconds. A manual refresh button may also be provided to control window 480 to enable the user to choose when to display any newly received summary data, or to specify how frequently the display is to be refreshed.
In a preferred embodiment, the user is provided the option of applying one or more other filters to the summary data, including, for example, filters based on protocol, direction, packet size, application, abnormalities, and so on. Generally, select filter parameters are saved in files, and the user is provided the option of selecting one or more filter files to be applied to the summary data that is displayed. These filters, if so desired, can also be applied to any message data that is downloaded with the download option 485. The filters advantageously provide the user with a further mechanism for eliminating uninteresting traffic and reducing the time it takes to download message data needed for further analysis.
The user is also given the option of modifying the capture agents to collect different information based on the analysis of the summary data. For example, based on an initial analysis, the user may configure the capture agents to report the summary data more or less frequently, to achieve more or less resolution, or may configure the capture agents to capture message data from other tier-pairs, and so on.
The capture agents 510 are configured to capture message data and store it in a local data store 520, wherein data store 520 could be a traditional database, a file, computer-readable memory, or any other well-known data storage mechanism. As the message data is captured, the capture agents 510 are preferably configured to process the message data and generate summary data. The summary data may also be stored in the local data store 520, but it can alternatively be transmitted directly to the monitoring system 530. Correspondingly, the monitoring system 530 is configured to access the data stores 520 to retrieve the summary data, or receive the summary data directly. The monitoring system is also preferably configured to provide access to the captured message data at the data store 520 upon demand.
As noted above, the summary data that is provided to the monitoring system is categorized according to one or more properties of the network traffic, and the monitoring system 530 is configured to process and present this summary data based on this categorization. As also noted above, categorization by tier-pair has been found to be particularly well suited for traffic analysis and other purposes.
In this example embodiment, elements 551-553 that are typically found in the header 550 of each message are processed to provide summary data 570 that facilitates display and analysis via the monitoring system 530. The source 551 and destination 552 of each message are provided to a hashing component 560 to provide a hash value 571 that identifies the pair of source-destination nodes, without regard to which node is the source and which node is the destination, each source-destination pair being termed a tier-pair herein. For example, if the hash value 571 is based on a product of the addresses of the source and destination nodes, the same product will result regardless of which node of the pair is the source 551 and which node is the destination 552. The hashing component 560 maintains a table for mapping the hash value 571 back to the tier-pair, which is used when displaying the associated summary data.
An accumulator 565 is preferably provided to accumulate the size of each message associated with each source-destination pair during a specified time period. Using conventional terminology, a “bucket” is associated with each tier-pair, and this bucket is used to accumulate a measure 572 of the amount of data transferred by the tier-pair within each user-definable collection period.
The record of the amount of data (accumulated-size) 572 transferred by each tier-pair for each time period 573 is stored in the local data store 520 associated with each capture agent 510, or alternatively transferred directly to the monitoring system 530 as discussed above. The time 573 may be stored with each hash value 571 and accumulated-size data entry 572, or, in a preferred embodiment, a single time 573 is assigned to all hash values 571 associated with a non-zero accumulated size 572 during this identified time period 573.
As noted above, other message data, such as the port or protocol used to transfer the data, or other parameter, may be included in the summary data 570 that is captured by each agent 510, or each set of agents. These other parameters may be saved as discrete data entries, or included within the computed hash value 571 that uniquely identifies the particular combination of parameters that serve to classify or categorize the captured message data. It should be recognized that hash values 171 are used as an efficiency mechanism and are not required to effectively store the message data. In other words, the source 551, destination 552, size 553 and a corresponding time period could be written to the data store 520 in its original format.
The foregoing merely illustrates the principles of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements which, although not explicitly described or shown herein, embody the principles of the invention and are thus within its spirit and scope. For example, although the processing of the message data 550 to provide summary data 570 that facilitates display of the data is preferably provided by the capture agents 510, to optimize storage space requirements, one of skill in the art will recognize that the raw data 550 for each message may alternatively be initially stored at and/or subsequently processed by an intermediary device to provide the summary data 570. These and other system configuration and optimization features will be evident to one of ordinary skill in the art in view of this disclosure, and are included within the scope of the following claims.
In interpreting these claims, it should be understood that:
This application claims the benefit of U.S. Provisional Patent Application 60/750,667, filed 15 Dec. 2005 and U.S. Provisional Patent Application 60/773,563, filed 15 Feb. 2006.
Number | Date | Country | |
---|---|---|---|
60750667 | Dec 2005 | US | |
60773563 | Feb 2006 | US |