1. Field of the Disclosure
The present disclosure generally relates to data networks and more particularly to detecting peer-to-peer (P2P) applications.
2. Description of the Related Art
Network administrators may wish to detect and monitor P2P applications that consume a large portion of network bandwidth. In some systems, P2P applications are detected and monitored using deep packet inspection.
P2P applications have become popular and may account for a large percentage of the bandwidth consumed over an Internet service provider's (ISP's) edge and core networks. Disclosed embodiments assist with identifying those users of an ISP who run P2P applications, with identifying the P2P traffic over the users' network links, and with analyzing usage patterns to detect P2P connections. Disclosed embodiments provide data that can help customer support personnel diagnose user connection issues or provide the identities of P2P operators who may be instructed to limit P2P usage to avoid affecting the quality of video, audio, or voice over Internet protocol (VoIP) transmissions.
In one aspect, a disclosed process detects P2P connections from a local network address by accessing and analyzing an NAT table to determine a quantity of communication sessions available. The quantity of available communications is stored and compared to a maximum level of communication sessions. If the quantity of available communication sessions meets or exceeds the maximum level, a P2P connection variable is stored as TRUE, and in some embodiments, if the quantity does not meet or exceed the maximum level, the P2P connection variable is set as FALSE. The maximum level of communication sessions may be a total number (e.g., 1024) of available communication sessions for a residential gateway (RG) or client, or may be determined as a fraction (e.g., 900/1024) of the total number of available communication sessions for the RG or client.
In some embodiments, after accessing the NAT table, the number of unique destination network addresses coupled to a local port is determined. The quantity of unique destination network addresses is stored and compared to a maximum level of unique destination addresses. A P2P connection variable is set as TRUE if the quantity of unique destination addresses meets or exceeds the maximum level of unique destination addresses. The disclosed process may further include storing the P2P connection variable as FALSE if the quantity of unique destination addresses is less than the maximum level of unique destination addresses.
If a local service port is otherwise analyzed according to disclosed methods, it may render a false positive regarding a P2P connection for a client. Therefore, in some embodiments, an identifier of a local port under analysis is compared to a plurality of known non-P2P port identifiers (e.g., Internet protocol television (IPTV) service port identifiers). If a local port that has above the maximum level of unique destination addresses is a well-known non-P2P service port, the disclosed process may store a connection variable for the port as FALSE to indicate no P2P connection or application.
Disclosed processes may include repeating procedures for accessing the NAT table to determine the quantity of unique destination addresses for other local ports. In some embodiments, well-known non-P2P ports or IPTV service ports may be removed from analysis under disclosed processes. However, for other individual local ports, the quantity of unique destination addresses is compared to a quantity of unique destination ports. A P2P connection variable is stored as TRUE if the quantity of unique destination addresses is equal to the quantity of unique destination ports. If the quantity of unique destination addresses is substantially equal to the quantity of unique destination ports, a difference value is determined by subtracting the quantity of unique destination ports from the quantity of unique destination addresses. In the event the number resulting from this subtraction is a negative value, an absolute value of the number is taken so that the difference value is positive. In accordance with the disclosed process, the P2P connection variable is stored as TRUE if the ratio of unique destination addresses to unique destination ports is equal to 1, or if the ratio is substantially equal to 1 and the difference value is less than or equal to a maximum difference value (e.g., 10). In some embodiments, the process includes storing the P2P connection variable as FALSE if the ratio is not substantially equal to 1 (or is not equal to 1). The disclosed process, including accessing the NAT table, may be performed by components of an RG that stores the NAT table locally (i.e., local to the RG). Accordingly, in some embodiments the process includes transporting NAT table data to a detection server responsive to a server request. Transporting NAT table data to the detection server may be scheduled to occur after a predetermined period or may occur upon a detection event, such as the RG detecting a potential P2P application.
In another aspect, a disclosed detection server is implemented to detect P2P applications by executing instructions that access and analyze an NAT table of a client. The accessed NAT table includes a plurality of entries with stored data for communication sessions between a local customer premises device (CPE) and at least one remote device. Individual entries in the client's NAT table are associated with a destination address, a destination port, a source port, and a source address. The detection server includes instructions for determining whether a threshold number (e.g., 20) of entries have differing destination addresses while having the same source address and source port. If a threshold number of entries have differing destination addresses, the client is flagged as a potential P2P operator.
Embodied detection servers may also execute machine-readable instructions to determine a difference value between a quantity of unique destination addresses and a quantity of unique destination ports in the NAT table. If the difference value is less than or equal to a threshold number (e.g., 10), the client may be flagged as a potential P2P operator. In some embodiments, if the difference value is greater than the threshold number, the client is unflagged (i.e., removed from then-current consideration) as a potential P2P operator.
In still another aspect, a disclosed RG is implemented for detecting a P2P application. The RG includes an interface that enables communication over a network between the local client and potentially multiple remote sources. A disclosed RG includes an NAT table stored on a tangible readable medium that includes session entries with data indicating communication sessions between the local client and multiple remote resources. Each of the multiple remote resources is associated with a remote network address and a remote port. The disclosed RG includes or communicates with a processor enabled for executing machine-readable instructions to determine a relationship (e.g., a ratio) between the remote network addresses associated with the multiple remote sources and the remote ports associated with the multiple remote sources. In some embodiments, the relationship is a ratio. The processor may report a P2P application in response to the ratio being equal to 1. The RG may report a P2P application to a detection server or other network-based (i.e., remote) device. If the ratio is substantially equal to 1, but is not equal to 1, a difference value may be calculated between a quantity of unique remote network addresses and a quantity of unique remote ports. If the difference value is less than or equal to a maximum value (e.g., 10), the RG may report a P2P application to the detection server.
In the following description, examples are set forth with sufficient detail to enable one of ordinary skill in the art to practice the disclosed subject matter without undue experimentation. It should be apparent to a person of ordinary skill that the disclosed examples are not exhaustive of all possible embodiments. Regarding reference numerals used to describe elements in the figures, a hyphenated form of a reference numeral typically refers to a specific instance of an element and an un-hyphenated form of the reference numeral typically refers to the element generically or collectively. Thus, for example, element 112-1 refers to an instance of a remote source, which may be referred to collectively as remote sources 112 and any one of which may be referred to generically as a remote source 112.
Disclosed embodiments are intended to provide lightweight methods and systems for RG-based identification of P2P applications and P2P users. By analyzing the relationships between local and remote pairings as evidenced within an RG's NAT table (i.e., NAT session tables), disclosed processes attempt to recognize P2P connections. For example, relationships between and among local IP addresses, corresponding local ports, remote IP addresses, and corresponding remote ports within the NAT table are analyzed to detect whether P2P applications are likely running or have likely run. By utilizing readily available information (e.g., NAT table data) from RGs, disclosed embodiments may provide a lighter-weight solution than deep packet inspection techniques. However, disclosed embodiments may operate in conjunction with deep packet inspection (DPI) techniques, methods, processes, and systems.
To monitor and analyze customer traffic patterns including P2P applications, DPI techniques may capture and analyze traffic at the full bit rate of transmission. A packet flow may be identified by a five-element tuple, for example: (src_ip, src_port, dst_ip, dst_port, protocol_number). Using pattern recognition methods, DPI systems attempt to match captured traffic patterns with predefined signatures to identify the application type of a particular packet flow. Example application types include without limitation: P2P, Web/Hypertext Transfer Protocol (Web/HTTP), File Transfer Protocol (FTP), and Domain Name System (DNS).
DPI techniques may be challenging. DPI techniques require looking into packet payloads, which may raise privacy issues. In addition, DPI techniques and systems may be challenged to detect P2P flows that are encrypted so as to obfuscate their protocols. In such cases, a DPI technique may not be able to identify some P2P applications. In addition, DPI techniques require expensive hardware and software systems that make deployment over a wide geographic area difficult. What is more, if a DPI technique or system includes inserting monitoring equipment at a hub office or other centralized location, it may be difficult to capture and/or analyze P2P traffic flowing from one customer to another customer served via the same hub office.
Disclosed systems offer advantages over only using DPI techniques. Disclosed systems analyze NAT tables (i.e., NAT session tables), which are typically readily available from an RG, to determine if a user is running a P2P application (i.e., to determine if the user is a P2P user or operator). The NAT table information (i.e., data) may be remotely pulled by a detection server from a user RG using a customized script. The NAT table data may then be parsed to derive relevant information for identifying P2P users.
Disclosed embodiments may operate using a combination of techniques for analyzing the NAT data. Firstly, the NAT data may be analyzed to determine the number of available sessions and thereby estimate whether a user is likely running a P2P application. An NAT table in an RG may be limited, for example, to storing 1024 session entries. If a user is running a P2P application, the number of available sessions typically drops to a very low number (e.g., zero) at times. However, the number of available sessions is dynamic, so it may be difficult to detect when the number of sessions is low or zero. Another challenge in detecting P2P connections with this technique is that if a P2P file sharing task is not very popular (i.e., the online swarm is small), it may not cause the NAT table to be exhausted of available sessions. In such cases, further disclosed techniques for analyzing the NAT table data may be employed.
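The available-session check described above can be sketched as follows. This is a minimal illustration only, not the disclosed implementation: the function names, the `min_available` threshold of 8, and the idea of testing several sampled snapshots (to cope with the dynamic session count) are assumptions introduced for the example; only the 1024-entry table size comes from the text.

```python
from typing import Iterable

NAT_CAPACITY = 1024  # example NAT table size given in the text


def available_sessions(used_entries: int, capacity: int = NAT_CAPACITY) -> int:
    """Return how many sessions are still free in the NAT table (never negative)."""
    return max(capacity - used_entries, 0)


def looks_exhausted(
    used_samples: Iterable[int],
    capacity: int = NAT_CAPACITY,
    min_available: int = 8,  # hypothetical threshold, not taken from the text
) -> bool:
    """Return True if any sampled snapshot left fewer than `min_available` sessions."""
    return any(
        available_sessions(used, capacity) < min_available for used in used_samples
    )
```

A small swarm may never trip this check, which is why the text treats it as only the first of several techniques.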
Another technique that may be used by disclosed embodiments includes, for example, comparing a number of destination addresses to a minimum number of addresses, determining a ratio of a quantity of unique destination addresses to a quantity of unique destination ports, and determining a difference between the quantity of unique destination addresses and the quantity of unique destination ports. If there are fewer than, for example, twenty unique destination addresses connected to a local address on the same local port, it suggests that a P2P application is not running. Also, if the local port is a well-known non-P2P service port such as 80 or 8080, or if the local port is an IPTV service port, disclosed systems may determine that a P2P application is not running.
However, if there are more than 20 unique destination addresses connected to a local address on the same local port, it is an indication of a P2P application. If the ratio of the quantity of unique destination addresses to the quantity of unique destination ports is 1, it is an indication that a user is running a P2P application and the user (or user client) may be flagged as a potential P2P operator. If the ratio is substantially equal to 1 and if the difference between the quantity of unique destination addresses and the quantity of unique destination ports is less than or equal to a maximum level (e.g., 10), disclosed systems may report a P2P application and a P2P variable may be set and stored as TRUE. In addition, a user client may be notified by a detection server, for example, to stop the P2P application.
In some embodiments, in response to determining that a P2P application is not running, a P2P variable is set and stored as FALSE. For example, if the ratio is substantially equal to 1 and the difference (the absolute value) between a quantity of unique destination addresses and the quantity of unique destination ports is greater than 10, disclosed systems may flag a user or client as a non-P2P operator.
In some embodiments, an RG monitors connection patterns by analyzing its own NAT table and tracking any potential P2P usage. The RG may periodically (e.g., daily) send to a detection server or other network device the NAT table data or may send a report summarizing any detected P2P usage. In some embodiments, the RG may respond to a detection server request to provide relevant NAT table data. Generally, disclosed systems that determine the ratio of the quantity of unique destination addresses to the quantity of unique destination ports and the difference between the quantity of unique destination addresses and the quantity of unique destination ports are intended to not depend on the timing of any NAT table pull (or push) and are intended to not depend on peer swarm size. In addition, because different P2P applications may be running on different local port numbers or different RG port numbers, such methods are intended to detect if a user is running more than one type of P2P application.
Referring now to the figures,
During a P2P connection (i.e., while running a P2P application as a P2P operator or user), client 102 may receive portions of file 114 from various combinations of the remote sources 112. For example, client 102 may receive from remote source 112-1 the first sixth of file 114, may receive from remote source 112-2 the second sixth of file 114, may receive from remote source 112-3 the third sixth of file 114, and so on. In this way, client 102, as a P2P operator, may receive different portions of file 114 from a plurality of remote sources 112.
As shown, RG 104 includes a processor 132 and media 128, which is a tangible computer readable medium that may include drive media, solid-state media, volatile media, persistent media, magnetic media and other forms of storage. As shown, media 128 includes NAT table 116 and may also include computer executable instructions that enable RG 104 to detect when client 102 is running a P2P application. RG 104 includes an interface for communicating with network 108 and an interface for communicating with client 102. NAT table 116 includes session entries with data indicating communication sessions between client 102 and one or more remote sources 112. Each remote source 112 in communication with client 102 is associated with a remote network address (i.e., a destination address) and a remote port (i.e., destination port) and data indicative of these addresses and ports is stored in NAT table 116.
As shown, processor 132 executes instructions for determining a relationship (e.g., a ratio) between the remote network addresses associated with remote sources 112 and the remote ports (not depicted) associated with remote sources 112. For example, processor 132 may determine that a ratio between unique remote network addresses in NAT table 116 (that are associated with remote sources 112) and unique destination ports (that are associated with remote sources 112) is equal to 1. If more than a minimum number (e.g., 20) of NAT sessions are logged in NAT table 116 and the ratio is equal to 1, the processor may flag client 102 as a P2P operator. Accordingly, RG 104 may report to P2P detection server 110 or other network components that client 102 is a P2P operator.
If the ratio is not equal to one but is substantially equal to 1 (e.g., within 6% of 1), then a difference value that is the absolute value of the difference between the quantity of unique remote network addresses and the quantity of unique remote ports is calculated. If the difference value is less than or equal to a maximum level (e.g., 10), a P2P application is reported and a P2P variable may be set as TRUE. However, if the difference value is greater than the maximum level, a P2P variable may be set as FALSE.
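The ratio and difference tests described above can be sketched as a single decision function. This is an illustrative sketch rather than the disclosed implementation: the function and parameter names are hypothetical, and applying the minimum of 20 to the count of unique destination addresses (rather than to logged sessions) is an assumption. The defaults of 20, 6%, and 10 are the example values given in the text.

```python
def is_likely_p2p(
    unique_dst_addrs: int,
    unique_dst_ports: int,
    min_addrs: int = 20,       # example minimum from the text
    tolerance: float = 0.06,   # "substantially equal to 1" example (within 6%)
    max_diff: int = 10,        # example maximum difference value
) -> bool:
    """Return True when the address and port counts suggest a P2P application."""
    # Too few peers on this local port, or nothing to divide by.
    if unique_dst_ports == 0 or unique_dst_addrs < min_addrs:
        return False

    # Ratio exactly 1: one destination port per destination address.
    if unique_dst_addrs == unique_dst_ports:
        return True

    # Ratio substantially equal to 1: fall back to the absolute difference.
    ratio = unique_dst_addrs / unique_dst_ports
    if abs(ratio - 1.0) <= tolerance:
        return abs(unique_dst_addrs - unique_dst_ports) <= max_diff

    return False
```

For instance, `is_likely_p2p(300, 285)` passes the 6% ratio test but fails the difference test (15 is greater than 10), which shows why both checks are applied together.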
As shown in
If the disclosed process is performed by P2P detection server 110, a detected level of available communication sessions that is less than the minimum level of available sessions results in P2P detection server 110 storing a P2P connection variable as TRUE. In some embodiments, if the quantity of available sessions is greater than the minimum level of sessions, P2P detection server 110 stores the P2P connection variable as FALSE. To help prevent a false detection of a P2P connection, P2P detection server 110 may determine whether used communication sessions listed in NAT table 116 correspond to well-known non-P2P service ports (e.g., 80 or 8080). If a used communication session corresponds to a well-known non-P2P service port, P2P detection server 110 may take this into account when determining whether a P2P connection may exist. For example, one communication session may be subtracted from the number of used communication sessions for each communication session in the NAT table that uses a well-known non-P2P service port.
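The adjustment for well-known non-P2P service ports can be sketched as below. This is a hedged example: the function name and the input shape (one port number per used NAT session) are assumptions, and the text does not specify which port of a session is compared, so the sketch simply takes whichever port the caller supplies. Ports 80 and 8080 are the examples given in the text.

```python
from typing import Iterable

# Example well-known non-P2P service ports named in the text.
WELL_KNOWN_NON_P2P_PORTS = frozenset({80, 8080})


def adjusted_used_sessions(
    session_ports: Iterable[int],
    non_p2p_ports: frozenset = WELL_KNOWN_NON_P2P_PORTS,
) -> int:
    """Count used sessions, subtracting one for each session on a non-P2P port."""
    return sum(1 for port in session_ports if port not in non_p2p_ports)
```

The adjusted count, rather than the raw count, would then feed the comparison against the session threshold.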
In some disclosed processes, P2P detection server 110 determines from NAT table 116 a quantity of unique destination network addresses communicatively coupled to a local port. The quantity of unique destination network addresses is stored and compared to a maximum level (e.g., 20) of unique destination addresses. If the quantity of unique destination addresses meets or exceeds the maximum level of unique destination addresses, a client may be flagged as a potential P2P operator and, in some cases, a P2P connection variable may be stored as TRUE. In contrast, if the quantity of unique destination addresses is less than the maximum level (e.g., 20) of unique destination addresses, P2P detection server 110 may unflag the client as a potential P2P operator and store or rewrite the P2P connection variable as FALSE. If a quantity of unique destination addresses meets or exceeds the maximum level of unique destination addresses and an identifier of a local port corresponding to the unique destination addresses is identical to an identifier of a known non-P2P service port, then P2P detection server 110 may store the P2P connection variable as FALSE. Alternatively, if a local port is identified as a well-known non-P2P service port (e.g., 80, 8080, or an IPTV service port), any unique destination addresses communicating with the local service port may be removed from the quantity of unique destination addresses used to help detect a P2P connection.
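Counting unique destination addresses per local port, as described above, can be sketched as follows. The `NatEntry` record, the function name, and the returned dictionary shape are hypothetical choices made for illustration; the threshold of 20 and the excluded ports 80 and 8080 are the example values from the text. This sketch takes the approach of skipping well-known ports before counting.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class NatEntry(NamedTuple):
    """One NAT session entry (hypothetical record layout)."""
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int


KNOWN_NON_P2P_PORTS = frozenset({80, 8080})  # examples from the text


def candidate_local_ports(
    entries: Iterable[NatEntry],
    min_unique_dst: int = 20,
    excluded_ports: frozenset = KNOWN_NON_P2P_PORTS,
) -> Dict[Tuple[str, int], int]:
    """Map (local address, local port) to its unique destination address count,
    keeping only pairs that meet or exceed `min_unique_dst`."""
    dst_by_local = defaultdict(set)
    for entry in entries:
        if entry.src_port in excluded_ports:
            continue  # well-known non-P2P service port: leave out of the analysis
        dst_by_local[(entry.src_addr, entry.src_port)].add(entry.dst_addr)
    return {
        local: len(dsts)
        for local, dsts in dst_by_local.items()
        if len(dsts) >= min_unique_dst
    }
```

Each pair returned here is only a candidate; the ratio and difference checks described elsewhere in this disclosure would still be applied to it.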
Disclosed processes may repeat, for each unique local port, the operations for accessing NAT table 116 data and determining a quantity of unique destination network addresses. For individual ports, a quantity of unique destination addresses is compared to a quantity of unique destination ports. A client may be flagged as a potential P2P operator if the quantity of unique destination addresses is equal to the quantity of unique destination ports and the quantity of unique destination addresses exceeds a threshold level (e.g., 20). The client may be flagged by storing a P2P connection variable as TRUE.
In some disclosed processes, P2P detection server 110 performs at least two operations to detect P2P connections. First, for individual ports of the local ports, P2P detection server 110 determines a ratio of the quantity of unique destination addresses to a quantity of unique destination ports. Second, if the ratio is substantially equal to one, P2P detection server 110 determines a difference value by subtracting the quantity of unique destination ports from the quantity of unique destination addresses. A client is flagged as a potential P2P operator if the ratio is substantially equal to 1 and the difference value is less than or equal to a maximum difference value (e.g., 10). In some embodiments, if the client is flagged as a potential P2P operator, a P2P connection variable is stored as TRUE. The P2P connection variable may be stored as FALSE if the ratio is not substantially equal to 1 (or is not equal to 1) or the difference value is greater than the maximum difference value (e.g., 10).
As shown in
In still other embodiments, P2P detection server 110, as shown in
As an additional step to determine whether the client is a P2P operator, RG 104 may determine a ratio of unique destination addresses to unique destination ports. If the ratio is equal to 1, the client is flagged as a potential P2P operator. If the ratio is not equal to 1 but is within a tolerance value (e.g., 6%) of 1, P2P detection server 110 may additionally determine whether a difference value, which is the absolute value of the quantity of unique destination ports subtracted from a quantity of unique destination addresses, is less than or equal to a threshold number (e.g., 10). If the difference value is greater than the threshold number (e.g., 10), the client may be unflagged as a potential P2P operator.
In
As shown in NAT table 200 in
If initial analysis shows that a threshold number of unique destination addresses occur in an NAT table for source ports that are not included in a database of known non-P2P ports, further analysis is conducted to detect a P2P application. Specifically, a ratio of the number of unique destination addresses to the number of unique destination ports is determined according to the following formula:
ratio=the # of unique dst IPs/the # of unique dst ports
If the ratio is equal to 1 then a client is flagged as a potential P2P operator. This is because typically only peers establish a single TCP connection to a unique peer. If the ratio is substantially equal to 1 and a difference value (calculated as the absolute value of the number of unique destination addresses minus the number of unique destination ports) is less than or equal to a threshold value (e.g., 10), it is also an indication that a P2P application is running. The difference is calculated according to the following equation:
ip2port_diff=|the # of unique dst IPs−the # of unique dst ports|
As shown in NAT table 200, there are 20 unique destination addresses in column 203 and 21 unique destination ports. This yields, according to the above formula, a ratio of 0.95, which is within a threshold tolerance of 6% of 1, and is therefore considered substantially equal to 1. The difference is calculated as the absolute value of 20-21, which yields 1. If the threshold value is 10, the calculated difference of 1 is less than 10, and this indicates that the application using source address 192.168.1.69 and source port 7375 is likely a P2P application. The difference calculation allows for a relatively small difference between the number of destination addresses and destination ports, which may be due to a unique P2P contact (e.g., a unique IP) having two sessions on two different ports: one session on a TCP port, the other on a different UDP port, with both sessions connected to the same local port. If the ratio is far away from 1 or is not substantially equal to 1 and the difference is greater than the threshold value (e.g., 10), then the NAT table is not considered to evidence a P2P application.
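The arithmetic for the NAT table 200 example can be written out as a short worked calculation. It uses only the values stated above (20 unique destination addresses, 21 unique destination ports, a 6% tolerance, and a threshold of 10); the symbol names are informal labels chosen for this illustration.

```latex
\[
\text{ratio} = \frac{\#\,\text{unique dst IPs}}{\#\,\text{unique dst ports}}
             = \frac{20}{21} \approx 0.952,
\qquad
\left| 1 - \frac{20}{21} \right| = \frac{1}{21} \approx 0.048 \le 0.06
\]
\[
\text{ip2port\_diff} = \left| 20 - 21 \right| = 1 \le 10
\]
```

Both conditions hold, so the entries for source address 192.168.1.69 and source port 7375 are treated as likely evidence of a P2P application.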
Detecting P2P applications as discussed above may be conducted by a network based P2P detection server or by a local RG (e.g., RG 104 in
Accordingly, disclosed embodiments provide relatively lightweight systems for detection of P2P applications and do not necessarily require looking into packet payload. Such systems may help to detect P2P flow even if the flow is encrypted. P2P user identification can help in diagnosing some user connection issues and may be used to instruct users to limit P2P usage in certain situations to avoid affecting quality of video or VoIP transmissions.
To the maximum extent allowed by law, the scope of the present disclosure is to be determined by the broadest permissible interpretation of the following claims and their equivalents, and shall not be restricted or limited to the specific embodiments described in the foregoing detailed description.