The present invention relates to providing a peer to peer path optimizer (PPO) to examine peer to peer networking messages and dynamically and transparently redirect them to a cost efficient path.
Peer to peer (P2P) networking has emerged as a popular form of exchanging data such as movies or music among individuals using the Internet. In a P2P network each computer in the network has the same responsibilities as each of the others, i.e. it is a “peer”. Many variations of P2P networks have been created; at the time of writing the most prevalent are Napster, Kazaa and Gnutella. The use of P2P for transferring large amounts of multimedia data such as movies or music has significantly increased the amount of information transmitted on the Internet.
P2P has led to increased financial pressure for network service providers. A network service provider is an entity that maintains a group of computers or nodes that form a network. Examples of networks include but are not limited to: a network controlled by an Internet Service Provider (ISP), a corporate network or a university network.
A network service provider typically must pay a fee for the traffic to and from their network.
Given the popularity of P2P networking, it is difficult for any network service provider to block P2P traffic. The network service provider is left with few choices, namely:
Thus, there is a need for an alternative approach, which allows a network service provider to cost effectively constrain P2P traffic through their network, while maintaining or improving existing performance to the user. The present invention addresses this need.
The present invention is directed to a peer to peer optimizer, the optimizer examining peer to peer messages between nodes within networks connected to the optimizer, for the purpose of optimizing behavior on each of said networks.
The present invention is also directed to a peer to peer optimizer, the optimizer examining peer to peer messages between nodes within networks connected to the optimizer, for the purpose of determining a cost efficient path for each peer to peer message.
The present invention is also directed to a process for managing peer to peer messages between and within networks, the process comprising the step of determining a cost efficient path for each of the peer to peer messages.
The present invention is also directed to a computer readable medium containing instructions for managing peer to peer messages between nodes in networks, the medium comprising instructions for optimizing behavior on the networks.
The present invention is further directed to a system for optimizing peer to peer messages between nodes within networks, the system comprising a peer to peer optimizer, the system utilizing the optimizer to examine messages between the nodes for the purpose of optimizing behavior on each of the networks.
For a better understanding of the present invention, and to show more clearly how it can be carried into effect, reference will now be made, by way of example only, to the accompanying drawings in which:
To minimize the cost of P2P traffic, network 12 utilizes PPO 10 to determine a cost efficient path for exchanging P2P data between nodes 14. A node 14 is any computer that is capable of receiving or transmitting P2P data.
Referring now to
Assuming that a P2P request can be serviced within a single network such as 12a, then typically the most cost efficient paths for P2P transfer will be within network 12a. Examples would be connections to nodes 14a and 14b. However, this may not always be the case. For example, a request to node 14d may be very expensive if node 14d, which contains the data, resides halfway around the world within a corporate intranet. In such a scenario, node 14f, within network 12b, which contains the required data, would be a more cost efficient choice.
In determining a cost efficient path for the delivery or reception of P2P data, PPO 10 combines the cost class of each node on the end of a potential exchange of data. This combination results in a path cost value. For example, a request from node 14e for a file on node 14a may result in a path cost of 155. This example is one of simple addition of the cost classes of the two nodes to determine a path cost. The inventor does not intend to restrict the present invention to any specific algorithm to obtain a path cost. For example, a weighting factor may be applied to nodes with a high cost class to exclude them from consideration in calculating a path cost.
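By way of illustration only, the simple-addition example above might be sketched as follows. The function name `path_cost`, the `exclude_above` threshold (one hypothetical way of excluding nodes with a high cost class) and the cost class values are assumptions made for this sketch and are not taken from the disclosure; any other combining algorithm could be substituted.

```python
from typing import Optional


def path_cost(cost_a: int, cost_b: int,
              exclude_above: Optional[int] = None) -> Optional[int]:
    """Combine the cost classes of the two end nodes by simple addition.

    Returns None when either node's cost class exceeds the hypothetical
    exclusion threshold, meaning the path is not considered at all.
    """
    if exclude_above is not None and max(cost_a, cost_b) > exclude_above:
        return None
    return cost_a + cost_b


# Hypothetical cost classes, chosen only for illustration.
print(path_cost(10, 40))                       # 50
print(path_cost(10, 500, exclude_above=200))   # None
```

A candidate path for which `None` is returned would simply be left out when paths are compared.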
In
PPO 10 is where the present invention resides. PPO 10 serves to provide three main functions:
Although the example of
Before describing in detail the structure of PPO 10, we will refer first to how it may be utilized in a variety of P2P models.
Referring now to
To explain how
Referring to
In the present invention the topology of network 60 is reconfigured as shown in
Referring now to
Referring now to
With regard to the topologies of the networks shown in
Referring now to
Licensing module 102 is responsible for enforcing the maximum number of concurrent users of PPO 10 for which the customer (i.e. the owner of a PPO 10) has paid a license fee. Configuration module 104 maintains the configuration of PPO 10, such as the sub-networks and IP addresses of the nodes that reside within a network 12. Statistics module 106 maintains the statistics for PPO 10, such as the number of files redirected and the number of concurrent users. Logging module 108 is responsible for logging functions, such as when PPO 10 was started up or shut down and when the number of licenses was exceeded. Load balancer feedback module 110 provides a negative feedback loop to an external load balancer so that multiple PPOs under the control of a customer will receive equal traffic. WCCP module 112 operates with the Cisco Web Cache Communication Protocol (WCCP) to ensure that a router, such as distribution router 24 of
P2P application 116 acts as the control program for PPO 10. Application 116 comprises: route/path cost module 118, query module 120, ping/pong network training module 122, connection manager module 124 and transfer manager module 126. Route/path cost module 118 assigns a path cost to each proposed connection based upon the cost class of each node in the connection.
Query module 120 comprises: string edit distance module 128, search amalgamation module 130, query routing logic module 132, QoS modification module 134 and content index module 136. String edit distance module 128 determines the similarity between the name of a requested file and the filenames known to PPO 10. Search amalgamation module 130 utilizes string edit distance module 128 to map the name of a requested file to the known files available, regardless of cost class. Query routing logic module 132 routes queries for a file to the nodes that are likely to contain the requested file. Module 132 maintains a list of all messages to and from a network 12. By maintaining such a list, module 132 may quickly drop spurious messages, such as requests for data that have not been acknowledged. QoS modification module 134 rewrites the routing information of module 132 to select a cost efficient path determined by route/path cost module 118. Routing information includes QoS parameters such as stated bandwidth and uptime. The purpose of rewriting routing information is to provide the requestor with a path to a file or files that makes the most efficient use of network resources. By doing so a message may be redirected. Content index 136 maintains an index of content available for access in nodes 14 within networks 12. Content index 136 also contains the cost class for each node in which the content resides. Typically such content will be a file but may also include other forms of data such as streaming media. It is not the intent of the inventor to restrict the use of the term “file” to any particular form of P2P data that may be examined by or transmitted through PPO 10.
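As a non-limiting illustration of how a string edit distance might be used to map a requested name onto a known filename, the following sketch uses the Levenshtein distance. The disclosure does not name a particular distance measure, so the choice of Levenshtein, the case-insensitive comparison, and the function names `edit_distance` and `closest_filename` are assumptions made here.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free when equal)
            ))
        prev = curr
    return prev[-1]


def closest_filename(requested: str, known: List[str]) -> str:
    """Return the known filename nearest to the requested name."""
    return min(known,
               key=lambda name: edit_distance(requested.lower(), name.lower()))
```

A request for a slightly misspelled name would then resolve to the nearest indexed filename, which is the kind of mapping attributed above to search amalgamation module 130.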
Ping/Pong network training module 122 serves to fill host cache 138 with IP addresses of nodes 14 based upon the Ping messages received by PPO 10 from nodes 14. Ping/Pong network training module 122 sends a plurality of Pong messages in response to a Ping message in an attempt to train the network that sent the Ping message. PPO 10 sends one Pong message for each node 14 of which it is aware that is in the same cost class as the sender of the Ping. This training provides the sending network with nodes other than those for which PPO 10 wishes to restrict traffic.
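The selection of Pong messages described above may be sketched as follows. This is a minimal illustration only: the layout of the host cache (a mapping from IP address to cost class), the `Pong` type, and the choice to return no Pongs for an unknown sender are all assumptions, not details given in the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Pong:
    ip: str  # address advertised to the node that sent the Ping


def pongs_for_ping(sender_ip: str, host_cache: Dict[str, int]) -> List[Pong]:
    """Build one Pong per known node that shares the sender's cost class.

    host_cache maps a node's IP address to its cost class (hypothetical
    layout). The sender itself is not advertised back to it.
    """
    sender_class = host_cache.get(sender_ip)
    if sender_class is None:
        return []  # assumption: an unknown sender receives no training Pongs
    return [Pong(ip) for ip, cost_class in host_cache.items()
            if cost_class == sender_class and ip != sender_ip]
```

Nodes in a different cost class never appear in the reply, so the sender learns only of peers in its own cost class.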
When a connection is established between a node 14 and PPO 10, connection manager 124 maintains the connection until the node 14 drops the connection. Index fetch module 142 is responsible for obtaining content names and adding them to content index 136.
Transfer manager 126 is in essence a proxy that handles the exchange of P2P data. Manager 126 utilizes fetch redirection module 144 to redirect a request for content to a node with a lower path cost. A node 14 may make a request for a specific file on another node 14. If that file is available via a more cost efficient path, fetch redirection module 144 will silently direct the request to another node having a more cost efficient path.
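The redirection decision described above might look like the following sketch. The data structures (`content_index` mapping a filename to the nodes holding it, `cost_class` mapping a node to its cost class), the additive path cost, the node names and the tie-breaking rule that keeps the originally requested node are assumptions made for illustration.

```python
from typing import Dict, List


def redirect_fetch(requester: str, requested_node: str, filename: str,
                   content_index: Dict[str, List[str]],
                   cost_class: Dict[str, int]) -> str:
    """Return the node that should serve the file.

    The originally requested node is kept unless another node holding the
    same file offers a strictly lower path cost.
    """
    def cost(node: str) -> int:
        # Additive path cost; any other combining rule could be used.
        return cost_class[requester] + cost_class[node]

    candidates = [n for n in content_index.get(filename, [])
                  if n != requester and n in cost_class]
    if not candidates:
        return requested_node
    best = min(candidates, key=cost)
    if requested_node in cost_class and cost(requested_node) <= cost(best):
        return requested_node
    return best


# Hypothetical nodes and cost classes, for illustration only.
costs = {"local_a": 5, "local_b": 5, "remote_x": 150}
index = {"song.mp3": ["local_b", "remote_x"]}
print(redirect_fetch("local_a", "remote_x", "song.mp3", index, costs))  # local_b
```

The requesting node is not told that the source has changed, which corresponds to the silent redirection described above.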
A plurality of P2P protocol specific handlers 146 are responsible for maintaining a specific P2P protocol, for example Gnutella or Fasttrack. Transmission Control Protocol (TCP) handler 148 ensures the maintenance of correct TCP behavior. Similarly, Internet Protocol (IP) handler 150 serves the same purpose for IP. It is not the intent of the inventor to restrict the present invention to the use of TCP and IP. These serve only as an example. As one skilled in the art can appreciate, any number of communication protocols may be used, including, but not restricted to: ATM, UDP, and wireless.
Differentiated Services Code Point (DSCP) marking module 152 utilizes Differentiated Services (DiffServ or DS) to mark IP packets by class so that certain types of packets get precedence over others. For example, a limit may be imposed on the number of P2P packets allowed to enter or leave a network 12. Such a feature is optional but may be used by networks that find P2P data is consuming too much of their bandwidth. As one skilled in the art can appreciate, any number of schemes, such as packet snooping or recognizing specific port addresses, may be utilized to identify P2P traffic. It is not the intent of the inventor to restrict the ability to limit P2P traffic to the DSCP solution.
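As a hedged illustration of DSCP marking in general, and not of module 152 itself, the sketch below places a six-bit code point in the upper bits of the DS field and applies it to a socket. The choice of class selector CS1 for P2P traffic and the helper names are assumptions; the `IP_TOS` socket option is platform dependent and may be ignored on some systems.

```python
import socket

CS1 = 8  # class selector 1; assumed here as a low-priority marking for P2P


def dscp_to_ds_byte(dscp: int) -> int:
    """Place a 6-bit DSCP value in the upper six bits of the DS field."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def mark_socket(sock: socket.socket, dscp: int) -> None:
    """Ask the operating system to mark outgoing packets on this socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_ds_byte(dscp))
```

Routers configured for Differentiated Services can then queue or limit packets carrying that code point separately from other traffic.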
PPO 10 optimizes behavior between and within the networks 12 to which it is connected. Behavior, as used here, refers to the creation, destruction, modification or ignoring of messages. These actions are intended to optimize the future behavior of each network 12, not just the handling of the current message. An example of creating a message is a false pong. An example of destroying a message is deleting a message that has already been answered or, in the case of Gnutella, a message whose TTL has expired. Modification is not limited to QoS modification module 134. For example, search amalgamation module 130 may modify messages to reflect the closest filename as determined by string edit distance module 128. In the case of a specific protocol, for example Gnutella, modification may include overwriting the TTL portion of the message when forwarding the message. Similarly, the GUID for a message may be changed if needed. In essence, depending upon the protocol, PPO 10 may modify messages as required to optimize network behavior. An example of ignoring a message is to ignore a query request to a node in a network, as traffic from that network has been restricted.
In order for PPO 10 to examine and act upon P2P requests, it must be aware of a variety of P2P protocols. This functionality is handled by P2P protocol specific handlers 146.
By way of example we refer next to how a P2P protocol specific handler 146 may interface with the Gnutella protocol. It is not the intent of the inventor to restrict the present invention to work simply with the Gnutella protocol, but rather to provide a practical example of how the present invention may deal with P2P requests.
The Gnutella protocol has five message types, namely: ping, pong, query, queryhit and push. How a handler 146 handles each of these messages is shown
A ping message is used to determine if a node 14 is active, and helps to establish a database of active nodes in host cache 138 of
A query message is a search message containing a fragment of a filename, in other words, a request for data. In the present example, incoming query messages from an external node are dropped, thus appearing to be a query miss and thereby avoiding servicing a P2P request from a network 12b to 12n. It is not the intent of the inventor to require that query messages be dropped; it is simply one method that may be used to restrict unwanted P2P traffic into network 12a. Implementations utilizing PPO 10 may choose to allow free flow of all messages or to provide a limited amount of traffic. Query messages from a node 14 within network 12a are forwarded first to the nodes 14 containing the requested file that have a cost efficient path. Typically these would be nodes 14 within network 12a, but that may not always be the case. The nodes 14 having the requested data will then respond with queryhit messages. If there are no matches for the request for data, or if no queryhit message is returned, then the query message is sent to a random set of nodes 14 within network 12a. One method of determining the random set of nodes 14 to receive the query message would be to use a weighted probabilistic function such as a round robin method based upon the number of files available from each node 14. In this way, the query does not always go to the node 14 having the largest number of files. If there is still no match, the query is forwarded to nodes 14 having the lowest path cost in networks 12b to 12n.
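One possible weighted probabilistic selection of the nodes that receive a query is sketched below. It uses weighted random sampling without replacement rather than a round robin method, with the weight taken as the number of files each node offers; the function name, the exclusion of nodes offering no files and the sample data are assumptions made for illustration.

```python
import random
from typing import Dict, List, Optional


def pick_query_targets(file_counts: Dict[str, int], k: int,
                       rng: Optional[random.Random] = None) -> List[str]:
    """Choose up to k distinct nodes, favoring nodes that offer more files."""
    rng = rng or random.Random()
    # Nodes with no files cannot answer a query, so they are left out.
    pool = {node: count for node, count in file_counts.items() if count > 0}
    chosen: List[str] = []
    while pool and len(chosen) < k:
        nodes = list(pool)
        pick = rng.choices(nodes, weights=[pool[n] for n in nodes], k=1)[0]
        chosen.append(pick)
        del pool[pick]  # sample without replacement
    return chosen


# Hypothetical file counts per node.
targets = pick_query_targets({"n1": 100, "n2": 10, "n3": 1, "n4": 0}, k=2)
```

Because the choice is probabilistic, a node with few files is still selected occasionally, so the query does not always go to the node having the largest number of files.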
A queryhit message is a response to a query message. Incoming queryhit messages from nodes in networks 12b to 12n are forwarded to the appropriate node 14 within network 12a. Incoming queryhit messages from nodes 14 within network 12a are forwarded back to the requesting node within network 12a and not sent out to networks 12b to 12n.
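The queryhit handling above can be illustrated with a small routing table keyed by message identifier. This is a sketch under stated assumptions: the class name `QueryHitRouter`, the use of the query GUID as the key, and the decision to drop a queryhit that matches no recorded query are hypothetical details not specified in the disclosure.

```python
from typing import Dict, Iterable, Optional


class QueryHitRouter:
    """Remember which internal node issued each query and route replies."""

    def __init__(self, internal_nodes: Iterable[str]) -> None:
        self.internal = set(internal_nodes)
        self.pending: Dict[str, str] = {}  # query GUID -> requesting node

    def note_query(self, guid: str, requester: str) -> None:
        """Record the originator of an outgoing query."""
        self.pending[guid] = requester

    def route(self, guid: str) -> Optional[str]:
        """Return the internal node to receive the queryhit, or None to drop.

        A queryhit is delivered only to a requester inside the local
        network; it is never sent out to the other networks.
        """
        requester = self.pending.get(guid)
        if requester is None or requester not in self.internal:
            return None
        return requester
```

The same lookup serves queryhits arriving from either side, since in both cases the reply is delivered to the requesting node inside the local network.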
A push message is used when the transmitting node has a firewall and the receiving node does not. The receiving node sends a push message, which causes the transmitting node to open a connection directly to the receiving node. Incoming push requests may be optionally dropped by PPO 10 and are propagated unchanged on the way out of network 12a.
By way of example of how the present invention may be utilized to provide support for the Gnutella protocol, we will now refer to logical flow diagrams 11 to 15. As with the previous discussion with regard to Chart 1, we will be referring to the components of
Referring now to
Referring now to
In the above description of
Referring now to
A test is next made at step 222 to determine if the file has been located on a node 14 within network 12a. If the file has been located, the location information is forwarded to the originator of the query message at step 224. If at step 222 the file has not been located, the query message is forwarded to a weighted subset of connected nodes having the lowest cost class in networks 12b to 12n at step 226. As mentioned before, a weighted round robin scheme may be utilized to select the nodes 14 in networks 12b to 12n to receive the query. A connected node is one that has established a communication path with PPO 10, for example via TCP/IP. Returning to step 214, if the query message is not from a node 14 within network 12a, processing moves to step 228 where the TTL value of the message is decremented. A test is then made at step 230 to determine if the TTL value for the message is greater than zero. If it is not, then the message is dropped at step 232 and processing ends. If the TTL value is greater than zero then processing moves to step 226 where the query message is forwarded to all connected nodes in networks 12b to 12n. Optionally, if the query is from a node in networks 12b to 12n, the query may simply be dropped or returned to the requesting node at step 226, thus not requiring PPO 10 to forward the query to connected nodes.
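The decision steps described above may be summarized in the following sketch, which assumes that a query from outside network 12a is forwarded only while its decremented TTL remains greater than zero. The function name, the boolean inputs and the action labels are hypothetical; the step numbers in the comments refer to the steps named in the text.

```python
from typing import Tuple


def handle_query(from_local: bool, found_locally: bool,
                 ttl: int) -> Tuple[str, int]:
    """Return (action, ttl) for a query message."""
    if from_local:                               # step 214
        if found_locally:                        # step 222
            return ("reply_to_originator", ttl)  # step 224
        return ("forward_external", ttl)         # step 226
    ttl -= 1                                     # step 228
    if ttl > 0:                                  # step 230
        return ("forward_external", ttl)         # step 226
    return ("drop", ttl)                         # step 232
```

For instance, a query arriving from outside network 12a with a TTL of 1 is dropped, while one with a TTL of 3 is forwarded with a TTL of 2.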
As discussed above with reference to
Referring now to
Referring now to
Although this disclosure and the claims appended hereto make use of the terms query, queryhit, ping, pong, push and connect, it is not the intent of the inventor for these terms to be specifically associated with the Gnutella protocol. To the inventor the term query is analogous to a request for data and queryhit to a reply to a query, indicating that the data has been located. A ping is a standard computer communications term, sometimes expanded as Packet Internet Groper; in essence it is a message to determine whether a specific computer in a network is accessible. A pong is a response to a ping. A push is a message sent directly to a node that is protected by a firewall. A push is used to request a direct connection between the node behind the firewall and the node sending the push message so that the node behind the firewall can “push” data to the requesting node. A connect is a connection between two nodes.
Although the disclosure refers to a PPO within an ISP by way of example, it is not the intent of the inventor to restrict the invention to such a configuration. For example, a PPO may be used within any network, including networks utilized by corporations to exchange data with their employees or customers. Further, multiple PPOs may be utilized to provide redundancy in case one PPO fails and also to provide load balancing. In the case of a network 12 utilizing a single PPO, if the PPO failed, network 12 would revert to the status quo without the PPO; i.e. all P2P messages are exchanged with no decision made on which node should service the request.
Although the invention has been described with reference to certain specific embodiments, various modifications thereof will be apparent to those skilled in the art without departing from the spirit and scope of the invention as outlined in the claims appended hereto.
Number | Date | Country |
---|---|---|
20030208621 A1 | Nov 2003 | US |