This invention relates to the distribution of traffic across a computer network.
Large Internet Service Providers (ISPs) maintain their own networks with backbones stretching from coast to coast. Because no single major ISP controls the market, it is beneficial for the major ISPs to interconnect their networks, so that users perceive that they are interacting with a single, transparent network. Typically, the interconnections are based on a combination of public multi-access facilities (e.g., MAE-EAST, MAE-WEST) and private point-to-point connections between routers controlled by two respective providers.
Traffic delivered from a customer of an ISP to a destination that is not on the ISP's network traverses an end-to-end path that includes three segments: (1) a segment from the ISP's customer through the ISP's network; (2) a segment represented by the interconnection between the ISP and a target ISP that serves the destination; and (3) a segment across the target ISP to the destination. To reduce congestion, it is often desirable to build both a network and a service model that reduces bottlenecks on all three of these path segments.
An ISP may only have direct control over bandwidth in the first component, while being dependent on the target ISP partially with respect to the second and entirely with respect to the third segment. An ISP may reduce these dependencies by implementing a “local delivery” or “best exit”-like service model that engineers end-to-end paths utilizing the ISP's own network as much as possible.
Traffic between one ISP and another ISP may be exchanged in what is known as “shortest-exit” manner. Under “shortest exit”, traffic originating from one ISP's customer (the source) that is destined for another ISP's customer (the destination) is sent toward the topologically closest interconnection point between the source and the destination. Where the source and destination are not geographically or topologically close to each other, the result tends to be higher costs borne by the destination's ISP. If a first ISP has only large content sites as customers and a second ISP provides only connectivity to content consumers, the cost differences may be so high that it is not in the interest of the second ISP to interconnect at high speed with the first ISP.
Consider traffic sent by a customer, whose content is hosted in a data center in San Jose, Calif., by the “ABC” ISP, across the “XYZ” ISP to a destination located in Washington, DC. Assume that ABC and XYZ can exchange traffic in two locations: one near San Jose, Calif., and the other near Washington, DC. Network traffic flows as follows: (1) a Washington-based customer of XYZ sends a request a short distance across the XYZ network to the interconnection point near Washington; (2) the request enters the ABC network near Washington and is carried a long distance across the ABC network to the San Jose data center; (3) a reply is created by the content server in the San Jose data center and is sent across a short distance on the ABC network to the nearest interconnection point near San Jose; and (4) the reply is carried a long distance across the XYZ network from the San Jose area to the customer in the Washington area.
The traffic flow is asymmetric, with the ABC network carrying the content request and the XYZ network carrying the content reply. Because content replies are typically far larger than content requests (often, by many orders of magnitude—for example, a 100-or-so-byte request can easily result in a multi-megabyte JPG image being returned), the result is greater bandwidth use on the XYZ network and thus greater cost to its ISP.
Referring to
The network structure shown in
Referring to
Because the ISP B San Jose POP 1030 advertises routes to ISP B DC POP 1060 networks, the ISP A San Jose POP 1020 routes a request from end user 1010 to web server 1070 through the ISP B San Jose POP 1030. The request travels across ISP B WAN 1050 to the ISP B DC POP 1060 and then to web server 1070.
A reply sent from web server 1070 to end user 1010 will likely take a different route as described in
Packets sent from end user 1010 to web server 1070 travel most of the distance across ISP B WAN 1050 and packets sent the other direction travel most of the distance across ISP A WAN 1040. Thus ISPs supporting large numbers of end users (e.g., ISPs engaged in server co-location and web hosting) end up carrying a greater portion of transmitted information than ISPs supporting mostly information suppliers. Connecting customers that are large traffic sources to POPs that are not close to the interconnection points may trigger a need for additional backbone capacity.
Costs of interconnection can be reduced by keeping traffic sources topologically close to the points where connections to other ISPs are located, because the bulk of traffic originating at those sources is usually headed toward destinations on other ISP's networks. Keeping the sources close to the exit points reduces the number of links on the backbone that the traffic must cross and therefore reduces the cost of transporting the traffic. Traffic exchange arrangements between ISPs typically have been based on three principles. First, networks are interconnected at multiple, geographically-diverse points, typically including at least one point on the East Coast and one point on the West Coast, in the case of the United States. Second, routing information exchanged at the interconnection points should be identical. Third, network traffic is routed using a “closest-exit” or “hot-potato” approach (i.e., traffic is routed through the topologically-closest exit point on the source ISP's network).
These arrangements place the bulk of the burden of distributing traffic across long distances on the receiver of the traffic. ISPs which are sources of data (i.e., which house large web farms) tend to have much lower costs than those that connect large numbers of data consumers (i.e., those with lots of dialups or other end-users) because, on average, they carry traffic much shorter distances.
The “closest-exit” cost model is easy to implement and reasonably fair, as long as the providers involved are of approximately the same size, exchange roughly the same amount of traffic in each direction, assume comparable costs, and derive comparable benefits from the interconnection arrangement. As the Internet has grown and the market has become somewhat divided into data producers (principally, web-hosting) and consumers (those which connect end users), the larger ISPs are recognizing that being on the receiving side of a traffic imbalance drives up costs without increasing revenue. The result is that larger ISPs resist establishing or expanding interconnections with large data sources.
One way to address the imbalance is to configure the routing system to exchange more detailed routing information (in technical terms, use of “more specifics” and “multiple-exit” discriminators), a step that may effectively disable classless inter-domain routing (CIDR) route aggregation between networks and/or may make ISPs susceptible to malicious behavior by other ISPs.
In one general aspect, a relay apparatus is provided. The relay apparatus may include an interface to a communication network, a data store including data elements, and a processor. Each data element associates a first network address of a server with a second network address of a server. The processor is configured to receive a request corresponding to a data element in the data store, use the data element to determine a corresponding server, and forward the request to the server in a manner such that any response to the forwarded request is sent to the relay apparatus.
The request may be received from an originating device with the processor configured to receive a response to the forwarded request and to forward the response the originating device.
The communication network may include one or more from following: a fiber optic network, a land-line network, a wireless network, a cable network, a telephone network, a local area network (LAN), a wide area network (WAN), an Ethernet network, an Internet Protocol (IP) network, a satellite network, an asynchronous transmission mode (ATM) network, and a frame relay network. Network addresses may be an anycast address, Internet protocol (IP) addresses, or an unicast addresses.
Additionally, the server may be a web server, a news server, or any other content-providing device. The relay device may provide caching capabilities to reduce network utilization between the relay apparatus and servers.
In another general aspect, an apparatus is provided including a server, a first point of presence, a second point of presence, and one or more relay devices. Each relay device is configured to receive a request from an originating device, forward the request to the server, receive a reply from the server, and forward the reply to the originating device, such that a reply to a request received through the first point of presence is forwarded to the originating device through the first point of presence and a reply to a request received through the second point of presence is forwarded to the originating device through the second point of presence.
Relay devices may include a web relay device configured to receive and forward hypertext transport protocol (HTTP) traffic. The server may be a web server or any other content-providing server. Additionally, the relay devices may include a cache component.
In another general aspect, a method may be provided including assigning a first address and a second address to a server, and assigning the first address to a relay group. The relay group may include one or more relay devices configured to forward a request sent to the first address to the second address such that any reply returned by the server is routed symmetrically with respect to the request. The addresses may be anycast addresses, Internet protocol (IP) addresses, and/or unicast addresses.
The details of one or more implementations are set forth in the accompanying drawings and description below. Other features and advantages will be apparent from the description and drawings, and from the claims.
Referring to
Symmetrical routing, as described in
By making routing more symmetric, content-providing ISPs gain greater control over traffic control and bandwidth utilization. ISPs may make routing more symmetric by introducing active content processing devices, called “web relays,” to engineer desired traffic flows. A web relay is a device that accepts requests for content, retrieves that content from an “origin server”, and then relays the content to the original requester. A web relay is similar to a web cache except that a web cache keeps copies of cacheable content it retrieves from origin servers so that future requests for the same content may be satisfied without again consulting the origin server.
Referring again to the example discussed above, a web architecture may be redesigned by inserting web relays between a request originator and a content location to engineer the traffic flows. Revisiting the example described above, the revised traffic flow is roughly as follows: (1) the XYZ customer sends a request, which is sent a short distance across the XYZ network to the interconnection point near Washington; (2) the request enters the ABC network near Washington and is initially processed by a web relay in the Washington area; (3) the web relay in the Washington area initiates a new request for the content and transmits it across the ABC backbone to the content server in the San Jose data center; (4) a reply is created by the content server in the San Jose data center and sent back to the web relay in the Washington area; (5) the web relay in the Washington area takes the reply it receives and uses it to satisfy the original user request; and (6) the reply is carried a short distance across the XYZ network to the original requester. Because both the data center and the web relay are internal to the ABC network, this traffic travels long distance across the ABC backbone.
In this example, traffic flow is symmetric between the content requestor and content source and is kept on the “source provider” network as much as possible. The result is reduced backbone utilization (and, thus, cost) for the ISP of the content-consuming customer, which should make providing interconnection capacity more attractive for that ISP. This, of course, means increased backbone utilization (and cost) for ABC but also provides better control over end-to-end bandwidth capacity. And if the ABC backbone network is superior in capacity and cost, improved performance as well as lower overall system cost will result (the total cost to both ABC and the other ISP).
Additionally, web relays may be used to keep traffic on an ISP's backbone as much as possible. Such routing provides an ISP a greater ability to control bandwidth and performance for its customers. By keeping traffic on its backbone, an ISP may be able to increase performance for both content-consuming customers and content-providing customers.
The web relay in this context may or may not actually cache content. Its primary function is to act as a “content relay” and ensure that traffic flows across the ABC network; where caching of the relayed content is possible (i.e., for static content and content which is not time-sensitive or otherwise not amenable to caching). Keeping copies local to the web relay can reduce load across the backbone.
In order for the web relays to perform their relaying function, web requests should be sent to the relays instead of the ultimate content sources. While it is possible to configure routers to redirect web traffic toward a web relay, doing so tends to increase layer-3 switching load and complexity.
A simpler approach may be used by assigning a special “anycast” address to a set of web relays that provide service for a particular content source. An anycast packet is one that should be delivered to one member in a group of designated recipients (e.g., the closest recipient in the group). Public domain service for the content source URLs (i.e., “www.company.com”) is then configured to point to this anycast address. Shortest-exit routing of content requests will automatically cause the anycast addresses to be handed-off to the content-hosting provider web relay that is closest to the interconnection point.
Referring to
In order to reduce the asymmetry of routing, it is desirable to force responses from information requests back through the same wide area network that the information request traversed. One way to do this is to inject a level of indirection into the system. Because ISPs typically use a “shortest-exit” routing policy, the requests usually enter a content-provider's network at a POP closest to the requesting host. It is desirable to provide a system whereby responses to information requests are routed through the same POP that the information request entered the network such that the exit point from the content provider's network is close to the end user 1010.
Referring to
Web relay 5020 receives information requests destined for web server 1070 and relays those requests across ISP B WAN 1050 to ISP B DC POP 1060. Finally, the request is sent to web server 1070. Without the use of web relay 5020, response packets would enter ISP A's network at ISP A DC POP 1080. However, when web relay 5020 forwards requests to web server 1070, web relay 5020 is identified as the source of the request. Thus, the response is initially sent back to web relay 5020 across ISP B WAN 1050.
Internet Requests for Comment (RFCs) that describe implementations of anycast addresses include: RFC 2372 (July 1998), RFC 1546 (November 1993), RFC 2373 (August 1999), and RFC 2526 (March 1999), all incorporated by reference.
As described in RFC 2373, an anycast address may include a subnet prefix identifier and an anycast identifier. The subnet prefix may be used to specify the network providing the anycast addresses. The anycast identifier is used to specify one of many possible anycast addresses on a particular subnet. A unicast address, or conventional IP address, specifies a single interface on a computer network. In contrast, an anycast address may specify more than one interface. For example, anycast addresses may be used to specify a group of one or more servers on a computer network. These servers may provide a redundant service. Routers forward packets destined to anycast addresses to the closest anycast destination for a particular address. Thus, anycast addresses provide a way to distribute load across one or more servers such that traffic from a requestor is routed to the server closest to that requestor.
As shown in
A web relay may include functions that are effectively a subset of those of a web cache; thus, a web cache may be used to implement a web relay. However, in the context of this design, a web cache deployed as a web relay must be aware that requests are received on an anycast address but are satisfied by queries to origin servers using the relay/cache's unicast address.
Referring to
A typical ISP network includes many geographically distributed POPs with content-providing services located in one or more of the POPs. Some implementations include one or more web relays 6010 distributed through some or all of the POPs. For example, POP A 6020 includes three web relays 6010, POP B 6030 includes two web relays 6010, POP C 6040 includes two web relays, and POP D includes a single web relay 6010 and five web servers 6060. Each of the POPs are connected to network 6070. One or more of the web relays 6010 at each POP may be configured to relay information requests to one or more of the web servers 6060.
Referring to
Other implementations are within the scope of the following claims. For example, the implementations described above may equally be used to modify routing behavior at the local level or global level. The techniques are not limited to controlling routing across the country.
For example, the invention may be used for any network service, such as, the file transfer protocol (FTP), the simple mail transport protocol (SMTP), instant messaging, and gopher, in addition to controlling the routing of web traffic. In these implementations, “web relay” may be a misnomer, but the same principles can be applied.
Number | Date | Country | |
---|---|---|---|
Parent | 09978768 | Oct 2001 | US |
Child | 12178498 | US |