1. Field of the Invention
This disclosure relates generally to load balancing among servers. More particularly but not exclusively, the present disclosure relates to achieving load balancing by providing, in response to a DNS query from a client, the address of a server that is expected to serve the client with high performance for a given application, based at least in part on remotely obtained health check information.
2. Description of the Related Art
Under the TCP/IP protocol, when a client provides a symbolic name (“URL”) to request access to an application program or another type of resource, the host name portion of the URL needs to be resolved into an IP address of a server for that application program or resource. For example, the URL (e.g., http://www.foundrynet.com/index.htm) includes a host name portion www.foundrynet.com that needs to be resolved into an IP address. The host name portion is first provided by the client to a local name resolver, which then queries a local DNS server to obtain a corresponding IP address. If a corresponding IP address is not locally cached at the time of the query, or if the “time-to-live” (TTL) of a corresponding IP address cached locally has expired, the DNS server then acts as a resolver and dispatches a recursive query to another DNS server. This process is repeated until an authoritative DNS server for the domain (e.g., foundrynet.com, in this example) is reached. The authoritative DNS server returns one or more IP addresses, each corresponding to an address at which a server hosting the application (“host server”) under the host name can be reached. These IP addresses are propagated back via the local DNS server to the original resolver. The application at the client then uses one of the IP addresses to establish a TCP connection with the corresponding host server. Each DNS server caches the list of IP addresses received from the authoritative DNS for responding to future queries regarding the same host name, until the TTL of the IP addresses expires.
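The caching and TTL behavior described above can be sketched in Python. This is a toy in-memory model, not an actual DNS implementation; the `AUTHORITATIVE` record, the class name, and the sample addresses are all illustrative assumptions.

```python
import time

# Toy authoritative records: host -> (IP list, TTL in seconds).
# The addresses are from the documentation range 192.0.2.0/24.
AUTHORITATIVE = {"www.foundrynet.com": (["192.0.2.10", "192.0.2.11"], 30)}

class LocalDnsServer:
    """Caches authoritative answers until their TTL expires."""
    def __init__(self):
        self.cache = {}  # host -> (ip_list, expiry_timestamp)

    def resolve(self, host, now=None):
        now = time.time() if now is None else now
        entry = self.cache.get(host)
        if entry is not None and entry[1] > now:
            return entry[0]                    # cache hit, TTL still valid
        ips, ttl = AUTHORITATIVE[host]         # stand-in for a recursive query
        self.cache[host] = (ips, now + ttl)    # cache until the TTL expires
        return ips

dns = LocalDnsServer()
first = dns.resolve("www.foundrynet.com", now=0)    # recursive lookup
second = dns.resolve("www.foundrynet.com", now=10)  # served from cache
third = dns.resolve("www.foundrynet.com", now=40)   # TTL expired, re-query
```

The same list is returned in all three cases here; the point of the sketch is only when the cache, rather than the (simulated) authoritative server, supplies the answer.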
To provide some load sharing among the host servers, many authoritative DNS servers use a simple round-robin algorithm to rotate the IP addresses in a list of responsive IP addresses, so as to distribute equally the requests for access among the host servers.
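The round-robin rotation can be sketched as follows; the addresses are illustrative and the function name is an assumption of this sketch.

```python
from collections import deque

# Each query returns the current ordering, then rotates the list so a
# different address leads the next reply.
addresses = deque(["192.0.2.1", "192.0.2.2", "192.0.2.3"])

def next_dns_reply(addrs):
    reply = list(addrs)  # ordering returned for this query
    addrs.rotate(-1)     # rotate left so the next reply leads differently
    return reply

reply1 = next_dns_reply(addresses)
reply2 = next_dns_reply(addresses)
```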
The conventional method described above for resolving a host name to its IP addresses has several shortcomings. For instance, the authoritative DNS server does not detect a server that is down. Consequently, the authoritative DNS server continues to return a disabled host server's IP address until an external agent updates the authoritative DNS server's resource records. Further, the conventional DNS algorithm allows invalid IP addresses (e.g., one corresponding to a downed server) to persist in a local DNS server until the TTL for the invalid IP address expires.
One aspect of the present invention provides a system to balance load among host servers. The system includes an authoritative domain name server, and a load balance switch coupled to the authoritative domain name server as a proxy to the authoritative domain name server. A plurality of site switches are communicatively coupled to the load balance switch and remote from the load balance switch. At least one of the site switches can obtain health check information indicative of health status of ports associated with host servers for that site switch and can provide the obtained health check information to the load balance switch, to allow the load balance switch to arrange a list of network addresses from the authoritative domain name server based at least in part on the health check information provided by the site switch.
Embodiments for global server load-balancing techniques that are based at least in part on distributed health check information are described herein. In the following description, numerous specific details are given to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention can be practiced without one or more of the specific details, or with other methods, components, materials, etc. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring aspects of the invention.
Reference throughout this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
As an overview, an embodiment of the invention provides a global server load-balancing (GSLB) switch that serves as a proxy to an authoritative DNS and that communicates with numerous site switches coupled to host servers serving specific applications. The GSLB switch receives, from the site switches, operational information regarding host servers being load balanced by the site switches. When a client program requests a resolution of a host name, the GSLB switch, acting as a proxy of an authoritative DNS, returns one or more ordered IP addresses for the host name. The IP addresses are ordered using metrics that include the information collected from the site switches. In one instance, the GSLB switch places the address that is deemed “best” at the top of the list.
One of these metrics includes health check information, which is indicative of the host servers' health. In the prior-filed U.S. application Ser. No. 09/670,487, entitled “GLOBAL SERVER LOAD BALANCING,” filed Sep. 26, 2000 and U.S. application Ser. No. 10/206,580, entitled “GLOBAL SERVER LOAD BALANCING,” filed Jul. 25, 2002, embodiments were disclosed where the GSLB switch carried out health checks in a “centralized manner.” That is, to determine the health of the servers and/or the health of the host application(s) on the servers, the GSLB switch sends Layer 4 transmission control protocol (TCP) or User Datagram Protocol (UDP) health checks to the servers. Layer 3 and Layer 7 health checks can also be sent. If a server fails one of these health checks, it is disqualified from being the “best” IP address.
In contrast to the centralized health check, an embodiment of the present invention performs distributed health checks, where the health-checking tasks are distributed to the peer metric agents at the site switches, instead of being performed by the GSLB switch. The health checking may thus be performed independently of a request from the GSLB switch, in contrast to the centralized health check implementation where the health check information is conveyed in response to a request from the GSLB switch. The distributed health checking allows for reduction in GSLB processing load, reduction in health-check traffic, and increased scalability due to the distribution. Each metric agent generates a health status report, and provides this report to the GSLB switch (such as via part of a protocol message in one embodiment). On receiving the health status report, the GSLB switch processes the health check information therein, updates its records accordingly, and uses the health information to evaluate or modify the DNS response. The health check information may be indicative of access conditions to host servers (including host servers associated with a particular site switch, or host servers that are not associated with a particular site switch, if that site switch operates as a type of information collector, for instance), and/or the health check information may be indicative of access conditions to an application hosted on a host server or to some other component for which a particular site switch collects health check information.
An embodiment of the invention also allows integration of distributed health check components in systems that also include non-distributed health check components (e.g., centralized health check components). For example, a system described herein includes a GSLB switch and at least one remote metric agent that both support distributed health checks. Embodiments of the distributed health check can also provide compatibility between a remote metric agent that supports distributed health checks and a GSLB switch that does not, or compatibility between a GSLB switch that supports distributed health checks and a remote agent that does not. In situations where both a GSLB switch and a remote agent do not support distributed health checks, a centralized health check (such as disclosed in the co-pending applications identified above) can be implemented. This compatibility allows interoperability, installation, and transition of the distributed health check components into current systems that are based on centralized health checks.
In the remainder of this detailed description, for the purpose of illustrating embodiments of the present invention only, the list of IP addresses returned are assumed to be the virtual IP addresses configured on the proxy servers at switches 18A, 18B, 22A and 22B (sites 20 and 24). In one embodiment, GSLB switch 12 determines which site switch would provide the best expected performance (e.g., response time) for client program 28 and returns the IP address list with a virtual IP address configured at that site switch placed at the top. (Within the scope of the present invention, other forms of ranking or weighting the IP addresses in the list are also possible.) Client program 28 can receive the ordered list of IP addresses, and typically selects the first IP address on the list to access the corresponding host server.
At the site switch 18A, the remote metric agent 407 is communicatively coupled to a health check module 402. The health check module 402, in a distributed health check embodiment, is responsible for querying host servers and relevant applications hosted on the host servers being load balanced by the site switch 18A to determine the “health” of each host server and each relevant application. In one embodiment, the health information includes a list of VIPs configured at the remote site 18A (e.g., at that SLB SI) and whether the ports associated with these VIPs are up or down. Once this health information is obtained by the health check module 402 (which may be implemented as a software module), the health information is communicated to the remote metric agent 407, which then sends the health information to the metric collector 406 via a protocol message and in a manner that will be described later below.
In a centralized health check embodiment, such as described in the co-pending applications identified above, the health check module 402 is located at the GSLB switch 12, rather than at the site switch 18A. In this implementation, the health check module 402 communicates directly with the GSLB switch controller 401, rather than via protocol messages. Similarly, the local metric agent 404 can communicate health check information to the GSLB switch controller 401 directly, without using the protocol communication.
Routing metric collector 405 collects routing information from routers (e.g., topological distances between nodes on the Internet).
In one embodiment, the metrics used in a GSLB switch 12 include (a) the health of each host server and selected applications, (b) each site switch's session capacity threshold, (c) the round trip time (RTT) between a site switch and a client in a previous access, (d) the geographical location of a host server, (e) the connection-load measure of new connections-per-second at a site switch, (f) the current available session capacity in each site switch, (g) the “flashback” speed between each site switch and the GSLB switch (i.e., how quickly each site switch responds to a health check from the GSLB switch), for implementations that perform centralized health checks rather than distributed health checks, and (h) a policy called the “Least Response Selection” (LRS) which prefers the site switch that has been selected less often than others.
Many of these performance metrics can be provided default values. The order in which these performance metrics can be used to evaluate the IP addresses in the DNS reply can be modified as required. Each metric can be selectively disabled or enabled, such as in systems that include components that support or do not support distributed health checks. Further details of these metrics and how they are used in an example algorithm to re-order an address list to identify the “best” IP address are disclosed in the co-pending applications identified above. For purposes of the present application, such specific details regarding the metrics and their use in the algorithm are omitted herein, so as to instead focus on the techniques to acquire and communicate distributed health check information.
At a block 210, periodic or asynchronous updates related to health check information may be performed. The updates at the block 210 will be described later below, and updates may be performed and/or communicated at any suitable location in the flow chart 200. At a block 202, health check information is collected at a remote site switch (e.g., the site switch 18A) that supports or is otherwise configured for distributed health checking. In one embodiment, this involves having the remote metric agent 407 cooperate with the health check module 402 to check the status (e.g., up or down) of the virtual ports of the VIPs at the site switch 18A. This could entail determining if at least one of the real ports associated with the virtual port of a VIP is healthy. For example, the health check module 402 can "ping" the real ports associated with a virtual port of a VIP to determine if they respond. If it finds at least one such responsive real port, it concludes that the virtual port of the VIP is healthy.
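The up/down rule for a virtual port can be sketched as follows. The `probe` callable stands in for the actual Layer 4 check, and the sample addresses and port numbers are assumptions of this sketch.

```python
# A virtual port of a VIP is considered healthy if at least one of the
# real ports mapped under it responds to a probe.
def virtual_port_healthy(real_ports, probe):
    return any(probe(port) for port in real_ports)

# Illustrative mapping: one real port is down, one responds.
responsive = {("10.0.0.1", 80): False, ("10.0.0.2", 80): True}
healthy = virtual_port_healthy(list(responsive), lambda p: responsive[p])
```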
It is noted that in one embodiment of the centralized health check system, the health check module 402 is located at the GSLB switch 12, and sends health check queries to the remote metric agent 407. The remote metric agent 407 treats this health check query like a normal request, and load balances the request among the real servers behind the site switch 18A. The health check information is returned to the GSLB switch 12 by the remote metric agent 407, and the health check information indicates the health status of the VIP port(s) of the site switch 18A. In the distributed health check system, by contrast, the remote metric agent 407 and the health check module 402 cooperate at the block 202 to obtain the health status of the real ports mapped under the VIP ports.
It is also noted that in the centralized health check system, each health check query from the GSLB switch 12 to the site switch 18A is an individual TCP connection, in one embodiment. Thus, a separate TCP connection needs to be established to check the health status of each and every port. Furthermore, the TCP connection needs to be established and torn down each time the health check information needs to be updated at the GSLB switch 12. In one embodiment of the centralized health check, the frequency of updating the health check information may be once every 5 seconds. These multiple TCP connections consume bandwidth and require more processing. Therefore, as will be explained later in the flow chart 200, an embodiment of the distributed health check can provide the complete health status for ports (real or VIP) and hosted applications via inclusion into a protocol message carried by a single TCP connection that is established initially when the metric collector 406 initiates communication with the remote metric agent 407. This connection is maintained in an exchange of keep-alive messages between the metric collector 406 and the remote metric agent 407. This provides a savings in speed, time, and bandwidth utilization.
At a block 204, the remote metric agent 407 generates an address list (identifying the addresses configured on the site switch 18A) and the health status of the ports corresponding to these addresses. In an embodiment, the address list and port status can correspond to the VIP addresses and VIP ports. Whether a port is up or down can be respectively indicated by a binary 1 or 0, or vice versa. It is appreciated that other types of health information, in addition to the address list and port status, can be generated at the block 204, including health status of hosted applications (e.g., whether an application hosted on a real server is available or unavailable).
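The report generated at the block 204 might be assembled as sketched below. The function name, the dictionary shape, and the sample VIPs and ports are all assumptions of this sketch, not an actual message format.

```python
# Each configured VIP port is marked 1 (up) or 0 (down), per the binary
# encoding described above.
def build_health_report(vip_ports, port_is_up):
    return {
        vip: {port: (1 if port_is_up(vip, port) else 0) for port in ports}
        for vip, ports in vip_ports.items()
    }

# Illustrative configuration: two VIPs, with port 443 down everywhere.
vip_ports = {"192.0.2.100": [80, 443], "192.0.2.101": [8080]}
report = build_health_report(vip_ports, lambda vip, port: port != 443)
```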
At a block 206, the health information is communicated by the remote metric agent 407 to the metric collector 406 of the GSLB switch 12. In one embodiment, the health check information (e.g., address list and port status) is communicated to the GSLB switch 12 as a message forming part of a protocol communication.
The Foundry GSLB Protocol is used for communication between the metric collector 406 residing on the GSLB switch 12 and the remote metric agent 407 at the site switch 18A. A communication using this protocol can be established with a single TCP connection that remains persistent/active, without the need to re-establish a new TCP connection each time a message is to be conveyed, in one embodiment. The protocol communication includes a plurality of message types, which are listed below as non-exhaustive examples:
1. OPEN
2. ADDRESS LIST
3. REQUEST
4. RESPONSE
5. REPORT
6. SET PARAMETERS
7. NOTIFICATION
8. KEEP ALIVE
9. CLOSE
10. RTT TRAFFIC
11. OPAQUE
12. ADDRESS LIST DISTRIBUTED (DIST)
13. SET PARAMETERS DIST
14. OPEN DIST
The last three message types (12, 13, and 14) are usable with distributed health checking, while the other message types may be used either with centralized health checking or distributed health checking.
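The message-type catalog above might be encoded as shown below. The numeric values and identifier names are illustrative assumptions for this sketch only, not the actual wire format of the protocol.

```python
from enum import Enum

class GslbMessageType(Enum):
    OPEN = 1
    ADDRESS_LIST = 2
    REQUEST = 3
    RESPONSE = 4
    REPORT = 5
    SET_PARAMETERS = 6
    NOTIFICATION = 7
    KEEP_ALIVE = 8
    CLOSE = 9
    RTT_TRAFFIC = 10
    OPAQUE = 11
    ADDRESS_LIST_DIST = 12
    SET_PARAMETERS_DIST = 13
    OPEN_DIST = 14

# The last three types are usable with distributed health checking; the
# rest may be used with either centralized or distributed health checking.
DIST_ONLY = {
    GslbMessageType.ADDRESS_LIST_DIST,
    GslbMessageType.SET_PARAMETERS_DIST,
    GslbMessageType.OPEN_DIST,
}
```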
The TCP connection is established by the metric collector 406 under instruction of the switch controller 401. The metric collector 406 attempts to open a persistent communication with all specified remote metric agents 407. Where remote metric agents 407 support distributed health checks, the metric collector 406 uses the “OPEN DIST” message type to initiate and establish a TCP connection that would be used for communication of health check and other relevant information between these two entities.
When conveying the health check information, the message under the protocol (sent from the remote metric agent 407 to the metric collector 406) is under the message type “ADDRESS LIST DIST.” The ADDRESS LIST DIST message includes a list of the addresses and the health status of the corresponding ports. If ports or addresses are removed or added at the site switch 18A, such updated data is also sent along with the ADDRESS LIST DIST message.
The “SET PARAMETERS” and “SET PARAMETERS DIST” message types are sent by the metric collector 406 to the remote metric agent 407. These message types are used to change protocol parameters at the remote metric agent 407. In the distributed health check model, if the metric collector 406 supports distributed health checks but the remote metric agent 407 does not (e.g., is configured for centralized health check), then the metric collector 406 sends the message with SET PARAMETERS message type to the remote metric agent 407 to ensure that the subsequent message format(s) conforms to that used for centralized health checking. The SET PARAMETERS DIST message type is used when both the metric collector 406 and the remote metric agent 407 support distributed health checking.
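The compatibility rule above reduces to a simple selection, sketched here with an assumed function name; the returned strings mirror the message-type names in the text.

```python
# SET PARAMETERS DIST is used only when both the metric collector and the
# remote metric agent support distributed health checks; otherwise the
# collector falls back to the centralized-format SET PARAMETERS.
def set_parameters_message(collector_dist, agent_dist):
    if collector_dist and agent_dist:
        return "SET PARAMETERS DIST"
    return "SET PARAMETERS"
```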
At a block 208, the GSLB switch 12 receives the health check information and processes it. More specifically, the metric collector 406 receives the health check information that is sent in a protocol message from the remote metric agent 407, and processes this information.
At the block 208, the GSLB switch 12 (in particular the metric collector 406) may also update databases or other stored records/data to reflect the information indicated in the health check information. For example, if new ports or addresses or hosted applications have been added (or removed) at the remote site switch 18A, the stored records at the GSLB switch 12 can be updated to add entries relevant to the newly added (or removed) ports, addresses, and applications, such as their specific numerical address and their health status. Alternatively or in addition, the stored data can be updated to indicate the current health status of any existing address, port, or application.
The metric collector 406 makes this processed health check information and the database(s) mentioned above available to the switch controller 401. The switch controller 401 then uses this health check information as one of the metrics in the GSLB algorithm to determine which address to place at the top of the address list. The flashback metric is disabled for implementations that support distributed health checking, since the flashback metric is used to measure the time it takes for health check information to be returned to the GSLB switch 12. The re-ordered list is subsequently provided to the requesting client program 28.
At a block 210, updated health check information is sent from the remote metric agent 407 to the GSLB switch 12. In one embodiment, these updates may be periodic and/or asynchronous updates. Periodic updates are sent at the block 210 from the remote metric agent 407 to the metric collector 406 to communicate the latest health information. In addition, asynchronous updates are also sent at the block 210 whenever there is a change in VIP or port configuration at the site switch 18A. In one embodiment, the interval between periodic health check messages is user-configurable, and can range from 2 to 120 seconds, for example. A default interval can be 5 seconds, for example.
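The interval handling can be sketched as follows; the clamping behavior and the function name are assumptions of this sketch, with only the example bounds (2 to 120 seconds) and the example default (5 seconds) taken from the text.

```python
# Resolve the user-configurable reporting interval: an unset value falls
# back to the example default, and out-of-range values are clamped to
# the example 2-120 second range.
def reporting_interval(configured=None, lo=2, hi=120, default=5):
    if configured is None:
        return default
    return max(lo, min(hi, configured))
```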
In an embodiment, each remote metric agent 407 is responsible for periodically generating and sending health check information for all the VIPs configured at its respective site switch. The health check reporting interval can be configured globally on the switch controller 401 or locally on an individual remote metric agent 407. Command line interface (CLI) software commands may be used by one embodiment to specify the interval, at the GSLB switch 12 or at the remote site switches. If the reporting interval is configured on the switch controller 401, the interval is communicated to the distributed health check remote metric agents 407 via the SET PARAMETERS DIST message.
The various components of the flow chart 200 repeat or are otherwise performed continuously, as the remote site switch(es) continue to obtain and send health check information to the GSLB switch 12. The GSLB switch 12 responsively continues to examine and process the health check information so as to appropriately re-order the address list for the DNS reply.
The above-described embodiments relate to use of a remote metric agent 407 and the GSLB switch 12 that both support distributed health checks. For situations where neither of these components supports distributed health checks, a centralized health check technique (such as described in the co-pending applications) can be used.
Another situation is where the GSLB switch 12 supports distributed health checks, but at least one of the remote agents 407 with which it communicates does not support it. For such situations, the GSLB switch 12 can have installed therein (or otherwise be capable of enabling) its own health check module 402. The non-distributed health check remote metric agents 407 are pre-identified for this GSLB switch 12, so that its health check module 402 can send health checks to these non-distributed health check remote metric agents 407 in a centralized manner. In the protocol communication scheme, a persistent TCP connection to these non-distributed health check remote metric agents 407 initiated by the metric collector 406 uses a message type “OPEN” instead of “OPEN DIST,” for example.
Note that the other remote metric agents 407 that support distributed health check will generate the health check information as described earlier and communicate it to the metric collector 406. The health check module 402 of the GSLB switch 12 does not send any health checks for these distributed health check remote metric agents 407.
In the protocol communication, a connection to these distributed health check remote metric agents 407, initiated by the metric collector 406, uses a message type “OPEN DIST” for these agents.
The flashback metric is disabled, in an embodiment, for this situation where some remote metric agents support distributed health checks while some may not. It is advisable in some instances to enable the flashback metric (via CLI or other technique) only if the user is absolutely certain that none of the remote metric agents 407 support distributed health checks.
Yet another situation is where the GSLB switch 12 does not support distributed health checks, but at least one of the remote metric agents 407 with which it communicates does support it. The remote metric agent 407 can first detect this limitation of the GSLB switch 12, for instance, if its metric collector 406 uses the message type “OPEN” when it first establishes a protocol communication with the remote metric agent 407. Alternatively or in addition, the non-distributed health check GSLB switch 12 can be pre-identified for the remote metric agent 407, or it may detect this limitation if it explicitly receives a query for health check information from the GSLB switch 12. After identification of the non-distributed health check GSLB switch 12, the remote metric agent 407 can send its address list information to the GSLB switch 12 with a message type “ADDRESS LIST” (instead of “ADDRESS LIST DIST”) or other format compatible with a centralized health check implementation. Note that unlike the ADDRESS LIST DIST message sent by the distributed health check remote agent 407 to a distributed health check metric collector 406, the ADDRESS LIST message sent to a non-distributed health check metric collector 406 does not contain any health check information. In one embodiment of centralized health check, the ADDRESS LIST message merely serves the purpose of communicating the addresses configured on site switch 18A to the metric collector 406.
In one embodiment of an optimization algorithm utilized by GSLB switch 12 and executed by the switch controller 401 to process the IP address list received from DNS server 16, the health check metric is used as the first criteria to determine which IP address is "best" and to preliminarily place that IP address at the top of the list of IP addresses. Thereafter, other metrics may be used to perform additional re-ordering of the IP address list, such as a connection-load metric, RTT, flashback (for systems that include non-distributed health check components), and so forth. In one embodiment, the health check information, whether obtained by the distributed or the centralized techniques, is considered in the same priority in the algorithm—only the process by which this health check information is obtained and communicated is different.
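A single ordering pass of the kind described above can be sketched as follows. The function name, the metric representation, and the tie-break on connection load are all assumptions of this sketch, not the actual algorithm of the co-pending applications.

```python
# Healthy addresses sort before unhealthy ones (the health check metric is
# applied first); among healthy addresses, a lower connection load ranks
# earlier as an illustrative secondary metric.
def order_addresses(addrs, healthy, conn_load):
    return sorted(
        addrs,
        key=lambda a: (0 if healthy.get(a, False) else 1, conn_load.get(a, 0)),
    )

addrs = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
best_first = order_addresses(
    addrs,
    healthy={"192.0.2.1": True, "192.0.2.2": False, "192.0.2.3": True},
    conn_load={"192.0.2.1": 40, "192.0.2.3": 10},
)
```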
In systems that include both distributed and non-distributed health check components, the flashback metric can be selectively enabled or disabled. When used in connection with all non-distributed health check components, the flashback metric is enabled and placed in the algorithm just prior to the least response metric, in an embodiment, when considering a list of IP addresses corresponding to the servers and applications associated with a remote metric agent 407 that does not support distributed health check.
All of the above U.S. patents, U.S. patent application publications, U.S. patent applications, foreign patents, foreign patent applications and non-patent publications referred to in this specification and/or listed in the Application Data Sheet, are incorporated herein by reference, in their entirety.
The above description of illustrated embodiments of the invention, including what is described in the Abstract, is not intended to be exhaustive or to limit the invention to the precise forms disclosed. While specific embodiments of, and examples for, the invention are described herein for illustrative purposes, various equivalent modifications are possible within the scope of the invention and can be made without deviating from the spirit and scope of the invention.
These modifications can be made to the invention in light of the above detailed description. The terms used in the following claims should not be construed to limit the invention to the specific embodiments disclosed in the specification and the claims. Rather, the scope of the invention is to be determined entirely by the following claims, which are to be construed in accordance with established doctrines of claim interpretation.
The present application is a continuation-in-part of U.S. application Ser. No. 09/670,487, entitled “GLOBAL SERVER LOAD BALANCING,” filed Sep. 26, 2000, assigned to the same assignee as the present application, and which is incorporated herein by reference in its entirety.
| | Number | Date | Country |
|---|---|---|---|
| Parent | 09670487 | Sep 2000 | US |
| Child | 10305823 | | US |