The ubiquity of computers in business, government, and private homes has resulted in the availability of massive amounts of information from network-connected sources, such as data stores accessible through communication networks. In recent years, computer communication and search tools have become widely available to facilitate the location and availability of information to users. Most computer communication and search tools implement a client-server architecture, where a user client computer communicates with a service provider via a remote server computer over a communication network.
One approach to increasing service provider communication bandwidth is to employ multiple network server computers offering the same services. These server computers may be arranged in server farms, in which a single server from the server farm receives and processes a particular request from a client computer. Typically, server farms implement some type of load balancing algorithm to distribute requests from client computers among the multiple servers. In a typical client-server computing environment, client devices issue requests to server devices for some kind of service and/or processing, and the server devices process those requests and return suitable results to the client devices. In an environment where multiple clients send requests to multiple servers, workload distribution among the servers significantly affects the quality of service that the client devices receive from the servers.
Central control of load balancing typically requires a dedicated hardware controller, such as a master server, to keep track of all servers and their respective loads at all times. Alternatively, the central communication processing component may be a communication processing device that uses a simple algorithm, such as a round-robin load distribution algorithm, to distribute client requests over several servers. The communication load resulting from client requests affects not only the servers that serve the client requests, but also the hardware communication processing components which have to route the client requests. Because the communication loads affect the communication processing components, an efficient and effective load balancing solution must take into account the load imposed on hardware load balancing components, as well as the servers which service client requests.
One approach is to use a server locator service (SLS) for handling client requests. In this approach, the client request is directed to a well-known name or internet protocol (IP) address for service. The communication processing component queries SLS services running on multiple servers to locate a server to service the client request. The communication processing component is actually distributing the communication load over the SLS services and not the services requested by the client computing device. Once a host is located, the host name is returned to the client computing device for further client service requests. In this approach, the first client request is a discovery request which, as noted above, is directed to the SLS services running on the servers. Discovery requests do not include data requests and are only used to locate servers. Such discovery requests are out-of-band communications, meaning that discovery requests do not pass through the same logical communication channels as data requests. Out-of-band communication incurs certain overhead costs, such as additional communication related to the discovery packets which do not contribute to transmission of data.
The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same become better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:
The following detailed description describes illustrative embodiments of the invention. Although specific operating environments, system configurations, user interfaces, and flow diagrams may be illustrated and/or described, it should be understood that the examples provided are not exhaustive and do not limit the invention to the precise forms and embodiments disclosed. Persons skilled in the field of computer programming will recognize that the components and process elements described herein may be interchangeable with other components or elements or combinations of components or elements and still achieve the benefits and advantages of the invention. Although the present description may refer to the Internet, persons skilled in the art will recognize that other network environments that include local area networks (LAN), wide area networks (WAN), and/or wired or wireless networks, may also be suitable.
Prior to discussing the details of the invention, it will be appreciated by those skilled in the art that the following description is presented largely in terms of logic operations that may be performed by conventional computer components. These computer components, which may be grouped in a single location or distributed over a wide area, generally include computer processors, memory storage devices, display devices, input devices, etc. In circumstances where the computer components are distributed, the computer components are accessible to each other via communication links.
In the following descriptions, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be apparent to one skilled in the art that the invention may be practiced without some or all of these specific details. In other instances, well-known process elements have not been described in detail in order not to unnecessarily obscure the invention.
Generally described, the invention relates to client request load balancing in a client-server computing environment. Specifically, the invention relates to the balancing of server load using a hardware communication processing component and server assignments for subsequent service requests. In accordance with an illustrative embodiment of the invention, a client device initially transmits a first data request that is handled by a communication processing component. The communication processing component routes the client's request to a server based on some load balancing algorithm. In an illustrative embodiment, the load balancing algorithm can correspond to a round-robin method for sequentially selecting servers, a random (or pseudo-random) selection method, a load-based selection method, and the like. In response to the client request, the server returns a service response including the data requested by the client or an acknowledgement, if any, a service instance ID (which may be associated with a server device), and a connection lease. Once the client device has the information returned from the server, namely, the service instance ID and the connection lease, the client device sends subsequent service requests to the server directly, bypassing the communication processing component for the duration of the connection lease.
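By way of non-limiting illustration, the round-robin and random selection methods mentioned above might be sketched as follows. The function names and the server labels are hypothetical and are used only for this example; they do not appear in the disclosure.

```python
import itertools
import random


def round_robin_selector(servers):
    """Return a callable that hands out servers in a fixed rotation."""
    rotation = itertools.cycle(servers)
    return lambda: next(rotation)


def random_selector(servers, rng=None):
    """Return a callable that picks a server uniformly at random."""
    if rng is None:
        rng = random.Random()
    return lambda: rng.choice(servers)


# Example: three consecutive first requests routed over two servers.
pick = round_robin_selector(["server-a", "server-b"])
first_three = [pick(), pick(), pick()]
```

Either selector is consulted only for a first client service request; subsequent requests sent under a connection lease would not pass through it.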
In an illustrative embodiment, the client service requests may include requests for data, such as Web pages, to be returned to the client 102 by the server 112. The client service request may also indicate a request to perform some process or task at the server 112, such as a registration or a data update at the server 112, without returning any data. In all cases, however, the server 112 processes the service request from the client 102. Client devices may include, but are not limited to, a personal computer, a personal digital assistant (PDA), a mobile phone device, etc. As noted above, the network 108 may include the Internet, a corporate LAN, or a WAN.
In one illustrative embodiment, the client device 102 may include a communication component 106 that processes the information returned from the server 112 encapsulated in a service response, including the service instance ID and the connection lease. In another illustrative embodiment, the client component that handles the service response from the server 112 may be separate from the communication component 106 which handles only data transmission to and from the network 108. In another illustrative embodiment, the client communication component may be integrated with another software component running on the client device 102. For example, the client device 102 may include a plug-in component integrated with the Web browser running on the client device 102 for handling data related to the service routing process, such as a service instance ID and the connection lease.
Although the above descriptions and the detailed descriptions that follow may refer to a single client and two servers, it will be appreciated by those skilled in the art that the present invention is not limited to a single client or two servers, but is equally applicable to any number of client and server machines/components. Additionally, even though the following descriptions may refer to the Web and Web-based protocols, those skilled in the art will appreciate that the techniques and systems described are equally applicable to other types of computing environments, such as LANs and WANs.
Once the server 112 is selected by the communication processing component 110, the first client service request is forwarded to the server 112. The service 114 running on the server 112 processes the first client service request and returns a first service response to the client device 102. In one illustrative embodiment, the first service response is returned to the client device 102 via the communication processing component 110. In another illustrative embodiment, the first service response is returned directly to the client device 102. The service response may include any data requested by the client service request, a service instance ID identifying the server 112 and/or service 114 servicing the client service request, and a connection lease indicating a duration of a direct communication channel between the server 112 and the client device 102. In one illustrative embodiment, the service instance ID is assigned by a distributed messaging system. The distributed messaging system may include messaging components running as background services on server devices 112. The messaging components communicate using a distributed protocol to assign service instance IDs to services 114 and to route incoming requests to an identified service 114 running on an appropriate device currently hosting the identified service 114. As instances of services 114 are added or deleted from the servers 112, the service instance IDs are updated accordingly.
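As a simplified illustration of how service instance IDs might be assigned and kept current as service instances are added or deleted, the following sketch uses a single in-process table. This is an assumption made for brevity: it stands in for the distributed messaging system described above and does not implement any distributed protocol. The class and method names are hypothetical.

```python
import uuid


class InstanceRegistry:
    """Maps service instance IDs to the host currently running the service."""

    def __init__(self):
        self._hosts = {}

    def register(self, host):
        # Assign a fresh service instance ID when a service instance is added.
        instance_id = uuid.uuid4().hex
        self._hosts[instance_id] = host
        return instance_id

    def unregister(self, instance_id):
        # Drop the ID when the service instance is deleted.
        self._hosts.pop(instance_id, None)

    def resolve(self, instance_id):
        # Return the hosting device, or None if the ID is no longer valid.
        return self._hosts.get(instance_id)
```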
The connection lease may be based on a number of parameters, such as time, a number of client service requests, an amount of data transmitted, or any combination of these parameters. For example, the connection lease may be valid for a fixed time duration, such as 100 milliseconds. Alternatively, the connection lease may be based on the number of client service requests. For example, the connection lease may be valid for a fixed number of client service requests, such as 1000 service requests. Similarly, the connection lease may be valid for a fixed amount of data, such as 10 MB of transmitted data. In an illustrative embodiment, the lease information may be directly transmitted from the server. Alternatively, the lease information may be indirectly referenced to information already stored on the client machine. For example, the server may reference a table of lease terms that is pre-stored on a client device 102. Those skilled in the art will appreciate that the connection lease may be based on many other parameters or combinations of parameters without departing from the spirit of the present disclosure.
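One possible way to represent a connection lease bounded by time, request count, data volume, or a combination of these is sketched below. The class name, field names, and the choice to track usage on the lease object itself are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConnectionLease:
    # Any limit left as None is treated as "not part of this lease".
    max_seconds: Optional[float] = None
    max_requests: Optional[int] = None
    max_bytes: Optional[int] = None
    elapsed_seconds: float = 0.0
    requests_used: int = 0
    bytes_used: int = 0

    def record(self, seconds, nbytes):
        """Account for one completed request against the lease."""
        self.elapsed_seconds += seconds
        self.requests_used += 1
        self.bytes_used += nbytes

    def is_valid(self):
        """The lease holds only while every configured limit is unmet."""
        if self.max_seconds is not None and self.elapsed_seconds >= self.max_seconds:
            return False
        if self.max_requests is not None and self.requests_used >= self.max_requests:
            return False
        if self.max_bytes is not None and self.bytes_used >= self.max_bytes:
            return False
        return True
```

For instance, `ConnectionLease(max_seconds=0.1)` would correspond to the 100 millisecond example above, and `ConnectionLease(max_requests=1000, max_bytes=10_000_000)` to a combined limit.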
Depending on the type of service requested by the client device 102, the service response may or may not include any data. For example, if the client service request is transmitted to the server 112 in order to initiate some action or task, then the service response will not include any data. Once the client device 102 receives the service response back from the server 112, the client device 102 can directly communicate with the server 112 using the service instance ID for the duration of the connection lease. Therefore, subsequent client service requests and service responses are communicated directly between the client device 102 and the server device 112, bypassing the communication processing component 110 for the duration of the connection lease.
The connection lease may be terminated by the client device 102 or the server device 112 for various reasons. For example, if the server device determines that it is overloaded with client service requests, then the server 112 may terminate the connection lease. In such case, the process of discovery starts over again. That is, the client device 102 will send another first client service request to the communication processing component 110 to be forwarded to another server 112. Another reason for breaking the connection lease by the server device 112 is a failure in some aspect of the service 114. For example, if the service 114 is the interface to a database from which the client device 102 has requested some data, and the database is inaccessible, then the server 112 may break the connection lease. In one illustrative embodiment, the breaking of a connection lease may be indicated to the client device 102 with a special error code embedded in the service response sent by the server 112 to the client device 102. In another illustrative embodiment, the breaking of the connection lease may be determined by the client device based on a time-out parameter. The server device 112 may also renegotiate or adjust the connection lease terms for substantially the same reasons discussed above with respect to lease termination. For example, if the server device 112 determines that the server load is increasing at a threshold rate, the server device may adjust the connection lease of one or more of the client devices 102 currently being serviced by the server device 112 in the next service response to shorten the term of the lease and reduce server load.
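The server-side choice between keeping, shortening, and terminating a lease might be sketched as follows. The thresholds, the halving factor, and the normalized load scale from 0 to 1 are hypothetical values chosen for illustration; the disclosure does not specify them.

```python
def lease_decision(current_lease_s, load, load_growth, backend_ok,
                   max_load=0.9, growth_threshold=0.1, shrink_factor=0.5):
    """Return (action, lease_seconds) for the next service response.

    load and load_growth are assumed to be normalized to the range 0..1.
    """
    # Overload or a failed dependency (e.g. an inaccessible database)
    # breaks the lease outright.
    if not backend_ok or load >= max_load:
        return ("terminate", 0.0)
    # Load rising at or above the threshold rate shortens the lease term.
    if load_growth >= growth_threshold:
        return ("shorten", current_lease_s * shrink_factor)
    return ("keep", current_lease_s)
```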
After termination of the connection lease, the client device 102 may wait a random amount of time before attempting to obtain a new connection lease from the communication processing component 110. This is to avoid overloading the communication processing component 110 in case of simultaneous multiple server device failures. Those skilled in the art will appreciate that there are many common methods to handle connection failures, such as retrying a predetermined number of times, waiting a random amount of time, obtaining a status of the server from an external source, performing exponential back-off, wherein the wait time is increased exponentially as a function of the number of previous retries and failures, and the like, before trying to connect again. Additionally, the client device 102 may attempt to continue to communicate with an identified server 112 and/or service 114 after termination if no alternate server/service can be identified.
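A minimal sketch of combining a random wait with exponential back-off is given below. The base delay, the cap, and the function name are assumptions for illustration; drawing the wait uniformly between zero and the exponential ceiling is one common variant and is not prescribed by the disclosure.

```python
import random


def backoff_delay(attempt, base=0.1, cap=30.0, rng=None):
    """Return a randomized wait in seconds before retry number `attempt`.

    The ceiling doubles with each failed attempt (attempt 0, 1, 2, ...)
    up to `cap`, and the actual wait is drawn uniformly below it so that
    many clients do not retry at the same instant.
    """
    if rng is None:
        rng = random.Random()
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)
```

A client might call `backoff_delay(n)` after its n-th failed attempt and sleep for the returned duration before sending a new first client service request.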
The client device 102 may also terminate the connection lease for various reasons. For example, the communication component 106 may send multiple client service requests for discovery purposes to the communication processing component 110 and receive multiple service responses in response to the multiple client service requests. Next, the communication component 106 may compare the information included in the service responses returned from the different servers 112 to determine which server 112 could offer the best and/or most efficient service. In one illustrative embodiment, the service response may include additional information such as server load statistics, different types of services 114 offered by the server 112, and other information usable for selection of a server device 112. In this illustrative embodiment, the communication components 106 contribute significantly to overall load balancing by selecting the server 112. Such selection of server device 112 by the communication component 106 is above and beyond the server selection performed by the communication processing component 110. Therefore, in this embodiment, a layered approach is taken to load balancing with a first layer implemented by the use of the communication processing component 110 and the second layer on top of the first layer comprises further load balancing by the use of the communication components 106 based on the information provided by the service responses from servers 112.
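The second, client-side layer of selection might be sketched as below, assuming each service response carries a reported load figure. The dictionary keys and the use of load as the sole criterion are hypothetical simplifications; other information in the service response could equally be used.

```python
def choose_server(responses):
    """Pick the service instance ID whose response reports the lowest load.

    `responses` is a list of dicts, each with the assumed keys
    'service_instance_id' and 'load'.
    """
    if not responses:
        raise ValueError("at least one service response is required")
    best = min(responses, key=lambda r: r["load"])
    return best["service_instance_id"]
```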
At block 320, the communication component 106 of the client device 102 receives the service instance ID and the connection lease. In one illustrative embodiment, the client service request is an HTTP request and the service 114 is a Web server. In this embodiment, the service instance ID and the connection lease may be passed as part of the URI exchange between the client device 102 and the server device 112. Those skilled in the art will appreciate that there are many other ways to communicate information by transmitting data packets between a client and a server device. For example, applications, such as FTP, implemented using IP have their own specific protocols which may include various fields for the communication of different types of data, such as the service instance ID and the connection leases. At decision block 330, it is determined whether the connection lease has expired. If the connection lease has expired, the method proceeds back to block 310 where the client device 102 transmits a new client service request to the communication processing component 110 in order to select another server 112 to service the client service request. If the connection lease has not expired, the method proceeds to block 340 where subsequent client service requests are transmitted directly to the server 112 having the service instance ID transmitted initially to the client device 102.
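The client-side decision at block 330 might be sketched as follows, assuming a purely time-based lease. The class name, the two injected callables, and the tuple they return are hypothetical; lease renewal in subsequent responses is deliberately not modeled.

```python
import time


class LeasedClient:
    """Routes requests via the balancer or directly, depending on the lease."""

    def __init__(self, via_balancer, direct, clock=time.monotonic):
        # via_balancer(request) -> (service_instance_id, lease_seconds, data)
        # direct(service_instance_id, request) -> data
        self._via_balancer = via_balancer
        self._direct = direct
        self._clock = clock
        self._instance_id = None
        self._expires_at = 0.0

    def send(self, request):
        now = self._clock()
        if self._instance_id is None or now >= self._expires_at:
            # No lease, or lease expired: go back through the balancer.
            instance_id, lease_s, data = self._via_balancer(request)
            self._instance_id = instance_id
            self._expires_at = now + lease_s
            return data
        # Lease still valid: contact the identified server directly.
        return self._direct(self._instance_id, request)
```

Passing the clock in as a parameter keeps the sketch testable without real delays.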
At decision block 420, the need for information updates is assessed. Information updates may include a different connection lease, an error condition, new data to be sent back to the client device 102, etc. As briefly discussed above, the server 112 may break the connection lease for various reasons including overloading of the server and/or an error condition such as loss of access to a database. If updated information is needed, updates are obtained at block 430 to be included in the service response. The updated information may be obtained from the server device 112, from other server devices, the communication processing component 110, or other sources of information. If no information update is required, the server 112 transmits the service response including the service instance ID and the connection lease at block 440. The method continues in a loop and goes back to block 410 to obtain the next service request from the client.
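The assembly of a service response at blocks 420 through 440 might be sketched as below. The dictionary layout and the key names are assumptions for illustration; an update simply overrides or extends the default fields.

```python
def build_response(instance_id, lease_seconds, data=None, updates=None):
    """Assemble one service response for the client.

    `updates` is an optional dict of changed fields, such as a different
    lease or an error indication, gathered at block 430.
    """
    response = {
        "service_instance_id": instance_id,
        "lease_seconds": lease_seconds,
        "data": data,
    }
    if updates:
        # Blocks 420/430: fold any updated information into the response.
        response.update(updates)
    # Block 440: the assembled response is what gets transmitted.
    return response
```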
The communication component 106 determines which server 112 can best serve the client device 102 based on the information included in the service responses received from the various servers 112. Once a server device 112 is selected based on such comparison of service responses, the client device 102 transmits all subsequent client service requests to the selected server 112 for the duration of the connection lease. The communication component 106 may continue to send other client service requests through the communication processing component 110 to other server devices 112 in order to continually improve the quality of service received by the client device 102. If the communication component 106 finds a server 112 which can better serve the client device 102, then the client device 102 may break the connection lease with the server device 112 currently servicing the client device 102 and start direct communication with the new server device 112. The communication processing component 110 may use a number of selection methods in selecting the next server for receiving the client service request. For example, the selection method may include a random selection, a probabilistic selection, a weighted probabilistic server selection, and the like. In a weighted probabilistic server selection algorithm, for instance, a probability of selection is calculated for each server device 112 based on reported server loads/resources. The probability is inversely proportional to the server load. The server load may be indicated by different parameters, such as server processing load measured by the number of processes waiting to be executed on the server, the number of clients currently being served by the server, the average latency between a client service request and the service response, etc.
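A weighted probabilistic selection in which the probability is inversely proportional to the server load might be sketched as follows. The function names and the small `epsilon` term, added here only to avoid division by zero for an idle server, are assumptions for this example.

```python
import random


def selection_probabilities(loads, epsilon=1e-9):
    """Map each server to a probability inversely proportional to its load.

    `loads` maps a server name to a non-negative load figure.
    """
    weights = {server: 1.0 / (load + epsilon) for server, load in loads.items()}
    total = sum(weights.values())
    return {server: w / total for server, w in weights.items()}


def pick_server(loads, rng=None):
    """Draw one server according to the inverse-load probabilities."""
    if rng is None:
        rng = random.Random()
    probs = selection_probabilities(loads)
    servers = list(probs)
    return rng.choices(servers, weights=[probs[s] for s in servers], k=1)[0]
```

With loads of 1 and 3, the two servers would be selected with probabilities of about 0.75 and 0.25, respectively.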
While illustrative embodiments have been illustrated and described, it will be appreciated that various changes can be made therein without departing from the spirit and scope of the invention.
Embodiments of the disclosure are described in the following clauses:
Clause 1. A computer-implemented system for managing service requests, the system comprising:
a service provider interface component, comprising a server device, accessible through a computer network, wherein the service provider interface component provides a service response in response to a client service request from a client device, the service response including at least a service instance ID and a connection lease;
a communication processing component for routing client service requests to the service provider; and
a client communication component included in the client device operable to communicate with the communication processing component, wherein the client communication component sends a first client service request to the communication processing component and wherein the client communication component sends subsequent client service requests, according to the connection lease, directly to the server device based on the service instance ID, bypassing the communication processing component.
Clause 2. The system of Clause 1, wherein the service component is based on HTTP.
Clause 3. The system of Clause 1, wherein the client service request is provided using a Web browser.
Clause 4. The system of Clause 1, wherein the connection lease is predetermined.
Clause 5. The system of Clause 1, wherein the connection lease is determined based on a load of the server device.
Clause 6. The system of Clause 1, wherein the connection lease is determined based on a load of the server device, the load of the server device being determined by the client device.
Clause 7. The system of Clause 1, wherein the server device returns subsequent service responses in response to the subsequent client service requests.
Clause 8. The system of Clause 7, wherein the subsequent service responses include at least the server ID and the connection lease.
Clause 9. The system of Clause 7, wherein the subsequent service responses include a new connection lease.
Clause 10. The system of Clause 7, wherein the subsequent service responses include data representing a cancellation of the connection lease.
Clause 11. The system of Clause 1, wherein the client device selects one of a first and a second server device based on a first and a second service responses received in response to a first and a second client service request, respectively.
Clause 12. The system of Clause 1, wherein the connection lease is determined based on time duration.
Clause 13. The system of Clause 1, wherein the connection lease is determined based on a number of client service requests.
Clause 14. The system of Clause 1, wherein the connection lease is determined based on an amount of data returned in the service responses.
Clause 15. The system of Clause 1, wherein the connection lease is determined based on a combination of one or more of a time duration, a number of client service requests, and an amount of data returned in the service responses.
Clause 16. The system of Clause 5, wherein the load of the server device is determined based on a latency of the service response.
Clause 17. The system of Clause 5, wherein the load of the server device is determined based on a processing load of the server device.
Clause 18. The system of Clause 5, wherein the communication processing component is coupled with a plurality of server devices.
Clause 19. The system of Clause 18, wherein the communication processing component sends the first client service request to one of the plurality of server devices having the smallest load of the plurality of server devices.
Clause 20. The system of Clause 1, wherein the service instance ID corresponds to an identified server.
Clause 21. The system of Clause 1, wherein the service instance ID corresponds to an identified service.
Clause 22. The system of Clause 1, wherein the communication processing component is a hardware load balance device.
Clause 23. The system of Clause 1, wherein the communication processing component is a software load balance component.
Clause 24. A computer-implemented system for managing service requests, the system comprising:
a service provider having:
a server device for processing client service requests;
a communication processing component coupled with the server device, wherein the communication processing component routes at least one client service request from a client device to the server device and returns at least one service response from the server device to the client device, the service response including at least a service instance ID, and a connection lease; and
a client communication component included in the client device, wherein the client communication component transmits client service requests to the server device based on the service instance ID during the connection lease.
Clause 25. The system of Clause 24, wherein the communication processing component is coupled with a plurality of server devices.
Clause 26. The system of Clause 25, wherein the server devices are Web servers.
Clause 27. The system of Clause 25, wherein the communication processing component routes the at least one client service request to the server device based on a load of the server device.
Clause 28. The system of Clause 27, wherein the communication processing component routes the at least one client service request to one of the plurality of server devices having the smallest load of the plurality of server devices.
Clause 29. The system of Clause 27, wherein the load of the server device is determined based on a latency of the service response.
Clause 30. The system of Clause 24, wherein the connection lease is predetermined.
Clause 31. The system of Clause 24, wherein the connection lease is determined based on the load of the server device.
Clause 32. The system of Clause 24, wherein the connection lease is determined based on time duration.
Clause 33. The system of Clause 24, wherein the connection lease is determined based on a number of client service requests.
Clause 34. The system of Clause 24, wherein the connection lease is determined based on an amount of data returned in the service responses.
Clause 35. A computer-implemented method of managing service requests, the method comprising:
transmitting a first client service request to a communication processing component;
receiving a service response provided in response to the first client service request, the service response including at least a service instance ID and a connection lease, wherein the connection lease is a time interval during which client service requests are sent directly to a server device having the service instance ID; and
transmitting subsequent client service requests directly to the server device based on the service instance ID during the connection lease.
Clause 36. The method of Clause 35, wherein sending a first client service request comprises sending an HTTP request.
Clause 37. The method of Clause 35, wherein the communication processing component is coupled to a plurality of server devices.
Clause 38. The method of Clause 37, wherein the first client service request is used for discovery of one of the plurality of server devices.
Clause 39. The method of Clause 37, wherein each one of the plurality of server devices is associated with a load.
Clause 40. The method of Clause 39, wherein the first client service request is routed by the communication processing component to one of the plurality of the server devices based on the load associated with each one of the plurality of server devices.
Clause 41. The method of Clause 40, wherein the first client service request is routed to the one of the plurality of the server devices that is associated with the smallest load.
Clause 42. The method of Clause 39, wherein the connection lease is determined based on the load of the server device.
Clause 43. The method of Clause 35, wherein the client device selects one of a first and a second server device based on a first and a second service responses received in response to a first and a second client service request, respectively.
Clause 44. The method of Clause 35, wherein the connection lease is determined based on time duration.
Clause 45. A computer-implemented method of managing service requests, the method comprising:
in response to receiving a first client service request from a client device via a communication processing component, determining a connection lease and a service instance ID;
returning a first service response including the connection lease and the service instance ID to the client device via the communication processing component; and
in response to receiving subsequent client service requests directly from the client device, returning subsequent service responses including the connection lease and the server ID directly to the client device.
Clause 46. The method of Clause 45, wherein the connection lease is determined based on a load of a server device associated with the service instance ID.
Clause 47. The method of Clause 46, wherein the load of the server device is determined based on a latency of the service response.
Clause 48. The method of Clause 47, wherein the load of the server device is determined based on a processing load of the server.
Clause 49. The method of Clause 45, wherein the communication processing component is coupled with a plurality of server devices.
Clause 50. The method of Clause 45, wherein the server device returns subsequent service responses in response to the subsequent client service requests.
Clause 51. The method of Clause 50, wherein the subsequent service responses include at least the service instance ID and the connection lease.
Clause 52. The method of Clause 50, wherein the subsequent service responses include a new connection lease.
Clause 53. The method of Clause 50, wherein the subsequent service responses include data representing a cancellation of the connection lease.
Clause 54. A computer-implemented method of managing service requests, the method comprising:
transmitting a first client service request to a communication processing component;
transmitting a second client service request to the communication processing component;
receiving a first service response provided in response to the first client service request, the first service response including at least a first service instance ID and a first connection lease, wherein the first connection lease is a time interval during which client service requests are sent directly to a first server device having the first service instance ID;
receiving a second service response provided in response to the second client service request, the second service response including at least a second service instance ID and a connection lease, wherein the connection lease is a time interval during which client service requests are sent directly to a second server device having the second service instance ID;
selecting one of the first and the second server devices based on information included in the first and second service responses; and
transmitting subsequent client service requests directly to the selected server device during a connection lease included in the service response corresponding to the selected server device.
Clause 55. The method of Clause 54, wherein the selected server device returns subsequent service responses in response to the subsequent client service requests.
Clause 56. The method of Clause 54, wherein the subsequent service responses include at least the server ID and the connection lease.
Clause 57. The method of Clause 54, wherein the subsequent service responses include a new connection lease.
Clause 58. The method of Clause 54, wherein the subsequent service responses include data representing a cancellation of the connection lease.
This application is a continuation of U.S. patent application Ser. No. 15/178,410, filed Jun. 9, 2016, which is a continuation of U.S. patent application Ser. No. 13/942,498, filed Jul. 15, 2013, now U.S. Pat. No. 9,379,997, which is a continuation of U.S. patent application Ser. No. 13/472,199, filed May 15, 2012, now U.S. Pat. No. 8,495,170, which is a continuation of U.S. patent application Ser. No. 11/771,965, filed Jun. 29, 2007, now U.S. Pat. No. 8,260,940, all of which are hereby incorporated by reference herein in their entirety.
Number | Name | Date | Kind |
---|---|---|---|
6324582 | Sridhar et al. | Nov 2001 | B1 |
6449648 | Waldo et al. | Sep 2002 | B1 |
6779017 | Lamberton et al. | Aug 2004 | B1 |
6950874 | Chang et al. | Sep 2005 | B2 |
7003575 | Ikonen | Feb 2006 | B2 |
7203764 | Jorgenson | Apr 2007 | B2 |
7320131 | O'Toole, Jr. | Jan 2008 | B1 |
7441045 | Skene et al. | Oct 2008 | B2 |
7512707 | Manapragada et al. | Mar 2009 | B1 |
7552237 | Cernohous et al. | Jun 2009 | B2 |
7774782 | Popescu et al. | Aug 2010 | B1 |
8260940 | Vosshall et al. | Sep 2012 | B1 |
8495170 | Vosshall et al. | Jul 2013 | B1 |
9379997 | Vosshall et al. | Jun 2016 | B1 |
10616372 | Vosshall et al. | Apr 2020 | B2 |
20020040400 | Masters | Apr 2002 | A1 |
20030051042 | Tazoe | Mar 2003 | A1 |
20030074453 | Ikonen | Apr 2003 | A1 |
20040260745 | Gage | Dec 2004 | A1 |
20050021848 | Jorgenson | Jan 2005 | A1 |
20050050202 | Aiken, Jr. et al. | Mar 2005 | A1 |
20050078668 | Wittenberg et al. | Apr 2005 | A1 |
20060242300 | Yumoto et al. | Oct 2006 | A1 |
Entry |
---|
Colajanni et al., “Dynamic Load Balancing in Geographically Distributed Heterogeneous Web Servers,” in Proceedings of the International Conference on Distributed Computing Systems, May 26-29, 1998, pp. 295-302. |
Number | Date | Country |
---|---|---|
20200296185 A1 | Sep 2020 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 15178410 | Jun 2016 | US |
Child | 16840162 | | US |
Parent | 13942498 | Jul 2013 | US |
Child | 15178410 | | US |
Parent | 13472199 | May 2012 | US |
Child | 13942498 | | US |
Parent | 11771965 | Jun 2007 | US |
Child | 13472199 | | US |