System for balance distribution of requests across multiple servers using dynamic metrics

Abstract
A system for distributing incoming client requests across multiple servers in a networked client-server computer environment processes as a set all requests that occur within a given time interval and collects information on the attributes of the requests and the resource capability of the servers to dynamically allocate the requests in the set to the appropriate servers upon completion of the time interval. Preferably, a request table collects at least two requests incoming within a predetermined time interval, a request examiner routine analyzes each collected request with respect to at least one attribute, a system status monitor collects resource capability information of each server in a resource table, and an optimization and allocation process distributes collected requests in the request table across the multiple servers upon completion of said time interval based on an optimization of potential pairings of the requests in the request table with servers in the resource table.
Description
FIELD OF THE INVENTION

This invention relates to computers and digital processing systems requiring coordination of multiple digital processing units. In particular, this invention relates to load balancing or distribution of client requests across multiple servers in a networked computing environment.


BACKGROUND OF THE INVENTION

The Internet has become an increasingly useful tool and means of communication for many people. As the popularity of the Internet has increased, traffic to many Internet service provider (ISP) and application service provider (ASP) sites has become so congested at times that many companies have to impose a limit on the number of users using their sites during peak hours. The result is a significant loss of business for e-business merchants, user dissatisfaction, and the permanent loss of many potential customers. According to at least one source, during the 1999 holiday shopping season, 25 percent of all potential online buyers never completed their online purchases because the e-tail sites of interest had either crashed or were simply too slow. The principal cause of these problems in the case of larger sites was, and is, an inappropriate distribution of the requests of customers or users (clients) among the sites' resources (servers), namely the multiple content and application servers that are responsible for responding to these requests.


Allocating content and application server resources to respond to a large number of client requests can become rather complex in certain circumstances involving multiple servers at a given site. If it is assumed that there is always at least one server available for each new task that arises, resource assignments may be made in an arbitrary manner, making the resource allocation procedure trivial. To satisfy the assumption underlying this approach to resource allocation, it is generally desirable to create a system design that has abundant resources and strives to conserve them to maintain availability and efficient throughput. In this approach, each client request received at a site is handled as an independent event. U.S. Pat. Nos. 6,173,322, 6,070,191, 5,999,965, and 5,504,894 all describe resource demand distribution schemes that allocate client requests among various resources where the client requests are each treated as independent events.


U.S. Pat. No. 6,173,322 is a good example of this approach and describes a system comprised of three host servers each having different request handling capabilities. For illustrative purposes, suppose that hosts H1, H2, and H3 have capabilities C1, C2, and C3 respectively with C3 being the most capable. Further suppose that there are three requests pending, R1, R2, and R3, needing capabilities C1, C2, and C3 respectively. If each request is considered independently and in the order the requests arrive, R1 might be assigned to H3 since this host will serve the request with the least delay. Next, R2 might be assigned to H2 for the same reason. R3 would then suffer if it were assigned to the only remaining host, H1, since H1 is under-powered to handle the request. Alternatively, R3 could wait for H3 to become available. The effect of these kinds of inefficiencies is cumulative; if the same three requests (or their respective equivalents) come in repeatedly and are serviced independently, there will be an ever-diminishing availability of resources until the system saturates and stops responding to new requests. Moreover, Internet demand is not well behaved. Service requests often come in bursts or may back up to form a large backlog for a variety of reasons. As a consequence, it is desirable for the resource allocation procedure to respond in a more sophisticated manner.


Another problem of the request distribution processes described in U.S. Pat. Nos. 6,070,191, 5,999,965, and 5,504,894 is that these processes consider only parameters related to available resources and do not consider the attributes of the incoming client requests. U.S. Pat. No. 6,173,322 parses certain data contained in incoming client requests, but only for the purpose of applying a static rule to distribute the requests to one of several server groups. Once this has been done, dynamic resource capability rules are applied to assign the request to a server within the group. These rules may operate in consideration of the static rules previously applied, but only after the static rules are first applied.


While existing schemes for distributing client requests among multiple servers have begun to address some of these problems, it would be desirable to provide a system for distributing client requests across multiple servers that is more efficient and robust. Specifically, it would be advantageous to provide a system that analyzes the attributes of client requests for expected demand patterns with which resource requirements may be associated, allowing the resource needs of incoming client requests to be compared with the resources available, and thus making the resource allocation scheme more adaptive and dynamic in all operating aspects.


SUMMARY OF THE INVENTION

The present invention is a system for distributing incoming client requests across multiple servers in a networked client-server computer environment. The system processes as a set the requests incoming within a predetermined time interval, collecting information on both the attributes of the requests and the resource capability of the servers to dynamically allocate the requests in the set to the appropriate servers upon completion of the time interval. Preferably, the system includes a request table to collect at least two requests incoming within a predetermined time interval. A request examiner routine analyzes each collected request with respect to at least one attribute. A system status monitor collects resource capability information of each server in a resource table at least once during said time interval. An optimization and allocation process distributes the collected requests in the request table across the multiple servers upon completion of said time interval based on an optimization of potential pairings of the requests in the request table with the servers in the resource table. The optimization and allocation process preferably analyzes metrics maintained in the request table and resource table as part of a relational database to allocate requests to servers based on a minimization of the metric distance between pairings of requests and servers. Preferably, the request table is part of a dynamic, relational database, and a process of statistical inference for ascertaining expected demand patterns involving the attributes adds predictive information about client requests as part of the request examiner routine.


The present invention responds to the demanding circumstances described above by shifting from processing each request as an independent event to processing the requests incoming within a predetermined time interval as a set. The requests are processed as a set by collecting the requests incoming within the predetermined time interval, analyzing each of these requests with respect to at least one attribute, collecting at least once during the time interval information about each server's ability and availability to handle requests (i.e., resource capability information), distributing the set of requests across the multiple servers upon the completion of the time interval in response to the above actions, and then repeating these steps for each consecutive time interval. This invention has been denominated virtual extended technology (VXT) because it can intelligently run in the background within the confines of current-day bandwidth and processing technology.


Resource allocation, the key to optimum throughput, is the real-time intelligent management of system resources. This invention utilizes several interactive decision processes that can consider all operating aspects of a system's resources, both static and dynamic, while balancing the continuously changing competition for these resources. One of the objectives of this invention is to provide a new algorithm for allocating Internet client requests in an intelligent manner to multiple servers to maximize the efficiency and fault tolerance of the resources. Costs of requests within a reasonable time interval are considered simultaneously to produce a solution that is globally effective (i.e., most effective for a site as a whole) at the possible expense of some individual (localized) requests. The objective is further achieved through analysis of attributes of requests as these attributes correlate to request demands on resources and of the just-in-time running status of those resources.


To return to the above example, a more effective solution would look at all three requests simultaneously and assign R1 to H1, R2 to H2, and R3 to H3. Request R1 will receive the nominal service it needs (slightly less than H3 would have offered), R3 will receive the appropriate level of service promptly, and the overall performance of the site will therefore be better. This latter solution is considered globally optimal because the number of requests managed per second is maximized and the collective resources are available for the next set of requests sooner.
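The intuition can be sketched with a toy calculation. The following Python snippet (illustrative only; the cost figures and function names are assumptions, not taken from the patent) compares the independent, arrival-order assignment described in the background with a set-based assignment that minimizes the summed cost for all three requests at once.

```python
from itertools import permutations

# Hypothetical delay-based costs: cost[i][j] is the mismatch (delay) incurred by
# serving request R(i+1) on host H(j+1). H3 is the most capable host, so every
# request runs fastest there. The numbers are illustrative only.
cost = [
    [2.0, 1.5, 1.0],   # R1 (needs capability C1)
    [5.0, 2.0, 1.5],   # R2 (needs capability C2)
    [9.0, 5.0, 2.0],   # R3 (needs capability C3)
]

def independent_assignment(cost):
    """Treat each request as an independent event, in arrival order: each request
    simply grabs the cheapest host still available."""
    taken, assignment, total = set(), [], 0.0
    for row in cost:
        j = min((j for j in range(len(row)) if j not in taken), key=lambda j: row[j])
        taken.add(j)
        assignment.append(j)
        total += row[j]
    return assignment, total

def set_based_assignment(cost):
    """Consider the whole set of requests at once and minimize the summed cost."""
    n = len(cost)
    return min(
        ((list(p), sum(cost[i][p[i]] for i in range(n))) for p in permutations(range(n))),
        key=lambda pair: pair[1],
    )

print(independent_assignment(cost))  # ([2, 1, 0], 12.0): R1->H3, R2->H2, R3->H1
print(set_based_assignment(cost))    # ([0, 1, 2], 6.0):  R1->H1, R2->H2, R3->H3
```

With these illustrative numbers, the per-request approach leaves R3 stranded on the under-powered H1, while the set-based assignment roughly halves the total cost for the site as a whole.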


Requests to a site can vary widely from web surfing, product search, price comparison, and checkout to multimedia access. However, the demand on resources by each kind of request is predictable. The distribution decision-making process of this invention accounts for attributes and behavior of incoming requests and the corresponding compatibility of system hardware and software. Incoming client requests are analyzed to determine their attributes and behavior so that a given request's expected demand on resources can be predicted and resource requirements can be assigned to the request.


One component of the invention extracts these attributes from incoming requests. An analysis of the effectiveness of these attributes, as well as the identification of other parameters that may be beneficial, can be performed during the requirements analysis task. Extraction of the dynamic attributes is performed in real time by VXT's integral request examiner or system status monitor.


The invention learns how to characterize site-specific traffic in several ways. First, it expands or contracts the number of expected demand patterns based on the success of the request classification. In other words, if a live request does not sufficiently match an already existing pattern, a new pattern is created. Also, if the resource requirement parameters for the matching entry are not correct as measured by system experience, either the parameters themselves are adjusted, or a new pattern is created. Conversely, the number of patterns is constrained to minimize the computation required to classify live requests. The pattern set may be reorganized to eliminate unused, redundant, or ineffective entries. This self-organizing and reorganizing paradigm refines parameters by experience and remains vigilant to non-stationary statistical trends.
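As a rough illustration of this self-organizing behavior, the following sketch keeps a pattern store, matches a live request against it, creates a new pattern when no existing one matches closely enough, and prunes rarely used entries. The data layout, threshold values, and function names are assumptions for illustration, not details specified by the patent.

```python
import math

# Illustrative pattern store: each pattern is a stored requirement profile
# (five metrics) plus a usage counter. Thresholds and sizes are assumed.
MATCH_THRESHOLD = 0.5   # maximum distance at which a live request "matches" a pattern
MAX_PATTERNS = 64       # constrain the pattern set to bound classification cost

patterns = []  # list of dicts: {"profile": [cpu, mem, storage_bw, proxy_bw, peer_bw], "hits": int}

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(request_profile):
    """Return the matching pattern, creating a new one if nothing matches well."""
    best = min(patterns, key=lambda p: distance(p["profile"], request_profile), default=None)
    if best is not None and distance(best["profile"], request_profile) <= MATCH_THRESHOLD:
        best["hits"] += 1
        return best
    new_pattern = {"profile": list(request_profile), "hits": 1}
    patterns.append(new_pattern)
    prune()
    return new_pattern

def prune():
    """Drop the least-used patterns to keep live classification cheap."""
    if len(patterns) > MAX_PATTERNS:
        patterns.sort(key=lambda p: p["hits"], reverse=True)
        del patterns[MAX_PATTERNS:]
```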


Similarly, the compatibility of the system hardware and software is also provided to the decision-making process. Some of these characteristics are static and known in advance, while others are dynamic and a function of the tasks currently executing. Preferably, the collection of resource capability information for each server includes metrics for CPU and memory availability and for connectivity to a proxy server, to a main storage system, and to other content servers. This information can be pushed or pulled from the servers at certain times, and any of several techniques can be implemented for minimal interruption of the main execution on the servers. For example, the information can be pulled periodically by the main proxy server, or a server can push the information to the main proxy server when a certain parameter exceeds a pre-determined threshold. This performance feedback allows for an informed decision on which request to send to which server.
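One possible reading of the pull variant is sketched below: a status monitor that periodically polls each server and records the five capability metrics in a resource table. The metric names and the servers' get_metrics() interface are hypothetical placeholders for whatever instrumentation a real deployment exposes.

```python
import time

# Hypothetical polling monitor: the metric names and the servers' get_metrics()
# interface are assumptions for illustration, not part of the patent.
class SystemStatusMonitor:
    def __init__(self, servers, poll_interval_s=1.0):
        self.servers = servers            # objects exposing .id and get_metrics() -> dict
        self.poll_interval_s = poll_interval_s
        self.resource_table = {}          # server id -> latest capability metrics

    def poll_once(self):
        for server in self.servers:
            m = server.get_metrics()
            self.resource_table[server.id] = {
                "cpu_mips_free": m["cpu_mips_free"],            # CPU availability
                "mem_mb_free_per_cpu": m["mem_mb_free_per_cpu"],# memory availability
                "storage_bw_mbps": m["storage_bw_mbps"],        # main storage connectivity
                "proxy_bw_mbps": m["proxy_bw_mbps"],            # main proxy connectivity
                "peer_bw_mbps": m["peer_bw_mbps"],              # peer server connectivity
            }

    def run(self):
        while True:                       # in practice, at least once per time interval
            self.poll_once()
            time.sleep(self.poll_interval_s)
```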


Once this information is captured for a given interval of time, it must be reduced to a metric representation that can be manipulated to compute the best assignments of client requests to resources. The metrics associated with each request form a requirement data set whose elements represent the requirement level of each of the parameters used in the decision process. The metrics associated with the ability of a particular server to satisfy the request form a capability data set, with each element of this data set having a counterpart in the requirement data set. During operations, each request has its own requirement data set and each server or processing node has its own capability data set. The difference or metric distance between a requirement data set and a capability data set, calculated for any given pairing of client request and server, represents the mismatch (or cost) incurred by the corresponding assignment of the request to the server. If the data sets are identical, the cost is zero.


The assignment of multiple simultaneous requests can be done by one of several routines. The purpose of each routine, however, should be to select a server or processing resource for each client request so that the sum of all the costs, for the combination of resource and request pairings, is minimized. The solution can be found by using one of several algorithms.


Some algorithms find a perfect solution but require considerable processing, while others will find a nearly optimal solution quickly. Often, the nearly optimal solution is good enough to satisfy the presently existing circumstances.


One embodiment of the invention is a method for allocating a server selected from a plurality of servers to client requests originating over a predefined time interval at a plurality of user accounts, the method comprising: collecting a plurality of client requests that arrive within the predefined time interval wherein at least two of said client requests are serviceable by the server and wherein a first of said at least two of said client requests originates at a first user account and a second of said at least two of said client requests originates at a second user account; determining a first value of a cost metric for a first set of client request-server pairings wherein said first set includes at least one client request-server pair with said server being paired with either said first or said second of said at least two client requests; determining a second value of a cost metric for a second set of client request-server pairings wherein said second set includes at least one client request-server pair with said server being paired with both said first and said second of said at least two client requests; and at the end of said time interval distributing said client requests according to one of said first and said second set of client request-server pairings based on said first and second values of said cost metric.


A second embodiment is a method for distributing client requests across a plurality of servers in a client-server networked system, the method comprising: selecting a time window; collecting client requests arriving within said time window wherein said client requests include at least a first plurality of said client requests that originate at a first user account and at least a second plurality of client requests that originate at a second user account; determining a first cost metric corresponding to a first set of client request-server pairing wherein at least one server is paired with at least one of said first plurality of said client requests and at least one of said second plurality of client requests; determining a second cost metric corresponding to a second set of client request-server pairings wherein said second set is characterized by first and second disjoint subsets with all pairings that include client requests originating at the first user account belonging to the first subset and all pairings that include client requests originating at the second user account belonging to the second subset; and selecting one of said first set of client request-server pairs and said second set of client request-server pairs based on a differential between said first cost metric and said second cost metric.


One exemplary embodiment of the present invention includes a method for allocating a server, selected from a plurality of servers, to client requests originating over a predefined time interval at a plurality of user accounts. The method comprising: collecting a plurality of client requests that arrive within the predefined time interval wherein at least two of said client requests are serviceable by the server and wherein a first of said at least two of said client requests originates at a first user account and a second of said at least two of said client requests originates at a second user account; determining a first value of a cost metric for a first set of client request-server pairings wherein said first set includes at least one client request-server pair with said server being paired with either said first or said second of said at least two client requests; determining a second value of a cost metric for a second set of client request-server pairings wherein said second set includes at least one client request-server pair with said server being paired with both said first and said second of said at least two client requests; and at the end of said predefined time interval distributing said client requests according to one of said first and said second set of client request-server pairings based on said first and second values of said cost metric; wherein the step of determining the first or the second value of a cost metric for the first or the second set of client request-server pairings further comprises the steps of: initializing the first or the second set of client request-server pairings at a commencement of the predefined time interval; a) selecting a client request-server pair to satisfy a selection criteria; b) creating a requirement vector corresponding to said client request; c) creating a capability vector corresponding to said server; d) calculating a distance between the requirement vector and the capability vector and adding said distance to a cumulative value when said distance exceeds a match threshold value and repeating steps a), b), c) and d); e) adding said client request-server pair to said set of client request-server pairings when said distance exceeds the match threshold value, said cumulative value is less than a cost threshold and said client request has arrived within said predefined time interval.


One exemplary embodiment of the present invention includes a method wherein the step of determining the value of the first or the second cost metric for the first or the second set of client request-server pairings comprises the steps of: at the commencement of said predefined time interval, initializing a cumulative value to zero; for each client request-server pair in the first or the second set of client request-server pairings, a) creating a requirement vector corresponding to said client request; b) creating a capability vector corresponding to said server; c) calculating an inner product of said requirement vector and said capability vector and adding said inner product to the cumulative value and repeating steps a), b) and c) for all client request-server pairs in the first or the second set of client request-server pairings whereupon said cumulative value represents the value of the cost metric.
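Read literally, this variant accumulates one inner product per pairing into the cost metric. A minimal sketch, assuming each requirement and capability vector is a plain list of the five metrics, might look as follows.

```python
def inner_product(requirement, capability):
    """Element-wise product summed over the corresponding metrics."""
    return sum(r * c for r, c in zip(requirement, capability))

def cost_metric(pairings):
    """pairings: iterable of (requirement_vector, capability_vector) tuples for one
    candidate set of client request-server pairs collected during the interval."""
    cumulative = 0.0                      # initialized at the commencement of the interval
    for requirement, capability in pairings:
        cumulative += inner_product(requirement, capability)
    return cumulative                     # the value of the cost metric for this set
```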


One exemplary embodiment of the present invention includes a method wherein the step of distributing said client requests further comprises distributing said client requests according to said first set of client request-server pairings if said first value of the cost metric is lower than the second value of the cost metric, and otherwise distributing said client requests according to said second set of client request-server pairings.


One exemplary embodiment of the present invention includes a method wherein said selection criteria comprises matching a client request with a server to generate at least one client request-server pairing belonging to one of said first set and said second set.


One exemplary embodiment of the present invention includes a system for distributing load within a client-server computer network, comprising: a plurality of interconnected computer servers, each server having at least one processor, wherein each computer server is associated with a capability vector having at least one element associated with a resource expected to be requested by at least one of a plurality of incoming client requests; a dynamic capability vector determining module configured to generate a dynamic capability vector for each server of said plurality of interconnected servers, said dynamic capability vector representing an update to said capability vector such that the at least one element of the capability vector corresponds to an unused portion of the resource associated with the at least one element and measured at the commencement of one of a sequence of predefined time intervals; a requirement vector determining module configured to generate a requirement vector for each incoming client request during the one of the sequence of predefined time intervals; and a load balancing module for selectively pairing said plurality of interconnected computer servers with one or more of said plurality of incoming client requests so as to minimize a cost metric computed during the one predefined time interval in said sequence of predefined time intervals wherein said cost metric is a function of vector distances between said dynamic capability vectors and said requirement vectors associated with said computer servers and said client request pairs in said computer server-client request pairing; wherein said load balancing module further comprises a plurality of instances of load balancing modules resident on an appropriate plurality of servers disposed at intermediate nodes forming a connectivity hierarchy of layers throughout said client-server computer network such that said cost metric is computed and minimized for at least one layer of server nodes corresponding to the same connectivity hierarchy whereby each incoming client request is satisfied by a plurality of computer servers and transmission paths.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a diagrammatic view of the present invention implemented across servers in a networked computing environment.



FIG. 2 is a diagrammatic view of the main interacting elements of a system with multiple servers for responding to client requests, including a proxy server, content servers, switches, and a storage system.



FIG. 3 illustrates five types of inter-processor connectivity for a system with multiple servers.





DESCRIPTION OF THE PREFERRED EMBODIMENT


FIG. 2 shows a typical configuration of a system (10) having multiple resources that may be allocated to respond to client requests received from the Internet. A proxy server (12) receives the client requests from the Internet and, using the VXT (100) as will be described shortly, distributes those requests via network switches (14) to one of the multiple content/application servers (16), which preferably have access to a common DASD storage unit (18) on which information pertinent to the client requests may be stored. As will be discussed in connection with FIG. 3, it will be understood that the present invention is applicable to numerous configurations of server resources in a system (10). In one embodiment, as described in the previously identified applications entitled “Scalable Internet Engine” and “Method and System For Providing Dynamic Host Service Management Across Disparate Accounts/Sites”, servers are dynamically allocated among multiple sites or accounts. In this embodiment, the present invention is applicable not only for allocating server resources among requests in a single account, but may also be extended to provide additional information for how to allocate servers among different accounts over time.


In the preferred embodiment of the VXT (100) as shown in FIG. 1, the invention comprises a request table (110) to collect at least two requests (102) incoming within a predetermined time interval, a request examiner process (120) to analyze each said collected request with respect to at least one attribute, a system status monitor (130) to collect resource capability information of each server (104), and an optimization and allocation process (140) to distribute said collected requests in the request table (110) across the multiple servers upon completion of said time interval in response to said attributes and said resource capability information. Incoming client requests (102) are analyzed for their respective attributes by the request examiner process (120). This attribute information is then sent to the request table (110). Preferably, the system status monitor (130) collects resource capability information as part of a resource table (132).


In a preferred embodiment, the request table (110) and the resource table (132) are implemented as part of a relational database. A process of rational statistical inference (150) analyzes each client request to assign a pattern classification so that its expected demand on resources can be predicted using the pattern classification in the adaptive request table (110).


One of the primary responsibilities of the request examiner (120) of the VXT (100) is to examine all incoming requests and to prioritize these requests based on criteria that can be described in general as (1) categorical criteria, such as product searching, price comparison, online shopping, web surfing, audio streaming, and video downloads, and (2) demographic criteria, such as the origin of the request and a possible user profile. Comparing these attributes with a dynamic, relational database that records past requests and their behavior, along with a process of rational statistical inference (150), permits the VXT (100) to estimate each client request's (102) resource requirements in terms of CPU availability, memory availability, and bandwidth or connectivity of the servers (104).


The purpose of the database and process of statistical inference (150) is to facilitate the construction of an adaptive request table (110) containing several generic request types or pattern classifications that are most likely to be received by the proxy server (12). Each request type is assigned a set of at least five parameters or resource requirement metrics (114) that reflect different requirement aspects for the respective request. The values assigned to these five parameters form a requirements vector (116) that prescribes the generic request's expected resource requirements in terms of CPU time, memory, bandwidth or connectivity for storage, bandwidth or connectivity to the main proxy server, and bandwidth or connectivity to peer servers (i.e., connectivity between content servers). When a request from the Internet comes in, the request examiner (120) compares the request with the patterns (112) contained in the adaptive request table (110), finds the closest match, and creates a requirement vector (116) including the five corresponding resource parameters.
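A sketch of how this closest-match lookup might populate a requirement vector is shown below, assuming the adaptive request table maps a pattern classification to its five resource requirement metrics. The table entries, numeric values, and the encode() helper are hypothetical.

```python
import math

# Adaptive request table: pattern classification -> five resource requirement metrics
# (CPU time, memory, storage connectivity, proxy connectivity, peer connectivity).
# The concrete entries and the attribute encoding are assumptions for illustration.
REQUEST_TABLE = {
    "product_search": [120.0, 64.0, 10.0, 5.0, 2.0],
    "checkout":       [200.0, 128.0, 20.0, 8.0, 4.0],
    "video_download": [80.0, 32.0, 80.0, 40.0, 1.0],
}

def requirement_vector(request_attributes, encode):
    """Find the pattern closest to the request's encoded attributes and return the
    corresponding five resource requirement metrics as the requirement vector."""
    point = encode(request_attributes)    # attributes -> numeric profile on the same 5 axes
    best_pattern = min(
        REQUEST_TABLE,
        key=lambda name: math.dist(point, REQUEST_TABLE[name]),
    )
    return list(REQUEST_TABLE[best_pattern])
```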


With reference to FIG. 3, a functional and cost-effective system (10) should have at least two levels of networked servers. The lowest level consists of a collection of symmetric multiprocessors (SMPs) on the same bus. The bus should be a network having an aggregate bandwidth greater than 1 Gbps and very low latency. The next level is a collection of SMPs on one or more switches with less than 1 Gbps bandwidth and higher latency. The VXT (100) is designed to intelligently handle the added complexities of such an ASP system.


The VXT (100) ranks the available servers according to specific ranking criteria and each server's current running status in CPU availability, memory availability, storage connectivity, main proxy server connectivity, and peer server connectivity, and generates a resource table (132) summarizing the resource capability metrics (134) in a capability vector (136).
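A companion sketch for the resource table side is given below: the five status readings for each server are arranged into a capability vector whose elements correspond one-for-one with the requirement vector. The dictionary keys are assumed names, not terms from the patent.

```python
# Resource table: one capability vector per server, built from the status monitor's
# readings. The dictionary layout is an assumption for illustration.
def capability_vector(status):
    """status: dict of the five resource capability metrics for one server."""
    return [
        status["cpu_mips_free"],          # CPU availability (MIPS)
        status["mem_mb_free_per_cpu"],    # memory availability per processor (MB)
        status["storage_bw_mbps"],        # connectivity to the main storage system
        status["proxy_bw_mbps"],          # connectivity to the main proxy server
        status["peer_bw_mbps"],           # connectivity to peer content servers
    ]

def resource_table(statuses):
    """statuses: server id -> metrics dict, as collected by the status monitor."""
    return {server_id: capability_vector(s) for server_id, s in statuses.items()}
```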


In a dynamic environment, each processor is capable of handling more than one task at a time, whether it is performing a price comparison or a search for a specific consumer item. The CPU availability parameter is defined in absolute terms as the unused portion of each processor's computing power measured in units of millions of instructions per second (MIPS).


The memory availability parameter is defined in absolute terms as the unused portion of each node's shared memory, measured in units of megabytes, divided by the number of processors in the node. This is because for SMP systems with several processors (usually 4 to 8) in each node, the amount of memory available to one particular processor cannot be determined, as the memory is shared among all processors in the same node.


Connectivity is a complex matter. In most systems, each processor has five different communication partners. Latency (determined by hardware) and available bandwidth (determined by current utilization) should be ascertained for each of these partners. FIG. 3 identifies five types of inter-processor connectivity with a wide range of latency and bandwidth. Type I connectivity is between processors on the same node. Type II connectivity is between processors on different nodes, but on the same switch. Type III connectivity is between processors on different switches (for which a new parameter should be introduced to represent the number of hops to reach the partner processor). Type IV connectivity is between the processor and the proxy server. Type V connectivity is between the processor and the main storage system. Presently, most ASP systems are not sophisticated enough to take advantage of the inter-processor connectivity information, i.e., Types II and III connectivity, so the VXT (100) combines Types II and III connectivity into an aggregate connectivity. Three parameters are defined to represent available main proxy connectivity, central storage connectivity, and peer server connectivity. These parameters are bandwidths measured in units of Mbps and recorded by the system status monitor.
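The aggregation described above can be sketched as follows, under the assumption that the monitor sees a list of available bandwidths per connectivity type and reduces them to the three recorded parameters; the reduction used here (summing or averaging) is an illustrative choice, not one the patent prescribes.

```python
# Reduce the per-partner connectivity readings to the three bandwidth parameters
# recorded by the status monitor. Input layout and reduction are assumptions.
def connectivity_parameters(bandwidths_mbps):
    """bandwidths_mbps: dict with keys 'type_ii' through 'type_v', each a list of
    available bandwidths in Mbps, one entry per communication partner of that type."""
    peer = bandwidths_mbps["type_ii"] + bandwidths_mbps["type_iii"]   # aggregated, per the text
    return {
        "peer_bw_mbps": sum(peer) / len(peer) if peer else 0.0,   # peer server connectivity
        "proxy_bw_mbps": sum(bandwidths_mbps["type_iv"]),         # main proxy connectivity
        "storage_bw_mbps": sum(bandwidths_mbps["type_v"]),        # central storage connectivity
    }
```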


Once all this information for incoming Internet requests and system resources is captured for a given time interval, it must be reduced to a metric representation that can be manipulated to compute the best assignments of requests to resources. The metrics associated with each request form a requirement vector (116) whose elements represent the requirement level of each of the parameters used in the decision process. The metrics associated with the ability of a particular server (104) to satisfy the request (102) form a capability vector (136). Each element of this vector (136) has a counterpart in the requirement vector (116). During operations, each request (102) has its own requirement vector (116), and each server or processing node (104) has its own capability vector (136). The vector space distance between the requirement vector (116) and the capability vector (136) for any given pairing of request (102) and server (104) represents the degree of mismatch (cost) incurred by the corresponding assignment of the request to that server. If the vectors are identical, the cost is zero.
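A minimal sketch of the mismatch cost follows, assuming a plain Euclidean distance over the five metrics; the patent does not fix a particular norm, so the choice here is illustrative.

```python
import math

def mismatch_cost(requirement, capability):
    """Cost of assigning a request to a server: the distance between the request's
    requirement vector and the server's capability vector. Zero if identical.
    Euclidean distance is one possible choice of norm."""
    return math.dist(requirement, capability)

# Example: a perfectly matched pairing costs nothing.
assert mismatch_cost([100.0, 64.0, 10.0, 5.0, 2.0], [100.0, 64.0, 10.0, 5.0, 2.0]) == 0.0
```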


The assignment of multiple simultaneous requests (102) can be done in several ways. As described above, the preferred approach creates a requirement vector (116) for each request (102) and a capability vector (136) for each resource (104). The vector distance between each request-resource pair then becomes an element in a cost matrix, whereby the row index is a request identifier and the column index is the resource identifier. The cost matrix is usually sparse since some assignments may be ruled out for simple reasons. A decision-making algorithm then selects a resource for each request so that the sum of all the costs in the matrix is minimized for all combinations of requests and resources. There are several minimization techniques available, such as general neural network techniques, simulated annealing methods, and generic assignment algorithm approaches.
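Building the cost matrix from these vectors might look like the sketch below, where pairings ruled out in advance are left at infinity so the matrix stays effectively sparse. The allowed() predicate is a hypothetical hook for such exclusions.

```python
import math

def build_cost_matrix(requirement_vectors, capability_vectors, allowed=None):
    """requirement_vectors[i] is request i's vector; capability_vectors[j] is server j's.
    allowed(i, j) may rule a pairing out, leaving that cell at infinity (i.e., sparse)."""
    n, m = len(requirement_vectors), len(capability_vectors)
    cost = [[math.inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if allowed is None or allowed(i, j):
                cost[i][j] = math.dist(requirement_vectors[i], capability_vectors[j])
    return cost  # rows indexed by request, columns by resource
```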


The preferred algorithm provides a fast quasi-optimal solution to the distribution problem based on standard methods. One example of such a standard method is a neural network paradigm as described in U.S. Pat. No. 5,548,683, the disclosure of which is hereby incorporated by reference. Another example of a generic assignment algorithm approach is a greedy search algorithm. A greedy algorithm can be applied when the optimization problem is to decide whether or not to include some element from a given set. A greedy algorithm begins with no elements and sequentially selects an element from the feasible set of remaining elements by myopic optimization. (The elements could have been sorted by some criterion, such as associated weights.) This results in an optimal solution to the problem if, and only if, there is an underlying matroid structure (for example, a spanning tree). Other types of generic assignment algorithms include auction algorithms and the Munkres algorithm.
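In the same spirit, a greedy quasi-optimal assignment over the cost matrix can be sketched as follows: candidate pairs are sorted by cost and accepted myopically. For simplicity the sketch assumes at most one request per server per interval, which is an assumption of this illustration rather than a restriction of the invention.

```python
import math

def greedy_assign(cost):
    """cost[i][j]: mismatch cost of request i on server j (math.inf if ruled out).
    Greedily accept the cheapest remaining (request, server) pair until every
    request is assigned or no feasible pair remains."""
    pairs = sorted(
        (c, i, j) for i, row in enumerate(cost) for j, c in enumerate(row) if c < math.inf
    )
    assigned_requests, used_servers, assignment = set(), set(), {}
    for c, i, j in pairs:
        if i not in assigned_requests and j not in used_servers:
            assignment[i] = j
            assigned_requests.add(i)
            used_servers.add(j)
    return assignment  # request index -> server index
```

This runs quickly and is often close to the true minimum, which matches the observation above that a nearly optimal solution is frequently good enough.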


Although the preferred embodiment has been described herein, numerous changes and variations can be made and the scope of the present invention is intended to be defined by the claims.

Claims
  • 1. A method for allocating a server, selected from a plurality of servers, to client requests originating over a predefined time interval at a plurality of user accounts, the method comprising: collecting a plurality of client requests that arrive within the predefined time interval wherein at least two of said client requests are serviceable by the server and wherein a first of said at least two of said client requests originates at a first user account and a second of said at least two of said client requests originates at a second user account;determining a first value of a cost metric for a first set of client request-server pairings wherein said first set includes at least one client request-server pair with said server being paired with either said first or said second of said at least two client requests;determining a second value of a cost metric for a second set of client request-server pairings wherein said second set includes at least one client request-server pair with said server being paired with both said first and said second of said at least two client requests; andat the end of said predefined time interval distributing said client requests according to one of said first and said second set of client request-server pairings based on said first and second values of said cost metric;wherein the step of determining the first or the second value of a cost metric for the first or the second set of client request-server pairings further comprises the steps of:initializing the first or second set of client request-server pairings at a commencement of the predefined time interval;a) selecting a client request-server pair to satisfy a selection criteria;b) creating a requirement vector corresponding to said client request;c) creating a capability vector corresponding to said server;d) calculating a distance between the requirement vector and the capability vector and adding said distance to a cumulative value when said distance exceeds a match threshold value and repeating steps a), b), c) and d); ande) adding said client request-server pair to said set of client request-server pairings when said distance exceeds the match threshold value, said cumulative value is less than a cost threshold and said client request has arrived within said predefined time interval.
  • 2. The method of claim 1 wherein the step of determining the value of the first or the second cost metric for the first or the second set of client request-server pairings comprises the steps of: at the commencement of said predefined time interval, initializing a cumulative value to zero;for each client request-server pair in the first or the second set of client request-server pairings,a) creating a requirement vector corresponding to said client request;b) creating a capability vector corresponding to said server;c) calculating an inner product of said requirement vector and said capability vector and adding said inner product to the cumulative value and repeating steps a), b) and c) for all client request-server pairs in the first or second set of client request-server pairings whereupon said cumulative value represents the value of the cost metric.
  • 3. The method of claim 1 wherein the step of distributing said client requests further comprises distributing said client requests according to said first set of client requests-server pairings if said first value of the cost metric is lower than the second value of the cost metric otherwise distributing said client requests according to said second set of client requests-server pairings.
  • 4. The method of claim 1 wherein said selection criteria comprises matching a client request with a server to generate at least one client request-server pairing belonging to one of said first set and said second set.
  • 5. A system for distributing load within a client-server computer network, comprising: a plurality of interconnected computer servers, each server having at least one processor, wherein each computer server is associated with a capability vector having at least one element associated with a resource expected to be requested by at least one of a plurality of incoming client requests;a dynamic capability vector determining module configured to generate a dynamic capability vector for each server of said plurality of interconnected servers, said dynamic capability vector representing an update to said capability vector such that the at least one element of the capability vector corresponds to an unused portion of the resource associated with the at least one element and measured at the commencement of one of a sequence of predefined time intervals;a requirement vector determining module configured to generate a requirement vector for each incoming client request during the one of the sequence of predefined time intervals; anda load balancing module for selectively pairing said plurality of interconnected computer servers with one or more of said plurality of incoming client requests so as to minimize a cost metric computed during the one predefined time interval in said sequence of predefined time intervals wherein said cost metric is a function of vector distances between said dynamic capability vectors and said requirement vectors associated with said computer servers and said client request pairs in said computer server-client request pairing;wherein said load balancing module further comprises a plurality of instances of load balancing modules resident on an appropriate plurality of servers disposed at intermediate nodes forming a connectivity hierarchy of layers throughout said computer client-server network such that said cost metric is computed and minimized for at least one layer of server nodes corresponding to the same connectivity hierarchy whereby each incoming client request is satisfied by a plurality of computer servers and transmission paths.
RELATED APPLICATIONS

This application claims priority as a continuation to U.S. patent application Ser. No. 09/765,766, filed Jan. 18, 2001, now U.S. Pat. No. 6,938,256, which claims the benefit of U.S. Provisional Application No. 60/176,665, filed Jan. 18, 2000, both of which are incorporated herein by reference. This application is also related to two applications that are assigned to the common assignee of the present application, the first of which is entitled “Scalable Internet Engine,” Ser. No. 09/709,820, filed Nov. 10, 2000, now U.S. Pat. No. 6,452,809, and the second of which is entitled “Method and System For Providing Dynamic Host Service Management Across Disparate Accounts/Sites,” Ser. No. 09/710,095, filed Nov. 10, 2000, now U.S. Pat. No. 6,816,905.

US Referenced Citations (225)
Number Name Date Kind
3764747 Nakajima et al. Oct 1973 A
4502116 Fowler et al. Feb 1985 A
4920487 Baffes Apr 1990 A
5031089 Liu et al. Jul 1991 A
5155854 Flynn et al. Oct 1992 A
5187710 Chau et al. Feb 1993 A
5247427 Driscoll et al. Sep 1993 A
5251097 Simmons et al. Oct 1993 A
5303297 Hillis Apr 1994 A
5335343 Lampson et al. Aug 1994 A
5351286 Nici Sep 1994 A
5371848 Casey et al. Dec 1994 A
5460441 Hastings et al. Oct 1995 A
5473773 Aman et al. Dec 1995 A
5487170 Bass et al. Jan 1996 A
5488541 Mistry et al. Jan 1996 A
5504894 Ferguson et al. Apr 1996 A
5504899 Raz Apr 1996 A
5504900 Raz Apr 1996 A
5537542 Eilert et al. Jul 1996 A
5539883 Allon et al. Jul 1996 A
5548683 Engel et al. Aug 1996 A
5586312 Johnson et al. Dec 1996 A
5615329 Kern et al. Mar 1997 A
5630081 Rybicki et al. May 1997 A
5664106 Caccavale Sep 1997 A
5675739 Eilert et al. Oct 1997 A
5675785 Hall et al. Oct 1997 A
5696895 Hemphill et al. Dec 1997 A
5701480 Raz Dec 1997 A
5745884 Carnegie et al. Apr 1998 A
5764915 Heimsoth et al. Jun 1998 A
5771354 Crawford Jun 1998 A
5774668 Choquier et al. Jun 1998 A
5794221 Egendorf Aug 1998 A
5795228 Trumbull et al. Aug 1998 A
5819092 Ferguson et al. Oct 1998 A
5822531 Gorczyca et al. Oct 1998 A
5828737 Sawyer Oct 1998 A
5832222 Dziadosz et al. Nov 1998 A
5845267 Ronen Dec 1998 A
5875306 Bereiter Feb 1999 A
5877938 Hobbs et al. Mar 1999 A
5889944 Butt et al. Mar 1999 A
5899980 Wilf et al. May 1999 A
5901228 Crawford May 1999 A
5912802 Nelson Jun 1999 A
5928323 Gosling et al. Jul 1999 A
5938732 Lim et al. Aug 1999 A
5946670 Motohashi et al. Aug 1999 A
5948065 Eilert et al. Sep 1999 A
5951694 Choquier et al. Sep 1999 A
5956391 Melen et al. Sep 1999 A
5956697 Usui Sep 1999 A
5974462 Aman et al. Oct 1999 A
5978577 Rierden et al. Nov 1999 A
5983225 Anfindsen Nov 1999 A
5983326 Hagersten et al. Nov 1999 A
5987621 Duso et al. Nov 1999 A
5991792 Nageswaran Nov 1999 A
5999965 Kelly Dec 1999 A
6006259 Adelman et al. Dec 1999 A
6011791 Okada et al. Jan 2000 A
6014651 Crawford Jan 2000 A
6014669 Slaughter et al. Jan 2000 A
6025989 Ayd et al. Feb 2000 A
6035281 Crosskey et al. Mar 2000 A
6035356 Khan et al. Mar 2000 A
6038587 Phillips et al. Mar 2000 A
6041354 Biliris et al. Mar 2000 A
6067545 Wolff May 2000 A
6067580 Aman et al. May 2000 A
6070191 Narendran et al. May 2000 A
6088727 Hosokawa et al. Jul 2000 A
6088816 Nouri et al. Jul 2000 A
6092178 Jindal et al. Jul 2000 A
6094351 Kikinis Jul 2000 A
6094680 Hokanson Jul 2000 A
6097882 Mogul Aug 2000 A
6105067 Batra Aug 2000 A
6108703 Leighton et al. Aug 2000 A
6112243 Downs et al. Aug 2000 A
6115693 McDonough et al. Sep 2000 A
6134673 Chrabaszcz Oct 2000 A
6145098 Nouri et al. Nov 2000 A
6151688 Wipfel et al. Nov 2000 A
6154787 Urevig et al. Nov 2000 A
6157927 Schaefer et al. Dec 2000 A
6167446 Lister et al. Dec 2000 A
6170067 Liu et al. Jan 2001 B1
6173322 Hu Jan 2001 B1
6182109 Sharma et al. Jan 2001 B1
6185598 Farber et al. Feb 2001 B1
6199111 Hara et al. Mar 2001 B1
6199173 Johnson et al. Mar 2001 B1
6209018 Ben-Shachar et al. Mar 2001 B1
6216185 Chu Apr 2001 B1
6223202 Bayeh Apr 2001 B1
6230183 Yocom et al. May 2001 B1
6233587 Tandon May 2001 B1
6243737 Flanagan et al. Jun 2001 B1
6243838 Liu et al. Jun 2001 B1
6266721 Sheikh et al. Jul 2001 B1
6272675 Schrab et al. Aug 2001 B1
6279001 DeBettencourt et al. Aug 2001 B1
6298451 Lin Oct 2001 B1
6301612 Selitrennikoff et al. Oct 2001 B1
6317773 Cobb et al. Nov 2001 B1
6324580 Jindal et al. Nov 2001 B1
6327579 Crawford Dec 2001 B1
6330689 Jin et al. Dec 2001 B1
6338112 Wipfel et al. Jan 2002 B1
6374243 Kobayashi et al. Apr 2002 B1
6374297 Wolf et al. Apr 2002 B1
6389012 Yamada et al. May 2002 B1
6405317 Flenley et al. Jun 2002 B1
6411943 Crawford Jun 2002 B1
6411956 Ng Jun 2002 B1
6412079 Edmonds et al. Jun 2002 B1
6421661 Doan et al. Jul 2002 B1
6421688 Song Jul 2002 B1
6421777 Pierre-Louis et al. Jul 2002 B1
6425006 Chari et al. Jul 2002 B1
6430618 Karger et al. Aug 2002 B1
6442618 Phillips et al. Aug 2002 B1
6446200 Ball et al. Sep 2002 B1
6452809 Jackson et al. Sep 2002 B1
6460082 Lumelsky et al. Oct 2002 B1
6463454 Lumelsky et al. Oct 2002 B1
6496828 Cochrane et al. Dec 2002 B1
6504996 Na et al. Jan 2003 B1
6519553 Barnette et al. Feb 2003 B1
6519679 Devireddy et al. Feb 2003 B2
6532488 Ciarlante et al. Mar 2003 B1
6542926 Zalewski et al. Apr 2003 B2
6553416 Chari et al. Apr 2003 B1
6553420 Karger et al. Apr 2003 B1
6574748 Andress et al. Jun 2003 B1
6578147 Shanklin et al. Jun 2003 B1
6587938 Eilert et al. Jul 2003 B1
6601096 Lassiter, Jr. Jul 2003 B1
6606253 Jackson et al. Aug 2003 B2
6608832 Forslow Aug 2003 B2
6615199 Bowman-Amuah Sep 2003 B1
6615265 Leymann et al. Sep 2003 B1
6625639 Miller et al. Sep 2003 B1
6633916 Kauffman Oct 2003 B2
6640244 Bowman-Amuah Oct 2003 B1
6640249 Bowman-Amuah Oct 2003 B1
6647508 Zalewski et al. Nov 2003 B2
6651125 Maergner et al. Nov 2003 B2
6681316 Clermidy et al. Jan 2004 B1
6681342 Johnson et al. Jan 2004 B2
6684343 Bouchier et al. Jan 2004 B1
6687729 Sievert et al. Feb 2004 B1
6687831 Albaugh et al. Feb 2004 B1
6704737 Nixon et al. Mar 2004 B1
6704768 Zombek et al. Mar 2004 B1
6714980 Markson et al. Mar 2004 B1
6715145 Bowman-Amuah Mar 2004 B1
6718359 Zisapel et al. Apr 2004 B2
6718415 Chu Apr 2004 B1
6721568 Gustavsson et al. Apr 2004 B1
6725317 Bouchier et al. Apr 2004 B1
6728958 Klein et al. Apr 2004 B1
6742015 Bowman-Amuah May 2004 B1
6779016 Aziz et al. Aug 2004 B1
6816903 Rakoshitz et al. Nov 2004 B1
6816905 Sheets et al. Nov 2004 B1
6820171 Weber et al. Nov 2004 B1
6826709 Clermidy et al. Nov 2004 B1
6832238 Sharma et al. Dec 2004 B1
6839700 Doyle et al. Jan 2005 B2
6842906 Bowman-Amuah Jan 2005 B1
6853642 Sitaraman et al. Feb 2005 B1
6871210 Subramanian Mar 2005 B1
6877035 Shahabuddin et al. Apr 2005 B2
6898642 Chafle et al. May 2005 B2
6901442 Schwaller et al. May 2005 B1
6938256 Deng et al. Aug 2005 B2
6950848 Yousefi'zadeh Sep 2005 B1
6952401 Kadambi et al. Oct 2005 B1
6963915 Karger et al. Nov 2005 B2
6973517 Golden et al. Dec 2005 B1
6985967 Hipp Jan 2006 B1
6986139 Kubo Jan 2006 B1
7032241 Venkatachary et al. Apr 2006 B1
7051098 Masters et al. May 2006 B2
7051188 Kubala et al. May 2006 B1
7055052 Chalasani et al. May 2006 B2
7080051 Crawford Jul 2006 B1
7085837 Kimbrel et al. Aug 2006 B2
7099981 Chu Aug 2006 B2
7140020 McCarthy et al. Nov 2006 B2
7146446 Chu Dec 2006 B2
7185112 Kuranari et al. Feb 2007 B1
7228546 McCarthy et al. Jun 2007 B1
7289964 Bowman-Amuah Oct 2007 B1
7328297 Chu Feb 2008 B2
7356602 Goldszmidt et al. Apr 2008 B2
7363415 Chu Apr 2008 B2
7363416 Chu Apr 2008 B2
7376779 Chu May 2008 B2
RE41092 Chu Jan 2010 E
7676624 Chu Mar 2010 B2
7693993 Sheets et al. Apr 2010 B2
7730172 Lewis Jun 2010 B1
7764683 DiGiorgio et al. Jul 2010 B2
7844513 Smith Nov 2010 B2
20010039581 Deng et al. Nov 2001 A1
20020007468 Kampe et al. Jan 2002 A1
20020083078 Pardon et al. Jun 2002 A1
20020091854 Smith Jul 2002 A1
20020107877 Whiting et al. Aug 2002 A1
20020124083 Jeyaraman et al. Sep 2002 A1
20030037092 McCarthy et al. Feb 2003 A1
20030039237 Forslow Feb 2003 A1
20040162901 Mangipudi et al. Aug 2004 A1
20050076214 Thomas et al. Apr 2005 A1
20050182838 Sheets et al. Aug 2005 A1
20060129687 Goldszmidt et al. Jun 2006 A1
20070140242 DiGiorgio et al. Jun 2007 A1
20100268827 Sheets et al. Oct 2010 A1
20110191462 Smith Aug 2011 A1
20110238564 Lim et al. Sep 2011 A1
Foreign Referenced Citations (22)
Number Date Country
2415770 Apr 2010 CA
01812619.7 Nov 2006 CN
0833514 Apr 1998 EP
0844577 May 1998 EP
0844577 Feb 1999 EP
0873009 Nov 2005 EP
1 091 296 Apr 2011 EP
11027635 Jan 1999 JP
11-120127 Apr 1999 JP
2000-040115 Feb 2000 JP
2002-132741 May 2002 JP
2002-202959 Jul 2002 JP
2002-245017 Aug 2002 JP
2004-082911 Mar 2004 JP
2004-519749 Jul 2004 JP
0840960 Jun 2008 KR
WO 0004458 Jan 2000 WO
WO 0014634 Mar 2000 WO
WO 0167707 Mar 2001 WO
WO 0201347 Jan 2002 WO
WO 0207037 Jan 2002 WO
WO 0208891 Jan 2002 WO
Related Publications (1)
Number Date Country
20060036743 A1 Feb 2006 US
Provisional Applications (1)
Number Date Country
60176665 Jan 2000 US
Continuations (1)
Number Date Country
Parent 09765766 Jan 2001 US
Child 11202644 US