This application claims priority to Taiwanese Application No. 104122574 filed on Jul. 13, 2015.
The disclosure relates to a method for distributing workload among a plurality of computer servers providing web services, more particularly, to an auto-scaling method for automatically distributing workload among a plurality of servers providing web services.
In the field of web computing services, a conventional approach for service providers (or system operators) is to dedicate or reserve a specific amount of server resources to deal with the entirety of the incoming network traffic. In order to determine the needed amount of server resources, the system operator must predict the growth trend of the incoming network traffic, which is a relatively difficult task considering the varying traffic load at different times and in different regions. For example, a web server may encounter workload bursts in which the traffic surges far beyond its normal volume. If insufficient server resources are available, this may overload the systems and cause data packets to be lost. However, it may be wasteful to dedicate in advance an amount of server resources sufficient to accommodate peak traffic load, because this would leave a certain amount of server resources sitting idle during times of non-peak traffic.
An object of the disclosure is to provide a method for distributing workload among a plurality of servers to improve the cost effectiveness of operations while maintaining a reliable web service.
Another object of the disclosure is to provide a system for distributing workload among a plurality of servers to improve the cost-effectiveness of operations while maintaining a reliable web service.
Yet another object of the disclosure is to provide a server to be used in a web service in which the workload of the server is distributed in a cost-effective and reliable fashion.
This disclosure proposes a method for distributing workload among a plurality of servers each having a given workload capacity to process incoming workload. Each of the servers has a maximum allowable workload that is smaller than the given workload capacity.
In one embodiment, the method includes the steps of:
a) for each of activated one(s) of the servers operating to provide the web service, detecting a current network workload thereof;
b) calculating an overall workload that is equal to a summation of the current network workload(s) detected in step a); and
c) deactivating/activating at least one of the servers so as to adjust a number of the activated one(s) of the servers according to the overall workload calculated in step b), the current network workload(s) detected in step a), and the maximum allowable workload(s) of the activated one(s) of the servers.
With the setting of the maximum allowable workloads of the servers, the method can estimate the status of each activated server and determine whether to increase or decrease the total number of activated server(s).
In one embodiment, a system for distributing workload includes a plurality of servers and a master module. Each of the servers has a given workload capacity for providing a web service and a maximum allowable workload that is smaller than the given workload capacity. The master module is for distributing workload among the servers, and is configured to detect a current network workload for each of activated one(s) of the servers operating to provide the web service, to calculate an overall workload equal to a summation of the current network workload(s) detected thereby, and to deactivate/activate at least one of the servers so as to adjust a number of the activated one(s) of the servers according to the overall workload, the current network workload(s), and the maximum allowable workload(s) of the activated one(s) of the servers.
Other features and advantages of the disclosure will become apparent in the following detailed description of the embodiment with reference to the accompanying drawings, of which:
Referring to
In step S1, the master module 100 detects, for each of activated one(s) of the servers 200 operating to provide the web service, a current network workload thereof.
In step S2, the master module 100 calculates an overall workload equal to a summation of the current network workload(s) detected in step S1.
In step S3, the master module 100 deactivates/activates one of the servers 200 so as to adjust a number of the activated one(s) of the servers 200 according to the overall workload calculated in step S2, the current network workload(s) detected in step S1, and the maximum allowable workload(s) of the activated one(s) of the servers 200.
Specifically, the master module 100 activates one of the servers 200 other than the activated one(s) to provide the web service when the overall workload is greater than a summation of the maximum allowable workload(s) of the activated one(s) of the servers 200.
In addition, when the number of the activated one(s) of the servers 200 is greater than one (i.e., there are a plurality of the activated ones of the servers), the master module 100 deactivates a selected one of the activated ones of the servers 200 when the current network workload of any one of the activated ones of the servers 200 is lower than a dynamic parameter that is associated with the current network workloads of the activated ones of the servers 200 (to be defined in the following). In one embodiment, the master module 100 deactivates said selected one of the activated ones of the servers 200 when the current network workload of any one of the activated ones of the servers 200 is lower than the dynamic parameter and remains lower than the dynamic parameter for a predetermined duration. In one embodiment, the master module 100 selects one of the activated ones of the servers 200 whose current network workload is lower than a residual workload capacity (to be defined in the following) as the selected one of the servers 200. In one embodiment, the master module 100 selects one of the activated ones of the servers 200 whose current network workload is the smallest among the activated ones of the servers 200 as said selected one of the servers 200 to be deactivated. The conditions for selecting said selected one of the servers 200 to be deactivated may be combined where feasible. The master module 100 then deactivates the selected one of the servers 200, and distributes the current network workload of said selected one of the servers 200 to the remaining one(s) of the activated ones of the servers 200.
The dynamic parameter is defined as an average of the current network workloads of the activated ones of the servers 200 and is periodically updated as desired. The residual workload capacity is a difference between a summation of the current network workload(s) of the remaining one(s) of the activated ones of the servers 200 and a summation of the maximum allowable workload(s) of the remaining one(s).
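By way of illustration only, the dynamic parameter, the residual workload capacity, and the selection of a server to be deactivated may be sketched as follows. This is a minimal sketch under stated assumptions and not a definitive implementation: the function names and the dictionary layout are hypothetical, the residual workload capacity is taken as the summed maximum allowable workloads minus the summed current workloads of the remaining servers (one reading of the "difference" defined above), and the smallest-workload and residual-capacity conditions are combined, which the description permits where feasible.

```python
from typing import Dict, Optional


def dynamic_parameter(loads: Dict[str, float]) -> float:
    """Average of the current network workloads of the activated servers."""
    return sum(loads.values()) / len(loads)


def residual_capacity(loads: Dict[str, float],
                      max_allowable: Dict[str, float],
                      candidate: str) -> float:
    """Headroom of the servers that would remain after removing `candidate`.

    Assumption: computed as sum(max allowable) - sum(current workload).
    """
    remaining = [s for s in loads if s != candidate]
    return (sum(max_allowable[s] for s in remaining)
            - sum(loads[s] for s in remaining))


def select_for_deactivation(loads: Dict[str, float],
                            max_allowable: Dict[str, float]) -> Optional[str]:
    """Return the server to deactivate, or None if none should be deactivated."""
    if len(loads) <= 1:
        return None  # deactivation applies only with a plurality of servers
    threshold = dynamic_parameter(loads)
    if not any(load < threshold for load in loads.values()):
        return None  # no workload is below the dynamic parameter
    # Candidate: the activated server with the smallest current workload.
    candidate = min(loads, key=lambda s: loads[s])
    # Deactivate only if the remaining servers can absorb its workload.
    if loads[candidate] < residual_capacity(loads, max_allowable, candidate):
        return candidate
    return None
```

For instance, with hypothetical workloads of 3000, 1000, and 2000 and a maximum allowable workload of 9000 per server, the dynamic parameter is 2000 and the server carrying 1000 would be selected.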
It should be noted that the master module 100 may distribute the current network workload of said selected one of the servers 200 to the remaining one(s) of the activated ones of the servers 200 by one of equal distribution and weighted distribution. Since the main feature of this disclosure does not reside in equal distribution and weighted distribution, details of the same are omitted for the sake of brevity.
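Although the details of equal distribution and weighted distribution are omitted from this description, a simple sketch may help to fix ideas. The following is offered under assumptions only: the function names are hypothetical, and the weights used for weighted distribution are arbitrary illustrative values, since this description does not specify how the weights are chosen.

```python
from typing import Dict


def distribute_equally(load: float,
                       remaining: Dict[str, float]) -> Dict[str, float]:
    """Split `load` into equal shares and add one share to each remaining server."""
    share = load / len(remaining)
    return {server: current + share for server, current in remaining.items()}


def distribute_weighted(load: float,
                        remaining: Dict[str, float],
                        weights: Dict[str, float]) -> Dict[str, float]:
    """Split `load` in proportion to hypothetical per-server weights."""
    total_weight = sum(weights[server] for server in remaining)
    return {server: current + load * weights[server] / total_weight
            for server, current in remaining.items()}
```

In both cases the sum of the returned workloads equals the sum of the remaining workloads plus the redistributed workload, so no traffic is dropped by the redistribution itself.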
In step S4, the master module 100 repeats steps S1 to S3 until the overall workload is not greater than the summation of the maximum allowable workloads of the activated one(s) of the servers 200 and the current network workload of any one of the activated one(s) of the servers 200 is not lower than the dynamic parameter.
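The repetition of steps S1 to S3 in step S4 may be sketched, purely as an illustration, by the loop below. Several elements are assumptions introduced for the sketch and are not taken from this description: the `detect` callback stands in for the workload detection of step S1, servers are identified by hypothetical string names, the server with the smallest current workload is the one deactivated, and `max_rounds` is a safety bound added so that the loop always terminates.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for step S1: maps the activated servers to their workloads.
Detect = Callable[[List[str]], Dict[str, float]]


def autoscale(detect: Detect,
              active: List[str],
              standby: List[str],
              max_allowable: Dict[str, float],
              max_rounds: int = 10) -> List[str]:
    """Repeat steps S1 to S3 until neither scaling condition holds (step S4)."""
    active, standby = list(active), list(standby)
    for _ in range(max_rounds):
        loads = detect(active)                               # step S1
        overall = sum(loads.values())                        # step S2
        limit = sum(max_allowable[s] for s in active)
        average = overall / len(active)                      # dynamic parameter
        if overall > limit:                                  # step S3: activate
            if not standby:
                break                                        # nothing left to activate
            active.append(standby.pop(0))
        elif len(active) > 1 and any(v < average for v in loads.values()):
            # Step S3: deactivate the server with the smallest workload.
            selected = min(active, key=lambda s: loads[s])
            active.remove(selected)
            standby.insert(0, selected)
        else:
            break                                            # step S4: stable
    return active
```

The activation check is evaluated before the deactivation check so that an overloaded group of servers is never reduced further.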
In this example, the first server 10 is an activated server operating to provide the web service, and the second server 20 is in a power-saving mode, such as a sleep mode or a turned-off mode, and does not provide the web service (i.e., it is deactivated). The master module 100 detects a current network workload (L1) of the first server 10 as 9200. At this time, since the second server 20 does not provide the web service, the overall workload determined by the master module 100 is equal to the current network workload (L1) of 9200 of the first server 10, which is greater than the summation of the maximum allowable workload(s) of the activated server(s), i.e., the maximum allowable workload (T1_U) of 9000 of the first server 10. In view of this, the master module 100 activates the second server 20, which was not previously activated, to provide the web service. In this way, the current network workload of the first server 10 may be shared by the second server 20 and decreased below its maximum allowable workload (T1_U) of 9000. It should be noted that the number of the servers may vary in other embodiments of this disclosure.
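The arithmetic of this example may be checked with the short sketch below. The values 9200 and 9000 are those given above; the equal split of the workload between the first server 10 and the second server 20 after activation is an assumption made for illustration only, since the actual shares depend on the distribution scheme used.

```python
L1 = 9200      # current network workload of the first server 10
T1_U = 9000    # maximum allowable workload of the first server 10

# Only the first server 10 is activated, so the overall workload equals L1.
overall = sum([L1])
needs_activation = overall > T1_U   # activation condition of step S3

# Assumption: after the second server 20 is activated, the overall workload
# is shared equally between the two activated servers.
share_per_server = overall / 2
below_limit = share_per_server < T1_U

print(needs_activation, share_per_server, below_limit)
```

Under the equal-split assumption, each server carries 4600, which is below the maximum allowable workload (T1_U) of 9000.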
Referring to
It should be noted that the master module 100 may be one of the servers and may alternatively be an electronic device without web service functionality such as a virtual machine, and the disclosure is not limited to this aspect.
To sum up, in this disclosure, the method to be implemented by the master module 100 is capable of detecting the current network workloads of the activated server(s) and adjusting the number of the activated server(s) according to the overall workload, the dynamic parameter, and the maximum allowable workloads of the activated server(s), periodically or in real time. Thus, the present disclosure can achieve a relatively low operating cost while maintaining a reliable web service.
In the description above, for the purposes of explanation, numerous specific details have been set forth in order to provide a thorough understanding of the embodiment. It will be apparent, however, to one skilled in the art, that one or more other embodiments may be practiced without some of these specific details. It should also be appreciated that reference throughout this specification to “one embodiment,” “an embodiment,” an embodiment with an indication of an ordinal number and so forth means that a particular feature, structure, or characteristic may be included in the practice of the disclosure. It should be further appreciated that in the description, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of various inventive aspects.
While the disclosure has been described in connection with what is considered the exemplary embodiment, it is understood that this disclosure is not limited to the disclosed embodiment but is intended to cover various arrangements included within the spirit and scope of the broadest interpretation so as to encompass all such modifications and equivalent arrangements.
Number | Date | Country | Kind |
---|---|---|---|
104122574 | Jul 2015 | TW | national |