This application claims the priority of European Patent Application, Serial No. 16 201 857.6, filed Dec. 2, 2016, pursuant to 35 U.S.C. 119(a)-(d), the content of which is incorporated herein by reference in its entirety as if fully set forth herein.
The present invention relates to a method for connecting a client to a server in a communication system.
This communication system according to the common state of the art includes a network, in particular the public Internet, and a plurality of cluster servers in this network. The cluster servers, each having at least one public Internet address, are mutually connected and spread over the network at wide distances, in particular worldwide. Furthermore, the cluster servers are arranged in groups of one or multiple cluster servers. The latter are members of a server cluster which is defined by a cluster domain name usable for a domain name system name resolution (DNS name resolution). Finally, a plurality of clients are connectable to the network, of which at least one client, in the following called “connect client”, is to be connected to one of the cluster servers via the network.
To explain the background of the invention reference is made to the accompanying drawing
All of these servers are connected to the public Internet IN and have at least one public Internet address each, and so are configured in one or multiple Internet DNS servers reachable worldwide (DNS=Domain Name System according, e.g., to the standards RFC 882 and RFC 883).
There exists a domain name, for example “cluster1.com”, for which a DNS name resolution returns all Internet addresses of all the servers S1-S5 in the so-called geo-cluster. This set of addresses is also called a “Resource Record Set”. The Internet DNS servers may return all these Internet addresses in a fixed sequence, i.e. in the sequence as configured. DNS servers can also be configured to return the Internet addresses “round-robin”; the resulting order then varies from query to query and may appear random, depending on the implementation of the specific DNS server.
Each of the cluster servers S1-S5 can have one or multiple Internet addresses configured; at least one Internet address needs to be reachable by using the DNS name “cluster1.com”. In addition, all single Internet addresses of these cluster servers S1-S5 could be resolved by dedicated DNS names. So a particular server (for example cluster server S1), and only this server, could have the DNS name “server1.cluster1.com”.
Now when a user working with client CLI1 wants to ask any cluster server S1-S5 for services, using any software, for example a Web Browser, or any other software which can be used to connect to a server, he connects to the cluster CLU1, hereby addressing the cluster DNS name “cluster1.com”. Now, the software running on client CLI1 tries to get a connection to any server of cluster CLU1 and does a DNS request (Domain Name Resolution) asking the DNS servers in the well-known and normal way for the Internet addresses of the domain with the name of “cluster1.com”.
The DNS servers will return a list of network IP addresses, as so-called dotted Internet addresses, of all of these cluster servers S1-S5. The client software chooses one of these Internet addresses, normally the first one, and tries to establish a network connection. When this is successful, e.g. with cluster server S1, a corresponding connection CON1 is established. When this connection attempt fails, the client software tries to connect to one of the other Internet addresses returned by the DNS servers, mostly in a sequential way. When all connection attempts fail, the client software cannot connect, and in most cases gives an error message to the user and terminates.
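The sequential fallback described above can be sketched as follows. This is a minimal illustration in Python, not the implementation of any particular client software; the function names `resolve_cluster` and `connect_first_reachable`, the injectable `connect` parameter and the timeout value are assumptions introduced here.

```python
import socket


def resolve_cluster(name, port):
    """Return all (host, port) pairs the resolver reports for a cluster DNS name."""
    infos = socket.getaddrinfo(name, port, type=socket.SOCK_STREAM)
    # Each entry ends with a sockaddr tuple; keep only host and port.
    return [info[4][:2] for info in infos]


def connect_first_reachable(addresses, connect=socket.create_connection, timeout=3.0):
    """Try the addresses in the order given and return the first working connection."""
    last_error = None
    for address in addresses:
        try:
            return address, connect(address, timeout)
        except OSError as exc:
            # This server is not reachable; fall through to the next address.
            last_error = exc
    raise ConnectionError(f"no cluster server reachable: {last_error}")
```

A caller would typically pass `resolve_cluster("cluster1.com", 80)` as the address list. The `connect` parameter exists only so that the logic can be exercised without a real network.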
Now the problem underlying the connection procedure explained above is the fact that the established connection is generated more or less arbitrarily with one of the cluster servers S1-S5 without taking into consideration the question whether or not the most appropriate cluster server as concerns performance parameters is selected within the desired server cluster.
Accordingly, it is an object of the invention to provide a method for connecting a client to a server in a server cluster environment which ensures that a client is connected to an appropriate cluster server, leading to optimised performance conditions with regard to data transmission and data processing.
The object of the invention is achieved according to the invention by the following method steps:
The concept of the invention is based on the finding that, in a computer communication network with cluster servers and clients to be connected thereto, the performance of the connection depends, amongst other factors, on the physical distance between the client and the cluster server to which the client is connected. In particular within the worldwide Internet, with the multiple locations spread over the world where cluster servers for one and the same cluster are located, the occurrence of long-distance connections increases and thus aggravates performance problems. This is avoided by integrating the distance between client and cluster server into the connection establishment. As concerns the relocation of the first connection to a second connection according to the last method step of the invention, it is to be noted that, of course, when the first connection is determined as being the nearest connection between the connect client and the connected cluster server, the relocation step is suppressed.
According to a preferred embodiment of the invention the current status and/or load value of the nearest cluster server is included as additional input parameter when initiating the relocation of the first connection of the connect client to the second connection. Thus the basic concept of the invention is optimised inasmuch as the per se known concept of load balancing is integrated into the invention.
In more detail the method according to the invention may include the steps that in case that two or more cluster servers are present within one group of cluster servers at a same geographic location the first connection is relocated to an active cluster server within this group with the least load.
The relocation of the first connection due to the foregoing conditions is provided by a “MOVED” process which is defined by the Hypertext Transfer Protocol HTTP.
According to another preferred embodiment of the invention the current status and/or load value of each of the cluster servers are exchanged as input parameters between all cluster servers in defined time intervals. Due to this measure all cluster servers are automatically updated with the parameters necessary for establishing an optimal connection between the cluster servers of one cluster and a connect client.
In further optimising the invention, an active status of the cluster servers can be determined by checking an active network connection of each cluster server. Thus it is possible, in a reliable and simple way, for each cluster server to obtain information about the status of all the other cluster servers within the cluster. Furthermore, the load value of the cluster servers is represented by at least one determining factor such as the number of currently connected users, the load of the CPU, the memory usage, the swap activity or the network usage of each cluster server.
According to a further preferred embodiment of the invention each cluster server gets the network IP addresses of all other cluster servers of the server cluster by, preferably automatically, sending a DNS query to an appropriate DNS-server and receiving responses for all other cluster servers containing all their registered network IP addresses. By this measure the invention advantageously uses the well-established DNS query.
Each of the cluster servers may further be configured with the name of the group of which it is a member. This helps to optimise the organisation of the process steps of the present invention.
According to a further preferred embodiment the position data representative for the geographic location of each cluster server and/or client are defined by the usual geographic coordinates of the geographic location of each cluster server. Thus the invention works with reliable and easily available position data.
In case the connect client is not directly integrated into a public network, but is part of a private network which is secured by a firewall and a corresponding router, the position data of this router may be used in the context of the invention as position data representative of the geographic location of the client. Alternatively, it may sometimes be convenient or sufficient to use the position data of an Internet service provider (ISP) server used by the client, as by this measure the calculation effort for finding the optimum cluster server may be decreased. In fact, as a certain ISP server of course handles a plurality of clients, the geographical positions of the latter, although located at different but usually not too distant places, are for the sake of ease represented by the geographical position of their ISP server.
Finally, according to another preferred embodiment of the invention, it is proposed that for connecting the connect client to the most appropriate cluster server a combined load balancing between the cluster servers is performed on the basis of an evaluation formula including as variables at least the load value of each cluster server and the calculated distance between each cluster server and the connect client. Due to this measure, load balancing can be done with a set of rules, where different parameters like CPU usage, main memory utilization or swap activity of a server are combined by means of the formula. In this combination, the distance between the client and a certain cluster group is also used in calculating which server the client should connect to. In this way, a certain load balancing between the cluster groups is performed.
The evaluation formula may not be fixed, but variably configurable by a cluster administrator, wherein normal mathematical operations can be used, like add, multiply and so on. Brackets and other mathematical functions can be used like logarithm, square-root and so on. The result of this calculation leads to a certain value as decision basis for the selection of a certain cluster server. The higher the value returned by calculating this formula, the higher the load. The server with the least load is selected.
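One way to make such a formula configurable without executing arbitrary administrator input is to parse it and evaluate only an allowed set of operations. The following Python sketch is an illustration under assumptions: the function name `evaluate_formula`, the permitted operators and functions, and the underscore-style variable names (a hyphenated name such as CPU-load would be parsed as a subtraction) are choices made here and are not specified by the description.

```python
import ast
import math
import operator

# Operations and functions the hypothetical administrator may use.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_FUNCTIONS = {"log": math.log, "sqrt": math.sqrt}


def evaluate_formula(formula, variables):
    """Evaluate an arithmetic formula string using only whitelisted constructs."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in _FUNCTIONS
            and not node.keywords
        ):
            return _FUNCTIONS[node.func.id](*[walk(arg) for arg in node.args])
        raise ValueError("construct not allowed in formula")

    return float(walk(ast.parse(formula, mode="eval")))
```

Brackets need no special handling because the parser already reflects them in the expression tree. The server for which the evaluation returns the lowest value would then be selected.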
Each cluster server may preferably get knowledge of the network IP addresses of all other cluster servers of the server cluster by means of configuration data provided with configuring the respective cluster server or by using DNS—domain name system—on the basis of the cluster server name.
According to a further preferred embodiment of the invention, a cluster server having a load which is higher than a configured threshold value is treated as overloaded and excluded from being connected to the connect client. This helps to optimize the performance of such a client-server communication system.
Finally, it is advantageous when said overloaded cluster server is blocked as a target for said “MOVED” process. This is a reliable, effective and simple method to exclude this overloaded cluster server from being connected to the connect client.
Further features, details and advantages of the invention become apparent from the following description of preferred embodiments of the invention on the basis of the enclosed drawing.
As already explained above the client-server-communication system includes a network IN, namely the public Internet in which a plurality of cluster servers S1-S5 are mutually connected and spread over the network IN in different places and thus located apart over wide distances like e.g. on different continents of the earth. The cluster servers S1-S5 further on are gathered under a cluster CLU1 having a defined domain name, like “cluster1.com” which is adequate for DNS name resolution. Further on all network IP addresses of all the servers S1-S5 in the so-called geo-cluster CLU1 are returned in the so-called “Resource Record Set”.
Furthermore, the cluster servers S1-S5 are arranged in groups, namely cluster servers S1-S3 in group G1 and cluster servers S4-S5 in group G2. The groups G1 and G2 combine cluster servers S1-S3 and S4-S5, respectively, which are geographically close to each other, which can mean e.g. located in a certain country or continent, for example group G1 in North America and group G2 in Europe.
Now in the network IN a plurality of clients are located, one client CLI1 is depicted in
Now an example for establishing an optimised connection between this client CLI1 and one of those servers S1-S5 is explained in the following together with the technical background:
All the servers S1-S5 are connected with each other (each with all of the others of all groups), typically using the TCP protocol, maybe encrypted, but another network protocol (Internet Protocol) could also be used. Each Cluster server S1-S5 can get the Internet Addresses of all other cluster servers S1-S5 by sending a DNS query to a DNS server DNSS—e.g. cluster server S2, as shown in
As each server S1-S5 has connections to all other up and running servers S1-S5, a certain server S1-S5 can find out whether any one of the other servers S1-S5 is still up and running, simply by checking whether there is an active network connection.
Each of the servers S1-S5, as part of its configuration, has a database which contains all Internet addresses worldwide, IPv4 and IPv6, together with their location in the world, normally as geographic coordinates. Each of these servers S1-S5 is also configured with the name of the group G1, G2 it is a member of, and with the geographic coordinates of the group location of all cluster servers S1-S5.
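The address-to-location lookup can be pictured with the following Python sketch. It is purely illustrative: the table `GEO_TABLE`, its address ranges (reserved documentation ranges) and its coordinates are made-up placeholders, and `lookup_coordinates` is a hypothetical name. A real database of this kind would be far larger and organised for fast searching.

```python
import ipaddress

# Hypothetical excerpt of an address-to-coordinates database:
# (network, (latitude, longitude)). All values are placeholders.
GEO_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), (40.7, -74.0)),
    (ipaddress.ip_network("198.51.100.0/24"), (50.1, 8.7)),
    (ipaddress.ip_network("2001:db8::/32"), (50.1, 8.7)),
]


def lookup_coordinates(address, table=GEO_TABLE):
    """Return the coordinates of the most specific network containing the address."""
    ip = ipaddress.ip_address(address)
    best = None
    for network, coords in table:
        # Compare only networks of the same IP version (IPv4 or IPv6).
        if ip.version == network.version and ip in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, coords)
    return None if best is None else best[1]
```

Returning `None` for an unknown address leaves it to the caller to decide how to proceed when no location is available, for example by keeping the existing connection.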
When one of these connection attempts between all cluster servers S1-S5 did succeed, the server, e.g. server S1, which is the cluster member where the client CLI1 is connected to, automatically knows the Internet address of the connecting client CLI1. In some cases—as is outlined in dashed lines in
There may be a certain difference between the location of the real client CLI1 and the location of the ISP, but normally this does not matter. Now the server S1 calculates the distance on earth between the coordinates returned by the database lookup (i.e. those of the client CLI1 or of the ISP) and the coordinates of each of the cluster groups G1, G2 or their cluster servers S1-S5. The client CLI1 will have the best networking experience when being connected to the nearest cluster server, so the cluster server S1 checks which cluster group or member is nearest to the client, is up and running, and is not overloaded. When there are multiple cluster members in the same location (which each server can find out by comparing the entries), the server chooses the one with the least load that is up and running.
In this way, a certain load balancing is done. When the server S1 finds out that the client CLI1 is currently not connected to the nearest cluster member, it sends back to the client a so-called MOVED reply. In HTTP (Hypertext Transfer Protocol, RFC 1945 etc.) this is defined as a status code of 3xx. The RFC states: “3xx: Redirection - Further action must be taken in order to complete the request.”
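The distance calculation and the selection rule described above can be sketched in Python as follows. The description does not prescribe a particular distance formula; the haversine great-circle formula is used here as one common choice. The `Server` record, its field names and the function names are assumptions made for illustration.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def great_circle_km(a, b):
    """Haversine distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    # Clamp to 1.0 to guard against rounding errors before taking asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


@dataclass
class Server:
    name: str
    coords: tuple          # (latitude, longitude) of the server's group location
    load: float            # current load value
    active: bool = True    # up and running
    overloaded: bool = False


def choose_server(client_coords, servers):
    """Pick the nearest usable server; among equally near ones, the least loaded."""
    candidates = [s for s in servers if s.active and not s.overloaded]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda s: (great_circle_km(client_coords, s.coords), s.load),
    )
```

Servers of one group share the same coordinates and therefore the same distance, so the load value decides between them. If the chosen server is the one the client is already connected to, no relocation is needed.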
Together with this message MOVED, it sends the dotted (binary) Internet Address or a known DNS name valid only for this particular server back to the client. This could be “server4.cluster1.com”.
The HTTP header field which carries the Internet address of the new server S4 is called ‘Location’. The current connection CON1 is then terminated by either side, and the client connects to the server S4, whose address (dotted or DNS name) it received in the MOVED reply. Thus, with what is called here the “MOVED process”, the better connection CON2 is established, with less distance between the connect client CLI1 and the cluster server S4 and possibly less load on the server S4 compared to server S1.
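For the HTTP case, a MOVED reply and its handling on the client side might look like the following Python sketch. The description only specifies a 3xx status code and the Location header; the concrete status code 302, the reason phrase, the remaining header fields and the function names are assumptions chosen for illustration.

```python
def build_moved_reply(target, path="/"):
    """Build a minimal HTTP redirect that points the client at another server."""
    lines = [
        "HTTP/1.0 302 Moved Temporarily",   # one possible 3xx status code
        f"Location: http://{target}{path}",  # address or DNS name of the new server
        "Content-Length: 0",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_location(reply):
    """Return the Location value of a 3xx reply, or None for any other reply."""
    head = reply.split(b"\r\n\r\n", 1)[0].decode("ascii")
    status_line, *headers = head.split("\r\n")
    code = int(status_line.split()[1])
    if not 300 <= code < 400:
        return None
    for header in headers:
        name, _, value = header.partition(":")
        if name.strip().lower() == "location":
            return value.strip()
    return None
```

For example, `build_moved_reply("server4.cluster1.com")` produces a reply from which the client can extract the new target with `parse_location` before closing the current connection and reconnecting.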
MOVED is implemented in the HTTP protocol (Status Code 3xx), but it can also in a simple way be implemented in every other networking protocol or software.
It is also possible to have a certain load balancing between the cluster groups G1, G2. This applies not only when a certain cluster group G1, G2 is overloaded, but also when additional configuration provides that the client CLI1 is MOVED to another (not the nearest) cluster group G1, where the distance between the client CLI1 and the cluster server S1 is longer than the shortest distance from the client CLI1 to one of the servers, such as cluster server S4.
Load balancing can be done with a set of rules, where different parameters like CPU usage, main memory utilization or swap activity of a server are combined. In this combination, the distance between the client CLI1 and a certain cluster group G1, G2 can also be used in calculating which cluster server S1-S5 the client CLI1 should connect to. In this way, a certain load balancing between the cluster groups is done.
This combination of the different load parameters with the distance can be done by a formula—see FORM below. This formula FORM is not fixed, but configurable by the cluster administrator.
In the formula FORM, normal mathematical operations can be used, like add, multiply and so on. Brackets can also be used. Furthermore, mathematical functions like logarithm, square root and so on can be used in this formula FORM.
The higher the value returned by calculating this formula FORM, the higher the load of the cluster server being evaluated. Within the concept of this invention, the server with the least calculated load is selected.
An example of such a formula FORM including the distance is
load = distance + 100 * CPU-load + log(memory-in-use)   (FORM)
The variables in this example are:
CPUs (cores) normally execute machine instructions at a certain speed, depending on the clock frequency. Newer CPUs vary the speed, sometimes running faster (for a short time only), sometimes slower. It remains true, however, that a CPU/core is either executing instructions or is in the wait state, from which it can only be awoken by an interrupt. The CPU-load percentage is the share of the total time during which a CPU is busy executing instructions rather than idle. This percentage of CPU load is always calculated over short time intervals, for example a few seconds.
Different vendors count the load of a total machine with multiple cores differently. Some vendors say that a machine with eight cores, with all cores busy all the time, has 800% load. Other vendors divide this number by the number of cores, so that the maximum load is still 100%.
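The two counting conventions and the example formula FORM can be combined as in the following Python sketch. The units are assumptions made here, since they depend on how the variables are defined and measured: the distance is taken as a plain number, the CPU load as a value normalised to a maximum of 100 percent and then divided by 100, and the memory in use as a positive number. The function names are hypothetical.

```python
import math


def normalised_cpu_percent(reported_percent, cores, summed_over_cores):
    """Bring a reported CPU load onto a 0..100 scale regardless of vendor convention."""
    if summed_over_cores:
        # Convention where eight fully busy cores are reported as 800%.
        return reported_percent / cores
    # Convention where the maximum is already 100%.
    return reported_percent


def form_load(distance, cpu_load, memory_in_use):
    """Example formula FORM: load = distance + 100 * CPU-load + log(memory-in-use)."""
    return distance + 100 * cpu_load + math.log(memory_in_use)
```

With these assumptions, a machine reported at 800% on eight cores and one reported at 100% under the other convention yield the same normalised value, so their FORM results stay comparable across vendors.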
Now, with a formula exclusively for the load, the calculated load can be compared to a certain threshold value (a higher number). The load calculated in this case uses a formula which does not contain the distance between the client and the server. When the calculated load is higher than the threshold value configured by the administrator, this server can be treated as overloaded. An overloaded server will then no longer be used as a target for “MOVED”.
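The threshold check can be sketched in Python as follows. The load-only formula shown is merely the example formula FORM with the distance term removed, which is an assumption; the description leaves the actual load formula and the threshold value to the administrator. The function names are hypothetical.

```python
import math


def load_only(cpu_load, memory_in_use):
    """Hypothetical load formula without the distance term."""
    return 100 * cpu_load + math.log(memory_in_use)


def moved_targets(loads, threshold):
    """Return the names of servers that may still be used as a MOVED target.

    `loads` maps a server name to its calculated load; a server counts as
    overloaded only when its load is strictly higher than the threshold.
    """
    return [name for name, load in loads.items() if load <= threshold]
```

For example, with a threshold of 80, a server whose calculated load is 95 is left out of the list, while a server with a load of exactly 80 remains eligible.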
Number | Date | Country | Kind |
---|---|---|---|
16 201 857.6 | Dec 2016 | EP | regional |