A virtual machine is a self-contained execution environment that behaves as if it were a separate computer and that can run its own operating system. Virtual machines provide Virtual CPUs (VCPUs) to clients or “guests”, and each VCPU runs on a dedicated physical CPU. A VCPU is a representation of a physical processor within a virtual machine. In conventional systems, the mapping between virtual and physical CPUs is static.
If multiple CPUs are available to a client or “guest”, the guest tasks are spread between the CPUs. This is preferably done such that the available resources are used in the most efficient way and computing time is decreased. This process is generally referred to as “load balancing”.
Conventional load balancing algorithms may be insufficient. Consider, for example, the sharing of a plurality of physical CPUs between dedicated real-time software and generic server software. Assume a UP (uniprocessor) execution environment (e.g. LINUX) running the real-time software on CPU 0, and an SMP (symmetric multiprocessing) execution environment (e.g. LINUX) running server software on CPUs 0-3. In this example, CPU 0 is shared between the real-time software and the server software, and the dedicated real-time software has the higher scheduling priority. The SMP load balancer does not take into account the real-time activity on CPU 0, which may skew the SMP load balancing. The present invention aims to address this and other problems of conventional load balancing. In particular, but not exclusively, the present invention is concerned with better balancing the load of physical CPUs in a computer system comprising physical and virtual CPUs.
The invention is recited by the independent claims. Preferred features are recited by the dependent claims.
The present invention is based on the realisation that the overall performance of a system as shown in
VCPUx→CPUa and VCPUy→CPUb (1)
VCPUx→CPUb and VCPUy→CPUa (2)
The scheduler is able to dynamically choose one of these mappings depending on CPU loads. The mapping switch results in a “swapping” of VCPUs, i.e. in two VCPUs migrating from one physical CPU to another. Such an operation is fully transparent for the guest and does not change a fixed physical set of CPUs assigned to the guest.
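The mapping switch described above can be illustrated with a minimal sketch. The dictionary representation and the function name `swap_vcpus` are assumptions made for illustration only; they are not taken from the described embodiment.

```python
def swap_vcpus(mapping, vcpu_x, vcpu_y):
    """Return a new VCPU-to-CPU mapping with the two VCPUs exchanged.

    Hypothetical helper: the mapping is modelled as a plain dict.
    """
    swapped = dict(mapping)
    swapped[vcpu_x], swapped[vcpu_y] = mapping[vcpu_y], mapping[vcpu_x]
    return swapped


# Mapping (1): VCPUx -> CPUa and VCPUy -> CPUb
mapping_1 = {"VCPUx": "CPUa", "VCPUy": "CPUb"}

# Mapping (2): VCPUx -> CPUb and VCPUy -> CPUa
mapping_2 = swap_vcpus(mapping_1, "VCPUx", "VCPUy")
```

The set of physical CPUs used by the guest is the same before and after the swap; only the assignment of VCPUs to those CPUs changes.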
By implementing such a load balancing mechanism, it is possible to at least partially resolve the above-described SMP load balancing skewing problem by migrating a server VCPU running on CPU0, for example, to another CPU when the real-time activity is high and this VCPU is loaded. In order to improve the overall performance by such a migration, an underloaded CPUn (n>0, in this example) must be found and a VCPU running on CPUn must migrate to CPU0.
This solution is partial only in that it does not work when the system is heavily loaded, i.e. when all physical CPUs are fully loaded. However, as such a situation is rare in practice, it is acceptable.
Periodically, at a given time interval, the scheduler calculates the load of each physical CPU (LOADn, n=0 . . . N) and of all VCPUs running on each physical CPU (VLOADn,m, m=0 . . . M). More particularly, two loads are calculated for each VCPU: a “positive” load and a “negative” load.
The positive load for a given VCPUn,m is equal to the actual VCPU load:
VPLOADn,m=VLOADn,m
The negative load for a given VCPUn,m is equal to a sum of the loads of all other VCPUs running on the same physical CPU:
VNLOADn,m=ΣVLOADn,i i=0 . . . M, i≠m
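As a minimal sketch of the two definitions above, assuming the loads of the VCPUs on one physical CPU are held in a list indexed by m (the function names and the sample values are hypothetical):

```python
def positive_load(vloads, m):
    """VPLOADn,m: the actual load of VCPU m on this physical CPU."""
    return vloads[m]


def negative_load(vloads, m):
    """VNLOADn,m: the summed load of every other VCPU on the same physical CPU."""
    return sum(load for i, load in enumerate(vloads) if i != m)


# Illustrative loads (in percent) of three VCPUs sharing one physical CPU.
vloads_cpu0 = [50, 30, 10]
vpload = positive_load(vloads_cpu0, 0)
vnload = negative_load(vloads_cpu0, 0)
```

With these sample values, VCPU 0 has a positive load of 50 and a negative load of 40.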
A physical CPU is considered overloaded if its load is above a predetermined threshold:
LOADn≧LOADover
A physical CPU is considered underloaded if its load is below a predetermined threshold:
LOADn≦LOADunder
Load balancing is only applied to a pair of CPUs in which one CPU is overloaded and the other CPU is underloaded:
CPUi, CPUj
where
LOADi≧LOADover and LOADj≦LOADunder
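The pair selection can be sketched as follows. The threshold values and the function name `candidate_pairs` are assumptions chosen for illustration; the text does not specify them.

```python
LOAD_OVER = 90   # hypothetical overload threshold, in percent
LOAD_UNDER = 50  # hypothetical underload threshold, in percent


def candidate_pairs(loads):
    """Return (i, j) index pairs where CPU i is overloaded and CPU j is underloaded."""
    overloaded = [i for i, load in enumerate(loads) if load >= LOAD_OVER]
    underloaded = [j for j, load in enumerate(loads) if load <= LOAD_UNDER]
    return [(i, j) for i in overloaded for j in underloaded]


# Illustrative loads of four physical CPUs.
pairs = candidate_pairs([95, 70, 20, 40])
```

CPUs whose load lies between the two thresholds take no part in the balancing.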
The load balancing comprises finding two unbalanced VCPUs of the same SMP guest running on CPUi and CPUj such that:
VPLOADi,k>VPLOADj,l
and swapping these VCPUs across physical CPUs.
Because a VCPU migration is a rather expensive operation in terms of processing power, the migration criterion is adjusted by introducing a positive migration threshold:
VPLOADi,k−VPLOADj,l≧MIGR_POS_WTMARK
In addition, the migration criterion takes into account the negative load of the overloaded emigrant:
VNLOADi,k≧MIGR_NEG_WTMARK
The negative load water mark avoids unnecessary migrations when the CPU overloading is not caused by a simultaneous activity of multiple guests, but rather by a single guest monopolizing the physical CPU.
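The combined migration test can be sketched as below. This sketch assumes that the positive criterion compares the difference between the two positive loads against the water mark; the water mark values and the function name `should_swap` are hypothetical.

```python
MIGR_POS_WTMARK = 20  # hypothetical positive migration water mark, in percent
MIGR_NEG_WTMARK = 10  # hypothetical negative migration water mark, in percent


def should_swap(vpload_i_k, vpload_j_l, vnload_i_k):
    """Decide whether VCPU k on overloaded CPU i should swap with VCPU l on CPU j.

    Assumption: the positive criterion is read as a load difference.
    """
    gain_is_large_enough = vpload_i_k - vpload_j_l >= MIGR_POS_WTMARK
    other_guests_are_active = vnload_i_k >= MIGR_NEG_WTMARK
    return gain_is_large_enough and other_guests_are_active


swap_shared_cpu = should_swap(60, 10, 35)   # other VCPUs compete for CPU i
swap_small_gain = should_swap(60, 50, 35)   # load difference below the water mark
swap_monopolized = should_swap(95, 5, 0)    # a single guest monopolizes CPU i
```

The third case shows the effect of the negative water mark: although the load difference is large, no other VCPU is active on the overloaded CPU, so a swap would not relieve any contention.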
A mymips program has been used to demonstrate skewing in the SMP load balancing of a guest operating system. The mymips program continuously calculates the program execution speed (MIPS) and prints the calculation results on the console.
mymips provides the following result when running on an SMP Linux guest with a dedicated single physical CPU:
The results above and below were obtained on a DELL D820 Dual Core 1.8 GHz Laptop.
Two mymips programs provide the following result when running on an SMP Linux with two dedicated CPUs:
A basic configuration which can be used to implement the load balancing mechanism in accordance with an embodiment of the invention comprises two SMP Linux guests sharing two physical CPUs. In order to obtain an unbalanced load on such a configuration, the guests were run on a conventional system without a load balancing mechanism. Two mymips programs running simultaneously on each guest provide the following results:
This shows a performance hit of about 25% compared to a single SMP Linux which performs load balancing (at OS level) across multiple CPUs. The performance hit is due to sporadic mymips migrations from one CPU to another. Such migrations randomly place both mymips programs on the same processor.
This practical result is fully in line with a theoretical determination. Because of the random nature of the migrations, the probability of running both mymips programs on the same CPU is 0.5. Thus, the expected performance hit is 0.25, because when two programs run on the same CPU, only half of the CPU power is available to each.
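The arithmetic behind this estimate can be written out as a short sketch. The variable names are illustrative; the two input values are the ones stated in the paragraph above.

```python
from fractions import Fraction

# Probability that both mymips programs end up on the same physical CPU.
p_same_cpu = Fraction(1, 2)

# Fraction of CPU power lost by each program when the two share one CPU.
loss_when_shared = Fraction(1, 2)

# Expected performance hit: no loss when the programs run on separate CPUs.
expected_hit = p_same_cpu * loss_when_shared + (1 - p_same_cpu) * 0
```

The product is 1/4, which corresponds to the 25% figure observed in the measurement.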
When running the same load with load balancing enabled, the performance is close to a single SMP Linux.
The load balancing compensates for sporadic migrations of mymips (from one VCPU to another) caused by the Linux SMP scheduler. In other words, the scheduler tries to execute heavily loaded VCPUs on different physical CPUs.
In order to confirm the above theoretical conclusion, a Linux kernel compilation was used as a variable load. Two compilations were running in parallel on two Linux guests in the following three configurations:
Each time, the duration of compilation was measured. Results corresponding to different Linux kernel compilations are shown below.
(1.1) 11m21.046s
(2.1) 16m4.272s
(2.2) 12m20.575s
(3.1) 13m51.974s
(3.2) 10m32.467s
The performance hit on the system without load balancing (2) is about 40%, while the performance hit on the system with load balancing (3) is about 20%. Accordingly, the load balancing reduces the performance degradation caused by transparent CPU sharing among multiple SMP guests.
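The quoted percentages can be checked against the measured durations with a short sketch. It assumes that each hit is taken from the slower of the two compilations in a configuration, relative to result (1.1); that choice and the helper name `to_seconds` are assumptions, not statements from the text.

```python
def to_seconds(stamp):
    """Convert a duration such as '11m21.046s' into seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


baseline = to_seconds("11m21.046s")      # result (1.1)
without_lb = to_seconds("16m4.272s")     # result (2.1), slower run of configuration (2)
with_lb = to_seconds("13m51.974s")       # result (3.1), slower run of configuration (3)

# Relative slowdown with respect to the baseline.
hit_without_lb = without_lb / baseline - 1
hit_with_lb = with_lb / baseline - 1
```

Under this assumption the slowdowns come out slightly above 0.41 and 0.22 respectively, consistent with the rounded figures of about 40% and about 20%.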
The above results were obtained using the following load balancing parameters:
PERIOD=10 milliseconds
It may be possible to achieve even better results on a variable load by modifying these parameters.
It will be clear from the foregoing that the above-described embodiments are only examples, and that other embodiments are possible and included within the scope of the invention as determined from the claims.
| Number | Date | Country | Kind |
|---|---|---|---|
| 07290793.4 | Jun 2007 | EP | regional |