This application claims the benefit of Korean Patent Application No. 10-2013-0071828, filed on Jun. 21, 2013, which is hereby incorporated by reference in its entirety into this application.
1. Technical Field
The present invention relates generally to a method and apparatus for recovering the failed disk of a virtual machine and, more specifically, to a method and apparatus that are capable of maintaining the continuity of virtualization service, thereby recovering a failed disk while ensuring the performance of virtual machines.
2. Description of the Related Art
The term “virtualization” refers to a technology that enables a plurality of operating systems to run on a single physical server. Each of these operating systems is called a virtual machine. The advantages of virtualization include the separation of the execution environments of virtual machines, increased server utilization, convenient management of the resources of virtual machines, and stability that is not affected by errors in individual virtual machines.
Because of these advantages, virtualization is adopted in many company environments. In particular, Internet Data Centers (IDCs) and various types of portal companies in which clusters have been constructed using cheap computers have been highly interested in virtualization. Such companies attempt to run many low-performance computers as virtual machines on a high-performance server. This task is called server consolidation. To construct a virtual infrastructure in which tasks performed by existing non-virtual servers are taken over by virtual servers installed on a small number of physical servers, a physical node on which the virtual machines will be generated and disks for those virtual machines are needed.
Conventional technologies for overcoming the failure of a virtual machine include Microsoft Exchange Server by Microsoft, XenApp by Citrix, vSphere by VMware, and onQ by Quorum. Improving recovery speed by storing copies of the disk images of virtual machines in a central or distributed storage server connected over a network, and recovering the original from the backup copy when a virtual machine fails, requires an expensive backup server and expensive network equipment.
U.S. Pat. No. 7,933,987 entitled “Application of Virtual Servers to High Availability and Disaster Recovery Solutions” discloses server virtualization technology. This technology is problematic in that it is difficult to overcome a real-time failure situation (e.g., within several ms) on a high-capacity virtual disk (e.g., having a capacity of several tens of gigabytes or more) and the recovery of the disk of a specific virtual machine may affect the operating speed of other virtual machines that run on the same virtualization server.
Accordingly, the present invention has been made keeping in mind the above problems occurring in the conventional art, and an object of the present invention is to provide a method and apparatus for recovering the failed disk of a virtual machine, which are capable of ensuring the continuity of virtualization service in a server virtualization environment.
Another object of the present invention is to provide a method and apparatus for recovering the failed disk of a virtual machine, which are capable of ensuring the performance of virtual machines.
Yet another object of the present invention is to provide a method and apparatus for scheduling resources that are used to recover the failed disk of a virtual machine.
Further yet another object of the present invention is to provide a method and apparatus for recovering the failed disk of a virtual machine based on a remote storage device.
In accordance with an aspect of the present invention, there is provided a method of recovering the failed disk of a virtual machine in a virtualization system, the method including calculating the total resources of the virtualization system, that is, network and disk I/O resources; calculating operating resources used to drive the virtualization system; calculating use resources corresponding to the amount of the network and disk I/O resources used; calculating recovery resources, that is, network and disk I/O bandwidths capable of being assigned to failure recovery without disturbing the performance of other virtual machines based on the total resources, the operating resources and the use resources; and performing recovery of a failed disk by recovering a copy disk, that is, a copy of the failed disk, in a stream manner using a mandatory disk stored in the virtualization system based on the recovery resources.
Performing the recovery of the failed disk may include deleting the failed disk and assigning the recovered copy disk to a virtual machine corresponding to the failed disk.
Performing the recovery of the failed disk may include recovering the failed disk by copying a copy disk, that is, a copy of a local mandatory disk stored in a local storage device, in a local stream manner using the local mandatory disk.
Performing the recovery of the failed disk may include recovering the failed disk by copying a copy disk, that is, a copy of a remote mandatory disk stored in a remote storage device, in a remote stream manner using the remote mandatory disk.
The recovery resources may be the remaining resources of the total resources other than the operating resources and the use resources.
Performing the recovery of the failed disk may be stopped if the recovery resources have not been assigned.
Performing the recovery of the failed disk may include providing all recovery tasks with assignment resources to which the recovery resources have been equally assigned if the recovery resources have been assigned; dividing the mandatory disk into a plurality of blocks; and performing recovery on each block section formed of each of the blocks based on the assignment resources.
The assignment resources may include idle resources in which the performance of the recovery is stopped.
The idle resources may be assigned based on network or disk I/O resource performed in a block section before the former block section.
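By way of illustration only, one possible reading of the idle resources is a pause inserted after each block section so that the average rate of a recovery task does not exceed its assignment resources. The following Python sketch rests on that assumption; the function name `idle_time`, its parameters, and the use of bytes and seconds as units are hypothetical and do not appear in the original disclosure.

```python
def idle_time(block_bytes, elapsed_seconds, assigned_bw):
    """Return how long (in seconds) a recovery task might pause after a block section.

    block_bytes:     bytes transferred in the block section just completed
    elapsed_seconds: time that block section actually took
    assigned_bw:     assignment resources for this task, in bytes per second

    Assumption: the pause brings the average rate of the completed block
    section down to assigned_bw; a block that was already slow gets no pause.
    """
    if assigned_bw <= 0:
        raise ValueError("assigned_bw must be positive")
    target_seconds = block_bytes / assigned_bw
    return max(0.0, target_seconds - elapsed_seconds)
```

Under this reading, a block section that finishes faster than the assigned bandwidth allows is followed by a proportionally longer pause, which returns the unused network or disk I/O capacity to the other virtual machines.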
Performing the recovery of the failed disk may include performing the recovery of the failed disk while periodically calculating the use resources and the recovery resources.
In accordance with another aspect of the present invention, there is provided an apparatus for recovering the failed disk of a virtual machine in a virtualization system, the apparatus including a system performance analysis unit configured to calculate recovery resources, that is, network and disk I/O bandwidths, to be assigned to the recovery of a failed disk by analyzing the performance of the virtualization system; a failed disk recovery unit configured to perform the recovery of the failed disk by recovering a copy disk, that is, a copy of the failed disk, using a mandatory disk stored in the virtualization system while ensuring the performance of virtual machines based on the recovery resources; and a disk exchange unit configured to delete the failed disk and assign the recovered copy disk to a virtual machine corresponding to the failed disk.
The system performance analysis unit may include a total resource calculation unit configured to calculate total resources, that is, total network and disk I/O resources of the virtualization system; an operating resource calculation unit configured to calculate operating resources used to drive the virtualization system; a use resource calculation unit configured to calculate use resources, that is, the amount of the network and disk I/O resources used; and a recovery resource calculation unit configured to calculate recovery resources, that is, network and disk I/O bandwidths capable of being assigned to failure recovery without disturbing the performance of other virtual machines based on the total resources, the operating resources and the use resources.
The failed disk recovery unit may include a local stream recovery unit configured to recover a copy disk, that is, a copy of a local mandatory disk stored in a local storage device, by copying the copy disk in a local stream manner using the local mandatory disk; and a remote stream recovery unit for recovering a copy disk, that is, a copy of a remote mandatory disk stored in a remote storage device, by copying the copy disk in a remote stream manner using the remote mandatory disk.
The recovery resources may be the remaining resources of the total resources other than the operating resources and the use resources.
The failed disk recovery unit may be stopped if the recovery resources have not been assigned.
The failed disk recovery unit may resume operation if the recovery resources have been assigned.
The local stream recovery unit and the remote stream recovery unit may include an assignment unit configured to provide all recovery tasks with assignment resources to which the recovery resources have been equally assigned; a division unit configured to divide the mandatory disk into a plurality of blocks; and a performance unit configured to perform recovery on each block section formed of each of the blocks based on the assignment resources.
The assignment resources may include idle resources in which the performance of the recovery is stopped.
The idle resources may be assigned based on network or disk I/O resource performed in a block section before the former block section.
The system performance analysis unit may periodically calculate the recovery resources.
The above and other objects, features and advantages of the present invention will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings, in which:
Embodiments of the present invention are described in detail below with reference to the accompanying drawings. In the following description of the present invention, repetitive descriptions and detailed descriptions of known functions and configurations which are deemed to make the gist of the present invention obscure are omitted.
The typical configuration of a disk assignment and recovery system for virtual machines in a server virtualization environment is described below.
Referring to
The present invention proposes an apparatus 300 and method for recovering the failed disk of a virtual machine when any one of the disks 1, 2 and 3 410, 420 and 430 assigned to the virtual machines 1, 2 and 3 100, 110 and 120 fails. In the following description of the present invention, it is assumed that the disk 1 410 assigned to the virtual machine 1 100 has failed. Accordingly, it is assumed that the disk 2 420 and the disk 3 430 assigned to the virtual machine 2 110 and the virtual machine 3 120 normally operate.
The apparatus 300 for recovering the failed disk of a virtual machine and the method of recovering the failed disk of a virtual machine according to an embodiment of the present invention perform scheduling so that, when the failed disk 1 410 assigned to the virtual machine 1 100 is recovered, the performance of the virtual machine 2 110 and the virtual machine 3 120, which have not failed, does not deteriorate.
An operation of recovering the failed disk of a virtual machine according to an embodiment of the present invention is described below.
Referring to
If the use disk 411 assigned to the virtual machine 1 100 has failed and thus become the failed disk 411, the copy disk 412 is recovered using the mandatory disk 1 413 stored in the local storage device 400 or the mandatory disk 2 510 stored in the remote storage device 500 in order to recover the failed disk 411. More specifically, when the copy disk 412 is recovered, a method of recovering the copy disk 412 in a stream manner is adopted. The mandatory disk 1 413 is used in a local stream manner, and the mandatory disk 2 510 is used over a network in a remote stream manner. Thereafter, the failed disk 411 is deleted and the recovered copy disk 412 is assigned to the virtual machine 1 100, thereby completing the recovery of the failed disk 411. In this case, the failed disk 411 and the copy disk 412 may be replaced with each other in real time because they are present in the same local storage device 400.
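The stream recovery and disk exchange described above may be sketched as follows. This is a minimal Python illustration under stated assumptions, not the disclosed implementation: disk images are modeled as ordinary files, the names `recover_and_swap` and `CHUNK` are hypothetical, and the copy disk is created in the same directory as the failed disk to mirror the statement that both reside in the same local storage device.

```python
import os
import tempfile

CHUNK = 1024 * 1024  # hypothetical stream chunk size (1 MiB)


def recover_and_swap(mandatory_path, failed_path, chunk=CHUNK):
    """Rebuild a copy disk from the mandatory disk, then swap it in for the failed disk."""
    directory = os.path.dirname(os.path.abspath(failed_path))
    # Create the copy disk next to the failed disk (same storage device).
    fd, copy_path = tempfile.mkstemp(dir=directory, suffix=".copy")
    with os.fdopen(fd, "wb") as dst, open(mandatory_path, "rb") as src:
        # Stream the mandatory disk into the copy disk chunk by chunk.
        while True:
            data = src.read(chunk)
            if not data:
                break
            dst.write(data)
        dst.flush()
        os.fsync(dst.fileno())
    # os.replace drops the failed image and puts the copy under its name in one step.
    os.replace(copy_path, failed_path)
    return failed_path
```

Because the copy disk and the failed disk are on the same file system in this sketch, the final rename is a single operation, which corresponds to the real-time replacement noted above. A remote mandatory disk would be read over a network instead of from a local path, which this sketch does not model.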
The assignment of system resources capable of performing recovery while ensuring the performance of virtual machines is described below.
Referring to
A method of recovering the failed disk of a virtual machine according to an embodiment of the present invention is described below.
Referring to
At step S100 of calculating the total resources, that is, network and disk I/O resources, of a virtualization system, the total resources mean all system resources corresponding to a ratio of 1 in
After the total resources have been calculated, operating resources, that is, resources used to drive the virtualization system, are calculated at step S200. In this case, the operating resources are required to drive the virtualization system in a virtualization server, and correspond to 1−X in
After the operating resources have been calculated, use resources, that is, the amount of the network and disk I/O resources used, are calculated at step S300. In this case, the use resources mean the portion Y of the total resources, excluding the resources corresponding to 1−X, that is used to drive virtual machines. A change in the use resources is detected in real time.
After the use resources have been calculated, recovery resources, that is, network and disk I/O bandwidths capable of being assigned to failure recovery without disturbing the performance of other virtual machines based on the total resources, the operating resources and the use resources, are calculated at step S400. In this case, the recovery resources mean the remaining resources of the total resources other than the operating resources and the use resources, and refer to resources used to recover the failed disk of a virtual machine. That is, the recovery resources correspond to X−Y in
After the recovery resources have been calculated, whether or not the recovery resources are present is determined at step S500. If, as a result of the determination, it is determined that the recovery resources have not been ensured (i.e., X−Y=0), an operation of recovering the failed disk of a virtual machine is stopped at step S600. If, as a result of the determination, it is determined that the recovery resources have been ensured, recovery is performed at step S700. Steps S600 and S700 may return to step S300. Accordingly, whether or not to perform recovery is determined based on a real-time change in the use resources.
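The calculation and the decision of steps S300 to S700 may be sketched in Python as follows. With the total resources normalized to 1, the operating resources equal to 1−X and the use resources equal to Y, the remainder is 1 − (1−X) − Y = X−Y. The sketch is illustrative only: the function names `recovery_resources` and `schedule` are hypothetical, a single scalar bandwidth stands in for the network and disk I/O resources that the description treats together, and a negative remainder is clamped to zero as an added assumption.

```python
def recovery_resources(total, operating, use):
    """Return the bandwidth left for recovery: total - operating - use, floored at 0."""
    return max(0.0, total - operating - use)


def schedule(total, operating, use_samples):
    """Yield a decision for each periodic sample of the use resources.

    Each decision is ('recover', bandwidth) when recovery resources are
    available (step S700) or ('stop', 0.0) when none remain (step S600).
    """
    for use in use_samples:
        bandwidth = recovery_resources(total, operating, use)
        if bandwidth > 0:
            yield ("recover", bandwidth)
        else:
            yield ("stop", 0.0)
```

For example, with a total of 100.0, operating resources of 20.0 and successive use samples of 50.0, 80.0, 90.0 and 30.0 (arbitrary units chosen for this example), recovery proceeds on the first and last samples and is stopped on the two in between.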
The execution of the method of recovering the failed disk of a virtual machine and a method of controlling resource use bands are described below.
The performance of recovery described with reference to
More specifically, at step S710 of providing all recovery tasks with assignment resources to which the recovery resources have been equally assigned, the network and disk I/O bandwidths, that is, the recovery resources calculated at step S400, are equally assigned to all the recovery tasks. In this case, use resources calculated at step S300 are periodically updated, and then steps S300 to S700 are repeatedly performed. Thereafter, the mandatory disk is divided into the plurality of blocks at step S720, and recovery is performed on each block section formed of each of the blocks of a specific size at step S730.
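Steps S710 and S720 may be sketched in Python as follows. This is a minimal illustration under stated assumptions: the names `equal_share` and `block_sections` are hypothetical, the recovery resources are represented by a single number, and the block size is a caller-supplied parameter because the description refers only to blocks of a specific size.

```python
def equal_share(recovery_bw, task_count):
    """Step S710 (sketch): split the recovery resources equally among all recovery tasks."""
    if task_count <= 0:
        raise ValueError("task_count must be positive")
    return recovery_bw / task_count


def block_sections(disk_size, block_size):
    """Step S720 (sketch): yield (offset, length) for each block section of the mandatory disk."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    offset = 0
    while offset < disk_size:
        # The last block section may be shorter than block_size.
        length = min(block_size, disk_size - offset)
        yield offset, length
        offset += length
```

A recovery task would then process the block sections in order at step S730, using at most its equal share for each one. Since the use resources are recalculated periodically, the share passed to each block section may differ from that of the preceding one.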
An apparatus for recovering the failed disk of a virtual machine according to an embodiment of the present invention is described below.
Referring to
The system performance analysis unit 310 functions to analyze the performance of a virtualization system and calculate recovery resources, that is, network and disk I/O bandwidths that may be assigned to failure recovery. More specifically, the system performance analysis unit 310 includes a total resource calculation unit 311 configured to calculate total resources, that is, the total network and disk I/O resources of a virtualization system, an operating resource calculation unit 312 configured to calculate operating resources, that is, resources used to drive the virtualization system, a use resource calculation unit 313 configured to calculate use resources, that is, the amount of the network and disk I/O resources used, and a recovery resource calculation unit 314 configured to calculate recovery resources, that is, network and disk I/O bandwidths capable of being assigned to failure recovery without disturbing the performance of other virtual machines based on the total resources, the operating resources and the use resources. The total resources mean all system resources corresponding to the ratio 1 in
If the recovery resources have not been ensured (i.e., X−Y=0) after the recovery resources have been calculated, the failed disk recovery unit 320 does not operate. If the recovery resources have been ensured, however, the failed disk recovery unit 320 executes recovery. The system performance analysis unit 310 repeatedly operates in real time, and thus, whether or not to perform recovery is determined depending on whether or not recovery resources have been ensured in real time.
The failed disk recovery unit 320 includes a local stream recovery unit 321 configured to recover a copy disk, that is, a copy of a local mandatory disk stored in a local storage device, by copying the copy disk in a local stream manner using the local mandatory disk, and a remote stream recovery unit 322 configured to recover a copy disk, that is, a copy of a remote mandatory disk stored in a remote storage device, by copying the copy disk in a remote stream manner using the remote mandatory disk. For example, referring to
The local stream recovery unit 321 includes an assignment unit 321a configured to provide all recovery tasks with assignment resources to which the recovery resources have been equally assigned, a division unit 321b configured to divide the mandatory disk into a plurality of blocks, and a performance unit 321c configured to perform recovery on each block section formed of each of the blocks based on the assignment resources. Furthermore, the remote stream recovery unit 322 includes an assignment unit 322a configured to provide all recovery tasks with assignment resources to which the recovery resources have been equally assigned, a division unit 322b configured to divide the mandatory disk into a plurality of blocks, and a performance unit 322c configured to perform recovery on each block section formed of each of the blocks based on the assignment resources.
For example, referring back to
The method of controlling resource use bandwidths in the task of recovering a failed disk is illustrated in
As described above, at least one embodiment of the present invention has the advantage of recovering the failed disk of a virtual machine while maintaining the continuity of virtualization service in a server virtualization environment.
At least one embodiment of the present invention has the advantage of recovering the failed disk of a virtual machine while ensuring the performance of virtual machines.
At least one embodiment of the present invention has the advantage of preventing the performance of virtual machines from deteriorating during recovery by scheduling the resources used to recover the failed disk of a virtual machine.
At least one embodiment of the present invention has the advantage of recovering the failed disk of a virtual machine based on a remote storage device when recovering the failed disk.
Although the preferred embodiments of the present invention have been disclosed for illustrative purposes, those skilled in the art will appreciate that various modifications, additions and substitutions are possible, without departing from the scope and spirit of the invention as disclosed in the accompanying claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2013-0071828 | Jun 2013 | KR | national |