1. Field of the Invention
The invention relates generally to storage systems and more specifically relates to maintaining cache persistence in a storage controller of a storage system in which the storage controller comprises multiple virtual machines each using a cache memory.
2. Discussion of Related Art
Storage systems have evolved in directions in which the storage controller of the storage system provides not only lower-level storage management such as RAID (Redundant Array of Independent Drives) storage management but also a number of higher-layer storage applications operating within the storage controller on the storage system. These storage applications are made available to host systems with access to the storage system. For example, some storage systems include applications to provide continuous data protection through automated backup procedures, database management application processes, snapshot (e.g., “shadow copy”) management processes, de-duplication management processes, etc.
Some commercial storage system products providing such storage management coupled with the storage applications utilize Virtual Machine Managers (commonly referred to as “hypervisors”) to provide a virtual machine for each of the multiple application processes as well as for the lower-level storage management processes. In general, a hypervisor controls the overall operation of each of a plurality of virtual machines. Each virtual machine may include its own specific operating system kernel and associated application processes such that the hypervisor hides the underlying physical hardware circuitry interfaces from the operating system and application processes operating within a virtual machine. A variety of such virtual machine operating systems are well known and commercially available including, for example, the Xen hypervisor and the VMWare hypervisor. Information regarding these and other virtual machine operating systems is well known to those of ordinary skill and generally available at, for example, www.xen.org and www.vmware.com.
In virtual storage system controllers it is common that the lower-level storage management processes (e.g., RAID storage management processes) operate in a virtual machine under control of the hypervisor and that the various application processes each run in separate virtual machines. All the virtual machines typically utilize cache memory to enhance their respective performance. Thus, each virtual machine in such a virtual machine storage controller may include access to a shared cache memory.
Typically, the cache memory is implemented as a battery-backed random access memory (RAM) so that loss of power to the storage system will not result in immediate loss of data in the cache memory. The battery power retains the content in the cache memory until external power is restored to the storage system. However, as the size of the cache memories for storage management and various storage applications increases, the load increases on such a battery used for retaining the volatile cache memory content. Assuring that the content of the cache memory is maintained for a sufficient period of time to allow restoration of external power therefore requires ever-larger battery components. Larger battery systems impose added cost and complexity on the storage systems.
Thus, it is an ongoing challenge to assure that cache memory utilized by a plurality of virtual machines in a storage controller of a storage system is retained during a potentially lengthy loss of power to the storage system.
The present invention solves the above and other problems, thereby advancing the state of the useful arts, by providing methods and apparatus for managing persistence of the content of cache memory used by each of multiple virtual machines operating in a storage controller. In accordance with features and aspects hereof, a storage controller assigns each of multiple virtual machines a corresponding portion of a shared cache memory. Upon loss of external power to the storage system, a persistence apparatus of the storage controller (operating under battery power to the storage controller) copies the content of each portion of the cache memory to a persistent memory thus assuring persistence of the cache content before battery power is exhausted. Upon restoration of the external power to the storage system, the persistence apparatus may restore the content of each portion of the cache memory before allowing the virtual machine to resume operation.
A first aspect hereof provides a method operable in a storage controller of a storage system for maintaining cache persistence. The storage controller includes a persistent memory, a cache memory, and multiple virtual machines coupled with the cache memory and operating under control of a hypervisor. The method includes associating each of multiple portions of cache memory with a corresponding virtual machine of the multiple virtual machines and sensing a loss of external power to the storage controller. The method also includes copying content from each portion of the cache memory associated with a corresponding virtual machine to the persistent memory in response to sensing the loss of external power.
Another aspect hereof provides apparatus in a storage system. The apparatus includes a battery and a storage controller coupled to the battery to receive power temporarily in case of loss of external power to the storage controller. The storage controller includes multiple virtual machines under control of a hypervisor and a cache memory coupled with each of the multiple virtual machines. The cache memory has multiple portions, each portion associated with a corresponding virtual machine of the multiple virtual machines. The storage controller also has a persistent memory adapted to persistently retain stored information despite loss of external power and a persistence apparatus coupled with the cache memory, coupled with the persistent memory, coupled with the hypervisor, and coupled with the multiple virtual machines. The persistence apparatus is adapted to receive a power loss signal from the hypervisor indicating loss of external power. The persistence apparatus is further adapted to copy content from each of the multiple portions of the cache memory to the persistent memory in response to receipt of the power loss signal.
Yet another aspect hereof provides a computer readable medium embodying stored program instructions for performing various methods hereof.
Storage controller 150 may include processor 160 on which hypervisor 102 and VMs 104.1 through 104.3 may operate. Processor 160 may be any suitable computing device and associated program and data memory for storing programmed instructions and associated data. Processor 160 may be any general or special purpose processor and associated programmed instructions and/or may include customized application specific integrated circuits designed specifically for virtual machine processing.
Storage controller 150 also includes cache memory 106 logically subdivided into portions each of which corresponds to one of the virtual machines. For example, cache portion 106.1 is utilized by virtual machine “A” 104.1, cache portion 106.2 is utilized by virtual machine “B” 104.2, and cache portion 106.3 is utilized by virtual machine “C” 104.3. Those of ordinary skill in the art will readily recognize that any number of virtual machines may be provided under control of hypervisor 102 and hence a corresponding number of portions of cache memory 106 may be defined. Further, the size of each cache portion 106.1 through 106.3 may be fixed and equal or may vary depending upon the needs of each particular virtual machine.
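The association of cache portions with virtual machines described above can be sketched as a small allocation table. The following is a minimal illustration only, assuming contiguous, non-overlapping portions; the names `CachePortion` and `partition_cache`, the byte sizes, and the use of reference numerals as identifiers are hypothetical and are not taken from the described embodiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachePortion:
    """One logical portion of the shared cache (hypothetical structure)."""
    vm_id: str   # identifier of the owning virtual machine
    start: int   # offset of the portion within the shared cache
    length: int  # size of the portion in bytes


def partition_cache(cache_size: int, requests: dict[str, int]) -> dict[str, CachePortion]:
    """Assign each virtual machine a contiguous, non-overlapping portion.

    `requests` maps a VM identifier to its requested size; sizes may be
    equal or may differ per virtual machine.
    """
    if sum(requests.values()) > cache_size:
        raise ValueError("requested portions exceed cache size")
    portions = {}
    offset = 0
    for vm_id, length in requests.items():
        portions[vm_id] = CachePortion(vm_id, offset, length)
        offset += length
    return portions


# Illustrative sizes only: three virtual machines sharing a 1024-byte cache.
portions = partition_cache(1024, {"104.1": 256, "104.2": 512, "104.3": 256})
```

Keeping the table separate from the cache content means the same table can later drive both the save and the restore of each portion.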
Cache memory 106 is typically implemented utilizing volatile, non-persistent, random access memory (e.g., static or dynamic random access memory, or RAM). But for the presence of battery 154, loss of power from external power source 152 would cause total loss of the content of the volatile, non-persistent cache memory 106. Battery 154 is adapted to provide backup power in case of loss of power from external source 152 but only for a brief period of time. As noted above, the power load imposed on the battery 154 increases as the size of cache memory 106 continues to increase. In the context of storage controller 150 having multiple storage related processes each operating in a virtual machine with an associated portion of cache memory, the power load on battery 154 may be substantial. Thus, the time that battery 154 may power the storage controller 150 is limited.
Storage controller 150 also includes persistent memory 108 that does not rely on continuous application of power to retain its stored data. Persistent memory 108 may be implemented as any suitable nonvolatile, persistent memory device components including, for example, flash memory, and disk storage such as optical or magnetic disk storage.
Storage controller 150 also includes persistence apparatus 110 coupled with cache memory 106, with persistent memory 108, with the hypervisor 102 and each of the virtual machines 104.1 through 104.3. Persistence apparatus 110 is adapted to receive a power loss signal from the hypervisor indicating loss of external power 152 (and hence switchover of controller 150 to battery power 154). Persistence apparatus 110 is further adapted to copy the content from each of the multiple portions 106.1 through 106.3 of cache memory 106 to the persistent memory 108 responsive to the signal detecting loss of external power. Copying the content of cache 106 to persistent memory 108 prevents loss of data in cache memory 106.
Persistence apparatus 110 may be implemented as suitably programmed instructions executed by processor 160 or may be implemented as suitably designed custom integrated circuits dedicated to the functions performed by the apparatus 110. Further, in exemplary embodiments, persistence apparatus 110 may be implemented as tightly integrated with the hypervisor or as distinct from the hypervisor (e.g., operable within a virtual machine managed by the hypervisor). Further details of exemplary embodiments of persistence apparatus 110 are presented herein below.
Those of ordinary skill in the art will readily recognize numerous additional and equivalent components and modules in a fully operational system 100. Such additional and equivalent components are omitted herein for simplicity and brevity of this discussion.
In other exemplary embodiments discussed further below, the plug-in function 200.1 through 200.3 may also return information to the persistence apparatus 110 when the plug-in function is invoked. The return information may indicate a subset of the portion of cache memory that is actually utilized by the corresponding virtual machine 104.1 through 104.3 (as opposed to merely allocated for the corresponding virtual machine). In such embodiments, the persistence apparatus 110 may save only the subset of the cache portion (106.1 through 106.3 of
Persistence apparatus 110 as shown in
The exemplary embodiments of
Step 500 associates a portion of a cache memory with each of the multiple virtual machines. As noted generally above, the size of each portion associated with each virtual machine either may be fixed and equal to all other portions or may vary depending upon the requirements of the particular virtual machine and application. Such design choices are readily apparent to those of ordinary skill in the art based on the needs of a particular storage application environment. Step 502 awaits detection of a signal indicating loss of external power to the storage controller. Upon sensing loss of external power to the storage controller, step 504 copies the contents from each portion of the cache memory associated with a corresponding virtual machine to the persistent memory. As noted above, though the cache memory is volatile and not persistent, it retains its content after loss of external power for brief periods of time based on battery power. By contrast, the persistent memory does not require any power source to retain its stored data.
Step 506 then shuts down the storage system and turns off the battery power. By shutting down the storage system completely and turning off the battery power source, the remaining charge in the battery may be conserved for subsequent uses after restoration of the external power. Later, external power to the storage controller may be restored (i.e., after the cause of failure for the external power is determined and remedied). Following restoration of the external power, step 508 restores each portion of the cache memory from a corresponding location in the persistent memory copy generated in step 504. Step 510 then allows resumption of virtual machine processing with the cache content fully restored to its state prior to loss of external power.
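The save and restore sequence of steps 500 through 510 may be easier to follow as a short sketch. The code below is illustrative only: it models the cache memory as a `bytearray` and the persistent memory as a dictionary, and the function names `save_portions` and `restore_portions` are hypothetical. Power-loss sensing (step 502) and battery control (step 506) are hardware matters and are only simulated here.

```python
def save_portions(cache: bytearray, portions: dict[str, tuple[int, int]]) -> dict[str, bytes]:
    """Step 504: copy each virtual machine's cache portion to persistent memory.

    `portions` maps a VM name to (start offset, length) within the cache.
    """
    return {
        vm: bytes(cache[start:start + length])
        for vm, (start, length) in portions.items()
    }


def restore_portions(cache: bytearray,
                     portions: dict[str, tuple[int, int]],
                     persistent: dict[str, bytes]) -> None:
    """Step 508: write each saved portion back to its original location."""
    for vm, (start, length) in portions.items():
        cache[start:start + length] = persistent[vm]


# Step 500: associate a portion of the cache with each virtual machine.
portions = {"A": (0, 4), "B": (4, 8), "C": (12, 4)}
cache = bytearray(b"AAAABBBBBBBBCCCC")

# Steps 502/504: on loss of external power, copy every portion while on battery.
persistent = save_portions(cache, portions)

# Step 506: shutdown; the volatile cache content is lost (simulated by zeroing).
cache = bytearray(len(cache))

# Steps 508/510: after power returns, restore before the VMs resume operation.
restore_portions(cache, portions, persistent)
```

In this sketch the restore completes before any virtual machine would run again, mirroring the ordering of steps 508 and 510.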
Step 600 invokes the plug-in function for the first or next virtual machine to be processed by the persistence apparatus responsive to sensing loss of external power. Step 602 determines from the returned values of the invocation of the plug-in function a subset of the cache portion of the current virtual machine that needs to be copied to the persistent storage. In one exemplary embodiment, the returned values provide a start address value and an extent value defining the location and length of a contiguous sequence of memory locations in cache memory to be copied to the persistent memory. In another exemplary embodiment, the returned values from the plug-in function invocation may represent one or more tuples of values wherein each tuple provides a start address value and an extent value for contiguous memory locations to be copied to the persistent memory. Where multiple such tuples are provided, the collection of memory locations defined by all such tuples comprises the subset of the cache portion that is to be copied to the persistent storage.
Having so determined the subset of the cache portion to be copied, step 604 copies the identified subset of the cache portion for the present virtual machine to the persistent memory. Optionally, step 604 may also store meta-data that aids in identifying the exact locations in the portion of cache memory from which the copied subset is obtained. The meta-data may then be used later when restoring the copied portions of cache memory. Step 606 then determines whether more virtual machines remain to be processed for purposes of copying their respective portions of cache memory. If so, processing loops back, repeating steps 600 through 606 until all virtual machines have been processed by the persistence apparatus.
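The loop of steps 600 through 606 can be sketched as follows. This is a minimal illustration under stated assumptions, not the described implementation: each plug-in function is modeled as a Python callable returning a list of (start address, extent) tuples, the start address of each copied run serves as the meta-data, and the names `persist_used_subsets` and `restore_used_subsets` are hypothetical.

```python
from typing import Callable

Extent = tuple[int, int]  # (start address, extent) of a contiguous run


def persist_used_subsets(
    cache: bytearray,
    plugins: dict[str, Callable[[], list[Extent]]],
) -> dict[str, list[tuple[int, bytes]]]:
    """Copy only the utilized subset of each virtual machine's cache portion."""
    saved: dict[str, list[tuple[int, bytes]]] = {}
    for vm, plugin in plugins.items():   # steps 600 and 606: first/next VM
        extents = plugin()               # step 602: subset reported by the plug-in
        # Step 604: copy each run, keeping its start address as meta-data.
        saved[vm] = [
            (start, bytes(cache[start:start + length]))
            for start, length in extents
        ]
    return saved


def restore_used_subsets(cache: bytearray,
                         saved: dict[str, list[tuple[int, bytes]]]) -> None:
    """Use the stored meta-data to place each run back where it came from."""
    for records in saved.values():
        for start, data in records:
            cache[start:start + len(data)] = data


# Illustrative cache: '.' marks locations allocated but not actually used.
cache = bytearray(b"aa..bbbb....c...")
plugins = {
    "A": lambda: [(0, 2)],
    "B": lambda: [(4, 4)],
    "C": lambda: [(12, 1)],
}
saved = persist_used_subsets(cache, plugins)

restored = bytearray(len(cache))
restore_used_subsets(restored, saved)
```

Only 7 of the 16 bytes are copied in this example, which is the point of returning a utilized subset: less data to write while running on battery.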
Step 704 then determines values to be returned to the invoking persistence apparatus to indicate a subset of the cache portion actually used by the virtual machine. As noted above, the restructuring of step 702 may assure that the content of the cache portion is reorganized into one or more contiguous blocks of memory locations. Thus, step 704 may determine one or more sets of values (i.e., one or more tuples) to be returned to the invoking persistence apparatus. Each tuple may then indicate, for example, a starting address and an extent of a contiguous block of memory to be saved and later restored by the persistence apparatus. Step 706 then returns to the invoking persistence apparatus with the return values determined by step 704.
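One way a plug-in function might derive the tuples of step 704 is to coalesce runs of in-use cache blocks into (starting address, extent) pairs. The sketch below assumes the virtual machine tracks usage with a per-block flag list, which is an assumption made for illustration; the name `used_extents` and the `base` and `block_size` parameters are likewise hypothetical.

```python
def used_extents(in_use: list[bool], base: int = 0, block_size: int = 1) -> list[tuple[int, int]]:
    """Return (start address, extent) tuples for each run of in-use blocks.

    `in_use[i]` is True when block i of the cache portion holds live content.
    `base` is the address of block 0 and `block_size` the size of one block.
    """
    extents: list[tuple[int, int]] = []
    run_start = None  # index of the first block of the current run, if any
    for i, used in enumerate(in_use):
        if used and run_start is None:
            run_start = i  # a new contiguous run begins here
        elif not used and run_start is not None:
            extents.append((base + run_start * block_size,
                            (i - run_start) * block_size))
            run_start = None
    if run_start is not None:  # close a run that reaches the end of the portion
        extents.append((base + run_start * block_size,
                        (len(in_use) - run_start) * block_size))
    return extents


# Step 706: these tuples would be returned to the invoking persistence apparatus.
tuples = used_extents([True, True, False, True, False, False, True],
                      base=100, block_size=4)
```

If a restructuring step first compacts the live content to the front of the portion, the same function would return a single tuple, which keeps the meta-data small.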
Embodiments of the invention can take the form of an entirely hardware (i.e., circuits) embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements. In one embodiment, the invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
Furthermore, embodiments of the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium 812 providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the computer, instruction execution system, apparatus, or device.
The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid-state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk—read only memory (CD-ROM), compact disk—read/write (CD-R/W) and DVD.
An I/O controller device computer 800 suitable for storing and/or executing program code will include at least one processor 802 coupled directly or indirectly to memory elements 804 through a system bus 850. The memory elements 804 can include local memory employed during actual execution of the program code, bulk storage, and cache memories that provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
Input/output interface 806 couples the controller to I/O devices to be controlled (e.g., storage devices, etc.). Host system interface 808 may also couple the computer 800 to other data processing systems.
While the invention has been illustrated and described in the drawings and foregoing description, such illustration and description is to be considered as exemplary and not restrictive in character. One embodiment of the invention and minor variants thereof have been shown and described. In particular, features shown and described as exemplary software or firmware embodiments may be equivalently implemented as customized logic circuits and vice versa. Protection is desired for all changes and modifications that come within the spirit of the invention. Those skilled in the art will appreciate variations of the above-described embodiments that fall within the scope of the invention. As a result, the invention is not limited to the specific examples and illustrations discussed above, but only by the following claims and their equivalents.