The present disclosure relates generally to virtual machine data protection and more particularly to a system and method for securely storing virtual machine data in a public cloud.
Virtual machine data, such as virtual machine images, may be a target for unauthorized access to private data. In existing systems, virtual machine data belonging to a cloud customer may be required to reside in a cloud environment, where it may be vulnerable to attacks by cloud vendor personnel and/or other unauthorized users such as hackers. In addition, migration and/or re-provisioning of virtual machine data into different cloud environments may be cumbersome and time-consuming.
According to one embodiment of the present disclosure, a method includes partitioning a disk image file into a plurality of segments. The method also includes generating a unique key for each segment, storing the unique keys in an image mapping file, and transmitting the image mapping file to a particular one of a plurality of nodes on a network. The method further includes transmitting a first segment and a second segment of the plurality of segments to different nodes of the plurality of nodes.
Technical advantages of the present disclosure include the secure storage of virtual machine data in cloud environments. Particular embodiments may also allow for rapid migration and/or provisioning of virtual machine data from one cloud environment to another. Some embodiments may also allow for redundant storage of virtual machine data in cloud environments.
Other technical advantages of the present disclosure will be readily apparent to one skilled in the art from the following figures, descriptions, and claims. Moreover, while specific advantages have been enumerated above, various embodiments may include all, some, or none of the enumerated advantages.
For a more complete understanding of certain embodiments of the present invention and features and advantages thereof, reference is now made to the following description taken in conjunction with the accompanying drawings, in which:
Embodiments of the present disclosure and its advantages are best understood by referring to
For instance, in existing systems, a user wishing to run a virtual machine in a cloud environment may be required to upload an initial virtual machine image to the cloud environment to instantiate the image. After using the instantiated virtual machine, the state of the virtual machine may have changed from that captured by the initial image uploaded to the cloud environment. If the user wishes to temporarily stop using the instantiated virtual machine, the user may store an updated image of the virtual machine in the cloud environment. However, the virtual machine image may be vulnerable to attacks and/or theft while being stored in the cloud environment. To alleviate this concern, some users may wish to download the updated virtual machine image from the cloud, store the image locally, and later upload the image when further use is desired. However, this may be inefficient and/or time-consuming, as downloading and uploading large amounts of data may take a substantial amount of time to perform in addition to the time required to re-instantiate and/or re-provision the image.
Some users may wish to migrate a virtual machine to a different cloud environment as well. In existing systems, this may require the user to migrate a new image capturing the updated state of the virtual machine from the current environment to the new environment. Alternatively, the user may upload the initial image to the new cloud environment and provision the image such that it mirrors the updated state of the virtual machine in the current environment. Either process may take a significant amount of time, however, and may be quite burdensome.
Accordingly, embodiments of the present disclosure may allow for secure storage of data in cloud environments. In addition, embodiments of the present disclosure may allow for the rapid migration or re-provisioning of images securely stored in the cloud. For instance, a virtual machine image file in a cloud environment may be partitioned into a plurality of segments. In some embodiments, the image file may be compressed prior to being partitioned into segments using any suitable data compression technique. Unique keys for each segment may then be generated, and may be used for location and/or identification purposes. For example, the unique keys may indicate the node on which each segment is located on a network.
The segments may then be distributed among various nodes coupled to a network. Such distributed storage of the segments may allow for secure storage of the virtual machine image, since any single segment may be of little use to an unauthorized user without the other segments of the virtual machine image. In addition, such distributed storage may allow for rapid recovery of the virtual machine image, as each segment may be retrieved in parallel from each of the nodes on the network without requiring a large amount of bandwidth from any one of the nodes. In particular embodiments, the segments may be replicated across the various nodes on the network according to any suitable replication technique. Such replication may provide redundancy protection in the event of data loss at one or more of the nodes on the network.
The segments may also be encrypted prior to distribution among the nodes. This may include generation of encryption keys for each segment being distributed on the network, which may be stored in an image mapping file (IMF) along with the unique keys. The IMF may then be stored and/or controlled by the owner of the virtual machine image, and/or may allow a user with control of or access to the file to retrieve and re-provision a virtual machine image file that has been distributed on the network.
The segments may be retrieved from the various nodes on the network when desired, for example, when the user wishes to re-provision the image in the cloud environment. In some embodiments, the segments may be retrieved according to the unique keys in the IMF. In particular embodiments, the segments may only be retrieved by a user with control of or access to the IMF. Once retrieved, the segments may be assembled to form the original virtual machine image file. If encrypted, the segments may be decrypted prior to assembly. If the assembled file is a compressed version of the image file, it may be decompressed to yield the original virtual machine image file.
As an example, referring to
Datacenter 110 may refer to a collection of hardware resources such as a server 112 and/or storage 111. Server 112 includes a processor 113, memory 114, and an interface 115. Processor 113 may refer to any suitable device operable to execute instructions and manipulate data to perform operations for server 112. Processor 113 may include, for example, any type of central processing unit (CPU).
Memory 114 may comprise one or more tangible, computer-readable, and/or computer-executable media, and may store data, information, and/or instructions operable to be executed by processor 113. Examples of memory include computer memory (for example, Random Access Memory (RAM) or Read Only Memory (ROM)), mass computer readable media (for example, a hard disk), removable computer readable media (for example, a Compact Disk (CD) or a Digital Video Disk (DVD)), and/or other computer-readable media.
Interface 115 may refer to any suitable device operable to receive input for server 112, send output from server 112, perform suitable processing of the input or output or both, communicate with other devices, or any combination of the preceding. Interface 115 may include appropriate hardware (e.g., modem, network interface card, etc.) and software, including protocol conversion and data processing capabilities, to communicate through a LAN, WAN, or other communication system that allows server 112 to communicate with other devices. Interface 115 may include one or more ports, conversion software, or both.
Storage 111 may provide additional data storage capacity and may include database and/or network storage, or any other suitable tangible, computer-readable storage media. In certain embodiments, storage 111 may include network resources, such as a storage area network (SAN) or network-attached storage (NAS).
Network 120 may refer to any interconnecting system capable of transmitting audio, video, signals, data, messages, or any combination of the preceding. Network 120 may include all or a portion of a public switched telephone network (PSTN), a public or private data network, a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), a local, regional, or global communication or computer network such as the Internet, a wireline or wireless network, an enterprise internet, or any other suitable communication link, including combinations thereof. Network 120 may connect a plurality of nodes 130.
Nodes 130, like server 112, may include a processor 131, memory 132, and interface 133. As an example, and not by way of limitation, nodes 130 may comprise an embedded computer system, a system-on-chip (SOC), a single-board computer system (SBC) (such as, for example, a computer-on-module (COM) or system-on-module (SOM)), a desktop computer system, a laptop or notebook computer system, an interactive kiosk, a mainframe, a mesh of computer systems, a mobile telephone, a personal digital assistant (PDA), a server, or a combination of two or more of these. Where appropriate, nodes 130 may include one or more computer systems; be unitary or distributed; span multiple locations; span multiple machines; or reside in a cloud, which may include one or more cloud components in one or more networks.
At step 220, the compressed file is split or partitioned into a plurality of data segments, which may comprise the various partitions of the compressed file. The number of data segments may be any suitable number. In some embodiments, the number of segments may be chosen such that each segment may be of little value without the rest of the segments. This may allow for more secure storage of each segment, since an unauthorized user would require access to every segment in order to use and/or view the data. In certain embodiments, the number of segments may be chosen based on the preferred size of each segment. For instance, a size of 1 MB to 64 MB per segment may be chosen based on disk sector sizes available for storage.
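As a non-limiting illustration, the compression and partitioning described above may be sketched as follows. This is a minimal sketch under stated assumptions: `zlib` stands in for "any suitable data compression technique," and the `partition` helper, the sample data, and the 256-byte segment size are hypothetical choices made only to keep the example small (the description above contemplates sizes such as 1 MB to 64 MB).

```python
import zlib


def partition(data: bytes, segment_size: int) -> list[bytes]:
    """Split data into consecutive segments of at most segment_size bytes."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return [data[i:i + segment_size] for i in range(0, len(data), segment_size)]


# Stand-in for a disk image file; a real image would be read from storage.
image = b"virtual machine image contents " * 4096

# Assumption: zlib is used as the compression technique before partitioning.
compressed = zlib.compress(image)

# Assumption: a small 256-byte segment size, chosen only for the demonstration.
segments = partition(compressed, 256)
```

Concatenating the segments in order reproduces the compressed file, so no information is lost by the split itself.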
At step 230, each segment is encrypted. Encryption may be performed using any suitable encryption technique. During this step, encryption keys may be generated according to the encryption technique selected. In some embodiments, an encryption key may be generated for each segment. After encryption, a unique key is then generated for each segment at step 240. In some embodiments, the unique key may identify the particular segment to which it corresponds. In some embodiments, the unique key may identify a location at which the segment is to be stored. In particular embodiments, the unique key may be generated according to a hash function.
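By way of example only, per-segment key generation may be sketched as follows. The sketch assumes SHA-256 as the hash function and 32-byte random encryption keys; neither choice is specified by the disclosure. The cipher itself is deliberately omitted, and the `encrypted_segments` values are placeholders standing in for already-encrypted data. The helper names `new_encryption_key` and `unique_key_for` are hypothetical.

```python
import hashlib
import secrets


def new_encryption_key(num_bytes: int = 32) -> bytes:
    """Generate a random per-segment encryption key (assumed length: 32 bytes)."""
    return secrets.token_bytes(num_bytes)


def unique_key_for(encrypted_segment: bytes) -> str:
    """Derive a unique key from the segment contents (assumed hash: SHA-256)."""
    return hashlib.sha256(encrypted_segment).hexdigest()


# Placeholders for segments that have already been encrypted at step 230.
encrypted_segments = [b"ciphertext-0", b"ciphertext-1", b"ciphertext-2"]

encryption_keys = [new_encryption_key() for _ in encrypted_segments]
unique_keys = [unique_key_for(s) for s in encrypted_segments]
```

One consequence of hashing the contents is that two byte-identical segments would receive the same unique key; an implementation wanting distinct keys in that case could, for instance, include the segment index in the hashed input.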
At step 250, the encryption keys and unique keys are stored in an image mapping file (IMF). In particular embodiments, the IMF may comprise a distributed hash table (DHT) containing unique key and segment pairs, which may indicate a location of each segment on the network. In such embodiments, the DHT may provide a lookup service similar to a hash table, and may utilize the unique keys to locate each segment on the network.
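For illustration, one possible in-memory layout for an IMF is sketched below. The JSON serialization, the field names (`index`, `unique_key`, `encryption_key`, `node`), and the helpers `build_imf` and `locate` are all assumptions made for this example; the disclosure does not prescribe a file format. A plain list lookup stands in here for the lookup service that a DHT would provide.

```python
import json


def build_imf(unique_keys, encryption_keys, locations):
    """Build a hypothetical IMF: one entry per segment, kept in assembly order."""
    return {
        "segments": [
            {
                "index": i,
                "unique_key": uk,
                "encryption_key": ek.hex(),  # bytes are hex-encoded for JSON
                "node": loc,
            }
            for i, (uk, ek, loc) in enumerate(
                zip(unique_keys, encryption_keys, locations)
            )
        ]
    }


def locate(imf, unique_key):
    """Return the node recorded for a unique key, similar to a hash-table lookup."""
    for entry in imf["segments"]:
        if entry["unique_key"] == unique_key:
            return entry["node"]
    raise KeyError(unique_key)


# Hypothetical keys and node names, for demonstration only.
imf = build_imf(["aa11", "bb22"], [b"\x01" * 32, b"\x02" * 32], ["node-a", "node-b"])
serialized = json.dumps(imf, indent=2)
restored = json.loads(serialized)
```

Because the IMF in this sketch carries the encryption keys in recoverable form, whoever holds the file can both find and decrypt every segment, which is consistent with the IMF being stored and controlled by the owner of the image.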
At step 260, the IMF is sent to a particular one of a plurality of nodes on a network. In some embodiments, the particular node may be the owner of the virtual machine data. In some embodiments, having possession of the IMF for a particular archive file may allow a user to recover the archive file at a later time. In further embodiments, possession of the IMF containing the unique keys and encryption keys may be required to retrieve each segment from the nodes on the network.
Finally, at step 270, the plurality of segments are sent to a plurality of nodes on a network. In some embodiments, the plurality of nodes may include the node to which the IMF was sent. In other embodiments, the plurality of nodes may not include the node to which the IMF was sent. In particular embodiments, the segments may be distributed according to the BitTorrent protocol, or other similar P2P protocol. In certain embodiments, the plurality of segments may be replicated among the plurality of nodes for redundancy purposes using any suitable data replication technique.
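As a simplified illustration of distribution with replication, the sketch below places each segment on a fixed number of nodes chosen from the hash of its unique key. This is not the BitTorrent protocol and is only one of many possible placement schemes; the `place` helper, the node names, the replica count of two, and the in-memory `storage` dictionary are hypothetical stand-ins for real network nodes.

```python
import hashlib


def place(unique_key: str, nodes: list[str], replicas: int = 2) -> list[str]:
    """Pick `replicas` distinct nodes for a segment, starting from a hashed offset."""
    if not 1 <= replicas <= len(nodes):
        raise ValueError("replicas must be between 1 and the number of nodes")
    digest = hashlib.sha256(unique_key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    return [nodes[(start + r) % len(nodes)] for r in range(replicas)]


# Hypothetical nodes, each modeled as a dictionary of unique key -> segment bytes.
nodes = ["node-a", "node-b", "node-c", "node-d"]
storage = {name: {} for name in nodes}

# Hypothetical segments keyed by their unique keys.
segments = {"key-0": b"seg0", "key-1": b"seg1", "key-2": b"seg2"}

for key, data in segments.items():
    for node in place(key, nodes):
        storage[node][key] = data  # stands in for a network transfer
```

With two replicas per segment, the loss of any single node in this sketch leaves at least one copy of every segment available.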
At step 320, the plurality of segments are retrieved from the plurality of nodes. In some embodiments, the segments may be retrieved based on the unique keys in the IMF. For example, in embodiments where the IMF comprises a DHT, the unique key and segment pairs may indicate the location of the segments on the network. In particular embodiments, the segments may be retrieved from the nodes in parallel. In some embodiments, the segments may be retrieved from the nodes in a particular order.
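As an illustrative sketch of parallel retrieval, the example below fetches segments concurrently using a thread pool. The nodes are modeled as in-memory dictionaries, and the `fetch` helper and the `(unique key, node)` entries are hypothetical; a real system would issue network requests instead.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical nodes, each holding one segment under its unique key.
nodes = {
    "node-a": {"k0": b"AAA"},
    "node-b": {"k1": b"BBB"},
    "node-c": {"k2": b"CCC"},
}

# Hypothetical (unique key, node) pairs, listed in assembly order as in an IMF.
imf_entries = [("k0", "node-a"), ("k1", "node-b"), ("k2", "node-c")]


def fetch(entry):
    """Retrieve one segment from the node recorded for its unique key."""
    unique_key, node = entry
    return nodes[node][unique_key]


# Requests run concurrently; map() still returns results in input order.
with ThreadPoolExecutor(max_workers=3) as pool:
    retrieved = list(pool.map(fetch, imf_entries))
```

Because `ThreadPoolExecutor.map` yields results in the order of its inputs, the segments arrive ready for in-order assembly even though the fetches overlap in time.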
At step 330, each segment is decrypted. In particular embodiments, the segments may be decrypted based on the encryption keys in the IMF. Once each segment is decrypted, the archive file is assembled at step 340 and then decompressed at step 350, yielding the original virtual machine data in unencrypted and uncompressed form.
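The assembly and decompression steps may be illustrated by the following sketch, which assumes that the segments have already been decrypted, that each segment is tagged with its position in the archive, and that `zlib` was the compression technique used. The `assemble` helper and the 64-byte segment size are hypothetical.

```python
import zlib


def assemble(segments_by_index: dict[int, bytes]) -> bytes:
    """Concatenate segments in index order, regardless of arrival order."""
    return b"".join(segments_by_index[i] for i in sorted(segments_by_index))


# Build a sample archive and split it, standing in for the storage-side steps.
original = b"virtual machine image contents " * 1000
archive = zlib.compress(original)
size = 64  # hypothetical segment size for the demonstration
decrypted = {i // size: archive[i:i + size] for i in range(0, len(archive), size)}

# Simulate segments arriving out of order from different nodes.
arrived = dict(reversed(list(decrypted.items())))

recovered = zlib.decompress(assemble(arrived))
```

Sorting by index before concatenation is what makes the order of arrival irrelevant, which is useful when the segments are retrieved in parallel.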
Although the present disclosure has been described in several embodiments, a myriad of changes, substitutions, and modifications may be suggested to one skilled in the art, and it is intended that the present disclosure encompass such changes, substitutions, and modifications as fall within the scope of the present appended claims.
Number | Date | Country | |
---|---|---|---|
20130305046 A1 | Nov 2013 | US |