SYSTEMS AND METHODS FOR SECURE DISTRIBUTED STORAGE

Abstract
Embodiments relate to systems and methods for secure distributed storage. In aspects, a set of remote storage hosts, such as personal computers, servers, media devices, cell phones, or others, can subscribe or register to provide storage via a cloud-based or other distributed network. Source data from an originating computer, such as a data file, can be decomposed into data storage subunits, each of which is encrypted via a cloud management system or other logic or control. The data storage subunits can comprise data blocks of even or uneven size. The set of encrypted data storage subunits can be registered to a table or other record, and disseminated to the remote storage hosts. In the event of data loss at the originating computer or at other times, the remotely stored data storage subunits can be extracted, decrypted, and reassembled to reconstruct the original source data.
Description
FIELD

The present teachings relate to systems and methods for secure distributed storage, and more particularly to platforms and techniques for receiving data for storage in a cloud-based or other distributed network, in which remote client machines supporting the cloud-based or other network can receive pieces of files or other data objects in secure format from a source machine, store that data locally, and transmit that data back to the original source machine for backup, data reconstruction, or other purposes.


BACKGROUND

Platforms for redundant data storage are known. For instance, storage such as RAID (redundant array of inexpensive disks) servers, disaster recovery storage sites, and other storage or services are available which create and store an image of a file, disk or other storage object to permit a user to access and recover data when original or source data becomes compromised or unavailable, such as, for example, when a transaction server with an associated database crashes, or other events occur.


In other regards, cloud-based computing networks have become more prevalent for purposes of deploying virtual machines, networks, storage, and other resources or services. It may be possible to generate data storage or data backup using existing cloud-based network infrastructures. However, existing cloud-based or other distributed networks may not permit a user wishing to perform data backups and/or data recovery to break the data being backed up into smaller storage subunits, and disseminate those data fragments to various remote storage hosts in a cloud-based or otherwise distributed network. Existing data backup platforms likewise may not permit distributed storage to a set of diverse hosts on a secure basis. It may be desirable to provide systems and methods for secure distributed storage, in which a file or other data object can be decomposed into small storage subunits, encrypted or otherwise secured, and distributed to cloud-based or other remote storage hosts, for data recovery or other purposes.





DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the present teachings and together with the description, serve to explain the principles of the present teachings. In the figures:



FIG. 1 illustrates an overall cloud system architecture in which various embodiments of the present teachings can be practiced;



FIG. 2 illustrates an overall cloud system architecture including multiple cloud arrangements in which various embodiments of the present teachings can be practiced in another regard, according to various embodiments;



FIG. 3 illustrates a network configuration in which a cloud management system can perform various storage, data processing, and recovery functions, according to various embodiments;



FIG. 4 illustrates an exemplary hardware configuration for a cloud management system, according to various embodiments; and



FIG. 5 illustrates a flowchart for overall storage, data processing, and recovery processing in a cloud computing environment, according to various embodiments.





Embodiments of systems and methods for secure distributed storage described herein can be implemented in, or supported by, a cloud network architecture. As used herein, a “cloud” can comprise a collection of resources that can be invoked to instantiate a virtual machine, process, or other resource for a limited or defined duration. As shown for example in FIG. 1, the collection of resources supporting a cloud 102 can comprise a set of resource servers 108 configured to deliver computing components needed to instantiate a virtual machine, process, or other resource. For example, one group of resource servers can host and serve an operating system or components thereof to deliver to and instantiate a virtual machine. Another group of resource servers can accept requests to host computing cycles or processor time, to supply a defined level of processing power for a virtual machine. A further group of resource servers can host and serve applications to load on an instantiation of a virtual machine, such as an email client, a browser application, a messaging application, or other applications or software. Other types of resource servers are possible.


In embodiments, the entire set of resource servers 108 or other hardware or software resources used to support the cloud 102 along with its instantiated virtual machines is managed by a cloud management system 104. The cloud management system 104 can comprise a dedicated or centralized server and/or other software, hardware, and network tools that communicate via network 106 such as the Internet or other public or private network with all sets of resource servers to manage the cloud 102 and its operation. To instantiate a new set of virtual machines, a user can transmit an instantiation request to the cloud management system 104 for the particular type of virtual machine they wish to invoke for their intended application. A user can for instance make a request to instantiate a set of virtual machines configured for email, messaging or other applications from the cloud 102. The request can be received and processed by the cloud management system 104, which identifies the type of virtual machine, process, or other resource being requested. The cloud management system 104 can then identify the collection of resources necessary to instantiate that machine or resource. In embodiments, the set of instantiated virtual machines or other resources can for example comprise virtual transaction servers used to support Web storefronts, or other transaction sites.


In embodiments, the user's instantiation request can specify a variety of parameters defining the operation of the set of virtual machines to be invoked. The instantiation request, for example, can specify a defined period of time for which the instantiated machine or process is needed. The period of time can be, for example, an hour, a day, or other increment of time. In embodiments, the user's instantiation request can specify the instantiation of a set of virtual machines or processes on a task basis, rather than for a predetermined amount of time. For instance, a user could request resources until a software update is completed. The user's instantiation request can specify other parameters that define the configuration and operation of the set of virtual machines or other instantiated resources. For example, the request can specify an amount of processing power or input/output (I/O) throughput the user wishes to be available to each instance of the virtual machine or other resource. In embodiments, the requesting user can for instance specify a service level agreement (SLA) acceptable for their application. Other parameters and settings can be used. One skilled in the art will realize that the user's request can likewise include combinations of the foregoing exemplary parameters, and others.
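By way of illustration only, the parameters of an instantiation request described above could be gathered into a simple record. The following is a minimal sketch rather than a required format; the class name, field names, and units are hypothetical, and the rule that a request is either time-based or task-based (but not both) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InstantiationRequest:
    """Hypothetical record of the parameters a user might supply."""
    machine_type: str                          # e.g. "email" or "messaging"
    count: int = 1                             # number of virtual machines requested
    duration_hours: Optional[float] = None     # time-based request (an hour, a day, ...)
    until_task: Optional[str] = None           # task-based request, e.g. "software update"
    cpu_units: Optional[int] = None            # requested processing power (assumed unit)
    io_throughput_mbps: Optional[int] = None   # requested I/O throughput (assumed unit)
    sla: Optional[str] = None                  # acceptable service level agreement

    def validate(self) -> None:
        """Reject requests that are malformed under this sketch's assumptions."""
        if self.count < 1:
            raise ValueError("count must be at least 1")
        # Assumption: a request is either time-based or task-based, not both.
        if (self.duration_hours is None) == (self.until_task is None):
            raise ValueError("specify exactly one of duration_hours or until_task")
```

A request for ten email machines for one hour could then be written as InstantiationRequest("email", count=10, duration_hours=1.0).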


When the request to instantiate a set of virtual machines or other resources has been received and the necessary resources to build that machine or resource have been identified, the cloud management system 104 can communicate with one or more sets of resource servers 108 to locate resources to supply the required components. The cloud management system 104 can select providers from the diverse set of resource servers 108 to assemble the various components needed to build the requested set of virtual machines or other resources. It may be noted that in some embodiments, permanent storage such as hard disk arrays may not be included or located within the set of resource servers 108 available to the cloud management system 104, since the set of instantiated virtual machines or other resources may be intended to operate on a purely transient or temporary basis. In embodiments, other hardware, software or other resources not strictly located or hosted in the cloud can be leveraged as needed. For example, other software services that are provided outside of the cloud 102 and hosted by third parties can be invoked by in-cloud virtual machines. For further example, other non-cloud hardware and/or storage services can be utilized as an extension to the cloud 102, either on an on-demand or subscribed or decided basis.


With the resource requirements identified, the cloud management system 104 can extract and build the set of virtual machines or other resources on a dynamic or on-demand basis. For example, one set of resource servers 108 may respond to an instantiation request for a given quantity of processor cycles with an offer to deliver that computational power immediately and guaranteed for the next hour. A further set of resource servers 108 can offer to immediately supply communication bandwidth, for example on a guaranteed minimum or best-efforts basis. In other embodiments, the set of virtual machines or other resources can be built on a batch basis or at a particular future time. For example, a set of resource servers 108 may respond to a request for instantiation at a programmed time with an offer to deliver the specified quantity of processor cycles within a specific amount of time, such as the next 12 hours.


The cloud management system 104 can select a group of servers in the set of resource servers 108 that match or best match the instantiation request for each component needed to build the virtual machine or other resource. The cloud management system 104 can then coordinate the integration of the completed group of servers from the set of resource servers 108, to build and launch the requested set of virtual machines or other resources. The cloud management system 104 can track the combined group of servers selected from the set of resource servers 108, or other distributed resources that are dynamically or temporarily combined, to produce and manage the requested virtual machine population or other resources.
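One possible way to select the servers that match or best match a request for each component is sketched below. The present teachings do not prescribe a particular matching rule; the rule used here (the soonest-available offer that meets the required quantity, with ties broken in favor of the smaller offer) is an assumption, as are the names Offer and select_servers.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Offer:
    """Hypothetical offer returned by a resource server."""
    server: str                # address of the offering server
    component: str             # e.g. "cpu", "bandwidth", "os_image"
    quantity: int              # amount of the component offered
    available_in_hours: float  # 0.0 means the resource is available immediately


def select_servers(needs: Dict[str, int], offers: List[Offer]) -> Dict[str, Offer]:
    """Pick one offer per needed component under the assumed best-match rule."""
    chosen: Dict[str, Offer] = {}
    for component, required in needs.items():
        candidates = [
            o for o in offers
            if o.component == component and o.quantity >= required
        ]
        if not candidates:
            raise LookupError(f"no offer satisfies {component!r}")
        # Soonest delivery first; among equally fast offers, the smallest sufficient one.
        chosen[component] = min(
            candidates, key=lambda o: (o.available_in_hours, o.quantity)
        )
    return chosen
```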


In embodiments, the cloud management system 104 can generate a resource aggregation table that identifies the various sets of resource servers that will be used to supply the components of the virtual machine or process. The sets of resource servers can be identified by unique identifiers such as, for instance, Internet protocol (IP) addresses or other addresses. The cloud management system 104 can register the finalized group of servers in the set of resource servers 108 contributing to an instantiated machine or process.


The cloud management system 104 can then set up and launch the initiation process for the virtual machines, processes, or other resources to be delivered from the cloud. The cloud management system 104 can for instance transmit an instantiation command or instruction to the registered group of servers in set of resource servers 108. The cloud management system 104 can receive a confirmation message back from each participating server in set of resource servers 108 indicating a status regarding the provisioning of their respective resources. Various sets of resource servers may confirm, for example, the availability of a dedicated amount of processor cycles, amounts of electronic memory, communications bandwidth, or applications or other software prepared to be served.


As shown for example in FIG. 2, the cloud management system 104 can then instantiate one or more than one set of virtual machines 116, or other processes based on the resources supplied by the registered set of resource servers 108. In embodiments, the cloud management system 104 can instantiate a given number, for example, 10, 500, 1000, or other numbers of virtual machines to be made available to users on a network 114, such as the Internet or other public or private network. Each virtual machine can be assigned an instantiated machine ID that can be stored in the resource aggregation table, or other record or image of the instantiated population. Additionally, the cloud management system 104 can store the duration of each virtual machine and the collection of resources utilized by the complete set of instantiated virtual machines 116.
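For purposes of illustration, a resource aggregation table of the kind described above could be kept as a mapping from an instantiated machine ID to the contributing resource servers and the duration of the machine. This is a sketch only: the class and method names are hypothetical, identifying servers by IP address is one of the options the description mentions, and generating machine IDs with uuid4 is an assumption.

```python
import ipaddress
import uuid
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class AggregationEntry:
    machine_id: str
    duration_hours: float
    servers: Dict[str, str] = field(default_factory=dict)  # component -> server IP


class ResourceAggregationTable:
    """Hypothetical in-memory record of instantiated machines and their servers."""

    def __init__(self) -> None:
        self._entries: Dict[str, AggregationEntry] = {}

    def register(self, servers: Dict[str, str], duration_hours: float) -> str:
        """Record the contributing servers and return a new machine ID."""
        for address in servers.values():
            ipaddress.ip_address(address)  # raises ValueError if malformed
        machine_id = uuid.uuid4().hex      # assumed ID scheme
        self._entries[machine_id] = AggregationEntry(
            machine_id, duration_hours, dict(servers)
        )
        return machine_id

    def servers_for(self, machine_id: str) -> Dict[str, str]:
        """Return a copy of the component-to-server mapping for a machine."""
        return dict(self._entries[machine_id].servers)
```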


In embodiments, the cloud management system 104 can further store, track and manage a user's identity and associated set of rights or entitlements to software, hardware, and other resources. Each user that populates a set of virtual machines in the cloud can have specific rights and resources assigned and made available to them. The cloud management system 104 can track and configure specific actions that a user can perform, such as provision a set of virtual machines with software applications or other resources, configure a set of virtual machines to desired specifications, submit jobs to the set of virtual machines or other host, manage other users of the set of instantiated virtual machines 116 or other resources, and other privileges or actions. The cloud management system 104 can further generate records of the usage of instantiated virtual machines to permit tracking, billing, and auditing of the services consumed by the user. In embodiments, the cloud management system 104 can for example meter the usage and/or duration of the set of instantiated virtual machines 116, to generate subscription billing records for a user that has launched those machines. Other billing or value arrangements are possible.


The cloud management system 104 can configure each virtual machine to be made available to users of the network 114 via a browser interface, or other interface or mechanism. Each instantiated virtual machine can communicate with the cloud management system 104 and the underlying registered set of resource servers 108 via a standard Web application programming interface (API), or via other calls or interfaces. The set of instantiated virtual machines 116 can likewise communicate with each other, as well as other sites, servers, locations, and resources available via the Internet or other public or private networks, whether within a given cloud 102 or between clouds.


It may be noted that while a browser interface or other front-end can be used to view and operate the set of instantiated virtual machines 116 from a client or terminal, the processing, memory, communications, storage, and other hardware as well as software resources required to be combined to build the virtual machines or other resources are all hosted remotely in the cloud 102. In embodiments, the set of virtual machines 116 or other resources may not depend on or require the user's own on-premise hardware or other resources. In embodiments, a user can therefore request and instantiate a set of virtual machines or other resources on a purely off-premise basis, for instance to build and launch a virtual storefront or other application.


Because the cloud management system 104 in one regard specifies, builds, operates and manages the set of instantiated virtual machines 116 on a logical level, the user can request and receive different sets of virtual machines and other resources on a real-time or near real-time basis, without a need to specify or install any particular hardware. The user's set of instantiated machines 116, processes, or other resources can be scaled up or down immediately or virtually immediately on an on-demand basis, if desired. In embodiments, the various sets of resource servers that are accessed by the cloud management system 104 to support a set of instantiated virtual machines 116 or processes can change or be substituted, over time. The type and operating characteristics of the set of instantiated virtual machines 116 can nevertheless remain constant or virtually constant, since instances are assembled from abstracted resources that can be selected and maintained from diverse sources based on uniform specifications.


In terms of network management of the set of virtual machines 116 that have been successfully configured and instantiated, the cloud management system 104 can perform various network management tasks including security, maintenance, and metering for billing or subscription purposes. The cloud management system 104 of a given cloud 102 can, for example, install or terminate applications or appliances on individual machines. The cloud management system 104 can monitor operating virtual machines to detect any virus or other rogue process on individual machines, and for instance terminate the infected application or virtual machine. The cloud management system 104 can likewise manage an entire set of instantiated clients 116 or other resources on a collective basis, for instance, to push or deliver a software upgrade to all active virtual machines. Other management processes are possible.


In embodiments, more than one set of virtual machines can be instantiated in a given cloud at the same, overlapping or successive times. The cloud management system 104 can, in such implementations, build, launch and manage multiple sets of virtual machines based on the same or different underlying set of resource servers 108, with populations of different instantiated virtual machines 116 such as may be requested by different users. The cloud management system 104 can institute and enforce security protocols in a cloud 102 hosting multiple sets of virtual machines. Each of the individual sets of virtual machines can be hosted in a respective partition or sub-cloud of the resources of the main cloud 102. The cloud management system 104 of a cloud can for example deploy services specific to isolated or defined sub-clouds, or isolate individual workloads/processes within the cloud to a specific sub-cloud. The subdivision of the cloud 102 into distinct transient sub-clouds or other sub-components which have assured security and isolation features can assist in establishing a multiple user or multi-tenant cloud arrangement. In a multiple user scenario, each of the multiple users can use the cloud platform as a common utility while retaining the assurance that their information is secure from other users of the overall cloud system. In further embodiments, sub-clouds can nevertheless be configured to share resources, if desired.


In embodiments, and as also shown in FIG. 2, the set of instantiated virtual machines 116 generated in a first cloud 102 can also interact with a set of instantiated virtual machines or processes generated in a second, third or further cloud 102. The cloud management system 104 of a first cloud 102 can interface with the cloud management system 104 of a second cloud 102, to coordinate those domains and operate the clouds and/or virtual machines or processes on a combined basis. The cloud management system 104 of a given cloud 102 can track and manage individual virtual machines or other resources instantiated in that cloud, as well as the set of instantiated virtual machines or other resources in other clouds.


In the foregoing and other embodiments, the user making an instantiation request or otherwise accessing or utilizing the cloud network can be a person, customer, subscriber, administrator, corporation, organization, or other entity. In embodiments, the user can be or include another virtual machine, application or process. In further embodiments, multiple users or entities can share the use of a set of virtual machines or other resources.



FIG. 3 illustrates an exemplary network incorporating cloud-based resources and other elements that can be used to generate a secure distributed backup of a source data object 212, according to various embodiments. In embodiments as shown, a source machine 204 such as a client, server, host, target, or other machine or device can host or access a source data object 212. In aspects, source data object 212 can be or include a set of data such as, for instance, a data file for use by an application, application code or files, operating system code or files, and/or other information. Source machine 204 can likewise host, access, or execute a storage engine 216 to control data access, communications, and/or other activity to generate a secure backup of source data object 212 via cloud 102 and/or associated resources. More particularly, in embodiments as shown, source machine 204 can communicate with a set of remote storage hosts 210 via one or more networks 106, along with cloud 102 and associated resources. Set of remote storage hosts 210 can be or include host clients, targets, and/or other machines which can be registered by individual and/or home users, or others, via cloud management system 104 or other network management logic to contribute or dedicate some or all of the resources of the participating machine to cloud 102 for data storage and recovery purposes. In embodiments, individual machines in set of remote storage hosts 210 can be or include desktop computers, laptop computers, media playback devices, cellular telephones or other network-enabled communications devices, and/or other devices, machines, or hardware. In embodiments, individual machines in set of remote storage hosts 210 can assign or subscribe storage resources, such as hard disk storage, electronic memory or storage, optical storage, and/or other storage media to data backup, storage, and recovery operations.


In embodiments as shown, when data storage and/or backup operations are desired, source system 204 can transmit one or more source data object 212 to cloud management system 104 or other management logic. In embodiments, source system 204 can host a storage engine 216, which can comprise software and/or logic to access source data object 212, and transmit that object to cloud management system 104 via one or more networks 106, for instance using TCP/IP (transmission control protocol/Internet protocol) or other formats or connections. Cloud management system 104 or other logic can receive source data object 212, and divide, partition, or otherwise decompose source data object 212 into a set of data storage subunits 202. Set of data storage subunits 202 can, for instance, be or include files, datagrams, or other data objects of comparatively small size for distribution to remote cloud-based or distributed hosts. In embodiments, set of data storage subunits 202 can be of equal size, or unequal size, for instance, to accommodate available storage in different hosts.
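The decomposition of a source data object into subunits of equal or unequal size can be sketched as follows. The example is illustrative only; the function names are hypothetical, and sizing unequal subunits according to a list of per-host capacities is one assumed way of accommodating the storage available in different hosts.

```python
from typing import Iterable, List


def split_equal(data: bytes, size: int) -> List[bytes]:
    """Cut data into consecutive subunits of `size` bytes (the last may be shorter)."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [data[i:i + size] for i in range(0, len(data), size)]


def split_by_capacity(data: bytes, capacities: Iterable[int]) -> List[bytes]:
    """Cut data into consecutive subunits no larger than each host's capacity."""
    pieces: List[bytes] = []
    offset = 0
    for capacity in capacities:
        if offset >= len(data):
            break                      # everything has already been placed
        if capacity <= 0:
            continue                   # skip hosts with no free storage
        pieces.append(data[offset:offset + capacity])
        offset += capacity
    if offset < len(data):
        raise ValueError("combined capacity is smaller than the data object")
    return pieces
```

In either case, concatenating the subunits in order reproduces the original bytes, which is the property the later reconstruction relies on.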


Cloud management system 104 can encrypt or otherwise secure set of data storage subunits 202, for instance, using an encryption engine 214 to apply public/private key security or infrastructure to those data pieces. When secured, cloud management system 104 can transmit one or more data units in set of data storage subunits 202 to corresponding hosts in set of remote storage hosts 210, such as remote personal computers, laptops, workstations, media playback devices, or other storage resources. In embodiments, set of remote storage hosts 210 can subscribe and participate in cloud 102 to offer hard disk, electronic, optical or other storage resources via cloud management system 104, using resource-sharing mechanisms described herein. In embodiments, it will also be noted that set of remote storage hosts 210 can also or instead contribute hard disk, electronic, optical or other storage via other logic, such as a network management server, or on a peer-to-peer or other basis.


Source data object 212 can thereby be accessed, decomposed, secured and distributed to set of remote storage hosts 210 for data backup, recovery, mirroring, and/or other purposes. In aspects, cloud management system 104 can record the assignment of individual data subunits to recipient hosts, for instance in a lookup table or other record. Upon the occurrence of a data recovery event or at other times, source system 204 can request the retrieval and reconstruction of source data object 212 via cloud management system 104 or other logic. For example, source system 204 can detect data corruption or data loss in its copy of source data object 212, for instance on an incorporated hard disk. Source system 204 can then transmit a data recovery request or command to cloud management system 104 or other destination. In response, cloud management system 104 or other logic can access and retrieve set of data storage subunits 202 from set of remote storage hosts 210 via one or more networks 106, and/or other channels. After collecting set of data storage subunits 202 from those hosts, cloud management system 104 or other logic can decrypt set of data storage subunits 202, and reconstruct source data object 212 from those constituent data pieces. After reassembly, cloud management system 104 can transmit the recovered source data object 212 to source system 204 and/or other desired destination.



FIG. 4 illustrates an exemplary diagram of hardware and other resources that can be incorporated in a source system 204 configured to communicate with set of remote storage hosts 210 via one or more networks 106, according to embodiments. In embodiments as shown, source system 204 can comprise a processor 130 communicating with memory 132, such as electronic random access memory, operating under control of or in conjunction with operating system 136. Operating system 136 can be, for example, a distribution of the Linux™ operating system, the Unix™ operating system, or other open-source or proprietary operating system or platform. Processor 130 also communicates with cloud store 138, such as a database stored on a local hard drive. Processor 130 further communicates with network interface 134, such as an Ethernet or wireless data connection, which in turn communicates with one or more networks 106, such as the Internet or other public or private networks. Processor 130 also communicates with cloud store 138 and management engine 128, to execute control logic and control the operation of virtual machines and other resources in cloud 102. Other configurations of source system 204, associated network connections, and other hardware and software resources are possible.



FIG. 5 illustrates a flowchart of overall storage, data processing, and recovery processing, according to various embodiments of the present teachings. In step 502, processing can begin. In step 504, source data object 212 can be accessed, updated, and/or read out via source system 204, such as by reading a file from a hard disk incorporated in source system 204. In 506, source data object 212 can be transmitted to cloud management system 104 or other management logic, for example using storage engine 216 of source system 204, or other logic. In 508, the source data object 212 can be received in cloud management system 104 and/or other destination. In 510, cloud management system 104 can divide or decompose source data object 212 into a set of data storage subunits 202, such as data blocks, datagrams, and/or other data units derived from source data object 212.


In 512, cloud management system 104 and/or other logic can encrypt or otherwise secure set of data storage subunits 202, for instance using encryption engine 214 to generate public/private key pairs, and/or using an authentication or certificate authority, as understood by persons skilled in the art. In embodiments, a password or challenge mechanism can also or instead be used. In 514, cloud management system 104 and/or other management logic can assign each data storage subunit in set of data storage subunits 202 to one or more hosts in set of remote storage hosts 210, and can store those assignments to storage management store 218 or other data store. In embodiments, the assignment or association of a data storage subunit to one or more host can be stored in a table, tree, and/or other record or format.
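The assignment of each data storage subunit to one or more hosts, and the table recording those assignments, can be illustrated with the following sketch. A round-robin placement is assumed purely for the example, since the description does not fix an assignment policy; the function name, the copies parameter, and the use of plain strings as subunit and host identifiers are likewise hypothetical.

```python
from typing import Dict, List, Sequence


def assign_subunits(
    subunit_ids: Sequence[str], hosts: Sequence[str], copies: int = 1
) -> Dict[str, List[str]]:
    """Map each subunit ID to `copies` distinct hosts, rotating through the host list."""
    if not hosts:
        raise ValueError("at least one remote storage host is required")
    if not 1 <= copies <= len(hosts):
        raise ValueError("copies must be between 1 and the number of hosts")
    table: Dict[str, List[str]] = {}
    for index, subunit_id in enumerate(subunit_ids):
        # Consecutive positions in the rotation are distinct because copies <= len(hosts).
        table[subunit_id] = [
            hosts[(index + k) % len(hosts)] for k in range(copies)
        ]
    return table
```

With copies greater than one, each subunit is held by more than one host, so in this sketch a single unavailable host would not prevent retrieval.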


In 516, access to the set of data storage subunits 202 can be initiated based on a recovery event, and/or other conditions. For instance, source system 204 can detect a hard disk crash, virus intrusion, and/or other data fault or condition, and transmit a message to cloud management system 104 or other logic to request the recovery of source data object 212. In 518, set of data storage subunits 202 can be accessed and/or retrieved via corresponding host assignments stored in storage management store 218, and/or other retrieval mechanisms. In 520, cloud management system 104 or other logic can decrypt the retrieved set of data storage subunits 202, as appropriate. In 522, cloud management system 104 can reconstruct and/or restore source data object 212 from the decrypted set of storage subunits 202. In embodiments, cloud management system 104 or other logic or site can transmit the reconstructed source data object 212 to source system 204 or other destination. In step 524, as understood by persons skilled in the art, processing can repeat, return to a prior processing point, jump to a further processing point, or end.
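The overall flow of decomposing, securing, distributing, retrieving, decrypting, and reconstructing a data object can be sketched end to end as follows. This is a simplified illustration under stated assumptions: remote storage hosts are simulated by in-memory dictionaries, all names are hypothetical, and the cipher is passed in as a pair of callables. The toy_xor helper is a placeholder included only so the sketch runs; it is not secure and is not the public/private key encryption described above, which would be supplied in its place.

```python
from typing import Callable, Dict, List

Cipher = Callable[[bytes], bytes]
HostStore = Dict[str, Dict[int, bytes]]  # host name -> {subunit index: stored bytes}


def store(data: bytes, size: int, hosts: HostStore, encrypt: Cipher) -> List[str]:
    """Decompose, secure, and distribute; return a table of host name per subunit index."""
    if size <= 0 or not hosts:
        raise ValueError("size must be positive and at least one host is required")
    names = sorted(hosts)
    table: List[str] = []
    for index, start in enumerate(range(0, len(data), size)):
        name = names[index % len(names)]             # assumed round-robin placement
        hosts[name][index] = encrypt(data[start:start + size])
        table.append(name)                           # record the assignment
    return table


def recover(table: List[str], hosts: HostStore, decrypt: Cipher) -> bytes:
    """Retrieve each subunit via the table, decrypt it, and reassemble in order."""
    return b"".join(
        decrypt(hosts[name][index]) for index, name in enumerate(table)
    )


def toy_xor(key: bytes) -> Cipher:
    """Placeholder transform only: repeating-key XOR is NOT secure encryption."""
    def apply(block: bytes) -> bytes:
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(block))
    return apply
```

Because XOR with the same key is its own inverse, the same toy_xor callable can be passed as both encrypt and decrypt when exercising the sketch.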


The foregoing description is illustrative, and variations in configuration and implementation may occur to persons skilled in the art. For example, while embodiments have been described in which one source system 204 distributes one source data object 212 for secure storage in set of remote storage hosts 210, in embodiments, multiple source systems can transmit a source data object 212 to those hosts. Similarly, while embodiments have been described which involve the storage of one source data object 212, in embodiments, multiple data objects from a source system can be decomposed, secured and stored. Other resources described as singular or integrated can in embodiments be plural or distributed, and resources described as multiple or distributed can in embodiments be combined. The scope of the present teachings is accordingly intended to be limited only by the following claims.

Claims
  • 1. A method of generating a remote distributed image of a stored data object, comprising: receiving the stored data object from a source; decomposing the stored data object into a set of storage subunits; encrypting the set of storage subunits; transmitting each of the set of encrypted storage subunits to a respective remote storage host; and generating a record of a network location of each remote storage host usable to reconstruct the stored data object from the set of encrypted storage units.
  • 2. The method of claim 1, wherein the stored data object comprises a computer readable file.
  • 3. The method of claim 1, wherein the encryption comprises encryption via a public/private key infrastructure.
  • 4. The method of claim 1, wherein each remote storage host comprises at least one of a client computer, a server computer, a media playback device, and a wireless communications device.
  • 5. The method of claim 1, wherein the record of the network location of each remote storage host comprises a storage table, the storage table associating each storage subunit with at least one remote storage host.
  • 6. The method of claim 1, wherein the set of storage subunits comprises a set of files comprising equally sized data blocks.
  • 7. The method of claim 1, wherein the set of storage subunits comprises a set of files comprising unequally sized data blocks.
  • 8. The method of claim 1, wherein each of the set of remote storage hosts is managed via a cloud management system.
  • 9. The method of claim 1, further comprising retrieving the set of encrypted storage subunits from the set of remote storage hosts via the record.
  • 10. The method of claim 9, further comprising decrypting the set of encrypted storage subunits.
  • 11. The method of claim 10, further comprising reconstructing the stored data object from the decrypted storage subunits.
  • 12. The method of claim 11, wherein the reconstructing is generated based on a data recovery event.
  • 13. A system, comprising: a first interface to a data store storing a stored data object; and a second interface to a set of remote storage hosts; and a processor, communicating with the data store via the first interface and the set of remote storage hosts via the second interface, the processor being configured to: receive the stored data object, decompose the stored data object into a set of storage subunits, encrypt the set of storage subunits, transmit each of the set of encrypted storage subunits to a respective remote storage host in the set of remote storage hosts, and generate a record of a network location of each remote storage host usable to reconstruct the stored data object from the set of encrypted storage units.
  • 14. The system of claim 13, wherein the stored data object comprises a computer readable file.
  • 15. The system of claim 13, wherein the encryption comprises encryption via a public/private key infrastructure.
  • 16. The system of claim 13, wherein each remote storage host comprises at least one of a client computer, a server computer, a media playback device, and a wireless communications device.
  • 17. The system of claim 13, wherein each of the set of remote storage hosts is managed via a cloud management system.
  • 18. The system of claim 13, wherein the processor is further configured to retrieve the set of encrypted storage subunits from the set of remote storage hosts via the record, decrypt the set of encrypted storage subunits, and reconstruct the stored data object from the decrypted storage subunits.
  • 19. A reconstructed data object, the reconstructed data object being stored in a computer readable storage medium and being generated via a method of: receiving a stored data object from a source; decomposing the stored data object into a set of storage subunits; encrypting the set of storage subunits; transmitting each of the set of encrypted storage subunits to a respective remote storage host; generating a record of a network location of each remote storage host usable to reconstruct the stored data object from the set of encrypted storage units; retrieving the set of encrypted storage subunits from the set of remote storage hosts via the record; decrypting the set of encrypted storage subunits; and reconstructing the stored data object from the decrypted storage subunits.