The present invention relates generally to storage networks, and more particularly to integrated local and cloud storage services.
The demand for storage has been rapidly increasing. As the amount of data such as digital media stored by users grows, so does the need to store digital media reliably over extended periods of time. Traditional backup solutions periodically copy data to, for example, backup tapes, compact discs (CDs), or other local storage media. However, such solutions are not optimal, as the backup media are stored in a single location and are prone to failure.
Other solutions include storing data files on a local hard-drive of a personal computer (PC) and synchronizing the data remotely using hosted storage services. Having a remote backup ensures that data is stored in multiple locations and is protected from local disasters, such as fires or floods. However, such solutions require installation of special client software on each individual PC, which is prone to software incompatibilities, lack of central control, and high deployment cost.
Commercially available services referred to as cloud storage services provide mass storage through a web service interface available through the Internet.
A data center 110 typically consists of servers and mass storage facilitating cloud storage services to the clients 120. Such services enable applications including, for example, backup and restoration of data, data migration, data sharing, data collaboration, and so on. Cloud storage services are accessible from anywhere in the world. To this end, each client 120 implements a web services interface designed to at least synchronize data with the data centers 110. Applications enabled by the cloud storage services are not aware of the specifics of the services and the underlying data synchronization operations. The disadvantage of commercially available cloud storage services is that such services do not implement standard file sharing protocols (e.g., common internet file system (CIFS) or network file system (NFS)). Furthermore, accessing files stored in the cloud storage is typically slower than accessing files stored in local storage devices.
A network-attached storage (NAS) device is a self-contained appliance connected to a network with a primary purpose of supplying file-based data storage services to clients on the network. Specifically, a NAS device provides the functionality of data storage, file-based operations (e.g., read, write, delete, modify, etc.), and the management of these functionalities. However, commercially available NAS devices do not operate in conjunction with cloud storage services. Therefore, organizations and businesses utilizing NAS devices to store and manage their data cannot benefit from mass storage and applications of cloud storage services.
It would therefore be advantageous to provide a solution for integrating NAS devices with cloud storage services.
Certain embodiments disclosed herein include a network attached storage device for performing network attached storage operations with cloud storage services. The device includes at least one network controller for communicating with a plurality of clients over a local area network (LAN) and with the cloud storage service (CSS) over a wide area network (WAN); a cache memory for locally caching data of the CSS in the device; and a virtual cloud drive (VCD) for enabling the plurality of clients to perform file-based operations on data stored in the CSS using at least one file sharing protocol.
Certain embodiments disclosed herein also include a method for performing network attached storage operations with cloud storage services. The method comprises receiving a request from a client of a plurality of clients to read a byte range of a file in a virtual cloud drive (VCD); determining a set of data blocks required for reconstructing the byte range; determining whether a first set of data blocks of the set of data blocks is maintained in a cache memory; fetching the first set of data blocks from the cache memory upon determining that the first set of data blocks is maintained in the cache memory; retrieving a second set of data blocks from at least one cloud storage service (CSS), wherein the second set of data blocks are data blocks within the byte range that are not maintained in the cache memory; reconstructing the byte range from the first set and second set of data blocks; and returning the reconstructed data to the client.
Certain embodiments disclosed herein also include a network attached storage device for performing network attached storage operations with cloud storage services. The device comprises at least one network controller for communicating with a plurality of clients over a local area network (LAN) and with a cloud storage service (CSS) over a wide area network (WAN); a plurality of storage devices for locally storing data in the device; a CSS module for bidirectional synchronizing between data locally stored in the plurality of storage devices and data stored in the CSS and for unidirectional synchronizing of data locally stored in the plurality of storage devices; and a processor for enabling the plurality of clients to perform file-based operations on the device using a file sharing protocol.
The subject matter herein disclosed is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the disclosed embodiments will be apparent from the following detailed description taken in conjunction with the accompanying drawings.
It is important to note that the embodiments disclosed herein are only examples of the many advantageous uses of the innovative teachings herein. In general, statements made in the specification of the present application do not necessarily limit any of the various claimed embodiments. Moreover, some statements may apply to some inventive features but not to others. In general, unless otherwise indicated, singular elements may be in plural and vice versa with no loss of generality. In the drawings, like numerals refer to like parts through several views.
As illustrated in
According to certain embodiments, in order to allow transparent access from clients 210 to files stored in the CSS 240, the device 220 provides a shared network folder (hereinafter the “virtual cloud drive” (VCD)). The VCD exposes files that are stored at the CSS 240. When a client 210 tries to access a specific byte range from a VCD of the device 220 that is mapped to the CSS 240, the device 220 transparently contacts the CSS 240 and requests the blocks including the requested byte range on behalf of the client 210. The blocks are then reassembled, decrypted and decompressed as needed, to recover the original byte range. The reconstructed byte range is then returned to the client 210. To the client 210, the file appears to be stored locally on the device 220. The device 220 may cache recently and/or frequently accessed data blocks in the memory 460 and/or in the storage 450. Such blocks can be returned directly from the cache instead of from the CSS 240.
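By way of non-limiting illustration only, the read path described above may be sketched as follows. The sketch assumes, for simplicity, fixed-size blocks and omits decryption and decompression; the names `read_range`, `fetch_from_css`, and `BLOCK_SIZE` are hypothetical and are not taken from the disclosed embodiments.

```python
from typing import Callable, Dict, List

BLOCK_SIZE = 4  # hypothetical fixed block size, kept tiny for illustration


def read_range(offset: int, length: int,
               block_codes: List[str],
               cache: Dict[str, bytes],
               fetch_from_css: Callable[[str], bytes]) -> bytes:
    """Return `length` bytes starting at `offset` of a file listed by block_codes."""
    if length <= 0:
        return b""
    first = offset // BLOCK_SIZE                 # first block touching the range
    last = (offset + length - 1) // BLOCK_SIZE   # last block touching the range
    parts = []
    for code in block_codes[first:last + 1]:
        block = cache.get(code)
        if block is None:
            block = fetch_from_css(code)  # cache miss: request the block from the CSS
            cache[code] = block           # keep it for subsequent reads
        parts.append(block)
    data = b"".join(parts)
    start = offset - first * BLOCK_SIZE
    return data[start:start + length]
```

In this sketch only the blocks absent from the cache are requested from the CSS, so a repeated read of the same range is served locally.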
The device 220 further includes a NAS module 470 emulating the device 220 as a NAS device and a CSS module 480 allowing the integration of the device 220 with the CSS 240. In accordance with an embodiment, the processor 410 runs an operating system (not shown) adapted to provide file-based operations on the CSS and further to control the operation of the modules 470 and 480. The storage controllers 430 include, but are not limited to, a small computer system interface (SCSI), a serial advanced technology attachment (SATA), a universal serial bus (USB), a fibre channel (FC), a serial attached SCSI (SAS), and the like. In certain embodiments, the storage devices 450 may be external to the device 220.
One of the primary tasks of the CSS module 480 is to periodically synchronize data between the device 220 (i.e., data stored in the storage devices 450) and the CSS 240. The synchronization may be in the direction from the device 220 to the CSS 240 (hereinafter “the outgoing direction”), in the direction from the CSS 240 to the device 220 (hereinafter “the incoming direction”), or simultaneously in both directions. It should be noted that all files can be synchronized, or a partial subset of the files can be synchronized.
Synchronization in the outgoing direction is typically used as a data backup mechanism, allowing files to be backed up to the CSS 240 for safekeeping. Synchronization in the incoming direction is typically used as a data distribution mechanism, allowing files to be distributed from the CSS 240 to the device 220 to provide fast and reliable local access to a set of files. Synchronization in both directions (bidirectional) is used to maintain data consistency between the device 220 and the CSS 240. This allows files to be modified or created both in the device 220 (through a file sharing protocol) and in the CSS 240 (through the web portal 340). It should be noted that, in certain embodiments, when using bidirectional synchronization, one or more devices 220 can be optionally synchronized to a single location in the CSS 240. This enables multiple devices 220 to synchronize with each other through the CSS 240, which acts as a mediator.
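As a non-limiting sketch, the three synchronization directions may be modeled as a planning step over the sets of files changed on each side. The function name `plan_sync` and the direction labels are hypothetical. The handling of files changed on both sides is not specified above, so the sketch merely reports such files as conflicts without resolving them.

```python
from typing import Dict, List, Set


def plan_sync(local_changed: Set[str], remote_changed: Set[str],
              direction: str) -> Dict[str, List[str]]:
    """Decide which files to upload to or download from the CSS."""
    if direction not in {"outgoing", "incoming", "both"}:
        raise ValueError(f"unknown direction: {direction}")
    upload: Set[str] = set()
    download: Set[str] = set()
    conflicts: Set[str] = set()
    if direction == "outgoing":        # backup: device -> CSS
        upload = set(local_changed)
    elif direction == "incoming":      # distribution: CSS -> device
        download = set(remote_changed)
    else:                              # bidirectional consistency
        conflicts = local_changed & remote_changed
        upload = local_changed - conflicts
        download = remote_changed - conflicts
    return {"upload": sorted(upload),
            "download": sorted(download),
            "conflicts": sorted(conflicts)}
```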
At S510, files stored either in the device 220 or the CSS 240 that have been changed since the last synchronization are marked. At S520, each marked file is divided into variable size blocks. At S530, each block is assigned a unique code using, for example, a message digest code (MDC) function. Thus, each block is addressed by its unique code. In accordance with an embodiment, steps S510, S520, and S530 can be executed concurrently in a pipelined fashion rather than sequentially, to provide higher efficiency.
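Steps S520 and S530 may be sketched, by way of example only, as follows. The content-defined boundary rule, the tuning constants, and the use of SHA-256 as the digest function are assumptions made for illustration; the embodiments above do not prescribe a particular chunking rule or digest.

```python
import hashlib
from typing import List

MIN_SIZE, MAX_SIZE, MASK = 64, 1024, 0xFF  # hypothetical tuning values


def split_blocks(data: bytes) -> List[bytes]:
    """Divide data into variable size blocks at content-defined boundaries."""
    blocks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Shift-and-add hash: older bytes shift out of the 32-bit window,
        # so the boundary decision depends only on recent input.
        h = ((h << 1) + byte) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            blocks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        blocks.append(data[start:])  # trailing block may be shorter than MIN_SIZE
    return blocks


def block_code(block: bytes) -> str:
    """Address a block by a digest of its content (SHA-256 assumed here)."""
    return hashlib.sha256(block).hexdigest()
```

Because boundaries depend on content rather than on absolute offsets, identical blocks receive identical codes wherever they occur, which is what allows a block to be addressed by its code.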
Thereafter, it is determined which of the data blocks have been modified. With this aim, at S535, the device 220 sends the block codes of all the blocks in a file to the CSS 240, which checks whether a block with the same code exists within the scope of the same DG. If a block with the same code already exists on the CSS 240 in the same DG, then the block does not need to be transmitted. In an embodiment, after the device 220 sends the block codes of all the blocks in each file, the CSS 240 replies with a compact run length encoded (RLE) list of the blocks that are missing on the CSS 240 and should be transferred.
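The reply at S535 may be illustrated by the following sketch, in which the missing blocks are encoded as runs of consecutive block indices. The `(start, count)` run representation and the function names are assumptions; the actual RLE format is not specified above.

```python
from typing import List, Set, Tuple


def missing_runs(block_codes: List[str], known: Set[str]) -> List[Tuple[int, int]]:
    """CSS side: encode indices of blocks it does not hold as (start, count) runs."""
    runs: List[Tuple[int, int]] = []
    for i, code in enumerate(block_codes):
        if code in known:
            continue  # block already stored, no transfer needed
        if runs and runs[-1][0] + runs[-1][1] == i:
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((i, 1))                        # start a new run
    return runs


def expand_runs(runs: List[Tuple[int, int]]) -> List[int]:
    """Device side: recover the block indices that should be transferred."""
    return [i for start, count in runs for i in range(start, start + count)]
```

For a newly created file with no blocks in common with stored data, the whole reply collapses to a single run, which is where the compactness of the encoding comes from.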
The CSS 240 maintains a reference count for each block, which is increased by 1 for each file that uses the block. When a file is deleted from the CSS 240, the reference count of all the blocks of the file is reduced by 1. When the reference count of a block reaches 0, this block is no longer used by any file and the storage space of this block may be freed. It should be appreciated that this approach results in significant storage space reduction, as multiple identical files or parts of files belonging to devices in each DG are stored only once in the CSS 240. This approach also reduces the consumption of network bandwidth, as only modified blocks are transmitted over the network 250. Furthermore, the CSS 240 can store a number of previous versions for each file, allowing the user to restore a file to an earlier version. Since the file is stored as multiple blocks, the entire file does not need to be duplicated. Rather, only the differences between file versions are stored. This reduces the required storage space for file versioning.
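The reference counting described above may be sketched as follows. The class `BlockStore` and its methods are hypothetical, an in-memory dictionary stands in for the storage of the CSS 240, and file versioning is not modeled.

```python
from typing import Dict, List, Optional


class BlockStore:
    """Toy block store that frees a block once no file references it."""

    def __init__(self) -> None:
        self.blocks: Dict[str, bytes] = {}    # block code -> block content
        self.refs: Dict[str, int] = {}        # block code -> number of files using it
        self.files: Dict[str, List[str]] = {} # file name -> ordered block codes

    def put_file(self, name: str, codes: List[str],
                 new_blocks: Optional[Dict[str, bytes]] = None) -> None:
        """Store a file; new_blocks holds only the blocks the store was missing."""
        self.blocks.update(new_blocks or {})
        for code in set(codes):               # one count per file, not per occurrence
            self.refs[code] = self.refs.get(code, 0) + 1
        old = self.files.get(name)
        self.files[name] = list(codes)
        if old is not None:                   # replacing a file releases its old blocks
            self._release(old)

    def delete_file(self, name: str) -> None:
        self._release(self.files.pop(name))

    def _release(self, codes: List[str]) -> None:
        for code in set(codes):
            self.refs[code] -= 1
            if self.refs[code] == 0:          # no file uses the block any longer
                del self.refs[code]
                self.blocks.pop(code, None)   # storage space may be freed
```

In this sketch a block shared by two files is stored once and survives the deletion of either file.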
At S540, all modified blocks that should be sent over the network 250 may be encrypted and compressed using a lossless compression algorithm, to reduce the bandwidth requirements. The encryption may be a block level encryption that uses, for example, a keyed-hash message authentication code (HMAC) function to add an authentication code to each block. In many cases, the operator of the CSS is not considered a trusted party by the device owner. Thus, the encryption key can be known only to the owner of the device 220, thereby preventing even the operator of the CSS 240 from viewing or altering the contents of the stored files.
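The compression and per-block authentication of S540 may be sketched as follows. The sketch is limited to lossless compression and an HMAC tag; the encryption step itself is omitted because the Python standard library provides no symmetric cipher, and an actual embodiment would encrypt the compressed body before computing the tag. The functions `seal_block` and `open_block` and the choice of zlib and SHA-256 are assumptions.

```python
import hashlib
import hmac
import zlib

TAG_LEN = 32  # size of a SHA-256 digest in bytes


def seal_block(block: bytes, key: bytes) -> bytes:
    """Compress a block and prepend an HMAC tag computed with the owner's key."""
    body = zlib.compress(block)  # lossless compression
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return tag + body


def open_block(sealed: bytes, key: bytes) -> bytes:
    """Verify the tag, then decompress; raise if the block was altered."""
    tag, body = sealed[:TAG_LEN], sealed[TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("block failed authentication")
    return zlib.decompress(body)
```

Since the key never leaves the device owner, a party holding only the sealed blocks cannot produce a valid tag for altered content.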
At S550, all modified blocks (which may be optionally encrypted and compressed) are transmitted from the device 220 to the CSS 240 over the network 250. It should be noted that multiple blocks are transmitted without waiting for the CSS 240 to acknowledge the reception of a previously transmitted block, thereby enabling efficient utilization of high latency links.
The communication between the device 220 and the CSS 240 is performed by means of a cloud transport protocol (CTP) implemented in accordance with an embodiment. The CTP runs over a secure, encrypted connection based on Transmission Control Protocol (TCP)/Internet Protocol (IP), such as secure sockets layer (SSL) or transport layer security (TLS). This ensures confidentiality against external eavesdroppers and protects the data in transit against malicious modification. The CTP also supports a message framing protocol for sending and receiving arbitrary length messages between the device 220 and the CSS 240, and implements an authentication method by which the device 220 authenticates to the CSS 240, for example, by using a security certificate (asymmetric key), or by means of a symmetric secret key or password. The CSS 240 authenticates to the device 220 by, for example, using a security certificate (asymmetric key), thus preventing an attacker from impersonating the CSS 240.
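The message framing of the CTP is not detailed above. One common way to carry arbitrary length messages over a byte stream, shown here purely as an assumed example, is to prefix each message with its length. The 4-byte big-endian header and the function names are hypothetical; the stream may be any object with `read` and `write` methods, such as a file object wrapping a TLS socket.

```python
import struct

HEADER = struct.Struct("!I")  # assumed 4-byte big-endian length prefix


def write_message(stream, payload: bytes) -> None:
    """Send one framed message: length header followed by the payload."""
    stream.write(HEADER.pack(len(payload)))
    stream.write(payload)


def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes, looping over short reads."""
    buf = b""
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream closed mid-message")
        buf += chunk
    return buf


def read_message(stream) -> bytes:
    """Receive one framed message."""
    (length,) = HEADER.unpack(read_exact(stream, HEADER.size))
    return read_exact(stream, length)
```

With such framing, several messages can be written back to back and are recovered intact by the receiver, which is compatible with transmitting multiple blocks without waiting for individual acknowledgments.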
In addition to synchronizing files with the local storage available on the device 220, the CSS 240 can be utilized to expand the amount of local storage on the device 220. With this aim, the CSS 240 is exposed on the device 220 as an extended storage space in the device 220. This allows the device 220 to have the capacity of a mass storage system (i.e., practically infinite storage space) and, specifically, to allow small NAS-like devices to have the storage space of mass storage systems. To allow access to the extended storage space as if it is on the device itself, the VCD allows read/write operations on the expanded storage space on the CSS 240.
The cloud connector 310 includes a unified cloud protocol module 610 for communicating with the device 220 by means of the cloud transport protocol described above. The cloud connector 310 also includes a permissions-and-quotas enforcement module 620, a service entitlement database 630, a cloud cache module (CCM) 640, a storage balancing module 650, and one or more cloud protocol drivers 660 for interfacing with storage devices and cloud storage providers in the CSS 240.
The storage balancing module 650 performs load balancing between multiple cloud storage providers 330 and possibly multiple local storage devices 320 based on criteria including, but not restricted to, performance, cost, and reliability. For example, a simplistic balancing scheme could be to store 20 percent of the data on a storage device 320, and 80 percent on a cloud storage provider 330. The split between the cloud storage providers 330 can be either static (for example, according to the block code) or dynamic (for example, based on the current cost and availability of each cloud storage provider).
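A static split according to the block code may be sketched as follows, using the 20/80 example above. The sketch assumes that block codes are hexadecimal digest strings; the function name and the backend labels are hypothetical.

```python
def choose_backend(block_code: str, local_share_percent: int = 20) -> str:
    """Map a block code to a backend so the same code always lands in the same place."""
    # The leading 32 bits of the digest are treated as a uniformly
    # distributed number and reduced to a bucket in the range 0..99.
    bucket = int(block_code[:8], 16) % 100
    if bucket < local_share_percent:
        return "storage_device_320"
    return "cloud_provider_330"
```

Because the decision depends only on the block code, no lookup table is needed to locate a block later; a dynamic policy based on cost or availability would, by contrast, have to record where each block was placed.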
The CCM 640 may cache recently used or frequently accessed data blocks locally, for reduction in communication costs to the cloud storage providers and reduced latency.
The permissions-and-quotas enforcement module 620 enforces and restricts access of the devices 220 to data blocks according to a list of access control rules. The permissions-and-quotas enforcement module 620 can also enforce storage quotas for each device 220 and provide differentiated service levels per customer. The entitlement database 630 is used to store the service level agreement (SLA) for each of the customers, having access through clients 210, subscribed to the third party cloud storage services 330. When a customer connects to the CSS 240, the entitlement database 630 is accessed to check whether the customer is entitled to the requested service. In addition, the entitlement database 630 contains additional service level information, such as storage and bandwidth quotas for each customer. It should be appreciated that the embodiments described herein provide a storage solution that combines the speed and practicality of NAS devices on the local network with the scalability and disaster protection of cloud storage services.
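The entitlement check and quota enforcement performed by the module 620 against the entitlement database 630 may be sketched, as a non-limiting example, as follows. The `Entitlement` record, its fields, and `authorize_upload` are hypothetical, and a dictionary stands in for the database.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Entitlement:
    storage_quota: int                   # bytes the customer may store
    used: int = 0                        # bytes currently stored
    services: frozenset = frozenset()    # services the customer subscribed to


class QuotaExceeded(Exception):
    """Raised when an upload would exceed the customer's storage quota."""


def authorize_upload(db: Dict[str, Entitlement], customer: str,
                     service: str, size: int) -> None:
    """Admit an upload only if the customer is entitled and within quota."""
    ent = db.get(customer)
    if ent is None or service not in ent.services:
        raise PermissionError(f"{customer} is not entitled to {service}")
    if ent.used + size > ent.storage_quota:
        raise QuotaExceeded(f"{customer} would exceed the storage quota")
    ent.used += size  # account for the newly stored bytes
```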
The various embodiments disclosed herein can be implemented as hardware, firmware, software, or any combination thereof. Moreover, the software is preferably implemented as an application program tangibly embodied on a program storage unit or computer readable medium consisting of parts, or of certain devices and/or a combination of devices. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units (“CPUs”), a memory, and input/output interfaces. The computer platform may also include an operating system and microinstruction code. The various processes and functions described herein may be either part of the microinstruction code or part of the application program, or any combination thereof, which may be executed by a CPU, whether or not such a computer or processor is explicitly shown. In addition, various other peripheral units may be connected to the computer platform such as an additional data storage unit and a printing unit. Furthermore, a non-transitory computer readable medium is any computer readable medium except for a transitory propagating signal.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the principles of the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the invention, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.
This application is a continuation of U.S. patent application Ser. No. 13/188,144 filed on Jul. 21, 2011, now allowed, which is a continuation of U.S. patent application Ser. No. 12/641,559, filed on Dec. 18, 2009, now pending. All of the above-referenced applications are herein incorporated by reference for all that they contain.

Number | Date | Country |
---|---|---|
20150106470 A1 | Apr 2015 | US |
20190098108 A9 | Mar 2019 | US |

Number | Date | Country |
---|---|---|
61140071 | Dec 2008 | US |

Relation | Number | Date | Country |
---|---|---|---|
Parent | 13188144 | Jul 2011 | US |
Child | 14572067 | | US |
Parent | 12641559 | Dec 2009 | US |
Child | 13188144 | | US |