Computer data is vital to today's organizations and a significant part of protection against disasters is focused on data protection. As solid-state memory has advanced to the point where cost of memory has become a relatively insignificant factor, organizations can afford to operate with systems that store and process terabytes of data.
Conventional data protection systems include tape backup drives, for storing organizational production site data on a periodic basis. Another conventional data protection system uses data replication, by generating a copy of production site data of an organization on a secondary backup storage system, and updating the backup with changes. The backup storage system may be situated in the same physical location as the production storage system, or in a physically remote location. Data replication systems generally operate either at the application level, at the file system level, or at the data block level.
Most modern storage arrays provide snapshot capabilities. These snapshots allow a user to freeze an image of a volume or set of volumes at some point-in-time and to restore this image when needed.
In one aspect, a method includes receiving an I/O to write data to a volume stored, increasing a hash reference count for a hash of the data in response to receiving the I/O, periodically generating snapshots of the volume, adding metadata on the I/O and a timestamp to a metadata journal and increasing the reference count value in response to adding the metadata.
In another aspect, an apparatus includes electronic hardware circuitry configured to receive an I/O to write data to a volume stored, increase a hash reference count for a hash of the data in response to receiving the I/O, periodically generate snapshots of the volume, add metadata on the I/O and a timestamp to a metadata journal and increase the reference count value in response to adding the metadata.
In a further aspect, an article includes a non-transitory computer-readable medium that stores computer-executable instructions. The instructions cause a machine to receive an I/O to write data to a volume stored, increase a hash reference count for a hash of the data in response to receiving the I/O, periodically generate snapshots of the volume, add metadata on the I/O and a timestamp to a metadata journal and increase the reference count value in response to adding the metadata.
In some examples, snapshot technology provides only coarse granularity in the snapshots that are generated. As a consequence, a user is not able to restore the volumes to the state that they were in at a time between the times that two snapshots were generated. In some examples, snapshot technology stores a limited number of snapshots. Described herein are techniques that enable an efficient native continuous data protection (CDP) mechanism on a storage array that provides snapshot images, allowing a user to restore the state of a volume to any point-in-time.
The following definitions may be useful in understanding the specification and claims.
I/O REQUEST—an input/output request (sometimes referred to as an I/O or IO), which may be a read I/O request (sometimes referred to as a read request or a read) or a write I/O request (sometimes referred to as a write request or a write).
Referring to
In one example, the CDP module 120 is used to enable the storage array 104 to perform CDP. The metadata journal 118 saves metadata for each I/O.
The application 110 reads and writes data to the production volume 122.
As storage array 104 is a deduplicated storage array, the data in the storage array is kept in two separate levels. In a first level, each volume contains a set of pointers from an address to the hash value of the data at that address (e.g., address-to-hash mapping), which is kept in a compact format. A second level of mapping is a map from each hash to the physical location where the data matching the hash value is stored.
For each hash there is also a hash reference count, which counts the number of references to the data to which the hash points. If the exact same data is saved again later on the storage array, then the hash is not saved again; rather, a pointer to the hash is added. In some examples, in order to allow CDP, the system periodically takes a snapshot of the production volume 122 to form, for example, snapshots 132a-132c.
The hash reference count table 136 is a table of reference counts for each hash value, and has a pointer from each hash value to its physical location. In one example, each hash reference count value represents a number of entities (e.g., journals, tables) that rely on the hash value. In one particular example, a hash reference count of ‘0’ means no entities in the storage array are using the hash value and the data to which the hash points may be erased. As will be described herein, the hash count is incremented for each entity that uses the hash. One of ordinary skill in the art would recognize that the system may be configured so that a hash reference counter counts up or counts down for each new entity that depends on the hash value, and the hash reference count value may start at any value.
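By way of illustration only, the two-level mapping and the hash reference count described above may be sketched as follows. The class name `DedupStore`, its method names, the use of in-memory dictionaries, and the choice of SHA-256 as the hash function are all assumptions made for this sketch and are not taken from the specification.

```python
import hashlib


class DedupStore:
    """Toy model of a two-level deduplicated store (all names hypothetical)."""

    def __init__(self):
        self.address_to_hash = {}  # first level: volume address -> hash value
        self.hash_to_data = {}     # second level: hash value -> stored data
        self.ref_count = {}        # hash value -> number of entities relying on it

    def write(self, address, data):
        # SHA-256 is an assumption; the source does not name a hash function.
        h = hashlib.sha256(data).hexdigest()
        if h not in self.hash_to_data:
            # Identical data is stored only once.
            self.hash_to_data[h] = data
        self.address_to_hash[address] = h
        self.ref_count[h] = self.ref_count.get(h, 0) + 1
        return h

    def read(self, address):
        return self.hash_to_data[self.address_to_hash[address]]

    def release(self, h):
        # Called when an entity stops relying on hash h. The caller is
        # expected to drop its own pointer to h as well.
        self.ref_count[h] -= 1
        if self.ref_count[h] == 0:
            # A count of 0 means the data may be erased.
            del self.ref_count[h]
            del self.hash_to_data[h]
```

In this sketch, writing the same bytes to two addresses yields a single stored copy with a reference count of 2, and the copy is erased only after both references are released.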
In one example, the storage array 104 is a flash storage array. In other examples, the storage array 104 is a deduplication device. In other examples, the storage array 104 may be part of a device used for scalable data storage and retrieval using content addressing. In one example, the storage array 104 may include one or more of the features of a device for scalable data storage and retrieval using content addressing described in U.S. Pat. No. 9,104,326, issued Aug. 11, 2015, entitled “SCALABLE BLOCK DATA STORAGE USING CONTENT ADDRESSING,” which is assigned to the same assignee as this patent application and is incorporated herein in its entirety. In other examples, the storage array 104 is a flash storage array used in EMC® XTREMIO®.
Referring to
Process 200 determines the hash value of the data in the I/O (210) and writes the new data to the storage array and stores the hash-to-physical address if the hash is new (218).
Process 200 increases the hash reference count (220). For example the hash reference count for the hash is increased by 1.
Process 200 generates a pointer from address to hash (222) and determines if the volume in the logical unit is a CDP volume (228). If the volume is a CDP volume, process 200 increases the hash reference count (232) and writes the metadata to the metadata journal (236). For example, the hash reference count for the hash in the hash reference count table 136 is increased by 1 and the metadata is written as a new journal entry in the metadata journal 118. In one example, the metadata includes a timestamp of the new journal entry generation, the offset of the data on the storage LU and the hash value.
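The write path of process 200 may be sketched, under stated assumptions, as follows. The names `Array`, `JournalEntry` and `handle_write`, the SHA-256 hash, and the in-memory dictionaries are hypothetical. The sketch also omits any handling of a hash previously stored at the same offset, which the passage above does not describe.

```python
import hashlib
import time
from dataclasses import dataclass, field


@dataclass
class JournalEntry:
    timestamp: float  # time the journal entry was generated
    offset: int       # offset of the data on the storage LU
    hash_value: str   # hash of the written data


@dataclass
class Array:
    hash_to_data: dict = field(default_factory=dict)     # hash -> data
    ref_count: dict = field(default_factory=dict)        # hash -> count
    address_to_hash: dict = field(default_factory=dict)  # offset -> hash
    journal: list = field(default_factory=list)          # metadata journal


def handle_write(array, offset, data, is_cdp_volume, now=None):
    h = hashlib.sha256(data).hexdigest()                # block 210 (hash choice assumed)
    if h not in array.hash_to_data:                     # block 218: store only new data
        array.hash_to_data[h] = data
    array.ref_count[h] = array.ref_count.get(h, 0) + 1  # block 220
    array.address_to_hash[offset] = h                   # block 222
    if is_cdp_volume:                                   # block 228
        array.ref_count[h] += 1                         # block 232: journal also relies on h
        ts = time.time() if now is None else now
        array.journal.append(JournalEntry(ts, offset, h))  # block 236
    return h
```

For a CDP volume, a single write therefore raises the reference count by 2 in this sketch: once for the address-to-hash pointer and once for the journal entry.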
Referring to
Process 300 periodically deletes snapshots and metadata (310). For example, process 300 performs all or a portion of process 500 (
Referring to
Process 400 adds data from the metadata journal to the copy of the latest snapshot available before the requested point-in-time (416). For example, the metadata from the metadata journal 118 is used to capture the changes from the latest snapshot before the requested point-in-time up until and including the requested point-in-time by using the timestamps. In one example, the data that has changed since the latest snapshot is derived from the metadata and is applied to the copy of the latest snapshot using, for example, an xcopy command, or any other method. Since the journal contains the hash value for the data that should appear in the point-in-time, the address-to-hash value of the data in the snapshot can be changed to the hash value of the relevant point-in-time and thus the volume will have the relevant data of the desired point-in-time.
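As a minimal sketch of the replay step described above, the following function copies the address-to-hash map of the latest snapshot taken at or before the requested time and then applies the journal entries up to and including that time. The function name, the tuple layouts, and the treatment of an entry whose timestamp equals the snapshot time (assumed here to be already reflected in the snapshot) are assumptions of the sketch.

```python
from bisect import bisect_right


def restore_point_in_time(snapshots, journal, requested_time):
    """Return an address-to-hash map for the requested point-in-time.

    snapshots: list of (generation_time, address_to_hash dict), sorted by time.
    journal:   list of (timestamp, offset, hash_value) tuples.
    """
    times = [t for t, _ in snapshots]
    idx = bisect_right(times, requested_time) - 1
    if idx < 0:
        raise ValueError("no snapshot at or before the requested point-in-time")
    snap_time, snap_map = snapshots[idx]
    restored = dict(snap_map)  # work on a copy; the snapshot itself is unchanged
    # Replay in timestamp order so that later writes override earlier ones.
    for ts, offset, hash_value in sorted(journal):
        if snap_time < ts <= requested_time:
            restored[offset] = hash_value
    return restored
```

Because only address-to-hash pointers are rewritten, no data blocks are copied in this sketch; the hashes named in the journal remain resolvable for as long as their reference counts stay above zero.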
Referring to
Process 500 deletes old snapshots (502). For example, snapshots of volumes older than a predetermined time are deleted. In one particular example, if snapshots are generated every 6 or 12 hours then snapshots older than 14 days are deleted.
Process 500 reads the metadata journal entries corresponding to the older snapshots deleted (510). For example, the process 500 reads the hash value for each journal entry in the metadata journal 118 corresponding to the older snapshots deleted in processing block 502 (e.g., reads the entries which relate to a timestamp generated between the generation time of the deleted snapshot and the generation time of the following snapshot).
Process 500 decreases the hash reference count value for each hash included in the metadata journal entries read. For example, the hash reference count value in hash reference count table 136 for each hash included in the metadata journal entries read in processing block 510 is decreased by 1.
Process 500 deletes journal entries corresponding to the older snapshots deleted (518). For example, journal entries corresponding to the older snapshots deleted in processing block 502 are deleted from the metadata journal 118.
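The cleanup performed by process 500 may be sketched as follows. The function name, the tuple layout of the journal, and the `cutoff` parameter are hypothetical. The sketch further assumes that, when no snapshot is retained, all journal entries from the oldest deleted snapshot onward are treated as stale.

```python
def prune_old_snapshots(snapshot_times, journal, ref_count, cutoff):
    """Delete snapshots older than cutoff together with their journal entries.

    snapshot_times: generation times of the existing snapshots.
    journal:        list of (timestamp, offset, hash_value) tuples.
    ref_count:      dict of hash_value -> reference count (updated in place).
    Returns (kept snapshot times, remaining journal, hashes that reached 0).
    """
    times = sorted(snapshot_times)
    deleted = [t for t in times if t < cutoff]  # block 502
    kept = [t for t in times if t >= cutoff]
    if not deleted:
        return kept, list(journal), []
    # Entries generated between the oldest deleted snapshot and the first
    # retained snapshot correspond to the deleted snapshots.
    lower = deleted[0]
    upper = kept[0] if kept else float("inf")
    stale = [e for e in journal if lower <= e[0] < upper]  # block 510
    erasable = []
    for _, _, hash_value in stale:
        ref_count[hash_value] -= 1  # decrease count by 1 per journal entry
        if ref_count[hash_value] == 0:
            erasable.append(hash_value)  # data for this hash may be erased
    remaining = [e for e in journal if not (lower <= e[0] < upper)]  # block 518
    return kept, remaining, erasable
```

A caller might, for example, compute `cutoff` as the current time minus 14 days when snapshots are generated every 6 or 12 hours, in line with the retention example given above.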
Referring to
The processes described herein (e.g., processes 200, 300, 400 and 500) are not limited to use with the hardware and software of
The system may be implemented, at least in part, via a computer program product (e.g., in a non-transitory machine-readable storage medium such as, for example, a non-transitory computer-readable medium), for execution by, or to control the operation of, data processing apparatus (e.g., a programmable processor, a computer, or multiple computers). Each such program may be implemented in a high level procedural or object-oriented programming language to communicate with a computer system. However, the programs may be implemented in assembly or machine language. The language may be a compiled or an interpreted language and it may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program may be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a communication network. A computer program may be stored on a non-transitory machine-readable medium that is readable by a general or special purpose programmable computer for configuring and operating the computer when the non-transitory machine-readable medium is read by the computer to perform the processes described herein. For example, the processes described herein may also be implemented as a non-transitory machine-readable storage medium, configured with a computer program, where upon execution, instructions in the computer program cause the computer to operate in accordance with the processes. A non-transitory machine-readable medium may include but is not limited to a hard drive, compact disc, flash memory, non-volatile memory, volatile memory, magnetic diskette and so forth but does not include a transitory signal per se.
The processes described herein are not limited to the specific examples described. For example, the processes 200, 300, 400 and 500 are not limited to the specific processing order of
The processing blocks (for example, in the processes 200, 300, 400 and 500) associated with implementing the system may be performed by one or more programmable processors executing one or more computer programs to perform the functions of the system. All or part of the system may be implemented as special purpose logic circuitry (e.g., an FPGA (field-programmable gate array) and/or an ASIC (application-specific integrated circuit)). All or part of the system may be implemented using electronic hardware circuitry that includes electronic devices such as, for example, at least one of a processor, a memory, a programmable logic device or a logic gate.
Elements of different embodiments described herein may be combined to form other embodiments not specifically set forth above. Other embodiments not specifically described herein are also within the scope of the following claims.