Computer data is vital to today's organizations and a significant part of protection against disasters is focused on data protection. As solid-state memory has advanced to the point where cost of storage has become a relatively insignificant factor, organizations can afford to operate with systems that store and process terabytes of data.
Conventional data protection systems include tape backup drives, for storing organizational production site data on a periodic basis. Another conventional data protection system uses data replication, by creating a copy of production site data of an organization on a secondary backup storage system, and updating the backup with changes. The backup storage system may be situated in the same physical location as the production storage system, or in a physically remote location. Data replication systems generally operate either at the application level, at the file system level, or at the data block level.
In one aspect, a method includes storing data in a distributed storage environment that includes data servers and configuring each data server to mark a respective bit map for each block of data changed. In another aspect, an apparatus includes electronic hardware circuitry configured to store data in a distributed storage environment that includes data servers and to configure each data server to mark a respective bit map for each block of data changed. In a further aspect, an article includes a non-transitory computer-readable medium that stores computer-executable instructions. The instructions cause a machine to store data in a distributed storage environment that includes data servers and to configure each data server to mark a respective bit map for each block of data changed. In each of the aspects above, each data server is configured to handle a respective portion of a logical unit.
Described herein are techniques to perform incremental backup in a distributed storage environment. Using the techniques described, the tracking of the changes in a logical unit has far less overhead than using snapshots, for example. The change tracking is highly distributed, in line with the distribution of the volume itself, and therefore scales with system size. The incremental backup is naturally parallelized within the distributed block storage system and also scales with system size. The incremental backup can take advantage of read path optimizations which are not available to other clients not using distributed block storage. In one example, this can be done by allowing the data reads (for the incremental backup) to be performed by the data servers, thereby avoiding any network overhead and also allowing bulk reads of disparate portions of the changed data, knowing that all the reads are from a block device local to the data server.
Referring to
Referring to
As will be further described herein, each data server 122a-122d includes a bit map for the portion of the logical unit it is handling. For example, the data servers 122a-122d include bit maps 126a-126d respectively. A bit map is used to record whether a block changed or not. For example, if a block of data or blocks of data have changed, a “1” is stored in the bit map for that block or blocks of data.
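The per-server bit map described above can be illustrated with a minimal sketch. The class name `DirtyBitmap`, its methods, and the one-bit-per-block packing are assumptions made for illustration only; the description does not specify how the bit maps 126a-126d are laid out.

```python
from typing import List


class DirtyBitmap:
    """Hypothetical bit map: one bit per block, 1 meaning the block changed."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        # Pack eight block flags into each byte (an assumed layout).
        self.bits = bytearray((num_blocks + 7) // 8)

    def mark(self, block: int) -> None:
        """Store a 1 for a block that has changed."""
        if not 0 <= block < self.num_blocks:
            raise IndexError(block)
        self.bits[block // 8] |= 1 << (block % 8)

    def is_dirty(self, block: int) -> bool:
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def dirty_blocks(self) -> List[int]:
        return [b for b in range(self.num_blocks) if self.is_dirty(b)]


bitmap = DirtyBitmap(16)
bitmap.mark(3)
bitmap.mark(9)
print(bitmap.dirty_blocks())
```

Marking the same block twice leaves the bit map unchanged, so repeated writes to one block cost no extra tracking state.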
Referring to
In the case where there is no prior full or incremental backup, process 300 begins by taking a full backup of the logical unit 160 (304). In other cases, process 300 starts incremental backup (308).
Referring to
Process 400 finds the server hosting that portion of the logical unit (404) and sends the write request to the appropriate server (406). For example, the write request is for writing data to the portion C 162c of the logical unit 160 and the write request is sent to the data server 122c.
Process 400 marks the bit map. For example, the data server 122c marks the block as modified in the bit map 126c if the data server 122c has been designated to track dirty blocks by the process 300. The data server 122c then writes to the block storage device 116c.
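The write path of process 400 might be sketched as follows. The names `DataServer` and `route_write`, the `tracking` flag, and the mapping of blocks to servers in equal contiguous ranges (`BLOCKS_PER_SERVER`) are all hypothetical; the description only states that the server hosting the affected portion is found and that it marks its bit map when designated to track dirty blocks.

```python
BLOCKS_PER_SERVER = 4  # assumed: each server handles an equal, contiguous portion


class DataServer:
    """Hypothetical stand-in for one of the data servers 122a-122d."""

    def __init__(self, name):
        self.name = name
        self.tracking = False  # set when the server is designated to track dirty blocks
        self.dirty = set()     # stands in for the server's bit map
        self.storage = {}      # stands in for the local block storage device

    def write(self, local_block, data):
        # Mark the block as modified, then write it to local storage.
        if self.tracking:
            self.dirty.add(local_block)
        self.storage[local_block] = data


def route_write(servers, block, data):
    """Find the server hosting the block and send the write request to it."""
    server = servers[block // BLOCKS_PER_SERVER]
    server.write(block % BLOCKS_PER_SERVER, data)
    return server


servers = [DataServer(n) for n in ("122a", "122b", "122c", "122d")]
servers[2].tracking = True
target = route_write(servers, 9, b"payload")
print(target.name, sorted(target.dirty))
```

In this sketch a write to logical block 9 lands on the third server, which records it as local block 1.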
Referring to
Process 500 switches to a new bitmap (506). For example, a command is issued to the data servers 122a-122d to start a new bit map. Switching bitmaps occurs between application writes. The action of switching to a new bitmap (506) interacts with the process of writing data (400) in a way that guarantees that each write updates at least one of the bitmap being switched from and the bitmap being switched to.
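One way to obtain the guarantee that every write lands in the old or the new bitmap is to serialize the switch and the marking under a single lock. This is only an assumed mechanism, shown as a sketch; the class `TrackedServer` and its methods are hypothetical, and the description does not state how the data servers actually coordinate the switch with writes.

```python
import threading


class TrackedServer:
    """Hypothetical data server whose bitmap can be switched between writes."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self._lock = threading.Lock()
        self._bitmap = [0] * num_blocks

    def mark_write(self, block):
        # A write marks whichever bitmap is current while the lock is held,
        # so it is recorded in the old bitmap or the new one, never in neither.
        with self._lock:
            self._bitmap[block] = 1

    def switch_bitmap(self):
        # Swap in an empty bitmap and hand back the one being switched from.
        with self._lock:
            old, self._bitmap = self._bitmap, [0] * self.num_blocks
        return old

    def current_bitmap(self):
        with self._lock:
            return list(self._bitmap)


server = TrackedServer(4)
server.mark_write(1)
old = server.switch_bitmap()
server.mark_write(2)
print(old, server.current_bitmap())
```

The lock makes the switch appear to happen between writes: a write that arrives first is captured in the returned bitmap, and one that arrives afterward is captured in the new bitmap.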
Process 500 obtains bit maps from data servers to form a combined bit map X (504). For example, if the logical unit 160 is being backed up, then the bit maps 126a-126d are retrieved from the data servers 122a-122d, respectively. In one example, processing blocks 504 and 506 are performed simultaneously.
Process 500 takes a snapshot of the logical unit (512). For example, the backup module takes or causes to be taken a snapshot of the logical unit 160 on the distributed block storage 116a-116d.
Process 500 obtains the new bitmaps started in processing block 506 to form a new combined bitmap (514) and merges the previous combined bit map with the new combined bit map (522). For example, the combined bitmap X and a new combined bitmap X′ are merged together to form a single bitmap using a logical “or” function.
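The merge can be illustrated with a short sketch, assuming each combined bitmap is represented as a list of 0/1 flags of equal length (a representation chosen here for readability, not one stated in the description). The function name `merge_bitmaps` is hypothetical.

```python
def merge_bitmaps(previous, new):
    """Merge two combined bitmaps with a logical OR, block by block."""
    if len(previous) != len(new):
        raise ValueError("bitmaps must cover the same number of blocks")
    return [a | b for a, b in zip(previous, new)]


combined_x = [1, 0, 0, 1, 0, 0, 0, 0]        # changes recorded before the switch
combined_x_prime = [0, 0, 1, 1, 0, 0, 0, 1]  # changes recorded after the switch
merged = merge_bitmaps(combined_x, combined_x_prime)
print(merged)  # [1, 0, 1, 1, 0, 0, 0, 1]
```

A block is set in the merged bitmap if it is set in either input, so a block changed in both intervals is still copied only once.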
Process 500 segments the merged bit map (528). For example, the merged bit map is segmented based on how the logical unit 160 is split across the data servers 122a-122d.
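Segmentation might look like the sketch below, under the assumption that the logical unit is split into equal, contiguous portions, one per data server. The actual split of the logical unit 160 across the data servers 122a-122d may differ, and `segment_bitmap` is a hypothetical helper.

```python
def segment_bitmap(merged, num_servers):
    """Split a merged bitmap into one contiguous segment per data server."""
    if len(merged) % num_servers:
        raise ValueError("bitmap length must divide evenly across servers")
    size = len(merged) // num_servers
    return [merged[i * size:(i + 1) * size] for i in range(num_servers)]


segments = segment_bitmap([1, 0, 1, 1, 0, 0, 0, 1], 4)
print(segments)  # [[1, 0], [1, 1], [0, 0], [0, 1]]
```

Each segment is indexed by block position within its server's portion, so a data server can act on its segment without knowing the layout of the rest of the logical unit.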
Process 500 issues a command to the data servers (532). For example, each data server 122a-122d receives its respective segmented portion of the merged bitmap and copies its respective changed portions from the snapshot taken in processing block 512 to the object store 118. In another embodiment, the backup module 150 may directly read the changed blocks from the snapshot and copy the changed blocks to the object store, though this loses the benefits of a parallel copy by the data servers and also incurs additional network overhead during the block reads.
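The parallel copy by the data servers can be sketched as below. Threads stand in for independent data servers, a dictionary stands in for the object store 118, and lists of byte strings stand in for each server's portion of the snapshot; all of these, along with the function name `copy_changed_blocks` and the object keys, are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def copy_changed_blocks(server_id, segment, snapshot_blocks, object_store):
    """Copy the blocks flagged in this server's segment from its snapshot portion."""
    copied = 0
    for local_block, dirty in enumerate(segment):
        if dirty:
            # Hypothetical object key: (server id, block index within the portion).
            object_store[(server_id, local_block)] = snapshot_blocks[local_block]
            copied += 1
    return copied


segments = {"122a": [1, 0], "122b": [0, 0], "122c": [1, 1], "122d": [0, 1]}
snapshots = {
    sid: [f"{sid}-block{i}".encode() for i in range(2)] for sid in segments
}
object_store = {}

# Each server works through its own segment concurrently with the others.
with ThreadPoolExecutor() as pool:
    futures = {
        sid: pool.submit(copy_changed_blocks, sid, seg, snapshots[sid], object_store)
        for sid, seg in segments.items()
    }
    counts = {sid: f.result() for sid, f in futures.items()}

print(counts)  # {'122a': 1, '122b': 0, '122c': 2, '122d': 1}
```

Because each server reads only from its own snapshot portion, the block reads in this sketch involve no cross-server traffic; only the copies to the object store leave the server.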
Process 500 releases the snapshot after the copy is done (536). The backup module 150 releases the snapshot taken in processing block 512 to be, for example, erased after processing block 532 has completed.
Referring to
The processes described herein (e.g., processes 300, 400, and 500) are not limited to use with the hardware and software of
The system may be implemented, at least in part, via a computer program product (e.g., in a non-transitory machine-readable storage medium such as, for example, a non-transitory computer-readable medium), for execution by, or to control the operation of, data processing apparatus (e.g., a programmable processor, a computer, or multiple computers). Each such program may be implemented in a high level procedural or object-oriented programming language to communicate with a computer system. However, the programs may be implemented in assembly or machine language. The language may be a compiled or an interpreted language and it may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program may be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a communication network. A computer program may be stored on a non-transitory machine-readable medium that is readable by a general or special purpose programmable computer for configuring and operating the computer when the non-transitory machine-readable medium is read by the computer to perform the processes described herein. For example, the processes described herein may also be implemented as a non-transitory machine-readable storage medium, configured with a computer program, where upon execution, instructions in the computer program cause the computer to operate in accordance with the processes. A non-transitory machine-readable medium may include but is not limited to a hard drive, compact disc, flash memory, non-volatile memory, volatile memory, magnetic diskette and so forth but does not include a transitory signal per se.
The processes described herein are not limited to the specific examples described. For example, the processes 300, 400, and 500 are not limited to the specific processing order of
The processing blocks (for example, in the processes 300, 400, and 500) associated with implementing the system may be performed by one or more programmable processors executing one or more computer programs to perform the functions of the system. All or part of the system may be implemented as special purpose logic circuitry (e.g., an FPGA (field-programmable gate array) and/or an ASIC (application-specific integrated circuit)). All or part of the system may be implemented using electronic hardware circuitry that includes electronic devices such as, for example, at least one of a processor, a memory, a programmable logic device or a logic gate.
Elements of different embodiments described herein may be combined to form other embodiments not specifically set forth above. Various elements, which are described in the context of a single embodiment, may also be provided separately or in any suitable subcombination. Other embodiments not specifically described herein are also within the scope of the following claims.