The described subject matter relates to electronic computing, and more particularly to data synchronization management.
Effective collection, management, and control of information have become a central component of modern business processes. To this end, many businesses, both large and small, now implement computer-based information management systems.
Data management is an important component of computer-based information management systems. Many users implement storage networks to manage data operations in computer-based information management systems. Storage networks have evolved in computing power and complexity to provide highly reliable, managed storage solutions that may be distributed across a wide geographic area.
The ability to duplicate and store the contents of a storage device is an important feature of a storage system. A storage device or network may maintain redundant copies of data to safeguard against the failure of a single storage device, medium, or communication connection. Upon a failure of the first storage device, medium, or connection, the storage system may then locate and/or retrieve a copy of the data contained in a second storage device or medium. The ability to duplicate and store the contents of the storage device also facilitates the creation of a fixed record of contents at the time of duplication. This feature allows users to recover a prior version of inadvertently edited or erased data.
Redundant copies of data records require synchronization on at least a periodic basis. Data synchronization can be a resource-intensive process. Hence, adroit management of data synchronization processes contributes to efficient operations.
In one embodiment, a method comprises receiving, in a source controller, a signal indicative of a write request to a source volume managed by the source controller and, in response to the signal: writing data associated with the write request to a destination controller when a connection to the destination controller is available; and setting a synchronization flag associated with a data storage segment managed by the source controller when a connection to the destination controller is unavailable.
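The following sketch illustrates that flow under stated assumptions; the class and method names (SourceController, replicate_write, connection_available, and so on) are hypothetical rather than part of the described embodiment.

```python
# Hypothetical sketch of the method described above; the class and method
# names (SourceController, replicate_write, connection_available) are
# illustrative, not part of the described embodiment.

class SourceController:
    def __init__(self, destination, num_segments):
        self.destination = destination            # proxy for the destination controller
        self.sync_flags = [False] * num_segments  # one flag per data storage segment

    def handle_write(self, segment_id, data):
        """Handle a signal indicating a write request to the source volume."""
        self.write_local(segment_id, data)        # commit the write on the source volume
        if self.destination.connection_available():
            # Connection available: forward the write data to the destination controller.
            self.destination.replicate_write(segment_id, data)
        else:
            # Connection unavailable: flag the segment so it can be synchronized later.
            self.sync_flags[segment_id] = True

    def write_local(self, segment_id, data):
        raise NotImplementedError  # device-specific write path, omitted here
```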
Described herein are exemplary systems and methods for implementing data synchronization in a storage device, array, or network. The methods described herein may be embodied as logic instructions on a computer-readable medium. When executed on a processor such as, e.g., a disk array controller, the logic instructions cause the processor to be programmed as a special-purpose machine that implements the described methods. The processor, when configured by the logic instructions to execute the methods recited herein, constitutes structure for performing the described methods. The methods will be explained with reference to one or more logical volumes in a storage system, but the methods need not be limited to logical volumes. The methods are equally applicable to storage systems that map to physical storage, rather than logical storage.
Exemplary Storage Network Architectures
In one embodiment, the subject matter described herein may be implemented in a storage architecture that provides virtualized data storage at a system level, such that virtualization is implemented within a storage area network (SAN), as described in published U.S. Patent Application Publication No. 2003/0079102 to Lubbers, et al., the disclosure of which is incorporated herein by reference in its entirety.
Computing environment 100 further includes one or more host computing devices which utilize storage services provided by the storage pool 110 on their own behalf or on behalf of other client computing or data processing systems or devices. Client computing devices such as client 126 access the storage pool 110 embodied by storage cells 140A, 140B, 140C through a host computer. For example, client computer 126 may access storage pool 110 via a host such as server 124. Server 124 may provide file services to client 126, and may provide other services such as transaction processing services, email services, etc. Host computer 122 may also utilize storage services provided by storage pool 110 on its own behalf. Clients such as clients 132, 134 may be connected to host computer 128 directly, or via a network 130 such as a Local Area Network (LAN) or a Wide Area Network (WAN).
Referring to
Each array controller 210a, 210b further includes a communication port 228a, 228b that enables a communication connection 238 between the array controllers 210a, 210b. The communication connection 238 may be implemented as an FC point-to-point connection, or pursuant to any other suitable communication protocol.
In an exemplary implementation, array controllers 210a, 210b further include a plurality of Fibre Channel Arbitrated Loop (FCAL) ports 220a-226a, 220b-226b that implement an FCAL communication connection with a plurality of storage devices, e.g., sets of disk drives 240, 242. While the illustrated embodiment implements FCAL connections with the sets of disk drives 240, 242, it will be understood that the communication connection with the sets of disk drives 240, 242 may be implemented using other communication protocols. For example, rather than an FCAL configuration, an FC switching fabric may be used.
In operation, the storage capacity provided by the sets of disk drives 240, 242 may be added to the storage pool 110. When an application requires storage capacity, logic instructions on a host computer such as host computer 128 establish a LUN from storage capacity available on the sets of disk drives 240, 242 available in one or more storage sites. It will be appreciated that, because a LUN is a logical unit, not a physical unit, the physical storage space that constitutes the LUN may be distributed across multiple storage cells. Data for the application may be stored on one or more LUNs in the storage network. An application that needs to access the data queries a host computer, which retrieves the data from the LUN and forwards the data to the application.
The memory representation enables each logical unit 112a, 112b to implement from 1 Mbyte to 2 Tbyte of physical storage capacity. Larger storage capacities per logical unit may be implemented. Further, the memory representation enables each logical unit to be defined with any type of RAID data protection, including multi-level RAID protection, as well as supporting no redundancy. Moreover, multiple types of RAID data protection may be implemented within a single logical unit such that a first range of logical disk addresses (LDAs) corresponds to unprotected data, and a second set of LDAs within the same logical unit implements RAID 5 protection.
A persistent copy of a memory representation illustrated in
The PSEGs that implement a particular LUN may be spread across any number of physical storage disks. Moreover, the physical storage capacity that a particular LUN represents may be configured to implement a variety of storage types offering varying capacity, reliability and availability features. For example, some LUNs may represent striped, mirrored and/or parity-protected storage. Other LUNs may represent storage capacity that is configured without striping, redundancy or parity protection.
A logical disk mapping layer maps an LDA specified in a request to a specific RStore as well as an offset within the RStore. Referring to the embodiment shown in
In one embodiment, L2MAP 310 includes a plurality of entries, each of which represents 2 Gbyte of address space. For a 2 Tbyte logical unit, therefore, L2MAP 310 includes 1024 entries to cover the entire address space in the particular example. Each entry may include state information relating to the corresponding 2 Gbyte of storage, and an LMAP pointer to a corresponding LMAP descriptor 320. The state information and LMAP pointer are set when the corresponding 2 Gbyte of address space has been allocated; hence, some entries in L2MAP 310 will be empty or invalid in many applications.
The address range represented by each entry in LMAP 320 is referred to as the logical disk address allocation unit (LDAAU). In one embodiment, the LDAAU is 1 Mbyte. An entry is created in LMAP 320 for each allocated LDAAU without regard to the actual utilization of storage within the LDAAU. In other words, a logical unit can grow or shrink in size in increments of 1 Mbyte. The LDAAU represents the granularity with which address space within a logical unit can be allocated to a particular storage task.
An LMAP 320 exists for each 2 Gbyte increment of allocated address space. If less than 2 Gbyte of storage is used in a particular logical unit, only one LMAP 320 is required; whereas, if 2 Tbyte of storage is used, 1024 LMAPs 320 will exist. Each LMAP 320 includes a plurality of entries, each of which may correspond to a redundancy segment (RSEG). An RSEG is an atomic logical unit that is analogous to a PSEG in the physical domain—akin to a logical disk partition of an RStore.
In one embodiment, an RSEG may be implemented as a logical unit of storage that spans multiple PSEGs and implements a selected type of data protection. Entire RSEGs within an RStore may be bound to contiguous LDAs. To preserve the underlying physical disk performance for sequential transfers, RSEGs from an RStore may be located adjacently and in order, in terms of LDA space, to maintain physical contiguity. If, however, physical resources become scarce, it may be necessary to spread RSEGs from RStores across disjoint areas of a logical unit. The logical disk address specified in a request selects a particular entry within LMAP 320 corresponding to a particular RSEG, which in turn corresponds to 1 Mbyte of address space allocated to the particular RSEG #. Each LMAP entry also includes state information about the particular RSEG #, and an RSD pointer.
Optionally, the RSEG #s may be omitted, which results in the RStore itself being the smallest atomic logical unit that can be allocated. Omission of the RSEG # decreases the size of the LMAP entries and allows the memory representation of a logical unit to demand fewer memory resources per MByte of storage. Alternatively, the RSEG size can be increased, rather than omitting the concept of RSEGs altogether, which also decreases demand for memory resources at the expense of decreased granularity of the atomic logical unit of storage. The RSEG size in proportion to the RStore can, therefore, be changed to meet the needs of a particular application.
In one embodiment, the RSD pointer points to a specific RSD 330 that contains metadata describing the RStore in which the corresponding RSEG exists. The RSD includes a redundancy storage set selector (RSSS) that includes a redundancy storage set (RSS) identification, a physical member selection, and RAID information. The physical member selection may include a list of the physical drives used by the RStore. The RAID information, or more generically data protection information, describes the type of data protection, if any, that is implemented in the particular RStore. Each RSD also includes a number of fields that identify particular PSEG numbers within the drives of the physical member selection that physically implement the corresponding storage capacity. Each listed PSEG # may correspond to one of the listed members in the physical member selection list of the RSSS. Any number of PSEGs may be included, however, in a particular embodiment each RSEG is implemented with between four and eight PSEGs, dictated by the RAID type implemented by the RStore.
In operation, each request for storage access specifies a logical unit such as logical unit 112a, 112b, and an address. A controller such as array controller 210A, 210B maps the logical drive specified in the request to a particular logical unit, then loads the L2MAP 310 for that logical unit into memory if it is not already present in memory. Preferably, all of the LMAPs and RSDs for the logical unit are also loaded into memory. The LDA specified by the request is used to index into L2MAP 310, which in turn points to a specific one of the LMAPs. The address specified in the request is used to determine an offset into the specified LMAP such that a specific RSEG that corresponds to the request-specified address is returned. Once the RSEG # is known, the corresponding RSD is examined to identify specific PSEGs that are members of the redundancy segment, and metadata that enables an array controller 210A, 210B to generate drive-specific commands to access the requested data. In this manner, an LDA is readily mapped to a set of PSEGs that must be accessed to implement a given storage request.
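As an informal illustration of this lookup, the sketch below walks the L2MAP, LMAP, and RSD structures to resolve an LDA; the attribute and function names (l2map, entries, rsd, resolve_lda) are assumptions for illustration only, while the sizes follow the text (2 Gbyte per L2MAP entry, 1 Mbyte per RSEG).

```python
# Illustrative walk of the L2MAP -> LMAP -> RSD structures described above.
# Attribute names are assumptions; the sizes follow the text
# (2 Gbyte of address space per L2MAP entry, 1 Mbyte per RSEG).

L2MAP_ENTRY_SPAN = 2 * 1024 ** 3   # 2 Gbyte of LDA space per L2MAP entry
RSEG_SPAN = 1024 ** 2              # 1 Mbyte of LDA space per RSEG

def resolve_lda(logical_unit, lda):
    """Map a logical disk address to the PSEGs that back it."""
    l2_entry = logical_unit.l2map[lda // L2MAP_ENTRY_SPAN]
    if not l2_entry.allocated:
        raise ValueError("address space not allocated")
    lmap = l2_entry.lmap                        # LMAP descriptor for this 2 Gbyte range
    rseg_index = (lda % L2MAP_ENTRY_SPAN) // RSEG_SPAN
    rsd = lmap.entries[rseg_index].rsd          # RSD for the RStore containing the RSEG
    # The RSD lists the physical members and PSEG numbers that implement the RSEG;
    # the controller uses them to build drive-specific commands.
    return rsd.psegs, lda % RSEG_SPAN           # member PSEGs plus offset within the RSEG
```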
In one embodiment, the L2MAP consumes 4 Kbytes per logical unit regardless of size. In other words, the L2MAP includes entries covering the entire 2 Tbyte maximum address range even where only a fraction of that range is actually allocated to a logical unit. It is contemplated that variable size L2MAPs may be used, however such an implementation would add complexity with little savings in memory. LMAP segments consume 4 bytes per Mbyte of address space while RSDs consume 3 bytes per MB. Unlike the L2MAP, LMAP segments and RSDs exist only for allocated address space.
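A worked example of these figures, using the sizes quoted above (a fixed 4 Kbyte L2MAP per logical unit, 4 bytes per Mbyte for LMAP segments, and 3 bytes per Mbyte for RSDs); the helper name is illustrative.

```python
# Worked example of the metadata sizes quoted above; the values come from the
# text, the helper name is illustrative.

def metadata_bytes(allocated_mbytes):
    l2map = 4 * 1024              # fixed 4 Kbyte L2MAP per logical unit
    lmaps = 4 * allocated_mbytes  # 4 bytes per Mbyte of allocated address space
    rsds = 3 * allocated_mbytes   # 3 bytes per Mbyte of allocated address space
    return l2map + lmaps + rsds

# For a 100 Gbyte (102,400 Mbyte) allocation: 4,096 + 409,600 + 307,200 = 720,896 bytes,
# i.e., roughly 704 Kbytes of mapping metadata.
print(metadata_bytes(100 * 1024))
```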
Storage systems may be configured to maintain duplicate copies of data to provide redundancy. Input/Output (I/O) operations that affect a data set may be replicated to a redundant data set.
In the embodiment depicted in
In normal operation, write operations from host 402 are directed to the designated source virtual disk 412A, 422B, and may be copied in a background process to one or more destination virtual disks 422A, 412B, respectively. A destination virtual disk 422A, 412B may implement the same logical storage capacity as the source virtual disk, but may provide a different data protection configuration. Controllers such as array controller 210A, 210B at the destination storage cell manage the process of allocating memory for the destination virtual disk autonomously. In one embodiment, this allocation involves creating data structures that map logical addresses to physical storage capacity, as described in greater detail in published U.S. Patent Application Publication No. 2003/0084241 to Lubbers, et al., the disclosure of which is incorporated herein by reference in its entirety.
To implement a copy transaction between a source and destination, a communication path between the source and the destination sites is determined and a communication connection is established. The communication connection need not be a persistent connection, although for data that changes frequently, a persistent connection may be efficient. A heartbeat may be initiated over the connection. Both the source site and the destination site may generate a heartbeat on each connection. Heartbeat timeout intervals may be adaptive based, e.g., on distance, computed round trip delay, or other factors.
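One way such an adaptive heartbeat might be realized is sketched below; the smoothing factor, multiplier, and class interface are assumptions rather than details from the text.

```python
# Hypothetical heartbeat monitor with an adaptive timeout; the smoothing factor,
# multiplier, and interface are assumptions rather than details from the text.

import time

class HeartbeatMonitor:
    def __init__(self, base_timeout=5.0, multiplier=4.0, alpha=0.2):
        self.base_timeout = base_timeout   # floor on the timeout, in seconds
        self.multiplier = multiplier       # timeout as a multiple of the smoothed RTT
        self.alpha = alpha                 # exponential smoothing factor for the RTT
        self.srtt = None                   # smoothed round-trip delay
        self.timeout = base_timeout
        self.last_heard = time.monotonic()

    def record_heartbeat(self, round_trip_delay):
        """Called whenever a heartbeat reply arrives with its measured round-trip delay."""
        self.last_heard = time.monotonic()
        if self.srtt is None:
            self.srtt = round_trip_delay
        else:
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * round_trip_delay
        # Longer distances (larger round-trip delays) yield a longer timeout.
        self.timeout = max(self.base_timeout, self.multiplier * self.srtt)

    def connection_alive(self):
        return (time.monotonic() - self.last_heard) < self.timeout
```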
In one embodiment, an array controller such as one of the array controllers 210A, 210B defines a data replication management (DRM) group as one or more source volumes associated with one or more destination volumes. DRM source group members (i.e., virtual disks) are logically associated with a remote destination group member's virtual disks. The copy set is represented by this association. In
DRM groups may be used to maintain data consistency upon failure and preserve write order sequence among source virtual disks. The consistency property applies when the group has more than one member. A group maintains write ordering among the members for asynchronous operation and logging/merging. Asynchronous operation refers to an operation mode in which a modification to one member of a copy set can be propagated to other members of the copy set after a time delay. During this time delay, the various replicas are inexact. When asynchronous operation is allowed, all replicas should eventually implement the modification. Since multiple modification operations may be pending but uncommitted against a particular replica, the original order in which the modifications were presented should be preserved when the pending modifications are applied to each replica. Even when asynchronous operation is not explicitly allowed, a destination volume may become unavailable for a variety of reasons, in which case a copy set is implicitly operating in an asynchronous mode.
To ensure write order preservation, an array controller may maintain a log for each DRM group, e.g., in a non-volatile cache, that records the history of write commands and data from a host. The log may be sized to store all write transactions until the transactions are committed to each member of a copy set. When required, the log can be replayed to merge the pending writes, in order, to each remote group. Alternatively, the cached writes can be written to a log on media along with subsequent host writes and then later replayed to merge the pending writes, in order, to each remote group. The ordering algorithm uses a "group sequence number," and the remote groups ensure that the data is written in sequence order. Group members enter and exit logging at the same time, to assure order across the volumes.
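A minimal sketch of this ordering scheme, assuming an in-memory log keyed by a monotonically increasing group sequence number; the class and method names are hypothetical and error handling is omitted.

```python
# Minimal sketch of write ordering with a group sequence number; the class and
# method names are assumptions and error handling is omitted.

from collections import deque

class DRMGroupLog:
    def __init__(self):
        self.next_seq = 0
        self.pending = deque()    # uncommitted writes, in arrival order

    def log_write(self, volume_id, lda, data):
        """Record a host write under the next group sequence number."""
        entry = (self.next_seq, volume_id, lda, data)
        self.next_seq += 1
        self.pending.append(entry)
        return entry

    def replay(self, remote_group):
        """Merge pending writes to a remote group in sequence order."""
        while self.pending:
            seq, volume_id, lda, data = self.pending[0]
            remote_group.apply(seq, volume_id, lda, data)  # remote side verifies ordering
            self.pending.popleft()                         # drop once committed remotely
```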
In one embodiment, members of a DRM group may have the same alternate site. A DRM group may include up to 32 virtual disks in a particular implementation. A virtual disk can belong to at most one group. Virtual disks in the same DRM group may belong to different disk groups. When a DRM group object is created on a source controller, the source controller automatically creates a symmetric group object on the alternate site controller. A DRM group is created during copy set creation if the user chooses not to use an existing group.
Exemplary Operations
If, at operation 520, there is a connection to the remote site(s) that host the destination volume(s), then control passes to operation 525 and a data replication routine may be implemented to replicate the write I/O operation to the destination volume. By contrast, if at operation 520 there is not a connection to the remote site, then control passes to operation 530. In one embodiment the connection status may be determined by monitoring for heartbeats from the remote site. In an alternate embodiment the connection status may be determined by pinging the remote site and waiting for a reply from the remote site.
At operation 530 the array controller determines whether fast synchronization is valid. In one embodiment, fast synchronization authorization may be managed at the DRM group level. For example, a DRM group may include an authorization parameter that indicates whether fast synchronization is authorized and/or available. A source array controller may set the status of the authorization parameter based on input from a user such as a system administrator and/or based on the status of the destination. For example, if one or more destination volumes are unavailable then the authorization parameter may be set to reflect that fast synchronization is invalid for the DRM group.
If, at operation 530, fast synchronization is determined to be valid, then control passes to operation 535, and the array controller determines whether the fast synchronization flag is set. In one embodiment, a source controller maintains a fast synchronization flag associated with each logical segment of addressable memory space. For example, referring briefly to
In one embodiment, the fast synchronization flag is used to indicate that the data in the logical segment associated with the fast synchronization flag has been changed by a write I/O operation. Hence, if at operation 535 the fast synchronization flag is not set, then control passes to operation 540 and the array controller sets the fast synchronization flag.
At operation 545 it is determined whether the log file is full. If the log file is not full, then control passes to operation 550 and the source array controller writes the write I/O operation to the log file. By contrast, if at operation 545 the log file is full, then control passes to operations 555-560, which mark the source volume for a full copy if the source volume is not already so marked.
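Taken together, operations 520-560 suggest a write path along the following lines; this is a sketch only, the function names are hypothetical, and the text does not detail the path taken when fast synchronization is invalid at operation 530, so the sketch simply falls through to the logging check.

```python
# Hypothetical sketch of the write path in operations 520-560; function names
# are illustrative. The path when fast synchronization is invalid at operation
# 530 is not detailed in the text, so the sketch falls through to the log check.

def handle_source_write(controller, group, segment_id, write_io):
    if controller.remote_connected():                    # operation 520
        controller.replicate(write_io)                   # operation 525: copy to destination
        return
    if group.fast_sync_valid:                            # operation 530
        if not controller.fast_sync_flag(segment_id):    # operation 535
            controller.set_fast_sync_flag(segment_id)    # operation 540
    if controller.log_full():                            # operation 545
        if not controller.marked_for_full_copy(write_io.volume):
            controller.mark_full_copy(write_io.volume)   # operations 555-560
    else:
        controller.log(write_io)                         # operation 550
```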
The operations of
At operation 610 a full copy operation is implemented, e.g., by the source array controller. If, at operation 615, the full copy operation is not complete, then control passes to operation 620 and the next segment is selected. If, at operation 625, fast synchronization is not valid for the DRM group, then control passes to operation 630 and the full segment is copied. Similarly, if at operation 635 the fast synchronization flag is set for the segment, then control passes to operation 630 and the full segment is copied.
By contrast, if the fast synchronization flag is not set, then the source array controller skips copying the segment (operation 640). Control then passes back to operation 615. The operations of
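A sketch of this resynchronization pass, under the assumption that segments are visited one at a time as in operations 615-640; the names are illustrative and a real controller would also clear the flags and handle copy errors.

```python
# Hypothetical sketch of the resynchronization pass in operations 610-640; the
# names are illustrative, and a real controller would also clear flags and
# handle copy errors.

def resynchronize(controller, group, volume):             # operation 610
    for segment_id in volume.segments():                  # operations 615-620
        if not group.fast_sync_valid:                      # operation 625
            controller.copy_segment(volume, segment_id)    # operation 630: copy full segment
        elif controller.fast_sync_flag(segment_id):        # operation 635
            controller.copy_segment(volume, segment_id)    # operation 630: copy changed segment
        # otherwise, skip copying the unchanged segment    # operation 640
```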
In an alternate embodiment the source controller may maintain a count parameter indicative of the number of segments in the source volume which have had their fast synchronization flag set. For example, in such an embodiment, at operation 540 the source controller increments this count parameter each time the controller sets a fast synchronization flag. During the recovery process, the source controller may implement logic that marks the volume for a full copy when the count parameter exceeds a threshold. The threshold may be fixed (e.g., a fixed number of fast synchronization flags), may correspond to a portion of the segments in the volume (e.g., X % of the segments in the volume), or may be set dynamically based on operating conditions in the storage network. For example, during peak traffic periods the threshold may be set high to preserve bandwidth on the network. By contrast, during periods of low usage, the threshold may be set low since bandwidth is readily available.
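For instance, a count-based check of this kind might look like the following, where the fixed threshold fraction is just one of the options mentioned above.

```python
# Illustrative count-based check; the fixed threshold fraction is an assumption
# and is only one of the options mentioned above.

def should_full_copy(flagged_segments, total_segments, threshold_fraction=0.5):
    """Return True once enough segments are flagged to justify a full copy."""
    return flagged_segments > threshold_fraction * total_segments
```

A dynamic policy could instead derive threshold_fraction from current network utilization, raising it during peak traffic and lowering it when bandwidth is plentiful.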
Reference in the specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one implementation. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment.
Thus, although embodiments have been described in language specific to structural features and/or methodological acts, it is to be understood that claimed subject matter may not be limited to the specific features or acts described. Rather, the specific features and acts are disclosed as sample forms of implementing the claimed subject matter.
This application claims priority from provisional U.S. Patent Application Ser. No. 60/701,542, entitled Data Synchronization Management by Fandel, et al., filed Jul. 22, 2005.
Number | Name | Date | Kind |
---|---|---|---|
5923876 | Teague | Jul 1999 | A |
6161192 | Lubbers | Dec 2000 | A |
6170063 | Golding | Jan 2001 | B1 |
6295578 | Dimitroff | Sep 2001 | B1 |
6397293 | Shrader | May 2002 | B2 |
6487636 | Dolphin | Nov 2002 | B1 |
6490122 | Holmquist et al. | Dec 2002 | B1 |
6493656 | Houston | Dec 2002 | B1 |
6505268 | Schultz | Jan 2003 | B1 |
6523749 | Reasoner | Feb 2003 | B2 |
6546459 | Rust | Apr 2003 | B2 |
6560673 | Elliot | May 2003 | B2 |
6587962 | Hepner | Jul 2003 | B1 |
6594745 | Grover | Jul 2003 | B2 |
6601187 | Sicola | Jul 2003 | B1 |
6606690 | Padovano | Aug 2003 | B2 |
6609145 | Thompson | Aug 2003 | B1 |
6629108 | Frey | Sep 2003 | B2 |
6629273 | Patterson | Sep 2003 | B1 |
6643795 | Sicola | Nov 2003 | B1 |
6647514 | Umberger | Nov 2003 | B1 |
6658590 | Sicola | Dec 2003 | B1 |
6663003 | Johnson | Dec 2003 | B2 |
6681308 | Dallmann | Jan 2004 | B1 |
6708285 | Oldfield | Mar 2004 | B2 |
6715101 | Oldfield | Mar 2004 | B2 |
6718404 | Reuter | Apr 2004 | B2 |
6718434 | Veitch | Apr 2004 | B2 |
6721902 | Cochran | Apr 2004 | B1 |
6725393 | Pellegrino | Apr 2004 | B1 |
6742020 | Dimitroff | May 2004 | B1 |
6745207 | Reuter | Jun 2004 | B2 |
6763409 | Elliot | Jul 2004 | B1 |
6772231 | Reuter | Aug 2004 | B2 |
6775790 | Reuter | Aug 2004 | B2 |
6795904 | Kamvysselis | Sep 2004 | B1 |
6802023 | Oldfield | Oct 2004 | B2 |
6807605 | Umberger | Oct 2004 | B2 |
6817522 | Brignone | Nov 2004 | B2 |
6823453 | Hagerman | Nov 2004 | B1 |
6839824 | Camble | Jan 2005 | B2 |
6842833 | Phillips | Jan 2005 | B1 |
6845403 | Chadalapaka | Jan 2005 | B2 |
7007044 | Rafert et al. | Feb 2006 | B1 |
7010721 | Vincent | Mar 2006 | B2 |
7039661 | Ranade | May 2006 | B1 |
7152183 | Fujibayashi | Dec 2006 | B2 |
7461230 | Gupta et al. | Dec 2008 | B1 |
20020019863 | Reuter | Feb 2002 | A1 |
20020019908 | Reuter | Feb 2002 | A1 |
20020019920 | Reuter | Feb 2002 | A1 |
20020019922 | Reuter | Feb 2002 | A1 |
20020019923 | Reuter | Feb 2002 | A1 |
20020048284 | Moulton | Apr 2002 | A1 |
20020188800 | Tomaszewski | Dec 2002 | A1 |
20030051109 | Cochran | Mar 2003 | A1 |
20030056038 | Cochran | Mar 2003 | A1 |
20030063134 | Lord | Apr 2003 | A1 |
20030074492 | Cochran | Apr 2003 | A1 |
20030079014 | Lubbers | Apr 2003 | A1 |
20030079074 | Sicola | Apr 2003 | A1 |
20030079082 | Sicola | Apr 2003 | A1 |
20030079083 | Lubbers | Apr 2003 | A1 |
20030079102 | Lubbers | Apr 2003 | A1 |
20030079156 | Sicola | Apr 2003 | A1 |
20030084241 | Lubbers | May 2003 | A1 |
20030101318 | Kaga | May 2003 | A1 |
20030110237 | Kitamura | Jun 2003 | A1 |
20030126315 | Tan | Jul 2003 | A1 |
20030126347 | Tan | Jul 2003 | A1 |
20030140191 | McGowen | Jul 2003 | A1 |
20030145045 | Pellegrino | Jul 2003 | A1 |
20030145130 | Schultz | Jul 2003 | A1 |
20030170012 | Cochran | Sep 2003 | A1 |
20030177323 | Popp | Sep 2003 | A1 |
20030187847 | Lubbers | Oct 2003 | A1 |
20030187947 | Lubbers | Oct 2003 | A1 |
20030188085 | Arakawa | Oct 2003 | A1 |
20030188114 | Lubbers | Oct 2003 | A1 |
20030188119 | Lubbers | Oct 2003 | A1 |
20030188153 | Demoff | Oct 2003 | A1 |
20030188218 | Lubbers | Oct 2003 | A1 |
20030188229 | Lubbers | Oct 2003 | A1 |
20030188233 | Lubbers | Oct 2003 | A1 |
20030191909 | Asano | Oct 2003 | A1 |
20030191919 | Sato | Oct 2003 | A1 |
20030196023 | Dickson | Oct 2003 | A1 |
20030212781 | Kaneda | Nov 2003 | A1 |
20030229651 | Mizuno | Dec 2003 | A1 |
20030236953 | Grieff | Dec 2003 | A1 |
20040019740 | Nakayama | Jan 2004 | A1 |
20040022546 | Cochran | Feb 2004 | A1 |
20040024838 | Cochran | Feb 2004 | A1 |
20040024961 | Cochran | Feb 2004 | A1 |
20040030727 | Armangau | Feb 2004 | A1 |
20040030846 | Armangau | Feb 2004 | A1 |
20040049634 | Cochran | Mar 2004 | A1 |
20040078638 | Cochran | Apr 2004 | A1 |
20040078641 | Fleischmann | Apr 2004 | A1 |
20040128404 | Cochran | Jul 2004 | A1 |
20040168034 | Homma et al. | Aug 2004 | A1 |
20040215602 | Cioccarelli | Oct 2004 | A1 |
20040230859 | Cochran | Nov 2004 | A1 |
20040267959 | Cochran | Dec 2004 | A1 |
Number | Date | Country | |
---|---|---|---|
20070022263 A1 | Jan 2007 | US |
Number | Date | Country | |
---|---|---|---|
60701542 | Jul 2005 | US |