The present invention is directed generally toward a method and apparatus for utilizing an expander for data duplication of devices direct-attached to the expander.
Backing up data and data duplication are critical in industries dependent on data storage and data duplication. Current methods used to duplicate data involve copying data from a source disk to a destination disk through a storage topology. Transferring data through a storage topology to a host and back through the storage topology may consume a significant portion of the available bandwidth for a host bus adapter's (HBA) storage topology. In addition, as disk capacities have increased to hold multiple terabytes of data, or even larger capacities, duplicating disk data can take hours or days for a single set of disks, let alone for multiple sets.
Current topologies may include multiple expanders (e.g., dozens of expanders) and multiple disk drives (e.g., hundreds of disk drives). In such large topologies the overhead required to communicate with end devices can become significant, consuming bandwidth when acquiring the source disk data, only to return the same data back down a nearly identical path to a destination disk connected to the same expander. In the case of disk duplication, the data is never modified, just moved. Since serial attached SCSI (SAS) is a point-to-point protocol, a data connection must be established for each data transfer. At each SAS expander level in the topology, there is an opportunity for a particular SAS expander to reject the open connection, which delays the I/O. In addition, heavy bandwidth usage by other disks in the topology will further reduce the overall performance of the duplication.
Currently, a storage topology generally consists at least of: (1) a host system, (2) one or more disks where the source data is contained, (3) any intermediary devices (e.g., one or more levels of SAS expanders) in the storage fabric, and (4) one or more destination disks. Currently, in a typical data duplication process the host must: initiate read I/Os from one or more source disks; transfer the data to some intermediate storage location (such as some memory); and then issue write I/Os to the destination disk. Throughout this process, the host must also handle any associated I/O errors or problems that occur throughout the topology. This current process of data duplication consumes a substantial associated bandwidth of a storage topology and results in a significant performance decrease of the host system that can endure for extended periods of time.
Therefore, it may be desirable to provide a method and apparatus which address the above-referenced problems associated with the data duplication process.
Accordingly, a method is included for transfer of data from at least one source disk to at least one destination disk of a storage topology, the storage topology comprising a plurality of storage-topology-connected devices including at least one data-duplicating expander, the at least one source disk, and the at least one destination disk. This method may include receiving an instruction, instructions, trigger, or triggers from at least one initiator storage-topology-connected device to configure or start at least one data transfer. The method may also include transmitting instructions to the at least one source disk to reduce the accessibility of the at least one source disk and transmitting instructions to the at least one destination disk to reduce the accessibility of the at least one destination disk. This method may further include receiving source data from the at least one source disk by utilizing at least a first dedicated expander phy. Additionally, this method may include transferring destination data directly to the at least one destination disk by utilizing at least a second dedicated expander phy, said destination data associated with the source data, wherein directly transferring destination data bypasses transfer of the source data or the destination data to or from a host system.
Also included is a data-duplicating expander device attachable to a storage topology, the storage topology including at least one source disk and at least one destination disk. The device may include an expander configured to directly attach to the storage topology. The data-duplicating expander device may comprise a plurality of dedicated expander phys associated with the expander for attaching the expander to the storage topology. The data-duplicating expander device may include and be associated with at least one processor configured to process instructions or triggers. The device may be configured to receive an instruction, instructions, trigger, or triggers from at least one initiator storage-topology-connected device to configure or start at least one data transfer. The data-duplicating expander device may further be configured to transmit instructions to the at least one source disk and to the at least one destination disk. The device may also be configured to receive source data of the at least one source disk by utilizing at least a first dedicated expander phy. Additionally, the data-duplicating expander device may be configured to transfer destination data directly to the at least one destination disk by utilizing at least a second dedicated expander phy, said destination data associated with the source data, wherein directly transferring destination data bypasses transfer of the source data or the destination data to or from a host system.
Further included is a data-duplicating expander device. The device may include an SAS expander configured for direct duplication of data from a plurality of source disks to a plurality of destination disks. The device may be directly attached to a storage topology. The data-duplicating expander device may further comprise a plurality of dedicated expander phys and at least one processor configured to process instructions, commands, requests, or triggers. The device may be configured to receive an instruction, instructions, trigger, or triggers from at least one initiator storage-topology-connected device to configure or start at least one data transfer. The data-duplicating expander device may also be configured to transmit instructions to the plurality of source disks and the plurality of destination disks to reduce the accessibility of the plurality of source disks and the plurality of destination disks or to take the plurality of source disks and the plurality of destination disks offline. The data-duplicating expander device may further be configured to receive source data simultaneously from the plurality of source disks by utilizing at least two of the plurality of dedicated expander phys. The device may additionally be configured to transfer destination data directly and simultaneously to the plurality of destination disks by utilizing at least two of the plurality of dedicated expander phys, said destination data associated with the source data, wherein directly transferring destination data bypasses transfer of the source data or the destination data to or from a host system. The data-duplicating expander device may still further be configured to dynamically receive commands or requests from a host application and to extract status reports.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention claimed. The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the invention and together with the general description, serve to explain the principles.
The numerous objects and advantages of the present invention may be better understood by those skilled in the art by reference to the accompanying figures in which:
Reference will now be made in detail to the subject matter disclosed, which is illustrated in the accompanying drawings. The scope of the invention is limited only by the claims; numerous alternatives, modifications, and equivalents are encompassed. For the purpose of clarity, technical material that is known in the technical fields related to the embodiments has not been described in detail to avoid unnecessarily obscuring the description.
The present invention may include a method and apparatus for improving data duplication and improving efficiency of data duplication of direct-attached devices of a storage topology. Embodiments of the present invention may include the use of one or more data-duplicating expanders. Data-duplicating expanders may include data-duplicating serial attached SCSI (SAS) expanders. Data-duplicating expanders, such as data-duplicating SAS expanders, may independently duplicate data or portions of data of one or more disks. Data-duplicating expanders may eliminate the need to route duplication data to the host or other system hardware components for a disk's data to be duplicated. Embodiments of the invention may further eliminate countless error conditions that can occur when transferring data from a source disk through the storage topology to the host and then back to a destination disk. Embodiments of the invention may significantly improve the host system performance during data duplication processes by eliminating unnecessary consumption of bandwidth typically used to transfer data from a source disk through the storage topology to a host and back to a destination disk. Thus, embodiments of this invention may optimize the data duplication process, may have significant impacts on systems, and may be a significant competitive advantage in industries dependent on data storage and data duplication.
A data-duplicating expander may be an SAS expander that performs a data duplication operation between two or more direct-attached devices. That is, the duplication operation typically handled by host control may be incorporated into the expander. The data duplication operation of a data-duplicating expander may be for an entire disk (i.e., a disk copy), a virtualized disk, a partitioned portion of a disk, or a selected region of a disk (e.g., LBA (“logical block address”) start to LBA end). A data-duplicating expander's data duplication operation may be controlled by a host application or host bus adapter (HBA). The data-duplicating expander's data duplication operation control by the HBA may include the HBA sending a configuration instruction or configuration instructions to the expander. For example, the HBA may send a configuration instruction or instructions to the expander (e.g., via vendor-unique SMP (“Serial Management Protocol”) requests) for the purpose of using two dedicated SAS expander phys for transferring data from a source disk's LBA(s) to a destination disk's LBA(s) at the highest available link speed. Such a configuration instruction or instructions may result in a highly efficient data duplication using the shortest data path possible and may consume none of the bandwidth to the HBA. The host application may pre-configure or dynamically configure these dedicated SAS expander phys at any time to perform or to optimize performance of the data duplication task. For example, a host application may dynamically configure a data-duplicating expander to pause, resume, cancel, abort, or skip duplication of an LBA range or ranges, resume duplication at a specified LBA, or redo duplication pursuant to instructions or status-report-event instructions that the data-duplicating expander may receive from the host application of the host system or HBA.
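By way of illustration only, the host-controlled operations described above (duplicating a selected LBA range, and pausing, resuming at a specified LBA, cancelling, or skipping an LBA range) may be modeled in software as follows. This is a minimal sketch and not the firmware of any actual expander: the names DuplicationJob, JobState, BLOCK_SIZE, and run are hypothetical, the disks are stood in for by in-memory byte arrays, and the 512-byte block size is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum

BLOCK_SIZE = 512  # assumed logical block size in bytes


class JobState(Enum):
    RUNNING = "running"
    PAUSED = "paused"
    CANCELLED = "cancelled"
    DONE = "done"


@dataclass
class DuplicationJob:
    """Hypothetical per-duplication state kept by the expander."""
    source: bytearray        # stands in for the source disk
    destination: bytearray   # stands in for the destination disk
    lba_start: int
    lba_end: int             # inclusive
    state: JobState = JobState.RUNNING
    next_lba: int = field(default=0, init=False)
    skipped: list = field(default_factory=list)  # (first, last) ranges

    def __post_init__(self):
        self.next_lba = self.lba_start

    def pause(self):
        if self.state is JobState.RUNNING:
            self.state = JobState.PAUSED

    def resume(self, at_lba=None):
        """Resume a paused job, optionally at a host-specified LBA."""
        if self.state is JobState.PAUSED:
            if at_lba is not None:
                self.next_lba = at_lba
            self.state = JobState.RUNNING

    def cancel(self):
        self.state = JobState.CANCELLED

    def skip_range(self, first, last):
        self.skipped.append((first, last))

    def step(self):
        """Process one LBA; return True if a block was processed."""
        if self.state is not JobState.RUNNING:
            return False
        if self.next_lba > self.lba_end:
            self.state = JobState.DONE
            return False
        lba = self.next_lba
        if not any(first <= lba <= last for first, last in self.skipped):
            off = lba * BLOCK_SIZE
            self.destination[off:off + BLOCK_SIZE] = \
                self.source[off:off + BLOCK_SIZE]
        self.next_lba += 1
        if self.next_lba > self.lba_end:
            self.state = JobState.DONE
        return True


def run(job):
    """Drive the job until it pauses, is cancelled, or completes."""
    while job.step():
        pass
```

In this sketch the host never touches the block data; it only changes the job state, which mirrors the division of labor described above between the host application and the expander.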
Computer-readable program instructions for carrying out duplicating the disk data may be built into the expander's hardware and/or firmware. The data-duplicating expander may include one or more processors for carrying out program instructions. The data-duplicating expander may include one or more computing devices, wherein a computing device may include one or more processors, storage, memory, other computer hardware, software, middleware, firmware, or the like.
Methods of duplication may include: (1) direct data duplication from any direct-attached source disk to any direct-attached destination disk, (2) the ability to apply a compression/decompression algorithm as the data is transferred from any direct-attached source disk to any direct-attached destination disk (this compression algorithm may be reversed for restoring data), or (3) the expander may be configured with a dedicated expander phy(s) that may initiate an automatic disk copy of a pre-selected source disk (for example, one or more disks may include dedicated duplication slot(s) in a JBOD (“just a bunch of disks”)).
Embodiments of the invention may include one or more data-duplicating expanders, such as SAS expanders configured for data duplication, which may manage multiple data duplications occurring simultaneously with minimal host intervention and minimal host CPU cycles. By way of example, a small, typical, or very large topology may have 50% of the disks, operatively configured as source disks, simultaneously being duplicated with almost no CPU usage to the other 50% of disks, operatively configured as destination disks. By additional way of example, a topology may have all but one of the disks, operatively configured as source disks, simultaneously being compressed and duplicated with almost no CPU or host resource usage to one other disk, operatively configured as a destination disk to store compressed data of the other disks. A disk may include any storage topology connectable storage device such as hard disk drives, solid state drives, or the like.
Furthermore, the data-duplicating expander may be configured to communicate with, network with, interact with, send instructions or status reports to, receive instructions from, receive triggers from, transfer data to or from, and operate with HBAs, expanders, source disks, destination disks, and dedicated duplication slots.
Embodiments may provide for disk duplication such that the associated performance penalties of currently-practiced methods of data duplication are no longer a burden to the host system or to the rest of the storage topology associated with paths from source device(s) to the host and then from the host to destination device(s). That is, the data duplication may be independently completed by the expander alone, without requiring HBA resources or redundant-path topology bandwidth for the data duplication operation.
Some embodiments may be configured to begin disk duplication as soon as a destination drive is attached to a specific phy in a JBOD (i.e., in a configuration where there is at least one dedicated slot for duplication purposes), where the JBOD is part of a data-duplicating expander, the JBOD includes a built-in or integrated data-duplicating expander, or the JBOD is operatively connected to a topology implemented with a data-duplicating expander, wherein said data-duplicating expander may be a data-duplicating SAS expander. A user, an automated mechanical device, a robotic device, or a switch may insert, attach, or connect one or more raw disks, such as disk drives or removable-writable disks, into one or more dedicated slots; and the data from one or more associated source disks may automatically be duplicated by the expander to the one or more raw destination disks.
In some implementations of embodiments of the invention, the data from one or more source disks may be modified by the data-duplicating expander, and then the data-duplicating expander transfers the modified source data to one or more destination disks.
In some implementations of embodiments of the invention, the modifying of source data by the data-duplicating expander may include compressing or decompressing source data through any of various compression/decompression algorithms and then transferring the compressed or decompressed source data to one or more destination disks. Compression/decompression algorithms may include standard, unique, or semi-unique algorithms implemented in firmware or hardware of the expander. By way of example, semi-unique algorithms may include unique-to-an-entity, unique-to-an-enterprise, unique-to-an-organization, unique-to-a-department, unique-to-an-industry, or unique-to-a-vendor algorithms. Different users, systems, processes, or applications could request or have access to request different compression and decompression algorithms to make the product unique or unique to certain groups of users. For example, a particular compression/decompression algorithm may be incompatible with a different vendor or different vendors' products. If no compression is necessary, very large sizes of I/Os (e.g., megabytes, 10s of megabytes, or larger) may be transferred with little-to-no data buffering required since all data received from the source disk can be sent immediately or almost immediately to the destination disk.
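By way of illustration only, compression applied as the data is transferred, and its reversal for restoring data, may be sketched as a streaming copy. This is a minimal sketch under stated assumptions: the description does not name a particular algorithm, so Python's standard zlib module is used here purely as a stand-in; the function names duplicate_compressed and restore_decompressed and the 64 KiB chunk size are hypothetical; and file-like objects stand in for the source and destination disks.

```python
import io
import zlib

CHUNK = 64 * 1024  # assumed transfer size per I/O, in bytes


def duplicate_compressed(src, dst, chunk=CHUNK):
    """Stream src to dst, compressing on the fly; return bytes written."""
    comp = zlib.compressobj()
    written = 0
    while True:
        data = src.read(chunk)
        if not data:
            break
        out = comp.compress(data)
        dst.write(out)
        written += len(out)
    tail = comp.flush()  # emit whatever the compressor still buffers
    dst.write(tail)
    return written + len(tail)


def restore_decompressed(src, dst, chunk=CHUNK):
    """Reverse operation: stream compressed src back to dst."""
    decomp = zlib.decompressobj()
    written = 0
    while True:
        data = src.read(chunk)
        if not data:
            break
        out = decomp.decompress(data)
        dst.write(out)
        written += len(out)
    tail = decomp.flush()
    dst.write(tail)
    return written + len(tail)


if __name__ == "__main__":
    source_disk = io.BytesIO(b"sector data " * 50_000)
    backup_disk = io.BytesIO()
    duplicate_compressed(source_disk, backup_disk)

    backup_disk.seek(0)
    restored_disk = io.BytesIO()
    restore_decompressed(backup_disk, restored_disk)
    assert restored_disk.getvalue() == source_disk.getvalue()
```

Only one chunk and the compressor's internal state are held in memory at a time, which is consistent with the limited buffering described above.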
In other implementations of embodiments of the invention, the modifying of source data by the data-duplicating expander may include encrypting or decrypting source data through any of various encryption/decryption algorithms and then transferring the encrypted or decrypted source data to one or more destination disks. Encryption/decryption algorithms may include standard, unique, or semi-unique algorithms implemented in firmware or hardware of the expander. By way of example, semi-unique algorithms may include unique-to-an-entity, unique-to-an-enterprise, unique-to-an-organization, unique-to-a-department, unique-to-an-industry, or unique-to-a-vendor algorithms. Different users, systems, processes, or applications could request or have access to request different encryption/decryption algorithms to make the product unique or unique to certain groups of users for data security purposes.
The data-duplicating expander may be configured by the host to take the required disks offline, reduce the accessibility of required disks, or otherwise make the required disks inaccessible to other devices. Alternatively, the data-duplicating expander itself may take the required disks offline, reduce the accessibility of required disks, or otherwise make the required disks inaccessible to other devices. The data-duplicating expander may then perform the duplication task as quickly as possible with no need for the data to ever leave a direct path (e.g., the direct path from the source disk to the data-duplicating expander to the destination disk, or the direct path from the source disk to the destination disk if either or both of the source disk or destination disk include an integrated or built-in data-duplicating expander). Duplication over a direct path minimizes use of valuable topology bandwidth needed to send or transfer data to or from other devices and minimizes use of valuable host resources, such as CPU cycles. The data-duplicating expander, such as a data-duplicating SAS expander, may not be limited to a single duplication, but rather, may process multiple duplications simultaneously.
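By way of illustration only, taking the required disks offline for the duration of a direct duplication may be sketched as follows. This is a minimal sketch under stated assumptions: the Disk class, the host_read method, and the offline and duplicate_direct helpers are hypothetical names, and restoring access once the copy ends is an assumption of the sketch that the description above does not state.

```python
from contextlib import contextmanager


class Disk:
    """Hypothetical stand-in for a direct-attached disk."""

    def __init__(self, name, blocks):
        self.name = name
        self.blocks = list(blocks)
        self.online = True

    def host_read(self, lba):
        """Access path used by other devices; refused while offline."""
        if not self.online:
            raise PermissionError(f"{self.name} is offline for duplication")
        return self.blocks[lba]


@contextmanager
def offline(*disks):
    """Make the disks inaccessible to other devices for the block's duration."""
    for disk in disks:
        disk.online = False
    try:
        yield
    finally:
        # Assumption: access is restored when the copy ends or fails.
        for disk in disks:
            disk.online = True


def duplicate_direct(source, destination):
    """Copy source to destination over the expander's own direct path."""
    with offline(source, destination):
        # The expander reads the blocks itself; no host path is involved.
        destination.blocks[:len(source.blocks)] = source.blocks
```

Other devices that attempt host_read during the copy are refused, while the expander's own path is unaffected.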
Special SMP (“Serial Management Protocol”) commands and requests may be used to configure the expander. SMP commands and requests may also be used for extracting status reports. The data-duplicating expander may be configured to handle complex error recovery scenarios by implementing one or more recovery algorithms. The one or more recovery algorithms may be implemented in firmware or hardware of the expander. Implementation of the recovery algorithm may allow the data-duplicating expander to take a particular recovery action in response to a particular status of a status report from a source or destination device. Recovery actions by the data-duplicating expander may include pausing, resuming (such as at a specified LBA), or aborting duplication, data transfer, compression, decompression, encryption, or decryption processes. Additionally, the data-duplicating expander may simply provide a status report to the host regarding the failure so the host may implement a recovery action. For example, once the host has completed the recovery, the data-duplicating expander, such as a data-duplicating SAS expander, may then be instructed as to where (such as at a specified LBA) to resume the duplication so as to avoid a restart of the entire duplication activity.
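By way of illustration only, a recovery algorithm that maps a device status to a recovery action, and that reports the LBA at which duplication may later be resumed, may be sketched as follows. This is a minimal sketch under stated assumptions: the Status and Action values, the RECOVERY_POLICY table, the duplicate function, and the read_block and write_block callbacks are all hypothetical and do not correspond to actual SAS or SMP status codes.

```python
from enum import Enum


class Status(Enum):
    GOOD = 0
    TRANSIENT_ERROR = 1  # hypothetical: worth retrying locally
    MEDIUM_ERROR = 2     # hypothetical: needs host intervention


class Action(Enum):
    CONTINUE = 0
    RETRY = 1
    PAUSE_AND_REPORT = 2


# Hypothetical policy: which recovery action each status triggers.
RECOVERY_POLICY = {
    Status.GOOD: Action.CONTINUE,
    Status.TRANSIENT_ERROR: Action.RETRY,
    Status.MEDIUM_ERROR: Action.PAUSE_AND_REPORT,
}


def duplicate(read_block, write_block, start_lba, end_lba, max_retries=3):
    """Copy LBAs start_lba..end_lba (inclusive); return a status report."""
    lba = start_lba
    retries = 0
    while lba <= end_lba:
        status, data = read_block(lba)
        action = RECOVERY_POLICY[status]
        if action is Action.RETRY and retries < max_retries:
            retries += 1
            continue
        if action is not Action.CONTINUE:
            # Stop here and tell the host where to resume later.
            return {"complete": False, "resume_lba": lba, "status": status}
        write_block(lba, data)
        lba += 1
        retries = 0
    return {"complete": True, "resume_lba": None, "status": Status.GOOD}
```

After the host completes its own recovery, it may call duplicate again with start_lba set to the reported resume_lba, so that blocks already copied are not transferred a second time.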
Referring to
Referring to
Referring still to
Referring to
Referring to
Further referring to
Additionally, the data compression and copying method by a data-duplicating compression-configured expander (e.g., 420) may be reversed to decompress and transfer the restored data to one or more source-disks-to-be-restored (e.g., 430, 450) to restore a previously backed up state from a single (or multiple) destination disk (e.g., 440), which contains the compressed data (e.g., 442, 444) or backup data of one or more source disks (e.g., 430, 450).
Additionally, referring to
Referring still to
Referring to
Referring to
Each dedicated disk duplication slot (e.g., 662, 672, 682) may be configured for copying or duplicating the entire data contents or pre-selected or selected portion(s) of data contents of one or more associated disks (e.g., 660, 670, 680) to an inserted, engaged, or connected raw destination disk. A dedicated disk duplication slot (e.g., 662, 672, or 682) may further be configured to automatically begin copying or duplicating the entire data contents or pre-selected or selected portion(s) of data contents of an associated disk (e.g., 660, 670, or 680) automatically upon insertion, engagement, or connection of a raw disk to a dedicated disk duplication slot (e.g., 662, 672, or 682). A user or application can configure the expander or expanders of the JBOD/data-duplicating expander device 620 for as many dedicated disk duplication slots as desired or required, including none, one, or multiple dedicated disk duplication slots for each disk of the JBOD. Utilizing a JBOD/data-duplicating expander device may enhance system performance and efficiency when duplicating multiple disks.
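By way of illustration only, the automatic start of duplication upon insertion of a raw disk into a dedicated disk duplication slot may be sketched as follows. This is a minimal sketch under stated assumptions: the DuplicatingEnclosure class and its attach, dedicate, and insert methods are hypothetical names, the slot identifiers are arbitrary labels, and byte arrays stand in for the disks.

```python
class DuplicatingEnclosure:
    """Hypothetical model of a JBOD with dedicated duplication slots."""

    def __init__(self):
        self.disks = {}      # slot id -> bytearray standing in for a disk
        self.dup_slots = {}  # dedicated slot id -> associated source slot id

    def attach(self, slot, disk):
        """Attach an ordinary (source) disk; no copy is triggered."""
        self.disks[slot] = disk

    def dedicate(self, dup_slot, source_slot):
        """Configure dup_slot as a duplication slot for source_slot."""
        self.dup_slots[dup_slot] = source_slot

    def insert(self, slot, disk):
        """Handle a disk insertion; return True if a copy was performed."""
        self.disks[slot] = disk
        source_slot = self.dup_slots.get(slot)
        if source_slot is None or source_slot not in self.disks:
            return False  # not a dedicated slot, or no source present
        source = self.disks[source_slot]
        if len(disk) < len(source):
            raise ValueError("destination disk is smaller than the source")
        disk[:len(source)] = source  # duplicate the entire data contents
        return True


if __name__ == "__main__":
    jbod = DuplicatingEnclosure()
    jbod.attach("disk-A", bytearray(b"payload!"))
    jbod.dedicate("dup-A", "disk-A")

    raw = bytearray(8)
    assert jbod.insert("dup-A", raw) is True
    assert raw == bytearray(b"payload!")
```

The insertion event itself acts as the trigger, so no host command is needed to begin the copy in this sketch.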
Referring to
Embodiments of the invention may include an initiator storage-topology-connected device. The initiator storage-topology-connected device may include a dedicated disk duplication slot or a host system or HBA. The initiator storage-topology-connected device may trigger or instruct a data-duplicating expander to start a data transfer from at least one source disk to at least one destination disk. The initiator storage-topology-connected device may trigger the data-duplicating expander by using an application, such as a host application. Additionally, the initiator storage-topology-connected device may trigger the data-duplicating expander through implementations in hardware or firmware. Embodiments of the invention contemplate that an initiator storage-topology-connected device may include one or more host systems or one or more dedicated disk duplication slots.
It is also contemplated that an external initiator device may initiate or trigger the initiator storage-topology-connected device to in turn trigger the data-duplicating expander to start a transfer of data or duplication. The external initiator device may not need to be directly-attached to the storage topology. The external initiator device may communicate, interact with, and/or trigger the initiator storage-topology-connected device through wired or wireless networks. The external initiator device may allow for remote initiation of a data-duplicating expander to begin direct transfer of data from at least one source disk to at least one destination disk, wherein the external initiator device remotely triggers an initiator storage-topology-connected device to trigger a data-duplicating expander.
Referring to
Referring to
Referring to
Additionally, features, functionality, and storage-topology-connected devices of the topology 100, associated with
It is believed that the present invention and many of its attendant advantages will be understood by the foregoing description, and it will be apparent that various changes may be made in the form, construction, and arrangement of the components thereof without departing from the scope and spirit of the invention or without sacrificing all of its material advantages. The form hereinbefore described being merely an explanatory embodiment thereof, it is the intention of the following claims to encompass and include such changes.