The accompanying drawings, which are incorporated in and constitute a part of this specification, exemplify the embodiments of the present invention and, together with the description, serve to explain and illustrate principles of the inventive technique. Specifically:
In the following detailed description, reference will be made to the accompanying drawing(s), in which identical functional elements are designated with like numerals. The aforementioned accompanying drawings show by way of illustration, and not by way of limitation, specific embodiments and implementations consistent with principles of the present invention. These implementations are described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other implementations may be utilized and that structural changes and/or substitutions of various elements may be made without departing from the scope and spirit of the present invention. The following detailed description is, therefore, not to be construed in a limited sense. Additionally, the various embodiments of the invention as described may be implemented in the form of software running on a general purpose computer, in the form of specialized hardware, or as a combination of software and hardware.
The inventive concept deals with a long-term data archiving system. The inventive methodology provides a method for legacy applications to access long-term archived data even if the storage system interface has changed. The inventive concept will be illustrated in detail with reference to the following exemplary embodiment thereof.
The inventive concept will be illustrated herein in the context of an example of a data migration from a storage system with an access method implemented in accordance with a SCSI protocol (legacy storage system) to a storage system with a file access protocol (modern storage system). However, as would be appreciated by those of skill in the art, the inventive mechanisms are not restricted to any specific interface or interfaces of the legacy storage system and/or the modern storage system. In fact, the inventive methodology is applicable to data migration involving any two types of storage systems.
The Legacy Storage System 4000 includes a Storage Controller 4501 coupled to a set of Disk Drives 4508. The storage controller 4501 comprises a CPU 4502, memory 4503, cache memory 4504, host interface 4505, management interface 4506, and disk interface 4507. The storage controller processes input-output (I/O) requests received from the host 1000.
Memory 4503 of the controller 4501 of the legacy storage system 4000 stores a software program, which handles I/O operations associated with the data stored in the legacy storage system. The aforesaid program is executed by the CPU 4502 of the legacy storage controller 4501. The cache memory 4504 temporarily stores the data written to the legacy storage system by the host 1000 before the data are stored to the disk drives 4508. The cache memory may also temporarily store the read data that are requested by the host 1000. The cache may be implemented as a battery backed-up non-volatile memory, which would protect the cached data against power failure. In another implementation, the memory 4503 and the cache memory 4504 are combined within the same memory unit.
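By way of illustration only, the caching behavior described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the controller's actual program: the class name WriteBackCache, its methods, and the dictionary standing in for the disk drives are all hypothetical and do not appear in the embodiment.

```python
class WriteBackCache:
    """Hypothetical write-back cache placed in front of a block device."""

    def __init__(self, disk):
        self.disk = disk   # {lba: bytes}; assumed stand-in for the disk drives
        self.dirty = {}    # written blocks not yet stored to disk
        self.clean = {}    # blocks retained after a read

    def write(self, lba, block):
        # The write is held in cache first; the disk is updated on flush().
        self.dirty[lba] = block
        self.clean.pop(lba, None)

    def read(self, lba):
        if lba in self.dirty:
            return self.dirty[lba]
        if lba not in self.clean:
            self.clean[lba] = self.disk[lba]  # cache miss: fetch from disk
        return self.clean[lba]

    def flush(self):
        # Destage every pending write to the disk, then forget it.
        self.disk.update(self.dirty)
        self.dirty.clear()
```

In this sketch, a block written by the host is visible to later reads immediately, while the disk copy changes only when flush() is called. A battery backed-up cache would be one way to keep the pending writes safe in the interval between the two.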
The host interface 4505 provides a networking connection capability between the host 1000 and the controller 4501. The Fibre Channel (FC) and Ethernet protocols are two exemplary protocols, which may be utilized in establishing the aforesaid connection between the host and the controller. The management interface 4506 is used by the management host 6000 to connect to and to manage the storage controller 4501. The disk drive interface 4507 is provided to interconnect the disk drives 4508 with the storage controller 4501. Each of the Disk Drives 4508 processes the input and output (I/O) requests received by the legacy storage system 4000 in accordance with the SCSI Device command set, well known to persons of skill in the art.
The modern Storage System 5000 includes two main components—the File Head 5501 and the Storage System 5510. The File Head 5501 and the storage system 5510 can be connected via the interface 5507. The file head 5501 and the storage system 5510 may be implemented within one storage unit. In such an implementation, the aforesaid two elements may be connected via a system bus, such as PCI. In another implementation, the file head and the storage system may be physically separated. In this case, the aforesaid two elements may be interconnected via network connections such as Fibre Channel or Ethernet.
The file head 5501 includes a CPU 5502, memory 5503, cache memory 5504, front-end network interface (NIC) 5505, management interface 5506, disk interface (I/F) 5507, and inter storage network interface 5508. The file head processes various requests from the host 2000 and the management host 6000.
Similar to the legacy storage system, the memory 5503 of the file head 5501 of the modern storage system 5000 stores a software program, which handles I/O operations associated with the data stored in the modern storage system. The aforesaid program is executed by the CPU 5502 of the file head 5501.
Cache 5504 temporarily stores data written from the host 2000 before the data are forwarded to the storage system 5510, or it stores read data that are requested by the host 2000. The cache may be implemented as a battery backed-up non-volatile storage unit. In another implementation, memory 5503 and cache memory 5504 are combined within the same memory unit. The front-end interface 5505 is used to establish a data connection between the host 2000 and the file head 5501. One common implementation of the front-end interface 5505 is an interface based on the Ethernet protocol, well known to persons of skill in the art.
Management interface 5506 is used by the management host to manage the File Head 5501 and the storage system 5510. Disk interface 5507 is provided to enable the data transfer between the file head 5501 and the storage system 5510. Fibre Channel (FC) and Ethernet are two typical examples of protocols that may be used in implementing the interface 5507. In the case of an internally implemented connection between the file head and the storage system, a system bus-type interface may be used in implementing such a connection.
Inter storage network interface 5508 is provided to interconnect the file head 5501 to the legacy storage system 4000. The storage system 5510 has a hardware configuration similar to that of the storage system 4000. It processes I/O requests from the File Head 5501. The same legacy software application executes on both the host 1000 and the host 2000. This application is not shown in
Management Host 6000 executes management software (not shown in
The Legacy Storage System 4000 may incorporate a storage controller 4501, which processes SCSI commands sent by the host 1000. Volumes 4600 may each be composed of one or more disk drives 4508. The modern Storage System 5000 incorporates two main components—file head 5501 and storage system 5510.
The file head 5501 processes file-related operations directed to the modern storage system 5000. The local file system 5106 of the modern storage system 5000 processes file I/O operations initiated from the host 2000. Specifically, the local file system 5106 translates the file I/O operations to block-level operations, and communicates with the storage system 5510 via SCSI commands. A migration module 5004 is operable to read data from another storage system, such as the storage system 4000, using an appropriate I/O driver 5002, such as a SCSI driver, and to write the read data to the storage system 5510 via the file system 5106. During the writing operation, the migration module 5004 utilizes the conversion rule table 5005 to determine the manner of data placement within the storage system 5510. The conversion rule table 5005 may be manually populated by a storage system administrator from the storage management host 6000. The aforesaid table may be physically stored within the storage system 5510. After finishing the data migration, the migration module 5004 stores the new location of the migrated data in the location table 5006. The location table 5006 may also be physically stored in the storage system 5510.
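By way of illustration only, the migration flow described above may be sketched as follows. The sketch rests on several assumptions that are not part of the embodiment: the conversion rule table is modeled as a mapping from a logical unit number to a target file path, the location table as a mapping from a logical unit number to a LocationEntry record, the legacy volumes and the file system as in-memory dictionaries, and the block size as 512 bytes. The names migrate, LocationEntry, and BLOCK_SIZE are hypothetical.

```python
from dataclasses import dataclass

BLOCK_SIZE = 512  # assumed logical block size in bytes


@dataclass
class LocationEntry:
    """Hypothetical location-table row: where a migrated logical unit now lives."""
    lun: int
    file_path: str
    size_blocks: int


def migrate(legacy_volumes, conversion_rules, file_store):
    """Copy each logical unit into a file and return the location table.

    legacy_volumes:   {lun: bytes}, stands in for block reads via the I/O driver
    conversion_rules: {lun: target path}, stands in for the conversion rule table
    file_store:       {path: bytearray}, stands in for the local file system
    """
    location_table = {}
    for lun, image in legacy_volumes.items():
        if lun not in conversion_rules:
            raise KeyError(f"no conversion rule for LUN {lun}")
        path = conversion_rules[lun]
        target = bytearray()
        # Read the legacy volume block by block, as a block driver would.
        for offset in range(0, len(image), BLOCK_SIZE):
            target += image[offset:offset + BLOCK_SIZE]
        file_store[path] = target
        # Record the new location only after the copy has finished.
        location_table[lun] = LocationEntry(lun, path, len(image) // BLOCK_SIZE)
    return location_table
```

For example, calling migrate({0: image}, {0: "/archive/lu0.img"}, store) would place the contents of logical unit 0 in the file "/archive/lu0.img" and return a table whose entry for unit 0 names that path. The path shown is illustrative only.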
The storage system 5510 will now be described. The storage controller 5601 processes SCSI commands from the file head 5501. File systems for storing data in the file format are created on volumes 5600 of the storage system 5510.
The host 1000 is a computer platform executing the legacy application (AP) 1010 running under an OS 1011. The legacy application may generate I/O operations addressed to the legacy storage system 4000. The communication between the application 1010 and the legacy storage system 4000 is accomplished by means of a software driver 1012. The host 1000 and the storage system 4000 are interconnected via a network 3000, such as a storage area network based on the Fibre Channel protocol (FCP), well known to persons of skill in the art. The host 1001 is generally similar to the host 1000. It incorporates legacy application 1020, OS 1021 and software driver 1022.
Host 2000 is a computer platform on which the virtual machines (VM) 2001 and 2002 are executed under the OS 2004. Each VM emulates the execution environment of the legacy application. Using the VM 2001, a software application originally designed for a legacy execution environment, such as the environment of the host 1000, can be executed without any modification. An application running on the VM 2001 also generates I/O operations. However, the I/O operations generated by a legacy application running on a virtual machine do not necessarily match the data access protocol of the modern storage system 5000. Therefore, the SCSI/File converter module 2003 converts the I/O operations from the legacy data access format to the data access format used in the modern storage system. The driver program 2005 communicates with the modern storage system 5000 and transmits the I/O operations initiated by the application running under the VM 2001. The host 2000 and the storage system 5000 are interconnected via a network such as Ethernet or FC.
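By way of illustration only, the conversion from block-addressed I/O to file I/O may be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of the SCSI/File converter module 2003: it assumes that each migrated logical unit is held as a single file, that a logical block address maps to the byte offset obtained by multiplying it by an assumed 512-byte block size, and that the location table maps a logical unit number to a file path. The class ScsiFileConverter and its methods are hypothetical.

```python
BLOCK_SIZE = 512  # assumed logical block size in bytes


class ScsiFileConverter:
    """Hypothetical converter: turns block-addressed I/O into file-offset I/O."""

    def __init__(self, location_table, file_store):
        self.location_table = location_table  # {lun: file path}
        self.file_store = file_store          # {file path: bytearray}

    def _locate(self, lun, lba, count):
        # Look up the file holding this logical unit, then map blocks to bytes.
        data = self.file_store[self.location_table[lun]]
        start = lba * BLOCK_SIZE
        end = start + count * BLOCK_SIZE
        if lba < 0 or count < 0 or end > len(data):
            raise ValueError("block range outside the migrated logical unit")
        return data, start, end

    def read(self, lun, lba, count):
        data, start, end = self._locate(lun, lba, count)
        return bytes(data[start:end])

    def write(self, lun, lba, payload):
        if len(payload) % BLOCK_SIZE:
            raise ValueError("payload must be a whole number of blocks")
        data, start, end = self._locate(lun, lba, len(payload) // BLOCK_SIZE)
        data[start:end] = payload
```

Under these assumptions, the legacy application continues to address data by logical unit and block number, while the converter alone needs to know which file holds the migrated data.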
The management host 6000 will now be described. The management host 6000 is coupled to the legacy storage system 4000 via management interface 4002 and to the modern storage system 5000 via management interface 5003, see
At the end of life of the legacy storage system 4000, a storage administrator migrates OS/application binary code and data stored in the logical units 4100-4103 of the legacy storage system 4000 to a modern storage system 5000. To this end, the administrator utilizes storage management software 6001 executing on the storage management host 6000 to invoke a migration module 5004 on the modern storage system 5000.
After the original application environment has been configured on the VM 2001, the legacy application executing under the VM 2001 proceeds to issue I/O operations requesting the data stored in the modern storage system.
As would be appreciated by persons of skill in the art, another storage interface transition may take place during the term of archiving of the data in the modern storage system 5000. Specifically, a technology transition may take place to a third generation data access interface, such as an object-based interface.
In accordance with the migration process illustrated in
In the control flow associated with the reconstruction of the legacy application environment at host 9000, the SCSI/3rd generation converter module 9003, rather than the SCSI/File converter module, requests the location table 10006 from the new storage system 10000 via the 3rd generation interface 9005.
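By way of illustration only, the substitution of one converter back end for another may be sketched as follows. The sketch assumes, without support in the embodiment, that the third generation interface retrieves a whole object by an identifier, that the location table maps a logical unit number to either a file path or an object identifier, and that the block size is 512 bytes. The names FileBackend, ObjectBackend, fetch, and read_blocks are hypothetical.

```python
BLOCK_SIZE = 512  # assumed logical block size in bytes


class FileBackend:
    """Stands in for the file interface of the second-generation system."""

    def __init__(self, files):
        self.files = files  # {path: bytes}

    def fetch(self, location):
        return self.files[location]


class ObjectBackend:
    """Stands in for a hypothetical object-based (third generation) interface."""

    def __init__(self, objects):
        self.objects = objects  # {object id: bytes}

    def fetch(self, location):
        return self.objects[location]


def read_blocks(backend, location_table, lun, lba, count):
    """Serve a block read by consulting the location table and slicing the data."""
    image = backend.fetch(location_table[lun])
    start = lba * BLOCK_SIZE
    return image[start:start + count * BLOCK_SIZE]
```

In this sketch the block-level request issued on behalf of the legacy application is identical in both cases; only the back end and the contents of the location table differ.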
Finally, it should be understood that processes and techniques described herein are not inherently related to any particular apparatus and may be implemented by any suitable combination of components. Further, various types of general purpose devices may be used in accordance with the teachings described herein. It may also prove advantageous to construct specialized apparatus to perform the method steps described herein. The present invention has been described in relation to particular examples, which are intended in all respects to be illustrative rather than restrictive. Those skilled in the art will appreciate that many different combinations of hardware, software, and firmware will be suitable for practicing the present invention. For example, the described software may be implemented in a wide variety of programming or scripting languages, such as Assembler, C/C++, Perl, shell, PHP, Java, etc.
Moreover, other implementations of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. Various aspects and/or components of the described embodiments may be used singly or in any combination in the computerized storage system with data replication functionality. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.