The present application relates generally to an improved data processing apparatus and method and more specifically to a rollback mechanism for linear tape file systems.
Linear Tape File System (LTFS) refers to both the format of data recorded on magnetic tape media and the implementation of specific software that uses this data format to provide a file system interface to data stored on magnetic tape. The Linear Tape File System format is a self-describing tape format developed by International Business Machines (IBM) Corporation of Armonk, N.Y. to address tape archive requirements. The LTFS Format specification defines the organization of data and metadata on tape, where the files are stored in a hierarchical directory structure. Data tapes written in the LTFS Format can be used independently of any external database or storage system allowing direct access to file content data and file metadata. This format makes it possible to implement software that presents a standard file system view of the data stored in the tape medium. This file system view makes accessing files stored on the LTFS formatted media similar to accessing files stored on other forms of storage media such as disk or removable flash drives.
As mentioned above, LTFS stores not only the content of a file on the tape medium but also metadata related to the file. This metadata, which may include data referred to as an index used for identifying the file, the name of the file, etc., is stored on the tape medium, such as in an Extensible Markup Language (XML) file format. In the existing LTFS Format, it is not permitted to divide an index into a plurality of XML files. The task of storing an index on the tape medium is referred to as a “sync” task or operation.
In one illustrative embodiment, a method, in a data processing system, is provided for restoring a file recorded on a storage medium to a previous version of the file. The illustrative embodiment presents at least two different versions of a file recorded on the storage medium to a user via a graphical user interface. In the illustrative embodiment, the at least two different versions of the file are identified from at least two different indexes recorded on the storage medium. Responsive to a selection of the previous version of the file from the at least two different versions of the file, the illustrative embodiment restores the file to the previous version of the file by recording a new index on the storage medium for the file that matches an index of the file associated with the previous version of the file.
In other illustrative embodiments, a computer program product comprising a computer useable or readable medium having a computer readable program is provided. The computer readable program, when executed on a computing device, causes the computing device to perform various ones of, and combinations of, the operations outlined above with regard to the method illustrative embodiment.
In yet another illustrative embodiment, a system/apparatus is provided. The system/apparatus may comprise one or more processors and a memory coupled to the one or more processors. The memory may comprise instructions which, when executed by the one or more processors, cause the one or more processors to perform various ones of, and combinations of, the operations outlined above with regard to the method illustrative embodiment.
These and other features and advantages of the present invention will be described in, or will become apparent to those of ordinary skill in the art in view of, the following detailed description of the example embodiments of the present invention.
The invention, as well as a preferred mode of use and further objectives and advantages thereof, will best be understood by reference to the following detailed description of illustrative embodiments when read in conjunction with the accompanying drawings, wherein:
Again, a linear tape file system (LTFS) makes accessing files stored on LTFS formatted media similar to accessing files stored on other forms of storage media such as disk or removable flash drives. In LTFS, since data is written linearly to the tape medium, a new write operation will never be performed at a position where a previous write operation has been performed. For example, even when a file that is already recorded on the tape medium is changed, the original file is not deleted and rewritten. Rather, only the difference is written to the tape medium and the index is updated to reflect the portions of the original file and the new data that comprise the file after the changes. As another example, when a file is deleted from the tape medium, the file is not actually removed from the tape medium. Rather, the index is merely updated with an indication that the file is no longer valid. Further, in both of these examples, the original index is not modified in place. Rather, a new index is written to the tape medium, superseding the older index, which itself remains on the tape medium. For this reason, because of the nature of LTFS, there is a possibility of recovering a file to a previous state.
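The append-only behavior described above can be sketched as a small in-memory model. This is a minimal illustration only, not the LTFS implementation: the `Tape` class, its method names, and the representation of a file as a list of (start block, block count) extents are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical model: an extent is (start_block, block_count) on the tape.
Extent = Tuple[int, int]
Index = Dict[str, List[Extent]]


@dataclass
class Tape:
    blocks_written: int = 0
    # Every index ever written stays in this list; the last one is current.
    indexes: List[Index] = field(default_factory=list)

    def _copy_of_current(self) -> Index:
        # Work on a copy so that earlier indexes are never mutated.
        if not self.indexes:
            return {}
        return {name: list(extents) for name, extents in self.indexes[-1].items()}

    def append_data(self, name: str, block_count: int) -> None:
        """Write new data at the end of the tape, then write a new index."""
        index = self._copy_of_current()
        extent = (self.blocks_written, block_count)
        self.blocks_written += block_count
        index.setdefault(name, []).append(extent)
        self.indexes.append(index)

    def delete(self, name: str) -> None:
        """Deletion writes a new index without the file; the data stays on tape."""
        index = self._copy_of_current()
        index.pop(name, None)
        self.indexes.append(index)
```

In this model, creating a file, changing it, and deleting it yields three indexes. The first two still describe the file, which is what makes recovery to a previous state possible.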
As illustrated in
Therefore, in a tape medium, all previous (past) data remains without being deleted or erased. However, because the indexes of a tape medium are updated at regular time intervals, if the tape medium were written every hour of every day for a year, there may be 8,760 indexes (24 hours × 365 days) on the tape medium. Consequently, when an LTFS user wants to recover the content of a file back to a previous state before updating or when an LTFS user wants to recover a file that was deleted by mistake, the user would have to select from 8,760 indexes in order to choose an appropriate index and file version, which may be an arduous task.
The illustrative embodiments provide a rollback mechanism for linear tape file systems that provides the user with an easy means to recover the content of a file back to a previous state at a user-identified point in time. The mechanism utilizes a plurality of components that are graphically presented to the user and that allow the user to select from a plurality of year/month/day/time ranges to which a file can be recovered. With each year/month/day/time range available to be selected by the user, there is a different version of the file, which may be provided as a preview to the user. With the selection of a version of the file at one of the year/month/day/time ranges, the mechanism of the illustrative embodiments returns the file to the previous version by updating the index to reflect the selected previous state.
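The restore step, in which a new index is recorded so that the file's entry matches its entry in an earlier index, might look like the following sketch. The function name `restore`, the list-of-dictionaries representation of the indexes, and the choice to raise an error when the file is absent from the selected index are assumptions made for illustration.

```python
import copy
from typing import Dict, List

# Hypothetical model: each index maps a file name to that file's metadata.
Index = Dict[str, dict]


def restore(indexes: List[Index], name: str, target: int) -> Index:
    """Append a new index in which `name` matches its entry in indexes[target].

    Earlier indexes are left untouched; entries for all other files are
    carried over from the most recent index.
    """
    if name not in indexes[target]:
        raise ValueError(f"{name!r} is not described by index {target}")
    new_index = copy.deepcopy(indexes[-1])
    new_index[name] = copy.deepcopy(indexes[target][name])
    indexes.append(new_index)
    return new_index
```

Nothing is erased or rewritten in this sketch: the restore is itself one more appended index, so the state before the restore remains available as a version as well.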
Thus, the illustrative embodiments may be utilized in many different types of data processing environments. In order to provide a context for the description of the specific elements and functionality of the illustrative embodiments,
In the depicted example, server 304 and server 306 are connected to network 302 along with storage unit 308. In addition, clients 310, 312, and 314 are also connected to network 302. These clients 310, 312, and 314 may be, for example, personal computers, network computers, or the like. In the depicted example, server 304 provides data, such as boot files, operating system images, and applications to the clients 310, 312, and 314. Clients 310, 312, and 314 are clients to server 304 in the depicted example. Distributed data processing system 300 may include additional servers, clients, and other devices not shown.
In the depicted example, distributed data processing system 300 is the Internet with network 302 representing a worldwide collection of networks and gateways that use the Transmission Control Protocol/Internet Protocol (TCP/IP) suite of protocols to communicate with one another. At the heart of the Internet is a backbone of high-speed data communication lines between major nodes or host computers, consisting of thousands of commercial, governmental, educational and other computer systems that route data and messages. Of course, the distributed data processing system 300 may also be implemented to include a number of different types of networks, such as for example, an intranet, a local area network (LAN), a wide area network (WAN), or the like. As stated above,
In the depicted example, data processing system 400 employs a hub architecture including north bridge and memory controller hub (NB/MCH) 402 and south bridge and input/output (I/O) controller hub (SB/ICH) 404. Processing unit 406, main memory 408, and graphics processor 410 are connected to NB/MCH 402. Graphics processor 410 may be connected to NB/MCH 402 through an accelerated graphics port (AGP).
In the depicted example, local area network (LAN) adapter 412 connects to SB/ICH 404. Audio adapter 416, keyboard and mouse adapter 420, modem 422, read only memory (ROM) 424, hard disk drive (HDD) 426, CD-ROM drive 430, universal serial bus (USB) ports and other communication ports 432, and PCI/PCIe devices 434 connect to SB/ICH 404 through bus 438 and bus 440. PCI/PCIe devices may include, for example, Ethernet adapters, add-in cards, and PC cards for notebook computers. PCI uses a card bus controller, while PCIe does not. ROM 424 may be, for example, a flash basic input/output system (BIOS).
HDD 426 and CD-ROM drive 430 connect to SB/ICH 404 through bus 440. HDD 426 and CD-ROM drive 430 may use, for example, an integrated drive electronics (IDE) or serial advanced technology attachment (SATA) interface. Super I/O (SIO) device 436 may be connected to SB/ICH 404.
An operating system runs on processing unit 406. The operating system coordinates and provides control of various components within the data processing system 400 in
As a server, data processing system 400 may be, for example, an IBM® eServer™ System p® computer system, running the Advanced Interactive Executive (AIX®) operating system or the LINUX® operating system. Data processing system 400 may be a symmetric multiprocessor (SMP) system including a plurality of processors in processing unit 406. Alternatively, a single processor system may be employed.
Instructions for the operating system, the object-oriented programming system, and applications or programs are located on storage devices, such as HDD 426, and may be loaded into main memory 408 for execution by processing unit 406. The processes for illustrative embodiments of the present invention may be performed by processing unit 406 using computer usable program code, which may be located in a memory such as, for example, main memory 408, ROM 424, or in one or more peripheral devices 426 and 430, for example.
A bus system, such as bus 438 or bus 440 as shown in
Those of ordinary skill in the art will appreciate that the hardware in
Moreover, the data processing system 400 may take the form of any of a number of different data processing systems including client computing devices, server computing devices, a tablet computer, laptop computer, telephone or other communication device, a personal digital assistant (PDA), or the like. In some illustrative examples, data processing system 400 may be a portable computing device that is configured with flash memory to provide non-volatile memory for storing operating system files and/or user-generated data, for example. Essentially, data processing system 400 may be any known or later developed data processing system without architectural limitation.
In accordance with the illustrative embodiments, in response to a request to view previous versions of a particular file, rollback logic 504 starts a process of reading all of the indexes of tape medium 508 starting with the most recent index. Rollback logic 504 reads the most recent index (n) and the index just prior to the most recent index (n−1), i.e. the preceding index. For the most recent index (n), rollback logic 504 determines whether the most recent index (n) and the n−1 index comprise any information associated with the file. If rollback logic 504 determines that there is information associated with the file in the most recent index (n) and the n−1 index, rollback logic 504 determines whether the information associated with the file in the most recent index (n) is different (indicating a change) from information associated with the file in the n−1 index. If the information associated with the file is different, then rollback logic 504 records the information as a version of the file along with an identifying date and time of the index.
If rollback logic 504 determines that there is no information associated with the file in the most recent index (n) or if the information associated with the file in the most recent index (n) is not different (indicating a change) from information associated with the file in the n−1 index, rollback logic 504 determines whether there is another index preceding the n−1 index. That is, rollback logic 504 determines whether there is an n−2 index. If there is an n−2 index, then rollback logic 504 determines whether the n−1 index and the n−2 index comprise any information associated with the file. If rollback logic 504 determines that there is information associated with the file in the n−1 index and the n−2 index, rollback logic 504 determines whether the information associated with the file in the n−1 index is different (indicating a change) from information associated with the file in the n−2 index. If the information associated with the file is different, then rollback logic 504 records the information as a version of the file along with an identifying date and time of the index. If rollback logic 504 determines that there is no information associated with the file in the n−1 index or if the information associated with the file in the n−1 index is not different (indicating a change) from information associated with the file in the n−2 index, rollback logic 504 repeats the process until the initial index is reached. As is illustrated, only when a change is detected does rollback logic 504 record the information about the file, such that only the information associated with changes in the file is recorded as changed file versions.
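The pairwise comparison of adjacent indexes described above can be sketched as follows. This is a simplified illustration under stated assumptions: each index is modeled as a (timestamp, mapping) pair, the name `find_versions` is hypothetical, and the sketch folds the "file added" and "initial index" cases into the same comparison by treating a missing preceding entry as a difference.

```python
from typing import Dict, List, Tuple

# Hypothetical model: an index is (timestamp, {file name: file information}).
Index = Tuple[str, Dict[str, str]]


def find_versions(indexes: List[Index], name: str) -> List[Tuple[str, str]]:
    """Walk the indexes from newest to oldest and record (timestamp, info)
    for every index in which the file's information differs from the
    immediately preceding index."""
    versions = []
    for n in range(len(indexes) - 1, -1, -1):
        stamp, newer = indexes[n]
        # The initial index has no predecessor; compare against an empty one.
        older = indexes[n - 1][1] if n > 0 else {}
        info = newer.get(name)
        if info is not None and info != older.get(name):
            versions.append((stamp, info))
    return versions
```

Indexes in which the file is absent or unchanged contribute nothing to the result, so the returned list contains only the points in time to which the file can meaningfully be rolled back.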
As described above, because indexes of a tape medium are updated at regular time intervals, if the tape medium were written every hour of every day for a year, there may be 8,760 indexes on the tape medium. Rollback logic 504 spares the user from having to review every index to identify a previous version of the file by identifying only those indexes and information where the file changed or was added to tape medium 508. Thus, if a file was added to tape medium 508 and was updated only 14 times over a year, rollback logic 504 identifies the 15 indexes where the file changed and the information associated with the file indicating the change.
Further, in response to the request to view previous versions of a particular file, rollback logic 504 presents the information to the user via a graphical user interface in display 510 as a combination of components that allow the user to easily select an arbitrary previous point in time for recovery. That is, rollback logic 504 may present the different versions as a combination of three sliders where time is split into three different time scales. Each of the three different time scales is assigned to the corresponding one of the three sliders. The first slider may be used for selection in the time scale of “year/month.” The second slider may be used for selection in the time scale of “day.” The third slider may be used for selection in the time scale of “time.” With the use of these sliders, the user may select only a returnable point in time. By the user selecting a combination of year/month, day, and time, rollback logic 504 is able to identify the version of the file associated with that point in time as the version to restore. Additionally, once the user has selected a combination of year/month, day, and time, rollback logic 504 may also provide a preview of the identified version of the file, which the user may view prior to actually restoring the file. That is, by the user selecting the preview of the file at the combination of year/month, day, and time, tape drive logic 502 may read the file based on the selected index information and provide a preview of the file to the user based on that information. Finally, if the user is satisfied that the identified version of the file is the version of the file the user wants to restore, rollback logic 504 provides a restore button for the user to indicate the selection.
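One way to derive the selectable positions of the three sliders from the recorded version timestamps is sketched below. The function name `slider_positions` and the nested-dictionary layout are assumptions for illustration; the description does not specify how the graphical user interface stores these values.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List


def slider_positions(stamps: Iterable[datetime]) -> Dict[str, Dict[str, List[str]]]:
    """Group version timestamps as {year/month: {day: [time, ...]}}.

    The outer keys feed the first slider, the inner keys the second slider,
    and the innermost lists the third slider, so that only points in time
    that have a recorded version can be selected.
    """
    tree = defaultdict(lambda: defaultdict(list))
    for ts in sorted(stamps):
        tree[ts.strftime("%Y/%m")][ts.strftime("%d")].append(ts.strftime("%H:%M"))
    # Convert to plain dictionaries for easier inspection.
    return {year_month: dict(days) for year_month, days in tree.items()}
```

With such a grouping, choosing a year/month narrows the second slider to the days that have versions in that month, and choosing a day narrows the third slider to the times recorded on that day.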
The above aspects and advantages of the illustrative embodiments of the present invention will be described in greater detail hereafter with reference to the accompanying figures. It should be appreciated that the figures are only intended to be illustrative of exemplary embodiments of the present invention. The present invention may encompass aspects, embodiments, and modifications to the depicted exemplary embodiments not explicitly shown in the figures but would be readily apparent to those of ordinary skill in the art in view of the present description of the illustrative embodiments.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method, or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in any one or more computer readable medium(s) having computer usable program code embodied thereon.
Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium is a system, apparatus, or device of an electronic, magnetic, optical, electromagnetic, or semiconductor nature, any suitable combination of the foregoing, or equivalents thereof. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical device having a storage capability, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber based device, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium is any tangible medium that can contain or store a program for use by, or in connection with, an instruction execution system, apparatus, or device.
In some illustrative embodiments, the computer readable medium is a non-transitory computer readable medium. A non-transitory computer readable medium is any medium that is not a disembodied signal or propagation wave, i.e. pure signal or propagation wave per se. A non-transitory computer readable medium may utilize signals and propagation waves, but is not the signal or propagation wave itself. Thus, for example, various forms of memory devices, and other types of systems, devices, or apparatus, that utilize signals in any way, such as, for example, to maintain their state, may be considered to be non-transitory computer readable media within the scope of the present description.
A computer readable signal medium, on the other hand, may include a propagated data signal with computer readable program code embodied therein, for example, in a baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Similarly, a computer readable storage medium is any computer readable medium that is not a computer readable signal medium.
Computer code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, radio frequency (RF), etc., or any suitable combination thereof.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java™, Smalltalk™, C++, or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the illustrative embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions that implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
If at step 706 either the first or the second index fails to comprise information about the file, if at step 708 the information associated with the file fails to be different, or upon continuing from step 710, the rollback mechanism determines if there is another index preceding the current second index (step 712). If at step 712 there is another index preceding the current second index, the rollback mechanism reads a next immediately preceding index (n−2) such that, in the comparison that is performed by the rollback mechanism, the next immediately preceding index (n−2) is considered as the second index and the previous second index is considered as the first index (step 714), with the operation returning to step 706 thereafter. If at step 712 there is not another index, the rollback mechanism records the information from the second index as an initial version of the file along with an identifying date and time of the index (step 716).
The rollback mechanism then presents the information to the user via a graphical user interface along with a preview of the most current version of the file, such as is illustrated in graphical user interface 600 of
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Thus, the illustrative embodiments provide a rollback mechanism for linear tape file systems that provides the user with an easy means to recover the content of a file back to a previous state at a user-identified point in time. The mechanism utilizes a plurality of components that are graphically presented to the user and that allow the user to select from a plurality of year/month/day/time ranges to which a file can be recovered. With each year/month/day/time range available to be selected by the user, there is a different version of the file, which may be provided as a preview to the user. With the selection of a version of the file at one of the year/month/day/time ranges, the mechanism of the illustrative embodiments returns the file to the previous version by updating the index to reflect the selected previous state.
As noted above, it should be appreciated that the illustrative embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements. In one example embodiment, the mechanisms of the illustrative embodiments are implemented in software or program code, which includes but is not limited to firmware, resident software, microcode, etc.
A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers. Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems and Ethernet cards are just a few of the currently available types of network adapters.
The description of the present invention has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiment was chosen and described in order to best explain the principles of the invention, the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.