This application claims priority under 35 U.S.C. § 119 from Japanese Patent Application No. 2022-148540, filed on Sep. 16, 2022, the entire disclosure of which is incorporated herein by reference.
The present disclosure relates to an information processing apparatus, an information processing method, and an information processing program.
WO2021/181739A discloses a technique of grouping and recording a plurality of pieces of data on a magnetic tape.
JP2021-117772A discloses a technique of generating parity data by using a plurality of pieces of data and distributing and recording the plurality of pieces of data and the parity data on a plurality of magnetic tapes.
In a case of migrating data from a data-migration-source magnetic tape to a data-migration-destination magnetic tape, the data migration time increases as the frequency of replacement of magnetic tapes with respect to a tape drive increases.
The present disclosure has been made in view of the above circumstances, and an object of the present disclosure is to provide an information processing apparatus, an information processing method, and an information processing program capable of reducing a frequency of replacement of magnetic tapes with respect to a tape drive during data migration.
According to a first aspect, there is provided an information processing apparatus comprising: at least one processor, in which the processor is configured to: determine, for each of a plurality of packed objects recorded on one data-migration-source magnetic tape designated as a data migration target among a plurality of magnetic tapes on which the plurality of packed objects and parity data generated by using the plurality of packed objects are distributed and recorded, whether or not to regenerate the packed objects by using objects included in the packed objects, the plurality of packed objects being obtained by grouping one or more objects including data and metadata related to the data; generate a second packed object by using objects included in a first packed object determined as being regenerated; generate parity data by using a plurality of second packed objects; perform control of distributing and recording the plurality of second packed objects and the parity data on the plurality of magnetic tapes; and perform control of recording a packed object determined as not being regenerated on a magnetic tape different from the data-migration-source magnetic tape.
According to a second aspect, in the information processing apparatus according to the first aspect, the processor is configured to further transfer the first packed object used for generating the second packed object to a magnetic tape different from the data-migration-source magnetic tape in a case where a purpose of data migration is to initialize and reuse the data-migration-source magnetic tape.
According to a third aspect, in the information processing apparatus according to the second aspect, the plurality of packed objects and the parity data generated from the plurality of packed objects are regarded as one group, and the processor is configured to: set, in a case where magnetic tapes other than the data-migration-source magnetic tape are also designated as data migration sources during data migration processing of the data-migration-source magnetic tape, as a data migration order of those other magnetic tapes, a descending order of the number of recorded packed objects belonging to the same group as the packed object recorded on the data-migration-source magnetic tape undergoing the data migration processing.
According to a fourth aspect, in the information processing apparatus according to any one aspect of the first aspect to the third aspect, the plurality of packed objects and the parity data generated from the plurality of packed objects are regarded as one group, and the processor is configured to: determine, in a case where a purpose of data migration is data migration from relatively-old magnetic tapes to relatively-new magnetic tapes, not to regenerate a packed object by using objects included in the packed object recorded on the data-migration-source magnetic tape in a case where a packed object belonging to the same group as the packed objects recorded on the data-migration-source magnetic tape is recorded on a relatively-new magnetic tape.
According to a fifth aspect, in the information processing apparatus according to any one aspect of the first aspect to the fourth aspect, the processor is configured to: determine whether or not to regenerate a packed object by using objects included in the packed object, based on a deletion rate of the objects in the packed object recorded on the data-migration-source magnetic tape.
According to a sixth aspect, there is provided an information processing method executed by a processor of an information processing apparatus, the method comprising: determining, for each of a plurality of packed objects recorded on one data-migration-source magnetic tape designated as a data migration target among a plurality of magnetic tapes on which the plurality of packed objects and parity data generated by using the plurality of packed objects are distributed and recorded, whether or not to regenerate the packed objects by using objects included in the packed objects, the plurality of packed objects being obtained by grouping one or more objects including data and metadata related to the data; generating a second packed object by using objects included in a first packed object determined as being regenerated; generating parity data by using a plurality of second packed objects; performing control of distributing and recording the plurality of second packed objects and the parity data on the plurality of magnetic tapes; and performing control of recording a packed object determined as not being regenerated on a magnetic tape different from the data-migration-source magnetic tape.
According to a seventh aspect, there is provided an information processing program for causing a processor of an information processing apparatus to execute a process comprising: determining, for each of a plurality of packed objects recorded on one data-migration-source magnetic tape designated as a data migration target among a plurality of magnetic tapes on which the plurality of packed objects and parity data generated by using the plurality of packed objects are distributed and recorded, whether or not to regenerate the packed objects by using objects included in the packed objects, the plurality of packed objects being obtained by grouping one or more objects including data and metadata related to the data; generating a second packed object by using objects included in a first packed object determined as being regenerated; generating parity data by using a plurality of second packed objects; performing control of distributing and recording the plurality of second packed objects and the parity data on the plurality of magnetic tapes; and performing control of recording a packed object determined as not being regenerated on a magnetic tape different from the data-migration-source magnetic tape.
According to the present disclosure, it is possible to reduce a frequency of replacement of magnetic tapes with respect to a tape drive during data migration.
Hereinafter, an example of an embodiment for performing a technique according to the present disclosure will be described in detail with reference to the drawings.
First, a configuration of an information processing system 10 according to the present embodiment will be described with reference to
The tape library 14 includes a plurality of slots (not illustrated) and a plurality of tape drives 18, and each slot houses a magnetic tape T as an example of a recording medium. Each tape drive 18 is connected to the information processing apparatus 12. The tape drive 18 writes or reads data to or from the magnetic tape T under a control of the information processing apparatus 12. Examples of the magnetic tape T include a linear tape-open (LTO) tape.
In a case where the information processing apparatus 12 writes or reads data to or from the magnetic tape T, the magnetic tape T as a write target or a read target is loaded from the slot into a predetermined tape drive 18. In a case where the writing or reading of data to or from the magnetic tape T loaded into the tape drive 18 is completed, the magnetic tape T is unloaded from the tape drive 18 into the slot in which the magnetic tape T was originally housed.
In the present embodiment, as an example, as illustrated in
In addition, in the present embodiment, as illustrated in
Examples of the packing rule include a rule for grouping a plurality of objects including pieces of data having the same extension into the same packed object and a rule for grouping a plurality of objects that are likely to be read at the same time into the same packed object. In addition, examples of the packing rule include a rule for grouping a plurality of objects into one packed object such that a size of one packed object is equal to or larger than a predetermined lower limit value and is smaller than a predetermined upper limit value. In addition, examples of the packing rule include a rule for grouping a plurality of objects into one packed object such that the number of objects included in one packed object is equal to or larger than a predetermined lower limit value and is smaller than a predetermined upper limit value. In addition, a plurality of packing rules may be combined.
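As a non-limiting illustration, a combination of the extension rule and the size upper limit rule described above can be sketched as follows. The names `StoredObject` and `pack_by_extension` and the byte-based size are assumptions introduced only for this sketch and do not appear in the embodiment; the lower limit value is not enforced here.

```python
from collections import defaultdict
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass
class StoredObject:
    key: str   # hypothetical object name, e.g. "a.jpg"
    size: int  # size of the data in bytes (assumed unit)


def pack_by_extension(objects, upper_limit):
    """Group objects that share an extension, closing a packed object
    before its total size would reach upper_limit.

    A single object whose size is already upper_limit or more still
    forms a packed object of its own in this sketch.
    """
    by_ext = defaultdict(list)
    for obj in objects:
        by_ext[PurePosixPath(obj.key).suffix].append(obj)

    packs = []
    for ext in sorted(by_ext):
        current, current_size = [], 0
        for obj in by_ext[ext]:
            # Start a new packed object if adding obj would reach the limit.
            if current and current_size + obj.size >= upper_limit:
                packs.append(current)
                current, current_size = [], 0
            current.append(obj)
            current_size += obj.size
        if current:
            packs.append(current)
    return packs


objects = [
    StoredObject("a.jpg", 40),
    StoredObject("b.jpg", 40),
    StoredObject("c.jpg", 40),
    StoredObject("d.txt", 10),
]
packs = pack_by_extension(objects, upper_limit=100)
pack_keys = [[o.key for o in pack] for pack in packs]
```

In this sketch, the three objects having the extension ".jpg" are split into two packed objects because grouping all three would reach the upper limit value.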
Next, a hardware configuration of the information processing apparatus 12 according to the present embodiment will be described with reference to
The storage unit 22 is realized by a hard disk drive (HDD), a solid state drive (SSD), a flash memory, or the like. An information processing program 30 is stored in the storage unit 22 as a storage medium. The CPU 20 reads the information processing program 30 from the storage unit 22, loads the read information processing program 30 into the memory 21, and executes the loaded information processing program 30.
On the other hand, the information processing apparatus 12 according to the present embodiment has a function of migrating a packed object recorded on the magnetic tape T to another magnetic tape T. Hereinafter, in the data migration, a magnetic tape T as a data migration source is referred to as a migration source tape, and a magnetic tape T as a data migration destination is referred to as a migration destination tape. A packed object as a migration target that is recorded on the migration source tape will be described with reference to
As illustrated in
In the present embodiment, three packed objects and one piece of parity data belonging to one group are distributed and recorded in four storage pools, that is, four magnetic tapes T. In
In
In addition, in the present embodiment, since the storage pool function is used, recording states of pieces of data on the magnetic tapes T between the storage pools are not always the same. For example, as illustrated in
Next, a functional configuration of the information processing apparatus 12 according to the present embodiment will be described with reference to
The determination unit 40 determines whether or not to regenerate a packed object by using objects included in the packed object, for each packed object recorded on one migration source tape designated as a data migration target among the plurality of magnetic tapes T. In the present embodiment, the determination unit 40 determines whether or not to regenerate a packed object by using objects included in the packed object, based on a deletion rate of the objects in the packed object recorded on the migration source tape.
In the information processing system 10 according to the present embodiment, an object for which a deletion request is input by a user, an administrator, or the like of the information processing system 10 is not deleted from the magnetic tape T at a timing when the deletion request is input, and deletion information indicating that the object is deleted is assigned to the object. The deletion information is saved in association with the object, for example, in a database for managing the object or an area for recording management information of the magnetic tape T on which the object is recorded. This is because it is difficult to delete, from the magnetic tape T, only a specific object among the objects recorded on the magnetic tape T. The deletion rate of the objects in the packed object represents a rate at which the objects in the packed object are deleted.
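As a non-limiting illustration, the deletion information and the deletion rate can be sketched as follows, with the rate taken either as a ratio of the number of objects or as a ratio of the total size of objects. The names `ObjectEntry` and `should_regenerate` and the threshold value of 0.5 are assumptions introduced only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class ObjectEntry:
    size: int              # size of the object in bytes (assumed unit)
    deleted: bool = False  # True if deletion information is assigned


def deletion_rate_by_count(entries):
    """Ratio of the number of deleted objects to the number of all objects."""
    if not entries:
        return 0.0
    return sum(1 for e in entries if e.deleted) / len(entries)


def deletion_rate_by_size(entries):
    """Ratio of the total size of deleted objects to the total size of all objects."""
    total = sum(e.size for e in entries)
    if total == 0:
        return 0.0
    return sum(e.size for e in entries if e.deleted) / total


def should_regenerate(entries, threshold=0.5):
    """Hypothetical decision rule: regenerate when the count-based
    deletion rate is equal to or higher than an assumed threshold."""
    return deletion_rate_by_count(entries) >= threshold


packed = [
    ObjectEntry(100, deleted=True),
    ObjectEntry(300),
    ObjectEntry(100, deleted=True),
    ObjectEntry(500),
]
count_rate = deletion_rate_by_count(packed)
size_rate = deletion_rate_by_size(packed)
regenerate = should_regenerate(packed)
```

The two ratios can differ considerably for the same packed object, so which one is used affects which packed objects are regenerated.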
In the present embodiment, the determination unit 40 derives, as the deletion rate, a ratio of the number of objects to which pieces of the deletion information are assigned to the total number of objects included in the packed object. The determination unit 40 may derive, as the deletion rate, a ratio of a total size of objects to which pieces of the deletion information are assigned to a total size of objects included in the packed object. In addition, as illustrated in
In
The first generation unit 42 generates a packed object (hereinafter, referred to as “second packed object”) by using the objects included in the packed object (hereinafter, referred to as “first packed object”) determined by the determination unit 40 as being regenerated. As illustrated in
The second generation unit 44 generates parity data by using a plurality of second packed objects generated by the first generation unit 42. As illustrated in
In a case of generating the parity data by using the plurality of second packed objects, the second generation unit 44 may perform processing of making the sizes of the plurality of second packed objects the same size by adding dummy data to the second packed objects other than the second packed object having a maximum size. Examples of the dummy data include data padded with 0, data padded with 1, and the like.
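As a non-limiting illustration, the padding with dummy data and the parity generation can be sketched as follows. The embodiment does not specify the parity calculation; a bytewise exclusive OR (XOR) is assumed here, and the function names, including the helper `recover_missing`, are hypothetical.

```python
def pad_to_same_size(blocks, fill=b"\x00"):
    """Append dummy data (0 padding by default) so that every block has
    the size of the largest block."""
    longest = max(len(b) for b in blocks)
    return [b + fill * (longest - len(b)) for b in blocks]


def xor_parity(blocks):
    """Bytewise XOR parity over the padded blocks (assumed parity scheme)."""
    padded = pad_to_same_size(blocks)
    parity = bytearray(len(padded[0]))
    for block in padded:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(surviving_blocks, parity):
    """Rebuild one missing block from the surviving blocks and the parity.
    The result keeps its padding; the caller trims it to the original size."""
    return xor_parity(list(surviving_blocks) + [parity])


second_packed_objects = [b"abc", b"de", b"f"]
parity = xor_parity(second_packed_objects)
restored = recover_missing([b"abc", b"f"], parity)
```

With XOR parity, any one of the second packed objects in a group can be rebuilt from the remaining ones and the parity data, provided that the original size of each packed object is managed separately so that the dummy data can be removed.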
The controller 46 performs control of distributing and recording the plurality of second packed objects and the parity data on the plurality of magnetic tapes T, the parity data being generated by the second generation unit 44 by using the plurality of second packed objects. That is, as illustrated in
In addition, the controller 46 performs control of recording, for a packed object determined by the determination unit 40 as not being regenerated, the packed object on a magnetic tape T different from the migration source tape. Specifically, as illustrated in
Next, an operation of the information processing apparatus 12 according to the present embodiment will be described with reference to
In step S10 of
In step S12, as described above, the first generation unit 42 generates a second packed object by using objects included in the first packed object determined as being regenerated in step S10. In step S14, as described above, the second generation unit 44 generates parity data by using a plurality of second packed objects generated in step S12. In step S16, as described above, the controller 46 performs control of distributing and recording the plurality of second packed objects and the parity data on the plurality of magnetic tapes T, the parity data being generated in step S14 by using the plurality of second packed objects. In a case where the processing of step S16 is completed, data migration processing is completed.
On the other hand, in step S18, as described above, the controller 46 performs control of recording, for a packed object determined in step S10 as not being regenerated, the packed object on a magnetic tape T different from the migration source tape. In a case where the processing of step S18 is completed, data migration processing is completed.
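As a non-limiting illustration, the branch between steps S12 to S16 and step S18 can be sketched as a planning function. This sketch assumes that a second packed object is generated one-to-one from a first packed object by keeping only the objects to which deletion information is not assigned, that a count-based deletion rate is compared with a threshold of 0.5, and that three second packed objects form one group for parity generation. The names `PackedObject` and `plan_migration` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PackedObject:
    name: str
    objects: dict  # object key -> True if deletion information is assigned


def plan_migration(packed_objects, threshold=0.5, group_size=3):
    """Return (groups of second packed objects, packed objects copied as-is)."""
    regenerated, copied = [], []
    for po in packed_objects:
        # S10: decide per packed object (assumed count-based deletion rate).
        rate = sum(po.objects.values()) / len(po.objects) if po.objects else 0.0
        if rate >= threshold:
            # S12: build a second packed object from the remaining objects.
            live = {key: False for key, deleted in po.objects.items() if not deleted}
            regenerated.append(PackedObject(po.name + "'", live))
        else:
            # S18: record the packed object unchanged on another tape.
            copied.append(po)
    # S14/S16: each group would receive its own parity data and be
    # distributed over the destination tapes.
    groups = [
        regenerated[i:i + group_size]
        for i in range(0, len(regenerated), group_size)
    ]
    return groups, copied


source_tape = [
    PackedObject("P1", {"a": True, "b": True, "c": False}),
    PackedObject("P2", {"d": False, "e": False}),
    PackedObject("P3", {"f": True, "g": False}),
]
groups, copied = plan_migration(source_tape)
```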
As described above, according to the present embodiment, data migration can be performed from one migration source tape while maintaining redundancy of the packed object recorded on the migration source tape. That is, the number of migration source tapes is one, and thus the number of tape drives 18 used for data migration can be reduced. As a result, it is possible to reduce a frequency of replacement of the magnetic tapes T with respect to the tape drive 18 during data migration. Therefore, it is possible to suppress an increase in the data migration time.
In the embodiment, in a case where a purpose of the data migration is to initialize and reuse the migration source tape, the controller 46 may further perform control of transferring the first packed object used for generating the second packed object to a magnetic tape T different from the migration source tape. Specifically, as illustrated in
Further, in the embodiment, in a case where magnetic tapes T other than the migration source tape are also designated as data migration sources during the data migration processing of the migration source tape, the controller 46 may derive an order of data migration as described below. That is, in this case, the controller 46 may set, as a data migration order of those other magnetic tapes T, a descending order of the number of recorded packed objects belonging to the same group as the packed objects recorded on the migration source tape undergoing the data migration processing. A specific example of the order of data migration will be described with reference to
As illustrated in
In the example of
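As a non-limiting illustration, the derivation of the data migration order can be sketched as a sort in descending order of the number of packed objects that belong to the same groups as the packed objects on the migration source tape being processed. The function name, the tape names, and the group identifiers are hypothetical and are not taken from the drawings.

```python
def migration_order(current_groups, other_source_tapes):
    """Order the remaining migration source tapes.

    current_groups: set of group ids of the packed objects recorded on
        the migration source tape being processed.
    other_source_tapes: dict mapping a tape name to the list of group
        ids of the packed objects recorded on that tape.
    """
    def shared_count(tape):
        return sum(1 for g in other_source_tapes[tape] if g in current_groups)

    # Tapes sharing more packed objects of the same groups come first;
    # ties keep their original order because sorted() is stable.
    return sorted(other_source_tapes, key=shared_count, reverse=True)


current = {"G1", "G2", "G3"}
others = {
    "T2": ["G1", "G4"],
    "T3": ["G1", "G2", "G3"],
    "T4": ["G2", "G3", "G5"],
}
order = migration_order(current, others)
```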
In addition, in the embodiment, the determination unit 40 may perform the following determination in a case where a purpose of data migration is data migration from relatively-old magnetic tapes T to relatively-new magnetic tapes T. That is, in this case, in a case where a packed object belonging to the same group as the packed objects recorded on the migration source tape is recorded on a relatively-new magnetic tape T, the determination unit 40 may determine not to regenerate the packed object by using objects included in the packed object recorded on the migration source tape. A specific example of the determination processing will be described with reference to
As illustrated in
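As a non-limiting illustration, the determination for data migration from relatively-old magnetic tapes T to relatively-new magnetic tapes T can be sketched as follows. The function name, the tape names, and the way of identifying relatively-new tapes by a set of names are assumptions introduced only for this sketch.

```python
def keep_without_regeneration(group_id, placements, new_tapes):
    """Return True if a packed object of the same group is already
    recorded on a relatively-new tape, in which case the packed object
    on the migration source tape is not regenerated.

    placements: dict mapping a group id to the names of the tapes on
        which the packed objects of that group are recorded.
    new_tapes: set of names of relatively-new tapes.
    """
    return any(tape in new_tapes for tape in placements.get(group_id, []))


placements = {
    "G1": ["old-1", "new-1", "old-2"],
    "G2": ["old-1", "old-3"],
}
new_tapes = {"new-1"}
g1_kept = keep_without_regeneration("G1", placements, new_tapes)
g2_kept = keep_without_regeneration("G2", placements, new_tapes)
```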
Further, in the embodiment, for example, as a hardware structure of a processing unit that executes various processing, such as the determination unit 40, the first generation unit 42, the second generation unit 44, and the controller 46, the following various processors may be used. The various processors include, as described above, a CPU, which is a general-purpose processor that functions as various processing units by executing software (program), a programmable logic device (PLD), which is a processor of which the circuit configuration may be changed after manufacturing, such as a field programmable gate array (FPGA), and a dedicated electric circuit, which is a processor having a circuit configuration specifically designed to execute specific processing, such as an application specific integrated circuit (ASIC).
One processing unit may be configured by one of these various processors, or may be configured by a combination of two or more processors of the same type or different types (for example, a combination of a plurality of FPGAs or a combination of a CPU and an FPGA). Further, the plurality of processing units may be configured by one processor.
As an example in which the plurality of processing units are configured by one processor, firstly, as represented by a computer such as a client and a server, a form in which one processor is configured by a combination of one or more CPUs and software and the processor functions as the plurality of processing units may be adopted. Secondly, as represented by a system on chip (SoC) or the like, a form in which a processor that realizes the function of the entire system including the plurality of processing units by one integrated circuit (IC) chip is used may be adopted. As described above, the various processing units are configured by using one or more various processors as a hardware structure.
Further, as the hardware structure of the various processors, more specifically, an electric circuit (circuitry) in which circuit elements such as semiconductor elements are combined may be used.
Further, in the embodiment, an example in which the information processing program 30 is stored (installed) in the storage unit 22 in advance has been described. However, the present disclosure is not limited thereto. The information processing program 30 may be provided by being recorded in a recording medium such as a compact disc read only memory (CD-ROM), a digital versatile disc read only memory (DVD-ROM), or a Universal Serial Bus (USB) memory. Further, the information processing program 30 may be downloaded from an external apparatus via a network.
Number | Date | Country | Kind |
---|---|---|---|
2022-148540 | Sep 2022 | JP | national |