This application claims priority to Chinese Patent Application No. 202010449309.4 filed on May 25, 2020. Chinese Patent Application No. 202010449309.4 is hereby incorporated by reference in its entirety.
Embodiments of the present disclosure generally relate to the field of data storage, and in particular, to a method, an electronic device, and a computer-readable storage medium for creating a snapview backup.
A backup system usually includes a backup client and a backup server, where the backup client can send files to be backed up to the backup server for backup. Many backup systems provide a function called snapview backup. This function creates a new backup by referencing data in existing backup files in the backup server.
In a disk image backup scenario, basic disk image files and changed data block files are usually created at different times, so there are usually two backup files created for them respectively in the backup server. When customers want to restore the latest disk image file, they need to restore the basic disk image files and the changed data block files, and then merge them to obtain the latest disk image file. This is often inconvenient for the customers.
Embodiments of the present disclosure provide a method, an electronic device, and a computer-readable storage medium for creating a snapview backup.
In a first aspect of the present disclosure, there is provided a method for creating a snapview backup. The method includes: acquiring, at a backup client, a first file list corresponding to the snapview backup to be created, the first file list indicating a plurality of files to be referenced by the snapview backup, and data of the plurality of files being contained in a plurality of containers at a backup server; generating a second file list by sorting the plurality of files indicated in the first file list according to the containers to which they belong, the second file list indicating that each of the plurality of containers contains data of at least one of the plurality of files; and creating the snapview backup by causing the backup server to reference the data of the plurality of files in the plurality of containers based on the second file list.
In a second aspect of the present disclosure, there is provided an electronic device. The electronic device includes at least one processing unit and at least one memory. The at least one memory is coupled to the at least one processing unit and stores instructions for execution by the at least one processing unit. The instructions, when executed by the at least one processing unit, cause the device to perform actions, and the actions include: acquiring a first file list corresponding to a snapview backup to be created, the first file list indicating a plurality of files to be referenced by the snapview backup, and data of the plurality of files being contained in a plurality of containers at a backup server; generating a second file list by sorting the plurality of files indicated in the first file list according to the containers to which they belong, the second file list indicating that each of the plurality of containers contains data of at least one of the plurality of files; and creating the snapview backup by causing the backup server to reference the data of the plurality of files in the plurality of containers based on the second file list.
In a third aspect of the present disclosure, there is provided a computer-readable storage medium that contains machine-executable instructions. The machine-executable instructions, when executed by a device, cause the device to perform the method described according to the first aspect of the present disclosure.
In a fourth aspect of the present disclosure, there is provided a computer program product. The computer program product is tangibly stored in a computer-readable storage medium and contains machine-executable instructions. The machine-executable instructions, when executed by a device, cause the device to perform the method described according to the first aspect of the present disclosure.
This summary is provided to introduce a selection of concepts in a simplified form, which will be further described in the detailed description below. The summary is neither intended to identify key features or essential features of the present disclosure, nor intended to limit the scope of the present disclosure.
The foregoing and other objectives, features, and advantages of the present disclosure will become more apparent from more detailed description of the example embodiments of the present disclosure in conjunction with the accompanying drawings. In the example embodiments of the present disclosure, like reference numerals usually represent like components.
In the accompanying drawings, like or corresponding numerals represent like or corresponding parts.
Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although the preferred embodiments of the present disclosure are shown in the accompanying drawings, it should be understood that the present disclosure can be implemented in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided to make the present disclosure more thorough and complete and to fully convey the scope of the present disclosure to those skilled in the art.
The term “include” and its variants as used herein indicate open-ended inclusion, i.e., “including, but not limited to.” Unless specifically stated, the term “or” indicates “and/or.” The term “based on” indicates “based at least in part on.” The terms “an example embodiment” and “an embodiment” indicate “at least one example embodiment.” The term “another embodiment” indicates “at least one additional embodiment.” The terms “first,” “second,” and the like may refer to different or identical objects. Other explicit and implicit definitions may also be included below.
As shown in
In addition, user 110 may initiate a request to create a snapview backup to backup client 120. The request may include a file list corresponding to the snapview backup to be created, and the file list may indicate one or more files to be referenced by the snapview backup to be created.
According to example file list 200 as shown in
File data at backup server 130 may be stored in units of containers. In this text, the term “container” refers to a data storage unit that stores corresponding file data of one or more files, which may exist in the form of a single file in a file system. By storing the file data of a plurality of files in the containers, the number of file system operations in data backup can be reduced. Corresponding file data in a backup file (e.g., backup file 131, 132, or 133) may be stored in one or more containers. Backup server 130 may create a snapview backup by referencing the corresponding file data in the containers according to a file list corresponding to the snapview backup.
In this text, the operation of referencing the file data in the containers is also referred to as a “synthesis operation” or “reference operation,” and “reference” and “synthesis” may be used interchangeably.
In a traditional solution, the efficiency of creating a snapview backup is usually relatively low. This is because users often do not know where files are stored in a backup server, so they may create file lists corresponding to snapview backups in arbitrary order, or based on some simple rule (such as alphabetical order). This causes the following problems.
First, it may cause the same container to be opened and closed repeatedly. In the traditional solution, a container must be opened before each synthesis operation and closed after that synthesis operation is completed. The reason is that the file lists corresponding to the snapview backups are usually created in arbitrary order, so data of the next file to be referenced may be located in another container. If the containers were not closed after each synthesis operation, many opened but unused containers could accumulate, and too many opened containers may cause problems with the backup server. Because the users often do not know which containers hold the backed-up files, the same container will be repeatedly opened and closed even if a plurality of files to be referenced are located in that container.
Second, the efficiency of a synthesis operation performed on a plurality of files with continuous positions in the containers is not high. The file lists corresponding to the snapview backups may include the plurality of files with continuous positions in the containers. These files may be referenced via a single synthesis operation. However, since the order of files in the file lists corresponding to the snapview backups may be arbitrary, the plurality of files with continuous positions in the containers may not be continuous in the file lists. In this case, in the traditional solution, synthesis operations can only be performed on the plurality of files one by one, resulting in the same container being repeatedly opened and closed multiple times, thereby making the efficiency of creating a snapview backup relatively low.
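The cost described in the two problems above can be illustrated with a small count. The following Python sketch is illustrative only: the function names and the sample container labels are hypothetical, and it assumes one open/close pair per synthesis operation in the traditional solution, as described above, versus one open/close pair per container when the files are grouped by container.

```python
def opens_in_list_order(container_of_each_file):
    """Traditional solution: one open/close pair per referenced file."""
    return len(container_of_each_file)


def opens_when_grouped(container_of_each_file):
    """Files grouped by container: one open/close pair per distinct container."""
    return len(set(container_of_each_file))


# Hypothetical file list in alphabetical order; each item names the
# container (an assumed label) that holds the corresponding file.
containers = ["c1", "c2", "c1", "c2", "c1"]
print(opens_in_list_order(containers))  # 5
print(opens_when_grouped(containers))   # 2
```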
Embodiments of the present disclosure provide a solution for creating a snapview backup, which may solve one or more of the foregoing problems and other potential problems. According to the solution, at a backup client, a first file list corresponding to a snapview backup to be created at a backup server is acquired. The first file list indicates a plurality of files to be referenced by the snapview backup, and data of the plurality of files is contained in a plurality of containers at the backup server. A second file list is generated by sorting the plurality of files indicated in the first file list according to the containers to which they belong. The second file list indicates that each of the plurality of containers contains data of at least one of the plurality of files. Then, the snapview backup is created by causing the backup server to reference the data of the plurality of files in the plurality of containers based on the second file list. In this way, the embodiments of the present disclosure can effectively reduce the number of file operations and message transfers in the process of creating a snapview backup, thereby improving the efficiency of creating the snapview backup.
At block 410, backup client 120 acquires a first file list corresponding to a snapview backup to be created. For example, the first file list may be input by user 110 to backup client 120. An example of the first file list may be example file list 200 as shown in
At block 420, a second file list is generated by backup client 120 by sorting the plurality of files indicated in the first file list according to the containers to which they belong, where the second file list indicates that each of the plurality of containers contains data of at least one of the plurality of files.
In some embodiments, in order to generate the second file list, backup client 120 may acquire, from backup server 130, information related to the plurality of files indicated in the first file list. The information may indicate the position of the container to which each of the plurality of files belongs, the offset of each of the plurality of files in the container to which it belongs, and the file size. Backup client 120 may generate an intermediate file list for creating the snapview backup based on the information and the first file list.
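As a minimal sketch of this step, the following Python code joins a first file list with location information of the kind described above. The names `IntermediateEntry` and `build_intermediate_list`, and the shape of the `locations` mapping standing in for the backup server's reply, are assumptions made for illustration and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class IntermediateEntry:
    file_path: str       # file named in the first file list
    container_path: str  # container that holds the file's data
    offset: int          # start position of the file in the container
    size: int            # file size


def build_intermediate_list(
    first_file_list: List[str],
    locations: Dict[str, Tuple[str, int, int]],
) -> List[IntermediateEntry]:
    """Attach (container path, offset, size) to every file, keeping list order."""
    entries = []
    for file_path in first_file_list:
        container_path, offset, size = locations[file_path]
        entries.append(IntermediateEntry(file_path, container_path, offset, size))
    return entries


# Hypothetical data standing in for the information returned by the server.
locations = {
    "/a.txt": ("/containers/c1", 0, 100),
    "/b.txt": ("/containers/c2", 0, 40),
    "/c.txt": ("/containers/c1", 100, 50),
}
intermediate = build_intermediate_list(["/a.txt", "/b.txt", "/c.txt"], locations)
```

The intermediate list keeps the order of the first file list; grouping by container happens in a later step.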
In some embodiments, backup client 120 may respectively create corresponding sub-file lists for a plurality of containers involved based on the intermediate file list. Each sub-file list may correspond to a container and indicate a group of files in the container to be referenced by the snapview backup. Backup client 120 may generate a second file list by combining these sub-file lists.
As shown in
Depending on a hash function used to calculate the hash values, the hash values corresponding to paths of different containers may be the same; that is, a “collision” occurs. In some embodiments, the problem of hash value collision may be solved by checking the complete hash values and/or container paths. For example, in order to add file A to the correct sub-list, after P[h1] is determined, the path of the container to which file A belongs may be compared with the paths of the containers corresponding to the one or more sub-lists pointed to by P[h1]. If the path of the container to which file A belongs matches the path of the container corresponding to a certain sub-list (for example, sub-list 621), the entry corresponding to file A is added to sub-list 621. In some embodiments, the added entry may record the offset (i.e., a start position) and the file size of file A in the container. In some embodiments, in order to improve the efficiency of referencing files with continuous positions subsequently, the entries corresponding to the files may be added in an ascending order according to the start positions of the files in the containers. That is, the entry corresponding to a file whose start position is near the front will be added to a position near the head of the list, and the entry corresponding to a file whose start position is near the back will be added to a position near the end of the list. Alternatively, if the path of the container to which file A belongs does not match the paths of the containers corresponding to all existing sub-lists, a new sub-list (for example, sub-list 622) may be created for the container to which file A belongs, and the entry corresponding to file A is added to new sub-list 622.
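The bucket lookup, collision check, and ordered insertion described above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the table size, the deliberately weak `bucket_index` hash (chosen so that collisions are easy to provoke), and the names `SubList`, `add_file`, and `build_second_file_list` are all hypothetical, and Python lists stand in for the linked lists.

```python
import bisect

NUM_BUCKETS = 8  # assumed table size


def bucket_index(container_path, num_buckets=NUM_BUCKETS):
    # Weak illustrative hash: paths with the same bytes in a different
    # order collide, which exercises the path comparison below.
    return sum(container_path.encode("utf-8")) % num_buckets


class SubList:
    """Entries of one container, kept in ascending order of start position."""

    def __init__(self, container_path):
        self.container_path = container_path
        self.entries = []  # (offset, size) tuples


def add_file(table, container_path, offset, size):
    bucket = table[bucket_index(container_path, len(table))]
    for sub in bucket:
        # Resolve hash collisions by comparing the full container path.
        if sub.container_path == container_path:
            break
    else:
        # No sub-list for this container yet: create one.
        sub = SubList(container_path)
        bucket.append(sub)
    bisect.insort(sub.entries, (offset, size))  # keep ascending offsets
    return sub


def build_second_file_list(intermediate_entries, num_buckets=NUM_BUCKETS):
    """intermediate_entries: iterable of (container_path, offset, size)."""
    table = [[] for _ in range(num_buckets)]
    for container_path, offset, size in intermediate_entries:
        add_file(table, container_path, offset, size)
    # Combine the per-container sub-lists into one list.
    return [sub for bucket in table for sub in bucket]


# "ab" and "ba" collide under the weak hash but end up in separate sub-lists.
second = build_second_file_list([("ab", 100, 50), ("ba", 0, 10), ("ab", 0, 100)])
```

Keeping each sub-list sorted by offset at insertion time means that files with continuous positions end up adjacent, which a later merging step can exploit.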
In this way, backup client 120 can generate second file list 620 based on intermediate file list 500. Second file list 620 indicates a group of files contained in each of the plurality of containers involved in the snapview backup to be created. It should be understood that second file list 620, sub-file lists 621 and 622 therein, etc., may be implemented using a linked list or any other suitable data structure, and the scope of the present disclosure is not limited in this regard.
Referring back to
As shown in
At block 720, backup client 120 sends at least one operation request to backup server 130 to cause backup server 130 to reference the data of a group of files in the container through at least one synthesis operation.
In some embodiments, for files with discontinuous positions in the group of files, backup client 120 may send an operation request for the files to backup server 130 so as to cause backup server 130 to reference data of the files in the container through the synthesis operation. Additionally or alternatively, for a plurality of files with continuous positions in the group of files, backup client 120 may send an operation request for the plurality of files to backup server 130 to cause backup server 130 to reference data of the plurality of files in the container through a single synthesis operation.
As shown in
At block 804, backup client 120 determines whether the entry is the last entry in the sub-file list. If yes, at block 811, backup client 120 sends an operation request to backup server 130 to reference data from B1 to E1 in the container. If not, at block 805, backup client 120 reads the next entry in the sub-file list.
At block 806, backup client 120 acquires, from the next entry, start position B2 and file size S2 of the next file in the container recorded by the next entry. At block 807, backup client 120 determines whether start position B2 of the next file is equal to end position E1 of the current file. If yes, at block 808, backup client 120 updates E1 to E1=E1+S2, and then method 800 proceeds to block 804. If not, at block 809, backup client 120 sends an operation request to backup server 130 to reference data in the container from B1 to E1.
At block 810, backup client 120 updates B1 with B2, that is, B1=B2, and updates E1 to E1=B1+S2. Method 800 then returns to block 804, so that the updated range is still sent if the entry just read is the last entry in the sub-file list.
In this way, when there are a plurality of files with continuous positions in the container in the sub-file list, backup client 120 can determine the start positions and the end positions of the plurality of files in the container, and send the operation request to backup server 130 to reference, through a single synthesis operation, the data in the container from the start positions to the end positions that are determined, thereby completing the reference to the file data of the plurality of files. Merging the synthesis operations for the plurality of files with continuous positions can effectively reduce the number of times the container is repeatedly opened and closed, and at the same time, reduce the number of message transfers between backup client 120 and backup server 130, thereby significantly improving the performance of creating the snapview backup.
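The merging of continuous ranges described above can be sketched in Python as follows. This is a minimal illustration, assuming each sub-file list entry is a `(start, size)` pair already sorted by start position; the function name `coalesce_ranges` is hypothetical, and the returned list of `(begin, end)` pairs stands in for the operation requests that would be sent to the backup server.

```python
def coalesce_ranges(entries):
    """Merge entries with continuous positions into single (begin, end) ranges.

    entries: list of (start, size) tuples in ascending order of start.
    Each returned range corresponds to one synthesis request.
    """
    requests = []
    if not entries:
        return requests

    begin, size = entries[0]      # B1 and S1 of the first entry
    end = begin + size            # E1 = B1 + S1
    for next_begin, next_size in entries[1:]:
        if next_begin == end:
            # Next file starts where the current range ends: extend E1.
            end += next_size
        else:
            # Gap found: emit the current range, then start a new one.
            requests.append((begin, end))
            begin, end = next_begin, next_begin + next_size
    # Last entry reached: emit the final range.
    requests.append((begin, end))
    return requests


# Five files in one container collapse into three requests.
ranges = coalesce_ranges([(0, 100), (100, 50), (300, 20), (320, 10), (500, 5)])
print(ranges)  # [(0, 150), (300, 330), (500, 505)]
```

In this sample, five per-file requests are reduced to three, and the container only needs to be opened once for the whole sub-file list.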
Referring back to
A plurality of components in device 900 are connected to I/O interface 905, including: input unit 906, such as a keyboard and a mouse; output unit 907, such as various types of displays and speakers; storage unit 908, such as a magnetic disk and an optical disk; and communication unit 909, such as a network card, a modem, and a wireless communication transceiver. Communication unit 909 allows device 900 to exchange information/data with other devices over a computer network such as the Internet and/or various telecommunication networks.
The various processes and processing described above, such as methods 400, 700, and/or 800, may be performed by processing unit 901. For example, in some embodiments, methods 400, 700, and/or 800 may be implemented as a computer software program that is tangibly contained in a machine-readable medium, such as storage unit 908. In some embodiments, part or all of the computer program may be loaded and/or installed on device 900 via ROM 902 and/or communication unit 909. When the computer program is loaded onto RAM 903 and executed by CPU 901, one or more actions of methods 400, 700, and/or 800 described above may be performed.
The present disclosure may be a method, an apparatus, a system, and/or a computer program product. The computer program product may include a computer-readable storage medium on which computer-readable program instructions for performing various aspects of the present disclosure are loaded.
The computer-readable storage medium may be a tangible device that may retain and store instructions for use by an instruction execution device. For example, the computer-readable storage medium may be, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the above. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disk read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanical encoding device such as a punch card or a raised structure in a groove with instructions stored thereon, and any suitable combination of the foregoing. The computer-readable storage medium, as used herein, is not to be construed as transient signals themselves, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses propagating through a fiber-optic cable), or electrical signals transmitted through an electrical wire.
The computer-readable program instructions described here can be downloaded from the computer-readable storage medium to various computing/processing devices, or downloaded to an external computer or an external storage device over a network such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in each computing/processing device.
Computer program instructions for performing the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages. The programming languages include object-oriented programming languages, such as Smalltalk and C++, and conventional procedural programming languages, such as “C” language or similar programming languages. The computer-readable program instructions may be executed entirely on a user computer, partly on a user computer, as a standalone software package, partly on a user computer and partly on a remote computer, or entirely on a remote computer or a server. In the case where a remote computer is involved, the remote computer can be connected to a user computer over any kind of networks, including a local area network (LAN) or a wide area network (WAN), or can be connected to an external computer (e.g., connected over the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), is customized by utilizing state information of the computer-readable program instructions. The electronic circuit may execute the computer-readable program instructions so as to implement various aspects of the present disclosure.
Various aspects of the present disclosure are described here with reference to flowcharts and/or block diagrams of the method, the apparatus (the system), and the computer program product according to the embodiments of the present disclosure. It should be understood that each block in the flowcharts and/or block diagrams as well as a combination of blocks in the flowcharts and/or block diagrams may be implemented by using the computer-readable program instructions.
The computer-readable program instructions may be provided to a processing unit of a general purpose computer, a special purpose computer, or other programmable data processing apparatuses to produce a machine, such that the instructions, when executed by the processing unit of the computer or other programmable data processing apparatuses, generate an apparatus for implementing the functions/actions specified in one or more blocks in the flowcharts and/or block diagrams. The computer-readable program instructions may also be stored in a computer-readable storage medium, to cause a computer, a programmable data processing apparatus, and/or other devices to work in a specific manner, such that the computer-readable medium storing the instructions includes an article of manufacture that contains instructions for implementing various aspects of the functions/actions specified in one or more blocks in the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatuses, or other devices, so that a series of operating steps are performed on the computer, other programmable data processing apparatuses, or other devices to produce a computer-implemented process. As such, the instructions executed on the computer, other programmable data processing apparatuses, or other devices implement the functions/actions specified in one or more blocks in the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the accompanying drawings show the architectures, functionalities, and operations of possible implementations of the system, the method and the computer program product according to a plurality of embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or part of instructions that contains one or more executable instructions for implementing specified logical functions. In some alternative implementations, the functions marked in the blocks may also occur in an order different from that marked in the accompanying drawings. For example, two successive blocks may actually be performed basically concurrently, and they may sometimes also be performed in a reverse order, depending on the functions involved. It should be further noted that each block in the block diagrams and/or flowcharts as well as a combination of the blocks in the block diagrams and/or flowcharts may be implemented by using a dedicated hardware-based system for executing specified functions or actions or by a combination of dedicated hardware and computer instructions.
Various embodiments of the present disclosure have been described above. The foregoing description is illustrative rather than exhaustive, and is not limited to the disclosed embodiments. Numerous modifications and alterations are apparent to those of ordinary skill in the art without departing from the scope and spirit of the various illustrated embodiments. The selection of terms used herein is intended to best explain the principles and practical applications of the embodiments or the improvements to technologies on the market, or to enable other persons of ordinary skill in the art to understand the embodiments disclosed herein.
| Number | Date | Country | Kind |
|---|---|---|---|
| 202010449309.4 | May 2020 | CN | national |