Metadata for a file stored in a file system contains information describing the data contained in the file. The metadata may contain the file's unique identifier, among other attributes associated with the file. If the file is replicated to a different file system, the metadata may be replicated as well.
Certain examples are described in the following detailed description and in reference to the drawings, in which:
The present disclosure is generally related to replicating metadata. When a file located in a source file system is replicated to a target file system, the metadata associated with the file can be replicated as well. However, custom metadata that a user associates with the file may not be automatically replicated, as the custom metadata may be external to the file, and may reside in a database. One method to replicate the metadata is to manually run a script to export the metadata from the source file system's express query database, and import the metadata to the target file system's database, where it is associated with the path name of the replicated file. However, this method can be prone to errors. For example, a change in the path name of the replicated file can result in an invalid association between the replicated file and the metadata.
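The fragility of a path-based association can be illustrated with a minimal sketch. The function name `dangling_paths` and the example paths are hypothetical and are not part of the disclosure; the sketch only assumes that exported metadata is keyed by path name.

```python
from typing import Dict, List, Set


def dangling_paths(exported: Dict[str, Dict[str, str]],
                   target_paths: Set[str]) -> List[str]:
    """Return exported path names that match no file in the target file system."""
    return sorted(path for path in exported if path not in target_paths)


# Hypothetical export: custom metadata keyed by the path name of the file.
exported = {"/projects/report.txt": {"color": "red"}}

# While the replicated file keeps its path, the association is valid.
before_rename = dangling_paths(exported, {"/projects/report.txt"})

# Once the replicated file is renamed or moved, the metadata is orphaned.
after_rename = dangling_paths(exported, {"/archive/report.txt"})
```

In this sketch the metadata itself is unchanged, yet it can no longer be reached through the replicated file after the path changes.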
Described herein is a method to automatically associate metadata with a replicated file in a target file system following file replication. An original file in a source file system can have its metadata associated with a unique identifier of the file. When the original file is replicated to a target file system, the metadata associated with the unique identifier of the file can be replicated as well. In the target file system, a unique identifier of the replicated file can be mapped to the unique identifier of the original file, such that the metadata is then associated with the unique identifier of the replicated file. In this way, the metadata replication and association can be performed automatically without user intervention. The metadata association can also be unaffected by changes or errors in the path name of the replicated file. Furthermore, the replicated metadata can be stored in a scalable pipelined database. The pipelined database may use a mechanism of lazy ingestion of file system events. The metadata associated with the replicated file may be stored in a query-able authority table in the pipelined database.
The processor 102 can be a single core processor, a multi-core processor, a computing cluster, or any number of other appropriate configurations.
The processor 102 may be connected through a system bus 104 (e.g., AMBA®, PCI®, PCI Express®, HyperTransport®, Serial ATA, among others) to an input/output (I/O) device interface 106 adapted to connect the computing system 100 to one or more I/O devices 108. The I/O devices 108 may include, for example, a keyboard and a pointing device, wherein the pointing device may include a touchpad or a touchscreen, among others. The I/O devices 108 may be built-in components of the computing system 100, or may be devices that are externally connected to the computing system 100.
The processor 102 may also be linked through the system bus 104 to a display device interface 110 adapted to connect the computing system 100 to display devices 112. The display devices 112 may include a display screen that is a built-in component of the computing system 100. The display devices 112 may also include computer monitors, televisions, or projectors, among others, that are externally connected to the computing system 100.
The processor 102 may also be linked through the system bus 104 to a memory device 114. In some examples, the memory device 114 can include random access memory (e.g., SRAM, DRAM, eDRAM, EDO RAM, DDR RAM, RRAM®, PRAM, among others), read only memory (e.g., Mask ROM, EPROM, EEPROM, among others), non-volatile memory (e.g., PCM, STT-MRAM, ReRAM, Memristor, among others), or any other suitable memory systems.
The processor 102 may also be linked through the system bus 104 to a storage device 116. The storage device 116 may contain one or more files 118 in a file system. The file 118 may be a document, application, media, or any other virtual item that can be stored. The storage device may also contain metadata 120, which provides information regarding the file 118. Such information may include time of file creation, ownership of the file, and file access permissions. In some examples, the metadata 120 may be custom metadata containing information that a user has manually associated with the file 118. A replication module 122 in the storage device can include instructions to direct the processor 102 to replicate the file 118 from a source location in the storage device 116 to a target location. The target location may be in a second storage device inside the computing system 100, or in an external device coupled to the computing system 100 via wired or wireless means. For example, an external storage device 124 may be linked to the system bus 104 via a communications port 126. The replication module 122 can also replicate the metadata 120 to the target location. The replication module 122 can map the replicated file to the original file 118, such that the replicated file is associated with the metadata.
The first file 202a can include a unique identifier and can be associated with metadata. The metadata can contain at least one key and value pair. The key is the name of a metadata element, while the value is the information contained in the metadata element. In one example, the metadata may be custom metadata describing a color of the first file 202a. The key of the custom metadata may read “color”, while the value of the custom metadata may read “red”. The unique identifier and the metadata can be stored in a first database 208 of the source file system 204. The unique identifier and metadata may be associated with one another and stored in a table of the first database 208. The first file 202a can also include an extended attribute 210, which contains the unique identifier and a timestamp of the metadata. The timestamp can refer to when the metadata was created or last modified.
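The relationship between the unique identifier, the key and value pairs, and the extended attribute can be sketched as a small data structure. The class names `ExtendedAttribute` and `MetadataTable`, the identifier string "id-202a", and the use of wall-clock time for the timestamp are assumptions made for illustration only.

```python
import time
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class ExtendedAttribute:
    """Hypothetical stand-in for the extended attribute 210."""
    unique_id: str
    metadata_timestamp: float  # when the metadata was created or last modified


@dataclass
class MetadataTable:
    """Hypothetical stand-in for a table in the first database 208."""
    rows: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def set(self, unique_id: str, key: str, value: str) -> ExtendedAttribute:
        # Associate the key and value pair with the file's unique identifier.
        self.rows.setdefault(unique_id, {})[key] = value
        # Return the attribute that would be stored on the file itself.
        return ExtendedAttribute(unique_id, time.time())

    def get(self, unique_id: str) -> Dict[str, str]:
        # Return a copy so callers cannot modify the table by accident.
        return dict(self.rows.get(unique_id, {}))


table = MetadataTable()
xattr = table.set("id-202a", "color", "red")
stored = table.get(xattr.unique_id)
```

Here the metadata is looked up through the identifier carried in the extended attribute rather than through a path name.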
The first file 202a can be replicated to produce the identical second file 202b to be stored in the target file system 206. The second file 202b can use a different unique identifier from the first file 202a. The extended attribute 210, which contains the unique identifier of the first file 202a and the timestamp of the metadata, can be replicated to the target file system 206 as well. Furthermore, the table in the first database 208 can also be replicated to a second database 212 in the target file system 206.
The unique identifier of the second file 202b can be mapped to the unique identifier of the first file 202a in a temporary table in the second database 212. As a result, the unique identifier of the second file 202b becomes associated with the metadata. Thus, the metadata can correspond to both the first file 202a and the second file 202b. The process of associating the second file 202b with the metadata can be done automatically in response to replication of the first file 202a. The second database 212 can be a pipelined database wherein the association between the metadata and the second file 202b can be stored in a query-able table.
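One possible way to express this mapping is with a relational join. The sketch below uses an in-memory SQLite database as a stand-in for the second database 212; the table names `replicated_metadata` and `id_map`, the column names, and the identifier strings are hypothetical, and the disclosure does not specify a schema or a database engine.

```python
import sqlite3

# An in-memory database stands in for the second database 212 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE replicated_metadata (
        source_id  TEXT,
        meta_key   TEXT,
        meta_value TEXT
    );
    CREATE TEMP TABLE id_map (
        target_id TEXT PRIMARY KEY,
        source_id TEXT
    );
""")

# Metadata replicated from the source, still keyed by the first file's identifier.
conn.execute("INSERT INTO replicated_metadata VALUES (?, ?, ?)",
             ("id-202a", "color", "red"))

# Map the second file's identifier to the first file's identifier.
conn.execute("INSERT INTO id_map VALUES (?, ?)", ("id-202b", "id-202a"))


def metadata_for(connection, target_id):
    """Resolve metadata for a replicated file through the identifier mapping."""
    rows = connection.execute(
        "SELECT m.meta_key, m.meta_value "
        "FROM id_map AS i "
        "JOIN replicated_metadata AS m ON m.source_id = i.source_id "
        "WHERE i.target_id = ?",
        (target_id,),
    ).fetchall()
    return dict(rows)


result = metadata_for(conn, "id-202b")
```

Because the lookup goes through identifiers only, the query does not depend on where the second file is located in the target file system.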
At block 302, the processor accesses a first file with a first unique identifier at a source location in a storage device. Metadata corresponding to the first file can be stored in a first database with the first unique identifier, such that the first unique identifier is associated with the metadata. The first file may include an extended attribute that contains the first unique identifier and a timestamp corresponding to the metadata.
At block 304, the processor replicates the first file to produce a second file at a target location. The target location may be in a second storage device, either contained in the computing system or coupled externally. The extended attribute of the first file can be replicated to the target location as well. The second file can have a second unique identifier.
At block 306, the processor replicates the metadata and the first unique identifier to a second database. The second database may be at the target location. The metadata and the first unique identifier may be associated together in a temporary table in the second database.
At block 308, the processor maps the second unique identifier to the first unique identifier in the second database. As a result, the second unique identifier is associated with the metadata corresponding to the first file.
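Blocks 302 through 308 can be sketched end to end with an in-memory model. The `FileSystem` class, the `replicate` and `lookup` functions, the use of `uuid` to generate the second unique identifier, and the placeholder timestamp are all assumptions for illustration; they are not the implementation described in the disclosure.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FileSystem:
    """Hypothetical in-memory model of a file system and its database."""
    files: Dict[str, bytes] = field(default_factory=dict)     # unique id -> content
    xattrs: Dict[str, dict] = field(default_factory=dict)     # unique id -> extended attribute
    metadata: Dict[str, Dict[str, str]] = field(default_factory=dict)  # unique id -> key/value pairs
    id_map: Dict[str, str] = field(default_factory=dict)      # second id -> first id


def replicate(source: FileSystem, target: FileSystem, first_id: str) -> str:
    # Block 302: access the first file and its extended attribute.
    content = source.files[first_id]
    xattr = source.xattrs[first_id]

    # Block 304: replicate the file and its extended attribute; the
    # second file receives its own unique identifier.
    second_id = uuid.uuid4().hex
    target.files[second_id] = content
    target.xattrs[second_id] = dict(xattr)

    # Block 306: replicate the metadata, still keyed by the first identifier.
    target.metadata[first_id] = dict(source.metadata[first_id])

    # Block 308: map the second identifier to the first identifier.
    target.id_map[second_id] = xattr["unique_id"]
    return second_id


def lookup(fs: FileSystem, file_id: str) -> Dict[str, str]:
    """Resolve metadata directly, or through the identifier mapping."""
    return fs.metadata.get(fs.id_map.get(file_id, file_id), {})


source = FileSystem()
source.files["id-1"] = b"hello"
source.xattrs["id-1"] = {"unique_id": "id-1", "timestamp": 0.0}  # placeholder timestamp
source.metadata["id-1"] = {"color": "red"}

target = FileSystem()
second_id = replicate(source, target, "id-1")
replica_metadata = lookup(target, second_id)
```

No path name appears anywhere in the sketch, which reflects the point that the association is carried by unique identifiers and is therefore unaffected by path changes at the target location.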
As shown in
The block diagram of
While the present techniques may be susceptible to various modifications and alternative forms, the examples discussed above have been shown only by way of example. It is to be understood that the technique is not intended to be limited to the particular examples disclosed herein. Indeed, the present techniques include all alternatives, modifications, and equivalents falling within the true spirit and scope of the appended claims.
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/US2013/073591 | 12/6/2013 | WO | 00 |