In a test environment, the performance of an application as it accesses a file system may be evaluated. For example, software applications such as Information Management (IM) products may be used for a wide range of services that back up and archive large amounts of customer data, such as data protection, archiving, and records management. Because the amount of data accessed by the IM products is large, the scalability of the IM products is tested.
The customer data may be protected, stored in a suitable location, and restored using the IM products. Each customer data set introduces a unique file system schematic to the IM product, as each data set is different. For example, one file system may have a large number of small files in a single directory or at a mount point, while another file system has a small number of large files that collectively contain a large amount of data. Additionally, the file systems may include extensive nesting levels for the directories and files.
For testing purposes, the file systems are recreated as they exist at customer sites. The time spent creating different combinations of the file system content and structure may range from a few days to several weeks. In many cases, a large amount of storage space is used for maintaining such varied file systems for testing, resulting in high power costs. Additionally, verification of data that has been manipulated in a testing scenario may involve the time-consuming process of generating checksums for the entire data set. Limited hardware resources can also cause scheduling delays when the hardware storage space is not available to replicate a file system for testing purposes.
Certain embodiments are described in the following detailed description and in reference to the drawings, in which:
As discussed above, Information Management (IM) products may be used for a wide range of applications, such as data protection, archiving, and records management. Typical operations carried out by IM products include enumerating files on a file system, reading file data and attributes, and sending the file system data to a data store. Subsequent operations on the file system may lead to changes in file attributes, which are file contents that are tracked by these applications. As described herein, an application is a set of instructions implemented by a computer. Further, as used herein, a file system is an organized collection of data that may be stored in memory or generated in response to applications attempting to access file system data.
Embodiments described herein provide for the implementation of a scalable test environment without extensive use of dedicated storage devices. In various examples, the scalable test environment may be used to test the performance of software applications, such as IM products. For example, the scalable test environment may be used to determine whether a particular IM product is capable of dealing with a very large, complex set of data prior to the release of the IM product into the market.
In various examples, a file system environment creator may be used to generate the content of a file system within the scalable test environment. The content of the file system may be generated in response to a system call from a particular application, such as an IM application. The file system environment creator may intercept the system call from the application to the file system, and may generate a model of the content of the file system based on the system call and the structure of the file system. In other words, the file system environment creator may allow the testing of a software application, such as an IM product, without the use of a dedicated storage space for the file system used to test the capabilities of the IM product.
A human machine interface 108 may be adapted to connect the computing system 100 through the bus 106 to user-interface devices 110. The user-interface devices 110 may include, for example, a keyboard and a pointing device, such as a mouse, trackball, touchpad, joystick, pointing stick, stylus, or touch screen, among others. The computing system 100 may also be linked through the bus 106 to a display interface 112 adapted to connect the computing system 100 to a display device 114, wherein the display device 114 may include a computer monitor, camera, television, projector, or mobile device, among others.
A network interface controller (NIC) 116 may be adapted to connect the computing system 100 through the bus 106 to a network 118. Through the network 118, electronic data 120 may be downloaded and stored within the memory device 104. Further, through the network 118, web-based applications 122 may be downloaded and stored within the memory device 104, or may be accessed through a Web browser. In examples, an IM product may be a web-based application 122 or can be stored within the memory device 104.
The memory device 104 can include random access memory (RAM), read only memory (ROM), flash memory, or any other suitable memory systems. The memory device 104 can also include, or be communicably coupled to, a storage device (not shown). The storage device may include a hard drive, an optical drive, a thumb drive, an array of drives, or any combinations thereof. The memory device 104 may be adapted to store instructions that are executable by the processor 102. These instructions implement a method that may include modeling a structure of a file system, intercepting a system call from an application to the file system, and generating the content of the file system based on the structure of the file system and the system call.
The memory device 104 can also include components for implementing these instructions, including a file system environment creator 124 located within a user space 126 and a file system 128 located within a kernel space 130. The user space 126 and the kernel space 130 may be memory components within the memory device 104 that are in operative communication with one another. The file system 128 maintains and organizes the structure and content of files within the memory device 104 of the computing system 100. In various examples, the file system 128 is backed up using an IM application. “Backing up” the file system includes copying the data contained within the file system to another location. Further, in examples, the file system environment creator 124 may dynamically model the state of the file system 128, as specified by system calls from applications, without the use of a dedicated storage space within the computing system 100. The state of the file system is the corresponding structure and content of the file system. By modeling the state of the file system 128, the file system environment creator 124 can respond to various system calls to the file system 128.
The file system 200 may include a root directory 202. The root directory 202 is the top-most directory within the file system 200, and may be the starting point from which all other directories within the file system 200 originate. A second directory 204 is located within the root directory 202. The second directory 204 contains files 206. In examples, the second directory 204 contains a small number of files 206, on the order of a few hundred files. In other examples, the second directory 204 contains a large number of files 206, on the order of a billion files.
A third directory 208 is also located within the root directory 202. In some examples, the third directory 208 is a sub-directory located within the second directory 204, as shown in
The application 302, a configuration file 304, and a data model file 306 may be located within separate mount points in the user space 126, and may be in operative communication with the file system environment creator 124. A mount point may be a root location of the file system. In examples, the configuration file 304 and the data model file 306 may be extensible markup language (XML) files that can be created or edited by any suitable text editor. The configuration file 304 and the data model file 306 may be used by the file system environment creator 124 to model the structure of the file system 200. The structure of the file system 200 may include a number of directories within the file system, a number of files in each directory of the file system, or a depth of the file system. In examples, the structure of the file system 200 may be determined from a hardware configuration of the computing system 100.
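As an illustration of how such a configuration file might drive the structure model, the following minimal Python sketch parses a hypothetical XML layout. The element names (`directories_per_level`, `files_per_directory`, `depth`), the `Structure` class, and the `load_structure` function are assumptions made for this example only; the description does not specify a schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical configuration layout; the element names are assumptions,
# not a schema taken from the description.
CONFIG_XML = """
<filesystem>
  <directories_per_level>3</directories_per_level>
  <files_per_directory>100</files_per_directory>
  <depth>2</depth>
</filesystem>
"""


@dataclass(frozen=True)
class Structure:
    directories_per_level: int  # sub-directories inside each directory
    files_per_directory: int    # files inside each directory
    depth: int                  # nesting levels below the root

    def total_directories(self) -> int:
        # The root (level 0) plus every directory at each nesting level.
        return sum(self.directories_per_level ** level
                   for level in range(self.depth + 1))

    def total_files(self) -> int:
        return self.total_directories() * self.files_per_directory


def load_structure(xml_text: str) -> Structure:
    """Build the structure model from the XML text alone; no files are created."""
    root = ET.fromstring(xml_text.strip())
    return Structure(
        directories_per_level=int(root.findtext("directories_per_level")),
        files_per_directory=int(root.findtext("files_per_directory")),
        depth=int(root.findtext("depth")),
    )
```

In this sketch the whole structure is described by three integers, so scaling the modeled file system up or down would only require editing the XML text.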
In examples, the data model file 306 may be read by the file system environment creator 124 in order to determine the various types of data objects that constitute the application 302, such as, for example, directories, files, and symbolic links, as well as their attributes. The configuration file 304 may provide parameters relating to the manner in which the application 302 organizes data, such as, for example, the number of directories, the number of files per directory, or the depth of the file system, among others. The data model file 306 may also specify a nomenclature mechanism that is used to name data objects within the data model file 306. These data object names may be used by the application 302 to refer to various data objects within the instance of the application 302. In some examples, a system call from the application 302 may refer to specific data object names. The file system environment creator 124 may then be used to parse the corresponding data objects in order to determine the attributes of such data objects.
The file system 200 may be located within the kernel space 130 of the computing system 100, which is in operative communication with the user space 126, as indicated by arrow 308. In examples, when the application 302 sends a system call to the file system 200, as indicated by arrow 310, the file system environment creator 124 intercepts the system call, as indicated by arrow 312. The file system environment creator 124 may then generate the content of the file system 200 within the user space 126. In this manner, the content of the file system 200 may be dynamically generated on the fly by the file system environment creator 124 in response to the system call from the application 302. As a result, the content of the file system 200 is not stored in the user space 126 or the kernel space 130.
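One way to picture content that is generated on the fly rather than stored is a deterministic function of the file path and byte offset. The sketch below is an assumption for illustration: the hash-based scheme, the `seed` parameter, and the name `generate_bytes` do not come from the description, which leaves the generation mechanism open.

```python
import hashlib

BLOCK_SIZE = 32  # bytes produced by one SHA-256 digest


def generate_bytes(path: str, offset: int, length: int, seed: str = "demo") -> bytes:
    """Return `length` bytes of synthetic content for `path`, starting at `offset`.

    The same (seed, path, offset, length) always yields the same bytes, so
    nothing has to be kept in storage between calls.
    """
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    out = bytearray()
    block, skip = divmod(offset, BLOCK_SIZE)
    # Produce whole blocks until the requested range is covered.
    while len(out) < skip + length:
        out.extend(hashlib.sha256(f"{seed}:{path}:{block}".encode()).digest())
        block += 1
    return bytes(out[skip:skip + length])
```

Because any byte range can be recomputed independently, a read at an arbitrary offset returns the same bytes as the matching slice of a full read.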
The system call from the application 302 may include, for example, a read operation or a write operation, among others. The system call may be used to delete, rename, or recreate certain files at a particular point in time, change file attributes, or change a few blocks of certain files at a particular point in time. The system call from the application 302 may also be used to back up or restore data, or to verify backed-up or restored data.
In a scenario where the application 302 is used to back up or restore data, the application may be tested as follows. The file system environment creator 124 may access configuration files 304 and data model files 306 in order to model the structure of the file system 200 used by the application 302. When the application 302 issues a system call to back up or restore data, the file system environment creator 124 may intercept the system call. The file system environment creator 124 may then generate the content of the file system 200 based on the system call, as well as the configuration and data model files. Thus, the application 302 will access content generated dynamically by the file system environment creator 124. In examples, the file system environment creator 124 may verify the content generated in response to the system call by referring to the configuration files 304 and the data model files 306. For example, in the case of a restore operation, the file system environment creator 124 may ensure that the content written by the application 302 is the same as the content that was generated dynamically by the file system environment creator 124.
The method begins at block 402 with the modeling of the structure of the file system used by the application within the file system environment creator. This may be accomplished through the use of a configuration file, such as an XML configuration file, and a data model file, such as an XML data model file. The configuration file is a file that contains policies applicable to a file system at a particular instance in time, and may contain information regarding how the application organizes data within the file system, such as the number of directories, the number of files per directory, and the depth of the file system. Thus, the policies define how a file system or database is structured. In various examples, modeling the structure of the file system includes determining the number of directories within the file system, the number of files within the file system, the number of files in each directory of the file system, or the depth of the file system, or any combinations thereof. In some examples, the structure of the file system may also be determined by a hardware configuration of the computing environment.
The data model file is a file that contains policies regarding the types of objects accessed by the application. The data model file can specify a nomenclature mechanism that is used to name objects accessed by the application. Further, the policies in the data model file may apply to the content of the file system, such as restrictions on the values of the generated content.
At block 404, a system call from the application to the file system may be intercepted at the file system environment creator. By intercepting the system call, the file system environment creator may identify the object associated with the system call by name according to the data model file. Data associated with the object may be generated as specified in the configuration file. In examples, the requested data associated with the object is returned to the application without the use of a dedicated storage space for the generated data. This may allow for the performance of tests and the generation of the content and the structure of the file system dynamically. As discussed above, the system call from the application may relate to read operations, write operations, backup operations, or restoration operations, among others. Further, the system call from the application may be used to perform any of a number of information management procedures within the computing environment.
In various examples, the application may attempt to traverse the file system directly by reading the contents of the root mount point, and opening and reading the contents of each directory through “open directory” or “read directory” system calls. However, the file system environment creator may intercept the system calls through a hooking procedure. For example, when a directory is opened to read a list of files or sub-directories, hooks attached to the Open Directory or Read Directory system calls cause them to be sent to the file system environment creator, allowing the application to operate within the scalable test environment.
At block 406, the content of the file system may be generated within the file system environment creator based on the system call and the structure of the file system. This may be accomplished by determining, for each file within the file system, a size of the file, a file permission associated with the file, and data within the file. In this manner, an arbitrarily sized file system is generated. In various examples, the content of the file system that is generated may include modified content of the file system, as specified by the system call from the application. For example, when the intercepted system call includes instructions to write data to a file X and then return the entire content of file X, the content generated by the file system environment creator includes the entire content of file X, as modified by the data written to file X. Such modified content is dynamically generated within the file system environment creator without resulting in the modification of the content of the file system itself. In other words, the content of the file system may be generated on the fly without using data from a dedicated storage location or device. In another example, the data or metadata associated with specific content within the file system, such as the size of a file, the content of the data in the file, or the permissions on the file, may be generated by the file system environment creator. This may allow for the testing of multiple configurations without investment in hardware. Further, the amount of power consumed for the testing process may be significantly reduced.
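The file X example can be sketched as generated base content with the written ranges layered on top at read time. The following Python sketch is an assumption made for illustration: `base_content`, `ModeledFile`, and the in-memory list of writes are hypothetical, and the description does not state how written data is tracked.

```python
import hashlib


def base_content(path: str, size: int) -> bytes:
    """Deterministic stand-in for generated file data (assumed hash-based scheme)."""
    out = bytearray()
    counter = 0
    while len(out) < size:
        out.extend(hashlib.sha256(f"{path}:{counter}".encode()).digest())
        counter += 1
    return bytes(out[:size])


class ModeledFile:
    """A file whose content is generated on read, with writes kept as small deltas."""

    def __init__(self, path: str, size: int):
        self.path = path
        self.base_size = size  # size of the generated, unmodified content
        self.size = size       # current size, which grows if a write extends the file
        self.writes = []       # (offset, data) pairs, in the order they were issued

    def write(self, offset: int, data: bytes) -> None:
        self.writes.append((offset, bytes(data)))
        self.size = max(self.size, offset + len(data))

    def read_all(self) -> bytes:
        # Regenerate the base content, zero-fill any extension, then replay writes.
        content = bytearray(base_content(self.path, self.base_size))
        content.extend(bytes(self.size - self.base_size))
        for offset, data in self.writes:  # later writes win
            content[offset:offset + len(data)] = data
        return bytes(content)
```

In this sketch only the written bytes are held in memory; the unmodified remainder of the file is recomputed on every read, and the base content itself never changes.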
In various examples, after the Open Directory or Read Directory system calls are sent to the file system environment creator at block 404, the file system environment creator may analyze applicable policies located within the configuration file, such as, for example, an XML configuration file. The file system environment creator may then generate the names of the files and directories dynamically.
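Dynamic name generation of this kind might look like the following Python sketch, in which directory entries are computed lazily from policy values rather than read from storage. The naming patterns (`dir_<level>_<index>`, `file_<index>.dat`) and the parameter names are assumptions for the example; the description does not define the nomenclature mechanism.

```python
from typing import Iterator, Tuple


def list_directory(depth: int, max_depth: int,
                   dirs_per_dir: int, files_per_dir: int) -> Iterator[Tuple[str, str]]:
    """Yield (kind, name) entries for one directory, computed from policy values only.

    `depth` is the nesting level of the directory being listed (0 for the root).
    """
    if depth < max_depth:
        # Directories above the deepest level contain sub-directories.
        for i in range(dirs_per_dir):
            yield ("dir", f"dir_{depth + 1}_{i:03d}")
    for i in range(files_per_dir):
        yield ("file", f"file_{i:06d}.dat")
```

Because the entries are yielded one at a time, a directory modeled with a very large number of files can be enumerated without materializing the full list of names.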
At block 508, a system call from the application to the file system may be intercepted at the file system environment creator. If the system call is a read or write operation, process flow continues to block 510. If the system call is one that occurs during a restore operation, process flow continues to block 514. At block 510, the content of the file system is generated by determining, for each file within the file system, a size of the file, a file permission associated with the file, and data within the file. The content of the file system may include modified content of the file system as specified by the system call. In examples, the file system is a database, and the content of the database is determined by policies within the configuration file.
At block 512, the generated content of the file system is returned to the application that issued the system call. At block 514, the generated content may be verified in response to the system call within the file system environment creator. In examples, the file system environment creator can work in a “verify” mode where metadata and data written to the generated file system can be interpreted by the file system environment creator and verified against a policy within the data model file. Such a verification occurs when an application is performing a restore operation using the file system environment creator.
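A "verify" mode along these lines could check each restored chunk against regenerated content as it arrives, so that the restored data never needs to be stored. The Python sketch below is an assumption for illustration: `expected_bytes`, `RestoreVerifier`, and the hash-based generation scheme are hypothetical, and the sketch compares against regenerated bytes rather than evaluating a policy read from a data model file.

```python
import hashlib


def expected_bytes(path: str, offset: int, length: int) -> bytes:
    """Regenerate the bytes a file is expected to hold (assumed hash-based scheme)."""
    out = bytearray()
    block, skip = divmod(offset, 32)  # one SHA-256 digest covers 32 bytes
    while len(out) < skip + length:
        out.extend(hashlib.sha256(f"{path}:{block}".encode()).digest())
        block += 1
    return bytes(out[skip:skip + length])


class RestoreVerifier:
    """Checks each restored chunk as it is written; restored data is never stored."""

    def __init__(self, path: str, size: int):
        self.path = path
        self.size = size
        self.verified = 0     # bytes confirmed so far (assumes non-overlapping writes)
        self.mismatches = []  # offsets of chunks that differed or overran the file

    def write(self, offset: int, data: bytes) -> None:
        overruns = offset + len(data) > self.size
        if overruns or data != expected_bytes(self.path, offset, len(data)):
            self.mismatches.append(offset)
        else:
            self.verified += len(data)

    def ok(self) -> bool:
        # The restore passes only if every byte was written and every chunk matched.
        return not self.mismatches and self.verified == self.size
```

Checking chunk by chunk in this way avoids computing checksums over a stored copy of the entire data set.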
The various software components discussed herein may be stored on the tangible, non-transitory computer-readable medium, as indicated in
While the present techniques may be susceptible to various modifications and alternative forms, the exemplary embodiments discussed above have been shown only by way of example. It is to be understood that the technique is not intended to be limited to the particular embodiments disclosed herein. Indeed, the present techniques include all alternatives, modifications, and equivalents falling within the true spirit and scope of the appended claims.