This invention generally relates to computer systems and more specifically relates to file system directories with associated queries of other directories.
The development of the EDVAC computer system of 1948 is often cited as the beginning of the computer era. Since that time, computer systems have evolved into extremely sophisticated devices that may be found in many different settings. Computer systems typically include a combination of hardware (e.g., semiconductors, circuit boards, etc.) and software (e.g., computer programs). As advances in semiconductor processing and computer architecture push the performance of the computer hardware higher, more sophisticated computer software has evolved to take advantage of the higher performance of the hardware, resulting in computer systems today that are much more powerful than just a few years ago.
One of the primary purposes of computers is the storage, retrieval, and manipulation of data. Data in computers is often organized in a hierarchical structure analogous to a physical file cabinet. An office might have many file cabinets, with each different file cabinet reserved for a different type of data, and each file cabinet being further divided into folders, each of which contains paper or other information/data. Analogously, the storage of a computer is divided into directories (also known as folders), each of which may contain files and/or sub-directories (also known as sub-folders), each of which may be further divided, and so on as needed. Thus, a directory may contain a sub-directory and may also be a sub-directory. The files store the data and may be in any format, such as databases, tables, images, video files, audio files, word-processing documents, drawings, graphs, spreadsheets, programs, pages, or any other type of file. A user, system administrator, or application program may perform operations against the directories, such as creating, deleting, reading, renaming, reorganizing, and searching them.
Searching directories for files that match certain criteria is an important function because of the large number of directories and files that may exist in a computer. The search operation may be implemented via a variety of techniques. For example, the application implementing the search may perform simple file I/O (Input/Output) to find all files on the computer system and determine which of them match the criteria. In a second example, the application may use file system-specific APIs (Application Program Interfaces) to access the list of files meeting the criteria. In a third example, the application may perform normal file I/O against a directory that is connected to a symbolic link directory via a query that specifies the search criteria.
For the simple I/O example, the application implements the query and reads files in the directory. Simple I/O has the disadvantage that the application is now more complex and may need to use more APIs to determine if a file matches the criteria, depending on the complexity of the criteria. Further, the criteria must also be input to the application, which adds an additional step with resulting complexity and performance degradation.
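The simple I/O technique described above can be illustrated with a minimal sketch, in which the application itself walks the directory tree and applies the criteria. The function name `find_matching_files`, the demo directory layout, and the ".txt" criterion are hypothetical and chosen only for illustration.

```python
import os
import tempfile


def find_matching_files(root, predicate):
    """Walk every directory under root and collect the paths that satisfy predicate."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if predicate(path):
                matches.append(path)
    return sorted(matches)


def demo():
    """Build a small throwaway tree and search it for .txt files."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "reports"))
        for rel in ("a.txt", "b.log", os.path.join("reports", "c.txt")):
            with open(os.path.join(root, rel), "w") as handle:
                handle.write("x")
        found = find_matching_files(root, lambda p: p.endswith(".txt"))
        # Report paths relative to the temporary root, with "/" separators.
        return [os.path.relpath(p, root).replace(os.sep, "/") for p in found]


simple_io_result = demo()
```

Note that the criteria (here a lambda) must be supplied to and evaluated by the application, which is the added complexity the paragraph above refers to.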
For the file system-specific APIs example, the application sends the query to the file system, which has the advantage of being faster if the file system is backed by a database and can determine the results more quickly. Unfortunately, the file system-specific APIs hurt the portability of the application, which may need to be coded to support multiple API sets.
For the symbolic links example, the application accesses the files in the given directory using normal file I/O APIs. Unfortunately, a separate process is needed to ensure that the symbolic links are always kept up to date. For example, the query may be stored in a background program, which constantly runs to maintain the relationship between the directory and the symbolic link directory. Also, if the criteria are based on user-specific properties (such as the access level), then a separate directory must be kept for every user.
Thus, without a better way to perform queries against directories, users will continue to suffer from lack of portability, degraded performance, and complexity.
A method, apparatus, system, and signal-bearing medium are provided. In an embodiment, in response to a read command directed to a first directory, a query associated with the first directory is performed against a second directory, and the results of the query are returned in response to the read command. In response to an open command directed to a file in the first directory, the associated query is performed against the second directory, and the file is found in the query results. A file handle associated with the file is then created. If the query has an associated API, an instance of the API is created and a pointer to the instance is stored in the file handle. In response to a file command directed to the file handle, if the file handle contains a pointer to the API instance, the file command is passed to the API instance.
Various embodiments of the present invention are hereinafter described in conjunction with the appended drawings:
It is to be noted, however, that the appended drawings illustrate only example embodiments of the invention, and are therefore not considered limiting of its scope, for the invention may admit to other equally effective embodiments.
Referring to the Drawings, wherein like numbers denote like parts throughout the several views,
The major components of the computer system 100 include one or more processors 101, a main memory 102, a terminal interface 111, a storage interface 112, an I/O (Input/Output) device interface 113, and communications/network interfaces 114, all of which are coupled for inter-component communication via a memory bus 103, an I/O bus 104, and an I/O bus interface unit 105.
The computer system 100 contains one or more general-purpose programmable central processing units (CPUs) 101A, 101B, 101C, and 101D, herein generically referred to as the processor 101. In an embodiment, the computer system 100 contains multiple processors typical of a relatively large system; however, in another embodiment the computer system 100 may alternatively be a single CPU system. Each processor 101 executes instructions stored in the main memory 102 and may include one or more levels of on-board cache.
The main memory 102 is a random-access semiconductor memory for storing data and programs. In another embodiment, the main memory 102 represents the entire virtual memory of the computer system 100, and may also include the virtual memory of other computer systems coupled to the computer system 100 or connected via the network 130. The main memory 102 is conceptually a single monolithic entity, but in other embodiments the main memory 102 is a more complex arrangement, such as a hierarchy of caches and other memory devices. For example, memory may exist in multiple levels of caches, and these caches may be further divided by function, so that one cache holds instructions while another holds non-instruction data, which is used by the processor or processors. Memory may be further distributed and associated with different CPUs or sets of CPUs, as is known in any of various so-called non-uniform memory access (NUMA) computer architectures.
The memory 102 includes a directory 150, a virtual directory 152, a file system 154, i-nodes 156, vi-nodes 158, and an application 160. Although the directory 150, the virtual directory 152, the file system 154, the i-nodes 156, the vi-nodes 158, and the application 160 are illustrated as being contained within the memory 102 in the computer system 100, in other embodiments some or all of them may be on different computer systems and may be accessed remotely, e.g., via the network 130. The computer system 100 may use virtual addressing mechanisms that allow the programs of the computer system 100 to behave as if they only have access to a large, single storage entity instead of access to multiple, smaller storage entities. Thus, the directory 150, the virtual directory 152, the file system 154, the i-nodes 156, the vi-nodes 158, and the application 160 are not necessarily all completely contained in the same storage device at the same time.
The directory 150 includes files 162 and may also include one or more unillustrated sub-directories in a hierarchical structure. The directory 150 may itself be a sub-directory contained within another directory. In an embodiment, the directory 150 and the files 162 may be organized and stored as UNIX directories and files, but in another embodiment any appropriate type of directories and files may be used. UNIX directories and files are hierarchical in that they resemble a tree structure. The tree is anchored at a place called the root, designated by a slash “/”. Every item in the UNIX file system tree is either a file, or a directory that can contain files and other directories. A directory contained within another is called the child of the other. A directory in the file system tree may have many children, but it can only have one parent. UNIX files can have attributes, such as size, permissions, and create time, among others, associated with them. For example, every file and directory may have associated ownership and access permissions for which one is able to specify those to whom the permissions apply.
The virtual directory 152 includes a query 164. The virtual directory 152 is said to be virtual because it does not contain files, but instead contains the query 164, which may be used to find a file or files in the files 162, as further described below with reference to
The file system 154 manages the directories 150, the files 162, the virtual directories 152, the queries 164, the i-nodes 156, and the vi-nodes 158, and performs such operations as creating directories, reading directories, performing queries, and performing file operations. In an embodiment, the file system 154 includes instructions capable of executing on the processor 101 or statements capable of being interpreted by instructions executing on the processor 101 to perform the functions as further described below with reference to
The i-nodes 156 are data structures or information blocks that contain information about the files 162. Each file 162 has an i-node in the i-nodes 156 and is identified by an i-node number (i-number). The i-nodes 156 include information about the files 162, such as file ownership information (user and group ownership); time stamps for last modification, last access and last mode modification; link count; file size; and addresses of physical blocks.
The vi-nodes 158 are data structures or information blocks that contain information about the virtual directories 152, such as directory ownership information, time stamps for last modification, a query, and an action. The query in the vi-node 158 is a copy of the query 164, and the action is derived from the query.
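The i-node and vi-node information blocks described above might be modeled as in the following minimal sketch. The class names, field names, and the textual form of the query are assumptions made for illustration; the description does not prescribe a concrete layout.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class INode:
    """Information block for one file 162 (field names are hypothetical)."""
    i_number: int                      # identifies the file
    owner: str                         # user ownership
    group: str                         # group ownership
    size: int                          # file size in bytes
    link_count: int = 1
    last_modification: float = 0.0     # time stamps, seconds since the epoch
    last_access: float = 0.0
    last_mode_modification: float = 0.0
    block_addresses: List[int] = field(default_factory=list)


@dataclass
class VINode:
    """Information block for one virtual directory 152 (field names are hypothetical)."""
    owner: str
    last_modification: float
    query: str                         # copy of the query 164; the text form is an assumption
    action: Optional[str] = None       # derived from the query; may be absent


example_inode = INode(i_number=42, owner="alice", group="staff", size=1024,
                      block_addresses=[7, 8])
example_vinode = VINode(owner="alice", last_modification=0.0, query="size > 512")
```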
The memory bus 103 provides a data communication path for transferring data among the processor 101, the main memory 102, and the I/O bus interface unit 105. The I/O bus interface unit 105 is further coupled to the system I/O bus 104 for transferring data to and from the various I/O units. The I/O bus interface unit 105 communicates with multiple I/O interface units 111, 112, 113, and 114, which are also known as I/O processors (IOPs) or I/O adapters (IOAs), through the system I/O bus 104. The system I/O bus 104 may be, e.g., an industry standard PCI bus, or any other appropriate bus technology.
The I/O interface units support communication with a variety of storage and I/O devices. For example, the terminal interface unit 111 supports the attachment of one or more user terminals 121, 122, 123, and 124. The storage interface unit 112 supports the attachment of one or more direct access storage devices (DASD) 125, 126, and 127 (which are typically rotating magnetic disk drive storage devices, although they could alternatively be other devices, including arrays of disk drives configured to appear as a single large storage device to a host). The contents of the main memory 102 may be stored to and retrieved from the direct access storage devices 125, 126, and 127.
The I/O and other device interface 113 provides an interface to any of various other input/output devices or devices of other types. Two such devices, the printer 128 and the fax machine 129, are shown in the exemplary embodiment of
Although the memory bus 103 is shown in
The computer system 100 depicted in
The network 130 may be any suitable network or combination of networks and may support any appropriate protocol suitable for communication of data and/or code to/from the computer system 100. In various embodiments, the network 130 may represent a storage device or a combination of storage devices, either connected directly or indirectly to the computer system 100. In an embodiment, the network 130 may support Infiniband. In another embodiment, the network 130 may support wireless communications. In another embodiment, the network 130 may support hard-wired communications, such as a telephone line or cable. In another embodiment, the network 130 may support the Ethernet IEEE (Institute of Electrical and Electronics Engineers) 802.3x specification. In another embodiment, the network 130 may be the Internet and may support IP (Internet Protocol).
In another embodiment, the network 130 may be a local area network (LAN) or a wide area network (WAN). In another embodiment, the network 130 may be a hotspot service provider network. In another embodiment, the network 130 may be an intranet. In another embodiment, the network 130 may be a GPRS (General Packet Radio Service) network. In another embodiment, the network 130 may be an FRS (Family Radio Service) network. In another embodiment, the network 130 may be any appropriate cellular data network or cell-based radio network technology. In another embodiment, the network 130 may be an IEEE 802.11b wireless network. In still another embodiment, the network 130 may be any suitable network or combination of networks. Although one network 130 is shown, in other embodiments any number (including zero) of networks (of the same or different types) may be present.
It should be understood that
The various software components illustrated in
Moreover, while embodiments of the invention have been and hereinafter will be described in the context of fully-functioning computer systems, the various embodiments of the invention are capable of being distributed as a program product in a variety of forms, and the invention applies equally regardless of the particular type of signal-bearing medium used to actually carry out the distribution. The programs defining the functions of this embodiment may be delivered to the computer system 100 via a variety of signal-bearing media, which include, but are not limited to:
(1) information permanently stored on a non-rewriteable storage medium, e.g., a read-only memory device attached to or within a computer system, such as a CD-ROM, DVD-R, or DVD+R;
(2) alterable information stored on a rewriteable storage medium, e.g., a hard disk drive (e.g., the DASD 125, 126, or 127), CD-RW, DVD-RW, DVD+RW, DVD-RAM, or diskette; or
(3) information conveyed by a communications medium, such as through a computer or a telephone network, e.g., the network 130, including wireless communications.
Such signal-bearing media, when carrying machine-readable instructions that direct the functions of the present invention, represent embodiments of the present invention.
Embodiments of the present invention may also be delivered as part of a service engagement with a client corporation, nonprofit organization, government entity, internal organizational structure, or the like. Aspects of these embodiments may include configuring a computer system to perform, and deploying software systems and web services that implement, some or all of the methods described herein. Aspects of these embodiments may also include analyzing the client company, creating recommendations responsive to the analysis, generating software to implement portions of the recommendations, integrating the software into existing processes and infrastructure, metering use of the methods and systems described herein, allocating expenses to users, and billing users for their use of these methods and systems.
In addition, various programs described hereinafter may be identified based upon the application for which they are implemented in a specific embodiment of the invention. But, any particular program nomenclature that follows is used merely for convenience, and thus embodiments of the invention should not be limited to use solely in any specific application identified and/or implied by such nomenclature.
The exemplary environments illustrated in
If the determination at block 710 is true, then the directory 152 has an associated query 164, so control continues to block 715 where the file system 154 performs the query 164 against the entity (e.g., the directory 150) specified in the query 164. Control then continues to block 720 where the file system 154 returns the results from the query 164 to the application 160. Control then continues to block 799 where the logic of
If the determination at block 710 is false, then the directory (e.g., the directory 150) does not have an associated query 164, so control continues to block 725 where the file system 154 returns the contents of the directory to the application 160. Control then continues to block 799 where the logic of
If the determination at block 820 is true, then the node is a vi-node 158, so control continues to block 822 where the file system 154 determines whether an associated API exists for the vi-node 158. If the determination at block 822 is true, then control continues to block 825 where the file system 154 creates an instance of the API associated with the vi-node 158. Control then continues to block 830 where the file system 154 creates and returns a file handle that points to the instance of the API to the application 160. That is, the file system 154 stores a pointer to the instance of the API in the file handle. Control then continues to block 899 where the logic of
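The read-directory decision at blocks 710 through 725 can be sketched as follows. This is only an illustration under stated assumptions: the classes `Query`, `Directory`, and `FileSystem`, the representation of a query as a target path plus a predicate over entry names, and the example paths are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Query:
    target: str                          # path of the directory the query runs against
    predicate: Callable[[str], bool]     # search criteria applied to each entry name


@dataclass
class Directory:
    entries: List[str] = field(default_factory=list)
    query: Optional[Query] = None        # present only for a virtual directory


class FileSystem:
    def __init__(self):
        self.directories: Dict[str, Directory] = {}

    def read_directory(self, path):
        directory = self.directories[path]
        if directory.query is not None:                          # block 710
            target = self.directories[directory.query.target]
            # Blocks 715 and 720: perform the query and return its results.
            return [name for name in target.entries
                    if directory.query.predicate(name)]
        return list(directory.entries)                           # block 725


fs = FileSystem()
fs.directories["/docs"] = Directory(entries=["a.txt", "b.jpg", "c.txt"])
fs.directories["/text-only"] = Directory(
    query=Query("/docs", lambda name: name.endswith(".txt")))
```

In this sketch the application issues the same `read_directory` call for either kind of directory, so the query stays inside the file system rather than in the application.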
If the determination at block 822 is false, then control continues to block 835 where the file system 154 returns a file handle of the file 162 associated with the vi-node 158 to the application 160.
If the determination at block 820 is false, then the node is an i-node 156, so control continues to block 835 where the file system 154 returns a file handle of the file 162 associated with the i-node 156 to the application 160. Control then continues to block 899 where the logic of
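The open logic at blocks 820 through 835 can be sketched as below. The classes, the `api_factory` field used to represent "an associated API", and the `UppercaseApi` example are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class INode:
    path: str


@dataclass
class VINode:
    path: str
    api_factory: Optional[Callable[[], object]] = None   # associated API, if any


@dataclass
class FileHandle:
    path: str
    api_instance: Optional[object] = None   # pointer to the API instance, if created


class UppercaseApi:
    """Hypothetical API that a query might associate with its results."""

    def handle(self, command, data):
        return data.upper()


def open_node(node):
    if isinstance(node, VINode):                             # block 820
        if node.api_factory is not None:                     # block 822
            api = node.api_factory()                         # block 825
            return FileHandle(node.path, api_instance=api)   # block 830
    # Block 835: an i-node, or a vi-node with no associated API.
    return FileHandle(node.path)


handle_with_api = open_node(VINode("/text-only/a.txt", UppercaseApi))
handle_without_api = open_node(VINode("/text-only/c.txt"))
handle_plain = open_node(INode("/docs/a.txt"))
```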
If the determination at block 915 is true, then the parent directory i-node has an associated query, so control continues to block 920 where the file system 154 performs the query against the parent directory. Control then continues to block 925 where the file system 154 finds the file 162 that matches the query in the parent directory. Control then continues to block 930 where the file system 154 creates a vi-node containing the query, action (if present), and the file path of the parent directory. Control then continues to block 999 where the logic of
If the determination at block 915 is false, then the parent directory i-node does not have an associated query, so control continues to block 935 where the file system 154 retrieves the i-node for the file. Control then continues to block 999 where the logic of
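The lookup at blocks 915 through 935 can be sketched as follows, assuming the parent directory carries an optional query expressed as a predicate over entry names. The class and field names are hypothetical, and raising `FileNotFoundError` for a name absent from the results is an assumption, since the description does not state how a miss is handled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class INode:
    i_number: int


@dataclass
class VINode:
    query: Callable[[str], bool]
    action: Optional[str]
    parent_path: str


@dataclass
class ParentDirectory:
    path: str
    entries: Dict[str, INode] = field(default_factory=dict)
    query: Optional[Callable[[str], bool]] = None
    action: Optional[str] = None


def lookup(parent, name):
    if parent.query is not None:                                   # block 915
        results = [n for n in parent.entries if parent.query(n)]   # block 920
        if name not in results:                                    # block 925
            raise FileNotFoundError(name)
        # Block 930: vi-node holding the query, action, and parent path.
        return VINode(parent.query, parent.action, parent.path)
    if name not in parent.entries:
        raise FileNotFoundError(name)
    return parent.entries[name]                                    # block 935


plain_parent = ParentDirectory("/docs", {"a.txt": INode(1)})
virtual_parent = ParentDirectory(
    "/text-only", {"a.txt": INode(1), "b.jpg": INode(2)},
    query=lambda n: n.endswith(".txt"), action="open")
```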
If the determination at block 1010 is true, then the path of the directory is absolute, so control continues to block 1015 where the file system 154 finds the root (e.g., the “/”) directory i-node. Control then continues to block 1020 where the file system 154 determines whether the current found directory is the target of the read directory command.
If the determination at block 1020 is false, then the current directory is not the target of the read directory command, so control continues to block 1025 where the file system 154 finds the next sub-directory. Control then continues to block 1030 where the file system 154 determines whether the i-node of the sub-directory has an associated query.
If the determination at block 1030 is true, then the i-node of the sub-directory has an associated query, so control continues to block 1035 where the file system 154 performs the query. Control then continues to block 1040 where the file system 154 finds a matching entry for the file 162 in the query results. Control then continues to block 1045 where the file system 154 creates a vi-node 158 with an updated query and action. Control then returns to block 1020, as previously described above.
If the determination at block 1030 is false, then the i-node of the sub-directory does not have an associated query, so control continues to block 1050 where the file system 154 performs a standard lookup function for the file 162 in the sub-directory. Control then returns to block 1020, as previously described above.
If the determination at block 1020 is true, then the directory is the target of the read directory command, so control continues to block 1055 where the file system 154 returns the i-node 156 to the application 160. Control then continues to block 1099 where the logic of
If the determination at block 1010 is false, then the path of the directory is relative instead of absolute, so control continues to block 1060 where the file system 154 finds the current working directory i-node 156. Control then continues to block 1020, as previously described above.
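The path walk at blocks 1010 through 1060 can be sketched as a loop over path components. This is one simplified reading of the flow, not a definitive implementation: the `Entry` class, its `query` and `query_target` fields, and the returned list of node kinds are hypothetical, and the sketch applies a directory's query when resolving the component beneath it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Entry:
    name: str
    children: Dict[str, "Entry"] = field(default_factory=dict)
    query: Optional[Callable[[str], bool]] = None   # criteria over entry names
    query_target: Optional["Entry"] = None          # directory the query runs against


def resolve(path, root, cwd):
    # Block 1010: absolute paths start at the root (block 1015),
    # relative paths start at the current working directory (block 1060).
    current = root if path.startswith("/") else cwd
    kinds = []
    for part in (p for p in path.split("/") if p):       # blocks 1020 and 1025
        if current.query is not None:                    # block 1030
            target = current.query_target
            matches = {n: e for n, e in target.children.items()
                       if current.query(n)}              # block 1035
            if part not in matches:                      # block 1040
                raise FileNotFoundError(part)
            current = matches[part]
            kinds.append("vi-node")                      # block 1045
        else:
            if part not in current.children:             # block 1050
                raise FileNotFoundError(part)
            current = current.children[part]
            kinds.append("i-node")
    return current, kinds                                # block 1055


docs = Entry("docs", children={"a.txt": Entry("a.txt"), "b.jpg": Entry("b.jpg")})
text_only = Entry("text-only", query=lambda n: n.endswith(".txt"), query_target=docs)
root = Entry("/", children={"docs": docs, "text-only": text_only})
```

For example, `resolve("/text-only/a.txt", root, root)` reaches the same entry as `resolve("a.txt", root, docs)`, but records that the last step went through a query.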
If the determination at block 1110 is true, then an API instance exists in the file handle, so control continues to block 1115 where the file system 154 passes the command or request to the API. Control then continues to block 1199 where the logic of
If the determination at block 1110 is false, then an API instance does not exist in the file handle, so control continues to block 1120 where the file system 154 performs the command or request. Control then continues to block 1199 where the logic of
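The dispatch at blocks 1110 through 1120 can be sketched as below. The `FileHandle` fields, the single `"read"` command, and the `UppercaseApi` class are hypothetical and serve only to show the branch on whether the file handle holds a pointer to an API instance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileHandle:
    content: str
    api_instance: Optional[object] = None   # set at open time if the query has an API


class UppercaseApi:
    """Hypothetical API instance that post-processes reads."""

    def handle(self, command, file_handle):
        if command == "read":
            return file_handle.content.upper()
        raise ValueError("unsupported command: " + command)


def file_command(file_handle, command):
    if file_handle.api_instance is not None:                         # block 1110
        return file_handle.api_instance.handle(command, file_handle)  # block 1115
    # Block 1120: the file system performs the command itself.
    if command == "read":
        return file_handle.content
    raise ValueError("unsupported command: " + command)


plain_read = file_command(FileHandle("hello"), "read")
api_read = file_command(FileHandle("hello", UppercaseApi()), "read")
```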
In the previous detailed description of exemplary embodiments of the invention, reference was made to the accompanying drawings (where like numbers represent like elements), which form a part hereof, and in which is shown by way of illustration specific exemplary embodiments in which the invention may be practiced. These embodiments were described in sufficient detail to enable those skilled in the art to practice the invention, but other embodiments may be utilized and logical, mechanical, electrical, and other changes may be made without departing from the scope of the present invention. Different instances of the word “embodiment” as used within this specification do not necessarily refer to the same embodiment, but they may. The previous detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined only by the appended claims.
In the previous description, numerous specific details were set forth to provide a thorough understanding of embodiments of the invention. But, the invention may be practiced without these specific details. In other instances, well-known circuits, structures, and techniques have not been shown in detail in order not to obscure the invention.