Remote file systems employ such a client-server paradigm. A server of the remote file system is a computer that shares its file system with other computers on the network. A client of the remote file system is another computer on the network that may access the file system on the server. The client mounts the file system and subsequently may make file access requests over the network to the server.
The client computer 102 includes at least an RFS Client 202, a virtual memory system (VMS) 204, an interface 206 in the VMS 204, and various page lists. The various page lists include at least a “dirty” list 207, an “uncommitted” list 208, and a “clean” list 209. The client computer 102 also includes various components which are not depicted, such as, for example, one or more processors, volatile memory, one or more communication buses, other operating system software components, application software, input/output, and so on.
The server computer 104 includes at least an RFS Server 212, a file system 214, a virtual memory system 216, and stable data storage (typically, hard disk storage) 218. The server computer 104 also includes various components which are not depicted, such as, for example, one or more processors, volatile memory, one or more communication buses, other operating system software components, application software, input/output, and so on.
In a first step 302, the RFS Client 202 sends a write request, including the data of the “dirty” pages, over the network to the RFS Server 212. This write request is an asynchronous write request.
When the RFS Server 212 receives the write request, it writes the data to the file system 214 which stores the data in volatile memory 217 of the Server 104. Thereafter, the RFS Server 212 sends a write acknowledgement to the appropriate RFS Client 202. Periodically, the file system 214 writes data from the volatile memory 217 to the stable data storage 218 of the Server 104.
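The server-side behavior described above can be illustrated with a minimal sketch. The class name `SketchServer`, its attributes, and the `"write-ack"` return value are hypothetical names chosen for illustration only; two dictionaries stand in for the volatile memory 217 and the stable data storage 218, and no network transport is modeled.

```python
class SketchServer:
    """Hypothetical stand-in for the RFS Server 212 and file system 214."""

    def __init__(self):
        self.volatile = {}  # page number -> data held in volatile memory
        self.stable = {}    # page number -> data on stable data storage

    def write(self, page_no, data):
        # Asynchronous write: buffer the data in volatile memory and
        # acknowledge at once, without waiting for stable storage.
        self.volatile[page_no] = data
        return "write-ack"

    def flush(self):
        # Periodic write-back from volatile memory to stable storage.
        self.stable.update(self.volatile)
        self.volatile.clear()
```

In this sketch a page acknowledged by `write` is not yet durable; it becomes durable only once `flush` has run, which is why the client must track such pages as uncommitted.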
In a second step 304, the RFS Client 202 receives a write acknowledgement over the network from the RFS Server 212. The RFS Client 202 then sets a bit in the page to mark the page as uncommitted, and releases the page. The write acknowledgement is typically received in a relatively short time because the write is asynchronous. If the write acknowledgement is not received within an allotted time, then the RFS Client 202 typically goes back to the first step 302 and re-sends the write request.
In a third step 306, after the page is released, the Virtual Memory System (VMS) 204 at the client computer 102 removes these pages from a “dirty” list 207 and adds them to a separate “uncommitted” list 208. The “dirty” and “uncommitted” lists are lists of pages of the remote file system. In addition, there is a “clean” list 209. These lists may be accessed (read and written) by the VMS 204 by way of an interface 206 at the client computer 102. The VMS 204 may determine that these pages are to be added to the “uncommitted” list 208 (instead of the “clean” list 209) because the aforementioned uncommitted bit is set.
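Steps 302 through 306 may be sketched as follows. This is an illustrative sketch under stated assumptions, not the disclosed implementation: the names `Page`, `PageLists`, `release`, `write_page`, and `max_attempts` are hypothetical, dictionaries keyed by page number stand in for the linked lists, and the `send_write` callback models the network exchange by returning True when a write acknowledgement arrives within the allotted time.

```python
from dataclasses import dataclass


@dataclass
class Page:
    number: int
    data: bytes
    uncommitted: bool = False  # the "uncommitted" bit set in step 304


class PageLists:
    """Hypothetical stand-in for the page lists 207, 208, and 209."""

    def __init__(self):
        self.dirty = {}
        self.uncommitted = {}
        self.clean = {}

    def release(self, page):
        # Step 306: take the page off the "dirty" list and choose the
        # destination list from the uncommitted bit.
        self.dirty.pop(page.number, None)
        if page.uncommitted:
            self.uncommitted[page.number] = page
        else:
            self.clean[page.number] = page


def write_page(lists, page, send_write, max_attempts=3):
    """Send a write request, re-sending if no acknowledgement arrives."""
    for _ in range(max_attempts):
        # Step 302: send the write request with the page data.
        if send_write(page.number, page.data):
            # Step 304: acknowledgement received; mark, then release.
            page.uncommitted = True
            lists.release(page)
            return True
    return False  # page stays on the "dirty" list
```

The retry limit is an assumption added so that the sketch terminates; the description above states only that the write request is typically re-sent.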
In a first step 402, the RFS Client 202 makes a determination (“decides”) to commit uncommitted pages to stable data storage 218. In many instances, this determination to commit may occur a substantial time after the pages were written to the remote file system. For example, a large file may be sent by an RFS Client 202 to the remote file system via many write requests. Subsequently, the RFS Client 202 may determine to commit any uncommitted pages.
In a second step 404, the RFS Client 202 calls the interface 206 to access the “uncommitted” list 208 so as to obtain a list of all the uncommitted pages. Such a separate “uncommitted” list 208 does not appear to be built and maintained by conventional remote file system clients. In accordance with a preferred embodiment, the “uncommitted” list comprises a linked list of uncommitted pages. Advantages of using such a linked list data structure are discussed below in relation to
In a third step 406, the list of uncommitted pages is rapidly retrieved by the interface 206 and returned to the RFS Client 202. This rapid retrieval is enabled by the maintenance of the separate “uncommitted” list 208 at the client computer 102.
In a fourth step 408, the RFS Client 202 then sends to the RFS Server 212 a request to commit the list of pages. All, some, or none of these pages may already be committed to the stable data storage 218. This is because uncommitted pages are periodically committed to the stable data storage 218 by the file system 214. In response to the request, the file system 214 commits those pages on the list that are not yet committed to the stable data storage 218. When the entire list of pages has been committed, the RFS Server 212 returns a commit acknowledgement to the RFS Client 202.
Per the decision block 410, if a commit acknowledgement is received by the RFS Client 202 within the allotted time period, then, in a fifth step 412, the VMS 204 at the client computer 102 may use the interface 206 to remove the pages from the “uncommitted” list 208 and add them to the “clean” list 209. On the other hand, if no commit acknowledgement is received by the RFS Client 202 within the allotted time period, then the RFS Client may re-send the commit request.
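The commit flow of steps 404 through 412 may be sketched as follows. The function name `commit_uncommitted`, the `send_commit` callback, and the `max_attempts` limit are hypothetical names introduced for illustration; dictionaries keyed by page number stand in for the “uncommitted” list 208 and the “clean” list 209, and the callback returns True when a commit acknowledgement arrives within the allotted time.

```python
def commit_uncommitted(uncommitted, clean, send_commit, max_attempts=3):
    """Commit every page on the "uncommitted" list; re-send on timeout."""
    # Steps 404 and 406: obtain the list of uncommitted pages directly,
    # without examining any clean or dirty page.
    page_numbers = list(uncommitted)
    if not page_numbers:
        return True  # nothing to commit
    for _ in range(max_attempts):
        # Step 408: request that the server commit the listed pages.
        if send_commit(page_numbers):
            # Step 412: acknowledgement received, so move the pages
            # from the "uncommitted" list to the "clean" list.
            for number in page_numbers:
                clean[number] = uncommitted.pop(number)
            return True
    return False  # pages remain on the "uncommitted" list
```

Because the pages move to the “clean” list only after an acknowledgement, a lost acknowledgement in this sketch leaves them on the “uncommitted” list, where a later commit request will pick them up again.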
In accordance with an embodiment of the invention, the linked lists are maintained in an unsorted order for higher performance. Nevertheless, the virtual memory system 204 may be configured to return either a sorted or an unsorted list to the RFS Client 202. For example, the VMS 204 may be configured such that the RFS Client 202 may request a contiguous range of pages. The VMS 204 may then retrieve a first page within that range, then retrieve pages before and after the first page so as to retrieve the range of pages.
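The range retrieval described above, in which a first page is found and the pages before and after it are then gathered, may be sketched as follows. This sketch assumes that the VMS can test whether a given page number is on the “uncommitted” list; a Python set is used for that lookup here, which is an assumption and not a structure named in this description. The function name `contiguous_run` and its parameters are likewise hypothetical.

```python
def contiguous_run(uncommitted, first, low, high):
    """Return the contiguous run of uncommitted page numbers around
    `first`, limited to the requested range [low, high]."""
    if first not in uncommitted or not low <= first <= high:
        return []
    start = first
    # Walk backward over the pages before the first page.
    while start - 1 >= low and start - 1 in uncommitted:
        start -= 1
    end = first
    # Walk forward over the pages after the first page.
    while end + 1 <= high and end + 1 in uncommitted:
        end += 1
    return list(range(start, end + 1))
```

For example, with uncommitted pages 9, 4, 6, 5, and 12 held in no particular order, a request around page 5 within the range 0 to 10 yields pages 4, 5, and 6 in sorted order, even though the underlying list is unsorted.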
Problems and Inefficiencies Overcome
When a client in a conventional remote file system wants to commit pages to stable data storage, the client has to scan a list of all clean pages. As the client scans the list of clean pages, it checks whether the page is committed or uncommitted. If the page is uncommitted, then the client builds a range of consecutive pages that are uncommitted. This range is sent to the server for the data to be committed to stable data storage.
Such a conventional technique has at least two problems. First, time is wasted scanning pages that have already been committed. For example, if a file has one thousand clean pages, but only one page is uncommitted, the conventional technique must still scan all one thousand pages before determining that only one page needs to be committed. Second, the clean list may be unsorted, so that consecutive uncommitted pages are not adjacent in the list. The client may then build many small ranges instead of a few large ones, and so may have to send many messages to the server.
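Both problems can be made concrete with a short sketch of the conventional scan. The function name `scan_clean_list` and the representation of each page as a (page number, uncommitted flag) pair are assumptions made for illustration; the sketch counts how many pages are examined and builds ranges only from pages that are adjacent in list order.

```python
def scan_clean_list(clean_pages):
    """Scan every clean page, building ranges of consecutive
    uncommitted pages in the order the list presents them.

    clean_pages: iterable of (page_number, is_uncommitted) pairs.
    Returns (ranges, examined), where ranges is a list of
    [first, last] page numbers and examined is the scan count.
    """
    examined = 0
    ranges = []
    for number, is_uncommitted in clean_pages:
        examined += 1
        if not is_uncommitted:
            continue
        if ranges and ranges[-1][1] + 1 == number:
            ranges[-1][1] = number  # extend the current range
        else:
            ranges.append([number, number])  # start a new range
    return ranges, examined
```

With one thousand clean pages of which only page 500 is uncommitted, this scan examines all one thousand pages to produce a single one-page range. With three uncommitted pages listed in the order 3, 2, 1, it produces three separate ranges, and hence three messages, where a sorted list would have produced one.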
The present application discloses a much more efficient technique for handling uncommitted pages in a remote file system. In accordance with an embodiment of the invention, at least three lists of pages are formed and maintained, including a “dirty” list, a “clean” list, and a separate “uncommitted” list. In accordance with an embodiment of the invention, these lists may be structured as linked lists having an unsorted order.
The technique disclosed herein has various advantages over the conventional technique. First, forming and maintaining a separate “uncommitted” list enables the client to quickly obtain ranges of uncommitted pages to be committed. Second, with the lists structured as linked lists, pages may be readily added or removed from the lists.
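The second advantage may be illustrated with a minimal sketch of an unsorted doubly linked list in which adding or removing a page takes a constant number of pointer updates. The names `PageNode`, `PageList`, and `move` are hypothetical; this description does not specify whether the lists are singly or doubly linked, so the doubly linked form is an assumption.

```python
class PageNode:
    def __init__(self, number):
        self.number = number
        self.prev = None
        self.next = None
        self.owner = None  # the list currently holding this page


class PageList:
    """Unsorted doubly linked list of pages."""

    def __init__(self, name):
        self.name = name
        self.head = None

    def add(self, node):
        # Insert at the head: no search and no sorting is needed.
        node.prev = None
        node.next = self.head
        if self.head is not None:
            self.head.prev = node
        self.head = node
        node.owner = self

    def remove(self, node):
        if node.owner is not self:
            raise ValueError("page is not on this list")
        # Unlink by rewiring the two neighbors.
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next is not None:
            node.next.prev = node.prev
        node.prev = node.next = node.owner = None

    def numbers(self):
        out, node = [], self.head
        while node is not None:
            out.append(node.number)
            node = node.next
        return out


def move(node, destination):
    """Move a page from its current list to another list."""
    node.owner.remove(node)
    destination.add(node)
```

Moving a page from the “dirty” list to the “uncommitted” list, or from the “uncommitted” list to the “clean” list, is then a single call to `move`, whose cost does not depend on how many pages either list holds.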
Hence, the technique disclosed herein provides a remote file system with a highly efficient way of handling uncommitted pages. In particular, this technique solves the problem of inefficient scanning for uncommitted pages which occurs in the conventional technique.
In the above description, numerous specific details are given to provide a thorough understanding of embodiments of the invention. However, the above description of illustrated embodiments of the invention is not intended to be exhaustive or to limit the invention to the precise forms disclosed. One skilled in the relevant art will recognize that the invention can be practiced without one or more of the specific details, or with other methods, components, etc. In other instances, well-known structures or operations are not shown or described in detail to avoid obscuring aspects of the invention. While specific embodiments of, and examples for, the invention are described herein for illustrative purposes, various equivalent modifications are possible within the scope of the invention, as those skilled in the relevant art will recognize.
These modifications can be made to the invention in light of the above detailed description. The terms used in the following claims should not be construed to limit the invention to the specific embodiments disclosed in the specification and the claims. Rather, the scope of the invention is to be determined by the following claims, which are to be construed in accordance with established doctrines of claim interpretation.