File system workloads that read large quantities of data can be categorized into two general types of access patterns: random access and sequential access. A random-access pattern describes a workload in which the block-identifiers ("block-ids") of the blocks read from disk form no discernible pattern. Accordingly, a workload having a random-access pattern reads blocks of data on disk in any order and may seek back and forth across offsets within a file. In contrast, a sequential access pattern describes a workload in which blocks are requested from the file system in some predictable sequence. In some cases, the predictable sequence is as simple as reading every block in the file, one after another. Other requests follow more complex sequential access patterns, which may be more difficult to predict.
It is possible to observe sequential access patterns as an application makes use of the file system; it is also possible to optimize the way the file system reads blocks of data so that applications exhibiting sequential access patterns achieve optimal performance. The process of loading blocks into memory before an application needs them (based on recognizing sequential access patterns) is called prefetch, or read-ahead. If a file system retrieves blocks of data before the application needs to use them, the application's performance tends to improve, because the file system has performed a portion of the work on the application's behalf. Of course, application performance improves only when the file system correctly predicts the blocks of data that the application eventually uses. If the file system retrieves blocks that are never used, performance suffers because the file system has performed unnecessary work.
Prefetch strategies typically handle sequential blocks, fetched in a linear fashion (i.e., one block after another). The kinds of applications that benefit most from this type of prefetch are those that copy files from one location to another, stream files sequentially, or otherwise traverse a linear sequence of block-ids.
Most production file systems used by various UNIX operating systems (such as Linux, Solaris, and Berkeley Software Design (BSD)) implement a simple prefetch strategy that operates at the block-device level and handles only linear access patterns, possibly only those with increasing block-ids. For example, the prefetch strategies for the Unix File System (UFS) on Solaris® (registered trademark of Sun Microsystems, Inc., Santa Clara, Calif., USA), and those employed on Linux, deal only with block-ids that are increasing and that match a linear, one-block-after-another pattern. Today's applications frequently have multiple streams reading at different offsets within the same file, yet these prefetch strategies assume that only one reader is present, reading from a linearly increasing offset.
In general, in one aspect, the invention relates to a file system comprising a file accessed by the file system, wherein an instance of file access information is generated upon each access of the file, an access history buffer associated with the file, wherein the access history buffer stores a plurality of instances of file access information, and a prefetch stream buffer configured to store a plurality of prefetch streams, wherein each of the plurality of prefetch streams is generated by satisfying a collinear check for a sequential access pattern between at least three of the instances of file access information.
In general, in one aspect, the invention relates to a method for prefetching data in a file system comprising detecting an access to a file in the file system, wherein an instance of file access information is generated upon each access to the file, placing a plurality of instances of file access information into an access history buffer, performing a collinear check between at least three of the instances of file access information in the access history buffer to determine a sequential access pattern, creating a prefetch stream based on the sequential access pattern if the collinear check succeeds, and placing the prefetch stream into a prefetch stream buffer.
In general, in one aspect, the invention relates to a computer system for prefetching data in a file system comprising a processor, a memory, a storage device, and software instructions stored in the memory for enabling the computer system, under control of the processor, to detect an access to a file in the file system, wherein an instance of file access information is generated upon each access to the file, place a plurality of instances of file access information into an access history buffer, perform a collinear check between at least three of the instances of file access information in the access history buffer to determine a sequential access pattern, create a prefetch stream based on the sequential access pattern, and place the prefetch stream into a prefetch stream buffer.
Other aspects and advantages of the invention will be apparent from the following description and the appended claims.
Specific embodiments of the invention will now be described in detail with reference to the accompanying figures. Like elements in the various figures are denoted by like reference numerals for consistency.
In the following detailed description of embodiments of the invention, numerous specific details are set forth in order to provide a more thorough understanding of the invention. However, it will be apparent to one of ordinary skill in the art that the invention may be practiced without these specific details. In other instances, well-known features have not been described in detail to avoid unnecessarily complicating the description.
In general, embodiments of the invention relate to a file system with an infrastructure to support a prefetch strategy handling various types of sequential access patterns. In one embodiment of the invention, the prefetch strategy uses information available to the file system, prior to issuing block-device requests, to obtain more robust information about the way applications are accessing files. More specifically, in one embodiment of the invention, the file system maintains file access information (stored in a file access history buffer) about multiple accesses on a per-file basis. Further, the prefetch strategy, in one embodiment of the invention, involves a collinear check of multiple instances of access information for a file (stored in the file access history buffer) to determine the presence of one or more prefetch streams and store the prefetch stream(s) in an appropriate buffer. A stream of data may then be fetched into memory based on the prefetch streams matching a particular sequential access pattern. In one or more embodiments of the invention, the number of blocks prefetched in a stream depends on the reliability of the access pattern: as more hits in a sequential access pattern are found, more blocks in the pattern are prefetched. Conversely, false positives do not trigger further prefetches, thereby conserving file system resources.
In one or more embodiments of the invention, the file system of the present invention employs a prefetch strategy designed to prefetch streams of data for access patterns with both ascending block-ids and descending block-ids. Ascending block-ids are ordered in an increasing fashion. For example, an access pattern with ascending block-ids may include the block-ids 3, 5, 7, 9, etc. Descending block-ids are ordered in a decreasing fashion. For example, an access pattern with descending block-ids may include the block-ids 80, 79, 78, 77, etc. Accordingly, if a file is being accessed from beginning to end, or from end to beginning, the file system is able to find a sequential access pattern, if one exists. In one embodiment of the invention, the file system using the prefetch strategy is able to detect various sequential access patterns, including a linear access pattern, a strided access pattern, and a multi-strided access pattern.
In one embodiment of the invention, a linear access pattern is one where an application accesses each block successively, one after another. An example of a linear access pattern would be block-ids 1, 2, 3, 4, 5, 6, etc.
In one embodiment of the invention, a strided access pattern is one that increases or decreases sequentially but, instead of following a linear sequence, skips a fixed stride of blocks between accesses.
An example of a strided access pattern would be block-ids 100, 110, 120, 130, 140, etc. In this case, the prefetch strategy is able to determine that the length of the stride is 10 blocks and can prefetch blocks this distance apart.
In one embodiment of the invention, a multi-block, strided access pattern is a combination of the previous two types of access patterns. For example, assume an application accesses blocks 110, 111, and 112 in one read operation; then accesses blocks 120, 121, and 122 in another read operation; and then accesses blocks 130, 131, and 132 in a third read operation. Then, in one embodiment of the invention, the file system is able to prefetch blocks 140, 141, 142, and so on, if these sequential accesses continue.
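All three pattern types above can be characterized by a start position, a stride, and a length of access. The following minimal C sketch (with illustrative names; nothing here is mandated by the embodiments described above) shows how those three parameters generate the block-ids of any such pattern:

```c
#include <stdint.h>

/* Illustrative parameterization: a linear pattern has stride 1 and
 * length 1; a strided pattern has stride n and length 1; a multi-block
 * strided pattern has stride n and length m (m >= 1). A negative
 * stride models descending block-ids. */
typedef struct {
    int64_t start;   /* first block-id in the pattern                  */
    int64_t stride;  /* distance between the start of each block group */
    int64_t length;  /* number of consecutive blocks read per group    */
} access_pattern_t;

/* Return the i-th block-id (i >= 0) that the pattern predicts. */
static int64_t pattern_block(const access_pattern_t *p, int64_t i)
{
    return p->start + (i / p->length) * p->stride + (i % p->length);
}
```

With start = 110, stride = 10, and length = 3, pattern_block() yields 110, 111, 112, 120, 121, 122, and so on, matching the multi-block strided example above.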
In one or more embodiments of the invention, the file system (100) is an object-based file system (i.e., both data and metadata are stored as objects). More specifically, the file system (100) includes functionality to store both data and corresponding metadata in a storage pool (not shown). Further, in one or more embodiments of the invention, an access history buffer (130) and a prefetch stream buffer (140) are associated with each file (e.g., File A (105), File B (110), File N (115)) in the file system (100). In one or more embodiments of the invention, the access history buffer (130) and the prefetch stream buffer (140) remain in memory while the file (e.g., File A (105), File B (110), File N (115)) exists. Each of these buffers is maintained on a per-file basis and may be stored in memory directly with the associated file or with other buffers maintained in the file system (100). Further, each of these buffers is tunable (i.e., may be as large or small as required for a particular use). Likewise, the type of data structure used for each buffer may vary depending on the type of file and the volume of data involved.
One skilled in the art will appreciate that, in one or more embodiments of the invention, the buffers may not only remain in memory, but may also be stored to a persistent data store (i.e., a storage device, a database, etc.), where the information stored in the buffer is archived and the data may be leveraged over time to predict usage patterns using various models known in the art. These usage patterns may then be used to aid in supporting more refined prefetch strategies within the file system.
The access history buffer (130) includes multiple instances of file access information associated with a file (e.g., File B (110)). Each instance of the file access information is generated when the file (e.g., File B (110)) is accessed. As discussed above, the file (e.g., File B (110)) may be accessed whenever an application issues a request that is serviced by the file system (100). For example, if an application issues a modification request, the file within the file system is accessed to perform the modification. The file access information may include a variety of information about the file access, including the size of the file, the position of the access within the file, the requesting application, the requesting thread performing the access, information about the file system during the access, etc. One skilled in the art will appreciate that maintaining file access information beyond only the block-id allows the file system to track multiple accesses per file with various access pattern types.
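By way of illustration, one possible C record for an instance of file access information is sketched below; the field names and types are assumptions made for clarity, and an actual embodiment may track different or additional fields:

```c
#include <stdint.h>

/* One instance of file access information, appended to the per-file
 * access history buffer (130) on each access. Field names and types
 * are illustrative assumptions, not requirements. */
typedef struct file_access {
    int64_t  block_id;    /* block-id touched by the access           */
    uint64_t file_size;   /* size of the file at access time          */
    uint64_t offset;      /* position of the access within the file   */
    uint64_t app_id;      /* identifier of the requesting application */
    uint64_t thread_id;   /* thread performing the access             */
    uint64_t timestamp;   /* file system state captured at the access */
} file_access_t;
```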
The prefetch stream buffer (140) stores prefetch streams. The prefetch streams are generated by performing a collinear check for a sequential access pattern (e.g., linear access pattern, strided access pattern, multi-strided access pattern, etc.) between instances of file access information found in the access history buffer (130).
For example, Stream A (145) may include the following segments of information: a number of hits (160), a start position (163), a stride (165), a length of access (170), a last position prefetched (175), and a last position read (180). The number of hits (160) tracks how many blocks in the sequential access pattern have been read after the prefetch stream is created. The start position (163) is the first block-id in the sequence. For example, a pattern with block-ids 2, 4, 6, etc. has a start position of 2. The stride (165) is the spacing between blocks retrieved. For example, a pattern with block-ids 3, 6, 9, 12, 15, etc. would have a stride of 3. The length of access (170) indicates the number of consecutive blocks read in a multi-block strided access pattern. For example, a multi-block strided access pattern with block-ids 0, 1, 2, 3, 4, 100, 101, 102, 103, 104, 200, 201, 202, 203, 204, etc. would have a length of access of 5 and a stride of 100. The last position prefetched (175) stores the block-id of the last block in the pattern that has been prefetched and cached, and the last position read (180) stores the block-id of the last block that has been read by an instance of file access.
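The six segments enumerated above map naturally onto a small record. A minimal C sketch, assuming the same field set (names illustrative only):

```c
#include <stdint.h>

/* One prefetch stream, as stored in the prefetch stream buffer (140).
 * Each field mirrors a segment of information described above. */
typedef struct prefetch_stream {
    uint32_t hits;            /* number of hits (160)           */
    int64_t  start;           /* start position (163)           */
    int64_t  stride;          /* stride (165)                   */
    int64_t  length;          /* length of access (170)         */
    int64_t  last_prefetched; /* last position prefetched (175) */
    int64_t  last_read;       /* last position read (180)       */
} prefetch_stream_t;
```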
In one or more embodiments of the invention, the number of hits (160) determines how aggressively the prefetch strategy is applied. Specifically, when a sequential access pattern is already present in the prefetch stream buffer, the number of hits (160) is increased on each successive access. The increase in the number of hits (160) increases how aggressively the prefetch strategy is applied for that particular prefetch stream.
Specifically, in one or more embodiments of the invention, the amount of data prefetched is increased exponentially based on the number of hits in the stream. A hit is found when the next block in a sequential access pattern is read, and the hit count is a measure of the sequential access pattern's reliability. For example, if a sequential access pattern is created from the block-ids 1, 2, and 3, the next block, corresponding to block-id 4, is prefetched and cached. If block-id 4 is subsequently accessed by the requesting application, then two more blocks in the sequential access pattern, corresponding to block-ids 5 and 6, are prefetched and cached. Once those blocks are read, four more blocks (block-ids 7, 8, 9, 10) in the sequential access pattern are prefetched and cached, and so on, until a set maximum prefetch amount is reached. Those skilled in the art will appreciate that the amount of data prefetched based on the number of hits may instead follow another function, such as a linear or factorial function. Further, because the amount of data prefetched depends on the reliability of the sequential access pattern, false positives consume a minimal amount of resources.
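Assuming the exponential policy just described, one compact realization in C: counting one hit per block read and prefetching only when every previously prefetched block has been consumed, an amount of hits + 1 reproduces the 1, 2, 4, 8, ... schedule (the cap is an assumption, since the text specifies only "a set maximum"):

```c
#include <stdint.h>

/* Illustrative cap; the text specifies only that a maximum exists. */
#define MAX_PREFETCH_BLOCKS 64u

/* Amount to prefetch when all previously prefetched blocks have been
 * read. With one hit counted per block read, hits + 1 yields the
 * doubling schedule described above: 1 block at creation (hits = 0),
 * then 2 at hits = 1, 4 at hits = 3, 8 at hits = 7, and so on. */
static uint32_t blocks_to_prefetch(uint32_t hits)
{
    uint32_t n = hits + 1;
    return (n > MAX_PREFETCH_BLOCKS) ? MAX_PREFETCH_BLOCKS : n;
}
```

At stream creation (zero hits), the same function yields a single block, matching the conservative initial prefetch described below, so a false positive wastes at most one block while a reliable stream quickly reaches the maximum.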
One skilled in the art will appreciate that the prefetch stream buffer shown in the accompanying figures is merely exemplary; the number of prefetch streams stored, and the segments of information maintained for each stream, may vary across embodiments.
If a hit has been made with the block-id, a check is made to determine whether all prefetched blocks in the prefetch stream have been read (STEP 316). For example, if a sequential access pattern has a starting block-id of 2, a stride of 2, a length of 1, a last read position of 6, and a last prefetched position of 10, then a new block read of block-id 8 would not read all prefetched blocks in the prefetch stream, because block 10 is still prefetched and cached. In the case where not all prefetched blocks are read, the prefetch stream is updated to include the newest information and no further action is needed (STEP 324). In one or more embodiments of the invention, updating the prefetch stream (STEP 324) when all prefetched blocks have not been read involves changing the last position read in the prefetch stream.
Alternatively, the last block read may be the same as the last block prefetched. For example, if a prefetch stream has a starting block-id of 1, a stride of 1, a length of 1, a last read position of 3, and a last prefetched position of 4, a block read of block-id 4 indicates that all prefetched blocks have been read. As a result, more blocks in the sequential access pattern are prefetched, based on the number of hits in the prefetch stream (STEP 318). As stated above, the number of prefetched blocks may be a linear, exponential, or other function of the number of hits in the prefetch stream. For example, the number of prefetched blocks may be equal to the number of hits in the prefetch stream, or the number of prefetched blocks may follow an exponential function that increases each time a new prefetch is conducted, as stated above (e.g., 2 blocks, 4 blocks, 8 blocks, 16 blocks, etc.).
Once more blocks are prefetched, the prefetch stream is updated to reflect the changes (STEP 324). Because the prefetch stream now reflects both a new block read and newly prefetched blocks, both the last position prefetched and the last position read need to be updated. For example, in a sequential access pattern of stride 1, if the obtained block-id is 6 and four more blocks are prefetched after it, the prefetch stream is updated with a last position read of 6 and a last position prefetched of 10.
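Combining STEPs 316, 318, and 324, the hit path can be sketched as follows in C, reusing the illustrative prefetch_stream_t and blocks_to_prefetch() from the earlier sketches; issue_prefetch() is an assumed stand-in for the block-device request, and the sketch assumes a length of access of 1, so consecutive pattern blocks are one stride apart:

```c
#include <stdint.h>

extern void issue_prefetch(int64_t block_id); /* assumed hook to the I/O layer */

/* Handle a read at block-id b that hits prefetch stream s
 * (STEPs 316, 318, and 324). */
static void stream_hit(prefetch_stream_t *s, int64_t b)
{
    s->hits++;
    s->last_read = b;                 /* part of STEP 324 */

    /* STEP 316: if prefetched-but-unread blocks remain, record the
     * read and issue no new prefetch. */
    if (b != s->last_prefetched)
        return;

    /* STEP 318: all prefetched blocks consumed; fetch the next batch,
     * sized by the exponential ramp sketched earlier. */
    uint32_t n = blocks_to_prefetch(s->hits);
    for (uint32_t i = 0; i < n; i++) {
        s->last_prefetched += s->stride;   /* rest of STEP 324 */
        issue_prefetch(s->last_prefetched);
    }
}
```

Against the example above (stride 1, last read 3, last prefetched 4), a read of block-id 4 falls through to STEP 318, while the earlier example (last read 6, last prefetched 10, new read 8) returns after only updating the last position read.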
Continuing with the process, if the obtained block-id does not correspond to a hit in any existing prefetch stream, the block-id is placed into the access history buffer, and a collinear check is performed against the older block-ids in the history buffer to determine whether a new sequential access pattern exists.
If a new sequential access pattern is found in the history buffer, a prefetch stream is created based on the sequential access pattern (STEP 312). In one or more embodiments of the invention, the prefetch stream is initialized with values for the start position, stride, length of access, and last position read. Those skilled in the art will appreciate that these values may be updated later in the process, and that values such as the last position prefetched may be written before the actual prefetch is executed. The block-ids pertaining to the prefetch stream are removed from the history buffer (STEP 320) so that future collinear checks are simplified, and the next block in the sequential access pattern is prefetched (STEP 322). In one or more embodiments of the invention, only one block is prefetched initially, to minimize the resources consumed by false positives. Once a reliable sequential access pattern is established, more blocks are prefetched at once to speed up the file read. Once the next block is prefetched, the prefetch stream is updated to reflect the last block prefetched (STEP 324).
Within the iteration through the older block-ids in the access history buffer, a distance is computed between the new block-id and each older block-id (STEP 404). This can be done by subtracting the older block-id from the new block-id. For example, the distance between a new block-id of 10 and an older block-id of 6 would be 4. Similarly, the distance between a new block-id of 4 and an older block-id of 9 would be −5. Once the distance between the two block-ids is computed (STEP 404), the previous block-id in the sequence formed by the two block-ids (i.e., the new block-id and the older block-id) is determined (STEP 406). For example, with an older block-id of 6 and a new block-id of 9, the previous block-id in the sequence formed by the two would be 3. Those skilled in the art will appreciate that a check for the next block-id in the sequence formed by a new block-id and an older block-id is not necessary, because the next block-id in the sequence would have to come after the new block-id for the sequence to be valid. For example, the numbers 1, 2, and 3 form a sequential access pattern, but the numbers 1, 3, and 2 do not, because the numbers are out of order.
Next, a check is made to determine whether a previous block-id in the sequence formed by the new block-id and an older block-id exists in the history buffer (STEP 408). If not, a determination is made to see if any block-ids have not been compared with the new block-id (STEP 414). If any block-ids remain to compare, the iteration continues through the older block-ids in the history buffer.
If a previous block-id in the sequence formed by the new block-id and an older block-id is found in the history buffer, a sequential access pattern has been found and a prefetch stream is created based on the three block-ids of the sequential access pattern (STEP 410). As stated above, creating the prefetch stream involves filling in the fields for one or more of the following: start position, stride, length of access, last position read, last position prefetched, and number of hits. Once the prefetch stream is created, the three block-ids corresponding to the prefetch stream are removed (STEP 412), allowing future sequential access patterns to be detected more easily.
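Putting STEPs 404 through 412 together, a C sketch of the collinear check follows; history_contains(), history_remove(), and stream_create() are assumed helpers over the access history buffer and prefetch stream buffer, and prefetch_stream_t is the illustrative record sketched earlier:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

extern bool history_contains(int64_t block_id);          /* assumed */
extern void history_remove(int64_t block_id);            /* assumed */
extern prefetch_stream_t *stream_create(int64_t start,
                                        int64_t stride); /* assumed */

/* Collinear check (STEPs 404-414): for a newly read block-id and each
 * older block-id in the history buffer, test whether the block that
 * would precede them in the same arithmetic sequence was also read. */
static prefetch_stream_t *collinear_check(int64_t new_id,
                                          const int64_t *older, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int64_t stride = new_id - older[i]; /* STEP 404: distance     */
        if (stride == 0)
            continue;
        int64_t prev = older[i] - stride;   /* STEP 406: previous id  */
        if (!history_contains(prev))        /* STEP 408 */
            continue;                       /* STEP 414: keep looking */

        /* STEP 410: three collinear accesses found; create the stream
         * with the pattern's start position and stride. */
        prefetch_stream_t *s = stream_create(prev, stride);

        /* STEP 412: remove the three block-ids so future collinear
         * checks are simplified. */
        history_remove(prev);
        history_remove(older[i]);
        history_remove(new_id);
        return s;
    }
    return NULL; /* no sequential access pattern found */
}
```

With an older block-id of 6 and a new block-id of 9, the sketch computes a stride of 3 and looks for block-id 3 in the history buffer, matching the example above.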
Embodiments of the invention have one or more of the following advantages. Embodiments of the invention provide much better performance when a file is being accessed by concurrent readers that exhibit different sequential access patterns. Further, applications that process matrices or database tables, that otherwise deal with strided offsets, or that work by traversing a file using decreasing block-ids are supported by the prefetch strategy of the present invention.
Further, multi-threaded applications, which deal with multiple streams reading the same file, also benefit from the strategy used in the present invention. For example, in many situations, streaming video servers serve the same file, but find that the clients read linearly from different offsets in the file. When implementing the prefetch strategy of the present invention, it is possible to determine how many streaming video servers are accessing the same set of blocks, and apply techniques to handle multiple applications reading from the same file at the same time, or multiple threads in an application concurrently accessing the same file.
Further, embodiments of the invention provide a prefetch strategy that uses information beyond the block-id, allowing the strategy to track multiple accesses per file and giving it flexibility beyond what would be achieved by one application linearly accessing data. Additionally, the strategies employed by the file system of the present invention allow the file system to achieve optimal performance on a wide variety of applications, including those that have sequential but non-linear access patterns. Accordingly, the performance of High Performance Computing (HPC) applications, which frequently access data in matrices, improves. Database performance is also improved using the prefetch strategy of the present invention, as many databases lay out tables and indices as matrices, or in other sequential but non-linear data structures. Aside from helping applications that use complicated sequential access patterns, the present invention improves the performance of applications that have multiple processes or threads concurrently accessing the same file. The prefetch strategy of the present invention tracks how each separate thread or process accesses a file and optimizes how each individual process or thread reads the particular file. For applications such as streaming media servers, tracking multiple concurrent readers makes it possible to prefetch blocks for each reader, yielding additional performance gains.
The invention may be implemented on virtually any type of computer regardless of the platform being used. For example, as shown in the accompanying figures, a computer system (200) includes a processor, associated memory, a storage device, and numerous other elements and functionality typical of today's computers (not shown).
Further, those skilled in the art will appreciate that one or more elements of the aforementioned computer system (200) may be located at a remote location and connected to the other elements over a network. Further, the invention may be implemented on a distributed system having a plurality of nodes, where each portion of the invention may be located on a different node within the distributed system. In one embodiment of the invention, the node corresponds to a computer system. Alternatively, the node may correspond to a processor with associated physical memory. The node may alternatively correspond to a processor with shared memory and/or resources. Further, software instructions to perform embodiments of the invention may be stored on a computer readable medium such as a compact disc (CD), a diskette, a tape, a file, or any other computer readable storage device.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this disclosure, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as disclosed herein. Accordingly, the scope of the invention should be limited only by the attached claims.
This application claims benefit of U.S. Provisional Application Ser. No. 60/733,442 filed on Nov. 4, 2005, entitled “METHOD AND SYSTEM FOR ADAPTIVE INTELLIGENT PREFETCH” in the names of William H. Moore, Krister M. Johansen, and Jeffrey S. Bonwick.