Some non-volatile storage devices have a block device interface, which is an application program interface that allows block (or sector) access to data stored in the storage device. The host device typically uses a file system (e.g., FAT, NTFS, or EXT3) to provide user-friendly access and management of data. A file system can include information about the addresses where data is stored in the storage device. With such information, the host device can organize the data and provide efficient ways of accessing and updating data. A file system is often tuned for specific needs of the host device and is deeply integrated with the host device's operating system.
Unfortunately, the storage device typically does not have any information about how the file system is managing the data stored in the storage device. So, the storage device responds to a read or write command from a host device by reading or writing data specified by the block address in the command without any knowledge of what the data is or how it relates to other data stored in the storage device. Therefore, the storage device is unable to optimize data handling to fit the needs of the host device's operating system and file system. For example, after several erases and writes to the storage device (e.g., when updating files), data from a file can become fragmented across the storage device instead of being stored in nearby memory sectors. This can lead to an inefficient use of the storage device.
Embodiments of the present invention are defined by the claims, and nothing in this section should be taken as a limitation on those claims.
By way of introduction, the below embodiments relate to a method and storage device for using file system data to predict host device operations. In one embodiment, a storage device is disclosed having a first memory storing data and file system metadata, a second memory, and a controller. In response to receiving a command from the host device to read a first address in the first memory, the controller reads data from the first address in the first memory and returns it to the host device. The controller then predicts a second address in the first memory to be read by a subsequent read command from the host device, reads the data from the predicted second address, and stores it in the second memory. Other embodiments are possible, and each of the embodiments can be used alone or together in combination. Accordingly, various embodiments will now be described with reference to the attached drawings.
The following embodiments relate to a method and storage device for using file system data to predict host device operations. As mentioned above, because a storage device typically does not have any information about how the file system on the host is managing data stored in the storage device, the storage device is unable to optimize data handling. These embodiments recognize that it would be useful if the storage device could understand the host device's file system management to optimize internal storage device memory management and provide better performance to the host. Before turning to these and other embodiments, the following section provides a discussion of exemplary host and storage devices that can be used with these embodiments. Of course, these are just examples, and other suitable types of host and storage devices can be used.
Turning now to the drawings,
As shown in
The memory 120 can take any suitable form. In one embodiment, the memory 120 takes the form of a solid-state (e.g., flash) memory and can be one-time programmable, few-time programmable, or many-time programmable. However, other forms of memory, such as optical memory and magnetic memory, can be used. In this embodiment, the memory 120 comprises a public memory area 125 that is managed by a file system on the host 50 and a private memory area 136 that is internally managed by the controller 110. The private memory area 136 can store data, such as but not limited to content encryption keys (CEKs) and firmware (FW) code. The public memory area 125 can store user data and other data. The public memory area 125 and the private memory area 136 can be different partitions of the same memory unit or can be different memory units. The private memory area 136 is “private” (or “hidden”) because it is internally managed by the controller 110 (and not by the host's controller 160).
Turning now to the host 50, the host 50 comprises a controller 160 that has a storage device interface 161 for interfacing with the storage device 100. The controller 160 also comprises a central processing unit (CPU) 163, an optional crypto-engine 164 operative to provide encryption and/or decryption operations, random access memory (RAM) 165, read only memory (ROM) 166, a security module 171, and storage 172. The storage device 100 and the host 50 communicate with each other via the storage device interface 161 and a host interface 112. For operations that involve the secure transfer of data, it is preferred that the crypto-engines 114, 164 in the storage device 100 and host 50 be used to mutually authenticate each other and provide a key exchange. After mutual authentication is complete, it is preferred that a session key be used to establish a secure channel for communication between the storage device 100 and host 50. Alternatively, crypto-functionality may not be present on the host side, where authentication is done only using a password. In this case, the user types a password into the host device 50, and the host device 50 sends it to the storage device 100, which then allows access to the public memory area 125. The host 50 can contain other components (e.g., a display device, a speaker, a headphone jack, a video output connection, etc.), which are not shown in
In some environments, the host device 50 is operable to render content stored in the storage device 100. As used herein, “content” can take any suitable form, including, but not limited to, a song, a movie, a game, an application (“app”), a game installer, etc. Depending on the type of content, “render” can mean playing (e.g., when the content is a song or movie), deciphering (e.g., when the content is a game installer), or whatever action is needed to “enjoy” the content. In some embodiments, the host device 50 contains the necessary software to render the content (e.g., a media player), whereas, in other embodiments, such software is provided to the host device 50 by the storage device 100 or another entity.
Returning to the drawings,
In operation, the host device 50 reads the file system metadata 126 from the storage device 100 and uses this metadata 126 to find the physical locations of data belonging to files stored in the storage device 100, as well as the allocation of free space in the user area that can be used to store additional files. When a user of the host device 50 requests a file stored on the storage device 100, the host device 50 locates the addresses of the sectors of the file from the metadata 126 and then sends several read commands (each with a different address) to read all of the data sectors that make up the file. When the storage device 100 receives a read command, it fetches the sector of data specified by the address in the command. Instead of sending the data directly to the host device 50, the storage device 100 can store the data in its internal RAM 115 for data processing. For example, the storage device 100 can perform error detection and correction on the data, decrypt the data (if it is stored in encrypted form), encrypt (or re-encrypt) data before it is sent to the host device 50, or perform other actions specified by the memory management system of the storage device 100. After the storage device 100 completes its processing of the data, it sends the data from the RAM 115 to the host device 50 via the interface 112. Similarly, when data is sent to the storage device 100 for storage, the data may be first stored in the RAM 115 to perform data processing (e.g., generating an error correction code, encrypting the data, etc.) before the data is stored in the memory.
This embodiment recognizes that it would be useful if the storage device 100 could understand the host device's file system management to optimize internal storage device memory management and provide better performance to the host device 50. Specifically, this embodiment takes advantage of the fact that a given file system specifies where the file system metadata is to be stored (e.g., in the FAT32 file system, the file system metadata is stored in the first sectors of the user area). The controller 110 in the storage device 100 can run firmware 128 to read and parse this file system metadata 126 in an attempt to predict the sectors that the host device 50 will likely read next. This parsing can be done in a similar way as the host device 50 parses the file system metadata 126. Because the controller 110 is typically more resource limited than the host device's controller 160, the controller 110 may not be able to parse the metadata as extensively as the host device's controller 160. Nevertheless, with currently-available storage device controller technology, the controller 110 can extract some information from the file system metadata 126 for memory management optimization purposes, as will be discussed below.
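By way of illustration only, the following Python sketch shows one way a FAT32-style allocation table could be decoded and followed to recover the ordered cluster map of a file. The function names `parse_fat` and `cluster_chain` are hypothetical and do not come from the firmware 128; the sketch assumes the raw table has already been read from the first sectors of the user area and that the first cluster of the file is known from a directory entry.

```python
import struct

FAT32_MASK = 0x0FFFFFFF     # the upper 4 bits of a FAT32 entry are reserved
FAT32_EOC_MIN = 0x0FFFFFF8  # entries at or above this value mark end of chain


def parse_fat(raw: bytes) -> list:
    """Decode a raw FAT32 table into a list of next-cluster entries."""
    count = len(raw) // 4
    entries = struct.unpack(f"<{count}I", raw[:count * 4])
    return [entry & FAT32_MASK for entry in entries]


def cluster_chain(fat: list, first_cluster: int) -> list:
    """Follow the chain that starts at first_cluster (hypothetical helper).

    Stops at an end-of-chain marker, at an out-of-range entry, or when a
    cluster repeats, so that a corrupted table cannot cause an endless loop.
    """
    chain = []
    seen = set()
    cluster = first_cluster
    while 2 <= cluster < len(fat) and cluster not in seen:
        chain.append(cluster)
        seen.add(cluster)
        cluster = fat[cluster]
        if cluster >= FAT32_EOC_MIN:
            break
    return chain
```

The resulting list is the order in which the host device 50 would be expected to read the clusters of the file, which is the information a resource-limited controller needs for prediction.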
By parsing the file system metadata 126 to predict which sectors the host device 50 will likely read next, the storage device 100 can “pre-fetch” data that it thinks the host device will need and store the data in RAM 115 (and even start the processing of the data when in RAM 115). That way, if the host device 50 later, in fact, sends a read command for the pre-fetched sector, the data will be ready to be provided to the host device 50 faster than if the pre-fetch did not occur. In one embodiment, the controller 110 performs the file system metadata parsing and pre-fetching data while the data from the previous read command is being processed in RAM 115 and/or while the storage device 100 is waiting for the host device 50 to send the next command.
Turning again to the drawings,
When the storage device 100 receives the read command from the host device 50 (act 300), the controller 110 in the storage device 100 determines if the read is for a prepared cluster of sectors (act 310). A “prepared cluster of sectors” is a set of sectors that were previously identified by parsing the file system metadata 126. (Typically, a file system operates with data units called clusters, where every cluster can include one or more sectors. For simplicity, the term “sector,” as used herein, can refer to a single sector, a part of a cluster, a full cluster, or multiple clusters.) If the read command is not for a prepared cluster, the relevant file system metadata 126 has not yet been parsed and a next-accessed sector has not yet been predicted. So, in addition to executing the read command, the controller 110 takes the opportunity to collect and parse the relevant file system metadata 126 to predict what the next-accessed sector will be. Specifically, the controller 110 in the storage device 100 attempts to find, in the file system sectors that hold the file system metadata 126, the information that fits the read sector (act 320). The controller 110 can search the file system metadata 126 for the address specified in the read command and then parse the metadata 126 for information, such as, but not limited to, the file name, the file length, and the file cluster map (act 330). As will be explained in more detail below, with this information, the controller 110 can predict which sectors the host device 50 will want to read next and pre-fetch the data. Next, the controller 110 reads the sectors requested by the read command from the host device 50 (act 340). As discussed above, these read sectors can be stored in RAM 115 for processing before sending them to the host device 50.
Next, the read sectors are sent to the host device (act 350). Before, during, or after that act, the controller 110 can prepare the next sectors specified by the file system metadata 126 for the next read command (act 360). For example, if the file system metadata 126 indicates that a certain file is stored in sectors 2, 5, 4, 1, and 3 (in that order), and the host device 50 requested a read of sector 2, by parsing the file system metadata 126, the controller 110 can predict that sector 5 is the next sector that the host device 50 will want to read (because sector 5 is the next sector in the file). So, during the processing (e.g., error correction, decryption, etc.) of the data from sector 2 or while waiting for the next command from the host device 50, for example, the controller 110 can read the data from sector 5 and store it in RAM 115 so it is ready to go when the host device 50 requests it.
So, going back to the top of the flow chart, if the host device 50 later requests to read sector 5 (act 300), the controller 110 would know that sector 5 was in the prepared cluster (act 310) and would return the data from sector 5 to the host device 50 (act 350). Again, this improves performance of the storage device 100 because the requested data is ready to be sent to the host device 50 upon receipt of the read command for the data (or at the very least, the data is already read from memory, even if the processing of the data is not yet finished). The controller 110 then looks for the next sector to pre-fetch (in this example, sector 4) (act 360), and the process described above repeats until the end of the file is reached (in this example, until sector 3 is read) (act 370). If it turns out that the controller's prediction is incorrect and the host device 50 requests a sector different from the pre-fetched one, the data stored in RAM 115 from the pre-fetched sector can be discarded.
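The flow described above can be sketched in Python as follows. This is a simplified, single-threaded model under stated assumptions, not the actual firmware: the class `PrefetchingController`, the dictionary `flash` (standing in for the memory 120), the dictionary `prepared` (standing in for RAM 115), and the `hits` counter are all hypothetical names, and the file cluster maps are assumed to have already been parsed from the file system metadata 126. In a real controller the pre-fetch would overlap with data processing or idle time rather than run inline.

```python
class PrefetchingController:
    """Toy model of the read flow with next-sector prediction."""

    def __init__(self, flash, file_maps):
        self.flash = flash          # sector -> data (stand-in for memory 120)
        self.file_maps = file_maps  # ordered sector lists, one per file
        self.prepared = {}          # pre-fetched sector -> data (stand-in for RAM 115)
        self.hits = 0               # reads served from pre-fetched data

    def _predict_next(self, sector):
        """Return the sector that follows `sector` in its file, or None."""
        for sectors in self.file_maps:
            if sector in sectors:
                i = sectors.index(sector)
                return sectors[i + 1] if i + 1 < len(sectors) else None
        return None

    def read(self, sector):
        if sector in self.prepared:           # act 310: prepared cluster?
            data = self.prepared.pop(sector)
            self.hits += 1
        else:
            data = self.flash[sector]         # act 340: read from memory
        self.prepared.clear()                 # discard any mispredicted data
        nxt = self._predict_next(sector)      # acts 320/330/360
        if nxt is not None:                   # act 370: None at end of file
            self.prepared[nxt] = self.flash[nxt]
        return data                           # act 350: send to host
```

With a file stored in sectors 2, 5, 4, 1, and 3, reading those sectors in order gives one miss (sector 2) followed by four reads served from pre-fetched data; a read of an unexpected sector simply discards the pre-fetched data and falls back to a normal read.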
There are several advantages associated with this embodiment. For example, by predicting and pre-fetching the next read sector, this embodiment can provide better performance of the storage device 100. As noted above, storing the data from the predicted address in RAM 115 reduces the time needed to retrieve and return the data to the host device 50. This embodiment also allows this benefit to be realized without any modification to the host device 50 (e.g., no special host operations are needed), and the storage device 100 does not need to store any additional information about the data, as all of the information needed is contained in the conventional file system metadata 126.
There are several alternatives that can be used with these embodiments. For example, the prediction algorithm can be improved by taking into account the sectors that were read before the current data access. The controller 110 can be configured to remember the sectors read by the host device 50 and can limit the search in the file system metadata 126 to those read sectors. The last sector that was read is especially useful. Because the storage device 100 is accessed at sector granularity, the firmware 128 may not be able to determine exactly which file is accessed, but this information can significantly decrease search time. So, with reference to
As another alternative, these embodiments can be used for write commands in addition to or instead of read commands. During a write operation, the host device 50 looks for free space. After the first write operation, the controller 110 can parse the file system metadata 126 and detect whether the next cluster is free. If the next cluster is not free, the controller 110 can find the next available cluster and prepare the write operation before receiving the write command from the host device 50.
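As a minimal sketch of the write-side prediction, the following Python function scans a FAT-style allocation table for the next free cluster after the one that was just written. The function name `predict_next_write` is hypothetical, the table is assumed to be a list in which a value of 0 marks a free cluster (as in FAT file systems), and the wrap-around to the start of the data area is an assumption of this sketch rather than something the description specifies.

```python
FREE = 0  # in a FAT-style table, an entry of 0 marks a free cluster


def predict_next_write(fat, last_written):
    """Return the cluster likely to be written next, or None if none is free.

    Scans forward from the cluster after `last_written`, then wraps around
    to cluster 2 (the first data cluster). The wrap-around is an assumption.
    """
    candidates = list(range(last_written + 1, len(fat))) + list(range(2, last_written))
    for cluster in candidates:
        if fat[cluster] == FREE:
            return cluster
    return None
```

If the cluster immediately after the last written one is free, it is returned directly; otherwise the scan continues until a free cluster is found, which lets the controller prepare that location ahead of the next write command.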
In another alternative embodiment, instead of or in addition to using the predicted next address to pre-fetch data to be returned to the host device 50, the controller 110 can use the knowledge of the predicted next address to adjust and optimize a memory management function, such as, but not limited to, garbage collection, read scrubbing, compaction, folding, updating a translation table, and wear leveling. For example, the controller 110 can avoid performing a memory management function (e.g., wear leveling) on an address that will be accessed in the near future. As another example, based on the knowledge of the predicted address, the controller 110 can perform garbage collection to make the retrieval of the data from the predicted address faster.
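One simple way to picture the first example is as a filter over pending maintenance work. The Python sketch below is illustrative only: the function name `schedule_maintenance` and the idea of representing pending work as a list of addresses are assumptions made for this example, not details of the controller 110.

```python
def schedule_maintenance(pending, predicted_addresses):
    """Split pending maintenance targets into run-now and deferred lists.

    Addresses that the host is predicted to access soon are deferred so
    that a background operation (e.g., wear leveling) does not delay the
    expected host access. Order within each list is preserved.
    """
    predicted = set(predicted_addresses)
    run_now = [addr for addr in pending if addr not in predicted]
    deferred = [addr for addr in pending if addr in predicted]
    return run_now, deferred
```

For instance, with pending targets 1, 5, 7, and 4 and predicted addresses 5 and 4, the function returns 1 and 7 to be handled now and 5 and 4 to be handled later.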
As yet another alternative, these embodiments can be used to manage security file properties. For example, if the storage device 100 includes a secure platform that allows protection of files (e.g., the TrustedFlash™ platform by SanDisk Corporation), this embodiment can be used to locate the appropriate key to decrypt requested data after access to the data is granted. More specifically, when a file is accessed, the secure platform can use this embodiment to determine the file name for every read operation. The key is bound to a specific file name, and, if access is allowed to the specific file, the secure platform can decrypt the content with the corresponding key and send the decrypted content to the host device 50. So, this alternative avoids the need for the host device 50 to send a special command to the storage device 100 to indicate which key to use to decrypt the file because the controller 110 can use the file system information to determine which key should be used to decrypt the file.
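The key selection described above can be sketched as a lookup from the read sector to a file name and from the file name to a key. The Python function below is a hypothetical illustration: `select_key`, the `file_maps` dictionary (file name to sector list, assumed to be derived from the file system metadata 126), the `keys` dictionary, and the `granted` set are all assumptions of this sketch, and no actual decryption or secure-platform interface is modeled.

```python
def select_key(sector, file_maps, keys, granted):
    """Return the key bound to the file that owns `sector`, or None.

    file_maps: file name -> list of sectors belonging to that file
    keys:      file name -> key bound to that file
    granted:   set of file names for which access has been granted
    """
    for name, sectors in file_maps.items():
        if sector in sectors:
            if name in granted:
                return keys.get(name)
            return None  # file found, but access was not granted
    return None          # sector does not belong to any known file
```

Because the file name is derived from the sector address in the read command, the host device 50 does not need to identify the key explicitly in this model.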
It is intended that the foregoing detailed description be understood as an illustration of selected forms that the invention can take and not as a definition of the invention. It is only the following claims, including all equivalents, that are intended to define the scope of the claimed invention. Finally, it should be noted that any aspect of any of the preferred embodiments described herein can be used alone or in combination with one another.