1. Technical Field
This disclosure relates to data storage systems, such as disk drives, for computer systems. More particularly, the disclosure relates to high performance media transport manager architecture for data storage systems.
2. Description of the Related Art
Data storage systems, such as disk drives that comprise solid-state memory, are generally required to provide support for concurrent execution of data storage operations while maintaining coherency and robustness of stored data. Various data storage systems utilize bridge interfaces for accessing solid-state memory. Such a bridge interface may perform some level of basic channel management of solid-state memory and support additional capabilities, such as concurrent execution of multiple commands. When solid-state memory is accessed using a bridge interface, it is important to maximize the transfer and throughput of data communicated across the bridge interface. Thus, there exists a need for an effective architecture for providing data to the bridge interface in a manner that achieves optimal performance of the data storage system.
Systems and methods that embody the various features of the invention will now be described with reference to the following drawings, in which:
While certain embodiments are described, these embodiments are presented by way of example only, and are not intended to limit the scope of protection. Indeed, the novel methods and systems described herein may be embodied in a variety of other forms. Furthermore, various omissions, substitutions and changes in the form of the methods and systems described herein may be made without departing from the scope of protection.
Overview
Data storage systems, such as hard disk drives, hybrid disk drives, and solid state drives, may utilize media transport manager architectures. These architectures can be configured to perform concurrent queuing management of storage access operations received from a host system and/or internally generated by a data storage system. Media transport manager architectures can also be configured to manage different types of media utilized by the data storage system. Different types of media include non-volatile storage, such as magnetic storage, solid-state memory storage, etc., and volatile storage, such as random access memory (RAM). Solid-state memory storage comprises flash integrated circuits, Chalcogenide RAM (C-RAM), Phase Change Memory (PC-RAM or PRAM), Programmable Metallization Cell RAM (PMC-RAM or PMCm), Ovonic Unified Memory (OUM), Resistance RAM (RRAM), NAND memory, NOR memory, EEPROM, Ferroelectric Memory (FeRAM), Magnetoresistive RAM (MRAM), other discrete NVM (non-volatile memory) chips, or any combination thereof.
In some embodiments of the present invention, a data storage system, such as a hybrid disk drive, comprises magnetic media along with solid-state storage, such as NAND memory. Solid-state storage memory can have multiple data and control channels so that multiple storage access operations can be executed in parallel. To achieve optimal performance, the data storage system can employ a media transport manager architecture that fully exploits parallel execution capabilities of the solid-state storage.
In some embodiments of the present invention, the data storage system uses a bridge interface for the solid-state storage. Such a bridge interface may perform some level of basic channel management of solid-state memory and support additional capabilities, such as concurrent execution of multiple commands. This is further disclosed in co-pending patent application Ser. No. 13/226,393, entitled “Systems and Methods for an Enhanced Controller Architecture in Data Storage Systems,” filed on Sep. 6, 2011, the disclosure of which is hereby incorporated by reference in its entirety. When solid-state memory is accessed using a bridge interface, the bridge interface may perform basic signal processing and channel management of solid-state memory, as well as some basic error correction functions and XOR parity accumulation. Such an arrangement can reduce latency associated with accessing solid-state memory and improve the overall performance of the disk drive. Using a media transport manager architecture that fully exploits parallel execution capabilities of solid-state memory and maximizes the transfer and throughput of data communicated across the bridge can result in significant performance improvements.
System Overview
The controller 130 can be configured to receive data and/or storage access commands from a storage interface module 112 (e.g., a device driver) of a host system 110. Storage access commands communicated by the storage interface 112 can include write data and read data commands issued by the host system 110. Storage access commands can be received and processed by the host command processor 132. Read and write commands can specify a logical address (e.g., LBA) used to access the data storage system 120. The controller 130 can execute the received commands in the non-volatile memory module 150 and/or the magnetic storage module 160.
The data storage system 120 can store data communicated by the host system 110. In other words, the data storage system 120 can act as memory storage for the host system 110. To facilitate this function, the controller 130 can implement a logical interface. The logical interface can present the memory of the data storage system 120 to the host system 110 as a set of logical addresses (e.g., contiguous addresses) where host data can be stored. Internally, the controller 130 can map logical addresses to various physical locations or addresses in the non-volatile memory module 150 and/or the magnetic storage module 160.
The controller 130 comprises a media transport manager (MTM) 134, which can be configured to manage storage operations to be executed in the non-volatile memory module 150 and/or the magnetic storage 160. The MTM 134 can be configured to arrange, reorder, prioritize, and provide to the bridge 140 storage access operations communicated by the host system 110 or generated by the data storage system 120. In one embodiment, the MTM 134 can be configured to facilitate optimal overlapping execution of storage access operations in the non-volatile memory module 150.
Media Transport Manager Architecture
In one embodiment, the MTM architecture 300 can be configured so that different streams of storage commands are associated with an appropriate operation. For example, all background storage commands can be routed and handled by the operation 306. This can facilitate determining priorities of various storage commands and making decisions as to the order of execution of commands in the non-volatile memory module 150. In one embodiment, for example, storage access commands associated with background activities operation 306 may typically (but not always) be assigned a lower priority value than storage access commands associated with operations 302 and 304.
Commands are submitted to a submit queue 310. In one embodiment, there is a single submit queue 310, which can be arranged as a FIFO. In other words, storage commands are queued in chronological order. In an alternative embodiment, more than one submit queue can be utilized and storage commands can be queued in any suitable order. Submit queue 310 can be configured to maintain separate queues or lists of different storage command types.
One or more operation handlers 320 handle operations 302, 304, and 306 queued in the submit queue 310. In one embodiment, the MTM architecture 300 ensures that each storage command is handled by an appropriate operation handler 320. In one embodiment, periodic or intermittent tasks can be associated with one of the depicted handlers or be associated with a separate handler. For example, periodically storing or flushing to non-volatile memory of mapping tables can be performed after every 1,000 write storage operations handled by the data storage system. This task can be handled by the operation handler associated with the system operations 304.
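The periodic flush described above can be illustrated with a minimal sketch. The class name `SystemOperationHandler`, its attributes, and the `flush_mapping_tables` placeholder are hypothetical and are not part of the disclosed architecture; only the interval of 1,000 write operations comes from the example in the text.

```python
class SystemOperationHandler:
    """Hypothetical handler that flushes mapping tables every N writes."""

    def __init__(self, flush_interval=1000):
        self.flush_interval = flush_interval
        self.writes_since_flush = 0
        self.flush_count = 0

    def on_write_handled(self):
        # Called once per write storage operation handled by the system.
        self.writes_since_flush += 1
        if self.writes_since_flush >= self.flush_interval:
            self.flush_mapping_tables()

    def flush_mapping_tables(self):
        # Placeholder: a real system would persist the mapping tables
        # to non-volatile memory here.
        self.flush_count += 1
        self.writes_since_flush = 0
```

With the default interval, 2,500 handled writes would trigger two flushes and leave 500 writes counted toward the next one.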
Operation handlers provide storage access commands to a packet generator 330, which converts or divides each particular storage access command into at least one storage access operation or subcommand. For example, a read data storage access command associated with several starting logical addresses can be converted to several storage access operations, each associated with a particular starting logical address of the read data storage access command. In one embodiment, each storage access operation generated by the packet generator 330 is associated with a storage access operation type, such as host system command, system operation, background activity, etc. In an alternative embodiment, the packet generator 330 can be configured to combine several storage access commands into a single operation. For example, several smaller read data storage access commands with overlapping logical addresses can be combined into a single read data storage access operation associated with the entire logical address range of the several read data commands. Although one packet generator 330 is illustrated, those of ordinary skill in the art will appreciate that more than one packet generator can be utilized.
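One possible way to express the two packet generator behaviors, dividing a command and combining overlapping reads, is sketched below. The names `ReadOp`, `split_read`, and `merge_overlapping` are illustrative assumptions, as is the choice to also merge ranges that are exactly adjacent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadOp:
    start_lba: int
    length: int
    op_type: str  # e.g., "host", "system", "background"


def split_read(start_lbas, length, op_type="host"):
    """Divide one read command into one operation per starting address."""
    return [ReadOp(lba, length, op_type) for lba in start_lbas]


def merge_overlapping(ranges):
    """Combine (start, length) ranges that overlap or touch.

    Merging adjacent (touching) ranges is an assumption of this sketch.
    """
    merged = []
    for start, length in sorted(ranges):
        end = start + length
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e - s) for s, e in merged]
```

For instance, reads covering addresses 0 to 7 and 4 to 11 would be combined into a single read of length 12 starting at address 0, while a read starting at address 20 would remain separate.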
In one embodiment, the packet generator 330 provides storage access operations to the bundle manager 340, which generates bundles that comprise one or more storage access operations. The bundle manager 340 generates bundles in accordance with a specific set of predetermined or dynamic (e.g., heuristic) rules. Many different rules can be utilized by the MTM architecture 300, such as a maximum bundle size dictated at least in part by the characteristics of the bridge 140, creation of bundles comprising storage access operations of the same type, and so on. Another rule can be that once a particular function handler begins to provide storage access commands, that function handler is not interrupted until it has finished. This can help to ensure coherency of data stored in the data storage system. The bundle manager 340 can also implement a priority mechanism, whereby different storage access operations are assigned various priority values and bundles are generated to comprise storage access operations of the same or substantially same priorities.
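Two of the rules mentioned above, a maximum bundle size and bundles of a single operation type, can be sketched as follows. The function `build_bundles` and the `(op_type, payload)` tuple representation are assumptions made for illustration; an actual bundle manager may apply different or additional rules.

```python
from collections import defaultdict


def build_bundles(operations, max_bundle_size):
    """Group (op_type, payload) operations into same-type bundles.

    Each bundle holds at most max_bundle_size operations, and the
    submission order is preserved within each type.
    """
    by_type = defaultdict(list)
    for op_type, payload in operations:
        by_type[op_type].append((op_type, payload))

    bundles = []
    for ops in by_type.values():
        for i in range(0, len(ops), max_bundle_size):
            bundles.append(ops[i:i + max_bundle_size])
    return bundles
```

In this sketch, the maximum size would be a parameter derived from the characteristics of the bridge, which is consistent with the rule described in the text but is not specified there.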
Bundle manager 340 provides the bundles to the bridge 140, which executes storage access operations of the bundles in the non-volatile memory module 150. In one embodiment, the bridge 140 provides the bundles to a hardware access layer module 360, which in turn executes or causes the storage access operations to be executed in the non-volatile memory module 150.
Operation handlers 420 provide storage commands to the packet generator 430. The packet generator 430 converts or divides each particular storage access command into at least one storage access operation or subcommand, illustrated as 432. In one embodiment, each storage access operation is associated with a storage access operation type, such as host system command, system operation, background activity, etc. The packet generator 430 provides storage access operations to the bundle manager 440. According to a set of predetermined or dynamic rules, the bundle manager 440 generates bundles (illustrated as 442) that comprise one or more storage access operations. In one embodiment, several storage access operations 432 can be combined into a single storage access operation by the bundle manager 440. For example, several partial non-volatile memory page program operations can be combined or aggregated into a single full page program operation, thereby increasing efficiency and improving performance. As another example, several smaller read data operations with overlapping logical addresses can be combined into a single read data operation associated with the entire logical address range of the several read data operations.
In one embodiment, the bundle manager 440 can also implement a priority mechanism, whereby different storage access operations are assigned various priority values, and bundles are generated to comprise storage access operations of the same or substantially same priorities. In one embodiment, the bundle manager 440 assigns priority values based on the storage access operation type. As is illustrated, a list or queue 444 comprises bundles associated with erase non-volatile memory storage access operations and a list or queue 446 comprises bundles associated with write non-volatile memory storage access operations. If the MTM architecture 300 determines that the non-volatile memory module 150 does not have enough free space to store data requested to be written to the non-volatile memory, the bundle manager 440 may assign a higher priority value to the erase operations bundles than write operation bundles. This way, non-volatile memory space can be freed so that data can be stored in the non-volatile memory module 150. Conversely, if the MTM architecture 300 determines that the non-volatile memory module 150 has enough free space, write operation bundles may be assigned higher priority than erase operation bundles. As another example, storage access operations of background activity type typically (but not always) are assigned a lower priority than other storage access operation types. Accordingly, bundles comprising storage access operations associated with background activities are typically executed in the non-volatile memory module 150 after storage access operations of other types are executed.
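The free-space-dependent ordering of erase and write bundles can be sketched as a simple rule. The function name `assign_priorities`, the page-count parameters, and the numeric priority values are hypothetical; the sketch also fixes background activity at the lowest priority, whereas the text notes this is typical but not universal.

```python
def assign_priorities(free_pages, pages_needed):
    """Return a priority per bundle type; a larger value is more urgent."""
    if free_pages < pages_needed:
        # Not enough free space: erase first so that writes can proceed.
        return {"erase": 2, "write": 1, "background": 0}
    # Enough free space: favor writes over erases.
    return {"write": 2, "erase": 1, "background": 0}
```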
In one embodiment, based on the assigned priority values, the bundle manager 440 provides a highest priority bundle to the bridge 450. Lower priority bundles can be stored (e.g., queued) by the bundle manager for later execution. In one embodiment, the bridge 450 can include a staging list or queue where multiple bundles are stored. As is illustrated, the queue can be configured to hold two entries: the current bundle being executed 452 and the next bundle to be executed 454. Those of ordinary skill in the art will appreciate that the queue can be configured to accommodate more or fewer entries. The bridge 450 can execute the storage access operations of the bundles in the non-volatile memory module 150.
In one embodiment, execution status of storage access operations of the bundle can be provided (e.g., returned by the non-volatile memory module 150) to the bridge. Execution status can comprise an indication of success, an indication of an error, or a combination of the two. In one embodiment, the bridge 450 can be configured to provide the execution status for handling to the bundle manager 440. Execution status of particular storage access operations of the bundle can be provided before other storage access operations of the bundle have been executed. In case the execution status indicates success, the bundle manager can provide successful execution status to the associated operation or client (e.g., 302, 304, or 306). In case the execution status indicates at least one error, the bundle manager 440 can determine whether the at least one error is severe enough to halt execution of the entire bundle. For example, the error may indicate that the non-volatile memory is full, and the bundle manager 440 may decide to terminate execution of a bundle containing write operations.
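The interplay between the priority-ordered pending bundles and the two-entry staging queue can be sketched as follows. The class `BundleDispatcher` and its methods are hypothetical names; the sketch assumes that a bundle, once staged, is not preempted by a later bundle of higher priority, which the text does not specify.

```python
import heapq
from collections import deque
from itertools import count


class BundleDispatcher:
    """Hypothetical model of a bundle manager feeding a bridge staging queue."""

    def __init__(self, staging_depth=2):
        self._pending = []      # heap of (-priority, sequence, bundle)
        self._seq = count()     # tie-breaker that keeps submission order
        self.staging = deque()  # staging[0] is the bundle being executed
        self.staging_depth = staging_depth

    def submit(self, priority, bundle):
        heapq.heappush(self._pending, (-priority, next(self._seq), bundle))
        self._refill()

    def complete_current(self):
        # The bridge finished the current bundle; promote the next one.
        done = self.staging.popleft()
        self._refill()
        return done

    def _refill(self):
        # Move the highest priority pending bundles into free staging slots.
        while self._pending and len(self.staging) < self.staging_depth:
            _, _, bundle = heapq.heappop(self._pending)
            self.staging.append(bundle)
```

A staging depth of two mirrors the illustrated current and next entries; a different depth can be passed to model a queue with more or fewer entries.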
In one embodiment, the MTM architecture 300 may take an action based on the error. For example, the error may correspond to a read error of a non-volatile memory module 150 location. However, a valid copy of host data stored (e.g., cached) in the non-volatile memory module may reside in the magnetic medium 164. In this case, the MTM architecture 300 can instead retrieve the copy of the data from the magnetic medium 164 and provide retrieved data to the operation or client. In one embodiment, the MTM architecture 300 may also provide an error indication to inform the operation or client of the problem encountered during retrieving the data from the non-volatile memory module 150.
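The fallback read described above can be sketched as follows. The function `read_with_fallback` and the two reader callables are hypothetical; the sketch assumes that a non-volatile memory read failure surfaces as an `OSError` and that a valid copy of the data exists on the magnetic medium.

```python
def read_with_fallback(lba, read_nvm, read_magnetic):
    """Return (data, error) for a read of the given logical address.

    read_nvm and read_magnetic are hypothetical callables taking an LBA.
    error is None on success, or the exception raised by the
    non-volatile memory read when the magnetic copy was used instead.
    """
    try:
        return read_nvm(lba), None
    except OSError as exc:
        # Assumption: a valid copy of the cached data resides on the
        # magnetic medium, so it is retrieved from there instead.
        return read_magnetic(lba), exc
```

Returning the error alongside the data corresponds to the embodiment in which the operation or client is also informed of the problem encountered.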
In one embodiment, the MTM architecture 300 is configured to determine and take into account activity levels of its components. This can be accomplished by maintaining a “state” indicator, which can indicate various activity levels, such as “free,” “busy,” “very busy,” and so on. Depending on the activity level of the upstream component, a downstream component can be configured to “back off” and “stage” (e.g., wait) before providing data to the upstream component. As is illustrated in the embodiment shown in
This may be advantageous, for example, when coalescing or combining multiple small storage access commands into a larger, more complete command. For instance, the host system 110 may be issuing a large number of partial page program commands. These commands may typically be handled as separate read-modify-write operations. However, it may be advantageous to wait and coalesce or aggregate multiple partial page program commands into one complete page program command, whose execution would accomplish the same result more efficiently and economically.
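A minimal sketch of coalescing partial page program commands is given below. The function `coalesce_partial_programs`, the tiny `PAGE_SIZE` value, and the `(offset, data)` representation are assumptions chosen for illustration only.

```python
PAGE_SIZE = 8  # hypothetical page size in bytes, kept tiny for illustration


def coalesce_partial_programs(partials, page_size=PAGE_SIZE):
    """Combine (offset, data) partial writes to one page.

    Returns the full page as bytes once every byte is covered, or None
    if the page is still incomplete and staging should continue.
    Later partial writes overwrite earlier ones where they overlap.
    """
    page = bytearray(page_size)
    covered = [False] * page_size
    for offset, data in partials:
        if offset < 0 or offset + len(data) > page_size:
            raise ValueError("partial write falls outside the page")
        page[offset:offset + len(data)] = data
        for i in range(offset, offset + len(data)):
            covered[i] = True
    return bytes(page) if all(covered) else None
```

When the function returns a full page, a single page program operation could be issued in place of several read-modify-write operations.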
In block 506, the process 500 combines storage access operations into bundles. In one embodiment, the process 500 combines storage access operations of the same type into bundles. The bundles are optimized for overlapping execution of storage access operations in the non-volatile memory module 150. In one embodiment, the process 500 creates bundles so that at least some storage access operations of a particular bundle are concurrently executed in the non-volatile memory module 150. In another embodiment, all storage access operations of the particular bundle are concurrently executed in the non-volatile memory module 150.
In block 508, the process 500 determines or assigns priority to the bundles. In one embodiment, the process 500 assigns priorities based on the type of storage access operations in the bundle. As explained above, storage access operations of background activity type typically (but not always) are assigned a lower priority than other storage access operation types. In one embodiment, the process 500 selects in block 510 a bundle having the highest priority and provides that bundle to the bridge 140 for execution in the non-volatile memory module 150. As mentioned above, the bundle can be generated such that at least some (or all) storage access operations of the bundle are executed concurrently.
In block 512, the process 500 receives from the non-volatile memory module 150 execution statuses of each storage access operation in the bundle. In one embodiment, the process 500 receives an execution status of a particular storage access operation as soon as it has been completed. In another embodiment, the bridge 140 delays communicating individual execution statuses, and instead combines multiple execution statuses into one or more messages. This may minimize the traffic over the bridge 140 and, thereby, reduce latency and improve performance.
In block 514, the process 500 receives an execution status for the entire bundle when all storage access operations of the bundle have been executed. In block 516, the process 500 provides a bundle complete indication to a respective operation handler, which in turn causes the operation or client to be notified in block 518. In one embodiment, such notification informs the operation or client that a particular storage access command has been executed.
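The delayed, combined status reporting can be sketched as a small batching helper. The class `StatusBatcher`, its `batch_size` threshold, and the `send` callback are hypothetical; the text does not state what triggers a combined message.

```python
class StatusBatcher:
    """Hypothetical helper that combines execution statuses into messages."""

    def __init__(self, batch_size, send):
        self.batch_size = batch_size
        self.send = send  # callable that receives a list of (op_id, ok)
        self._pending = []

    def report(self, op_id, ok):
        # Hold the status until enough have accumulated for one message.
        self._pending.append((op_id, ok))
        if len(self._pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # Send whatever is pending, e.g., when the bundle completes.
        if self._pending:
            self.send(list(self._pending))
            self._pending.clear()
```

Setting `batch_size` to 1 reproduces the other embodiment, in which each status is communicated as soon as its operation completes.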
Conclusion
To achieve optimal performance, a data storage system can utilize a media transport manager architecture that fully exploits parallel execution capabilities of solid-state memory. Especially when solid-state memory is connected via a bridge interface, the MTM architecture can be configured to optimize the transfer and throughput of data communicated across the bridge. The MTM architecture can support reordering and interleaving of storage access commands by using priority and staging mechanisms. Balanced load of solid-state memory can be achieved, whereby the MTM architecture keeps the bridge busy, and therefore keeps the solid-state memory busy. Improved concurrency and increased performance can be attained.
Other Variations
Those skilled in the art will appreciate that in some embodiments, other types of media transport architectures can be implemented. In addition, the actual steps taken in the disclosed processes, such as the process illustrated in
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the protection. Indeed, the novel methods and systems described herein may be embodied in a variety of other forms. Furthermore, various omissions, substitutions and changes in the form of the methods and systems described herein may be made without departing from the spirit of the protection. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the protection. For example, the systems and methods disclosed herein can be applied to hard disk drives, hybrid hard drives, and the like. In addition, other forms of storage (e.g., DRAM or SRAM, battery backed-up volatile DRAM or SRAM devices, EPROM, EEPROM memory, etc.) may additionally or alternatively be used. As another example, the various components illustrated in the figures may be implemented as software and/or firmware on a processor, ASIC/FPGA, or dedicated hardware. Also, the features and attributes of the specific embodiments disclosed above may be combined in different ways to form additional embodiments, all of which fall within the scope of the present disclosure. Although the present disclosure provides certain preferred embodiments and applications, other embodiments that are apparent to those of ordinary skill in the art, including embodiments which do not provide all of the features and advantages set forth herein, are also within the scope of this disclosure. Accordingly, the scope of the present disclosure is intended to be defined only by reference to the appended claims.