The present invention generally relates to mass storage devices adapted for use with personal computers, servers, or other host systems. Specifically, the invention discloses a volatile memory-based mass storage device that uses DDR (double data rate) SDRAM (synchronous dynamic random access memory) memory devices running at a low frequency while providing error correction to allow for failure of an entire memory device.
Computers (including personal computers, servers, and other host systems) have evolved to the point where there is an increasing mismatch between the non-volatile storage of a computer and its combined processing power and memory system (comprising system memory). Generally, the non-volatile storage (or mass storage media) of a computer includes at least one tier of storage comprising non-volatile memory technology in the form of one or more hard disk drives (HDD) and/or solid state drives (SSD), which can be connected to the motherboard of the computer using conventional Serial ATA (SATA) or serially attached SCSI (SAS) interfaces or else directly plugged into an expansion slot of the motherboard using PCIe (PCI (peripheral component interconnect) Express) or similar protocols. The combined processing power and memory system of a computer generally encompasses its central processing unit (CPU), including various cache levels, and a graphics processing unit (GPU), including a local frame buffer (LFB). Access latencies and transfer rates of system memory vs. non-volatile storage differ by several orders of magnitude. For example, modern system memory made up of SDRAM (synchronous dynamic random access memory) integrated circuit (IC) components may have initial access latencies of under 50 nanoseconds (nsec), whereas flight time between the request and the actual reading of data from drives of the mass storage media is generally measured in milliseconds (msec). It would, therefore, be desirable to store all data in system memory to grant the processor the fastest possible access.
System memory consumes a substantial amount of power. In addition, data in the system memory need to be refreshed on average every 64 msec, which means that after a refresh interval (tREF) the charge of each capacitor comprising an SDRAM memory cell needs to be read into the sense amplifier, amplified, and then written back to the cell of origin. With increasing memory density, the refresh, during which time no data can be accessed, takes up an increasing share of the entire operational budget of system memory. Even though burst refreshes of several rows and other countermeasures may be taken, it becomes uneconomical to increase the memory space beyond a certain capacity.
In addition, even though error checking and correction (ECC) algorithms have been implemented on system memory, especially in servers, most memory systems are only capable of correcting single bit errors. By contrast, if an entire memory IC component with, for example, an 8-bit wide I/O path fails, the result will be catastrophic failure of the system.
Non-volatile memory solutions, including but not limited to NAND flash-based SSDs, offer a substantially lower cost per bit and have implemented more sophisticated error correction schemes, including low density parity check (LDPC) or Bose-Ray-Chaudhuri-Hocquenghem (BCH) codes. Moreover, the power envelope is substantially below that of DRAM. However, as discussed above, access latencies are orders of magnitude higher, and bandwidth is correspondingly lower, than in DRAM systems.
In view of the above, it would be desirable to have an intermediate storage tier with lower power consumption than that of conventional system memory made up of SDRAM, and better access time and bandwidth than that of non-volatile memory, including NAND flash-based SSDs, while providing the necessary fail-over to compensate for multi-bit errors up to failure of an entire memory IC.
The present invention provides volatile memory-based solid-state mass storage devices adapted for use in a host system as a storage tier that has lower power consumption than that of conventional system memory made up of volatile memory-based IC components, and is capable of faster access times and bandwidths than that of a non-volatile memory-based mass storage device.
According to a first aspect of the invention, the mass storage device comprises a substrate on which is mounted a system interface, a control circuitry, and a plurality of substantially identical random access memory components that define at least one memory array. Each of the memory components of the memory array has associated therewith an input/output path, a width of the input/output path, and a burst length. The mass storage device further comprises means that uses parity information to provide redundancy data sufficient to correct a catastrophic failure of one of the memory components of the memory array. The number of correctable bits to correct the catastrophic failure of one of the memory components equals the product of the width of the input/output path thereof and the burst length thereof.
According to a second aspect of the invention, a method is provided for storing and accessing data from a host computer. The method comprises connecting to the host system a mass storage device that comprises a substrate on which is mounted a system interface, a control circuitry, and a plurality of substantially identical random access memory components that define at least one memory array. Each of the memory components of the memory array has associated therewith an input/output path, a width of the input/output path, and a burst length. Parity information is used to provide redundancy data sufficient to correct a catastrophic failure of one of the memory components of the memory array. The number of correctable bits to correct the catastrophic failure of one of the memory components equals the product of the width of the input/output path thereof and the burst length thereof.
According to a third aspect of the invention, a solid-state mass storage device is provided that is adapted for use in a computer system and for storing data thereof. The storage device includes a substrate on which is mounted a system interface, a control circuitry, and a plurality of substantially identical random access memory components organized in ranks that define at least one memory array. Each of the memory components of the memory array has associated therewith an input/output path, a width of the input/output path, and a burst length. The storage device uses sets of data equaling in size the product of the number of memory components, the input/output width of the memory components, and the burst length of the memory components. The sum of bits of a single transfer of data from a single I/O of each memory comprises a subset of data including redundancy data. The redundancy data are sufficient to correct a catastrophic failure of one of the memory components of the memory array, and a number of correctable bits to correct the catastrophic failure of one of the memory components equals the product of the width of the input/output path thereof and the burst length thereof.
The following describes certain additional and preferred but nonlimiting aspects of the invention.
A particular approach that can be implemented with the invention is for each I/O pad of each memory IC component to have its own data trace to the control circuitry and for all memory IC components to burst data substantially simultaneously. Another approach entails configuring the entire storage device to comprise ranks of memory IC components, wherein each rank bursts a certain number of transfers. As an example, the mass storage device may use seventy-two individual 8-bit wide (×8) DDR3 SDRAM devices as the memory IC components. In this example, the entire storage device can be configured as eight ranks of nine (8 data+1 parity) ×8 memory IC components, wherein each rank bursts eight transfers. All eight ranks are placed on a multi-drop bus and are accessed sequentially by rank-switching on a single read or write request for a combined transfer of an entire sector of 512 Bytes plus parity information for every read or write. In order to achieve chip-fail correction, the data are rearranged such that each quad-word (64 bits) constitutes a single transfer at one I/O pin of each of the seventy-two memory IC components, wherein the additional eight bits are used for parity information.
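As a nonlimiting illustration, the arithmetic of the seventy-two component example above can be checked with a short Python sketch. The constants are taken from the example; the variable names are illustrative only and do not appear in the original description.

```python
# Geometry of the example configuration (values from the text above;
# names are illustrative assumptions).
COMPONENTS = 72      # x8 DDR3 SDRAM memory IC components
RANKS = 8            # ranks on the multi-drop bus
IO_WIDTH = 8         # DQ pins per component
BURST_LENGTH = 8     # transfers per burst

components_per_rank = COMPONENTS // RANKS                # 8 data + 1 parity
bits_per_transfer = components_per_rank * IO_WIDTH       # bus width per transfer
bits_per_rank_burst = bits_per_transfer * BURST_LENGTH   # one rank, one burst
total_bits = bits_per_rank_burst * RANKS                 # all ranks, one request

# Eight of every nine components carry data, the ninth carries parity.
data_bytes = total_bits * 8 // 9 // 8
parity_bytes = total_bits // 8 - data_bytes

# After rearrangement, each 72-bit quad word takes one bit per component.
quad_words = total_bits // COMPONENTS
bits_per_component = IO_WIDTH * BURST_LENGTH

print(components_per_rank, bits_per_transfer, data_bytes, parity_bytes, quad_words)
```

Under these assumptions a single request moves 512 Bytes of data plus 64 Bytes of parity, organized as sixty-four 72-bit quad words, and each component supplies sixty-four bits, one to each quad word.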
Low frequency operation of the memory IC components used in the storage device can be achieved by turning off an internal digital delay lock loop (DLL) built into DDR-SDRAM IC components and synchronizing data to strobe signals only.
A function of the control circuitry is to perform logical re-arrangement of data stored by the storage device. Suitable control circuitry for this purpose includes, but is not limited to, custom field-programmable gate arrays (FPGAs) and custom application specific integrated circuits (ASICs) configured as a state machine. For example, an array of latches of an FPGA can be arranged into different domains and accessed in a time-multiplexed manner over a common I/O bus. The common I/O bus can be configured as a multi-drop bus for the different ranks of memory IC components and also for the different domains of latches in the FPGA. Alternatively, the FPGA can switch internally between the different domains. In preferred embodiments, any quad word including parity information is composed of data of the same I/O pin (DQ) of each memory IC component and the same transfer within the burst.
The I/Os of the memory IC components can be scrambled so that one bit of any one of the eight I/Os of each memory IC component contributes to the quad word, and wherein only a single bit from each memory IC component is part of the quad word. Individual transfers within a burst sequence can be logically scrambled to constitute the quad word using a single bit from each memory IC component for each quad word. Scrambling of both I/Os and burst sequences can be employed, yet data are still arranged such that only a single bit from each memory IC component is part of the quad word. Chip-fail redundancy for the storage device can be achieved by recombining the individual data I/Os in a manner to utilize a single bit from each one of a plurality of substantially identical memory IC components, forming a logical array for a transfer of any quad word including its error checking and correction (ECC) values.
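The constraint described above, that only a single bit from each memory IC component is part of any quad word, can be sketched as a property of the mapping from (component, DQ pin, transfer) to quad word. The following Python sketch is a hypothetical model, not the control circuitry of the invention: the function names, the random permutation used for scrambling, and the "packed" counter-example are assumptions made for illustration. In each mapping the bit position within the quad word is taken to be the component index.

```python
import random

CHIPS, IO_WIDTH, BURST = 72, 8, 8
SLOTS = IO_WIDTH * BURST      # 64 bits delivered by each component per sector
CHIPS_PER_RANK = 9

def straight_map(chip, dq, transfer):
    """Same DQ pin and same transfer of every component form one quad word."""
    return transfer * IO_WIDTH + dq

def make_scrambled_map(seed):
    """Scramble DQ pins and burst positions independently per component."""
    rng = random.Random(seed)
    perms = [rng.sample(range(SLOTS), SLOTS) for _ in range(CHIPS)]
    def scrambled(chip, dq, transfer):
        return perms[chip][transfer * IO_WIDTH + dq]
    return scrambled

def packed_map(chip, dq, transfer):
    """Counter-example: all eight DQ bits of a component in one transfer
    land in the same 72-bit word, as on a conventional ECC memory bus."""
    return (chip // CHIPS_PER_RANK) * BURST + transfer

def one_bit_per_chip(mapping):
    """True if every component contributes exactly one bit to every quad word."""
    for chip in range(CHIPS):
        words = {mapping(chip, dq, t)
                 for dq in range(IO_WIDTH) for t in range(BURST)}
        if words != set(range(SLOTS)):
            return False
    return True

print(one_bit_per_chip(straight_map))           # True
print(one_bit_per_chip(make_scrambled_map(1)))  # True
print(one_bit_per_chip(packed_map))             # False
```

Any per-component permutation of the sixty-four (DQ, transfer) slots preserves the property, which is why scrambling of I/Os, of burst positions, or of both remains compatible with chip-fail redundancy in this model.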
Other aspects and advantages of this invention will be better appreciated from the following detailed description.
The present invention relates to solid-state mass storage devices suitable for use in host systems (including personal computers, servers, etc.), and more specifically to a mass storage system that makes use of a volatile memory-based mass storage device. In the past, this kind of storage device has faced the challenges of extreme cost of acquisition along with a power budget that negated the benefits of such a device. As a compromise, so-called RAM drives were established as partitions of the system memory space. In contrast,
The control circuitry 16 is labeled in
In a first embodiment of the invention, each of the seventy-two memory IC components 22 has its own I/O path to the control circuitry 16, while address and command lines are shared.
During a read from the component 22, one I/O (DQ) pin of each memory IC component 22 outputs a single bit of a data set to the bus connecting the component 22 to its memory controller. Each data set constitutes a 72-bit transfer using standard ECC algorithms like Hamming code-based parity information wherein no memory IC component 22 contributes more than a single bit. Because of this data arrangement, even the complete failure of any single memory IC component 22 can be corrected. Since the embodiment of
At the control circuitry 16, all data are latched into a 512 Bytes (plus parity information) wide array of latches of the control circuitry 16. The entire sector is transferred as a burst of eight transactions (T0-T7) according to the DDR3 burst protocol. For the host system, the latch array can use virtual addressing. Consequently, data can be scrambled not only with respect to the DQ numbers of the memory IC component 22, but also across the burst in that a seventy-two bit data set may contain data from different transactions within the burst, as long as the above mentioned limitation of no more than a single bit read from or written to each memory IC component 22 is satisfied.
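To illustrate why limiting each memory IC component 22 to a single bit per seventy-two bit data set makes a complete component failure correctable, the following Python sketch models one sector with a textbook extended Hamming (72,64) code. The specific code construction, the bit ordering, and all function names are assumptions chosen for illustration; the description above only calls for standard Hamming code-based parity and does not specify these details.

```python
import random

CHECK_POSITIONS = (1, 2, 4, 8, 16, 32, 64)
DATA_POSITIONS = [p for p in range(1, 72) if p not in CHECK_POSITIONS]  # 64 slots

def encode(data):
    """Build a 72-bit word: index 0 is overall parity, 1..71 a Hamming code."""
    word = [0] * 72
    for pos, bit in zip(DATA_POSITIONS, data):
        word[pos] = bit
    for p in CHECK_POSITIONS:
        word[p] = sum(word[pos] for pos in range(1, 72) if pos & p) % 2
    word[0] = sum(word) % 2
    return word

def decode(word):
    """Correct at most one flipped bit and return the 64 data bits."""
    syndrome = 0
    for pos in range(1, 72):
        if word[pos]:
            syndrome ^= pos
    odd = sum(word) % 2
    fixed = list(word)
    if odd and syndrome == 0:
        fixed[0] ^= 1                 # only the overall parity bit was hit
    elif odd and syndrome < 72:
        fixed[syndrome] ^= 1          # single-bit error at position `syndrome`
    elif syndrome:
        raise ValueError("uncorrectable multi-bit error")
    return [fixed[p] for p in DATA_POSITIONS]

def store(sector):
    """Component c holds bit c of each of the 64 quad words."""
    words = [encode(d) for d in sector]
    return [[words[w][c] for w in range(64)] for c in range(72)]

def read(chips):
    return [decode([chips[c][w] for c in range(72)]) for w in range(64)]

def kill(chips, failed):
    """Worst case: every bit returned by the failed component is wrong."""
    damaged = [list(column) for column in chips]
    damaged[failed] = [bit ^ 1 for bit in damaged[failed]]
    return damaged

rng = random.Random(0)
sector = [[rng.randint(0, 1) for _ in range(64)] for _ in range(64)]  # 512 Bytes
chips = store(sector)
print(all(read(kill(chips, c)) == sector for c in range(72)))  # True
```

In this model the failure of any one component corrupts all sixty-four of its bits, yet each quad word sees only a single-bit error, so the whole sector is recovered.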
As previously noted,
Each transfer of a sector starts with a burst of eight transfers of a single rank 24, as shown in
In the case of writing data from the host system to the storage device 10, the same sequence of transfers from sub-domains corresponding to transfers within the burst and domains corresponding to the ranks 24 is maintained.
The invention has been described so far with respect to using ×8 memory IC components 22 and a burst length of eight transactions per access. In this configuration each memory IC component 22 contributes sixty-four bits to each burst transaction of the array 20, which means that catastrophic failure of one component 22 results in the loss of sixty-four bits of data that need to be corrected. Using standard ECC mechanisms, one bit per sixty-four bits of data is correctable if the total transfer is seventy-two bits. Since the storage device 10 can be operated so that each memory IC component 22 only contributes one bit to each burst transaction, a minimum of seventy-two memory IC components 22 is able to achieve chip kill redundancy. As such, the number of correctable bits to correct the catastrophic failure of one of the components 22 equals the product of the width of the input/output path and burst length of the component 22. On this basis, it should be apparent that other configurations are possible as well. For example, it is possible to use ×4 memory IC components 22 and reduce the burst length to four from the default of eight transactions by using DDR3-SDRAM IC components 22 in “burst chop mode.” This results in each IC component 22 outputting or receiving only sixteen bits per read or write access, respectively. As a result, in order to achieve chip kill redundancy, only sixteen bits need to be correctable, which means that the smallest redundant array 20 can be as small as eighteen memory IC components 22. Alternatively, if the seventy-two component array size is maintained, the number of tolerable catastrophic failures increases to four memory IC components 22.
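The sizing argument above can be restated as a short Python sketch. It simply reproduces the proportional reasoning of the preceding paragraph, assuming the 72:64 ratio of a standard ECC word carries over to other configurations; it is not an independent proof of correctability, and the function names are illustrative.

```python
CODEWORD_BITS = 72   # total bits per ECC word
PAYLOAD_BITS = 64    # data bits per ECC word

def bits_lost_per_failure(io_width, burst_length):
    """Bits a single failed component takes with it in one access."""
    return io_width * burst_length

def smallest_redundant_array(io_width, burst_length):
    """Components needed so that each supplies one bit per ECC word,
    scaled by the 72:64 ratio used in the text (an assumption)."""
    lost = bits_lost_per_failure(io_width, burst_length)
    return lost * CODEWORD_BITS // PAYLOAD_BITS

def tolerable_failures(array_size, io_width, burst_length):
    """Independent redundant groups that fit into a given array size."""
    return array_size // smallest_redundant_array(io_width, burst_length)

print(smallest_redundant_array(8, 8))   # x8, burst length 8
print(smallest_redundant_array(4, 4))   # x4, burst chop (length 4)
print(tolerable_failures(72, 4, 4))     # 72 components of x4 in burst chop
```

With these assumptions the sketch yields seventy-two components for the ×8 case, eighteen for the ×4 burst chop case, and four tolerable failures when seventy-two ×4 components are retained, matching the figures stated above.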
The IC components 22 of the mass storage device 10 preferably run at low frequency to minimize power consumption. In preferred embodiments of the invention, the storage device 10 uses DDR3-SDRAM IC components 22, which as known in the art are designed for transfer rates of 800 Mbps and higher. If, for example, a 6 Gbps SATA/SAS interface 14 is used to interface with a host system, it would be of no practical value to increase the maximally sustainable bandwidth between the memory array 20 and the controller beyond the same 6 Gbps interface bandwidth to the host system. For a 64-bit wide memory interface (plus ECC), the per-pin data rate at which the memory array 20 saturates the interface 14 is approximately 94 Mbps. In preferred embodiments, the memory array 20 is configured as a double-sided dual channel array, which increases the effective memory interface width to 128 bits (plus ECC). Accordingly, the DDR3 core frequency only needs to be at 23.5 MHz for fully utilizing the available host interface bandwidth. It is understood that memory arrays as described here are not necessarily operating at 100% bus efficiency. Therefore, it can be desirable to add headroom by increasing the operating frequency of the DDR3-SDRAM components 22. Assuming 60% efficiency in memory bandwidth usage caused by row to row delays and similar other latencies, this is believed to raise the memory target frequency to approximately 40 MHz.
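The frequency estimate above can be reproduced with a brief Python sketch. It follows the text in using the raw 6 Gbps line rate of the host interface and in treating ECC bits as extra width that does not carry host data; the function names and the treatment of efficiency as a simple divisor are assumptions for illustration.

```python
HOST_INTERFACE_BPS = 6e9   # raw SATA/SAS line rate used in the text

def per_pin_rate_mbps(data_width_bits):
    """Per-pin data rate (Mbps) at which the array saturates the host link."""
    return HOST_INTERFACE_BPS / data_width_bits / 1e6

def ddr_clock_mhz(data_width_bits, efficiency=1.0):
    """Clock frequency for a double data rate bus: two transfers per cycle,
    scaled up to compensate for less than ideal bus efficiency."""
    return per_pin_rate_mbps(data_width_bits) / 2 / efficiency

single_channel = per_pin_rate_mbps(64)     # 93.75 Mbps, "approximately 94"
dual_channel = per_pin_rate_mbps(128)      # 46.875 Mbps
clock_ideal = ddr_clock_mhz(128)           # about 23.4 MHz
clock_with_headroom = ddr_clock_mhz(128, efficiency=0.6)  # about 39 MHz

print(single_channel, dual_channel, clock_ideal, round(clock_with_headroom, 2))
```

The results agree with the rounded values of approximately 94 Mbps, 23.5 MHz, and 40 MHz given above.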
The low-frequency operation desired for the IC components 22 used in the storage device 10 can be achieved by turning off an internal digital delay lock loop (DLL) built into DDR3 SDRAM IC components 22, so that data are synchronized to strobe signals only. The DLL of these memory IC components synchronizes the different areas of the die to the same clock signal. Though digital DLLs have a relatively limited operating range, it is possible to turn off the DLL during memory component initialization using a mode register set entry to run the DDR3 SDRAM IC components 22 outside the frequency range supported by the DLL. An external differential clock signal can be supplied to the IC components 22 via the CK and CK# pins, and input and output signals are aligned with the differential strobes according to the DDR3 specifications.
Internally, the control circuitry 16 interfaces with a decode block which decodes the host system requests (packets) and subsequently communicates with a microprocessor, a nonlimiting example of which is a MicroBlaze SoftCore processor, through a buffer. The microprocessor performs ancillary functions of the storage device 10, for example, speed or retry negotiations and other house-keeping functions. The interface of the microprocessor is a fully synchronous processor local bus (PLB) designed for a single clock source for all master and slave devices in the host system. The decode block also interfaces with the memory controller on demand by the host system. Transferred packets can contain commands or data, wherein the contents are identified by a header. In case the packets contain commands, they are forwarded to the microprocessor, whereas actual data received from the host system are sent to a memory controller interface. The routing of data to two different destinations would cause a problem with respect to the clock domains. That is, the control and house-keeping data need to be clock and phase aligned with the PLB clock, whereas storage-bound data are time-critical and synchronized with the host interface. Moreover, the data buffer needs to run at the same clock as the memory array 20, which may or may not run at the same frequency as the PLB clock. In addition, the control and data clocks are generated independently of each other, which, even if they were at the same frequency, would cause phase misalignment.
In view of the above, another aspect of the invention is to circumvent the mismatch and phase skew between the two different clock domains that interfere with the PLB functionality by shadowing the data between two buffers, one designated in
The command buffer is represented in
While the invention has been described in terms of specific embodiments, it is apparent that other forms could be adopted by one skilled in the art. For example, other types of memory IC components may emerge with future technologies. Moreover, alternative bus technologies, for example, time-division or wavelength multiplexing of memory data, may greatly simplify the layout of the circuit board 12 to allow for a single-rank configuration of the full width of the array 20. Therefore, the scope of the invention is to be limited only by the following claims.
This application claims the benefit of U.S. Provisional Application No. 61/559,944, filed Nov. 15, 2011, the contents of which are incorporated herein by reference.
Number | Date | Country
---|---|---
61559944 | Nov 2011 | US