1. Field of the Invention
The present invention relates to data storage systems. More specifically, the present invention relates to tiered storage systems that utilize MAID tiers.
2. Description of the Related Art
As companies create and store more and more data, there is an increasing need for improved data storage systems. Oftentimes, companies create data, store the data, utilize it for different periods of time, and then rarely access the data again. Sometimes, the data is accessed only within a short period of time after it is created.
Tiered data storage systems are utilized by data centers to provide different levels of storage at different levels of speed and cost. Tiered data systems often provide a high tier storage level for data which can be accessed quickly. Though having a quick access time, the high data storage tier is expensive to maintain. Tiered systems also include a low tier data storage system. Low tier data storage is typically implemented with tape drives. Tape infrastructure is less expensive, but has very slow access times. Sometimes, accessing data from a tape drive can take hours or days.
What is needed is an improved method for accessing data that does not rely solely on high tier storage and low tier storage.
The present invention utilizes a fast mount cache provided by any offline storage medium having fast volume mount access. The fast mount cache may be used as the first level in a hierarchical storage configuration after the high performance tier, for data whose access rate is high shortly after creation but decreases sharply as the data ages. This provides the present system with very fast access to large amounts of data that would be impractical to maintain on online hard disk drives because of capacity constraints.
Data migrated from a high performance tier is moved to the fast mount cache and any other tier according to policies implemented by a data storage manager. The fast mount cache may store migrated data from online storage devices and maintain the data by volume. As the fast mount cache capacity fills, or other active or passive events trigger a volume change, the fast mount cache erases volumes according to the storage manager's policies. In this manner, the fast mount cache may create space by erasing volumes of data. While data is maintained on the fast mount cache for a period of time soon after it is migrated, the data may be accessed quickly. After the initial period of time has expired, or other storage policies eliminate fast mount cache volumes, the data exists only on tape or other low tier data storage.
An embodiment for managing data storage in a multitier data storage system begins with migrating data from high performance data storage devices to MAID data storage and tape storage. An event associated with the MAID data storage may be detected. The oldest volume of data in the MAID data storage may be erased in response to the event.
In embodiments, a fast mount cache is provided by any offline storage medium having fast volume mount access. The fast mount cache may be used as the first level in a hierarchical storage configuration after the high performance tier, for data whose access rate is high shortly after creation but decreases sharply as the data ages. This provides the present system with very fast access to large amounts of data that would be impractical to maintain on online hard disk drives because of capacity constraints.
Data migrated from a high performance tier is moved to the fast mount cache and any other tier according to policies implemented by a data storage manager. The fast mount cache may store migrated data from online hard disk drives and maintain the data by volume. As the fast mount cache capacity fills, or other events trigger a volume change, the fast mount cache selects a volume to be erased. In this manner, the fast mount cache may create space by erasing volumes of data. While data is maintained on the fast mount cache for a period of time soon after it is migrated, the data may be accessed quickly. After the initial period of time has expired, or other storage policies eliminate fast mount cache volumes, the data exists only on tape or other low tier data storage.
High performance tier 150 may provide fast access to stored data at a higher cost. High performance tier 150 may be implemented with online high performance disk drives. Data storage manager 160 may communicate with high performance tier 150, fast mount cache 170, and tape storage and low tier 190. Data storage manager 160 may implement policies to migrate data from the high performance tier to lower storage tiers and vice versa. Data storage manager 160 may manage migration, implement policies which determine where data should be stored, and manage the fast mount cache 170. Data storage manager 160 may be implemented on a computing device with one or more modules stored in memory that are executable to implement the functionality described herein, and may be implemented separately from storage devices and systems 150-180 or as part of one or more devices and systems 150-180.
Fast mount cache 170 may include an offline storage medium that provides very fast volume mount characteristics. A fast mount cache may be used for data whose access rates are high shortly after creation but decrease sharply as the data ages. Fast mount cache 170 may be implemented using a massive array of idle disks (MAID) or some other form of offline storage media having a very fast volume mount characteristic. Tape storage or low tier 180 may have low access rates at very low cost. Data stored to tape storage 180 is frequently retained permanently.
Though the fast mount cache is described as being implemented with MAID in some embodiments, the general concept of the present invention may be applied to any form of tiering, and to differing devices within a single tier.
In some embodiments, the fast mount cache may eliminate volumes based on policy implemented by the data storage manager. For example, in the high performance tier 150, storage is allocated, consumed, and managed by file or object. At lower tiers 180 and 190, storage may be allocated, consumed, and managed by volume, where a volume is an aggregation or container of files or objects. Files and objects may be retrieved individually at the lower tiers. Policies may be applied to manage these volumes, for example to select the right location for a volume, to move the contents of a volume into a new volume, or to eliminate some volumes and access their objects elsewhere, based on the trade-off between performance and economics.
An event associated with the fast mount cache tier is detected at step 320. The event may trigger a volume of the fast mount cache to be erased, for example according to policies that erase volumes based on active or passive events and are implemented at the data storage manager. The event may be a detection that the storage used in the fast mount cache has exceeded a threshold, the expiration of a period of time, or some other event that triggers erasing a volume of data in the cache.
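By way of illustration only, the event detection of step 320 might be sketched as follows. The `CacheState` fields, the function name, and the default threshold and retention values are all hypothetical and are not taken from the description; an actual embodiment may use different or additional events.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheState:
    """Hypothetical snapshot of the fast mount cache (all fields are assumptions)."""
    used_bytes: int
    capacity_bytes: int           # assumed to be greater than zero
    oldest_volume_age_s: float    # age of the oldest volume, in seconds


def detect_erase_event(state: CacheState,
                       fill_threshold: float = 0.9,
                       max_age_s: float = 30 * 24 * 3600) -> Optional[str]:
    """Return the name of the triggering event, or None if no erase is needed."""
    # Event 1: the storage used in the cache has exceeded a threshold.
    if state.used_bytes / state.capacity_bytes > fill_threshold:
        return "capacity_threshold_exceeded"
    # Event 2: a period of time has expired for the oldest volume.
    if state.oldest_volume_age_s > max_age_s:
        return "retention_period_expired"
    return None
```

In this sketch the capacity check takes precedence over the time check; that ordering is an arbitrary choice made for the example.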
After detecting an event, the fast mount cache may perform defragmentation of one or more volumes at step 330. Defragmentation may be performed using policies based on events. In some embodiments, the defragmentation may be performed for at least the volume having the oldest data in the fast mount cache, may move files that were retrieved together from an old volume into a new volume, or may be based on other events. The defragmentation may help construct new volumes with a more consistent write history, such that no file is only partially contained in the volume to be erased.
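One possible reading of the defragmentation at step 330 is sketched below, under the assumption that a file may be split into fragments across volumes and that any file only partly stored in the volume to be erased is rewritten whole into a new volume. The data model (a volume as a list of file name and fragment index pairs) and the function name are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed model: a volume is a list of (file_name, fragment_index) entries.
Volume = List[Tuple[str, int]]


def consolidate_straddling_files(volumes: Dict[str, Volume],
                                 victim: str,
                                 new_volume: str) -> Dict[str, Volume]:
    """Move every fragment of any file only partly in `victim` into `new_volume`."""
    in_victim = {name for name, _ in volumes[victim]}
    elsewhere = {name
                 for vol, entries in volumes.items() if vol != victim
                 for name, _ in entries}
    straddling = in_victim & elsewhere  # files split between victim and others

    result: Dict[str, Volume] = {new_volume: []}
    for vol, entries in volumes.items():
        result.setdefault(vol, [])
        for name, idx in entries:
            target = new_volume if name in straddling else vol
            result[target].append((name, idx))
    # Keep each consolidated file's fragments together and in order.
    result[new_volume].sort()
    return result
```

After consolidation, the victim volume in this sketch holds only whole files, so erasing it does not leave a partial file behind in the cache.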
A volume of data in the fast mount cache is erased at step 340. The volume may be erased as part of a first-in, first-out storage strategy, or alternatively as part of a policy-based volume management system. Subsequent data from a high performance tier is migrated to the newly erased volume in the fast mount cache tier at step 350. The erased volume may be used in turn after other volumes are full.
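The first-in, first-out strategy of steps 340 and 350 may be illustrated with the following toy model. The class name, its methods, and the fixed volume count are assumptions made for the example; a policy-based system could select the volume to erase differently.

```python
from collections import deque
from typing import Deque, List, Optional


class FastMountCache:
    """Toy model: a fixed number of volumes, erased oldest-first (FIFO)."""

    def __init__(self, max_volumes: int) -> None:
        self.max_volumes = max_volumes
        self.volumes: Deque[List[str]] = deque()  # oldest volume at the left

    def erase_oldest(self) -> Optional[List[str]]:
        """Erase and return the oldest volume's contents (step 340), or None if empty."""
        return self.volumes.popleft() if self.volumes else None

    def migrate_volume(self, files: List[str]) -> Optional[List[str]]:
        """Write a newly migrated volume (step 350), erasing the oldest first if full."""
        erased = None
        if len(self.volumes) >= self.max_volumes:
            erased = self.erase_oldest()
        self.volumes.append(list(files))
        return erased
```

For example, with `max_volumes=2`, migrating volumes containing `["a"]`, `["b"]`, and `["c"]` in that order erases the volume containing `["a"]` on the third migration; in the described system that data would then remain available only on tape or another low tier.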
If the data is not located on the fast mount cache, the data storage manager identifies the next fastest tier from which the requested data is available at step 440. Identifying the next fastest tier may involve querying a list of tier records ordered by how quickly each tier can provide the data. For example, the record for the next fastest tier after the fast mount cache is queried for the file name first. If the file is not located in that record, the record for the next fastest tier is queried for the file name. Once the next fastest tier is identified, the data is retrieved from that identified tier to the high performance tier at step 450.
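The tier lookup of step 440 may be sketched as a walk over tier records ordered from fastest to slowest. The record layout (a tier name paired with the set of file names it holds) and the function name are hypothetical.

```python
from typing import List, Optional, Set, Tuple

# Assumed layout: (tier_name, file_names_on_that_tier), ordered fastest first.
TierRecord = Tuple[str, Set[str]]


def find_fastest_tier(file_name: str,
                      tier_records: List[TierRecord]) -> Optional[str]:
    """Return the first (fastest) tier whose record lists the file, else None."""
    for tier_name, files in tier_records:
        if file_name in files:
            return tier_name
    return None
```

Because the records are queried in order, the first match is the fastest tier from which the data is available; the retrieval to the high performance tier at step 450 would then be performed against the returned tier.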
The components shown in
Mass storage device 530, which may be implemented with a magnetic disk drive or an optical disk drive, is a non-volatile storage device for storing data and instructions for use by processor unit 510. Mass storage device 530 can store the system software for implementing embodiments of the present invention for purposes of loading that software into main memory 510.
Portable storage device 540 operates in conjunction with a portable non-volatile storage medium, such as a floppy disk, compact disk, or digital video disc, to input and output data and code to and from the computer system 500 of
Input devices 560 provide a portion of a user interface. Input devices 560 may include an alpha-numeric keypad, such as a keyboard, for inputting alpha-numeric and other information, or a pointing device, such as a mouse, a trackball, stylus, or cursor direction keys. Additionally, the system 500 as shown in
Display system 570 may include a liquid crystal display (LCD) or other suitable display device. Display system 570 receives textual and graphical information, and processes the information for output to the display device.
Peripherals 580 may include any type of computer support device that adds functionality to the computer system. For example, peripheral device(s) 580 may include a modem or a router.
The components contained in the computer system 500 of
The foregoing detailed description of the technology herein has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the technology to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. The described embodiments were chosen in order to best explain the principles of the technology and its practical application to thereby enable others skilled in the art to best utilize the technology in various embodiments and with various modifications as are suited to the particular use contemplated. It is intended that the scope of the technology be defined by the claims appended hereto.