A portion of the disclosure of this patent document may contain command formats and other computer language listings, all of which are subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
This invention relates to data storage.
Computer systems are constantly improving in terms of speed, reliability, and processing capability. As is known in the art, computer systems which process and store large amounts of data typically include one or more processors in communication with a shared data storage system in which the data is stored. The data storage system may include one or more storage devices, usually of a fairly robust nature and useful for storage spanning various temporal requirements, e.g., disk drives. The one or more processors perform their respective operations using the storage system. Mass storage systems (MSS) typically include an array of a plurality of disks with on-board intelligence and communications electronics and software for making the data on the disks available.
Companies that sell data storage systems and the like are very concerned with providing customers with an efficient data storage solution that minimizes cost while meeting customer data storage needs. It would be beneficial for such companies to have a way to reduce the complexity of implementing data storage.
A computer-executable method, computer program product, and system for managing metadata within a distributed data storage system including a compute node in communication with a data storage array, the computer-executable method, computer program product, and system comprising: receiving a data I/O from an application executing within the distributed data storage system; creating a first storage system within the compute node, wherein the first storage system is enabled to manage metadata related to the data I/O; and processing the data I/O using the first storage system.
Objects, features, and advantages of embodiments disclosed herein may be better understood by referring to the following description in conjunction with the accompanying drawings. The drawings are not meant to limit the scope of the claims included herewith. For clarity, not every element may be labeled in every figure. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments, principles, and concepts. Thus, features and advantages of the present disclosure will become more apparent from the following detailed description of exemplary embodiments thereof taken in conjunction with the accompanying drawings in which:
Like reference symbols in the various drawings indicate like elements.
Traditionally, distributed data storage systems are tasked with managing larger and larger data sets, often referred to as big data. Generally, to manage big data, a distributed data storage system creates metadata to enable the distributed data storage system to manage, store, and/or access the big data efficiently. Conventionally, as big data grows, so does its associated metadata, which can become unwieldy to manage by itself. Typically, improving the ability to manage metadata within a distributed data storage system would be beneficial to the data storage system industry.
In many embodiments, the current disclosure may enable creation of a distributed data storage system that may be enabled to manage large amounts of metadata associated with big data. In various embodiments, the current disclosure may enable a distributed data storage system to create one or more dynamically created sub-systems to manage metadata within a distributed data storage system. In most embodiments, the current disclosure may enable a distributed data storage system to utilize a storage engine to manage one or more layers (or abstractions) of metadata associated with data. In other embodiments, the current disclosure may enable a distributed data storage system to dynamically add or remove layers and/or levels of abstraction to manage metadata as the amount of metadata changes over time.
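By way of illustration only, the following sketch (in Python, with hypothetical class names and thresholds that are not taken from any embodiment above) shows one way a metadata manager might add or remove a level of abstraction as the amount of metadata it tracks changes over time:

```python
# A minimal sketch, not the disclosed implementation: a manager that keeps
# per-chunk metadata at level 0 and adds a coarser level of abstraction
# (metadata about groups of entries) once the lowest level grows past an
# assumed threshold, dropping it again when it is no longer needed.

class DynamicMetadataManager:
    def __init__(self, group_size=4):
        self.group_size = group_size   # illustrative threshold
        self.entries = {}              # level 0: metadata for individual chunks
        self.group_index = None        # level 1: metadata about groups of entries

    def put(self, key, value):
        self.entries[key] = value
        if len(self.entries) > self.group_size and self.group_index is None:
            self._add_level()
        elif self.group_index is not None:
            group = hash(key) % self.group_size
            self.group_index.setdefault(group, set()).add(key)

    def _add_level(self):
        # Build a coarser index so lookups scan one group instead of everything.
        self.group_index = {}
        for key in self.entries:
            group = hash(key) % self.group_size
            self.group_index.setdefault(group, set()).add(key)

    def remove_level(self):
        # Drop the extra abstraction when the metadata shrinks again.
        self.group_index = None


mgr = DynamicMetadataManager()
for i in range(6):
    mgr.put(f"chunk-{i}", {"array": "450A"})
print(mgr.group_index is not None)  # True: a second level was added
```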
In many embodiments, a distributed data storage system may include one or more zones and/or clusters. In various embodiments, a zone and/or cluster may include one or more compute nodes and one or more data storage arrays. In certain embodiments, a zone and/or cluster may be enabled to communicate with one or more zones and/or clusters in the distributed data storage systems. In most embodiments, a zone and/or cluster may be enabled to manage and/or store data in a chunk format. In various embodiments, chunk format may include file and object storage formats. In other embodiments, chunk format may be portions of data storage of a specified size (e.g., 64 MB/128 MB). In certain embodiments, a zone may be a cluster. In some embodiments, a cluster may be a zone. In certain embodiments, a zone may include one or more clusters.
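As a simplified illustration of storing data in a chunk format of a specified size, the sketch below (the chunk size constant and function name are assumptions for illustration) divides a byte payload into fixed-size chunks:

```python
# Minimal sketch: dividing incoming data into fixed-size chunks, the unit in
# which a zone/cluster stores and tracks data. CHUNK_SIZE is illustrative.

CHUNK_SIZE = 128 * 1024 * 1024  # 128 MB, per the description above

def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield (chunk_index, chunk_bytes) pairs for a byte payload."""
    for offset in range(0, len(data), chunk_size):
        yield offset // chunk_size, data[offset:offset + chunk_size]

# Example with a small chunk size so it runs quickly:
for chunk_id, chunk in split_into_chunks(b"x" * 10, chunk_size=4):
    print(chunk_id, len(chunk))
```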
In most embodiments, a cluster may include one or more compute nodes and one or more data storage arrays. In various embodiments, a compute node may include a storage engine for managing data services, metadata, Quality of Service, and/or communication between one or more of the nodes in the distributed data storage system. In certain embodiments, applications may communicate with a node's storage engine to facilitate use of data storage within the distributed data storage system.
In many embodiments, a storage engine may include one or more layers. In various embodiments, layers within a storage engine may include a transaction layer, index layer, chunk management layer, storage server management layer, partitions record layer, and/or a storage server (Chunk I/O) layer. In certain embodiments, a transaction layer may parse received object requests from applications within a distributed data storage system. In most embodiments, a transaction layer may be enabled to read and/or write object data to the distributed data storage system. In some embodiments, data written to a distributed data storage system may be in a chunk format which may be portions of data storage of a specified size (e.g., 64 MB/128 MB). In many embodiments, an index layer may be enabled to map file-name/data-range to data stored within the distributed data storage system. In various embodiments, an index layer may be enabled to manage secondary indices which may be used to manage data stored on the distributed data storage system.
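The following sketch is one hypothetical way the layer composition described above could be organized. The class names, the placement policy, and the stubbed-out layers are illustrative assumptions rather than the actual storage engine:

```python
# Minimal sketch of a storage engine composed of the layers named above.
# Only the transaction and index layers are stubbed; the remaining layers
# are noted as placeholders to show the composition.

class IndexLayer:
    def __init__(self):
        self.primary = {}      # (file-name, data-range) -> chunk id
        self.secondary = {}    # secondary index, e.g. chunk id -> file names

    def map(self, name, data_range, chunk_id):
        self.primary[(name, data_range)] = chunk_id
        self.secondary.setdefault(chunk_id, []).append(name)

class TransactionLayer:
    def __init__(self, index):
        self.index = index

    def write_object(self, name, data):
        # Parse the object request and record where the object's data lands.
        chunk_id = hash(name) % 1024          # illustrative placement only
        self.index.map(name, (0, len(data)), chunk_id)
        return chunk_id

class StorageEngine:
    def __init__(self):
        self.index_layer = IndexLayer()
        self.transaction_layer = TransactionLayer(self.index_layer)
        # chunk management, storage server management, partitions record,
        # and storage server (chunk I/O) layers would be composed here as well.

engine = StorageEngine()
print(engine.transaction_layer.write_object("photos/cat.jpg", b"\x00" * 10))
```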
In many embodiments, a chunk management layer may manage chunk information, such as, but not limited to, location and/or management of chunk metadata. In various embodiments, a chunk management layer may be enabled to execute per chunk operations. In certain embodiments, a storage server management layer may monitor the storage server and associated disks. In most embodiments, a storage server management layer may be enabled to detect hardware failures and notify other management services of failures within the distributed data storage system. In some embodiments, a partitions record layer may record an owner node of a partition of a distributed data storage system. In many embodiments, a partitions record layer may record metadata of partitions, which may be kept in a B-tree and journal format.
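Purely as an illustration of a partitions record kept in a journal plus B-tree style format, the sketch below records the owner node of a partition, appends metadata updates to a journal, and folds them into a sorted structure used here as a stand-in for a B-tree. All names and structures are assumptions:

```python
# Minimal sketch of a partitions record: owner node plus partition metadata
# kept as an append-only journal and a sorted mapping (B-tree stand-in).
import bisect

class PartitionRecord:
    def __init__(self, partition_id, owner_node):
        self.partition_id = partition_id
        self.owner_node = owner_node
        self.journal = []          # append-only log of metadata updates
        self.sorted_keys = []      # sorted keys (stand-in for a B-tree)
        self.values = {}

    def update(self, key, value):
        self.journal.append((key, value))   # record the update in the journal first

    def checkpoint(self):
        # Fold journaled updates into the sorted structure.
        for key, value in self.journal:
            if key not in self.values:
                bisect.insort(self.sorted_keys, key)
            self.values[key] = value
        self.journal.clear()

record = PartitionRecord(partition_id=7, owner_node="node-2")
record.update("chunk-41", {"location": "array-450A"})
record.checkpoint()
print(record.values["chunk-41"])
```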
In most embodiments, a storage server layer may be enabled to direct I/O operations to one or more data storage arrays within the distributed data storage system. In various embodiments, a chunk manager service may select which storage server may be utilized for received I/O requests. In certain embodiments, a storage server manager service may be utilized to select disks to be utilized on storage servers selected by the chunk manager service. In most embodiments, once a chunk manager service and storage server manager service have initially processed an I/O request, a transaction layer may be enabled to access one or more storage servers based on the chunk manager service and/or storage server manager service directives.
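As an illustration of the request flow described above, the sketch below routes a chunk write through a hypothetical chunk manager service (selects a storage server), a storage server manager service (selects disks on that server), and a transaction layer that uses those directives. The selection policies are placeholders, not the claimed method:

```python
# Minimal sketch of routing an I/O request through the services named above.

class ChunkManagerService:
    def __init__(self, servers):
        self.servers = servers

    def select_server(self, chunk_id):
        return self.servers[chunk_id % len(self.servers)]   # illustrative policy

class StorageServerManagerService:
    def select_disks(self, server, copies=2):
        healthy = [d for d in server["disks"] if d["healthy"]]
        return healthy[:copies]                              # illustrative policy

class TransactionLayer:
    def __init__(self, chunk_mgr, server_mgr):
        self.chunk_mgr = chunk_mgr
        self.server_mgr = server_mgr

    def write_chunk(self, chunk_id, payload):
        server = self.chunk_mgr.select_server(chunk_id)
        disks = self.server_mgr.select_disks(server)
        return {"server": server["name"],
                "disks": [d["name"] for d in disks],
                "bytes": len(payload)}

servers = [{"name": "ss-1", "disks": [{"name": "d0", "healthy": True},
                                      {"name": "d1", "healthy": True}]},
           {"name": "ss-2", "disks": [{"name": "d0", "healthy": True}]}]
txn = TransactionLayer(ChunkManagerService(servers), StorageServerManagerService())
print(txn.write_chunk(chunk_id=5, payload=b"..."))
```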
Refer to the example embodiment of
Refer to the example embodiment of
Refer to the example embodiment of
Refer to the example embodiment of
In this embodiment, storage system 440A is in communication with storage server layer 440 and storage system 440B. Storage system 440B is in communication with storage system 440A and storage server layer 440, and is enabled to connect to storage system 440N. Storage system 440N is in communication with storage system 440B and storage server layer 440. In various embodiments, a storage engine may include multiple storage systems which may be enabled to manage portions of the data managed within a given node. In this embodiment, storage system 440A is enabled to store data on data storage array 450A and/or data storage array 450B. Storage system 440A is enabled to manage a portion of the data it manages using storage system 440B. Storage system 440B is enabled to store data on data storage array 450A and data storage array 450B. Storage system 440B is enabled to manage a portion of the data it stores using storage system 440N. In many embodiments, one or more storage systems may be used to divide data storage within a distributed data storage system into smaller, more manageable portions.
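For illustration only, the sketch below models a chain of storage systems (labeled 440A/440B/440N after the figure) in which each system manages part of the data it is responsible for directly and delegates the remainder to the next system in the chain. The capacities and hand-off policy are assumptions, not the disclosed behavior:

```python
# Minimal sketch of storage systems that delegate a portion of the data they
# manage to the next storage system in a chain.

class StorageSystem:
    def __init__(self, name, capacity, delegate=None):
        self.name = name
        self.capacity = capacity      # entries this system manages itself (assumed)
        self.delegate = delegate      # next storage system in the chain, if any
        self.entries = {}

    def store(self, key, value):
        if len(self.entries) < self.capacity or self.delegate is None:
            self.entries[key] = value
            return self.name
        return self.delegate.store(key, value)   # hand off the remainder

system_n = StorageSystem("440N", capacity=100)
system_b = StorageSystem("440B", capacity=2, delegate=system_n)
system_a = StorageSystem("440A", capacity=2, delegate=system_b)

for i in range(5):
    owner = system_a.store(f"chunk-{i}", {"array": "450A"})
    print(f"chunk-{i} managed by {owner}")
```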
Refer to the example embodiment of
Refer to the example embodiment of
Refer to the example embodiments of
The methods and apparatus of this invention may take the form, at least partially, of program code (i.e., instructions) embodied in tangible non-transitory media, such as floppy diskettes, CD-ROMs, hard drives, random access memory or read-only memory, or any other machine-readable storage medium.
The logic for carrying out the method may be embodied as part of the aforementioned system, which is useful for carrying out a method described with reference to embodiments shown in, for example,
Although the foregoing invention has been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes and modifications may be practiced within the scope of the appended claims. Accordingly, the present implementations are to be considered as illustrative and not restrictive, and the invention is not to be limited to the details given herein, but may be modified within the scope and equivalents of the appended claims.
This Application claims priority from U.S. Provisional Patent Application Ser. Nos. 61/988,603 entitled “DISTRIBUTED DATA STORAGE MANAGEMENT” and 61/988,796 entitled “ZONE CONSISTENCY” filed on May 5, 2014, the content and teachings of which are hereby incorporated by reference in their entirety. This Application is related to U.S. patent application Ser. No. 14/319,349 entitled “DISTRIBUTED DATA STORAGE MANAGEMENT”, Ser. No. 14/319,360 entitled “DISTRIBUTED METADATA MANAGMENT”, Ser. No. 14/319,378 entitled “DISTRIBUTED DATA STORAGE MANAGEMENT”, Ser. No. 14/319,383 entitled “DATA BACKUP MANAGEMENT ON DISTRIBUTED STORAGE SYSTEMS”, Ser. No. 14/319,113 entitled “ZONE CONSISTENCY”, and Ser. No. 14/319,117 entitled “ZONE CONSISTENCY” filed on even date herewith, the teachings of which applications are hereby incorporated herein by reference in their entirety.
Number | Date | Country
---|---|---
61/988,603 | May 2014 | US
61/988,796 | May 2014 | US