Claims
- 1. A method of storing data in a storage system, the method comprising the steps of:
processing a data block to generate an address which is determined as a function of the contents of the data block; and storing the data block in the system in a memory location identified by the address; wherein the processing and storing steps provide write-once storage of the data block in the system such that the contents of the data block are not modifiable without also altering the address of the data block determinable in the processing step.
- 2. The method of claim 1 wherein the processing and storing steps provide archival data storage of the data block.
- 3. The method of claim 1 wherein a request to store the data block is generated by a client device and the processing and storing steps are implemented in a server coupled to the client device.
- 4. The method of claim 1 wherein the processing step further comprises determining a substantially unique identifier of the data block.
- 5. The method of claim 4 wherein the substantially unique identifier is determined by applying a collision-resistant hash function to the contents of the data block.
- 6. The method of claim 4 wherein the address is determined from the substantially unique identifier by utilizing the identifier to perform a lookup of the address in an index.
- 7. The method of claim 4 wherein in conjunction with a request from a client device for retrieval of the stored data block, the client device and a server which stores the data block each recompute the substantially unique identifier from the contents of the retrieved data block in order to verify integrity of the retrieved data block.
- 8. The method of claim 3 wherein the server comprises a data block cache configured so as to permit retrieval of the data block from the server without requiring retrieval of the data block from the memory location corresponding to the address determined as a function of the contents of the data block.
- 9. The method of claim 3 wherein the server comprises an index cache configured so as to permit retrieval of the data block from the server without requiring a search through an index that specifies a mapping between a substantially unique identifier of the data block and the memory location address.
- 10. The method of claim 1 further including the steps of detecting duplicate write operations for the given data block and then performing at least one of: (i) deleting one or more of said duplicate write operations; and (ii) combining said duplicate write operations into a single write operation.
- 11. The method of claim 1 wherein multiple client devices of the system write data blocks using the same block size and alignment.
- 12. The method of claim 1 wherein the storing step further comprises storing the data block in an append-only data log in a storage element of the system.
- 13. The method of claim 12 wherein the data log comprises a plurality of storage areas, each of the areas comprising a data block section in which data blocks are stored in an append-only manner.
- 14. The method of claim 13 wherein when a given one of the areas is filled, the area is marked as sealed, and a substantially unique identifier is computed for the contents of the area.
- 15. The method of claim 6 wherein the index comprises a plurality of buckets, each bucket comprising an index map for a portion of a space defined by possible values of the substantially unique identifiers, the index map having a plurality of entries each providing a mapping between a given one of the substantially unique identifiers and a corresponding memory location address in a storage element of the system.
- 16. The method of claim 15 wherein the substantially unique identifiers are distributed across the buckets in a substantially uniform manner by application of a hash function to the identifiers, the output of the hash function being used to determine a particular one of the buckets that will include an index map entry for the identifier.
- 17. The method of claim 1 wherein the storing step comprises storing the data block at the memory location in a RAID device having a plurality of disk drives.
- 18. The method of claim 1 wherein the processing step further comprises forming a plurality of pointer blocks each comprising a plurality of substantially unique identifiers for a corresponding plurality of data blocks stored in the system, the pointer block itself being subject to the processing and storing steps in a recursive manner until a single unique identifier of a root of a tree of pointer blocks is obtained.
- 19. An apparatus for storing data in a storage system, the apparatus comprising:
a server having at least one processor coupled to a memory, the server being operative to process a data block to generate an address which is determined as a function of the contents of the data block, and to store the data block in the memory in a location identified by the address; wherein the server is configured to provide write-once storage of the data block in the system such that the contents of the data block are not modifiable without also altering the address of the data block determinable as a function of the contents.
- 20. A machine-readable storage medium for storing one or more software programs for use in storing data in a storage system, the one or more software programs when executed in the system implementing the steps of:
processing a data block to generate an address which is determined as a function of the contents of the data block; and storing the data block in the system in a memory location identified by the address; wherein the processing and storing steps provide write-once storage of the data block in the system such that the contents of the data block are not modifiable without also altering the address of the data block determinable in the processing step.
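The storage scheme recited in claims 1, 5, 6, 7, 10, 12, 15 and 16 can be illustrated with a minimal sketch. It is not the claimed implementation. The class name `ContentAddressedStore`, the bucket count, the use of SHA-256 as the collision-resistant hash, the in-memory `bytearray` standing in for the append-only data log, and the choice of bucket from the first identifier byte are all assumptions made for illustration only.

```python
import hashlib

NUM_BUCKETS = 16  # hypothetical bucket count, chosen only for this sketch


class ContentAddressedStore:
    """Illustrative write-once store: a block's identifier is a hash of its contents."""

    def __init__(self):
        self.log = bytearray()  # stand-in for an append-only data log
        # Each bucket maps identifier -> (offset, length) within the log.
        self.buckets = [dict() for _ in range(NUM_BUCKETS)]

    @staticmethod
    def fingerprint(block: bytes) -> bytes:
        # Assumption: SHA-256 plays the role of the collision-resistant hash.
        return hashlib.sha256(block).digest()

    def _bucket(self, fp: bytes) -> dict:
        # Simplification: select the bucket from the first identifier byte.
        return self.buckets[fp[0] % NUM_BUCKETS]

    def write(self, block: bytes) -> bytes:
        fp = self.fingerprint(block)
        bucket = self._bucket(fp)
        if fp not in bucket:  # a duplicate write collapses into the existing entry
            bucket[fp] = (len(self.log), len(block))
            self.log += block  # append only; stored bytes are never overwritten
        return fp

    def read(self, fp: bytes) -> bytes:
        offset, length = self._bucket(fp)[fp]
        block = bytes(self.log[offset:offset + length])
        # Recompute the identifier to verify integrity of the retrieved block.
        if self.fingerprint(block) != fp:
            raise ValueError("integrity check failed")
        return block


store = ContentAddressedStore()
first = store.write(b"archival block")
second = store.write(b"archival block")   # duplicate write of identical contents
changed = store.write(b"archival blocK")  # different contents, different identifier
```

In this sketch, writing the same contents twice returns the same identifier and consumes log space only once, while any change to the contents yields a different identifier, which is the write-once property the independent claims describe.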
RELATED APPLICATION(S)
[0001] The present application claims the priority of U.S. Provisional Patent Application Serial No. 60/306,564, filed Jul. 19, 2001 and entitled “Method and Apparatus for Archival Data Storage,” the disclosure of which is hereby incorporated by reference herein.
Provisional Applications (1)
Number | Date | Country
60306564 | Jul 2001 | US