Claims
- 1. A method of initiating an internally-consistent system-wide file system image in an object-based distributed data storage system comprising:
providing a plurality of management entities each of which maintains a record representing a configuration of a portion of said data storage system; and using a Distributed Consensus Algorithm (DCA) to elect one of said plurality of management entities to serve as a first image master to coordinate execution of said system-wide file system image.
- 2. The method of claim 1, wherein said record maintained by each of said plurality of management entities represents said configuration of an entirety of said data storage system.
- 3. The method of claim 1, wherein said configuration of said portion of said data storage system includes at least one of the following:
information about how one or more data files are stored in said portion of said data storage system; and information about identity of a file manager that manages a root file directory for a system-wide file namespace.
- 4. The method of claim 1, wherein each of said plurality of management entities is a realm manager.
- 5. The method of claim 1, wherein using said DCA to elect said first image master includes configuring each of said plurality of management entities to elect said first image master therefrom using said DCA at a boot time for said object-based data storage system.
- 6. The method of claim 1, further comprising:
detecting absence of said first image master; and electing a second image master from a remainder of said plurality of management entities using said DCA.
- 7. A computer-readable storage medium containing a program code, which, upon execution by a processor in an object-based distributed data storage system, causes said processor to perform the following:
provide a plurality of management entities each of which maintains a record representing a configuration of a portion of said data storage system; and use a Distributed Consensus Algorithm (DCA) to elect one of said plurality of management entities to serve as an image master to coordinate execution of a system-wide file system image in said data storage system.
- 8. An object-based data storage system comprising:
means for maintaining a record representing a configuration of a portion of said data storage system; and means for electing one of said record maintaining means as an image master using a Distributed Consensus Algorithm (DCA), wherein said image master is configured to coordinate execution of an internally-consistent system-wide file system image.
- 9. A method of achieving a quiescent state in a storage system having a plurality of object-based secure disks (OBDs) and a plurality of executable client applications, wherein each client application, upon execution, is configured to access one or more of said plurality of object-based secure disks, said method comprising:
defining a plurality of capabilities required to perform a data write operation on corresponding one or more of said plurality of object-based secure disks; granting one or more of said plurality of capabilities to each of said plurality of client applications; and invalidating each of said plurality of capabilities so long as said quiescent state is to be maintained, thereby preventing each client application from accessing one or more of corresponding object-based secure disks to perform said data write operation thereon during said quiescent state.
- 10. The method of claim 9, wherein defining said plurality of capabilities includes:
providing a management entity in said storage system, wherein said management entity is configured to manage data transfer operations between said plurality of client applications and said one or more of said plurality of OBDs; and configuring said management entity to digitally sign each of said plurality of capabilities.
- 11. The method of claim 10, further comprising configuring each of said plurality of OBDs to recognize said digital signature of said management entity.
- 12. The method of claim 10, wherein configuring said management entity to digitally sign each of said plurality of capabilities includes configuring said management entity to digitally sign each of said plurality of capabilities using a first cryptographic key, and wherein invalidating each of said plurality of capabilities includes configuring said management entity to require each said capability granted to a corresponding client application to contain a second cryptographic key instead of said first cryptographic key, wherein said second cryptographic key is different from said first cryptographic key.
- 13. The method of claim 9, wherein granting said one or more of said plurality of capabilities includes:
receiving a corresponding request from each of said plurality of client applications for one or more of said plurality of capabilities; and issuing one or more of said plurality of capabilities to each of said plurality of client applications in response to said request received therefrom.
- 14. The method of claim 13, wherein issuing one or more of said plurality of capabilities includes validating each of said one or more of said plurality of capabilities with a digital signature.
- 15. The method of claim 9, wherein invalidating each of said plurality of capabilities includes:
identifying a management entity in said storage system that facilitates said data write operation on one or more of said plurality of OBDs for each corresponding client application by recognizing one or more of said plurality of capabilities presented thereto by each said client application; and instructing said management entity to stop recognizing said one or more of said plurality of capabilities so long as said quiescent state is to be maintained, thereby preventing each said client application from performing said data write operation on corresponding one or more of said plurality of OBDs.
- 16. A computer-readable storage medium containing a program code, which, upon execution by a processor in an object-based distributed data storage system, causes said processor to perform the following:
define a plurality of capabilities required to perform a data write operation on corresponding one or more of a plurality of object-based secure disks in said data storage system; grant one or more of said plurality of capabilities to each of a plurality of client applications configured to access one or more of said plurality of object-based secure disks; and invalidate each of said plurality of capabilities so long as a quiescent state is to be maintained, thereby preventing each client application from accessing one or more of corresponding object-based secure disks to perform said data write operation thereon during said quiescent state.
- 17. An object-based data storage system comprising:
a plurality of object-based secure disks (OBDs); a plurality of executable client applications, wherein each client application, upon execution, is configured to access one or more of said plurality of OBDs to perform a data write operation thereon; means for generating a plurality of capabilities required to perform said data write operation; means for receiving a corresponding request from each of said plurality of client applications for one or more of said plurality of capabilities; means for issuing one or more of said plurality of capabilities to each of said plurality of client applications in response to said request received therefrom; and means for preventing recognition of each of said plurality of capabilities so long as a quiescent state is to be maintained for said storage system, thereby preventing each said client application from performing said data write operation during said quiescent state.
- 18. A method of performing an internally-consistent system-wide file system image in an object-based data storage system comprising:
receiving a request for said system-wide file system image; quiescing said object-based data storage system in response to said request for said image; cloning each live object group stored in said data storage system after said data storage system is quiesced without cloning any object contained in each said object group during said system-wide file system image; and responding to said request for said system-wide file system image.
- 19. The method of claim 18, wherein receiving said request includes:
electing an image master from a plurality of realm managers in said object-based data storage system, wherein said image master coordinates execution of said system-wide file system image, and wherein each of said plurality of realm managers maintains a record of the system-wide file storage configuration in said object-based data storage system; and configuring said image master to receive said request for said system-wide file system image.
- 20. The method of claim 18, wherein quiescing said data storage system includes:
instructing each client application running in said object-based data storage system to finish a corresponding data write operation already commenced thereby; advising each said client application to discard corresponding one or more write capabilities issued thereto for a future data write operation, wherein each said client application requires a write capability to perform said future data write operation in said data storage system; and ceasing issuance of new write capabilities to each said client application.
- 21. The method of claim 18, wherein quiescing said data storage system includes invalidating each write capability issued to each client application running in said object-based data storage system, wherein each said client application requires a write capability to perform a corresponding data write operation in said object-based data storage system.
- 22. The method of claim 21, wherein invalidating each said write capability includes:
identifying a management entity in said data storage system that facilitates said data write operation for each corresponding client application by recognizing each said write capability presented thereto by each said client application; and instructing said management entity to stop recognizing each said write capability, thereby preventing each said client application from performing said data write operation in said object-based data storage system.
- 23. The method of claim 21, wherein invalidating each said write capability includes preventing recognition of each said write capability in said data storage system.
- 24. The method of claim 18, wherein cloning each said live object group includes:
providing a plurality of management entities in said data storage system, wherein each of said plurality of management entities is configured to manage data transfer operations within said data storage system, and wherein each of said plurality of management entities contains at least one live object group stored in said data storage system; and instructing each of said plurality of management entities to clone corresponding one or more live object groups contained therein.
- 25. The method of claim 18, wherein cloning each said object group includes:
creating a corresponding duplicate object group for each said object group; and copying respective header information contained in each said object group into said corresponding duplicate object group.
- 26. The method of claim 25, wherein creating said corresponding duplicate object group includes assigning a corresponding new object group_ID to each said duplicate object group.
- 27. The method of claim 25, wherein each said object group is cloned into said corresponding duplicate object group without rewriting metadata for one or more objects contained in each said object group.
- 28. The method of claim 26, wherein assigning said corresponding new object group_ID includes:
indicating to each object-based secure disk (OBD) in said data storage system that each object group stored thereon is to be cloned; supplying said corresponding new object group_ID for each said duplicate object group to each respective OBD in said data storage system; and instructing each said OBD to copy each object group stored thereon to create said corresponding duplicate object group identified therefor and assign said corresponding new object group_ID to respective duplicate object group.
- 29. The method of claim 18, wherein responding to said request includes:
creating an image directory in a root directory object in said object-based data storage system to record therein information related to said system-wide file system image; re-enabling one or more data write operations in said data storage system; informing each file manager in said data storage system about a timing of said system-wide file system image; and indicating completion of said system-wide file system image to an entity in said data storage system that requested said system-wide file system image.
- 30. The method of claim 29, wherein responding to said request further includes recording an image_ID and said timing of said system-wide file system image in said image directory.
- 31. The method of claim 29, wherein re-enabling said one or more data write operations includes commencing recognizing each write capability issued to each client application running in said object-based data storage system, wherein each said client application requires a write capability to perform a corresponding data write operation in said object-based data storage system.
- 32. An object-based data storage system comprising:
means for receiving a request for an internally-consistent system-wide file system image; means for quiescing said object-based data storage system in response to said request for said image; means for cloning each object group stored in said data storage system after said data storage system is quiesced without primarily cloning any object contained in each said object group during said system-wide file system image; and means for responding to said request for said system-wide file system image.
- 33. A computer-readable storage medium containing a program code, which, upon execution by a processor in an object-based distributed data storage system, causes said processor to perform the following:
receive a request for a system-wide file system image in said data storage system; quiesce said object-based data storage system in response to said request for said image; clone each object group stored in said data storage system after said data storage system is quiesced without cloning any object contained in each said object group during said system-wide file system image; and respond to said request for said system-wide file system image.
- 34. A method of performing an internally-consistent system-wide file system image in an object-based data storage system comprising:
receiving a request for said system-wide file system image; placing a dummy image directory in a root directory object in said object-based data storage system; quiescing said object-based data storage system in response to said request for said image; informing each file manager in said data storage system about a timing of said system-wide file system image; indicating completion of said system-wide file system image; and configuring each said file manager to copy each corresponding object managed thereby prior to authorizing a write operation to said object after said completion of said system-wide file system image.
- 35. The method of claim 34, wherein receiving said request includes:
electing an image master from a plurality of realm managers in said object-based data storage system, wherein said image master coordinates execution of said system-wide file system image, and wherein each of said plurality of realm managers maintains a record of the system-wide file storage configuration in said object-based data storage system; and configuring said image master to receive said request for said system-wide file system image.
- 36. The method of claim 34, wherein quiescing said data storage system includes:
instructing each client application running in said object-based data storage system to finish a corresponding data write operation already commenced thereby; advising each said client application to discard corresponding one or more write capabilities issued thereto for a future data write operation, wherein each said client application requires a write capability to perform said future data write operation in said data storage system; and ceasing issuance of new write capabilities to each said client application.
- 37. The method of claim 34, wherein quiescing said data storage system includes invalidating each write capability issued to each client application running in said object-based data storage system, wherein each said client application requires a write capability to perform a corresponding data write operation in said object-based data storage system.
- 38. The method of claim 37, wherein invalidating each said write capability includes instructing each said file manager to stop recognizing each said write capability, thereby preventing each said client application from performing said corresponding data write operation in said object-based data storage system.
- 39. The method of claim 37, wherein invalidating each said write capability includes preventing recognition of each said write capability in said data storage system.
- 40. The method of claim 34, wherein indicating said completion of said system-wide file system image includes indicating said completion to an entity in said data storage system that requested said system-wide file system image.
- 41. The method of claim 34, further comprising responding to said request for said system-wide file system image.
- 42. The method of claim 41, wherein responding to said request includes:
converting said dummy image directory into a valid image directory in said root directory object at a conclusion of said system-wide file system image to record therein information related to said system-wide file system image; and re-enabling one or more data write operations in said data storage system.
- 43. The method of claim 42, wherein responding to said request further includes recording an image_ID and said timing of said system-wide file system image in said valid image directory.
- 44. The method of claim 42, wherein re-enabling said one or more data write operations includes commencing recognizing each write capability issued to each client application running in said object-based data storage system, wherein each said client application requires a write capability to perform a corresponding data write operation in said object-based data storage system.
- 45. The method of claim 34, wherein configuring each said file manager to copy each corresponding object includes configuring each said file manager to perform the following:
determine if there is a mismatch between a first image version number contained in said corresponding object and a second image version number of said system-wide file system image; upon detecting said mismatch, perform an image of each file directory appearing in a file path for said corresponding object until one of the following file directories in said file path is encountered:
said root directory object, and a file directory whose image already exists for said second image version number; create a copy of said corresponding object in said image of the file directory containing said corresponding object; and replacing said first image version number contained in said corresponding object with said second image version number prior to authorizing said write operation to said corresponding object.
- 46. The method of claim 45, wherein configuring each said file manager to perform said image of each said file directory includes configuring each said file manager to duplicate each said file directory appearing in said file path for said corresponding object prior to creating said copy of said corresponding object in said image of the file directory containing said corresponding object.
- 47. The method of claim 45, wherein configuring each said file manager to create said copy of said corresponding object includes:
configuring each said file manager to create said copy of said corresponding object having a first object_ID that is different from a second object_ID of said corresponding object; and further configuring each said file manager to record said first object_ID of said copy of said corresponding object in said image of the file directory containing said corresponding object.
- 48. An object-based data storage system comprising:
means for receiving a request for an internally-consistent system-wide file system image; means for placing a dummy image directory in a root directory object in said object-based data storage system; means for quiescing said object-based data storage system in response to said request for said image; means for informing each file manager in said data storage system about a timing of said system-wide file system image; means for indicating completion of said system-wide file system image; and means for configuring a file manager to copy an object managed thereby prior to authorizing a write operation to said object after said completion of said system-wide file system image.
- 49. A computer-readable storage medium containing a program code, which, upon execution by a processor in an object-based distributed data storage system, causes said processor to perform the following:
receive a request for a system-wide file system image in said object-based data storage system; place a dummy image directory in a root directory object in said object-based data storage system; quiesce said object-based data storage system in response to said request for said image; inform each file manager in said data storage system about a timing of said system-wide file system image; indicate completion of said system-wide file system image; and configure each said file manager to copy each corresponding object managed thereby prior to authorizing a write operation to said object after said completion of said system-wide file system image.
- 50. A method of performing an internally-consistent system-wide file system image in an object-based data storage system comprising:
preparing said object-based data storage system for said system-wide file system image; and performing said system-wide file system image without updating any directory objects stored in said object-based data storage system during said system-wide file system image.
- 51. The method of claim 50, further comprising receiving a request for said system-wide file system image prior to preparing said data storage system.
- 52. The method of claim 51, wherein receiving said request includes:
electing an image master from a plurality of realm managers in said object-based data storage system, wherein said image master coordinates execution of said system-wide file system image, and wherein each of said plurality of realm managers maintains a record of the system-wide file storage configuration in said object-based data storage system; and configuring said image master to receive said request for said system-wide file system image.
- 53. The method of claim 50, wherein preparing said data storage system includes placing a dummy image directory in a root directory object in said object-based data storage system.
- 54. The method of claim 53, wherein preparing said data storage system further includes quiescing said object-based data storage system prior to performing said system-wide file system image.
- 55. The method of claim 54, wherein quiescing said data storage system includes preventing each client application running in said data storage system from performing a corresponding data write operation in said data storage system so long as said system-wide file system image is in progress.
- 56. The method of claim 53, wherein performing said system-wide file system image includes converting said dummy image directory into a valid image directory in said root directory object at a conclusion of said system-wide file system image to record therein information related to said system-wide file system image.
- 57. The method of claim 56, wherein performing said system-wide file system image includes recording an image_ID and a timing of said system-wide file system image in said valid image directory.
- 58. The method of claim 50, wherein performing said system-wide file system image includes:
informing each file manager in said data storage system about a timing of said system-wide file system image; and indicating a completion of said system-wide file system image to an entity in said data storage system that requested said image.
- 59. The method of claim 50, wherein performing said system-wide file system image includes cloning one or more of said directory objects during said image without updating said directory objects to be cloned.
- 60. A computer-readable storage medium containing a program code, which, upon execution by a processor in an object-based distributed data storage system, causes said processor to perform the following:
prepare said object-based data storage system for an internally-consistent system-wide file system image; and perform said system-wide file system image without updating any directory objects stored in said object-based data storage system during said system-wide file system image.
- 61. An object-based data storage system comprising:
means for receiving a request for a system-wide file system image; means for quiescing said object-based data storage system in response to said request; and means for performing said system-wide file system image without updating any directory objects stored in said object-based data storage system during said system-wide file system image.
- 62. In an object-based data storage system having a plurality of object-based secure disks (OBDs) and a storage manager facilitating data storage in one or more of said plurality of OBDs, a method of avoiding a need to rewrite metadata for an object, stored by said storage manager on one of said plurality of OBDs, when a system-wide file system image in said object-based data storage system is taken, said method comprising:
obtaining information identifying said file system image; using said image identifying information, dynamically obtaining a mapping of a non-image identity of each object group appearing in a file path for said object into a corresponding identity of each said object group in said file system image; and for each said object group in said file path for said object, dynamically substituting said corresponding identity in said file system image in place of respective non-image identity therefor when accessing a version of said object in said file system image.
- 63. The method of claim 62, wherein said information identifying said file system image includes an image_ID stored in an image directory object created in said data storage system during said file system image.
- 64. The method of claim 63, wherein obtaining said information includes accessing said image directory object to obtain said image_ID therefrom.
- 65. The method of claim 62, wherein dynamically obtaining said mapping includes:
at run time, accessing each file directory object appearing in said file path for said object to obtain identity of said storage manager and said non-image identity of a corresponding object group associated therewith; for each said object group corresponding to a respective file directory object, supplying said image identifying information, said non-image identity of said object group, and identity of said storage manager at run time to a realm manager in said object-based data storage system, wherein said realm manager is configured to maintain a record of a file storage configuration for the files stored by said storage manager; and for each said object group corresponding to said respective file directory object, querying said realm manager at run time to obtain said mapping for said object stored by said storage manager.
- 66. The method of claim 65, wherein querying said realm manager at run time includes configuring said realm manager to provide said mapping using the following information supplied thereto at run time:
said image identifying information; said identity of said storage manager; and said non-image identity of each said object group appearing in said file path for said object.
- 67. The method of claim 65, wherein dynamically substituting said corresponding identity for each said object group in said file system image includes:
for each said object group corresponding to said respective file directory object, substituting at run time said corresponding identity thereof in said file system image in place of said non-image identity thereof; and continuing said substitution until said version of said object in said file system image is reached.
- 68. An object-based data storage system comprising:
a plurality of object-based secure disks (OBDs); a storage manager facilitating data storage in one or more of said plurality of OBDs; a realm manager configured to maintain a record of a file storage configuration for the files stored by said storage manager; and an executable client application, wherein said client application, upon execution, is configured to perform the following at run time:
access a first OBD to obtain identity of a system-wide file system image from an image directory object stored thereon, and obtain a mapping from said realm manager of each object group corresponding to a respective file directory object in a file path for an object stored by said storage manager on a second OBD, wherein said realm manager maps a non-image identity of each said object group into a corresponding identity of each said object group in said file system image.
- 69. A computer-readable storage medium containing a program code, which, upon execution by a processor in an object-based distributed data storage system, causes said processor to perform the following at run time to access an image version of an object stored in said data storage system:
obtain information identifying a system-wide file system image in said data storage system; using said image identifying information, obtain a mapping of a non-image identity of each object group appearing in a file path for an object stored in said data storage system into a corresponding identity of each said object group in said file system image; and for each said object group in said file path for said object, substitute said corresponding identity in said file system image in place of respective non-image identity therefor when accessing said image version of said object in said file system image.
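By way of non-limiting illustration only, the capability invalidation recited in claims 9 through 12 may be sketched as follows. The sketch assumes that a keyed hash (HMAC-SHA256) stands in for the digital signature and that a single management entity both signs and recognizes capabilities; the names `Capability`, `ManagementEntity`, `grant`, `recognizes`, and `rotate_key` are hypothetical and do not appear in the claims.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """Hypothetical write capability held by a client application."""
    client_id: str
    object_id: str
    signature: bytes


class ManagementEntity:
    """Hypothetical entity that signs and recognizes write capabilities.

    Assumption: HMAC-SHA256 is used here in place of a digital signature.
    """

    def __init__(self, key: bytes) -> None:
        self._key = key  # the "first cryptographic key"

    def _sign(self, client_id: str, object_id: str) -> bytes:
        message = f"{client_id}:{object_id}:write".encode()
        return hmac.new(self._key, message, hashlib.sha256).digest()

    def grant(self, client_id: str, object_id: str) -> Capability:
        """Issue a capability signed with the current key."""
        return Capability(client_id, object_id, self._sign(client_id, object_id))

    def recognizes(self, cap: Capability) -> bool:
        """Accept a capability only if it verifies under the current key."""
        expected = self._sign(cap.client_id, cap.object_id)
        return hmac.compare_digest(expected, cap.signature)

    def rotate_key(self, new_key: bytes) -> None:
        """Switch to a different key; capabilities signed earlier stop verifying."""
        self._key = new_key


manager = ManagementEntity(b"first-key")
cap = manager.grant("client-a", "object-42")
print(manager.recognizes(cap))   # True: writes are permitted

manager.rotate_key(b"second-key")  # enter the quiescent state
print(manager.recognizes(cap))   # False: the outstanding capability is rejected
```

In this sketch, changing the key invalidates every outstanding capability at once without contacting any client application, which is one way the quiescent state could be maintained until new capabilities are issued under the second key.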
REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority benefits of prior filed co-pending U.S. provisional patent applications Serial No. 60/368,785, filed on Mar. 29, 2002 and Serial No. 60/372,027, filed on Apr. 12, 2002, the disclosures of both of which are incorporated herein by reference in their entireties.
Provisional Applications (2)

| Number | Date | Country |
|---|---|---|
| 60368785 | Mar 2002 | US |
| 60372027 | Apr 2002 | US |