This invention relates in general to the field of computer storage systems and more specifically to the optimization of computer storage systems utilizing deduplication.
Growing complexity of storage infrastructure requires solutions for efficient use and management of resources. A virtualized storage system presents the user with a logical space for data storage while the storage system itself handles the process of mapping it to the actual physical location. Today many virtualized storage systems implement data deduplication. Data deduplication is a data compression technique for optimizing the utilization of available storage space in a storage system (including storage area network (SAN) systems). In the deduplication process a single copy of a data unit is stored while duplicates of identical data units are eliminated and only a virtual representation of these units is maintained. In response to a request (e.g. a read request) for these data units, the data can be easily reconstructed. By storing a single copy of each data unit, deduplication reduces the storage space required on the physical storage.
In some cases substantial parts of different data units are identical while significantly smaller parts are non-identical. In such scenarios currently known deduplication techniques, which require complete identity between the data units, would determine such data units to be non-identical and store all of the different data units on the physical storage.
Prior art references considered to be relevant as background to the invention are listed below. Acknowledgement of the references herein is not to be inferred as meaning that these are in any way relevant to the patentability of the invention disclosed herein.
US Patent Application No. 2009234892 discloses a system and method for assuring integrity of deduplicated data objects stored within a storage system. A data object is copied to secondary storage media, and a digital signature such as a checksum is generated of the data object. Then, deduplication is performed upon the data object and the data object is split into chunks. The chunks are combined when the data object is subsequently accessed, and a signature is generated for the reassembled data object. The reassembled data object is provided if the newly generated signature is identical to the originally generated signature, and otherwise a backup copy of the data object is provided from secondary storage media.
US Patent Application No. US2008005141 discloses a system and method for calculating and storing block fingerprints for data deduplication. A fingerprint extraction layer generates a fingerprint of a predefined size, e.g., 64 bits, for each data block stored by a storage system. Each fingerprint is stored in a fingerprint record, and the fingerprint records are, in turn, stored in a fingerprint database for access by the data deduplication module. The data deduplication module may periodically compare the fingerprints to identify duplicate fingerprints, which, in turn, indicate duplicate data blocks.
U.S. Pat. No. 7,822,939 discloses a system for de-duplicating data, which includes providing a first volume including at least one pointer to a second volume that corresponds to physical storage space, wherein the first volume is a logical volume. A first set of data is detected as a duplicate of a second set of data stored on the second volume at a first data chunk. A pointer of the first volume associated with the first set of data is modified to point to the first data chunk. After modifying the pointer, no additional physical storage space is allocated for the first set of data.
Thus, there is still a need in the art to further improve the efficiency of the utilization of physical storage and to provide deduplication techniques which further reduce storage space requirements.
According to certain aspects of the disclosed subject matter there is provided a storage system, the system comprising: a plurality of physical storage devices controlled by a plurality of storage control devices constituting a storage control layer, the storage control layer operatively coupled to the plurality of physical storage devices storing a plurality of data units, wherein the storage control layer being operable to identify a first data part and a respective first metadata part within a first data unit, and to perform at least the following operations: identify at least one data unit among the plurality of data units, having a data part which is identical to the first data part and a respective metadata part, which is different than the first metadata part, by comparing the first data part and the respective first metadata part with data parts and respective metadata parts of data units from the plurality of data units, thereby giving rise to partly-identical data units; associate a logical address corresponding to a data part of one data unit of the partly-identical data units with a physical address of a data part of a second data unit of the partly-identical data units; store the respective first metadata part in a designated area in the physical storage and associate a logical address of the respective first metadata part with a physical address of the metadata part; and discard the data part of the one data unit of the partly-identical data units, thereby reducing volume of stored data in the physical storage.
According to further aspects of the disclosed subject matter the control layer is further operable to receive information at least in respect of the first data unit including at least a size of the first metadata part and its location in respect of the first data unit and is operable to identify the first data part and the first metadata part within the data unit based on the received information.
According to further aspects of the disclosed subject matter the control layer is operable to identify data units among the plurality of data units, having data parts which are identical to the first data part, and having respective metadata parts, which are identical to the first metadata part, by comparing the first data unit with data units from the plurality of data units, thereby giving rise to identical data units; associate a logical address corresponding to one of the identical data units with a physical address of another identical data unit of the identical data units and discard one of the identical data units.
According to further aspects of the disclosed subject matter the control layer is operable to calculate a hash value for at least the first data part; wherein the comparing is performed by comparing the hash value with other hash values calculated for data parts of at least part of the plurality of data units.
According to further aspects of the disclosed subject matter the control layer is further operable to divide a data unit into a plurality of subunits, and wherein the first data unit is one of subunits generated by such division.
According to further aspects of the disclosed subject matter the control layer is operatively connectable to one or more hosts and, in response to a read request issued by a host for a data unit comprising the first data part and the first metadata part, is operable to perform at least the following: identify that the first metadata part is characterized by a logical address associated with a designated area; retrieve the first metadata part from the designated area based on its physical address; retrieve the first data part from its physical storage device; combine the first data part and the first metadata part and reconstruct the first data unit.
According to other aspects of the presently disclosed subject matter there is provided a SO (storage optimization) module operatively connectable to a storage control layer in a storage system, the storage system comprising a plurality of physical storage devices, storing a plurality of data units, and controlled by a plurality of storage control devices constituting the storage control layer, the layer operatively coupled to the plurality of physical storage devices; the SO module being operable to identify a first data part and a respective first metadata part within a first data unit, and to perform at least the following operations: identify at least one data unit among the plurality of data units, having a data part which is identical to the first data part and a respective metadata part, which is different than the first metadata part, by comparing the first data part and the respective first metadata part with data parts and respective metadata parts of data units from the plurality of data units, thereby giving rise to partly-identical data units; associate a logical address corresponding to a data part of one data unit of the partly-identical data units with a physical address of a data part of a second data unit of the partly-identical data units; store the respective first metadata part in a designated area in the physical storage and associate a logical address of the respective first metadata part with a physical address of the metadata part; and discard the data part of the one data unit of the partly-identical data units, thereby reducing volume of stored data in the physical storage.
According to other aspects of the presently disclosed subject matter there is provided a method of allocating data to a physical data storage associated with a storage system comprising a plurality of physical storage devices storing a plurality of data units, the method comprising: identifying a first data part and a respective first metadata part within a first data unit; identifying at least one data unit among the plurality of data units, having a data part which is identical to the first data part and a respective metadata part, which is different than the first metadata part, by comparing the first data part and the respective first metadata part with data parts and respective metadata parts of data units from the plurality of data units, thereby giving rise to partly-identical data units; associating a logical address corresponding to a data part of one data unit of the partly-identical data units with a physical address of a data part of a second data unit of the partly-identical data units; storing the respective first metadata part in a designated area in the physical storage; associating a logical address of the respective first metadata part with a physical address of the metadata part; and discarding the data part of the one data unit of the partly-identical data units, thereby reducing volume of stored data in the physical storage.
According to further aspects of the disclosed subject matter the method further comprises receiving information at least in respect of the first data unit including at least a size of the first metadata part and its location in respect of the first data unit and identifying the first data part and the first metadata part within the data unit based on the received information.
According to further aspects of the disclosed subject matter the method further comprises identifying data units among the plurality of data units, having data parts which are identical to the first data part, and having respective metadata parts, which are identical to the first metadata part, by comparing the first data unit with data units from the plurality of data units, thereby giving rise to identical data units; associating a logical address corresponding to one of the identical data units with a physical address of another identical data unit of the identical data units and discarding one of the identical data units.
According to further aspects of the disclosed subject matter the method further comprises calculating a hash value for at least the first data part; wherein the comparing is performed by comparing the hash value with other hash values calculated for data parts of at least part of the plurality of data units.
According to further aspects of the disclosed subject matter the method further comprises concatenating data parts corresponding to the plurality of subunits within the data unit and comparing the concatenated data parts with concatenated data parts corresponding to other data units stored within the storage devices.
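The concatenation of data parts across subunits could be sketched as follows. This is only an illustration under stated assumptions: each subunit is assumed to hold a 512-byte data part followed by an 8-byte metadata part, SHA-256 is used as a stand-in fingerprint, and the names `concatenated_data_parts`, `data_fingerprint` and `make_unit` are hypothetical and do not appear in the disclosure.

```python
import hashlib

DATA_SIZE = 512       # assumed size of the data part of each subunit
METADATA_SIZE = 8     # assumed size of the metadata part appended to each subunit
SUBUNIT_SIZE = DATA_SIZE + METADATA_SIZE


def concatenated_data_parts(data_unit: bytes) -> bytes:
    """Strip the metadata part of every subunit and join the remaining data parts."""
    if len(data_unit) % SUBUNIT_SIZE != 0:
        raise ValueError("data unit is not a whole number of subunits")
    parts = []
    for offset in range(0, len(data_unit), SUBUNIT_SIZE):
        parts.append(data_unit[offset:offset + DATA_SIZE])
    return b"".join(parts)


def data_fingerprint(data_unit: bytes) -> str:
    """Fingerprint computed over the concatenated data parts only."""
    return hashlib.sha256(concatenated_data_parts(data_unit)).hexdigest()


def make_unit(tag: bytes) -> bytes:
    """Build a 4-subunit example unit whose metadata parts depend on `tag`."""
    return b"".join(
        bytes([i]) * DATA_SIZE + tag.ljust(METADATA_SIZE, b"\0") for i in range(4)
    )


unit_a = make_unit(b"hostA")
unit_b = make_unit(b"hostB")
```

In this sketch `unit_a` and `unit_b` differ as whole units, yet their concatenated data parts, and hence their data fingerprints, are equal.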
According to other aspects of the presently disclosed subject matter there is provided a program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps of allocating data to a physical data storage associated with a storage system comprising a plurality of physical storage devices storing a plurality of data units, the method comprising identifying a first data part and a respective first metadata part within a first data unit; identifying at least one data unit among the plurality of data units, having data parts which are identical to the first data part and a respective metadata part, which is different than the first metadata part, by comparing the first data part and the respective first metadata part with data parts and respective metadata parts of data units from the plurality of data units, thereby giving rise to partly-identical data units; associating a logical address corresponding to a data part of one data unit of the partly-identical data units with a physical address of a data part of a second data unit of the partly-identical data units; storing the respective first metadata part in a designated area in the physical storage; associating a logical address of the respective first metadata part with a physical address of the metadata part; and discarding the data part of the one data unit of the partly-identical data units, thereby reducing volume of stored data in the physical storage.
In order to understand the disclosed subject matter and to see how it may be carried out in practice, embodiments will now be described, by way of non-limiting example only, with reference to the accompanying drawings, in which:
In the drawings and descriptions set forth, identical reference numerals indicate those components that are common to different embodiments or configurations.
Unless specifically stated otherwise, as apparent from the following discussions, it is appreciated that throughout the specification discussions utilizing terms such as “processing”, “controlling”, “configuring”, “receiving”, “enabling”, “performing”, “executing”, “monitoring”, “analyzing”, “determining”, “identifying”, “associating”, “discarding”, “storing”, “comparing”, “calculating”, “retrieving”, “combining”, “concatenating”, “defining” or the like, include actions and/or processes of a computer that manipulate and/or transform data into other data, the data represented as physical quantities, e.g. such as electronic quantities, and/or the data representing the physical objects. The term “computer” should be expansively construed to cover any kind of electronic device with data processing capabilities. The operations in accordance with the teachings herein may be performed by a computer specially constructed for the desired purposes or by a general purpose computer specially configured for the desired purpose by a computer program stored in a computer readable storage medium.
It is appreciated that certain features of the disclosed subject matter, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the disclosed subject matter, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable sub-combination.
In embodiments of the disclosed subject matter, fewer, more and/or different stages than those shown in
Certain embodiments of the presently disclosed subject matter are applicable to the architecture of a computer system described with reference to
Bearing this in mind, attention is drawn to
The virtualization functions may be provided in hardware, software, firmware or any suitable combination thereof. Optionally, the functions of control layer 103 may be fully or partly integrated with one or more host computers and/or storage devices and/or with one or more communication devices enabling communication between the hosts and the storage devices. Optionally, a format of logical representation provided by control layer 103 may differ, depending on interfacing applications.
The physical storage space may comprise any appropriate permanent storage medium and include, by way of non-limiting example, one or more disk drives and/or one or more disk units (DUs). The physical storage space comprises a plurality of data blocks, each data block may be characterized by a pair (DDid, DBA) where DDid is a serial number associated with the disk drive accommodating the data block, and DBA is a logical block number within the respective disk. By way of non-limiting example, DDid may represent a serial number internally assigned to the disk drive by the system or, alternatively, a WWN or universal serial number assigned to the disk drive by a vendor. The storage control layer 103 and storage devices 1041-n may communicate with host computers 1011-n and within the storage system in accordance with any appropriate storage protocol.
Stored data may be logically represented to a client (host) in terms of logical objects. Depending on the storage protocol, the logical objects may be logical volumes, data files, multimedia files, snapshots and other copies, etc.
A logical volume (LU) is a virtual entity logically presented to a client as a single virtual storage device. The logical volume represents a plurality of data blocks characterized by successive Logical Block Addresses (LBA) ranging from 0 to a number LUK. Different LUs may comprise different numbers of data blocks, while the data blocks are typically of equal size (e.g. 512 bytes). Blocks with successive LBAs may be grouped into portions that act as basic units for data handling and organization within the system. Thus, for instance, whenever space has to be allocated on a disk or on a memory component in order to store data, this allocation may be done in terms of “data portions” otherwise also known as “allocation units”. Data portions are typically (but not necessarily) of equal size throughout the system. Successive data portions constituting a logical volume are typically stored in different disk drives (e.g. for purposes of both performance and data protection), and to the extent that it is possible, across different DUs. For purpose of illustration only, the operation of the storage system 102 is described herein in terms of entire data portions. Those skilled in the art will readily appreciate that the teachings of the presently disclosed subject matter are applicable in a similar manner to parts of data portions.
Typically, definition of LUs in the storage system involves in-advance configuring an allocation scheme and/or allocation function used to determine the location of the various data portions (and their associated parity portions) across the physical storage medium. The allocation scheme can be handled for example, by an allocation module 105 being a part of the storage control layer 103. The location of various data portions allocated across the physical storage can be recorded and monitored with the help of one or more allocation tables linking between logical data addresses and their corresponding allocated location in the physical storage.
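An allocation table of the kind described above could be sketched as a mapping from a logical address to a (DDid, DBA) pair. This is a minimal in-memory illustration only; the class names `PhysicalAddress` and `AllocationTable`, the volume label "LU0" and the disk identifiers are hypothetical and are not taken from the disclosure.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class PhysicalAddress(NamedTuple):
    dd_id: str   # identifier of the disk drive accommodating the block (DDid)
    dba: int     # block number within that disk (DBA)


class AllocationTable:
    """Links logical addresses (volume, LBA) to their allocated physical location."""

    def __init__(self) -> None:
        self._map: Dict[Tuple[str, int], PhysicalAddress] = {}

    def link(self, volume: str, lba: int, location: PhysicalAddress) -> None:
        self._map[(volume, lba)] = location

    def resolve(self, volume: str, lba: int) -> Optional[PhysicalAddress]:
        # Returns None when nothing has been allocated for this logical address.
        return self._map.get((volume, lba))


table = AllocationTable()
# Successive logical blocks need not be physically contiguous.
table.link("LU0", 0, PhysicalAddress("disk-7", 4096))
table.link("LU0", 1, PhysicalAddress("disk-2", 128))
```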
The allocation module 105 may be implemented as a centralized module operatively connected to the plurality of storage control devices or may be, at least partly, distributed over a part or all storage control devices. Logical contiguity of successive portions and physical contiguity of the storage location allocated to the portions in the system are not necessarily correlated.
In the present description the term “data unit” is a general term which includes units containing data of a certain size. For example, a data portion, which is one type of data unit (e.g. with a size of 64 kilobytes), can be divided into 128 data units each of 512 bytes (e.g. blocks). Other data units of other sizes, consisting of different numbers of bytes or bits, can be defined as well.
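The division in the example above can be checked with a short sketch. The function name `split_into_units` is hypothetical, and a kilobyte is taken here to be 1024 bytes.

```python
def split_into_units(portion: bytes, unit_size: int = 512) -> list:
    """Divide a data portion into consecutive data units of `unit_size` bytes."""
    if len(portion) % unit_size != 0:
        raise ValueError("portion is not a whole number of units")
    return [portion[i:i + unit_size] for i in range(0, len(portion), unit_size)]


portion = bytes(64 * 1024)          # a 64 kilobyte data portion (all zero bytes)
blocks = split_into_units(portion)  # 65536 / 512 = 128 blocks
```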
Control layer 103 can be further operable to perform a deduplication process on data units. Such data units can be, for example, data units which are received from host computers 1011-n to be stored on storage devices 1041-n. In another example, such data units can be retrieved from storage devices 1041-n by control layer 103, during an inspection process directed at improving storage efficiency and quality.
As explained above, a deduplication process reduces the required storage space by storing on the physical data storage only a single copy of different data units containing identical data. In the deduplication process incoming data units (e.g. in a write request) are uniquely identified and compared with previously stored data units. If an incoming data unit is identical to a previously stored data unit, the data unit is not stored on the physical data storage and only a reference is maintained in an allocation table (which can be managed by allocation module 105) associating the logical address of the incoming data unit with the identical data stored in the physical storage. On the other hand, in case the data within the incoming data unit is not stored on the physical storage, control layer 103 stores the data on the physical storage and updates the allocation table to maintain a reference associating the logical address of the data with its new physical address on the disk.
A common approach in a deduplication process, is to calculate, for an incoming data unit, a hash value (also known as “fingerprint”), using one or more types of hash algorithms and to associate the calculated hash values with the corresponding data unit. The calculated hash-values are compared with hash values which were calculated for other data units, stored on the physical storage, and are thus used to determine whether an identical data unit is already stored on the physical storage.
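The fingerprint-based approach described above could be sketched as follows. This is a minimal in-memory illustration, not the implementation of the disclosed system: SHA-256 is an assumed choice of hash algorithm, the class `DedupStore` and its fields are hypothetical, and the byte-wise comparison after a fingerprint match is an added safeguard against hash collisions that the text does not mention.

```python
import hashlib
from typing import Dict


class DedupStore:
    def __init__(self) -> None:
        self.physical: Dict[int, bytes] = {}     # physical address -> stored data unit
        self.fingerprints: Dict[str, int] = {}   # fingerprint -> physical address
        self.allocation: Dict[int, int] = {}     # logical address -> physical address
        self._next_address = 0

    def write(self, logical_address: int, data_unit: bytes) -> bool:
        """Store a data unit; return True only if new physical space was allocated."""
        fp = hashlib.sha256(data_unit).hexdigest()
        existing = self.fingerprints.get(fp)
        if existing is not None and self.physical[existing] == data_unit:
            # Identical unit already stored: keep only a reference to it.
            self.allocation[logical_address] = existing
            return False
        address = self._next_address
        self._next_address += 1
        self.physical[address] = data_unit
        self.fingerprints[fp] = address
        self.allocation[logical_address] = address
        return True

    def read(self, logical_address: int) -> bytes:
        return self.physical[self.allocation[logical_address]]
```

Writing the same 512-byte unit to two logical addresses then occupies a single physical entry, while both logical addresses remain readable.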
The size of the data unit on which deduplication is implemented can vary. While implementing deduplication on small data units increases the probability of identifying identical data units, and therefore increases the chances of decreasing the required storage space, the smaller the data units, the greater the processing load and the greater the complexity and size of the link configuration which is required in order to link the logical address of data units to the physical address of other identical data units which are already stored on the storage space.
While deduplication can be an efficient method for reducing storage space requirements, when implementing deduplication only on identical copies of data units, similarity between data units is overlooked, even where the difference between the data units is limited to a very small portion of the data. This scenario may occur, for example, where a certain host computer is operable to calculate a hash value of a predefined size (say 8 bytes) for each data unit of a given size (e.g. an 8 kilobyte data unit) and append the extra bytes of the calculated value to the data unit at a predefined location (e.g. at the end of the data unit or at the beginning of the data unit). Such hash values can be calculated for the purpose of source identification or protection against data corruption. If different hash functions are used, or if the hash function which is used for calculating the hash value includes one or more variables which are not dependent solely on the data within the data unit, a different hash value can be calculated for different data units having identical data.
For example, in some implementations a Data Integrity Field (DIF) is appended to data units, for the purpose of preventing data corruption. In general data corruption includes errors in the data which may occur during the transmission of the data, while the data is stored on the disk or while data is being read from the disk or written to the disk. DIF includes a number of extra bytes which are added to a standard data unit and represent one or more metadata elements corresponding to the data unit. The DIF can be calculated, for example, by an application running on host computers 1011-n and sent together with the data to control layer 103 in storage system 102. Control layer 103 can then store the data unit and the DIF tuple on the physical disk.
After the data unit is transmitted over a network (e.g. from host to storage system or vice versa) and before it is written to the disk, or when the data unit is later read from the disk, one or more of the DIF metadata elements can be re-calculated and compared with the previously calculated elements. If a discrepancy is found between the values of one or more of the DIF metadata elements and the newly calculated elements it may indicate that the data has been corrupted.
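The re-calculate-and-compare check described above could be sketched as follows. The layout of the DIF metadata elements is not specified in this passage, so the sketch substitutes a single 4-byte CRC-32 appended to the data as a stand-in metadata element; the functions `append_guard` and `verify_guard` are hypothetical.

```python
import struct
import zlib


def append_guard(data: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the data (stand-in for a DIF metadata element)."""
    return data + struct.pack(">I", zlib.crc32(data))


def verify_guard(unit: bytes) -> bool:
    """Re-calculate the CRC-32 over the data part and compare with the stored value."""
    data, stored = unit[:-4], unit[-4:]
    return struct.pack(">I", zlib.crc32(data)) == stored


unit = append_guard(b"payload" * 64)
# Flip one bit in the first byte to simulate corruption in transit or on disk.
corrupted = bytes([unit[0] ^ 0x01]) + unit[1:]
```

A discrepancy reported by `verify_guard` indicates that the data part may have been corrupted.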
However, different DIF tuples can be calculated for identical data units. For example, the hash function used for calculating the DEST element in a DIF tuple may result in different values when calculated for identical data where each data unit is directed to a different destination. In addition, different applications may calculate a different hash value, identifying the specific application, for identical data units. Accordingly, data units with identical data may be appended with a different DIF tuple.
For the reasons specified above, in a deduplication process, data units which differ from previously stored data units in only a small portion and are otherwise identical would yield different hash values in the deduplication process and therefore would all be stored on the physical storage. Thus, although only a very small part of the entire data is different (e.g. the metadata elements represented by the DIF tuple), the deduplication process would fail to detect the identity between the data units and as a result valuable disk storage space is used for storing redundant data.
As explained in more detail below, control layer 103 can be operable to perform storage optimization on data units by implementing deduplication on data units which differ in part of their data. The different data can include, for example, metadata elements such as the elements of a DIF tuple. To this end control layer 103 can comprise (or otherwise be associated with) a storage optimization (SO) module 107. SO module 107 can be, for example, part of allocation module 105 or otherwise associated therewith. In other cases SO module 107 can be directly associated with or connected to storage system 102, not via control layer 103.
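The effect described above can be illustrated with a short sketch. It assumes, for illustration only, that an 8-byte metadata part is appended at the end of a 512-byte data part and that SHA-256 serves as the deduplication fingerprint; the function and variable names are hypothetical.

```python
import hashlib

METADATA_SIZE = 8  # assumed: metadata part appended at the end of the data unit


def whole_fingerprint(unit: bytes) -> str:
    """Fingerprint over the entire data unit, metadata part included."""
    return hashlib.sha256(unit).hexdigest()


def data_part_fingerprint(unit: bytes) -> str:
    """Fingerprint over the data part only, with the metadata part stripped."""
    return hashlib.sha256(unit[:-METADATA_SIZE]).hexdigest()


data = b"\x5a" * 512
unit_from_app_a = data + b"APP-A\0\0\0"   # same data, metadata written by one application
unit_from_app_b = data + b"APP-B\0\0\0"   # same data, metadata written by another
```

The whole-unit fingerprints of the two units differ, so conventional deduplication stores both, whereas the data-part fingerprints are equal.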
It should be noted that in the context of the subject matter disclosed herein, data units can include data parts and metadata parts (referred to herein also as metadata elements). Metadata parts can include for example auxiliary data which is used for characterizing the data. By way of non-limiting example metadata parts include metadata elements of a DIF tuple (described in more detail above with reference to
Storage system 102 can be configured to receive and manage data according to a prescribed format. Thus, in case storage system 102 is configured to handle data units which include data parts and appended metadata parts, control layer 103 is provided with information in respect of the location and size of the data unit and the location and size of the metadata parts in respect of the data unit. Consider the example illustrated in
In some cases control layer 103 is operable to manage data units which consist of several smaller subunits, where each of the smaller subunits is appended with metadata parts (e.g. a DIF tuple). For example, as mentioned above, control layer 103 can be operable to manage a data portion consisting of 128 blocks. In case each block is of a size of 520 bytes of the format illustrated in
Control layer 103 is operable to receive one or more data units. As mentioned above, the data units can be received from one or more of hosts 1011-n (e.g. in a write request to the storage system) and processed before they are stored in storage devices 1041-n. The data units can also be data units which are already stored in storage devices 1041-n. Such data units can be retrieved from storage devices 1041-n by control layer 103, for example, during an inspection process directed at improving storage efficiency, for example by a deduplication process. An example of such a process is a scrubbing operation. In a scrubbing operation data portions stored in storage devices are inspected in order to identify different errors in the data (e.g. incorrect data, incomplete data, duplicated data, readable or unreadable data etc.). Such errors can then be rectified, thereby maintaining data integrity and correctness. Thus, the operations performed by control layer 103 and storage optimization module 107 as described herein can be performed on data portions which are retrieved from the data storage devices 1041-n specifically for that purpose or which are retrieved as part of a scrubbing process.
In some implementations, once a data unit is received by control layer 103, SO module 107 is operable to execute a deduplication process on the entire data unit. For better clarity of the description, data units which are processed by control layer 103 and more specifically by SO module 107, are referred to herein as “source data units”. SO module 107 is operable to compare the source data unit with data which is stored in the physical storage space, on storage devices 1041-n. In case identical data is already stored on the physical storage space, SO module 107 is operable to update a corresponding allocation table with one or more references associating the logical address (e.g. defined by the LBA range of the data unit) of the source data unit with the physical address of the identical data unit while the data of the source data unit is discarded.
In case SO module 107 does not identify data which is identical to the entire source data unit, SO module 107 is operable to identify data parts and metadata parts within the source data unit and to compare the data parts from the source data unit with data parts which are stored in the physical storage space. In case identical data is found, SO module 107 is operable to update a corresponding allocation table with one or more references associating the logical address of the data parts within the source data unit with the physical address of the identical data, while the data itself is not stored in the physical storage. The metadata parts, on the other hand, can be allocated to a new physical address in the physical storage. The logical address of the metadata parts is associated with the new physical address.
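The flow described so far (whole-unit comparison first, then comparison of the data parts alone) could be sketched as follows. This is a simplified in-memory illustration under stated assumptions, not the implementation of SO module 107: the metadata part is assumed to be the last 8 bytes of the unit, SHA-256 is an assumed fingerprint, a unit with no match is stored as separate data and metadata parts for simplicity, and the class `PartialDedupStore` and all of its fields are hypothetical.

```python
import hashlib
from typing import Dict, Tuple

METADATA_SIZE = 8  # assumed size of the metadata part at the end of each unit


class PartialDedupStore:
    def __init__(self) -> None:
        self.data_area: Dict[int, bytes] = {}       # physical address -> data part
        self.metadata_area: Dict[int, bytes] = {}   # designated area for metadata parts
        self.unit_index: Dict[str, Tuple[int, int]] = {}  # whole-unit fingerprint -> addresses
        self.data_index: Dict[str, int] = {}        # data-part fingerprint -> data address
        self.allocation: Dict[int, Tuple[int, int]] = {}  # logical -> (data, metadata) address
        self._next = 0

    def _allocate(self) -> int:
        address = self._next
        self._next += 1
        return address

    def write(self, logical_address: int, unit: bytes) -> str:
        whole_fp = hashlib.sha256(unit).hexdigest()
        if whole_fp in self.unit_index:
            # Entire unit already stored: reference it and discard the source unit.
            self.allocation[logical_address] = self.unit_index[whole_fp]
            return "identical"
        data, metadata = unit[:-METADATA_SIZE], unit[-METADATA_SIZE:]
        data_fp = hashlib.sha256(data).hexdigest()
        if data_fp in self.data_index:
            # Data part already stored: reference it, store only the metadata part.
            data_addr = self.data_index[data_fp]
            outcome = "partly-identical"
        else:
            data_addr = self._allocate()
            self.data_area[data_addr] = data
            self.data_index[data_fp] = data_addr
            outcome = "new"
        meta_addr = self._allocate()
        self.metadata_area[meta_addr] = metadata
        self.allocation[logical_address] = (data_addr, meta_addr)
        self.unit_index[whole_fp] = (data_addr, meta_addr)
        return outcome

    def read(self, logical_address: int) -> bytes:
        # Combine the shared data part with the unit's own metadata part.
        data_addr, meta_addr = self.allocation[logical_address]
        return self.data_area[data_addr] + self.metadata_area[meta_addr]
```

In this sketch a read request is served by combining the referenced data part with the metadata part held in the designated area, so each logical address still returns its original unit.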
In case SO module 107 does not identify data in the physical storage which is identical either to the data parts or to the metadata parts within the source data unit, SO module 107 is operable to allocate the entire data unit to a new physical address in the physical storage and maintain a reference associating the logical address of the data unit with the new physical address. A more detailed explanation of this process is described below with reference to
As the use of storage optimization by SO module 107 described above consumes processing resources from storage system 102 and therefore may impact the overall performance of the system, there is a tradeoff between storage optimization and overall system performance. Accordingly, the decision whether to activate storage optimization can be dynamically set according to the current performance of the system.
In some implementations, the current free storage space is considered. Thus, for example in case control layer 103 detects that a certain part of the physical storage is occupied with data (e.g. more than 70% is occupied), SO module 107 becomes operable for storage optimization in order to make more storage space available. In another example in case the control layer 103 detects that a certain part of the physical storage is free (e.g. more than 50% is free), storage optimization can be deactivated.
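The two occupancy examples above could be combined into a single rule with hysteresis, sketched below. Treating them as one rule is an assumption made for illustration; the text presents them as separate examples. The function name `update_activation` and its parameters are hypothetical, and the defaults simply reuse the 70% and 50% figures from the examples.

```python
def update_activation(
    active: bool,
    occupied_fraction: float,
    activate_above: float = 0.70,
    deactivate_free_above: float = 0.50,
) -> bool:
    """Return the new activation state of storage optimization."""
    free_fraction = 1.0 - occupied_fraction
    if occupied_fraction > activate_above:
        return True     # storage is filling up: activate optimization
    if free_fraction > deactivate_free_above:
        return False    # plenty of free space: deactivate optimization
    return active       # in between: keep the current state
```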
In some implementations the performance of the processing resources of the system is considered. For example in case the control layer 103 determines that the system is working under heavy processing load, the storage optimization is deactivated in order to minimize the processing resources which are consumed by storage optimization.
In other cases the decision as to whether, and to what extent, storage optimization should be activated is made, while considering both the current available storage space and the current load on the processing resources of the storage system. Thus for example, in case the current processing load is greater than a certain threshold (e.g. 80%) the availability of the storage space is determined. In case the available storage space is greater than a certain threshold (e.g. greater than 50%) storage optimization is deactivated, otherwise storage optimization is maintained active.
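By way of non-limiting illustration only, the combined decision described above can be sketched as follows. The function name `should_optimize` and its parameters are hypothetical; the thresholds are the example values given above (80% load, 50% free space). The behavior when the processing load is at or below the threshold is not specified above, and the sketch assumes that storage optimization simply remains active in that case.

```python
def should_optimize(cpu_load: float, free_fraction: float,
                    load_threshold: float = 0.80,
                    free_threshold: float = 0.50) -> bool:
    """Return True if storage optimization should be kept active.

    cpu_load and free_fraction are fractions in the range 0.0 to 1.0.
    """
    if cpu_load > load_threshold:
        # Heavy load: only give up optimization when space is plentiful.
        return free_fraction <= free_threshold
    # Assumption: under normal load, optimization stays active.
    return True
```

For instance, a load of 90% with 60% free space would deactivate storage optimization, whereas the same load with only 30% free space would keep it active.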
According to the implementation which is being utilized, control layer 103 (e.g. by SO module 107) can be operable to determine the storage space availability and/or the load on the system's processing resources, and based on one of these factors or a combination of the two, determine whether to operate the storage optimization of the disclosed subject matter.
The decision whether to activate storage optimization can be based on additional criteria other than the system performance and storage space occupancy. U.S. provisional No. 61/391,657, which is incorporated herein by reference, describes a system and method which is operable inter alia, for analyzing frequency of accessing (otherwise referred to as “frequency of addressing”) certain data portions and characterizing each data portion in accordance with frequency of accessing thereof. The analyzed data unit can, for example, be stored at a specific location within the physical storage, according to the determined frequency. As explained in 61/391,657 this is done, inter alia, for improving the performance of a storage system. Such analysis can be used in the context of the presently disclosed subject matter for determining whether to activate storage optimization. For example, control layer 103 can be operable to determine the accessing frequency of different data units and submit to storage optimization, data units which are characterized by an accessing frequency which is below a predefined value. Another criterion which can be considered is the source host or port of the data portion. Control layer 103 can be configured to limit storage optimization only to data units which arrive from a specific one or more hosts.
As explained above, in some cases once a data unit is received by control layer 103, initially a decision is made as to whether, and to what extent, storage optimization should be activated. As described above with reference to
In case it is determined not to proceed with storage optimization, the process can turn to the processing of a different data unit, or the process can be terminated. In case it is determined to operate storage optimization, the data unit is analyzed and data parts and metadata parts within the source data unit are identified (stage 210). SO module 107 can utilize information in respect of the format of the data unit, including for example, specific information in respect of the size and location of data parts and metadata parts within the data unit, in order to identify data and metadata.
Once the data part within the source data unit is identified and differentiated from the metadata parts in the source data unit, the data part is compared with data which is stored on the physical storage (stage 220). A hash value (e.g. CRC) can be calculated for the data part within the data unit and compared with other hash values which were calculated for data parts of other data units which are stored on the physical storage. To this end, a hash value can be calculated (e.g. by control layer 103) for the data part of each of the data units stored in the storage system. Thus, for example, in case each data unit of 520 bytes comprises a data part of 512 bytes and a metadata part of 8 bytes, a hash value can be calculated for the 512 bytes of the data part and stored in association with the corresponding data unit. Differentiating between the data part and metadata part within a data unit and comparing only the data part of a data unit with data parts of other data units stored on the storage space makes it possible to overcome the problem which arises when attempting to execute a deduplication process on different data units which comprise identical data parts but different metadata parts. As the data parts are compared separately from the metadata parts, the identity between such data units can be identified.
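By way of non-limiting illustration only, the comparison of data parts described above can be sketched as follows, using the 520-byte example (512 bytes of data and 8 bytes of metadata) and CRC-32 as the hash value. The helper names `split_block` and `data_hash` are hypothetical, and the sketch assumes that the data part precedes the metadata part within the block; the actual layout depends on the format of the data unit.

```python
import zlib
from typing import Tuple

BLOCK_SIZE = 520   # whole data unit, per the example above
DATA_SIZE = 512    # data part
META_SIZE = 8      # metadata part

def split_block(block: bytes) -> Tuple[bytes, bytes]:
    """Separate a block into (data part, metadata part).

    Assumption: the data part comes first and the metadata part trails it.
    """
    if len(block) != BLOCK_SIZE:
        raise ValueError("expected a %d-byte block" % BLOCK_SIZE)
    return block[:DATA_SIZE], block[DATA_SIZE:]

def data_hash(block: bytes) -> int:
    """CRC-32 computed over the data part only, ignoring the metadata."""
    data, _meta = split_block(block)
    return zlib.crc32(data)

# Two blocks with identical data parts but different metadata parts.
block_a = b"A" * DATA_SIZE + b"meta0001"
block_b = b"A" * DATA_SIZE + b"meta0002"
same_data = data_hash(block_a) == data_hash(block_b)   # True
```

The two blocks differ as whole units, yet their data parts hash to the same value, which is the case that deduplication requiring complete identity would miss.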
In case no identity is found between the data parts of the source data unit and the data part of data units which are stored in the physical data storage, SO module 107 can be operable to allocate the data unit (including both the data part and metadata part) to a new physical address in the physical storage and associate (e.g. in the allocation table) the logical address of the data unit with the new physical address (stage 240). The hash value that was calculated for the data part of the source data unit for the purpose of comparison can be stored in association with the data unit to enable the comparison of the data part with other data parts of other data units.
In case identical data parts of a data unit which is already stored on the physical storage are identified, SO module 107 can be operable to associate the logical address of the data parts within the source data unit with the physical address of the identified data, instead of allocating a new storage space for the data part (stage 250). SO module 107 can be further made operable to allocate the metadata part from the source data unit to a new physical address in the physical storage and associate the logical address of the metadata part with the new physical address (stage 260). The data part of the source data unit can then be discarded (stage 270). In case the source data unit is retrieved from data storage device 1041-n the source data unit is already allocated to the physical storage. Thus the source data unit can be deleted from the physical storage.
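By way of non-limiting illustration only, stages 220 to 270 can be sketched with in-memory dictionaries standing in for the physical storage and the allocation table. The class `DedupStore` and all of its members are hypothetical. The byte-for-byte check performed after a hash match is an added safeguard against hash collisions and is an assumption of the sketch, as is the layout in which the data part precedes the metadata part.

```python
import zlib
from typing import Dict, Optional, Tuple

DATA_SIZE = 512  # data part of a 520-byte data unit

class DedupStore:
    def __init__(self) -> None:
        self.physical: Dict[int, bytes] = {}    # physical address -> bytes
        self.hash_index: Dict[int, int] = {}    # data-part CRC -> address
        # logical address -> (address of unit holding the data part,
        #                     address of separately stored metadata or None)
        self.allocation: Dict[int, Tuple[int, Optional[int]]] = {}
        self._next_addr = 0

    def _allocate(self, payload: bytes) -> int:
        addr = self._next_addr
        self.physical[addr] = payload
        self._next_addr += 1
        return addr

    def write(self, logical: int, unit: bytes) -> bool:
        """Store a data unit; return True if its data part was deduplicated."""
        data, meta = unit[:DATA_SIZE], unit[DATA_SIZE:]
        digest = zlib.crc32(data)                        # stage 220
        match = self.hash_index.get(digest)
        if match is not None and self.physical[match][:DATA_SIZE] == data:
            meta_addr = self._allocate(meta)             # stage 260
            self.allocation[logical] = (match, meta_addr)  # stage 250
            return True                                  # data part dropped (270)
        addr = self._allocate(unit)                      # stage 240
        self.hash_index.setdefault(digest, addr)
        self.allocation[logical] = (addr, None)
        return False
```

In this sketch a second data unit with the same data part and different metadata consumes only 8 additional bytes of the simulated physical storage.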
Furthermore, in such a case, instead of discarding the data from the source data unit, the logical address of the data which is identical to the data from the source data unit can be associated with the physical address of the source data unit and then this data can be removed from the physical storage instead of the data from the source data unit. This variation is also applicable in respect of the examples illustrated with reference to
Metadata parts can be allocated to an area in the physical storage specifically designated for storing metadata parts of source data units comprising data parts with identical copies which are already stored on the storage space. In some implementations where control layer 103 is adapted to manage its storage space with data units of predefined size, the designated area is divided into data units of that size (referred to herein as a “designated data unit”). For example, as mentioned above, control layer 103 can be adapted to manage its storage space (e.g. write to the disk and read from the disk) with data portions comprising 128 blocks, each block being 520 bytes long, amounting to a data unit of 66560 bytes. Such a data unit can store more than 3000 metadata parts, each 8 bytes long, from different data units (66560/8 = 8320 at most).
Accordingly, the metadata part of a source data unit can be allocated to a designated data unit (e.g. of the size of 66560 bytes) and in order to link between a given metadata part (e.g. 8 bytes long), allocated to a designated unit in the physical storage, and its corresponding source data unit, the physical address of the designated data unit and the location of the metadata part within the designated data unit (e.g. the offset of the metadata part in respect of the designated data unit) are stored in association with the logical address of the source data unit. This enables storage system 102 to utilize data units of the same predetermined size for managing the storage of the smaller metadata parts, originating from one or more source data units.
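By way of non-limiting illustration only, the packing of 8-byte metadata parts into designated data units of 66560 bytes, and the recording of the unit address and offset for each metadata part, can be sketched as follows. The class `DesignatedUnits` is hypothetical, and the index of a unit in the list stands in for the physical address of the designated data unit.

```python
from typing import List, Tuple

UNIT_SIZE = 128 * 520                     # 66560 bytes, per the example above
META_SIZE = 8
SLOTS_PER_UNIT = UNIT_SIZE // META_SIZE   # 8320 metadata parts per unit

class DesignatedUnits:
    def __init__(self) -> None:
        self.units: List[bytearray] = []
        self._used = 0   # slots already filled in the most recent unit

    def store(self, meta: bytes) -> Tuple[int, int]:
        """Append a metadata part; return (unit index, byte offset)."""
        if len(meta) != META_SIZE:
            raise ValueError("metadata part must be %d bytes" % META_SIZE)
        if not self.units or self._used == SLOTS_PER_UNIT:
            self.units.append(bytearray(UNIT_SIZE))   # open a new unit
            self._used = 0
        unit_index = len(self.units) - 1
        offset = self._used * META_SIZE
        self.units[unit_index][offset:offset + META_SIZE] = meta
        self._used += 1
        return unit_index, offset

    def load(self, unit_index: int, offset: int) -> bytes:
        """Fetch a metadata part from its recorded location."""
        return bytes(self.units[unit_index][offset:offset + META_SIZE])
```

The pair returned by `store` corresponds to the information that would be kept in association with the logical address of the source data unit.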
In some implementations, different metadata parts originating from different source data units are directed and stored in specific designated data units based on various characteristics of the source data unit. For example, in case data units are sorted in accordance with frequency of accessing thereof (as mentioned above with reference to U.S. provisional No. 61/391,657) data units which are accessed very often (e.g. based on a predefined threshold) can be stored in one or more specific designated data units allocated to the physical storage, while data units which are accessed less often can be stored in other designated data units.
In another example, different designated data units can be determined for each host, such that data units originating from one host are stored in a specific (one or more) designated data units allocated to the physical storage, while data units originating from another host are stored in a different (one or more) designated data units.
As described above, storage system 102 can be configured to manage data units of a predefined size, such as for example a data portion 66560 bytes long. These data portions can be divided into smaller subunits, for example, 128 data blocks, each 520 bytes long. The information which is received by storage system 102 (e.g. from host computers 1011-n) is divided into data units of the prescribed size and these data portions are managed by the system. In some cases the subunits (e.g. blocks) within the longer data portion can comprise a data part and a metadata part, as illustrated in
In the first stage (stage 410), the entire data portion is compared with other data portions which are stored in the physical storage. To this end a hash value corresponding to the entire data portion (including all data and metadata of all the subunits) can be calculated and compared with hash values corresponding to other data portions stored in the physical storage. In case an identical data portion is found in the physical storage space the operation continues to stage 420, which includes associating the logical address of the data portion with the physical address of the identified data portion. As previously described, the link between the logical address of the source data portion and the physical address of the identified identical data portion can be recorded in an allocation table. The source data portion can then be discarded.
In case an identical data portion is not found in the physical storage space the process proceeds to determine whether similarity exists between the data part within the data portion and the data parts of other data portions which are stored in the physical storage. To this end, the data parts within the source data portion are separated from the metadata parts within the source data portion. This can be accomplished based on information which is made available to control layer 103 specifying the layout of the data within the data portion. For example, in case a data portion comprises 128 blocks of data, each block characterized by the format depicted in
Once separated, all the data parts can be concatenated into one unit of data, combining all data parts within the source data portion. A hash value corresponding to the concatenated data parts can be calculated and used for comparing with other hash values corresponding to the data parts of other data portions stored within the physical storage. Hash values corresponding to the data parts within data portions which are stored in the physical storage can be calculated in respect of each data portion. In some implementations such hash values can be calculated for example when a data portion is first introduced to storage system 102 and can then be stored in association with the data portion.
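By way of non-limiting illustration only, the separation and concatenation of the data parts of a data portion, and the calculation of a single hash value over them, can be sketched as follows. The function names are hypothetical, CRC-32 is used as the example hash, and the sketch assumes 128 blocks of 520 bytes in which the 512-byte data part precedes the 8-byte metadata part.

```python
import zlib

BLOCK_SIZE = 520
DATA_SIZE = 512
BLOCKS_PER_PORTION = 128
PORTION_SIZE = BLOCK_SIZE * BLOCKS_PER_PORTION   # 66560 bytes

def portion_data_hash(portion: bytes) -> int:
    """CRC-32 over the concatenated data parts of all blocks in a portion."""
    if len(portion) != PORTION_SIZE:
        raise ValueError("expected a %d-byte data portion" % PORTION_SIZE)
    data_parts = [portion[start:start + DATA_SIZE]
                  for start in range(0, PORTION_SIZE, BLOCK_SIZE)]
    return zlib.crc32(b"".join(data_parts))

def make_portion(fill: bytes, first_tag: int) -> bytes:
    """Illustrative helper: build a portion whose blocks share the same
    data byte but carry distinct 8-byte metadata tags."""
    return b"".join(fill * DATA_SIZE + (first_tag + i).to_bytes(8, "big")
                    for i in range(BLOCKS_PER_PORTION))

portion_1 = make_portion(b"x", 0)
portion_3 = make_portion(b"x", 1000)
data_parts_match = portion_data_hash(portion_1) == portion_data_hash(portion_3)
```

Here the two portions differ as whole portions, so a hash over the entire portion would not be expected to match, while the hash over the concatenated data parts is the same for both.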
In case a data portion having identical data parts is found within the physical storage the process proceeds to stage 450 where the data parts within the source data portion are associated (e.g. in an allocation table) with the physical address of the corresponding identical data parts of the identified data portion. The metadata parts within the source data portion are allocated to a new physical address in the physical storage and the logical address of the metadata is associated with the new physical address (stage 455). The data parts within the source data portion can be discarded (stage 460).
As explained before, in case storage system 102 is configured to manage data of predefined size (e.g. data portion) the metadata parts from the source data portion are allocated to a designated data portion. In order to link between a given metadata part (e.g. 8 bytes long), allocated to a designated area in the physical storage, and its source data portion, the physical address of the designated data unit and the location of the metadata part within the designated data unit (e.g. the offset of the metadata part in respect of the designated data unit) are stored in association with the logical address of the source data portion.
In case no data portion with identical data parts is found within the physical storage space, the source data portion can be divided into its smaller subunits (e.g. 520 byte long blocks) and each subunit can then be processed in accordance with the operations illustrated above with reference to
Data portion 1 represents a source data portion, which may have been, for example, received from a host to be written to the physical storage space in storage system 102. Data portion 1 includes data units (e.g. blocks) each including data parts and metadata parts. Data portion 3 is stored within the physical storage space of storage system 102. As explained above with reference to
Once it is identified that the data parts within data portion 1 are identical to the data parts within data portion 3, the logical addresses of the data parts within data portion 1 are associated with (e.g. linked to) the physical addresses of the corresponding data parts within data portion 3. This is schematically illustrated in
The metadata parts within data portion 1 are allocated to a designated data portion (data portion 2) and the logical addresses of the metadata parts are associated with the new physical addresses which were allocated in the designated data portion 2. This is schematically illustrated in
In stage 630 it is determined whether at least part of the metadata of the requested data units is allocated to a designated data unit. An allocation table which is managed by allocation module 105 can include an indication that makes it possible to associate the logical address of a requested data unit with a corresponding designated data unit allocated to the physical storage. In some implementations, the allocation table specifies information indicating the logical address of the metadata parts (making it possible to locate the metadata parts within the original source data units), the location of the corresponding designated data unit within the physical storage, and the location of the metadata parts within the designated data units. Thus, metadata parts of a requested data portion which are allocated to a designated data portion in the physical storage can be located and retrieved from the physical storage based on this information.
In case it is determined that no part of the requested data unit is allocated to a designated data unit in the physical storage, the requested data unit is retrieved from the physical storage (stage 640) and provided to the requesting host (stage 690). In some cases different data subunits (e.g. blocks) of a requested data unit can be linked to corresponding subunits allocated to different locations in the physical storage. In such cases control layer 103 (e.g. with the help of SO module 107) can be configured to identify the different data subunits in the different locations, retrieve the data from the different locations and combine all subunits together in order to reconstruct the requested data unit.
In case it is determined that at least part of the metadata parts of the requested data unit has been stored by a storage optimization process disclosed herein in a designated data unit allocated to the physical storage, the data parts of the requested data units are retrieved from the physical storage. To this end the allocation table is utilized for obtaining the location of the different data parts of the requested data unit and for retrieving these data parts from the physical storage based on this information (stage 650).
The location of the designated data unit containing the metadata parts of the requested data unit is also obtained from the allocation table, as well as the location of the metadata parts within the designated data unit. This information is then used for retrieving the metadata parts of the requested data unit from the corresponding (one or more) designated data units (stage 660).
The data parts and metadata parts can then be concatenated according to their logical address, which can also be retrieved from the allocation table, thereby reconstructing the requested data unit (stage 670). The reconstructed data unit can be provided to the requesting host (stage 690).
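By way of non-limiting illustration only, the read process of stages 630 to 670 can be sketched as follows for a single 520-byte data unit. The names `Entry` and `read_unit` are hypothetical, dictionaries stand in for the allocation table and the physical storage, and the sketch assumes that the data part precedes the metadata part within a data unit.

```python
from typing import Dict, NamedTuple, Optional, Tuple

DATA_SIZE = 512
META_SIZE = 8

class Entry(NamedTuple):
    data_addr: int                        # unit holding the data part
    meta_loc: Optional[Tuple[int, int]]   # (designated unit address, offset)
                                          # or None if metadata is stored inline

def read_unit(logical: int,
              table: Dict[int, Entry],
              physical: Dict[int, bytes]) -> bytes:
    """Reconstruct the data unit stored under a logical address."""
    entry = table[logical]
    stored = physical[entry.data_addr]
    if entry.meta_loc is None:            # stage 630: nothing in a designated unit
        return stored                     # stage 640
    data = stored[:DATA_SIZE]             # stage 650
    unit_addr, offset = entry.meta_loc
    meta = physical[unit_addr][offset:offset + META_SIZE]   # stage 660
    return data + meta                    # stage 670

# Address 0 holds a complete data unit; address 1 is a designated data unit
# (shortened here) holding the metadata of a deduplicated unit at offset 8.
physical = {0: b"D" * DATA_SIZE + b"origmeta",
            1: bytes(8) + b"copymeta" + bytes(8)}
table = {10: Entry(0, None), 11: Entry(0, (1, 8))}
```

Reading logical address 11 in this sketch yields the shared data part followed by the metadata part fetched from the designated data unit, while logical address 10 is returned directly from its physical address.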
It will be understood that the system according to the disclosed subject matter may be a suitably programmed computer. Likewise, the disclosed subject matter contemplates a computer program being readable by a computer for executing the method of the invention. The disclosed subject matter further contemplates a machine-readable memory tangibly embodying a program of instructions executable by the machine for executing the method of the invention.
Those skilled in the art will readily appreciate that various modifications and changes can be applied to the embodiments of the invention as hereinbefore described without departing from its scope, defined in and by the claims associated with the present invention.