This specification is directed to computer systems and more specifically to disk space partitioning.
Conventional disk space allocation involves partitioning. Traditionally, a storage array will divide all its disks into partitions and then combine partitions from one or more disks to construct virtual disks (e.g., Logical Unit Numbers or LUNs). The partitions may be combined via striping, concatenation, mirroring, or a combination of these mechanisms. These methods have at least two disadvantages.
First, performance across a disk is not uniform. Blocks or partitions near the outer rim of a disk perform significantly better than those nearer the center, because the outer rim moves past the head faster than the inner rim at the same rotational speed. This performance difference is commonly ignored because it is complicated to take into consideration. Software on the host does not know where its allocated space (e.g., a virtual disk) is located on the physical disk, and it cannot assume that higher disk addresses are closer to the center of the disk (e.g., if there are several partitions on a disk). Thus, this performance difference is not well exploited.
Second, a file system will usually consume the lower disk addresses first, leaving the free space at higher addresses. This works well if the file system is using the whole physical disk. However, if there are several partitions on a physical disk assigned to different hosts, then the allocated locations are separated by gaps of unused space. This results in longer seeks as the I/O is serviced for the different partitions.
Assigning whole physical disks to hosts is frequently impractical because a whole disk is too large a unit of storage. Thus, partitioning disks becomes necessary, but causes the above problems. A new method of disk space partitioning is needed to solve the problems discussed above.
Embodiments herein describe stippling, a method of dividing disk space that manages disk space and performance. In one embodiment, stippling may include setting stippling parameters and configuring stipples. In another embodiment, stippling may include dividing a disk into equal portion spaces, grouping the equal portion spaces into equal size sets, and allocating a portion of each set to each of a plurality of stipples. In yet another embodiment, a method of managing disk performance may include interleaving stipples.
Traditionally, storage (e.g., a disk) is divided into a relatively small number of contiguous partitions. Stippled storage is divided into a relatively small number of interleaved portions referred to herein as stipples. Each stipple is made of a plurality of relatively small and interleaved portions spread across the storage or disk.
One embodiment of a method of stippling a disk is represented in process 125 of
Another embodiment of a method of stippling a disk is represented in process 100 of
Sizing and Grouping
As mentioned above, a stippled disk can be divided into a significant number of small equal size disk portions. These portions can be referred to as “strokes”. Process 250 shown in
In some embodiments, the stroke set size of a disk can be changed without remapping existing stipples. This is accomplished by multiplying the stroke set size by an integer and/or evenly dividing the stroke size by an integer. The stipple member set can be changed to add more members to keep the mapping the same. For example, to increase an example stroke set size of 16, multiply by an integer value of 2 to get a stroke set size of 32. Note that the strokes within each stroke set have similar performance characteristics.
In these embodiments, a portion of each stroke set is allocated to each stipple. Therefore, the size of a stroke set can have an effect on the granularity of the stipples. That is, the smaller the stroke set size, the fewer the number of potential stipples; the larger the stroke set size, the larger the number of potential stipples.
Interleaving
Note that Disk 2 in
Member Numbers
Each stroke in a stroke set can have a member number from 0 to (stroke set size - 1). For example, if the stroke set size is 8 strokes, the member numbers can range from 0 to 7.
Consider the example shown in
Stipple A and Stipple B cannot appear on the same disk since they overlap. However, Stipple C, with a bit mask of 0x55 (0101 0101) or member set array {0, 2, 4, 6}, and Stipple D, with a bit mask of 0xAA (1010 1010) or member set array {1, 3, 5, 7}, shown in stroke sets 305 and 306 respectively, interlace on every other stroke and split the disk in half. Note that it can be more efficient to have the stroke members of a stipple adjacent to each other, as in Stipple B.
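The bit mask and the member set array carry the same information, and overlap between two stipples can be tested directly on the masks. The following Python sketch illustrates this; the function names are illustrative and do not come from this specification, and bit i of the mask is assumed to stand for member number i, which is consistent with the 0x55 and 0xAA examples above.

```python
def mask_to_members(mask, stroke_set_size):
    """Return the sorted member numbers whose bits are set in mask."""
    return [m for m in range(stroke_set_size) if mask & (1 << m)]


def members_to_mask(members):
    """Return the bit mask with one bit set per member number."""
    mask = 0
    for m in members:
        mask |= 1 << m
    return mask


def stipples_overlap(mask_a, mask_b):
    """Two stipples collide if any stroke set member belongs to both."""
    return (mask_a & mask_b) != 0


stipple_c = 0x55  # 0101 0101
stipple_d = 0xAA  # 1010 1010
print(mask_to_members(stipple_c, 8))           # [0, 2, 4, 6]
print(mask_to_members(stipple_d, 8))           # [1, 3, 5, 7]
print(stipples_overlap(stipple_c, stipple_d))  # False
```

Because Stipple C and Stipple D share no member, their masks have no bit in common and both can be placed on the same disk.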
Another example disk shown in
Recall that, in some embodiments, the stroke set size of a disk can be changed without remapping existing stipples. For example, if the size of stroke set 313 in
The stroke size parameter can also be divided evenly by an integer value. This decrease in stroke size causes an increase in the stroke set size. For example, a stroke set 315 has a stroke size of 4096 blocks, a stroke set size of 4, and a stipple 4a using the second stroke of the stroke set. If the stroke size is divided by 2 to make a stroke size of 2048 blocks, the stroke set size is increased to 8 (doubled) so that the stipple ratios in the stroke sets, or stipple proportions, are maintained. The new stipple, Stipple 4b, includes the third and fourth strokes of the new stroke set, as shown in stroke set 316.
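Both ways of changing the stroke set size amount to rewriting the member set so that the same disk blocks stay in the stipple. The Python sketch below shows one way this could be done. The function names are hypothetical, and the rule used in widen_stroke_set (a new stroke set spans several consecutive old stroke sets) is an assumption inferred from the description, not a formula given in this specification.

```python
def split_strokes(members, factor):
    """Member set after dividing the stroke size by `factor`.

    Each old stroke becomes `factor` consecutive smaller strokes, so old
    member m becomes members m*factor .. m*factor + factor - 1.
    """
    return [m * factor + i for m in members for i in range(factor)]


def widen_stroke_set(members, old_set_size, factor):
    """Member set after multiplying the stroke set size by `factor`.

    Assumption: a new stroke set spans `factor` consecutive old stroke
    sets, so old member m reappears at m, m + old_set_size, and so on.
    """
    return sorted(m + k * old_set_size for m in members for k in range(factor))


# Stipple 4a: second stroke (member 1) of a 4-stroke set, 4096-block strokes.
print(split_strokes([1], 2))        # [2, 3]
print(widen_stroke_set([1], 4, 2))  # [1, 5]
```

The first call reproduces the Stipple 4a to Stipple 4b example: member 1 of a 4-stroke set becomes the third and fourth strokes (members 2 and 3) of the 8-stroke set.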
Configuration
Process 350 configures the stipples by assigning stroke set members to each stipple and is illustrated in
Converting the Stipple Block
To read the data in a stippled disk, the stipple information can be converted to an actual disk block number to allow seek operations to locate the data.
For example, to illustrate the concept further,
The stroke set members are assigned to stipples. In this example, stipple 1 includes all the 0 stroke set member numbers of the stroke sets, represented as member set array {0}. These member set arrays are reflected in the stipple members assigned in the column of stroke sets 450. For example, stroke set member number 0 of stroke set 0 (471) is assigned to stipple 1, stroke set member number 0 of stroke set 1 (472) is assigned to stipple 1, and so on. Stipple 2 includes all the stroke set member numbers 1 and 3, represented as member set array {1, 3}. For example, stroke set member numbers 1 and 3 of stroke set 0 (471) are assigned to stipple 2, stroke set member numbers 1 and 3 of stroke set 1 (472) are assigned to stipple 2, and so on.
The strokes in each stipple can be labeled. For example, the first stroke of stipple 1 in stroke set 451 can be labeled stipple 1, stroke 0 (410). The second stroke of stipple 1 in stroke set 452 is labeled stipple 1, stroke 1 (411), and so on from stroke sets 453 to 458. The first, second, third, and fourth strokes of stipple 2 can be labeled stipple 2, stroke 0 (420), stipple 2, stroke 1 (421), stipple 2, stroke 2 (422), and stipple 2, stroke 3 (423), respectively. Note that the first and second strokes of stipple 2 are in stroke set 451, while the third and fourth strokes of stipple 2 are in stroke set 452, and so on from stroke sets 453 to 458.
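The labeling described above can be reproduced with a few lines of Python. This is only a sketch: the stroke set size of 4 is taken from the parameters given for stipple 2 in Example 2, the function name is illustrative, and member 2 is left unassigned because the text does not say which stipple, if any, owns it.

```python
STROKE_SET_SIZE = 4             # assumed: 4 strokes per stroke set
STIPPLES = {1: [0], 2: [1, 3]}  # member set arrays from the text


def label_disk_stroke(disk_stroke):
    """Return (stipple, stipple stroke number), or None if unassigned."""
    stroke_set, member = divmod(disk_stroke, STROKE_SET_SIZE)
    for stipple, members in STIPPLES.items():
        if member in members:
            return stipple, stroke_set * len(members) + members.index(member)
    return None


for disk_stroke in range(8):    # the first two stroke sets
    print(disk_stroke, label_disk_stroke(disk_stroke))
```

For the first two stroke sets this places stipple 2, strokes 0 and 1 in the first stroke set and stipple 2, strokes 2 and 3 in the second, matching the labels above.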
Each stroke set can be numbered.
Recall that the unit of measure for stroke size is disk blocks. Each of the virtual stipple blocks such as 401 in
To convert a Stipple Block Number into a Disk Block Number using arithmetic equations, the member set of the stipple is represented as an array of indexes rather than as a bit mask. For example, if there are 8 strokes in a stroke set, then the Stroke Set Size is 8. The Member Set Array for the example mask 0x32 (0011 0010) is {1, 4, 5}. The Member Set Size in this example is 3, since 3 strokes of the stroke set are part of this stipple. The Stroke Size in this example is 2048 blocks. These variables are defined in process action 482 of
In process action 484, the Stipple Stroke Number is calculated by dividing the Stipple Block Number by the Stroke Size (in blocks) and discarding the remainder. In process action 486, the Stroke Block Offset is obtained as the remainder of dividing the Stipple Block Number by the Stroke Size. The “%” sign denotes the modulo operator, which calculates the remainder. Process action 488 calculates the Stroke Set Number as the integer quotient of the Stipple Stroke Number and the Member Set Size determined in process action 482. In process action 490, the Member Set Index is calculated as the remainder of the Stipple Stroke Number divided by the Member Set Size. Process action 492 calculates the Stroke Set Member. The Stroke Set Member is a stroke set member number, whereas the Member Set Index is a positional number referring to the first (0), second (1), third (2), etc. member of the stipple within each stroke set. For example, if the Member Set Array is {0, 2, 4, 6}, then a Member Set Index of 3 points to the fourth stroke set member number, counting from the lowest member. In this example, the fourth stroke set member number is 6. Process action 494 uses the Stroke Set Member calculated in process action 492 to calculate the Disk Stroke Number. Process action 496 uses the Disk Stroke Number to calculate the Disk Block Number.
The following two examples illustrate conversion of a stipple block into a disk block.
Example 1 shows how stipple block 1,000,000 of the above example stipple would be mapped to a disk block. In process action 482 the inputs are set as follows.
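The sequence of process actions can be sketched as a single Python function. The function and variable names are illustrative. The formulas used for process actions 494 and 496 are not spelled out in the text above and are an assumption here; they were chosen because they reproduce the results stated for the two examples in this section.

```python
def stipple_to_disk_block(stipple_block, stroke_size, stroke_set_size, member_set):
    """Map a block number within a stipple to a block number on the disk."""
    member_set_size = len(member_set)                      # action 482
    stipple_stroke = stipple_block // stroke_size          # action 484
    stroke_block_offset = stipple_block % stroke_size      # action 486
    stroke_set_number = stipple_stroke // member_set_size  # action 488
    member_set_index = stipple_stroke % member_set_size    # action 490
    stroke_set_member = member_set[member_set_index]       # action 492
    # Assumed formulas for actions 494 and 496 (not given explicitly above):
    disk_stroke = stroke_set_number * stroke_set_size + stroke_set_member
    return disk_stroke * stroke_size + stroke_block_offset


print(stipple_to_disk_block(1_000_000, 2048, 8, [1, 4, 5]))  # 2665024
print(stipple_to_disk_block(25_000, 2048, 4, [1, 3]))        # 51624
```

Only integer division, remainder, and one array lookup are needed, so the conversion is cheap enough to perform on every I/O request.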
Stroke Size is 2048 blocks (one megabyte of 512-byte sectors).
Stroke Set Size is 8 strokes (the disk is divided into stroke sets of 8 strokes each).
Member Set Size is 3 (this stipple uses 3 strokes of each stroke set, or 3/8 of the disk).
Member Set Mask is 0x32 (this identifies which strokes are used in each stroke set).
Member Set Array is {1, 4, 5} (a different representation of the information in the mask).
Stipple Block Number is 1,000,000 (the stipple block to be mapped to a disk block).
The following chart details the process action in process 400, and the calculation performed at that process action for this example.
From these calculations, block 1,000,000 of the stipple maps to block 2,665,024 on the disk. Since the stipple consumes 3/8 of the disk, it makes sense that the disk block number is close to 8/3 times as large as the stipple block number.
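The arithmetic behind this result can be traced step by step. The following is a reconstruction from the process actions described earlier, not a reproduction of the original chart; the last two formulas (Disk Stroke Number and Disk Block Number) are inferred, and they yield the disk block number stated above.

```latex
\begin{align*}
\text{Stipple Stroke Number} &= \lfloor 1{,}000{,}000 / 2048 \rfloor = 488 \\
\text{Stroke Block Offset}   &= 1{,}000{,}000 \bmod 2048 = 576 \\
\text{Stroke Set Number}     &= \lfloor 488 / 3 \rfloor = 162 \\
\text{Member Set Index}      &= 488 \bmod 3 = 2 \\
\text{Stroke Set Member}     &= \{1, 4, 5\}[2] = 5 \\
\text{Disk Stroke Number}    &= 162 \times 8 + 5 = 1301 \\
\text{Disk Block Number}     &= 1301 \times 2048 + 576 = 2{,}665{,}024
\end{align*}
```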
Example 2 shows how block number 25000 in stipple number 2, illustrated by element 459 in
Stroke Size is 2048 blocks (one megabyte of 512-byte sectors).
Stroke Set Size is 4 strokes (the disk is divided into stroke sets of 4 strokes each).
Member Set Size is 2 (stipple 2 uses 2 strokes of each stroke set, or 1/2 of the disk).
Member Set Mask is 0xA (binary 1010; this identifies which strokes are used in each stroke set).
Member Set Array is {1, 3} (a different representation of the information in the mask).
Stipple Block Number is 25,000 (the stipple block to be mapped to a disk block).
The following chart details the process action in process 400, and the calculation performed at that process action for this example.
From these calculations, block number 25,000 of stipple number 2 maps to disk block 51,624. Note that process action 494 calculates that stipple 2, block 25,000 corresponds to a Disk Stroke Number of 25 (481). Since the stipple consumes 1/2 of the disk, it makes sense that the disk block number is close to 2 times as large as the stipple block number.
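As with the first example, the intermediate values can be reconstructed from the process actions. This is a sketch rather than the original chart; the formulas for the last two steps are inferred, and they agree with the Disk Stroke Number of 25 and the disk block number given above.

```latex
\begin{align*}
\text{Stipple Stroke Number} &= \lfloor 25{,}000 / 2048 \rfloor = 12 \\
\text{Stroke Block Offset}   &= 25{,}000 \bmod 2048 = 424 \\
\text{Stroke Set Number}     &= \lfloor 12 / 2 \rfloor = 6 \\
\text{Member Set Index}      &= 12 \bmod 2 = 0 \\
\text{Stroke Set Member}     &= \{1, 3\}[0] = 1 \\
\text{Disk Stroke Number}    &= 6 \times 4 + 1 = 25 \\
\text{Disk Block Number}     &= 25 \times 2048 + 424 = 51{,}624
\end{align*}
```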
Stippling and Partitions
Stipples can be mirrored by stipples on other disks. A disk may be both stippled and partitioned. Either a stipple can be partitioned (most likely by a host), or a partition can be stippled. Stippling provides a method of dividing a disk into portions that can be treated like virtual whole disks. This new methodology can be useful for a storage array that is presenting portions of a disk as a virtual disk to different hosts.
Stippling and Performance
Stippling results in the set of allocated spaces (e.g., virtual disks) being evenly spread across the storage area, or disk. The host that uses the virtual disk can assume that the lower block numbers are closer to the outer rim of the disk and thus perform better. This is helpful for maximizing the utilization of large disks. A small heavily used file system can be placed on the first partition of the virtual disk and a second larger file system can be placed on the remainder of the virtual disk to hold old infrequently accessed data. This can be done without knowing the physical location of the partition underlying the virtual disk and without giving the host an entire physical disk.
For example, with RAID 5 a single address space is constructed from multiple physical spindles. The RAID 5 space can be divided into stipples as if it were one single disk. In one embodiment, the stroke size is a multiple of the RAID 5 stripe size and the strokes are aligned with the stripes, so that each stroke contains an integral number of stripes.
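Choosing a stroke size that holds a whole number of stripes is a simple rounding step. The Python sketch below rounds a desired stroke size up to the next multiple of the stripe size. The function name and the 384-block stripe are hypothetical values chosen for illustration, and rounding up rather than down is a design choice made here, not something this specification prescribes.

```python
def align_stroke_size(desired_blocks, stripe_blocks):
    """Round desired_blocks up to the nearest whole number of stripes."""
    stripes = max(1, -(-desired_blocks // stripe_blocks))  # ceiling division
    return stripes * stripe_blocks


# Hypothetical geometry: a 384-block stripe and a 2048-block target stroke.
print(align_stroke_size(2048, 384))  # 2304, i.e. 6 stripes per stroke
```

When the desired stroke size is already a multiple of the stripe size, the function returns it unchanged.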
Stippling can provide more efficient use of a disk for a system with multiple hosts that cannot coordinate disk allocation with each other. The lower disk addresses of all the virtual disks are on the outer edge of the physical disk. As the hosts start filling up their virtual disks with data, all the data from all the hosts is on the outer edge of the disk. Stippling can be configured such that there are no gaps of unused space between each virtual disk.
Stippling can make it easier to manage performance since all the stipples have similar performance. An unused stipple preserves not only its space on the disk, but also a portion of the disk's performance. An unused stipple contains some blocks of every performance characteristic available on the disk.
Stippled Disks and ASM
Disk stippling can work with the Automatic Storage Management (ASM) product, which is commercially available from Oracle Corporation of Redwood Shores, Calif. More information regarding implementation of ASM can be found in U.S. Pat. No. 6,530,035 and U.S. Pat. No. 6,405,284, which are hereby incorporated by reference as if fully set forth herein. The stroke size can be set to match the ASM allocation unit size, and the two can be aligned. Each allocation unit can be one stroke on the underlying physical disk. This can keep one-megabyte-aligned I/Os on contiguous storage all the way from the file I/O down to the physical disk I/O.
Stippling can also be applied to support ASM sharing disks between hosts with different operating systems. If the storage array can present virtual disks that are stipples, then disk groups on different hosts can efficiently share the same disks.
For example, allocating two partitions on the same disk to the same disk group in a system without stippling is inefficient, resulting in the system trying to load balance between two areas on the same disk and causing many useless seeks. On the other hand, allocating two stipples on the same disk to the same disk group has only minor consequences, resulting in some extents being relocated to the new stipple. These extents will go to the outer edge of the physical disk, alongside the existing data in the other stipple.