1. Technical Field
The present invention relates generally to the field of data storage, and more particularly to a method of and system for allocating to a user contiguous blocks in a file system for storage of data and metadata.
2. Description of the Related Art
Computer storage systems store the actual file data and metadata on physical disk drives or disk arrays. A disk drive comprises a stack of disks mounted to rotate together about a spindle. Each disk is coated with magnetic material. Read and write heads are movably mounted with respect to the disks. The disks are spun at high speed and the read and write heads are moved in and out to read data from or write data to the spinning disks. A disk controller controls the movement of the heads, thereby controlling where the data is written to or read from.
The actual storage of files on physical media is very complex. This complexity is hidden from the user or the application by the computer operating system, which uses various levels of abstraction. The user or application sees files stored in a relatively simple file system. The user creates a file and the operating system handles how and where the data of the file is actually stored.
When the user creates a file, the operating system allocates a certain amount of space in the file system for the file and in turn allocates a certain amount of space on the physical disk for storage of the file. Over time, as the user works with the file, the file may grow larger than the space originally allocated for its storage. The operating system simply stores part of the file in the originally allocated space and the rest of the file in non-contiguous newly allocated space, thereby fragmenting the file.
As the user works with multiple files, all of the files may become fragmented. Fragmentation adds to the complexity of file storage. This complexity is hidden from the user, but the complexity of fragmentation degrades the performance of the system as the storage system has to assemble the file from multiple locations.
Embodiments of the present invention provide methods, systems, and computer program products that enable a user to control the allocation in a file system of file data and metadata blocks. An embodiment of a method according to the present invention determines ranges of unallocated contiguous blocks in the file system. The method allocates to the user a metadata storage range of contiguous blocks within one of the ranges of unallocated contiguous blocks. The method further allocates to the user a data storage range of contiguous blocks within the range of unallocated contiguous blocks. The method may then allocate to the user ranges of contiguous blocks within the allocated data storage range for storage of respective files.
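The allocation steps summarized above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the boolean block bitmap, the function names free_ranges and carve, and the choice to place the metadata range and the data range back to back in the same free run are all assumptions made for illustration.

```python
from typing import List, Tuple

# (offset, length), both measured in blocks. Hypothetical representation.
Range = Tuple[int, int]


def free_ranges(bitmap: List[bool]) -> List[Range]:
    """Return one (offset, length) pair per run of unallocated blocks.

    bitmap[i] is True when block i is already allocated.
    """
    ranges: List[Range] = []
    start = None
    for i, used in enumerate(bitmap):
        if not used and start is None:
            start = i                       # a free run begins here
        elif used and start is not None:
            ranges.append((start, i - start))
            start = None                    # the free run ended at i - 1
    if start is not None:                   # free run reaches the end
        ranges.append((start, len(bitmap) - start))
    return ranges


def carve(free: Range, meta_blocks: int, data_blocks: int) -> Tuple[Range, Range]:
    """Split one free run into a metadata range followed by a data range."""
    offset, length = free
    if meta_blocks + data_blocks > length:
        raise ValueError("free range too small for the requested allocation")
    meta = (offset, meta_blocks)
    data = (offset + meta_blocks, data_blocks)
    return meta, data


# Example: blocks 2-7 and block 9 are unallocated.
bitmap = [True, True, False, False, False, False, False, False, True, False]
runs = free_ranges(bitmap)
meta, data = carve(runs[0], meta_blocks=2, data_blocks=4)
```

In this example the first free run starts at block 2 and is six blocks long, so the sketch yields a two-block metadata range at offset 2 and a four-block data range at offset 4.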
The novel features believed characteristic of the invention are set forth in the appended claims. The invention itself, however, as well as a preferred mode of use, further purposes and advantages thereof, will best be understood by reference to the following detailed description of an illustrative embodiment when read in conjunction with the accompanying drawings, where:
Referring now to the drawings, and first to
Host system 101 is a virtualized system in that it hosts logical partitions 113. As is known to those skilled in the art, a logical partition is a division of the resources 103 of host system 101 into sets of resources so that each set of resources can be operated independently with its own operating system instance and application or applications. Each logical partition 113, and the application or applications running therein, may be used by one or more users.
Host system 101 includes a hypervisor 115. Hypervisor 115 is a software layer that provides the foundation for virtualization of host system 101. Hypervisor 115 enables the hardware resources 103 of host system 101 to be divided among logical partitions 113, and it ensures strong isolation between them. Hypervisor 115 is responsible for dispatching the respective workloads of logical partitions 113 across hardware resources 103. Hypervisor 115 also enforces partition security and can provide inter-partition communication.
Referring to
Each host system 201 is coupled through a network 203 to a network storage system 205. Network 203 may be of any suitable configuration and may comprise a storage area network. Network storage system 205 provides storage to host systems 201.
Logical layer 303 comprises a logical volume manager 317. Logical volume manager 317 provides a set of operating system commands, library subroutines and other tools that allow the user to establish and control logical volume storage. Logical volume manager 317 controls the resources of physical layer 301 by mapping data between the user's view of the storage space provided by file systems 313-315 and logical volumes 319 and 321 and the actual physical disks 307-311. Logical volume manager 317 does this by using a layer of logical volume device driver code 323 that runs above the traditional physical device drivers, such as device driver 325 and RAID adapter 327. Logical volume device driver 323 maps logical volumes 319-321 onto physical volumes 329-333. Device driver 325 maps physical volumes 329 and 331 onto physical disks 307 and 309, respectively. RAID adapter 327 maps physical volume 333 onto physical array 311. This logical view of the disk storage is provided to applications and is independent of the underlying physical disk structure.
The blocks and sets of blocks in logical volume 401 are identified by an offset and a range into logical volume 401. For example, the offset 411 of logical volume control block 403 is zero and its range 413 is one block. Similarly, the offset 415 of super block 405 is one and its range 417 is also one block. Then, the offset 419 of set of allocated blocks 407 is two and its range 421 is N blocks. The offset 423 of set of unallocated blocks 409 is two plus N and its range 425 is M blocks.
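The offset arithmetic described above can be checked with a short Python sketch. The function name layout and the sample values chosen for N and M are assumptions for illustration. Only the ordering of the regions and the one-block size of the control block and super block come from the description.

```python
from typing import Dict, Tuple


def layout(n_allocated: int, m_unallocated: int) -> Dict[str, Tuple[int, int]]:
    """Return {region name: (offset, range)} in blocks for the logical volume."""
    regions = [
        ("logical volume control block", 1),
        ("super block", 1),
        ("allocated blocks", n_allocated),      # N blocks
        ("unallocated blocks", m_unallocated),  # M blocks
    ]
    result: Dict[str, Tuple[int, int]] = {}
    offset = 0
    for name, length in regions:
        result[name] = (offset, length)
        offset += length                        # next region starts right after
    return result


# Hypothetical sizes: N = 10 allocated blocks, M = 20 unallocated blocks.
volume = layout(n_allocated=10, m_unallocated=20)
```

With N equal to 10, the unallocated set begins at offset 2 + 10 = 12, matching the "two plus N" rule stated above.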
Occasionally, a user may underestimate the space required to store a particular file. As will be explained in detail hereinafter, in such cases the system puts the excess part of the file into normal file system storage. The user may then move the overflow part of the file into a portion of the user's allocation. The file would be fragmented, but in only one place. Alternatively, the user could obtain a new, larger, allocation for the file and consolidate the original part of the file and the overflow part into the new allocation, thereby avoiding fragmentation of the file.
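The overflow behavior described above can be modeled with a minimal Python sketch. The function name place, the range labels "user" and "default", and the extent representation are hypothetical and are not taken from the described system.

```python
from typing import List, Tuple

# (name of the range holding the extent, number of blocks). Hypothetical.
Extent = Tuple[str, int]


def place(file_blocks: int, alloc_blocks: int) -> List[Extent]:
    """Place a file in the user's allocation, spilling any excess elsewhere."""
    in_alloc = min(file_blocks, alloc_blocks)
    extents: List[Extent] = [("user", in_alloc)]
    overflow = file_blocks - in_alloc
    if overflow > 0:
        # The excess part goes to normal (default) file system storage.
        extents.append(("default", overflow))
    return extents


# An underestimated allocation leaves the file in two pieces.
fragmented = place(file_blocks=72, alloc_blocks=64)

# A new allocation that is large enough holds the whole file in one piece.
consolidated = place(file_blocks=72, alloc_blocks=72)
```

Under these assumptions, the underestimated case produces exactly two extents, which corresponds to a file that is fragmented in only one place, while the larger allocation produces a single extent.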
fs_data_alloc -b <data_range_name> -u <user> -r <blk_offset>:<blk_range> <filesystem_name>
where:
Referring again to
fs_file_alloc -b <inode_range_name> -u <user> -r <blk_offset>:<blk_range> <filesystem_name>
where:
# touch -i <inode_range_name> -d <data_range_name>:<off>;<len> <filename>
where:
Embodiments of the present invention allow the user to determine where a particular file is stored on the logical volume using a “fileplace <filename>” command. In response to the fileplace command, the system outputs the size of the file and the logical extents in which the file is stored. For example, suppose the user wants to determine where a file named 49288.004.errpt.out is stored. In response to a “fileplace 49288.004.errpt.out” query, the system outputs:
Thus, the user determines that file 49288.004.errpt.out, which has a size of 291905 bytes and requires seventy-two 4096-byte blocks for storage, is fragmented: a first part comprising 262144 bytes is stored in space allocated to the user, and an overflow part comprising 32768 bytes is stored in a default file system data range that is not allocated to any user.
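The block counts in this example can be verified with a few lines of Python. The figures are taken from the example above, while the helper name blocks_needed is an assumption for illustration.

```python
BLOCK_SIZE = 4096  # bytes per block, as in the example above


def blocks_needed(size_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of whole blocks required to hold size_bytes (ceiling division)."""
    return -(-size_bytes // block_size)


file_size = 291905
total_blocks = blocks_needed(file_size)      # blocks occupied by the file
user_blocks = 262144 // BLOCK_SIZE           # part in the user's allocation
overflow_blocks = 32768 // BLOCK_SIZE        # part in the default data range
slack_bytes = total_blocks * BLOCK_SIZE - file_size  # unused tail of last block
```

The two parts account for 64 and 8 blocks respectively, for 72 blocks in total. Their combined size of 294912 bytes exceeds the 291905-byte file size because the last block is only partly filled.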
According to the present invention, the user may consolidate the file by allocating a new data range to the file and moving the file into the new data range. The user may allocate a new data range using the fs_data_alloc command discussed above. Embodiments of the present invention introduce a new blkmv command by which a user may move data from one data range to another. The blkmv command is illustrated as follows:
Thus, the user may allocate the new data range using the fs_data_alloc command, as follows:
Referring to
Peripheral component interconnect (PCI) bus bridge 1014 connected to I/O bus 1012 provides an interface to PCI local bus 1016. A number of modems may be connected to PCI local bus 1016. Typical PCI bus implementations will support four PCI expansion slots or add-in connectors. Communications links to network 109 in
Those of ordinary skill in the art will appreciate that the hardware depicted in
The data processing system depicted in
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium or media having computer readable program code embodied thereon.
Any combination of one or more computer readable medium or media may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
The computer program instructions comprising the program code for carrying out aspects of the present invention may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the foregoing flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the foregoing flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an”, and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
From the foregoing, it will be apparent to those skilled in the art that systems and methods according to the present invention are well adapted to overcome the shortcomings of the prior art. While the present invention has been described with reference to presently preferred embodiments, those skilled in the art, given the benefit of the foregoing description, will recognize alternative embodiments. Accordingly, the foregoing description is intended for purposes of illustration and not of limitation.