This application is related to commonly assigned U.S. patent application Ser. No. 12,135,012 entitled “SELECTIVELY MARK FREE FRAMES AS UNUSED FOR COOPERATIVE MEMORY OVER-COMMITMENT,” filed on even date herewith and hereby incorporated by reference.
1. Field of the Invention
The present invention relates generally to a computer implemented method, data processing system, and computer program product for efficient use of volatile or physical memory. More specifically, the present invention relates to bringing into physical memory, or preventing the transferring from physical memory, the residency of a virtual memory page ahead of a flush operation.
2. Description of the Related Art
Designers of modern computers rely on two or more classes of workspace or addressing space (memory or storage). A first class of workspace is fast but relatively expensive. A second class of workspace is relatively slow, but cheaper than the first class of workspace. The first class of workspace can be volatile memory such as random access memory (RAM), while the second class of workspace can be a block storage device such as a disk drive. Frequently, there may be ten or more times as much of the second class of workspace available as compared to the first class in a typical computer system.
In order to operate efficiently, a system, as described above, moves instructions and data from the second class of workspace to the first class of workspace before a system processor operates on such instructions or data. A scheme to address memory within both classes of workspace is called virtual memory. Such an addressing scheme provides for a range of addresses larger than the addressable physical memory. Accordingly, a virtual memory ‘page’ can actually be stored either in physical memory or in storage, which correspond to the first class of workspace and the second class of workspace, respectively. Nevertheless, such virtual memory pages, if in the second class of workspace, must be marked as such, within a table, and then transferred to physical memory when accessed by the processor. The handling and manipulating of virtual memory pages, a computer resource, is done through a virtual memory manager, a sub-component of a hypervisor. A hypervisor assigns resources to logical partitions.
Transferring data between storage and physical memory can be an expensive operation since the transfer is protracted, as compared to moving data between physical memories. Accordingly, benefits may be achieved if such transfers are minimized.
A virtual memory page is a fixed-size block of data. A virtual memory page can be resident in memory. Such a virtual memory page is mapped to a location where the physical data is stored in physical memory. Otherwise, a virtual memory page can be resident on a disk or other block device. In other words, the virtual memory page is paged out of physical memory, and instead, placed into paging space or a file system.
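The two residency cases described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of any particular virtual memory manager; the names `Residency`, `VirtualPage`, and `access`, and the use of a dictionary as paging space, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Residency(Enum):
    IN_MEMORY = "in_memory"    # mapped to a location in physical memory
    PAGED_OUT = "paged_out"    # held in paging space or a file system


@dataclass
class VirtualPage:
    page_id: int
    residency: Residency
    data: bytes = b""


def access(page: VirtualPage, paging_space: dict) -> bytes:
    """Return the page's data, paging it in first if it is not resident."""
    if page.residency is Residency.PAGED_OUT:
        # Hypothetical page-in: copy the data from storage into memory.
        page.data = paging_space.pop(page.page_id)
        page.residency = Residency.IN_MEMORY
    return page.data
```

In this sketch an access to a paged-out page triggers the comparatively slow copy from storage, which is the cost the description seeks to minimize.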
Accordingly, a virtual memory page can be an abstraction of memory storage that decouples the physical media from the operating characteristics of the virtual memory page as used within an operating system (OS). Such physical media is where the virtual memory page resides. The physical media is any storage mechanism, for example, random access memory, disk storage, tape storage, and flash memory, among others.
The present invention provides a computer implemented method and apparatus for marking as critical a virtual memory page in a data processing system. An operating system indicates to a virtual memory manager a virtual memory page selected for paging-out to disk. The operating system determines that the data processing system is using cooperative memory over-commitment. The operating system, responsive to a determination that the data processing system is using cooperative memory over-commitment, marks the virtual memory page as critical, such that the virtual memory page remains in physical memory. The operating system, responsive to marking the virtual memory page as critical, sets the virtual memory page to a page-out state.
The novel features believed characteristic of the invention are set forth in the appended claims. The invention itself, however, as well as a preferred mode of use, further objectives and advantages thereof, will best be understood by reference to the following detailed description of an illustrative embodiment when read in conjunction with the accompanying drawings, wherein:
With reference now to the figures and in particular with reference to
In the depicted example, local area network (LAN) adapter 112 connects to south bridge and I/O controller hub 104, and audio adapter 116, keyboard and mouse adapter 120, modem 122, read only memory (ROM) 124, hard disk drive (HDD) 126, CD-ROM drive 130, universal serial bus (USB) ports and other communications ports 132, and PCI/PCIe devices 134 connect to south bridge and I/O controller hub 104 through bus 138 and bus 140. PCI/PCIe devices may include, for example, Ethernet adapters, add-in cards, and PC cards for notebook computers. PCI uses a card bus controller, while PCIe does not. ROM 124 may be, for example, a flash basic input/output system (BIOS). Hard disk drive 126 and CD-ROM drive 130 may use, for example, an integrated drive electronics (IDE) or serial advanced technology attachment (SATA) interface. A super I/O (SIO) device 136 may be connected to south bridge and I/O controller hub 104.
An operating system runs on processor 106 and coordinates and provides control of various components within data processing system 100 in
Instructions for the operating system, the object-oriented programming system, and applications or programs are located on storage devices, such as hard disk drive 126, and may be loaded into main memory 108 for execution by processor 106. The processes of the present invention can be performed by processor 106 using computer implemented instructions, which may be located in a memory such as, for example, main memory 108, read only memory 124, or in one or more peripheral devices.
Those of ordinary skill in the art will appreciate that the hardware in
In some illustrative examples, data processing system 100 may be a personal digital assistant (PDA), which is configured with flash memory to provide non-volatile memory for storing operating system files and/or user-generated data. A bus system may be comprised of one or more buses, such as a system bus, an I/O bus and a PCI bus. Of course, the bus system may be implemented using any type of communications fabric or architecture that provides for a transfer of data between different components or devices attached to the fabric or architecture. A communication unit may include one or more devices used to transmit and receive data, such as a modem or a network adapter. A memory may be, for example, main memory 108 or a cache such as found in north bridge and memory controller hub 102. A processing unit may include one or more processors or CPUs. The depicted example in
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an”, and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
As will be appreciated by one skilled in the art, the present invention may be embodied as a system, method or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module”, or “system.” Furthermore, the present invention may take the form of a computer program product embodied in any tangible medium of expression having computer usable program code embodied in the medium.
Any combination of one or more computer usable or computer readable medium(s) may be utilized. The computer-usable or computer-readable medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CDROM), an optical storage device, a transmission media such as those supporting the Internet or an intranet, or a magnetic storage device. Note that the computer-usable or computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory. In the context of this document, a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. The computer-usable medium may include a propagated data signal with the computer-usable program code embodied therewith, either in baseband or as part of a carrier wave. The computer usable program code may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc.
Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
The present invention is described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
A data processing system, such as described in
The illustrative embodiments provide an operating system or logical partition that may react to a determining step. The determining step can be determining that a virtual memory page is selected to be paged-out by an operating system. The reaction or response of the operating system can be to follow with a step of elevating the affected virtual memory page to ‘critical’ status. Accordingly, the cooperative operation of the virtual memory manager and operating system may avoid paging-out the virtual memory page to the hypervisor swap-space ahead of the initiation of the operating system paging-out the virtual memory page.
A virtual memory manager is configured to provide access to data stored in one of the two classes of workspace to operating systems and applications. A virtual memory manager can be a part of a hypervisor 201. The hypervisor can determine how to handle situations when the occupied space of the physical memory 221 is either full, or reaches a preset threshold. An ideal situation involves the hypervisor placing in physical memory 221 each virtual memory page in advance of a processor operating on such virtual memory page. In practice, though, a processor may require such a virtual memory page before the virtual memory page is placed into physical memory 221. Physical memory is memory that stores charges or electromagnetic radiation by either continuously or nearly continuously applying power to circuits or other structures to maintain such charges or electromagnetic radiation. Physical memory includes, for example, random access memory (RAM).
Nevertheless, the data may be available, albeit in a slower form of storage than in physical memory 221. As explained above, physical storage 223, sometimes called paging space, is a second class of workspace. Physical storage operates on a slower basis than physical memory 221. Consequently, virtual memory pages resident in physical storage 223 are ‘paged-in’ through an I/O buffer, for example, disk buffer 231. Such paging-in is an action in which the virtual memory manager copies data from physical storage 223 to physical memory 221. Physical storage is a data storage structure that does not rely on the routine application of power to maintain storage elements. This feature is apparent, for example, in the magnetic domains of hard disk drives, or in the pits and other physical changes made to optical media. When the virtual memory page is paged-in, one or more clusters or other storage units in physical storage are accessed. Accordingly, data may arrive from the disk buffer in multiple stages, as, for example, when a disk read/write head is moved to successive tracks of the applicable media. When data arrives at a processor in this manner, the process of accessing such data is many times slower than accessing data that already exists in physical memory without the need to resort to block storage device access (or paging-in).
Physical storage can be subdivided depending on whether storage is allocated to a logical partition or whether storage is dynamically allocated by the hypervisor to provide backing stores to physical memory pages. In the first case, physical storage is subdivided and allocated such that each portion of physical storage is dedicated to one logical partition. In the second case, physical storage is provided by the hypervisor on a dynamic basis as backing stores to virtual memory pages stolen from a logical partition. The second form of storage allocation relies on a backing store known as a hypervisor swap-space, such as hypervisor swap-space 227 shown in
In addition, to make adequate space for a virtual memory page to be made resident in physical memory 221, hypervisor 201 may page-out a virtual memory page already present in physical memory 221. One of several algorithms can accomplish selection of pages to page-out, or replacement pages. Such algorithms may be computer instructions executing on a processor, for example, processor 106, from
Each replacement paging algorithm, mentioned above, is organized to match one or more underlying assumptions concerning which pages in physical memory are likely to be reused and, if so, how heavily. For example, designers that implement a least recently used (LRU) page replacement algorithm assume that virtual memory pages that have been used frequently during a previous period of execution are virtual memory pages that are more likely to be reused in a next period of execution. Such an assumption compares and/or sorts virtual memory pages according to frequency of use, and may page-out the least used pages first.
The LRU page replacement algorithm can weigh more heavily a recent use of a virtual memory page, as compared to a use of a virtual memory page during an oldest sub-period of a measurement period. Accordingly, two recent accesses of a virtual memory page may be more heavily weighted than three less recent accesses of a second virtual memory page. Under such an implementation, a number is assigned to the second “less recent accesses” virtual memory page and a higher number is assigned to the first “recent accesses” virtual memory page. Virtual memory pages with the lowest associated number can be the first candidates for page replacement.
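The recency weighting described above can be illustrated with a short sketch. It is one of many possible implementations and is offered only as an example; the function names, the per-sub-period access counts, and the `decay` factor of 0.5 are assumptions made for illustration.

```python
def weighted_score(accesses_per_period, decay=0.5):
    """Score a page from access counts ordered oldest sub-period first.

    Each older sub-period is discounted by `decay` (an assumed value),
    so recent accesses contribute more than old ones.
    """
    score = 0.0
    newest = len(accesses_per_period) - 1
    for index, count in enumerate(accesses_per_period):
        age = newest - index          # 0 for the most recent sub-period
        score += count * (decay ** age)
    return score


def replacement_order(pages):
    """Return page names sorted so the lowest score is replaced first."""
    return sorted(pages, key=lambda name: weighted_score(pages[name]))


pages = {
    "first": [0, 0, 2],    # two recent accesses
    "second": [3, 0, 0],   # three less recent accesses
}
```

With these assumed values, the "first" page receives the higher number even though it was accessed fewer times, so the "second" page becomes the earlier candidate for replacement.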
It is appreciated that generally, there are many page replacement algorithms to pick from, and particularly, that the LRU page replacement algorithm may have many suitable implementations.
The arrangement can provide a virtual address space that is substantially smaller than the physical address space. This variance in address space sizing may be due to limitations of the operating system that relies on virtualized memory. Hardware page table 241 can provide translations of addresses from the virtual domain to the physical domain, as well as vice versa. A hardware page table is a table that operates under the control of a virtual memory manager to translate addresses from the logical address space to the physical address space.
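The translation role of a page table can be sketched in software as follows. This is a toy model only: hardware page table 241 is not described at this level of detail, and the class name `PageTable`, its methods, and the 4096-byte page size are assumptions chosen for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class PageTable:
    """Toy table mapping virtual page numbers to physical frame numbers."""

    def __init__(self):
        self._frames = {}

    def map(self, virtual_page, frame):
        self._frames[virtual_page] = frame

    def translate(self, virtual_address):
        """Translate a virtual address, or return None if not resident."""
        page, offset = divmod(virtual_address, PAGE_SIZE)
        frame = self._frames.get(page)
        if frame is None:
            return None  # caller would have to page the data in first
        return frame * PAGE_SIZE + offset

    def reverse(self, physical_address):
        """Translate a physical address back to a virtual address."""
        frame, offset = divmod(physical_address, PAGE_SIZE)
        for page, mapped in self._frames.items():
            if mapped == frame:
                return page * PAGE_SIZE + offset
        return None
```

The offset within a page is carried through unchanged in both directions; only the page number is translated.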
Some hints of the operating system's status vis-à-vis a virtual memory page include ‘unused’ 301, ‘loaned’ 303, ‘active’ 305, and ‘critical’ 307. Accordingly, upon receiving each of these hints, the hypervisor tracks the page state and modifies the hypervisor's behavior with respect to the virtual memory page. It is understood that each state can be dynamic and that the virtual memory page can transition between states several times a second. ‘Unused’ 301 signals that the applicable virtual memory page is not needed by the operating system according to current conditions that the operating system encounters with one or more applications. ‘Loaned’ 303 signals that the corresponding virtual memory page is unused. Consequently, ‘loaned’ is a kind of ‘unused’ state. When an operating system marks a virtual memory page as ‘loaned’, then the operating system indicates it is reliant upon the virtual memory page the least as compared to other virtual memory pages that are not marked ‘loaned’. Consequently, a virtual memory page state of ‘loaned’ indicates that if the hypervisor were to re-allocate the virtual memory page, the operating system predicts that the performance impact would be minimal as compared to the other states, explained further below. As a result, the hypervisor may re-allocate the virtual memory page to other operating systems resident on the data processing system. The hypervisor may coordinate with additional operating systems such that the number of virtual memory pages loaned in this manner is restored to the loaning operating system on a prioritized basis.
‘Active’ state 305 signals that the virtual memory page is in use by the operating system. Any unavailability of the virtual memory page within physical memory is predicted to cause a greater impact to the operating system than a spontaneous loss from physical memory of a ‘loaned’ 303 page. Finally, ‘critical’ 307 signals that the associated virtual memory page is in use by the operating system, and that loss of the virtual memory page is predicted to result in a still higher impact in operating system performance, as compared with a predicted impact associated with a loss of an ‘active’ 305 page. One risk, or performance impact, associated with marking a virtual memory page ‘unused’ is that during an interval prior to the operating system marking the virtual memory page ‘active’ 305, the hypervisor can page-in data for a second operating system to the page. If the first operating system hints ‘active’ 305 within a short interval of the ‘unused’ marking, then the first operating system can suffer delays. The delays are associated with the initial paging-in of the virtual memory page for the second operating system, as well as the delay associated with the subsequent paging-out of the virtual memory page. The hypervisor may re-allocate virtual memory pages of the operating system on a priority basis. In other words, the hypervisor may be configured to first exhaust all virtual memory pages marked loaned by the operating system. Similarly, the hypervisor may exhaust all loaned virtual memory pages, followed by the unused virtual memory pages, and lastly, by the critical virtual memory pages. Such re-allocated virtual memory pages may be re-purposed to support operations by other operating systems or logical partitions present in the data processing system.
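The prioritized re-allocation described above can be sketched as a simple ordering over the four hints. The sketch is illustrative only. The names `PageHint` and `reallocation_order` are hypothetical, and the placement of ‘active’ between ‘unused’ and ‘critical’ is an assumption inferred from the relative performance impacts stated above, since the ordering in the text names only loaned, unused, and critical pages.

```python
from enum import IntEnum


class PageHint(IntEnum):
    """Hints ordered by assumed re-allocation priority (lowest first)."""
    LOANED = 0
    UNUSED = 1
    ACTIVE = 2    # position between 'unused' and 'critical' is an assumption
    CRITICAL = 3


def reallocation_order(page_hints):
    """Return page ids in the order a hypervisor might re-allocate them."""
    return sorted(page_hints, key=lambda page_id: page_hints[page_id])


hints = {10: PageHint.CRITICAL, 11: PageHint.LOANED,
         12: PageHint.ACTIVE, 13: PageHint.UNUSED}
```

Under these assumptions the loaned page 11 is taken first and the critical page 10 last.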
A virtual memory page may transition among the various states. Some state transitions are regulated and, in some cases, inhibited by one or more illustrative embodiments. These transitions include transition 321 and transition 323. These transitions are the transitions from ‘active’ to ‘unused’, and from ‘critical’ to ‘unused’, respectively.
At a first time, a logical partition is allocated memory according to chart 400. At a second time, the logical partition is allocated memory according to chart 410. At each time, the logical partition may be allocated a maximum logical memory, respectively, maximum logical memory 403 and maximum logical memory 413. Maximum logical memory is the amount of memory available for the hypervisor to allocate, and corresponds to a size of the combined physical memory and physical storage. Current logical memory is the sum of the virtual memory pages allocated to a logical partition, and includes, for example, current logical memory 405 and current logical memory 415. Minimum logical memory is the smallest logical memory that is configured for the partition, as shown by minimum logical memory 407 and minimum logical memory 417. The minimum and maximum values are constraints on a partition's logical address space as configured by a system administrator, based on the expected usage of the partition.
Accordingly, the physical memory that backs the virtual memory pages of the operating system may dynamically change from physical memory 409 at the first time, to physical memory 419 at the second time. The balance of the physical workspace allocated to the operating system in each time is the difference between the current logical memory and the physical memory. The hypervisor may supply physical storage in, for example, a disk drive as a place for this balance of virtual memory pages to reside. Thus, the memory size of a logical partition is a virtual memory size, referred to as the partition's “logical” memory size. Because a logical partition's logical memory is virtualized, the sum of all of the logical partitions' logical memory sizes can be much larger than the amount of physical memory in the data processing system. The hypervisor handles this over-commitment of physical memory by paging out partitions' logical memory pages to paging space.
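The memory accounting described above reduces to a few comparisons, sketched here in Python. The function names and the example figures are hypothetical; any consistent unit (pages, megabytes) may be assumed.

```python
def storage_balance(current_logical, physical):
    """Amount of current logical memory not backed by physical memory."""
    return current_logical - physical


def is_overcommitted(partition_logical_sizes, total_physical):
    """True when the partitions' logical memory exceeds physical memory."""
    return sum(partition_logical_sizes) > total_physical


def within_limits(minimum, current, maximum):
    """Check the administrator-configured bounds on logical memory."""
    return minimum <= current <= maximum
```

For example, a partition with 8 units of current logical memory backed by 5 units of physical memory leaves a balance of 3 units to reside in physical storage, and two such partitions over-commit a system that has only 12 units of physical memory.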
Cooperative memory over-commitment is the allocating of less physical memory to each of two or more logical partitions than the hypervisor provides in virtual memory pages, as well as providing a hierarchical manner for the hypervisor to sort physical memory markings in order to select virtual memory pages to steal from one operating system.
If the operating system determines that the data processing system is using a cooperative over-commitment, the operating system marks the virtual memory page as ‘critical’ (step 505). Step 505 may include the virtual memory manager copying the virtual memory page to physical memory. This step can be responsive to the virtual memory page being physically located in storage. Next, or if the determination of step 503 is negative, the operating system moves the virtual memory page to a ‘page-out’ state (step 507). If the virtual memory page is in a page-out state, it is protected from access by the application. In other words, the operating system alone may access the page until the data of the page is completely copied to physical storage. The operating system may set states for a logical page to indicate a page-out state or three alternative states, such as free state, in-use state, or page-in state. Attendant with step 507, the operating system may flush the virtual memory page out to a disk. As such, the flush operation may be completed when all bits of the applicable virtual memory page are copied to disk. The disk can be, for example, hard disk drive 126 of
If there are further pages in memory not yet determined to be ‘critical’ with respect to process 550, the virtual memory manager examines a next virtual memory page resident in physical memory (step 557). Next, the virtual memory manager repeats step 553, with respect to the next virtual memory page resident in physical memory.
A negative outcome to step 553 leads the virtual memory manager to steal the current virtual memory page from the operating system to which the virtual memory page is allocated (step 559). Accordingly, the physical page that holds the virtual memory page is paged out to the physical storage, also known as backing store. In addition, step 559 may include the virtual memory manager reassigning the physical page to a different logical address. Processing terminates after step 559, or after all pages are determined to be exhausted at step 555.
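The page-stealing loop of process 550 can be sketched as follows. This is a simplified illustration, not the claimed implementation: the function name `steal_page`, the dictionary representation of resident pages, and the dictionary used as backing store are all assumptions.

```python
def steal_page(resident_pages, backing_store):
    """Steal the first resident page that is not marked 'critical'.

    `resident_pages` maps page id -> {"critical": bool, "data": bytes}.
    Returns the stolen page id, or None if every page is critical.
    """
    for page_id in list(resident_pages):
        page = resident_pages[page_id]
        if page["critical"]:
            continue                            # examine the next resident page
        backing_store[page_id] = page["data"]   # page out to backing store
        del resident_pages[page_id]             # frame is now free for reuse
        return page_id
    return None                                 # all pages exhausted
```

In this sketch a page marked ‘critical’ is never selected, so it stays in physical memory until its marking changes.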
Thus, contemporaneously with, or shortly after a determination that a virtual memory page is selected to be paged-out by an operating system, the operating system in cooperation with a virtual memory manager may elevate the affected virtual memory page to ‘critical’ status. Accordingly, the cooperative operation of the virtual memory manager and operating system may avoid paging-out the virtual memory page to the hypervisor swap-space ahead of the initiation of the operating system paging-out the virtual memory page.
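The operating system side of this cooperation can be sketched as below. The sketch assumes a dictionary-based page record and a dictionary standing in for the disk; the function name `os_page_out` and the string values used for hints and states are hypothetical.

```python
def os_page_out(page, cooperative_overcommit, disk):
    """Sketch of the operating system's page-out path for one page.

    `page` is a dict with keys "id", "data", "hint", and "state".
    """
    if cooperative_overcommit:
        # Hint 'critical' so the hypervisor leaves the page in memory
        # while the operating system writes it out itself.
        page["hint"] = "critical"
    page["state"] = "page-out"          # applications may no longer access it
    disk[page["id"]] = page["data"]     # flush the page contents to disk
    return page
```

Marking the page before the flush is the point of the design: it is intended to keep the hypervisor from paging the same page out to its own swap-space just before the operating system reads it back for its own page-out.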
The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The invention can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements. In a preferred embodiment, the invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
Furthermore, the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer readable medium can be any tangible apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk—read only memory (CD-ROM), compact disk—read/write (CD-R/W) and DVD.
A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories, which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers.
Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems and Ethernet cards are just a few of the currently available types of network adapters.
The description of the present invention has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiment was chosen and described in order to best explain the principles of the invention, the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
Number | Date | Country |
---|---|---|
20090307462 A1 | Dec 2009 | US |