Not Applicable
Not Applicable
This invention is related to a method and apparatus for dynamically determining the optimal page size to use for an application running in a computer system.
Referring to
With the advent of multiple page size support in most modern operating systems, applications can benefit significantly by selecting an appropriate page size to attain the best performance. On a system that supports two page sizes, for example 4 KB and 64 KB, applications which access small dispersed chunks of memory (from a program address perspective) are better off using the smaller page size of 4 KB. The trade-off in page size selection is typically increased memory fragmentation and longer page-in and page-out delays for larger page sizes versus increased TLB (Translation Look-aside Buffer) misses with decreased fragmentation and shorter page-in and page-out delays for smaller page sizes.
A Translation Look-aside Buffer (TLB) is a hardware apparatus with which a processor can efficiently translate the virtual/effective addresses used by the applications to the real/physical addresses used by the memory controller/coherence controller, etc. The TLB is organized as a list of entries, where each entry maps a contiguous range of virtual addresses (e.g. one page) to a contiguous range of physical addresses of the same size. The size of a TLB (number of entries) is limited by the amount of time it takes to associatively search the TLB entries.
Whenever there is a TLB miss (i.e. a TLB entry cannot be found for the given virtual address), the processor looks up the virtual-to-real address translation in the page table. Page table lookup is a much more time consuming operation than a TLB lookup. So, from the application performance's point of view, and also from the overall system throughput's point of view, it is best to have as few TLB misses as possible.
Since increasing the page size of each TLB entry amounts to increasing the amount of memory covered by the TLB at any point in time (“TLB reach”), one might think that one way of reducing the number of TLB misses is to increase the size of the address range (page size) referred to by each TLB entry. However, increasing the page size may not necessarily reduce the number of TLB misses, which can vary for each application depending on the memory access behavior of that application. For example, if an application's memory access patterns are highly dispersed, then increasing the page size would not result in any reduction in TLB misses; moreover, increasing the page size may cause memory fragmentation, thereby resulting in lower memory utilization for the OS.
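The relationship between page size, TLB reach, and access dispersion can be illustrated with a short C sketch. It is only an illustration, not part of the described mechanism: the function names and the 128-entry TLB used in the note below are hypothetical, and every TLB entry is assumed to map one page of the same size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* TLB reach: bytes of virtual address space the TLB can map at once,
 * assuming every entry maps exactly one page of the same size. */
static uint64_t tlb_reach_bytes(uint64_t entries, uint64_t page_size)
{
    return entries * page_size;
}

/* Number of distinct pages touched by a set of virtual addresses.
 * Each distinct page needs its own TLB entry, so this approximates the
 * TLB pressure of an access pattern. O(n^2), kept simple for clarity. */
static size_t distinct_pages(const uint64_t *addrs, size_t n, uint64_t page_size)
{
    assert(page_size > 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t page = addrs[i] / page_size;
        int seen = 0;
        for (size_t j = 0; j < i; j++) {
            if (addrs[j] / page_size == page) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            count++;
    }
    return count;
}
```

With a hypothetical 128-entry TLB, the reach grows from 512 KB at a 4 KB page size to 8 MB at a 64 KB page size. Sixteen accesses spaced 4 KB apart touch sixteen pages at 4 KB but only one page at 64 KB, whereas sixteen accesses spaced 1 MB apart touch sixteen pages at either size, so the larger page size gives the dispersed pattern no benefit.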
Currently, the application programmer or the system administrator has to know the memory access patterns of the application and instruct the operating system to use the best page size for each application. This becomes even more complex because users often want to run their applications on different platforms, but different platforms support different page sizes. Hence, an application programmer has to know which platforms the application is going to run on, which page sizes are supported on those platforms, and which page size is best on each of those platforms. On a given platform, requiring the system administrator to select the right page size for each application introduces an even bigger problem: the system administrator has to know each application's characteristics. It also involves much manual work, and hence increases the probability of errors.
One attempt to relieve the programmer of the burden of having to adjust page size is a method known as “preemptive reservation”, where the Virtual Memory Manager (VMM) reserves large page sizes, but “takes back” the unused reserved memory if there is a demand for real memory. While “preemptive reservation” is effective against fragmentation, it is not effective against TLB misses. “Preemptive reservation” is described in the following paper: Juan Navarro, Rice University and Universidad Catolica de Chile; Sitaram Iyer, Peter Druschel, and Alan Cox, Rice University; Practical, Transparent Operating System Support for Superpages; Fifth Symposium on Operating Systems Design and Implementation, December 2002.
There is therefore a need for automatic and dynamic changing to an optimum page size determined as a result of running an application.
It is, therefore, an object of this invention to autonomically determine and dynamically set the page size of an application to an optimal value by tracking the number of virtual-to-real address translation mechanism misses (for example, TLB (Translation Look-aside Buffer) misses) for each page size per unit of time incurred during the execution of that application on a given platform (i.e. hardware and operating system combination).
It is another object of this invention to eliminate the need for the system administrator to manually specify the optimal page size for an application.
It is another object of this invention to eliminate the need for the application programmer and/or system administrator to know the correct page size to use for an application's memory accesses, and the need to know the different page sizes available on a given platform.
This invention uses a mechanism to keep track of the number of virtual to real address translation caching mechanism misses, such as TLB misses, on a per-process basis, associates an application with a set of processes, determines the optimum page size for the application based on the miss counts for the application's processes, and optionally, dynamically sets the optimal page-size for the running application. This invention can also be used to discover different optimal page sizes for different memory regions in the process.
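One possible way to turn per-page-size miss counts into a page-size choice is sketched below in C. The selection rule (pick the page size with the fewest misses per clock tick) is an assumption made for illustration; the text above states only that the choice is based on the miss counts. The type `pgsz_sample_t` and the function `best_page_size` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-application measurement for one page size. */
typedef struct {
    uint64_t page_size;   /* page size in bytes used during the interval */
    uint64_t tlb_misses;  /* TLB misses accumulated during the interval */
    uint64_t clock_tics;  /* CPU time consumed during the interval */
} pgsz_sample_t;

/* Return the page size with the lowest miss rate (misses per tick),
 * or 0 if no sample has a non-zero clock_tics value.
 * Rates are compared by cross-multiplication to avoid floating point;
 * this assumes the products fit in 64 bits. */
static uint64_t best_page_size(const pgsz_sample_t *samples, size_t n)
{
    assert(samples != NULL || n == 0);
    const pgsz_sample_t *best = NULL;
    for (size_t i = 0; i < n; i++) {
        const pgsz_sample_t *s = &samples[i];
        if (s->clock_tics == 0)
            continue;  /* no CPU time observed: the rate is undefined */
        if (best == NULL ||
            s->tlb_misses * best->clock_tics < best->tlb_misses * s->clock_tics)
            best = s;
    }
    return best ? best->page_size : 0;
}
```

Normalizing by CPU time rather than comparing raw counts keeps intervals of different lengths comparable, which matches the "per unit of time" wording used in the objects of the invention.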
This invention provides a mechanism to determine the optimal page size for an application by monitoring the TLB misses for different page sizes.
This invention provides a mechanism to maintain the list of frequently used applications whose TLB misses are worth tracking, to identify the list of processes of each application, to enable/disable TLB-miss-tracking for each of the processes, to maintain the TLB misses on a per-process basis, to consolidate the per-process TLB misses into per-application TLB misses, and finally to determine whether the page size of each application should be changed based on its TLB misses.
With this invention, the application programmer and/or system administrator is relieved of the need to know the correct page size to use for the application's memory accesses, and the need to know of the different page sizes available on a given platform. This invention also eliminates the need for the system administrator to manually specify the optimal page size for the applications.
Most computer platforms already have a mechanism to accumulate the TLB miss counts. This invention uses such a mechanism to keep track of the number of TLB misses on a per-process basis.
The subject matter, which is regarded as the invention, is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other features and also the advantages of the invention will be apparent from the following detailed description taken in conjunction with the accompanying drawings.
This invention provides a mechanism to determine the optimal page size for an application by monitoring the TLB misses for different page sizes. This detailed description describes:
1. A mechanism to maintain the list of frequently used applications whose TLB misses are worth tracking.
2. A mechanism to identify the list of processes of each application, and to enable/disable TLB-miss-tracking for each process.
3. A mechanism to maintain the TLB misses on a per-process basis.
4. A mechanism to consolidate the per-process TLB misses into per-application TLB misses, and to determine whether the page size of each application should be changed.
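The consolidation in item 4 can be sketched in C as a plain sum over the processes that belong to one application. This is a minimal sketch under stated assumptions: `proc_sample_t`, `app_totals_t`, `consolidate_app`, and the `app_id` field are hypothetical names, and the record layout is illustrative rather than the kernel structure the description refers to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-process record: which application the process
 * belongs to, plus its accumulated TLB misses and CPU time. */
typedef struct {
    int      pid;
    int      app_id;      /* assumed application identifier */
    uint64_t tlb_misses;
    uint64_t clock_tics;
} proc_sample_t;

/* Per-application totals produced by the consolidation step. */
typedef struct {
    uint64_t tlb_misses;
    uint64_t clock_tics;
} app_totals_t;

/* Sum the counters of every process that belongs to app_id. */
static app_totals_t consolidate_app(const proc_sample_t *procs, size_t n,
                                    int app_id)
{
    assert(procs != NULL || n == 0);
    app_totals_t total = { 0, 0 };
    for (size_t i = 0; i < n; i++) {
        if (procs[i].app_id != app_id)
            continue;
        total.tlb_misses += procs[i].tlb_misses;
        total.clock_tics += procs[i].clock_tics;
    }
    return total;
}
```

An application with no matching processes yields zero totals, which a caller can treat as "no data" rather than as a zero miss rate.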
Although it is assumed for the purposes of illustration that one single page size is used for all of the application's data, the methods described in this invention can be used even when different address regions of the application use different page sizes. The hardware could provide a mechanism to obtain the TLB miss data for each region while the OS provides mechanisms to get and set the page size value for each address region.
The current invention is not limited to TLB-based systems; it is also applicable to any virtual-to-real address translation caching mechanisms.
1. A mechanism to maintain the list of frequently used applications whose TLB misses are worth tracking. See
Referring to
An alternative mechanism to identify the frequently used applications that are worth tracking is to use the accounting tools provided by the operating system. Operating systems typically come with software tools that enable system administrators to keep track of which applications are running on the system, which users are logged on to the system and for how long, etc. These tools are referred to as “accounting tools” since they are used to track the usage of the system and charge the customers based on the usage.
Referring now to
This system call reads all the per-process data structures in the kernel and stores the TLBmisses and clockTics values into the buf provided.
The type pidTlbMisses_t is defined as follows:
In step 303, the TLB miss counters and corresponding CPU times for each application are calculated by adding the TLB miss counters of all the processes belonging to that application. Note that instead of simply adding TLB miss counter values, one could also add weighted TLB miss counter values. In step 304, the Application Page Size Table 500 of
Tables 400, 500, and List 600 are described below.
Shown in
TRACK→the application is already marked for TLB miss tracking;
EVALUATE→the application is being evaluated to determine whether its TLB misses should be traced;
DO_NOT_TRACK→the application should not be tracked for TLB misses.
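The three states can be represented as a C enumeration. The sketch below also shows one way an application in the EVALUATE state might be moved to TRACK or DO_NOT_TRACK. The threshold rule and the names `track_state_t`, `next_state`, and `min_misses_per_tic` are assumptions made for illustration; the description does not specify how the evaluation is decided.

```c
#include <assert.h>
#include <stdint.h>

/* Tracking states named in the description. */
typedef enum { TRACK, EVALUATE, DO_NOT_TRACK } track_state_t;

/* Assumed rule: an application under evaluation is tracked only if its
 * miss rate reaches min_misses_per_tic. Applications already classified,
 * or with no CPU time observed yet, keep their current state. */
static track_state_t next_state(track_state_t state, uint64_t tlb_misses,
                                uint64_t clock_tics,
                                uint64_t min_misses_per_tic)
{
    assert(state == TRACK || state == EVALUATE || state == DO_NOT_TRACK);
    if (state != EVALUATE || clock_tics == 0)
        return state;
    return (tlb_misses >= min_misses_per_tic * clock_tics) ? TRACK
                                                           : DO_NOT_TRACK;
}
```

Comparing `tlb_misses` against `min_misses_per_tic * clock_tics` avoids a division and therefore needs no special case for rounding.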
Once an application is identified as a candidate whose TLB miss rate should be tracked, it will be added to an appPgSzData table as shown in
Since maintaining the TLB misses for every process adds overhead to the system, we want to track the TLB misses for only those applications whose page size change can significantly benefit both themselves and other users of the OS. So, we will also maintain a flag, pgsztrace (702), in each process' kernel data structure to indicate that this process' TLB misses should be tracked. The following syscall provides the interface to set/reset this flag for each process.
The computer system can include a display interface 708 that forwards graphics, text, and other data from the communication infrastructure 802 (or from a frame buffer not shown) for display on the display unit 710. The computer system also includes a main memory 14, preferably random access memory (RAM), and may also include a secondary memory 712. The secondary memory 712 may include, for example, a hard disk drive 714 and/or a removable storage drive 716, representing a floppy disk drive, a magnetic tape drive, an optical disk drive, etc. The removable storage drive 716 reads from and/or writes to a removable storage unit 718 in a manner well known to those having ordinary skill in the art. Removable storage unit 718 represents a floppy disk, a compact disc, magnetic tape, optical disk, etc., which is read by and written to by removable storage drive 716. As will be appreciated, the removable storage unit 718 includes a computer readable medium having stored therein computer software and/or data.
In alternative embodiments, the secondary memory 712 may include other similar means for allowing computer programs or other instructions to be loaded into the computer system. Such means may include, for example, a removable storage unit 722 and an interface 720. Examples of such may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM, or PROM) and associated socket, and other removable storage units 722 and interfaces 720 which allow software and data to be transferred from the removable storage unit 722 to the computer system.
The computer system may also include a communications interface 724. Communications interface 724 allows software and data to be transferred between the computer system and external devices. Examples of communications interface 724 may include a modem, a network interface (such as an Ethernet card), a communications port, a PCMCIA slot and card, etc. Software and data transferred via communications interface 724 are in the form of signals which may be, for example, electronic, electromagnetic, optical, or other signals capable of being received by communications interface 724. These signals are provided to communications interface 724 via a communications path (i.e., channel) 726. This channel 726 carries signals and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, an RF link, and/or other communications channels.
In this document, the terms “computer program medium,” “computer usable medium,” and “computer readable medium” are used to generally refer to media such as main memory 14 and secondary memory 712, removable storage drive 716, a hard disk installed in hard disk drive 714, and signals. These computer program products are means for providing software to the computer system. The computer readable medium allows the computer system to read data, instructions, messages or message packets, and other computer readable information from the computer readable medium. The computer readable medium, for example, may include non-volatile memory, such as a floppy disk, ROM, flash memory, disk drive memory, a CD-ROM, and other permanent storage. It is useful, for example, for transporting information, such as data and computer instructions, between computer systems. Furthermore, the computer readable medium may comprise computer readable information in a transitory state medium such as a network link and/or a network interface, including a wired network or a wireless network, that allow a computer to read such computer readable information.
Computer programs (also called computer control logic) are stored in main memory 14 and/or secondary memory 712. Computer programs may also be received via communications interface 724. Such computer programs, when executed, enable the computer system to perform the features of the present invention as discussed herein. In particular, the computer programs, when executed, enable the processor 12 to perform the features of the computer system. Accordingly, such computer programs represent controllers of the computer system.
Although specific embodiments of the invention have been disclosed, those having ordinary skill in the art will understand that changes can be made to the specific embodiments without departing from the spirit and scope of the invention. The scope of the invention is not to be restricted, therefore, to the specific embodiments. Furthermore, it is intended that the appended claims cover any and all such applications, modifications, and embodiments within the scope of the present invention.