A computing device, such as a desktop, laptop or tablet computer, smartphone, portable digital assistant, portable game console, etc., includes one or more processors, such as central processing units, graphics processing units, digital signal processors, etc. Other electronic devices, such as computer peripheral devices, as well as consumer electronics devices that have not traditionally been referred to as computing devices, may also include one or more processors. In computing and other devices, such a processor reads instructions or software code from a system memory with which the processor communicates via one or more buses, and performs or manages tasks in accordance with its execution of the code. A processor may be programmed in this manner to manage multiple tasks. A unit of code and data, which may be referred to for convenience as a software image, may support a processor's management of hundreds or even thousands of tasks. To promote high throughput, the system memory may be of a type capable of high-speed operation, such as double data rate dynamic random access memory (DDR-DRAM).
Some types of devices, such as portable devices, may have a relatively limited amount of system memory (storage) capacity, such that the memory is incapable of storing the entire software image. A technique commonly known as demand paging may be employed to address this problem. In demand paging, a subset of the software image is stored in a secondary memory and transferred into the system memory in units of pages on an as-needed basis in response to page requests initiated by the processor. The secondary memory may be of a type that is slower than the system memory. Consequently, demand paging may impact the performance of tasks that require a processor to access memory faster than the secondary memory allows.
A demand paging technique has been developed in which a subset of the software image is stored in a compressed form in system memory. In response to a page request initiated by the processor, a portion of the software image is decompressed, and the resulting page is then stored in the system memory for access by the processor.
Systems, methods, and computer programs are disclosed for demand paging in an adaptive, compression-based manner.
In exemplary methods for demand paging, a plurality of compressed software image segments are stored in a memory. Each compressed software image segment corresponds to at least one software task of a plurality of software tasks. Each compressed software image segment comprises one or more pages that are compressed in accordance with a compression characteristic associated with the compressed software image segment and that is different from the compression characteristics of the other compressed software image segments. In response to a page request associated with an executing software task, it is determined whether the page request identifies a page stored in the memory. If the identified page is not stored in the memory, then a portion of one of the compressed software image segments containing the identified page is decompressed into a decompressed page. The decompressed page is then stored in the memory.
Exemplary systems for demand paging include a memory and a processor. The memory is configured to store a plurality of compressed software image segments. Each compressed software image segment corresponds to at least one software task of a plurality of software tasks. Each compressed software image segment comprises one or more pages compressed in accordance with a compression characteristic associated with the compressed software image segment and that is different from compression characteristics of the other compressed software image segments. The processor is configured to: determine whether a page request associated with an executing software task identifies a page stored in the memory; decompress a portion of one of the compressed software image segments containing an identified page into a decompressed page if the identified page is not stored in the memory; and store the decompressed page in the memory in response to the page request.
Exemplary computer program products for demand paging include computer-executable logic embodied in a non-transitory storage medium. Execution of the logic by the processor configures the processor to: determine whether a page request associated with an executing software task identifies a page stored in a memory, wherein the memory has stored therein a plurality of compressed software image segments, each compressed software image segment corresponding to at least one software task of a plurality of software tasks, and wherein each compressed software image segment comprises one or more pages compressed in accordance with a compression characteristic associated with the compressed software image segment and that is different from compression characteristics of the other compressed software image segments; decompress a portion of one of the compressed software image segments containing an identified page into a decompressed page if the identified page is not stored in the memory; and store the decompressed page in the memory in response to the page request.
In the Figures, like reference numerals refer to like parts throughout the various views unless otherwise indicated. For reference numerals with letter character designations such as “102A” or “102B”, the letter character designations may differentiate two like parts or elements present in the same Figure. Letter character designations for reference numerals may be omitted when it is intended that a reference numeral encompass all parts having the same reference numeral in all Figures.
The word “exemplary” is used herein to mean “serving as an example, instance, or illustration.” Any aspect described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects.
The terms “component,” “database,” “module,” “system,” and the like are intended to refer to a computer-related entity, either hardware, firmware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computing device and the computing device may be a component. One or more components may reside within a process and/or thread of execution, and a component may be localized on one computer and/or distributed between two or more computers. In addition, these components may execute from various computer readable media having various data structures stored thereon. The components may communicate by way of local and/or remote processes, such as in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system, distributed system, and/or across a network such as the Internet with other systems by way of the signal).
The term “application” or “image” may also include files having executable content, such as: object code, scripts, byte code, markup language files, and patches. In addition, an “application” referred to herein, may also include files that are not executable in nature, such as documents that may need to be opened or other data files that need to be accessed.
The term “content” may also include files having executable content, such as: object code, scripts, byte code, markup language files, and patches. In addition, “content” referred to herein, may also include files that are not executable in nature, such as documents that may need to be opened or other data files that need to be accessed.
The term “task” may include a process, a thread, or any other unit of execution in a device.
The term “virtual memory” refers to the abstraction of the actual physical memory from the application or image that is referencing the memory. A translation or mapping may be used to convert a virtual memory address to a physical memory address. The mapping may be as simple as 1-to-1 (e.g., physical address equals virtual address), moderately complex (e.g., a physical address equals a constant offset from the virtual address), or the mapping may be complex (e.g., every 4 KB page mapped uniquely). The mapping may be static (e.g., performed once at startup), or the mapping may be dynamic (e.g., continuously evolving as memory is allocated and freed).
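The three kinds of mapping mentioned above can be illustrated with a minimal sketch. The function names, the `PAGE_SIZE` constant, and the dictionary used as a page table are assumptions made for illustration only and are not part of the described embodiments.

```python
from typing import Dict

PAGE_SIZE = 4096  # 4 KB pages, matching the example in the text


def map_identity(vaddr: int) -> int:
    """1-to-1 mapping: the physical address equals the virtual address."""
    return vaddr


def map_offset(vaddr: int, offset: int) -> int:
    """Constant-offset mapping: physical = virtual + a fixed offset."""
    return vaddr + offset


def map_paged(vaddr: int, page_table: Dict[int, int]) -> int:
    """Per-page mapping: each virtual page number maps to its own frame.

    A missing entry raises KeyError, which stands in for an unmapped page.
    """
    vpn, page_offset = divmod(vaddr, PAGE_SIZE)
    frame = page_table[vpn]
    return frame * PAGE_SIZE + page_offset
```

A static mapping would correspond to building `page_table` once at startup, while a dynamic mapping would correspond to updating it as memory is allocated and freed.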
In this description, the terms “communication device,” “wireless device,” “wireless telephone”, “wireless communication device,” and “wireless handset” are used interchangeably. With the advent of third generation (“3G”) and fourth generation (“4G”) wireless technology, greater bandwidth availability has enabled more portable computing devices with a greater variety of wireless capabilities. The term “portable computing device” (“PCD”) is used to describe any device operating on a limited-capacity power supply, such as a battery, and lacking a system for removing excess thermal energy (i.e., for cooling, such as a fan, etc.). A PCD may be a cellular telephone, a satellite telephone, a pager, a PDA, a smartphone, a navigation device, a smartbook or reader, a media player, a laptop or hand-held computer with a wireless connection, or a combination of the aforementioned devices, among others.
As illustrated in
In the exemplary embodiment, a form of demand paging is employed because, for example, memory 104 may not have sufficient storage capacity to contain the entire software image associated with the control by processor 102 of tasks 108a-108f. Nevertheless, in other embodiments, the methods and systems described herein may be employed regardless of whether there is sufficient system memory to contain the software image. As described below in further detail, the demand paging method employs data compression.
In the exemplary embodiment, a portion of the software image is compressed to form two or more compressed software image segments 110. Compressed software image segments 110 may comprise, for example, three compressed software image segments 110a, 110b and 110c. Compressed software image segments 110a-110c are stored in memory 104. Although these three compressed software image segments 110a-110c are described herein for purposes of illustration in relation to an exemplary embodiment, any other number of compressed software image segments 110 may exist in other embodiments. Although compressed software image segments 110a, 110b and 110c are illustrated for purposes of clarity as being separated from one another, they may occupy contiguous memory address space.
Another portion of the software image may also be stored in memory 104 in an uncompressed form. Some or all of this uncompressed portion of the software image may be in the form of a page pool 112. Although page pool 112 is illustrated for purposes of clarity as being separate from compressed software image segments 110, page pool 112 and compressed software image segments 110 may occupy contiguous memory address space.
Each of compressed software image segments 110a-110c comprises one or more pages compressed in accordance with a unique compression characteristic. That is, compressed software image segment 110a is compressed in accordance with a compression characteristic that differs from the compression characteristics with which compressed software image segments 110b and 110c are respectively compressed; compressed software image segment 110b is compressed in accordance with a compression characteristic that differs from the compression characteristics with which compressed software image segments 110a and 110c are respectively compressed; and compressed software image segment 110c is compressed in accordance with a compression characteristic that differs from the compression characteristics with which compressed software image segments 110a and 110b are respectively compressed. As described in further detail below, the compression characteristic may be, for example, compression algorithm, compression block size, or a combination of compression algorithm and compression block size.
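One way to picture a compression characteristic as a combination of algorithm and block size is the following sketch. The choice of zlib and LZMA, the block sizes, and the names `CompressionCharacteristic`, `FAST`, and `DENSE` are assumptions chosen for illustration; the description does not name particular algorithms or sizes.

```python
import lzma
import zlib
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class CompressionCharacteristic:
    """Hypothetical pairing of an algorithm with a compression block size."""
    name: str
    block_size: int
    compress: Callable[[bytes], bytes]
    decompress: Callable[[bytes], bytes]


def compress_segment(image: bytes, ch: CompressionCharacteristic) -> List[bytes]:
    """Split a segment into fixed-size blocks and compress each independently."""
    return [ch.compress(image[i:i + ch.block_size])
            for i in range(0, len(image), ch.block_size)]


def decompress_segment(blocks: List[bytes], ch: CompressionCharacteristic) -> bytes:
    """Decompress every block and reassemble the original segment."""
    return b"".join(ch.decompress(b) for b in blocks)


# Two distinct characteristics: a fast algorithm with small blocks, and a
# denser but slower algorithm with larger blocks (illustrative values only).
FAST = CompressionCharacteristic("zlib-level1/4K", 4096,
                                 lambda d: zlib.compress(d, 1), zlib.decompress)
DENSE = CompressionCharacteristic("lzma/16K", 16384,
                                  lzma.compress, lzma.decompress)
```

Because each block is compressed independently, a smaller block size generally means less data has to be decompressed to recover a single page, at some cost in compression ratio.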
Each of compressed software image segments 110 is associated with at least one of tasks 108. For example, as illustrated in
Decompression logic 114 is also associated with compressed software image segments 110 and tasks 108, as indicated in
As described below with regard to an exemplary method, each of compressed software image segments 110a, 110b and 110c comprises one or more pages that may be decompressed into page pool 112. Decompression logic element 114a may be employed to decompress portions of software image segment 110a. Decompression logic element 114b may be employed to decompress portions of software image segment 110b. Decompression logic element 114c may be employed to decompress portions of software image segment 110c.
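The association between each compressed segment and its own decompression logic element might be sketched as follows. This is a simplified illustration, assuming one compressed block per page and using zlib and LZMA as stand-ins for the unspecified algorithms; the `Segment` class, the `decompress_into_pool` function, and the dictionary used as a page pool are hypothetical.

```python
import lzma
import zlib

PAGE_SIZE = 4096  # assumed page size


class Segment:
    """A compressed segment bundled with its matching decompression routine."""

    def __init__(self, name, pages, compress, decompress):
        self.name = name
        self.decompress = decompress  # plays the role of a logic element 114x
        self.compressed_pages = [compress(p) for p in pages]


def decompress_into_pool(segment, page_index, page_pool):
    """Decompress one page with the segment's own routine and place it in the pool."""
    page = segment.decompress(segment.compressed_pages[page_index])
    page_pool[(segment.name, page_index)] = page
    return page


# Example: two segments, each decompressed by a different routine.
seg_a = Segment("110a", [b"a" * PAGE_SIZE, b"b" * PAGE_SIZE],
                lambda d: zlib.compress(d, 1), zlib.decompress)
seg_c = Segment("110c", [b"c" * PAGE_SIZE], lzma.compress, lzma.decompress)
```

Keeping the decompression routine with the segment means the caller never has to know which characteristic was used to compress a given page.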
As illustrated in
As indicated by block 204, one of tasks 108a-108f may initiate a page request. As well understood by one of ordinary skill in the art, a page request may occur when the processing system attempts to access a system memory address within a unit of memory space known as a page at a time at which the portion of the software image corresponding to that page is not resident in memory. Paging is described in further detail below.
As indicated by block 206, in response to a page request, a portion of the compressed software image containing the requested page is decompressed. The decompression is performed using decompression logic 114 that is associated with the one of compressed software image segments 110 that contains the requested page (in compressed form). Thus, in the exemplary embodiment illustrated in
As illustrated in
In
Continuing in order of latency tolerance, it can be noted that task 108d has a higher latency tolerance than task 108c, task 108e has a higher latency tolerance than task 108d, and task 108f has the highest latency tolerance among tasks 108a-108f. Tasks 108d-108f may have similar enough latency tolerances that they are grouped together relative to the other tasks 108. Accordingly, compressed software image segment 110c, which is associated with each of tasks 108d-108f, is compressed in accordance with a compression characteristic that provides slower decompression than the characteristic with which compressed software image segment 110b is compressed. Accordingly, decompression logic element 114c may be characterized by, for example, a slower compression algorithm and a larger block size than the algorithm and block size that characterize decompression logic element 114b.
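The grouping of tasks with similar latency tolerances can be sketched as a ranking followed by a split at chosen boundaries. The numeric tolerances, the boundary values, and the function name `group_tasks` are invented for illustration; only the grouping of tasks 108d-108f together is taken from the description above.

```python
from typing import Dict, List


def group_tasks(latency_tolerance: Dict[str, float],
                boundaries: List[float]) -> List[List[str]]:
    """Rank tasks by latency tolerance and split them into groups.

    `boundaries` must be sorted ascending; a task falls into group k when its
    tolerance exceeds exactly k of the boundaries. Each resulting group would
    share one compressed segment and one compression characteristic.
    """
    groups: List[List[str]] = [[] for _ in range(len(boundaries) + 1)]
    for task in sorted(latency_tolerance, key=latency_tolerance.get):
        tolerance = latency_tolerance[task]
        index = sum(tolerance > b for b in boundaries)
        groups[index].append(task)
    return groups


# Hypothetical tolerances (arbitrary units), lowest tolerance first.
tolerances = {"108a": 1, "108b": 5, "108c": 6,
              "108d": 20, "108e": 25, "108f": 30}
groups = group_tasks(tolerances, boundaries=[2, 10])
```

With these made-up values, the last group contains tasks 108d, 108e, and 108f, which would then be served by the slowest-decompressing characteristic.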
A further differentiation in decompression speed may be provided in an exemplary embodiment by employing hardware-based decompression logic 115 in the decompression of only those of software image segments 110 corresponding to tasks 108 having lower latency tolerances or higher priorities. For example, decompression logic element 114a may offload the work of decompression to hardware-based decompression logic 115, while decompression logic elements 114b and 114c perform the work of decompression themselves (i.e., as software-based computations, without the aid of hardware-based decompression logic 115). Alternatively, decompression logic elements 114a and 114b may offload the work of decompression to hardware-based decompression logic 115, while decompression logic element 114c performs the work of decompression.
In some instances, there may be an inverse relationship between a task's latency tolerance and the task's priority. “Priority” of a task relates to the degree to which the performance of a task in relation to the performance of other tasks affects performance results of a system encompassing the tasks. A task that affects system performance to a greater extent than another task may be assigned a higher priority than the other task. A task may have a higher latency tolerance and a lower priority than some other tasks. Conversely, a task may have a lower latency tolerance and a higher priority than some other tasks.
In the exemplary embodiment, the system clock and/or voltage level may be adjusted using, for example, dynamic voltage and frequency scaling (DVFS) techniques in response to the priority of a task. As illustrated in
In the exemplary embodiment, software image segments associated with higher-priority tasks may be compressed and stored in a system memory that is characterized by low latency (i.e., high access speed), while software image segments associated with lower-priority tasks may be compressed and stored in a secondary memory that is characterized by a higher latency (i.e., lower access speed) than the system memory. For example, as conceptually illustrated in
In FIGS. 5A-5B, an exemplary method 500 that is similar to the above-described exemplary method 200 is illustrated. Block 502 is similar to above-described block 202. A software tool (not shown) may be employed to generate compressed software image segments 110. The tool receives the software image and the compression characteristics as inputs. The tool may also receive information identifying the various tasks 108 and their respective latency tolerances and/or priorities. Such information may be determined empirically or in other ways, as understood by one of ordinary skill in the art. The tool maintains an ordered list of tasks 108, ranked in order of latency tolerance and/or priority, as described above with regard to
Block 501 is similar to above-described block 204. In further detail, block 501 comprises blocks 504, 506, 508, 510 and 512. As indicated by block 504, one of tasks 108a-108f may initiate a page request. A page request is identified by a virtual address of the requested page. As indicated by block 506, the virtual address may be translated into a physical address in memory 104 using a translation lookaside buffer or “TLB” (not shown). As indicated by block 508, it is determined whether the physical address is present in the TLB. A determination that a physical address is present in the TLB is commonly referred to as a “TLB hit.” A determination that a physical address is not present in the TLB is commonly referred to as a “TLB miss.” If it is determined that a TLB hit did not occur (i.e., a TLB miss occurred), it is then determined whether the physical address is present in a page table (not shown), as indicated by block 510. A determination that a physical address is present in the page table is commonly referred to as a “page table hit.” A determination that a physical address is not present in the page table is commonly referred to as a “page table miss.” If it is determined that neither a TLB hit nor a page table hit occurred (i.e., both a TLB miss and a page table miss occurred), then a portion of the one of software image segments 110a-110c that is associated with the requesting one of tasks 108a-108f is decompressed. As described above with regard to block 206, which is similar to block 514, this decompression is performed using the one of decompression logic elements 114a-114c associated with the one of compressed software image segments 110a-110c containing the requested page.
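The lookup order described above (TLB first, then page table, then decompression on a double miss) can be sketched as follows. This is a minimal illustration under stated assumptions: the `Pager` class, its dictionaries standing in for the TLB and page table, the list standing in for the page pool, and the `decompress_page` callback are all hypothetical, and page eviction is not modeled.

```python
PAGE_SIZE = 4096  # assumed page size


class Pager:
    """Toy model of the TLB -> page table -> decompress sequence."""

    def __init__(self, decompress_page):
        self.tlb = {}          # virtual page number -> frame index (cache)
        self.page_table = {}   # virtual page number -> frame index
        self.frames = []       # resident decompressed pages (the page pool)
        self.decompress_page = decompress_page  # callback: vpn -> page bytes
        self.decompressions = 0

    def access(self, vaddr):
        """Return the byte at vaddr, decompressing its page only on a double miss."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.tlb:                      # TLB hit
            frame = self.tlb[vpn]
        elif vpn in self.page_table:             # TLB miss, page table hit
            frame = self.page_table[vpn]
            self.tlb[vpn] = frame
        else:                                    # both missed: page not resident
            page = self.decompress_page(vpn)
            self.decompressions += 1
            self.frames.append(page)
            frame = len(self.frames) - 1
            self.page_table[vpn] = frame
            self.tlb[vpn] = frame
        return self.frames[frame][offset]
```

In a fuller model, `decompress_page` would select the decompression routine associated with the segment that contains the requested page.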
As indicated by block 513, in conjunction with the paging and decompression described above, system 100 (
Continuing to
As indicated by block 518, the decompressed page is mapped into the page table and TLB. As the management of a TLB and page table are well understood by one of ordinary skill in the art, further details of such processes are not described herein. Demand paging logic 519 (
As indicated by block 520, the DVFS characteristic may be returned to its previous setting following the above-described decompression. However, in an instance in which two or more decompressions are to be performed in immediate succession, and following one such decompression the DVFS characteristic is already set to a level or setting associated with the requesting task 108 associated with the next decompression, the DVFS characteristic need not be returned to its previous level or setting per block 520, and can remain at its then-current setting. Following the above-described paging, decompression and DVFS adjustment, the requesting task 108 may access the decompressed page in memory 104 and otherwise continue to execute.
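The set-and-restore behavior, including the case in which the setting is left unchanged between back-to-back decompressions, can be sketched as follows. The `DvfsController` class, the integer levels, and the `run_decompressions` function are assumptions for illustration and do not represent an actual DVFS interface.

```python
class DvfsController:
    """Toy controller that records how often the setting actually changes."""

    def __init__(self, level):
        self.level = level
        self.changes = 0

    def set_level(self, level):
        if level != self.level:  # no work if already at the requested level
            self.level = level
            self.changes += 1


def run_decompressions(ctrl, requests):
    """Run queued decompressions, each at the level tied to its requesting task.

    `requests` is a list of (level, decompress_fn) pairs. After each
    decompression the previous (baseline) setting is restored, unless the next
    queued decompression wants the level that is already in effect.
    """
    baseline = ctrl.level
    results = []
    for i, (level, decompress_fn) in enumerate(requests):
        ctrl.set_level(level)
        results.append(decompress_fn())
        next_level = requests[i + 1][0] if i + 1 < len(requests) else None
        if next_level != ctrl.level:
            ctrl.set_level(baseline)
    return results
```

For a queue of three requests at levels 3, 3, and 2 starting from baseline 1, this sketch performs four setting changes rather than the six that an unconditional set-and-restore would perform.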
If it is determined that there was a hit in either the TLB or the page table, then neither the decompression described above with regard to blocks 514 and 516 nor the DVFS adjustment described above with regard to block 520 is performed. As a page hit indicates that the requested page is already in memory 104, the requesting task may access the page and otherwise continue to execute.
It should be appreciated that one or more of the method steps or acts described above may be stored in memory 104 as computer program instructions. These instructions may be executed by any type of processor 102 in any type of device to perform the methods described herein.
Although certain acts or steps in the above-described process flows naturally precede others for the exemplary embodiments to operate as described, the invention is not limited to the order of those acts or steps if such order or sequence does not alter the functionality of the invention. That is, it is recognized that some acts or steps may be performed before, after, or in parallel with (substantially simultaneously with) other acts or steps without departing from the scope and spirit of the invention. In some instances, certain acts or steps may be omitted or not performed without departing from the invention. Further, words such as “thereafter,” “then,” “next,” etc., are not intended to limit the order of the acts or steps. These words are simply used to guide the reader through the descriptions of the exemplary methods.
Additionally, one of ordinary skill in the art is capable of writing computer code or identifying appropriate hardware and/or circuits to implement the disclosed invention without difficulty, based on the flow diagrams and associated description in this specification, for example.
Therefore, disclosure of a particular set of program code instructions or detailed hardware devices is not considered necessary for an adequate understanding of how to make and use the invention. The inventive functionality of the claimed computer-implemented processes is explained in the above description and in conjunction with the drawing figures, which may illustrate various process flows.
In one or more exemplary aspects, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be embodied in computer-executable instructions or code stored on a computer-readable medium. Computer-readable media include any available media that may be accessed by a computer or similar computing or communication device. By way of example, and not limitation, such computer-readable media may comprise RAM, ROM, EEPROM, NAND flash, NOR flash, M-RAM, P-RAM, R-RAM, CD-ROM or other optical, magnetic, solid-state, etc., data storage media. It should be noted that a combination of a non-transitory computer-readable storage medium and the computer-executable logic or instructions stored therein for execution by a processor defines a “computer program product” as that term is understood in the patent lexicon.
As illustrated in
As illustrated in
In such exemplary embodiments, computer system 700 may further include a system memory 708 and mass-storage devices, such as a non-removable media (e.g., FLASH memory, eMMC, magnetic disk, etc.) data storage 710 and a removable-media drive 712 (e.g., DVD-ROM, CD-ROM, Blu-ray disc, etc.). For example, removable-media drive 712 may accept a DVD-ROM 713. The terms “disk” and “disc,” as used herein, include compact disc (“CD”), laser disc, optical disc, digital versatile disc (“DVD”), floppy disk and Blu-ray disc. Combinations of the above are also included within the scope of computer-readable media. Computer system 700 also includes a USB port 714, to which user interface devices or other peripheral devices, such as a mouse 716 and a keyboard 718, may be connected. In addition, computer system 700 may include a network interface 720 to enable communication between computer system 700 and an external network, such as the Internet. User interface peripheral devices may also include a video monitor 722, which may be connected to video processor 706.
In
Block 901 is similar to above-described block 501. As described above with regard to block 501, if it is determined that a page fault occurred, then a portion of the software image segment that is associated with the requesting task is obtained and decompressed. As indicated by block 914, in the case of the requesting task having a high priority and/or low latency tolerance, the software image segment portion to be decompressed is retrieved from the system memory, where that software image segment had been stored in accordance with above-described block 902. In contrast, as indicated by block 915, in the case of the requesting task having a low priority and/or high latency tolerance, the software image segment portion to be decompressed is retrieved from the secondary memory, where that software image segment had been stored in accordance with above-described block 903. In the case of either block 914 or 915, this decompression is performed using decompression logic associated with the compressed software image segment containing the requested page, as in the other embodiments described above. Thus, in the same manner as described above with regard to exemplary methods 200 and 500, in exemplary method 900 the decompression logic that is employed is associated with a compression characteristic, i.e., a combination of one or more of compression algorithm, compression block size, and DVFS setting, associated with the priority and/or latency tolerance of the requesting task.
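The placement of compressed segments by priority, and the corresponding retrieval on a page fault, might be sketched as follows. The dictionaries standing in for system memory and secondary memory, the numeric priorities and threshold, the function names, and the use of zlib for every segment are simplifying assumptions for illustration.

```python
import zlib


def store_segments(segments, priority_threshold):
    """Place each compressed segment according to its task priority.

    `segments` maps a segment name to (priority, uncompressed bytes). Segments
    at or above the threshold go to system memory; the rest go to secondary
    memory.
    """
    system_memory, secondary_memory = {}, {}
    for name, (priority, data) in segments.items():
        target = system_memory if priority >= priority_threshold else secondary_memory
        target[name] = zlib.compress(data)
    return system_memory, secondary_memory


def fetch_and_decompress(name, system_memory, secondary_memory):
    """Retrieve a segment from wherever it was stored, then decompress it."""
    if name in system_memory:
        return "system", zlib.decompress(system_memory[name])
    return "secondary", zlib.decompress(secondary_memory[name])


# Hypothetical priorities: higher numbers mean higher priority.
system_mem, secondary_mem = store_segments(
    {"high": (9, b"H" * 4096), "low": (2, b"L" * 4096)}, priority_threshold=5)
```

In this sketch a request for the high-priority segment is served from the faster store, while the low-priority segment incurs the additional latency of the secondary store before decompression begins.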
Referring again to the embodiment illustrated in
Block 913 is similar to above-described block 513 in that a DVFS characteristic may be set temporarily and used while decompression is being performed, then returned (block 920) to its previous setting. Of course, if the DVFS characteristic is already set to a level or setting associated with the requesting task 108, it need not be set again to the same level or setting per block 913, and can remain at its then-current setting. Continuing to
In a particular aspect, one or more of the method steps described herein (such as described above with regard to
As illustrated in
A stereo audio CODEC 830 may be coupled to the analog signal processor 806. Also, an audio amplifier 832 may be coupled to the stereo audio CODEC 830. In an exemplary aspect, a first stereo speaker 834 and a second stereo speaker 836 are coupled to the audio amplifier 832. In addition, a microphone amplifier 838 may be coupled to the stereo audio CODEC 830. A microphone 840 may be coupled to the microphone amplifier 838. In a particular aspect, a frequency modulation (“FM”) radio tuner 842 may be coupled to the stereo audio CODEC 830. Also, an FM antenna 844 is coupled to the FM radio tuner 842. Further, stereo headphones 846 may be coupled to the stereo audio CODEC 830.
A radio frequency (“RF”) transceiver 848 may be coupled to the analog signal processor 806. An RF switch 850 may be coupled between the RF transceiver 848 and an RF antenna 852. The RF transceiver 848 may be configured to communicate with conventional terrestrial communications networks, such as mobile telephone networks, as well as with global positioning system (“GPS”) satellites.
A mono headset with a microphone 856 may be coupled to the analog signal processor 806. Further, a vibrator device 858 may be coupled to the analog signal processor 806. A power supply 860 may be coupled to the on-chip system 802. In a particular aspect, the power supply 860 is a direct current (“DC”) power supply that provides power to the various components of the portable communication device 800 that require power. Further, in a particular aspect, the power supply is a rechargeable DC battery or a DC power supply that is derived from an alternating current (“AC”) to DC transformer that is connected to an AC power source.
A DVFS controller 862 may be coupled to DSP 804. DVFS controller 862 may respond to control signals received from DSP 804 by adjusting a DVFS setting that affects a DVFS characteristic, such as a system clock frequency applied to DSP 804.
A keypad 854 may be coupled to the analog signal processor 806. The touchscreen display 812, the video port 818, the USB port 822, the camera 828, the first stereo speaker 834, the second stereo speaker 836, the microphone 840, the FM antenna 844, the stereo headphones 846, the RF switch 850, the RF antenna 852, the keypad 854, the mono headset 856, the vibrator 858, and the power supply 860 are external to the on-chip system 802.
In a particular aspect, one or more of the method steps described herein (such as described above with regard to
Alternative embodiments will become apparent to one of ordinary skill in the art to which the invention pertains without departing from its spirit and scope. Therefore, although selected aspects have been illustrated and described in detail, it will be understood that various substitutions and alterations may be made therein without departing from the spirit and scope of the present invention, as defined by the following claims.