One of the features of a computing device, such as a mobile phone or a tablet, is running applications (or “apps”). The code and data fragments of active applications are stored in and accessed from relatively fast volatile memory (e.g., RAM), as compared to relatively slower non-volatile memory (e.g., flash). However, a computing device typically has a relatively small amount of volatile memory as compared to non-volatile memory, so there is a limit on the number of applications whose code and data fragments can all be loaded into volatile memory. As such, an operating system on the computing device can decide, according to its own heuristics, to end (or “kill”) one or more applications currently running in volatile memory in order to provide volatile memory resources for a different (currently-running or newly-executed) application that has a higher priority. For example, an application manager in the user space can compute a priority parameter (sometimes known as adjustment) and report this parameter to the operating system kernel. Whenever memory resources are insufficient for fulfilling a memory allocation request of a process in the kernel, the kernel can free some memory by killing a low-priority process, as indicated by the adjustment parameter.
Embodiments of the present invention are defined by the claims, and nothing in this section should be taken as a limitation on those claims.
By way of introduction, the below embodiments relate to a computing device and method for predicting low memory conditions. In one embodiment, a computing device is provided having volatile memory, non-volatile memory, and a processor. The processor generates a metric predictive of an upcoming low-memory condition in the volatile memory. The processor then compares the metric to a threshold. If the metric exceeds the threshold, the processor creates free space in the volatile memory.
Other embodiments are possible, and each of the embodiments can be used alone or together in combination. Accordingly, various embodiments will now be described with reference to the attached drawings.
As mentioned above, whenever memory resources are insufficient for fulfilling a memory allocation request of a process in the kernel, the kernel can free some memory by killing a low-priority process. Low-priority processes can be indicated by a priority parameter (sometimes known as adjustment) computed by an application manager in the user space. However, the problem with this approach is that once memory allocation requests cannot be fulfilled, it may be too late to begin performing memory management, and the performance of the computing device may be degraded. The following embodiments can be used to overcome this problem. Before turning to these and other embodiments, the following section provides a discussion of exemplary computing and storage devices that can be used with these embodiments. Of course, these are just examples, and other suitable types of computing and storage devices can be used.
Turning now to the drawings,
The processor 110 is responsible for running the general operation of the computing device 100. This includes, for example, running an operating system, as well as various applications. The computer-readable program code for the operating system and applications can be stored in the non-volatile memory 120 and then loaded into the volatile memory 130 for execution. The following embodiments provide several examples of methods that can be performed by the processor 110.
The non-volatile and volatile memories 120, 130 can take any suitable form. For example, the volatile memory 130 can use any current or future technology for implementing random access memory (RAM). In one embodiment, the non-volatile memory 120 takes the form of a solid-state (e.g., flash) memory and can be one-time programmable, few-time programmable, or many-time programmable. The non-volatile memory 120 can also use single-level cell (SLC), multiple-level cell (MLC), triple-level cell (TLC), or other memory technologies, now known or later developed.
The non-volatile memory 120 can simply be a memory chip or can be part of a self-contained storage device with its own controller. An example of such a storage device 200 is shown in
The controller 210 also comprises a central processing unit (CPU) 213, an optional hardware crypto-engine 214 operative to provide encryption and/or decryption operations, random access memory (RAM) 215, read only memory (ROM) 216 which can store firmware for the basic operations of the storage device 200, and a non-volatile memory (NVM) 217 which can store a device-specific key used for encryption/decryption operations, when used. The controller 210 can be implemented in any suitable manner. For example, the controller 210 can take the form of a microprocessor or processor and a computer-readable medium that stores computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Suitable controllers can be obtained from Marvell or SandForce.
The storage device 200 can be embedded in or removably connected with the computing device 100. For example, the storage device 200 can take the form of an iNAND™ eSD/eMMC embedded flash drive by SanDisk Corporation or can take the form of a removable memory device, such as a Secure Digital (SD) memory card, a microSD memory card, a Compact Flash (CF) memory card, or a universal serial bus (USB) device.
Returning to
As shown in
In the user space, the relevant objects are applications (e.g., such as apps for making a phone call, taking a picture, opening a video, etc.), and each application translates into a process (or several processes) that need to run in order to support the application's functionality. Each process has a projection into the kernel space. From the operating system kernel's perspective, a process is an entity that requires resources: memory, time slots to run in, structures that describe the process, etc. The operating system kernel 310 is the process manager and allocates the memory resources and the time slots where the process can run. So, in some sense, the processes can be said to run in the operating system kernel 310; however, the operating system kernel 310 has no knowledge of the functionality of the processes. The operating system kernel 310 does not even know if a process is running in the background or foreground. From the operating system kernel's perspective, the process is defined by the resources it needs to support it.
In the user space, the application management layer 305 is aware of the functionality of each process, of the processes associated with each application 300, and of the priority of an application 300 and its associated processes. In order to support the operating system kernel 310 in its role of resource allocation to the processes running in the operating system kernel 310, the application management layer 305 in the user space computes a priority parameter, sometimes known as adjustment, and reports this parameter to the operating system kernel 310. Typically, the adjustment parameter is added to the structure defining the process (i.e., the reflection of the process in the kernel space) and will be updated on a regular basis. For example, the adjustment parameter can be defined as a 16-level parameter where a low value indicates high priority and a high value indicates low priority.
Whenever memory resources are insufficient for fulfilling a memory allocation request of a process (in the operating system kernel 310), the operating system kernel 310 may free some memory in the volatile memory 130, either by swapping (i.e., moving some data from the volatile memory 130 (e.g., RAM) into the non-volatile memory (e.g., main storage)) or by ending (or “killing”) low-priority processes (as indicated by the adjustment parameter). The operating system kernel 310 can compute a first threshold function: A=F(free memory, required memory), where A is a number in the range of the adjustment parameter. Then, the operating system kernel 310 can kill any process with an adjustment greater than (or equal to) A in order to fulfill the requests from current processes.
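The kill decision described above can be sketched as follows. The linear form chosen for F, the page counts, and the process list are illustrative assumptions (the kernel's actual function and data structures are not specified here); the sketch only shows how a shortfall maps onto the 16-level adjustment range and how victims are selected.

```python
def threshold_adjustment(free_pages, required_pages, max_adj=15):
    """Hypothetical first threshold function A = F(free, required).

    Maps the memory shortfall onto the 16-level adjustment range:
    the larger the shortfall, the lower the returned A, so more
    (lower-priority) processes become eligible to be killed.
    """
    if free_pages >= required_pages:
        return max_adj + 1  # no shortfall: no process qualifies
    shortfall = (required_pages - free_pages) / required_pages
    return round(max_adj * (1.0 - shortfall))

def processes_to_kill(processes, free_pages, required_pages):
    """Return processes whose adjustment >= A, lowest priority first."""
    a = threshold_adjustment(free_pages, required_pages)
    victims = [p for p in processes if p["adj"] >= a]
    return sorted(victims, key=lambda p: p["adj"], reverse=True)

# Illustrative process list: low adj = high priority, high adj = low priority.
procs = [{"name": "camera", "adj": 2}, {"name": "game", "adj": 9},
         {"name": "old_browser_tab", "adj": 15}]
victims = processes_to_kill(procs, free_pages=100, required_pages=400)
```

Here a request for 400 pages against 100 free pages yields A=4, so the two lowest-priority processes become candidates while the high-priority camera process is spared.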
Because this approach reacts to low memory conditions according to “memoryless” (i.e., without relation to prior history and dynamics of memory allocation) snapshots of various system metrics, it reacts only when requests for additional memory cannot be fulfilled. However, at this stage, performing the memory management operations to create free space, such as swapping memory from RAM into main storage, may degrade overall performance and user experience. That is, the problem with this approach is that once memory resources cannot be fulfilled, it may be too late to begin performing memory management, and the performance of the computing device (e.g., a smart phone) may be degraded.
The following embodiments can be used to predict low memory conditions early enough that free memory can be created in the volatile memory 130 before the user experiences performance degradation.
Returning to the drawings,
As discussed above, the first act in this method is to generate a metric predictive of an upcoming low-memory condition in the volatile memory 130 (act 410). In general, the metric can take into account the history of the computing device 100 and the dynamics of the metric evolvement. The following paragraphs provide a number of examples of various metrics that can be used. However, it should be understood that the claims are not limited to any particular example unless expressly recited therein. Also, it should be understood that these are just examples, and other types of metrics can be used. Further, as will be discussed below, while a single metric can be used, multiple metrics can be used, with each of the metrics being given the same or different weights.
In one example, the metric compares an amount of memory that was allocated versus an amount of memory that was requested over a period of time. This can be measured by comparing a number of allocation requests that were granted versus a number of allocation requests that were denied. This can also be measured by comparing a size (e.g., megabits) of memory allocation requests versus a size of those memory allocation requests that were denied. Alternatively, a weighted combination of the two can be considered. Denoting a successful memory allocation by M and a failed memory request by F, the ratio of successful to unsuccessful requests (in a time window) may be expressed as: f(M, F)=#M/#F, where #X denotes the number of requests of type X.
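The ratio f(M, F) over a time window can be sketched as follows; the class name, the event-count window (rather than a wall-clock window), and the handling of the zero-failure case are illustrative assumptions.

```python
from collections import deque

class AllocationRatioMetric:
    """Tracks f(M, F) = #M / #F over a sliding window of the most
    recent allocation outcomes."""

    def __init__(self, window=100):
        # True = successful allocation (M), False = failed request (F)
        self.events = deque(maxlen=window)

    def record(self, success):
        self.events.append(success)

    def value(self):
        m = sum(self.events)          # count of successes in window
        f = len(self.events) - m      # count of failures in window
        return m / f if f else float("inf")  # no failures observed yet

metric = AllocationRatioMetric(window=10)
for outcome in [True, True, False, True, False]:  # pattern M M F M F
    metric.record(outcome)
# 3 successes and 2 failures in the window -> f(M, F) = 1.5
```

A falling value of this ratio indicates that failed (or difficult) allocations are becoming more frequent relative to successes, which is the early signal the embodiments act upon.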
A failed allocation can include not only allocations that were refused by the memory allocation unit (e.g., kmalloc, the kernel memory allocator), but also memory requests that the memory allocation unit had difficulties in providing. The difficulty in providing a memory allocation request can be measured by the time it took for the memory allocation unit to provide the requested memory or by measuring the number of software units that were referred to for providing the request. For example, the kmalloc function for allocating memory can respond immediately to a memory request if there exists a big enough contiguous chunk of memory available. Alternatively, if there is not a big enough contiguous chunk of memory available, then the kmalloc function can call upon sub-functions to find multiple chunks of memory that accumulate to the desired request, or it can call a swap mechanism to free some memory. A failure, F, can sometimes be associated with a memory request if the kmalloc function had to call other sub-functions before allocating the memory.
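This broader notion of failure can be sketched as a classifier over two observable signals; the latency limit and the function name are illustrative assumptions, not values taken from any actual allocator.

```python
def classify_allocation(latency_us, subcalls, latency_limit_us=500):
    """Hypothetical classifier: count a request as a failure (F) not only
    when it is refused outright, but also when the allocator struggled --
    measured here by elapsed time, or by the number of helper routines
    (e.g., chunk-gathering or swap paths) it had to invoke.
    """
    if latency_us > latency_limit_us or subcalls > 0:
        return "F"
    return "M"

classify_allocation(latency_us=40, subcalls=0)   # fast, direct -> "M"
classify_allocation(latency_us=40, subcalls=2)   # needed helpers -> "F"
classify_allocation(latency_us=900, subcalls=0)  # slow -> "F"
```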
The last examples were described from the kernel's perspective and took into consideration the memory condition and memory dynamics as seen by the kernel. They did not take the user application perspective into consideration. In another embodiment, similar dynamics in the user space can be taken into consideration. According to one example, the adjustment parameter described above can be traced along a sliding window, and its dynamic behavior can be taken into account when considering which processes may be killed. Alternatively, the toggling of a process between foreground and background modes can be traced.
In another example, a metric can record successful memory allocations and failed memory allocations in a string that denotes a pattern of successful and unsuccessful memory allocation requests over a period of time. For example, the string “MMFFM” can denote two successful allocations followed by two failed allocations followed by one successful allocation. According to this example, the metric is a function of the pattern of the success/fail strings of this type and can be generated for a fixed or sliding window of a predefined or variable length. By using this type of metric, the processor 110 can take into account the specific pattern of the string and allocate memory resources accordingly. More generally, the processor 110 can take into account other patterns of allocation successes and failures (not only a moving window summary).
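One way such a pattern-sensitive metric could be computed is sketched below; the particular scoring rule (weighting a run of consecutive failures more heavily than the same number of isolated failures) is an illustrative assumption.

```python
def pattern_score(history):
    """Score a success/fail pattern string such as 'MMFFM'.

    A run of consecutive failures is weighted progressively (1 for the
    first F in a run, 2 for the second, ...), so 'FF' signals more
    memory pressure than two isolated F's -- information a simple
    moving-window count of failures would miss.
    """
    score, run = 0, 0
    for ch in history:
        if ch == "F":
            run += 1
            score += run
        else:
            run = 0
    return score

pattern_score("MFMFM")  # two isolated failures:    1 + 1 = 2
pattern_score("MMFFM")  # two consecutive failures: 1 + 2 = 3
```

Both strings contain the same counts of M and F, yet the second scores higher, illustrating how the specific pattern (and not only a window summary) can inform the metric.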
As another example, a metric can be a function of a number of memory allocation requests over a period of time. An increased number of memory allocation requests can indicate that enhanced memory usage is expected in the near future, even though there is enough free memory at the present moment. When the processor 110 decides that enhanced memory usage is expected, it can initiate memory management activities to free more memory.
Another exemplary metric predictive of an upcoming low-memory condition in the volatile memory 130 is based on information about one or more applications 300 received from the user space (e.g., information about the initiation or termination of an application 300 or that an application 300 is going to the background). Other examples include, but are not limited to, an amount of free space in the volatile memory 130, a cache page size, a number of processes beginning and terminating in a period of time, and an amount of memory allocated and de-allocated in a period of time. It should be noted that while several of the metrics described above were discussed as one measurement versus another measurement (e.g., the number of successful allocations versus the number of unsuccessful allocations), each of the metrics can stand on its own and does not necessarily need to be compared to another metric (e.g., looking at the number of successful allocations without comparing it to the number of unsuccessful allocations).
It should be noted that the period of time under which a metric is measured can be a fixed window or a sliding window. Also, if multiple metrics are used, each metric can be judged under the same or different amount of time, and all or some can use a fixed window versus a sliding window. In other words, the size and fixed/sliding nature of the window can be the same for all metrics or vary for some or all metrics.
With the metric generated, the processor 110 then compares the metric with a threshold (act 420). There are many variations that can be used. For example, if there are multiple metrics, all of the metrics can be compared to the same threshold, or some or all of the metrics can be compared to different thresholds. Also, the metrics can be compared to the thresholds in a direct fashion, or some metrics can be given different weights. Further, a smoothing computation (e.g., computing an average of the metric over some time period or computing a moving average) can be performed on the metrics before comparing them to the threshold.
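The smoothing computation mentioned above can be sketched, for example, as an exponentially weighted moving average; the smoothing factor is an illustrative assumption, and this is one of several smoothing schemes (a plain windowed average would serve equally well).

```python
def ewma(samples, alpha=0.3):
    """Exponentially weighted moving average of raw metric samples.

    Applied before the threshold comparison so that a single noisy
    sample does not, by itself, trigger memory management; higher
    alpha weights recent samples more heavily than old ones.
    """
    smoothed = samples[0]
    for x in samples[1:]:
        smoothed = alpha * x + (1 - alpha) * smoothed
    return smoothed
```

For a steady metric the smoothed value tracks it exactly, while a one-sample spike is damped toward the running history rather than passed through to the threshold test.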
Also, the threshold can be predetermined (static) or dynamic. The threshold can be an absolute predefined threshold, or it can be a relative threshold (e.g., comparing the latest values of the metric to recent computations of the threshold). It is also possible to compute averages along different time periods and compare between them. For example, one embodiment may compute fast averages (e.g., where averaging of the metric is done at a high rate) and compare the results to a slow averaging process. Another embodiment can assign different weights to recent measurements versus old measurements. In another embodiment, the dynamic threshold is itself computed from one or more of the metrics over a period of time (either the same period of time as the metrics or a different period of time) (e.g., average over window N vs. average over window N−1).
For example, consider the metric defined above, f(M, F)=#M/#F, which computes the ratio of successful allocations to unsuccessful allocations of memory. According to one implementation, a fixed threshold can be computed, such that whenever the value of f(M, F) is below the threshold, action can be taken to free memory (e.g., by swapping some memory pages from the volatile memory 130 to the non-volatile memory 120). According to another implementation, f(M, F) can be computed on different sliding windows, where the computation on a specific set of sliding windows can serve as a threshold for more recent computations on a different window. Whenever the current value of f(M, F) is lower than the threshold, action can be taken to free memory. This example demonstrates dynamic computations with fixed and relative thresholds. In particular, in this example, the threshold can itself be computed in the same way as the metric it is compared to (though over a different time window).
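The relative-threshold variant, where one window of the metric serves as the threshold for another, can be sketched as a fast-versus-slow average comparison; the window sizes and the margin factor are illustrative assumptions.

```python
def moving_average(samples, window):
    """Average of the most recent `window` samples."""
    return sum(samples[-window:]) / min(window, len(samples))

def low_memory_predicted(ratio_samples, fast=5, slow=50, margin=0.8):
    """Relative (dynamic) threshold on f(M, F) samples.

    The slow average of the ratio acts as the threshold for the fast
    average: if the recent ratio has dropped below `margin` times its
    longer-term level, predict an upcoming low memory condition and
    trigger memory management early.
    """
    return moving_average(ratio_samples, fast) < margin * moving_average(ratio_samples, slow)

stable  = [2.0] * 50                  # ratio holding steady
sinking = [2.0] * 45 + [0.5] * 5      # recent sharp drop in the ratio
low_memory_predicted(stable)          # steady history -> no prediction
low_memory_predicted(sinking)         # drop vs. history -> predict pressure
```

Because the threshold tracks the device's own recent history, the same absolute ratio can be normal on one workload and alarming on another, which is the point of a relative threshold.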
As noted above, averaging any of the metrics can be done by a sliding window, by a jumping window, by arithmetic averaging, or by changing the window size per parameter over time or over a number of allocation operations. Any of the above metrics can be computed on a short window and compared to the computation of the metric over a larger window. Whenever there is a discrepancy between the metrics of the two windows by more than a predefined threshold, pre-emptive action can be taken to free more memory. This enables “adaptable thresholds” to detect dynamically significant changes in a metric compared to its latest history.
In another embodiment, a weight function is computed for each process in addition to the threshold function, where the weight function takes into account the adjustment parameter for the process and the amount of memory consumed by the process over time. For example, the weight function can be a product of the adjustment variable with the memory consumption. According to this embodiment, for example, the processor 110 can kill any application whose adjustment parameter is greater than F(free memory). This is similar to the decision in prior approaches, but, with this approach, the priorities for killing processes can be set according to their weight function, where “heavy” processes will be killed first.
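The weight-function ordering can be sketched as follows; the process fields, units, and cut-off value are illustrative assumptions.

```python
def kill_order(processes, adj_cutoff):
    """Order candidate victims by the weight function described above:
    the product of the adjustment parameter and current memory
    consumption. Among the killable (low-priority) processes, 'heavy'
    ones are killed first, freeing the most memory per kill.
    """
    candidates = [p for p in processes if p["adj"] > adj_cutoff]
    return sorted(candidates, key=lambda p: p["adj"] * p["mem_mb"], reverse=True)

procs = [{"name": "radio",        "adj": 3,  "mem_mb": 20},
         {"name": "photo_editor", "adj": 9,  "mem_mb": 300},
         {"name": "idle_widget",  "adj": 12, "mem_mb": 15}]
order = kill_order(procs, adj_cutoff=6)
# photo_editor (weight 9*300=2700) is killed before idle_widget (12*15=180),
# even though idle_widget has the lower priority (higher adjustment).
```

Note how the ordering differs from a pure adjustment-based policy: the heavier process is reclaimed first, which is the behavior this embodiment aims for.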
If the metric exceeds the threshold, the processor 110 creates free memory (act 430). In general, there is a need to create free space because of the limited size of the volatile memory 130. By way of background, the computer-readable program code and data fragments for an application can be copied from the non-volatile memory 120 and placed in the volatile memory 130 for execution. This avoids the performance delays associated with loading code or data fragments from non-volatile to volatile memory while the application is running. However, the size of volatile memory 130 is relatively small as compared to the size of the non-volatile memory, so there is a limit as to the number of applications whose code and data fragments can all be running from volatile memory 130.
When the metric exceeds the threshold and free space needs to be created in the volatile memory 130, many different types of techniques can be used. For example, as noted above, the processor 110 can decide to end (or “kill”) one or more applications currently running in the volatile memory 130 in order to provide volatile memory resources for a different (currently-running or newly-executed) application that has a higher priority. As another alternative, when the threshold is exceeded, the processor 110 can change the priority level that is used to determine when an application is to be terminated (or swapped or hibernated, as will be discussed below).
Some systems, such as mobile phones, may wish to avoid “killing” an open application because killing an application may result in slower performance. To avoid killing an application, the processor 110 can perform a “swapping” operation, in which data associated with a low-priority application is copied from the volatile memory 130 and stored in the non-volatile memory 120. So, when the metric exceeds the threshold and free space needs to be created in the volatile memory 130, the processor 110 can perform a swapping operation or even change the rate of swapping memory.
As an alternative to swapping, a hibernation process can be used. “Hibernation” refers to the process of stopping an application, moving all of the application's memory pages from volatile memory 130 to non-volatile memory 120, and storing at least some of the application's state parameters in the non-volatile memory 120, so the application can later be restored from the non-volatile memory 120 in the same state that the application was in before the application was placed in hibernation. A hibernated application can be removed from the operating system's scheduler queue (or tagged as non-runnable). Hibernating an application can be simpler and less expensive than killing an application.
Other ways of creating free space can be used. For example, in another embodiment, free memory in the volatile memory 130 is created by moving an application to a compressed memory zone. Also, the processor 110 can decide whether to kill or swap or hibernate an application by determining priorities for the applications and processes according to dynamic behavior of the application or process.
Irrespective of what technique is used to create the free space, by triggering the creation of free space on a predicted low memory condition rather than on a present low memory condition, these embodiments avoid the performance penalty associated with prior approaches that create free space in response to a request from an application for more memory. In this way, these embodiments provide for better memory management by proactively reacting to early predictions of low memory conditions and freeing memory in advance of the actual need for the memory. This approach is fundamentally different from previous approaches as it does not decide according to a snapshot of the current status of the system but according to a history record of the system. By processing different metrics and reacting accordingly, these embodiments enable both noise filtering and the ability to react to changing dynamics of the computing device 100 (e.g., by detecting a sudden change in the behavior of the device 100, such as a burst of memory allocation). This is advantageous over previous methods because previous methods do not protect from occasional deviations. Memory resource needs can be of a sinusoidal or otherwise varying nature (e.g., free now, busy after 10 seconds, free again after 20 seconds, etc.), and garbage collection frees buffers and swaps out pages of memory periodically due to memory condition. These embodiments can be used to equalize this situation. For example, if the sliding window is large enough, it can predict a real rising memory pressure situation and prevent false responses.
These embodiments can be implemented in any suitable manner in the computing device 100. For example, as discussed above, the processor 110 of the computing device 100 can execute an operating system kernel 310 as well as applications 300 and an application management layer 305 running in the user space. The operating system kernel 310 can be Linux or incompatible with Linux. Operating systems with a kernel incompatible with Linux include, but are not limited to, Windows operating systems (e.g., Windows NT and Windows 8) and Apple operating systems (e.g., iOS and Mac-OSx).
In one embodiment, the operating system kernel 310 generates the metric, compares the metric to the threshold, and creates the free memory in the volatile memory 130. In another embodiment, the user space triggers the generation of the metric by the operating system kernel 310, wherein the operating system kernel 310 returns the metric to the user space, and wherein the user space compares the metric to the threshold. In yet another embodiment, the operating system kernel 310 triggers itself to generate the metric, and the user space requests the metric from the operating system kernel 310 and compares the metric to the threshold. Further, the generating, comparing, and initiating acts can be performed by sending function calls from the application management layer 305 to the operating system kernel 310.
It is intended that the foregoing detailed description be understood as an illustration of selected forms that the invention can take and not as a definition of the invention. It is only the following claims, including all equivalents, that are intended to define the scope of the claimed invention. Finally, it should be noted that any aspect of any of the preferred embodiments described herein can be used alone or in combination with one another.
This application claims priority to U.S. provisional patent application No. 61/871,706, filed Aug. 29, 2013, which is hereby incorporated by reference.