Portable computing devices (“PCDs”) are becoming necessities for people on personal and professional levels. These devices may include cellular telephones, portable digital assistants (“PDAs”), portable game consoles, palmtop computers, and other portable electronic devices.
One aspect of PCDs that is in common with most computing devices is the use of electronic memory components for storing instructions and/or data. Various types of memory components may exist in a PCD, each designated for different purposes and each with its own set of advantages and disadvantages. Volatile rapid access memory (“RAM”), for example, is in high demand by PCD designers to be used as a primary memory component because it is extremely fast in responding to data requests from processing components. But, RAM is expensive and sometimes impractical to scale due to limited size options. By contrast, read only memory (“ROM”) is inexpensive, but is relatively slower in its response time and less practical for managing data updates due to the original data image being permanently “burned” into the unchangeable, non-volatile ROM at the time of manufacture.
Often, a baseline data image is stored in ROM and subsequent updates or revisions to the data image are managed in RAM. As one of ordinary skill in the art understands, managing in RAM software updates to data instantiated in ROM can take up a lot of valuable RAM capacity that could be used for other purposes. Therefore, there is a need in the art for a system and method that optimizes RAM space utilization for software updates. More specifically, there is a need in the art for a system and method that optimizes RAM capacity through the use of ROM-based paging.
Various embodiments of methods and systems for memory paging in a system on a chip (“SoC”) are disclosed. An exemplary method includes identifying a subset of a baseline data image stored in a secondary storage device, such as a read only memory (“ROM”) device, and determining that a revision data image requires an update of the subset. In response to the update, generating a diff file that represents binary differences between the revision data image subset and the baseline data image subset. Next, storing the diff file in a primary storage device, such as a double data rate (“DDR”) dynamic random access memory device. Subsequently, upon receiving a request for a data block associated with the revision data image that causes a page fault, generating the requested data block based on a combination of the baseline data image and the diff file. Placing the generated requested data block in either a swap pool area designated in the primary storage device or in a cache closely coupled to a processing component that initiated the request. Finally, responding to the request by providing the generated requested data block to the processing component.
In the drawings, like reference numerals refer to like parts throughout the various views unless otherwise indicated. For reference numerals with letter character designations such as “102A” or “102B”, the letter character designations may differentiate two like parts or elements present in the same figure. Letter character designations for reference numerals may be omitted when it is intended that a reference numeral to encompass all parts having the same reference numeral in all figures.
The word “exemplary” is used herein to mean serving as an example, instance, or illustration. Any aspect described herein as “exemplary” is not necessarily to be construed as exclusive, preferred or advantageous over other aspects.
In this description, the term “application” may also include files having executable content, such as: object code, scripts, byte code, markup language files, and patches. In addition, an “application” referred to herein, may also include files that are not executable in nature, such as documents that may need to be opened or other data files that need to be accessed.
In this description, reference to double data rate (“DDR”) memory components and/or dynamic random access memory (“DRAM”) and/or “primary” memory components will be understood to envision any of a broader class of volatile random access memory (“RAM”) used for data storage and will not limit the scope of the solutions disclosed herein to a specific type or generation of RAM.
In this description, reference to “external memory device” and/or “ROM” and/or “secondary” memory and the like refers to a broader class of non-volatile (i.e., retains its data after power is removed), one-time programmable and nonreversible memory for storing executable code and/or data and will not limit the scope of the solutions disclosed. As such, it will be understood that use of the terms envisions any programmable read-only memory or field programmable non-volatile memory suitable for a given application of a solution such as, but not limited to, embedded multimedia card (“eMMC”) memory, electrically erasable programmable read-only memory (“EEPROM”), flash memory, etc.
In this description, the term “page fault” generally refers to an event where a processing component executing a program requests a data page not presently stored in primary storage. The invalid memory reference may cause the memory controller and/or page fault handler to determine the location of the requested page data in secondary storage, identify an unused memory space in the primary storage, retrieve the requested page from secondary storage and save a copy into the identified space in the primary storage, update a page table associated with the primary storage, and return instructions to the processing component to make a new request for the page data at the new location of the copy.
As used in this description, the terms “component,” “database,” “module,” “system,” and the like are intended to refer to a computer-related entity, either hardware, firmware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computing device and the computing device may be a component. One or more components may reside within a process and/or thread of execution, and a component may be localized on one computer and/or distributed between two or more computers. In addition, these components may execute from various computer readable media having various data structures stored thereon. The components may communicate by way of local and/or remote processes such as in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system, distributed system, and/or across a network such as the Internet with other systems by way of the signal).
In this description, the terms “central processing unit (“CPU”),” “digital signal processor (“DSP”),” “graphical processing unit (“GPU”),” and “chip” are used interchangeably. Moreover, a CPU, DSP, GPU or a chip may be comprised of one or more distinct processing components generally referred to herein as “core(s).”
In this description, the term “portable computing device” (“PCD”) is used to describe any device operating on a limited capacity power supply, such as a battery. Although battery operated PCDs have been in use for decades, technological advances in rechargeable batteries coupled with the advent of third generation (“3G”) and fourth generation (“4G”) wireless technology have enabled numerous PCDs with multiple capabilities. Therefore, a PCD may be a cellular telephone, a satellite telephone, a pager, a PDA, a smartphone, a navigation device, an “ebook” or reader, a media player, a handheld game console, a combination of the aforementioned devices, a laptop computer with a wireless connection, among others.
ROM-based paging systems and methods according to the solution work to manage capacity in a primary storage component, such as a double data rate (“DDR”) memory device, by only storing new data from a software revision in the DDR. Unlike prior art solutions that store the entire image for a software update in the primary storage, embodiments of the present solution only store what “has changed” in a candidate portion of an original software instantiation, i.e. embodiments of the present solution store in the primary storage a “delta” or “diff” image of the data that has changed in a candidate portion of a baseline data image due to a software upgrade.
As such, because the entire candidate portion is not stored in the primary storage device (only the delta of data between the original candidate and the upgraded candidate), a page request on the candidate portion of the software from a processing component will generate a page fault, as would be understood by one of ordinary skill in the art. That is, a request for a page of data from the candidate portion that the processing component expects to be located in the primary storage results in a page fault when a memory manager module (comprising a memory controller) determines that the requested data is not presently stored in the primary storage. Embodiments of the solution respond by acquiring the requested page from a permanent instantiation in the secondary storage, such as a read only memory (“ROM”), and updating it with data from the delta file stored in the primary storage. The new updated file is then placed in a swap pool area of the primary storage so that the request from the processing component may be serviced from that swap pool location. In this way, embodiments of the solution optimize the use of valuable primary storage space by avoiding the need to store entire images of updated candidate portions of a software file—only the delta file, which is necessarily smaller than the entire updated candidate portion, is stored in the primary storage.
As one of ordinary skill in the art would acknowledge, primary storage such as DRAM is valuable to PCD designers as it is in high demand for providing various functionalities to a PCD user. Simply adding more DRAM is not always a viable choice for satisfying the demand, as DRAM can be cost prohibitive. As such, when an application grows in size due to software updates or other reasons, it may be desirable to leverage more cost effective storage solutions such as secondary storage devices in the form of ROM.
Thus, embodiments of the solution seek to leverage ROM as a way to address the issue of a growing image size for a given application. Again, prior art solutions simply accommodate a growing image size by storing the ever changing and additional data in RAM, but embodiments of the present solution avoid the need to allocate large amounts of RAM for such purposes. Embodiments of the solution use ROM that may already exist in a system or, it is envisioned, may leverage a relatively small piece of on-chip ROM added for the purpose of accommodating a growing image size. In this way, embodiments of the solution avoid the need to add ever larger sizes of expensive DRAM.
Advantageously, embodiments of the solution efficiently patch data stored in ROM without having to use a lookup table in RAM to determine whether the requested data is located in RAM or ROM. To accomplish this, embodiments work from an original, baseline instantiation in the ROM and, when an updated instantiation of the data is required, a difference (a binary difference) between the updated instantiation and the baseline instantiation is determined. The diff data, or “delta” file, that represents the binary difference as determined at build time (i.e., prior to runtime) is stored in the DRAM primary storage.
The requested data is paged from the ROM and it is determined whether the data has been updated from the previous version. If it has been updated, then the diff file is located in the DRAM and the requested page is generated as a patched page on demand from the diff file in combination with the baseline data stored in ROM. The patched page may be stored in a processor cache or swap pool location in DRAM and executed from that location.
Advantageously, the amount of DRAM needed for a given software update is relatively small because only enough DRAM to accommodate the binary difference between the baseline image and the updated image is required. By contrast, in prior art patching methods, patching occurs at the variable/function level so that even if a single byte of data has changed from the baseline instantiation, the whole function gets patched. Embodiments of the present solution may be operable to patch at the binary level, accommodating software updates at the page level or cache line level.
Certain embodiments of the solution may be viewed as an extension or improvement of traditional demand paging methodologies. As one of ordinary skill in the art of prior art demand paging would understand, when a page fault occurs in the DRAM, the requested page is acquired from a secondary storage location and copied into a selected location in DRAM. The selected location is then mapped into a lookup table in DRAM and the page is executed from the selected location. Embodiments of the solution accomplish the paging by acquiring the baseline version of the requested page from the secondary storage location and then updating the page based on diff data stored in the DRAM to create a patched page. The patched page is then stored in a swap pool location in DRAM, or in a cache of the requesting processing component, and the request is filled from that point. Once the patched page generated according to an embodiment of the solution is stored in the DRAM or cache, traditional demand paging algorithms may be leveraged to flush old pages, overwrite pages, etc.
The scope of the solution disclosed herein is not limited to using any one algorithm for generation of the diff/delta file. Even so, certain algorithms for determining the binary difference between a baseline image and an updated image may be novel in, and of, themselves. It is envisioned that certain embodiments of a ROM-based paging system and method according the proposed solution may utilize a diff or patch algorithm that represents the image diff data in the form of an edit script. Consistent with that which is described above and below, the edit script may be used with a baseline image to reconstruct a revision software image.
To improve the runtime image reconstruction performance, an edit script may be organized at the page boundary, i.e. on fixed size blocks or pages. Each requested page may be reconstructed independently using the edit information from the edit script that includes edit commands. The edit commands define a set of reconstruction operations that provide for combining the baseline image with the diff file data to generate a patched or updated page being requested by a processing component. Edit scripts may contain metadata and a list of pages that contain the various edit commands.
As illustrated below, it is envisioned that a diff algorithm leveraging edit scripts may use four types of edit commands: copy, small copy, insert and partial copy.
The “copy” and “small copy” commands may be used to copy a block of data from the baseline image located at a certain offset within the baseline image, with the “small copy” command using a relatively smaller amount of bits to encode the data block size. The “copy” and “small copy” commands may be applicable when the block of data in the baseline image and the associated block of data in the revised software image (i.e., the updated or modified image) are exactly the same or very similar in contents.
The “insert” command may be used to insert a block of data into the baseline image to generate the patched page. The data that needs to be inserted may be embedded in the edit script.
The “partial copy” command may be used on top of a copy or small copy command to “fix” a word in the previously copied data. A partial copy command may be used when the block of data is similar between the baseline image and the revision image and the block may be copied to fix some portion of it required due to address changes. The sub-CMD in a partial copy edit script may be used to identify the location of the small change coming from a global table or local cache, depending on embodiment. The address change may be used to implement changes in functions having numerous common offsets (a global table may be used to store the offsets and a most recently used offset table may be cached when a page is reconstructed according to an embodiment of the solution).
To generate the edit scripts, the particular part of the revision image of concern may be divided into pages. Subsequently, for each word in a page its symbol may be determined along with the same symbol in the baseline image. If there is no equivalent symbol in the baseline image, an insert command may be created to represent the data block for the new symbol. If there is an equivalent symbol in the baseline image, and the symbol length is the same as that which is in the revision image, a copy command may be created to copy the whole symbol with partial copy commands created on top by scanning through the page, word by word. If there is an equivalent symbol in the baseline image, but the length of the symbol is not the same as that which appears in the revision image, one word from the revision image may be selected from the page and used to find a longest matched block in the baseline symbol (may be exact match or fuzzy match). Subsequently, a copy command and multiple partial copy commands may be created for the match block.
In general, the memory manager 101 may be formed from hardware and/or software and may be responsible for generating diff files, servicing page faults and satisfying page requests according to embodiments of ROM-based paging solutions. Advantageously, embodiments of the solution may provide for optimized RAM utilization by only storing the diff data that represents a difference between a baseline image stored in ROM and a revised image associated with a software update. Using the diff data, a memory manager 101 may respond to a page fault by retrieving a baseline image data from ROM and modifying it based on diff data stored in RAM. The resulting modified image may be stored in RAM and used to satisfy the page request from that point.
As illustrated in
A subscriber identity module (“SIM”) card 146 may also be coupled to the CPU 110. Further, as shown in
As further illustrated in
The CPU 110 may also be coupled to one or more internal, on-chip thermal sensors 157A as well as one or more external, off-chip thermal sensors 157B. The on-chip thermal sensors 157A may comprise one or more proportional to absolute temperature (“PTAT”) temperature sensors that are based on vertical PNP structure and are usually dedicated to complementary metal oxide semiconductor (“CMOS”) very large-scale integration (“VLSI”) circuits. The off-chip thermal sensors 157B may comprise one or more thermistors. The thermal sensors 157 may produce a voltage drop that is converted to digital signals with an analog-to-digital converter (“ADC”) controller 103. However, other types of thermal sensors 157 may be employed.
The touch screen display 132, the video port 138, the USB port 142, the camera 148, the first stereo speaker 154, the second stereo speaker 156, the microphone 160, the FM antenna 164, the stereo headphones 166, the RF switch 170, the RF antenna 172, the keypad 174, the mono headset 176, the vibrator 178, thermal sensors 157B, the PMIC 180 and the power supply 188 are external to the on-chip system 102. It will be understood, however, that one or more of these devices depicted as external to the on-chip system 102 in the exemplary embodiment of a PCD 100 in
In a particular aspect, one or more of the method steps described herein may be implemented by executable instructions and parameters stored in the memory 112 or as form the memory manager 101 and/or its page fault handler. Further, the memory manager 101, the memory 112, the instructions stored therein, or a combination thereof may serve as a means for performing one or more of the method steps described herein.
Advantageously, and as can be inferred from the
At block 620, a request for a page associated with the revised image may be received from a processing component, such as a CPU core. At decision block 625, if the requested page is stored in the primary storage, then the “NO” path is followed to block 630 and the page request is filled from the primary storage location. If, however, the requested page is not in primary storage, the “YES” path is followed to block 635.
At block 635, the requested page may be generated by retrieving the page from the baseline image in the secondary storage device and modifying it according to data stored in the diff file located in the primary storage device. At block 640, the newly generated requested page is stored in a swap pool area of the primary storage device. At block 645, the page request from the processing component is satisfied from the swap pool storage location. The method 600 returns.
At block 720, a request for a page associated with the revised image may be received from a processing component, such as a CPU core. At decision block 725, if the requested page is stored in the primary storage, then the “NO” path is followed to block 730 and the page request is filled from the primary storage location. If, however, the requested page is not in primary storage, the “YES” path is followed to block 735.
At block 735, the requested page may be generated on a cache line level by retrieving the page from the baseline image in the secondary storage device and modifying it according to data stored in the diff file located in the primary storage device. At block 740, the newly generated requested page is stored in a cache associated with the processing component. At block 745, the page request from the processing component is satisfied on a cache line level from the cache. The method 700 returns.
Certain steps in the processes or process flows described in this specification naturally precede others for the invention to function as described. However, the invention is not limited to the order of the steps described if such order or sequence does not alter the functionality of the invention. That is, it is recognized that some steps may performed before, after, or parallel (substantially simultaneously with) other steps without departing from the scope and spirit of the invention. In some instances, certain steps may be omitted or not performed without departing from the invention. Further, words such as “thereafter”, “then”, “next”, etc. are not intended to limit the order of the steps. These words are simply used to guide the reader through the description of the exemplary method.
Additionally, one of ordinary skill in programming is able to write computer code or identify appropriate hardware and/or circuits to implement the disclosed invention without difficulty based on the flow charts and associated description in this specification, for example. Therefore, disclosure of a particular set of program code instructions or detailed hardware devices is not considered necessary for an adequate understanding of how to make and use the invention. The inventive functionality of the claimed computer implemented processes is explained in more detail in the above description and in conjunction with the drawings, which may illustrate various process flows.
In one or more exemplary aspects, the functions described may be implemented in hardware, software, or any combination thereof. If implemented in software, the functions may be stored on or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage media may be any available media that may be accessed by a computer. By way of example, and not limitation, such computer-readable media may comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that may be used to carry or store desired program code in the form of instructions or data structures and that may be accessed by a computer.
Therefore, although selected aspects have been illustrated and described in detail, it will be understood that various substitutions and alterations may be made therein without departing from the spirit and scope of the present invention, as defined by the following claims.