The present invention relates in general to a data processing system comprising cache storage, and more specifically relates to dynamic partitioning of the cache storage for application tasks in a multiprocessor.
Cache partitioning is a well-known technique in multi-tasking systems for achieving more predictable cache performance by reducing resource interference. In a data processing system comprising multiple processors, the cache storage is shared between multiple processes or application tasks, and is partitioned into different sections for different application tasks. In a multiprocessing system with a large number of application tasks, cache partitioning may result in small partitions per application task, as the total cache size is limited. This causes performance deterioration, as an application task cannot accommodate its working set in the allotted cache partition, which leads to more cache misses. It can therefore be advantageous to partition the cache into sections, where each section is allocated to a respective class of processes, rather than having the processes share the entire cache storage.
US Patent application 2002/0002657 A1 by Henk Muller et al. discloses a method of operating a cache memory in a system in which a processor is capable of executing a plurality of processes. Such techniques partition the cache into many small partitions instead of using one monolithic data cache in which accesses to different data objects may interfere. In such cases the compiler is typically aware of the cache architecture and allocates the cache partitions to the application tasks.
Future multiprocessor systems will be very complex and will contain a large number of application tasks. Cache partitioning will therefore result in small partitions per task, which will degrade performance.
It is, inter alia, an object of the invention to provide a system and method for improved dynamic cache partitioning in multiprocessors. The invention is defined by the independent claims. Advantageous embodiments are defined in the dependent claims.
The invention is based on the recognition that the prior art techniques do not exploit the pattern of execution of the application tasks. For instance, the execution behavior of multimedia applications often follows a periodic pattern: such applications include application tasks that are scheduled periodically and follow a pattern of execution. By exploiting this behavior, more efficient cache partitioning is possible.
A cache partitioning technique for application tasks in multiprocessors, based on scheduling information, is provided. Cache partitioning is performed dynamically based on information about the pattern of task scheduling provided by the task scheduler. The execution behavior of the application tasks is obtained from the task scheduler, and partitions are allocated only to a subset of application tasks that are going to be executed in the upcoming clock cycles. The present invention improves cache utilization by avoiding unnecessary reservation of cache partitions for executing application tasks during the entire duration of their execution; an effective utilization of the cache is thus achieved.
In an example embodiment of the present invention, a method for dynamically partitioning a cache memory in a multiprocessor for a set of application tasks is provided. The method includes the steps of storing a scheduling pattern of the set of application tasks in a storage, selecting a subset of application tasks from the set of application tasks and updating a cache controller logic with the subset, where the subset comprises those application tasks that will be executed in the upcoming clock cycles, and allocating cache partitions dynamically to the subset of application tasks updated in the cache controller logic. A task scheduler stores the scheduling pattern of the application tasks in a look-up table (LUT). The selected subset of application tasks is updated in a partition control register, and a dynamic partition unit allocates cache partitions dynamically to the subset of application tasks stored in the partition control register.
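The method steps can be illustrated with a small software sketch. All identifiers below (`select_subset`, `allocate_partitions`, the window size, the even division of cache ways) are illustrative assumptions; the patent does not prescribe an implementation.

```python
# Sketch of the claimed method: the scheduling pattern is stored as a
# look-up table (one entry per scheduling instance); a subset of tasks
# due to run in the upcoming window is selected, and cache partitions
# are allocated only to that subset.  All names are illustrative.

def select_subset(schedule, start, window):
    """Return the task IDs scheduled in the next `window` instances."""
    upcoming = set()
    for k in range(window):
        upcoming.update(schedule[(start + k) % len(schedule)])
    return upcoming

def allocate_partitions(subset, total_ways):
    """Divide the cache ways evenly among the selected tasks (assumed policy)."""
    per_task = total_ways // len(subset)
    return {task: per_task for task in sorted(subset)}

# Periodic pattern: T1 and T2 run every instance; T3 only at the 7th.
schedule = [("T1", "T2")] * 6 + [("T1", "T2", "T3")]

subset = select_subset(schedule, start=0, window=3)
parts = allocate_partitions(subset, total_ways=8)
```

Because only `{T1, T2}` is selected in the first window, each receives 4 of the 8 ways instead of the 2 or 3 a static three-way split would allow.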
In another example embodiment of the present invention, a system is provided for dynamically partitioning a cache memory in a multiprocessor for a set of application tasks. The system includes a task scheduler for storing a scheduling pattern of the set of application tasks, and cache controller logic for selecting and updating a subset of application tasks from the set of application tasks, and for allocating cache partitions dynamically to the subset of application tasks updated in the cache controller logic. The cache controller logic includes a partition control register for updating the subset of application tasks, and a dynamic partition unit for allocating cache partitions dynamically to the subset of application tasks.
The above summary of the present invention is not intended to represent each disclosed embodiment, or every aspect, of the present invention. Other aspects and example embodiments are provided in the figures and the detailed description that follows.
The present invention proposes a cache partitioning technique based on the patterns of execution of the application tasks. Partitions are allocated only to a subset of application tasks that are going to be executed in the upcoming clock cycles. Since only a subset of application tasks is considered for cache partitioning, larger partitions can be allotted to those application tasks.
At step 115, a partition control register is updated with the subset of application tasks selected in step 110. The partition control register may be implemented as a memory-mapped input/output (MMIO) register. The task scheduler updates the partition control register with the selected subset of application tasks. At step 120, a dynamic partitioning unit allocates cache partitions dynamically to the subset of application tasks updated in the partition control register.
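Steps 115 and 120 can be modeled in software. The register is sketched as a bitmask with one bit per task ID, roughly as an MMIO register might hold it; this field layout and the even way division are assumptions made for illustration, not the patent's specification.

```python
# Model of the partition control register (step 115) and the dynamic
# partitioning unit (step 120).  Bit layout and allocation policy are
# illustrative assumptions.

class PartitionControlRegister:
    def __init__(self):
        self.value = 0                      # one bit per task ID

    def update(self, task_ids):
        """Written by the task scheduler with the selected subset."""
        self.value = 0
        for tid in task_ids:
            self.value |= 1 << tid

    def registered(self, num_tasks):
        """Read side: task IDs whose bits are set."""
        return [t for t in range(num_tasks) if self.value & (1 << t)]

def dynamic_partition(register, num_tasks, total_ways):
    """Allocate cache ways only to tasks whose IDs are in the register."""
    active = register.registered(num_tasks)
    per_task = total_ways // len(active)
    return {t: per_task for t in active}

reg = PartitionControlRegister()
reg.update([1, 2])                          # scheduler selects tasks 1 and 2
alloc = dynamic_partition(reg, num_tasks=8, total_ways=8)
```

Tasks not named in the register receive no partition, which is what frees cache space for the tasks that will actually run next.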
Such a data processing system 200 may be implemented as a system-on-chip (SoC). The data processing system 200 explained above is particularly applicable to multi-tasking streaming applications, for example in audio and video applications.
In this case the suitable subset is (T1, T2), as task T3 occurs only at scheduling instance 7, and hence the partition for task T3 can be allocated at a later time (i.e. by schedule instance 7). As the entire cache is partitioned between T1 and T2 for most of the execution time (schedule instances 1-6), a more efficient cache partitioning is achieved. At schedule instance 7, a subset of the cache partition occupied by either T1 or T2 can be evicted (according to a cache replacement policy such as least recently used (LRU)) to accommodate the partition for T3.
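This example can be traced with a short sketch. The way-granular partitions, the initial 4/4 split between T1 and T2, and the LRU bookkeeping are all assumptions made for illustration.

```python
# Trace of the example: T1 and T2 share the whole cache for schedule
# instances 1-6; at instance 7 the least recently used ways are evicted
# to make room for T3.  Policy details are illustrative assumptions.

from collections import OrderedDict

# 8 cache ways, split evenly between T1 and T2.  The OrderedDict keeps
# use order (oldest first) so LRU victims can be found cheaply.
ways = OrderedDict((w, "T1" if w < 4 else "T2") for w in range(8))

def touch(way):
    ways.move_to_end(way)          # mark as most recently used

def admit(task, n_ways):
    """Evict the n least recently used ways and hand them to `task`."""
    for victim in list(ways)[:n_ways]:
        ways[victim] = task
        touch(victim)

# Instances 1-6: both tasks run; here T2's ways happen to be touched
# before T1's, so T2 holds the least recently used ways afterwards.
for w in (4, 5, 6, 7, 0, 1, 2, 3):
    touch(w)

admit("T3", 2)                     # instance 7: T3 takes the 2 LRU ways
```

After admission, T3 occupies the two LRU ways (taken from T2 in this trace) while T1 keeps its full partition, matching the behavior described above.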
The task scheduler 205 stores the task schedule pattern in the form of LUT 410. The cache controller logic 215 includes a partition control register 425 and a dynamic partition unit 420. The partition control register 425 is updated by the task scheduler 205 and contains information regarding which application tasks are going to be executed in the upcoming clock cycles. This information includes the task IDs of the application tasks, i.e. according to the example in
The dynamic partition unit 420 reads the information from the partition control register 425 and allocates partitions only to those application tasks whose IDs are registered in the partition control register 425. In this way only a subset of application tasks is selected for allocating cache partitions, and the available cache space is thereby utilized effectively across the application tasks.
The present invention will find industrial application in systems-on-chip (SoC) for audio, video and mobile applications. It will improve cache utilization by avoiding unnecessary reservation of cache partitions for executing application tasks during the entire duration of their execution, so that an effective utilization of the cache storage is achieved.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word “comprising” does not exclude the presence of elements or steps other than those listed in a claim. The word “a” or “an” preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and/or by means of a suitably programmed processor. In the device claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
Number | Date | Country | Kind |
---|---|---|---|
06111711.5 | Mar 2006 | EP | regional |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB07/53404 | 9/20/2006 | WO | 00 | 9/17/2008 |