1. Field
This invention relates to the field of data processing systems.
2. Description
It is known to provide data processing systems with cache memories in order to provide lower latency access to frequently used or critical data or instructions. One known type of cache memory is a write-back cache memory. Data may be written to a write-back cache memory without other versions of that data, such as held in the main memory, being updated until the data which has been written (the dirty data) is evicted from the write-back cache.
In accordance with at least some example embodiments of the disclosure, there is provided apparatus for processing data comprising:
a write-back cache having a plurality of cache lines;
processing circuitry to perform processing operations specified by program instructions; and
an instruction decoder to decode a load-and-clean instruction to generate control signals:
In accordance with at least some embodiments of the disclosure there is provided apparatus for processing data comprising:
write-back cache means for storing data, said write-back cache means having a plurality of cache lines;
processing means for performing processing operations specified by program instructions; and
instruction decoding means for decoding a load-and-clean instruction to generate control signals:
In accordance with at least some embodiments of the disclosure there is provided a method of processing data comprising:
storing data within a write-back cache having a plurality of cache lines;
performing processing operations specified by program instructions; and
decoding a load-and-clean instruction to generate control signals:
In accordance with at least some embodiments of the disclosure there is provided a method of compiling a source program to generate an object program comprising:
identifying a last use within said source program of a data value stored at a memory address;
if said source program specifies loading a target data value that is a last use of said target data value, then generating a corresponding load-and-clean instruction within said object program; and
if said source program specifies loading a target data value that is not a last use of said target data value, then generating a corresponding load instruction within said object program.
Example embodiments will now be described, by way of example only, with reference to the accompanying drawings in which:
It is possible that a programmer or compiler may identify that, when a load is being performed of a data value from a memory address, there will be no subsequent use of that data value in the program concerned. Examples of such situations include stack memories where data has been spilled to the stack memory upon a context change and is then POPed from the stack memory when the original context is resumed. In this case, the stack memory serves as temporary storage and once the data has been recovered, the data values stored within the memory address space which provided the temporary storage are no longer required. Another example would be use of a FIFO, circular buffer or other temporary buffer.
When a programmer or compiler has identified that a load will be the last one to be performed upon a data value at a given memory address location, then a load-and-clean instruction may be used for that final load operation in place of, for example, a standard load instruction. The load-and-clean instruction controls a write-back cache, which may be storing the data value to be loaded, to mark that data value as clean after the load-and-clean instruction has been executed such that it no longer needs to be written back to the backing memory system. This saves the memory bandwidth that would otherwise be consumed in writing back dirty data values which are no longer required and will not be used again. Marking the data as clean does not require the dirty data to be written out to the main memory as would be conventional with a full clean operation (write back and mark as clean).
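By way of illustration only, the difference between a standard load and a load-and-clean may be sketched with a toy software model. This is a minimal sketch under stated assumptions, not a description of any hardware: the class `WriteBackCache`, its one-entry-per-address organisation and the `writebacks` counter are hypothetical names introduced purely for the example.

```python
class WriteBackCache:
    """Toy write-back cache (hypothetical model): one entry per address."""

    def __init__(self, memory):
        self.memory = memory      # backing store: address -> value
        self.entries = {}         # address -> [value, dirty flag]
        self.writebacks = 0       # number of writes that reached backing memory

    def store(self, addr, value):
        # A store only updates the cache; the entry is dirty until written back.
        self.entries[addr] = [value, True]

    def load(self, addr):
        # Standard load: fill from backing memory on a miss, dirty flag untouched.
        if addr not in self.entries:
            self.entries[addr] = [self.memory[addr], False]
        return self.entries[addr][0]

    def load_and_clean(self, addr):
        # Return the value, then mark it clean WITHOUT writing it back.
        value = self.load(addr)
        self.entries[addr][1] = False
        return value

    def evict(self, addr):
        # Eviction writes back only entries that are still dirty.
        value, dirty = self.entries.pop(addr)
        if dirty:
            self.memory[addr] = value
            self.writebacks += 1


memory = {0x10: 0, 0x14: 0}
cache = WriteBackCache(memory)
cache.store(0x10, 7)
cache.store(0x14, 9)
first = cache.load(0x10)             # value may be needed again
last = cache.load_and_clean(0x14)    # final use of this value
cache.evict(0x10)
cache.evict(0x14)
```

In this model only the entry read with the standard load is written back on eviction; the entry read with the load-and-clean is dropped without consuming memory bandwidth, and the stale value left in backing memory is harmless because it is never read again.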
It is possible that the write-back cache for which the load-and-clean instruction suppresses unnecessary write-back of dirty values may use cache lines which comprise a plurality of portions each having a dirty flag indicative of whether a respective portion has been written with data that has not yet been written back to a memory. As an example, per-byte dirty flags may be provided within each cache line.
In one example embodiment the write-back cache may respond to a load-and-clean instruction to change a dirty flag for at least the target portion of the cache line from which a load is being performed such that if the dirty flag for that target portion is set to “dirty”, then it is changed to “clean”. It will be appreciated that the load-and-clean instruction may change a dirty flag for a target portion to clean or, if the flag for the target portion already indicates that it is clean, then this will be left unchanged. It is possible that the target portion may have been written while it was stored within the write-back cache, but has already been subject to a clean operation, such as by virtue of eviction from and then reloading into the write-back cache.
In some embodiments the load-and-clean instruction may change a dirty flag for the target portion which indicates that the target portion is dirty to indicate that the target portion is clean whilst leaving unchanged any dirty flags for other portions of the target cache line. This type of behavior is suited to embodiments in which the data being cached may correspond to a general purpose buffer in which there is no particular pattern to the accesses to different portions of a cache line.
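The target-portion-only behavior may be sketched as follows. This is an illustrative model only, assuming per-byte dirty flags held as a list of booleans; the function name `clean_target_only` and its parameters are hypothetical.

```python
def clean_target_only(dirty, offset, size):
    """Return per-byte dirty flags with only the loaded bytes marked clean.

    dirty  -- list of booleans, one per byte of the cache line
    offset -- byte offset of the target portion within the line
    size   -- number of bytes loaded
    """
    return [
        False if offset <= i < offset + size else flag
        for i, flag in enumerate(dirty)
    ]


line = [True, True, True, True, False, True, True, True]
after = clean_target_only(line, 2, 2)
```

Flags outside the target portion, whether dirty or already clean, are carried over unchanged.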
In other example embodiments, the data structure stored within the write-back cache may be one with a particular access pattern, such as a stack memory. In this case, it may be known that if a load-and-clean instruction is executed for a target portion within a cache line, then any other portions of that cache line within the region extending from the target portion to a predetermined end of the cache line (i.e. extending in a predetermined memory-address-order from the target portion) will also not be needed again and so can be marked as clean (any dirty flags set to clean, but without a write-back needing to be performed). The portions of the cache line extending in the opposite direction to the predetermined memory-address-order can have their dirty flags left unchanged.
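A sketch of this range-cleaning behavior is given below, again assuming per-byte dirty flags modelled as a list of booleans. Which end of the line is the predetermined end depends on the data structure concerned, so it is a parameter here; the function name `clean_towards_end` and the `"low"`/`"high"` labels are hypothetical.

```python
def clean_towards_end(dirty, offset, size, end="low"):
    """Mark clean every byte from the target portion to the chosen line end.

    end="low"  -- clean bytes 0 .. offset+size-1 (target and lower addresses)
    end="high" -- clean bytes offset .. last byte (target and higher addresses)
    Bytes on the other side of the target keep their existing flags.
    """
    out = list(dirty)
    if end == "low":
        start, stop = 0, offset + size
    else:
        start, stop = offset, len(dirty)
    for i in range(start, stop):
        out[i] = False
    return out


line = [True] * 8
popped_upwards = clean_towards_end(line, 4, 2, end="low")
popped_downwards = clean_towards_end(line, 4, 2, end="high")
```

For example, with a stack whose POP operations proceed through increasing addresses, everything at or below the popped portion is dead, which corresponds to the `"low"` case in this sketch.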
In other example embodiments, it may be that extending the marking of portions of a cache line as clean when these do not encompass the entire cache line is of reduced benefit, and accordingly operation may be simplified such that, if the target portion is at a predetermined end of the target cache line, then the dirty flags for all portions of that target cache line are changed as necessary to clean, whereas, if the target portion is not at the predetermined end of the target cache line, then the dirty flags for portions other than the target portion are left unchanged. Such an embodiment still uses individual dirty flags for the different portions of a cache line.
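The simplified behavior may be sketched as below, assuming per-byte dirty flags modelled as a list of booleans and assuming, for the example only, that a target portion is "at" an end when it includes the first or last byte of the line. The function name `clean_simplified` is hypothetical.

```python
def clean_simplified(dirty, offset, size, end="high"):
    """Clean the whole line if the target touches the chosen end,
    otherwise clean only the target portion."""
    n = len(dirty)
    at_end = (offset + size == n) if end == "high" else (offset == 0)
    if at_end:
        return [False] * n
    return [
        False if offset <= i < offset + size else flag
        for i, flag in enumerate(dirty)
    ]


whole_line = clean_simplified([True] * 8, 6, 2)   # target at the high end
target_only = clean_simplified([True] * 8, 2, 2)  # target in the middle
```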
In other embodiments, a plurality of portions of the target cache line may share a dirty flag and in some example embodiments a single dirty flag may be provided for a whole cache line. With such embodiments, the write-back cache may respond to a load-and-clean instruction if the target portion is at a predetermined end of the target cache line to change the dirty flag for the target cache line to indicate that the target cache line is clean and to suppress such action if the target portion is not at the predetermined end of the target cache line.
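Where a single dirty flag covers a whole cache line, the decision reduces to a single test, sketched below. The function name `clean_shared_flag`, the 64-byte line size used in the usage lines and the convention for what counts as "at" an end are assumptions made for illustration.

```python
def clean_shared_flag(line_dirty, offset, size, line_size, end="high"):
    """Return the new value of a per-line dirty flag after a load-and-clean.

    The flag is cleared only when the target portion is at the chosen end
    of the line; otherwise the existing flag is kept as it is.
    """
    at_end = (offset + size == line_size) if end == "high" else (offset == 0)
    return False if at_end else line_dirty


at_end = clean_shared_flag(True, 60, 4, 64)      # last word of a 64-byte line
not_at_end = clean_shared_flag(True, 8, 4, 64)   # a word in the middle
```

Suppressing the action for a target portion that is not at the predetermined end keeps the other, possibly still-needed, dirty data in the line protected by the shared flag.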
A feature of at least some example embodiments of the disclosure is that if a target portion for a load-and-clean instruction is marked as clean to avoid any subsequent unnecessary write-back, then the cache line containing that target portion remains in the write-back cache and so is available for further access operations, e.g. access operations to different portions of that cache line which are still required and still valid.
The write-back cache comprises cache line eviction circuitry which controls eviction of cache lines from the write-back cache, typically in accordance with one of many known eviction policies. This cache line eviction circuitry may also be responsive to execution of load-and-clean program instructions.
The manner in which the cache line eviction circuitry is responsive to execution of a load-and-clean program instruction can vary. In some example embodiments, when the execution of a load-and-clean program instruction results in all portions of the target cache line concerned being marked as clean, then this will serve to control the cache line eviction circuitry to promote that target cache line within an order for eviction, e.g. to make it the next eviction candidate. In other embodiments, the cache line eviction circuitry may respond to execution of a load-and-clean program instruction, as distinct from other forms of memory access instructions, to suppress updating of least-recently-used data associated with that target cache line such that the load-and-clean program instruction will not have an influence upon how the cache line is treated for eviction based upon its least-recently-used data.
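The two eviction responses described above may be sketched with a simple least-recently-used ordering. This is an illustrative software model only: the class `LruOrder`, the `policy` strings and the assumption that the target line is already resident are all hypothetical, as is the fallback of treating a load-and-clean like an ordinary access when neither response applies.

```python
from collections import OrderedDict


class LruOrder:
    """Eviction order over cache line tags; the first tag is the next victim."""

    def __init__(self):
        self.order = OrderedDict()

    def touch(self, tag):
        # Ordinary access: the line becomes the most recently used.
        self.order[tag] = True
        self.order.move_to_end(tag)

    def load_and_clean(self, tag, all_clean, policy):
        # Assumes the line identified by `tag` is already resident.
        if policy == "promote" and all_clean:
            # Whole line now clean: make it the next eviction candidate.
            self.order.move_to_end(tag, last=False)
        elif policy == "suppress":
            # Leave the least-recently-used position untouched.
            pass
        else:
            # Assumption: otherwise behave like an ordinary access.
            self.touch(tag)

    def next_victim(self):
        return next(iter(self.order))


promoting = LruOrder()
for tag in ("A", "B", "C"):
    promoting.touch(tag)
promoting.load_and_clean("C", all_clean=True, policy="promote")

suppressing = LruOrder()
for tag in ("A", "B", "C"):
    suppressing.touch(tag)
suppressing.load_and_clean("A", all_clean=False, policy="suppress")
```

With promotion, the fully clean line `"C"` moves to the front of the eviction order even though it was accessed most recently. With suppression, line `"A"` remains the oldest entry, whereas a standard load of `"A"` would have made it the most recently used.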
In some embodiments, in addition to having one or more dirty flags per cache line, a cache line may also include a valid flag indicative of whether that cache line contains any valid data.
While it will be appreciated that the techniques of the disclosure could be used in a wide variety of different forms of processing systems, they may find good utility in the context of processing circuitry which uses a FIFO and/or circular buffer for memory accesses and/or within a system using a graphics processing unit (which may typically have a predictable pattern of use of data held within a write-back cache such that the last use of that data can be identified and a load-and-clean instruction employed to suppress wasteful unnecessary write-back operations).
Another form of the present technique is the provision of a compiler to identify places within a program in which load-and-clean program instructions can usefully be employed. Compilers typically already track data value usage within program code for reasons other than those associated with write-back from cache memories. Given that a compiler can relatively readily identify the last use of a data value within a program, the compiler when generating a load instruction can determine if that load instruction is the last use of the data value concerned, and accordingly generate a load-and-clean instruction, while otherwise generating a “normal” load instruction if the load is not the last use of the data value concerned.
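The instruction selection performed by such a compiler may be sketched as a backward scan over a straight-line sequence of memory operations. This is a minimal sketch under simplifying assumptions (no branches, no aliasing, the whole program visible); the function name `select_loads`, the tuple representation of operations and the mnemonics `LOAD`, `LOAD_AND_CLEAN` and `STORE` are hypothetical.

```python
def select_loads(ops):
    """Choose an instruction for each operation in a straight-line program.

    ops -- list of ("load", addr) or ("store", addr) tuples in program order.
    A load is a last use when no later load reads the same address before
    that address is next overwritten (or before the program ends).
    """
    read_later = set()   # addresses whose current value is loaded again later
    selected = []
    for kind, addr in reversed(ops):
        if kind == "store":
            selected.append("STORE")
            # Loads earlier than this store read an older value.
            read_later.discard(addr)
        else:
            selected.append("LOAD" if addr in read_later else "LOAD_AND_CLEAN")
            read_later.add(addr)
    selected.reverse()
    return selected


program = [
    ("store", "a"),
    ("load", "a"),    # value is read again below
    ("load", "a"),    # last use of the first value
    ("store", "a"),
    ("load", "a"),    # last use of the second value
]
chosen = select_loads(program)
```

A production compiler would instead draw on its existing liveness or data-flow analysis across control flow; the backward scan above only illustrates the last-use decision itself.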
The processor 4 fetches instructions from the instruction cache 8 to an instruction pipeline 14. When the instructions reach a decode stage, then they are decoded by an instruction decoder 16 which generates control signals which control processing performed by a variety of processing pipelines 18, 20, 22 that include a load/store pipeline 24. The load/store pipeline 24 is responsible for handling memory access instructions including both load-and-clean instructions and standard load instructions.
The write-back data cache 6 includes a plurality of cache lines 26 which each store a plurality of portions of data, e.g. the write-back data cache 6 may support access granularity down to byte accesses and include per-byte dirty bits as well as a per-line valid bit. Eviction circuitry 28 within the write-back data cache serves to control cache line eviction using one of a variety of different eviction algorithms, such as least-recently-used, round-robin, random etc.
It will be appreciated that the examples of
It will be appreciated that the processing illustrated in
Although illustrative embodiments have been described in detail herein with reference to the accompanying drawings, it is to be understood that the claims are not limited to those precise embodiments, and that various changes, additions and modifications can be effected therein by one skilled in the art without departing from the scope of the appended claims. For example, various combinations of the features of the dependent claims could be made with the features of the independent claims.
Number | Date | Country | Kind |
---|---|---|---|
1422789.6 | Dec 2014 | GB | national |