This invention relates to a method of prefetching instructions into a microprocessor buffer under software control.
Cache memories have been widely used in microprocessors and microcontrollers (hereinafter referred to as processors) for faster data transfer between the processor and main memory. Low-end processors, however, do not employ a cache, for two main reasons: 1) the overhead of a cache implementation in terms of energy and area is comparatively large, and 2) because cache performance depends primarily on the number of hits, an increasing number of misses can cause the processor to remain stalled for longer durations, which in turn makes the cache a liability rather than an advantage. Based on these considerations, a method of buffering instructions using software-based prefetching is proposed which, with minimal logic and power overhead, could be employed in low-end processors to improve throughput. A preliminary search of the prior work in this field did not disclose any patents directly related to this invention, but the following could be considered related:
U.S. Pat. No. 5,838,945: In which an instruction and data prefetch method is described, where a prefetch instruction can control cache prefetching.
U.S. Pat. No. 4,713,755: In which a method of cache memory consistency control using software instructions is claimed.
U.S. Pat. No. 5,784,711: In which a method of data cache prefetching under the control of instruction cache is claimed.
U.S. Pat. No. 4,714,994: In which a method to control the instruction prefetch buffer array is claimed. The buffer could store the code for a number of instructions that have already been executed and those which are yet to be executed.
U.S. Pat. No. 4,775,927: In which a method and apparatus that enables an instruction prefetch buffer to distinguish between old prefetches that occurred before a branch and new prefetches which occurred after the branch in an instruction stream is claimed.
The major difference between the proposed buffer and typical cache systems is that its control is performed entirely by software. During the software design phase or code compilation, control words specifying the exact location of the instructions are placed one instruction ahead, so that during execution the instructions required in the next cycle can be fetched seamlessly.
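The placement of control words described above can be illustrated with a minimal sketch. It is not the tool of the invention; it assumes a toy program representation in which addresses are indices into the original instruction list, and the names Instr, ControlWord and insert_control_words are hypothetical. It also assumes that the control word placed before an instruction names the possible successors of that instruction, and that non-branch instructions use the False slot as their default.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class Instr:
    """Toy instruction (hypothetical format, not from the specification)."""
    op: str
    branch_target: Optional[int] = None  # index of the taken target, None if not a branch


@dataclass(frozen=True)
class ControlWord:
    """Names the instruction(s) to prefetch while the next instruction executes."""
    true_addr: Optional[int]   # successor if a branch is taken
    false_addr: Optional[int]  # successor if not taken (assumed default for non-branches)


def insert_control_words(program: List[Instr]) -> List[Union[ControlWord, Instr]]:
    """Place one control word ahead of every instruction.

    Addresses refer to indices in the ORIGINAL program; a real tool would
    also have to account for the address shift caused by the inserted words.
    """
    out: List[Union[ControlWord, Instr]] = []
    for i, ins in enumerate(program):
        # The fall-through successor exists unless this is the last instruction.
        fall_through = i + 1 if i + 1 < len(program) else None
        out.append(ControlWord(true_addr=ins.branch_target, false_addr=fall_through))
        out.append(ins)
    return out


if __name__ == "__main__":
    demo = [Instr("load"), Instr("beq", branch_target=3), Instr("add"), Instr("store")]
    for word in insert_control_words(demo):
        print(word)
```

In this sketch every instruction receives a control word, including non-branch instructions, so the fetch logic never has to compute a successor address at run time.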
Essential features of the invention include a processor with a cycle time greater than or equal to that of the associated data memory (i.e. the time to perform a memory read or memory write). For the instruction memory, only the memory read cycle time needs to be less than or equal to the processor cycle time.
An instruction memory capable of providing access to at least two locations in one cycle.
Addition of special control words (or instructions) before each instruction of the user code, so that the system knows in advance which data to fetch next.
Important (but not essential) features include a software tool or compiler to automatically generate and insert the control words into the code, and a software tool (or an extension of the tool mentioned above) to keep track of available data buffer space and insert control words to replace data that is no longer needed.
The proposed embodiment contains an instruction buffer area based on two buffers, where one buffer always serves as the default location if a branch is taken (called True) and the other if the branch is not taken (called False).
The instruction buffers need not have any source address associated with them, although each could have a single-bit tag to indicate whether it is the default True or False buffer.
The addresses of these two instructions are indicated beforehand by the preceding control words. The two instructions are then held at fixed locations in the instruction buffer, i.e. one location fixed for True and the other for False. If the branch is taken, the instruction from the True buffer location is executed; otherwise the instruction in the False buffer is executed. Either of the two buffers can be used as the default location, so that non-branch instructions are stored in and executed from there. This type of buffer would accelerate the performance of a processor which otherwise has to stall until the branch is resolved.
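The selection between the True and False buffer locations can be sketched as a small simulation. This is an illustrative model under stated assumptions, not the claimed hardware: the names Entry, fetch and run are hypothetical, the control word fields are folded into each memory entry for brevity, the False slot is assumed to be the default for non-branch instructions, and the branch outcome is supplied by a caller-provided function.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Entry:
    """An instruction together with the successor addresses from its control word."""
    op: str
    true_addr: Optional[int] = None   # prefetched into the True slot (branches only)
    false_addr: Optional[int] = None  # prefetched into the False slot (assumed default)


Slot = Optional[Tuple[int, Entry]]


def fetch(memory: Dict[int, Entry], addr: Optional[int]) -> Slot:
    """Read one instruction memory location; None models an empty slot."""
    return None if addr is None else (addr, memory[addr])


def run(memory: Dict[int, Entry], start: int,
        branch_taken: Callable[[int], bool], max_steps: int = 100) -> List[str]:
    """Execute from `start`, always taking the next instruction from a buffer slot."""
    trace: List[str] = []
    current = fetch(memory, start)
    steps = 0
    while current is not None and steps < max_steps:
        addr, entry = current
        # Both possible successors are read in the same cycle (two-location access).
        true_slot = fetch(memory, entry.true_addr)
        false_slot = fetch(memory, entry.false_addr)
        trace.append(entry.op)
        # Once the branch resolves, the matching slot is already filled: no stall.
        is_branch = entry.true_addr is not None
        current = true_slot if (is_branch and branch_taken(addr)) else false_slot
        steps += 1
    return trace


if __name__ == "__main__":
    demo = {
        0: Entry("load", None, 1),
        1: Entry("beq", 3, 2),
        2: Entry("add", None, 3),
        3: Entry("store", None, None),
    }
    print(run(demo, 0, lambda addr: True))
    print(run(demo, 0, lambda addr: False))
```

Because both successors are already held in fixed slots when the branch outcome becomes known, the model never waits on instruction memory after a branch, which is the behaviour the two-buffer arrangement is intended to provide.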