Claims
- 1. A central processing unit of a computer, comprising:
- a data cache for storing data specified by an address request of a first instruction;
- a prefetch cache for generating a prefetch cache hit signal in response to said address request; and
- a prefetch engine for deriving, in response to said prefetch cache hit signal, a prefetch address which specifies data predicted to be requested in one or more instructions subsequent to said first instruction, wherein said prefetch engine includes:
- an adder to add a stride to a physical address of said data specified by said address request;
- first and second input ports coupled to simultaneously receive from said prefetch cache first and second physical addresses, respectively; and
- first and second output ports for providing first and second prefetch addresses to a memory external to said central processing unit, said first and second prefetch addresses derived from said first and second physical addresses, respectively.
- 2. The apparatus of claim 1, wherein said prefetch engine derives said first and second prefetch addresses by adding a stride to said first and second physical addresses, respectively.
- 3. The apparatus of claim 2, wherein said stride is a fixed value.
- 4. The apparatus of claim 2, wherein said stride is a variable, the value of which depends upon instruction loop heuristics of a computer program being executed by said central processing unit.
- 5. The apparatus of claim 1, wherein said prefetch engine derives said first prefetch address by adding a first stride to said first physical address, and derives said second prefetch address by adding a second stride to said second physical address, wherein said first stride is different from said second stride.
- 6. The apparatus of claim 5, wherein said first and second strides are fixed values.
- 7. The apparatus of claim 5, wherein said first and second strides are variables, the values of each depending upon instruction loop heuristics of a computer program being executed by said central processing unit.
- 8. A method for generating two prefetch addresses in response to two data requests from, respectively, a first instruction and a second instruction which are grouped, said grouped first and second instructions being executed by a central processing unit, said prefetch addresses identifying data which are to be prefetched into prefetch cache memory of said central processing unit and which are predicted to be requested in instructions subsequent to said grouped first and second instructions, said method comprising the steps of:
- extracting from said first and second instructions a first address and a second address, respectively, corresponding to said data requests;
- inputting said first address and said second address into a multi-port prefetch engine;
- adding a stride to said first address and said second address; and
- deriving, respectively, said first and second prefetch addresses from said first address and said second address.
- 9. The method of claim 8, wherein said adding step includes the step of adding a variable stride that depends upon loop heuristics.
- 10. A central processing unit of a computer, comprising:
- a data cache for storing data specified by two address requests of first and second instructions, respectively;
- a prefetch cache for generating prefetch cache hit signals in response to said address requests, said prefetch cache having first and second input ports coupled to substantially concurrently receive first and second data requests, respectively; and
- a multi-port prefetch engine for deriving, in response to each prefetch cache hit signal, a prefetch address which specifies data predicted to be requested in one or more instructions subsequent to said first and second instructions.
- 11. The central processing unit of claim 10, wherein said multi-port prefetch engine includes an adder to add a stride to a physical address of said data specified by said address request.
- 12. The apparatus of claim 11, wherein said stride is a fixed value.
- 13. The central processing unit of claim 11, wherein said stride is a variable, the value of which depends upon instruction loop heuristics of a computer program being executed by said central processing unit.
- 14. The central processing unit of claim 10, wherein said multi-port prefetch engine includes an input port coupled to an output port of said prefetch cache and an output port for providing said derived prefetch address to said external memory.
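For illustration only, the stride-based derivation recited in the claims above can be modeled in software. The following is a minimal sketch, not the claimed hardware: the names `PrefetchCache`, `prefetch_engine`, and `LINE_SIZE`, the 64-byte line size, and the use of `None` to mean "no prefetch issued on this port" are all assumptions introduced here. It models a prefetch cache that signals a hit for each of two concurrent address requests, and a two-port engine that adds a stride to each physical address that hit.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple

LINE_SIZE = 64  # assumed cache line size in bytes (hypothetical, not from the claims)


@dataclass
class PrefetchCache:
    """Toy model of a prefetch cache: a set of line-aligned physical addresses."""
    lines: Set[int] = field(default_factory=set)

    def hit(self, phys_addr: int) -> bool:
        """Return the 'prefetch cache hit signal' for one address request."""
        line_base = (phys_addr // LINE_SIZE) * LINE_SIZE
        return line_base in self.lines


def prefetch_engine(
    cache: PrefetchCache,
    addr_a: int,
    addr_b: int,
    stride_a: int = LINE_SIZE,
    stride_b: int = LINE_SIZE,
) -> Tuple[Optional[int], Optional[int]]:
    """Model of a two-port engine.

    Each input port receives one physical address. For each port whose
    request hits the prefetch cache, the engine adds that port's stride
    to the address and emits the sum on the matching output port.
    A port whose request misses emits None (an assumption of this sketch).
    """
    out_a = addr_a + stride_a if cache.hit(addr_a) else None
    out_b = addr_b + stride_b if cache.hit(addr_b) else None
    return out_a, out_b


if __name__ == "__main__":
    cache = PrefetchCache({0x1000, 0x2000})
    # Same fixed stride on both ports; the second request misses.
    print(prefetch_engine(cache, 0x1008, 0x3000))
    # Different strides per port; both requests hit.
    print(prefetch_engine(cache, 0x1000, 0x2010, stride_a=64, stride_b=128))
```

Passing the strides as parameters covers both the fixed-stride and the variable-stride variants: a caller could hold them constant or recompute them per request. How a variable stride would be obtained from loop heuristics is not modeled here.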
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is related to U.S. patent application Ser. No. 08/882,691, entitled "MICROPROCESSOR HAVING A PREFETCH CACHE" and bearing attorney docket No. A64703 WSG/WLP, to U.S. patent application Ser. No. 08/882,517, entitled "DATA LOAD HISTORY TRACKING CIRCUIT" and bearing attorney docket No. A64704 WSG/WLP, and to U.S. patent application Ser. No. 08/881,044, issued as U.S. Pat. No. 5,996,061, entitled "A METHOD FOR INVALIDATING DATA IDENTIFIED BY SOFTWARE COMPILER" and bearing attorney docket No. A64706 WSG/WLP, all filed on Jun. 25, 1997 and assigned to the assignee of the present invention.