Claims
- 1. A data processor comprising:
an instruction execution pipeline comprising N processing stages, each of said N processing stages capable of performing one of a plurality of execution steps associated with a pending instruction being executed by said instruction execution pipeline;
a data cache capable of storing data values used by said pending instruction;
a plurality of registers capable of receiving said data values from said data cache;
a load store unit capable of transferring a first one of said data values from said data cache to a target one of said plurality of registers during execution of a load operation;
a shifter circuit associated with said load store unit capable of one of a) shifting, b) sign extending, and c) zero extending said first data value prior to loading said first data value into said target register; and
bypass circuitry associated with said load store unit capable of transferring said first data value from said data cache directly to said target register without processing said first data value in said shifter circuit.
- 2. The data processor as set forth in claim 1 wherein said bypass circuitry transfers said first data value from said data cache directly to said target register during a load word operation.
- 3. The data processor as set forth in claim 2 wherein said bypass circuitry transfers said first data value from said data cache directly to said target register at the end of two machine cycles.
- 4. The data processor as set forth in claim 1 wherein said shifter circuit performs one of a) shifting, b) sign extending, and c) zero extending said first data value prior to loading said first data value into said target register during a load half-word operation.
- 5. The data processor as set forth in claim 4 wherein said shifter circuit loads said shifted first data value into said target register at the end of three machine cycles.
- 6. The data processor as set forth in claim 1 wherein said shifter circuit performs one of a) shifting, b) sign extending, and c) zero extending said first data value prior to loading said first data value into said target register during a load byte operation.
- 7. The data processor as set forth in claim 6 wherein said shifter circuit loads said shifted first data value into said target register at the end of three machine cycles.
- 8. The data processor as set forth in claim 1 wherein said bypass circuitry comprises a multiplexer having a first input channel coupled to a data output of said data cache.
- 9. The data processor as set forth in claim 8 wherein said multiplexer has a second input channel coupled to an output of said shifter circuit.
- 10. For use in a processor comprising an N-stage execution pipeline, a data cache, and a plurality of registers, a method of loading a first data value from the data cache into a target one of the registers, the method comprising the steps of:
determining if a pending instruction in the execution pipeline is one of a load word operation, a load half-word operation, and a load byte operation;
in response to a determination that the pending instruction is a load half-word operation, transferring the first data value from the data cache to a shifter circuit and shifting the first data value prior to loading the first data value into the target register;
in response to a determination that the pending instruction is a load byte operation, transferring the first data value from the data cache to the shifter circuit and shifting the first data value prior to loading the first data value into the target register; and
in response to a determination that the pending instruction is a load word operation, transferring the first data value from the data cache directly to the target register without processing the first data value in the shifter circuit.
- 11. The method as set forth in claim 10 wherein the step of transferring the first data value requires two machine cycles during a load word operation.
- 12. The method as set forth in claim 10 wherein the step of transferring the first data value requires three machine cycles during a load half-word operation.
- 13. The method as set forth in claim 10 wherein the step of transferring the first data value requires three machine cycles during a load byte operation.
- 14. A processing system comprising:
a data processor;
a memory coupled to said data processor;
a plurality of memory-mapped peripheral circuits coupled to said data processor for performing selected functions in association with said data processor, wherein said data processor comprises:
an instruction execution pipeline comprising N processing stages, each of said N processing stages capable of performing one of a plurality of execution steps associated with a pending instruction being executed by said instruction execution pipeline;
a data cache capable of storing data values used by said pending instruction;
a plurality of registers capable of receiving said data values from said data cache;
a load store unit capable of transferring a first one of said data values from said data cache to a target one of said plurality of registers during execution of a load operation;
a shifter circuit associated with said load store unit capable of one of a) shifting, b) sign extending, and c) zero extending said first data value prior to loading said first data value into said target register; and
bypass circuitry associated with said load store unit capable of transferring said first data value from said data cache directly to said target register without processing said first data value in said shifter circuit.
- 15. The processing system as set forth in claim 14 wherein said bypass circuitry transfers said first data value from said data cache directly to said target register during a load word operation.
- 16. The processing system as set forth in claim 15 wherein said bypass circuitry transfers said first data value from said data cache directly to said target register at the end of two machine cycles.
- 17. The processing system as set forth in claim 14 wherein said shifter circuit performs one of a) shifting, b) sign extending, and c) zero extending said first data value prior to loading said first data value into said target register during a load half-word operation.
- 18. The processing system as set forth in claim 17 wherein said shifter circuit loads said shifted first data value into said target register at the end of three machine cycles.
- 19. The processing system as set forth in claim 14 wherein said shifter circuit performs one of a) shifting, b) sign extending, and c) zero extending said first data value prior to loading said first data value into said target register during a load byte operation.
- 20. The processing system as set forth in claim 19 wherein said shifter circuit loads said shifted first data value into said target register at the end of three machine cycles.
- 21. The processing system as set forth in claim 14 wherein said bypass circuitry comprises a multiplexer having a first input channel coupled to a data output of said data cache.
- 22. The processing system as set forth in claim 21 wherein said multiplexer has a second input channel coupled to an output of said shifter circuit.
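The claims above describe a load path in which a load word bypasses the shifter and reaches the target register directly, while load half-word and load byte pass through the shifter for alignment and sign or zero extension. The selection logic can be sketched as a minimal software model; the function name, 32-bit word width, and little-endian byte layout are illustrative assumptions, not details recited in the claims:

```python
# Hypothetical model of the claimed load path: a load word takes the
# bypass path, while sub-word loads take the shifter path (align the
# field, then sign- or zero-extend it to the register width).

def load_value(cache_word, op, offset=0, signed=True):
    """Return the value loaded into the target register.

    cache_word -- 32-bit value read from the data cache
    op         -- "word", "half", or "byte"
    offset     -- byte offset of the sub-word field (little-endian)
    signed     -- sign-extend (True) or zero-extend (False) sub-word loads
    """
    if op == "word":
        # Bypass path: cache output goes directly to the target register,
        # without processing in the shifter circuit.
        return cache_word & 0xFFFFFFFF
    # Shifter path: shift the addressed field down to bit 0, then extend.
    if op == "half":
        field = (cache_word >> (8 * offset)) & 0xFFFF
        sign_bit, extension = 0x8000, 0xFFFF0000
    elif op == "byte":
        field = (cache_word >> (8 * offset)) & 0xFF
        sign_bit, extension = 0x80, 0xFFFFFF00
    else:
        raise ValueError("unknown load operation: %r" % op)
    if signed and (field & sign_bit):
        field |= extension  # sign extension
    return field
```

In hardware terms, the two return paths correspond to the two input channels of the multiplexer recited in claims 8-9 and 21-22: one channel fed by the data cache output (bypass), the other by the shifter circuit output.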
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] The present invention is related to those disclosed in the following U.S. patent applications:
[0002] 1. Ser. No. [Docket No. 00-BN-052], filed concurrently herewith, entitled “PROCESSOR PIPELINE STALL APPARATUS AND METHOD OF OPERATION”;
[0003] 2. Ser. No. [Docket No. 00-BN-053], filed concurrently herewith, entitled “CIRCUIT AND METHOD FOR HARDWARE-ASSISTED SOFTWARE FLUSHING OF DATA AND INSTRUCTION CACHES”;
[0004] 3. Ser. No. [Docket No. 00-BN-054], filed concurrently herewith, entitled “CIRCUIT AND METHOD FOR SUPPORTING MISALIGNED ACCESSES IN THE PRESENCE OF SPECULATIVE LOAD INSTRUCTIONS”;
[0005] 4. Ser. No. [Docket No. 00-BN-055], filed concurrently herewith, entitled “BYPASS CIRCUITRY FOR USE IN A PIPELINED PROCESSOR”;
[0006] 5. Ser. No. [Docket No. 00-BN-056], filed concurrently herewith, entitled “SYSTEM AND METHOD FOR EXECUTING CONDITIONAL BRANCH INSTRUCTIONS IN A DATA PROCESSOR”;
[0007] 6. Ser. No. [Docket No. 00-BN-057], filed concurrently herewith, entitled “SYSTEM AND METHOD FOR ENCODING CONSTANT OPERANDS IN A WIDE ISSUE PROCESSOR”;
[0008] 7. Ser. No. [Docket No. 00-BN-058], filed concurrently herewith, entitled “SYSTEM AND METHOD FOR SUPPORTING PRECISE EXCEPTIONS IN A DATA PROCESSOR HAVING A CLUSTERED ARCHITECTURE”;
[0009] 8. Ser. No. [Docket No. 00-BN-059], filed concurrently herewith, entitled “CIRCUIT AND METHOD FOR INSTRUCTION COMPRESSION AND DISPERSAL IN WIDE-ISSUE PROCESSORS”;
[0010] 9. Ser. No. [Docket No. 00-BN-066], filed concurrently herewith, entitled “SYSTEM AND METHOD FOR REDUCING POWER CONSUMPTION IN A DATA PROCESSOR HAVING A CLUSTERED ARCHITECTURE”; and
[0011] 10. Ser. No. [Docket No. 00-BN-067], filed concurrently herewith, entitled “INSTRUCTION FETCH APPARATUS FOR WIDE ISSUE PROCESSORS AND METHOD OF OPERATION”.
[0012] The above applications are commonly assigned to the assignee of the present invention. The disclosures of these related patent applications are hereby incorporated by reference for all purposes as if fully set forth herein.