Claims
- 1. Data processor apparatus having a multi-stage pipeline and an instruction set having at least one extension instruction, the apparatus comprising:
a plurality of first instructions having a first length; a plurality of second instructions having a second length; and logic adapted to decode and process both said first length and second length instructions from a single program having both first and second length instructions contained therein.
- 2. The apparatus of claim 1, wherein said logic comprises an instruction aligner disposed in a first stage of said pipeline, said aligner adapted to provide at least one first word of said first length and at least one second word of said second length to decode logic, said decode logic selecting between said at least one first and second words.
- 3. The apparatus of claim 2, said aligner further comprising a buffer, said buffer adapted to store at least a portion of a fetched instruction from an instruction cache operatively coupled to the aligner, said storing mitigating stalling of said pipeline.
- 4. Reduced memory overhead data processor apparatus having a multi-stage pipeline with at least fetch, decode, execute, and writeback stages, and an instruction set having (i) a base instruction set and (ii) at least one extension instruction, the apparatus comprising:
a plurality of first instructions having a first length; a plurality of second instructions having a second length; and logic adapted to decode and process both said first length and second length instructions;
wherein the selection of instructions of said first or second length is conducted based at least in part on minimizing said memory overhead.
- 5. Digital processor pipeline apparatus, comprising:
an instruction fetch stage; an instruction decode stage operatively coupled downstream of said fetch stage; an execution stage operatively coupled downstream of said decode stage; and a writeback stage operatively coupled downstream of said execution stage;
wherein said fetch, decode, execute, and writeback stages are adapted to process a plurality of instructions comprising a first plurality of 16-bit instructions and a second plurality of 32-bit instructions.
- 6. The apparatus of claim 5, wherein said plurality of instructions comprises at least one extension instruction.
- 7. The apparatus of claim 6, further comprising at least one selector operatively coupled to at least said fetch stage, said at least one selector operative to select between individual ones of 16-bit and 32-bit instructions within said first and second plurality of instructions, respectively.
- 8. The apparatus of claim 5, further comprising a register file disposed within said decode stage.
- 9. The apparatus of claim 5, further comprising:
(i) an instruction cache within said fetch stage; (ii) an instruction aligner operatively coupled to said instruction cache; and (iii) decode logic operatively coupled to said instruction aligner and said decode stage;
wherein said aligner is configured to provide both 16-bit and 32-bit instructions to said decode logic, said decode logic selecting between said 16-bit and 32-bit instructions to produce a selected instruction, said selected instruction being passed to said decode stage of said pipeline apparatus.
- 10. Processor pipeline code compression apparatus, comprising:
an instruction cache adapted to store a plurality of instruction words of first and second lengths; an instruction aligner operatively coupled to said instruction cache; and decode logic operatively coupled to said aligner;
wherein said aligner is adapted to provide at least one first word of said first length and at least one second word of said second length to said decode logic, said decode logic selecting between said at least one first and second words.
- 11. The apparatus of claim 10, wherein said aligner further comprises a buffer, said buffer adapted to store at least a portion of a fetched instruction from said cache, said storing mitigating pipeline stalling.
- 12. The apparatus of claim 11, wherein said fetched instruction crosses a longword boundary.
- 13. The apparatus of claim 11, further comprising a register file disposed downstream of said aligner, said register file adapted to store a plurality of source data.
- 14. The apparatus of claim 13, further comprising at least one multiplexer operatively coupled to said decode logic and said register file, wherein said at least one multiplexer selects at least one operand for the selected one of said first or second word.
- 15. The apparatus of claim 10, wherein said first length is shorter than said second length, and said decode logic further comprises logic adapted to expand said first word from said first length to said second length.
- 16. A method of compressing the instruction set of a user-configurable digital processor design, comprising:
providing a first instruction word; generating at least second and third instruction words, said second word having a first length and said third word having a second length, said second length being longer than said first length; and selecting, based on at least one bit within said first instruction word, which of said second and third words is valid;
wherein said acts of generating and selecting cooperate to provide code density greater than that obtained using only instruction words of said second length.
- 17. A digital processor with a multi-stage pipeline and a multi-length ISA, comprising a buffered instruction aligner disposed in the first stage of said pipeline, wherein said instruction aligner allows unrestricted selection of instructions of either a first or a second length.
- 18. An embedded integrated circuit, comprising:
at least one silicon die; at least one processor core disposed on said die, said at least one core comprising:
(i) a base instruction set; (ii) at least one extension instruction; (iii) a multi-stage pipeline with an instruction cache and an instruction aligner in the first stage thereof, said instruction aligner adapted to generate instruction words of first and second lengths, said processor core further being adapted to determine which of said instruction words is optimal; at least one peripheral; and at least one storage device disposed on said die adapted to hold a plurality of instructions; wherein said processor core is designed using the method comprising:
(i) providing a basecase core configuration; and (ii) selectively adding said at least one extension instruction.
- 19. A method of processing multi-length instructions within a digital processor instruction pipeline, comprising:
providing a plurality of first instructions of a first length; providing a plurality of second instructions of a second length, at least a portion of said plurality of second instructions comprising components of a longword; determining when a given longword comprises one of said first instructions or a plurality of said second instructions; and when said act of determining indicates that said given longword comprises a plurality of said second instructions, buffering at least one of said second instructions.
- 20. The method of claim 19, wherein said act of determining comprises reading the most significant bits of each of said first and second instructions.
- 21. The method of claim 19, wherein said act of buffering comprises determining whether said at least one second instruction being buffered comprises the first portion of an instruction of said first length.
- 22. The method of claim 21, wherein said first length comprises 32 bits, and said second length comprises 16 bits.
- 23. The method of claim 21, further comprising concatenating said at least one second instruction with at least a portion of a subsequent longword.
- 24. A method of processing multi-length instructions within a digital processor instruction pipeline, at least one of said instructions comprising a branch or jump instruction, comprising:
providing a first 16-bit branch/jump instruction within a first longword having an upper and lower portion, said branch/jump instruction being disposed in said upper portion; processing said branch/jump instruction, including buffering said lower portion; concatenating the upper portion of a second longword with said buffered lower portion of said first longword to produce a first 32-bit instruction; and taking the branch/jump, wherein the lower portion of said second longword is discarded.
- 25. The method of claim 24, wherein said first 32-bit instruction resides in the delay slot of said first 16-bit branch/jump instruction.
- 26. A single-mode pipelined digital processor with an ISA, said ISA having a plurality of instructions of at least first and second lengths, said instructions each having an opcode in an upper portion thereof, said opcode containing at least two bits which designate the instruction length;
wherein said ISA is adapted to automatically select instructions of said first or second length based at least in part on said opcode and without mode switching.
- 27. A method of compressing a digital processor instruction set, comprising:
providing a first plurality of instructions of a first length, said first length being consistent with the architecture of the processor; providing a second plurality of instructions of a second length, said first length being an integer multiple of said second length; and selectively utilizing individual ones of said second plurality of instructions.
- 28. A digital processor, comprising:
a first ISA having a plurality of first instructions of a first length associated therewith; a second ISA having a plurality of second instructions of a second length, said first length being an integer multiple of said second length; and selection apparatus adapted to selectively utilize individual ones of said second instructions in at least instances where either said first instructions or said second instructions could be utilized to perform an operation, said utilization of said second instructions reducing the cycle count required to perform said operation.
- 29. A method of programming a digital processor, comprising:
providing a first ISA having a plurality of first instructions of a first length associated therewith; providing a second ISA having a plurality of second instructions of a second length, said first length being an integer multiple of said second length; selecting individual ones of said first and second instructions during said programming; and generating a computer program using said selected first and second instructions;
wherein the execution of said computer program on said processor requires no mode switching.
- 30. User-configured data processor apparatus having a multi-stage pipeline, a base instruction set, and at least one extension instruction, the apparatus comprising:
a plurality of first instructions having a 32-bit length; a plurality of second instructions having a 16-bit length; an instruction cache disposed in a first stage of said pipeline; an instruction aligner disposed in said first stage of said pipeline and operatively coupled to said instruction cache; a register file disposed in a second stage of said pipeline; and decode logic operatively coupled between said aligner and said register file;
wherein said aligner and said decode logic are adapted to generate and decode both said first and second instructions, said acts of generating and decoding allowing said user to freely intermix said first and second instructions within a program running on said apparatus.
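The sketches that follow are editorial illustrations of the mechanisms recited in the claims above; they are not claim language, and every identifier, encoding, and value in them is an assumption made for the sake of example. Claims 16, 20, and 26 recite selecting between instruction lengths based on one or more bits of the instruction word (for example, its most significant bits). A minimal C sketch of such a check, assuming a two-bit length field in the most significant bits of the first 16-bit parcel, might look as follows; the claims do not fix any particular bit pattern.

```c
#include <stdint.h>
#include <stdbool.h>

/*
 * Illustrative only: decide instruction length from the most significant
 * bits of the fetched parcel (cf. claims 16, 20, and 26). The encoding
 * (top two bits == 0b11 marking a 16-bit instruction) is assumed.
 */
static inline bool parcel_is_16bit(uint16_t upper_parcel)
{
    return (upper_parcel >> 14) == 0x3;  /* assumed 2-bit length field */
}
```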
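Claims 3, 10-15, and 19-23 recite an instruction aligner with a buffer that holds part of a fetched instruction when a 32-bit instruction crosses a longword boundary, so that the pipeline need not stall. The behavioural C sketch below models that bookkeeping in software under the same assumed encoding; the names `aligner_t`, `aligner_push`, and `aligner_longword` are hypothetical, and the claimed aligner is pipeline hardware, so this is a functional illustration only.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Assumptions: a longword is 32 bits holding two 16-bit parcels, the upper
 * parcel comes first in program order, and a 32-bit instruction whose first
 * parcel sits in the lower half of a longword crosses into the next longword.
 */
static inline bool parcel_is_16bit(uint16_t p) { return (p >> 14) == 0x3; }

typedef struct {
    uint16_t held;      /* first parcel of a 32-bit op awaiting its tail */
    bool     has_held;  /* true while such a parcel is buffered          */
} aligner_t;

/* Feed one parcel into the aligner; emit a 16- or 32-bit instruction
 * (return true) or buffer the parcel and wait (return false). */
static bool aligner_push(aligner_t *a, uint16_t parcel, uint32_t *out)
{
    if (a->has_held) {                  /* complete a boundary-crossing op */
        *out = ((uint32_t)a->held << 16) | parcel;
        a->has_held = false;
        return true;
    }
    if (parcel_is_16bit(parcel)) {      /* whole 16-bit instruction */
        *out = parcel;
        return true;
    }
    a->held = parcel;                   /* first half of a 32-bit op */
    a->has_held = true;                 /* buffered: no pipeline stall */
    return false;
}

/* One longword fetched from the instruction cache supplies two parcels,
 * upper half first; up to two aligned instructions are produced. */
static size_t aligner_longword(aligner_t *a, uint32_t longword, uint32_t out[2])
{
    size_t n = 0;
    uint32_t insn;
    if (aligner_push(a, (uint16_t)(longword >> 16), &insn))     out[n++] = insn;
    if (aligner_push(a, (uint16_t)(longword & 0xFFFFu), &insn)) out[n++] = insn;
    return n;
}
```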
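Claims 24-25 recite a 16-bit branch/jump in the upper portion of a first longword whose buffered lower portion is concatenated with the upper portion of a second longword to form the 32-bit instruction occupying the branch delay slot, the lower portion of the second longword being discarded once the branch is taken. A small, self-contained worked example of that parcel bookkeeping, with hypothetical parcel values and the same assumed length encoding:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Layout in program order (values assumed for illustration):
 *   longword 1: [ 16-bit branch/jump | parcel P1 ]
 *   longword 2: [ parcel P2          | parcel P3 ]
 * P1 is buffered while the branch is processed, then concatenated with P2
 * to form the 32-bit delay-slot instruction; P3 is discarded because the
 * branch is taken.
 */
int main(void)
{
    uint32_t longword1 = 0xC100AAAAu;  /* 0xC100: assumed 16-bit branch; 0xAAAA: P1 */
    uint32_t longword2 = 0xBBBBCCCCu;  /* 0xBBBB: P2; 0xCCCC: P3 (discarded)        */

    uint16_t branch = (uint16_t)(longword1 >> 16);
    uint16_t p1     = (uint16_t)(longword1 & 0xFFFFu);  /* buffered by the aligner */
    uint16_t p2     = (uint16_t)(longword2 >> 16);
    uint16_t p3     = (uint16_t)(longword2 & 0xFFFFu);

    uint32_t delay_slot_insn = ((uint32_t)p1 << 16) | p2;  /* spans the boundary */

    printf("branch/jump     : 0x%04" PRIX16 "\n", branch);
    printf("delay-slot insn : 0x%08" PRIX32 "\n", delay_slot_insn);
    printf("discarded parcel: 0x%04" PRIX16 "\n", p3);
    return 0;
}
```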
RELATED APPLICATIONS
[0001] The present application claims priority benefit of U.S. Provisional Application Serial No. 60/353,647 filed Jan. 31, 2002 and entitled “CONFIGURABLE DATA PROCESSOR WITH MULTI-LENGTH INSTRUCTION SET ARCHITECTURE”, which is incorporated herein by reference in its entirety. The present application is also related to co-pending and co-owned U.S. patent application Ser. No. ______ filed Dec. 26, 2002 and entitled “METHODS AND APPARATUS FOR COMPILING INSTRUCTIONS FOR A DATA PROCESSOR”, which claims priority benefit of U.S. Provisional Serial No. 60/343,730 filed Dec. 26, 2001 of the same title, both of which are incorporated by reference herein in their entirety.
Provisional Applications (1)

| Number | Date | Country |
| --- | --- | --- |
| 60/353,647 | Jan. 31, 2002 | US |