Claims
- 1. A microprocessor, comprising:
an instruction decoding stage that provides three sequences of decoded instructions, one set of three instructions at a time, a data memory with only two ports, three multi-staged pipelines receiving and processing in parallel the three sequences of decoded instructions provided by the instruction decoding stage, and a control circuit responsive to an individual set of three instructions for dynamically connecting the two memory ports to any two of the pipelines to which instructions of the individual set requiring access to the memory are being sent while an instruction of the individual set not requiring access to the memory is sent through another of the pipelines.
- 2. The microprocessor of claim 1, which includes exactly three multi-staged pipelines, and wherein each set of instructions includes exactly three instructions.
- 3. The microprocessor of claim 1, wherein the instruction of the individual set not requiring access to the memory includes a jump instruction.
- 4. The microprocessor of claim 1, wherein the instruction of the individual set not requiring access to the memory includes an instruction to move data between two of a plurality of registers.
- 5. The microprocessor of claim 1, wherein the instruction of the individual set not requiring access to the memory includes an instruction to perform arithmetic or logic operations on data in two of a plurality of registers.
- 6. The microprocessor of claim 1, wherein each of the three pipelines includes an address generation stage and an instruction execution stage, the address generation and instruction execution stages of one of the three pipelines having significantly less capability than those of the other two of the three pipelines, whereby space and power are conserved by said one of the three pipelines.
- 7. The microprocessor of claim 1, additionally including a set of registers from which data is read and into which data is written by each of the three pipelines.
- 8. A microprocessor, comprising:
an instruction decoding stage that provides three sequences of decoded instructions, one set of three instructions at a time, three multi-staged pipelines receiving and processing in parallel the three sequences of decoded instructions provided by the instruction decoding stage, two arithmetic logic units, a move unit, and a control circuit responsive to an individual set of three instructions for dynamically connecting the two arithmetic logic units individually in any two of the three pipelines in order to accept instructions of the individual set requiring an arithmetic logic unit to execute while the move unit is connectable to another of the pipelines which accepts an instruction of the individual set not requiring an arithmetic logic unit to execute.
- 9. The microprocessor of claim 8, which includes exactly three multi-staged pipelines, and wherein each set of instructions includes exactly three instructions.
- 10. The microprocessor of claim 8, wherein the instruction of the individual set that is accepted by said another of the pipelines includes a jump instruction.
- 11. The microprocessor of claim 8, wherein the instruction of the individual set that is accepted by said another of the pipelines includes instructions to move data between two of a plurality of registers and instructions to move data between one of the plurality of registers and a memory.
- 12. The microprocessor of claim 8, additionally including a set of registers from which data is read and into which data is written by each of the three pipelines.
- 13. A microprocessor, comprising:
a number of pipelines in excess of two that are operated in parallel, each of the plurality of pipelines having a plurality of pipeline stages that executes instructions in steps along its stages, a number of data memory access ports at least one less than the number of pipelines, a switching circuit that individually connects the data memory ports with selected stages of any of a number of the plurality of pipelines at least one more than the number of data memory access ports at different times when necessary to execute instructions being processed by the pipelines, and at least one remaining pipeline to which the data memory is not connected at one of said times being capable of executing instructions not requiring memory access.
- 14. The microprocessor of claim 13, additionally comprising:
a number of arithmetic logic units at least one less than the number of pipelines, said switching circuit additionally individually connecting the arithmetic logic units into one of the stages of any of a number of the plurality of pipelines at least one more than the number of arithmetic logic units at different times when necessary to execute instructions being processed by the pipelines, and at least one remaining pipeline to which an arithmetic logic unit is not connected at one of said times being capable of executing instructions not requiring an arithmetic logic unit.
- 15. The microprocessor of claim 14, which additionally comprises a move unit that is connectable into said remaining at least one pipeline for moving data between ones of a plurality of registers or between one of the registers and a memory.
- 16. A microprocessor, comprising:
a number of pipelines in excess of two that are operated in parallel, each of the plurality of pipelines having a plurality of pipeline stages that executes instructions in steps along its stages, a number of arithmetic logic units at least one less than the number of pipelines, a switching circuit that individually connects the arithmetic logic units into one of the stages of any of a number of the plurality of pipelines at least one more than the number of arithmetic logic units at different times when necessary to execute instructions being processed by the pipelines, and at least one remaining pipeline to which an arithmetic logic unit is not connected at one of said times being capable of executing instructions not requiring an arithmetic logic unit.
- 17. The microprocessor of claim 16, which additionally comprises a move unit that is connectable into said remaining at least one pipeline for moving data between ones of a plurality of registers or between one of the registers and a memory.
- 18. A microprocessor formed on a single integrated circuit chip, comprising:
an instruction memory adapted to provide a sequence of instructions to be executed, an instruction issuing stage coupled to the instruction memory for making a set of three instructions stored therein available in parallel during a common interval for processing, a data memory having first and second ports for simultaneous access therethrough to read operands therefrom, three address generation stages, two of said address generation stages having individual outputs connected to address the data memory respectively through said first and second ports thereof and read operands therefrom, a remaining one of the address generation stages not having access to read operands stored in the data memory, three arithmetic logic unit (ALU) stages, one of said three ALUs having less processing capability than the other two of said three ALUs, and an interconnection circuit responsive to each set of three instructions made available by the instruction issuing stage (a) for routing up to two of the three instructions needing operands from the data memory through the two address generation stages having outputs connected to address the data memory, (b) for connecting two operands read from the data memory to any two of the ALUs having sufficient processing capability to execute their associated instructions, and (c) for routing a remaining one of the three instructions not requiring an operand either to a remaining one of the address generation stages or a remaining one of the ALUs, thereby to process the set of three instructions in parallel.
- 19. The microprocessor of claim 18, wherein the data memory and instruction memory are separate from each other.
- 20. The microprocessor of claim 18, additionally comprising a plurality of registers, the contents of which are readable by at least some of the address generation and ALU stages.
- 21. A method of processing a sequence of computer instructions with access to data stored in a memory through only a given number of parallel access ports, comprising:
reviewing in a single interval each of a set of a number of instructions at least one more than the given number, calculating a memory address from each of no more than the given number of instructions in the set that require data from the memory, reading data from the memory at the calculated addresses through the given number of ports, executing those of the set of instructions having data that have been read from the memory, and depending upon the type of at least one of the set of instructions in excess of the given number that does not need data from memory, either (a) concurrently with said address calculating operation, calculating from said excess instruction an address of another instruction, or (b) concurrently with executing those of the set of instructions having data read from the memory, executing said excess instruction.
- 22. The method according to claim 21, wherein said given number is two.
- 23. The method according to claim 21, wherein the excess instruction is a jump instruction, and wherein the address of another instruction calculated from the excess instruction is subsequently used to designate another set of instructions that are reviewed in a subsequent interval.
- 24. The method according to claim 21, wherein the excess instruction is a move instruction that is executed to move data between individual ones of a plurality of registers.
- 25. The method according to claim 21, wherein the excess instruction is an instruction to perform arithmetic or logic operations on data in two of a plurality of registers.
- 26. A method of executing a sequence of computer instructions by a processor having a plurality of registers, a given number of arithmetic logic units (ALUs), and access to a memory, comprising:
reviewing in a single interval each of a set of a number of instructions at least one more than the given number, executing a given number of said set of instructions during a subsequent interval by use of the given number of ALUs, thereby to leave at least one of the set of instructions that is not being executed by one of the ALUs during the subsequent interval, and depending upon the type of said at least one instruction not being executed by one of the ALUs during the subsequent interval, either (a) executing a jump to a new set of instructions, or (b) moving data between two registers, or (c) moving data between one of the registers and the memory.
- 27. The method according to claim 21, wherein said given number is two.
- 28. A microprocessor on a single integrated circuit chip, comprising:
an instruction cache memory for storing instructions to be processed, an instruction fetch stage that accesses the instruction cache memory to obtain instructions therefrom in a sequence in which the instructions are to be executed, an instruction queue stage receiving instructions from the instruction fetch stage for storing three sequential instructions at a time for processing, first, second and third address generating stages that each include adder circuits, the adder circuit of the third address generating stage having fewer input ports than the adder circuits of each of the first and second address generating stages, a data cache memory for storing operands used in processing instructions and for storing results of processing instructions, the data cache memory having first and second parallel access ports that are connected to receive addresses calculated by the adders of the first and second address generating stages, respectively, and provide respective first and second operands from the data cache memory in response, the third address generating stage having no access to the data cache memory, a circuit connecting an output of the adder of the third address generation stage to the instruction fetch stage for designating an address of an instruction to be read from the instruction cache memory, first, second and third instruction execution stages that each include respective first, second and third arithmetic logic units (ALUs) with the third ALU having fewer input ports than either of the first or second ALUs, circuits connected to outputs of the ALUs for writing results of instruction processing thereby into the registers or into the data cache memory through its said first and second ports, a plurality of registers connected to provide data inputs to the adder circuits and each of the first, second and third ALUs, and to receive data from the writing circuits, and a control circuit that routes instructions stored in the instruction queue stage into the first, second and third address generating stages and the first, second and third instruction execution stages in a manner that instructions requiring operands from the data cache memory are not routed to the third address generating stage and a limited set of instructions are routed to the third instruction execution stage.
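By way of illustration only, and forming no part of the claims, the dynamic connection of two data memory ports to three pipelines recited in claim 1 might be modeled in software roughly as follows. This is a minimal sketch under stated assumptions: the names `Instruction`, `needs_memory`, and `route_set` are hypothetical, each instruction of a set is assumed to be sent to the pipeline of the same position, and a set in which all three instructions require memory access is simply rejected, since the claims do not state how such a set is handled.

```python
from dataclasses import dataclass
from typing import List, Optional

NUM_PIPELINES = 3      # three multi-staged pipelines (claim 1)
NUM_MEMORY_PORTS = 2   # data memory with only two ports (claim 1)


@dataclass(frozen=True)
class Instruction:
    """Hypothetical decoded instruction; only the memory requirement is modeled."""
    name: str
    needs_memory: bool


def route_set(instructions: List[Instruction]) -> List[Optional[int]]:
    """Return, for each pipeline, the memory port connected to it (or None).

    Instruction i of the set is assumed to travel down pipeline i. Ports are
    handed out in order to whichever pipelines carry memory instructions.
    """
    if len(instructions) != NUM_PIPELINES:
        raise ValueError("a set must contain exactly three instructions")

    connections: List[Optional[int]] = []
    next_port = 0
    for instr in instructions:
        if instr.needs_memory:
            if next_port >= NUM_MEMORY_PORTS:
                # Assumption: the claims do not say what happens in this case,
                # so the sketch refuses to issue the set as a whole.
                raise ValueError("more memory instructions than memory ports")
            connections.append(next_port)
            next_port += 1
        else:
            # This pipeline carries an instruction that needs no memory port,
            # for example a jump or a register-to-register move.
            connections.append(None)
    return connections
```

For a set consisting of a load, a jump, and a store, in that order, `route_set` returns `[0, None, 1]`: the two ports are connected to the first and third pipelines while the jump proceeds through the second pipeline without a port.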
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This is a continuation-in-part of copending patent application Ser. No. 09/062,804, filed Apr. 20, 1998, which application is expressly incorporated herein in its entirety by this reference.
Divisions (1)
| | Number | Date | Country |
|---|---|---|---|
| Parent | 09151634 | Sep 1998 | US |
| Child | 09842107 | Apr 2001 | US |
Continuation in Parts (1)
| | Number | Date | Country |
|---|---|---|---|
| Parent | 09062804 | Apr 1998 | US |
| Child | 09151634 | Sep 1998 | US |