Claims
- 1. A multiple execution unit processor, the processor comprising:
a memory unit storing a plurality of execution packets;
a buffer storage unit for storing the execution packets;
a dispatch unit for directing each instruction of an execution packet applied thereto to a preselected execution unit;
a program memory control unit for retrieving execution packets from the memory unit, the program memory control unit having a first state wherein an execution packet from the memory unit is applied to the dispatch unit and to the buffer storage unit, the execution packet applied to the execution unit being stored therein, wherein in the first state the retrieved instruction stage and any instruction stage stored in the buffer storage unit are applied to the dispatch unit simultaneously, the program memory control unit having a second state wherein the execution packets stored in the buffer storage unit are simultaneously applied to the dispatch unit, the program memory control unit having a third state implemented after a selected execution packet has been executed a predetermined number of times, wherein in the third state the earliest stored execution packet in the buffer storage unit is eliminated after each application of the stored execution packets to the crossbar unit, wherein the processor uses the three instruction states to execute an inner loop of a nested-loop instruction set, the inner loop instruction set and an outer loop instruction set having overlapping execution;
a comparison unit receiving signals from the buffer storage unit and the program memory control unit, the comparison unit generating control signals; and
a gate unit responsive to the control signals for preventing an associated instruction from being applied to the dispatch unit.
- 2. The processor as recited in claim 1 wherein the comparison unit compares valid bits from the buffer storage unit with valid bits from the program memory control unit.
- 3. The processor as recited in claim 2 wherein the inner loop execution packets are stored in the buffer storage unit during execution of the outer loop instruction stages.
- 4. The processor as recited in claim 2 further comprising a second buffer storage unit, wherein the outer loop instruction stages are stored in the second buffer storage unit, the comparison unit receiving signals from the second buffer unit instead of the program memory control unit.
- 5. A method of executing a nested-loop set of instruction stages, the execution including the execution of outer loop instruction stages a first plurality of times, the execution of the nested loop of instructions including execution of inner loop instruction stages a second plurality of times for each execution of the outer loop of instruction stages, the method comprising:
using a software pipeline procedure to execute the inner loop instruction stages for each execution of the inner loop instructions the second plurality of times;
overlapping execution of the inner loop instruction stages and the outer loop instruction stages;
comparing valid bits from the inner loop execution packets with valid bits from outer loop execution packets for execution packets that will be executed simultaneously; and
when the valid bits are the same, preventing the associated instruction from the buffer unit from being executed.
- 6. The method as recited in claim 5 further comprising:
storing the inner loop instruction stages in a buffer storage unit during the execution of the outer loop instruction set.
- 7. The method as recited in claim 5 further comprising:
storing the outer loop instruction stages in a buffer memory unit during execution of the inner loop instruction stages.
- 8. The method as recited in claim 7 wherein storing the outer loop instruction stage includes executing the outer loop instruction stages in a new sequence, the outer loop instruction stages executed after execution of the inner loop instruction stages being executed in the sequence before the execution of the outer loop instruction stages executed before the execution of the inner loop instruction stages in the new sequence.
- 9. The method as recited in claim 5 wherein the outer loop instruction set can have an instruction conflict with one of the inner loop states selected from the group consisting of the prolog state and the epilog state.
- 10. In a multi-execution unit processing unit for processing a nested loop instruction set, the inner loop instruction set being executed by a software pipeline procedure, the inner loop instruction set being stored in a buffer memory unit, execution of outer loop execution packets overlapping execution of the inner loop execution packets, apparatus for preventing conflict between an instruction in an outer loop execution packet and an instruction of the inner loop, the apparatus comprising:
a comparison unit for comparing valid bits from an outer loop execution packet with valid bits from an inner loop execution packet when the execution packets are executed simultaneously, and, when at least one valid bit from each execution packet is the same, generating control signals indicative of the conflicting instructions; and
a gate unit responsive to the control signals for preventing the conflicting instruction in the inner loop instruction set from being forwarded for execution.
- 11. The apparatus as recited in claim 10 wherein the function of the instruction prevented from being forwarded is included in the conflicting outer loop instruction.
- 12. In a multi-execution unit processing unit for processing a nested loop instruction set, the inner loop instruction set being executed by a software pipeline procedure, the inner loop instruction set being stored in a buffer memory unit, execution of outer loop execution packets overlapping execution of the inner loop execution packets, a method for preventing conflict between an instruction in an outer loop execution packet and an instruction of the inner loop, the method comprising:
comparing valid bits from the inner loop execution packets with outer loop execution packets for execution packets to be executed simultaneously; and
when valid bits associated with each instruction are the same for an instruction in both execution packets to be executed simultaneously, preventing the instruction with the same valid bit in the inner loop instruction set from being executed.
- 14. The method as recited in claim 13 wherein the functionality of the instruction prevented from being executed is included in the outer loop instruction for which the conflict is identified.
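The valid-bit comparison and gating recited in claims 5, 10, and 12 can be pictured with a short Python sketch. It is illustrative only and is not part of the claims or a description of the actual hardware. The names `compare_valid_bits` and `gate_and_merge`, the four-unit packet width, and the instruction mnemonics are all hypothetical. The sketch also assumes that valid bits being "the same" means both packets hold a valid instruction for the same execution unit.

```python
from typing import List, Optional, Tuple

# Hypothetical model: an execution packet has one slot per execution unit.
# A slot holding None means the valid bit for that unit is clear.
Packet = List[Optional[str]]


def compare_valid_bits(outer: Packet, inner: Packet) -> List[bool]:
    """Comparison unit (assumed behavior): raise a control signal for every
    execution unit that both packets want to use in the same cycle."""
    return [o is not None and i is not None for o, i in zip(outer, inner)]


def gate_and_merge(outer: Packet, inner: Packet) -> Tuple[Packet, List[bool]]:
    """Gate unit (assumed behavior): block conflicting inner-loop instructions
    coming from the buffer, then merge what remains with the outer-loop packet."""
    conflicts = compare_valid_bits(outer, inner)
    # Inner-loop instructions on conflicting units are not forwarded.
    gated_inner = [None if c else i for i, c in zip(inner, conflicts)]
    # Each unit receives the outer instruction if present, else the gated inner one.
    dispatched = [o if o is not None else g for o, g in zip(outer, gated_inner)]
    return dispatched, conflicts


# Outer-loop packet from program memory and inner-loop packet from the buffer,
# both scheduled for the same cycle on a four-unit machine (made-up mnemonics).
outer = ["MVK", None, None, "ADD"]
inner = [None, "MPY", None, "SUB"]

dispatched, conflicts = gate_and_merge(outer, inner)
print(conflicts)   # [False, False, False, True]
print(dispatched)  # ['MVK', 'MPY', None, 'ADD']
```

In this sketch the inner-loop `SUB` is suppressed because the outer-loop `ADD` claims the same unit. Under claims 11 and 14, the outer loop instruction would be expected to include the function of the suppressed instruction.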
RELATED APPLICATION
[0001] This application claims priority from provisional patent application No. 60/342,706 entitled APPARATUS AND METHOD FOR A SOFTWARE PIPELINE LOOP PROCEDURE IN A DIGITAL SIGNAL PROCESSOR, invented by Eric J. Stotzer, Steve D. Krueger, and Timothy D. Anderson, filed on Dec. 20, 2001, and assigned to the assignee of the present Application; and provisional patent application No. 60/342,728 entitled APPARATUS AND METHOD FOR IMPROVED EXECUTION OF A SOFTWARE PIPELINE LOOP PROCEDURE IN A DIGITAL SIGNAL PROCESSOR, invented by Timothy D. Anderson, Michael D. Asal, and Eric J. Stotzer, filed on Dec. 20, 2001, and assigned to the assignee of the present Application.
[0002] U.S. patent application Ser. No. 09/855,140 (Attorney Docket TI-25737) entitled LOOP CACHE MEMORY AND CACHE CONTROLLER FOR PIPELINED MICROPROCESSORS, invented by Richard H. Scales, filed on May 14, 2001, and assigned to the assignee of the present application; U.S. patent application (Attorney Docket TI-33895), entitled APPARATUS AND METHOD FOR A SOFTWARE PIPELINE LOOP PROCEDURE IN A DIGITAL SIGNAL PROCESSOR, invented by Eric J. Stotzer, Steve D. Krueger, and Timothy D. Anderson, filed on even date herewith, and assigned to the assignee of the present Application; U.S. patent application (Attorney Docket TI-33896), entitled APPARATUS AND METHOD FOR IMPROVED EXECUTION OF A SOFTWARE PIPELINE LOOP PROCEDURE IN A DIGITAL SIGNAL PROCESSOR, invented by Timothy D. Anderson, Michael D. Asal, and Eric J. Stotzer, filed on even date herewith, and assigned to the assignee of the present Application; U.S. patent application (Attorney Docket TI-34336), entitled APPARATUS AND METHOD FOR PROCESSING AN INTERRUPT IN A SOFTWARE PIPELINE LOOP PROCEDURE IN A DIGITAL SIGNAL PROCESSOR, invented by Eric J. Stotzer, Steve D. Krueger, Timothy D. Anderson, and Michael D. Asal, filed on even date herewith, and assigned to the assignee of the present Application; U.S. patent application (Attorney Docket TI-34337), entitled APPARATUS AND METHOD FOR EXECUTING A NESTED LOOP PROGRAM WITH A SOFTWARE PIPELINE LOOP PROCEDURE IN A DIGITAL SIGNAL PROCESSOR, invented by Eric J. Stotzer and Michael D. Asal, filed on even date herewith, and assigned to the assignee of the present Application; and U.S. patent application (Attorney Docket TI-34335) entitled APPARATUS AND METHOD FOR EXITING FROM A SOFTWARE PIPELINE LOOP PROCEDURE IN A DIGITAL SIGNAL PROCESSOR, invented by Elana D. Granston, Eric J. Stotzer, Steve D. Krueger, and Timothy D. Anderson, filed on even date herewith, and assigned to the assignee of the present application, are related applications.
Provisional Applications (2)
| Number | Date | Country |
|---|---|---|
| 60342706 | Dec 2001 | US |
| 60342728 | Dec 2001 | US |