Claims
- 1. A method for optimizing a software pipelineable loop in a software code, wherein the loop comprises one or more pipelined stages and one or more loop operations, the method comprising:
(a) evaluating an initiation interval time (IN) for a pipelined stage of the loop;
(b) determining a loop operation time latency (Tld);
(c) determining a number of loop operations (Np) from the pipelined stages to peel based on IN and Tld;
(d) peeling Np copies of the loop operation;
(e) copying the peeled loop operations before the loop in the software code;
(f) allocating a vector of registers;
(g) assigning results of the peeled loop operations and a result of an original loop operation to the vector of registers; and
(h) assigning memory addresses to the results of the peeled loop operations and original loop operation.
- 2. The method of claim 1, wherein the initiation interval time comprises a number of clock cycles for a number of stages required to execute a loop kernel.
- 3. The method of claim 1, wherein the one or more loop operations comprises one or more load operations.
- 4. The method of claim 3, wherein steps (d)-(h) are repeated for every load operation.
- 5. The method of claim 1, wherein the number of peeled loop operations equals (Tld+IN−1)/IN.
- 6. The method of claim 1, wherein the vector of registers has a length of Np+1.
- 7. The method of claim 1, further comprising generating a software code based on the optimization.
- 8. The method of claim 1, wherein allocating a vector of registers comprises allocating a vector of registers with a rotating base.
- 9. An apparatus for optimizing a software pipelineable loop in a software code, wherein the loop comprises one or more pipelined stages and one or more loop operations, the apparatus comprising:
(a) instructions for evaluating an initiation interval time (IN) for a pipelined stage of the loop;
(b) instructions for determining a loop operation time latency (Tld);
(c) instructions for determining a number of loop operations from the pipelined stages to peel (Np) based on the IN and Tld;
(d) instructions for peeling Np copies of the loop operation;
(e) instructions for copying the peeled loop operations before the loop in the software code;
(f) instructions for allocating a vector of registers;
(g) instructions for assigning results of the peeled loop operations and a result of an original loop operation to the vector of registers; and
(h) instructions for assigning memory addresses to the results of the peeled loop operations and original loop operation.
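The steps recited in claims 1, 5, and 6 can be illustrated with a small simulation. The following is a minimal sketch, not the claimed implementation: the function names `peel_count` and `sum_with_peeled_loads` are hypothetical, the loop body (a running sum) is chosen only for illustration, the division in claim 5 is assumed to be integer division, and the rotating base of claim 8 is modeled in software with a modular index rather than with hardware register rotation.

```python
def peel_count(t_ld: int, i_n: int) -> int:
    """Np = (Tld + IN - 1) / IN, assuming integer (floor) division."""
    if t_ld < 0 or i_n <= 0:
        raise ValueError("Tld must be non-negative and IN must be positive")
    return (t_ld + i_n - 1) // i_n


def sum_with_peeled_loads(memory, t_ld: int, i_n: int) -> int:
    """Sum `memory` with loads issued Np iterations ahead of their use."""
    n_p = peel_count(t_ld, i_n)
    regs = [None] * (n_p + 1)  # vector of registers of length Np + 1
    base = 0                   # rotating base (modeled as a modular index)

    # Peeled copies: the first Np loads are placed before the loop.
    for k in range(min(n_p, len(memory))):
        regs[(base + k) % len(regs)] = memory[k]

    total = 0
    for i in range(len(memory)):
        ahead = i + n_p
        if ahead < len(memory):
            # Original load, now running Np iterations ahead of its use.
            regs[(base + n_p) % len(regs)] = memory[ahead]
        total += regs[base]              # value loaded Np iterations earlier
        base = (base + 1) % len(regs)    # rotate the register base
    return total
```

With Tld = 6 and IN = 2, for example, `peel_count` yields three peeled loads and a four-entry register vector, so each value is requested three iterations before the loop body consumes it.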
CROSS-REFERENCES TO RELATED APPLICATIONS
[0001] This application is a continuation-in-part application that is related to and claims priority from U.S. patent application Ser. No. 09/505,657, filed Feb. 17, 2000, which claims priority from U.S. Provisional Patent Application Nos. 60/120,352; 60/120,360; 60/120,361; 60/120,450; 60/120,461; 60/120,464; 60/120,528; 60/120,530; and 60/120,533, all of which were filed Feb. 17, 1999, the disclosures of which are incorporated herein by reference in their entirety.
Provisional Applications (9)
| Number | Date | Country |
|---|---|---|
| 60120352 | Feb 1999 | US |
| 60120360 | Feb 1999 | US |
| 60120361 | Feb 1999 | US |
| 60120450 | Feb 1999 | US |
| 60120461 | Feb 1999 | US |
| 60120464 | Feb 1999 | US |
| 60120528 | Feb 1999 | US |
| 60120530 | Feb 1999 | US |
| 60120533 | Feb 1999 | US |
Continuation in Parts (1)
| | Number | Date | Country |
|---|---|---|---|
| Parent | 09505657 | Feb 2000 | US |
| Child | 09972337 | Oct 2001 | US |