Bacon, D.; Graham, S.; Sharp, O.; "Compiler Transformations for High Performance Computing"; ACM Computing Surveys; vol. 26, No. 4, pp. 345-420, Dec. 1994.
Carr, S. et al.; "Compiler Optimizations for Improving Data Locality"; Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems; Oct. 1994.
Manjikian, N.; Abdelrahman, T.; "Fusion of Loops for Parallelism and Locality"; IEEE Transactions on Parallel and Distributed Systems; vol. 8, Issue 2, pp. 193-209, Feb. 1997.
Sha, E.; Lang, C.; Passos, N.; "Polynomial-Time Nested Loop Fusion with Full Parallelism"; Proceedings of the 1996 International Conference on Parallel Processing; vol. 3, pp. 9-16, Aug. 1996.
McKinley, K.; Carr, S.; Tseng, C.; "Improving Data Locality with Loop Transformations"; ACM Transactions on Programming Languages and Systems; vol. 18, No. 4, pp. 424-453, Jul. 1996.
Chesney, D.; Cheng, B.; "Generalizing the Unimodular Approach"; International Conference on Parallel and Distributed Systems; pp. 398-404, Dec. 1994.
Pugh, W.; "Uniform Techniques for Loop Optimization"; Proceedings of the 1991 International Conference on Supercomputing; pp. 341-352.