“Compiler Transformations for High-Performance Computing”, D.F. Bacon et al., ACM Computing Surveys, Vol. 26, No. 4, Dec. 1994, pp. 345-420.* |
“Optimization of Data/Control Conditions in Task Graphs” M. Girkar et al. pp. 152-168, 1990s.* |
S. Amarasinghe, et al., “An Overview of the SUIF Compiler for Scalable Parallel Machines”, Proc. Of the 7th SIAM Conf. On Parallel Proc. For Scientific Computing, 1995. |
C. Ancourt, et al., “Automatic Data Mapping of Signal Processing Applications”, Proc. Intnl. Conf. On Applic.-Spec. Array Processors, Zurich, Switzerland, pp. 350-363, Jul. 1997. |
J. Anderson, et al., “Data and Computation Transformations for Multiprocessors”, in 5th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 39-50, Aug. 1995. |
M. Cierniak, et al., “Unifying Data and Control Transformation for Distributed Shared-Memory Machines”, Proc. Of the SIGPLAN '95 Conference on Programming Language Design and Implementation, La Jolla, pp. 205-217, Feb. 1995. |
K. Danckaert, et al., “Platform Independent Data Transfer and Storage Exploration Illustrated on a Parallel Cavity Detection Algorithm”, Proc. CSREA Conference on PAR. And Dist. Proc. Techniques and Applications, vol. 3, pp. 1669-1675, Las Vegas, NV, USA, Jun. 1999. |
K. Danckaert et al., “Strategy for Power-Efficient Design of Parallel Systems”, IEEE Trans. On Parallel and Distributed Systems, vol. 7, No. 11, pp. 1150-1163, Nov. 1996. |
C. Diderich, et al., “Solving the Constant-Degree Parallelism Alignment Problem”, Proc. EuroPar Conference, Lyon, France, Aug. 1996. Lecture Notes in Computer Science, Springer Verlag, pp. 451-454, 1996. |
M. Dion, et al., “Mapping Affine Loop Nests: New Results”, Lecture Notes in Computer Science, vol. 919, High Performance Computing and Networking, pp. 184-189, 1995. |
J.Z. Fang, et al., “An Iteration Partition Approach for Cache or Local Memory Thrashing on Parallel Processing”, IEEE Trans. On Computers, vol. C-42, No. 5, pp. 529-546, May 1993. |
F. H. M. Franssen, et al., “Modeling Multidimensional Data and Control Flow”, IEEE Trans. On VLSI Systems, vol. 1, No. 3, pp. 319-327, Sep. 1993. |
M. Lam, et al., “The Cache Performance and Optimizations of Blocked Algorithms”, In Proc. ASPLOS-IV, pp. 63-74, Santa Clara, CA 1991. |
L. Lamport, “The Parallel Execution of DO Loops”, Communications of the ACM, vol. 17, No. 2, pp. 83-93, Feb. 1974. |
K. McKinley, “A Compiler Optimization Algorithm for Shared-Memory Multiprocessors”, IEEE Trans. On Parallel and Distributed Systems, vol. 9, No. 8, Aug. 1998. |
P.R. Panda, et al., “Memory Data Organization for Improved Cache Performance in Embedded Processor Applications”, In Proc. ISSS-96, pp. 90-95, La Jolla, CA 1996. |
N.L. Passos, et al., “Achieving Full Parallelism Using Multidimensional Retiming”, IEEE Trans. On Parallel and Distributed Systems, vol. 7, No. 11, Nov. 1996. |
K. Pettis, et al., “Profile Guided Code Positioning”, In ACM SIGPLAN '90 Conference on Programming Language Design and Implementation, pp. 16-27, Jun. 1990. |
W. Shang, et al., “On Uniformization of Affine Dependence Algorithms”, IEEE Trans. On Computers, vol. 46, No. 7, Jul. 1996. |
M. E. Wolf, et al., “A Loop Transformation Theory and an Algorithm to Maximize Parallelism”, IEEE Trans. On Parallel and Distributed Systems, vol. 2, No. 4, Oct. 1991. |