Swamy Punyamurtula et al., “Minimum Dependence Distance Tiling of Nested Loops with Non-uniform Dependences”, IEEE pp. 74-81, Apr. 1994.* |
Josep Torrellas et al., “Optimizing Instruction Cache Performance for Operating System Intensive Workloads”, IEEE pp. 360-369, Feb. 1995.* |
Michael E. Wolf et al., “Combining Loop Transformations Considering Caches and Scheduling”, IEEE pp. 274-286, Jan. 1996.* |
Lam et al., “The Cache Performance and Optimizations of Blocked Algorithms”, ACM, pp. 63-74, 1991. |
Llaberia et al., “Performance evaluation of tiling for the register level”, IEEE pp. 254-265, Feb. 1998.* |
Ferrante et al., “Hierarchical tiling for improved superscalar performance”, IEEE pp. 239-245, Apr. 1995. |