Keller, R.M., “Look-Ahead Processors,” Computing Surveys, vol. 7, No. 4, Dec. 1975, pp. 177-195. |
Lam, M.S., “Instruction Scheduling For Superscalar Architectures,” Annu. Rev. Comput. Sci., vol. 4, 1990, pp. 173-201. |
Lightner et al., “The Metaflow Architecture,” pp. 11, 12, 63, 64, 67, and 68, Jun. 1991, IEEE Micro Magazine. |
Lightner, B.D. et al., “The Metaflow Lightning Chipset,” pp. 13-14 and 16, 1991 IEEE Publication. |
Melvin, S. and Patt, Y., “Exploiting Fine-Grained Parallelism Through a Combination of Hardware and Software Techniques”, Proceedings From ISCA-18, pp. 287-296, May 1990. |
Murakami, K. et al., “SIMP (Single Instruction stream/Multiple instruction Pipelining): A Novel High-Speed Single-Processor Architecture,” ACM, 1989, pp. 78-85. |
Patt, Y.N. et al., “Critical Issues Regarding HPS, A High Performance Microarchitecture,” Proceedings of the 18th Annual Workshop on Microprogramming, Dec. 1985, pp. 109-116. |
Patt, Y.N. et al., “HPS, A New Microarchitecture: Rationale and Introduction,” Proceedings of the 18th Annual Workshop on Microprogramming, Dec. 1985, pp. 103-108. |
Patt, Y.N. et al., “Run-Time Generation of HPS Microinstructions From a VAX Instruction Stream,” Proceedings of MICRO 19 Workshop, New York, New York, Oct. 1986, pp. 1-7. |
Peleg et al., “Future Trends in Microprocessors: Out-of-Order Execution, Spec. Branching and Their CISC Performance Potential”, Mar. 1991. |
Pleszkun, A.R. and Sohi, G.S., “The Performance Potential of Multiple Functional Unit Processors,” Proceedings of the 15th Annual Symposium on Computer Architecture, Jun. 1988, pp. 37-44. |
Pleszkun, A.R. et al., “WISQ: A Restartable Architecture Using Queues,” Proceedings of the 14th International Symposium on Computer Architecture, Jun. 1987, pp. 290-299. |
Popescu, V. et al., “The Metaflow Architecture”, IEEE Micro, vol. 11, No. 3, pp. 10-13 and 63-73, Jun. 1991. |
Smith, M.D. et al., “Boosting Beyond Static Scheduling in a Superscalar Processor,” IEEE, 1990, pp. 344-354. |
Smith, J.E. and Pleszkun, A.R., “Implementation of Precise Interrupts in Pipelined Processors,” Proceedings of the 12th Annual International Symposium on Computer Architecture, Jun. 1985, pp. 36-44. |
Acosta, Ramón D. et al., “An Instruction Issuing Approach to Enhancing Performance in Multiple Functional Unit Processors,” IEEE Transactions On Computers, vol. C-35, No. 9, Sep. 1986, pp. 815-828. |
Agerwala, T. and Cocke, J., “High Performance Reduced Instruction Set Processors,” IBM Research Division, Mar. 31, 1987, pp. 1-61. |
Aiken, A. and Nicolau, A., “Perfect Pipelining: A New Loop Parallelization Technique,” pp. 221-235. |
Butler, M. and Patt, Y., “An Improved Area-Efficient Register Alias Table for Implementing HPS”, University of Michigan, Ann Arbor, Michigan, Jan. 1990, pp. 1-15. |
Butler, M. and Patt, Y., “An Investigation of the Performance of Various Dynamic Scheduling Techniques”, Proceedings from MICRO-25, Dec. 1-4, 1992, pp. 1-9. |
Butler, M., “Single Instruction Stream Parallelism Is Greater Than Two,” Proceedings of ISCA-18, May 1990, pp. 276-286. |
Charlesworth, A.E., “An Approach to Scientific Array Processing: The Architectural Design of the AP-120B/FPS-164 Family,” Computer, vol. 14, Sep. 1981, pp. 18-27. |
Colwell, R.P. et al., “A VLIW Architecture for a Trace Scheduling Compiler,” Proceedings of the 2nd International Conference on Architectural Support for Programming Languages and Operating Systems, Oct. 1987, pp. 180-192. |
Dwyer, A Multiple, Out-of-Order Instruction Issuing System for Superscalar Processors, (All), Aug. 1991. |
Foster, C.C. and Riseman, E.M., “Percolation of Code to Enhance Parallel Dispatching and Execution,” IEEE Trans. On Computers, Dec. 1971, pp. 1411-1415. |
Gee, J. et al., “The Implementation of Prolog via VAX 8600 Microcode,” Proceedings of Micro 19, New York City, Oct. 1986, pp. 1-7. |
Goodman, J.R. and Hsu, W., “Code Scheduling and Register Allocation in Large Basic Blocks,” ACM, 1988, pp. 442-452. |
Gross, T.R. and Hennessy, J.L., “Optimizing Delayed Branches,” Proceedings of the 15th Annual Workshop on Microprogramming, Oct. 5-7, 1982, pp. 114-120. |
Groves, R.D. and Oehler, R., “An IBM Second Generation RISC Processor Architecture,” IEEE, 1989, pp. 134-137. |
Hennessy, J.L. et al., “Computer Architecture: A Quantitative Approach,” 1990, Ch. 6.4, 6.7 and p. 449. |
Horst, R.W. et al., “Multiple Instruction Issue in the NonStop Cyclone Processor,” IEEE, 1990, pp. 216-226. |
Hwu, W. and Patt, Y.N., “Checkpoint Repair for High-Performance Out-of-Order Execution Machines,” IEEE Trans. On Computers, vol. C-36, No. 12, Dec. 1987, pp. 1496-1514. |
Hwu, W. et al., “Design Choices for the HPSm Microprocessor Chip”, Proceedings of the Twentieth Annual Hawaii International Conference on System Sciences, 1987, pp. 330-336. |
Hwu, W. et al., “Experiments with HPS, a Restricted Data Flow Microarchitecture for High Performance Computers”, COMPCON 86, 1986. |
Hwu, W. and Chang, P.P., “Exploiting Parallel Microprocessor Microarchitectures with a Compiler Code Generator,” Proceedings of the 15th Annual Symposium on Computer Architecture, Jun. 1988, pp. 45-53. |
Hwu, W. et al., “An HPS Implementation of VAX: Initial Design and Analysis”, Proceedings of the Nineteenth Annual Hawaii International Conference on System Sciences, pp. 282-291, 1986. |
Hwu, W. and Patt, Y.N., “HPSm, a High Performance Restricted Data Flow Architecture Having Minimal Functionality”, Proceedings of the 13th International Symposium on Computer Architecture, pp. 297-306, Jun. 1986. |
Hwu, W. and Patt, Y.N., “HPSm2: A Refined Single-chip Microengine”, HICSS '88, pp. 30-40, 1988. |
IBM Journal of Research and Development, vol. 34, No. 1, Jan. 1990, pp. 1-70. |
Johnson, M., Superscalar Microprocessor Design, “Chapter 5-The Role of Exception Recovery”, pp. 87-102; “Chapter 6-Register Dataflow”, pp. 103-125, Prentice Hall, 1991. |
Johnson, W.M., Super-Scalar Processor Design, (Dissertation), Copyright 1989, 134 pages. |
Jouppi, N.P. and Wall, D.W., “Available Instruction-Level Parallelism for Superscalar and Superpipelined Machines,” Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems, Apr. 1989, pp. 272-282. |
Jouppi, N.P., “Integration and Packaging Plateaus of Processor Performance,” IEEE, 1989, pp. 229-232. |
Jouppi, N.P., “The Nonuniform Distribution of Instruction-Level and Machine Parallelism and Its Effect on Performance,” IEEE Transactions on Computers, vol. 38, No. 12, Dec. 1989, pp. 1645-1658. |
Kateveris, Hardware Support “Thesis,” 1984, pp. 138-145. |
Smith, M.D. et al., “Limits on Multiple Instruction Issue,” Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems, Apr. 1989, pp. 290-302. |
Sohi, G.S. and Vajapeyam, S., “Instruction Issue Logic for High Performance, Interruptable Pipelined Processors,” The 14th Annual International Symposium on Computer Architecture, Jun. 2-5, 1987, pp. 27-34. |
Swenson, J.A. and Patt, Y.N., “Hierarchical Registers for Scientific Computers”, St. Malo '88, University of California at Berkeley, 1988, pp. 346-353. |
Thornton, J.E., Design of a Computer: The Control Data 6600, Control Data Corporation, 1970, pp. 58-140. |
Tjaden, G.S. et al., “Detection and Parallel Execution of Independent Instructions,” IEEE Trans. On Computers, vol. C-19, No. 10, Oct. 1970, pp. 889-895. |
Tjaden, G.S. and Flynn, M.J., “Representation of Concurrency with Ordering Matrices,” IEEE Trans. On Computers, vol. C-22, No. 8, Aug. 1973, pp. 752-761. |
Tjaden, G.S., Representation and Detection of Concurrency Using Ordering Matrices, (Dissertation), 1972, pp. 1-199. |
Tomasulo, R.M., “An Efficient Algorithm for Exploiting Multiple Arithmetic Units,” IBM Journal, vol. 11, Jan. 1967, pp. 25-33. |
Uht, A.K., “An Efficient Hardware Algorithm to Extract Concurrency From General-Purpose Code,” Proceedings of the 19th Annual Hawaii International Conference on System Sciences, 1986, pp. 41-50. |
Uvieghara, G.A. et al., “An Experimental Single-Chip Data Flow CPU”, Symposium on ULSI Circuits Design Digest of Technical Papers, May 1990. |
Uvieghara, G.A. et al., “An Experimental Single-Chip Data Flow CPU,” IEEE Journal of Solid-State Circuits, vol. 27, No. 1, pp. 17-28, Jan. 1992. |
Wedig, R.G., Detection of Concurrency In Directly Executed Language Instruction Streams, (Dissertation), Jun. 1982, pp. 1-179. |
Weiss, S. et al., “Instruction Issue Logic in Pipelined Supercomputers,” Reprinted from IEEE Trans. on Computers, vol. C-33, No. 11, Nov. 1984, pp. 1013-1022. |
Wilson, J.E. et al., “On Tuning the Microarchitecture of an HPS Implementation of the VAX”, Proceedings of Micro 20, Dec. 1987, pp. 162-167. |