Acosta, Ramón D. et al., “An Instruction Issuing Approach to Enhancing Performance in Multiple Functional Unit Processors,” IEEE Transactions On Computers, vol. C-35, No. 9, Sep. 1986, pp. 815-828. |
Agerwala et al., “High Performance Reduced Instruction Set Processors,” IBM Research Division, Mar. 31, 1987, pp. 1-61. |
Aiken, A. and Nicolau, A., “Perfect Pipelining: A New Loop Parallelization Technique,” pp. 221-235. |
Butler, M. and Patt, Y., “An Improved Area-Efficient Register Alias Table for Implementing HPS,” University of Michigan, Ann Arbor, Michigan, Jan. 1990, pp. 1-15. |
Butler, M. et al., “Single Instruction Stream Parallelism Is Greater Than Two,” Proceedings of ISCA-18, May 1991, pp. 276-286. |
Charlesworth, A.E., “An Approach to Scientific Array Processing: The Architectural Design of the AP-120B/FPS-164 Family,” Computer, vol. 14, Sep. 1981, pp. 18-27. |
Colwell, et al., “A VLIW Architecture for a Trace Scheduling Compiler,” Proceedings of the 2nd International Conference on Architectural Support for Programming Languages and Operating Systems, Oct. 1987, pp. 180-192. |
Dwyer, “A Multiple, Out-of-Order, Instruction Issuing System For Superscalar Processors,” (All); Aug. 1991. |
Foster et al., “Percolation of Code to Enhance Parallel Dispatching and Execution,” IEEE Trans. On Computers, Dec. 1971, pp. 1411-1415. |
Gee, J. et al., “The Implementation of Prolog via VAX 8600 Microcode,” Proceedings of Micro 19, New York City, Oct. 1986, pp. 1-7. |
Goodman, J.R. and Hsu, W., “Code Scheduling and Register Allocation in Large Basic Blocks,” ACM, 1988, pp. 442-452. |
Gross et al., “Optimizing Delayed Branches,” Proceedings of the 15th Annual Workshop on Microprogramming, Oct. 5-7, 1982, pp. 114-120. |
Groves, R.D. and Oehler, R., “An IBM Second Generation RISC Processor Architecture,” IEEE, 1989, pp. 134-137. |
Hennessy, J.L. and Patterson, D.A., Computer Architecture: A Quantitative Approach, 1990, Ch. 6.4, 6.7 and p. 449. |
Horst, R.W. et al., “Multiple Instruction Issue in the NonStop Cyclone Processor,” IEEE, 1990, pp. 216-226. |
Hwu, W. et al., “An HPS Implementation of VAX: Initial Design and Analysis,” Proceedings of the Nineteenth Annual Hawaii International Conference on System Sciences, 1986, pp. 282-291. |
Hwu, W. et al., “Checkpoint Repair for High-Performance Out-of-Order Execution Machines,” IEEE Trans. On Computers, vol. C-36, No. 12, Dec. 1987, pp. 1496-1514. |
Hwu, W. and Patt, Y.N., “Design Choices for the HPSm Microprocessor Chip,” Proceedings of the Twentieth Annual Hawaii International Conference on System Sciences, 1987, pp. 330-336. |
Hwu, W. et al., “Experiments with HPS, a Restricted Data Flow Microarchitecture for High Performance Computers,” COMPCON 86, 1986. |
Hwu, W. et al., “Exploiting Parallel Microprocessor Microarchitectures with a Compiler Code Generator,” Proceedings of the 15th Annual Symposium on Computer Architecture, Jun. 1988, pp. 45-53. |
Hwu, W. and Patt, Y.N., “HPSm, a High Performance Restricted Data Flow Architecture Having Minimal Functionality,” Proceedings of the 13th International Symposium on Computer Architecture, Jun. 1986, pp. 297-306. |
Hwu, W. and Patt, Y.N., “HPSm2: A Refined Single-chip Microengine,” HICSS '88, 1988, pp. 30-40. |
Johnson, William M., Super-Scalar Processor Design, (Dissertation), Copyright 1989, 134 pages. |
Jouppi et al., “Available Instruction-Level Parallelism for Superscalar and Superpipelined Machines,” Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems, Apr. 1989, pp. 272-282. |
Jouppi, N.P., “Integration and Packaging Plateaus of Processor Performance,” IEEE, 1989, pp. 229-232. |
Jouppi, N.P., “The Nonuniform Distribution of Instruction-Level and Machine Parallelism and Its Effect on Performance,” IEEE Transactions on Computers, vol. 38, No. 12, Dec. 1989, pp. 1645-1658. |
Keller, “Look-Ahead Processors”; Dec. 1975, pp. 177-194. |
Lam, M.S., “Instruction Scheduling For Superscalar Architectures,” Annu. Rev. Comput. Sci., vol. 4, 1990, pp. 173-201. |
Lightner et al., “The Metaflow Architecture”, IEEE Micro Magazine, Jun. 1991, pp. 11-12 and 63-68. |
Lightner et al., “The Metaflow Lightning Chip Set,” IEEE, Mar. 1991; “Lightning Outlined,” Microprocessor Report, Sep. 1990. |
Melvin, S. and Patt, Y., “Exploiting Fine-Grained Parallelism Through a Combination of Hardware and Software Techniques,” Proceedings of ISCA-18, May 1991, pp. 287-296. |
Murakami, K. et al., “SIMP (Single Instruction stream/Multiple instruction Pipelining): A Novel High-Speed Single-Processor Architecture,” ACM, 1989, pp. 78-85. |
Patt, Y.N. et al., “Critical Issues Regarding HPS, A High Performance Microarchitecture,” The 18th Annual Workshop on Microprogramming, Pacific Grove, California, Dec. 3-6, 1985, IEEE Computer Society Order No. 653, pp. 109-116. |
Patt, Y.N. et al., “HPS, A New Microarchitecture: Rationale and Introduction,” The 18th Annual Workshop on Microprogramming, Pacific Grove, California, Dec. 3-6, 1985; IEEE Computer Society Order No. 653, pp. 103-108. |
Patt, Y.N. et al., “Run-Time Generation of HPS Microinstructions From a VAX Instruction Stream,” Proceedings of MICRO 19 Workshop, New York, New York, Oct. 1986, pp. 1-7. |
Peleg et al., “Future Trends in Microprocessors: Out-of-Order Execution, Spec. Branching and Their CISC Performance Potential”, Mar. 1991. |
Pleszkun et al., “The Performance Potential of Multiple Functional Unit Processors,” Proceedings of the 15th Annual Symposium on Computer Architecture, Jun. 1988, pp. 37-44. |
Pleszkun et al., “WISQ: A Restartable Architecture Using Queues,” Proceedings of the 14th International Symposium on Computer Architecture, Jun. 1987, pp. 290-299. |
Smith, M.D. et al., “Boosting Beyond Static Scheduling in a Superscalar Processor,” IEEE, 1990, pp. 344-354. |
Smith, et al., “Implementation of Precise Interrupts in Pipelined Processors,” Proceedings of the 12th Annual International Symposium on Computer Architecture, Jun. 1985, pp. 36-44. |
Smith, M.D. et al., “Limits on Multiple Instruction Issue,” Computer Architecture News, No. 2, Apr. 17, 1989, pp. 290-302. |
Sohi, G.S. et al., “Instruction Issue Logic for High Performance, Interruptable Pipelined Processors,” The 14th Annual International Symposium on Computer Architecture, Jun. 2-5, 1987, pp. 27-34. |
Swenson, J.A. and Patt, Y.N., “Hierarchical Registers for Scientific Computers,” St. Malo '88, University of California at Berkeley, 1988, pp. 346-353. |
Thornton, J.E., Design of a Computer: The Control Data 6600, Control Data Corporation, 1970, pp. 58-140. |
Tjaden et al., “Detection and Parallel Execution of Independent Instructions,” IEEE Trans. On Computers, vol. C-19, No. 10, Oct. 1970, pp. 889-895. |
Tjaden, et al., “Representation of Concurrency with Ordering Matrices,” IEEE Trans. On Computers, vol. C-22, No. 8, Aug. 1973, pp. 752-761. |
Tjaden, Representation and Detection of Concurrency Using Ordering Matrices, (Dissertation), 1972, pp. 1-199. |
Tomasulo, R.M., “An Efficient Algorithm for Exploiting Multiple Arithmetic Units,” IBM Journal, vol. 11, Jan. 1967, pp. 25-33. |
Uht, A.K., “An Efficient Hardware Algorithm to Extract Concurrency From General-Purpose Code,” Proceedings of the 19th Annual Hawaii International Conference on System Sciences, 1986, pp. 41-50. |
Uvieghara, G.A. et al., “An Experimental Single-Chip Data Flow CPU,” IEEE Journal of Solid-State Circuits, vol. 27, No. 1, Jan. 1992, pp. 17-28. |
Uvieghara, G.A. et al., “An Experimental Single-Chip Data Flow CPU,” Symposium on VLSI Circuits Design Digest of Technical Papers, May 1990. |
Wedig, R.G., Detection of Concurrency In Directly Executed Language Instruction Streams, (Dissertation), Jun. 1982, pp. 1-179. |
Weiss et al., “Instruction Issue Logic in Pipelined Supercomputers,” Reprinted from IEEE Trans. on Computers, vol. C-33, No. 11, Nov. 1984, pp. 1013-1022. |
Wilson, J.E. et al., “On Tuning the Microarchitecture of an HPS Implementation of the VAX,” Proceedings of Micro 20, Dec. 1987, pp. 162-167. |
IBM Journal of Research and Development, vol. 34, No. 1, Jan. 1990, pp. 1-70. |