Acosta, R. D. et al., “An Instruction Issuing Approach to Enhancing Performance in Multiple Functional Unit Processors,” IEEE Transactions On Computers, IEEE, vol. C-35, No. 9, pp. 815-828 (Sep. 1986). |
Agerwala, T. and Cocke, J., “High Performance Reduced Instruction Set Processors,” IBM Research Division, pp. 1-61 (Mar. 31, 1987). |
Aiken, A. and Nicolau, A., “Perfect Pipelining: A New Loop Parallelization Technique,” Proceedings of the 1988 ESOP, Springer-Verlag, pp. 221-235 (1988). |
Charlesworth, A.E., “An Approach to Scientific Array Processing: The Architectural Design of the AP-120B/FPS-164 Family,” Computer, IEEE, vol. 14, pp. 18-27 (Sep. 1981). |
Colwell, R.P. et al., “A VLIW Architecture for a Trace Scheduling Compiler,” Proceedings of the 2nd International Conference on Architectural Support for Programming Languages and Operating Systems, ACM, pp. 180-192 (Oct. 1987). |
Dwyer, H., A Multiple, Out-of-Order Instruction Issuing System for Superscalar Processors, UMI, pp. 1-259 (Aug. 1991). |
Foster, C.C. and Riseman, E.M., “Percolation of Code to Enhance Parallel Dispatching and Execution,” IEEE Transactions On Computers, IEEE, pp. 1411-1415 (Dec. 1971). |
Goodman, J.R. and Hsu, W., “Code Scheduling and Register Allocation in Large Basic Blocks,” International Conference on Supercomputing, ACM, pp. 442-452 (1988). |
Gross, T.R. and Hennessy, J.L., “Optimizing Delayed Branches,” Proceedings of the 5th Annual Workshop on Microprogramming, IEEE, pp. 114-120 (Oct. 5-7, 1982). |
Groves, R.D. and Oehler, R., “An IBM Second Generation RISC Processor Architecture,” Proceedings 1989 IEEE International Conference on Computer Design: VLSI in Computers and Processors, IEEE, pp. 134-137 (Oct. 1989). |
Horst, R.W. et al., “Multiple Instruction Issue in the NonStop Cyclone Processor,” Proceedings of the 17th Annual International Symposium on Computer Architecture, IEEE, pp. 216-226 (1990). |
Hwu, W-M. W. and Patt, Y.N., “Checkpoint Repair for High-Performance Out-of-Order Execution Machines,” IEEE Trans. On Computers, IEEE, vol. C-36, No. 12, pp. 1496-1514 (Dec. 1987). |
Hwu, W-M. W. and Chang, P.P., “Exploiting Parallel Microprocessor Microarchitectures with a Compiler Code Generator,” Proceedings of the 15th Annual Symposium on Computer Architecture, IEEE, pp. 45-53 (Jun. 1988). |
Hwu, W-M. and Patt, Y.N., “HPSm, a High Performance Restricted Data Flow Architecture Having Minimal Functionality,” Proceedings from ISCA-13, IEEE, pp. 297-306 (Jun. 2-5, 1986). |
IBM Journal of Research and Development, IBM, vol. 34, No. 1, pp. 1-70 (Jan. 1990). |
Johnson, M., Superscalar Microprocessor Design, Prentice-Hall, Entire book submitted (1991). |
Johnson, W.M., Super-scalar Processor Design, (Dissertation), 134 pages (1989). |
Jouppi, N.P. and Wall, D.W., “Available Instruction-Level Parallelism for Superscalar and Superpipelined Machines,” Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems, ACM, pp. 272-282 (Apr. 1989). |
Jouppi, N.P., “Integration and Packaging Plateaus of Processor Performance,” International Conference of Computer Design, IEEE, pp. 229-232 (Oct. 1989). |
Jouppi, N.P., “The Nonuniform Distribution of Instruction-Level and Machine Parallelism and Its Effect on Performance,” IEEE Transactions on Computers, IEEE, vol. 38, No. 12, pp. 1645-1658 (Dec. 1989). |
Keller, R.M., “Look-Ahead Processors,” Computing Surveys, ACM, vol. 7, No. 4, pp. 177-195 (Dec. 1975). |
Lam, M.S., “Instruction Scheduling For Superscalar Architectures,” Annu. Rev. Comput. Sci., Annual Reviews, vol. 4, pp. 173-201 (1990). |
Lightner, B.D. and Hill, G., “The Metaflow Lightning Chipset,” Compcon Spring 91, IEEE, pp. 13-16 (Feb. 25-Mar. 1, 1991). |
Murakami, K. et al., “SIMP (Single Instruction stream/Multiple instruction Pipelining): A Novel High-Speed Single-Processor Architecture,” Proc. 16th Int. Symp. on Computer Architecture, ACM, pp. 78-85 (Jun. 1989). |
Patt, Y.N. et al., “Critical Issues Regarding HPS, A High Performance Microarchitecture,” Proceedings of 18th Annual Workshop on Microprogramming, IEEE, pp. 109-116 (Dec. 3-6, 1985). |
Patt, Y.N. et al., “HPS, A New Microarchitecture: Rationale and Introduction,” The 18th Annual Workshop on Microprogramming, Pacific Grove, CA, Dec. 3-6, 1985, IEEE Computer Society Order No. 653, pp. 103-108. |
Patterson, D.A. and Hennessy, J.L., Computer Architecture: A Quantitative Approach, Morgan Kaufmann Publishers, pp. 257-278, 290-314 and 449 (1990). |
Peleg, A. and Weiser, U., “Future Trends in Microprocessors: Out-of-Order Execution, Speculative Branching and their CISC Performance Potential,” IEEE, pp. 263-266 (1991). |
Pleszkun, A.R. and Sohi, G.S., “The Performance Potential of Multiple Functional Unit Processors,” Proceedings of the 15th Annual Symposium on Computer Architecture, IEEE, pp. 37-44 (Jun. 1988). |
Pleszkun, A.R. et al., “WISQ: A Restartable Architecture Using Queues,” Proceedings of the 14th International Symposium on Computer Architecture, ACM, pp. 290-299 (Jun. 1987). |
Popescu, V. et al., “The Metaflow Architecture,” IEEE Micro, IEEE, vol. 11, No. 3, pp. 10-13 and 63-73 (Jun. 1991). |
Smith, M.D. et al., “Boosting Beyond Static Scheduling in a Superscalar Processor,” International Symposium on Computer Architecture, IEEE, pp. 344-354 (May 1990). |
Smith, J.E. and Pleszkun, A.R., “Implementation of Precise Interrupts in Pipelined Processors,” Proceedings of the 12th Annual International Symposium on Computer Architecture, IEEE, pp. 35-44 (Jun. 1985). |
Smith, M.D. et al., “Limits on Multiple Instruction Issue,” Computer Architecture News, ACM, No. 2, pp. 290-302 (Apr. 3-6, 1989). |
Sohi, G.S. and Vajapeyam, G.S., “Instruction Issue Logic For High-Performance, Interruptable Pipelined Processors,” Conference Proceedings of the 14th Annual International Symposium on Computer Architecture, pp. 27-34 (Jun. 2-5, 1987). |
Thornton, J.E., Design of a Computer: The Control Data 6600, Control Data Corporation, pp. 58-140 (1970). |
Tjaden, G.S. and Flynn, M.J., “Detection and Parallel Execution of Independent Instructions,” IEEE Trans. On Computers, IEEE, vol. C-19, No. 10, pp. 889-895 (Oct. 1970). |
Tjaden, G.S. and Flynn, M.J., Representation and Detection of Concurrency Using Ordering Matrices, (Dissertation), UMI, pp. 1-199 (1972). |
Tjaden et al., “Representation of Concurrency with Ordering Matrices,” IEEE Transactions On Computers, IEEE, vol. C-22, No. 8, pp. 752-761 (Aug. 1973). |
Tomasulo, R.M., “An Efficient Algorithm for Exploiting Multiple Arithmetic Units,” IBM Journal, IBM, vol. 11, pp. 25-33 (Jan. 1967). |
Uht, A.K., “An Efficient Hardware Algorithm to Extract Concurrency From General-Purpose Code,” Proceedings of the 19th Annual Hawaii International Conference on System Sciences, HICSS, pp. 41-50 (1986). |
Wedig, R.G., Detection of Concurrency In Directly Executed Language Instruction Streams, (Dissertation), pp. 1-179 (Jun. 1982). |
Weiss, S. and Smith, J.E., “Instruction Issue Logic in Pipelined Supercomputers,” IEEE Trans. on Computers, IEEE, vol. C-33, No. 11, pp. 1013-1022 (Nov. 1984). |