System and Method for Vector Computations in Arithmetic Logic Units (ALUs)

Information

  • Patent Application
  • Publication Number
    20070182746
  • Date Filed
    December 13, 2006
  • Date Published
    August 09, 2007
Abstract
The present disclosure describes implementations for processing instructions and data across multiple Arithmetic Logic Units (ALUs). In one implementation, a graphics processing apparatus comprises a plurality of ALUs configured to process independent instructions in parallel. Pre-processing logic is configured to receive instructions and associated data to be directed to one of the plurality of ALUs for processing from a register file, the pre-processing logic being configured to selectively format received instructions for delivery to a plurality of the ALUs. In addition, post-processing logic is configured to receive data output from the plurality of the ALUs and deliver the received data to the register file for write-back, the post-processing logic being configured to selectively format data output from a plurality of the ALUs for delivery to the register file as though the data had been output by a single ALU.
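The data path summarized in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the circuit of the application: the function names (`pre_process`, `run_alus`, `post_process`), the `spread` flag, and the (ALU index, cycle, value) bookkeeping are hypothetical, and the four-ALU count is borrowed from claim 2.

```python
from typing import Callable, List, Tuple

NUM_ALUS = 4  # assumption: four ALUs, the count recited in claim 2

# A delivery is (alu_index, cycle, value); this tuple layout is hypothetical.
Delivery = Tuple[int, int, float]


def pre_process(data: List[float], spread: bool) -> List[Delivery]:
    """Model of the pre-processing logic.

    spread=False: every item goes to ALU 0 in successive cycles.
    spread=True:  consecutive items are formatted for different ALUs.
    """
    if spread:
        return [(i % NUM_ALUS, i // NUM_ALUS, v) for i, v in enumerate(data)]
    return [(0, i, v) for i, v in enumerate(data)]


def run_alus(deliveries: List[Delivery],
             op: Callable[[float], float]) -> List[Delivery]:
    """Each ALU applies the same operation to whatever it was handed."""
    return [(alu, cycle, op(v)) for alu, cycle, v in deliveries]


def post_process(outputs: List[Delivery]) -> List[float]:
    """Model of the post-processing logic.

    Reorders the results so the register file sees them in the original
    item order, as though a single ALU had produced them.
    """
    ordered = sorted(outputs, key=lambda t: (t[1], t[0]))  # by cycle, then ALU
    return [result for _, _, result in ordered]


def double(x: float) -> float:
    return 2.0 * x


data = [1.0, 2.0, 3.0, 4.0, 5.0]
single = post_process(run_alus(pre_process(data, spread=False), double))
multi = post_process(run_alus(pre_process(data, spread=True), double))
```

In this model `single` and `multi` hold the same values in the same order, which is the property the abstract attributes to the post-processing logic: the write-back data looks the same whether one ALU or several did the work.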
Description

BRIEF DESCRIPTION OF THE DRAWINGS

Many aspects of the disclosure can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views.



FIG. 1 is a block diagram illustrating a portion of a pipelined processor architecture, as is known in the prior art.



FIG. 2 is a block diagram similar to FIG. 1, but illustrating multiple ALUs configured to process instructions and/or associated data in parallel, as is known in the prior art.



FIGS. 3A and 3B are block diagrams illustrating components of an architecture constructed in accordance with embodiments of the present invention.



FIG. 4 is a block diagram illustrating components of an architecture constructed in accordance with embodiments of the present invention.



FIG. 5 is a flowchart illustrating certain high-level operations of a method executed in accordance with embodiments of the invention.



FIG. 6 is a block diagram illustrating components of an architecture constructed in accordance with embodiments of the present invention.



FIGS. 7A and 7B are block diagrams illustrating components of an architecture constructed in accordance with an alternative embodiment of the present invention.


Claims
  • 1. A graphics processing apparatus comprising: a plurality of arithmetic logic units (ALUs) receive instructions and associated data configured for processing in parallel; pre-processing logic configured to receive instructions and associated data to be directed to one of the plurality of ALUs for processing from a register file, the pre-processing logic being configured to selectively format received instructions for delivery to a plurality of the ALUs; post-processing logic configured to receive data output from the plurality of the ALUs and deliver the received data to the register file for write-back, the post-processing logic being configured to selectively format data output from a plurality of the ALUs for delivery to the register file as though the data had been output by a single ALU.
  • 2. The graphics processing apparatus of claim 1, wherein the plurality of ALUs consists of precisely four ALUs.
  • 3. The graphics processing apparatus of claim 1, wherein the pre-processing logic comprises logic configured to perform shift and delay operations.
  • 4. The graphics processing apparatus of claim 3, wherein the pre-processing logic is configured to progressively shift and delay the received data across the plurality of ALUs, such that for each additional ALU to be delivered instructions or associated data, there is an additional shift and delay operation performed in the pre-processing logic.
  • 5. The graphics processing apparatus of claim 1, wherein the post-processing logic comprises logic configured to perform shift and delay operations.
  • 6. The graphics processing apparatus of claim 5, wherein the post-processing logic is configured to progressively shift and delay the received data from the plurality of ALUs, such that for each additional ALU to deliver data, there is an additional shift and delay operation performed in the post-processing logic.
  • 7. The graphics processing apparatus of claim 1, further including indication logic configured to indicate whether the pre-processing logic should selectively format received instructions and data, and wherein the pre-processing logic is further configured to either format the received instructions for delivery to a single one of the ALUs or to a plurality of the ALUs depending on a state of the indication logic.
  • 8. The graphics processing apparatus of claim 1, wherein the pre-processing logic is configured to selectively format received instructions based on an output of indication logic, which indication logic indicates whether a current instruction and associated data are to be processed in a horizontal or a vertical mode.
  • 9. A graphics processing apparatus comprising: a register file; logic for managing a plurality of threads; a plurality of arithmetic logic units (ALUs); and data configuring logic capable of selectively configuring consecutive data in the register file associated with a given processing thread to be successively delivered to a single one of the ALUs in response to a first processing mode, said data configuring logic capable of selectively configuring consecutive data in the register file associated with a given processing thread to be successively delivered to different ones of the ALUs in response to a second processing mode.
  • 10. The graphics processing apparatus of claim 9, wherein the first processing mode is a horizontal instruction mode.
  • 11. The graphics processing apparatus of claim 9, wherein the second processing mode is a vertical instruction mode.
  • 12. The graphics processing apparatus of claim 9, wherein the first processing mode is identified with an execution of a first shader program.
  • 13. The graphics processing apparatus of claim 12, wherein the second processing mode is identified with an execution of a second shader program, the second shader program being different than the first shader program.
  • 14. A method for processing instructions and data comprising: receiving instructions and associated data from a register file; determining which one of two modes is active for the received instructions and associated data; delivering the instructions and data directly to a plurality of arithmetic logic units (ALUs) for processing, without reorganizing, when a first mode is active; and reorganizing the instructions and data, and then delivering the instructions and data to the plurality of ALUs for processing, when a second mode is active.
  • 15. The method of claim 14, wherein the first mode is a horizontal mode.
  • 16. The method of claim 14, wherein the second mode is a vertical mode.
  • 17. The method of claim 14, wherein the reorganizing further comprises shifting and delaying the instructions and data.
  • 18. A method for processing operations in a plurality of arithmetic logic units (ALUs) comprising: retrieving an instruction and associated data from a register file; determining a mode of operation; delivering the retrieved instruction and associated data directly to the plurality of ALUs if the mode is determined to be a horizontal mode; and reformatting the retrieved instruction and associated data such that items originally formatted for delivery to adjacent ALUs are reformatted for delivery into a single ALU, and thereafter delivering the reformatted instruction and associated data to the plurality of ALUs.
  • 19. The method of claim 18, wherein the operations of the method collectively function to process instructions and associated data of different threads.
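The progressive shift and delay of claims 4 and 6, together with the mode selection of claims 14 through 17, can be sketched as a cycle-by-cycle schedule. This is one possible reading, offered as an illustration only: the claims do not state the amount of delay per stage, so the model assumes one cycle of delay per additional ALU, and the names `skew`, `deskew`, and `deliver` are hypothetical.

```python
from typing import List, Optional

Row = List[Optional[int]]


def skew(rows: List[List[int]], num_alus: int) -> List[Row]:
    """Pre-processing side: ALU k sees its item k cycles late.

    rows[t][k] is the item bound for ALU k that left the register file in
    cycle t. Assumption: each additional ALU adds exactly one cycle of delay.
    The result sched[c][k] is what ALU k receives in cycle c (None = idle).
    """
    total_cycles = len(rows) + num_alus - 1
    sched: List[Row] = [[None] * num_alus for _ in range(total_cycles)]
    for t, row in enumerate(rows):
        for k in range(num_alus):
            sched[t + k][k] = row[k]
    return sched


def deskew(sched: List[Row], num_alus: int) -> List[Row]:
    """Post-processing side: undo the skew so each row lines up again."""
    num_rows = len(sched) - num_alus + 1
    return [[sched[t + k][k] for k in range(num_alus)]
            for t in range(num_rows)]


def deliver(rows: List[List[int]], mode: str, num_alus: int = 4) -> List[Row]:
    """Mode selection in the spirit of claim 14.

    "horizontal" (first mode): pass the data through without reorganizing.
    "vertical" (second mode): reorganize by shifting and delaying.
    """
    if mode == "horizontal":
        return [list(row) for row in rows]
    if mode == "vertical":
        return skew(rows, num_alus)
    raise ValueError(f"unknown mode: {mode}")


rows = [[1, 2, 3, 4], [5, 6, 7, 8]]
vertical = deliver(rows, "vertical")
restored = deskew(vertical, 4)
```

With two input rows and four ALUs the skewed schedule spans five cycles, and `deskew` recovers the original rows. A hardware implementation would realize the same offsets with delay registers rather than list indexing.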
Provisional Applications (1)
Number Date Country
60765654 Feb 2006 US