1. Technical Field
The present invention relates to processors in general, and, in particular, to registers within a processor. Still more particularly, the present invention relates to an apparatus for increasing the ability of a processor to address registers.
2. Description of Related Art
Within a processor, registers can be used to store various data intended for manipulations. When it comes to data manipulations, registers are preferred over a system memory in many aspects. For example, registers can typically be designated by fewer bits in instructions than locations in system memory require to be addressed. In addition, registers have higher bandwidth and shorter access time than most system memories. Furthermore, registers are relatively straight-forward to design and test. Thus, modern processor architectures tend to have a relatively large number of registers.
Although the performance of a processor can generally be improved by increasing the number of registers within the processor, a large number of registers can also present its own problems. One of the problems is register addressability. If a processor includes a large number of addressable registers, each instruction having one or more register designations would require many bits to be allocated solely for the purpose of addressing registers. For example, if a processor has thirty-two registers, a total of twenty bits will be required to designate four registers within an instruction because five bits are needed to address all thirty-two registers. Thus, the maximum number of registers that can be provided within a processor architecture is effectively constrained.
Indirection is a technique that has been used to access large register files. An indirection mechanism useful for extending an architecture such as the PowerPC™ architecture to accommodate potentially very large register files should satisfy the following objectives:
Prior art indirection mechanisms for accessing large register files fail to meet one or more of the above-mentioned objectives. Such prior art indirection mechanisms include:
Consequently, it would be desirable to provide an improved apparatus for increasing the ability of a processor to address registers.
In accordance with a preferred embodiment of the present invention, an apparatus for increasing addressability of registers within a processor includes a set of apparent registers and a set of actual registers. The total number of actual registers is substantially higher than the total number of apparent registers such that only a subset of the actual registers is referenced by all of the apparent registers at any given time. Any one of the actual registers can be designated by an instruction via one of the apparent registers. Any one of the actual registers can also be directly designated by an instruction.
All features and advantages of the present invention will become apparent in the following detailed written description.
The invention itself, as well as a preferred mode of use, further objects, and advantages thereof, will best be understood by reference to the following detailed description of an illustrative embodiment when read in conjunction with the accompanying drawings, wherein:
The present invention may be implemented in reduced instruction set computing (RISC) processors or complex instruction set computing (CISC) processors. For the purpose of illustration, a preferred embodiment of the present invention, as described below, is implemented on a RISC processor, such as the PowerPC™ family processor manufactured by the International Business Machines Corporation of Armonk, N.Y.
Referring now to the drawings and in particular to
Instruction unit 15 dispatches instructions as appropriate to execution units such as an integer unit 16, a load/store unit 17 and/or a floating-point unit 18. Integer unit 16 performs add, subtract, multiply, divide, shift or rotate operations on integers, retrieving operands from and storing results to general-purpose registers 13. Floating-point unit 18 performs single-precision and/or double-precision multiply/add operations, retrieving operands from and storing results to floating-point registers 14. Load/store unit 17 loads instruction operands from data cache 11 into general-purpose registers 13 or floating-point registers 14, as needed, and stores instructions results when available from general-purpose registers 13 or floating-point registers 14 into data cache 11.
A completion unit 19, which includes multiple reorder buffers, operates in conjunction with instruction unit 15 to support out-of-order instruction processing. Completion unit 19 also operates in connection with rename buffers within general-purpose registers 13 and floating-point registers 14 to avoid any conflict in a specific register for instruction results.
In accordance with a preferred embodiment of the present invention, a set of apparent registers is used to increase the addressability of a set of actual registers within a processor. The apparent registers are addressed in a space called the apparent register name space, and the actual registers are addressed in a larger space called the actual register name space. Entries in the apparent registers refer to names of registers in the actual register name space. The apparent register name space is directly addressable by a register number used in an instruction. On the other hand, the actual register name space can be addressed either directly (from some instructions) or indirectly through values stored in the apparent registers.
With reference now to
Actual registers 22 also include multiple register entries. The number of bits in each apparent register entry is large enough to address the number of provided actual registers, possibly allowing space for future growth in that number. The total number of registers within actual registers 22 preferably equals to at least two to the power of the number of bits within a register entry of apparent registers 21. For example, if the number of bits within each register entry of apparent register 21 is five, then the total number of registers within actual registers 22 is thirty-two; if the number of bits within each register entry of apparent registers 21 is six, then the total number of registers within actual registers 22 is sixty-four. In the embodiment shown in
During operation, a register within actual registers 22 is selected by the bits within a register entry of apparent registers 21, which is selected by the bits within an apparent register field of an instruction, such as instruction 23. For example, as shown in
An instruction may include two different types of register fields. As shown in
The control of apparent registers 21 can be designed in a way that is appropriate for the processor architecture in which the present invention is incorporated. In the PowerPC™ architecture, for example, it may be appropriate to control the mapping through two or more special purpose registers.
The register re-mapping of the present invention can be applied to different sets of registers independently. For example, in the context of the PowerPC™ architecture, the register re-mapping may be applied to vector registers and to floating-point registers but not to general-purpose registers.
A special instruction can be used to change the mapping of some or all of the register entries within apparent registers 21 at once. Certain instructions, such as Load and Store instructions, which can directly address the entire set of actual registers 22, at the cost of providing, in the instruction, sufficient bits to address the fill complement of the actual registers.
The result of a standard arithmetic instruction can be moved into one of apparent registers 21 and that value is then used to address one of actual registers 22. Doing so would make it easy to perform arithmetic on the register mapping, such as incrementing a register entry number within apparent registers 21. Such mapping is similar to incrementing an address register used to address a conventional system memory.
Additionally, instructions may be provided which increment and/or decrement, by one or by an integer specified in the instruction, the value in a specific apparent register.
It is also valuable to allow different mappings for each of the register designations commonly used in an instruction set, and to allow different mappings for each of the source operands. Thus, one mapping could be used for target operands, and another used for source operands. For example, in the PowerPC™ architecture, one mapping is for vD, and a separate one for vA and vB. The usage of independent mappings for vA and vB could be valuable and would serve to increase the number of addressable registers in the working set.
As has been described, the present invention provides an apparatus for increasing addressability of registers within a processor. The present invention can be implemented in a compatible way into an existing processor architecture.
While the invention has been particularly shown and described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention.