1. Field of the Invention
The present invention relates generally to instruction execution and in particular to conditional branch instruction execution.
2. Background Information
Conditional branch instructions direct the processor to continue execution either from the next instruction (fall through) or from an address specified in the instruction itself or in a register (target). However, a conventional branch is simply a binary decision, based on a single bit set by a previous instruction or by the current instruction, between falling through and a single target.
Conventional approaches exist for providing multiple branch targets for an instruction. Such approaches, however, use an array of registers to hold the targets. Setting multiple targets increases program code size unnecessarily and causes a shortage of General Purpose (GP) registers, which in turn leads to register spills to memory. Such approaches also require larger instructions because more registers must be mapped in an instruction. Further, an additional dedicated array of target registers is needed.
The invention provides a method and system for relative multiple-target branch instruction execution in a processor. One embodiment involves receiving a multiple-target branch instruction for execution; determining a next instruction to execute based on multiple condition bits or outcomes of a comparison by the current instruction; obtaining a specified instruction offset in the current instruction; and using the offset as the basis for multiple instruction targets based on said outcomes, wherein the number of conditional branches is reduced.
Other aspects and advantages of the present invention will become apparent from the following detailed description, which, when taken in conjunction with the drawings, illustrate by way of example the principles of the invention.
For a fuller understanding of the nature and advantages of the invention, as well as a preferred mode of use, reference should be made to the following detailed description read in conjunction with the accompanying drawings, in which:
The following description is made for the purpose of illustrating the general principles of the invention and is not meant to limit the inventive concepts claimed herein. Further, particular features described herein can be used in combination with other described features in each of the various possible combinations and permutations. Unless otherwise specifically defined herein, all terms are to be given their broadest possible interpretation including meanings implied from the specification as well as meanings understood by those skilled in the art and/or as defined in dictionaries, treatises, etc.
The invention provides a method and system for multiple-target branches based on only one specified offset in a program/code instruction. A multiple-target branch instruction determines its target based on more than one condition bit or on the several outcomes of a comparison (equal, larger, smaller). For example, an instruction that is based on two condition bits has four different outcomes. This eliminates at least two branch instructions. The specified offset is used as the basis for multiple targets. If a branch instruction is based on two condition bits, the possible outcomes are:
00—fall-through to next instruction.
01—branch to current address+offset.
10—branch to current address−offset*2.
11—branch to current address+offset*2.
The bits 00, 01, 10 and 11 show the four possibilities for the two condition bits; each having values 0 or 1.
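The selection rule listed above can be sketched as a small C model. This is only an illustration under stated assumptions, not the claimed hardware: the function name mtb_next_pc, the insn_size parameter, and the packing of the two condition bits into one integer (first bit as the high bit) are hypothetical choices made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the next-address selection for a two-bit
 * multiple-target branch.
 *   pc        address of the branch instruction itself
 *   insn_size size of one instruction (assumed fixed)
 *   offset    the single offset specified in the branch instruction
 *   cond_bits the two condition bits packed as 0..3 (00, 01, 10, 11) */
int64_t mtb_next_pc(int64_t pc, int64_t insn_size, int64_t offset,
                    unsigned cond_bits)
{
    assert(insn_size > 0);
    assert(cond_bits <= 3u);

    switch (cond_bits) {
    case 0u:  return pc + insn_size;   /* 00: fall through to next instruction */
    case 1u:  return pc + offset;      /* 01: current address + offset         */
    case 2u:  return pc - offset * 2;  /* 10: current address - offset*2       */
    default:  return pc + offset * 2;  /* 11: current address + offset*2       */
    }
}
```

In this model a single offset yields three distinct targets plus the fall-through path, so no register is needed to hold any of the targets.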
The advantages of such a technique include fewer conditional branches being executed (leading to faster code execution) and fewer GP registers being used to hold comparison values, condition bits, or branch targets. Further, no dedicated arrays of branch targets are needed.
A first example is described below. The following high-level C programming language code in Table I sets the value of a based on the values of b and c.
The corresponding assembly code generated by a compiler (GCC 4.1.1 with -O3, for a Power5 processor) is shown in Table II below.
ble 7,.L2
ble 1,.printf
bgt 1,.printf
In the above assembly code in Table II, the comparisons are based on only two condition bits (bold italic code lines 1, 2). The bold non-italic code lines 4, 6 and 10 are conditional branches: the compiler generates three of them, and at least two are executed on any given path. Since the comparisons are based on only two condition bits, a multiple-target branch instruction according to the invention can eliminate the need for the three static and two dynamic conditional branches, as described below.
Referring to
00—fall-through to next instruction.
01—branch to current address+target offset.
10—branch to current address−target offset*2.
11—branch to current address+target offset*2.
The bits 00, 01, 10 and 11 show the four possibilities for the two condition bits cbit1 and cbit2, each having values 0 or 1.
A second version of assembly code for the code in Table I is shown in Table III below, wherein a multiple-branch instruction is generated in the assembly code by the compiler based on the multiple-target branch instruction in
mwbc 7,1,.tgt
Only one branch instruction is needed. Specifically, the comparisons are based on only two condition bits (bold italic code lines 4, 5) in Table III. The bold non-italic code line 7 is the only conditional branch that is generated, so the number of conditional branches has been reduced to one, compared with three in Table II. Although several no-operation (nop) instructions have been added for padding, they are not in the execution path.
Another multi-branch instruction embodiment based on the values of two condition bits is:
00—fall-through to next instruction
01—branch to current address+target offset*1
10—branch to current address+target offset*2
11—branch to current address+target offset*3
wherein the compiler generates assembly code as in Table IV below.
mwbc 7,1,.tgt
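The alternative embodiment listed above, in which the targets lie at offset*1, offset*2 and offset*3, can be modeled in the same way. The sketch below is illustrative only; the function name mtb_next_pc_scaled, the insn_size parameter, and the treatment of the two condition bits as a small integer multiplier are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the alternative two-bit encoding, in which the
 * condition bits act as a multiplier on the single specified offset.
 *   pc        address of the branch instruction itself
 *   insn_size size of one instruction (assumed fixed)
 *   offset    the single target offset specified in the instruction
 *   cond_bits the two condition bits packed as 0..3 */
int64_t mtb_next_pc_scaled(int64_t pc, int64_t insn_size, int64_t offset,
                           unsigned cond_bits)
{
    assert(insn_size > 0);
    assert(cond_bits <= 3u);

    if (cond_bits == 0u) {
        return pc + insn_size;              /* 00: fall through */
    }
    /* 01, 10, 11: current address + target offset * 1, 2 or 3 */
    return pc + offset * (int64_t)cond_bits;
}
```

Because all three targets lie in the same direction and are evenly spaced, the selection reduces to a multiply-and-add on the condition bits in this model.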
In other instruction set architectures, such as MIPS (Microprocessor without Interlocked Pipeline Stages), the branch instruction performs both the comparison and the branch in the same instruction, as shown by example in Table V below.
An example of a corresponding multi-target branch instruction (bcmp) for Table V according to the invention, represented as bcmp $1, $2, 100, behaves as follows:
a==b—fall-through to next instruction.
a>b—branch to instruction+100.
a<b—branch to instruction−100.
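The three-way behavior of bcmp listed above can be sketched in C as follows. This is a minimal model, not a definition of the instruction: the function name bcmp_next_pc and the insn_size parameter are hypothetical, and a and b stand for the values held in the two compared registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a compare-and-branch instruction with three
 * outcomes derived from one comparison and one specified offset.
 *   pc        address of the bcmp instruction itself
 *   insn_size size of one instruction (assumed fixed)
 *   a, b      values of the two compared registers
 *   offset    the single offset specified in the instruction */
int64_t bcmp_next_pc(int64_t pc, int64_t insn_size,
                     int64_t a, int64_t b, int64_t offset)
{
    assert(insn_size > 0);

    if (a > b) {
        return pc + offset;    /* a > b: branch forward by offset  */
    }
    if (a < b) {
        return pc - offset;    /* a < b: branch backward by offset */
    }
    return pc + insn_size;     /* a == b: fall through             */
}
```

With an offset of 100, as in the example instruction, the larger and smaller outcomes land 100 after and 100 before the branch, and equality continues with the next instruction.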
With the multi-target branch instruction bcmp, the following C code translates to the code in Table VI (further below):
Alternative implementations can reduce the number of nops. The invention can be implemented, for example, in embedded processors and in applications where minimizing code size is important. The compiler performs analysis in order to make use of the new multiple-target branch instructions.
As is known to those skilled in the art, the example embodiments described above, according to the present invention, can be implemented in many ways, such as program instructions for execution by a processor, as software modules, as computer program product on computer readable media, as logic circuits, as silicon wafers, as integrated circuits, as application specific integrated circuits, as firmware, etc. Although the present invention has been described with reference to certain versions thereof, other versions are possible. Therefore, the spirit and scope of the appended claims should not be limited to the description of the preferred versions contained herein.
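As a rough illustration of the kind of compiler analysis involved, the sketch below estimates the nop padding for the embodiment whose targets lie at offset*1, offset*2 and offset*3. It is not taken from the tables in this description. It assumes a hypothetical layout in which the branch is followed by the fall-through block and then the blocks for outcomes 01, 10 and 11 in that order, every block already ends with its own jump to the join point, and all sizes and offsets are counted in instructions. The function names are invented for the example.

```c
#include <assert.h>

/* Hypothetical: smallest offset (in instructions) such that each block
 * fits in its slot.
 *   s0 size of the fall-through block (placed right after the branch)
 *   s1 size of the block for outcome 01 (placed at branch + offset)
 *   s2 size of the block for outcome 10 (placed at branch + offset*2)
 * The block for outcome 11 comes last and does not constrain the offset. */
int mtb_min_offset(int s0, int s1, int s2)
{
    assert(s0 >= 0 && s1 >= 0 && s2 >= 0);

    int off = s0 + 1;          /* the branch itself plus the fall-through block */
    if (s1 > off) off = s1;
    if (s2 > off) off = s2;
    return off;
}

/* Hypothetical: number of nops needed to pad the first three slots
 * up to the chosen offset. */
int mtb_nop_count(int s0, int s1, int s2)
{
    int off = mtb_min_offset(s0, s1, s2);
    return (off - 1 - s0) + (off - s1) + (off - s2);
}
```

Under these assumptions the padding is zero when the first three slots are equally full, and it grows with the difference between block sizes, which is one reason a compiler might reorder or rebalance blocks to reduce the number of nops.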
Those skilled in the art will appreciate that various adaptations and modifications of the just-described preferred embodiments can be configured without departing from the scope and spirit of the invention. Therefore, it is to be understood that, within the scope of the appended claims, the invention may be practiced other than as specifically described herein.