METHOD FOR EFFICIENTLY EMULATING COMPUTER ARCHITECTURE CONDITION CODE SETTINGS

Information

  • Patent Application
  • 20080184014
  • Publication Number
    20080184014
  • Date Filed
    January 30, 2007
  • Date Published
    July 31, 2008
Abstract
Emulation of source machine instructions is provided in which target machine CPU condition codes are employed to produce emulated condition code settings without the use, encoding or generation of branching instructions.
Description
TECHNICAL FIELD

This invention relates in general to the emulation of computer system architectures, and more particularly, to methods and systems for handling condition code settings encountered in the emulation process. Even more particularly, the present invention is directed to providing sequences of instructions that produce valid condition code settings without the use of branching instructions from the target architecture.


BACKGROUND OF THE INVENTION

In virtually all modern data processing systems, the execution of various operations, such as arithmetic operations, logical operations and even data transfer operations, may result in the generation of several bits of data to indicate the outcome status of instruction execution. These bits are typically referred to as condition codes. As a simple example, a special condition code setting may be set after an arithmetic addition which results in an overflow due to the addends being too large for the number of bits available for the result. The use of condition codes permeates the execution of almost every instruction.


A classic example of an instruction which produces condition code changes upon execution is the compare instruction which sets a condition code to “zero” if the operands are equal, to “one” if the first operand is strictly less than the second operand and to “two” if the first operand is strictly greater than the second operand. The compare instruction represents an archetypical use of condition code settings.


For a number of reasons, it may be desirable to emulate the instructions designed for one computer architecture on another system with a different set of executable instructions. For example, emulation may be employed in system design or test. It may also be employed to expand the capabilities of one data processing system so that it is enabled to handle instructions written for another system. The present invention relates to the handling of condition code settings in the context of instruction emulation. While the systems and methods of the present invention are widely applicable to any emulation method where condition codes are present, it is particularly applicable to the emulation of the z/Architecture. However, the principles set forth herein are applicable to any source architecture and to any target architecture.


In the principal emulation environment considered in the present description, it is the job of emulation software to accept, as input, strings of source architecture instructions and to generate therefrom strings of instructions that, when run on the target architecture, produce the same results. These results include the setting of various condition codes, such as sign, carry, overflow and various others indicating exceptions and machine states. It is noted that while an emulation environment preferably results in the setting of hardware or condition code elements in the target architecture, the present invention also contemplates the situation in which condition codes are generated and stored in locations other than condition code registers in the target machine.


It is to be particularly noted that the present invention deliberately avoids the conventional handling of condition code generation. An example of this difference is provided through a brief consideration of the compare instruction. This instruction compares two operands and sets a two-bit condition code according to the outcome of the comparison. For example, if the comparison of the two operands determines that they are the same, the condition code is set to zero. If it is determined that the first operand is strictly less than the second operand, the condition code is set to one. Lastly, if it is determined that the first operand is strictly greater than the second operand, the condition code is set to two. In conventional approaches to the emulation of a compare instruction, the result is the construction of a sequence of instructions which includes three branch instructions. For the reasons set forth immediately below, the presence of branch instructions in the target architecture instruction stream is undesirable.
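The conventional handling described above can be sketched in Python as follows. This is an illustrative model only: the function name `compare_cc_with_branches` is hypothetical, and the exact number of branch instructions an emulator emits depends on how such logic is lowered to the target machine.

```python
def compare_cc_with_branches(a: int, b: int) -> int:
    """Illustrative model of conventional condition code generation.

    Each `if` below stands for a conditional branch that an emulator
    would emit into the target instruction stream (an assumption made
    for illustration; this is not code from the source).
    """
    if a == b:       # branch: operands are equal
        return 0
    if a < b:        # branch: first operand is strictly less
        return 1
    return 2         # remaining case: first operand is strictly greater


print(compare_cc_with_branches(3, 3),
      compare_cc_with_branches(3, 7),
      compare_cc_with_branches(7, 3))
```

Every call takes a data-dependent path through the function, which is the property the branch-free sequences in this description are designed to avoid.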


Branch instructions are undesirable for at least two reasons. First, most modern data processing architectures include a feature known as branch prediction. In these architectures, a guess is made as to which of two or more paths the instruction stream will follow after encountering a branch instruction. If a correct guess is made, then all is well and machine processing is thereby speeded up. However, if an incorrect guess is made, the machine hardware must backtrack through the path taken and then take another path. In that case, the branch instruction is a detriment to overall processing speed. Accordingly, it is seen that branch instructions introduce complications which are not otherwise present. Second, it is noted that branch instructions actually consume the aforementioned branch prediction resources, so that these resources are not available for other instruction streams being executed by a processor. Thus branch instructions are not only potentially wasteful in and of themselves, they also deprive other instruction streams of limited, yet valuable, computer resources.


Accordingly, it is seen that the designer of emulation systems is faced with the paradoxical choice of needing branch instructions to successfully emulate the generation of condition code settings in target architectures while at the same time desiring to avoid branching instructions because of their disadvantages. This problem is especially severe when condition code generation and functionality in the target architecture are quite different from that found in the architecture of the source machine.


It is to be particularly noted that computer programs that emulate the machine state of the z/Architecture deal with many z/Architecture instructions that modify the condition codes. In short, the z/Architecture is a prime exemplar of an architecture in which condition code settings are typically quite different from those found in other architectures, especially ones that have historically grown up from relatively simple microprocessor designs. Additionally, the modification of condition code settings in the z/Architecture is pervasive. The generation and use of condition code settings is most typically found as the result of performing arithmetic, logical or compare operations, after which one or more condition code settings are changed based on the result or other factors. The pervasiveness of condition code modifying instructions in the z/Architecture and the sometimes arbitrary semantics of these instructions introduce complicated control flow to the stream of instructions that are ultimately executed on the target architecture. This control flow adds considerable space and performance overhead to the emulated instructions. The present invention is directed to more efficiently handling this situation. While the method and system herein are particularly applicable to the so-called z/Architecture which is present in large data processing systems manufactured and sold by the assignee of the present invention, it is by no means limited to that architecture as a base of supply for source instructions.


It should also be noted that the present invention is employed in two contexts or modes. In one mode, source computer instructions are converted into target machine instructions for later execution. In another mode of operation, more akin to the operation of interpreters, source instructions are converted into target instructions for immediate execution. The present invention, in its broadest scope, contemplates both of these modalities of operation.


SUMMARY OF THE INVENTION

In a method for emulating computer instructions from a source machine to produce sequences of instructions on a target machine, the present invention generates a sequence of target machine instructions which together operate to directly calculate target machine condition codes from carry, sign and overflow codes without the use of branch instructions from the target machine. The direct calculation avoids the use of branching instructions whose disadvantages are cited above.


The present invention provides specific guiding techniques and several sequences derived from these techniques to efficiently set condition codes or detect exceptional cases in an emulated binary translation environment for the z/Architecture. These techniques are specifically directed to situations in which the PowerPC architecture and the Intel IA32 architecture are employed to emulate the z/Architecture. The sequences of the present invention are more efficient and generally smaller than those of a more straightforward method that requires more flow control. However, it is noted that the principles, techniques and methods of the present invention are not limited to any particular target machine architecture. The two exemplar architectures discussed herein are merely the ones currently anticipated to be of the greatest value.


Additional features and advantages are realized through the techniques of the present invention. Other embodiments and aspects of the invention are described in detail herein and are considered a part of the claimed invention.


The recitation herein of a list of desirable objects which are met by various embodiments of the present invention is not meant to imply or suggest that any or all of these objects are present as essential features, either individually or collectively, in the most general embodiment of the present invention or in any of its more specific embodiments.





BRIEF DESCRIPTION OF THE DRAWINGS

The subject matter which is regarded as the invention is particularly pointed out and distinctly claimed in the concluding portion of the specification. The invention, however, both as to organization and method of practice, together with the further objects and advantages thereof, may best be understood by reference to the following description taken in connection with the accompanying drawings in which:



FIG. 1 is a flow chart indicating the fact that conventional handling of condition code settings in an emulation environment employs up to three branch instructions;



FIG. 2 is a flow chart of the present process in which branch instructions are avoided in the emulation of computer instructions;



FIG. 3 is a block diagram illustrating an example of the environment in which the present invention is employed;



FIG. 4 is a top view of a typical computer readable medium containing program code which implements the methods of the present invention, for example, a Compact Disc (CD); and



FIG. 5 is a block diagram illustrating the environment in which the present invention operates and is employed.





DETAILED DESCRIPTION

The technique used herein to derive the sequences is to implement very short one or two instruction sequence fragments that set a bit or bits in a result or temporary register to distinguish each possible outcome of a condition code setting. These small code fragments typically manipulate the carry, sign and overflow bits and are tied together into slightly larger sequences by standard and usually high-performing shifts, rotates and various arithmetic or Boolean instructions available on most computer architectures. Very efficient sequences result by avoiding both branch instructions and more complex instructions that are less likely to be optimized in hardware.


In some cases it is possible and efficient to manipulate the PowerPC condition code register itself to derive the z/Architecture condition code settings. In these cases a PowerPC record form instruction is used and the resulting PowerPC condition register is manipulated by rotations and logical operations to derive the corresponding z/Architecture condition code setting.
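As a rough illustration of this idea, the Python sketch below models the PowerPC CR0 field as a 4-bit value with the bits LT, GT, EQ and SO (LT most significant) and derives a condition code of 0, 1 or 2 for zero, negative and positive results using only shifts and masks. The source does not list which instructions take this route; the choice of a zero/negative/positive condition code (as set by, for example, the z/Architecture Load and Test instruction), the helper names and the 32 bit width are all assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def cr0_after_record_form(result: int) -> int:
    """Model CR0 after a record-form instruction as the bits LT|GT|EQ|SO.

    The 32 bit result is compared with zero as a signed value.
    The SO bit is left at 0 in this simplified model (an assumption).
    """
    result &= MASK32
    signed = result - (1 << 32) if result >> 31 else result
    lt = int(signed < 0)
    gt = int(signed > 0)
    eq = int(signed == 0)
    return (lt << 3) | (gt << 2) | (eq << 1)


def cc_from_cr0(cr0: int) -> int:
    """Derive cc = 0 (zero), 1 (negative), 2 (positive) without branching.

    LT contributes 1 and GT contributes 2; EQ leaves the value at 0.
    """
    return ((cr0 >> 3) & 1) | ((cr0 >> 1) & 2)


for value in (0, 0xFFFFFFFF, 5):
    print(hex(value), cc_from_cr0(cr0_after_record_form(value)))
```

On real hardware the same extraction would be done with a move from the condition register followed by a rotate and mask; the Python shifts here only model that data flow.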


The examples below are in assembler pseudocode and are applicable to most architectures that can manipulate a carry bit and have the usual shift/rotate/negate instructions. The assembler pseudocode used is purposefully verbose so as to make the implied semantics clear. In some cases non-standard mnemonics are used when an efficient way to implement a particular operation is more likely to vary on different architectures. These non-standard mnemonics are explained more fully in the table below. In each case rX and rY are named target machine registers and “b” is an immediate value. The carry_bit is the carry out of the high order bit position.










TABLE I

Instruction                        Description
get_bit rX, rY, b                  The least significant bit in rX receives the bit value (0 or 1) from position b in rY and the rest of rX is set to zero
get_sign_bit rX, rY                places the sign bit from rY in rX and clears the rest of rX
double rX, rY                      doubles the value in rY and places the result in rX
set_bit_on_zero rX, rY             if rY contains 0 then a 1 is placed in rX otherwise a 0 is placed in rX
set_bit_on_not_zero rX, rY         if rY contains a non-zero value then a 1 is placed in rX otherwise a 0 is placed in rX
add_to_carry rX, rY, rZ            rX = rY + rZ + carry_bit. This operation is also assumed to set the carry bit based on the result of the add.
add_set_carry rX, rY, rZ           rX = rY + rZ and the carry bit is set based on the result of the add. Similar semantics for sub_set_carry (subtract).
add_to_carry_immed rX, rY, imm     rX = rY + imm + carry_bit. This operation is also assumed to set the carry bit based on the result of the add.
add_set_carry_immed rX, rY, imm    rX = rY + imm and set the carry bit based on the result of the add.
move_from_carry rX                 rX = carry bit
flip_bit rX, rY, b                 The bit value at position b of rY is changed from either 0 to 1 or 1 to 0 and the entire changed register value is placed in rX

Bits are numbered from 0-63 for a 64 bit register and 0-31 for a 32 bit register. 0 is the most significant position and 31 or 63 is the least significant position. In the description below, the following register naming conventions are used:

    • rA, rB—the first and second operand, respectively, of the z/Architecture instruction evaluated into a register;
    • rT—the result value of the z/Architecture instruction computed into a register;
    • rX, rY, rZ—temporary registers used to hold intermediate results;
    • rC—the register that will hold the condition code value at the end of the sequence.
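To make the mnemonics and the bit numbering concrete, the following Python sketch models a few of the Table I operations on 64 bit register values, with bit 0 as the most significant position. The function signatures (returning values instead of writing to named registers), the `WIDTH` constant and the carry convention chosen for `sub_set_carry` (carry is 1 when no borrow occurs, matching the Compare Logical discussion in this description) are modelling assumptions, not definitions from the source.

```python
WIDTH = 64                      # assumed register width for this sketch
MASK = (1 << WIDTH) - 1


def get_bit(ry: int, b: int) -> int:
    """Bit b of ry, where position 0 is the most significant bit."""
    return (ry >> (WIDTH - 1 - b)) & 1


def get_sign_bit(ry: int) -> int:
    """The sign bit is the bit at position 0."""
    return get_bit(ry, 0)


def flip_bit(ry: int, b: int) -> int:
    """Invert the bit at position b and return the whole register value."""
    return (ry ^ (1 << (WIDTH - 1 - b))) & MASK


def set_bit_on_zero(ry: int) -> int:
    return int((ry & MASK) == 0)


def set_bit_on_not_zero(ry: int) -> int:
    return int((ry & MASK) != 0)


def add_set_carry(ry: int, rz: int) -> tuple[int, int]:
    """Return (result, carry out of the high order bit position)."""
    total = (ry & MASK) + (rz & MASK)
    return total & MASK, total >> WIDTH


def sub_set_carry(ry: int, rz: int) -> tuple[int, int]:
    """Subtract as ry + ~rz + 1, so the carry is 1 when ry >= rz (unsigned)."""
    total = (ry & MASK) + (~rz & MASK) + 1
    return total & MASK, total >> WIDTH


print(get_bit(1, 63), get_sign_bit(1 << 63), add_set_carry(MASK, 1))
```

Returning the carry alongside the result stands in for the CPU carry flag, which in a real target machine is implicit state read by the following instruction.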


Various sequences derived via this technique are listed and discussed below. As indicated above, the compare instruction is an archetypical example of an instruction which sets condition codes. Accordingly, a sequence for emulating its condition code settings is provided below. The sequence provided is typical of the approach taken in the practice of the present invention. In particular, the subject source instruction is the z/Architecture instruction called Compare Logical. The example assumes that rA and rB are zero-extended in a 64 bit environment (this matters only when the compare instruction operates on 32 bit operands).













TABLE II

[1]  sub_set_carry        rC, rA, rB
[2]  set_bit_on_zero      rX, rC
[3]  set_bit_on_not_zero  rC, rC
[4]  add_to_carry_immed   rC, rC, 0
[5]  sub                  rC, rC, rX

             rA > rB             rA = rB             rA < rB
             Register Contents   Register Contents   Register Contents
Instruction  rC      rX   Carry  rC      rX   Carry  rC      rX   Carry
[1]          rC > 0  N/A  1      rC = 0  N/A  1      rC < 0  N/A  0
[2]          rC > 0  0    1      rC = 0  1    1      rC < 0  0    0
[3]          1       0    1      0       1    1      1       0    0
[4]          2       0    N/A    1       1    N/A    1       0    N/A
[5]          2       0    N/A    0       1    N/A    1       0    N/A

As the table above indicates, after the execution of “sub_set_carry” (instruction [1]), the condition of register rX is not applicable. Execution of instruction [1] in the target machine, however, does set a carry bit in the CPU state which is accessed by later instructions. This is indicated in the “Carry” column in the table, which refers to the carry bit flag in the target machine. It is important to note that this carry bit, like many other flag bits in the target machine, is not set in the target machine in the same manner or under the same conditions as are present in the source machine. At this point, conventional approaches to setting a corresponding value in the register location rC would employ multiple branch instructions as shown in FIG. 1. These conventional approaches, as well as the present process, operate so as to provide a proper indication of the carry bit for use by the emulation software.


With respect to instruction [1], its execution sets the carry bit (that is, the CPU carry bit) to “1” in the case that rA≧rB and to “0” in the case that rA<rB. Additionally, rC contains the result of the subtraction, which, notably, could be “0.” The entries “rC>0,” “rC=0” and “rC<0” in the table above are meant to provide an indication of the resulting condition. After the execution of instruction [2] (set_bit_on_zero), the contents of rX are set equal to “1” if the two operands, rA and rB, are the same, based on the contents of rC (limited to zero or not in this case). The execution of instruction [2] does not affect the contents of rC. Additionally, instruction [2] does not affect the CPU carry bit. With respect to instruction [3] (set_bit_on_not_zero), rC is set equal to “1” whenever rC is not zero, that is, whenever rA is not equal to rB. The CPU carry bit is unaffected by instruction [3]. Thus, at this point, if rA>rB or if rA<rB, then rC=1, but if rA=rB, then rC=0. Note that at this point, rX is set up to provide discrimination information distinguishing equality from inequality and that this occurs outside of (that is, apart from) both rC and the CPU carry bit.


Instruction [4] (add_to_carry_immed) is then executed with the arguments shown (rC, rC, 0), with “0” being an immediate operand. With the operands shown, it carries out the operation: rC+“CPU carry bit”+0. While it also sets the CPU carry bit, this result is not required for subsequent processing. It is seen in Table II above that if rA>rB then the contents of rC are now “2”; if rA=rB, then the contents of rC are “1”; and if rA<rB, then the contents of rC are also “1.” At this stage it is relevant to note that rC now provides an indication by which the case rA>rB is distinguished from the other two cases (rA=rB and rA<rB).


The execution of instruction [5] (sub), with operands “rC, rC, rX,” provides the last step, in which the contents of rX, now denoting equality, are subtracted from rC as a mechanism for distinguishing the case that rA=rB from the case that rA<rB, since the case of equality results in the subtraction of “1” from “1” and the placing of the result “0” in rC. Thus, at the end of the instruction sequence set out above, the following results are obtained: rC=2 if rA>rB; rC=1 if rA<rB; and rC=0 if rA=rB.
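The five-instruction Compare Logical sequence can be traced with the following Python sketch. It is a model of the data flow only, under the assumptions that registers are 64 bits wide, that the subtraction is computed as rA + ~rB + 1 so that the carry is 1 exactly when rA≧rB (as stated above for instruction [1]), and that the Python comparisons stand in for the single-instruction set_bit_on_zero and set_bit_on_not_zero operations. The function name `compare_logical_cc` is hypothetical.

```python
MASK64 = (1 << 64) - 1


def compare_logical_cc(ra: int, rb: int) -> int:
    """Branch-free model of the Compare Logical sequence; returns rC."""
    ra &= MASK64
    rb &= MASK64

    # [1] sub_set_carry rC, rA, rB  (carry = 1 when rA >= rB, unsigned)
    total = ra + (~rb & MASK64) + 1
    rc = total & MASK64
    carry = total >> 64

    # [2] set_bit_on_zero rX, rC    (rX = 1 only when rA == rB)
    rx = int(rc == 0)

    # [3] set_bit_on_not_zero rC, rC
    rc = int(rc != 0)

    # [4] add_to_carry_immed rC, rC, 0   (rC = rC + 0 + carry)
    rc = rc + 0 + carry

    # [5] sub rC, rC, rX
    rc = rc - rx
    return rc


print(compare_logical_cc(7, 3), compare_logical_cc(3, 3), compare_logical_cc(3, 7))
```

Each statement corresponds to one numbered instruction, and no statement chooses between alternative paths, which mirrors the straight-line shape of the generated target code.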


In this manner, then, it is seen that desired emulator results for condition code settings are obtained without the execution of any branching instructions. The concepts presented above are equally applicable to the emulation of any source instruction which produces a condition code change. While the above example is specifically directed to the setting of a carry bit, it is equally applicable to other target architecture condition code bits, such as the sign and overflow bits.


As another example of the application of the present invention to providing condition code generation in an emulation environment the Add Logical (32 bit) and Add Logical (64 bit) instructions are considered below. As with the Compare Logical example discussed above, rA and rB are assumed to be zero extended for a 64 bit target architecture environment for Add Logical (32 bit). The following is a sequence in pseudo-assembly code which provides the proper setting in the location rC at the end of the process. Below, c is the carry bit.

















[1]  add_set_carry        rT, rA, rB
[2]  move_from_carry      rC
[3]  double               rC, rC
[4]  set_bit_on_not_zero  rX, rT
[5]  or                   rC, rC, rX

      rT is zero           rT not zero          rT is zero           rT not zero
      no carry             no carry             carry                carry
      cc = 0               cc = 1               cc = 2               cc = 3
      Register Contents    Register Contents    Register Contents    Register Contents
      rC   rT     rX   c   rC   rT     rX   c   rC   rT     rX   c   rC   rT     rX   c
[1]   n/a  0      n/a  0   n/a  not 0  n/a  0   n/a  0      n/a  1   n/a  not 0  n/a  1
[2]   0    0      n/a  0   0    not 0  n/a  0   1    0      n/a  1   1    not 0  n/a  1
[3]   0    0      n/a  0   0    not 0  n/a  0   2    0      n/a  1   2    not 0  n/a  1
[4]   0    0      0    0   0    not 0  1    0   2    0      0    1   2    not 0  1    1
[5]   0    0      0    0   1    not 0  1    0   2    0      0    1   3    not 0  1    1

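The Add Logical sequence can be modelled in the same way. The Python sketch below is illustrative only: the function name `add_logical_cc` and its `width` parameter are hypothetical, and the carry is modelled as the carry out of the high order bit of a register of the given width, which is a simplification of how a real target machine would expose it.

```python
def add_logical_cc(ra: int, rb: int, width: int = 64) -> tuple[int, int]:
    """Branch-free model of the Add Logical sequence; returns (rT, rC)."""
    mask = (1 << width) - 1

    # [1] add_set_carry rT, rA, rB
    total = (ra & mask) + (rb & mask)
    rt = total & mask
    carry = total >> width

    # [2] move_from_carry rC
    rc = carry

    # [3] double rC          (carry moves to the "2" position)
    rc = rc + rc

    # [4] set_bit_on_not_zero rX, rT
    rx = int(rt != 0)

    # [5] or rC, rC, rX      (cc = 2 * carry + (rT != 0))
    rc = rc | rx
    return rt, rc


print(add_logical_cc(0, 0), add_logical_cc(1, 2),
      add_logical_cc(2**64 - 1, 1), add_logical_cc(2**64 - 1, 2))
```

The final value is simply the carry in bit position "2" combined with the non-zero indicator in bit position "1", which is why a shift (double) and an OR are sufficient.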

In general, this process has the following steps, none of which includes the use or execution of any branch instructions. First, an instruction (step 100 in FIG. 2) in the target machine's architecture is executed which mimics the instruction present in the source instruction stream in a manner which sets one or more target CPU flag bits and which places a result in a storage location (first location, such as rC above) accessible to the emulator. Next, an instruction (step 105) is executed which uses that result to set a bit or bits in another emulator controlled storage location (second location, such as rX above) to distinguish one or more case results. Next, the aforementioned result is used to reset itself (step 110) to a shorter bit configuration (one bit in the above example) which also serves to distinguish one or more case results. Next, an instruction is executed (step 115) which employs the first storage location to produce a result which distinguishes a different set of case results. Lastly, a target machine instruction is executed (step 120) which uses the results in the first and second locations to provide an indication, in one of the two emulator accessible locations, in which at least three cases are distinguished.


Even more generally, the present process is directed to emulation methods which do not employ target machine branch instructions but rather employ target machine instructions whose execution results in the control of target machine condition codes. These condition codes are used in subsequently executed non-branch instructions to distinguish one or more result states, which are made available in a location which an emulator can employ as a condition code emulation location.


It is noted that the process set forth herein encompasses both the generation of suitable sequences of instructions to be executed on a target machine and the actual execution of those instructions on a target machine. That execution may occur immediately upon the sequence for a source instruction being generated, as one might find in an “interpretive” environment, or in a “compilation-like” environment, where actual execution might occur at a later time, if at all.


In any event the environment in which the present invention operates is shown in FIG. 3. The present invention operates in a data processing environment which effectively includes one or more of the computer elements shown in FIG. 3. In particular, computer 500 includes central processing unit (CPU) 520 which accesses programs and data stored within random access memory 510. Memory 510 is typically volatile in nature and accordingly such systems are provided with nonvolatile memory typically in the form of rotatable magnetic memory 540. While memory 540 is preferably a nonvolatile magnetic device, other media may be employed. CPU 520 communicates with users at consoles such as terminal 550 through Input/Output unit 530. Terminal 550 is typically one of many, if not thousands, of consoles in communication with computer 500 through one or more I/O units 530. In particular, console unit 550 is shown as having included therein a device for reading media of one or more types such as CD-ROM 600 shown in FIG. 5. Media 600, an example of which is shown in FIG. 4, comprises any convenient device including, but not limited to, magnetic media, optical storage devices and chips such as flash memory devices or so-called thumb drives. Disk 600 also represents a more generic distribution medium in the form of electrical signals used to transmit data bits which represent codes for the instructions discussed herein. While such transmitted signals may be ephemeral in nature, they nonetheless constitute a physical medium carrying the coded instruction bits and are intended for permanent capture at the signal's destination or destinations.


The typical emulation environment in which the present invention is employed is illustrated in FIG. 5. Emulators such as 320 accept as input instruction streams 305, representing machine or assembly language instructions which are designed to operate on source machine 300. Emulator 320 employs memory 315 in target machine 310 to produce a stream of instructions which are capable of executing on target machine 310. While FIG. 5 particularly shows operation within an emulation environment, it is also noted that the present invention contemplates a situation in which emulator 320 operates essentially as an interpreter in which the instructions are not only translated to the new architecture but in which they are also executed at essentially the same time.


While the invention has been described in detail herein in accordance with certain preferred embodiments thereof, many modifications and changes therein may be effected by those skilled in the art. Accordingly, it is intended by the appended claims to cover all such modifications and changes as fall within the true spirit and scope of the invention.

Claims
  • 1. A method for emulating computer instructions from a source machine to produce sequences of instructions on a target machine, said method comprising: generating a sequence of target machine instructions which together operate to directly calculate target machine condition codes from carry, sign and overflow codes without including branch instructions from the target machine.
  • 2. The method of claim 1 in which said calculation of target machine condition codes is carried out in a sequence of non-branching instructions that manipulate condition code settings produced within the target machine itself to produce the same condition code in a target machine location.
  • 3. The method of claim 2 in which said manipulation employs temporary locations for storing intermediate results.
  • 4. The method of claim 1 further including the step of executing said generated sequence of target machine instructions.
  • 5. The method of claim 4 in which said executing occurs at substantially the same time as said generating.
  • 6. The method of claim 1 in which said generating comprises the steps of: mimicking an instruction present in a source instruction stream in a manner which sets one or more target CPU flag bits and which places a result in a first storage location; executing an instruction which uses said result to set at least one bit in second storage location so as to distinguish one or more case results; using said result to reset itself to a shorter bit configuration which serves to distinguish at least one of said case results; executing an instruction which employs said first storage location to produce a result which distinguishes a different set of case results; and executing an instruction which uses the results in said first and second storage locations to provide an indication in which at least three cases are distinguished, said cases representing condition codes of said source machine.
  • 7. A computer program product comprising a machine readable medium including instructions encoded therein for generating a sequence of target machine instructions which together operate to directly calculate target machine condition codes from carry, sign and overflow codes without including branch instructions from the target machine.
  • 8. A data processing system including a memory for stored program execution by said system, said memory having code therein for generating a sequence of target machine instructions which together operate to directly calculate target machine condition codes from carry, sign and overflow codes without including branch instructions from the target machine.