The present invention relates in general to the field of pipelined microprocessor architectures, and particularly to the forwarding of floating-point results from one instruction to another.
The x86 architecture specifies multiple data formats for floating point operands, namely, single-precision, double-precision, and extended double-precision. A straightforward implementation would give the floating point units a separate multiplier, adder, etc. for each architected data format, which is an inefficient use of space and power. So, to reduce the number of multipliers, adders, etc., the floating point units include a single multiplier, adder, etc., each capable of operating on operands that are in a single non-architected data format. The floating point units convert the received source operands from their architected data format to the non-architected data format, perform the operation on the non-architected data format operands to generate a result in the non-architected data format, and then convert the result back to the architected data format. The architected data format results are then forwarded to the floating point units as source operands, as illustrated by the conventional floating point units 112 shown in
In one aspect the present invention provides a microprocessor having an instruction set architecture (ISA) that specifies at least one architected data format (ADF) for floating-point operands. The microprocessor includes first and second floating-point units. The first floating-point unit is configured to speculatively forward a non-ADF result generated by the first floating-point unit to the second floating-point unit, wherein the non-ADF result is associated with a first instruction. The second floating-point unit is configured to use the speculatively forwarded non-ADF result associated with the first instruction as a source operand to generate a result of a second instruction. The second floating-point unit is further configured to convert the non-ADF result to an ADF result and to determine whether the non-ADF result creates an exception condition when converted to the ADF result. The microprocessor is configured to cancel the second instruction, in response to determining that the non-ADF result creates an exception condition when converted to the ADF result.
In another aspect, the present invention provides a method for processing floating-point instructions in a microprocessor having first and second floating-point units, wherein the microprocessor has an instruction set architecture (ISA) that specifies at least one architected data format (ADF) for floating-point operands. The method includes speculatively forwarding a non-ADF result generated by the first floating-point unit from the first floating-point unit to the second floating-point unit, wherein the non-ADF result is associated with a first instruction. The method also includes the second floating-point unit using the speculatively forwarded non-ADF result associated with the first instruction as a source operand to generate a result of a second instruction. The method also includes determining whether the non-ADF result creates an exception condition when converted to an ADF result. The method also includes canceling the second instruction, in response to determining that the non-ADF result creates an exception condition when converted to the ADF result.
In yet another aspect, the present invention provides a computer program product encoded in at least one computer readable medium for use with a computing device, the computer program product comprising computer readable program code embodied in said medium for specifying a microprocessor having an instruction set architecture (ISA) that specifies at least one architected data format (ADF) for floating-point operands. The computer readable program code includes first program code for specifying a first floating-point unit and second program code for specifying a second floating-point unit. The first floating-point unit is configured to speculatively forward a non-ADF result generated by the first floating-point unit to the second floating-point unit, wherein the non-ADF result is associated with a first instruction. The second floating-point unit is configured to use the speculatively forwarded non-ADF result associated with the first instruction as a source operand to generate a result of a second instruction. The second floating-point unit is further configured to convert the non-ADF result to an ADF result and to determine whether the non-ADF result creates an exception condition when converted to the ADF result. The microprocessor is configured to cancel the second instruction, in response to determining that the non-ADF result creates an exception condition when converted to the ADF result.
The forwarding of architected data format results described above with respect to
Referring now to
In one embodiment, the microprocessor 100 is an x86 (also referred to as IA-32) architecture microprocessor 100; however, other microprocessor architectures may be employed. A microprocessor is an x86 architecture processor if it can correctly execute a majority of the application programs that are designed to be executed on an x86 microprocessor. An application program is correctly executed if its expected results are obtained. In particular, the microprocessor 100 executes instructions of the x86 instruction set and includes the x86 user-visible register set.
Referring now to
The converter 222 converts the ADF operands 152 into NADF operands 272 that are provided to the mux 224. The mux 224 also receives a NADF result 252 forwarded from the NADF multiplier 226 and a NADF result 254 forwarded from the NADF adder 236. From its inputs, the mux 224 selects NADF operands 266 for provision to the NADF multiplier 226, which multiplies the operands 266 to generate the NADF result 252. The converter 228 converts the NADF result 252 to the ADF result 162 of
The converter 232 converts the ADF operands 152 into NADF operands 274 that are provided to the mux 234. The mux 234 also receives the NADF result 252 forwarded from the NADF multiplier 226 and the NADF result 254 forwarded from the NADF adder 236. From its inputs, the mux 234 selects NADF operands 268 for provision to the NADF adder 236, which adds the operands 268 to generate the NADF result 254. The converter 238 converts the NADF result 254 to the ADF result 164 of
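By way of illustration only, the operand path into the NADF multiplier 226 and the NADF adder 236 may be modeled in software roughly as follows. The sketch assumes, purely for illustration, that single precision stands in for the ADF and that a Python float (double precision, with its larger exponent range) stands in for the NADF. The names `Src`, `adf_to_nadf`, and `mux` are hypothetical and are not part of the described embodiment.

```python
import struct
from enum import Enum


class Src(Enum):
    """Hypothetical mux select values, one per mux input."""
    CONVERTED = 0  # operand produced by the ADF-to-NADF converter
    FWD_MUL = 1    # NADF result forwarded from the multiplier
    FWD_ADD = 2    # NADF result forwarded from the adder


def adf_to_nadf(adf_bits: int) -> float:
    """Widen a 32-bit single-precision bit pattern (stand-in ADF)
    to a Python float (stand-in NADF). Widening never overflows."""
    return struct.unpack("<f", struct.pack("<I", adf_bits))[0]


def mux(select: Src, converted: float, fwd_mul: float, fwd_add: float) -> float:
    """Select one NADF operand from the converter output or a forwarded result."""
    return {
        Src.CONVERTED: converted,
        Src.FWD_MUL: fwd_mul,
        Src.FWD_ADD: fwd_add,
    }[select]


# Multiplier side: both operands come from the converter (1.5 and 2.0).
product = adf_to_nadf(0x3FC00000) * adf_to_nadf(0x40000000)

# Adder side: one operand is the forwarded product, the other is converted (0.5).
# The forwarded product is used directly, with no round trip through the ADF.
forwarded = mux(Src.FWD_MUL, converted=0.0, fwd_mul=product, fwd_add=0.0)
total = forwarded + adf_to_nadf(0x3F000000)
print(product, total)  # 3.0 3.5
```

The point of the model is only that the forwarded operand bypasses both converters; actual select logic, operand widths, and timing are not represented.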
As may be observed by comparing
Floating point operations may generate exception conditions, such as overflow or underflow. A side-effect of the NADF is that some results that would overflow/underflow in the ADF would not do so in the NADF, e.g., because of the larger exponent, as discussed above. Consequently, the forwarding of the NADF results 252/254 is speculative because the programmer may not want the instruction that receives the forwarded NADF result 252/254 to execute with a value that would cause an exception when converted to ADF. Therefore, in parallel with the speculative forwarding of NADF results 252/254, the converters 228/238 also perform the conversion to ADF, and if the conversion yields an overflow/underflow, then they generate an exception 172/174 on the forwarding instruction and the microprocessor 100 kills the instruction that executed using the speculatively forwarded NADF result, as described in more detail with respect to
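The check that accompanies the NADF-to-ADF conversion may be sketched in software as follows. This is a simplified model under stated assumptions, not the converter logic of the embodiment: single precision again stands in for the ADF, a Python float stands in for the NADF, and any nonzero result that falls below the smallest normal single-precision magnitude is treated as an underflow. The function name `convert_to_adf` and the constant `FLT_MIN_NORMAL` are hypothetical.

```python
import math
import struct

# Smallest normal single-precision magnitude (assumed underflow threshold).
FLT_MIN_NORMAL = 2.0 ** -126


def convert_to_adf(nadf: float):
    """Narrow a stand-in NADF value to the stand-in ADF.

    Returns (adf_value, exception), where exception is None,
    "overflow", or "underflow".
    """
    # Zeros, infinities, and NaNs are passed through without an exception here.
    if nadf == 0.0 or math.isinf(nadf) or math.isnan(nadf):
        return nadf, None
    try:
        # Packing as single precision rounds to the narrower format and
        # raises OverflowError if the finite value does not fit.
        adf = struct.unpack("<f", struct.pack("<f", nadf))[0]
    except OverflowError:
        return math.copysign(math.inf, nadf), "overflow"
    if abs(adf) < FLT_MIN_NORMAL:
        return adf, "underflow"
    return adf, None


# 1e39 is an ordinary value in the wider format but exceeds the
# single-precision range, so only the conversion reveals the problem.
print(convert_to_adf(1e39)[1])   # overflow
print(convert_to_adf(1e-40)[1])  # underflow
print(convert_to_adf(1.5))       # (1.5, None)
```

The first two calls show the situation described above: the value is well-formed in the wider format, so a consumer of the forwarded result sees nothing wrong, and the exception only appears once the narrowing conversion is attempted.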
Referring now to
At block 302, floating point unit 112A receives an instruction-B for execution. The mux 224 detects that one of the source operands is the NADF result 254 of a previous instruction-A that has been forwarded from the NADF adder 236 and accordingly selects the forwarded NADF result 254. The mux 224 may also select as the other operand the forwarded NADF result 252 from the NADF multiplier 226 or the converted NADF operands 272. Flow proceeds to block 304.
At block 304, the NADF multiplier 226 multiplies the NADF operands 266 to generate the NADF result 252 for instruction-B. Flow proceeds concurrently from block 304 to blocks 306 and 322.
At block 306, the forwarding buses forward the NADF result 252 of instruction-B to the NADF adder 236. Flow proceeds to block 308.
At block 308, floating point unit 112B receives an instruction-C for execution. The mux 234 detects that one of the source operands is the NADF result 252 of instruction-B that has been forwarded at block 306 from the NADF multiplier 226 and accordingly selects the forwarded NADF result 252. The mux 234 may also select as the other operand the forwarded NADF result 254 from the NADF adder 236 or the converted NADF operands 274. Flow proceeds to block 312.
At block 312, the NADF adder 236 adds the NADF operands 268 to generate the NADF result 254 for instruction-C. Flow ends at block 312, although it is understood that the forwarding of NADF results 252 and/or 254 may advantageously continue for a long sequence of instructions, thereby reducing latency and speeding up the execution of the sequence of instructions relative to the conventional floating point units 112 of
At block 322, the converter 228 converts the NADF result 252 of instruction-B to ADF result 162. Flow proceeds to decision block 324.
At decision block 324, the converter 228 determines whether the NADF result 252 of instruction-B creates an exception condition when converting to ADF. If so, flow proceeds to block 326; otherwise, flow proceeds to block 328.
At block 326, the converter 228 asserts the exception indicator 172 to the ROB 114. Consequently, the microprocessor 100 will take an exception, and the ROB 114 will flush instruction-C, since instruction-C is newer in program sequence than instruction-B, which caused the exception. This is necessary since the NADF result 252 of instruction-B was speculatively forwarded to the NADF adder 236 without knowledge of whether the NADF result 252 was a good operand, i.e., without knowledge of whether the NADF result 252 was a value that neither overflows nor underflows from an ADF perspective. That is, the programmer may not have desired instruction-C to execute with a bad operand. However, advantageously the NADF results 252/254 are speculatively forwarded to potentially reduce the latency of instruction execution, and in most cases both the forwarding and the receiving instructions will complete successfully. Flow ends at block 326.
At block 328, floating point unit 112A provides the ADF result 162 to the ROB 114 for storage in a temporary register therein. Flow proceeds to block 332.
At block 332, the ROB 114 retires the ADF result 162 from the temporary register to the appropriate GPR 118. Flow ends at block 332.
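The two concurrent paths of the flow described above (speculative forwarding on one side, conversion and exception checking on the other) may be summarized by the following software sketch. It is an illustrative model only: single precision stands in for the ADF, a Python float stands in for the NADF, underflow is approximated as a nonzero result below the smallest normal single-precision magnitude, and the names `adf_exception` and `run_b_then_c` are hypothetical. The conversion of the result of instruction-C itself is omitted for brevity.

```python
import struct

# Smallest normal single-precision magnitude (assumed underflow threshold).
FLT_MIN_NORMAL = 2.0 ** -126


def adf_exception(nadf: float) -> bool:
    """Return True if narrowing the stand-in NADF value to the stand-in ADF
    would overflow or underflow (blocks 322 and 324)."""
    if nadf == 0.0:
        return False
    try:
        adf = struct.unpack("<f", struct.pack("<f", nadf))[0]
    except OverflowError:
        return True
    return abs(adf) < FLT_MIN_NORMAL


def run_b_then_c(b_src1: float, b_src2: float, c_other: float) -> dict:
    """Model instruction-B (multiply) forwarding its NADF result to
    instruction-C (add)."""
    nadf_b = b_src1 * b_src2       # block 304: multiply in the NADF
    nadf_c = nadf_b + c_other      # blocks 306-312: C uses the forwarded result
    if adf_exception(nadf_b):      # blocks 322-324: check B's conversion
        # Block 326: exception on B, so the speculative result of C is discarded.
        return {"b_exception": True, "c_flushed": True, "c_result": None}
    # Blocks 328-332: B's ADF result is retired and C is allowed to stand.
    return {"b_exception": False, "c_flushed": False, "c_result": nadf_c}


print(run_b_then_c(2.0, 3.0, 1.0))    # C keeps its result, 7.0
print(run_b_then_c(1e20, 1e20, 1.0))  # product overflows the ADF, C is flushed
```

In the second call the product 1e40 is representable in the wider format, so instruction-C computes a result from it, yet that result is discarded because the product cannot be represented in the narrower format.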
While various embodiments of the present invention have been described herein, it should be understood that they have been presented by way of example, and not limitation. It will be apparent to persons skilled in the relevant computer arts that various changes in form and detail can be made therein without departing from the scope of the invention. For example, software can enable the function, fabrication, modeling, simulation, description and/or testing of the apparatus and methods described herein. This can be accomplished through the use of general programming languages (e.g., C, C++), hardware description languages (HDL) including Verilog HDL, VHDL, and so on, or other available programs. Such software can be disposed in any known computer usable medium such as magnetic tape, semiconductor, magnetic disk, or optical disc (e.g., CD-ROM, DVD-ROM, etc.), a network, wire line, wireless or other communications medium. Embodiments of the apparatus and method described herein may be included in a semiconductor intellectual property core, such as a microprocessor core (e.g., embodied in HDL) and transformed to hardware in the production of integrated circuits. Additionally, the apparatus and methods described herein may be embodied as a combination of hardware and software. Thus, the present invention should not be limited by any of the exemplary embodiments described herein, but should be defined only in accordance with the following claims and their equivalents. Specifically, the present invention may be implemented within a microprocessor device which may be used in a general purpose computer. Finally, those skilled in the art should appreciate that they can readily use the disclosed conception and specific embodiments as a basis for designing or modifying other structures for carrying out the same purposes of the present invention without departing from the scope of the invention as defined by the appended claims.
This application claims priority based on U.S. Provisional Application Ser. No. 61/240,753, filed Sep. 9, 2009, entitled FAST FLOATING POINT RESULT FORWARDING USING NON-ARCHITECTED DATA FORMAT, which is hereby incorporated by reference in its entirety. This application is related to U.S. Non-Provisional Application TBD, filed concurrently herewith, entitled FAST FLOATING POINT RESULT FORWARDING USING NON-ARCHITECTED DATA FORMAT, which is incorporated by reference herein in its entirety, and which is subject to an obligation of assignment to common assignee VIA Technologies, Inc.
Number | Date | Country
---|---|---
61240753 | Sep 2009 | US