This invention relates to a programmable integrated circuit device, and particularly to a specialized processing block in a programmable integrated circuit device.
Considering a programmable logic device (PLD) as one example of an integrated circuit device, as applications for which PLDs are used increase in complexity, it has become more common to design PLDs to include specialized processing blocks in addition to blocks of generic programmable logic resources. Such specialized processing blocks may include a concentration of circuitry on a PLD that has been partly or fully hardwired to perform one or more specific tasks, such as a logical or a mathematical operation. A specialized processing block may also contain one or more specialized structures, such as an array of configurable memory elements. Examples of structures that are commonly implemented in such specialized processing blocks include: multipliers, arithmetic logic units (ALUs), barrel-shifters, various memory elements (such as FIFO/LIFO/SIPO/RAM/ROM/CAM blocks and register files), AND/NAND/OR/NOR arrays, etc., or combinations thereof.
One particularly useful type of specialized processing block that has been provided on PLDs is a digital signal processing (DSP) block, which may be used to process, e.g., audio signals. Such blocks are frequently also referred to as multiply-accumulate (“MAC”) blocks, because they include structures to perform multiplication operations, and sums and/or accumulations of multiplication operations.
For example, PLDs sold by Altera Corporation, of San Jose, Calif., as part of the STRATIX® and ARRIA® families include DSP blocks, each of which includes a plurality of multipliers. Each of those DSP blocks also includes adders and registers, as well as programmable connectors (e.g., multiplexers) that allow the various components of the block to be configured in different ways.
Typically, the arithmetic operators (adders and multipliers) in such specialized processing blocks have been fixed-point operators. If floating-point operators were needed, the user would construct them outside the specialized processing block using general-purpose programmable logic of the device, or using a combination of the fixed-point operators inside the specialized processing block with additional logic in the general-purpose programmable logic.
In accordance with embodiments of the present invention, specialized processing blocks such as the DSP blocks described above may be enhanced by including floating-point addition among the functions available in the DSP block. This reduces the need to construct floating-point functions outside the specialized processing block. The addition function may be a wholly or partially dedicated (i.e., “hard logic”) implementation of addition in accordance with the IEEE 754-1985 standard, and can be used for addition operations, multiply-add (MADD) operations, or vector (dot product) operations, any of which can be either real or complex. The floating-point adder circuit may be incorporated into the DSP block, and can be independently accessed, or used in combination with a multiplier in the DSP block, or even multipliers in adjacent DSP blocks.
Therefore, in accordance with embodiments of the present invention there is provided a specialized processing block on a programmable integrated circuit device. The specialized processing block includes a first floating-point arithmetic operator stage, a second floating-point arithmetic operator stage, and configurable interconnect within the specialized processing block for routing signals into and out of each of the first and second floating-point arithmetic operator stages. There is also provided a programmable integrated circuit device comprising a plurality of such specialized processing blocks.
In some embodiments, the specialized processing block includes a plurality of block inputs, at least one block output, a direct-connect input from another one of the specialized processing blocks, and a direct-connect output to another one of the specialized processing blocks. In some of those embodiments, the configurable interconnect may be configurable to route a plurality of the block inputs to inputs of the first floating-point arithmetic operator stage, at least one of the block inputs to an input of the second floating-point arithmetic operator stage, output of the first floating-point arithmetic operator stage to an input of the second floating-point arithmetic operator stage, at least one of the block inputs to the direct-connect output, output of the first floating-point arithmetic operator stage to the direct-connect output, and the direct-connect input to an input of the second floating-point arithmetic operator stage.
Further features of the invention, its nature and various advantages will be apparent upon consideration of the following detailed description, taken in conjunction with the accompanying drawings, in which like reference characters refer to like parts throughout, and in which:
In the logical representation of
The floating-point multiplier 101 can feed the floating-point adder 102 directly in a multiply-add (MADD) mode, as depicted in
In addition, the output of multiplier 101 and/or one of the inputs 201 to the DSP block 200 can also be routed via a direct connection 212 to the adder in an adjacent similar DSP block 200 (it being apparent that, except at the ends of a chain of blocks 200, each direct connection 202 receives its input from a direct connection 212, and that each direct connection 212 provides its output to a direct connection 202). Specifically, multiplexer 211 may be provided to select either input 201 or direct connection 202 as one input to adder 102. Similarly, multiplexer 221 may be provided to select either input 201 or the output of multiplier 101 as another input to adder 102. A third multiplexer 231 may be provided to select either input 201 or the output of multiplier 101 as the output to direct connection 212. Thus the inputs to adder 102 can be either input 201 and the output of multiplier 101, or input 201 and direct connection 202, and direct connection 212 can output either input 201 or the output of multiplier 101.
In one embodiment, multiplexer 221 and multiplexer 231, which have the same two inputs (input 201 and the output of multiplier 101), share a control signal, but in the opposite sense as indicated at 241, so that if one of the two multiplexers selects one of those two inputs, the other of the two multiplexers selects the other of those two inputs.
Multiple DSP blocks according to embodiments of the invention may be arranged in a row or column, so that information can be fed from one block to the next using the aforementioned direct connections 202/212, to create more complex structures.
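The routing described above can be illustrated with a small behavioral model. This is a minimal sketch rather than the circuit itself: the names `BlockConfig` and `dsp_block` are hypothetical, the plurality of inputs 201 is reduced to two multiplier operands plus one extra operand `c`, and ordinary Python floats stand in for the hardware floating-point format. The reference numerals in the comments follow the description above.

```python
from dataclasses import dataclass


@dataclass
class BlockConfig:
    # Multiplexer 211: True selects direct connection 202, False selects input 201.
    adder_uses_direct_in: bool
    # Multiplexer 221: True selects the multiplier output, False selects input 201.
    # Multiplexer 231 takes the opposite sense of this control (as indicated at 241).
    adder_uses_product: bool


def dsp_block(cfg, a, b, c, direct_in=0.0):
    """Return (adder output, direct-connect output 212) for one block.

    a, b      -- multiplier operands (assumed to arrive on inputs 201)
    c         -- one further input 201 (a simplification of the input bus)
    direct_in -- value arriving on direct connection 202
    """
    product = a * b                                          # multiplier 101
    op1 = direct_in if cfg.adder_uses_direct_in else c       # multiplexer 211
    op2 = product if cfg.adder_uses_product else c           # multiplexer 221
    direct_out = c if cfg.adder_uses_product else product    # multiplexer 231
    return op1 + op2, direct_out                             # adder 102


# Two adjacent blocks: the first forwards its product on direct connection 212,
# and the second adds that value to its own product, giving A*B + C*D.
forward_product = BlockConfig(adder_uses_direct_in=False, adder_uses_product=False)
add_neighbor = BlockConfig(adder_uses_direct_in=True, adder_uses_product=True)

_, link = dsp_block(forward_product, 2.0, 3.0, 0.0)
pair_sum, _ = dsp_block(add_neighbor, 4.0, 5.0, 0.0, direct_in=link)

# A single block in multiply-add mode: c + a*b.
madd, _ = dsp_block(BlockConfig(False, True), 2.0, 3.0, 1.0)
```

Because multiplexers 221 and 231 share one control in opposite senses, the model needs only two configuration bits per block; a block that consumes its own product in adder 102 forwards input 201 on direct connection 212, and vice versa.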
The same DSP block features can be used to implement a complex dot product. Each second pair of DSP blocks would use a subtraction rather than an addition in the first level addition, which can be supported by the floating-point adder (e.g., by negating one of the inputs, in a straightforward manner). The rest of the adder tree is a straightforward sum construction, similar to that described in the preceding paragraph.
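The arithmetic of the complex case can be sketched as follows. This is an illustrative software model only: the function names `tree_sum` and `complex_dot` are hypothetical, the dot product is assumed to be unconjugated, and the mapping of each first-level operation onto a particular pair of DSP blocks is not modeled. The real part uses a subtraction at the first level and the imaginary part uses an addition, after which both are reduced by a plain pairwise sum.

```python
def tree_sum(values):
    """Sum values pairwise, level by level, in the manner of a binary adder tree."""
    level = list(values)
    if not level:
        return 0.0
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # an odd element passes through to the next level
        level = nxt
    return level[0]


def complex_dot(a, b):
    """Unconjugated complex dot product built from real multiplies and adds."""
    # First-level subtraction: real part of each elementwise product.
    real_terms = [x.real * y.real - x.imag * y.imag for x, y in zip(a, b)]
    # First-level addition: imaginary part of each elementwise product.
    imag_terms = [x.real * y.imag + x.imag * y.real for x, y in zip(a, b)]
    return complex(tree_sum(real_terms), tree_sum(imag_terms))


result = complex_dot([1 + 2j, 3 - 1j], [2 - 1j, 1 + 4j])
```

Each complex multiplication costs four real multiplications, two for the real part and two for the imaginary part, which is why the subtraction appears in every second first-level pair.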
As discussed above, IEEE 754-compliant rounding can be provided inside embodiments of the DSP block, or can be implemented in the general-purpose programmable logic portion of the device.
Another feature that could be implemented in dedicated logic is the calculation of an overflow condition of the rounded value, which can be determined using substantially fewer resources than the addition. Additional features could calculate the value of a final exponent, or special or error conditions based on the overflow condition.
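One way to see why the overflow condition can be cheaper than the rounding addition is sketched below. The description does not specify how the condition is computed, so this is an assumption: round-to-nearest-even is used, the unrounded mantissa is followed by a guard bit and a sticky bit, and the function names are hypothetical. Under those assumptions the rounded mantissa overflows exactly when every mantissa bit is 1 and the guard bit is set, which needs only a wide AND rather than a carry-propagate increment.

```python
def round_up_rne(mantissa, guard, sticky):
    """Round-to-nearest-even decision from the guard bit, sticky bit and mantissa LSB."""
    return guard and (sticky or (mantissa & 1) == 1)


def overflow_fast(mantissa, width, guard):
    """Overflow test without performing the increment.

    When all mantissa bits are 1 the LSB is 1, so the round-up decision
    reduces to the guard bit alone.
    """
    return guard and mantissa == (1 << width) - 1


def overflow_by_addition(mantissa, width, guard, sticky):
    """Reference: perform the rounding increment and inspect the carry out."""
    rounded = mantissa + (1 if round_up_rne(mantissa, guard, sticky) else 0)
    return (rounded >> width) != 0
```

The two functions agree for every mantissa, guard and sticky combination, so the fast test could run in parallel with the rounding increment, for example to precompute an exponent adjustment.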
For the illustrated method of adder tree implementation, each DSP block output other than the output of the last block is fed back to the input of another DSP block. In some cases the output is fed back to an input of the same block, such as the EF+GH output 412 in
Another embodiment of a dedicated floating-point processing block is a dedicated floating-point adder block. Such a block can be binary (2 input operands) or ternary (3 input operands).
By providing specialized processing blocks, including dedicated but configurable floating-point operators, the present invention allows the implementation of certain operations, such as the vector dot product described above, with less reliance on programmable logic outside the blocks.
A PLD 90 incorporating specialized processing blocks according to the present invention may be used in many kinds of electronic devices. One possible use is in an exemplary data processing system 900 shown in
System 900 can be used in a wide variety of applications, such as computer networking, data networking, instrumentation, video processing, digital signal processing, or any other application where the advantage of using programmable or reprogrammable logic is desirable. PLD 90 can be used to perform a variety of different logic functions. For example, PLD 90 can be configured as a processor or controller that works in cooperation with processor 901. PLD 90 may also be used as an arbiter for arbitrating access to a shared resource in system 900. In yet another example, PLD 90 can be configured as an interface between processor 901 and one of the other components in system 900. It should be noted that system 900 is only exemplary, and that the true scope and spirit of the invention should be indicated by the following claims.
Various technologies can be used to implement PLDs 90 as described above and incorporating this invention.
It will be understood that the foregoing is only illustrative of the principles of the invention, and that various modifications can be made by those skilled in the art without departing from the scope and spirit of the invention. For example, the various elements of this invention can be provided on a PLD in any desired number and/or arrangement. One skilled in the art will appreciate that the present invention can be practiced by other than the described embodiments, which are presented for purposes of illustration and not of limitation, and the present invention is limited only by the claims that follow.
This claims the benefit of, and priority to, copending, commonly-assigned U.S. Provisional Patent Application No. 61/483,924, filed May 9, 2011, which is hereby incorporated by reference herein in its entirety.
Number | Date | Country
---|---|---
61483924 | May 2011 | US