1. Field of the Invention
The present invention relates to an architecture for a connection block in reconfigurable gate arrays and, in particular, to an architecture for reprogrammable interconnections of multi-context Programmable Gate Arrays (PGA) to implement connections between logic blocks and routing lines in reconfigurable gate arrays including connection blocks to connect inputs and outputs of different logic elements by means of connection wires.
2. Description of the Related Art
As is well known in this specific technical field, various architecture and circuit solutions have been proposed to implement the connection between routing lines and logic blocks in Programmable Gate Arrays.
To reduce the overhead due to the processing time, multi-context reconfiguration architectures have recently been proposed; they store several configurations in the array, allowing context switching in a very short time.
However, this approach has some drawbacks: each SRAM cell used to store configuration bits must be replicated as many times as there are contexts.
Most of the area occupation is due to the high number of SRAM memories allocated in the array, in particular those used to determine which routing switches must be activated to reach the desired connectivity between the logic blocks.
On the other hand the technology miniaturisation leads to programmable interconnections being responsible for most of the area occupation and of the delay. Therefore interconnections are an increasingly important key requirement for reprogrammable architectures, wherein devices like pass transistors, tristate buffers or multiplexers increase the area occupation and the capacitive load on wires and connectors, affecting the overall performance.
New solutions to optimise programmable interconnections are thus necessary to remove this difficulty.
A first basic solution has been proposed by Kerry M. Pierce et al. in U.S. Pat. No. 5,760,604 granted on Jan. 3, 1996 (assigned to Xilinx, Inc.) and concerning an “Interconnect architecture for field programmable gate-array”.
This first solution provides the connection of each logic block input or output line by means of routing wires using switches formed by n-MOS pass transistors and a configuration cell memory enabling or disabling the connection. This choice requires a large silicon area, particularly in a multi-context structure.
The multiplexing diagram shown in FIG. 5a of U.S. Pat. No. 5,760,604 is the previously described switching structure. Only one switch is enabled to connect the output line of the logic block to an outer routing wire.
Some other circuit solutions connect wires using a CMOS transfer gate instead of an n-MOS pass transistor to preserve a high logic value of the signals, or arrange buffers before the switches to improve the signal transmission rate.
However all these solutions require several configuration memory cells used to enable or disable switches.
The area occupation depends strongly on n, which is the width of the vertical routing bus. However, this solution is ideal in terms of speed, since the signal passes through only one pass transistor in a connection block.
This second prior art solution has been described in U.S. Pat. No. 6,134,173 granted on Nov. 2, 1998 to R. G. Cliff, L. T. Cope, C. R. McClintock, W. Leong, J. A. Watson, J. Huang, B. Ahanin (assigned to Altera Corporation) and concerning a “Programmable logic array integrated circuit”.
Several detailed proposals to realize connection blocks have been disclosed by Richard G. Cliff et al, in the above-mentioned second patent.
While the connection block structure is quite similar to the above-described block (see
This diagram allows the number of programming cells to be reduced, but it increases the delay since the signal must pass through a switch and a multiplexer. A further alternative connection is proposed by using only a multiplexer (FIG. 7 of U.S. Pat. No. 6,134,173). In this case the connection between the logic block wire and the routing wire is fixed and a multiplexer connects the correct signal.
This structure minimizes the configuration memory cells but the signal has a delay depending on the multiplexer architecture and size. The last alternative architecture proposed by U.S. Pat. No. 6,134,173 is shown in FIG. 9 thereof wherein the output of a single multiplexer is connected to some logic blocks by means of a switch.
This diagram minimizes the number of multiplexers but it increases the signal delay, since it is necessary to pass through two stages.
Two similar multiplexer structures to be used for designing connection blocks are described in the following papers.
1) P. Chow, S. Seo, J. Rose, K. Chung, G. Paez, I. Rahardja “The Design of a SRAM-Based Field Programmable Gate Array, Part II: Circuit Design and Layout” in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Volume: 7, Published on: Sep. 3, 1999, pages: 321–330 and,
2) V. Baena-Lecuyer, M. A. Aguirre, A. Torralba, L. G. Franquelo, J. Faura “Decoder driven switching matrices in multicontext fpgas: area reduction and their effect on routability” In Proceedings of the 1999 IEEE International Symposium on Circuits and Systems. ISCAS'99, volume 1, pages 463–466, 1999
These solutions add a delay in the signal transmission. Moreover, since they do not require both control signals (the true value and its complement) stored in each memory cell, the full signal swing is limited by the transistor threshold voltage at both the high and low values.
On the other hand the first solution uses only n-MOS pass transistors (see
The disclosed embodiments of the present invention provide an architecture effective to combine area reduction and ideal speed performance for programmable interconnections in Programmable Gate Arrays, and considerable advantages are achieved in the case of multi-context architectures.
In accordance with one embodiment of the invention, a decoding stage is inserted between the memories containing information about the connectivity in connection blocks and the pass transistor switches, in order to considerably reduce the area occupation without affecting the latency of the signals passing through the programmable interconnections. The design thus provides an ideal compromise between the typical approach for connection block programmable interconnections and a multiplexer structure.
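The area argument above can be illustrated with a rough count of configuration memory cells. The following is only a sketch under stated assumptions, not the counting used in the specification: the function name config_cells and its parameters are hypothetical, the undecoded case is assumed to need one cell per switch per context, and the decoded case is assumed to need ceil(log2 n) cells per context with no code reserved for a "no connection" state.

```python
from math import ceil, log2


def config_cells(n_lines: int, contexts: int, decoded: bool) -> int:
    """Rough count of configuration cells for one line of pass switches.

    n_lines  -- number of routing lines reachable through the switch line
    contexts -- number of configurations stored in the array
    decoded  -- True if a decoding stage drives the switches,
                False if every switch has its own memory cell

    Assumption (not from the specification): the decoder needs
    ceil(log2(n_lines)) address bits and no extra "all off" code.
    """
    if n_lines < 2 or contexts < 1:
        raise ValueError("need at least 2 routing lines and 1 context")
    bits_per_context = ceil(log2(n_lines)) if decoded else n_lines
    return bits_per_context * contexts


if __name__ == "__main__":
    # Compare both approaches for a few switch-line widths and 4 contexts.
    for n in (4, 8, 16):
        print(n, config_cells(n, 4, decoded=False), config_cells(n, 4, decoded=True))
```

Under these assumptions the saving grows with both the number of routing lines and the number of contexts, which is consistent with the advantage claimed for multi-context architectures.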
According to an embodiment of the invention a circuitry is provided for implementing connection blocks in programmable gate arrays, more particularly for implementing connections between logic blocks and routing lines in reconfigurable gate arrays comprising connection blocks to connect inputs and outputs of different logic elements by means of connection wires, wherein each connection block comprises a single line of pass transistor switches and a decoding stage to drive said pass transistor switches.
According to a further embodiment of the invention, a circuit architecture to implement connections between logic blocks and routing lines in reconfigurable gate arrays is provided. The circuit includes connection blocks configured to connect inputs and outputs of different logic elements by means of connection wires, each connection block having a single line of pass transistor switches, and further including a decoding stage to drive the pass transistor switches.
In accordance with another embodiment of the invention, a connection block for reconfigurable gate arrays is provided. The connection block includes a plurality of connection wires coupled to an output of the logic block, each wire being coupled to a respective routing line and including a single pass transistor switch having a control terminal; and a decoding stage having a plurality of outputs, each output coupled to a respective pass transistor switch control terminal to drive the pass transistor switch and selectively couple the logic block to at least one of the routing lines.
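The behaviour of such a connection block can be sketched at the logic level. This is a minimal behavioural model, assuming a one-hot decoder and ideal switches; the names decode and connection_block are hypothetical, and an undriven routing line is represented by None. It does not model transistor-level effects such as threshold drop or delay.

```python
from typing import List, Optional


def decode(address: int, n_outputs: int) -> List[int]:
    """One-hot decoding stage: exactly one control output is high."""
    if not 0 <= address < n_outputs:
        raise ValueError("address out of range")
    return [1 if i == address else 0 for i in range(n_outputs)]


def connection_block(logic_out: int, address: int, n_lines: int) -> List[Optional[int]]:
    """Single line of pass transistor switches driven by the decoder.

    Each decoder output drives the control terminal of one switch.
    A closed switch passes logic_out to its routing line; an open
    switch leaves the line undriven (None) in this idealised model.
    """
    gates = decode(address, n_lines)
    return [logic_out if gate else None for gate in gates]


if __name__ == "__main__":
    # Logic block output 1 is routed to routing line 5 out of 8.
    print(connection_block(1, 5, 8))
```

In this model the signal crosses a single switch regardless of the address, while the decoder only determines which control terminal is asserted.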
In accordance with another embodiment of the invention, a memory architecture is provided, that includes a plurality of SRAM memory cells; at least one logic block; a plurality of routing lines; a connection block coupled between the at least one logic block and the plurality of routing lines, the connection block comprising a plurality of wires coupled to an output of the at least one logic block and each of the plurality of wires coupled to a respective routing line, the connection block further comprising a pass transistor switch in each connection wire for selectively coupling the at least one logic block to a respective routing line, the pass transistor switch including a control terminal; a decoding stage having a plurality of decoding outputs, each decoding output coupled to the control terminal of a respective pass transistor switch to drive the pass transistor switch.
The features and advantages of the architecture according to this invention will be apparent from the following description of the best embodiment of the invention given by way of non-limiting example with reference to the attached drawings.
With reference to the above drawings, a Multi-context Decoding Interconnection Architecture 1 based on the present invention is shown in
A single line 2 of pass transistor switches 3, belonging to a connection block 5 of the configurable gate array, is shown in
Decoder Diagram
The structure of the decoder 6 is particularly critical. The additional area occupation should not offset the reduced number of multi-context memories. A decoder 6, configured to provide for a minimum area occupation rather than for a time optimisation, is represented in the diagram. The latency of the decoder 6 is not a critical parameter since it occurs just after the switching from one context to the other and does not affect the critical path and the overall performance.
A schematic view of the decoder 6 according to the present invention is shown in
In prior art diagrams most of the area occupation is due to the pull-down network, which is composed of $m = \lceil \log_2 n \rceil$ n-MOS transistors in parallel for each decoder output.
On the contrary, in the solution of the invention the decoder outputs OUT0, OUT1, . . . , OUT7 are connected two by two with m-1 pass transistors, so that only one additional n-MOS transistor is required as a pull-down circuit for each output.
The following Table 1 shows, for typical decoder sizes, the total number of transistors required when a traditional pull-down network is used and when the proposed solution is used. It is clearly shown that a considerable improvement is achieved.
Table 1 below quotes the number of transistors required for different decoder sizes with the diagram of the invention and with known solutions.
It must be noted that the choice of a p-MOS pull-up tree minimises the number of p-MOS transistors and provides correct outputs for the direct drive of n-MOS switches 3. The proposed diagram could obviously suffer from long latencies, especially on the rising output edge, when the whole series of p-MOS transistors must be passed through but, as mentioned above, latency is not a critical problem for this decoder.
Multi-Context Memory Cell
Since the memory area becomes larger and larger as the number of contexts increases, a memory cell structure with a minimum silicon area occupation has been implemented. A static RAM cell 12 has been used as the most effective choice for minimising the transistor count and for supporting multi-context functions, as shown in
The basic SRAM cell 12 has only one n-MOS pass transistor 10 for writing data, another transistor 11 for reading and a bistable element 9 for storing data. Therefore, when several cells 12 are interconnected to form a multi-context memory, each SRAM cell 12 can be written while another cell is read, providing high reconfiguration flexibility in multi-context processing.
A basic circuit thus comprises six transistors: five of them have a minimum size, while only the n-MOS transistor 10, which is used for writing, must be larger. At the end of the reading line a single level-shifter buffer 13 has been added to recover the threshold voltage drop and to provide both true and complementary values to feed the following decoding stage, not shown.
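The separate write and read paths described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: the class name MultiContextCell and its methods are hypothetical, each context is modelled as one stored bit, cells are assumed to start at 0 (a real SRAM cell powers up in an undefined state), and the level-shifter buffer is modelled simply as returning the true value together with its complement.

```python
from typing import List, Tuple


class MultiContextCell:
    """Behavioural model of one configuration bit replicated per context."""

    def __init__(self, contexts: int) -> None:
        if contexts < 1:
            raise ValueError("need at least one context")
        # Assumption: every bistable element starts at 0.
        self._bits: List[int] = [0] * contexts

    def write(self, context: int, bit: int) -> None:
        """Write path (pass transistor 10): store a bit in one context."""
        if bit not in (0, 1):
            raise ValueError("bit must be 0 or 1")
        self._bits[context] = bit

    def read(self, context: int) -> Tuple[int, int]:
        """Read path (transistor 11 plus buffer 13): true and complement."""
        value = self._bits[context]
        return value, 1 - value


if __name__ == "__main__":
    cell = MultiContextCell(contexts=4)
    cell.write(2, 1)      # load a new configuration into context 2 ...
    print(cell.read(0))   # ... while context 0 remains readable
    print(cell.read(2))
```

Because writing and reading use different ports in this model, loading one context never disturbs the value read from another, which mirrors the flexibility described in the text.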
Performance Analysis
The evaluation of the interconnection structure DBM according to the invention should take into account performance times and power consumption in addition to area occupation, which is the key requirement. Comparisons of delays and power dissipation make sense only with respect to the multiplexing tree structure, since DBM interconnections provide the same performance as the typical approach shown in
As shown in
Performance Times
The evaluation of the DBM structure from the performance time point of view is carried out by comparing the delays thereof with the multiplexing tree approach delays inserting, as shown in
Buffering stages, especially in the case of multiplexing tree connections, are used in order to avoid incomplete transitions and insufficiently sharp edges.
Results clearly show that the multiplexing tree approach is seriously affected by its series of pass transistors, with delays more than three times higher than DBM delays. On the other hand, the delays in the DBM architecture are the same as those of a typical FPGA diagram (see
Power Consumption
The use of a long series of pass transistors, with many buffering stages as in the case of multiplexing tree interconnections, can result in an increase in power consumption. In fact, pass transistors cannot generate sharp edges, and this results in slow switching of the driven gates, which increases power dissipation.
In order to evaluate the power dissipation, the same path used for the time analysis must be considered, supposing a clock frequency of 50 MHz. The power consumption of all buffers along the path has been estimated and taken into account.
According to the above description, the invention achieves a plurality of advantages that can be summarised in the following features:
Area reduction: The Multi-context Decoding Interconnection Architecture ensures a considerable area reduction, related to configuration memories, whose number is considerably reduced in this solution.
Latency optimisation: The Multi-context Decoding Interconnection Architecture ensures a minimum latency, since the number of switches passed through is the lowest.
Power consumption: Minimising the number of switches passed through ensures sharper edges of the signals transmitted from one logic element to another.
This ensures a considerable power saving due to faster signal transitions.
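The latency advantage can be illustrated by counting the pass transistors a signal crosses in series. The sketch below is not taken from the specification: the function name series_pass_transistors and the structure labels are hypothetical, and the multiplexing tree is assumed to be a binary tree of 2:1 pass-transistor stages, so that its depth is ceil(log2 n). It counts stages only and does not estimate actual delays.

```python
from math import ceil, log2


def series_pass_transistors(n_lines: int, structure: str) -> int:
    """Number of pass transistors crossed in series by the signal.

    structure -- "dbm" for a decoder-driven single line of switches,
                 "mux_tree" for a binary multiplexing tree (assumption).
    """
    if n_lines < 2:
        raise ValueError("need at least 2 routing lines")
    if structure == "dbm":
        return 1  # one switch, whatever the number of routing lines
    if structure == "mux_tree":
        return ceil(log2(n_lines))  # one 2:1 stage per tree level
    raise ValueError("unknown structure")


if __name__ == "__main__":
    for n in (4, 8, 16, 32):
        print(n, series_pass_transistors(n, "dbm"), series_pass_transistors(n, "mux_tree"))
```

Under this assumption the series depth of the tree grows with the number of routing lines, whereas the decoder-driven line keeps a single switch in the signal path.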
All of the above U.S. patents, U.S. patent application publications, U.S. patent applications, foreign patents, foreign patent applications and non-patent publications referred to in this specification and/or listed in the Application Data Sheet, are incorporated herein by reference, in their entirety.
From the foregoing it will be appreciated that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of the invention. Accordingly, the invention is not limited except as by the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
MI2003A0276 | Feb 2003 | IT | national |
Number | Date | Country | |
---|---|---|---|
20040225980 A1 | Nov 2004 | US |