1. Field of the System
The present system relates to field programmable gate array (FPGA) devices. More specifically, the system relates to a synchronous first in/first out memory module for an FPGA.
2. Background
FPGAs are known in the art. An FPGA comprises any number of logic modules, an interconnect routing architecture and programmable elements that may be programmed to selectively interconnect the logic modules to one another and to define the functions of the logic modules. To implement a particular circuit function, the circuit is mapped into the array and the appropriate programmable elements are programmed to implement the necessary wiring connections that form the user circuit.
An FPGA core tile may be employed as a stand-alone FPGA, repeated in a rectangular array of core tiles, or included with other functions in a system-on-a-chip (SOC). The core FPGA tile may include an array of logic modules, and input/output modules. An FPGA circuit may also include other components such as static random access memory (SRAM) blocks. Horizontal and vertical routing channels provide interconnections between the various components within an FPGA core tile. Programmable connections are provided by programmable elements between the routing resources.
An FPGA circuit can be programmed to implement virtually any set of digital functions. Input signals are processed by the programmed circuit to produce the desired set of outputs. Such inputs flow from the user's system, through input buffers and through the circuit, and finally back out to the user's system via output buffers. The bonding pad, input buffer and output buffer combination is referred to as an input/output port (I/O). Such buffers provide any or all of the following input/output (I/O) functions: voltage gain, current gain, level translation, delay, signal isolation or hysteresis.
As stated above, many FPGA designers incorporate blocks of SRAM into their architecture. In some applications, the SRAM blocks are configured to function as a first-in/first-out (FIFO) memory. A FIFO is basically a SRAM memory with automatic read and write address generation and some additional control logic. The logic needed to implement a FIFO, in addition to the SRAM blocks, consists of address generating logic and flag generating logic.
Counters are used for address generation. Two separate counters are used in this application for independent read and write operations. By definition, a counter circuit produces a deterministic sequence of unique states. The sequence of states generated by a counter is circular such that after the last state has been reached the sequence repeats starting at the first state. The circular characteristic of a counter is utilized to generate the SRAM's write and read addresses so that data is sequenced as the first data written to the SRAM is the first data read. The size of the sequence produced by the counters is matched to the SRAM address space size. Assuming no read operation, when the write counter sequence has reached the last count, the SRAM has data written to all its addresses. Without additional control logic, further write operations would overwrite existing data starting at the first address.
Additional logic is needed to control the circular sequence of the read and write address counters in order to implement a FIFO. The control logic enables and disables the counters when appropriate and generates status flags. The read and write counters are initialized to produce a common start location. The control logic inhibits reading at any location until a write operation has been performed. When the write counter pulls ahead of the read counter by the entire length of the address space, the SRAM has data written to all its addresses. The control logic inhibits overwriting an address until its data has been read. Once the data has been read, the control permits overwriting at that address. When the read counter catches up to the write counter, the SRAM no longer contains valid data and the control logic inhibits reading until a write operation is performed.
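The counter and control behavior described above can be modeled in software. The following is a minimal Python sketch and not the circuit itself. The class name `FifoModel`, its method names, and the use of an occupancy count in place of hardware comparators are assumptions made for illustration only.

```python
class FifoModel:
    """Behavioral model of a FIFO built from a memory and two circular counters."""

    def __init__(self, depth):
        self.depth = depth
        self.mem = [None] * depth   # stands in for the SRAM block
        self.wr = 0                 # write address counter
        self.rd = 0                 # read address counter, same start location
        self.count = 0              # occupancy, kept only to model the control logic

    @property
    def empty(self):
        return self.count == 0

    @property
    def full(self):
        return self.count == self.depth

    def write(self, value):
        """Store a value; return False if the write is inhibited (memory full)."""
        if self.full:
            return False
        self.mem[self.wr] = value
        self.wr = (self.wr + 1) % self.depth   # circular counter sequence
        self.count += 1
        return True

    def read(self):
        """Return the oldest value, or None if the read is inhibited (memory empty)."""
        if self.empty:
            return None
        value = self.mem[self.rd]
        self.rd = (self.rd + 1) % self.depth   # circular counter sequence
        self.count -= 1
        return value
```

In this model a write to a full memory and a read from an empty memory are both refused, so an address is overwritten only after its data has been read.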
Output signals, known to those of ordinary skill in the art as flags, provide the system with status on the SRAM capacity available. The full and empty conditions are indicated through full and empty flags. Two additional flags are generated to warn of approaching empty or full conditions.
FPGAs have programmable logic to implement this control logic. With the availability of a SRAM block, an FPGA application may be configured to operate as a FIFO memory. Many prior art FPGAs use this approach. However, considerable FPGA gates are consumed when implementing the control logic for a FIFO in this manner and this increases the cost of the application. Also, the performance of the FIFO is likely to be limited by the speed of the control logic and not the SRAM.
Hence, there is a need for an FPGA that has dedicated logic specifically included to implement a FIFO. The FIFO logic may be included among the SRAM components in an FPGA core tile. The result is improved performance and a decrease in the silicon area needed to implement the functions, compared with implementing the FIFO function with FPGA gates.
A field programmable gate array having a plurality of random access memory blocks coupled to a plurality of dedicated first-in/first-out memory logic components and a plurality of random access memory clusters programmably coupled to the rest of the FPGA is described.
A better understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description of the invention and accompanying drawings which set forth an illustrative embodiment in which the principles of the invention are utilized.
Those of ordinary skill in the art will realize that the following description of the present invention is illustrative only and not in any way limiting. Other embodiments of the invention will readily suggest themselves to such skilled persons.
In the present disclosure, Vcc is used to define the positive power supply for the digital circuit as designed. As one of ordinary skill in the art will readily recognize, the size of a digital circuit may vary greatly depending on a user's particular circuit requirements. Thus, Vcc may change depending on the size of the circuit elements used.
Moreover, in this disclosure, various circuits and logical functions are described. It is to be understood that designations such as “1” and “0” in these descriptions are arbitrary logical designations. In a first implementation of the invention, “1” may correspond to a voltage high, while “0” corresponds to a voltage low or ground, while in a second implementation, “0” may correspond to a voltage high, while “1” corresponds to a voltage low or ground. Likewise, where signals are described, a “signal” as used in this disclosure may represent the application, or pulling “high” of a voltage to a node in a circuit where there was low or no voltage before, or it may represent the termination, or the bringing “low” of a voltage to the node, depending on the particular implementation of the invention.
Referring still to
In the present example, for illustrative purposes only, SRAM block 108 has multiple bits accessible by two independent ports: a read only port (all circuitry on the right of SRAM block 108) and a write only port (all circuitry on the left of SRAM block 108). Both ports may be independently configured in multiple words-by-bits-per-word combinations. For example, both ports may be configured as 4,096×1, 2,048×2, 1,024×4, 512×9, 256×18 and 128×36. In addition, a plurality of SRAM blocks may be cascaded together by means of busses 152, 156, 158, 174, 178, 198. In the present example, there are five enable lines for each port, one for real enable and four for higher order address bits. The ten XOR gates are used to invert or not invert the lines on a block-by-block basis, effectively making AND gates 170 and 190 decoders with programmable bubbles on the inputs. The write port is synchronous to the write clock and the read port is synchronous to the read clock. As one of ordinary skill in the art would readily recognize, the above example is illustrative only; many other configurations or memory blocks could be used.
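The decoder formed by the XOR gates and the AND gate of one port can be sketched as follows. This is an illustrative Python model only. The function names, the ordering of the five lines, and the assumption that the four higher order address bits select one of up to sixteen cascaded blocks are hypothetical and are not taken from the disclosure.

```python
def block_selected(enable_lines, invert_bits):
    """AND of five lines, each optionally inverted by its XOR gate."""
    return all(line ^ inv for line, inv in zip(enable_lines, invert_bits))


def invert_bits_for_block(index, address_bits=4):
    """Hypothetical programming: block `index` responds when the higher
    order address bits equal `index` and the real enable is 1."""
    # An address line that must be 0 has its XOR gate programmed to invert.
    addr_inverts = [1 - ((index >> i) & 1) for i in range(address_bits)]
    return [0] + addr_inverts   # the real enable line is passed through


def lines_for(enable, high_address, address_bits=4):
    """Pack the real enable and the higher order address bits into five lines."""
    return [enable] + [(high_address >> i) & 1 for i in range(address_bits)]
```

Under these assumptions, exactly one block responds to each value of the higher order address bits while the real enable is asserted, and no block responds while it is deasserted.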
Referring still to
Read data bus 250 and write data bus 252 are coupled directly to SRAM block 108. When the FIFO logic component is not active, controller bits 248 are set at 0, disabling the tri-state buffers 206, 208, 214 and 216. When the SRAM is not configured as a FIFO, all input signals originate from adjacent SRAM clusters 106. When a SRAM is configured as a FIFO, a select set of signals from the RAM cluster modules are set to high impedance and FIFO logic component 200 seizes control of the signal lines. When FIFO logic component 200 is active, it seizes control of the write enable signals 158, the read enable signals 178 and the read and write address lines 174 and 156 respectively as shown in
Counters 210 and 212 are binary counters; however, they also generate gray code. Gray code, or “single distance code,” is an ordering of 2^n binary numbers such that only one bit changes between any two consecutive elements. The binary value is sent to subtractor 222 to calculate the difference between the read and write counters for the almost full and almost empty flags. The gray code is sent to address comparators 232 and 238 as well as to tri-state buffers 214 and 216. The purpose of registers 218 and 220 is to synchronize the read counter address in 210 to the write clock signal, and the purpose of registers 224 and 226 is to synchronize the write counter address to the read clock signal for comparison purposes. Because there is no requirement that read clock signal 253 and the write clock signal be synchronous, there is no guarantee that the outputs of 210 will not be changing during the setup and hold time windows of register 218. Because of the likelihood of change during the register setup and hold time window, there is a chance of an uncertain result. The chance of an uncertain result is limited by using gray code to make sure that only one bit can change at a time. However the uncertainty on that one bit resolves itself, the register captures either the last address or the next address, and no other address, when the read and write addresses are compared.
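The single-bit-change property can be checked with a short Python sketch. The disclosure does not state how counters 210 and 212 derive gray code from the binary count; the standard reflected binary conversion is assumed here, and the function names are illustrative.

```python
def binary_to_gray(value):
    """Reflected binary (gray) code of a non-negative integer."""
    return value ^ (value >> 1)


def gray_to_binary(gray):
    """Inverse conversion, by folding in successively shifted copies."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def bits_changed(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")
```

For a 4-bit counter, the binary step from 7 to 8 changes four bits at once, while the corresponding gray codes differ in one bit only. A register that samples the gray code during that transition can therefore capture only the old value or the new one.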
When the memory is full, writing must be inhibited to prevent overwriting valid data in the SRAM. To control this, the comparison between the read and write addresses is done in the write clock (WCK) time domain since write operations are synchronous to WCK. The read address counter 210 gray code sampled two WCK cycles in the past by registers 218 and 220 is compared to the current write address counter 212 gray code by comparator 232. If the result is equal, then the SRAM may be full and writing is inhibited. There is no way to know for certain if the SRAM is really full. The read address being compared is two WCK cycles old and one or more read operations may have occurred during that time. However, by erring on the side of safety when it is possible that the memory might be full, overwriting of data can be reliably prevented.
In a similar manner, when the memory is empty, reading must be inhibited to prevent outputting invalid data from the SRAM. To control this, the comparison between the write and read addresses is done in the read clock (RCK) time domain since read operations are synchronous to RCK. The write address counter 212 gray code sampled two RCK cycles in the past by registers 224 and 226 is compared to the current read address counter 210 gray code by comparator 238. If the result is equal, then the SRAM may be empty and reading is inhibited. There is no way to know for certain if the SRAM is really empty. The write address being compared is two RCK cycles old and one or more write operations may have occurred during that time. However, by erring on the side of safety when it is possible that the memory might be empty, reading of invalid data can be reliably inhibited.
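The reason a stale address errs on the safe side can be illustrated with a simplified Python model. As assumptions made for clarity, the sketch uses free-running operation counts rather than wrapped gray-code addresses, and models each register pair as a two-register delay; the names `TwoRegisterSync`, `may_be_full` and `may_be_empty` are hypothetical.

```python
class TwoRegisterSync:
    """Two registers in series, clocked in the destination clock domain."""

    def __init__(self, initial=0):
        self.first = initial
        self.second = initial

    def clock(self, value):
        """One destination clock edge: shift the sample along and
        return the value now held in the second register."""
        self.second = self.first
        self.first = value
        return self.second


def may_be_full(write_count, stale_read_count, depth):
    """Write-domain check. A stale read count is never larger than the
    true one, so the apparent occupancy is never too small."""
    return write_count - stale_read_count >= depth


def may_be_empty(read_count, stale_write_count):
    """Read-domain check. A stale write count is never larger than the
    true one, so the apparent occupancy is never too large."""
    return stale_write_count - read_count <= 0
```

In this model, reads that have not yet passed through the synchronizer leave the full indication asserted longer than strictly necessary, which delays a write but never allows valid data to be overwritten. The empty indication behaves the same way with respect to recent writes.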
Since both a full and an empty condition are detected by equality between the read and write addresses, a way to tell the difference between the two conditions is required. This is accomplished by having an extra most significant bit (MSB) in counters 210 and 212 which is not part of the address space sent to the SRAM block (and not shown in
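One way an extra counter bit can separate the two conditions is sketched below in Python. The text available here does not spell out how the extra MSB is evaluated, so the convention used is an assumption: the memory is treated as empty when the counters match in every bit, and as full when the address bits match but the extra bits differ. Binary counter values, a 3-bit address, and all function names are likewise chosen for illustration.

```python
ADDRESS_BITS = 3                           # hypothetical address width
DEPTH = 1 << ADDRESS_BITS                  # number of memory locations
COUNTER_MASK = (1 << (ADDRESS_BITS + 1)) - 1


def address(counter):
    """Bits that would be sent to the memory as the address."""
    return counter & (DEPTH - 1)


def wrap_bit(counter):
    """Extra most significant bit, not part of the address."""
    return (counter >> ADDRESS_BITS) & 1


def increment(counter):
    """Advance a counter, wrapping after ADDRESS_BITS + 1 bits."""
    return (counter + 1) & COUNTER_MASK


def is_empty(wr, rd):
    return address(wr) == address(rd) and wrap_bit(wr) == wrap_bit(rd)


def is_full(wr, rd):
    return address(wr) == address(rd) and wrap_bit(wr) != wrap_bit(rd)
```

Under this convention the write counter reaches the same address as the read counter with the opposite extra bit after exactly `DEPTH` unread writes, so equal addresses are no longer ambiguous.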
To avoid overcomplicating the disclosure and thereby obscuring the present invention, receiver modules 312, transmitter modules 314 and buffer module 316 are not described in detail herein. The implementation of receiver modules 312 and transmitter modules 314 suitable for use according to the present system is disclosed in co-pending U.S. patent application Ser. No. 10/323,613, filed on Dec. 18, 2002, and hereby incorporated herein by reference. The implementation of buffer modules 316 suitable for use according to the present system is disclosed in U.S. Pat. No. 6,727,726, issued Apr. 27, 2004, and hereby incorporated herein by reference.
In the present example, for illustrative purposes only, the interface to each SRAM block 108 is logically one RAM cluster 106 wide and seven rows long. Thus, there is a column of seven RAM clusters 106(0) through 106(6) for every SRAM block 108. Sub-clusters 300 and 302 of RAM cluster 106(0) each have one RAM clock interface input (RC) module 304, six single ended input (RT) modules 306 and two RAM interface output (RO) modules 308 in addition to the two transmitter modules 314 and two receiver modules 312 as set forth above. Right sub-cluster 302 also has a buffer module 316. RC modules 304 in RAM cluster 106(0) select the write and read clock signals from all the HCLK and RCLK networks or from signals in either of two adjacent routed channels and determine their polarity. RC modules 304 will be discussed in greater detail below. Each RT module 306 provides a control signal to SRAM module 108 which is either routed from a single channel or tied off to logic 1 or logic 0. RO modules 308 transmit read-data or FIFO flags from SRAM module 108 into an individual output track. RT modules 306 and RO modules 308 will be discussed in greater detail below.
Sub-clusters 300 and 302 of RAM clusters 106(1-6) each have three two-input RAM channel-up/channel-down non-cascadable signal (RN) modules 310, three RO modules 308 and six two-input RAM channel-up/channel-down cascadable signal (RI) modules 309 in addition to the two transmitter modules 314 and two receiver modules 312 as set forth above. Right sub cluster 302 also has a buffer module 316. RN modules 310 and RI modules 309 provide an input signal to SRAM module 108 that can be routed from two rows, the one in which it is located and the row immediately above it.
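The module counts given above for one column of seven RAM clusters can be tallied with a short Python sketch. The dictionary layout, the abbreviations TX and RX for the transmitter and receiver modules, and the function name are illustrative assumptions; the counts themselves are those stated in the text.

```python
from collections import Counter

# Modules in each sub-cluster (300 or 302) of RAM cluster 106(0).
CLUSTER_0_SUBCLUSTER = {"RC": 1, "RT": 6, "RO": 2, "TX": 2, "RX": 2}

# Modules in each sub-cluster (300 or 302) of RAM clusters 106(1) to 106(6).
CLUSTER_1_TO_6_SUBCLUSTER = {"RN": 3, "RO": 3, "RI": 6, "TX": 2, "RX": 2}


def column_totals():
    """Total module counts for the seven RAM clusters serving one SRAM block."""
    totals = Counter()
    rows = [CLUSTER_0_SUBCLUSTER] + [CLUSTER_1_TO_6_SUBCLUSTER] * 6
    for sub_cluster in rows:
        for _ in range(2):          # left sub-cluster 300 and right sub-cluster 302
            totals.update(sub_cluster)
        totals["buffer"] += 1       # only the right sub-cluster has a buffer module
    return totals
```

With these figures, one column provides 40 RO modules, 72 RI modules, 36 RN modules, 12 RT modules and 2 RC modules per SRAM block.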
RN module 310 comprises a two-input AND gate 356 and a buffer 358. One input of two-input AND gate 356 is programmably coupled to a horizontal routing track in routing architecture row 350. The second input of two-input AND gate 356 is programmably coupled to a horizontal routing track in routing architecture row 352. The output of two-input AND gate 356 is coupled to SRAM module 108 through buffer 358.
RI module 309 comprises a two-input NAND gate 376 having the ability to select a signal from routing architecture row 150 or 152. Two-input NAND gate 376 has an output coupled to SRAM block 108 through tri-state buffer 380 and one inverted signal input of a two-input OR gate 378. Two-input OR gate 378 has a second input coupled to Vcc or ground and its output coupled to SRAM module 108 through tri-state buffer 380.
RO module 308 comprises a buffer 360 having an input coupled to FIFO control block 200 or SRAM block 108. The output of buffer 360 requires programming voltage protection and drives an output track in routing architecture row 352.
While embodiments and applications of this system have been shown and described, it would be apparent to those skilled in the art that many more modifications than mentioned above are possible without departing from the inventive concepts herein. The system, therefore, is not to be restricted except in the spirit of the appended claims.
This application is a continuation of co-pending U.S. patent application Ser. No. 10/948,010, filed Sep. 22, 2004, which is a continuation of U.S. patent application Ser. No. 10/448,259, filed May 28, 2003, now U.S. Pat. No. 6,838,902, issued Jan. 4, 2005, which are hereby incorporated by reference as if set forth herein.
Relation | Number | Date | Country
---|---|---|---
Parent | 10948010 | Sep 2004 | US
Child | 11297088 | Dec 2005 | US
Parent | 10448259 | May 2003 | US
Child | 10948010 | Sep 2004 | US