This relates to integrated circuits and, more particularly, to clock routing networks on integrated circuits.
An integrated circuit often contains clock-triggered storage elements such as digital flip-flops. These flip-flops are typically triggered using control signals such as clock signals. The integrated circuit can include a clock source that generates the clock signals for the flip-flops. In general, it is desirable for clock signals to arrive at flip-flops located in different regions of the integrated circuit at the same time. Any unintended deviation between the arrival times of clock signals at the different flip-flops is referred to as clock skew.
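As a purely illustrative sketch of the definition above, clock skew can be computed as the spread between the earliest and latest clock arrival times. The function name `clock_skew`, the flip-flop labels, and the arrival times in picoseconds are hypothetical values chosen for the example and are not taken from any embodiment.

```python
def clock_skew(arrival_times_ps):
    """Return the difference between the latest and earliest arrival times."""
    times = list(arrival_times_ps.values())
    return max(times) - min(times)


# Hypothetical arrival times (in picoseconds) at three flip-flops
# located in different regions of the same integrated circuit.
arrivals = {"ff_north": 1200, "ff_center": 1050, "ff_south": 1320}

skew_ps = clock_skew(arrivals)
print(f"clock skew: {skew_ps} ps")
```

With these assumed values, the flip-flop labeled `ff_south` receives its clock edge 270 ps after the flip-flop labeled `ff_center`, which is the skew a clock network would seek to reduce.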
In an effort to reduce clock skew, conventional integrated circuits are provided with a fixed clock tree. The fixed clock tree is a non-configurable network of routing paths that serve to route the clock signals from the clock source to the various flip-flops on the integrated circuit with minimal clock skew. This is accomplished by forming each individual clock routing path with substantially equal lengths and path delays. While a fixed clock tree offers reduced clock skew, each newer generation of integrated circuits (e.g., if differing in size) will require a complete redesign of the clock tree.
Moreover, a fixed clock tree cannot easily switch between different clock domains (i.e., a fixed clock tree will only be able to serve a given region within its coverage with a fixed latency). If a larger region of coverage is required, the fixed clock tree cannot be easily extended. If a smaller region of coverage is required, the clock routing latency cannot be easily reduced.
It is within this context that the embodiments described herein arise.
An integrated circuit with a hybrid fixed-routed clock network is provided.
In accordance with an embodiment, the hybrid clock network may include a configurable clock routing portion that routes clock signals from a clock source to a clock tree root and a fixed clock routing portion that routes the clock signals from the clock tree root to corresponding clock tree leaf nodes (e.g., to registers, counters, etc.). Arranged in this way, the configurable clock routing portion provides a common path for the clock signals, whereas the fixed clock routing portion provides divergent paths branching off from the clock tree root.
The configurable and fixed clock routing portions may be implemented using an array of logic regions (sometimes referred to as “sectors”). Each logic region may include a clock switching block, a horizontal routing segment, a vertical routing segment, and associated programmable logic circuitry. The clock switching block may include four 4:1 multiplexers that receive the same clock signals.
The horizontal/vertical routing segments that are used to implement the configurable routing portion may include bidirectional tristate buffers. On the other hand, horizontal/vertical routing segments that are used to implement the fixed routing portion may include simple inverters that exhibit less latency than the tristate buffers. The fixed clock routing portion may also be arranged in an H-tree mesh. A hybrid clock network formed in this way can be used to provide coverage for any suitable region on an integrated circuit.
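The equal-length property of an H-tree arrangement can be illustrated with a short sketch. The recursion below is a generic textbook H-tree construction, assumed here for illustration only; the function name `h_tree_leaves`, the starting arm length, and the number of levels are hypothetical and do not describe the layout of any particular embodiment.

```python
def h_tree_leaves(x, y, arm, levels, path_len=0.0, horizontal=True):
    """Return (leaf_x, leaf_y, path_length) for every leaf of a binary H-tree.

    Each level branches in two opposite directions, alternating between
    horizontal and vertical. The arm length is halved after each vertical
    step, which is the usual H-tree scaling (an assumption of this sketch).
    """
    if levels == 0:
        return [(x, y, path_len)]
    leaves = []
    for sign in (-1, 1):
        next_x = x + sign * arm if horizontal else x
        next_y = y if horizontal else y + sign * arm
        next_arm = arm if horizontal else arm / 2
        leaves += h_tree_leaves(next_x, next_y, next_arm, levels - 1,
                                path_len + arm, not horizontal)
    return leaves


# Hypothetical tree: root at the origin, first arm of 4 units, 4 levels.
leaves = h_tree_leaves(0.0, 0.0, 4.0, 4)
lengths = {length for _, _, length in leaves}
print(len(leaves), "leaves, distinct path lengths:", lengths)
```

In this sketch every leaf is reached through the same sequence of arm lengths, so all root-to-leaf routes have identical wire length, which is the property that allows a fixed tree to keep path delays substantially equal.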
Further features of the invention, its nature and various advantages will be more apparent from the accompanying drawings and following detailed description.
Embodiments of the present invention relate to integrated circuits and, in particular, to programmable integrated circuits with clock distribution networks.
A programmable integrated circuit may include an array of programmable logic regions (sometimes referred to as logic “sectors”). Each of the programmable sectors in the array may include circuitry for implementing configurable clock routing paths and/or fixed clock routing paths. To implement a global clock routing network, a first portion of the global clock routing network may include configurable clock routing paths, whereas a second portion of the global clock routing network may include fixed clock routing paths. To implement regional, peripheral, or other smaller clock domains on the integrated circuit, only configurable clock routing paths might be used. Arranged in this way, the configurable portion of the clock routing network provides flexibility and composability while the fixed portion of the clock routing network provides reduced latency and increased tolerance to uneven transistor aging/variability.
It will be recognized by one skilled in the art that the present exemplary embodiments may be practiced without some or all of these specific details. In other instances, well-known operations have not been described in detail in order not to unnecessarily obscure the present embodiments.
An illustrative integrated circuit such as a programmable logic device (PLD) 100 is shown in
Programmable device 100 may contain programmable memory elements. Memory elements may be loaded with configuration data (also called programming data) using input/output elements (IOEs) 102. Once loaded, the memory elements each provide a corresponding static control signal that controls the operation of an associated functional block (e.g., LABs 110, DSP 120, RAM 130, or input/output elements 102).
In a typical scenario, the outputs of the loaded memory elements are applied to the gates of metal-oxide-semiconductor transistors in a functional block to turn certain transistors on or off and thereby configure the logic in the functional block including the routing paths. Programmable logic circuit elements that may be controlled in this way include parts of multiplexers (e.g., multiplexers used for forming routing paths in interconnect circuits), look-up tables, logic arrays, AND, OR, NAND, and NOR logic gates, pass gates, etc.
The memory elements may use any suitable volatile and/or non-volatile memory structures such as random-access-memory (RAM) cells, fuses, antifuses, programmable read-only-memory memory cells, mask-programmed and laser-programmed structures, combinations of these structures, etc. Because the memory elements are loaded with configuration data during programming, the memory elements are sometimes referred to as configuration memory, configuration RAM (CRAM), or programmable memory elements. The CRAM cells may be different than RAM blocks 130 in the sense that CRAM cells store configuration data that remains relatively constant while RAM blocks 130 store user data that can change often during normal operation of device 100.
In addition, the programmable logic device may have input/output elements (IOEs) 102 for driving signals off of PLD 100 and for receiving signals from other devices. Input/output elements 102 may include parallel input/output circuitry, serial data transceiver circuitry, differential receiver and transmitter circuitry, or other circuitry used to connect one integrated circuit to another integrated circuit. As shown, input/output elements 102 may be located around the periphery of the chip. If desired, the programmable logic device may have input/output elements 102 arranged in different ways. For example, input/output elements 102 may form one or more columns of input/output elements that may be located anywhere on the programmable logic device (e.g., distributed evenly across the width of the PLD). If desired, input/output elements 102 may form one or more rows of input/output elements (e.g., distributed across the height of the PLD). Alternatively, input/output elements 102 may form islands of input/output elements that may be distributed over the surface of the PLD or clustered in selected areas.
The PLD may also include programmable interconnect circuitry in the form of vertical routing channels 140 (i.e., interconnects formed along a vertical axis of PLD 100) and horizontal routing channels 150 (i.e., interconnects formed along a horizontal axis of PLD 100), each routing channel including at least one track to route at least one wire. If desired, the interconnect circuitry may include pipeline elements, and the contents stored in these pipeline elements may be accessed during operation. For example, a programming circuit may provide read and write access to a pipeline element.
Note that other routing topologies, besides the topology of the interconnect circuitry depicted in
Furthermore, it should be understood that embodiments may be implemented in any integrated circuit. If desired, the functional blocks of such an integrated circuit may be arranged in more levels or layers in which multiple functional blocks are interconnected to form still larger blocks. Other device arrangements may use functional blocks that are not arranged in rows and columns.
In accordance with an embodiment, programmable integrated circuit 100 may include a clock distribution network such as clock distribution network 200 that routes clock signals to various portions of IC die 100 (see
Clock distribution network 200 may include routing paths for routing the selected clock signal(s) to different locations on integrated circuit 100. For example, integrated circuit 100 may include clocked storage elements such as registers 206 formed at different physical locations on integrated circuit 100. Registers 206 may be controlled using the clock signals routed through clock distribution network 200. In order to ensure that all clock signals arrive at the different registers 206 at roughly the same time, clock distribution network 200 may be configured in a mesh-like structure to minimize clock skew between the different clock routing paths.
Clock distribution network 300 can sometimes be referred to as a clock “tree.” A clock tree may include common paths and divergent paths. Common paths represent all routing paths linking the clock source to a clock tree “root” (e.g., every clock signal traveling through network 300 needs to traverse the common paths regardless of its final destination). Divergent paths represent all routing paths branching off separately from a clock tree root, linking the root to a corresponding clock tree “leaf” node 306. Each leaf node 306 may be a register, counter, or other circuit that can be controlled by a clock signal. Since configurable clock routing circuitry 302 is more susceptible to random clock tree skew induced by uneven transistor aging, it may be desirable to implement the common paths linking clock source 202 to root(s) 303 using configurable clock network 302, where any added delay is shared by every leaf node and therefore does not contribute to skew. From the clock root(s) on, the clock tree may be stripped of all its configurable elements to improve divergent path latency and to form fixed clock network(s) 304.
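The reasoning for placing the configurable circuitry on the common paths can be illustrated with a small numerical sketch. The delay values, leaf labels, and function names below are hypothetical; the sketch only assumes that the arrival time at a leaf is the sum of the common path delay and that leaf's divergent path delay.

```python
def leaf_arrivals(common_delay_ps, divergent_delays_ps):
    """Arrival time at each leaf = shared common delay + its own divergent delay."""
    return {leaf: common_delay_ps + d for leaf, d in divergent_delays_ps.items()}


def skew(arrivals_ps):
    """Spread between the latest and earliest leaf arrival times."""
    return max(arrivals_ps.values()) - min(arrivals_ps.values())


# Hypothetical divergent (root-to-leaf) delays in picoseconds.
divergent = {"leaf_a": 400, "leaf_b": 450, "leaf_c": 420}

# The same tree with a nominal common path and with a slower common path
# (for example, a configurable source-to-root route that has aged).
nominal = leaf_arrivals(1000, divergent)
aged = leaf_arrivals(1600, divergent)

print("skew, nominal common path:", skew(nominal), "ps")
print("skew, slower common path: ", skew(aged), "ps")
```

In this sketch the skew is 50 ps in both cases: a change in the common path delay shifts every arrival time by the same amount, so only variation on the divergent paths affects skew.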
Because the fixed portion of the hybrid clock tree begins at the first divergent point of the tree (i.e., at a root 303), the impact of transistor variability is mitigated. Additionally, the transistors in all fixed branches 304 will age equally and thus will not introduce skew. The use of fixed clock network 304 can also help reduce power supply jitter from the overall clock insertion delay, which is roughly quartered in register setup cases and halved in register hold cases.
Clock routing network 300 of
Clock switching block 410 may serve to route clock signals between adjoining vertical routing segments 412 and horizontal routing segments 414 between adjacent logic regions 402.
Each multiplexer 500 (e.g., multiplexers 500-1, 500-2, 500-3, and 500-4) may receive signals FN, FS, FE, and FW and may be configured using static control bits stored on memory elements 510. Depending on the bits currently stored in memory elements 510, multiplexer 500 may route signals from a selected one of its four inputs to its output. Output drivers 502 (e.g., buffers 502-1, 502-2, 502-3, and 502-4) may be tristate buffers that are controlled using static control bits stored on memory elements 512. For example, if the control bit in memory element 512 is asserted, corresponding driver 502 may be activated. If, however, the control bit in memory element 512 is deasserted, the corresponding driver 502 may be switched out of use (i.e., temporarily disabled). Memory elements 510 and 512 may (for example) be volatile RAM elements or non-volatile storage elements.
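The behavior described above can be sketched as a simple software model. This is an illustrative approximation only: the two-bit select encoding, the use of `None` to represent a disabled (high-impedance) driver, and all function names are assumptions made for the example and are not specified by the embodiments.

```python
INPUTS = ("FN", "FS", "FE", "FW")


def mux4(select_bits, signals):
    """4:1 multiplexer: two static select bits pick one of FN, FS, FE, FW.

    The bit-to-input encoding used here is an assumption of this sketch.
    """
    index = (select_bits[0] << 1) | select_bits[1]
    return signals[INPUTS[index]]


def tristate(enable_bit, value):
    """Tristate driver: pass the value when enabled, else high impedance (None)."""
    return value if enable_bit else None


def clock_switching_block(signals, select_510, enable_512):
    """Model four multiplexers 500, each followed by an output driver 502."""
    return [tristate(en, mux4(sel, signals))
            for sel, en in zip(select_510, enable_512)]


# Hypothetical instantaneous clock levels on the four inputs.
signals = {"FN": 1, "FS": 0, "FE": 0, "FW": 1}
# Hypothetical static configuration bits (memory elements 510 and 512).
select_510 = [(0, 0), (0, 1), (1, 0), (1, 1)]
enable_512 = [1, 1, 0, 1]

outputs = clock_switching_block(signals, select_510, enable_512)
print(outputs)  # [1, 0, None, 1]
```

Here the third driver is disabled by its control bit, so its output is modeled as undriven regardless of which input its multiplexer selects.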
While
The example of
The premise is the same whether implementing large or small clock trees; lower latency routing segments (e.g., the fixed routing segments shown in
The embodiments thus far have been described with respect to integrated circuits. The methods and apparatuses described herein may be incorporated into any suitable circuit. For example, they may be incorporated into numerous types of devices such as programmable logic devices, application specific standard products (ASSPs), and application specific integrated circuits (ASICs). Examples of programmable logic devices include programmable array logic (PALs), programmable logic arrays (PLAs), field programmable logic arrays (FPLAs), electrically programmable logic devices (EPLDs), electrically erasable programmable logic devices (EEPLDs), logic cell arrays (LCAs), complex programmable logic devices (CPLDs), and field programmable gate arrays (FPGAs), just to name a few.
The programmable logic device described in one or more embodiments herein may be part of a data processing system that includes one or more of the following components: a processor; memory; IO circuitry; and peripheral devices. The data processing system can be used in a wide variety of applications, such as computer networking, data networking, instrumentation, video processing, digital signal processing, or any other suitable application where the advantage of using programmable or re-programmable logic is desirable. The programmable logic device can be used to perform a variety of different logic functions. For example, the programmable logic device can be configured as a processor or controller that works in cooperation with a system processor. The programmable logic device may also be used as an arbiter for arbitrating access to a shared resource in the data processing system. In yet another example, the programmable logic device can be configured as an interface between a processor and one of the other components in the system. In one embodiment, the programmable logic device may be one of the family of devices owned by ALTERA/INTEL Corporation.
The foregoing is merely illustrative of the principles of this invention and various modifications can be made by those skilled in the art. The foregoing embodiments may be implemented individually or in any combination.