This application claims priority to U.S. Nonprovisional patent application Ser. No. 12/125,009 filed May 21, 2008, and entitled “Resonant Clock and Interconnect Architecture for Digital Devices with Multiple Clock Networks” by Alexander T. Ishii, et al., which claims priority to U.S. Provisional Patent Application No. 60/931,582 filed May 23, 2007, and entitled “Resonant Clock and Interconnect Architecture for Programmable Logic Devices,” by Alexander Ishii, et al., both of which are hereby incorporated herein by reference.
This disclosure is related to the technologies described in U.S. Pat. No. 6,879,190 (“Low-power driver with energy recovery”), U.S. Pat. No. 6,777,992 (“Low-power CMOS flip-flop”), U.S. Pat. No. 6,742,132 (“Method and apparatus for generating a clock signal having a driven oscillator circuit formed with energy storage characteristics of a memory storage device”), U.S. Patent Application No. 20070096957 (“Ramped clock digital storage control”), U.S. Pat. No. 7,355,454 (“Energy recovery boost logic”), U.S. patent application Ser. No. 11/949,664 (“Clock distribution network architecture for resonant-clocked systems”), U.S. patent application Ser. No. 11/949,669 (“Clock distribution network architecture with resonant clock gating”), and U.S. patent application Ser. No. 11/949,673 (“Clock distribution network architecture with clock skew management”), the entire disclosures of which are hereby incorporated by reference.
This disclosure relates generally to clock and data distribution network architectures for programmable logic devices (PLDs) such as field programmable gate arrays (FPGAs). It also relates generally to clock distribution network architectures for digital devices with multiple clock networks and various clock frequencies such as microprocessors, application-specific integrated circuits (ASICs), and System-on-a-Chip (SOC) devices.
Resonant drivers have recently been proposed for the energy-efficient distribution of signals in synchronous digital systems. For example, in the context of clock distribution networks, energy efficient operation with resonant drivers is achieved using an inductor to resonate the parasitic capacitance of the clock distribution network. Clock distribution with extremely low jitter is achieved through the elimination of buffers. Moreover, extremely low skew is achieved among the distributed clock signals through the design of relatively symmetric distribution networks. Network performance depends on operating speed and overall network inductance, resistance, size, and topology, with lower-resistance symmetric networks resulting in lower jitter, skew, and energy consumption when designed with adequate inductance.
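By way of illustration only, the relationship between the inductor and the parasitic clock capacitance can be sketched with the standard lumped LC resonance formula, f0 = 1/(2π√(LC)). The function names and the numeric values below (a 100 pF clock load and a 1 GHz target) are hypothetical and are not taken from this disclosure; a real network is distributed and lossy, so this is only a first-order estimate.

```python
import math


def resonant_frequency_hz(inductance_h, capacitance_f):
    """Natural frequency of an ideal lumped LC tank: 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def inductance_for_frequency_h(target_hz, capacitance_f):
    """Inductance that resonates a given capacitance at the target frequency."""
    return 1.0 / ((2.0 * math.pi * target_hz) ** 2 * capacitance_f)


# Hypothetical example values (assumptions, not from the disclosure).
c_clock = 100e-12   # parasitic clock network capacitance, farads
f_target = 1e9      # desired clock frequency, hertz

l_needed = inductance_for_frequency_h(f_target, c_clock)
print(f"inductance needed: {l_needed * 1e9:.3f} nH")
print(f"check frequency:   {resonant_frequency_hz(l_needed, c_clock) / 1e9:.3f} GHz")
```

Under these assumed values the required inductance is roughly a quarter of a nanohenry, which suggests why the total network inductance is listed above as a factor in network performance.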
The distribution of clock and data signals presents a particular challenge in the context of FPGAs, resulting in limited operating speeds and high energy dissipation. Typically, FPGAs deploy multiple clock networks, operating at various clock frequencies. To ensure a high degree of programmability, FPGAs typically provide the means for connecting any storage device (flip-flop) in the FPGA to any of these multiple clock networks. Moreover, all clock networks must be distributed across the entire FPGA. The resulting clock distribution networks are thus highly complex, resulting in relatively lower operating speeds. To exacerbate the situation, the large size and high complexity of these clock networks require the extensive deployment of sophisticated power management techniques such as clock gating, so that overall power consumption is kept at acceptable levels. These power management techniques result in additional design complexity, increased uncertainty in signal timing, and therefore additional limitations to operating speeds.
To maximize programming flexibility, FPGAs typically include one or more large-scale networks for distributing data across the entire device. These networks comprise multiple programmable switches to provide for selective connectivity among the logic blocks in the FPGA. They also include multiple long interconnects that typically rely on multiple buffers (repeaters) to propagate data. The high complexity of these networks results in increased uncertainty in signal timing, limiting operating speeds. The extensive deployment of buffers results in increased energy dissipation. To exacerbate the situation, these networks are often pipelined to provide for higher data transfer rates, resulting in even higher complexity and energy dissipation.
In addition to FPGA devices, multiple clock networks operating at various clock frequencies are generally deployed in microprocessor, ASIC, and SOC designs to implement complex computations and achieve high performance. These clock networks are distributed across the entire device and make extensive use of power management techniques such as clock gating to keep power consumption at acceptable levels. They are therefore highly complex, and their maximum achievable performance is limited by increased timing uncertainty.
One disclosure of design methods for resonant clock networks can be found in U.S. Pat. No. 5,734,285 (“Electronic circuit utilizing resonance technique to drive clock inputs of function circuitry for saving power”). A single resonant domain is described along with methods for synthesizing harmonic clock waveforms that include the fundamental clock frequency and a small number of higher-order harmonics. The patent also describes clock generators that are driven at a reference frequency, forcing the entire resonant clock network to operate at that frequency. However, the described methods do not address clock network architectures or scaling issues that encompass the requirements of FPGA devices. Moreover, the patent is not concerned with devices that include multiple clock networks operating at various clock frequencies.
Another disclosure of design methods for resonant clock networks can be found in U.S. Pat. No. 6,882,182 (“Tunable clock distribution system for reducing power dissipation”). A method is described for using inductance and capacitance to tune the frequency of a clock distribution network in a programmable logic device. This method focuses on frequency tuning and does not address any clock scaling issues that encompass the requirements of large FPGA devices. Moreover, it does not disclose any clock network architectures for FPGAs.
Resonant clock network designs for local clocking (i.e., for driving flip-flops or latches) are described and empirically evaluated in the following articles: “A 225 MHz Resonant Clocked ASIC Chip,” by Ziesler, C., et al., International Symposium on Low-Power Electronic Design, August 2003; “Energy Recovery Clocking Scheme and Flip-Flops for Ultra Low-Energy Applications,” by Cooke, M., et al., International Symposium on Low-Power Electronic Design, August 2003; “Resonant Clocking Using Distributed Parasitic Capacitance,” by Drake, A., et al., Journal of Solid-State Circuits, Vol. 39, No. 9, September 2004; and “Resonant-Clock Latch-Based Design,” by Sathe, V., et al., Journal of Solid-State Circuits, Vol. 43, No. 4, April 2008. The designs set forth in these papers are directed to a single resonant domain, however, and do not describe the design of large-scale chip-wide resonant clock network architectures for FPGAs or other devices with multiple clock networks and various clock frequencies.
The design and evaluation of resonant clocking for high-frequency global clock networks was addressed in “Design of Resonant Global Clock Distributions,” by Chan, S., et al., International Conference on Computer Design, October 2003, “A 4.6 GHz Resonant Global Clock Distribution Network,” by Chan, S., et al., International Solid-State Circuits Conference, February 2004, and “1.1 to 1.6 GHz Distributed Differential Oscillator Global Clock Network,” by Chan, S., et al., International Solid-State Circuits Conference, February 2005. These articles focus on global clocking, however, and do not provide any methods for designing a large-scale resonant network that distributes clock signals with high energy efficiency all the way to the individual flip-flops in an FPGA device. Moreover, they are not directed to FPGAs or other devices with multiple clock networks and various clock frequencies.
Another approach for addressing the speed limitations of current FPGA devices is the use of asynchronous logic design. In this approach, clocks are eliminated from the device, and computations are coordinated through the deployment of handshake circuitry. A design for asynchronous FPGAs is described in “Highly Pipelined Asynchronous FPGAs” by Teifel, J., et al., ACM FPGA Conference, 2004. The design and evaluation of a small-scale asynchronous FPGA prototype is described in “A High Performance Asynchronous FPGA: Test Results” by Fang, D., et al., IEEE Symposium on Field Programmable Custom Computing Machines, 2005. A significant drawback of asynchronous FPGAs is the challenge of verifying that the design meets performance requirements under worst-case conditions. FPGA tools are not tailored to perform worst-case timing analysis of a logic structure having multiple clocks. For complex asynchronous structures, checking the worst-case timing of each clock and datapath to verify that worst-case timing constraints are met is an extremely tedious, if not nearly impossible, task. Other drawbacks of asynchronous FPGAs include the difficulty in interfacing with conventional synchronous designs and the difficulty in ensuring during testing that they meet worst-case performance requirements under all operating conditions (temperature, supply voltage, etc.). With regard to energy consumption, asynchronous circuitry still dissipates the CV² energy that is required to charge and discharge a capacitive load. It therefore dissipates more energy than resonant drivers when used to drive a signal over capacitive interconnect across an FPGA device.
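As a rough, illustrative comparison of the two energy figures discussed above, the sketch below contrasts the CV² energy of a conventional full-swing charge and discharge with an idealized resonant estimate. The resonant estimate is an assumption not stated in this disclosure: it models the load voltage as a sinusoid with peak-to-peak swing V in a resonator of quality factor Q, so that the loss per cycle is 2π/Q times the peak stored energy ½·C·(V/2)². All function names and numeric values are hypothetical.

```python
import math


def conventional_energy_j(capacitance_f, swing_v):
    """Energy dissipated per cycle when a load is charged and discharged
    through a conventional (non-resonant) driver: C * V^2."""
    return capacitance_f * swing_v ** 2


def resonant_energy_j(capacitance_f, swing_v, quality_factor):
    """Idealized per-cycle loss of a resonant driver (assumed model).

    Assumes a sinusoidal load voltage with peak-to-peak swing `swing_v`,
    so the oscillating amplitude is swing_v / 2, and applies the textbook
    definition Q = 2*pi * (peak stored energy) / (energy lost per cycle).
    """
    peak_stored = 0.5 * capacitance_f * (swing_v / 2.0) ** 2
    return 2.0 * math.pi * peak_stored / quality_factor


# Hypothetical example values (assumptions, not from the disclosure).
c_load = 10e-12   # interconnect capacitance, farads
v_dd = 1.0        # voltage swing, volts
q = 5.0           # assumed resonator quality factor

e_conv = conventional_energy_j(c_load, v_dd)
e_res = resonant_energy_j(c_load, v_dd, q)
print(f"resonant / conventional energy ratio: {e_res / e_conv:.3f}")
```

Under this model the ratio reduces to π/(4Q), so the benefit grows as the network resistance falls and Q rises, in keeping with the earlier observation that lower-resistance networks consume less energy.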
A clock and data distribution network is proposed that uses resonant drivers to distribute clock and data signals without buffers, thus achieving low jitter, skew, and energy consumption, and relaxed timing requirements. Such a network is generally applicable to architectures for programmable logic devices (PLDs) such as field programmable gate arrays (FPGAs), as well as other semiconductor devices with multiple clock networks and various clock frequencies, and high-performance and low-power clocking requirements such as microprocessors, ASICs, and SOCs.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
a)-(b) illustrate functioning of the boost driver in accordance with one aspect of the disclosure.
Beyond FPGAs, the disclosed resonant clock architecture is applicable to other semiconductor devices with high-performance and low-power clocking requirements such as microprocessors, ASICs, and SOCs. In these devices, the disclosed resonant clock architecture with its multiple enable signals EN2, . . . , ENN provides a higher-performance, lower-power, and lower-complexity alternative to clock gating.
The timing requirements of the signals EN2, . . . , ENN are significantly less stringent than those of the conventional clock signals CLK2, . . . , CLKN. Therefore, the resonant clock network architecture shown in
Another factor that contributes to the high performance of the disclosed resonant clock network architecture is its high energy efficiency. Specifically, due to its intrinsically higher energy efficiency, the resonant clock network architecture enables the deeper pipelining of data paths and data interconnect. In conventional clock networks, the introduction of additional clocked pipeline stages (e.g., flip-flops) raises energy dissipation to prohibitively high levels.
An example of a buffer-less clock distribution network for each resonant clock domain from
A possible implementation of a dual-rail resonant driver as the boost driver shown in
At the end of each bit-line in
In an alternative implementation of the resonant interconnect architecture, the dual-rail drivers are replaced by single-rail drivers that use a single resonant waveform φ. In this case, straightforward amplitude-based encoding is used with a single rail per bit. In another alternative implementation, the resonant drivers operate in a “pulsed” mode, rather than in a steady-state oscillation, using a capacitive tank to store charge when not transmitting data. In this case, the waveform resulting on the bit-line is the transient response of the RLC network formed by the driver and the interconnect.
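For intuition about the “pulsed” mode, the transient response of an RLC network can be sketched with the textbook step response of an underdamped lumped series RLC circuit. Treating the driver and interconnect as a single lumped series RLC excited by an ideal voltage step is an assumption made here for illustration; the disclosure does not specify this model, and the function name and component values are hypothetical.

```python
import math


def rlc_step_response_v(t_s, step_v, r_ohm, l_h, c_f):
    """Capacitor voltage of an underdamped series RLC driven by an ideal step.

    v(t) = V * (1 - exp(-a*t) * (cos(wd*t) + (a/wd) * sin(wd*t)))
    with a = R / (2L) and wd = sqrt(1/(L*C) - a^2).
    """
    alpha = r_ohm / (2.0 * l_h)
    omega0_sq = 1.0 / (l_h * c_f)
    if alpha * alpha >= omega0_sq:
        raise ValueError("network is not underdamped for these values")
    omega_d = math.sqrt(omega0_sq - alpha * alpha)
    envelope = math.exp(-alpha * t_s)
    ringing = math.cos(omega_d * t_s) + (alpha / omega_d) * math.sin(omega_d * t_s)
    return step_v * (1.0 - envelope * ringing)


# Hypothetical lumped values (assumptions, not from the disclosure).
R, L, C = 1.0, 1e-9, 1e-12   # ohms, henries, farads

for t_ps in (0, 25, 50, 100, 200):
    v = rlc_step_response_v(t_ps * 1e-12, 1.0, R, L, C)
    print(f"t = {t_ps:3d} ps  v = {v:.3f} V")
```

In this simplified picture the bit-line voltage overshoots the step and rings at the damped natural frequency, with the decay rate set by the resistance of the network.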
In the disclosed resonant interconnect architecture, it is possible to significantly reduce interconnect overheads by multiplexing multiple bits, so that they are transmitted over the same physical rail, as illustrated in
Number | Name | Date | Kind |
---|---|---|---|
4611135 | Nakayama et al. | Sep 1986 | A |
5023480 | Gieseke et al. | Jun 1991 | A |
5036217 | Rollins et al. | Jul 1991 | A |
5111072 | Seidel | May 1992 | A |
5122679 | Ishii et al. | Jun 1992 | A |
5146109 | Martignoni et al. | Sep 1992 | A |
5311071 | Ueda | May 1994 | A |
5332916 | Hirai | Jul 1994 | A |
5384493 | Furuki | Jan 1995 | A |
5396527 | Schlecht et al. | Mar 1995 | A |
5410491 | Minami | Apr 1995 | A |
5430408 | Ovens et al. | Jul 1995 | A |
5473526 | Svensson et al. | Dec 1995 | A |
5489866 | Diba | Feb 1996 | A |
5504441 | Sigal | Apr 1996 | A |
5506520 | Frank et al. | Apr 1996 | A |
5506528 | Cao et al. | Apr 1996 | A |
5508639 | Fattaruso | Apr 1996 | A |
5517145 | Frank | May 1996 | A |
5517399 | Yamauchi et al. | May 1996 | A |
5526319 | Dennard et al. | Jun 1996 | A |
5537067 | Carvajal et al. | Jul 1996 | A |
5559463 | Denker et al. | Sep 1996 | A |
5559478 | Athas et al. | Sep 1996 | A |
5587676 | Chowdhury | Dec 1996 | A |
5675263 | Gabara | Oct 1997 | A |
5701093 | Suzuki | Dec 1997 | A |
5734285 | Harvey | Mar 1998 | A |
5760620 | Doluca | Jun 1998 | A |
5838203 | Stamoulis et al. | Nov 1998 | A |
5841299 | De | Nov 1998 | A |
5872489 | Chang et al. | Feb 1999 | A |
5892387 | Shigehara et al. | Apr 1999 | A |
5896054 | Gonzalez | Apr 1999 | A |
5970074 | Ehiro | Oct 1999 | A |
5986476 | De | Nov 1999 | A |
5999025 | New | Dec 1999 | A |
6009021 | Kioi | Dec 1999 | A |
6009531 | Selvidge et al. | Dec 1999 | A |
6011441 | Ghoshal | Jan 2000 | A |
6037816 | Yamauchi | Mar 2000 | A |
6052019 | Kwong | Apr 2000 | A |
6069495 | Ciccone et al. | May 2000 | A |
6091629 | Osada et al. | Jul 2000 | A |
6150865 | Fluxman et al. | Nov 2000 | A |
6160422 | Huang | Dec 2000 | A |
6169443 | Shigehara et al. | Jan 2001 | B1 |
6177819 | Nguyen | Jan 2001 | B1 |
6230300 | Takano | May 2001 | B1 |
6242951 | Nakata et al. | Jun 2001 | B1 |
6278308 | Partovi et al. | Aug 2001 | B1 |
6323701 | Gradinariu et al. | Nov 2001 | B1 |
RE37552 | Svensson et al. | Feb 2002 | E |
6433586 | Ooishi | Aug 2002 | B2 |
6438422 | Schu et al. | Aug 2002 | B1 |
6477658 | Pang | Nov 2002 | B1 |
6538346 | Pidutti et al. | Mar 2003 | B2 |
6542002 | Jang et al. | Apr 2003 | B2 |
6559681 | Wu et al. | May 2003 | B1 |
6563362 | Lambert | May 2003 | B2 |
6608512 | Ta et al. | Aug 2003 | B2 |
6720815 | Mizuno | Apr 2004 | B2 |
6742132 | Ziesler et al. | May 2004 | B2 |
6777992 | Ziesler et al. | Aug 2004 | B2 |
6856171 | Zhang | Feb 2005 | B1 |
6879190 | Kim et al. | Apr 2005 | B2 |
6882182 | Conn et al. | Apr 2005 | B1 |
7005893 | Athas et al. | Feb 2006 | B1 |
7145408 | Shepard et al. | Dec 2006 | B2 |
7215188 | Ramaraju et al. | May 2007 | B2 |
7227425 | Jang et al. | Jun 2007 | B2 |
7233186 | Ishimi | Jun 2007 | B2 |
7301385 | Takano et al. | Nov 2007 | B2 |
7307486 | Pernia et al. | Dec 2007 | B2 |
7355454 | Papaefthymiou et al. | Apr 2008 | B2 |
7622997 | Amato et al. | Nov 2009 | B2 |
7719316 | Chueh et al. | May 2010 | B2 |
7719317 | Chueh et al. | May 2010 | B2 |
7956664 | Chueh et al. | Jun 2011 | B2 |
7973565 | Ishii et al. | Jul 2011 | B2 |
20010013795 | Nojiri | Aug 2001 | A1 |
20020140487 | Fayneh et al. | Oct 2002 | A1 |
20030189451 | Ziesler et al. | Oct 2003 | A1 |
20050057286 | Shepard et al. | Mar 2005 | A1 |
20050114820 | Restle | May 2005 | A1 |
20060082387 | Papaefthymiou et al. | Apr 2006 | A1 |
20060152293 | McCorquodale et al. | Jul 2006 | A1 |
20070096957 | Papaefthymiou et al. | May 2007 | A1 |
20070168786 | Drake et al. | Jul 2007 | A1 |
20070216462 | Ishimi | Sep 2007 | A1 |
20080136479 | You et al. | Jun 2008 | A1 |
20080150605 | Chueh et al. | Jun 2008 | A1 |
20080150606 | Kumata | Jun 2008 | A1 |
20080164921 | Shin | Jul 2008 | A1 |
20080303576 | Chueh et al. | Dec 2008 | A1 |
20090027085 | Ishii et al. | Jan 2009 | A1 |
20110084736 | Papaefthymiou et al. | Apr 2011 | A1 |
20110084772 | Papaefthymiou et al. | Apr 2011 | A1 |
20110084773 | Papaefthymiou et al. | Apr 2011 | A1 |
20110084774 | Papaefthymiou et al. | Apr 2011 | A1 |
20110084775 | Papaefthymiou et al. | Apr 2011 | A1 |
20110090018 | Papaefthymiou et al. | Apr 2011 | A1 |
20110090019 | Papaefthymiou et al. | Apr 2011 | A1 |
20110109361 | Nishio | May 2011 | A1 |
20110140753 | Papaefthymiou et al. | Jun 2011 | A1 |
20110210761 | Ishii et al. | Sep 2011 | A1 |
20110215854 | Chueh et al. | Sep 2011 | A1 |
Number | Date | Country |
---|---|---|
0953892 | Nov 1999 | EP |
1126612 | Aug 2001 | EP |
1764669 | Mar 2007 | EP |
63246865 | Oct 1988 | JP |
7321640 | Dec 1995 | JP |
3756285 | Jan 2006 | JP |
2005092042 | Oct 2005 | WO |
Entry |
---|
Athas et al., “Low-Power Digital Systems Based on Adiabatic-Switching Principles,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 2, No. 4, pp. 398-407, Dec. 1994. |
Chan et al., “1.1 to 1.6GHz Distributed Differential Oscillator Global Clock Network,” International Solid-State Circuits Conference, pp. 518-519, Feb. 9, 2005. |
Chan et al., “A 4.6GHz Resonant Global Clock Distribution Network,” IEEE International Solid-State Circuits Conference, Feb. 18, 2004. |
Chan et al., “A Resonant Global Clock Distribution for the Cell Broadband Engine Processor,” IEEE Journal of Solid State Circuits, vol. 44, No. 1, pp. 64-72, Jan. 2009. |
Chan et al., “Design of Resonant Global Clock Distributions,” Proceedings of the 21st International Conference on Computer Design, pp. 248-253, Oct. 2003. |
Chueh et al., “900MHz to 1.2GHz Two-Phase Resonant Clock Network with Programmable Driver and Loading,” IEEE Custom Integrated Circuits Conference, pp. 777-780, Sep. 2006. |
Chueh et al., “Two-Phase Resonant Clock Distribution,” Proceedings of the IEEE Computer Society Annual Symposium on VLSI: New Frontiers on VLSI Design, May 2005. |
Cooke et al., “Energy Recovery Clocking Scheme and Flip-Flops for Ultra Low-Energy Application,” International Symposium on Low-Power Electronic Design, pp. 54-59, Aug. 25-27, 2003. |
Drake et al., “Resonant Clocking Using Distributed Parasitic Capacitance,” IEEE Journal of Solid-State Circuits, vol. 39, No. 9, pp. 1520-1528, Sep. 2004. |
Dunning, Jim, “An All-Digital Phase-Locked Loop with 50-Cycle Lock Time Suitable for High-Performance Microprocessors,” IEEE Journal of Solid-State Circuits, vol. 30, No. 4, pp. 412-422, Apr. 1995. |
Fang et al., “A High-Performance Asynchronous FPGA: Test Results,” Proceedings of the 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, Apr. 2005. |
Favalli et al., “Testing Scheme for IC's Clocks,” IEEE European Design and Test Conference, Mar. 1997. |
Gutnik et al., “Active GHz Clock Network Using Distributed PLLs,” IEEE Journal of Solid-State Circuits, vol. 35, No. 11, pp. 1553-1560, Nov. 2000. |
Ishii et al., “A Resonant-Clock 200MHz ARM926EJ-S(TM) Microcontroller,” European Solid-State Circuits Conference, Sep. 2009. |
Kim et al., “Energy Recovering Static Memory,” Proceedings of the 2002 International Symposium on Low Power Electronics and Design, pp. 92-97, Aug. 12-14, 2002. |
Maksimovic et al., “Design and Experimental Verification of a CMOS Adiabatic Logic with Single-Phase Power-Clock Supply,” Proceedings of the 40th Midwest Symposium on Circuits and Systems, pp. 417-420, Aug. 1997. |
Maksimovic et al., “Integrated Power Clock Generators for Low Energy Logic,” IEEE Annual Power Electronics Specialists Conference, vol. 1, pp. 61-67, Jun. 18-22, 1995. |
Moon et al., “An Efficient Charge Recovery Logic Circuit,” IEEE Journal of Solid-State Circuits, vol. 31, No. 4, pp. 514-522, Apr. 1996. |
Sathe et al., “A 0.8-1.2GHz Frequency Tunable Single-Phase Resonant-Clocked FIR Filter with Level-Sensitive Latches,” IEEE 2007 Custom Integrated Circuits Conference, pp. 583-586, Sep. 2007. |
Sathe et al., “A 1.1GHz Charge-Recovery Logic,” IEEE International Solid-State Circuits Conference, Feb. 7, 2006. |
Sathe et al., “A 1GHz Filter with Distributed Resonant Clock Generator,” IEEE Symposium on VLSI Circuits, pp. 44-45, Jun. 2007. |
Sathe et al., “Resonant-Clock Latch-Based Design,” IEEE Journal of Solid-State Circuits, vol. 43, No. 4, pp. 864-873, Apr. 2008. |
Teifel et al., “Highly Pipelined Asynchronous FPGAs,” Proceedings of the 2004 ACM/SIGDA 12th International Symposium on Field Programmable Gate Arrays, pp. 133-142, Feb. 22-24, 2004. |
Weste et al., “Principles of CMOS VLSI Design: A Systems Perspective,” 2nd Edition, Addison-Wesley, pp. 9-11, 1992. |
Ziesler et al., “A 225 MHz Resonant Clocked ASIC Chip,” Proceedings of the 2003 International Symposium on Low Power Electronics and Design, pp. 48-53, Aug. 25-27, 2003. |
Ziesler et al., “A Resonant Clock Generator for Single-Phase Adiabatic Systems,” Proceedings of the 2001 International Symposium on Low Power Electronics and Design, pp. 159-164, Aug. 6-7, 2001. |
Ziesler et al., “Energy Recovering ASIC Design,” Proceedings of the IEEE Computer Society Annual Symposium on VLSI, Feb. 20-21, 2003. |
Search Report and Written Opinion from International Serial No. PCT/US2007/086304 mailed Mar. 3, 2009. |
Search Report and Written Opinion from International Serial No. PCT/US2008/064766 mailed Dec. 22, 2008. |
Search Report and Written Opinion from International Serial No. PCT/US2010/052390 mailed Jun. 23, 2011. |
Search Report and Written Opinion from International Serial No. PCT/US2010/052393 mailed Jun. 23, 2011. |
Search Report and Written Opinion from International Serial No. PCT/US2010/052395 mailed Jun. 23, 2011. |
Search Report and Written Opinion from International Serial No. PCT/US2010/052396 mailed Jun. 23, 2011. |
Search Report and Written Opinion from International Serial No. PCT/US2010/052397 mailed Jun. 23, 2011. |
Search Report and Written Opinion from International Serial No. PCT/US2010/052401 mailed Jun. 29, 2011. |
Search Report and Written Opinion from International Serial No. PCT/US2010/052402 mailed Jun. 23, 2011. |
Search Report and Written Opinion from International Serial No. PCT/US2010/052405 mailed Jun. 23, 2011. |
Search Report from International Serial No. PCT/US2003/010320 mailed Sep. 29, 2003. |
Supplementary European Search Report from European Serial No. 03716979.4 mailed Jun. 7, 2006. |
Taskin, Baris et al., “Timing-Driven Physical Design for VLSI Circuits Using Resonant Rotary Clocking,” 49th IEEE International Midwest Symposium on Circuits and Systems, pp. 261-265, Aug. 6, 2006. |
Number | Date | Country | |
---|---|---|---|
20110210761 A1 | Sep 2011 | US |
Number | Date | Country | |
---|---|---|---|
60931582 | May 2007 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 12125009 | May 2008 | US |
Child | 13103985 | | US |