This invention relates to processor-based emulation engines.
IBM is a registered trademark of International Business Machines Corporation, Armonk, N.Y.
Hardware emulators are programmable devices used in the verification of logic designs. A common method of logic design verification is to use processors to emulate the design. These processor-based emulators sequentially evaluate combinatorial logic levels, starting at the inputs and proceeding to the outputs. Each pass through the entire set of logic levels is called a Target Cycle; the evaluation of each individual logic level is called an Emulation Step.
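The level-by-level evaluation described above can be sketched as follows. This is only a minimal illustration, not the emulator's actual implementation: the `Gate` tuple layout, the function `run_target_cycle`, and the signal names are hypothetical and chosen for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical gate record: (output name, Boolean function, input names).
Gate = Tuple[str, Callable[..., int], Tuple[str, ...]]


def run_target_cycle(levels: List[List[Gate]], inputs: Dict[str, int]) -> Dict[str, int]:
    """Evaluate every logic level once, from the inputs toward the outputs.

    One full pass over `levels` models a Target Cycle; each iteration of the
    outer loop models one Emulation Step.
    """
    signals = dict(inputs)
    for level in levels:
        # Gates in the same level only read values produced by earlier levels.
        results = {out: fn(*(signals[name] for name in ins)) for out, fn, ins in level}
        signals.update(results)
    return signals


def AND(a: int, b: int) -> int:
    return a & b


def OR(a: int, b: int) -> int:
    return a | b


def NOT(a: int) -> int:
    return a ^ 1


# Two logic levels: y = (a AND b) OR (NOT c)
example_levels: List[List[Gate]] = [
    [("n1", AND, ("a", "b")), ("n2", NOT, ("c",))],
    [("y", OR, ("n1", "n2"))],
]
example_out = run_target_cycle(example_levels, {"a": 1, "b": 1, "c": 1})
```

In this sketch the two-level design needs two emulation steps per target cycle, one for each level.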
Speed is a major selling factor in the emulator market, and achieving it is a well-known problem. The purpose of this invention is to significantly improve our emulator's speed.
Our invention is an improvement over that disclosed in U.S. Pat. No. 5,551,013, “Multiprocessor for Hardware Emulation,” issued to Beausoleil, et al., in which a software-driven multiprocessor emulation system has a plurality of emulation processors connected in parallel in a module, and one or more such modules make up the emulation system. Our current processor-based emulator consists of a large number of interconnected processors, each with an individual control store, as described in detail in U.S. Pat. No. 5,551,013. It would be desirable to improve the speed of this emulator.
For completeness, we note that FPGA-based emulation systems exist that achieve high speeds for small models, although they are not suitable for our purposes. FPGA-based emulators are inherently I/O bound, and therefore perform poorly with large models. In general, the problem of high-speed emulation of large models had not been solved.
We have increased the processor-based emulation speed by increasing the amount of work done during each emulation step. In the original emulator, an emulation step consisted of a setup phase, an evaluation phase, and a storage phase. With this invention, clusters of processors are interconnected such that the evaluation phases can be cascaded. All processors in a cluster perform the setup in parallel. This setup includes routing of the data through multiple evaluation units for the evaluation phase. (For most efficient operation, the input stack and data stack of each processor must be stored in shared memory within each cluster.) Then, all processors perform the storage phase, again in parallel. The net result is multiple cascaded evaluations performed in a single emulation step. A key feature of the invention is that every processor in a cluster can access the input and data stacks of every other processor in the cluster.
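The cascaded emulation step described above can be sketched as follows. This is an illustrative model under stated assumptions, not the hardware design: the shared input and data stacks are modeled as one dictionary (`shared_store`), and the names `Op` and `cluster_step` are hypothetical. The rule that a value tapped within the step takes precedence over a stored value of the same name is also an assumption made for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical operation record: (output name, Boolean function, operand names).
# An operand may name a stored signal or the output of an earlier processor
# in the same cluster.
Op = Tuple[str, Callable[..., int], Tuple[str, ...]]


def cluster_step(shared_store: Dict[str, int], ops: List[Op]) -> Dict[str, int]:
    """Model one emulation step of a cluster with cascaded evaluation units."""
    # Setup phase: every processor can read the cluster's shared stacks.
    fetched = dict(shared_store)

    # Evaluation phase: results are tapped from one evaluation unit and fed
    # to the next within the same step (assumed precedence: tapped over stored).
    tapped: Dict[str, int] = {}
    for out, fn, operands in ops:
        args = [tapped[name] if name in tapped else fetched[name] for name in operands]
        tapped[out] = fn(*args)

    # Storage phase: all results are written back together.
    new_store = dict(shared_store)
    new_store.update(tapped)
    return new_store


# A four-deep logic chain evaluated in a single step by a four-processor cluster.
chain: List[Op] = [
    ("x1", lambda a, b: a & b, ("a", "b")),
    ("x2", lambda p, c: p | c, ("x1", "c")),
    ("x3", lambda p: p ^ 1, ("x2",)),
    ("x4", lambda p, d: p ^ d, ("x3", "d")),
]
store_after = cluster_step({"a": 1, "b": 1, "c": 0, "d": 1}, chain)
```

Without cascading, the same chain would take four emulation steps in this model, because each dependent evaluation would have to wait for the previous result to be stored.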
Before turning to the detailed description of our invention, we would note that one method of speedup is to evaluate independent logic paths in parallel. A parallel system may consist of hierarchically arranged processors: multiprocessor modules on multi-module boards, in a multi-board system. Synchronization is achieved by delaying the start of the next target cycle until the completion of all paths. This means that the effective emulation speed is determined by the time required to evaluate the longest path (called the critical path).
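The effect of the critical path on emulation speed can be illustrated with a short calculation. This is a sketch only: the function names, the path depths, and the step time used here are hypothetical values, not measurements from the emulator.

```python
from typing import Sequence


def target_cycle_steps(path_depths: Sequence[int]) -> int:
    """Steps per target cycle when independent paths are evaluated in parallel.

    The next target cycle cannot start until all paths have completed, so the
    longest path (the critical path) sets the cycle length.
    """
    return max(path_depths)


def emulation_frequency_hz(path_depths: Sequence[int], step_time_s: float) -> float:
    """Effective target-cycle rate for a given emulation step time."""
    return 1.0 / (target_cycle_steps(path_depths) * step_time_s)


# Hypothetical design with three independent paths, measured in emulation steps.
depths = [12, 40, 7]
steps = target_cycle_steps(depths)
freq = emulation_frequency_hz(depths, step_time_s=1e-8)
```

Shortening the 12-step or 7-step path would not change the result in this sketch; only reducing the number of steps on the 40-step critical path raises the effective speed.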
For evaluation of independent logic paths in parallel, we can describe our improvement over that disclosed in U.S. Pat. No. 5,551,013, “Multiprocessor for Hardware Emulation,” issued to Beausoleil, et al. (fully incorporated herein by this reference) where a software-driven multiprocessor emulation system with a plurality of emulation processors connected in parallel in a module has one or more modules of processors to make up an emulation system. To illustrate, refer to FIG. 1 of U.S. Pat. No. 5,551,013, which shows an emulation chip, called a module here, having multiple (e.g. 64) processors. All processors within the module are identical and have the internal structure shown in
Data routing within each processor's data flow and through the interconnection network occurs independently of and overlaps the execution of the logic emulation function in each processor. Each control store stores control words executed sequentially under control of the sequencer and program steps in the associated module. Each revolution of the sequencer causes the step value to advance from zero to a predetermined maximum value and corresponds to one target clock cycle for the emulated design. A control word in the control store is simultaneously selected during each step of the sequencer. A logic function operation is defined by each control word. Thus, we have provided in
The embedded control store in each of the emulation processors stores logic-representing signals for controlling operations of the emulation processor. The emulation engine's processor evaluation unit illustrated by
An execution unit in each processor's emulation unit includes a table-lookup unit for emulating any type of logic gate function and a connection from the output of each processor to a multiplexor input with every other processor in a module. Each processor embeds a control store to store software logic-representing signals for controlling operations of each processor. Also in the prior system a data store is embedded in each processor to receive data generated under control of the software signals in the control store. The parallel processors on each module have a module input and a module output from each processor. The plurality of modules have their module outputs interconnected to module inputs of all other modules. A sequencer synchronously cycles the processors through mini-cycles on all modules. Logic software drives all of the processors in the emulation system to emulate a complex array of Boolean logic, which may represent all of the logic gates in a complex logic semiconductor chip or system. Each cycle of processing may control the emulation of a level of logic being verified by the single emulation processor illustrated in
For a more detailed understanding of our invention, it should be understood that at each emulation step, a processor reads a logic function and associated operands from the data store, performs the operation, and writes the results as illustrated by
Before we developed our current emulator, the clock granularity was the time for one processor to evaluate one logic function. We have found that signal propagation times and power consumption considerations determine the step time t. This time t must be greater than or equal to D1+D2+D3+D4.
This sum, D1+D2+D3+D4, includes reading from the data store, setting up the operation, performing the evaluation, and storing the results. Note that setup can include gathering data from other processors on the same module or on other modules. We determined that for our planned interconnection networks, the setup times dominate the sum; there is a large difference between the amount of time spent during setup and the amount of time spent during the logic evaluation.
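The step-time bound above can be explored numerically. The sketch below rests on assumptions that the text does not state: D1 through D4 are taken, in order, as the read, setup, evaluation, and storage delays, and a step that cascades k evaluations is assumed to need D1+D2+k·D3+D4. The function names and the delay values are hypothetical.

```python
def min_step_time(d1: float, d2: float, d3: float, d4: float) -> float:
    """Lower bound on the step time t for one evaluation per step."""
    return d1 + d2 + d3 + d4


def cascaded_step_time(d1: float, d2: float, d3: float, d4: float, k: int) -> float:
    """ASSUMPTION: cascading k evaluations repeats only the evaluation delay."""
    return d1 + d2 + k * d3 + d4


def cascade_speedup(d1: float, d2: float, d3: float, d4: float, k: int) -> float:
    """Time for k single-evaluation steps divided by one k-deep cascaded step."""
    return (k * min_step_time(d1, d2, d3, d4)) / cascaded_step_time(d1, d2, d3, d4, k)


# Hypothetical delays (arbitrary time units) in which setup dominates.
base_time = min_step_time(2.0, 5.0, 1.0, 2.0)
cascade_time = cascaded_step_time(2.0, 5.0, 1.0, 2.0, k=4)
speedup = cascade_speedup(2.0, 5.0, 1.0, 2.0, k=4)
```

Under these assumed delays, the smaller the evaluation delay is relative to the setup delay, the closer the estimated gain comes to the cascade depth k.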
We have provided, in accordance with our invention, the ability to exploit this time differential by tapping the results from one processor and feeding them to the next, within the step time t. Thus, when clusters of processors are interconnected such that the setup and storing of results is done in parallel, as illustrated by
Illustrating how different connections can be made for differing numbers of processors,
While the preferred embodiment of the invention has been described, it will be understood that those skilled in the art, both now and in the future, may make various improvements and enhancements which fall within the scope of the claims which follow. These claims should be construed to maintain the proper protection for the invention first described.
This application is a continuation of and claims the benefit of United States application Ser. No. 09/373,125, filed on Aug. 12, 1999, now U.S. Pat. No. 6,618,698.
Number | Name | Date | Kind |
---|---|---|---|
3775598 | Chao et al. | Nov 1973 | A |
4306286 | Cocke et al. | Dec 1981 | A |
4594677 | Barlow | Jun 1986 | A |
4656580 | Hitchcock, Sr. et al. | Apr 1987 | A |
4744084 | Beck et al. | May 1988 | A |
4754398 | Pribnow | Jun 1988 | A |
4769817 | Krohn et al. | Sep 1988 | A |
4775950 | Terada et al. | Oct 1988 | A |
4782440 | Nomizu et al. | Nov 1988 | A |
4819150 | Jennings et al. | Apr 1989 | A |
4862347 | Rudy | Aug 1989 | A |
4914612 | Beece et al. | Apr 1990 | A |
4918594 | Onizuka | Apr 1990 | A |
5132971 | Oguma et al. | Jul 1992 | A |
5146460 | Ackerman et al. | Sep 1992 | A |
5179672 | Genduso et al. | Jan 1993 | A |
5210700 | Tom | May 1993 | A |
5263149 | Winlow | Nov 1993 | A |
5299313 | Petersen et al. | Mar 1994 | A |
5313618 | Pawloski | May 1994 | A |
5327361 | Long et al. | Jul 1994 | A |
5329470 | Sample et al. | Jul 1994 | A |
5339262 | Rostoker et al. | Aug 1994 | A |
5375074 | Greenberg et al. | Dec 1994 | A |
5410300 | Gould et al. | Apr 1995 | A |
5425036 | Liu et al. | Jun 1995 | A |
5442772 | Childs et al. | Aug 1995 | A |
5448496 | Butts et al. | Sep 1995 | A |
5452231 | Butts et al. | Sep 1995 | A |
5452239 | Dai et al. | Sep 1995 | A |
5475624 | West | Dec 1995 | A |
5477475 | Sample et al. | Dec 1995 | A |
5490266 | Sturges | Feb 1996 | A |
5537341 | Rose et al. | Jul 1996 | A |
5548785 | Fogg, Jr. et al. | Aug 1996 | A |
5551013 | Beausoleil et al. | Aug 1996 | A |
5566097 | Myers et al. | Oct 1996 | A |
5581562 | Lin et al. | Dec 1996 | A |
5583450 | Trimberger et al. | Dec 1996 | A |
5590345 | Barker et al. | Dec 1996 | A |
5590372 | Dieffenderfer et al. | Dec 1996 | A |
5596742 | Agarwal et al. | Jan 1997 | A |
5600263 | Trimberger et al. | Feb 1997 | A |
5602754 | Beatty et al. | Feb 1997 | A |
5612891 | Butts et al. | Mar 1997 | A |
5615127 | Beatty et al. | Mar 1997 | A |
5629637 | Trimberger et al. | May 1997 | A |
5629858 | Kundu et al. | May 1997 | A |
5634003 | Saitoh et al. | May 1997 | A |
5644515 | Sample et al. | Jul 1997 | A |
5646545 | Trimberger et al. | Jul 1997 | A |
5696987 | DeLisle et al. | Dec 1997 | A |
5699283 | Okazaki et al. | Dec 1997 | A |
5701441 | Trimberger | Dec 1997 | A |
5715172 | Tzeng | Feb 1998 | A |
5715433 | Raghavan et al. | Feb 1998 | A |
5721695 | McMinn et al. | Feb 1998 | A |
5721953 | Fogg, Jr. et al. | Feb 1998 | A |
5727217 | Young | Mar 1998 | A |
5734581 | Butts et al. | Mar 1998 | A |
5734869 | Chen | Mar 1998 | A |
5737578 | Hennenhoefer et al. | Apr 1998 | A |
5742180 | DeHon et al. | Apr 1998 | A |
5754871 | Wilkinson et al. | May 1998 | A |
5761483 | Trimberger | Jun 1998 | A |
5761484 | Agarwal et al. | Jun 1998 | A |
5784313 | Trimberger et al. | Jul 1998 | A |
5790479 | Conn | Aug 1998 | A |
5796623 | Butts et al. | Aug 1998 | A |
5798645 | Zeiner et al. | Aug 1998 | A |
5801955 | Burgun et al. | Sep 1998 | A |
5802348 | Stewart et al. | Sep 1998 | A |
5812414 | Butts et al. | Sep 1998 | A |
5815687 | Masleid et al. | Sep 1998 | A |
5819065 | Chilton et al. | Oct 1998 | A |
5822564 | Chilton et al. | Oct 1998 | A |
5822570 | Lacey | Oct 1998 | A |
5825662 | Trimberger | Oct 1998 | A |
5842031 | Barker et al. | Nov 1998 | A |
5966528 | Wilkinson et al. | Oct 1999 | A |
6035117 | Beausoleil et al. | Mar 2000 | A |
6051030 | Beausoleil et al. | Apr 2000 | A |
6075937 | Scalzi et al. | Jun 2000 | A |
6192072 | Azadet et al. | Feb 2001 | B1 |
6370585 | Hagersten et al. | Apr 2002 | B1 |
6564376 | Froehlich et al. | May 2003 | B1 |
6618698 | Beausoleil et al. | Sep 2003 | B1 |
Number | Date | Country |
---|---|---|
20030212539 A1 | Nov 2003 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 09373125 | Aug 1999 | US |
Child | 10459340 | | US |