The present invention relates to the field of multi-core microprocessor design.
In recent years, microprocessor development has trended toward multi-core microprocessors. Time to market and scalability are important considerations in multi-core microprocessor design. Furthermore, the amount of power consumed by a multi-core microprocessor may be considerable, making power management an important consideration. Finally, because the multiple cores typically communicate with one another and with a chipset or other type of memory controller/bus bridge via a common bus, signal quality on the bus may be an important consideration.
In one aspect the present invention provides a method for printing a multi-core die on a semiconductor wafer by modifying a reticle set useable for manufacturing dies with half as many, or fewer, cores. A first reticle set is developed or obtained that is usable to print Q-core dies, where Q is at least 1. Modifications are made to the first reticle set to develop a second reticle set useable to print P-core dies, where P is at least twice Q. To illustrate, a single-core die would be represented by a Q of 1, a dual-core die would be represented by a Q or P of 2, and a quad-core die would be represented by a Q or P of 4.
The first reticle set defines scribe lines to separate the Q-core dies, and the scribe lines collectively define a seal ring to surround each Q-core die. At least one defined scribe line of the first reticle set is removed, and corresponding inter-core communication wires are defined to connect at least two adjacent cores that would have been separated by the removed scribe line. The inter-core communication wires are configured to enable the at least two connected cores to communicate during operation. Moreover, the inter-core communication wires are configured to not connect to physical input/output landing pads of the P-core die, such that a P-core die manufactured in accordance with the modified reticle set will not carry signals through the inter-core communication wires off the P-core die.
A semiconductor wafer of multi-core dies is then manufactured using the second reticle set, including printing the multi-core dies on the semiconductor wafer using the second reticle set, and cutting the multi-core dies on the semiconductor wafer along the remaining scribe lines.
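The effect of removing scribe-line segments can be illustrated with a minimal sketch. It assumes a simplified model in which each grid cell is one single-core die (Q = 1) and removing the segment between two adjacent cells merges them into one die; the function name `merge_dies`, the cell coordinates, and the choice of which segments to remove are hypothetical and are not taken from the reticle sets described herein.

```python
from collections import Counter

def merge_dies(rows, cols, removed_segments):
    """Group grid cells (one core each) into dies after scribe-line removal.

    removed_segments: iterable of ((r1, c1), (r2, c2)) pairs of adjacent cells
    whose separating scribe-line segment is removed (hypothetical model).
    Returns a Counter mapping cores-per-die to the number of such dies.
    """
    parent = {(r, c): (r, c) for r in range(rows) for c in range(cols)}

    def find(cell):
        # Union-find lookup with path halving.
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]
            cell = parent[cell]
        return cell

    for a, b in removed_segments:
        (r1, c1), (r2, c2) = a, b
        if abs(r1 - r2) + abs(c1 - c2) != 1:
            raise ValueError(f"cells {a} and {b} are not adjacent")
        parent[find(a)] = find(b)

    cores_per_root = Counter(find(cell) for cell in parent)
    return Counter(cores_per_root.values())

# 3x3 grid; in each row, remove the segment between columns 0 and 1 (assumed).
removed = [((r, 0), (r, 1)) for r in range(3)]
result = merge_dies(3, 3, removed)
print(result[2], "dual-core dies,", result[1], "single-core dies")
```

Under these assumptions the 3×3 grid yields three dual-core dies and three single-core dies, and cutting along the remaining scribe lines corresponds to separating the groups the function reports.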
In other aspects, the inter-core communication wires are configured for particular purposes. In one aspect, they are configured to enable the cores of the multi-core die to communicate with one another to perform power management of the multi-core die. In a related aspect, the power management comprises synchronizing power state changes by the cores, managing a shared voltage source, and/or managing a shared clock source. In another aspect, the inter-core communication wires define an internal bypass bus by which the cores bypass an external processor bus that interconnects the multi-core die to a chipset. In a related aspect, the cores are configured so that when one of the cores drives the external bus, the other cores listen via the internal bypass bus rather than the external bus.
In another aspect, the second reticle set is developed by modifying less than all of, and more particularly less than half of, the layers of the first reticle set. In a further related aspect, only non-transistor layers of the first reticle set are modified to develop the second reticle set.
In yet further aspects, second reticle sets are developed with specifiable relationships to the first reticle sets. In one aspect, the first reticle set is usable to print an M×N matrix of Q-core dies. In a more particular aspect, M is an even number at least 2 and N is at least 1, and the second reticle set is developed to print only P-core dies, specifically (N×M)/2 P-core dies. In an alternative aspect, the second reticle set is developed to print both P-core dies and Q-core dies. In a more particular alternative aspect, M is an odd number at least 3 and N is at least 2, and the second reticle set is developed to print N Q-core dies and N×[(M−1)/2] P-core dies. In another more particular alternative aspect, M is an even number at least 4 and N is at least 2, and the second reticle set is developed to print (N×M)/2 Q-core dies and (N×M)/4 P-core dies.
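The die counts in the foregoing aspects can be checked with a short sketch. It assumes that each P-core die is formed by merging exactly two adjacent Q-core dies, so that the number of original die sites is conserved; the function name `die_counts` and the `mixed` flag are hypothetical. In the odd-M case the Q-core count is derived as the sites left over after pairing rather than being hard-coded.

```python
def die_counts(m, n, mixed=False):
    """Return (q_core_dies, p_core_dies) for an M x N first reticle set.

    Assumes each P-core die merges exactly two Q-core die sites.
    mixed selects the even-M aspect that prints both die types.
    """
    total = m * n
    if m % 2 == 1:
        if m < 3 or n < 2:
            raise ValueError("odd-M aspect requires M >= 3 and N >= 2")
        p = n * ((m - 1) // 2)
        q = total - 2 * p          # leftover sites; simplifies to n
    elif mixed:
        if m < 4 or n < 2:
            raise ValueError("mixed even-M aspect requires M >= 4 and N >= 2")
        if total % 4 != 0:
            raise ValueError("(N*M)/4 is not an integer for this matrix")
        p = total // 4
        q = total // 2
    else:
        if m < 2 or n < 1:
            raise ValueError("all-P aspect requires even M >= 2 and N >= 1")
        p = total // 2
        q = 0
    assert q + 2 * p == total      # every original die site is accounted for
    return q, p

print(die_counts(3, 3))              # odd M: mixed Q-core and P-core dies
print(die_counts(4, 2))              # even M: only P-core dies
print(die_counts(4, 2, mixed=True))  # even M: both die types
```

The internal assertion makes the conservation assumption explicit: every site of the original matrix ends up in exactly one Q-core or P-core die.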
In one aspect, the present invention provides a multi-core die produced by a process as defined above. In a related aspect, the multi-core die employs the inter-core communication wires for one or more of the particular purposes, as defined above, for which they are configured.
Described herein are embodiments of a method for taking a reticle design for a single-core microprocessor and quickly turning it into a reticle design for a dual-core microprocessor. Additionally, embodiments of a method for taking a reticle design for a dual-core microprocessor and quickly turning it into a reticle design for a quad-core microprocessor are described. The notion may be extended to the design of multi-core microprocessors having even more than four processing cores.
Referring now to
The original reticle set 100 includes a grid of crossing horizontal scribe lines 106 and vertical scribe lines 108 to define and facilitate physical separation of the single-core dies 102 from each other. For example, the original reticle set 100 of the embodiment of
Referring now to
Referring now to
At block 302, the designers develop a first reticle set, such as the original reticle set 100 of
At block 304, the designers modify less than all of the reticles of the first reticle set produced according to block 302 to produce a second reticle set, such as the reticle set 200 of
In one embodiment, the inter-core communication wires 212 between the two cores embody a comprehensive (or relatively so) parallel bypass bus, with multiple inter-core communication wires enabling inter-core communications of each of a large relevant set of processor bus signals. An example of such a parallel bypass bus is described in more detail in the section of Ser. No. 61/426,470, filed Dec. 22, 2010, entitled “Multi-Core Internal Bypass Bus” (CNTR.2503), which is incorporated herein by reference.
In another embodiment, the inter-core communication wires 212 between the two cores comprise a smaller set of wires. For example, the section of Ser. No. 61/426,470, filed Dec. 22, 2010, entitled “Distributed Management of a Shared Power Source to a Multi-Core Processor” (CNTR.2534), describes a relatively small set of inter-core communication wires 212 that exchange each core's desired voltage ID (VID) value with the other. Even smaller sets of inter-core communication wires 212 could be accommodated using, for example, a serial interface like that described for inter-die communications in FIG. 2 of CNTR.2534.
The inter-core communication wires 212 enable the two connected cores to communicate during operation. The wires 212 are not connected to physical I/O landing pads of the dual-core die; hence, they do not carry signals off the dual-core die. As discussed above, in one embodiment, the first reticle set is a 3×3 matrix of single-core dies, and the second reticle set produces three dual-core dies and three single-core dies. More generally, the first reticle set is an M×N matrix of single-core dies. If M is an odd number, the second reticle set produces N×[(M−1)/2] dual-core dies and N single-core dies, an example of which is shown in the embodiment of
At block 306, a manufacturer manufactures a wafer of dies using the second reticle set produced according to block 304. The manufacturer then cuts the dies along the remaining scribe lines to produce the single-core and dual-core dies. Alternatively, the manufacturer uses the second reticle set to produce all dual-core dies, as described with respect to block 406 of the flowchart of
Broadly speaking, according to one embodiment, a non-full height seal ring is created around each individual core of the dual-core dies 204 and a full height seal ring is created around the entire dual-core die 204.
Referring now to
At block 404, the designers modify less than all of the reticles of the first reticle set produced according to block 302 to produce a second reticle set so that the second reticle set can be used to print a set of dual-core dies, such as the dual-core dies 204 of
At block 406, a manufacturer manufactures a wafer of dies using the second reticle set produced according to block 404. The manufacturer then cuts the dies along the remaining scribe lines to produce the dual-core dies. Flow ends at block 406.
Various uses of the inter-core communication wires 212 are described herein; however, the uses are not limited to those described. One use is to provide an internal bypass bus on the inter-core communication wires 212 to overcome poor signal quality of an external processor bus that interconnects the cores and other system components such as the chipset. In the bypass bus described in CNTR.2503, when one core detects that the other core on its die is driving the external bus, the one core listens to the other core via the internal bypass bus rather than the external bus. Another use is to provide sideband communication wires to facilitate a multi-core power management scheme for functions such as shared voltage identifier (VID), phase-locked loop (PLL) change coordination, and C-state (power state) transition synchronization. Uses of inter-core communication wires 212 for such purposes are described in more detail in CNTR.2534 and the section of Ser. No. 61/426,470, filed Dec. 22, 2010, entitled “Decentralized Power Management Distributed Among Multiple Processor Cores” (CNTR.2527), which is also incorporated herein by reference.
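The bypass-bus listening behavior described above can be sketched as a simple selection function. This is an illustrative model only, not the bus interface logic of CNTR.2503; the names `Source`, `listen_source`, and the core and chipset identifiers are hypothetical.

```python
from enum import Enum

class Source(Enum):
    EXTERNAL_BUS = "external"
    BYPASS_BUS = "bypass"

def listen_source(driver, listener, same_die_cores):
    """Choose which bus `listener` samples while `driver` drives the external bus.

    driver: identifier of the agent driving the external bus, or None if idle.
    same_die_cores: set of core identifiers sharing the listener's die.
    """
    if driver is not None and driver != listener and driver in same_die_cores:
        # Another core on the same die is driving: listen over the internal wires.
        return Source.BYPASS_BUS
    # Chipset, off-die agent, or idle bus: listen to the external bus.
    return Source.EXTERNAL_BUS

die = {"core0", "core1"}
print(listen_source("core1", "core0", die))    # other core on the die is driving
print(listen_source("chipset", "core0", die))  # off-die agent is driving
```

In this model, only transactions driven by a core on the same die are received over the internal wires; all other traffic is still received from the external processor bus.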
According to a more advanced reticle modification embodiment, a first reticle set is provided for printing a 3×3 matrix of single-core dies, referred to as core B. The core B design does not include a bus interface architecture (e.g., appropriate bypass bus muxes or the inter-core communication wire transceivers) to accommodate dual die operation. For example, the core B design may have a bus interface architecture like that of FIG. 2 of CNTR.2503, which describes a conventional processor core bus interface architecture that does not accommodate bypass bus communications. CNTR.2503 also describes several modified embodiments of processor core bus interfaces that do accommodate bypass bus communications.
Continuing with the more advanced reticle modification embodiment, elements to accommodate bypass bus communications (such as those described in one of the embodiments of CNTR.2503) are implemented in an intermediately modified reticle set by using spare transistors and gates of the core B design. In one example involving a 70-layer first reticle set, approximately 25 metal and via layers of the first reticle set are initially modified, and no transistor layers are changed, to enable the cores to accommodate bypass bus communications. The single-core dies that can be manufactured using the initially modified reticle set are referred to as core Y. Five layers in the intermediately modified reticle set are further modified to create a second, fully modified, reticle set for printing a matrix having three single-core Y dies and three dual-core dies referred to as core X. The modified reticle set defines each dual-core die X to have communication wires connecting the two cores. These five layers include the top metal and via layers in which the inter-core communication wires reside and three bump and passivation layers to improve reliability, such as adding dummy bumps in the scribe line region for physical stability.
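The layer counts in the 70-layer example can be tallied with a brief sketch. It assumes, as an upper bound, that the five further-modified layers are distinct from the approximately 25 initially modified layers; the text does not state whether they overlap. The function name `check_modification` and the layer-kind labels are hypothetical.

```python
def check_modification(total_layers, modified_by_kind):
    """Summarize a reticle-set change.

    modified_by_kind maps a layer kind to the number of modified layers of
    that kind. Returns (modified, fraction, ok), where ok is True when no
    transistor layers changed and fewer than half of all layers were modified.
    """
    modified = sum(modified_by_kind.values())
    if modified > total_layers:
        raise ValueError("more layers modified than exist in the reticle set")
    fraction = modified / total_layers
    no_transistor_change = modified_by_kind.get("transistor", 0) == 0
    ok = no_transistor_change and 2 * modified < total_layers
    return modified, fraction, ok

# Counts from the example; treating the groups as disjoint is an assumption.
example = {"metal_via": 25, "top_metal_via": 2, "bump_passivation": 3}
modified, fraction, ok = check_modification(70, example)
print(modified, round(fraction, 2), ok)
```

Even under the disjoint assumption, 30 of 70 layers is less than half of the reticle set, and no transistor layer is among them.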
An extension of the technique described above to convert a dual-core design to a quad-core design will now be described with respect to
Referring now to
In one embodiment, the dual-core reticle set 500 configures each dual-core die 504 to include native inter-core communication wires between the two cores. The native inter-core wires may embody a relatively comprehensive internal bypass bus, as described, for example, in CNTR.2503, or a much smaller set of inter-core communication wires, as described, for example, in CNTR.2534, between the two cores of the dual-core die 504. In an alternative embodiment, the dual-core reticle set 500 defines each dual-core die to share a set of landing pads, and to have a bus interface architecture like that of the twin core pair of FIG. 6 of CNTR.2503. In another embodiment, the dual-core reticle set 500 defines the dual-core dies 504 with a shared level-2 cache memory.
Referring now to
Various uses of the (further) inter-core communication wires 412 are described according to various embodiments in detail in CNTR.2503, CNTR.2527, and CNTR.2534. Even further uses are described in the sections of Ser. No. 61/426,470, filed Dec. 22, 2010, entitled “Dynamic Multi-Core Microprocessor Configuration” (CNTR.2533) and “Dynamic and Selective Core Disablement in a Multi-Core Processor” (CNTR.2536), which are herein incorporated by reference.
In one embodiment, the inter-core communication wires 412 in combination with native inter-core communication wires 212 connect each core in the quad-core die 604 to each other core in the quad-core die 604 to facilitate direct power management communication, in accordance with a collaborative peer-to-peer coordination model, between each pair of cores. CNTR.2527 describes both collaborative peer-to-peer coordination and master-mediated coordination models for power management between cores, and
In another embodiment, the inter-core communication wires 412 connect only one core in one previously distinct dual-core die 504 with only one core in the other previously distinct dual-core die 504. For example, one core in each of the two previously distinct dual-core dies 504 could act as a “master” for that die, with each master connected with its originally paired core via native inter-core communication wires 212. In this embodiment, an additional set of inter-core communication wires 412 would connect the two masters together. CNTR.2527 describes several quad-core dies whose cores are connected in such fashion.
In yet another embodiment, a single set of inter-core communication wires 412 connect a twin core pair of one previously distinct dual-core die 504 with a twin core pair of the other previously distinct dual-core die 504. FIG. 8 of CNTR.2503 is illustrative of such an embodiment.
In yet another embodiment, two pairs of inter-core communication wires 412 are provided for increased configuration flexibility, redundancy, and/or reliability. A pair of inter-core communication wires 412 connects each core in one previously distinct dual-core die 504 with a “complementary” core in the other previously distinct dual-core die 504. The inter-core communication wires 412 are in addition to the two pairs of native inter-core communication wires 212 connecting the cores of each previously distinct dual-core die 504 together. CNTR.2534 describes a processor with two dual-core dies wherein each core of each die is connected in an equivalent fashion. In a corresponding quad-core die embodiment, a complementary architecture analogous to that of CNTR.2534 would be applied.
In yet another embodiment, a set of inter-core communication wires 212 would be provided between a designated master core and each of the other three cores. In such an embodiment, the non-master cores of the die would not be connected by inter-core communication wires. In this last embodiment, reticles for such a quad-core die would preferably be developed directly from reticles of a single-core die, rather than from reticles of a dual-core die.
It is noted that although in many of these embodiments, each core of the quad-core die 604 is not enabled to directly communicate with each other core in the quad-core die 604, each core may nevertheless be configured to indirectly communicate with such cores through one or more cores of the quad-core die 604.
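The distinction between direct and indirect communication can be illustrated by modeling the wire sets as an undirected graph and counting hops between cores. The sketch below is illustrative only; the core names (A0 and A1 on one previously distinct dual-core die, B0 and B1 on the other) and the function name `hops` are hypothetical.

```python
from collections import deque

def hops(links, src, dst):
    """Breadth-first search over undirected wire links.

    Returns the number of wire hops from src to dst, or None if unreachable.
    """
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, distance = queue.popleft()
        if node == dst:
            return distance
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None

# Masters A0 and B0 linked to each other; each master linked to its paired core.
master_chain = [("A0", "A1"), ("B0", "B1"), ("A0", "B0")]
# Every core wired directly to every other core.
full_mesh = [("A0", "A1"), ("A0", "B0"), ("A0", "B1"),
             ("A1", "B0"), ("A1", "B1"), ("B0", "B1")]

print(hops(master_chain, "A1", "B1"))  # reaches B1 through both masters
print(hops(full_mesh, "A1", "B1"))     # direct wire
```

In the master-linked topology the two non-master cores are three hops apart and communicate only through the masters, whereas the fully connected topology gives every pair a direct link.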
Referring now to
At block 702, the designers develop a dual-core reticle set, such as the dual-core reticle set 500 of
At block 704, the designers modify less than all of the reticles of the dual-core reticle set produced according to block 702 to produce a modified reticle set, such as the reticle set 600 of
At block 706, a manufacturer manufactures a wafer of dies using the modified reticle set produced according to block 704. The manufacturer then cuts the dies along the remaining scribe lines to produce the dual-core and quad-core dies. Alternatively, the manufacturer uses the modified reticle set to produce all quad-core dies, as described with respect to block 806 of the flowchart of
Referring now to
At block 804, the designers modify less than all of the reticles of the dual-core reticle set produced according to block 702 to produce a modified reticle set so that the modified reticle set can be used to print a set of quad-core dies, such as the quad-core dies 604 of
At block 806, a manufacturer manufactures a wafer of dies using the modified reticle set produced according to block 804. The manufacturer then cuts the dies along the remaining scribe lines to produce the quad-core dies. Flow ends at block 806.
In addition to the advantages mentioned above, another advantage of the design and manufacture method described herein is that it avoids having to add additional physical pads to the dies to create the inter-core communication wires between the cores, or more particularly between the entities that were previously multiple dies that are merged in a single die. This may be observed in more detail with respect to the embodiment of FIGS. 3 and 4 of CNTR.2503, which achieves a dual-core microprocessor with improved signal quality afforded by the internal bypass bus and yet avoids having to add additional physical pads to create the bypass bus between the two cores. This provides the benefit of solving the pad-limitedness for a pad-limited design in which the two cores need to communicate.
Although embodiments have been described for quickly modifying reticles to produce dual-core dies and quad-core dies, other embodiments are contemplated in which the techniques described may be employed to quickly modify reticles to produce multi-core dies having larger numbers of cores.
While various embodiments of the present invention have been described herein, it should be understood that they have been presented by way of example, and not limitation. It will be apparent to persons skilled in the relevant computer arts that various changes in form and detail can be made therein without departing from the scope of the invention. For example, software can enable the function, fabrication, modeling, simulation, description and/or testing of the apparatus and methods described herein. This can be accomplished through the use of general programming languages (e.g., C, C++), hardware description languages (HDL) including Verilog HDL, VHDL, and so on, or other available programs. Such software can be disposed in any known computer usable medium such as magnetic tape, semiconductor, magnetic disk, or optical disc (e.g., CD-ROM, DVD-ROM, etc.), a network, wire line, wireless or other communications medium. Embodiments of the apparatus and method described herein may be included in a semiconductor intellectual property core, such as a microprocessor core (e.g., embodied, or specified, in an HDL) and transformed to hardware in the production of integrated circuits. Additionally, the apparatus and methods described herein may be embodied as a combination of hardware and software. Thus, the present invention should not be limited by any of the exemplary embodiments described herein, but should be defined only in accordance with the following claims and their equivalents. Specifically, the present invention may be implemented within a microprocessor device which may be used in a general purpose computer. Finally, those skilled in the art should appreciate that they can readily use the disclosed conception and specific embodiments as a basis for designing or modifying other structures for carrying out the same purposes of the present invention without departing from the scope of the invention as defined by the appended claims.
This application is a divisional of U.S. non-provisional application Ser. No. 13/299,171, filed Nov. 17, 2011, which claims priority based on U.S. Provisional Application Ser. No. 61/426,470, filed Dec. 22, 2010, entitled MULTI-CORE INTERNAL BYPASS BUS, each of which is hereby incorporated by reference in its entirety. This application is related to the following co-pending U.S. patent applications, each of which is hereby incorporated by reference in its entirety.
Publication No. | Date | Title |
---|---|---|
US 2012/0239847 | Sep. 20, 2012 | MULTI-CORE INTERNAL BYPASS BUS |
US 2012/0166845 | Jun. 28, 2012 | POWER STATE SYNCHRONIZATION IN A MULTI-CORE PROCESSOR |
US 2012/0166837 | Jun. 28, 2012 | DECENTRALIZED POWER MANAGEMENT DISTRIBUTED AMONG MULTIPLE PROCESSOR CORES |
US 2012/0166763 | Jun. 28, 2012 | DYNAMIC MULTI-CORE MICROPROCESSOR CONFIGURATION DISCOVERY |
US 2012/0166832 | Jun. 28, 2012 | DISTRIBUTED MANAGEMENT OF A SHARED POWER SOURCE TO A MULTI-CORE MICROPROCESSOR |
US 2012/0166764 | Jun. 28, 2012 | DYNAMIC AND SELECTIVE CORE DISABLEMENT AND RECONFIGURATION IN A MULTI-CORE PROCESSOR |
Number | Name | Date | Kind |
---|---|---|---|
4748559 | Smith et al. | May 1988 | A |
5467455 | Gay et al. | Nov 1995 | A |
5485625 | Gumkowski | Jan 1996 | A |
5546588 | Deems et al. | Aug 1996 | A |
5587987 | Okabe | Dec 1996 | A |
5918061 | Nikjou | Jun 1999 | A |
6496880 | Ma et al. | Dec 2002 | B1 |
6665802 | Ober | Dec 2003 | B1 |
6968467 | Inoue et al. | Nov 2005 | B2 |
7257679 | Clark | Aug 2007 | B2 |
7308558 | Arimilli et al. | Dec 2007 | B2 |
7358758 | Gaskins et al. | Apr 2008 | B2 |
7451333 | Naveh et al. | Nov 2008 | B2 |
7467294 | Matsuoka et al. | Dec 2008 | B2 |
8024591 | Bertelsen et al. | Sep 2011 | B2 |
8046615 | Taguchi et al. | Oct 2011 | B2 |
8103816 | Tiruvallur et al. | Jan 2012 | B2 |
8214632 | Choi et al. | Jul 2012 | B2 |
8358651 | Kadosh et al. | Jan 2013 | B1 |
8359436 | Vash et al. | Jan 2013 | B2 |
20040019827 | Rohfleisch et al. | Jan 2004 | A1 |
20040117510 | Arimilli et al. | Jun 2004 | A1 |
20050138249 | Galbraith et al. | Jun 2005 | A1 |
20060171244 | Ando | Aug 2006 | A1 |
20060224809 | Gelke et al. | Oct 2006 | A1 |
20060282692 | Oh | Dec 2006 | A1 |
20070070673 | Borkar et al. | Mar 2007 | A1 |
20070143514 | Kaushik et al. | Jun 2007 | A1 |
20070266262 | Burton et al. | Nov 2007 | A1 |
20080129274 | Komaki | Jun 2008 | A1 |
20090013217 | Shibata et al. | Jan 2009 | A1 |
20090083516 | Saleem et al. | Mar 2009 | A1 |
20090094481 | Vera et al. | Apr 2009 | A1 |
20090172423 | Song et al. | Jul 2009 | A1 |
20090222654 | Hum et al. | Sep 2009 | A1 |
20090233239 | Temchenko et al. | Sep 2009 | A1 |
20090307408 | Naylor | Dec 2009 | A1 |
20090319705 | Foong et al. | Dec 2009 | A1 |
20100058078 | Branover et al. | Mar 2010 | A1 |
20100138683 | Burton et al. | Jun 2010 | A1 |
20100250821 | Mueller | Sep 2010 | A1 |
20100325481 | Dahan et al. | Dec 2010 | A1 |
20100332869 | Hsin et al. | Dec 2010 | A1 |
20110185125 | Jain et al. | Jul 2011 | A1 |
20110265090 | Moyer et al. | Oct 2011 | A1 |
20110271126 | Hill | Nov 2011 | A1 |
20110295543 | Fox et al. | Dec 2011 | A1 |
20120005514 | Henry et al. | Jan 2012 | A1 |
20120023355 | Song et al. | Jan 2012 | A1 |
20120124264 | Tiruvallur et al. | May 2012 | A1 |
20120161328 | Henry et al. | Jun 2012 | A1 |
20120166763 | Henry et al. | Jun 2012 | A1 |
20120166764 | Henry et al. | Jun 2012 | A1 |
20120166832 | Gaskins et al. | Jun 2012 | A1 |
20120166837 | Henry et al. | Jun 2012 | A1 |
20120166845 | Henry et al. | Jun 2012 | A1 |
20120239847 | Gaskins | Sep 2012 | A1 |
Number | Date | Country |
---|---|---|
101111814 | Jan 2008 | CN |
101901177 | Dec 2010 | CN |
Entry |
---|
“Intel® Core™ 2 Extreme Quad-Core Mobile Processor and Intel® Core™ 2 Quad Mobile Processor on 45-nm Process.” Datasheet. For platforms based on Mobile Intel® 4 Series Express Chipset Family. Jan. 2009 Document Number 320390-002 pp. 1-72. |
“Intel® Atom™ Processor 330Δ Series.” Datasheet. For systems based on Nettop Platform for '08. Revision 002. Feb 2009. Document No. 320528-002. pp. 1-50. |
“Intel® Core™ 2 Duo Processors and Intel® Core™ 2 Extreme Processors for Platforms Based on Mobile Intel® 965 Express Chipset Family.” Datasheet. Jan. 2008. Document No. 316745-005. pp. 1-87. |
“Intel® Core™ Duo Processor and Intel® Core™ Solo Processor on 65 nm Process.” Datasheet. Jan. 2007. Document No. 309221-006. pp. 1-91. |
Richard, Michael Graham. “Intel's Next CPU to Include Dedicated ‘Power Control Unit’ to Save Power.” Aug. 22, 2008 Downloaded from http://www.treehugger.com/files/2008/08/intel-cpu-processor-nehalem-i7-power-pcu.php on Jul. 14, 2010. pp. 1-4. |
Naveh, Alon et al. “Power and Thermal Management in Intel® Core™ Duo Processor.” Intel® Technology Journal. vol. 10, Issue 02, Published May 15, 2006. ISSN 1535-864X pp. 109-123. |
Glaskowsky, Peter. “Investigating Intel's Lynnfield Mysteries.” Mcall.com, CNET News, Speeds and Feeds. Sep. 21, 2009. Downloaded from http://mcall.com.com/8301-13512—3-10357328-23.html?tag=mncol;txt on Jul. 14, 2010. pp. 1-4. |
Wasson, Scott. “Intel's Core i7 Processors; Nehalem Arrives with a Splash.” The Tech Report. Nov. 3, 2008, Downloaded from http://techreport.com/articles.x/15818/1 on Jul. 14, 2010. pp. 1-11. |
“Intel® QuickPath Architecture; A New System Architecture.” White Paper Document No. 319725-001US. Mar. 2008. pp. 1-6. |
Sartori, John et al. “Distributed Peak Power Management for Many-core Architectures.” Published in Design, Automation & Test in Europe Conference & Exhibition, Apr. 2009, pp. 1556-1559. |
Number | Date | Country |
---|---|---|
20140084427 A1 | Mar 2014 | US |
Number | Date | Country |
---|---|---|
61426470 | Dec 2010 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 13299171 | Nov 2011 | US |
Child | 14094206 | US |