The present invention relates to the field of multi-processor unit microprocessors; more specifically, it relates to a method of testing and sorting multi-processor unit microprocessors and specifying performance and resource requirements for each processing unit of multi-processor unit microprocessors.
Microprocessors are tested and sorted to specific operating specifications such as frequency and power. Large multi-processor unit microprocessors often cannot meet a common optimal specification due to, for example, process variations across the integrated circuit chip. In one example, process variations cause one portion of the microprocessor to run slow, but to consume less power than another portion which runs faster but consumes more power. This leads to a specification on the entire microprocessor of the speed of the slower region, but at the cost of a faster region consuming more power than is desirable in a speed/power optimized microprocessor. In such a case, the microprocessor has less market value. Further, regions running at different power levels generate non-uniform heating for which it is more difficult to provide a cooling solution.
Therefore, there is a need for a method to guarantee that a microprocessor's performance and heating are as uniform as possible across the integrated circuit chip.
A first aspect of the present invention is a method, comprising: (a) selecting and testing, with a selected parameter set of a group of parameter sets, a processor unit of a microprocessor having two or more processor units; (b) comparing the operation of the selected processor unit to a selected specification of a set of operational specifications of the microprocessor; (c) if the testing indicates that the operation of the selected processor unit does not meet the selected specification, repeating (a) and (b) with a different parameter set of the group of parameter sets until either the selected processor unit meets the selected specification or all parameter sets of the group of parameter sets have been selected; and (d) if the operation of the selected processor unit does meet the selected specification, repeating (a), (b) and (c) until all processor units of the two or more processor units of the microprocessor have been selected.
A second aspect of the present invention is a computer program product, comprising a computer usable medium having a computer readable program code embodied therein, the computer readable program code comprising an algorithm adapted to implement a method for testing and sorting a microprocessor having two or more processor units, the method comprising the steps of: (a) selecting and testing, with a selected parameter set of a group of parameter sets, a processor unit of a microprocessor having two or more processor units; (b) comparing the operation of the selected processor unit to a selected specification of a set of operational specifications of the microprocessor; (c) if the testing indicates that the operation of the selected processor unit does not meet the selected specification, repeating (a) and (b) with a different parameter set of the group of parameter sets until either the selected processor unit meets the selected specification or all parameter sets of the group of parameter sets have been selected; and (d) if the operation of the selected processor unit does meet the selected specification, repeating (a), (b) and (c) until all processor units of the two or more processor units of the microprocessor have been selected.
A third aspect of the present invention is a microprocessor, comprising: two or more processor units, each processor unit comprising a voltage island; and a fuse bank in the microprocessor, the fuse bank encoding, independently for each processor unit of the two or more processor units, at least one operating parameter for each of the processor units of the two or more processor units.
The features of the invention are set forth in the appended claims. The invention itself, however, will be best understood by reference to the following detailed description of an illustrative embodiment when read in conjunction with the accompanying drawings, wherein:
For the purposes of the present invention the term processor unit denotes a completely functional microprocessor. Processor units are also known as processor cores or microprocessor cores. Because processor units are themselves microprocessors, one of two conventions may be used to avoid confusion. In the first convention the term microprocessor is used to describe an electronic device implemented as an integrated circuit chip and having multiple processor units in different regions of the same integrated circuit chip. In the second convention the term microprocessor is used to describe an electronic device implemented as a multi-chip module (MCM) and having multiple processor units, each processor unit on a different integrated circuit chip of the MCM. The embodiments of the present invention are described in terms of the first convention (a single integrated circuit chip), but may be applied to the second convention as well (an MCM).
For the purposes of the present invention the term voltage island denotes a bounded region of an integrated circuit chip having an internal power distribution network that is supplied from a power source external to that region. Different voltage islands may be supplied from a same power supply or from different power supplies. Voltage islands may include fencing circuits for communication across voltage island boundaries. Voltage islands are also known as voltage domains.
Power pads 125 and 135 supply power to respective processor units 105 and 110 from one or two external power supplies. In the event of two external power supplies, all power pads 125 are supplied from a first external power supply and all power pads 135 are supplied from a second external power supply. In the event of two external power supplies, the external power supplies may have the same VDD level or different VDD levels as described infra. Although the grounds (GND or VSS) of both power supplies are generally connected externally and all ground pads 130A and 130B are connected to the common ground, it is possible to have separate grounds from each power supply (which may have the same or different voltage levels), with the ground of the first external power supply connected to all ground pads 130A and the ground of the second external power supply connected to all ground pads 130B.
In one embodiment, processor units 105 and 110 are also clock domains which are separated from each other by boundary 115. This may be implemented several ways:
First processor unit 105 may include an optional first clock generating circuit (in one example, a phase-locked loop (PLL)) 140 that generates the clock signal that defines the operating frequency of first processor unit 105, and second processor unit 110 may include an optional second clock generating circuit (in one example, a PLL) 145. Clock generating circuits 140 and 145 may have a same frequency or different frequencies.
Clock signals may be supplied from two external clock circuits through corresponding I/O pads 120A and 120B of each processor unit 105 and 110. The external clock circuits may have a same frequency or different frequencies.
In one embodiment, only one clock generating circuit is present and is in a third voltage island different from that of first and second processor units 105 and 110.
In one embodiment, microprocessor 100 includes an optional fuse bank and support circuit 150 that may be used, for example, to encode operational specifications/information general to microprocessor 100 as well as specific to processor units 105 and 110. Such information may include operating voltage and operating frequency. While fuse bank and support circuit 150 are illustrated in second processor unit 110, there may be an additional fuse bank located in first processor unit 105, or fuse bank and support circuit 150 may be located in another, non-processor unit region of microprocessor 100, including a third or fourth voltage island.
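By way of a non-limiting illustration, the following sketch shows one way operating voltage and operating frequency might be encoded independently for each processor unit in a single fuse word. The 4-bit field widths, the code tables and the function names are hypothetical; the description does not specify a fuse format.

```python
# Hypothetical layout: one 4-bit VDD code and one 4-bit frequency code per processor unit.
VDD_CODES = {900: 0, 1000: 1, 1100: 2, 1200: 3}    # millivolts -> code (assumed values)
FREQ_CODES = {2000: 0, 2500: 1, 3000: 2}           # megahertz -> code (assumed values)
BITS_PER_FIELD = 4
BITS_PER_UNIT = 2 * BITS_PER_FIELD


def encode_fuse_word(settings):
    """Pack a list of (vdd_mv, freq_mhz) pairs, one per processor unit, into an integer."""
    word = 0
    for index, (vdd, freq) in enumerate(settings):
        unit_bits = (FREQ_CODES[freq] << BITS_PER_FIELD) | VDD_CODES[vdd]
        word |= unit_bits << (index * BITS_PER_UNIT)
    return word


def decode_fuse_word(word, unit_count):
    """Recover the (vdd_mv, freq_mhz) pair of each processor unit from the fuse word."""
    vdd_by_code = {code: vdd for vdd, code in VDD_CODES.items()}
    freq_by_code = {code: freq for freq, code in FREQ_CODES.items()}
    mask = (1 << BITS_PER_FIELD) - 1
    settings = []
    for index in range(unit_count):
        unit_bits = word >> (index * BITS_PER_UNIT)
        vdd = vdd_by_code[unit_bits & mask]
        freq = freq_by_code[(unit_bits >> BITS_PER_FIELD) & mask]
        settings.append((vdd, freq))
    return settings
```

In this sketch each processor unit occupies its own byte of the fuse word, so the setting of one unit can be read without reference to the others.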
While only a small number of pads are illustrated in
In
Microprocessor 100 of
Power grid 255 and ground grid 265 comprise a first voltage island. Power grid 260 and ground grid 265 comprise a second voltage island. Connected between power grid 255 and ground grid 265 are the circuits of a first processor unit (see
Power grid 255 is electrically and physically part of a first processor unit. Power grid 260 is electrically and physically part of a second processor unit. Ground grid 265 is physically shared between the first and second processor units. Alternatively, ground grid 265 may be split into two electrically separate ground grids, a first ground grid physically located in the first processor unit and a second ground grid physically located in the second processor unit.
In a similar manner to power distribution network 250, a clock domain network may be illustrated with power grids 255 and 260 replaced by clock trees, which may be grid-like in structure or comprised of a set of cascaded spoke-like distribution nodes.
All methods of testing and sorting microprocessors according to embodiments of the present invention are performed after functional test has been performed and the microprocessor is functionally “good.” Microprocessors that do not pass functional test are discarded and are not sorted.
A sort specification is a set of specified parameters that for a given processor unit must occur and be satisfied together. A sort test includes the same parameters as its corresponding sort specification, except some parameters are supplied by the tester and some are measured by the tester. For example, a voltage level may be supplied and operating frequency, power consumption and temperature measured. A general example of a sort test is a specification stating a power requirement of the processor unit at a given operating frequency and operating temperature. Since power is current (I) times voltage (V), or IV, operating voltage is a parameter as well. Any given sort may include a range of one or more of the specified parameters. Sorts according to the embodiments of the present invention include, but are not limited to, holding power, operating frequency and temperature constant at different voltages (to control power consumption); and holding voltage, power and temperature constant at different frequencies (to control performance). Table I gives some exemplary sort tests.
In all tables, “W” is watts and “V” is volts. While Table I shows only one parameter being varied in each sort, it is possible to vary two or more parameters within a sort. Also, while Table I shows only three sets of parameters in each sort, there may be any number of parameter combinations within a sort. Also, to pass a specific parameter, the test result need not be exactly the value listed in Table I, but within a range. For example, the 100 W specification may be passed if the processor unit is between 95 W and 105 W.
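By way of a non-limiting illustration, the pass range described above may be sketched as follows. The function names and the default tolerance of 5 W (taken from the 95 W to 105 W example) are assumptions made for illustration only.

```python
def power_watts(current_amps, voltage_volts):
    """Power is current (I) times voltage (V)."""
    return current_amps * voltage_volts


def meets_power_spec(measured_watts, spec_watts, tolerance_watts=5.0):
    """Return True when the measured power lies within spec +/- tolerance."""
    return abs(measured_watts - spec_watts) <= tolerance_watts
```

For example, a processor unit drawing 90 A at 1.1 V dissipates about 99 W and would pass a 100 W specification under this tolerance, while a unit measured at 106 W would not.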
In step 310, all the parameter combinations of each sort test that the current processor unit passes are recorded as illustrated in Table II infra. Next, in step 315, it is determined if there is another sort test to be performed. If not, the method proceeds to step 320; if there is another sort test to be performed, the method loops back to step 305. In step 320, it is determined if there is another processor unit to be tested. If not, the method proceeds to step 325; if there is another processor unit to be tested, the method loops back to step 300. Table II gives some exemplary sort test results.
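By way of a non-limiting illustration, the two nested loops of steps 300 through 320 may be sketched as follows. The function name, the data layout and the `passes` callable (which stands in for the tester) are hypothetical, and it is assumed that step 300 selects a processor unit and step 305 performs a sort test.

```python
def run_sort_tests(processor_units, sort_tests, passes):
    """Record, for every processor unit, the parameter combinations it passes.

    processor_units: iterable of unit names.
    sort_tests: dict mapping a sort name to a list of parameter combinations.
    passes: callable (unit, sort_name, combination) -> bool, standing in for the tester.
    """
    results = {}
    for unit in processor_units:                             # loop closed by step 320
        results[unit] = {}
        for sort_name, combinations in sort_tests.items():   # loop closed by step 315
            # Step 310: keep every combination of this sort test that the unit passes.
            results[unit][sort_name] = [
                combo for combo in combinations if passes(unit, sort_name, combo)
            ]
    return results
```

The returned dictionary plays the role of a table of passed sort tests, with one entry per processor unit and per sort.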
In step 325, a set of selection rules is applied to the information of Table II in order to select a parameter combination for each processor unit that optimizes a goal of the sort testing. For example, the goals of the sort testing may be to have a microprocessor with the maximum performance, lowest power requirement or most uniform heat dissipation across the integrated circuit chip. Table III gives some exemplary rules.
The rules are applied in order and the method stops when a rule is met. Within each rule, there may be a hierarchy of sub-rules. For example, when processor units having different VDD are allowed, there may be a rule indicating a maximum voltage difference between the processor unit having the lowest VDD and the processor unit having the highest VDD. In another example, when processors with different FREQ and VDD are allowed, there may be sub-rules indicating whether the closest match in FREQ or PWR is to be selected. Note that rule 0 is essentially a perfect microprocessor with all processor units performing to a prime specification.
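By way of a non-limiting illustration, applying the rules in order and stopping at the first rule that is met may be sketched as follows. The two rules shown (a common VDD for all processor units, and per-unit VDD values limited by a maximum voltage difference) are hypothetical stand-ins and are not the rules of Table III; voltages are expressed in millivolts to keep the arithmetic exact.

```python
def select_by_rules(results, rules):
    """Apply rules in order; return (rule_index, selection) for the first rule met, else None.

    results: non-empty dict mapping processor unit -> list of passing VDD values (mV).
    rules: list of callables, each returning a dict unit -> VDD, or None if not met.
    """
    for index, rule in enumerate(rules):
        selection = rule(results)
        if selection is not None:
            return index, selection      # stop as soon as a rule is met
    return None


def same_vdd_rule(results):
    """Hypothetical rule: every unit runs at the lowest VDD that all units pass."""
    common = set.intersection(*(set(values) for values in results.values()))
    if not common:
        return None
    vdd = min(common)
    return {unit: vdd for unit in results}


def make_max_spread_rule(max_difference_mv):
    """Hypothetical rule: each unit takes its lowest passing VDD, within a spread limit."""
    def rule(results):
        if any(not values for values in results.values()):
            return None
        selection = {unit: min(values) for unit, values in results.items()}
        spread = max(selection.values()) - min(selection.values())
        return selection if spread <= max_difference_mv else None
    return rule
```

Ordering the rule list from most to least desirable outcome means the first rule met is also the best outcome available for the microprocessor under test.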
In optional step 330, codes indicating the parameters for each processor unit are encoded into the fuse bank(s) (see
On the first pass through step 420, the rules are not applied and the method goes directly to step 425 because the rules can be applied only when two or more processor units have each been sort tested and each has passed at least one sort test. However, in any step where rules are applied, the rules are applied to all current entries of Table II. Assuming a second or subsequent pass through step 420, rules from a table similar to Table III discussed supra are applied.
If, in step 420, any rule is passed, the method proceeds to testing the next processor unit (PU3). If, in step 420, no rule is passed, then in step 425 the second processor unit (PU2) is tested with the first/next sort test (based on a PU2 sort test counter value) until a sort is passed or no further sort tests are left. In step 430, it is determined if the second processor unit has passed any sort test. If no sort test has been passed, in step 435 the microprocessor is designated a functional non-sort part and the testing and sorting is terminated. If, in step 430, the second processor unit has passed a sort test, then in step 440 the sort test parameters are recorded in a table similar to Table II discussed supra and the method proceeds to step 445.
If, in step 445, any rule is passed, the method proceeds to testing the next processor unit (PU3). If not, then in step 455 all sort test counters of the current and previously tested processor units are incremented by one and the method loops to step 400. (In step 455, the counters for PU1 and PU2 are incremented.) Testing, sorting and looping of processor units PU3 to the next to last processor unit (PUN-1) are similar to testing the second processor unit. The flow diagram of
Skipping to testing the last processor unit (PUN), in step 460, the last processor unit (PUN) is tested with the first/next sort test (based on a PUN sort test counter value) until a sort is passed or no further sort tests are left. In step 465, it is determined if the last processor unit (PUN) has passed any sort test. If no sort test has been passed, in step 470, the microprocessor is designated a functional non-sort part and the testing and sorting is terminated. If in step 465, the last processor unit has passed a sort, then in step 475, the sort test parameters are recorded in a table similar to Table II discussed supra and the method proceeds to step 480.
If, in step 480, any rule is passed, the method proceeds to step 485. If not, then in step 490 all sort test counters (PU1 through PUN) are incremented by one and the method loops to step 400. In step 485, the rules are again applied to the cumulative passed sort tests recorded in Table II and a parameter combination for each processor unit that optimizes the goal of the sort testing is selected as described supra in reference to step 325 of
The method described in
A further decrease in tester time may be accomplished by “hard coding” the rules into the method flow and by selection and positioning of the sort tests themselves within the method flow as illustrated in
In
The steps in the first row of the array are performed in sequence with an immediate branch to the step in the first row of the next column in the event a test in any step of the column is passed. In the last column of the array an immediate branch to step 505(1) occurs in the event any test in the last column of the array is passed. Upon a pass of a test in the array, the VDD value and processor unit are recorded. In step 505(1) the microprocessor is designated as a sort 1 part number (P/N) (BIN is shorthand for P/N bin) with separate VDD codes indicating the pass VDD value of each processor unit. Optionally, the power supply voltage levels required for each processor unit may be encoded in fuse banks contained in the microprocessor. In subsequent module processing or packaging operations, the power supply voltage levels required for each processor unit may be encoded in fuse banks contained in the microprocessor, if they were not encoded in step 505(1), and/or they may be printed on the microprocessor module. In the event any test in the last row of the array is not passed, an immediate branch to step 510(11) is performed. A fail in step 500(X1) also causes a branch to step 510(11).
A SORT 2 matrix is defined by the corner steps 510(11), 510(X1), 510(1N) and 510(XN). In step 510(11), the first processor unit (PU1) is tested with the second sort (SORT 2) conditions against the first VDD value (VDD1). If the processor passes the test, then the method proceeds to step 510(12). Steps 510(11), 510(X1), 510(1N) and 510(XN) may be considered the corners of an array of potential tests. A first column of the array is defined by steps 510(11) through 510(X1) and a last column of the array is defined by steps 510(1N) through 510(XN). A first row of the array is defined by steps 510(11) through 510(1N) and a last row of the array is defined by steps 510(X1) through 510(XN). Each of the steps 510(11) through 510(1N) performs a SORT 2 test using VDD1 on a different processor unit and each of steps 510(X1) through 510(XN) performs a SORT 2 test using VDDX on a different processor unit. Movement between steps in the SORT 2 matrix is similar to that described for the SORT 1 matrix supra.
A SORT M matrix is defined by the corner steps 515(11), 515(X1), 515(1N) and 515(XN). In step 515(11), the first processor unit (PU1) is tested with the Mth sort (SORT M) conditions against the first VDD value (VDD1). If the processor passes the test, then the method proceeds to step 515(12). Steps 515(11), 515(X1), 515(1N) and 515(XN) may be considered the corners of an array of potential tests. A first column of the array is defined by steps 515(11) through 515(X1) and a last column of the array is defined by steps 515(1N) through 515(XN). A first row of the array is defined by steps 515(11) through 515(1N) and a last row of the array is defined by steps 515(X1) through 515(XN). Each of the steps 515(11) through 515(1N) performs a SORT M test using VDD1 on a different processor unit and each of steps 515(X1) through 515(XN) performs a SORT M test using VDDX on a different processor unit. Movement between steps in the SORT M matrix is similar to that described for the SORT 1 matrix supra, except that a fail in any of steps 515(X1) through 515(XN) causes a branch to step 520.
If step 520 is reached, no combination of VDDs results in a passed sort and the microprocessor is designated a functional non-sort part and the testing and sorting is terminated.
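By way of a non-limiting illustration, the flow through the SORT 1 through SORT M matrices may be sketched as three nested loops. The function name and the `passes` callable (which stands in for the tester) are hypothetical, and the sketch assumes that a processor unit failing every VDD value of a sort causes the branch to the next sort matrix.

```python
def bin_microprocessor(unit_count, vdd_values, sort_count, passes):
    """Return (sort_number, [vdd per unit]) for the first sort all units pass, else None.

    passes: callable (sort_number, unit_index, vdd) -> bool, standing in for the tester.
    """
    for sort_number in range(1, sort_count + 1):    # SORT 1 .. SORT M matrices
        chosen = []
        for unit_index in range(unit_count):        # columns: PU1 .. PUN
            for vdd in vdd_values:                  # rows: VDD1 .. VDDX
                if passes(sort_number, unit_index, vdd):
                    chosen.append(vdd)              # record the pass, go to the next unit
                    break
            else:
                break                               # unit failed every VDD: try next sort
        else:
            return sort_number, chosen              # every unit passed: assign this bin
    return None                                     # functional non-sort part
```

A return value of `None` corresponds to reaching step 520, where no combination of VDD values results in a passed sort.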
The arrays of potential tests between the SORT 2 matrix and the SORT M matrix are not shown, but indicated by the three dots between steps 510(X1) and 510(X2). In order to avoid repeating the same sort/processor unit/VDD test combination, the sorts and VDD value a processor unit passes may be tracked and the flow through each of the N times X times M potential test arrays adjusted automatically based on earlier test results on the microprocessor to avoid repeating identical tests.
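By way of a non-limiting illustration, the tracking of earlier test results may be sketched as a wrapper that remembers the outcome of each sort/processor unit/VDD combination so that an identical test is never run on the tester twice. The function names are hypothetical.

```python
def make_tracked_tester(run_test):
    """Wrap a tester so each (sort, unit, vdd) combination is executed at most once."""
    cache = {}

    def tracked(sort_number, unit_index, vdd):
        key = (sort_number, unit_index, vdd)
        if key not in cache:
            cache[key] = run_test(sort_number, unit_index, vdd)   # first request: run it
        return cache[key]                                         # repeats: reuse result

    return tracked
```

The wrapped tester can be used wherever the original tester would be called, and tester time is spent only on combinations that have not yet been tried on the microprocessor.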
ROM 620 contains the basic operating system for computer system 600. The operating system may alternatively reside in RAM 615 or elsewhere as is known in the art. Examples of removable data and/or program storage device 630 include magnetic media such as floppy drives and tape drives and optical media such as CD ROM drives. Examples of mass data and/or program storage device 635 include electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk-read only memory (CD-ROM), compact disk-read/write (CD-R/W) and DVD. In addition to keyboard 645 and mouse 650, other user input devices such as trackballs, writing tablets, pressure pads, microphones, light pens and position-sensing screen displays may be connected to user interface 640. Examples of display devices include cathode-ray tubes (CRT) and liquid crystal displays (LCD).
A computer program with an appropriate application interface may be created by one of skill in the art and stored on the system or a data and/or program storage device to simplify the practicing of this invention. In operation, information for or the computer program created to run the present invention is loaded on the appropriate removable data and/or program storage device 630, fed through data port 660 or typed in using keyboard 645.
Thus, the embodiments of the present invention provide a method to guarantee that a microprocessor's performance and heating are optimized across the integrated circuit chip.
The description of the embodiments of the present invention is given above for the understanding of the present invention. It will be understood that the invention is not limited to the particular embodiments described herein, but is capable of various modifications, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, it is intended that the following claims cover all such modifications and changes as fall within the true spirit and scope of the invention.