1. Field of the Invention
The present invention relates to scan testing chips having multiple cores.
2. Background Art
Chip multithreading (CMT) processors and chips may include a number of cores. The cores may include flops, combinational logic, and other features grouped to facilitate executing any number of operations commonly associated with integrated circuits. One or more of the cores may be the same type of core insofar as they are logically and physically the same design copied multiple times on a die. This type of CMT can be used to integrate the power of symmetric multiprocessing (SMP) onto a single chip, allowing a single processor to execute several software threads simultaneously. Traditional single-core processors can only process one thread at a time, spending a majority of time waiting for data from memory. CMT processors can process multiple software threads using a variety of methods, such as (i) having multiple cores on a single chip (CMP), (ii) executing multiple threads on a single core (SMT), or (iii) a combination of both CMP and SMT.
Scan testing may be used to test the chip for manufacturing defects. Scan testing generally involves serially shifting stimulus data into scan flops in order to program the flops to execute a desired operation. The data for a particular test pattern can be arranged into a scan chain where the scan chain includes a stimulus bit for each flop required to execute the desired operation. Multiple scan chains can be used in parallel to speed testing and/or to support different test patterns. The programmed flops can then be instigated to execute the desired operation according to the stimulus data, typically according to a functional clock that operates at a greater speed than the scan clock used to facilitate programming the flops. Each of the executed flops may generate a response bit to reflect its execution of the desired operation. This information can then be shifted out of the flops for analysis. An error can be determined based on whether the response bits match the corresponding test bits.
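The shift-in, capture, and shift-out sequence described above can be illustrated with a minimal Python sketch. It is a software model only, not a description of any particular tester or chip: the function name `scan_test`, the `logic` callable standing in for the logic exercised on the functional clock, and the bit ordering are all assumptions made for illustration.

```python
from typing import Callable, List


def scan_test(stimulus: List[int],
              logic: Callable[[List[int]], List[int]],
              expected: List[int]) -> List[int]:
    """Return the flop positions whose response bit differs from the test bit.

    `logic` is a hypothetical stand-in for the circuit exercised between
    shift-in and shift-out; it maps the programmed flop values to responses.
    """
    flops = [0] * len(stimulus)

    # Shift-in: one stimulus bit enters the chain per scan-clock cycle.
    for bit in reversed(stimulus):
        flops = [bit] + flops[:-1]

    # Capture: the flops execute the operation and hold the response bits.
    flops = logic(flops)

    # Shift-out: one response bit leaves the chain per scan-clock cycle.
    response = []
    for _ in range(len(flops)):
        response.append(flops[-1])
        flops = [0] + flops[:-1]
    response.reverse()  # restore flop order (flop 0 first)

    # Compare each response bit with its corresponding test bit.
    return [i for i, (r, e) in enumerate(zip(response, expected)) if r != e]


# Illustrative logic: every flop captures the inverse of its stimulus bit.
invert = lambda f: [b ^ 1 for b in f]
print(scan_test([1, 0, 1, 1], invert, [0, 1, 0, 0]))
```

In this model an empty list means that every response bit matched its test bit; any returned index identifies a flop whose response differed.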
The present invention is pointed out with particularity in the appended claims. However, other features of the present invention will become more apparent and the present invention will be best understood by referring to the following detailed description in conjunction with the accompanying drawings in which:
With chips becoming more complex, the number of flops per chip has increased and it is not uncommon to have 1-2 million flops in a microprocessor. With geometries shrinking in advanced semiconductor process technologies, there is a need for test patterns that target complex fault models such as transition faults, path delay faults, bridging faults, multiple detect faults, etc. The number of scan test patterns required to target all these fault models in complex microprocessors has increased significantly. The increase in number of flops and in the number of test patterns has resulted in test data volumes that do not fit cost-effectively inside testers and in manufacturing test flows.
The test configuration 30 only requires the tester to output F number of bits to test C number of cores each having the same F number of flops, as opposed to the C*F number of bits required in the test arrangement described in
Because the cores 12, 14, 16, 18 are logically and physically identical, the response of the cores 12, 14, 16, 18 should be the same for each test pattern. If the response of one of the cores 12, 14, 16, 18 fails to match that of the other cores 12, 14, 16, 18, it can be assumed that one of the cores 12, 14, 16, 18 has an error. Optionally, the compactor 50 may be an exclusive-or gate tree configured to exclusively-or the response bit of each corresponding core 12, 14, 16, 18 as the bits are scanned out of the flops. The exclusive-or function requires the tester 36 to process only a single output bit against a single test bit in order to determine whether one of the cores 12, 14, 16, 18 has an error. Once the core 12, 14, 16, 18 having the error is known, the position of the error bit with respect to the other bits shifted out of the cores 12, 14, 16, 18 can be used to identify the flop actually causing the error. Of course, the exclusive-or function is unable to detect masking errors where each of the cores 12, 14, 16, 18 has the same error at the same time; however, it is assumed that such a masking error is relatively unlikely.
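The exclusive-or compaction and its masking limitation can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the circuit of the compactor 50: the function names `compact` and `failing_positions` are hypothetical, and the fault-free compacted value is taken to be the exclusive-or of one copy of the test bit per core.

```python
from functools import reduce
from operator import xor
from typing import List


def compact(responses: List[List[int]]) -> List[int]:
    """XOR the response bits of all cores, shift position by shift position."""
    return [reduce(xor, bits) for bits in zip(*responses)]


def failing_positions(responses: List[List[int]],
                      expected: List[int]) -> List[int]:
    """Return shift positions where the compacted bit differs from the
    fault-free compacted bit (assumption: identical cores, so the
    fault-free value is the XOR of len(responses) copies of the test bit)."""
    n = len(responses)
    golden = [e if n % 2 else 0 for e in expected]
    compacted = compact(responses)
    return [i for i, (c, g) in enumerate(zip(compacted, golden)) if c != g]


good = [1, 0, 1, 1]          # hypothetical fault-free response of one core
bad = [1, 0, 0, 1]           # same response with the bit at position 2 flipped

print(failing_positions([good, good, bad, good], good))  # one faulty core
print(failing_positions([bad, bad, bad, bad], good))     # same error in all cores
```

With four cores, the first call reports position 2, while the second call reports nothing because the identical error in every core cancels in the exclusive-or. In this parity model any even number of cores sharing the same error at the same position would cancel in the same way.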
The additional demultiplexer 62 and compactor 60 may operate in the same manner as the demultiplexer 32 and compactor 50 described above such that the tester need only output F*P total number of stimulus bits to program each chain of F flops for P number of patterns and the tester 36 need only process two error bits, one from each of the compactors 50, 60. Optionally, an additional compactor (not shown) could be included on the chip 10 to compact the outputs from the two illustrated compactors 50, 60 so as to reduce the outputted error bits to one. In this case, an error would indicate that one of the cores 12, 14, 16, 18 failed under one of the test patterns, but it would be unknown whether it was in response to the first or second test pattern. Testing the chip 10 in this manner can increase the number of patterns that can be tested in the same period of time relative to the single chain testing.
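The stimulus bit counts discussed above (F versus C*F for a single pattern, and F*P when P patterns are applied) can be checked with a short Python sketch. The function name `stimulus_bits` and the example values of C, F, and P are assumptions chosen only for illustration.

```python
def stimulus_bits(cores: int, flops_per_chain: int,
                  patterns: int = 1, broadcast: bool = True) -> int:
    """Number of stimulus bits the tester must output.

    broadcast=True  : each bit is fanned out on chip to every core (F*P bits).
    broadcast=False : each core receives its own copy from the tester (C*F*P bits).
    """
    per_core = flops_per_chain * patterns
    return per_core if broadcast else cores * per_core


# Hypothetical figures: C = 4 cores, F = 1000 flops per chain, P = 2 patterns.
print(stimulus_bits(4, 1000, 2, broadcast=True))
print(stimulus_bits(4, 1000, 2, broadcast=False))
```

For these example values the broadcast arrangement needs 2000 stimulus bits where per-core delivery would need 8000, a reduction by the factor C.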
In both of the above configurations shown in
The ability of the scan register 70 to maintain this state information for each of cores 12, 14, 16, 18 allows the tester 36, when an error is detected, to stop scanning out the response bits and instead instigate a scan operation of the scan register 70 so that the bits in the scan register can be compared against a test bit to determine which one or more of the cores 12, 14, 16, 18 outputted a different bit relative to the other cores 12, 14, 16, 18, i.e., the core 12, 14, 16, 18 actually having the error. Once the core 12, 14, 16, 18 having the error is known, the position of the error bit with respect to the other bits shifted out of the cores 12, 14, 16, 18 can be used to identify the flop actually causing the error. Of course, the scan register 70 may be configured in any other manner and may store more than one bit. Storage of a single bit at a time for each core may be advantageous in limiting the memory demands of the chip.
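The diagnosis flow described above, in which the tester stops shifting on the first flagged error and reads the per-core bits held in the scan register, might be modelled as in the following Python sketch. The function name `diagnose`, the list-based representation of the scan register, and the assumption that the shift position maps directly to a flop index are all illustrative and are not taken from the scan register 70 itself.

```python
from functools import reduce
from operator import xor
from typing import List, Optional, Tuple


def diagnose(responses: List[List[int]],
             expected: List[int]) -> Optional[Tuple[int, List[int]]]:
    """Return (shift position, indices of differing cores) for the first
    flagged error, or None if no error is flagged."""
    n = len(responses)
    for position, test_bit in enumerate(expected):
        # Model of the scan register: the bit most recently shifted out
        # of each core, one bit per core.
        scan_register = [core[position] for core in responses]

        # Error flag from the XOR compactor (fault-free value assumed to be
        # the XOR of n copies of the test bit).
        golden = test_bit if n % 2 else 0
        if reduce(xor, scan_register) != golden:
            # Stop shifting responses; scan out the register and compare
            # each core's bit against the test bit.
            faulty = [c for c, bit in enumerate(scan_register) if bit != test_bit]
            return position, faulty
    return None


good = [1, 0, 1, 1]
cores = [good, good, [1, 0, 0, 1], good]   # core index 2 differs at position 2
print(diagnose(cores, good))
```

Here the returned pair identifies both the core whose bit differed and the shift position at which it differed, which under the stated assumption corresponds to the flop causing the error.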
A chip, for example, may include two levels of hierarchy with four identical cores at the first level and four micro-cores per core at the second level. Since all the micro-cores are identical, they need exactly the same test stimulus for a certain level of fault coverage. Also, given the same test stimulus, they will generate exactly the same test response if there are no faults present. As described above, the present invention supports connecting the scan chains such that one scan-in pin of the chip fans out to each of the scan chains in the 16 micro-cores. Each bit of test stimulus data can thereby be shifted in from the pin, replicated internally into 16 bits at the fanout branches, and fed to the 16 scan chains, i.e., any test stimulus driven by the tester into the scan-in pin can be replicated to the 16 micro-cores. The ends of the 16 chains that shift out test responses can feed an exclusive-OR gate. The output of this gate can be connected to a single scan-out pin. Since the micro-cores are identical, their test responses to the replicated test stimulus are the same when no faults are present. As the test response is shifted out of the chains, the output of the exclusive-OR will therefore be zero (low) in a fault-free case, or one (high) if there is a fault in one or more micro-cores, because the scan chains corresponding to the faulty micro-cores will have test responses that differ from the rest. The scheme described can be sufficient for a pass/fail test.
A diagnostic register can be used to process the test stimulus. This can be achieved by modifying the exclusive-OR into a programmable compactor, where a selected chain gets connected to the scan-out pin. In a diagnosis mode, each chain can be connected directly to the scan-out pin, and multiple test runs with different scan chains connected to the scan-out pin can be performed to identify the faulty core. Another way to do this is to connect all 16 chains to 16 different scan-out pins, which will be used only when diagnosing and not during manufacturing test. If the multiple runs need to be reduced further, an on-chip signature compressor can be added at the end of each of the chains and an exclusive-OR of the signatures can be performed. In test mode, the exclusive-OR is visible to the tester, and in diagnosis mode, each of the signatures can be looked at via a scan or any other slow test port.
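The programmable compactor described above, which outputs the exclusive-OR of all chains in test mode and a single selected chain in diagnosis mode, can be sketched in Python as follows. The names `scan_out` and `find_faulty_chains`, and the use of a `select` argument to model the programming of the compactor, are assumptions made for illustration only.

```python
from functools import reduce
from operator import xor
from typing import List, Optional


def scan_out(chains: List[List[int]], select: Optional[int] = None) -> List[int]:
    """Bits seen at the single scan-out pin.

    select=None : test mode, XOR of all chain outputs.
    select=k    : diagnosis mode, chain k connected directly to the pin.
    """
    if select is None:
        return [reduce(xor, bits) for bits in zip(*chains)]
    return list(chains[select])


def find_faulty_chains(chains: List[List[int]], expected: List[int]) -> List[int]:
    """One diagnosis run per chain; return chains whose response differs."""
    return [k for k in range(len(chains))
            if scan_out(chains, select=k) != expected]


good = [1, 0, 1, 1]                        # hypothetical fault-free response
chains = [list(good) for _ in range(16)]   # 16 identical micro-core chains
chains[5][3] ^= 1                          # inject a fault in chain 5, bit 3

print(scan_out(chains))                    # test mode: nonzero bit flags a failure
print(find_faulty_chains(chains, good))    # diagnosis mode: 16 runs
```

In this sketch the test-mode output is nonzero only at the position of the injected fault, and the diagnosis runs then single out the chain with index 5. The cost of this approach is one additional test run per chain, which is the overhead the signature-compressor variant is intended to reduce.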
One non-limiting aspect of the present invention relates to reducing test data volume of scan patterns for CMT processors. Reducing the scan data volume allows for better utilization of tester memory. The savings in tester memory can be used towards fitting in other test patterns, thus increasing the overall test coverage, improving the outgoing quality, and reducing test escapes in manufacturing. Test patterns targeting a wide range of fault models and a larger number of patterns can be fit in the available tester memory. The present invention, for a fixed level of test coverage or quality, can reduce the test time and tester memory. The invention is not restricted to CMT processors and can be applied to any chip design that has multiple instances of design blocks that are logically and physically identical.
As required, detailed embodiments of the present invention are disclosed herein; however, it is to be understood that the disclosed embodiments are merely exemplary of the invention that may be embodied in various and alternative forms. The figures are not necessarily to scale; some features may be exaggerated or minimized to show details of particular components. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a representative basis for the claims and/or as a representative basis for teaching one skilled in the art to variously employ the present invention.
While embodiments of the invention have been illustrated and described, it is not intended that these embodiments illustrate and describe all possible forms of the invention. Rather, the words used in the specification are words of description rather than limitation, and it is understood that various changes may be made without departing from the spirit and scope of the invention.