The present invention relates generally to integrated circuits, and relates more specifically to a design structure for built-in self-test mechanisms.
A built-in self-test (BIST) mechanism within an integrated circuit (IC) is a function that verifies all or a portion of the internal functionality of the IC. There are many different approaches to architecting memory BIST, each of which has distinct advantages and disadvantages.
For example, one conventional BIST architecture uses a serial path to propagate test data from the BIST engine to the memory when setting up a memory write operation, or from the memory to the BIST engine when collecting data from a memory read operation. The write and read operations themselves are always performed in a parallel or “broadside” manner. Although this approach succeeds in minimizing BIST logic area utilization, it also requires a large number of “overhead cycles” to serially propagate control data to the memory, and observed data from the memory, for each cycle of memory operation. Moreover, since memory operations are separated from each other by several serial shift cycles, back-to-back at-speed memory operations are not possible, and the effectiveness of at-speed testing is limited. Additionally, the overhead cycles contribute to an increase in test time. This can be mitigated by performing all operations (including the serial shift) “at-speed;” however, this also burdens the chip designer with the task of closing at-speed timing on the serial shift paths (in addition to the broadside access paths and all of the other functional paths of the chip).
Another conventional approach uses a totally broadside BIST architecture in which all memory inputs and outputs propagate to and from the memory along a wide bus to and from the BIST engine. Although this architecture allows back-to-back “at-speed” memory operations (and, as such, provides good at-speed test coverage for the memory), it also requires higher utilization of the BIST logic at the memory. Further, because the BIST engine and supporting BIST pattern distribution logic, compare logic, encoder logic, and redundancy failing address and repair register (FARR) must be run at-speed when running BIST, a burden is placed on the chip designer to close timing on some BIST logic and on all of the functional logic.
In one embodiment, the invention is a method, apparatus, and design structure for built-in self-test for embedded memory in integrated circuit chips. One embodiment of a method for built-in self-test of an embedded memory includes setting up a plurality of test patterns at a speed of a test clock, where the speed of the test clock is slow enough for a tester to directly communicate with a chip in which the memory is embedded, and where the setting up includes loading a plurality of signal states used to communicate the test patterns to one or more components of a built-in self-test system, applying the test patterns to the embedded memory as a microburst at-speed, capturing output data from the embedded memory at-speed, the output data corresponding to only one of the test patterns, and comparing the output data to expected data at the speed of the test clock.
So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments.
In one embodiment, the present invention is a method and apparatus for improved built-in self-test for integrated circuit chips. Embodiments of the invention reduce test area overhead of at-speed built-in self-test through two main mechanisms: (1) time-multiplexing of test logic; and (2) an asynchronous interface that allows the test inputs (test patterns) provided to the memory and the test results (output data bits) captured from the memory to be set up and compared at low-speed, but executed within the memory at high-speed. This allows a tester to communicate with a chip and removes the requirement for most of the built-in self-test logic to run fast.
As illustrated, the BIST system 100 comprises a BIST controller 102, a BIST engine 104 (e.g., an asynchronous BIST or ABIST engine), a comparator/encoder 106, a first BIST collar (BIO) 108, and a second, “fast” BIST collar (FBIO) 110.
The BIST controller 102 schedules and enables BIST operations. The BIST controller 102 is coupled directly to the BIST engine 104, via a BIST controller (BC) interface 124 in the BIST engine 104. The BIST engine 104 is responsible for setting up cycles of memory operation (i.e., a combination of read and/or write operations, hereinafter referred to as “test patterns”) to apply to the memory 112. Within this context, “setting up” a test pattern refers to the act of loading into the registers 114 the specific digital signal states needed to communicate the test patterns to the first BIST collar 108. The test patterns comprise signals designed to enable and diagnose the memory 112. The BIST engine 104 comprises a plurality of registers 114₁-114ₙ (hereinafter collectively referred to as “registers 114”), in which a plurality of test patterns is set up (e.g., one test pattern per register 114). In one embodiment, the BIST engine 104 sets up these test patterns at a “slow” test clock speed, where the test clock speed is slow enough for a tester to directly communicate with the built-in self-test (e.g., in the range of approximately fifty to approximately two hundred fifty megahertz). In one embodiment, the slow clock is slow relative to the at-speed functional clock of the memory, although there may be instances in which the at-speed clock is actually slower than the “slow” clock.
The BIST engine 104 is directly coupled to the first BIST collar 108 and provides the test patterns, via the registers 114, to the first BIST collar 108 at the test clock speed. As described in further detail below, the BIST engine 104 provides the test patterns in groups as “microbursts” to the first BIST collar 108. In one embodiment, a microburst comprises two or more test patterns. Each microburst spans a single memory address or a small subset of the total memory address space. As such, testing of the full memory address space will generally require a plurality of microbursts.
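The grouping of test patterns into microbursts can be illustrated with a minimal Python sketch. This is a software illustration only, not the hardware described above; the function name `microbursts`, the `(operation, address)` pattern encoding, and the eight-address example are assumptions made for the sketch.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def microbursts(patterns: Sequence[T], n: int) -> Iterator[List[T]]:
    """Yield consecutive groups of n test patterns.

    The final group may be shorter than n if len(patterns) is not a
    multiple of n (an assumption of this sketch).
    """
    if n < 2:
        raise ValueError("a microburst comprises two or more test patterns")
    for start in range(0, len(patterns), n):
        yield list(patterns[start:start + n])


# Hypothetical example: one write and one read for each of 8 addresses.
patterns = [(op, addr) for addr in range(8) for op in ("write", "read")]

# With n = 4, each microburst spans two addresses, so covering the
# 8-address space takes several microbursts.
bursts = list(microbursts(patterns, 4))
```

In this example each microburst covers only a small subset of the address space, which is why testing the full space generally requires a plurality of microbursts.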
The first BIST collar 108 is also directly coupled to the comparator/encoder 106. As described in further detail below, the first BIST collar 108 receives output data bits from the memory via the second BIST collar 110 (i.e., via an output capture register of the second BIST collar 110) and forwards these output data bits to the comparator/encoder 106 for comparison and/or encoding.
The first BIST collar 108 is directly coupled to the second BIST collar 110, and sets up the test patterns to the second BIST collar 110 at the slow test clock speed. The second BIST collar 110 comprises a multiplexer 116, an address, data and control (A&D) register 118, a state machine 120, and an output capture (Q&BFM) register 122. The multiplexer 116 receives the test patterns from the first BIST collar 108 and allows the at-speed capture of the selected test patterns to the A&D register 118, which is configured to apply the test patterns directly to the memory 112 at-speed. In addition, the A&D register 118 returns one of the test patterns to the multiplexer 116 in order to keep the memory running until the next microburst. The output capture register 122 of the second BIST collar 110 is configured for capturing output data bits from one of a set of applied test patterns (i.e., one test pattern per microburst) at-speed from the memory 112, and for subsequently providing the output data bits to the first BIST collar 108 at the slow speed. Each group of output data bits represents a portion (1/n) of the word width of the memory 112, where the portion is inversely proportional to the number of test patterns (n) in a microburst. In the example shown in
The comparator/encoder 106 receives output data bits captured by the output capture register 122 of the second BIST collar 110 from the memory 112, which are provided to the comparator/encoder 106 via the first BIST collar 108. The comparator/encoder 106 compares the output data bits with expected data in order to determine if there is a match. In one embodiment, the expected data is generated by the BIST engine 104 in accordance with methods known to those of skill in the art. The comparator/encoder 106 outputs a signal indicative of the result of the comparison to the BIST engine 104. The output signal from the comparator portion of the comparator/encoder 106 is a fail signal (i.e., the BIST system 100 has encountered a fail), whereas the output signals from the encoder portion of the comparator/encoder 106 represent the failing column address of the fail. The encoder portion will also output a multiple-hit detect signal, for the case where more than one data out bit of the memory 112 has failed. The comparator/encoder 106 has a width equal to approximately 1/n the word width of the memory 112, where n is the number of test patterns contained in a microburst (e.g., if a microburst comprises 4 test patterns, the width of the comparator/encoder 106 is ¼ the word width of the memory 112). As described in further detail below, this allows the comparator/encoder logic to be shared across the n test patterns. The comparator/encoder 106 compares and/or encodes 1/n of the data output bits received from the memory 112 during each of the slow clock cycles used by the BIST engine 104 to set up a microburst. Thus, while the BIST engine 104 is setting up the next microburst, the comparator/encoder 106 evaluates the results of the “just-applied” microburst.
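The time-multiplexed comparison described above can be sketched in Python as follows. This is a behavioral illustration under stated assumptions, not the circuit itself: the names `SliceResult`, `compare_slice`, and `compare_word` are hypothetical, the word is modeled as a list of bits, and reporting the lowest failing column when several bits fail is an assumption of the sketch.

```python
from typing import List, NamedTuple, Optional


class SliceResult(NamedTuple):
    fail: bool                     # comparator output: a mismatch was seen
    failing_column: Optional[int]  # encoder output: column of a failing bit
    multiple_hit: bool             # encoder output: more than one bit failed


def compare_slice(observed: List[int], expected: List[int],
                  column_offset: int) -> SliceResult:
    """Compare one 1/n-wide slice of the captured word (one slow cycle)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected slices must be the same width")
    mismatches = [column_offset + i
                  for i, (o, e) in enumerate(zip(observed, expected))
                  if o != e]
    return SliceResult(
        fail=bool(mismatches),
        # Assumption: report the lowest failing column in the slice.
        failing_column=mismatches[0] if mismatches else None,
        multiple_hit=len(mismatches) > 1,
    )


def compare_word(observed: List[int], expected: List[int],
                 n: int) -> List[SliceResult]:
    """Evaluate a full word in n passes through a comparator of width W/n."""
    if len(observed) % n != 0:
        raise ValueError("word width must be divisible by n in this sketch")
    width = len(observed) // n
    return [
        compare_slice(observed[k * width:(k + 1) * width],
                      expected[k * width:(k + 1) * width],
                      k * width)
        for k in range(n)
    ]


# Hypothetical 16-bit word, n = 4, with failing bits at columns 5, 12 and 14.
expected = [0] * 16
observed = list(expected)
for column in (5, 12, 14):
    observed[column] = 1
results = compare_word(observed, expected, 4)
```

Each call to `compare_slice` stands in for one slow clock cycle, so the same narrow comparator is reused n times while the next microburst is being set up.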
The method 200 is initialized at step 202 and proceeds to step 204, where the BIST engine performs a setup of a first plurality of test patterns. In one embodiment, the BIST engine performs a setup of four test patterns, though in other embodiments the first plurality of test patterns can total any number of test patterns that is two or greater. In general, a smaller number of test patterns results in less effective testing from an at-speed and noise perspective, as well as a less significant savings in BIST area utilization; however, the smaller the number of test patterns, the smaller the amount of test-time impact. A greater number of test patterns generally results in more effective testing from an at-speed and noise perspective, as well as greater savings in BIST area utilization; however, the greater the number of test patterns, the greater the amount of test-time impact. The first plurality of test patterns is set up using a “slow,” tester-controlled clock. In one embodiment, the “slow” clock runs at any speed that is slower than the memory's at-speed functional clock.
In step 206, the BIST engine applies the first plurality of test patterns, via the first BIST collar and the second BIST collar, as a first “microburst” to the memory, using the memory's at-speed functional clock.
In step 208, the output capture register captures output data bits from only one of the test patterns applied in the microburst in step 206. The output data bits comprise the result of application of the corresponding test pattern to the memory and are captured at-speed.
Having captured the output data bits, the method 200 proceeds simultaneously to steps 210 and 212. In step 210, the comparator/encoder compares the output data bits to expected data, at the test clock speed, in order to determine if there is a memory fail. In one embodiment, the comparator/encoder sends a signal to the BIST engine indicative of the results of the comparison. In one embodiment, the comparator/encoder identifies not just the existence of a fail, but also the specific component(s) within the memory (e.g., data bit(s)) that has failed.
In step 212, the BIST engine determines whether to continue testing (i.e., whether any untested memory space remains). If the BIST engine concludes in step 212 that testing does not need to continue, the method 200 terminates in step 214.
Alternatively, if the BIST engine concludes in step 212 that testing should continue, the method 200 returns to step 204, where the BIST engine sets up a next plurality of test patterns at the test clock speed. While it might appear from the flowchart of
Thus, the BIST system of the present invention limits the amount of BIST architecture that must be run at-speed. Specifically, only the memory and the second BIST collar need to be run at-speed, while the remainder of the BIST system (i.e., the BIST engine, the first BIST collar, and the FARR) can be run at a slower speed. This greatly reduces the design time required to close functional timing for a design of an IC chip, which may comprise many hundreds of embedded memories.
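The set-up, apply, capture, and compare sequence of the method 200 can be sketched as a short Python simulation. This is a minimal behavioral model under stated assumptions: the `(operation, address, data)` pattern encoding, the function `run_bist`, the `StuckAtZeroMemory` fault model, and the choice to keep the last read of each microburst are all hypothetical and introduced only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: for a write, data is the value written;
# for a read, data is the expected value.
Pattern = Tuple[str, int, int]


class StuckAtZeroMemory(dict):
    """Hypothetical fault model: one address always stores 0 when written."""

    def __init__(self, bad_addr: int) -> None:
        super().__init__()
        self.bad_addr = bad_addr

    def __setitem__(self, addr: int, data: int) -> None:
        super().__setitem__(addr, 0 if addr == self.bad_addr else data)


def run_bist(bursts: List[List[Pattern]], memory: Dict[int, int]) -> List[int]:
    """Return the addresses whose captured data differed from expected data."""
    failing_addresses = []
    for burst in bursts:                    # step 204: set up (slow clock)
        captured = None
        for op, addr, data in burst:        # step 206: apply microburst (at-speed)
            if op == "write":
                memory[addr] = data
            else:
                # step 208: only one read per microburst is kept; this
                # sketch assumes it is the last one.
                captured = (addr, memory.get(addr), data)
        if captured is not None:            # step 210: compare (slow clock)
            addr, observed, expected = captured
            if observed != expected:
                failing_addresses.append(addr)
    return failing_addresses                # steps 212-214: no bursts remain


bursts = [
    [("write", a, 0xA), ("read", a, 0xA), ("write", a, 0x5), ("read", a, 0x5)]
    for a in range(4)
]
good = run_bist(bursts, {})
bad = run_bist(bursts, StuckAtZeroMemory(bad_addr=2))
```

In this model only the inner loop corresponds to logic that would need to run at-speed; the outer loop and the comparison correspond to the portions that can run on the slow clock.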
Moreover, because the width of the comparator/encoder is only a fraction of the memory word width, the chip area utilized by the BIST logic is greatly reduced. In addition, because the comparing and encoding are performed using the slow clock, the comparing and encoding functions can be powered down, requiring even less chip area.
Additionally, because memory output data is captured at-speed, but compared later during slow clock cycles, a real-time bit fail map for an at-speed memory failure can be generated with the slow clock and captured by the BIST engine. This enables tester clock-controlled at-speed memory diagnostics. That is, fails can be detected by the BIST system during a slow-clock compare sequence, and because the test data (output data bits from the memory) is still in the output capture register at the time of such detection, the location of the specific component(s) within the memory that has failed can be quickly identified.
One risk associated with BIST architectures that test the memory in a series of at-speed burst operations is poor test quality. Specifically, as discussed above, the at-speed bursts are separated by a number of non-operational cycles in which the memory is not accessed. These non-operational cycles are generated as the BIST engine readies data for the next burst and processes data from the last burst. There can be anywhere from one (or a small fraction of one) of these setup operations per one at-speed operation to many hundreds of setup operations per one at-speed operation, depending on the specific BIST architecture.
These non-operational cycles reduce test quality because the memory is inactive and therefore consumes much less power during the setup of the test patterns. During the actual burst portion of the test, however, the memory immediately begins consuming active power, which can result in significantly increased noise and reduced or collapsed power supplies. As such, several cycles of the burst may experience a localized test environment that is much different from the normal operating environment. Since many tests are run under stressful voltage or temperature settings, the noise and power supply issues may result in significant over-testing, which will, in turn, produce poor yield. Testing under less stressful conditions to account for environmental changes at the beginning of each burst can alternatively lead to significant under-testing for memory operations in later bursts (after the power supplies have recovered).
Thus, further embodiments of the present invention modify the state machine 120 illustrated in
The method 300 is initialized at step 302 and proceeds to step 304, where the BIST system 100 begins a burst operation (i.e., begins accesses to the memory 112). In step 306, the state machine 120 sets a valid read flag, so that read operations performed in accordance with the burst operation are treated as valid (i.e., output data bits will be captured for observation and comparison, as discussed above).
In step 308, the BIST system 100 accesses the memory 112 in accordance with the burst operation. In step 310, the BIST system 100 determines whether the memory access (e.g., read operations) is valid. In one embodiment, this step involves confirming that the valid read flag is set. If the BIST system 100 concludes in step 310 that the memory access is valid, the method 300 proceeds to step 312, where the BIST system captures output data bits from the memory 112, as discussed above. The method 300 then proceeds to step 314, where the state machine 120 updates. Alternatively, if the BIST system 100 concludes in step 310 that the memory access is not valid, the method 300 proceeds directly to step 314.
In step 316, the BIST system 100 determines whether the valid memory accesses have been completed. If the BIST system 100 determines in step 316 that the valid memory accesses have not been completed, the method 300 returns to step 308, and the BIST system 100 continues to access the memory 112.
Alternatively, if the BIST system 100 determines in step 316 that the valid memory accesses have been completed, the method 300 proceeds to step 318, where the state machine 120 turns off or disables the valid read flag. The method 300 then proceeds with two separate operations, which in one embodiment are performed substantially in parallel.
According to the first of these operations, the method 300 proceeds to step 320, where the BIST system 100 accesses the memory 112. In step 322, the BIST system 100 determines whether the next burst is ready to apply. If the BIST system 100 concludes in step 322 that the next burst is not ready, the method 300 returns to step 320, and the BIST system 100 continues to access the memory 112.
Alternatively, if the BIST system 100 concludes in step 322 that the next burst is ready, the method 300 returns to step 304 and proceeds as described above to begin the next burst operation.
According to the second of the two operations, in step 324, the BIST system 100 determines whether testing of the memory 112 is done. If the BIST system 100 concludes in step 324 that the test is done, the method 300 terminates in step 328. Alternatively, if the BIST system 100 concludes in step 324 that the test is not done, the method 300 proceeds to step 326, where the BIST system 100 sets up the next burst for application to the memory 112. The method 300 then proceeds to step 322 and proceeds as described above to begin the next burst operation, once the next burst is ready.
Thus, even once valid operations in accordance with a particular burst are completed, the BIST system 100 continues to access the memory 112, while at the same time the next burst is set up in the background. In other words, the BIST system 100 prepares for the next burst while the memory 112 continues to be operated. In some embodiments, the last operation to the memory is re-sent during this time, so that the memory continues to be accessed during the setup. However, the read data is not captured or processed because the valid read flag is turned off. Thus, the valid read flag controls whether output data bits are captured from the memory 112 and processed. In one embodiment, one or more shadow registers or data hold states are implemented in the BIST system architecture in order to enable this functionality.
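The role of the valid read flag in the method 300 can be sketched as a cycle-by-cycle trace in Python. This is an illustrative model only: the function `run_bursts`, the string encoding of operations, and the assumption that the setup time is expressed as a fixed number of at-speed cycles (`setup_cycles`) are hypothetical.

```python
from typing import List, Tuple


def run_bursts(bursts: List[List[str]],
               setup_cycles: int) -> List[Tuple[str, bool]]:
    """Return a per-cycle trace of (operation, captured) pairs.

    Assumes every burst is non-empty and that setting up the next burst
    takes setup_cycles at-speed cycles (both assumptions of this sketch).
    """
    trace: List[Tuple[str, bool]] = []
    for index, burst in enumerate(bursts):
        if not burst:
            raise ValueError("each burst must contain at least one operation")
        valid_read = True                       # step 306: set valid read flag
        for op in burst:                        # steps 308-316: valid accesses
            trace.append((op, valid_read and op == "read"))
        valid_read = False                      # step 318: clear the flag
        if index + 1 < len(bursts):             # step 324: test not yet done
            for _ in range(setup_cycles):       # steps 320-326: keep the memory
                last_op = burst[-1]             # busy by re-sending the last op
                trace.append((last_op, valid_read and last_op == "read"))
    return trace


trace = run_bursts([["write", "read"], ["write", "read"]], setup_cycles=3)
captured_cycles = sum(1 for _, captured in trace if captured)
```

In the resulting trace the memory is accessed on every cycle, but output data is captured only for reads issued while the flag is set.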
Re-executing the last read operation to the memory 112, as discussed above, will generate valid noise similar to normal memory operations. Moreover, there are certain fail mechanisms that respond to repeated reads of a memory cell over a long period of time (so that the next time a valid read of the memory cell is performed, a “continuous-read fail” may be observed). However, a drawback of re-executing the last write operation to the memory is that writing a given memory cell repeatedly may force the memory cell, if weakly written (i.e., defective), into a passing state, thus hiding a true error.
In an alternative embodiment of the present invention, an addition is made to the memory 112 in order to detect weak write fails. In this embodiment, the memory 112 provides a control signal that the BIST system 100 can activate during write operations that occur while the BIST system is setting up test patterns. The control signal allows the bit lines to be activated (which accounts for most of the power consumption when the memory 112 is accessed), but the word line is not activated. This prevents weakly written memory cells from being re-written over time into a passing state, while still producing much of the normal noise profile and power consumption of a standard write operation.
The method 400 is initialized at step 402 and proceeds to step 404, where the BIST system 100 begins a burst operation (i.e., begins accesses to the memory 112). In step 406, the state machine 120 sets a valid read flag, so that read operations performed in accordance with the burst operation are treated as valid (i.e., output data bits will be captured for observation and comparison, as discussed above).
In step 408, the BIST system 100 accesses the memory 112 in accordance with the burst operation. In step 410, the BIST system 100 determines whether the memory access (e.g., read operations) is valid. In one embodiment, this step involves confirming that the valid read flag is set. If the BIST system 100 concludes in step 410 that the memory access is valid, the method 400 proceeds to step 412, where the BIST system captures output data bits from the memory 112, as discussed above. The method 400 then proceeds to step 414, where the state machine 120 updates. Alternatively, if the BIST system 100 concludes in step 410 that the memory access is not valid, the method 400 proceeds directly to step 414.
In step 416, the BIST system 100 determines whether the valid memory accesses have been completed. If the BIST system 100 determines in step 416 that the valid memory accesses have not been completed, the method 400 returns to step 408, and the BIST system 100 continues to access the memory 112.
Alternatively, if the BIST system 100 determines in step 416 that the valid memory accesses have been completed, the method 400 proceeds to step 418, where the state machine 120 turns off or disables the valid read flag. In step 420, the memory 112 asserts write word line suppress control, as discussed above. In particular, the word lines of the cells in the memory 112 are suppressed (not activated), while the bit lines are activated. The method 400 then proceeds with two separate operations, which in one embodiment are performed substantially in parallel.
According to the first of these operations, the method 400 proceeds to step 422, where the BIST system 100 accesses the memory 112. In step 424, the BIST system 100 determines whether the next burst is ready to apply. If the BIST system 100 concludes in step 424 that the next burst is not ready, the method 400 returns to step 422, and the BIST system 100 continues to access the memory 112.
Alternatively, if the BIST system 100 concludes in step 424 that the next burst is ready, the method 400 returns to step 404 and proceeds as described above to begin the next burst operation.
According to the second of the two operations, in step 426, the BIST system 100 determines whether testing of the memory 112 is done. If the BIST system 100 concludes in step 426 that the test is done, the method 400 terminates in step 430. Alternatively, if the BIST system 100 concludes in step 426 that the test is not done, the method 400 proceeds to step 428, where the BIST system 100 sets up the next burst for application to the memory 112. The method 400 then proceeds to step 424 and proceeds as described above to begin the next burst operation, once the next burst is ready.
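The effect of the write word line suppress control used in the method 400 can be illustrated with a toy Python model. Everything here is hypothetical and chosen only for illustration: the `WeakCell` class, its rule that a defective cell flips only after several consecutive writes, and the `burst_then_idle` helper are not taken from the description above.

```python
class WeakCell:
    """Hypothetical weakly written cell.

    The stored value changes only after `writes_needed` consecutive
    unsuppressed writes of a differing value (a toy defect model).
    """

    def __init__(self, writes_needed: int) -> None:
        self.value = 0
        self.writes_needed = writes_needed
        self._pending = 0
        self.bit_line_activations = 0

    def write(self, data: int, suppress_word_line: bool = False) -> None:
        self.bit_line_activations += 1   # bit lines toggle on every write
        if suppress_word_line:
            return                       # word line off: cell is untouched
        if data == self.value:
            self._pending = 0
            return
        self._pending += 1
        if self._pending >= self.writes_needed:
            self.value = data
            self._pending = 0


def burst_then_idle(cell: WeakCell, data: int,
                    idle_writes: int, suppress: bool) -> int:
    """Apply one valid write, then re-send it while the next burst is set up."""
    cell.write(data)                     # the valid write of the burst
    for _ in range(idle_writes):         # writes repeated during setup
        cell.write(data, suppress_word_line=suppress)
    return cell.value


unsuppressed = WeakCell(writes_needed=3)
suppressed = WeakCell(writes_needed=3)
value_without_suppress = burst_then_idle(unsuppressed, 1, 5, suppress=False)
value_with_suppress = burst_then_idle(suppressed, 1, 5, suppress=True)
```

Under this toy model, the repeated unsuppressed writes eventually force the weak cell to the written value and hide the defect, whereas with the word line suppressed the cell keeps its stale value while the bit lines are still exercised the same number of times.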
Alternatively, the BIST module 505 can be represented by one or more software applications (or even a combination of software and hardware, e.g., using Application-Specific Integrated Circuits (ASIC)), where the software is loaded from a storage medium (e.g., I/O devices 506) and operated by the processor 502 in the memory 504 of the general purpose computing device 500. Additionally, the software may run in a distributed or partitioned fashion on two or more computing devices similar to the general purpose computing device 500. Thus, in one embodiment, the BIST module 505 for testing embedded memory described herein with reference to the preceding figures can be stored on a computer readable storage medium (e.g., RAM, magnetic or optical drive or diskette, and the like).
Design flow 600 may vary depending on the type of representation being designed. For example, a design flow 600 for building an application specific IC (ASIC) may differ from a design flow 600 for designing a standard component or from a design flow 600 for instantiating the design into a programmable array, for example a programmable gate array (PGA) or a field programmable gate array (FPGA) offered by Altera® Inc. or Xilinx® Inc.
Design process 610 preferably employs and incorporates hardware and/or software modules for synthesizing, translating, or otherwise processing a design/simulation functional equivalent of the components, circuits, devices, or logic structures shown in
Design process 610 may include hardware and software modules for processing a variety of input data structure types including netlist 680. Such data structure types may reside, for example, within library elements 630 and include a set of commonly used elements, circuits, and devices, including models, layouts, and symbolic representations, for a given manufacturing technology (e.g., different technology nodes, 32 nm, 45 nm, 90 nm, etc.). The data structure types may further include design specifications 640, characterization data 650, verification data 660, design rules 670, and test data files 685 which may include input test patterns, output test results, and other testing information. Design process 610 may further include, for example, standard mechanical design processes such as stress analysis, thermal analysis, mechanical event simulation, process simulation for operations such as casting, molding, and die press forming, etc. One of ordinary skill in the art of mechanical design can appreciate the extent of possible mechanical design tools and applications used in design process 610 without deviating from the scope and spirit of the invention. Design process 610 may also include modules for performing standard circuit design processes such as timing analysis, verification, design rule checking, place and route operations, etc.
Design process 610 employs and incorporates logic and physical design tools such as HDL compilers and simulation model build tools to process design structure 620 together with some or all of the depicted supporting data structures along with any additional mechanical design or data (if applicable), to generate a second design structure 690. Second design structure 690 resides on a storage medium or programmable gate array in a data format used for the exchange of data of mechanical devices and structures (e.g. information stored in an initial graphics exchange specification (IGES), drawing exchange format (DXF), Parasolid XT, JT, DRG, or any other suitable format for storing or rendering such mechanical design structures). Similar to design structure 620, second design structure 690 preferably comprises one or more files, data structures, or other computer-encoded data or instructions that reside on transmission or data storage media and that when processed by an ECAD system generate a logically or otherwise functionally equivalent form of one or more of the embodiments of the invention shown in
Second design structure 690 may also employ a data format used for the exchange of layout data of integrated circuits and/or symbolic data format (e.g. information stored in a GDSII (GDS2), GL1, OASIS, map files, or any other suitable format for storing such design data structures). Second design structure 690 may comprise information such as, for example, symbolic data, map files, test data files, design content files, manufacturing data, layout parameters, wires, levels of metal, vias, shapes, data for routing through the manufacturing line, and any other data required by a manufacturer or other designer/developer to produce a device or structure as described above and shown in
It should be noted that although not explicitly specified, one or more steps of the methods described herein may include a storing, displaying and/or outputting step as required for a particular application. In other words, any data, records, fields, and/or intermediate results discussed in the methods can be stored, displayed, and/or outputted to another device as required for a particular application. Furthermore, steps or blocks in the accompanying Figures that recite a determining operation or involve a decision, do not necessarily require that both branches of the determining operation be practiced. In other words, one of the branches of the determining operation can be deemed as an optional step.
While the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof. Various embodiments presented herein, or portions thereof, may be combined to create further embodiments. Furthermore, terms such as top, side, bottom, front, back, and the like are relative or positional terms and are used with respect to the exemplary embodiments illustrated in the figures, and as such these terms may be interchangeable.