The field of the present disclosure relates to systems, methods and apparatus for testing integrated circuits.
Even the best integrated circuit designs are subject to flaws, such as physical flaws or timing flaws. The flaws may arise during manufacturing or at any time over the life of the chip. Thus, integrated circuits are typically tested before and/or after packaging.
Testing integrated circuits may be costly in terms of test-cycle duration and engineering time devoted to designing tests and examining test results. Further, integrated circuits may have a plethora of inputs and outputs that are not accessible via external pads or pins. As a result, internal defects may not be readily discernable by simply using externally accessible inputs and outputs. Accordingly, many integrated circuits are designed for test, i.e., having testability capability. One such technique is known as built-in self-testing (BIST). According to this technique, an integrated circuit is designed to include BIST circuitry, which enables application of a test pattern to the integrated circuit's functional circuitry (which is modified to be BIST-operable) and observation of a response of the functional circuitry to the test pattern. If the observed response matches an expected value, the functional circuitry can be considered to be operating properly.
Field-programmable object arrays (FPOAs) have been developed to fill a gap between field-programmable gate arrays (FPGAs) and application-specific integrated circuits (ASICs). While FPGAs are programmable at the gate level, they may not be able to keep up with some demanding applications, such as machine vision, video applications, medical imaging, and radar processing. While ASICs can be designed to have the processing power to meet those demands and others, the time and cost required to develop an ASIC may be too great in certain situations. FPOAs can sometimes be suitable for demanding applications, such as those and others, while the programmable nature of an FPOA (like an FPGA) can considerably alleviate development time and costs, as compared to an ASIC.
A typical FPOA comprises a number of programmable objects along with a programmable high-speed interconnection network. The objects in an FPOA, as compared to the relatively simple gates of an FPGA, can be considerably more complex, while the number of objects in a typical FPOA is usually much less than the number of gates in an FPGA. Examples of object types include arithmetic logic units (ALUs), multiply/accumulate units (MACs), and memory banks such as register files (RFs). An FPOA's objects, which are typically designed in advance to have timing closure at high clock speeds, can be combined in intuitive ways to provide powerful processing capability, which is especially well suited for byte-width, word-width, or other multi-bit data.
The unique architecture and features of FPOAs present challenges and opportunities for built-in self-testing of FPOAs.
According to one embodiment, an integrated circuit with built-in self-testing capability comprises an array of programmable objects, a plurality of interfaces, and a controller. The array of programmable objects may be designed to operate at an operational clock speed during non-testing operation, wherein the design of the objects is not constrained to require within an object extra circuitry not essential to non-testing operation to facilitate built-in self-testing. The plurality of interfaces may be connected to the objects to enable communication with the objects and to thereby facilitate built-in self-testing of the objects. The controller may be operably connected to the objects and to the interfaces and configured to cause a selected subset of the objects to be activated and configured for testing, to stimulate the selected subset of objects for a given time with an input test pattern delivered via one or more of the plurality of interfaces while the selected subset of objects operates at the operational clock speed, and to observe a response of the selected subset of objects for testing purposes.
According to another embodiment, a method tests an integrated circuit comprising in substantial part an array of objects in a central region of the integrated circuit and further comprising registers outside of the central region. The test method may comprise (a) establishing a subset of the objects as a set of objects-under-test, (b) configuring the array of objects so that the set of objects-under-test and a set of the registers communicate via a set of intermediate objects in the array, (c) testing the set of objects-under-test, (d) establishing a new set of objects-under-test as a different subset of the objects, and (e) repeating steps (b), (c), and (d) until every object in the array has been included in at least one set of objects-under-test. Step (c) may include setting the set of objects-under-test into a configuration, stimulating the set of objects-under-test with a test pattern via the set of intermediate objects while the set of objects-under-test operates, and receiving an output pattern from the set of objects-under-test in response to the test pattern, the output pattern received at the set of registers via the set of intermediate objects.
According to still another embodiment, a method tests an integrated circuit comprising an array of objects. The method may comprise fully powering up a set of objects to be tested, partially powering up another set of objects to allow unidirectional segmented buses included therein to transfer data to and from the fully-powered-up set of objects, fully powering down any remaining objects of the array, thereby limiting the array's power consumption, and transmitting a test pattern to the fully powered-up set of objects and an output pattern from the fully powered-up set of objects via the partially powered-up set of objects, the output pattern generated by the fully powered-up set of objects in response to the test pattern.
As one skilled in the art will appreciate in view of the teachings herein, certain embodiments may be capable of achieving certain advantages, including by way of example and not limitation one or more of the following: (1) the ability to reduce power consumption and heat generation during testing; (2) the ability to perform testing at full clock speed for an object's core functionality; (3) little or no imposition of dedicated testing circuitry in objects, thereby allowing a smaller footprint that provides greater core operational speeds and/or usable functionality; (4) flexibility to adjust the complexity of testing by controlling the size, shape, and/or location of selected portions of the chip tested together, thereby enabling a trade-off among thoroughness of testing, observability, and other factors such as, for example, speed and power consumption; (5) the ability to test objects to a statistically significant degree of thoroughness by pseudo-randomly setting input stimulus as well as the configurations or set-up of objects-under-test; (6) the ability to test operation of an array or other core to a statistically significant degree using communication circuitry on the periphery of the array or other core and therefore not impacting the design of the array or other core; (7) the ability to test long delay paths; (8) enabling the use of simpler board and power supply designs; (9) reduced current surges and associated voltage dips during testing; and (10) the ability to perform testing on a variety of unique objects without altering the test method (i.e., the testing technique is invariant to the object's design). These and other advantages of various embodiments will be apparent upon reading the following.
With reference to the above-listed drawings, this section describes particular embodiments and their detailed construction and operation. The embodiments described herein are set forth by way of illustration only. In light of the teachings herein, those skilled in the art will recognize that there may be equivalents to what is expressly or inherently taught herein. For example, variations can be made to the embodiments described herein and other embodiments are possible. It is not always practical to exhaustively catalog all possible embodiments and all possible variations of the described embodiments.
For the sake of clarity and conciseness, certain aspects of components or steps of certain embodiments are presented without undue detail where such detail would be apparent to those skilled in the art in light of the teachings herein and/or where such detail would obfuscate an understanding of more pertinent aspects of the embodiments.
Before describing detailed examples of BIST for FPOAs, the FPOA architecture and associated concepts are first described in this and the subsequent three subsections.
Within the core 105, the objects 115 may generally be of any type, designed by the FPOA maker to have any feasible size, architecture, capabilities, and features. Some specific examples of object types include arithmetic logic units (ALUs) 116, multiply accumulators (MACs) 117, and memory banks such as register files (RFs) 118. In brief, a typical ALU 116 may perform logical and/or mathematical functions and may provide general purpose truth functions for control, a typical MAC 117 may perform multiply operations and include an accumulator for results, and a typical RF 118 contains memory that can be utilized as, for example, RAM (random access memory), a FIFO (first-in first-out) structure, or as a sequential read object. For example, one version of an ALU 116 may have a 16-bit data word length, one version of a MAC 117 may operate on 16-bit multiplicands and have a 40-bit accumulator, and one version of a RF may have 0.16 KB (kilobytes) of memory organized as 64 20-bit words. For purposes of the BIST techniques described herein, the internal construction and operational capabilities of the objects 115 are arbitrary.
The size of an FPOA and the number of objects 115 is also arbitrary (within realistic constraints for semiconductor manufacturing). The example FPOA 100, as illustrated in
The objects 115 may communicate with each other and/or the periphery region 110 by various methods. For example, two forms of communication mechanisms are (1) nearest neighbor communication and (2) party line communication. A nearest neighbor communication mechanism allows a core object 115 to communicate with one or more of its immediate neighbors. A party line communication mechanism allows an object 115 to communicate with other objects 115 at greater distances and with objects in the periphery region 110. Examples of such communication mechanisms will be described in more detail with respect to
An FPOA can include—typically in its periphery or non-core region—various interfaces used for initialization and control of the array and/or other parts of the device. For example, the FPOA 100 includes a Joint Test Action Group (JTAG) controller 120, which can provide access to a set of registers for controlling the FPOA 100, and a PROM (programmable read-only memory) controller 125, which can oversee the FPOA 100's loading and initialization process. In case a PROM is not present or does not contain valid initialization instructions and/or data, the JTAG controller 120 may also initialize the FPOA 100. If a PROM is present, a PROM controller 125 can oversee the FPOA 100's loading and initialization process (an example of which will be described in more detail with reference to
An FPOA can also include—also typically in its periphery or non-core region—a number of interfaces for communicating with external devices. For example, the FPOA 100 includes two general purpose input/output (GPIO) objects or interfaces 135 (located on the north and south side of the FPOA 100), each of which can provide and/or interface to bidirectional I/O lines or pins, allowing data transfer between the FPOA 100 and external devices. As another example, four high speed interfaces can also be provided on the east and west side of the FPOA 100: two transmit (TX) interfaces 140 and two receive (RX) interfaces 145. The interfaces may operate according to a protocol, such as, for example, parallel low-voltage differential signaling (LVDS). A greater or lesser number of I/O interfaces can be provided in different versions of FPOAs. In addition, other types of I/O interfaces may be provided, such as PCI-e, XAUI, and others.
An FPOA may also include memory and/or memory interfaces in its non-core region. By way of example, the example FPOA 100 comprises XRAM (external RAM) interfaces 150 and IRAM (internal RAM) 160 in its periphery region 110. The XRAM interfaces 150 can provide access to external memory, which may be potentially large in capacity (e.g., 16 meg (16×10^6) by 72-bit of data accessible via the RLDRAM-II protocol). The IRAM 160 may be a bank of on-chip memory (e.g., single port 2K (2048) by 76-bit SRAM), which can be preloaded during initialization. Thus, the FPOA 100 has three groups of memory: (1) the RFs 118 (assuming such objects are included in the core 105); (2) the XRAM 150; and (3) the IRAM 160.
The example object 215 also includes four nearest neighbor registers 340, 350, 360, and 370, each of which may provide data to two adjacent objects via the appropriate pair of nearest neighbor communication channels 321-328. For example, an object directly to the east of object 215 may pull data from register 370 via the nearest neighbor communication channel 321_OUT. Likewise, an object to the northeast of object 215 may pull data from register 370 via the nearest neighbor communication channel 322_OUT. In a similar vein, the object 215 may pull data from the nearest neighbor registers of each of its eight adjacent neighbors. For example, the object 215 may pull data from the southwest nearest neighbor register of an object that is northeast of the object 215 via the nearest neighbor communication channel 322_IN. Likewise, the object 215 may pull data from the southwest (or northwest) nearest neighbor register of an object that is northeast (or east) of the object 215 via the nearest neighbor communication channel 321_IN.
In one example implementation, the party line communication channels are implemented as unidirectional segmented buses. Such a bus is segmented in the sense that it passes through some logic circuitry (e.g., one of the launch multiplexers 425, 430, 435, or 440) and/or a register (e.g., one of the party line registers 445, 450, 455, or 460) along the way from one bus segment to the next bus segment.
With respect to the example implementation illustrated in
The object 215 may also be configured to “land” data from a previous object on a bus into one of its party line registers 445, 450, 455, or 460. For example, a value on the party line communication channel 405_IN may be stored in the north party line register 450 via the north party line multiplexer 470 and/or stored in the south party line register 460 via the south party line multiplexer 475. As shown, the north party line register 450 may also store values from the core 465 and/or from the party line communication channel 410_IN via the north party line multiplexer 470. Likewise, the south party line register 460 may also store values from the core 465 and/or from the party line communication channel 410_IN via the south party line multiplexer 475.
The object 215 may “launch” values to another object via various party line communication channels and the launching multiplexers 425, 430, 435, and 440. For example, the south launching multiplexer 440 can launch data from the south party line register 460, nearest neighbor registers 457 and 462, or the party line communication channels 405_IN, 415_IN, and 420_IN. Likewise, the north launching multiplexer 430 can launch data from the north party line register 450, nearest neighbor registers 447 and 452, or the party line communication channels 410_IN, 415_IN, and 420_IN.
Each multiplexer 425, 430, 435, 440, 470, 475, 480, and 485 has a selector (e.g., a select input) to control which of the input signals (e.g., 410_IN, 415_IN, etc.) will be used as the output signal (e.g., 410_OUT). Thus, the party line communication channels are controlled by the selectors of the multiplexers. The selector can be set to a static position when the object is initialized or it can be controlled dynamically during runtime. The configuration and initialization of the object will be described in more detail with respect to
As shown in
With continued reference to
The components that make up the core 505 and communications infrastructure 510 may be selectively connected to the power bus. For example, one or more transistors may be dedicated to connecting or disconnecting the various components within the core 505 and communications infrastructure 510 to the power bus. Selectively coupling the logic core 505 and communications infrastructure 510 to the power bus allows one or the other or both to be powered up or down in any desired combination. For example, the logic core 505 may be powered up when it is needed to perform functions and powered down when not in use. By way of another example, any combination of the party line communication channels 520, 525, 530, and 535 may be powered up when needed to relay data to other objects. Thus, the object 500 may have both its core 505 and party line communication channels 520, 525, 530, and 535 powered down when not in use and powered up when in use. As another possibility, the object 500 may have its logic core 505 powered down but have the north and south party line communication channels 520 and 525 powered up to allow objects to the north and south of the object 500 to pass data through the object 500. Further, the object 500 may have its core 505 powered down but have the east and west party line communication channels 530 and 535 powered up to allow objects to the east and west of the object 500 to pass data through the object 500.
Powering up or down the core 505 and communications infrastructure 510 may be accomplished in other ways. For example, the logic core 505 and communications infrastructure 510 may be selectively disconnected from power rails or a ground bus (or ground plane). Further, the clock(s) driving the core 505 and the communications infrastructure 510 may be selectively gated, slowed, or disconnected. For example, if only the communications infrastructure 510 is needed, the clock driving the communications infrastructure 510 may be activated while the clock driving the core 505 may be deactivated. This effectively prevents portions of the circuitry from changing states.
Next, in a configuration state 615, the objects within the FPOA are configured. According to one example, a scan chain controller (as will be discussed in more detail with respect to
After the configuration information has been loaded, the objects operating in the high-speed domain are coupled to the high-speed clock in state 620, and the objects are initialized in state 625. A current surge may occur at step 620 due to the activation of dynamic logic within the objects. For example, current surges up to approximately 50 amperes within a nanosecond are possible. To help reduce the current surge, the objects are set to a predetermined default state at step 625. For example, the predetermined default state may be configured to reduce the number of state changes within the object (e.g., reduce the number of toggling signals). After the power supply has stabilized from the sudden current inrush, the FPOA can begin executing its application in its normal operation state 630 (e.g., by clearing an initialize signal). Another large current surge may occur as the objects transition from their default state (e.g., data paths stable) to their running state. The FPOA remains in the normal operation state until dislodged from that state, such as, for example, when a JTAG controller (see
In a control or debugging state 635, a JTAG controller can pause the operation of the FPOA at any point during normal operation. This may allow, for example, the internal status of any objects within the FPOA to be observed. After the JTAG controller has completed its operation, the FPOA may be returned to its normal operation (i.e., to the normal operation state 630) or reset (i.e., to the loading state 610).
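By way of illustration only, the state sequence described above can be sketched as a small transition table. The names `State`, `TRANSITIONS`, `step`, and `run` are hypothetical, and the set of allowed transitions is one reading of the description (loading 610, configuration 615, clock coupling 620, initialization 625, normal operation 630, and control or debugging 635), not a definitive statement of the device's behavior.

```python
from enum import Enum


class State(Enum):
    # Values mirror the reference numerals used in the description.
    LOADING = 610
    CONFIGURATION = 615
    CLOCK_COUPLING = 620
    INITIALIZATION = 625
    NORMAL_OPERATION = 630
    DEBUG = 635


# Allowed transitions (an interpretation of the text, not a specification).
TRANSITIONS = {
    State.LOADING: {State.CONFIGURATION},
    State.CONFIGURATION: {State.CLOCK_COUPLING},
    State.CLOCK_COUPLING: {State.INITIALIZATION},
    State.INITIALIZATION: {State.NORMAL_OPERATION},
    State.NORMAL_OPERATION: {State.DEBUG},
    # After debugging, the array resumes normal operation or is reset.
    State.DEBUG: {State.NORMAL_OPERATION, State.LOADING},
}


def step(current, target):
    """Move to `target` if the transition is allowed, else raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target


def run(path, start=State.LOADING):
    """Follow a sequence of target states and return the final state."""
    state = start
    for target in path:
        state = step(state, target)
    return state
```

In this sketch, a pause by the JTAG controller corresponds to `step(State.NORMAL_OPERATION, State.DEBUG)`, and a reset corresponds to `step(State.DEBUG, State.LOADING)`.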
As will be described in more detail with respect to
According to this example, the scan chain controller 710 operates in a low-speed clock domain (i.e., it typically operates at a lower clock speed than the operational speed of the objects). As used herein, the terms “low-speed” and “high-speed” simply mean lower and higher than each other, respectively, without implying any numerical values or ranges of values. By way of example and not limitation, the scan chain controller 710 may operate at approximately 50 MHz (megahertz or one million cycles per second) or less, whereas the objects are designed to operate at approximately 1 GHz (gigahertz or one billion cycles per second). Of course, the scan chain controller 710 and the objects may operate at other clock speeds. For example, the scan chain controller 710 may also operate in the high-speed clock domain and may control other high-speed and/or low-speed scan chains. In addition, scan chain configurations other than those shown in
According to one example, three sets of scan chains are utilized: (1) two party line scan chains (e.g., scan chains 750 and 751); (2) two configuration scan chains (e.g., scan chains 752 and 753); and (3) one latch scan chain (e.g., scan chain 754). As previously mentioned, certain scan chains may not be readable. For example, the latch scan chain 754 may only have a scan chain input 754_IN, but no scan chain output 754_OUT.
The party line scan chains 750 and 751 may be used to configure how each object interacts with the party line communication channels. For example, the party line scan chains 750 and 751 may control the selector of each multiplexer in the object to dictate which of the input signals will be used as an output signal (refer to
The configuration scan chains 752 and 753 may be used to configure the primary or core functionality of objects. For example, internal registers, counters, instructions, addresses, and the like can be programmed or set in this way. Again, by using two configuration scan chains 752 and 753, data may be shifted in more quickly. Of course, fewer or additional scan chains may be used to configure the primary functionality of the objects. Finally, the latch scan chain 754 may be used to configure any memories within the objects. Additional scan chains may be used to configure object memories.
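The benefit of splitting configuration data across two chains can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that the chains are of equal length and are shifted in parallel at one bit per chain per clock cycle; the function names, the 100,000-bit payload, and the use of the approximately 50 MHz low-speed clock as a default are hypothetical choices, not figures taken from any particular device.

```python
def shift_cycles(total_bits, num_chains):
    """Clock cycles needed to shift `total_bits` through `num_chains`
    balanced chains loaded in parallel (one bit per chain per cycle)."""
    if total_bits < 0 or num_chains < 1:
        raise ValueError("total_bits must be >= 0 and num_chains >= 1")
    return -(-total_bits // num_chains)  # integer ceiling division


def shift_time_us(total_bits, num_chains, clock_hz=50e6):
    """Shift time in microseconds at the assumed scan clock frequency."""
    return shift_cycles(total_bits, num_chains) / clock_hz * 1e6


# Hypothetical payload of 100,000 configuration bits.
one_chain = shift_cycles(100_000, 1)   # 100000 cycles
two_chains = shift_cycles(100_000, 2)  # 50000 cycles
```

Under these assumptions, doubling the number of chains halves the number of shift cycles, which is the sense in which data "may be shifted in more quickly."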
Of course, an FPOA may utilize additional or fewer scan chains to program objects in the core region. Further, one or more scan chains may be used to program logic in the non-core region, such as the I/O interfaces, memory interfaces, and memory. By way of example, a scan chain circling the periphery of the FPOA may be used to program the I/O interface.
The description of
Built-in self-testing an FPOA may be accomplished in various ways, as described in this subsection.
According to one embodiment, built-in self-testing of an FPOA proceeds by testing a subset of objects at one time. The subset of objects-under-test may be a single object or all objects but is typically less than all objects (i.e., a proper subset) and most typically a small number of objects. In that case, comprehensive testing of all objects to some extent (but not necessarily to the full extent of an individual object's capabilities) can be achieved by serially testing different subsets until all subsets have undergone testing.
Before describing details of testing approaches at the subset level or the array level, the concept of subsets will be explained further. The case of a regular rectilinear array of objects 800 is illustrated in
Before or after the objects 830 in the RUT 835 have been tested, all or most of the other objects in the array 800 may be tested by iteratively testing new RUTs or other subsets of objects. For example, a RUT of the same size and shape may start in the lower left-hand portion of the core region of the array and iteratively march right in steps of one column or greater. The size of the step is preferably less than the size of the RUT in the east-west direction for comprehensive coverage of all objects. After the RUT reaches the rightmost column, the RUT may be moved up one or more rows, and the right-marching process repeated until all or most of the objects have been tested at least once. The size of the RUT may be altered during the test or from test to test. For example, a large RUT may be used to test the number of hops data can move in one clock cycle. Indeed, the RUT may have a square, rectangular, or other geometric shape or may comprise one or more entire columns or rows.
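The marching process described above can be sketched as follows. This is a minimal illustration under stated assumptions: the array is a regular rectilinear grid, row 0 is the southernmost row, and the final placement in each direction is clamped flush against the array edge (the clamping is an addition made here for convenience, not something the description requires). The names `rut_positions` and `coverage` are hypothetical.

```python
def rut_positions(rows, cols, rut_h, rut_w, step_rows=1, step_cols=1):
    """Yield (row, col) of the RUT's lower-left corner, marching right
    across each band of rows and then moving up to the next band."""
    if not (0 < rut_h <= rows and 0 < rut_w <= cols):
        raise ValueError("RUT must fit inside the array")
    if step_rows < 1 or step_cols < 1:
        raise ValueError("steps must be at least 1")

    def starts(extent, size, step):
        last = extent - size
        positions = list(range(0, last + 1, step))
        if positions[-1] != last:
            positions.append(last)  # final placement flush with the edge
        return positions

    for r in starts(rows, rut_h, step_rows):
        for c in starts(cols, rut_w, step_cols):
            yield r, c


def coverage(rows, cols, rut_h, rut_w, **steps):
    """Count how many RUT placements include each object in the array."""
    counts = [[0] * cols for _ in range(rows)]
    for r0, c0 in rut_positions(rows, cols, rut_h, rut_w, **steps):
        for r in range(r0, r0 + rut_h):
            for c in range(c0, c0 + rut_w):
                counts[r][c] += 1
    return counts
```

The `coverage` helper makes the remark about step size concrete: with a step no larger than the RUT dimension every count is at least one, whereas a larger step (for example, a step of five columns with a two-column RUT) leaves some objects with a count of zero.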
Also shown in
The SOIs 810 may also enable the objects 800 to communicate with, for example, a BIST module 820, as shown in
The BIST module 820 may program the objects 800 such that some or all of the objects 830 in the RUT 835 are fully powered up, other objects 840 are partially powered up to enable the north/south party line communication, other objects 850 are partially powered up to enable the east/west party line communication, and yet other objects 860 are fully powered down. The objects 830 collectively define the objects to be tested (i.e., the RUT 835). The objects 840 collectively define the north and south communication channels 845 that allow communication between the RUT 835 and the SOIs 810 to the north and south of RUT 835. Likewise, the objects 850 collectively define the east and west communication channels 855 that allow communication between the RUT 835 and SOIs 810 to the east and west of RUT 835. Fully powering down the remaining objects (i.e., objects 860) may help limit the total power consumption of the array and help prevent a large current inrush. As will be described in more detail with respect to
After programming the objects 800, the BIST module 820 may tell some or all of the objects to run at full speed for a certain number of clock cycles. For example, the objects 830, 840, and 850 may be coupled to the high-speed clock and initialized so that one or more test patterns may be transmitted to the objects 830 (i.e., RUT 835) via objects 840 and/or 850 and one or more output patterns generated by objects 830 may be carried away via objects 840 and/or 850. The test patterns may be generated, for example, by linear feedback shift registers (LFSRs) 811 within the SOIs 810 and the output patterns may be received by, for example, multiple-input shift registers (MISRs) 812 within the SOIs 810. The MISRs 812 may compress the output patterns into a signature register 813, which may also be located within the SOIs 810. As will be described in more detail below, after the objects 800 have been tested, a final signature may be obtained by serially shifting the data within each signature register 813 out of the SOIs 810 and into the BIST module 820. If the final signature matches an expected value, the FPOA is likely operating as designed.
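The pattern generation and response compression described above can be modeled in software as follows. This is a sketch under stated assumptions: a 16-bit word width, a Fibonacci-style feedback arrangement with the tap set for x^16 + x^14 + x^13 + x^11 + 1 (a commonly cited choice, not one stated in this description), and a hypothetical `healthy` function standing in for the response of the RUT. All function and constant names are illustrative.

```python
MASK16 = 0xFFFF
TAPS = (0, 2, 3, 5)  # assumed feedback taps; not taken from the description


def _shift(state):
    """Advance a 16-bit feedback shift register by one clock."""
    feedback = 0
    for tap in TAPS:
        feedback ^= (state >> tap) & 1
    return ((state >> 1) | (feedback << 15)) & MASK16


def lfsr_patterns(seed, count):
    """Yield `count` pseudo-random 16-bit test words from a nonzero seed."""
    state = seed & MASK16
    if state == 0:
        raise ValueError("an all-zero seed would lock up the LFSR")
    for _ in range(count):
        yield state
        state = _shift(state)


def misr_signature(responses, state=0):
    """Compress a stream of response words into one 16-bit signature by
    shifting with feedback and XOR-ing each response word into the state."""
    for word in responses:
        state = _shift(state) ^ (word & MASK16)
    return state


def run_bist(dut, seed=0xACE1, count=64):
    """Stimulate `dut` with LFSR patterns and return the MISR signature."""
    return misr_signature(dut(p) for p in lfsr_patterns(seed, count))


def healthy(word):
    """Hypothetical stand-in for the response of a fault-free RUT."""
    return (word + 1) & MASK16
```

A pass/fail decision then reduces to comparing `run_bist(dut)` against a signature obtained from a known good reference. Because the compression is lossy, two different response streams can in principle yield the same signature, which is why a matching signature indicates that the device is likely, rather than certainly, operating as designed.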
The method 900 then sets up (step 920) the array with the RUT in an initial position, such as one corner of the array. In doing so, the method 900 fully or partially powers up or down array objects as desired, such as described above in relation to
Finally, the method 900 checks (step 980) the final BIST signature, which has been updated all along, by, for example, comparing it to a known good result to decide whether the FPOA has passed or failed BIST testing. The known good result may be based on a simulation. In addition, the known good result may be based on empirical data. For example, after performing the same method 900 on a plurality of different FPOAs, an identical final BIST signature may be observed multiple times. The identical final BIST signature may be adopted as the known good result and be used to make subsequent pass/fail determinations.
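The empirical approach to establishing a known good result can be sketched as follows. The agreement threshold `min_agreement` and the function names are assumptions introduced for illustration; the description says only that an identical signature observed multiple times may be adopted.

```python
from collections import Counter


def adopt_golden(signatures, min_agreement=3):
    """Return the most frequently observed final signature if it was seen
    at least `min_agreement` times; otherwise return None."""
    if not signatures:
        return None
    signature, count = Counter(signatures).most_common(1)[0]
    return signature if count >= min_agreement else None


def passes(final_signature, golden):
    """Pass/fail determination: the final signature must match exactly."""
    return final_signature == golden
```

For example, given hypothetical signatures `[0x1A2B, 0x1A2B, 0xFFFF, 0x1A2B]` collected from four devices, `adopt_golden` returns `0x1A2B`, and the device that produced `0xFFFF` would fail a subsequent `passes` check.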
Next, the method 900 runs (step 946) the RUT (and any other powered or partially powered objects outside the RUT) for a number (K) of clock cycles. Preferably, step 946 occurs at the operational (e.g., “high”) clock speed at which the objects would operate during non-testing operation. In other words, the testing at this step is performed in the high-speed clock domain. The other steps of the method 900, by contrast, may be run in the low-speed clock domain. High-speed BIST testing of the objects per se is possible because the objects themselves need not be altered to accommodate BIST. That is to say, the design of the objects is not constrained to require within an object extra circuitry not essential to non-testing operation just to facilitate BIST. Rather, the non-essential testing circuitry is in the non-core (e.g., periphery) region of the array, which may operate in the low-speed clock domain without causing an operational performance penalty.
After the K clock cycles complete, the method 900 records (step 948) the outputs of the RUT objects, such as in the MISRs of the SOIs to which the RUT objects are linked. The steps 942-948 can thereafter be repeated a configurable number of times. An advantage of repeating the steps under varied configurations, set-up or initial conditions and/or varied input test patterns is an increase in the coverage of possible modes, states, and operational scenarios being tested, thereby increasing the statistical confidence level of the testing. Using known but random-like configurations and/or input test patterns seems to provide satisfactory variability in the testing conditions.
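The phrase "known but random-like" can be illustrated with a seeded generator: the sequence looks random, yet the same seed always reproduces it, so an expected signature computed once remains valid for later runs. The sketch below uses Python's software generator as a stand-in (on-chip hardware would more plausibly use a shift-register-based generator), and the name `make_plan`, the bit widths, and the pairing of a configuration word with a pattern seed are all hypothetical.

```python
import random


def make_plan(seed, iterations, config_bits=32, pattern_seed_bits=16):
    """Return a reproducible list of (configuration, pattern_seed) pairs,
    one pair per repetition of the configure/stimulate/run/record steps."""
    rng = random.Random(seed)  # private generator; same seed, same plan
    plan = []
    for _ in range(iterations):
        config = rng.getrandbits(config_bits)
        # Force a nonzero pattern seed, since an all-zero seed would
        # leave a typical LFSR stuck at zero.
        pattern_seed = rng.getrandbits(pattern_seed_bits) or 1
        plan.append((config, pattern_seed))
    return plan
```

Calling `make_plan(7, 5)` twice yields identical plans, which is the property that lets varied test conditions coexist with a single fixed expected result.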
Many variations of the method 900 or its steps are possible.
At step 1010, another set of objects is partially powered up to allow party line communication channels to relay data to and from the RUT. According to one embodiment, partially powering up an object involves fully powering up a portion of an object's party line communication channels, while powering down (e.g., gating the clocks) the rest of the party line communication channels and the core. For example, an object that relays data east and west may have its east and west party line communication channels powered up while its north and south party line communication channels and its core are powered down. By way of another example, an object that relays data north and south may have its north and south party line communication channels powered up while its east and west party line communication channels and its core are powered down. Powering up or down the core and the east/west or north/south party line communication channels may involve selectively connecting the various components that make up the core and party line communication channels to a power bus, a ground bus, and/or a high-speed clock.
According to another embodiment, partially powering up an object involves logically turning on a portion of the party line communication channels (e.g., the north/south party line communication channels or east/west party line communication channels) while logically turning off the rest of the party line communication channels and the core. For example, an object that relays data from east to west or west to east may be configured such that the multiplexer 485 (
By way of another example, an object that relays data from north to south or south to north may be configured such that the multiplexer 470 always places data from line 410_IN into party line north register 450 and the north launch multiplexer 430 always launches data from the party line north register 450 onto its output (410_OUT). Likewise, the multiplexer 475 may always place data from line 405_IN into party line south register 460 and the south launch multiplexer 440 may always launch data from the party line south register 460 onto its output (405_OUT). All of the other multiplexers (e.g., 425, 435, 480, and 485) may have a logical output of ‘0’ or ‘1’ regardless of their inputs. This may effectively disable the core 465 and the east/west party line communication channels by preventing them from communicating with other objects. In other words, the core and east/west party line communication channels are logically off.
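The relay configuration described above, in which each object always lands incoming data in a party line register and always launches from that register, can be modeled as a chain of registers. The sketch assumes, for illustration only, that each relaying object contributes exactly one clock cycle of latency; the class `RelayObject` and the function `relay` are hypothetical names, and a single register per object is modeled for one direction of travel.

```python
class RelayObject:
    """An object whose core is logically off and which relays data
    through a single party line register in one direction."""

    def __init__(self):
        self.register = 0

    def clock(self, data_in):
        """One clock edge: launch the stored value, land the new one."""
        data_out = self.register
        self.register = data_in
        return data_out


def relay(chain, stream, idle=0):
    """Clock `stream` through `chain` and return the word seen at the far
    end on every cycle, including cycles spent flushing the chain."""
    outputs = []
    for word in list(stream) + [idle] * len(chain):
        for obj in chain:
            # Each object receives what its upstream neighbor launched
            # on this same clock edge.
            word = obj.clock(word)
        outputs.append(word)
    return outputs
```

With three relaying objects, `relay([RelayObject() for _ in range(3)], [1, 2, 3])` returns `[0, 0, 0, 1, 2, 3]`: the data arrives intact, delayed by one cycle per object in the chain.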
At step 1015, any remaining objects in the array are preferably fully powered down. According to one embodiment, fully powering down an object includes powering down its core and party line communication channels. For example, the various components that make up the core and party line communication channels may have their clocks gated or may be disconnected from a power bus, a ground bus, and/or a high-speed clock. According to another embodiment, an object may be configured such that it is communicatively isolated from other objects. In other words, the four launch multiplexers 425, 430, 435, and 440 could be configured such that their outputs are always a logical ‘0’ or ‘1’ regardless of their inputs. Even though an object may be powered down, the powered-down object may still be configured via one or more scan chains and provide input to the RUT. For example, data shifted into a nearest neighbor register of a powered-down object may be pulled from a RUT object during test.
The objects within the array need not be configured into a fully powered up, partially powered up, or fully powered down state in any particular order. In fact, according to one embodiment each object's configuration is shifted in. Thus, all of the objects may be effectively configured together. In other words, the steps 1005, 1010, and 1015 need not be performed in the order shown in
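By way of illustration only, the three object states described above (fully powered up, partially powered up, and fully powered down) can be sketched as a per-object classification. The sketch assumes a rectangular RUT and assumes that objects sharing a row with the RUT relay east/west while objects sharing a column with the RUT relay north/south; that placement rule, and the names `PowerState`, `classify`, and `power_map`, are hypothetical and are not taken from the disclosure.

```python
from enum import Enum


class PowerState(Enum):
    FULL = "fully powered up"        # object lies inside the RUT
    RELAY_EW = "east/west relay"     # only east/west party lines active
    RELAY_NS = "north/south relay"   # only north/south party lines active
    OFF = "fully powered down"       # core and all party lines off


def classify(col, row, rut_col, rut_row, rut_w, rut_h):
    """Return the power state of the object at (col, row).

    Assumption: relay objects are those that share a row or a column
    with the RUT, so that data can travel between the RUT and the edge
    of the array.
    """
    in_cols = rut_col <= col < rut_col + rut_w
    in_rows = rut_row <= row < rut_row + rut_h
    if in_cols and in_rows:
        return PowerState.FULL
    if in_rows:
        return PowerState.RELAY_EW
    if in_cols:
        return PowerState.RELAY_NS
    return PowerState.OFF


def power_map(array_w, array_h, rut_col, rut_row, rut_w, rut_h):
    """Classify every object in an array_w-by-array_h array."""
    return {
        (c, r): classify(c, r, rut_col, rut_row, rut_w, rut_h)
        for r in range(array_h)
        for c in range(array_w)
    }
```

Because each state is computed independently per object, the whole map can be produced in a single pass, which mirrors the observation that the objects need not be configured in any particular order.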
At step 1020, the fully powered-up set of objects (e.g., the RUT) is stimulated with a test pattern. For example, one or more LFSRs 811 (
At step 1045, the objects-under-test are tested. According to one embodiment, the testing includes setting the objects-under-test to an initial condition or object configurations. For example, one of the objects-under-test may be configured to communicate with a different one of the objects-under-test using a party line communication channel (e.g., via a party line scan chain). In addition, one of the objects-under-test may be configured to communicate with an adjacent object outside of the set of objects-under-test (e.g., by pulling data from a nearest neighbor register of an adjacent object). Further, the primary functionality of the objects-under-test may be configured using a configuration scan chain and any memories within the objects-under-test may be configured using a latch scan chain.
Once the objects-under-test are set to an initial condition, the objects-under-test are stimulated with a test pattern via the set of intermediate objects. For example, a full speed clock may drive the set of intermediate objects and the objects-under-test. This allows a pseudo-random test pattern generated by the LFSRs 811 (
At step 1050, a new set of objects-under-test is established to include a different subset of objects. Steps 1040, 1045, and 1050 may repeat until step 1055 determines that every object in the array has been included in at least some number (e.g., 1) of sets of objects-under-test. Once every object in the array has been included in at least one set of objects-under-test, the signatures 813 in all the SOIs 810 may be serially shifted around the array and compressed into a final signature. At step 1060, the final signature is compared to an expected value to determine whether the array of objects is functioning properly. According to one embodiment, step 1050 involves shifting the set of objects-under-test one column to the east. Then steps 1040, 1045, and 1050 may be repeated until the set of objects-under-test reaches the east-most column in the array. At this point, the set of objects-under-test may be shifted up one row and steps 1040, 1045, and 1050 may be repeated again until the set of objects-under-test reaches the east-most column. The cycle in this embodiment repeats until the set of objects-under-test reaches the north-east corner of the array.
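By way of illustration only, the east-then-north traversal described above can be sketched as a generator of origin positions for the set of objects-under-test. The sketch assumes that the set returns to its starting column each time it is shifted up one row, and it assumes a rectangular set of objects-under-test; the function names and parameters are hypothetical.

```python
def rut_positions(start_col, start_row, shifts_right, shifts_up):
    """Yield the (col, row) origin of the region at each test location.

    The region moves one column east per step. After the east-most
    location it moves up one row and (by assumption) restarts from the
    starting column.
    """
    for up in range(shifts_up + 1):
        for right in range(shifts_right + 1):
            yield (start_col + right, start_row + up)


def covered_objects(positions, rut_w, rut_h):
    """Return the set of objects included in at least one test location."""
    covered = set()
    for col, row in positions:
        for dc in range(rut_w):
            for dr in range(rut_h):
                covered.add((col + dc, row + dr))
    return covered
```

For a four-by-four array and a two-by-two region starting in the south-west corner, two shifts right and two shifts up visit nine locations and include every object at least once, which corresponds to the check performed at step 1055.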
According to one embodiment, the FPOA checks whether it should load its configuration information via PROM or a JTAG interface upon power-up (see
As will be described in more detail with reference to
According to one embodiment, a BIST signal block 1130 connects the BIST controller 1125 to the objects 1100 and SOIs 1140-1145. Thus, the BIST signal block 1130 spans the high-speed and low-speed domains. As will be described in more detail with reference to
The BIST signal block 1130 may also transmit a Bndry_Data_OUT signal 1136. The Bndry_Data_OUT signal 1136 may communicate to the components 1105 various output values from the high-speed scan chains, such as a final BIST signature and outputs from the LFSR, MISR, and signature scan chains.
The BIST signal block 1130 may also communicate with a chain of SOIs 1146. For example, the BIST signal block 1130 may drive signature scan chain data, MISR scan chain data, and LFSR scan chain data to SOI 1140 via lines 1150_OUT, 1151_OUT, and 1152_OUT, respectively. Likewise, the BIST signal block 1130 may receive signature scan chain data, MISR scan chain data, and LFSR scan chain data from SOI 1145 via lines 1150_IN, 1151_IN, and 1152_IN, respectively.
According to one embodiment, the BIST signal block 1130 also communicates with the objects 1100. For example, the BIST signal block 1130 may tell the objects 1100 to pause via the HoldState signal 1160 and load their initial states (as may be determined by the data shifted in via the configuration scan chain 1122 and latch scan chain 1123) via the Initialize signal 1161.
According to one embodiment, the SOIs 1140-1145 communicate with objects 1100. For example, as will be described in more detail with respect to
According to one embodiment, the BIST state machine 1210 configures the objects, starts and stops full speed operation of the objects, stimulates the objects within the RUT, and observes a response of the objects within the RUT to the applied stimulus. In other words, the BIST state machine 1210 may execute the methods 1000 and 1030 described with reference to
As described with reference to
The configuration parameters may include one or more of the following: (1) RUT size parameters, such as a width and height of the RUT and a starting location of the RUT; (2) RUT shifting parameters, such as a number of times the RUT needs to be shifted right and a number of times the RUT needs to be shifted up; (3) RUT testing parameters, such as a number of clock cycles the objects in the RUT should be operated before altering the configuration within the RUT and a total number of times the configuration of the objects within the RUT should be altered before moving the RUT; (4) party line configuration parameters, such as a predetermined set of party line configurations that will be used to configure the fully powered up objects, the partially powered up objects, and the fully powered down objects and a predetermined set of select codes for specifying which of the predetermined party line configurations will be used for the objects within the RUT; and (5) object configuration parameters, such as seed values used to seed pseudo-random generators that may be used to pseudo-randomly configure the objects (e.g., seeds for LFSR0 1232 and LFSR1 1233). According to one embodiment, the configuration parameters are included as part of a BIST configuration scan chain that is shifted into registers 1230 within the BIST scan chain control block 1215 via a “Cfg Shift In” or similar signal 1231.
According to one embodiment, the BIST scan chain control block 1215 includes two internal 32-bit LFSR registers 1232 and 1233. The two LFSR registers 1232 and 1233 may be used to supply pseudo-random data to the configuration scan chains and latch scan chains (e.g., 1122′ and 1123′). In addition, the two LFSR registers may be used to supply pseudo-random data to an input/output scan chain that is used to configure objects in the periphery region (e.g., IRAM). While the LFSR registers 1232 and 1233 may be implemented using any feedback polynomial, a feedback polynomial according to one embodiment is set forth in the following equation, in which the terms present correspond to positions in the register where feedback connections are present (e.g., the first flip-flop, fifth flip-flop, sixth flip-flop, and thirty-first flip-flop), according to accepted conventions for specifying the construction of LFSRs:
$P(x) = x + x^{5} + x^{6} + x^{31}$.
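By way of illustration only, a 32-bit LFSR with feedback connections at the first, fifth, sixth, and thirty-first flip-flops can be sketched in software as follows. The shift direction, the point at which the feedback bit is inserted, and the choice of output bit are assumptions made for this sketch (several conventions exist), and the names `lfsr_step` and `lfsr_bits` are hypothetical.

```python
WIDTH = 32
TAPS = (1, 5, 6, 31)  # flip-flop positions, counted from 1


def lfsr_step(state):
    """Advance the register by one clock.

    Assumed convention: flip-flop 1 is bit 0, the register shifts toward
    bit 0, and the XOR of the tapped flip-flops is inserted at the most
    significant bit.
    """
    feedback = 0
    for tap in TAPS:
        feedback ^= (state >> (tap - 1)) & 1
    return (state >> 1) | (feedback << (WIDTH - 1))


def lfsr_bits(seed, count):
    """Return `count` output bits, taking bit 0 as the serial output."""
    state = seed & ((1 << WIDTH) - 1)
    out = []
    for _ in range(count):
        out.append(state & 1)
        state = lfsr_step(state)
    return out
```

As with any LFSR built from XOR feedback, an all-zero state produces only zeros, so a seed value shifted in as a configuration parameter would ordinarily be non-zero.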
The BIST state machine 1210 may shift pseudo-random data into the objects using the configuration scan chain 1122′ during the initial configuration of the objects. In addition, if the configuration of the objects changes iteratively before moving the RUT, one or more bits of pseudo-random data may be shifted onto the configuration scan chain 1122′. The pseudo-random data may be supplied by the two 32-bit LFSR registers 1232 and 1233. While the pseudo-random data may be selected in other ways, according to one embodiment, the pseudo-random data is selected by concatenating the first 20 bits of the LFSR 1232 followed by the first 20 bits of the LFSR 1233. Since there may actually be two configuration scan chains, one scan chain can pull twenty bits from the LFSR 1232 while the other scan chain pulls twenty bits from the LFSR 1233.
In a similar vein, the BIST state machine 1210 may shift in pseudo-random data using the latch scan chain 1123′ during the initial configuration of the objects. In addition, if the configuration of the objects changes iteratively before moving the RUT, according to one embodiment, three or more bits of pseudo-random data may be shifted onto the latch scan chain 1123′. For example, an ALU object may need three new bits of data from the latch scan chain to create a new instruction for the ALU object. Thus, during each reconfiguration iteration, the latch scan chain 1123′ may shift in three new pseudo-random bits instead of just one. Because an RF object may not need to be updated with new content, RF objects may be configured to simply pass incoming data on the latch scan chain 1123′ to downstream objects. The pseudo-random data may be supplied by the two 32-bit LFSR registers 1232 and 1233. While the pseudo-random data may be selected in other ways, according to one embodiment, the pseudo-random data is selected by concatenating the last ten bits of the LFSR 1232 followed by the last ten bits of the LFSR 1233. Since there may be only one latch scan chain, the latch scan chain may pull ten bits from the LFSR 1232 and ten bits from the LFSR 1233.
Likewise, the BIST state machine 1210 may shift in pseudo-random data using an I/O scan chain 1240 during the initial configuration of the objects. In addition, if the configuration of the objects changes iteratively before moving the RUT, according to one embodiment, one or more bits of pseudo-random data may be shifted onto the I/O scan chain 1240. The pseudo-random data may be supplied by the two 32-bit LFSR registers 1232 and 1233. While the pseudo-random data may be selected in other ways, according to one embodiment, the pseudo-random data is selected as the first four bits from the LFSR 1232.
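By way of illustration only, the bit selections described above for the configuration, latch, and I/O scan chains can be sketched as follows. The sketch assumes that the "first" bit of a register is bit 0 and the "last" bit is bit 31; the disclosure does not fix that ordering, and the function names are hypothetical.

```python
WIDTH = 32


def first_bits(value, count):
    """Return the first `count` bits of a register, bit 0 first (assumed)."""
    return [(value >> i) & 1 for i in range(count)]


def last_bits(value, count, width=WIDTH):
    """Return the last `count` bits of a register, ending at bit width-1."""
    return [(value >> i) & 1 for i in range(width - count, width)]


def config_chain_bits(lfsr0, lfsr1):
    """First 20 bits of one LFSR followed by the first 20 of the other."""
    return first_bits(lfsr0, 20) + first_bits(lfsr1, 20)


def latch_chain_bits(lfsr0, lfsr1):
    """Last ten bits of one LFSR followed by the last ten of the other."""
    return last_bits(lfsr0, 10) + last_bits(lfsr1, 10)


def io_chain_bits(lfsr0):
    """First four bits of the first LFSR."""
    return first_bits(lfsr0, 4)
```

Under this ordering the configuration selection yields 40 bits, the latch selection yields 20 bits, and the I/O selection yields 4 bits per draw, and the configuration and latch selections are taken from non-overlapping ends of each register.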
As previously described with reference to
According to one embodiment, eight party line configurations are shifted into the registers 1230 within the BIST scan chain control block 1215 along with other configuration parameters. Of course, additional or fewer party line configurations are possible. The objects not under test may have one of three party line configurations: (1) party lines disabled; (2) east-west party lines configured for retiming, and (3) north-south party lines configured for retiming. In the first configuration, all of an object's party line communication channels are powered down. When in the second configuration, an object's east and west party line communication channels may be configured to retime values (e.g., see
Objects within the RUT may be configured to use any number of party line configurations. As previously discussed with reference to
The eight party line configurations may include the three party line configurations previously described with reference to the objects not under test (i.e., party lines disabled, east-west party lines configured for retiming, and north-south party lines configured for retiming). In addition, the eight party line configurations may be optimized to test circuitry within an object's core. By way of example, the party line configurations for objects within a RUT may be selected such that a result from each object's core is launched onto one of the party lines so that the result is observable outside of the RUT. Further, the eight party line configurations may be optimized to test circuitry within an object's communication infrastructure. By way of example, party line configurations may be selected to explore as many party line launch multiplexer (e.g., multiplexers 425, 430, 435, and 440 of
According to one embodiment, each object in the array will have only one of the eight party line configurations applied at a time. The party line configurations may be the same for all object types regardless of the object's functionality (e.g., ALUs, RFs, or MACs). According to one embodiment, a table downloaded as part of the BIST scan chain defines which of the eight party line configurations applies to each object within an eight-by-eight RUT. The table may be implemented by an eight-by-eight array of 3-bit fields, each corresponding to a specific object in the RUT. Of course, a RUT smaller or larger than eight-by-eight objects may be used. A larger RUT may be provided using additional bits (e.g., 4 bits). In addition, configurations from the eight-by-eight array may be repeated in a modulo-eight fashion to provide a larger RUT.
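By way of illustration only, an eight-by-eight table of 3-bit select codes, together with the modulo-eight lookup mentioned above for a larger RUT, can be sketched as follows. The packing order of the fields within the scan chain is an assumption made for this sketch, and the names `pack_table` and `select_code` are hypothetical.

```python
FIELD_BITS = 3   # enough to select one of eight party line configurations
TABLE_DIM = 8    # eight-by-eight table


def pack_table(table):
    """Pack an 8x8 list of select codes (0..7) into one integer.

    Assumed layout: row-major order, with the field for row 0, column 0
    in the least significant bits.
    """
    packed = 0
    for row in range(TABLE_DIM):
        for col in range(TABLE_DIM):
            code = table[row][col]
            if not 0 <= code < (1 << FIELD_BITS):
                raise ValueError("select code does not fit in 3 bits")
            packed |= code << (FIELD_BITS * (row * TABLE_DIM + col))
    return packed


def select_code(packed, col, row):
    """Look up the select code for the object at (col, row) in the RUT.

    Positions outside the 8x8 table wrap around modulo eight.
    """
    index = (row % TABLE_DIM) * TABLE_DIM + (col % TABLE_DIM)
    return (packed >> (FIELD_BITS * index)) & ((1 << FIELD_BITS) - 1)
```

The packed table occupies 8 x 8 x 3 = 192 bits. Widening `FIELD_BITS` to 4 would allow sixteen configurations at a cost of 256 bits.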
Because, according to one embodiment, only eight possible party line configurations are used, only a subset of local resources (e.g., nearest neighbor registers and party line landing registers) will be launched from any given object within the RUT. Therefore, it may be necessary to reconfigure the objects within the RUT with different party line configurations (before moving the RUT) to provide full visibility to an object's local resources. In addition, reconfiguring the objects within the RUT may allow testing of other launch multiplexer selections (e.g., passing and turning) and hop count behavior. In other words, to fully exercise the objects, multiple party line configurations may be necessary and/or desirable.
By using the party-lines-disabled configuration for one or more of the objects within the RUT, a hole may be created in the RUT. This may allow the RUT to be larger than if all the objects were powered up and may be useful in testing large hop counts, such as when a slower core clock speed is used. In addition, an entire column or row may be the RUT. This may result in another party line configuration being downloaded in place of, for example, the party line retime east-west or the party line retime north-south configuration.
As previously mentioned, while eight party line configurations may be used, additional or fewer party line configurations are possible. The following example illustrates testing a twenty-by-twenty array using four RUTs having various party line configurations.
Initially, the twenty-by-twenty FPOA is programmed with a first set of party line configurations and tested to determine whether the FPOA's core circuitry is functioning properly. To minimize power consumption, a two-by-two RUT is used. Because a smaller RUT is being used, a simpler addressing schema is used (as compared to
As described with reference to
The following table illustrates four party line configurations along with an indication of how each launch multiplexer's selector would be set. For example, with reference to
As will be apparent from studying the launch multiplexer table, the party line configurations for objects within the two-by-two RUT have been selected so that a result from each object's core is launched onto one of the party lines (either directly or indirectly via a register). Thus, the internal core logic associated with the object is readily observable outside of the RUT. In other words, the final BIST signature would be probative of the functionality of the FPOA's core logic.
The following table illustrates the four party line configurations discussed above along with an indication of how each land multiplexer's selector would be set. For example, multiplexer 470a lands data from the core 465 when the object is in configurations 1 and 2 and multiplexer 470a lands data from party line communication channel 410a when the object is in configurations 3 and 4. By way of another example, multiplexer 475a lands data from party line communication channel 405a when the object is in configurations 1 and 2 and multiplexer 475a lands data from the core 465 when the object is in configurations 3 and 4. When the landing multiplexer lands data from the core 465, the object is essentially re-circulating internal core logic result states. When the landing multiplexer lands data from one of the party line communication channels, pseudo-random data (from the LFSRs in the SOIs) is brought into the object for use by the core 465.
Next, the twenty-by-twenty FPOA is reprogrammed three more times with a second, third, and fourth set of party line configurations and retested each time it is reprogrammed to determine whether the FPOA's party line interface logic is functioning properly.
The following table illustrates launch multiplexer selections for the second set of party line configurations. A two-by-two RUT is again used to test the twenty-by-twenty FPOA. As discussed above, a table downloaded along with the party line configurations defines which of four party line configurations applies to each object within the two-by-two RUT. As the following table illustrates, each party line configuration dictates how each launch multiplexer's selector would be set. For example, multiplexer 430a turns data from party line communication channel 415a when the object is in configuration 1, turns data from party line communication channel 420a when the object is in configuration 2, and passes data from party line communication channel 410a when the object is in configurations 3 and 4. By way of another example, multiplexer 440a passes data from party line communication channel 405a when the object is in configurations 1 and 2, turns data from party line communication channel 415a when the object is in configuration 3, and turns data from party line communication channel 420a when the object is in configuration 4. As previously described, the third group of channels heading north and south do not have party line communication channels heading east and west. Thus, multiplexer 430c passes data from party line communication channel 410c when the object is in configurations 1, 2, 3, and 4. Likewise, multiplexer 440c passes data from party line communication channel 405c when the object is in configurations 1, 2, 3, and 4.
Each land multiplexer's selector is set so that data traveling in a certain direction will land in an appropriate party line register such that it will be launched in the same direction with the next clock cycle. For example, landing multiplexers 470a-c land data from party line communication channels 410a-c into party line registers 450a-c, respectively. Likewise, landing multiplexers 485a-b land data from party line communication channels 415a-b into party line registers 445a-b, respectively.
As will be apparent from the landing multiplexer description and from studying the above launch multiplexer table, the party line configurations for objects within the two-by-two RUT have been selected to test circuitry within an object's communication infrastructure. In other words, the party line configurations have been defined to test as many party line launch multiplexer (e.g., multiplexers 425, 430, 435, and 440 of
The third set of party line configurations are defined to test an entire row of the twenty-by-twenty FPOA at a time. Testing an entire row at once may be useful in testing hop count. To accomplish this, a twenty-by-one RUT is used to test the FPOA. Each object in the RUT has a uniform party line configuration. All party line launch multiplexers (e.g., multiplexers 425a-b, 430a-c, 435a-b, and 440a-c) for each object are configured to retime data. For example, multiplexers 425a-b launch data from party line registers 445a-b, multiplexers 430a-c launch data from party line registers 450a-c, multiplexers 435a-b launch data from party line registers 455a-b, and multiplexers 440a-c launch data from party line registers 460a-c. The north and south party line landing multiplexers are configured to reflect data. For example, the landing multiplexers 470a-c land data from party line communication channels 405a-c into party line registers 450a-c, respectively. Likewise, the landing multiplexers 475a-c land data from party line communication channels 410a-c into party line registers 460a-c, respectively. The east and west party line landing multiplexers are configured such that data traveling in an east-west direction will land in an appropriate party line register so that it will be launched in the same direction with the next clock cycle. For example, landing multiplexers 480a-b land data from party line communication channels 420a-b into party line registers 455a-b, respectively. Likewise, landing multiplexers 485a-b land data from party line communication channels 415a-b into party line registers 445a-b, respectively.
The fourth set of party line configurations are defined to test an entire column of the twenty-by-twenty FPOA at a time. Testing an entire column at once may be useful in testing hop count. To accomplish this, a one-by-twenty RUT is used to test the FPOA. Each object in the RUT has a uniform party line configuration. As previously described with reference to the twenty-by-one RUT, all party line launch multiplexers for each object are configured to retime data. For example, multiplexers 425a-b launch data from party line registers 445a-b, multiplexers 430a-c launch data from party line registers 450a-c, multiplexers 435a-b launch data from party line registers 455a-b, and multiplexers 440a-c launch data from party line registers 460a-c. With the one-by-twenty RUT, the east and west party line landing multiplexers are configured to reflect data. For example, the landing multiplexers 480a-b land data from party line communication channels 415a-b into party line registers 455a-b, respectively. Likewise, the landing multiplexers 485a-b land data from party line communication channels 420a-b into party line registers 445a-b, respectively. The north and south party line landing multiplexers are configured such that data traveling in a north-south direction will land in an appropriate party line register so that it will be launched in the same direction with the next clock cycle. For example, landing multiplexers 470a-c land data from party line communication channels 410a-c into party line registers 450a-c, respectively. Likewise, landing multiplexers 475a-c land data from party line communication channels 405a-c into party line registers 460a-c, respectively.
To summarize, the twenty-by-twenty FPOA will be tested essentially from two perspectives: (1) each object's internal core logic; and (2) each object's party line interface logic. The internal core logic is tested by programming the FPOA with the first set of party line configurations and testing it using the two-by-two RUT. The party line interface logic is tested using three different RUTs. First, the twenty-by-twenty FPOA is reprogrammed with the second set of party line configurations and retested using the two-by-two RUT. Next, the FPOA is reprogrammed with the third set of party line configurations and retested using the twenty-by-one RUT. Finally, the FPOA is reprogrammed with the fourth set of party line configurations and retested using the one-by-twenty RUT. At each iteration, the final BIST signature may be checked to determine whether the FPOA has passed or failed (which may halt further testing of the FPOA). Further, each iteration may operate at different clock speeds (which may be useful in testing maximum operating frequency and hop count). For example, the first iteration covering core logic may be designed to test the maximum operating frequency of the core logic.
The BIST signal block 1130′ may include two registers 1510 and 1520 that help interface the high-speed and low-speed components. According to one embodiment, the registers 1510 and 1520 are addressable. By way of example, the Bndry_Data_IN signal 1135′ may carry data that is used to drive a LFSR scan chain 1152_OUT′, a MISR scan chain 1151_OUT′, and a signature scan chain 1150_OUT′ via a de-multiplexer 1525. As described with reference to
The register 1520 may read the signature scan chain 1150_IN′, the MISR scan chain 1151_IN′, LFSR scan chain 1152_IN′, via a multiplexer 1515. These scan chains originate from the SOIs that surround the array. The register 1520 allows the scan chains and a final signature stored in register 1535 to be accessed by components operating in the low-speed domain (e.g., BIST controller 1125 of
According to one embodiment, a 32-bit LFSR 1530 may be included within the BIST signal block 1130′ to generate pseudo-random data that is shifted into the LFSR scan chain 1152′ and signature scan chain 1150′. The LFSR register 1530 may be the same or different from the LFSR 1232 in
The BIST signal block 1130′ may also include a 48-bit final BIST signature register 1535. According to one embodiment, the final BIST signature register 1535 accumulates and compresses the output from the signature scan chain 1150_IN′ each time a RUT completes testing at a particular location. As previously described, when BIST completes, the JTAG controller 1115 (
$P(x) = x^{2} + x^{47}$.
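By way of illustration only, a 48-bit register that accumulates and compresses a serial bit stream, such as the output of a signature scan chain, can be sketched as follows. The sketch assumes that the polynomial above describes feedback connections at the second and forty-seventh flip-flops of the 48-bit register, and it assumes a particular shift direction and input point; the names `compress` and `bist_passed` are hypothetical.

```python
WIDTH = 48
TAPS = (2, 47)  # flip-flop positions, counted from 1 (assumed mapping)


def compress(bits, state=0):
    """Fold a serial bit stream into a 48-bit signature.

    Assumed convention: flip-flop 1 is bit 0, the register shifts toward
    bit 0, and each incoming bit is XORed with the tapped flip-flops
    before being inserted at the most significant bit.
    """
    for bit in bits:
        feedback = bit & 1
        for tap in TAPS:
            feedback ^= (state >> (tap - 1)) & 1
        state = (state >> 1) | (feedback << (WIDTH - 1))
    return state


def bist_passed(signature, expected):
    """Compare the accumulated signature with the expected value."""
    return signature == expected
```

Because `compress` accepts a starting `state`, the output collected at each RUT location can be folded into the same running signature, and only the final value needs to be compared with the expected result.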
The SOIs may be grouped into two broad categories: (1) north-south SOIs; and (2) east-west SOIs. The north-south SOIs reside at the top and bottom of every column in the array and, according to one embodiment, support three party line communication channels. The east-west SOIs reside on the ends of every row in the array and support, for example, two party line communication channels. Of course, the SOIs may reside in other locations and support fewer or additional party line communication channels. Because the SOI 1600 has three party line communication channels, the SOI 1600 represents a column-based SOI.
According to one embodiment, the SOIs contain LFSRs 1605, MISRs 1610, and a signature register 1615. As previously described, the LFSRs 1605 drive pseudo-random data towards the RUT objects. According to one embodiment, there are 200 LFSRs 1605 located around the perimeter of the object array. The LFSRs 1605 may be daisy chained to form a LFSR scan chain and may be accessed via the LFSR scan chain 1152′ (in
According to one embodiment, the MISRs 1610 are 21-bit registers having, for example, 16 bits of data, 1 valid bit, and 4 control bits. While the MISR registers 1610 may be implemented using any feedback polynomial, one suitable polynomial is the following:
$P(x) = x^{20} + 1$.
According to one embodiment, the signature register 1615 is a 16-bit register. While the signature register 1615 may be implemented using any feedback polynomial, one suitable polynomial is the following:
$P(x) = x^{1} + x^{2} + x^{4} + x^{15}$.
According to one embodiment, the SOI LFSRs 1605 are 21-bit registers (e.g., 16-bits of data, 1 valid bit, and 4 control bits). While the LFSR registers 1605 may be implemented using any feedback polynomial, one suitable polynomial is the following:
$P(x) = x^{20} + 1$.
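By way of illustration only, the 21-bit word format described above for the SOI LFSRs and MISRs (16 bits of data, 1 valid bit, and 4 control bits) can be sketched as a pack/unpack pair. The position of each field within the 21 bits is an assumption made for this sketch, and the function names are hypothetical.

```python
DATA_BITS = 16
VALID_BITS = 1
CONTROL_BITS = 4
WORD_BITS = DATA_BITS + VALID_BITS + CONTROL_BITS  # 21 bits in total


def pack_word(data, valid, control):
    """Pack the three fields into one 21-bit word.

    Assumed layout: data in bits 0-15, valid in bit 16, control in
    bits 17-20.
    """
    if not 0 <= data < (1 << DATA_BITS):
        raise ValueError("data does not fit in 16 bits")
    if valid not in (0, 1):
        raise ValueError("valid must be 0 or 1")
    if not 0 <= control < (1 << CONTROL_BITS):
        raise ValueError("control does not fit in 4 bits")
    return data | (valid << DATA_BITS) | (control << (DATA_BITS + VALID_BITS))


def unpack_word(word):
    """Split a 21-bit word back into (data, valid, control)."""
    data = word & ((1 << DATA_BITS) - 1)
    valid = (word >> DATA_BITS) & 1
    control = (word >> (DATA_BITS + VALID_BITS)) & ((1 << CONTROL_BITS) - 1)
    return data, valid, control
```

A word with every field at its maximum value fills exactly 21 bits, which matches the stated register width.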
While the SOIs are placed around the periphery of the objects according to one embodiment, other configurations are possible. For example, the SOIs may simply be connected in some manner to the objects to enable communication with objects in the non-core region (e.g., periphery region 110 of
According to one embodiment, BIST may be initiated by a driver communicating with the FPOA via the JTAG interface. During BIST, periphery object configurations may change in a random-like manner. This might cause periphery object outputs that drive external FPOA pins to change state (e.g., the bi-directional GPIO pins may toggle direction between input and output). Accordingly, in one embodiment, the FPOA is disconnected from its pins before running BIST. For example, after asserting the self-clearing chip reset, an instruction may be written to a JTAG instruction register to inhibit output pins from changing state while the JTAG boundary scan chain preloads. In addition, the JTAG boundary scan chain may be loaded to a safe state. For example, the FPOA may operate in a state that allows cells that compose the JTAG boundary scan chain to connect directly to FPOA pins. This may allow the outputs to remain in a safe state as BIST testing occurs. In addition, boundary scan cells may force pins to a static bi-directional state.
Next, the JTAG boundary scan chain may be loaded to a safe state so that output pins do not change state and the bi-directional GPIO pins do not change from input to output. After setting the self-clearing JTAG reset bit in the JTAG control register, the BIST sub-system is ready to be configured (e.g., specifying the number of clocks to run in the BIST state; specifying an additional amount of time, in clocks, to continue shifting the LFSR scan chain to account for pipeline latency; and shifting in BIST configuration scan chain content).
After configuring the BIST sub-system, BIST may be initiated by setting a self-clearing BIST enable bit in the BIST control register. While BIST is running, a BIST busy bit in a status control register may be polled to check when BIST is finished. According to one embodiment, BIST may run for approximately 10 msec (milliseconds or one-thousandth of a second) assuming a 20 MHz slow clock.
After BIST completes, the final BIST signature register may be read and compared with an expected result to determine whether the FPOA is functioning properly (e.g., whether it passed or failed). A failure, for example, might be caused by physical flaws or timing flaws within the FPOA. Access to the final BIST signature register may be enabled using a scan chain address control register.
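By way of illustration only, the host-side sequence described above (set the BIST enable bit, poll the BIST busy bit, read the final BIST signature, and compare it with an expected result) can be sketched as follows. The register access methods are hypothetical placeholders, and `FakeBistDevice` is a software stand-in used only to make the sketch runnable; neither reflects an actual JTAG driver interface.

```python
class FakeBistDevice:
    """Software stand-in for register access over JTAG (hypothetical)."""

    def __init__(self, signature, busy_polls=3):
        self._signature = signature
        self._busy_polls = busy_polls
        self._busy_left = 0

    def set_bist_enable(self):
        # Models setting the self-clearing BIST enable bit.
        self._busy_left = self._busy_polls

    def read_busy(self):
        # Models reading the BIST busy bit; stays busy for a few polls.
        if self._busy_left > 0:
            self._busy_left -= 1
            return True
        return False

    def read_final_signature(self):
        return self._signature


def run_bist(device, expected_signature, max_polls=1000):
    """Start BIST, wait for completion, and return True on a pass."""
    device.set_bist_enable()
    for _ in range(max_polls):
        if not device.read_busy():
            break
    else:
        raise TimeoutError("BIST busy bit never cleared")
    return device.read_final_signature() == expected_signature


def slow_clock_cycles(duration_s, clock_hz):
    """Number of slow-clock cycles spanned by a run of the given length."""
    return round(duration_s * clock_hz)
```

For the approximate figures given above, a run of 10 msec at a 20 MHz slow clock spans about 200,000 slow-clock cycles, which may help in choosing a polling limit.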
The methods and systems for testing an integrated circuit may be implemented in and/or by any suitable hardware, software, firmware, or combination thereof. Accordingly, as used herein, a component or module may comprise hardware, software, and/or firmware (e.g., self-contained hardware or software components that interact with a larger system). Embodiments may include various steps, which may be embodied in machine-executable instructions to be executed by an FPOA or other processor. Alternatively, the steps may be performed by hardware components that include specific logic for performing the steps or by a combination of hardware, software, and/or firmware. A result or output from any step, such as a confirmation that the step has or has not been completed or an output value from the step, may be stored, displayed, printed, and/or transmitted over a wired or wireless network. For example, a determination of whether the FPOA is functioning properly (e.g., whether it passed or failed BIST) based on the final BIST signature may be stored, displayed, or transmitted over a network.
Embodiments may also be provided as a computer program product including a machine-readable storage medium having stored thereon instructions (in compressed or uncompressed form) that may be used to program a computer (or other electronic device) to perform processes or methods described herein. The machine-readable storage medium may include, but is not limited to, hard drives, floppy diskettes, optical disks, CD-ROMs, DVDs, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, flash memory, magnetic or optical cards, solid-state memory devices, or other types of media/machine-readable medium suitable for storing electronic instructions. Further, embodiments may also be provided as a computer program product including a machine-readable signal (in compressed or uncompressed form). Examples of machine-readable signals, whether modulated using a carrier or not, include, but are not limited to, signals that a computer system or machine hosting or running a computer program can be configured to access, including signals downloaded through the Internet or other networks. For example, distribution of software may be via CD-ROM or via Internet download.
The terms and descriptions used herein are set forth by way of illustration only and are not meant as limitations. Those skilled in the art will recognize that many variations can be made to the details of the above-described embodiments without departing from the underlying principles of the invention. The scope of the invention should therefore be determined only by the following claims (and their equivalents) in which all terms are to be understood in their broadest reasonable sense unless otherwise indicated.
This application claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Application No. 60/991,695, filed Nov. 30, 2007, which is hereby incorporated by reference in its entirety.
| Number | Date | Country |
|---|---|---|
| 60991695 | Nov 2007 | US |