The present invention generally relates to the field of silicon debug using design-for-debug (DFD) techniques. Specifically, the present invention relates to trace-based silicon debug and trace data compression.
The ever-increasing design complexity of integrated circuits (ICs) and the inherent inaccuracy of circuit models at high abstraction levels significantly challenge the effectiveness of pre-silicon verification techniques, and it is not uncommon that IC products need to go through multiple re-spins to be error-free (see Abramovici (2008)), despite the fact that more than half of the resources are devoted to verification tasks (see SIA (2003)). Consequently, to reduce expensive re-spins and time-to-market, silicon debug (also known as post-silicon validation) cannot be an afterthought and has become an essential step in today's IC design flow.
Since the core under debug (CUD) is a piece of silicon that has already been fabricated, the main challenge in silicon debug is the limited visibility of internal signals. To tackle this problem, dedicated design-for-debug (DFD) circuitry is usually added to the design to improve its observability.
Trace-based debug (see ARM (2013)), which allows designers to observe a set of signals in real time over consecutive cycles without intruding on the circuit's normal operation, is one of the most effective silicon debug techniques and has been widely adopted by the industry (see Leatherman and Stollon (2005) and Liu and Xu (2009)). Specifically, in this technique, a set of “key” signals in the CUD are tapped, and they can be traced after being triggered. The sampled data are then sent to internal trace buffers and/or external trace ports via the trace interconnection fabric (see Livengood and Medeiros (2007)), for later analysis by debug software and physical probing tools to identify the root cause and fix the bug (see Chang et al. (2007), Ko and Nicolici (2008) and Yang and Touba (2009)).
Once a bug is activated, it leaves its erroneous effects in one or more state elements of the circuit at some cycles. The objective of trace-based debug is to observe and localize such errors with as few debug runs as possible. Since it is not possible to trace all internal signals in the circuit, the effectiveness of trace-based debug depends, on the one hand, on the quality of the selected trace signals, which may include both signals picked manually by experienced designers and signals selected via automated solutions guided by visibility-enhancement metrics (see Park et al. (2008), Lai et al. (2009), Vishnoi et al. (2009) and Anis et al. (2007)). On the other hand, even with pre-determined trace signals that can capture a bug, the bug manifests itself only at some specific time, and it is crucial to ensure that the signals are indeed traced at the “right” time.
Clearly, the more trace data we can acquire, the higher the probability of catching a bug's erroneous effects in them and the less time and effort needed to identify the bug. Unfortunately, what can be traced in each debug run is usually quite limited, because trace-based debug involves non-trivial overhead and only a limited trace buffer size and/or a few external pins are available as trace ports.
Because of the above, it is not economical to store the “raw” trace data. In Park and Mitra (2008), Park and Mitra compressed the execution states of a microprocessor into a small number of footprints, taking advantage of the locality of instruction sequences and of the redundant information in monitored data that can be easily identified from the executed instructions. Yang and Touba (2008) and Anis et al. (2007) utilized the data locality of cache accesses and adopted dictionary-based compression to improve the compression ratio.
The above trace compression solutions focused on debugging microprocessors. Several compression techniques have also been presented for signal tracing in general logic circuits to improve their error detection capability, and they can be broadly classified into the following three categories:
Lossless trace compressors, which take advantage of the locality of trace data for lossless compression. In Anis and Nicolici (2007), Anis and Nicolici presented several dictionary-based compressors to trace repeatable data. Based on the observation that the toggling rate of state values is usually low, Prabhakar et al. (2011) proposed to compress the differential data to achieve higher compression quality.
Spatial lossy trace compressors, which compact a set of N signals into M parity signals (N>M) using an XOR network before signal tracing starts (see Mitra et al. (2005)). To reduce routing overhead, such spatial compressors are usually organized as a tree-like structure as part of the trace interconnection fabric.
Temporal lossy trace compressors, which compact a number of cycles (e.g., 1 k) of the raw data into a signature during signal tracing (see Touba (2007) and Yang et al. (2009)) with the help of a multiple-input signature register (MISR), originally used for test response compaction in the VLSI testing domain. As shown in
From the above, it is clear that temporal lossy trace compressors are quite appealing due to their impressive compression ratio. However, the effectiveness of such MISR-based compressors relies on the existence of clean “golden vectors” to generate reference signatures for comparison. This is usually not the case during silicon debug, rendering the lossy compression technique less effective for error detection. This is because: (1) it is often too time-consuming to run gate-level simulation for a failed silicon test, and hence designers often resort to a high-level simulator to generate “golden vectors”, and many unknown (X) bits are obtained when these are mapped onto gate-level vectors; and (2) asynchronous clock domains and uninitialized state elements also result in many X bits in functional patterns.
An objective of the present invention is to provide an effective and efficient X-tolerant temporal lossy trace compressor.
The present invention, as suggested in the paper published by Yuan et al. (2012) at the Design Automation Conference, is an X-tolerant trace data compression scheme that produces compressed known (non-X) signatures for silicon debug. It comprises a MISR-based trace compressor and a non-X signature extraction algorithm, where the MISR-based trace compressor takes any number of trace signals (the signals to be observed for debugging purposes) containing any distribution of X's as inputs and outputs compressed X-contaminated trace data signatures, each bit of which is a linear combination of X bits and non-X information bits in the trace data. The non-X signature extraction algorithm is responsible for performing offline analysis on the X-contaminated trace data signatures and generating non-X signatures that keep a maximized number of non-X information bits.
Given a core under debug and a set of trace signals to be debugged, in the present invention, the MISR-based trace compressor may comprise one or more MISRs. Each MISR is implemented with a different primitive polynomial and is connected to the same set of trace signals. The purpose is to provide redundant trace data signatures for X-tolerance. In one embodiment of the present invention, a reconfiguration capability is implemented in a MISR to enhance the diversity of redundancy. A first reconfiguration may use a primitive polynomial selector to select a desired primitive polynomial. A second reconfiguration may use an input order manipulator to change the positions of the trace signals. Furthermore, a reconfigurable counter may be used to set the number of cycles after which a trace data signature is unloaded. It is worth noting that the above reconfiguration schemes are independent of one another in constructing a trace compressor. The reconfiguration capability is compulsory when a trace compressor is implemented with one MISR, while it is optional when a trace compressor is implemented with two or more MISRs.
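As an illustration only, the following Python sketch tracks symbolically which trace inputs are XORed into each signature bit of a small MISR, so that the X part and the non-X part of every bit can be separated. The function names, the internal-XOR (Galois) MISR structure, the 4-bit width and the tap sets `{0, 1}` and `{0, 3}` (standing for the primitive polynomials x^4+x+1 and x^4+x^3+1) are assumptions made for this sketch and are not taken from the described embodiment.

```python
def misr_symbolic(n_bits, taps, n_cycles):
    """Symbolically run an internal-XOR MISR (assumed structure).

    Each signature bit is returned as a frozenset of (cycle, signal)
    pairs: the trace inputs whose XOR forms that bit.
    """
    state = [frozenset() for _ in range(n_bits)]
    for t in range(n_cycles):
        feedback = state[-1]  # last stage feeds back into the tapped stages
        nxt = []
        for j in range(n_bits):
            term = state[j - 1] if j > 0 else frozenset()
            if j in taps:
                term = term ^ feedback  # symmetric difference == XOR over GF(2)
            nxt.append(term ^ frozenset({(t, j)}))  # XOR in trace signal j
        state = nxt
    return state


def split_x(signature, x_inputs):
    """Split every signature bit into (X part, non-X part)."""
    return [(bit & x_inputs, bit - x_inputs) for bit in signature]


# Two MISRs with different (assumed) polynomials observe the same signals.
sig_a = misr_symbolic(4, {0, 1}, 2)
sig_b = misr_symbolic(4, {0, 3}, 2)

# Suppose signal 3 is unknown (X) in cycle 0.
x_inputs = {(0, 3)}
for x_part, known_part in split_x(sig_a, x_inputs):
    print(sorted(x_part), sorted(known_part))
```

Because the two tap sets spread the same X bit over different signature bits, the two signatures carry different linear combinations of the same trace data, which is the kind of redundancy the paragraph above relies on.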
In another embodiment of the present invention, a non-X information extraction algorithm is used to convert an X-contaminated trace data signature to a non-X signature. Every bit in the X-contaminated trace data signature is a linear combination of X bits and non-X information bits. X bits are cancelled by identifying and XORing feasible combinations of bits in the X-contaminated trace data signature, and such combinations are called X-cancelling schemes. Consequently, each bit in a resulting non-X signature is a linear combination of non-X information bits only, and a bug is detected if a mismatch occurs between the non-X signature and the known bug-free signature. In the present invention, an X-matrix may first be constructed according to the X bit distribution in the X-contaminated trace data signature. An X-cancelling scheme is then a non-zero solution for the X-matrix. The X-matrix may be transformed into a column echelon form (see Cohen (2000)) that has the same solution space. A non-X signature extraction algorithm explores the X-cancelling solution space to maximize the number of kept non-X information bits using an X-cancelling solution transformation method, which generates an initial X-cancelling scheme and transforms one X-cancelling scheme into another.
The foregoing and additional objects, features and advantages of the invention will become more apparent from the following detailed description, which proceeds with reference to the following drawings.
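The idea of finding X-cancelling schemes can be sketched with ordinary Gaussian elimination over GF(2). The sketch below is not the column echelon procedure of the embodiment; it is a generic stand-in under the assumption that the X dependence of signature bit i is given as an integer bitmask `x_rows[i]` (bit k set means X bit k appears in that signature bit). All names and the example data are hypothetical.

```python
def x_cancelling_basis(x_rows):
    """Return a basis of X-cancelling schemes.

    Each scheme is a bitmask over signature bits; XORing the selected
    signature bits together cancels every X bit.
    """
    pivots = {}  # leading X position -> (reduced row, combination that produced it)
    basis = []
    for i, row in enumerate(x_rows):
        combo = 1 << i
        while row:
            lead = row.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = (row, combo)  # new independent X pattern
                break
            prow, pcombo = pivots[lead]
            row ^= prow      # eliminate the leading X bit
            combo ^= pcombo  # remember which signature bits were combined
        else:
            # Row reduced to zero: this combination contains no X bits.
            basis.append(combo)
    return basis


def residual_x(x_rows, scheme):
    """XOR the X parts of all signature bits selected by `scheme`."""
    acc = 0
    for i, row in enumerate(x_rows):
        if scheme >> i & 1:
            acc ^= row
    return acc


# Hypothetical example: S0 = X0, S1 = X1, S2 = X0 ^ X1, S3 has no X.
x_rows = [0b01, 0b10, 0b11, 0b00]
basis = x_cancelling_basis(x_rows)
print([bin(s) for s in basis])
```

In this example the number of independent schemes equals the number of signature bits minus the rank of the X dependence (4 minus 2), and every non-zero XOR of basis schemes is again an X-cancelling scheme.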
The following description is presently contemplated as the best mode of carrying out the present invention. This description is not to be taken in a limiting sense but is made merely for the purpose of describing the principles of the invention. The scope of the invention should be determined by referring to the appended claims.
Starting from the initial X-cancelling scheme, an X-cancelling solution transformation method is then used to explore the X-cancelling solution space and generate new X-cancelling schemes. To guarantee that the obtained solution is still an X-cancelling scheme, the transformation method may obey the following three bit flipping rules: (1) any free bit can be freely flipped to generate a new X-cancelling scheme; (2) to flip a stack bit, all pivot bits whose pivots lie in the columns of the non-zero entries of the stack row associated with the to-be-flipped stack bit must also be flipped. For example, to flip the fifth bit O2 in Sinit, whose corresponding stack row is {1,0,0,1} 760, the first and fourth pivot bits, O7 and O3, need to be flipped. This is because column O2 is equal to a linear combination of the columns corresponding to O7 and O3, i.e., O2=O7⊕O3, and thus the above concurrent flipping operations cancel each other and generate a new X-cancelling scheme. In this case, a new X-cancelling scheme Ssec={0,1,1,0,1,0,0,1} is reached by performing the operation O6=O7⊕O5⊕O4⊕O3⊕(O2⊕O7⊕O3)=O5⊕O4⊕O2; and (3) pivot bits cannot be flipped independently. In addition, new X-cancelling schemes can be acquired by simply changing the targeted bits in the X-contaminated trace data signature.
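One way to read the flipping rules is that every allowed transformation XORs the current scheme with another combination of signature bits whose X parts already cancel, so the result cancels as well. The Python sketch below illustrates only this closure property on a small made-up example; it does not reproduce the pivot, stack and free bit bookkeeping or the matrix of the embodiment, and all names and data are hypothetical.

```python
def residual_x(x_rows, scheme):
    """XOR the X parts (bitmasks) of the signature bits selected by `scheme`."""
    acc = 0
    for i, row in enumerate(x_rows):
        if scheme >> i & 1:
            acc ^= row
    return acc


def is_x_cancelling(x_rows, scheme):
    """A scheme is valid if it is non-zero and leaves no X bit behind."""
    return scheme != 0 and residual_x(x_rows, scheme) == 0


def neighbours(scheme, basis):
    """Schemes reachable in one step by XORing in one basis scheme."""
    return [scheme ^ b for b in basis if scheme ^ b]


# Hypothetical X dependence: S0 = X0, S1 = X1, S2 = X0 ^ X1, S3 = none, S4 = X0.
x_rows = [0b01, 0b10, 0b11, 0b00, 0b01]
# Assumed basis of X-cancelling schemes: S0^S1^S2, S3 alone, S0^S4.
basis = [0b00111, 0b01000, 0b10001]

start = basis[0]
for nxt in neighbours(start, basis):
    print(bin(nxt), is_x_cancelling(x_rows, nxt))
```

Repeating this step from each newly reached scheme walks the whole solution space, which an extraction algorithm could search for the schemes that keep the most non-X information bits.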
This application claims the benefit of Provisional U.S. Patent Appl. Ser. No. 61/654,200, filed Jun. 1, 2012, and incorporated herein by reference.
Number | Date | Country
---|---|---
61654200 | Jun 2012 | US