a illustrates the average signal latency of the least significant bit of an ALU result bus, measured over the lifetime of the microprocessor.
In order to better understand the physical phenomena that cause wearout and why technology scaling has such a dramatic impact on lifetime reliability, we briefly discuss a subset of the wearout mechanisms that plague modern integrated circuit designs (e.g. microprocessor designs). This section presents industry-standard theoretical models for each wearout mechanism and discusses how these mechanisms affect circuit-level timing within the design.
EM is a physical phenomenon that causes the mass transport of metal within semiconductor interconnects. As electrons flow through the interconnect, momentum is exchanged when they collide with metal ions. This pushes metal ions in the direction of electron flow and, at high current densities, results in the formation of voids (regions of metal depletion) and hillocks (regions of metal deposition) in the conductor metal [13].
The model of electromigration that we employ is based on a version of Black's equation found in [5] and is consistent with recent literature [19, 32]:
MTTF_EM ∝ (J − J_crit)^(−n) × e^(E_aEM/(k×T))    (1)

where,
J = current density and J_crit = critical current density
E_aEM = activation energy for electromigration
T = absolute temperature
n = 1.1, material dependent constant
k = Boltzmann's constant
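The following sketch evaluates Black's equation (Equation 1) in Python to compare the relative EM lifetime of a wire at two operating points. It is a minimal illustration only: the activation energy, current densities and temperatures used below are assumed values, not figures from the text.

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann's constant in eV/K

def em_mttf_relative(j, j_crit, temp_k, e_a=0.9, n=1.1):
    """Value proportional to MTTF_EM from Black's equation (Equation 1).

    The proportionality constant cancels when comparing two operating points.
    j, j_crit : current density and critical current density (A/cm^2, illustrative)
    temp_k    : metal temperature in kelvin
    e_a       : activation energy in eV (illustrative assumption)
    n         : material-dependent exponent
    """
    return (j - j_crit) ** (-n) * math.exp(e_a / (BOLTZMANN_EV * temp_k))

# Example: the same wire, 10 K hotter and at 20% higher current density,
# wears out this many times faster (relative MTTF ratio):
baseline = em_mttf_relative(j=1.0e6, j_crit=1.0e5, temp_k=345.0)
stressed = em_mttf_relative(j=1.2e6, j_crit=1.0e5, temp_k=355.0)
print(f"acceleration factor: {baseline / stressed:.2f}x")
```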
Studies have shown that the progression of EM over time can be separated into two distinct phases. During the first phase, sometimes referred to as the incubation period, interconnect characteristics remain relatively unchanged as void formations slowly increase in size. Once a critical void size is achieved, the second phase, the catastrophic failure phase, is entered, characterized by a sharp increase in interconnect resistance [17, 13].
This sharp increase in interconnect resistance can be related to interconnect delay using the Elmore delay equation [15, 4], arguably the most widely used interconnect delay model. This model, referred to below as Equation 2, describes how the delay through an interconnect is related to various parameters, including driver impedance, load capacitance, and geometry. Writing r for the resistance of the interconnect wire, the Elmore delay can be separated into a component κ that incorporates all terms in Equation 2 that are independent of r and a component γ that incorporates all terms that are dependent on r.
Empirical studies focusing on interconnects spanning a wide range of process technologies have observed a sharp rise in resistance once the mass transport of metal begins to inhibit the movement of charge. Coupling this phenomenon with Equation 2, it follows that EM can be modelled as an increasing interconnect delay. Further, as technology scales, smaller wire geometries, coupled with increasing current densities, will dramatically accelerate the effects of EM.
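Because Equation 2 is not reproduced above, the sketch below uses a standard first-order Elmore delay form (driver resistance, lumped wire RC, load capacitance) as an assumed stand-in, to show how an EM-induced rise in wire resistance r translates into increased interconnect delay. All component values are illustrative.

```python
def elmore_delay(r_driver, r_wire, c_wire, c_load):
    """First-order Elmore delay of a driver, one lumped-RC wire and a load.

    Assumed stand-in for Equation 2: the driver resistance charges the whole
    wire and load capacitance, while the wire resistance sees half of its own
    capacitance plus the load.
    """
    return r_driver * (c_wire + c_load) + r_wire * (c_wire / 2.0 + c_load)

# EM modelled as a rise in wire resistance: a 30% resistance increase
# (illustrative) raises the wire delay, everything else unchanged.
nominal = elmore_delay(r_driver=200.0, r_wire=150.0, c_wire=50e-15, c_load=20e-15)
aged    = elmore_delay(r_driver=200.0, r_wire=150.0 * 1.3, c_wire=50e-15, c_load=20e-15)
print(f"delay increase due to EM: {100.0 * (aged / nominal - 1.0):.1f}%")
```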
TDDB, also known as gate oxide breakdown, is caused by the formation of a conductive path through the gate oxide. TDDB exhibits two distinct failure modes, namely soft and hard breakdown [14, 8, 30]. The widely accepted Klein/Solomon model [30] of TDDB characterizes oxide wearout as a multistage event with a prolonged wearout period (trap generation) during which charge traps are formed within the oxide. This is followed by a partial discharge event (soft breakdown) triggered by locally high current densities due to the accumulation of charge traps. Typically, in thinner oxides, a series of multiple soft breakdowns eventually leads to a catastrophic thermal breakdown of the dielectric (hard breakdown).
The rate of failure due to TDDB is dependent on many factors, the most significant being oxide thickness, operating voltage, and temperature. This work uses the empirical model described in [32], which is based on experimental data collected at IBM [37]:

MTTF_TDDB ∝ (1/V)^(a−b×T) × e^((X + Y/T + Z×T)/(k×T))    (3)

where,
V = operating voltage
T = absolute temperature
k = Boltzmann's constant
a, b, X, Y, and Z are all fitting parameters based on [37]
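As a quick illustration of how a model of this form behaves, the sketch below evaluates the relative TDDB lifetime at two operating points. The fitting parameter defaults are placeholders chosen only so the example runs; they are not the calibrated values of [37], and the operating points are likewise assumptions.

```python
import math

BOLTZMANN_EV = 8.617e-5  # eV/K

def tddb_mttf_relative(v, temp_k, a=78.0, b=0.081, x=0.76, y=-67.0, z=-8.4e-4):
    """Value proportional to MTTF_TDDB, following the form of Equation 3.

    MTTF_TDDB ∝ (1/V)^(a - b*T) * exp((X + Y/T + Z*T) / (k*T))
    a, b, x, y, z are fitting parameters; the defaults here are placeholders
    for illustration only, not the calibrated values of [37].
    """
    return (1.0 / v) ** (a - b * temp_k) * math.exp(
        (x + y / temp_k + z * temp_k) / (BOLTZMANN_EV * temp_k)
    )

# Comparing two operating points shows the strong voltage/temperature dependence:
ratio = tddb_mttf_relative(1.2, 345.0) / tddb_mttf_relative(1.3, 360.0)
print(f"relative lifetime at (1.2 V, 345 K) vs (1.3 V, 360 K): {ratio:.1f}x")
```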
Research has shown that TDDB has a detrimental impact on circuit performance. As the gate oxide wears down, the combined effects of increased leakage current and shifting current-voltage curves result in devices with slower response times [8]. Further, the ultra-thin oxides projected in future technology generations will make devices increasingly susceptible to TDDB.
NBTI occurs predominantly in PFET devices when the gate is negatively biased with respect to the source and the drain, leading to an accumulation of positive charge within the gate oxide. The main effect of NBTI is an increase in the threshold voltage of the transistor, slowing down the performance of the gate. The model used in this work, Equation 4, is an empirical model from work at IBM [39], in which k denotes Boltzmann's constant.
NBTI causes failure by shifting the threshold voltage of the device to the point where signal propagation delay exceeds the clock cycle time. Since the shift in threshold voltage due to NBTI is largely a function of temperature, the effects of this wearout mechanism will become more pronounced in the coming technology generations.
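Equation 4 is not reproduced above, so the sketch below instead uses a generic power-law threshold-shift model of NBTI, explicitly not the IBM model of [39], to illustrate the temperature dependence described in the text. All coefficients are hypothetical.

```python
import math

BOLTZMANN_EV = 8.617e-5  # eV/K

def nbti_vth_shift(stress_time_s, temp_k, a0=0.5, e_a=0.15, n=0.16):
    """Illustrative power-law NBTI threshold-voltage shift in volts.

    delta_Vth = A0 * exp(-Ea/kT) * t^n  -- a generic reaction-diffusion-style
    model, NOT the IBM model of [39]; A0, Ea and n are hypothetical values.
    """
    return a0 * math.exp(-e_a / (BOLTZMANN_EV * temp_k)) * stress_time_s ** n

# Higher temperature accelerates the shift, slowing the device sooner:
print(nbti_vth_shift(3.0e8, 345.0))   # roughly ten years of stress at ~72 C
print(nbti_vth_shift(3.0e8, 365.0))   # the same stress, 20 K hotter
```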
Though this is not an exhaustive list of all potential wearout mechanisms, it does illustrate a representative set and the ways in which feature size scaling will affect the reliability of future microprocessors. Most importantly, the physical impact of all these wearout phenomena is increased device delay until ultimate failure. Wearout mechanisms not discussed here, such as hot carrier injection and stress migration, have been shown to be similarly dependent on current density and temperature and are expected to also negatively affect device delay.
This section describes the infrastructure developed to simulate the effects of wearout over time and details the wearout characteristics of an embedded processor (as one example of an integrated circuit). It begins by describing the microprocessor core studied in this work, along with the synthesis flow used for its implementation. This is followed by a description of the approach used to calculate MTTF values for structures within the design. Finally, the model used to correlate wearout with time is presented along with a statistical analysis of the impact of wearout on signal propagation latency.
The testbed used to conduct wearout experiments was a Verilog model of the OpenRISC 1200 (OR1200) CPU core [1]. The OR1200 is an open-source, embedded-style, 32-bit, Harvard architecture that implements the ORBIS32 instruction set. The microprocessor contains a single-issue, 5-stage pipeline, with direct mapped 8 KB instruction and data caches and virtual memory support. This microprocessor core has been used in a number of commercial products and is capable of running the μClinux operating system.
The OR1200 core was synthesized using Synopsys Design Compiler with an Artisan cell library characterized for a 130 nm IBM process with a clock period of 5 ns (200 MHz). Cadence First Encounter was used to conduct floorplanning, cell placement, clock tree synthesis, and routing. This design flow provided accurate timing information (cell and interconnect delays) and circuit parasitics (resistance and capacitance values) for the entire OR1200 core. The floorplan, along with several salient characteristics of the implementation, is shown in
The final layout of the OR1200 includes a guard band of 100 ps slack time and consists of roughly 24,000 logic cells.
In this work, the MTTF values for design elements (logic cells and wires) within the microprocessor core were calculated using the equations modelling EM, TDDB, and NBTI presented above. These MTTF calculations required two parameters, activity and local temperature, for each design element. The activity data was generated by simulating the execution of a benchmark on the core using Synopsys VCS (five benchmarks were chosen for this study to represent a range of computational behaviour for embedded systems: dhrystone, a synthetic integer benchmark; g721 encode and rawcaudio from the MediaBench suite; rc4, an encryption algorithm; and sobel, an image edge detection algorithm). This activity information, along with the parasitic data generated during placement and routing, was then used by Synopsys PrimePower to generate a per-benchmark power trace. The power trace and floorplan were in turn processed by HotSpot [27], a block level temperature analysis tool, to produce a dynamic temperature trace and a steady state temperature (per benchmark) for each structure within the design. A flowchart detailing this process is shown in
Once the per-benchmark activity and temperature data were derived, the MTTF for each wire within the design was calculated using Equation 1 for EM, and the MTTF for each logic cell was calculated using Equations 3 and 4 for TDDB and NBTI, respectively. This computation was repeated for each benchmark. The MTTF values were then normalized to the worst case (minimum) MTTF across all benchmarks, resulting in a relative wearout factor (RWF) for each design element. A per-module MTTF was determined by identifying the minimum MTTF across all design elements within each top-level module of the OR1200 core.
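A minimal sketch of the per-element RWF computation described above. It assumes that the element with the shortest (worst-case) MTTF receives the largest wearout factor, a normalisation direction that is consistent with Equation 5 below but is an assumption on our part; the element names and MTTF values are hypothetical.

```python
def relative_wearout_factors(mttf_by_benchmark):
    """Per-element relative wearout factor (RWF) from per-benchmark MTTFs.

    mttf_by_benchmark: dict mapping element name -> list of MTTF values,
    one per benchmark. Each element is first reduced to its worst-case
    (minimum) MTTF; the shortest-lived element then receives RWF = 1.0 and
    more robust elements receive proportionally smaller factors (assumed
    normalisation direction, chosen for consistency with Equation 5).
    """
    worst_per_element = {e: min(v) for e, v in mttf_by_benchmark.items()}
    global_worst = min(worst_per_element.values())
    return {e: global_worst / m for e, m in worst_per_element.items()}

# Hypothetical MTTFs (arbitrary units) for three design elements over two benchmarks:
rwf = relative_wearout_factors({
    "alu_result_wire":  [7.0, 6.5],
    "lsu_addr_cell":    [9.2, 8.8],
    "ctrl_decode_cell": [15.0, 14.1],
})
print(rwf)   # e.g. {'alu_result_wire': 1.0, 'lsu_addr_cell': 0.74, ...}
```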
As shown above, wearout phenomena have a significant impact on circuit-level timing. In order to simulate this effect, we model wearout as a rise in interconnect delay across wires and an increase in logic cell response time. We correlate these increases in propagation latency to processor age using the widely accepted reliability bathtub curve [26], depicted in
The bathtub curve consists of three distinct regions, the infant period, the grace period, and the breakdown period. The infant period is characterized by a significant but decreasing rate of failures as weak/defective devices fail soon after manufacture. The grace period, characterized by a small but slowly increasing failure rate, constitutes the majority of a device's lifespan, and comes to an end near to the MTTF of the device. At this point, the breakdown period is entered, where the effects of wearout become more prominent. As these effects gain momentum, the failure rate increases dramatically. The WDU proposed herein is used to detect this period and safeguard against failures.
In order to quantify the effects of wearout, a model was derived correlating the age of a microprocessor to the maximum percentage increase in latency experienced by any logic cell or wire. This time-dependent worst-case percentage increase in latency is referred to as the Age Index (AI). In other words, the AI represents the decrease in performance for the most degraded logic cell or wire across the entire processor at a given point in time.
To simulate the effects of wearout using an increase in signal propagation latency, the increase in latency for each logic cell and wire is determined using their respective RWFs and the processor's AI. To simulate the effects of process variation and the fact that some areas of the design are more robust than others, we also apply a Gaussian random variable with a mean of 1 and a standard deviation of 5%. The change in delay due to wearout for each logic cell and wire within the design is then calculated as shown in Equation 5. Note that the RWF of a device/wire and its original delay are both static values while the AI increases over time (see
Δdelay=(original delay)×(RWF)×(AI)×(random variable) (5)
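A short sketch of Equation 5 in Python, applying the Gaussian process-variation factor described above; the delay, RWF and age-index values in the example are illustrative.

```python
import random

def delta_delay(original_delay_ps, rwf, age_index, rng=random.Random(0)):
    """Increase in delay (ps) for one cell or wire, per Equation 5.

    age_index is the worst-case fractional latency increase at the current
    age (e.g. 0.10 for 10%); the Gaussian factor models process variation
    with a mean of 1 and a standard deviation of 5%, as described above.
    """
    variation = rng.gauss(1.0, 0.05)
    return original_delay_ps * rwf * age_index * variation

# A 320 ps path segment with RWF 0.74 when the age index has reached 10%:
print(f"{delta_delay(320.0, 0.74, 0.10):.1f} ps of added delay")
```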
Wearout-dependent delay data for each individual design element was collected and used to model the latency behaviour of entire paths through higher-level architectural structures. Accurate modelling of these path latencies was done with a framework developed for interacting with the Synopsys VCS simulator. Wearout dependent delay information (Δdelay) for each cell and wire was annotated onto the design netlist and custom signal monitoring handlers were registered to measure the propagation delay through design modules. The signal monitors captured this latency information into a database that furnished random samples for the statistical analysis described below.
a plots the average of recorded sample mean latency values for the least significant bit of the ALU result bus (obtained while the processor is running the five benchmarks). The error bars bound the range of observed latencies for this experiment. One may notice that the data suggest little variation in the output latency. However, the lack of variation in this plot arises because the averaging of sample mean latencies acts as a low pass filter. Sample mean values were averaged in this experiment to mimic the hardware based sampling methodology described in the following section.
b shows the distribution of the observed latencies on the least significant bit of the ALU result bus throughout the grace period. Note that the majority of the sample points lie within a tightly bounded region falling rather sharply toward the tails.
In this section, we use the latency trends demonstrated in Section 3 to design a generic, self-calibrating wearout detection unit (WDU) that can be used to monitor a variety of processor structures and predict their likely failure.
An introduction to the trend analysis technique used in the WDU design is presented first. This is followed by details of the design and implementation of the WDU. Next, a brief description of dynamic environmental variations, such as clock jitter and power/temperature fluctuations, is provided, as well as an analysis of how these variations may affect the operation of the WDU. Finally, the details of integrating a WDU into the microprocessor pipeline are discussed.
The area and power overhead of the WDU, its accuracy in detecting wearout, and the increase in processor lifetime that can be achieved by augmenting a design with WDUs and cold spare structures, are discussed following this.
c demonstrates that the output signals from most modules experience a sharp rise in propagation latency as the microprocessor approaches the breakdown period. In order to capitalize on this divergence from the signal propagation latencies observed during the infant and grace periods of the microprocessor's lifetime, TRIX (triple-smoothed exponential moving average) [34] is used; this is a trend analysis technique originally developed to measure momentum in financial markets. TRIX analysis relies on the composition of three calculations of an exponential moving average (EMA) [9]. The EMA is calculated by combining a percentage of the current sample value with an inverse percentage of the previous EMA, causing the weight of older sample values to decay exponentially over time. The calculation of EMA is given as:
EMA = α × sample + (1 − α) × EMA_previous
The use of TRIX rather than the EMA provides two significant benefits. First, TRIX provides an effective filter of noise within the data stream: the composition of three EMA applications smooths out aberrant data points that may be caused by dynamic variation, such as temperature or power fluctuations. Second, the TRIX value tends to provide a better leading indicator of sample trends. The equations for computing the TRIX value are:
EMA1 = α × (sample − EMA1_previous) + EMA1_previous
EMA2 = α × (EMA1 − EMA2_previous) + EMA2_previous
TRIX = α × (EMA2 − TRIX_previous) + TRIX_previous
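A small Python sketch of the EMA/TRIX updates above, tracking a fast "local" and a slow "global" trend over the same latency samples. The α values and latency samples here are illustrative, not the WDU's calibrated weights.

```python
class Trix:
    """Triple-smoothed exponential moving average (TRIX), per the update
    equations above. alpha weights the newest sample in each stage."""

    def __init__(self, alpha, initial):
        self.alpha = alpha
        self.ema1 = self.ema2 = self.trix = float(initial)

    def update(self, sample):
        self.ema1 += self.alpha * (sample - self.ema1)
        self.ema2 += self.alpha * (self.ema1 - self.ema2)
        self.trix += self.alpha * (self.ema2 - self.trix)
        return self.trix

# A fast "local" and a slow "global" TRIX over the same latency samples;
# the alpha values here are illustrative only.
local = Trix(alpha=0.5, initial=1000.0)
globl = Trix(alpha=0.01, initial=1000.0)
for latency_ps in [1000, 1002, 1001, 1080, 1150, 1230]:   # a rising trend
    l, g = local.update(latency_ps), globl.update(latency_ps)
print(f"TRIXl={l:.0f} ps, TRIXg={g:.0f} ps, divergence={(l - g) / g:.1%}")
```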
The TRIX calculation is recursive and parameterized by the weight, α, applied to each new sample. The WDU discussed below computes two TRIX values using different weights and uses the divergence between them to detect trends in the observed signal latency. How TRIX calculations using two such α values can be leveraged to determine the onset of the breakdown period is discussed below.
The WDU discussed herein uses the calculation of two TRIX values which diverge significantly when the microprocessor enters the breakdown period. The first TRIX calculation, TRIXl, is used to track the local latency trend by weighting recent samples heavily. The second TRIX calculation, TRIXg, is used to track the global latency trends, placing significantly more emphasis on the latency history.
A schematic diagram of the WDU is shown in
The purpose of the first stage is to obtain a point estimate of the mean propagation latency for a given output signal. The signal being monitored is tapped off from the functional unit in which it is generated, fed into the first stage of the WDU, and subjected to a series of delay buffers. Each delay buffer in this series feeds one bit in a vector of registers such that the signal arrival time at each register in this vector is monotonically increasing. At the positive edge of the clock, some of these registers will capture the correct value of the module output, while others will store an incorrect value (the previous value on the output line). This situation arises because the addition of delay buffers causes the output signal to arrive after the clock edge for a subset of these registers. The value stored at each of the registers is then compared with a copy of the correct output value. This pair-wise comparison produces a bit vector that represents the propagation delay of the path exercised by the module output being monitored. As the signal latency increases (i.e. as wearout progresses), fewer comparisons will succeed as more and more signals arrive late at their respective registers.
One important consideration in designing Stage 1 of the WDU is the length of the buffer chain used to measure slack time. The amount of delay introduced must be sufficient to cause at least some registers within the WDU to latch incorrect values each time a module output transitions, in order to generate useful delay profiles. Depending on the particular path being exercised, this delay could be substantial. However, as we demonstrate later, the area required by this delay chain (even in the worst case) does not significantly impact the overall area of the WDU.
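A behavioural sketch of Stage 1, assuming a hypothetical buffer delay and chain length: registers later in the tapped delay line miss the clock edge as the monitored signal's latency grows, so the count of failing comparisons provides a coarse latency estimate.

```python
def stage1_sample(arrival_time_ps, clock_period_ps=5000.0,
                  buffer_delay_ps=25.0, chain_length=16):
    """Behavioural sketch of Stage 1 of the WDU.

    The monitored signal reaches register i only if its arrival time plus
    i+1 buffer delays still precedes the clock edge. Registers that miss
    the edge hold the stale previous value and fail the comparison, so the
    number of failing comparisons grows with the signal latency. The buffer
    delay and chain length here are illustrative, not implementation values.
    """
    bit_vector = [arrival_time_ps + (i + 1) * buffer_delay_ps <= clock_period_ps
                  for i in range(chain_length)]
    failing = bit_vector.count(False)
    return bit_vector, failing * buffer_delay_ps   # coarse latency estimate

# As wearout adds delay to the monitored path, more comparisons fail:
for arrival in (4700.0, 4800.0, 4900.0):
    _, estimate = stage1_sample(arrival)
    print(f"arrival {arrival:.0f} ps -> estimated consumed slack about {estimate:.0f} ps")
```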
The propagation latency for a signal is dependent upon 1) the module inputs and 2) the path taken for signal propagation. Therefore, the second stage of the WDU depends upon an initial averaging filter to capture a representative sample of the latency for a given output signal. For this example, 1024 signal transition latencies are accumulated in Stage 1 before the sample value is passed on to Stage 2.
Next, TRIXl and TRIXg are calculated using α values of ½ and (½)^12, respectively. It is important to note that the value of α is dependent on the sample rate and sample period. Herein it is assumed that a sample rate of three to five samples per day is used over an expected 30-year lifetime. Also, the long incubation periods for many of the common wearout mechanisms require that the computed TRIX values are routinely saved into a small area of non-volatile storage, such as flash memory.
Since the three TRIX update calculations are identical in form, the impact of Stage 2 on both area and power can be minimized by spreading the calculation of the TRIX values over multiple cycles and synthesizing only a single instance of the TRIX calculation hardware.
The final stage of the WDU receives the TRIXl and TRIXg values from the previous stage and is responsible for predicting wearout if the difference between these two values exceeds a given threshold. The simulations conducted indicate that a 10% difference between TRIXl and TRIXg is almost universally indicative of the microprocessor entering the breakdown period and can therefore be used as the threshold for triggering a wearout response. Computing a 10% difference exactly in hardware would be costly, so the percentage is instead approximated using shift operations: a right shift by 4 yields 6.25% of a value and a right shift by 5 yields 3.125%; adding the two gives 9.375%, which is a sufficiently close estimate of 10%.
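The sketch below shows the shift-and-add approximation used by this final stage; the integer latency values in the example are illustrative.

```python
def exceeds_ten_percent(trix_local, trix_global):
    """Stage 3 threshold test using the shift-based approximation of 10%.

    (x >> 4) + (x >> 5) = x/16 + x/32 = 9.375% of x, which stands in for
    10% without a hardware multiplier or divider. Inputs are assumed to be
    latency values already scaled to integers (e.g. picoseconds).
    """
    threshold = (trix_global >> 4) + (trix_global >> 5)   # about 9.375% of TRIXg
    return trix_local - trix_global > threshold

print(exceeds_ten_percent(1120, 1000))   # True: 12% above the global trend
print(exceeds_ten_percent(1050, 1000))   # False: only 5% above
```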
Dynamic environmental variations such as temperature spikes, power surges, and clock jitter can each have an impact on circuit-level timing, potentially affecting the operation of the WDU. Below are discussed some of the sources of dynamic variation and their impact on the WDU's efficacy.
Temperature is a well known factor in calculating device delay, where higher temperatures typically increase the response time for logic cells.
Another source of variation is clock jitter. In general, there are three types of jitter: absolute jitter, period jitter, and cycle-to-cycle jitter. Of these, cycle-to-cycle jitter is the only form of jitter that may potentially affect the WDU. Cycle-to-cycle jitter is defined as the difference in length between any two adjacent clock periods and may be either positive (cycle 2 longer than cycle 1) or negative (cycle 2 shorter than cycle 1). Statistically, jitter measurements exhibit a random distribution with a mean value approaching 0 [38].
In general, the sampling techniques employed by the WDU should be sufficient to smooth out the effects of dynamic variation described here. For example, a conservative, linear scaling of temperature effects on the single inverter delay to a 4.4% increase in module output delay does not present a sufficient magnitude of variance to overcome the 10% threshold required for the WDU to predict failure. Also, because the expected variation due to both clock jitter and temperature will exhibit a mean value of 0 (i.e. temperature is expected to fluctuate both above and below the mean value), statistical sampling of latency values should minimize the impact of these variations. To further this point, since the TRIX calculation acts as a three-phase low-pass filter, the worst case dynamic variations would need to cause latency samples to exceed the stored TRIXg value by more than 10% over the course of more than 12 successive sample periods, corresponding to over four days of operation.
The above discussed the operation of the WDU in isolation as it monitored a single module output for an increase in signal latency. The section below discusses the necessary hardware for monitoring multiple output signals, and how the WDU can be integrated into a microprocessor to facilitate the swapping of cold spare hardware structures.
In order to monitor multiple output signals from a module, modest hardware modifications are necessary. First, a round robin arbiter is needed to systematically cycle through the output signals from the module. This can be done with a multiplexer controlled by a wrap-around counter sized according to the number of signals being monitored. The counter is incremented each time Stage 2 of the WDU updates the TRIXl value (1024 transition events on a single output). The counter can also serve as the read/write address for a small cache which stores the TRIXl and TRIXg values associated with each output. Once the WDU has been supplemented with this hardware, it may be used to monitor multiple output signals, significantly increasing its efficacy, since observing sharp increases in latency on a single output signal is sufficient to conclude that the structure as a whole is likely to fail. Multiple signals within a functional unit may be monitored; this behaviour is analyzed below.
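A minimal behavioural sketch of this multi-signal arrangement, with the wrap-around counter serving both as the multiplexer select and as the index into the per-output trend table. A single EMA per trend stands in here for the full three-stage TRIX of Stage 2, and the α values are illustrative.

```python
class MultiSignalWdu:
    """Sketch of one WDU time-multiplexed across several module outputs.

    A wrap-around counter selects the output currently being sampled and
    also indexes the small table that stores that signal's trend state
    (mirroring the per-output cache described above). After 1024
    accumulated transitions, the mean latency is folded into the selected
    signal's fast/slow trends and the counter advances to the next output.
    """

    def __init__(self, num_signals, alpha_local=0.5, alpha_global=0.01):
        self.trends = [None] * num_signals        # (local, global) per output
        self.alpha_l, self.alpha_g = alpha_local, alpha_global
        self.select = 0                           # wrap-around counter / mux select
        self.acc, self.count = 0.0, 0

    def observe_transition(self, latency_ps):
        """Feed one transition latency of the currently selected output."""
        self.acc += latency_ps
        self.count += 1
        if self.count < 1024:
            return None
        sample, signal = self.acc / self.count, self.select
        local, globl = self.trends[signal] or (sample, sample)
        local += self.alpha_l * (sample - local)
        globl += self.alpha_g * (sample - globl)
        self.trends[signal] = (local, globl)
        self.select = (self.select + 1) % len(self.trends)    # round robin
        self.acc, self.count = 0.0, 0
        wearout_suspected = local - globl > 0.1 * globl
        return signal, wearout_suspected

# Usage: feed transition latencies; every 1024th call reports one output's status.
wdu = MultiSignalWdu(num_signals=8)
for _ in range(1024):
    status = wdu.observe_transition(1150.0)
print(status)   # (0, False) for this steady stream of samples
```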
Given that any design augmented with a WDU has a reliable means of detecting when individual modules are worn out (ALU, LOAD/STORE, etc), the use of cold spares can be employed to extend a system's operating life. An efficient approach to enhancing reliability with minimal overhead would be to analytically determine the structures most likely to fail and only place WDUs at the outputs of the most susceptible structures. As the modules age, the WDU could indicate when to swap in a cold spare device in order to avoid catastrophic failure. The section below evaluates the potential gain in processor lifetime as a function of the area overhead for adding these devices.
Below are discussed area and power consumption statistics for two implementations of the WDU. In addition, the ability of the WDU to detect the onset of the breakdown period is evaluated. Lastly, a cost-benefit analysis for augmenting the OR1200 core with multiple WDUs and cold spare structures is presented.
Table 1 displays the area and power consumption numbers for two WDU designs. The first implementation is a WDU designed to monitor only a single output signal, while the second implementation is designed to monitor up to eight different output signals for a given module (the justification for monitoring only a small number of signals per module is discussed later in this section). This table shows that a typical WDU consumes only about 0.05 mm² (excluding the non-volatile storage) and that adding a single WDU to monitor up to eight output signals increases the overall CPU area by only about 4.45%. The power consumption for the WDU is estimated by Synopsys Design Compiler to be 8.02 mW, compared to an estimate of about 92.22 mW for the entire OR1200 core. One should note that although the power consumption of the WDU is appreciable, it amounts to negligible energy consumption because of its infrequent use (about four times in a usage day).
In order to assess the merits of the prediction scheme, a WDU monitoring three different structures within the OR1200 core across the five embedded benchmarks was simulated.
Similarly, 66% of the signals on the ALU are flagged about 0.5 years early. In general, 100% of the signals for all modules were marked as entering the wearout period within 0.33 years of the beginning of the breakdown period. Since the WDU attached to each of the modules was able to identify more than 75% of the signals as entering the breakdown period within 0.25 years of the 30-year AI, it is clear that the WDU need not monitor all output signals for each module.
Issues in technology scaling and process variation have raised concerns for reliability in future microprocessor generations. Recent research has attempted to diagnose and, in some cases, reconfigure the processing core to increase operational lifetime. This related work is discussed below.
As mentioned above, much of the research into failure detection relies upon redundancy, either in time or space. One such example of hardware redundancy is DIVA [6], which targets soft error detection and online correction. It strives to provide a low-cost alternative to the full scale replication employed by traditional techniques like triple-modular redundancy. The system utilizes a simple in-order core to monitor the execution of a large high performance superscalar processor. The smaller checker core recomputes instructions before they commit and initiates a pipeline flush within the main processor whenever it detects an incorrect computation. Although this technique proves useful in certain contexts, the second microprocessor requires significant design/verification effort to build and incurs additional area overhead.
Bower et al. [12] extend the DIVA work by presenting a method for detecting and diagnosing hard failures using a DIVA checker. The proposed technique relies on maintaining counters for major architectural structures in the main microprocessor and associating every instance of incorrect execution detected by the DIVA checker with a particular structure. When the number of faults attributed to a particular unit exceeds a predefined threshold, it is deemed faulty and decommissioned. The system is then reconfigured and, in the presence of cold spares, can extend the useful life of the processor. Related work by Shivakumar et al. [25] argues that even without additional spares the existing redundancy within modern processors can be exploited to tolerate defects and increase yield through reconfiguration.
Research by Vijaykumar [16, 35] at Purdue, and similar work by Falsafi [20, 28], attempts to exploit the redundant, and often idle, resources of a high end superscalar processor to enhance reliability by utilizing these extra units to verify computations during periods of low resource demand. This technique represents an example of the time-redundant computation alluded to in Section 1. It builds upon work at NCSU by the Slipstream group [24, 21] on simultaneous redundant multithreading, as well as earlier work on instruction reuse [29]. ReStore [36] is yet another variation on this theme, which couples time redundancy with symptom detection to manage the adverse effects of redundant computation by triggering replication only when the probability of an error is high.
Srinivasan et al. have also been very active in promoting the need for robust designs that can withstand the wide variety of reliability challenges on the horizon [33]. Their work attempts to accurately model the MTTF of a device over its operating lifetime, facilitating the intelligent application of techniques like dynamic voltage and/or frequency scaling to meet reliability goals. Although some physical models are shared in common, the focus of the present technique is not to guarantee that designs can achieve any particular reliability goal, but rather to enable a design to recognize behaviour that is symptomatic of wearout-induced breakdown, allowing it to react accordingly.
Analyzing circuit timing in order to self-tune processor clock frequencies and voltages is a well studied area. Kehl [18] discusses a technique for re-timing circuits based on the amount of cycle-to-cycle slack existing on worst-case latency paths. The technique presented requires offline testing involving a set of stored test vectors in order to tune the clock frequency. Although the proposed circuit design is similar in nature to the WDU, it only examines the small period of time preceding a clock edge and is only concerned with worst case timing estimation, whereas the WDU employs sampling over a larger time span in order to conduct average case timing analysis. Similarly, Razor [7] is a technique for detecting timing violations using time-delayed redundant latches to determine if operating voltages can be safely lowered. Again, this work studies only worst-case latencies for signals arriving very close to the clock edge.
In the above there is described an online wearout detection unit (WDU) that predicts the failure of architectural structures within microprocessor cores. This unit uses the symptoms of wearout, in the form of signal latency information, to predict imminent failure. To investigate the design of the WDU, accelerated wearout experiments were presented above on the OpenRISC 1200 embedded microprocessor core, which was synthesized and routed using industry standard CAD tools. Further, accurate models for TDDB, EM and NBTI were used to model wearout-related failures and determine the MTTFs for devices within the design. The results of these accelerated wearout experiments showed that most signals experience a sharply increasing latency when the breakdown period is entered; this recognition contributed to the design of the WDU. To enable the WDU to work in the presence of temperature variability, clock jitter and other environmental noise, it uses statistical analysis hardware.
The WDU accurately detects and diagnoses wearout with a small area footprint: 4.45% of the OR1200 die area. The WDU was able to successfully detect the trends of increasing latency across multiple output signals for each module of the OpenRISC 1200 that was examined. These modules were then flagged as ailing before the point of failure. The achievable increase in the overall MTTF from incorporating WDUs and cold spare structures into the design is also described. With an increase of 16.2% in area, the MTTF increases by nearly 50%. A more substantial MTTF increase of approximately 150% can be obtained with a 65% increase in area.
The above description has included a discussion of
When the time for monitoring is reached, processing proceeds to step 102, at which the first signal to be monitored is selected. The wearout detection unit in this example can have the form illustrated in
At step 104 the latency associated with the signal transition being monitored is sampled over multiple transitions and then at step 106 the short term and long term average latency values are updated. Step 108 then determines whether there has been a change in either of these short term or long term average latency values which is indicative of imminent wearout within the circuitry (functional circuit) associated with the signal being monitored. If there has been such a change, then step 110 triggers a wearout response matched to the functional circuit concerned. If there has been no such change, then step 110 is bypassed.
Step 112 determines whether there are any more signals to be monitored in the current monitoring cycle. If there are such further signals then the next of these is selected at step 114 and processing is returned to step 104. If there are no further signals then processing terminates.
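A compact sketch of this monitoring cycle (steps 102 to 114); the latency-sampling, trend-update and response functions are hypothetical stand-ins for the hardware stages described above.

```python
def monitoring_cycle(signals, sample_latency, update_trends, trigger_response):
    """One pass of the monitoring cycle described above (steps 102-114).

    signals          : iterable of signal identifiers to monitor
    sample_latency   : callable returning a mean latency over many transitions (step 104)
    update_trends    : callable returning (short_term, long_term) averages (step 106)
    trigger_response : callable invoked when imminent wearout is indicated (step 110)
    These callables are hypothetical stand-ins for the hardware stages.
    """
    for signal in signals:                                       # steps 102 / 112 / 114
        sample = sample_latency(signal)                          # step 104
        short_term, long_term = update_trends(signal, sample)    # step 106
        if short_term - long_term > 0.1 * long_term:             # step 108
            trigger_response(signal)                             # step 110
```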
Wearout response 5 relates to multiprocessor systems, in which wearout detected within one of the processors, or within one of the functional circuits of one of the processors, can be used to influence the task allocation performed by the operating system controlling the multiprocessor system. As an example, if the wearout detection unit detects that the integer arithmetic unit or floating point unit within a particular processor is showing signs of imminent wearout, then the operating system can allocate tasks known to make intensive use of the integer arithmetic unit or floating point unit to others of the multiple processors, so as not to force the processor subject to imminent wearout into actually exhibiting failure. This can extend the working life of the integrated circuit in a useful way. The operating system could allocate tasks to the imminently failing circuits when necessary at times of highest peak performance demand, but could otherwise allocate the tasks elsewhere so as to preserve the useful life of the functional circuit potentially subject to imminent wearout.
Although illustrative embodiments of the invention have been described in detail herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various changes and modifications can be effected therein by one skilled in the art without departing from the scope and spirit of the invention as defined by the appended claims.
Number | Date | Country | Kind
---|---|---|---
0702096.9 | Feb 2007 | GB | national

Number | Date | Country
---|---|---
60836400 | Aug 2006 | US