The present disclosure generally relates to processing, and more particularly relates to runtime processor integrity testing.
Ionizing radiation, power supply voltage glitches, manufacturing defects, and numerous other effects can corrupt circuitry by changing the electrical state of a circuit element. Some systems guard against single event effects by implementing voting systems having redundant circuitry. For example, a system may utilize three separate processors and a voting system to determine a valid output. In other words, if at least two of the three processors output the same value, the output is assumed to be valid. However, voting systems are expensive, as at least three times the hardware is required to implement the system. Other systems may continuously check a configuration of a circuit by scrubbing the circuit for configuration errors. However, these types of systems can still allow a corrupted circuit to output data until the configuration error is detected.
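As a rough illustration of the two-out-of-three voting scheme described above, the following Python sketch models the voter in software. The function name `majority_vote` and the use of `None` to signal that no two units agree are assumptions made for illustration only; they do not come from any particular voting system.

```python
from collections import Counter


def majority_vote(a, b, c):
    """Return the value output by at least two of three redundant units.

    Returns None when all three outputs disagree (hypothetical convention).
    """
    # most_common(1) yields the single most frequent value and its count.
    value, count = Counter([a, b, c]).most_common(1)[0]
    return value if count >= 2 else None
```

The sketch makes the cost argument concrete: three complete results must be computed before a single output can be accepted.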
In one embodiment, for example, a processor is provided. The processor may include, but is not limited to, a logic pipeline configured to process input data and a built-in configuration error detector for detecting a change to a configuration of the logic pipeline. The built-in configuration error detector may include, but is not limited to, a pipeline status indicator configured to determine when the logic pipeline is idle, a test vector source coupled to the pipeline status indicator and selectively coupled to an input of the logic pipeline, the test vector source storing at least one test vector, the test vector source configured to transmit the at least one test vector to the logic pipeline when the pipeline status indicator determines that the logic pipeline is idle, and a validator coupled to an output of the logic pipeline, the validator configured to compare an output of the logic pipeline in response to the test vector to a predetermined data set, the validator configured to allow the processor to output data when the output of the logic pipeline in response to the test vector matches the predetermined data set and to block the processor from outputting data when the output of the logic pipeline in response to the test vector does not match the predetermined data set.
In another embodiment, a built-in configuration error detector for a logic pipeline in a processor is provided. The built-in configuration error detector may include, but is not limited to, a pipeline status indicator configured to determine when the logic pipeline is idle, a test vector source coupled to the pipeline status indicator and selectively coupled to an input of the logic pipeline, the test vector source storing at least one test vector, the test vector source configured to transmit the at least one test vector to the logic pipeline when the pipeline status indicator determines that the logic pipeline is idle, and a validator coupled to an output of the logic pipeline, the validator configured to compare an output of the logic pipeline in response to the test vector to a predetermined data set, the validator configured to allow the processor to output data when the output of the logic pipeline in response to the test vector matches the predetermined data set and to block the processor from outputting data when the output of the logic pipeline in response to the test vector does not match the predetermined data set.
In yet another embodiment, a method for validating a configuration of a logic pipeline of a processor is provided. The method may include, but is not limited to, detecting, by a pipeline status identifier, when the logic pipeline is idle, inserting, by a test vector source, a test vector into the logic pipeline, comparing, by a validator, an output of the logic pipeline in response to the test vector to an expected data set, and opening, by the validator, a switch coupled between the logic pipeline and an output of the processor when the output of the logic pipeline in response to the test vector does not match the expected data set.
The detailed description will hereinafter be described in conjunction with the following drawing figures, wherein like numerals denote like elements, and wherein:
The following detailed description is merely exemplary in nature and is not intended to limit the invention or the application and uses of the invention. As used herein, the word “exemplary” means “serving as an example, instance, or illustration.” Thus, any embodiment described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments. All of the embodiments described herein are exemplary embodiments provided to enable persons skilled in the art to make or use the invention and not to limit the scope of the invention which is defined by the claims. Furthermore, there is no intention to be bound by any expressed or implied theory presented in the preceding technical field, background, brief summary, or the following detailed description.
In accordance with one embodiment, a processor having a built-in configuration error detector is provided. The built-in configuration error detector unobtrusively tests a configuration of a processor before and after normal operation of the processor. As discussed in further detail below, the built-in configuration error detector inserts test vectors during idle periods in the operation of a logic pipeline to test the configuration of the logic pipeline. Furthermore, the built-in configuration error detector is a hardware-based tester internal to the processor and, thus, is invisible to any external software application and does not require any external or software-based interrupts to operate.
Unlike conventional central processing units (CPUs) with many short combinatorial logic paths, the logic pipeline 110 is a relatively long logic path through the processor 100. When the processor 100 is an ASIC or the like, the logic of the logic pipeline 110 is fixed upon manufacturing. When the processor 100 is programmable, such as an SRAM FPGA or the like, the logic of the logic pipeline 110 is fixed upon configuration of the part. The logic pipeline 110 receives data from the processor 100 and/or a data source external to the processor 100 and performs a logical data operation on the received data. The logic pipeline 110 could have a myriad of uses, as the processor 100 can be arranged to implement any logic circuit. Some possible logic operations performed by the logic pipeline 110 include, but are not limited to, motor control in satellites, video processing in medical imaging systems, software defined radio transceivers, and fast Fourier transform logic in filtering applications. However, a wide variety of other logic implementations could be implemented in the logic pipeline 110.
The processor 100 further includes a built-in configuration error detector 120. The built-in configuration error detector 120 verifies that the configuration of the logic pipeline 110 of the processor 100 has not been changed due to ionizing radiation or other sources of interference. As discussed above, ionizing radiation, power supply voltage glitches, manufacturing defects, and numerous other effects can corrupt circuitry by changing the electrical state of one or more elements in the logic pipeline 110 of the processor 100. The errors can be catastrophic, causing the failure of the processor 100 or other components coupled to the processor 100. For example, if the processor 100 is being used to calculate current commands of a motor in a pump, an error introduced by ionizing radiation, a voltage glitch, or a manufacturing defect could cause the processor 100 to output a control signal (e.g., a motor winding current command) which could stress or even break the motor, the motor drive electronics, the pump, or the fluid moving through the pump. In pumping applications such as a pacemaker, a petroleum pipeline, or an aircraft engine, this disruption can lead to loss of life or significant financial loss.
As discussed in further detail below, the built-in configuration error detector 120 is a hardware system internal to the processor 100. Accordingly, the built-in configuration error detector 120 allows the processor 100 to test itself for configuration errors. The construction of the built-in configuration error detector 120 may vary depending upon what type of processor 100 is being used. In embodiments where the processor 100 is an SRAM FPGA, for example, the components of the built-in configuration error detector 120 may be constructed from FPGA logic within the processor 100.
The test vector source 210 includes memory to store one or more test vectors, the anticipated result from a properly functioning pipeline, the maximum allowable time-to-execute for the input test stimulus, and control logic to initiate the built-in test. When the processor 100 is an SRAM-based FPGA, for example, the memory and control logic are constructed from the FPGA logic. The test vectors stored in the test vector source 210 are predetermined bit strings which, when passed through the logic pipeline 110, should produce the anticipated results. As discussed in further detail below, the output of the logic pipeline 110 in response to the test vectors is used to verify the integrity of the logic pipeline 110. The test vector(s) may include a test vector flag. The test vector flag indicates that the data with the test vector flag is test data, rather than application data for normal operation of the logic pipeline 110. The logic pipeline 110, as discussed above, may have various uses. However, in some embodiments, such as motor control, an output of the logic pipeline 110 may be based upon data from a previous cycle. In these instances, the test vector flag indicates to the logic pipeline 110 to not overwrite any data saved in registers or other memory used in the logic pipeline 110, thereby preserving the correct data in the logic pipeline 110 for a subsequent data pass through the logic pipeline 110.
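The role of the test vector flag in a stateful pipeline can be sketched with a small behavioral model in Python. Everything named here (`Word`, `AccumulatingPipeline`, the add-the-previous-input operation) is hypothetical and chosen only to show the idea. The sketch additionally assumes that flagged words are evaluated against a fixed seed of zero, so that the anticipated result does not depend on application history; the description above does not specify this detail.

```python
from dataclasses import dataclass


@dataclass
class Word:
    value: int
    is_test: bool = False  # models the test vector flag


class AccumulatingPipeline:
    """Hypothetical stateful pipeline: output = input + previous input."""

    def __init__(self):
        self.previous = 0  # state carried over from the prior application cycle

    def process(self, word):
        # Assumption: test words use a fixed seed (0) instead of saved state.
        seed = 0 if word.is_test else self.previous
        result = word.value + seed
        # Flagged test data must not overwrite the saved application state.
        if not word.is_test:
            self.previous = word.value
        # The flag travels with the data so a downstream checker can see it.
        return Word(result, word.is_test)
```

In this model, a test word inserted between two application words leaves `previous` untouched, so the next application word is processed exactly as if no test had run.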
In one embodiment, for example, the memory of the test vector source 210 may be implemented in an application SRAM (otherwise known as block random access memory (RAM) in FPGA applications). The application SRAM may be subject to error monitoring to ensure the stability of the test vectors in the test vector source 210. In one embodiment, for example, one or more of the test vectors stored in the memory may include an error correcting code (ECC). The ECC is used by the test vector source 210 to verify the integrity of the test vectors.
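The description does not say which error correcting code is used. Purely as an illustration of how an ECC lets a stored test vector be checked and repaired, the following Python sketch uses the classic Hamming(7,4) code on a 4-bit value. The function names and the list-of-bits representation are assumptions made for this example.

```python
def hamming74_encode(nibble):
    """Encode a 4-bit value as 7 bits laid out as p1 p2 d0 p4 d1 d2 d3."""
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # covers codeword positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # covers codeword positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d[0], p4, d[1], d[2], d[3]]


def hamming74_decode(code):
    """Return (nibble, syndrome); a nonzero syndrome is the corrected position."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 | (s2 << 1) | (s4 << 2)
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the single bit the syndrome points at
    nibble = c[2] | (c[4] << 1) | (c[5] << 2) | (c[6] << 3)
    return nibble, syndrome
```

A syndrome of zero means the stored vector is intact. A nonzero syndrome identifies a single upset bit, which is corrected before the vector is used. This code cannot reliably handle two simultaneous bit errors, so a practical design would likely choose a stronger code.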
When the test vector source 210 receives an idle indication from logic pipeline 110 via the pipeline status identifier 200, the test vector source 210 initiates the built-in test by sending one or more test vectors to the logic pipeline 110. In the embodiment illustrated in
The processor 100 further includes a validator 240 coupled to the output of the logic pipeline 110. In one embodiment, for example, the validator 240 may include a memory, a comparator, and a switch. When the processor 100 is an SRAM-based FPGA, for example, the memory, comparator, and switch may be constructed from the FPGA logic. The memory of the validator 240 stores the expected output of the logic pipeline 110 in response to the test vector(s). The memory of the validator 240, like the memory of the test vector source 210, is subject to independent monitoring via an error correcting code (ECC) in a similar manner as discussed above.
The validator 240, upon receipt of data having the test vector flag, compares the data output from the logic pipeline to the expected result stored in the memory of the validator 240. When the data output from the logic pipeline 110 does not match the expected data, the validator 240 may open its internal switch, cutting off the logic pipeline 110 from the output of the processor 100. Accordingly, rather than outputting possibly corrupted data, the validator 240 causes the processor 100 to output nothing at all. In other words, the validator 240 opening its internal switch upon detecting a difference via the comparator causes the processor 100 to have a fail passive output. In certain embodiments, such as motor control, having a fail passive output is vital. For example, if the logic pipeline 110 is configured to control a motor current, outputting a corrupted current command could damage the motor, the motor drive electronics, or the end user application as discussed above. Accordingly, by outputting no data in the event the logic pipeline becomes misconfigured, the processor 100 maintains a fail passive output.
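A minimal behavioral sketch of the comparator and switch, in Python, may help make the fail passive behavior concrete. The class and method names are hypothetical, and `None` stands in for "no output at all". The sketch also re-closes the switch when a later test matches, which is an assumption at this point in the description.

```python
class Validator:
    """Hypothetical software model of a comparator plus an output switch."""

    def __init__(self, expected):
        self.expected = expected   # expected pipeline response to the test vector
        self.switch_closed = True  # closed switch: pipeline drives the output

    def check(self, test_output):
        # Comparator: a mismatch opens the switch, a match closes it.
        self.switch_closed = (test_output == self.expected)
        return self.switch_closed

    def forward(self, data):
        # Fail passive: with the switch open, nothing reaches the output.
        return data if self.switch_closed else None
```

For example, after `check()` sees a result that differs from the expected value, `forward()` returns `None` for every application word until a later test matches again.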
When the data output from the logic pipeline 110 matches the expected data, the validator 240 may close its internal switch, if previously open, allowing the logic pipeline 110 to output data from the processor 100. In one embodiment, for example, the memory of the validator 240 may store one or more data sets previously processed by the logic pipeline 110 during the logic pipeline's normal operation. In this embodiment, when the data output from the logic pipeline 110 after a test matches the expected data, the validator 240 may output one or more previously processed data sets from the memory of the validator 240. The validator 240 may then open the switch and save any subsequent data output from the logic pipeline 110 until the next configuration test passes. Accordingly, in this embodiment all data output from the logic pipeline 110 can be assumed to be free of any possible FPGA configuration errors caused by radiation or other error sources, as a subsequent pass of data from the test vector source 210 contained no errors. Furthermore, the configuration test completed previous to a current configuration test serves to verify that the configuration of the logic pipeline 110 was correct before the normal application data was processed. Accordingly, the built-in configuration error detector 120 is capable of verifying the configuration of the logic pipeline 110 before and after processing application data, thereby ensuring that valid data is output by the processor 100 without the use of a voting system and without relying upon slow-rate scrubbing of the processor 100.
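The hold-and-release embodiment can be sketched as follows in Python. The names are hypothetical, and the choice to discard held data when a test fails is an assumption: the description says only that data is saved until the next test passes and does not state what happens to held data after a failed test.

```python
class BufferingValidator:
    """Hypothetical model: hold application outputs until the next test passes."""

    def __init__(self, expected):
        self.expected = expected
        self.pending = []  # outputs produced since the last configuration test

    def on_application_output(self, data):
        # The switch is open during normal operation: save rather than emit.
        self.pending.append(data)

    def on_test_output(self, test_output):
        held, self.pending = self.pending, []
        # Release held data only when the test after it passed.
        # Assumption: held data is discarded on a failed test.
        return held if test_output == self.expected else []
```

Because every released data set is bracketed by a passing test before it and a passing test after it, the released data was processed while the configuration was verified at both ends.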
When the validator 240 detects a configuration error, the validator 240 may send a signal to a processor manager 250. In one embodiment, for example, the processor manager 250 may output an error flag to one or more other systems or processors to indicate that an error has occurred in the logic pipeline 110. The result of the error flag may depend upon the system.
When the processor 100 is an ASIC or any other non-reprogrammable processor, the error flag may signal that the part needs to be replaced. In instances where the processor 100 is reprogrammable, such as when the processor 100 is an SRAM-type FPGA, the error flag can be used to initiate a reprogramming of the processor 100 to correct the configuration error. In some embodiments, for example, the error flag could initiate a reboot or a power cycle of the processor 100.
In the embodiment illustrated in
When a response to the error detection includes an attempt to reconfigure the logic pipeline 110, via a scrubbing, a power cycle or any other method, the processor manager 250 may initiate a retest by signaling the test vector source 210 to retest the logic pipeline 110. The retest verifies that the configuration of the logic pipeline 110 is correct before the processor manager 250 allows the processor 100 to return to normal operation.
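The repair-then-retest sequence might be modeled as in the Python sketch below. The function `recover`, its callback parameters, and the bounded number of attempts are all assumptions for illustration; the description does not state how many repair attempts are made.

```python
def recover(run_test, repair, max_attempts=3):
    """Attempt a repair, then retest; report whether normal operation may resume.

    run_test: callable returning True when the test vector output matches.
    repair:   callable performing a scrub, reboot, or power cycle (hypothetical).
    """
    for _ in range(max_attempts):
        repair()
        if run_test():
            return True   # configuration verified; allow return to operation
    return False          # still failing; keep the output blocked
```

The key point the sketch captures is ordering: the processor is released only after a retest passes, never merely because a repair was attempted.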
In one embodiment, for example, the error flag output by the processor manager 250 could be used to control an outside system. For example, if the processor 100 is being used to control a motor, the error flag may be used by the processor 100 or another external controller to stop the motor.
The response of the processor manager 250 to the error detected by the validator 240 illustrated in
When the validator 240 identifies output of the logic pipeline accompanying a test vector flag, the validator compares the output to an expected output saved in memory. (Step 530). When the output of the logic pipeline matches the expected output, the validator closes a switch, if previously open, to allow the processor 100 to output data. (Step 540). In the embodiment illustrated in
The validator 240, upon detecting a configuration error when the output of the logic pipeline 110 does not match the expected output, opens a switch (either internal to the validator 240 or an external switch 260), if previously closed, preventing the processor 100 from outputting data, and sends a signal to a processor manager 250 indicating the configuration error. (Step 550). By opening a switch connecting the logic pipeline 110 to an output of the processor 100, the output of the processor 100 is fail passive. In other words, rather than outputting corrupted data, the processor 100 is prevented from outputting data entirely.
The processor manager 250, upon receiving an indication of a configuration error from the validator 240, may output an error signal and/or attempt to correct the configuration of the logic pipeline 110. (Step 560). The response to the detection of the configuration error may vary from system to system. In some embodiments, for example, the processor manager 250 may attempt to power cycle the processor 100 to initiate a reprogramming of the processor 100. In another embodiment, for example, the processor manager 250, or an external controller, could attempt to scrub the logic pipeline 110 to find and correct the configuration error. In these embodiments, the processor manager 250 may be configured to initiate a retest of the logic pipeline 110 to ensure that the configuration of the logic pipeline 110 was corrected. When the configuration error has been corrected, the processor manager 250 allows the processor 100 to return to normal operation and the built-in configuration error detector 120 returns to step 510 to await the next idle period in the logic pipeline 110.
In instances where the logic pipeline 110 is not reprogrammable, the error signal may indicate that the processor 100 needs to be replaced or that a given computation might be repeated in an attempt to be fault tolerant to transient error. Depending upon the application of the processor 100, the processor manager 250 may allow the processor 100 to continue to process data or may prevent the processor 100 from returning to operation. For example, if the processor 100 is being utilized to generate video in a medical imaging device, the processor manager 250 may allow the processor to continue operation even though the data may be corrupted. When the processor 100 is in a satellite being used to generate motor data, the processor manager 250 may shut down operation of the processor 100.
While at least one exemplary embodiment has been presented in the foregoing detailed description of the invention, it should be appreciated that a vast number of variations exist. It should also be appreciated that the exemplary embodiment or exemplary embodiments are only examples, and are not intended to limit the scope, applicability, or configuration of the invention in any way. Rather, the foregoing detailed description will provide those skilled in the art with a convenient road map for implementing an exemplary embodiment of the invention. It being understood that various changes may be made in the function and arrangement of elements described in an exemplary embodiment without departing from the scope of the invention as set forth in the appended claims.