Faulty Printed Circuit Boards (PCBs) populated with various types of electrical components arrive at development labs from various locations for error analysis. When an error occurs on a PCB, the voltage regulator on the PCB outputs a fault associated with the error. The fault may be stored in memory to allow engineers to analyze the faulty circuits or systems on the PCBs. The PCBs arrive from various locations where they were operated with various firmware versions and/or system configurations that cannot be identified without powering the PCB. However, powering a PCB with incorrect firmware and/or system configurations can lead to additional damage that prevents diagnosis of the error that originally occurred. Additionally, catastrophic errors may shut down a PCB before the PCB can store the fault associated with the catastrophic error into memory.
The figures are not to scale. Wherever possible, the same reference numbers will be used throughout the drawing(s) and accompanying written description to refer to the same or like parts.
Non-volatile phase change memory elements (e.g., memristors) have numerous advantages in electrical designs. Memristors are semiconductive devices that have variable resistance. The resistance of a memristor can be changed and/or read by applying various voltages for various amounts of time. Accordingly, a memristor can be implemented as a memory bit. In such an example, applying a first voltage for a first duration of time changes the resistance of the memristor to a low resistance to act like a closed switch. Additionally, applying a second voltage for a second duration of time changes the resistance of the memristor to a high resistance to act like an open switch. The amount of resistance can be equated to a binary value (e.g., like a bit cell) in memory. Small blocks of memristors are able to store data faster than conventional memory. Additionally, stored values in memristors can be determined by measuring the resistance of the memristors. Accordingly, a user can probe memristors externally (e.g., via an oscilloscope) to determine data stored in the memristors without actually powering up the memristors and/or a processor associated with the memristors.
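The write behavior described above can be illustrated with a toy model. The following is a minimal sketch, not a device model: the voltage levels, the minimum pulse duration, the use of opposite polarities for the two writes, and the names `MemristorCell`, `apply_pulse`, and `read_bit` are all hypothetical. The resistance values and the mapping of low resistance to "1" follow the examples given elsewhere in this description.

```python
from dataclasses import dataclass

# All numeric values below are hypothetical placeholders, not device specifications.
SET_VOLTAGE = 1.0          # volts; stands in for the "first voltage" (drives low resistance)
RESET_VOLTAGE = -1.0       # volts; stands in for the "second voltage" (drives high resistance)
MIN_PULSE_SECONDS = 1e-8   # assumed minimum pulse duration needed to switch the cell
LOW_OHMS = 100.0           # example low-resistance state (closed switch)
HIGH_OHMS = 1_000_000.0    # example high-resistance state (open switch)


@dataclass
class MemristorCell:
    """Toy two-state memristor: the resistance persists until a valid write pulse arrives."""
    resistance_ohms: float = HIGH_OHMS

    def apply_pulse(self, volts: float, seconds: float) -> None:
        """Switch state only if the pulse is both strong enough and long enough."""
        if seconds < MIN_PULSE_SECONDS:
            return  # pulse too short: state is unchanged
        if volts >= SET_VOLTAGE:
            self.resistance_ohms = LOW_OHMS
        elif volts <= RESET_VOLTAGE:
            self.resistance_ohms = HIGH_OHMS

    def read_bit(self) -> int:
        """Low resistance is read as 1, high resistance as 0."""
        return 1 if self.resistance_ohms == LOW_OHMS else 0
```

In this sketch, a pulse that is too short or too weak leaves the stored value untouched, which mirrors the requirement that a voltage be applied for a sufficient duration before the resistance changes.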
Examples disclosed herein utilize the properties of memristors to perform status and/or error logging associated with an integrated circuit (IC). Status logging includes determining the status of a system (e.g., status data) associated with the integrated circuit. The status data may include a firmware being utilized by the system, system configurations, operating temperature, ambient temperature, software status, identifiers associated with the integrated circuit and/or a device in communication with the integrated circuit, etc. Error logging includes determining fault data related to an error associated with the integrated circuit. When an error occurs, a voltage regulator down may output fault data to a programmable logic device, an integrated lights out device, and/or a memristor system. As further described herein, the fault data may be minimal or may be more detailed depending on how the memristor system is attached to the integrated circuit.
Conventional techniques for storing fault data and/or status data include storing the data into an active health negative-and (NAND) memory, Electrically Erasable Programmable Read-Only Memory (EEPROM), and/or Read-Only Memory (ROM). However, such conventional techniques require a large amount of space. Additionally, active health NAND memory, EEPROM, and/or ROM cannot be read externally (e.g., via a probe). Therefore, an engineer needs to power the circuits and/or systems on a PCB associated with the memory to determine the stored data. However, powering the circuits and/or systems on the PCB with incorrect system configurations and/or firmware may lead to additional damage that prevents diagnosis of the error that originally occurred. Additionally, such conventional techniques are slower than examples disclosed herein. Catastrophic errors cause all and/or some of the components of the PCB and memory to shut down immediately. Thus, such conventional techniques are not fast enough to log (e.g., store) a fault associated with the catastrophic error.
Examples described herein communicatively couple memristors to an integrated circuit to store fault data and/or status data associated with the integrated circuit. Memristors store data much faster than traditional memory. Thus, a fault associated with a catastrophic error can be detected and stored before the catastrophic error shuts down the components of the PCB. Additionally, data stored in memristors can be read by determining the resistance of the memristors. Therefore, an engineer can determine system configurations, firmware versions, fault data, etc. corresponding to the PCB by probing the memristors externally without powering up the components of the PCB. In this manner, an engineer can power the PCB with the correct system configurations, options, riser cards, mated system boards, firmware, etc. to prevent further damage. Additionally, components on a PCB can re-boot with correct system configurations, firmware version, etc. when the components are activated/re-activated.
The example IC 100 is a semiconductor device including a set of electrical components (e.g., resistors, capacitors, inductors, transistors, etc.). The example IC 100 may be part of a bigger IC and/or in communication with other ICs. For example, the example IC 100 may be part of and/or connected to an Ethernet motherboard, a central processing unit, memory, a server, a processor, a controller, etc. The example integrated circuit 100 includes the example converter 101 to receive a first signal (e.g., a first voltage) and output a second signal (e.g., a second voltage). In some examples, the integrated circuit 100 includes and/or is coupled to a processor to output the first signal to the example converter 101 (e.g., when an error occurs and/or to transmit status data). In
The example programmable logic device 102 receives the minimal fault data from the example converter 101 and outputs the minimal fault data and more detailed fault data. Additionally, once the minimal fault data is received by the example programmable logic device 102, the programmable logic device 102 communicates with the example IC 100 to get additional information about the failure. For example, the additional information may include data related to a type of error that occurred, which part of the IC 100 caused the error, which rail is turning off due to the error, a status of other rails in the bus, and/or whether or not turning off the rail caused the IC 100 to continue operating without a reboot. In some examples, the example programmable logic device 102 includes ambient temperature sensors. In such examples, the example programmable logic device 102 includes the ambient temperature sensed by the sensors in the more detailed fault data. In some examples, the example programmable logic device 102 is a complex programmable logic device (CPLD) including logic implementing disjunctive normal form expressions and/or more specialized logic operations. Alternatively, the example programmable logic device 102 may be a programmable logic array, a programmable array logic device, a generic array logic device, a field-programmable gate array, and/or any other type of programmable logic device. The example programmable logic device 102 outputs the more detailed fault data to the example iLO 104 and/or the example memristor system 106.
The example iLO 104 is a remote server management processor to control and/or monitor status of the example integrated circuit 100 from a remote location. The example iLO 104 timestamps the error when the more detailed fault data is received. In some examples, the example iLO 104 communicates with a basic input/output system (BIOS) and/or unified extensible firmware interface (UEFI) associated with the example IC 100. Communicating with the BIOS and/or the UEFI allows the example iLO 104 to determine various status data such as system status (e.g., booting, running, idle), a current checkpoint state of the BIOS, stress load of the system, etc. In some examples, the iLO 104 includes a network connection and/or Internet Protocol (IP) address to connect to other management networks. In some examples, the iLO 104 includes a remote web-based console to communicate with the example integrated circuit 100 remotely. Using the remote web-based console, a user can operate features including setup, configuration, remote power on and/or off, secure socket layer security, detailed status (e.g., server status), virtual indicators, and/or diagnostics. In some examples, the iLO 104 outputs a most detailed fault data by outputting the more detailed fault data from the example programmable logic device 102 along with a timestamp, state of the system (e.g., BIOS checkpoints), data related to whether and/or how long the system was booted up for, the stress load of the system, etc. In some examples, the example programmable logic device 102 and the example iLO 104 may be combined into one device performing the functionalities of both devices.
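The three levels of fault data described above (minimal from the converter 101, more detailed from the programmable logic device 102, most detailed from the iLO 104) can be pictured as successive enrichment of one record. The following is a minimal sketch under stated assumptions: the `FaultRecord` type, its field names, and the functions `pld_enrich` and `ilo_enrich` are hypothetical and are not part of the disclosed examples.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class FaultRecord:
    """Hypothetical container for fault data at increasing levels of detail."""
    fault_code: int                                        # minimal fault data (converter)
    details: Dict[str, str] = field(default_factory=dict)  # added by the programmable logic device
    timestamp: Optional[float] = None                      # added by the iLO
    system_state: Optional[str] = None                     # e.g., a BIOS checkpoint, added by the iLO


def pld_enrich(record: FaultRecord, error_type: str, failing_rail: str,
               ambient_temp_c: Optional[float] = None) -> FaultRecord:
    """Return a copy carrying the more detailed fault data; the input is not modified."""
    details = dict(record.details)
    details["error_type"] = error_type
    details["failing_rail"] = failing_rail
    if ambient_temp_c is not None:
        details["ambient_temp_c"] = str(ambient_temp_c)
    return replace(record, details=details)


def ilo_enrich(record: FaultRecord, timestamp: float, system_state: str) -> FaultRecord:
    """Return a copy carrying the most detailed fault data (timestamp and system state)."""
    return replace(record, timestamp=timestamp, system_state=system_state)
```

Each stage returns a new record, so a memristor system attached at any one stage could store whichever level of detail is available at that point.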
The example memristor system 106 includes the example memristor controller 107 and the example memristor array 108. In some examples, the memristor system 106 is a separate device (e.g., PCB) attached to at least one of the example IC 100, the example programmable logic device 102, and/or the example iLO 104. Alternatively, the example memristor system 106 may be embedded in at least one of the example IC 100, the example programmable logic device 102, and/or the example iLO 104. In
The example memristor controller 107 receives fault data and/or status data from the example converter 101. In some examples, the example memristor controller 107 communicates (e.g., periodically or aperiodically) with the example IC 100, the example programmable logic device 102, and/or the example iLO 104 to determine status data of the example IC 100. The status data may include system configurations, options, riser card identifiers, mated system board identifiers, firmware identifiers, software identifiers, an IC identifier, an operating temperature, an ambient temperature, a timestamp, and/or any other data related to the example IC 100. In some examples, the example memristor controller 107 polls an input associated with the example converter 101 to identify (e.g., detect) a fault output by the example converter 101 associated with the example IC 100. The example memristor controller 107 stores the fault data and/or status data in the example memristor array 108. Additionally, the example memristor controller 107 may read the stored status data from the example memristor array 108. For example, if the example IC 100 needs to be rebooted (e.g., restarted), the example memristor controller 107 may read the status data in the example memristor array 108 to determine IC system configurations, options, a riser card identifier, a mated system board identifier, and/or a firmware version installed on the example IC 100. In such an example, the memristor controller 107 may transmit the system configurations and/or firmware version to the example IC 100 prior to and/or after the reboot. Transmitting the correct firmware version allows the example IC 100 to reboot without causing further damage associated with rebooting with incorrect configurations.
The example memristor array 108 is an array of non-volatile phase change memory elements (e.g., memristors). The example memristor array 108 may include any number of memristors. The memristors in the memristor array 108 are resistive elements created from a thin doped semiconductor film with variable resistance. As voltage is applied to a memristor, the doping of the semiconductor film changes. As the doping changes, the resistance associated with the memristor changes. For example, a memristor may be doped. The doped memristor has a low resistance (e.g., 100 Ohms) and operates like a closed switch. In such an example, the low resistance of the memristor may be associated with a binary value of “1” to indicate “ON.” If a high voltage is applied to the doped memristor for a sufficient duration of time, the dopants drift to cause the memristor to act like undoped semiconductor material. The “undoped” memristor has a high resistance (e.g., 1 megaOhm) and operates like an open switch. In such an example, the high resistance of the memristor may be associated with a binary value of “0” to indicate “OFF.” In some examples, the amount of resistance can be associated with an analog value. In such an example, the example memristor controller 107 outputs a particular voltage for a particular amount of time to change the resistance of a memristor to a particular resistance. The particular resistance can later be read and associated with the stored data (e.g., based on the resistance). As previously described, the resistance is non-volatile. Thus, stored values in memristors can be determined after power is lost and without providing power to any other device. For example, a user can read the stored data by probing the example memristor array 108 (e.g., via an oscilloscope) without powering the example IC 100.
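To illustrate how externally probed resistances could be turned back into stored data, the following is a minimal sketch. It uses the example values from the text (about 100 Ohms for a low-resistance "1" and about 1 megaOhm for a high-resistance "0"). The decision threshold (the geometric midpoint of the two example values), the most-significant-bit-first ordering, and the function names are assumptions made for illustration only.

```python
from typing import Sequence

# Example state resistances taken from the text; real devices will differ.
LOW_RESISTANCE_OHMS = 100.0
HIGH_RESISTANCE_OHMS = 1_000_000.0
# Assumed decision threshold: geometric midpoint of the two example states (10 kOhm).
THRESHOLD_OHMS = (LOW_RESISTANCE_OHMS * HIGH_RESISTANCE_OHMS) ** 0.5


def resistance_to_bit(resistance_ohms: float) -> int:
    """Map a probed resistance to a bit: low resistance -> 1, high resistance -> 0."""
    if resistance_ohms <= 0:
        raise ValueError("resistance must be positive")
    return 1 if resistance_ohms < THRESHOLD_OHMS else 0


def decode_byte(resistances: Sequence[float]) -> int:
    """Decode eight probed resistances (most significant bit first) into one byte."""
    if len(resistances) != 8:
        raise ValueError("expected exactly eight resistance readings")
    value = 0
    for reading in resistances:
        value = (value << 1) | resistance_to_bit(reading)
    return value
```

For example, readings of low, high, low, high, high, high, high, low resistance would decode to the bit pattern 10100001 under these assumptions. Because only resistance measurements are needed, nothing in this decoding step requires the IC to be powered.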
In
In
In
In
The example block diagram 110 of
The example block diagram 140 of
The example receiver 200 receives fault data and/or status data from the example IC 100, the example programmable logic device 102, and/or the example iLO 104. In some examples, the example receiver 200 may receive stored data from the example memristor array 108. The example receiver 200 transmits the received data to the example transmitter 202, the example fault detector 204, and/or the example status determiner 206 for further processing.
The example transmitter 202 transmits a write signal or a plurality of write signals to store received fault data and/or status data into the example memristor array 108 of
The example fault detector 204 processes data received via the example receiver 200 to determine if fault data has been received by the example receiver 200. The example receiver 200 may receive fault data and/or status data. Thus, the example fault detector 204 may sort through received data to determine if the received data is fault data and/or status data. If the example fault detector 204 determines that the received data is associated with an error (e.g., the received data is fault data), the fault detector 204 generates a write signal to store the fault data into the example memristor array 108. If the received data is associated with the IC status, the status determiner 206 processes the received data.
The example status determiner 206 processes received signals to determine how to store the received signals into the example memristor array 108. In some examples, the example status determiner 206 generates a write signal to store the received fault data and/or status data into the example memristor array 108 via the example transmitter 202. In some examples, the example status determiner 206 may analyze the stored status data to generate a signal identifying a last known firmware and/or other data based on the stored status data. In such an example, the example transmitter 202 may transmit the generated signal to the example IC 100 (e.g., for a re-boot).
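The division of labor among the receiver 200, the transmitter 202, the fault detector 204, and the status determiner 206 can be sketched as a small routing class. This is an illustrative sketch only: the `(kind, payload)` message format, the class and method names, and the use of in-memory Python containers in place of the memristor array 108 are all assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical message format: (kind, payload), where kind is "fault" or "status".
Message = Tuple[str, Dict[str, str]]


class MemristorControllerSketch:
    """Toy stand-in for the memristor controller; Python containers replace the array."""

    def __init__(self) -> None:
        self.status_region: Dict[str, str] = {}       # stands in for cells holding status data
        self.fault_region: List[Dict[str, str]] = []  # stands in for cells holding fault data

    def receive(self, message: Message) -> str:
        """Receiver role: pass the data on to the fault detector or the status determiner."""
        kind, payload = message
        if kind == "fault":
            self._detect_fault(payload)
        elif kind == "status":
            self._determine_status(payload)
        else:
            raise ValueError(f"unknown message kind: {kind!r}")
        return kind

    def _detect_fault(self, payload: Dict[str, str]) -> None:
        """Fault detector role: fault data triggers a write into the fault region."""
        self._transmit_write("fault", payload)

    def _determine_status(self, payload: Dict[str, str]) -> None:
        """Status determiner role: status data triggers a write into the status region."""
        self._transmit_write("status", payload)

    def _transmit_write(self, region: str, payload: Dict[str, str]) -> None:
        """Transmitter role: perform the write that a write signal would cause."""
        if region == "fault":
            self.fault_region.append(dict(payload))
        else:
            self.status_region.update(payload)
```

Keeping status data and fault data in separate regions in this sketch reflects the idea of storing each in its own subset of resistive elements.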
While example manners of implementing the example memristor controller 107 of
Flowcharts representative of example machine readable instructions for implementing the example memristor controller 107 of
As mentioned above, the example processes of
At block 302, the example receiver 200 receives status data associated with the example IC 100 of
At block 308, the status determiner 206 determines if the status data associated with the IC 100 has been updated. In some examples, the status determiner 206 sends requests via the example transmitter 202 to the example IC 100, the example programmable logic device 102, and/or the example iLO 104 to send updated status data. Once the example receiver 200 receives the updated status data, the example status determiner 206 compares the updated status data with the stored status data to determine if the status data has been updated. In some examples, the IC 100, the example programmable logic device 102, and/or the example iLO 104 may send IC status data only when the example IC 100 has been updated (e.g., based on system configurations, a firmware version, a software version, etc.). In such an example, the status determiner 206 determines that the integrated circuit status data has been updated whenever the example receiver 200 receives additional status data.
If the example status determiner 206 determines that the IC status data has been updated, the example status determiner 206 transmits a write signal via the example transmitter 202 to update the example memristor array 108 based on the updated IC status data. If the example status determiner 206 determines that the IC status data has not been updated, the example fault detector 204 determines if a fault has been detected (block 310). As previously described, the fault detector 204 may determine a fault has been detected when fault data is received by the example receiver 200. If fault data has not been detected, the status determiner 206 continues to determine if the IC status data has been updated. If a fault has been detected, the example fault detector 204 generates a write signal to transmit to the example memristor array 108 via the example transmitter 202 to write the fault data into the example memristor array 108 (block 312). In some examples, if the error associated with the fault does not cause the system to shut down, the process may return to continue to check whether the IC status data has been updated and/or whether additional faults are received by the example receiver 200.
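The decision flow of blocks 308, 310, and 312 can be sketched as a simple loop. This is a hedged illustration, not the disclosed machine readable instructions: the event tuple format, the `run_logging_loop` name, and the use of a dict and a list in place of the memristor array 108 are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Status = Dict[str, str]
# Hypothetical event: (received status or None, received fault or None, fault shuts system down)
Event = Tuple[Optional[Status], Optional[str], bool]


def run_logging_loop(events: Iterable[Event], stored_status: Status,
                     fault_log: List[str]) -> int:
    """Process events in order and return the number of writes performed."""
    writes = 0
    for new_status, fault, shuts_down in events:
        # Block 308: has the status data changed relative to what is stored?
        if new_status is not None and new_status != stored_status:
            stored_status.clear()
            stored_status.update(new_status)  # write the updated status data
            writes += 1
            continue
        # Block 310: has a fault been detected?
        if fault is not None:
            fault_log.append(fault)           # block 312: write the fault data
            writes += 1
            if shuts_down:
                break                         # the system is down; stop checking
    return writes
```

In this sketch, status data that matches what is already stored causes no write, and a fault that shuts the system down ends the loop after the fault has been recorded.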
At block 402, the example transmitter 202 transmits a read signal to the example memristor array 108 to determine values stored in the example memristor array 108. The status determiner 206 receives the values from the example memristor array 108 and determines the stored IC status data based on the received values (block 404). For example, the example status determiner 206 may determine a system configuration of the example IC 100, the firmware version last run on the example IC 100, etc. The example status determiner 206 generates a signal to send to the example IC 100, the example programmable logic device 102, and/or the example iLO 104 based on the determined status data.
At block 406, the example transmitter 202 transmits the status data to the example IC 100, the example programmable logic device 102, and/or the example iLO 104. As previously described, the example IC 100 may initialize (e.g., re-boot) based on the received status data. For example, the IC 100 may initialize with firmware associated with and/or compatible with the firmware identified in the received status data. Additionally or alternatively, the example IC 100 may initialize with a system configuration corresponding to the system configurations associated with the status data.
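The read-back flow of blocks 402, 404, and 406 can be sketched as three small steps. The sketch below is an assumption-laden illustration: the `key=value;key=value` ASCII encoding of status data, the function names, and the `send` callback are hypothetical and are not specified in this description.

```python
from typing import Callable, Dict, Sequence


def read_array(cells: Sequence[int]) -> bytes:
    """Block 402 stand-in: return the byte values read back from the array."""
    return bytes(cells)


def parse_status(raw: bytes) -> Dict[str, str]:
    """Block 404 stand-in: decode 'key=value;key=value' text (a hypothetical encoding)."""
    status: Dict[str, str] = {}
    for item in raw.decode("ascii").split(";"):
        if not item:
            continue  # tolerate a trailing separator
        key, _, value = item.partition("=")
        status[key] = value
    return status


def transmit_status(status: Dict[str, str], targets: Sequence[str],
                    send: Callable[[str, Dict[str, str]], None]) -> None:
    """Block 406 stand-in: send a copy of the status data to each target device."""
    for target in targets:
        send(target, dict(status))
```

A device receiving such status data could then select a firmware version and system configuration compatible with the last known values before it initializes.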
The processor platform 500 of the illustrated example includes a processor 512. The processor 512 of the illustrated example is hardware. For example, the processor 512 can be implemented by one or more integrated circuits, logic circuits, microprocessors or controllers from any desired family or manufacturer.
The processor 512 of the illustrated example includes a local memory 513 (e.g., a cache). The processor 512 executes the instructions of
The processor platform 500 of the illustrated example also includes an interface circuit 520. The interface circuit 520 may be implemented by any type of interface standard, such as an Ethernet interface, a universal serial bus (USB), and/or a PCI express interface.
In the illustrated example, one or more input devices 522 are connected to the interface circuit 520. The input device(s) 522 permit(s) a user to enter data and commands into the processor 512. The input device(s) can be implemented by, for example, an audio sensor, a microphone, a keyboard, a button, a light emitting diode (LED), a mouse, a touchscreen, a track-pad, a trackball, isopoint and/or a voice recognition system.
One or more output devices 524 are also connected to the interface circuit 520 of the illustrated example. The output devices 524 can be implemented, for example, by display devices (e.g., a light emitting diode (LED), an organic light emitting diode (OLED), a liquid crystal display, a cathode ray tube display (CRT), a touchscreen, a tactile output device, and/or speakers). The interface circuit 520 of the illustrated example, thus, typically includes a graphics driver card, a graphics driver chip or a graphics driver processor.
The interface circuit 520 of the illustrated example also includes a communication device such as a transmitter, a receiver, a transceiver, a modem and/or network interface card to facilitate exchange of data with external machines (e.g., computing devices of any kind) via a network 526 (e.g., an Ethernet connection, a digital subscriber line (DSL), a telephone line, coaxial cable, a cellular telephone system, etc.).
The processor platform 500 of the illustrated example also includes one or more mass storage devices 528 for storing software and/or data. Examples of such mass storage devices 528 include floppy disk drives, hard drive disks, compact disk drives, Blu-ray disk drives, RAID systems, and digital versatile disk (DVD) drives.
The coded instructions 532 of
From the foregoing, it will be appreciated that the above disclosed method, apparatus, and articles of manufacture utilize a memristor system to store fault data and/or status data associated with an IC. Using examples disclosed herein, stored fault data and/or status data in the memristor system can be determined (e.g., via a probe) without supplying power to (e.g., powering up) the IC. Additionally, examples disclosed herein can store fault data associated with catastrophic errors (CATERRs). In some examples, the memristor system is attached directly to the IC to receive faults associated with the CATERRs. In some examples, the memristor system is attached to a programmable logic device and/or an iLO to receive more detailed fault data. In some examples, the memristor system and/or multiple memristor systems are attached to any combination of the IC, the programmable logic device, and the iLO to receive fault data associated with errors that occur within a system. Additionally, the memristor system receives and stores status data associated with the IC. In some examples, the IC 100 uses status data stored in the example memristor system to initialize and/or re-boot the IC 100.
Conventional methods for storing status and/or fault data associated with an IC include storing the status and/or the fault data into active health NAND, EEPROM, and/or ROM attached to the IC. Such conventional methods require more space, cannot be read without powering the IC, and are slower. Powering the IC with incorrect (e.g., outdated) configurations and/or firmware to read the stored data can cause additional damage. Additionally, the amount of time needed to store data using such conventional techniques is too long to store faults associated with a CATERR (e.g., since CATERRs cause the IC to shut down). Examples disclosed herein alleviate such problems by utilizing the memristor system to store fault data and/or status data into faster memristors that can be read without providing power to the IC attached to the memristor system.
Example methods and apparatus are disclosed to store status and/or fault data associated with an integrated circuit. Such an example apparatus includes a plurality of resistive elements, a fault detector to determine when a fault corresponding to an integrated circuit has occurred, and a data determiner to, when first data related to the integrated circuit is updated, store the first data in a first subset of the plurality of resistive elements, the data determiner to, in response to the detection of the fault, store second data in a second subset of the plurality of resistive elements, the second data corresponding to an error associated with the fault.
In some examples, the plurality of resistive elements are memristors. In some examples, the first data includes an identifier identifying at least one of the integrated circuit, firmware utilized by the integrated circuit, software utilized by the integrated circuit, hardware corresponding to the integrated circuit, a temperature corresponding to the integrated circuit, or a component associated with the integrated circuit. In some examples, the status determiner is to determine that the integrated circuit has been updated by polling the integrated circuit.
In some examples, the second data includes a timestamp corresponding to when the fault occurred. In some examples, the first data and the second data can be read without powering the integrated circuit. In some examples, the data determiner is to, when an error associated with the fault causes the integrated circuit to re-boot, transmit the first data to the integrated circuit prior to the re-booting.
Although certain example methods, apparatus and articles of manufacture have been disclosed herein, the scope of coverage of this patent is not limited thereto. On the contrary, this patent covers all methods, apparatus and articles of manufacture fairly falling within the scope of the claims of this patent.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2016/014909 | 1/26/2016 | WO | 00 |