1. Technical Field
The present invention relates to circuit architectures and more particularly to circuit designs employing semiconductor stacks having power circuits, self-repairing circuits, self-checking circuits or other integrated circuits advantageously positioned in the stack.
2. Description of the Related Art
Reliability, availability and serviceability (RAS) of complex integrated circuits, such as high performance microprocessors, require that fault detection and repair actions happen immediately after fault occurrence. Otherwise, several clock cycles are wasted during which erroneous instruction and data information are spread throughout the system. This would make recovery and state rollback to a last correct state very difficult given the exponential growth of a fault tree.
One way to have an early detection of fault occurrence is to place function checkers and fault detection circuitry as close as possible to the hardware where checking is needed. This is in contrast to relegating the monitoring and recovery capabilities to the system or firmware levels. The problem with bringing fault-detection, repair, and recovery functions to the hardware level is the negative impact that implementing such circuitry has on chip area, wireability, and performance of the overall chip.
A three-dimensional architecture chip includes a base chip including a unit integrated thereon and configured to perform electrical signal operations. An active layer is fabricated separately from the base chip. The active layer includes a component to service the unit of the base chip. The active layer is bonded to the base chip such that the component is aligned in vertical proximity of the unit. An electrical connection connects the unit to the component through vertical layers of at least one of the base chip and the active layer.
These and other features and advantages will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.
The disclosure will provide details in the following description of preferred embodiments with reference to the following figures wherein:
Embodiments in accordance with present principles preferably employ the manufacturing of three-dimensional chips by bonding several active layers in one stack along with interconnect level connection between each portion of the stack. For present purposes, a chip may be defined as an integrated circuit including one or more passive or active elements. A stack includes two or more chips operatively coupled to each other to perform an operation. A stack may be referred to as having a three-dimensional architecture or as a three-dimensional chip, since the stack employs not only a layout area but also a stack height. Separately fabricated refers to two chips fabricated separately in different processes and perhaps at remote locations.
In accordance with present principles, fabrication and manufacturing methods are employed to take advantage of the stacking capabilities in designing fault-detection, repair, and recovery circuits that run concurrently with monitored hardware. Designs and architectures implemented in three-dimensional semiconductor stacks provide self-checking integrated circuits, self-repairing integrated circuits, power management integrated circuits, redundant components, etc. that are closer in proximity to hardware needing these services or functions. Vertical proximity will be referred to herein to describe a placement area for a component on an active layer above or below a unit on a different chip that it services. The placement area is such that improved performance is achieved by such placement.
Embodiments of the present invention can take the form of a hardware embodiment that may include any type of integrated circuit or combinations thereof. Integrated circuits may include, for example, electronic, magnetic, optical, electromagnetic, infrared, or other semiconductor devices or components.
Integrated circuits or chips as provided herein may be created in a graphical computer programming language, and stored in a computer storage medium (such as a disk, tape, physical hard drive, or virtual hard drive such as in a storage access network). If the designer does not fabricate chips or the photolithographic masks used to fabricate chips, the designer transmits the resulting design by physical means (e.g., by providing a copy of the storage medium storing the design) or electronically (e.g., through the Internet) to such entities, directly or indirectly. The stored design is then converted into the appropriate format (e.g., Graphic Data System II (GDSII)) for the fabrication of photolithographic masks, which typically include multiple copies of the chip design in question that are to be formed on a wafer. The photolithographic masks are utilized to define areas of the wafer (and/or the layers thereon) to be etched or otherwise processed.
The method as described herein is preferably employed in the fabrication of integrated circuit chip stacks. The resulting integrated circuit chip stacks can be distributed by the fabricator in multiple packaged forms. In one example, the stack is mounted in a chip package (such as a plastic carrier, with leads that are affixed to a motherboard or other higher level carrier) or in a multichip package (such as a ceramic carrier that has either or both surface interconnections or buried interconnections). In any case the chip stack may be integrated with other chips, stacks, discrete circuit elements, and/or other signal processing devices as part of either (a) an intermediate product, such as a motherboard, or (b) an end product. The end product can be any product that includes integrated circuit chips, ranging from toys and other low-end applications to advanced computer products having a display, a keyboard or other input device, and a central processor.
Referring now to the drawings in which like numerals represent the same or similar elements and initially to
Layer 106 or substrates 108 and 110 are processed to form active devices and/or passive devices. Active devices may include transistors, diodes, or entire circuits such as but not limited to self-repair circuits, self-checking circuits, monitoring circuits, etc. Passive devices may include resistors, capacitors, inductors, etc. After processing of layer 106 is completed, a dielectric layer 105 may be formed and patterned to provide contact layer V1 112 with contacts 114 to gate structures 116 and diffusion regions 117. Layer 112 is buried by further depositing dielectric material 107.
An adhesive layer 118 is formed on layer 107 to provide a bond to a handle wafer 120. Handle wafer 120 is provided to protect integrated circuit chip 100 during transport and/or further processing, and provide a gripping position. Substrate 102 can now be removed to provide an active device layer 101 (
Advantageously, active layer or layers 101 provide additional area for placing checkers, monitors, and fault detectors directly above or below important circuits or units of a processor, memory chip or other integrated circuit device.
Referring to
Chip 200 may be any type of device, e.g., a processor, a memory chip, a combination thereof, etc. Chip 200 includes devices or circuits that may need to be monitored, checked, corrected, etc. Active device layer 101 includes circuits, e.g., circuit 108, which provides one or more functions to a circuit 206 which will be in close vertical proximity to circuit 108 once assembled. In this process, the active device layer 101 is aligned with chip 200 and bonded to chip 200. Alignment is preferably within a tolerance of about 0.5-1.5 microns; however, this depends on the application and the technology. Alignment can be carried out using known techniques. Active device layer 101 is bonded to chip 200 by, e.g., fusion bonding or polymer bonding. The area of placement of a component on active device layer 101 is aligned to the unit of the base chip 200 that the component will service. The proximity to which the component is placed with reference to the unit will depend on many factors, such as performance improvements needed, heat dissipation considerations, processing considerations, etc. Other circuits 204 may also have components arranged to be near or above/below circuits of active layer 101 by adjusting other aspects of the design.
Referring to
Metal lines M2 308 may be formed simultaneously with vias V2 in a dual damascene process, or metal lines M2 may be formed in a separate deposition process (e.g., single damascene process). Metal lines M2 308 make connection between devices in active device layer 101 and chip 200. A top surface 306 is planarized, e.g., by a chemical mechanical polish process. Top surface 306 may now be further processed by additional deposition processes, or additional active device layers may be added. The additional active device layers may provide support functions for active device layer 101, chip 200 or devices or layers formed above active device layer 101.
In the illustrative embodiment depicted in
Referring to
Active layers 404 and 406 provide additional area for placing checkers, monitors, fault detectors, power management devices, spare or redundant circuits directly above or below important circuits or units of a processor, memory chip or other integrated circuit device. In one embodiment, active layers 404 and 406 can function as a repository of “spare parts” 434 for one or more units 432 in a processor 440. A switching element 436 provided on the additional active layer 404 may include control logic or fuses 438, which may be employed to bring up a spare part 434 and reconfigure the processor 440 to use the spare part 434. Note that with the illustrative architecture shown in
In one embodiment, switching element 436 includes one or more arrays of e-fuses 438, or programmable fuses, which are placed on the additional active layer 404. These e-fuses 438 may be implemented for the activation and reconfiguration of the spare parts 434. The switching elements 436 can also be responsible for activating additional resources (such as an extra core 445) to handle unexpectedly heavy processing loads. Core 445 may be placed on one of the active layers 404 and 406 as well as or in addition to being on base chip 420. The array of e-fuses and/or control logic 438 are preferably in close physical proximity to the monitored units (e.g., processor core 440) to ensure that the latency of sparing action after defect detection is minimal.
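The sparing behavior described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name `SparingController`, its methods, and the unit names are hypothetical and do not come from the disclosure. It models one property of e-fuses, namely that programming is one-time, so a spare that has been switched in is never returned to the pool.

```python
from dataclasses import dataclass, field


@dataclass
class SparingController:
    """Toy model (hypothetical) of a switching element with one-time fuses."""

    spares: list                                  # identifiers of unused spare parts
    routing: dict = field(default_factory=dict)   # faulty unit -> spare now in use

    def resolve(self, unit: str) -> str:
        """Return the part that currently performs the function of `unit`."""
        return self.routing.get(unit, unit)

    def report_fault(self, unit: str) -> bool:
        """Switch in a spare for `unit`; return False if no spare is left."""
        if not self.spares:
            return False
        # "Blowing the fuse": the spare is consumed and never returned.
        self.routing[unit] = self.spares.pop(0)
        return True


controller = SparingController(spares=["spare0"])
controller.report_fault("alu")      # the only spare now stands in for "alu"
print(controller.resolve("alu"))    # part currently serving the "alu" function
```

In hardware the equivalent lookup is a fixed multiplexer setting rather than a dictionary access, which is why physical proximity of the fuses to the monitored unit matters for sparing latency.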
Additional area can also be used to achieve higher levels of physical redundancy, which are not provided in two-dimensional semiconductors. While binary redundancy permits the detection of a fault, only tertiary (triple) redundancy can determine, by majority voting, which of the identical units is defective. Current computer systems may only have two cores per chip running concurrently, and hardware errors are detected by comparing the outputs of the cores. If recovery from a local hardware error fails to occur, both cores are “checkstopped” with an obvious hit on resource availability and a noticeable latency on the restoration of execution. Such a hit can be avoided if three-dimensional chips are employed to provide a third core, because the two cores that are in agreement continue execution. This tertiary scheme assumes that two cores are very unlikely to fail concurrently at the same execution point.
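The difference between binary and tertiary redundancy can be shown with a minimal majority voter. The function name `majority_vote` and its return convention are assumptions made for illustration; the disclosure does not specify a voter design.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence, Tuple


def majority_vote(outputs: Sequence[Hashable]) -> Tuple[Optional[Hashable], List[int]]:
    """Return (agreed_value, indices_of_dissenting_units).

    agreed_value is None when no strict majority exists, in which case
    every unit is suspect and all indices are returned.
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        # No strict majority: a fault is detected but cannot be localized.
        return None, list(range(len(outputs)))
    return value, [i for i, out in enumerate(outputs) if out != value]


# Two cores that disagree: the fault is detected, but the bad core is unknown.
print(majority_vote([0x2A, 0x2B]))
# Three cores: the two that agree outvote the third, which is flagged.
print(majority_vote([0x2A, 0x2A, 0x2B]))
```

With two outputs a mismatch yields no strict majority, which corresponds to the checkstop case. With three outputs the dissenting unit is identified and the agreeing pair can continue, provided two units do not fail identically at the same point.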
Referring to
One form of fault detection and correction, especially for dense memory systems and high-speed communication buses, includes error-correcting codes (ECC). For embedded memory, a method for detecting memory errors includes a Hamming code that provides single error correction and double error detection. In hardware, the Hamming code is implemented using combinational logic. Latency and area overhead in two-dimensional semiconductors are two reasons that deeper forms of ECC are not used. Such deeper forms (e.g., multi-error correction) are now needed because of the increasing sensitivity of memory systems (such as on-chip caches) to manufacturing variability and radiation-induced errors. The availability of additional area, in accordance with present principles, enables the implementation of more advanced error correcting schemes.
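As an illustration of single error correction with double error detection, the following sketch implements an extended Hamming (8,4) code in software. The word size, bit layout, and function names are assumptions chosen for brevity; practical memories protect wider words, and hardware computes the same parity equations with combinational XOR trees.

```python
def encode(nibble: int) -> list:
    """Encode 4 data bits into an 8-bit codeword (list of 0/1 values).

    Index 0 holds the overall parity bit. Indices 1..7 follow the classic
    Hamming(7,4) layout with parity bits at positions 1, 2 and 4.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2             # makes the total parity even
    return code


def decode(code: list):
    """Return (nibble, status); status is 'ok', 'corrected' or 'double'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                 # XOR of the positions of set bits
    overall = sum(code) % 2                 # 1 means an odd number of flips
    if overall == 1:
        # Single-bit error; syndrome 0 means the overall parity bit itself.
        code[syndrome] ^= 1
        status = "corrected"
    elif syndrome != 0:
        # Even number of flips with a nonzero syndrome: uncorrectable.
        return None, "double"
    else:
        status = "ok"
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


word = encode(0b1011)
word[6] ^= 1                                # inject a single-bit upset
print(decode(word))                         # data recovered, flagged as corrected
```

Correcting more than one error per word requires more check bits and deeper decode logic than this scheme, which is the area and latency cost that the text attributes to deeper forms of ECC.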
Referring to
Referring to
Referring to
Referring to
Advantageously, the memory storage 910 may be placed close to the areas of the base chip 904 that need to use the memory. The vertical space above and below the base chip can provide a large amount of memory in close proximity to the device or unit that needs to use the storage space. This can greatly improve performance due at least to the reduction in delay time for memory accesses of the units 912.
Having described preferred embodiments for three-dimensional architectures for self-checking and self-repairing integrated circuits and methods (which are intended to be illustrative and not limiting), it is noted that modifications and variations can be made by persons skilled in the art in light of the above teachings. It is therefore to be understood that changes may be made in the particular embodiments disclosed which are within the scope and spirit of the invention as outlined by the appended claims. Having thus described aspects of the invention, with the details and particularity required by the patent laws, what is claimed and desired protected by Letters Patent is set forth in the appended claims.
This invention was made with Government support under Contract No.: N66001-04-C-8032 awarded by the Defense Advanced Research Projects Agency (DARPA). The Government has certain rights in this invention.