In computing systems, electrical circuits can be designed on separate components, such as a die or a chip, and combined to create more complex systems. Integrated circuits can be customized and designed to perform specific functions, and multiple components can be modularly combined to increase performance. In some computing systems, similar components can be combined as an interconnected stack, with the combination acting as a single device. For example, dies can be stacked together to create a three-dimensional integrated circuit that reduces the overall footprint of the components and power consumption while increasing computing power or memory.
The accompanying drawings illustrate a number of example implementations and are a part of the specification. Together with the following description, these drawings demonstrate and explain various principles of the present disclosure.
Throughout the drawings, identical reference characters and descriptions indicate similar, but not necessarily identical, elements. While the example implementations described herein are susceptible to various modifications and alternative forms, specific implementations have been shown by way of example in the drawings and will be described in detail herein. However, the example implementations described herein are not intended to be limited to the particular forms disclosed. Rather, the present disclosure covers all modifications, equivalents, and alternatives falling within the scope of the appended claims.
The present disclosure is generally directed to apparatuses, systems, and methods for balancing timing closure, particularly for three-dimensional integrated circuits. As described below, by vertically stacking integrated circuits or dies, a computing system can maintain a smaller circuit footprint and reduce power consumption in comparison to many two-dimensional configurations. For example, a three-dimensional integrated circuit of multiple, identical stacked dies can replace a horizontal die configuration of a much larger footprint. In this example, the vertically stacked dies can be electronically interconnected, such as by through-silicon vias (TSVs) integrated into each die. To send data through the stack, the TSVs can transmit data to each layer of dies.
However, each die within a stack can have different timing paths, offset positions, different fuses, or other variations regardless of whether the dies are intended to be identical. In other words, because the dies are not a single piece of silicon, inconsistencies in manufacturing and placement can create global variations between dies. For example, different dies can have different global transistor variations, different metal properties due to mask shifts, and/or other processing differences. These variations can then lead to increased problems with timing closure in comparison to two-dimensional configurations. For example, two cached macros that are linearly arranged on the same die can have less timing variation than macros on two different stacked dies. Additionally, the placement of each die can increase variations as dies are stacked higher. Specifically, the die nearest to a base die can have a problem with hold slack, while the furthest die can have a problem with setup slack. Without configurable designs, timing closure can become difficult to achieve.
In some examples, an asynchronous interface can be used to ensure closure of both setup timing and hold timing. However, this can impose a heavy latency penalty. Alternatively, making the entire chip-to-chip path multicycle can impose a heavy bandwidth penalty. In other examples, actively skewing a clock with a loop, such as a delay-locked loop, can impose a heavy power penalty. Hard-coding delays into each die, by contrast, can result in a need for a large variety of dies with different delays based on configuration and placement, making die manufacturing expensive and binning difficult. Thus, a more flexible and efficient design for delays is needed to ensure timing closure on stacked dies.
In some implementations, the disclosed integrated circuit die includes connections, such as TSVs, to interface with other dies in a stack. In these implementations, the TSVs can transmit data signals between dies. In a non-limiting example, the disclosed integrated circuit die can include a programmable delay element configured to delay data signals. By integrating the programmable delay element into the integrated circuit, the disclosed die can control a timing of incoming and outgoing data signals to ensure timing closure. In some non-limiting examples, a programmable delay element can be independently integrated into each die within a stack. In other examples, programmable delay elements can be independently controlled via fuses to delay the timing of data signals, thereby enabling setup and hold timing to be closed on each die while maximizing overall frequency.
Furthermore, a computing system can include a three-dimensional integrated circuit. In some implementations, the three-dimensional integrated circuit can include multiple, identical dies, each with a programmable delay element to control the timing of data signals for the die. By integrating independently programmable delay elements, each die can be uniquely programmed to simultaneously close setup and hold timing at high speeds. Additionally, by enabling the tuning of the delay of each die post-silicon, the three-dimensional integrated circuit can avoid timing discrepancies due to manufacturing inconsistencies. In other words, the programmable delay element enables each layer in a die stack to be identically designed while also enabling different delay settings on each die via a fuse. Thus, the disclosed apparatus, system, and method of manufacturing integrate programmable delay elements into integrated circuit dies for better timing closure.
As will be described in greater detail below, the present disclosure describes various apparatuses, systems, and methods for balancing timing closure. In one implementation, an integrated circuit die includes a set of electronic circuits disposed on a semiconductor material. The integrated circuit die also includes one or more through-silicon vias (TSVs) that vertically span the semiconductor material to transmit data signals. Additionally, the integrated circuit die includes a programmable delay element integrated with the set of electronic circuits on the semiconductor material and configured to delay data signals.
In one example, the programmable delay element is configured to delay an incoming data signal arriving at the integrated circuit die to enable a closure of a setup timing of the incoming data signal and/or a hold timing of the incoming data signal. In this example, the programmable delay element is configured to enable the closure of the setup timing of the incoming data signal by implementing a low delay. Additionally or alternatively, in this example, the programmable delay element is configured to enable the closure of the hold timing of the incoming data signal by implementing a high delay.
In one example, the programmable delay element is configured to delay an outgoing data signal from the integrated circuit die to enable a closure of a setup timing of the outgoing data signal and/or a hold timing of the outgoing data signal. In this example, the programmable delay element is configured to enable the closure of the setup timing of the outgoing data signal by implementing a low delay. Additionally or alternatively, in this example, the programmable delay element is configured to enable the closure of the hold timing of the outgoing data signal by implementing a high delay.
In one example, the programmable delay element further includes one or more fuses electronically coupled to the programmable delay element and configured to control the programmable delay element within a three-dimensional integrated circuit, wherein the integrated circuit die is stacked with one or more other integrated circuit dies.
In one implementation, a three-dimensional integrated circuit includes a base die. The three-dimensional integrated circuit also includes a stack of two or more dies electronically coupled to an interface of the base die, wherein a programmable delay element is integrated with each die to control a timing of a data signal and one or more TSVs vertically span each die to interconnect to other dies to transmit data signals.
In one example, the two or more dies are identically designed. In this example, the two or more dies are vertically stacked by electronically coupling an interface of a first die with an interface of a second die such that one or more TSVs of the first die are aligned to one or more TSVs of the second die. In this example, the stack of the two or more dies is electronically coupled to the interface of the base die by electronically coupling the interface of the first die with the interface of the base die such that the one or more TSVs of the first die are aligned to an electronic component of the base die.
In one example, the base die is configured to broadcast data to each die of the stack, wherein the base die broadcasts data to a distal die through one or more TSVs of a proximal die. In this example, the base die is configured to receive data from each die of the stack, wherein the base die receives data from the distal die through the one or more TSVs of the proximal die.
In one example, the programmable delay element is configured to independently control the timing of the data signal for a single die within the stack.
In one example, the programmable delay element is configured to set a configuration bit to change a delay of the data signal based on a position of a die within the stack and/or a clock skew between the die within the stack and one or more other dies. In this example, the programmable delay element is configured to implement a low delay to close a setup timing of the data signal for a die within the stack distal to the base die. Additionally, in this example, the programmable delay element is configured to implement a high delay to close a hold timing of the data signal for a die within the stack proximal to the base die.
In one implementation, a method of manufacturing includes disposing a set of electronic circuits on a semiconductor material of an integrated circuit. The method of manufacturing also includes forming one or more TSVs to vertically span the semiconductor material, wherein the one or more TSVs are configured to transmit data signals. The method of manufacturing then includes integrating a programmable delay element with the set of electronic circuits, wherein the programmable delay element is configured to delay data signals.
In one example, the method of manufacturing further includes performing a pre-silicon validation of the integrated circuit by calculating a clock skew of the integrated circuit based on a position of the integrated circuit within a stack of dies and setting a configuration bit of the programmable delay element. In this example, the method of manufacturing further includes performing a post-silicon validation of the integrated circuit by stacking the integrated circuit in the stack of dies, testing a timing of the integrated circuit within the stack of dies, and reprogramming the configuration bit of the programmable delay element based on the testing.
Features from any of the implementations described herein can be used in combination with one another in accordance with the general principles described herein. These and other implementations, features, and advantages will be more fully understood upon reading the following detailed description in conjunction with the accompanying drawings and claims.
The following will provide, with reference to
In the example of
In the example of
In some examples, three-dimensional integrated circuit 100 can be all or a portion of a computing system that generally represents any type or form of computing system or computing device with electronic components to perform computing functions. Examples of computing systems include, without limitation, chiplets, printed circuit boards (PCBs), processors, and/or other electronic components or combinations of the same. Additional examples of computing systems include, without limitation, laptops, tablets, desktops, servers, cellular phones, Personal Digital Assistants (PDAs), multimedia players, embedded systems, wearable devices (e.g., smart watches, smart glasses, etc.), smart vehicles, so-called Internet-of-Things devices (e.g., smart appliances, etc.), gaming consoles, variations or combinations of one or more of the same, a portion of one or more of the same, or any other suitable computing device.
Many other devices or subsystems can be connected to three-dimensional integrated circuit 100 in
In one example, dies 106(1)-(3) are identically designed. In this example, dies 106(1)-(3) are vertically stacked by electronically coupling an interface of die 106(1) with an interface of die 106(2) such that one or more of TSVs 108(1)-(N) of die 106(1) are aligned to one or more of TSVs 108(1)-(N) of die 106(2). Similarly, an interface of die 106(2) is electronically coupled to an interface of die 106(3) such that one or more of TSVs 108(1)-(N) of die 106(2) are aligned to one or more of TSVs 108(1)-(N) of die 106(3). In the example of
In one example, base die 102 is configured to broadcast data to dies 106(1)-(3) of stack 104, wherein base die 102 broadcasts data to die 106(2) through one or more of TSVs 108(1)-(N) of die 106(1). In this example, base die 102 broadcasts data to die 106(3) through one or more of TSVs 108(1)-(N) of both die 106(1) and die 106(2). Similarly, in some examples, base die 102 is configured to receive data from dies 106(1)-(3) of stack 104, wherein base die 102 receives data from die 106(2) through one or more of TSVs 108(1)-(N) of die 106(1) and data from die 106(3) through one or more of TSVs 108(1)-(N) of both die 106(1) and die 106(2).
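The routing described above can be sketched in a few lines of Python. This is only an illustrative model, not part of the disclosed hardware: the function name `tsv_path` and the convention of numbering dies from 1 (nearest the base die) upward are assumptions chosen to mirror the labels 106(1)-(3), and the sketch further assumes that the die nearest the base die is reached directly through the base die interface.

```python
def tsv_path(target_die: int) -> list[int]:
    """Return the indices of the dies whose TSVs carry a signal
    between the base die and `target_die`.

    Assumption: dies are numbered from 1 (proximal to the base die)
    upward, and die 1 is reached directly through the base die interface.
    """
    if target_die < 1:
        raise ValueError("die indices start at 1")
    # A signal for die k passes through the TSVs of dies 1 .. k-1.
    return list(range(1, target_die))


if __name__ == "__main__":
    for die in (1, 2, 3):
        print(f"die {die}: via TSVs of dies {tsv_path(die)}")
```

Under these assumptions, the same path applies in both directions, so data received by the base die from a distal die traverses the same intermediate dies as data broadcast to it.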
In one example, programmable delay elements 110(1)-(3) are configured to independently control the timing of the data signal for each of dies 106(1)-(3) within stack 104. For example, programmable delay element 110(1) is configured to control the timing of the data signal for die 106(1) independently of dies 106(2)-(3), which are controlled by programmable delay elements 110(2)-(3).
In the example of
Similarly, in the example of
In one example, programmable delay element 110(1) and/or programmable delay element 110(2) can be configured to set a configuration bit to change a delay of the data signal based on a position of die 106(1) and/or die 106(2) within stack 104. Additionally or alternatively, the configuration bit can be set based on a clock skew between base die 102 and die 106(1) and/or a clock skew between die 106(1) and die 106(2). The clock skew is calculated as a difference between the arrival times of incoming clock signals at each die, which can be caused by different path lengths to arrive at the clock pin of the flip-flops of each die. The amount of delay set for each die can be calculated based on the clock skew between the dies. Due to dies 106(1)-(2) being separate components with potential differences in the silicon material, clock skew can be greater than the clock skew across a single die or chip. In some examples, dies 106(1)-(2) can represent dies of different types of technology or materials, rather than simply global variations.
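As a minimal sketch of the skew calculation described above, the following Python snippet computes clock skew as the difference between clock arrival times at two dies. The function name, the dictionary keys, the picosecond units, and all numeric values are hypothetical and serve only to illustrate the arithmetic.

```python
def clock_skew(launch_arrival_ps: float, capture_arrival_ps: float) -> float:
    """Clock skew as the difference between two clock arrival times.

    Sign convention (an assumption of this sketch): capture arrival
    minus launch arrival, so a positive value means the clock reaches
    the capturing die later than the launching die.
    """
    return capture_arrival_ps - launch_arrival_ps


# Hypothetical clock arrival times, in picoseconds, at each component.
arrivals_ps = {"base_die": 0.0, "die_1": 40.0, "die_2": 95.0}

skew_base_to_die1 = clock_skew(arrivals_ps["base_die"], arrivals_ps["die_1"])
skew_die1_to_die2 = clock_skew(arrivals_ps["die_1"], arrivals_ps["die_2"])

if __name__ == "__main__":
    print(skew_base_to_die1, skew_die1_to_die2)
```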
In some examples, programmable delay element 110(1) can be configured to implement a high delay to close a hold timing of the data signal for die 106(1) due to its proximal position to base die 102. Similarly, programmable delay element 110(2) can be configured to implement a low delay to close a setup timing of the data signal for die 106(2) due to its distal position to base die 102. Thus, programmable delay element 110(1) is set at the high delay for a tight hold path, and programmable delay element 110(2) is set at the low delay for a tight setup path.
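The reason a high delay helps a tight hold path while a low delay helps a tight setup path can be illustrated with the textbook setup and hold slack relations. The sketch below is not taken from the disclosure: the `Path` class, its field names, and every numeric value are hypothetical, and the model assumes a single-cycle path in which the programmable delay adds directly to the data path delay.

```python
from dataclasses import dataclass


@dataclass
class Path:
    period_ps: float  # clock period
    skew_ps: float    # capture clock arrival minus launch clock arrival
    data_ps: float    # data path delay without the programmable delay
    setup_ps: float   # setup requirement of the capturing flip-flop
    hold_ps: float    # hold requirement of the capturing flip-flop


def setup_slack(p: Path, delay_ps: float) -> float:
    # Data must arrive before the next capture edge minus the setup time.
    return p.period_ps + p.skew_ps - (p.data_ps + delay_ps) - p.setup_ps


def hold_slack(p: Path, delay_ps: float) -> float:
    # Data must not change until the hold time after the same capture edge.
    return (p.data_ps + delay_ps) - p.skew_ps - p.hold_ps


# Hypothetical paths: a short path to a proximal die, a long path to a distal die.
proximal = Path(period_ps=500.0, skew_ps=60.0, data_ps=30.0, setup_ps=30.0, hold_ps=20.0)
distal = Path(period_ps=500.0, skew_ps=10.0, data_ps=450.0, setup_ps=30.0, hold_ps=20.0)

LOW_DELAY_PS, HIGH_DELAY_PS = 0.0, 80.0  # hypothetical delay settings

if __name__ == "__main__":
    for name, path in (("proximal", proximal), ("distal", distal)):
        for delay in (LOW_DELAY_PS, HIGH_DELAY_PS):
            print(name, delay, setup_slack(path, delay), hold_slack(path, delay))
```

In this model, adding delay raises hold slack and lowers setup slack by the same amount, so with these hypothetical numbers the proximal path only meets hold timing at the high setting and the distal path only meets setup timing at the low setting.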
As shown in
The systems described herein can perform step 610 in a variety of ways. As shown in
Returning to
The systems described herein can perform step 620 of
Returning to
The systems described herein can perform step 630 of
In one example, the method of manufacturing can further include performing a pre-silicon validation of the integrated circuit by calculating a clock skew of the integrated circuit based on a position of the integrated circuit within a stack of dies. In this example, the method of manufacturing can set the configuration bit of the programmable delay element. Additionally, the method of manufacturing can further include performing a post-silicon validation of the integrated circuit by stacking the integrated circuit in the stack of dies, testing the timing of the integrated circuit within the stack of dies, and reprogramming the configuration bit of the programmable delay element based on the result. In non-limiting examples, the term “pre-silicon validation” refers to a process of verifying that the design of a chip or a die meets specifications prior to manufacturing. For example, a pre-silicon timing report can be generated using simulations of a computing component to determine if setup and hold timing meet chip requirements. In non-limiting examples, the term “post-silicon validation” refers to a process of verifying that the functionality of a chip or a die meets specifications after manufacturing.
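The two validation stages can be sketched as one selection routine applied twice, first to simulated slacks and then to measured slacks. This is a hedged illustration rather than the disclosed procedure: the one-bit encoding, the `choose_bit` helper, the rule of maximizing the worst-case slack, and all slack values are assumptions made for the example.

```python
# Hypothetical one-bit encoding of the programmable delay setting.
LOW_DELAY, HIGH_DELAY = 0, 1


def choose_bit(slacks_ps: dict[int, tuple[float, float]]) -> int:
    """Pick the configuration bit with the largest worst-case slack.

    `slacks_ps` maps each bit value to (setup_slack_ps, hold_slack_ps).
    The selection rule is an assumption of this sketch.
    """
    return max(slacks_ps, key=lambda bit: min(slacks_ps[bit]))


# Pre-silicon: slacks taken from a simulated timing report (hypothetical values).
simulated = {LOW_DELAY: (120.0, 15.0), HIGH_DELAY: (40.0, 95.0)}
pre_silicon_bit = choose_bit(simulated)

# Post-silicon: slacks measured on the stacked die (hypothetical values).
measured = {LOW_DELAY: (60.0, 25.0), HIGH_DELAY: (-20.0, 105.0)}
post_silicon_bit = choose_bit(measured)

# Reprogram only when testing disagrees with the pre-silicon choice.
needs_reprogramming = post_silicon_bit != pre_silicon_bit

if __name__ == "__main__":
    print(pre_silicon_bit, post_silicon_bit, needs_reprogramming)
```

With these example numbers, simulation favors the high delay, while measurement on the stacked die shows a setup violation at that setting, so the bit would be reprogrammed to the low delay.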
Subsequently, the method of manufacturing disclosed herein can perform a post-silicon validation 704 of
In some examples, the disclosed method can stack dies 106(1)-(3) of
In some examples, the disclosed method can include blowing fuses differently for different components, with some components having one set of fuses with a set delay and other components having a different set of fuses with a different delay. In additional examples, the pre-silicon validation and the post-silicon validation can improve the binning process to categorize each die and ensure the die is within parameters for a particular stack or computing component. In further examples, the disclosed apparatuses and systems can be used for two-dimensional configurations, other three-dimensional technologies like hybrid bonded memory, other data broadcast or data transmission technologies like bus communication, and/or any other suitable computing systems. Furthermore, although described as a data-based delay system, the disclosed systems and apparatuses can be implemented using clocking-based delay systems, wherein clocks are delayed to macros.
As described above, the disclosed apparatuses, systems, and methods balance timing closure of integrated circuit dies to ensure closure of both setup and hold timing for a three-dimensional die stack. Accordingly, the implementations and systems described herein incorporate a programmable delay element into each stacked die to uniquely control a delay of data transmission for each die in the stack. The programmable delay element enables quick what-if analyses when running timing tests prior to manufacturing. For example, the programmable delay element enables faster iterations of pre-silicon simulations to generate timing reports. The disclosed programmable delay element also includes a configuration bit that can be used to adjust the delay after manufacturing is complete, thereby enabling adjustment of the delay based on a position of the die after being stacked. In other words, the programmable delay enables testing of a chip design by setting the configuration bit that changes the delay to optimize the chip. Rather than estimating the appropriate delays for dies, the disclosed methods enable independently programmable delays to be uniquely determined for each die layer to simultaneously close setup and hold timing at high speeds. Thus, the disclosed three-dimensional integrated circuit and the integrated circuit die can provide flexibility and speed while improving timing closure.
While the foregoing disclosure sets forth various implementations using specific block diagrams, flowcharts, and examples, each block diagram component, flowchart step, operation, and/or component described and/or illustrated herein can be implemented, individually and/or collectively, using a wide range of hardware, software, or firmware (or any combination thereof) configurations. In addition, any disclosure of components contained within other components should be considered exemplary in nature since many other architectures can be implemented to achieve the same functionality.
In some examples, all or a portion of exemplary three-dimensional integrated circuit 100 in
In some examples, all or a portion of exemplary three-dimensional integrated circuit 100 in
The process parameters and sequence of steps described and/or illustrated herein are given by way of example only and can be varied as desired. For example, while the steps illustrated and/or described herein can be shown or discussed in a particular order, these steps do not necessarily need to be performed in the order illustrated or discussed. The various example methods described and/or illustrated herein can also omit one or more of the steps described or illustrated herein or include additional steps in addition to those disclosed.
The preceding description has been provided to enable others skilled in the art to best utilize various aspects of the example implementations disclosed herein. This example description is not intended to be exhaustive or to be limited to any precise form disclosed. Many modifications and variations are possible without departing from the spirit and scope of the present disclosure. The implementations disclosed herein should be considered in all respects illustrative and not restrictive. Reference should be made to the appended claims and their equivalents in determining the scope of the present disclosure.
Unless otherwise noted, the terms “connected to” and “coupled to” (and their derivatives), as used in the specification and claims, are to be construed as permitting both direct and indirect (i.e., via other elements or components) connection. In addition, the terms “a” or “an,” as used in the specification and claims, are to be construed as meaning “at least one of.” Finally, for ease of use, the terms “including” and “having” (and their derivatives), as used in the specification and claims, are interchangeable with and have the same meaning as the word “comprising.”