Digital systems, such as memory devices, continue to operate at higher and higher speeds. Various signal lines that carry digital signals may exhibit low-pass filter (LPF) characteristics, either due to increasing channel loss with frequency, or through capacitive filtering. In addition, process and temperature variance can also impact the speed at which circuitry is capable of operating. Thus, the maximum data rate supported by a channel becomes limited. Existing solutions to compensate for channel data rate limitations include various equalization techniques, which add complex circuitry and may not effectively improve channel data rate in many circumstances.
This disclosure describes examples of apparatuses and methods using a neural network during a write or programming operation to precondition transmitted write data signals. Preconditioning may include modifying the shape of a transmitted signal such that the properties (e.g., capacitance, circuit switching speed, etc.) of the signal line cause the transmitted signal to be received and stored at the memory cell array with a desired shape. Preconditioning may include pre-emphasis or de-emphasis of the signal shape. Pre-emphasis refers to increasing the amplitude of a digital signal by providing, at every bit transition, an overshoot that becomes filtered by the capacitive effects of the signal line. De-emphasis refers to a complementary process of decreasing the amplitude of a digital signal, where at every bit transition a full rail-to-rail swing between a high supply voltage (VDDQ, VDD) and low supply voltage (VSSQ, VSS) is provided.
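By way of illustration only, the effect of pre-emphasis on a capacitive signal line can be sketched numerically. The sketch below assumes a first-order low-pass model of the signal line and a one-sample overshoot at each bit transition; the function names, the `alpha` filter constant, and the `boost` value are hypothetical and are not taken from this disclosure.

```python
def low_pass(samples, alpha):
    """First-order low-pass model of the line: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


def pre_emphasize(bits, boost, samples_per_bit, high=1.0, low=0.0):
    """Drive each bit at its level and add an overshoot on the first sample
    after every bit transition (boost=0.0 disables the overshoot)."""
    samples, previous = [], None
    for bit in bits:
        level = high if bit else low
        for k in range(samples_per_bit):
            overshoot = 0.0
            if k == 0 and previous is not None and bit != previous:
                overshoot = boost if bit else -boost
            samples.append(level + overshoot)
        previous = bit
    return samples


bits = [0, 1]
plain = low_pass(pre_emphasize(bits, 0.0, 4), alpha=0.25)
boosted = low_pass(pre_emphasize(bits, 0.5, 4), alpha=0.25)
# With the overshoot, the filtered signal ends the bit period closer to the high level.
print(plain[-1], boosted[-1])
```

In this simplified model the overshoot is absorbed by the filter, so the received waveform settles toward the intended level sooner than the unmodified waveform.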
One conventional way to implement de-emphasis/pre-emphasis is to utilize a delay chain to sequentially turn on or turn off the legs of a pull-up and/or pull-down circuit of a voltage driver. This causes a dynamic change in the driver output impedance, which can degrade signal integrity. Furthermore, de-emphasis/pre-emphasis is typically asymmetric, either strengthening pull-up from VSSQ or pull-down from VDDQ. The use of the neural network may mitigate the negative impacts of these conventional approaches.
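The dynamic impedance change noted above can be illustrated with a simple model. The sketch below assumes that the driver legs are identical resistive legs connected in parallel and that the delay chain enables one additional leg per delay step; the 240-ohm leg resistance and the leg counts are illustrative assumptions only.

```python
def driver_impedance(leg_resistance_ohms, legs_enabled):
    """Output impedance of identical parallel legs: R / k (open circuit if none enabled)."""
    if legs_enabled <= 0:
        return float("inf")
    return leg_resistance_ohms / legs_enabled


def delay_chain_schedule(total_legs, initial_legs):
    """Number of enabled legs at each delay step as the chain turns on one more leg per step."""
    return list(range(initial_legs, total_legs + 1))


# Hypothetical driver: six 240-ohm legs, four enabled at the start of the transition.
impedances = [driver_impedance(240, k) for k in delay_chain_schedule(6, 4)]
print(impedances)  # [60.0, 48.0, 40.0]
```

Under these assumptions the output impedance moves from 60 ohms to 40 ohms during a single transition, which is the kind of variation that can degrade signal integrity.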
The following detailed description illustrates a few exemplary embodiments in further detail to enable one of skill in the art to practice such embodiments. The described examples are provided for illustrative purposes and are not intended to limit the scope of the invention. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the described embodiments. It will be apparent to one skilled in the art, however, that other embodiments of the present invention may be practiced without some of these specific details.
In some embodiments, the semiconductor device 100 may include, without limitation, a DRAM device, such as a DDR3 or DDR4 device integrated into a single semiconductor chip, for example. The die may be mounted on an external substrate, for example, a memory module substrate, a mother board or the like. The semiconductor device 100 may further include a memory array 150. The memory array 150 includes a plurality of banks, each bank including a plurality of word lines WL, a plurality of bit lines BL, and a plurality of memory cells MC arranged at intersections of the plurality of word lines WL and the plurality of bit lines BL. The selection of the word line WL is performed by a row decoder 140 and the selection of the bit line BL is performed by a column decoder 145. Sense amplifiers (SA) are located for their corresponding bit lines BL and connected to at least one respective local I/O line, which is in turn coupled to a respective one of at least two main I/O line pairs, via transfer gates (TG), which function as switches.
The semiconductor device 100 may employ a plurality of external terminals that include address and command terminals coupled to a command/address bus (C/A); clock terminals CK and /CK; data terminals DQ (e.g., pads or pins coupling the semiconductor device 100 to the data bus); a data strobe signal terminal DQS (e.g., pads or pins coupling the semiconductor device 100 to the data strobe signal line); a data mask signal terminal DM (e.g., pads or pins coupling the semiconductor device 100 to the data mask signal line); power supply terminals for high voltage supply VDD, low voltage supply VSS, high voltage supply for input/output signals VDDQ, and low voltage supply for input/output signals VSSQ; and an output impedance control ZQ calibration terminal (ZQ).
The command/address terminals may be supplied with an address signal and a bank address signal from outside. The address signal and the bank address signal supplied to the address terminals are transferred, via the address/command input circuit 105, to an address decoder 110. The address decoder 110 receives the address signal and supplies a decoded row address signal to the row decoder 140, and a decoded column address signal to the column decoder 145. The address decoder 110 also receives the bank address signal and supplies the bank address signal to the row decoder 140 and the column decoder 145.
The command/address terminals may further be supplied with a command signal from outside, such as, for example, a memory controller. The command signal may be provided, via the C/A bus, to the command decoder 115 via the address/command input circuit 105. The command decoder 115 decodes the command signal to generate various internal commands that include a row command signal to select a word line and a column command signal, such as a read command or a write command, to select a bit line.
Accordingly, when a read command is issued and a row address and a column address are timely supplied with the read command, read data is read from a memory cell in the memory array 150 designated by these row address and column address. The read data DQ is output to the outside from the data terminals DQ, DQS, and DM via read/write amplifiers 155 and an input/output circuit 160. Similarly, when the write command is issued and a row address and a column address are timely supplied with this command, and then write data is supplied to the data terminals DQ, DQS, DM, the write data is received by data receivers in the input/output circuit 160, and supplied via the input/output circuit 160 and the read/write amplifiers 155 to the memory array 150 and written in the memory cell designated by the row address and the column address.
When write data DQ is received at receivers of the input/output circuit 160, it would typically be passed onto the read/write amplifiers 155 to be written into the memory array 150. However, to improve a channel data rate for transferring the data onto the read/write amplifiers 155 to be written into the memory array 150, the write data signals may be processed through a neural network of the preconditioning control circuit 125 to modify the shape of the write data signals such that the properties (e.g., capacitance, circuit switching speed, etc.) of the write data channels cause the write data signals to be received at the memory cell array 150 with a desired shape. Preconditioning may include pre-emphasis or de-emphasis of the signal shape.
The neural network may be trained to set coefficients that are applied to the write data signals to compensate for channel characteristics of the write data channels and corresponding circuitry. For example, using the following equation:
Turning to the explanation of the external terminals included in the semiconductor device 100, the clock terminals CK and /CK are supplied with an external clock signal and a complementary external clock signal, respectively. The external clock signals (including complementary external clock signal) may be supplied to a clock input circuit 120. The clock input circuit 120 may receive the external clock signals to generate an internal clock signal ICLK. The internal clock signal ICLK is supplied to an internal clock generator 130 and thus a phase controlled internal clock signal LCLK is generated based on the received internal clock signal ICLK and a clock enable signal CKE from the address/command input circuit 105. Although not limited thereto, a DLL circuit can be used as the internal clock generator 130. The phase controlled internal clock signal LCLK is supplied to the input/output circuit 160 and is used as a timing signal for determining an output timing of read data. The internal clock signal ICLK is also supplied to a timing generator 135 and thus various internal clock signals can be generated.
The power supply terminals are supplied with power supply potentials VDD and VSS. These power supply potentials VDD and VSS are supplied to an internal voltage generator circuit 170. The internal voltage generator circuit 170 generates various internal potentials VPP, VOD, VARY, VPERI, and the like and a reference potential ZQVREF based on the power supply potentials VDD and VSS. The internal potential VPP is mainly used in the row decoder 140, the internal potentials VOD and VARY are mainly used in the sense amplifiers included in the memory array 150, and the internal potential VPERI is used in many other circuit blocks. The reference potential ZQVREF is used in the ZQ calibration circuit 165.
The power supply terminals are also supplied with power supply potentials VDDQ and VSSQ. These power supply potentials VDDQ and VSSQ are supplied to the input/output circuit 160. The power supply potentials VDDQ and VSSQ are the same potentials as the power supply potentials VDD and VSS, respectively. However, the dedicated power supply potentials VDDQ and VSSQ are used for the input/output circuit 160 so that power supply noise generated by the input/output circuit 160 does not propagate to the other circuit blocks.
The calibration terminal ZQ is connected to the ZQ calibration circuit 165. The ZQ calibration circuit 165 performs a calibration operation with reference to an impedance of RZQ, and the reference potential ZQVREF, when activated by the ZQ calibration command signal (ZQ_com). An impedance code ZQCODE obtained by the calibration operation is supplied to the input/output circuit 160, and thus an impedance of an output buffer (not shown) included in the input/output circuit 160 is specified.
In some examples, the preconditioning time for modifying the shape of the write data DQ may be controlled via external control signals, such as those generated by the preconditioning control logic 210. This may include adjustment of the preconditioning time, as well as the preconditioning amplitude adjustment magnitude. The memory controller 205 may include an external controller, such as a processor, to control preconditioning operations.
In some embodiments, memory controller 205 may optionally include a training circuit 235. The training circuit 235 may be configured to train the neural network of the preconditioning control circuit 220 to precondition the write data signals prior to storage at the memory array 230, based on, without limitation, data eye optimization, reference voltage (amplitude) calibration, and write data training. In some embodiments, the training circuit 235 may also optionally be connected to the preconditioning control logic 210. Accordingly, in some embodiments, the preconditioning control signals may be adjusted to adjust the coefficients of the neural network based on input from the training circuit 235, such as, for example, in data eye optimization. In further embodiments, data eye optimization may include first identifying a preconditioning amplitude adjustment direction (e.g., de-emphasis or pre-emphasis) and magnitude providing the best data eye for a given channel, such as, without limitation, a data path for data in the memory array 230. To perform the training, cells in a code word may be programmed, and then a real voltage may be measured. During the training, the input to the neural network may be an expected output and the target output may be the write data signals. The training circuit 235 may compare the real voltage against the target output to determine the converged difference estimation (e.g., the target input minus the output of the neural network). The training circuit 235 may modify the coefficients of the neural network until the target output is met, and the converged difference and the coefficients may be stored in coefficient data memory.
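As a non-limiting sketch of the iterative coefficient adjustment described above, the following example repeatedly compares a computed output against a target output and modifies the coefficients until the difference falls below a tolerance. A least-mean-squares (LMS) update on a single linear stage is used here purely as a stand-in, since the training algorithm is not specified in this description; the function name, learning rate, tolerance, and sample data are all hypothetical.

```python
def train_coefficients(inputs, targets, n_taps=2, rate=0.1, tol=1e-6, max_epochs=10000):
    """Adjust coefficients until the worst-case difference (target minus output)
    observed in one pass over the data is below tol.
    Returns the coefficients and that converged difference."""
    coeffs = [0.0] * n_taps
    worst = float("inf")
    for _ in range(max_epochs):
        worst = 0.0
        for x, target in zip(inputs, targets):
            output = sum(c * v for c, v in zip(coeffs, x))
            difference = target - output
            worst = max(worst, abs(difference))
            # LMS step: move each coefficient in proportion to the difference.
            coeffs = [c + rate * difference * v for c, v in zip(coeffs, x)]
        if worst < tol:
            break
    return coeffs, worst


# Hypothetical training data generated from known coefficients (1.25, -0.25).
inputs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0)]
targets = [1.25 * a - 0.25 * b for a, b in inputs]
coefficients, converged_difference = train_coefficients(inputs, targets)
# Stand-in for the coefficient data memory mentioned in the text.
coefficient_memory = {"coefficients": coefficients, "difference": converged_difference}
```

In this sketch the stored pair of coefficients and converged difference plays the role of the values retained in coefficient data memory after training.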
In some examples, the preconditioning time for modifying the shape of the write data DQ may be controlled via external control signals. This may include adjustment of the preconditioning time, as well as the preconditioning amplitude adjustment by changing the coefficients applied to the write data signals by the neural network.
The processing unit 405 may receive input data (e.g. X_1/2/N (n)) 410a-c from a computing system, such as a host computing device. In some examples, the input data 410a-c may be write data associated with write or programming operations at a memory. The processing unit 405 may include multiplication unit/accumulation units 412a-c, 416a-c and memory lookup units 414a-c, 418a-c that, when mixed with coefficient data retrieved from the memory 430, may generate output data (e.g. Y_1/2/N (n)) 420a-c. In some examples, the output data 420a-c may be utilized as input data for another processing stage or as output data, such as one or more channel characteristics associated with the channel within the memory. In other words, the processing unit 405 can include one or more stages of a neural network, such that the processing unit 405 receives input data 410a-c comprising data associated with write or programming operations and generates output data 420a-c comprising one or more characteristics of the channels via which the write or programming operations are performed.
In implementing one or more processing units 405, a computer-readable medium at an electronic device may execute respective control instructions to perform operations through executable preconditioning control instructions 415 within a processing unit 405. For example, the control instructions provide instructions to the processing unit 405 that, when executed by the electronic device, cause the processing unit 405 to configure the multiplication units 412a-c to multiply input data 410a-c with coefficient data and accumulation units 416a-c to accumulate processing results to generate the output data 420a-c.
The multiplication units/accumulation units 412a-c, 416a-c multiply two operands from the input data 410a-c to generate a multiplication processing result that is accumulated by the accumulation unit portion of the multiplication units/accumulation units 412a-c, 416a-c. The multiplication units/accumulation units 412a-c, 416a-c add the multiplication processing result to update the processing result stored in the accumulation unit portion, thereby accumulating the multiplication processing result. For example, the multiplication unit/accumulation units 412a-c, 416a-c may perform a multiply-accumulate operation such that two operands, M and N, are multiplied and then added with P to generate a new version of P that is stored in its respective multiplication unit/accumulation units. The memory look-up units 414a-c, 418a-c retrieve coefficient data stored in memory 430. For example, the memory look-up unit can be a table look-up that retrieves a specific coefficient. The output of the memory look-up units 414a-c, 418a-c is provided to the multiplication unit/accumulation units 412a-c, 416a-c, where it may be utilized as a multiplication operand in the multiplication unit portion of the multiplication units/accumulation units 412a-c, 416a-c. Using such a circuitry arrangement, the output data (e.g. Y_1/2/N (n)) 420a-c may be generated from the input data (e.g. X_1/2/N (n)) 410a-c.
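The multiply-accumulate operation and coefficient look-up described above may be sketched in software as follows. This is an illustrative model only: the class name, the dictionary used as a stand-in for the memory 430, and the coefficient values are hypothetical.

```python
class MultiplyAccumulateUnit:
    """Holds a running result P and updates it as P = P + M * N."""

    def __init__(self):
        self.p = 0.0

    def mac(self, m, n):
        self.p += m * n
        return self.p


# Stand-in for coefficient data held in memory; the values are illustrative only.
COEFFICIENT_TABLE = {0: 0.5, 1: -0.25, 2: 1.0}


def look_up(index):
    """Table look-up that retrieves one specific coefficient."""
    return COEFFICIENT_TABLE[index]


unit = MultiplyAccumulateUnit()
for index, sample in enumerate([2.0, 4.0, 1.0]):
    # The looked-up coefficient is one multiplication operand, the input sample the other.
    unit.mac(look_up(index), sample)
print(unit.p)  # 0.5*2.0 - 0.25*4.0 + 1.0*1.0 = 1.0
```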
In some examples, coefficient data, for example from memory 430, can be mixed with the input data X_1/2/N (n) 410a-c to generate the output data Y_1/2/N (n) 420a-c.
As described above, the memory look-up units 414a-c, 418a-c retrieve coefficients to mix with the input data. Accordingly, the output data may be provided by manipulating the input data with multiplication/accumulation units using a set of coefficients stored in the memory associated with characteristic of a write data channel at the memory. The resulting mapped data may be manipulated by additional multiplication/accumulation units using additional sets of coefficients stored in the memory associated with the characteristic of the channel. The sets of coefficients multiplied at each stage of the processing unit 405 may represent or provide an estimation of the processing of the input data in specifically-designed hardware (e.g., an FPGA). Further, it can be shown that the system 400 may approximate any nonlinear mapping with arbitrarily small error in some examples and the mapping of system 400 is determined by the coefficients. For example, if such coefficient data is specified, any mapping and processing between the input data X_1/2/N (n) 410a-c and the output data Y_1/2/N (n) 420a-c may be accomplished by the system 400. Such a relationship, as derived from the circuitry arrangement depicted in system 400, may be used to train an entity of the computing system 400 to generate coefficient data. For example, an entity of the computing system 400 may compare input data to the output data to generate the coefficient data.
In the example of system 400, the processing unit 405 mixes the coefficient data with the input data X_1/2/N (n) 410a-c utilizing the memory look-up units 414a-c, 418a-c. In some examples, the memory look-up units 414a-c, 418a-c can be referred to as table look-up units. The coefficient data may be associated with a mapping relationship for the input data X_1/2/N (n) 410a-c to the output data Y_1/2/N (n) 420a-c. For example, the coefficient data may represent non-linear mappings of the input data X_1/2/N (n) 410a-c to the output data Y_1/2/N (n) 420a-c. In some examples, the non-linear mappings of the coefficient data may represent a Gaussian function, a piecewise linear function, a sigmoid function, a thin-plate-spline function, a multi-quadratic function, a cubic approximation, an inverse multi-quadratic function, or combinations thereof. In some examples, some or all of the memory look-up units 414a-c, 418a-c may be deactivated. For example, one or more of the memory look-up units 414a-c, 418a-c may operate as a gain unit with the unity gain. In such a case, the instructions (e.g., executable instructions 415) may be executed to facilitate selection of a unity gain processing mode for some or all of the memory look-up units 414a-c, 418a-c.
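For illustration, several of the non-linear mappings named above can be tabulated and retrieved with a table look-up, alongside a unity-gain mode that passes the input through unchanged. The formulas below are the standard textbook forms of these functions; the parameter defaults, table range, and function names are assumptions made for this sketch and do not describe the contents of the memory 430.

```python
import math


def gaussian(r, sigma=1.0):
    return math.exp(-(r * r) / (2.0 * sigma * sigma))


def sigmoid(r):
    return 1.0 / (1.0 + math.exp(-r))


def multi_quadratic(r, c=1.0):
    return math.sqrt(r * r + c * c)


def inverse_multi_quadratic(r, c=1.0):
    return 1.0 / math.sqrt(r * r + c * c)


def thin_plate_spline(r):
    return 0.0 if r == 0 else r * r * math.log(abs(r))


def unity_gain(r):
    """Deactivated look-up: the input passes through unchanged."""
    return r


def build_table(fn, lo, hi, steps):
    """Sample fn at evenly spaced points in [lo, hi]."""
    step = (hi - lo) / (steps - 1)
    return [fn(lo + i * step) for i in range(steps)]


def table_look_up(table, lo, hi, r):
    """Return the nearest tabulated entry, clamping r to the table range."""
    position = round((r - lo) / (hi - lo) * (len(table) - 1))
    position = min(max(position, 0), len(table) - 1)
    return table[position]


sigmoid_table = build_table(sigmoid, -4.0, 4.0, 81)
print(table_look_up(sigmoid_table, -4.0, 4.0, 0.0))
```

A finer table trades memory for accuracy; the nearest-entry look-up shown here is one simple choice among several possible interpolation schemes.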
Each of the multiplication unit/accumulation units 412a-c, 416a-c may include multiple multipliers, multiple accumulation units, and/or multiple adders. Any one of the multiplication units/accumulation units 412a-c, 416a-c may be implemented using an arithmetic logic unit (ALU). In some examples, any one of the multiplication units/accumulation units 412a-c, 416a-c can include one multiplier and one adder that each perform, respectively, multiple multiplications and multiple additions. The input-output relationship of a multiplication/accumulation unit 412, 416 may be represented as:
$$B_{out} = \sum_{i=1}^{I} C_i \cdot B_{in}(i)$$
where I represents the number of multiplications performed in that unit, C_i represents the coefficients, which may be accessed from a memory, such as memory 430, and B_in(i) represents a factor from either the input data X_1/2/N (n) 410a-c or an output from multiplication units/accumulation units 412a-c, 416a-c. In an example, the output of a set of multiplication units/accumulation units, B_out, equals the sum of the coefficient data, C_i, multiplied by the output of another set of multiplication unit/accumulation units, B_in(i). B_in(i) may also be the input data such that the output of a set of multiplication unit/accumulation units, B_out, equals the sum of coefficient data, C_i, multiplied by input data.
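The weighted-sum relationship described above, in which the output of one set of units may serve as the input to another, can be sketched as two cascaded stages. The function name and the coefficient values below are hypothetical and serve only to make the arithmetic concrete.

```python
def mac_stage(coefficient_rows, b_in):
    """For each unit in the stage, return the sum over i of C_i * B_in(i)."""
    return [sum(c * b for c, b in zip(row, b_in)) for row in coefficient_rows]


# Illustrative coefficients: a two-unit first stage feeding a one-unit second stage.
stage_one = [[1.0, 0.0], [0.5, 0.5]]
stage_two = [[1.0, -1.0]]

input_data = [2.0, 4.0]
intermediate = mac_stage(stage_one, input_data)   # [2.0, 3.0]
output_data = mac_stage(stage_two, intermediate)  # [-1.0]
print(intermediate, output_data)
```

Here the first stage takes the input data as B_in(i), while the second stage takes the first stage's outputs as B_in(i).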
The method 500 includes, at 502, generating a write data training dataset based on a characteristic of a write data channel of a memory, wherein the training dataset comprises correlations between the write data for the channel and the characteristic of the write data channel.
The method 500 further includes training a neural network of a write data preconditioning circuit of a memory using the write data training dataset to determine a channel characteristic of the write data channel based on write data for the write data channel, at 504. In some examples, the write data training dataset is associated with a codeword of the memory. In some examples, the method 500 further includes applying the trained neural network to determine the characteristic of the write data channel based on the write data, and modifying one or more coefficient values of the neural network associated with the write data channel based on the determined channel characteristic.
In some examples, the method 500 further includes generating a write data training dataset using the write data, testing the trained neural network using the write data training dataset, and retraining the trained neural network using a different training dataset when the trained neural network does not exceed a threshold accuracy level.
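The train, test, and retrain flow of the method 500 may be outlined as a simple control loop. In the sketch below, `train_fn` and `test_fn` are hypothetical callables standing in for the training and accuracy-testing steps, and the choice to test every attempt against one fixed test dataset is an assumption of this sketch.

```python
def train_until_accurate(train_fn, test_fn, training_datasets, test_dataset, threshold):
    """Train on each dataset in turn until the tested accuracy exceeds threshold.
    Returns (model, accuracy, number_of_attempts)."""
    if not training_datasets:
        raise ValueError("at least one training dataset is required")
    model, accuracy, attempts = None, 0.0, 0
    for dataset in training_datasets:
        attempts += 1
        model = train_fn(model, dataset)          # train, or retrain with a different dataset
        accuracy = test_fn(model, test_dataset)   # test the trained network
        if accuracy > threshold:
            break
    return model, accuracy, attempts


# Toy stand-ins: the "model" is just the dataset size, and accuracy is size / 10.
result = train_until_accurate(
    train_fn=lambda model, dataset: len(dataset),
    test_fn=lambda model, test_dataset: model / 10,
    training_datasets=[[0] * 5, [0] * 9],
    test_dataset=None,
    threshold=0.8,
)
print(result)  # (9, 0.9, 2)
```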
The method 600 includes receiving, at a receiver circuit of a memory, write data via a data terminal, at 602.
The method 600 further includes preconditioning, via a neural network of a preconditioning circuit of the memory, a write data signal corresponding to the write data based on a characteristic of a write data path to provide a modified write data signal, at 604. The write data path includes at least the propagation path of write data as it travels through a semiconductor device from the data terminal where it is received to physical storage at a memory cell array, including input/output circuitry (e.g., buffers), read/write circuitry, sense amplifiers, transfer gates, etc. In some examples, the method 600 further includes modifying the write data signal based on one or more coefficient values selected based on the characteristic of the write data path. In some examples, the one or more coefficient values are determined during training of the neural network by writing test write data to the memory array. In some examples, the method 600 further includes causing, via the neural network, an amplitude of the write data signal to be increased to provide the modified write data signal. In some examples, the method 600 further includes causing, via the neural network, an amplitude of the write data signal to be decreased to provide the modified write data signal. In some examples, the characteristic of the write data path includes a capacitance, process variation of circuit components of the memory array, or any combination thereof.
The method 600 further includes storing, at a memory array, the write data based on the modified write data signal, at 606.
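The three steps of the method 600 (receive at 602, precondition at 604, store at 606) can be outlined in a highly simplified form. The sketch below replaces the neural network with a single amplitude coefficient selected by a path characteristic; the coefficient table, the characteristic labels, and the 0.5 decision threshold are hypothetical values chosen for illustration.

```python
# Hypothetical coefficients selected by a write data path characteristic.
COEFFICIENTS_BY_PATH = {"high_capacitance": 1.2, "low_capacitance": 0.9}


def receive(raw_samples):
    """Step 602: capture the write data signal at the receiver."""
    return list(raw_samples)


def precondition(signal, path_characteristic):
    """Step 604: increase or decrease the amplitude using the selected coefficient."""
    coefficient = COEFFICIENTS_BY_PATH[path_characteristic]
    return [coefficient * sample for sample in signal]


def store(memory_array, address, modified_signal, threshold=0.5):
    """Step 606: resolve the modified signal into bits and store them."""
    memory_array[address] = [1 if sample > threshold else 0 for sample in modified_signal]


memory_array = {}
signal = receive([0.1, 0.45, 0.9])
store(memory_array, 0, precondition(signal, "high_capacitance"))
print(memory_array[0])  # [0, 1, 1]
```

In this toy example the middle sample (0.45) would resolve to 0 without preconditioning and resolves to 1 after its amplitude is increased.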
While certain features and aspects have been described with respect to exemplary embodiments, one skilled in the art will recognize that various modifications and additions can be made to the embodiments discussed without departing from the scope of the invention. Although the embodiments described above refer to features, the scope of this invention also includes embodiments having different combinations of features and embodiments that do not include all of the above-described features. For example, the methods and processes described herein may be implemented using hardware components, software components, and/or any combination thereof. Further, while various methods and processes described herein may be described with respect to particular structural and/or functional components for ease of description, methods provided by various embodiments are not limited to any particular structural and/or functional architecture, but instead can be implemented on any suitable hardware, firmware, and/or software configuration. Similarly, while certain functionality is ascribed to certain system components, unless the context dictates otherwise, this functionality can be distributed among various other system components in accordance with the several embodiments.
Moreover, while the procedures of the methods and processes described herein are described in a particular order for ease of description, various procedures may be reordered, added, and/or omitted in accordance with various embodiments. The procedures described with respect to one method or process may be incorporated within other described methods or processes; likewise, hardware components described according to a particular structural architecture and/or with respect to one system may be organized in alternative structural architectures and/or incorporated within other described systems. Hence, while various embodiments are described with or without certain features for ease of description, the various components and/or features described herein with respect to a particular embodiment can be combined, substituted, added, and/or subtracted from among other described embodiments. Consequently, although several exemplary embodiments are described above, it will be appreciated that the invention is intended to cover all modifications and equivalents within the scope of the following claims.
This application claims the benefit under 35 U.S.C. § 119 of the earlier filing date of U.S. Provisional Application No. 63/507,174, filed Jun. 9, 2023, the entire contents of which are hereby incorporated by reference in their entirety for any purpose.
Number | Date | Country
---|---|---
63507174 | Jun 2023 | US