This application claims priority under 35 U.S.C. § 119 to Korean Patent Application No. 10-2023-0113668 filed on Aug. 29, 2023, and Korean Patent Application No. 10-2024-0040528 filed on Mar. 25, 2024, in the Korean Intellectual Property Office, the disclosures of which are incorporated by reference herein in their entireties.
Embodiments of the present disclosure described herein relate to a computing-in-memory device that generates a reference voltage by sharing charges of bit lines and a method of operating the same.
Since a memory supporting read/write operations and an operator supporting data operations are separated in a conventional computer structure, the energy consumed by data movement between the memory and the operator is very large compared to the energy consumed by the operation itself. In particular, a multiply-accumulate (MAC) operation, which is used in the convolution layers of the artificial neural networks widely used in modern applications, requires a huge amount of data. Therefore, much more energy is consumed in moving the data required for such an artificial neural network operation than in a general operation. To solve this issue, a memory technology called computing-in-memory (CIM) has been developed, which reduces data movement between the memory and the operator by adding computing operations to a memory that otherwise performs only read/write operations.
However, since an existing CIM memory requires a large number of registers and capacitors to generate a reference voltage corresponding to multiple bits, there is an issue that the area and power consumption of the memory increase.
Embodiments of the present disclosure provide a computing-in-memory device that generates a reference voltage by sharing charges of bit lines and an operating method thereof.
According to an embodiment of the present disclosure, a computing-in-memory device includes a memory cell array including a plurality of local arrays that generates a first output signal and a second output signal by performing a multiply-accumulate (MAC) operation on an input signal applied through a plurality of operation word lines and a stored weight, a reference voltage generator that generates a reference voltage by sharing charges of the first output signal with adjacent global bit lines, and an analog-to-digital converter that compares the reference voltage with the second output signal and converts the second output signal into a digital signal based on a result of the comparison.
According to an embodiment, the local array may include a local cell including a plurality of first transistors storing the weight and a second transistor computing the input signal and that generates a result of the MAC operation with respect to the input signal using the plurality of first transistors and the second transistor, and a peripheral circuit located below the local cell and that transfers the result of the MAC operation to a global bit line pair.
According to an embodiment, the local cell may generate a least significant bit by multiplying the input signal by the weight so as to transfer to a local bit line bar, and may generate a most significant bit by multiplying an inverted signal of the input signal by the weight so as to transfer to a cell strapping line and a common source line.
According to an embodiment, the peripheral circuit may be connected to the global bit line pair and may be commonly connected to a local bit line pair, the cell strapping line, and the common source line.
According to an embodiment, the peripheral circuit, when the most significant bit and the least significant bit are applied, may electrically connect the local bit line pair, the cell strapping line, and the common source line to convert the most significant bit and the least significant bit into an analog voltage.
According to an embodiment, the reference voltage generator may determine a first global bit line as a master bit line, and may determine a second global bit line adjacent to the first global bit line as a slave bit line.
According to an embodiment, the reference voltage generator may generate the reference voltage having 16 different voltage levels by sharing charges of the master bit line and the slave bit line.
According to an embodiment, the digital signal may be a 5-bit signal with respect to the second output signal.
According to an embodiment of the present disclosure, a method of operating a computing-in-memory device includes generating a first output signal and a second output signal by performing a multiply-accumulate (MAC) operation on an input signal applied through a plurality of operation word lines and a stored weight, generating a reference voltage by sharing charges of the first output signal with adjacent global bit lines, and comparing the reference voltage with the second output signal and converting the second output signal into a digital signal based on a result of the comparison.
According to an embodiment, the generating of the reference voltage may include determining a first global bit line as a master bit line, determining a second global bit line adjacent to the first global bit line as a slave bit line, and generating the reference voltage having 16 different voltage levels by sharing charges of the master bit line and the slave bit line.
According to an embodiment of the present disclosure, the computing-in-memory device that generates a reference voltage by sharing charges of bit lines may omit registers or capacitors by using bit lines to convert an analog signal into a digital signal. Accordingly, the power consumption and the design area of the computing-in-memory device may be minimized.
The above and other objects and features of the present disclosure will become apparent by describing in detail embodiments thereof with reference to the accompanying drawings.
Hereinafter, embodiments of the present disclosure are described in detail and clearly, to such an extent that a person of ordinary skill in the art may easily implement the present disclosure.
A computing-in-memory device 10 may include a memory cell array 100, a reference voltage generator 200, and an analog-to-digital converter 300.
The computing-in-memory device 10 according to an embodiment of the present disclosure may be implemented in a memory device. For example, the computing-in-memory device 10 may be used for in-memory computing in various devices such as a laptop computer, a desktop computer, a notebook computer, a tablet computer, a smart phone, a personal digital assistant (PDA), a wearable device, etc.
The memory cell array 100 may perform a multiply-accumulate (MAC) operation on an input signal applied through a plurality of operation word lines WLW and WLWb and a weight stored in advance.
In addition, the memory cell array 100 may be configured to output a first output signal and a second output signal according to the MAC operation to the global bit line pair GBL and GBLB, respectively.
In an embodiment, a local array of the memory cell array 100 may include a plurality of local cells. In this case, each of the plurality of local cells may store a weight. The weight may be stored in each local cell based on a word line that is separately connected to the local cell to store the weight rather than the plurality of operation word lines WLW and WLWb.
In an embodiment, the MAC operation performed by the memory cell array 100 may be performed simultaneously in the plurality of local cells included in the local array, and the memory cell array 100 may output the results of the MAC operation performed in each cell as the first output signal and the second output signal through the global bit line pair GBL and GBLB.
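As a rough illustration of the arithmetic performed by the MAC operation described above, the following Python sketch multiplies each input value by the corresponding stored weight and accumulates the products. The function name `mac` and the example values are hypothetical and are not taken from the disclosure; the sketch models only the arithmetic and not the analog behavior of the bit lines.

```python
from typing import Sequence


def mac(inputs: Sequence[int], weights: Sequence[int]) -> int:
    """Multiply each input by its stored weight and accumulate the products."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights))


# Hypothetical example: four local cells with 1-bit weights and 2-bit inputs.
example_inputs = [3, 1, 0, 2]
example_weights = [1, 0, 1, 1]
example_result = mac(example_inputs, example_weights)  # 3*1 + 1*0 + 0*1 + 2*1 = 5
```

In the device itself, the accumulation is described as occurring simultaneously across the local cells, whereas this sketch simply sums the products in software.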
The reference voltage generator 200 may receive the first output signal through the global bit line GBL and may generate the reference voltage with respect to the first output signal by repeating charge sharing following a precharge. Because the reference voltage is generated based on an analog signal of the memory cell array 100, it may serve as a reference voltage for reading the second output signal.
The reference voltage generator 200 may receive the first output signals of adjacent local cells on a master bit line “Master BL” and a slave bit line “Slave BL” and may generate a plurality of reference voltages by precharging or discharging each bit line pair.
For example, when the reference voltage generator 200 is connected to the memory cell array 100 with 32 global bit lines GBL, the reference voltage generator 200 may receive a first output signal pair of adjacent local cells on the master bit line “Master BL” and the slave bit line “Slave BL” and may generate a reference voltage having 16 voltage levels.
The analog-to-digital converter 300 may receive a reference voltage through the global bit line GBL and may receive a second output signal that is obtained by the MAC operation through the global bit line bar GBLB. Accordingly, the analog-to-digital converter 300 may output a digital value, which is a multi-bit signal with respect to the second output signal, by sequentially comparing the reference voltage with the second output signal.
For example, the analog-to-digital converter 300 may output a 5-bit digital value by comparing the reference voltage having 16 voltage levels with the second output signal. A more detailed description will be described in
As described above, the computing-in-memory device 10 according to the embodiment of the present disclosure may reduce the overall power consumption of the computing-in-memory device 10 by using the bit lines to convert an analog signal into a digital signal, and may reduce the design area and cost by omitting registers or capacitors.
The local array 110 may include local cells that store weights required for the MAC operation and a local MAC operation unit that performs the MAC operation on an analog voltage. Accordingly, the local array 110 may perform the MAC operation on weights stored in the cell in advance and the analog voltage corresponding to multi-bit input data.
As described above, the memory cell array 100 according to the embodiment of the present disclosure may perform the MAC operation within the memory, thereby increasing energy efficiency, and may reduce the cost and the design complexity of the computing-in-memory by generating a reference voltage obtained by utilizing the column direction structure of the local cells.
In an embodiment, the local array 110 may include a local cell 111 and a peripheral circuit 112. The local cell 111 may have a P-7T SRAM cell structure. In detail, the local cell 111 may have the P-7T SRAM cell structure that is implemented with six transistors for storing weights and one transistor for computing an input signal.
The P-7T SRAM cell structure may be a structure in which one NMOS transistor “N0” is additionally connected to a conventional 6T SRAM structure capable of only read/write operations, as illustrated. For example, in the P-7T SRAM cell, a source electrode and a drain electrode of the NMOS transistor “N0” may be connected to the cell strapping line CSS and the common source line CSL, respectively.
The peripheral circuit 112 is located below the plurality of local cells 111 and may transfer the MAC operation results of the local cells 111 to the global bit line pair GBL and GBLB. In an embodiment, the peripheral circuit 112 may be connected to the global bit line pair GBL and GBLB through two switches SW1 and SW2 and may be commonly connected to the local bit line pair LBL and LBLB, the cell strapping line CSS, and the common source line CSL.
In an embodiment, the peripheral circuit 112 may operate as a digital-to-analog converter (DAC). For example, the peripheral circuit 112 may convert the MAC operation results of the local cells 111 into an analog voltage and may transfer the analog voltage to the global bit line pair GBL and GBLB.
For example, the local cell 111 may precharge all three lines LBLB, CSS, and CSL to a power supply voltage VDD before starting the operation. In this case, when the least significant bit is “1” and the weight is “1”, a discharge path is formed through two transistors inside the local cell 111, so that the local bit line bar LBLB may be discharged to a ground voltage VSS. In addition, when the most significant bit is “1” and the weight is “1”, all three transistors N0, N1, and N2 are turned on, so that both the cell strapping line CSS and the common source line CSL may be discharged to the ground voltage VSS. In contrast, when the most significant bit is “1” and the weight is “0”, one transistor N2 is turned on and the remaining two transistors N0 and N1 are turned off, so that no discharge path is formed for the cell strapping line CSS and the common source line CSL, and thus the power supply voltage VDD may be maintained.
In an embodiment, the entire result of the operation of the local cells 111 may be accumulated and transferred to the peripheral circuit 112 through the local bit line pair LBL and LBLB, the cell strapping line CSS, and the common source line CSL.
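The discharge conditions described above for a single local cell can be summarized in a small behavioral model. The following Python sketch is only an idealized illustration: the function name `local_cell_lines`, the dictionary layout, and the supply value of 0.9 V are assumptions, and the accumulation of many cells on the shared lines is not modeled.

```python
from typing import Dict

VDD = 0.9  # assumed power supply voltage in volts (illustrative value)
VSS = 0.0  # ground voltage in volts


def local_cell_lines(msb: int, lsb: int, weight: int) -> Dict[str, float]:
    """Return idealized line voltages of one local cell after evaluation."""
    # All three lines are precharged to VDD before the operation starts.
    lines = {"LBLB": VDD, "CSS": VDD, "CSL": VDD}
    # LSB = 1 and weight = 1: LBLB is discharged to VSS.
    if lsb == 1 and weight == 1:
        lines["LBLB"] = VSS
    # MSB = 1 and weight = 1: CSS and CSL are both discharged to VSS.
    if msb == 1 and weight == 1:
        lines["CSS"] = VSS
        lines["CSL"] = VSS
    # Otherwise no discharge path is formed and VDD is maintained.
    return lines
```

For instance, calling `local_cell_lines(1, 0, 0)` leaves all three lines at VDD, which corresponds to the case in which the most significant bit is “1” and the weight is “0”.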
The reference voltage generator 200 may perform charge sharing by turning on or off a switch connecting the master bit line “Master BL” and the slave bit line “Slave BL” based on a plurality of control signals Pch and Dch and an initialization signal “Init”.
For example, the reference voltage generator 200 may generate a plurality of reference voltages REF [0:15] by repeatedly performing the operations of precharging the master bit line “Master BL” and discharging the slave bit line “Slave BL”.
In an embodiment, the reference voltage generator 200 may connect the master bit line “Master BL” to the slave bit line “Slave BL” to share the charges of the 16 precharged master bit lines “Master BL” and the 16 discharged slave bit lines “Slave BL”. In this case, since only half of the global bit lines among the entire global bit lines are precharged with the power supply voltage VDD, ½ of the power supply voltage VDD is generated by the law of conservation of charge.
Thereafter, the reference voltage generator 200 may again connect the master bit line “Master BL” and the slave bit line “Slave BL” to share charges between the eight discharged master bit lines “Master BL” and the eight precharged slave bit lines “Slave BL”. In detail, by connecting the eight slave bit lines “Slave BL” precharged with ½ of the power supply voltage VDD and the eight discharged master bit lines “Master BL”, ¼ of the power supply voltage VDD may be formed.
As described above, the reference voltage generator 200 may generate the plurality of reference voltages REF [0:15] having 16 voltage levels based on an operation of discharging or precharging the global bit line pair and reusing charges precharged on one bit line.
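The halving of the voltage by charge sharing described above follows from conservation of charge. The following Python sketch reproduces the ½ VDD and ¼ VDD steps under the assumption that all global bit lines have equal capacitance and that the sharing is ideal. The function name `share_charge` and the 900 mV supply value are illustrative assumptions, and the control signals Pch, Dch, and “Init” are not modeled.

```python
from fractions import Fraction
from typing import Sequence


def share_charge(voltages: Sequence[Fraction]) -> Fraction:
    """Voltage after connecting equal-capacitance lines (total charge conserved)."""
    # With equal capacitance C on each line, V = sum(C * V_i) / (n * C) = mean(V_i).
    return sum(voltages, Fraction(0)) / len(voltages)


VDD_MV = Fraction(900)  # assumed supply voltage in millivolts
GND_MV = Fraction(0)

# Step 1: 16 precharged master bit lines share charge with 16 discharged slave bit lines.
half_vdd = share_charge([VDD_MV] * 16 + [GND_MV] * 16)

# Step 2: 8 slave bit lines at VDD/2 share charge with 8 discharged master bit lines.
quarter_vdd = share_charge([half_vdd] * 8 + [GND_MV] * 8)
```

Under these assumptions, `half_vdd` evaluates to 450 mV and `quarter_vdd` to 225 mV. Exact fractions are used so that repeated halving does not accumulate floating-point error.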
In an embodiment, the multiplexer 310 may receive the reference voltage REF [0:15] generated by the reference voltage generator 200 and may output one reference voltage VREF. To this end, the multiplexer 310 may be a 16:1 multiplexer.
The comparator 320 may receive the reference voltage VREF at a first input terminal, may receive the MAC operation value at a second input terminal, and may determine one bit of the 4-bit signal by comparing voltage levels of the first input terminal and the second input terminal.
The decoder 330 may control the multiplexer 310 to change the reference voltage VREF that it outputs according to the comparison result of the comparator 320.
The shift register 340 may perform a shift operation on the 4-bit signal determined through the repeated operation of the reference voltage generator 200 to output a 5-bit digital value.
For example, when the MAC operation value is 600 mV and the power supply voltage VDD is 900 mV, in step “0”, the analog-to-digital converter 300 may determine a first bit among 4 bits by comparing magnitudes of 600 mV and 450 mV, which is ½ of the power supply voltage VDD. In addition, in step “1”, the analog-to-digital converter 300 may determine a second bit among the 4 bits by comparing 600 mV with ¼ and ¾ of the power supply voltage VDD. In detail, the analog-to-digital converter 300 may generate 16 different 4-bit digital values corresponding to the 16 reference voltages REF [0:15] through a total of 5 comparison steps. In addition, the analog-to-digital converter 300 may perform a shift operation to output a 5-bit digital value.
For example, when the 4-bit digital value is “0101”, the MAC operation value may be compared with ½ of the power supply voltage VDD, ¾ of the power supply voltage VDD, ⅝ of the power supply voltage VDD, 11/16 of the power supply voltage VDD, and 21/32 of the power supply voltage VDD. It may be confirmed that this is consistent with the result of
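The sequence of comparisons in the 600 mV example can be reproduced with a generic binary-search (successive-approximation) model. The following Python sketch is only an illustration under stated assumptions: the function name `binary_search_adc`, the string encoding of the comparison outcomes, and the use of ideal reference fractions are hypothetical, and the mapping of these outcomes to the 4-bit signal and the shifted 5-bit digital value is not modeled.

```python
from fractions import Fraction
from typing import List, Tuple


def binary_search_adc(v_in_mv: int, vdd_mv: int, steps: int = 5) -> Tuple[str, List[Fraction]]:
    """Compare the input against references chosen by binary search.

    Returns the raw comparison outcomes ("1" if the input exceeds the
    reference) and the reference levels used, as fractions of VDD.
    """
    ratio = Fraction(v_in_mv, vdd_mv)
    ref = Fraction(1, 2)   # the first comparison uses 1/2 of VDD
    step = Fraction(1, 4)  # the adjustment halves after every comparison
    outcomes: List[str] = []
    refs: List[Fraction] = []
    for _ in range(steps):
        refs.append(ref)
        if ratio > ref:
            outcomes.append("1")
            ref += step  # input is higher: move the reference up
        else:
            outcomes.append("0")
            ref -= step  # input is lower: move the reference down
        step /= 2
    return "".join(outcomes), refs


outcomes, refs = binary_search_adc(600, 900)
```

With a MAC operation value of 600 mV and a power supply voltage VDD of 900 mV, the references visited by this sketch are ½, ¾, ⅝, 11/16, and 21/32 of VDD, matching the comparison sequence listed above.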
In operation S120, the reference voltage may be generated by repeatedly sharing charges of the first output signal. For example, the reference voltage generator 200 may repeatedly share charges of the first output signal to generate the reference voltage.
The operation of generating the reference voltage may be performed through the following detailed operations, and the detailed operations may be performed simultaneously.
First, in operation S121, the reference voltage generator 200 may determine two adjacent global bit lines GBL as the master bit line “Master BL” and the slave bit line “Slave BL”, respectively.
In operation S122, the reference voltage generator 200 may share the charges of the master bit line “Master BL” and the slave bit line “Slave BL” to generate the reference voltage having 16 different voltage levels.
In operation S130, a digital signal may be generated based on the result of comparing the reference voltage with the second output signal. For example, the analog-to-digital converter 300 may compare the reference voltage and the second output signal and may convert the second output signal into a digital signal based on a result of the comparison.
According to an embodiment of the present disclosure, the computing-in-memory device that generates a reference voltage by sharing charges of bit lines may omit registers or capacitors by using bit lines to convert an analog signal into a digital signal. Accordingly, the power consumption and the design area of the computing-in-memory device may be minimized.
The above descriptions are specific embodiments for carrying out the present disclosure. Embodiments in which a design is changed simply or which are easily changed may be included in the present disclosure as well as an embodiment described above. In addition, technologies that are easily changed and implemented by using the above embodiments may be included in the present disclosure. While the present disclosure has been described with reference to embodiments thereof, it will be apparent to those of ordinary skill in the art that various changes and modifications may be made thereto without departing from the spirit and scope of the present disclosure as set forth in the following claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2023-0113668 | Aug 2023 | KR | national |
10-2024-0040528 | Mar 2024 | KR | national |