This application claims the priority benefit of China application no. 202210190672.8, filed on Feb. 28, 2022. The entirety of each of the above mentioned patent applications is hereby incorporated by reference herein and made a part of this specification.
The present invention relates to the field of data storage, in particular to a highly energy-efficient CAM based on a single FeFET and an operating method thereof, in which the FeFET device is used to design a low-power, high-performance CAM with non-volatile characteristics.
A Content Addressable Memory (CAM) is a promising hardware solution for Computing in Memory (CiM), which can alleviate the memory wall problem in von Neumann machines. With its highly parallel search capability, the CAM has great potential in today's data-intensive applications, including machine learning, neuromorphic computing, and look-up tables, among others.
Traditional CMOS CAMs suffer from high power consumption and low area density, so researchers have turned to emerging non-volatile memory (NVM) devices, such as resistive RAMs (ReRAMs), Spin Transfer Torque Magnetic RAMs (STT-MRAMs) and Ferroelectric Field Effect Transistors (FeFETs), to construct compact and efficient CAM designs. ReRAMs and STT-MRAMs combine variable resistance with non-volatile storage, encoding their Low Resistance States (LRSs)/High Resistance States (HRSs) as logical values of “1”/“0”, respectively, and can therefore replace the CMOS SRAM in CAM designs. Three-terminal FeFET devices, by contrast, can be used as 1T non-volatile memories or switches rather than as variable resistors, owing to their unique hysteretic I-V characteristics, high on/off current ratio, and high off-state resistance. The FeFETs therefore have smaller area overhead and lower energy consumption than CMOS CAM designs, and are promising for constructing compact and efficient CAM designs. These NVM-based innovations focus on applying NVM devices in compact CAM designs, resulting in CAM cells with smaller area, lower energy consumption, and lower delay.
The purpose of the present invention is to provide a CAM design based on a single FeFET that addresses the high energy consumption, large area overhead and poor performance of existing CAMs, and to propose a highly energy-efficient data search method that achieves lower energy consumption and delay.
The object of the present invention is achieved by the following technical solutions:
A highly energy-efficient CAM based on a single FeFET, wherein each CAM cell is composed of 1FeFET and 2NMOSs, and 2NMOSs is divided into T1 and T2; source of an FeFET device in a CAM cell structure is connected to a search line
Further, MLs are discharged through a single NMOS.
Further, each column of the array shares the same longitudinal SL and
Further, the match lines (MLs) implement an adaptive precharge and discharge data search method by means of detection amplifiers (SAs), each of which is based on a Threshold Inverter Quantization (TIQ) comparator. The method terminates the precharge and discharge of the MLs early, thereby reducing the ML voltage swing and improving the energy efficiency of the CAM.
Further, one of two states, “1” or “0”, is stored in the FeFET by operating its gate.
Further, the drain of the FeFET conveys whether or not the cell matches.
An operating method of the CAM as described above, wherein the operating method comprises:
The present invention has the following beneficial effects:
The CAM design in the present invention can realize energy saving and delay reduction.
(1) In an array of 2T-1FeFET CAM cells, each CAM cell has only one NMOS connected to the match line ML, which reduces the ML capacitance and thus the precharge energy consumption. Because of the reduced ML capacitance and the reduced resistance between the ML and ground, an array consisting of 2T-1FeFET CAM cells has a smaller search delay than existing CAM designs. Because the design uses few devices, including only a single FeFET, the CAM area is small, and the design overhead and production cost are reduced. In addition, the design is versatile, as it can also be applied to NVM devices other than the FeFET.
(2) Unlike a conventional CAM, the 2T-1FeFET CAM adapts the ML precharge and discharge during each search. By using the TIQ comparator in the detection amplifier SA, charging is stopped once the ML voltage has been precharged above the threshold voltage of the detection amplifier SA, and discharging is stopped once the ML voltage has fallen below that threshold voltage. This highly energy-efficient data search method reduces the voltage swing of the ML and therefore effectively reduces the energy consumption.
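The early-termination behaviour described in (2) can be illustrated with a minimal sketch. It is not the circuit of the invention: the ML is modelled as a voltage that moves in fixed steps, and the function names, the step size, and the values of `VDD_MV` and `V_TH_MV` are hypothetical choices made only for illustration.

```python
def adaptive_precharge(v_ml_mv, v_th_mv, vdd_mv, step_mv=10):
    """Raise the ML voltage step by step and stop as soon as it exceeds
    the SA threshold (or reaches VDD). All voltages are in millivolts."""
    while v_ml_mv <= v_th_mv and v_ml_mv < vdd_mv:
        v_ml_mv = min(v_ml_mv + step_mv, vdd_mv)
    return v_ml_mv


def adaptive_discharge(v_ml_mv, v_th_mv, step_mv=10):
    """Lower the ML voltage step by step and stop as soon as it falls
    below the SA threshold (or reaches ground)."""
    while v_ml_mv >= v_th_mv and v_ml_mv > 0:
        v_ml_mv = max(v_ml_mv - step_mv, 0)
    return v_ml_mv


# Hypothetical operating point, chosen only for illustration.
VDD_MV = 600    # supply voltage
V_TH_MV = 300   # switching threshold of the TIQ comparator in the SA

v_high = adaptive_precharge(0, V_TH_MV, VDD_MV)   # precharge stops just above threshold
v_low = adaptive_discharge(v_high, V_TH_MV)       # discharge stops just below threshold

adaptive_swing_mv = v_high - v_low   # swing with early termination
full_swing_mv = VDD_MV               # swing of a rail-to-rail precharge/discharge

if __name__ == "__main__":
    print(f"adaptive swing: {adaptive_swing_mv} mV, full swing: {full_swing_mv} mV")
```

In this toy model the ML only moves within a narrow band around the SA threshold, which is the mechanism by which the voltage swing, and hence the energy, is reduced.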
The present invention is further described in detail in combination with the accompanying drawings and specific embodiments.
A highly energy-efficient CAM based on a single FeFET and an operating method thereof are shown in part (a) of
M0 and T1 are connected in series on the left side of the cell to form a voltage divider circuit, which is powered by the search lines SL and
Therefore, the voltage of the node D during searching for “1” is:
Wherein Vsearch is the voltage on SL, RT1,Sr1 is the resistance of a T1 transistor during searching for “1”, and RM0 is the resistance of the FeFET, which can be Rlow or Rhigh and depends on the state of VTH of the FeFET. Therefore, when storing state “1” (i.e. a low VTH state or Rlow), the voltage of the node D is as follows:
When storing state “0” (i.e. a high VTH state or Rhigh), the voltage of the node D is as follows:
Therefore, by choosing an appropriate control bias for the transistor T1, its resistance RT1,Sr1 can be set between Rlow and Rhigh, so that the corresponding VD,Sr1St1 and VD,Sr1St0 are lower than and higher than VTH of the transistor T2 respectively, thus achieving match and mismatch operations.
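The divider relation described above can be written out as a hedged sketch. It assumes that the search voltage appears across the series pair and that node D sees the drop across the FeFET M0; this orientation is inferred from the inequalities stated in the text and is not taken from the original formulas. The symbol V_{TH,T2} for the threshold voltage of T2 is introduced here for illustration.

```latex
\begin{aligned}
V_{D,Sr1}    &= V_{search}\,\frac{R_{M0}}{R_{M0} + R_{T1,Sr1}} \\
V_{D,Sr1St1} &= V_{search}\,\frac{R_{low}}{R_{low} + R_{T1,Sr1}} < V_{TH,T2} \\
V_{D,Sr1St0} &= V_{search}\,\frac{R_{high}}{R_{high} + R_{T1,Sr1}} > V_{TH,T2}
\end{aligned}
```

The two inequalities hold when R_{T1,Sr1} is chosen suitably between R_{low} and R_{high}, so that T2 stays off for a stored “1” (match) and turns on for a stored “0” (mismatch).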
Similarly, when searching for logic “0”, the voltages on SL and
By setting the resistance RT1,Sr0 of T1 between Rlow and Rhigh, when “1” is stored, the voltage of the node D is:
When “0” is stored, the voltage of the node D is as follows:
Therefore, VD,Sr0St1 and VD,Sr0St0 are above and below VTH of the transistor T2 respectively, thus correctly implementing the function of searching for “0”.
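The four search/store cases can be checked numerically with a small sketch. It treats M0 and T1 as an ideal resistive divider and assumes that node D sees the drop across the FeFET when searching for “1” and the drop across T1 when searching for “0”, which is inferred from the inequalities in the text. The resistance values, the T2 threshold, and the use of a single T1 resistance for both searches are hypothetical simplifications; only the 1 V search voltage comes from the description.

```python
def node_d_voltage(v_search, r_fefet, r_t1, search_bit):
    """Ideal resistive-divider estimate of the node D voltage.

    Assumption: searching "1" places the FeFET on the low side of the
    divider, searching "0" swaps the search-line voltages so that T1 is
    on the low side.
    """
    if search_bit == 1:
        return v_search * r_fefet / (r_fefet + r_t1)
    return v_search * r_t1 / (r_fefet + r_t1)


# Hypothetical device values, chosen only so that R_LOW < R_T1 < R_HIGH.
V_SEARCH = 1.0            # search voltage in volts
R_LOW, R_HIGH = 1e4, 1e8  # FeFET resistance for stored "1" / stored "0"
R_T1 = 1e6                # resistance of T1 under its control bias
VTH_T2 = 0.4              # threshold voltage of T2


def cell_matches(stored_bit, search_bit):
    """A cell matches when node D stays below VTH of T2 (T2 off, no ML discharge)."""
    r_fefet = R_LOW if stored_bit == 1 else R_HIGH
    v_d = node_d_voltage(V_SEARCH, r_fefet, R_T1, search_bit)
    return v_d < VTH_T2


if __name__ == "__main__":
    for search_bit in (1, 0):
        for stored_bit in (1, 0):
            result = "match" if cell_matches(stored_bit, search_bit) else "mismatch"
            print(f"search {search_bit}, stored {stored_bit}: {result}")
```

With these values, node D sits close to ground for matching cells and close to the search voltage for mismatching cells, so a wide margin around the T2 threshold is available in both search directions.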
2. Overall structure and operation process of the CAM array of 2T-1FeFET: As shown in part (a) of
The whole operation process of the 2T-1FeFET CAM array is as follows: (1) before the 2T-1FeFET CAM array starts to work, data is stored in each cell; that is, after the information is encoded into a binary sequence, it is written to the FeFET through the WL, and the written state is represented by S. In addition, in writing operations, the Vwrite/2 suppression bias scheme needs to be applied to all WLs related to unselected rows and
(2) Each search cycle is divided into two stages:
At this time, for a matched cell, D is 0 and ML does not discharge through the NMOS; for a mismatched cell, D is 1 and ML discharges through the NMOS. Therefore, in the case of mismatch, ML is discharged to the ground through the mismatched CAM cell and pull-down transistor, while in the case of match, ML remains at an original level, because there is no discharge path. After waiting for a period of time, the discharge process is finished, and the output of the detection amplifier SA in each row is observed. The TIQ comparator in the SA compares the ML voltage with the threshold voltage and generates the output signal. If the output is at a high level, it indicates that the input of the SA is at a low level, showing that there is discharge in ML of this row, and the row does not match. Otherwise, if the output is at a low level, it indicates that the input of the SA is at a high level, showing that the ML of this row is not discharged and the row matches.
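The row-level decision described above can be summarized in a short behavioural sketch. It models only the logic (a mismatched cell discharges the ML, and the SA output is the inverse of the ML level), not the analog behaviour; the function names and the example data are hypothetical.

```python
def cell_discharges(stored_bit, search_bit):
    """A mismatched cell drives node D high, so its NMOS pulls the ML down."""
    return stored_bit != search_bit


def sa_output_high(stored_word, query):
    """The SA output is high when the ML has been discharged by any cell."""
    if len(stored_word) != len(query):
        raise ValueError("stored word and query must have the same length")
    ml_discharged = any(cell_discharges(s, q) for s, q in zip(stored_word, query))
    return ml_discharged


def matching_rows(rows, query):
    """Return the indices of rows whose SA output stays low (ML not discharged)."""
    return [i for i, row in enumerate(rows) if not sa_output_high(row, query)]


if __name__ == "__main__":
    stored = [
        [1, 0, 1, 1],
        [0, 0, 1, 1],
        [1, 0, 1, 1],
    ]
    print(matching_rows(stored, [1, 0, 1, 1]))
```

All rows are evaluated against the same query, which mirrors the parallel search of the array: a single mismatched cell is enough to discharge its row's ML.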
The truth table for the write and search operations is shown in the following table (where the write voltage Vw = 4 V and the search voltage Vs = 1 V):
3. Adaptive ML precharge and discharge scheme:
The precharge energy consumption Epre of the CAM array is expressed as follows:
Epre = CML · VDD · ΔV
where CML is the capacitance associated with the ML, VDD is the supply voltage, and ΔV is the voltage swing of the ML. This scheme therefore saves precharge energy by reducing ΔV.
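As a worked illustration of this relation, the following sketch evaluates the precharge energy for a full swing and for a reduced swing. The capacitance, supply voltage, and swing values are hypothetical and are not taken from the simulations of the present invention.

```python
import math


def precharge_energy(c_ml, vdd, delta_v):
    """Precharge energy in joules: Epre = CML * VDD * dV."""
    return c_ml * vdd * delta_v


# Hypothetical values, chosen only for illustration.
C_ML = 10e-15   # ML capacitance: 10 fF
VDD = 1.0       # supply voltage in volts

e_full = precharge_energy(C_ML, VDD, delta_v=1.0)     # rail-to-rail swing
e_reduced = precharge_energy(C_ML, VDD, delta_v=0.2)  # swing limited near the SA threshold

if __name__ == "__main__":
    print(f"full swing:    {e_full:.2e} J")
    print(f"reduced swing: {e_reduced:.2e} J")
    print(f"saving factor: {e_full / e_reduced:.1f}x")
```

Because the energy is linear in the swing, cutting ΔV to one fifth at a fixed CML and VDD cuts the precharge energy to one fifth as well.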
The functions and effects of the present invention are further illustrated and demonstrated by the following simulation experiment:
1. Simulation Conditions
The FeFETs are simulated using a physics-based, circuit-compatible SPECTRE and SPICE model built on the Preisach model. This model enables efficient design and analysis, and has been widely used in FeFET circuit design. It supports nm, 22 nm or 10 nm Predictive Technology Models (PTMs) as basic transistors. The basic transistor used in the simulation is the PTM 45 nm model. The voltage is set to 1 V.
During the simulation, the 2T-1FeFET CAM design is simulated using the SPECTRE simulator. In addition to simulating the CAM design of the present invention, we compare our results with four CAM designs described in non-patent document 1 (A. T. Do et al., “Design of a power-efficient CAM using automated background checking scheme for small match line swing,” in ESSCIRC, pp. 209-212, IEEE, 2013), non-patent document 2 (J. Li et al., “1 Mb 0.41 μm2 2T-2R cell nonvolatile TCAM with two-bit encoding and clocked self-referenced sensing,” JSSC, vol. 49, pp. 896-907, 2014), non-patent document 3 (C. Wang et al., “Design of magnetic non-volatile TCAM with priority-decision in memory technology for high speed, low power, and high reliability,” IEEE TCAS-I, vol. 67, no. 2, pp. 464-474, 2019) and non-patent document 4 (X. Yin et al., “An ultra-dense 2FeFET TCAM design based on a multi-domain FeFET model,” IEEE TCAS-II, vol. 66, pp. 1577-1581, 2018).
The comparison metrics mainly include the number of transistors, the area of each CAM cell, the search delay, and the search energy consumption per CAM cell per search. For the CAM design in the present invention, the delay is measured in the worst case, that is, when only one CAM cell in a row mismatches and discharges the ML; the energy consumption is measured in the average case, that is, when half of the CAM cells in a row match and the other half mismatch.
2. Simulation Results
1) Verification of Non-Volatility
1.1)
1.2)
2) Energy Consumption and Delay Analysis Under VDD and Vref Adjustment
Based on the above analysis, reducing Vref would reduce the energy consumption of the SA at the expense of increasing the ML voltage swing, which can be partially solved by lowering VDD so as to reduce the upper bound of the voltage swing.
3) Energy Consumption and Delay Analysis Under Different Word Length
According to the voltage regulation analysis, we used the lowest operating VDD (0.6V) and Vref (0.6V). Part (a) and part (b) in
4) Robustness Verification Against Process Variation
We also verify the robustness of the 2T-1FeFET CAM design and the adaptive ML precharge and discharge scheme in the present invention. We assume that the FeFET device has an experimental variation of σ=54 mV in the low/high Vth state, whereas the CMOS device has a 5% size variation.
5) Optimization of Energy Consumption
The following table presents the comparison of the metrics of the CAM design based on the single FeFET in the present invention with other CAM designs.
The above table summarizes the technical metrics of the 2T-1FeFET CAM and other CAMs, where the cell size is estimated from the layout of a 2×2 2T-1FeFET CAM array. As the table shows, the cell size of the 2T-1FeFET CAM is 10.9% of that of the traditional 10T CMOS CAM. A smaller CAM area overhead means a smaller ML parasitic capacitance, which reduces both the search energy and the search delay. With the 1FeFET-based CAM design and the adaptive ML precharge and discharge scheme, the 2T-1FeFET CAM consumes 6.64 times less energy and has a 2.67 times shorter delay than the 10T CMOS CAM design. Because the present invention adjusts VDD and Vref to reduce the ML voltage swing and the search energy consumption, the 2T-1FeFET CAM is slightly slower than the 2T-2R CAM and the 2FeFET CAM, but this remains within an acceptable range, as the 2T-1FeFET CAM design is 4.74 times and 3.02 times more energy efficient than the 2T-2R and 2FeFET CAM designs, respectively. While the search delay of the STT-MRAM CAM is only 42% of that of the 2T-1FeFET CAM, the cell size of the 2T-1FeFET CAM is only 1.99% of that of the STT-MRAM CAM, a large density advantage that can compensate for the slightly lower performance. Moreover, the 2T-1FeFET CAM consumes 9.14 times less energy than the 20T-6MTJ CAM.
It can be seen from the above results that the present invention not only offers non-volatility, which is difficult to achieve with a CMOS design, and robustness against process variations, but also features a compact design, low energy consumption and low delay. In addition, the above results validate the effectiveness of the 2T-1FeFET CAM array with the adaptive ML precharge and discharge scheme in data-intensive search applications.
The above embodiments are used to explain the present invention, not to restrict it; any modification or alteration made to the present invention without departing from the spirit and protection scope of the present invention falls within the protection scope of the present invention.
Number | Date | Country | Kind |
---|---|---|---|
202210190672.8 | Feb 2022 | CN | national |