This application claims priority from French Patent Application No. 2204938 filed on May 23, 2022. The content of this application is incorporated herein by reference in its entirety.
The present invention generally relates to the field of resistive random access memories or RRAM, and more particularly to so-called 1S1R-type memories, whose memory cells comprise a selector and a programmable resistive element. In particular, it finds application in hardware accelerators for deep learning, more particularly in the implementation of MAC (Multiply ACcumulate) operations.
Artificial neural networks are used in a great many artificial intelligence applications. In general, the calculations required by these neural networks are performed by software executed by a processor, the intermediate results being stored in a memory. The processing by the processor and the considerable data exchanges between the processor and the memory lead to a high energy consumption, in particular for large-sized networks. Yet, energy resources are sometimes very limited, for example within the nodes of connected networks. As a result, the calculations have to be carried out in the cloud or at the network periphery.
However, the use of artificial neural networks in connected objects is sometimes desirable for reasons related to security, offline availability or latency.
Recently, resistive memories or RRAM have been suggested as hardware accelerators with a low energy consumption for performing some neuromorphic calculations, such as the MAC (Multiply ACcumulate) operations involved in the products between matrices of synaptic weights and activation vectors.
A description of the use of resistive memories for neuromorphic calculations can be found in the article by H. Li et al. entitled “Memristive crossbar arrays for storage and computing applications” published in Adv. Intell. Syst., 3, 2100017, pp. 1-26.
Resistive memories are composed of cells, each comprising a resistive element whose variable resistance encodes a synaptic coefficient. In its simplest form, a resistive memory implements a binary neural network or BNN. Each resistive element of a cell is made of a material capable of toggling in a reversible manner between a high-resistance state or HRS and a low-resistance state or LRS. Different technologies allow carrying out such a toggling between two resistivity states. Thus, one can distinguish phase-change memories or PCRAM (Phase Change Random Access Memory), so-called conductive-bridge memories or CBRAM (Conductive Bridge Random Access Memory), ferroelectric memories or FERAM (FErroelectric Random Access Memory), and oxide-based memories or OxRAM (Oxide based Random Access Memory). In these memories, each cell can store only one data bit, represented by the state of the resistive element.
A method for performing a neuromorphic calculation, more specifically the product of a matrix of synaptic coefficients of a BNN network with a vector whose elements are the activation values of the neurons of a layer, has been described in the article by S. N. Truong entitled “Single crossbar array of memristors with bipolar inputs for neuromorphic image recognition” published in IEEE Access, vol. 8, pp. 69327-69332.
If we denote a^l the activation vector of the layer l, a^{l+1} the activation vector of the layer l+1, and W^{l,l+1} the matrix of the synaptic coefficients between the two layers, we have:

a^{l+1} = sign(W^{l,l+1} a^l)   (1)

where sign(a) is the operation which gives the sign of each element of the vector a. The activation value of the neuron i of the layer l+1 is thus obtained from the sign of the scalar product:
a_i^{l+1} = sign(w_i^{l,l+1} · a^l) = sign(Σ_j w_{ij}^{l,l+1} a_j^l)   (2)

where w_i^{l,l+1} is the line vector corresponding to the i-th line of the matrix W^{l,l+1}. The scalar product involved in the expression (2) simply measures the similarity between the vector w_i^{l,l+1} stored in the RRAM and the input vector a^l:

w_i^{l,l+1} · a^l = popcount(w_i^{l,l+1} ⊙ a^l) = popcount(XNOR(w_i^{l,l+1}, a^l))   (3)
where the XNOR operation should be understood on each of the bits of the two vectors and popcount is the sum (positive or negative) of the bits of the resulting vector.
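By way of illustration, the equivalence expressed by (3) may be checked with the following minimal Python sketch, in which the polar vectors are represented as lists of ±1 values (the function and variable names are illustrative and do not come from the cited article):

```python
def xnor_popcount_dot(w, a):
    """Scalar product of two polar (+1/-1) vectors computed as in expression (3):
    XNOR on the binary representations, then a signed popcount."""
    assert len(w) == len(a)
    # Map polar values (+1 -> 1, -1 -> 0) to obtain the binary representation.
    w_bits = [(wi + 1) // 2 for wi in w]
    a_bits = [(ai + 1) // 2 for ai in a]
    # Bitwise XNOR: 1 when the two bits are equal, 0 otherwise.
    xnor = [1 - (wb ^ ab) for wb, ab in zip(w_bits, a_bits)]
    # Signed popcount: each matching bit counts +1, each differing bit counts -1.
    return sum(2 * b - 1 for b in xnor)

# The direct scalar product and the XNOR/popcount form coincide.
w = [+1, -1, -1, +1]
a = [+1, +1, -1, -1]
assert xnor_popcount_dot(w, a) == sum(wi * ai for wi, ai in zip(w, a))
```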
The aforementioned article suggests calculating the scalar product by applying at the input of the RRAM memory the binary word a^l in polar representation (a^l = a^{l+} − a^{l−}) and by performing a current reading on the corresponding output line i, the sum Σ_j w_{ij}^{l,l+1} a_j^l simply resulting from Kirchhoff's law.
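The current summation expressed by Kirchhoff's law may be sketched as follows; the conductance and voltage values used are purely illustrative assumptions:

```python
def bitline_current(conductances, voltages):
    """Current collected on one output (bit) line: the sum of the currents
    I_j = G_j * V_j flowing through the cells of the column (Kirchhoff's law)."""
    return sum(g * v for g, v in zip(conductances, voltages))

# Illustrative values: G_on/G_off encode w = +1/-1, V = Vread or 0 encodes a = +1/-1.
G_on, G_off, Vread = 1e-4, 1e-6, 0.3            # siemens, siemens, volts (assumed)
conductances = [G_on, G_off, G_on]              # stored weights +1, -1, +1
voltages = [Vread, 0.0, Vread]                  # activations +1, -1, +1
print(bitline_current(conductances, voltages))  # approximately 2 * G_on * Vread
```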
Memory technologies, some of which can be used for MAC computation, are disclosed in patent document US 2019/362787 and in J-M. Hung, X. Li, J. Wu, and M-F. Chang, “Challenges and Trends in Developing Non-volatile Memory-Enabled Computing Chips for Intelligent Edge Devices,” IEEE Transactions on Electron Devices, vol. 67, no. 4, pp. 1444-1453, April 2020, doi: 10.1109/TED.2020.2976115.
Recently, resistive memories allowing several bits to be stored per cell have been developed. The article by E. Esmanhotto et al. entitled “High-density monolithically integrated multiple 1T1R multiple-level cell for neural networks” published in IEEE International Electron Devices Meeting (IEDM) Proc., December 2020, pp. 36.5.1-36.5.4, describes in particular a multi-level resistive cell or MLC (Multi-Level Cell) of the 1T1R type, i.e. consisting of a resistive element and a transistor which blocks or gives access to the resistive element for reading or writing operations while limiting the leakage currents in the rest of the array.
Nonetheless, 1T1R cells do not allow reaching a high level of integration because of the surface area required by the access transistor. For this reason, it is preferred to use 1S1R-type cells comprising a non-linear selector and a resistive element in series, the selector generally being an ovonic threshold selector or OTS (Ovonic Threshold Selector).
A 1S1R-type resistive memory is represented in the appended figures. Because of the highly non-linear behaviour of the selector, the current read through such a cell essentially indicates whether the cell is passing or blocked at the applied voltage, rather than the precise value of its programmed resistance. It is therefore difficult to read the value of a synaptic coefficient stored in an MLC RRAM memory cell and, a fortiori, to carry out a neuromorphic calculation such as a MAC operation by means of such an MLC RRAM memory.
Consequently, an object of the present invention is to provide a method for performing a neuromorphic calculation and in particular a MAC operation by means of an MLC RRAM memory.
The present invention is defined by a method for calculating a MAC operation providing the scalar product between a first vector, whose elements are binary, and a second vector, whose elements are values quantised over M>2 levels, said operation being carried out by means of a memory composed of memory cells and comprising a plurality of word lines and a plurality of bit lines, a memory cell connecting each word line to each bit line according to a crossbar configuration, each memory cell being able to take on one of a plurality M of states, each state being associated with a current-voltage characteristic of the cell, the memory cells of a bit line storing the elements of the second vector, wherein each memory cell is read by successively applying M−1 reading voltages Vread1, . . . , VreadM−1.
Advantageously, the memory cells are made by means of an ovonic selector in series with a resistive element programmable in a low-resistivity state (LRS) or a high-resistivity state (HRS).
The first vector may be an activation vector whose elements are the activation values of a neural layer, the activation values possibly taking on the values +1 and −1.
The second vector may be a vector of synaptic coefficients, quantised over M levels, of the synapses between said neural layer and the next layer of a neural network.
The M possible quantised values of the synaptic coefficients are typically equal to
In this case, the scalar product may be deduced from the difference between the total number of passing cells in the first phase and the total number of passing cells in the second phase, corrected for the bias, by means of a normalisation operation transforming an integer X into a quantised value of the synaptic coefficient.
The memory may be an RRAM memory of the 1S1R type or a memory with three terminals.
Other features and advantages of the invention will appear upon reading the description of a preferred embodiment of the invention, made with reference to the appended figures.
In the following, as a non-limiting example, we will consider an MLC RRAM memory, i.e. one whose memory cells are capable of storing a piece of information over M>2 levels (or states), in other words a piece of information of more than one bit. Furthermore, we will suppose that this memory is a 1S1R-type crossbar array, i.e. each of its memory cells comprises a selector and a programmable resistive element.
Other types of memory can be used to implement the present invention, such as three-terminal memories, for example FeFET memories or flash memories.
Furthermore, in general, the present invention can be implemented with different types of selector element. Typically, the selector element is an ovonic threshold selector or OTS. Nonetheless, a person skilled in the art will understand that any selector capable of triggering at one or more threshold value(s) could be used without departing from the scope of the present invention. For example, the selector can also be a Mixed Ionic Electronic Conduction (‘MIEC’) selector, a Metal Insulator Transition (‘MIT’) selector, a diode-type selector or a filamentary volatile selector.
Nonetheless, for simplicity, we will first consider a 1S1R memory cell with only M=2 levels. In such a case, the cell may be programmed in a high-resistivity state (HRS), and therefore with a low conductance denoted Goff, or in a low-resistivity state (LRS), and therefore with a high conductance denoted Gon.
One idea at the origin of the present invention is to sequentially apply a first reading voltage and a second reading voltage and to deduce, from the corresponding currents read at the output, the result of the operation a·w, where a is a polar binary activation value (a=±1) and w is the value of the binary synaptic weight stored in the cell (w=±1). The value w=−1 is stored by programming the cell with the conductance Goff and the value w=+1 by programming it with the conductance Gon. The first reading voltage is selected equal to zero if the activation value is equal to −1 and equal to Vread if the activation value is equal to +1, with VTH-LRS<Vread<VTH-HRS, where [VTH-LRS, VTH-HRS] is a voltage range in which the conductance of the cell takes on the value Goff if w=−1 and the value Gon if w=+1. Conversely, the second reading voltage is selected equal to Vread if the activation value is equal to −1 and equal to zero if it is equal to +1.
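A minimal sketch of this single-cell reading principle is given below; the threshold and reading voltage values are placeholder assumptions chosen so that VTH-LRS < Vread < VTH-HRS:

```python
V_TH_LRS, V_TH_HRS = 1.0, 2.0   # assumed trigger thresholds for LRS and HRS cells
V_READ = 1.5                    # chosen inside the window [V_TH_LRS, V_TH_HRS]

def cell_is_passing(a, w, v_read=V_READ):
    """Return True if the 1S1R cell conducts during the read.

    a = +1/-1 : activation, selecting the applied voltage (v_read or 0).
    w = +1/-1 : stored weight; w = +1 is programmed in LRS, w = -1 in HRS.
    The cell triggers only if the applied voltage exceeds its threshold.
    """
    applied = v_read if a == +1 else 0.0
    threshold = V_TH_LRS if w == +1 else V_TH_HRS
    return applied > threshold

# Only the combination a = +1, w = +1 yields a passing cell at V_READ.
for a in (+1, -1):
    for w in (+1, -1):
        print(a, w, cell_is_passing(a, w))
```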
The behaviour of such a memory cell during these two reading phases is now considered.
The voltage-current characteristic of a two-level memory cell is represented in the appended figures.
Returning to the reading of the memory array, if the number of positive activation values is equal to the number of negative activation values, the difference (Nread open)+ − (Nread open)− is none other than the scalar product w·a, where the (binary) elements of the vector w are stored in the cells of the column. Nonetheless, in general, the obtained result has to be corrected for a bias related to the difference between the number of positive activation values and that of negative activation values:
w·a=(Nread open)+−(Nread open)−−(Σai+−Σai−) (4)
The first (resp. second) term of the expression (4) corresponds to the number of passing cells of the bit line when the reading voltage Vread is applied to the word lines corresponding to the positive (resp. negative) activation values of the activation vector a.
The last term of the expression (4) corresponds to the bias value resulting from the difference between the number of positive activation values and the number of negative activation values in a.
In a first reading phase, a first reading voltage Vread1 is applied at the input of the memory, in a first step, to the word lines corresponding to the positive activation values. This operation is repeated in a second step, by applying a second reading voltage Vread2 to the same word lines.
For each bit line, the read current is equal to the sum of the currents in the different memory cells of the associated column. Thus, the current read in the first step is of the order of (Nread1 open)+·ILRS, where (Nread1 open)+ is the number of passing memory cells in the column. Similarly, the current read in the second step is of the order of (Nread2 open)+·ILRS. The currents thus read in the first and second steps are summed up for each bit line, which is represented in the figure by the expression ((Nread1 open)+ + (Nread2 open)+)·ILRS.
In a second reading phase, a first reading voltage Vread1 is applied at the input of the memory, in a first step, to the word lines corresponding to the negative activation values. This operation is repeated in a second step, by applying the second reading voltage Vread2 to the same word lines.
For each bit line, the current read in the first step is of the order of (Nread1 open)−·ILRS, where (Nread1 open)− is the number of passing memory cells in the column. Similarly, the current read in the second step is of the order of (Nread2 open)−·ILRS. The currents thus read in the first and second steps are summed up for each bit line, which is represented in the figure by the expression ((Nread1 open)− + (Nread2 open)−)·ILRS.
Consider now the behaviour of a memory cell during the two reading phases. The voltage-current characteristic of such a memory cell is represented in the appended figures.
Hence, each memory cell can encode a synaptic coefficient able to take on three distinct values: −1, 0, +1.
When the number of positive activation values is equal to the number of negative activation values, the difference between the sum (Nread1 open)++(Nread2 open)+ representative of the currents read in the first phase and the sum (Nread1 open)−+(Nread2 open)− representative of the currents read in the second phase for the same bit line provides the scalar product w·a (each of the elements of the vector w stored in the cells of the column possibly taking on 3 levels). However, like before, when the number of positive activation values differs from the number of negative activation values, the obtained result is to be corrected for a bias related to the difference between the number of positive activation values and that of negative activation values, namely:
w·a = ((Nread1 open)+ + (Nread2 open)+) − ((Nread1 open)− + (Nread2 open)−) − (Σ ai+ − Σ ai−)   (5)
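As a sanity check of expression (5), the following minimal Python simulation assumes that a cell storing a coefficient w ∈ {−1, 0, +1} is passing at two, one or none of the two reading voltages, respectively, which is consistent with the behaviour described above:

```python
import random

def passing_steps(w):
    """Number of reading steps (out of Vread1, Vread2) at which a cell storing
    w in {-1, 0, +1} is passing: 0, 1 or 2 respectively (assumed encoding)."""
    return w + 1

def scalar_product_via_counts(a, w):
    """Two-phase reading of one bit line, combined as in expression (5)."""
    # First phase: read steps applied to word lines with positive activations.
    n_open_pos = sum(passing_steps(wj) for aj, wj in zip(a, w) if aj == +1)
    # Second phase: same read steps applied to word lines with negative activations.
    n_open_neg = sum(passing_steps(wj) for aj, wj in zip(a, w) if aj == -1)
    # Bias: difference between the numbers of positive and negative activations.
    bias = sum(1 for aj in a if aj == +1) - sum(1 for aj in a if aj == -1)
    return n_open_pos - n_open_neg - bias

random.seed(0)
a = [random.choice([+1, -1]) for _ in range(16)]
w = [random.choice([-1, 0, +1]) for _ in range(16)]
assert scalar_product_via_counts(a, w) == sum(aj * wj for aj, wj in zip(a, w))
```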
In this embodiment, the activation vector of the neural layer is composed of elements representing either a positive activation value (ai+ = +1) or a negative activation value (ai− = −1). The synaptic coefficients may take on M distinct values. Each MLC RRAM memory cell can store such a synaptic coefficient by being able to take on M distinct states, each state being associated with a current-voltage characteristic of the cell (Iprog1(V), . . . , IprogM-1(V)). The readings of the output currents at the reading voltages Vread1, . . . , VreadM−1 then allow the state of the cell to be determined. Thus, the state m is associated with the read output currents:

Iread1 = ILRS, . . . , Ireadm−1 = ILRS, Ireadm = IHRS, . . . , IreadM−1 = IHRS.
Alternatively, it should be understood that this correspondence could be reversed, the state m then being associated with the read output currents:

Iread1 = IHRS, . . . , Ireadm−1 = IHRS, Ireadm = ILRS, . . . , IreadM−1 = ILRS.
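This correspondence may be sketched as follows for the first convention, under the assumption that a cell in state m is passing for the first m−1 reading voltages (the exact boundary depends on the programming convention retained):

```python
def read_current_pattern(m, M):
    """Read currents observed for a cell in state m (1 <= m <= M) at the
    M-1 reading voltages Vread1, ..., Vread(M-1), first convention:
    the first m-1 readings give I_LRS, the remaining ones give I_HRS."""
    assert 1 <= m <= M
    return ["I_LRS"] * (m - 1) + ["I_HRS"] * (M - m)

def state_from_pattern(pattern):
    """The state is recovered by counting the I_LRS readings."""
    return pattern.count("I_LRS") + 1

M = 4
for m in range(1, M + 1):
    p = read_current_pattern(m, M)
    assert state_from_pattern(p) == m
    print(m, p)
```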
Regardless of the encoding convention used, the first phase corresponds to the application of the successive reading voltages Vread1, Vread2, . . . , VreadM−1 to the word lines corresponding to the positive activation values, and the second phase to the application of these same reading voltages to the word lines corresponding to the negative activation values.
Thus, in a first step of the first phase, the reading voltage Vread1 is applied to the word lines corresponding to the positive activation values and the number of passing memory cells is determined for each bit line. Afterwards, this operation is repeated by applying the voltages Vread2, . . . , VreadM−1. In the second phase, the reading voltage Vread1 is likewise applied to the word lines corresponding to the negative activation values. Afterwards, this operation is repeated by applying the voltages Vread2, . . . , VreadM−1.
It should be noted that the first and second reading phases could be interleaved. For example, it is possible to proceed by successively applying the voltages Vread1, Vread2, . . . , VreadM−1 and, for each reading voltage, carrying out the reading on the word lines corresponding to the positive activation values and then on those corresponding to the negative activation values.
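One possible interleaved reading schedule is sketched below; the ordering shown (positive word lines then negative word lines for each reading voltage) is merely one illustrative assumption:

```python
def interleaved_schedule(M):
    """Yield the (reading voltage, activation sign) pairs of an interleaved
    read sequence: for each voltage Vread_m, the word lines of the positive
    activations are read, then those of the negative activations."""
    for m in range(1, M):
        yield (f"Vread{m}", "+")
        yield (f"Vread{m}", "-")

print(list(interleaved_schedule(3)))
# [('Vread1', '+'), ('Vread1', '-'), ('Vread2', '+'), ('Vread2', '-')]
```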
The total number of passing memory cells during the first reading phase is obtained by summing up, in 611, the numbers of passing memory cells at each reading step of this phase, namely Σm (Nreadm open)+ for m = 1, . . . , M−1. The total number of passing memory cells during the second reading phase is obtained in the same manner, namely Σm (Nreadm open)−, and the difference between these two totals is then computed.
This difference is corrected in 630 by subtracting from the result the bias equal to the difference between the number of positive activation values and the number of negative activation values in the activation vector.
Afterwards, the difference thus obtained may be normalised in 640 to switch from the variation range of the number of passing cells to the variation range of the possible quantised values of the synaptic coefficients: the normalisation operation consists in transforming the integer X obtained above into the corresponding quantised value of the synaptic coefficient.
The obtained result is none other than that of the MAC operation, Σi wi ai for i = 1, . . . , N, in other words the scalar product w·a. This calculation may be performed in parallel for all of the bit lines of the memory. Thus, the activation vector of the next layer of the neural network may be quickly calculated, including when the synaptic coefficients are quantised over more than 2 levels (M>2), by means of an MLC-type RRAM memory.
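To summarise the complete flow, the following end-to-end sketch covers a general number of levels M; the centred quantised values w = k − (M−1)/2 and the corresponding rescaling of the bias are assumptions made for illustration (they reduce to expression (5) when M = 3), since the exact quantisation formula is not reproduced in the present text:

```python
import random

def passing_steps(k):
    """Number of the M-1 reading steps at which a cell programmed in the
    k-th level (k = 0..M-1) is passing (assumed monotone encoding)."""
    return k

def quantised_value(k, M):
    """Assumed centred quantisation: level k in 0..M-1 maps to w = k - (M-1)/2
    (an illustrative choice, not the formula of the description)."""
    return k - (M - 1) / 2

def mac_via_two_phase_read(a, levels, M):
    """MAC operation for one bit line: two reading phases, bias correction,
    then normalisation back to the assumed quantised weight range."""
    # Totals of passing cells over the M-1 reading steps of each phase.
    n_open_pos = sum(passing_steps(k) for aj, k in zip(a, levels) if aj == +1)
    n_open_neg = sum(passing_steps(k) for aj, k in zip(a, levels) if aj == -1)
    # Difference between the numbers of positive and negative activations.
    n_diff = sum(a)
    # Bias correction scaled by (M-1)/2, which reduces to expression (5) when M = 3.
    return (n_open_pos - n_open_neg) - (M - 1) / 2 * n_diff

random.seed(1)
M, N = 5, 32
a = [random.choice([+1, -1]) for _ in range(N)]
levels = [random.randrange(M) for _ in range(N)]
w = [quantised_value(k, M) for k in levels]
assert mac_via_two_phase_read(a, levels, M) == sum(aj * wj for aj, wj in zip(a, w))
```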