The present specification relates to hardware acceleration circuits and, more particularly, to an approach based on reconfigurable-threshold-voltage field-effect transistors for Euclidean distance calculation in neuromorphic hardware.
Machine learning algorithms are often trained by receiving a large amount of training data and iteratively updating weights associated with a plurality of nodes. Updating the weights typically requires performing one or more mathematical computations associated with each node. Modern machine learning algorithms may utilize a large number of nodes. For example, deep neural networks may utilize many layers, with each layer containing thousands or even millions of nodes.
As such, each time that the weights of the neural network are updated, millions of weight values associated with the nodes must be retrieved from memory, millions of computations must be performed, and the updated weights must be stored back in the memory. In addition to requiring a large amount of computing resources, the time required to read and write to the memory during each iteration of updating the nodes may cause the training process to be very slow. Accordingly, a need exists for hardware acceleration of Euclidean distance computations and of the training of such neural networks.
In an embodiment, an apparatus may include a synapse comprising a first reconfigurable field-effect transistor; a second reconfigurable field-effect transistor connected in parallel to the first reconfigurable field-effect transistor; an input voltage applied to each of the first reconfigurable field-effect transistor and the second reconfigurable field-effect transistor, the input voltage corresponding to an input attribute associated with an error computation; and a current sensor that measures a saturation drain current of the first reconfigurable field-effect transistor and the second reconfigurable field-effect transistor and determines a Euclidean error based on the measured saturation drain currents.
In another embodiment, a method may include receiving an input voltage corresponding to an input attribute of a self-organizing feature map; applying the input voltage to each synapse of a two-dimensional grid of synapses, each synapse corresponding to a neuron of the self-organizing feature map and comprising two reconfigurable field-effect transistors, wherein a weight associated with the neuron is stored as a threshold voltage of at least one of the reconfigurable field-effect transistors; determining a best matching neuron based on a saturation drain current of at least one of the reconfigurable field-effect transistors; and updating the threshold voltages of the reconfigurable field-effect transistors based on the best matching neuron.
The embodiments set forth in the drawings are illustrative and exemplary in nature and not intended to limit the disclosure. The following detailed description of the illustrative embodiments can be understood when read in conjunction with the following drawings, where like structure is indicated with like reference numerals and in which:
One type of machine learning algorithm that utilizes neural networks is a Self-Organizing Feature Map (SOFM). An SOFM is an unsupervised machine learning algorithm used to produce a low-dimensional (typically 2-D) representation of a higher dimensional data set while preserving the topological structure of the data. For example, a data set with p variables measured in n observations may be represented as clusters of observations with similar values for the variables. These clusters may be visualized as a 2-D map such that observations in proximal clusters have more similar values than observations in distal clusters. This can allow for high-dimensional data to be easily visualized and analyzed.
An SOFM comprises a single layer of interconnected neurons. Each neuron in the network contains a weight vector of the same dimension as the input space, with each weight corresponding to an input attribute.
To project the input data onto the neuron map, each neuron must learn and be representative of an input datapoint or set of input data. First, each neuron in the map computes a Euclidean error or Euclidean distance between each attribute in the input and the corresponding weight. The Euclidean distance may be calculated using the following equation:
$$\epsilon = (x_i - w_{ij})^2$$
where xi is the input attribute and wij is the corresponding weight on neuron j. These input-weight errors are then accumulated to compute a total error between the observed input and each neuron in the map.
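For illustration, the per-attribute errors and their accumulation into a total error per neuron may be modeled in software as follows (a minimal NumPy sketch; the array names are illustrative, not part of the disclosed hardware):

```python
import numpy as np

def total_errors(x, W):
    """Accumulate per-attribute squared errors (x_i - w_ij)^2 for every neuron.

    x: input vector of shape (p,)
    W: weight matrix of shape (n_neurons, p), one weight vector per neuron
    Returns the total Euclidean error of each neuron, shape (n_neurons,).
    """
    # Per-attribute error epsilon = (x_i - w_ij)^2, summed over attributes
    return ((x[None, :] - W) ** 2).sum(axis=1)
```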
The neuron with the least total Euclidean error or highest similarity is then selected as the representative neuron or Best Matching Unit (BMU). The BMU is selected as the cluster center for that input and pulls weights of the neurons in its neighborhood closer. This is done by updating every neuron's weight vector in the direction of the input vector in relation to the neuron's distance from the BMU of the input, as shown in the equation below.
$$\Delta w_{ij} = \eta \, \Lambda(j, j_{BMU}) \, (x_i - w_{ij})$$

where $\Lambda(j, j_{BMU})$ is the neighborhood function output for neuron $j$ with respect to the BMU $j_{BMU}$. Additionally, this weight update is scaled by the input-weight error $(x_i - w_{ij})$ for each weight and by a global learning rate $\eta$, as shown in the equation above.
The neighborhood of the BMU is often modeled as a Gaussian distribution centered at the BMU with respect to the Euclidean distance between the BMU and the neuron:

$$\Lambda(j, j_{BMU}) = \exp\!\left(-\frac{d^2}{2\sigma^2}\right)$$

where $d^2$ is the squared Euclidean distance between neuron $j$ and the BMU, and $\sigma$ is the neighborhood rate. The neighborhood rate and the learning rate decay exponentially over time by a decay time constant, allowing the SOFM to converge. By clustering neurons in the neighborhood of the BMU for each input, the topography of the data is maintained, since neurons in proximity in the input space will appear in proximity in the neuron space.
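The complete update rule may be summarized with a short Python sketch; the grid coordinates, learning rate eta, and neighborhood rate sigma below are illustrative placeholders:

```python
import numpy as np

def sofm_update(x, W, coords, eta, sigma):
    """One SOFM update: select the BMU and pull neighboring weights toward x.

    coords: (n_neurons, 2) grid coordinates of each neuron, used to compute
            the neuron-to-BMU distance d^2 for the neighborhood function.
    """
    errors = ((x[None, :] - W) ** 2).sum(axis=1)
    bmu = int(np.argmin(errors))                     # least total Euclidean error
    d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
    neighborhood = np.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian centered at the BMU
    # Delta w_ij = eta * Lambda(j, j_BMU) * (x_i - w_ij)
    W += eta * neighborhood[:, None] * (x[None, :] - W)
    return bmu

# eta and sigma decay exponentially over epochs by a decay time constant, e.g.
# eta_t = eta_0 * np.exp(-t / tau); sigma_t = sigma_0 * np.exp(-t / tau)
```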
As discussed above, the key to updating the weights while training an SOFM is determining the Euclidean distance between an input attribute and a current weight of a neuron, defined as $\epsilon = (x_i - w_{ij})^2$. While this can be computed using conventional computing methods, for a large SOFM with millions of nodes, this would require retrieving millions of weights from memory, performing the required calculations to determine the updated weight for each node, and then storing the updated weights in memory. The amount of time required to read and write so much data to the memory during each iteration of updating the weights would cause the SOFM to train very slowly.
As such, in embodiments disclosed herein, rather than storing the weights associated with each node in a memory, each node is represented by a pair of field-effect transistors (FETs). The saturation drain current (IDS) of a long-channel FET is given by:

$$I_{DS} = \frac{\mu \, C_{ox}' \, W}{2L} (V_{GS} - V_T)^2$$

where W is the width, L is the length, Cox′ is the oxide capacitance per unit area, μ is the effective carrier mobility, VGS is the gate voltage, and VT is the threshold voltage. The values of W, L, Cox′, and μ are constant once a FET is manufactured. As such, IDS is proportional to (VGS−VT)^2, which is the Euclidean error between the gate voltage VGS and the threshold voltage VT.
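As an illustrative software model of this square-law relation (the device parameter values below are arbitrary defaults, not measured values), including the condition that the FET must be conducting:

```python
def saturation_drain_current(v_gs, v_t, w=1e-6, l=1e-6, c_ox=1e-2, mu=0.04):
    """Long-channel saturation current I_DS = (mu*C_ox'*W / (2L)) * (V_GS - V_T)^2.

    The device parameters (w, l, c_ox, mu) are fixed at manufacture, so the
    measured current is simply proportional to the squared error
    (V_GS - V_T)^2 between the gate voltage and the threshold voltage.
    """
    k = mu * c_ox * w / (2.0 * l)  # constant once the FET is manufactured
    # An n-channel FET only conducts (and thus only reports the error)
    # when V_GS >= V_T.
    return k * (v_gs - v_t) ** 2 if v_gs >= v_t else 0.0
```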
Accordingly, in embodiments disclosed herein, FETs are used to represent nodes of an SOFM, with the input attribute xi of the SOFM corresponding to the gate voltage VGS of the FET, and the weight of a neuron of the SOFM corresponding to the threshold voltage VT. Then, measuring the saturation drain current IDS indicates the Euclidean error between the gate voltage VGS and the threshold voltage VT, which corresponds to the Euclidean error between the input attribute and the neuron weight of the SOFM. As such, during training of the SOFM, the weights can be updated by modifying the threshold voltage VT, without needing to access a memory. This may allow the SOFM to be trained significantly faster than training the SOFM using traditional computing techniques.
In traditional FETs, the threshold voltage VT is fixed by doping when a FET is manufactured. However, certain FETs are reconfigurable such that the threshold voltage VT can be set by applying a bias. In particular, ferroelectric field-effect transistors (FeFETs) are a modification of a metal-oxide-semiconductor field-effect transistor (MOSFET) device with a ferroelectric material integrated in the gate stack. An example FeFET 100 is shown in FIG. 1.
Ferroelectric materials are a subset of dielectric materials that exhibit a permanent induced polarization in the presence of a sufficient externally applied electric field. This polarization arises due to polarization of re-orientable unit cells which organize into polarized ferroelectric domains resulting in a net polarization across the material. By integrating a multi-domain ferroelectric material on the gate stack, the effective threshold voltage VT of the FET can be modulated by polarizing the ferroelectric material. As such, the ability to induce a permanent polarization of the ferroelectric material results in a FeFET with a reprogrammable threshold voltage VT.
Because FeFETs have a reprogrammable threshold voltage VT, they can be used to represent nodes of an SOFM. When the weight of a node is to be updated, the saturation drain current IDS of the FeFET can be measured to determine the Euclidean distance needed to determine the updated weight, and the threshold voltage VT can be reprogrammed by applying a bias voltage to effectively store the new weight as the reprogrammed threshold voltage VT. While embodiments disclosed herein utilize FeFETs as nodes of an SOFM, it should be understood that in other examples, other types of reconfigurable FETs may be used instead.
While the saturation drain current IDS of a FeFET is proportional to the Euclidean distance between the gate voltage VGS and the threshold voltage VT, this requires that the FeFET be conducting. Therefore, an n-channel FeFET only indicates the Euclidean error when VGS≥VT. Accordingly, in embodiments disclosed herein, a dual n-channel FeFET synapse is utilized, where each node of an SOFM is represented by a pair of FeFETs. One FeFET computes the Euclidean error when VGS≥VT and the other FeFET computes the Euclidean error when VGS<VT.
Turning now to FIG. 2, an example synapse 200 comprising a pair of n-channel FeFETs is schematically depicted. As shown in FIG. 2, the synapse 200 comprises a first n-channel FeFET 202 connected in parallel with a second n-channel FeFET 204. The first n-channel FeFET 202 receives an input voltage VX as its gate voltage and a weight voltage VW as its threshold voltage, while the second n-channel FeFET 204 receives the weight voltage VW as its gate voltage and the input voltage VX as its threshold voltage.
When the gate voltage is greater than or equal to the threshold voltage, the current flows through the first n-channel FeFET 202, and when the gate voltage is less than the threshold voltage, the current flows through the second n-channel FeFET 204. Accordingly, the drain current measures the Euclidean error in either situation. Therefore, by applying the voltages as discussed above and measuring the saturation drain currents of the FeFETs 202, 204, the Euclidean distance needed to compute the updated weight values may be obtained. The updated weight value for the node may be computed, and the updated weight value may be stored in the synapse 200 by modifying the threshold voltages of the FeFETs 202, 204, as discussed in further detail below.
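The complementary behavior of the two FeFETs may be modeled in software as follows (a simplified sketch; k stands in for the constant device prefactor, and the voltage assignments follow the description above):

```python
def synapse_error_current(v_x, v_w, k=1.0):
    """Drain current of the dual n-channel FeFET synapse.

    FeFET 1: gate = V_X, threshold = V_W -> conducts when V_X >= V_W.
    FeFET 2: gate = V_W, threshold = V_X -> conducts when V_W >  V_X.
    At most one FeFET conducts, so the summed current always reports the
    Euclidean error k * (V_X - V_W)^2 regardless of sign.
    """
    i_fet1 = k * (v_x - v_w) ** 2 if v_x >= v_w else 0.0
    i_fet2 = k * (v_w - v_x) ** 2 if v_w > v_x else 0.0
    return i_fet1 + i_fet2
```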
In some examples, the first n-channel FeFET 202 and the second n-channel FeFET 204 may be replaced by a first and second p-channel FeFET. In these examples, the first p-channel FeFET receives the weight voltage VW as its gate voltage and the input voltage minus a source voltage, VX−VDD, as its threshold voltage. Furthermore, in these examples, the second p-channel FeFET receives the input voltage VX as its gate voltage and the weight voltage minus the source voltage, VW−VDD, as its threshold voltage.
While the example synapse 200 of FIG. 2 comprises two n-channel FeFETs, FIG. 3 depicts an example synapse 300 comprising two p-channel FeFETs, as described above.
In the examples of FIGS. 2 and 3, the saturation drain current of each FeFET is proportional to (VGS−VT)^α, where the value of α is set by device characteristics fixed during manufacturing.
In the illustrated example, when α has a value of 2, the synapses 200 and 300 compute Euclidean distance, which may be used to train an SOFM, as disclosed herein. However, in other examples, different values of α may be used to compute other distance metrics, which may be used to implement other data analytics techniques. In addition, an SOFM in which Euclidean error is calculated using a value for α of less than 2 may also be used, which may be preferable for certain applications.
Furthermore, Euclidean error may be used in applications other than SOFMs. For example, many machine learning or data analysis techniques utilize Euclidean error or similar error metrics. As such, the synapses 200 and 300 disclosed herein may be used in a variety of applications, and the value of α may be tuned for different applications during the manufacturing process of the particular FETs to be used.
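A software analogue of this tunable metric is straightforward; note that α below is a free parameter, whereas in hardware it would be fixed when the FETs are manufactured:

```python
def alpha_error(v_gs, v_t, alpha=2.0):
    """Generalized per-attribute error |V_GS - V_T|**alpha.

    alpha = 2 yields the squared Euclidean error used for SOFM training;
    other values of alpha yield other distance metrics, e.g. alpha = 1
    gives a Manhattan-style (absolute) error.
    """
    return abs(v_gs - v_t) ** alpha
```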
Turning now to FIG. 4, an example SOFM 400 implemented in hardware is schematically depicted. In the example of FIG. 4, the SOFM 400 comprises a two-dimensional grid of synapses, with each synapse corresponding to a neuron of the SOFM 400 and comprising two n-channel FeFETs, as in the example synapse 200 of FIG. 2.
The SOFM 400 also includes a BMU selection circuit 422 and a best matching input (BMI) neuron labeling circuit 424. The BMU selection circuit 422 may select a best matching unit (BMU), as disclosed herein. In particular, the BMU selection circuit 422 may be implemented using a complementary metal-oxide-semiconductor (CMOS) logic circuit that identifies the neuron whose capacitor is last to charge to logic 1. The selection of the BMU is an important part of training the SOFM 400, as it identifies the location of the weight update, and thus the clusters, for competitive learning. In addition, after the SOFM 400 has been trained, it may be used to classify input data. In some examples, a Gaussian distribution is applied to offset each neuron's error, due to device and material variability, to produce an effective error.
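The charging-race behavior of the BMU selection circuit 422 may be approximated by the following software model; the capacitance, logic-1 voltage, and noise level are illustrative assumptions:

```python
import numpy as np

def select_bmu(neuron_currents, c=1e-12, v_logic1=0.5, noise_sigma=0.0, rng=None):
    """Behavioral model of the BMU selection circuit.

    Each neuron's summed drain current charges a capacitor; the neuron whose
    capacitor is last to charge to logic 1 carries the least total Euclidean
    error and is selected as the BMU. A Gaussian offset models device and
    material variability, producing an "effective error".
    """
    rng = rng or np.random.default_rng(0)
    effective = neuron_currents + rng.normal(0.0, noise_sigma, len(neuron_currents))
    t_charge = c * v_logic1 / np.maximum(effective, 1e-18)  # t = C*V/I per neuron
    return int(np.argmax(t_charge))  # last to reach logic 1 = least current = BMU
```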
The SOFM 500 of FIG. 5 is similar to the SOFM 400 of FIG. 4, but further includes reprogrammable interconnects 510, 512, 514, which may be implemented using resistive random-access memory (RRAM), allowing neurons to be dynamically grown as the SOFM 500 is trained, as discussed in further detail below.
Referring back to FIG. 6, the summed drain current of each neuron charges a capacitor 600 and a current input capacitor 602.
To account for charge sharing, the output of the current input capacitor 602 is scaled by a voltage divider circuit 606. Four transistors 608, 610, 612, and 614 are used to reset and charge the capacitors 600, 602, as shown in FIG. 6.
Since the voltage across the capacitor 602 is proportional to the total drain current of the neuron, or total Euclidean error, the BMI neuron labeling circuit 424 is able to signal when a new BMI has been found. This can be used to label each of the neurons by updating a neuron's label, upon every instance of a new BMI being detected, to the label of the new BMI. This allows the entire SOFM to be labeled efficiently, enabling real-time dynamic labeling of the SOFM as the labeled dataset changes.
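This labeling scheme may be illustrated with a brief sketch; W holds the neuron weight vectors, and the labeled arrays stand in for the labeled dataset:

```python
import numpy as np

def label_neurons(W, labeled_x, labeled_y):
    """Label every neuron with the label of its best matching input (BMI).

    A neuron's label is updated whenever a labeled input yields a new lowest
    total Euclidean error, mirroring the dynamic relabeling described above.
    """
    labels = []
    for w in W:
        errors = ((labeled_x - w[None, :]) ** 2).sum(axis=1)
        labels.append(labeled_y[int(np.argmin(errors))])
    return labels
```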
At step 706, it is determined whether the summed saturation drain currents are greater than a reference current produced by the FeFETs 448, 450, 452 in the example of FIG. 4.
At step 712, it is determined whether a potential BMU is 1 (indicating that a node is a candidate BMU). If the potential BMU is 0, then at step 718, the value of the counter 470 is increased by 1, and at step 722, the value of k is compared to k0. If the value of k is equal to k0, the initial value of k, then at step 726, the BMU threshold is increased. Control may then return to step 700 so that another input can be received.
If, at step 712, the potential BMU is 1, then at step 714, the k-th neuron is selected as the BMU, and at step 716, the BMU threshold is decreased. At step 720, it is determined whether the saturation drain current of the BMU is greater than a growth threshold. If the saturation drain current of the BMU is greater than the growth threshold, then at step 724, a neuron is grown. If the saturation drain current of the BMU is not greater than the growth threshold, then control passes to step 728.
At step 728, the value of VA,j is computed by a reprogrammable interconnect such as reprogrammable interconnects 472, 474, 510, 512, 514, and at step 730, the value of VT,ij is updated. At step 734, it is determined whether the BMU potential of node j and the BMU potential of node j+1 are both 1. If not, then at step 736, the RRAM is decayed. If so, the RRAM is set to a low-resistance state (LRS). Then, at step 740, the BMUs are cleared, and control returns to step 700 to receive the next input.
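The overall flow may be summarized with the following toy, software-only model; the threshold adjustments, growth rule, and uniform weight update are simplifying assumptions standing in for the circuit behaviors (reference currents, RRAM interconnects, and programming pulses):

```python
import numpy as np

def training_step(x, W, state):
    """One pass of the training flow, heavily simplified for illustration."""
    currents = ((x[None, :] - W) ** 2).sum(axis=1)  # summed drain currents ~ errors
    # Compare each neuron's current against the BMU threshold to flag candidates;
    # drain current is proportional to error, so low current marks a candidate.
    candidates = currents < state["bmu_threshold"]
    if not candidates.any():                        # all k scanned without a BMU
        state["bmu_threshold"] *= 1.1               # raise the BMU threshold
        return None, W                              # return for the next input
    k = int(np.argmin(currents))                    # select the BMU
    state["bmu_threshold"] *= 0.9                   # lower the BMU threshold
    if currents[k] > state["growth_threshold"]:     # growth check
        W = np.vstack([W, x])                       # grow a neuron at the input
    W = W + state["eta"] * (x[None, :] - W)         # reprogram threshold voltages
    return k, W                                     # clear BMUs, next input
```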
At step 806, a best matching unit is selected. In particular, the Euclidean errors are accumulated to compute a total error between the input and each synapse or neuron. The BMU selection circuit 422 then selects the synapse with the least total Euclidean error as the best matching unit, as discussed above. Then, at step 808, the weights of the neurons are updated by updating the threshold voltages of the FeFETs of each synapse. In particular, the weight of each neuron is updated based on how far it is from the best matching unit.
In the illustrated example, the weights are updated by applying programming pulses to the gate bias of the FeFETs of each synapse to modify their threshold voltages. In some examples, threshold voltage programming circuits 430, 432, 434, 436, 438, 440, 442, 444, and 446 apply the programming pulses to update the weights. When the synapses comprise two n-channel FeFETs, as in the example of FIG. 2, the threshold voltages of both FeFETs of each synapse may be reprogrammed in this manner.
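A simple software model of pulse-based threshold programming is shown below; the fixed per-pulse shift is an assumed, simplified response of the ferroelectric layer, not a characterized device behavior:

```python
def program_threshold(v_t_current, v_t_target, pulse_step=0.05, max_pulses=100):
    """Store an updated weight by pulsing the FeFET gate.

    Each programming pulse partially polarizes the ferroelectric layer,
    shifting V_T by roughly `pulse_step` volts per pulse (an assumed linear
    pulse response). Pulses are applied until V_T is within one step of the
    target or the pulse budget is exhausted.
    """
    pulses = 0
    while abs(v_t_target - v_t_current) > pulse_step and pulses < max_pulses:
        v_t_current += pulse_step if v_t_target > v_t_current else -pulse_step
        pulses += 1
    return v_t_current, pulses
```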
At step 810, it is determined whether additional inputs are available. If additional inputs are available (Yes at step 810), then control returns to step 800 to apply the next set of input attributes. Alternatively, if additional inputs are not available (No at step 810), then the method of FIG. 8 may end.
The disclosed SOFM algorithm was trained and tested on a COVID-19 chest x-ray dataset comprising 148 images compressed to 100×100 pixels each. The COVID-19 chest x-ray dataset was split into a training dataset of 118 images and a testing dataset of 30 images, with samples from all classes present in both sets. The COVID-19 chest x-ray dataset consisted of two classes: healthy subjects (class 0) and subjects diagnosed with COVID-19 (class 1).
After training the SOFM for 30 epochs, the training process was halted to evaluate the inferencing accuracy of the architecture on unobserved samples. The SOFM neurons were labeled according to the process described herein, with each neuron labeled with the label of the best matching input in the labeled dataset. The labeled dataset was a fraction of the training dataset that retained its labels, while the remainder of the training dataset was treated as unlabeled. The entirety of the testing dataset was labeled so that the inferencing accuracy of the disclosed SOFM could be evaluated. The SOFM was evaluated using labeled datasets ranging from 1% to 5% of the training dataset, with the labeled dataset having representation from each class.
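The evaluation protocol may be summarized in software as follows (a sketch; the array names are illustrative): neurons are labeled from the labeled subset, and each held-out test sample is then classified with the label of its best matching neuron.

```python
import numpy as np

def inference_accuracy(W, labeled_x, labeled_y, test_x, test_y):
    """Label neurons from the labeled subset, classify test samples, and
    report the fraction of correct predictions."""
    # Label each neuron with the label of its best matching input (BMI)
    neuron_labels = np.array([
        labeled_y[int(np.argmin(((labeled_x - w) ** 2).sum(axis=1)))] for w in W
    ])
    # Classify each test sample with its best matching neuron's label
    predictions = np.array([
        neuron_labels[int(np.argmin(((W - x) ** 2).sum(axis=1)))] for x in test_x
    ])
    return float((predictions == test_y).mean())
```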
The disclosed SOFM was able to successfully separate the clusters of healthy and COVID-19 diagnosed chest x-rays. The separation of clusters can be seen in FIG. 9.
It should now be understood that embodiments described herein are directed to a ferroelectric field-effect transistor-based approach for Euclidean distance calculation in neuromorphic hardware. By utilizing synapses comprising pairs of FeFETs as nodes of an SOFM network, the weights of the SOFM may be stored as the threshold voltages of the FeFETs. Accordingly, no access to an external memory is required. Furthermore, the Euclidean error needed to train the SOFM can be obtained directly from the saturation drain currents of the FeFETs. Therefore, the disclosed techniques can greatly simplify and accelerate the training and use of SOFM networks compared to traditional computing methods.
This application claims priority to U.S. Provisional Application No. 63/431,219 filed on Dec. 8, 2022, which is incorporated herein by reference in its entirety.
The present invention was made with Government support under Contract Nos. CCF-1718428 and ECCS-1926465 awarded by the National Science Foundation. The Government has certain rights in the invention.