The present application claims priority from Japanese application JP 2019-210549, filed on Nov. 21, 2019, the contents of which are hereby incorporated by reference into this application.
The present invention relates to artificial intelligence or its machine learning technology.
Artificial intelligence is a technology that allows a computer to perform processes or a robot to operate based on a mathematical model called a neural network. The artificial intelligence is notable for its capability to perform processes or actions characteristic of humans. It is necessary to appropriately adjust parameters (also called weights) inside an artificial neural network according to the processes or actions so that the artificial intelligence can perform such processes or actions.
As the computer or the robot is required to perform more complicated processes or actions, the artificial neural network needs to be more complicated, which increases the number of parameters to be adjusted. The time required to numerically acquire an optimal parameter value increases exponentially with the number of parameters. Therefore, the capability to acquire optimal parameters in a shorter time is one of the important issues for the development of artificial intelligence.
Solutions to this issue include the improvement of optimal value search algorithms and the development of dedicated hardware based on a GPU (Graphics Processing Unit). However, an iterative improvement method for respective parameters inevitably requires repetitive trials that increase the calculation time to find an optimal value.
There is also proposed a concurrent improvement method for all parameters. An example system uses an electronic circuit including a memristor as an electric resistance element that stores the amount of passed current (see Japanese Unexamined Patent Application Publication No. 2018-521397, US 2015/0278682 A1, and X. Wu, et al. “A CMOS Spiking Neuron for Brain-Inspired Neural Networks with Resistive Synapses and In-Situ Learning,” IEEE Transactions on Circuits and Systems II: Express Briefs, 62(11), 1088-1092 (2015)).
According to the method described in Japanese Unexamined Patent Application Publication No. 2018-521397, US 2015/0278682 A1, or X. Wu, et al. “A CMOS Spiking Neuron for Brain-Inspired Neural Networks with Resistive Synapses and In-Situ Learning”, IEEE Transactions on Circuits and Systems II: Express Briefs, 62(11), 1088-1092 (2015), a circuit is supplied with pulse signals representing an input value and a resulting output value at that time. Then, the resistance value of the memristor varies with the supplied current. An optimal parameter can be found by measuring the final memristor resistance value. However, it is difficult to find a material that is known to function as memristors and causes a resistance change to be large enough to make this concept fit for practical use.
There is a need for a technique that can quickly find an optimal parameter for the neural network based on other approaches.
According to a preferred aspect of the present invention, an electronic circuit includes a quantum dot, a capacitance portion, a current portion, and a current adjustment portion. In this circuit, the quantum dot includes a first electrode, a second electrode, and a third electrode. The first electrode is connected to a first potential. The second electrode is connected to a first current source. The third electrode is connected to a second current source. The current portion discharges current from the second electrode or supplies current to the second electrode. The current adjustment portion adjusts a current of the current portion and outputs a parameter to adjust the current.
According to a more specific configuration, an electron or a hole stably flows from the first potential to the first electrode and the second electrode via the quantum dot. A non-linear relationship is maintained between the current amount for electron or hole flowing between the quantum dot and the second electrode and the current amount for electron or hole flowing between the quantum dot and the third electrode.
According to a more specific configuration, the current adjustment portion determines the current amount Iw for the current portion based on a relational expression of Iw=w1ix1+w2ix2+ . . . +wnixn+b, where Iw denotes the current amount for the current portion, ix1 through ixn denote current values of the first current source, w1 through wn and b denote the parameters.
According to another preferred aspect of the present invention, a neural network is configured as a multi-layer network by connecting multiple electronic circuits to form multiple stages.
According to yet another preferred aspect of the present invention, a learning method of the above-described neural network allows each of the electronic circuits to perform a first step of supplying the first current source with a current value corresponding to a problem of training data; a second step of supplying the second current source with a current value corresponding to a solution of training data; a third step of outputting the parameter; and a fourth step of recording the parameter.
It is possible to quickly find an optimal parameter for the neural network.
Embodiments of the present invention will be described in further detail with reference to the accompanying drawings. However, the present invention is not interpreted based exclusively on the contents of the embodiments described below. It is further understood by those skilled in the art that various modifications may be made in the specific configurations without departing from the spirit and scope of the present invention.
In the configurations of the invention described below, the same portions or portions having similar functions use the same reference numerals in different drawings, and redundant description may be omitted.
When there is a plurality of elements having the same or similar functions, the same reference numeral may be given different additional characters. However, additional characters may be omitted when there is no need to make a distinction among the elements.
The notations such as “first,” “second,” and “third” in this specification, for example, are used to identify composing elements and do not necessarily limit the number of items, order, or contents thereof. A number to identify a composing element is used for each context. A number used in one context does not necessarily indicate the identical configuration in other contexts. A composing element identified by a given number may also function as a composing element identified by another number.
Positions, sizes, shapes, and ranges of respective configurations shown in the drawings, for example, may not represent actual ones to facilitate understanding of the invention. Therefore, the present invention is not necessarily limited to the positions, sizes, shapes, and ranges disclosed in the drawings, for example.
A neural network is broadly divided into a linear transformation part and a non-linear transformation part. The linear transformation represents that output is equal to the linear transformation of input. This equality relationship can be associated with Kirchhoff's current law in terms of circuits. A difference between both, if any, can be found by an increase or decrease in the node potential. An increase or decrease in the potential, if any, is balanced by adjusting a coefficient for the linear transformation.
In the present embodiments, the non-linear transformation uses electric conduction properties of a quantum dot (QD). A combination of these can configure an electronic circuit that works similarly to a neural network. The use of this circuit can find an optimal parameter for the neural network.
The inventors fabricated a special artificial neuron element that autonomously supplies a connection strength when input and output are given. The inventors conceived that a neural network composed of this neuron element may be able to find a connection strength without searching by trial and error. To realize this idea, the inventors devised a special artificial neuron element in which, when input and output are supplied, an appropriate connection strength arises autonomously through the natural tendency of the system to stabilize its energy. A conventional artificial neuron element determines an output from the input and the connection strength, and thus differs from the artificial neuron element devised by the inventors in both function and usage.
The following embodiments will describe the creation of an artificial neuron element that autonomously causes the connection strength to be an optimal value when input and output are given; and a technique of networking the artificial neuron elements. The embodiments use the non-linear electric conduction of a quantum dot. The quantum dot is a microscopic semiconductor or metal structure typically sized at several tens of nanometers. The quantum dot features non-ohmic resistivity allowing a current and a voltage to be disproportionate while an ordinary conductor features ohmic resistivity allowing a voltage and a current to be proportionate. The quantum dot is reported in M. Sugawara, “Self-Assembled InGaAs/GaAs Quantum Dots,” Semiconductors and Semimetals, Vol. 60 (1999), for example.
The use of a configuration as a combination of the quantum dot, the capacitance, and power supply makes it possible to design the input-output relationship so that a non-linear function represents the relationship between input current and output current. The charging energy of the capacitance is unstable under the condition of an input-output relationship that deviates from the non-linear function. The energy stabilization effect varies the input-output relationship to conform to the designed nonlinear function. Then, there autonomously occurs a flow of discharging a current to the outside or conversely supplying a current from the outside. This current can be associated with the connection strength of the artificial neuron. The measurement of the current makes it possible to find the optimal connection strength corresponding to input and output.
The learning system management portion 101 includes a storage 102 as a storage device to store a collection of datasets to be learned. The dataset is training data composed of a set of problem and solution data, for example. The problem is the input to the neural network. The solution is the expected output from the neural network.
Each stored dataset D is converted into electric signal S by the data converter 103 and is periodically transmitted from an output device of the computer to the electronic circuit 104. The data converter 103 can be implemented as software by allowing the processing device to execute a program stored in the storage device, for example. The data converter 103 can be also configured as hardware including comparable functions.
When periodic electric signal S is input to the electronic circuit 104, weight W of the neural network output from the electronic circuit 104 starts to chronologically change. The change gradually decreases. When the change becomes small, the data converter 105 converts weight W into digital data. The storage 106 stores the digital data. Like the data converter 103, the data converter 105 can be configured as software or hardware. As a result, weight W of the learned neural network is stored.
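The readout condition described above (weight W is recorded once its change becomes small) can be sketched as follows. This is a minimal illustration only: the function name, the sampled trace, and the tolerance value are hypothetical, and the source does not specify how "small" is judged.

```python
def first_settled_sample(samples, tolerance):
    """Return (index, sample) for the first sample whose largest component
    changed by less than `tolerance` relative to the previous sample.

    `samples` is a time-ordered list of weight vectors. The threshold test is
    an assumed criterion, not one taken from the specification.
    Returns None if no sample satisfies the test.
    """
    previous = None
    for index, sample in enumerate(samples):
        if previous is not None:
            change = max(abs(a - b) for a, b in zip(sample, previous))
            if change < tolerance:
                return index, sample
        previous = sample
    return None


# Hypothetical trace of two output values sampled at successive times.
trace = [[0.0, 1.0], [0.15, 1.2], [0.22, 1.27], [0.245, 1.288], [0.247, 1.29]]
settled = first_settled_sample(trace, tolerance=0.01)
```

In this sketch the value returned in `settled` plays the role of the data that the data converter 105 would digitize and the storage 106 would store.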
When the neural network operates, inputting problem data to the neural network weighted by stored weight W outputs a required answer.
The current control portion 201 controls the current amount for the current portions 202-1 through 202-N and 203-1 through 203-N based on electric signal S input from the learning system management portion 101. According to the present embodiment, electric signal S results from a voltage change. Therefore, the current control portion 201 includes a function of converting electric signal S based on the voltage into electric signals Ix1 through IxN and Iy1 through IyN based on the current. The conversion function is unnecessary if electric signal S results from a current change. Electrical signals Ix1 through IxN correspond to problem data of dataset D and are input to the neural network. Electric signals Iy1 through IyN correspond to solution data of dataset D and provide expected output from the neural network.
Generally, there are multiple elemental components 204 that correspond to nodes of the neural network. Input-output terminals of each elemental component 204 are connected to each other, configuring a multi-stage neural network as a whole. According to the present embodiment, the elemental components 204 are arranged in a matrix of N×N′. In the following description, the elemental components 204 may be described as elemental components “1, 1” through “N, N′” as illustrated in
The current portions 202-1 through 202-N and 203-1 through 203-N include input-side current portions 202-1 through 202-N and output-side current portions 203-1 through 203-N. The input-side current portions 202-1 through 202-N are supplied with electric signals Ix1 through IxN corresponding to problems in dataset D. The output-side current portions 203-1 through 203-N are supplied with electric signals Iy1 through IyN corresponding to solutions in dataset D.
According to the present embodiment, each elemental component 204 outputs weight signal w and bias b that are transmitted from the electronic circuit 104 to the learning system management portion 101. Weight signals output from elemental component “N, N′” are represented as w1N, N′, w2N, N′ through wnN, N′. The weight signal determines the weight of n input signals input to elemental component “N, N′.” The output from the elemental component 204 is equal to the sum of weighted input signals plus bias b. As described above, each elemental component 204 may use a different number of terminals in the electronic circuit 104. Each elemental component 204 may use a different value for n. In
The quantum dot 301 features a nanoscale space region, the ability of electrons to enter and leave the space region by the tunnel effect, and the ability to control the entry and exit of electrons between the quantum dot and the outside. These features make it possible to design various electric characteristics. The quantum dot is fabricated using compound semiconductors, for example.
For example, suppose AlGaAs is doped with Si in a laminate structure of two types of compound semiconductor layers such as AlGaAs and GaAs. Then, the boundary between AlGaAs and GaAs forms a layer where conduction electrons, called two-dimensional electron gas, are accumulated. These electrons can move in the x-y plane but not in the z-direction. Namely, the electrons are confined in the z-direction. When a negative voltage is applied to the gate electrode provided on the surface of the AlGaAs/GaAs laminate structure, a wall of electrostatic potential can be formed in the plane of the two-dimensional electron gas layer immediately below the gate electrode. This enables the confinement in the x and y directions. A dent of electrostatic potential that confines the electrons corresponds to the quantum dot. The region of electrons spreading outside the quantum dot works as an electric terminal of the quantum dot.
Including the case of parasitic capacitance, the capacitance portions 305 and 306 may not explicitly exist in the circuit. The current adjustment portion 313 includes a variable adjustment portion 308 and a current control portion 309 and outputs adjusted variables. The electrodes 302, 303, and 304 may be provided as two-dimensional electron gas spreading outside the above-mentioned quantum dot, for example, and may or may not have a physical electrode structure.
The quantum dot 301 illustrated in
As an example of the above adjustment technique allows the thickness of the tunnel barrier between the quantum dot 301 and the electrode 302 in
Consequently, the voltage portion 312 stably supplies electrons or holes to the electrodes 302 and 303 via the quantum dot 301, making it possible to provide the non-linear relationship between the current amount for electron or hole flowing between the quantum dot 301 and the electrode 303 and the current amount for electron or hole flowing between the quantum dot 301 and the electrode 304.
The elemental component 204 illustrated in
The current adjustment portion 313 works as a feedback circuit that actively reduces a temporal variation of potential vi. If the current adjustment portion 313 is not provided, currents ix1 through ixn enter (or leave) the node indicated by potential vi at the left in the drawing and leave (or enter) the same at the right. Theoretically, vi is constant if the sum of these currents is zero. However, the sum of ix1 through ixn and iin generally does not become zero by itself. Therefore, if the current flowing into the node is excessive (or insufficient), the current adjustment portion 313 provides control to discharge (or supply) the current from the node so that the sum of currents becomes zero. Specifically, adjustable current Iw is added, and the sum of ix1 through ixn, iin, and Iw is used to zeroize the currents flowing to and from the node. A variation of potential vi becomes zero when the sum of currents flowing to and from the node becomes zero. Therefore, the current adjustment portion 313 reduces a variation of vi by zeroing the sum of currents input to and output from the node.
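The current balance described above can be illustrated with a short sketch. It assumes a sign convention in which positive currents flow into the node and models the node with a single hypothetical capacitance; the numerical values are illustrative and are not taken from the specification.

```python
def potential_drift_rate(currents_into_node, capacitance):
    """Rate of change of the node potential (V/s) for a node of the given
    capacitance, taking positive currents as flowing into the node
    (assumed sign convention)."""
    return sum(currents_into_node) / capacitance


def balancing_current(currents_into_node):
    """Current that must be supplied to (positive) or discharged from
    (negative) the node so that all currents sum to zero."""
    return -sum(currents_into_node)


# Illustrative branch currents in amperes: ix1, ix2, and the current leaving
# toward the quantum dot (entered with a negative sign).
branch = [0.2e-12, 0.6e-12, -0.5e-12]
c_node = 1e-15  # hypothetical node capacitance in farads

drift_before = potential_drift_rate(branch, c_node)
iw = balancing_current(branch)
drift_after = potential_drift_rate(branch + [iw], c_node)
```

Here `drift_before` is nonzero because the branch currents do not cancel, while `drift_after` is zero to within floating-point rounding once the balancing current is included.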
Weight w takes some value even at each time until the temporal variation in potential vi is reduced. It is formally possible to assign weights w to parameters of the neural network. As a result, the input-output relation y=f (x) of the neural network is formed but does not satisfy the required input-output relation. The configuration of the present embodiment can reduce the temporal variation in potential vi and thereby acquire weight w in association with inputs to the node of the neural network.
The processing of the variable adjustment portion 308 is not limited to digital or analog processing. In
In the circuits in
$$v_{rsv} = c_a v_i + c_b$$
where ca and cb denote constants. Constant ca may be set to zero, in which case the voltage portion 312 outputs a constant voltage. The physical configuration of the quantum dot 301 and the voltage from the voltage control portion 311 are controlled so that the relationship between current iin and current iout follows a non-linear function. To provide this non-linear relationship, the voltage of the electrode 302 is set so that its magnitude, whether positive or negative, exceeds at least the product of the Boltzmann constant and the electron temperature in the electrode divided by the elementary charge.
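The voltage threshold stated above (Boltzmann constant times electron temperature, divided by the elementary charge) can be evaluated numerically as in the following sketch. The function names and the example temperature are illustrative assumptions; the physical constants are the exact SI values.

```python
BOLTZMANN_J_PER_K = 1.380649e-23       # Boltzmann constant (exact SI value)
ELEMENTARY_CHARGE_C = 1.602176634e-19  # elementary charge (exact SI value)


def thermal_voltage(electron_temperature_k):
    """Return k_B * T / e in volts for the given electron temperature."""
    return BOLTZMANN_J_PER_K * electron_temperature_k / ELEMENTARY_CHARGE_C


def exceeds_thermal_voltage(electrode_voltage_v, electron_temperature_k):
    """True if the magnitude of the electrode voltage is larger than
    k_B * T / e, regardless of the sign of the voltage."""
    return abs(electrode_voltage_v) > thermal_voltage(electron_temperature_k)


# Example at an assumed electron temperature of 300 K (about 25.9 mV).
v_threshold = thermal_voltage(300.0)
ok = exceeds_thermal_voltage(-0.1, 300.0)
```

The threshold scales linearly with temperature, so a colder electrode permits a smaller bias magnitude.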
The temporal variation in input-side voltage vi is described as in equation 1.
The current adjustment portion 313 adjusts Iw so that the temporal variation in vi becomes zero (to sufficiently decrease a potential variation of the electrode 303). As expressed in equation 2, this Iw depends on ix1 through ixn and constant current amount Io, making it possible to adjust current amount Iw based on weights w1 through wn as coefficients and bias b.
[Math 2]
$$I_w = w_1 i_{x1} + w_2 i_{x2} + \cdots + w_n i_{xn} + b I_0 \qquad (2)$$
If the temporal variation in vi is set to zero in equation 1, equation 3 is derived from equation 2.
[Math 3]
$$i_{in} = (w_1 - 1) i_{x1} + (w_2 - 1) i_{x2} + \cdots + (w_n - 1) i_{xn} + b I_0 \qquad (3)$$
Equation 3 is a relational expression that represents iin by using a linear transformation of ix1 through ixn. The linear transformation coefficients such as (w1−1) through (wn−1) and b are comparable to parameters to be found for the neural network. Generally, the parameters to be found are described as (w1−A) through (wn−A) and b, where A is a constant.
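As a numerical illustration of equations 2 and 3, the following sketch evaluates both expressions for the same weights and inputs. The function names and all numerical values are hypothetical. The check at the end relies only on the algebraic identity between the two right-hand sides, namely that the right-hand side of equation 3 equals Iw minus the sum of ix1 through ixn.

```python
def adjustment_current(weights, input_currents, bias, reference_current):
    """Equation 2: Iw = w1*ix1 + ... + wn*ixn + b*I0."""
    weighted = sum(w * ix for w, ix in zip(weights, input_currents))
    return weighted + bias * reference_current


def dot_input_current(weights, input_currents, bias, reference_current):
    """Equation 3: iin = (w1-1)*ix1 + ... + (wn-1)*ixn + b*I0."""
    shifted = sum((w - 1.0) * ix for w, ix in zip(weights, input_currents))
    return shifted + bias * reference_current


# Illustrative values only (currents in amperes).
weights = [0.5, -0.25]
inputs = [0.2e-12, 0.6e-12]
bias = 0.1
i0 = 1e-12

iw = adjustment_current(weights, inputs, bias, i0)
iin = dot_input_current(weights, inputs, bias, i0)
difference = iin - (iw - sum(inputs))
```

The quantity `difference` vanishes up to floating-point rounding, which reflects that the coefficients (w1−1) through (wn−1) and b carry the same information as w1 through wn and b.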
The function of the variable adjustment portion 308 will be described in detail. Damped vibration is generally described as the following equation 4, where x denotes the amount of displacement (extension of a spring, if any), t denotes the time, and a denotes a constant.
Suppose a differential equation of the form of equation 4 is designed to govern the temporal variation of a state. Then, dx/dt approaches 0 as t approaches infinity. When the control is provided according to the variable adjustment portion 308, potential vi can be described in the form of damped vibration like equation 4. The reason will be described below.
The equations for w1 through wn and b described in the variable adjustment portion 308 of
Differentiating both sides of equation 1 twice by t yields equation 6.
Differentiating both sides of the expression described in the current control portion 309 of
Equation 8 is derived from equations 5 through 7.
Equation 8 can be arranged in the form of the differential equation for damped vibration shown in equation 4. Namely, equation 8 can be expressed like equation 9.
When the variable adjustment portion 308 provides feedback control for w1 through wn and b, the potential vi varies with time as a damped vibration, so that dvi/dt approaches 0 as t approaches infinity. Therefore, the feedback control described in the variable adjustment portion 308 can reduce the potential variation and consequently acquire weight w.
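The decay of the time derivative in a damped vibration can be illustrated with a short numerical sketch. The equation integrated here, x'' = −a·x' − k·x, is a generic damped oscillator chosen for illustration; the stiffness constant k, the initial conditions, and the step size are assumptions and are not taken from equations 4 through 9.

```python
def simulate_damped_vibration(damping, stiffness, x0, v0, dt, steps):
    """Integrate x'' = -damping*x' - stiffness*x with the semi-implicit
    Euler method and return the final displacement and velocity.

    The equation form is an assumed generic damped oscillator, not the
    specific expression of the specification.
    """
    x, v = x0, v0
    for _ in range(steps):
        v += (-damping * v - stiffness * x) * dt  # update velocity first
        x += v * dt                               # then displacement
    return x, v


# Illustrative run: unit initial displacement, 30 time units of simulation.
x_final, v_final = simulate_damped_vibration(
    damping=1.0, stiffness=1.0, x0=1.0, v0=0.0, dt=0.001, steps=30000
)
```

After a sufficiently long time both `x_final` and `v_final` are close to zero, which corresponds to the potential variation dying out.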
The description below explains the principle based on which the calculation of current control portion 309 determines w1 through wn and b to zeroize the sum of currents ix1 through ixn, iin, and Iw. For example, see the upper right equation containing w1 on the left side in
The setting (equation) for Iw provided by the current control portion 309 in
Multiple datasets are input to the electronic circuit 104 (S901-1 through S901-n). If there are multiple training datasets, a possible solution is to chronologically change the amount of current Ix1 through Ixn and Iy1 through Iyn. For example, the current control portion 201 in
The above-described operation causes the charging energy of the elemental component 204 to vary with the time. If input and output are repeatedly modulated, the charging energy should behave to minimize its average. When there are many training datasets, the multiple sets of current amounts are switched at regular intervals. This operation is repeated Q times, allowing the values of weights w1 through wn and bias b to converge. When the values stabilize after a predetermined time, the learning may be completed by reading weights w1 through wn and bias b for the elemental components 204 (S903).
The second embodiment specifically describes a special case of the learning procedure according to the first embodiment.
First, the storage 102 transmits {0.2, 0.6}→{0.4, 0.6} to the data converter 103. The data converter 103 converts this value according to the circuit. Here, the data converter 103 is assumed to multiply all values by 10^-12. As a result, the data converter 103 outputs {0.2×10^-12, 0.6×10^-12}→{0.4×10^-12, 0.6×10^-12}.
The current control portion 201 is assumed to convert a received value into ampere. In this case, the current portions 202 and 203 respectively supply currents Ix1=0.2 pA, Ix2=0.6 pA, Iy1=0.4 pA, and Iy2=0.6 pA.
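The conversion from a stored dataset to supplied currents described above can be sketched as follows. The function name is hypothetical; the scale factor of 10^-12 and the interpretation of the scaled values as amperes follow the assumptions stated in this embodiment.

```python
SCALE = 1e-12  # multiplier assumed for the data converter 103


def to_currents(problem, solution, scale=SCALE):
    """Convert a dataset {problem} -> {solution} into input-side and
    output-side currents in amperes by uniform scaling."""
    input_currents = [x * scale for x in problem]
    output_currents = [y * scale for y in solution]
    return input_currents, output_currents


# Dataset {0.2, 0.6} -> {0.4, 0.6} from the description above.
ix, iy = to_currents([0.2, 0.6], [0.4, 0.6])
ix_in_picoamperes = [value / 1e-12 for value in ix]
```

With this scaling, the first entries of `ix` and `iy` correspond to Ix1 = 0.2 pA and Iy1 = 0.4 pA.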
The elemental components 204 output signals w11,1, w21,1, b1,1, w12,1, w22,1, b2,1, w11,2, w21,2, b1,2, w12,2, w22,2, and b2,2 that are input to the data converter 105. Here, the data converter 105 is assumed to multiply an input value by 1. The storage 106 stores data resulting from converting a signal after a lapse of tmeas seconds from the transmission of data from the storage 102.
The description below explains a case where the storage 102 stores two data sets of {0.2, 0.6}→{0.4, 0.5} and {0.9, 0.4}→{0.2, 0.2}. Similar to the above, the data converter 103 and the current control portion 201 determine currents to be supplied to the current portions 202 and 203. When there are multiple datasets, currents Ix1, Ix2, Iy1, and Iy2 are switched corresponding to the datasets at regular intervals, and this operation is repeated.
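The switching of datasets at regular intervals can be sketched as a simple schedule generator. The function name, the interval, and the repetition count are illustrative assumptions; only the two datasets are taken from the description above.

```python
def dataset_schedule(datasets, interval_s, repetitions):
    """Yield (start_time_s, problem, solution) tuples that cycle through
    `datasets` in order, switching every `interval_s` seconds, for
    `repetitions` full passes."""
    step = 0
    for _ in range(repetitions):
        for problem, solution in datasets:
            yield step * interval_s, problem, solution
            step += 1


datasets = [
    ((0.2, 0.6), (0.4, 0.5)),
    ((0.9, 0.4), (0.2, 0.2)),
]

# Hypothetical timing: switch every 0.5 s and repeat the pair twice.
schedule = list(dataset_schedule(datasets, interval_s=0.5, repetitions=2))
```

Each entry of `schedule` indicates which problem and solution values would set Ix1, Ix2, Iy1, and Iy2 during the corresponding interval.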
As seen from
The simulation of time characteristics on this circuit records values such as w11,1=0.247, w21,1=0.247, b1,1=1.29, w12,1=−0.754, w22,1=−0.754, b2,1=1.29, w11,2=0.0625, w21,2=−0.0216, b1,2=1.03, w12,2=0.048, w22,2=−0.00669, and b2,2=1.04.
The variables in
Function f (x) in
In this equation, e = ±1.602×10^-19 coulombs (the sign depends on whether the carriers are electrons or holes) denotes the elementary electric charge, and Γi and Γo denote constants. It is possible to set Γi and Γo by controlling the thickness of the tunnel barrier between the quantum dot 301 and the electrode 303 or 304.
Here, eΓi=1 and eΓo=1 are assumed in the above-described simulation of the time characteristics and the condition of the data converter 103. The above-described values are assigned to the neural network that is given {x1, x2}={0.2, 0.6} and then yields {y1, y2}={0.393, 0.512}. Meanwhile, the neural network is given {x1, x2}={0.9, 0.4} and then yields {y1, y2}={0.176, 0.233}. This proves the capability of acquiring the values approximate to training datasets {0.2, 0.6}→{0.4, 0.5} and {0.9, 0.4}→{0.2, 0.2}. As above, the electronic circuit according to the present embodiment can provide the required neural network.
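The closeness between the outputs reported above and the training targets can be quantified with a short sketch. The function name is hypothetical; the target and output values are those given in the preceding paragraph.

```python
def max_abs_error(target, output):
    """Largest absolute difference between corresponding components."""
    return max(abs(t - o) for t, o in zip(target, output))


# (target solution, reported network output) for each training dataset.
results = [
    ((0.4, 0.5), (0.393, 0.512)),
    ((0.2, 0.2), (0.176, 0.233)),
]

errors = [max_abs_error(target, output) for target, output in results]
worst_error = max(errors)
```

For these values the largest deviation is about 0.033, which occurs for the second dataset.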
According to the above-described embodiment, the neural network of the electronic circuit 104 is supplied with input electrical signals Ix1 through IxN and Iy1 through IyN dependent on dataset D of the training data. When the elemental component 204 including the quantum dot 301 illustrated in
According to the present embodiment, current Iw is given by equation 2. It is therefore possible to separately acquire weights w1 through wn and bias b for inputs ix1 through ixn, and thus to acquire the values of the parameters corresponding to the neural network configuration. Without the use of a memristor, it is possible to configure an electronic circuit that, when given an input and the corresponding output, yields weights for a neural network consistent with that situation. As a result, an optimal value can be found while all parameters in the neural network are improved concurrently.
Number | Date | Country | Kind |
---|---|---|---|
2019-210549 | Nov 2019 | JP | national |