Claims
- 1. A learning method for training a neural network under the control of error backpropagation, the network including neurons organized in successive layers, among which there are an input layer and an output layer, a state x_{j,l} of a neuron j in a layer l being determined according to the equation:
- x_{j,l} = \sum_i W_{ij,l} Y_{i,(l-1)};
- wherein:
- a) Y_{i,(l-1)} is an output potential of a neuron i in a preceding layer l-1, and
- b) W_{ij,l} is a synaptic coefficient, representative of a coupling from the neuron i to the neuron j, the neural network including a computer memory for storing a value of the synaptic coefficient to be retrieved for use after training, the method comprising the steps of:
- i) determining a value of a quantity E representative of a discrepancy between an actual result and a desired result at the output layer;
- ii) determining, for a particular neuron j in layer l, a value of a partial derivative g_{j,l}: g_{j,l} = \partial E / \partial x_{j,l};
- iii) producing a product of g_{j,l} and x_{j,l};
- iv) rescaling the product by a first factor if the partial derivative g_{j,l} and the state x_{j,l} have opposite polarities, for producing an increment \Delta W_{ij,l};
- v) rescaling the product by a second factor if the partial derivative g_{j,l} and the state x_{j,l} have equal polarities, for producing the increment \Delta W_{ij,l};
- vi) updating the value of the synaptic coefficient W_{ij,l} with the increment \Delta W_{ij,l}; and
- vii) storing the updated value in the computer memory.
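Read as an algorithm, steps i)-vii) amount to gradient-descent learning with a sign-dependent step size. A minimal Python sketch of one synaptic update follows the claim text literally, taking the increment proportional to the rescaled product g_{j,l} x_{j,l}; the factor values, the rate eta, and the descent sign are illustrative assumptions the claim does not fix.

```python
def update_synapse(w, g, x, first_factor=0.5, second_factor=1.0, eta=0.1):
    """One update of a synaptic coefficient W_ij,l per claim 1, steps iii)-vii).

    g : partial derivative g_j,l = dE/dx_j,l  (step ii)
    x : neuron state x_j,l
    first_factor, second_factor, eta : illustrative values, not from the claim
    """
    p = g * x                                    # step iii): product of g and x
    # steps iv)-v): opposite polarities -> first factor,
    # equal polarities -> (larger) second factor, cf. claim 6, element f)
    factor = first_factor if p < 0 else second_factor
    delta_w = -eta * factor * p                  # step vi): increment (sign assumed)
    return w + delta_w                           # step vii): caller stores this value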
- 2. A method as in claim 1, wherein the value of the synaptic coefficient W_{ij,l} is updated iteratively in a sequence of cycles, each including the steps i)-vii), and wherein before at least one particular cycle the value of one of said first and second factors is increased.
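Claim 2 only requires that one of the two factors be increased before at least one cycle; it does not say when or by how much. A hypothetical schedule, assuming a multiplicative increase at a fixed period (both assumptions):

```python
def train(run_cycle, num_cycles=100, second_factor=1.0, growth=1.5, every=10):
    # Hypothetical schedule for claim 2: grow the second factor before
    # selected cycles; the growth rule and the period are assumptions.
    for cycle in range(num_cycles):
        if cycle > 0 and cycle % every == 0:
            second_factor *= growth      # "before at least one particular cycle"
        run_cycle(second_factor)         # one pass through steps i)-vii)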
- 3. A method as in claim 1, comprising:
- a) iteratively updating the value of the synaptic coefficient W_{ij,l} in a sequence of cycles, each cycle including the steps i)-vii);
- b) in each cycle applying a sigmoid function F to the state x_{j,l} of the neuron j in the layer l for providing an output potential Y_{j,l}, the sigmoid function F having a steepness dependent on a value of a parameter T;
- c) before at least one particular cycle, increasing the steepness by changing the value of the parameter T.
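Claim 3 leaves the sigmoid F unspecified beyond its steepness depending on T. A sketch assuming the common logistic form F(x) = 1/(1 + e^{-x/T}), whose slope at the origin is 1/(4T), so that halving T doubles the steepness, which is one way to realize step c):

```python
import math

def output_potential(x, T=1.0):
    # Y_j,l = F(x_j,l); the logistic form below is an assumption, the
    # claim only requires a sigmoid whose steepness depends on T.
    return 1.0 / (1.0 + math.exp(-x / T))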
- 4. A method as in claim 1, wherein the determining of the quantity E as representative of said discrepancy comprises the steps of:
- a) determining, for each neuron of the output layer, a partial discrepancy between the output potential, obtained as a result of supplying an input example to the input layer, and a desired potential;
- b) determining if a polarity of the output potential and a polarity of a desired output potential are equal;
- c) forming weighted contributions to the quantity E by scaling each partial discrepancy by a polarity factor having a value between 0 and 1 if the potential polarities are equal.
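A sketch of steps a)-c) of claim 4, assuming a squared partial discrepancy and an illustrative polarity factor of 0.5; neither the error form nor the factor's value is fixed by the claim:

```python
def error(outputs, targets, polarity_factor=0.5):
    # Claim 4: down-weight each partial discrepancy by a factor in (0, 1)
    # when the actual and desired output potentials agree in polarity.
    # The squared-error form and the factor value are assumptions.
    E = 0.0
    for y, d in zip(outputs, targets):
        weight = polarity_factor if y * d > 0 else 1.0
        E += weight * (d - y) ** 2
    return E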
- 5. A method as in claim 1, wherein for each respective one of the successive layers l the increment \Delta W_{ij,l} is produced by rescaling the product by a respective further factor, the further factor for each one of the successive layers being smaller than the further factor of the preceding layer.
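Claim 5 only constrains the further factors to shrink from each layer to the next; the geometric decay below is an assumption. `factors[l]` would then serve as the respective further factor when producing \Delta W_{ij,l} in layer l.

```python
def layer_factors(num_layers, base=1.0, decay=0.5):
    # Claim 5 only requires each layer's further factor to be smaller
    # than the preceding layer's; the geometric decay is an assumption.
    return [base * decay ** layer for layer in range(num_layers)]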
- 6. A trainable neural network including:
- a) neurons functionally organized in successive layers, among which there are an input layer and an output layer, a state x_{j,l} of a neuron j in a layer l being determined according to the equation:
- x_{j,l} = \sum_i W_{ij,l} Y_{i,(l-1)};
- wherein:
- i) Y_{i,(l-1)} is an output potential of a neuron i in a preceding layer l-1, and
- ii) W_{ij,l} is a synaptic coefficient, representative of a coupling from the neuron i to the neuron j;
- b) a memory for storing a value of the synaptic coefficient to be retrieved for use after training;
- c) computing means, coupled to the memory and using error backpropagation, for determining a value of a component g_{j,l} of a gradient of an error function E in a state space according to: g_{j,l} = \partial E / \partial x_{j,l};
- d) a multiplier coupled to the computing means for outputting a value of a product g_{j,l} x_{j,l};
- e) a polarity checker coupled to the multiplier to determine a polarity of the product;
- f) scaling means coupled to the multiplier for, under control of the polarity checker, scaling the product by a first factor if the polarity is negative and by a second factor, which has a larger absolute value than the first factor, if the polarity is positive;
- g) updating means coupled to the scaling means and the memory for supplying an increment \Delta W_{ij,l} proportional to the scaled product, for updating the value of the synaptic coefficient W_{ij,l}, and for storing the updated value in the memory.
- 7. A neural network as in claim 6, wherein the computing means, the multiplier, the polarity checker, the scaling means, and the updating means are elements, and at least one of said elements is included in a general-purpose computer.
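Because claim 7 permits the elements of claim 6 to reside in a general-purpose computer, the dataflow of elements d)-g) can be sketched as plain functions. Only the polarity test and the relative magnitude of the two factors come from the claim; the factor values, the constant eta, and the descent sign are assumptions.

```python
def multiplier(g, x):
    # element d): product g_j,l * x_j,l
    return g * x

def polarity_checker(p):
    # element e): True when the product's polarity is positive
    return p >= 0

def scaling_means(p, first_factor=0.5, second_factor=1.0):
    # element f): first factor when negative, larger second factor when
    # positive; the factor values are assumptions
    return p * (second_factor if polarity_checker(p) else first_factor)

def updating_means(w, scaled_product, eta=0.1):
    # element g): increment proportional to the scaled product; sign and
    # eta are assumptions. The caller stores the result back in memory.
    return w - eta * scaled_product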
Priority Claims (1)
| Number | Date | Country | Kind |
| --- | --- | --- | --- |
| 89 07662 | Jun 1989 | FRX | |
Parent Case Info
This is a continuation of application Ser. No. 07/839,020, filed Feb. 18, 1992, now abandoned, which is a continuation of application Ser. No. 07/533,651, filed Jun. 5, 1990, now abandoned.
US Referenced Citations (2)
| Number | Name | Date | Kind |
| --- | --- | --- | --- |
| 4933872 | Vandenberg et al. | Jun 1990 | |
| 4994982 | Duranton et al. | Feb 1991 | |
Non-Patent Literature Citations (4)
- David E. Rumelhart, "Parallel Distributed Processing", Vol. 1, 1989.
- Philip D. Wasserman, "Neural Computing: Theory and Practice", Apr. 1989.
- Michel Verleysen and Paul G. A. Jespers, "An Analog VLSI Implementation of Hopfield's Neural Network", IEEE, 1989.
- Philip D. Wasserman, "Neural Networks, Part 2: What Are They and Why Is Everybody So Interested in Them Now?".
Continuations (2)
|  | Number | Date | Country |
| --- | --- | --- | --- |
| Parent | 839020 | Feb 1992 | |
| Parent | 533651 | Jun 1990 | |