The description relates to neural networks.
One or more embodiments may relate to neural networks for use in activity recognition in wearable devices, for instance.
Neural networks are good candidates for use in activity detection, for instance in wearable devices. A neural network can be embedded in a wearable, low-power system in order to perform processing tasks such as classification of incoming signals in order to detect an activity performed by the user (for instance: jogging, walking, running, biking, stationary state and so on).
Neural networks have formed the subject matter of extensive research, as witnessed, e.g., by:
Despite such an extensive activity, improved solutions are still desirable, for instance as regards one or more of the following aspects:
One or more embodiments contribute in providing such improved solution by means of a neural network having the features set forth in the claims that follow.
One or more embodiments may also concern a corresponding device (e.g., an activity recognition device), corresponding apparatus (e.g., a wearable apparatus, e.g., for sports and fitness activities) as well as a computer program product loadable in the transitory or non-transitory memory of at least one processing module (e.g., a computer) and including software code portions for executing the steps of the method when the product is run on at least one processing module. As used herein, reference to such a computer program product is understood as being equivalent to reference to a transitory or non-transitory computer-readable medium containing instructions for controlling the processing system in order to co-ordinate implementation of the method according to one or more embodiments. Reference to “at least one computer” is intended to highlight the possibility for one or more embodiments to be implemented in modular and/or distributed form.
The claims are an integral part of the disclosure as provided herein.
One or more embodiments may address the problem of classifying time-varying activities performed by a user based on accelerometer measurements provided by an on-body sensor, with accelerometer sensing possibly combined with gyroscope sensing.
One or more embodiments may provide a self-organizing neural network, namely a neural network capable of autonomously organizing connections of neurons (thus organizing network topology and neuron allocation) according to inputs fed thereto, with the capability of continuously learning from data and thus improving performance over time, for instance with the capability of adapting to the wearer of a wearable device.
One or more embodiments may provide a network capable of learning from time variance of data.
One or more embodiments may provide a network capable of performing, along with conventional supervised training, incremental unsupervised training on large unlabeled data sets, with the capability of evolving into a specialized network permitting more accurate classification.
One or more embodiments may be adapted for use in connection with human activity recognition data sets, with performance notably improved in comparison with other recurrent-based approaches and Convolutional Neural Networks (CNNs).
One or more embodiments will now be described, by way of example only, with reference to the annexed figures, wherein:
In the ensuing description, one or more specific details are illustrated, aimed at providing an in-depth understanding of examples of embodiments of this description. The embodiments may be obtained without one or more of the specific details, or with other methods, components, materials, etc. In other cases, known structures, materials, or operations are not illustrated or described in detail so that certain aspects of embodiments will not be obscured.
Reference to “an embodiment” or “one embodiment” in the framework of the present description is intended to indicate that a particular configuration, structure, or characteristic described in relation to the embodiment is comprised in at least one embodiment. Hence, phrases such as “in an embodiment” or “in one embodiment” that may be present in one or more points of the present description do not necessarily refer to one and the same embodiment. Moreover, particular conformations, structures, or characteristics may be combined in any adequate way in one or more embodiments.
The references used herein are provided merely for convenience and hence do not define the extent of protection or the scope of the embodiments.
Feed Forward Neural Networks (FFNNs) are exemplary of a first approach to neural networks including layers of interconnected neurons in a Directed Acyclic Graph (DAG), in which an input signal flows and subsequently activates or inhibits the units to which it is fed. Such networks do not permit inner feedback at any level and have no memory of previous (earlier) states. Also, FFNNs do not admit time-variant inputs: they sample, so to say, “snapshots” of a time series and perform classification by operating on a sort of “still image” of data. Consequently, such networks are hardly applicable to a context involving activities that are time-varying: in that case, classification results may be very poor, especially during transitions between different activities.
Another approach to neural networks involves so-called recurrent neural networks. These networks include layers of neurons admitting an inner feedback mechanism and back propagation of states. A major drawback of recurrent neural networks may lie in that such networks may prove hard to train (off line).
So-called reservoir computing is a branch of recurrent neural networks which addresses the complexity of training by introducing some simplifications. Reservoir computing uses large, randomly generated, sparse sets of neurons (called reservoirs) in order to process an input signal. An input signal flows in a reservoir stage and its dimensionality is expanded within that stage, with the goal of making it easier for the readout stage to perform classification of the expanded signal.
The diagram of
Echo state networks as exemplified in
In the diagram of
A major drawback of such an approach may lie in the difficulty in achieving high performance, e.g., due to the reduced freedom of the underlying model, with few parameters available to be tuned in order to improve performance. Such a drawback is confirmed by the poor accuracy shown in tests performed on available datasets.
Certain investigations concerning the idea of a self-organizing reservoir have focused, e.g., on Kohonen's self-organizing maps as a training model. Such an approach is limited by a fixed network topology (like a fishnet) which is unable to evolve and adapt to inputs, e.g., by using a different learning model. This eventually resulted in experiments limited to a few tests, without the possibility of performing in-depth analysis.
One or more embodiments may address the issues discussed in the foregoing by means of a self-organizing reservoir network which can be categorized as a recurrent neural network, that is a neural network that allows feedback loops with a memory of the previous (earlier) states.
Such an arrangement may include a pool of neurons and respective connections forming a dynamic reservoir stage DR (see, e.g.,
In one or more embodiments such a pool of neurons and their connections can be generated randomly and then trained via (unsupervised) machine learning in order to specialize the network, so that the network can react more effectively to input signals.
In one or more embodiments, training and configuration of the network may involve three different acts:
One or more embodiments may rely on a neuron module which can be regarded as a modified version of the neuron in an echo state network as discussed in the foregoing. In one or more embodiments, such a neuron module makes it possible to evaluate (numerically) a distance between the neuron and the signal fed to the neurons, with the neurons adapted to such signal(s).
A neural network according to one or more embodiments may include neurons according to the model exemplified in
Such a neuron model (“unit”) may lie at the basis of a self-organizing neural network embodying an array of weights representing the connections between a certain neuron and (all) other neurons in the network. Reference to “all” the neurons in the network indicates that “self-connection” of a neuron with itself may be included.
In the schematic representation of
By way of a (non-limiting) example of a possible use of one or more embodiments, one may consider the case where the input connections are used to map a signal from an accelerometer A (see
In one or more embodiments, the input connections of the neurons, used to map the accelerometer signal on the reservoir, may be encoded as a set of weights.
In order to create an operating network with, say, 100 neurons (this is again a purely exemplary value), an input can be generated, represented by a (100×3) matrix of weights Win (boldface representation of matrices is avoided herein for simplicity), each row in the matrix representing the connections that link each dimension of the input signal to the neuron.
In a similar way, the reservoir connections of the neurons may represent the weights of the connections of a neuron to all the units in the reservoir (possibly including the neuron/unit itself).
The neurons in the reservoir may be represented, in such an example, by a (100×100) weight matrix W, each row in the matrix representing the connections that link the neurons of the reservoir to that given neuron.
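By way of illustration, the random generation of such weight matrices may be sketched as follows (a minimal Python/NumPy fragment; the uniform weight range and the sparsity level are assumptions made for the example, the foregoing specifying only the matrix shapes and random, sparse generation):

```python
import numpy as np

# Fixed seed: as noted further on, all pseudo-random seeds can be
# explicitly controlled so that operation is deterministic.
rng = np.random.default_rng(42)

N = 100      # number of neurons in the reservoir (exemplary value)
N_dim = 3    # input dimensionality, e.g., a 3-axis accelerometer

# Input weights Win: a (100 x 3) matrix, each row holding the connections
# that link each dimension of the input signal to one neuron.
W_in = rng.uniform(-1.0, 1.0, size=(N, N_dim))

# Reservoir weights W: a (100 x 100) matrix, each row holding the
# connections that link all the units of the reservoir (including the
# unit itself) to a given neuron; generated sparse, as is customary
# in reservoir computing.
W = rng.uniform(-1.0, 1.0, size=(N, N))
sparsity = 0.9  # assumed sparsity level; the text only says "sparse"
W[rng.random((N, N)) < sparsity] = 0.0
```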
In one or more embodiments, a first act towards the development of a self-organizing reservoir neural network involves the definition of a new model to compute the activation of each neuron.
Throughout the following discussion:
The diagrams of
In the diagram of
The elements just described are thus exemplary of calculating the L2 norm of the two differences, namely the Euclidean distance between two vectors. Such an entity is representative of the distance between the input signals at time t and certain weights and the distance between the activation signals at time t−1 and certain weights.
The results of multiplication at 121, 122 are then added in a summation node 13, with the result of the summation fed to a stage 14 applying a non-linear (e.g., exponential e^(·)) function to provide a value ṽi.
The value ṽi thus obtained (see the transition from
In one or more embodiments, the level of activation of each neuron N may thus depend on the input signal at the current time instant x(t) and on the level of activation of the reservoir at the previous (earlier) instant v(t−1).
In one or more embodiments as exemplified in
It will be appreciated that, throughout this description, reference to Euclidean distances is merely exemplary and not limitative of the embodiments; one or more embodiments may involve using other types of distances: see, e.g., https://en.wikipedia.org/wiki/Distance (Mathematics).
Multiplication by the factors −α and −β is exemplary of the activation contributions of both Wiin and Wi being somehow “dampened,” e.g., before the overall contribution is computed at 14 as an exponential function of the sum computed in the summation node 13.
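By way of a non-limiting sketch, the activation just described may be expressed as ṽi = exp(−α·‖x(t) − Wiin‖ − β·‖v(t−1) − Wi‖); the following Python/NumPy fragment illustrates the computation under that assumption, with the α and β values chosen purely for the example:

```python
import numpy as np

def activation(x_t, v_prev, W_in, W, alpha=0.5, beta=0.5):
    """Distance-based activation of each reservoir neuron.

    For neuron i, the Euclidean distance between the input x(t) and its
    input weights W_in[i], and between the previous reservoir activation
    v(t-1) and its reservoir weights W[i], are dampened by -alpha and
    -beta, summed, and passed through an exponential non-linearity.
    The alpha and beta values are illustrative assumptions.
    """
    d_in = np.linalg.norm(x_t - W_in, axis=1)    # ||x(t) - Wiin|| per neuron
    d_res = np.linalg.norm(v_prev - W, axis=1)   # ||v(t-1) - Wi|| per neuron
    return np.exp(-alpha * d_in - beta * d_res)  # instantaneous value in (0, 1]
```

A neuron whose weights coincide with the signals fed to it thus exhibits the maximum activation value of 1, consistent with the neurons acting as “prototypes” adapted to the signal being processed.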
In one or more embodiments, the leaky integration exemplified by the diagram of
The role of the leaky integration exemplified in
The diagrams of
The diagrams (plots) of
Comparison of
High values of γ (e.g., 1) lead to a (highly) reactive network, where the contribution of activation at the current instant (see
In
In
To sum up: convergence to a stable output becomes increasingly faster for increasingly smaller values for γ.
As noted, leaky integration may also facilitate temporal decoupling between the input and the output of the network, the latter varying at (much) lower rate than the input.
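The leaky integration stage may be sketched as follows, under the assumption (consistent with the behavior described above) that the integrated activation is a convex combination v(t) = γ·ṽ(t) + (1 − γ)·v(t−1):

```python
import numpy as np

def leaky_update(v_tilde, v_prev, gamma):
    """Leaky integration of the instantaneous activation v~(t).

    With gamma close to 1 the contribution at the current instant
    dominates and the network is highly reactive; with small gamma the
    previous state dominates and the output varies at a (much) lower
    rate than the input, decoupling output dynamics from input dynamics.
    """
    return gamma * v_tilde + (1.0 - gamma) * v_prev
```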
It will be appreciated that in a self-organizing reservoir, activation is computed via a norm, while in an echo state network (ESN) activation is computed via a dot product, thereby losing per-component information. This factor may play a role in suggesting the use of self-organization.
In one or more embodiments the neurons of a self-organizing reservoir may act as “prototypes” adapted to the signal being processed.
In one or more embodiments, the reservoir training phase (involving the adaptation of the connection weights) may take place, e.g., in a dedicated workstation or in the Cloud, in view of the large number of input signals being processed.
The diagrams of
It will be appreciated that the block representation adopted throughout the figures is generally exemplary of the possibility of implementing the processing as exemplified by resorting to analog circuits, digital circuits (e.g., in SW form) and/or to a mix of analog and digital circuits.
The diagram of the
A first act in the training procedure may involve receiving the input signal, namely x(t) for Win and v(t) for W. A distance (e.g., Euclidean) can then be computed between x(t) and each unit of Win and between v(t) and each unit of W.
The quantity thus computed may be dampened (e.g., exponentially) by the number of units that are closer, according to a chosen distance, to the received signal (either input signal or reservoir activation).
The quantity thus obtained may then be multiplied by a “learning constant” setting the amount of adaptation, e.g., a constant that decays (e.g., exponentially) over the (entire) duration of the training process. The resulting effect is that the units are more mobile and adaptable at the beginning of the training process and then become “stiffer” towards the end, with all adaptations performed.
The exemplary diagram of portion a) of
The other input to the multiplication node 31 is provided starting from another multiplication node 32 to which input values h(i, v(t)) and 1/λ(t) (with λ(t) decaying exponentially) are fed to be multiplied with an exponential function e(−(*)) applied at 33.
The entity h(i,v(t)) denotes the number of units closer than the i-th one to the v(t) signal. In the exemplary case presented here this parameter is used to dampen the adaptation according to the number of units that are closer to (and therefore more affected by) the signal v(t). For instance, it can be represented as a table including a number of lines corresponding to the number of neurons in the reservoir. At each line a value is present indicative of the distance between the weight W and its activation v. This may facilitate selecting, by ordering the table, those neurons having shorter or longer distances, thus providing a measure of the tendency to self-aggregate by activation, promoting grouping and specialization thereof.
The output from the multiplication node 31 is further multiplied at 34 with a coefficient ε(t), namely a learning rate coefficient which, like λ(t), decays exponentially.
The outcome for multiplication at 34 is an update factor ΔWi
ΔWi = ε(t)·e^(−h(i,v(t))/λ(t))·(v(t) − Wi(t−1))
which is applied at a summation node 35 to the “old” value Wi(t−1) to yield an updated value Wi(t).
The right-hand portion, designated b), of
In one or more embodiments adaptation performed by the unit can be seen as the unit “getting closer” to the input signal, by modifying its weights to reduce the distance between them and the signal.
Exponential dampening by the number of units that are closer, according to the chosen distance, to the received signal (either the input sample or reservoir activation) results in the closer units being adapted more than those units that are further away, thus facilitating better covering of signal dynamics and specialization of the units.
Also, while an exponential decay function was found to be a good choice for dampening as applied at 32 and 34 to the output from the node 30, other forms of space/time dampening (e.g., linear) may be applied in one or more embodiments.
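The adaptation rule exemplified in the foregoing may be sketched as follows; the exponential decay laws for ε(t) and λ(t) and their starting values are illustrative assumptions, the text specifying only that both decay (e.g., exponentially) over the training duration:

```python
import numpy as np

def reservoir_update(W, v_t, t, T, eps0=0.1, lam0=10.0):
    """One unsupervised adaptation step: Wi(t) = Wi(t-1) + dWi, with
    dWi = eps(t) * exp(-h(i, v(t)) / lam(t)) * (v(t) - Wi(t-1)).

    h(i, v(t)) is the number of units closer than the i-th one to v(t),
    so closer units are adapted more than those further away.
    eps0 and lam0 are illustrative starting values (assumptions).
    """
    eps_t = eps0 * np.exp(-t / T)          # learning rate, decays over training
    lam_t = lam0 * np.exp(-t / T)          # neighbourhood width, decays likewise
    d = np.linalg.norm(v_t - W, axis=1)    # distance of each row Wi to v(t)
    h = np.argsort(np.argsort(d))          # rank = number of closer units
    dW = eps_t * np.exp(-h / lam_t)[:, None] * (v_t - W)
    return W + dW
```

Each unit thereby “gets closer” to the signal, the closest units moving the most, consistent with the covering of signal dynamics and specialization described above.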
It was observed that, as a result of such processing, clusters tend to form, leading to a more uniform distribution of the units in the respective space.
It was also observed that the effect of such training can be appreciated by resorting, e.g., to the t-SNE algorithm as discussed in van der Maaten, et al. (cited previously), which is useful in visualizing multi-dimensional spaces in lower-dimensional spaces. The t-SNE algorithm is an unsupervised machine learning algorithm which facilitates embedding elements from a high-dimensional space into a space with fewer dimensions.
By resorting to that method it is possible to visualize in a (bidimensional) scatter plot the elements of both Win and W, belonging to 3-d and N-d spaces respectively, where N is the number of neurons.
As noted, another relevant effect of self-organization is the specialization of neurons. For instance, it was observed that the level of activation (which may be computed by averaging the instantaneous activation after the network has been fed with the sequence of input samples) is (much) more localized in a trained network, while it is more distributed in an untrained network.
The areas of activation in the case of a trained network are more discernible, which is a sign of specialization.
In one or more embodiments, after a first training as exemplified in the foregoing, the reservoir (DR in the diagram of
To that effect (classifier training) one or more embodiments may adopt a procedure as schematically represented in
In the diagram of
For instance, in one or more embodiments, the network may be fed with input samples belonging to known classes (the labeled inputs) and the network readout (namely the classifier 50) can be trained to associate to reservoir activation values certain output classes. By referring to the non-limiting example of an accelerometer signal in a wearable device from which activity classes are derived, these output classes may include classes such as jogging, walking, biking, stationary and so on.
Such a procedure can be repeated iteratively until a desired level of accuracy (precision) is achieved, e.g.:
Again, such a phase of the training process can be performed either in a workstation, in a mobile device or in the Cloud.
The possibility also exists of performing a “major” classifier training either at a workstation or in the Cloud, with incremental training performed in a mobile device, thus allowing finer tuning of the parameters, which facilitates adaptation to the specific wearer.
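By way of illustration only, such a readout classifier may be sketched as a linear readout trained by ridge-regularized least squares, a common choice in reservoir computing, although one or more embodiments are not limited to any specific classifier form (the class encoding and the ridge parameter below are merely exemplary):

```python
import numpy as np

def train_readout(V, labels, n_classes, ridge=1e-3):
    """Train a linear readout on reservoir activations V (T x N)
    collected while feeding the network labeled input samples.

    A ridge-regularized least-squares fit is used as a typical
    reservoir-computing readout; the specific classifier form is an
    assumption, not prescribed by the foregoing.
    """
    T, N = V.shape
    Y = np.zeros((T, n_classes))
    Y[np.arange(T), labels] = 1.0  # one-hot encoding of the known classes
    # W_out = (V^T V + ridge * I)^(-1) V^T Y
    return np.linalg.solve(V.T @ V + ridge * np.eye(N), V.T @ Y)

def classify(v_t, W_out):
    """Associate a reservoir activation with the best-matching output
    class (e.g., jogging, walking, biking, stationary)."""
    return int(np.argmax(v_t @ W_out))
```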
Once the training phase is completed, the network is ready to be operated/deployed, by accepting input signals (for instance accelerometer signals) and providing classifications as schematically represented in the diagram of
In
One or more embodiments lend themselves to being embedded in wearable devices powered, e.g., with a microcontroller of the STM32 family available from the applicant company.
As regards complexity, by designating N_dim the number of dimensions of the input signal and N the number of neurons in the network, the following operations are performed for each sample in a network as exemplified in the foregoing (MAC = Multiply-ACcumulate operation):
N*(N_dim+2*(N_dim+N)) MAC plus N exponentials (each of which can be approximated with about 5 MAC) in order to compute a current contribution (see FIG. 4)
2*(N+1) MAC to compute the leaky integration of FIG. 5
the total cost of a single iteration can thus be estimated as N*(N_dim+2*(N_dim+N)+5)+2*(N+1) MAC.
By way of example, by assuming a 100-neuron network that processes accelerometer signals (natively 3-d), the computational costs for each input sample is:
N = 100, N_dim = 3
100*(3+2*(3+100)+5) = 21400 MAC for the activation at the current step
2*(101) = 202 MAC for the leaky integration
the total cost for computing the activation for each sample is 21602 MAC
By assuming a 16 Hz accelerometer sensor providing input to the network, the total cost is about 345,632 MAC/sec.
By referring to a more computationally-demanding and complex example, one may assume having input signals from a 3-d accelerometer paired with a 3-d gyroscope:
N = 100, N_dim = 6
100*(6+2*(6+100)+5) = 22300 MAC for the activation at the current step
2*(101) = 202 MAC for the leaky integration
the total cost for computing the activation for each sample is 22502 MAC
Assuming a 16 Hz accelerometer sensor providing input to the network, the total cost is about 360,032 MAC/sec, that is, an amount only slightly higher than the processing cost for handling the 3-d accelerometer signals alone.
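The per-sample cost estimates above can be reproduced with a small helper function (a direct transcription of the operation counts discussed in the foregoing, with each exponential approximated as 5 MAC):

```python
def mac_cost_per_sample(N, N_dim, exp_cost=5):
    """Estimated MAC operations needed per input sample.

    Activation at the current step: N_dim + 2*(N_dim + N) MAC per
    neuron, plus one exponential (approximated as exp_cost MAC) per
    neuron; leaky integration adds 2*(N + 1) MAC.
    """
    activation = N * (N_dim + 2 * (N_dim + N) + exp_cost)
    leaky = 2 * (N + 1)
    return activation + leaky
```

For the two worked examples above this yields 21602 and 22502 MAC per sample, i.e., about 345,632 and 360,032 MAC/sec at a 16 Hz sample rate.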
By referring to training of the reservoir based on the neural model discussed previously, the readout classifier turns out to be appreciably simpler in comparison with those of other neural network-based approaches, with the cost of training being appreciably lower in comparison with the back-propagation methods used for training feedforward neural networks.
For instance, the following table reports evaluation results, in terms of a confusion matrix, obtained in testing a 500-neuron conventional Echo State Network (ESN), with an average recall (AR) of 71.02%.
The following table reports, by way of comparison, the results obtained in testing a 500-neuron network based on the self-organizing reservoir approach discussed herein, with an average recall (AR) of 98.33%.
Operation of a neural network as discussed herein is essentially deterministic: for a given input sequence the network will expectedly output the same sequence (all seeds of the pseudo-random number generator can be explicitly controlled in order to obtain such deterministic control). Consequently, the same exact output sequence being obtained for a same input sequence is indicative of the self-organizing neural network approach discussed herein being adopted.
In one or more embodiments a neural network (e.g., IN, DR, RO) may include at least one layer (DR) of neurons (e.g., N) including neurons having neuron connections to neurons in the at least one layer and input connections to a network input (e.g., X, Y, Z), wherein the neuron connections and the input connections have respective neuron connection weights (e.g., Wi) and input connection weights (e.g., Wiin), wherein said neurons have neuron responses set by an activation function (e.g., AF) with activation values (e.g., vi(t), vi(t−1)) variable over time, said neurons including activation function computing circuits (see, e.g., 101, 102, 111, 112, 121, 122, 13, 14, 20, 21, 22 in
In one or more embodiments, the neuron connections may include neuron self-connections (that is, with the neuron itself).
In one or more embodiments said activation function computing circuits may include:
In one or more embodiments, the distance computing modules may be configured to compute said distances as Euclidean distances.
One or more embodiments may include dampening modules (e.g., 121, 122) applying dampening factors (e.g., α, β) to said first and second outputs summed to provide said sum of said first and second outputs.
In one or more embodiments, said activation function computing circuits may include a leaky integration stage coupled to the output of said exponential module.
One or more embodiments may include:
In one or more embodiments a device may include:
In one or more embodiments the sensor may include an accelerometer, optionally coupled with a gyrometer (e.g., a gyroscope), providing activity signals, said network-processed output including classifications of said activity signals.
Apparatus according to one or more embodiments (e.g., wearable fitness apparatus) may include:
In one or more embodiments a method of adaptively setting said respective neuron connection weights and input weights in a network according to one or more embodiments may include:
In one or more embodiments the network may include a classification readout stage (e.g., RO) configured for providing classification of signals input to the neural network, the method including, subsequent to adaptively setting said respective network connection weights and input weights:
In one or more embodiments a computer program product, loadable in the memory of at least one computer may include software code portions for performing the steps of the method of one or more embodiments.
Without prejudice to the underlying principles, the details and embodiments may vary, even significantly, with respect to what has been described herein by way of example only, without departing from the extent of protection.
The extent of protection is defined by the annexed claims.
The various embodiments described above can be combined to provide further embodiments. These and other changes can be made to the embodiments in light of the above-detailed description. In general, in the following claims, the terms used should not be construed to limit the claims to the specific embodiments disclosed in the specification and the claims, but should be construed to include all possible embodiments along with the full scope of equivalents to which such claims are entitled. Accordingly, the claims are not limited by the disclosure.
Number | Date | Country | Kind |
---|---|---|---|
102017000047044 | May 2017 | IT | national |