This application is a U.S. 371 Application of International Patent Application No. PCT/JP2019/009114, filed on 7 Mar. 2019, which application claims priority to and the benefit of JP Application No. 2018-047140, filed on 14 Mar. 2018, the disclosures of which are hereby incorporated herein by reference in their entireties.
The present invention relates to an analyzing device, method, and program, and particularly, to an analyzing device, method, and program for analyzing a structure of a neural network.
Techniques are known that apply network analysis to a multilayer neural network learned from data, extract a structure that simplifies the original neural network, and convert that structure into a more easily analyzable form. More specifically, these techniques simplify the original connection structure by finding, among the units of each layer of the multilayer neural network and according to a predetermined algorithm, groups (also referred to as clusters or communities) of units that have similar patterns of connection to the units of adjacent layers.
Although the above-described existing techniques can extract a structure that simplifies a multilayer neural network, they have the following problems. First, no method is known for quantitatively determining the role of each cluster in the cluster structure extracted by these techniques. That is, with these techniques, it is not possible to know how much each cluster looks at each part of the input data or how much each cluster is used for predicting each part of the output. Moreover, these existing techniques simplify the structure by applying threshold processing so that a plurality of connections present between clusters are expressed collectively as one connection. The structure obtained therefore changes greatly depending on the threshold setting, and similar examination results cannot be read stably from the structure. Accordingly, a technique is desirable that yields similar examination results stably regardless of the setting of variables such as a threshold and that makes it possible to know the role of each cluster in a neural network quantitatively.
The present invention has been made to solve the above-described problems, and an object thereof is to provide an analyzing device, method, and program capable of quantitatively analyzing the structure of a neural network.
In order to attain the object, an analyzing device according to a first invention includes: an input unit that receives a neural network learned in advance for outputting output data from input data on the basis of learning data made up of the input data and the output data and a structure of a cluster made up of units calculated in advance for the neural network; and an analyzing unit that calculates, for each of combinations of a dimension of the input data and the cluster, a sum of squared errors between an output of each unit belonging to the cluster when a value of the dimension of the input data is replaced with an average value of the dimension of the input data included in the learning data and an output of each unit belonging to the cluster for the input data before replacement as a relationship between the combinations, and calculates, for each of combinations of the cluster and a dimension of the output data, a squared error between the value of the dimension of the output data when an output value of each unit belonging to the cluster is replaced with an average output value of each unit of the cluster when the input data included in the learning data was input and the value of the dimension of the output data before replacement as a relationship between the combinations.
An analyzing device according to a second invention includes: an input unit that receives a neural network learned in advance for outputting output data from input data on the basis of learning data made up of the input data of a sample and the output data indicating a dimension to which the sample belongs and a structure of a cluster made up of units calculated in advance for the neural network; an analyzing unit that calculates, for each of combinations of a dimension of the input data and the cluster, a sum of squared errors between an output of each unit belonging to the cluster when a value of the dimension of the input data is replaced with an average value of the dimension of the input data included in the learning data and an output of each unit belonging to the cluster for the input data before replacement as a relationship between the combinations, and calculates, for each of combinations of the cluster and a dimension of the output data, a squared error between the value of the dimension of the output data when an output value of each unit belonging to the cluster is replaced with an average output value of each unit of the cluster when the input data included in the learning data was input and the value of the dimension of the output data before replacement as a relationship between the combinations; and a relearning unit that, on the basis of the learning data, the cluster structure, and the calculated relationship between each of the combinations of the cluster and the dimension of the output data, adds, to the output value of each unit belonging to each cluster, noise corresponding to the relationship between the combination of the cluster and the dimension to which a sample indicated by the output data included in the learning data belongs, and relearns the neural network by backpropagation.
An analyzing method according to a third invention includes and performs: a step in which an input unit receives a neural network learned in advance for outputting output data from input data on the basis of learning data made up of the input data and the output data and a structure of a cluster made up of units calculated in advance for the neural network; and a step in which an analyzing unit calculates, for each of combinations of a dimension of the input data and the cluster, a sum of squared errors between an output of each unit belonging to the cluster when a value of the dimension of the input data is replaced with an average value of the dimension of the input data included in the learning data and an output of each unit belonging to the cluster for the input data before replacement as a relationship between the combinations, and calculates, for each of combinations of the cluster and a dimension of the output data, a squared error between the value of the dimension of the output data when an output value of each unit belonging to the cluster is replaced with an average output value of each unit of the cluster when the input data included in the learning data was input and the value of the dimension of the output data before replacement as a relationship between the combinations.
An analyzing method according to a fourth invention includes and performs: a step in which an input unit receives a neural network learned in advance for outputting output data from input data on the basis of learning data made up of the input data of a sample and the output data indicating a dimension to which the sample belongs and a structure of a cluster made up of units calculated in advance for the neural network; a step in which an analyzing unit calculates, for each of combinations of a dimension of the input data and the cluster, a sum of squared errors between an output of each unit belonging to the cluster when a value of the dimension of the input data is replaced with an average value of the dimension of the input data included in the learning data and an output of each unit belonging to the cluster for the input data before replacement as a relationship between the combinations, and calculates, for each of combinations of the cluster and a dimension of the output data, a squared error between the value of the dimension of the output data when an output value of each unit belonging to the cluster is replaced with an average output value of each unit of the cluster when the input data included in the learning data was input and the value of the dimension of the output data before replacement as a relationship between the combinations; and a step in which a relearning unit, on the basis of the learning data, the cluster structure, and the calculated relationship between each of the combinations of the cluster and the dimension of the output data, adds, to the output value of each unit belonging to each cluster, noise corresponding to the relationship between the combination of the cluster and the dimension to which a sample indicated by the output data included in the learning data belongs, and relearns the neural network by backpropagation.
A program according to a fifth invention is a program for causing a computer to function as each unit of the analyzing device according to the first invention.
A program according to a sixth invention is a program for causing a computer to function as each unit of the analyzing device according to the second invention.
According to the analyzing device, method, and program of the present invention, a neural network learned in advance for outputting output data from input data on the basis of learning data made up of the input data and the output data and the structure of a cluster made up of units calculated in advance for the neural network are received. Moreover, for each combination of the cluster and the dimension of the input data, the sum of squared errors between the output of each unit belonging to the cluster when the value of the dimension of the input data is replaced with an average value of the dimension of the input data included in the learning data and the output of each unit belonging to the cluster for the input data before replacement is calculated as a relationship between combinations. Moreover, for each combination of the cluster and the dimension of the output data, the squared error between the value of the dimension of the output data when the output value of each unit belonging to the cluster is replaced with an average output value of each unit of the cluster when the input data included in the learning data was input and the value of the dimension of the output data before replacement is calculated as a relationship between combinations. By doing so, it is possible to analyze the structure of a neural network quantitatively.
Hereinafter, an embodiment of the present invention will be described with reference to the drawings.
<Configuration of Analyzing Device According to Embodiment of Present Invention>
Next, a configuration of an analyzing device according to an embodiment of the present invention will be described. As illustrated in
The input unit 10 receives a neural network learned in advance and a structure of a cluster made up of units calculated in advance for the neural network. The cluster structure may be extracted by an arbitrary method. For example, in a multilayer neural network learned in advance, a community of each layer, extracted on the basis of the connection relationship that the edges determine between vertices of adjacent layers, may be used as a cluster. Moreover, the input unit 10 receives learning data made up of input data of a sample serving as an input to the neural network and output data indicating a dimension to which the sample belongs. The neural network has been learned as a neural network for outputting output data from input data on the basis of learning data made up of the input data and the output data.
The arithmetic unit 20 includes an analyzing unit 30 and a relearning unit 32.
The analyzing unit 30 calculates a relationship $v^{\mathrm{in}}_{c,i}$ between combinations of a dimension i of input data and a cluster c and a relationship $v^{\mathrm{out}}_{c,j}$ between combinations of the cluster c and a dimension j of output data. The calculated $v^{\mathrm{in}}_{c,i}$ and $v^{\mathrm{out}}_{c,j}$ indicate the strength of relationship in each combination.
Specifically, the analyzing unit 30 inputs the learning data to the neural network and, for each combination of the cluster c and the dimension i of the input data, calculates the following quantity as the relationship $v^{\mathrm{in}}_{c,i}$ between the dimension i of the input data and the cluster c: the sum of squared errors between (a) the output of each unit belonging to the cluster c when the value of the dimension i of the input data is replaced with the average value of the dimension i over the input data included in the learning data and (b) the output of each unit belonging to the cluster c for the input data before replacement.
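The calculation of the input-side relationship can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the network is reduced to a single hidden layer with a sigmoid activation, the squared errors are summed over all learning samples as well as over the units of a cluster, and the names `hidden_outputs`, `input_relationship`, `w1`, `b1`, and `clusters` are hypothetical.

```python
import numpy as np

def hidden_outputs(x, w1, b1):
    """Outputs of the hidden units for a batch x (sigmoid activation is an assumption)."""
    return 1.0 / (1.0 + np.exp(-(x @ w1 + b1)))

def input_relationship(x, w1, b1, clusters):
    """Return v_in[c, i] for every cluster c and input dimension i.

    clusters[k] is the cluster index of hidden unit k (hypothetical encoding).
    """
    base = hidden_outputs(x, w1, b1)       # outputs before replacement
    mean = x.mean(axis=0)                  # average of each input dimension
    n_clusters = int(clusters.max()) + 1
    v_in = np.zeros((n_clusters, x.shape[1]))
    for i in range(x.shape[1]):
        x_rep = x.copy()
        x_rep[:, i] = mean[i]              # replace dimension i with its average
        sq_err = (hidden_outputs(x_rep, w1, b1) - base) ** 2
        per_unit = sq_err.sum(axis=0)      # summed over samples (assumption)
        for c in range(n_clusters):
            v_in[c, i] = per_unit[clusters == c].sum()  # summed over units of c
    return v_in

# Toy data: units 0 and 1 see inputs 0 and 1, units 2 and 3 see only input 2.
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 3))
w1 = np.array([[2.0, 0.0, 0.0, 0.0],
               [0.0, 2.0, 0.0, 0.0],
               [0.0, 0.0, 2.0, 2.0]])
b1 = np.zeros(4)
clusters = np.array([0, 0, 1, 1])
v_in = input_relationship(x, w1, b1, clusters)
print(v_in)
```

In this toy setting the row for cluster 0 is nonzero only for input dimensions 0 and 1, and the row for cluster 1 is nonzero only for input dimension 2, which matches the connection pattern written into `w1`.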
Moreover, the analyzing unit 30 inputs the learning data to the neural network and, for each combination of the cluster c and the dimension j of the output data, calculates the following quantity as the relationship $v^{\mathrm{out}}_{c,j}$ between the cluster c and the dimension j of the output data: the squared error between (a) the value of the dimension j of the output data when the output value of each unit belonging to the cluster c is replaced with the average output value of that unit obtained when the input data included in the learning data was input and (b) the value of the dimension j of the output data before replacement.
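The output-side relationship can be sketched in the same way. The code below is an illustration under assumptions rather than the embodiment itself: it uses one hidden layer and sigmoid units, takes the average output value per unit over the learning samples, sums the squared error over the samples, and uses hypothetical names (`output_relationship`, `w1`, `b1`, `w2`, `b2`, `clusters`).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def output_relationship(x, w1, b1, w2, b2, clusters):
    """Return v_out[c, j] for every cluster c and output dimension j."""
    hidden = sigmoid(x @ w1 + b1)            # unit outputs before replacement
    base_out = sigmoid(hidden @ w2 + b2)     # network outputs before replacement
    hidden_mean = hidden.mean(axis=0)        # average output of each unit
    n_clusters = int(clusters.max()) + 1
    v_out = np.zeros((n_clusters, w2.shape[1]))
    for c in range(n_clusters):
        members = clusters == c
        replaced = hidden.copy()
        replaced[:, members] = hidden_mean[members]   # freeze cluster c at its average
        out = sigmoid(replaced @ w2 + b2)
        v_out[c] = ((out - base_out) ** 2).sum(axis=0)  # summed over samples (assumption)
    return v_out

# Toy data: units 0 and 1 feed output 0 only, units 2 and 3 feed output 1 only.
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 3))
w1 = rng.normal(size=(3, 4))
b1 = np.zeros(4)
w2 = np.array([[1.5, 0.0],
               [1.5, 0.0],
               [0.0, 1.5],
               [0.0, 1.5]])
b2 = np.zeros(2)
clusters = np.array([0, 0, 1, 1])
v_out = output_relationship(x, w1, b1, w2, b2, clusters)
print(v_out)
```

Because each cluster is replaced as a whole, no threshold on individual connections is needed to obtain the values.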
With the above computation, the role of each cluster (the strength of its relationship with each input dimension and each output dimension) is obtained in vector form ($v^{\mathrm{in}}=\{v^{\mathrm{in}}_{c,i}\}$, $v^{\mathrm{out}}=\{v^{\mathrm{out}}_{c,j}\}$).
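Written as formulas, the two relationships might take the following form. This is a sketch under assumptions, since the text gives the definitions in words only: the errors are assumed to be accumulated over the N learning samples, the average output is assumed to be taken per unit, and the symbols $x^{(n)}$, $\bar{x}_i$, $\tilde{x}^{(n,i)}$, $o_k$, $\bar{o}_k$, $y_j$, and $\tilde{y}^{(c)}_j$ are notation introduced here for illustration.

```latex
\begin{align}
\bar{x}_i &= \frac{1}{N}\sum_{n=1}^{N} x^{(n)}_i, &
\tilde{x}^{(n,i)}_{i'} &=
  \begin{cases}
    \bar{x}_i & (i' = i) \\
    x^{(n)}_{i'} & (i' \neq i)
  \end{cases} \\
v^{\mathrm{in}}_{c,i} &= \sum_{n=1}^{N}\sum_{k \in c}
  \left( o_k\!\left(\tilde{x}^{(n,i)}\right) - o_k\!\left(x^{(n)}\right) \right)^2 \\
\bar{o}_k &= \frac{1}{N}\sum_{n=1}^{N} o_k\!\left(x^{(n)}\right) \\
v^{\mathrm{out}}_{c,j} &= \sum_{n=1}^{N}
  \left( \tilde{y}^{(c)}_j\!\left(x^{(n)}\right) - y_j\!\left(x^{(n)}\right) \right)^2
\end{align}
```

Here $o_k(x)$ is the output of unit k for input x, $y_j(x)$ is the value of the dimension j of the output data, and $\tilde{y}^{(c)}_j(x)$ is the same value when the output of every unit k belonging to the cluster c is replaced with $\bar{o}_k$.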
Moreover, the output unit 50 outputs the relationship $v^{\mathrm{in}}_{c,i}$ between the dimension i of the input data and the cluster c and the relationship $v^{\mathrm{out}}_{c,j}$ between the cluster c and the dimension j of the output data in the form of a vector or a matrix so that the results can be analyzed more easily. For example,
Moreover, by relearning the parameters of the neural network on the basis of the computation results of the analyzing unit 30, it is possible to improve the generalization performance of the neural network.
On the basis of the learning data, the cluster structure, and the calculated relationship $v^{\mathrm{out}}_{c,j}$ between each of the combinations of the cluster and the dimension of the output data, the relearning unit 32 adds, to the output value of each unit belonging to each of the clusters, noise corresponding to the relationship between the combination of the cluster and the dimension to which a sample indicated by the output data included in the learning data belongs, relearns the neural network by backpropagation, and outputs the relearned neural network to the output unit 50.
For example, when the neural network is learned by stochastic gradient descent, a normal distribution is defined for each cluster c as in Formula (1) below. Its variance is proportional to the degree of contribution $v^{\mathrm{out}}_{c,j}$ of the cluster to the output dimension j, where j corresponds to the output class indicated by the output data of the learning sample selected randomly in each iteration of learning (in the case of character image recognition, the character written in the image).
[Formula 1]
$$\mathcal{N}(0, \sigma_{c,j}), \qquad \sigma_{c,j} \propto v^{\mathrm{out}}_{c,j} \tag{1}$$
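Drawing noise according to Formula (1) can be sketched as follows. This is an illustration under assumptions: the proportionality constant `scale` is a hypothetical hyperparameter that the text does not specify, one independent noise value is drawn per unit, and the names `sample_cluster_noise`, `v_out`, `clusters`, and `label` are introduced here only for the example.

```python
import numpy as np

def sample_cluster_noise(v_out, clusters, label, scale, rng):
    """Draw zero-mean normal noise for every unit.

    The variance for a unit in cluster c is scale * v_out[c, label], where
    label is the output dimension (class) of the selected learning sample.
    """
    variance = scale * v_out[clusters, label]   # one variance per unit
    return rng.normal(0.0, np.sqrt(variance))   # NumPy takes the standard deviation

# Hypothetical relationship values: cluster 0 contributes to class 0 only.
v_out = np.array([[4.0, 0.0],
                  [0.0, 1.0]])
clusters = np.array([0, 0, 1, 1])   # cluster index of each of four units
rng = np.random.default_rng(0)
noise = sample_cluster_noise(v_out, clusters, label=0, scale=0.5, rng=rng)
print(noise)
```

For a sample of class 0, the units of cluster 1 receive no noise in this example because their contribution to that class is zero, while the units of cluster 0 receive noise with variance 0.5 × 4.0 = 2.0.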
The relearning unit 32 updates the parameters on the basis of a sample, starting from the connection weights $\{\omega^{d}_{i,j}\}$ and the biases $\{\theta^{d}_{i}\}$, which are the initialized parameters. Algorithm 1 illustrated in
<Operation of Analyzing Device According to Embodiment of Present Invention>
Next, an operation of the analyzing device 100 according to an embodiment of the present invention will be described. When the input unit 10 receives a neural network learned in advance, the structure of a cluster made up of units calculated in advance for the neural network, and learning data made up of input data of a sample serving as an input and output data indicating a dimension to which the sample belongs, the analyzing device 100 executes an analysis process routine illustrated in
In step S100, the analyzing unit 30 inputs the learning data to the neural network and, for each combination of the cluster c and the dimension i of the input data, calculates the following quantity as the relationship $v^{\mathrm{in}}_{c,i}$ between the dimension i of the input data and the cluster c: the sum of squared errors between (a) the output of each unit belonging to the cluster c when the value of the dimension i of the input data is replaced with the average value of the dimension i over the input data included in the learning data and (b) the output of each unit belonging to the cluster c for the input data before replacement. Moreover, the analyzing unit 30 inputs the learning data to the neural network and, for each combination of the cluster c and the dimension j of the output data, calculates the following quantity as the relationship $v^{\mathrm{out}}_{c,j}$ between the cluster c and the dimension j of the output data: the squared error between (a) the value of the dimension j of the output data when the output value of each unit belonging to the cluster c is replaced with the average output value of that unit obtained when the input data included in the learning data was input and (b) the value of the dimension j of the output data before replacement.
In step S102, the output unit 50 outputs the relationship $v^{\mathrm{in}}_{c,i}$ between the dimension i of the input data and the cluster c and the relationship $v^{\mathrm{out}}_{c,j}$ between the cluster c and the dimension j of the output data in the form of a vector or a matrix.
In step S104, on the basis of the learning data, the cluster structure, and the calculated relationship $v^{\mathrm{out}}_{c,j}$ between each of the combinations of the cluster and the dimension of the output data, the relearning unit 32 adds noise following the normal distribution of Formula (1) to the output value of each unit belonging to each cluster c obtained when the input data of the learning data is input. The noise is determined using the relationship $v^{\mathrm{out}}_{c,r}$ between the cluster c and the dimension r to which the sample indicated by the output data included in the learning data belongs. The relearning unit 32 then relearns the neural network by backpropagation and outputs the relearned neural network to the output unit 50.
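The relearning of step S104 can be sketched as a stochastic gradient descent loop in which noise is added to the hidden-unit outputs before the output layer is evaluated. The sketch below is not Algorithm 1 of the embodiment; it assumes a single hidden layer with sigmoid units, a softmax output with cross-entropy loss, relationship values `v_out` that are computed once in advance and kept fixed, and hypothetical hyperparameters `scale`, `lr`, and `n_iter`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def relearn(x, labels, params, clusters, v_out,
            scale=0.1, lr=0.1, n_iter=2000, seed=0):
    """Relearn a one-hidden-layer network with cluster-dependent noise."""
    rng = np.random.default_rng(seed)
    w1, b1, w2, b2 = (p.copy() for p in params)
    for _ in range(n_iter):
        n = rng.integers(len(x))              # sample selected randomly
        r = labels[n]                         # output dimension of the sample
        h = sigmoid(x[n] @ w1 + b1)
        std = np.sqrt(scale * v_out[clusters, r])
        h_noisy = h + rng.normal(0.0, std)    # noise per Formula (1)
        p = softmax(h_noisy @ w2 + b2)
        # Backpropagation of the cross-entropy loss.
        d_out = p.copy()
        d_out[r] -= 1.0
        d_h = (w2 @ d_out) * h * (1.0 - h)    # additive noise has unit gradient
        w2 -= lr * np.outer(h_noisy, d_out)
        b2 -= lr * d_out
        w1 -= lr * np.outer(x[n], d_h)
        b1 -= lr * d_h
    return w1, b1, w2, b2

# Toy two-class problem with hypothetical relationship values.
rng = np.random.default_rng(1)
x = rng.normal(size=(40, 3))
labels = (x[:, 0] > 0).astype(int)
params = (rng.normal(scale=0.1, size=(3, 4)), np.zeros(4),
          rng.normal(scale=0.1, size=(4, 2)), np.zeros(2))
clusters = np.array([0, 0, 1, 1])
v_out = np.array([[1.0, 0.2],
                  [0.1, 0.8]])
new_params = relearn(x, labels, params, clusters, v_out)
```

Because the noise is added after the hidden activation, it changes the values seen by the output layer but not the form of the gradient through the hidden units, so an ordinary backpropagation update can be reused.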
As described above, according to the analyzing device according to an embodiment of the present invention, a neural network learned in advance for outputting output data from input data on the basis of learning data made up of the input data and the output data and the structure of a cluster made up of units calculated in advance for the neural network are received. Moreover, for each combination of the cluster and the dimension of the input data, the sum of squared errors between the output of each unit belonging to the cluster when the value of the dimension of the input data is replaced with an average value of the dimension of the input data included in the learning data and the output of each unit belonging to the cluster for the input data before replacement is calculated as a relationship between combinations. Moreover, for each combination of the cluster and the dimension of the output data, the squared error between the value of the dimension of the output data when the output value of each unit belonging to the cluster is replaced with an average output value of each unit of the cluster when the input data included in the learning data was input and the value of the dimension of the output data before replacement is calculated as a relationship between combinations. With these calculations, it is possible to analyze the structure of a neural network quantitatively.
Using the technique of the embodiment of the present invention, it is possible to know the role of each cluster by finding out which dimensional information of input data is used by the cluster and which dimensional information of the output is inferred by the cluster. For example, as illustrated in
The present invention is not limited to the above-described embodiment, but various modifications and applications can be made without departing from the spirit of the present invention.
For example, in the above-described embodiment, although a case of including a relearning unit has been described as an example, there is no limitation thereto. For example, the arithmetic unit of the analyzing device may include an analyzing unit only, and the analyzing unit may analyze and output the relationship $v^{\mathrm{in}}_{c,i}$ between the combinations of the cluster c and the dimension i of the input data and the relationship $v^{\mathrm{out}}_{c,j}$ between the combinations of the cluster c and the dimension j of the output data.

Number | Date | Country | Kind |
---|---|---|---|
2018-047140 | Mar 2018 | JP | national |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2019/009114 | 3/7/2019 | WO | |

Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2019/176731 | 9/19/2019 | WO | A |

Number | Name | Date | Kind |
---|---|---|---|
6134537 | Pao | Oct 2000 | A |
20030191728 | Kulkarni | Oct 2003 | A1 |
20060224533 | Thaler | Oct 2006 | A1 |
20070094168 | Ayala | Apr 2007 | A1 |
20200110930 | Simantov | Apr 2020 | A1 |
20230214458 | Marsden | Jul 2023 | A1 |

Number | Date | Country |
---|---|---|
H07120349 | Dec 1995 | JP |

Entry |
---|
Watanabe, Chihiro, et al., “Modular representation of layered neural networks,” Neural Networks 97 (2018) 62-73. |

Number | Date | Country |
---|---|---|
20210248422 A1 | Aug 2021 | US |