The present invention relates to a model creation method, model creation apparatus, and program.
Creating a prediction model by machine-learning a large amount of data and using this prediction model to automatically judge various phenomena has become common practice in many fields in recent years. Examples of such prediction models include a model for determining at a production site whether a product is normal or defective based on images of the product, and a model for classifying the type of a part based on images of the part. A model need not be created from images; it may be created by machine-learning various types of data, such as speech, text, or numerical data.
On the other hand, creating an accurate prediction model by machine learning requires learning a large amount of data over a long time, and the available time or amount of data may be limited. Techniques to address this problem include transfer learning, which creates a new model using a prediction model created by previously learning a large amount of data. By using a previously prepared prediction model as a base, an accurate prediction model can be created in a short time and with a small amount of data. An example of transfer learning is disclosed in Patent Document 1.
Patent Document 1: Japanese Unexamined Patent Application Publication (Translation of PCT Application) No. 2018-525734
However, creating a prediction model using transfer learning as described above involves the following problems. A first problem is that transfer learning uses existing models; therefore, if there are many candidate models, it takes time and effort to search among them for a model suitable for the problem to be solved. A failure to select a suitable model leads to disadvantages, for example, learning that is slowed down rather than accelerated. A second problem is that managing the models created using transfer learning is complicated, which makes it difficult to search for a suitable model among them.
Accordingly, an object of the present invention is to solve the above problems, that is, the difficulties in selecting a suitable model for transfer learning.
A model creation method according to an aspect of the present invention includes selecting a model based on output results obtained by inputting pieces of learning data to registered models, creating a new model by inputting the pieces of learning data to the selected model and performing machine learning, and registering the created new model such that the new model is associated with the selected model.
A model creation apparatus according to another aspect of the present invention includes a selector configured to select a model based on output results obtained by inputting pieces of learning data to registered models, a learning unit configured to create a new model by inputting the pieces of learning data to the selected model and performing machine learning, and a registration unit configured to register the created new model such that the new model is associated with the selected model.
A program according to yet another aspect of the present invention is a program for implementing, in an information processing apparatus, a selector configured to select a model based on output results obtained by inputting pieces of learning data to registered models, a learning unit configured to create a new model by inputting the pieces of learning data to the selected model and performing machine learning, and a registration unit configured to register the created new model such that the new model is associated with the selected model.
The present invention thus configured is able to select a suitable model for transfer learning.
A first example embodiment of the present invention will be described with reference to
[Configuration] A model creation apparatus 10 according to the present invention is an apparatus for creating a model that outputs a predicted output value for a given input by performing machine learning using previously prepared learning data. In particular, in the present invention, the model creation apparatus 10 has a function of performing transfer learning, in which some models are stored in advance and a new model is created by performing further machine learning on one of these models. For example, the model creation apparatus 10 creates a model for determining at a production site whether a product is normal or defective based on images of the product, or a model for classifying the type of a part based on images of the part. Note that a model created by the model creation apparatus 10 may be of any type, and the data used for machine-learning a model may be of any type, such as speech, text, or numerical data.
The model creation apparatus 10 consists of one or more information processing apparatuses each including an arithmetic logic unit and a storage unit. As shown in
The learning data storage unit 16 stores learning data (data for learning) used to create a model. Each piece of learning data is data to be inputted to create a model by machine learning and is, for example, captured image data or measurement data. Each piece of learning data is provided with a label serving as a teacher signal representing the correct answer for that piece of learning data. For example, in the present embodiment, it is assumed that each piece of learning data is provided with one of two labels {A,B}, as shown in
The model storage unit 17 stores multiple pieces of model data, such as previously prepared registered models and/or newly created registered models (to be discussed later). Specifically, as shown in
Similarly, the model storage unit 17 stores model data with respect to base models 2 and 3. That is, the model storage unit 17 stores these base models, child models created using the base models as the transfer sources, and the parent-child relationships between these models. While
The selector 11 selects one piece of model data as the transfer source from among the pieces of model data stored in the model storage unit 17, inputs each learning data to the selected model, checks the output result of the model, and evaluates the model in terms of whether the output result corresponds to the label of the learning data. Specifically, the selector 11 selects the model data as follows.
First, the selector 11 reads all the base models 1, 2, and 3 from the model storage unit 17. Then, the selector 11 reads pieces of learning data from the learning data storage unit 16, inputs the pieces of learning data to the base models 1, 2, and 3, and compiles the output results from the base models 1, 2, and 3. Here, it is assumed that the output layers of the base models 1, 2, and 3 produce outputs using one of five labels {v,w,x,y,z}, as shown in
In the example of
If there are some generations of child models associated with the selected base model 2, as shown in
Specifically, the selector 11 first reads information on the selected base model 2 as shown in
If child models 2aa, 2ab, and 2ac are associated with the selected child model 2a as subordinates of the child model 2a, as shown in
If there is no child model associated with the initially selected base model 2, the selector 11 determines the base model 2 as the transfer-source model. Similarly, if one child model is selected and there is no child model associated with the selected child model as a subordinate, the selector 11 determines the selected child model as the transfer-source model.
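The generation-by-generation descent described above can be sketched as follows. This is a minimal illustration, not the apparatus itself: the function name `choose_transfer_source`, the `children` mapping, the sibling `2b`, and the numeric scores are all hypothetical, and the sketch assumes that some evaluation function already returns a comparable score for each model. It always moves down to the best child; a variant could instead stop when the parent scores better than every child.

```python
from typing import Callable, Dict, List


def choose_transfer_source(
    base: str,
    children: Dict[str, List[str]],
    score: Callable[[str], float],
) -> str:
    """Descend from the selected base model, keeping the best child per generation."""
    current = base
    # Stop as soon as the current model has no registered child models.
    while children.get(current):
        current = max(children[current], key=score)
    return current


# Hypothetical parent -> children associations and made-up evaluation scores.
children = {"2": ["2a", "2b"], "2a": ["2aa", "2ab", "2ac"]}
scores = {"2a": 0.8, "2b": 0.5, "2aa": 0.6, "2ab": 0.9, "2ac": 0.7}

selected = choose_transfer_source("2", children, scores.__getitem__)
print(selected)  # 2ab
```

With these illustrative scores the descent goes from base model 2 to child model 2a and then to 2ab, which has no children and is therefore returned as the transfer-source model.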
The learning unit 12 reads the model determined as the transfer source from the model storage unit 17. The learning unit 12 also reads pieces of learning data from the learning data storage unit 16. The learning unit 12 then creates a new model by inputting the pieces of learning data to the model determined as the transfer source and performing machine learning. For example, the learning unit 12 performs so-called transfer learning or fine tuning, which uses an existing model determined as the transfer source. Here, it is assumed that a model 2aba corresponding to the learning data is newly created using the child model 2ab as the transfer source, as shown in
The registration unit 13 stores information on the created new model 2aba in the model storage unit 17. At this time, as shown in
Also, when selecting or learning a model as described above, or in accordance with a request from a user, the registration unit 13 outputs model data stored in the model storage unit 17 so that the model data is displayed on a display unit 20. At this time, the registration unit 13 outputs the model data such that the association between the models is clarified. For example, when outputting the model data of the base model 2, the registration unit 13 outputs the model data such that the transfer source and transfer destination are connected by an arrow in a diagram as shown in
Next, operations of the model creation apparatus 10 thus configured will be described mainly with reference to the flowcharts of
First, the model creation apparatus 10 reads all the base models from the model storage unit 17 (step S1). The model creation apparatus 10 also reads labeled pieces of learning data from the learning data storage unit 16 (step S2).
The model creation apparatus 10 then runs prediction on the read pieces of learning data using the read base models, compiles the results, and evaluates the base models (step S3). Here, the model creation apparatus 10 evaluates the base models in terms of which model has classified the pieces of learning data better. For this reason, if the output results as shown in
Next, referring to the flowchart of
First, the model creation apparatus 10 reads the model data of the base model selected as described above from the model storage unit 17 (step S11). For example, if the base model 2 is selected, the model creation apparatus 10 reads the model data of the base model 2 as shown in
Then, the model creation apparatus 10 checks whether there are models (child models) created using the selected base model 2 as the transfer source (step S13). If the base model 2 has no child model (NO in step S13), the model creation apparatus 10 no longer searches for models and determines the base model 2 as the transfer-source model (step S15). On the other hand, if the base model 2 has child models (YES in step S13), the model creation apparatus 10 evaluates the child models to determine whether there is a better transfer source than the base model among the child models (step S14). Here, the child models are evaluated based on output results obtained by inputting the pieces of learning data to each child model. The child models may be evaluated using a method similar to the above evaluation method of the base model, or any other method.
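One way such an evaluation could be scored is sketched below. It is only one possible reading of the criteria stated in this document (pieces of learning data with an identical label should be aggregated under an identical output label, and fewer output labels are preferred). The function `aggregation_score`, the ranking rule in `rank_key`, and the sample outputs are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Sequence, Tuple


def aggregation_score(
    data_labels: Sequence[str], output_labels: Sequence[str]
) -> Tuple[float, int]:
    """Return (aggregation ratio, number of distinct output labels used)."""
    if not data_labels or len(data_labels) != len(output_labels):
        raise ValueError("label sequences must be non-empty and of equal length")
    by_data_label: Dict[str, Counter] = defaultdict(Counter)
    for data_label, output_label in zip(data_labels, output_labels):
        by_data_label[data_label][output_label] += 1
    # For each learning-data label, count the pieces that fall under its
    # most frequent output label.
    aggregated = sum(c.most_common(1)[0][1] for c in by_data_label.values())
    return aggregated / len(data_labels), len(set(output_labels))


def rank_key(result: Tuple[float, int]) -> Tuple[float, int]:
    # Assumed ranking: higher aggregation first, then fewer output labels.
    ratio, n_labels = result
    return (-ratio, n_labels)


# Made-up outputs for six pieces of learning data labeled A or B.
data_labels = ["A", "A", "A", "B", "B", "B"]
outputs = {
    "base model 1": ["v", "w", "x", "y", "z", "v"],
    "base model 2": ["v", "v", "v", "w", "w", "w"],
}
best = min(outputs, key=lambda m: rank_key(aggregation_score(data_labels, outputs[m])))
print(best)  # base model 2
```

The same function could be applied to child models, since the text allows them to be evaluated with a method similar to the one used for the base models.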
The model creation apparatus 10 evaluates all the models until there are no longer models below the child models, and selects the best model as the transfer-source model (step S15). For example, the model creation apparatus 10 selects the child model 2ab, which is the third-generation model starting from the base model 2, as shown in
Next, referring to the flowchart of
The model creation apparatus 10 then performs transfer learning of the labeled pieces of learning data using the read model 2ab as the transfer-source model (step S23). As a result of the learning, the model creation apparatus 10 creates a new model and stores information on the new model in the model storage unit 17 (step S24). At this time, the model creation apparatus 10 stores the created new model in the model storage unit 17 such that the model is associated with the transfer-source model as a child model of the transfer-source model, that is, as a subordinate thereof. For example, as shown in
When selecting or learning a model as described above, or in accordance with a request from a user, the model creation apparatus 10 may output information indicating the association between the models as shown in
As seen above, the present invention first inputs the pieces of learning data to the registered models and selects the model based on the output results from the models, creates the new model by inputting the pieces of learning data to the selected model and performing machine learning, and registers the newly created model such that the newly created model is associated with the selected model. This allows for selecting a model from the registered models in accordance with the characteristics of the learning data and creating a new model by performing machine learning using such a model. This means that a model suitable to the learning data can be selected in transfer learning. Also, by registering the source model used in transfer learning and the model newly created by transfer learning in an associated manner, a model to be transferred can be selected from among the models registered in an associated manner. As a result, a more suitable model can be selected in transfer learning.
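The associations between a transfer-source model and the models created from it form a tree, and that tree can be output as text for display. The sketch below assumes the associations are held as a hypothetical child-to-parent mapping; the function name `render_lineage` and the arrow notation are illustrative choices rather than the display format of the apparatus.

```python
from typing import Dict, List, Optional


def render_lineage(parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Render child -> parent associations as indented lines with arrows."""
    children: Dict[Optional[str], List[str]] = {}
    for name, parent in parent_of.items():
        children.setdefault(parent, []).append(name)

    lines: List[str] = []

    def walk(name: str, depth: int) -> None:
        # Base models are printed flush left; each later generation is indented.
        prefix = "" if depth == 0 else "  " * (depth - 1) + "-> "
        lines.append(prefix + name)
        for child in sorted(children.get(name, [])):
            walk(child, depth + 1)

    for root in sorted(children.get(None, [])):
        walk(root, 0)
    return lines


# Hypothetical registry contents; None marks a base model.
parent_of = {
    "2": None,
    "2a": "2",
    "2aa": "2a",
    "2ab": "2a",
    "2ac": "2a",
    "2aba": "2ab",
}
lineage = render_lineage(parent_of)
print("\n".join(lineage))
```

Each arrow points from a transfer source to a model created from it, so the new model 2aba appears one level below the child model 2ab.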
Next, a second example embodiment of the present invention will be described with reference to
First, referring to
a CPU (central processing unit) 101 (arithmetic logic unit);
a ROM (read-only memory) 102 (storage unit);
a RAM (random-access memory) 103 (storage unit);
programs 104 loaded into the RAM 103;
a storage unit 105 storing the programs 104;
a drive unit 106 that writes and reads to and from a storage medium 110 outside the information processing apparatus;
a communication interface 107 that connects with a communication network 111 outside the information processing apparatus;
an input/output interface 108 through which data is outputted and inputted; and
a bus 109 through which the components are connected to each other.
When the CPU 101 acquires and executes the programs 104, a selector 121, a learning unit 122, and a registration unit 123 shown in
The hardware configuration of the information processing apparatus serving as the model creation apparatus 100 shown in
The model creation apparatus 100 performs a model creation method shown in the flowchart of
As shown in
The present invention thus configured is able to select a model from the registered models in accordance with the characteristics of the learning data and to create a new model by performing machine learning using this model. Thus, the present invention is able to select a model suitable to the learning data in transfer learning. Also, by previously registering the new model created using transfer learning such that the new model is associated with the source model used in transfer learning, a model to be transferred can be selected from among the models registered in an associated manner. As a result, a more suitable model can be selected in transfer learning.
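The three steps of the method (select, learn, register in association with the source) can be sketched end to end as follows. Everything here is a toy stand-in: the `Registry` class, the `create_model` function, and the `score` and `train` callables are hypothetical, and the "models" are plain numbers rather than trained networks. For brevity, the selection step simply scores every registered model.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Any, Callable, Dict, Optional, Sequence


@dataclass
class Registry:
    models: Dict[str, Any] = field(default_factory=dict)
    parent: Dict[str, Optional[str]] = field(default_factory=dict)

    def register(self, name: str, model: Any, parent: Optional[str] = None) -> None:
        if parent is not None and parent not in self.models:
            raise KeyError(f"unknown transfer source: {parent}")
        self.models[name] = model
        self.parent[name] = parent  # association with the transfer source


def create_model(
    registry: Registry,
    learning_data: Sequence[float],
    score: Callable[[Any, Sequence[float]], float],
    train: Callable[[Any, Sequence[float]], Any],
    new_name: str,
) -> str:
    # 1. Select the registered model whose outputs suit the learning data best.
    source = max(registry.models, key=lambda n: score(registry.models[n], learning_data))
    # 2. Create a new model by further learning, starting from the selected model.
    new_model = train(registry.models[source], learning_data)
    # 3. Register the new model so that it is associated with the selected model.
    registry.register(new_name, new_model, parent=source)
    return source


# Toy stand-ins: a "model" is a number, scoring is closeness to the data mean,
# and "training" moves the number halfway toward the data mean.
registry = Registry()
registry.register("base model 1", 0.0)
registry.register("base model 2", 5.0)
data = [4.0, 6.0, 8.0]
source = create_model(
    registry,
    data,
    score=lambda m, d: -abs(m - mean(d)),
    train=lambda m, d: (m + mean(d)) / 2,
    new_name="2a",
)
print(source, registry.parent["2a"], registry.models["2a"])
```

Storing the parent link at registration time is what later allows a search to start from a base model and continue among the models derived from it.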
The above programs can be stored in various types of non-transitory computer-readable media and provided to a computer. The non-transitory computer-readable media include various types of tangible storage media. The non-transitory computer-readable media include, for example, a magnetic recording medium (for example, a flexible disk, a magnetic tape, a hard disk drive), a magneto-optical recording medium (for example, a magneto-optical disk), a CD-ROM (read-only memory), a CD-R, a CD-R/W, and a semiconductor memory (for example, a mask ROM, a PROM (programmable ROM), an EPROM (erasable PROM), a flash ROM, a RAM (random-access memory)). The programs may be provided to a computer by using various types of transitory computer-readable media. The transitory computer-readable media include, for example, an electric signal, an optical signal, and an electromagnetic wave. The transitory computer-readable media can provide the programs to a computer via a wired communication channel such as an electric wire or optical fiber, or via a wireless communication channel.
While the present invention has been described with reference to the example embodiments and so on, the present invention is not limited to the example embodiments described above. The configuration or details of the present invention can be changed in various manners that can be understood by one skilled in the art within the scope of the present invention.
The present invention is based upon and claims the benefit of priority from Japanese Patent Application 2019-046365 filed on Mar. 13, 2019 in Japan, the disclosure of which is incorporated herein in its entirety by reference.
Some or all of the embodiments can be described as in Supplementary Notes below. While the configurations of the model creation method, model creation apparatus, and program according to the present invention are outlined below, the present invention is not limited thereto.
(Supplementary Note 1)
A model creation method comprising:
selecting a model based on output results obtained by inputting pieces of learning data to registered models;
creating a new model by inputting the pieces of learning data to the selected model and performing machine learning; and
registering the created new model such that the new model is associated with the selected model.
(Supplementary Note 2)
The model creation method according to Supplementary Note 1, further comprising:
if there are models registered so as to be associated with the selected model, selecting a new model based on output results obtained by inputting the pieces of learning data to the models registered so as to be associated with the selected model;
creating another new model by inputting the pieces of learning data to the selected new model and performing machine learning; and
registering the created other new model such that the other new model is associated with the selected new model.
(Supplementary Note 3)
The model creation method according to Supplementary Note 1 or 2, wherein the selecting the model comprises selecting the model based on labels attached to the pieces of learning data and labels of the output results obtained by inputting the pieces of learning data to the registered models.
(Supplementary Note 4)
The model creation method according to Supplementary Note 3, wherein the selecting the model comprises, if the pieces of learning data provided with an identical label are inputted to a registered model and if the pieces of learning data provided with the identical label are aggregated in an output result provided with an identical label of the registered model, selecting the registered model.
(Supplementary Note 5)
The model creation method according to Supplementary Note 3 or 4, wherein the selecting the model comprises, if the pieces of learning data are inputted to a registered model and if the number of labeled output results of the registered model is smaller, selecting the registered model.
(Supplementary Note 6)
The model creation method according to any one of Supplementary Notes 1 to 5, further comprising outputting associations between the models for display.
(Supplementary Note 7)
A model creation apparatus comprising:
a selector configured to select a model based on output results obtained by inputting pieces of learning data to registered models;
a learning unit configured to create a new model by inputting the pieces of learning data to the selected model and performing machine learning; and
a registration unit configured to register the created new model such that the new model is associated with the selected model.
(Supplementary Note 7.1)
The model creation apparatus according to Supplementary Note 7, wherein
if there are models registered so as to be associated with the selected model, the selector selects a new model based on output results obtained by inputting the pieces of learning data to the models registered so as to be associated with the selected model,
the learning unit creates another new model by inputting the pieces of learning data to the selected new model and performing machine learning, and
the registration unit registers the created other new model such that the other new model is associated with the selected new model.
(Supplementary Note 7.2)
The model creation apparatus according to Supplementary Note 7 or 7.1, wherein when selecting the model, the selector selects the model based on labels attached to the pieces of learning data and labels of the output results obtained by inputting the pieces of learning data to the registered models.
(Supplementary Note 7.3)
The model creation apparatus according to Supplementary Note 7.2, wherein if the pieces of learning data provided with an identical label are inputted to a registered model and if the pieces of learning data provided with the identical label are aggregated in an output result provided with an identical label of the registered model, the selector selects the registered model.
(Supplementary Note 7.4)
The model creation apparatus according to Supplementary Note 7.2 or 7.3, wherein if the pieces of learning data are inputted to a registered model and if the number of labeled output results of the registered model is smaller, the selector selects the registered model.
(Supplementary Note 7.5)
The model creation apparatus according to any one of Supplementary Notes 7 to 7.4, wherein the registration unit outputs associations between the registered models for display.
A program for implementing, in an information processing apparatus:
a selector configured to select a model based on output results obtained by inputting pieces of learning data to registered models;
a learning unit configured to create a new model by inputting the pieces of learning data to the selected model and performing machine learning; and
a registration unit configured to register the created new model such that the new model is associated with the selected model.
| Number | Date | Country | Kind |
|---|---|---|---|
| 2019-046365 | Mar 2019 | JP | national |

| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/JP2020/006001 | 2/17/2020 | WO | 00 |