The present disclosure relates to an information processing device, an information processing method, and an information processing program.
In recent years, with progress in the field of deep learning, deep learning models with high recognition performance have appeared. Training of a deep learning model may be advanced efficiently by using a large amount of labeled data, that is, data manually assigned correct answers. For example, in the field of image recognition, at least several hundred pieces of labeled data are used to recognize one object.
On the other hand, most of the training data available in an environment in which training data is actually acquired is unlabeled data assigned no correct answer, and the number of pieces of labeled data is small. For example, in a case where there are several hundred pieces of training data in total, often only several tens of them are labeled. In a case where the number of pieces of labeled data is small, the deep learning model fits the data used for training too closely, and performance for data not used for training is degraded. Such an event is referred to as overfitting. Therefore, there is a need for a method of training a deep learning model with high recognition performance using unlabeled data as well.
Conventionally, the following technologies have been provided in deep learning. One is a technology of acquiring a feature extraction capability that is a capability of extracting an image feature used for recognition from unlabeled data. Specifically, a feature amount is extracted from unlabeled data by using a deep learning model, pieces of data are grouped together and divided into a plurality of clusters based on the extracted feature amount, and a pseudo label that is a pseudo correct answer is assigned to each cluster to perform training, thereby acquiring a feature amount extraction capability.
The other one is a technology of giving a feature extraction capability acquired in advance to a deep learning model, and then performing training with labeled data limited to a capability of identifying data based on an extracted feature. This technology is referred to as transfer learning.
Additionally, a method is conceivable in which the two technologies described above are combined: a feature extraction capability is first acquired from unlabeled data, and training with labeled data is then performed, limited to the capability of identifying data based on the extracted feature. With this configuration, a deep learning model with high recognition performance may be acquired even with a small amount of labeled data.
Examples of the related art include [Non-Patent Document 1] Self-labelling via simultaneous clustering and representation learning, Yuki M. Asano, Christian Rupprecht, Andrea Vedaldi, ICLR2020, 20 Aug. 2020.
According to an aspect of the embodiments, there is provided an information processing device including: memory configured to store a plurality of pieces of labeled data in which a label that represents a correct answer is associated with object data, a plurality of pieces of unlabeled data that is object data not associated with a correct answer, and a deep learning model; and processor circuitry coupled to the memory, the processor circuitry being configured to perform processing including: generating a pseudo label based on the plurality of pieces of unlabeled data and the deep learning model; calculating, based on the pseudo label and each label included in the plurality of pieces of labeled data, a loss for a result obtained by identifying the plurality of pieces of unlabeled data through the deep learning model, and a loss for a result obtained by identifying the plurality of pieces of labeled data through the deep learning model; and updating the deep learning model based on the calculated losses.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
However, in a case where the training is performed limited to the capability of identifying the data based on the extracted feature after the acquisition of the feature extraction capability from the unlabeled data, each of the feature extraction capability and the identification capability of the deep learning model is individually trained and optimized. In other words, the feature amount extraction capability is optimized by the acquisition of the feature extraction capability from the unlabeled data, and the identification capability is optimized by the training limited to the capability of identifying the data based on the extracted feature. Thus, in a case where each processing is sequentially performed, it is difficult to tune the feature amount extraction capability according to the identification capability, resulting in a local optimal solution. Therefore, performance of the deep learning model in the entire recognition is lowered.
The disclosed technology has been made in view of the above, and an object thereof is to provide an information processing device, an information processing method, and an information processing program that improve recognition performance of a deep learning model.
Hereinafter, embodiments of an information processing device, an information processing method, and an information processing program disclosed in the present application will be described in detail with reference to the drawings. Note that the following embodiments do not limit the information processing device, the information processing method, and the information processing program disclosed in the present application.
The storage unit 11 stores the deep learning model 110, an unlabeled database (DB) 111, and a labeled DB 112.
The deep learning model 110 is, in the present embodiment, a learning model that performs image recognition. The deep learning model 110 includes a feature amount extraction layer that extracts a feature of image data and an identification layer that identifies an object appearing in the image data from a feature amount of the image data.
The unlabeled DB 111 is a database that stores unlabeled data 201 which is image data. The unlabeled DB 111 stores the unlabeled data 201 input from a user by using an external terminal device or the like. The unlabeled data 201 is training data to which no correct answer label, indicating what object appears in the image data, is assigned.
The labeled DB 112 is a database that stores labeled data 202 which is image data. The labeled DB 112 stores the labeled data 202 input from a user by using an external terminal device or the like. The labeled data 202 is training data assigned with a correct answer label.
The pseudo label generation unit 12 acquires the deep learning model 110 stored in the storage unit 11. Furthermore, the pseudo label generation unit 12 reads a plurality of pieces of the unlabeled data 201 from the unlabeled DB 111. At this time, the pseudo label generation unit 12 preferably reads all pieces of the unlabeled data 201. Next, the pseudo label generation unit 12 inputs each piece of the unlabeled data 201 included in the read image group to the deep learning model 110, and acquires output corresponding to each piece of the unlabeled data 201.
Next, the pseudo label generation unit 12 groups the respective pieces of the unlabeled data 201 included in the read image group according to output values from the deep learning model 110, and divides the grouped pieces of the unlabeled data 201 into a predetermined number of clusters. For example, the pseudo label generation unit 12 performs the clustering by using k-means clustering.
Then, the pseudo label generation unit 12 assigns a pseudo label that is a pseudo correct answer to each cluster. For example, in a case where there are k classes, the pseudo label generation unit 12 assigns the pseudo labels such as a class #1, a class #2, a class #3, . . . , and a class #k. Thereafter, the pseudo label generation unit 12 outputs the pseudo label assigned to each cluster together with information regarding the unlabeled data 201 included in each cluster to the loss calculation unit 14.
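The clustering and pseudo-labeling steps described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the pseudo label generation unit 12. It assumes that the outputs of the deep learning model 110 are already available as plain lists of numbers, and it uses a simple k-means loop whose centroids are initialized from the first k points so that the result is deterministic. The names `kmeans` and `assign_pseudo_labels` are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Vector], k: int, iterations: int = 20) -> List[int]:
    """Return a cluster index (0..k-1) for each point.

    Assumption: centroids start at the first k points so that this sketch
    is deterministic; practical k-means usually uses random initialization.
    """
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def assign_pseudo_labels(model_outputs: List[Vector], k: int) -> List[str]:
    """Name the clusters 'class #1' .. 'class #k' and label every sample."""
    return [f"class #{c + 1}" for c in kmeans(model_outputs, k)]


# Hypothetical model outputs for four pieces of unlabeled data.
outputs = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 4.9]]
pseudo_labels = assign_pseudo_labels(outputs, k=2)
```

In this sketch each piece of unlabeled data carries the pseudo label of its cluster, which corresponds to passing the pseudo labels together with the cluster membership information to the loss calculation unit 14.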
The model output unit 13 acquires output from the deep learning model 110 of each of the unlabeled data 201 and the labeled data 202. The model output unit 13 includes a first model output unit 131 and a second model output unit 132.
Furthermore, the loss calculation unit 14 compares an output value from the deep learning model 110 with a pseudo label or a label assigned to the labeled data 202, and calculates each loss. The loss calculation unit 14 includes a first loss calculation unit 141 and a second loss calculation unit 142. Hereinafter, operation of the model output unit 13 and the loss calculation unit 14 will be described in detail.
The first model output unit 131 acquires the deep learning model 110 stored in the storage unit 11. Furthermore, the first model output unit 131 reads a plurality of pieces of the unlabeled data 201 used for training of the deep learning model 110 from the unlabeled DB 111.
Next, the first model output unit 131 inputs each piece of the unlabeled data 201 included in the read image group to the feature amount extraction layer of the deep learning model 110, and obtains output from the deep learning model 110 corresponding to each piece of the unlabeled data 201. For example, in a case where the read image group is Du and the unlabeled data 201 included in Du is xu, the first model output unit 131 acquires yu, which is the output of the deep learning model 110, by using the following Expression (1).
[Expression 1]
y_u = h_{unsup}(f(x_u)) \qquad (1)
Here, f represents the feature amount extraction layer of the deep learning model 110. In other words, f(xu) represents output from the feature amount extraction layer. Furthermore, hunsup represents an identification layer for unlabeled data of the deep learning model 110. In other words, hunsup(f(xu)) is output obtained by inputting the output from the feature amount extraction layer to the identification layer.
Thereafter, the first model output unit 131 outputs an output value of the deep learning model 110 for each piece of the unlabeled data 201 to the first loss calculation unit 141 of the loss calculation unit 14. For example, the first model output unit 131 outputs yu, which is the output of the deep learning model 110, to the first loss calculation unit 141.
The first loss calculation unit 141 calculates a loss in a case where the unlabeled data 201 is used. Hereinafter, the loss may be referred to as Loss.
The first loss calculation unit 141 receives input of the output value from the deep learning model 110 for the unlabeled data 201 from the first model output unit 131. Moreover, the first loss calculation unit 141 receives, from the pseudo label generation unit 12, input of a pseudo label for each cluster created by clustering the unlabeled data 201 together with the information regarding the unlabeled data 201 included in each cluster.
Next, the first loss calculation unit 141 compares the acquired output value with the pseudo label, and calculates a Loss in a case where the unlabeled data 201 is used, which is an error between an estimation result using the deep learning model 110 and the pseudo label that is a correct answer here. For example, the first loss calculation unit 141 calculates Lunsup, which is the Loss in a case where the unlabeled data 201 is used, by using the following Expression (2) for yu representing the acquired output value.
Here, tu is the pseudo label. Furthermore, CE represents a general cross-entropy loss.
Thereafter, the first loss calculation unit 141 outputs the calculated loss in a case where the unlabeled data 201 is used to the update unit 15. For example, the first loss calculation unit 141 outputs the calculated Lunsup to the update unit 15.
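The comparison between the output values and the pseudo labels can be illustrated with a small cross-entropy sketch. Expression (2) is not reproduced in this text, so the sketch assumes that the loss is the mean cross-entropy over the read image group, that the model outputs are unnormalized scores passed through a softmax, and that each pseudo label is encoded as a cluster index. The function names are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: List[float], target: int) -> float:
    """CE between softmax(logits) and a one-hot target class index."""
    return -math.log(softmax(logits)[target])


def mean_loss(batch_logits: List[List[float]], targets: List[int]) -> float:
    """Assumption: the per-sample losses are averaged over the batch."""
    losses = [cross_entropy(y, t) for y, t in zip(batch_logits, targets)]
    return sum(losses) / len(losses)


# Hypothetical outputs y_u for two pieces of unlabeled data and their
# pseudo labels t_u given as cluster indices.
y_u = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
t_u = [0, 2]
l_unsup = mean_loss(y_u, t_u)
```

The same function could be applied to the outputs for the labeled data and their correct answer labels to obtain the loss for the labeled data under the same assumptions.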
The second model output unit 132 acquires the deep learning model 110 stored in the storage unit 11. Furthermore, the second model output unit 132 reads the labeled data 202 used for training of the deep learning model 110 from the labeled DB 112.
Next, the second model output unit 132 inputs each piece of the labeled data 202 included in the read image group to the feature amount extraction layer of the deep learning model 110, and obtains output from the deep learning model 110 corresponding to each piece of the labeled data 202. For example, in a case where the read image group is Di and the labeled data 202 included in Di is xi, the second model output unit 132 acquires yi, which is the output of the deep learning model 110, by using the following Expression (3).
[Expression 3]
y_i = h_{sup}(f(x_i)) \qquad (3)
Here, f represents the feature amount extraction layer of the deep learning model 110. In other words, f(xi) represents output from the feature amount extraction layer. Furthermore, hsup represents an identification layer for labeled data of the deep learning model 110. In other words, hsup(f(xi)) is output obtained by inputting the output from the feature amount extraction layer to the identification layer. As described above, in the training device 1 according to the present embodiment, training is individually performed on each of the identification layer for the unlabeled data 201 and the identification layer for the labeled data 202.
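The relationship between the shared feature amount extraction layer and the two identification layers can be sketched as follows. This is a minimal illustration under stated assumptions, not the structure of the deep learning model 110: each layer is reduced to a single weight matrix, the feature amount extraction layer uses a ReLU activation, and the class name `TwoHeadModel` and all weight values are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def linear(weights: Matrix, x: List[float]) -> List[float]:
    """Multiply a weight matrix (one row per output unit) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x: List[float]) -> List[float]:
    return [max(0.0, v) for v in x]


class TwoHeadModel:
    """Shared feature extraction layer f with heads h_unsup and h_sup."""

    def __init__(self, w_f: Matrix, w_unsup: Matrix, w_sup: Matrix) -> None:
        self.w_f = w_f
        self.w_unsup = w_unsup
        self.w_sup = w_sup

    def f(self, x: List[float]) -> List[float]:
        # Assumption: one linear map followed by ReLU stands in for f.
        return relu(linear(self.w_f, x))

    def forward_unlabeled(self, x_u: List[float]) -> List[float]:
        # Expression (1): y_u = h_unsup(f(x_u))
        return linear(self.w_unsup, self.f(x_u))

    def forward_labeled(self, x_i: List[float]) -> List[float]:
        # Expression (3): y_i = h_sup(f(x_i))
        return linear(self.w_sup, self.f(x_i))


model = TwoHeadModel(
    w_f=[[1.0, 0.0], [0.0, -1.0]],
    w_unsup=[[1.0, 1.0], [1.0, -1.0], [0.0, 2.0]],  # three clusters
    w_sup=[[2.0, 0.0], [0.0, 1.0]],                 # two labels
)
y_u = model.forward_unlabeled([3.0, 4.0])
y_i = model.forward_labeled([3.0, 4.0])
```

Both forward passes go through the same `f`, so any update to the feature amount extraction layer affects the outputs for both the unlabeled data and the labeled data, while each identification layer keeps its own parameters and output size.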
Thereafter, the second model output unit 132 outputs an output value of the deep learning model 110 for each piece of the labeled data 202 to the second loss calculation unit 142. For example, the second model output unit 132 outputs yi, which is the output of the deep learning model 110, to the second loss calculation unit 142.
The second loss calculation unit 142 receives input of the output value from the deep learning model 110 for the labeled data 202 from the second model output unit 132. Moreover, the second loss calculation unit 142 acquires a label assigned to each piece of the labeled data 202 read by the model output unit 13 from the labeled DB 112.
Next, the second loss calculation unit 142 compares the acquired output value with the label assigned to each piece of the labeled data 202, and calculates a Loss in a case where the labeled data 202 is used, which is an error between an estimation result using the deep learning model 110 and the label that is a correct answer. For example, the second loss calculation unit 142 calculates Lsup which is the Loss in a case where the labeled data 202 is used by using the following Expression (4) for yi representing the acquired output value.
Here, ti is the correct answer. Furthermore, CE represents a general cross-entropy loss.
Thereafter, the second loss calculation unit 142 outputs the calculated loss in a case where the labeled data 202 is used to the update unit 15. For example, the second loss calculation unit 142 outputs the calculated Lsup to the update unit 15.
The update unit 15 receives input of the loss in a case where the unlabeled data 201 is used from the first loss calculation unit 141. Furthermore, the update unit 15 receives input of the loss in a case where the labeled data 202 is used from the second loss calculation unit 142. Then, the update unit 15 calculates a final loss by performing weighting determined in advance on the estimation result in a case where the unlabeled data 201 is used and the estimation result in a case where the labeled data 202 is used. For example, the update unit 15 calculates Ltotal which is the final Loss by using the following Expression (5) from Lunsup which is the Loss in a case where the unlabeled data 201 is used and Lsup which is the Loss in a case where the labeled data 202 is used.
[Expression 5]
L_{total} = \alpha \cdot L_{sup} + (1 - \alpha) \cdot L_{unsup} \qquad (5)
Here, α is a parameter for balance adjustment between Lsup and Lunsup, and is a constant for weighting each. α takes a value greater than 0 and smaller than 1. As α increases, an influence on training by the estimation result in a case where the labeled data 202 is used increases.
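The weighting of Expression (5) can be written as a short function. This is an illustrative sketch. The function name `total_loss` and the numeric values are hypothetical, and the range check simply enforces the stated condition that the weight is greater than 0 and smaller than 1.

```python
def total_loss(l_sup: float, l_unsup: float, alpha: float) -> float:
    """Expression (5): L_total = alpha * L_sup + (1 - alpha) * L_unsup."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be greater than 0 and smaller than 1")
    return alpha * l_sup + (1.0 - alpha) * l_unsup


# With alpha = 0.75 the loss for the labeled data dominates the total.
l_total = total_loss(l_sup=0.5, l_unsup=1.5, alpha=0.75)
```

A value of the weight close to 1 makes the final loss follow the loss for the labeled data, and a value close to 0 makes it follow the loss for the unlabeled data.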
Thereafter, the update unit 15 obtains a parameter of the feature amount extraction layer of the deep learning model 110, a parameter of the identification layer for the unlabeled data 201, and a parameter of the identification layer for the labeled data 202 such that the calculated final loss is minimized. Then, the update unit 15 updates the deep learning model 110 held by the model output unit 13 with the obtained parameter of the feature amount extraction layer of the deep learning model 110 and the obtained parameter of the identification layer for the unlabeled data 201. Furthermore, the update unit 15 updates the deep learning model 110 held by the model output unit 13 with the obtained parameter of the feature amount extraction layer of the deep learning model 110 and the obtained parameter of the identification layer for the labeled data 202. For example, the update unit 15 updates the deep learning model 110 held by each of the first model output unit 131 and the second model output unit 132 with f, hsup, and hunsup that minimize Ltotal.
As described above, in the training device 1 according to the present embodiment, the deep learning model 110 for the unlabeled data 201 and the deep learning model 110 for the labeled data 202 are trained separately from each other and simultaneously in parallel. Note that the feature amount extraction layer is the same and the identification layer is different between the deep learning model 110 for the unlabeled data 201 and the deep learning model 110 for the labeled data 202. Then, in a recognition phase after the training, unknown image data is recognized by using the trained deep learning model 110 for the labeled data 202 held by the model output unit 13.
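The joint update of the shared feature amount extraction layer and the two identification layers can be illustrated with a toy example. The text does not specify an optimizer, so this sketch assumes plain gradient descent with finite-difference gradients, a single scalar weight for the feature amount extraction layer, two weights per identification layer, one labeled sample, and one unlabeled sample with a pseudo label. All names and numeric values are hypothetical.

```python
import math
from typing import Callable, List


def ce(logits: List[float], target: int) -> float:
    """Cross-entropy of softmax(logits) against a class index."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[target]


def l_total(p: List[float], alpha: float = 0.5) -> float:
    # p = [w_f, w_sup0, w_sup1, w_unsup0, w_unsup1]
    w_f, s0, s1, u0, u1 = p
    x_i, t_i = 1.0, 0  # one labeled sample and its label
    x_u, t_u = 2.0, 1  # one unlabeled sample and its pseudo label
    l_sup = ce([s0 * w_f * x_i, s1 * w_f * x_i], t_i)
    l_unsup = ce([u0 * w_f * x_u, u1 * w_f * x_u], t_u)
    return alpha * l_sup + (1.0 - alpha) * l_unsup


def numerical_grad(
    fn: Callable[[List[float]], float], p: List[float], eps: float = 1e-6
) -> List[float]:
    """Central finite differences; an assumption made to keep this short."""
    grad = []
    for k in range(len(p)):
        hi = p[:k] + [p[k] + eps] + p[k + 1:]
        lo = p[:k] + [p[k] - eps] + p[k + 1:]
        grad.append((fn(hi) - fn(lo)) / (2 * eps))
    return grad


def update(p: List[float], lr: float = 0.1) -> List[float]:
    """One gradient descent step on all parameters at once."""
    g = numerical_grad(l_total, p)
    return [v - lr * gv for v, gv in zip(p, g)]


params = [1.0, 0.1, -0.1, 0.2, -0.2]
before = l_total(params)
for _ in range(50):
    params = update(params)
after = l_total(params)
```

Because the shared weight appears in both loss terms, its gradient combines contributions from the labeled and the unlabeled data in every step, whereas each identification layer receives a gradient only from its own loss term.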
First, a plurality of the unlabeled data 201 and a plurality of the labeled data 202 are prepared and stored in the unlabeled DB 111 and the labeled DB 112, respectively. As illustrated in
Next, for each piece of the unlabeled data 201 and the labeled data 202, feature amount extraction is performed by the first model output unit 131 and the second model output unit 132 by using the feature amount extraction layer of the deep learning model 110 (Step S1).
Next, the training using the unlabeled data 201 proceeds in a direction of an arrow on an upper side of a paper surface from the feature amount extraction layer toward the identification layer in
The training device 1 acquires the unlabeled data 201 and stores the acquired unlabeled data 201 in the unlabeled DB 111. Furthermore, the training device 1 acquires the labeled data 202 and stores the acquired labeled data 202 in the labeled DB 112 (Step S11).
The update unit 15 acquires a frequency threshold input from an external terminal device or the like (Step S12).
Next, the update unit 15 initializes and sets the number of times of training to 0 (Step S13).
The pseudo label generation unit 12 reads a plurality of pieces of the unlabeled data 201 from the unlabeled DB 111 to perform classification, generates a pseudo label for each class, and assigns the pseudo label to each class (Step S14).
The first model output unit 131 and the second model output unit 132, the first loss calculation unit 141 and the second loss calculation unit 142, and the update unit 15 execute simultaneous training using the labeled data 202 and the unlabeled data 201 (Step S15).
Thereafter, the update unit 15 determines whether or not the number of times of training exceeds the frequency threshold (Step S16). In a case where the number of times of training is equal to or less than the frequency threshold (Step S16: No), the update unit 15 increments the number of times of training by 1 (Step S17). Thereafter, the training processing returns to Step S14.
On the other hand, in a case where the number of times of training exceeds the frequency threshold (Step S16: Yes), the update unit 15 ends the training processing in the training device 1.
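The control flow of Steps S13 to S17 can be sketched as a loop. This is a reading of the flow as described, not the implementation of the training device 1. The callables `generate_pseudo_labels` and `train_once` are hypothetical stand-ins for Step S14 and Step S15.

```python
from typing import Callable


def run_training(
    frequency_threshold: int,
    generate_pseudo_labels: Callable[[], object],
    train_once: Callable[[object], None],
) -> int:
    """Return how many times Steps S14 and S15 were executed."""
    training_count = 0                              # Step S13
    executed = 0
    while True:
        pseudo_labels = generate_pseudo_labels()    # Step S14
        train_once(pseudo_labels)                   # Step S15
        executed += 1
        if training_count > frequency_threshold:    # Step S16: Yes
            return executed
        training_count += 1                         # Step S17


calls = []
passes = run_training(
    frequency_threshold=2,
    generate_pseudo_labels=lambda: "pseudo labels",
    train_once=calls.append,
)
```

Read literally, the counter starts at 0 and the check in Step S16 uses "exceeds", so a threshold of 2 yields 4 passes through Steps S14 and S15 in this sketch. The pseudo labels are regenerated in every pass because the flow returns to Step S14 rather than to Step S15.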
The second model output unit 132 reads a plurality of pieces of the labeled data 202 from the labeled DB 112. Then, the second model output unit 132 inputs each piece of the read labeled data 202 to the feature amount extraction layer of the deep learning model 110. Thereafter, the second model output unit 132 acquires output from the deep learning model 110 (Step S101).
The second loss calculation unit 142 acquires a label assigned to the labeled data 202 read by the second model output unit 132 from the labeled DB 112. Then, the second loss calculation unit 142 compares an output value corresponding to each piece of the labeled data 202 acquired from the second model output unit 132 with the label assigned to the labeled data 202, and calculates a Loss in a case where the labeled data 202 is used (Step S102).
The first model output unit 131 reads a plurality of pieces of the unlabeled data 201 from the unlabeled DB 111. Then, the first model output unit 131 inputs each piece of the read unlabeled data 201 to the feature amount extraction layer of the deep learning model 110. Thereafter, the first model output unit 131 acquires output from the deep learning model 110 (Step S103).
The first loss calculation unit 141 compares an output value corresponding to each piece of the unlabeled data 201 acquired from the first model output unit 131 with a pseudo label acquired from the pseudo label generation unit 12, and calculates a Loss in a case where the unlabeled data 201 is used (Step S104).
The update unit 15 acquires the Loss in a case where the labeled data 202 is used from the second loss calculation unit 142. Furthermore, the update unit 15 acquires the Loss in a case where the unlabeled data 201 is used from the first loss calculation unit 141. Then, the update unit 15 calculates an overall Loss by using the respective weights for the Loss in a case where the labeled data 202 is used and the Loss in a case where the unlabeled data 201 is used (Step S105).
Thereafter, the update unit 15 updates the deep learning model 110 included in each of the first model output unit 131 and the second model output unit 132 so as to minimize the overall Loss (Step S106).
As described above, the training device according to the present embodiment divides the unlabeled data into the plurality of clusters, assigns the pseudo labels to the respective clusters, and executes the training of the deep learning model by using the labeled data, the unlabeled data, and the pseudo labels. With this configuration, the training device may simultaneously train the feature amount extraction layer and the identification layer of the deep learning model by using both the labeled data and the unlabeled data. Therefore, even in a case where the training is performed by using a large number of pieces of the unlabeled data and a small number of pieces of the labeled data, optimal recognition performance may be acquired, and the recognition performance of the deep learning model may be improved.
In the training device 1 according to the present embodiment, the number of labels represented by labeled data 202 is equal to the number of clusters in a case where unlabeled data is clustered. In other words, in the training device 1 according to the present embodiment, in a case where the number of labels represented by the labeled data 202 is ti and unlabeled data 201 is classified into tu clusters, tu=ti is satisfied.
A pseudo label generation unit 12 performs clustering in a manner similar to that of the first embodiment by using the unlabeled data 201, and divides the unlabeled data 201 into a plurality of clusters. At this time, the pseudo label generation unit 12 classifies the unlabeled data 201 into as many clusters as the number of labels represented by the labeled data 202. Then, the pseudo label generation unit 12 assigns pseudo labels to the respective clusters. Thereafter, the pseudo label generation unit 12 outputs the generated pseudo labels to a loss calculation unit 14.
A model output unit 13 reads a plurality of pieces of the unlabeled data 201 from an unlabeled DB 111. Furthermore, the model output unit 13 reads a plurality of pieces of the labeled data 202 from a labeled DB 112. Then, the model output unit 13 integrates the read unlabeled data 201 and the read labeled data 202 into integrated data. Then, the model output unit 13 inputs the integrated data to a deep learning model 110 and acquires output.
For example, in a case where the integrated data is x, the model output unit 13 acquires y that is the output from the deep learning model 110 represented by the following Expression (6).
[Expression 6]
y=h(f(x)) (6)
Here, f represents a feature amount extraction layer of the deep learning model 110. In other words, f(x) is output from the feature amount extraction layer. h represents an identification layer of the deep learning model 110. In other words, h(f(x)) is output obtained by inputting an output value from the feature amount extraction layer to the identification layer.
Thereafter, the model output unit 13 outputs an output value for each piece of the integrated data to the loss calculation unit 14.
The loss calculation unit 14 receives input of the output value from the deep learning model 110 for each piece of the integrated data from the model output unit 13. Furthermore, the loss calculation unit 14 acquires a label representing each piece of the labeled data 202 stored in the labeled DB 112 from the labeled DB 112. Furthermore, the loss calculation unit 14 receives input of a pseudo label for each class from the pseudo label generation unit 12.
Next, the loss calculation unit 14 integrates the label acquired from the labeled DB 112 and the pseudo label to generate an integrated label. For example, since the number of labels acquired from the labeled DB 112 and the number of pseudo labels are the same, the loss calculation unit 14 generates the integrated label by replacing each of the pseudo labels with a label determined to refer to the same object.
Thereafter, the loss calculation unit 14 compares the output value from the deep learning model 110 for each piece of the integrated data with the integrated label corresponding to each piece of the integrated data, and calculates a loss in a case where the integrated data is used.
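The generation of the integrated label can be sketched as follows. How a pseudo label is determined to refer to the same object as a label is not described here, so this sketch assumes that the correspondence is already available as a dictionary, and that the integrated data lists the labeled data first and the unlabeled data second. The function name, the dictionary, and the label names "cat" and "dog" are hypothetical.

```python
from typing import Dict, List


def build_integrated_labels(
    labels: List[str],
    pseudo_labels: List[str],
    pseudo_to_label: Dict[str, str],
) -> List[str]:
    """Labels of the labeled data followed by mapped pseudo labels.

    Assumption: `pseudo_to_label` already records which label each pseudo
    label was determined to refer to; that decision is outside this sketch.
    """
    if len(set(pseudo_to_label.values())) != len(pseudo_to_label):
        raise ValueError("each pseudo label must map to a distinct label")
    return labels + [pseudo_to_label[p] for p in pseudo_labels]


labels = ["cat", "dog"]                               # labeled data
pseudo_labels = ["class #2", "class #1", "class #2"]  # unlabeled data
mapping = {"class #1": "cat", "class #2": "dog"}
integrated = build_integrated_labels(labels, pseudo_labels, mapping)
```

The distinctness check reflects the condition that the number of labels and the number of pseudo labels are the same, so that the replacement is one to one.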
For example, in a case where a set including all x that is the integrated data is D and the integrated label is t, the loss calculation unit 14 calculates L that is the Loss in a case where the integrated data is used, by using the following Expression (7). Here, CE is a general cross-entropy loss.
Thereafter, the loss calculation unit 14 outputs the calculated loss to an update unit 15. For example, the loss calculation unit 14 outputs L, which is the Loss in a case where the integrated data is used, calculated by using Expression (7) to the update unit 15.
The update unit 15 receives input of the loss from the loss calculation unit 14. Then, the update unit 15 determines a parameter of the deep learning model 110 that minimizes the loss. Thereafter, the update unit 15 updates the deep learning model 110 included in the model output unit 13 by using the determined parameter.
For example, in a case where L that is the Loss in a case where the integrated data is used is acquired from the loss calculation unit 14, the update unit 15 updates f that is the feature amount extraction layer and h that is the identification layer so as to minimize L. In other words, in the present embodiment, training is performed by using one deep learning model 110 having a similar feature amount extraction layer and identification layer for both the unlabeled data 201 and the labeled data 202.
The model output unit 13 reads the unlabeled data 201 and the labeled data 202 to generate integrated data. Next, the model output unit 13 inputs the integrated data to the deep learning model 110, and acquires output from the deep learning model 110 corresponding to each piece of the integrated data (Step S201).
The pseudo label generation unit 12 performs clustering in a manner similar to that of the first embodiment by using the unlabeled data 201, and divides the unlabeled data 201 into as many clusters as the number of labels representing the labeled data 202 stored in the labeled DB 112. Then, the pseudo label generation unit 12 assigns pseudo labels to the respective clusters (Step S202).
The loss calculation unit 14 integrates the pseudo label and the label representing the labeled data 202 stored in the labeled DB 112 to generate an integrated label. Then, the loss calculation unit 14 compares an output value corresponding to each piece of the integrated data with the integrated label to calculate a loss. The update unit 15 performs training by updating the feature amount extraction layer and the identification layer of the deep learning model 110 included in the model output unit 13 so as to minimize the loss calculated by the loss calculation unit 14 (Step S203).
As described above, the training device according to the present embodiment classifies the unlabeled data into as many clusters as the number of labels representing the labeled data. Then, the training device generates the integrated data obtained by integrating the labeled data and the unlabeled data, generates the integrated label by integrating the label of the labeled data and the pseudo label, and performs the training by using the integrated data and the integrated label. With this configuration, the deep learning model may be trained as a single task by using a single identification layer. Also in this method, even in a case where the training is performed by using a large number of pieces of the unlabeled data and a small number of pieces of the labeled data, optimal recognition performance may be acquired, and the recognition performance of the deep learning model may be improved.
Next, a third embodiment will be described. In the first and second embodiments, the case where the image data is used as the training data has been described as an example, but it is possible to similarly perform training by using unlabeled data and labeled data even when data is other than the image data.
For example, a training device 1 may perform training of a deep learning model 110 by using a moving image as training data. The moving image is a set of RGB values according to a lapse of time in each pixel in a screen. In that case, it is possible to identify a type of an unknown moving image by using the trained deep learning model 110.
Besides, the training device 1 may also perform the training of the deep learning model 110 by using joint data as the training data.
As described above, the training device according to each embodiment may perform the training of the deep learning model by using data other than the image data, such as the moving image data and the joint data. Additionally, even in a case where data other than the image data is used, by performing the training by using a large number of pieces of the unlabeled data and a small number of pieces of the labeled data, optimal recognition performance may be acquired, and the recognition performance of the deep learning model may be improved.
(Hardware Configuration)
The computer 90 includes a processor 901, a main storage device 902, an auxiliary storage device 903, an input device 904, an output device 905, a medium drive device 906, an input/output interface 907, and a communication control device 908. The respective components of the computer 90 are coupled to each other by a bus 909.
The processor 901 is, for example, a central processing unit (CPU). The computer 90 may include a plurality of the processors 901. Moreover, the computer 90 may include a graphics processing unit (GPU) or the like as the processor 901. The processor 901 loads a program in the main storage device 902, and executes the program.
The main storage device 902 is, for example, a random access memory (RAM). The auxiliary storage device 903 is, for example, a nonvolatile storage device such as a hard disk drive (HDD) or a solid state drive (SSD). For example, the auxiliary storage device 903 implements the function of the storage unit 11 in
The input device 904 is, for example, a keyboard, a pointing device, or a combination thereof. The pointing device may be, for example, a mouse, a touch pad, or a touch screen. The output device 905 is a display, a speaker, or a combination thereof. The display may be a touch screen.
The input/output interface 907 is coupled to a peripheral component interconnect express (PCIe) device or the like, and transmits/receives data to/from the coupled device.
The communication control device 908 is, for example, a wired local area network (LAN) interface, a wireless LAN interface, or a combination thereof. The computer 90 is coupled to a network such as a wireless LAN or a wired LAN via the communication control device 908. Specifically, the communication control device 908 may be an external network interface card (NIC) or an on-board network interface controller.
A storage medium 91 is an optical disk such as a compact disc (CD) or a digital versatile disc (DVD), a magneto-optical disk, a magnetic disk, a semiconductor memory card such as a flash memory, or the like. The medium drive device 906 is a device that writes and reads data to and from the inserted storage medium 91.
The program executed by the processor 901 may be installed in the auxiliary storage device 903 in advance. Alternatively, the program may be stored and provided in the storage medium 91, read by the medium drive device 906 from the storage medium 91, copied to the auxiliary storage device 903, and thereafter loaded in the main storage device 902. Alternatively, the program may be downloaded from a program provider to the computer 90 via the network and the communication control device 908, and installed.
For example, the processor 901 executes the program to implement the functions of the pseudo label generation unit 12, the model output unit 13, the loss calculation unit 14, and the update unit 15 exemplified in
All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
This application is a continuation application of International Application PCT/JP2021/010452 filed on Mar. 15, 2021 and designated the U.S., the entire contents of which are incorporated herein by reference.
Relation | Number | Date | Country
---|---|---|---
Parent | PCT/JP2021/010452 | Mar 2021 | US
Child | 18458363 | | US