This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2017-096006, filed on May 12, 2017, the entire contents of which are incorporated herein by reference.
The embodiment discussed herein is related to a computer-readable recording medium, a learning method, and a learning apparatus.
In natural language processing, various types of machine learning methods are used, such as the perceptron, SVMs (Support Vector Machines), PA (Passive-Aggressive), and AROW (Adaptive Regularization of Weight Vectors).
As an example, consider a case where words are picked out as features from a labeled text that is a learning object, and where a model that associates each feature with a confidence is learned by a method called the perceptron. In the perceptron method, each feature of each piece of learning data is cross-checked with the features in the model to evaluate whether the label goes against the confidence of the model. Learning data whose label goes against the confidence given by the model is classified as a wrong instance, and the model is updated by learning this wrong instance.
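The perceptron flow just outlined can be summarized in a short sketch. The following Python code is a minimal, hypothetical illustration; the function name, the data format, and the unit update step are assumptions for exposition, not part of the embodiment.

```python
# Minimal perceptron-style sketch of the flow described above.
# All names are illustrative; labels are assumed to be +1 or -1.
from collections import defaultdict

def train_perceptron(samples, rounds=3):
    """samples: list of (features, label) pairs, label in {+1, -1}."""
    model = defaultdict(float)  # feature -> confidence
    for _ in range(rounds):
        for features, label in samples:
            # Cross-check each feature with the model to score the sample.
            score = sum(model[f] for f in features)
            if label * score <= 0:
                # The label goes against the model's confidence: a wrong
                # instance, so the model learns it by updating confidences.
                for f in features:
                    model[f] += label
    return dict(model)
```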
Patent Document 1: Japanese Laid-open Patent Publication No. 2014-102555
Patent Document 2: Japanese Laid-open Patent Publication No. 2005-44330
However, in the conventional methods, cross-checking with the model and evaluation are repeated for all pieces of learning data. In other words, cross-checking and evaluation are performed every time, even for learning data that has been classified correctly multiple times in a row. As a result, a certain amount of calculation is required to execute the learning process in the conventional methods, and reducing that amount of calculation is difficult.
According to an aspect of an embodiment, a non-transitory computer-readable recording medium stores a learning program that causes a computer to execute a process including: acquiring learning data that is a learning object for a model in which data and confidence of the data are associated with each other; determining whether learning of the learning data is needed by comparing a predetermined condition with a decision result related to updating of the model accumulated for the learning data acquired at the acquiring; and excluding, from a learning object, the learning data of which learning is determined to be unneeded at the determining.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.
Preferred embodiments of the present invention will be explained with reference to accompanying drawings. These embodiments are only examples and configurations and the like are not limited to those of the embodiments.
The learning apparatus 10 illustrated in
As one embodiment, the learning apparatus 10 can be implemented by installing a learning program that executes the learning process described above on a desired computer, as package software or online software. For example, by causing an information processing apparatus to execute the learning program described above, the information processing apparatus can be caused to function as the learning apparatus 10. The information processing apparatus in this description includes, within its scope, mobile communication terminals such as smartphones and mobile phones and slate terminals such as PDAs (Personal Digital Assistants), in addition to desktop and laptop personal computers. The learning apparatus 10 can also be implemented as a server apparatus that provides a service related to the learning process described above to a client, the client being a terminal apparatus used by a user. For example, the learning apparatus 10 accepts, via a network or a storage medium, learning data labeled as a positive instance or a negative instance, or identification information with which such learning data can be loaded. The learning apparatus 10 is then implemented as a server apparatus that provides a learning service of outputting a model obtained by executing the learning process described above on the learning data. In this case, the learning apparatus 10 may be implemented as a Web server or as a cloud that provides the service related to the learning process by outsourcing.
As illustrated in
The acquiring unit 11 acquires learning data that is a learning object of a model (described below). The learning data is a text including a label of a positive instance or negative instance and a feature amount. The acquiring unit 11 acquires a feature included in the text that is a learning object.
As one embodiment, the acquiring unit 11 can also read and acquire learning data saved in an auxiliary storage device, such as a hard disk or an optical disc, or in a removable medium, such as a memory card or a USB (Universal Serial Bus) memory. In addition, the acquiring unit 11 can also receive and acquire learning data from an external apparatus via a network.
The determining unit 12 compares a predetermined condition with a decision result, related to updating of the model, that has been accumulated for the learning data acquired by the acquiring unit 11, determines whether learning of the learning data is needed, and excludes, from the learning objects, learning data of which learning is determined to be unneeded.
The model storage unit 13 stores a model in which data and confidence of the data are associated with each other. The model is learned by associating features included in texts with confidences. The model learns, as a wrong instance, learning data whose label goes against the confidence assigned by the model, that is, against the confidence of the model. The model is empty at the initial phase of the learning process; features and their confidences are newly registered by the updating unit 15 (described below), or the confidence associated with an existing feature is updated by the updating unit 15. The "confidence" referred to in this description indicates the probability that a text is spam, and is therefore described below as a "spam score"; this represents merely one aspect.
The cross-checking unit 14 cross-checks learning data of which learning is determined to be needed by the determining unit 12 with a model stored in the model storage unit 13, decides whether the learning data to be cross-checked is data used for updating the model, and accumulates therein a decision result for the learning data to be cross-checked. Specifically, the cross-checking unit 14 decides that the learning data to be cross-checked is data used for updating the model, when the learning data to be cross-checked includes a label against the confidence of the model, that is, when the classification is incorrect (wrong).
The cross-checking unit 14 decides that the learning data to be cross-checked is not data used for updating the model when the learning data includes a label consistent with the confidence of the model, that is, when the classification is correct. For learning data decided not to be data used for updating the model, that is, data classified correctly, the cross-checking unit 14 accumulates a correct classification count indicating the number of correctly classified instances. When the correct classification count accumulated for the learning data acquired by the acquiring unit 11 is equal to or greater than a predetermined threshold, the determining unit 12 determines that learning of the learning data is unneeded.
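As a hedged sketch of the interplay between these two units, one might keep a per-sample counter as below; the threshold value and all names are assumptions chosen for illustration only.

```python
THRESHOLD_C = 2  # assumed value of the predetermined threshold

def needs_learning(sample_id, correct_counts):
    # Determining unit: learning is unneeded once the accumulated
    # correct classification count reaches the threshold.
    return correct_counts.get(sample_id, 0) < THRESHOLD_C

def record_decision(sample_id, was_correct, correct_counts):
    # Cross-checking unit: accumulate the correct classification count for
    # learning data decided as not data used for updating the model.
    if was_correct:
        correct_counts[sample_id] = correct_counts.get(sample_id, 0) + 1
```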
The updating unit 15 updates the model stored in the model storage unit 13 on the basis of learning data to be cross-checked that the cross-checking unit 14 has decided is data used for updating the model. Specifically, out of the features in the text data decided to be data used for updating the model, the updating unit 15 updates, on the basis of the label, the confidence associated with each feature matching the model, and adds to the model at least one of the features not matching the model.
When learning data is acquired in this manner, the acquiring unit 11 extracts a noun included in the text by, for example, performing a morphological analysis and decomposing the text into morphemes. Accordingly, as illustrated in the lower part of
Process in Learning Apparatus
Next, a learning process in the learning apparatus 10 is described. As an example, there is assumed a case where the learning data illustrated in
For example, there is assumed a case where the learning apparatus 10 executes processing of learning data in the first line, learning data in the second line, learning data in the third line, and learning data in the fourth line, out of the learning data illustrated in
In
With reference to
Subsequently, the cross-checking unit 14 cross-checks the data in the first line of the learning data F1 with the model M1 (see Y11). When the learning data F1 to be cross-checked includes a label against the spam score of the model, the classification is incorrect (wrong).
For example, when the product of the label of the learning data and the spam score of the model is equal to or less than 0, the learning data includes a label against the spam score of the model, and the classification is incorrect. In this manner, when the product of the label of the learning data and the spam score of the model is equal to or less than 0, the cross-checking unit 14 decides that updating of the model based on the learning data is needed. In contrast, when the product of the label of the learning data and the spam score of the model is greater than 0, the learning data includes a label matching the spam score of the model, and the classification is correct. In this manner, when the product of the label of the learning data and the spam score of the model is greater than 0, the cross-checking unit 14 decides that updating of the model is unneeded.
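This sign-of-product rule can be stated in one line. A sketch under the assumption that labels take the values +1 (spam) or -1 (not spam) and that the spam score is the confidence summed over matched features:

```python
def needs_update(label, spam_score):
    # Update is needed exactly when the label is against (or unsupported by)
    # the model's spam score, i.e., when their product is <= 0.
    return label * spam_score <= 0
```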
In the example of
The updating unit 15 updates, on the basis of the label, the spam score associated with the feature matching the feature of the data in the first line of the learning data F1, out of the spam scores included in the model M1. In the example of
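One possible reading of this update step, with a unit step size as an added assumption, is the sketch below; features absent from the model are treated as having an initial score of zero, so the same line both registers new features and updates existing ones.

```python
def update_model(model, features, label, step=1.0):
    # Move the spam score of every feature of the wrong instance in the
    # direction of its label; unseen features are newly added to the model.
    for f in features:
        model[f] = model.get(f, 0.0) + label * step
```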
Next, with reference to
Subsequently, the cross-checking unit 14 cross-checks the data in the second line of the learning data F1 with the model M1 (see Y13). In the example of
Next, with reference to
Subsequently, with reference to
Next, the second round of the process for the learning data F1 is described.
Next, with reference to
Next, with reference to
In this manner, for learning data of which the correct classification count is equal to or greater than “1”, the learning apparatus 10 according to the first embodiment does not execute processing for cross-checking with the model and updating of the model. Therefore, the amount of calculation needed for the processing for cross-checking with the model and updating of the model can be reduced.
Process Procedure of Learning Process
Next, a procedure of the learning process according to the first embodiment is described.
As illustrated in
Subsequently, the acquiring unit 11 sets statuses, for example, flags, related to all samples of the learning data T acquired at Step S101 to be unprocessed (Step S104). The learning apparatus 10 executes the process at Step S106 and thereafter, as long as an unprocessed sample of learning data is present in the learning data T (YES at Step S105).
That is, the acquiring unit 11 selects one piece of unprocessed learning data t from the learning data T acquired at Step S101 (Step S106). The determining unit 12 refers to the correct classification count of the learning data t and decides whether the correct classification count is equal to or greater than the threshold C (Step S107). In other words, at Step S107, the determining unit 12 compares the correct classification count, which is a decision result related to updating of the model accumulated for the learning data t, with the condition that the correct classification count be equal to or greater than the threshold C, to determine whether learning of the learning data t is needed. When the correct classification count of the learning data t is decided to be equal to or greater than the threshold C (YES at Step S107), the determining unit 12 excludes the learning data t from the learning objects and advances the process to Step S112.
When the determining unit 12 has determined that the correct classification count of the learning data t is not equal to or greater than the threshold C (NO at Step S107), the learning process for the learning data t is executed. Specifically, the cross-checking unit 14 cross-checks a feature of the learning data t with a feature included in the model stored in the model storage unit 13 and acquires a spam score (Step S108).
Subsequently, the cross-checking unit 14 decides whether the learning data t to be cross-checked is data used for updating the model (Step S109). Specifically, when the classification of the learning data t with the spam score obtained by cross-checking at Step S108 is wrong, the cross-checking unit 14 decides that the learning data t is data used for updating the model.
When the cross-checking unit 14 has decided that the learning data t to be cross-checked is data used for updating the model (YES at Step S109), the updating unit 15 updates the model on the basis of the learning data t (Step S110). Specifically, the updating unit 15 adds a spam score corresponding to the label of the learning data t to the current spam score associated with each matching feature included in the model. On the other hand, when the learning data t to be cross-checked is decided as not data used for updating the model (NO at Step S109), the cross-checking unit 14 adds 1 to the correct classification count of the learning data t (Step S111).
After the process at Step S110 or Step S111, or when the determining unit 12 has determined that the correct classification count of the learning data t is equal to or greater than the threshold C (YES at Step S107), the learning apparatus 10 increments a repeated attempt count i held in a register or the like (not illustrated) (Step S112).
When an unprocessed sample of the learning data is not present in the learning data T (NO at Step S105) or after the process at Step S112, the learning apparatus 10 determines whether the repeated attempt count i is less than the repetition count L (Step S113). When the repeated attempt count i is determined to be less than the repetition count L (YES at Step S113), the learning apparatus 10 shifts to Step S104 and repeats execution of processes from Step S104 to Step S113.
On the other hand, when the learning apparatus 10 has determined that the repeated attempt count i has reached the repetition count L (NO at Step S113), the updating unit 15 outputs the model stored in the model storage unit 13 to a predetermined output destination (Step S114) and terminates the process. Examples of the output destination for the model include an application program that executes a filtering process for e-mails. When generating of a model is requested from an external apparatus, a response can be made to the request source.
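Putting Steps S101 to S114 together, the overall procedure might look like the sketch below. It treats the repeated attempt count i as a counter of passes over the whole learning data T, which is one reading of Steps S112 and S113; the data format and the unit update step are assumptions.

```python
def learn(T, L, C):
    """T: list of (features, label) samples; L: repetition count;
    C: threshold of the correct classification count."""
    model = {}                  # feature -> spam score (model storage unit)
    correct = [0] * len(T)      # accumulated correct classification counts
    for i in range(L):          # S112/S113: repeat until i reaches L
        for t, (features, label) in enumerate(T):
            if correct[t] >= C:      # S107: learning unneeded, exclude
                continue
            # S108: cross-check the features with the model.
            score = sum(model.get(f, 0.0) for f in features)
            if label * score <= 0:   # S109: data used for updating the model
                for f in features:   # S110: update the spam scores
                    model[f] = model.get(f, 0.0) + label
            else:
                correct[t] += 1      # S111: add 1 to the correct count
    return model                     # S114: output the model
```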
According to the first embodiment, a predetermined condition and a decision result related to updating of a model accumulated for learning data are compared to determine whether learning of the learning data is needed, and learning data of which learning is determined to be unneeded is excluded from the learning objects. Therefore, the amount of calculation needed for a learning process can be reduced.
The amount of processing for the learning process according to the present embodiment and the amount of processing for a general learning process are compared with each other.
As illustrated in
Subsequently, in the learning process of the comparative example, the data in the second line of the learning data F2 (see a frame R22) and the model M2 are cross-checked (see Y13A), as illustrated in
Next, in the learning process of the comparative example, the data in the third line of the learning data F2 (see a frame R23) and the model M2 are cross-checked (see Y15A), as illustrated in
Subsequently, in the learning process of the comparative example, the data in the fourth line of the learning data F2 (see a frame R24) and the model M2 are cross-checked (see Y16A), as illustrated in
Next, in the learning process according to the comparative example, the second round of the process for the learning data F2 is described. First, in the learning process of the comparative example, the data in the first line of the learning data F2 (see the frame R21) and the model M2 are cross-checked (see Y21A), as illustrated in
Subsequently, in the learning process of the comparative example, the data in the second line of the learning data F2 (see the frame R22) and the model M2 are cross-checked (see Y22A), as illustrated in
Subsequently, in the learning process of the comparative example, the data in the third line of the learning data F2 (see the frame R23) and the model M2 are cross-checked (see Y23A), as illustrated in
Next, in the learning process of the comparative example, the data in the fourth line of the learning data F2 (see the frame R24) and the model M2 are cross-checked (see Y24A), as illustrated in
In this manner, in the general learning process, classification is performed redundantly for learning data that can be classified correctly. That is, in the general learning process, classification is performed for the data in the third line of the learning data F2 and the data in the fourth line of the learning data F2 also in the second round, even though the classification has been correct in the first round. Thus, in the general learning process, a certain amount of calculation is needed because cross-checking with a model and evaluation are performed every time, even for a feature with multiple consecutive instances of correct classification.
There are cases where the evaluation of data of the same type among the data that is a learning object does not change very frequently. The model M2 illustrated in
In contrast, in the learning process according to the first embodiment, the correct classification count for each piece of learning data with respect to a model is accumulated, and learning data of which the correct classification count has become equal to or greater than a threshold is excluded from a learning object. As described with
Therefore, in the first embodiment, the amount of calculation needed for cross-checking with the model and updating of the model can be reduced for learning data of which the correct classification count has become equal to or greater than the threshold, as compared to the general learning process. Thus, according to the first embodiment, the calculation time needed for the learning process and the amount of memory used for the learning process can also be reduced, as compared to the general learning process.
Another Process Procedure of Learning Process
Next, a modification of the first embodiment is described.
Steps S201 to S209 illustrated in
In the learning process illustrated in
Another Process Procedure of Learning Process
Next, another modification of the first embodiment is described. In the learning apparatus 10, the cross-checking unit 14 may accumulate, for each piece of learning data, a correct classification score indicating the reliability of correct classification with respect to the model, instead of the correct classification count. In the learning apparatus 10, the determining unit 12 may then determine to exclude, from the learning objects, learning data of which the correct classification score has become equal to or greater than a threshold.
The acquiring unit 11 acquires a threshold Ca of the correct classification score (Step S303). The threshold Ca of the correct classification score can be set in advance to any value, in accordance with the precision desired for the model. Steps S304 to S306 correspond to Steps S104 to S106 illustrated in
When the determining unit 12 has decided that the correct classification score of the learning data t is equal to or greater than the threshold Ca (YES at Step S307), or after the process at Step S310 or Step S311, the learning apparatus 10 advances the process to Step S312. Steps S312 to S314 correspond to Steps S112 to S114 illustrated in
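The text leaves the exact definition of the correct classification score open; one plausible reading accumulates the magnitude of the spam score, a margin-like reliability, each time classification is correct. The sketch below is purely an assumption along those lines.

```python
def record_score(sample_id, label, spam_score, scores):
    # Hypothetical reliability measure: on each correct classification,
    # accumulate |spam_score| as the correct classification score.
    if label * spam_score > 0:
        scores[sample_id] = scores.get(sample_id, 0.0) + abs(spam_score)
```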
In the learning apparatus 10, the determining unit 12 may determine to exclude, from the learning objects, learning data for which the ratio of the correct classification count to the processing count has become equal to or greater than a predetermined threshold. A specific description is given with reference to
The determining unit 12 refers to the correct classification count and the processing count of the learning data t, calculates the ratio of the correct classification count to the processing count, and determines whether the calculated ratio is equal to or greater than the threshold Cb (Step S407). When the ratio of the correct classification count to the processing count for the learning data t is decided to be equal to or greater than the threshold Cb (YES at Step S407), the determining unit 12 excludes the learning data t from the learning objects and advances the process to Step S412. When the determining unit 12 has determined that the ratio is not equal to or greater than the threshold Cb (NO at Step S407), the learning process for the learning data t is executed. Steps S408 to S414 correspond to Steps S108 to S114 illustrated in
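The ratio check of Step S407 might be sketched as follows; the guard against a zero processing count on the first pass is an added assumption.

```python
def exclude_by_ratio(correct_count, processed_count, threshold_cb):
    # Exclude the sample once the fraction of correct classifications among
    # all processings reaches the threshold Cb.
    return processed_count > 0 and correct_count / processed_count >= threshold_cb
```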
Next, an example is described in which the learning process according to the first embodiment is applied to a newspaper-making process. In this example, a created article corresponds to text data, and a section such as the front page, the economic section, the cultural section, or the social section corresponds to a label assigned to the text data. One model is provided per section, and a score is associated with each feature. A learning process is executed in advance to create the models, with multiple existing articles in each section as learning data.
The learning apparatus 10 then applies the learning process according to the first embodiment to a newly created article, determines whether learning is needed, and performs cross-checking with the model and updating of the model when learning is needed. As a result, the learning apparatus 10 outputs a likely section for the article. By applying the first embodiment in this manner, the learning apparatus 10 automatically presents the section in which the created article is favorably carried, and thus the time taken for a newspaper editor to select a section can be reduced.
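One way to read this multi-section setting is as a classifier that keeps one score table per section and outputs the highest-scoring section. The sketch below uses illustrative section names and is an assumption, not part of the embodiment.

```python
# models: dict mapping each section to its {feature: score} table.
SECTIONS = ["front page", "economy", "culture", "society"]

def likely_section(features, models):
    def section_score(section):
        table = models.get(section, {})
        return sum(table.get(f, 0.0) for f in features)
    # Output the likely section for the article: the arg max over sections.
    return max(SECTIONS, key=section_score)
```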
Distribution and Integration
Respective constituent elements of the devices illustrated in the drawings do not need to be physically configured as illustrated. That is, the specific mode of distribution and integration of the devices is not limited to the illustrated one, and all or a part of the units can be functionally or physically distributed or integrated in arbitrary units, according to various kinds of load and the status of use. For example, the acquiring unit 11, the determining unit 12, the cross-checking unit 14, or the updating unit 15 can be connected through a network as an external device of the learning apparatus 10. It is also possible for other devices to respectively include the acquiring unit 11, the determining unit 12, the cross-checking unit 14, or the updating unit 15, and for these devices to be connected to a network and cooperate to realize the functions of the learning apparatus 10 described above.
Learning Program
Various processes described in the above embodiment can be realized by executing a program prepared in advance on a computer such as a personal computer or a workstation. In the following descriptions, with reference to
As illustrated in
Under such an environment, the learning program 170a is read from the HDD 170 and loaded into the RAM 180 by the CPU 150. As a result, the learning program 170a functions as a learning process 180a as illustrated in
The learning program 170a described above does not need to be stored in the HDD 170 or the ROM 160 in advance. For example, the learning program 170a may be stored in a "portable physical medium" inserted into the computer 100, such as a flexible disk (a so-called FD), a CD-ROM, a DVD, a magneto-optical disk, or an IC card. The computer 100 can then acquire the learning program 170a from such a portable physical medium and execute it. Further, the learning program 170a may be stored in another computer or server apparatus connected to the computer 100 via a public communication line, the Internet, a LAN, or a WAN, and the computer 100 may acquire and execute the learning program 170a therefrom.
The amount of calculation needed for a learning process is reduced.
All examples and conditional language recited herein are intended for pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiment of the present invention has been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.