The present invention relates to a control device and the like using a decision tree.
In recent years, in cases where machine learning technology is applied to control of a predetermined device, a problem has been pointed out that control accuracy gradually deteriorates due to a phenomenon called concept drift. Concept drift is a phenomenon in which the target being modeled gradually diverges from the model initially developed through learning, due to degradation of the control-target device over time, or the like. As a technique for preventing control accuracy from deteriorating due to concept drift, it has been considered in recent years to update a learned model by performing additional learning based on data acquired from the device.
On the other hand, decision trees such as CART have attracted attention as a machine learning technique in recent years. According to the decision trees, explainability and interpretability of a result of learning can be enhanced.
As apparent from the drawing, the decision tree forms a tree structure, based on predetermined learning data, starting from a root node (the uppermost node in the diagram), which is the base end, down to terminal leaf nodes (the lowest nodes in the diagram). On each internal node, a branch condition obtained through machine learning is defined that compares an input variable with one of the threshold values θ1 to θ4. Thus, in an inference stage, input data entered at the root node is eventually associated with one of the leaf nodes A to E, and an output value is determined based on data associated with that leaf node (output node).
For example, in the example in the diagram, data that satisfies a condition x1 ≤ θ1 and a condition x2 ≤ θ2 is associated with the leaf node A. Data that satisfies the condition x1 ≤ θ1 and a condition x2 > θ2 is associated with the leaf node B. An input that satisfies a condition x1 > θ1, a condition x2 ≤ θ3, and a condition x1 ≤ θ4 is associated with the leaf node C. An input that satisfies the condition x1 > θ1, the condition x2 ≤ θ3, and a condition x1 > θ4 is associated with the leaf node D. An input that satisfies the condition x1 > θ1 and a condition x2 > θ3 is associated with the leaf node E.
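The traversal described above can be sketched as follows. The threshold values and concrete inputs here are illustrative placeholders, not values from the specification; only the branching logic mirrors the conditions listed for leaf nodes A to E.

```python
# Illustrative thresholds (assumed values; the specification does not give any).
THETA1, THETA2, THETA3, THETA4 = 0.5, 0.3, 0.7, 0.8

def classify(x1, x2):
    """Traverse the tree from the root node and return the leaf-node label."""
    if x1 <= THETA1:
        # Left subtree: split on x2 against theta2.
        return "A" if x2 <= THETA2 else "B"
    # Right subtree: split on x2 against theta3, then x1 against theta4.
    if x2 <= THETA3:
        return "C" if x1 <= THETA4 else "D"
    return "E"
```

In an actual regression tree, each leaf would store an output value rather than a label; the point of the sketch is that inference is a fixed number of threshold comparisons, independent of how the leaf values are later updated.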
As conventional techniques for performing additional learning on such a decision tree, mainly the following two techniques can be cited.
The first is a technique in which the tree structure is extended further in the depth direction, based on the data for additional learning. According to this technique, although the tree structure can be further branched, computational costs for both learning and inference increase because the tree structure becomes larger. Moreover, additional learning requires additional storage capacity.
The second is a technique in which the branch conditions in the tree structure are reconfigured anew, based on the data for additional learning (Patent Literature 1 as an example). According to this technique, optimal branch conditions can be reconfigured with the additional data taken into consideration, and the inference cost after learning is unchanged from before. However, since the whole of the machine learning is performed again, the computational cost for learning is high, and additional learning requires additional storage capacity.
In other words, in additional learning for a decision tree, either technique involves an increase in the computational cost for learning or inference, and additional storage capacity.
Patent Literature 1: International Publication No. 2010/116450
However, when machine learning technology is applied to control of a predetermined device, there is a certain limitation on available hardware resources in terms of cost or the like, in some cases. For example, many of control devices used in areas such as edge computing, which is receiving attention in recent years, are so-called embedded devices, in which case there are certain hardware limitations, such as limitations on computational throughput of a computing device and a storage capacity.
Accordingly, when an attempt is made to incorporate a decision tree that is open to additional learning as described above into a device with such limited hardware resources, a delay from a control cycle, an excess over a storage capacity, and the like may be caused due to an increase in computational cost for learning or inference, so that reliability and security of control may not be able to be guaranteed.
The present invention has been made in the above-described technical background, and an object thereof is to provide an additional learning technique for a decision tree that involves a small computational cost for additional learning, causes no change in inference time even if additional learning is performed, and needs no additional storage capacity or the like, and thus to provide a control device that can guarantee reliability and security of control.
Further objects and effects of the present invention will be readily understood by those ordinarily skilled in the art by referring to the following statements in the present description.
The above-described technical problem can be solved by a device, a method, a program, and the like that have configurations as follows.
Specifically, a control device according to the present invention includes: an input data acquisition unit that acquires, as input data, data acquired from a target device; an inference processing unit that identifies an output node corresponding to the input data and generates associated output data to be used to control the target device, through inference processing using a learned decision tree; an actual data acquisition unit that acquires actual data acquired from the target device, the actual data corresponding to the input data; and an additional learning processing unit that performs additional learning processing for the learned decision tree by generating updated output data by updating the output data associated with the output node, based on the output data and the actual data.
According to such a configuration, an additional learning technique for a decision tree can be provided that involves a small computational cost for additional learning, causes no change in inference time even if additional learning is performed, and needs no additional storage capacity or the like. Thus, for example, in a case of performing online additional learning for a decision tree under limited hardware resources, or the like, it is possible to provide a control device that can guarantee reliability and security of control.
The additional learning processing unit may update the output data associated with the output node, based on the output data and the actual data, without involving a change in structure of the learned decision tree or a change in branch condition.
According to such a configuration, an additional learning technique for a decision tree can be provided that involves a small computational cost for additional learning, causes no change in inference time even if additional learning is performed, and needs no additional storage capacity or the like. Thus, for example, in a case of performing online additional learning for a decision tree under limited hardware resources, or the like, it is possible to provide a control device that can guarantee reliability and security of control.
The updated output data may be an arithmetic average value of the output data before updating and the actual data.
According to such a configuration, since updating can be performed through simple arithmetic operation, a learning burden can be further lightened.
The updated output data may be a weighted average value of the output data before updating and the actual data, with reference to the number of data pieces associated with the output node.
According to such a configuration, since an effect of an update at an early stage of learning can be made relatively large and the effect can be made smaller as the number of times of learning increases, stable learning can be performed.
The updated output data may be a value obtained by adding, to the output data before updating, a result of multiplying a difference between the output data before updating and the actual data by a learning rate.
According to such a configuration, a rate of updating can be flexibly adjusted by adjusting the learning rate.
The learning rate may change according to the number of times the additional learning processing is performed.
According to such a configuration, since the learning rate can be lowered or the like as the number of times of learning increases, learning can be made stable.
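The three leaf-update rules described above (arithmetic average, weighted average, and learning-rate update) can be sketched as follows. Only the output value stored at the identified output node changes; the tree structure and branch conditions are untouched. The variable names are illustrative, and the direction of the difference in the learning-rate rule (moving the stored output toward the actual data) is an assumption consistent with the surrounding description.

```python
def update_arithmetic(o_cur, o_actual):
    # Arithmetic average of the pre-update output and the actual data.
    return (o_cur + o_actual) / 2.0

def update_weighted(o_cur, o_actual, num):
    # Weighted average referencing num, the number of data pieces already
    # associated with the output node; early updates have a relatively
    # large effect, and the effect shrinks as num grows.
    return (num * o_cur + o_actual) / (num + 1)

def update_learning_rate(o_cur, o_actual, alpha):
    # Add to the pre-update output the difference multiplied by a
    # learning rate alpha; alpha may itself be scheduled to decrease
    # with the number of updates.
    return o_cur + alpha * (o_actual - o_cur)
```

Each rule is O(1) per update and requires no extra storage beyond the leaf value itself (plus a counter for the weighted variant), which is the property the claims rely on.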
The learned decision tree may be one of a plurality of decision trees for ensemble learning.
According to such a configuration, an additional learning technique for a decision tree that involves a small computational cost for additional learning, causes no change in inference time even if additional learning is performed, and needs no additional storage capacity or the like can be applied to each decision tree used in ensemble learning.
A control device according to the present invention in another aspect includes: a reference input data acquisition unit that acquires reference input data; a first output data generation unit that generates first output data by inputting the reference input data into a model generated based on training input data and training correct data corresponding to the training input data; a second output data generation unit that generates second output data by identifying an output node corresponding to the reference input data by inputting the reference input data into a learned decision tree generated by performing machine learning based on the training input data and differential training data between output data and the training correct data, the output data being generated by inputting the training input data into the model; a final output data generation unit that generates final output data, based on the first output data and the second output data; a reference correct data acquisition unit that acquires reference correct data; and an additional learning processing unit that performs additional learning processing for the learned decision tree by generating updated output data by updating the second output data associated with the output node, based on the second output data and differential data between the first output data and the reference correct data.
According to the configuration as described above, machine learning can be performed that is adaptive, due to online learning, to a change in characteristic of a target, such as concept drift, while maintaining a certain level of output accuracy due to an approximate function obtained beforehand. In other words, machine learning technology can be provided that is adaptive to a change in characteristic of a target, or the like, while guaranteeing output accuracy to a certain degree. At the time, neither a change in structure, such as depth, of the decision tree occurs, nor arithmetic operation that requires a relatively large amount of computation, such as calculation for branch conditions, is needed. Accordingly, an additional learning technique for a decision tree can be provided that involves a small computational cost for additional learning, causes no change in inference time even if additional learning is performed, and needs no additional storage capacity or the like. Thus, for example, in a case of performing online additional learning for a decision tree under limited hardware resources, or the like, it is possible to provide a control device that can guarantee reliability and security of control.
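The two-model arrangement described in this aspect can be sketched as follows. The offline model and the "decision tree" here are toy stand-ins (a fixed linear function and a single-split bucketing, both assumptions for illustration); what the sketch shows is the data flow: the fixed model supplies first output data, the online tree supplies a residual as second output data, their sum is the final output, and additional learning updates only the residual stored at the identified leaf, using the arithmetic-average rule.

```python
def offline_model(x):
    # Fixed approximate function obtained beforehand (assumed form).
    return 2.0 * x

class ResidualLeafTree:
    """Toy online model: one leaf per integer bucket of x, each storing
    a residual estimate (second output data)."""
    def __init__(self):
        self.leaves = {}

    def _leaf(self, x):
        return int(x)  # trivial stand-in for the tree's branch conditions

    def predict(self, x):
        return self.leaves.get(self._leaf(x), 0.0)

    def update(self, x, first_out, actual):
        # Additional learning: average the stored residual with the new
        # differential data (reference correct data minus first output).
        key = self._leaf(x)
        diff = actual - first_out
        cur = self.leaves.get(key)
        self.leaves[key] = diff if cur is None else (cur + diff) / 2.0

def final_output(tree, x):
    # Final output data = first output data + second output data.
    first = offline_model(x)
    return first + tree.predict(x)
```

Note that the offline model is never retrained; drift is absorbed entirely by the leaf residuals, so the baseline accuracy of the fixed model is retained.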
The additional learning processing unit may update the second output data associated with the output node, based on the second output data and the differential data, without involving a change in structure of the learned decision tree or a change in branch condition.
According to such a configuration, an additional learning technique for a decision tree can be provided that involves a small computational cost for additional learning, causes no change in inference time even if additional learning is performed, and needs no additional storage capacity or the like. Thus, for example, in a case of performing online additional learning for a decision tree under limited hardware resources, or the like, it is possible to provide a control device that can guarantee reliability and security of control.
The updated output data may be an arithmetic average value of the second output data before updating and the differential data.
According to such a configuration, since updating can be performed through simple arithmetic operation, a learning burden can be further lightened.
The updated output data may be a weighted average value of the second output data before updating and the differential data, with the number of data pieces hitherto associated with the output node taken into account.
According to such a configuration, since an effect of an update at an early stage of learning can be made relatively large and the effect can be made smaller as the number of times of learning increases, stable learning can be performed.
The updated output data may be a value obtained by adding, to the second output data before updating, a result of multiplying a difference between the second output data before updating and the differential data by a learning rate.
According to such a configuration, a rate of updating can be flexibly adjusted by adjusting the learning rate.
The learning rate may change according to the number of times the additional learning processing is performed.
According to such a configuration, since the learning rate can be lowered or the like as the number of times of learning increases, learning can be made stable.
The present invention can also be conceptualized as a method. Specifically, a method according to the present invention includes: an input data acquisition step of acquiring, as input data, data acquired from a target device; an inference processing step of identifying an output node corresponding to the input data and generating associated output data to be used to control the target device, through inference processing using a learned decision tree; an actual data acquisition step of acquiring actual data acquired from the target device, the actual data corresponding to the input data; and an additional learning processing step of performing additional learning processing for the learned decision tree by generating updated output data by updating the output data associated with the output node, based on the output data and the actual data.
A method according to the present invention in another aspect includes: a reference input data acquisition step of acquiring reference input data; a first output data generation step of generating first output data by inputting the reference input data into a model generated based on training input data and training correct data corresponding to the training input data; a second output data generation step of generating second output data by identifying an output node corresponding to the reference input data by inputting the reference input data into a learned decision tree generated by performing machine learning based on the training input data and differential training data between output data and the training correct data, the output data being generated by inputting the training input data into the model; a final output data generation step of generating final output data, based on the first output data and the second output data; a reference correct data acquisition step of acquiring reference correct data; and an additional learning processing step of performing additional learning processing for the learned decision tree by generating updated output data by updating the second output data associated with the output node, based on the second output data and differential data between the first output data and the reference correct data.
The present invention can also be conceptualized as a program. Specifically, a program according to the present invention causes a computer to execute: an input data acquisition step of acquiring, as input data, data acquired from a target device; an inference processing step of identifying an output node corresponding to the input data and generating associated output data to be used to control the target device, through inference processing using a learned decision tree; an actual data acquisition step of acquiring actual data acquired from the target device, the actual data corresponding to the input data; and an additional learning processing step of performing additional learning processing for the learned decision tree by generating updated output data by updating the output data associated with the output node, based on the output data and the actual data.
A program according to the present invention in another aspect causes a computer to execute: a reference input data acquisition step of acquiring reference input data; a first output data generation step of generating first output data by inputting the reference input data into a model generated based on training input data and training correct data corresponding to the training input data; a second output data generation step of generating second output data by identifying an output node corresponding to the reference input data by inputting the reference input data into a learned decision tree generated by performing machine learning based on the training input data and differential training data between output data and the training correct data, the output data being generated by inputting the training input data into the model; a final output data generation step of generating final output data, based on the first output data and the second output data; a reference correct data acquisition step of acquiring reference correct data; and an additional learning processing step of performing additional learning processing for the learned decision tree by generating updated output data by updating the second output data associated with the output node, based on the second output data and differential data between the first output data and the reference correct data.
The present invention can also be conceptualized as an information processing device. Specifically, an information processing device according to the present invention includes: an input data acquisition unit that acquires input data; an inference processing unit that identifies an output node corresponding to the input data and generates associated output data, through inference processing using a learned decision tree; a teaching data acquisition unit that acquires teaching data corresponding to the input data; and an additional learning processing unit that performs additional learning processing for the learned decision tree by generating updated output data by updating the output data associated with the output node, based on the output data and the teaching data.
The present invention can also be conceptualized as an information processing method. Specifically, an information processing method according to the present invention includes: an input data acquisition step of acquiring input data; an inference processing step of identifying an output node corresponding to the input data and generating associated output data, through inference processing using a learned decision tree; a teaching data acquisition step of acquiring teaching data corresponding to the input data; and an additional learning processing step of performing additional learning processing for the learned decision tree by generating updated output data by updating the output data associated with the output node, based on the output data and the teaching data.
Moreover, the present invention can also be conceptualized as an information processing program. Specifically, an information processing program according to the present invention causes a computer to execute: an input data acquisition step of acquiring input data; an inference processing step of identifying an output node corresponding to the input data and generating associated output data, through inference processing using a learned decision tree; a teaching data acquisition step of acquiring teaching data corresponding to the input data; and an additional learning processing step of performing additional learning processing for the learned decision tree by generating updated output data by updating the output data associated with the output node, based on the output data and the teaching data.
According to the present invention, an additional learning technique for a decision tree can be provided that involves a small computational cost for additional learning, causes no change in inference time even if additional learning is performed, and needs no additional storage capacity or the like. Thus, for example, in a case of performing online additional learning for a decision tree under limited hardware resources, or the like, it is possible to provide a control device that can guarantee reliability and security of control.
Hereinafter, an embodiment of the present invention will be described in detail with reference to the accompanying drawings.
A first embodiment of the present invention will be described with reference to
Note that
The storage unit 2 stores data that is a target to be learned through machine learning. The learning-target data is so-called training data, and is data acquired beforehand from a control-target device.
Note that
Note that the hardware configurations are not limited to the configurations described above. Accordingly, for example, each of the devices may be configured as a system including a plurality of devices, or the like.
Next, operation of the invention according to the present embodiment will be described with reference to
As apparent from the drawing, first, initial learning processing is performed (S1). The initial learning processing is processing in which machine learning is performed on the information processing device 100 by using training data acquired beforehand from the control-target device, and a learned decision tree is generated.
More specifically, the learning-target data acquisition unit 11 acquires, from the storage unit 2, learning-target data required to generate a decision tree. The decision tree generation processing unit 12 generates a decision tree through machine learning processing, based on the data acquired by the learning-target data acquisition unit 11 and predetermined preconfigured information (parameters such as a tree structure depth and the like), which is read beforehand from the storage unit 2. Information related to the generated decision tree is stored again in the storage unit 2 by the storage processing unit 13. Note that the decision tree generated in the present embodiment is a regression tree that is capable of outputting a continuous value.
In other words, the decision tree adapted to a model of the control-target device in an initial state is generated by the information processing device 100.
Referring back to
After the processing of incorporating the learned decision tree into the control device 200 is completed, processing of incorporating the control device 200 into the control-target device and running the control-target device is performed (S5).
Thereafter, the inference processing unit 22 performs the inference processing by inputting the input data into the read decision tree, and generates output data (S514). In other words, an output node is identified by classifying the data according to the branch condition associated with each node, and output data is generated based on data associated with the output node.
Thereafter, the generated output data is outputted from the control device 200 by the data output unit 24 (S516). The output data is provided to the control-target device and is used for device control. Then, the processing is completed.
Referring back to
Moreover, the additional learning processing unit 28 performs processing of acquiring the input data from the storage unit 26 and acquiring actual data, i.e., the data actually observed on the control-target device in correspondence with the input data (S522). Thereafter, additional learning is performed on the decision tree by using the input data and the actual data (S524).
More specifically, first, a terminal-end output node corresponding to the input data is identified, and output data O1cur associated with the terminal-end node is identified. Thereafter, processing of updating the output data O1cur at the output node by using the actual data O1 and obtaining updated output data O1new is performed. The updating processing is performed by calculating an arithmetic average of the output data O1cur before updating and the actual data O1. In other words, the updated output data O1new is calculated by using the following expression.
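The expression itself (Expression 1) is not reproduced in this excerpt; from the description above, an arithmetic average of the pre-update output data O1cur and the actual data O1, it presumably takes the following form:

```latex
O_{1\mathrm{new}} = \frac{O_{1\mathrm{cur}} + O_{1}}{2}
```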
After the additional learning processing is completed, the additional learning processing unit 28 performs processing of storing the decision tree into the storage unit 26 (S526), and the processing is completed.
Referring back to
As apparent from
On the other hand, in the example of the additional learning according to the present embodiment, as apparent from
According to the configuration as described above, in additional learning, neither a change in structure, such as depth, of the decision tree occurs, nor arithmetic operation that requires a relatively large amount of computation, such as calculation for branch conditions, is needed. Accordingly, an additional learning technique for a decision tree can be provided that involves a small computational cost for additional learning, causes no change in inference time even if additional learning is performed, and needs no additional storage capacity or the like. Thus, for example, in a case of performing online additional learning for a decision tree under limited hardware resources, or the like, it is possible to provide a control device that can guarantee reliability and security of control.
Note that although updating of output data is performed by using Expression 1 in the present embodiment, the present invention is not limited to such a configuration. In other words, any other method may be used in which an output value is updated without involving a change in decision tree structure, for example, extension or the like of the tree structure, or a change in branch condition or the like.
Subsequently, a second embodiment of the present invention, which uses a plurality of learned models, will be described with reference to
In the present embodiment, online learning means that a parameter or the like of a learning model incorporated in a control device is updated on the control device by performing machine learning based on data acquired by a control-target device. At the time, an updating cycle can be variously changed, and the learning may be, for example, sequential learning that is performed in line with a device control cycle, or may be batch learning, mini-batch learning, or the like that are performed when a predetermined amount of learning-target data is accumulated.
In the second embodiment, as in the first embodiment, initial learning processing (S1), processing of incorporating a learned model (S3), and processing of running a control-target device (S5) are also performed (
When the initial learning processing is started, first, learning processing related to the offline learning model is performed. As apparent from the top of
Note that although any one of, or a combination of, various known learning models can be adopted for the offline learning model, a decision tree capable of producing a regression output (a regression tree) is adopted in the present embodiment. Note that a formulated model not involving learning may be adopted in place of the offline learning model. In the following, a learned model obtained through machine learning and a model based on formulation may be collectively referred to simply as a model, in some cases.
Next, after the learned model is generated, the processing of generating differential data is performed. As apparent from the middle of
After the differential data 34 is generated, learning processing for the online learning model is performed. As apparent from the bottom of
When the initial learning processing (S1) is completed, the processing of incorporating the generated learned offline learning model and the generated learned online learning model into the control device 200 is next performed (S3).
Subsequently, the processing of running the control-target device (S5), that is, iteration of control processing over the control-target device based on inference processing (S51) and additional learning processing (S52) is performed (see
The output data are added thereafter, and ultimate output data 44 is generated. The generated output data is outputted from the control device by the data output unit. The output data is provided to the control-target device and used for device control.
When the control processing over the control-target device based on the inference processing is completed, the additional learning processing is next performed. In the present embodiment, the additional learning processing is performed only with respect to the online learning model.
More specifically, first, in the online learning model, a terminal-end output node corresponding to the input data 41 is identified, and output data Ocur associated with the terminal-end node is identified. Thereafter, processing of obtaining updated output data Onew is performed by updating the output data Ocur at the output node by using the actual data O (51) and the output data OOff_cur (42) of the offline learning model. The updating processing is performed by calculating an arithmetic average of the output data Ocur before updating and the differential data 52 between the actual data O and the output data OOff_cur (42) of the offline learning model. In other words, the updated output data Onew is calculated by using the following expression.
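The expression is not reproduced in this excerpt; from the description above, an arithmetic average of the pre-update output data Ocur and the differential data (the actual data O minus the offline model's output OOff_cur), it presumably takes the following form:

```latex
O_{\mathrm{new}} = \frac{O_{\mathrm{cur}} + \left(O - O_{\mathrm{Off\_cur}}\right)}{2}
```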
After the additional learning processing is completed, processing of storing the decision tree into the storage unit is performed. Thereafter, the processing of running the control-target device is performed by iterating the control processing over the control-target device based on the inference processing, and the additional learning processing.
According to the configuration as described above, machine learning can be performed that is adaptive, due to online learning, to a change in characteristic of a target, such as concept drift, while maintaining a certain level of output accuracy due to an approximate function obtained beforehand through offline learning. In other words, machine learning technology can be provided that is adaptive to a change in characteristic of a target, a change in model, or the like, while guaranteeing output accuracy to a certain degree.
At the time, neither a change in structure, such as depth, of the decision tree occurs, nor arithmetic operation that requires a relatively large amount of computation, such as calculation for branch conditions, is needed. Accordingly, an additional learning technique for a decision tree can be provided that involves a small computational cost for additional learning, causes no change in inference time even if additional learning is performed, and needs no additional storage capacity or the like. Thus, for example, in a case of performing online additional learning for a decision tree under limited hardware resources, or the like, it is possible to provide a control device that can guarantee reliability and security of control.
The present invention is not limited to the above-described embodiments, and can be implemented by making various modifications.
In the embodiments, a description is given particularly of an example in which a regression problem is solved by using a regression tree, among decision trees. However, the present invention is not limited to such a configuration. Accordingly, a classification problem can also be solved by using the above-described configurations.
Here, colors include “red”, “green”, and “blue”, which are labels and therefore cannot be handled as they are in a regression tree. Accordingly, so-called one-hot encoding is utilized. One-hot encoding is processing of substituting a categorical variable with new features, each being a dummy variable that takes a value of 0 or 1.
On the right side of the drawing, a table is shown that presents states of the individual variables after the one-hot encoding processing. As apparent from the drawing, the title “Color” is substituted with three titles “Red”, “Green”, and “Blue”, where “1” is placed for the corresponding color and “0” is placed for each non-corresponding color. According to such a configuration, the problem can be treated as a regression problem by converting the output from one dimension to three dimensions.
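One-hot encoding as described can be sketched in a few lines of plain Python. This is an illustrative sketch (a real project would typically use a library encoder); the list `COLORS` and the function name are assumptions.

```python
COLORS = ["red", "green", "blue"]

def one_hot(color):
    """Substitute a color label with dummy variables of 0 or 1."""
    return [1 if color == c else 0 for c in COLORS]

one_hot("green")  # [0, 1, 0]
```

The one-dimensional label thus becomes a three-dimensional 0/1 vector that a regression tree can handle directly.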
A description will be given of a case in which additional learning is further performed. In this example, it is assumed that, as further new input data, data on a piece of merchandise with a size of “S”, a price of “6000”, and a color of “green” is additionally learned.
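In a hedged sketch of this step, the new sample reaches some leaf of the regression tree, and the leaf's three-dimensional output is averaged element-wise with the one-hot target, exactly as in the one-dimensional regression case. The current leaf output `[1.0, 0.0, 0.0]` is an assumed illustrative value, not one taken from the embodiment.

```python
new_target = [0, 1, 0]         # one-hot vector for "green"
leaf_output = [1.0, 0.0, 0.0]  # assumed current 3-dim output at the leaf

# element-wise arithmetic average, as in the regression case
leaf_output = [(cur + tgt) / 2.0 for cur, tgt in zip(leaf_output, new_target)]
# leaf_output -> [0.5, 0.5, 0.0]
```

Again, only the stored leaf values change; the division state of the tree is unchanged.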
In the above-described configuration, when additional learning is performed, the updated output data is calculated by using an arithmetic average. However, the present invention is not limited to such a configuration.
Accordingly, for example, when the current output data at an output node is O1cur, the actual data is O1, and the number of data pieces learned so far is num, the updated output data O1new may be calculated as follows by using a weighted average.
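The weighted-average rule can be sketched as follows, reconstructing the expression from the definitions in the text: the running output is weighted by the number of data pieces learned so far, so each new data piece contributes 1/(num + 1) of the new value. The function name is an assumption.

```python
def weighted_update(o_cur, o_actual, num):
    """Weighted-average update: O1new = (num * O1cur + O1) / (num + 1)."""
    return (num * o_cur + o_actual) / (num + 1)

# Example: leaf output 2.0 learned from 2 data pieces, actual data 5.0.
weighted_update(2.0, 5.0, num=2)  # (2 * 2.0 + 5.0) / 3 = 3.0
```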
Moreover, for example, updating may be performed by adding, to the current output data O1cur at the output node, a result of multiplying the difference between the current output data O1cur and the actual data O1 by a predetermined learning rate.
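The learning-rate rule can be sketched as follows. Assumptions: the learning rate is written `eta`, and the difference is taken as (O1 − O1cur) so that the leaf output moves toward the actual data, which is the conventional sign convention.

```python
def lr_update(o_cur, o_actual, eta):
    """Learning-rate update: O1new = O1cur + eta * (O1 - O1cur)."""
    return o_cur + eta * (o_actual - o_cur)

# Example: leaf output 2.0, actual data 5.0, learning rate 0.1.
lr_update(2.0, 5.0, eta=0.1)  # 2.0 + 0.1 * (5.0 - 2.0), approximately 2.3
```

A small eta makes the update robust to noisy actual data; eta = 0.5 reproduces the arithmetic-average rule.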
Needless to say, the above expressions can be similarly applied also to the configuration in the second embodiment. For example, the online learning model can be updated by substituting the actual data O1 in Expression 4 with a difference between the actual data O and the output Ooff_cur of the offline learning model, as follows.
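This substitution can be sketched as a small variant of the learning-rate rule: the actual data is replaced by the residual O − Ooff_cur of the offline learning model before the update is applied. The names are illustrative assumptions.

```python
def lr_update_residual(o_cur, o_actual, o_off, eta):
    """Second-embodiment variant: the online model's leaf is moved toward
    the offline model's residual (O - Ooff_cur) instead of toward O."""
    residual = o_actual - o_off  # O - Ooff_cur
    return o_cur + eta * (residual - o_cur)

# Example: online leaf 1.0, actual data 5.0, offline output 3.0, eta 0.5.
lr_update_residual(1.0, 5.0, 3.0, eta=0.5)  # 1.0 + 0.5 * (2.0 - 1.0) = 1.5
```

The online model thus learns only the correction on top of the fixed offline approximation.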
In additional learning, when any of the updating techniques is adopted, neither a change in structure, such as depth, of the decision tree occurs, nor arithmetic operation that requires a relatively large amount of computation, such as calculation for branch conditions, is needed. Accordingly, an additional learning technique for a decision tree can be provided that involves a small computational cost for additional learning, causes no change in inference time even if additional learning is performed, and needs no additional storage capacity or the like. Thus, for example, in a case of performing online additional learning for a decision tree under limited hardware resources, or the like, it is possible to provide a control device that can guarantee reliability and security of control.
In the above-described configuration, a single decision tree is used. However, the present invention is not limited to such a configuration. Accordingly, for example, the present invention may be applied to ensemble learning that uses a plurality of decision trees, such as a random forest. In other words, when additional learning processing is performed for each decision tree, the output data may be directly updated without changing the division state.
According to such a configuration, an additional learning technique for a decision tree that involves a small computational cost for additional learning, causes no change in inference time even if additional learning is performed, and needs no additional storage capacity or the like can be applied to each decision tree used in ensemble learning.
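The ensemble case can be sketched as follows. This is an illustrative sketch only: each tree is reduced to a routing function standing in for its fixed branch conditions plus a table of leaf values, and `find_leaf`, `forest_update`, and the dictionary keys are assumed names.

```python
def find_leaf(tree, x):
    # placeholder: route x through the tree's fixed branch conditions
    return tree["route"](x)

def forest_update(forest, x, o_actual):
    """Apply the leaf-value-only update to every tree in the ensemble;
    no tree's division state (branch structure) is changed."""
    for tree in forest:
        leaf_id = find_leaf(tree, x)
        o_cur = tree["leaves"][leaf_id]
        tree["leaves"][leaf_id] = (o_cur + o_actual) / 2.0  # arithmetic average

# Two toy trees with fixed routing and one reachable leaf each.
forest = [
    {"route": lambda x: 0, "leaves": {0: 1.0}},
    {"route": lambda x: 1, "leaves": {1: 3.0}},
]
forest_update(forest, x=None, o_actual=5.0)
# tree 0, leaf 0 -> 3.0 ; tree 1, leaf 1 -> 4.0
```

Any of the update rules above (arithmetic average, weighted average, learning rate) can be dropped in per tree in the same way.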
Although embodiments of the present invention have been described hereinabove, the embodiments only show some of application examples of the present invention, and are not intended to limit the technical scope of the present invention to the specific configurations in the embodiments. Moreover, the embodiments can be combined as appropriate to the extent that no contradiction arises.
The present invention is applicable to various industries and the like that utilize machine learning technology.
200
Number | Date | Country | Kind
---|---|---|---
2020-104786 | Jun 2020 | JP | national

Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/JP2021/018228 | 5/13/2021 | WO |