The present invention is based upon and claims the benefit of priority from Japanese patent application No. 2022-137348, filed on Aug. 30, 2022, the disclosure of which is incorporated herein in its entirety by reference.
The present invention relates to a server device, a learning method, a storage medium, and a client device.
As a machine learning model having high accuracy and small computation quantity with respect to tabular format data and the like, a Gradient Boosting Decision Tree (GBDT) has been known.
As a literature about GBDT, for example, Patent Literature 1 has been known. Patent Literature 1 describes a learning device for learning using gradient boosting. For example, a learning device includes a data storage unit that stores therein learning data and gradient information to learn a model; a learning unit that learns the model; an update unit that updates the gradient information; a sub-sampling unit that determines whether or not to use the learning data to learn the next model based on a sub-sampling rate; a first buffer unit for buffering the learning data and the gradient information, determined to be used, up to a predetermined capacity; and a second buffer unit for buffering the learning data and the gradient information, determined not to be used, up to a predetermined capacity. When buffering the learning data and the gradient information up to a predetermined capacity, the first buffer unit and the second buffer unit write to the data storage unit for each given block. According to Patent Literature 1, with the above-described configuration, it is possible to perform data sampling on a large amount of sample data in gradient boosting.
As one type of federated learning for performing training of a machine learning model through cooperation by a plurality of clients without direct exchange of learning data, horizontal federated learning has been known. In horizontal federated learning, training is performed by using feature values shared by the clients. In the case of performing such horizontal federated learning for GBDT, by mutually transmitting gradient information as described in Patent Literature 1 while encrypting it, a threshold and a feature value of each node constituting the decision tree may be learned in a cooperative manner.
In the case of mutually transmitting gradient information by encrypting it as described above, the computation quantity is enormous since encryption is required. Moreover, since it is necessary to communicate the gradient information many times to determine the feature value and threshold of each node, the communication amount also increases. This causes a problem that it is difficult to perform horizontal federated learning efficiently.
In view of the above, an exemplary object of the present invention is to provide a server device, a learning method, a storage medium, and a client device that solve the above-described problem.
In order to achieve such an object, a server device according to one aspect of the present disclosure is configured to include
The decision tree is learned by determination of the value of each node constituting the decision tree by the determination unit.
Further, a learning method according to another aspect of the present disclosure is configured to include, by an information processing device,
Further, a storage medium according to another aspect of the present disclosure is a computer-readable medium storing thereon a program for causing an information processing device to execute processing to
Further, a client device according to another aspect of the present invention is configured to include
With the configurations described above, the problem described above can be solved.
A first example embodiment of the present disclosure will be described with reference to
A first example embodiment of the present disclosure describes a learning system 100 in which a plurality of client devices 300 and a server device 200 perform federated learning in cooperation with each other. As illustrated in
Moreover, by checking a predetermined stop condition in the global model, the server device 200 can decide which of a leaf node that is a terminal node constituting the decision tree and an internal node other than the leaf node, the node whose value is to be determined is. Then, the server device 200 can acquire the value of the node corresponding to the decision result from each client device 300. For example, in the case where the node whose value is to be determined is a leaf node, the server device 200 acquires, from each client device 300, information including an output value of the leaf node independently calculated by each client device 300. On the other hand, in the case where the node whose value is to be determined is an internal node, the server device 200 acquires, from each client device 300, a value corresponding to the branch condition at the internal node independently calculated by each client device 300. As described above, the server device 200 can acquire information corresponding to the decision result about the node whose value is to be determined, from each client device 300, for example. Note that the stop condition may be a known one such as the depth of the node determined in the global model or the number of pieces of allocated learning data.
In the learning system 100 described in the present embodiment, each client device 300 and the server device 200 can share the model structure. For example, each client device 300 and the server device 200 can share information about the node for which determination is currently made, the overall structure of the decision tree, and the like. Sharing of the information as described above may be realized by transmitting the decision result as described above to each client device 300. It is also possible to have another configuration for sharing such information between each client device 300 and the server device 200.
In the present embodiment, as illustrated in
Moreover, in the present embodiment, a case of performing federated learning on the Gradient Boosting Decision Tree (GBDT), which generates a model in which a plurality of trees are added together, will be described for the learning system 100. For example, in the gradient boosting decision tree, when learning the t-th tree, the new tree can be learned so as to fill the gap between the sum of the outputs of up to the (t-1)th tree and a label that is an objective variable.
As an example, in the gradient boosting decision tree, in the case of determining a branch condition at an internal node, a difference Lsplit in a loss function is calculated by using an expression such as Expression 1, with respect to each threshold of each feature value. Then, in the gradient boosting decision tree, a feature value and a threshold in which the difference Lsplit in the loss function becomes the largest can be adopted as a branch condition.
Note that gi and hi represent numbers called gradient information. Further, IL represents learning data that proceeds to a left-side node after the division, and IR represents learning data that proceeds to a right-side node after the division. Further, I represents learning data present on the node before the division. As described above, in Expression 1, the difference in the loss function before and after the division is calculated.
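Expression 1 itself appears only in the drawings; under the standard gradient boosting formulation (for example, the XGBoost objective), a split gain consistent with the description above is typically written as follows, where the regularization terms λ and γ are an assumption, since the text does not show them:

```latex
L_{\mathrm{split}} = \frac{1}{2}\left[
    \frac{\bigl(\sum_{i \in I_L} g_i\bigr)^2}{\sum_{i \in I_L} h_i + \lambda}
  + \frac{\bigl(\sum_{i \in I_R} g_i\bigr)^2}{\sum_{i \in I_R} h_i + \lambda}
  - \frac{\bigl(\sum_{i \in I} g_i\bigr)^2}{\sum_{i \in I} h_i + \lambda}
\right] - \gamma
```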
Moreover, in the gradient boosting decision tree, when determining an output value of a leaf node, the output value can be determined using an expression such as Expression 2.
Note that gi and hi represent numbers called gradient information, as similar to Expression 1.
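Expression 2 likewise appears only in the drawings; the standard leaf output in gradient boosting, written with the gradient information gi and hi above (the regularization term λ being an assumption), is:

```latex
w_j = -\,\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}
```

where Ij represents the learning data assigned to leaf j.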
Hereinafter, the configuration of the learning system 100 will be described in more detail.
The server device 200 is an information processing device that integrates the results received from the respective client devices 300 to determine the value of a node in the global model constituting the decision tree. The server device 200 can determine the value of each node by performing the integration process for each node.
The operation input unit 210 is configured of operation input devices such as a keyboard and a mouse. The operation input unit 210 detects operation by an operator who operates the server device 200, and outputs it to the arithmetic processing unit 250.
The screen display unit 220 is a screen display device such as a liquid crystal display (LCD). The screen display unit 220 can display, on the screen, various types of information stored in the storage unit 240, in response to an instruction from the arithmetic processing unit 250.
The communication I/F unit 230 is configured of a data communication circuit. The communication I/F unit 230 performs data communication with external devices such as the client devices 300 connected over a communication network.
The storage unit 240 is a storage device such as a hard disk drive (HDD), a solid state drive (SSD), or a memory. The storage unit 240 stores therein processing information and a program 243 required for various types of processing performed in the arithmetic processing unit 250. The program 243 is read and executed by the arithmetic processing unit 250 to thereby implement various processing units. The program 243 is read in advance from an external device or a storage medium via a data input/output function of the communication I/F unit 230 and the like, and is stored in the storage unit 240. The main information stored in the storage unit 240 includes, for example, received information 241, model information 242, and the like.
The received information 241 includes information received from each of the client devices 300. For example, the received information 241 is updated in response to receiving, from each client device 300 by the receiving unit 252, of information representing the value of a node or the like independently calculated by each client device 300.
The leaf node information includes information required for determining an output value of a leaf node in the server device 200. As an example, the leaf node information includes, for each client device 300, values such as an output value ω of a leaf node independently calculated by each client device 300, and the learning data number N that is the number of pieces of learning data assigned to the leaf node in the client device 300. For example,
Note that the leaf node information may include information indicating the output value ω calculated by each client device 300, for each leaf node constituting the decision tree. For example, in the leaf node information, identification information for identifying the leaf node and information indicating the output value ω calculated by each client device 300 may be associated with each other.
The branch condition information includes information required for determining a value corresponding to the branch condition at the internal node in the server device 200. As an example, the branch condition information includes, for each client device 300, values such as a feature value “a” that maximizes the difference in the loss function, the learning data number N that is the number of pieces of learning data assigned to the internal node, and a threshold v that are independently calculated by the client device 300. For example,
Note that the branch condition information may include information representing a value corresponding to the branch condition calculated by each client device 300, for each internal node constituting the decision tree. For example, in the branch condition information, identification information for identifying an internal node and information representing a value corresponding to the branch condition calculated by each client device 300 may be associated with each other. Moreover, the server device 200 can determine the branch condition using either one of the methods to be described below. The information included in the branch condition information may be one corresponding to the method used for determining the branch condition by the server device 200. For example, the branch condition information may include information indicating the difference in the loss function for each feature value and information indicating the threshold v for each feature value, in addition to or instead of various values described above as examples. The branch condition information may include information corresponding to the method used for determining the branch condition by the server device 200, other than those described above as examples.
The model information 242 includes information about the decision tree that is a learning model learned by the server device 200. For example, the model information 242 includes a value of a node in the global model determined by integrating the local calculation results received from the respective client devices 300. As an example, the model information 242 includes information indicating a value corresponding to the global model such as a value representing the branch condition at an internal node constituting the decision tree and an output value of a leaf node. For example, the model information 242 is updated in response to determination of the values of the leaf node and the internal node by the leaf node determination unit 253 and the branch condition determination unit 254.
The arithmetic processing unit 250 includes an arithmetic unit such as a central processing unit (CPU) and its peripheral circuits. The arithmetic processing unit 250 reads the program 243 from the storage unit 240 and executes it to implement various processing units through cooperation between the hardware and the program 243. Main processing units implemented by the arithmetic processing unit 250 include, for example, a stop condition checking unit 251, a receiving unit 252, a leaf node determination unit 253, a branch condition determination unit 254, and a transmission unit 255.
Note that the arithmetic processing unit 250 may include a Graphic Processing Unit (GPU), a Digital Signal Processor (DSP), a Micro Processing Unit (MPU), a Floating point number Processing Unit (FPU), a Physics Processing Unit (PPU), a Tensor Processing Unit (TPU), a quantum processor, a microcontroller, or a combination thereof, instead of the CPU.
The stop condition checking unit 251 checks the predetermined stop condition to decide which of a leaf node and an internal node the determination object node is. The stop condition checking unit 251 may perform the decision using a condition similar to the existing condition used for deciding the node type in the gradient boosting decision tree or the like.
For example, the stop condition checking unit 251 refers to the model information 242 and the like to check whether the decision tree and the node being determined satisfy an arbitrary condition corresponding to the depth (path length) of the node being determined, the number of pieces of assigned learning data, and the like. As an example, the stop condition checking unit 251 decides that the determination object node is an internal node when the depth of the node being determined is less than a given value, and decides that the determination object node is a leaf node when the depth is equal to or larger than the given value. The stop condition checking unit 251 may also perform the decision using another known condition or a combination of known conditions.
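As a minimal sketch of this depth-based check (the function name and the depth limit are hypothetical, not taken from the embodiment):

```python
# Sketch of the depth-based stop condition described above (hypothetical names).
# A node shallower than the limit is decided to be an internal node;
# once the depth reaches the limit, the node is decided to be a leaf.
def decide_node_type(depth: int, max_depth: int = 3) -> str:
    return "internal" if depth < max_depth else "leaf"
```

In practice the stop condition checking unit 251 may combine this with other known conditions, such as the number of pieces of assigned learning data.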
Note that the stop condition checking unit 251 may be configured to transmit the decision result to each client device 300. With such a configuration, the stop condition checking unit 251 can appropriately acquire information corresponding to the decision result from each client device 300.
The receiving unit 252 receives the value of a node independently calculated by each client device 300, from the client device 300 via the communication I/F unit 230. For example, the receiving unit 252 receives, from each client device 300, information corresponding to the result of decision by the stop condition checking unit 251 such as a value corresponding to an output value of a leaf node or a branch condition at an internal node. Moreover, the receiving unit 252 stores the received information in the storage unit 240 as the received information 241.
Note that the receiving unit 252 may receive various types of information in an aspect corresponding to the method to be used for determining the branch condition by the branch condition determination unit 254 to be described below. For example, the receiving unit 252 receives, from the client device 300, a feature value and the number of pieces of learning data that maximize the difference in the loss function in the client device 300. Then, the receiving unit 252 receives, from the client device 300, a threshold calculated by the client device 300 according to the feature value determined by the branch condition determination unit 254. As described above, the receiving unit 252 may receive the information corresponding to the method to be used for determining the branch condition by the branch condition determination unit 254 in the sequence corresponding to the method.
The leaf node determination unit 253 determines the value of a leaf node that is a determination object in the global model by integrating the output values of the leaf node independently calculated by the respective client devices 300, received from the respective client devices 300. Moreover, the leaf node determination unit 253 stores the determined value in the model information 242.
For example, the leaf node determination unit 253 refers to the leaf node information included in the received information 241 to specify the output value of the leaf node independently calculated by each client device 300 and the number of pieces of learning data assigned to the leaf node. Then, the leaf node determination unit 253 determines the value of the node by calculating a weighted average in which each specified output value is weighted according to the number of pieces of learning data assigned to the leaf node in the client device 300.
Specifically, for example, the leaf node determination unit 253 performs calculation according to Expression 3 to integrate the output values calculated by the respective client devices 300 to thereby determine the output value of the leaf node that is a determination object.
\omega_j^{*} = \sum_k N_j^{k}\,\omega_j^{k} \Big/ \sum_k N_j^{k} \quad [Expression 3]
Note that N represents the number of pieces of learning data, and ω represents an output value. Moreover, k corresponds to the client device 300. For example, Nk represents the number of pieces of learning data received from a client device 300-k. Furthermore, j represents information for identifying the leaf node.
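As an illustration, the weighted average of Expression 3 can be sketched in a few lines (a hypothetical helper, not the embodiment's implementation; the same form of weighted average is also used for the thresholds in Expression 4):

```python
def aggregate_leaf_output(outputs, counts):
    """Expression 3: weighted average of the per-client leaf outputs,
    each weighted by the number of pieces of learning data N assigned
    to the leaf in that client device."""
    total = sum(counts)
    return sum(n * w for n, w in zip(counts, outputs)) / total

# Two clients reporting outputs 0.2 and 0.8 with 30 and 10 samples:
# (30*0.2 + 10*0.8) / 40 = 0.35
```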
The branch condition determination unit 254 determines the value of the internal node that is a determination object in the global model by integrating the values or the like corresponding to the branch condition received from the respective client devices 300. Moreover, the branch condition determination unit 254 stores the determined value in the model information 242.
As an example, after determining the feature value based on the information received from the respective client devices 300, the branch condition determination unit 254 determines a threshold based on the thresholds calculated by the respective client devices 300 for the determined feature value, to thereby determine the branch condition consisting of the feature value and the threshold. For example, the branch condition determination unit 254 refers to the branch condition information included in the received information 241 to specify the feature value that maximizes the difference in the loss function and the number of pieces of learning data, independently calculated by each client device 300. Then, the branch condition determination unit 254 performs a weighted majority vote in which each specified feature value is weighted according to the number of pieces of learning data assigned to the internal node in the client device 300, to thereby determine the feature value at the global internal node. For example, the branch condition determination unit 254 aggregates the votes received from the respective client devices 300 for each feature value, adjusting them by weighting according to the number of pieces of learning data, and determines the feature value that has received the most weighted votes to be the feature value at the global internal node. Note that the weighting according to the number of pieces of learning data may be adjusted arbitrarily.
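The weighted majority vote described above can be sketched as follows (a hypothetical helper; the embodiment leaves the exact weighting adjustable):

```python
from collections import defaultdict

def vote_feature(features, counts):
    """Each client device proposes its best feature value; the votes are
    weighted by the number of pieces of learning data N assigned to the
    internal node at that client, and the heaviest feature wins."""
    tally = defaultdict(int)
    for feature, n in zip(features, counts):
        tally[feature] += n
    return max(tally, key=tally.get)
```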
Moreover, after the receiving unit 252 receives the threshold calculated by each client device 300 based on the determined feature value from each client device 300, the branch condition determination unit 254 determines the value of the node by calculating a weighted average in which weighting is performed corresponding to the number of pieces of learning data assigned to the internal node in the client device 300, with respect to the received thresholds. Specifically, for example, the branch condition determination unit 254 performs calculation according to Expression 4 to integrate the thresholds independently calculated by the respective client devices 300, to thereby determine the threshold of the internal node that is a determination object.
v^{*} = \sum_k N_k\,v_k \Big/ \sum_k N_k \quad [Expression 4]
Note that N represents the number of pieces of learning data, and v represents a threshold. Moreover, k corresponds to the client device 300. For example, Nk represents the number of pieces of learning data received from a client device 300-k.
Instead of the method of receiving the feature value from each client device 300 as described above, the branch condition determination unit 254 may be configured to receive the difference in the loss function for each feature value from each client device 300 and determine the feature value based on the received differences. In the case of using such a method, for example, the receiving unit 252 receives, from each client device 300, the difference in the loss function obtained when that client device 300 independently determines the threshold that maximizes the difference in the loss function with respect to each feature value. That is, the receiving unit 252 receives the difference in the loss function for each feature value and the number of pieces of learning data. In response to this, the branch condition determination unit 254 calculates the weighted average indicated by Expression 5 with respect to each feature value "a", and determines the feature value that maximizes the calculation result as the feature value of the internal node that is a determination object. In the case of such a method, the process after the determination of the feature value may be similar to that in the above-described example.
L_{\mathrm{split}}^{a} = \sum_k N_k\,L_{\mathrm{split}}^{k,a} \Big/ \sum_k N_k \quad [Expression 5]
Note that N represents the number of pieces of learning data, and Lsplit represents the difference in the loss function. Moreover, "k" corresponds to the client device 300, and "a" represents a feature value.
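This Expression-5 selection can be sketched as follows (the data layout, one dict of per-feature loss differences per client device, is hypothetical):

```python
def select_feature_by_loss(loss_diffs, counts):
    """Expression 5: compute the N-weighted average of the per-feature
    loss differences across clients, then pick the maximizing feature.
    loss_diffs[k][a] is client k's loss difference for feature a."""
    total = sum(counts)
    features = loss_diffs[0].keys()
    averaged = {
        a: sum(n * d[a] for n, d in zip(counts, loss_diffs)) / total
        for a in features
    }
    return max(averaged, key=averaged.get)
```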
Instead of the above-described methods, the branch condition determination unit 254 may receive thresholds for each feature value from the respective client devices 300 and determine the threshold of the global internal node based on the received thresholds. That is, the branch condition determination unit 254 may be configured to determine the threshold before determining the feature value. The branch condition determination unit 254 may then determine the feature value based on the difference in the loss function for each feature value, calculated by the respective client devices 300 using the determined threshold. In the case of using such a method, for example, the receiving unit 252 receives, from each client device 300, the threshold obtained when that client device 300 independently determines the threshold that maximizes the difference in the loss function with respect to each feature value. That is, the receiving unit 252 receives a threshold for each feature value and the number of pieces of learning data. In response to this, the branch condition determination unit 254 can determine the threshold for each feature value by calculating the weighted average of the thresholds as indicated by Expression 6 with respect to each feature value "a".
v^{a} = \sum_k N_k\,v^{k,a} \Big/ \sum_k N_k \quad [Expression 6]
Note that N represents the number of pieces of learning data, and v represents a threshold. Moreover, k corresponds to the client device 300. For example, Nk represents the number of pieces of learning data received from a client device 300-k. Furthermore, "a" represents a feature value.
Moreover, after the receiving unit 252 receives, from each client device 300, the difference in the loss function for each feature value calculated by each client device 300 based on the determined threshold, the branch condition determination unit 254 determines a feature value in which the difference in the loss function becomes maximum in response to the result of calculating the weighted average according to Expression 7 with respect to each feature value. Thereby, the branch condition determination unit 254 can determine the feature value and the threshold corresponding to the feature value as a branch condition.
L_{\mathrm{split}}^{a} = \sum_k N_k\,L_{\mathrm{split}}^{k,a} \Big/ \sum_k N_k \quad [Expression 7]
Note that N represents the number of pieces of learning data, and Lsplit represents the difference in the loss function. Moreover, "k" corresponds to the client device 300, and "a" represents a feature value.
The branch condition determination unit 254 can determine the value of the global internal node that is a determination object by integrating the values corresponding to the branch condition received from the respective client devices 300 using any of the above-described methods. Note that the method to be used for determining the branch condition by the branch condition determination unit 254 is shared with the respective client devices 300 in advance. It is also possible to have a configuration in which after the branch condition determination unit 254 determines the branch condition, the stop condition checking unit 251 decides the next node.
The transmission unit 255 transmits the values and the like determined by the leaf node determination unit 253 and the branch condition determination unit 254 to each client device 300. The transmission unit 255 may transmit a value to each client device 300 each time the branch condition determination unit 254 makes a determination; for example, the transmission unit 255 may transmit the feature value when the branch condition determination unit 254 determines the feature value, and then transmit the threshold when the branch condition determination unit 254 determines the threshold.
The exemplary configuration of the server device 200 is as described above. Next, an exemplary configuration of the client device 300 will be described with reference to
The configurations of the operation input unit 310, the screen display unit 320, and the communication I/F unit 330 may be the same as those of the server device 200. Therefore, the description thereof is omitted.
The storage unit 340 is a storage device such as an HDD, an SSD, or a memory. The storage unit 340 stores therein processing information and a program 343 required for various types of processing performed in the arithmetic processing unit 350. The program 343 is read and executed by the arithmetic processing unit 350 to thereby implement various processing units. The program 343 is read in advance from an external device or a storage medium via the data input/output function of the communication I/F unit 330 and the like, and is stored in the storage unit 340. The main information stored in the storage unit 340 includes, for example, learning data information 341, model information 342, and the like.
The learning data information 341 includes learning data used for calculation by the calculation unit 352 to be described below. For example, the learning data information 341 is acquired in advance by using a method of acquiring it from an external device via the communication I/F unit 330 or inputting it using the operation input unit 310, and is stored in the storage unit 340. For example, the learning data information 341 includes a feature value that is common to the respective client devices 300 although the samples are different.
The model information 342 includes the value of a node in a local model independently calculated using the learning data included in the learning data information 341. The model information 342 may also include, for example, a value of a node in a global model received from the server device 200. For example, the model information 342 is updated corresponding to independent calculation by the calculation unit 352 to be described below based on the learning data included in the learning data information 341, or the like. The model information 342 may also be updated corresponding to receiving of a value of a global node by the receiving unit 351 from the server device 200, or the like.
The arithmetic processing unit 350 includes an arithmetic unit such as a CPU and its peripheral circuits. The arithmetic processing unit 350 reads, from the storage unit 340, and executes the program 343 to implement various processing units through cooperation between the hardware and the program 343. Main processing units to be implemented by the arithmetic processing unit 350 include, for example, the receiving unit 351, the calculation unit 352, and the transmission unit 353. Note that the arithmetic processing unit 350 may include a GPU or the like in place of the CPU, as similar to the case of the server device 200.
The receiving unit 351 receives a value of a node in a global model determined by the server device 200, from the server device 200. Moreover, the receiving unit 351 can store the received value in the storage unit 340 as the model information 342.
The receiving unit 351 can also receive, from the server device 200, information indicating the result of deciding which of a leaf node and an internal node the node to be calculated next is. Note that which of a leaf node and an internal node the node to be calculated next is may be decided by the client device 300.
The calculation unit 352 calculates the value of each node constituting the decision tree based on the learning data included in the learning data information 341. The calculation unit 352 may calculate either an output value or a branch condition corresponding to the decision result by the server device 200 received by the receiving unit 351.
For example, the calculation unit 352 can independently determine a feature value or a threshold by calculating the difference in the loss function using above-described Expression 1 or the like, based on the learning data included in the learning data information 341. In that case, the calculation unit 352 may determine the value by a method corresponding to the method used for determining the branch condition by the branch condition determination unit 254 of the server device 200. As an example, the calculation unit 352 can independently determine the feature value and threshold that maximize the difference in the loss function by calculating the difference in the loss function with respect to each threshold of each feature value, based on the learning data included in the learning data information 341. Moreover, the calculation unit 352 can calculate an optimum threshold for the feature value received from the server device 200 by the same method. The calculation unit 352 may determine the threshold corresponding to the feature value received from the server device 200 by storing the relationship between the feature value and the threshold at the time of independently determining the feature value and threshold that maximize the difference in the loss function.
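A client-side search of this kind can be sketched as follows. An XGBoost-style gain with a regularization term lam is assumed, since Expression 1 is not reproduced in the text; all names are hypothetical:

```python
def side_gain(g, h, mask, lam):
    """(sum of g)^2 / (sum of h + lam) over the samples selected by mask."""
    gs = sum(gi for gi, m in zip(g, mask) if m)
    hs = sum(hi for hi, m in zip(h, mask) if m)
    return gs * gs / (hs + lam)

def best_split(X, g, h, lam=1.0):
    """Scan every threshold of every feature value and return the
    (feature index, threshold) pair that maximizes the loss difference,
    as in the independent calculation described above."""
    best_a, best_v, best_gain = None, None, float("-inf")
    parent = side_gain(g, h, [True] * len(X), lam)
    for a in range(len(X[0])):
        for v in sorted({row[a] for row in X}):
            left = [row[a] <= v for row in X]
            if all(left) or not any(left):
                continue  # degenerate split, skip
            right = [not m for m in left]
            gain = side_gain(g, h, left, lam) + side_gain(g, h, right, lam) - parent
            if gain > best_gain:
                best_a, best_v, best_gain = a, v, gain
    return best_a, best_v
```

For gradients that change sign between the second and third samples of a single feature, this scan selects the threshold separating the two groups.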
The calculation unit 352 can also calculate the output value of a leaf node by using above-described Expression 2 or the like, based on the learning data included in the learning data information 341.
The transmission unit 353 transmits the value calculated and determined by the calculation unit 352, to the server device 200. For example, the transmission unit 353 can transmit the independently calculated value of a node in the local model to the server device 200, in response to calculation and determination by the calculation unit 352.
The exemplary configuration of the client device 300 is as described above. Next, an exemplary operation of the server device 200 and the client device 300 will be described with reference to
When the stop condition checking unit 251 decides that the determination object node is a leaf node (step S101, determine leaf node), the receiving unit 252 receives, from each client device 300, information indicating the output value of the leaf node that is independently calculated by each client device 300 and the number of pieces of learning data (step S102).
The leaf node determination unit 253 calculates a weighted average of the output values received from the respective client devices 300, each weighted by the corresponding number of pieces of learning data, to thereby determine the value of the node (step S103).
The transmission unit 255 transmits, to each client device 300, the output value of the leaf node in the global model determined by the leaf node determination unit 253 (step S104).
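The weighted-average aggregation of steps S102 to S104 can be sketched as follows. The function name and the use of plain (unencrypted) values are assumptions made for illustration, not part of the described embodiment.

```python
# Sketch of server-side leaf-value aggregation (steps S102-S104): each client
# reports (output_value, n_samples), and the global leaf value is the average
# of the output values weighted by each client's sample count.

def aggregate_leaf_value(client_reports):
    """client_reports: list of (output_value, n_samples) tuples."""
    total = sum(n for _, n in client_reports)
    if total == 0:
        raise ValueError("no learning data assigned to this leaf")
    return sum(v * n for v, n in client_reports) / total

# Example: three clients holding different amounts of learning data.
reports = [(0.8, 100), (0.5, 50), (0.2, 50)]
global_leaf = aggregate_leaf_value(reports)  # (80 + 25 + 10) / 200 = 0.575
```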
On the other hand, when the stop condition checking unit 251 decides that the determination object node is an internal node (step S101, determine branch condition), the receiving unit 252 receives, from each client device 300, the feature value that maximizes the difference in the loss function, independently calculated by each client device 300, and the number of pieces of learning data (step S105).
Then, the branch condition determination unit 254 performs a weighted majority vote in which the received feature values are aggregated with each client's vote weighted by its number of pieces of learning data, to thereby determine the feature value at the global internal node (step S106). In response, the transmission unit 255 transmits the feature value determined by the branch condition determination unit 254 to each client device 300 (step S107).
The receiving unit 252 receives, from each client device 300, a threshold calculated by the client device 300 based on the feature value transmitted from the transmission unit 255 (step S108).
The branch condition determination unit 254 calculates a weighted average of the received thresholds, each weighted by the number of pieces of learning data assigned to the internal node in the corresponding client device 300, to thereby determine the value of the node (step S109). In response, the transmission unit 255 transmits the threshold determined by the branch condition determination unit 254 to each client device 300 (step S110).
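The two-stage branch-condition aggregation of steps S105 to S110 can be sketched as follows; the function names are hypothetical, and the values are shown unencrypted purely for illustration.

```python
# Sketch of server-side branch-condition aggregation: first a weighted
# majority vote selects the global feature (step S106), then the thresholds
# reported for that feature are averaged with the same per-client weights
# (step S109).

from collections import defaultdict

def vote_feature(feature_reports):
    """feature_reports: list of (feature_id, n_samples) tuples."""
    votes = defaultdict(int)
    for feature, n in feature_reports:
        votes[feature] += n
    return max(votes, key=votes.get)

def aggregate_threshold(threshold_reports):
    """threshold_reports: list of (threshold, n_samples) tuples."""
    total = sum(n for _, n in threshold_reports)
    return sum(t * n for t, n in threshold_reports) / total

features = [("age", 100), ("income", 50), ("age", 30)]
chosen = vote_feature(features)           # "age" wins with weight 130 vs. 50
thresholds = [(40.0, 100), (50.0, 30)]    # thresholds reported for "age"
split = aggregate_threshold(thresholds)   # (4000 + 1500) / 130 ≈ 42.3
```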
After the processing of step S104 or step S110, when a node that is a determination object exists (step S111, Yes), the stop condition checking unit 251 checks the predetermined stop condition to decide whether the determination object node is a leaf node or an internal node (step S101). Meanwhile, when no such node exists (step S111, No), the server device 200 ends the processing.
The exemplary operation of the server device 200 is as described above. Next, an exemplary operation of the client device 300 will be described with reference to
In the case of receiving an instruction to determine a leaf node (step S202, determination of a leaf node), the calculation unit 352 calculates the output value of the leaf node by using above-described Expression 2 or the like, based on the learning data included in the learning data information 341. In response, the transmission unit 353 transmits the output value of the leaf node calculated by the calculation unit 352 and the number of pieces of learning data, to the server device 200 (step S204).
By using the output value of the leaf node transmitted from the transmission unit 353 and the like, the server device 200 determines the output value of the leaf node in the global model. Then, the receiving unit 351 receives the output value of the leaf node in the global model from the server device 200 (step S205).
On the other hand, in the case of receiving an instruction to determine a branch condition (step S202, determination of a branch condition), the calculation unit 352 calculates the difference in the loss function using above-described Expression 1 based on the learning data included in the learning data information 341, to thereby identify the feature value and the threshold that maximize the difference in the loss function (step S206). In response, the transmission unit 353 transmits the feature value calculated by the calculation unit 352 and the number of pieces of learning data, to the server device 200 (step S207).
On the basis of the feature value transmitted by the transmission unit 353, the server device 200 determines the feature value at the internal node that is a determination object. Thereafter, the receiving unit 351 receives the feature value of the internal node determined by the server device 200 (step S208).
The calculation unit 352 calculates an optimum threshold for the feature value received by the receiving unit 351 (step S209). The calculation unit 352 may be configured to identify the optimum threshold for the received feature value by storing the combination of the feature value and the threshold calculated at step S206. The transmission unit 353 transmits the threshold calculated by the calculation unit 352, to the server device 200 (step S210).
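One way a client can answer step S209 from the results of its own search at step S206 is to remember, for each feature, the best threshold found, and then look up the feature the server chose. The sketch below illustrates that bookkeeping; the class and method names are hypothetical.

```python
# Sketch of the per-feature bookkeeping between step S206 and step S209:
# during the local split search, store the highest-gain threshold seen for
# every feature, then report the stored threshold for the server's choice.

class SplitRecorder:
    """Remembers, for each feature, the threshold with the highest gain seen."""

    def __init__(self):
        self._best = {}  # feature -> (gain, threshold)

    def record(self, feature, threshold, gain):
        if feature not in self._best or gain > self._best[feature][0]:
            self._best[feature] = (gain, threshold)

    def threshold_for(self, feature):
        """Return the stored threshold for the server's chosen feature."""
        return self._best[feature][1]

recorder = SplitRecorder()
recorder.record("age", 40.0, gain=1.2)
recorder.record("age", 45.0, gain=0.9)        # lower gain: not stored
recorder.record("income", 30000.0, gain=1.5)

recorder.threshold_for("age")  # 40.0, reported to the server at step S210
```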
On the basis of the threshold transmitted by the transmission unit 353, the server device 200 determines the threshold at the internal node that is a determination object. Thereafter, the receiving unit 351 receives the threshold of the internal node determined by the server device 200 (step S211). As a result, the client device 300 receives the value corresponding to the branch condition consisting of the feature value and the threshold.
The exemplary operation of the client device 300 is as described above.
As described above, the server device 200 includes the receiving unit 252, the leaf node determination unit 253, and the branch condition determination unit 254. With this configuration, the leaf node determination unit 253 can determine the output value of a leaf node in the global model on the basis of the output values of the leaf node and the like received by the receiving unit 252. Moreover, the branch condition determination unit 254 determines the value corresponding to the branch condition of an internal node in the global model, on the basis of the values corresponding to the branch condition received by the receiving unit 252. As a result, it is possible to perform horizontal federated learning without transmitting gradient information, and to realize efficient learning.
The server device 200 also includes the stop condition checking unit 251. With this configuration, the receiving unit 252 can acquire information corresponding to the decision result by the stop condition checking unit 251 appropriately from each client device 300.
As described above, the branch condition determination unit 254 can determine the value of a global internal node that is a determination object by integrating the values corresponding to the branch condition received from the respective client devices 300 with use of any of a plurality of methods. Therefore, the processing from step S105 to step S110 illustrated in
Next, a second example embodiment of the present disclosure will be described with reference to
The second example embodiment of the present disclosure describes exemplary configurations of the server device 400, which performs learning in cooperation with the client device 500, and of the client device 500.
Note that the server device 400 may use a GPU, an MPU, an FPU, a PPU, a TPU, a quantum processor, a microcontroller, or a combination thereof, instead of the CPU described above.
Moreover, the server device 400 can realize functions as the acquisition unit 421 and the determination unit 422 illustrated in
The acquisition unit 421 acquires, from each of a plurality of client devices, information indicating the value of a node constituting the decision tree determined based on the learning data held by the client device itself.
The determination unit 422 integrates the results acquired by the acquisition unit 421 to determine the value of the node constituting the decision tree. The determination unit 422 determines the values of the respective nodes constituting the decision tree, so that the server device 400 learns the decision tree.
As described above, the server device 400 includes the acquisition unit 421 and the determination unit 422. With this configuration, the determination unit 422 can determine the value of each node constituting the decision tree by integrating the results acquired by the acquisition unit 421. As a result, the server device 400 can perform horizontal federated learning without transmitting gradient information, and realize efficient learning.
Note that the server device 400 as described above can be realized by incorporation of a predetermined program into an information processing device such as the server device 400. Specifically, a program that is another aspect of the present invention is a program for implementing, on an information processing device such as the server device 400, processing to acquire, from a plurality of client devices, information representing a value of a node constituting the decision tree, the information being determined based on the learning data held by the client device itself, determine the value of the node constituting the decision tree by integrating the acquired results, and learn the decision tree by determining the value of each node constituting the decision tree.
Further, a learning method carried out by an information processing device such as the server device 400 is a method including, by an information processing device such as the server device 400, acquiring, from a plurality of client devices, information representing a value of a node constituting the decision tree, the information being determined based on the learning data held by the client device itself, determining the value of the node constituting the decision tree by integrating the acquired results, and learning the decision tree by determining the value of each node constituting the decision tree.
An invention of a program, a computer-readable storage medium storing thereon a program, or a learning method having the above-described configuration also exhibits the same actions and effects as those of the server device 400. Therefore, the above-described object of the present invention can also be achieved by such an invention.
Moreover, the client device 500 that transmits the value of a node, independently calculated by itself, to the server device 400 can implement the functions as the calculation unit 521 and the transmission unit 522 illustrated in
The calculation unit 521 determines the value of a node constituting the decision tree based on the learning data held by the device itself. For example, the calculation unit 521 may determine the value of each node constituting the decision tree by a method similar to that used for learning a general gradient boosting decision tree.
The transmission unit 522 transmits information representing the value of a node calculated by the calculation unit 521, to the server device 400.
As described above, the client device 500 includes the calculation unit 521 and the transmission unit 522. With this configuration, the transmission unit 522 can transmit the value of a node calculated by the calculation unit 521 to the server device 400. As a result, the integration processing described for the server device 400 can be performed. Therefore, efficient learning can be realized.
Note that the client device 500 described above can be realized by incorporation of a predetermined program into an information processing device such as the client device 500. Specifically, a program that is another aspect of the present invention is a program for causing an information processing device such as the client device 500 to realize processing to determine the value of a node constituting the decision tree based on the learning data held by the device itself, and transmit the information representing the calculated value of the node to the server device.
Moreover, a learning method to be carried out by an information processing device such as the client device 500 is a method including, by an information processing device such as the client device 500, determining the value of a node constituting the decision tree based on the learning data held by the device itself, and transmitting the information representing the calculated value of the node to the server device.
An invention of a program, a computer-readable storage medium storing thereon a program, or a learning method having the above-described configuration also exhibits the same actions and effects as those of the client device 500. Therefore, the above-described object of the present invention can also be achieved by such an invention.
The whole or part of the example embodiments disclosed above can be described as the following supplementary notes. Hereinafter, the outlines of the server device and the like of the present invention will be described. However, the present invention is not limited to the configurations described below.
A server device comprising:
The server device according to supplementary note 1, wherein the
The server device according to supplementary note 2, wherein
The server device according to any one of supplementary notes 1 to 3, wherein
The server device according to supplementary note 4, wherein
The server device according to supplementary note 4, wherein
The server device according to any one of supplementary notes 1 to 6, further comprising
A learning method performed by an information processing device, the method comprising:
A program for causing an information processing device to execute processing to
A client device comprising:
While the present invention has been described with reference to the example embodiments described above, the present invention is not limited to the above-described embodiments. The form and details of the present invention can be changed within the scope of the present invention in various manners that can be understood by those skilled in the art.
Number | Date | Country | Kind |
---|---|---|---|
2022-137348 | Aug 2022 | JP | national |