The present invention relates to a learning method, a learning apparatus, and a program.
The technology of predicting a future event by modeling events with propagation properties has been studied in the past. Here, propagation refers to, for example, the spreading (sharing) of content in a social networking service (SNS), the spread of infectious diseases, etc. In addition, an event refers to, for example, the act of sharing content in an SNS, the onset of an infectious disease, etc.
A Hawkes process is commonly used to model events with propagation properties (e.g., NPL 1). The Hawkes process is a kind of stochastic process called a “point process”. In the framework of a point process, the probability of occurrence of an event is represented by a function called “intensity”.
In the Hawkes process, the intensity is described as the cumulative sum of the influences of past events, and the influence of an individual event is modeled by a function called a “trigger function”.
[NPL 1] RIZOIU, Marian-Andrei, et al. "A Tutorial on Hawkes Processes for Events in Social Media." arXiv preprint arXiv:1708.06401, 2017.
Incidentally, the conventional technology described in NPL 1 given above makes a strong assumption that the influences of past events do not change with time. In other words, it is assumed that the shape of the trigger function is constant regardless of absolute time point.
However, in real-world applications, the mode of propagation is considered to change with time depending on the internal state of a recipient of an event. For example, propagation in an SNS is considered to fluctuate periodically with the level of human activity: the influence of an individual event is larger (i.e., propagation is stronger) during the daytime and smaller during the nighttime. As a result, the conventional technology may suffer from deteriorated accuracy in predicting an event.
One embodiment of the present invention has been made in view of the above-mentioned problems, and has an object to construct a point process model capable of predicting occurrence of an event with high accuracy.
In order to achieve the above-mentioned object, a learning method to be executed by a computer according to one embodiment includes an acquisition procedure of acquiring event history information representing a history of a predetermined event; and a training procedure of using the event history information acquired in the acquisition procedure to train parameters of an intensity function in which a trigger function is set to be a function represented by a composite function of a first function and a predetermined second function, and a derivative of the first function, the first function being represented by a neural network that models a temporal change in influence of the event.
It is possible to construct a point process model capable of predicting occurrence of an event with high accuracy.
In the following, one embodiment of the present invention is described. This embodiment describes an event prediction device 10 that constructs a point process model capable of predicting occurrence of an event with high accuracy, and predicts occurrence of an event by using the point process model.
Here, the point process model is generally described by a function called an “intensity function” (or “strength function”), which represents the probability of occurrence of an event. In the Hawkes process, this intensity function is modeled by a function called a “trigger function”. In this embodiment, the trigger function is rewritten as a function that depends on the absolute time point, thereby extending the point process model so that it can take into account the temporal change in the influence of a past event. As a result, it is possible to predict occurrence of a future event with high accuracy even when the influence of an event (i.e., the mode of propagation) changes with time.
<Theoretical Configuration>
First, the theoretical configuration of this embodiment is described. In this embodiment, it is assumed that event history information D representing the history of events from time point t1 to time point tN is given. Specific examples of the event history information D include data representing the history of the act of spreading content in an SNS and data representing the history of infection of an infectious disease. In general, the event history information is represented by pairs (ti, mi) of a time point ti, which indicates the time point of occurrence of an event, and auxiliary information mi at the time point ti. The auxiliary information is, for example, the user ID of an SNS in the case of the history of the act of spreading content on the SNS, or the type of disease and the profile of an infected person in the case of the history of infection of an infectious disease. N is the number of events included in the event history information D.
In this embodiment, for simplicity of description, it is assumed that only the history of event occurrence time points is given as the event history information. In other words, it is assumed that the event history information D is represented by {t1, . . . , tN}.
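As a purely illustrative aid (not part of the embodiment), such an event history might be held as a sorted list of time points; all values below are invented:

```python
# Hypothetical event history D = {t1, ..., tN}: time points (e.g., in hours)
# at which events (shares on an SNS, onsets of a disease, ...) occurred,
# sorted in ascending order.
D = [0.7, 1.2, 3.5, 3.9, 7.4, 8.0, 12.6]
N = len(D)   # number of events in D
T = 14.0     # end of the observation interval [0, T], with tN <= T
```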
In this embodiment, the given event history information D is used to train the parameters of the point process model.
First, an intensity function is designed in accordance with the procedure of a general point process model. The intensity function represents the probability of an event occurring per unit time. When the intensity function that represents the probability of an event occurring at a time point t is denoted by λ(t), the intensity function λ(t) may be represented by the following expression (1), for example.

[Math. 1]
λ(t) = μ + Σ_{j: tj < t} h(Δj|t) (1)
In the above expression, Δj ≡ t − tj is defined. Further, μ is called the “background rate”, and represents the probability of an event occurring independently of the influences of past events. In this embodiment, for simplicity of description, a time-invariant constant μ is used, but this embodiment can also be easily generalized to a case where μ varies with the time point t.
The second term in the above expression (1) represents the influence of past events. h(·) is a trigger function, which usually depends only on the difference Δj between the time point tj of occurrence of a past event and the current time point t. For example, an exponential decay function, a Weibull distribution, a gamma distribution, and other functions are widely used as the trigger function. In this embodiment, the trigger function h(Δj|t) at the time point t is modeled by the following expression (2) by using a time conversion function f(t).

[Math. 2]
h(Δj|t) = g(f(Δj))·∂f(Δj)/∂Δj (2)
In the above expression, f(Δj) is represented by the following expression (3).
[Math. 3]
f(Δj) = ∫_{t−Δj}^{t} a(s)ds (3)
g(·) is the trigger function (i.e., an exponential decay function, a Weibull distribution, a gamma distribution, etc.) used in an existing Hawkes process model. a(t) is any black-box function representing a temporal change in the influence of an event. In this embodiment, a(t) is modeled by a neural network (that is, f(t) is a function represented by a neural network (hereinafter referred to as “neural network function”)). In this way, it is possible to write down the likelihood function by using only the neural network function f(t) and the derivative of f with respect to Δj.
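As a non-limiting illustration of expressions (1) to (3), the following PyTorch sketch models the cumulative counterpart Φ(t) of a(t) with a small monotone neural network (softplus-reparameterized weights keep Φ non-decreasing, so a(t) = Φ′(t) ≥ 0), so that f(Δj) = Φ(t) − Φ(t − Δj); because the event time tj = t − Δj is held fixed, the derivative of f with respect to Δj equals a(t). The names (MonotonePhi, phi_and_rate, trigger, intensity) and all hyperparameters are our own illustrative assumptions, not the embodiment's definitive implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MonotonePhi(nn.Module):
    """Illustrative monotone network Phi(t), standing in for the integral
    of a(s) up to t: positive (softplus-reparameterized) weights and an
    increasing activation make Phi non-decreasing, so a(t) = Phi'(t) >= 0."""
    def __init__(self, hidden=16):
        super().__init__()
        self.w1 = nn.Parameter(0.1 * torch.randn(hidden, 1))
        self.b1 = nn.Parameter(torch.zeros(hidden))
        self.w2 = nn.Parameter(0.1 * torch.randn(1, hidden))

    def forward(self, t):
        x = t.unsqueeze(-1)                                  # (..., 1)
        h = torch.tanh(x @ F.softplus(self.w1).t() + self.b1)
        return (h @ F.softplus(self.w2).t()).squeeze(-1)     # (...)

def phi_and_rate(phi, t):
    """Phi(t) and its derivative a(t) = Phi'(t), obtained by autograd."""
    t = torch.as_tensor(t, dtype=torch.float32).clone().requires_grad_(True)
    p = phi(t)
    a, = torch.autograd.grad(p, t, create_graph=True)
    return p, a

def trigger(phi, t, past, alpha=1.0, beta=1.0):
    """Expression (2): h(Delta_j | t) = g(f(Delta_j)) * df/dDelta_j, where
    f(Delta_j) = Phi(t) - Phi(tj) (expression (3)) and df/dDelta_j = a(t),
    with the exponential-decay kernel g(x) = alpha * exp(-beta * x)."""
    p_t, a_t = phi_and_rate(phi, t)
    fval = p_t - phi(past)            # f(Delta_j) for each past event tj
    return alpha * torch.exp(-beta * fval) * a_t

def intensity(phi, t, history, mu=0.1):
    """Expression (1): lambda(t) = mu + sum over tj < t of h(t - tj | t)."""
    past = torch.tensor([tj for tj in history if tj < t], dtype=torch.float32)
    if past.numel() == 0:
        return torch.tensor(mu)
    return mu + trigger(phi, t, past).sum()
```

For instance, intensity(MonotonePhi(), 5.0, D) would evaluate λ(5.0) for the toy history D given earlier.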
Specifically, in this embodiment, the likelihood function L to be minimized (the negative log-likelihood) can be written down as the following expression (4).

[Math. 4]
L = −Σ_{i=1}^{N} log λ(ti) + ∫_0^T λ(t)dt
  = −Σ_{i=1}^{N} log λ(ti) + μT + Σ_{j=1}^{N} {G(f(T−tj)) − G(f(0))} (4)
The integral interval [0, T] in the above expression (4) is, for example, the time interval in which the event history information D was collected, and satisfies [t1, tN] ⊆ [0, T]. In addition, G(·) is a primitive function (an antiderivative) of g(·), and is defined by the following expression (5).
[Math. 5]
G(f)=∫g(f)df (5)
The integral shown in the expression (5) can be solved analytically for many g(·) such as the exponential decay function, the Weibull distribution, and the gamma distribution.
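For instance, with the exponential-decay kernel g(x) = α·exp(−βx) (an illustrative choice; α and β below are hypothetical values), expression (5) yields the closed form G(x) = −(α/β)·exp(−βx). The small check below confirms numerically that this G is indeed a primitive function of g:

```python
import math

alpha, beta = 1.0, 2.0  # hypothetical kernel parameters

def g(x):
    # exponential-decay trigger kernel
    return alpha * math.exp(-beta * x)

def G(x):
    # closed-form primitive of g: d/dx[-(alpha/beta) * e^(-beta*x)] = g(x)
    return -(alpha / beta) * math.exp(-beta * x)

x, eps = 0.3, 1e-6
finite_diff = (G(x + eps) - G(x - eps)) / (2 * eps)
print(abs(finite_diff - g(x)) < 1e-4)  # True: G' agrees with g
```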
Then, at the time of training, the parameters of the neural network function f(·) are estimated so as to minimize L shown in the above expression (4). Any known optimization method can be used to optimize the parameters. L shown in the above expression (4) is differentiable with respect to all the parameters, and thus the parameters can be optimized by using a gradient method, for example. The gradient of L can be calculated by using the backpropagation method, for example.
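Putting the pieces together, a minimal training sketch might read as follows. It assumes the MonotonePhi and intensity helpers from the sketch above and the toy D and T from earlier, fixes μ for brevity, instantiates the “gradient method” with Adam, and evaluates expression (4) via the closed-form G of the exponential-decay kernel (f(0) = 0, so G(f(0)) = G(0)). This is a sketch under those assumptions, not the definitive implementation.

```python
import torch

alpha, beta, mu = 1.0, 1.0, 0.1        # hypothetical; must match trigger()
phi = MonotonePhi(hidden=16)           # from the sketch above
opt = torch.optim.Adam(phi.parameters(), lr=1e-2)

def G(x):
    # primitive function of g(x) = alpha * exp(-beta * x) (expression (5))
    return -(alpha / beta) * torch.exp(-beta * x)

events = torch.tensor(D, dtype=torch.float32)  # event time points t1..tN
T_obs = torch.tensor(T, dtype=torch.float32)   # end of the interval [0, T]

for step in range(500):
    opt.zero_grad()
    # first term of expression (4): -sum_i log lambda(ti)
    log_lam = torch.stack(
        [torch.log(intensity(phi, float(ti), D, mu)) for ti in events])
    # integral term: mu*T + sum_j (G(f(T - tj)) - G(f(0))), with f(0) = 0
    f_T = phi(T_obs) - phi(events)     # f(T - tj) evaluated at t = T
    nll = -log_lam.sum() + mu * T_obs + (G(f_T) - G(torch.tensor(0.0))).sum()
    nll.backward()                     # backpropagation of the gradient
    opt.step()                         # one gradient-method update
```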
<Overall Configuration>
Next, the overall configuration of the event prediction device 10 according to this embodiment is described.
The event prediction device 10 according to this embodiment includes an acquisition unit 101, a parameter training unit 102, a specification reception unit 103, a prediction unit 104, an output unit 105, and a parameter storage unit 106.
The acquisition unit 101 acquires the event history information D from an event history information storage device 20, which is connected to the event prediction device 10 via a communication network.
The event history information storage device 20 is, for example, a web server or database server that stores the event history information. The event history information stored in the event history information storage device 20 may be operated (registered, deleted, modified, etc.) by using, for example, a terminal (this terminal may be the event prediction device 10 or the event history information storage device 20 itself) connected to the event history information storage device 20 via the communication network.
The parameter training unit 102 uses the event history information D acquired by the acquisition unit 101 to train the parameters of the intensity function λ(t) shown in the above expression (1) (i.e., the parameters of the neural network function f(t) embedded in the intensity function λ(t)). At this time, the parameter training unit 102 trains the parameters by minimizing L shown in the above expression (4) by using any known optimization method (e.g., a gradient method). The parameters trained by the parameter training unit 102 (trained parameters) are stored into the parameter storage unit 106.
The specification reception unit 103 receives specification of a prediction time point at which occurrence of an event is to be predicted by using the intensity function λ(t) with the set trained parameters. Depending on the type or the like of an event, information other than the time point may be received (as a specific example, specification of information indicating a location such as a region may be received in addition to the time point in the case of predicting an outbreak of an infectious disease).
The prediction unit 104 uses the intensity function λ(t) with the set trained parameters to predict occurrence of an event at the time point received by the specification reception unit 103. At this time, the prediction unit 104 predicts occurrence of an event by, for example, calculating the probability of occurrence of an event up to the prediction time point by using the intensity function λ(t), or by performing point process simulation. Various methods are available for point process simulation; for example, a method called “thinning” can be used. For details of thinning, refer to, for example, the reference document “OGATA, Yosihiko. On Lewis' simulation method for point processes. IEEE Transactions on Information Theory, 1981, 27.1: 23-31.”
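For concreteness, the thinning method mentioned above can be sketched generically as follows; lambda_fn stands for any intensity function (for example, the trained λ(t)), and lam_max must upper-bound it on [t0, t1]. The function name and the assumption of a known bound are ours, following the cited reference:

```python
import random

def simulate_thinning(lambda_fn, t0, t1, lam_max, seed=0):
    """Ogata-style thinning: draw candidate points from a homogeneous
    Poisson process of rate lam_max, then accept each candidate t with
    probability lambda_fn(t) / lam_max."""
    rng = random.Random(seed)
    t, events = t0, []
    while True:
        t += rng.expovariate(lam_max)   # waiting time to the next candidate
        if t >= t1:
            return events
        if rng.random() <= lambda_fn(t) / lam_max:
            events.append(t)            # candidate accepted as an event
```

For example, simulate_thinning(lambda t: 0.5, 0.0, 10.0, 1.0) samples from a homogeneous process of rate 0.5 on [0, 10].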
The output unit 105 outputs a result of prediction by the prediction unit 104. The output unit 105 may output the output result to any output destination. For example, the output unit 105 may display the prediction result on a display or the like, store the prediction result into a storage area such as an auxiliary storage device, print the prediction result from a printer or the like, output the prediction result as sound from a speaker or the like, or transmit the prediction result to an external device via the communication network.
The configuration of the event prediction device 10 described above is an example, and another configuration may be adopted.
<Parameter Training Processing>
Next, a flow of the processing of training the parameters of the intensity function λ(t) shown in the above expression (1) (i.e., the parameters of the neural network function f(t)) is described.
First, the acquisition unit 101 acquires the event history information D from the event history information storage device 20 (Step S101). At this time, the user of the event prediction device 10 may specify a range (e.g., temporal range, location range, etc.) to be acquired as the event history information D, for example.
Next, the parameter training unit 102 trains the parameters of the intensity function λ(t) shown in the above expression (1) by using the event history information D acquired in Step S101 described above (Step S102). At this time, the parameter training unit 102 trains the parameters by minimizing L shown in the above expression (4) through use of any known optimization method.
Then, the parameter training unit 102 stores the parameters (trained parameters) trained in Step S102 described above into the parameter storage unit 106 (Step S103). As a result, it is possible to predict occurrence of a future event with high accuracy by using the intensity function λ(t) with the set trained parameters even when the influence of an event (i.e., the mode of propagation) changes with time.
<Prediction Processing>
Next, a flow of the processing of predicting occurrence of an event by using the intensity function λ(t) with the set trained parameters is described.
First, the specification reception unit 103 receives specification of the prediction time point (Step S201). The prediction time point can be specified by the user on a user interface (UI) displayed on the display of the event prediction device 10, for example.
Next, the prediction unit 104 predicts occurrence of an event at the time point received in Step S201 described above by using the intensity function λ(t) with the set trained parameters stored in the parameter storage unit 106 (Step S202).
Then, the output unit 105 outputs the result of prediction in Step S202 described above to a predetermined output destination (Step S203).
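As a supplementary sketch of the probability computation mentioned for the prediction unit 104: conditional on the history up to the current time point s (and assuming no further events feed back into the intensity within the interval), the probability of at least one event in (s, tpred] is 1 − exp(−∫ λ(t)dt over (s, tpred]). A minimal numerical version (the function name and grid size are our assumptions):

```python
import math

def prob_event_by(lambda_fn, s, t_pred, n_grid=1000):
    """P(at least one event in (s, t_pred]) = 1 - exp(-integral of lambda),
    approximating the integral by a midpoint Riemann sum."""
    dt = (t_pred - s) / n_grid
    integral = sum(lambda_fn(s + (k + 0.5) * dt) for k in range(n_grid)) * dt
    return 1.0 - math.exp(-integral)

# e.g., a constant intensity of 0.5 over 2 time units:
# prob_event_by(lambda t: 0.5, 0.0, 2.0) -> about 1 - exp(-1) = 0.632
```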
<Hardware Configuration>
Finally, the hardware configuration of the event prediction device 10 according to this embodiment is described.
The event prediction device 10 according to this embodiment is realized by the hardware configuration of a general computer or computer system, and includes an input device 301, a display device 302, an external I/F 303, a communication I/F 304, a processor 305, and a memory device 306. These hardware components are communicably connected to one another.
The input device 301 is, for example, a keyboard, a mouse, a touch panel, or the like. The display device 302 is, for example, a display or the like. The event prediction device 10 may not include at least one of the input device 301 and the display device 302.
The external I/F 303 is an interface with an external device. The external device includes a recording medium 303a or the like. The event prediction device 10 can read/write data from/to the recording medium 303a via the external I/F 303. The recording medium 303a may store, for example, one or more programs for realizing each functional unit (acquisition unit 101, parameter training unit 102, specification reception unit 103, prediction unit 104, output unit 105, etc.) included in the event prediction device 10.
The recording medium 303a is, for example, a compact disc (CD), a digital versatile disc (DVD), a Secure Digital (SD) memory card, a Universal Serial Bus (USB) memory, or the like.
The communication I/F 304 is an interface for connecting the event prediction device 10 to the communication network. The event prediction device 10 can acquire the event history information from the event history information storage device 20 via the communication I/F 304. One or more programs that realize each functional unit of the event prediction device 10 may be acquired (downloaded) from a predetermined server device or the like via the communication I/F 304.
The processor 305 is, for example, a central processing unit (CPU), a graphics processing unit (GPU), or other various arithmetic devices. Each functional unit of the event prediction device 10 is realized by one or more programs stored in the memory device 306 or the like causing the processor 305 to execute processing.
The memory device 306 is, for example, a hard disk drive (HDD), a solid state drive (SSD), a random access memory (RAM), a read only memory (ROM), a flash memory, or other various storage devices. The parameter storage unit 106 of the event prediction device 10 can be realized by using the memory device 306.
The event prediction device 10 according to this embodiment has the hardware configuration described above, and can thereby implement the parameter training processing and the prediction processing described above.
The present invention is not limited to the above-mentioned embodiment serving as a specific disclosure, and various modifications, changes, combinations with known technologies, etc. are conceivable without departing from the scope of the claims.