The present invention relates to a point process learning method, a point process learning apparatus, and a program.
Predicting the occurrence of future events is important in various applications, and a model called a point process has conventionally been widely used for this purpose. Note that events are certain phenomena; examples thereof include device failures, human behaviors, crimes, earthquakes, infectious diseases, and the like.
Although many pieces of event data (i.e., event data representing a history of events that occurred in the past) and prior knowledge are required in order to predict the occurrence of future events by a point process, it may be difficult in reality to prepare these. For example, it is difficult to prepare many pieces of event data in a case where the phenomenon is a new one (e.g., an infectious disease caused by an unknown virus, usage status of a new service, etc.) and there are few events that have occurred in the past. Moreover, it is difficult to prepare the prior knowledge in a case where it is assumed that the occurrence tendency of the event is different from the past (e.g., a case where a service given in a region A is deployed in another region B, a case where a new law is enforced, etc.), for example.
An embodiment of the present invention has been made in view of the above points, and an object thereof is to accurately predict the occurrence of future events.
In order to achieve the above object, according to an embodiment, a point process learning method executed by a computer includes: an input procedure of inputting a learning data set including at least first event data representing a series of occurrences of first events; a division procedure of dividing the first event data included in the learning data set by using a prediction time observation area including at least a time series when predicting future event occurrence; and a learning procedure of learning a model parameter including a parameter of an intensity function of a predetermined point process model by using a divided learning data set divided in the division procedure.
It is possible to accurately predict the occurrence of future events.
Hereinafter, an embodiment of the present invention will be described. In the present embodiment, a point process learning apparatus 10 capable of accurately predicting the occurrence of future events by a point process even in a case where there is a small number of pieces of past event data and there is no prior knowledge regarding an event to be predicted will be described. Note that a learning time at which a parameter of a model (which will be hereinafter also referred to as a “prediction model”) is learned and a prediction time at which the occurrence of future events is predicted from a prediction model using a learned parameter exist in the point process learning apparatus 10 according to the present embodiment.
First, a hardware configuration of the point process learning apparatus 10 according to the present embodiment will be described with reference to
As illustrated in
The input device 11 is, for example, a keyboard, a mouse, a touch panel, or the like. The display device 12 is, for example, a display or the like. Note that the point process learning apparatus 10 may not have, for example, at least one of the input device 11 or the display device 12.
The external I/F 13 is an interface with an external device such as a recording medium 13a. The point process learning apparatus 10 can perform reading, writing, and the like to the recording medium 13a via the external I/F 13. Note that examples of the recording medium 13a include a compact disc (CD), a digital versatile disk (DVD), a secure digital memory card (SD memory card), a universal serial bus (USB) memory card, and the like.
The communication I/F 14 is an interface for connecting the point process learning apparatus 10 with a communication network. The processor 15 is, for example, one of various arithmetic devices such as a central processing unit (CPU) and a graphics processing unit (GPU). The memory device 16 is, for example, one of various storage devices such as a hard disk drive (HDD), a solid state drive (SSD), a random access memory (RAM), a read only memory (ROM), and a flash memory.
The point process learning apparatus 10 according to the present embodiment can implement learning processing and prediction processing to be described later by having the hardware configuration illustrated in
Next, symbols and the like to be used in the present embodiment are prepared.
The data set is denoted by D=(De, {Dc}cϵC). Here, De is event data, and Dc is auxiliary data related to the attribute cϵC. That is, the data set D includes the event data De and |C| pieces of auxiliary data.
The event data De is obtained by sorting a series of events in order of occurrence thereof, and is represented as:
D_e = \{x_n\}_{n=1}^{N}   [Math. 1]
N is the number of pieces of data (i.e., the number of occurrences of events) included in the event data, and xn represents an n-th event that has occurred. The xn is a d-dimensional real vector, that is:
x_n \in \mathbb{R}^d   [Math. 2]
Examples of xn and an event include:
Hereinafter, the above examples are assumed for the cases of d = 1 and d = 3. Moreover, in the following description, the element representing time among the elements of x_n is denoted by t, and the remaining elements are denoted by r.
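As an illustrative aid (not part of the claimed configuration; the variable names are hypothetical), the event data D_e for the case of d = 3 can be held as an N x 3 array whose rows are x_n = (time, latitude, longitude), sorted by time:

```python
import numpy as np

# Hypothetical event data D_e for d = 3: each row is x_n = (t, r1, r2),
# sorted in order of occurrence (first column is the time t).
D_e = np.array([
    [0.5, 0.10, 0.20],   # x_1
    [1.2, 0.40, 0.35],   # x_2
    [2.9, 0.15, 0.80],   # x_3
])
N, d = D_e.shape

# The element representing time is denoted t; the remaining elements are r.
t = D_e[:, 0]
r = D_e[:, 1:]

assert d == 3
assert np.all(np.diff(t) >= 0)  # events are sorted in order of occurrence
```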
The auxiliary data Dc is data other than the event, and is represented as:
D_c = \{(x_{cn}, a_{cn})\}_{n=1}^{N_c}   [Math. 3]
N_c is the number of pieces of data included in the auxiliary data regarding the attribute c ∈ C. Moreover, (x_{cn}, a_{cn}) represents the n-th pair of x_{cn} and a_{cn} with respect to the attribute c, where:
x_{cn} \in \mathbb{R}^{d_c}, \quad a_{cn} \in \mathbb{R}^{d_c^a}   [Math. 4]
Here, d_c (where d_c ≤ d) is the number of dimensions of x_{cn}, and d_c^a is the number of dimensions of a_{cn}.
Examples of x_{cn} and a_{cn} include:
However, d_c = 0 is allowed as a special case, in which a_{cn} is associated with the entire series (i.e., all x_n).
Note that, although the prediction accuracy is expected to improve when auxiliary data is present, the auxiliary data may be absent (in which case C = ∅).
Moreover, it is assumed that the values of x_n (and x_{cn}) are normalized for each data set so as to have a common domain across data sets. For example, in the case of d = 3, the time t is normalized to represent the elapsed time from the observation start time point of the events (t = 0 at the start). Moreover, the latitude and longitude are normalized to [0, 1] (i.e., 0 ≤ r1, r2 ≤ 1, where the latitude is denoted by r1 and the longitude by r2).
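A minimal sketch of the normalization just described (the helper name and coordinate ranges are assumptions for illustration): time is shifted so the observation start becomes t = 0, and latitude/longitude are min-max scaled into [0, 1] per data set.

```python
import numpy as np

def normalize_events(events, lat_range, lon_range):
    """Normalize raw (time, lat, lon) events to a common domain.

    Time becomes the elapsed time from the observation start (t = 0);
    latitude/longitude are min-max scaled into [0, 1] using the data
    set's coordinate ranges (hypothetical helper, d = 3 case).
    """
    out = np.asarray(events, dtype=float).copy()
    out[:, 0] -= out[:, 0].min()  # t = 0 at the observation start
    out[:, 1] = (out[:, 1] - lat_range[0]) / (lat_range[1] - lat_range[0])
    out[:, 2] = (out[:, 2] - lon_range[0]) / (lon_range[1] - lon_range[0])
    return out

raw = [[100.0, 35.0, 139.0], [103.0, 35.5, 139.5], [107.0, 36.0, 140.0]]
norm = normalize_events(raw, lat_range=(35.0, 36.0), lon_range=(139.0, 140.0))
# First event is now at t = 0; coordinates lie in [0, 1].
assert norm[0, 0] == 0.0
assert np.all((norm[:, 1:] >= 0.0) & (norm[:, 1:] <= 1.0))
```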
It is assumed that the following two areas are given as d-dimensional areas.
Prediction time observation area: X_o \subset \mathbb{R}^d
Prediction target area: X_t \subset \mathbb{R}^d   [Math. 5]
The prediction time observation area is an area in which the occurrence of events is observed at the time of prediction (i.e., at the time of predicting the occurrence of future events). On the other hand, the prediction target area is an area for which the occurrence of future events is predicted. Note that outline characters are displayed as normal characters in the text of the specification; for example, the prediction time observation area is denoted by Xo, and the prediction target area is denoted by Xt.
Examples of the prediction time observation area Xo and the prediction target area Xt in the case of d=3 include the following.
X_o = \{(t, r_1, r_2) \mid 0 \le t \le 5,\ 0 \le r_1, r_2 \le 1\}
X_t = \{(t, r_1, r_2) \mid 5 \le t \le 1000,\ 0 \le r_1, r_2 \le 1\}   [Math. 6]
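These areas can be represented programmatically as simple membership predicates (an illustrative sketch; the function names are not from the specification):

```python
# Illustrative membership tests for the d = 3 areas above.
def in_X_o(x):
    """Prediction time observation area: 0 <= t <= 5, coords in [0, 1]."""
    t, r1, r2 = x
    return 0 <= t <= 5 and 0 <= r1 <= 1 and 0 <= r2 <= 1

def in_X_t(x):
    """Prediction target area: 5 <= t <= 1000, coords in [0, 1]."""
    t, r1, r2 = x
    return 5 <= t <= 1000 and 0 <= r1 <= 1 and 0 <= r2 <= 1

# An event observed at t = 3 lies in the prediction time observation area;
# an event at t = 42 lies only in the prediction target area.
assert in_X_o((3.0, 0.2, 0.9)) and not in_X_t((3.0, 0.2, 0.9))
assert in_X_t((42.0, 0.2, 0.9)) and not in_X_o((42.0, 0.2, 0.9))
```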
It is assumed that |S| data sets {Ds}sϵS are given at the time of learning. Here,
D^s = (D_e^s, \{D_c^s\}_{c \in C})
D_e^s = \{x_n^s\}_{n=1}^{N^s}
D_c^s = \{(x_{cn}^s, a_{cn}^s)\}_{n=1}^{N_c^s}   [Math. 7]
Note that the data set {Ds}sϵS is also referred to as a “learning data set”.
At the time of prediction, it is assumed that a data set Ds* (where s* is an element not included in S) and a prediction target area Xt are given. Here,
D^{s*} = (D_e^{s*}, \{D_c^{s*}\}_{c \in C})
D_e^{s*} = \{x_n^{s*}\}_{n=1}^{N^{s*}}
D_c^{s*} = \{(x_{cn}^{s*}, a_{cn}^{s*})\}_{n=1}^{N_c^{s*}}   [Math. 8]
However, Ns* is a relatively small natural number (e.g., Ns*=5, Ns*=10, or the like). Note that the data set Ds* is also referred to as a “prediction data set”.
At this time, the object is to accurately predict the events

\{x_n^{s*}\}_{n = N^{s*}+1}   [Math. 9]

that occur in the prediction target area Xt.
Note that each of the event data Des is a series of occurrences of first events used for learning of the prediction model, and the event data Des* is a series of occurrences of second events to be predicted. In the present embodiment, it is assumed that the first events and the second events are different events.
Hereinafter, the prediction model will be described. The prediction model includes the following latent vector z and intensity function λ, and the occurrence of events is predicted at the time of prediction by the prediction method described below.
The latent vector z is defined below.
z = f_z([f_e(\{x_n\}_{n=1}^{N}), \{f_c(\{(x_{cn}, a_{cn})\}_{n=1}^{N_c})\}_{c \in C}])   [Math. 10]
Here, [·,·] represents vector concatenation.
Moreover, fe is a function that outputs a ke-dimensional vector with an arbitrary number of events as an input. As fe, for example, a recurrent neural network (RNN), an attention model-based neural network, or the like can be used.
The fc is a function that outputs a kc-dimensional vector with auxiliary data as an input. A specific function to be used as fc depends on the format of the auxiliary data. In the case of the above-described image such as a satellite image, for example, a convolutional neural network (CNN) or the like is used as fc. Moreover, in the case of the series data (e.g., sensor data, etc.), for example, CNN, RNN, or the like is used as fc. In addition, a fully connected layer neural network, attention model-based neural network, or the like may be used as fc according to the format of the auxiliary data.
The fz is a function that outputs a K-dimensional vector with a (ke+ΣcϵCkc)-dimensional vector as an input. As fz, for example, a fully connected layer neural network can be used.
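A minimal numpy sketch of the latent vector computation (Formula 10) under simplifying assumptions: f_e is a tiny vanilla RNN over the event series, f_c a linear map over mean-pooled auxiliary pairs for a single attribute, and f_z a one-layer fully connected network. All weights are random placeholders rather than learned parameters; every name here is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, d_a, k_e, k_c, K = 3, 2, 4, 4, 5

# f_e: a tiny vanilla RNN mapping a variable-length event series to R^{k_e}.
W_h = rng.normal(size=(k_e, k_e)) * 0.1
W_x = rng.normal(size=(k_e, d)) * 0.1
def f_e(events):
    h = np.zeros(k_e)
    for x in events:                   # the series may have any length N
        h = np.tanh(W_h @ h + W_x @ x)
    return h

# f_c: linear map over mean-pooled pairs (x_cn, a_cn) -> R^{k_c}.
W_c = rng.normal(size=(k_c, d + d_a)) * 0.1
def f_c(aux_pairs):
    pooled = np.mean([np.concatenate([x, a]) for x, a in aux_pairs], axis=0)
    return W_c @ pooled

# f_z: one fully connected layer from the (k_e + k_c)-dim concatenation to R^K.
W_z = rng.normal(size=(K, k_e + k_c)) * 0.1
def f_z(v):
    return np.tanh(W_z @ v)

events = rng.uniform(size=(6, d))
aux = [(rng.uniform(size=d), rng.uniform(size=d_a)) for _ in range(4)]
z = f_z(np.concatenate([f_e(events), f_c(aux)]))   # Formula 10
assert z.shape == (K,)
```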
Note that the definition of the latent vector z represented in the above Formula 10 is an example; for example, the event data may not be used, that is, the term

f_e(\{x_n\}_{n=1}^{N})   [Math. 11]

may be omitted from the concatenation.
The intensity function λ is defined below.
\lambda(x \mid \{x_n\}_{n=1}^{N}, \{\{(x_{cn}, a_{cn})\}_{n=1}^{N_c}\}_{c \in C}, z; \theta)   [Math. 12]
Here, θ is all the parameters in the intensity function.
Note that the definition of the intensity function λ represented in the above Formula 12 is an example; for example, the auxiliary data

\{\{(x_{cn}, a_{cn})\}_{n=1}^{N_c}\}_{c \in C}   [Math. 13]

may be used only partially or not used at all.
Moreover, although the intensity function λ is a function that characterizes a point process model, the present embodiment is applicable to an arbitrary point process model. As an example, a point process model and an intensity function λ that characterizes the point process model are shown below.
An extension of the Hawkes process using a neural network.
At this time, the intensity function λ is represented as follows.
\lambda(x \mid \{x_n\}_{n=1}^{N}, z; \theta) = f_b(z) + \sum_{x_n < x} g(x, x_n; z)   [Math. 14]
Here,
g(x, x'; z) = \exp(-\| f_l([x, z]) - f_l([x', z]) \|^2)   [Math. 15]
Moreover, f_l (the subscript is a lowercase L) is an arbitrary neural network, and f_b is an arbitrary neural network whose output is a positive scalar value.
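Under the same placeholder assumptions as before (random, unlearned weights; temporal case d = 1), the Hawkes-type intensity above with the kernel g of Math. 15 can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(1)
K = 5
z = rng.normal(size=K) * 0.1  # latent vector (placeholder)

# f_l: arbitrary network on the concatenation [x, z]; here one linear layer.
W_l = rng.normal(size=(4, 1 + K)) * 0.1
def f_l(x):
    return W_l @ np.concatenate([[x], z])

# f_b: network with a positive scalar output; softplus guarantees positivity.
w_b = rng.normal(size=K) * 0.1
def f_b(z_vec):
    return np.log1p(np.exp(w_b @ z_vec))  # softplus > 0

def g(x, x_prime):
    """Kernel of Math. 15: similarity of x and x' in the f_l feature space."""
    diff = f_l(x) - f_l(x_prime)
    return np.exp(-np.dot(diff, diff))

def intensity(x, history):
    """Base rate plus excitation from all past events x_n < x."""
    return f_b(z) + sum(g(x, xn) for xn in history if xn < x)

history = [0.3, 0.9, 2.1]
lam = intensity(2.5, history)
assert lam > 0.0  # an intensity is always positive
```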
A spatiotemporal extension of the above.
It is represented as x=(t, r) with t as time and r as position coordinates (e.g., latitude and longitude). At this time, the intensity function λ is represented as follows.
\lambda((t, r) \mid \{x_n\}_{n=1}^{N}, z; \theta) = f_b([r, z]) + \sum_{t_n < t} g_1(r, r_n; z)\, g_2(t, t_n; z)   [Math. 16]
Here,
g_1(r, r'; z) = \exp(-\| f_{l_1}([r, z]) - f_{l_1}([r', z]) \|^2)
g_2(t, t'; z) = \exp(-\| f_{l_2}([t, z]) - f_{l_2}([t', z]) \|^2)   [Math. 17]
Moreover,

f_{l_1}, f_{l_2}   [Math. 18]

are arbitrary neural networks.
In the process of predicting the occurrence of events, the occurrence of the events may be predicted by the prediction likelihood determined from the above intensity function λ, or may be predicted by simulation using the above intensity function λ.
The prediction likelihood determined from the above intensity function λ is defined below.
p(\{x_n \mid x_n \in X_t\} \mid \{x_n \mid x_n \in X_o\}, \{\{(x_{cn}, a_{cn}) \mid x_{cn} \in X_o\}\}_{c \in C})   [Math. 19]
On the other hand, as a simulation using the above intensity function λ, an existing technique described in, for example, the reference literature "Ogata, Y., 'On Lewis' simulation method for point processes,' IEEE Transactions on Information Theory, 27(1), 23-31 (1981)" or the like may be used.
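The thinning (rejection) method of the cited Ogata (1981) reference can be sketched as follows for a temporal intensity with a known upper bound; this is an illustrative implementation with hypothetical names, not the one used by the apparatus:

```python
import math
import random

def simulate_thinning(intensity, lam_max, t_end, history=None, seed=0):
    """Ogata-style thinning: propose candidate times from a homogeneous
    process of rate lam_max, and accept each candidate t with probability
    intensity(t, history) / lam_max. The intensity must never exceed
    lam_max on [0, t_end]."""
    rnd = random.Random(seed)
    history = list(history or [])
    t = 0.0
    accepted = []
    while True:
        t += rnd.expovariate(lam_max)  # candidate inter-arrival time
        if t > t_end:
            break
        if rnd.random() <= intensity(t, history) / lam_max:
            accepted.append(t)
            history.append(t)  # accepted events can feed back into the intensity
    return accepted

# Example: a deterministic bounded intensity (history unused), lam_max = 2.5.
def lam(t, hist):
    return 1.5 + math.cos(t)  # always in [0.5, 2.5]

events = simulate_thinning(lam, lam_max=2.5, t_end=10.0)
assert all(0.0 < a <= 10.0 for a in events)
assert events == sorted(events)
```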
Next, a functional configuration of the point process learning apparatus 10 at the time of learning will be described with reference to
As illustrated in
Moreover, the point process learning apparatus 10 at the time of learning has a storage unit 110. The storage unit 110 is implemented by, for example, the memory device 16. However, the storage unit 110 may be implemented by, for example, a storage device (e.g., a database server, etc.) connected with the point process learning apparatus 10 via a communication network.
The storage unit 110 stores a learning data set {Ds}sϵS for learning a parameter (which will be hereinafter referred to as a "model parameter") of the prediction model.
The selection unit 101 randomly selects one data set Ds from the learning data set {Ds}sϵS stored in the storage unit 110.
The division unit 102 determines a learning observation area Xo′ from the prediction time observation area Xo, and uses the learning observation area Xo′ to divide the event data Des and the auxiliary data {Dcs}cϵC included in the data set Ds = (Des, {Dcs}cϵC). At this time, the division unit 102 divides the data into three parts: the event data Deso′ and auxiliary data {Dcso′}cϵC corresponding to the learning observation area Xo′, the event data Dest later than the learning observation area Xo′, and the other data.
Note that a specific division method will be described later.
The feature extraction unit 103 calculates the latent vector zso by the above Formula 10 using the event data Deso′ and the auxiliary data {Dcso′}cϵC corresponding to the learning observation area Xo′.
The intensity function estimation unit 104 calculates the intensity function λ by the above Formula 12 using the event data Deso′ and the auxiliary data {Dcso′}cϵC corresponding to the learning observation area Xo′ and the latent vector zso.
The parameter update unit 105 updates the model parameters (i.e., the parameters of the neural network such as fe, fc, and fz, and the parameter θ of the intensity function λ) so as to minimize an error from the event data Dest later than the learning observation area Xo′. At this time, when the prediction likelihood is used, the negative log likelihood of p(Dest|Deso′, {Dcso′}cϵC) may be minimized. Note that the prediction likelihood may be p(Dest, Deso′|Deso′, {Dcso′}cϵC) (that is, Deso′ may be used at the time of calculating the likelihood). On the other hand, in the case of prediction by simulation, an error between the result and Dest may be minimized.
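For a temporal point process, the negative log likelihood referred to above takes the standard form -Σ_n log λ(t_n) + ∫ λ(t) dt over the target window. A minimal numerical sketch (illustrative only; the grid-based integration and function names are assumptions, not the apparatus's implementation):

```python
import numpy as np

def negative_log_likelihood(intensity, events, t_start, t_end, n_grid=1000):
    """NLL of a temporal point process on [t_start, t_end]:
    -sum_n log lambda(t_n) + integral of lambda over the window,
    with the integral approximated by the trapezoidal rule."""
    events = np.asarray(events, dtype=float)
    log_term = float(np.sum(np.log([intensity(t) for t in events])))
    grid = np.linspace(t_start, t_end, n_grid)
    vals = np.array([intensity(t) for t in grid])
    integral = float(np.sum((vals[1:] + vals[:-1]) * 0.5 * np.diff(grid)))
    return -log_term + integral

# Sanity check with a homogeneous intensity lambda = 2 on [0, 5]:
# the NLL is exactly -N log 2 + 2 * 5.
nll = negative_log_likelihood(lambda t: 2.0, events=[0.7, 1.9, 4.2],
                              t_start=0.0, t_end=5.0)
assert abs(nll - (10.0 - 3 * np.log(2.0))) < 1e-6
```

Minimizing this quantity with respect to the model parameters (e.g., by the gradient method) is one concrete instance of the update performed by the parameter update unit 105.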
Next, the learning processing according to the present embodiment will be described with reference to
First, the selection unit 101 randomly selects one data set Ds from the learning data set {Ds}sϵS stored in the storage unit 110 (step S101).
Next, the division unit 102 determines a learning observation area Xo′ from the prediction time observation area Xo (step S102). Here, the learning observation area Xo′ is determined by the following determination method with reference to the prediction time observation area Xo.
As an example, an example of the learning observation area Xo′ in a case where Xo={(t, r1, r2)|0≤t≤5, 0≤r1, r2≤1} is satisfied will be described below.
X_o' = \{(t, r_1, r_2) \mid 3 \le t \le 8,\ 0 \le r_1, r_2 \le 1\}
X_o' = \{(t, r_1, r_2) \mid 4 \le t \le 9,\ 0 \le r_1, r_2 \le 1\}
X_o' = \{(t, r_1, r_2) \mid 5 \le t \le 10,\ 0 \le r_1, r_2 \le 1\}
Next, the division unit 102 divides the event data Des and the auxiliary data {Dcs}cϵC included in the data set Ds = (Des, {Dcs}cϵC) using the learning observation area Xo′ (step S103). That is, the division unit 102 divides the event data Des into three parts: event data Deso′ corresponding to the learning observation area Xo′, event data Dest later than the learning observation area Xo′, and other data. Similarly, the division unit 102 divides the auxiliary data {Dcs}cϵC into auxiliary data {Dcso′}cϵC corresponding to the learning observation area Xo′ and other data. The three pieces of data Deso′, Dest, and {Dcso′}cϵC are used in the processing described later; the other data are not used.
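Steps S102 and S103 can be sketched as follows for the temporal dimension, assuming (as in the examples above) that Xo′ is a window of the same time width as Xo placed at a shifted start; all names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def choose_window(width=5.0, max_start=5.0):
    """Pick a learning observation window X_o' = [t0, t0 + width] whose
    time width matches the prediction time observation area X_o."""
    t0 = rng.uniform(0.0, max_start)
    return t0, t0 + width

def divide(events, t0, t1):
    """Split event data by time into: D_eso' (inside X_o'),
    D_est (later than X_o'), and the remaining (earlier) data."""
    t = events[:, 0]
    inside = events[(t0 <= t) & (t <= t1)]
    later = events[t > t1]
    other = events[t < t0]
    return inside, later, other

t0, t1 = choose_window()
assert abs((t1 - t0) - 5.0) < 1e-9  # same width as X_o

# Synthetic events at t = 0, 1, ..., 12 with dummy coordinates.
events = np.column_stack([np.linspace(0.0, 12.0, 13),
                          np.zeros(13), np.zeros(13)])
inside, later, other = divide(events, 3.0, 8.0)
assert len(inside) + len(later) + len(other) == len(events)
assert np.all(later[:, 0] > 8.0)
```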
Next, the feature extraction unit 103 calculates the latent vector zso by the above Formula 10 using the event data Deso′ and the auxiliary data {Dcso′}cϵC corresponding to the learning observation area Xo′(step S104). That is, the feature extraction unit 103 calculates the latent vector zso by the following formula.
z^{so} = f_z([f_e(D_e^{so'}), \{f_c(D_c^{so'})\}_{c \in C}])
Note that, as described above, the latent vector zso may be calculated without using the event data Deso′ in a case where auxiliary data is given, or the latent vector zso may be calculated only using the event data Deso′ in a case where no auxiliary data is given.
Next, the intensity function estimation unit 104 calculates the intensity function λ by the above Formula 12 using the event data Deso′ and the auxiliary data {Dcso′}cϵC corresponding to the learning observation area Xo′ and the latent vector zso (step S105). That is, the intensity function estimation unit 104 calculates λ(x|Deso′, {Dcso′}cΣC, zso). Note that, as described above, the auxiliary data {Dcso′}cϵC may be used only partially or not used at all.
Next, the parameter update unit 105 calculates an error from event data Dest later than the learning observation area Xo′ (step S106). Note that, as described above, the negative log likelihood of the prediction likelihood p(Dest|Deso′, {Dcso′}cϵC) may be used as the error, or the error between the simulation result and Dest may be used as the error.
Then, the parameter update unit 105 updates the model parameter so as to minimize the error calculated in the above step S106 using, for example, the gradient method (step S107).
As described above, the point process learning apparatus 10 according to the present embodiment can learn the parameters (i.e., the parameters of the neural network such as fe, fc, and fz, and the parameter θ of the intensity function λ) of the prediction model. At this time, as described in the above steps S102 to S103, the point process learning apparatus 10 according to the present embodiment divides the data set Ds using the learning observation area Xo′ determined from the prediction time observation area Xo, and then calculates the intensity function, the prediction likelihood, and the like using the divided data set. As a result, it is possible to accurately predict the occurrence of future events even if the number of pieces of event data given at the time of prediction is small.
Next, a functional configuration of the point process learning apparatus 10 at the time of prediction will be described with reference to
As illustrated in
Moreover, the point process learning apparatus 10 at the time of prediction has the storage unit 110. The storage unit 110 is implemented by, for example, the memory device 16. However, the storage unit 110 may be implemented by, for example, a storage device (e.g., a database server, etc.) connected with the point process learning apparatus 10 via a communication network.
The storage unit 110 stores a prediction data set Ds* for predicting events that occur in the prediction target area Xt.
The feature extraction unit 103 calculates the latent vector zs* by the above Formula 10 using the event data Des* and the auxiliary data {Dcs*}cϵC included in the prediction data set Ds*. Here, the already learned parameters of the neural networks such as fe, fc, and fz are used.
The intensity function estimation unit 104 calculates the intensity function λ by the above Formula 12 using the event data Des* and the auxiliary data {Dcs*}cϵC included in the prediction data set Ds*, and the latent vector zs*. Here, the already learned parameter θ of the intensity function λ is used.
The prediction unit 106 predicts events that occur in the prediction target area Xt by the intensity function λ.
Next, prediction processing according to the present embodiment will be described with reference to
First, the feature extraction unit 103 calculates the latent vector zs* by the above Formula 10 using the event data Des* and the auxiliary data {Dcs*}cϵC included in the prediction data set Ds* (step S201). That is, the feature extraction unit 103 calculates the latent vector zs* by the following formula.
z^{s*} = f_z([f_e(D_e^{s*}), \{f_c(D_c^{s*})\}_{c \in C}])
Note that, as described above, the latent vector zs* may be calculated without using the event data Des* in a case where auxiliary data is given, or the latent vector zs* may be calculated only using the event data Des* in a case where no auxiliary data is given.
Next, the intensity function estimation unit 104 uses the event data Des* and the auxiliary data {Dcs*}cϵC included in the prediction data set Ds* and the latent vector zs* to calculate the intensity function λ by the above Formula 12 (step S202). That is, the intensity function estimation unit 104 calculates λ(x|Des*, {Dcs*}cϵC, zs*). Note that, as described above, the auxiliary data {Dcs*}cϵC may be used only partially or not used at all.
Then, the prediction unit 106 predicts events that occur in the prediction target area Xt by the intensity function λ(x|Des*, {Dcs*}cϵC, zs*) (step S203).
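As one illustrative use of the estimated intensity function by the prediction unit 106 (the specification leaves the concrete method open to likelihood- or simulation-based approaches), the expected number of events in a temporal prediction target area can be computed as the integral of λ over that area; the names below are assumptions:

```python
import numpy as np

def expected_event_count(intensity, t_start, t_end, n_grid=1000):
    """Expected number of events in a temporal prediction target area
    [t_start, t_end]: the integral of the intensity over the window,
    approximated by the trapezoidal rule."""
    grid = np.linspace(t_start, t_end, n_grid)
    vals = np.array([intensity(t) for t in grid])
    return float(np.sum((vals[1:] + vals[:-1]) * 0.5 * np.diff(grid)))

# With a constant intensity of 0.4 events per unit time on X_t = [5, 15],
# about 4 events are expected.
count = expected_event_count(lambda t: 0.4, 5.0, 15.0)
assert abs(count - 4.0) < 1e-6
```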
As described above, the point process learning apparatus 10 according to the present embodiment can predict events that occur in the prediction target area Xt using the prediction data set Ds* including a relatively small number of pieces of data.
<Extension to Marked Point Process>
The embodiment described above can be easily extended to an arbitrary marked point process. In the marked point process, the event data De is given below.
D_e = \{(x_n, y_n)\}_{n=1}^{N}   [Math. 20]
Note that the mark y_n may be discrete or continuous, and may be multidimensional.
By replacing the event data De in the embodiment described above with the event data De represented in the above Formula 20, the embodiment is extended to an arbitrary marked point process.
As an example of the above embodiment, example data for a case where the events to be predicted are "occurrences of people infected with a new infectious disease B* in a region A* in the next half year" are shown below. At this time, the event data De = {xn} has xn = (time, latitude, longitude).
Example of a learning data set: series of occurrences of events of people infected with other infectious diseases B1, . . . , BN′ in other regions A1, . . . , AN′ (e.g., one year of data for each)
Example of auxiliary data: real-time demographic data, map data showing public transportation, and climate information (e.g., the highest temperature, the lowest temperature, the humidity, and the like in the region)
Example of a mark when applied to a marked point process: the gender, age, and occupation of the infected person
Example of a prediction data set: a series of occurrences of events of people infected with the new infectious disease B* in the region A* for the past one week, together with the above-described auxiliary data for the same period or independent of time (e.g., real-time demographic data and climate information as auxiliary data for the same period as the event series, and map data indicating public transportation as time-independent auxiliary data)
The present invention is not limited to the above embodiment specifically disclosed, and various modifications and changes, combinations with known technique, and the like can be made without departing from the scope of the claims.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2020/045033 | 12/3/2020 | WO |