ESTIMATION METHOD, ESTIMATION APPARATUS AND PROGRAM

TECHNICAL FIELD

The present invention relates to an estimation method, an estimation apparatus, and a program.

BACKGROUND ART

These days, with the development of sensors and information communication technologies, various data have been collected on a large scale; however, it is often the case that not individual data but aggregated data alone is available due to consideration for privacy, difficulty in observation, etc. For example, human position information obtained by observing radio waves from a GPS (Global Positioning System) satellite may be provided as time-based area population data in which individuals cannot be tracked in consideration for privacy. The time-based area population data is information indicating the number of people in each area at each time step.

A model called a CGM (Collective Graphical Model) is known and widely used as a model with which even in a situation where only thus aggregated data (hereinafter, also referred to as aggregate data) is available, deeper information can be extracted from the data by performing interpolation, estimation, and learning on the basis of probabilistic modeling (see, for example, Non Patent Literature 1).

One operation in the CGM is to estimate the true contingency table of the graphical model by MAP estimation (maximum a posteriori estimation) when observed aggregate data and the potentials of the underlying graphical model are given. MAP estimation is essential for interpolation and estimation and is also used as a subroutine of learning; it has therefore been studied extensively.

CITATION LIST
Non Patent Literature

Non Patent Literature 1: D. R. Sheldon and T. G. Dietterich. Collective Graphical Models. In Proceedings of the 24th International Conference on Neural Information Processing Systems, pp. 1161-1169, 2011

SUMMARY OF INVENTION
Technical Problem

Meanwhile, when estimating a true contingency table of a graphical model by MAP estimation, it is necessary to solve a MAP estimation problem. At this time, conventionally, a technique of solution by applying Stirling's approximation and continuous relaxation has been mainly used. However, this technique may have low solution accuracy or low solution interpretability.

An embodiment of the present invention has been made in view of the above points, and an object of the present invention is to obtain a MAP estimation solution with high accuracy and high interpretability.

Solution to Problem

Aiming to achieve the above object, an estimation method according to an embodiment is an estimation method that estimates a MAP solution of a CGM on a path graph, in which a computer executes: an input procedure that receives, as inputs, aggregate data and potentials of the CGM on the path graph; an estimation procedure that uses the aggregate data and the potentials to solve a MAP estimation problem of the CGM by a technique based on discrete DC programming and calculates a MAP estimation solution; and an output procedure that outputs the MAP estimation solution.

Advantageous Effects of Invention

A MAP estimation solution with high accuracy and high interpretability can be obtained.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a diagram showing an example of a minimum cost flow problem.

FIG. 2 is a diagram showing an example of a hardware configuration of a MAP estimation apparatus according to the present embodiment.

FIG. 3 is a diagram showing an example of a functional configuration of a MAP estimation apparatus according to the present embodiment.

FIG. 4 is a diagram showing an example of potentials.

FIG. 5 is a diagram showing an example of aggregate data.

FIG. 6 is a diagram showing an example of a MAP estimation solution.

FIG. 7 is a flowchart showing an example of MAP estimation processing according to the present embodiment.

FIG. 8 is a flowchart showing an example of the processing of calculating a MAP estimation solution by a technique based on discrete DC programming according to the present embodiment.

DESCRIPTION OF EMBODIMENTS

Hereinbelow, an embodiment of the present invention is described. In the present embodiment, a MAP estimation apparatus 10 is described with which when observed aggregate data and potentials of a graphical model are given and it is attempted to estimate a true contingency table of the graphical model by MAP estimation, a MAP estimation solution with high accuracy and high interpretability can be obtained.

First, a CGM and MAP estimation are briefly described.

Letting H=(N, A) be an undirected graph, a graphical model expressed by a probability mass function like below will now be considered. Note that N is a set of vertices, and A is a set of edges.

$\begin{matrix} \begin{matrix} p (x; θ) = \Pr (X = x; θ) \\ = \frac{1}{Z (θ)} \prod_{(i, j) \in A} ϕ_{ij} (x_{i}, x_{j}; θ) \end{matrix} & [Math . 1] \end{matrix}$

where φ_ij(x_i, x_j; θ) is a local potential defined for variables (X_i, X_j). The local potential is controlled by a parameter vector θ, and Z(θ) is a normalization constant (partition function). It is assumed that the random variable X_i takes a value in a finite set

χ_i [Math. 2]

Samples from this graphical model are represented by x⁽¹⁾, . . . , x^(M). A contingency table n_i regarding a vertex i and a contingency table n_ij regarding an edge (i, j) are defined as follows, respectively.

$\begin{matrix} n_{i} = (n_{i}(x_{i}) \mid x_{i} \in χ_{i}), \quad n_{ij} = (n_{ij}(x_{i}, x_{j}) \mid x_{i} \in χ_{i}, x_{j} \in χ_{j}) & [Math . 3] \\ \text{where} & \\ n_{i}(x_{i}) = \sum_{m=1}^{M} 𝕀(X_{i}^{(m)} = x_{i}), \quad n_{ij}(x_{i}, x_{j}) = \sum_{m=1}^{M} 𝕀(X_{i}^{(m)} = x_{i}, X_{j}^{(m)} = x_{j}) & [Math . 4] \end{matrix}$

provided that

𝕀(⋅) [Math. 5]

is an indicator function. Further, a vector in which the contingency table n_i regarding the vertex i and the contingency table n_ij regarding the edge (i, j) are collected and arranged for all the vertices and all the edges is represented by n.
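As an illustration of the definitions in Math. 3 and Math. 4, the following minimal sketch counts the vertex and edge contingency tables from complete samples. The function name `contingency_tables`, the 0-indexed vertices, and the sample data are assumptions made for this example and are not part of the embodiment.

```python
from collections import Counter

def contingency_tables(samples, edges):
    """Count node and edge contingency tables from complete samples.

    samples: list of tuples, one value per vertex (vertices are 0-indexed here).
    edges:   list of (i, j) vertex pairs.
    """
    num_vertices = len(samples[0])
    node_tables = {i: Counter() for i in range(num_vertices)}
    edge_tables = {e: Counter() for e in edges}
    for x in samples:
        for i in range(num_vertices):
            node_tables[i][x[i]] += 1               # n_i(x_i)
        for (i, j) in edges:
            edge_tables[(i, j)][(x[i], x[j])] += 1  # n_ij(x_i, x_j)
    return node_tables, edge_tables

# Hypothetical data: M = 4 samples on a path graph with 3 vertices.
samples = [(1, 1, 2), (1, 2, 2), (2, 2, 2), (1, 1, 1)]
edges = [(0, 1), (1, 2)]
node_tables, edge_tables = contingency_tables(samples, edges)
print(node_tables[0])
print(edge_tables[(0, 1)])
```

Each vertex table sums to M, and each vertex count equals the corresponding marginal of the adjacent edge table. These are the consistency constraints that the CGM distribution imposes on n.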

At this time, a distribution of n (this is referred to as a CGM distribution) can be written as

$\begin{matrix} p(n; θ) = M! \cdot \frac{\prod_{i \in N} \prod_{x_{i} \in χ_{i}} (n_{i}(x_{i})!)^{ν_{i} - 1}}{\prod_{(i, j) \in A} \prod_{x_{i} \in χ_{i}, x_{j} \in χ_{j}} n_{ij}(x_{i}, x_{j})!} \cdot g(n, θ) \cdot 𝕀(n \in 𝕃_{M}^{ℤ}) & [Math . 6] \end{matrix}$

$g(n, θ) = \frac{1}{Z(θ)^{M}} \prod_{(i, j) \in A} \prod_{x_{i}, x_{j}} ϕ_{ij}(x_{i}, x_{j}; θ)^{n_{ij}(x_{i}, x_{j})}$

$𝕃_{M}^{ℤ} = \{ n \in ℤ_{\geq 0}^{|n|} \mid M = \sum_{x_{i}} n_{i}(x_{i}) \; \forall i \in N, \; n_{i}(x_{i}) = \sum_{x_{j}} n_{ij}(x_{i}, x_{j}) \; \forall i \in N, x_{i} \in χ_{i}, j \in N(i) \},$

where ν_i is the degree of the vertex i.

Further, it is assumed that an observation value y (that is, observed aggregate data) is generated from some probability distribution p(y|n) indicating observation noise. Specifically, it is assumed that the value n_i is observed at each vertex in accordance with the following observation noise model.

$\begin{matrix} p_{node} (y ❘ n) = \prod_{i \in N, x_{i}} p_{i, x_{i}} (y_{i} (x_{i}) ❘ n_{i} (x_{i})) & [Math . 7] \end{matrix}$

provided that it is assumed that

−log p_i,x_i(y_i(x_i)|n_i(x_i)) [Math. 8]

is a convex function with respect to n_i(x_i). A posterior distribution of n is given by p(n|y; θ) ∝ p(n; θ)·p(y|n). Then, a MAP estimation problem is expressed as max_n p(n|y; θ).

The above MAP estimation problem will now be modified. When the MAP estimation problem is regarded as a minimization problem of −log p(n|y; θ), the following minimization problem is to be solved.

$\begin{matrix} \min_{n} \; ℒ(n) & \\ \text{s.t.} \quad M = \sum_{x_{i}} n_{i}(x_{i}) \quad (\forall i \in N) & \\ n_{i}(x_{i}) = \sum_{x_{j}} n_{ij}(x_{i}, x_{j}) \quad (\forall (i, j) \in A, \forall x_{i} \in χ_{i}) & \\ n_{i}(x_{i}) \in ℤ_{\geq 0} \quad (\forall i \in N, \forall x_{i} \in χ_{i}) & \\ n_{ij}(x_{i}, x_{j}) \in ℤ_{\geq 0} \quad (\forall (i, j) \in A, \forall x_{i} \in χ_{i}, \forall x_{j} \in χ_{j}) & [Math . 9] \end{matrix}$

where

$\begin{matrix} ℒ(n) := \sum_{(i, j) \in A} \sum_{x_{i} \in χ_{i}} \sum_{x_{j} \in χ_{j}} [\log n_{ij}(x_{i}, x_{j})! - n_{ij}(x_{i}, x_{j}) \log ϕ_{ij}(x_{i}, x_{j})] - \sum_{i \in N} \sum_{x_{i} \in χ_{i}} (ν_{i} - 1) \log n_{i}(x_{i})! - \sum_{i \in N} \sum_{x_{i} \in χ_{i}} \log [p_{i, x_{i}}(y_{i}(x_{i}) \mid n_{i}(x_{i}))] & [Math . 10] \end{matrix}$
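As a brief sketch of how the objective function in Math. 10 arises, the negative logarithm of the posterior can be expanded using the CGM distribution of Math. 6 and the noise model of Math. 7. The expansion below is for n satisfying the constraints of Math. 9; the symbol "const" (an assumption of this sketch) collects terms independent of n, including the normalization of the posterior.

```latex
\begin{aligned}
-\log p(n \mid y; \theta)
 &= -\log p(n; \theta) - \log p(y \mid n) + \mathrm{const} \\
 &= \sum_{(i,j) \in A} \sum_{x_i, x_j}
      \bigl[ \log n_{ij}(x_i, x_j)! - n_{ij}(x_i, x_j) \log \phi_{ij}(x_i, x_j) \bigr] \\
 &\quad - \sum_{i \in N} \sum_{x_i} (\nu_i - 1) \log n_i(x_i)!
        - \sum_{i \in N} \sum_{x_i} \log p_{i, x_i}\bigl(y_i(x_i) \mid n_i(x_i)\bigr) \\
 &\quad - \log M! + M \log Z(\theta) + \mathrm{const}
\end{aligned}
```

Since −log M! and M log Z(θ) do not depend on n, dropping them together with the constant leaves the objective function of Math. 10.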

Hereinafter, for simplicity, it is assumed that, for i=1, 2, . . . , |N|,

χ_i={1, 2, . . . , R}. [Math. 11]

Note that this is an example, and the following description can be easily extended to other cases.

The MAP estimation problem shown in Math. 9 above is generally considered to be NP-hard, and is therefore very difficult to solve efficiently. For this reason, the conventional approach has mainly been to apply Stirling's approximation and continuous relaxation (that is, removal of the restriction to integer values) to the objective function and then solve the resulting problem by message passing or the like. The continuously relaxed problem is a convex programming problem, and a global optimum solution can be obtained by message passing (see, for example, Reference 1).

Reference 1: T. Sun, D. R. Sheldon and A. Kumar. Message Passing for Collective Graphical Model. In Proceedings of the 32nd International Conference on Machine Learning, pp. 853-861, 2015.

However, the conventional technique described above has the following two problems of (1) and (2).

- (1) When the total number of samples M is small, a solution significantly deviating from the correct solution is outputted. This is due to the fact that Stirling's approximation

log x! ≈ x log x − x [Math. 12]

is not accurate when x is small.

- (2) The outputted solution is not sparse; hence, interpretation is difficult, and furthermore a large amount of memory is needed to hold the solution. This is because solutions other than integers are outputted due to the application of continuous relaxation.
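Problem (1) can be checked numerically. The following minimal sketch compares the exact value of log x! (computed with the log-gamma function) with Stirling's approximation x log x − x. The function names are assumptions made for this example.

```python
import math

def log_factorial(x):
    """Exact log x! via the log-gamma function."""
    return math.lgamma(x + 1)

def stirling(x):
    """Stirling's approximation x log x - x (taken as 0 at x = 0)."""
    return x * math.log(x) - x if x > 0 else 0.0

def relative_gap(x):
    """Gap between exact value and approximation, relative to the exact value (x >= 2)."""
    return (log_factorial(x) - stirling(x)) / log_factorial(x)

for x in (2, 5, 10, 100, 1000):
    print(f"x={x:5d}  exact={log_factorial(x):10.3f}  "
          f"approx={stirling(x):10.3f}  relative gap={relative_gap(x):.4f}")
```

For small x the gap is of the same order as log x! itself, whereas for large x it becomes negligible relative to log x!. This is consistent with the observation that the approximation degrades when the counts, and hence M, are small.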

Thus, in the present embodiment, a case is described where the MAP estimation of a CGM in a path graph is brought down to a minimum cost flow problem on a network and discrete DC (Difference of Convex) programming is applied to this problem to efficiently solve the estimation of n without approximation or continuous relaxation. Thereby, the MAP estimation apparatus 10 according to the present embodiment can output an accurate solution even when the total number of samples M is small, and can output a sparse solution (that is, a solution with high interpretability). The CGM in a path graph is important because it can express a time-series model (Markov model) and therefore has a wide application range.

Note that the discrete DC programming is a technique of efficiently optimizing a function expressed in the form of a difference between two discrete convex functions (see, for example, Reference 2).

Reference 2: T. Maehara, K. Murota, A framework of discrete DC programming by discrete convex analysis. Mathematical Programming, Series A, vol. 152, no. 1-2, pp. 435-466.

Next, a theoretical configuration when the MAP estimation apparatus 10 according to the present embodiment performs MAP estimation is described. As described above, the MAP estimation apparatus 10 according to the present embodiment reduces the MAP estimation of a CGM on a path graph to a minimum cost flow problem on a network, and then solves this problem by applying discrete DC programming, thereby obtaining a MAP estimation solution.

First, the minimum cost flow problem is described. The (non-linear) minimum cost flow problem is a problem like below. A directed graph G=(V, E) is given as an input, and for each edge (i, j)∈E,

Capacity constraint u_ij ∈ ℤ_{≥0}

Cost function c_ij: ℤ_{≥0} → ℝ [Math. 13]

are allocated. Further, for each vertex i∈V,

Demand b_i ∈ ℤ_{≥0} [Math. 14]

is given. The minimum cost flow problem is a problem of obtaining a flow having the minimum cost among flows satisfying the capacity constraint of each edge and the demand constraint of each vertex.

That is, when the flow flowing through the edge (i, j)∈E is represented by x_ij, the minimum cost flow problem can be formulated as follows.

$\begin{matrix} \min_{x \in ℤ^{|E|}} \; \sum_{(i, j) \in E} c_{ij}(x_{ij}) & \\ \text{s.t.} \quad \sum_{j : (i, j) \in E} x_{ij} - \sum_{j : (j, i) \in E} x_{ji} = b_{i} \quad (i \in V) & \\ 0 \leq x_{ij} \leq u_{ij} \quad ((i, j) \in E) & [Math . 15] \end{matrix}$

Here, when the cost function satisfies a convex cost condition of c_ij(x+2)+c_ij(x)≥2·c_ij(x+1) in all the edges (i, j)∈E, the minimum cost flow problem shown in Math. 15 above is called a minimum convex cost flow problem, and is known to have an efficient solution method.
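The convex cost condition can be checked numerically for a given cost function. The following minimal sketch tests the inequality c(x+2)+c(x)≥2·c(x+1) over a finite range for two example costs, log z! − z log φ and −log z!. The helper name `is_discrete_convex`, the value of φ, and the tested range are assumptions made for this example.

```python
import math

def is_discrete_convex(cost, max_flow):
    """Check c(x+2) + c(x) >= 2*c(x+1) for x = 0, ..., max_flow - 2."""
    return all(cost(x + 2) + cost(x) >= 2 * cost(x + 1) - 1e-12
               for x in range(max_flow - 1))

phi = 0.3  # hypothetical potential value

def convex_cost(z):
    """log z! - z log(phi): its second difference is log((z+2)/(z+1)) > 0."""
    return math.lgamma(z + 1) - z * math.log(phi)

def concave_cost(z):
    """-log z!: its second difference is negative, so the condition fails."""
    return -math.lgamma(z + 1)

print(is_discrete_convex(convex_cost, 50))
print(is_discrete_convex(concave_cost, 50))
```

A finite check of this kind does not prove convexity, but it is a quick way to see which edge costs prevent a problem from being a minimum convex cost flow problem.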

Next, a method for creating an instance of a minimum cost flow problem is described. For ease of writing, the notation of

n_ijk := n_{i, i+1}(j, k)

φ_ijk := φ_{i, i+1}(j, k)

n_ij := n_i(j)

y_ij := y_i(j)

is used. When the graphical model is of a path type,

$\begin{matrix} ν_{i} = \begin{cases} 1 & (i = 1, |N|) \\ 2 & (\text{otherwise}) \end{cases}, & [Math . 16] \end{matrix}$

and thus the objective function is

$\begin{matrix} ℒ(n) := \sum_{i = 1}^{|N| - 1} \sum_{j = 1}^{R} \sum_{k = 1}^{R} [\log n_{ijk}! - n_{ijk} \cdot \log ϕ_{ijk}] - \sum_{i = 2}^{|N| - 1} \sum_{j = 1}^{R} \log n_{ij}! - \sum_{i = 1}^{|N|} \sum_{j = 1}^{R} \log [p_{i, j}(y_{ij} \mid n_{ij})] . & [Math . 17] \end{matrix}$

Further, letting

f_ijk(z) := log z! − z·log φ_ijk

g(z) := −log z!

h_ij(z) := −log [p_ij(y_ij|z)],

the objective function can be rewritten as

$\begin{matrix} ℒ(n) := \sum_{i = 1}^{|N| - 1} \sum_{j = 1}^{R} \sum_{k = 1}^{R} f_{ijk}(n_{ijk}) + \sum_{i = 2}^{|N| - 1} \sum_{j = 1}^{R} g(n_{ij}) + \sum_{i = 1}^{|N|} \sum_{j = 1}^{R} h_{ij}(n_{ij}) & [Math . 18] \end{matrix}$

In order to turn the optimization problem (MAP estimation problem) shown in Math. 9 above into a minimum cost flow problem, a minimum cost flow problem on the graph G=(V, E) is constructed by (a) to (g) below. Note that in the following description, the edge (c, u) represents an edge in which the cost function is c and the capacity constraint is u. Further, [m]={1, 2, . . . , m} is given for any natural number m.

- (a) Let the vertex set V be as follows.

V={o}∪(𝒰⁽¹⁾∪𝒱⁽¹⁾)∪(𝒰⁽²⁾∪𝒱⁽²⁾)∪ . . . ∪(𝒰^(|N|)∪𝒱^(|N|))∪{d} [Math. 19]

where

𝒰⁽ⁱ⁾:={u_j⁽ⁱ⁾}_{j=1}^R; 𝒱⁽ⁱ⁾:={v_j⁽ⁱ⁾}_{j=1}^R [Math. 20].

Further, o is the vertex forming the start point of the flow, and d is the vertex forming the end point of the flow.

- (b) For j∈[R], stretch an edge (0, +∞) from vertex o to vertex u_j⁽¹⁾.
- (c) For j∈[R], stretch an edge (0, +∞) from vertex v_j^(|N|) to vertex d.
- (d) For i=1, |N|, and j∈[R], stretch an edge (h_ij(z), +∞) from vertex u_j⁽ⁱ⁾ to vertex v_j⁽ⁱ⁾.
- (e) For i=2, . . . , |N|−1, and j∈[R], stretch an edge (h_ij(z)+g(z), +∞) from vertex u_j⁽ⁱ⁾ to vertex v_j⁽ⁱ⁾.
- (f) For i∈[|N|−1], j∈[R], and k∈[R], stretch an edge (f_ijk(z), +∞) from vertex v_j⁽ⁱ⁾ to vertex u_k⁽ⁱ⁺¹⁾.
- (g) Let b_o=M, b_d=−M, and b_v=0, where v is a vertex other than o or d, that is:

v∈V\{o, d} [Math. 21]

As an example, an instance of a minimum cost flow problem constructed by (a) to (g) above when |N|=3 and R=3 is shown in FIG. 1.
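The construction steps (a) to (g) can be sketched as follows. This is a minimal illustration, not the implementation of the instance construction unit: the function name `build_instance`, the tuple encoding of vertices, and the use of string labels in place of evaluated cost functions are all assumptions made for this example. All capacities are +∞ and are therefore omitted.

```python
def build_instance(num_nodes, R, M):
    """Build vertices, edges and demands following steps (a) to (g).

    Each edge is (tail, head, cost_label); all capacities are +infinity.
    cost_label names the cost function instead of evaluating it.
    """
    u = lambda i, j: ("u", i, j)
    v = lambda i, j: ("v", i, j)
    states = range(1, R + 1)
    layers = range(1, num_nodes + 1)
    # (a) vertex set: o, then u_j^(i) and v_j^(i) for every layer, then d
    vertices = ["o"] + [w(i, j) for i in layers for j in states for w in (u, v)] + ["d"]
    edges = []
    for j in states:
        edges.append(("o", u(1, j), "0"))                    # (b)
        edges.append((v(num_nodes, j), "d", "0"))            # (c)
    for i in layers:
        for j in states:
            label = "h" if i in (1, num_nodes) else "h+g"    # (d) end layers, (e) interior
            edges.append((u(i, j), v(i, j), f"{label}_{i}{j}"))
    for i in range(1, num_nodes):
        for j in states:
            for k in states:
                edges.append((v(i, j), u(i + 1, k), f"f_{i}{j}{k}"))  # (f)
    demand = {w: 0 for w in vertices}                        # (g)
    demand["o"], demand["d"] = M, -M
    return vertices, edges, demand

# Sizes for |N| = 3 and R = 3; M = 10 is a hypothetical sample count.
vertices, edges, demand = build_instance(num_nodes=3, R=3, M=10)
print(len(vertices), len(edges))
```

For |N|=3 and R=3 this yields 20 vertices and 33 edges, of which only the three edges of the interior layer carry the non-convex cost h_ij(z)+g(z).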

Then, from an optimum solution of a minimum cost flow problem constructed by (a) to (g) above, a solution n* of the optimization problem shown in Math. 9 above is composed as follows.

n_ijk*=(Flow rate through an edge from vertex v_j⁽ⁱ⁾ to vertex u_k⁽ⁱ⁺¹⁾)

n_ij*=(Flow rate through an edge from vertex u_j⁽ⁱ⁾ to vertex v_j⁽ⁱ⁾) [Math. 22]

At this time, n* is an optimum solution of the optimization problem shown in Math. 9 above. Thus, it can be seen that the MAP estimation of a CGM can be made by solving a minimum cost flow problem constructed by (a) to (g) above.

Although the minimum cost flow problem constructed by (a) to (g) above does not satisfy the convex cost condition, the cost function of each edge is explicitly separated into the sum of a convex function and a concave function. Utilizing this, the minimum cost flow problem constructed by (a) to (g) above will now be solved by a technique based on discrete DC programming.

The minimum cost flow problem constructed by (a) to (g) above can be formulated as an optimization problem like below.

$\begin{matrix} \min_{z} \; ℒ(z) = ℱ(z) + 𝒢(z) \quad \text{s.t.} \quad z \in 𝒩 & [Math . 23] \\ \text{where} & \\ ℱ(z) = \sum_{i = 1}^{|N| - 1} \sum_{j = 1}^{R} \sum_{k = 1}^{R} f_{ijk}(z_{ijk}) + \sum_{i = 1}^{|N|} \sum_{j = 1}^{R} h_{ij}(z_{ij}), \quad 𝒢(z) = \sum_{i = 2}^{|N| - 1} \sum_{j = 1}^{R} g(z_{ij}) & [Math . 24] \end{matrix}$

Further,

𝒩 [Math. 25]

is a set of feasible integer solutions of the minimum cost flow problem constructed by (a) to (g) above.

In order to solve the optimization problem shown in Math. 23 above, a sequence of feasible solutions z₀, z₁, . . . , z_t, . . . satisfying

ℒ(z₀) ≥ ℒ(z₁) ≥ . . . ≥ ℒ(z_t) ≥ . . . [Math. 26]

will now be generated. The initial point

z₀ ∈ 𝒩 [Math. 27]

is arbitrarily taken. When z₀, z₁, . . . , z_t are already determined, for

(Condition 1) ℋ_t(z_t) = 𝒢(z_t)

(Condition 2) ℋ_t(z) ≥ 𝒢(z), ∀z ∈ 𝒩

(Condition 3) ℒ_t(z) := ℱ(z) + ℋ_t(z) can be efficiently minimized in 𝒩, [Math. 28]

a function satisfying all of these,

ℋ_t [Math. 29]

is found. Then, based on

$\begin{matrix} z_{t + 1} = \underset{z \in 𝒩}{\arg \min} \; ℒ_{t}(z), & [Math . 30] \end{matrix}$

a new point z_{t+1} is determined. At this time,

$\begin{matrix} \begin{matrix} ℒ(z_{t + 1}) = ℱ(z_{t + 1}) + 𝒢(z_{t + 1}) \\ \leq ℱ(z_{t + 1}) + ℋ_{t}(z_{t + 1}) \\ \leq ℱ(z_{t}) + ℋ_{t}(z_{t}) \\ = ℱ(z_{t}) + 𝒢(z_{t}) \\ = ℒ(z_{t}) \end{matrix} & [Math . 31] \end{matrix}$

holds, where the first inequality follows from condition 2, the second inequality follows from the definition of z_{t+1} in Math. 30, and the subsequent equality follows from condition 1; thus, it can be seen that the objective function never increases from one iteration to the next. Since the set of feasible solutions is a finite set, z_t is a local optimum solution at a sufficiently large t.

A method for determining, for z_t,

ℋ_t [Math. 32]

will now be described. For any

w ∈ ℤ_{≥0} [Math. 33]

$\overline{g}_{w}(z) := α_{w} \cdot (z - w) - \log (w!)$ [Math. 34]

is given, where α_w is any real number satisfying −log (w+1) ≤ α_w ≤ −log w (for w=0, the upper bound −log 0 is read as +∞). That is, the function shown in Math. 34 is a linear function that touches the concave function g at z=w. At this time,

$\overline{g}_{w}(w) = g(w)$

$\overline{g}_{w}(z) \geq g(z) \quad \forall z \in ℤ_{\geq 0}$ [Math. 35]

hold. Thus, when, for z_t,

$\begin{matrix} \overline{𝒢}_{t}(z) := \sum_{i = 2}^{|N| - 1} \sum_{j = 1}^{R} \overline{g}_{z_{t, ij}}(z_{ij}) & [Math . 36] \end{matrix}$

is set,

$\overline{𝒢}_{t}(z_{t}) = 𝒢(z_{t})$

$\overline{𝒢}_{t}(z) \geq 𝒢(z), \quad \forall z \in 𝒩,$ [Math. 37]

and conditions 1 and 2 above are satisfied. Further, the problem of minimizing

$ℒ_{t}(z) := ℱ(z) + \overline{𝒢}_{t}(z)$ [Math. 38]

in a set of feasible solutions,

𝒩, [Math. 39]

is a minimum convex cost flow problem; thus, a minimum solution can be efficiently obtained, and condition 3 above is satisfied. Thus,

$ℋ_{t} := \overline{𝒢}_{t}$ [Math. 40]

may be set.
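The two properties required of the linear bound can be checked numerically. The following minimal sketch assumes the tangent form ḡ_w(z) = α_w·(z−w) − log(w!), which is the form consistent with the stated range −log(w+1) ≤ α_w ≤ −log w, and verifies that it touches g at z=w and lies on or above g elsewhere. The function name `make_majorizer`, the choice w=4, and the tested range are assumptions made for this example.

```python
import math

def g(z):
    """Concave term g(z) = -log z!."""
    return -math.lgamma(z + 1)

def make_majorizer(w, alpha=None):
    """Return the linear function g_bar_w that touches g at w.

    alpha defaults to -log(w + 1), the lower end of the admissible interval.
    """
    if alpha is None:
        alpha = -math.log(w + 1)
    return lambda z: alpha * (z - w) + g(w)

w = 4
g_bar_low = make_majorizer(w)                        # alpha_w = -log(w + 1)
g_bar_high = make_majorizer(w, alpha=-math.log(w))   # alpha_w = -log w
gaps_low = [g_bar_low(z) - g(z) for z in range(0, 21)]
gaps_high = [g_bar_high(z) - g(z) for z in range(0, 21)]
print(min(gaps_low), min(gaps_high))
```

Both ends of the admissible interval for α_w give a valid upper bound, so either may be used when the cost of an interior edge is linearized.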

In view of the above, in order to solve a minimum cost flow problem constructed by (a) to (g) above, a technique based on discrete DC programming is used and the cost function of an edge stretched from vertex u_j⁽ⁱ⁾ to vertex v_j⁽ⁱ⁾ in graph G is replaced with

$h_{ij}(z) + \overline{g}_{z_{t, ij}}(z)$ [Math. 41]

(in other words, the cost function is corrected (or approximated)), and then the minimum convex cost flow problem shown in Math. 30 above is solved to obtain z_{t+1}. Then, when z_{t+1} converges (that is, when the value of the objective function does not change), z_{t+1} at this time is outputted as an optimum solution of the minimum cost flow problem constructed by (a) to (g) above (that is, an optimum solution (MAP estimation solution) of the optimization problem shown in Math. 9 above).

<Hardware Configuration>

Next, a hardware configuration of the MAP estimation apparatus 10 according to the present embodiment is described with reference to FIG. 2. FIG. 2 is a diagram showing an example of a hardware configuration of the MAP estimation apparatus 10 according to the present embodiment.

As shown in FIG. 2, the MAP estimation apparatus 10 according to the present embodiment is obtained by using a hardware configuration of a general computer or computer system, and includes an input device 11, a display device 12, an external I/F 13, a communication I/F 14, a processor 15, and a memory device 16. These pieces of hardware are connected via a bus 17 to be able to communicate with each other.

The input device 11 is, for example, a keyboard, a mouse, a touch panel, or the like. The display device 12 is, for example, a display or the like. The MAP estimation apparatus 10 need not include at least one of the input device 11 and the display device 12.

The external I/F 13 is an interface with an external device such as a recording medium 13a. The MAP estimation apparatus 10 can perform reading, writing, etc. of the recording medium 13a via the external I/F 13. Examples of the recording medium 13a include a CD (compact disc), a DVD (digital versatile disk), an SD memory card (Secure Digital memory card), a USB (Universal Serial Bus) memory card, and the like.

The communication I/F 14 is an interface for connecting the MAP estimation apparatus 10 to a communication network. The processor 15 is, for example, an arithmetic device of various types such as a CPU (central processing unit) and a GPU (graphics processing unit). The memory device 16 is, for example, a storage device of various types such as an HDD (hard disk drive), an SSD (solid state drive), a RAM (random access memory), a ROM (read-only memory), and a flash memory.

By having the hardware configuration shown in FIG. 2, the MAP estimation apparatus 10 according to the present embodiment can implement MAP estimation processing described later. Note that the hardware configuration shown in FIG. 2 is an example, and the MAP estimation apparatus 10 may have another hardware configuration. For example, the MAP estimation apparatus 10 may include a plurality of processors 15, and may include a plurality of memory devices 16.

<Functional Configuration>

Next, a functional configuration of the MAP estimation apparatus 10 according to the present embodiment is described with reference to FIG. 3. FIG. 3 is a diagram showing an example of a functional configuration of the MAP estimation apparatus 10 according to the present embodiment.

As shown in FIG. 3, the MAP estimation apparatus 10 according to the present embodiment includes an input unit 101, an instance construction unit 102, a MAP estimation unit 103, and an output unit 104. These units are implemented by, for example, processing that one or more programs installed in the MAP estimation apparatus 10 cause the processor 15 to execute.

Further, the MAP estimation apparatus 10 according to the present embodiment includes a potential storage unit 201, an aggregate data storage unit 202, and a MAP estimation solution storage unit 203. These units are obtained by using, for example, the memory device 16. At least one storage unit among these units may be obtained by using, for example, a storage device (a database server or the like) connected to the MAP estimation apparatus 10 via a communication network.

The input unit 101 stores (local) potentials and aggregate data provided to the MAP estimation apparatus 10 in the potential storage unit 201 and the aggregate data storage unit 202, respectively. The input unit 101 may perform correction, etc. of potentials stored in the potential storage unit 201 or aggregate data stored in the aggregate data storage unit 202 in accordance with, for example, an operation or the like from the input device 11.

The instance construction unit 102 constructs (an instance of) a minimum cost flow problem on the basis of (a) to (g) above by using potentials stored in the potential storage unit 201 and aggregate data stored in the aggregate data storage unit 202.

The MAP estimation unit 103 calculates a MAP estimation solution from a minimum cost flow problem constructed by the instance construction unit 102, and stores the MAP estimation solution in the MAP estimation solution storage unit 203. Here, the MAP estimation unit 103 includes a correction unit 111 and a capacity scaling unit 112. The correction unit 111 corrects (or replaces or approximates) the cost function of an edge stretched from vertex u_j⁽ⁱ⁾ to vertex v_j⁽ⁱ⁾ in graph G. The capacity scaling unit 112 solves the minimum convex cost flow problem shown in Math. 30 above to calculate a solution n*=z_{t+1}.

The output unit 104 outputs a MAP estimation solution n* stored in the MAP estimation solution storage unit 203 to an arbitrary output destination.

The potential storage unit 201 stores potentials φ_ijk given to the MAP estimation apparatus 10. An example of potentials φ_ijk stored in the potential storage unit 201 is shown in FIG. 4. Note that i∈[|N|−1], j∈[R], and k∈[R].

The aggregate data storage unit 202 stores aggregate data y_ij provided to the MAP estimation apparatus 10. An example of aggregate data y_ij stored in the aggregate data storage unit 202 is shown in FIG. 5.

The MAP estimation solution storage unit 203 stores a MAP estimation solution n* calculated by the MAP estimation unit 103. An example of the MAP estimation solution n* stored in the MAP estimation solution storage unit 203 is shown in FIG. 6. Note that n* is a vector in which n_ijk* and n_ij* are collected and arranged for all the vertices and all the edges.

<MAP Estimation Processing>

Next, a flow of MAP estimation processing according to the present embodiment is described with reference to FIG. 7. FIG. 7 is a flowchart showing an example of MAP estimation processing according to the present embodiment.

First, the instance construction unit 102 reads potentials φ_ijk stored in the potential storage unit 201 and aggregate data y_ij stored in the aggregate data storage unit 202 (step S101).

Next, the instance construction unit 102 constructs an instance of a minimum cost flow problem on the graph G=(V, E) on the basis of (a) to (g) above by using the potentials φ_ijk and the aggregate data y_ij read in step S101 above (step S102).

Next, the MAP estimation unit 103 solves the minimum cost flow problem constructed in step S102 above by a technique based on discrete DC programming, and calculates a MAP estimation solution (step S103). The MAP estimation solution calculated by the MAP estimation unit 103 is stored in the MAP estimation solution storage unit 203. Details of the processing of step S103 will be described later.

Then, the output unit 104 outputs the MAP estimation solution stored in the MAP estimation solution storage unit 203 to an arbitrary output destination (step S104). Examples of the output destination of the MAP estimation solution include another device or program connected via a communication network, the display device 12 such as a display, etc.

Here, the processing of step S103 above (the processing of calculating a MAP estimation solution by a technique based on discrete DC programming) is described with reference to FIG. 8. FIG. 8 is a flowchart showing an example of the processing of calculating a MAP estimation solution by a technique based on discrete DC programming according to the present embodiment.

First, the MAP estimation unit 103 performs initialization of z₀←0 and t←0 (step S201).

Next, the MAP estimation unit 103 executes the processing of repeating step S301 to step S304 below (step S202). This repetition processing is repeatedly executed until YES is obtained as decision in step S303.

The correction unit 111 of the MAP estimation unit 103 replaces the cost function of an edge stretched from vertex u_j⁽ⁱ⁾ to vertex v_j⁽ⁱ⁾ in graph G with

$h_{ij}(z) + \overline{g}_{z_{t, ij}}(z)$ [Math. 42]

(step S301).

Next, the capacity scaling unit 112 of the MAP estimation unit 103 solves the minimum convex cost flow problem shown in Math. 30 above by capacity scaling, and calculates a solution z_{t+1} (step S302). Note that the capacity scaling is an algorithm for efficiently solving the minimum convex cost flow problem (see, for example, Reference 3).

Reference 3: R. K. Ahuja, T. L. Magnanti, J. B. Orlin, Network Flows: Theory, Algorithms, Applications, Prentice Hall, 1993.

Next, the MAP estimation unit 103 decides whether

ℒ(z_t)=ℒ(z_{t+1}) [Math. 43]

is satisfied or not (step S303). That is, the MAP estimation unit 103 decides whether the solution z_{t+1} has converged or not.

In the case where it is not decided in step S303 above that the solution z_{t+1} has converged (NO in step S303), the MAP estimation unit 103 sets t←t+1 (step S304), and then returns to step S301. On the other hand, in the case where it is decided in step S303 above that the solution z_{t+1} has converged (YES in step S303), the MAP estimation unit 103 outputs (stores) the solution z_{t+1} as a MAP estimation solution n* to the MAP estimation solution storage unit 203 (step S203).
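The loop of steps S301 to S304 can be sketched on a very small instance as follows. This is a minimal illustration under several assumptions that are not part of the embodiment: the inner minimization of step S302 is done by exhaustive enumeration of the feasible integer flows as a stand-in for capacity scaling (practical only for tiny instances), the noise term is taken as a squared error h_ij(z)=(y_ij−z)²/2, the linear bound is taken as α_w·(z−w)−log(w!) with α_w=−log(w+1), the starting point is simply the first enumerated feasible flow, and all function names and data are hypothetical.

```python
import itertools
import math

def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def feasible_flows(num_nodes, R, M):
    """Enumerate edge tables z[i][j][k] of a path graph that satisfy the constraints."""
    def extend(tables, row_sums, remaining):
        if remaining == 0:
            yield tuple(tables)
            return
        # Row j of the next table must sum to the flow entering state j.
        for rows in itertools.product(*(compositions(s, R) for s in row_sums)):
            col_sums = [sum(r[k] for r in rows) for k in range(R)]
            yield from extend(tables + [rows], col_sums, remaining - 1)
    for first_row_sums in compositions(M, R):
        yield from extend([], list(first_row_sums), num_nodes - 1)

def node_counts(z, R):
    """n_ij for every vertex i: row sums of each table, column sums of the last."""
    counts = [[sum(row) for row in table] for table in z]
    counts.append([sum(row[k] for row in z[-1]) for k in range(R)])
    return counts

def g(w):
    return -math.lgamma(w + 1)                      # g(z) = -log z!

def convex_part(z, phi, y, R):
    """F(z): f_ijk terms plus h_ij terms (squared-error h is an assumption)."""
    f = sum(math.lgamma(z[i][j][k] + 1) - z[i][j][k] * math.log(phi[i][j][k])
            for i in range(len(z)) for j in range(R) for k in range(R))
    n = node_counts(z, R)
    h = sum(0.5 * (y[i][j] - n[i][j]) ** 2 for i in range(len(n)) for j in range(R))
    return f + h

def concave_part(z, R):
    """G(z): g(n_ij) summed over interior vertices only."""
    return sum(g(c) for counts in node_counts(z, R)[1:-1] for c in counts)

def dc_map(phi, y, R, M):
    flows = list(feasible_flows(len(y), R, M))
    objective = lambda z: convex_part(z, phi, y, R) + concave_part(z, R)
    z, history = flows[0], []
    while True:
        history.append(objective(z))
        w = node_counts(z, R)[1:-1]                 # linearization points z_{t,ij}
        def surrogate(cand):                        # F(z) + linear upper bound on G(z)
            n = node_counts(cand, R)[1:-1]
            bound = sum(-math.log(w[i][j] + 1) * (n[i][j] - w[i][j]) + g(w[i][j])
                        for i in range(len(w)) for j in range(R))
            return convex_part(cand, phi, y, R) + bound
        z_next = min(flows, key=surrogate)          # stand-in for capacity scaling
        if objective(z_next) >= history[-1] - 1e-12:
            return z_next, history                  # objective no longer decreases
        z = z_next

# Hypothetical instance: |N| = 3 vertices, R = 2 states, M = 3 samples.
phi = [[[0.7, 0.3], [0.4, 0.6]], [[0.5, 0.5], [0.2, 0.8]]]   # phi[i][j][k]
y = [[2, 1], [1, 2], [1, 2]]                                 # y[i][j]
z_star, history = dc_map(phi, y, R=2, M=3)
print(z_star, history)
```

The returned tables contain only integers and the recorded objective values never increase, which mirrors the monotonicity argument of Math. 31. In general the result is a local optimum and need not coincide with the global minimum.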

The present invention is not limited to the above specifically disclosed embodiment, and various modifications and changes, combinations with known technologies, etc. can be made without departing from the scope of the claims.

REFERENCE SIGNS LIST

10 MAP estimation apparatus

11 Input device

12 Display device

13 External I/F

13a Recording medium

14 Communication I/F

15 Processor

16 Memory device

17 Bus

101 Input unit

102 Instance construction unit

103 MAP estimation unit

104 Output unit

111 Correction unit

112 Capacity scaling unit

201 Potential storage unit

202 Aggregate data storage unit

203 MAP estimation solution storage unit
