The present invention relates to a technology of optimizing variables.
An optimization technology is used in various fields, such as image processing, speech recognition, and natural language processing. In the optimization technology, for the sake of optimization, cost functions designed depending on individual problems are used.
The following considers a minimization problem of a cost function G(w)=G1(w)+G2(w) (w ∈ Rn (n is an integer of 1 or greater) is a variable being an optimization target, and functions G1 and G2: Rn→R∪{∞} are each a closed proper convex function (a closed proper convex function is hereinafter simply referred to as a convex function)).
Note that, even when the cost function G includes three or more terms, if the cost function can be expressed as a sum of two convex functions, the problem is reduced to expression (1).
A fixed point w, which is an optimal solution (that is, a value that is ultimately obtained through optimization) of the minimization problem (also referred to as a convex optimization problem) is obtained when a subdifferential of the cost function G includes 0.
[Math. 2]
0∈∂G1(w)+∂G2(w) (2)
Here, ∂ represents a subdifferential operator. Further, ∂Gi(i=1, 2) is a maximum monotone operator.
Note that, if the function Gi has points at which it is not differentiable, the subdifferential at such a point is a set. Accordingly, the subdifferentials (right-hand side) of expression (2) may be multi-valued. Thus, here, the symbol of inclusion “∈” is used instead of the equal sign “=”.
Lagrange Dual Ascent Problem
As shown in the following expression, the following considers a problem (Lagrange dual ascent problem) of minimizing the sum of two convex cost functions H1: Rk→R∪{∞} and H2: Rm→R∪{∞} when two variables {p, q} (p ∈ Rk and q ∈ Rm (k and m are each an integer of 1 or greater)) are restrained by a linear expression.
min_{p, q} H1(p)+H2(q) subject to Ap+Bq=c (3)
Here, matrices A ∈ Rn×k and B ∈ Rn×m and a vector c ∈ Rn (n is an integer of 1 or greater) are given in advance.
One useful strategy for solving a linearly restrained minimization problem such as the Lagrange dual ascent problem is to solve its dual problem. The dual problem of the linearly restrained minimization problem is defined as a sup/inf (maximization/minimization) problem of a Lagrange function L(p, q, λ).
Here, λ ∈ Rn is a dual variable, and ·T represents transpose. Further, Hi* (i=1, 2) is the convex conjugate function of Hi, and is given by the following respective expressions.
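As an illustrative worked example (the specific cost below is chosen for concreteness; it matches the quadratic H1 used later in the noise elimination problem), the convex conjugate H*(y)=sup_p(yTp−H(p)) of H(p)=½∥p−s∥22 has a closed form:

```latex
H^{*}(y)=\sup_{p}\Bigl(y^{\mathsf T}p-\tfrac{1}{2}\lVert p-s\rVert_2^2\Bigr)
        =y^{\mathsf T}s+\tfrac{1}{2}\lVert y\rVert_2^2,
\qquad\text{with the supremum attained at } p=s+y.
```

Setting the gradient y−(p−s) of the inner expression to zero gives p=s+y, and substituting back yields the stated value.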
It can be understood that, if λ and w are replaced with each other to obtain ∂G1(w)=A∂H1*(ATw) and ∂G2(w)=B∂H2*(BTw)−c, the problem on the right-hand side of expression (4) arrives at the problem for obtaining the fixed point of expression (2).
As a specific example of the Lagrange dual ascent problem, there is a noise elimination problem of an image using a total variation norm. This problem will be described later.
As one method of solving a minimization problem of a cost function such as a Lagrange dual ascent problem, there is a method described in NPL 1.
When a Lagrange dual ascent problem is solved by using a variable update rule described in NPL 1, in some cases, it is not easy to update (or it is not possible to update) a variable in one step so that a value of a cost function is reduced. In other words, in some cases, the conventional variable update rule has a problem in that convergence to an optimal solution takes time, that is, a problem in that optimization of a variable takes time.
In the light of this, the present invention has an object to provide a technology that optimizes a variable being an optimization target at high speed.
In one aspect of the present invention, assuming that w ∈ Rn is a variable being an optimization target, and G(w)(=G1(w)+G2(w)) is a cost function for optimizing the variable w, calculated by using input data (note that a function Gi(w): Rn→R∪{∞} (i=1, 2) is a closed proper convex function), and D: Rn→R is a strictly convex function (note that the function D is differentiable, and satisfies ∇D(0)=0), and Ri (i=1, 2) and Ci (i=1, 2) are a D-resolvent operator and a D-Cayley operator defined by the following expressions, respectively,
Ri=(I+(∇D)−1∘∂Gi)−1
Ci=(I+(∇D)−1∘∂Gi)−1∘(I−(∇D)−1∘∂Gi) [Math. 6]
a variable optimization apparatus recursively calculates a value of the variable w by using the D-resolvent operator Ri(i=1, 2) and the D-Cayley operator Ci(i=1, 2). −Gi(w) (i=1, 2) is a strongly convex function approximating the function Gi(w)(i=1, 2). In the recursively calculating of the value, when ∇D(w) is calculated, for a D-resolvent operator R1 and a D-Cayley operator C1, T1(w)=∇−G1(w)−∇−G1(0) is used for calculation of ∇D(w), and for a D-resolvent operator R2 and a D-Cayley operator C2, T2(w)=∇−G2(w)−∇−G2(0) is used for calculation of ∇D(w).
According to the present invention, a variable being an optimization target can be optimized at high speed.
Hereinafter, embodiments of the present invention will be described in detail. Components having the same function are denoted by the same reference signs, and redundant description thereof will be omitted.
Prior to describing each embodiment, the method of notation herein will be described.
^ (caret) represents a superscript, and _ (underscore) represents a subscript. For example, x^y_z represents that y_z is a superscript to x, and x_y^z represents that y^z is a subscript to x.
A superscript “{circumflex over ( )}” or “˜”, such as {circumflex over ( )}x or ˜x for a character x, should properly be placed directly above “x”, but is written as {circumflex over ( )}x or ˜x under the notational limitations of the written description herein.
First, a procedure for solving the problem of expression (2) will be described in detail with reference to NPL 1.
1: Variable Update Rule Based on Bregman Monotone Operator Splitting
Here, as a method for solving the problem of expression (2), a method using Bregman monotone operator splitting will be described. The method is a variable update rule for obtaining a fixed point that ultimately minimizes the cost function G while updating in parallel a plurality of variables including the variable w. Note that a variable being a target of optimization may be referred to as a primal variable.
First, prior to deriving a variable update rule, some preparations are performed.
1-1: Bregman Divergence
Bregman divergence has a role important for correcting measurement of a variable space. For two different points {w, z}, Bregman divergence JD(w∥z) is defined by the following expression.
[Math. 7]
JD(w∥z)=D(w)−D(z)−⟨∇D(z), w−z⟩ (5)
Here, ∇ represents a differential operator. As a function D: Rn→R used for definition of Bregman divergence, any differentiable strictly convex function can be used. Thus, the function D may be, for example, an asymmetric function.
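The definition of expression (5) can be written down directly. The following is an illustrative sketch (not part of the invention; NumPy, with the hypothetical helper name `bregman`): choosing D(w)=∥w∥22/2, which is differentiable, strictly convex, and satisfies ∇D(0)=0, recovers half the squared Euclidean distance.

```python
import numpy as np

def bregman(D, gradD, w, z):
    """Bregman divergence J_D(w || z) = D(w) - D(z) - <grad D(z), w - z>."""
    return D(w) - D(z) - gradD(z) @ (w - z)

# Choosing D(w) = ||w||^2 / 2 (so grad D(w) = w and grad D(0) = 0) recovers
# half the squared Euclidean distance between w and z.
D = lambda w: 0.5 * (w @ w)
gradD = lambda w: w

w = np.array([1.0, 2.0])
z = np.array([0.0, 1.0])
print(bregman(D, gradD, w, z))   # 1.0  ( = ||w - z||^2 / 2 )
```

Other strictly convex choices of D give asymmetric divergences, which is what "correcting measurement of the variable space" refers to.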
In the following description, the function D is limited to a function that satisfies ∇D(0)=0. The reason for the limitation is to let the following expression (6) hold, in which ∇D is applied to expression (2) being a condition related to the fixed point.
[Math. 8]
0∈(∇D)−1∘∂G1(w)+(∇D)−1∘∂G2(w) (6)
Here, −1 represents an inverse operator, and ∘ represents composition of two operators.
In general, the property of (∇D)−1∘∂Gi varies with the function D. Thus, depending on the design of ∇D, the convergence rate of the variable update rule varies. In other words, ∇D significantly affects the speed of convergence. A design of ∇D that can increase the convergence rate will be described later.
1-2: D-Resolvent Operator and D-Cayley Operator
A D-resolvent operator Ri(i=1, 2) and a D-Cayley operator Ci(i=1, 2) are given in expression (7) and expression (8), respectively.
Here, I represents the identity operator.
Note that, if an n-dimensional Euclidean distance function is used as the function D, the D-resolvent operator and the D-Cayley operator are respectively a resolvent operator and a Cayley operator being widely known. In other words, the D-resolvent operator and the D-Cayley operator are respectively an operator that is obtained by generalizing the resolvent operator and an operator that is obtained by generalizing the Cayley operator.
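For instance, under the Euclidean choice of D just mentioned, the D-resolvent of G(w)=λ∥w∥1 is the well-known soft-thresholding (proximal) operator, and the D-Cayley operator is its reflection 2R−I. A minimal sketch (NumPy; the function names `R` and `C` are illustrative):

```python
import numpy as np

lam = 1.0   # weight of G(w) = lam * ||w||_1

def R(z):
    """Resolvent (I + dG)^(-1) for dG(w) = lam * sign(w): the proximal
    operator of lam * ||.||_1, i.e. elementwise soft thresholding."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def C(z):
    """Cayley operator 2R - I (a nonexpansive reflection)."""
    return 2.0 * R(z) - z

z = np.array([3.0, -0.5, 1.5])
print(R(z))   # R(z) = [2, 0, 0.5]
print(C(z))   # C(z) = [1, 0.5, -0.5]
```

Replacing the Euclidean D with another strictly convex function changes the metric inside these operators, which is the generalization exploited below.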
1-3: Variable Update Rule based on Bregman Monotone Operator Splitting
On the basis of the preparations described above, the variable update rule based on Bregman monotone operator splitting is derived. Here, a Bregman Peaceman-Rachford (B-P-R) variable update rule based on Bregman Peaceman-Rachford (B-P-R) monotone operator splitting (B-P-R splitting) and a Bregman Douglas-Rachford (B-D-R) variable update rule based on Bregman Douglas-Rachford (B-D-R) monotone operator splitting (B-D-R splitting) will be described.
The B-P-R variable update rule can be obtained by transforming expression (6), which represents a condition related to the fixed point in which measurement of the variable space is corrected by (∇D)−1.
With the use of an auxiliary variable z of the variable w that satisfies w ∈ R1(z), expression (9) of B-P-R monotone operator splitting recursive with respect to the auxiliary variable z can be obtained.
[Math.10]
0∈(I+(∇D)−1∘∂G2)(w)−(I−(∇D)−1∘∂G1)(w)
0∈(I+(∇D)−1∘∂G2)∘R1(z)−(I−(∇D)−1∘∂G1)∘R1(z)
0∈R1(z)−R2∘C1(z)
0∈½(C1+I)(z)−½(C2+I)∘C1(z)
z∈C2∘C1(z) (9)
In the B-P-R variable update rule using expression (9), the fixed point can be obtained by repeatedly updating the variable by using D-Cayley operators C1 and C2.
By splitting expression (9) into a simple variable update rule (B-P-R variable update rule) with the use of the D-resolvent operators R1 and R2 and the auxiliary variables x, y, and z ∈ Rn, expression (10) to expression (13) can be obtained.
[Math.11]
wt+1=R1(zt)=(I+(∇D)−1∘∂G1)−1(zt) (10)
xt+1=C1(zt)=(2R1−I)(zt)=2wt+1−zt (11)
yt+1=R2(xt+1)=(I+(∇D)−1∘∂G2)−1(xt+1) (12)
zt+1=C2(xt+1)=(2R2−I)(xt+1)=2yt+1−xt+1 (13)
Here, t is an index that represents the number of times of update.
Through transformation of expression (10), expression (14) is obtained.
[Math. 12]
z∈(I+(∇D)−1∘∂G1)(w)
0∈(∇D)−1∘∂G1(w)+w−z
0∈∂G1(w)+∇D(w)−∇D(z) (14)
If there is a minimum value of the variable w, the integral form of expression (14) is represented by expression (15).
wt+1=argmin_w{G1(w)+JD(w∥zt)} (15)
Expression (15) above shows that a penalty term is generalized by using Bregman divergence.
Through a similar logic, the following expression is obtained based on expression (12).
In summary, the B-P-R variable update rule based on B-P-R monotone operator splitting is as follows.
Next, the B-D-R variable update rule will be described. B-D-R monotone operator splitting can be obtained as expression (16), in which an averaging operator is applied to expression (9).
[Math. 16]
z∈αC
2
∘C
1(z)+(1−α)z (16)
Here, α ∈ (0, 1).
Through a logic similar to the logic described above, the following B-D-R variable update rule based on B-D-R monotone operator splitting can be obtained.
As has been described above, each of the B-P-R variable update rule and the B-D-R variable update rule is summarized as algorithm as illustrated in
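The two update rules can be sketched in a few lines. The following is an illustrative toy example, not the adaptive design of the invention: with the Euclidean choice of D, the D-resolvents reduce to ordinary proximal operators, and expressions (10) to (13), together with the averaging of expression (16), become a short loop. Here G1(w)=(w−a)²/2 and G2(w)=λ|w| are chosen so that both resolvents have closed forms.

```python
import numpy as np

a, lam = 3.0, 1.0   # toy costs: G1(w) = (w - a)**2 / 2,  G2(w) = lam * |w|

R1 = lambda z: (z + a) / 2.0                        # (I + dG1)^(-1), Euclidean D
R2 = lambda x: np.sign(x) * max(abs(x) - lam, 0.0)  # (I + dG2)^(-1): soft threshold

def solve(alpha=1.0, T=50):
    """Expressions (10)-(13); alpha < 1 gives the averaged B-D-R update (16)."""
    z = 0.0
    for _ in range(T):
        w = R1(z)               # (10)  w = R1(z)
        x = 2.0 * w - z         # (11)  x = C1(z) = 2 R1(z) - z
        y = R2(x)               # (12)  y = R2(x)
        z_pr = 2.0 * y - x      # (13)  C2(x) = 2 R2(x) - x
        z = (1.0 - alpha) * z + alpha * z_pr   # (16); alpha = 1 is pure B-P-R
    return w

# The minimizer of (w - 3)**2 / 2 + |w| is the soft threshold soft(3, 1) = 2.
print(solve(alpha=1.0))   # 2.0   (B-P-R)
print(solve(alpha=0.5))   # converges to the same minimizer (B-D-R)
```

With the Euclidean D, C1 and C2 are the classical Cayley reflections, so this loop is ordinary Peaceman-Rachford/Douglas-Rachford splitting; the invention replaces this implicit identity metric with the adaptive ∇D described below.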
2: Condition for Increasing Speed of Convergence Rate
A condition for increasing the speed of the convergence rate is derived by calculating the convergence rate of B-P-R monotone operator splitting and B-D-R monotone operator splitting. With this, a design condition of ∇D for implementing the speed increase can be studied.
It is assumed that monotonicity of a subdifferential ∂Gi is represented by expression (17), with the use of two different points {w, z}.
[Math. 18]
ρLB,i∥w−z∥2≤∥∂Gi(w)−∂Gi(z)∥2≤ρUB,i∥w−z∥2 (17)
Here, {ρLB,i, ρUB,i} ∈ [0, ∞]. In general, {ρLB,i, ρUB,i} varies depending on a function Gi. For example, if the function Gi is strongly convex and Lipschitz smooth, {ρLB,i, ρUB,i} ∈ (0, ∞).
It is assumed that, by applying a measurement correction operator (∇D)−1, monotonicity of expression (17) is represented by expression (18).
[Math. 19]
σLB,i∥w−z∥2≤∥(∇D)−1∘∂Gi(w)−(∇D)−1∘∂Gi(z)∥2≤σUB,i∥w−z∥2 (18)
Here, {σLB,i, σUB,i} ∈ [0, ∞]. In general, {σLB,i, σUB,i} varies depending on design of ∇D.
Based on the assumption described above, the convergence rate of B-P-R monotone operator splitting of expression (9) is represented by expression (19) (description of detailed derivation thereof is herein omitted).
[Math. 20]
∥zt−z*∥2≤(η1η2)t∥z0−z*∥2 (19)
Here, zt represents a value of z being updated t times, z0 represents an initial value of z, and z* represents a fixed point of z. Further, ηi (i=1, 2) is given by expression (20).
As can be understood from expression (20), the closer the value of ηi is to 0, the greater the increase in convergence rate that can be anticipated.
This also applies to B-D-R monotone operator splitting, and the convergence rate of B-D-R monotone operator splitting of expression (16) is represented by expression (19)′.
[Math. 22]
∥zt−z*∥2≤(1−α+αη1η2)t∥z0−z*∥2 (19)′
ηi given by expression (20) satisfies expression (21). In other words, ηi may have a value of 0 or greater and 1 or less.
As can be understood from the fact that ηi=0 if σLB,i=1 and σUB,i=1, ηi also has a value close to 0 if each of σLB,i and σUB,i has a value close to 1. Accordingly, by arranging ∇D in such a manner that each of σLB,i and σUB,i satisfying expression (18) has a value close to 1, the speed increase of the convergence rate can be expected.
3: Conventional Design of ∇D
In NPL 1, ∇D is designed as a linear function using a positive definite matrix M as shown in expression (22).
[Math. 24]
∇D(w)=Mw (22)
The reason why a linear function using the positive definite matrix M is used is that, depending on the matrix M, it can be combined with an existing optimization method, such as Newton's method, an accelerated gradient descent (AGD) method, or a (one-dimensional) gradient descent (GD) method. In actuality, it has been demonstrated through numerical simulation that appropriate design of the positive definite matrix M enables high-speed convergence.
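For example (an illustrative NumPy sketch; the quadratic G1 and the helper name are assumptions made so the resolvent has a closed form), with ∇D(w)=Mw the relation w=R1(z) of expression (10) becomes Mw+∂G1(w)∋Mz, which for G1(w)=∥w−a∥22/2 is the linear system (M+I)w=Mz+a; M=I recovers the ordinary resolvent (z+a)/2.

```python
import numpy as np

def d_resolvent_quadratic(M, a, z):
    """D-resolvent R1 = (I + (grad D)^(-1) o dG1)^(-1) for grad D(w) = M w and
    G1(w) = ||w - a||^2 / 2: applying M to w + M^(-1)(w - a) = z gives the
    linear system (M + I) w = M z + a."""
    n = len(a)
    return np.linalg.solve(M + np.eye(n), M @ z + a)

a = np.array([3.0, -1.0])
z = np.array([1.0, 1.0])

# M = I recovers the ordinary (Euclidean) resolvent (z + a) / 2.
print(d_resolvent_quadratic(np.eye(2), a, z))       # [2. 0.]

# A different positive definite M changes the metric, hence the step taken.
M = np.diag([4.0, 0.5])
print(d_resolvent_quadratic(M, a, z))
```

Choosing M well (e.g. close to the Hessian of G1) is what connects this design to Newton-type methods.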
However, there are only two requirements on the function D used for the definition of Bregman divergence: (1) ∇D(0)=0 is satisfied, and (2) D is a differentiable strictly convex function. In other words, the design of ∇D as a linear function using the positive definite matrix M as in expression (22) is merely one example of a function D that satisfies the two requirements; there may be other functions D satisfying the two requirements for which ∇D further increases the convergence rate.
4: Design of ∇D in Invention of Present Application
In view of this, the following proposes a design of ∇D in which each of σLB,i and σUB,i satisfying expression (18) has a value close to 1, without limiting D to a function that satisfies ∇D(w)=Mw. Specifically, the following proposes a method (hereinafter referred to as an adaptive alternate measurement correction method) of (1) using a continuous non-linear function including high-order gradient information for ∇D, and (2) adapting ∇D to ∂G1 and ∂G2 so as to alternately correct it.
Thus, the use of ∇D that satisfies strong monotonicity is considered. Specifically, with the use of a differentiable strongly convex function −Gi (i=1, 2) that approximates the cost function Gi, ∇D is defined by expression (23).
Here, positive coefficients {γ1t, γ2t} are used so that ∇D of expression (23) satisfies strong monotonicity. It is only required that, for example, {γ1t, γ2t} be the following expressions.
γ1t=∥γ2t−1T2∘(∂G1+∂G2)(zt−1)∥2/∥T1∘(∂G1+∂G2)(zt−1)∥2
γ2t=∥γ1tT1∘(∂G1+∂G2)(xt)∥2/∥T2∘(∂G1+∂G2)(xt)∥2 [Math.26]
From expression (23), it can be understood that ∇D is alternately adaptively corrected.
By adopting the design of ∇D of expression (23) into the B-P-R variable update algorithm and the B-D-R variable update algorithm of
5: Variable Update Rule based on Bregman Monotone Operator Splitting for Lagrange Dual Ascent Problem
Here, with the use of ∇D of expression (23), the B-P-R variable update rule and the B-D-R variable update rule for the Lagrange dual ascent problem are derived.
As described in the above, in the Lagrange dual ascent problem, two maximum monotone operators ∂G1(w)=A∂H1*(ATw) and ∂G2(w)=B∂H2*(BTw)−c are used. By transforming the definitional expression w ∈ R1(z) of the auxiliary variable z of the variable w by using ∂G1(w)=A∂H1*(ATw) and expression (7) for the first maximum monotone operator ∂G1(w), expression (24) is obtained.
[Math. 27]
w∈(I+(∇D)−1∘A∂H1*(AT·))−1(z)
z∈w+(∇D)−1∘A∂H1*(ATw)
∇D(w)∈∇D(z)−A∂H1*(ATw) (24)
Here, expression (25) holds for variable p ∈ ∂H1*(ATw) and ˜w=∇D(w), ˜z=∇D(z) (that is, ˜w and ˜z are dual variables obtained by applying non-linear transformation to w and z, respectively).
[Math. 28]
˜w∈˜z−Ap (25)
With the use of the basic property that the subdifferential of the convex conjugate function satisfies ∂H1*=(∂H1)−1, expression p ∈ ∂H1*(ATw) related to the variable p is transformed into expression (26).
[Math. 29]
p∈∂H1*(ATw)
ATw∈∂H1(p)
0∈∂H1(p)−AT(∇D)−1(˜z−Ap) (26)
Here, if there is a minimum value of p, the update rule of p is represented by expression (27) being an integral form of expression (26).
Here, D+ is a strongly convex function that satisfies ∇D+=(∇D)−1.
By combining expression (25) and expression ˜x ∈ 2˜w−˜z obtained by applying non-linear transformation to expression x ∈ 2w−z corresponding to expression (11), expression (28) representing an update rule of a dual variable ˜x is obtained.
[Math. 31]
˜xt+1=˜zt−2Apt+1 (28)
Through a similar logic, the following expression can be derived for the second maximum monotone operator ∂G2(w)=B∂H2*(BTw)−c as well.
As has been described above, each of the B-P-R variable update rule and the B-D-R variable update rule for the Lagrange dual ascent problem is summarized as algorithm as illustrated in
By adopting the design of ∇D of expression (23) into the two algorithms illustrated in
6: Noise Elimination Problem of Image Using Total Variation Norm
Here, as an application example of the algorithm of
In order to define the noise elimination problem of an image using the total variation norm, for example, cost functions H1 and H2 of the following expressions can be used.
H1(p)=½∥p−s∥22
H2(q)=μ(θ/2∥q∥22+∥q∥1) [Math. 33]
Here, p represents a variable representing an image, q is an auxiliary variable of p, and s represents an observation image (that is, an image before noise is eliminated). Further, μ and θ(>0) are predetermined coefficients.
It is assumed that the two variables {p, q} are restrained by expression q=Φp (note that Φ is a square circulant matrix). Because Φ is a square circulant matrix, the element qi, which is the i-th element of q, can be obtained by the discrete difference computation qi=[Φp]i=pi−1−pi+1. Note that the square circulant matrix Φ is used in order to reduce the amount of computation.
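The discrete difference above can be checked numerically (an illustrative NumPy sketch; the variable names are assumptions): the circulant product Φp equals a pair of cyclic shifts of p.

```python
import numpy as np

n = 6
p = np.arange(1.0, n + 1.0)            # p = [1, 2, 3, 4, 5, 6]

# Cyclic discrete difference q_i = p_{i-1} - p_{i+1} (indices modulo n):
q_fast = np.roll(p, 1) - np.roll(p, -1)

# The same map as an explicit square circulant matrix Phi: its first column
# has +1 at row 1 and -1 at row n-1, and column j is a cyclic shift by j.
c = np.zeros(n); c[1] = 1.0; c[n - 1] = -1.0
Phi = np.stack([np.roll(c, j) for j in range(n)], axis=1)

assert np.allclose(Phi @ p, q_fast)
print(q_fast)   # q = [4, -2, -2, -2, -2, 4]
```

The shift form needs only O(n) work per product, which is the computational saving the circulant structure provides.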
Here, by adopting A=Φ, B=−I, and c=0, it can be understood that the noise elimination problem on which the above assumption is made is described by expression (3). Thus, the algorithm of
The design of ∇D will be described below. For the first maximum monotone operator ∂G1(z)=Φ∂H1*(ΦTz), for example, ∇D and (∇D)−1 can be represented as shown in the following respective expressions.
Here, ξ(>0) is a coefficient used such that a function T1 satisfies strong monotonicity.
For the second maximum monotone operator ∂G2(x)=−∂H2*(−x)−c, for example, ∇D and (∇D)−1 can be represented as shown in the following respective expressions.
Here, xi (i=1, . . . , n) represents the i-th element of x. Further, v (>0) is a predetermined constant, and it is assumed that v >μθ holds.
To summarize the above, the B-P-R variable update algorithm and the B-D-R variable update algorithm for the noise elimination problem on which the above assumption is made are as illustrated in
Here, ·H represents a Hermitian transpose.
A variable optimization apparatus 100 will be described below with reference to
The variable optimization apparatus 100 optimizes a variable w ∈ Rn (n is an integer of 1 or greater) being a target of optimization by using input data, and outputs the result of optimization as an output value. Here, the input data is data used for calculating a cost function G(w) that is used for optimization of the variable w. In the following description, it is assumed that the cost function G(w) for optimizing the variable w calculated by using the input data is represented by G(w)=G1(w)+G2(w) (note that the function Gi(w): Rn→R ∪{∞}(i=1, 2) is a closed proper convex function).
With reference to
In S120, the variable update unit 120 optimizes the variable w through a predetermined procedure by using input data, and outputs the result of optimization as an output value. This will be described in detail below. Note that it is assumed that the function D: Rn→R used for definition of Bregman divergence is differentiable, and is a strictly convex function satisfying ∇D(0)=0.
First, the variable update unit 120 calculates setup data used at the time of optimizing the variable w by using the input data (S121-1). For example, the variable update unit 120 calculates the cost function Gi(w)(i=1, 2), the D-resolvent operator Ri(i=1, 2) defined by using the function D and the function Gi, the D-Cayley operator Ci(i=1, 2) defined by using the D-resolvent operator Ri, and the strongly convex function −Gi(w)(i=1, 2) that approximates the function Gi(w)(i=1, 2) as the setup data.
Next, the variable update unit 120 recursively calculates the value of the variable w by using the D-resolvent operator Ri(i=1, 2) and the D-Cayley operator Ci(i=1, 2) (S121-2). When the variable update unit 120 calculates ∇D(w), for the D-resolvent operator R1 and the D-Cayley operator C1, T1(w)=∇−G1(w)−∇−G1(0) is used for calculation of ∇D(w), and for the D-resolvent operator R2 and the D-Cayley operator C2, T2(w)=∇−G2(w)−∇−G2(0) is used for calculation of ∇D(w) (see expression (23)).
The variable update unit 120 can also be configured as a configuration unit that recursively calculates the value of the variable w, based on the algorithm of
The variable update unit 120 will be described below with reference to
With reference to
T1(w)=∇−G1(w)−∇−G1(0)
T2(w)=∇−G2(w)−∇−G2(0)
Here, the auxiliary variables x, y, and z ∈ Rn of the variable w are used.
In S121, the initialization unit 121 initializes the counter t. Specifically, t is set to 0. The initialization unit 121 calculates the setup data.
In S1221, the first coefficient calculation unit 1221 calculates γ1t+1 which is the (t+1)-th update result of the first coefficient γ1, by using the following expression.
γ1t+1=∥γ2tT2∘(∂G1+∂G2)(zt)∥2/∥T1∘(∂G1+∂G2)(zt)∥2 [Math.37]
In S1222, the variable calculation unit 1222 calculates wt+1, which is the (t+1)-th update result of the variable w, by using the following expression.
In S1223, the first auxiliary variable calculation unit 1223 calculates xt+1, which is the (t+1)-th update result of the auxiliary variable x, by using the following expression.
xt+1=2wt+1−zt [Math. 39]
In S1224, the second coefficient calculation unit 1224 calculates γ2t+1, which is the (t+1)-th update result of the second coefficient γ2, by using the following expression.
γ2t+1=∥γ1t+1T1∘(∂G1+∂G2)(xt+1)∥2/∥T2∘(∂G1+∂G2)(xt+1)∥2 [Math.40]
In S1225, the second auxiliary variable calculation unit 1225 calculates yt+1, which is the (t+1)-th update result of the auxiliary variable y, by using the following expression.
In S1226, the third auxiliary variable calculation unit 1226 calculates zt+1, which is the (t+1)-th update result of the auxiliary variable z, by using a predetermined expression.
When B-P-R monotone operator splitting is used, the following expression is used.
zt+1=2yt+1−xt+1 [Math. 42]
When B-D-R monotone operator splitting is used, the following expression is used.
zt+1=(1−α)zt+α(2yt+1−xt+1) [Math. 43]
(Note that, α is a real number satisfying 0<α<1)
In S123, the counter update unit 123 increments the counter t by 1. Specifically, the increment is performed as follows: t ← t+1.
In S124, if the counter t reaches a predetermined number T of times of update (T is an integer of 1 or greater) (that is, if t reaches T, and an end condition is satisfied), the end condition determination unit 124 uses a value wT of the variable w at the time as an output value, and ends the processing. Otherwise, the processing returns to S1221. In other words, the variable update unit 120 repeats the calculation of S1221 to S124.
According to the invention of the present embodiment, variables being an optimization target can be optimized at high speed.
A variable optimization apparatus 200 will be described below with reference to
The variable optimization apparatus 200 optimizes variables p ∈ Rk and q ∈ Rm (k and m are each an integer of 1 or greater) being a target of optimization by using input data, and outputs the result of optimization as an output value. Here, the input data is data used for calculating a cost function H1(p)+H2(q) for optimizing the variables p and q. In the following description, it is assumed that each of the functions H1(p): Rk→R∪{∞} and H2(q): Rm→R∪{∞} constituting the cost function H1(p)+H2(q) for optimizing the variables p and q calculated by using the input data is a closed proper convex function. It is also assumed that the variables p and q are restrained by the constraint Ap+Bq=c, by using matrices A ∈ Rn×k and B ∈ Rn×m and a vector c ∈ Rn given in advance.
With reference to
In S220, the variable update unit 220 optimizes the variables p and q through a predetermined procedure by using input data, and outputs the result of optimization as an output value. The following will describe the variable update unit 220 configured as a configuration unit that recursively calculates the values of the variables p and q, based on the algorithm of
The variable update unit 220 will be described below with reference to
With reference to
∂G1(w)=A∂H1*(ATw)
∂G2(w)=B∂H2*(BTw)−c[Math.44]
and T1(w) and T2(w) represent functions defined by the following expressions.
T1(w)=∂G1(w)
T2(w)=∂G2(w) [Math. 45]
Here, for dual variables x and z ∈ Rn, dual variables ˜x and ˜z ∈ Rn defined by ˜x=∇D(x) and ˜z=∇D(z) are used.
In S221, the initialization unit 221 initializes the counter t. Specifically, t is set to 0. The initialization unit 221 calculates setup data used at the time of optimizing the variables p and q. For example, the initialization unit 221 calculates the cost functions H1(p) and H2(q) as the setup data.
In S2221, the first coefficient calculation unit 2221 calculates γ1t+1, which is the (t+1)-th update result of the first coefficient γ1.
zt=(∇D)−1(˜zt)
γ1t+1=∥γ2tT2∘(∂G1+∂G2)(zt)∥2/∥T1∘(∂G1+∂G2)(zt)∥2 [Math. 46]
In S2222, the first variable calculation unit 2222 calculates pt+1, which is the (t+1)-th update result of the variable p, by using the following expression.
In S2223, the first dual variable calculation unit 2223 calculates ˜xt+1, which is the (t+1)-th update result of the dual variable ˜x, by using the following expression.
˜xt+1=˜zt−2Apt+1 [Math. 48]
In S2224, the second coefficient calculation unit 2224 calculates γ2t+1, which is the (t+1)-th update result of the second coefficient γ2, by using the following expression.
xt+1=(∇D)−1(˜xt+1)
γ2t+1=∥γ1t+1T1∘(∂G1+∂G2)(xt+1)∥2/∥T2∘(∂G1+∂G2)(xt+1)∥2 [Math. 49]
In S2225, the second variable calculation unit 2225 calculates qt+1, which is the (t+1)-th update result of the variable q, by using the following expression.
In S2226, the second dual variable calculation unit 2226 calculates ˜zt+1, which is the (t+1)-th update result of the dual variable ˜z, by using a predetermined expression.
When B-P-R monotone operator splitting is used, the following expression is used.
˜zt+1=˜xt+1−2(Bqt+1−c) [Math. 51]
When B-D-R monotone operator splitting is used, the following expression is used.
˜zt+1=(1−α)˜zt+α(˜xt+1−2(Bqt+1−c)) [Math. 52]
(Note that α is a real number satisfying 0<α<1)
In S223, the counter update unit 223 increments the counter t by 1. Specifically, the increment is performed as follows: t ← t+1.
In S224, if the counter t reaches a predetermined number T of times of update (T is an integer of 1 or greater) (that is, if t reaches T, and an end condition is satisfied), the end condition determination unit 224 uses values pT and qT of the variables p and q at the time as output values, and ends the processing. Otherwise, the processing returns to S2221. In other words, the variable update unit 220 repeats the calculation of S2221 to S224.
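The loop S2221 to S2226 can be sketched as follows. This is a simplified illustration only: it assumes Euclidean D (so (∇D)−1=I), constant coefficients γ1=γ2=1 in place of the adaptive correction of S2221 and S2224, and the toy choice A=I, B=−I, c=0 with H1(p)=(p−s)²/2 and H2(q)=λ|q|, for which the p-step and q-step have closed forms.

```python
import numpy as np

s, lam = 3.0, 1.0   # H1(p) = (p - s)**2 / 2,  H2(q) = lam * |q|
soft = lambda v, t: np.sign(v) * max(abs(v) - t, 0.0)

z_t = 0.0                        # dual variable ~z
for _ in range(20):
    p = (s + z_t) / 2.0          # p-step: solves 0 = (p - s) - (~z - p)
    x_t = z_t - 2.0 * p          # ~x = ~z - 2 A p   (cf. expression (28))
    q = soft(-x_t, lam)          # q-step: solves 0 in lam*d|q| + (~x + q)
    z_t = x_t + 2.0 * q          # ~z = ~x - 2 (B q - c) with B = -I, c = 0

# The constrained problem min (p - s)**2 / 2 + lam*|q| s.t. p = q has the
# soft-threshold solution p = q = soft(3, 1) = 2.
print(p, q)   # 2.0 2.0
```

The B-D-R variant of S2226 would replace the last line of the loop with the α-averaged update of [Math. 52]; the adaptive γ coefficients would additionally rescale the metric at each half-step.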
According to the invention of the present embodiment, variables being an optimization target can be optimized at high speed.
Here, an embodiment corresponding to the algorithm of
A noise elimination apparatus 300 will be described below with reference to
The noise elimination apparatus 300 uses an observation image s to generate an output image from which noise has been eliminated, and outputs the output image. In this case, the output image is generated by optimizing the variable p ∈ Rk representing an image and the auxiliary variable q ∈ Rm of the variable p (k and m are each an integer of 1 or greater). Here, as the functions H1(p) and H2(q) constituting the cost function H1(p)+H2(q) for optimizing the variables p and q, the functions defined in the following expressions are used.
Here, μ and θ(>0) are predetermined coefficients.
It is assumed that the variables {p, q} are restrained by expression q=Φp (note that Φ is a square circulant matrix given in advance).
With reference to
In S320, the image update unit 320 uses an observation image s to optimize the variables p and q through a predetermined procedure, and outputs the result of optimization as an output image. The following will describe the image update unit 320 configured as a configuration unit that recursively calculates the values of the variables p and q, based on the algorithm of
The image update unit 320 will be described below with reference to
With reference to
∂G1(w)=Φ∂H1*(ΦTw)
∂G2(w)=−∂H2*(−w) [Math.54]
and T1(w) and T2(w) represent functions defined by the following expressions.
(Note that xi (i=1, . . . , n) represents the i-th element of x. Further, v (>0) is a predetermined constant, and it is assumed that v >μθ holds.) Here, for dual variables x and z ∈ Rn, dual variables ˜x and ˜z ∈ Rn defined by ˜x=∇D(x) and ˜z=∇D(z) are used.
F represents an n-dimensional DFT matrix and Ψ a diagonal matrix that satisfy Φ=FΨFT, and Ω represents a diagonal matrix that satisfies ΦΦT+ξI=FΩFT.
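The computational benefit of this diagonalization can be illustrated as follows (a NumPy sketch; the example matrix is the difference operator used above): the eigenvalues of a circulant matrix are the DFT of its first column, so the product Φp reduces to a pair of FFTs instead of an O(n²) matrix product.

```python
import numpy as np

n = 8
c = np.zeros(n); c[1] = 1.0; c[n - 1] = -1.0   # first column of circulant Phi
Phi = np.stack([np.roll(c, j) for j in range(n)], axis=1)

p = np.random.default_rng(0).standard_normal(n)

# Diagonalization: the eigenvalues of a circulant matrix are the DFT of its
# first column, so Phi @ p is a circular convolution computable with FFTs.
psi = np.fft.fft(c)                     # diagonal entries of Psi
q = np.fft.ifft(psi * np.fft.fft(p)).real
assert np.allclose(Phi @ p, q)
print("circulant product matches FFT computation")
```

The same device applies to (ΦΦT+ξI)−1, whose inverse is obtained by inverting the diagonal Ω elementwise in the DFT domain.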
In S321, the initialization unit 321 initializes the counter t. Specifically, t is set to 0. The initialization unit 321 calculates setup data used at the time of optimizing the variables p and q. For example, the initialization unit 321 calculates the cost functions H1(p) and H2(q) as the setup data.
In S3221, the first coefficient calculation unit 3221 calculates γ1t+1, which is the (t+1)-th update result of the first coefficient γ1.
zt=(∇D)−1(˜zt)
γ1t+1=∥γ2tT2∘(∂G1+∂G2)(zt)∥2/∥T1∘(∂G1+∂G2)(zt)∥2 [Math. 56]
In S3222, the first variable calculation unit 3222 calculates pt+1, which is the (t+1)-th update result of the variable p, by using the following expression.
In S3223, the first dual variable calculation unit 3223 calculates ˜xt+1, which is the (t+1)-th update result of the dual variable ˜x, by using the following expression.
˜xt+1=˜zt−2FΨFTpt+1 [Math. 58]
In S3224, the second coefficient calculation unit 3224 calculates γ2t+1, which is the (t+1)-th update result of the second coefficient γ2, by using the following expression.
xt+1=(∇D)−1(˜xt+1)
γ2t+1=∥γ1t+1T1∘(∂G1+∂G2)(xt+1)∥2/∥T2∘(∂G1+∂G2)(xt+1)∥2 [Math. 59]
In S3225, the second variable calculation unit 3225 calculates q^{t+1}, which is the (t+1)-th update result of the variable q, by using the following expression.
Note that ˜x^{t+1} = [˜x_1^{t+1}, …, ˜x_n^{t+1}]^T.
In S3226, the second dual variable calculation unit 3226 calculates ˜z^{t+1}, which is the (t+1)-th update result of the dual variable ˜z, by using a predetermined expression.
When B-P-R monotone operator splitting is used, the following expression is used.
˜z^{t+1} = ˜x^{t+1} + 2q^{t+1}  [Math. 61]
When B-D-R monotone operator splitting is used, the following expression is used.
˜z^{t+1} = (1 − α)˜z^t + α(˜x^{t+1} + 2q^{t+1})  [Math. 62]
(Note that α is a real number satisfying 0 < α < 1.)
In S323, the counter update unit 323 increments the counter t by 1. Specifically, the update is performed as t ← t + 1.
In S324, if the counter t reaches a predetermined number of updates T (T is an integer of 1 or greater) (that is, if t reaches T and an end condition is satisfied), the end condition determination unit 324 uses the value p^T of the variable p at that time as the output image, and ends the processing. Otherwise, the processing returns to S3221. In other words, the image update unit 320 repeats the calculations of S3221 to S324.
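The control flow of S321 through S324 can be sketched as a simple iteration skeleton. The concrete update expressions for p, q, and the dual variables are abstracted into a caller-supplied function (`update_step` below is a hypothetical placeholder standing in for the updates of S3221 to S3226, Math. 56 to 62; the initialization of p and q from the observation image is likewise an illustrative choice, not fixed by the specification). The sketch shows only the counter and end-condition logic.

```python
import numpy as np

def image_update(s: np.ndarray, T: int, update_step) -> np.ndarray:
    """Skeleton of S321-S324: initialize, iterate T times, return p^T.

    update_step(p, q, t) is a hypothetical placeholder for S3221-S3226
    (Math. 56-62); it returns the new pair (p, q).
    """
    # S321: initialization (t = 0; starting p from the observation image
    # is an illustrative choice here).
    t = 0
    p = s.copy()
    q = np.zeros_like(s)
    # S3221-S324: repeat until the counter reaches T.
    while t < T:
        p, q = update_step(p, q, t)   # S3221-S3226
        t += 1                        # S323: t <- t + 1
    return p                          # S324: output p^T

# Usage with a trivial placeholder step (illustrative only).
s = np.ones(4)
out = image_update(s, T=3, update_step=lambda p, q, t: (p * 0.5, q))
print(out)  # each step halves p: [0.125 0.125 0.125 0.125]
```

Inside `update_step`, choosing [Math. 61] or [Math. 62] for the ˜z update selects B-P-R or B-D-R monotone operator splitting, respectively; the surrounding loop is identical in both cases.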
According to the invention of the present embodiment, an image obtained by eliminating noise from an observation image can be generated at high speed.
Supplement
The apparatus according to the present invention includes, for example, as a single hardware entity, an input unit to which a keyboard or the like can be connected, an output unit to which a liquid crystal display or the like can be connected, a communication unit to which a communication apparatus (for example, a communication cable) capable of communication with the outside of the hardware entity can be connected, a Central Processing Unit (CPU, which may include a cache memory, a register, and the like), a RAM or a ROM serving as a memory, an external storage apparatus such as a hard disk, and a bus connected for data exchange among the input unit, the output unit, the communication unit, the CPU, the RAM, the ROM, and the external storage apparatus. An apparatus (drive) capable of reading from and writing to a recording medium such as a CD-ROM may be provided in the hardware entity as necessary. An example of a physical entity including such hardware resources is a general-purpose computer.
A program necessary to implement the above-described functions, data necessary for processing of this program, and the like are stored in the external storage apparatus of the hardware entity (the present invention is not limited to the external storage apparatus; for example, the program may be read out and stored in a ROM that is a dedicated storage apparatus). For example, data obtained by the processing of the program is appropriately stored in a RAM, the external storage apparatus, or the like.
In the hardware entity, each program and data necessary for the processing of each program stored in the external storage apparatus (or a ROM, for example) are read into a memory as necessary and appropriately interpreted, executed, or processed by a CPU. As a result, the CPU implements a predetermined function (each of components represented by xxx unit, xxx means, or the like).
The present invention is not limited to the above-described embodiment, and appropriate changes can be made without departing from the spirit of the present invention. The processing described in the embodiments is not only executed in time series in the described order, but may also be executed in parallel or individually according to the processing capability of the apparatus that executes the processing, or as necessary.
As described above, when a processing function in the hardware entity (the apparatus of the present invention) described in the embodiment is implemented by a computer, processing content of a function that the hardware entity should have is described by a program. By executing this program using the computer, the processing function in the hardware entity is implemented on the computer.
The program in which the processing details are described can be recorded on a computer-readable recording medium. The computer-readable recording medium, for example, may be any type of medium such as a magnetic recording apparatus, an optical disc, a magneto-optical recording medium, or a semiconductor memory. Specifically, for example, a hard disk apparatus, a flexible disk, a magnetic tape, or the like can be used as a magnetic recording apparatus, a Digital Versatile Disc (DVD), a DVD-Random Access Memory (RAM), a Compact Disc Read Only Memory (CD-ROM), CD-R (Recordable)/RW (ReWritable), or the like can be used as an optical disc, a Magneto-Optical disc (MO) or the like can be used as a magneto-optical recording medium, and an Electronically Erasable and Programmable-Read Only Memory (EEP-ROM) or the like can be used as a semiconductor memory.
In addition, the program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or a CD-ROM with the program recorded on it. Further, the program may be stored in a storage apparatus of a server computer and transmitted from the server computer to another computer via a network so that the program is distributed.
For example, a computer executing the program first temporarily stores the program recorded on the portable recording medium or the program transmitted from the server computer in its own storage apparatus. When executing the processing, the computer reads the program stored in its own storage apparatus and executes the processing in accordance with the read program. Further, as another execution mode of this program, the computer may directly read the program from the portable recording medium and execute processing in accordance with the program, or, further, may sequentially execute the processing in accordance with the received program each time the program is transferred from the server computer to the computer. In addition, another configuration to execute the processing through a so-called application service provider (ASP) service in which processing functions are implemented just by issuing an instruction to execute the program and obtaining results without transmitting the program from the server computer to the computer may be employed. Further, the program in this mode is assumed to include information which is provided for processing of a computer and is equivalent to a program (data or the like that has characteristics of regulating processing of the computer rather than being a direct instruction to the computer).
Although the hardware entity is configured by a predetermined program being executed on the computer in the present embodiment, at least a part of the processing content of the hardware entity may be implemented in hardware.
The foregoing description of the embodiments of the present invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise forms disclosed.
Modifications and variations are possible in light of the above teachings. The embodiments were chosen and described in order to best explain the principles of the present invention, and to enable those skilled in the art to utilize the present invention in various embodiments and with various modifications suited to the particular use contemplated. All such modifications and variations are within the scope of the present invention as defined by the appended claims, when interpreted in accordance with the breadth to which they are fairly, legally, and equitably entitled.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2019/036692 | 9/19/2019 | WO |