INFORMATION PROCESSING DEVICE AND OPTIMIZATION METHOD

Information

  • Patent Application
  • 20200401651
  • Publication Number
    20200401651
  • Date Filed
    May 29, 2020
  • Date Published
    December 24, 2020
Abstract
An information processing device includes: a memory configured to hold values of state variables included in an evaluation function representing energy and a weight value for each set of the state variables; and a processor coupled to the memory and configured to: calculate an energy change value when each of the values of the state variables is set as a next change candidate based on the values of the state variables and the weight value; calculate a total energy change value by adding a penalty value according to an excess amount violating an inequality constraint, to each of the energy change values calculated for the state variables, the excess amount being calculated based on a coupling coefficient and a threshold value; and change any value of the state variables in the memory based on a set temperature value, a random number value, and the total energy change values.
Description
CROSS-REFERENCE TO RELATED APPLICATION

This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2019-112547, filed on Jun. 18, 2019, the entire contents of which are incorporated herein by reference.


FIELD

The embodiment is related to an optimization device and an optimization method.


BACKGROUND

As a method of solving a multivariable optimization problem which is not easily handled by a von Neumann-type computer, there is an optimization device (also referred to as an Ising machine or a Boltzmann machine) using an Ising-type energy function (also referred to as an evaluation function or an objective function). The optimization device solves a problem to be calculated by replacing the problem with an Ising model, which is a model representing the behavior of spins in a magnetic material.


Related art is disclosed in Japanese Laid-open Patent Publication No. 2018-41351, International Publication Pamphlet No. WO 2017/017807 and International Publication Pamphlet No. WO 2017/056366.


SUMMARY

According to an aspect of the embodiments, an information processing device includes: a memory configured to hold values of a plurality of state variables included in an evaluation function representing energy and a weight value for each set of the state variables; and a processor coupled to the memory and configured to: calculate an energy change value when each of the values of the plurality of state variables is set as a next change candidate based on the values of the plurality of state variables and the weight value, in a case where any value of the plurality of state variables changes; calculate a total energy change value by adding a penalty value according to an excess amount violating an inequality constraint, to each of a plurality of the energy change values calculated for the plurality of state variables, the excess amount being calculated based on a coupling coefficient indicating a weight of each of the plurality of state variables in the inequality constraint and a threshold value; and change any value of the plurality of state variables held in the memory based on a set temperature value, a random number value, and a plurality of the total energy change values.


The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.


It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.





BRIEF DESCRIPTION OF DRAWINGS


FIG. 1 is a diagram illustrating an optimization device according to a first embodiment.



FIGS. 2A and 2B are diagrams illustrating examples of functions E(x) and C(x).



FIG. 3 is a diagram illustrating an example of a relationship between variables.



FIG. 4 is a diagram illustrating a circuit configuration example of the optimization device.



FIG. 5 is a diagram illustrating a block configuration example of a circuit of the optimization device.



FIG. 6 is a diagram illustrating a circuit configuration example of a determination unit.



FIG. 7 is a diagram illustrating a circuit configuration example of a selection unit.



FIG. 8 is a diagram illustrating a calculation example of ΔE and ΔC for each variable.



FIG. 9 is a diagram illustrating a circuit configuration example for calculating energy corresponding to the function E(x).



FIG. 10 is a flowchart illustrating an operation example of the optimization device.



FIGS. 11A and 11B are diagrams illustrating an example of comparison between solution results.



FIG. 12 is a diagram illustrating an example of an optimization system.



FIG. 13 is a diagram illustrating another circuit configuration example of the optimization device.



FIG. 14 is a diagram illustrating an example of an optimization device according to a second embodiment.





DESCRIPTION OF EMBODIMENTS

The optimization device may also be modeled by using, for example, a neural network. In this case, each of a plurality of state variables corresponding to a plurality of spins included in the Ising model functions as a neuron which outputs 0 or 1 according to a value of another state variable and a weight coefficient indicating a magnitude of an interaction between the other state variable and its own state variable. The optimization device obtains, as a solution, a combination of values of the respective state variables from which a minimum value of the energy function (hereinafter, referred to as energy) described above is obtained, by using a stochastic search method such as simulated annealing.


Here, there is a proposal for an optimization device which calculates a combination of values of each state variable having minimum energy by performing simulated annealing by using, for example, a digital circuit. In addition, there is a proposal for an information processing device including an Ising chip that simulates an interaction between respective spins in the Ising model and an information processing unit that controls the Ising chip.


There is a proposal for an optimization system which derives an optimal solution by reducing a binary quadratic programming problem to a semi-positive definite programming problem, deriving a solution of the semi-positive definite programming problem, and converting the solution of the derived semi-positive definite programming problem into a solution of the binary quadratic programming problem.


An optimization device described above minimizes an evaluation function represented by a quadratic form of state variables. Meanwhile, in an optimization problem, a constraint condition (inequality constraint) represented by an inequality, such as an upper limit or a lower limit on a certain amount (for example, a loading amount or a budget), may be imposed. However, it may be difficult to formulate an inequality constraint in the quadratic form of the state variables, and therefore difficult for the optimization device to solve a problem including such a constraint.


In one aspect, an optimization device and an optimization method which efficiently solve a problem including an inequality constraint may be provided.


Hereinafter, the present embodiments will be described with reference to the drawings.


First Embodiment

A first embodiment will be described.



FIG. 1 is a diagram illustrating an optimization device according to a first embodiment.


An optimization device 10 searches for the values of the state variables (the ground state) at which an energy function takes a minimum value, among combinations (states) of respective values of a plurality of state variables corresponding to a plurality of spins included in an Ising model obtained by converting a problem to be calculated. The state variable may also be referred to as a “binary variable” or a “bit (spin bit)”.


When an optimization problem does not include an inequality constraint, the optimization device 10 may perform a search for the ground state based on an energy function E(x) as follows. The Ising-type energy function E(x) is defined by, for example, Math. 1.









[Math. 1]

$$E(x) = -\sum_{(i,j)} W_{ij} x_i x_j - \sum_{i} b_i x_i \quad (1)$$







The optimization problem of a function of a quadratic form of the unconstrained state variables (binary variables) based on Math. 1 may be referred to as quadratic unconstrained binary optimization (QUBO). In addition, formulation in the quadratic form may also be referred to as a QUBO form.


A first term on the right side of Math. 1 is obtained by summing the products of values of two state variables and a weight value without omission and duplication for all combinations of two state variables that are selectable from all state variables. xi is an i-th state variable. xj is a j-th state variable. A weight value Wij indicates a weight (for example, a strength of coupling) between the i-th state variable and the j-th state variable. For a matrix W={Wij}, Wij=Wji and Wii=0 are often satisfied. A subscript i added to a variable, such as the state variable xi, is identification information of the variable and is called an index.


The second term on the right side of Math. 1 is a sum of products of bias values of all the state variables and values of the state variables. bi indicates a bias value for an i-th state variable.


For example, a spin “−1” in the Ising model corresponds to a value “0” of the state variable. A spin “+1” in the Ising model corresponds to a value “1” of the state variable.
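As a non-limiting illustration, the evaluation of Math. 1 may be sketched in Python as follows. The function name ising_energy and the three-variable weights and biases are hypothetical values chosen only for this sketch; the first term is summed once per pair (i<j), as described above.

```python
from itertools import combinations


def ising_energy(W, b, x):
    """Evaluate E(x) = -sum_{i<j} W[i][j]*x[i]*x[j] - sum_i b[i]*x[i] (Math. 1)."""
    n = len(x)
    # Each unordered pair (i, j) is visited exactly once: no omission, no duplication.
    pair_term = sum(W[i][j] * x[i] * x[j] for i, j in combinations(range(n), 2))
    bias_term = sum(b[i] * x[i] for i in range(n))
    return -pair_term - bias_term


# Hypothetical instance: symmetric W with a zero diagonal.
W = [[0, 2, -1],
     [2, 0, 3],
     [-1, 3, 0]]
b = [1, -2, 0]

print(ising_energy(W, b, [1, 0, 1]))  # 0
print(ising_energy(W, b, [1, 1, 1]))  # -3
```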


If a value of a state variable xi changes to become 1−xi, an increase amount of the state variable xi is represented by δxi=(1−xi)−xi=1−2xi. Therefore, an energy change ΔEi accompanying a spin inversion (change in value) of the state variable xi is expressed by Math. 2 for the energy function E(x).









[Math. 2]

$$\Delta E_i = E(x)\big|_{x_i \to 1 - x_i} - E(x) = -\delta x_i \left( \sum_{j} W_{ij} x_j + b_i \right) = -\delta x_i h_i = \begin{cases} -h_i & \text{for } x_i = 0 \to 1 \\ +h_i & \text{for } x_i = 1 \to 0 \end{cases} \quad (2)$$







hi is referred to as a local field and is represented by Math. 3.









[Math. 3]

$$h_i = \sum_{j} W_{ij} x_j + b_i \quad (3)$$







A change amount δhi(j) of the local field hi at the time of change in the state variable xj (bit inversion) is represented by Math. 4.









[Math. 4]

$$\delta h_i^{(j)} = \begin{cases} +W_{ij} & \text{for } x_j = 0 \to 1 \\ -W_{ij} & \text{for } x_j = 1 \to 0 \end{cases} \quad (4)$$







The optimization device 10 includes a register (not illustrated) for storing the local field hi and adds the change amount δhi(j) to hi when the value of the state variable xj changes, thereby, obtaining hi corresponding to a state after the bit inversion.
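As a non-limiting illustration, the relationship among Math. 2 to Math. 4 may be sketched in Python as follows. The function names and numerical values are hypothetical and are used only for this sketch, and a symmetric matrix W with Wii=0 is assumed. The sketch keeps the local fields in a list that plays the role of the register, and adds the change amount of Math. 4 to each entry when a bit is inverted.

```python
def local_fields(W, b, x):
    """Math. 3: h_i = sum_j W[i][j]*x[j] + b[i]."""
    n = len(x)
    return [sum(W[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]


def delta_energy(h, x, i):
    """Math. 2: dE_i = -(1 - 2*x_i) * h_i."""
    return -(1 - 2 * x[i]) * h[i]


def flip_and_update(W, h, x, j):
    """Invert x[j] in place and add the change amount of Math. 4 to every h[i]."""
    dx = 1 - 2 * x[j]              # +1 for 0 -> 1, -1 for 1 -> 0
    x[j] += dx
    for i in range(len(h)):
        h[i] += W[i][j] * dx       # W[j][j] = 0, so h[j] itself is unchanged


# Hypothetical instance (symmetric W, zero diagonal).
W = [[0, 2, -1],
     [2, 0, 3],
     [-1, 3, 0]]
b = [1, -2, 0]
x = [1, 0, 1]
h = local_fields(W, b, x)          # [0, 3, -1]

print(delta_energy(h, x, 1))       # -3: inverting x[1] lowers the energy by 3
flip_and_update(W, h, x, 1)
print(h == local_fields(W, b, x))  # True: incremental update matches recomputation
```

Updating the stored local fields in this way avoids recomputing the sum of Math. 3 after every inversion.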


The optimization device 10 uses a Metropolis method or a Gibbs method in searching a ground state to determine whether or not to allow a state transition (change in a value of the state variable xi) in which an energy change is ΔEi. That is, the optimization device 10 stochastically allows not only a state where energy is lowered but also transition to a state where energy is increased in a neighbor search for searching for transition from a certain state to another state where energy is lower than energy of the state. For example, a probability (transition acceptance probability) A that accepts a change in the value of the state variable of the energy change ΔE is represented by Math. 5.









[Math. 5]

$$A(\Delta E) = \begin{cases} \min\left[1, \exp(-\beta \cdot \Delta E)\right] & \text{Metropolis} \\ 1 / \left[1 + \exp(\beta \cdot \Delta E)\right] & \text{Gibbs} \end{cases} \quad (5)$$







Here, β is a reciprocal (β=1/T) of a temperature T. A min operator indicates that a minimum value of an argument is taken. For example, when using the Metropolis method, Math. 6 is obtained by taking natural logarithms of both sides for A=exp (−β·ΔE) of Math. 5.





[Math. 6]





$$\ln(A) \cdot T = -\Delta E \quad (6)$$


Therefore, the optimization device 10 allows a change in a value of the corresponding state variable for a uniform random number u (0<u≤1) when the energy change ΔE satisfies Math. 7.





[Math. 7]





$$\ln(u) \cdot T \le -\Delta E \quad (7)$$


In addition, the optimization device 10 changes a value of any state variable that is allowed to change. The optimization device 10 repeatedly performs processing of changing the value of the state variable at each temperature while lowering a temperature T from an initial temperature to a lowest temperature, thereby, obtaining a solution to an optimization problem.
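As a non-limiting illustration, the acceptance test of Math. 7 and the repeated processing at each temperature may be sketched in Python as follows. The function names accept and anneal, the temperature schedule, and the uniform choice among the allowed candidates are assumptions made only for this sketch and are not the circuit configuration of the embodiment.

```python
import math
import random


def accept(delta_e, temperature, u):
    """Math. 7: allow the change when ln(u) * T <= -delta_e, for 0 < u <= 1."""
    return math.log(u) * temperature <= -delta_e


def anneal(W, b, x, temperatures, steps_per_temperature, rng):
    """Search sketch: lower T step by step and invert one allowed bit per step."""
    x = list(x)
    n = len(x)
    h = [sum(W[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]
    for T in temperatures:
        for _ in range(steps_per_temperature):
            allowed = []
            for i in range(n):
                u = 1.0 - rng.random()            # uniform random number in (0, 1]
                d_e = -(1 - 2 * x[i]) * h[i]      # Math. 2
                if accept(d_e, T, u):
                    allowed.append(i)
            if not allowed:
                continue
            j = rng.choice(allowed)               # assumed: uniform choice
            dx = 1 - 2 * x[j]
            x[j] += dx
            for i in range(n):
                h[i] += W[i][j] * dx              # Math. 4
    return x


# Hypothetical two-variable instance and temperature schedule.
state = anneal([[0, 2], [2, 0]], [1, 1], [0, 0], [2.0, 1.0, 0.5], 10, random.Random(0))
print(state)
```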


An inequality constraint may be added to the optimization problem. Thus, the optimization device 10 provides an operation function for the inequality constraint. The optimization device 10 includes a state holding unit 11, an energy change calculation unit 12, a penalty addition unit 13, an update control unit 14, a constraint excess amount calculation unit 15, and a constraint identification sign input unit 16. The optimization device 10 is realized by using, for example, a semiconductor integrated circuit such as a field-programmable gate array (FPGA). The optimization device 10 may be realized as a semiconductor chip.


The state holding unit 11 holds values of a plurality of state variables included in an evaluation function (energy function E(x) described above) representing energy and a weight value of each set of the state variables. In FIG. 1, the values of the plurality of state variables are represented by a state vector x. For example, the number of the plurality of state variables is n (n is an integer greater than or equal to 2) (that is, the state vector x is n bits).


If any value of a plurality of state variables held in the state holding unit 11 changes, the energy change calculation unit 12 calculates a change value of energy in a case where each of the values of the plurality of state variables is set as a next change candidate based on the values of the plurality of state variables and the weight values. For example, the energy change calculation unit 12 obtains a change value ΔEj of the energy in a case where the change candidate is a state variable xj, for each of j=1 to n.


The penalty addition unit 13 calculates a change value (ΔEj+ΔCj) of a total energy by adding a penalty value ΔCj according to an excess amount (constraint excess amount) violating an inequality constraint to each of a plurality of energy change values ΔEj calculated for the plurality of state variables. The constraint excess amount is a value calculated based on a coupling coefficient indicating a weight of each of the plurality of state variables in the inequality constraint and a threshold value. For example, a change value of the constraint excess amount corresponding to the state variable xj of the change candidate is calculated by the constraint excess amount calculation unit 15 and supplied to the penalty addition unit 13. The penalty addition unit 13 obtains a penalty value ΔCj based on the change value of the constraint excess amount corresponding to the state variable xj, and adds the penalty value ΔCj to ΔEj. For example, the penalty addition unit 13 may calculate the penalty value ΔCj by multiplying the change value of the constraint excess amount by a predetermined coefficient representing importance for the inequality constraint.


The update control unit 14 changes any one value of the plurality of state variables held in the state holding unit 11 based on a set temperature value, a random number value, and change values of the plurality of total energies. The update control unit 14 may also determine stochastically whether or not to accept any one of the plurality of state variables by a relative relationship between the change values of the plurality of total energies and thermal excitation energy (thermal noise). The update control unit 14 includes a transition propriety determination unit 14a and a selection unit 14b.


The transition propriety determination unit 14a outputs a flag Fj indicating whether or not to allow a state transition corresponding to a change value (ΔEj+ΔCj) of the total energy based on the temperature value T, a random number value u, and the change value (ΔEj+ΔCj) of the total energy. For example, when the state transition due to a change in the value of the state variable xj is allowed, Fj=1 and when the state transition due to a change in the value of the state variable xj is not allowed, Fj=0. The temperature value T is input to the transition propriety determination unit 14a by a control unit (or a control circuit) not illustrated in the drawing. Here, determination by the transition propriety determination unit 14a is performed based on Math. 8 in which ΔE of Math. 7 is replaced with a change value (ΔEj+ΔCj) of the total energy.





[Math. 8]





$$\ln(u) \cdot T \le -(\Delta E + \Delta C) \quad (8)$$


The selection unit 14b selects one state transition number (index) corresponding to Fj=1 among Fj (j=1 to n) output by the transition propriety determination unit 14a. The selection unit 14b outputs the selected index to the state holding unit 11, thereby changing a value of any one of the plurality of state variables held in the state holding unit 11. For example, the state holding unit 11 changes (bit inversion) the value of the state variable corresponding to the index received from the selection unit 14b. By doing so, the value of the state variable held in the state holding unit 11 is updated in synchronization with a clock signal (clk) input to the state holding unit 11.
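As a non-limiting illustration, the cooperation of the transition propriety determination unit 14a, the selection unit 14b, and the state holding unit 11 may be sketched in Python as follows. The function names are hypothetical, and the uniform random choice among the indices with Fj=1 is an assumption made only for this sketch.

```python
import math
import random


def transition_flags(total_delta, temperature, rng):
    """Return F_j = 1 when ln(u) * T <= -(dE_j + dC_j) (Math. 8), else 0."""
    flags = []
    for d in total_delta:
        u = 1.0 - rng.random()     # uniform random number in (0, 1]
        flags.append(1 if math.log(u) * temperature <= -d else 0)
    return flags


def select_index(flags, rng):
    """Pick one index with F_j = 1 (assumed: uniformly at random), or None."""
    candidates = [j for j, f in enumerate(flags) if f == 1]
    return rng.choice(candidates) if candidates else None


def apply_update(x, index):
    """Bit inversion of the selected state variable, if any."""
    if index is not None:
        x[index] = 1 - x[index]


rng = random.Random(1)
x = [0, 1, 0]
# Hypothetical total energy change values dE_j + dC_j for the three candidates.
flags = transition_flags([-1.0, 100.0, -0.5], 1.0, rng)
apply_update(x, select_index(flags, rng))
print(flags, x)
```

In this sketch a candidate whose total energy change is not positive is always allowed, because ln(u) is never greater than 0.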


The constraint excess amount calculation unit 15 calculates a change value of a constraint excess amount when the state variable xj is set as a change candidate based on a threshold value of the inequality constraint and a coupling coefficient of the state variable xj for the inequality constraint and supplies the calculated change value to the penalty addition unit 13. The threshold value may be an upper limit value of the inequality constraint or a lower limit value thereof. The threshold value may be both the upper limit value and the lower limit value, and in this case, it may be considered that a larger value among the change values of the constraint excess amount for the upper limit value and the change value of the constraint excess amount for the lower limit value is adopted as the change value of the constraint excess amount violating the inequality constraint.


The constraint identification sign input unit 16 inputs a constraint identification sign to the constraint excess amount calculation unit 15. The constraint identification sign is information designating the upper limit value or the lower limit value used as a threshold value. The constraint identification sign is input to the constraint identification sign input unit 16 by the control unit described above.


In addition, a configuration is considered in which a maximum number of state variables that may be held by the optimization device 10 is set to N (N is an integer greater than n), n of the N variables are used as normal state variables, and K (K is an integer greater than or equal to 1) of the N variables are used as variables (inequality constraint variables) for representing inequality constraints. That is, N=n+K. K corresponds to the number of inequality constraints: if the number of inequality constraints is 1, K=1, and if the number of inequality constraints is 2, K=2.


In this case, the constraint identification sign input unit 16 designates a normal state variable belonging to a state vector and an inequality constraint variable not belonging to the state vector, among the N variables, by means of a constraint identification sign. Therefore, the constraint identification sign input unit 16 may be configured to input the constraint identification sign for the energy change calculation unit 12, the penalty addition unit 13, and the transition propriety determination unit 14a (FIG. 1 illustrates the configuration). By doing so, as will be described below, a penalty value ΔCi may be calculated by using the current value of hi stored in a register storing the local field hi corresponding to each of the K inequality constraint variables.


When n pieces are used as normal state variables and K pieces are used as inequality constraint variables among N variables, an optimization problem to which the inequality constraint is added may be formulated as follows. Here, a set of indices of the normal state variables is represented by ψ. The number of elements of the set ψ is n. A set of indices of the inequality constraint variables is represented as Ω (Ω={k1, k2, . . . , kK}). The number of elements of the set Ω is K.


An energy function to be minimized is represented by Math. 9 as the function E(x) of the n state variables.









[Math. 9]

$$E(x) = -\sum_{\substack{i,j \in \Psi \\ i < j}} W_{ij} x_i x_j - \sum_{i \in \Psi} b_i x_i \quad (9)$$







The K inequality constraints configured by n state variables are represented by Math. 10.









[Math. 10]

$$\begin{cases} h_i = \sum_{j \in \Psi} a_{ij} x_j & (i \in \Omega) \\ C_{li} \le h_i \le C_{ui} \end{cases} \quad (10)$$







The coupling coefficient aij indicates a weight of the state variable xj (j∈ψ) in the inequality constraint (expressed as the inequality constraint i) corresponding to the index i. Cli is the lower limit value of the inequality constraint i. Cui is the upper limit value of the inequality constraint i. However, only one of the lower limit value and the upper limit value may be provided as the inequality constraint.


At this time, an objective function Etot(x) corresponding to the total energy is represented by Math. 11.





[Math. 11]






$$E_{tot}(x) = E(x) + C(x) \quad (11)$$


The function C(x) is a sum of products of a constraint excess amount for an inequality constraint and an importance coefficient λi, for all inequality constraints and is represented by Math. 12.









[Math. 12]

$$C(x) = \sum_{i \in \Omega} \lambda_i \max\left[0,\ h_i - C_{ui},\ C_{li} - h_i\right] \quad (12)$$







Here, the importance coefficient λi is a real number greater than or equal to 0 and represents importance of the inequality constraint i. In addition, a max operator indicates a maximum value of the arguments. A change value of the function C(x) corresponds to the penalty value.
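As a non-limiting illustration, the function C(x) of Math. 12 may be sketched in Python as follows. The function names and the single loading-amount constraint are hypothetical and are used only for this sketch. A limit value that is not provided is represented by None, in which case the corresponding argument is left out of the max operation.

```python
def constraint_fields(a, x):
    """Math. 10: h_i = sum_j a[i][j] * x[j] for each inequality constraint i."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def excess(h, lower, upper):
    """Constraint excess amount max[0, h - C_u, C_l - h]; None means 'no limit'."""
    terms = [0]
    if upper is not None:
        terms.append(h - upper)
    if lower is not None:
        terms.append(lower - h)
    return max(terms)


def penalty(a, lower, upper, lam, x):
    """Math. 12: C(x) = sum_i lam[i] * excess_i."""
    h = constraint_fields(a, x)
    return sum(lam[i] * excess(h[i], lower[i], upper[i]) for i in range(len(a)))


# Hypothetical loading-amount constraint: 3*x0 + 4*x1 + 5*x2 <= 8, importance 2.
a = [[3, 4, 5]]
print(penalty(a, [None], [8], [2], [1, 1, 0]))  # 0: the constraint is satisfied
print(penalty(a, [None], [8], [2], [1, 1, 1]))  # 8: excess 4 times importance 2
```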



FIGS. 2A and 2B are diagrams illustrating examples of the functions E(x) and C(x).



FIG. 2A illustrates a graph G1 of the example of the function E(x). FIG. 2B illustrates a graph G2 of the example of the function C(x). Horizontal axes of the graphs G1 and G2 indicate a state x (a state represented by a state vector). A vertical axis of the graph G1 is E(x). A vertical axis of the graph G2 is C(x). For the objective function Etot(x) obtained by adding the function C(x) to the function E(x), state transitions are performed based on, for example, Math. 8 (Metropolis method), and thus a solution that minimizes the function E(x) while satisfying the inequality constraint (or minimizing the constraint excess for the inequality constraint) is obtained.



FIG. 3 is a diagram illustrating an example of a relationship between variables.


A variable group g1 indicates a set {xj} (j∈ψ) of state variables. A variable group g2 indicates a set {xk} (k∈Ω) of the inequality constraint variables.


A pair of state variables is bidirectionally coupled. That is, a relationship is established in which a local field of the other state variable of the pair is influenced by inversion of one state variable of the pair.


A pair of the state variable and the inequality constraint variable is unidirectional coupling from the state variable toward the inequality constraint variable. That is, a relationship is established in which the local field of the inequality constraint variable is influenced by the inversion of the state variable, but the local field of the state variable is not influenced by the inversion of the inequality constraint variable. A magnitude (weight) of influence of the state variable xj on the inequality constraint k is represented by a coupling coefficient akj. The coupling coefficient akj is given as an element of the matrix W described above. In addition, a bias bk of the inequality constraint variable is set to bk=0.


In order to implement a relationship between the state variable and the inequality constraint variable, in one example, the optimization device 10 sets a value of the inequality constraint variable xk to 0 and performs control so that the inversion of the inequality constraint variable does not occur, thereby suppressing the influence of the inequality constraint variable on the local field of the state variable. In another example, by setting the coupling coefficient akj to an asymmetric coupling coefficient (that is, akj≠0, ajk=0), the influence of the inequality constraint variable xk on the local field of the state variable may be suppressed. In yet another example, it is possible to control the transition propriety determination unit 14a to always output the flag Fk=0 for the inequality constraint variable xk based on the constraint identification sign.


The inequality constraint variable is not coupled to the other inequality constraint variables. That is, Wij and Wji (i, j∈Ω) are set to 0 for the pair of the inequality constraint variables.


The local field hk of the inequality constraint variable xk (k∈Ω) is represented by Math. 13.









[Math. 13]

$$h_k = \sum_{j \in \Psi} a_{kj} x_j \quad (13)$$







The inequality constraint variable does not influence energy E(x) of the state represented by the normal state variable, but a value of the local field hk corresponding to the inequality constraint variable xk is updated to a latest value according to inversion of the normal state variable. In addition, by comparing the local field hk with upper and lower limits of the inequality constraint, a satisfaction situation of the inequality constraint is obtained.
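As a non-limiting illustration, the unidirectional coupling and the local field of Math. 13 may be sketched in Python as follows. The function names and numerical values are hypothetical, and the sketch assumes the asymmetric setting (akj≠0, ajk=0) with bk=0, with the inequality constraint variables placed after the n normal state variables.

```python
def build_extended_weights(W, b, a):
    """Place a[k][j] in row n+k of an N x N matrix; the columns n..N-1 stay 0."""
    n, K = len(W), len(a)
    N = n + K
    W_ext = [[0] * N for _ in range(N)]
    for i in range(n):
        for j in range(n):
            W_ext[i][j] = W[i][j]          # bidirectional coupling of state variables
    for k in range(K):
        for j in range(n):
            W_ext[n + k][j] = a[k][j]      # state variable -> constraint variable only
    b_ext = list(b) + [0] * K              # bias of a constraint variable is 0
    return W_ext, b_ext


def local_fields(W, b, x):
    N = len(x)
    return [sum(W[i][j] * x[j] for j in range(N)) + b[i] for i in range(N)]


def constraint_satisfied(h_k, lower, upper):
    """Compare the local field h_k with the lower and upper limits."""
    return lower <= h_k <= upper


# Hypothetical instance: two state variables and one inequality constraint.
W = [[0, 1], [1, 0]]
b = [1, -1]
a = [[3, 4]]
W_ext, b_ext = build_extended_weights(W, b, a)
h = local_fields(W_ext, b_ext, [1, 1, 0])   # constraint variable held at 0
print(h)                                    # [2, 0, 7]
print(constraint_satisfied(h[2], 2, 8))     # True
```

Because the columns belonging to the inequality constraint variables are 0, the local fields of the normal state variables in this sketch do not depend on the value of an inequality constraint variable.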


As represented by Math. 11, the total energy Etot(x) is a sum of the QUBO term E(x) and the inequality constraint term (penalty term) C(x). The energy change ΔEtot j(x) of the total energy for the inversion of the normal state variable xj (j∈ψ) is represented by Math. 14.





[Math. 14]





ΔEtot j=ΔEj+ΔCj  (14)


ΔEj is generated based on the QUBO term, for example, by using an existing circuit configuration disclosed in Japanese Laid-open Patent Publication No. 2018-41351.


ΔCj is generated based on Math. 15 by using a current value of the local field hi of the inequality constraint variable xi (i∈Ω).









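[Math. 15]

$$\Delta C_j = \sum_{i \in \Omega} \lambda_i \left\{ \max\left[0,\ h_i + a_{ij}\delta x_j - C_{ui},\ C_{li} - a_{ij}\delta x_j - h_i\right] - \max\left[0,\ h_i - C_{ui},\ C_{li} - h_i\right] \right\} = \sum_{i \in \Omega} \lambda_i \Delta C_{ij} \quad (j \in \Psi) \quad (15)$$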







Here, ΔCij is represented by Math. 16.














[Math. 16]

$$\Delta C_{ij} = \max\left[0,\ h_i + a_{ij}\delta x_j - C_{ui},\ C_{li} - a_{ij}\delta x_j - h_i\right] - \max\left[0,\ h_i - C_{ui},\ C_{li} - h_i\right] \quad (16)$$







Here, δxj is a change amount of the state variable xj at the time of inversion of the state variable xj, and is represented by Math. 17.





[Math. 17]





δxj=(1−xj)−xj=1−2xj  (17)


That is, ΔCij is a change value of the constraint excess amount for the inequality constraint (inequality constraint i) of the index i accompanying the inversion of the state variable xj. In addition, the penalty value ΔCj is a weighted sum, according to the importance coefficient, of the change value of the constraint excess amount for each inequality constraint. When only the upper limit value is given by the inequality constraint i, an argument of “Cli−aijδxj−hi” or an argument of “Cli−hi” may not be included in a max operation of Math. 15 and Math. 16. In addition, when only the lower limit value is given by the inequality constraint i, an argument of “hi+aijδxj−Cui” or an argument of “hi−Cui” may not be included in the max operation of Math. 15 and Math. 16. When an equality constraint is represented as the constraint i, Cui may be equal to Cli in Math. 15 and Math. 16.
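As a non-limiting illustration, Math. 15 to Math. 17 may be sketched in Python as follows. The function names and numerical values are hypothetical and are used only for this sketch. Both limit values are assumed to be given here; representing a missing limit by math.inf or -math.inf is an assumption of the sketch that has the same effect as leaving the corresponding argument out of the max operation.

```python
import math


def excess(h, lower, upper):
    """Constraint excess amount max[0, h - C_u, C_l - h]."""
    return max(0, h - upper, lower - h)


def delta_c_ij(h_i, a_ij, x_j, lower, upper):
    """Math. 16: change of the excess amount of constraint i when x_j is inverted."""
    dx = 1 - 2 * x_j                                  # Math. 17
    return excess(h_i + a_ij * dx, lower, upper) - excess(h_i, lower, upper)


def penalty_delta(h, a, x, j, lower, upper, lam):
    """Math. 15: dC_j = sum_i lam[i] * dC_ij, using the current local fields h."""
    return sum(
        lam[i] * delta_c_ij(h[i], a[i][j], x[j], lower[i], upper[i])
        for i in range(len(a))
    )


# Hypothetical constraint: 2 <= 3*x0 + 4*x1 + 5*x2 <= 8, importance 2.
a = [[3, 4, 5]]
lower, upper, lam = [2], [8], [2]
x = [1, 1, 0]
h = [7]                                               # current value of h_i for x

print(penalty_delta(h, a, x, 2, lower, upper, lam))   # 8: h becomes 12, excess 4
print(penalty_delta(h, a, x, 0, lower, upper, lam))   # 0: h becomes 4, no excess
```

Only the current local field hi, the coupling coefficient aij, and the limit values are needed, so the penalty value can be obtained without recomputing the sum of Math. 13.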


Next, an implementation example of the theory for the optimization device 10 will be further described. In the following, a maximum number of the state variables which may be held by the optimization device 10 is assumed to be set to N. In addition, an index belonging to Ω may be expressed by a subscript i of ki such that k1 is 1, k2 is 2, . . . , kK is K for the sake of convenience.



FIG. 4 is a diagram illustrating a circuit configuration example of the optimization device.


The energy change calculation unit 12 includes ΔE calculation units 12a1 to 12aN. Each of the ΔE calculation units 12a1 to 12aN calculates a change value ΔEj of energy when a j-th state variable supplied from the state holding unit 11 is set as an inversion candidate, based on Math. 2 to Math. 4 and outputs the calculated change value to the penalty addition unit 13. Current values of local fields h1 to hN are respectively supplied to the ΔE calculation units 12a1 to 12aN from registers (not illustrated) for storing the local fields h1 to hN.


Although the ΔE calculation units 12a1 to 12aN include a ΔE calculation unit corresponding to the inequality constraint variable, the ΔE calculation unit corresponding to the inequality constraint variable outputs a relatively large value based on the constraint identification sign, thereby, suppressing the inversion of the inequality constraint variable, as will be described below.


The penalty addition unit 13 includes coefficient registers 13a1 to 13aK, multipliers 13b1 to 13bK, adders 13c11 to 13c1N, . . . , 13cK1 to 13cKN. Here, although K coefficient registers and multipliers corresponding to the indices of the inequality constraint variables such as the coefficient registers 13a1 to 13aK and the multipliers 13b1 to 13bK are illustrated, the coefficient registers and the multipliers are respectively provided with N pieces. That is, other coefficient registers and other multipliers not corresponding to the indices of the inequality constraint variables are not illustrated. Likewise, although adders of K groups (or K rows) corresponding to the inequality constraint variables such as the adders 13c11 to 13c1N, . . . , 13cK1 to 13cKN are illustrated, a group of the adders is provided with N groups (or N rows). That is, other adder groups not corresponding to the index of the inequality constraint variables are not illustrated.


The coefficient registers 13a1 to 13aK hold the importance coefficients λ1 to λK in Math. 12 and Math. 15 and supply the importance coefficients to the multipliers 13b1 to 13bK corresponding to the coefficient registers 13a1 to 13aK.


The multipliers 13b1 to 13bK multiply ΔC1j to ΔCKj (j∈ψ) calculated by the constraint excess amount calculation unit 15 by the coefficients λ1 to λK and supply the multiplied values to the adders 13c11 to 13c1N, . . . , 13cK1 to 13cKN.


Specifically, the multiplier 13b1 multiplies ΔC11 by λ1 and supplies the multiplied value to the adder 13c11. In addition, the multiplier 13b1 multiplies ΔC12 by λ1 and supplies the multiplied value to the adder 13c12. Thereafter, likewise, the multiplier 13b1 multiplies ΔC1N by λ1 and supplies the multiplied value to the adder 13c1N.


The multiplier 13bK multiplies ΔCK1 by λK and supplies the multiplied value to the adder 13cK1. In addition, the multiplier 13bK multiplies ΔCK2 by λK, and supplies the multiplied value to the adder 13cK2. Thereafter, likewise, the multiplier 13bK multiplies ΔCKN by λK and supplies the multiplied value to the adder 13cKN. The same applies to the other multipliers.


The adders 13c11 to 13c1N, . . . , 13cK1 to 13cKN add λ1ΔC11 to λ1ΔC1N, . . . , λKΔCK1 to λKΔCKN supplied from the multipliers 13b1 to 13bK to ΔE1 to ΔEN calculated by the energy change calculation unit 12.


Specifically, the adder 13c11 adds λ1ΔC11 to ΔE1 supplied from the ΔE calculation unit 12a1 and supplies the addition result to an adder (not illustrated) in a next stage corresponding to index 1. The adder 13c12 adds λ1ΔC12 to ΔE2 supplied from the ΔE calculation unit 12a2 and supplies the addition result to an adder in the next stage corresponding to index 2. Thereafter, likewise, the adder 13c1N adds λ1ΔC1N to ΔEN supplied from the ΔE calculation unit 12aN and supplies the addition result to an adder in the next stage corresponding to index N.


The adder 13cK1 adds λKΔCK1 to a value supplied from an adder (not illustrated) in the previous stage corresponding to the index 1 and outputs the addition result to the transition propriety determination unit 14a. The adder 13cK2 adds λKΔCK2 to a value supplied from an adder in the previous stage corresponding to the index 2 and outputs the addition result to the transition propriety determination unit 14a. Thereafter, likewise, the adder 13cKN adds λKΔCKN to a value supplied from an adder in the previous stage corresponding to the index N and outputs the addition result to the transition propriety determination unit 14a.
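The adder rows described above can be mimicked in software as follows. This is only a minimal sketch for illustration, not the circuit itself; the function name `total_energy_change` and the list-based data layout are assumptions introduced for the example.

```python
def total_energy_change(delta_e, delta_c, lam):
    """Return dE_j + sum_k lam_k * dC_kj for every index j.

    delta_e: list of N energy change values (dE_1 .. dE_N)
    delta_c: K rows of N values, delta_c[k][j] holding dC_kj
    lam:     list of K importance coefficients (lam_1 .. lam_K)
    """
    totals = list(delta_e)  # values entering the first adder row
    for lam_k, row in zip(lam, delta_c):
        # One adder row: add lam_k * dC_kj to the running value of each index j.
        totals = [t + lam_k * c for t, c in zip(totals, row)]
    return totals
```

Each pass of the loop plays the role of one row of adders, so the value leaving the last row corresponds to the total energy change value passed to the transition propriety determination unit 14a.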


The transition propriety determination unit 14a includes determination units 14a1 to 14aN. Each of the determination units 14a1 to 14aN stochastically determines whether or not to allow inversion of the state variable xj by using Math. 8, based on ΔE1+ΔC1 to ΔEN+ΔCN supplied from the penalty addition unit 13, and supplies flags F1 to FN indicating whether or not to invert to the selection unit 14b. A detailed circuit configuration of the determination units 14a1 to 14aN will be described below.


The selection unit 14b selects a state variable to be inverted based on the flags F1 to FN supplied from the transition propriety determination unit 14a and supplies the index of the state variable to the state holding unit 11. A detailed circuit configuration of the selection unit 14b will be described below.


The constraint excess amount calculation unit 15 includes ΔC calculation units 15a1 to 15aK. The ΔC calculation units 15a1 to 15aK calculate ΔC1j to ΔCKj (j∈ψ) based on Math. 16 and supply the calculated values to the penalty addition unit 13.


Specifically, the ΔC calculation unit 15a1 supplies ΔC11 to ΔC1N to the multiplier 13b1. Thereafter, likewise, the ΔC calculation unit 15aK supplies ΔCK1 to ΔCKN to the multiplier 13bK.


As described above, a function of the optimization device 10 may be realized by an electronic circuit. For example, the state holding unit 11, the energy change calculation unit 12, the penalty addition unit 13, the transition propriety determination unit 14a, the selection unit 14b, and the constraint excess amount calculation unit 15 may also be referred to as a state holding circuit, an energy change calculation circuit, a penalty addition circuit, a transition propriety determination circuit, a selection circuit, and a constraint excess amount calculation circuit, respectively.



FIG. 5 is a diagram illustrating a block configuration example of a circuit of the optimization device.


Circuit elements illustrated in FIG. 4 may be implemented separately in a block Bi corresponding to each of the indices i=1, 2, . . . , N. FIG. 5 illustrates the circuit configuration example of a j-th block Bj of the optimization device 10.


The block Bj includes a ΔE calculation unit 12aj, coefficient registers 13a1j to 13aKj, multipliers 13b1j to 13bKj, adders 13c1j to 13cKj, a determination unit 14aj, and ΔC calculation units 15a1j to 15aKj.


The ΔE calculation unit 12aj calculates ΔEj based on a value of the state variable xj corresponding to the index j and a value of the local field hj and supplies the calculated value to the adder 13c1j.


The coefficient registers 13a1j to 13aKj hold the coefficients λ1 to λK. The coefficient registers 13a1j to 13aKj need not be provided for each block and may be registers shared among the blocks.


The multipliers 13b1j to 13bKj respectively multiply λ1 to λK held in the coefficient registers 13a1j to 13aKj by ΔC1j to ΔCKj supplied from the ΔC calculation units 15a1j to 15aKj and supply the multiplied values to the adders 13c1j to 13cKj. The multipliers 13b1j to 13bKj are the circuit elements of the multipliers 13b1 to 13bK corresponding to the index j (for example, the multiplier 13b1j is the circuit element of the multiplier 13b1 corresponding to the index j).


The adder 13c1j adds λ1ΔC1j supplied from the multiplier 13b1j to ΔEj supplied from the ΔE calculation unit 12aj and supplies the addition result to an adder (not illustrated) in the next stage. Thereafter, likewise, λkΔCkj (k∈Ω) are sequentially added. The adder 13cKj adds λKΔCKj to a value supplied from an adder (not illustrated) in a previous stage and supplies the result (corresponding to ΔEj+ΔCj) to the determination unit 14aj.


The determination unit 14aj determines whether or not to perform inversion corresponding to the index j based on the change value ΔEj+ΔCj of the total energy and outputs the flag Fj according to the determination result to the selection unit 14b.


The ΔC calculation units 15a1j to 15aKj calculate ΔC1j to ΔCKj, respectively, and supply the calculated values to the multipliers 13b1j to 13bKj. The ΔC calculation units 15a1j to 15aKj are the circuit elements of the ΔC calculation units 15a1 to 15aK corresponding to the index j (for example, the ΔC calculation unit 15a1j is the circuit element of the ΔC calculation unit 15a1 corresponding to the index j).


Here, a constraint identification sign indicating whether the index j corresponds to a normal state variable or to an inequality constraint variable is input to the block Bj by the constraint identification sign input unit 16. The constraint identification sign includes an identification code for identifying no constraint (which does not correspond to the inequality constraint variable), constraint of only an upper limit, constraint of only a lower limit, or constraint of both the upper limit and the lower limit.


For example, an identification code di is represented by 2 bits (di1 and di2). The identification code di=00 indicates that the index i is an index of a normal state variable. The identification code di=01 indicates that the index i is an index of constraint of only the upper limit (or an inequality constraint variable corresponding to the constraint of only an upper limit). The identification code di=10 indicates that the index i is an index of constraint of only the lower limit (or an inequality constraint variable corresponding to the constraint of only a lower limit). The identification code di=11 indicates that the index i is an index of constraint of both the upper limit and the lower limit (or an inequality constraint variable corresponding to the constraint of both the upper limit and the lower limit).


The constraint identification sign also includes designation of an upper limit value and a lower limit value for the identification code. For example, the constraint identification sign becomes a set of (the identification code, the upper limit value, and the lower limit value).


In a case of the equality constraint, the identification code di=11 is used, and the upper limit value and the lower limit value are matched as described above. The ΔE calculation unit 12aj calculates ΔEj for the index j according to the identification code. The ΔC calculation units 15a1j to 15aKj calculate ΔC1j to ΔCKj for the index j according to the identification code. An example of a method for calculating ΔEj and ΔC1j to ΔCKj according to the identification code will be described below.
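The meaning of the 2-bit identification code can be summarized by a small lookup, sketched below. The function names and the representation of the code as a two-character string are assumptions made only for this illustration.

```python
def decode_identification_code(d):
    """Map a 2-bit identification code to (has_upper_limit, has_lower_limit)."""
    table = {
        "00": (False, False),  # normal state variable, no constraint
        "01": (True, False),   # constraint of only the upper limit
        "10": (False, True),   # constraint of only the lower limit
        "11": (True, True),    # both limits (equality when the limits coincide)
    }
    return table[d]


def is_inequality_constraint_variable(d):
    """True when at least one bit of the identification code is 1."""
    return any(decode_identification_code(d))
```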



FIG. 6 is a diagram illustrating a circuit configuration example of the determination unit.


The determination unit 14aj includes an offset value generation unit 21, a random number generation unit 22, a noise value generation unit 23, a sign inversion circuit 24, adders 25 and 26, and a comparator 27.


The offset value generation unit 21 generates an offset value Eoff (Eoff≥0) based on a flag indicating transition propriety output from the selection unit 14b and supplies the offset value to the adder 25. Specifically, when the flag output from the selection unit 14b indicates a possible transition, the offset value generation unit 21 resets the offset value to 0. When the flag output from the selection unit 14b indicates an impossible transition, the offset value generation unit 21 adds an increment value ΔEoff to the offset value. When the flag continuously indicates the impossible transition, the offset value generation unit 21 accumulates ΔEoff so that Eoff increases by ΔEoff each time.


The random number generation unit 22 generates a uniform random number u of 0<u≤1 and outputs the uniform random number u to the noise value generation unit 23.


The noise value generation unit 23 holds a predetermined conversion table according to a use rule (for example, the Metropolis method). The noise value generation unit 23 generates a value of −T·ln(u) corresponding to a noise value (thermal noise) based on Math. 8 according to the conversion table by using the uniform random number u and temperature information indicating the temperature T supplied by the control unit (or control circuit). The noise value generation unit 23 outputs the generated value of −T·ln(u) to the adder 26.


The sign inversion circuit 24 inverts a sign of the change value (ΔEj+ΔCj) of the total energy supplied from the adder 13cKj and supplies the sign-inverted value to the adder 25.


The adder 25 adds the offset value Eoff to −(ΔEj+ΔCj) supplied from the sign inversion circuit 24 and supplies the addition result to the adder 26.


The adder 26 adds the thermal noise −T·ln(u) to −(ΔEj+ΔCj)+Eoff supplied from the adder 25 and supplies the addition result to the comparator 27. The comparator 27 compares the evaluation value (−(ΔEj+ΔCj)+Eoff−T·ln(u)) output from the adder 26 with a threshold value (specifically, 0) to perform determination based on Math. 8. When the evaluation value is greater than or equal to 0, the comparator 27 outputs a flag (Fj=1) indicating the possible transition to the selection unit 14b. When the evaluation value is less than 0, the comparator 27 outputs a flag (Fj=0) indicating the impossible transition to the selection unit 14b.


Here, when the flag output from the selection unit 14b indicates the impossible transition, it is considered that a current state falls into a local solution. A state transition is easily allowed by adding Eoff to −(ΔEj+ΔCj) and gradually increasing Eoff by the offset value generation unit 21, and when the current state is in the local solution, an escape from the local solution is promoted.
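The determination described for FIG. 6 can be sketched in software as follows. This is a hedged illustration only: the function names are hypothetical, and the noise value is computed directly with a logarithm rather than read from a conversion table as in the circuit.

```python
import math


def evaluate_transition(delta_total, temperature, u, e_off=0.0):
    """Return the flag F_j (1 = possible transition, 0 = impossible transition).

    delta_total: total energy change value (dE_j + dC_j)
    temperature: temperature T
    u:           uniform random number with 0 < u <= 1
    e_off:       offset value Eoff (>= 0)
    """
    noise = -temperature * math.log(u)         # thermal noise -T*ln(u)
    evaluation = -delta_total + e_off + noise  # value leaving the adder 26
    return 1 if evaluation >= 0 else 0         # comparison with the threshold 0


def next_offset(e_off, selected_flag, increment):
    """Reset Eoff after a possible transition, otherwise grow it by the increment."""
    return 0.0 if selected_flag == 1 else e_off + increment
```

With `u = 1.0` the noise term vanishes, so a positive total energy change is rejected unless the accumulated offset is at least as large as that change, which is how the growing offset promotes an escape from a local solution.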


Next, a circuit configuration example of the selection unit 14b will be described.



FIG. 7 is a diagram illustrating the circuit configuration example of the selection unit.


The selection unit 14b includes a plurality of selector units coupled in a tree shape over a plurality of stages and random number bit generation units 32a, 32b, . . . , 32r. The random number bit generation units 32a to 32r are provided for each stage of the plurality of selector units coupled in the tree shape. Each of the random number bit generation units 32a to 32r generates a 1-bit random number having a value of 0 or 1 and supplies the generated random number to the selector units of the corresponding stage. The 1-bit random number is used to select one of a pair of input flags.


A pair of flags indicating transition propriety and output from the determination units 14a1 to 14aN is input to each of the selector units 31a1, 31a2, . . . , and 31ap in a first stage. For example, a pair of the flag output from the first determination unit 14a1 and the flag output from the second determination unit 14a2 is input to the selector unit 31a1. In addition, a pair of the flag output from the third determination unit and the flag output from the fourth determination unit is input to the selector unit 31a2. Thereafter, likewise, a pair of the flag output from the (N−1)-th determination unit and the flag output from the N-th determination unit 14aN is input to the selector unit 31ap. As such, the pair of flags output from the adjacent determination units is input to the selector unit in the first stage. The number of the selector units 31a1 to 31ap in the first stage is N/2. Thereafter, the number of selector units is halved at each subsequent stage.


Each of the selector units 31a1 to 31ap selects one of the pair of input flags based on the pair of input flags and the 1-bit random number output from the random number bit generation unit 32a. Each of the selector units 31a1 to 31ap outputs a state signal including the selected flag and an identification value of 1 bit corresponding to the selected flag to the selector units 31b1 to 31bq in a second stage. For example, the state signal output from the selector unit 31a1 and the state signal output from the selector unit 31a2 are input to the selector unit 31b1. Likewise, a pair of state signals output from the adjacent selector units of the selector units 31a3 to 31ap is input to the selector unit in the second stage.


Each of the selector units 31b1 to 31bq selects one of the input state signals based on a pair of the input state signals and the 1-bit random number output from the random number bit generation unit 32b. The respective selector units 31b1 to 31bq output the selected state signals to selector units 31c1, . . . in a third stage. Here, the selector units 31b1 to 31bq update the identification value included in the selected state signal by adding 1 bit indicating which state signal is selected, and output the updated state signal.


Similar processing is also performed by the selector units in the third and subsequent stages, a bit width of the identification value is increased by the selector unit in each stage by 1 bit, and the state signal which is an output of the selection unit 14b is output from the selector unit 31r in the final stage. The identification value included in the state signal output from the selection unit 14b corresponds to an index represented by a binary number. In the circuit configuration example illustrated in FIG. 7, the index of the variable starts at 0 (a value obtained by adding 1 to the identification value output by the selection unit 14b may be set as the index to correspond to the index starting at 1).


For example, FIG. 7 illustrates the circuit configuration example of the selector unit 31bq. Other selector units in the second and subsequent stages are also realized by a similar circuit configuration as the selector unit 31bq.


An input of the selector unit 31bq is a first state signal (status_1) and a second state signal (status_2). An output of the selector unit 31bq is a state signal (status). The selector unit 31bq includes an OR circuit 41, a NAND circuit 42, and selectors 43 and 44.


A flag (flag1) included in the state signal (status_1) and a flag (flag2) included in the state signal (status_2) are input to the OR circuit 41. For example, the state signal (status_1) is an output on a higher side (a larger side of the index) of two selector units in a previous stage, and the state signal (status_2) is an output on a lower side (a smaller side of the index) of the two selector units in the previous stage. The OR circuit 41 outputs an OR operation result (flag) of flag1 and flag2.


flag1 and flag2 are input to the NAND circuit 42. The NAND circuit 42 outputs a NAND operation result of flag1 and flag2 to a selection signal input terminal of a selector 43.


The selector 43 receives flag1 and a 1-bit random number (rand). The selector 43 selects and outputs either flag1 or rand based on the NAND operation result input from the NAND circuit 42. For example, when the NAND operation result of the NAND circuit 42 is “1”, the selector 43 selects flag1, and when the NAND operation result of the NAND circuit 42 is “0”, the selector 43 selects rand.


A selector 44 receives an identification value (index1) included in the state signal (status_1) and an identification value (index2) included in the state signal (status_2). A selection result of the selector 43 is input to a selection signal input terminal of the selector 44. The selector 44 selects and outputs either index1 or index2 based on the selection result of the selector 43. For example, when the selection result of the selector 43 is “1”, the selector 44 selects index1, and when the selection result of the selector 43 is “0”, the selector 44 selects index2.


A set of outputs from the OR circuit 41 and the selectors 43 and 44 is a state signal (status) output from the selector unit 31bq.


An input of the selector unit in the first stage does not include the identification value. Therefore, the selector unit in the first stage becomes a circuit that adds a bit value corresponding to the selected value (“0” in a case of the lower side and “1” in a case of the higher side) as the identification value (expressed as index in the drawing) and outputs the addition result.


As such, the selection unit 14b selects one of the spin bits of possible transition in a tournament mode. In each match of the tournament (that is, the selection in each selector unit), an entry number (0 or 1) of the winner (that is, the selected one) is added as the upper bit of an index word. An index output from the selector unit 31r in the final stage indicates the selected spin bit. For example, when the number N of variables is 1024, the state signal output from the selector unit 31r in the final stage includes a flag indicating transition propriety and an index represented by 10 bits.
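The tournament selection of FIG. 7 can be sketched in software as follows. The sketch assumes that the number of flags is a power of two and that one random bit is shared by all selector units of a stage; the function name `select_index` and the tuple layout are hypothetical.

```python
import random


def select_index(flags, rng=random):
    """Pick one index whose flag is 1 by a pairwise tournament.

    flags: list of 0/1 flags; the length is assumed to be a power of two.
    Returns (flag, index) with a 0-based index; flag is 0 when no flag is set.
    """
    # Each entry is (flag, partial index); first-stage inputs carry no index bits.
    entries = [(f, 0) for f in flags]
    bit = 0  # position of the index bit added in the current stage
    while len(entries) > 1:
        rand = rng.getrandbits(1)  # one random bit per stage
        merged = []
        for low, high in zip(entries[0::2], entries[1::2]):
            flag2, index2 = low    # lower side (smaller index)
            flag1, index1 = high   # higher side (larger index)
            nand = 0 if (flag1 and flag2) else 1
            pick_high = flag1 if nand else rand      # role of selector 43
            chosen = index1 if pick_high else index2  # role of selector 44
            # The winner's entry number becomes the new upper bit of the index.
            merged.append((flag1 | flag2, chosen | (pick_high << bit)))
        entries = merged
        bit += 1
    return entries[0]
```

For example, `select_index([0, 0, 1, 0])` returns `(1, 2)`: the only set flag is at 0-based index 2, so no random choice is involved.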


However, the index may also be output by a method other than generation within the selection unit 14b as described above. For example, an index corresponding to each determination unit may be supplied from each of the determination units 14a1 to 14aN to the selection unit 14b, and the selection unit 14b may select an index corresponding to the flag, together with a flag indicating transition propriety. In this case, each of the determination units 14a1 to 14aN further includes an index register for storing an index corresponding to itself and supplies an index to the selection unit 14b from the index register.



FIG. 8 is a diagram illustrating a calculation example of ΔE and ΔC for each variable.


The optimization device 10 performs the following procedure in parallel for (x1, h1), (x2, h2), . . . , (xN, hN) relating to all i (i=1 to N), and the identification code di.


(S1) The optimization device 10 determines whether or not i∈Ω. If i∈Ω is not satisfied, the processing proceeds to step S2 and step S2a. If i∈Ω is satisfied, the processing proceeds to step S3 and step S3a. Whether or not i∈Ω may be determined by di=(di1, di2). That is, if at least one of di1 and di2 is 1 (di=01, 10, 11), i∈Ω is satisfied. In addition, if both di1 and di2 are 0 (di=00), i∈Ω is not satisfied.


(S2) The optimization device 10 sets ΔEi as a value (ΔEi=(2xi−1)hi) calculated by Math. 2.


(S2a) The optimization device 10 sets ΔCij=0 over all the indices j.


(S3) The optimization device 10 sets ΔCij as a value calculated by Math. 16 over all the indices j based on the coefficient aij stored in a register R1 holding a matrix W. The local field hi used for calculating ΔCij is supplied to each ΔC calculation unit which calculates ΔCij for the inequality constraint i from a register holding hi.


(S3a) The optimization device 10 sets ΔEi=Emax. Here, Emax is a very large value at which the determination unit 14ai always determines Fi=0. For example, Emax may be the maximum value that may be set as ΔEi.
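Steps S1 to S3a can be summarized for one index i by the following sketch. Because Math. 16 is not reproduced here, it is represented by a hypothetical callable `delta_c_math16`; using infinity for Emax and a string for the identification code are likewise assumptions made for the illustration.

```python
def delta_e_and_delta_c(x_i, h_i, d_i, n, delta_c_math16, e_max=float("inf")):
    """Return (dE_i, [dC_i1 .. dC_iN]) for one index i, following S1 to S3a.

    d_i:            2-bit identification code as a string, e.g. "00" or "01"
    delta_c_math16: hypothetical callable j -> dC_ij standing in for Math. 16
    e_max:          stand-in for Emax; infinity is used here for simplicity
    """
    if d_i == "00":                    # S1: i is a normal state variable
        delta_e = (2 * x_i - 1) * h_i  # S2: value given by Math. 2
        delta_c = [0] * n              # S2a: no penalty contribution
    else:                              # S1: i is an inequality constraint variable
        delta_c = [delta_c_math16(j) for j in range(n)]  # S3
        delta_e = e_max                # S3a: the flag F_i is then always 0
    return delta_e, delta_c
```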


In addition, the optimization device 10 monitors the energy E(x) corresponding to the QUBO term of the total energy Etot(x) by using the following circuit.



FIG. 9 is a diagram illustrating a circuit configuration example for calculating the energy corresponding to the function E(x).


The optimization device 10 further includes an energy calculation unit 17 and a control unit 18.


The energy calculation unit 17 calculates the energy E(x) corresponding to the QUBO term of the total energy Etot(x). The energy calculation unit 17 selects any one of a plurality of energy change values output from the energy change calculation unit 12 based on identification information of a state variable of a change target selected by the update control unit 14. The energy calculation unit 17 calculates the energy value E(x) corresponding to values of a plurality of state variables held in the state holding unit 11 by accumulating the selected change values. Specifically, the energy calculation unit 17 includes a selection unit 17a and an accumulator 17b.


The selection unit 17a receives a selection result of the index i of the state variable to be inverted from the selection unit 14b. The selection unit 17a selects ΔEi corresponding to the index i supplied from the selection unit 14b among ΔEi (i=1, . . . , N) output from the energy change calculation unit 12 and outputs the selected ΔEi to the accumulator 17b.


The accumulator 17b holds a value of the energy E(x), accumulates ΔEi supplied from the selection unit 17a in the value of the energy E(x), updates the energy E(x) corresponding to the current value of the state variable, and outputs the energy E(x). An initial value of the energy E(x) is applied to the accumulator 17b at the time of starting the operation for an initial value of a state vector.
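The cooperation of the selection unit 17a and the accumulator 17b can be sketched as follows. The class name `EnergyAccumulator` and its method are hypothetical and serve only to illustrate that the energy change values ΔEi, without the penalty values, are accumulated.

```python
class EnergyAccumulator:
    """Tracks E(x) by accumulating the dE of each inverted state variable."""

    def __init__(self, initial_energy):
        # Energy E(x) of the initial state vector.
        self.energy = initial_energy

    def update(self, delta_e, selected_index):
        # Role of the selection unit 17a: pick dE_i of the inverted variable.
        selected = delta_e[selected_index]
        # Role of the accumulator 17b: add it to the held energy value.
        self.energy += selected
        return self.energy
```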


The control unit 18 controls operation of a search unit D1 in the optimization device 10. Here, the search unit D1 is a circuit block including the state holding unit 11, the energy change calculation unit 12, the penalty addition unit 13, the update control unit 14, the constraint excess amount calculation unit 15, the constraint identification sign input unit 16, the selection unit 17a, and the accumulator 17b. The control unit 18 controls the number of searches (the number of inversions of the state variable) of the search unit D1, temperature setting, or the like. The control unit 18 is realized by an electronic circuit. The control unit 18 is also referred to as a control circuit.


The optimization device 10 may monitor the energy E(x) corresponding to the QUBO term according to the circuit configuration illustrated in FIG. 9. Monitoring the energy E(x) according to the circuit configuration of FIG. 9 has an advantage that the calculation of the energy E(x) does not overflow even when a value of the coefficient λi of the inequality constraint term is relatively large.


Next, a procedure of operations performed by the optimization device 10 will be described.



FIG. 10 is a flowchart illustrating an operation example of the optimization device.


(S10) The control unit 18 initializes the search unit D1. For example, the control unit 18 receives external settings such as a temperature, a repetition count C1 for each temperature, and a temperature update count C2, and sets an initial temperature in the search unit D1. In addition, the control unit 18 sets a constraint identification sign in the constraint identification sign input unit 16. The control unit 18 sets an initial state in the state holding unit 11. At this time, the control unit 18 sets a value of the variable designated as the inequality constraint variable by the constraint identification sign to 0. In addition, the control unit 18 sets the weight W, the coefficient λ, and an initial value of the local field corresponding to each variable in predetermined registers, and sets the energy E(x) corresponding to the initial state in the accumulator 17b. When initialization is completed, the control unit 18 causes the search unit D1 to start the operation.


While the following steps S11 and S12 are described by focusing on a block Bj, steps S11 and S12 are performed in parallel in blocks B1 to BN.


(S11) The block Bj reads hj and xj. As described above, hj is held in a predetermined register. xj is held in the state holding unit 11.


(S12) The block Bj calculates ΔCij and ΔEj based on the procedure in FIG. 8, calculates ΔCj based on ΔCij, and obtains ΔEtot j=ΔEj+ΔCj. The block Bj adds Eoff to −(ΔEj+ΔCj), further adds the thermal noise −T·ln(u) to obtain an evaluation value, and outputs a flag Fj based on a comparison of the evaluation value with a threshold value.


(S13) The selection unit 14b selects the index i of an inversion target based on flags F1 to FN output from blocks B1 to BN, and supplies the selected index i to the state holding unit 11.


(S14) The state holding unit 11 inverts (changes) the value of the state variable xi corresponding to the index i supplied from the selection unit 14b, thereby updating the state, and updates the local field h corresponding to each variable held in a predetermined register (updating based on Math. 4). In the description of this step, the index of the change target is represented as i. The value of the local field for the inequality constraint variable based on Math. 13 is also updated by this updating.


(S15) The control unit 18 determines whether or not the state variable has been updated the specified number of times C1 at the current temperature. When the updating has been performed the specified number of times C1, the processing proceeds to step S16. When the updating has not been performed the specified number of times C1, the processing proceeds to step S11.


(S16) The control unit 18 updates a temperature. Accordingly, the temperature T set in the search unit D1 is lowered. The control unit 18 lowers the temperature T set in the search unit D1 according to a predetermined method each time the state has been updated the specified number of times C1.


(S17) The control unit 18 determines whether or not the temperature has been updated the specified number of times C2. When the temperature has been updated the specified number of times C2, the processing proceeds to step S18. When the temperature has not been updated the specified number of times C2, the processing proceeds to step S11.


(S18) The control unit 18 outputs a state corresponding to the lowest energy (minimum value of the energy E(x)) output from the accumulator 17b. For example, the control unit 18 may output the state that the search unit D1 finally reaches as the state corresponding to the lowest energy. Alternatively, the control unit 18 may output the state corresponding to the lowest energy reached by the search unit D1 during the search process.
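The control flow of steps S10 to S18 can be sketched as the following loop. It is a simplified illustration under several assumptions that are not stated in the flowchart: the energy is taken as E(x) = −Σ(i<j) Wij·xi·xj − Σ bi·xi with a symmetric W and a zero diagonal (so that ΔEj=(2xj−1)hj), the penalty term and the offset Eoff are omitted, the temperature is lowered geometrically by a hypothetical factor `cooling`, the initial state is all zeros, and a uniform random choice stands in for the selection unit 14b.

```python
import math
import random


def anneal(W, b, t_init, c1, c2, cooling=0.9, seed=0):
    """Run the S10 to S18 loop on a QUBO without inequality constraints."""
    rng = random.Random(seed)
    n = len(b)
    x = [0] * n                  # S10: initial state (all zeros assumed)
    h = list(b)                  # local fields for x = 0
    energy, temperature = 0.0, t_init
    best_energy, best_x = energy, list(x)
    for _ in range(c2):          # S17: C2 temperature updates
        for _ in range(c1):      # S15: C1 iterations per temperature
            # S11, S12: evaluate every index (done in parallel in the circuit).
            delta = [(2 * x[j] - 1) * h[j] for j in range(n)]
            allowed = [
                j for j in range(n)
                if -delta[j] - temperature * math.log(1.0 - rng.random()) >= 0
            ]
            if not allowed:
                continue         # no flag is set; nothing is inverted
            i = rng.choice(allowed)   # S13: pick one allowed index
            dx = 1 - 2 * x[i]         # S14: invert x_i ...
            x[i] += dx
            for j in range(n):        # ... and update the local fields
                h[j] += W[i][j] * dx
            energy += delta[i]        # accumulate E(x) as in FIG. 9
            if energy < best_energy:
                best_energy, best_x = energy, list(x)
        temperature *= cooling        # S16: lower the temperature
    return best_x, best_energy        # S18: state with the lowest energy seen
```

The sketch returns the lowest-energy state encountered during the search, which corresponds to the second of the two output options described in step S18.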


Here, in step S12, the penalty addition unit 13 adds a penalty value to an energy change value as follows, thereby, calculating a total energy change value ΔEtot j=ΔEj+ΔCj (j∈ψ).


For example, the penalty addition unit 13 calculates the penalty value ΔCj by multiplying the change value ΔCij of the constraint excess amount calculated based on the coupling coefficient aij (i∈Ω) and the threshold values (Cui and Cli) by the importance coefficient λi indicating importance of the inequality constraint i, and adds the calculated penalty value to ΔEj. When there are a plurality of inequality constraints, the penalty addition unit 13 obtains the sum of λiΔCij over the inequality constraints i, thereby obtaining ΔCj. The importance coefficient allows the weight of the inequality constraint relative to the QUBO term, or the weight of each inequality constraint when there are a plurality of inequality constraints, to be adjusted when the problem is solved.


In addition, when the inequality constraint is an inequality constraint for an upper limit constraint value Cui (i∈Ω), the penalty addition unit 13 adds a first penalty value ΔCj according to a change value of a first excess amount, which is calculated based on a first coupling coefficient aij and a first threshold value Cui, in a positive direction for the first threshold value Cui, to each of the plurality of energy change values ΔEj.


In addition, when the inequality constraint is an inequality constraint for the lower limit constraint value Cli, the penalty addition unit 13 adds a second penalty value ΔCj according to a change value of a second excess amount, which is calculated based on a second coupling coefficient aij and a second threshold value Cli, in a negative direction for the second threshold value Cli, to each of the plurality of energy change values ΔEj.


In addition, when the inequality constraint is an inequality constraint for the upper limit constraint value Cui and the lower limit constraint value Cli, the penalty addition unit 13 adds a third penalty value ΔCj according to a larger change value between a change value of a third excess amount, which is calculated based on a third coupling coefficient aij and a third threshold value Cui in a positive direction for the third threshold value Cui, and a change value of a fourth excess amount, which is calculated based on the third coupling coefficient aij and a fourth threshold value Cli smaller than the third threshold value Cui in a negative direction for the fourth threshold value Cli, to each of the plurality of energy change values ΔEj.


In addition, in a case of an equality constraint for a predetermined constraint value, the penalty addition unit 13 adds a fourth penalty value ΔCj according to a larger change value between a change value of a fifth excess amount, which is calculated based on a fourth coupling coefficient aij and a fifth threshold value Cui=Cli in a positive direction for the fifth threshold value Cui=Cli, and a change value of a sixth excess amount in a negative direction for the fifth threshold value Cui=Cli, to each of the plurality of energy change values ΔEj.



FIGS. 11A and 11B are diagrams illustrating an example of a comparison between solution results.



FIG. 11A illustrates an example of the solution results for an optimization problem including an inequality constraint, for which the optimization device 10 is used. FIG. 11B illustrates an example of the solution results for an optimization problem, for which an existing optimization device that does not include the penalty addition unit 13 is used, when the inequality constraint is represented in a quadratic form of the state variable (in a case of a square constraint).


Here, in FIGS. 11A and 11B, a knapsack problem is considered as the example of the optimization problem. When the knapsack problem is solved by using an existing optimization device (not including the penalty addition unit 13), the problem may be formulated as follows.


The knapsack problem considered here is a problem in which articles selected from N articles (the weight of the article of the index i is wi and its price is vi) are placed in a knapsack with a loading capacity C so that a total price (total value) of the articles in the knapsack is maximized. When an article i is placed in the knapsack, the state variable xi=1, and when the article i is not placed in the knapsack, the state variable xi=0.


In this case, a total value V of the articles in the knapsack is represented by Math. 18.









[Math. 18]

$$V = \sum_{i=1}^{N} v_i x_i \tag{18}$$







In addition, a capacity constraint of the knapsack is represented by Math. 19.









[Math. 19]

$$W = \sum_{i=1}^{N} w_i x_i \le C \tag{19}$$







The capacity constraint of Math. 19 is represented by Math. 20 by using a slack variable S (S≥0).









[Math. 20]

$$S + \sum_{i=1}^{N} w_i x_i = C \tag{20}$$







An objective function E of the QUBO is represented by Math. 21 based on Math. 20.









[Math. 21]

$$E = -\sum_{i=1}^{N} v_i x_i + \left( C - S - \sum_{i=1}^{N} w_i x_i \right)^{2} \tag{21}$$







It is considered that S is binary-expanded as in Math. 22 and solved as a QUBO form of the state variables xi and yi. The QUBO form of the state variables xi and yi is represented by Math. 23.









[Math. 22]

$$S = \sum_{j=0}^{n} 2^{j} y_j \tag{22}$$






[Math. 23]

$$E = -\sum_{i=1}^{N} v_i x_i + \left( C - \sum_{j=0}^{n} 2^{j} y_j - \sum_{i=1}^{N} w_i x_i \right)^{2} \tag{23}$$







Since the slack variable S becomes a relatively small value in optimization, an optimal solution may be obtained even if S=0. Therefore, Math. 24 is obtained.









[Math. 24]

$$E = -\sum_{i=1}^{N} v_i x_i + \left( C - \sum_{i=1}^{N} w_i x_i \right)^{2} \tag{24}$$
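
The square-constraint formulation of Math. 22 to Math. 24 may be sketched in Python as follows. The function names and the small instance are hypothetical, and the slack bits are indexed by j, the summation index of Math. 22.

```python
from typing import Sequence


def slack_value(y: Sequence[int]) -> int:
    """Binary expansion of the slack variable: S = sum_j 2^j * y_j (Math. 22)."""
    return sum((1 << j) * yj for j, yj in enumerate(y))


def square_energy(v, w, x, capacity, y=()):
    """E = -sum v_i x_i + (C - S - sum w_i x_i)^2 (Math. 23).

    With no slack bits (y empty, S = 0) this reduces to Math. 24.
    """
    value = sum(vi * xi for vi, xi in zip(v, x))
    weight = sum(wi * xi for wi, xi in zip(w, x))
    return -value + (capacity - slack_value(y) - weight) ** 2


# Illustrative instance: total weight 7, capacity 10, so a slack of 3 closes the gap.
e_with_slack = square_energy([6, 5, 4], [7, 4, 3], [0, 1, 1], 10, y=[1, 1])  # S = 3
e_no_slack = square_energy([6, 5, 4], [7, 4, 3], [0, 1, 1], 10)              # S = 0
```

In this assumed instance, the quadratic term vanishes only when the slack exactly fills the remaining capacity; with S=0 the same feasible selection is penalized by the square of the unused capacity.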







For the comparison of FIGS. 11A and 11B, the following test data was prepared to perform an experiment.


The knapsack capacity is C=4952991. The weights W=D1 to DN of the articles are values obtained by adding 100000 to uniform random numbers in the interval [0, 100]. The prices P=P1 to PN of the articles are values generated as uniform random numbers in the interval [1, 1000].


In addition, the experimental conditions were as follows. First, the importance coefficient λ for the inequality constraint is set as λ=10^z, and z is increased in steps of 1 from −10 to 10. Second, the temperatures T=5, 10, 50, 100, and 500 (initial temperatures) are set, and an operation is performed for each temperature by using a Monte Carlo method according to the Metropolis criterion. Third, 100 initial values of the random number u are prepared, and if at least one of the initial values gives a solution that satisfies the constraint, it is regarded that “a solution is obtained”.
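
The test data and the grid of coefficients described above could be generated as in the following sketch. The number of articles N and the random seed are not stated in this passage, so `n_articles=100` and `seed=0` are placeholders chosen only for illustration, and Python's `random` module stands in for whatever generator was actually used.

```python
import random

CAPACITY = 4952991                       # knapsack capacity C from the text
TEMPERATURES = [5, 10, 50, 100, 500]     # initial temperatures T from the text


def make_test_data(n_articles: int = 100, seed: int = 0):
    """Generate weights D_i and prices P_i (n_articles and seed are assumptions)."""
    rng = random.Random(seed)
    weights = [100000 + rng.uniform(0, 100) for _ in range(n_articles)]
    prices = [rng.uniform(1, 1000) for _ in range(n_articles)]
    return weights, prices


# Importance coefficients lambda = 10^z for z = -10, -9, ..., 10.
LAMBDAS = [10.0 ** z for z in range(-10, 11)]

weights, prices = make_test_data()
```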


The energy function E(x) including the inequality constraint (absolute value constraint) for the optimization device 10 in the comparison of FIG. 11 is represented by Math. 25.









[Math. 25]

$$E(P, W, C \mid x) = -\sum_{i=1}^{N} P_i x_i + \lambda \max\left\{ 0,\ \sum_{i=1}^{N} D_i x_i - C \right\} \tag{25}$$







In addition, the energy function E(x) in a case of the square constraint of the comparison in FIG. 11 is represented by Math. 26.









[Math. 26]

$$E(P, W, C \mid x) = -\sum_{i=1}^{N} P_i x_i + \lambda \left\{ \sum_{i=1}^{N} D_i x_i - C \right\}^{2} \tag{26}$$







However, in Math. 26, since a small value is optimal for the slack variable, the slack variable was set to 0.


A horizontal axis of each of the graphs of FIG. 11A and FIG. 11B is z in λ=10^z. A vertical axis of each of the graphs of FIGS. 11A and 11B is a total value (maximum) of the prices Pi. When the constraint is not satisfied, Pi=0.


Under the above experimental conditions, with the absolute value constraint, a solution is obtained at any of the temperatures if λ is greater than or equal to 10^(−1), whereas with the square constraint, all Pi=0, and a solution satisfying the constraint is not obtained. Even when the range of λ is widened and the temperature T is set to large values such as T=10^10, 10^15, and 10^20, a solution is obtained only in a limited range with the square constraint, and it is found that a relatively large cost is incurred in the search with the square constraint.


It is considered that a reason why the solution may not be obtained when the coefficient λ is relatively small is that a value of the inequality constraint term is smaller than the total value, and thus, constraints are almost neglected and likelihood of obtaining a solution that satisfies the constraint is decreased.


In FIG. 11A, a sequence p1 is a case where T=10^10. A sequence p2 is a case where T=10^15. A sequence p3 is a case where T=10^20. A sequence p4 is a case where T=5. A sequence p5 is a case where T=10. A sequence p6 is a case where T=50. A sequence p7 is a case where T=100. A sequence p8 is a case where T=500.


In FIG. 11B, a sequence q1 is a case where T=10^10. A sequence q2 is a case where T=10^15. A sequence q3 is a case where T=10^20. In the example of the square constraint illustrated in FIG. 11B, a solution for sequences corresponding to the temperatures T=5, 10, 50, 100, and 500 was not obtained, and thus, illustration thereof is omitted.


In the case of the square constraint, the cause of such a result is considered to lie in the form of the energy function of the square constraint. That is, when a total weight of the articles contained in the knapsack is near the capacity of the knapsack, the energy based on Math. 26 increases even if an article is removed, and thus, the search may not proceed. In particular, because the inequality constraint term of the square constraint is quadratic, its contribution to the value of the energy function E(x) of Math. 26 may be excessive compared with the absolute value constraint, which is represented by a linear term.
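
This effect can be checked with a small calculation. The sketch below evaluates only the constraint terms of Math. 25 and Math. 26 for a knapsack that is slightly over capacity, before and after one article is removed. The function names and the numbers (one article of weight 100050, an overshoot of 30, and λ=1) are illustrative assumptions, not experimental values.

```python
def abs_penalty(total_weight: float, capacity: float, lam: float) -> float:
    """Constraint term of Math. 25: lambda * max{0, sum D_i x_i - C}."""
    return lam * max(0.0, total_weight - capacity)


def square_penalty(total_weight: float, capacity: float, lam: float) -> float:
    """Constraint term of Math. 26: lambda * (sum D_i x_i - C)^2."""
    return lam * (total_weight - capacity) ** 2


capacity, lam = 4952991, 1.0
over = capacity + 30            # total weight just above the capacity
removed = over - 100050         # same knapsack after removing one article

# Change in each constraint term caused by the removal.
d_abs = abs_penalty(removed, capacity, lam) - abs_penalty(over, capacity, lam)
d_square = square_penalty(removed, capacity, lam) - square_penalty(over, capacity, lam)
```

Under these assumed numbers, the absolute value term decreases by 30 when the article is removed, while the square term increases by about 10^10, which is far larger than any price in the interval [1, 1000].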


Here, since various constraint conditions are included in the optimization problem, it may be difficult to formulate the problem only in the quadratic form. In addition to the above-described examples, constraints represented by inequalities frequently appear, such as a constraint that a demanded resource amount does not exceed an upper limit or does not fall below a lower limit, for example, an upper limit of a loading amount in a vehicle dispatch and delivery problem or an upper limit of a budget for constructing a factory.


There is a problem that these inequality constraints may be difficult to solve in the quadratic form by the existing optimization device, and that it may be difficult to reach an optimal solution when the search is performed so as to strictly satisfy the inequality constraint. This is because it is often impossible to reach the optimal solution without passing through an intermediate state that violates the inequality constraint.


Therefore, by using the optimization device 10 having the penalty addition unit 13, it is possible to easily handle the inequality constraint as the original linear expression even if the inequality constraint is not transformed into the quadratic form.


As exemplified in FIGS. 11A and 11B, according to the optimization device 10, it is possible to obtain a solution in a wider range of the coefficient λ and the temperature T than with the existing optimization device. Therefore, the inequality constraint may be handled more easily than in the case of the square constraint, and the kinds of problems that can be set as operation targets may be expanded. That is, an application range of the optimization device 10 may be expanded, and the optimization device 10 may be used to solve more problems.


In the above-described examples, by setting xk=0 and ΔEk=Emax for k∈Ω, influence on the energy E(x) by the variable xk is suppressed, but influence on the energy E(x) by the variable xk may be suppressed by using an asymmetric coupling coefficient as the coefficient akj. The asymmetric coupling coefficient is represented as akj≠0, ajk=0, as described above. The asymmetric coupling coefficient may be represented as Wkj≠0 and Wjk=0 by using a symbol W of a weight value between the variables.


That is, the state holding unit 11 may hold the coupling coefficient aij for a set of the state variable and the inequality constraint variable corresponding to the inequality constraint. The coupling coefficient aij may be an asymmetric coupling coefficient in which a weight (akj) of the state variable xj in the local field hk of the inequality constraint variable xk (k∈Ω) is set to a value other than 0, and a weight (ajk) of the inequality constraint variable xk in the local field hj of the state variable xj is set to 0.


When an asymmetric coupling coefficient is used, even though an inequality constraint variable is selected as an inversion target by the selection unit 14b, influence of the inversion of the variable xk on the energy E(x) may be suppressed.
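
The asymmetric coupling can be pictured with a small weight matrix. In the sketch below, index 2 plays the role of the inequality constraint variable xk, the entries W[k][j] hold the coefficients akj, and the entries W[j][k] are 0. The local field is assumed to take the form hi = Σj Wij xj + bi; this form, the helper name, and all numbers are assumptions made for illustration.

```python
def local_fields(W, b, x):
    """h_i = sum_j W[i][j] * x[j] + b[i] (assumed form of the local field)."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]


k = 2                       # index of the inequality constraint variable x_k
W = [
    [0, -1, 0],             # W[0][k] = 0: x_k does not enter h_0
    [-1, 0, 0],             # W[1][k] = 0: x_k does not enter h_1
    [7, 4, 0],              # W[k][j] = a_kj: article weights enter h_k
]
b = [0, 0, 0]

h_before = local_fields(W, b, [1, 1, 0])
h_after = local_fields(W, b, [1, 1, 1])   # x_k inverted from 0 to 1
```

In this sketch, inverting xk leaves every local field unchanged, while hk still accumulates the weighted sum of the state variables that is compared with the threshold value.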


Next, another configuration example of the first embodiment will be described. First, an example of an optimization system including the optimization device 10 will be described.



FIG. 12 is a diagram illustrating an example of the optimization system.


An optimization system 100 includes the optimization device 10 and an information processing device 110.


The information processing device 110 includes a central processing unit (CPU) 111, a memory 112, and a bus 113.


The CPU 111 is a processor that controls the information processing device 110. The CPU 111 executes a program stored in the memory 112. The memory 112 is, for example, a random-access memory (RAM), and stores a program executed by the CPU 111 and data processed by the CPU 111. The bus 113 is, for example, an internal bus such as Peripheral Component Interconnect Express (PCIe). The CPU 111, the memory 112, and the optimization device 10 are coupled to the bus 113.


The CPU 111 sets the temperature T to the optimization device 10 and receives information on the specified number of times C1 and C2 via the bus 113. In addition, the CPU 111 inputs a constraint identification sign (including an identification code, an upper limit value, and a lower limit value) and the coefficient λ for each inequality constraint to the optimization device 10. The control unit 18 performs setting for the search unit D1 based on the information thus input. For example, a user may set the constraint identification sign or the coefficient λ in the information processing device 110 by operating an input unit (not illustrated) coupled to the information processing device 110.


By doing so, the optimization device 10 is coupled to the information processing device 110 and is used as an accelerator that performs an operation for a combination optimization problem including the inequality constraint at a high speed by hardware.


Next, another circuit configuration example of the optimization device will be described.



FIG. 13 is a diagram illustrating another circuit configuration example of the optimization device.


An optimization device 10a includes a search unit D2 and a control unit 18a. The search unit D2 includes a state holding unit 11a, registers 11b1 to 11bN, h calculation units 11c1 to 11cN, ΔE calculation units 12a1 to 12aN, ΔC addition units 13d1 to 13dN, determination units 14a1 to 14aN, and a selection unit 14b.


Here, in FIG. 13, the names of the h calculation units 11c1 to 11cN, the ΔE calculation units 12a1 to 12aN, and the ΔC addition units 13d1 to 13dN are written with a subscript i, as in the “hi” calculation unit and the like, so that the correspondence to the state variable of the index i is easily understood. In addition, the state holding unit 11a, the registers 11b1 to 11bN, and the h calculation units 11c1 to 11cN correspond to the state holding unit 11 described above.


The state holding unit 11a holds a state vector. The state holding unit 11a updates the state vector by inverting a value of one state variable among the state vectors, based on an update signal supplied from the selection unit 14b. The update signal indicates an index of the state variable to be inverted.


The register 11b1, the h calculation unit 11c1, the ΔE calculation unit 12a1, the ΔC addition unit 13d1, and the determination unit 14a1 perform operations for a first state variable. The register 11b2, the h calculation unit 11c2, the ΔE calculation unit 12a2, the ΔC addition unit 13d2, and the determination unit 14a2 perform operations for a second state variable. Likewise, the numerical value i at the end of signs such as “11b1” and “12a1” indicates that an operation corresponding to the state variable of the index i is performed. That is, the search unit D2 includes N sets of the register, the h calculation unit, the ΔE calculation unit, the ΔC addition unit, and the determination unit (one set is one unit of the operation processing circuit that performs an operation for one spin bit, and may be referred to as a “neuron”). The N sets perform operations in parallel for the state variables corresponding to the respective sets.


Although the ΔC addition unit in a certain neuron receives a weight value held in a register in another neuron and a local field held by the h calculation unit, signal lines between the ΔC addition unit and the register and the h calculation unit in another neuron are not illustrated.


For example, in the example of the ΔC addition unit 13d1, one or more weight values (Wk1=ak1) corresponding to the inequality constraint variable xk (k∈Ω) among W11, W21, . . . , WN1 are supplied to the ΔC addition unit 13d1. In addition, among the h calculation units 11c1 to 11cN, one or more local fields (hk) corresponding to the inequality constraint variable xk are supplied to the ΔC addition unit 13d1. Even if Wi1 or hi corresponding to the state variable xi (i∈ψ) is supplied to the ΔC addition unit 13d1, the ΔC addition unit 13d1 is set to ΔCi1=0 by a control based on the identification code illustrated in FIG. 8.


In addition, in the example of the ΔC addition unit 13d2, one or more weight values (Wk2=ak2) corresponding to the inequality constraint variable xk (k∈Ω) among W12, W22, . . . , WN2 are supplied to the ΔC addition unit 13d2. In addition, among the h calculation units 11c1 to 11cN, one or more local fields (hk) corresponding to the inequality constraint variable xk are supplied to the ΔC addition unit 13d2. Even if Wi2 or hi corresponding to the state variable xi (i∈ψ) is supplied to the ΔC addition unit 13d2, the ΔC addition unit 13d2 is set to ΔCi2=0 by a control based on the identification code.


Hereinafter, the register 11b1, the h calculation unit 11c1, the ΔE calculation unit 12a1, the ΔC addition unit 13d1, and the determination unit 14a1 will be mainly exemplified and described. The registers 11b2 to 11bN, the h calculation units 11c2 to 11cN, the ΔE calculation units 12a2 to 12aN, the ΔC addition units 13d2 to 13dN, and the determination units 14a2 to 14aN, which are configured to have similar names, also have similar functions.


The register 11b1 is a storage unit that stores a weight value W1j (j=1 to N) between the state variable x1 and the other state variables. In addition, for the number N of the state variables, the total number of the weight values is N{circumflex over ( )}2. The register 11b1 stores N weight values.


The register 11b1 stores N weight values W11, W12, . . . , W1N for the index 1. Wii=W11=0. The register 11b1 outputs the weight value W corresponding to the index=j supplied by the selection unit 14b to the h calculation unit 11c1.


The h calculation unit 11c1 calculates the local field h1 based on Math. 3 and Math. 4 by using the weight coefficient W1j supplied from the register 11b1. The h calculation unit 11c1 includes a register for holding the local field h1 previously calculated, and updates h1 stored in the register by integrating δh1(j) according to an inversion direction of the state variable indicated by index=j to h1. An initial value of h1 is previously set in the register of the h calculation unit 11c1 according to a problem. In addition, a value of b1 is also set previously in the register of the h calculation unit 11c1 according to the problem. The h calculation unit 11c1 outputs the calculated local field h1 to the ΔE calculation unit 12a1.
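
The incremental update of the local field may be sketched as follows. Math. 3 and Math. 4 are not reproduced in this passage, so the sketch assumes the common form in which δhi(j) is +Wij when xj changes from 0 to 1 and −Wij when xj changes from 1 to 0; the function name and the numbers are hypothetical.

```python
def update_local_fields(h, W, x, j):
    """Invert x[j] in place and update every local field h[i] incrementally."""
    delta_x = 1 - 2 * x[j]          # +1 for 0 -> 1, -1 for 1 -> 0
    x[j] += delta_x
    for i in range(len(h)):
        h[i] += W[i][j] * delta_x   # assumed delta h_i(j) = W_ij * delta x_j
    return h


W = [[0, 2, -1],
     [2, 0, 3],
     [-1, 3, 0]]
x = [1, 0, 0]
h = [sum(W[i][j] * x[j] for j in range(3)) for i in range(3)]  # b_i = 0 here
update_local_fields(h, W, x, j=1)
```

Only the column of weights for the inverted index j is needed for each update, which is consistent with each register supplying the single weight value selected by the index j.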


The ΔE calculation unit 12a1 calculates ΔE1 by Math. 2 based on the local field h1 supplied from the h calculation unit 11c1, and outputs the calculated value to the ΔC addition unit 13d1. Here, the ΔE calculation unit 12a1 calculates ΔE1 by determining a sign of h1 according to the inversion direction of the state variable x1.


The ΔC addition unit 13d1 calculates a penalty value ΔC1 according to the change value of the excess amount for the inequality constraint based on Math. 15, and adds ΔC1 to ΔE1 supplied from the ΔE calculation unit 12a1. The ΔC addition unit 13d1 outputs ΔE1+ΔC1 to the determination unit 14a1.
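
The addition of the penalty value may be pictured with the following sketch for a single upper-limit constraint. Math. 15 is not reproduced in this passage, so the sketch assumes that the penalty is the importance coefficient λ multiplied by the change in the excess amount max{0, hk − Cu} caused by inverting xj; the function names, this assumed form, and the numbers are illustrative only.

```python
def penalty_change(h_k, a_kj, x_j, upper, lam):
    """Assumed penalty change for an upper-limit constraint when x_j is inverted."""
    delta_x = 1 - 2 * x_j                          # +1 for 0 -> 1, -1 for 1 -> 0
    excess_now = max(0, h_k - upper)               # current excess amount
    excess_next = max(0, h_k + a_kj * delta_x - upper)
    return lam * (excess_next - excess_now)


def total_change(delta_e, h_k, a_kj, x_j, upper, lam):
    """Value passed to the determination unit: delta_E_j + delta_C_j."""
    return delta_e + penalty_change(h_k, a_kj, x_j, upper, lam)


# Adding an article of weight 4 to a load of 8 with upper limit 10 overshoots by 2.
dc = penalty_change(h_k=8, a_kj=4, x_j=0, upper=10, lam=3)
total = total_change(delta_e=-5, h_k=8, a_kj=4, x_j=0, upper=10, lam=3)
```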


The determination unit 14a1 outputs the flag F1 indicating inversion propriety to the selection unit 14b based on ΔE1+ΔC1 supplied from the ΔC addition unit 13d1.


The selection unit 14b selects the index j of the state variable of an inversion target based on the flags F1 to FN supplied from the determination units 14a1 to 14aN, and supplies the index j to the state holding unit 11a and the registers 11b1 to 11bN. Although the selection unit 14b supplies a flag indicating whether or not the index j of the state variable of the inversion target is selected to the determination units 14a1 to 14aN to prompt generation of an offset value, a signal line through which the flag is supplied is not illustrated.
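
The determination and selection steps may be sketched as follows, assuming that the determination follows the Metropolis rule and that the selection unit picks one flagged index uniformly at random. Both assumptions, the function names, and the numbers are for illustration; the offset value mentioned above is not modeled.

```python
import math
import random


def inversion_flag(total_change: float, temperature: float, u: float) -> bool:
    """Metropolis rule: accept when u < min(1, exp(-total_change / T))."""
    if total_change <= 0:
        return True
    return u < math.exp(-total_change / temperature)


def select_index(flags, rng: random.Random):
    """Pick one index among the candidates whose flag is set, or None."""
    candidates = [i for i, f in enumerate(flags) if f]
    return rng.choice(candidates) if candidates else None


rng = random.Random(0)
changes = [-3.0, 2.0, 1e9]                        # delta_E_i + delta_C_i per variable
flags = [inversion_flag(c, temperature=5.0, u=rng.random()) for c in changes]
j = select_index(flags, rng)
```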


The control unit 18a sets the temperature T for the determination units 14a1 to 14aN in the search unit D2, controls the number of times of update of the state variables, and inputs the constraint identification signs to the ΔE calculation units 12a1 to 12aN and the ΔC addition units 13d1 to 13dN. The ΔE calculation of the ΔE calculation units 12a1 to 12aN and the ΔC calculation of the ΔC addition units 13d1 to 13dN are controlled by a constraint identification sign as illustrated in FIG. 8.


The optimization device 10a may also efficiently solve the optimization problem including the inequality constraints like the optimization device 10. By using the optimization device 10a, the inequality constraint may be easily handled, and a type of the problem that may be subjected to the operation target may be expanded. That is, an application range of the optimization device 10a may be expanded, and the optimization device 10a may be used to solve more problems.


Second Embodiment

Next, a second embodiment will be described. Items different from the first embodiment described above will be mainly described, and descriptions of the common items will be omitted.


The first embodiment exemplifies a configuration in which the optimization device 10 or the optimization device 10a includes a single search unit. Meanwhile, in an optimization device including a plurality of search units, a configuration for solving an optimization problem by using a replica exchange method is also considered.



FIG. 14 is a diagram illustrating an example of the optimization device according to the second embodiment.


An optimization device 50 includes M search units 51a1 to 51aM (M is an integer greater than or equal to 2) and an exchange control unit 52.


The search units 51a1 to 51aM each have a circuit configuration which is the same as the configuration of the search unit D1 or the search unit D2 according to the first embodiment. FIG. 14 illustrates an example of a search unit 51aj.


The search unit 51aj includes a state holding unit 11, an energy change calculation unit 12, a penalty addition unit 13, an update control unit 14, a constraint excess amount calculation unit 15, and a constraint identification sign input unit 16. Functions of the state holding unit 11, the energy change calculation unit 12, the penalty addition unit 13, the update control unit 14, the constraint excess amount calculation unit 15, and the constraint identification sign input unit 16 are the same as the functions of the elements of a similar name according to the first embodiment, respectively. In addition, the search unit 51aj further includes an energy calculation unit 17 (not illustrated).


The exchange control unit 52 sets different temperature values to the respective search units 51a1 to 51aM, and controls a ground state search based on a replica exchange method by using the search units 51a1 to 51aM. The exchange control unit 52 exchanges temperature values or values of the state variables held in the respective search units between the search units, after reaching the number of repetitions of the ground state search by each of the search units 51a1 to 51aM or after a certain time elapses. For example, it is considered that the exchange control unit 52 exchanges temperatures with a predetermined probability (exchange probability) in a pair of two search units having adjacent temperatures. For example, a probability based on the Metropolis method is used as the exchange probability. Alternatively, instead of the temperatures, the exchange control unit 52 may exchange states (values of a plurality of state variables) held by the two search units or local fields corresponding to the respective state variables in the pair of the two search units having adjacent temperatures.
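
The temperature exchange between two search units may be sketched as follows. The sketch assumes the exchange probability commonly used in the replica exchange method, min{1, exp((1/Ta − 1/Tb)(Ea − Eb))}; the dictionary layout of a search unit, the function names, and the numbers are hypothetical.

```python
import math
import random


def exchange_probability(e_a, t_a, e_b, t_b):
    """min(1, exp((1/T_a - 1/T_b) * (E_a - E_b))) for two replicas."""
    exponent = (1.0 / t_a - 1.0 / t_b) * (e_a - e_b)
    return 1.0 if exponent >= 0 else math.exp(exponent)


def try_exchange(units, a, b, rng):
    """Swap the temperatures of units a and b with the exchange probability."""
    ua, ub = units[a], units[b]
    p = exchange_probability(ua["E"], ua["T"], ub["E"], ub["T"])
    if rng.random() < p:
        ua["T"], ub["T"] = ub["T"], ua["T"]
        return True
    return False


# The colder unit holds the higher energy, so the exchange is always accepted.
units = [{"T": 5.0, "E": -10.0}, {"T": 10.0, "E": -40.0}]
swapped = try_exchange(units, 0, 1, random.Random(0))
```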


According to the optimization device 50, even when the replica exchange method is used, an optimization problem including an inequality constraint may be efficiently solved. According to the optimization device 50, the inequality constraint may be easily handled, and a type of the problem which may become an operation target may be expanded. An application range of the optimization device 50 may be expanded, and the optimization device 50 may be used for solving more problems.


All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.

Claims
  • 1. An information processing device comprising: a memory configured to hold values of a plurality of state variables included in an evaluation function representing energy and a weight value for each set of the state variables; and a processor coupled to the memory and configured to: calculate an energy change value when each of the values of the plurality of state variables is set as a next change candidate based on the values of the plurality of state variables and the weight value, in a case where any value of the plurality of state variables changes; calculate a total energy change value by adding a penalty value according to an excess amount violating an inequality constraint, to each of a plurality of the energy change values calculated for the plurality of state variables, the excess amount being calculated based on a coupling coefficient indicating a weight of each of the plurality of state variables in the inequality constraint and a threshold value; and change any value of the plurality of state variables held in the memory based on a set temperature value, a random number value, and a plurality of the total energy change values.
  • 2. The information processing device according to claim 1, wherein the processor calculates the penalty value by multiplying the change value of the excess amount calculated based on the coupling coefficient and the threshold value by an importance coefficient indicating importance of the inequality constraint.
  • 3. The information processing device according to claim 1, wherein, when the inequality constraint is an inequality constraint for an upper limit constraint value, the processor adds a first penalty value according to a change value of a first excess amount, which is calculated based on a first coupling coefficient and a first threshold value, in a positive direction for the first threshold value, to each of the plurality of energy change values.
  • 4. The information processing device according to claim 1, wherein, when the inequality constraint is an inequality constraint for a lower limit constraint value, the processor adds a second penalty value according to a change value of a second excess amount, which is calculated based on a second coupling coefficient and a second threshold value, in a negative direction for the second threshold value, to each of the plurality of energy change values.
  • 5. The information processing device according to claim 1, wherein, when the inequality constraint is an inequality constraint for an upper limit constraint value and a lower limit constraint value, the processor adds a third penalty value according to a larger change value between a change value of a third excess amount, which is calculated based on a third coupling coefficient and a third threshold value, in a positive direction for the third threshold value, and a change value of a fourth excess amount, which is calculated based on the third coupling coefficient and a fourth threshold value smaller than the third threshold value, in a negative direction for the fourth threshold value, to each of the plurality of energy change values.
  • 6. The information processing device according to claim 1, wherein, in a case of an equality constraint for a predetermined constraint value, the processor adds a fourth penalty value according to a larger change value between a change value of a fifth excess amount, which is calculated based on a fourth coupling coefficient and a fifth threshold value, in a positive direction for the fifth threshold value, and a change value of a sixth excess amount in a negative direction for the fifth threshold value, to each of the plurality of energy change values.
  • 7. The information processing device according to claim 1, wherein the memory holds the coupling coefficient for a set of a state variable and an inequality constraint variable corresponding to the inequality constraint, and wherein the coupling coefficient is an asymmetric coupling coefficient in which a weight of the state variable in a local field of the inequality constraint variable is set to a value other than 0 and a weight of the inequality constraint variable in a local field of the state variable is set to 0.
  • 8. The information processing device according to claim 1, wherein the processor selects any change value of the plurality of energy change values, based on identification information of a state variable of a change target, and calculates an energy value corresponding to values of the plurality of state variables held in the memory by accumulating the selected change value.
  • 9. The information processing device according to claim 1, wherein the processor is provided as plural and the temperature values different from each other are set to the corresponding processor and the temperature values or the values of the plurality of state variables after reaching a number of repetitions of a ground state search or after a certain time elapses are exchanged.
  • 10. An optimization method comprising: calculating, by a computer, an energy change value when each of values of a plurality of state variables is set as a next change candidate based on the values of the plurality of state variables and a weight value held in a memory, in a case where any value of the plurality of state variables held in the memory changes; calculating a total energy change value by adding a penalty value according to an excess amount violating an inequality constraint to each of a plurality of the energy change values calculated for the plurality of state variables, the excess amount being calculated based on a coupling coefficient indicating a weight of each of the plurality of state variables in the inequality constraint and a threshold value; and changing any value of the plurality of state variables held in the memory based on a set temperature value, a random number value, and a plurality of the total energy change values.
Priority Claims (1)
Number Date Country Kind
2019-112547 Jun 2019 JP national