COMPUTER-IMPLEMENTED METHOD FOR FINDING AN APPROXIMATE SOLUTION FOR A QUADRATIC UNCONSTRAINED BINARY OPTIMIZATION PROBLEM

Information

  • Patent Application
  • 20240354369
  • Publication Number
    20240354369
  • Date Filed
    July 18, 2022
    2 years ago
  • Date Published
    October 24, 2024
    3 months ago
Abstract
Computer-implemented method for finding an approximate solution for a quadratic unconstrained binary optimization problem, QUBO problem, the method being performed by a computing system and the method comprising: providing, as input to the computing system, the QUBO problem in a form comprising an Ising Hamiltonian operator, iteratively obtaining a cost function, the cost function depending at least on the Ising Hamiltonian operator, one or more spins si and/or the step of the algorithm within each step of the iteration, obtaining, by the computing system, associated intermediate values of the one or more spins si using the cost function, obtaining, at the end of the iterative process, by the computing system, final values of the one or more spins si that approximately minimize the final iteratively obtained cost function, obtaining, by the computing system, from the final values of the one or more spins si, an approximate solution for the QUBO problem, wherein the step of obtaining updated intermediate values of the one or more spins si is performed using a gradient descent technique or a sequential updating of intermediate values of the one or more spins.
Description

The present invention is related to a computer-implemented method for finding an approximate solution for a quadratic unconstrained binary optimization problem, QUBO problem, according to independent claim 1.


PRIOR ART

Quadratic unconstrained binary optimization problems, also referred to herein as QUBO problems, are known in the field of combinatorial optimization problems. Such problems usually occur, for example, in the fields of finance and economics and also have been applied for example in graph coloring and partition problem solutions.


Solving such problems has been an important issue in the last years. One major issue has been to provide solutions to QUBO problems without having to use quantum computers.


Solving a QUBO problem generally comprises finding a solution vector {right arrow over (x)}=(x1,x2, . . . xN) that minimizes the function










f
q

=




Σ



i
=
1

N




Σ



j
=
1

i



q

i

j




x
i



x
j


+



Σ



j
=
1

N



h
j



x
j







(
1
)







The actual problem is described by the parameters qij and hj. The values xt can either be 1 or 0.


It has been found that this problem is closely related to a problem of quantum mechanics, namely finding the ground state (the state with the smallest energy) of the Ising Hamiltonian operator which can be represented as










H
z

=




Σ



i

j




J

i

j




σ
z

(
i
)




σ
z

(
j
)



+



Σ


i



b
i



σ
z

(
i
)








(
2
)







This is achieved through the map xi=(1+σz(i))/2. Herein, the values denote the interaction between spins at lattice places i and j, σz(i), σz(j) denote the z-Pauli matrices acting on a spin at position i and a spin at position j, and bi denotes a bias term that can, for example, come from an external magnetic field acting on the spins of the system.


The solution to this problem is a vector of spins {right arrow over (s)}=(s1, s12, . . . , sn) that minimizes the expectation value custom-characters|Hz|scustom-character, where |scustom-character is a quantum state corresponding to the tensor product of the single spin states given by {right arrow over (s)}. The values of the spins are in that case associated with the classical values xi via si=2xi−1. The classical values that solve the QUBO problem are thus related to the direction (either up or down) of the quantum mechanical spins. For solving the QUBO problems in this formulation, different classical approaches have been attempted that use generally available computers. For example, the Fujitsu digital annealer has been discussed to directly obtain the values xi. This annealer thus attempts to find the classical solutions directly, without using the formulation of the QUBO problem according to (2). This annealer, however, does only reasonably support comparably small problems. This is due to the limitation in the size of the cache of the used computer processor (CPU).


Another approach is the Toshiba simulated bifurcation machine which uses graphics processors to accelerate the solving of the problem on the processor. Thereby, solutions can be found faster compared to the Fujitsu digital annealer, though this machine also does only support a limited number of variables, like spins.


Another approach is the D-Wave algorithm. This algorithm, however, has to be performed on a real quantum computer and thus currently suffers from limitations in the number of available qubits. Currently, the D-Wave hardware is only able to support and solve QUBO problems of approximately 200 fully connected spins while being comparably expensive, compared to a general-purpose computer system.


Problem

In view of the available prior art, the problem addressed by the current invention is to provide a method for solving also large-scale QUBO problems comprised of several thousand or ten thou-sand variables in comparably short time with commonly available hardware.


Solution

This problem is solved by the computer-implemented method for finding an approximate solution for a quadratic unconstrained binary optimization problem, QUBO problem, according to independent claim 1. Preferred embodiments of the invention are provided in the dependent claims.


According to the invention, the computer-implemented method for finding an approximate solution for a quadratic unconstrained binary optimization problem, QUBO problem, the method being performed by a computing system, comprises:

    • providing, as input to the computing system, the QUBO problem in a form comprising an Ising Hamiltonian operator,
    • iteratively obtaining a cost function, the cost function depending at least on the Ising Hamiltonian operator, one or more spins si and/or the step of the algorithm
    • within each step of the iteration, obtaining, by the computing system, associated inter-mediate values of the one or more spins si using the cost function,
    • obtaining, at the end of the iterative process, by the computing system, final values of the one or more spins si that approximately minimize the final iteratively obtained cost function,
    • obtaining, by the computing system, from the final values of the one or more spins si, an approximate solution for the QUBO problem,


      wherein the step of obtaining updated intermediate values of the one or more spins si is per-formed using a gradient descent technique or a sequential updating of intermediate values of the one or more spins.


The input of the QUBO problem can either already be in the form of the Ising Hamiltonian or it can at least specify the interaction strengths Jij and the bias terms bi for all variables, i.e. for the complete problem.


The actual solving of the problem does not necessarily comprise the calculation of quantum mechanical spins. Therefore, in the context of the present invention, at least the intermediate spins si and also their final values may depend on other or additional parameters or values, like angles θi or variables wi (both further explained below) obtained during the iteration that are linked to quantum mechanical spins by a specific transformation.


The cost function is understood to be the expectation value of at least the Ising Hamiltonian operator, but may encompass additional terms that depend on time. The time dependence in this context may be understood to specify a point or step within the iteration. For example, if N iterative steps are performed for solving the QUBO problem or finding its approximate solution, the time t may be denoted as







t
=

i

N
-
a



;

i


;

0

i


N
-

1
.







The cost function may be denoted with C in the following.


It was surprisingly found that, when formulating the QUBO problem in the above way and iteratively obtaining a value of the cost function of a subsequent step depending on the spins si (or associated values like θi or wi) obtained in the previous step and potentially also depending on the number of the subsequent step (which may be considered a “time” in the above sense), gradient descent techniques or sequential updating techniques can be applied in a way during the iteration that the problem is solved more efficiently also on general-purpose computers.


It is indeed a finding of the present invention that, surprisingly, with this formulation of the QUBO problem and the specific steps of solving it by a computing system, the gradient descent techniques and sequential updating techniques as are known from the remote technical field of training neural networks can be applied for efficiently finding a solution to the QUBO problem. This specifically pertains to the computing time that is required for finding an approximate solution to the QUBO problem with a given accuracy. When applying the method according to the invention, the computing time required for obtaining the approximate solution to the QUBO problem with a given accuracy is reduced compared to commonly known approaches like simulated bifurcation.


In one embodiment, the step of minimizing the cost function iteratively is performed using a gradient descent technique and the gradient descent technique comprises applying a momentum to the gradient descent.


Adding momentum means that, when updating the variable(s) used for solving the problem in a subsequent step, using the values of the current step (i.e. the intermediate values of the variable(s)), this updating is done by adding to the values obtained for the vector of variable(s), a “velocity” in the form










v





μ


v



-

η




w


C

(


w


,
t

)








(
3
)







where the momentum μ is a constant in the range of 0 to 1 and C is the cost function depending on the values {right arrow over (w)} which are associated with the spins by a specific transformation and the time t which may, for example, be the time as indicated above.


The new values or variables for the subsequent step for calculating the cost function C are then set to {right arrow over (w)}←{right arrow over (w)}+ν and the next iteration starts with these values {right arrow over (w)}. This is repeated until the end of the iteration, whereby the values {right arrow over (w)} determine the approximate solution to the problem.


A specific realization of this general approach can comprise applying the Nesterov Accelerated Gradient technique by setting the values {right arrow over (w)} used in calculating the updated velocity {right arrow over (ν)} in the above sense for the next step as {right arrow over (w)}={right arrow over (w)}+μ{right arrow over (ν)} and obtaining the gradient with ∇wC({right arrow over (w)}, t).


This can result in a more efficient minimization of the cost function, thus requiring fewer steps.


In a further embodiment, the step of minimizing the cost function iteratively is performed using a sequential updating of intermediate values of the one or more spins si, comprising updating, in each iteration, each of intermediate values of the one or more spins si.


In general, instead of using the variables {right arrow over (w)} as indicated above for solving the problem, the problem may be formulated using classical angles θi that are associated with the actual spin values si by a specific transformation, where the value of a specific angle theta θi that minimizes the cost function is given by










θ
i
min

=

arctan

(


-

t

1
-
t





(







j



J

i

j




sin

(

θ
j

)


+

b
i


)


)





(
4
)







with the time t being the time for a specific step in the iteration in the above sense. It is seen that the values of the θimin that minimize the cost function at a current point in time (i.e. at a current step in the iterative process) explicitly depend on the time. This can be advantageously used and easily implemented throughout the iterative process. This is done sequentially, meaning the value θ1min is calculated using the values of θimin of the previous step. Within the iterative loop, the method then proceeds to calculating θ2min already using the newly obtained θ1min and so on. This may be repeated within a loop of the iteration until all θimin converge to values that actually minimize the cost function in this iterative step. This may require a single or a plurality of calculations of each of the θimin. The obtained final angles θimin of a current step are then used in the subsequent step to calculate the cost function and obtain new values θimin. The above exemplary described steps can be applied to the minimization of the cost function using either the herein described gradient descent techniques or the herein described sequential updating techniques.


In a further embodiment, the step of minimizing the cost function iteratively is at least partially carried out on hardware that is designed to perform matrix multiplications. The computer implemented method according to the above embodiments comprises at least one step that requires a matrix product calculation of the spin values (or values associated with the spins, like θimin or {right arrow over (w)}) with the interaction strength Jij in the cost function. By employing specifically hardware that is designed to perform such matrix multiplications, like a graphics processing unit, these can be advantageously accelerated, thereby reducing the processing time required for obtaining the approximate solution to the QUBO problem.


Furthermore, when calculating the gradient to obtain the parameter values of the variables in the next step of the iteration, it can also be advantageous to implement this on a graphics processing unit.


In a more specific embodiment, the hardware is or comprises a graphics processing unit and/or field programmable gate arrays.


According to embodiments of the invention, the method is performed without using a quantum computer. This means that the computer-implemented method is carried out on a general-purpose computer and not on a quantum computer. Specifically, it is a finding of the present invention that the computer-implemented method according to the above embodiments can be used to reliably solve QUBO problems in a comparably short time with a high degree of accuracy on a general-purpose computer. This allows for solving the QUBO problem on generally available hardware like personal computers and not requiring cost-intensive hardware like real quantum computers.


Furthermore, with the computer-implemented method according to embodiments of the invention, also large-scale QUBO problems can be solved in comparably short times, thereby overcoming the problem of the current limitations of quantum computers.


In a further embodiment, the step of minimizing the cost function iteratively comprises calculating a gradient of the cost function and wherein the calculation of the gradient is performed using the hardware. Employing this hardware in calculating the gradient results in a more efficient (at least with respect to the time required for performing the calculations) calculation of the gradient which is one of the steps in the iterative process that requires most computation.


In a specific embodiment, the Ising Hamiltonian operator Hz can be represented as HzijJijσz(i)σz(j)ibiσz(i), where i and j denote positions of spins i and j and Jij denotes an interaction strength between a spin at position i and a spin at position j and wherein bi denotes a bias term at position i and σz(i), σz(J) denote the z-Pauli matrices acting on a spin at position i and a spin at position j, wherein the QUBO problem can be represented using a position dependent interaction strength Jij and/or a position dependent bias term bi. The position-dependent interaction means that there are values of the interaction strengths Jij that are different from 0 for different spins i, j. The position dependence of the interaction strength can be a long-range interaction meaning that the interaction strengths Jij are not equal to 0 also for large differences i and j or it can also be of short range meaning that, for example, the interaction strength Jij is only different for 0 for immediately neighboring spins si and sj or for the two next neighbors where the difference of |i−j|=1 or |i−j|=2 or the difference. It is noted that the matrix Jij is a symmetric matrix so that Jij=Jji and its diagonal entries Jii=0 for all i. The position dependence of the bias term bi means that the bias term bi may be different for different indices i.


It is a finding of the present invention that the computer-implemented method also results in an efficient solution of the QUBO problem for strongly interacting systems where is not equal to 0 for spins that are acting over a large distance (i.e. also or cases where |i−j|>2).


Specifically, the position-dependence is not trivial in some embodiments.


This means that there is a real position dependence in either the interaction strength Jij so that Jij≠Jkl for i≠k and/or j≠l and/or that the position dependence of the bias term bi is a real position dependence meaning that for at least one pair of i and k where i≠k. Even though the problem becomes more complicated the more complex the interaction strength and/or the position-dependent bias terms actually are (i.e. the more terms Jij≠0), it is a finding of the present invention that also such rather complex problems can be reliably solved in comparably short time.


In a further embodiment, iteratively obtaining the cost function comprises using an operator that does not commute with the Ising Hamiltonian operator. The operator may be denoted as O.


This means that the commutator [Hz, 0]≠0. An operator that does not commute with the Ising Hamiltonian operator may, for example, be










H
x

=



Σ


i



σ
x

(
i
)







(
5
)







Here, σx(i) denotes the Pauli-matrix σx acting on position i. When obtaining the cost function C as the energy value associated with a time-dependent Hamiltonian operator in the form of









H
=



f

(
t
)



H
z


-


(

1
-

f

(
t
)


)



H
x







(
6
)







with a time-dependent function f(t) (with the time set for example as already explained above) that fulfills f(0)=0 and f(t)∈[0,1] by












C
=


θ




"\[RightBracketingBar]"



H




"\[LeftBracketingBar]"

θ







(
7
)







the iteration of the cost function is done depending on the time t which can be represented as the number of iterative steps of the method. The more the time progresses, the closer the cost function C comes to the Ising Hamiltonian operator that represents the actual QUBO problem. It can specifically be provided that the operator is a Hamiltonian operator.


One specific example of such a Hamiltonian operator was already given above. Using a Hamiltonian operator that does not commute with the Ising Hamiltonian operator allows for a reasonable interpretation of the cost function during the iteration as energy. Moreover, unreasonable or un-desirable behavior of the cost function during the iteration due to potential interactions of com-muting operators can be avoided, thereby improving the accuracy of the iteration.


In one embodiment, a computing system comprising a processor and memory is provided, wherein the computing system is adapted to execute a computer-implemented method according to any of the above embodiments. With a corresponding computing system, the advantages of the above-described computer-implemented method according to any of the above embodiments can be obtained when solving QUBO problems.


The computing system can further comprise a graphics processing unit and/or field programmable gate arrays.


By using this computing system, also complex matrix products or other calculations can be performed in comparably short time, thereby reducing the time for solving the problem.


In one embodiment, a computer-readable storage medium is provided that comprises (or stores) computer-executable instructions that, when executed by a computing system, cause the computing system to perform a computer-implemented method according to any of the above embodiments.





BRIEF DESCRIPTION OF THE DRAWING


FIG. 1 shows a flow diagram of a method according to one embodiment



FIG. 2 comparison of the results of simulated bifurcation to a method according to embodiments of the invention





DETAILED DESCRIPTION

Before describing the inventive method in detail with respect to the actual steps performed, a more general description of the underlying problem and the mathematical approach using an Ising Hamiltonian will be provided.


Generally, a quadratic unconstrained binary optimization problem, QUBO problem, is a problem that can be formulated as finding the vector {right arrow over (x)}=(x1, x2, . . . , xn), xi∈[0,1] that minimizes the value











min


x





(

0
,
1

)

n





x


T


Q


x



+



x


T


a





(
8
)







Herein, Q is a symmetric N×N matrix with each Qijcustom-character, Qii=0 entry for all elements on the main diagonal and QT=Q. The vector {right arrow over (a)} is a real N×1 vector.


By using the transformation si=2xi−1, this problem can be reformulated so that it can be mapped to finding the energy of the ground state of a Hamiltonian operator known from solid state physics, namely the Ising Hamiltonian operator.


Generally, this Ising Hamiltonian operator is given by










H
z

=




Σ



i

j




J

i

j




σ
z

(
i
)




σ
z

(
j
)



+



Σ


i



b
i



σ
z

(
i
)








(
9
)







Herein, the elements Jij are elements of a N×N symmetric matrix where Jii=0 for all elements on the main diagonal and bi are elements of an N×1 vector with bicustom-character. The matrices σz(i) are the z-Pauli matrices in the form







σ
z

(
i
)


=


(



1


0




0



-
1




)

.





The upper index of these matrices indicates on which component of a state |ψcustom-character=|ψ1custom-character2custom-character . . . |ψNcustom-character the Pauli-matrices act as operators. In quantum physics, the particle states |ψicustom-character denote the state of a particle at a lattice place i in the lattice that is represented by the Ising Hamiltonian operator and Jij denotes the interaction strength between a particle at lattice place i and another particle at lattice place j.


The QUBO problem is thus equivalent to finding the ground state \f) that minimizes the expectation value of the Ising-Hamiltonian operator Hz, i.e.,











min




"\[LeftBracketingBar]"

ψ






ψ





"\[RightBracketingBar]"





H
z





"\[LeftBracketingBar]"

ψ




.




As the Ising Hamiltonian constitutes a problem with only two possible states for each particle with respect to the z-axis of the system, the particle states |ψicustom-character) are fully described in a normalized form by using the vectors













"\[LeftBracketingBar]"

+



=


1

2




(



1




1



)



and





"\[LeftBracketingBar]"

-





=


1

2




(



1





-
1




)



,




each |ψicustom-character can be represented as
















"\[LeftBracketingBar]"


ψ
i




=



"\[LeftBracketingBar]"


θ
i





=

cos



θ
i

2





"\[LeftBracketingBar]"

+





+

sin



θ
i

2





"\[LeftBracketingBar]"

-





.




Here, the values of θi are from the interval







[



-
π

2

,

π
2


]

.




It is noted that the states |+custom-character and |−custom-character are not the Eigenvectors of the σz(i) matrices, but of the







σ
x

(
i
)


=

(



1


0




0


1



)





Pauli-matrices.

In this context, the angles θi are considered to be associated with the values of the spins that actually solve the underlying minimization problem of the Ising Hamiltonian operator. Indeed, once all θi are found, all states |ψicustom-character are defined.


In some implementations a further transformation can be applied so that










θ
i

=



π
2



a

(

w
i

)


=


π
2


tanh



w
i

2







(
10
)







This problem as now formulated is “time independent” meaning that as long as the interaction strength Jij and the bias term bi is constant, the system will assume a stable state with the minimum energy as calculated above.


The solutions |θicustom-character correspond to solutions of the actual QUBO problem. More specifically, the values for the spins si can be obtained via si=sign(θi).


A core concept of the present invention is to solve the QUBO problem by obtaining the value for the cost function by using a time-dependent Hamiltonian operator. The time dependence allows for specified iterative steps where each point in “time” corresponds to a different iteration step. For example, the time can be set to run from 0 to 1 and a time-dependent function f(t) can be used that takes values within the interval (0, 1) and at f(0)=0.


According to embodiments of the invention, the time-dependent Hamiltonian operator used for solving the QUBO problem is then given by









H
=



f

(
t
)



H
z


-


(

1
-

f

(
t
)


)



H
x







(
11
)







where the additional Hamiltonian operator Hx is preferably an operator that does not commute with the Ising Hamiltonian operator. In addition to the function f(t), each or at least one of the operators Hz and Hx may be multiplied with a constant ϵz or ϵx, respectively. The constants may be values larger than 0. This can be used to increase or decrease the relative influence of Hx to the operator H(t).


It has been found that a reasonable approach for addressing QUBO problems can be the Hamiltonian operator










H
z

=






i



σ
x

(
i
)







(
12
)







The expectation values of this operator Hx using the













"\[LeftBracketingBar]"


θ
i




=

cos



θ
i

2





"\[LeftBracketingBar]"

+





+

sin



θ
i

2





"\[LeftBracketingBar]"

-








as introduced above are custom-characterθ|Hxcustom-charactericos θ1. This has advantages because in the first iterative step of t=0, the ground state is the ground state of Hx which can be exactly obtained using the Eigen-vectors of this operator, i.e. the states |+custom-character and |−custom-character mentioned already above. For computational efficiency, however, it has been found by the inventors that the first iterative step should not be initialized with all spin states being |+custom-character. This is because this state of the system has a gradient of zero, which will not allow for using the techniques described herein for minimizing the cost function in subsequent iterative steps.


The cost function then corresponds to the energy of the time-dependent Hamiltonian operator as specified in (11) and is given by










C

(

θ
,
t

)

-


f

(
t
)



(








i

j




J

i

j



sin


θ
i


sin


θ
j


+






i



b
i


sin


θ
i



)


-


(

1
-

f

(
t
)


)







i


cos


θ
i






(
13
)







Using the above further transformation, also







π
2



a

(

w
i

)





may be used as the argument instead of θi.


With this representation of the cost function which, when minimized, solves the QUBO problem, the method according to the invention can be performed.


It is noted that the actual spin configuration that characterizes the solution to the QUBO problem can be obtained







via



s
i


=


sign

(

θ
i

)

=


sign

(


π
2



a

(

w
i

)


)

=

sign

(

w
i

)







once the final values of the variables are obtained at the end of the iteration. As the QUBO problem is a binary problem where the values solving it either take the value 0 or 1, this can be used to obtain, either in intermediate steps of the inventive method or at the end of the inventive method, the solution to the QUBO problem.


It is a finding of the present invention that formulating the QUBO problem in this specific way allows for applying techniques like gradient descent techniques or sequential updating techniques for obtaining the intermediate values of the spins in an efficient way.



FIG. 1 shows a flow chart according to one embodiment of the invention that employs a computer-implemented method for finding an approximate solution for a QUBO problem using the above-described cost function.


The method begins according to the flow diagram shown in FIG. 1 with a step 101 in which input is provided to the computing system.


This input will generally provide information on the to be (approximately solved) QUBO problem.


Considering the above cost function that is to be processed with the method of FIG. 1, the input may at least comprise the interaction strength Jij for all i and j and the bias term bi. Moreover, the size of the problem (i.e. the range of the index i) can be provided as input to the computing system. It can, however, also be deduced from the input values Jij and bi as, for example in the case the interaction strengths are provided in the form of a matrix, the size of the problem is already defined.


Moreover, a number N of iterations may be set that can then be used to determine the time t, for example via






t
=

i

N
-
1






for the counter i beginning with 0. Alternatively, also other initialization values may be used like, for example






t
=


i
-
1

N





if the counter i starts with i=1


Additionally, unless it is preset, the time-dependent function f(t) used in the cost function according to (13) can be provided as part of the input. This is specifically advantageous in case the cost function is not preset. If the time-dependent function f(t) is not a preset function, it can be advantageous to provide this function for example depending on specific characteristics of the problem. While, in one embodiment, a monotonically increasing function f(t)=t can be advantageous, other approaches like f(t)=t2 or f(t)=t3 may also be considered. Also, more complex functions like sinus-functions or more complex polynomials may be used. The invention is not limited in this respect as long as the required conditions for f(t) are met, specifically f(0)=0 and preferably f(1)=1. However, at least the condition f(1)=1 is not necessary and f(1) may take any value between 0 and 1.


In other cases where the function f(t) is a preset function, it may be envisaged that more than one function f(t) is available to the computing system. The determination which functions to use may either be made by the computing system itself, for example based on the characteristics of the interaction strength Jij and the bias term bi or the selection of the function f(t) may be made by the user. In this respect, a user interface may be provided where either different functions fm(t) are presented and/or the different characteristics of the functions fm(t) can be displayed and the user can decide for a specific function fm(t). This function can then be used for the further calculations. The characteristics of the functions fm(t) may provide additional information on the monotonic behavior, for which kind of QUBO problems the specific function is known to perform best or yield the best results in shortest time or the like. This can aid the user in identifying the function fm(t) which may most appropriately address the QUBO problem he or she attempts to solve.


Alternatively, the function f(t) may be automatically selected by the computing system after having received the input. For example, based on characteristics of the values Jij a specific function fm(t) can be chosen by the computing system.


Proceeding with step 102, the initial cost function can be formulated. This can, for example, comprise setting, internal to the computing system, the values Jij and the values bi, as well as deriving or obtaining initialization values for the to be calculated values θi or wi, depending on which representation is chosen. While this step 102 is provided outside the actual loop for iteratively calculating the cost function C and thereby minimizing it, this step 102 does not need to be separately provided but can also be part of the first step of the iteration.


The method then proceeds with the beginning of the iterative calculation of the cost function.


This iterative calculation of the cost function begins with a first step 104 in which initialization values of the parameters/variables θi or wi are used to calculate the cost function C({right arrow over (w)}, 0) or C({right arrow over (θ)},0).


In order to avoid the cost function to be stuck in a local minimum or otherwise negatively influenced by the initialization variables θi or wi, it can be preferred to set their initialization values randomly in their allowed ranges or all close to 0 or even exactly 0. However, in view of the embodiments of the present invention, the actually chosen initialization values usually do not have a significant impact on the progress of solving the underlying QUBO problem. In some cases, however, it can be advantageous to select initialization values that are derived from the final values for θi or wi of already solved QUBO problems if these QUBO problems are somehow linked to or somehow similar to the problem to be actually solved. This can result in faster convergence of the method towards the best solution.


As indicated above, the time t may be set to







t
=

i
n


.




In the first iteration step, the cost function C({right arrow over (w)}, 0) or C({right arrow over (θ)}, 0) for this time t=0 is calculated in step 105. Using the cost function, new values wi or θi are obtained by using either a gradient decent technique or a sequential updating method. The obtained values may then optionally be used again in the same cost function or the same iterative step so as to minimize the cost function. This procedure may be repeated until the changes in the values of the parameters from one calculation of the cost function to the next calculation of the cost function are below a given threshold and convergence within a single iterative step is obtained. This, however, is not necessary and, according to embodiments of the present disclosure, it is sufficient to obtain values wi or θi only once per iterative step i. Furthermore, by using the gradient decent technique or the sequential updating, it can, in some embodiments, be sufficient to calculate, in each iterative step, only the gradient of the cost function but not the cost function itself. This can be computationally more efficient.


At the end of the last iterative step, the actual value of the cost function may then be obtained also to obtain the final values of wi or θi.


In any case, at the end of a loop, this results in intermediate values wi or θi that can be used in the next step of the iteration. These values wi or θi may be considered to be “intermediate values” of the actual spins and are used for the next step in the iteration.


Obtaining the updated values or intermediate values of the wi (i.e. when using the above trans-formation and calculating the cost function depending on the wi instead of the θi) can be done by either a gradient descent technique. This technique is normally used when training neural networks.


It is a finding of the present invention that this technique can also be used advantageously to avoid the iteration (either within a single loop and/or over different loops of the iteration) to be stuck, for example, in local minima of the cost function that are not the global minimum or that are far from the global minimum. Such local minima do not correspond to the actual solution of the underlying QUBO problem, and, on the other hand, approaches used for solving the QUBO problem so far kept stuck in such local minima which can result in the solving of the problem failing in the worst case or requiring significant processing time to move out of this local minimum.


The gradient decent technique uses a gradient of the cost function C({right arrow over (w)}, t) to calculate the new values of {right arrow over (w)} for the next iterative step and/or for the next loop within the same iterative step. Specifically, the gradient decent technique updates {right arrow over (w)} that is to be used for the next iteration step i+1 (corresponding to







t
=


i
+
1

N


)




or the next loop in the same iterative step using the cost function calculated at time






t
=

i
N





and the values tor {right arrow over (w)} of this step. The gradient decent technique can be written in the form of










w





w


-

η





w




C

(


w


,
t

)








(
14
)







where ∇{right arrow over (w)} constitutes the gradient in the coordinates of {right arrow over (w)}. η may be a constant value and may represent a step size.


This results in a simultaneous updating of all parameters wi in the vector {right arrow over (w)}. This gradient can be obtained by slightly varying the values wi in order to obtain a numerical derivative or gradient of the cost function. This is known as the finite difference method. However, it is a finding of the present invention that ∇{right arrow over (w)}C({right arrow over (w)}, t) can be obtained directly. Specifically, when deriving C({right arrow over (w)}, t). Another way to obtain the gradient is to use the transformation (10) in (13) so as to obtain the gradient














w



C




(


w


,
t

)


=



π
2




(



f

(
t
)


2

J


z



+

b



)

·

x




+



(

1
-


f

(
t
)



z




)

·
a



w








(
15
)







Herein,








z


=


sin



(


π
2



a

(

w


)


)



and



x



=

cos



(


π
2



a

(

w


)


)




,

1
=




(

1
,
1
,




1


)

T



and



á
(

w


)


=





w




a

(

w


)


.







The matrix J is the interaction matrix with its corresponding entries Jij. This gradient is preferably calculated using a specifically dedicated hardware, preferably a graphics processing unit or field programmable gate arrays.


The obtained gradient is then used to modify the initialization values of the parameters wi for the next loop within the iteration step and/or modify the values wi for the next iterative step.


An alternative approach according to embodiments of the invention is the sequential updating of the actual angles θi. This approach makes use of the first derivative of the cost function C({right arrow over (θ)}, t) at a fixed point in time for a specific variable θi. Considering (13), the variable/parameter con-tributes to the value of the cost function C({right arrow over (θ)}, t) by










C



(


θ
i

,
t

)


=



f

(
t
)




(


sin



θ
i







i



J

i

j




sin



θ
j


+

b
i


)


-


(

1
-

f

(
t
)


)



cos



θ
i







(
16
)







When deriving C(θi, t) after θi, one obtains θimin that minimizes this function (15) from








d


C

(


θ
i

,
t

)



d


θ
i



=
0.




This has an exact analytical solution and yields










θ
i
min

=

arctan



(


-


f

(
t
)


1
-

f

(
t
)







(







i



J

i

j




sin



θ
j


+

b
i


)



)






(
17
)







These values θimin minimize the cost function of the current step and are therefore used as the values for calculating the cost function in the next step (and/or calculating the values θimin consecutively in the current step) by using the updating θi←θimin. This approach updates the values θi one after the other. Within a single loop (i.e. within one step of the iteration), this updating can be done several times until the θi converge to the values θimin of the current iterative step at given time t. Furthermore, either additionally or alternatively, the equation (17) can be used to determine the θi that are to be used for the next time step in the iteration.


Both approaches, either the gradient decent technique or the sequential updating technique do require the calculation of matrix products because, as is seen from the above equations, it is always necessary to calculate matrix products comprising the Jij and some function linked to the relevant parameters {right arrow over (w)} or {right arrow over (θ)} that depends on the components of the associated vectors.


For that purpose, it can be preferred if the computer-implemented method is preferred on a computing system where at least the matrix products obtained in the step of calculating the cost function and/or updating the parameter values in line with steps 105 and 106 are performed on a specifically dedicated hardware for calculating matrix products, like a graphics processing unit (GPU) or field programmable gate arrays (FPGAs). Other calculations can then still be performed on the central processing unit (CPU) as long as they do not require matrix calculations. More generally, any heterogeneous computing platform comprising a CPU and a GPU or FPGAs or other dedicated hardware for calculating matrix products may be used in this context.


The gradient descent technique as discussed above and as used for obtaining the updated parameter vectors {right arrow over (w)}, respectively, for the further calculation has so far been employed in training of neural networks. It is a finding of the present invention that these techniques can be used in solving QUBO problems specifically when it comes to obtaining the updated values for the {right arrow over (w)} vectors and calculating, thereby, the values of the cost function. This is because these techniques allow for optimizing parameters also in landscapes (like the cost function when solving QUBO problems) that comprise one or more local minima.


In the next step 107, the time (and therefore the step of the iteration) is increased to the time






t
=



i
+
1

N

.





The method then resumes to step 105 and calculates the cost function for this new time using the updated values {right arrow over (w)} or θi for the respective parameters from step 106. Within the next loop at time







t
=


i
+
1

N


,




sequential updating or the gradient decent technique as explained above can like-wise be used to update the respective parameters/variables {right arrow over (w)} or θi.


A new value of the cost function is then obtained because not only the new values from step 106 are used to calculate the cost function, but the cost function also changes due to the change in the function f(t) as the new argument for the time






t
=


i
+
1

N





is used. This will result in a change of the value of the cost function in addition to the changes resulting from the changed parameter values.


The method then again proceeds from the step 105 overstep 106 where the updated parameter values {right arrow over (w)} or θi are obtained and then proceeds with the step 107, starting the cycle again.


As denoted with the item 110, these iterations can either be repeated until the number of steps i reaches the maximum number N of iterations or, optionally, until an aborting condition is reached. This aborting condition can, for example, comprise that the change of the value of the cost function, C({right arrow over (θ)}, ti+1) or C({right arrow over (w)}, ti+1) respectively, differs from the value of the cost function in the immediately preceding step by not more than a given amount denotes Δ∈custom-character so that 0≤C({right arrow over (X)}, ti)−C({right arrow over (X)}, ti+1)≤Δ and {right arrow over (X)} denotes the argument chosen, i.e. either {right arrow over (θ)} or {right arrow over (w)}. This aborting condition can ensure that the procedure is only aborted if the value of the subsequent step indeed is smaller than the value of the cost function of the preceding step and thus the associated parameter values for {right arrow over (θ)} or {right arrow over (w)} constitute a better approximation of the actual solution of the QUBO problem.


However, considering the formulation of the QUBO problem using the additional Hamiltonian operator Hx as already explained above, it is more preferred to continue until i=N, i.e. until the last step in the iteration, because, in some embodiments for which f(1)=1, at this point, the cost function is reduced to the cost function of the actual Ising Hamiltonian operator which constitutes the to be solved QUBO problem and the additional operator Hx is not further used in this step. This is because at the point i=N, the function f(t)=1 and, therefore, the additional Hamiltonian operator Hx, in this last step, vanishes from the cost function.


The parameter values obtained in this last step when updating them in line with the step 106 are the final solution or the finally approximated solution to the QUBO problem. The method then proceeds to the step 108 (by skipping the step 107 in case i=N) and derives from the final parameter values {right arrow over (θ)} or {right arrow over (w)} obtained, the actual spin values. This can be done by calculating







s
i

=


sign



(

θ
i

)


=


sign



(


π
2



a

(

w
i

)



)


=

sign



(

w
i

)








depending on which parameter values were chosen to solve the problem, i.e. cither {right arrow over (θ)} or {right arrow over (w)}. This can also be done in each step of the iteration, though explicitly obtaining the values si during the iteration can, in some embodiments, also be omitted.


Specifically, the intermediate values {right arrow over (θ)} or {right arrow over (w)} are connected to the intermediate spins and may thus be considered to be or to correspond or to represent these intermediate spins si.


These values si can then constitute the final solution to the QUBO problem unless additional transformations like si=2xi−1 have previously been performed in order to formulate the QUBO problem in line with or in a way that complies with the Ising Hamiltonian operator.


Once the final spin values si are obtained in step 108, the solution of the QUBO problem is derived in step 109 by applying any potentially necessary transformations.


The steps 108 and 109 can, in this sense, also be considered as a single step that is performed after the reaching of the final time t=1 and can be regarded as a step of obtaining the approximate values that solve the QUBO problem.


As already explained above, there are different ways of finding the intermediate values for the parameters that are used for solving the underlying problem.


In the case of using the gradient descent technique, it may also be preferred to use a method that applies a momentum to the gradient descent technique.


This comprises that, when updating the parameter vector {right arrow over (w)} this is updated by {right arrow over (w)}←{right arrow over (w)}+{right arrow over (ν)} where the velocity {right arrow over (ν)} is, in itself, updated in each step of the iteration (and/or within each loop within one step of the iteration) by










v






μ

v



-

η





w



C




(


w


,
t

)







(
18
)







For the first step in the iteration, the velocity may be set to {right arrow over (ν)}. Apart from the term this corresponds to equation (14). The parameter μ∈[0,1] can be set depending on the circumstances. It has been found that values for μ that are close to 1 provide improved results over comparably small values of μ. Preferably, the value μ may thus be set within a range of 0.9≤μ≤1 or 0.95≤μ≤1. In one embodiment, μ=0.99.


It is a finding of the present invention that by applying this momentum to the calculation of the updating of the parameter vector {right arrow over (w)} shallow local minima can be avoided, or the method can at least find a way out of these minima and will thus not be stuck in the same.


A further improvement to this application of momentum is the Nesterov Accelerated Gradient technique. Here, in the argument of the cost function from which the gradient is calculated according to (18) to update the velocity, instead of the above, the argument may be set to C({right arrow over (w)}+μ{right arrow over (ν)}, t). Instead of the values of {right arrow over (w)} of the current iterative step at time t (and/or of the current loop within a single iterative step), some approximation of how these values will be in the next step of the iteration (and/or in the next loop within the same iterative step) is used to calculate the velocity for obtaining the parameter vector {right arrow over (w)} for the next step in the iteration (and/or the next loop within the same iterative step).


As already explained above, it is a finding of the present invention that solving QUBO problems can, in view of approaches applied nowadays, result in a “stucking” of the algorithm in local minima that are pseudo-solutions to the QUBO problem but actually are not the global minimum that would constitute the solution to the QUBO problem. It is a finding of the present invention that these issues can be overcome by applying techniques that are so far applied in the context of training neural networks. Even though the described embodiments may, in this respect, not exactly result in the global minimum, they provide comparably better approximate solutions.


In the context of solving QUBO problems, it has been surprisingly found that these techniques allow for finding ways out of local minima and more reliably identifying the actual minimum of the problem (i.e. the global minimum) that then solves the underlying QUBO problem.


With the techniques applied, and when considering hardware implementations of the algorithm by, specifically, implementing steps of the algorithm that involve matrix multiplications on graphics processing units, QUBO problems of almost arbitrary size, at least comprising thousands or several tens of thousands of spins can be reliably solved in acceptable time using commonly available computing systems. Thereby, the need for using quantum computers for solving these problems is overcome at least to some extent, resulting in less cost-intensive hardware being required for solving also large-scale QUBO problems.



FIG. 2 shows the improvement in performance in approximately solving the QUBO problem using a method according to embodiments of the invention (dashed lined) compared to simulated bifurcation as is known from the prior art. The QUBO problem solved by the different methods is the known MAX-CUT optimization problem. For this QUBO problem, higher obtained cut values correspond to a better approximate solution of the problem. As is seen, for the same number of iterations of the algorithm, the obtained maximum value of the cost function by using embodiments of the invention is larger compared to that obtained with simulated bifurcation. Since the two algorithms have similar computations requirements per iteration, this implies that higher values can also be obtained at smaller GPU times. The performance shown in FIG. 2 was obtained when using the ADAM gradient technique with initial step size and η=4, a time function f(t)=t and multiplying the term Hz of the Hamiltonian by a constant 1/70.


In the above description and embodiments, the finding of an approximate solution for the QUBO problem was described. In some embodiments, it can be preferred to implement the computer-implemented method on a specific device like a general-purpose computer (or smartphone or laptop or the like) that can be directly accessed by a user. In such a case, for performing a method according to any of the above embodiments, it is preferred that the means for realizing a method for finding an approximate solution for a QUBO problem are, for example, implemented by an application that runs on the general-purpose computer. It can be even more preferred if the application does not require access to the internet or other connections to devices outside the device on which the application is run. Thereby, users can input information on the QUBO problems they intend to solve, and the approximate solution can be obtained without having to rely on remote hardware and/or software. This can be specifically advantageous if the input contains sensitive data.


In other embodiments, the application can be provided at least partially via a cloud architecture as software as a service, SaaS. In that case, a user may access the application via a remote device like a general-purpose computer or a smartphone, laptop or tablet or any other suitable device that preferably comprises a display device and means for inputting the necessary input.


This input can then be provided to the cloud architecture where an application can be run that finds the approximate solution for the QUBO problem based on the input. This may provide ad-vantages as software and/or hardware resources may be provided depending on the complexity of the QUBO problem that well extend the computational resources of the user hardware, thereby enabling the solution of a complex problem without physical access to the hardware that implements the invention. The required resources may, for example, be determined based on the number of variables or the dimensionality of the QUBO problem to be solved. This, in turn, can be derived from the input.


The embodiments described above may be used to solve physical or chemical problems that can be formulated as QUBO problems and may thus be used for example for physical or chemical simulations. Specifically, solid state systems and their behavior may be analyzed with embodiments of the present invention. Alternatively, constituents for chemical compounds may be identified using one or more of the above embodiments.


Furthermore, one or more of the above embodiments may be used to find approximate solutions for problems regarding financial forecasting. risk assessment or portfolio optimizations, among others. Also, solutions for social problems may be found by using one or more embodiments of the present disclosure.

Claims
  • 1-14. (canceled)
  • 15. A computer-implemented method for finding an approximate solution for a quadratic unconstrained binary optimization (QUBO) problem, the method being performedby a computing system and the method comprising: providing, as input to the computing system, the QUBO problem in a form comprisingan Ising Hamiltonian operator,iteratively obtaining a cost function, the cost function depending at least on the Ising Hamiltonian operator, one or more spins si and/or the step of the algorithmwithin each step of the iteration, obtaining, by the computing system, associated intermediate values of the one or more spins si using the cost function,obtaining, at the end of the iterative process, by the computing system, final values of the one or more spins si that approximately minimize the final iteratively obtained cost function,obtaining, by the computing system, from the final values of the one or more spins si,an approximate solution for the QUBO problem,
  • 16. The computer-implemented method of claim 15, wherein obtaining the final values ofthe one or more spins si that approximately minimize the final iteratively obtained cost function comprises applying a momentum to the gradient descent.
  • 17. The computer-implemented method of claim 15, wherein obtaining the final values ofthe one or more spins si that approximately minimize the final iteratively obtained cost function comprises using a sequential updating of intermediate values of the one or more spins si, the sequential updating comprising updating, in each iteration, each of intermediate values of the one or more spins si.
  • 18. The computer-implemented method of claim 15, wherein obtaining the final values of the one or more spins si that approximately minimize the final iteratively obtained cost function is at least partially carried out on hardware that is designed to perform matrix multiplications.
  • 19. The computer-implemented method of claim 18, wherein the hardware is or comprises a graphics processing unit and/or field programmable gate arrays.
  • 20. The computer-implemented method of claim 15, wherein the method is performed without using a quantum computer.
  • 21. The computer-implemented method of claim 15, wherein the step of minimizingthe cost function iteratively comprises calculating a gradient of the cost function and wherein the calculation of the gradient is performed using the hardware.
  • 22. The computer-implemented method of claim 15, wherein the Ising Hamiltonian operator Hz is represented as Hz=ΣijJijσz(i)σz(j)+Σibiσz(i), where i and j denote positions of spins i and j and Jij denotes an interaction strength between a spin at position i and a spin at position j, wherein bi denotes a bias term at position i and σz(i), σz(j) denote the z-Pauli matrices acting on a spin at position i and a spin at position j, and wherein the QUBO problem is represented using the position dependent interaction strength Jij and/or the position dependent bias term bi.
  • 23. The computer-implemented method of claim 22, wherein the position-dependence is not trivial.
  • 24. The computer-implemented method of claim 15, wherein iteratively obtaining the cost function comprises using an operator that does not commute with the Ising Hamiltonian operator.
  • 25. The computer-implemented method of claim 24, wherein the operator is a Hamiltonian operator.
  • 26. A computing system comprising a processor and memory, wherein the computing system is adapted to execute a computer-implemented method according to claim 15.
  • 27. The computing system of claim 26, further comprising a graphics processing unit and/or field programmable gate arrays.
  • 28. A computer-readable storage medium comprising computer-executable instructions that, when executed by a computing system, cause the computing system to perform a computer-implemented method according to claim 15.
Priority Claims (1)
Number Date Country Kind
21382653.0 Jul 2021 EP regional
PCT Information
Filing Document Filing Date Country Kind
PCT/EP2022/070077 7/18/2022 WO