This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2023-80206, filed on May 15, 2023, the entire contents of which are incorporated herein by reference.
The embodiments discussed herein are related to an information processing system, an information processing method, and a non-transitory computer-readable recording medium storing a program.
There is an Ising machine (also referred to as a Boltzmann machine) that calculates a multivariable combinatorial optimization problem, which a von Neumann computer is not good at, by replacing the combinatorial optimization problem with an Ising model, which is a model representing the behavior of spins in a magnetic material. Examples of a method for solving the problem replaced with the Ising model in a practical time include a simulated annealing (SA) method and a Markov chain Monte Carlo (MCMC) method such as a replica exchange method. The combinatorial optimization problem is formulated with an energy function including a plurality of state variables. For example, an information processing device searches for a ground state of the Ising model that minimizes a value of the energy function by repeatedly trying state transitions caused by values of the state variables being changed using the MCMC method. The ground state corresponds to an optimal solution of the combinatorial optimization problem.
The energy function may be expressed by a sum of a cost term and a constraint term. The energy function (E(x)) may be expressed by, for example, the following equation (1).
In the equation (1), x represents a vector based on a state variable that is a binary variable of x=(x1, x2, . . . , xn). C (x) is a cost term, which is an objective function to be minimized. αP(x) is a constraint term. P(x)=0 holds when the combinatorial optimization problem satisfies a constraint condition to be satisfied (at a time of constraint satisfaction), and P(x)>0 holds when the constraint condition is not satisfied (at a time of constraint violation). a represents a constraint coefficient, which has a value larger than 0. By adding such a constraint term to the cost term, it becomes difficult to select a solution that does not satisfy the constraint condition (constraint violation solution).
Note that the state of the Ising model that maximizes the value of the energy function may be searched for if the sign of the energy function is changed.
Examples of a practical problem of the combinatorial optimization problem include a quadratic assignment problem (QAP). The QAP is a problem that obtains, when n allocation elements (facilities, etc.) are allocated to n allocation destinations (places, etc.), allocation that minimizes a sum of products of flow amounts (costs such as transportation amounts of supplies between the facilities) between the allocation elements and distances between the allocation destinations to which the individual allocation elements are allocated. The flow amounts between the allocation elements and the distances between the individual allocation destinations may be each expressed using a matrix. Furthermore, a constraint condition that all allocation elements are allocated to allocation destinations different from each other is imposed on the QAP.
Meanwhile, a constraint violation solution is likely to be generated when the constraint coefficient as described above (α in the case of equation (1)) is too small, whereas an energy barrier between constraint satisfaction solutions increases to hinder a solution space search when it is too large. Furthermore, an appropriate value of the constraint coefficient largely varies depending on the individual problem. Until now, a method of optimizing a constraint coefficient by repeating processing of adjusting the constraint coefficient based on a search result of a solution for a combinatorial optimization problem by an Ising machine has been proposed.
International Publication Pamphlet No. WO 2021-044516 and Japanese Laid-open Patent Publication No. 2004-110831 are disclosed as related art.
According to an aspect of the embodiments, there is provided an information processing system configured to calculate a combinatorial optimization problem that includes a constraint condition based on an energy function of an Ising model represented by a sum of a cost term and a constraint term that corresponds to the constraint condition, the information processing system including: a processing circuit configured to obtain problem information of the combinatorial optimization problem, calculate, based on the problem information, an expected value of a change amount of the cost term when transition from a solution that satisfies the constraint condition to a solution that does not satisfy the constraint condition occurs, determine a coefficient of the constraint term based on the expected value, and output information of the energy function using the determined coefficient; and a search circuit that searches for a ground state of the Ising model based on the output information.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
However, according to the existing method, a large number of times of adjustment may be needed to obtain a constraint coefficient that enables a search in a wide range to obtain a high-quality solution, and it takes time to determine the constraint coefficient.
In one aspect, an object of the embodiments is to provide an information processing system, an information processing method, and a program capable of shortening a time needed to determine a constraint coefficient.
Hereinafter, modes for carrying out the embodiments will be described with reference to the drawings.
An energy function of an Ising model may be expressed by the equation (1) described above. Here, it is sufficient if α, which is a constraint coefficient, is set to a value that makes a constraint violation solution larger than a constraint satisfaction solution to avoid constraint violation. However, the constraint satisfaction solution has a range from an optimal solution in which a value of C(x) is minimized to a solution in which the value of C(x) is larger even if a constraint condition is satisfied, and it is not clear which value is suitable to be used as a reference. Furthermore, a value of the optimal solution is not clear in the first place. As described above, it has been proposed to optimize the constraint coefficient by repeating processing of adjusting the constraint coefficient based on a search result of a solution for a combinatorial optimization problem by an Ising machine, but the adjustment takes time. In view of the above, in the present embodiment, α is determined by focusing on a change amount of the value of C(x) at a time of transition from the constraint satisfaction solution to the constraint violation solution.
Note that, according to the MCMC method using a temperature parameter, a probability that the transition from the constraint satisfaction solution to the constraint violation solution with an increase in the value of C(x) is accepted is low although it depends on a setting of a value of the temperature parameter. Thus, hereinafter, a case where the value of C(x) decreases due to the transition from the constraint satisfaction solution to the constraint violation solution will be considered.
In the case where the value of C(x) decreases due to the transition from the constraint satisfaction solution to the constraint violation solution, E(x) of the equation (1) becomes smaller due to the transition if αP(x) added due to the constraint violation is smaller than the change amount (decrease amount) of C(x). Accordingly, the transition to the constraint violation solution is highly likely to occur. Thus, it is sufficient if αP(x) is set to a value that compensates for at least the decrease amount of C(x) to suppress the constraint violation. Next, a method of calculating an expected value of the decrease amount of C(x) will be described.
When C (x) is expressed by quadratic unconstrained binary optimization (QUBO) in the equation (1), the following equation (2) is obtained.
In the equation (2), Qij represents a coupling coefficient matrix, hi represents a bias coefficient vector, and c represents a constant term.
While the number of elements qij (i<j) of Qij is n2 when the order of Qij is assumed to be n, since qij=qji and qii=0 (i.e., diagonal elements are 0), the number of valid qij is N=n(n−1)/2. Hereinafter, the valid N qij described above are set as elements (excluding diagonal elements) of an upper triangle of Qij, which is a coupling coefficient matrix, and are expressed as qs (s=1, . . . , N). A sum of qs may be expressed by the following equation (3), and an average value (μq) of qs may be expressed by the following equation (4).
Furthermore, the number of elements of hi is n, and an average value (μh) of hi may be expressed by the following equation (5).
In the equation (2), Qijxixj has a value other than 0 in a case of xixj=1. That is, in the case of xixj=1, an element that intersects the i-column and the j-row of Qij is selected. When the number of state variables of xi=1 is assumed to be p in xi (i=1, . . . , n) included in a certain solution x, the number of elements of Qij to be selected is p (p−1)/2.
In the equation (2), hixi has a value other than 0 in the case of xi=1. That is, in the case of xi=1, the elements of hi are selected. When the number of state variables of xi=1 is assumed to be p, the number of elements of hi to be selected is p.
When one of p xi having a value of 1 is inverted (deselected) from 1 to 0, the number of elements of Qij to be deselected is p−1, and the number of elements of hi to be deselected is 1. The sum of the deselected elements of Qij and hi is the decrease amount of C(x).
Considering an attempt to extract an element from qs (s=1, . . . , N) described above, values X=q1, . . . , qN of the extracted element are random variables. However, the extraction of the deselected p−1 elements described above is sampling without replacement, and the deselected p−1 elements exist in the same row or column of Qij, and moreover, the selection of xi to be inverted is not random but a function of x. Therefore, the random variable sequence {X1, . . . , Xp−1} does not follow independent identical distribution.
However, the sampling without replacement may be approximately regarded as sampling with replacement in a case where n is larger. Furthermore, in the selection of xi to be inverted, solution by the Ising machine includes a stochastic process. In view of the above, assuming that X is a random variable following the independent identical distribution (iid) as approximation, an expected value of the sum S of the random variable sequence {X1, . . . , Xp−1} including the deselected p−1 elements is μq(1→0)=(p−1)μq.
Likewise, an attempt to extract an element from a set of bias coefficients {hs(s=1, . . . , N)} is considered. Assuming that values Y=h1, . . . , hp of the extracted element are random variables following the iid, an expected value of the sum of the random variable sequence {Y1} (number of elements is 1as described above) including the deselected element is μh(1→0)=μh.
From the above, the expected value of the decrease amount of C(x) is Δ−C(x)=μq(1→0)+μh(1→0)=(p−1)μq+μh. When such Δ−C(x) is used as the constraint coefficient α, αP(x) added due to the constraint violation may be set to a value that compensates for the decrease amount of C(x) in the case where the value of C(x) decreases due to the transition from the constraint satisfaction solution to the constraint violation solution.
In an objective function of a QAP, hi and c are identically 0, and thus C(x) may be expressed by the following equation (6).
In the QAP, each of a flow amount between allocation elements and a distance between individual allocation destinations may be expressed using a matrix (see
In the equation (7), fij represents a flow amount of the i-column j-row of a flow matrix F representing the flow amount between the allocation elements (flow amount between the i-th allocation element and the j-th allocation element). A distance of the k-column I-row of a distance matrix D that represents a distance between allocation destinations (distance between the k-th allocation destination and the l-th allocation destination) is represented by dkl. Allocation of the i-column k-row of an allocation matrix x (see
Note that P(x) in the equation (1) may be expressed as an equation (8) in the QAP.
The first term on the right side is a constraint term for making one i that satisfies xik=1 with the same k, and indicates a constraint that one allocation element is allocated to one allocation destination. The second term on the right side is a constraint term for making one k that satisfies xik=1 with the same i, and indicates a constraint that one allocation element is allocated to one allocation destination. While α≠β may be satisfied, the flow matrix F and the distance matrix D are essentially symmetric matrices in the QAP, and thus β=α may be satisfied and the constraint term may be collectively expressed as αP(x).
Qij in the equation (6) may be expressed by a matrix Q as in the following equation (9) based on a Kronecker product of the flow matrix F and the distance matrix D. The matrix Q includes n×n small matrices fijD.
When the equation (9) is expanded, the matrix Q having the order of n2 as indicated in an equation (10) is obtained.
Furthermore, as in the following equation (11), the allocation matrix x may be expressed by a column vector having the number of elements of n2.
While the number of elements qij (i<j) in the matrix Q is n4 as in the equation (10), since qij=qji and qii=0 (i.e., diagonal elements are 0), the number of valid qij is N=n2(n−1)/2. When the N valid qij are set as elements (excluding diagonal elements) of an upper triangle of the matrix Q and expressed as qs(s=1, . . . , N), the sum of qs may be expressed by the equation (3) described above, and the average value of qs may be expressed by the equation (4) described above.
While the expected value of the decrease amount of C(x) may be expressed as Δ−C(x)=μq(1→0)+μh(1→0)=(p−1)μq+μh as described above, μh≡0 holds in the case of the QAP. Furthermore, in the case of the QAP, the number p of xi that is 1 in the constraint satisfaction solution is p=n. Therefore, Δ−C(x)=(p−1)μq=(n−1)μq=(n−1)S/N=2S/n2(n+1) holds for Δ−C(x).
When such Δ−C(x) is used as the constraint coefficient α, αP(x) added due to the constraint violation may be set to a value that compensates for the decrease amount of C(x) in the case where the value of C(x) decreases due to the transition from the constraint satisfaction solution to the constraint violation solution.
As expressed in the equation (9), the matrix Q includes n×n small matrices fijD. The number of small matrices not including the diagonal element (fijD (i=j)) in the upper triangle of the matrix Q is n(n−1)/2, and the number of elements not including the diagonal element (fijdki (k=l)) in each of the small matrices is n(n−1). In this case, the number of valid qij (i<j) is N′=n(n−1)×n(n−1)/2=n2(n−1)2/2. When the valid N′ qij described above are expressed as qs(s=1, . . . , N′), a sum S′ of qs may be expressed by the following equation (12).
Here, since a value of an invalid element of the matrix Q is 0 in the QAP, S′=S holds. Thus, the average value of qs may be expressed by the following equation (13).
Therefore, Δ−C(x)=(n−1)S/N′=2(n−1)S/n2(n−1)2=2S/n2(n−1) holds for Δ−C(x).
Even if such Δ−C(x) is used as the constraint coefficient α, αP(x) added due to the constraint violation may be set to a value that compensates for the decrease amount of C(x) in the case where the value of C(x) decreases due to the transition from the constraint satisfaction solution to the constraint violation solution.
An information processing system 10 calculates a combinatorial optimization problem based on an energy function of an Ising model as expressed in the equation (1).
The information processing system 10 includes a storage unit 11, a processing unit 12, and a search unit 13.
The storage unit 11 may be a volatile storage device, such as a dynamic random access memory (DRAM) or the like, or may be a nonvolatile storage device, such as a hard disk drive (HDD), a flash memory, or the like. The storage unit 11 stores various types of data to be used for processing by the processing unit 12. Furthermore, the storage unit 11 stores problem information of the combinatorial optimization problem. For example, in a case where the optimization problem is a QAP, the problem information includes a flow amount between individual allocation elements and a distance between individual allocation destinations.
The processing unit 12 may include a central processing unit (CPU), a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), and the like. The processing unit 12 may be a processor that executes a program. The “processor” may include a set of a plurality of processors (multiprocessor).
The processing unit 12 obtains the problem information of the combinatorial optimization problem, formulates the combinatorial optimization problem with the energy function based on the problem information, and outputs information regarding the energy function to the search unit 13. The processing unit 12 obtains a solution obtained by the search unit 13, and outputs the obtained solution to the outside. For example, the processing unit 12 may display the content of the solution on a display device coupled to the information processing system 10, or may transmit the solution to another information processing device via a network.
The search unit 13 receives the information regarding the energy function from the processing unit 12, and searches for, based on the information regarding the energy function, a ground state of the Ising model, that is, a solution that minimizes the value of the energy function. The search unit 13 outputs the solution obtained through the search to the processing unit 12. The search unit 13 may be implemented by dedicated hardware. For example, a search circuit implemented by using an integrated circuit such as an FPGA, an ASIC, a graphics processing unit (GPU), or the like may function as the search unit 13. However, the search unit 13 may be implemented by a processor executing a predetermined program. Examples of a search method used by the search unit 13 include an SA method, a replica exchange method, simulated quantum annealing (SQA), and the like.
Hereinafter, exemplary operation of the information processing system 10 according to the first embodiment will be described.
At a time of formulating the combinatorial optimization problem with the energy function, the processing unit 12 determines a value of α, which is a coefficient of a constraint term (constraint coefficient).
First, the processing unit 12 obtains (reads) the problem information of the combinatorial optimization problem stored in the storage unit 11.
The problem information of the QAP includes a flow amount between individual allocation elements and a distance between individual allocation destinations. The flow amount between the individual allocation elements is represented by a flow matrix F of n rows and n columns, and the distance between the individual allocation destinations is represented by a distance matrix D of n rows and n columns.
Note that
The processing unit 12 calculates, based on the obtained problem information, an expected value of a change amount of a cost term in a case where transition from a solution that satisfies a constraint condition of the combinatorial optimization problem (constraint satisfaction solution) to a solution that does not satisfy the constraint condition (constraint violation solution) occurs.
In a case where the cost term is expressed as in the equation (2), the expected value of the change amount (decrease amount) may be calculated by an equation expressed as Δ−C(x)=(p−1)μq+μh as described above (method of calculating the expected value of the decrease amount of C(x)). Note that, as described above, p is the number of state variables of xi=1 among n state variables xi (i=1, . . . , n) included in a certain solution. An average value of valid N elements qs (s=1, . . . , N) in Qij, which is a coupling coefficient matrix, is represented by μq, and an average value of n elements of hi (bias coefficient vector) is represented by μh.
A constraint satisfaction solution is stored in the storage unit 11 as an initial solution, and the processing unit 12 may determine a value of p based on the initial solution, or the value of p itself may be stored in the storage unit 11 in advance. Based on the problem information, μq and μh are calculated by the equations (3) to (5) described above. In a case where the combinatorial optimization problem is a traveling salesman problem (TSP) or the like, Δ−C(x) may be calculated by Δ−C(x)=(p−1)μq+μh described above.
In the case of the QAP, as described above, it is sufficient if the average value of the valid elements that are the elements (excluding diagonal elements) of an upper triangle of the matrix Q is obtained. Since the number of valid elements is N=n2(n−1)/2, S/N=2S/n2(n2−1), which is obtained by dividing the sum S of the values of the valid elements by N, is the average value μq. Moreover, since p=n holds in the case where the combinatorial optimization problem is the QAP, Δ−C(x)=2S/n2 (n+1) holds.
Note that the processing unit 12 may calculate Δ−C(x) by an equation expressed as Δ−C(x)=2S/n2(n−1) as described above (another example of Δ−C(x)).
Furthermore, as the problem size (n) of the QAP increases, n+1 and n−1 included in the calculation equation of Δ−C(x) become roughly equal, and also become roughly equal to n. Thus, even if n is used instead of n+1 or n−1, the value of Δ−C(x) is a roughly similar value. Therefore, the calculation equation of Δ−C(x) described above may be appropriately changed according to the problem size.
The processing unit 12 determines the constraint coefficient a based on the calculated Δ−C(x). For example, the processing unit 12 may determine α=Δ−C(x), or may determine a value obtained by multiplying Δ−C(x) by an adjustment coefficient to be described later as α.
Thereafter, the processing unit 12 outputs information regarding the energy function obtained by formulating the combinatorial optimization problem using the determined α. The information regarding the energy function to be output is, for example, matrix data (also referred to as a QUBO matrix) in which the energy function of the equation (1) is expressed in a QUBO format.
The search by the search unit 13 is executed as follows, for example. As an example, a case of using the SA method or the replica exchange method is considered.
The search unit 13 calculates, for each of a plurality of state variables, a change amount of the value of the energy function due to a change in the value of one state variable of the plurality of state variables, and stochastically accepts it in a manner of giving a priority to a change in which the value of the energy function becomes smaller. At this time, with a steepest descent method, it is difficult to escape in a case of falling into a local solution. In view of the above, the search unit 13 uses a metropolis method or a Gibbs method to determine a transition probability of the Ising model from a certain state to the next state caused by a certain state variable being changed. That is, the search unit 13 stochastically allows a change in which the value of the energy function becomes larger according to a comparison between the change amount of the value of the energy function and a noise value. The noise value is obtained based on a value of a temperature parameter or a random number. The greater the value of the temperature parameter, the greater the amplitude of the noise value. The greater the amplitude of the noise value, the easier a state transition with a greater amount of increase in the value of the energy function is allowed.
The search unit 13 repeatedly tries the state transition based on such processing from the start of the search until a search end condition is satisfied, and obtains a set of values of the plurality of state variables when the search end condition is satisfied as a solution. For example, the search end condition is verified according to whether or not a predetermined unit time has elapsed, or whether or not the number of trials for the change in the state variable has reached a predetermined unit number of times, or the like.
As described above, the processing unit 12 of the information processing system 10 determines the constraint coefficient a based on the expected value Δ−C(x) of the change amount of C(x) in the case where the transition from the constraint satisfaction solution to the constraint violation solution occurs. As a result, αP(x) added due to the constraint violation may be set to a value that compensates for the decrease amount of C(x) in the case where the value of C(x) decreases due to the transition from the constraint satisfaction solution to the constraint violation solution. Therefore, it becomes possible to suppress occurrence of a large number of times of adjustment to obtain an appropriate a, and to shorten a time needed to determine the constraint coefficient.
An information processing system 20 includes an information processing device 100 and an optimization device 200.
The information processing device 100 generates information regarding an energy function corresponding to a combinatorial optimization problem by formulating the combinatorial optimization problem, and inputs the generated information regarding the energy function to the optimization device 200. In the second embodiment, a QAP is handled as an example of the combinatorial optimization problem to be calculated.
The information processing device 100 includes a CPU 101, a RAM 102, an HDD 103, an input/output (IO) interface 104, a GPU 105, an input interface 106, a medium reader 107, and a network interface card (NIC) 108. The CPU 101 is an example of the processing unit 12 according to the first embodiment. The RAM 102 is an example of the storage unit 11 according to the first embodiment.
The CPU 101 is a processor that executes program commands. The CPU 101 loads at least a part of a program and data stored in the HDD 103 into the RAM 102, and executes the program. Note that the CPU 101 may include a plurality of processor cores. Furthermore, the information processing device 100 may include a plurality of processors. A set of the plurality of processors will be sometimes referred to as a “multiprocessor” or simply a “processor”.
The RAM 102 is a volatile semiconductor memory that temporarily stores a program to be executed by the CPU 101 and data to be used by the CPU 101 for operations. The RAM 102 is, for example, a DRAM. Note that the information processing device 100 may include a memory of a type different from the RAM 102, or may include a plurality of memories.
The HDD 103 is a nonvolatile storage device that stores programs of software, such as an operating system (OS), middleware, application software, and the like, and data. Note that the information processing device 100 may include another type of storage device, such as a flash memory, a solid state drive (SSD), or the like, and may include a plurality of nonvolatile storage devices.
The IO interface 104 is coupled to the optimization device 200, and inputs/outputs data to/from the optimization device 200 in accordance with an instruction from the CPU 101. For example, in response to an instruction from the CPU 101, the IO interface 104 writes data of the RAM 102 to a register or a memory of the optimization device 200 or reads data from the optimization device 200 and writes it to the RAM 102. For example, a peripheral component interconnect-express (PCI-e) or the like is used as the IO interface 104.
The GPU 105 outputs an image to a display 111 coupled to the information processing device 100 in accordance with the instruction from the CPU 101. As the display 111, any type of display, such as a cathode ray tube (CRT) display, a liquid crystal display (LCD), a plasma display, an organic electro-luminescence (OEL) display, or the like, may be used.
The input interface 106 obtains an input signal from an input device 112 coupled to the information processing device 100, and outputs it to the CPU 101. As the input device 112, a pointing device such as a mouse, a touch panel, a touch pad, a trackball, or the like, a keyboard, a remote controller, a button switch, or the like may be used. Furthermore, a plurality of types of input devices may be coupled to the information processing device 100.
The medium reader 107 is a reading device that reads a program and data recorded in a recording medium 113. As the recording medium 113, for example, a magnetic disk, an optical disk, a magneto-optical disk (MO), a semiconductor memory, or the like may be used. Examples of the magnetic disk include a flexible disk (FD) and an HDD. Examples of the optical disk include a compact disc (CD) and a digital versatile disc (DVD).
The medium reader 107 copies, for example, a program or data read from the recording medium 113 to another recording medium such as the RAM 102, the HDD 103, or the like. The read program is executed by, for example, the CPU 101. Note that the recording medium 113 may be a portable recording medium, and is sometimes used for distribution of a program and data. Furthermore, the recording medium 113 and the HDD 103 may be referred to as computer-readable recording media.
The NIC 108 is an interface that is coupled to a network 30 and communicates with another computer through the network 30. For example, the NIC 108 is coupled to a communication device, such as a switch, a router, or the like, with a cable.
The optimization device 200 is an accelerator that carried out, based on the information regarding the energy function, a ground state search by, for example, the SA method or an MCMC method such as the replica exchange method or the like by hardware. The optimization device 200 may be referred to as an Ising machine, a Boltzmann machine, or the like. The optimization device 200 includes an FPGA 201 and a RAM 202. The FPGA 201 implements a solution search function for a problem expressed by an Ising model. The solution search function by the optimization device 200 may be implemented by another type of integrated circuit such as an ASIC, a GPU, or the like. The RAM 202 is a memory that stores various types of data used for the solution search by the FPGA 201. The RAM 202 is, for example, a static RAM (SRAM).
Note that the optimization device 200 may be hardware that carries out the ground state search by a quantum annealing method. Furthermore, a function of a search unit that uses, for example, the SA method, the replica exchange method, an SQA method, or the like may be implemented by, instead of the optimization device 200, the CPU 101 included in the information processing device 100 or another processor executing predetermined software.
The information processing device 100 includes a storage unit 120, a control unit 130, and a solution output unit 140. A storage area of the RAM 102 or the HDD 103 is used as the storage unit 120. The control unit 130 and the solution output unit 140 are implemented by the CPU 101 executing a program stored in the RAM 102.
Furthermore, the optimization device 200 includes a search unit 210. The search unit 210 is implemented by the FPGA 201.
The storage unit 120 stores various types of data used for formulation of the QAP by the control unit 130. For example, the storage unit 120 stores information regarding a flow matrix F and a distance matrix D in the QAP, and information regarding a matrix Q (see
The control unit 130 receives an input of data of the QAP, and formulates the QAP based on the data. The control unit 130 generates information regarding the energy function expressing the equation (1) in a QUBO format. The control unit 130 outputs the generated information regarding the energy function to the optimization device 200, and causes the optimization device 200 to search for a solution based on the energy function.
The control unit 130 obtains the solution obtained through the search by the optimization device 200, and outputs the obtained solution to the solution output unit 140 in a case where the obtained solution, which satisfies the constraint, is to be output as a final solution.
The solution output unit 140 obtains the solution obtained through the search by the optimization device 200, and outputs the obtained solution. For example, the solution output unit 140 may convert the solution obtained from the optimization device 200 into a solution format for the QAP and cause the display 111 to display it, or may transmit information regarding the solution to another computer via the network 30.
The search unit 210 obtains the information regarding the energy function output from the control unit 130, and searches for a ground state of the Ising model, that is, an optimal solution, based on the energy function. The search unit 210 outputs the solution obtained through the search to the information processing device 100.
(Correspondence Relationship between Presence or Absence of Constraint Violation and Energy Function in QAP)
The energy function of the QAP includes a constraint term as expressed in the equation (8). Constraint conditions are, for example, “at one position, one facility is placed” and “one facility is placed at one position” in a facility placement problem, which is an example of the QAP. Thus, the search in the search unit 210 is affected by the constraint term.
For example, the state “1” is {xik}=(x11, x12, x21, x22)=(0, 1, 1, 0). The state “1” does not violate the constraint condition, that is, a state of constraint satisfaction. Thus, the constraint violation eg corresponding to the state “1” is “0”.
The state “2” is {xik}=(x11, x12, x21, x22)=(0, 0, 1, 0). The state “2” violates the constraint condition. The constraint violation eg corresponding to the state “2” is “α+β”.
The state “3” is {xik}=(x11, x12, x21, x22)=(0, 0, 0, 0). The state “3” violates the constraint condition. The constraint violation eg corresponding to the state “3” is “2 (α+β)”.
The state “4” is {xik}=(x11, x12, x21, x22)=(1, 0, 0, 0). The state “4” violates the constraint condition. The constraint violation eg corresponding to the state “4” is “α+β”.
The state “5” is {xik}=(x11, x12, x21, x22)=(1, 0, 0, 1). The state “5” does not violate the constraint condition, that is, a state of constraint satisfaction. Thus, the constraint violation eg corresponding to the state “5” is “0”.
In
Furthermore, in
Since the state “1” is the state of constraint satisfaction (constraint satisfaction solution) and the constraint violation eg described above is “0”, the energy of the sequences G1a and G1b coincides with each other.
Since the states “2” and “4” are the state of constraint violation and the constraint violation eg described above is “α+β”, the energy of the sequence G1a is higher than the energy of the sequence G1b by “α+β”.
Since the state “3” is also the state of constraint violation and the constraint violation eg described above is “2(α+β)”, the energy of the sequence G1a is higher than the energy of the sequence G1b by “2(α+β)”.
As illustrated in
Therefore, it is preferable to determine the constraint coefficients a and B having appropriate magnitude. A constraint violation solution is likely to be generated when a and B are too small, whereas an energy barrier between constraint satisfaction solutions increases to hinder a solution space search when they are too large.
Thus, in a similar manner to the processing unit 12 of the information processing system 10 in
In the case of the QAP, for example, by setting α=Δ·C(x)=2S/n2(n+1) as described above, the constraint term added due to the constraint violation may be set to a value that compensates for the decrease amount of C(x).
Next, a processing procedure of the information processing system 20 will be described.
The control unit 130 obtains the flow matrix F and the distance matrix D from the data of the QAP input by a user (step S1).
The control unit 130 performs matrix expansion (step S2). That is, the control unit 130 calculates a Kronecker product of the flow matrix F and the distance matrix D to generate a matrix Qij.
Allocation elements are represented by A to D, and allocation destinations are represented by a to d. Furthermore,
The flow matrix F is a matrix in which flow amounts between the allocation elements A to D are arranged. For example, fAB represents a flow amount between the allocation elements A and B.
The distance matrix D is a matrix in which distances between the allocation destinations a to d are arranged. For example, dab represents a distance between the allocation destinations a and b.
The matrix Qij obtained by the Kronecker product of the flow matrix F and the distance matrix D includes 4×4 small matrices as illustrated in
A value of each element included in four diagonal small matrices of the matrix Qij represented by a product of the flow amount (=0) between the same allocation elements and each distance is 0. Furthermore, a value of diagonal elements of each of the small matrices represented by a product of a distance (=0) between the same allocation destinations and each flow amount is 0.
Note that, since the small matrices included in the lower triangular portion of the matrix Qij includes small matrices corresponding to the upper triangular portion, the information to be output to the search unit 210 only needs to include information regarding the small matrices corresponding to the upper triangular portion (excluding the four diagonal small matrices described above).
Next, the control unit 130 calculates Δ·C(x) (step S3).
As described above, Δ−C(x) represents an expected value of the decrease amount of C(x) in the case where the transition from the constraint satisfaction solution to the constraint violation solution occurs.
At a time of calculating the cost term, the element qij at a position where a row and a column of Qij corresponding to the state variable having a value of 1 intersect is selected.
In the example of
Here, in a case where x13 is flipped (in a case where the value changes from 1 to 0), the number of state variables having a value of 1 in the first row of the allocation matrix x becomes 0. Thus, the constraint condition of the QAP is not satisfied, and a constraint violation solution is obtained. In this case, among the six qij described above, three qij having values of 12, 8, and 16 are deselected. Therefore, the decrease amount of C(x) is 36.
The expected value Δ−C(x) of such decrease amount of C(x) may be calculated by, for example, an equation expressed as Δ−C(x)=2S/n2 (n−1) as described above.
Next, the control unit 130 determines whether or not an adjustment coefficient has been determined (step S4). The control unit 130 performs processing of step S6 if it is determined that the adjustment coefficient has been determined, and performs processing of step S5 if it is determined that the adjustment coefficient has not been determined.
In the processing of step S5, the control unit 130 determines the adjustment coefficient. The processing of the step S5 will be described later. After the processing of step S5, the control unit 130 performs the processing of step S6.
In the processing of step S6, the control unit 130 calculates a constraint coefficient a by multiplying Δ−C(x) by the adjustment coefficient.
Thereafter, the control unit 130 generates information regarding the energy function expressing the equation (1) in the QUBO format (step S7).
The control unit 130 sets the information regarding the energy function generated in step S7 in the search unit 210 (step S8).
The search unit 210 executes the combinatorial optimization operation based on the set information regarding the energy function. The search unit 210 searches for a solution for the QAP by the operation (step S9).
The search unit 210 outputs the solution obtained through the search to the control unit 130 (step S10).
The control unit 130 obtains the solution from the search unit 210, outputs it to the outside via the solution output unit 140, and terminates the process.
Processing of steps S20 and S21 is the same as the processing of steps S1 and S2 illustrated in
In processing of step S22, an initial value (temporary value) of the constraint coefficient a is set. Thereafter, processing similar to that in steps S7 to S9 illustrated in
Then, it is determined whether or not the constraint condition of the QAP is satisfied based on the allocation matrix x (step S27). Processing of step S28 is performed if it is determined that the constraint condition is satisfied, and processing of step S30 is performed if it is determined that the constraint condition is not satisfied.
In the processing of step S28, the value of the cost term obtained by the combinatorial optimization operation is output.
Thereafter, it is determined whether or not the value of the cost term satisfies a predetermined condition (step S29), and processing of step S31 is performed if it is determined that the condition is satisfied, and processing of step S30 is performed if it is determined that the condition is not satisfied.
In the processing of step S30, the constraint coefficient α is updated. For example, adjustment of increasing the constraint coefficient is performed if a constraint satisfaction solution is not obtained, and adjustment of decreasing the constraint coefficient is performed if the value of the cost term is larger than a predetermined threshold. After the processing of step S30, the process from step S23 is repeated.
In the processing of step S31, processing similar to that of step S10 illustrated in
Next, an exemplary case where various instances of the QAP are calculated using the method of the comparative example as described above will be described above.
When the constraint coefficient is changed from a small value to a larger value, while only the constraint violation solution is obtained in the part where the constraint coefficient is small, the constraint satisfaction solution is obtained as the constraint coefficient increases in the calculation result for any instance. Ultimately, the constraint satisfaction solution is obtained for approximately all the replicas. As described above, while the constraint satisfaction solution may be almost reliably obtained when the constraint coefficient is sufficiently increased, commonly, a solution in which the value of the cost term is closer to the optimal solution (suboptimal solution: solution in which the value of the cost term is sufficiently closer to the value of the optimal solution) is obtained as the constraint coefficient is smaller. Thus, an appropriate value of the constraint coefficient is a value from a value at which the constraint satisfaction solution starts to be obtained to a value that is roughly a constraint satisfaction solution and in which a partial constraint violation solution remains (portion where the inclination of the graph is positive). As illustrated in
For example, in the initial state of solution for rou12 (state before calculation of the relationship as illustrated in
Meanwhile, in a case of calculating nug12 and the like, assuming that the initial value of the constraint coefficient is 50,000, the constraint violation solution is obtained when it is changed to 50,000, 20,000, 10,000, 5,000, 2,000, 1,000, 500, 200, 100, and 50, for example. Thus, there are many useless attempts.
Although it is conceivable to optimize the constraint coefficient using Bayesian inference or the like instead of manual operation, there is no change in that many attempts are needed.
In contrast to the comparative example as described above, according to the information processing system 20 of the present embodiment that performs the process as illustrated in
Next, determination of the adjustment coefficient will be described.
The instances in which the solution is performed by the Ising machine (represented using the QUBO matrix) have different “statistical properties” (average and variance of matrix elements, matrix sparseness and denseness, value distribution, etc.) depending on the problem. Thus, it is preferable to determine the constraint coefficient a by adjusting Δ−C(x) according to the instance sequence.
Problems to be formulated as QAPs include not only allocation of factories to factory sites but also various things. For example, there are arrangement of hospital rooms and facilities in a hospital, gate arrangement for minimizing a total travel distance of transfer passengers at an airport, a printed board arrangement and wiring problem, key arrangement of a keyboard, table seating arrangement, turbine blade weight balance, and the like. The “statistical properties” are different in each of those problems.
Note that the QAPLIB is a library that abstracts such problems and is created as a series having a different size for each problem, and is used as a benchmark for solution accuracy and a solution speed.
As compared with the case of the comparative example illustrated in
First, an initial value, an increase amount, and an upper limit of the adjustment coefficient are set (step S40). In the processing of step S40, for example, the setting may be performed based on the initial value, the increase amount, or the upper limit input by the user, or the setting may be performed based on the initial value, the increase amount, or the upper limit stored in advance in the storage unit 120.
Thereafter, the control unit 130 and the search unit 210 perform processing of steps S41 to S44 similar to the processing of steps S6 to S9 illustrated in
After the processing of the step S45, the control unit 130 determines whether or not the adjustment coefficient has reached the upper limit (step S46).
If it is determined that the adjustment coefficient has not reached the upper limit, the control unit 130 increases the adjustment coefficient by the increase amount (step S47), and repeats the process from step S41.
If it is determined that the adjustment coefficient has reached the upper limit, the control unit 130 determines an optimal adjustment coefficient based on the allocation matrix x, which is the search result obtained thus far (step S48), and terminates the process of determining the adjustment coefficient. For example, as illustrated in
The search range of an appropriate adjustment coefficient is limited to approximately 1×100, and moreover, it is sufficient if search increments (above-described increase amount of the adjustment coefficient) are set to approximately 0.05 to 0.1. Furthermore, the number of times of adjustment of the adjustment coefficient is determined by the initial value, increase amount, and upper limit set in the processing of step S40.
Thus, it may be expected that the number of times of adjustment of the adjustment coefficient is made significantly smaller than the number of times of updating the constraint coefficient a in the comparative example.
Moreover, once the adjustment coefficient is determined, the adjustment coefficient may be used at a time of executing the combinatorial optimization operation of the same type of instances, and thus the attempt to determine the adjustment coefficient only needs to be performed for the first time.
In the example of
As described above, at the time of obtaining an adjustment coefficient, the problem size of the problem to be calculated is reduced to search for an appropriate adjustment coefficient, and the actual problem is solved using the adjustment coefficient, whereby the adjustment coefficient search time may be significantly reduced.
At the time of obtaining an adjustment coefficient, it is preferable to use a model obtained by scaling down the problem to be optimized (e.g., the number of factories and the number of sites are reduced in the allocation of factories to sites). However, in a case where there is no statistically large bias in the flow matrix and the distance matrix even in an abstract problem, it is also possible to obtain the adjustment coefficient using a small-sized flow matrix and distance matrix configured by extracting appropriate rows and columns from each matrix.
Note that, while the method of setting the increase amount and the upper limit of the adjustment coefficient and increasing the adjustment coefficient until the adjustment coefficient reaches the upper limit is used in the example of determining the adjustment coefficient described above, a method of setting a decrease amount and a lower limit of the adjustment coefficient and decreasing the adjustment coefficient until the adjustment coefficient reaches the lower limit may be used.
Next, exemplary QUBO generation based on the determined constraint coefficient a will be described.
The constraint term may be expressed as the equation (8)
described above, for example. A quadratic polynomial by which a of the first term on the right side of the equation (8) is multiplied may be expressed as the following equation (14).
In the equation (14), the quadratic term on the right side may be expressed as the following expression (15).
In the expression (15), the quadratic term is obtained by taking the sum of the state variables in the range of non-diagonal elements of the upper triangle (i>j) of the outer product of the state variable vector (see
A quadratic polynomial by which β of the second term on the right side of the equation (8) is multiplied may be expressed as the following equation (17) for a similar reason described above.
As described below, xikxjk added to the quadratic term and the linear term in the equation (16) are expressed by matrices.
A product of two state variables in a region 40 is added to the quadratic term. In
In
As described below, xikxil added to the quadratic term and the linear term in the equation (17) are expressed by matrices.
A product of two state variables in a region 41 is added to the quadratic term. In
In
When the results of
The QUBO matrix of the entire energy function in
The control unit 130 outputs such a QUBO matrix as information regarding the energy function, and sets it in the search unit 210.
According to the information processing system 20 of the second embodiment as described above, effects similar to those of the information processing system 10 of the first embodiment may be obtained. That is, it becomes possible to suppress occurrence of a large number of times of adjustment to obtain an appropriate constraint coefficient, and to shorten a time needed to determine the constraint coefficient.
Processing of steps S50 and S51 is the same as the processing of steps S1 and S2 illustrated in
After the processing of step S51, a control unit 130 performs matrix modulation (step S52). In the processing of step S52, modulation of Qij generated by a Kronecker product of a flow matrix F and a distance matrix D is performed. An example of the modulation will be described later.
After the processing of step S52, the control unit 130 calculates Δ−C(x) based on Qij after the modulation in a similar manner to the processing of step S3 illustrated in
Processing of steps S54 to S57 is similar to the processing of steps S7 to S10 illustrated in
C′(x) in the following equation (18) is obtained by adding an optional constant C to the cost term C(x) expressed in the equation (2).
In calculation of a combinatorial optimization problem, an optimal solution (optimal allocation in a case of a QAP) does not change even if the same constant C is regularly added to or subtracted from a cost term, and a value of the cost term in the optimal solution is only different by the constant C added to or subtracted from the cost term. Moreover, not only the optimal solution but also all solutions and their ranks (magnitude relationship) are naturally maintained.
Meanwhile, in the QAP, a matrix Q represented by the Kronecker product of the flow matrix F and the distance matrix D includes n×n small matrices fijD as in the equation (9) described above. When any two allocation elements (factories, etc.) identified by i and j are considered, allocatable allocation destinations (sites, etc.) are all combinations of two allocation destinations. The small matrix fijD represents a cost term corresponding to all combinations of two allocation destinations for any two allocation elements identified by i and j.
In a constraint satisfaction solution of the QAP, any one element fijdkl other than a diagonal element is selected from each of the small matrices fijD. That is, only a value of any one element fijdkl other than a diagonal element is added to the cost term of the constraint satisfaction solution from each of the small matrices fijD.
Thus, in terms of the constraint satisfaction solution, even if the same constant is added to or subtracted from all fijdkl included in the solution, the same optimal solution is obtained in a similar manner to the case where the constant C is regularly added to or subtracted from the cost term in a simple manner as described above. Furthermore, all solutions and their ranks are also maintained.
Therefore, in the matrix Q, the same solution (allocation) and cost term order are obtained in terms of the constraint satisfaction solution regardless of whether or not any same constant is added to or subtracted from each element fijdkl for each of n (n−1)/2 small matrices fijD including the valid elements described above. Such an operation of adding or subtracting any same constant to or from each element fijdkl corresponds to the modulation of the matrix Q.
Meanwhile, a small matrix dklF obtained in a case where the Kronecker product is performed by switching the order of the flow matrix F and the distance matrix D represents a cost term corresponding to all combinations of two allocation elements for any two allocation destinations identified by k and l. However, since the Kronecker product is not commutative, elements of the small matrix dklF do not constitute a small matrix in the matrix Q as expressed in the equation (9), and are allocated to each small matrix fijD of the matrix Q one by one.
For example, an element f12d2n of d2nF is allocated to a small matrix f12D, and an element f1nd2n of d2nF is allocated to a small matrix f1nD.
In the constraint satisfaction solution of the QAP, in a similar manner to the case of the small matrix fijD, any one element fijdkl other than a diagonal element is selected from the small matrix dklF. That is, a value of any one element fijdkl other than a diagonal element of the small matrix dklF is added to the cost term of the constraint satisfaction solution. Thus, in terms of the constraint satisfaction solution, even if the same constant is added to or subtracted from all fijdki included in the solution, the same optimal solution is obtained in a similar manner to the case where the constant C is regularly added to or subtracted from the cost term in a simple manner as described above. Furthermore, all solutions and their ranks are also maintained. Note that, since the elements of the lower triangle of the small matrix fijD are the same as the elements of the upper triangle, they may not be used (they may be treated as invalid elements).
Therefore, in a similar manner to the small matrix fijD, the same solution (allocation) and cost term order are obtained in terms of the constraint satisfaction solution regardless of whether or not any same constant is added to or subtracted from each element fijdkl for each of n (n−1)/2 small matrices dklF including valid elements. Such an operation of adding or subtracting any same constant to or from each element fijdkl of dklF also corresponds to the modulation of the matrix Q.
By reducing the deviation of the value of fijdkl by those kinds of matrix modulation to smooth the solution space, it becomes possible to reduce the values of the constraint term to improve the solution search efficiency. This may be regarded as follows from a statistical point of view. The fijdkl originally having a different deviation for each instance sequence is subject to reduction of the deviation based on the matrix modulation, thereby being converted into another instance sequence having statistical properties similar to each other regardless of the original instance sequence and having the same solution as the original instance.
As described above, at the time of determining the constraint coefficient, the statistical properties may be made uniform by applying the matrix modulation to the instance to be calculated. As a result, it becomes possible to set the adjustment coefficient used to correct the statistical difference by the instance sequence used in the second embodiment to an almost constant value while reducing the sequence dependency of the instance.
The control unit 130 calculates an average value of the elements of the matrix Q obtained by the Kronecker product of the flow matrix F and the distance matrix D (step S60). Furthermore, the control unit 130 calculates an average value of the elements of each small matrix fijD (i<j) (step S61). Then, for each small matrix fijD (i<j), the control unit 130 subtracts (the average value of the elements of the small matrix fijD (i<j)—the average value of the elements of the matrix Q) from each element of the small matrix fijD (i<j) (step S62). Note that, in a case where (the average value of the elements of the small matrix fijD (i<j)—the average value of the elements of the matrix Q) is negative, the control unit 130 adds the difference value to each element of the small matrix fijD (i<j).
Next, the control unit 130 calculates an average value of the elements of each small matrix dklF (k<l) (step S63). Then, for each small matrix dklF (k<l), the control unit 130 subtracts (the average value of the elements of the small matrix dklF (k<l)—the average value of the elements of the matrix Q) from each element of the small matrix dklF (k<l) (step S64). Note that, in a case where (the average value of the elements of the small matrix dklF (k<l)—the average value of the elements of the matrix Q) is negative, the control unit 130 adds the difference value to each element of the small matrix dklF (k<l).
Finally, the control unit 130 subtracts the minimum value in the matrix Q from all the elements of Q (step S65).
In the operation of the matrix modulation described above, the processing of step S62 corresponds to adding or subtracting a value in which the average value of the elements of each small matrix fijD (i<j) matches the average value of the elements of the entire matrix Q to or from each element of each small matrix. Furthermore, the processing of step S64 corresponds to adding or subtracting a value in which the average value of the elements of each small matrix dklF (k<l) matches the average value of the elements of the entire matrix Q to or from each small matrix dklF (k<l).
Note that the order of the process described above is an example, and the order of the process may be appropriately changed.
In a case where the matrix Q having been subject to the matrix modulation by the method described above is used, it becomes possible to set the adjustment coefficient to approximately 1.0 for the five instances, as illustrated in
However, the adjustment coefficient becomes 1.0 in a case where the matrix modulation operation as illustrated in
Furthermore, it is naturally possible to adjust the constraint coefficient α before obtaining the solution without adjusting the constraint coefficient α or again after obtaining the solution to aim for a better value of the cost term.
Note that the information processing of the first embodiment may be implemented by the processing unit 12 being caused to execute a program. Furthermore, the information processing of the second and third embodiments may be implemented by the CPU 101 being caused to execute a program. The program may be recorded in the computer-readable recording medium 113.
For example, it becomes possible to distribute the program by distributing the recording medium 113 in which the program is recorded. Furthermore, the program may be stored in another computer, and the program may be distributed through a network. For example, a computer may store (install) the program, which is recorded in the recording medium 113 or received from another computer, in a storage device such as the RAM 102, the HDD 103, or the like, read the program from the storage device, and execute it.
While one aspect of the information processing system, information processing method, and program according to the embodiments has been described above based on the embodiments, these are merely examples, and are not limited to the descriptions above.
All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2023-080206 | May 2023 | JP | national |