This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2022-196481, filed on Dec. 8, 2022, the entire contents of which are incorporated herein by reference.
The embodiments discussed herein are related to a data processing device, a storage medium, and a data processing method.
There is a method of converting a combinatorial optimization problem into an Ising model that represents a spin behavior of a magnetic body at a time of searching for a solution to the combinatorial optimization problem. The Ising model is represented by an Ising-type evaluation function that evaluates the solution to the combinatorial optimization problem. The Ising-type evaluation function includes a plurality of state variables (representing a state of the Ising model) and a plurality of weight values. In the Ising-type evaluation function, the state variable is a binary variable that takes a value of 0 or 1 (or −1 or +1). The state variable may be referred to as a bit. Furthermore, a value of the Ising-type evaluation function may also be referred to as energy of the Ising model.
In the solution search, a Markov-Chain Monte Carlo (MCMC) method is used. Hereinafter, the solution search based on the MCMC method will be referred to as an MCMC search. In the MCMC search, for example, a state transition is accepted with an acceptance probability of the state transition specified by a Metropolis method or a Gibbs method. At this time, a state transition that increases energy is also stochastically permitted. Note that the acceptance probability decreases as an amount of increase in energy increases. Examples of the MCMC method include simulated annealing and a replica exchange method. In such an MCMC search, a state of the Ising model in which the value of the Ising-type evaluation function is minimized is searched for. The state where the minimum value of local minimum values of the evaluation function is reached is to be an optimum solution.
Meanwhile, some combinatorial optimization problems have constraint conditions to be satisfied by a solution, and a method of performing a search in consideration of the constraint condition has been proposed. Examples of the constraint condition include an inequality constraint, an equality constraint, an absolute value constraint, and the like. The evaluation function reflecting the constraint condition includes a constraint term having a value corresponding to presence or absence of constraint condition violation. The constraint term is weighted by a coefficient representing weight of the constraint condition.
Japanese Laid-open Patent Publication No. 2020-201598, Japanese Laid-open Patent Publication No. 2020-204928, U.S. Patent Application Publication No. 2021/0216897, and U.S. Patent Application Publication No. 2021/0271214 are disclosed as related art.
According to an aspect of the embodiments, a data processing device includes one or more memories; and one or more processors coupled to the one or more memories and the one or more processors configured to: store values of a plurality of state variables included in an Ising-type evaluation function that evaluates a solution to a combinatorial optimization problem, values of a plurality of auxiliary variables that represent whether there is violation of each of a plurality of constraint conditions of the combinatorial optimization problem, a total value of values of a plurality of constraint terms weighted by a coefficient that represents a weight of each of the plurality of constraint conditions and a value of the evaluation function, a first local field that represents a change amount of the total value when each of the values of the plurality of state variables changes, a second local field used to specify a constraint violation amount for each of the plurality of constraint conditions, and a value of the coefficient, repeat, at a time of searching for the solution, a search process that includes determining whether to permit a change in a value of a first state variable among the plurality of state variables based on the first local field, updating the value of the first state variable, the first local field, the second local field, and the total value when the change in the value of the first state variable is determined to be permitted, determining whether to permit a change in a value of a first auxiliary variable among the plurality of auxiliary variables based on the second local field, and updating the value of the first auxiliary variable and the first local field when the change in the value of the first auxiliary variable is determined to be permitted, and adjust the value of the coefficient based on one selected from the total value and whether there is the violation.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
In a case where the coefficient value representing the weight of the constraint condition is not appropriate, search efficiency may deteriorate in the MCMC search. For example, when the coefficient value representing the weight of the constraint condition is small, the amount of increase in energy at a time of transitioning to a state not satisfying the constraint condition (hereinafter referred to as a constraint violation solution) decreases. In this case, the constraint violation solution is likely to occur, and the search efficiency deteriorates. On the other hand, when the coefficient value described above is large, the state transition is less likely to occur, and the search efficiency deteriorates.
In one aspect, an object of the embodiments is to provide a data processing device, a program, and a data processing method capable of improving efficiency in searching for a solution to a combinatorial optimization problem.
In one aspect, the embodiments may improve the efficiency in searching for a solution to a combinatorial optimization problem.
Hereinafter, modes for carrying out the embodiments will be described with reference to the drawings.
A data processing device 10 according to the first embodiment includes a storage unit 11 and a processing unit 12.
The storage unit 11 is, for example, a volatile storage device (e.g., electronic circuit such as dynamic random access memory (DRAM)), or a non-volatile storage device (e.g., electronic circuit such as flash memory, hard disk drive (HDD), etc.). The storage unit 11 may include an electronic circuit such as a register.
The storage unit 11 stores values of a plurality (hereinafter referred to as N) of state variables included in an Ising-type evaluation function. Note that the state variables may also be called decision variables. The Ising-type evaluation function (E(x)) is defined by, for example, a function in a quadratic form such as the following equation (1).
A first term on a right side is obtained by summing products of values (0 or 1) of two state variables and a weight value (representing strength of correlation between the two state variables) for all combinations of the N state variables of the Ising model with neither an omission nor an overlap. A state variable with an identification number i is represented by xi, a state variable with an identification number j is represented by xj, and a weight value indicating magnitude of correlation between the state variables with the identification numbers i and j is represented by Wij. A second term on the right side is obtained by summing up products of a bias coefficient and a state variable for each identification number. A bias coefficient for the identification number=i is represented by bi.
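Equation (1) itself is not reproduced in this text. A form consistent with the two terms described above is sketched below. The negative signs are an assumption, chosen to agree with the relation ΔH=−hiΔxi used later in this description.

```latex
E(x) = -\sum_{i<j} W_{ij}\, x_i x_j \;-\; \sum_{i=1}^{N} b_i x_i \tag{1}
```

The sum over i<j visits each pair of state variables exactly once, which corresponds to "neither an omission nor an overlap".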
Furthermore, the storage unit 11 stores values of a plurality of auxiliary variables (xk) indicating presence or absence of violation of each of a plurality (hereinafter referred to as M) of constraint conditions. In the following description, description will be made assuming that xk has a value of 1 in the case of violating a constraint condition with the identification number=k and has a value of 0 in the case of satisfying the constraint condition, but the present disclosure is not limited to this. A spin variable having a value of −1 or +1 may also be used as xk.
Furthermore, the storage unit 11 stores a total value (hereinafter referred to as total energy) of the value of the evaluation function described above and values of a plurality of constraint terms. The total energy (H(x)) may be expressed by, for example, the following equation (2).
In the equation (2), a second term on a right side represents the overall magnitude (energy) of the plurality of constraint terms. An identification number of a constraint condition (or constraint term) is represented by k. Furthermore, λk is a predetermined positive coefficient representing a weight of the constraint condition with the identification number k. Each constraint term is weighted by λk. A penalty function, which differs depending on a type of the constraint condition, is represented by g(hk). It may be said that λkg(hk) is one constraint term. A value used to specify the constraint violation amount for the constraint condition with the identification number k is represented by hk.
In a case where the constraint condition is an inequality constraint, g(hk) in the equation (2) may be expressed by the following equation (3).
In the equation (3), max[0, hk] is a function that outputs the larger value of 0 and hk. Furthermore, Rk represents a consumption amount (also called resource amount) of the constraint term with the identification number k, and Uk represents an upper limit of the resource amount. D represents a set of the identification numbers of the state variables. Wki is a coefficient (weight value) representing a weight of xi in the inequality constraint with the identification number k.
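Equations (2) and (3) are not reproduced in this text. Forms consistent with the symbols defined above are sketched below. The expression of Rk as a weighted sum of the state variables is an assumption based on the definitions of D and Wki.

```latex
H(x) = E(x) + \sum_{k=1}^{M} \lambda_k\, g(h_k) \tag{2}

g(h_k) = \max[0,\, h_k], \qquad h_k = R_k - U_k, \qquad R_k = \sum_{i \in D} W_{ki}\, x_i \tag{3}
```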
H(x) in the case where the constraint condition is the inequality constraint may be expressed by the following equation (4) using the auxiliary variable (xk).
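Equation (4) is not reproduced in this text. The following Python sketch assumes that equation (4) replaces max[0, hk] with the product hkxk, so that H(x)=E(x)+Σkλkhkxk, and checks on a toy instance that the two forms agree when xk is 1 exactly for the violated constraints. The function names and all numerical values are hypothetical.

```python
def constraint_fields(Wc, U, x):
    """h_k = sum_i W_ki * x_i - U_k (resource amount minus its upper limit)."""
    return [sum(w * xi for w, xi in zip(row, x)) - u for row, u in zip(Wc, U)]

def ising_energy(W, b, x):
    """E(x) = -sum_{i<j} W_ij x_i x_j - sum_i b_i x_i (assumed sign convention)."""
    n = len(x)
    pairs = sum(W[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n))
    return -pairs - sum(bi * xi for bi, xi in zip(b, x))

def total_energy_penalty(W, b, Wc, U, lam, x):
    """H(x) with each constraint term written as lam_k * max(0, h_k)."""
    h = constraint_fields(Wc, U, x)
    return ising_energy(W, b, x) + sum(l * max(0, hk) for l, hk in zip(lam, h))

def total_energy_aux(W, b, Wc, U, lam, x, xk):
    """H(x) with each constraint term written as lam_k * h_k * x_k, x_k in {0, 1}."""
    h = constraint_fields(Wc, U, x)
    return ising_energy(W, b, x) + sum(l * hk * a for l, hk, a in zip(lam, h, xk))

# Toy instance (illustrative values only): N = 3 state variables, M = 2 constraints.
W = [[0, 1, -2], [1, 0, 3], [-2, 3, 0]]   # symmetric, zero diagonal
b = [1, -1, 0.5]
Wc = [[2, 3, 4], [1, 1, 1]]               # W_ki
U = [5, 3]
lam = [10, 4]
x = [1, 1, 1]

h = constraint_fields(Wc, U, x)
xk = [1 if hk > 0 else 0 for hk in h]     # auxiliary variable flags a violation
print(h, xk)
print(total_energy_penalty(W, b, Wc, U, lam, x),
      total_energy_aux(W, b, Wc, U, lam, x, xk))
```

In this toy instance only the first constraint is violated, so only its term contributes to either form of H(x).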
Moreover, the storage unit 11 stores a first local field (hi) representing a change amount of H(x) when each of the values of the N pieces of xi changes, and also stores hk described above. As described above, hk is a value used to specify the constraint violation amount for the constraint condition with the identification number k, and is also a value proportional to the change amount of H(x) when the value of xk changes. In
Furthermore, the storage unit 11 stores λk included in the equation (2) or (4) mentioned above. The storage unit 11 may further store a weight value between each of the N pieces of xi, a weight value between any one of the N pieces of xi and each of the M pieces of xk, and an increase amount Δλk+ and a decrease amount Δλk− at the time of adjusting λk. Each of Δλk+ and Δλk− may be set to a different value for each constraint condition. Furthermore, the storage unit 11 may store the bias coefficient (bi) in the equation (1), and Uk in the equation (3) or (4). Furthermore, the storage unit 11 may store various types of data such as calculation conditions when the processing unit 12 executes the data processing method to be described later. Furthermore, in a case where the processing unit 12 executes a part or all of processing of the data processing method to be described later by software, the storage unit 11 stores a program for executing the processing.
Strength of the correlation between the N state variables may be represented by Wij, which are N×N first weight values. For example, strength of the correlation between x1 and xi is W1i, and strength of the correlation between xi and xN is WiN. On the other hand, the correlation between the state variable and the auxiliary variable differs between the correlation based on the influence on the auxiliary variable exerted by a change in the state variable value and the correlation based on the influence on the state variable exerted by a change in the auxiliary variable. For example, as illustrated in
The processing unit 12 in
The processing unit 12 searches for, for example, a state where H(x) expressed by the equation (4) is minimized. The state where the minimum value of local minimum values of H(x) is reached is to be an optimum solution. Note that the processing unit 12 may also search for a state where the value of H(x) is maximized (in this case, the state where the maximum value is reached is to be the optimum solution) by changing the signs of the individual terms on the right side of the equation (4).
Note that, here, it is assumed that values based on initial values of x1 to xN are stored in the storage unit 11 as H(x), hi, hk, and xk.
The processing unit 12 performs the following process of steps S1 to S4 at a time of searching for a solution based on an MCMC method. First, the processing unit 12 carries out a search process based on the MCMC method (MCMC search) (step S1). The search process includes processing of flip determination of a state variable (step S1a), update of xi, hi, hk, and H(x) (step S1b), flip determination of an auxiliary variable (step S1c), and update of xk, hi, and H(x) (step S1d).
The processing of step S1a is performed as follows, for example.
The processing unit 12 determines whether or not to permit a change in a value of a first state variable (hereinafter referred to as a flip candidate state variable) of the N pieces of xi based on hi. For example, the processing unit 12 selects the flip candidate state variable at random or in a predetermined order. Here, hi may be expressed by the following equation (5).
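Equation (5) is not reproduced in this text. A form consistent with the relation ΔH=−hiΔxi and with the update amounts Δhi=WijΔxj and Δhi=−λkWkiΔxk given in this description is sketched below. It is a reconstruction under the assumption that H(x) contains the constraint terms in the form λkhkxk.

```latex
h_i = \sum_{j \in D} W_{ij}\, x_j + b_i - \sum_{k=1}^{M} \lambda_k W_{ki}\, x_k \tag{5}
```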
The processing unit 12 calculates a change amount (ΔH) of H(x) in the case where the value of the flip candidate state variable changes. For example, ΔH in the case where the value of xi changes may be calculated by the equation ΔH=−hiΔxi based on hi expressed by the equation (5).
Next, the processing unit 12 determines whether or not to permit a change in the value of the flip candidate state variable (whether or not flip is permissible) based on a result of comparison between ΔH and a predetermined value. The predetermined value is, for example, a noise value obtained based on a random number and a value of a temperature parameter. For example, log(rand)×T, which is an example of a noise value obtained based on a uniform random number (rand) equal to or greater than 0 and equal to or smaller than 1 and a temperature parameter (T), may be used as the predetermined value. In this case, in a case of −ΔH≥log(rand)×T, the processing unit 12 determines that the change in the value of the flip candidate state variable is permitted (flip is permissible).
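The flip determination of step S1a may be sketched in Python as follows. The function names are hypothetical. Drawing rand from (0, 1] instead of [0, 1] is an assumption made here so that log(rand) is always defined. For T>0, the condition −ΔH≥log(rand)×T is equivalent to rand≤exp(−ΔH/T), that is, acceptance with the Metropolis probability min(1, exp(−ΔH/T)).

```python
import math
import random

def delta_h_state(h_i, x_i):
    """dH = -h_i * dx_i for flipping a 0/1 state variable."""
    dx = 1 - 2 * x_i               # +1 for 0 -> 1, -1 for 1 -> 0
    return -h_i * dx

def flip_allowed(delta_h, temperature, rng=random):
    """Return True when -dH >= log(rand) * T."""
    rand = 1.0 - rng.random()      # uniform in (0, 1], so log() never sees 0
    return -delta_h >= math.log(rand) * temperature

# A transition that lowers the energy is always accepted, because
# log(rand) * T is never positive.
print(flip_allowed(delta_h_state(2.0, 0), temperature=1.0))
```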
The processing of step S1b is performed as follows, for example.
When it is determined that the flip is permissible, the processing unit 12 updates hi, hk, H(x), and xi (state variables for which the flip is determined to be permissible). Note that the processing unit 12 does not update hi, hk, H(x), and xi unless it is determined that the flip is permissible. The processing unit 12 updates H(x) by adding ΔH to the original H(x). Furthermore, for example, when it is determined that the flip is permissible for xj, the processing unit 12 updates hi by adding Δhi=WijΔxj to the original hi for each of the N state variables. Moreover, when it is determined that the flip is permissible for xj, the processing unit 12 updates hk by adding Δhk=WkjΔxj to the original hk for each of the M auxiliary variables. In a case where violation of the constraint condition of the identification number=k occurs when the value of xj is changed, hk becomes a positive value by this update, and a change in xk from 0 to 1 is permitted by the processing of step S1c to be described later.
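The update of step S1b may be sketched in Python as follows. The function name and the toy values are hypothetical. The initial hi, hk, and H(x) were computed by hand for x=(0, 1, 0) with all auxiliary variables equal to 0, assuming bi=(1, −1, 2) and Uk=(5, 3).

```python
def apply_state_flip(j, x, h_i, h_k, H, W, Wc):
    """Flip x[j] and update x, h_i, h_k in place; return the updated H."""
    dx = 1 - 2 * x[j]                 # +1 for 0 -> 1, -1 for 1 -> 0
    H += -h_i[j] * dx                 # dH = -h_j * dx_j, using h_j before the update
    x[j] += dx
    for i in range(len(x)):
        h_i[i] += W[i][j] * dx        # dh_i = W_ij * dx_j (W_jj is assumed to be 0)
    for k in range(len(h_k)):
        h_k[k] += Wc[k][j] * dx       # dh_k = W_kj * dx_j
    return H

W = [[0, 1, -2], [1, 0, 3], [-2, 3, 0]]   # W_ij, symmetric with zero diagonal
Wc = [[2, 3, 4], [1, 1, 1]]               # W_ki
x = [0, 1, 0]
h_i = [2, -1, 5]
h_k = [-2, -2]
H = 1

H = apply_state_flip(0, x, h_i, h_k, H, W, Wc)
print(x, h_i, h_k, H)
```

Only one column of Wij and one column of Wki are read per flip, so the cost of an update is proportional to N+M and no full recalculation of H(x) is needed.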
The processing of step S1c is performed as follows, for example.
The processing unit 12 determines whether or not to permit a change in a value of a first auxiliary variable (hereinafter referred to as a flip candidate auxiliary variable) of the M pieces of xk based on hk. For example, the processing unit 12 selects the flip candidate auxiliary variable at random or in a predetermined order. Here, hk may be expressed by the following equation (6).
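Equation (6) is not reproduced in this text. A form consistent with the definition of hk as the value that specifies the constraint violation amount, and with the update amount Δhk=WkjΔxj, is sketched below as a reconstruction.

```latex
h_k = \sum_{i \in D} W_{ki}\, x_i - U_k \tag{6}
```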
The processing unit 12 calculates ΔH in the case where the value of the flip candidate auxiliary variable changes. For example, ΔH in the case where the value of xk changes may be calculated by the equation ΔH=+λkhkΔxk using hk expressed by the equation (6). Because ΔH in the processing of step S1a described above is calculated without changing the value of the auxiliary variable, that ΔH may contain an error depending on whether or not the value of the auxiliary variable changes. The error may be corrected by ΔH=+λkhkΔxk obtained by the processing of step S1c.
Next, the processing unit 12 determines whether or not to permit a change in the value of the flip candidate auxiliary variable (whether or not flip is permissible) based on a result of comparison between ΔH and a predetermined value. The predetermined value may be the same as the value used in the processing of step S1a, or may be a fixed value (e.g., 0). In a case of using log(rand)×T as the predetermined value, the processing unit 12 determines that the flip is permissible for the flip candidate auxiliary variable when ΔH>log(rand)×T. In a case where constraint violation is caused by the change in the value of the state variable according to the processing of step S1b, hk in the equation (6) becomes a positive value, and a change amount Δxk=1 when xk changes from 0 to 1, and thus ΔH is a positive value. Furthermore, log(rand)×T is a negative value. Thus, xk is permitted to change from 0 to 1 by using the determination expression ΔH>log(rand)×T.
The processing of step S1d is performed as follows, for example.
When it is determined that the flip is permissible for the flip candidate xk, the processing unit 12 updates hi, H(x), and xk (auxiliary variables for which the flip is determined to be permissible). Note that the processing unit 12 does not update hi, H(x), and xk unless it is determined that the flip is permissible.
The processing unit 12 updates H(x) by adding ΔH to the original H(x). Furthermore, for example, when it is determined that the flip is permissible for xk, the processing unit 12 updates hi by adding Δhi=−λkWkiΔxk to the original hi for each of the N state variables.
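Steps S1c and S1d may be sketched together in Python as follows. The function name and the toy values are hypothetical. The threshold argument stands for the predetermined value: the caller may pass log(rand)×T or the fixed value 0, and the fixed value 0 is used here so that the result is deterministic. The initial hi and H(x) were computed by hand for x=(1, 1, 1) with all auxiliary variables equal to 0, assuming bi=(1, −1, 2) and Uk=(5, 3).

```python
def try_aux_flip(k, xk, h_i, h_k, H, lam, Wc, threshold=0.0):
    """Flip xk[k] when dH > threshold; update xk and h_i in place.

    Returns the (possibly updated) H and whether the flip was made.
    """
    dxk = 1 - 2 * xk[k]                        # +1 for 0 -> 1, -1 for 1 -> 0
    dH = lam[k] * h_k[k] * dxk                 # dH = +lam_k * h_k * dx_k
    if not dH > threshold:
        return H, False
    xk[k] += dxk
    for i in range(len(h_i)):
        h_i[i] -= lam[k] * Wc[k][i] * dxk      # dh_i = -lam_k * W_ki * dx_k
    return H + dH, True

Wc = [[2, 3, 4], [1, 1, 1]]   # W_ki
lam = [10, 4]
xk = [0, 0]
h_k = [4, 0]                  # constraint 0 is violated, constraint 1 is not
h_i = [0, 3, 3]
H = -4

H, flipped_0 = try_aux_flip(0, xk, h_i, h_k, H, lam, Wc)
H, flipped_1 = try_aux_flip(1, xk, h_i, h_k, H, lam, Wc)
print(xk, h_i, H, flipped_0, flipped_1)
```

With the threshold fixed at 0, xk changes to 1 only where hk is positive and back to 0 only where hk is negative, so the auxiliary variables follow the presence or absence of violation.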
Note that the processing unit 12 may repeat the processing of steps S1a and S1b a predetermined number of times, and then perform the processing of steps S1c and S1d. Furthermore, the processing unit 12 may not perform the processing of steps S1c and S1d until it is determined that the flip is permissible for the flip candidate state variable in the processing of step S1a and the update is carried out in the processing of step S1b.
After performing the search process as described above, the processing unit 12 determines whether or not a λk adjustment period is reached (step S2). The processing unit 12 determines that the adjustment period is reached each time the search process described above is performed a predetermined number of times. If it is determined that the adjustment period is reached, the processing unit 12 performs λk adjustment processing (step S3), and repeats the process from step S1 if it is determined that the adjustment period is not reached.
In the processing of step S3, the processing unit 12 adjusts λk based on H(x) or the presence or absence of constraint condition violation. For example, if H(x) at the time of λk adjustment is equal to or greater than Hbest, the processing unit 12 decreases the value of λk (λk in all or a designated range) of each of the plurality of constraint conditions. Hbest is the minimum value of H(x) in the state where no constraint condition is violated, which is obtained before the adjustment described above. If H(x) at the λk adjustment timing is smaller than Hbest and there is a constraint condition in which constraint condition violation occurs, the processing unit 12 increases the value of λk of the constraint condition. The λk adjustment is carried out by, for example, adding Δλk+ to λk or subtracting Δλk− from λk. For example, a value 0.1 times the original λk or the like is appropriately set as the value of Δλk+ or Δλk−.
Note that the values of Δλk+ and Δλk− and the initial value of λk may be changed during the MCMC search. Furthermore, a method of the λk adjustment is not limited to the method described above, and the processing unit 12 may multiply λk by a predetermined value (e.g., 1.1 or 0.9) to make adjustment.
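The additive λk adjustment of step S3 may be sketched in Python as follows. The function name and the toy values are hypothetical. How λk is kept positive after repeated subtraction is not specified in this description, so no lower bound is applied in the sketch.

```python
def adjust_lambda(lam, H, H_best, violated, d_up, d_down):
    """Adjust lam in place and return the signed change applied to each lam_k."""
    delta = [0.0] * len(lam)
    if H >= H_best:
        # No improvement over the best feasible energy: relax every constraint weight.
        for k in range(len(lam)):
            delta[k] = -d_down[k]
    else:
        # Improved but infeasible: strengthen only the violated constraints.
        for k in range(len(lam)):
            if violated[k]:
                delta[k] = d_up[k]
    for k in range(len(lam)):
        lam[k] += delta[k]
    return delta

lam = [10.0, 4.0]
d_up = [1.0, 0.5]
d_down = [1.0, 0.5]

first = adjust_lambda(lam, H=5.0, H_best=3.0, violated=[True, False],
                      d_up=d_up, d_down=d_down)
after_first = list(lam)
second = adjust_lambda(lam, H=2.0, H_best=3.0, violated=[True, False],
                       d_up=d_up, d_down=d_down)
print(first, after_first, second, lam)
```

Returning the signed change makes it possible to pass the same values to the subsequent correction of hi and H(x).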
When the λk adjustment is carried out, the processing unit 12 corrects hi and H(x) (step S4). The correction of hi may be carried out based on the following equation (7).
The correction of H(x) may be carried out based on the following equation (8).
In the equations (7) and (8), Δλk represents an adjustment amount of λk, and is Δλk+ or Δλk− described above.
Note that the correction of hi and H(x) may be carried out based on the following equations (9) and (10) without using xk.
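Equations (7) to (10) are not reproduced in this text. The following Python sketch assumes corrections that follow from H(x)=E(x)+Σkλkhkxk: when λk changes by a signed amount Δλk, H(x) changes by Δλkhkxk and hi changes by −ΔλkWkixk. For the variant without xk, the sketch assumes that xk is replaced by the condition hk>0, so that hkxk becomes max[0, hk]. The function names and the toy values are hypothetical.

```python
def correct_with_aux(h_i, H, d_lam, h_k, xk, Wc):
    """Correct h_i in place and return the corrected H, using x_k."""
    for k, dl in enumerate(d_lam):
        H += dl * h_k[k] * xk[k]
        for i in range(len(h_i)):
            h_i[i] -= dl * Wc[k][i] * xk[k]
    return H

def correct_without_aux(h_i, H, d_lam, h_k, Wc):
    """Same correction, with x_k replaced by the test h_k > 0."""
    for k, dl in enumerate(d_lam):
        if h_k[k] > 0:
            H += dl * h_k[k]              # dl * max(0, h_k)
            for i in range(len(h_i)):
                h_i[i] -= dl * Wc[k][i]
    return H

Wc = [[2, 3, 4], [1, 1, 1]]   # W_ki
h_k = [4, 0]
xk = [1, 0]
d_lam = [1, -0.5]             # signed changes of lam_0 and lam_1

h_a = [-20, -27, -37]
h_b = list(h_a)
H_a = correct_with_aux(h_a, 36, d_lam, h_k, xk, Wc)
H_b = correct_without_aux(h_b, 36, d_lam, h_k, Wc)
print(h_a, H_a, h_b, H_b)
```

The two variants give the same result whenever each xk agrees with the sign of hk, which is the situation after the auxiliary variables have been updated.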
After the processing of step S4, the process from step S1 is repeated.
Note that, while the example of selecting the flip candidate state variable one by one from among the N state variables and performing the processing of steps S1a and S1b has been described in the descriptions above, the processing of steps S1a and S1b may be performed in parallel for a plurality of (e.g., all the N pieces of) state variables. In that case, when there is a plurality of state variables whose values are permitted to change, the processing unit 12 selects a state variable whose value is to be changed at random or according to a predetermined rule.
Likewise, while the example of selecting the flip candidate auxiliary variable one by one from among the M auxiliary variables and performing the processing of steps S1c and S1d has been described in the descriptions above, the processing of steps S1c and S1d may be performed in parallel for a plurality of (e.g., all the M pieces of) auxiliary variables. In that case, when there is a plurality of auxiliary variables whose values are permitted to change, the processing unit 12 selects an auxiliary variable whose value is to be changed at random or according to a predetermined rule.
In a case of performing simulated annealing, for example, the processing unit 12 decreases the value of the temperature parameter (T) described above according to a predetermined temperature parameter change schedule each time the flip determination for a state variable is repeated a predetermined number of times. Then, the processing unit 12 outputs a state obtained when the flip determination is repeated the predetermined number of times as a calculation result of a combinatorial optimization problem (e.g., displays it on a display device (not illustrated)). Note that the processing unit 12 may cause the storage unit 11 to retain Hbest and the state when Hbest is obtained. In that case, the processing unit 12 may output, as a calculation result, the state corresponding to Hbest stored after the flip determination is repeated the predetermined number of times.
In a case where the processing unit 12 performs a replica exchange method, the processing unit 12 repeats the process of steps S1 to S4 described above for each of a plurality of replicas to which each different T value is set. Then, the processing unit 12 carries out replica exchange each time the flip determination for a state variable is repeated a predetermined number of times. For example, the processing unit 12 selects two replicas having adjacent T values, and exchanges the values of the respective state variables and the values of the respective auxiliary variables between the selected two replicas at a predetermined exchange probability based on an energy difference or a T value difference between the replicas. Note that the T values may be exchanged between the two replicas instead of the values of the respective state variables and the values of the respective auxiliary variables. Alternatively, the processing unit 12 causes the storage unit 11 to retain Hbest and the state when Hbest is obtained. Then, the processing unit 12 outputs, as a calculation result, the state corresponding to the smallest Hbest in all the replicas among the pieces of Hbest stored after the flip determination described above is repeated the predetermined number of times in the individual replicas.
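This description does not give the expression of the exchange probability. The following Python sketch uses the criterion commonly used in replica exchange, min(1, exp((1/Ta−1/Tb)(Ha−Hb))), as an assumption. The function name is hypothetical.

```python
import math

def exchange_probability(H_a, T_a, H_b, T_b):
    """Commonly used replica-exchange acceptance probability (assumed here)."""
    exponent = (1.0 / T_a - 1.0 / T_b) * (H_a - H_b)
    return 1.0 if exponent >= 0 else math.exp(exponent)

# The colder replica (T = 1) holds the higher energy: the exchange is always made.
print(exchange_probability(5.0, 1.0, 3.0, 2.0))
# The colder replica already holds the lower energy: the exchange is made only sometimes.
print(exchange_probability(3.0, 1.0, 5.0, 2.0))
```

Under this criterion, states with low energy tend to move toward the replicas with small T values, while the replicas with large T values continue to search widely.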
According to the data processing device 10 and the data processing method as described above, the processing unit 12 adjusts λk based on H(x) or the presence or absence of constraint condition violation each time the search process is performed the predetermined number of times. As a result, λk may be appropriately adjusted by reflecting the solution search condition, and accordingly, the efficiency in searching for a solution to the combinatorial optimization problem may improve.
For example, in a case where the value of H(x) is not improved (in a case where the value is not decreased) even when the search process is repeated the predetermined number of times, it is conceivable that the increase in energy when constraint violation occurs is too large so that the state transition is blocked to lower the search efficiency. In such a case, the state transition is promoted to improve the search efficiency when the value of λk is decreased. In a case where violation of a certain constraint condition occurs while the value of H(x) is improved, it becomes possible to suppress the occurrence of the violation of the constraint condition by increasing the value of λk related to the constraint condition. That is, it becomes possible to lower the possibility of the occurrence of the constraint violation solution, and to suppress the deterioration of the search efficiency.
Furthermore, when λk is adjusted, hi and H(x) are corrected based on the adjustment amount of λk, whereby the occurrence of a calculation error caused by changing of λk may be suppressed.
A data processing device 20 is, for example, a computer, and includes a processor 21, a RAM 22, an HDD 23, a GPU 24, an input interface 25, a medium reader 26, and a communication interface 27. The units described above are coupled to a bus.
The processor 21 is a processor such as a GPU, a CPU, or the like including an arithmetic circuit that executes a program command. The processor 21 loads at least a part of a program and data stored in the HDD 23 into the RAM 22, and executes the program. Note that the processor 21 may include a plurality of processor cores. Furthermore, the data processing device 20 may include a plurality of processors. Note that a set of the plurality of processors (multiprocessor) may be called a “processor”.
The RAM 22 is a volatile semiconductor memory that temporarily stores the program to be executed by the processor 21 and data to be used by the processor 21 for arithmetic operations. Note that the data processing device 20 may include a memory of a type different from the RAM 22, or may include a plurality of memories.
The HDD 23 is a non-volatile storage device that stores programs of software such as an operating system (OS), middleware, application software, and the like, and data. The programs include, for example, a program for causing the data processing device 20 to perform a process of searching for a solution to a combinatorial optimization problem. Note that the data processing device 20 may include another type of the storage device such as a flash memory, a solid state drive (SSD), or the like, or may include a plurality of non-volatile storage devices.
The GPU 24 outputs an image to a display 24a coupled to the data processing device 20 in accordance with a command from the processor 21. As the display 24a, a cathode ray tube (CRT) display, a liquid crystal display (LCD), a plasma display panel (PDP), an organic electro-luminescence (OEL) display, or the like may be used.
The input interface 25 obtains input signals from an input device 25a coupled to the data processing device 20, and outputs them to the processor 21. As the input device 25a, a pointing device such as a mouse, a touch panel, a touch pad, or a trackball, a keyboard, a remote controller, a button switch, or the like may be used. Furthermore, a plurality of types of input devices may be coupled to the data processing device 20.
The medium reader 26 is a reading device that reads a program and data recorded on a recording medium 26a. As the recording medium 26a, for example, a magnetic disk, an optical disk, a magneto-optical disk (MO), a semiconductor memory, or the like may be used. Examples of the magnetic disk include a flexible disk (FD) and an HDD. Examples of the optical disk include a compact disc (CD) and a digital versatile disc (DVD).
The medium reader 26 copies, for example, a program or data read from the recording medium 26a to another recording medium such as the RAM 22, the HDD 23, or the like. The read program is executed by, for example, the processor 21. Note that the recording medium 26a may be a portable recording medium, and may be used for distribution of the program or data. Furthermore, the recording medium 26a and the HDD 23 may be referred to as computer-readable recording media.
The communication interface 27 is an interface that is coupled to a network 27a and communicates with another information processing device via the network 27a. The communication interface 27 may be a wired communication interface coupled to a communication device such as a switch by a cable, or may be a wireless communication interface coupled to a base station by a wireless link.
Next, functions and processing procedures of the data processing device 20 will be described.
The data processing device 20 includes an input unit 31, a control unit 32, a search unit 33, and an output unit 34. With those units, processing similar to the processing performed by the processing unit 12 illustrated in
The input unit 31, the control unit 32, the search unit 33, and the output unit 34 may be implemented using, for example, a program module to be executed by the processor 21 or a storage area (register or cache memory) in the processor 21. Note that the search unit 33 may be further implemented by using a storage area secured in the RAM 22 or the HDD 23.
The input unit 31 receives, for example, input of initial values of N state variables, initial values of M auxiliary variables, problem information, and calculation conditions. The problem information includes, for example, Wki, and Uk in the equation (4) or the like in addition to Wij and bi in the equation (1). The calculation conditions include, for example, the number of replicas, a replica exchange cycle, and a value of a temperature parameter set for each replica in a case of executing a replica exchange method, a temperature parameter change schedule in a case of performing simulated annealing, calculation end conditions, and the like. Moreover, the calculation conditions include a parameter for adjusting λk.
Examples of the parameter for adjusting λk include an initial value of λk (λkinit), an increase amount (Δλk+) and a decrease amount (Δλk−) of λk, a variable (T1) indicating an interval for setting λkinit, Δλk+, and Δλk−, and a variable (T0) indicating an adjustment interval of λk. Note that T1>T0 is satisfied.
Those pieces of information may be input by an operation of the input device 25a made by a user, or may be input via the recording medium 26a or the network 27a.
The control unit 32 controls each unit of the data processing device 20 to execute processing to be described later.
The search unit 33 repeats an MCMC search under the control of the control unit 32, thereby searching for a state where a value (energy) of an evaluation function is minimized.
The output unit 34 outputs a search result (calculation result) by the search unit 33.
For example, the output unit 34 may output the calculation result to the display 24a to be displayed, transmit the calculation result to another information processing device via the network 27a, or store the calculation result in an external storage device.
The search unit 33 includes a variable setting unit 33a, a state variable holding unit 33b, an auxiliary variable holding unit 33c, a weight value holding unit 33d, and a λk adjustment unit 33e. Moreover, the search unit 33 includes an hi calculation unit 33f, an hk calculation unit 33g, ΔH calculation units 33h and 33i, transition propriety determination units 33j and 33k, a selection unit 33l, an update unit 33m, and an energy calculation unit 33n.
The variable setting unit 33a retains various variables (λk, parameters for adjusting λk described above, etc.) received by the input unit 31, and sets them in the individual units. The variable setting unit 33a may retain H(x), Hbest, and the like to be used to determine whether to increase or decrease λk.
The state variable holding unit 33b retains N state variables (xi). Furthermore, the state variable holding unit 33b outputs a change amount (Δxi) of xi of a flip candidate.
The auxiliary variable holding unit 33c retains M auxiliary variables.
The weight value holding unit 33d retains weight values (Wij) between the N state variables and weight values (Wki) between each of the N state variables and the M auxiliary variables. Wij may be represented by a matrix of N rows and N columns, and Wki may be represented by a matrix of M rows and N columns. Note that the weight values between the M auxiliary variables and any state variable, among the N state variables, that does not affect any of the M auxiliary variables need not be retained.
The λk adjustment unit 33e adjusts the value of λk based on the value of H(x) or xk (presence or absence of constraint condition violation) each time the search process (MCMC process) is performed T0 times. The λk adjustment unit 33e supplies the adjusted λk to the variable setting unit 33a, and supplies the adjustment amount (Δλk) to the hi calculation unit 33f and the update unit 33m.
The hi calculation unit 33f retains N pieces of hi, and updates hi according to changes in values of the state variables and the auxiliary variables. Furthermore, when λk is adjusted, the hi calculation unit 33f corrects hi according to the equation (7), for example.
The hk calculation unit 33g retains M pieces of hk and updates hk according to changes in values of the state variables.
The ΔH calculation unit 33h calculates ΔH=−hiΔxi based on hi for xi of a flip candidate.
The ΔH calculation unit 33i calculates ΔH=+λkhkΔxk based on hk for xk of a flip candidate.
The transition propriety determination unit 33j performs flip determination processing to determine whether or not to permit a change in the value of the flip candidate state variable based on a result of comparison between ΔH output by the ΔH calculation unit 33h and a predetermined value. The predetermined value is, for example, a noise value obtained based on a random number and a value of a temperature parameter. For example, in a case of −ΔH≥log(rand)×T, the transition propriety determination unit 33j determines that the change in the value of the flip candidate state variable is permissible.
The transition propriety determination unit 33k performs flip determination processing to determine whether or not to permit a change in the value of the flip candidate auxiliary variable based on a result of comparison between ΔH output by the ΔH calculation unit 33i and a predetermined value. The predetermined value may be the same as the value used by the transition propriety determination unit 33j, or may be a fixed value (e.g., 0). For example, in a case of −ΔH≥log(rand)×T, the transition propriety determination unit 33k determines that the change in the value of the flip candidate auxiliary variable is permissible.
The selection unit 33l selects a determination result of the transition propriety determination unit 33j in the case of performing the flip determination for a state variable, and selects a determination result of the transition propriety determination unit 33k in the case of performing the flip determination for an auxiliary variable, and outputs the determination result.
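As a rough illustration of the flip determination described above, the following Python sketch evaluates the criterion −ΔH≥log(rand)×T stated for the transition propriety determination unit 33j. The function name flip_permitted and the explicit rand argument are assumptions introduced only for this illustration; in the embodiment the random number is generated inside the search unit 33.

```python
import math


def flip_permitted(delta_h, temperature, rand):
    """Return True when -delta_h >= log(rand) * temperature.

    rand is assumed to be a uniform random number in (0, 1];
    passing it in explicitly keeps the sketch deterministic.
    """
    if not 0.0 < rand <= 1.0:
        raise ValueError("rand must lie in (0, 1]")
    return -delta_h >= math.log(rand) * temperature
```

Because log(rand) is never positive, a flip that does not increase the energy (ΔH≤0) always passes, whereas a flip that increases the energy passes only when rand is sufficiently small relative to ΔH/T.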
The update unit 33m transmits the identification number of the state variable for which the flip is determined to be permissible to the state variable holding unit 33b, and changes the value of the state variable. Furthermore, the update unit 33m transmits the identification number of the auxiliary variable for which the flip is determined to be permissible to the auxiliary variable holding unit 33c, and changes the value of the auxiliary variable.
Moreover, when it is determined that the flip is permissible for the flip candidate state variable, the update unit 33m causes the hi calculation unit 33f and the hk calculation unit 33g to update the N pieces of hi and the M pieces of hk. When it is determined that the flip is permissible for the flip candidate auxiliary variable, the update unit 33m causes the hi calculation unit 33f to update the N pieces of hi.
Furthermore, when it is determined that the flip is permissible for the state variable or the auxiliary variable, the update unit 33m causes the energy calculation unit 33n to update H(x). Furthermore, when λk is adjusted, the update unit 33m supplies Δλk to the energy calculation unit 33n to correct H(x).
The energy calculation unit 33n retains H(x), and updates H(x) when an update instruction is issued from the update unit 33m. Moreover, when an H(x) correction instruction is issued from the update unit 33m, the energy calculation unit 33n corrects H(x) based on Δλk according to, for example, the equation (8). Note that hk expressed by the equation (6) may be used as a value in the parentheses of the second term on the right side of the equation (8).
Furthermore, the energy calculation unit 33n retains Hbest, and in a case where the updated H(x) is smaller than Hbest and no constraint condition violation occurs when the H(x) is obtained, it sets the H(x) as a new Hbest.
The variable setting unit 33a includes a λk adjustment parameter holding unit 33a1, a state holding unit 33a2, an energy holding unit 33a3, and a λk holding unit 33a4. The λk adjustment parameter holding unit 33a1, the state holding unit 33a2, the energy holding unit 33a3, and the λk holding unit 33a4 may be implemented using a storage circuit such as a register.
The λk adjustment unit 33e includes a λk adjustment determination unit 33e1, a λk adjustment amount setting unit 33e2, and a λk setting unit 33e3.
The λk adjustment parameter holding unit 33a1 retains T0, T1 (>0), λkinit, Δλk+, and Δλk−. The λk adjustment parameter holding unit 33a1 supplies T0 to the λk adjustment determination unit 33e1, and supplies T1, λkinit, Δλk+, and Δλk− to the λk adjustment amount setting unit 33e2.
The state holding unit 33a2 retains the values of the state variables and the auxiliary variables. When an identification number i of the state variable for which the flip is determined to be permissible is received from the update unit 33m, the state holding unit 33a2 changes the value of the state variable (xi) corresponding to the identification number i. When an identification number k of the auxiliary variable for which the flip is determined to be permissible is received from the update unit 33m, the state holding unit 33a2 changes the value of the auxiliary variable (xk) corresponding to the identification number k. The state holding unit 33a2 supplies the M pieces of xk to the λk setting unit 33e3.
The energy holding unit 33a3 receives H(x) and Hbest from the energy calculation unit 33n, and retains them. The energy holding unit 33a3 supplies H(x) and Hbest to the λk setting unit 33e3.
The λk holding unit 33a4 retains the initial value of λk in all the constraint conditions. Thereafter, the λk holding unit 33a4 retains λk (k ∈ M) obtained by the λk setting unit 33e3 at the λk adjustment timing.
The λk adjustment determination unit 33e1 determines that the λk adjustment timing is reached each time the MCMC search is carried out T0 times.
The λk adjustment amount setting unit 33e2 sets λkinit, Δλk+, and Δλk− in the λk setting unit 33e3 each time the MCMC search is carried out T1 times. As a result, λk is initialized.
The λk setting unit 33e3 compares H(x) with Hbest. When H(x) is equal to or greater than Hbest, the λk setting unit 33e3 subtracts Δλk− from all the λk values or from the λk values in a designated range. Note that λk is retained in the λk holding unit 33a4. When H(x) is smaller than Hbest and there is a constraint condition in which constraint condition violation occurs, the λk setting unit 33e3 adds Δλk+ to the value of λk (retained in the λk holding unit 33a4) of the constraint condition. The λk setting unit 33e3 outputs, as Δλk, the value used for the λk adjustment out of Δλk− and Δλk+.
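The adjustment rule applied by the λk setting unit 33e3 may be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name adjust_lambdas is hypothetical, a single Δλk+ and a single Δλk− are shared by all constraint conditions, the decrease is applied to all λk (the designated-range option is omitted), and xk=1 is taken to indicate a violated constraint condition.

```python
def adjust_lambdas(lambdas, x_aux, h, h_best, delta_plus, delta_minus):
    """Return (new_lambdas, deltas) for one lambda adjustment timing.

    x_aux[k] == 1 is assumed to mean that constraint k is violated.
    """
    if h >= h_best:
        # H(x) >= Hbest: decrease every lambda_k by delta_minus.
        deltas = [-delta_minus] * len(lambdas)
    else:
        # H(x) < Hbest: increase lambda_k only for violated constraints.
        deltas = [delta_plus if xk == 1 else 0.0 for xk in x_aux]
    new_lambdas = [lam + d for lam, d in zip(lambdas, deltas)]
    return new_lambdas, deltas
```

The returned deltas correspond to the adjustment amount Δλk that is supplied for the subsequent correction of hi and H(x).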
Note that, in the example of
Furthermore, when it is determined that the flip is permissible for xj, N pieces of Wij, which are weight values between xj and the individual N state variables, and M pieces of Wkj, which are weight values between xj and the individual M auxiliary variables, are read from the weight value holding unit 33d. Furthermore, when it is determined that the flip is permissible for xk, N pieces of Wki, which are weight values between xk and the individual N state variables, are read from the weight value holding unit 33d.
The hi calculation unit 33f includes multipliers 33f1, 33f2, 33f3, 33f4, and 33f5, and an hi update holding unit 33f6.
The hk calculation unit 33g includes a multiplier 33g1 and an hk update holding unit 33g2.
The multiplier 33f1 outputs a product of Δxj and the N pieces of Wij. The multiplier 33f2 outputs a product of Δxk and the N pieces of Wki. The multiplier 33f3 outputs a product of each of the output values of the multiplier 33f2 and λk read from the variable setting unit 33a. The multiplier 33f4 outputs a product of Δλk and xk. The multiplier 33f5 outputs a product of the output value of the multiplier 33f4 and the N pieces of Wki. The multiplier 33g1 outputs a product of Δxj and the M pieces of Wkj.
The hi update holding unit 33f6 retains the N pieces of hi. Then, when it is determined that the flip is permissible for xj, the hi update holding unit 33f6 adds Δhi=WijΔxj to each of the N pieces of hi, thereby updating hi. Furthermore, when it is determined that the flip is permissible for xk, the hi update holding unit 33f6 adds Δhi=−λkWkiΔxk to each of the N pieces of hi, thereby updating hi.
Moreover, when the λk adjustment is carried out, the hi update holding unit 33f6 corrects the N pieces of hi according to the equation (7) using ΔλkWkixk, which is the output value of the multiplier 33f5.
The hk update holding unit 33g2 retains the M pieces of hk. Then, when it is determined that the flip is permissible for xj, the hk update holding unit 33g2 adds Δhk=WkjΔxj to each of the M pieces of hk, thereby updating hk.
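The incremental updates performed by the hi update holding unit 33f6 and the hk update holding unit 33g2 may be sketched in Python as follows. The function names and the list-of-lists layout (w_ij with N rows and N columns, w_aux with M rows and N columns holding Wki) are assumptions made for illustration; the correction according to the equation (7) is not included.

```python
def apply_state_flip(h_i, h_k, w_ij, w_aux, j, dx_j):
    """Update local fields in place after state variable x_j changes by dx_j."""
    for i in range(len(h_i)):
        h_i[i] += w_ij[i][j] * dx_j   # delta h_i = W_ij * dx_j
    for k in range(len(h_k)):
        h_k[k] += w_aux[k][j] * dx_j  # delta h_k = W_kj * dx_j


def apply_aux_flip(h_i, w_aux, lam_k, k, dx_k):
    """Update h_i in place after auxiliary variable x_k changes by dx_k."""
    for i in range(len(h_i)):
        h_i[i] -= lam_k * w_aux[k][i] * dx_k  # delta h_i = -lambda_k * W_ki * dx_k
```

Updating the fields by differences in this way avoids recomputing each hi and hk from all N state variables after every accepted flip.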
Hereinafter, an exemplary processing procedure (data processing method) of the data processing device 20 will be described.
Step S10: The input unit 31 receives input of initial values of the N state variables, initial values of the M auxiliary variables, problem information, and calculation conditions.
Step S11: The control unit 32 carries out an initialization process. In the initialization process, for example, the following processing is performed. The control unit 32 causes the variable setting unit 33a and the state variable holding unit 33b to retain the initial values of the N state variables, and causes the variable setting unit 33a and the auxiliary variable holding unit 33c to retain the initial values of the M auxiliary variables. Furthermore, the control unit 32 causes the weight value holding unit 33d to retain the weight value included in the problem information, and causes the variable setting unit 33a to retain the parameter for adjusting λk of the calculation conditions.
Moreover, the control unit 32 calculates an initial value of hi expressed by the equation (5) and an initial value of hk expressed by the equation (6) based on the initial values of the N state variables, the initial values of the M auxiliary variables, and the problem information. The control unit 32 causes the hi update holding unit 33f6 illustrated in
Furthermore, the control unit 32 calculates an initial value of H(x) expressed by the equation (4), for example, based on the initial values of the N state variables, the initial values of the M auxiliary variables, and the problem information. The control unit 32 causes the energy holding unit 33a3 illustrated in
Moreover, in the initialization process, the number of replicas=R and the like are set in the variable setting unit 33a.
Step S12: The control unit 32 sets t=0. Here, t is a variable representing the number of MCMC searches. In the following example, it is assumed that the number of MCMC searches is counted as one when the MCMC search is carried out for each of the N state variables.
Step S13: The control unit 32 sets r=0. Here, r is a variable representing a replica number.
Step S14: The control unit 32 sets i=1. Here, i is an identification number of the state variable.
Step S15: The λk adjustment amount setting unit 33e2 of the λk adjustment unit 33e determines whether or not t is divisible by the variable (T1) indicating the interval for setting λkinit, Δλk+, and Δλk− (whether or not t % T1=0 holds). Processing of step S16 is performed if it is determined that t % T1=0 holds, and processing of step S17 is performed if it is determined that t % T1=0 does not hold.
Step S16: The λk adjustment amount setting unit 33e2 sets λkinit, Δλk+, and Δλk− of the replica with the replica number=r (hereinafter referred to as a replica r) for the λk setting unit 33e3. As a result, λk is initialized to λkinit. Note that the values of Δλk+ and Δλk− may be changed.
Step S17: The search unit 33 carries out the MCMC search. A processing procedure of the MCMC search will be described later (see
Step S18: The λk adjustment determination unit 33e1 of the λk adjustment unit 33e determines whether or not t is divisible by the variable (T0) indicating the adjustment interval of λk (whether or not t % T0=0 holds). Processing of step S19 is performed if it is determined that t % T0=0 holds, and processing of step S26 is performed if it is determined that t % T0=0 does not hold.
Step S19: The λk setting unit 33e3 of the λk adjustment unit 33e determines whether or not H<Hbest holds. Processing of step S20 is performed if it is determined that H<Hbest holds, and processing of step S24 is performed if it is determined that H<Hbest does not hold.
Step S20: The λk setting unit 33e3 determines whether or not xk=1 holds. Processing of step S21 is performed if it is determined that xk=1 holds, and processing of step S23 is performed if it is determined that xk=1 does not hold.
Step S21: The λk setting unit 33e3 adjusts λk by adding Δλk+ to the original λk.
Step S22: The control unit 32 determines whether or not k=M holds. Processing of step S25 is performed if it is determined that k=M holds, and processing of step S23 is performed if it is determined that k=M does not hold.
Step S23: The control unit 32 sets k=k+1. Thereafter, the process from step S20 is repeated.
Step S24: The λk setting unit 33e3 adjusts λk by, for example, subtracting Δλk− from all the pieces of λk.
Step S25: Correction of hi and H(x) is carried out. For example, the hi update holding unit 33f6 corrects the N pieces of hi according to the equation (7). The energy calculation unit 33n corrects H(x) according to, for example, the equation (8).
Step S26: The control unit 32 determines whether or not i=N holds. Processing of step S28 is performed if it is determined that i=N holds, and processing of step S27 is performed if it is determined that i=N does not hold.
Step S27: The control unit 32 sets i=i+1. Thereafter, the process from step S15 is repeated.
Step S28: The control unit 32 determines whether or not r=R−1 holds. Processing of step S30 is performed if it is determined that r=R−1 holds, and processing of step S29 is performed if it is determined that r=R−1 does not hold.
Step S29: The control unit 32 sets r=r+1. Thereafter, the process from step S14 is repeated.
Step S30: The control unit 32 determines whether or not an end condition is satisfied. For example, the control unit 32 determines that the end condition is satisfied if the number of MCMC searches (t) reaches the maximum number or if H(x) becomes equal to or smaller than a predetermined value. Processing of step S32 is performed if it is determined that the end condition is satisfied, and processing of step S31 is performed if it is determined that the end condition is not satisfied.
Step S31: The control unit 32 sets t=t+1. Thereafter, the process from step S13 is repeated.
Step S32: The output unit 34 outputs a calculation result. This terminates the process. For example, the output unit 34 may output the calculation result to the display 24a to be displayed, transmit the calculation result to another information processing device via the network 27a, or store the calculation result in an external storage device.
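The timing relationship between the initialization of λk (steps S15 and S16) and the adjustment of λk (step S18 onward) may be illustrated by the following Python sketch. It is a simplification that omits the loops over the replicas and over the state variables and only lists, for each value of t, which of the two events is triggered; the function name lambda_schedule and the event labels are hypothetical.

```python
def lambda_schedule(t_max, t0, t1):
    """List (t, event) pairs for t = 0 .. t_max - 1.

    "init" marks t % t1 == 0 (steps S15 and S16) and "adjust" marks
    t % t0 == 0 (step S18 onward). Replica and per-variable loops are omitted.
    """
    events = []
    for t in range(t_max):
        if t % t1 == 0:
            events.append((t, "init"))
        # The MCMC search of step S17 would be carried out here.
        if t % t0 == 0:
            events.append((t, "adjust"))
    return events
```

For example, with T0=2 and T1=4, the adjustment is triggered at t=0, 2, and 4, and the initialization is triggered at t=0 and 4, so that two adjustment timings fall within each initialization interval.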
Note that, in the case of performing the simulated annealing, for example, the control unit 32 decreases the value of the temperature parameter (T) described above according to a predetermined temperature parameter change schedule each time the MCMC search for the state variable is repeated a predetermined number of times. Then, under the control of the control unit 32, the output unit 34 outputs, as a calculation result, a state corresponding to the minimum value of Hbest of the individual replicas obtained when the MCMC search is repeated the maximum number of times, for example.
In the case of performing the replica exchange method, replica exchange is carried out each time the MCMC search is repeated a predetermined number of times. For example, the control unit 32 selects two replicas having adjacent T values, and exchanges the T values or the values of the respective state variables and the values of the respective auxiliary variables between the selected two replicas at a predetermined exchange probability based on an H(x) difference or a T value difference between the replicas. Then, under the control of the control unit 32, the output unit 34 outputs, as a calculation result, a state corresponding to the minimum value of Hbest of the individual replicas obtained when the MCMC search is repeated the maximum number of times, for example.
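The exchange probability is not specified in detail above. As one possibility, the acceptance rule commonly used in the replica exchange method, min(1, exp((1/Ta−1/Tb)(Ha−Hb))), may be sketched in Python as follows. The use of this particular rule and the function name exchange_probability are assumptions of this sketch and are not taken from the embodiment.

```python
import math


def exchange_probability(h_a, t_a, h_b, t_b):
    """Commonly used replica exchange acceptance probability.

    Computes min(1, exp((1/t_a - 1/t_b) * (h_a - h_b))) for two replicas
    with energies h_a, h_b and temperature parameters t_a, t_b.
    """
    exponent = (1.0 / t_a - 1.0 / t_b) * (h_a - h_b)
    if exponent >= 0.0:
        return 1.0
    return math.exp(exponent)
```

Under this rule, the exchange is always accepted when the replica with the lower temperature parameter has the higher energy, and is accepted with a probability below 1 otherwise.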
Step S40: A flip candidate state variable (xi) is selected. When the flip candidate state variable is selected, a change amount (Δxi) when a value of the state variable is changed is output from the state variable holding unit 33b.
Step S41: The ΔH calculation unit 33h of the search unit 33 calculates ΔH by the equation ΔH=−hiΔxi.
Step S42: The transition propriety determination unit 33j of the search unit 33 performs flip determination for xi based on a result of comparison between ΔH and the predetermined value described above. Processing of step S43 is performed if it is determined that a change in xi is permissible (in a case where “flip is permissible”), and one MCMC search is terminated if it is determined that a change in xi is not permissible (in a case where “flip is not permissible”).
Step S43: The search unit 33 updates hi, hk, H(x), and xi by the processing described above.
Step S44: The control unit 32 sets k=1.
Step S45: A flip candidate auxiliary variable (xk) is selected. When the flip candidate auxiliary variable is selected, a change amount (Δxk) when a value of the auxiliary variable is changed is output from the auxiliary variable holding unit 33c.
Step S46: The ΔH calculation unit 33i of the search unit 33 calculates ΔH by the equation ΔH=+λkhkΔxk.
Step S47: The transition propriety determination unit 33k of the search unit 33 performs flip determination for xk based on a result of comparison between ΔH and the predetermined value described above, for example. Processing of step S48 is performed if it is determined that a change in xk is permissible (in a case where “flip is permissible”), and processing of step S49 is performed if it is determined that a change in xk is not permissible (in a case where “flip is not permissible”).
Step S48: The search unit 33 updates hi, H(x), and xk by the processing described above.
Step S49: The control unit 32 determines whether or not k=M holds. Processing of step S51 is performed if it is determined that k=M holds, and processing of step S50 is performed if it is determined that k=M does not hold.
Step S50: The control unit 32 sets k=k+1. Thereafter, the process from Step S45 is repeated.
Step S51: The control unit 32 determines whether or not all the pieces of xk are 0. If it is determined that all the pieces of xk are 0, processing of step S52 is performed. If it is determined that not all the pieces of xk are 0 (at least one xk is not 0), one MCMC search is terminated.
Step S52: The energy calculation unit 33n updates Hbest. If the updated H(x) is smaller than Hbest, the energy calculation unit 33n sets the H(x) as a new Hbest. After the processing of step S52, one MCMC search is terminated.
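Steps S51 and S52 may be summarized by the following Python sketch, in which Hbest is replaced only when no constraint condition is violated (all the pieces of xk are 0) and the updated H(x) is smaller than Hbest. The function name update_h_best is hypothetical.

```python
def update_h_best(h, h_best, x_aux):
    """Return the new Hbest after one MCMC search.

    h replaces h_best only when every auxiliary variable is 0
    (no constraint violation) and h is smaller than h_best.
    """
    if all(xk == 0 for xk in x_aux) and h < h_best:
        return h
    return h_best
```

The check on xk prevents a low-energy state that violates a constraint condition from being recorded as the best solution.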
Note that the order of the processing illustrated in
According to the data processing method as described above, the λk adjustment unit 33e adjusts λk based on the result of comparison between H(x) and Hbest and the presence or absence of constraint condition violation represented by xk each time the MCMC search is performed T0 times. As a result, λk may be appropriately adjusted by reflecting the solution search condition, and accordingly, the efficiency in searching for a solution to the combinatorial optimization problem may improve.
Furthermore, when λk is adjusted, hi and H(x) are corrected based on the adjustment amount of λk, whereby the occurrence of a calculation error caused by changing of λk may be suppressed.
Note that, as described above, the processing contents described above may be implemented by causing the data processing device 20 to execute a program.
The program may be recorded in a computer-readable recording medium (e.g., recording medium 26a). As the recording medium, for example, a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like may be used. Examples of the magnetic disk include an FD and an HDD. Examples of the optical disk include a CD, a CD-recordable (R)/rewritable (RW), a DVD, and a DVD-R/RW. The program may be recorded in a portable recording medium and distributed. In that case, the program may be copied from the portable recording medium to another recording medium (e.g., HDD 23) and then executed.
The search unit 33 may process a plurality of replicas by pipeline processing.
During the T0 iteration, λk is fixed. During the T1 iteration, Δλk (and λkinit) is fixed. In the example of
In one iteration, processing is sequentially performed on R pieces of replicas (indicated as R replicas).
The λk adjustment timing may be shifted for each replica. In the example of
Processing performed by the individual replicas is divided into an update phase and a trial phase. Although processing for the replica 0 will be mainly described below, similar processing is performed for other replicas.
The update phase includes a process in which update (flip) of the value of the state variable (xi) or the auxiliary variable (xk) is carried out and a process in which calculation of λk is carried out. Furthermore, the update phase includes a process of reading the weight value for updating hi from the weight value holding unit 33d and a process of reading the weight value for updating hk from the weight value holding unit 33d. Moreover, the update phase includes a process of reading λk from the variable setting unit 33a and a process of updating or correcting hi or hk.
The trial phase includes a process of reading hi or hk from the memory mentioned above, a process of calculating ΔH, a process of determining a flip variable (state variable or auxiliary variable for which the value update is carried out), and a process of H(x) calculation.
Note that each of the processes is performed in one or a plurality of clock cycles, for example.
The trial phase for the state variable starts at timing t10. For each of the replicas, the read of hi from the memory, the calculation of ΔH, the determination of the flip variable based on ΔH, and the calculation of H(x) are sequentially carried out. At timing t11, the identification number of the state variable, which is the flip variable determined in the replica 0, is supplied to the update unit 33m via the FIFO, and the update phase starts (timing t12). At the timing t12, the value of the state variable, which is the flip variable, is updated. Then, at timing t13, the weight value corresponding to the updated state variable is read from the weight value holding unit 33d. Then, hi and hk are updated based on the individual read values, and stored in the memory (timing t14).
The trial phase for the auxiliary variable starts at timing t15. For each of the replicas, the read of hk from the memory, the calculation of ΔH, the determination of the flip variable based on ΔH, and the calculation of H(x) are sequentially carried out.
At timing t18 and t19, the values of the two auxiliary variables, which are the flip variables, are sequentially updated. Then, at timing t19 and t20, the weight value corresponding to the updated auxiliary variable is read from the weight value holding unit 33d, and λk is read from the variable setting unit 33a. Then, hi is updated based on the individual read values (timing t20 to t21). Note that the calculation (adjustment) of λk is carried out based on a result of the comparison between Hbest and H(x) and a result of the determination regarding the presence or absence of violation of each constraint condition based on the value of xk at the timing t18. The correction of hi in a case where the value of λk is changed is also carried out at the timing t20 to t21.
At timing t21 to t23, the correction of hi is carried out based on a change of λk without changing the value of the auxiliary variable. In a case where adjustment to decrease the values of all the pieces of λk (M pieces of λk) is carried out, for example, the correction process is performed in a period of M clock cycles. However, the correction process of hi based on a change of λk that does not affect hi even if the value changes may be skipped.
At timing t22, the calculation of λk that does not affect hi (without a local field change) is carried out. Since this calculation does not affect hi, it may be started at the timing t22 between the timing t21 and t23 at which the correction of hi is carried out.
At timing t24 to t25, the correction process of H(x) based on a change of λk is carried out. Processing similar to the processing at the timing t10 is performed again at timing t25.
Next, an example of evaluating a difference in the effect depending on the presence or absence of the λk adjustment as described above will be described. The combinatorial optimization problem to be calculated is a set covering problem for arranging 5,800 people in 404 areas for which an optimum solution is known. When λk is not adjusted, the search failed to reach the optimum solution even with 10^6 iterations. When λk is adjusted as described above, the optimum solution was reached with 225,603 iterations.
A data processing device 40 includes an accelerator card 41 coupled to a bus.
The accelerator card 41 is a hardware accelerator that searches for a solution to a combinatorial optimization problem. The accelerator card 41 includes an FPGA 41a and a DRAM 41b.
In the data processing device 40, the FPGA 41a and the DRAM 41b perform, for example, the processing of the processing unit 12 and the storage unit 11 illustrated in
Note that there may be a plurality of the accelerator cards 41.
While one aspect of the data processing device, the program, and the data processing method according to the present disclosure has been described based on the embodiments, this is merely an example, and the present disclosure is not limited to the description above.
While the case of mainly using the inequality constraint as the constraint condition has been described above, another constraint condition, such as an equality constraint, may also be used.
For example, in the case of using the equality constraint, the following equation (11) is used instead of the equation (4) for the total energy (H(x)).
Here, a spin variable having a value of −1 or 1 may be used as the auxiliary variable (xk). In that case, it may be expressed as Δxk=−2xk. When the equality constraint is not satisfied (in the case of Rk(x)≠Uk), xk becomes −1, and when the equality constraint is satisfied (in the case of Rk(x)=Uk), xk becomes +1.
When such an auxiliary variable is used, ΔH may be expressed as ΔH=+λkhkΔxk in a similar manner to the case described above.
Note that it is sufficient to set ΔH=+2λkhkΔxk instead of ΔH=+λkhkΔxk in a case of using a binary variable without using a spin variable.
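The change amount of a flip candidate for the two variable encodings may be sketched in Python as follows. The relation Δxk=−2xk for the spin variable is as stated above; the relation Δx=1−2x for a binary variable taking 0 or 1 is the usual flip relation and is given here as an assumption, since it is not written out in this section. The function names are hypothetical.

```python
def flip_delta_spin(x):
    """Change amount when a spin variable (-1 or +1) is flipped: -2 * x."""
    if x not in (-1, 1):
        raise ValueError("spin variable must be -1 or +1")
    return -2 * x


def flip_delta_binary(x):
    """Change amount when a binary variable (0 or 1) is flipped: 1 - 2 * x."""
    if x not in (0, 1):
        raise ValueError("binary variable must be 0 or 1")
    return 1 - 2 * x
```

A spin flip changes the variable by ±2, whereas a binary flip changes it by ±1, which accounts for the factor of 2 that distinguishes the two expressions of ΔH given above.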
Furthermore, the auxiliary variable may take three or more values.
Here, xk has four values 0, 1, 2, and 3. A state where a constraint condition is satisfied is indicated by xk=0, and three constraint condition violated states are indicated by xk=1, 2, and 3. In the example of
Furthermore, as λk described above, λ1 is used when xk=1, λ2 is used when xk=2, and λ3 is used when xk=3. As a result, a constraint term that increases with different slopes as hk increases may be used, depending on whether xk=1, 2, or 3.
In a case of using the auxiliary variable as described above, ΔHi→j in the case of changing from (hi, gi) to (hj, gj) may be represented as ΔHi→j=[λj(hk−hj)+gj]−[λi(hk−hi)+gi]=(λj−λi)hk+[(gj−λjhj)−(gi−λihi)].
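The equality of the two forms of ΔHi→j given above may be checked numerically with the following Python sketch. The function names are hypothetical, and the arguments h_i and h_j denote the values hi and hj in the pairs (hi, gi) and (hj, gj), not the local fields of the state variables.

```python
def delta_h_direct(h_k, lam_i, h_i, g_i, lam_j, h_j, g_j):
    """[lam_j*(h_k - h_j) + g_j] - [lam_i*(h_k - h_i) + g_i]."""
    return (lam_j * (h_k - h_j) + g_j) - (lam_i * (h_k - h_i) + g_i)


def delta_h_expanded(h_k, lam_i, h_i, g_i, lam_j, h_j, g_j):
    """(lam_j - lam_i)*h_k + [(g_j - lam_j*h_j) - (g_i - lam_i*h_i)]."""
    return (lam_j - lam_i) * h_k + ((g_j - lam_j * h_j) - (g_i - lam_i * h_i))
```

In the expanded form, only the first term depends on hk, so the bracketed term may be computed in advance for each pair of values i and j.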
All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2022-196481 | Dec 2022 | JP | national |