INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM

CROSS REFERENCE TO RELATED APPLICATIONS

This application is continuation application of International Application No. JP2020/014190, filed on Mar. 27, 2020, which claims priority to Japanese Patent Application No. 2019-064699, filed on Mar. 28, 2019, the entire contents of which are incorporated herein by reference.

FIELD

Embodiments of the present invention relate to an information processing device, an information processing system, an information processing method, and a storage medium.

BACKGROUND

A combinatorial optimization problem is a problem of selecting a combination most suitable for a purpose from a plurality of combinations. Mathematically, combinatorial optimization problems are attributed to problems for maximizing functions including a plurality of discrete variables, called “objective functions”, or minimizing the functions. Although combinatorial optimization problems are common problems in various fields including finance, logistics, transport, design, manufacture, and life science, it is not always possible to calculate an optimal solution due to so-called “combinatorial explosion” that the number of combinations increases in exponential orders of a problem size. In addition, it is difficult to even obtain an approximate solution close to the optimal solution in many cases.

Development of a technique for calculating a solution for the combinatorial optimization problem within a practical time is required in order to solve problems in each field and promote social innovation and progress in science and technology.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a diagram illustrating a configuration example of an information processing system.

FIG. 2 is a block diagram illustrating a configuration example of a management server.

FIG. 3 is a diagram illustrating an example of data stored in a storage unit of the management server.

FIG. 4 is a block diagram illustrating a configuration example of a calculation server.

FIG. 5 is a diagram illustrating an example of data stored in a storage of the calculation server.

FIG. 6 is a flowchart illustrating an example of processing in a case where a solution of a simulated bifurcation algorithm is calculated by time evolution.

FIG. 7 is a graph illustrating an example of an update pattern of a coefficient p in a case where calculation of time evolution is performed.

FIG. 8 is a graph conceptually illustrating the coefficient p used in a case where calculation is performed using a replica exchange method.

FIG. 9 is a flowchart illustrating an example of processing in a case where calculation of a simulated bifurcation algorithm is performed using the replica exchange method.

FIG. 10 is a flowchart illustrating an example of processing in a case where calculation is performed using a varying coefficient p(m).

FIG. 11 is a flowchart illustrating an example of processing in a case where replica exchange is performed in parallel.

FIG. 12 is a diagram conceptually illustrating a processing flow in the replica exchange method.

FIG. 13 is a diagram illustrating an example of a formula related to the replica exchange method.

FIG. 14 is a histogram illustrating an example of the number of trials required until an optimal solution is obtained.

FIG. 15 is a diagram schematically illustrating an example of a multi-processor configuration.

FIG. 16 is a diagram schematically illustrating an example of a configuration using a GPU.

FIG. 17 is a flowchart illustrating an example of overall processing executed to solve a combinatorial optimization problem.

DETAILED DESCRIPTION

According to one embodiment, an information processing device includes a plurality of arithmetic circuits and a data exchange circuit. The plurality of arithmetic circuits each of which is configured to repeatedly update a first vector which has a first variable as an element and a second vector which has a second variable corresponding to the first variable as an element. Each of the arithmetic circuits is configured to update the first variable based on the corresponding second variable, weight the first variable with a first coefficient and add the weighted first variable to the corresponding second variable, calculate a problem term using a plurality of the first variables, and add the problem term to the second variable, different values are set as the first coefficients in the respective arithmetic circuits. The data exchange circuit is configured to execute at least one of exchange of the first vector and the second vector or exchange of the first coefficients between the arithmetic circuits.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. In addition, the same components will be denoted by the same reference signs, and a description thereof will be omitted as appropriate.

FIG. 1 is a block diagram illustrating a configuration example of an information processing system 100. The information processing system 100 of FIG. 1 includes a management server 1, a network 2, calculation servers (information processing devices) 3a to 3c, cables 4a to 4c, a switch 5, and a storage device 7. In addition, FIG. 1 illustrates a client terminal 6 that can communicate with the information processing system 100. The management server 1, the calculation servers 3a to 3c, the client terminal 6, and the storage device 7 can perform data communication with each other via the network 2. For example, the calculation servers 3a to 3c can store data in the storage device 7 and read data from the storage device 7. The network 2 is, for example, the Internet in which a plurality of computer networks are connected to each other. The network 2 can use a wired or wireless communication medium or a combination thereof. In addition, an example of a communication protocol used in the network 2 is TCP/IP, but a type of communication protocol is not particularly limited.

In addition, the calculation servers 3a to 3c are connected to the switch 5 via the cables 4a to 4c, respectively. The cables 4a to 4c and the switch 5 form interconnection between the calculation servers. The calculation servers 3a to 3c can also perform data communication with each other via the interconnection. The switch 5 is, for example, an Infiniband switch. The cables 4a to 4c are, for example, Infiniband cables. However, a wired LAN switch/cable may be used instead of the Infiniband switch/cable. Communication standards and communication protocols used in the cables 4a to 4c and the switch 5 are not particularly limited. Examples of the client terminal 6 include a notebook PC, a desktop PC, a smartphone, a tablet, and an in-vehicle terminal.

Parallel processing and/or distributed processing can be performed to solve a combinatorial optimization problem. Therefore, the calculation servers 3a to 3c and/or processors of the calculation servers 3a to 3c may share and execute some steps of calculation processes, or may execute similar calculation processes for different variables in parallel. For example, the management server 1 converts a combinatorial optimization problem input by a user into a format that can be processed by each calculation server, and controls the calculation server. Then, the management server 1 acquires calculation results from the respective calculation servers, and converts the aggregated calculation result into a solution of the combinatorial optimization problem. In this manner, the user can obtain the solution to the combinatorial optimization problem. It is assumed that the solution of the combinatorial optimization problem includes an optimal solution and an approximate solution close to the optimal solution.

FIG. 1 illustrates three calculation servers. However, the number of calculation servers included in the information processing system is not limited. In addition, the number of calculation servers used for solving the combinatorial optimization problem is not particularly limited. For example, the information processing system may include one calculation server. In addition, a combinatorial optimization problem may be solved using any one of a plurality of calculation servers included in the information processing system. In addition, the information processing system may include several hundred or more calculation servers. The calculation server may be a server installed in a data center or a desktop PC installed in an office. In addition, the calculation server may be a plurality of types of computers installed at different locations. A type of information processing device used as the calculation server is not particularly limited. For example, the calculation server may be a general-purpose computer, a dedicated electronic circuit, or a combination thereof.

FIG. 2 is a block diagram illustrating a configuration example of the management server 1. The management server 1 of FIG. 2 is, for example, a computer including a central processing unit (CPU) and a memory. The management server 1 includes a processor 10, a storage unit 14, a communication circuit 15, an input circuit 16, and an output circuit 17. It is assumed that the processor 10, the storage unit 14, the communication circuit 15, the input circuit 16, and the output circuit 17 are connected to each other via a bus 20. The processor 10 includes a management unit 11, a conversion unit 12, and a control unit 13 as internal components.

The processor 10 is an electronic circuit that executes an operation and controls the management server 1. As the processor 10, for example, a CPU, a microprocessor, an ASIC, an FPGA, a PLD, or a combination thereof can be used. The management unit 11 provides an interface configured to operate the management server 1 via the client terminal 6 of the user. Examples of the interface provided by the management unit 11 include an API, a CLI, and a web page. For example, the user can input information of a combinatorial optimization problem via the management unit 11, and browse and/or download a calculated solution of the combinatorial optimization problem. The conversion unit 12 converts the combinatorial optimization problem into a format that can be processed by each calculation server. The control unit 13 transmits a control command to each calculation server. After the control unit 13 acquires calculation results from the respective calculation servers, the conversion unit 12 aggregates the plurality of calculation results and converts the aggregated result into a solution of the combinatorial optimization problem. In addition, the control unit 13 may designate a processing content to be executed by each calculation server or a processor in each server.

The storage unit 14 stores various types of data including a program of the management server 1, data necessary for execution of the program, and data generated by the program. Here, the program includes both an OS and an application. The storage unit 14 may be a volatile memory, a non-volatile memory, or a combination thereof. Examples of the volatile memory include a DRAM and an SRAM. Examples of the non-volatile memory include a NAND flash memory, a NOR flash memory, a ReRAM, or an MRAM. In addition, a hard disk, an optical disk, a magnetic tape, or an external storage device may be used as the storage unit 14.

The communication circuit 15 transmits and receives data to and from each device connected to the network 2. The communication circuit 15 is, for example, a network interface card (NIC) of a wired LAN. However, the communication circuit 15 may be another type of communication circuit such as a wireless LAN. The input circuit 16 implements data input with respect to the management server 1. It is assumed that the input circuit 16 includes, for example, a USB, PCI-Express, or the like as an external port. In the example of FIG. 2, an operation device 18 is connected to the input circuit 16. The operation device 18 is a device configured to input information to the management server 1. The operation device 18 is, for example, a keyboard, a mouse, a touch panel, a voice recognition device, or the like, but is not limited thereto. The output circuit 17 implements data output from the management server 1. It is assumed that the output circuit 17 includes HDMI, DisplayPort, or the like as an external port. In the example of FIG. 2, the display device 19 is connected to the output circuit 17. Examples of the display device 19 include a liquid crystal display (LCD), an organic electroluminescence (EL) display, and a projector, but are not limited thereto.

An administrator of the management server 1 can perform maintenance of the management server 1 using the operation device 18 and the display device 19. Note that the operation device 18 and the display device 19 may be incorporated in the management server 1. In addition, the operation device 18 and the display device 19 are not necessarily connected to the management server 1. For example, an administrator may perform maintenance of the management server 1 using a client terminal capable of communicating with the network 2.

FIG. 3 illustrates an example of data stored in the storage unit 14 of the management server 1. The storage unit 14 of FIG. 3 stores problem data 14A, calculation data 14B, a management program 14C, a conversion program 14D, and a control program 14E. For example, the problem data 14A includes data of a combinatorial optimization problem. For example, the calculation data 14B includes a calculation result collected from each calculation server. For example, the management program 14C is a program that implements the above-described function of the management unit 11. For example, the conversion program 14D is a program that implements the above-described function of the conversion unit 12. For example, the control program 14E is a program that implements the above-described function of the control unit 13.

FIG. 4 is a block diagram illustrating a configuration example of the calculation server. FIG. 4 illustrates a configuration of the calculation server 3a as an example. The other calculation server may have a configuration similar to that of the calculation server 3a or may have a configuration different from that of the calculation server 3a. The calculation server 3a is, for example, an information processing device that performs calculation of a first vector and a second vector alone or in a shared manner with the other calculation server. In addition, at least one of the calculation servers may calculate a problem term between elements of the first vector. Here, the problem term may be calculated based on an Ising model to be described later. In addition, the problem term may include a many-body interaction to be described later.

For example, the first vector is a vector which has a variable x_i(i=1, 2, . . . , N) or a variable X^(m)_i(i=1, 2, . . . , N) as an element. For example, the second vector is a vector which has a variable y_i(i=1, 2, . . . , N) or a variable Y^(m)i (i=1, 2, . . . , N) as an element. Here, m represents a replica number of an integer of 1 to M. Details of the replica will be described later.

The calculation server 3a includes, for example, a communication circuit 31, a shared memory 32, processors 33A to 33D, a storage 34, and a host bus adapter 35. It is assumed that the communication circuit 31, the shared memory 32, the processors 33A to 33D, the storage 34, and the host bus adapter 35 are connected to each other via a bus 36.

The communication circuit 31 transmits and receives data to and from each device connected to the network 2. The communication circuit 31 is, for example, a network interface card (NIC) of a wired LAN. However, the communication circuit 31 may be another type of communication circuit such as a wireless LAN. The shared memory 32 is a memory accessible from the processors 33A to 33D. Examples of the shared memory 32 include a volatile memory such as a DRAM and an SRAM. However, another type of memory such as a non-volatile memory may be used as the shared memory 32. The processors 33A to 33D can share data via the shared memory 32. Note that all the memories of the calculation server 3a are not necessarily configured as shared memories. For example, some of the memories of the calculation server 3a may be configured as a local memory that can be accessed only by any processor.

The processors 33A to 33D are electronic circuits that execute calculation processes. The processor may be, for example, any of a central processing unit (CPU), a graphics processing unit (GPU), a field-programmable gate array (FPGA), and an application specific integrated circuit (ASIC), or a combination thereof. In addition, the processor may be a CPU core or a CPU thread. When the processor is the CPU, the number of sockets included in the calculation server 3a is not particularly limited. In addition, the processor may be connected to another component of the calculation server 3a via a bus such as PCI express.

In the example of FIG. 4, the calculation server includes four processors. However, the number of processors included in one calculation server may be different from this. For example, the number and/or types of processors implemented in the calculation server may be different.

For example, the information processing device is configured to repeatedly update the first vector which has a first variable x_i(i=1, 2, . . . , N) as an element and the second vector which has a second variable y_i(i=1, 2, . . . , N) corresponding to the first variable as an element. The storage unit of the information processing device may be configured to store a first variable which is an element of a first vector and a second variable which is an element of a second vector.

For example, each of a plurality of arithmetic circuits 50 is configured to repeatedly update a first vector having a first variable as an element and a second vector having a second variable corresponding to the first variable as an element. For example, each of the arithmetic circuits 50 is configured to update the first variable based on the corresponding second variable, weight the first variable with a first coefficient and add the weighted first variable to the corresponding second variable, calculate a problem term using the plurality of first variables, and add the problem term to the second variable. In addition, different values may be set as the first coefficient in each of the arithmetic circuits 50. For example, a data exchange circuit 51 is configured to execute at least one of exchange of the first vector and the second vector or exchange of the first coefficients between the arithmetic circuits 50. The problem term calculated by the arithmetic circuit may be based on an Ising model. In addition, the problem term calculated by the arithmetic circuit may include a many-body interaction. Details of the problem term will be described later.

In the example of FIG. 4, the processors 33A to 33C correspond to the plurality of arithmetic circuits, and the processor 33D corresponds to the data exchange circuit. However, the correspondence relationship between the arithmetic circuit or data exchange circuit and the processor illustrated in FIG. 4 is merely an example. Therefore, the correspondence relationship between the arithmetic circuit or data exchange circuit and the processor may be different from this. In addition, the number of processors allocated to the arithmetic circuit or data exchange circuit is not particularly limited. The same processor may also serve as the arithmetic circuit and the data exchange circuit as will be described later. In a case where plural types of processors (for example, a CPU, a GPU, and an FPGA) are implemented in the calculation server, different types of processors may be allocated to the arithmetic circuit and the data exchange circuit.

Here, the case where the processing content is allocated in units of processors has been described as an example. However, a unit of a calculation resource in which the processing content is allocated is not limited. For example, the processing content may be allocated in units of calculators. In this case, the information processing system may include a plurality of first calculators and a second calculator. For example, each of the first calculators is configured to repeatedly update a first vector having a first variable as an element and a second vector having a second variable corresponding to the first variable as an element. For example, the second calculator is configured to execute data exchange between the first calculators. The first calculator and the second calculator correspond to, for example, the above-described calculation server.

Each of the first calculators may be configured to update the first variable based on the corresponding second variable, weight the first variable with a first coefficient and add the weighted first variable to the corresponding second variable, calculate a problem term using the plurality of first variables, and add the problem term to the second variable. The second calculator may be configured to execute at least one of exchange of the first vector and the second vector or exchange of the first coefficients between the first calculators. In addition, different values may be set as the first coefficient in each of the first calculators.

Hereinafter, components of the calculation server will be described with reference to FIG. 4 again.

The storage 34 stores various data including a program of the calculation server 3a, data necessary for executing the program, and data generated by the program. Here, the program includes both an OS and an application. The storage 34 may be a volatile memory, a non-volatile memory, or a combination thereof. Examples of the volatile memory include a DRAM and an SRAM. Examples of the non-volatile memory include a NAND flash memory, a NOR flash memory, a ReRAM, or an MRAM. In addition, a hard disk, an optical disk, a magnetic tape, or an external storage device may be used as the storage 34.

The host bus adapter 35 implements data communication between the calculation servers. The host bus adapter 35 is connected to the switch 5 via the cable 4a. The host bus adapter 35 is, for example, a host channel adaptor (HCA). The speed of parallel calculation processes can be improved by forming interconnection capable of achieving a high throughput using the host bus adapter 35, the cable 4a, and the switch 5.

FIG. 5 illustrates an example of data stored in the storage of the calculation server. The storage 34 of FIG. 5 stores calculation data 34A, a calculation program 34B, and a control program 34C. The calculation data 34A includes data in the middle of calculation by the calculation server 3a or a calculation result. Note that at least a part of the calculation data 34A may be stored in a different storage hierarchy such as the shared memory 32, a cache of the processor, and a register of the processor. The calculation program 34B is a program that implements a calculation process in each processor and a process of storing data in the shared memory 32 and the storage 34 based on a predetermined algorithm. The control program 34C is a program that controls the calculation server 3a based on a command transmitted from the control unit 13 of the management server 1 and transmits a calculation result of the calculation server 3a to the management server 1.

Next, a technique related to solving a combinatorial optimization problem will be described. An example of the information processing device used to solve the combinatorial optimization problem is an Ising machine. The Ising machine refers to an information processing device that calculates the energy of a ground state of an Ising model. Hitherto, the Ising model has been mainly used as a model of a ferromagnet or a phase transition phenomenon in many cases. In recent years, however, the Ising model has been increasingly used as a model for solving a combinatorial optimization problem. The following Formula (1) represents the energy of the Ising model.

$\begin{matrix} [Formula 1] \\ E_{Ising} = \sum_{i = 1}^{N} h_{i} s_{i} + \frac{1}{2} \sum_{i = 1}^{N} \sum_{j = 1}^{N} J_{i, j} s_{i} s_{j} & (1) \end{matrix}$

Here, s_iand s_jare spins, and the spins are binary variables each having a value of either +1 or −1. N is the number of spins. Further, h_iis a local magnetic field acting on each spin. J is a matrix of coupling coefficients between spins. The matrix J is a real symmetric matrix whose diagonal components are 0. Therefore, J_ijindicates an element in row i and column j of the matrix J. Note that the Ising model of Formula (1) is a quadratic expression for the spin, but an extended Ising model (Ising model having a many-body interaction) including a third-order or higher-order term of the spin may be used.

When the Ising model of Formula (1) is used, energy E_Isingcan be used as an objective function, and it is possible to calculate a solution that minimizes energy E_Isingas much as possible. The solution of the Ising model is expressed in a format of a spin vector (s₁, s₂, . . . , s_N). In particular, the vector (s₁, s₂, . . . , s_N) having the minimum value of the energy E_Isingis referred to as an optimal solution. However, the solution of the Ising model to be calculated is not necessarily a strictly optimal solution. Hereinafter, a problem of obtaining an approximate solution (that is, an approximate solution in which a value of the objective function is as close as possible to the optimal value) in which the energy E_Isingis minimized as much as possible using the Ising model is referred to as an Ising problem.

Since the spin s_iin Formula (1) is a binary variable, conversion with a discrete variable (bit) used in the combinatorial optimization problem can be easily performed using Formula (1+s_i)/2. Therefore, it is possible to obtain a solution of the combinatorial optimization problem by converting the combinatorial optimization problem into the Ising problem and causing the Ising machine to perform calculation. A problem of obtaining a solution that minimizes a quadratic objective function having a discrete variable (bit), which takes a value of either 0 or 1, as a variable is referred to as a quadratic unconstrained binary optimization (QUBO) problem. It can be said that the Ising problem represented by Formula (1) is equivalent to the QUBO problem.

For example, a quantum annealer, a coherent Ising machine, a quantum bifurcation machine have been proposed as hardware implementations of the Ising Machine. The quantum annealer implements quantum annealing using a superconducting circuit. The coherent Ising machine uses an oscillation phenomenon of a network formed by an optical parametric oscillator. The quantum bifurcation machine uses a quantum mechanical bifurcation phenomenon in a network of a parametric oscillator with the Kerr effect. These hardware implementations have the possibility of significantly reducing a calculation time, but also have a problem that it is difficult to achieve scale-out and a stable operation.

Therefore, it is also possible to solve the Ising problem using a widely-spread digital computer. As compared with the hardware implementations using the above-described physical phenomenon, the digital computer facilitates the scale-out and the stable operation. An example of an algorithm for solving the Ising problem in the digital computer is simulated annealing (SA). A technique for performing simulated annealing at a higher speed has been developed. However, general simulated annealing is a sequential updating algorithm where each of variables is updated sequentially, and thus, it is difficult to speed up calculation processes by parallelization.

Taking the above-described problem into consideration, a simulated bifurcation algorithm, capable of solving a large-scale combinatorial optimization problem at a high speed by parallel calculation in the digital computer, has been proposed. Hereinafter, a description will be given regarding an information processing device, an information processing system, an information processing method, a storage medium, and a program for solving a combinatorial optimization problem using the simulated bifurcation algorithm.

First, an overview of the simulated bifurcation algorithm will be described.

In the simulated bifurcation algorithm, a simultaneous ordinary differential equation in (2) below is numerically solved for each of two variables x_iand y_i(i=1, 2, . . . , N), the number of each of the variables being N. Each of the N variables x_icorresponds to the spin s_iof the Ising model. On the other hand, each of the N variables y_icorresponds to the momentum. It is assumed that both the variables x_iand y_iare continuous variables. Hereinafter, a vector having the variable x_i(i=1, 2, . . . , N) as an element is referred to as a first vector, and a vector having the variable y_i(i=1, 2, . . . , N) as an element is referred to as a second vector.

$\begin{matrix} [Formula 2] \\ \frac{{dx}_{i}}{dt} = \frac{\partial H}{\partial y_{i}} = {Dy}_{i} \frac{{dy}_{i}}{dt} = - \frac{\partial H}{\partial x_{i}} = {- D + p (t) - {Kx}_{i}^{2})} x_{i} - {ch}_{i} a (t) - c \sum_{j = 1}^{N} J_{ij} x_{j} & (2) \end{matrix}$

Here, H is a Hamiltonian of the following Formula (3).

$\begin{matrix} [Formula 3] \\ H = \sum_{i = 1}^{N} [\frac{D}{2} (x_{i}^{2} + y_{i}^{2}) - \frac{p (t)}{2} x_{i}^{2} + \frac{K}{4} x_{i}^{4} + {ch}_{i} x_{i} a (t) + \frac{c}{2} \sum_{j = 1}^{N} J_{ij} x_{i} x_{j}] & (3) \end{matrix}$

Values of the variables x_iand y_i(i=1, 2, . . . , N) can be repeatedly updated to obtain the spin s_i(i=1, 2, . . . , N) of the Ising model by calculating the time evolution of the simulated bifurcation algorithm. That is, when the time evolution is calculated, a value of the element of the first vector and a value of the element of the second vector are repeatedly updated. Here, each coefficient will be described assuming that the calculation of the time evolution is performed. However, in the simulated bifurcation algorithm, it is not always necessary to perform the calculation of the time evolution. For example, a solution of the simulated bifurcation algorithm may be obtained by a replica exchange method (parallel tempering) as will be described later.

In (2) and (3), a coefficient D corresponds to detuning. A coefficient p(t) corresponds to a pumping amplitude. When the time evolution of the simulated bifurcation algorithm is calculated, a value of the coefficient p(t) monotonically increases depending on the number of updates. An initial value of the coefficient p(t) may be set to 0. A coefficient K corresponds to a positive Kerr coefficient. As a coefficient c, a constant coefficient can be used. For example, a value of the coefficient c may be determined before execution of calculation according to the simulated bifurcation algorithm. For example, the coefficient c can be set to a value close to an inverse number of the maximum eigenvalue of the J⁽²⁾matrix. For example, a value of c=0.5D√(N/2n) can be used. Here, n is the number of edges of a graph related to the combinatorial optimization problem. In addition, a(t) is a coefficient that increases with p(t) at the time of calculating the time evolution. For example, √(p(t)/K) can be used as a(t). Note that the vector h_iof the local magnetic field in (3) and (4) can be omitted.

In the case of calculating the time evolution, a solution vector having the spin s_ias an element can be obtained by converting the variable x_iwhich is a positive value into +1 and the variable x_iwhich is a negative value into −1 in the first vector when the value of the coefficient p(t) exceeds a predetermined value. This solution vector corresponds to the solution of the Ising problem. Note that it may be determined whether to obtain a solution vector by executing the above-described conversion based on the number of updates of the first vector and the second vector.

In the case of calculating the time evolution of the simulated bifurcation algorithm, the above-described (2) can be converted into a discrete recurrence formula using the Symplectic Euler method to obtain the solution. The following (4) represents an example of the simulated bifurcation algorithm after conversion into the recurrence formula.

$\begin{matrix} [Formula 4] \\ x_{i} (t + Δ t) = x_{i} (t) + {Dy}_{i} (t) Δ t y_{i} (t + Δ t) = y_{i} (t) + [{- D + p (t + Δ t) - {Kx}_{i}^{2} (t + Δ t)} x_{i} (t + Δ t) - {ch}_{i} a (t + Δ t)] Δ t - c Δ t \sum_{j = 1}^{N} J_{i, j} x_{j} (t + Δ t) & (4) \end{matrix}$

Here, t is time, and Δt is a time step (time increment). Note that time t and a time step Δt are used to indicate the correspondence relationship with the differential equation in (4). However, the time t and the time step Δt are not necessarily included as explicit parameters when actually implementing the algorithm in software or hardware. For example, if the time step Δt is 1, the time step Δt can be removed from the algorithm at the time of implementation. In a case where the time t is not included as the explicit parameter when the algorithm is implemented, x_i(t+Δt) may be interpreted as an updated value of x_i(t) in (4). That is, “t” in the above-described (4) indicates a value of the variable before update, and “t+Δt” indicates a value of the variable after update.

In the case of calculating the time evolution of the simulated bifurcation algorithm, the value of the spin s_ican be obtained based on the sign of the variable x_iafter increasing the value of p(t) from the initial value (for example, 0) to a predetermined value. For example, if a signum function of sgn(x_i)=1 when x_i>0 and sgn(x_i)=−1 when x_i<0 is used, the value of the spin s_ican be obtained by converting the variable x_iwith the signum function when the value of p(t) increases to the predetermined value. As the signum function, for example, a function that satisfies sgn(x_i)=x_i/|x_i| when x_i≠0 and sgn(x_i)=1 or −1 when x_i=0 can be used. A timing of obtaining the solution (for example, the spin s_iof the Ising model) of the combinatorial optimization problem is not particularly limited. For example, the solution (solution vector) of the combinatorial optimization problem may be obtained when the number of updates of the first vector and the second vector or the value of the first coefficient p becomes larger than a threshold.

The flowchart of FIG. 6 illustrates an example of processing in the case where the solution of the simulated bifurcation algorithm is calculated by the time evolution. Hereinafter, the processing will be described with reference to FIG. 6.

First, the calculation server acquires the matrix J_ijand the vector h_icorresponding to a problem from the management server 1 (step S101). Then, the calculation server initializes the coefficients p(t) and a(t) (step S102). For example, values of the coefficients p and a can be set to 0 in step S102, but the initial values of the coefficients p and a are not limited. Next, the calculation server initializes the first variable x_iand the second variable y_i(step S103). Here, the first variable x_iis an element of the first vector. In addition, the second variable y_iis an element of the second vector. In step S103, the calculation server may initialize x_iand y_iusing pseudorandom numbers, for example. However, a method for initializing x_iand y_iis not limited. In addition, the variables may be initialized at different timings, or at least one of the variables may be initialized a plurality of times.

Next, the calculation server updates the first vector by performing weighted addition on the element y_iof the second vector corresponding to the element x_iof the first vector (step S104). For example, Δt×D×y_ican be added to the variable x_iin step S104. Then, the calculation server updates the element y_iof the second vector (steps S105 and S106). For example, Δt×[(p−D−K×x_i×x_i)×x_i] can be added to the variable y_iin step S105. In step S106, −Δt×c×h_i×a−Δt×c×ΣJ_ij×x_jcan be further added to the variable yi.

Next, the calculation server updates the values of the coefficients p and a (step S107). For example, a constant value (Op) may be added to the coefficient p, and the coefficient a may be set to a positive square root of the updated coefficient p. However, this is merely an example of a method for updating the values of the coefficients p and a as will be described later. Then, the calculation server determines whether the number of updates of the first vector and the second vector is smaller than the threshold (step S108). When the number of updates is smaller than the threshold (YES in step S108), the calculation server executes the processes of steps S104 to S107 again. When the number of updates is equal to or larger than the threshold (NO in step S108), the spin s_i, which is the element of the solution vector, is obtained based on the element x_iof the first vector (step S109). In step S109, the solution vector can be obtained, for example, in the first vector by converting the variable x_iwhich is the positive value into +1 and the variable x_iwhich is the negative value into −1.

Note that when the number of updates is smaller than the threshold in the determination in step S108 (YES in step S108), a value of the Hamiltonian may be calculated based on the first vector, and the first vector and the value of the Hamiltonian may be stored. As a result, a user can select an approximate solution closest to the optimal solution from the plurality of first vectors.

Note that at least one of the processes illustrated in the flowchart of FIG. 6 may be executed in parallel. For example, the processes of steps S104 to S106 may be executed in parallel such that at least some of the N elements included in each of the first vector and the second vector are updated in parallel. For example, the processes may be performed in parallel using a plurality of calculation servers. The processes may be performed in parallel by a plurality of processors. However, an implementation for realizing parallelization of the processes and a mode of the parallelization of the processes are not limited.

The execution order of processes of updating the variables x_iand y_iillustrated in steps S105 to S106 described above is merely an example. Therefore, the processes of updating the variables x_iand y_imay be executed in a different order. For example, the order in which the process of updating the variable x_iand the process of updating the variable y_iare executed may be interchanged. In addition, the order of sub-processing included in the process of updating each variable is not limited. For example, the execution order of the addition process included in the process of updating the variable y_imay be different from the example of FIG. 6. The execution order and timing of processing as a precondition for executing the process of updating each variable are also not particularly limited. For example, the calculation process of the problem term may be executed in parallel with other processes including the process of updating the variable x_i. The same applies to the processing in each flowchart to be described hereinafter, in that the execution order and timings of the processes of updating the variables x_iand y_i, the sub-processing included in the process of updating each variable, and the calculation process of the problem term are not limited.

A graph of FIG. 7 illustrates an example of an update pattern of the coefficient p in the case of performing the calculation of time evolution. The vertical axis in FIG. 7 represents the value of the coefficient p. The horizontal axis in FIG. 7 corresponds to time or the number of updates. FIG. 7 illustrates that there are a plurality of update patterns from the initial value p₁to p_M, which is a value obtained after M updates. If a condition of p₁<p_Mis satisfied, p₁and p_Mcan be set to arbitrary real numbers. In this manner, it can be said that there are an infinite number of update patterns of the coefficient p.

Note that the coefficient p in the simulated bifurcation algorithm corresponds to an inverse number of a temperature T in simulated annealing (SA). Therefore, a transition probability of a particle can be adjusted depending on the coefficient p. The smaller the coefficient p, the higher the transition probability of the particle. On the other hand, the larger the coefficient p, the lower the transition probability of the particle. Therefore, it can be said that the update pattern of the coefficient p in the simulated bifurcation algorithm corresponds to a cooling schedule in the simulated annealing.

For example, as illustrated in FIG. 7, the coefficient p may be incremented at a constant rate depending on the number of updates. In addition, the coefficient p may be incremented by different increments depending on the update timing. Further, the value of the coefficient p may be updated at some update timings, and the update of the value of the coefficient p may be skipped at the remaining update timings.

In a process of obtaining the simulated bifurcation algorithm, it is desirable to search as wide a region as possible in a solution space including a plurality of local solutions to reach an optimal solution of a problem or an approximate solution close thereto. The update patterns of the coefficients p and a designate a region of the solution space to be searched and the granularity of search. Therefore, it is necessary to adjust the update patterns of the coefficients p and a such that a global optimal solution can be easily found. In general, however, it is not always easy to find an optimal update pattern for each problem to be solved. A case where a user will try different update patterns and empirically select an update pattern to use is also assumed.

Therefore, the solution of the simulated bifurcation algorithm may be obtained by the replica exchange method instead of the calculation of the time evolution.

FIG. 8 conceptually illustrates the coefficient p used in a case where calculation is performed using the replica exchange method. The vertical axis in FIG. 8 represents the value of the coefficient p. In the case where the calculation is performed using the replica exchange method, the information processing device does not necessarily calculate the time evolution of the coefficient p. Therefore, the horizontal axis of FIG. 8 represents a virtual axis prepared for comparison with FIG. 7.

When the replica exchange method is used as illustrated in FIG. 8, a first vector and a second vector are updated for each combination (p⁽¹⁾, p⁽²⁾, . . . , p^(M)) of a plurality of determined coefficients p. In the following description, the first vector and the second vector prepared for each coefficient p are referred to as replicas. Then, the solution search in different regions in the solution space can be implemented by performing exchange of the first vector and the second vector between the replicas. In the replica exchange method, it is possible to increase the probability of reaching the optimal solution of the problem or the approximate solution close thereto by repeating the exchange of the first vector and the second vector.

For example, the replica may be allocated to each of the processors of the calculation server. Further, the replica may be allocated to each of the calculation servers in the information processing system. In addition, the amount of calculation resources to be allocated may be different depending on the replica. Types or combinations of the calculation resources to be allocated may be different depending on the replica. However, a method for allocating the calculation resource to each replica is not limited.

When the replica exchange method is used, calculation of the first vector, the second vector, and functions thereof in the respective replicas can be performed in parallel (first-level parallelization). In addition, processes of updating the first vector and the second vector each including a plurality of elements can also be performed in parallel (second-level parallelization). The granularity at which the first-level parallelization is performed and the granularity at which the second-level parallelization is performed in the calculation resources may be different. The latter granularity may be set to be larger than the former granularity. For example, the first-level parallelization can be realized by the granularity of the calculation server or a virtual machine. In addition, the second-level parallelization can be realized by the granularity of the processor. When the replica exchange method is used, processes at a plurality of levels can be parallelized, and thus, calculation can be efficiently performed. Therefore, it is possible to obtain the optimal solution of the problem or the approximate solution close thereto in a relatively short time.

[Case where Combination of Coefficients p is Automatically Determined]

First, processing in a case where a combination of coefficients p to be used is automatically determined when performing calculation using the replica exchange method will be described.

For example, a case where a Hamiltonian of the following Formula (5) is used is assumed.

$\begin{matrix} [Formula 5] \\ H = \sum_{i = 1}^{N} [\frac{D}{2} (x_{i}^{2} + y_{i}^{2}) - \frac{p (t)}{2} x_{i}^{2} + \frac{K}{4} x_{i}^{4} + {ch}_{i} x_{i} a (t) + \frac{c}{2} \sum_{j = 1}^{N} J_{ij} x_{i} x_{j}] & (5) \end{matrix}$

Here, for example, √(p(t)) can be used as a(t).

In Formula (5), when scaling is performed with a(t)X_i=x_iand a(t)Y_i=y_i, the following Formula (6) is obtained.

$\begin{matrix} [Formula 6] \\ H = \sum_{i = 1}^{N} [\frac{D}{2} (X_{i}^{2} + Y_{i}^{2}) - \frac{p (t)}{2} X_{i}^{2} + \frac{K}{4} X_{i}^{4} + {ch}_{i} X_{i} + \frac{c}{2} \sum_{j = 1}^{N} J_{ij} X_{i} X_{j}] & (6) \end{matrix}$

When the replica exchange method is used, the Hamiltonian is calculated for each replica. Therefore, notation such as the following Formula (7) is used to indicate the corresponding replica m (m=1, 2, . . . , M) in the following description.

$\begin{matrix} [Formula 7] \\ H^{' (m)} = \sum_{i = 1}^{N} [\frac{D}{2} (X_{i}^{(m) 2} + Y_{i}^{(m) 2}) - \frac{p^{(m)} (t)}{2} X_{i}^{(m) 2} + \frac{K}{4} X_{i}^{(m) 4} + {ch}_{i} X_{i}^{(m)} + \frac{c}{2} \sum_{j = 1}^{N} J_{ij} X_{i}^{(m)} X_{j}^{(m)}] & (7) \end{matrix}$

In order to guarantee convergence to a statistical dynamic equilibrium distribution, a probability W (m₁, m₁+1) of transition from a replica m₁to a replica m₁+1 and a probability W (m₁+1, m₁) of transition from the replica m₁+1 to the replica m₁are subjected to a detailed balance condition of the following Formula (8). A coefficient β corresponds to an inverse temperature in general simulated annealing. Since the combination of coefficients p (p⁽¹⁾, p⁽²⁾, . . . , p^(M)) is allocated to each replica of the replica exchange method, a finite positive constant can be used as the coefficient β.

$\begin{matrix} [Formula 8] \\ W (m_{1}, m_{1} + 1) \exp (- β H^{(m_{1})}, (X_{i}^{(m_{1})}, Y_{i}^{(m_{1})})) \exp (- β H^{(m_{1} + 1)}, (X_{i}^{(m_{1} + 1)}, Y_{i}^{(m_{1} + 1)})) = W (m_{1} + 1, m_{1}) \exp (- β H^{(m_{1} + 1)}, (X_{i}^{(m_{1})}, Y_{i}^{(m_{1})})) \exp (- β H^{(m_{1})}, (X_{i}^{(m_{1} + 1)}, Y_{i}^{(m_{1} + 1)})) & (8) \end{matrix}$

In addition, the exchange probability between the replica m₁and the replica m₁+1 is expressed by the following Formula (9).

$\begin{matrix} [Formula 9] \\ \frac{W (m_{1}, m_{1} + 1)}{W (m_{1} + 1, m_{1})} = \exp (- β (p^{(m_{1})} - p^{(m_{1} + 1)}) \sum_{i = 1}^{N} [\frac{X_{i}^{{(m_{1})}^{2}} - X_{i}^{{(m_{1} + 1)}^{2}}}{2} - \frac{X_{i}^{{(m_{1})}^{4}} - X_{i}^{{(m_{1} + 1)}^{4}}}{4}]) & (9) \end{matrix}$

For example, the data exchange circuit 51 can compare a value of Formula (9) with a pseudo random number in the range of 0 or more and less than 1, and make a condition determination based on a magnitude relation between the former value and the latter value. When the latter value is larger than the former value, the condition determination is positive, and the data exchange circuit 51 can exchange the first vector and the second vector between the replica m₁and the replica m₁+1. On the other hand, when the latter value is larger than the former value, the condition determination is negative, and the process of exchanging the first vector and the second vector between the replica m₁and the replica m₁+1 can be skipped.

For example, each replica may be allocated to each arithmetic circuit. In this case, the data exchange circuit may be configured to execute a condition determination for each pair of the arithmetic circuits, and execute at least one of exchange of the first vector and the second vector used for calculation between the arithmetic circuits or exchange of the first coefficients when the condition determination is positive. In addition, the data exchange circuit may be configured to execute the condition determination based on the first vectors (a plurality of first variables), the second vectors (a plurality of second variables), and the first coefficients in the arithmetic circuits included in the pair. For example, the condition determination may be performed using Formula (9), or the condition determination may be performed using another formula.

Each replica may be allocated to each calculator. In this case, the second calculator of the information processing system may be configured to execute a condition determination for each pair of the first calculators, and execute at least one of exchange of the first vector and the second vector used for calculation between the first calculators or exchange of the first coefficients when the condition determination is positive.

In the replica exchange method, it is preferable to set a value of a coefficient p^(m)in each replica such that the exchange probability is constant regardless of the difference in the value of the coefficient p in order to implement efficient calculation (convergence to an equilibrium state). In particular, it is known that a value of the exchange probability between replicas becomes constant if the value of each coefficient p^(m)is set such that the values of the coefficients p^(m)(m=1, 2, . . . , M) in a plurality of replicas form a geometric progression. Therefore, calculation may be performed using a combination (m=1, 2, . . . , M) of the coefficients p^(m)forming the geometric progression as the coefficients p^(m)in the respective replicas. That is, the first coefficients (for example, the coefficient p) used in the respective arithmetic circuits may have the relation of the geometric progression. However, a method for determining the value of the coefficient p^(m)(m=1, 2, . . . , M) used in the calculation by the replica exchange method is not limited.

In each replica, a process of numerically solving the following simultaneous ordinary differential equation of (10) is executed for each of N two variables X^(m), and Y^(m)_i(i=1, 2, . . . , N). Each of the N variables X^(m)_icorresponds to the spin s_iof the Ising model. On the other hand, each of the N variables Y^(m)_icorresponds to the momentum. It is assumed that both the variables X^(m)_iand Y^(m)_iare continuous variables. Hereinafter, a vector having the variable X^(m)_i(i=1, 2, . . . , N) as an element is referred to as a first vector, and a vector having the variable Y^(m)_i(i=1, 2, . . . , N) as an element is referred to as a second vector.

$\begin{matrix} [Formula 10] \\ \frac{\partial Y_{i}^{(m)}}{\partial t} = - \frac{\partial H^{' (m)}}{\partial X_{i}^{(m)}} = - [(D - p^{(m)} + {KX}_{i}^{{(m)}^{2}}) X_{i}^{(m)} + {ch}_{i} + c \sum_{j = 1}^{N} J_{ij} X_{j}^{(m)}] \frac{\partial X_{i}^{(m)}}{\partial t} = \frac{\partial H^{' (m)}}{\partial Y_{i}^{(m)}} = {DY}_{i}^{(m)} & (10) \end{matrix}$

The above-described (10) can be converted into a discrete recurrence formula using the Symplectic Euler method to perform calculation of the simulated bifurcation algorithm by the replica exchange method. The following (11) represents an example of the simulated bifurcation algorithm after being converted into the recurrence formula.

$\begin{matrix} [Formula 11] \\ X_{i}^{(m 1)} (t + Δ t) = X_{i}^{(m 1)} (t) + {DY}_{i}^{(m 1)} (t) Δ t Y_{i}^{(m 1)} (t + Δ t) = Y_{i}^{(m 1)} (t) + [{- D + p^{(m 1)} (t + Δ t) - {KX}_{i}^{(m 1)} (t + Δ t) X_{i}^{(m 1)} (t + Δ t)} X_{i}^{(m 1)} (t + Δ t)] Δ t - {ch}_{i} a (t + Δ t) Δ t - c Δ t \sum_{j = 1}^{N} J_{i, j} X_{j}^{(m 1)} (t + Δ t) & (11) \end{matrix}$

Note that the following term (12) in (11) is derived from the Ising energy. Since a format of this term is determined depending on a problem to be solved, the term is referred to as a problem term.

$\begin{matrix} [Formula 12] \\ - {ch}_{i} a (t + Δ t) Δ t - c Δ t \sum_{j = 1}^{N} J_{i, j} X_{j}^{(m 1)} (t + Δ t) & (12) \end{matrix}$

A problem term different from (12) may be used in the calculation as will be described later.

When the simulated bifurcation algorithm is calculated by the replica exchange method as described above, the parallelism of processes can be enhanced, and the optimal solution of the problem or the approximate solution close thereto can be obtained in a relatively short time. In particular, the use of the above-described method in which the combination of coefficients p is automatically determined facilitates implementation and execution of the simulated bifurcation algorithm.

A flowchart of FIG. 9 illustrates an example of processing in the case where the calculation of the simulated bifurcation algorithm is performed using the replica exchange method. Hereinafter, the processing will be described with reference to FIG. 9.

First, the calculation server obtains the matrix J_ijand the vector h_icorresponding to the problem, and initializes a counter t and coefficients p^(m)and a^(m)(step S111). For example, in step S111, the values of the coefficients p^(m)can be set so as to form a geometric progression. The counter t is set to 0, for example. Then, the calculation server initializes the element X^(m)_iof the first vector and the element Y^(m)_iof the second vector, and substitutes 1 for m₁(step S112). Here, m₁is a parameter used to designate the replica m. For example, in step S112, the initial values of the element X^(m)_iof the first vector and the element Y^(m)_iof the second vector can be set using pseudo random numbers. Next, the calculation server updates an element X^(m1)_iof the first vector based on the element Y^(m)_iof the corresponding second vector (step S113). For example, in step S113, Δt×D×Y^(m1)_ican be added to the element X^(m1)_iof the first vector. Then, the calculation server updates an element Y^(m1)_iof the second vector (step S114). For example, in step S114, Δt×[(p−D−K×X^(m1)_i×X^(m1)_i)×X^(m1)_i] can be added to the element Y^(m1)_iof the second vector. In step S114, −Δt×c×h_i×a−Δt×c××X^(m1)_ican be further added to the variable Y^(m1)_i.

Next, the calculation server increments m1 and determines whether m1≤M is satisfied (step S115). When the determination in step S115 is positive (YES in step S115), the processes in steps S113 to S115 are executed again. When the determination in step S115 is negative (NO in step S115), the calculation server selects random m1 from among 0 to M every k times, and exchanges each value of X^(m1), and X^(m1+1)_iand each value of Y^(m1)_iand Y^(m1+1)_iif a transition condition is satisfied (step S116). For example, the calculation server can generate a pseudo random number r of 0 or more and less than 1, and compare a value of r with the value of the above formula (9). When the value of the above formula (9) is larger than the value of the pseudo random number r, it can be determined that the transition condition is satisfied. On the other hand, when the value of the pseudo random number r is equal to or larger than the value of the above formula (9), it can be determined that the transition condition is not satisfied. Then, the calculation server updates the counter t (step S117). For example, in step S117, Δt can be added to t.

Next, it is determined whether the counter t is smaller than T (step S118). If the determination in step S118 is positive (YES in step S118), the calculation server executes the processes in steps S113 to S117 again. If the determination in step S118 is negative (NO in step S118), the calculation server obtains the spin s_i, which is the element of the solution vector, based on the value of the element (variable) X^(m)_iof the first vector (step S119). For example, in step S119, the variable X^(m)_iwhich is a positive value may be converted into +1, and the variable X^(m)_iwhich is a negative value may be converted into −1. Note that, instead, whether the number of executions of the process in step S116 has exceeded a threshold may be determined in step S118.

As the variable X^(m)_iin step S119, a representative value of the variable X^(m1)_iselected from any replica of 1 to M can be used. Here, the representative value may be the variable X^(m1)_iselected from any replica of 1 to M, or may be an average value of the variables X^(m1)_iselected from at least one replica of 1 to M.

In the above-described flowchart of FIG. 9, processes of updating the elements of the first vector and the elements of the second vector (i=1, 2, . . . N) may be executed in parallel in at least some of steps S112 to S114 and S116 (second-level parallelization). In addition, at least some of the processes related to the respective replicas may be parallelized (first-level parallelization). For example, the processes of steps S112 to S114 related to a replica different for each of the calculation server, the virtual machine, or the processor may be executed. In this case, the process in step S116 may be executed by a program operating on any calculation server, or may be executed by the control unit 13 of the management server 1 described above. In addition, the process in step S119 may be executed by another server such as the management server 1.

Here, the information processing method, the storage medium, and the program according to the present embodiment will be described.

The information processing method may use a plurality of first calculators and a second calculator. For example, each of the plurality of first calculators is configured to repeatedly update a first vector having a first variable as an element and a second vector having a second variable corresponding to the first variable as an element. For example, the second calculator is configured to execute data exchange between the first calculators. In this case, the information processing method may include: a step of updating, by each of the first calculators, the first variable based on the corresponding second variable; a step of weighting, by each of the first calculators, the first variable with a first coefficient and adding the weighted first variable to the corresponding second variable; a step of calculating, by each of the first calculators, a problem term using the plurality of first variables; a step of adding, by each of the first calculators, the problem term to the second variable; and a step of executing, by the second calculator, at least one of exchange of the first vector and the second vector or exchange of the first coefficients between the first calculators.

In addition, the information processing method may use a plurality of first processors and second processors. For example, each of the plurality of first processors is configured to repeatedly update a first vector having a first variable as an element and a second vector having a second variable corresponding to the first variable as an element. For example, the second processor is configured to execute data exchange between the first processors. In this case, the information processing method may include: a step of updating, by each of the first processors, the first variable based on the corresponding second variable; a step of weighting, by each of the first processors, the first variable with a first coefficient and adding the weighted first variable to the corresponding second variable; a step of calculating, by each of the first processors, a problem term using the plurality of first variables; a step of adding, by each of the first processors, the problem term to the second variable; and a step of executing, by the second processor, at least one of exchange of the first vector and the second vector or exchange of the first coefficients between the first processors. The program according to the present embodiment may include steps similar to those of the information processing method. Further, a non-transitory computer-readable storage medium according to the present embodiment may store the program.

[Case where Value of Coefficient p Varies]

Heretofore, the description has been made assuming the case where the values of the coefficients p^(m)and a^(m)in each replica are determined in the step (step S111 in FIG. 9) before loop processing, and the fixed values of the coefficients p^(m)and a^(m)are used in the subsequent steps including the loop processing. However, the values of the coefficients p^(m)and a^(m)are not necessarily fixed in the case where the calculation of the simulated bifurcation algorithm is performed using the replica exchange method.

For example, the information processing device may vary the value of the coefficient p^(m)based on the following Formula (13). Formula (13) is an example of the coefficient p^(m)that can vary depending on a replica exchange timing or an exchange determination timing.

$\begin{matrix} [Formula 13] \\ p^{(m)} (t) = \frac{(p_{0}^{(m + 1)} + p_{0}^{(m)})}{2} + \frac{(p_{0}^{(m + 1)} - p_{0}^{(m)})}{2} \cos (π (\frac{t}{T_{0}} + m)) & (13) \end{matrix}$

In Formula (13), p^(m)(t) indicates that the value of the coefficient p^(m)depends on the parameter t.

Condition (a)

First, the value of p^(m)in a case where the value of the parameter t is 0 or an even multiple of T₀will be described.

In a case where the replica number m is an even number, a value of cos ( . . . ) of Formula (13) becomes +1, and p^(m)=p₀^(m+1)is established. On the other hand, in a case where the replica number m is an odd number, the value of cos ( . . . ) of Formula (13) becomes −1, and p^(m)=p₀^(m)is established.

Condition (b)

Next, the value of p^(m)in a case where the value of the parameter t is an odd multiple of T₀will be described.

In a case where the replica number m is an even number, the value of cos ( . . . ) of Formula (13) becomes −1, and p^(m)=p₀^(m)is established. On the other hand, in a case where the replica number m is an odd number, the value of cos ( . . . ) of Formula (13) becomes +1, and p^(m)=p₀^(m+1)is established.

Condition (c)

Finally, the value of p^(m)in a case where none of the above-described conditions (a) and (b) is satisfied for the value of the parameter t will be described.

In this case, cos ( . . . ) in Formula (13) takes a value in a range of (−1, +1), and thus, p^(m)takes a value in which each of p₀^(m+1)and p₀^(m)is weighted, such as ap₀^(m+1)+βp₀^(m). Due to the periodicity of cos ( . . . ), p^(m)having different values are used between the replica having the even number m and the replica having the odd number m. In this manner, each of the arithmetic circuits may periodically change the value of the first coefficient depending on the number of updates of the first vector and the second vector.

As the coefficient p^(m)defined by the above-described Formula (13) is used, the following Formula (14) is established at the timing when the determination on exchange between replicas is performed.

$\begin{matrix} [Formula 14] \\ \frac{W (m_{1}, m_{1} + 1)}{W (m_{1} + 1, m_{1})} = 1 & (14) \end{matrix}$

In a case where the relation of Formula (14) is established at the exchange determination timing between the replicas, the first vector and the second vector are exchanged for the replica m₁and the replica m₁+1 at each exchange determination timing. Since it is unnecessary to perform the calculation illustrated in Formula (9) at each exchange determination timing, the consumption of calculation resources can be suppressed. In addition, the process of determining whether to perform exchange can be skipped at the exchange determination timing (step S116) in the flowchart of FIG. 9, the exchange determination timing of FIG. 9 is replaced with an exchange timing.

In the case of using the coefficient p^(m)expressed in Formula (13), the user needs to determine the values of p₀^(m+1)and p₀^(m)in advance. However, it is unnecessary to determine the update pattern of the coefficient p^(m)and the definition of the coefficient a^(m)as in the case of performing the time evolution calculation, and thus, it is possible to reduce the labor and effort required for execution of the simulated bifurcation algorithm. In addition, it is possible to efficiently exchange the first vector and the second vector between replicas and to obtain the optimal solution of the problem or the approximate solution close thereto in a relatively short time by using the coefficient p^(m)expressed in Formula (13).

Note that the definition of the coefficient p^(m)expressed in Formula (13) is merely an example. Therefore, the use of the coefficient p^(m)having a different definition is not hindered.

A flowchart of FIG. 10 illustrates an example of processing in a case where calculation is performed using the varying coefficient p^(m). Hereinafter, the processing will be described with reference to FIG. 10.

First, the calculation server obtains the matrix J_ijand the vector h_icorresponding to the problem, and initializes the counter t and the coefficients p^(m)and a^(m)(step S121). The counter t is set to 0, for example. In a case of following the definition of Formula (13), p^(m)=p₀^(m+1)is set for the replica with the even number m. In addition, p^(m)=p₀^(m)is set for the replica with the odd number m.

Processes in steps S122 to S125 correspond to the processes in steps S112 to S115 in FIG. 9, respectively, and thus, a detailed description thereof is omitted. If the determination in step S125 is positive (YES in step S125), the calculation server selects a random m1 from among 0 to M every k times, and exchanges each value of X^(m1)_iand X^(m1+1)_iand each value of Y^(m1)_iand Y^(m1+1)_i(step S126). In a case where the relation of Formula (14) is established, it is unnecessary to determine whether to perform exchange between replicas, and thus, step S126 corresponds to a timing for exchange of the first vector and the second vector between replicas. Then, the calculation server updates the counter t (step S127). For example, in step S127, Δt can be added to t.

Next, it is determined whether the counter t is smaller than T (step S128). If the determination in step S128 is positive (YES in step S128), the calculation server executes the processes in steps S123 to S127 again. If the determination in step S128 is negative (NO in step S128), the calculation server obtains the spin s_i, which is the element of the solution vector, based on the value of the element (variable) X^(m)_iof the first vector (step S129). For example, in step S129, the variable X^(m)_iwhich is a positive value may be converted into +1, and the variable X^(m)_iwhich is a negative value may be converted into −1. A method for selecting and calculating a representative value that can be used as the variable X^(m)_iin step S129 is not limited, which is similar to step S119 in FIG. 9. In addition, the process of step S129 may be executed by a server (calculator) other than the calculation server.

In the above-described flowchart of FIG. 10, processes of updating the elements of the first vector and the elements of the second vector (i=1, 2, . . . N) may be executed in parallel in at least some of steps S122 to S124 and S126 (second-level parallelization). In addition, at least some of the processes related to the respective replicas may be parallelized (first-level parallelization). For example, the processes of steps S122 to S124 related to a replica different for each of the calculation server, the virtual machine, or the processor may be executed. In this case, the process in step S126 may be executed by a program operating on any calculation server, or may be executed by the control unit 13 of the management server 1 described above.

As the information processing device executes the processing of the flowchart in FIG. 10 using the coefficient p^(m)expressed in Formula (13), it is possible to increase the parallelism of processes while suppressing the amount of calculation and to obtain the optimal solution of the problem or the approximate solution close thereto in a relatively short time.

[Parallelization of Replica Exchange Processes]

In the calculation of the simulated bifurcation algorithm using the replica exchange method, the processes of the replicas m₁and m₁+1 to be exchanged need to be synchronized at least at the replica exchange timing in order to perform the process of exchanging the first vector and the second vector between replicas. Therefore, the replica exchange processes may be a bottleneck in speeding up the simulated bifurcation algorithm by the replica exchange method depending on implementation. Therefore, it is possible to take a measure for executing the replica exchange processes in parallel and easily synchronizing the processes between the replicas to be exchanged.

For example, it is possible to alternately execute a process (first exchange process) of exchanging the first vector and the second vector between the even-numbered replica m₁and the replica m₁+1 and a process (second exchange process) of exchanging the first vector and the second vector between the odd-numbered replica m₁and the replica m₁+1. As a result, it is possible to parallelize the exchange processes of the first vector and the second vector between the replicas.

The flowchart of FIG. 11 illustrates an example of processing in the case where replica exchange is performed in parallel. Hereinafter, the processing will be described with reference to FIG. 11.

First, the calculation server obtains the matrix J_ijand the vector h_icorresponding to the problem, and initializes the counter t and the coefficients p^(m)and a^(m)(step S131). For example, a method for initializing the coefficient p^(m)in step S131 may be similar to that in step S121 described above. In addition, a method for initializing the coefficient p^(m)in step S131 may be similar to that in step S111 described above. In addition, in step S131, a combination of coefficients p^(m)obtained on the experience side may be used. The counter t is set to 0, for example.

Then, the calculation server initializes the element X^(m)_iof the first vector and the element Y^(m)_iof the second vector, and substitutes 1 for m₁(step S132). Here, m₁is a parameter used to designate the replica m. For example, in step S132, the initial values of the element X^(m)_iof the first vector and the element Y^(m)_iof the second vector can be set using pseudo random numbers. Next, the calculation server updates an element X^(m1)_iof the first vector based on the element Y^(m)_iof the corresponding second vector (step S133). For example, in step S133, Δt×D×Y^(m1)_ican be added to the element X^(m1)_iof the first vector. Then, the calculation server updates an element Y^(m1)_iof the second vector (step S134). For example, in step S134, Δt×[(p−D−K×X^(m1)_i×X^(m1)_i)×X^(m1)_i] can be added to the element Y^(m1), of the second vector. In step S134, −Δt×c×h_i×a−Δt×c××X^(m1)_ican be further added to the element Y^(m1)_iof the second vector.

Next, the calculation server alternately selects m₁which is an even number or m₁which is an odd number every k times to determine the transition condition, and exchanges each value of X^(m1)_iand X^(m1+1)_iand each value of Y^(m1), and Y^(m1+1)_iif the transition condition is satisfied (step S135). For example, the calculation server can generate a pseudo random number r of 0 or more and less than 1, and compare a value of r with the value of the above formula (9). When the value of the above formula (9) is larger than the value of the pseudo random number r, it can be determined that the transition condition is satisfied. On the other hand, when the value of the pseudo random number r is equal to or larger than the value of the above formula (9), it can be determined that the transition condition is not satisfied.

Then, the calculation server increments m1 and determines whether m1 M is satisfied (step S136). When the determination in step S136 is positive (YES in step S136), the processes in steps S133 to S135 are executed again. When the determination in step S136 is negative (NO in step S136), the calculation server updates the counter t (step S137). For example, in step S137, Δt can be added to t.

Next, it is determined whether the counter t is smaller than T (step S138). If the determination in step S138 is positive (YES in step S138), the calculation server executes the processes in steps S133 to S137 again. If the determination in step S138 is negative (NO in step S138), the calculation server obtains the spin s_i, which is the element of the solution vector, based on the value of the element (variable) X^(m)_iof the first vector (step S139). For example, in step S139, the variable X^(m)_iwhich is a positive value may be converted into +1, and the variable X^(m)_iwhich is a negative value may be converted into −1. A method for selecting and calculating a representative value that can be used as the variable X^(m)_iin step S139 is not limited, which is similar to step S119 in FIG. 9. In addition, the process in step S139 may be executed by a server (information processing device) other than the calculation server.

In the above-described flowchart of FIG. 11, processes of updating the elements of the first vector and the elements of the second vector (i=1, 2, . . . N) may be executed in parallel in at least some of steps S132 to S135 (second-level parallelization). In addition, at least some of the processes related to the respective replicas may be parallelized (first-level parallelization). For example, the processes of steps S132 to S135 related to a replica different for each of the calculation server, the virtual machine, or the processor may be executed. In this case, the process in step S135 may be executed by a program operating on any calculation server, or may be executed by the control unit 13 of the management server 1 described above.

FIG. 12 conceptually illustrates a processing flow in the replica exchange method. In FIG. 12, each square corresponds to a replica with a number of 1 to M.

Step S10 in FIG. 12 illustrates a case where the process of updating the first vector and the second vector is executed for each replica. Then, in step S11, it is determined whether a replica with an even number satisfies the transition condition, and the first vector and the second vector are exchanged when the transition condition is satisfied. In addition, in step S12, it is determined whether a replica with an even number satisfies the transition condition, and the first vector and the second vector are exchanged when the transition condition is satisfied. As described above, the transition condition can be determined by generating the pseudo random number r of 0 or more and less than 1 and comparing the value of r with the value of the above formula (9).

The data exchange circuit may be configured to select a pair of arithmetic circuits that differs depending on a timing, and perform at least one of exchange of the first vector and the second vector or exchange of the first coefficients. In addition, the data exchange circuit may be configured to alternately select a pair of a first combination and a pair of a second combination in the plurality of arithmetic circuits and execute at least one of exchange of the first vector and the second vector or exchange of the first coefficients. For example, it is assumed that there are replicas with numbers of 1 to 6. It is assumed that each of the replicas with the numbers of 1 to 6 is allocated to any of the arithmetic circuits. In this case, the first combination may include a pair of replicas with the numbers of 1 and 2, a pair of replicas with the numbers of 3 and 4, and a pair of replicas with the numbers of 5 and 6. On the other hand, the second combination may include a pair of replicas with the numbers of 2 and 3, a pair of replicas with the numbers of 4 and 5, and a pair of replicas with the numbers of 6 and 1.

As illustrated in FIG. 12, the process in step S11 and the process in step S12 can be alternately performed depending on the timing. As a result, the replica exchange processes are executed in parallel, and the optimal solution of the problem or the approximate solution close thereto can be obtained in a still shorter time.

FIG. 13 illustrates an example of a formula related to the replica exchange method. Formula (E1) in FIG. 13 illustrates an example of a Hamiltonian used in each replica. Formula (E2) illustrates an example of a canonical ensemble. Formula (E3) illustrates a general detailed balance equation. Formula (E4) illustrates a detailed balance equation in the canonical ensemble. Formula (E5) is obtained by rearranging terms of the detailed balance equation in the canonical ensemble. Formula (E6) represents a transition probability between the replica m₁and a replica m₂.

A graph in FIG. 14 is a histogram illustrating an example of the number of trials required until the optimal solution is obtained in each of the case where calculation of the simulated bifurcation algorithm is performed using the replica exchange method and the case where the calculation of the simulated bifurcation algorithm is performed by the time evolution. FIG. 14 illustrates data in a case where a Hamilton path problem of 48 nodes and 96 edges has been solved using the simulated bifurcation algorithm. The vertical axis in FIG. 14 represents the number of times of obtaining the optimal solution. In the calculation using the replica exchange method, the calculation is performed using 10 replicas. Therefore, the number of trials in one calculation by the replica exchange method is counted as 10 times. Note that whether an optimal solution is obtained is determined every 200 cycles in the case of the calculation by the time evolution. On the other hand, whether an optimal solution is obtained is determined every 50 cycles in the case of the calculation by the replica exchange method. Referring to the result of FIG. 14, when the replica exchange method is used, the optimal solution is obtained with a smaller number of trials as compared with the case of calculating the time evolution.

As the calculation is performed using the replica exchange method, redundancy can be imparted to the information processing system. For example, even if a calculator executing calculation of the first vector and the second vector in any of the replicas abnormally stops, the calculation process in the information processing system can be continued as long as a calculator executing the calculation of the first vector and the second vector in another replica continues to operate. In this case, data of the replica used in the calculator that has abnormally stopped may be restored based on data in the other replica. Even when the processor executing the calculation of the first vector and the second vector in any of the replicas abnormally stops, the calculation can be continued in another processor.

[Calculation Including Term of Many-Body Interaction]

It is also possible to solve a combinatorial optimization problem having a third-order or higher-order objective function by using the simulated bifurcation algorithm. A problem of obtaining a combination of variables that minimizes the third-order or higher-order objective function, which has a binary variable as a variable, is called a higher-order binary optimization (HOBO) problem. In a case of handling the HOBO problem, the following Formula (15) can be used as an energy formula in an Ising model extended to the higher order.

$\begin{matrix} [Formula 15] \\ E_{HOBO} = \sum_{i = 1}^{N} J_{i}^{(1)} s_{i} + \frac{1}{2} \sum_{i = 1}^{N} \sum_{j = 1}^{N} J_{i, j}^{(2)} s_{i} s_{j} + \frac{1}{3!} \sum_{i = 1}^{N} \sum_{j = 1}^{N} \sum_{k = 1}^{N} J_{i, j, k}^{(3)} s_{i} s_{j} s_{k} + \dots & (15) \end{matrix}$

Here, J⁽ⁿ⁾is an n-rank tensor, and is obtained by generalizing the matrix J of the local magnetic field h_iand a coupling coefficient of Formula (1). For example, a tensor J⁽¹⁾corresponds to a vector of the local magnetic field h_i. In the n-rank tensors J⁽ⁿ⁾, when a plurality of indices have the same value, values of elements are 0. Although terms up to a third-order term are illustrated in Formula (15), but a higher-order term can also be defined in the same manner as in Formula (15). Formula (15) corresponds to the energy of the Ising model including a many-body interaction.

Note that both QUBO and HOBO can be said to be a type of polynomial unconstrained binary optimization (PUBO). That is, a combinatorial optimization problem having a second-order objective function in PUBO is QUBO. In addition, it can be said that a combinatorial optimization problem having a third-order or higher-order objective function in PUBO is HOBO.

In a case where the HOBO problem is solved using the simulated bifurcation algorithm, the Hamiltonian H of Formula (5) described above may be replaced with the Hamiltonian H of the following Formula (16).

$\begin{matrix} [Formula 16] \\ H = \sum_{i = 1}^{N} [\frac{D}{2} (x_{i}^{2} + y_{i}^{2}) - (\frac{p (t)}{2}) x_{i}^{2} + \frac{K}{4} x_{i}^{4} + c (J_{i}^{(1)} x_{i} α (t) + \frac{1}{2} \sum_{j = 1}^{N} J_{i . j}^{(2)} x_{i} x_{j} + \frac{1}{3!} \sum_{j = 1}^{N} \sum_{k = 1}^{N} J_{i, j, k}^{(3)} x_{i} x_{j} x_{k} + \dots)] H_{Ising} = \sum_{i = 1}^{N} [J_{i}^{(1)} x_{i} α (t) + \frac{1}{2} \sum_{j = t}^{N} J_{i, j}^{(2)} x_{i} x_{j} + \frac{1}{3!} \sum_{j = 1}^{N} \sum_{k = 1}^{N} J_{i, j, k}^{(3)} x_{i} x_{j} x_{k} + \dots] & (16) \end{matrix}$

In addition, a problem term expressed by the following Formula (17) is derived from Formula (16).

$\begin{matrix} [Formula 17] \\ z_{i} = \frac{\partial H_{Ising}}{\partial_{X_{i}}} = J_{i}^{(1)} α (t) + \sum_{j = 1}^{N} J_{i, j}^{(2)} x_{j} + \sum_{j = 1}^{N} \sum_{k = 1}^{N} J_{i, j, k}^{(3)} x_{j} x_{k} + \dots & (17) \end{matrix}$

The problem term z_iof (17) takes a format in which the second expression of (16) is partially differentiated with respect to any variable x_i(element of the first vector). The partially differentiated variable x_idiffers depending on an index i. Here, the index i of the variable x_icorresponds to an index designating an element of the first vector and an element of the second vector.

In a case where calculation including the term of the many-body interaction is performed, the recurrence formula of (11) described above is replaced with the following recurrence formula of (18).

$\begin{matrix} [Formula 18] \\ X_{i}^{(m 1)} (t + Δ t) = X_{i}^{(m 1)} (t) + {DY}_{i}^{(m 1)} (t) Δ t Y_{i}^{(m 1)} (t + Δ t) = Y_{i}^{(m 1)} (t) + [{- D + p^{(m 1)} (t + Δ t) - X_{i}^{(m 1)} (t + Δ t) X_{i}^{(m 1)} (t + Δ t)} X_{i}^{(m 1)} (t + Δ t)] Δ t - c Δ t (J_{i}^{(1)} α (t + Δ t) + \sum_{j - 1}^{N} J_{i . j}^{(2)} X^{(m 1)} (t + Δ t) + \sum_{j = 1}^{N} \sum_{k = 1}^{N} J_{i, j, k}^{(3)} X_{j}^{(m 1)} (t + Δ t) X_{k}^{(m 1)} (t + Δ t) + \dots) & (18) \end{matrix}$

(18) corresponds to a further generalized recurrence formula of (11).

The problem terms described above are merely examples of a problem term that can be used by the information processing device according to the present embodiment. Therefore, a format of the problem term used in the calculation may be different from these.

[Modified Example of Algorithm]

Here, a modified example of the simulated bifurcation algorithm will be described. For example, various modifications may be made to the above-described simulated bifurcation algorithm for the purpose of reducing an error or reducing a calculation time.

For example, in order to reduce an error in calculation, additional processing may be executed at the time of updating the element of the first vector. For example, when an absolute value of an element X⁽ⁿ⁾_iof the first vector becomes larger than 1 by the update, a value of the element X⁽ⁿ⁾_iof the first vector is replaced with sgn(X⁽ⁿ⁾_i). That is, when X⁽ⁿ⁾_i>1 is satisfied by the update, the value of the variable X⁽ⁿ⁾_iis set to 1. In addition, when X⁽ⁿ⁾_i<−1 is satisfied by the update, the value of the variable X⁽ⁿ⁾_iis set to −1. As a result, it is possible to approximate the spin s_iwith higher accuracy using the variable X⁽ⁿ⁾_i. Since such processing is included, the algorithm is equivalent to a physical model of an N particles having a wall at positions of x_i=±1. More generally speaking, an arithmetic circuit may be configured to set the first variable, which has a value smaller than a first value, to the first value, and set the first variable, which has a value larger than a second value, to the second value.

Further, when X⁽ⁿ⁾_i>1 is satisfied by the update, a variable Y⁽ⁿ⁾_icorresponding to the variable X⁽ⁿ⁾_imay be multiplied by the coefficient rf. For example, when the coefficient rf of −1<r≤0 is used, the above wall becomes a wall of the reflection coefficient rf. In particular, when the coefficient rf of rf=0 is used, the algorithm is equivalent to a physical model having walls where completely inelastic collisions occur at X⁽ⁿ⁾_i=±1. More generally speaking, the arithmetic circuit may be configured to update a second variable which corresponds to the first variable having the value smaller than the first value or a second variable which corresponds to the first variable larger than the second value, to a value obtained by multiplying the original second variable by a second coefficient. For example, the arithmetic circuit may be configured to update the second variable which corresponds to the first variable having the value smaller than −1 or the second variable which corresponds to the first variable having the value larger than 1, to the value obtained by multiplying the original second variable by the second coefficient. Here, the second coefficient corresponds to the above-described coefficient rf.

Note that the arithmetic circuit may set the value of the variable Y⁽ⁿ⁾_icorresponding to the variable X⁽ⁿ⁾_ito a pseudo random number when X⁽ⁿ⁾_i>1 is satisfied by the update. For example, a random number in the range of [−0.1, 0.1] can be used. That is, the arithmetic circuit may be configured to set a value of the second variable which corresponds to a first variable having the value smaller than the first value or a value of the second variable which corresponds to the first variable having the value larger than the second value, to the pseudo random number.

If the update process is executed so as to suppress |X⁽ⁿ⁾_i|>1 as described above, the value of X⁽ⁿ⁾_idoes not diverge even if a non-linear term K×X⁽ⁿ¹⁾_i(t+Δt)×X⁽ⁿ¹⁾_i(t+Δt) in (11) and (18) is removed. Therefore, it is possible to use an algorithm illustrated in (19) below.

$\begin{matrix} [Formula 19] \\ X_{i}^{(m 1)} (t + Δ t) = X_{i}^{(m 1)} (t) + {DY}_{i}^{(m 1)} (t) Δ t Y_{i}^{(m 1)} (t + Δ t) = Y_{i}^{(m 1)} (t) + [{- D + p^{(m 1)} (t + Δt)} X_{i}^{(m 1)} (t + Δ t)] Δ t - c Δ t (J_{i}^{(1)} α (t + Δ t) + \sum_{j - 1}^{N} J_{i . j}^{(2)} X^{(m 1)} (t + Δ t) + \sum_{j = 1}^{N} \sum_{k = 1}^{N} J_{i, j, k}^{(3)} X_{j}^{(m 1)} (t + Δ t) X_{k}^{(m 1)} (t + Δ t) + \dots) & (19) \end{matrix}$

In the algorithm of (19), a continuous variable X is used in the problem term instead of a discrete variable. Therefore, there is a possibility that an error from the discrete variable used in the original combinatorial optimization problem occurs. In order to reduce this error, a value sgn(X) obtained by converting the continuous variable X by a signum function can be used instead of the continuous variable X in the calculation of the problem term as in (20) below.

$\begin{matrix} [Formula 20] \\ X_{i}^{(m 1)} (t + Δ t) = X_{i}^{(m 1)} (t) + {DY}_{i}^{(m 1)} (t) Δ t Y_{i}^{(m 1)} (t + Δ t) = Y_{i}^{(m 1)} (t) + [{- D + p^{(m 1)} (t + Δ t)} X_{i}^{(m 1)} (t + Δ t)] Δ t - c Δ t (J_{i}^{(1)} α (t + Δ t) + \sum_{j - 1}^{N} J_{i . j}^{(2)} sgn (X_{j}^{(m 1)} (t + Δt)) + \sum_{j = 1}^{N} \sum_{k = 1}^{N} J_{i, j, k}^{(3)} sgn (X_{j}^{(m 1)} (t + Δ t)) sgn (X_{k}^{(m 1)} (t + Δt)) + \dots) & (20) \end{matrix}$

In (20), sgn(X) corresponds to the spin s.

In (20), the coefficient a of a term including the first-rank tensor in the problem term may be a constant (for example, a=1). In an algorithm of (20), the product of spins appearing in the problem term always takes any value of −1 or 1, and thus, it is possible to prevent the occurrence of an error due to the product operation when the HOMO problem having the higher-order objective function is handled. As in the algorithm of (20) described above, data calculated by the calculation server may further include a spin vector (s_i, s₂, . . . , s_N) having the variable s_i(i=1, 2, . . . , N) as an element. The spin vector can be obtained by converting each element of the first vector by a signum function

[Example of Parallelization of Variable Update Processes]

Hereinafter, an example of parallelization of variable update processes at the time of calculation of the simulated bifurcation algorithm will be described.

First, an example in which the simulated bifurcation algorithm is implemented in a PC cluster will be described. The PC cluster is a system that connects a plurality of computers and realizes calculation performance that is not obtainable by one computer. For example, the information processing system 100 illustrated in FIG. 1 includes a plurality of calculation servers and processors, and can be used as the PC cluster. For example, the parallel calculation can be executed even in a configuration in which memories are arranged to be distributed in a plurality of calculation servers as in the information processing system 100 by using a message passing interface (MPI) in the PC cluster. For example, the control program 14E of the management server 1, the calculation program 34B and the control program 34C of each of the calculation servers can be implemented using the MPI.

In a case where the number of processors used in the PC cluster is Q, it is possible to cause each of the processors to calculate L variables among the variables x_iincluded in the first vector (x₁, x₂, . . . , x_N). Similarly, it is possible to cause each of the processors to calculate L variables among the variables y_iincluded in the second vector (y₁, y₂, . . . , y_N). That is, processors #j (j=1, 2, . . . , Q) calculate variables {x_m|m=(j−1)L+1, (j−1)L+2, . . . , jL} and {y_m|m=(j−1)L+1, (j−1)L+2, . . . , jL}. In addition, a tensor J⁽ⁿ⁾expressed in the following (21), necessary for the calculation of {y_m|m=(j−1)L+1, (j−1)L+2, . . . , jL} by the processors #j, is stored in a storage area (for example, a register, a cache, a memory, or the like) accessible by the processors #j.

$\begin{matrix} [Formula 21] \\ {J_{m}^{(1)} \langle m = (i - 1) L + 1, \dots, iL;} {J_{m, j}^{(1)} \langle m = (i - 1) L + 1, \dots, iL; j = 1, \dots, N} {J_{m, j, k}^{(3)} \langle m = (i - 1) L + 1, \dots, iL; j = 1, \dots, N; k = 1, \dots, N}, \dots & (21) \end{matrix}$

Here, the case where each of the processors calculates the constant number of variables of each of the first vector and the second vector has been described. However, the number of variables of the first vector and the second vector to be calculated may be different depending on the processor. For example, in a case where there is a performance difference depending on a processor implemented in a calculation server, the number of variables to be calculated can be determined depending on the performance of the processor.

Values of all the components of the first vector (x₁, x₂, . . . , x_N) are required in order to update the value of the variable y_i. The conversion into a binary variable can be performed, for example, by using the signum function sgn( ). Therefore, the values of all the components of the first vector (x₁, x₂, . . . , x_N) can be shared by the Q processors using the Allgather function. Although it is necessary to share the values between the processors regarding the first vector (x₁, x₂, . . . , x_N), but it is not essential to share values between the processors regarding the second vector (y₁, y₂, . . . , y_N) and the tensor JO). The sharing of data between the processors can be realized, for example, by using inter-processor communication or by storing data in a shared memory.

The processor #j calculates a value of the problem term {z_m|m=(j−1)L+1, (j−1)L+2, . . . , jL}. Then, the processor #j updates the variable {y_m|m=(j−1)L+1, (j−1)L+2, . . . , jL} based on the calculated value of the problem term {{z_m|m=(j−1)L+1, (j−1)L+2, . . . , jL}.

As illustrated in the above-described respective formulas, the calculation of the vector (z₁, z₂, . . . , z_N) of the problem term requires the product-sum operation including the calculation of the product of the tensor J(n) and the vector (x₁, x₂, . . . , x_N). The product-sum operation is processing with the largest calculation amount in the above-described algorithm, and can be a bottleneck in improving the calculation speed. Therefore, the product-sum operation is distributed to Q=N/L processors and executed in parallel in the implementation of the PC cluster, so that the calculation time can be shortened.

FIG. 15 schematically illustrates an example of a multi-processor configuration. A plurality of calculation nodes in FIG. 15 correspond to, for example, the plurality of calculation servers of the information processing system 100. In addition, a high-speed link of FIG. 15 corresponds to, for example, the interconnection between the calculation servers formed by the cables 4a to 4c and the switch 5 of the information processing system 100. A shared memory in FIG. 15 corresponds to, for example, the shared memory 32. Processors in FIG. 15 correspond to, for example, the processors 33A to 33D of the respective calculation servers. Note that FIG. 15 illustrates the plurality of calculation nodes, but the use of a configuration of a single calculation node is not precluded.

FIG. 15 illustrates data arranged in each of components and data transferred between the components. In each of the processors, values of the variables x_iand y_iare calculated. In addition, the variable x_iis transferred between the processor and the shared memory. In the shared memory of each of the calculation nodes, for example, the first vector (x₁, x₂, . . . , x_N)), L variables of the second vector (y₁, y₂, . . . , y_N), and some of the tensors J⁽ⁿ⁾are stored. Then, for example, the first vector (x₁, x₂, . . . , x_N) is transferred in the high-speed link connecting the calculation nodes. In a case where the Allgather function is used, all elements of the first vector (x₁, x₂, . . . , x_N) are required in order to update the variable y_iin each of the processors.

When the replica exchange method is used, one or more calculation nodes may be allocated for calculation of the first vector and the second vector in the respective replicas. In this case, at the execution timing in the replica exchange process, the data of the first vector and the second vector is transferred between the calculation nodes via the high-speed link. For example, the control unit 13 of the management server 1 can control the calculation node. Note that the arrangement and transfer of data illustrated in FIG. 15 are merely examples. A data arrangement method, a transfer method, and a parallelization method in the PC cluster are not particularly limited.

In addition, the simulated bifurcation algorithm may be calculated using a graphics processing unit (GPU).

FIG. 16 schematically illustrates an example of a configuration using the GPU. FIG. 16 illustrates a plurality of GPUs connected to each other by a high-speed link. Each GPU is equipped with a plurality of cores capable of accessing a shared memory. In addition, the plurality of GPUs are connected via the high-speed link to form a GPU cluster in the configuration example of FIG. 16. For example, if the GPUs are mounted on the respective calculation servers in FIG. 1, the high-speed link corresponds to the interconnection between the calculation servers formed by the cables 4a to 4c and the switch 5. Note that the plurality of GPUs are used in the configuration example of FIG. 16, but parallel calculation can be executed even in a case where one GPU is used. That is, each of the GPUs of FIG. 16 may perform the calculation corresponding to each of the calculation nodes of FIG. 15. That is, the processor of the information processing device (calculation server) may be a core of a graphics processing unit (GPU).

In the GPU, the variables x_iand y_iand the tensor J⁽ⁿ⁾are defined as device variables. The GPUs can calculate the product of the tensor J⁽ⁿ⁾necessary to update the variable y_iand the first vector (x₁, x₂, . . . , x_N) in parallel by a matrix vector product function. Note that the product of the tensor and the vector can be obtained by repeatedly executing the matrix vector product operation. In addition, it is possible to parallelize the processes by causing each thread to execute a process of updating the i-th element (x_i, y_i) for a portion other than the product-sum operation in the calculation of the first vector (x₁, x₂, . . . , x_N) and the second vector (y₁, y₂, . . . , y_N).

[Overall Processing for Solving Combinatorial Optimization Problem]

The following describes overall processing executed to solve a combinatorial optimization problem using the simulated bifurcation algorithm.

A flowchart of FIG. 17 illustrates an example of the overall processing executed to solve the combinatorial optimization problem. Hereinafter, processing will be described with reference to FIG. 17.

First, the combinatorial optimization problem is formulated (step S201). Then, the formulated combinatorial optimization problem is converted into an Ising problem (a format of an Ising model) (step S202). Next, a solution of the Ising problem is calculated by an Ising machine (information processing device) (step S203). Then, the calculated solution is verified (step S204). For example, in step S204, whether a constraint condition has been satisfied is confirmed. Further, in step S204, whether the obtained solution is the optimal solution or the approximate solution close thereto may be confirmed by referring to the value of the energy function (Hamiltonian).

Then, it is determined whether recalculation is to be performed depending on at least one of the verification result or the number of calculations in step S204 (step S205). When it is determined that the recalculation is to be performed (YES in step S205), the processes in steps S203 and S204 are executed again. On the other hand, when it is determined that the recalculation is not to be performed (NO in step S205), a solution is selected (step S206). For example, in step S206, selection can be performed based on at least one of whether the constraint condition is satisfied or the value of the energy function. Note that the process of step S206 may be skipped when a plurality of solutions are not calculated. Finally, the selected solution is converted into a solution of the combinatorial optimization problem, and the solution of the combinatorial optimization problem is output (step S207).

When the information processing device, the information processing system, the information processing method, the storage medium, and the program described above are used, the solution of the combinatorial optimization problem can be calculated within the practical time. As a result, it becomes easier to solve the combinatorial optimization problem, and it is possible to promote social innovation and progress in science and technology.

While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel methods and systems described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the methods and systems described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.

	Number	Date	Country
Parent	PCT/JP2020/014190	Mar 2020	US
Child	17487179		US

INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM

Information

Publication Number

Date Filed

Date Published

Inventors

Original Assignees

CPC

International Classifications

Abstract

Description

Claims

Priority Claims (1)

Continuations (1)