The present application claims the benefit of priority from Japanese Patent Application No. 2022-190585 filed on Nov. 29, 2022. The entire disclosure of the above application is incorporated herein by reference.
The present disclosure relates to a processing technique of optimizing a combination of binary variables.
A related art discloses a processing technique of optimizing a combination of binary variables under a one-hot constraint. In the processing technique disclosed in the related art, a combination optimization problem satisfying the one-hot constraint is divided into a plurality of partial problems to improve solving performance.
According to one example, a processing system may include a parallel processing processor in which a plurality of threads is constructed for each of a plurality of blocks, and that optimizes a combination of binary variables under a one-hot constraint. A group variable is defined with a combination pattern satisfying the one-hot constraint as a solution candidate for each of groups of the binary variables. The parallel processing processor executes assigning the solution candidate of the group variable for each of the threads in each of the blocks, searching for an output value of the group variable in each of the blocks, and outputting the output value of all the group variables having been searched.
Objects, features and advantages of the present disclosure will become more apparent from the following detailed description made with reference to the accompanying drawings. In the drawings:
In the processing technique of the related art in which the optimization problem is divided into the plurality of partial problems, an increase in the speed of solving processing can be achieved by the division as the solving performance, but the solving accuracy is limited by the division.
The present disclosure provides a processing system that achieves both an increase in speed of solving processing and an improvement in solving accuracy. The present disclosure provides a processing method of achieving both an increase in speed of the solving processing and an improvement in the solving accuracy. The present disclosure provides a processing program that achieves both an increase in speed of the solving processing and an improvement in the solving accuracy.
According to one aspect of the present disclosure, a processing system may include a parallel processing processor in which a plurality of threads is constructed for each of a plurality of blocks, and that is configured to optimize a combination of binary variables under a one-hot constraint. A group variable is defined with a combination pattern satisfying the one-hot constraint as a solution candidate for each of groups of the binary variables. The parallel processing processor is configured to execute assigning the solution candidate of the group variable for each of the threads in each of the blocks, searching for an output value of the group variable on a basis of an energy evaluation value for the solution candidate of the group variable assigned for each of the threads in each of the blocks, and outputting the output value of all the group variables having been searched.
According to another aspect of the present disclosure, a processing method of optimizing a combination of binary variables under a one-hot constraint by a parallel processing processor in which a plurality of threads is constructed for each of a plurality of blocks is provided. A group variable is defined with a combination pattern satisfying the one-hot constraint as a solution candidate for each of groups of the binary variables. The processing method may include: assigning the solution candidate of the group variable for each of the threads in each of the blocks; searching for an output value of the group variable on the basis of an energy evaluation value for the solution candidate of the group variable assigned for each of the threads in each of the blocks; and outputting the output value of all the group variables having been searched.
According to another aspect of the present disclosure, a non-transitory computer readable storage medium storing a processing program including a command that is stored in the storage medium to optimize a combination of binary variables under a one-hot constraint and is executed by a parallel processing processor in which a plurality of threads is constructed for each of a plurality of blocks is provided. A group variable is defined with a combination pattern satisfying the one-hot constraint as a solution candidate for each of groups of the binary variables. The command includes assigning the solution candidate of the group variable for each of the threads in each of the blocks, searching for an output value of the group variable on the basis of an energy evaluation value for the solution candidate of the group variable assigned for each of the threads in each of the blocks, and outputting the output value of all the group variables having been searched.
As described above, in first to third aspects in which a group variable is defined with a combination pattern satisfying a one-hot constraint for each group of a binary variable as a solution candidate, the solution candidate is assigned for each thread in each of blocks of a parallel processing processor. Therefore, in the first to third aspects, an output value of the group variable is searched for on the basis of an energy evaluation value for the solution candidate of the group variable assigned to each thread in each block of the parallel processing processor. Accordingly, the output value of the group variable is searched in parallel in each block of the parallel processing processor, and thus, the search that can ensure accuracy can be completed in a short time. Here, the output values of all the group variables output by the search are equivalent to a solution in which the combination pattern is optimized so as to satisfy the one-hot constraint for each group of the binary variable. Therefore, the first to third aspects are effective in achieving both an increase in speed of solving processing and an improvement in solving accuracy.
First, a technical background related to an embodiment of the present disclosure will be described.
In the field of quantum computation, the quantum annealer first appeared as a quantum computer, and Ising machines that classically mimic the quantum annealer by digital technology appeared as rivals. An Ising machine is a machine in which a technique related to classical simulated annealing is implemented as a dedicated chip of a digital computer for an Ising model to be solved by the quantum annealer. The leading machines are the digital annealer and the complementary metal oxide semiconductor (CMOS) annealer. This situation has been challenged by annealers based on general-purpose computing on graphics processing units (GPGPU). The GPGPU-based annealer is a technique for reproducing, with GPGPU, the performance of a dedicated computer implemented by an application specific integrated circuit (ASIC) or the like. The first such techniques are the simulated bifurcation machine (SBM) and momentum annealing (MA). Both of these techniques are considered to be capable of running on a general-purpose GPGPU machine and exhibiting performance comparable to that of the digital annealer and the like.
However, even in the GPGPU-based annealer, the difficulty of parallelizing the simulated annealing algorithm with high performance has become apparent. In the case of the SBM, which is said to be the fastest machine in the world and faster than a quantum computer, there are many reports that its performance is good only against a fully coupled Max-Cut problem, which has not been demonstrated on the quantum computer. Since the Max-Cut problem itself has low applicability, sufficient performance is not exhibited against the problems that Ising machines widely target. In short, one problem of the method of solving the Ising model by the GPGPU is that the method cannot be adapted to general-purpose problems. On the other hand, in the case of the MA, in which a minor embedding method proposed for the quantum annealer is applied to a solution method of the simulated annealing, the difficulty of GPGPU acceleration is apparent. In the MA, a fully coupled problem having a quadratic relationship between all variables is embedded in a bipartite graph, and half of the variables can be updated simultaneously to enable parallel calculation by the GPGPU. However, since a limit on the solving accuracy is artificially introduced by the minor embedding in the bipartite graph, a problem occurs in improving the solving accuracy. In short, another problem of the method of solving the Ising model by the GPGPU is that the improvement in the solving accuracy is easily limited.
In the GPGPU-based annealer, the difficulty of implementing the simulated annealing algorithm with high performance is due to the difficulty of parallelizing the search itself. Specifically, in the Ising problem in which all the variables are related, when one variable flip (one search) and another variable flip are performed independently, the information obtained by each flip cannot be mutually used. That is, since the information obtained by one variable flip is inherited only by a variable flip performed serially after it, information remains in the sequentially repeated search histories of the independent variable flips, but the information cannot be exchanged between them, and it is difficult to exhibit an effect of parallelization. Therefore, in the SBM, the Ising model is solved by a differential equation that is easy to parallelize, and parallelization has been successful by making it possible to search the problem of a variable size Z by updating Z independent variables, but versatility or usefulness is lacking instead. In the MA, parallelization has been successful by making it possible to search the problem of the variable size Z by Z independent variable flips, but there is a limit to the improvement of the solving accuracy instead.
Here, one attempt at parallelization is multi-start simulated annealing, which is the simplest parallelization method. This parallelization method performs multiple runs of simulated annealing by using independently prepared random initial values, and finally aggregates all the runs to obtain the best result. However, the performance of the original simulated annealing is improved only to the extent of the solution dispersion given by the random initial values, and the method does not provide an essential improvement as an algorithm. On the other hand, another attempt at parallelization is a replica exchange method (parallel tempering), which is a representative example of a Monte Carlo calculation method developed in the context of statistical physics. This parallelization method prepares replicas for which a plurality of different temperatures is set, performs searching in parallel, and exchanges information between the replicas with precise timing. However, introduction of excessively parallelized replicas conversely causes a decrease in performance, and the effect of parallelization is small, that is, limited even with Z-fold parallelism for the variable size Z.
In addition to the above problems, many useful optimization problems based on the Ising model always require a long one-hot constraint. However, in an Ising formulation (that is, a penalty method) in which the one-hot constraint is formulated in the Ising model, it is difficult to improve the performance. Here, in the Ising machine and a pseudo quantum technology, the performance can be improved by limiting a search technique by the one-hot constraint. On the other hand, it is known that the SBM of the GPGPU-based annealer is not suitable for implementation of the one-hot constraint. From the above background, the present disclosure provides a technology capable of achieving not only high speed by implementing efficient parallelization by GPGPU but also particularly exhibiting performance for the problem of a one-hot constraint in Ising formulation frequently used in many useful applications.
Hereinafter, a plurality of embodiments of the present disclosure will be described with reference to the drawings. Note that the same reference numerals are given to corresponding components in each embodiment, and redundant description may be omitted. When only a part of a configuration is described in each embodiment, other parts of the configuration can adopt a configuration of another embodiment previously described. Furthermore, not only a combination of configurations explicitly described in the description of each embodiment but also a partial combination of configurations of a plurality of embodiments is possible even if not explicitly described as long as the combination is not hindered.
A processing system 1 according to a first embodiment shown in
The host processing computer 10 includes at least one host processing processor 12 and at least one host processing memory 14. The host processing processor 12 is a central processing unit (CPU) as a processor capable of performing classical computation processing on data and capable of performing data transfer processing with at least the parallel processing computer 20 inside the system, out of the parallel processing computer 20 inside the system and the outside of the system. The host processing processor 12 reads a host program as a processing program from the host processing memory 14, and manages inputting, outputting, and processing of data and a program with the parallel processing computer 20 by transfer. The host processing processor 12 may manage inputting, outputting, and processing of data and a program with the outside of the system.
The host processing memory 14 is a semiconductor memory as a non-transitory tangible storage medium capable of non-transiently storing computer-readable data and programs. The host processing memory 14 stores a processing program including a host program that manages inputting, outputting, and processing of data with the parallel processing computer 20 and a kernel function called by the parallel processing computer 20. The host processing memory 14 stores input data input to the parallel processing computer 20, internal output data output inside the system from the parallel processing computer 20, and external output data that can be output to the outside of the system in accordance with the internal output data.
The parallel processing computer 20 includes at least one parallel processing processor 22 and at least one parallel processing memory 24. As shown in
The parallel processing memory 24 shown in
The processors 12 and 22 of the computers 10 and 20 in the processing system 1 construct a plurality of functional sections as shown in
In this manner, the processors 12 and 22 of the computers 10 and 20 construct the respective functional sections, so that the processing method of solving the combinatorial optimization problem is performed in accordance with the processing flow shown in
In S10 shown in
The binary variable X represented by Formula 1 in the combinatorial optimization problem of the present embodiment is grouped into a plurality of groups Gi of a total number I in which an index i is defined as an integer by Formula 2. Then, the binary variable X is expressed as Xi[m] assuming that M binary variables X are allocated to each group Gi as in Formula 1 by using an index m defined as an integer by Formula 3. Furthermore, in each group Gi, the one-hot constraint is given in which only one Xi[m] in the same group Gi takes 1 and M−1 Xi[m] other than the one Xi[m] in the same group Gi take 0 as shown in Formula 4 and
The input management section 100 in S10 of
k = 0 to K−1 (Formula 5)
In the combinatorial optimization problem of the present embodiment, the number K of solution candidates k matching the number of combination patterns of the binary variables Xi[m] for each group Gi is set to an integer equal to or greater than three, which is the same as the number M of the binary variables Xi[m] for each group Gi. Therefore, each of the K solution candidates k for each group variable xi is expressed as an integer by a multi bit index k as shown in
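As a non-limiting illustration of this integer expression, the following minimal sketch converts between a one-hot pattern of M binary variables and a single index k, so that K equals M. The function names `encode_group` and `decode_group` are hypothetical, and the choice that k equals the position of the single 1 is an assumption made for the sketch rather than a mapping stated in the present disclosure.

```python
def encode_group(bits):
    """Return the integer index k of the single 1 in a one-hot group."""
    if any(b not in (0, 1) for b in bits) or sum(bits) != 1:
        raise ValueError("group does not satisfy the one-hot constraint")
    return bits.index(1)


def decode_group(k, M):
    """Return the M binary variables whose one-hot pattern has index k."""
    if not 0 <= k < M:
        raise ValueError("k must lie in 0..M-1")
    return [1 if m == k else 0 for m in range(M)]


M = 4
patterns = [decode_group(k, M) for k in range(M)]  # all K = M candidates
indices = [encode_group(p) for p in patterns]      # back to 0..K-1
```

Because only the M patterns satisfying the constraint are ever enumerated, a search over k cannot leave the feasible set, which is the property the group variable xi relies on.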
In this manner, in S10 of
In S12 following S11, the initial processing section 200 generates an initial value ki_s of the solution candidate k for each group variable xi. Here, parallel processing (described later) by the search section 220 is executed in parallel and simultaneously independently for each group variable xi in a plurality of blocks 26. Then, the initial processing section 200 generates the initial value ki_s, which is an integer of 0 to K−1, by random number generation for each group variable xi from individual seed values of different blocks 26, and stores the generated initial value ki_s in the parallel processing memory 24.
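The seeding in S12 may be sketched as follows, purely as an illustration. The function name `initial_values`, the parameter `base_seed`, and the rule of deriving each block's seed as `base_seed + b` are assumptions of the sketch; the present disclosure only requires that different blocks use individual seed values.

```python
import random


def initial_values(num_blocks, I, K, base_seed=0):
    """Return, for each block, a list of I initial candidates in 0..K-1."""
    values = []
    for b in range(num_blocks):
        rng = random.Random(base_seed + b)  # assumed: one distinct seed per block
        values.append([rng.randrange(K) for _ in range(I)])
    return values


init = initial_values(num_blocks=3, I=5, K=4)
```

Using a separate generator per block keeps the starting points of the blocks independent while leaving each run reproducible from its seed.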
In S13 following S12, the search section 220 assigns the initial value ki_s generated from different seed values individually associated with each block 26 in S12 to the threads 28 of the same block 26 in common as shown in
As illustrated in
Specifically, in S14, S20 to S33 shown in
In the following S21, the search section 220 selects, one by one in the order of the index i, the group variable xi for which the output value ki_f is to be searched by updating the solution candidate k, the index i corresponding to the group Gi being common to all the blocks 26. Hereinafter, the group variable xi selected in S21 is particularly referred to as a selected group variable xi. In S21 as described above, the search section 220 initializes the index i to 0 in the first update processing for the selected group variable xi, and increments the index i by one every time the second and subsequent update processing is started (that is, every time the processing flow returns from S32 described later).
In the following S22, the search section 220 assigns K different solution candidates k to the selected group variable xi of the index i in each block 26 for each of K threads 28 as shown in
As shown in
In Formula 7, Q means a quadratic unconstrained binary optimization (QUBO) matrix. Here, a matrix coefficient of Q is preferably input together with each group variable xi in S10 by being converted from the energy function for the binary variable Xi[m]. In Formula 7, xj represents a group variable with an index of j other than i to be distinguished from a group variable xi with an index of i. Then, a unique solution candidate k is given to the selected group variable xi for each thread 28. On the other hand, to the group variable xj other than the selected group variable xi, the latest value corresponding to the index j of the initial value ki_s assigned in the most recent S13 or an update value ki_u (described later) acquired in the past S25 and S31 is given. Note that in the following description and
In the search section 220 in S23, for each solution candidate k in each thread 28 for the selected group variable xi, a difference in the energy evaluation value Ei(k) from before the update processing is defined by a function δEi(k, kp) of Formula 8. Here, in Formula 8, kp represents the solution candidate before the current update processing for the selected group variable xi to be distinguished from the solution candidate k assigned in the most recent S22, which can be a candidate of the output value ki_f after the current update processing. Then, the solution candidate kp is given the latest value corresponding to the index i of the selected group variable xi of the initial value ki_s assigned in the most recent S13 or the update value ki_u acquired in the past S25 and S31.
As a result, when the energy evaluation value Ei(k) of the solution candidate k fluctuates to a smaller side than the energy evaluation value Ei(kp) of the solution candidate kp, the difference represented by Formula 8 has a negative value. On the other hand, when the energy evaluation value Ei(k) of the solution candidate k fluctuates to a greater side than the energy evaluation value Ei(kp) of the solution candidate kp, the difference represented by Formula 8 is positive. Note that in the following description and
In the search section 220 in S23, for each solution candidate k in each thread 28 for the selected group variable xi, a transition probability corresponding to the evaluation value difference δEi(k, kp) is defined by a function Pi(k) of Formula 9. Note that in the following description and
Pi(k) = exp(−δEi(k, kp)/Ta) (Formula 9)
Under the above definitions, in S23, the evaluation value difference δEi(k, kp) and the transition probability Pi(k) based on the energy evaluation value Ei(k) are acquired by parallel computation in the plurality of threads 28 for each solution candidate k for the selected group variable xi. Then, the acquired values δEi(k, kp) and Pi(k) for each thread 28 in S23 are stored in the parallel processing memory 24.
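The computation in S23 may be illustrated by the following minimal sketch, in which a list comprehension over k stands in for the K threads 28 of one block 26. The callable `energy` is a hypothetical stand-in for the energy function of Formula 7 with the other group variables xj held fixed, and the difference is taken as Ei(k) minus Ei(kp), which is an assumption consistent with the sign convention described for Formula 8.

```python
import math


def evaluate_candidates(energy, K, kp, Ta):
    """Return (deltas, probs) for the candidates k = 0..K-1.

    energy: hypothetical callable returning Ei(k) for a candidate k.
    kp:     candidate held before the current update processing.
    Ta:     annealing temperature.
    """
    e_prev = energy(kp)
    deltas = [energy(k) - e_prev for k in range(K)]  # assumed form of Formula 8
    probs = [math.exp(-d / Ta) for d in deltas]      # Formula 9
    return deltas, probs


toy = [3.0, 1.0, 2.0, 5.0]  # made-up energies for four candidates
deltas, probs = evaluate_candidates(lambda k: toy[k], K=4, kp=2, Ta=1.0)
```

In this toy case only the candidate k = 1 lowers the energy, so its difference is negative and its value of Formula 9 exceeds one.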
In the following S24, the search section 220 determines the presence or absence of a solution candidate k in which the evaluation value difference δEi(k, kp) from before the update processing acquired in the most recent S23 is negative (that is, δEi(k, kp)<0). As a result, when an affirmative determination is made because the evaluation value difference δEi(k, kp) corresponding to at least one solution candidate k is negative, the processing flow proceeds to S25.
In S25, the search section 220 acquires, as the update value ki_u, a solution candidate k in which the evaluation value difference δEi(k, kp) is the largest in a negative direction of at least one solution candidate k in which the evaluation value difference δEi(k, kp) is negative. Here, the update value ki_u means an update processing result for searching for the output value ki_f of the selected group variable xi. At this time, in particular, when the solution candidate k in which the evaluation value difference δEi(k, kp) is negative is singular, the singular solution candidate k corresponds to the update value ki_u in which the evaluation value difference δEi(k, kp) is the largest in the negative direction. In S25 as described above, the energy evaluation value Ei(k) is acquired on the basis of the evaluation value difference δEi(k, kp) for the solution candidate k giving the update value ki_u, and is stored in the parallel processing memory 24 in association with the update value ki_u.
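The determination in S24 and the selection in S25 may be sketched as follows. The function name `greedy_update` is hypothetical, and returning `None` to signal the negative determination in S24 is a convention of the sketch only.

```python
def greedy_update(deltas):
    """Return the candidate k with the most negative difference, or None."""
    negative = [k for k, d in enumerate(deltas) if d < 0]
    if not negative:
        return None                                # negative determination in S24
    return min(negative, key=lambda k: deltas[k])  # S25: largest in negative direction


ku = greedy_update([1.0, -1.0, 0.0, -3.5])
```

When exactly one difference is negative, the list `negative` has a single element and that element is returned, which matches the singular case described above.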
In the following S26, the search section 220 determines whether a lowest energy condition is satisfied in which the energy evaluation value Ei(k) acquired in the most recent S25 is less than the energy evaluation value Ei(k) acquired in the past S25. When an affirmative determination is made as a result, the processing flow proceeds to S27.
In S27, the search section 220 updates the output value ki_f of the selected group variable xi in the parallel processing memory 24 by the update value ki_u in the most recent S25. That is, the update value ki_u corresponding to the energy evaluation value Ei(k) having the smallest value in the update processing from the past to the present is selected as the latest output value ki_f for the selected group variable xi. Furthermore, in S27, as the latest output value ki_f for the group variable xj other than the selected group variable xi, the latest value corresponding to the index j of the initial value ki_s assigned in the most recent S13 or the update values ki_u acquired in the past S25 and S31 is provided or held in the parallel processing memory 24. In S27 as described above, as the energy evaluation value Ei(k) corresponding to the latest output values ki_f and kj_f, the energy evaluation value Ei(k) acquired in the most recent S25 is updated in the parallel processing memory 24.
On the other hand, when a negative determination is made in S26, the processing flow proceeds to S28. In S28, the search section 220 provides or holds, as the latest output value ki_f for the selected group variable xi, the latest value corresponding to the index i of the initial value ki_s assigned in the most recent S13 or the output value ki_f updated in the past S27 in the parallel processing memory 24. That is, the output value ki_f of the selected group variable xi is not updated depending on the update value ki_u corresponding to the energy evaluation value Ei(k) greater than the smallest value in the past update processing. Furthermore, in S28, as the latest output value ki_f for the group variable xj other than the selected group variable xi, the latest value corresponding to the index j of the initial value ki_s assigned in the most recent S13 or the update values ki_u acquired in the past S25 and S31 is provided or held in the parallel processing memory 24.
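The branch between S27 and S28 may be sketched, in simplified form for a single group variable, as follows. The function name `track_best` is hypothetical, and adopting the first update unconditionally when no past energy evaluation value exists is an assumption of the sketch.

```python
def track_best(best_k, best_energy, update_k, update_energy):
    """Return the (output value, energy) pair kept after one update."""
    if best_energy is None or update_energy < best_energy:  # S26 affirmative
        return update_k, update_energy                      # S27: update output value
    return best_k, best_energy                              # S28: hold output value


state = (None, None)
for k_u, e_u in [(2, 5.0), (0, 3.0), (1, 4.0)]:
    state = track_best(*state, k_u, e_u)
```

After the three updates, the retained output value is the one from the second update, since the third update has a greater energy evaluation value and is therefore not adopted.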
On the other hand, when a negative determination is made in S24, in contrast to the case of S25 to S28 described above, it is determined that the evaluation value difference δEi(k, kp) from before the update processing acquired in the most recent S23 is positive for all the solution candidates k (that is, δEi(k, kp)>0), and the processing flow proceeds to S29. In S29, the search section 220 acquires an integrated probability ΣPi,N by integrating the transition probability Pi(k) acquired in the most recent S23 for a limited number N of solution candidates k of all the K solution candidates in which the evaluation value difference δEi(k, kp) is positive. At this time, the limited number N of the solution candidates k is defined as an integer smaller than a total number K of the solution candidates k so that the integrated probability (that is, a sum probability) ΣPi,N obtained by integrating the transition probability Pi(k) from a high probability side is less than one.
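The integration in S29 may be sketched as follows. The function name `integrated_probability` is hypothetical, and taking N as the largest count that keeps the sum below one is an assumption of the sketch, since the description above only requires that N be smaller than K and that the sum be less than one.

```python
import math


def integrated_probability(deltas, Ta):
    """Return (chosen, total) for the candidates with a positive difference.

    chosen: list of (k, Pi(k)) taken from the high-probability side.
    total:  their summed transition probability, kept below one.
    """
    positive = [(math.exp(-d / Ta), k) for k, d in enumerate(deltas) if d > 0]
    positive.sort(reverse=True)  # high-probability side first
    chosen, total = [], 0.0
    for p, k in positive:
        if total + p >= 1.0 or len(chosen) + 1 >= len(deltas):
            break                # keep the sum below one and N below K
        chosen.append((k, p))
        total += p
    return chosen, total


chosen, total = integrated_probability([0.0, 1.0, 2.0, 0.5], Ta=1.0)
```

In this toy case the two most probable candidates are kept, and the third is dropped because adding it would bring the sum to one or more.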
In the following S30, the search section 220 compares the integrated probability ΣPi,N acquired in the most recent S29 with a uniformly distributed random number probability Pr. At this time, the random number probability Pr is defined as a uniform random number in which random numbers generated in a fractional range of 0 to 1 are distributed with uniformity. Then, the search section 220 in S30 determines whether the integrated probability ΣPi,N exceeds the random number probability Pr. When an affirmative determination is made as a result, the processing flow proceeds to S31.
In S31, the search section 220 acquires, as the update value ki_u of the selected group variable xi, the solution candidate k adopted for the random number probability Pr among the limited number N of solution candidates k in a case where the integrated probability ΣPi,N exceeds the random number probability Pr. Then, in S31, the energy evaluation value Ei(k) is acquired on the basis of the evaluation value difference δEi(k, kp) for the solution candidate k giving the update value ki_u, and is stored in the parallel processing memory 24 in association with the update value ki_u. The energy evaluation value Ei(k) stored in S31 is used as a computation reference value to which a negative or positive evaluation value difference δEi(k, kp) is added in order to acquire the energy evaluation value Ei(k) in the next step of S31 or S25. The same applies to the energy evaluation value Ei(k) stored in S25 described above.
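One plausible reading of the comparison in S30 and the adoption in S31 is sketched below. The function name `stochastic_update` is hypothetical, and the adoption rule, namely taking the first candidate whose cumulative sum from the high-probability side exceeds the random number probability Pr, is an assumption of the sketch.

```python
import random


def stochastic_update(chosen, rng):
    """Return the adopted candidate k, or None when no candidate is adopted.

    chosen: list of (k, Pi(k)) ordered from the high-probability side,
            whose total probability is below one.
    rng:    object with a random() method returning a value in [0, 1).
    """
    pr = rng.random()        # uniformly distributed random number probability Pr
    cumulative = 0.0
    for k, p in chosen:
        cumulative += p
        if cumulative > pr:  # assumed adoption rule
            return k
    return None              # integrated probability does not exceed Pr


rng = random.Random(0)
picks = [stochastic_update([(3, 0.6), (1, 0.3)], rng) for _ in range(1000)]
```

Under this rule each candidate is adopted with a frequency close to its own transition probability, and no update occurs with the remaining probability.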
Here, in S31, a case is assumed where the update value ki_u in S25 is updated as the output value ki_f in S27 even when the update value ki_u is a value ki_ff that gives a false local solution as shown in
Specifically, as for the limited number N of solution candidates k, the search section 220 in S31 compares a cumulative sum ΣPi,n obtained by changing an integration number (that is, an integration section from a high probability side) n of the transition probability Pi(k) by one in an integer range of 1 to N as shown in
As shown in
In S33, the search section 220 determines whether the annealing temperature Ta has reached the minimum temperature Tmin. As a result, when a negative determination is made, the processing flow returns to S20 to continue the simulated annealing in which the annealing temperature Ta is reduced and changed for each group variable xi. On the other hand, when an affirmative determination is made, it is determined that the simulated annealing in S14 has been completed, and the processing flow proceeds from S14 to S15.
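The loop closed by S33 may be sketched as the following temperature schedule. The function name `temperature_schedule`, the starting temperature `T_start`, and the geometric cooling factor `alpha` are assumptions of the sketch; the description above only states that the annealing temperature Ta is reduced until it reaches the minimum temperature Tmin.

```python
def temperature_schedule(T_start, T_min, alpha=0.9):
    """Yield annealing temperatures from T_start down to T_min."""
    Ta = T_start
    while Ta > T_min:
        yield Ta
        Ta *= alpha  # assumed geometric cooling
    yield T_min      # final pass at the minimum temperature


temps = list(temperature_schedule(T_start=1.0, T_min=0.5, alpha=0.8))
```

Each yielded temperature would correspond to one pass of the update processing over all the group variables xi before the determination in S33 is made again.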
As described above, in the completion stage of S14, the latest output value ki_f stored in the parallel processing memory 24 for each group variable xi is confirmed as a search result. At this time, the output values ki_f and kj_f updated for the selected group variable xi and the other group variables xj in the most recent (that is, the last) S27 are the output values ki_f confirmed for each group variable xi in the completion stage of S14.
As shown in
In S16 following S15, the output management section 120 maps the output values ki_f of all the group variables xi output as the executable solutions from the parallel processing processor 22 to a solution space according to the energy function for the binary variable Xi[m]. As a result, the output management section 120 in S16 outputs a solution (that is, an optimal combination solution) in which the combination pattern of the binary variable Xi[m] is optimized so as to satisfy the one-hot constraint for each group Gi.
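The mapping in S16 may be illustrated by the following minimal sketch, which expands the output values into a flat vector of binary variables and evaluates a QUBO energy on it. The function names `flatten_solution` and `qubo_energy` and the toy matrix `Q` are hypothetical, and the assumption that the index k marks the position of the single 1 in each group is made for the sketch only.

```python
def flatten_solution(outputs, M):
    """Expand one output value per group into a flat list of binary variables."""
    x = []
    for k in outputs:
        x.extend(1 if m == k else 0 for m in range(M))
    return x


def qubo_energy(Q, x):
    """Return the sum of Q[a][b] * x[a] * x[b] over all index pairs."""
    n = len(x)
    return sum(Q[a][b] * x[a] * x[b] for a in range(n) for b in range(n))


# Toy matrix penalizing the same choice in both of two groups of two variables.
Q = [[0, 0, 2, 0],
     [0, 0, 0, 2],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
x = flatten_solution([1, 0], M=2)
energy = qubo_energy(Q, x)
```

Here the two groups choose different positions, so the penalty terms of the toy matrix are not activated, whereas the outputs `[0, 0]` would activate one of them.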
The output in S16 may be outputting and storing a solution to the host processing memory 14 so as to be readable by access from the outside of the system. In this case, the processing system 1 may include at least one type of a server system, a remote management system, mobility, or the like that uses an output solution from the host processing computer 10. The output in S16 may be outputting a solution by copy transfer to the outside of the system. In this case, the outside of the system may be, for example, at least one type of a server system that is communicable with the processing system 1, mobility equipped with the processing system 1, or the like that uses an output solution from the host processing computer 10. As described above, the current execution of the processing flow terminates when the execution of S16 is completed.
As described above, in the first embodiment in which the group variable xi with the combination pattern satisfying the one-hot constraint for each group Gi of the binary variable Xi[m] as the solution candidate k is defined, the solution candidate k is assigned for each thread 28 in each block 26 of the parallel processing processor 22. Therefore, in the first embodiment, the output value ki_f of the group variable xi is searched for on the basis of the energy evaluation value Ei(k) for the solution candidate k of the group variable xi assigned for each thread 28 in each block 26. Accordingly, the output value ki_f of the group variable xi is searched in parallel in each block 26, and thus, the search that can ensure accuracy can be completed in a short time. Here, the output values ki_f of all the group variables xi output by the search are equivalent to a solution in which the combination pattern is optimized so as to satisfy the one-hot constraint for each group Gi of the binary variable Xi[m]. Therefore, the first embodiment is effective in achieving both an increase in speed of solving processing and an improvement in solving accuracy.
In each block 26 according to the first embodiment, a solution candidate k is assigned to each thread 28, the solution candidate being obtained by expressing, as an integer, a combination pattern satisfying the one-hot constraint for each group Gi of the binary variable Xi[m] by a multi bit index k. Accordingly, even if the number of solutions of the combination pattern to be optimized increases, the search for the output value ki_f with which accuracy can be ensured in each block 26 can be completed in a short time in parallel on the basis of the energy evaluation value Ei(k) for the solution candidate k expressed as an integer with the number of bits corresponding to the number of solutions. Therefore, the first embodiment can contribute to both an increase in the speed of the solving processing and an improvement in the solving accuracy.
In each block 26 according to the first embodiment, the processing of assigning the solution candidate k of the same group variable xi for each thread 28 is repeated for all the group variables xi. Accordingly, the search for the output value ki_f that can be completed in a short time by the parallel processing in each block 26 is repeated for all the group variables xi, and thus, it is possible to output the output value ki_f of all the group variables xi with high accuracy. Therefore, the first embodiment is particularly effective in improving the solving accuracy together with increasing the speed of the solving processing.
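The repetition over all the group variables xi can be sketched as follows. The sketch is sequential Python in which the inner loop over k stands in for the threads 28 of one block 26. The energy function, the function names, and the greedy choice of the lowest-energy candidate (used here in place of the update processing of the embodiment) are assumptions for illustration only.

```python
from typing import Callable, Sequence


def sweep_all_groups(
    assignment: Sequence[int],
    group_sizes: Sequence[int],
    energy: Callable[[Sequence[int]], float],
) -> list[int]:
    """Visit every group variable once and re-pick its value among all candidates."""
    current = list(assignment)
    for i, size in enumerate(group_sizes):
        # One candidate k per (simulated) thread: evaluate the energy with x_i = k.
        candidate_energies = []
        for k in range(size):
            trial = current.copy()
            trial[i] = k
            candidate_energies.append(energy(trial))
        # Assumed greedy rule: keep the candidate with the lowest energy.
        current[i] = min(range(size), key=candidate_energies.__getitem__)
    return current


# Hypothetical toy energy: squared distance from a known target assignment.
TARGET = [2, 0, 1]


def toy_energy(x: Sequence[int]) -> float:
    return float(sum((a - b) ** 2 for a, b in zip(x, TARGET)))


result = sweep_all_groups([0, 0, 0], group_sizes=[3, 2, 4], energy=toy_energy)
```

On a parallel processor the inner loop would be replaced by one thread per candidate, while the outer loop over the group variables would remain a repetition within the block.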
In each block 26 according to the first embodiment, the output value ki_f is searched for by update processing based on not only the energy evaluation value Ei(k) for the solution candidate k of the group variable xi assigned to each thread 28 but also the transition probability Pi(k) for the solution candidate k. Accordingly, since the search accuracy of the output value ki_f can be improved, the first embodiment is particularly effective in improving the solving accuracy together with increasing the speed of the solving processing.
In each block 26 according to the first embodiment, the output value ki_f is updated from the solution candidate k for each thread 28 in which the evaluation value difference δEi(k, kp) in the energy evaluation value Ei(k) from before the update processing and the transition probability Pi(k) corresponding to the difference δEi(k, kp) are acquired in accordance with the simulated annealing. Accordingly, the search for the output value ki_f according to the simulated annealing can be completed in a short time by the update processing based on the evaluation value difference δEi(k, kp) and the transition probability Pi(k) limited to the solution candidate k for each thread 28. Therefore, the first embodiment is particularly effective in increasing the speed of the solving processing together with improving the solving accuracy.
In each block 26 according to the first embodiment, the solution candidate k in which the evaluation value difference δEi(k, kp) in the energy evaluation value Ei(k) is the largest in the negative direction is acquired as the update value ki_u for searching for the output value ki_f. Accordingly, the output value ki_f that optimizes the energy evaluation value Ei(k) for each group variable xi can be searched with high accuracy and in a short time by the update based on the evaluation value difference δEi(k, kp). Therefore, the first embodiment can contribute to both an increase in the speed of the solving processing and an improvement in the solving accuracy.
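A minimal sketch of this selection is given below, assuming that the energy evaluation values Ei(k) computed by the threads 28 are available as a list indexed by k and that kp denotes the value held before the update processing. The function name and the return convention are hypothetical.

```python
from typing import Optional, Sequence


def most_negative_difference(
    energies: Sequence[float], k_prev: int
) -> Optional[tuple[int, float]]:
    """Return (k, delta) with the most negative delta = E(k) - E(k_prev), else None."""
    deltas = [e - energies[k_prev] for e in energies]
    k_best = min(range(len(deltas)), key=deltas.__getitem__)
    if deltas[k_best] < 0:
        return k_best, deltas[k_best]
    return None  # no candidate lowers the energy


update = most_negative_difference([3.0, 1.0, 2.5, 0.5], k_prev=0)
no_update = most_negative_difference([3.0, 1.0, 2.5, 0.5], k_prev=3)
```

In the first call the candidate k = 3 gives the largest decrease and would serve as the update value; in the second call the held value is already the lowest, so no downhill candidate exists.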
In each block 26 according to the first embodiment, among the solution candidates k for which the evaluation value difference δEi(k, kp) in the energy evaluation value Ei(k) is positive, a limited number N of solution candidates k are taken from the high probability side, and the integrated probability ΣPi,N obtained by integrating their transition probabilities Pi(k) is compared with the uniformly distributed random number probability Pr. When the integrated probability ΣPi,N exceeds the random number probability Pr, the search for the output value ki_f is continued by using, as the update value ki_u, the solution candidate k whose transition probability Pi(k) is adopted as the random number probability Pr among the limited number N of solution candidates k. Accordingly, even when the evaluation value difference δEi(k, kp) once becomes positive, the update can proceed, on the basis of the transition probabilities Pi(k) of the number N limited from the high probability side, to a solution candidate k from which the evaluation value difference δEi(k, kp) next becomes negative, so that the output value ki_f can be searched for with high accuracy for each group variable xi. Therefore, in the first embodiment, it is possible to ensure high solving accuracy while achieving an increase in the speed of the solving processing.
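One possible reading of this comparison is sketched below. Two points are assumptions rather than statements of the embodiment: the transition probability is taken to be the Metropolis-style value exp(-δE/T), and the adopted candidate is taken to be the first one at which the running sum of probabilities passes the random number Pr. The function name and parameters are hypothetical.

```python
import math
from typing import Optional, Sequence


def pick_uphill_candidate(
    deltas: Sequence[float], temperature: float, limit_n: int, p_r: float
) -> Optional[int]:
    """Pick an update value among the N most probable positive-difference candidates."""
    # Assumed Metropolis-style transition probability for each positive difference.
    uphill = [(math.exp(-d / temperature), k) for k, d in enumerate(deltas) if d > 0]
    uphill.sort(reverse=True)  # high probability side first
    integrated = 0.0
    for probability, k in uphill[:limit_n]:
        integrated += probability
        if integrated > p_r:
            return k  # the running sum has passed the random number
    return None  # the integrated probability never exceeded p_r


deltas = [0.0, 1.0, 2.0, -1.0]  # differences relative to the value held before the update
first = pick_uphill_candidate(deltas, temperature=1.0, limit_n=2, p_r=0.2)
second = pick_uphill_candidate(deltas, temperature=1.0, limit_n=2, p_r=0.4)
```

In practice p_r would be drawn from a uniform generator such as random.random(); it is passed in explicitly here so that the behavior can be checked with fixed values.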
In the first embodiment, the group variable xi in which the combination pattern satisfying the one-hot constraint for each group Gi of the binary variable Xi[m] is set as the solution candidate k is input from the host processing processor 12 to the parallel processing processor 22. As a result, the output values ki_f of all the group variables xi output from the parallel processing processor 22 are output in accordance with mapping of the combination pattern of the binary variables Xi[m] to an optimized solution so as to satisfy the one-hot constraint for each group Gi in the host processing processor 12. Accordingly, the parallel processing processor 22 specialized for the search for the output value ki_f of the group variable xi in a short time with which accuracy can be secured can cause the host processing processor 12 to share the functions of the input of the group variable xi and the solution output of the combination pattern from the output value ki_f. Therefore, the first embodiment is particularly effective in increasing the speed of the solving processing together with improving the solving accuracy.
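The mapping performed on the host side can be sketched as follows, assuming that each output value ki_f is a zero-based index into its group Gi. The function name, the argument layout, and the indexing convention are hypothetical and serve only to illustrate the idea.

```python
from typing import Sequence


def map_outputs_to_solution(
    output_values: Sequence[int], group_sizes: Sequence[int]
) -> list[list[int]]:
    """Expand each output value k_i into the one-hot pattern of its group G_i."""
    if len(output_values) != len(group_sizes):
        raise ValueError("one output value is required per group")
    solution = []
    for k, size in zip(output_values, group_sizes):
        if not 0 <= k < size:
            raise ValueError("output value outside the candidates of its group")
        solution.append([1 if m == k else 0 for m in range(size)])
    return solution


solution = map_outputs_to_solution([2, 0, 1], group_sizes=[3, 2, 4])
```

Each inner list contains exactly one 1, so the mapped solution satisfies the one-hot constraint for every group by construction.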
A second embodiment is a modification of the first embodiment.
As shown in
Specifically, in both S225 and S231, the search section 220 updates the output value ki_f of the selected group variable xi in the parallel processing memory 24 by the acquired update value ki_u. That is, the update value ki_u acquired in S225 and S231 is directly selected as the latest output value ki_f for the selected group variable xi. Furthermore, in S225 and S231, as the latest output value kj_f for each group variable xj other than the selected group variable xi, the latest value corresponding to the index j among the initial value ki_s assigned in the most recent S13 and the update values ki_u acquired in the past S225 and S231 is provided or held in the parallel processing memory 24. In S225 and S231 as described above, the energy evaluation value Ei(k) acquired in accordance with S25 and S31 is updated in the parallel processing memory 24 as the energy evaluation value Ei(k) corresponding to the latest output values ki_f and kj_f.
In the processing flow of the second embodiment, in S14 including S225 and S231 described above, the latest output value ki_f stored in the parallel processing memory 24 for each group variable xi is confirmed as a search result. At this time, the output values ki_f and kj_f updated for the selected group variable xi and the other group variables xj in the most recent (that is, the last) step of S225 or S231 are the output values ki_f confirmed for each group variable xi in the completion stage of S14. Therefore, the second embodiment described above also provides operational effects similar to those of the first embodiment.
A third embodiment is a modification of the second embodiment.
As shown in
Specifically, in S314, S319 to S341 shown in
In S319 shown in
q = 1, 2, ..., Q (Formula 10)
In S320 following S319, the search section 220 counts a loop count h of the search processing in S321 to S339 shown in
As shown in
Specifically, in S333, the search section 220 selects sets of blocks 26 whose replica temperatures Tq and Tq+1 are adjacent to each other such that the same block 26 does not belong to more than one set. When the loop count h of the search processing counted in the most recent S320 is an odd number, the search section 220 in S333 selects the sets of blocks 26 corresponding to the replica temperatures Tq and Tq+1 in which q is an odd number and q+1 is an even number. On the other hand, when the loop count h of the search processing counted in the most recent S320 is an even number, the search section 220 in S333 selects the sets of blocks 26 corresponding to the replica temperatures Tq and Tq+1 in which q is an even number and q+1 is an odd number.
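The alternating selection of adjacent replica pairs can be sketched as follows, using the one-based replica index q = 1 to Q and the loop count h. The function name is hypothetical.

```python
def select_replica_pairs(loop_count: int, num_replicas: int) -> list[tuple[int, int]]:
    """Return non-overlapping pairs (q, q + 1) chosen by the parity of the loop count."""
    # Odd loop count: q = 1, 3, 5, ...; even loop count: q = 2, 4, 6, ...
    start = 1 if loop_count % 2 == 1 else 2
    return [(q, q + 1) for q in range(start, num_replicas, 2)]


odd_pairs = select_replica_pairs(loop_count=1, num_replicas=6)
even_pairs = select_replica_pairs(loop_count=2, num_replicas=6)
```

Alternating between the two pairings lets an output value move across the whole temperature range over successive loops, even though each block takes part in at most one pair per loop.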
In the following S334, the search section 220 acquires an exchange determination probability Re of Formula 11 as a determination criterion for exchanging the latest output value ki_f stored in the parallel processing memory 24 for each group variable xi between the plurality of sets of blocks 26 selected in the most recent S333 as shown in
In the following S335, the search section 220 determines the presence or absence of a set of blocks 26 in which the exchange determination probability Re acquired in the most recent S334 exceeds one (that is, Re>1). As a result, when an affirmative determination is made because the exchange determination probability Re between at least one set of blocks 26 exceeds one, it is determined that an exchange condition based on the energy evaluation value Ei(k) is satisfied, and the processing flow proceeds to S336.
In S336, the search section 220 exchanges the latest output values ki_f for each group variable xi between at least one set of blocks 26 in which the exchange determination probability Re exceeds one (examples of a part (a) and a part (b) of
On the other hand, as shown in
In S338, the search section 220 compares the exchange determination probability Re acquired in the most recent S334 with a uniformly distributed random number probability Rr. At this time, the random number probability Rr is defined as a uniform random number, that is, a random number generated in the fractional range of 0 to 1 with a uniform distribution. Therefore, the search section 220 in S338 determines the presence or absence of a set of blocks 26 in which the exchange determination probability Re exceeds the random number probability Rr.
When an affirmative determination is made in S338, it is determined that another exchange condition based on the energy evaluation value Ei(k) is satisfied, and the processing flow proceeds to S339. In S339, the search section 220 exchanges the latest output values ki_f for each of all the group variables xi between at least one set of blocks 26 exceeding the random number probability Rr even if the exchange determination probability Re is less than one (examples of a part (c) and a part (d) of
As shown in
When a negative determination is made in S340, the processing flow returns to S320 as shown in
In the third embodiment described so far, in each block 26 set as a replica of a different temperature Tq, the output value ki_f is updated from the solution candidate k for each thread 28 for which the evaluation value difference δEi(k, kp) in the energy evaluation value Ei(k) from before the update processing and the transition probability Pi(k) corresponding to the difference δEi(k, kp) are acquired. Then, in the third embodiment, the output values ki_f for which the exchange condition based on the energy evaluation value Ei(k) is satisfied are exchanged between the blocks 26 of the adjacent temperatures Tq in accordance with the replica exchange method. Accordingly, the search for the output value ki_f with high accuracy can be completed in a short time by the exchange processing between the blocks 26 in which the update processing of the output value ki_f has been individually performed. Therefore, the third embodiment can contribute to both an increase in the speed of the solving processing and an improvement in the solving accuracy.
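The exchange between blocks of adjacent temperatures can be sketched as follows. The exchange determination probability used here is the form commonly used in replica exchange, exp((1/Tq - 1/Tq+1)(Eq - Eq+1)), which is an assumption standing in for Formula 11. The function names, the list-based state layout, and the swapping of the stored energies together with the output values are likewise hypothetical.

```python
import math
import random
from typing import Callable, Sequence


def exchange_probability(e_q: float, e_next: float, t_q: float, t_next: float) -> float:
    """Assumed criterion: exp((1/T_q - 1/T_{q+1}) * (E_q - E_{q+1}))."""
    exponent = (1.0 / t_q - 1.0 / t_next) * (e_q - e_next)
    return math.exp(min(exponent, 700.0))  # cap the exponent to avoid float overflow


def exchange_step(
    states: list[list[int]],
    energies: list[float],
    temperatures: Sequence[float],
    pairs: Sequence[tuple[int, int]],
    draw: Callable[[], float],
) -> int:
    """Swap output values between the selected replica pairs; return the swap count."""
    swaps = 0
    for q, q_next in pairs:  # replica indices are one-based
        a, b = q - 1, q_next - 1
        r_e = exchange_probability(energies[a], energies[b], temperatures[a], temperatures[b])
        # Exchange when R_e exceeds one, or otherwise when it exceeds a uniform draw.
        if r_e > 1.0 or r_e > draw():
            states[a], states[b] = states[b], states[a]
            energies[a], energies[b] = energies[b], energies[a]
            swaps += 1
    return swaps


states = [[0, 1], [2, 3]]      # output values of all group variables, per replica
energies = [5.0, 1.0]          # the colder replica currently holds the worse state
temperatures = [1.0, 2.0]
swapped = exchange_step(states, energies, temperatures, pairs=[(1, 2)], draw=random.random)
```

In this example the colder replica holds the higher energy, so the assumed criterion exceeds one and the exchange takes place without consulting the random number.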
Although a plurality of embodiments have been described above, the present disclosure is not to be construed as being limited to these embodiments, and can be applied to various embodiments and combinations without departing from the gist of the present disclosure.
The dedicated computer constituting the host processing computer 10 and/or the parallel processing computer 20 in a modification of the first to third embodiments may include at least one of a digital circuit or an analog circuit as a processor. Here, the digital circuit is, for example, at least one type of an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a system on a chip (SOC), a programmable gate array (PGA), a complex programmable logic device (CPLD), or the like. Such a digital circuit may include a memory storing a program. The memory may be a non-transitory computer readable storage medium.
In a modification of the first to third embodiments, each of the computers 10 and 20 may be implemented in a form of an individual or integrated semiconductor unit (for example, a semiconductor chip or the like). In a modification of the first to third embodiments, the functions of the host processing computer 10 may be integrated into the parallel processing computer 20. In a modification of the first to third embodiments, K=2 solution candidates k may be expressed by a single bit index k.
In a modification of the first to third embodiments, the energy evaluation value Ei(k) may be acquired every time S23 is executed. In a modification of the third embodiment, steps equivalent to S26 to S28 of the first embodiment may be added between S325 and S332. In a modification of the third embodiment, the latest output value ki_f acquired for each group variable xi by the block 26 in which the replica temperature Tq is the minimum temperature Tmin and stored in the parallel processing memory 24 may be determined as a search result in S341.
The present specification discloses a plurality of technical ideas listed below and a plurality of combinations thereof.
A processing system includes a parallel processing processor in which a plurality of threads are constructed for each of a plurality of blocks, in which a combination of binary variables is optimized under a one-hot constraint, and when a group variable is defined with a combination pattern satisfying the one-hot constraint as a solution candidate for each of groups of the binary variables, the parallel processing processor is configured to execute assigning the solution candidate of the group variable for each of the threads in each of the blocks, searching for an output value of the group variable on the basis of an energy evaluation value for the solution candidate of the group variable assigned for each of the threads in each of the blocks, and outputting the output value of all the group variables having been searched.
In the processing system according to Technical idea 1, assigning the solution candidate includes assigning, for each of the threads, the solution candidate in which the combination pattern satisfying the one-hot constraint is expressed as an integer by a multi bit index for each of the groups of the binary variables in each of the blocks.
In the processing system according to Technical idea 1 or 2, assigning the solution candidate includes repeating processing of assigning the solution candidate of the same group variable for each of the threads for all the group variables in each of the blocks.
In the processing system according to any one of Technical ideas 1 to 3, searching for the output value includes searching for the output value by update processing based on the energy evaluation value and a transition probability for the solution candidate of the group variable assigned to each of the threads in each of the blocks.
In the processing system according to Technical idea 4, searching for the output value includes updating the output value from the solution candidate for each of the threads in which a difference in the energy evaluation value from before the update processing and the transition probability corresponding to the difference are acquired in accordance with simulated annealing in each of the blocks.
In the processing system according to Technical idea 4, searching for the output value includes updating the output value from the solution candidate for each of the threads in which the difference in the energy evaluation value from before the update processing and the transition probability corresponding to the difference are acquired in each of the blocks set as a replica of different temperatures, and exchanging the output values for which an exchange condition based on the energy evaluation value is satisfied between the blocks of adjacent temperatures in accordance with a replica exchange method.
In the processing system according to Technical idea 5 or 6, searching for the output value includes acquiring the solution candidate in which the difference in the energy evaluation value is the largest in a negative direction as an update value for searching for the output value in each of the blocks.
In the processing system according to Technical idea 7, searching for the output value includes comparing, in each of the blocks, an integrated probability obtained by integrating the transition probability with a uniformly distributed random number probability for a limited number of the solution candidates from a high probability side of the transition probability among the solution candidates in which the difference in the energy evaluation value is positive, and continuing to search for the output value by using, as the update value, the solution candidate of the transition probability adopted as the uniformly distributed random number probability among the limited number of the solution candidates in a case where the integrated probability exceeds the uniformly distributed random number probability.
The processing system according to any one of Technical ideas 1 to 8 further includes a host processing processor together with the parallel processing processor, in which the host processing processor is configured to execute inputting, to the parallel processing processor, the group variable in which the combination pattern satisfying the one-hot constraint for each of the groups of the binary variables is the solution candidate, and outputting a solution in which the combination pattern of the binary variables is optimized to satisfy the one-hot constraint for each of the groups by mapping the output values of all the group variables output from the parallel processing processor.
Note that Technical ideas 1 to 9 above may be implemented in the form of a method and a program.
Number | Date | Country | Kind |
---|---|---|---|
2022-190585 | Nov 2022 | JP | national |