DEVICE AND METHOD FOR SEARCHING COMPOUND

Information

  • Patent Application
  • 20200135295
  • Publication Number
    20200135295
  • Date Filed
    September 11, 2019
  • Date Published
    April 30, 2020
Abstract
A device including: a defining unit to define a lattice space that is a collection of lattices where compound groups are sequentially arranged; a limiting unit; an assigning unit; an arithmetic unit; a judging unit; and a controlling unit to cause the limiting unit to execute expansion of the limited lattice space, the assigning unit to execute assignment of the bits to the lattice points included in the limited lattice space after the expansion, and the arithmetic unit to execute calculation of the minimum energy, in a case where the judging unit judges that any of the compound groups assigned to the lattice points is arranged on the outermost edge, wherein the device is a device for searching the compound, in which the compound groups are linked with one another.
Description
CROSS-REFERENCE TO RELATED APPLICATION

This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2018-201591, filed on Oct. 26, 2018, the entire contents of which are incorporated herein by reference.


FIELD

The embodiments discussed herein relate to a method and device for searching a compound.


BACKGROUND

A protein is a chain polymer where amino acids are linked one-dimensionally without branching. A protein forms a certain conformation (three-dimensional shape) by folding a chain polymer thereof. A conformation of a protein is determined by a sequence of amino acids.


A conformation of a protein is deeply related to functions of the protein. A molecule-recognition function of a protein is expressed by specifically binding a certain region within a conformation thereof to a certain molecule. Therefore, it is important to determine a conformation of a protein to understand functions of the protein.


For example, a conformation of a protein can be determined by X-ray crystallography or nuclear magnetic resonance spectroscopy (NMR). However, it takes a long time to determine a conformation of one protein by X-ray crystallography or NMR. Moreover, X-ray crystallography requires that a single crystal of one kind of protein be created first. When the single crystal cannot be created, X-ray crystallography cannot be used to determine a conformation of the protein. NMR can determine a conformation of a protein in an aqueous solution without crystallizing the protein, but a large quantity of information related to a conformation of the protein cannot be obtained when the protein is a large protein.


Meanwhile, a sequence of amino acids of a protein can be relatively easily determined from genetic information or the protein itself, even when a conformation of the protein is unknown.


Accordingly, there have been attempts to predict a conformation of a protein from a sequence of amino acids. For example, there is a method for determining folding of a protein according to the diamond encoding method. This method embeds positions of chain amino acids in a diamond lattice, and can express a three-dimensional structure (conformation). Energy of the conformation determined by the above-described method can be calculated, for example, using the Ising model. To solve the Ising model, for example, an annealing machine is used. One example of the background technology is disclosed in R. Babbush et al., Construction of Energy Functions for Lattice Heteropolymer Models: A Case Study in Constraint Satisfaction Programming and Adiabatic Quantum Optimization, arXiv:quant-ph/1211.3422v2 (https://arxiv.org/abs/1211.3422).


SUMMARY

According to one aspect of the present disclosure, a device for searching a compound includes: a defining unit configured to define a lattice space that is a collection of lattices where a plurality of compound groups are sequentially arranged; a limiting unit configured to, in the case where any of the compound groups is arranged in any of the lattices of the lattice space, followed by arranging a next compound group in the lattice space, generate a limited lattice space that is a space created by eliminating, from the lattice space, undesirable regions for the next compound group to be arranged; an assigning unit configured to assign a bit to each of lattice points, to which the compound groups can be arranged, in the limited lattice space; an arithmetic unit configured to perform a ground state search on an Ising model obtained through conversion based on restriction conditions related to each of the lattice points according to simulated annealing, to thereby calculate minimum energy of the Ising model; a judging unit configured to judge whether any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space or not; and a controlling unit configured to cause the limiting unit to execute expansion of the limited lattice space, cause the assigning unit to execute assignment of the bits to the lattice points included in the limited lattice space after the expansion, and cause the arithmetic unit to execute calculation of the minimum energy of the Ising model, in the case where the judging unit judges that any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space, wherein the device is a device for searching the compound, in which a plurality of the compound groups are linked with one another.


According to another aspect of the present disclosure, a method for searching a compound includes: defining a lattice space that is a collection of lattices where a plurality of compound groups are sequentially arranged; in the case where any of the compound groups is arranged in any of the lattices of the lattice space, followed by arranging a next compound group in the lattice space, generating a limited lattice space that is a space created by eliminating, from the lattice space, undesirable regions for the next compound group to be arranged; assigning a bit to each of lattice points, to which the compound groups can be arranged, in the limited lattice space; performing a ground state search on an Ising model obtained through conversion based on restriction conditions related to each of the lattice points according to simulated annealing, to thereby calculate minimum energy of the Ising model; judging whether any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space or not; and executing expansion of the limited lattice space, assignment of the bits to the lattice points included in the limited lattice space after the expansion, and calculation of the minimum energy of the Ising model, in the case where it is judged that any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space, wherein the method is a method for allowing a computer to search the compound in which a plurality of the compound groups are linked with one another.


According to another aspect of the present disclosure, a program for searching a compound causes a computer to execute a method for searching a compound in which a plurality of compound groups are linked with one another. The method includes: defining a lattice space that is a collection of lattices where a plurality of compound groups are sequentially arranged; in the case where any of the compound groups is arranged in any of the lattices of the lattice space, followed by arranging a next compound group in the lattice space, generating a limited lattice space that is a space created by eliminating, from the lattice space, undesirable regions for the next compound group to be arranged; assigning a bit to each of lattice points, to which the compound groups can be arranged, in the limited lattice space; performing a ground state search on an Ising model obtained through conversion based on restriction conditions related to each of the lattice points according to simulated annealing, to thereby calculate minimum energy of the Ising model; judging whether any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space or not; and executing expansion of the limited lattice space, assignment of the bits to the lattice points included in the limited lattice space after the expansion, and calculation of the minimum energy of the Ising model, in the case where it is judged that any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space.


The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.


It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.





BRIEF DESCRIPTION OF DRAWINGS


FIG. 1 is a graph depicting a relationship between the number of amino acid residues and the number of bits used;



FIG. 2A is a schematic view for searching a stable conformation of a protein (part 1);



FIG. 2B is a schematic view for searching the stable conformation of the protein (part 2);



FIG. 2C is a schematic view for searching the stable conformation of the protein (part 3);



FIG. 3A is a schematic view for describing the diamond encoding method (part 1);



FIG. 3B is a schematic view for describing the diamond encoding method (part 2);



FIG. 3C is a schematic view for describing the diamond encoding method (part 3);



FIG. 3D is a schematic view for describing the diamond encoding method (part 4);



FIG. 3E is a schematic view for describing the diamond encoding method (part 5);



FIG. 4 is a conceptual view illustrating a state where a lattice space is limited according to the disclosed technology;



FIG. 5 is a view illustrating an example where amino acid residues are arranged on the outermost edge (outer shell) of the limited lattice space;



FIG. 6 is a view illustrating a structural example of the disclosed device for searching a compound;



FIG. 7 is a flowchart for describing a method for searching a stable conformation of a protein using the device for searching a compound 10A of FIG. 6;



FIG. 8 is a view illustrating a case where each lattice within a radius r is Sr;



FIG. 9A is a view illustrating a collection of lattice points to which amino acid residues move in the case where a limited lattice space is not generated (part 1);



FIG. 9B is a view illustrating a collection of lattice points to which amino acid residues move in the case where a limited lattice space is not generated (part 2);



FIG. 9C is a view illustrating a collection of lattice points to which amino acid residues move in the case where a limited lattice space is not generated (part 3);



FIG. 9D is a view illustrating a collection of lattice points to which amino acid residues move in the case where a limited lattice space is not generated (part 4);



FIG. 10 is a view illustrating S1, S2, and S3 three-dimensionally;



FIG. 11A is a view illustrating one example of a state where space information is assigned to each of bits X1 to Xn (part 1);



FIG. 11B is a view illustrating one example of the state where space information is assigned to each of bits X1 to Xn (part 2);



FIG. 11C is a view illustrating one example of the state where space information is assigned to each of bits X1 to Xn (part 3);



FIG. 12 is a view for describing Hone;



FIG. 13 is a view for describing Hconn;



FIG. 14 is a view for describing Holap;



FIG. 15A is a view for describing Hpair (part 1);



FIG. 15B is a view for describing Hpair (part 2);



FIG. 16 is a view illustrating one example of a weight file;



FIG. 17 is a view illustrating a conceptual structure of an optimizing device (arithmetic unit) used for simulated annealing;



FIG. 18 is a block diagram of a circuit level of a transition controlling unit;



FIG. 19 is a diagram illustrating an operation flow of a transition controlling unit;



FIG. 20A is a view illustrating one example of an arrangement of amino acid residues in a diamond lattice space (Example 1);



FIG. 20B is a view illustrating one example where the diamond lattice space of FIG. 20A is expanded (Example 1);



FIG. 21 is a view illustrating another structural example of the disclosed device for searching a compound (Example 2);



FIG. 22 is a flowchart illustrating another method for searching a stable structure of a protein using the device for searching a compound 10B of FIG. 21 (Example 2);



FIG. 23A is a view illustrating one example of an arrangement of amino acid residues in a diamond lattice space (Example 2);



FIG. 23B is a view illustrating one example where the diamond lattice space of FIG. 23A is expanded (Example 2);



FIG. 23C is a view illustrating a state where Nos. 1 to 4 amino acid residues are fixed in the expanded diamond lattice space of FIG. 23B (Example 2);



FIG. 24A is a view illustrating one example of an arrangement of amino acid residues in a diamond lattice space (Example 3);



FIG. 24B is a view illustrating one example where the diamond lattice space of FIG. 24A is expanded (Example 3);



FIG. 24C is a view illustrating a state where Nos. 1 to 4 amino acid residues are fixed in the expanded diamond lattice space of FIG. 24B (Example 3);



FIG. 25A is a view illustrating one example of an arrangement of amino acid residues in a diamond lattice space (Example 4);



FIG. 25B is a view illustrating one example where the diamond lattice space of FIG. 25A is expanded (Example 4);



FIG. 26A is a view illustrating one example of an arrangement of amino acid residues in a diamond lattice space (Example 5);



FIG. 26B is a view illustrating one example where the diamond lattice space of FIG. 26A is expanded (Example 5);



FIG. 26C is a view illustrating a state where Nos. 1 to 4 amino acid residues are fixed in the expanded diamond lattice space of FIG. 26B (Example 5);



FIG. 27A is a view illustrating one example of an arrangement of amino acid residues in a diamond lattice space (Example 6);



FIG. 27B is a view illustrating one example where the diamond lattice space of FIG. 27A is expanded (Example 6);



FIG. 27C is a view illustrating a state where Nos. 1 to 4 amino acid residues are fixed in the expanded diamond lattice space of FIG. 27B (Example 6);



FIG. 28 is a diagram summarizing relationships of Examples 1 to 6;



FIG. 29 is a view illustrating another structural example of the disclosed device for searching a compound;



FIG. 30 is a flowchart of a modified example of the steps S101 to S107 of the flowchart of FIG. 7;



FIG. 31 is a view for describing the limit in an alignment of amino acid residues when a straight chain number limiting parameter M is set (part 1);



FIG. 32 is a view for describing the limit in an alignment of amino acid residues when a straight chain number limiting parameter M is set (part 2);



FIG. 33 is a flowchart of a modified example of the steps S101 to S107 of the flowchart of FIG. 7;



FIG. 34 is a view for describing the maximum space when a straight chain number limiting parameter M is set;



FIG. 35 is a graph comparing the number of bits used between Comparative Example and Referential Examples; and



FIG. 36 is a view describing one example of an effect of reducing the number of bits.





DESCRIPTION OF EMBODIMENTS

The disclosed device for searching a compound is a compound search device for searching a compound in which a plurality of compound groups are linked with one another.


The device for searching a compound includes at least a defining unit, a limiting unit, an assigning unit, an arithmetic unit, a judging unit, and a controlling unit.


The defining unit is configured to define a lattice space that is a collection of lattices where a plurality of compound groups are sequentially arranged.


The limiting unit is configured to, in the case where any of the compound groups is arranged in any of the lattices of the lattice space, followed by arranging a next compound group in the lattice space, generate a limited lattice space that is a space created by eliminating, from the lattice space, undesirable regions for the next compound group to be arranged.


The assigning unit is configured to assign a bit to each of lattice points, to which the compound groups can be arranged, in the limited lattice space.


The arithmetic unit is configured to perform a ground state search on an Ising model obtained through conversion based on restriction conditions related to each of the lattice points according to simulated annealing, to thereby calculate minimum energy of the Ising model.


The judging unit is configured to judge whether any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space or not.


The controlling unit is configured to cause the limiting unit to execute expansion of the limited lattice space, cause the assigning unit to execute assignment of the bits to the lattice points included in the limited lattice space after the expansion, and cause the arithmetic unit to execute calculation of the minimum energy of the Ising model, in the case where the judging unit judges that any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space.
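The ground state search assigned to the arithmetic unit can be pictured with a small software sketch. The Python below is an illustration only and is not the disclosed optimizing device: the energy form E(x) = -Σ W_ij x_i x_j - Σ b_i x_i over 0/1 bits, the geometric cooling schedule, the default parameter values, and the names energy and anneal are all assumptions made for this example.

```python
import math
import random
from typing import Dict, List, Tuple


def energy(x: List[int], W: Dict[Tuple[int, int], float], b: List[float]) -> float:
    """Assumed energy form: E(x) = -sum(W_ij x_i x_j) - sum(b_i x_i), bits in {0, 1}."""
    e = -sum(b[i] * x[i] for i in range(len(x)))
    e -= sum(w * x[i] * x[j] for (i, j), w in W.items())
    return e


def anneal(W: Dict[Tuple[int, int], float], b: List[float], steps: int = 4000,
           t_start: float = 5.0, t_end: float = 0.01,
           seed: int = 0) -> Tuple[List[int], float]:
    """Single-bit-flip simulated annealing; returns the lowest-energy state seen."""
    rng = random.Random(seed)
    n = len(b)
    x = [0] * n
    e = energy(x, W, b)
    best_x, best_e = x[:], e
    for k in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (k / (steps - 1))
        i = rng.randrange(n)
        x[i] ^= 1  # propose flipping one bit
        e_new = energy(x, W, b)
        # Metropolis rule: always accept downhill moves, sometimes accept uphill ones.
        if e_new <= e or rng.random() < math.exp((e - e_new) / t):
            e = e_new
            if e < best_e:
                best_x, best_e = x[:], e
        else:
            x[i] ^= 1  # reject the move and restore the bit
    return best_x, best_e
```

In this sketch W is keyed by index pairs (i, j) with i < j, and the returned energy plays the role of the minimum energy of the Ising model referred to above.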


The disclosed method for searching a compound is a method for searching a compound in which a plurality of compound groups are linked with one another.


The method for searching a compound allows a computer to perform a method including: defining a lattice space that is a collection of lattices where a plurality of compound groups are sequentially arranged; in the case where any of the compound groups is arranged in any of the lattices of the lattice space, followed by arranging a next compound group in the lattice space, generating a limited lattice space that is a space created by eliminating, from the lattice space, undesirable regions for the next compound group to be arranged; assigning a bit to each of lattice points, to which the compound groups can be arranged, in the limited lattice space; and performing a ground state search on an Ising model obtained through conversion based on restriction conditions related to each of the lattice points according to simulated annealing, to thereby calculate minimum energy of the Ising model.


In the method for searching a compound, moreover, the computer judges whether any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space or not, and executes expansion of the limited lattice space, assignment of the bits to the lattice points included in the limited lattice space after the expansion, and calculation of the minimum energy of the Ising model, in the case where it is judged that any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space.


The disclosed program for searching a compound is a program for causing a computer to execute a method for searching a compound in which a plurality of compound groups are linked with one another.


The method includes: defining a lattice space that is a collection of lattices where a plurality of compound groups are sequentially arranged; in the case where any of the compound groups is arranged in any of the lattices of the lattice space, followed by arranging a next compound group in the lattice space, generating a limited lattice space that is a space created by eliminating, from the lattice space, undesirable regions for the next compound group to be arranged; assigning a bit to each of lattice points, to which the compound groups can be arranged, in the limited lattice space; and performing a ground state search on an Ising model obtained through conversion based on restriction conditions related to each of the lattice points according to simulated annealing, to thereby calculate minimum energy of the Ising model.


In the program for searching a compound, moreover, the computer judges whether any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space or not, and executes expansion of the limited lattice space, assignment of the bits to the lattice points included in the limited lattice space after the expansion, and calculation of the minimum energy of the Ising model, in the case where it is judged that any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space.


Since there is a restriction in hardware of an annealing machine for solving an Ising model, there are restrictions in the number of arithmetic bits or quantum bits the annealing machine can handle.


Meanwhile, the number of bits used for solving a problem of folding of a protein increases exponentially relative to a scale of the protein (the number of amino acid residues) as demonstrated in the graph of FIG. 1.


As described above, a scale of a problem to be solved is limited by the restriction in the number of bits handled by hardware, and therefore search targets of amino acids cannot be expanded.


The present disclosure has an object to provide a device, method, and program for searching a compound, which can appropriately suppress the number of arithmetic bits or quantum bits used for searching a predetermined compound, and can search a compound having a large molecular weight.


According to one aspect of the present disclosure, provided is a device for searching a compound, which can appropriately suppress the number of arithmetic bits or quantum bits used for searching a predetermined compound, and can search a compound having a large molecular weight.


According to another aspect of the present disclosure, provided is a method for searching a compound, which can appropriately suppress the number of arithmetic bits or quantum bits used for searching a predetermined compound, and can search a compound having a large molecular weight.


According to another aspect of the present disclosure, provided is a program for searching a compound, which can appropriately suppress the number of arithmetic bits or quantum bits used for searching a predetermined compound, and can search a compound having a large molecular weight.


Before describing the details of the disclosed technology, a method for determining folding of a protein that is a compound according to the diamond encoding method will be described.


A search of a stable conformation of a protein is typically performed in the following manner.


First, coarse graining of a protein is performed (FIG. 2A). For example, the coarse graining of a protein is performed by coarse graining atoms 2 constituting the proteins into amino acid residue units 1A, 1B, and 1C.


Next, a structure search is performed using the created coarse-grained model (FIG. 2B). The structure search is performed according to the diamond encoding method described later.


Next, the coarse-grained model is returned back to the whole atoms (FIG. 2C).


The diamond encoding method is a method where the amino acids of a chain are embedded at positions on a diamond lattice, and can represent a three-dimensional structure. For the sake of simplicity, a two-dimensional structure is described as an example.


Used as an example is a linear pentapeptide having a structure illustrated in FIG. 3A, where 5 amino acid residues are linked, when the structure is represented by a linear structure. In FIGS. 3A to 3E, a number in each circle is a number of the amino acid residue in the linear pentapeptide.


First, an amino acid residue of No. 1 is arranged at the center of a diamond lattice as illustrated in FIG. 3A. Positions where an amino acid residue of No. 2 can be arranged are then limited to positions next to the center as illustrated in FIG. 3B (the positions numbered as 2).


Next, in FIG. 3C, positions where an amino acid residue of No. 3, which is bonded to the amino acid residue of No. 2, can be arranged are limited to positions next to the positions numbered as 2 in FIG. 3B (the positions numbered as 3).


Next, in FIG. 3D, positions where an amino acid residue of No. 4, which is bonded to the amino acid residue of No. 3, can be arranged are limited to positions next to the positions numbered as 3 in FIG. 3C (the positions numbered as 4).


Next, in FIG. 3E, positions where an amino acid residue of No. 5, which is bonded to the amino acid residue of No. 4, can be arranged are limited to positions next to the positions numbered as 4 in FIG. 3D (the positions numbered as 5).


In the manner as described above, a three-dimensional structure can be expressed by linking the positions where amino acid residues can be arranged.
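The stepwise rule above (each residue may only occupy positions next to those available to the previous residue) can be sketched in three dimensions. The Python below is a minimal sketch under stated assumptions: the four bond vectors and the integer coordinates are one common way to write a diamond lattice, and the names BONDS, neighbors, and candidate_positions are hypothetical. The sketch makes no attempt to exclude sites that an earlier residue may already occupy.

```python
from typing import List, Set, Tuple

Point = Tuple[int, int, int]

# Assumed bond vectors of a diamond lattice for sites with even coordinates;
# sites with odd coordinates use the negated vectors.
BONDS: List[Point] = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]


def neighbors(p: Point) -> List[Point]:
    """Return the four lattice points bonded to p."""
    sign = 1 if p[0] % 2 == 0 else -1
    return [(p[0] + sign * dx, p[1] + sign * dy, p[2] + sign * dz)
            for dx, dy, dz in BONDS]


def candidate_positions(n: int) -> List[Set[Point]]:
    """Element i-1 holds the positions where residue No. i can be arranged."""
    layers: List[Set[Point]] = [{(0, 0, 0)}]  # residue No. 1 sits at the center
    for _ in range(n - 1):
        # Residue No. i+1 must sit next to some position allowed for residue No. i.
        layers.append({q for p in layers[-1] for q in neighbors(p)})
    return layers
```

Calling candidate_positions(5) gives the candidate sets for the pentapeptide used in the example; the sets grow quickly with the residue number, which is what drives the number of bits discussed later.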


Since amino acid residues may be bonded into a straight chain, a radius (n) of a diamond lattice space is set according to the number (n) of amino acid residues to be bonded.


However, amino acid residues are typically rarely arranged into a straight chain in a protein due to interaction between amino acid residues.


Therefore, a conformation of a protein can be determined without matching a radius r of a diamond lattice space with the number (n) of amino acid residues as illustrated in FIG. 4.


According to the disclosed technology, therefore, in the case where any of the compound groups is arranged in any of the lattices of the lattice space, followed by arranging a next compound group in the lattice space, a limited lattice space that is a space created by eliminating, from the lattice space, undesirable regions for the next compound group to be arranged, is generated, and a bit is assigned to each of lattice points, to which the compound groups can be arranged in the limited lattice space. This technology may be referred to as “Referential Example” hereinafter. As a result, the number of arithmetic bits or quantum bits used for a search of a predetermined compound is suppressed, and a compound having a large molecular weight can be searched.


In this case, however, the arrangement of the compound group is limited by the outermost edge of the limited lattice space if the limited lattice space is too small. As a result, an appropriate conformation may not be obtained. Specifically, in the case where any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space, the lattice space may be excessively limited.


For example, a case where 5 amino acid residues are aligned in a diamond lattice space having a radius of 3 as illustrated in FIG. 5 is considered. Since the third amino acid residue is arranged on the outermost edge, the fourth amino acid residue is not arranged in the directions indicated with the arrows. However, a more stable conformation may be obtained when the fourth amino acid residue is arranged in any of the directions of the arrows.


In the disclosed technology, therefore, the limited lattice space is expanded further when any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space. As a result, the number of arithmetic bits or quantum bits used for a search of a predetermined compound can be appropriately suppressed, and a compound having a large molecular weight can be searched.


In the present specification, the term “outermost edge” means an outer shell of a diamond lattice space, and includes both the outermost surface and the outermost side.
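The judge-and-expand behavior described above can be sketched as a small control loop. The Python below is a minimal sketch, not the disclosed controlling unit: the callback solve is a hypothetical stand-in for limiting the space, assigning bits, and annealing, and it is assumed to return the shell index occupied by each residue. Treating the outermost edge as shell index L, expanding by one shell at a time, and stopping once L reaches the number of residues n are all assumptions of this example.

```python
from typing import Callable, List, Tuple


def on_outermost_edge(shell_of_residue: List[int], L: int) -> bool:
    """True if any residue sits in the outermost shell of the limited space."""
    return any(r == L for r in shell_of_residue)


def search_with_expansion(solve: Callable[[int], List[int]],
                          n: int, L: int) -> Tuple[int, List[int]]:
    """Re-run the search with a larger limited space while a residue touches the edge."""
    while True:
        shells = solve(L)  # hypothetical: limit space, assign bits, anneal
        # Stop when no residue touches the edge, or when the space is already
        # as large as the unrestricted one (assumed to correspond to L = n).
        if L >= n or not on_outermost_edge(shells, L):
            return L, shells
        L += 1  # assumed expansion step of one shell
```

A caller would pass its own solver, for example search_with_expansion(my_solver, n=8, L=3), and receive the final space size together with the last conformation found.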


For example, the compound groups are amino acid residues.


In the case where the compound groups are amino acid residues, examples of the compound include a protein.


An amino acid that is a base of an amino acid residue may be a natural amino acid or a synthetic amino acid. Examples of the natural amino acid include alanine, arginine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, valine, β-alanine, and β-phenylalanine. Examples of the synthetic amino acid include para-benzoylphenylalanine.


The number of amino acid residues in the protein is not particularly limited and may be appropriately selected depending on the intended purpose. For example, the number thereof may be from about 10 to about 30, or about several hundreds.


For example, the number thereof may be from about 10 to about 30 as long as the protein is a protein for middle molecule drug discovery.


One example of the disclosed technology will be described using examples of a device, flowcharts, etc., hereinafter.


A structural example of a device for searching a compound is illustrated in FIG. 6.


The device for searching a compound 10A illustrated in FIG. 6 includes a compound group number counting unit 11, a defining unit 12, a limiting unit 13, an assigning unit 14, an H generating unit 15, a weight extracting unit 16, a weight file creating unit 17, an arithmetic unit 18, a judging unit 19, a controlling unit 20, and an outputting unit 21.


A flowchart for describing a method for searching a stable conformation of a protein using the device for searching a compound 10A of FIG. 6 is illustrated in FIG. 7.


<Step S101>

First, the number (n) of amino acid residues (compound groups) constituting the input protein (an alignment of the amino acid residues) is counted by the compound group number counting unit 11 (S101).


<Step S102>

Next, a lattice space that is a collection of lattices to which a plurality of the amino acid residues are sequentially arranged is defined by the defining unit 12 based on the number (n) of the amino acid residues (S102).


One example of the definition of the lattice space will be described. The lattice space is three-dimensional, but a two-dimensional lattice space is described as an example for simplicity.


First, a collection of lattices within a radius r in a diamond lattice space is determined as a shell, and each lattice point is determined as Sr. Each lattice point Sr is represented as in FIG. 8.


In the case where a limited lattice space is not generated, unlike in the disclosed technology, for example, collections V1 to V5 of lattice points to which amino acid residues of Nos. 1 to 5 are moved are represented as in FIGS. 9A to 9D.


In FIG. 9A, V1=S1, and V2=S2.


In FIG. 9B, V3=S3.


In FIG. 9C, V4=S2, S4.


In FIG. 9D, V5=S3, S5.


Note that, when S1, S2, and S3 are represented in three dimensions, S1, S2, and S3 are represented as in FIG. 10. In FIG. 10, A=S1, B=S2, and C=S3.


In the case where a limited lattice space is not generated, a space Vi used for the i-numbered amino acid residue in a protein having n amino acid residues is represented by the following formula.







$$V_i = \bigcup_{r \in J} S_r$$






In the formula above, i={1, 2, 3, . . . n}.


In case of an odd numbered (i=odd number) amino acid residue, J={1, 3, . . . i}. In case of an even numbered (i=even number) amino acid residue, J={2, 4, . . . i}.
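The index set J and the union over shells can be written out directly. The Python below is a minimal sketch that follows the formula as stated; the names shell_indices and reachable_space, and the representation of the shells as a dictionary from r to a set of lattice points, are assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Set


def shell_indices(i: int) -> List[int]:
    """J for the i-numbered residue: {1, 3, ..., i} if i is odd, {2, 4, ..., i} if even."""
    start = 1 if i % 2 == 1 else 2
    return list(range(start, i + 1, 2))


def reachable_space(i: int, shells: Dict[int, Set[Hashable]]) -> Set[Hashable]:
    """V_i as the union of the shells S_r over r in J."""
    space: Set[Hashable] = set()
    for r in shell_indices(i):
        space |= shells[r]
    return space
```

For example, shell_indices(5) yields the odd shells 1, 3, and 5, and shell_indices(4) yields the even shells 2 and 4.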


<Steps S103 and S104>

In the disclosed technology, meanwhile, in the case where any of the compound groups is arranged in any of the lattices of the lattice space, followed by arranging a next compound group in the lattice space, a limited lattice space that is a space created by eliminating, from the lattice space, undesirable regions for the next compound group to be arranged, is generated by the limiting unit 13. For example, a space limiting parameter L (L<n) representing a size of a diamond lattice space is set (S103), and a collection of lattice points to which the i-numbered amino acid residue is moved under the limit of the space limiting parameter L is determined as Vi (S104).


Vi that is the space for the i-numbered amino acid residue is represented by the following formula.







$$V_i = \bigcup_{r \in J} S_r$$






In the formula above, i={1, 2, 3, . . . n}.


When the space limiting parameter L is an even number, and i<L:

    • J={1, 3, . . . i} in case of an odd numbered (i=odd number) amino acid residue.
    • J={2, 4, . . . i} in case of an even numbered (i=even number) amino acid residue.


When the space limiting parameter L is an even number and i>L:

    • J={1, 3, . . . L−1} in case of an odd numbered (i=odd number) amino acid residue.
    • J={2, 4, . . . L} in case of an even numbered (i=even number) amino acid residue.


When the space limiting parameter L is an odd number and i<L:

    • J={1, 3, . . . i} in case of an odd numbered (i=odd number) amino acid residue.
    • J={2, 4, . . . i} in case of an even numbered (i=even number) amino acid residue.


When the space limiting parameter L is an odd number and i>L:

    • J={1, 3, . . . L} in case of an odd numbered (i=odd number) amino acid residue.
    • J={2, 4, . . . L−1} in case of an even numbered (i=even number) amino acid residue.


As described above, a space to which an amino acid residue is arranged is determined.
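
As a non-limiting illustration, the selection of J described above may be sketched in Python as follows. The names `shell_indices` and `limited_space`, and the representation of the shells as a dictionary, are assumptions made only for this sketch. The case i=L is not listed above; the sketch treats it in the same manner as i<L, which is likewise an assumption. The sketch follows the sets J as written in the text.

```python
def shell_indices(i, L=None):
    """Return the shell radii J whose union gives V_i.

    i: 1-based number of the amino acid residue.
    L: space limiting parameter, or None when no limited lattice space is used.
    Assumption of this sketch: i == L is handled like i < L.
    """
    top = i if L is None else min(i, L)
    if top % 2 != i % 2:
        # Keep the parity of i: odd residues use 1, 3, ...; even residues use 2, 4, ...
        top -= 1
    start = 1 if i % 2 else 2
    return list(range(start, top + 1, 2))


def limited_space(i, shells, L=None):
    """V_i as the union of the shells S_r for r in J (shells: dict r -> set of points)."""
    points = set()
    for r in shell_indices(i, L):
        points |= shells[r]
    return points
```

For example, with L=4 the No. 7 amino acid residue uses only the shells S1 and S3, whereas without the limit it would use S1, S3, S5, and S7.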


<Step S105>

Next, the assigning unit 14 assigns a bit to each of the lattice points at which the plurality of compound groups can be arranged in the limited lattice space. Specifically, spatial information is assigned to each of bits X1 to Xn (S105). As illustrated in FIGS. 11B to 11E, a bit that is 1 when an amino acid residue is present at that position and 0 when no amino acid residue is present is assigned to each position in the space in which each of the amino acid residues can be arranged. Note that, in FIGS. 11A to 11C, a plurality of Xi are assigned to amino acid residues 2 to 4, but in reality one bit Xi is assigned to one amino acid residue 1.


<Step S106>

Next, Hone, Hconn, Holap, and Hpair are set and an Ising model obtained through conversion based on restriction conditions related to each lattice point is created (S106).


Setting of Hone, Hconn, Holap, and Hpair is performed in each of a Hone generating unit 15A, a Hconn generating unit 15B, a Holap generating unit 15C, and a Hpair generating unit 15D of the H generating unit 15.


In the diamond encoding method, the entire energy can be expressed as follows.






E(x)=H=Hone+Hconn+Holap+Hpair


In the formula above, Hone is a restriction that each of the first to n-numbered amino acids is present only once.


Hconn is a restriction that the first to n-numbered amino acids are all linked with one another.


Holap is a restriction that the first to n-numbered amino acids are not overlapped with one another.


Hpair is a restriction representing an interaction between amino acids.


One example of each restriction is as follows.


Note that, in FIGS. 12 to 15 described below, X1 is a position to which an amino acid residue of No. 1 can be arranged.


X2 to X5 are positions to which an amino acid residue of No. 2 can be arranged.


X6 to X13 are positions to which an amino acid residue of No. 3 can be arranged.


X14 to X29 are positions to which an amino acid residue of No. 4 can be arranged.


One example of Hone is presented below.







$$H_{\mathrm{one}} = \lambda_{\mathrm{one}} \sum_{i=0}^{N-1} \; \sum_{x_a, x_b \in Q_i,\; a<b} x_a x_b$$









In the function above, Xa and Xb may be 1 or 0. Specifically, because only one of X2, X3, X4, and X5 may be 1 in FIG. 12, Hone is a penalty term whose energy increases when any two or more of X2, X3, X4, and X5 are 1, and which becomes 0 when only one of X2, X3, X4, and X5 is 1.


Note that, in the function above, λone is a weighting coefficient.
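
As a non-limiting illustration, Hone may be evaluated in Python as sketched below. The function name `h_one`, the dictionary representation of the bits, and the weight value are assumptions made only for this sketch. The bit numbering follows the text, in which X1 belongs to the amino acid residue of No. 1 and X2 to X5 belong to the amino acid residue of No. 2.

```python
from itertools import combinations


def h_one(x, groups, lam_one=1.0):
    """lam_one * sum over each group Q_i of sum_{a<b} x_a * x_b.

    x: dict mapping a bit number to 0 or 1.
    groups: list of bit-number lists, one list Q_i per amino acid residue.
    """
    return lam_one * sum(
        x[a] * x[b] for q in groups for a, b in combinations(q, 2)
    )


groups = [[1], [2, 3, 4, 5]]
one_hot = {1: 1, 2: 0, 3: 1, 4: 0, 5: 0}   # exactly one of X2 to X5 is 1
two_set = {1: 1, 2: 1, 3: 1, 4: 0, 5: 0}   # X2 and X3 are both 1
```

Note that the pairwise product is also 0 when no bit of a group is 1, so this term by itself only penalizes two or more bits being 1 for the same amino acid residue.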


One example of Hconn is presented below.







$$H_{\mathrm{conn}} = \lambda_{\mathrm{conn}} \left( N - 1 - \sum_{i=0}^{N-1} \; \sum_{x_d \in Q_i} \; \sum_{x_u \in \eta(x_d) \cap Q_{i+1}} x_d x_u \right)$$






In the function above, Xd and Xu may be 1 or 0. Specifically, Hconn is a formula in which energy decreases as long as X13, X6, or X7 is 1 when X2 is 1 in FIG. 13, and is a penalty term that becomes 0 when all of the amino acid residues are linked with one another.


Note that, in the function above, λconn is a weighting coefficient. For example, the relationship of λone>λconn is satisfied.
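
As a non-limiting illustration, Hconn may be evaluated in Python as sketched below. The function name `h_conn`, the small neighbor map `eta`, and the bit numbers used in the example are hypothetical and do not reproduce FIG. 13. In the sketch, the group following the last amino acid residue is treated as empty, which is an assumption about the upper limit of the sum.

```python
def h_conn(x, groups, eta, lam_conn=1.0):
    """lam_conn * (N - 1 - number of realized links between consecutive residues).

    x: dict mapping a bit number to 0 or 1.
    groups: list of bit-number lists Q_i, in the order of the amino acid residues.
    eta: dict mapping a bit number to the set of bit numbers on adjacent lattice points.
    """
    n = len(groups)
    links = 0
    for i in range(n - 1):               # no group follows the last residue
        next_group = set(groups[i + 1])
        for d in groups[i]:
            for u in eta.get(d, set()) & next_group:
                links += x[d] * x[u]
    return lam_conn * (n - 1 - links)


groups = [[1], [2, 3], [6, 7, 8]]
eta = {1: {2, 3}, 2: {1, 6, 7}, 3: {1, 8}}
linked = {1: 1, 2: 1, 3: 0, 6: 1, 7: 0, 8: 0}   # chain 1 -> 2 -> 6
broken = {1: 1, 2: 1, 3: 0, 6: 0, 7: 0, 8: 1}   # bit 8 is not adjacent to bit 2
```

With the linked chain both links are realized and the term is 0, whereas the broken chain leaves one link missing and the term equals λconn.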


One example of Holap is presented below.







$$H_{\mathrm{olap}} = \lambda_{\mathrm{olap}} \sum_{v \in V} \; \sum_{x_a, x_b \in \theta(v),\; a<b} x_a x_b$$









In the function above, Xa and Xb are 1 or 0. Specifically, Holap is a term generating a penalty when X14 is 1 with X2 being 1 in FIG. 14.


Note that, in the function above, λolap is a weighting coefficient.
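
As a non-limiting illustration, Holap may be evaluated in Python as sketched below. The function name `h_olap` and the lattice point labels are hypothetical. Here θ(v) is taken to be the collection of bits that correspond to the same lattice point v, and the pairing of X2 and X14 follows the description of FIG. 14.

```python
from itertools import combinations


def h_olap(x, theta, lam_olap=1.0):
    """lam_olap * sum over lattice points v of sum_{a<b in theta(v)} x_a * x_b.

    x: dict mapping a bit number to 0 or 1.
    theta: dict mapping a lattice point label to the bit numbers located on it.
    """
    return lam_olap * sum(
        x[a] * x[b]
        for bits in theta.values()
        for a, b in combinations(sorted(bits), 2)
    )


theta = {"v0": [1], "v1": [2, 14], "v2": [3]}
overlap = {1: 1, 2: 1, 3: 0, 14: 1}      # X2 and X14 occupy the same point
no_overlap = {1: 1, 2: 1, 3: 0, 14: 0}
```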


One example of Hpair is presented below.







$$H_{\mathrm{pair}} = \frac{1}{2} \sum_{i=0}^{N-1} \; \sum_{x_a \in Q_i} \; \sum_{x_b \in \eta(x_a)} P_{\omega(x_a)\omega(x_b)} \, x_a x_b$$










In the function above, Xa and Xb may be 1 or 0. Specifically, Hpair is a function in which energy decreases due to the interaction Pω(x1)ω(x15) between the amino acid residue of X1 and the amino acid residue of X15 when X15 is 1 with X1 being 1 in FIGS. 15A and 15B. The interaction Pω(x1)ω(x15) is determined by a combination of two amino acid residues. For example, the interaction Pω(x1)ω(x15) is determined with reference to the Miyazawa-Jernigan (MJ) matrix.
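
As a non-limiting illustration, Hpair may be evaluated in Python as sketched below. The function name `h_pair`, the type labels, and the interaction value −1.0 are hypothetical and are not values taken from the MJ matrix. The sketch assumes that the neighbor map `eta` is symmetric, so that the factor 1/2 counts each interacting pair once.

```python
def h_pair(x, groups, eta, omega, p):
    """0.5 * sum over bits a and their neighbors b of P[omega(a), omega(b)] * x_a * x_b.

    x: dict mapping a bit number to 0 or 1.
    groups: list of bit-number lists Q_i, one per amino acid residue.
    eta: symmetric dict mapping a bit number to neighboring bit numbers.
    omega: dict mapping a bit number to the type of its amino acid residue.
    p: dict mapping a (type, type) tuple to an interaction energy.
    """
    total = 0.0
    for q in groups:
        for a in q:
            for b in eta.get(a, ()):
                total += p[(omega[a], omega[b])] * x[a] * x[b]
    return 0.5 * total


groups = [[1], [15]]
eta = {1: {15}, 15: {1}}
omega = {1: "C", 15: "C"}
p = {("C", "C"): -1.0}                   # hypothetical interaction energy
```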


Next, H is calculated by synthesizing Hone, Hconn, Holap, and Hpair by the synthesizing unit 15E.


Next, the weighting coefficients (λone, λconn, and λolap) of the functions above are extracted by the weight extracting unit 16.


Next, a weight file corresponding to the extracted weight coefficient is created by the weight file creating unit 17. For example, the weight file is a matrix. In case of 2X1X2+4X2X3, for example, the weight file is a matrix file as illustrated in FIG. 16.


The following energy formula of the Ising model can be expressed by using the created weight file.







$$E(x) = -\sum_{\langle i, j \rangle} W_{ij} \, x_i x_j - \sum_{i} b_i x_i$$








In the function above, the states Xi and Xj may be 0 or 1, where 0 means absence and 1 means presence. Wij in the first term of the right side is a weighting coefficient.


The first term of the right side is the sum, over all selectable combinations of two neuron circuits from the whole neuron circuits without any omission or overlap, of the product of the states of the two neuron circuits and the weighting value.


Moreover, the second term of the right side is the sum of the product of the bias value and the state of each of the whole neuron circuits. bi is a bias value of the i-numbered neuron circuit.
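
As a non-limiting illustration, the energy of the Ising model may be evaluated in Python as sketched below. The function name `ising_energy` and the matrix layout are assumptions made only for this sketch and do not reproduce the weight file of FIG. 16. Under the sign convention of the energy formula, the expression 2X1X2+4X2X3 corresponds to W12=−2 and W23=−4 with all bias values being 0.

```python
def ising_energy(x, w, b):
    """E(x) = -sum_{i<j} w[i][j]*x[i]*x[j] - sum_i b[i]*x[i].

    Each unordered pair (i, j) is counted once, using 0-based indices.
    """
    n = len(x)
    pair_term = sum(w[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n))
    bias_term = sum(b[i] * x[i] for i in range(n))
    return -pair_term - bias_term


# 2*X1*X2 + 4*X2*X3 expressed with a symmetric weight matrix
w = [[0, -2, 0],
     [-2, 0, -4],
     [0, -4, 0]]
b = [0, 0, 0]
```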


<Step S107>

Next, the arithmetic unit 18 (annealing machine) executes a ground state search of the Ising model converted based on the restriction conditions related to each of the lattice points according to simulated annealing to thereby calculate the minimum energy of the Ising model (S107).


The arithmetic unit 18 (annealing machine) may be any of a quantum annealing machine, a semiconductor annealing machine using a semiconductor technology, or simulated annealing executed by software using a central processing unit (CPU) or a graphics processing unit (GPU), if the computer for use is a computer employing an annealing system for performing a ground state search of an energy function represented by the Ising model.


One example of simulated annealing and the arithmetic unit 18 (annealing machine) will be described below.


Simulated annealing (SA) is a kind of Monte Carlo method, that is, a method of finding a solution stochastically using random numerical values. In the description below, a problem of minimizing a value of an evaluation function to be optimized is taken as an example, and the value of the evaluation function is called energy. In case of maximization, the sign of the evaluation function may be reversed.


Starting with an initial state where one discrete value is assigned to each variable, a state that is close to the current state (e.g., a state where only one variable is changed) is selected from the current state (a combination of values of the variables), and the state transition thereto is considered. An energy change for the state transition is calculated, and whether the state transition is adopted to change the state, or the original state is retained without adopting the state transition, is determined stochastically depending on the calculated value. When the adoption probability of a case where energy decreases is selected to be larger than the adoption probability of a case where energy increases, state changes occur in the direction in which energy decreases on average, and it is expected that the state transits to a more appropriate state over time. Then, ultimately, it is possible to obtain an optimum solution or an approximate solution that gives energy close to an optimum value. If the case where energy decreases is adopted deterministically and the case where energy increases is not adopted, the energy decreases weakly monotonically with respect to time, but the change stops once a local solution is reached. Since there are a large number of local solutions in a discrete optimization problem as described above, the state is very likely to be trapped in a local solution that is not very close to an optimum value. Accordingly, it is important to determine stochastically whether to adopt the state transition.


It is proved that, in simulated annealing, the state reaches the optimum solution in the limit of infinite time (number of iterations) when the acceptance probability of the state transition is determined as follows.


(1) With respect to an energy change (energy decrease) value (−ΔE) along with the state transition, acceptance probability p of the state transition is determined by any of the following functions f( ).










$$p(\Delta E, T) = f(-\Delta E / T) \qquad \text{(Formula 1-1)}$$

$$f_{\mathrm{metro}}(x) = \min(1, e^{x}) \quad \text{(Metropolis method)} \qquad \text{(Formula 1-2)}$$

$$f_{\mathrm{Gibbs}}(x) = \frac{1}{1 + e^{-x}} \quad \text{(Gibbs method)} \qquad \text{(Formula 1-3)}$$







In the formula above, T is a parameter called a temperature value, which is changed as follows.


(2) The temperature value T is decreased logarithmically with respect to the number of iterations t, as represented by the following formula.









$$T = \frac{T_0 \log(c)}{\log(t + c)} \qquad \text{(Formula 2)}$$







In the formula above, T0 is an initial temperature value, and is desired to be sufficiently large depending on a problem.


In the case where acceptance probability represented by the formula of (1) is used, once the state reaches a steady state after sufficient iterations, occupancy probability of each state follows the Boltzmann distribution for a thermal equilibrium state in thermodynamics.
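
As a non-limiting illustration, the acceptance probability of (1) and the temperature schedule of (2) may be sketched in Python as follows. The function names and the default value c=2.0 are assumptions made only for this sketch; c is assumed to be larger than 1 so that log(c) is positive.

```python
import math


def f_metro(x):
    """Metropolis method: min(1, e^x), written to avoid overflow for large x."""
    return 1.0 if x >= 0 else math.exp(x)


def f_gibbs(x):
    """Gibbs method: 1 / (1 + e^(-x)), written in a numerically stable form."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def acceptance(delta_e, t, f=f_metro):
    """p(dE, T) = f(-dE / T) for an energy change dE at temperature T > 0."""
    return f(-delta_e / t)


def temperature(step, t0, c=2.0):
    """T = T0 * log(c) / log(step + c), for step = 0, 1, 2, ..."""
    return t0 * math.log(c) / math.log(step + c)
```

With the Metropolis method, a state transition that decreases energy is always accepted, and a state transition that increases energy by ΔE is accepted with the probability exp(−ΔE/T).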


As the temperature is gradually lowered from a high temperature, occupancy probability of a low energy state increases. Therefore, a low energy state is supposed to be obtained when the temperature is sufficiently reduced. The state as described above is very similar to the state change that occurs when a material is annealed. Therefore, the method described above is called simulated annealing. The stochastic occurrence of a state transition in which energy increases is equivalent to thermal excitation in physics.


An optimizing device (arithmetic unit 18) for performing simulated annealing is illustrated in FIG. 17. The descriptions below include a case where a plurality of candidates of state transitions are generated, but a transition candidate is generated one by one in the original basic simulated annealing.


An optimizing device 100 includes a state retaining unit 111 configured to retain a current state S (a plurality of state variable values). Moreover, the optimizing device 100 includes an energy calculating unit 112 configured to calculate an energy change value {−ΔEi} of each state transition when a state transition occurs from the current state S due to a change of any of the state variable values.


Moreover, the optimizing device 100 includes a temperature controlling unit 113 configured to control a temperature value T and a transition controlling unit 114 configured to control a state transition.


The transition controlling unit 114 is configured to stochastically determine whether any of the state transitions is adopted or not according to the correlation between the energy change value {−ΔEi} and the thermal excitation energy based on the temperature value T, the energy change value {−ΔEi}, and the random numerical value.


The transition controlling unit 114 is further subdivided. The transition controlling unit 114 includes a candidate generating unit 114a configured to generate candidates of state transition, and a judging unit 114b configured to stochastically judge on each candidate whether the state transition is allowed or not based on the energy change value {−ΔEi} and temperature value T thereof. The transition controlling unit 114 further includes a transition determining unit 114c configured to determine the candidate to be adopted among the allowed candidates, and a random number generating unit 114d configured to generate probability variables.


An operation of one iteration is as follows. First, the candidate generating unit 114a generates one or more candidates (candidate number {Ni}) of state transition from the current state S retained in the state retaining unit 111 to the next state. The energy calculating unit 112 calculates an energy change value {−ΔEi} for each of the state transitions listed as candidates using the current state S and the candidates of state transition. The judging unit 114b accepts the state transition with the acceptance probability of the formula of (1) above according to the energy change value {−ΔEi} of each state transition using the temperature value T generated by the temperature controlling unit 113 and the probability variable (random numerical value) generated by the random number generating unit 114d. Then, the judging unit 114b outputs acceptance or rejection {fi} of each state transition. In the case where there are a plurality of accepted state transitions, the transition determining unit 114c randomly selects one of the accepted state transitions using the random numerical value. The transition determining unit 114c outputs the transition number N of the selected state transition and acceptance or rejection of the transition f. In the case where there is an accepted state transition, the value of the state variable stored in the state retaining unit 111 is updated according to the adopted state transition.


The iteration described above is started from an initial state and repeated while the temperature value is decreased by the temperature controlling unit 113. When a finishing judgement condition, such as reaching a certain number of iterations or the energy dropping below a certain value, is satisfied, the operation is completed. The answer output by the optimizing device 100 is the state at the time of the finish.
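
As a non-limiting illustration, the iteration described above may be sketched in Python as software-executed simulated annealing in which one candidate is generated at a time. The function name `simulated_annealing`, the parameter values, and the use of a single bit flip as the candidate are assumptions made only for this sketch; the sketch does not model the units 111 to 114 individually and uses the Metropolis method with the logarithmic temperature schedule.

```python
import math
import random


def simulated_annealing(energy, n_bits, steps=2000, t0=5.0, c=2.0, seed=0):
    """Return the state and energy at the time of the finish.

    energy: callable taking a list of 0/1 values and returning a number.
    """
    rng = random.Random(seed)
    state = [rng.randint(0, 1) for _ in range(n_bits)]   # initial state
    current = energy(state)
    for step in range(steps):
        t = t0 * math.log(c) / math.log(step + c)        # temperature value T
        k = rng.randrange(n_bits)                        # candidate: flip one bit
        state[k] ^= 1
        candidate = energy(state)
        delta = candidate - current
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current = candidate                          # transition adopted
        else:
            state[k] ^= 1                                # transition rejected
    return state, current
```

For example, `simulated_annealing(lambda s: (sum(s) - 1) ** 2, 4)` searches for a 4-bit state in which exactly one bit is 1. Because the procedure is stochastic, the returned state is not guaranteed to be the optimum solution.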



FIG. 18 is a block diagram of a circuit level of a structural example of arithmetic part used for a transition controlling unit, particularly a judging unit, in typical simulated annealing where a candidate is generated one by one.


A transition controlling unit 114 includes a random number generator 114b1, a selector 114b2, a noise table 114b3, a multiplier 114b4, and a comparator 114b5.


The selector 114b2 is configured to select the value corresponding to the transition number N that is a random numerical value generated by the random number generator 114b1 among the energy change values {−ΔEi} calculated for candidates of each state transition, and then output the value.


Functions of the noise table 114b3 will be described later. As the noise table 114b3, for example, a memory, such as a random access memory (RAM), and a flash memory, can be used.


The multiplier 114b4 outputs a product (corresponding to the above-described thermal excitation energy) obtained by multiplying the value output by the noise table 114b3 with the temperature value T.


The comparator 114b5 outputs, as transition acceptance or rejection f, a comparison result obtained by comparing the product result output by the multiplier 114b4 and the energy change value −ΔE selected by the selector 114b2.


The transition controlling unit 114 illustrated in FIG. 18 basically has the above-mentioned functions as they are, but a mechanism for accepting state transition with the acceptance probability represented by the formula of (1) has not yet been described. Therefore, the mechanism will be described supplementarily.


A circuit that outputs 1 with the acceptance probability p and 0 with the probability (1−p) can be realized with a comparator that has two inputs A and B, outputs 1 when A>B, and outputs 0 when A<B, by inputting the acceptance probability p to the input A and a uniform random number having a value in the interval [0, 1] to the input B. Accordingly, the above-described function can be realized by inputting the value of the acceptance probability p calculated from the energy change value and the temperature value T using the formula of (1) to the input A of the comparator.


Specifically, the above-described function can be realized with a circuit that outputs 1 when f(−ΔE/T) is larger than u, where f is the function represented by the formula of (1) and u is a uniform random number having a value in the interval [0, 1].


The circuit may be used as it is, but the same function can also be realized by performing the following transformation. The magnitude relationship of two numbers does not change when the same monotonically increasing function is applied to both numbers. Therefore, the output does not change even when the same monotonically increasing function is applied to the two inputs of the comparator. It can be understood that, when the inverse function f−1 of f is used as the monotonically increasing function, a circuit outputting 1 when −ΔE/T is larger than f−1(u) is acceptable. Moreover, since the temperature value T is a positive value, a circuit outputting 1 when −ΔE is larger than Tf−1(u) is acceptable. The noise table 114b3 in FIG. 18 is a conversion table for realizing the inverse function f−1(u), and is a table that outputs a value of the following function with respect to an input obtained by discretizing the interval [0, 1].











$$f_{\mathrm{metro}}^{-1}(u) = \log(u) \qquad \text{(Formula 3-1)}$$

$$f_{\mathrm{Gibbs}}^{-1}(u) = \log\left(\frac{u}{1 - u}\right) \qquad \text{(Formula 3-2)}$$
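
As a non-limiting illustration, the equivalence between comparing the acceptance probability with a uniform random number and comparing −ΔE with the product of the temperature value T and the inverse function may be sketched in Python for the Metropolis method as follows. The function names are assumptions made only for this sketch, and the random number u is assumed to lie strictly between 0 and 1 so that log(u) is defined.

```python
import math


def accept_direct(delta_e, t, u):
    """Output 1 (True) when f_metro(-dE / T) is larger than u."""
    x = -delta_e / t
    p = 1.0 if x >= 0 else math.exp(x)
    return p > u


def accept_noise_table(delta_e, t, u):
    """Output 1 (True) when -dE is larger than T * f_metro^{-1}(u) = T * log(u)."""
    return -delta_e > t * math.log(u)
```

The second form needs no exponential of the energy change at the time of judging, since the logarithm depends only on the random number and can therefore be read from a table.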







The transition controlling unit 114 also includes a latch configured to retain judgement results etc., a state machine configured to generate timing thereof, etc., but the above-mentioned units are omitted in FIG. 18 in order to simplify the illustration.



FIG. 19 illustrates an operation flow of the transition controlling unit 114. The operation flow includes a step for selecting one state transition as a candidate (S0001), a step for determining acceptance or rejection of the state transition by comparing the energy change value of the state transition with a product of the temperature value and the random numerical value (S0002), and a step for adopting the state transition if the state transition is accepted and rejecting the state transition if the state transition is not accepted (S0003).


<Step S108>

Next, as a confirmation whether the limited lattice space is set to have a sufficient space, the judging unit 19 judges whether any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space (S108).


Typically, the judgement is performed on a conformation of a protein having the minimum energy calculated.


During the judging, the judgement is performed on whether any of the compound groups excluding the compound group arranged first and the compound group arranged last is arranged on the outermost edge of the limited lattice space among the compound groups assigned to the lattice points. This is because the compound group arranged first is not typically arranged on the outermost edge. Moreover, this is because the arrangement of the compound group is not limited by the outermost edge as there is no more compound group to be arranged next even when the compound group arranged last is arranged on the outermost edge.
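
As a non-limiting illustration, the judgment of Step S108 may be sketched in Python as follows. The sketch assumes that the position of each compound group is summarized by the radius of the shell on which it is arranged and that the outermost edge corresponds to the shell whose radius equals the size of the limited lattice space; the function names are hypothetical.

```python
def first_on_edge(radii, limit):
    """Return the 1-based number of the earliest compound group on the outermost edge.

    radii: shell radius of each compound group, in the order of arrangement.
    limit: radius of the outermost shell of the limited lattice space.
    The compound groups arranged first and last are excluded from the judgment.
    Returns None when no other compound group is on the outermost edge.
    """
    for number, r in enumerate(radii, start=1):
        if 1 < number < len(radii) and r >= limit:
            return number
    return None


def touches_outer_edge(radii, limit):
    """True when the limited lattice space should be expanded."""
    return first_on_edge(radii, limit) is not None
```

For example, with the radii 1, 2, 3, 4, 3, 2, 3 and a limit of 4, the No. 4 compound group is judged as being arranged on the outermost edge.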


<Step S109>

When the judging unit 19 judges that none of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space, it is judged that the limited lattice space is set to have a sufficient space, and a calculation result related to the conformation of the protein having the calculated minimum energy is output (S109).


The calculation result is output from the outputting unit 21. The result may be output as a protein conformation diagram or may be output as coordinate information of each amino acid residue constituting a protein.


<Step S110>

When the judging unit 19 judges that any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space, meanwhile, it is judged that a sufficient space of the limited lattice space is not set and hence the limited lattice space is expanded. Therefore, expansion information K is added to the space limiting parameter L (S110). Then, assignment of bits to the lattice points included in the limited lattice space after the expansion and calculation of the minimum energy of the Ising model are performed (S104 to S107).


During this step, the controlling unit 20 causes the limiting unit 13 to execute expansion of the limited lattice space.


Moreover, the controlling unit 20 causes the assigning unit 14 to execute assignment of a bit to each lattice point included in the limited lattice space after the expansion.


Moreover, the controlling unit 20 causes the arithmetic unit 18 to execute calculation of the minimum energy of the Ising model.


When the limited lattice space is expanded, the limited lattice space is not typically expanded to the original lattice space.
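
As a non-limiting illustration, the repetition of Steps S104 to S110 may be sketched in Python as follows. The callable `solve` is a hypothetical stand-in for Steps S104 to S107 that returns the shell radius of each compound group in the conformation having the minimum energy; the stopping condition `limit >= n` is an assumption added as a safeguard, since a space of that size no longer limits the arrangement.

```python
def search_with_expansion(solve, n, limit, k=1):
    """Repeat the search while expanding the limited lattice space.

    solve: callable taking the space limiting parameter and returning a list of radii.
    n: number of compound groups; limit: initial space limiting parameter L.
    k: expansion information K added to L at each expansion.
    """
    while True:
        radii = solve(limit)                              # S104 to S107
        on_edge = any(r >= limit for r in radii[1:-1])    # S108
        if not on_edge or limit >= n:
            return radii, limit                           # S109
        limit += k                                        # S110


def fake_solve(limit):
    """Hypothetical solver used only to exercise the control flow."""
    return [1, 2, 3, 4, 3, 2, 3] if limit == 4 else [1, 2, 3, 4, 5, 4, 3]
```

With the hypothetical solver above, `search_with_expansion(fake_solve, 7, 4)` expands the space from L=4 to L=6 before a conformation that does not touch the outermost edge is output.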


For example, the embodiment of the expansion may be an embodiment where the limited lattice space is uniformly expanded by the predetermined number of lattice points towards the outside of the outermost edge of the limited lattice space, and may be an embodiment where only a lattice space surrounding the compound group arranged on the outermost edge of the limited lattice space is expanded.


Moreover, the controlling unit 20 preferably does not change the bits already assigned to the lattice points of the compound group judged as being arranged on the outermost edge of the limited lattice space and of the compound groups arranged earlier than the compound group judged as being arranged on the outermost edge. In the case where there are a plurality of compound groups judged as being arranged on the outermost edge of the limited lattice space, the “compound group judged as being arranged on the outermost edge of the limited lattice space” is preferably the compound group arranged the earliest among the compound groups judged as being arranged on the outermost edge of the limited lattice space.


Moreover, it is more preferred that a difference (n−M) between the order (n) of arrangement of the compound group arranged last and the order (M) of arrangement of the compound group judged as being arranged on the outermost edge of the limited lattice space be considered in the expansion information, and that the controlling unit 20 be configured to cause the limiting unit 13 to execute expansion of the limited lattice space based on the expansion information in a manner that the limited lattice space is expanded by a smaller amount when the difference (n−M) is small than when the difference (n−M) is large.
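
As a non-limiting illustration, one possible way of reflecting the difference (n−M) in the expansion information may be sketched in Python as follows. The rule K=n−M with an optional upper limit is an assumption made only for this sketch; the description above requires only that the expansion be smaller when the difference is small. The rule relies on the observation that each remaining compound group can move outward by at most one shell.

```python
def expansion_amount(n, m, k_max=None):
    """Return a number of shells to add, growing with the difference n - m.

    n: order of arrangement of the compound group arranged last.
    m: order of arrangement of the compound group judged as being on the outermost edge.
    k_max: optional upper limit on the expansion (hypothetical parameter).
    """
    if not 1 <= m <= n:
        raise ValueError("m must satisfy 1 <= m <= n")
    k = n - m
    return k if k_max is None else min(k, k_max)
```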


One of the above-described embodiments may be performed or the above-described embodiments may be performed in combination. Examples thereof will be described in Examples 1 to 6 below.


EXAMPLE 1

An example of expansion of a limited lattice space will be described with reference to drawings.



FIG. 20A illustrates an example where amino acid residues are arranged in a diamond lattice space. In FIG. 20A, the number (n) of the amino acid residues is 7 (n=7), the space limiting parameter L is set to 4 (L=4), and the 7 amino acid residues are arranged in a diamond lattice space having a radius r of 4 (r=4).


In FIG. 20A, the No. 4 amino acid residue is arranged on the outermost edge of the diamond lattice space.


Therefore, it is judged by the judgment performed in Step S108 that any of compound groups assigned to lattice points is arranged on the outermost edge of the limited lattice space and expansion of the diamond lattice space is performed.



FIG. 20B illustrates an example where the diamond lattice space of FIG. 20A is expanded by one layer of the lattice points towards the outside of the outer shell (i.e., an example where the radius r is expanded to the radius r+1). The limited lattice space is expanded by adding one further layer of lattice points to the outside of the outer shell of the diamond lattice space of FIG. 20A.


The expansion is an embodiment where the limited lattice space is uniformly expanded by the predetermined number of lattice points towards the outside of the outermost edge.


For example, the expansion above can be performed according to the expansion information K.


EXAMPLE 2

When compound groups are sequentially arranged, arrangement of the compound groups is not restricted by the outermost edge of the limited lattice space until the compound group is arranged on the outermost edge of the limited lattice space.


Therefore, in view of reducing the calculation time, it is preferred that the limited lattice space be expanded while maintaining the arrangement of the compound groups up to the compound group arranged on the outermost edge of the limited lattice space, and that assignment of bits to the lattice points included in the limited lattice space after the expansion and calculation of the minimum energy of the Ising model then be executed.


In the case where assignment of bits to the lattice points included in the limited lattice space after the expansion and calculation of the minimum energy of the Ising model are executed together with the expansion of the limited lattice space, it is preferred that the bits already assigned to the lattice points of the compound group judged as being arranged on the outermost edge of the limited lattice space and of the compound groups arranged earlier than the compound group judged as being arranged on the outermost edge not be changed. In the case where a plurality of compound groups are judged as being arranged on the outermost edge of the limited lattice space, in Example 2, the “compound group judged as being arranged on the outermost edge of the limited lattice space” is preferably the compound group arranged the earliest among the compound groups judged as being arranged on the outermost edge.


The examples above will be described with a flowchart and drawings.



FIG. 21 illustrates another structural example of the disclosed device for searching a compound. The device for searching a compound 10B illustrated in FIG. 21 is identical to the device for searching a compound 10A illustrated in FIG. 6 in that the device for searching a compound 10B includes a compound group number counting unit 11, a defining unit 12, a limiting unit 13, an assigning unit 14, a H generating unit 15, a weight extracting unit 16, a weight file creating unit 17, an arithmetic unit 18, a judging unit 19, a controlling unit 20, and an outputting unit 21. However, the device for searching a compound 10B illustrated in FIG. 21 is different from the device for searching a compound 10A illustrated in FIG. 6 in that the H generating unit 15 of the device for searching a compound 10B includes a Hfix generating unit 15F.


The flowchart of FIG. 22 is a flowchart in which Step S111 is added to the flowchart of FIG. 7.



FIG. 23A is an example where amino acid residues are arranged in a diamond lattice space.


In FIG. 23A, the number (n) of the amino acid residues is 7 (n=7), the space limiting parameter L is set to 4 (L=4), and the 7 amino acid residues are arranged in a diamond lattice space having a radius r of 4 (r=4).


In FIG. 23A, the No. 4 amino acid residue is arranged on the outermost edge of the diamond lattice space.


Therefore, it is judged by the judgment performed in Step S108 that any of compound groups assigned to lattice points is arranged on the outermost edge of the limited lattice space and expansion of the diamond lattice space is performed.


As illustrated in FIG. 23B, for example, the diamond lattice space of FIG. 23A is expanded by a layer of the lattice points (i.e., the radius r is expanded to the radius r+1). Then, Steps S104 to S107 are performed using the expanded diamond lattice space. At the time when the Ising model is created, a constraint term Hfix, which is configured not to change a bit already assigned, is added for the No. 4 amino acid residue judged as being arranged on the outermost edge of the limited lattice space and the Nos. 1 to 3 amino acid residues arranged earlier than the No. 4 amino acid residue (S111). The constraint term Hfix is created by a Hfix generating unit 15F under the control of the controlling unit 20. By adding the constraint term Hfix when the Ising model is created, the bits already assigned to the lattice points of the No. 4 amino acid residue arranged on the outermost edge of the limited lattice space and the Nos. 1 to 3 amino acid residues arranged earlier than the No. 4 amino acid residue can be kept unchanged. This can be illustrated as in FIG. 23C. FIG. 23C is a diagram illustrating a state where the Nos. 1 to 4 amino acid residues are fixed in the expanded diamond lattice space of FIG. 23B.


For example, the constraint term Hfix is represented by the following formula.







$$H_{\mathrm{fix}} = \lambda_{\mathrm{fix}} \sum_{i=1}^{M} \left( 1 - x_{A_i} \right)$$







In the function above, XAi is an address to be fixed (not changed), and i is the sequence number of the amino acid residues.


In the function above, λfix is a weighting coefficient.


E(x) to which the constraint term Hfix is added can be expressed as follows.






E(x)=H=Hone+Hconn+Holap+Hpair+Hfix


The constraint term Hfix is prioritized over the other constraint terms by making λfix of the constraint term Hfix sufficiently larger than the weighting coefficients of the other constraint terms, and therefore XAi is fixed to 1.
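
As a non-limiting illustration, the constraint term Hfix may be evaluated in Python as sketched below. The function name `h_fix`, the bit numbers, and the value of the weighting coefficient are hypothetical.

```python
def h_fix(x, fixed_bits, lam_fix=100.0):
    """lam_fix * sum over the fixed addresses A_i of (1 - x[A_i]).

    x: dict mapping a bit number to 0 or 1.
    fixed_bits: bit numbers A_1 to A_M whose value is to remain 1.
    """
    return lam_fix * sum(1 - x[a] for a in fixed_bits)


fixed_bits = [1, 3, 7, 20]               # hypothetical addresses of the Nos. 1 to 4 residues
kept = {1: 1, 3: 1, 7: 1, 20: 1}
changed = {1: 1, 3: 1, 7: 0, 20: 1}      # the bit of the No. 3 residue was changed
```

The term is 0 while all of the fixed bits remain 1 and increases by λfix for each fixed bit that is changed to 0, so a sufficiently large λfix makes such a change unfavorable.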


EXAMPLE 3

An embodiment of the expansion is preferably an embodiment where only the lattice space surrounding the compound group arranged on the outermost edge of the limited lattice space is expanded. Even when the lattice space other than the space surrounding the compound group arranged on the outermost edge of the limited lattice space is expanded, an amino acid residue is unlikely to be arranged there. Therefore, the expanding range can be kept to an appropriate range by using an embodiment where only the lattice space surrounding the compound group arranged on the outermost edge of the limited lattice space is expanded.


To expand only the lattice space surrounding the compound group arranged on the outermost edge of the limited lattice space may be referred to as “giving the expansion directionality” hereinafter.


An embodiment of Example 3 is an embodiment where directionality is given to the expansion in the embodiment of Example 2.


An embodiment of Example 3 will be described with reference to drawings.



FIG. 24A illustrates an example where amino acid residues are arranged in a diamond lattice space.


In FIG. 24A, the number (n) of the amino acid residues is 7 (n=7), the space limiting parameter L is set to 4 (L=4), and the 7 amino acid residues are arranged in a diamond lattice space having a radius r of 4 (r=4).


In FIG. 24A, the No. 4 amino acid residue is arranged on the outermost edge of the diamond lattice space.


Therefore, it is judged by the judgment performed in Step S108 that any of compound groups assigned to lattice points is arranged on the outermost edge of the limited lattice space and expansion of the diamond lattice space is performed.


As illustrated in FIG. 24B, for example, only the lattice space surrounding the amino acid residue arranged on the outermost edge of the limited lattice space (No. 4 amino acid residue) in the diamond lattice space of FIG. 24A is expanded. Steps S104 to S107 are performed using the expanded diamond lattice space. When the Ising model is created, the bits already assigned to the lattice points of the No. 4 amino acid residue, which is judged as being arranged on the outermost edge of the limited lattice space, and of the Nos. 1 to 3 amino acid residues, which were arranged earlier than the No. 4 amino acid residue, are not changed (FIG. 24C).


EXAMPLE 4

In the case where the sequence number of the compound group arranged on the outermost edge of the limited lattice space is large, the number of compound groups (remaining compound groups) having a larger sequence number than the compound group arranged on the outermost edge is small. In the case where the sequence number of the compound group arranged on the outermost edge of the limited lattice space is small, on the other hand, the number of remaining compound groups is large. In the case where the number of the remaining compound groups is large, appropriate expansion may not be performed unless the range of the expansion is made large. In the case where the number of the remaining compound groups is small, on the other hand, most of the expanded range may be wasted.


Therefore, the range of expansion of the limited lattice space is preferably made large when the sequence number of the compound group arranged on the outermost edge of the limited lattice space is small. The range of expansion of the limited lattice space is preferably made small when the sequence number of the compound group arranged on the outermost edge of the limited lattice space is large.


Specifically, the expansion information K preferably takes into account the difference (n−M) between the order (n) of the arrangement of the compound group arranged last and the order (M) of the arrangement of the compound group judged as being arranged on the outermost edge of the limited lattice space, so that the limited lattice space is expanded less when the difference (n−M) is small than when the difference (n−M) is large.


The above-described embodiment may be referred to as an “amino acid residue number dependency” of the expansion range hereinafter.


The embodiment of Example 4 is an embodiment where the amino acid residue number dependency is given to the expansion range in the embodiment of Example 1.


An embodiment of Example 4 will be described with reference to drawings.



FIG. 25A illustrates an example where amino acid residues are arranged in a diamond lattice space.


In FIG. 25A, the number (n) of the amino acid residues is 7 (n=7), the space limiting parameter L is set to 4 (L=4), and the 7 amino acid residues are arranged in a diamond lattice space having a radius r of 4 (r=4).


In FIG. 25A, the No. 4 amino acid residue is arranged on the outermost edge of the diamond lattice space.


Therefore, it is judged by the judgment performed in Step S108 that any of compound groups assigned to lattice points is arranged on the outermost edge of the limited lattice space and expansion of the diamond lattice space is performed.


When the expansion is performed, a range K of the expansion (the number of layers of the lattice points) is determined based on the following function.






K=roundup((n−M)/2)


In the equation above, “roundup” is a function that rounds a value up to the nearest integer.


In the case of n=7 and M=4, for example, (7−4)/2=1.5 and K=2. As illustrated in FIG. 25B, the outer shell of the diamond lattice space of FIG. 25A is expanded outward by two layers of lattice points (i.e., the radius r is expanded to the radius r+2). Then, Steps S104 to S107 are performed using the expanded diamond lattice space.
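The determination of the expansion range K can be sketched as follows. The function name `expansion_layers` and the input check are assumptions introduced for illustration; only the formula K=roundup((n−M)/2) and the values n=7, M=4, r=4 are taken from the description.

```python
import math


def expansion_layers(n: int, m: int) -> int:
    """Return K = roundup((n - M) / 2), the number of lattice-point layers to add.

    n: order of the compound group arranged last.
    m: order (M) of the compound group found on the outermost edge.
    """
    if not 1 <= m <= n:
        raise ValueError("expected 1 <= M <= n")
    return math.ceil((n - m) / 2)


# Worked case from the text: n=7, M=4 gives (7-4)/2 = 1.5, rounded up to 2.
k = expansion_layers(7, 4)
new_radius = 4 + k  # the radius r=4 is expanded to r+K

# K shrinks as the residue on the edge sits later in the chain (larger M).
layers_by_m = {m: expansion_layers(7, m) for m in range(1, 8)}
print(k, new_radius, layers_by_m)
```

The table of values for M=1 to 7 shows the amino acid residue number dependency: the fewer residues remain after the one on the outermost edge, the fewer layers are added.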


EXAMPLE 5

Example 5 is an embodiment where Example 4 and Example 2 are combined.


The embodiment of Example 5 will be described with reference to drawings.



FIG. 26A illustrates an example where amino acid residues are arranged in a diamond lattice space.


In FIG. 26A, the number (n) of the amino acid residues is 7 (n=7), the space limiting parameter L is set to 4 (L=4), and the 7 amino acid residues are arranged in a diamond lattice space having a radius r of 4 (r=4).


In FIG. 26A, the No. 4 amino acid residue is arranged on the outermost edge of the diamond lattice space.


Therefore, it is judged by the judgment performed in Step S108 that any of compound groups assigned to lattice points is arranged on the outermost edge of the limited lattice space and expansion of the diamond lattice space is performed.


When the expansion is performed, a range K of the expansion (the number of layers of the lattice points) is determined based on the following function.






K=roundup((n−M)/2)


In the equation above, “roundup” is a function that rounds a value up to the nearest integer.


In the case of n=7 and M=4, for example, (7−4)/2=1.5 and K=2. As illustrated in FIG. 26B, the outer shell of the diamond lattice space of FIG. 26A is expanded outward by two layers of lattice points (i.e., the radius r is expanded to the radius r+2). Then, Steps S104 to S107 are performed using the expanded diamond lattice space. When the Ising model is created, the bits already assigned to the lattice points of the No. 4 amino acid residue, which is judged as being arranged on the outermost edge of the limited lattice space, and of the Nos. 1 to 3 amino acid residues, which were arranged earlier than the No. 4 amino acid residue, are not changed (FIG. 26C).


EXAMPLE 6

Example 6 is an embodiment where Example 4 and Example 3 are combined.


The embodiment of Example 6 will be described with reference to drawings.



FIG. 27A illustrates an example where amino acid residues are arranged in a diamond lattice space.


In FIG. 27A, the number (n) of the amino acid residues is 7 (n=7), the space limiting parameter L is set to 4 (L=4), and the 7 amino acid residues are arranged in a diamond lattice space having a radius r of 4 (r=4).


In FIG. 27A, the No. 4 amino acid residue is arranged on the outermost edge of the diamond lattice space.


Therefore, it is judged by the judgment performed in Step S108 that any of compound groups assigned to lattice points is arranged on the outermost edge of the limited lattice space and expansion of the diamond lattice space is performed.


When the expansion is performed with directionality, a range K of the expansion (the number of layers of lattice points) is determined based on the following function.






K=roundup((n−M)/2)+1


In the equation above, “roundup” is a function that rounds a value up to the nearest integer.


In the case of n=7 and M=4, for example, (7−4)/2=1.5 and K=3. As illustrated in FIG. 27B, only the lattice space surrounding the amino acid residue (No. 4 amino acid residue) arranged on the outermost edge of the limited lattice space is expanded by three layers in the diamond lattice space of FIG. 27A. Then, Steps S104 to S107 are performed using the expanded diamond lattice space. When the Ising model is created, the bits already assigned to the lattice points of the No. 4 amino acid residue, which is judged as being arranged on the outermost edge of the limited lattice space, and of the Nos. 1 to 3 amino acid residues, which were arranged earlier than the No. 4 amino acid residue, are not changed (FIG. 27C).
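The intermediate steps leading to K=3 can be written out as follows, with the roundup function expressed as a ceiling.

```latex
K = \left\lceil \frac{n - M}{2} \right\rceil + 1
  = \left\lceil \frac{7 - 4}{2} \right\rceil + 1
  = \lceil 1.5 \rceil + 1
  = 2 + 1
  = 3
```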


Examples 1 to 6 are described above, and can be summarized in FIG. 28.


Note that, the phrase “Is it automatically set?” in FIG. 28 corresponds to whether there is the amino acid residue number dependency described above.


Moreover, the conventional method in FIG. 28 means that a lattice space is not limited in the diamond encoding method.


Moreover, Referential Example in FIG. 28 means a case where a lattice space is limited but whether the limited lattice space is expanded is not judged and the limited lattice space is not expanded.


Note that, the device for searching a compound 10A of FIG. 6 is an example where the arithmetic unit 18 and the limiting unit 13 are arranged in the identical space, but the device for searching a compound may be a device in which the arithmetic unit 18 and the limiting unit 13 are spatially separated as in the device for searching a compound 10C illustrated in FIG. 29.


Next, a modified example of the steps S101 to S107 of the flowchart of FIG. 7 will be described.



FIG. 30 is a flowchart of a modified example.


In the flowchart of FIG. 30, the step S201 corresponds to the step S101 of the flowchart of FIG. 7, the step S202 corresponds to the step S102, the step S204 corresponds to the step S104, the step S205 corresponds to the step S105, the step S206 corresponds to the step S106, and the step S207 corresponds to the step S107.


Therefore, the description below focuses on the limiting unit 13 and the step S203.


By setting the maximum number M (straight chain number limiting parameter M) of amino acid residues aligned in a straight chain (S203), the limiting unit 13 generates a limited lattice space. The limited lattice space is obtained by removing, from the lattice space, regions that are undesirable for the next compound group to be arranged, in the case where any of a plurality of compound groups is arranged in any lattice of the lattice space and the next compound group is then arranged in the lattice space.


As described earlier, amino acid residues are rarely aligned in a straight chain due to interaction between the amino acid residues.


Therefore, the number of arithmetic bits or quantum bits can be suppressed by setting the maximum number M (straight chain number limiting parameter M) of amino acid residues aligned in a straight chain, and eliminating regions where the amino acid residues cannot be disposed under the restriction above, to thereby generate a limited lattice space. Naturally, M is smaller than the number (n) of amino acid residues (M<n).


For example, when the straight chain number limiting parameter M is set to 5, as illustrated in FIG. 31, at most 5 amino acid residues are aligned in a straight chain.


When the straight chain number limiting parameter M is set, the limited lattice space increases as the number of the amino acid residues increases, as illustrated in FIG. 32. Specifically, the maximum lattice space K is determined by the following formula when the straight chain number limiting parameter M is used for n amino acid residues.






$$K = \left\lfloor \frac{n - M}{M} \right\rfloor \times (M - 2) + \mathrm{MAX}\left(n \bmod M - 2,\ 0\right) + M$$





A space limiting parameter L (L<n) may be used in combination for generation of a limited lattice space. In this case, it is preferable that L<K be satisfied.
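The formula for the maximum lattice space K can be evaluated with a short sketch. The function name `max_lattice_space` is hypothetical, and the code assumes that the fraction (n−M)/M is rounded down to an integer; this reading reproduces the value K=8 that the description gives for M=5 and n=11.

```python
def max_lattice_space(n: int, m: int) -> int:
    """K = floor((n - M) / M) * (M - 2) + MAX(n mod M - 2, 0) + M.

    n: number of amino acid residues.
    m: straight chain number limiting parameter M (M < n).
    """
    if not 2 <= m < n:
        raise ValueError("expected 2 <= M < n")
    return (n - m) // m * (m - 2) + max(n % m - 2, 0) + m


k = max_lattice_space(11, 5)  # M=5, n=11, as used with Table 1

# K for n = 6..11 with M=5; the space grows as residues are added.
growth = [max_lattice_space(n, 5) for n in range(6, 12)]
print(k, growth)
```

Under the floor assumption, K never decreases as n grows, which agrees with the statement that the limited lattice space increases with the number of amino acid residues.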



FIG. 33 is a flowchart of another modified example.


In the flowchart of FIG. 33, the step S301 corresponds to the step S201 of the flowchart of FIG. 21, the step S302 corresponds to the step S202, the step S303 corresponds to the step S203, the step S305 corresponds to the step S204, the step S306 corresponds to the step S205, the step S307 corresponds to the step S206, and the step S308 corresponds to the step S207.


Therefore, the description below focuses on the limiting unit 13 and the step S304.


In the case where any of a plurality of compound groups is arranged in any lattice of a lattice space and a next compound group is then arranged in the lattice space, the limiting unit 13 creates a limited lattice space by eliminating, from the lattice space, regions that are undesirable for the next compound group to be arranged. To do so, the limiting unit 13 sets the maximum number M (straight chain number limiting parameter M) of amino acid residues aligned in a straight chain (S303), and moreover defines the maximum S(i) of the site to which the i-numbered amino acid residue is moved (S304).


When the straight chain number limiting parameter M is used, a space radius r of each amino acid residue is for example as presented in Table 1 with M=5 (K=8), n=11, and L=K.











TABLE 1

Amino acid residue              1   2   3   4   5   6   7   8   9  10  11
r                               1   2   3   4   5   6   7   8   8   8   8
Radius r' actually estimated    1   2   3   4   5   4   5   6   7   8   7









The above-described example is visualized as in FIG. 34. Although the maximum space is identical, excess space is created, and it can be understood that, for example, the 6th or 7th amino acid residue can in reality be confined to a smaller space.


Therefore, in addition to the straight chain number limiting parameter M, a space parameter s(x) that uses the straight chain number limiting parameter M is introduced. As a result, the space can be limited as follows, and the number of bits can be suppressed without lowering accuracy.







$$V_i = \bigcup_{r \in J} S_r, \qquad i = \{1, 2, 3, \ldots, n\}$$

$$s(x) = \left\lfloor \frac{x - 1}{M} \right\rfloor \times (M - 2) + \left((x - 1) \bmod M\right) + 1$$





When the space limiting parameter L is an even number, and i<L:

    • J={s(1), s(3), . . . s(i)} in the case of an odd-numbered (i=odd number) amino acid residue.
    • J={s(2), s(4), . . . s(i)} in the case of an even-numbered (i=even number) amino acid residue.


When the space limiting parameter L is an even number, and i>L:

    • J={s(1), s(3), . . . s(L−1)} in the case of an odd-numbered (i=odd number) amino acid residue.
    • J={s(2), s(4), . . . s(L)} in the case of an even-numbered (i=even number) amino acid residue.


When the space limiting parameter L is an odd number, and i<L:

    • J={s(1), s(3), . . . s(i)} in the case of an odd-numbered (i=odd number) amino acid residue.
    • J={s(2), s(4), . . . s(i)} in the case of an even-numbered (i=even number) amino acid residue.


When the space limiting parameter L is an odd number, and i>L:

    • J={s(1), s(3), . . . s(L)} in the case of an odd-numbered (i=odd number) amino acid residue.
    • J={s(2), s(4), . . . s(L−1)} in the case of an even-numbered (i=even number) amino acid residue.
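The space parameter s(x) can be checked against Table 1 with a minimal sketch. The code assumes that the fraction (x−1)/M is rounded down to an integer, and it assumes that the row r of Table 1 equals the residue number capped at L=K=8; both are readings of the description rather than statements taken from it, and the names used are hypothetical.

```python
def s(x: int, m: int) -> int:
    """Space parameter s(x) = floor((x - 1) / M) * (M - 2) + ((x - 1) mod M) + 1."""
    return (x - 1) // m * (m - 2) + (x - 1) % m + 1


M, N, L = 5, 11, 8  # values used with Table 1 (M=5, n=11, L=K=8)

# Radius estimated with s(x) for residues 1..11.
estimated = [s(i, M) for i in range(1, N + 1)]

# Assumed reading of row r in Table 1: the residue number capped at L.
capped = [min(i, L) for i in range(1, N + 1)]

# Layers of radius saved per residue when s(x) is used instead of the cap.
excess = [r - e for r, e in zip(capped, estimated)]
print(estimated)
print(excess)
```

With these assumptions the computed radii match the row "Radius r' actually estimated" of Table 1, and the differences show where excess space arises, for example two layers for each of the 6th and 7th amino acid residues.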


Regarding the technology of the referential examples, an embodiment where a straight chain number limiting parameter is not set is designated Referential Example 1, the modified example described using FIG. 30 is designated Referential Example 2, and the modified example described using FIG. 33 is designated Referential Example 3. FIG. 35 illustrates the change in the number of bits used in each Referential Example when the parameters are set as follows.

    • Referential Example 1: L=15
    • Referential Example 2: L=15, M=5
    • Referential Example 3: L=15, M=5
    • Comparative Example 1: No restriction


It was confirmed in all of the Referential Examples that the number of bits used could be significantly reduced compared to Comparative Example 1, where no restriction was given, and that a compound posing a relatively large-scale problem (e.g., a protein) could be used as a target of a search.


According to the referential examples above, however, the arrangement of the compound groups is limited by the outermost edge of the limited lattice space if the limited lattice space is too narrow. As a result, an appropriate conformation may not be obtained.


Therefore, for example, combining any of the referential examples with Examples 1 to 6 makes it possible to suppress the number of bits appropriately compared with the conventional technology.


When a lattice space is not limited, for example, the number of bits used for n amino acid residues is represented by the following formula.










$$\sum_{i=1}^{n} \left\{ 2 \left( \sum_{j=1}^{i-1} j^{2} \right) + i^{2} - \mathrm{mod}(i, 2) \right\} + 1$$




When the limit of the lattice space (L: space limiting parameter) is set to L=n−1 and the radius of the space to be expanded (K: expansion information) is set to K=1, and the limited lattice space is expanded with directionality, the directional increase is, for example, 3 bits when the amino acid residue arranged on the outermost edge is positioned on the outermost plane of the limited lattice space, 4 bits when it is positioned on the outermost side of the limited lattice space, and 5 bits when it is positioned on the outermost apex of the limited lattice space. Therefore, under the same settings (L=n−1 and K=1), the number of bits used for expanding the limited lattice space with directionality can be represented by the following formula.










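$$\sum_{i=1}^{n} \left\{ 2 \left( \sum_{j=1}^{i-1} j^{2} \right) + i^{2} - \mathrm{mod}(i, 2) \right\} - \sum_{i=1}^{n-1} \left\{ 2 \left( \sum_{j=1}^{i-1} j^{2} \right) + i^{2} - \mathrm{mod}(i, 2) \right\} + [3 \text{ or } 4 \text{ or } 5]$$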





An example where 11 amino acid residues are arranged as in FIG. 36 will be described hereinafter. Note that, the numbers in FIG. 36 are numbers representing the positions of the lattice points in the diamond lattice space.


When the lattice space for the 11 amino acid residues is not limited (in the case of n=11), the number of bits used is 2,921 bits.
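The 2,921-bit figure can be reproduced from the formula for the unrestricted lattice space given earlier. The sketch below reads that formula as the sum over i=1 to n of 2·(sum of j² for j=1 to i−1) + i² − mod(i, 2), plus 1; reading the trailing 2s as exponents is an assumption, and the function name `unrestricted_bits` is hypothetical.

```python
def unrestricted_bits(n: int) -> int:
    """Number of bits for n residues when the lattice space is not limited."""
    total = 0
    for i in range(1, n + 1):
        inner = sum(j * j for j in range(1, i))  # sum of j^2 for j = 1 .. i-1
        total += 2 * inner + i * i - i % 2
    return total + 1


bits_11 = unrestricted_bits(11)
print(bits_11)
```

Under this reading the function returns 2,921 for n=11, which matches the value stated above.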


In the case of the arrangement in FIG. 36, when the lattice space is limited but, as in the referential examples, whether to expand the limited lattice space is not judged and the limited lattice space is not expanded, it is appropriate to set the limited lattice space to L=5, and the number of bits used is 153 bits.


In the case of the arrangement in FIG. 36, however, the limited lattice space may be set to L=4 and may be expanded with directionality by K=1, i.e., by 3 bits or 5 bits. In this case, the number of bits used is 69 bits (in the case of r=4)+3 bits or 5 bits=72 bits or 74 bits. Therefore, 81 bits (=153 bits−72 bits) or 79 bits (=153 bits−74 bits) can be saved compared to the case where the limited lattice space is simply set to L=5.
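The bit counts quoted above can be traced step by step. The labels f(i) and B(m) are introduced here only for illustration, and evaluating the unrestricted-space expression with the upper limit replaced by the radius (4 or 5) is an assumption; it is adopted because it reproduces the stated counts of 69 and 153 bits.

```latex
\begin{aligned}
f(i) &= 2\sum_{j=1}^{i-1} j^{2} + i^{2} - \mathrm{mod}(i, 2), \qquad
B(m) = \sum_{i=1}^{m} f(i) + 1 \\
f(1), \ldots, f(5) &= 0,\ 6,\ 18,\ 44,\ 84 \\
B(4) &= 0 + 6 + 18 + 44 + 1 = 69 \\
B(5) &= B(4) + 84 = 153 \\
B(4) + 3 &= 72, \qquad B(4) + 5 = 74 \\
B(5) - 72 &= 81, \qquad B(5) - 74 = 79
\end{aligned}
```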


Note that this reduction effect becomes larger as the number of the amino acid residues increases.


All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiments of the present invention have been described in detail, it should be understood that various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.

Claims
  • 1. A device for searching a compound, comprising: a defining unit configured to define a lattice space that is a collection of lattices where a plurality of compound groups are sequentially arranged; a limiting unit configured to, in a case where any of the compound groups is arranged in any of the lattices of the lattice space, followed by arranging a next compound group in the lattice space, generate a limited lattice space that is a space created by eliminating, from the lattice space, undesirable regions for the next compound group to be arranged; an assigning unit configured to assign a bit to each of lattice points, to which the compound groups can be arranged, in the limited lattice space; an arithmetic unit configured to perform a ground state search on an Ising model obtained through conversion based on restriction conditions related to each of the lattice points according to simulated annealing, to thereby calculate minimum energy of the Ising model; a judging unit configured to judge whether any of the compound groups assigned to the lattice points is arranged on an outermost edge of the limited lattice space or not; and a controlling unit configured to cause the limiting unit to execute expansion of the limited lattice space, cause the assigning unit to execute assignment of the bits to the lattice points included in the limited lattice space after the expansion, and cause the arithmetic unit to execute calculation of the minimum energy of the Ising model, in a case where the judging unit judges that any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space, wherein the device is a device for searching the compound, in which a plurality of the compound groups are linked with one another.
  • 2. The device according to claim 1, wherein the judging unit is configured to judge whether any of the compound groups excluding the compound group arranged first and the compound group arranged last is arranged on the outermost edge of the limited lattice space among the compound groups assigned to the lattice points.
  • 3. The device according to claim 1, wherein the controlling unit is configured to cause the limiting unit to execute expansion of the limited lattice space based on expansion information.
  • 4. The device according to claim 1, wherein the controlling unit is configured not to change the bits already assigned to the lattice points of the compound group judged as being arranged on the outermost edge of the limited lattice space and the compound groups arranged earlier than the compound group judged as being arranged on the outermost edge of the limited lattice space.
  • 5. The device according to claim 1, wherein the controlling unit is configured to expand only a lattice space surrounding the compound group arranged on the outermost edge of the limited lattice space when the controlling unit causes the limiting unit to execute expansion of the limited lattice space.
  • 6. The device according to claim 3, wherein the expansion information considers a difference (n−M) between the order (n) of the arrangement of a compound group arranged last, and the order (M) of arrangement of the compound group judged as being arranged on the outermost edge of the limited lattice space, and the controlling unit is configured to cause the limiting unit to execute expansion of the limited lattice space based on the expansion information in a manner that the limited lattice space is expanded smaller when the difference (n−M) is small than when the difference (n−M) is large.
  • 7. The device according to claim 1, wherein the compound groups are amino acid residues.
  • 8. The device according to claim 7, wherein the compound is a protein.
  • 9. A method for searching a compound, the method comprising: defining a lattice space that is a collection of lattices where a plurality of compound groups are sequentially arranged; in a case where any of the compound groups is arranged in any of the lattices of the lattice space, followed by arranging a next compound group in the lattice space, generating a limited lattice space that is a space created by eliminating, from the lattice space, undesirable regions for the next compound group to be arranged; assigning a bit to each of lattice points, to which the compound groups can be arranged, in the limited lattice space; performing a ground state search on an Ising model obtained through conversion based on restriction conditions related to each of the lattice points according to simulated annealing, to thereby calculate minimum energy of the Ising model; judging whether any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space or not; and executing expansion of the limited lattice space, assignment of the bits to the lattice points included in the limited lattice space after the expansion, and calculation of the minimum energy of an Ising model, in a case where it is judged that any of the compound groups assigned to the lattice points is arranged on the outermost edge of the limited lattice space, wherein the method is a method for allowing a computer to search the compound in which a plurality of the compound groups are linked with one another.
  • 10. The method according to claim 9, wherein the judging is judging whether any of the compound groups excluding the compound group arranged first and the compound group arranged last is arranged on the outermost edge of the limited lattice space among the compound groups assigned to the lattice points.
  • 11. The method according to claim 9, wherein the expanding is expanding the limited lattice space based on expansion information.
  • 12. The method according to claim 9, wherein the bits already assigned to the lattice points of the compound group judged as being arranged on the outermost edge of the limited lattice space and the compound groups arranged earlier than the compound group judged as being arranged on the outermost edge of the limited lattice space are not changed.
  • 13. The method according to claim 9, wherein, in the expansion of the limited lattice space, only a lattice space surrounding the compound group arranged on the outermost edge of the limited lattice space is expanded.
  • 14. The method according to claim 11, wherein the expansion information considers a difference (n−M) between the order (n) of the arrangement of a compound group arranged last, and the order (M) of arrangement of the compound group judged as being arranged on the outermost edge of the limited lattice space, and the expansion is expanding the limited lattice space based on the expansion information in a manner that the limited lattice space is expanded smaller when the difference (n−M) is small than when the difference (n−M) is large.
  • 15. The method according to claim 9, wherein the compound groups are amino acid residues.
  • 16. The method according to claim 15, wherein the compound is a protein.
Priority Claims (1)
Number Date Country Kind
2018-201591 Oct 2018 JP national