Calculating a characteristic property of a molecule by correlation analysis

Information

  • Patent Application
  • 20030216871
  • Publication Number
    20030216871
  • Date Filed
    July 29, 2002
  • Date Published
    November 20, 2003
Abstract
Methods, including computer-implemented methods, for calculating a characteristic property of a molecule from the 3D structure of the molecule by correlation analysis. The characteristic property is equal to a contribution from the substituent parts of the molecule plus a contribution from some measured property of the molecule, such as the hydrophobicity. The contribution to the characteristic property from each substituent part is equal to a function of the distance of the substituent part to a reaction center multiplied by a weight factor, and substantially the same functional form of the distance function is used for calculating the contribution of each substituent part.
Description


BACKGROUND

[0002] The elucidation of the relationships between structure and activity of molecules is one of the major challenges in the chemical and pharmaceutical sciences. One approach to this problem is to apply quantitative structure-activity relationships (“QSAR”), which is a rapidly growing area, integrating methods of modern chemistry, biochemistry, pharmacology, molecular modeling, proteomics, and bio- and chem- informatics. In QSAR modeling, the activity of a molecule is estimated using the substituent parts of the molecule and the observed activity of molecules with similar or analogous structural motifs.


[0003] Application of conventional methods of QSAR has allowed interpretation of reactivity and bioactivity data and physico-chemical properties of molecules. Correlation analysis, which in part is based on the principles of linearity of free energy relationships (“LFER”), is one method that has proved fruitful in this approach. Conventional correlation analysis is described in, for example, Hansch, C.; et al. Substituents Constants for Correlation Analysis in Chemistry and Biology; Wiley-Interscience: N.Y., 1979; Wells, P. R. Linear Free Energy Relationships; Academic Press: London, 1968; Chapman, N. B., Shorter, J. Correlation Analysis in Chemistry; Plenum Press, N.Y. 1978; and R. G. Parr, et al. Density-functional theory of atoms and molecules. Oxford University Press, N.Y., 1989.


[0004] Conventional correlation analysis calculates the activity of a molecule as the sum of contributions from different atoms or groups of atoms in a molecule but does not take account of the 3D-structure of the molecule and separates the contributions from each atom or group of atoms into polar, steric, inductive and resonance effects.


[0005] Quantitative description of the polar influence of substituents first became possible within the framework of the approach developed by Hammett on the basis of the dissociation constants of substituted benzoic acids. The difference between the logarithm of the dissociation constant K of a substituted benzoic acid and that of the corresponding K0 of the unsubstituted standard compound is expressed by the empirical equation:
$$\log\frac{K}{K_0} = \rho\sigma \qquad (1)$$


[0006] in which two new quantities have been introduced: σ is a universal constant specific to a substituent in the benzene ring, and ρ is a reaction series constant reflecting the sensitivity of the reaction center to variation of substituent influence.
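As a brief illustration of equation (1), the following sketch evaluates log(K/K0) = ρσ and converts the result into a pKa shift. The function names and all numeric inputs are illustrative assumptions chosen for easy arithmetic; they are not data from this patent.

```python
def hammett_log_ratio(rho: float, sigma: float) -> float:
    """Return log10(K/K0) = rho * sigma, following equation (1)."""
    return rho * sigma


def predicted_pka(pka0: float, rho: float, sigma: float) -> float:
    """Since pKa = -log10(K), equation (1) gives pKa = pKa0 - rho * sigma."""
    return pka0 - hammett_log_ratio(rho, sigma)


# Assumed inputs for illustration only: a reference acid with pKa0 = 4.0,
# a reaction constant rho = 1.0 and a substituent constant sigma = 0.5.
example_pka = predicted_pka(pka0=4.0, rho=1.0, sigma=0.5)
```

With these assumed inputs, a positive σ lowers the predicted pKa relative to the reference compound, which is the qualitative behavior equation (1) describes for an electron-withdrawing substituent.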


[0007] Later, the Hammett equation was modified many times, but the vast majority of these modifications related to the chemistry of aromatic compounds. For the series of aliphatic compounds, the Hammett relation, as a rule, did not hold. Taft suggested that in this case the steric substituent effects are significant and should be separated as:
$$\log\frac{K}{K_0} = \rho\sigma^{*} + \delta E_s \qquad (2)$$


[0008] where σ* is a substituent constant depending only on the inductive influence of the substituent, Es is the substituent constant reflecting the steric effect of the substituent and δ is a reaction series constant reflecting the sensitivity of the reaction center to variations of substituent steric influence. Taft's inductive and steric constants are among the most reliable and widespread substituent parameters used in conventional QSAR.


[0009] A large number of polar and steric substituent constants have been determined, and these constants are used in many different QSAR schemes that are used for analysis of molecular reactivity, bioactivity, and physicochemical properties and reaction mechanisms studies.


[0010] In terms of mechanism of action, the steric effect is believed to be due to a variety of factors, including an increase of the bulk of a substituent leading to the mechanical shielding of the reaction center from an attacking reagent (steric hindrance of motions), an increase of steric repulsion in a transition state (steric strain) of a reaction, and steric inhibition of solvation. Thus, the methods of calculation of substituent steric constants usually operate on different descriptors of effective atomic, group or molecular sizes. For the inductive effect, there is no unanimous opinion as to the mechanism of action. The inductive effect includes polar electrostatic interactions between charged parts (atoms) of a molecule and polarization of bonds. The resonance effect is attributed to stabilization of a system (molecule, transition state, etc.) occurring due to the realization of multiple electronic states (resonance configurations).


[0011] Although conventional QSAR methods have proved useful in elucidating structure activity relationships and predicting the activity of molecules based on their structural motifs, conventional QSAR relies on an ad hoc mixture of contributions from polar, inductive, steric and resonance effects, each of which may be treated in a different manner depending on the application. In addition, conventional QSAR does not fully take into account the three dimensional structure of a molecule and thus may not include useful and important structural information contributing to the activity of a molecule.



SUMMARY

[0012] The inventors have identified new methods that treat the contributions from substituent parts of a molecule in a straightforward, consistent manner and take into account the full 3-D structure of a molecule when calculating the activity.


[0013] In this patent, we describe various methods that may be used to calculate the activity of a molecule based on its 3-D structure and give examples of the application of these methods demonstrating the utility of the methods. In this section, we summarize various aspects of the methods described in this patent and below in the Detailed Description section we present a more comprehensive description of these methods, their uses and implementations.


[0014] One of the methods described in this patent is a method for calculating a characteristic property of a molecule that includes one or more substituent parts, where the method includes the steps of (i) selecting one or more of the substituent parts as contributing substituent parts; (ii) for each of the contributing substituent parts, calculating the distance from the substituent part to a reaction center; (iii) for each of the contributing substituent parts, calculating the contribution of that substituent part to the characteristic property of the molecule; and (iv) calculating the characteristic property of the molecule by summing the contributions from the contributing substituent parts of the molecule plus a contribution equal to a measured property of the molecule multiplied by a weight factor. In this method, the contribution from a substituent part is equal to a function of the distance of the substituent part to the reaction center multiplied by a weight factor for the substituent part, and the same or substantially the same functional form for the function of the distance is used to calculate the contribution from each of the contributing substituent parts.


[0015] In one version of the methods described in this patent, the methods may be used to calculate characteristic properties that are chemical characteristic properties. Examples of chemical properties that may be calculated using the methods described in this patent include but are not limited to pKa, reaction rate constants, equilibrium constants, solubility, ionization potentials, atomization energy, evaporation energy, and bond energy. In another version of the methods described in this patent, the methods may be used to calculate a characteristic property that is a property related to the free energy of the molecule.


[0016] In one version of the methods described in this patent, the methods may be used to calculate the characteristic property of organic molecules, inorganic molecules, neutral molecules, radicals, anions, cations, ionic salts, metallo-organic compounds, or coordination compounds.


[0017] Regarding the substituent parts of the molecule, in one version of the methods described in this patent, the substituent parts of the molecule may be atoms contained in the molecule or groups of connected atoms contained in the molecule.


[0018] Regarding the reaction center, generally the reaction center may be any point in space. In one version of the methods described in this patent, the reaction center may be a substituent part of the molecule which may be an atom contained in the molecule or may be a group of connected atoms contained in the molecule.


[0019] Regarding the contributing substituent parts of the molecule, generally any number of the substituent parts may make up the contributing substituent parts. In one version of the methods described in this patent, the contributing substituent parts include all substituent parts of the molecule except one. In another version of the methods described in this patent, the contributing substituent parts include all substituent parts in the molecule except the substituent part that is the reaction center.


[0020] Regarding the function of the distance used in the calculation of the contribution from a substituent part, generally this function may be of any functional form provided that the same or substantially the same functional form is used for calculating the contribution for each substituent part. In one version of the methods described in this patent, the function of the distance is an inverse function of the distance. In another version, the function of the distance goes as the inverse of the square of the distance. In another version, the function of the distance goes as the inverse of the cube of the distance. In another version, the function of the distance goes as the sum of the inverse of the square of the distance and the inverse of the cube of the distance.


[0021] Regarding the weight factor used in the calculation of the contribution from a substituent part, generally the weight factor may be calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules. In one version of the methods described in this patent, the dependent variables for the multivariate regression analysis are the values of the characteristic property for the series of molecules and the independent variables are the distance-dependent contributions for each type of substituent part present in the series of molecules. For a particular molecule in the series of molecules, the value of the independent variable corresponding to a particular type of substituent part is equal to a sum of the function of the distance from the reaction center to the particular substituent part, where the sum is over all occurrences of that particular substituent part. In one version of the methods described in this patent, the series of molecules include molecules that are analogs of the molecule for which the characteristic property is being calculated. In another version of the methods described in this patent, the series of molecules include molecules which include an atom or group of atoms that is the same as the reaction center of the molecule for which the characteristic property is being calculated.


[0022] Regarding how the reaction center may be selected, in one version of the methods described in this patent, the reaction center is selected by performing a multivariable regression analysis for two or more different possible reaction centers, calculating a characteristic of the multivariable regression analysis for each reaction center, and determining which reaction center corresponds to the multivariable regression analysis characteristic that satisfies a predetermined criterion. In one version of the methods described in this patent, the multivariable regression analysis characteristic is the global regression coefficient of the regression analysis and the predetermined criterion selects the reaction center with the highest global regression coefficient. In another version of the methods described in this patent, the multivariable regression analysis characteristic is the global standard error of the regression analysis and the predetermined criterion selects the reaction center with the lowest global standard error.


[0023] Regarding the measured property of the molecule the weighted contribution of which is included in the calculation of the characteristic property, generally the measured property of the molecule can be any property of the molecule that can be measured. In one version of the methods described in this patent, the measured property may be the hydrophobicity of the molecule. In one version, the value of the hydrophobicity is equal to the log of the octanol/water partition coefficient. In one version of the methods described in this patent, the weight factor used in the calculation of the contribution from the measured property is calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules.


[0024] Another of the methods described in this patent is a method for calculating the pKa of a molecule that includes one or more substituent parts, where the method includes the steps of (i) selecting one or more of the substituent parts as contributing substituent parts; (ii) for each of the contributing substituent parts, calculating the distance from the substituent part to a reaction center; (iii) for each of the contributing substituent parts, calculating the contribution of that substituent part to the characteristic property of the molecule; and (iv) calculating the characteristic property of the molecule by summing the contributions from the contributing substituent parts of the molecule. In this method, the types of molecules for which pKa may be calculated, the nature of the substituent and contributing substituent parts, the nature of the reaction center, and the calculation of the contribution from a substituent part, including the form of the distance-dependent function and the calculation of the weight factor, may all be as described above.


[0025] In addition to the methods described above, other methods, devices, and compositions described in this patent include a computing device configured to calculate characteristic properties of molecules by one of the methods described in this patent; a computer-readable article of manufacture containing a computer program capable of being implemented in a computer to carry out one or more of the methods described in this patent; a molecule for which the structure was identified to include one or more substituent parts chosen to affect a characteristic property of the molecule, where the effect of the one or more substituent parts is calculated by one or more of the methods described in this patent; and a molecule synthesized after determining a likely characteristic property of the molecule, where the effect of the characteristic property of the molecule is calculated by one or more of the methods described in this patent.







BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWING(S)

[0026]
FIG. 1. Predicted vs. Experimental Activity of Mitomycins, Expressed as log (1/C) Against Human Tumor Cells in Culture.


[0027]
FIG. 2. Predicted vs. Experimental Dissociation Constants of Molecules Containing a Carboxylic Group


[0028]
FIG. 3. Predicted vs. Experimental pKa parameters of Organic Amines







DETAILED DESCRIPTION

[0029] The inventors have discovered new methods for calculating a characteristic property of a molecule by correlation analysis and in this section, we describe (1) specific aspects of the methods, (2) implementation of the methods in a computer system, (3) general uses of the methods, and (4) examples of results calculated using the methods.


[0030] Correlation Analysis Methods


[0031] The methods described in this patent may be used to calculate a characteristic property of a molecule. The characteristic properties that may be calculated and the classes of molecule to which the method may be applied are described in detail below. In the method, the molecule is conceptually separated into substituent parts, a reaction center is identified, and the distance of the substituent parts from the reaction center is calculated. The contribution from each substituent part is then calculated as a weight factor multiplied by a function of the distance of the substituent part from the reaction center. We describe in detail below the various forms of distance-dependent function that may be used and the various methods that may be used for identifying the reaction center and calculating the weight factor.


[0032] In addition to the contributions of the substituent parts as described above, the characteristic property includes a contribution from one or more measured properties of the molecule. The contribution from a measured property is equal to the value of the measured property multiplied by a weight factor. We describe in detail below measured properties of the molecule that may be used and methods that may be used for calculating the weight factor.


[0033] In terms of an equation, the method may be written as
$$CP = \sum_{j=1}^{n} W_j\, f(r_j) + \sum_{k=1}^{m} w_k\, MP_k \qquad (3)$$


[0034] where CP is the value of the characteristic property of the molecule, the sum over j is a sum over the substituent parts of the molecule, Wj is the weight factor associated with substituent j, rj is the distance from substituent j to the reaction center, f(rj) is a function of the distance from substituent j to the reaction center, the sum over k is a sum over the measured properties of the molecule, wk is the weight factor associated with the measured property k, and MPk is the value of measured property k.
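Equation 3 can be sketched in a few lines of code. The following is a minimal illustration, assuming Cartesian coordinates for each contributing substituent part and an inverse-square distance function; the names `characteristic_property`, `inverse_square`, `parts` and `measured` are hypothetical and the numbers in the usage example are invented for easy arithmetic.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float, float]


def inverse_square(r: float) -> float:
    """One distance function named in the text: f(r) = 1/r^2."""
    return 1.0 / (r * r)


def characteristic_property(
    reaction_center: Point,
    parts: Sequence[Tuple[Point, float]],
    measured: Sequence[Tuple[float, float]] = (),
    f: Callable[[float], float] = inverse_square,
) -> float:
    """Evaluate CP = sum_j W_j f(r_j) + sum_k w_k MP_k (equation 3).

    parts    -- (coordinates, W_j) for each contributing substituent part
    measured -- (MP_k, w_k) for each measured property
    """
    # First term: distance-dependent contributions of the substituent parts.
    structural = sum(
        weight * f(math.dist(xyz, reaction_center)) for xyz, weight in parts
    )
    # Second term: weighted measured properties (empty for the
    # "chemical characteristics method").
    property_term = sum(value * weight for value, weight in measured)
    return structural + property_term


# Invented inputs: two parts at distances 1 and 2 from the reaction center,
# and one measured property with value 1.5 and weight 2.0.
cp_example = characteristic_property(
    reaction_center=(0.0, 0.0, 0.0),
    parts=[((1.0, 0.0, 0.0), 2.0), ((0.0, 2.0, 0.0), 4.0)],
    measured=[(1.5, 2.0)],
)
```

Passing the distance function as a parameter mirrors the requirement that the same functional form be applied to every contributing substituent part.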


[0035] In one version of the methods described in this patent, CP is the value of the characteristic property measured relative to some constant value, which in this patent we denote by CP0. In one version, CP0 may be the value of the characteristic property for a standard compound. In another version, CP0 may be the value of the intercept of a multiple regression analysis, as will be described in detail elsewhere in this patent.


[0036] As will be described in detail below, generally any characteristic property of a molecule, including chemical and biological properties, may be calculated using the method outlined above. In addition to this general method, which we will refer to in this patent as the “general method,” this patent describes certain chemical characteristic properties that may be calculated using the method outlined above but without including the contribution from the measured properties, i.e., without including the second term on the right hand side of equation 3. In this patent, unless the context makes obvious otherwise, we will refer to this method used to calculate certain chemical characteristic properties as the “chemical characteristics method.” Unless the context makes obvious otherwise, a reference in this patent to the “methods described in this patent,” or some such language, refers to both the general method including the measured properties term and the chemical characteristics method that does not include the measured properties term.


[0037] Molecules for Which Characteristic Properties May be Calculated


[0038] Generally, the methods (both general method and chemical characteristics method) of the invention may be used to calculate the characteristic properties of any molecule or molecular fragment, including but not limited to organic molecules, inorganic molecules, neutral molecules, radicals, anions, cations, ionic salts and metallo-organic and coordination compounds. In one version of the methods described in this patent, the methods may be used to calculate the characteristic properties of peptides, proteins, and non-peptide small molecules. The methods described in this patent may be used to calculate the characteristic properties of molecules of arbitrary size. In another version of the methods described in this patent, the methods may be used to calculate characteristic properties for aniline mustards, nonsteroidal anti-inflammatory drugs (NSAIDs), and mitomycins. In another version of the methods described in this patent, the methods may be used to calculate characteristic properties for amines or carboxylic acids.


[0039] As will be described in detail below, the methods described in this patent include a function of the distances of substituent parts from a reaction center. To facilitate this calculation, the 3D structure of the molecule may be obtained by any method capable of providing the 3D structure, including, but not limited to, theoretical modeling calculations, experimental x-ray diffraction data, and other experimental data, such as NMR data. In one version of the methods described in this patent, the 3D structure is obtained by using the Hyperchem software package available from HyperCube, Inc.


[0040] Characteristic Properties That May be Calculated


[0041] Generally, any characteristic property that can be measured may be calculated by the general methods described in this patent, including but not limited to chemical, physical, and biological characteristic properties.


[0042] Examples of chemical characteristic properties that may be calculated by this general method include, but are not limited to, pKa, any property related to the free energy of the molecule, reaction rate constants, equilibrium constants, solubility, ionization potentials, atomization energy, evaporation energy, and bond energy. In one version, adiabatic ionization energies or vertical ionization energies can be calculated. Physical properties that can be calculated by this general method include, but are not limited to melting temperature, boiling temperature, and sublimation temperature.


[0043] Examples of biological characteristic properties are described in detail in the patent application filed on the same date as the application for this patent, with the same inventors, titled, “Calculating a Biological Characteristic Property of a Molecule By Correlation Analysis.”


[0044] Methods of Calculating Characteristic Property


[0045] In one version of the methods described in this patent, the characteristic property is calculated as the sum of contributions from substituent parts of the molecule. As described below in detail, not all substituent parts of the molecule need be included in this calculation. In this version the characteristic property is calculated as equal to a sum of contributions from each contributing substituent part and the contribution of each substituent part is substantially equal to the product of a weight factor multiplied by a function of the distance of the substituent part to a reaction center.


[0046] This version of the methods described in this patent is shown in equation form in Equation 3 above.


[0047] Substituent Parts


[0048] As part of the methods described in this patent, a molecule is conceptually separated into substituent parts and the characteristic property is calculated as the sum of contributions from some number of the substituent parts. The substituent parts contributing to the calculation of the characteristic property are referred to in this patent as the “contributing substituent parts.” Generally, the substituent parts of a molecule may be any portion of the molecule, including but not limited to individual atoms in the molecule, groups of atoms in the molecule, and individual portions of high electron density in the molecule (for example, lone pairs). In one version of the methods described in this patent, the substituent parts are individual atoms or groups of atoms. A person well versed in the use of correlation analysis to calculate the properties of molecules will understand how to identify atoms and groups that may be used as substituent parts. Generally, however, any portion of the molecule, including atoms and groups, may be used as substituent parts.


[0049] Non-limiting examples of atoms and groups that may be used as substituent parts include all possible atoms, alkyl groups, alkenyl groups, aromatic groups, metallo-organic groups, and hetero-aromatic groups. A person familiar with the technology of correlation analysis will be able to identify, in a straightforward manner, other groups that may be used.


[0050] Generally, any number of the substituent parts may be contributing substituent parts. In one version, all of the substituent parts except one are contributing substituent parts. In another version in which the reaction center is a substituent part, all of the substituent parts except the reaction center are contributing substituent parts. In a version in which the contribution of a substituent part diminishes as the distance to the reaction center increases, substituent parts distant from the reaction center may make insignificant contribution to the calculated property and may be omitted from the contributing substituent parts. Such distant substituent parts may, however, also be included in the contributing substituent parts.
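The selection of contributing substituent parts described above can be sketched as a simple filter. This is a hypothetical illustration only: the optional `cutoff` distance is an assumption introduced here to show how distant parts might be omitted, and the patent does not prescribe any particular cutoff value.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def select_contributing_parts(
    parts: Sequence[Point],
    reaction_center: Point,
    cutoff: Optional[float] = None,
) -> List[Point]:
    """Return the parts that contribute to the calculated property.

    A part located exactly at the reaction center is excluded (the version in
    which the reaction center is itself a substituent part). If a cutoff is
    given (an assumed, user-chosen distance), parts farther away are omitted.
    """
    kept = []
    for xyz in parts:
        r = math.dist(xyz, reaction_center)
        if r == 0.0:
            continue  # this part is the reaction center itself
        if cutoff is not None and r > cutoff:
            continue  # distant part, treated here as an insignificant contribution
        kept.append(xyz)
    return kept
```

Leaving `cutoff` unset keeps every part except the reaction center, which corresponds to the version in which distant substituent parts are still included.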


[0051] Reaction Center


[0052] In the methods described in this patent, having determined the contributing substituent parts of the molecule, one then calculates the distance from the contributing substituent parts to a reaction center. Generally, the reaction center can be any point in space. As will be described below in detail, in one version of the methods described in this patent an optimal reaction center may be identified by varying the position of the reaction center, calculating the weight factors for the substituent parts by multivariable regression analysis using the various reaction centers, and identifying the optimal reaction center as that center yielding the best regression analysis fit. In one version, the reaction center may be identified as one of the substituent parts of the molecule.
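The search for an optimal reaction center can be illustrated with a deliberately simplified sketch. It assumes a single type of substituent part, an inverse-square distance function, and the coefficient of determination of a one-variable least-squares line as the measure of fit; the data layout (dictionaries with `centers`, `parts` and `value` keys) and all function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Coefficient of determination of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # no variation, so no meaningful fit
    return (sxy * sxy) / (sxx * syy)


def descriptor(center: Point, parts: Sequence[Point]) -> float:
    """Sum of 1/r^2 over the contributing parts (assumed distance function)."""
    return sum(1.0 / math.dist(p, center) ** 2 for p in parts)


def best_reaction_center(
    molecules: List[dict], candidates: Sequence[str]
) -> Tuple[str, Dict[str, float]]:
    """Score each candidate reaction center by regression fit; return the best."""
    values = [m["value"] for m in molecules]
    scores = {}
    for label in candidates:
        x = [descriptor(m["centers"][label], m["parts"]) for m in molecules]
        scores[label] = r_squared(x, values)
    return max(scores, key=scores.get), scores
```

In the multi-type case, the one-variable fit would be replaced by the multivariable regression described later in this patent, and the selection could equally be based on the lowest global standard error.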


[0053] Functional Forms


[0054] The inventors have discovered that it is possible to take into account the structure of a molecule when calculating a characteristic property if the contribution of each contributing substituent part is proportional to a function of the distance of the substituent part to the reaction center. The function of the distance used to calculate the contribution for each substituent has the same or substantially the same functional form; the function of the distance may, however, generally be of any functional form. By substantially the same functional form, we mean a functional form that is not identical to the other functional forms but for which the difference in functional form does not qualitatively affect the results of the calculations. As a nonlimiting example, functional forms of $1/r^{2}$ and $1/r^{(2-\delta)}$ may be considered substantially the same for small δ.


[0055] In one version of the methods described in this patent, the functional form is a function of the inverse of the distance. In another version, the functional form goes as the inverse of the square of the distance (i.e., f(r) proportional to $1/r^{2}$). In another version, the functional form goes as the inverse of the cube of the distance (i.e., f(r) proportional to $1/r^{3}$). In another version, the functional form goes as $1/r^{2} + 1/r^{3}$.


[0056] In the $1/r^{2}$ version, for example, equation (3) becomes:
$$CP = \sum_{j=1}^{n} \frac{W_j}{r_j^{2}}$$
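The functional forms listed above can be written down directly. The sketch below is illustrative: the dictionary keys and function names are assumptions, and the comparison with δ = 0.01 is only an example of what a "small δ" might look like.

```python
from typing import Callable, Dict, Sequence

# The four distance functions named in the text, keyed by hypothetical labels.
DISTANCE_FUNCTIONS: Dict[str, Callable[[float], float]] = {
    "inverse": lambda r: 1.0 / r,
    "inverse_square": lambda r: 1.0 / r**2,
    "inverse_cube": lambda r: 1.0 / r**3,
    "inverse_square_plus_cube": lambda r: 1.0 / r**2 + 1.0 / r**3,
}


def near_inverse_square(r: float, delta: float) -> float:
    """f(r) = 1/r^(2 - delta); close to 1/r^2 when delta is small."""
    return 1.0 / r ** (2.0 - delta)


def cp_inverse_square(weights: Sequence[float], distances: Sequence[float]) -> float:
    """CP = sum_j W_j / r_j^2, the inverse-square form without measured properties."""
    f = DISTANCE_FUNCTIONS["inverse_square"]
    return sum(w * f(r) for w, r in zip(weights, distances))


# At r = 2 the exact inverse-square value is 0.25; with delta = 0.01 the
# value changes only slightly, illustrating "substantially the same" forms.
difference = abs(near_inverse_square(2.0, 0.01) - DISTANCE_FUNCTIONS["inverse_square"](2.0))
```

Whichever entry is chosen, the same function is applied to every contributing substituent part in a given calculation.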


[0057] Calculation of the Weight Factors


[0058] As part of the methods described in this patent, the contribution to the characteristic property of a molecule by a substituent part is given by a function of the distance of that substituent part from a reaction center multiplied by a weight factor. Generally the weight factor may be calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules. Below we describe one specific version of the methods that may be used to calculate the weight factors, but first we describe in more general terms methods that may be used. A description of the implementation of multivariate regression analysis may be found in, for example, Essentials of Statistics, Stephen A. Book, New York, McGraw Hill, 1978, page 315 et seq.


[0059] In one version of the methods described in this patent, the dependent variables for the multivariate regression analysis are the values of the characteristic property for the series of molecules and the independent variables are the distance-dependent contributions for each type of substituent part present in the series of molecules. For a particular molecule in the series of molecules, the value of the independent variable corresponding to a particular type of substituent part is equal to a sum of the function of the distance from the reaction center to the particular substituent part, where the sum is over all occurrences of that particular substituent part. In one version of the methods described in this patent, the series of molecules include molecules that are analogs of the molecule for which the characteristic property is being calculated. In another version of the methods described in this patent, the series of molecules include molecules which include an atom or group of atoms that is the same as the reaction center of the molecule for which the characteristic property is being calculated.


[0060] One specific example of the multivariable regression analysis that may be used to calculate the weight factors is as follows. This example calculates the weight factors for a version of the methods described in this patent in which the function of the distance used in calculating the contribution of the substituent parts goes as the inverse of the square of the distance. In a more general version of the methods described in this patent in which the function of the distance may be any function, f(r), the following example will still apply except that the R-matrix contains terms of the form
$$\sum_k f\left(r_{rc-m_k}\right)$$


[0061] rather than
$$\sum_k \frac{1}{r_{rc-m_k}^{2}}.$$


[0062] This example is presented in three steps: first, calculation of the geometries of the series of molecules used to calculate the weights; second, the calculations of the “R-matrix;” and third, the multivariable regression analysis, also called the partial least squares analysis, used to calculate the weights as the regression coefficients.


[0063] 1. Input. Structural files for the optimized geometries of the molecules of the reaction series are prepared, where each contributing substituent part is specified with its number and 3 spatial coordinates.


[0064] If a reaction series contains M molecules, then M structural files should be prepared as input. For each molecule j, its reaction center (rcj) is specified by placing the corresponding atomic number into the [rc1, . . . , rcj, . . . , rcM] vector.


[0065] 2. R-Matrix. The next step of the procedure is composition of the R-matrix containing sums of the
$$\sum_k \frac{1}{r_{rc-m_k}^{2}}$$


[0066] terms, related to certain types of substituent parts.


[0067] When there are K types of substituent parts present in M molecules of the reaction series, the [M×K] R-matrix is formed. For each structural file the program sorts the atoms according to specified types of substituent parts and calculates the sums
$$\sum_k \frac{1}{r_{rc-m_k}^{2}},$$


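[0068] where r is the direct distance between a substituent part of type m in molecule j and the reaction center, and k runs over the substituent parts of type m in molecule j:
$$R=\begin{bmatrix}
\left(\sum_k \frac{1}{r_{rc-m_k}^{2}}\right)_{1,1} & \left(\sum_k \frac{1}{r_{rc-m_k}^{2}}\right)_{1,2} & \cdots & \left(\sum_k \frac{1}{r_{rc-m_k}^{2}}\right)_{1,K}\\
\vdots & \vdots & & \vdots\\
\left(\sum_k \frac{1}{r_{rc-m_k}^{2}}\right)_{j,1} & \left(\sum_k \frac{1}{r_{rc-m_k}^{2}}\right)_{j,2} & \cdots & \left(\sum_k \frac{1}{r_{rc-m_k}^{2}}\right)_{j,K}\\
\vdots & \vdots & & \vdots\\
\left(\sum_k \frac{1}{r_{rc-m_k}^{2}}\right)_{M,1} & \left(\sum_k \frac{1}{r_{rc-m_k}^{2}}\right)_{M,2} & \cdots & \left(\sum_k \frac{1}{r_{rc-m_k}^{2}}\right)_{M,K}
\end{bmatrix}$$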


[0069] In the absence of contributing substituent parts of m-type in the molecule j, the corresponding matrix element is set equal to 0.
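The composition of one row of the R-matrix described above can be sketched as follows. This is a minimal illustration in Python, not the routine used by the applicants: the function name `r_matrix_row`, the type labels, and the toy coordinates are hypothetical, and the reaction center is assumed not to contribute to its own row.

```python
import math
from collections import defaultdict

def r_matrix_row(atoms, rc_index, types):
    """Return one R-matrix row: for each type, the sum of 1/r^2 over atoms of that type.

    atoms    -- list of (type_label, (x, y, z)) tuples for one molecule
    rc_index -- index of the atom chosen as the reaction center
    types    -- ordered list of substituent-part types (the K columns)
    """
    rc_xyz = atoms[rc_index][1]
    sums = defaultdict(float)
    for i, (label, xyz) in enumerate(atoms):
        if i == rc_index:
            continue  # assumption: the reaction center itself is skipped
        r = math.dist(rc_xyz, xyz)  # direct distance to the reaction center
        sums[label] += 1.0 / r ** 2
    # Types absent from the molecule get a matrix element of 0.
    return [sums.get(t, 0.0) for t in types]

# Hypothetical toy molecule (coordinates invented for illustration only).
toy = [
    ("O", (0.0, 0.0, 0.0)),   # reaction center
    ("C4", (1.0, 0.0, 0.0)),
    ("H", (0.0, 2.0, 0.0)),
    ("H", (0.0, 0.0, 2.0)),
]
row = r_matrix_row(toy, rc_index=0, types=["H", "C4", "F"])
print(row)  # [0.5, 1.0, 0.0]
```

Stacking one such row per molecule, with the same column order, gives the [M×K] matrix; the zero in the last column corresponds to a type that does not occur in the molecule.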


[0070] 3. Partial Least Squares (PLS) analysis. The final step in this procedure is to estimate whether the dataset can be treated as a set of dependent parameters of a multiparameter regression with an intercept equal to CP0. For example, when the method of the invention is applied to free energy (ΔG is the free energy measured relative to some standard free energy G0), the experimental free energy changes are taken as the vector ΔG:
$$\Delta G=\begin{bmatrix}\Delta G_1\\ \Delta G_2\\ \vdots\\ \Delta G_M\end{bmatrix},$$


[0071] the equation can be written in matrix notation as the following:


Rg=ΔG


[0072] where g is the solution vector
$$\begin{bmatrix}g_1\\ g_2\\ \vdots\\ g_K\end{bmatrix},$$


[0073] containing K values of what will be the weight


[0074] factors (Wj) which here are designated gi, corresponding to all types of contributing substituent parts.


[0075] When M>K (i.e. the number of molecules in the reaction series is greater than the number of types of contributing substituent parts) the system is overdetermined and Rg=ΔG can be solved in the least-squares sense.


[0076] An approximate solution of the equation can be obtained by multivariable regression, in which the columns of the R-matrix are treated as sets of independent variables and the ΔG values as the dependent parameters. If such a regression can be estimated with high accuracy, its linear coefficients can be taken as the weight factors corresponding to the types of contributing substituent parts.
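The regression step can be illustrated with a small sketch. It uses ordinary least squares with an intercept, solved through the normal equations in plain Python; this is a stand-in chosen for brevity and is not a PLS implementation or the applicants' program. The names `solve_linear` and `fit_weights` and all numbers in the toy data are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_weights(R, y):
    """Least-squares fit of y = intercept + R g; returns (intercept, g)."""
    X = [[1.0] + list(row) for row in R]  # prepend a column of ones
    p = len(X[0])
    XtX = [[sum(x[a] * x[b] for x in X) for b in range(p)] for a in range(p)]
    Xty = [sum(x[a] * yi for x, yi in zip(X, y)) for a in range(p)]
    beta = solve_linear(XtX, Xty)
    return beta[0], beta[1:]

# Hypothetical toy R-matrix (M = 5 molecules, K = 2 types) and property values
# generated from known weights, so the fit can be checked.
R = [[0.5, 1.0], [1.0, 0.5], [0.2, 0.3], [0.7, 0.1], [0.4, 0.9]]
y = [2.0 + 3.0 * a - 1.0 * b for a, b in R]
intercept, g = fit_weights(R, y)
print(intercept, g)
```

With M greater than K the fit is a least-squares compromise rather than an exact solution; here the toy data are noise-free, so the known intercept and weights are recovered up to rounding.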


[0077] Additional Measured Properties That May Contribute to the Calculated Characteristic Property and Calculation of Weights for the Additional Measured Properties


[0078] As presented in Equation 3 above and the supporting description, in one aspect of the methods described in this patent, the characteristic property is calculated as a contribution from the contributing substituent parts plus a contribution from one or more measured properties of the molecule. In one version of these methods, there is a contribution from one measured property of the molecule. Generally, any property of the molecule may be included as a measured property. Properties that may be measured properties include but are not limited to biological properties, chemical properties, and physical properties of the molecule. In one version, the hydrophobicity of the molecule is one measured property that may be used. In one version, the hydrophobicity may be calculated as the logarithm of the octanol/water partition coefficient.


[0079] Implementation of the Methods


[0080] The methods described in this patent may be implemented using any device capable of implementing the methods. Examples of devices that may be used include but are not limited to electronic computational devices, including computers of all types. When the methods described in this patent are implemented in a computer, the computer program that may be used to configure the computer to carry out the steps of the methods may be contained in any computer readable medium capable of containing the computer program. Examples of computer readable medium that may be used include but are not limited to diskettes, CD-ROMs, DVDs, ROM, RAM, and other memory and computer storage devices. The computer program that may be used to configure the computer to carry out the steps of the methods may also be provided over an electronic network, for example, over the internet, world wide web, an intranet, or other network.


[0081] In one example, the methods described in this patent may be implemented in a system comprising a processor and a computer readable medium that includes program code means for causing the system to carry out the steps of the methods described in this patent. The processor may be any processor capable of carrying out the operations needed for implementation of the methods. The program code means may be any code that when implemented in the system can cause the system to carry out the steps of the methods described in this patent. Examples of program code means include but are not limited to instructions to carry out the methods described in this patent written in a high level computer language such as C++, Java, or Fortran; instructions to carry out the methods described in this patent written in a low level computer language such as assembly language; or instructions to carry out the methods described in this patent in a computer executable form such as compiled and linked machine language.


[0082] Uses of the Methods


[0083] The methods described in this patent may be used in a variety of ways including but not limited to the prediction of a characteristic property of a molecule that has not been synthesized or for which the property has not been measured; investigation of the effect of structural modification on the characteristic property of a molecule, which may be used to identify candidate molecules for use in specific circumstances, including but not limited to uses as pharmaceuticals. The methods described in this patent may be used to predict the characteristic properties of any molecule or molecule fragment for which the structure is known or may be obtained. The methods may be used to predict the efficacy of a molecule or molecular fragment for various uses including but not limited to use as a pharmaceutical, herbicide, insecticide, nutraceutical, cosmetic, or fungicide.



EXAMPLES

[0084] The following examples demonstrate implementation of various methods described in this patent and demonstrate the operability and utility of these methods. The general approach in these examples is to compose an [M×K] matrix of $r^{-2}$ terms for a series of M molecules containing K different types of contributing substituent parts. The interatomic distances, r, are determined by using the Hyperchem software package, which allows simple estimation of standard geometries of the corresponding molecules. The resulting $r^{-2}$ matrices are then analyzed with the appropriate multivariable regression analysis to determine the weight parameters. The implementation of this method is referred to in these examples as the 3D-CAN(TM) method. In these examples the contributing substituent parts are referred to as “atomic types” or some similar phrase, and the weight factors are referred to as “operational parameters,” “operational atomic parameters,” or similar phrase and are designated edi, 1di, gi, ici, cox1i; and cox2i in the various examples. Methods described in these examples that include a contribution from a measured property of the molecule are referred to as “modified 3D-CAN(TM)” or similar phrase.


[0085] The examples below demonstrate specific implementation of methods that may be used in the selection of a reaction center.


[0086] As used in these examples, an atom designation of C4 for example represents a 4-coordinate carbon atom (i.e., sp3 hybridized), C3 represents a 3-coordinate carbon atom (i.e., sp2 hybridized), N3 represents a 3-coordinate nitrogen atom (i.e., sp2 hybridized), etc.



Example 1


Application of the Modified 3D CAN(TM) to Quantification of Mitomycin Series of Anti-Cancer Compounds

[0087] In order to evaluate the applicability of the developed approach for quantification of bioactivity data we have considered the antitumor activity of substituted mitomycins. A number of attempts have been previously made to study structure-activity relationships of mitomycins, clinical antitumor agents of the quinone series.
[Chemical structure of the mitomycin series; image not reproduced.]


[0088] No satisfactory results have previously been obtained. The best previously reported correlation was found between the activity of compounds 1-30 (See table F) and the corresponding values of their logP and redox potentials, with a correlation coefficient of 0.84.


[0089] We have considered a number of derivatives of Mitomycin C (1-19) and Mitomycin A (20-30) and processed their activities (expressed as the concentration C, which is the average IC50 from assays) against human tumor cells in culture (S. P. Gupta, Chem. Review, 94, No. 6, 1519 (1994)). The corresponding experimental log(1/C) and logP values have been processed within the modified 3D CAN(TM) scheme, where the parameters are modeled as follows:
$$\log\left(\frac{1}{C}\right)=\mathrm{const}+\sum_{i\neq rc}^{N-1}\frac{g_i}{r_{rc-i}^{2}}+\alpha\log P$$


[0090] where N is the number of atoms in the molecule, rrc-i is the distance between atom i and the reaction center (rc), and gi is the ability of an atom of a certain type to contribute to the overall ΔΔG=ΔG−ΔG0 value. logP is the empirical measure of hydrophobicity.


[0091] Since the equation above contains the interatomic distance to the atom selected as a reaction center, 3D CAN(TM) allows scanning multiple potential reaction centers to establish the appropriate one, based on the quality of the regression. Several common atoms were tested as a potential reaction center of the series.
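The scan over candidate reaction centers can be sketched as below. This is a deliberately reduced illustration under stated assumptions: only one atomic type is used, so the regression collapses to a straight-line fit, and the regression quality is taken to be R². The function names and the toy series are hypothetical; the activities are generated from atom 0 by construction, so that atom should be selected.

```python
import math

def descriptor(atoms, rc_index, atom_type):
    """Sum of 1/r^2 from the candidate reaction center to atoms of one type."""
    rc_xyz = atoms[rc_index][1]
    return sum(
        1.0 / math.dist(rc_xyz, xyz) ** 2
        for i, (label, xyz) in enumerate(atoms)
        if i != rc_index and label == atom_type
    )

def r_squared(x, y):
    """R^2 of a straight-line fit y = a + b*x (squared Pearson correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def scan_reaction_centers(molecules, activities, candidates, atom_type):
    """Return {candidate atom index: R^2} for each common atom tried as the rc."""
    scores = {}
    for rc in candidates:
        x = [descriptor(atoms, rc, atom_type) for atoms in molecules]
        scores[rc] = r_squared(x, activities)
    return scores

# Hypothetical toy series: atoms 0 ("A") and 1 ("B") are common to every
# molecule; a single H atom sits at a different position in each molecule.
h_positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (-1.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
molecules = [
    [("A", (0.0, 0.0, 0.0)), ("B", (3.0, 0.0, 0.0)), ("H", pos)]
    for pos in h_positions
]
activities = [1.0 + 2.0 / math.dist((0.0, 0.0, 0.0), pos) ** 2 for pos in h_positions]

scores = scan_reaction_centers(molecules, activities, candidates=[0, 1], atom_type="H")
best_rc = max(scores, key=scores.get)
print(best_rc, scores)
```

In the full method each candidate would be scored with the multivariable regression over all atomic types (and logP, where used) rather than with a single descriptor.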


[0092] For the mitomycin series we have considered numerous common atoms as potential reaction centers (rc). For example, when the carbon atom of the quinolone o-methyl group is taken as the reaction center, the quality of the regression is poor, as can be seen in the following table:

Regression Statistics
Multiple R | 0.890038
R Square | 0.792167
Adjusted R Square | 0.536372
Standard Error | 0.542012
Observations | 30


[0093] The corresponding atomic operational parameters also have poor quality (see Table 1):
TABLE 1
Operational Parameters for Atomic Group Using the Quinolone Carbon as RC

Atomic type | Coefficients | Standard Error
Const | −7.09212 | 14.77619
H | 22.58408 | 11.8968
C4 | −31.9739 | 16.13538
C═ | −25.7366 | 14.09525
C aromatic | −9.3124 | 6.767811
N3 | −102.108 | 14.5592
—O— | −64.1973 | 9.511758
O═ | 377.0665 | 119.1042
F | 5.937482 | 30.53861
Br | 11.06703 | 34.05972
I | 17.64792 | 27.12964
—S— | −16.3543 | 9.743814
—N═ | 173.6192 | 61.12753
N nitro | −645.49 | 205.5137
N indole | 18.09392 | 33.21112
N pyridine | −27.5241 | 27.39797


[0094] The best quality regression parameters were obtained when an atom in the center ring of mitomycin (marked with a star in the structure above) was considered as the rc. The parameters of the corresponding regression, estimated in this approximation, are presented in the following table:

Regression Statistics
Multiple R | 0.956692
R Square | 0.91526
Adjusted R Square | 0.810965
Standard Error | 0.346095
Observations | 30


[0095] When the hydrophobicity is not taken into account, the quality of the correlation is lower:

Regression Statistics
Multiple R | 0.949617
R Square | 0.901772
Adjusted R Square | 0.796527
Standard Error | 0.359069
Observations | 30
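The adjusted R Square values in the two regression tables for the center-ring reaction center can be checked against the standard adjustment formula, 1 − (1 − R²)(n − 1)/(n − p − 1). The sketch below assumes that this textbook formula was used, that n = 30 observations, and that the number of predictors p is 15 atomic types with logP excluded and 16 with logP included; these counts are inferred from the tables, not stated in the text.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# R Square values as reported; predictor counts are assumptions (see lead-in).
with_logp = adjusted_r2(0.91526, n_obs=30, n_predictors=16)
without_logp = adjusted_r2(0.901772, n_obs=30, n_predictors=15)

print(f"with logP:    {with_logp:.6f}")
print(f"without logP: {without_logp:.6f}")
```

Both results agree with the tabulated values (0.810965 and 0.796527) to within the rounding of the reported R Square, which supports the assumed predictor counts.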


[0096] The estimated atomic operational contributions determined by regression are given in Table 2 and the operational R matrix of the modified 3D CAN(TM) (matrix of parameters) is given as Table 3.
TABLE 2
Operational atomic parameters g, derived for the presented atomic types by equation from log(1/C) against human tumor.

Atomic type | Coefficients | Standard Error
const | −3.22439 | 14.49385
H | 27.2439 | 11.9157
C4 | −41.5106 | 16.90643
C═ | −39.3292 | 16.5488
C aromatic | −15.2638 | 7.724581
N3 | −95.8146 | 14.6992
—O— | −54.2981 | 11.46341
O═ | 420.8054 | 118.7589
F | 8.571243 | 29.49205
Br | 2.105548 | 33.4149
I | 3.576405 | 27.91911
—S— | −18.4213 | 9.501031
—N═ | 207.4299 | 63.43391
N nitro | −714.328 | 203.7863
N indole | 17.87198 | 32.01148
N pyridine | −28.297 | 26.41347
logP | 0.211075 | 0.146731
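As a worked illustration of how the fitted parameters are applied, the sketch below evaluates the modified 3D CAN(TM) equation for compound 1, using the Table 2 coefficients and the nonzero entries of the compound 1 row of Table 3 (sums of 1/r² by atomic type) together with its logP. The dictionary keys and the function name are illustrative choices, not part of the original method description.

```python
# Coefficients transcribed from Table 2 (only the types present in compound 1).
CONST = -3.22439
ALPHA_LOGP = 0.211075
G = {
    "H": 27.2439,
    "C4": -41.5106,
    "C=": -39.3292,
    "N3": -95.8146,
    "-O-": -54.2981,
    "O=": 420.8054,
}

# Nonzero Table 3 entries for compound 1 and its logP.
ROW_1 = {"H": 2.1452, "C4": 1.3313, "C=": 1.0476,
         "N3": 0.3369, "-O-": 0.1745, "O=": 0.2157}
LOGP_1 = -0.38

def predict_log_inv_c(row, logp):
    """log(1/C) = const + sum over types of g * (sum of 1/r^2) + alpha * logP."""
    return CONST + sum(G[t] * s for t, s in row.items()) + ALPHA_LOGP * logp

pred = predict_log_inv_c(ROW_1, LOGP_1)
print(f"predicted log(1/C) for compound 1: {pred:.3f}")
```

The value lands within a few hundredths of the Table 4 prediction for compound 1 (7.711772); the remaining gap is plausibly due to the four-decimal rounding of the tabulated R-matrix entries, which is amplified by coefficients of order 10 to 400.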


[0097]

TABLE 3
The operational R matrix of the modified 3D CAN(TM) (matrix of parameters)

Compound/Atomic type | H | C4 | C═ | C aromatic | N3 | —O— | O═ | F | Br | I
1 | 2.1452 | 1.3313 | 1.0476 | 0.0000 | 0.3369 | 0.1745 | 0.2157 | 0.0000 | 0.0000 | 0.0000
2 | 2.2659 | 1.4152 | 1.0556 | 0.0000 | 0.3282 | 0.1867 | 0.2148 | 0.0000 | 0.0000 | 0.0000
3 | 2.2092 | 1.3681 | 1.0852 | 0.0000 | 0.3376 | 0.1746 | 0.2196 | 0.0000 | 0.0000 | 0.0000
4 | 2.2637 | 1.4374 | 1.0477 | 0.0000 | 0.3374 | 0.1916 | 0.2195 | 0.0000 | 0.0000 | 0.0000
5 | 2.2376 | 1.3901 | 1.1043 | 0.0000 | 0.3370 | 0.1892 | 0.2213 | 0.0000 | 0.0000 | 0.0000
6 | 2.2929 | 1.3999 | 1.0482 | 0.0812 | 0.3375 | 0.1744 | 0.2157 | 0.0000 | 0.0000 | 0.0000
7 | 2.2096 | 1.3344 | 1.0479 | 0.1538 | 0.3369 | 0.1745 | 0.2194 | 0.0000 | 0.0000 | 0.0000
8 | 2.2197 | 1.3333 | 1.0481 | 0.1536 | 0.3508 | 0.1742 | 0.2195 | 0.0000 | 0.0000 | 0.0000
9 | 2.1954 | 1.3344 | 1.0483 | 0.1532 | 0.3369 | 0.1745 | 0.2195 | 0.0140 | 0.0000 | 0.0000
10 | 2.1952 | 1.3344 | 1.0484 | 0.1536 | 0.3369 | 0.1745 | 0.2195 | 0.0000 | 0.0126 | 0.0000
11 | 2.1965 | 1.3342 | 1.0481 | 0.1540 | 0.3368 | 0.1744 | 0.2192 | 0.0000 | 0.0000 | 0.0113
12 | 2.1953 | 1.3344 | 1.0483 | 0.1542 | 0.3367 | 0.1745 | 0.2194 | 0.0000 | 0.0000 | 0.0122
13 | 2.2078 | 1.3342 | 1.0480 | 0.1535 | 0.3368 | 0.1884 | 0.2195 | 0.0000 | 0.0000 | 0.0000
14 | 2.1945 | 1.3338 | 1.0483 | 0.1542 | 0.3365 | 0.1747 | 0.2441 | 0.0000 | 0.0000 | 0.0000
15 | 2.1949 | 1.3341 | 1.0483 | 0.1535 | 0.3366 | 0.1886 | 0.2196 | 0.0000 | 0.0000 | 0.0113
16 | 2.1932 | 1.3324 | 1.0478 | 0.1525 | 0.3365 | 0.1884 | 0.2411 | 0.0000 | 0.0000 | 0.0000
17 | 2.2053 | 1.3330 | 1.0481 | 0.1908 | 0.3365 | 0.1744 | 0.2193 | 0.0000 | 0.0000 | 0.0000
18 | 2.1513 | 1.3501 | 1.1238 | 0.0000 | 0.3370 | 0.1749 | 0.2186 | 0.0000 | 0.0000 | 0.0000
19 | 2.1722 | 1.3334 | 1.1414 | 0.0000 | 0.3558 | 0.1745 | 0.2152 | 0.0000 | 0.0000 | 0.0000
20 | 2.1814 | 1.3796 | 1.0562 | 0.0000 | 0.2871 | 0.2147 | 0.2170 | 0.0000 | 0.0000 | 0.0000
21 | 2.2248 | 1.4370 | 1.0559 | 0.0000 | 0.2869 | 0.2147 | 0.2183 | 0.0000 | 0.0000 | 0.0000
22 | 2.3145 | 1.4789 | 1.0561 | 0.0000 | 0.2868 | 0.2153 | 0.2169 | 0.0000 | 0.0000 | 0.0000
23 | 2.3140 | 1.4785 | 1.0563 | 0.0000 | 0.2868 | 0.2152 | 0.2175 | 0.0000 | 0.0000 | 0.0000
24 | 2.2195 | 1.3776 | 1.0558 | 0.0895 | 0.2868 | 0.2151 | 0.2164 | 0.0000 | 0.0000 | 0.0000
25 | 2.2381 | 1.4093 | 1.0558 | 0.0000 | 0.2869 | 0.2376 | 0.2170 | 0.0000 | 0.0000 | 0.0000
26 | 2.2359 | 1.3998 | 1.0558 | 0.0562 | 0.2869 | 0.2323 | 0.2171 | 0.0000 | 0.0000 | 0.0000
27 | 2.2476 | 1.4230 | 1.0563 | 0.0000 | 0.2870 | 0.2422 | 0.2171 | 0.0000 | 0.0000 | 0.0000
28 | 2.2615 | 1.4309 | 1.0556 | 0.0000 | 0.2871 | 0.2428 | 0.2165 | 0.0000 | 0.0000 | 0.0000
29 | 2.2319 | 1.3992 | 1.0561 | 0.0496 | 0.2870 | 0.2149 | 0.2168 | 0.0000 | 0.0000 | 0.0000
30 | 2.2327 | 1.4170 | 1.0559 | 0.0000 | 0.2869 | 0.2224 | 0.2169 | 0.0000 | 0.0000 | 0.0000

Compound/Atomic type | —S— | —N═ | N nitro | N indole | N pyridine | logP
1 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | −0.38
2 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.1
3 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.24
4 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.21
5 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 1.9
6 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0177 | 1.23
7 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 1.3
8 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.07
9 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 1.44
10 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 2.16
11 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 2.42
12 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 2.42
13 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.63
14 | 0.0000 | 0.0000 | 0.0137 | 0.0000 | 0.0000 | 1.02
15 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 1.75
16 | 0.0000 | 0.0000 | 0.0126 | 0.0000 | 0.0000 | 0.51
17 | 0.0000 | 0.0000 | 0.0000 | 0.0146 | 0.0000 | 2.45
18 | 0.0365 | 0.0177 | 0.0000 | 0.0000 | 0.0000 | 1.52
19 | 0.0000 | 0.0220 | 0.0000 | 0.0000 | 0.0000 | 0.56
20 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.26
21 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.83
22 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 1.35
23 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 2.47
24 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 1.94
25 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | −1.1
26 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 1.74
27 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | −1.08
28 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | −0.46
29 | 0.0160 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 2.38
30 | 0.0299 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.36

[0098]

TABLE 4
Predicted and Experimental Values of Active Concentration (log 1/C) of Mitomycins 1-30 Against Human Tumor

Compound | R | Prediction | Experiment | Residual
1 | NH2 | 7.711772 | 7.7 | −0.01177
2 | HOC3H6NH | 7.071587 | 6.98 | −0.09159
3 | HC═CCH2—NH | 8.102683 | 8.46 | 0.357317
4 | tetrahydrofuryl-NH | 7.245377 | 7.13 | −0.11538
5 | 2-furyl-C2H4—NH | 7.565948 | 7.34 | −0.22595
6 | 2-pyridyl-C2H4—NH | 7.38 | 7.38 | −1.3E−14
7 | C6H5NH | 8.862808 | 8.78 | −0.08281
8 | 4-H2N—C6H4—NH | 7.642204 | 7.83 | 0.187796
9 | 4-F—C6H4—NH | 8.67 | 8.67 | −2E−14
10 | 4-Br—C6H4—NH | 8.72 | 8.72 | 1.78E−14
11 | 3-I—C6H4—NH | 8.7268 | 8.9 | 0.1732
12 | 4-I—C6H4—NH | 8.771307 | 8.77 | −0.00131
13 | 4-OH—C6H4—NH | 7.965666 | 7.88 | −0.08567
14 | 4-NO2—C6H4—NH | 9.015853 | 9.07 | 0.054147
15 | 3-I-4-OH—C6H3—NH | 7.931492 | 7.76 | −0.17149
16 | 4-OH-3-NO2—C6H3—NH | 7.76895 | 7.71 | −0.05895
17 | 5-indolyl-NH | 8.75 | 8.75 | −8.9E−15
18 | 4-methyl-thiazolyl-NH | 8.679922 | 8.69 | 0.010078
19 | 3-pyrazolyl-NH | 7.388116 | 7.38 | −0.00812
20 | CH3O | 9.602933 | 9.52 | −0.08293
21 | c-C3H5—O | 9.080572 | 9.2 | 0.119428
22 | c-C3H5—CH2—O | 9.304672 | 9.43 | 0.125328
23 | c-C4H7—CH2—O | 9.787183 | 9.66 | −0.12718
24 | C6H5—CH2—O | 9.481265 | 9.21 | −0.27126
25 | HO—C2H4—O | 8.397708 | 8.31 | −0.08771
26 | C6H5—O—C2H4—O | 8.808812 | 9.48 | 0.671188
27 | HO—C2H4—O—C2H4—O | 7.88795 | 7.32 | −0.56795
28 | CH3—O—C2H4—O—C2H4—O | 7.786789 | 8.24 | 0.453211
29 | C6H5—S—C2H4—O | 9.480943 | 9.16 | −0.32094
30 | HO—C2H4—SS—C2H4—O | 8.490691 | 8.65 | 0.159309

[0099] Table 4 above (presented graphically in FIG. 1) demonstrates that the modified 3D CAN(TM) allows us to quantify the set of bioactivity parameters of substituted mitomycins with an accuracy considerably higher than has been previously reported by other authors.
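The residual column of Table 4 can be tied back to the Standard Error reported for the center-ring regression with logP. The sketch below assumes that the residual is the experimental value minus the prediction and that the Standard Error is the usual residual standard error, sqrt(SSR/(n − p − 1)), with p = 16 predictors (15 atomic types plus logP); both are inferences from the tables rather than statements in the text.

```python
import math

# Residuals transcribed from Table 4, compounds 1-30.
residuals = [
    -0.01177, -0.09159, 0.357317, -0.11538, -0.22595, -1.3e-14,
    -0.08281, 0.187796, -2e-14, 1.78e-14, 0.1732, -0.00131,
    -0.08567, 0.054147, -0.17149, -0.05895, -8.9e-15, 0.010078,
    -0.00812, -0.08293, 0.119428, 0.125328, -0.12718, -0.27126,
    -0.08771, 0.671188, -0.56795, 0.453211, -0.32094, 0.159309,
]

n_obs = len(residuals)
n_predictors = 16  # assumption: 15 atomic types plus logP

# Sign convention check on compound 3: experiment minus prediction.
resid_3 = 8.46 - 8.102683

ssr = sum(e * e for e in residuals)                      # sum of squared residuals
std_error = math.sqrt(ssr / (n_obs - n_predictors - 1))  # residual standard error

print(f"compound 3 residual: {resid_3:.6f}")
print(f"standard error: {std_error:.4f}")
```

Under these assumptions the result is close to the reported Standard Error of 0.346095. The residuals of order 1E−14 belong to compounds 6, 9, 10 and 17, each of which is the only compound in Table 3 with a nonzero entry for one atomic type (N pyridine, F, Br and N indole, respectively), so a least-squares fit reproduces those four values essentially exactly.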



Example 2


The use of 3D-CAN(TM) Approach for Quantification of Dissociation Constants of Molecules Containing Carboxylic Group

[0100] Values of ionization constants of 827 various carboxylic acids (including small polypeptides) have been extrapolated to 25° C. and zero ionic strength (Kortüm, G.; Vogel, W.; Andrussow, K. Dissociation Constants of Organic Acids in Aqueous Solution. Butterworths: London, 1961; Perrin, D. D.; Dempsey, B.; Serjeant, E. P. pKa Prediction for Organic Acids and Bases. Chapman & Hall: London, New York, 1981). The structures of the acid molecules have been optimized with the MM+ routine of the Hyperchem software package, which allows simple estimation of the standard geometries in the gas phase.


[0101] We have taken the ionizable oxygen as the reaction center and have composed a [827×21] R-matrix for 827 compounds containing 21 types of substituent atoms. The following atomic types have been specified: H, C sp3, C sp2, C sp, C aromatic, N sp3, N sp (CN group), O sp2, O sp3, F, Cl, Br, I, S sp3, S4 (from —SO2—), Si, Se, N+, O−, N+ sp2. Nitro groups in nitro-substituted compounds were considered as a single unit, and the corresponding r parameters have been taken as the distances between the reaction center and the nitrogen of NO2. Ionized carboxylic groups have been considered as having a full negative charge on one of the oxygen atoms, while the other is in the O sp2 configuration.


[0102] The composition of the R-matrix has been performed by a MATLAB routine, which exports atomic types and coordinates from the Hyperchem structural file, arranges the atoms according to the types specified, and calculates intramolecular distances. After the reaction-center atoms have been indicated for all molecules of a reaction series, the routine composes the corresponding R-matrix.


[0103] The columns of this [827×21] matrix of the reaction series have been taken as the sets of independent variables, and the corresponding thermodynamic pK values have been considered as the dependent parameters of the following equation:
$$pK_{RCOOH}=\sum_{i}^{N-1}\frac{\delta_i^{a}}{r_i^{2}}+\mathrm{const}$$


[0104] where δia is the introduced atomic operational parameter, reflecting the ability of atoms of one type to contribute to the pK value of an N-atomic carboxylic acid RCOOH, where R represents the molecular environment of the carboxylic group.


[0105] A multilinear regression has then been established with high accuracy (Const =4.84+/−0.12; N=827; R(mult)=0.9703; S=0.1035). The estimated dissociation constants of the carboxylic acids are presented in Table 5. The interrelation between estimated and experimental pK values is presented graphically in FIG. 2. The structures of the various carboxylic acids are presented in Scheme 1.
TABLE 5
Experimental (25° C., I = 0) and estimated dissociation constants of compounds containing carboxylic groups.

Nr | Name | pK corr | pK calc | Δ
1 | HCOOH | 3.75 | 4.10 | −0.35
2 | CH3COOH | 4.76 | 4.30 | 0.46
3 | C2H5COOH | 4.87 | 4.58 | 0.29
4 | C3H7COOH | 4.82 | 4.68 | 0.14
5 | iso C3H7COOH | 4.84 | 4.72 | 0.12
6 | C4H9COOH | 4.80 | 4.77 | 0.03
7 | iso-C4H9COOH | 4.74 | 4.78 | −0.03
8 | t-C4H9COOH | 4.97 | 4.86 | 0.11
9 | sec-C4H9COOH | 4.88 | 4.84 | 0.04
10 | C5H10COOH | 4.88 | 4.80 | 0.08
11 | (CH3)2CH—C2H4COOH | 4.80 | 4.79 | 0.01
12 | C3H7CH(CH3)COOH | 4.86 | 4.92 | −0.06
13 | C2H5CH(CH3)CH2COOH | 4.91 | 4.86 | 0.05
14 | C2H5C(CH3)2COOH | 5.13 | 4.97 | 0.16
15 | (C2H5)2CHCOOH | 4.71 | 4.94 | −0.23
16 | C6H13COOH | 4.89 | 4.79 | 0.10
17 | t-C4H9C2H4COOH | 4.86 | 4.92 | −0.06
18 | C3H7CH(C2H5)COOH | 4.78 | 5.01 | −0.24
19 | C7H15COOH | 4.89 | 4.86 | 0.03
20 | C8H17COOH | 4.95 | 4.88 | 0.07
21 | HOOCCOOH | 1.27 | 2.68 | −1.41
22 | OOCCOOH | 4.28 | 5.37 | −1.09
23 | HOOCCH2COOH | 2.84 | 3.01 | −0.17
24 | OOCH2COOH | 5.66 | 4.95 | 0.71
25 | HOOCC2H4COOH | 4.21 | 4.08 | 0.12
26 | OOCC2H4COOH | 5.64 | 5.52 | 0.11
27 | HOOCC3H6COOH | 4.34 | 4.31 | 0.03
28 | OOCC3H6COOH | 5.41 | 5.16 | 0.25
29 | HOOCC4H8COOH | 4.43 | 4.35 | 0.08
30 | OOCC4H8COOH | 5.41 | 5.12 | 0.29
31 | HOOCC5H10COOH | 4.48 | 4.38 | 0.10
32 | OOCC5H10COOH | 5.42 | 5.36 | 0.07
33 | HOOCC6H12COOH | 4.52 | 4.52 | 0.00
34 | OOCC6H12COOH | 5.40 | 5.27 | 0.14
35 | HOOCC7H14COOH | 4.55 | 4.60 | −0.05
36 | OOCC7H12COOH | 5.41 | 5.12 | 0.30
37 | HOOCCH(CH3)COOH | 3.05 | 3.32 | −0.27
38 | OOCCH(CH3)COOH | 5.76 | 6.01 | −0.26
39 | HOOCCH(C2H5)COOH | 2.99 | 3.07 | −0.08
40 | OOCCH(C2H5)COOH | 5.83 | 6.05 | −0.22
41 | HOOCCH(C3H7)COOH | 3.00 | 2.93 | 0.07
42 | OOCCH(C3H7)COOH | 5.84 | 6.15 | −0.30
43 | HOOCCH(iso-C3H7)COOH | 2.94 | 3.16 | −0.22
44 | OOCCH(iso-C3H7)COOH | 5.88 | 6.23 | −0.35
45 | HOOCC(CH3)2COOH | 3.17 | 3.26 | −0.09
46 | OOCC(CH3)2COOH | 6.06 | 5.66 | 0.40
47 | HOOCC(CH3)(C2H5)COOH | 2.86 | 2.67 | 0.19
48 | OOCC(CH3)(C2H5)COOH | 6.41 | 6.49 | −0.08
49 | HOOCC(C2H5)2COOH | 2.21 | 3.13 | −0.92
50 | OOCC(C2H5)2COOH | 7.29 | 7.67 | −0.38
51 | HOOCC(C2H5)(C3H7)COOH | 2.15 | 3.12 | −0.98
52 | OOCC(C2H5)(C3H7)COOH | 7.43 | 7.77 | −0.33
53 | HOOCC(C3H7)2COOH | 2.07 | 3.17 | −1.10
54 | OOCC(C3H7)2COOH | 7.51 | 7.85 | −0.34
55 | OOCCH(CH3)CH2COOH | 5.73 | 5.46 | 0.27
56 | HOOCCH(CH3)CH(CH3)COOH meso | 3.77 | 3.72 | 0.05
57 | OOCCH(CH3)CH(CH3)COOH meso | 5.94 | 5.84 | 0.10
58 | HOOCCH(CH3)CH(CH3)COOH rac | 3.94 | 3.81 | 0.13
59 | OOCCH(CH3)CH(CH3)COOH rac | 6.20 | 6.27 | −0.07
60 | HOOCCH(C2H5)CH2COOH | 4.08 | 4.18 | −0.10
61 | HOOCC(C2H5)2CH2COOH | 3.84 | 4.21 | −0.37
62 | HOOCCH(C2H5)CH(C2H5)COOH meso | 3.63 | 3.33 | 0.30
63 | OOCCH(C2H5)CH(C2H5)COOH meso | 6.46 | 6.79 | −0.33
64 | HOOCCH(C2H5)CH(C2H5)COOH rac | 3.51 | 3.59 | −0.08
65 | OOCCH(C2H5)CH(C2H5)COOH rac | 6.60 | 6.44 | 0.15
66 | HOOCCH(C2H5)C(C2H5)2COOH | 2.74 | 2.28 | 0.46
67 | HOOCCH2CH(CH3)CH2COOH | 4.25 | 4.44 | −0.19
68 | OOCCH2CH(CH3)CH2COOH | 5.41 | 5.62 | −0.21
69 | HOOCCH2CH(C2H5)CH2COOH | 4.29 | 4.52 | −0.24
70 | OOCCH2CH(C2H5)CH2COOH | 5.33 | 5.29 | 0.04
71 | HOOCCH2C(CH3)2CH2COOH | 3.70 | 3.77 | −0.07
72 | OOCCH2C(CH3)2CH2COOH | 6.34 | 6.11 | 0.23
73 | HOOCCH2CH(C3H7)CH2COOH | 4.31 | 4.59 | −0.28
74 | OOCCH2CH(C3H7)CH2COOH | 5.39 | 5.85 | −0.47
75 | HOOCCH2CH(iso-C3H7)CH2COOH | 4.30 | 4.68 | −0.38
76 | OOCCH2CH(iso-C3H7)CH2COOH | 5.51 | 5.53 | −0.02
77 | HOOCCH2C(C2H5)(CH3)CH2COOH | 3.62 | 3.94 | −0.32
78 | OOCCH2C(C2H5)(CH3)CH2COOH | 6.70 | 6.52 | 0.18
79 | HOOCCH2C(C2H5)2CH2COOH | 3.62 | 3.90 | −0.28
80 | OOCCH2C(C2H5)2CH2COOH | 7.12 | 7.24 | −0.12
81 | HOOCCH2C(C3H7)(CH3)CH2COOH | 3.63 | 3.13 | 0.50
82 | HOOCCH2C(C3H7)(C2H5)CH2COOH | 3.51 | 3.46 | 0.05
83 | HOOCCH2C(C3H7)2CH2COOH | 3.69 | 4.37 | −0.68
84 | OOCCH2C(C3H7)2CH2COOH | 7.31 | 7.36 | −0.05
85 | H2OCHCOOH | 4.25 | 4.31 | −0.07
86 | CH3HC═CHCOOH trans | 4.69 | 4.40 | 0.29
87 | CH3HC═CHCOOH cis | 4.48 | 4.56 | −0.08
88 | H2C═CHCH2COOH | 4.34 | 4.54 | −0.20
89 | H2C═C(CH3)COOH | 4.73 | 4.43 | 0.30
90 | H3CC═CCOOH | 2.65 | 2.69 | −0.04
91 | H5C2CH═CHCOOH | 4.69 | 4.48 | 0.21
92 | H2C═CHC2H4COOH | 4.67 | 4.62 | 0.05
93 | H3CCH═CH(CH3)COOH cis | 4.36 | 4.67 | −0.31
94 | H3CCH═CH(CH3)COOH trans | 5.06 | 4.52 | 0.54
95 | (H3C)2C═CHCOOH | 5.12 | 4.65 | 0.47
96 | H7C3CH═CHCOOH | 4.70 | 4.53 | 0.18
97 | H5C2CH═CHCH2COOH trans | 4.52 | 4.66 | −0.15
98 | H3CCH═CHC2H4COOH trans | 4.72 | 4.65 | 0.06
99 | H2C═CHC3H6COOH | 4.72 | 4.70 | 0.02
100 | H5C2C(CH3)═CHCOOH trans | 5.13 | 4.68 | 0.45
101 | H5C2C(CH3)═CHCOOH cis | 5.15 | 4.81 | 0.34
102 | iso-H7C3—CH═CHCOOH | 4.70 | 4.56 | 0.14
103 | (H3C)2C═CHCH2COOH | 4.60 | 4.69 | −0.09
104 | (H3C)2C═CHC2H4COOH | 4.80 | 4.72 | 0.08
105 | HOOCCH═CHCOOH cis | 2.00 | 2.69 | −0.69
106 | OOCCH═CHCOOH cis | 6.26 | 5.63 | 0.63
107 | HOOCCH═CHCOOH trans | 3.02 | 3.83 | −0.81
108 | OOCCH═CHCOOH trans | 4.45 | 5.16 | −0.71
109 | HOOCCH2—CH═CHCOOH | 3.77 | 4.11 | −0.34
110 | OOCCH2—CH═CHCOOH | 5.08 | 5.56 | −0.48
111 | OOCC(CH3)═CHCOOH cis | 6.29 | 5.77 | 0.52
112 | OOCC(CH3)═CHCOOH trans | 4.89 | 5.44 | −0.54
113 | H2C═C(COO)CH2COOH | 5.64 | 5.75 | −0.11
114 | C3H5(cyclo)COOH | 4.83 | 4.57 | 0.26
115 | C4H7(cyclo)COOH | 4.79 | 4.68 | 0.10
116 | C5H9(cyclo)COOH | 4.99 | 4.91 | 0.07
117 | C6H11(cyclo)COOH | 4.90 | 4.89 | 0.00
118 | C6H10(cyclo), 1-CH3, 1-COOH | 5.13 | 5.06 | 0.07
119 | C6H10(cyclo)-2-CH3, 1-COOH trans | 5.74 | 5.08 | 0.66
120 | C6H10(cyclo)-2-CH3, 1-COOH ecvat | 5.04 | 5.32 | −0.29
121 | C6H10(cyclo)-3-CH3, 1-COOH trans | 5.02 | 4.91 | 0.12
122 | C6H10(cyclo)-3-CH3, 1-COOH ecvat | 4.88 | 5.11 | −0.23
123 | C6H10(cyclo)-4-CH3, 1-COOH trans | 4.88 | 4.98 | −0.10
124 | C6H10(cyclo)-4-CH3, 1-COOH ecvat | 5.04 | 4.98 | 0.06
125 | C6H11(cyclo)CH2COOH | 4.80 | 4.83 | −0.03
126 | C6H11(cyclo)C2H4COOH | 4.91 | 4.78 | 0.13
127 | C6H11(cyclo)C3H6COOH | 4.95 | 4.85 | 0.10
128 | C3H4(cyclo)-1-COOH, 1-COOH | 1.82 | 2.95 | −1.13
129 | C3H4(cyclo)-1-COO, 1-COOH | 5.43 | 5.45 | −0.02
130 | C3H4(cyclo)-2-COOH, 1-COOH axial | 3.66 | 3.88 | −0.22
131 | C3H4(cyclo)-2-COCH, 1-COOH axial | 5.14 | 5.04 | 0.10
132 | C3H4(cyclo)-2-COOH, 1-COOH ecvat | 3.33 | 3.40 | −0.07
133 | C3H4(cyclo)-2-COO, 1-COOH ecvat | 5.47 | 5.05 | 0.42
134 | C4H6(cyclo)-1-COOH, 1-COOH | 3.13 | 3.70 | −0.58
135 | C4H6(cyclo)-1-COO, 1-COOH | 5.88 | 5.81 | 0.07
136 | C4H6(cyclo)-2-COOH, 1-COOH axial | 3.79 | 4.19 | −0.40
137 | C4H6(cyclo)-2-COO, 1-COOH axial | 5.61 | 5.45 | 0.16
138 | C4H6(cyclo)-2-COOH, 1-COOH ecvat | 3.90 | 3.52 | 0.38
139 | C4H6(cyclo)-2-COO, 1-COOH ecvat | 5.89 | 6.19 | −0.30
140 | C4H6(cyclo)-3-COOH, 1-COOH axial | 3.81 | 4.25 | −0.44
141 | C4H6(cyclo)-3-COO, 1-COOH axial | 5.28 | 5.00 | 0.28
142 | C4H6(cyclo)-3-COOH, 1-COOH ecvat | 4.03 | 3.74 | 0.29
143 | C4H6(cyclo)-3-COO, 1-COOH ecvat | 5.31 | 5.08 | 0.23
144 | C5H8(cyclo)-1-COOH, 1-COOH | 3.23 | 3.01 | 0.22
145 | C5H8(cyclo)-2-COOH, 1-COOH axial | 3.96 | 4.25 | −0.29
146 | C5H8(cyclo)-2-COO, 1-COOH axial | 5.85 | 5.47 | 0.38
147 | C5H8(cyclo)-2-COOH, 1-COOH ecvat | 4.43 | 4.32 | 0.11
148 | C5H8(cyclo)-2-COO, 1-COOH ecvat | 6.57 | 6.46 | 0.10
149 | C5H8(cyclo)-3-COOH, 1-COOH axial | 4.32 | 4.42 | −0.10
150 | C5H8(cyclo)-3-COO, 1-COOH axial | 5.42 | 5.27 | 0.15
151 | C5H8(cyclo)-3-COOH, 1-COOH ecvat | 4.26 | 4.05 | 0.21
152 | C5H8(cyclo)-3-COO, 1-COOH ecvat | 5.51 | 5.28 | 0.23
153 | C5H8(cyclo)-2-CH2COOH, 1-COOH axial | 4.44 | 4.42 | 0.02
154 | C5H8(cyclo)-2-CH2COO, 1-COOH axial | 5.74 | 5.80 | −0.06
155 | C5H8(cyclo)-2-CH2COOH, 1-COOH ecvat | 4.45 | 4.76 | −0.31
156 | C5H8(cyclo)-2-CH2COO, 1-COOH ecvat | 5.86 | 5.87 | −0.01
157 | C5H8(cyclo)-1-CH2COOH, 1-CH2COOH | 3.80 | 4.19 | −0.39
158 | C5H8(cyclo)-1-CH2COO, 1-CH2COOH | 6.77 | 6.86 | −0.09
159 | C5H8(cyclo)-2-CH2COOH, 1-CH2COOH axial | 4.48 | 4.61 | −0.13
160 | C5H8(cyclo)-2-CH2COO, 1-CH2COOH axial | 5.50 | 5.76 | −0.26
161 | C5H8(cyclo), 2-CH2COOH, 1-CH2COOH ecvat | 4.47 | 4.32 | 0.15
162 | C5H8(cyclo), 2-CH2COO, 1-CH2COOH ecvat | 5.49 | 5.08 | 0.41
163 | C5H8(cyclo), 3-CH2COOH, 1-CH2COOH ecvat | 3.79 | 3.48 | 0.32
164 | C5H7(cyclo), 3-CH3, 1-CH2COOH, 1-CH2COOH | 6.74 | 6.37 | 0.37
165 | C6H10(cyclo), 1-COOH, 1-COOH | 3.45 | 3.03 | 0.42
166 | C6H10(cyclo), 2-COOH, 1-COOH axial | 4.25 | 4.42 | −0.17
167 | C6H10(cyclo), 2-COO, 1-COOH axial | 6.01 | 6.20 | −0.18
168 | C6H10(cyclo), 2-COOH, 1-COOH ecvat | 4.38 | 4.29 | 0.09
169 | C6H10(cyclo), 2-COO, 1-COOH ecvat | 6.86 | 6.27 | 0.59
170 | C6H10(cyclo), 3-COOH, 1-COOH axial | 4.37 | 4.33 | 0.04
171 | C6H10(cyclo), 3-COO, 1-COOH axial | 5.81 | 5.62 | 0.19
172 | C6H10(cyclo), 3-COOH, 1-COOH ecvat | 4.19 | 3.93 | 0.26
173 | C6H10(cyclo), 3-COO, 1-COOH ecvat | 5.59 | 5.55 | 0.04
174 | C6H10(cyclo), 4-COOH, 1-COOH axial | 4.27 | 4.62 | −0.35
175 | C6H10(cyclo), 4-COO, 1-COOH axial | 5.50 | 5.34 | 0.16
176 | (1) | 4.00 | 4.34 | −0.34
177 | (2) | 5.88 | 6.02 | −0.14
178 | (3) | 3.94 | 4.07 | −0.13
179 | (4) | 6.88 | 6.87 | 0.01
180 | C6H10(cyclo), 1-CH2COOH, 1-CH2COOH | 3.49 | 3.36 | 0.12
181 | C6H10(cyclo), 1-CH2COOH, 1-CH2COOH | 6.96 | 6.77 | 0.20
182 | C6H10(cyclo), 2-CH2COOH, 1-CH2COOH axial | 4.43 | 4.48 | −0.05
183 | C6H10(cyclo), 2-CH2COO, 2-CH2COOH axial | 5.49 | 5.70 | −0.21
184 | C6H10(cyclo), 2-CH2COOH, 1-CH2COOH ecvat | 4.47 | 4.63 | −0.16
185 | C6H10(cyclo), 2-CH2COO, 1-CH2COOH ecvat | 5.52 | 5.62 | −0.10
186 | C6H10(cyclo), 2-CH2COOH, 2-CH3, 1-CH2COOH | 6.89 | 6.71 | 0.17
187 | C6H10(cyclo), 2-CH2COOH, 3-CH3, 1-CH2COOH | 3.49 | 3.42 | 0.07
188 | C6H10(cyclo), 2-CH2COO, 3-CH3, 1-CH2COOH | 6.08 | 6.34 | −0.26
189 | C6H10(cyclo), 4-CH3, 1-CH2COOH, 1-CH2COOH | 3.49 | 3.17 | 0.32
190 | C6H10(cyclo), 4-CH3, 1-CH2COO, 1-CH2COOH | 6.10 | 6.14 | −0.04
191 | (5) | 5.01 | 5.13 | −0.12
192 | (6) | 6.78 | 6.75 | 0.03
193 | (7) | 4.00 | 4.21 | −0.21
194 | (8) | 5.70 | 5.87 | −0.18
195 | (9) | 3.98 | 3.99 | −0.01
196 | (10) | 6.47 | 6.39 | 0.08
197 | (11) | 4.07 | 4.48 | −0.41
198 | (12) | 5.73 | 5.48 | 0.25
199 | (13) | 4.14 | 4.45 | −0.31
200 | (14) | 7.48 | 7.73 | −0.25
201 | (15) | 4.57 | 4.35 | 0.22
202 | (16) | 6.82 | 6.86 | −0.04
203 | (17) | 4.11 | 3.92 | 0.19
204 | (18) | 5.81 | 5.38 | 0.43
205 | (19) | 4.30 | 3.80 | 0.50
206 | (20) | 7.14 | 7.37 | −0.23
207 | (21) | 4.51 | 4.23 | 0.28
208 | (22) | 6.72 | 6.49 | 0.22
209 | (23) | 4.14 | 3.93 | 0.21
210 | (24) | 6.24 | 6.14 | 0.10
211 | (25) | 4.84 | 4.37 | 0.47
212 | (26) | 7.05 | 7.13 | −0.08
213 | (27) | 4.20 | 3.96 | 0.24
214 | (28) | 7.92 | 7.79 | 0.13
215 | (29) | 4.71 | 4.33 | 0.38
216 | (30) | 6.86 | 7.13 | −0.28
217 | (31) | 4.61 | 4.32 | 0.29
218 | (32) | 6.96 | 6.75 | 0.21
219 | (33) | 4.57 | 4.89 | −0.32
220 | (34) | 6.25 | 6.06 | 0.19
221 | (35) | 4.20 | 3.93 | 0.27
222 | (36) | 7.92 | 8.36 | −0.44
223 | (37) | 4.77 | 4.46 | 0.31
224 | (38) | 6.96 | 6.72 | 0.23
225 | (39) | 4.30 | 3.92 | 0.38
226 | (40) | 6.15 | 5.84 | 0.30
227 | (41) | 4.51 | 4.88 | −0.37
228 | (42) | 6.09 | 6.40 | −0.32
229 | (43) | 4.74 | 4.91 | −0.17
230 | (44) | 6.31 | 6.01 | 0.30
231 | (45) | 4.50 | 4.75 | −0.25
232 | (46) | 5.70 | 5.96 | −0.26
233 | (47) | 3.82 | 4.34 | −0.52
234 | (48) | 5.32 | 5.28 | 0.04
235 | (49) | 2.34 | 2.50 | −0.16
236 | (50) | 8.31 | 8.65 | −0.35
237 | (51) | 3.60 | 4.49 | −0.89
238 | (52) | 5.29 | 5.64 | −0.36
239 | (53) | 3.86 | 4.27 | −0.41
240 | (54) | 5.59 | 5.39 | 0.20
241 | FCH2COOH | 2.59 | 3.74 | −1.15
242 | ClCH2COOH | 2.82 | 2.62 | 0.20
243 | BrCH2COOH | 2.90 | 2.49 | 0.41
244 | ICH2COOH | 3.18 | 3.25 | −0.07
245 | N≡CCH2COOH | 2.45 | 1.75 | 0.70
246 | Cl2CHCOOH | 1.37 | 1.83 | −0.47
247 | Cl3CCOOH | 0.63 | 0.50 | 0.13
248 | CH3CH(Cl)COOH | 2.91 | 2.72 | 0.18
249 | ClC2H4COOH | 4.17 | 3.89 | 0.28
250 | CH3CH(Br)COOH | 3.00 | 2.88 | 0.12
251 | BrC2H4COOH | 4.06 | 3.84 | 0.23
252 | CH3CH(I)COOH | 3.16 | 3.64 | −0.48
253 | IC2H4COOH | 4.16 | 4.14 | 0.02
254 | CH3CH(CN)COOH | 2.37 | 2.40 | −0.04
255 | N≡CC2H4COOH | 3.99 | 3.66 | 0.33
256 | F3CCH2COOH | 2.95 | 3.33 | −0.38
257 | (CH3)2C(Cl)COOH | 3.02 | 3.20 | −0.18
258 | N≡CC3H6COOH | 4.44 | 4.03 | 0.41
259 | (CH3)2C(CN)COOH | 2.42 | 2.42 | 0.00
260 | F3CC2H4COOH | 4.16 | 4.19 | −0.03
261 | C2H5CH(CH2Br)COOH | 3.97 | 4.17 | −0.20
262 | F3CC3H6COOH | 4.49 | 4.37 | 0.12
263 | F2CHC3F6COOH | 2.65 | 2.55 | 0.11
264 | F7C3C2H4COOH | 4.18 | 3.95 | 0.23
265 | F2CHC5F10COOH | 2.68 | 2.18 | 0.50
266 | F2CHC7F14COOH | 2.60 | 2.07 | 0.53
267 | H2C═CFCOOH | 2.55 | 3.72 | −1.16
268 | F2C═CHCOOH | 3.17 | 3.49 | −0.33
269 | F2C═CFCOOH | 1.79 | 3.15 | −1.36
270 | ClCH═CHCOOH trans | 3.70 | 3.63 | 0.07
271 | ClCH═CHCOOH cis | 3.32 | 3.54 | −0.22
272 | Cl2C═CHCOOH | 1.15 | 0.88 | 0.27
273 | CH3CH═CClCOOH | 3.22 | 3.53 | −0.31
274 | H2C═CHCHClCOOH | 2.54 | 2.98 | −0.44
275 | F3CCH═CHCOOH | 3.35 | 3.24 | 0.11
276 | C3F7CH═CHCOOH | 3.23 | 2.91 | 0.32
277 | C6H11(cyclo)—CH(CN)COOH | 2.37 | 2.70 | −0.33
278 | C6H10(cyclo), 2-CN, 1-COOH axial | 3.86 | 4.09 | −0.23
279 | HOOCCH(Br)CH2COOH | 2.75 | 2.43 | 0.32
280 | OOCCH(Br)CH2COOH | 4.44 | 4.47 | −0.03
281 | HOOCCH(Cl)CH(Cl)COOH rac | 1.43 | 1.44 | −0.02
282 | OOCCH(Cl)CH(Cl)COOH rac | 2.78 | 3.07 | −0.29
283 | HOOCCH(Cl)CH(Cl)COOH meso | 1.52 | 1.56 | −0.04
284 | OOCCH(Cl)CH(Cl)COOH meso | 2.96 | 2.72 | 0.24
285 | HOOCCH(Br)CH(Cl)COOH meso | 1.46 | 1.61 | −0.16
286 | OOCCH(Cl)CH(Br)COOH meso | 2.79 | 3.14 | −0.35
287 | HOOCCH(Br)CH(Cl)COOH rac | 1.43 | 1.52 | −0.09
288 | OOCCH(Cl)CH(Br)COOH rac | 2.63 | 2.73 | −0.10
289 | HOOCCH(Br)CH(Br)COOH meso | 1.42 | 1.85 | −0.43
290 | OOCCH(Br)CH(Br)COOH meso | 3.27 | 3.47 | −0.19
291 | HOOCCH(Br)CH(Br)COOH rac | 1.51 | 1.40 | 0.11
292 | OOCCH(Br)CH(Br)COOH rac | 2.74 | 2.66 | 0.08
293 | HOCH2COOH | 3.83 | 3.87 | −0.04
294 | C2H5OCH2COOH | 3.70 | 4.01 | −0.31
295 | C5H9(cyclo)OCH2COOH | 3.70 | 3.76 | −0.06
296 | C6H11(cyclo)OCH2COOH | 3.54 | 3.74 | −0.20
297 | C6H11(cyclo)—CH2OCH2COOH | 3.90 | 4.17 | −0.27
298 | C6H10(cyclo), 1-OCH2COOH, 2-CH3 | 3.80 | 3.87 | −0.08
299 | C6H10(cyclo), 1-OCH2COOH, 3-CH3 axial | 3.81 | 3.78 | 0.03
300 | C6H10(cyclo), 1-OCH2COOH, 3-CH3 ecvat | 3.85 | 4.20 | −0.34
301 | (55) | 4.75 | 4.34 | 0.41
302 | C6H5OCH2COOH | 3.17 | 3.31 | −0.14
303 | C6H4(2-CH3)OCH2COOH | 3.23 | 3.45 | −0.22
304 | C6H4(3-CH3)OCH2COOH | 3.20 | 3.35 | −0.15
305 | C6H4(4-CH3)OCH2COOH | 3.22 | 3.34 | −0.12
306 | C6H3(2-CH3, 6-CH3)OCH2COOH | 3.36 | 3.55 | −0.19
307 | C6H4(2-OCH3)OCH2COOH | 3.23 | 3.04 | 0.19
308 | C6H4(3-OCH3)OCH2COOH | 3.14 | 3.25 | −0.10
309 | C6H4(4-OCH3)OCH2COOH | 3.21 | 3.34 | −0.13
310 | CH3CH(OH)COOH | 3.86 | 3.47 | 0.39
311 | C6H11(cyclo)OCH(CH3)COOH | 3.64 | 3.98 | −0.34
312 | C6H10(cyclo) 2-CH3, 1-OCH(CH3)COOH | 3.65 | 4.10 | −0.45
313 | CH3C(CH3)(OH)COOH | 4.11 | 4.28 | −0.18
314 | C2H5C(CH3)(OH)COOH | 4.06 | 3.69 | 0.38
315 | CH3CH(OH)CH(CH3)COOH | 4.72 | 4.37 | 0.35
316 | (C2H5)2C(OH)COOH | 3.87 | 3.86 | 0.01
317 | CH3CH(OH)C2H4COOH | 4.76 | 4.50 | 0.26
318 | CH3C(OH)(CH3)C2H4COOH | 4.94 | 4.58 | 0.36
319 | HOCH2(CH(OH))4COOH | 3.23 | 2.85 | 0.38
320 | (CH3)2CHCH(OH)C2H4CH(CH3)CH2COOH | 5.17 | 4.88 | 0.29
321 | C6H10(cyclo), 2-OH, 1-COOH axial | 4.68 | 4.64 | 0.04
322 | C6H10(cyclo), 2-OH, 1-COOH ecvat | 4.80 | 4.47 | 0.33
323 | C6H10(cyclo), 3-OH, 1-COOH axial | 4.81 | 4.60 | 0.20
324 | C6H10(cyclo), 3-OH, 1-COOH ecvat | 4.60 | 4.51 | 0.09
325 | C6H10(cyclo), 4-OH, 1-COOH axial | 4.68 | 4.71 | −0.04
326 | C6H10(cyclo), 4-OH, 1-COOH ecvat | 4.84 | 4.66 | 0.17
327 | OOCCH(OH)CH2COOH | 5.14 | 4.86 | 0.28
328 | HOOCCH(OH)CH(OH)COOH rac | 3.04 | 3.22 | −0.19
329 | OOCCH(OH)CH(OH)COOH rac | 4.37 | 4.15 | 0.22
330 | HOOCCH(OH)CH(OH)COOH meso | 3.22 | 3.04 | 0.18
331 | OOCCH(OH)CH(OH)COOH meso | 4.82 | 4.31 | 0.51
332 | HOOCCH2C(OH)(COOH)CH2COOH | 3.13 | 2.72 | 0.41
333 | HOOCCH2C(OH)(COOH)CH2COOH | 4.76 | 4.44 | 0.32
334 | HOOCCH2C(OH)(COOH)CH2COOH | 6.40 | 6.05 | 0.34
335 | H3N+CH2COOH | 2.35 | 2.15 | 0.20
336 | CH3N+H2CH2COOH | 2.35 | 2.34 | 0.01
337 | C2H5N+H2CH2COOH | 2.34 | 2.37 | −0.03
338 | C3H7N+H2CH2COOH | 2.35 | 2.30 | 0.05
339 | C4H9N+H2CH2COOH | 2.35 | 2.47 | −0.12
340 | Iso-C4H9N+H2CH2COOH | 2.35 | 2.50 | −0.15
341 | HC(O)NHCH2COOH | 3.43 | 3.67 | −0.24
342 | H3CC(O)NHCH2COOH | 3.67 | 3.74 | −0.07
343 | ClCH2C(O)NHCH2COOH | 3.38 | 3.43 | −0.05
344 | C2H5C(O)NHCH2COOH | 3.72 | 3.78 | −0.07
345 | H2NC(O)NHCH2COOH | 3.88 | 3.54 | 0.33
346 | C2H5OC(O)NHCH2COOH | 3.68 | 3.49 | 0.19
347 | H3N+CH(CH3)COOH | 2.34 | 2.24 | 0.10
348 | CH3N+H2CH(CH3)COOH | 2.22 | 2.49 | −0.27
349 | C2H5N+H2CH(CH3)COOH | 2.22 | 2.36 | −0.14
350 | C3H7N+H2CH(CH3)COOH | 2.21 | 2.61 | −0.40
351 | CH3C(O)NHCH(CH3)COOH | 3.72 | 3.53 | 0.19
352 | H2NC(O)NHCH(CH3)COOH | 3.89 | 3.96 | −0.07
353 | H3N+C2H4COOH | 3.55 | 3.54 | 0.01
354 | CH3C(O)NHC2H4COOH | 4.45 | 4.22 | 0.23
355 | H3N+C(O)NHC2H4COOH | 4.49 | 4.15 | 0.34
356 | H3N+CH(C2H5)COOH | 2.29 | 2.32 | −0.04
357 | CH3C(O)NHCH(C2H5)COOH | 3.72 | 3.61 | 0.11
358 | H3N+C(O)NHCH(C2H5)COOH | 3.89 | 4.02 | −0.14
359 | H3N+C3H6COOH | 4.03 | 3.98 | 0.05
360 | H3N+C(O)NHC3H6COOH | 4.68 | 4.34 | 0.35
361 | H3N+C(CH3)2COOH | 2.36 | 2.14 | 0.22
362 | H3N+C(O)NHC(CH3)2COOH | 4.46 | 4.17 | 0.29
363 | H3N+CH(C3H7)COOH | 2.32 | 2.25 | 0.07
364 | H3N+CH(C2H5)CH2COOH | 4.02 | 3.83 | 0.19
365 | H3N+C4H8COOH | 4.20 | 3.92 | 0.28
366 | H3N+CH(iso-C3H7)COOH | 2.29 | 2.29 | −0.01
367 | H3N+CH(C4H9)COOH | 2.34 | 2.27 | 0.07
368 | H3N+C5H10COOH | 4.43 | 4.40 | 0.03
369 | H3N+CH(iso-C4H9)COOH | 2.33 | 2.38 | −0.05
370 | H3N+CH(sec-C4H9)COOH | 2.32 | 2.55 | −0.23
371 | H3N+C11H22COOH | 4.65 | 4.84 | −0.19
372 | C6H10(cyclo), 1-N+H3, 1-COOH | 2.66 | 2.54 | 0.11
373 | C6H10(cyclo), 1-N+H3, 2-COOH | 3.59 | 3.41 | 0.18
374 | C5H10(cyclo), 1-N+H3, 3-COOH axial | 3.85 | 4.08 | −0.23
375 | C6H10(cyclo), 1-N+H3, 3-COOH ecvat | 3.70 | 4.10 | −0.40
376 | C6H10(cyclo), 1-N+H3, 4-COOH axial | 4.39 | 4.24 | 0.15
377 | C6H10(cyclo), 1-N+H3, 4-COOH ecvat | 4.83 | 4.22 | 0.60
378 | H3N+C3H6CH(N+H3)COOH | 1.94 | 2.08 | −0.14
379 | H3N+CH(C3H6NHC(═N+H2)NH2)COOH | 1.82 | 1.97 | −0.15
380 | (56) | 1.82 | 1.67 | 0.15
381 | H3N+CH(C3H6NHC(O)NH2)COOH | 2.43 | 2.17 | 0.26
382 | H3N+CH(C4H8N+H3)COOH | 2.18 | 2.12 | 0.06
383 | H3N+CH(CH2COOH)COOH | 1.98 | 1.74 | 0.24
384 | H3N+CH(CH2COO)COOH | 3.96 | 3.55 | 0.41
385 | H3N+CH(C2H4COOH)COOH | 2.10 | 2.12 | −0.02
386 | H3N+CH(C2H4COO)COOH | 4.07 | 4.20 | −0.13
387 | C2H5COOCH(N+H3)C2H4COOH | 3.85 | 3.76 | 0.08
388 | H3N+CH(C2H4COOC2H5)COOH | 2.15 | 2.07 | 0.08
389 | HOOCCH2N+H2CH2COOH | 2.54 | 2.47 | 0.07
390 | HOOCCH2N+H(CH3)CH2COOH | 2.17 | 1.92 | 0.24
391 | C6H5N(CH2COOH)2 | 2.42 | 2.47 | −0.05
392 | C6H5N(CH2COO)CH2COOH | 5.03 | 4.86 | 0.17
393 | N+H(CH2COO)(CH2COOH)CH2COOH | 2.94 | 2.61 | 0.33
394 | HOOCC2H4N+H2CH2COOH | 3.58 | 3.35 | 0.22
395 | HOOCC2H4N+H2C2H4COOH | 4.06 | 3.62 | 0.44
396 | HOOCC2H4N(CH2COOH)C2H4N+H(C2H4COOH)CH2COOH | 2.97 | 2.29 | 0.67
397 | HOOCC2H4N(CH2COO)C2H4N+H(C2H4COOH)CH2COOH | 3.76 | 3.22 | 0.54
398 | HOOCC2H4N(CH2COO)C2H4N+H(CH2COO)C2H4COOH | 5.76 | 4.83 | 0.93
399 | (HOOCC2H4)2NC2H4N+H(C2H4COOH)C2H4COOH | 2.97 | 3.30 | −0.33
400 | HOOCC2H4(OOCC2H4)NC2H4N+H(C2H4COOH)C2H4COOH | 3.40 | 3.23 | 0.16
401 | CH3C(O)COOH | 2.49 | 3.24 | −0.75
402 | CH3C(O)CH2COOH | 3.63 | 4.15 | −0.52
403 | CH3C(O)C2H4COOH | 4.71 | 4.46 | 0.25
404 | CH3C(O)CH2C(O)COOH | 2.59 | 3.04 | −0.45
405 | CH3C(O)C3H6COOH | 4.76 | 4.47 | 0.29
406 | HOOCCH2C(O)COOH | 2.55 | 2.89 | −0.34
407 | OOCC(O)CH2COOH | 4.37 | 4.37 | 0.00
408 | C6H4(2-F)OCH2COOH | 3.09 | 3.28 | −0.19
409 | C6H4(3-F)OCH2COOH | 3.08 | 3.28 | −0.20
410 | C6H4(4-F)OCH2COOH | 3.13 | 3.29 | −0.16
411 | C6H4(2-Cl)OCH2COOH | 3.05 | 2.96 | 0.09
412 | C6H4(3-Cl)OCH2COOH | 3.07 | 3.11 | −0.04
413 | C6H4(4-Cl)OCH2COOH | 3.10 | 3.16 | −0.06
414 | C6H3(2-CH3, 4-Cl)OCH2COOH | 3.28 | 3.29 | 0.00
415 | C6H2(2-CH3, 4-Cl, 6-Cl)OCH2COOH | 3.13 | 2.83 | 0.30
416 | C6H3(2-Cl, 4-Cl)OCH2COOH | 3.18 | 3.41 | −0.23
417 | C6H4(2-Br)OCH2COOH | 3.12 | 2.84 | 0.29
418 | C6H4(3-Br)OCH2COOH | 3.10 | 3.09 | 0.01
419 | C6H4(4-Br)OCH2COOH | 3.13 | 3.14 | −0.01
420 | C6H4(2-I)OCH2COOH | 3.17 | 2.84 | 0.34
421 | C6H4(3-I)OCH2COOH | 3.13 | 3.17 | −0.04
422 | C6H4(4-I)OCH2COOH | 3.16 | 3.22 | −0.06
423 | C6H4(2-CN)OCH2COOH | 2.97 | 3.28 | −0.31
424 | C6H4(3-CN)OCH2COOH | 3.03 | 2.92 | 0.12
425 | C6H4(4-CN)OCH2COOH | 2.93 | 3.04 | −0.11
426 | C6H4(2-NO2)OCH2COOH | 2.90 | 2.82 | 0.07
427 | C6H4(3-NO2)OCH2COOH | 2.95 | 3.14 | −0.19
428 | C6H4(4-NO2)OCH2COOH | 2.89 | 3.20 | −0.30
429 | C6H3(3-NO2, 4-Cl)OCH2COOH | 2.96 | 2.95 | 0.01
430 | CH3CH(OH)C(O)OCH(CH3)COOH | 2.98 | 3.04 | −0.07
431 | ClCH2CH(OH)COOH | 3.12 | 3.00 | 0.12
432 | CH3CH(OH)CH(Cl)COOH | 2.59 | 2.80 | −0.21
433 | CH3CH(Cl)CH(OH)COOH | 3.08 | 3.10 | −0.02
434 | ClCH2C(CH3)(OH)COOH | 3.20 | 3.55 | −0.35
435 | HOOCCH(Cl)CH(OH)COOH | 2.32 | 2.00 | 0.32
436 | CH3C(O)OC(CH2COOH)2COOH | 2.49 | 2.23 | 0.26
437 | C6H5CH(OH)CH(Cl)COOH | 2.61 | 2.41 | 0.20
438 | (57) | 1.95 | 3.28 | −1.33
439 | (58) | 3.29 | 4.14 | −0.84
440 | (59) | 1.97 | 2.19 | −0.22
441 | (60) | 3.99 | 4.44 | −0.45
442 | H2NC(O)CH2COOH | 3.64 | 3.88 | −0.24
443 | H2NC(O)C2H4COOH | 4.54 | 4.16 | 0.38
444 | H2NC(O)C3H6COOH | 4.60 | 4.38 | 0.22
445 | H2NC(O)C4H8COOH | 4.63 | 4.43 | 0.20
446 | C6H11(cyclo)SCH2COOH | 3.49 | 3.92 | −0.43
447 | HOOCCH2SCH2COOH | 3.35 | 3.41 | −0.06
448 | OOCCH2SCH2COOH | 4.57 | 4.47 | 0.10
449 | HOOCCH2SSCH2COOH | 3.12 | 3.14 | −0.01
450 | OOCCH2SSCH2COOH | 4.27 | 4.00 | 0.27
451 | HOOCCH2SCH2SCH2COOH | 3.36 | 3.48 | −0.12
452 | OOCCH2SCH2SCH2COOH | 4.41 | 4.09 | 0.33
453 | HOOCCH2SC2H4SCH2COOH | 3.43 | 3.48 | −0.05
454 | OOCCH2SC2H4SCH2COOH | 4.42 | 4.06 | 0.36
455 | HOOCCH2SC3H6SCH2COOH | 3.48 | 3.75 | −0.26
456 | OOCCH2SC3H6SCH2COOH | 4.45 | 4.27 | 0.19
457 | HOOCCH2SC4H8SCH2COOH | 3.51 | 3.62 | −0.11
458 | OOCCH2SC4H8SCH2COOH | 4.49 | 4.24 | 0.26
459 | HOOCCH2SC5H10SCH2COOH | 3.53 | 3.89 | −0.36
460 | OOCCH2SC5H10SCH2COOH | 4.48 | 4.20 | 0.29
461 | HOOCCH(CH3)SCH2COOH | 4.61 | 3.96 | 0.65
462 | CH3SCH(CH3)COOH | 3.76 | 3.98 | −0.22
463 | C2H5SCH(CH3)COOH | 3.80 | 4.06 | −0.27
464 | C3H7SCH(CH3)COOH | 3.82 | 4.12 | −0.30
465 | Iso-C3H7SCH(CH3)COOH | 3.78 | 4.18 | −0.40
466 | OOCCH(CH3)SCH(CH3)COOH rac | 4.69 | 4.58 | 0.11
467 | OOCCH(CH3)SCH(CH3)COOH meso | 4.64 | 4.40 | 0.24
468 | HOOCCH(CH3)SSCH(CH3)COOH rac | 3.15 | 3.94 | −0.79
469 | HOOCCH(CH3)SSCH(CH3)COOH meso | 3.14 | 3.78 | −0.64
470 | HOOCCH(CH3)SCH2SCH(CH3)COOH | 3.38 | 3.62 | −0.25
471 | HOOCC2H4SC2H4COOH | 4.09 | 3.98 | 0.11
472 | OOCC2H4SC2H4COOH | 5.08 | 4.56 | 0.51
473 | OOCCH(C2H5)SCH(C2H5)COOH rac | 4.67 | 4.68 | 0.00
474 | OOCCH(C2H5)SCH(C2H5)COOH meso | 4.66 | 4.67 | −0.01
475 | HOOCC3H6SC3H6COOH | 4.42 | 4.44 | −0.02
476 | OOCC3H6SC3H6COOH | 5.33 | 4.63 | 0.70
477 | OOCCH(isoC3H7)SCH(isoC3H7)COOH rac | 4.87 | 4.96 | −0.09
478 | OOCCH(isoC3H7)SCH(isoC3H7)COOH 
meso4.924.840.08479H3N+C2H4SC9H18COOH4.004.81−0.81480HOOCC10H20NHC2H4SSC2H4N+H2C10H20COOH3.204.65−1.45481CH3SO2CH(CH3)COOH2.442.380.06482C2H5SO2CH(CH3)COOH2.492.440.04483C3H7SO2CH(CH3)COOH2.512.480.02484Iso-C3H7SO2CH(CH3)COOH2.522.68−0.16485C6H10(cyclo)SeCH2COOH3.193.190.00486F3CC3H6CN+H3)COOH2.162.26−0.10487F3CCH(CH3)CH2CH(N+H3)COOH2.052.06−0.01488F3CC2H4CH(N+H3)COOH2.041.980.06489F3CCH(CH3)CH(N+H3)COOH1.541.90−0.36490F3CCH2CH(N+H3)COOH1.601.82−0.22491F3CCH(OH)CH(N+H3)COOH1.551.310.24492F3CCH(NH2)CH2COOH2.762.91−0.16493H3N+CH(CH2OH)COOH2.212.38−0.17494(61)1.921.820.10495HOOCCH2CH(OH)CH(N+H3)COOH2.321.900.43496OOCCH(N+H3)CH(OH)CH2COOH4.244.000.23497H2+N═C(NH2)NH—O—C2H4CH(N+H3)COOH2.502.420.08498H3+NOC2H4CH(N+H3)COOH2.402.200.20499HOOCCH(N+H3)CH2SSCH2CH(N+H3)COOH1.001.24−0.24500OOCCH(N+H3)CH2SSCH2CH(N+H3)COOH2.102.100.00501C2H5SCH2CH(N+H3)COOH2.031.940.09502(CH3)3SiCH2COOH5.225.070.15503(CH3)3SiC2H4COOH4.915.14−0.23504(CH3)3SiC3H6COOH4.895.04−0.15505(CH3)3SiC4H8COOH4.965.19−0.22506(CH3)3SiC5H10COOH5.065.34−0.28507(CH3)3SiOSi(CH3)2CH2COOH5.225.120.10508C6H5Si(CH3)2CH2COOH5.275.170.10509C6H5CH2COOH4.314.310.00510C6H4(2-CH3)CH2COOH4.424.62−0.20511C6H4(4-CH3)CH2COOH4.374.350.01512C6H4(4-C2H5)CH2COOH4.374.39−0.02513C6H4(4-iso-C3H7)CH2COOH4.394.350.04514C6H4(4-t-C4H9)CH2COOH4.424.48−0.06515(C6H5)2CHCOOH3.944.23−0.29516(C6H5)3CCOOH3.964.29−0.33517Naphtyl-1-CH2COOH4.244.30−0.06518Naphtyl-2-CH2COOH4.264.200.06519C6H5C2H4COOH4.664.470.19520C6H4(2-CH3)C2H4COOH4.664.520.14521C6H4(3-CH3)C2H4COOH4.684.500.18522C6H4(4-CH3)C2H4COOH4.684.500.19523C6H5C3H6COOH4.764.550.21524C6H5CH═CHCOOH trans4.444.300.14525C6H5CH═CHCOOH cis3.884.29−0.41526C6H5(2-CH3)CH═CHCOOH trans4.504.410.09527C6H5(3-CH3)CH═CHCOOH trans4.444.340.10528C6H5(4-CH3)CH═CHCOOH 
trans4.564.330.24529C6H5CH(COOH)COOH2.582.480.10530C6H5CH(COO)COOH5.035.15−0.12531C6H5CH(CH2COOH)COOH3.783.560.22532C6H5CH(COO)CH2COOH5.555.79−0.23533C6H5CH2CH(CH2COOH)COOH4.163.970.19534C6H5CH2CH(COO)CH2COOH5.716.00−0.29535(C6H5)2C(CH2COOH)COOH3.093.36−0.27536HOOCCH(C6H5)CH(C6H5)COOH rac3.583.87−0.30537HOOCCH(C6H5)CH(C6H5)COOH meso3.483.99−0.51538HOOCCH2(C6H5CH2)C(C6H5)COOH3.744.20−0.47539C6H5C(COO)(CH2C6H5)CH2COOH6.586.95−0.37540HOOCCH2(C6H5CH2)2CCOOH4.014.39−0.38541OOOC)C(CH2C6H5)2CH2COOH6.746.83−0.09542HOOCCH2(C6H5C2H4)C(C6H5)COOH3.794.19−0.40543C6H5C(COO)(C2H4C6H5)CH2COOH6.616.130.48544HOOCC2H4C(C6H5)2COOH3.963.480.47545(C6H5)2C(COCr)C2H4COOH6.465.970.49546HOOCC3H6C(C6H5)2COOH4.224.29−0.08547(C6H5)2C(COO)C3H6COOH5.474.980.49548HOOCCH2CH(C6H5)CH(C6H5)CH2COOH4.224.46−0.24549OOCCH2CH(C6H5)CH(C6H5)CH2COOH5.195.030.16550HOOCC4H8C(C6H5)2COOH4.334.37−0.04551OOCC4H8C(C6H5)2COOH5.464.910.54552HOOCC5H10C(C6H5)2COOH4.304.300.00553OOCC5H10C(C6H5)2COOH5.394.930.46554HOOCC6H12C(C5H5)2COOH4.334.37−0.05555OOCC6H12C(C5H5)2COOH5.404.960.44556C6H5CH(Br)COOH2.212.58−0.37557C6H4(4-F)CH2COOH4.254.220.03558C6H4(2-Cl)CH2COOH4.073.840.22559C6H4(3-Cl)CH2COOH4.144.030.11560C6H4(4-Cl)CH2COOH4.194.020.17561C6H4(2-Br)CH2COOH4.053.800.25562C6H4(4-Br)CH2COOH4.193.990.20563C6H4(2-I)CH2COOH4.044.000.04564C6H4(3-I)CH2COOH4.164.120.04565C6H4(4-I)CH2COOH4.184.120.06566C6H4(2-Cl)C2H4COOH4.584.090.49567C6H4(3-Cl)C2H4COOH4.594.300.29568C6H4(4-Cl)C2H4COOH4.614.310.30569C6H5(2-Cl)—CH═CH—COOH trans4.233.900.33570C6H5(3-Cl)—CH═CH—COOH trans4.294.060.24571C6H5(4-Cl)—CH—CH—COOH trans4.414.120.29572C6H5(2-Br)—CH═CH—COOH trans4.413.870.54573C6H5CH2C(CH3)(CN)COOH2.292.32−0.03574C6H5CH(OH)COOH3.413.78−0.37575C6H5C(CH3)(OH)COOH3.604.02−0.42576C6H5CH(OH)CH2COOH4.474.130.34577(C6H5)2C(OH)COOH3.103.41−0.31578C5H4(2-OH)—CH═CH—COOH trans4.614.070.54579C6H4(3-OH)—CH═CH—COOH trans4.404.190.20580C6H4(2-NO2)CH2COOH4.003.680.32581C6H4(3-NO2)CH2COOH3.973.930.04582C6H4(4-NO2)CH2COOH3.854.11−0.26583C6H3(2-NO2, 
4-NO2)CH2COOH3.503.200.30584C6H4(2-NO2)C2H4COOH4.504.150.35585C6H4(4-NO2)C2H4COOH4.474.420.05586C6H4(2-NO2)CH═CHCOOH trans4.153.850.30587C6H4(3-NO2)CH═CHCOOH trans4.124.080.04588C6H4(4-NO2)CH═CHCOOH trans4.054.15−0.11589C6H5CH2CH(N+H3)COOH2.162.160.00590(62)2.382.260.11591C6H4(4-OCH3)CH2COOH4.364.240.12592C6H3(2-OCH3, 3-OCH3)CH2COOH4.334.000.34593C6H4(2-OCH3)C2H4COOH4.804.430.37594C6H4(3-OCH3)C2H4COOH4.654.420.23595C6H4(4-OCH3)C2H4COOH4.694.420.27596C6H4(2-OCH3)CH═CHCOOH trans4.464.120.35597C6H4(3-OCH3)CH═CHCOOH trans4.384.180.19598C6H4(4-OCH3)CH═CHCOOH trans4.544.180.36599C6H5CH2SC2H4COOH4.534.240.30600C6H5C2H4SCH2COOH3.863.820.05601C6H4(3-F)CH(OH)COOH4.243.820.42602C6H4(3-Cl)CH(OH)COOH4.243.620.62603C6H4(3-Br)CH(OH)COOH4.233.510.72604C6H4(3-I)CH(OH)COOH4.263.610.65605C6H4(2-F)CH2CH(N+H3)COOH2.131.960.17606C6H4(3-F)CH2CH(N+H3)COOH2.102.21−0.11607C6H4(4-F)CH2CH(N+H3)COOH2.132.130.00608C6H4(2-Cl)CH2CH(N+H3)COOH2.231.780.45609C6H4(3-Cl)CH2CH(N+H3)COOH2.172.050.12610C6H4(4-Cl)CH2CH(N+H3)COOH2.082.010.07611(63)2.202.080.12612C6H3(3-OH, 4-OH)CH2CH(N+H3)COOH2.321.960.36613C6H2(3-I, 4-I, 
5-I)CH2CH(N+H3)COOH2.121.810.316141-Naphtyl-C(O)C2H4COOH4.484.220.266152-Naphtyl-C(O)C2H4COOH4.964.210.75616C6H4(4-NHSO2OH)CH2CH(N+H3)COOH1.991.820.17617H3CO(O)CC(C6H5)2CH2COOH4.524.030.48618H3CO(O)CC(C6H5)2COOH3.953.650.30619H3CO(O)CC(C6H5)2C2H4COOH4.714.320.39620H3CO(O)CC2H4C(C6H5)2COOH4.054.21−0.16621H3CO(O)CC(C6H5)2C3H6COOH4.894.470.42622H3CO(O)CC3H6C(C6H5)2COOH4.314.310.00623H3CO(O)CC(C6H5)2C4H8COOH5.044.560.48624H3CO(O)CC4H8C(C6H5)2COOH4.454.430.03625H3CO(O)CC(C6H5)2C5H10COOH5.154.500.65626H3CO(O)CC5H10C(C6H5)2COOH4.554.480.07627C6H5CH(COOCH3)CH2COOH4.414.120.28628C6H5CH(CH2COOCH3)COOH4.103.850.25629H3COC(O)C(C6H5)(CH2C6H5)CH2COOH4.514.160.34630H3COC(O)CH2C(C6H5)(CH2C6H5)COOH4.114.18−0.07631H3COC(O)C(CH2C6H5)2CH2COOH4.714.340.37632H3COC(O)CH2C(CH2C6H5)2COOH4.534.56−0.03633H3N+CH(CH3)C(O)NHCH(CH3)COOH3.303.080.22LL634H3N+CH(CH3)C(O)NHCH(CH3)COOH3.123.100.02LD635H3N+CH(CH3)C(O)NHCH(CH3)C(O)NHCH(CH3)CO3.393.40−0.01OH LLL636H3N+CH(CH3)C(O)NHCH(CH3)C(O)NHCH(CH3)CO3.373.140.23OH LLD637H3N+CH(CH3)C(O)NHCH(CH3)C(O)NHCH(CH3)CO3.313.310.00OH LDL638H3N+CH(CH3)C(O)NHCH(CH3)C(O)NHCH(CH3)CO3.373.43−0.06OH DLL639H3N+CH(CH3)C(O)NHCH(CH3)C(O)NHCH(CH3)CO3.393.280.11OH DDD640H3N+CH(CH3)C(O)[NHCH(CH3)C(O)]2NHCH(CH3)C3.423.45−0.03OOH LLLL641H3N+CH(CH3)C(O)[NHCH(CH3)C(O)]2NHCH(CH3)C3.243.31−0.07OOH LLDL642H3N+CH(CH3)C(O)[NHCH(CH3)C(O)]2NHCH(CH3)C3.223.48−0.26OOH LDLL643H3N+CH(CH3)C(O)[NHCH(CH3)C(O)]2NHCH(CH3)C3.423.56−0.14OOH DLLL644H3N+CH(CH3)C(O)NHCH(C4H8N+H3)C(O)NHCH3.153.32−0.17(CH3)COOH DDD645H3N+CH(CH3)C(O)NHCH(C4H8N+H3)C(O)NHCH3.333.000.33(CH3)COOH LDL646H3N+CH(CH3)C(O)NHCH(C4H8N+H3)C(O)NHCH3.292.990.30(CH3)COOH LLD647H3N+CH(CH3)CONHCH(C4H8N+H3)CONHCH(CH3)3.583.390.19CONHCH(CH3)COOH LLLL648H3N+CH(CH3)CONHCH(C4H8N+H3)CONHCH(CH3)3.323.37−0.05CONHCH(CH3)COOH LDLL649H3N+CH(CH3)CONHCH(C4H8N+H3)CO[NHCH(CH3)3.533.180.35CO]2NHCH(CH3)COOH LLLLL650H3N+CH(CH3)CONHCH(C4H8N+H3)CO[NHCH(CH3)3.303.57−0.27CO]2NHCH(CH3)COOH 
LDLLL651H3N+CH2C(O)NHCH(CH3)COOH3.153.020.13652H3N+CH2C(O)NHCH(CH3)C(O)NHCH(CH3)COOH3.383.000.38LL653H3N+CH2C(O)NHCH(CH3)C(O)NHCH(CH3)COOH3.303.240.06LD654H3N+CH(C4H8N+H3)C(O)NHCH(CH3)C(O)NHCH3.222.970.25(CH3)COOH LL655H3N+CH(C4H8N+H3)C(O)NHCH(CH3)COOH3.002.970.03LD656H3N+CH(CH2OCH3)COOH2.042.020.02657H3N+CH(CH(OH)CH3)COOH2.112.24−0.13658H3N+CH(CH(OCH3)CH3)COOH1.921.890.03659C2H5CH(NH2)COOH2.292.37−0.09660H3N+CH(C2H5)C(O)HNCH(C2H5)COOH3.073.040.03661H3N+CH2C(O)HNCH(C2H5)COOH3.153.58−0.42662HON+H2CH(C2H5)COOH2.712.700.01663F3CCH(N+H3)CH2COOH2.762.630.13664H3N+CH2C(O)C2H4COOH4.053.930.12665H3CCH(N+H3)C2H4COOH3.974.00−0.03666C6H4(4-NH2)CH2COOH3.604.25−0.65667H3N+CH(CH2C6H5)C(O)NHCH(C3H6NHC(NH2)═2.602.92−0.32N+H2)COOH LL668H3N+CH(CH2C6H4(4-2.632.68−0.05OH))C(O)NHCH(C3H6NHC(NH2)═N+H2)COOH LL669H3N+CH2C(O)NHCH(CH2C(O)NH2)COOH2.942.710.23670H2NC(O)CH2C(OH)(N+H3)COOH2.282.200.08671H2NC(O)CH(OH)CH(N+H3)COOH2.091.820.27672H3N+CH(iso-C4H9)C(O)NHCH(CH2C(O)NH2)COOH3.033.37−0.34LL673HOOCCH(CH2COOH)NHC(O)CH(N+H3)CH2COOH2.702.99−0.29674H3N+CH(CH2COO—)C(O)NHCH(CH2COOH)COOH3.402.960.44675H3N+CH(CH2COO—)C(O)NHCH(COO—)CH2COOH4.704.86−0.16676H3N+CH2C(O)NHCH(CH2COOH)COOH2.812.660.15677H3N+CH2C(O)NHCH(COO—)CH2COOH4.454.260.19678H3N+CH(CH2C(O)NH2)COOH1.981.670.31679HON+HCH(CH2COOH)COOH1.911.690.22680HON+HCH(COO)CH2COOH3.513.81−0.30681(CH3)3N+CH2COOH1.832.52−0.69682H2NC(O)NHOC3H6CH(N+H3)COOH2.432.120.31683H2NC(═NH)N(CH3)CH2COOH2.633.20−0.57684H3N+CH(CH2SH)COOH1.861.92−0.06685H3N+CH(CH2SH)C(O)NHCH(CH2SH)COOH2.652.410.24686HOOCCH(N+H3)CH2SSCH2CH(N+H3)CONHCH(CO1.871.91−0.04OH)CH2SSCH2CH(N+H3)COOH687HOOCCH(N+H3)CH2SSCH2CH(N+H3)CONHCH(CH22.942.260.68SSCH2C(N+H3)COO)COOH688H3N+CH2CONHCH2CONHCH(COOH)CH2SSCH2CH2.712.610.10(N+H3)COOH689H3N+CH2CONHCH2CONHCH(CH2SSCH2CH(N+H3)2.712.640.07COO)COOH690H3N+C2H4CH(N+H3)COOH1.871.90−0.03691H3N+CH2CH(N+H3)COOH1.330.880.45692HOOCCH(N+H3)C4H8CH(N+H3)COOH1.862.07−0.21693OOCCH(N+H3)C4H8CH(N+H3)COOH2.682.550.13694H3N+CH(C2H4COOH)COOH2.132.18−0.05695H3N+CH(COOH)C2H4
COOH4.324.37−0.05696H3N+CH(C2H4COOH)C(O)NHCH(C2H4COOH)COOH3.143.31−0.17DD697H3N+CH(C(O)NHCH(C2H4COOH)COO)C2H4COOH4.384.52−0.14DD698HOOCCH2CH(OH)CH(N+H3)COOH2.271.760.51699OOCCH(N+H3)CH(OH)CH2COOH4.294.030.26700H3N+CH(C4H8N+H3)C(O)NHCH(C2H4COOH)COOH2.933.24−0.31701H3N+CH(C4H8N+H3)C(O)NHCH(COOH)C2H4COOH4.474.89−0.42702C6H5OC(O)C2H4CH(N+H3)COOH2.172.24−0.07703C2H5OC(O)CH(N+H3)C2H4COOH3.853.740.11704C2H5OC(O)C2H4CH(N+H3)COOH2.152.28−0.13705H3N+CH(C2H4CONH2)COOH2.172.19−0.02706H3N+CH2C(O)NHCH(C2H4CONH2)COOH2.933.28−0.35707H3N+CH(isoC4H9)C(O)NHCH(C2H4COOH)COOH2.993.41−0.42LL708HOOCCH2NHC(O)CH(CH2SH)NHC(O)C2H4NHCH2.121.930.19(N+H3)COOH709H3CHCCOO)3.593.61−0.02C2H4C(O)NHCH(CH2SH)C(O)NHCH2COOH710H3N+CH(COOH)C2H4CONHCH(CONHCH2COOH)C2.022.53−0.51H2SSCH2CH(CONHCH2COOH)NHCOC2H4CH(N+H3)COOH711H3N+CH(COO)C2H4CONHCH(CONHCH2COOH)2.622.470.15CH2SSCH2CH(CONHCH2COOH)NHCOC2H4CH(N+H3)COOH712H3N+CH(COO)C2H4CONHCH(CONHCH2COOH)3.323.270.05CH2SSCH2CH(NHCOC2H4CH(N+H3)COO)CONHCH2COOH713H3N+CH(COO)C2H4CONHCH(CONHCH2COO)CH24.024.000.03SSCH2CH(NHCOC2H4CH(N+H3)COO)CONHCH2COOH714CH3CONHCH2COOH3.693.84−0.16715H3N+CH(CH3)CONHCH2COOH.3.173.39−0.22716H3N+CH(CH3)CONHCH2CONHCH2COOH3.232.880.34717H3N+CH(CH2CONH2)CONHCH2COOH L2.952.880.06718H3N+CH(CH2COOH)CONHCH2COOH L2.102.47−0.37719H3N+CH(CONHCH2COO)CH2COOH 
L4.534.58−0.05720(HOC2H4)2NCH2COOH2.482.050.43721H3N+CH(CH2SH)C(O)NHCH2C(O)NHCH2COOH3.052.840.21722H3N+CH(CH2SH)C(O)[NHCH2C(O)]3NHCH2COOH3.142.590.55723(C2H5)2N+HCH2COOH2.042.50−0.46724(CH3)2N+HCH2COOH2.082.29−0.21725(CH3)2N+HCH2CONHCH2COOH3.112.800.31726C2H5N+H2CH2COOH2.302.200.10727H3N+CH(C2H4CONH2)C(O)NHCH2COOH3.152.950.20728H3N+CH2CONHCH2COOH3.143.030.11729H3N+CH2CO[NHCH(CH3)CO]2NHCH2COOH3.303.42−0.12730H3N+CH2CONHCH2CONHCH2COOH3.233.41−0.19731H3N+CH2CO[NHCH2CO]2NHCH2COOH3.113.38−0.27732H3N+CH2CONHCH(CH2OH)CONHCH2COOH3.233.100.13733H3N+CH2CO[NHCH2CO]5NHCH2COOH2.943.59−0.65734(64)2.432.66−0.23735Iso-C3H7N+H2CH2COOH2.362.45−0.09736H3N+CH(iso-C4H9)CONHCH2COOH3.253.67−0.42737H3N+CH(iso-C4H9)CONHCH2CONHCH2COOH3.283.54−0.26738H3CN+H2CH(iso-C4H9)CONHCH2COOH3.293.32−0.03739H3N+CH2CO[NHCH2CO]4NHCH2COOH3.173.48−0.32740H3N+CH(CH2C6H5)CONHCH2COOH3.132.650.48741(65)3.193.050.14742H3CN+H2CH2CONHCH2COOH3.143.000.14743H3N+CH(CH2OH)CONHCH2COOH3.102.980.12744H3N+CH2CO[NHCH2CO]3NHCH2COOH3.143.47−0.33745H3N+H2CH(iso-C3H7)CONHCH2COOH L3.233.34−0.11746H2NC(═N+H2)NHCH2COOH2.822.600.22747(66)2.642.91−0.27748(67)2.642.96−0.32749(68)2.402.330.07750(69)2.932.830.10751(70)1.931.720.21752(71)2.953.11−0.16753(72)2.722.080.64754(73)2.252.040.21755(74),1.651.610.04756(75)1.841.650.19757(76)2.001.870.13758(77)1.731.83−0.11759HSC2H4CH(NH2)COOH2.222.24−0.02760HOOCCH(N+H3)C2H4SSC2H4CH(N+H3)COOH1.592.09−0.50761OOCCH(N+H3)C2H4SSC2H4CH(N+H3)COOH2.542.74−0.21762HOOCCH2NHN+H2CH2COOH2.421.960.46763OOCCH2N+H2NHCH2COOH3.162.580.57764(78)2.962.860.10765(79)4.254.26−0.01766H2NCOCH(N+H3)CH2COOH2.973.10−0.13767H2NC(═N+H2)CH2N(CH3)COOH2.843.04−0.20768H2NCOCH(N+H3)C2H4COOH3.813.520.29769H3N+CH(sec-C4H9)COOH2.322.54−0.22770H3N+CH(OH)COOH2.722.75−0.03771H3N+CH(iso-C4H9)C(O)NHCH(OH)COOH3.223.030.19772H3N+CH(iso-C4H9)COOH L2.332.54−0.21773H3N+CH2C(O)NHCH(iso-C4H9)COOH L3.183.42−0.24774H3CN+H2CH2C(O)NHCH(iso-C4H9)COOH 
L3.153.66−0.51775H3N+CH(CH2OH)CONHCH(iso-C4H9)COOH3.083.72−0.64LL776H3N+CH(CH2CH(CH3)CF3)COOH2.052.30−0.26777H3N+CH(C4H8N+H3)COOH L2.162.40−0.24778HON+H2CH(C4H8N+H3)COOH2.082.12−0.04779H3N+CH(C4H8N+H3)CONHCH(C4H8N+H3)COOH3.012.850.16LL780H3N+CH(C4H8N+H3)CONHCH(C4H8N+H3)COOH2.852.800.05LD781H3N+CH(C4H8N+H3)CONHCH(C4H8N+H3)CONHCH3.083.39−0.31(C4H8N+H3)COOH LLL782H3N+CH(C4H8N+H3)CONHCH(C4H8N+H3)CONHCH2.912.570.34(C4H8N+H3)COOH LDL783H3N+CH(C4H8N+H3)CONHCH(C4H8N+H3)CONHCH2.943.03−0.09(C4H8N+H3)COOH LDD784H3N+CH(C2H4SCH3)COOH2.172.090.08785H3N+CH(C2H4SCH3)CONHCH(C2H4SCH3)COOH3.203.080.12LL786H3N+C3H6CH(N+H3)COOH1.712.08−0.37787H3N+CH(CH2C6H3(3-OH, 4-OH))COOH2.322.170.15788H3N+CH(CH2C6H3(2-F, 4-OCH3))COOH2.122.21−0.09789H3N+CH2C(O)NHCH(CH2C6H5)COOH3.123.010.11790H3N+CH(C6H5)COOH1.832.17−0.34791H3N+CH(C6H4(3-C(O)CH3))COOH1.141.93−0.79792H3N+CH(C6H4(3-Cl))COOH1.051.66−0.61793H3N+CH(C6H4(4-Cl))COOH1.461.62−0.16794H3N+CH(C6H4(3-CN))COOH0.281.33−1.05795H3N+CH(C6H4(3-OCH3))COOH1.682.08−0.40796H3N+CH(C6H4(4-OCH3))COOH2.081.790.29797H3N+CH(C6H4(3-CH3))COOH1.892.02−0.13798H3N+CH(C6H4(4-CH3))COOH1.971.940.03799H3N+CH(C6H4(3-NO2))COOH0.061.54−1.48800(80)1.951.97−0.01801(81)3.042.810.23802(82)2.812.98−0.17803(83)1.822.20−0.38804H3CN+H2CH2COOH2.212.24−0.03805H3N+CH2C(O)N(CH3)CH2COOH2.983.34−0.36806H3CN+H2CH2CON(CH3)CH2COOH2.892.750.14807H3N+CH(CH2OH)COOH2.191.990.20808H3N+CH2C(O)NH(CH2OH)COOH2.982.99−0.01809H3N+CH(CH(CH3)OH)COOH2.092.25−0.16810H3N+CH(CH(CH3)OCH3)COOH2.022.37−0.35811H3N+CH(CH(CF3)OCH3)COOH1.551.510.04812H3N+CH(CH2C6H4(4-OH))COOH2.202.27−0.07813H3N+CH(CH2COOH)C(0)NHCH(CH2C6H4(4-2.131.820.31OH))COOH814H3N+CH(C(O)NHCH(CH2C6H4(4-OH))3.573.490.08COO)CH2COOH815H3N+CH(CH2C6H2(4-OH, 3-Br, 5-Br))COOH2.172.20−0.03816H3N+CH(CH2C6H2(4-OH, 3-Cl, 5-Cl))COOH2.122.25−0.13817H3N+CH(CH2C6H2(4-OH, 3-I, 5-I))COOH2.122.48−0.36818H3N+CH2C(O)NHCH(CH2C6H4(4-OH))COOH2.982.950.03819H3N+CH(iso-C4H9)C(0)NHCH(CH2C6H4(4-2.873.23−0.36OH))COOH 
DL820H3N+CH(CH2C6H4(4-OCH3))COOH2.212.200.01821H3N+CH(CH2C6H4(4-OH))CONHCH(CH2C6H4(4-3.523.300.22OH))COOH822H3N+CH(iso-C3H7)COOH2.292.45−0.16823H3N+CH2C(O)NHCH(iso-C3H7)COOH3.153.18−0.03824HON+H2CH(iso-C3H7)COOH2.552.330.22825H3N+CH(C(CH3)2SH)COOH2.002.15−0.15826H3N+CH(CH(CH3)CF3)COOH1.541.69−0.15The numbers in the table correspond to the structures presented in scheme 1.


[0106] [Structure drawings (images 2 through 9) not reproduced in this text version.]

[0107] The estimated results demonstrate that the suggested approach allows accurate quantitative interpretation of the dissociation constants of a wide range of carboxylic acids. The values of the estimated atomic operational contributions in the equation above can be used for accurate prediction of unknown pK values of molecules constituted from the variety of atom types presented in Table 6, shown below.
9TABLE 6Operational atomic constants δiaest and δibest, estimated from pKparameters of carboxylic acids and protonated amines respectively, the correspondingvalues, predicted by correlations and parameters of atomic “inductive” electronegativitiesand radii, used in these correlations.δiaδiaδibδibχR(A0)est+/−calcest+/−calcH2.100.300.950.170.150.760.060.22C42.100.770.480.240.990.080.041.48C32.250.670.560.20−0.23−2.540.27−1.05C22.650.60−5.071.25−4.88−8.660.45−11.26C ar2.450.67−0.450.11−1.56−2.460.11−4.01N32.560.70−3.340.33−2.45−5.150.26−6.03N16.760.55−18.242.55−19.95−42.001.34−44.56O23.050.66−5.610.25−5.28−9.540.24−12.22F3.930.64−2.880.28−8.320.46Cl3.090.99−12.590.55−12.44−23.770.31−28.75Br2.961.14−14.600.83−14.05−36.590.64−32.70I2.801.33−8.901.88−16.524.67S22.691.04−6.190.50−7.45−14.854.30−17.82Si2.061.112.861.492.771.360.844.65N+4.330.70−20.330.42−15.04−41.290.72−33.91O−1.850.7028.610.609.440.465.19N22.053.47N2+−16.712.02−13.28−30.6716.33O14.600.62−6.250.32−13.30−9.820.71S6−3.641.36Se2.541.17−16.303.69Nitro−9.022.04χ is electronegativity of the substituent part.



Example 3


Quantitative Assessment of pKa Values of Amines

[0108] It is a matter of common knowledge that the basicity of amines can be interpreted in terms of polar substituent constants. Numerous authors have proposed different linear free energy relationship (LFER) equations describing limited series of basicity data for primary, secondary and tertiary amines (Perrin, D. D.; Dempsey, B.; Serjeant, E. P. pKa Prediction for Organic Acids and Bases. Chapman & Hall: London, New York, 1981).


[0109] We have not separated the experimental data into several reaction series and have considered the pK values of 802 different amines in which the ionizing nitrogen was not engaged in conjugation interactions. The structures of the organic amines have been optimized with the MM+ routine of the HyperChem software package, which allows simple estimation of standard geometries in the gas phase.


[0110] We have assumed the ionizable nitrogen to be the reaction center and composed an [802×19] R-matrix for the 802 compounds, which contain 19 types of substituent atoms. The following atomic types have been specified: H, C sp3, C sp2, C sp, C aromatic, N sp3, N sp2, N sp (CN group), O sp2, O sp3, F, Cl, Br, I, S, Si, N+, O−, and N+ sp2. Ionized carboxylic groups have been considered as having a full negative charge on one of the oxygens, while the other is in the O sp2 configuration. The columns of the [802×19] R-matrix have been taken as the sets of independent variables. The pK values have been extrapolated to 25 °C and zero ionic strength (Perrin, D. D.; Dempsey, B.; Serjeant, E. P. pKa Prediction for Organic Acids and Bases. Chapman & Hall: London, New York, 1981; Perrin, D. D. Dissociation Constants of Organic Bases in Aqueous Solution; Butterworths: London, 1965). When experimental details were insufficient, the corresponding pK values have been accepted as given (which in some cases might lead to uncertainties of up to 0.1 pK units). The corrected pKa parameters have then been taken as the dependent variable of the polynomial equation:
$$pK_{R_3N} = \sum_{i}^{N-1} \frac{\delta_i^b}{r_i^2} + \mathrm{const}$$


[0111] where N is the number of atoms in the amine, r_i is the distance from atom i to the reaction center, and δ_i^b is an introduced atomic operational parameter reflecting the ability of atoms of a given type to contribute to the amine's pKa.
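The construction of one row of the R-matrix can be illustrated with a minimal sketch. It assumes that each matrix element is the sum of 1/r² over all atoms of one type, with r measured from the reaction center; this is an inference from the equation above, not a procedure stated in the text. The function name `r_matrix_row`, the toy coordinates, and the three-type list are hypothetical and do not come from an optimized structure.

```python
from collections import defaultdict


def r_matrix_row(center, atoms, atom_types):
    """Return one R-matrix row for a single molecule.

    For each atom type, sum 1/r^2 over the atoms of that type, where r is
    the distance from the atom to the reaction center (assumed form).
    """
    sums = defaultdict(float)
    for atom_type, xyz in atoms:
        r_squared = sum((a - b) ** 2 for a, b in zip(xyz, center))
        sums[atom_type] += 1.0 / r_squared
    # Types absent from the molecule contribute 0.0 to the row.
    return [sums[t] for t in atom_types]


# Hypothetical toy geometry in angstroms (illustration only).
center = (0.0, 0.0, 0.0)  # position of the ionizable nitrogen
atoms = [
    ("C sp3", (1.0, 0.0, 0.0)),
    ("H", (0.0, 2.0, 0.0)),
    ("H", (0.0, 0.0, 2.0)),
]
atom_types = ["H", "C sp3", "Cl"]

row = r_matrix_row(center, atoms, atom_types)
print(row)  # [0.5, 1.0, 0.0]
```

Stacking one such row per compound would give the matrix whose columns serve as the independent variables of the regression.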


[0112] A multilinear regression based on the equation above was established with high accuracy (Const = 9.12 ± 0.19; N = 802; R (mult) = 0.9659; S = 0.1819), which allows the estimated operational atomic parameters to be used for predicting the basicity of amines:
$$pK_{R_3N} = 9.12 + \sum_{i}^{N-1} \frac{\delta_i^b}{r_i^2}$$
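The fitting step can be sketched as an ordinary least-squares problem with an intercept. The example below is a sketch under stated assumptions: it uses synthetic, noise-free data with made-up coefficients (not the patent's values), solves the normal equations in plain Python, and uses the hypothetical helper names `solve_linear`, `fit_multilinear` and `predict_pk`. A production calculation on an 802×19 matrix would more likely rely on a numerical linear-algebra library.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multilinear(rows, y):
    """Least-squares fit of y = const + sum_j delta_j * rows[k][j]
    via the normal equations (X^T X) beta = X^T y."""
    X = [[1.0] + list(r) for r in rows]  # leading 1.0 carries the constant
    p = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * yk for x, yk in zip(X, y)) for i in range(p)]
    beta = solve_linear(xtx, xty)
    return beta[0], beta[1:]


def predict_pk(r_row, const, delta):
    """Evaluate const + sum_j delta_j * r_row[j] for one compound."""
    return const + sum(d * r for d, r in zip(delta, r_row))


# Synthetic R-matrix rows (two atom types) and made-up "true" parameters.
rows = [(0.5, 1.0), (1.0, 0.2), (0.1, 0.9), (0.7, 0.3), (0.3, 0.6)]
true_const, true_delta = 9.0, (0.8, -2.5)
y = [predict_pk(r, true_const, true_delta) for r in rows]

const, delta = fit_multilinear(rows, y)
print(round(const, 6), [round(d, 6) for d in delta])
```

Because the synthetic data are noise-free, the fit recovers the made-up constant and coefficients to within floating-point error; real data would also require the standard errors reported alongside each parameter.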


[0113] The structures of the various amines are presented in Scheme 2.
[Scheme 2 structure drawings (images 10 through 19) not reproduced in this text version.]


[0114] The estimated pKa values of the amines are presented in Table 7 along with the corresponding experimental data. The interrelation between the estimated and experimental pK values is presented graphically in FIG. 3. The operational atomic parameters δ_i^b for the 19 atomic types used, taken as the multiple regression coefficients of the equation above, are collected in Table 6 (Example 2). The large uncertainties in the operational parameters δ_i^b estimated for O sp2, F and I are due to the lack of data (column elements of the R-matrix) for these atoms, which leads to significant statistical deviations.


[0115] These data demonstrate that this approach allows accurate quantitative interpretation of basicity data for a wide range of primary, secondary and tertiary amines. The values of the estimated atomic operational contributions in the equation above can hence be used for prediction of unknown pK values for amines constituted from the atom types presented in Table 6 in Example 2.
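As a small worked check of how the Δ column of Table 7 relates to the other columns, the sketch below recomputes Δ as the experimental pK minus the predicted pK for five entries of the table (numbers 1, 2, 3, 9 and 23) and summarizes them with a root-mean-square error. The variable names are illustrative, and the five-row subset is only an example, not a statistic for the full 802-compound set.

```python
import math

# (molecule, experimental pK, predicted pK) for entries 1, 2, 3, 9 and 23
# of Table 7.
rows = [
    ("CH3NH2", 10.66, 10.41),
    ("H2NC(O)CH2NH2", 7.95, 8.14),
    ("N≡CCH2NH2", 5.34, 4.99),
    ("C2H5NH2", 10.70, 10.57),
    ("C3H7NH2", 10.69, 10.57),
]

# Delta is the experimental value minus the predicted value.
residuals = [exper - pred for _, exper, pred in rows]

# Root-mean-square error over this small subset only.
rmse = math.sqrt(sum(d * d for d in residuals) / len(residuals))

for (name, _, _), d in zip(rows, residuals):
    print(f"{name}: {d:+.2f}")
print(f"RMSE over {len(rows)} rows: {rmse:.3f}")
```

Small differences from the tabulated Δ values (for example 0.25 here against 0.24 for the first entry) are consistent with rounding of the tabulated pK values.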
10TABLE 7Experimental (25C, I = 0) and Estimated pK parameters of organic aminespKpKσ*NrMoleculeexperpredΔcalc1CH3NH210.6610.410.24−0.142H2NC(O)CH2NH27.958.14−0.190.423N≡CCH2NH25.344.990.351.184(C2H5O)2Si(CH3)CH2NH29.209.31−0.110.135C2H5OSi(CH3)2CH2NH210.1810.29−0.11−0.116CH3ONH24.605.06−0.461.177(C2H5O)3SiCH2NH28.438.320.110.378(H3C)3SiCH2NH210.9610.870.09−0.259C2H5NH210.7010.570.13−0.1710C6H5C(O)NHC2H4NH29.139.41−0.280.1111BrC2H4NH28.478.350.120.3712CH3CH(CONH2)NH28.028.23−0.210.4013N≡CC2H4NH27.778.23−0.460.4014HOC2H4NH29.509.76−0.260.0215HSC2H4NH28.358.40−0.050.3516H3COC2H4NH29.429.340.080.1317H3CSC2H4NH29.369.340.020.1318Cl3CC2H4NH25.403.801.601.4719F3CC2H4NH25.707.57−1.870.5620(CH3)3SiC2H4NH210.9710.780.19−0.2221CH3C(O)OC2H4NH29.109.040.060.2022HOC2H4SC2H4NH29.288.850.430.2523C3H7NH210.6910.570.12−0.1724iso-C3H7NH210.6010.62−0.02−0.1925BrC3H6NH28.829.26−0.440.1526(CH3)2C(CN)NH25.235.34−0.111.1027(OHCH2)3CNH28.087.980.100.4628(OHCH2)2C(CH3)NH28.809.26−0.460.1529t-C4H9CH2NH210.2410.83−0.59−0.2430HOC3H6NH29.9610.16−0.20−0.0731CH3CH(CH2OH)NH29.439.370.060.1232(CH3)2C(OH)CH2NH29.259.070.180.1933(CH3)2C(CH2OH)CH2NH29.719.490.220.0934iso-C4H9NH210.7210.75−0.03−0.2235t-C4H9NH210.6810.85−0.17−0.2436F3CC2H4NH28.589.19−0.610.1637(CH3)3SiC3H6NH210.7310.77−0.04−0.2238H2C═CH—CH2NH29.499.72−0.230.0339HC≡C—CH2NH28.157.990.160.4540C4H9NH210.6110.62−0.01−0.1941sec-C4H9NH210.5610.69−0.13−0.2042HOC4H8NH210.2010.37−0.17−0.1243C2H5CH(CH2OH)NH29.559.490.060.0944C2H5C(CH2OH)2NH28.808.89−0.090.2345C2H5CH(CH3)CH2NH210.6410.76−0.12−0.2246(CH3)2CHC2H4NH210.7010.690.01−0.2047(CH3)2C(C2H5)NH210.7210.81−0.09−0.2348Cl3CC3H6NH29.788.391.390.3649Cl2C═CHC2H4NH29.978.841.130.2550C5H11NH210.6310.65−0.02−0.1951(C2H5)2CHNH210.4210.78−0.36−0.2252BrC5H10NH29.5010.05−0.56−0.0553(iso-C3H7)2CHNH210.2310.98−0.75−0.2754(C2H5)3CNH210.5911.00−0.41−0.2855HOC5H10NH210.4310.48−0.05−0.1556(C2H5)2C(CH3)NH210.6310.90−0.27−0.2557Cl3CC4H8NH29.979.180.790.1658t-C4H9CH2C(CH3)2NH210.7311.00−0.27−0.2859C6H13NH210.6410.6
8−0.04−0.2060BrC6H12NH210.4810.240.24−0.0961HOC6H12NH210.4810.55−0.07−0.1762C7H13NH210.6610.68−0.02−0.2063C5H11CH(CH3)NH210.6710.80−0.13−0.2364C5H11C(CH3)2NH210.5610.92−0.36−0.2665(CH3)2CHC3H6CH(CH3)NH210.2810.83−0.55−0.2466C8H17NH210.6510.71−0.06−0.2167C6H13CH(CH3)NH210.4910.82−0.33−0.2368C5H11CH(OH)C(CH3)2NH29.859.720.130.0369C9H19NH210.6410.72−0.08−0.2170C10H21NH210.6210.73−0.11−0.2171C11H23NH210.6310.74−0.11−0.2172C12H25NH210.6310.74−0.11−0.2273C13H27NH210.6310.79−0.16−0.2374C14H29NH210.6210.80−0.18−0.2375C15H31NH210.6110.80−0.19−0.2376C16H33NH210.6110.76−0.15−0.2277C17H35NH210.6010.81−0.21−0.2378C18H37NH210.6010.82−0.22−0.2379C22H44NH210.6010.83−0.23−0.2480C5H8(cyclo), 2-OH, 1-NH29.209.35−0.150.1281C6H11(cyclo)NH210.6810.76−0.08−0.2282C6H10(cyclo) 2-Cl, 1-NH29.499.340.150.1383C6H10(cyclo) 2-OH, 1-NH29.539.510.020.0884C6H9(cyclo) 5-CH3, 2-OH, 1-NH29.449.56−0.120.0785C6H9(cyclo) 6-CH3, 2-OH, 1-NH29.399.62−0.230.0686C6H9(cyclo) 5-CH3, 2-CH(CH3)2, 1-NH2ecvat10.3511.06−0.71−0.2987C6H9(cyclo) 5-CH3, 2-CH(CH3)2, 1-NH2axial10.4811.08−0.60−0.3088C6H11(cyclo)CH2NH210.4910.83−0.34−0.2489C6H10(cyclo) 1-CH3, 1-NH210.3610.93−0.57−0.2690C6H10(cyclo) 2-CH3, 1-NH2ecvat10.4910.87−0.38−0.2591C6H10(cyclo) 2-CH3, 1-NH2axial10.5110.89−0.38−0.2592C6H10(cyclo) 3-CH3, 1-NH2ecvat10.5610.78−0.22−0.2393C6H10(cyclo) 3-CH3, 1-NH2axial10.6110.80−0.19−0.2394C6H11(cyclo)CH2CH(CH3)NH210.1410.87−0.73−0.25952010.4210.000.42−0.03962110.3710.060.31−0.0597C7H12(cyclo) 2-OH, 1-NH29.259.27−0.020.1498C6H5C2H4CH(CH3)NH29.7910.36−0.57−0.1299C6H4(4-OH)C2H4CH(CH3)NH29.149.90−0.76−0.01100C6H5C4H8NH210.3610.41−0.05−0.13101C6H5CH(CH3)NH29.089.34−0.260.13102C6H5C2H4NH29.849.96−0.12−0.03103C6H3(3-OH, 4-OH)C2H4NH28.749.07−0.330.19104C6H4(4-OH)C2H4NH29.309.37−0.070.12105C6H3(3-OH, 4-OH)CH(OH)CH(C2H5)NH28.428.82−0.400.25106C6H5CH(OH)CH2NH28.908.440.460.34107C6H3(3-OH, 
4-OH)CH(OH)CH2NH28.588.450.130.34108C6H4(3-OH)CH(OH)CH2NH28.678.060.610.44109C6H4(4-OH)CH(OH)CH2NH28.818.230.580.40110C6H5CH(OH)CH(CH3)NH29.318.810.500.26111C6H3(3-OH, 4-OH)CH(OH)CH(CH3)NH28.458.430.020.35112C6H4(4-OH)CH(OH)CH(CH3)NH28.708.650.050.29113C6H5CH(CH3)CH2NH210.2710.100.17−0.06114C6H5Si(CH3)2CH2NH210.3610.39−0.03−0.13115C6H5CH(CH3)C2H4NH210.0310.25−0.22−0.09116C6H5C3H6CH(CH3)NH29.9910.51−0.52−0.16117C6H5C5H10NH210.4410.53−0.09−0.16118C6H4(4-OH)C3H6CH(CH3)NH29.4010.02−0.62−0.04119C6H5CH2CH(CH3)NH210.0310.08−0.05−0.05120C6H5C3H6NH210.1610.22−0.06−0.09121C6H3(3-OCH3, 4-OCH3)CH2CH(CH3)NH29.609.76−0.160.02122C6H4(4-OH)CH2CH(CH3)NH29.319.93−0.62−0.02123C6H4(4-OCH3)CH2CH(CH3)NH29.539.95−0.42−0.02124C6H5CH2NH29.339.210.120.16125C6H3(2-OCH3, 3-OCH3)CH2NH29.418.510.900.33126C6H3(3-OCH3, 4-OCH3)CH2NH29.398.750.640.27127C6H4(2-OCH3)CH2NH29.708.730.970.27128C6H4(3-OCH3)CH2NH29.158.970.180.22129C6H4(4-OCH3)CH2NH29.479.010.460.21130C6H4(2-CH3)CH2NH29.199.33−0.140.13131C6H4(3-CH3)CH2NH29.339.250.080.15132C6H4(4-CH3)CH2NH29.369.240.120.15133F5C6CH2NH27.676.591.080.79134OC(O)CH(CH3)NH29.8710.22−0.35−0.09135OC(O)CH(CH3)NHC(O)CH(CH3)NH28.148.39−0.250.36LL136OC(O)CH(CH3)NHC(O)CH(CH3)NH28.308.140.160.42LD137OC(O)CH(CH3)NHC(O)CH(CH3)NHC(O)CH(CH3)NH28.037.910.120.47LLL138OC(O)CH(CH3)NHC(O)CH(CH3)NHC(O)CH(CH3)NH28.057.910.140.47LLD139OC(O)CH(CH3)NHC(O)CH(CH3)NHC(O)CH(CH3)NH28.137.950.180.46LDL140OC(O)CH(CH3)NHC(O)CH(CH3)NHC(O)CH(CH3)NH28.067.950.110.47DLL141OC(O)CH(CH3)NHC(O)CH(CH3)NHC(O)CH(CH3)NH28.067.960.100.46DDD142OC(O)CH(CH3)NH[C(O)CH(CH3)NH]2C(O)CH(CH3)NH27.947.710.230.52LLLL143OC(O)CH(CH3)NH[C(O)CH(CH3)NH]2C(O)CH(CH3)NH27.937.770.160.51LLDL144OC(O)CH(CH3)NH[C(O)CH(CH3)NH]2C(O)CH(CH3)NH27.997.770.220.51LDLL145OC(O)CH(CH3)NH[C(O)CH(CH3)NH]2C(O)CH(CH3)NH27.997.780.210.51DLLL146OC(O)CH(CH3)NHC(O)CH(C4H8N+H3)NHC(O)CH(CH3)NH27.657.630.020.54LLL147OC(O)CH(CH3)NHC(O)CH(C4H8N+H3)NHC(O)CH(CH3)NH27.977.640.330.54LDL148OC(O)CH(CH3)NHC(O)CH(C4H8N+H3)NHC(O)CH(CH3)NH27.847.640.200.54LLD
149OC(O)CH(CH3)NHC(O)CH(NHC(O)CH(CH3)NH2)C4H8NH210.309.970.33−0.03LLL150OC(O)CH(CH3)NHC(O)CH(NHC(O)CH(CH3)NH2)C4H8NH210.369.950.41−0.02LDL151OC(O)CH(CH3)NHC(O)CH(NHC(O)CH(CH3)NH2)C4H8NH210.499.990.50−0.03LLD152OC(O)CH(CH3)NHC(O)CH(C4H8N+H3)NHC(O)CH(CH3)NHC(O)CH(CH3)NH28.017.480.530.58LLLL153OC(O)CH(CH3)NHC(O)CH(NHC(O)CH(CH3)NHC(O)CH(CH3)NH2)C4H8NH210.589.930.65−0.02LLLL154OCOCH(CH3)NHC(O)CH(C4H8N+H3)NHC(O)CH(CH3)NHC(O)CH(CH3)NH28.017.440.570.59LDLL155OCOCH(CH3)NHC(O)CH(C4H8N+H3)NH[C(O)CH(CH3)NH]2C(O)CH(CH3)NH27.757.570.180.56LLLLL156OCOCH(CH3)NHC(O)CH(C4H8N+H3)NH[C(O)CH(CH3)NH]2C(O)CH(CH3)NH27.857.320.530.62LDLLL157OC(O)CH(CH3)NHC(O)CH(NH[C(O)CH(CH3)NH]2C(O)CH(CH3)NH2)C4H8NH210.359.920.43−0.02LLLLL158OC(O)CH(CH3)NHC(O)CH(NH[C(O)CH(CH3)NH]2C(O)CH(CH3)NH2)C4H8NH210.299.830.460.01LDLLL159OOCCH(CH3)NHC(O)CH2NH28.238.41−0.180.35160OOCCH(CH3)NHC(O)CH(CH3)NHC(O)CH2NH28.107.910.190.47LL161OOCCH(CH3)NHC(O)CH(CH3)NHC(O)CH2NH28.178.050.120.44LD162OC(O)CH(CH3)NHC(O)CH(C4H8N+H3)NH27.627.93−0.310.47LL163OC(O)CH(CH3)NHC(O)CH(NH2)C4H8NH210.7010.240.46−0.09LL164OC(O)CH(CH3)NHC(O)CH(C4H8N+H3)NH27.747.86−0.120.49LD165OC(O)CH(CH3)NHC(O)CH(NH2)C4H8NH210.6310.230.40−0.09LD166OC(O)CH(CH2OCH3)NH29.189.060.120.19167H5C2OC(O)CH(CH3)NH27.747.99−0.250.45168OC(O)CH(C2H4OH)NH29.109.080.020.19169OC(O)CH(C2H4OCH3)NH28.909.12−0.220.18170OC(O)CH(C2H5)NH29.609.76−0.160.02171OC(O)CH(CH2CF3)NH28.178.060.120.44172H5C2OC(O)CH(C2H5)NH27.608.11−0.510.42173OOCCH2CH(CF3)NH25.835.830.000.98174OOCC3H6NH210.5610.490.06−0.15175H5C2OOCC3H6NH29.719.79−0.080.02176OOCC5H10NH210.8010.560.25−0.17177H5C2OOCC5H10NH210.3010.34−0.03−0.12178OOCC(CH3)2NH210.2110.44−0.23−0.14179OOCC2H4C(O)CH2NH28.828.640.180.30180OOCC2H4CH(CH3)NH210.4610.61−0.15−0.18181OOCC4H8NH210.7510.590.16−0.18182H5C2OOCC4H8NH210.1510.19−0.04−0.08183OOCC2H4NH210.2410.49−0.26−0.16184H5C2OOCC2H4NH29.069.59−0.530.07185OOCCH(C3H6NHC(═N+H2)NH2)NH28.999.08−0.090.19186OOCCH(C3H6NHC(═N+H2)NH2)NHC(O)CH(CH2C6H5)NH27.547.530.010.57187OOCCH(C3H6NHC(═N+H2)NH2)NHC(O)C
H(CH2C6H4(4-OH))NH27.397.390.000.60DD188OOCCH(CH2C(O)NH2)NH28.849.36−0.520.12189OOCCH(CH2C(O)NH2)NHC(O)CH(CH2SH)NH26.957.13−0.180.66LL190OOCCH(CH2C(O)NH2)NHC(O)CH2NH28.278.140.130.42191OOCC(OH)(CH2C(O)NH2)NH27.207.55−0.350.56192OOCCH(CH(OH)C(O)NH2)NH28.298.46−0.170.34193OOCCH(CH2C(O)NH2)NHC(O)CH(iso-C4H9)NH28.118.24−0.130.39194OOCCH(CH2COO)NH210.0010.13−0.13−0.07195OOCCH(CH2COO)NHC(O)CH(CH2COO)NH28.267.920.340.47196OOCCH(CH2COO)NHC(O)CH2NH28.608.270.330.39197H5C2OOCCH(CH2COOC2H5)NH26.406.77−0.370.75198H2NOCCH(CH2CONH2)NH27.006.920.080.72199OOCCH(CH2SH)NH28.338.70−0.370.28200OOCCH(CH2SH)NHC(O)CH(CH2SH)NH27.277.120.150.67201OOCCH(CH2SC2H5)NH28.699.12−0.430.18202OOCCH(CH2SCH3)NH28.759.15−0.400.17203H5C2OOCCH(CH2SH)NH26.696.75−0.060.76204H3COOCCH(CH2SH)NH26.566.92−0.360.71205OOCCH(NH2)CH2SSCH2CH(COO)NH28.959.19−0.240.16206OOCCH(N+H3)CH2SSCH2CH(COO)NH28.267.750.510.51207OOCCH(N+H3)CH2SSCH2CH(COO)7.667.81−0.150.50NHC(O)CH(N+H3)CH2SSCH2CH(COO)NH2208OOCCH(NH2)CH2SSCH2CH(COO)8.187.680.500.53NHC(O)CH2NHC(O)CH2NH2209H3N+CH2C(O)NHCH2C(O)NHCH(COO)8.187.760.420.51CH2SSCH2C(O)CH(COO)NH2210H2NC(O)CH(NH2)CH2SSCH2CH(CONH2)NH26.806.87−0.070.73211H3CCH(NH2)CH2CH(COO)NH210.359.750.600.03212OOCCH(N+H3)CH2CH(CH3)NH28.168.140.020.42213H2NCH2CH(COO)NH29.609.260.340.14214OOCCH(N+H3)CH2NH26.807.29−0.490.63215OOCCH(C2H4COO)NH29.9410.17−0.23−0.08216OOCCH(C2H4COO)NHC(O)CH(C2H4COO)NH27.627.99−0.370.45DD217OOCCH(CH(OH)CH2COO)NH29.669.490.170.09218OOCCH(C2H4COO)NHC(O)CH(C4H8N+H3)NH27.757.720.030.52219OOCCH(C2H4COO)NHC(O)CH(NH2)C4H8NH210.5010.210.29−0.09220H5C2OOCCH(C2H4COOC2H5)NH27.047.30−0.260.62221OOCCH(C2H4COOCH2C6H5)NH29.009.26−0.260.14222H5C2OOCCH(C2H4COO)NH27.847.86−0.020.49223OOCCH(C2H4COOC2H5)NH29.199.53−0.340.08224OOCCH(C2H4CONH2)NH29.139.16−0.030.17225OOCCH(C2H4CONH2)NHC(O)CH2NH28.168.120.040.42226OOCCH(C2H4CONH2)NHC(O)CH(iso-C4H9)NH27.948.34−0.400.37LL227OOCCH2NHC(O)CH(CH2SH)NHC(O)C2H4CH(COO)NH28.669.01−0.350.21228H2NCH(COO)C2H4CONHCH(CONHCH2COO)9.529.150.370.17CH2SSCH2CH(CONHCH2COO)NHCO
C2H4CH(COO)NH2229H3N+CH(COO)C2H4CONHCH(CONHCH2COO)8.628.85−0.240.24CH2SSCH2CH(CONHCH2COO)NHCOC2H4CH(COO)NH2230OOCCH2NHC(O)CH(CH2SC2H5)NHC(O)C2H4CH(COO)NH29.209.43−0.230.10231OOCCH2NH29.7810.13−0.35−0.07232OOCCH2NHC(O)CH(CH3)NH28.188.56−0.380.32233OOCCH2NHC(O)CH2NHC(O)CH(CH3)NH28.037.990.040.45234OOCCH2NHC(O)CH(CH2C(O)NH2)NH27.257.61−0.360.55235H2NC(O)CH(N+H3)CH2SSCH2CH(CONH2)NH25.855.94−0.100.95236OOCCH2NHC(O)CH2NHC(O)CH(N+H3)CH2SSCH2CH(COO)NH26.957.10−0.150.67237OOCCH2NHC(O)CH(C2H4COO)NH27.527.89−0.370.48238OOCCH2NHC(O)CH2NH28.258.160.090.41239OOCCH2NH[C(O)CH(CH3)NH]2C(O)CH2NH27.937.780.150.51240OOCCH2NHC(O)CH2NHC(O)CH2NH28.097.630.460.54241OOCCH2NH[C(O)CH2NH]2C(O)CH2NH27.947.650.290.54242OOCCH2NHC(O)CH(CH2OH)NHC(O)CH2NH27.997.630.360.54243OOCCH2NH[C(O)CH2NH]5C(O)CH2NH27.597.160.430.66244(1)7.977.790.180.50245OOCCH2NHC(O)CH(iso-C4H9)NH28.138.56−0.430.32246OOCCH2NHC(O)CH2NHC(O)CH(iso-C4H9)NH27.977.740.230.51247OOCCH2NH[C(O)CH2NH]4C(O)CH2NH27.627.320.300.62248(2)8.978.340.640.37249OOCCH2NHC(O)CH(CH2OH)NH27.337.070.260.68250OOCCH2NH[C(O)CH2NH]3C(O)CH2NH27.907.630.270.54251OOCCH2NHC(O)CH(iso-C3H7)NH28.028.42−0.400.35252H2NOOCCH2NHC(O)CH(CH3)NHC(O)CH(CH2C6H5)NH26.726.84−0.120.73253C2H5OOCCH2NH27.647.79−0.150.50254CH3OOCCH2NH27.597.72−0.130.52255CH3OOCCH2NHC(O)CH2NH27.757.430.320.59256C2H5OOCCH2NHC(O)CH2NHC(O)CH2NH27.797.590.200.55257C2H5OOCCH2NH[C(O)CH2NH]2C(O)CH2NH27.697.600.090.55258CH3OOCCH2NH[C(O)CH2NH]4C(O)CH2NH27.747.500.240.57259(3)9.159.120.030.18260(4)9.519.480.030.09261(5)9.459.340.110.13262(6)8.208.45−0.250.34263(7)9.079.030.050.20264(8)8.187.730.450.52265(9)8.628.83−0.210.25266(10)8.208.090.110.43267(11)7.827.820.000.50268(12)8.478.84−0.370.25269(13)8.858.820.030.25270(14)9.319.40−0.090.11271(15)7.647.610.030.55272(16)r7.337.44−0.110.59273OOCCH(C2H4SH)NH28.878.760.110.27274OOCCH(NH2)C2H4SSC2H4CH(COO)NH29.449.200.240.16275OOCCH(N+H3)C2H4SSC2H4CH(COO)NH28.528.490.030.33276H2NC(O)CH(CH2COO)NH27.957.750.200.51277H2NC(O)CH(C2H4COO)NH27.888.19−0.310.41278OOCCH(sec
-C4H9)NH29.769.89−0.13—0.01279OOCH2NHC(O)CH(sec-C4H9)NH28.008.26−0.260.39280OOCCH(OH)NH29.339.48−0.150.09281OOCCH(OH)NHC(O)CH(iso-C4H9)NH28.218.38−0.170.36282OOCCH(iso-C4H9)NH29.749.81−0.060.01283OOCCH(iso-C4H9)NHC(O)CH2NH28.298.52−0.230.33284OOCCH(iso-C4H9)NHC(O)CH2N(CH3)H8.678.280.390.38285OOCCH(iso-C4H9)NHC(O)CH(CH2OH)NH27.457.62−0.170.54LL286OOCCH(CH2CH(CH3)CF3)NH28.959.46−0.520.10287H2NC(O)CH(iso-C4H9)NH27.808.60−0.800.31288H5C2OOCCH(iso-C4H9)NH27.578.24−0.670.39289OOCCH(C4H9N+H3)NH29.209.57−0.370.07290OOCCH(NH2)C4H9NH210.8010.500.30−0.16291OOCCH(C4H9N+H3)NHC(O)CH(C4H9N+H3)NH27.537.490.040.58LL292OOCCH(C4H9N+H3)NHC(O)CH(NH2)C4H9NH210.059.960.09−0.03LL293H2NCH(C4H9NH3)C(O)NHCH(COO)C4H9NH211.0110.440.57−0.14LL294OOCCH(C4H9N+H3)NHC(O)CH(C4H9N+H3)NH27.537.54−0.010.56LD295OOCCH(C4H9N+H3)NHC(O)CH(NH2)C4H9NH29.929.810.110.01LD296H2NCH(C4H9NH2)C(O)NHCH(COO)C4H9NH210.8910.450.44−0.15LD297OOCCH(C4H9N+H3)NHC(O)CH(C4H9N+H3)NHC(O)CH(C4H9N+H3)NH27.347.040.300.68LLL298H3N+C4H9CH(C(O)NHCH(COO)C4H9N+H3)NHC(O)CH(NH2)C4H9NH29.809.580.220.07LLL299H2NCH(C4H9NH2)C(O)NHCH(C(O)NHCH(COO)C4H9N+H3)C4H9NH210.5410.020.52−0.04LLL300H2NCH(C4H9NH2)C(O)NHCH(C4H9NH2)C(O)NHCH(COO)C4H9NH211.3210.740.58−0.21LLL301OOCCH(C2H4SCH3)NH29.279.76−0.490.02302OOCCH(C2H4SCH3)NHC(O)CH(C2H4SCH3)NH27.507.73−0.230.52LL303H2NC(O)CH(C2H4SCH3)NH27.537.55−0.020.56304OOCCH(C4H9)NH29.839.570.260.07305OOCCH(C3H6CF3)NH29.469.65−0.190.05306OOCCH(C3H7)NH29.819.520.290.08307OOCCH(C2H4CF3)NH28.929.40−0.480.11308OOCC2H4NHC(═NH)NHC3H6CH(COO)NH28.728.660.060.29309—OOCCH(C3H6N+H3)NH28.698.87−0.180.24310H2NCH(COO)C3H6NH210.7610.320.43−0.11311—OOCCH(CH2C6H5)NH29.249.29−0.050.14312—OOCCH(CH2C6H4(2-Cl))NH28.938.510.410.33313—OOCCH(CH2C6H4(3-Cl))NH28.909.06−0.160.19314—OOCCH(CH2C6H4(4-Cl))NH28.959.21−0.270.16315—OOCCH(CH2C6H3(3-OH, 4-OH))NH29.199.190.000.16316—OOCCH(CH2C6H4(2-F))NH28.989.25−0.260.15317—OOCCH(CH2C6H4(3-F))NH28.959.37−0.420.12318—OOCCH(CH2C6H4(4-F))NH29.029.45−0.420.10319—OOCCH(CH2C6H3(2-F, 
4-OCH3))NH28.988.910.060.23320—OOCCH(CH2C6H5)NHC(O)CH2NH28.178.19−0.020.41321—OOCC(CH3)(CH2C6H5)NH29.579.92−0.35−0.02322—OOCC(CH3)(CH2C6H3(3-OH, 4-OH))NH29.129.61−0.490.06323H2NC(O)C(CH3)(CH2C6H5)NH27.227.75−0.530.51324HONHC(O)CH(CH2C6H5)NH26.786.98−0.200.70325CH3OC(O)CH(CH2C6H5)NH27.007.57−0.570.56326HONHC(O)CH(CH2C6H5)NH27.157.45−0.300.59327(17)10.6410.140.50−0.07328(18)8.388.110.270.42329(19)8.698.670.020.29330(20)9.479.190.280.16331OOCCH2N(CH3)C(O)CH2NH28.598.190.400.41332OOCCH(CH2OH)NH29.219.080.130.19333OOCCH(CH2OH)NHC(O)CH2NH28.107.900.200.48334H2NOCCH(CH2OH)NH27.307.55−0.250.56335H3COCCH(CH2OH)NH27.107.28−0.180.63336OOCCH(CH(CH3)OH)NH29.109.31−0.210.13337OOCCH(CH(CH3)OCH3)NH29.009.000.000.21338OOCCH(CH(CF3)OH)NH27.787.490.290.58339(21)9.449.82−0.380.01340(22)8.067.820.240.50341(23)7.557.380.170.60342OOCCH(CH2C6H4(4-OH))NH29.119.13−0.020.18343OOCCH(CH2C6H4(4-OH))NHC(O)CH(CH2COO)NH28.939.05−0.120.20344OOCCH(CH2C6H2(3-Br, 4-OH, 5-Br))NH26.456.140.310.90345OOCCH(CH2C6H2(3-Cl, 4-OH, 5-Cl))NH26.476.82−0.350.74346OOCCH(CH2C6H2(3-I, 4-OH, 5-I))NH26.487.03−0.550.69347OOCCH(CH2C6H4(4-OH))NHC(O)CH2NH28.458.060.390.44348OOCCH(CH2C6H4(4-OH))NHC(O)CH(iso-C4H9)NH28.368.39−0.030.36DL349OOCCH(CH2C6H4(4-OCH3))NH29.278.950.320.22350OOCCH(CH2C6H4(4-OH))NHC(O)CH(CH2C6H4(4-OH))NH27.687.420.260.59351H2NOCCH(CH2C6H4(4-OH))NH27.487.80−0.320.50352C2H5OCCH(CH2C6H4(4-OH))NH27.337.320.010.62353C2H5OCCH(CH2C6H4(4-OCH3))NH27.317.33−0.020.61354HONHC(O)CH(CH2C6H4(4-OH))NH27.007.30−0.300.62355OOCCH(iso-C3H7)NH29.729.84−0.120.00356OOCCH(iso-C3H7)NHC(O)CH2NH28.258.36−0.110.37357OOCCH(C(CH3)2SH)NH28.008.02−0.020.45358OOCCH(CH(CH3)CF3)NH28.108.20−0.090.40359H2NOCCH(iso-C3H7)NH28.008.62−0.620.30360H2NC2H4OC2H4NH29.689.280.400.14361H3N+C2H4OC2H4NH28.768.680.080.29362H2NC2H4NHC2H4NH29.8010.11−0.32−0.06363H3N+C2H4NHC2H4NH29.109.43−0.330.10364(H2N + 
C2H4)2NH4.304.40−0.101.33365C6H5CH2NHC2H4NH26.486.51−0.030.81366(H2NC2H4)3N10.1310.090.04−0.06367(H2NC2H4)2N+HC2H4NH29.449.030.410.20368H3N+C2H4(H2NC2H4)N+HC2H4NH28.427.870.550.48369(C4H9)N+H2C2H4NH27.537.330.200.61370(C2H5)2N+HC2H4NH27.077.030.040.69371(CH3)2N+HC2H4NH26.636.90−0.270.72372H2NC2H4OC2H4SC2H4NH29.609.320.280.13373H3N+C2H4SC2H4OC2H4NH28.668.68−0.020.29374H5C2N+H2C2H4NH27.427.230.190.64375(C4H3O)—CH2—NH—C2H4—NH2 (24)6.125.980.130.94376H3N+C2H4(HOC2H4)NH6.837.02−0.190.69377H3N+C2H4(CH3CH(OH)CH2)NH6.947.07−0.130.68378H3N+C2H4(HOC3H6)NH6.787.11−0.330.67379iso-C3H7N+H2C2H4NH27.707.76−0.060.51380CH3N+H2C2H4NH26.837.19−0.360.65381C6H4(4-CH3)CH2N+H2C2H4NH26.516.90−0.390.72382C6H5C2H4N+H2C2H4NH26.596.380.210.84383C3H7N+H2C2H4NH27.547.280.260.63384H2NC2H4OC2H4OC2H4NH29.719.140.570.18385H3N+C2H4OC2H4OC2H4NH28.748.700.040.28386H2NC2H4SC2H4SC2H4NH29.459.400.040.11387H3N+C2H4SC2H4SC2H4NH28.578.430.130.35388H2NCH2C(O)NHC2H4NHC(O)CH2NH28.357.960.390.46389H3N+CH2C(O)NHC2H4NHC(O)CH2NH27.637.87−0.240.48390H2NC2H4NH29.939.930.00−0.02391H3N+C2H4NH26.857.13−0.280.66392H2NC2H4SSC2H4NH29.039.05−0.030.20393H3N+C2H4SSC2H4NH28.698.650.030.29394H2NC2H4SC2H4NH29.809.410.380.11395H3N+C2H4SC2H4NH28.988.960.010.22396H2NC3H6NHC3H6NH29.869.87−0.010.00397H3N+C3H6(H2NC3H6)NH8.147.250.890.63398H2NC3H6N(CH3)C3H6NH210.0110.48−0.47−0.15399H3N+C3H6N(CH3)C3H6NH29.029.17−0.150.17400H2NCH2CH(CH3)NH210.0010.06−0.06−0.05401H3CCH(N+H2)CH2NH27.137.25−0.120.63402H2NC3H6NH210.4810.380.10−0.13403H3N+C3H6NH28.458.78−0.330.26404(H2NCH2)2CHCH2NH210.4210.250.17−0.10405(H3N+CH2)(H2NCH2)CHCH2NH28.788.700.080.28406(H3N+CH2)2CHCH2NH26.846.430.410.83407H2NCH2C(CH3)2CH2NH210.3810.64−0.27−0.19408H3N+CH2C(CH3)2CH2NH28.308.61−0.310.30409H2NCH2CH(OH)CH2NH29.569.220.330.15410H3N+CH2CH(OH)CH2NH27.787.300.480.62411H2NCH2C(CH3)2NH210.0010.11−0.11−0.06412H3N+C(CH3)2CH2NH26.797.00−0.210.69413H2NCH2CH(NH2)CH2NH29.469.48−0.020.09414H3N+CH2CH(NH2)CH2NH27.837.460.370.58415(H3N+CH2)2CHNH23.673.83−0.161.47416H2NC4H8(H5C2)2N10.30
10.60−0.30−0.18417H2NC4H8NH210.1910.50−0.31−0.16418H3N+C4H8NH28.789.02−0.240.20419H3CCH(NH2)CH(CH3)NH210.0010.41−0.41−0.14420H3CCH(N+H3)CH(CH3)NH26.916.590.320.79421H3CCH(NH2)C3H6(H5C2)2N10.1010.80−0.70−0.23422H2NC5H10NH210.2510.57−0.32−0.17423H3N+C5H10NH29.139.24−0.110.15424H2NC6H12NH210.9310.620.31−0.19425H3N+C6H12NH29.8310.16−0.33−0.07426H2NC8H16NH210.8310.750.08−0.22427H3N+C8H16NH29.9510.34−0.40−0.12428C6H10(cyclo), 2-NH2, 1-NH2ecvat9.8010.20−0.41−0.08429C6H10(cyclo), 2-N+H3, 1-NH2ecvat6.035.760.261.00430C6H10(cyclo), 2-NH2, 1-NH2axial9.7410.47−0.73−0.15431C6H11(cyclo)CH2CH(CH3)NH210.5211.04−0.52−0.29432C7H12(cyclo), 2-OH, 1-NH29.259.51−0.260.08433C7H12(cyclo), 2-NH2, 1-NH210.0210.32−0.31−0.11434C7H12(cyclo), 2-N+H3, 1-NH26.215.830.370.98435(25)10.1711.00−0.83−0.28436(26)9.149.31−0.160.13437(27)9.489.83−0.350.01438(28)8.138.15−0.020.41439(29)9.509.400.090.11440(30)9.719.77−0.060.02441OOCCH2NHC(O)CH(CH2COO)NH29.078.650.420.29442H2NC(O)CH2(H3C)NH8.318.290.020.20443(NCCH2)2NH0.20−0.440.642.32444(CH3)2NC(O)CH2(H3C)NH8.828.440.380.16445H3C(HO)NH5.964.781.181.05446H3C(H3CO)NH4.754.90−0.151.02447(H3C)2NH10.7310.210.52−0.27448H3CNHC(O)CH2(H3C)NH8.248.080.160.25449[(H3C)3SiCH2]2NH11.4011.170.23−0.50450H3CC(O)NHC2H4NH29.059.30−0.25−0.05451(NCC2H4)2NH5.265.48−0.220.88452(C6H5)2CHC(O)C2H4(CH3)NH9.128.980.140.03453(H5C2)2NH11.0410.430.61−0.32454(HOC2H4)2NH8.888.680.200.10455(HOC2H4)2(H3C)N8.528.500.020.15456HOC2H4(H3CCH(OH)CH2)NH8.818.600.210.12457tBu-NH—CH2CH(OH)—CH310.009.880.12−0.19458(iso-C4H9)2NH10.7910.84−0.05−0.42459(iso-C3H7)2NH10.9910.640.35−0.37460(C3H7)2NH11.0010.550.45−0.35461(H3C)3SiCH2(iso-C3H7)NH10.8010.91−0.11−0.44462(H2C═CHCH2)2NH9.298.930.360.04463H2C═CHCH2(H3C)NH10.119.620.49−0.12464(HC≡CCH2)2NH6.105.390.710.90465(C4H9)2NH11.2510.650.60−0.38466(sec-C4H9)2NH11.0110.800.21−0.41467[(H3C)2CHC2H4]2NH11.0310.800.22−0.41468(C5H11)2NH11.1910.720.47−0.39469(C6H13)2NH11.0110.810.20−0.41470C5H11CH(H3C)CH(H3C)NH10.8210.610.21−0.36471(C8H17)2NH11.0110.870.14−0.43472(C1
2H25)2NH11.0010.780.22−0.41473(C13H27)2NH11.0010.960.04−0.45474(C15H31)2NH11.0010.990.01−0.46475(C18H37)2NH11.0011.01−0.01−0.46476C5H8(cyclo), 2-OH, 1-(H3C)NH10.069.480.58−0.09477C5H9(cyclo)(H3C)NH10.8510.450.40−0.33478C6H11(cyclo)(t-H9C4)NH11.2310.870.36−0.43479C6H10(cyclo), 2-Cl, 1-(H3C)NH9.859.150.70−0.01480C6H11(cyclo)(H3C)2N10.7210.350.37−0.30481C6H10(cyclo) 2-OH, 1-(H3C)2N10.329.780.54−0.16482C6H10(cyclo) 4-OH, 1-CH(OH)CH2(H7C3)NH10.089.940.13−0.20483C6H11(cyclo)(H3C)NH11.0410.630.41−0.37484C6H11(cyclo)CH2(H3C)CH(H3C)NH10.5210.86−0.34−0.43485C6H11(cyclo)[(H3C)3SiCH2]NH10.9611.12−0.16−0.49486C6H5CH2CH(CH3)(NCC2H4)NH7.237.52−0.290.39487C6H3, 1-OH, 2-OH, 4-CH(OH)CH2(iso-H7C3)NH8.878.440.430.16488C6H3, 1-OH, 2-OH, 4-CH(OH)(iso-C3H7)CH(iso-H7C3)NH8.918.710.200.10489C6H3, 1-OH, 2-OH, 4-CH(OH)CH2(H3C)NH8.508.240.260.21490C6H3, 1-OH, 2-OH, 4-C2H4(H3C)NH8.788.80−0.020.07491C6H4, 1-F, 3-CH(OH)CH2(iso-H7C3)NH9.358.890.450.05492C6H4, 1-OH, 3-CH(OH)CH2(H3C)NH8.868.340.520.19493C6H4, 1-OH, 4-CH(OH)CH2(H3C)NH8.908.400.500.17494C6H4, 1-OH, 4-C(O)CH2(iso-H7C3)NH7.647.77−0.130.33495C6H5CH(OH)CH2(H3C)NH9.318.860.450.06496C6H4, 1-OH, 
4-CH(OH)CH2(H3C)NH9.368.480.880.15497C6H5CH(OH)CH(CH3)(H3C)NH9.528.770.750.08498C6H5C4H8(H3C)NH10.8010.390.41−0.31499C6H5C2H4(H3C)NH10.089.780.30−0.16500C6H5CH2(H3C)CH(H3C)NH9.879.89−0.02−0.19501C6H5C3H6(H3C)NH10.6410.040.60−0.23502C6H5CH(CH3)CH2(H3C)NH9.889.860.02−0.18503C6H5CH2(H5C2)NH9.649.160.48−0.01504C6H5CH2(H3C)NH9.549.060.480.01505C6H5CH2(H7C3)NH9.589.240.34−0.03506(31)11.0710.500.57−0.34507(32)11.1210.360.77−0.30508(33)11.0710.580.49−0.36509(34)9.098.570.520.13510(35)10.9510.470.48−0.33511(36)11.0710.420.65−0.32512(37)10.9010.640.26−0.37513(38)11.0710.780.29−0.41514(39)11.2110.630.58−0.37515(40)11.3810.690.69−0.38516(41)8.399.03−0.640.02517—OOCCH2(H3C)NH10.199.960.23−0.21518—OOCCH2(H7C3)NH10.1910.140.05−0.25519—OOCCH2(H9C4)NH10.0710.18−0.11−0.26520(42)8.728.90−0.180.05521OOCCH2(iso-H9C4)NH10.1210.56−0.44−0.35522OOCCH2(iso-H7C3)NH10.0610.16−0.10−0.25523OOCCH2(H7C3)NH10.1910.040.15−0.23524OOCCH2NHC(O)CH2(H3C)NH8.578.500.070.15525OOCCH2(H3C)NH10.209.940.26−0.20526OOCCH2(H3C)NC(O)CH2(H3C)NH9.188.690.490.10527C6H5CH2NHC2H4NH29.418.820.590.07528C4H9NHC2H4NH210.3010.200.10−0.27529C2H5NHC2H4NH210.3610.080.28−0.24530(43)9.579.410.16−0.07531HOC2H4NHC2H4NH29.829.340.48−0.06532H3CCH(OH)CH2NHC2H4NH29.869.470.39−0.09533HOC3H6NHC2H4NH29.679.500.17−0.10534iso-C3H7NHC2H4NH210.6210.320.30−0.30535CH3NHC2H4NH29.989.730.24−0.15536C6H4(4-CH3)CH2NHC2H4NH29.418.790.620.08537C6H5C2H4NHC2H4NH29.449.000.440.03538C3H7NHC2H4NH210.3410.150.19−0.25539C4H9NHC2H4(H9C4)NH10.1910.35−0.16−0.30540C4H9N+H2C2H4(H9C4)NH7.467.320.140.43541C2H5NHC2H4(H5C2)NH10.4610.020.44−0.22542C2H5N+H2C2H4(H5C2)NH7.707.350.350.43543iso-C3H7NHC2H4(iso-C3H7)NH10.4010.320.08−0.29544iso-C3H7N+H2C2H4(iso-C3H7)NH7.597.420.170.41545CH3NHC2H4(H3C)NH10.1610.020.14−0.22546CH3N+H2C2H4(H3C)NH7.407.180.220.47547C3H7NHC2H4(C3H7)NH10.2710.260.01−0.28548C3H7N+H2C2H4(C3H7)NH7.537.490.040.39549H2NC3H6NHC3H6NH210.8610.410.45−0.32550C4H9NHCH2C3F6CH2(H9C4)NH7.076.740.330.58551C4H9N+H2CH2C3F6CH2(H9C4)NH5.695.320.370.92552t-C4H9NHCH2C3
F6CH2(t-H9C4)NH6.896.790.100.56553t-C4H9N+H2CH2C3F6CH2(t-H9C4)NH5.926.17−0.250.71554iso-C3H7NHCH2C3F6CH2(iso-H7C3)NH7.016.660.350.60555iso-C3H7N+H2CH2C3F6CH2(iso-H7C3)NH5.726.05−0.330.74556CH3NHCH2C3F6CH2(H3C)NH6.826.450.370.65557CH3N+H2CH2C3F6CH2(H3C)NH5.705.640.060.84558C4H9NHC6H12(C4H9)NH11.2810.780.50−0.41559C4H9N+H2C6H12(C4H9)NH11.0910.740.35−0.40560t-C4H9NHC6H12(t-C4H9)NH11.0210.810.21−0.42561t-C4H9N+H2C6H12(t-C4H9)NH11.0010.830.17−0.42562iso-C3H7NHC6H12(iso-C3H7)NH11.2110.700.51−0.39563iso-C3H7N+H2C6H12(iso-C3H7)NH10.9410.230.71−0.27564C6H11(cyclo), 4-OH, 1-CH(OH)CH2(iso-C3H7)NH10.089.950.12−0.21565C6H11(cyclo)(CH3)NH11.0410.640.40−0.37566(CH3)3SiCH2[C6H11(cyclo)]NH10.9611.12−0.16−0.49567(44)6.468.03−1.570.26568(CH3)2ClN0.461.26−0.801.73569NCCH2(H3C)2N4.244.79−0.550.87570(H3C)3N9.7510.01−0.26−0.40571HO(H3C)2N5.204.520.680.93572H3CO(H3C)2N3.654.72−1.070.89573H3CC(O)C2H4(H5C2)2N8.919.50−0.60−0.28574H3CC(O)C2H4(H3C)2N8.258.80−0.55−0.11575C6H5CH2C(O)C2H4(H5C2)2N9.279.39−0.13−0.25576C6H5CH2C(O)C2H4(H3C)2N8.188.51−0.33−0.04577(ClC2H4)2(H5C2)N6.556.91−0.360.35578(ClC2H4)2(H3COC2H4)N5.455.53−0.080.69579(NCC2H4)2(H5C2)N4.555.05−0.500.81580(NCCH2)2(H5C2)N−0.60−0.28−0.322.10581(ClC2H4)3N4.374.40−0.030.96582(ClC2H4)2(H3C)N6.436.78−0.350.38583ClC2H4(H5C2)2N8.808.370.430.00584Cl(H5C2)2N1.021.46−0.441.68585HOC2H4(ClC2H4)(H3C)N7.487.50−0.020.21586(NCC2H4)3N1.101.050.051.78587NCC2H4(H5C2)2N7.657.86−0.210.12588NCC2H4(H3C)2N7.077.67−0.600.17589NCC2H4(iso-H7C3)NH8.108.080.010.07590NCCH2(H5C2)2N4.554.99−0.440.82591(H5C2)3N10.7510.300.45−0.47592(C6H5)2CHC(O)C2H4(H5C2)2N9.429.020.40−0.16593HOC2H4(H5C2)2N9.749.300.43−0.23594C2H5(H3C)2N10.0110.11−0.10−0.43595(C6H5)2CHC(O)C2H4(H3C)2N8.598.89−0.30−0.13596HOC2H4(H3C)2N9.189.110.07−0.18597H3C(C2H5)2N10.3110.200.10−0.45598(HOC2H4)3N7.777.390.380.24599H3CC(O)C3H6(H3C)2N8.949.73−0.80−0.33600H3CC(O)C(C6H5)2CH(CH3)CH2(H3C)2N9.408.900.50−0.13601H5C6C(O)C(C6H5)2C2H4(H3C)2N9.348.510.83−0.04602H5C6CH2C(O)CH(C6H5)C2H4(H5C2)2N9.339.270.05−0.22603H5
C6CH2C(O)CH(C6H5)C2H4(H3C)2N8.869.09−0.23−0.18604(ClC2H4)2(H7C3)N6.687.21−0.530.28605(ClC2H4)2(iso-H7C3)N6.987.43−0.450.23606(H3CCH(OH)CH2)2(H7C3)N8.908.280.620.02607(H3CCH(OH)CH2)2(iso-H9C4)N8.808.250.550.03608(H3CCH(OH)CH2)2(t-H9C4)N9.408.810.59−0.11609H9C4C(O)C(C6H5)2C2H4(CH3)2N10.239.101.13−0.18610OOCC(C6H5)2C2H4(C2H5)2N10.449.930.51−0.38611OOCC(C6H5)2C2H4CH(CH3)(H3C)2N10.739.661.06−0.32612NCC3H6(C2H5)2N9.298.830.46−0.11613NCC(C6H5)2C2H4(C2H5)2N8.957.891.060.12614NCC(C6H5)2C2H4(H3C)2N8.267.690.570.16615NCC(C6H5)2CH(CH3)CH2(H3C)2N7.857.240.610.27616NCC(C6H5)2CH2(H3C)CH(H3C)2N8.567.780.780.14617(H3CCH(OH)CH2)3N7.868.06−0.200.07618(iso-C4H9)3N10.3210.80−0.48−0.59619C3H7(H3C)2N10.0110.17−0.17−0.44620iso-C3H7(H3C)2N10.3210.210.11−0.45621CH(C6H5)2C2H4(CH3)2N9.359.39−0.04−0.25622H5C2OC(O)C(C6H5)2C2H4(H3C)2N9.728.561.15−0.05623H5C2OC(O)C(C6H5)2CH2CH(CH3)(H3C)2N9.978.691.27−0.08624H5C2C(O)C(C6H5)2C2H4(H3C)2N9.188.990.19−0.15625iso-C4H9(H3C)2N9.9310.30−0.38−0.47626t-C4H9(H3C)2N10.5410.300.23−0.47627(H7C3)3N10.2610.51−0.25−0.52628HOC2H4(iso-H7C3)2N9.939.760.16−0.34629H3CCH(OH)CH2(HOC2H4)(H3C)N8.708.330.370.01630H2C═CHCH2(H3CCH(OH)CH2)2N8.208.36−0.160.00631(H2C═CHCH2)3N8.318.110.200.06632H2C═CHCH2(H3C)2N8.649.40−0.76−0.25633H3C(H2C═CHCH2)2N8.798.720.07−0.09634HC≡CHCH2(H3C)2N6.977.68−0.710.17635(HC≡CHCH2)3N3.093.11−0.021.28636H3CC(O)C4H8(H3C)2N9.619.95−0.34−0.39637(ClC2H4)2(C4H9)N6.617.22−0.610.28638(H3CCH(OH)CH2)2(H9C4)N9.309.230.07−0.21639NCC4H8(H5C2)2N10.089.410.67−0.26640NCC(C6H5)2CH2CH(CH3)(H3C)2N8.267.700.560.16641(C4H9)3N9.9310.65−0.72−0.56642C4H9(CH3)2N10.0410.22−0.19−0.45643sec-C4H9(CH3)2N10.4210.270.15−0.46644CH(C6H5)2CH2CH(CH3)(CH3)2N9.439.46−0.03−0.27645C2H5C(O)C(C6H5)2CH2CH(CH3)(CH3)2N8.949.01−0.07−0.16646(C6H5)2C═CHCH(CH3)(CH3)2N9.218.730.48−0.09647HC≡CHC2H4(H3C)2N8.258.75−0.50−0.10648C5H11(CH3CH(OH)CH2)2N9.008.590.41−0.05649NCC5H10(C2H5)2N10.469.710.75−0.33650iso-C3H7(H3C)2N9.9610.55−0.59−0.53651HC≡CC3H6(H3C)2N8.809.49−0.69−0.28652C6H13(CH3CH(OH)CH2)2N8.508.34
0.160.00653HC≡CC4H8(H3C)2N9.169.81−0.65−0.35654C8H17(CH3CH(OH)CH2)2N8.308.58−0.28−0.05655C10H21(CH3CH(OH)CH2)2N7.608.12−0.520.06656C6H4(4-CH2Br)C(O)OC2H4(C2H5)2N8.128.51−0.39−0.04657C6H4(3-CH2OC4H9)C(O)OC2H4(C2H5)2N8.118.79−0.68−0.10658C6H4(4-OC4H9)C(O)OC2H4[C5H9(cyclo)]2N8.308.75−0.45−0.09659C6H4(4-Cl)C(O)OC2H4(C2H5)2N8.088.53−0.45−0.04660C6H5CH2CH(CH3)(NCC2H4)(CH3)N6.957.35−0.400.25661C6H5C(O)OCH2C(CH3)2CH2(C2H5)2N9.589.69−0.11−0.32662C6H5CH═CHC(O)OC3H6(C2H5)2N9.719.450.26−0.26663C6H5CH2CH(CH3)(CH3)2N9.409.68−0.28−0.32664C6H5CH2(C2H5)2N9.449.050.39−0.17665C6H5CH2(H3C)2N8.918.850.06−0.12666(45)10.409.860.54−0.36667(46)8.678.72−0.06−0.09668(47)8.908.300.590.01669(48)10.4610.360.10−0.49670(49)10.7310.460.27−0.51671(50)8.027.390.630.24672(51)4.554.96−0.410.83673(52)7.687.470.210.22674(53)7.497.69−0.200.16675(54)10.2210.25−0.03−0.46676(55)8.919.34−0.43−0.24677(56)10.3910.240.14−0.46678(57)8.819.19−0.38−0.20679(58)8.539.19−0.66−0.20680(59)9.278.930.34−0.14681(60)10.6610.340.32−0.48682(61)10.419.960.44−0.39683(62)10.0910.15−0.06−0.43684(63)8.539.15−0.62−0.19685(64)8.759.68−0.93−0.32686(65)7.418.18−0.770.04687(66)8.238.88−0.65−0.13688(67)8.158.87−0.72−0.12689(68)9.549.420.12−0.26690(69)11.3610.500.86−0.52691(70)10.4610.310.14−0.48692(71)10.3410.41−0.07−0.50693(72)10.8610.430.42−0.50694(73)10.3110.34−0.04−0.48695(74)6.737.23−0.500.27696(75)6.136.74−0.610.39697(76)6.076.64−0.570.42698(77)6.045.950.090.59699(78)6.056.40−0.350.48700(79)6.296.82−0.530.38701(80)7.208.03−0.830.08702(81)7.077.73−0.660.15703(82)7.678.67−1.00−0.08704(83)6.957.99−1.040.09705(84)6.688.10−1.420.06706(85)7.028.12−1.100.06707(86)7.388.83−1.45−0.12708(87)6.858.38−1.530.00709(88)10.9510.060.89−0.41710(89)7.817.91−0.100.11711(90)9.409.070.33−0.17712(91)10.239.600.63−0.30713OOCCH(CH3)(C2H5)NH10.2210.170.05−0.44714OOCC2H4(H3C)2N9.8510.25−0.40−0.46715C2H5OOCC2H4(H3C)2N8.5410.10−1.56−0.42716(C2H5)2NC(O)CH2(C2H5)NH8.818.700.11−0.08717OOCCH2(C2H5)2N10.4710.330.14−0.48718OOCCH2(H3C)2N9.949.750.19−0.34719—OOCCH2NH
C(O)CH2(H3C)2N8.097.990.100.09720OOCCH2NHC(O)CH(iso-C4H9)(C2H5)2N7.788.03−0.250.08721OOCCH2(C2H5)NH10.2310.030.20−0.41722(C2H5)2NC2H4NH210.0210.37−0.35−0.49723(CH3)2NC2H4NH29.539.95−0.42−0.39724(C2H5)2NC2H4(C2H5)2N9.5510.06−0.51−0.41725(C2H5)2N+HC2H4(C2H5)2N6.186.42−0.240.47726(H3CCH(OH)CH2)2NC2H4(H3CCH(OH)CH2)2N8.848.660.18−0.07727(H3CCH(OH)CH2)2N+HC2H4(H3CCH(OH)CH2)2N4.334.290.040.99728(H3C)2NC2H4(H3C)2N9.029.73−0.72−0.33729(H3C)2N+HC2H4(H3C)2N5.665.87−0.220.61730(H5C2)2NC2H4OC2H4(H5C2)2N9.969.670.29−0.32731(H5C2)2N+HC2H4OC2H4(H5C2)2N8.499.01−0.52−0.16732(H3C)2NC2H4OC2H4(H5C2)2N10.029.470.54−0.27733(H5C2)2N+HC2H4OC2H4(H3C)2N8.268.55−0.29−0.05734(H3C)(C2H5)NC2H4OC2H4(H5C2)2N9.979.570.39−0.29735(H5C2)2N+HC2H4OC2H4(H3C)(C2H5)N8.348.55−0.21−0.05736(H3C)2NC2H4OC2H4(H3C)2N9.629.140.47−0.19737(H3C)2N+HC2H4OC2H4(H3C)2N8.078.34−0.270.00738(H3C)2NC2H4N(CH3)C2H4(H3C)2N9.329.59−0.27−0.30739(H3C)2N+HC2H4N(CH3)C2H4(H3C)2N8.338.80−0.48−0.11740H3C[(H3C)2N+HC2H4]2N2.392.120.271.52741(H3C)2NC2H4OC2H4(H5C2)(H3C)N9.499.170.32−0.20742(H5C2)(H3C)N+HC2H4OC2H4(H3C)2N7.828.06−0.240.07743(H3C)2NC2H4SC2H4(H3C)2N9.028.940.08−0.14744(H3C)2N+HC2H4SC2H4(H3C)2N7.938.29−0.360.02745H2N + 
C3H6N(CH3)C3H6NH26.456.84−0.390.37746(H5C2)2NC3H6(H5C2)2N10.1810.33−0.15−0.48747(H5C2)2N+HC3H6(H5C2)2N8.208.46−0.26−0.02748(H5C2)2NCH2CH(OH)CH2(H5C2)2N9.809.310.49−0.23749(H5C2)2N+HCH2CH(OH)CH2(H5C2)2N7.747.97−0.230.09750(H3C)2NCH2CH(CH3)(CH3)2N9.6310.04−0.42−0.41751(H3C)2N+HCH(CH3)CH2(CH3)2N5.475.94−0.470.59752(H3C)2NC3H6(H3C)2N9.7110.07−0.36−0.42753(H3C)2N+HC3H6(H3C)2N7.637.68−0.050.17754(CH3CH(OH)CH2)2NC3H6(H3C)2N9.209.23−0.03−0.21755(H3C)2N+HC3H6(CH3CH(OH)CH2)2N6.506.54−0.040.44756(H3C)2NC3H6N(H3C)C3H6(H3C)2N9.9110.10−0.20−0.42757(H3C)2N+HC3H6N(H3C)C3H6(H3C)2N8.929.13−0.21−0.19758[(H3C)2N+HC3H6]2(H3C)N6.356.65−0.300.42759H2NC4H8(H5C2)2N9.209.25−0.05−0.22760H2NCH(CH3)C3H6(H5C2)2N9.559.450.10−0.26761C6H11(cyclo)(H3C)2N10.7210.410.31−0.50762(92)10.329.700.62−0.33763(93)10.3710.060.31−0.41764(94)6.877.82−0.950.13765(95)9.589.85−0.28−0.36766(96)9.379.320.04−0.23767(97)9.709.520.17−0.28768OOCCH(CH2C(O)NH2)NH28.808.800.000.26769H2NC2H4Si(CH3)2OSi(CH3)2C2H4NH210.7410.480.26−0.15770H2NCH2Si(CH3)2OSi(CH3)2CH2NH210.3010.31−0.01−0.11771C6H11(cyclo)NHCH2Si(CH3)2OSi(CH3)2CH2[C6H11(cyclo)]NH10.1110.54−0.43−0.35772iso-C3H7NHCH2Si(CH3)2OSi(CH3)2CH2(iso-C3H7)NH10.4010.400.00−0.31773H2N(C2H5)2N7.717.370.340.24774(C2H5)2N(C2H5)2N7.787.650.130.36775H2N(H3C)2N7.217.39−0.180.24776H3CNH(H3C)NH7.527.460.060.40777H2N(C2H5)NH7.997.530.460.38778H2N(H3C)NH7.877.590.280.37779(H3C)2N(H3C)2N6.307.47−1.170.22780H3CNH(H3C)2N6.787.27−0.490.27781(98)10.009.810.19−0.35782H2NC(O)NHNH23.655.31−1.661.11783OOCCH2(H5C2)NH10.239.970.26−0.21784OOCCH(CH3)(H5C2)NH10.2210.150.07−0.25785C6H10(cyclo), 1-COO, 1-NH210.0310.37−0.34−0.12786C6H10(cyclo), 1-COO, 2-NH210.109.840.260.01787C6H10(cyclo), 1-COO, 3-NH210.5010.54−0.04−0.17788C6H10(cyclo), 1-COO, 4-NH2axial10.5510.61−0.06−0.18789C6H10(cyclo), 1-COO, 
4-NH2ecvat  10.62  10.60  0.02  −0.18
790  H2NC(O)NHC3H6CH(COO)NH2  9.41  9.86  −0.45  0.00
791  (OOCCH2)2NH  9.12  8.80  0.32  0.07
792  (OOCCH2)2(H3C)N  9.92  9.20  0.72  −0.20
793  (OOCCH2)3N  10.23  9.23  1.00  −0.21
794  OOCCH2(OOCC2H4)NH  9.46  9.00  0.46  0.03
795  (OOCC2H4)2NH  9.61  9.89  −0.28  −0.19
796  OOCCH2NHC2H4(OOCCH2)NH  9.46  9.33  0.13  −0.05
797  OOCC9H19SC2H4NH2  8.30  9.83  −1.53  0.01
798  OOCC10H21SC2H4NH2  9.60  9.16  0.44  0.17
799  OOCC10H21NHC2H4SSC2H4(OOCC10H21)NH  9.90  9.16  0.74  −0.01
800  H2NOC2H4CH(COO)NH2  9.20  8.74  0.46  0.27
801  C2H5SCH2CH(COO)NH2  8.60  9.19  −0.59  0.16
802  NH3  9.25  10.61  −1.36

The numbers in the table correspond to the molecular structures from Scheme 2.


[0116] The examples and embodiments described in this patent are for illustrative purposes only and various modifications or changes will be suggested to persons skilled in the art and are to be included within the disclosure in this application and scope of the claims. All publications, patents and patent applications cited in this patent are hereby incorporated by reference in their entirety for all purposes to the same extent as if each individual publication, patent or patent application were specifically and individually indicated to be so incorporated by reference.


Claims
  • 1. A method for calculating a characteristic property of a molecule, where the molecule has one or more measured properties and the molecule comprises one or more substituent parts, the method comprising selecting one or more contributing substituent parts; for each contributing substituent part, calculating the distance from the substituent part to a reaction center; for each contributing substituent part, calculating a contribution of the substituent part to a characteristic property of the molecule, where the contribution is equal to a function of the distance of the substituent part to the reaction center multiplied by a weight factor for the substituent part, and where the function has a functional form that is substantially the same for all substituent parts; and calculating the characteristic property of the molecule by summing the contributions from the contributing substituent parts of the molecule plus a contribution comprising a value of a measured property of the molecule multiplied by a weight factor.
  • 2. The method of claim 1, wherein the characteristic property is a chemical characteristic property.
  • 3. The method of claim 1, wherein the characteristic property is any property related to the free energy of the molecule.
  • 4. The method of claim 2, wherein the chemical characteristic property is selected from the group consisting of pKa, reaction rate constants, equilibrium constants, solubility, ionization potentials, atomization energy, evaporation energy, and energy of bonds.
  • 5. The method of claim 1, wherein the molecule is selected from the group consisting of organic molecules, inorganic molecules, neutral molecules, radicals, anions, cations, ionic salts, metallo-organic compounds and coordination compounds.
  • 6. The method of claim 1, wherein a substituent part of the molecule is an atom contained in the molecule or a group of connected atoms contained in the molecule.
  • 7. The method of claim 1, wherein the contributing substituent parts include all substituent parts of the molecule except one.
  • 8. The method of claim 1, wherein the reaction center is a point in space.
  • 9. The method of claim 1, wherein the reaction center is an atom contained in the molecule.
  • 10. The method of claim 1, wherein the reaction center comprises a substituent part of the molecule.
  • 11. The method of claim 10, wherein the reaction center is one of the substituent parts of the molecule.
  • 12. The method of claim 11, wherein the contributing substituent parts include all substituent parts in the molecule except the reaction center substituent part.
  • 13. The method of claim 1, wherein the function of the distance is of the form of an inverse function of the distance.
  • 14. The method of claim 13, wherein the function of the distance is of the form of the inverse of the square of the distance.
  • 15. The method of claim 13, wherein the function of the distance is of the form of the sum of the inverse of the square of the distance and the inverse of the cube of the distance.
  • 16. The method of claim 13, wherein the function of the distance is of the form of the inverse of the cube of the distance.
  • 17. The method of claim 1, wherein the weight factor is calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules.
  • 18. The method of claim 17, wherein for the multivariate regression analysis a dependent variable is the characteristic property for one of the molecules in the series and there is an independent variable for each type of substituent part present in the series of molecules, and for a particular independent variable the value of the dependent variable corresponding to a particular substituent part is equal to a sum over all of the particular substituent parts in the molecule corresponding to the independent variable of the function of the distance from the reaction center to the particular substituent part.
  • 19. The method of claim 17, wherein the series of molecules comprises analogs of the molecule.
  • 20. The method of claim 17, wherein the series of molecules comprises molecules that have the same reaction center as the molecule.
  • 21. The method of claim 17, wherein the reaction center is a point in space or a substituent part of the molecule and the reaction center is identified by a method comprising: for a first reaction center, performing the multivariate regression analysis and determining a first characteristic of the multivariate regression analysis; for a second reaction center, performing the multivariate regression analysis and determining a second characteristic of the multivariate regression analysis; and identifying the reaction center as that reaction center with the multivariate regression analysis characteristic satisfying a predetermined criterion.
  • 22. The method of claim 21, wherein the characteristic of the multivariate regression analysis is the global regression coefficient and the predetermined criterion selects for the reaction center with the highest global regression coefficient.
  • 23. The method of claim 21, wherein the characteristic of the multivariate regression analysis is the global standard error and the predetermined criterion selects for the reaction center with the lowest standard error.
  • 24. The method of claim 1, wherein one of the measured properties of the molecule is the hydrophobicity of the molecule.
  • 25. The method of claim 1, wherein the measured property weight factor is calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules.
  • 26. The method of claim 25, wherein for the multivariate regression analysis a dependent variable is the characteristic property for one of the molecules in the series and the independent variables comprise a value for a measured property.
  • 27. A method for calculating a characteristic property of a molecule, where the molecule has one or more measured properties and the molecule comprises one or more substituent parts, the method comprising denominating one of the substituent parts as a reaction center; for each substituent part other than the reaction center, calculating the distance from the substituent part to the reaction center; for each substituent part other than the reaction center, calculating a contribution of the substituent part to the characteristic property of the molecule, where the contribution is equal to the inverse of the square of the distance of the substituent part to the reaction center multiplied by a weight factor for the substituent part, and where the weight factor is calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules comprising analogs of the molecule; for each measured property, calculating the contribution of the measured property, where the contribution is equal to the value of the measured property multiplied by a weight factor, and where the weight factor is calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules comprising analogs of the molecule; calculating the property of the molecule by summing the contributions from the contributing substituent parts of the molecule plus the contribution or contributions from the one or more measured properties.
  • 28. The method of claim 27, wherein the characteristic property is a chemical characteristic property.
  • 29. The method of claim 28, wherein the characteristic property is any property related to the free energy of the molecule.
  • 30. The method of claim 28, wherein the chemical characteristic property is selected from the group consisting of pKa, reaction rate constants, equilibrium constants, solubility, ionization potentials, atomization energy, evaporation energy, and energy of bonds.
  • 31. The method of claim 27, wherein the molecule is selected from the group consisting of organic molecules, inorganic molecules, neutral molecules, radicals, anions, cations, ionic salts, metallo-organic compounds and coordination compounds.
  • 32. The method of claim 27, wherein the one or more substituent parts of the molecule are atoms contained in the molecule or groups of connected atoms contained in the molecule.
  • 33. The method of claim 27, wherein for the multivariate regression analysis a dependent variable is the characteristic property for one of the molecules in the series and there is an independent variable for each type of substituent part present in the series of molecules, and for a particular independent variable the value of the dependent variable corresponding to a particular substituent part is equal to a sum over all of the particular substituent parts in the molecule corresponding to the independent variable of the inverse square of the distance from the reaction center to the particular substituent part.
  • 34. The method of claim 27, wherein the reaction center is identified by a method comprising for a first reaction center, performing the multivariable regression analysis and determining a first characteristic of the multivariable regression analysis, for a second reaction center, performing the multivariable regression analysis and determining a second characteristic of the multivariable regression analysis, identifying the reaction center as that reaction center with the multivariable regression analysis characteristic satisfying a predetermined criteria.
  • 35. The method of claim 34, wherein the characteristic of the multivariable regression analysis is the global regression coefficient and the predetermined criteria selects for the reaction center with the highest global regression coefficient.
  • 36. The method of claim 34, wherein the characteristic of the multivariable regression analysis is the global standard error and the predetermined criteria selects for the reaction center with the lowest standard error.
  • 37. The method of claim 27, wherein one of the measured properties of the molecule is the hydrophobicity of the molecule.
  • 38. A method for calculating a chemical characteristic property of a molecule, where the molecule has a hydrophobicity and the molecule comprises one or more substituent parts and the substituent parts are atoms contained in the molecule or groups of connected atoms contained in the molecule, the method comprising selecting one of the substituent parts as a reaction center; for each substituent part other than the reaction center, calculating the distance from the substituent part to the reaction center; for each substituent part other than the reaction center, calculating a contribution of the substituent part to the characteristic property of the molecule, where the contribution is equal to the inverse of the square of the distance of the substituent part to the reaction center multiplied by a weight factor for the substituent part, and where the weight factor is calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules comprising analogs of the molecule; calculating the contribution of the hydrophobicity as equal to the value of the hydrophobicity multiplied by a weight factor calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules comprising analogs of the molecule; calculating the characteristic property of the molecule by summing the contributions from the contributing substituent parts of the molecule plus the contribution from the hydrophobicity.
  • 39. The method of claim 38, wherein the chemical characteristic property is selected from the group consisting of pKa, reaction rate constants, equilibrium constants, solubility, ionization potentials, atomization energy, evaporation energy, and energy of bonds.
  • 40. The method of claim 39, wherein the chemical characteristic is any property related to the free energy of the molecule.
  • 41. The method of claim 38, wherein the molecule is selected from the group consisting of organic molecules, inorganic molecules, neutral molecules, radicals, anions, cations, ionic salts, metallo-organic compounds and coordination compounds.
  • 42. The method of claim 41, wherein the molecule is an aniline mustard, nonsteroidal anti-inflammatory drug (NSAID), mitomycin, amine, or carboxylic acid.
  • 43. The method of claim 38, wherein for the multivariate regression analysis a dependent variable is the characteristic property for one of the molecules in the series and there is an independent variable for each type of substituent part present in the series of molecules, and for a particular independent variable the value of the dependent variable corresponding to a particular substituent part is equal to a sum over all of the particular substituent parts in the molecule corresponding to the independent variable of the inverse square of the distance from the reaction center to the particular substituent part.
  • 44. The method of claim 38, wherein the reaction center is selected by a method comprising the steps of for a first reaction center, performing the multivariable regression analysis and determining a first characteristic of the multivariable regression analysis; for a second reaction center, performing the multivariable regression analysis and determining a second characteristic of the multivariable regression analysis; and selecting the reaction center as that reaction center with the multivariable regression analysis characteristic satisfying a predetermined criteria.
  • 45. The method of claim 44, wherein the characteristic of the multivariable regression analysis is the global regression coefficient and the predetermined criteria selects for the reaction center with the highest global regression coefficient.
  • 46. The method of claim 44, wherein the characteristic of the multivariable regression analysis is the global standard error and the predetermined criteria selects for the reaction center with the lowest standard error.
  • 47. A method for calculating a chemical characteristic property of a molecule, where the molecule comprises one or more substituent parts and the chemical characteristic property is selected from the group consisting of pKa, reaction rate constants, equilibrium constants, solubility, ionization potentials, atomization energy, evaporation energy, and bond energy, the method comprising the steps of selecting one or more contributing substituent parts; for each contributing substituent part, calculating a distance from the substituent part to a reaction center; for each contributing substituent part, calculating the contribution of the substituent part to the characteristic property of the molecule, where the contribution is equal to a function of the distance of the substituent part to the reaction center multiplied by a weight factor for the substituent part and the function has a functional form that is substantially the same for all substituent parts; and calculating the characteristic property of the molecule by summing the contributions from the contributing substituent parts of the molecule.
  • 48. The method of claim 47, wherein the molecule is an organic molecule, inorganic molecule, neutral molecule, radical, anion, cation, ionic salt, metallo-organic compound or a coordination compound.
  • 49. The method of claim 47, wherein the one or more substituent parts of the molecule are atoms contained in the molecule or groups of connected atoms contained in the molecule.
  • 50. The method of claim 47, wherein the contributing substituent parts include all substituent parts of the molecule except one.
  • 51. The method of claim 47, wherein the reaction center is a point in space.
  • 52. The method of claim 47, wherein the reaction center is an atom contained within the molecule.
  • 53. The method of claim 47, wherein the reaction center comprises a substituent part of the molecule.
  • 54. The method of claim 53, wherein the reaction center is one of the substituent parts.
  • 55. The method of claim 53, wherein the contributing substituent parts include all substituent parts in the molecule except the reaction center substituent part.
  • 56. The method of claim 47, wherein the function of the distance is of the form of an inverse function of the distance.
  • 57. The method of claim 56, wherein the function of the distance goes as the inverse of the square of the distance.
  • 58. The method of claim 56, wherein the function of the distance is of the form of the sum of the inverse of the square of the distance and the inverse of the cube of the distance.
  • 59. The method of claim 56, wherein the function of the distance is of the form of the inverse of the cube of the distance.
  • 60. The method of claim 47, wherein the weight factor is calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules.
  • 61. The method of claim 60, wherein for the multivariate regression analysis a dependent variable is the characteristic property for one of the molecules in the series and there is an independent variable for each type of substituent part present in the series of molecules, and for a particular independent variable the value of the dependent variable corresponding to a particular substituent part is equal to a sum over all of the particular substituent parts in the molecule corresponding to the independent variable of the function of the distance from the reaction center to the particular substituent part.
  • 62. The method of claim 60, wherein the series of molecules comprise analogs of the molecule.
  • 63. The method of claim 62, wherein the series of molecules comprise molecules that have the same reaction center as the molecule.
  • 64. The method of claim 62, wherein the reaction center is a point in space or a substituent part of the molecule and the reaction center is selected by a method comprising for a first reaction center, performing the multivariable regression analysis and determining a first characteristic of the multivariable regression analysis, for a second reaction center, performing the multivariable regression analysis and determining a second characteristic of the multivariable regression analysis, identifying the reaction center as that reaction center with the multivariable regression analysis characteristic satisfying a predetermined criteria.
  • 65. The method of claim 64, wherein the characteristic of the multivariable regression analysis is the global regression coefficient and the predetermined criteria selects for the reaction center with the highest global regression coefficient.
  • 66. The method of claim 64, wherein the characteristic of the multivariable regression analysis is the global standard error and the predetermined criteria selects for the reaction center with the lowest standard error.
  • 67. The method of claim 47, wherein the molecule has one or more measured properties and wherein the characteristic property of the molecule is calculated by summing the contributions from the contributing substituent parts of the molecule plus a contribution comprising a measured property of the molecule multiplied by a weight factor.
  • 68. The method of claim 67, wherein the one or more measured properties of the molecule include the hydrophobicity of the molecule.
  • 69. The method of claim 67, wherein the measured property weight factor is calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules.
  • 70. A system for calculating a characteristic property of a molecule, where the molecule has one or more measured properties and the molecule comprises one or more substituent parts, the system comprising: a processor; and a computer readable medium having computer readable program code means embodied therein for causing the system to calculate a biological characteristic property of a molecule, the computer readable program code means comprising: (1) a computer readable program code means for causing a computer to carry out the step of selecting one or more contributing substituent parts; (2) a computer readable program code means for causing a computer to carry out the step of, for each contributing substituent part, calculating the distance from the substituent part to a reaction center; (3) a computer readable program code means for causing a computer to carry out the step of, for each contributing substituent part, calculating a contribution of the substituent part to a characteristic property of the molecule, where the contribution is equal to a function of the distance of the substituent part to the reaction center multiplied by a weight factor for the substituent part, and where the function has a functional form that is substantially the same for all substituent parts; and (4) a computer readable program code means for causing a computer to carry out the step of calculating the characteristic property of the molecule by summing the contributions from the contributing substituent parts of the molecule plus a contribution comprising a value of a measured property of the molecule multiplied by a weight factor.
  • 71. A system for calculating a characteristic property of a molecule, where the molecule has one or more measured properties and the molecule comprises one or more substituent parts, the system comprising: a processor; and a computer readable medium having computer readable program code means embodied therein for causing the system to calculate a biological characteristic property of a molecule, the computer readable program code means comprising: (1) a computer readable program code means for causing a computer to carry out the step of denominating one of the substituent parts as a reaction center; (2) a computer readable program code means for causing a computer to carry out the step of, for each substituent part other than the reaction center, calculating the distance from the substituent part to the reaction center; (3) a computer readable program code means for causing a computer to carry out the step of, for each substituent part other than the reaction center, calculating a contribution of the substituent part to the characteristic property of the molecule, where the contribution is equal to the inverse of the square of the distance of the substituent part to the reaction center multiplied by a weight factor for the substituent part, and where the weight factor is calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules comprising analogs of the molecule; (4) a computer readable program code means for causing a computer to carry out the step of, for each measured property, calculating the contribution of the measured property, where the contribution is equal to the value of the measured property multiplied by a weight factor, and where the weight factor is calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules comprising analogs of the molecule; and (5) a computer readable program code means for causing a computer to carry out the step of calculating the property of the molecule by summing the contributions from the contributing substituent parts of the molecule plus the contribution or contributions from the one or more measured properties.
  • 72. A system for calculating a chemical characteristic property of a molecule, where the molecule has a hydrophobicity and the molecule comprises one or more substituent parts and the substituent parts are atoms contained in the molecule or groups of connected atoms contained in the molecule, the system comprising: a processor; and a computer readable medium having computer readable program code means embodied therein for causing the system to calculate a biological characteristic property of a molecule, the computer readable program code means comprising: (1) a computer readable program code means for causing a computer to carry out the step of selecting one of the substituent parts as a reaction center; (2) a computer readable program code means for causing a computer to carry out the step of, for each substituent part other than the reaction center, calculating the distance from the substituent part to the reaction center; (3) a computer readable program code means for causing a computer to carry out the step of, for each substituent part other than the reaction center, calculating a contribution of the substituent part to the characteristic property of the molecule, where the contribution is equal to the inverse of the square of the distance of the substituent part to the reaction center multiplied by a weight factor for the substituent part, and where the weight factor is calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules comprising analogs of the molecule; (4) a computer readable program code means for causing a computer to carry out the step of calculating the contribution of the hydrophobicity as equal to the value of the hydrophobicity multiplied by a weight factor calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules comprising analogs of the molecule; and (5) a computer readable program code means for causing a computer to carry out the step of calculating the characteristic property of the molecule by summing the contributions from the contributing substituent parts of the molecule plus the contribution from the hydrophobicity.
  • 73. A system for calculating a chemical characteristic property of a molecule, where the molecule comprises one or more substituent parts and the chemical characteristic property is selected from the group consisting of pKa, reaction rate constants, equilibrium constants, solubility, ionization potentials, atomization energy, evaporation energy, and bond energy, the system comprising: a processor; and a computer readable medium having computer readable program code means embodied therein for causing the system to calculate a biological characteristic property of a molecule, the computer readable program code means comprising: (1) a computer readable program code means for causing a computer to carry out the step of selecting one or more contributing substituent parts; (2) a computer readable program code means for causing a computer to carry out the step of, for each contributing substituent part, calculating a distance from the substituent part to a reaction center; (3) a computer readable program code means for causing a computer to carry out the step of, for each contributing substituent part, calculating the contribution of the substituent part to the characteristic property of the molecule, where the contribution is equal to a function of the distance of the substituent part to the reaction center multiplied by a weight factor for the substituent part and the function has a functional form that is substantially the same for all substituent parts; and (4) a computer readable program code means for causing a computer to carry out the step of calculating the characteristic property of the molecule by summing the contributions from the contributing substituent parts of the molecule.
  • 74. An article of manufacture comprising a computer useable medium having computer readable program code means embodied therein for causing a computer to calculate a characteristic property of a molecule, where the molecule has one or more measured properties and the molecule comprises one or more substituent parts, the computer readable program code means comprising: (1) a computer readable program code means for causing a computer to carry out the step of selecting one or more contributing substituent parts; (2) a computer readable program code means for causing a computer to carry out the step of, for each contributing substituent part, calculating the distance from the substituent part to a reaction center; (3) a computer readable program code means for causing a computer to carry out the step of, for each contributing substituent part, calculating a contribution of the substituent part to a characteristic property of the molecule, where the contribution is equal to a function of the distance of the substituent part to the reaction center multiplied by a weight factor for the substituent part, and where the function has a functional form that is substantially the same for all substituent parts; and (4) a computer readable program code means for causing a computer to carry out the step of calculating the characteristic property of the molecule by summing the contributions from the contributing substituent parts of the molecule plus a contribution comprising a value of a measured property of the molecule multiplied by a weight factor.
  • 75. An article of manufacture comprising a computer useable medium having computer readable program code means embodied therein for causing a computer to calculate a characteristic property of a molecule, where the molecule has one or more measured properties and the molecule comprises one or more substituent parts, the computer readable program code means comprising: (1) a computer readable program code means for causing a computer to carry out the step of denominating one of the substituent parts as a reaction center; (2) a computer readable program code means for causing a computer to carry out the step of, for each substituent part other than the reaction center, calculating the distance from the substituent part to the reaction center; (3) a computer readable program code means for causing a computer to carry out the step of, for each substituent part other than the reaction center, calculating a contribution of the substituent part to the characteristic property of the molecule, where the contribution is equal to the inverse of the square of the distance of the substituent part to the reaction center multiplied by a weight factor for the substituent part, and where the weight factor is calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules comprising analogs of the molecule; (4) a computer readable program code means for causing a computer to carry out the step of, for each measured property, calculating the contribution of the measured property, where the contribution is equal to the value of the measured property multiplied by a weight factor, and where the weight factor is calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules comprising analogs of the molecule; and (5) a computer readable program code means for causing a computer to carry out the step of calculating the property of the molecule by summing the contributions from the contributing substituent parts of the molecule plus the contribution or contributions from the one or more measured properties.
  • 76. An article of manufacture comprising a computer useable medium having computer readable program code means embodied therein for causing a computer to calculate a chemical characteristic property of a molecule, where the molecule has a hydrophobicity and the molecule comprises one or more substituent parts and the substituent parts are atoms contained in the molecule or groups of connected atoms contained in the molecule, the computer readable program code means comprising: (1) a computer readable program code means for causing a computer to carry out the step of selecting one of the substituent parts as a reaction center; (2) a computer readable program code means for causing a computer to carry out the step of, for each substituent part other than the reaction center, calculating the distance from the substituent part to the reaction center; (3) a computer readable program code means for causing a computer to carry out the step of, for each substituent part other than the reaction center, calculating a contribution of the substituent part to the characteristic property of the molecule, where the contribution is equal to the inverse of the square of the distance of the substituent part to the reaction center multiplied by a weight factor for the substituent part, and where the weight factor is calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules comprising analogs of the molecule; (4) a computer readable program code means for causing a computer to carry out the step of calculating the contribution of the hydrophobicity as equal to the value of the hydrophobicity multiplied by a weight factor calculated as a regression coefficient for a multivariate regression analysis calculated for a series of molecules comprising analogs of the molecule; and (5) a computer readable program code means for causing a computer to carry out the step of calculating the characteristic property of the molecule by summing the contributions from the contributing substituent parts of the molecule plus the contribution from the hydrophobicity.
  • 77. An article of manufacture comprising a computer useable medium having computer readable program code means embodied therein for causing a computer to calculate a chemical characteristic property of a molecule, where the molecule comprises one or more substituent parts and the chemical characteristic property is selected from the group consisting of pKa, reaction rate constants, equilibrium constants, solubility, ionization potentials, atomization energy, evaporation energy, and bond energy, the computer readable program code means comprising: (1) a computer readable program code means for causing a computer to carry out the step of selecting one or more contributing substituent parts; (2) a computer readable program code means for causing a computer to carry out the step of, for each contributing substituent part, calculating a distance from the substituent part to a reaction center; (3) a computer readable program code means for causing a computer to carry out the step of, for each contributing substituent part, calculating the contribution of the substituent part to the characteristic property of the molecule, where the contribution is equal to a function of the distance of the substituent part to the reaction center multiplied by a weight factor for the substituent part and the function has a functional form that is substantially the same for all substituent parts; and (4) a computer readable program code means for causing a computer to carry out the step of calculating the characteristic property of the molecule by summing the contributions from the contributing substituent parts of the molecule.
  • 78. A molecule comprising one or more substituent parts chosen to affect a characteristic property of the molecule, where the effect of the one or more substituent parts is calculated by the method according to claim 1.
  • 79. A molecule comprising one or more substituent parts chosen to affect a characteristic property of the molecule, where the effect of the one or more substituent parts is calculated by the method according to claim 27.
  • 80. A molecule comprising one or more substituent parts chosen to affect a characteristic property of the molecule, where the effect of the one or more substituent parts is calculated by the method according to claim 38.
  • 81. A molecule comprising one or more substituent parts chosen to affect a characteristic property of the molecule, where the effect of the one or more substituent parts is calculated by the method according to claim 47.
  • 82. A molecule synthesized after determining a likely characteristic property of the molecule, where the effect of the characteristic property of the molecule is calculated by the method according to claim 1.
  • 83. A molecule synthesized after determining a likely characteristic property of the molecule, where the effect of the characteristic property of the molecule is calculated by the method according to claim 27.
  • 84. A molecule synthesized after determining a likely characteristic property of the molecule, where the effect of the characteristic property of the molecule is calculated by the method according to claim 38.
  • 85. A molecule synthesized after determining a likely characteristic property of the molecule, where the effect of the characteristic property of the molecule is calculated by the method according to claim 47.
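As an illustrative sketch only, the calculation recited in claims 27, 33, and 56 to 60 might be expressed in Python with NumPy as follows. The function names (`descriptor_row`, `fit_weights`, `predict`), the data layout (coordinates plus one type label per substituent part), and the choice to fit without an intercept term are all assumptions made for this example; the claims do not prescribe any particular implementation. Each molecule is reduced to one regression row: for every type of substituent part, the sum of the distance function over all parts of that type (the reaction center excluded), followed by the measured properties such as hydrophobicity. The weight factors are then taken as ordinary least-squares regression coefficients over a series of analogs.

```python
import numpy as np

def inverse_square(r):
    """Distance function of claims 27 and 57: 1 / r^2."""
    return 1.0 / r**2

def inverse_cube(r):
    """Alternative distance function of claim 59: 1 / r^3."""
    return 1.0 / r**3

def descriptor_row(coords, types, center_idx, type_order, measured,
                   f=inverse_square):
    """Build one regression row for a molecule (hypothetical helper).

    coords     : (n, 3) positions of the substituent parts
    types      : n type labels, one per substituent part
    center_idx : index of the part denominated as the reaction center
    type_order : fixed ordering of substituent types (the columns)
    measured   : measured property values, e.g. [hydrophobicity]
    f          : distance function, the same for every substituent part
    """
    coords = np.asarray(coords, dtype=float)
    center = coords[center_idx]
    sums = {t: 0.0 for t in type_order}
    for i, (xyz, t) in enumerate(zip(coords, types)):
        if i == center_idx:
            continue  # the reaction center itself does not contribute
        sums[t] += f(np.linalg.norm(xyz - center))
    return np.array([sums[t] for t in type_order] + list(measured))

def fit_weights(rows, observed):
    """Weight factors as least-squares regression coefficients.

    No intercept is fitted; that is an assumption of this sketch.
    """
    X = np.vstack(rows)
    y = np.asarray(observed, dtype=float)
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights

def predict(row, weights):
    """Sum of substituent contributions plus measured-property contributions."""
    return float(row @ weights)
```

Under the same assumptions, the reaction-center selection of claims 34 to 36 could be sketched by calling `descriptor_row` and `fit_weights` once per candidate `center_idx` and keeping the candidate whose fit has the highest correlation coefficient or the lowest standard error.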
CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. provisional application No. 60/308,666, filed Jul. 31, 2001, with inventors Artem Tcherkassov and Ridong Chen, which application is incorporated herein by reference. This application is related to an application filed on the same date, with the same inventors, titled, “Calculating a Biological Characteristic Property of a Molecule By Correlation Analysis,” with attorney docket number 53260-20001.00, which application is incorporated herein by reference.

Provisional Applications (1)
Number Date Country
60308666 Jul 2001 US