The present invention generally relates to computer assisted methods and systems for using organic solutes to sample aqueous and heterogeneous environments. More specifically, it relates to computer assisted methods and systems for determining the spatial distributions and thermodynamics of small molecules in aqueous environments and heterogenous environments, including heterogenous environments containing a protein having deep or occluded binding sites. The methods provided herein may be used for thermodynamic studies of complex aqueous systems and for computer-aided drug design.
Chemical potential (μ) describes the equilibrium movement of particles between two phases or states. The driving force behind this movement comes from 1) a concentration gradient, namely, particles tend to move from a region of higher concentration to a region of lower concentration to gain mixing entropy; and 2) chemical affinity: particles are attracted to regions of high chemical affinity (See Dill K A & Bromberg S (2003), Chemistry and Biology).
Excess chemical potential (μex) is the quasistatic work to bring a particle (e.g. solute molecule) from the gas phase to the solvent; μex=μ-μid, where μ and μid are the chemical potential and the ideal gas chemical potential of the solute, respectively. In the context of statistical mechanics, chemical potential allows for the thermodynamic state of a system to be defined in terms of a grand canonical (GC) ensemble (μVT) that allows for variation in the species concentrations across phases/states. Simulation procedures have long evolved towards efficiently determining Gibbs' free energy of hydration (HFE), chemical affinity and other thermodynamically relevant properties of water and other small solute molecules from GC ensembles instead of the more conventional isothermal, isobaric (NPT), canonical (NVT) or microcanonical (NVE) ensembles where the concentration of the species is fixed. To date, many of the GC ensemble strategies have employed Monte-Carlo (MC) simulations to either drive the sampling of water molecules or individual small molecules around proteins or crystal environments, or alternately to improve the accuracy of relative HFE calculations in free energy perturbation (FEB) calculations. However, since Grand Canonical Monte Carlo (GCMC) simulations of systems containing explicit solvent to represent the bulk-phase suffer from convergence problems due to low acceptance rates encountered for the insertion of solutes, simulations in the past were restricted towards investigation of the chemical affinities of only the solvent, simulations investigating the chemical affinities to drive individual solute sampling in the absence of explicit solvent, determine the thermodynamic properties in crystal conditions, or simulations to investigate the use expanded ensemble strategies.
In the context of protein and macromolecular environments, chemical fragment sampling simulation techniques have been employed for discovery or rational design of molecules that can bind to macromolecular targets with high affinities so as to achieve a desired biological outcome. The Site Identification by Ligand Competitive Saturation (SILCS) method is one such technique that identifies the location and approximate affinities of different functional groups on a target macromolecular surface by performing Molecular Dynamics (MD) simulations of the target in an aqueous solution of solute molecules that are representative of different chemical fragments. However, these MD isothermal, isobaric (NPT) ensembles suffer from the long diffusion time scales of the solutes through explicit solvent and macromolecule environments, especially when the macromolecular binding sites are deeply buried and inaccessible to the solvent (i.e. occluded). These limitations imply that only binding sites on a protein that are accessible to the bulk solvent can be studied. However, a large number of biologically important proteins such as the G-protein coupled receptors (GPCRs), nuclear receptor proteins as well as other macromolecules, have deep or occluded ligand binding pockets, and simulations to study the affinity of small molecules for these occluded pockets are generally time intensive.
The methods and systems of the present invention overcome these limitations, yielding accurate spatial distributions for solutes in aqueous macromolecular environments.
The present invention generally relates to computer assisted methods and systems for using organic solutes to sample aqueous and heterogeneous environments
In one embodiment, the invention describes a computational method for sampling the spatial distribution of one or more solutes and water in a defined region of space (system) comprising:
In other embodiments of the claimed method, the system may contain water and one or more solutes. In one embodiment the system further comprises one or more macromolecules. The spatial distribution of one or more solutes or water is used to identify preferential affinity of each of the solutes or water to each of the one or more macromolecules. Exemplary macromolecules include proteins, RNA, DNA, carbohydrates, lipids, organic chemicals, inorganic chemicals, or combinations thereof.
The molecular weight of the one or more macromolecules used in the inventive computational method can be greater than or equal to 1000 daltons. In one aspect of the claimed method the μex of one or more solutes and water is alternately increased and decreased during the GCMC operations across consecutive cycles involving steps 2)-4) above, after the concentration of the solutes and the water reach their target value.
When sampling the spatial distribution of the one or more solutes or water using the Grand-Canonical Monte-Carlo (GCMC) Metropolis sampling criteria, the GCMC steps may equally be divided between each of the one or more solutes and water. In one embodiment, the proportion of GCMC steps for each of the one or more solutes and water is assigned based on the target concentration of each of the solutes and water.
For certain aspects of the claimed method, the system containing the solutes and or water is encompassed in a larger system containing said solutes and or water. According to this aspect of the invention, the system containing the solutes and water is a sphere that is encompassed in a larger sphere whose difference in the radii is 5 Å. In addition to solutes and water the system additionally includes one or more macromolecules and the GCMC operation is performed 2,000-50,000 times.
According to an embodiment of the claimed method, μex is increased or decreased by an amount equal to Ntgt/Nsys and the spatial distribution following step 2) may be sampled with a molecular dynamics simulation. Thus, μex is increased when Nsys is lower than the Ntgt and decreased when Nsys is greater than Ntar.
The claimed method is useful for assisting in computer-aided drug design. In one aspect of the method, the computationally defined region comprises an inner region containing the one or more solutes and water, located within a larger outer region containing additional water and the difference between the inner region and outer region is large enough to limit edge effects.
Non-limiting exemplary embodiments are described herein, with reference to the following figures:
Computer assisted methods and systems are provided for using organic solutes to sample aqueous and heterogeneous environments. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without these specific details. Still other aspects, features, and advantages of the invention are readily apparent from the following detailed description, simply by illustrating a number of particular embodiments and implementations, including the best mode contemplated for carrying out the invention. The invention is also capable of other and different embodiments, and its several details can be modified in various obvious respects, all without departing from the spirit and scope of the invention. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.
The methods and systems of the present invention allow for investigations of the excess chemical potential (μex) of solutes in aqueous solution, including solutions containing multiple solutes. Central to the approach is the use of a fluctuating μex over the GCMC portions of the simulations. This leads to convergence of μex for given solute(s) and environment, as defined by the user, based on the target concentration and the maintenance of solute sampling once the system has converged with respect to μex or target concentration. In addition to probing μex required to maintain the solutes at their target concentration in aqueous environments, the iterative GCMC-MD with fluctuating μex approach is also useful for efficient solute sampling.
It is well accepted that targeting ligands to occluded ligand binding pockets (LBP) in proteins with minimal or no accessibility to the surrounding environment is challenging. As the efficacies of ligands of both GPCRs and nuclear receptors (NRs) are known to be coupled to small conformational changes in their binding sites, accurate modeling of these sites is critical for future development of therapeutic agents for a wide range of diseases.
The SILCS methodology can be used to map the free energy affinity patterns of functional groups at protein surfaces, including LBPs. This method accounts for the conformational flexibility of proteins, chemical space of ligands, and explicit solvent by running molecular dynamics (MD) of the target protein in an aqueous solution of small solute molecules representative of different chemical functional groups. In one embodiment, the affinity patterns of these functional groups are obtained in the form of discretized probability, or, free energy maps, called FragMaps. Inclusion of protein flexibility and explicit solvent representation in the SILCS method is particularly important given the known conformational changes within the binding pocket upon ligand binding and the competition with and displacement of waters by ligands.
As further described below, the SILCS method was successful in mapping the functional group requirements of ligands for a range of macromolecules. To probe occluded LBPs, SILCS is coupled with an iterative Grand-Canonical Monte-Carlo (GCMC) and MD methodology. GCMC drives the sampling of small solutes and explicit solvent in LBPs while MD allows for conformational sampling of the macromolecules in the presence of solutes and water, which is useful in exploring cryptic pockets absent in apo crystal structures that are known to serve as binding sites.
The present invention describes the use of SILCS-GCMC/MD to map the functional group affinity patterns of the occluded pockets of the following therapeutically important NR proteins and GPCRs. Exemplary therapeutically important macromolecules are the androgen receptor (AR), the peroxisome proliferator-activated receptor-γ (PPARβ), nuclear receptors as well as GPCRs such as the metabotropic glutamate receptor (mGluR) and β2-adrenergic receptor (β2AR). The method was utilized to predict the relative binding affinities of ligands using a ligand grid free energy (LGFE) scoring scheme which, is further described below, incorporates protein conformational flexibility parameters. The method also is capable of distinguishing between active and inactive states of the β2AR through differences in the affinity patterns of ligands across these states. From a drug discovery perspective, such information is useful for distinguishing the function of ligands. Validating this capability is the ability to utilize FragMaps differences to identify new agonists of β2AR, compounds that have the potential to be developed into therapeutic agents for the treatment of asthma and other obstructive pulmonary diseases.
In one embodiment, the method may be used to sample the configurational space of an occluded pocket in a macromolecule. This sampling may be performed on a single solute in conjunction with water wherein the concentration of the solute in silico is in a range from about 0.01M to 1000M so as to estimate the binding affinity of the solute or a complex mixture of solutes for the occluded pocket of a macromolecule and to map the functional preference of groups for the occluded (binding) pocket of the macromolecule. In certain examples, the solute concentration is 1M because it corresponds to concentration at the standard state, which provides the experimental free energy of binding. When this is analyzed in the context of the SILCS methodology, the approach is shown to qualitatively reproduce the binding orientation of known ligands as well as quantitatively rank the order of the binding affinities of ligands. Such information concerning the functionality and binding orientation of ligands assists with computer aided drug design methods that are targeted towards the identification of high affinity ligands for macromolecules with deep or occluded binding pockets such as protein nuclear factors and GPCRs.
The present invention includes, therefore, methods for solute sampling in explicit solvent aqueous systems and solvated protein environments using iterative GCMC simulations. According to one embodiment, the present invention provides a computational method for sampling the spatial distribution of one or more solutes and water in a defined region of space (system) comprising:
In certain embodiments, the methods of the present invention may further generate an output of the spatial distributions of the one or more solutes and water. Among other things, the output may be used to assist with computer-aided drug design.
Details of the spatial distribution will depend on the number of repetitions of steps 2) and 3) with the user defining when a satisfactory spatial distribution is obtained. Typically, a user stops repeating steps 2) and 3) when the details of the spatial distribution of the one or more solutes and water no longer undergo a significant change with each additional repetition as defined by the user. This scenario may be referred to as a converged spatial distribution. In an aqueous solution, for example, a converged spatial distribution will be homogeneous (i.e. the same everywhere for water and all solutes). Alternatively, when the system is heterogeneous (i.e. contains a macromolecule) the spatial distribution of solutes and water will also be heterogeneous. However, the spatial distribution can be defined as converged if it does not undergo significant changes with more repetitions, where significant is defined by the user.
In the context of the present invention, the oscillating μex permits the determination of the spatial distribution of the solutes and water and the term “convergence” refers to the number of GCMC steps at which the spatial distribution of water and solutes no longer changes. While μex can still oscillate for each sampling of the spatial distribution, the average value of μex, however, cannot change once convergence is obtained. In one embodiment convergence of spatial distributions is identified using overlap coefficients. In one embodiment spatial distributions are considered converged when the overlap coefficient is greater than 0.4. In an alternate embodiment, convergence is identified when the deviation of the concentration of a solute in the system undergoing sampling of the spatial distribution is consistently less than 10% from the target concentration for at least the last 20 sampling cycles.
The system may include only water, one or more solutes, or a combination of water and the one or more solutes. The target concentration may be set to any desired level. However, in one particular embodiment having only water and one solute, the target concentration of the solute is set to 1M and the target concentration of the water is set to 55 M.
The number of GCMC operations in step 2) may be greater than one for each cycle. In certain embodiments GCMC operation is performed from about 100 to 100,000 times. However, less than 100 or more than 100,000 GCMC operations can be performed if necessary. The GCMC operations may be divided in any fashion amongst the one or more solutes and water. In certain embodiments, the GCMC operations are divided equally between the one or more solutes and water. The proportion of GCMC operations for each of the one or more solutes may also be assigned in any ratio desired. However, in certain embodiments, the proportion of GCMC operations is assigned based on the target concentration of each of the solutes and water. In certain embodiments, the μex of the one or more solutes and water is alternately increased and decreased during the GCMC operations across consecutive cycles involving steps 2) and 3). The alternating process of increasing or decreasing the μex of the one or more solutes and water is repeated until the overall spatial distribution no longer changes or reaches an acceptable level of change.
Macromolecules may also be present in the systems of the present invention. Examples may include, but are not limited to a protein, RNA, DNA, carbohydrate, lipid, organic chemical, inorganic chemical, or any combination thereof. Said macromolecules are generally greater than 1000 daltons, and less than 10,000,000 daltons. Generally, the macromolecules are in a range of 5,000 to 500,000 daltons. In certain systems that comprise a macromolecule, the sampling method of the present invention further includes a step wherein μex of the one or more solutes and water is alternately increased and decreased during the GCMC operations across consecutive cycles involving steps 2) through 3) after the concentration of the one or more solutes and water reach their target values. In embodiments that comprise one or more macromolecules, the spatial distribution of the one or more solutes may be used to identify preferential affinity of each of the solutes for the macromolecule. This can be achieved by determining the relative probability that the one or more solutes are spatially located adjacent or in one of the macromolecules versus the other or to the remainder of the system. The same holds for multiple sites on a single macromolecule. The binding affinity can be estimated based on the probability of the solutes being in a site relative to the remainder of the system or relative to another site on the macromolecule itself or versus the other macromolecule.
In yet other embodiments, the spatial distribution following step 2) is sampled using molecular dynamics (MD) simulations. A MD simulation after the GCMC allows for both conformational sampling of the solutes and configurational sampling of the aqueous system and the macromolecule. To achieve satisfactory convergence as defined by the user, the process of GCMC-MD is repeated through multiple iterations, with the μex of the species being studied systematically oscillated over the iterations to drive the solute and water exchanges. Many of the examples provided herein are described using GCMC-MD but MD simulation may not be used in all embodiments of the present invention.
In certain embodiments, the system used in the methods described herein is a finite spherical system. The difference between the radii of the systems may be 0.1 Å to 1000 Å. In certain embodiments, the difference between the radii is or is approximately 5 Å. While specific examples provided herein are taught with a spherical system, any defined region of space may be used.
The theory of the GCMC is described below. There are four possible GCMC moves on a molecule, M (i.e., solute or water): Insertion: brings M into the system A from the reservoir; deletion: removes M from the system A and moves it back into the reservoir; translation and rotation: M is translated/rotated within a sub-volume surrounding the original location of M in system A. The probabilities of these moves as governed by the Metropolis criteria are:
μex is the excess chemical potential,
As seen in Eqn. 1, the expected number of molecules of each solute “
With
Scheme 1 of
The iterative GCMC-MD procedure is performed as follows:
GCMC moves (insertion/deletion/translation/rotation) were attempted for the solutes (F1, F1, F2, . . . FM) and the water, in Scheme I and Scheme II, respectively. However, there is no requirement that the number of GCMC moves for the different solutes and water, have to be equal. The order in which the four possible GCMC moves are attempted, and the molecule (solutes or water) on which the move is performed is randomized In the first iteration, the value of μex for solutes and water was set to 0. The radii of the water and the solute sphere(s) subjected to GCMC moves were set to rA through the GCMC process, though the energetic interactions associated with the moves also includes contributions from any water molecules, or other chemical entities, outside of the GCMC sphere that is defined as system B.
With the variation of dμP system-dependent, various schemes (standard state, aqueous mixture, heterogeneous systems) can be generated using Eqn. 2. It should be noted that different schemes can be applied to vary dμP.
The GCMC simulations were run using an in-house C++ code that implemented the grid-based GCMC scheme with the cavity-bias algorithm to drive solute and water exchanges between their reservoirs and the aqueous system A. Solute empirical force-field parameters were obtained from CGenFF the general force field in CHARMM and TIP3P, a model for water used in GCMC and MD simulations. The solute molecules chosen for the present study represent different chemical functionalities including apolar molecules such as benzene and propane, neutral polar molecules such as acetaldehyde, methanol and formamide, and the negative and positively charged molecules such as acetate and methylammonium, respectively.
To prevent aggregation of hydrophobic and charged solutes, and promote faster convergence, a repulsive energy term was introduced only between benzene:benzene, benzene:propane, propane:propane, acetate:acetate, acetate:methylammonium and methylammonium:methylammonium molecular pairs. This was achieved by adding a massless particle to the center of mass of benzene and the central carbon of propane, acetate and methylammonium. Each such particle does not interact with any other atoms in the system but only with other particles on the hydrophobic or charged molecules through the Lennard-Jones (LJ) force field term using the following parameters: ϵ=0.01 kcal/mol; and Rmin=12.0 Å. All LJ force field terms and coulomb interactions were calculated during GCMC (i.e. there was no truncation of non-bonded interactions), including the interactions with system B. Simulations are initiated with an empty system A. The randomized GCMC process with the solutes and the waters was initially run in multiples of 50,000 moves until the water molecules in the system reach a bulk concentration of 55 M. At each point in this iterative process of 50,000 GCMC moves, the μex of the solutes and water was increased by 1 kcal/mol to accelerate the water and solute accumulation in system A. During this process, the concentration of the solutes may increase beyond their target values. After the bulk water concentration of 55M is attained, the GCMC-MD procedure as described above in steps 1-4 was run after resetting μex of the respective fragments and the water to 0. Over 50 iterations of the GCMC-MD in which the μex values were varied, the concentrations of solutes approach their respective target values. For each iteration, 50,000 and 100,000 GCMC moves were run for Scheme I and Scheme II, respectively.
The GROMACS package (version 5.0) was used for all MD simulations. For the aqueous systems, for each iteration, the combined systems A and B, including all the solutes and the waters were simulated for 500 ps using a leap-frog integrator (GROMACS integrator “md”), having a 2 fs time-step, at 300 K through a Nose-Hoover thermostat. The LINCS algorithm was used to constrain water geometries and all covalent bonds involving a hydrogen atom, while van der Waals (vdW) and electrostatic interactions were switched off smoothly over a range of 8-10 Å. Solutes and water molecules were held within the spherical dimensions of systems A or B by applying harmonic flat-bottom restraints with a force constant of 1.2 kcal/mol/Å2 on the following solute or water atoms: the massless particle at the geometric center of benzene, propane, acetate and methylammonium, the carbon atoms of the acetaldehyde, methanol and formamide and the oxygen atom of water. For the bulk aqueous systems, the radius, rB of the harmonic flat-bottom restraints used to define system B applied only to water and was 5 Å larger than the radius rA=20 Å defining the restraint applied to the solutes in system A.
For the T4-L99A system, PDB coordinates 181L with the benzene ligand was used following deletion of the ligand. Briefly, the T4-L99A structure was inserted into boxes replicating Scheme I and Scheme II systems containing 1 M of benzene and 0.25 M each of the different fragments (benzene, propane, acetaldehyde, methanol, formamide, acetate and methylammonium), respectively. Akin to the aqueous systems, 10 such boxes were built for Scheme I with 1M benzene and Scheme II with the multiple solutes, with the solutes randomly inserted in each of the boxes. These systems were minimized over 1000 iterative steps using the steepest descent algorithm in the presence of periodic boundary conditions (PBC). The systems were then equilibrated for 250 ps by periodic reassignment of velocities. The leap frog version of the Verlet integrator with a time step of 2 fs was used for heating and equilibration. Long range electrostatic interactions were handled using the particle-mesh Ewald method with a real space cutoff of 12 Å. A switching function was applied to Lennard-Jones interactions at 12 Å, and a long-range isotropic correction was applied to the pressure component for Lennard-Jones interactions beyond the 12 Å cutoff length. During minimization and equilibration, harmonic positional restraints with a force constant of 2.4 kcal/mol /Å2 were applied to protein non-hydrogen atoms. For MD simulations in the iterative protocol, the position restraints were removed and replaced by weak restraints applied only to the backbone C-alpha carbon atoms of the protein with a force constant (k in 1/2 kδx2) of 0.12 kcal/mol/Å2. This was done to prevent the rotation of the protein in the simulation box and to prevent the potential denaturation of the protein due to the presence of a highly concentrated fragment solution (17).
The GCMC-MD iterations are repeated until convergence is obtained as defined by the user. Typically, about 200 GCMC-MD iterations are needed for the aqueous systems, yielding 100 ns of MD simulations and 10 and 20 million MC steps for Scheme I and Scheme II, respectively. Furthermore, to ensure sufficient sampling and convergence, 10 separate GCMC-MD simulations were run for Scheme II and for each solute in Scheme I resulting in a cumulative 1 μs (10×100 ns) of MD for Scheme II and each solute in Scheme I. Runs differ through a randomly generated seed for both GCMC and the leap-frog MD integrator operations in each iteration. GCMC-MD of the protein systems are repeated over 100 iterations, yielding a cumulative 500 ns of MD over the 10 separate simulations from both Scheme I and Scheme II systems. In the context of Scheme I, the GCMC-MD simulations would allow calculation of the HFE corresponding to a 1 M standard state aqueous system and the binding affinity of solutes to the protein. With the solute mixture in Scheme II the approach would permit determining the μex required to drive sampling of the distribution of solute molecules in a heterogeneous aqueous system and determine solute affinity patterns around the protein site.
In the initial calculations, when solute and solvent in the finite spherical systems shared the same spherical boundary, the nonpolar molecules only sampled the surface of the spherical system (
Two types of aqueous systems were considered: 1) a system containing only one type of solute in water at a concentration of 1 M, thereby replicating the standard state of the solute and 2) a dilute aqueous mixture containing many types of solutes each at a concentration of 0.25 M. Results using both Schemes I and II are presented for aqueous systems alone, followed by a description for Scheme I and Scheme II calculations on a system that contains T4-L99A lyzozyme. This lysozyme mutant was selected as a model system as it has been widely used in experimental and computational studies of ligand binding as well as in studies evaluating the impact of mutations on protein structure and stability. The L99A mutation in the C-terminal domain T4 creates a completely buried, hydrophobic cavity of ˜150 Å3 that, although inaccessible in static structures, binds small hydrophobic ligands in a rapid and reversible manner. Such an occluded pocket offers a rigorous test of the sampling effectiveness of the presented GCMC-MD methodology, including a quantitative evaluation of the approach.
Briefly, the GCMC-MD methodology can be explained as follows. GCMC was run on water and each of the different solutes in the system. Variations in the value of μex for water and solutes for the GCMC-MD iterations was used to improve the solute exchange probabilities and simultaneously permit the determination of μex required to maintain a defined concentration of the solutes and water molecules in the aqueous systems. This approach therefore, allows the HFE of solutes to be determined using their μex values when the target concentration of solute in the aqueous systems is 1 M. In the presence of the protein, the GCMC-MD strategy was used to determine the binding affinities of solutes to the protein as well as to efficiently sample the spatial distribution of the solutes in and around the protein. The site identification by ligand competitive saturation (SILCS) methodology is a fragment based sampling method that maps free energy affinity patterns of functional groups at protein surfaces, including occluded ligand binding pockets of proteins. The SILCS method, therefore, permits determination of the affinity patterns of multiple solutes to the protein and can be used for rational drug design.
I. Simulation Details for Scheme I and Scheme II Aqueous Systems—Molecular Dynamics with Periodic Boundary Conditions
Empirical force field parameters for proteins are CHARMM36, ligand molecules were treated using CGenFF and water was treated using the TIP3P model. Simulations were performed using the CHARMM molecular simulation program. To replicate Scheme I, the number of molecules required to attain a 1M concentration of a specific solute were randomly placed in a cubic box whose sides were 50 Å each Similarly, to replicate Scheme II, the number of molecules required to attain 0.25 M concentration of each solute type, were randomly placed in a bulk water cubic box whose sides were 50 Å each. The aqueous mixtures were minimized using the steepest descent algorithm over 5000 steps and the further minimized over another 5000 steps using the conjugate gradient algorithm in the presence of periodic boundary conditions. The systems were then heated to 300K at the rate of 5 K/ps by periodic reassignment of velocities. Following this, the system was equilibrated for 200 ps using velocity reassignment. The leap frog version of the Verlet integrator with a time step of 1 fs was used for heating and equilibration. Water geometries and covalent bonds involving hydrogen atoms were constrained using the SHAKE algorithm. Long range electrostatic interactions were handled with the particle-mesh Ewald method with a real space cutoff of 12 Å. A switching function was applied to the Lennard-Jones interactions at 10 Å, and a long-range isotropic correction was applied to the pressure parameter for Lennard-Jones interactions beyond the 12 Å cutoff length. Following equilibration, the aqueous mixture boxes were simulated for 15 ns with a time step of 2 fs, at 300K and at 1 atm pressure using a Nose-Hoover thermostat, and the Langevin piston barostat.
3D probability distributions for selected atoms of solutes from the GCMC/MD and GCMC-only simulations are called “FragMaps.” For the present invention, FragMaps were constructed for benzene carbons, propane carbons, methanol polar hydrogen, methanol oxygen, formamide polar hydrogens, formamide oxygen, imidazole unprotonated nitrogen, imidazole hydrogen, acetaldehyde oxygen, methylammonium polar hydrogens, and acetate oxygens. Briefly, solute atoms from 10 ps snapshots of the Scheme II aqueous mixture systems in the absence and the presence of the T4-L99A mutant GCMC-MD simulation were binned into 1 Å×1 Å×1 Å cubic volume elements (voxels) of a grid spanning the entire system, and the voxel occupancy for each FragMap atom type was calculated.
For the GCMC-only simulations, voxel occupancies for each FragMap atom type were obtained every 1000 steps of GCMC in every cycle. These were then normalized by the voxel occupancies of the fragments in a bulk-phase system devoid of the protein. Bulk-phase occupancies were equal across the GCMC/MD and GCMC-only simulations. The voxel occupancies of the eleven atom types were merged to create the following five generic FragMap types: (1) generic nonpolar, APOLAR (benzene and propane carbons); (2) generic neutral hydrogen bond donor, HBDON (methanol, formamide and imidazole polar hydrogens); (3) generic neutral hydrogen bond acceptor, HBACC (methanol, formamide, and acetaldehyde oxygen and imidazole unprotonated nitrogen) (4) positive donor, POS (methylammonium polar hydrogens); and (5) negative acceptor, NEG (acetate oxygens).
The probability distributions for each atom type in the absence of protein were normalized and then converted to free energies via a Boltzmann-based transform of the normalized probability to yield a grid free energy (GFE) for each fragment type “f” for the coordinates x,y,z, referred to as GFE FragMaps. All GFE values were capped at 3 kcal/mol. FragMaps were visualized as iso-contour surfaces at a GFE value of −1.2 kcal/mol, unless otherwise noted.
LGFE quantifies the overlap of atoms in ligands in the ligand binding pocket (LBP) with the corresponding GFE FragMaps. Ligand atoms were classified into FragMap types, according to an assignment map. Briefly, each classified atom of a ligand with coordinates (xi, yi, zi) was assigned a score equal to the GFE value of the corresponding FragMap type (1), GFEfxi,yi,zi, of the voxel it occupies. LGFE is then the sum of each of these GFE values for all the classified ligand atoms. LGFE was calculated as Boltzmann weighted average over an ensemble of conformations obtained by MC sampling of the ligand in LBP, in the field of FragMaps.
IV. MD conformational ensemble (for LGFEMD):
Initial ligand conformations were extracted from the corresponding co-crystal coordinate PDB files and the automated CGenFF parametrization algorithm was used to obtain the topology and parameters in the context of CGenFF. The ligand conformation from the co-crystal structure was extracted and aligned with the protein conformation used in the SILCS simulations. The alignment was done based on optimal alignment of the backbone Cα atoms in the two protein structures. The complex was minimized using the SD algorithm for 50 steps. This minimized conformation was subject to a 10 ps MD simulation, with snapshots output every 0.2 ps. During the dynamics, all protein atoms farther than 8 Å around any ligand atom were restrained using a force constant of 1 kcal/mol/Å2. This process was repeated on 40 protein conformations obtained from the GCMC-MD SILCS simulations, equally spaced in time (cycle 10, 20, 30, . . . ), yielding at total of 1000 ligands conformations for estimation of the Boltzmann weighted LGFE.
Sampling of the ligand was also performed in the “field” of the GFE FragMaps using Metropolis Monte Carlo (MC) steps. An in-house suite of programs was used to setup and run the MC simulations. The ligand had rotational, translational and intra-molecular degrees of freedom. The ligand had no rotational restraints, but its center of mass (CoM) was restrained to lie within 2.5 Å of the CoM of the ligand crystal conformation using a flat bottom restraint. Intramolecular degrees of freedom consisted of rotatable bonds, which were automatically detected based on the CGenFF molecular topology. All acyclic non-terminal bonds were considered rotatable, with the exception of bonds ending in methyl or NH3+ groups. The force-field terms corresponding to the intra-molecular degrees of freedom comprised of dihedral, van der Waals (vdW) and electrostatic terms. Due to the absence of protein and solvent during these simulations a distance dependent dielectric (=4|r|) was used to evaluate the intramolecular electrostatic contributions to prevent their overestimation. The energy computed during the Metropolis MC can be written as follows.
E=E
vdw,intra
+E
elec,intra
+E
dihe,intra+LGFE
For each ligand, 20 different MC simulations (each run for 10,000,000 steps; snapshots recorded every 10,000 steps) were run, where for each of them the ligand is initially randomly placed close to the protein active site. An average LGFE is first calculated over each of these 20 MC simulations. LGFEMC is then the Boltzmann weighted average over these LGFE values obtained across the 20 MC simulations.
The extent by which the binding sites are occluded is referred to as occludedness, and is calculated as the ratio of the solvent accessible surface area (SASA) of the ligands alone to the SASA when bound to the protein using ligand-protein complex X-ray structures. As shown in Table 1, these ratios vary from 0, for testosterone bound to AR up to 0.13 for both Rosiglitazone bound to PPARγ and Carazalol bound to β2AR. This data suggests a high level of inaccessibility of LBPs for these therapeutically important drug targets.
During GCMC, fragments and water are exchanged between their gas-phase reservoirs and a cubic region of radius 20 Å (25 Å for the PPARγ and AR) encompassing the ligand binding pocket of the protein. The excess chemical potential (μex) supplied to drive fragment exchanges is periodically fluctuated, for example, over every 3 cycles, such that the average μex over the 100 cycles is close to the values shown in Table 2. These values are the magnitude of the μex required to maintain 0.25 M of a solute in a bulk aqueous mixture devoid of any protein, and are approximately equal to the hydration free energy.
Ten GCMC-MD simulations were run for each protein system. Each of these 10 simulations constituted 100 cycles of GCMC and MD (200 in the case of AR), with each cycle involving 100,000 steps of GCMC and 0.5 ns of MD, yielding a total of 100 million steps of GCMC and 500 ns of MD for every protein system (200 million steps of GCMC and 1 μs of MD for AR). Through the 100,000 steps of GCMC, both the solute molecules (fragments) and waters are exchanged between their gas-phase reservoirs and a sub-volume encapsulating the LBP of the protein. The configuration at the end of the GCMC is used as the starting configuration in the MD.
Before starting MD, a 500 step steep descent (SD) minimization and a 100 ps equilibration are run. The excess chemical potential (μex) supplied to drive solute and water exchanges during GCMC are periodically fluctuated over every 3 cycles, such that the average μex over the 100 cycles is close to the values in Table 2. These values are the magnitude of the μex required to maintain 0.25 M of a solute in a bulk aqueous mixture devoid of any protein, and are approximately equal to the hydration free energy. Since μex is periodically fluctuated, throughout GCMC, the system is not a formal GC ensemble. However, by maintaining the average μex Constant over the length of the simulation the extent of deviation was minimal. For all proteins exemplified in this method, convergence of the FragMaps was monitored by calculating overlap coefficients (OC).
The ten trajectories were divided into two groups (group 1, trajectories 1-5; group 2, trajectories 6-10), and FragMaps of each group were separately computed. OC relates the overlap between FragMaps of two groups (group 1, trajectories 1-5 and group 2; trajectories 6-10) to a number between 0 and 1, with 1 reflecting completely identical maps. OC is calculated using equation (3).
In equation 3 N is the number of voxels in the FragMaps and Qi1 and Qi2 are occupancies for the ith voxel from group 1 and 2 generated FragMaps, respectively. The ratios in the parentheses are computed to normalize the occupancy of each voxel by the sum of occupancies of all voxels in the corresponding FragMap. For each voxel, the smaller values (the conserved part) from group 1 and 2 are summed over all voxels to get the OC. It should be noted that the OC index does not behave linearly, such that a relatively small difference in the two distributions leads to a decrease from 1 to approximately 0.8, and values of >0.5 indicate a high degree of similarity (
To probe the role of protein flexibility in modulating the FragMaps in the LBPs, a second set of GCMC-only simulations were run for all the protein systems. Like the GCMC/MD simulation, 10 independent runs were set up. Through the 100 cycles, no MD was run at the end of 100,000 steps of GCMC. However, the last configuration at the end of the 100,000 steps of GCMC was used as the input configuration for the next cycle. As with the GCMC-MD, μex is periodically fluctuated over every 3 cycles around the values shown in Table 2, above.
AR & PPARγ:
PDB coordinates of 2AM9 (resolution 1.64 Å) and 3U9Q (resolution 1.52 Å) with the ligands testosterone (TES) and decanoic acid (DA), respectively, were used for the computer based methods of the present invention. Following the deletion of the ligands, missing residues 262-275 in the PDB 3U9Q were built using MODELLER. A total of 100 models were generated and ranked using the Discrete Optimized Protein Energy (DOPE) method and the highest ranking model was used as a starting structure. Crystal water molecules were retained, as were any structurally important ions. An in-house preparation script utilized GROMACS utilities to generate the simulation system involving protein, water, and small molecules, with the size of the system so as to have the protein extremity separated from the edge by 12 Å on all sides. The net system charge was made neutral by replacing random water molecules with the appropriate number of sodium or chloride ions. The proteins were minimized for 500 steps with the steepest descent (SD) algorithm in the presence of periodic boundary conditions (PBC). This was followed by a 100 ps equilibration during which temperature was adjusted by velocity rescaling. During the minimization and equilibration, harmonic positional restraints with a force constant of 2.4 kcal mol−1 Å−2 were applied to protein non-hydrogen atoms. The final coordinates at the end of equilibration are used as the starting conformations for the GCMC/MD simulation.
An in silico database of about 1.8 million compounds which contains all accessible tautomers and protonation states of each compound from the CHEMBRIDGE and MAYBRIDGE databases was used for screening. 1) Distinct B2A FragMap affinities (A1, A2, P1, P2, HBD1 and HBD2 in
ΔEdock=|Edockact|−|Edockinact|
where Edockact and Edockinact are scores from top ranked poses in the active and inactive structures, respectively. A second selection criterion of ΔEdock>0, yielded 906 ligands. LGFEs of these 906 ligands were calculated as Boltzmann averages over 1000 steps MC sampling in fields of both active (LGFEact) and inactive (LGFEinact) FragMaps. Differences in these LGFE scores were calculated as ΔLGFE=|LGFEact|−|LFGEinact|, and 109 ligands were selected with ΔLGFE>0. Chemical fingerprint-based cluster analysis was used to select 15 chemically diverse ligands with which functional assessment studies were performed.
All mouse studies were approved by the Animal Care and Use Committee of UMB. 5 mm sections of trachea from FVB/N mice were excised and studied in an isometric myograph system (ADInstrument) as described in the literature. A passive tension of 0.5 g was applied for each ring for a baseline. Rings were contracted with 1 μM acetylcholine, followed by the addition of 100 μM of isoproterenol (iso) and the selected ligands from VS studies. Percentage of relaxation in the presence of a ligand was measured as change in tension from acetylcholine-stimulated peak contraction. Rings were washed for reuse. Relaxation was calculated as an average over 9 runs with isoproterenol and 4 runs with each of the selected ligands.
For Scheme I and the aqueous mixture of Scheme II, the target concentration is obtained by varying μex through the GCMC-MD iterations. To start, the value of μex for both the solute and water is initially set to 0. In both cases, the average value of μex for both solutes and water converged close to their HFE. Table 4 lists both the calculated and the experimental HFE (HFEexp) of the solutes. Average μex of solute and water for the Scheme I and Scheme II GCMC-MD simulations with a spherical system of radius 25 Å were obtained from cycles 150-200. Averages and standard deviations were based on the 10 individual simulations of each system.
HFEs were also calculated using a fast energy perturbations (FEP) method, as previously described (see, Baker C M, et al., (2010), J Chem Theory Comput 6(4):1181-1198), to account for possible limitations in the force field that would yield a HFE in disagreement with the experimental data.
Table 4 lists the average μex value and the concentration for each solute for 10 simulations, from the final 50 iterations. Because the average μex value and concentration for solutes over the last 50 iterations are largely similar to corresponding μex and concentration values measured across iterations 50-200 (Table 5) of the simulation, these results are consistent with the onset of convergence starting at iteration 50.
For hydrophobic and polar molecules, the average μex values compare well with the HFEfep. The largest deviation occurs with formamide with the values falling between the HFEexp and HFEfep values. The deviation of μex and HFEfep from the HFEexp for charged fragments, acetate and methylammonium, is due to the fact that the vacuum-to-solvent interface potential is not being accounted for in the present calculations. For monovalent anions/cations, this contribution was calculated to be about +/−12.5 kcal/mol, respectively, using the TIP3P water model. Further, as GCMC methods with charged solutes are limited by low acceptance rates for particle insertions, the present estimates of μex for charged systems are within acceptable limits. Overall, these results establish that the presented iterative GCMC-MD methodology is useful towards estimating the HFE of organic solutes by virtue of their μex in standard state aqueous systems. In addition, the approach is suitable for more complex aqueous mixtures as evidenced by the fact that the μex values obtained from Scheme II are in satisfactory agreement with the experimental and HFEfep values.
While the GCMC-MD simulation protocol achieved the correct concentration and μex in the system, the method was investigated for its ability to obtain the correct spatial sampling of solutes in finite spherical systems e.g., heterogeneous systems. Spatial sampling was investigated via the analysis of radial distribution functions (RDF). RDFs of selected solute atoms and water oxygens were calculated from the cumulative MD sampling of both the Scheme I and Scheme II GCMC-MD simulations.
These RDF's were compared to the RDFs obtained from explicit-water 15 ns PBC-MD simulations. The PBC-MD simulations as described above maintain concentrations of solutes and water according to the parameters set forth for Scheme I and Scheme II in a cubic box whose sides are 50 Å each. These simulations also include an explicit treatment for long-range non-bonding interactions via particle mesh Ewald and isotropic LJ correction methods.
The RDFs from finite spherical systems GCMC-MD sampling and the explicit-water PBC MD simulations match very well. These results indicate that the use of oscillating μex values to drive GCMC sampling and that the treatment of the long-range non-bonding interactions in the presented GCMD-MD protocol does not significantly impact the spatial sampling of the aqueous systems. Some fluctuations are seen in the RDF from Scheme II PBC, which are due to sampling issues, for instance, the MD only simulations were run for 15 ns versus a total of 100 ns MD simulations in the GCMC-MD protocol. Thus, the present methodology attains spatial sampling consistent with that observed in unbiased MD PBC simulations. It is noted that the μex value settled close to the HFE values in a Scheme I simulation of acetaldehyde and methanol without MD at the end of every iteration. However, since the molecules are rigid during GCMC, MD simulations are likely needed to preserve the correct conformational and spatial sampling of the solutes in these bulk-phase environments.
Subsequent calculations focused on determining if the GCMC-MD approach could obtain equilibrium solute sampling with known, fixed μex values. Accordingly, a set of GCMC-MD simulations were run, where instead of starting the simulation by assigning μex to be 0, the method holds the value of μex fixed to the HFE throughout the iterations. Shown in
To overcome this, μex of both the solutes and the water were cyclically fluctuated around their respective HFEs.
was alternately added and subtracted to μP(P=F1, F2, . . . , FM,W) over every three iterations of GCMC. Such variations in μex lead to improvements in sampling over the course of the GCMC-MD iterations, as evidenced by both the change in the number of the solute molecules and increase in the GCMC move probabilities (Pins+del and Ptrans+rot for insertions, deletions, translations and rotation moves respectively), as shown in
T4-L99A, which contains an engineered occluded binding site for benzene, was selected as a model system. The GCMC-MD sampling method was identical to the aqueous systems above, except that system B was treated as periodic with the solute being studied included in system B at the target concentration used to drive the GCMC sampling. System A was a 20 Å sphere centered on the T4-L99A binding site defined by residues Ala 99 and Met 102 (
Scheme I calculations on T4 involved benzene as the only solute at a concentration of 1 M along with water at a bulk concentration of 55 M. These parametric constraints permitted validation of the method, namely to drive sampling of benzene in the occluded binding pocket of the protein as well as yield a quantitative estimate of ligand binding. Across a 10×37.5 ns GCMC-MD simulation with conformations saved every 10 ps, the method showed that benzene was bound to the T4-L99A binding site for a total of 372 ns with the pocket being empty for only 3 ns. The average benzene concentration in the simulation system A was 1.5±0.2 M. These results indicate that the GCMC-MD method can sample the occluded pocket as well as attain the defined concentration in the entire system that drives the GCMC sampling.
The pocket sampling also permitted the estimation of the binding affinity of a ligand ΔG°, using the following equation:
In equation (4), R is the gas constant (kcal/mol/K), T (K) is the temperature, [PL] is the concentration of the bound ligand, [L] is the total concentration of the ligand, [P] is the concentration of the protein, Vref is the reference volume in concentration units (1660 Å3 per one ligand molecule for 1 M of ligand). Since the simulations are maintained in equilibrium, the bound vs. unbound ligand concentrations can be correlated to the time fraction of ligand bound vs. unbound in the binding site through the simulation. As the T4-L99A pocket is completely occluded, the presence or the absence of a solute atom at the active site is driven only by the GCMC insertion/deletion moves. Over the 10×37.5 ns GCMC-MD simulation that involves a total of 18.75 million GCMC insertion/deletion attempts for the benzene, with [L]˜1.5 M and a [PL]/[P] ratio of 99.4/0.6, ΔG° , calculated using Equation 2, is about −3.25 kcal/mol. Although there is some difference from the experimental binding affinity of −5.19±0.16 kcal/mol, it should be noted that an increase of 104 steps out of a total of 1.9×107 steps with benzene in the pocket, translates to nearly 1 kcal/mol difference in the calculated ΔG°. This does not mean a lack of equilibrium in the system, but emphasizes the difficulty of converging the calculation of a binding constant to a single site in a protein, although force field effects could also impact the obtained ΔG°.
Scheme II calculations included seven solutes along with water. As with the aqueous system, the target concentration for the solutions of each solute was 0.25 M. This simulation was run for a total of 10×50 ns. To facilitate analysis of the results, affinity patterns of selected atoms from different solutes in the occluded binding pocket, called “Grid Free Energy (GFE) FragMaps”, were calculated.
GFE FragMaps are Boltzmann transformed probability distributions for the solute atoms that are normalized using the distributions of solute molecules in an aqueous solution that does not contain the macromolecule. This normalization accounts for the free energy penalty due to solute desolvation when calculating the GFEs. These maps may then be visualized to qualitatively evaluate the ability of the GCMC-MD sampling method to reproduce the positions of different ligands known to bind T4-L99A that have been subjected to experimental analysis. Nine ligands were considered: 1) benzene, 2) o-xylene, 3) p-xylene, 4) ethylbenzene, 5) benzofuran, 6) indene, 7) indole, 8) isobutylbenzene, and 9) n-butylbenzene.
The use of GFE FragMaps has other advantages too. For instance, GFE FragMaps permits quantitative evaluation of the relative affinity of ligands, based on Ligand GFE (LGFE) scores, as described above.
LGFE scores quantify the overlap of atoms in the ligand with the corresponding GFE FragMaps. LGFE scores were calculated as Boltzmann weighted averages from ensembles of ligand-protein orientations generated using i) MD sampling of the ligands bound to the protein and ii) MC sampling of the ligands in the field of FragMaps. As shown in
Eight representative solutes with different chemical functionalities: benzene, propane, acetaldehyde, methanol, formamide, imidazole, acetate, and methylammonium were chosen to probe the LBPs. Benzene and propane serve as probes for nonpolar functionalities, while methanol, formamide, imidazole and acetaldehyde are neutral molecules that participate in hydrogen bonding. The positively charged methylammonium and negatively charged acetate molecules serve as probes for charged donor and acceptor groups, respectively. The normalized probability distributions for selected atoms in these solutes from the SILCS-GCMC/MD simulations were then used to create functional group affinity FragMaps at the respective LBPs as described below.
a. Androgen Receptor
To create FragMaps, a SILCS-GCMC/MD was run with the testosterone-AR crystal structure (PDB 2AM9) after the removal of testosterone. Although the simulations were initiated with the steroid-bound conformation of AR, FragMaps from the 10×100 ns GCMC/MD simulations recapitulate the locations of different functional groups for both steroidal and non-steroidal crystallographic ligands (
Notably, the apolar cavity between the W741, L873 and T877 that is occupied by some ligands such as the steroidal EM-5744 and the non-steroidal S-1, an analog of R-bicalutamide, is inaccessible in the starting conformation. However, following simulation of the structure, the side chains of W741 and T877 undergo conformational changes that lead to the formation of a cavity without significantly affecting the global conformation of the androgen receptor. For instance, the root mean square deviation (RMSD) in the proteins (backbone is ˜1.2 Å, consequently, APOLAR FragMaps were found in this cavity (A2,
b. PPARγ:
To map the functional group free energy patterns at the LBP of PPARγ a 10×50 ns SILCS-GCMC/MD was initiated using the PDB structure 3U9Q following removal of decanoic acid from the crystal structure. As illustrated in
In addition, as shown in
Further validation of the method is provided by greater extent of overlap between the terminal alkyl chain of decanoic acid and the APOLAR FragMap densities as well as the overlap between the carboxylic acid of decanoic acid and the NEG FragMap densities (A1, N1, respectively) in the LBP1 (
Along with these qualitative observations, binding affinities of the ligands were estimated using Ligand Grid Free Energy (LGFE) scoring for 16 ligands whose binding activity data to the human PPARγ is available, and compared against the experimental binding affinity, ΔGbind. KIs obtained from the different sources were normalized against the KI of rosiglitazone (KI=120 nM; binding DB reported a range of values between 8-440 nM). Despite the diversity in the ligands and their binding modes, there is reasonable correlation between the LGFE and ΔGbind values with a predictive index (PI) of ˜0.63 and R2˜0.22. For instance, the compound GW409544 that binds both the LBP1 and LBP2 (PDB:1K74) pockets has very good overlap with APOLAR FragMaps A1, A2 and A3 (
c. mGluR:
To efficiently sample the partially occluded LBP of mGluR1, 10×50 ns SILCS-GCMC/MD was performed on a monomer of mGluR obtained from a crystal structure of the inactive conformation of the dimeric 7 transmembrane (TM) region of the family C mGluR1 in complex with FITM, a negative allosteric modulator (NAM). During this sampling, the GCMC of solutes and water were restricted to a 20 Å cubic region around the LBP. FragMaps correctly recapitulate the different functional groups of the FITM (
Mutational studies of the mGluR1 and binding activity with other allosteric modulators such as the 2-Methyl-6-(phenylethynyl)pyridine (MPEP) (another known NAM) revealed that the binding pocket identified with FITM could be shared with other NAMs. MC sampling of MPEP in the pocket and in the presence of FragMaps as described above yields binding modes similar to FITM (
Crystal structures of the inactive and the active conformations of the family A GPCR are available and two separate 10×50 ns SILCS-GCMC/MD were run with each structure. In this document, B2I refers to the simulations starting with the inactive conformation (PDB: 2RH1) and B2A refers to simulations starting with the active conformation (PDB: 3P0G).
Good overlaps were obtained between the FragMaps and the crystallographic ligands for both B2I and B2A. APOLAR FragMap affinities close to the hydrophobic region defined by F289, V117 and A200 overlapped well with the benzoxazine and carbazol moieties of BI161707 and with carazolol, respectively (A1 in
A second APOLAR density was found close to the second hydrophobic pocket adjacent to W109, I309, and F193 only in B2A (A2,
Notably, the SILCS-GCMC/MD approach is able to quantitatively differentiate between the two states of β2AR as well. LGFE scores from MC sampling were obtained for a diverse range of agonists, partial agonists and antagonists/inverse agonists. As shown in Table 7, good correlations were obtained between the LGFE scores and binding affinities of agonists and partial agonists with B2A (R2˜0.46, PI˜0.59,
β2AR are expressed on numerous cell types, including airway smooth muscle (ASM). Activation of β2AR in ASM causes bronchodilation and inhaled beta-agonists are the main therapy for asthma and other obstructive pulmonary diseases. Differences in FragMaps between the two end states were used to guide virtual screening (VS) studies to identify new agonists for β2AR. Following the procedure described above, 15 top scoring chemically diverse ligands (
46
47
47
48
49
49
49
49
50
21
50
50
50
50
50
50
51
50
50
50
52
This invention was made with government support under Grant Number CA107331 awarded by the National Institutes of Health. The government has certain rights in the invention.
Number | Date | Country | |
---|---|---|---|
61932890 | Jan 2014 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 15114599 | Jul 2016 | US |
Child | 16019804 | US |