This application relates generally to compounds and cardiotoxicity and more particularly to processor-implemented systems and methods for analyzing compounds with respect to cardiotoxicity.
Cardiotoxicity is a leading cause of attrition in clinical studies and post-marketing withdrawal. The human Ether-a-go-go Related Gene 1 (hERG1) K+ ion channel is implicated in cardiotoxicity, and the U.S. Food and Drug Administration (FDA) requires that candidate drugs be screened for activity against the hERG1 channel. Recent investigations suggest that non-hERG cardiac ion channels are also implicated in cardiotoxicity. Therefore, screening of candidate drugs for activity against cardiac ion channels, including hERG1, is recommended.
The hERG1 ion channel (also referred to as KCNH2 or Kv11.1) is a key element for the rapid component of the delayed rectifier potassium current (IKr) in cardiac myocytes, required for the normal repolarization phase of the cardiac action potential (Curran et al., 1995, “A Molecular Basis for Cardiac Arrhythmia: HERG Mutations Cause Long QT Syndrome,” Cell, 80, 795-803; Tseng, 2001, “I(Kr): The hERG Channel,” J. Mol. Cell. Cardiol., 33, 835-49; Vandenberg et al., 2001, “HERG K+ Channels: Friend and Foe,” Trends. Pharm. Sci. 22, 240-246). Loss-of-function mutations in hERG1 cause increased duration of ventricular repolarization, which leads to prolongation of the time interval between the Q and T waves of the body surface electrocardiogram (long QT syndrome, LQTS) (Vandenberg et al., 2001; Splawski et al., 2000, “Spectrum of Mutations in Long-QT Syndrome Genes KVLQT1, HERG, SCN5A, KCNE1, and KCNE2,” Circulation, 102, 1178-1185; Witchel et al., 2000, “Familial and Acquired Long QT Syndrome and the Cardiac Rapid Delayed Rectifier Potassium Current,” Clin. Exp. Pharmacol. Physiol., 27, 753-766). LQTS leads to serious cardiovascular disorders, such as tachyarrhythmia and sudden cardiac death.
Diverse types of organic compounds used in both common cardiac and noncardiac medications, such as antibiotics, antihistamines, and antibacterials, can reduce the repolarizing current IKr (e.g., by binding to the central cavity of the pore domain of hERG1) and lead to ventricular arrhythmia (Lees-Miller et al., 2000, “Novel Gain-of-Function Mechanism in K Channel-Related Long-QT Syndrome: Altered Gating and Selectivity in the HERG1 N629D Mutant,” Circ. Res., 86, 507-513; Mitcheson et al., 2005, “Structural Determinants for High-affinity Block of hERG Potassium Channels,” Novartis Found. Symp. 266, 136-150; Lees-Miller et al., 2000, “Molecular Determinant of High-Affinity Dofetilide Binding to HERG1 Expressed in Xenopus Oocytes: Involvement of S6 Sites,” Mol. Pharmacol., 57, 367-374). As a result, several approved drugs (e.g., terfenadine, cisapride, astemizole, and grepafloxacin) have been withdrawn from the market, whereas several other drugs, such as thioridazine, haloperidol, sertindole, and pimozide, are restricted in their use because of their effects on QT interval prolongation (Du et al., 2009, “Interactions between hERG Potassium Channel and Blockers,” Curr. Top. Med. Chem., 9, 330-338; Sanguinetti et al., 2006, “hERG Potassium Channels and Cardiac Arrhythmia,” Nature, 440, 463-469).
The recommended in vitro drug screening process includes traditional patch clamp techniques, radiolabeled drug binding assays, 86Rb-flux assays, and high-throughput cell-based fluorescent dye assays using hERG1 ion channels stably transfected in Chinese hamster ovary (CHO) cells (Stork et al., 2007, “State Dependent Dissociation of HERG Channel Inhibitors,” Br. J. Pharmacol., 151, 1368-1376) and HEK 293 cells (also known as 293T cells) (Diaz et al., 2004, “The [3H]Dofetilide Binding Assay is a Predictive Screening Tool for hERG Blockade and Proarrhythmia: Comparison of Intact Cell and Membrane Preparations and Effects of Altering [K+]O,” J. Pharmacol. Toxicol. Methods., 50(3), 187-199). Although elaborate nonclinical tests display a reasonable sensitivity and establish safety standards for novel therapeutics, the screening of all potential candidates remains very time-consuming and thus increases the final cost of drug design.
Molecular modeling techniques have provided some guidance in screening drug candidates for their ability to block cardiac channel proteins. For example, several receptor-based models of hERG-drug interactions based on molecular docking and molecular dynamics (MD) simulation studies have been published (Stansfeld et al., 2007, “Drug Block of the hERG Potassium Channel: Insight from Modeling,” Proteins: Struct. Funct. Bioinf. 68, 568-580; Masetti et al., 2007, “Modeling the hERG Potassium Channel in a Phospholipid Bilayer: Molecular Dynamics and Drug Docking Studies,” J. Comp. Chem., 29(5), 795-808; Zachariae et al., 2009, “Side Chain Flexibilities in the Human Ether-a-go-go Related Gene Potassium Channel (hERG) Together with Matched-Pair Binding Studies Suggest a New Binding Mode for Channel Blockers,” J. Med. Chem., 52, 4266-4276; Boukharta et al., 2011, “Computer Simulations of Structure-Activity Relationships for hERG Channel Blockers,” Biochemistry, 50, 6146-6156; Durdagi et al., 2011, “Combined Receptor and Ligand-Based Approach to the Universal Pharmacophore Model Development for Studies of Drug Blockade to the hERG1 Pore Domain,” J. Chem. Inf. Model., 51, 463-474). However, the MD simulations in these studies are of short duration and do not provide vital information regarding either the structural rearrangements that take place during voltage-induced gating transitions or the conformational dynamics of the ion channel. Therefore, an accurate atomistic approach to the problem of cardiotoxicity involving cardiac ion channels, including hERG1, is lacking in the art.
Provided herein is the first comprehensive computational dynamic model of a membrane-bound ion channel that provides an atomistically detailed sampling of the physiologically relevant conformational states of the channel. In certain embodiments, the model is combined with an atomistically detailed high throughput screening algorithm of test compounds in silico to predict cardiotoxicity or risk of cardiotoxicity and to select for compounds with reduced risk of cardiotoxicity.
In certain embodiments, the model and methods disclosed herein can be used to screen a standardized panel of drugs to show that cardiotoxic compounds are blockers of the membrane-bound ion channels disclosed herein, whereas proven safe drugs do not block these channels. In certain embodiments, the model and methods disclosed herein can be used to screen thousands of new candidate drugs in silico, which, compared with testing all compounds in biological assays, greatly accelerates drug development and renders it safer and cheaper.
In certain embodiments, the model and methods disclosed herein can be used to predict compounds that are cardiotoxic or are potentially cardiotoxic, or to identify which chemical moieties of the compounds may be implicated in the toxicity, so that drug developers may avoid using the molecule, or may structurally modify the molecule to address the toxicity concerns.
In certain embodiments, the ion channel used in the computational dynamic model is a tetrameric protein, surrounded by a membrane, ions, solvent or physiological fluid molecules, and optionally, other components of an in vivo system, to simulate the realistic environment of the channel. In certain embodiments, the duration of the computational dynamic model is of sufficient length (e.g., greater than 200 ns) to allow sampling of all physiologically relevant conformational states of the channel, including the open, closed and inactive states.
In certain embodiments, the atomistic detail afforded by the computational dynamic model and high throughput screening algorithm allows a determination of whether a test compound blocks the channel in its preferred binding conformation or conformations. In certain embodiments, a compound that blocks the channel in its preferred binding conformation or conformations is cardiotoxic.
In one aspect, provided herein, is a system and method for selecting a compound with reduced risk of cardiotoxicity. As an example, the system and method can include a computational dynamic model combined with a high throughput screening in silico that mimics ion channels associated with cardiotoxicity, for example, the human Ether-a-go-go Related Gene 1 (hERG1) channel, the hNav1.5 channel, and the hCav1.2 channel. Also provided herein are processor-implemented systems and methods for redesigning compounds that are predicted to be cardiotoxic based on the model and the high throughput screening.
As another example, a processor-implemented system and method includes the steps of: a) using structural information describing the structure of a cardiac ion channel protein; b) performing a molecular dynamics (MD) simulation of the protein structure; c) using a clustering algorithm to identify dominant conformations of the protein structure from the MD simulation; d) selecting the dominant conformations of the protein structure identified from the clustering algorithm; e) providing structural information describing conformers of one or more compounds; f) using a docking algorithm to dock the conformers of the one or more compounds of step e) to the dominant conformations of step d); g) identifying a plurality of preferred binding conformations for each of the combinations of protein and compound; h) optimizing the preferred binding conformations using MD; and i) determining if the compound blocks the ion channel of the protein in the preferred binding conformations; wherein one or more of the steps a) through i) are not necessarily executed in the recited order.
In certain embodiments, one or more of the steps a) through i) of the method are performed in the recited order.
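By way of non-limiting illustration, steps c) and d) above can be sketched as follows. The disclosure does not fix a particular clustering algorithm at this point, so the sketch swaps in a simple greedy "leader" clustering on root-mean-square deviation (RMSD). The names `rmsd` and `dominant_conformations`, the `cutoff` and `top_n` parameters, and the assumption that the MD frames have already been superimposed are all hypothetical choices made for the example.

```python
import math
from typing import List, Sequence, Tuple

# One MD snapshot: a sequence of (x, y, z) atom coordinates.
Frame = Sequence[Tuple[float, float, float]]


def rmsd(a: Frame, b: Frame) -> float:
    """RMSD between two frames that are assumed to be already superimposed."""
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


def dominant_conformations(frames: List[Frame], cutoff: float, top_n: int) -> List[int]:
    """Greedy leader clustering of MD frames.

    Each frame joins the first cluster whose leader lies within `cutoff`;
    otherwise it starts a new cluster. Returns the leader frame indices of
    the `top_n` most populated clusters, most populated first.
    """
    leaders: List[int] = []
    members: List[List[int]] = []
    for i, frame in enumerate(frames):
        for k, lead in enumerate(leaders):
            if rmsd(frame, frames[lead]) <= cutoff:
                members[k].append(i)
                break
        else:
            # No existing cluster is close enough: this frame becomes a leader.
            leaders.append(i)
            members.append([i])
    order = sorted(range(len(leaders)), key=lambda k: len(members[k]), reverse=True)
    return [leaders[k] for k in order[:top_n]]
```

The returned frame indices identify the conformations that would be passed on to the docking of step f). A production workflow would typically superimpose frames before computing RMSD and may use a different clustering method.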
In certain embodiments, the structural information of step a) is a three-dimensional (3D) structure. In certain embodiments, the structural information of step a) is an X-ray crystal structure, an NMR solution structure, or a homology model, as disclosed herein.
In certain embodiments, step e) comprises providing the chemical structure of a compound and determining the conformers of the compound. In certain embodiments, the chemical structure of the compound defines the conformers.
In certain embodiments, if the compound does not block the ion channel in the preferred binding conformations, the compound is selected for further development or possible use in humans, or to be used as a compound for further drug design.
In certain embodiments, steps a) through i) of the method are executed on one or more processors.
In certain embodiments, the cardiac ion channel protein is a membrane-bound protein. In certain embodiments, the cardiac ion channel protein is voltage-gated. In certain embodiments, the cardiac ion channel protein is a sodium, calcium, or potassium ion channel protein. In certain embodiments, the cardiac ion channel protein is a potassium ion channel protein. In certain embodiments, the potassium ion channel protein is hERG1. In certain embodiments, the hERG1 channel is formed as a tetramer through the association of four monomer subunits. In certain embodiments, the potassium ion channel protein is flexible. In certain embodiments, the flexible potassium ion channel protein has greater than 100 variable-sized pockets within the monomer subunits or between the interaction sites of the monomers. In certain embodiments, the cardiac ion channel protein is a sodium ion channel protein. In certain embodiments, the sodium ion channel protein is hNav1.5. In certain embodiments, the cardiac ion channel protein is a calcium ion channel protein. In certain embodiments, the calcium ion channel protein is hCav1.2.
In certain embodiments, the compound is capable of inhibiting hepatitis C virus (HCV) infection. In certain embodiments, the compound is an inhibitor of HCV NS3/4A protease, an inhibitor of HCV NS5B polymerase, or an inhibitor of HCV NS5A protein.
In certain embodiments, the structural information of step a) is a three-dimensional (3D) structure. In certain embodiments, the structural information of step a) is an X-ray crystal structure, an NMR solution structure, or a homology model.
In certain embodiments, the structural information of step a) is subjected to energy minimization (EM) prior to performing the MD simulation of step b). In certain embodiments, the MD simulation of step b) incorporates implicit or explicit solvent molecules and ion molecules. In certain embodiments, the MD simulation of step b) incorporates a hydrated lipid bilayer with explicit phospholipid, solvent and ion molecules. In certain embodiments, the MD simulation uses an AMBER force field, a CHARMM force field, or a GROMACS force field. In certain embodiments, the duration of the MD simulation of step b) is greater than 200 ns. In certain embodiments, the duration of the MD simulation of step b) is 200 ns.
In certain embodiments, the docking algorithm of step f) is DOCK or AutoDock.
In certain embodiments, the MD of step h) uses NAMD software.
In certain embodiments, the method further comprises the step of calculating binding energies for each of the combinations of protein and compound in the corresponding optimized preferred binding conformations. In certain embodiments, the method further comprises the step of selecting for each of the combinations of protein and compound the lowest calculated binding energy in the optimized preferred binding conformations, and outputting the selected calculated binding energies as the predicted binding energies for each of the combinations of protein and compound.
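As a minimal sketch of the selection step just described, the following keeps, for each combination of protein and compound, the lowest calculated binding energy among its optimized preferred binding conformations. The tuple layout, the function name `predicted_binding_energies`, and the use of kcal/mol are assumptions made for the example; how the energies themselves are calculated is outside the sketch.

```python
from typing import Dict, Iterable, Tuple

# (protein identifier, compound identifier, calculated binding energy in kcal/mol)
Pose = Tuple[str, str, float]


def predicted_binding_energies(poses: Iterable[Pose]) -> Dict[Tuple[str, str], float]:
    """Return the lowest binding energy found for each (protein, compound) pair."""
    best: Dict[Tuple[str, str], float] = {}
    for protein, compound, energy in poses:
        key = (protein, compound)
        # More negative energies indicate stronger predicted binding.
        if key not in best or energy < best[key]:
            best[key] = energy
    return best
```

The values output for each pair then serve as the predicted binding energies for that combination of protein and compound.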
In another aspect, provided herein, is a method for predicting cardiotoxicity or risk of cardiotoxicity of a compound.
In certain embodiments of the methods disclosed herein, if the compound does not block the ion channel in the preferred binding conformations, the compound is predicted to have reduced risk of cardiotoxicity. In certain embodiments, if the compound is predicted to have reduced risk of cardiotoxicity, the compound is selected for further development or possible use in humans, or to be used as a compound for further drug design.
In certain embodiments of the methods disclosed herein, if the compound blocks the ion channel in the preferred binding conformations, the compound is predicted to be cardiotoxic. In certain embodiments, if the compound is predicted to be cardiotoxic, the compound is not selected for further clinical development or for use in humans.
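The prediction rule of the two preceding paragraphs can be illustrated with a short sketch. The disclosure does not specify at this point how blocking is determined, so the geometric test below is purely a hypothetical stand-in: the pore is idealized as a cylinder around the z axis, and a pose is treated as blocking if any ligand atom falls inside it. The names `blocks_channel` and `predict_cardiotoxicity`, the parameters `pore_radius`, `z_min` and `z_max`, and the choice to flag a compound when any one pose blocks are all assumptions.

```python
import math
from typing import Iterable, List, Tuple

Atom = Tuple[float, float, float]  # (x, y, z) coordinates of one ligand atom


def blocks_channel(ligand_atoms: Iterable[Atom], pore_radius: float,
                   z_min: float, z_max: float) -> bool:
    """Hypothetical test: True if any atom lies inside the idealized pore cylinder."""
    return any(
        math.hypot(x, y) <= pore_radius and z_min <= z <= z_max
        for x, y, z in ligand_atoms
    )


def predict_cardiotoxicity(poses: List[List[Atom]], pore_radius: float,
                           z_min: float, z_max: float) -> str:
    """Apply the rule: blocking in a preferred pose implies predicted cardiotoxicity."""
    if any(blocks_channel(pose, pore_radius, z_min, z_max) for pose in poses):
        return "predicted cardiotoxic"
    return "predicted reduced risk of cardiotoxicity"
```

A compound for which no preferred binding conformation enters the pore region is reported as having a predicted reduced risk, mirroring the selection logic described above.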
In another aspect, provided herein is a method for chemically modifying a compound that is predicted to be cardiotoxic.
In certain embodiments of the methods disclosed herein, if the compound blocks the ion channel in one of the preferred binding conformations, the method further comprises the step of using a molecular modeling algorithm to chemically modify or redesign the compound such that it does not block the ion channel in any of the preferred binding conformations. In certain embodiments, the method further comprises repeating steps e) through i) for the modified compound.
In another aspect, provided herein are biological methods for testing the cardiotoxicity of the compound or modified compound in an in vitro biological assay or in vivo in a wild type animal or a transgenic animal model.
In certain embodiments, the method further comprises testing the cardiotoxicity of the compound or modified compound in an in vitro biological assay. In certain embodiments, the in vitro biological assay comprises high throughput screening of ion channel and transporter activities. In certain embodiments, the in vitro biological assay comprises high throughput screening of potassium ion channel and transporter activities. In certain embodiments, the in vitro biological assay is a hERG1 channel inhibition assay. In certain embodiments, the in vitro biological assay is a FluxOR™ potassium ion channel assay. In certain embodiments, the FluxOR™ potassium channel assay is performed on HEK 293 cells stably expressing hERG1 or mouse cardiomyocyte cell line HL-1 cells. In certain embodiments, the in vitro biological assay comprises electrophysiology measurements in single cells. In certain embodiments, the electrophysiology measurements in single cells comprise patch clamp measurements. In certain embodiments, the single cells are Chinese hamster ovary cells stably transfected with hERG1. In certain embodiments, the in vitro biological assay is a Cloe Screen IC50 hERG1 Safety assay.
In certain embodiments, the method further comprises testing the cardiotoxicity of the compound or modified compound in vivo by measuring ECG in a wild type animal, for example a wild type mouse, or a transgenic animal model, for example, a transgenic mouse model expressing human hERG1.
In another aspect, provided herein is a processor-implemented system for designing a compound in order to reduce risk of cardiotoxicity. The system includes one or more computer-readable mediums, a grid computing system, and a data structure. The one or more computer-readable mediums are for storing protein structural information representative of a cardiac ion channel protein and for storing compound structural information describing conformers of the compound. The grid computing system includes a plurality of processor-implemented compute nodes and a processor-implemented central coordinator, said grid computing system receiving the stored protein structural information and the stored compound structural information from the one or more computer-readable mediums. Said grid computing system uses the received protein structural information to perform molecular dynamics simulations for determining configurations of target protein flexibility over a simulation length of greater than 50 ns. The molecular dynamics simulations involve each of the compute nodes determining forces acting on an atom based upon an empirical force field that approximates intramolecular forces, where numerical integration is performed to update positions and velocities of atoms. The central coordinator forms molecular dynamic trajectories based upon the updated positions and velocities of the atoms as determined by each of the compute nodes. Said grid computing system is configured to: cluster the molecular dynamic trajectories into dominant conformations of the protein, execute a docking algorithm that uses the compound's structural information in order to dock the compound's conformers to the dominant conformations of the protein, and identify a plurality of preferred binding conformations for each of the combinations of protein and compound based on information related to the docked compound's conformers.
The data structure, which is stored in memory, includes information about whether one or more of the identified plurality of preferred binding conformations block the ion channel of the protein. Based upon the information about blocking the ion channel, the compound is redesigned in order to reduce risk of cardiotoxicity.
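The force evaluation and numerical integration performed by the compute nodes can be illustrated in one dimension. The system description above does not name an integrator, so the sketch swaps in the widely used velocity Verlet scheme, and it uses a single harmonic bond term as a stand-in for an empirical force field. The function names, the units, and the single-particle setup are assumptions made for the example.

```python
from typing import Callable, List, Tuple


def harmonic_force(k: float, x0: float) -> Callable[[float], float]:
    """Force from a harmonic bond term, F(x) = -k * (x - x0)."""
    return lambda x: -k * (x - x0)


def velocity_verlet(x: float, v: float, mass: float,
                    force: Callable[[float], float],
                    dt: float, n_steps: int) -> List[Tuple[float, float]]:
    """Integrate one particle; returns the trajectory as (position, velocity) pairs."""
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory
```

In the system described above, each compute node would evaluate forces for its share of atoms, and the central coordinator would assemble the updated positions and velocities into the trajectory; the sketch collapses both roles into one loop.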
In another aspect, provided herein, is a computer-implemented system for selecting a compound with reduced risk of cardiotoxicity which includes one or more data processors and a computer-readable storage medium encoded with instructions for commanding the one or more data processors to execute certain operations. The operations include: a) using structural information describing the structure of a cardiac ion channel protein; b) performing a molecular dynamics (MD) simulation of the protein structure; c) using a clustering algorithm to identify dominant conformations of the protein structure from the MD simulation; d) selecting the dominant conformations of the protein structure identified from the clustering algorithm; e) providing structural information describing conformers of one or more compounds; f) using a docking algorithm to dock the conformers of the one or more compounds of step e) to the dominant conformations of step d); g) identifying a plurality of preferred binding conformations for each of the combinations of protein and compound; h) optimizing the preferred binding conformations using MD; and i) determining if the compound blocks the ion channel of the protein in the preferred binding conformations. If the compound blocks the ion channel in the preferred binding conformations, the compound is predicted to be cardiotoxic. If the compound does not block the ion channel in the preferred binding conformations, the compound is predicted to have reduced risk of cardiotoxicity. Based on a prediction that the compound has reduced risk of cardiotoxicity, the compound is selected.
In certain embodiments, a computer-implemented system for selecting a compound with reduced risk of cardiotoxicity includes: one or more computer memories and one or more data processors. The one or more computer memories are for storing a single computer database having a database schema that contains and interrelates protein-structural-information fields, compound-structural-information fields, and preferred-binding-conformation fields. The protein-structural-information fields are contained within the database schema and configured to store protein structural information representative of a cardiac ion channel protein. The compound-structural-information fields are contained within the database schema and are configured to store compound structural information describing conformers of one or more compounds. The preferred-binding-conformation fields are contained within the database schema and are configured to store information related to one or more preferred binding conformations for each combination of protein and compound determined based at least in part on information in the protein-structural-information fields and the compound-structural-information fields. The one or more data processors are configured to: process a database query that operates over data related to the protein-structural-information fields, the compound-structural-information fields, and the preferred-binding-conformation fields and determine whether the one or more compounds are cardiotoxic by using information in the preferred-binding-conformation fields.
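One possible shape for such a single-database schema is sketched below using an in-memory SQLite database. All table and column names, the storage of coordinates as text, and the `blocks_channel` flag are hypothetical; the sketch only illustrates how the three groups of fields could be interrelated and queried together.

```python
import sqlite3
from typing import List

SCHEMA = """
CREATE TABLE protein_structure (
    protein_id   INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    coordinates  TEXT NOT NULL
);
CREATE TABLE compound_conformer (
    conformer_id  INTEGER PRIMARY KEY,
    compound_name TEXT NOT NULL,
    coordinates   TEXT NOT NULL
);
CREATE TABLE preferred_binding_conformation (
    pose_id        INTEGER PRIMARY KEY,
    protein_id     INTEGER NOT NULL REFERENCES protein_structure(protein_id),
    conformer_id   INTEGER NOT NULL REFERENCES compound_conformer(conformer_id),
    blocks_channel INTEGER NOT NULL  -- 1 if the pose blocks the channel, else 0
);
"""


def build_database() -> sqlite3.Connection:
    """Create an empty in-memory database with the illustrative schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def predicted_cardiotoxic(conn: sqlite3.Connection, protein_name: str) -> List[str]:
    """Names of compounds with at least one blocking pose against the given protein."""
    rows = conn.execute(
        """
        SELECT DISTINCT c.compound_name
        FROM preferred_binding_conformation AS b
        JOIN protein_structure AS p ON p.protein_id = b.protein_id
        JOIN compound_conformer AS c ON c.conformer_id = b.conformer_id
        WHERE p.name = ? AND b.blocks_channel = 1
        ORDER BY c.compound_name
        """,
        (protein_name,),
    )
    return [row[0] for row in rows]
```

The query joins the preferred-binding-conformation records to both the protein and compound records, which is one way a single query can operate over all three groups of fields.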
In certain embodiments, a non-transitory computer-readable storage medium is provided for storing data for access by a compound-selection program which is executed on a data processing system. The storage medium includes a protein-structural-information data structure, a candidate-compound-structural-information data structure, a molecular-dynamics-simulations data structure, a dominant-conformations data structure, and a binding-conformations data structure. The protein-structural-information data structure has access to information stored in a database and includes protein structural information representative of a cardiac ion channel protein. The candidate-compound-structural-information data structure has access to information stored in the database and includes compound structural information describing conformers of one or more compounds. The molecular-dynamics-simulations data structure has access to information stored in the database and includes configuration information of target protein flexibility determined by performing molecular dynamics simulations on the protein structural information. The dominant-conformations data structure has access to information stored in the database and is determined by using a first clustering algorithm based at least in part on the configuration information of target protein flexibility. The binding-conformations data structure has access to information stored in the database and includes information related to one or more combinations of protein and compound determined by using a docking algorithm based at least in part on the compound structural information and the one or more dominant conformations, one or more preferred binding conformations being determined by using a second clustering algorithm based at least in part on the information related to the one or more combinations of protein and compound. A compound is selected if the compound does not block the ion channel in the preferred binding conformations.
As used herein, the term “cardiotoxic” or “cardiotoxicity” refers to having a toxic effect on the heart, for example, by a compound having a deleterious effect on the action of the heart, due to poisoning of the cardiac muscle or of its conducting system. In certain embodiments, long Q-T syndrome or “LQTS” is an aspect of cardiotoxicity.
As used herein, the term “reduced cardiotoxicity” refers to a favorable cardiotoxicity profile with reference to, for example, one or more ion channel proteins disclosed herein. In certain embodiments, a “ligand,” “compound” or “drug,” as defined herein, has reduced cardiotoxicity if it does not inhibit one or more ion channel proteins (e.g., potassium ion channel proteins, such as hERG or hERG1, sodium ion channel proteins, such as hNav1.5, and calcium ion channel proteins, such as hCav1.2) disclosed herein. In certain embodiments, a ligand, compound or drug has reduced cardiotoxicity if it does not inhibit “hERG” or “hERG1.” In certain embodiments, a ligand, compound or drug has reduced cardiotoxicity if it does not inhibit “hNav1.5.” In certain embodiments, a ligand, compound or drug has reduced cardiotoxicity if it does not inhibit “hCav1.2.” In certain embodiments, a ligand, compound or drug has reduced cardiotoxicity if it does not block, obstruct, or partially obstruct, the channel of one or more ion channel proteins (e.g., potassium ion channel proteins, such as hERG or hERG1, sodium ion channel proteins, such as hNav1.5, and calcium ion channel proteins, such as hCav1.2) disclosed herein. In certain embodiments, a ligand, compound or drug has reduced cardiotoxicity if it is not a “blocker,” as defined herein. In certain embodiments, a ligand, compound or drug has reduced cardiotoxicity if it does not block, obstruct, or partially obstruct, the hERG or hERG1 channel, as defined herein. In certain embodiments, a ligand, compound or drug has reduced cardiotoxicity if it does not block, obstruct, or partially obstruct, the hNav1.5 channel, as defined herein. In certain embodiments, a ligand, compound or drug has reduced cardiotoxicity if it does not block, obstruct, or partially obstruct, the hCav1.2 channel, as defined herein. In certain embodiments, a ligand, compound or drug has reduced cardiotoxicity if it is not a blocker of hERG or hERG1. 
In certain embodiments, a ligand, compound or drug has reduced cardiotoxicity if it is not a blocker of hNav1.5. In certain embodiments, a ligand, compound or drug has reduced cardiotoxicity if it is not a blocker of hCav1.2.
As used herein, the terms “reducing risk” or “reduced risk” as they apply to cardiotoxicity (e.g., “reduced risk of cardiotoxicity”) refer to observable results which tend to demonstrate an improved cardiotoxicity profile with reference to, for example, one or more ion channel proteins disclosed herein. In certain embodiments, a ligand, compound or drug has a reduced risk of cardiotoxicity if it does not block, obstruct, or partially obstruct, the channel of one or more ion channel proteins disclosed herein. In certain embodiments, a ligand, compound or drug has a reduced risk of cardiotoxicity if it is not a blocker. In certain embodiments, a ligand, compound or drug has a reduced risk of cardiotoxicity if it does not block, obstruct, or partially obstruct, the hERG or hERG1 channel. In certain embodiments, a ligand, compound or drug has a reduced risk of cardiotoxicity if it is not a blocker of hERG or hERG1. In certain embodiments, a ligand, compound or drug has a reduced risk of cardiotoxicity if it does not block, obstruct, or partially obstruct, the hNav1.5 channel. In certain embodiments, a ligand, compound or drug has a reduced risk of cardiotoxicity if it is not a blocker of hNav1.5. In certain embodiments, a ligand, compound or drug has a reduced risk of cardiotoxicity if it does not block, obstruct, or partially obstruct, the hCav1.2 channel. In certain embodiments, a ligand, compound or drug has a reduced risk of cardiotoxicity if it is not a blocker of hCav1.2. In certain embodiments, risk is reduced if there is at least about a 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 100% decrease (as measured, e.g., by IC50 data from in vitro biological assays) in the ability of the ligand, compound or drug to inhibit the channel of one or more ion channel proteins disclosed herein. 
In certain embodiments, a reduction in the risk of cardiotoxicity by at least about 90% indicates that cardiotoxicity has been eliminated with respect to one or more of the ion channel proteins disclosed herein. In certain embodiments, a ligand, compound or drug has a reduced risk of cardiotoxicity if its calculated binding energies, as defined herein, to the one or more ion channel proteins, disclosed herein, compare to physiologically relevant concentrations of greater than or equal to 100 μM. In certain embodiments, a ligand, compound or drug has a reduced risk of cardiotoxicity if its “selectivity index (SI),” as defined herein, is greater than about 100, about 1000 or about 10,000.
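To make the comparison between a calculated binding energy and a concentration concrete, the sketch below assumes the standard thermodynamic relation between binding free energy and dissociation constant, ΔG = RT ln(Kd / 1 M). The disclosure does not state here which relation is used, so this relation, the temperature of 298.15 K, the kcal/mol units, and the function names are all assumptions. Under them, 100 μM corresponds to roughly -5.46 kcal/mol, and less negative energies correspond to higher concentrations.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_energy_from_kd(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Binding free energy (kcal/mol) for a dissociation constant given in mol/L."""
    return R_KCAL * temperature_k * math.log(kd_molar)


def kd_from_binding_energy(dg_kcal: float, temperature_k: float = 298.15) -> float:
    """Dissociation constant (mol/L) implied by a binding free energy in kcal/mol."""
    return math.exp(dg_kcal / (R_KCAL * temperature_k))


def has_reduced_risk(dg_kcal: float, threshold_molar: float = 100e-6) -> bool:
    """True if the energy corresponds to a concentration at or above the threshold."""
    return kd_from_binding_energy(dg_kcal) >= threshold_molar
```

For example, under these assumptions a calculated binding energy of -4.0 kcal/mol would satisfy the 100 μM criterion, whereas -9.0 kcal/mol would not.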
As used herein, the term “LQTS” refers to long Q-T syndrome, a group of disorders that increase the risk for sudden death due to an abnormal heartbeat. The QT of LQTS refers to an interval between two points (Q and T) on the common electrocardiogram (ECG, EKG) used to record the electrical activity of the heart. This electrical activity, in turn, is the result of ions such as sodium and potassium passing through ion channels in the membranes surrounding heart cells. A prolonged QT interval indicates an abnormality in electrical activity that leads to irregularities in heart muscle contraction. One of these irregularities is a specific pattern of very rapid contractions (tachycardia) of the lower chambers of the heart called torsade de pointes, a type of ventricular tachycardia. The rapid contractions, which are not effective in pumping blood to the body, result in a decreased flow of oxygen-rich blood to the brain. This can result in a sudden loss of consciousness (syncope) and death.
As used herein, the term “lipid bilayer” refers to the basic structure of a cell membrane comprising a double layer of phospholipid molecules. Lipid bilayers are particularly impermeable to ions (such as potassium ions, sodium ions, and calcium ions).
As used herein, the term “hydrated lipid bilayer” refers to a lipid bilayer in the presence of water molecules.
As used herein, the term “ion channel” or “ion channel protein,” refers to a membrane bound protein that acts as a pore (e.g., permeation pore) in a cell membrane and permits the selective passage of ions (such as potassium ions, sodium ions, and calcium ions), by means of which electrical current passes in and out of the cell. Such ion channel proteins include, for example, potassium ion channel proteins, such as hERG or hERG1, sodium ion channel proteins, such as hNav1.5, and calcium ion channel proteins, such as hCav1.2. In certain embodiments, an ion channel or ion channel protein comprises an inner cavity and a selectivity filter (see, e.g.,
One of ordinary skill in the art will understand that there are several possible ways to classify ion channels into groups, as described herein (see, e.g., TABLES 1-4). For instance, ion channels can be classified (1) by gating, where gating is the conformational change of the channel between the closed, open and inactivated states, such that (a) voltage-gated ion channels are controlled by the voltage gradient across the membrane (e.g., voltage-gated potassium channels, voltage-gated sodium channels, and voltage-gated calcium channels, etc.), and (b) ligand-gated ion channels are regulated by conformational changes induced by ligands; and (2) by ion, where channels are categorized by the species of ions passing through them (e.g., potassium ion channels, sodium ion channels, and calcium ion channels, etc.).
As used herein, the term “transporter activity,” when used in relation to an “ion channel” or “ion channel protein,” refers to the movement of an ion across a cell membrane.
As used herein, the term “potassium ion channel” or “potassium ion channel protein,” refers to an ion channel that permits the selective passage of potassium ions (K+).
As used herein, the term “sodium ion channel” or “sodium ion channel protein,” refers to an ion channel that permits the selective passage of sodium ions (Na+).
As used herein, the term “calcium ion channel” or “calcium ion channel protein,” refers to an ion channel that permits the selective passage of calcium ions (Ca+2).
As used herein, the term “membrane bound protein” refers to any protein that is bound to a cell membrane under physiological pH and salt concentrations. In certain embodiments, binding of the membrane bound protein can be either by direct binding to the phospholipid bilayer or by binding to a protein, glycoprotein, or other intermediary that is bound to the membrane.
As used herein, the term “voltage-gated channel” or “voltage-gated ion channel” refers to a class of transmembrane ion channels that are activated by changes in electrical potential difference near the channel. In certain embodiments, the voltage-gated ion channel is a voltage-gated potassium channel. In certain embodiments, the voltage-gated ion channel is a voltage-gated sodium channel. In certain embodiments, the voltage-gated ion channel is a voltage-gated calcium channel.
As used herein, the term “voltage-gated potassium channel,” “voltage-gated potassium ion channel” or “voltage-gated potassium ion (K+) channel” refers to a transmembrane channel specific for potassium and sensitive to voltage changes in the cell's membrane potential.
As used herein, the term “voltage-gated sodium channel,” “voltage-gated sodium ion channel” or “voltage-gated sodium ion (Na+) channel” refers to a transmembrane channel specific for sodium and sensitive to voltage changes in the cell's membrane potential.
As used herein, the term “voltage-gated calcium channel,” “voltage-gated calcium ion channel” or “voltage-gated calcium ion (Ca+2) channel” refers to a transmembrane channel specific for calcium and sensitive to voltage changes in the cell's membrane potential.
As used herein, the term “human ERG,” “human ERG1,” “hERG” or “hERG1” refers to the human Ether-a-go-go-Related Gene of chromosome 7q36.1 that codes for a protein known as Kv11.1, the alpha (α) subunit of potassium voltage-gated channel, subfamily H (eag-related), member 2. It will be known to those of ordinary skill in the art that hERG or hERG1 may also be referred to by other names, such as erg1, ERG1, KCNH2, Kv11.1, LQT2, and SQT1. See, for example, “KCNH2 potassium voltage-gated channel, subfamily H (eag-related), member 2 [Homo sapiens (human)],” Gene ID: 3757, updated 3-Nov-2013, http://www.ncbi.nlm.nih.gov/gene/3757. As used herein, the term “hERG” or “hERG1” refers interchangeably to the gene and the gene product, Kv11.1. It will further be known to those of ordinary skill in the art that the functional hERG1 channel is a homo-tetramer of four identical monomer α-subunits (e.g., the hERG1 monomer subunits), as disclosed herein.
As used herein, the term “human Nav1.5” or “hNav1.5” refers to the sodium ion channel protein that in humans is encoded by the SCN5A gene. It will be known to those of ordinary skill in the art that the functional hNav1.5 channel is comprised of a single pore-forming α subunit and ancillary β subunits, where the α subunit consists of four structurally homologous transmembrane domains designated DI-DIV, as disclosed herein.
As used herein, the term “human Cav1.2” or “hCav1.2” refers to the calcium ion channel protein that in humans is encoded by the CACNA1C gene. It will be known to those of ordinary skill in the art that the functional hCav1.2 channel is comprised of α-1, α-2/δ and β subunits in a 1:1:1 ratio, as disclosed herein.
As used herein, the term “protein structure” refers to the three-dimensional structure of a protein. The structure of a protein is characterized in four ways. The primary structure is the order of the different amino acids in a protein chain, whereas the secondary structure consists of the geometry of chain segments in forms such as helices or sheets. The tertiary structure describes how a protein folds in on itself; the quaternary structure of a protein describes how different protein monomers or monomer subunits fold in relation to each other.
As used herein, the term “monomer” or “monomer subunit” refers to one of the proteins making up the quaternary structure of a macromolecule.
As used herein, the term “tetramer” refers to a macromolecule, for example, a protein macromolecule, made up of four monomer subunits. An example of a tetramer is the hERG1 tetramer comprised of four hERG1 monomer subunits. Tetrameric assembly into a quaternary structure is required for the formation of the functional hERG1 channel.
As used herein, the term “structural information” refers to the three dimensional structural coordinates of the atoms within a macromolecule, for example, a protein macromolecule such as hERG1.
As used herein, the term “three-dimensional (3D) structure” refers to the Cartesian coordinates corresponding to an atom's spatial relationship to other atoms in a macromolecule, for example, a protein macromolecule such as hERG1. Structural coordinates may be obtained using NMR techniques, as known in the art, or using x-ray crystallography as is known in the art. Alternatively, structural coordinates can be derived using molecular replacement analysis or homology modeling. Various software programs allow for the graphical representation of a set of structural coordinates to obtain a three dimensional representation of a molecule or molecular complex.
As used herein, the term “dynamics,” when applied to macromolecules and macromolecular structures, refers to the relative motion of one part of the molecular structure with respect to another. Examples include, but are not limited to: vibrations, rotations, stretches, domain motions, hinge motions, shear motions, torsion, and the like. Dynamics may also include motions such as translations, rotations, collisions with other molecules, and the like.
As used herein, the term “flexible” or “flexibility,” when applied to macromolecules and macromolecular structures defined by structural coordinates, refers to a certain degree of internal motion about these coordinates, e.g., it may allow for bond stretching, rotation, etc.
As used herein, the term “molecular modeling algorithm” refers to computational approaches for predicting the structure of macromolecules. For instance, these may comprise comparative protein modeling methods including homology modeling methods or protein threading modeling methods, and may further comprise ab initio or de novo protein modeling methods, or a combination of any such approaches.
As used herein, the term “computational dynamic model” refers to a computer-based model of a system that provides dynamics information of the system. In certain embodiments, when the system is a biological system, for example, a macromolecule or macromolecular structure, the computational dynamic model provides information on the vibrations, rotations, stretches, domain motions, hinge motions, shear motions, torsion, translations, collisions with other molecules, and the like, exhibited by the system in the relevant time scale examined by the model.
As used herein, the term “molecular simulation” refers to a computer-based method to predict the functional properties of a system, including, for example, thermodynamic properties, thermochemical properties, spectroscopic properties, mechanical properties, transport properties, and morphological information. In certain embodiments, the molecular simulation is a molecular dynamics (MD) simulation.
As used herein, the term “molecular dynamics simulation” (MD or MD simulation) refers to computer-based molecular simulation methods in which the time evolution of a set of interacting atoms, groups of atoms or molecules, including macromolecules, is followed by integrating their equations of motion. The atoms or molecules are allowed to interact for a period of time, giving a view of the motion of the atoms or molecules. Thus, the MD simulation may be used to sample conformational space over time to predict the lowest-energy, most populated members of a conformational ensemble. Typically, the trajectories of atoms and molecules are determined by numerically solving Newton's equations of motion for a system of interacting particles, where the forces between the particles and the potential energy are defined by molecular mechanics force fields. However, MD simulations incorporating principles of quantum mechanics and hybrid classical-quantum mechanics simulations are also available and may be contemplated herein.
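As a minimal sketch of the numerical integration described above, the following Python example propagates a single particle in a one-dimensional harmonic potential using the velocity Verlet scheme. The potential, the parameter values, and the names `force`, `velocity_verlet`, and `total_energy` are illustrative assumptions; the disclosure does not specify this integrator or force field, and a production MD simulation would rely on a molecular mechanics force field and a dedicated MD package.

```python
# Illustrative sketch only: velocity Verlet integration of Newton's equations
# of motion for one particle in a 1-D harmonic potential U(x) = 0.5 * k * x**2.
# The potential and all parameter values are assumptions chosen for clarity.


def force(x, k=1.0):
    """Force on the particle: the negative gradient of the potential energy."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy; approximately conserved by the integrator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


def velocity_verlet(x, v, mass=1.0, k=1.0, dt=0.01, n_steps=1000):
    """Propagate position and velocity; return (list of positions, final velocity)."""
    trajectory = [x]
    f = force(x, k)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x, k)                    # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append(x)
    return trajectory, v


if __name__ == "__main__":
    positions, v_end = velocity_verlet(x=1.0, v=0.0)
    print("initial energy:", total_energy(1.0, 0.0))
    print("final energy:  ", total_energy(positions[-1], v_end))
```

The list of positions returned by `velocity_verlet` plays the role of a trajectory: a time-ordered record of configurations that can later be analyzed, for example by clustering.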
As used herein, the term “scalable molecular dynamics” (scalable MD) refers to computational simulation methods which are suitably efficient and practical when applied to large situations (e.g., a large input data set, a large number of outputs or users, or a large number of participating nodes in the case of a distributed system). In certain embodiments, the methods disclosed herein use scalable MD for simulation of the large systems disclosed herein, for example, the hERG1 tetramer in a hydrated lipid bilayer with explicit phospholipid, solvent and ion molecules, free, or bound to ligand.
As used herein, the term “energy minimization” (EM) refers to computational methods for computing stable states of interacting atoms, groups of atoms or molecules, including macromolecules, corresponding to global and local minima on their potential energy surface. Starting from a non-equilibrium molecular geometry, EM employs the mathematical procedure of optimization to move atoms so as to reduce the net forces (the negative gradients of the potential energy) on the atoms until they become negligible.
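The optimization procedure described above can be sketched with a steepest-descent loop. The example below minimizes the separation of two particles interacting through a Lennard-Jones potential in reduced units; the choice of potential, the fixed step size, the force tolerance, and the function names are assumptions made for illustration and are not taken from the disclosure.

```python
# Illustrative sketch only: steepest-descent energy minimization of the
# distance r between two particles with a Lennard-Jones interaction.
# Reduced units (epsilon = sigma = 1) and the step size are assumptions.


def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones potential energy U(r) = 4*eps*((s/r)**12 - (s/r)**6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_gradient(r, epsilon=1.0, sigma=1.0):
    """dU/dr; the force on the coordinate is the negative of this value."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (-12.0 * sr6 * sr6 + 6.0 * sr6) / r


def steepest_descent(r0, step=0.01, force_tol=1e-6, max_steps=10000):
    """Move r downhill along the force until the force is negligible.

    Returns (minimized r, number of steps taken).
    """
    r = r0
    for n in range(max_steps):
        grad = lj_gradient(r)
        if abs(grad) < force_tol:      # net force is negligible: converged
            return r, n
        r -= step * grad               # step along the force (minus gradient)
    return r, max_steps


if __name__ == "__main__":
    r_min, n_steps = steepest_descent(1.5)
    print(r_min, n_steps, lj_energy(r_min))
```

Starting from a non-equilibrium separation of 1.5, the loop converges toward the analytical minimum at r = 2^(1/6) in these reduced units. Like any local optimizer, it finds the nearest minimum, not necessarily the global one.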
As used herein, the terms “ligand,” “compound” and “drug” are used interchangeably, and refer to any small molecule which is capable of binding to a target receptor, such as an ion channel protein, for example, hERG1. In certain embodiments, the ligand, compound or drug is a “blocker,” as defined herein.
As used herein, the term “dock” or “docking” refers to using a model of a ligand and receptor to simulate association of the ligand-receptor at a proximity sufficient for at least one atom of the ligand to be within bonding distance of at least one atom of the receptor. The term is intended to be consistent with its use in the art pertaining to molecular modeling. A model included in the term can be any of a variety of known representations of a molecule including, for example, a graphical representation of its three-dimensional structure, a set of coordinates, set of distance constraints, set of bond angle constraints or set of other physical or chemical properties or combinations thereof. In certain embodiments, the ligand is a compound, for example a small molecule, and the receptor is a protein macromolecule, for example, hERG1.
As used herein, the term “docking algorithm” refers to computational approaches for predicting the energetically preferred orientation of a ligand to a receptor when bound or docked to each other to form a stable ligand-receptor complex. Knowledge of the preferred orientation in turn may be used to predict the strength of association or binding affinity between ligand and receptor using, for example, scoring functions. In certain embodiments, the ligand is a compound, for example a small molecule, and the receptor is a protein macromolecule, for example, hERG1.
As used herein, the term “drug design” or “rational drug design” refers to methods or processes of discovering new drugs based on the knowledge of a biological target. In certain embodiments of the methods disclosed herein, the biological target is a protein macromolecule, for example, hERG1. Those of ordinary skill in the art will appreciate that drug design that relies on the knowledge of the three-dimensional structure of the biomolecular target is also known as “structure-based drug design.” Those of ordinary skill in the art will also understand that drug design may rely on computer modeling techniques, which type of modeling is often referred to as “computer-aided drug design.”
As used herein, the term “binding conformations” refers to the orientation of a ligand to a receptor when bound or docked to each other.
As used herein, the term “dominant conformation” or “dominant conformations” refers to most highly populated orientation(s) of a ligand to a receptor when bound or docked to each other. In certain embodiments, when applied to the trajectories of the MD simulations disclosed herein, a clustering algorithm is used to determine the “dominant conformation” or “dominant conformations.”
As used herein, the term “clustering algorithm,” when applied to a trajectory of the MD simulations disclosed herein, refers to computational approaches for grouping similar conformations in the trajectory into clusters.
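As a hedged illustration of how similar conformations in a trajectory might be grouped and the dominant conformation identified, the sketch below applies a simple leader (distance-threshold) clustering to toy coordinate frames. The RMSD is computed without structural superposition, and the cutoff, the toy data, and the function names are all assumptions; the disclosure does not state which clustering algorithm or distance measure is used.

```python
# Illustrative sketch only: leader (threshold) clustering of trajectory frames
# by RMSD, followed by selection of the most populated cluster. No structural
# superposition is performed; frames are assumed to be pre-aligned.
import math


def rmsd(frame_a, frame_b):
    """Root-mean-square deviation between two frames of (x, y, z) tuples."""
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(frame_a, frame_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(frame_a))


def leader_cluster(frames, cutoff):
    """Assign each frame to the first cluster whose leader is within cutoff."""
    clusters = []  # each cluster: {"leader": frame index, "members": [indices]}
    for index, frame in enumerate(frames):
        for cluster in clusters:
            if rmsd(frame, frames[cluster["leader"]]) <= cutoff:
                cluster["members"].append(index)
                break
        else:
            # No existing leader is close enough: start a new cluster.
            clusters.append({"leader": index, "members": [index]})
    return clusters


def dominant_cluster(clusters):
    """Return the most populated cluster (the 'dominant conformation')."""
    return max(clusters, key=lambda cluster: len(cluster["members"]))


if __name__ == "__main__":
    # Toy two-atom "trajectory": three similar frames and one outlier.
    toy_frames = [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.0, 0.0, 0.1), (1.0, 0.0, 0.0)],
        [(5.0, 0.0, 0.0), (6.0, 0.0, 0.0)],
        [(0.0, 0.1, 0.0), (1.0, 0.0, 0.0)],
    ]
    found = leader_cluster(toy_frames, cutoff=0.5)
    print(dominant_cluster(found)["members"])
```

Leader clustering depends on frame order and on the cutoff; it is used here only because it is short enough to read in full.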
As used herein, the term “preferred binding conformation” refers to the energetically preferred orientation of a ligand to a receptor when bound or docked to each other to form a stable ligand-receptor complex.
As used herein, the term “optimized preferred binding conformation” refers to the energetically preferred orientation of a ligand to a receptor when bound or docked to each other to form a stable ligand-receptor complex, following optimizing the preferred binding conformations using MD.
As used herein, the term “binding energies” is understood to mean the “free energy of binding” (ΔG°) of a ligand to a receptor. Under equilibrium conditions, this binding energy is given by ΔG° = ΔH° − TΔS° = −RT ln(Keq), where the symbols have their customary meanings. In certain embodiments, the methods disclosed herein allow calculation of binding energies for various ligand-receptor complexes, for example, various compounds bound to hERG1.
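For orientation, the relation between the equilibrium constant and the free energy of binding can be evaluated numerically. The sketch below assumes the natural logarithm, a gas constant expressed in kcal/(mol·K), a temperature of 298.15 K, and a 1 M standard state so that Keq may be taken as the reciprocal of a dissociation constant in molar units; the function names and the example value are hypothetical and are not results from the disclosure.

```python
# Illustrative sketch only: converting an equilibrium (association) constant
# into a standard free energy of binding via dG = -R * T * ln(Keq).
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal/(mol*K)


def binding_free_energy(k_eq, temperature=298.15):
    """Standard free energy of binding in kcal/mol for association constant k_eq."""
    return -GAS_CONSTANT_KCAL * temperature * math.log(k_eq)


def free_energy_from_kd(kd_molar, temperature=298.15):
    """Same quantity from a dissociation constant (mol/L), assuming Keq = 1/Kd."""
    return binding_free_energy(1.0 / kd_molar, temperature)


if __name__ == "__main__":
    # Hypothetical ligand with a 1 micromolar dissociation constant.
    print(round(free_energy_from_kd(1e-6), 2), "kcal/mol")
```

A more negative value indicates tighter binding; under these assumptions each tenfold decrease in Kd lowers ΔG° by roughly 1.4 kcal/mol at room temperature.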
As used herein, the terms “IC50” and “IC90” refer to the concentration of a compound that reduces (e.g., inhibits) the enzyme activity of a target by 50% and 90%, respectively. The term “IC50” generally describes the inhibitory concentration of the compound. Typically, measurements of IC50 and IC90 are made in vitro. In certain embodiments, where the target is a secondary biological target, for example, a membrane-bound ion channel implicated in cardiac cytotoxicity (e.g., hERG1), IC50 is the concentration at which 50% inhibition is observed. IC50's and IC90's can be measured according to any method known to one of ordinary skill in the art.
As used herein, the terms “EC50” and “EC90” refer to the plasma concentration/AUC of a compound that reduces (e.g., inhibits) the cellular effect resulting from enzyme activity by 50% and 90%, respectively. The term “EC50” generally describes the effective dose of the compound. In certain embodiments, where the target is a primary biological target, for example, a viral protein (e.g., HCV NS3/4A protease, HCV NS5B polymerase, or HCV NS5a protein), EC50 is the dose of the compound that inhibits viral replication by 50%. EC50's and EC90's can be measured according to any method known to one of ordinary skill in the art.
As used herein, the terms “CC50” and “CC90” refer to the concentration of a compound that reduces the number of viable cells (e.g., kills the cells) compared to that for untreated controls, by 50% and 90%, respectively. The term “CC50” generally describes the concentration of the compound that is cytotoxic to cells. In certain embodiments, where the target is a primary biological target, for example, a viral protein (e.g., HCV NS3/4A protease, HCV NS5B polymerase, or HCV NS5a protein), CC50 is the dose of the compound that is cytotoxic to uninfected cells. In certain embodiments, where the target is a secondary biological target, for example, a membrane-bound ion channel implicated in cardiac cytotoxicity (e.g., hERG1), CC50 is the dose of the compound that is cytotoxic to heart cells. In certain embodiments, the methods disclosed herein select for compounds with reduced risk of cardiotoxicity, but which retain strong biological activity to their primary targets. For example, such compounds may have high EC50 values for the secondary biological target (e.g., hERG1), high CC50 values for uninfected cells, but low EC50 values against the primary biological target (e.g., HCV NS3/4A protease, HCV NS5B polymerase, or HCV NS5a protein). CC50's and CC90's can be measured according to any method known to one of ordinary skill in the art.
As used herein, the term “selectivity index” (“SI”) refers to the ratio of the CC50 for cardiotoxicity with reference to a secondary biological target (e.g., hERG1) and to uninfected cells compared to the EC50 for effectiveness with reference to a primary biological target (e.g., HCV NS3/4A protease, HCV NS5B polymerase, or HCV NS5a protein). In certain embodiments, the methods disclosed herein select for compounds that display SI values greater than about 100. In certain embodiments, the methods disclosed herein select for compounds that display SI values greater than about 1000. In certain embodiments, the methods disclosed herein select for compounds that display SI values greater than about 10,000.
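The ratio defined above reduces to simple arithmetic, sketched below. The function names, the requirement that both values share the same concentration units, and the example numbers are assumptions for illustration; the thresholds of 100, 1000, and 10,000 are the ones recited above, treated here as exact cut-offs even though the text says “about.”

```python
# Illustrative sketch only: selectivity index SI = CC50 / EC50 and the highest
# recited threshold (100, 1000, or 10,000) that a given SI exceeds.


def selectivity_index(cc50, ec50):
    """Ratio of CC50 to EC50; both must be positive and in the same units."""
    if cc50 <= 0 or ec50 <= 0:
        raise ValueError("CC50 and EC50 must be positive")
    return cc50 / ec50


def highest_threshold_exceeded(si, thresholds=(10_000, 1_000, 100)):
    """Return the largest threshold that si exceeds, or None if it exceeds none."""
    for threshold in thresholds:
        if si > threshold:
            return threshold
    return None


if __name__ == "__main__":
    # Hypothetical compound: CC50 of 50 uM and EC50 of 0.01 uM.
    si = selectivity_index(50.0, 0.01)
    print(si, highest_threshold_exceeded(si))
```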
As used herein, the term “blocker” refers to a compound that blocks, obstructs, or partially obstructs, an ion channel, for example, the hERG1 ion channel. In certain embodiments, a blocker is a cardiotoxic compound.
As used herein, the term “non-blocker” refers to a compound that does not block, obstruct, or partially obstruct, an ion channel, for example, the hERG1 ion channel.
As used herein, “high throughput screening” refers to a method that allows a researcher to quickly conduct chemical, genetic or pharmacological tests, the results of which provide starting points for drug design and for understanding the interaction or role of a particular biochemical process in biology. In certain embodiments, the high throughput screening is through virtual in silico screening, for example, using computer-based methods or computer-based models.
As used herein, the terms “processor” and “central processing unit” or “CPU” are used interchangeably and refer to a device that is able to read a program from a computer memory (e.g., ROM or other computer memory) and perform a set of steps according to the program.
As used herein, the terms “computer memory” and “computer memory device” refer to any storage media readable by a computer processor. Examples of computer memory include, but are not limited to, RAM, ROM, computer chips, digital video discs (DVD), compact discs (CDs), hard disk drives (HDD), and magnetic tape.
As used herein, the term “computer readable medium” refers to any device or system for storing and providing information (e.g., data and instructions) to a computer processor. Examples of computer readable media include, but are not limited to, DVDs, CDs, hard disk drives, magnetic tape and servers for streaming media over networks.
Provided herein is the first comprehensive computational dynamic model of a membrane-bound ion channel that provides an atomistically detailed sampling of the physiologically relevant conformational states of the channel. In certain embodiments, the model is combined with an atomistically detailed high throughput screening algorithm of test compounds in silico to predict cardiotoxicity and to select for compounds with reduced cardiotoxicities.
As an example, these models and algorithms may be used to model one of the most important ion channels associated with cardiotoxicity, namely the human Ether-a-go-go Related Gene 1 (hERG1) channel. The hERG1 channel is expressed in the heart as well as in various brain regions, smooth muscle cells, endocrine cells, and a wide range of tumor cell lines. However, its role in the heart is the one that has been well characterized and extensively studied, for two main reasons. First, it is directly involved in long QT syndrome (LQTS), a disorder associated with an increased risk of ventricular arrhythmias and ultimately sudden cardiac death. Second, the blockade of hERG1 by prescription medications causes drug-induced QT prolongation that carries the same risk of sudden cardiac arrest as LQTS.
The hERG1 channel is formed as a tetramer through the association of four monomer subunits. In the computer-based molecular simulations and molecular models disclosed herein, the tetramer structure is surrounded by a membrane, ions, and water molecules to simulate the realistic environment of the channel. Further, the computer-based molecular simulations disclosed herein are of sufficient length (e.g., greater than 200 ns) to allow sampling of all physiologically relevant conformational states of the hERG1 channel, including the open, closed, and inactivated states and any conformations between these states. This robust molecular simulation of the hERG1 channel allows an atomistically detailed high throughput screening in silico to test compounds and determine whether the compounds block the channel and are therefore likely to exhibit cardiotoxicity. The atomistic detail of the molecular simulation also allows a chemical modification or redesign of those compounds found to block the channel. The redesigned compound may then be re-tested in an iterative fashion using the methods disclosed herein.
An overview of the methods disclosed herein, including computer-based molecular simulations and molecular models, is provided in
In certain embodiments, if the compound blocks the ion channel, the compound is predicted to be cardiotoxic. In certain embodiments, if the compound is predicted to be cardiotoxic, the compound is not selected for further clinical development or for use in humans. In certain embodiments, the compound may be structurally modified or redesigned to address cardiotoxicity.
In certain embodiments, if the compound does not block the ion channel, the compound is predicted to have reduced risk of cardiotoxicity. In certain embodiments, if the compound is predicted to have reduced risk of cardiotoxicity, the compound is selected for further development or possible use in humans, or to be used as a compound for further drug design.
Individual elements and steps of the methods disclosed herein are now described.
6.2.1 Ion Channels
In certain embodiments, the method comprises the step of using structural information describing the structure of a target receptor, for example, an ion channel protein.
In certain embodiments, the target receptor is an ion channel that regulates cardiac function, for example, a cardiac ion channel disclosed herein. In certain embodiments, the cardiac ion channel is a membrane-bound protein. In certain embodiments, the cardiac ion channel is voltage-gated. In certain embodiments, the cardiac ion channel is a sodium, calcium, or potassium ion channel. In certain embodiments, the cardiac ion channel is a potassium ion channel.
Those of ordinary skill in the art will appreciate that ion channels, for example, a cardiac ion channel disclosed herein, may have two fundamental properties: ion permeation and gating. Ion permeation describes the movement of ions through the open channel. The selective permeability of ion channels to specific ions is a basis for the classification of ion channels (e.g., Na+, K+ and Ca2+ channels). Gating is the mechanism of opening and closing of ion channels. Voltage-dependent gating is the most common mechanism of gating observed in ion channels.
The following TABLE 1 describes cardiac ion channels, any of which may be associated with cardiotoxicity.
Cardiac K+ channels fall into three broad categories: voltage-gated (Ito, IKur, IKr, and IKs), inward rectifier channels (IK1, IKAch, and IKATP), and the background K+ currents (TASK-1, TWIK-1/2).
In certain embodiments, the ion channel is selected from any one of the cardiac ion channels of TABLE 1.
In certain embodiments, the ion channel is a potassium ion channel protein selected from TABLE 1.
In certain embodiments, the ion channel is a sodium ion channel protein selected from TABLE 1.
In certain embodiments, the ion channel is a calcium ion channel protein selected from TABLE 1.
In certain embodiments, the ion channel comprises an amino acid sequence selected from the group consisting of SEQ ID NO: 2, 4, and 6, as disclosed herein.
The following TABLE 2 describes potassium ion channels, any of which may be associated with cardiotoxicity.
In certain embodiments, the ion channel is selected from any one of the potassium ion channels of TABLE 2.
In certain embodiments, the ion channel is selected from any one of the members 1-8 of the potassium voltage-gated channel, subfamily H (eag-related), of TABLE 2.
In certain embodiments, the ion channel comprises an amino acid sequence selected from the group consisting of SEQ ID NO: 2, 7, 8, 9, 10, 11, 12, and 13, as disclosed herein.
In certain embodiments, the ion channel is the Human Ether-a-go-go Related Gene 1 (hERG1) Channel, as described below.
In certain embodiments, the ion channel is the hNav1.5 voltage gated sodium channel, as described below.
In certain embodiments, the ion channel is the hCav1.2 voltage gated calcium channel, as described below.
6.2.2 Human Ether-a-go-go Related Gene 1 (hERG1) Channel
The hERG1 ion channel (also referred to as KCNH2 or Kv11.1) is an important element for the rapid component of the delayed rectifier potassium current (IKr) in cardiac myocytes, required for the normal repolarization phase of the cardiac action potential (Curran et al., 1995, “A Molecular Basis for Cardiac-Arrhythmia; HERG Mutations Cause Long QT Syndrome,” Cell, 80, 795-803; Tseng, 2001, “I(Kr): The hERG Channel,” J. Mol. Cell. Cardiol., 33, 835-49; Vandenberg et al., 2001, “HERG K+ Channels: Friend and Foe,” Trends. Pharm. Sci. 22, 240-246). Loss-of-function mutations in hERG1 cause increased duration of ventricular repolarization, which leads to prolongation of the time interval between the Q and T waves of the body surface electrocardiogram (long QT syndrome, LQTS) (Vandenberg et al., 2001; Splawski et al., 2000, “Spectrum of Mutations in Long-QT Syndrome Genes KVLQT1, HERG, SCN5A, KCNE1, and KCNE2,” Circulation, 102, 1178-1185; Witchel et al., 2000, “Familial and Acquired Long QT Syndrome and the Cardiac Rapid Delayed Rectifier Potassium Current,” Clin. Exp. Pharmacol. Physiol., 27, 753-766). LQTS leads to serious cardiovascular disorders, such as tachyarrhythmia and sudden cardiac death.
The DNA and amino acid sequences for hERG are provided as SEQ ID NO: 1 and SEQ ID NO: 2, respectively.
A detailed atomic structure of the hERG1 gene product based on X-ray crystallography or NMR spectroscopy is not yet available, so structural details for hERG1 are based on analogy with other ion channels, computer homology models, pharmacology, and mutagenesis studies. For example, as described in EXAMPLE 1 below, the structure of hERG1 is based on combined de novo and homology protein modeling, as previously described (Durdagi et al., 2012, “Modeling of Open, Closed, and Open-Inactivated States of the HERG1 Channel: Structural Mechanisms of the State-Dependent Drug Binding,” J. Chem. Inf. Model., 52, 2760-2774). The structural information useful for the methods described herein is provided, for example, as a homology model, including wherein the homology model is represented by coordinates for a potassium ion channel protein (e.g., hERG1), as in Table A (see, e.g., EXAMPLE 1).
In homology models, the hERG1 gene product comprises a tetramer, with each monomer subunit containing six transmembrane helices (see
Movements of the voltage-sensor domain enable the pore domain to open and close in response to changes in membrane potential. The drug binding site is contained within the central pore cavity of the pore domain, located below the selectivity filter and flanked by the four S6 helices (see
Without being limited by any theory, in one aspect of the disclosure, the blocking of the central pore cavity or channel of hERG by a drug is a predictor of the cardiotoxicity of the drug. Undesired drug blockade of K+ ion flux in hERG1 can lead to long QT syndrome, eventually inducing fibrillation and arrhythmia. hERG1 blockade is a significant problem experienced during the course of many drug discovery programs.
6.2.3 Human Nav1.5 Voltage Gated Sodium Channel
The Nav1.5 voltage gated sodium channel (VGSC) is responsible for initiating the myocardial action potential. Blocking of Nav1.5, through either mutations or its interactions with small molecule drugs or toxins, has been associated with a wide range of cardiac diseases. These diseases include long QT syndrome 3 (LQT3), Brugada syndrome 1 (BRGDA1) and sudden infant death syndrome (SIDS).
The DNA and amino acid sequences for hNav1.5 are provided as SEQ ID NO: 3 and SEQ ID NO: 4, respectively.
A detailed atomic structure of the hNav1.5 gene product based on X-ray crystallography or NMR spectroscopy is not yet available, so structural details for hNav1.5 are based on analogy with other ion channels, computer homology models, pharmacology, and mutagenesis studies. The structural information useful for the methods described herein is provided, for example, as a homology model, including wherein the homology model is represented by coordinates for a sodium ion channel protein (e.g., hNav1.5), as in Table B (see, e.g., EXAMPLE 16).
Eukaryotic VGSCs are hetero-tetramers in which the four domains (DI-IV; see
VGSCs generally share a common activation mechanism. A change in the membrane potential results in a conformational change and an outward movement of S4, allowing the activation of the channel and the passage of cations through the channel's pore (Catterall, 2014, “Structure and Function of Voltage-Gated Sodium Channels at Atomic Resolution,” Exp Physiol 99: 35-51). The last two helical segments from each domain (S5-S6) are usually referred to as the pore-forming segments. The S5 helical segment is a long segment that extends horizontally from S4, through a linker, and then vertically through the trans-membrane region. A loop then connects S5 to two short helices named the pore helices (P1 and P2). The S6 segment is connected to P2 through a short turn and extends vertically toward the intracellular part of the channel. A short turn connecting P1 and P2 contains the selectivity-specific residues, which are uniquely conserved among VGSCs in the arrangement DEKA, splayed across the four domains (D372, E898, K1419 and A1711), and are known as the selectivity filter. This DEKA selectivity filter is responsible for conferring selectivity for sodium over other mono- and divalent cations, as has been shown previously by several experimental and computational mutational analyses (Lipkind et al., 2008, “Voltage-Gated Na Channel Selectivity: The Role of the Conserved Domain III Lysine Residue,” J Gen Physiol 131: 523-529). It has been shown that mutating the selectivity filter's residues affects not only the selectivity of the channel but also its gating kinetics (Hilber, et al., 2005, “Selectivity Filter Residues Contribute Unequally to Pore Stabilization in Voltage-Gated Sodium Channels,” Biochemistry 44: 13874-13882).
Without being limited by any theory, in one aspect of the disclosure, the blocking of the central pore cavity or channel of hNav1.5 by a drug is a predictor of the cardiotoxicity of the drug. Undesired drug blockade of Na+ ion flux in hNav1.5 can lead to long QT syndrome, eventually inducing fibrillation and arrhythmia. Blockage of hNav1.5 is a significant problem experienced during the course of many drug discovery programs.
6.2.4 Human Cav1.2 Voltage Gated Calcium Channel
The Cav1.2 voltage gated calcium channel is responsible for mediating the entry of calcium ions into excitable cells. Blocking of Cav1.2, through either mutations or its interactions with small molecule drugs or toxins, has been associated with a wide range of cardiac diseases. These diseases include long QT syndrome 3 (LQT3) and Brugada syndrome 1 (BRGDA1).
The DNA and amino acid sequences for hCav1.2 are provided as SEQ ID NO: 5 and SEQ ID NO: 6, respectively.
A detailed atomic structure of the hCav1.2 gene product based on X-ray crystallography or NMR spectroscopy is not yet available, so structural details for hCav1.2 are based on analogy with other ion channels, computer homology models, pharmacology, and mutagenesis studies. The structural information useful for the methods described herein is provided, for example, as a homology model, including wherein the homology model is represented by coordinates for a calcium ion channel protein (e.g., hCav1.2), as in Table C.
The global architecture of Cavs is composed of four basic components. The α1 subunit is located in the cell membrane and calcium ions can pass through. The auxiliary β, CaM and α2δ subunits bind with high affinity to the loops of domain I and II. Cav α2δ is a single pass transmembrane subunit which is formed by two disulfide-linked proteins (Van Petegem et al., 2006, “The Structural Biology of Voltage-Gated Calcium Channel Function and Regulation,” Biochem Soc Trans 34(Pt 5): 887-93).
The transmembrane Cav consists of four homologous membrane-spanning repeat domains (DI-IV). Each repeat is formed by six segments (S1-S6). The first four segments (S1-S4) form the voltage-sensing domain and the last two segments (S5-S6) form the calcium-selective pore domain. The S4 segment contains positively charged residues and acts as a voltage sensor controlling gating. Channel activation is considered to be triggered by a conformational change in the voltage sensors leading to channel opening.
Without being limited by any theory, in one aspect of the disclosure, the blocking of the central pore cavity or channel of hCav1.2 by a drug is a predictor of the cardiotoxicity of the drug. Undesired drug blockade of Ca2+ ion flux in hCav1.2 can lead to long QT syndrome, eventually inducing fibrillation and arrhythmia. Blockage of hCav1.2 is a significant problem experienced during the course of many drug discovery programs.
6.2.5 Computational Aspects
In certain aspects, provided herein are computational methods for selecting a compound that is not likely to be cardiotoxic.
In certain embodiments, the computational methods comprise a computational dynamic model. In certain embodiments, the computational dynamic model comprises a molecular simulation that samples conformational space over time. In certain embodiments, the molecular simulation is a molecular dynamics (MD) simulation.
In certain embodiments, the method comprises the steps of: a) using structural information describing the structure of an ion channel protein; b) performing a molecular dynamics (MD) simulation of the protein structure; c) using a clustering algorithm to identify dominant conformations of the protein structure from the MD simulation; d) selecting the dominant conformations of the protein structure identified from the clustering algorithm; e) providing structural information describing conformers of one or more compounds; f) using a docking algorithm to dock the conformers of the one or more compounds of step e) to the dominant conformations of step d); g) identifying a plurality of preferred binding conformations for each of the combinations of protein and compound; h) optimizing the preferred binding conformations using scalable MD; and i) determining if the compound blocks the ion channel of the protein in the preferred binding conformations; wherein one or more of the steps a) through i) are not necessarily executed in the recited order. In certain embodiments, the ion channel protein is a potassium ion channel protein.
In certain embodiments, the structural information of step a) is a three-dimensional (3D) structure. In certain embodiments, the structural information of step a) is an X-ray crystal structure, an NMR solution structure, or a homology model, as disclosed herein.
In certain embodiments, step e) comprises providing the chemical structure of a compound and determining the conformers of the compound. In certain embodiments, the chemical structure of the compound defines the conformers.
In certain embodiments, steps e) through i) comprise a high-throughput screening of the compounds to determine if they are “blockers” or “non-blockers.”
In certain embodiments, one or more of the steps a) through i) of the method are performed in the recited order.
In certain embodiments, steps a) through i) of the method are executed on one or more processors.
6.2.5.1 Structural Information of the Ion Channel Protein
In certain embodiments, the method comprises the step of using structural information describing the structure of an ion channel protein. In certain embodiments, the ion channel protein is also referred to as a “receptor” or “target” and the terms “protein,” “receptor” and “target” are used interchangeably.
In certain embodiments, the structural information describing the structure of the ion channel protein is from a homology model.
In certain embodiments, the structural information describing the structure of the ion channel protein is from an NMR solution structure. Multidimensional heteronuclear NMR techniques for determination of the structure and dynamics of macromolecules are known to those of ordinary skill in the art (see, e.g., Rance et al., 2007, “Protein NMR Spectroscopy: Principles and Practice,” 2nd ed., Boston: Academic Press).
In certain embodiments, the structural information describing the structure of the ion channel protein is from an X-ray crystal structure. X-ray crystallographic techniques for determination of the structure of macromolecules are also known to those of ordinary skill in the art (see, e.g., Drenth et al., 2007, “Principles of Protein X-Ray Crystallography,” 3rd ed., Springer Science).
The following TABLE 3 describes structures of cardiac ion channels, any of which may be used in the methods disclosed herein.
In certain embodiments, the structural information describing the structure of the ion channel protein is selected from any one of the structures of TABLE 3.
The following TABLE 4 describes structures of potassium ion channels, any of which may be used in the methods disclosed herein.
In certain embodiments, the structural information describing the structure of the ion channel protein is selected from any one of the structures of TABLE 4.
In certain embodiments, for example, wherein the ion channel is the potassium ion channel protein hERG1, a detailed atomic structure based on X-ray crystallography or NMR spectroscopy is not yet available. Accordingly, structural details are based on analogy with other ion channels, computer homology models, pharmacology, and mutagenesis studies.
The hERG1 homology model may comprise comparative protein modeling methods including homology modeling methods (see, e.g., Marti-Renom et al., 2000, Annu. Rev. Biophys. Biomol. Struct. 29, 291-325) performable without limitation using the “Modeller” computer program (Fiser and Sali, 2003, Methods Enzymol. 374, 461-91) or the “Swiss-Model” application (Arnold et al., 2006, Bioinformatics 22, 195-201); or protein threading modeling methods (see, e.g., Bowie et al., 1991, Science 253, 164-170; Jones et al., 1992, Nature 358, 86-89) performable without limitation using the “Hhsearch” program (Soding, 2005, Bioinformatics 21, 951-960), the “Phyre” application (Kelley and Sternberg, 2009, Nature Protocols 4, 363-371) or the “Raptor” program (Xu et al., 2003, J. Bioinform. Comput. Biol. 1, 95-117); may further comprise ab initio or de novo protein modeling methods using various algorithms, performable without limitation using the publicly distributed “ROSETTA” platform (Simons et al., 1999, Genetics 37, 171-176; Baker 2000, Nature 405, 39-42; Bradley et al., 2003, Proteins 53, 457-468; Rohl 2004, Methods Enzymol. 383, 66-93), the “I-TASSER” application (Wu et al., 2007, BMC Biol. 5, 17), or using physics-based prediction (see, e.g., Duan and Kollman 1998, Science 282, 740-744; Oldziej et al., 2005, Proc. Natl. Acad. Sci. USA 102, 7547-7552); or a combination of any such approaches. Computational approaches applicable herein for structure prediction of biomolecules are evaluated within the Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiment as published in the CASP Proceedings (http://predictioncenter.org/). Advantageously, data holding information about computationally predicted conformations and structures of many biomolecules such as peptides, polypeptides and proteins are available through respective publicly available repositories (see, e.g., Kopp and Schwede, 2004, Nucleic Acids Research 32, D230-D234).
In certain embodiments, the methods disclosed herein work best with complex membrane-bound systems that are not susceptible to structure determination by X-ray crystallographic or NMR spectroscopic methods.
6.2.5.2 Structural Information of the Compound (Ligand)
In certain embodiments, the method comprises providing structural information describing conformers of one or more compounds or ligands. As used herein, the terms “compound” and “ligand” are interchangeable.
One of ordinary skill in the art will understand that a chemical compound can adopt differing three-dimensional (3-D) shapes or “conformers” due to rotation of atoms about a bond. Conformers can thus interconvert by rotation around a single bond without any bond being broken. A particular conformer of a ligand may provide a complementary geometry to the pore (e.g., permeation pore) of an ion channel protein, and promote binding.
In certain embodiments, the structural information describing conformers of one or more compounds or ligands is obtained from the chemical structure of a compound or ligand.
In certain embodiments, the structural information of the compound is based upon a viral compound being studied or developed by universities, pharmaceutical companies, or individual inventors. Typically, the compound will be a small organic molecule having a molecular weight under 900 atomic mass units. Structural information of the compound may be provided in 2D or 3D, using ChemDraw or simple structural depictions, or by entry of the compound's chemical name. Computer-based modeling of the compound may be used to translate the chemical name or 2D information of the compound into a 3D representative structure.
The software LigPrep from the Schrödinger package (Schrödinger Release 2013-2: LigPrep, version 2.7, Schrödinger, LLC, New York, N.Y., 2013) may be used to translate the 2D information of the compound (ligand) into a 3D representative structure which provides the structural information. LigPrep may also be used to generate variants of the same compound (ligand) with different tautomeric, stereochemical, and ionization properties. All generated structures may be conformationally relaxed using energy minimization protocols included in LigPrep.
In certain embodiments, the compound is selected from a list of compounds that have failed in clinical trials, or were halted in clinical trials due to cardiotoxicity.
In certain embodiments, the compound is selected from TABLE 5, below:
In certain embodiments, the compound is an anticancer agent, such as anthracyclines, mitoxantrone, cyclophosphamide, fluorouracil, capecitabine and trastuzumab. In certain embodiments, the compound is an immunomodulating drug, such as interferon-alpha-2, interleukin-2, infliximab and etanercept. In certain embodiments, the compound is an antidiabetic drug, such as rosiglitazone, pioglitazone and troglitazone. In certain embodiments, the compound is an antimigraine drug, such as ergotamine and methysergide. In certain embodiments, the compound is an appetite suppressant, such as fenfluramine, dexfenfluramine and phentermine. In certain embodiments, the compound is a tricyclic antidepressant. In certain embodiments, the compound is an antipsychotic drug, such as clozapine. In certain embodiments, the compound is an antiparkinsonian drug, such as pergolide and cabergoline. In certain embodiments, the compound is a glucocorticoid. In certain embodiments, the compound is an antifungal drug, such as itraconazole and amphotericin B. In certain embodiments, the compound is an NSAID, including selective cyclo-oxygenase (COX)-2 inhibitors.
In certain embodiments, the compound is an active ingredient in a natural product. In certain embodiments, the compound is a toxin or environmental pollutant.
In certain embodiments, the compound is an antiviral agent.
In certain embodiments, the compound is selected from the group consisting of a protease inhibitor, an integrase inhibitor, a chemokine inhibitor, a nucleoside or nucleotide reverse transcriptase inhibitor, a non-nucleoside reverse transcriptase inhibitor, and an entry inhibitor.
In certain embodiments, the compound is capable of inhibiting hepatitis C virus (HCV) infection.
In certain embodiments, the compound is an inhibitor of HCV NS3/4A serine protease.
In certain embodiments, the compound is an inhibitor of HCV NS5B RNA dependent RNA polymerase.
In certain embodiments, the compound is an inhibitor of HCV NS5A monomer protein.
In certain embodiments, the compound is a compound disclosed in one of the following three applications: U.S. Provisional Patent Application No. 61/780,505, filed Mar. 13, 2013, entitled “Hepatitis C Virus NS5B Polymerase Inhibitors and Methods of Use”; U.S. Provisional Patent Application No. 61/784,584, filed Mar. 14, 2013, entitled “Hepatitis C Virus NS5B Polymerase Inhibitors and Methods of Use”; and U.S. Provisional Patent Application No. 61/786,116, filed Mar. 14, 2013, entitled “Hepatitis C Virus NS5A Monomer Inhibitors and Methods of Use.” The contents of each of these provisional applications are incorporated by reference in their entireties.
In certain embodiments, the compound is selected from the group consisting of Abacavir, Aciclovir, Acyclovir, Adefovir, Amantadine, Amprenavir, Ampligen, Arbidol, Atazanavir, Balavir, Boceprevir, Cidofovir, Darunavir, Delavirdine, Didanosine, Docosanol, Edoxudine, Efavirenz, Emtricitabine, Enfuvirtide, Entecavir, Famciclovir, Fomivirsen, Fosamprenavir, Foscarnet, Fosfonet, Ganciclovir, Ibacitabine, Imunovir, Idoxuridine, Imiquimod, Indinavir, Inosine, Interferon type III, Interferon type II, Interferon type I, Interferon, Lamivudine, Lopinavir, Loviride, Maraviroc, Moroxydine, Methisazone, Nelfinavir, Nevirapine, Nexavir, Oseltamivir (Tamiflu), Peginterferon alfa-2a, Penciclovir, Peramivir, Pleconaril, Podophyllotoxin, Raltegravir, Ribavirin, Rimantadine, Ritonavir, Pyramidine, Saquinavir, Sofosbuvir, Stavudine, Telaprevir, Tenofovir, Tenofovir disoproxil, Tipranavir, Trifluridine, Trizivir, Tromantadine, Truvada, Valaciclovir (Valtrex), Valganciclovir, Vicriviroc, Vidarabine, Viramidine, Zalcitabine, Zanamivir (Relenza), and Zidovudine.
In certain embodiments, the compound is Daclatasvir (BMS-790052), for which the chemical name is “Methyl [(2S)-1{(2S)-2-[5-(4′-{2-[(2S)-1{(2S)-2-[(methoxycarbonyl)amino]-3-methylbutanoyl}2-pyrrolidinyl]-1H-imidazol-5-yl}4-biphenylyl)-1H-imidazol-2-yl]-1-pyrrolidinyl}3-methyl-1-oxo-2-butanyl]carbamate.” The structure of Daclatasvir is provided below:
In certain embodiments, the compound is BMS-986094, for which the chemical name is “(2R)-neopentyl 2-(((a2R,3R,4R)-5-(2-amino-6-methoxy-9H-purin-9-yl)-3,4-dihydroxy-4-methyltetrahydrofuran-2-yl)methoxy)(naphthalen-1-yloxy)phosphoryl)amino)propanoate.” The structure of BMS-986094 is illustrated below:
6.2.5.3 Energy Minimization
In certain embodiments, the X-ray crystal structure, NMR solution structures, homology models, molecular models, or generated structures disclosed herein are subjected to energy minimization (EM) prior to performing an MD simulation.
The goal of EM is to find a local energy minimum for a potential energy function. A potential energy function is a mathematical equation that allows the potential energy of a molecular system to be calculated from its three-dimensional structure. Examples of energy minimization algorithms include, but are not limited to, steepest descent, conjugate gradient, Newton-Raphson, and Adopted Basis Newton-Raphson (Molecular Modeling: Principles and Applications, Author A. R. Leach, Pearson Education Limited/Prentice Hall (Essex, England), 2nd Edition (2001) pages: 253-302). Several of these methods may be used interchangeably or in succession.
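By way of non-limiting illustration, steepest descent minimization may be sketched as follows. The toy potential (a single harmonic "bond" between two particles), the function names, the step size, and the convergence tolerance are all assumptions chosen for illustration; a practical force field contains many more terms, and production simulation packages provide their own minimizers.

```python
import numpy as np

def bond_energy(x, k=1.0, r0=1.5):
    """Toy potential energy: one harmonic bond of stiffness k and rest length r0."""
    r = np.linalg.norm(x[1] - x[0])
    return 0.5 * k * (r - r0) ** 2

def bond_gradient(x, k=1.0, r0=1.5):
    """Gradient of bond_energy with respect to both particle positions."""
    d = x[1] - x[0]
    r = np.linalg.norm(d)
    g = k * (r - r0) * d / r          # derivative with respect to particle 1
    return np.array([-g, g])          # particle 0 feels the opposite gradient

def steepest_descent(x, gradient, step=0.1, tol=1e-8, max_iter=10000):
    """Move downhill along the negative gradient until the gradient norm is below tol."""
    x = np.array(x, dtype=float)
    for _ in range(max_iter):
        g = gradient(x)
        if np.linalg.norm(g) < tol:
            break
        x -= step * g
    return x

# Hypothetical starting geometry: bond stretched to twice its rest length.
start = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
relaxed = steepest_descent(start, bond_gradient)
```

The routine only locates the nearest local minimum of the supplied potential, which is consistent with the stated goal of EM prior to an MD simulation.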
6.2.5.4 Molecular Simulations
In certain embodiments, the method comprises the step of performing a molecular simulation of the structure of the ion channel protein.
Accordingly, provided herein are molecular simulations that sample conformational space of the ion channel protein according to the methods described herein. In certain embodiments, the molecular simulation is a molecular dynamics (MD) simulation.
Molecular simulations can be used to monitor time-dependent processes of the macromolecules and macromolecular complexes disclosed herein, in order to study their structural, dynamic, and thermodynamic properties by solving the equation of motion according to the laws of physics, e.g., the chemical bonds within proteins may be allowed to flex, rotate, bend, or vibrate as allowed by the laws of chemistry and physics. This equation of motion provides information about the time dependence and magnitude of fluctuations in both positions and velocities of the given molecule. Interactions such as electrostatic forces, hydrophobic forces, van der Waals interactions, interactions with solvent and others may also be modeled in MD simulations. The direct output of a MD simulation is a set of “snapshots” (coordinates and velocities) taken at equal time intervals, or sampling intervals. Depending on the desired level of accuracy, the equation of motion to be solved may be the classical (Newtonian) equation of motion, a stochastic equation of motion, a Brownian equation of motion, or even a combination (Becker et al., eds. Computational Biochemistry and Biophysics. New York 2001).
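By way of non-limiting illustration, the following sketch integrates the classical (Newtonian) equation of motion with the velocity Verlet scheme and records "snapshots" of coordinates and velocities at equal sampling intervals. The velocity Verlet integrator is a common choice assumed here for illustration; the one-dimensional spring force, the function names, and the parameter values are likewise hypothetical and stand in for a full molecular force field.

```python
import numpy as np

def velocity_verlet(x0, v0, force, mass, dt, n_steps, sample_every):
    """Integrate Newton's equation of motion; return (time, x, v) snapshots."""
    x = np.array(x0, dtype=float)
    v = np.array(v0, dtype=float)
    f = force(x)
    snapshots = [(0.0, x.copy(), v.copy())]
    for step in range(1, n_steps + 1):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        if step % sample_every == 0:       # equal sampling intervals
            snapshots.append((step * dt, x.copy(), v.copy()))
    return snapshots

def spring_force(x, k=1.0):
    """Toy force: harmonic restoring force toward the origin."""
    return -k * x

trajectory = velocity_verlet([1.0], [0.0], spring_force, mass=1.0,
                             dt=0.01, n_steps=1000, sample_every=100)
```

Each stored snapshot corresponds to one frame of the trajectory that later analysis steps, such as RMSD calculation or clustering, would consume.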
One of ordinary skill in the art will understand that direct output of a MD simulation, that is, the “snapshots” taken at sampling intervals of the MD simulation, will incorporate thermal fluctuations, for example, random deviations of a system from its average state, that occur in a system at equilibrium.
In certain embodiments, the molecular simulation is conducted using the CHARMM (Chemistry at Harvard for Macromolecular Modeling) simulation package (Brooks et al., 2009, “CHARMM: The Biomolecular Simulation Program,” J. Comput. Chem., 30(10):1545-614). In certain embodiments, the molecular simulation is conducted using the NAMD (Not (just) Another Molecular Dynamics program) simulation package (Phillips et al., 2005, “Scalable Molecular Dynamics with NAMD,” J. Comput. Chem., 26, 1781-1802; Kalé et al., 1999, “NAMD2: Greater Scalability for Parallel Molecular Dynamics,” J. Comp. Phys. 151, 283-312). One of skill in the art will understand that multiple packages may be used in combination. Any of the numerous methodologies known in the art to conduct MD simulations may be used in accordance with the methods disclosed herein. The following publications, which are incorporated herein by reference, describe multiple methodologies which may be employed: AMBER (Assisted Model Building with Energy Refinement) (Case et al., 2005, “The Amber Biomolecular Simulation Programs,” J. Comput. Chem. 26, 1668-1688; amber.scripps.edu); CHARMM (Brooks et al., 2009, J. Comput. Chem., 30(10):1545-614; charmm.org); GROMACS (GROningen MAchine for Chemical Simulations) (Van Der Spoel et al., 2005, “GROMACS: Fast, Flexible, and Free,” J. Comput. Chem., 26(16), 1701-18; gromacs.org); GROMOS (GROningen MOlecular Simulation) (Schuler et al., 2001, J. Comput. Chem., 22(11), 1205-1218; igc.ethz.ch/GROMOS/index); LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) (Plimpton et al., 1995, “Fast Parallel Algorithms for Short-Range Molecular Dynamics,” J. Comp. Phys., 117, 1-19; lammps.sandia.gov); and NAMD (Phillips et al., 2005, J. Comput. Chem., 26, 1781-1802; Kalé et al., 1999, J. Comp. Phys. 151, 283-312).
Wherein the methods call for a MD simulation, the simulation may be carried out using a simulation package chosen from the group comprising or consisting of AMBER, CHARMM, GROMACS, GROMOS, LAMMPS, and NAMD. In certain embodiments, the simulation package is the CHARMM simulation package. In certain embodiments, the simulation package is the NAMD simulation package.
Wherein the methods call for a MD simulation, the simulation may be of any duration. In certain embodiments, the duration of the MD simulation is greater than 200 ns. In certain embodiments, the duration of the MD simulation is greater than 150 ns. In certain embodiments, the duration of the MD simulation is greater than 100 ns. In certain embodiments, the duration of the MD simulation is greater than 50 ns. In certain embodiments, the duration of the MD simulation is about 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, or 250 ns.
In certain embodiments, the molecular simulation incorporates solvent molecules. In certain embodiments, the molecular simulation incorporates implicit or explicit solvent molecules. One of ordinary skill in the art will understand that implicit solvation (also known as continuum solvation) is a method of representing solvent as a continuous medium instead of individual “explicit” solvent molecules most often used in MD simulations and in other applications of molecular mechanics. In certain embodiments, the molecular simulation incorporates water molecules. In certain embodiments, the molecular simulation incorporates implicit or explicit water molecules. In certain embodiments, the molecular simulation incorporates explicit ion molecules. In certain embodiments, the molecular simulation incorporates a lipid bilayer. In certain embodiments, the lipid bilayer incorporates explicit lipid molecules. In certain embodiments, the lipid bilayer incorporates explicit phospholipid molecules. In certain embodiments, the lipid bilayer incorporates a solvated lipid bilayer. In certain embodiments, the lipid bilayer incorporates a hydrated lipid bilayer. In certain embodiments, the hydrated lipid bilayer incorporates explicit phospholipid molecules and explicit water molecules.
6.2.5.5 Principal Component Analysis
In certain embodiments, the method optionally comprises the step of principal component analysis (PCA) of the MD trajectory. In certain embodiments, PCA is performed prior to identification of dominant conformations of the ion channel protein using clustering algorithms (see below). In certain embodiments, PCA is performed using the software AMBER-ptraj (Case et al., 2012, AMBER 12, University of California, San Francisco; Salomon-Ferrer et al., 2013, “An Overview of the Amber Biomolecular Simulation Package,” WIREs Comput. Mol. Sci. 3, 198-210; Amber Home Page. Assisted Model Building with Energy Refinement. Available at: http://ambermd.org, accessed Oct. 26, 2013). PCA reduces the system dimensionality toward a finite set of independent principal components covering the essential dynamics of the system.
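By way of non-limiting illustration, PCA of a trajectory may be sketched with a singular value decomposition of the mean-centered coordinates. This is a generic NumPy sketch and is not the AMBER-ptraj implementation; the function name and the array layout (frames by atoms by three Cartesian coordinates, with frames assumed to be already superposed on a common reference) are assumptions.

```python
import numpy as np

def trajectory_pca(coords, n_components=2):
    """PCA of a trajectory.

    coords: array of shape (n_frames, n_atoms, 3), frames already superposed.
    Returns (variances, axes, projections) for the leading components.
    """
    n_frames = coords.shape[0]
    X = coords.reshape(n_frames, -1)        # flatten each frame to 3N numbers
    X = X - X.mean(axis=0)                  # remove the average structure
    # Rows of vt are the principal axes; singular values give the variances.
    _, s, vt = np.linalg.svd(X, full_matrices=False)
    variances = s ** 2 / (n_frames - 1)
    projections = X @ vt[:n_components].T   # frame coordinates in PC space
    return variances[:n_components], vt[:n_components], projections

# Hypothetical trajectory: four atoms moving together along a single mode.
t = np.linspace(-1.0, 1.0, 50)
coords = t[:, None, None] * np.ones((4, 3))
variances, axes, projections = trajectory_pca(coords)
```

In this synthetic case essentially all of the variance falls on the first component, which mirrors the idea that a small set of components can cover the essential dynamics of the system.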
6.2.5.6 Calculation of RMSDs
In certain embodiments, the method optionally comprises the step of calculating the root mean square deviation (RMSD) of Cα atoms relative to a reference structure of the ion channel protein. In certain embodiments, calculation of RMSD is performed to observe the overall behavior of the MD trajectory, prior to identification of dominant conformations of the ion channel protein using clustering algorithms (see below).
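By way of non-limiting illustration, the RMSD between a trajectory frame and a reference structure may be computed as sketched below, either directly or after optimal rigid-body superposition. The use of the Kabsch superposition is an assumption (the disclosure does not specify whether frames are fitted before the RMSD is evaluated), as are the function names; the inputs are taken to be arrays of Cα coordinates of shape (N, 3).

```python
import numpy as np

def rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets without any fitting."""
    return float(np.sqrt(np.mean(np.sum((P - Q) ** 2, axis=1))))

def kabsch_rmsd(P, Q):
    """RMSD after optimally superposing P onto Q (translation plus rotation)."""
    P = P - P.mean(axis=0)                       # remove translation
    Q = Q - Q.mean(axis=0)
    H = P.T @ Q                                  # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T      # optimal rotation matrix
    return rmsd(P @ R.T, Q)

# Hypothetical reference and a rotated, translated copy of it.
reference = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                      [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
frame = reference @ rot_z.T + np.array([5.0, -2.0, 1.0])
```

Evaluating such a function for every frame against the starting structure gives the RMSD time series typically inspected to judge the overall behavior of the trajectory.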
6.2.5.7 Clustering Algorithms
In certain embodiments, the method comprises the steps of using a clustering algorithm to identify dominant conformations of the ion channel protein from the MD simulation, and selecting the dominant conformations of the protein structure identified from the clustering algorithm.
Clustering algorithms are well known by one of ordinary skill in the art (see, e.g., Shao et al., 2007, “Clustering Molecular Dynamics Trajectories: 1. Characterizing the Performance of Different Clustering Algorithms,” J. Chem. Theory & Computation. 3, 231).
In certain embodiments, 50 or more dominant conformations are selected. In certain embodiments, 100 or more dominant conformations are selected. In certain embodiments, 150 or more dominant conformations are selected. In certain embodiments, 200 or more dominant conformations are selected. In certain embodiments, 250 or more dominant conformations are selected. In certain embodiments, 300 or more dominant conformations are selected.
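By way of non-limiting illustration, one simple way to extract dominant conformations from a matrix of pairwise distances between frames (e.g., pairwise RMSDs) is a neighbor-counting scheme: the frame with the most neighbors within a cutoff becomes a cluster representative, its neighbors are removed, and the procedure repeats. This particular scheme is an assumption chosen for brevity; the disclosure does not prescribe a specific clustering algorithm, and the function name and cutoff value are hypothetical.

```python
import numpy as np

def neighbor_count_clustering(dist, cutoff):
    """Cluster frames from a symmetric (n, n) distance matrix.

    Returns a list of (representative_index, member_indices), most populated
    cluster first among the frames still unassigned at each iteration.
    """
    n = dist.shape[0]
    unassigned = np.ones(n, dtype=bool)
    clusters = []
    while unassigned.any():
        # Neighbor relation restricted to frames not yet assigned.
        neighbors = (dist <= cutoff) & unassigned[None, :] & unassigned[:, None]
        counts = neighbors.sum(axis=1)
        center = int(np.argmax(counts))            # frame with most neighbors
        members = np.flatnonzero(neighbors[center])
        clusters.append((center, members.tolist()))
        unassigned[members] = False                # remove the whole cluster
    return clusters

# Hypothetical one-dimensional "conformations" standing in for MD frames.
points = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 9.0])
dist = np.abs(points[:, None] - points[None, :])
clusters = neighbor_count_clustering(dist, cutoff=0.15)
```

The representatives of the most populated clusters would then serve as the dominant conformations carried forward to docking.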
6.2.5.8 Docking Algorithms
In certain embodiments, the method comprises the step of using a docking algorithm to dock the conformers of the one or more compounds to the dominant conformations of the structure of the ion channel protein determined from the molecular simulations.
Various docking algorithms are well known to one of ordinary skill in the art. Examples of such algorithms that are readily available include: GLIDE (Friesner et al., 2004 “Glide: A New Approach for Rapid, Accurate Docking and Scoring. 1. Method and Assessment of Docking Accuracy,” J. Med. Chem. 47(7), 1739-49), GOLD (Jones et al., 1995, “Molecular Recognition of Receptor Sites using a Genetic Algorithm with a Description of Desolvation,” J. Mol. Biol., 245, 43), FRED (McGann et al., 2012, “FRED and HYBRID Docking Performance on Standardized Datasets,” Comp. Aid. Mol. Design, 26, 897-906), FlexX (Rarey et al., 1996, “A Fast Flexible Docking Method using an Incremental Construction Algorithm,” J. Mol. Biol., 261, 470), DOCK (Ewing et al., 1997, “Critical Evaluation of Search Algorithms for Automated Molecular Docking and Database Screening,” J. Comput. Chem., 18, 1175-1189), AutoDock (Morris et al., 2009, “Autodock4 and AutoDockTools4: Automated Docking with Selective Receptor Flexiblity,” J. Computational Chemistry, 16, 2785-91), IFREDA (Cavasotto et al., 2004, “Protein Flexibility in Ligand Docking and Virtual Screening to Protein Kinases,” J. Mol. Biol., 337(1), 209-225), and ICM (Abagyan et al., 1994, “ICM—A New Method for Protein Modeling and Design: Application to Docking and Structure Prediction from the Distorted Native Conformation,” J. Comput. Chem., 15, 488-506), among many others.
In certain embodiments, the docking algorithm is DOCK or AutoDock.
6.2.5.9 Identification of Preferred Binding Conformations
In certain embodiments, the method comprises the step of identifying a plurality of preferred binding conformations for each of the combinations of compound (ligand) and ion channel protein (receptor).
In certain embodiments, a clustering algorithm, as described above, is used to identify the preferred binding conformations for each of the combinations of compound and protein. In certain embodiments, the preferred binding conformations are those which have the largest cluster population and the lowest binding energy. In certain embodiments, the preferred binding conformations are the energetically preferred orientation of the compound (ligand) docked to the protein (receptor) to form a stable complex. In certain embodiments, there is only one preferred binding conformation for the docked compound.
In certain embodiments, a compound that blocks the channel in one of its preferred binding conformations is predicted to be cardiotoxic. In certain embodiments, a compound that does not block the channel in any of its preferred binding conformations is predicted to not be cardiotoxic.
In certain embodiments, a compound that blocks the channel in one of its preferred binding conformations is cardiotoxic. In certain embodiments, a compound that does not block the channel in any of its preferred binding conformations has reduced risk of cardiotoxicity.
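By way of non-limiting illustration, the selection of preferred binding conformations and the resulting blocker prediction may be sketched as follows. The record type, its field names, and the rule for combining the two criteria (rank by cluster population first, then by binding energy) are assumptions; the disclosure states only that preferred conformations have the largest cluster population and the lowest binding energy. The `blocks_pore` flag is a hypothetical input that would come from inspecting each docked pose.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PoseCluster:
    label: str
    population: int        # number of docked poses in the cluster
    best_energy: float     # kcal/mol; more negative means stronger binding
    blocks_pore: bool      # hypothetical flag: does this pose occlude the pore?

def preferred_clusters(clusters, top_n=1):
    """Rank by largest population, then by lowest energy; keep the top_n."""
    ranked = sorted(clusters, key=lambda c: (-c.population, c.best_energy))
    return ranked[:top_n]

def predicted_blocker(clusters, top_n=1):
    """A compound is flagged if any preferred conformation blocks the pore."""
    return any(c.blocks_pore for c in preferred_clusters(clusters, top_n))

# Hypothetical docking summary for one compound.
example = [
    PoseCluster("A", population=10, best_energy=-7.0, blocks_pore=False),
    PoseCluster("B", population=40, best_energy=-9.0, blocks_pore=True),
    PoseCluster("C", population=40, best_energy=-8.0, blocks_pore=False),
]
```

With `top_n` greater than one, a plurality of preferred binding conformations is examined, and a single blocking conformation among them is sufficient to flag the compound.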
6.2.5.10 Optimizing Preferred Binding Conformations
In certain embodiments, the method comprises the step of optimizing the preferred binding conformations using MD, as described above.
In certain embodiments, the MD is scalable MD.
In certain embodiments, the MD uses NAMD software.
6.2.5.11 Calculation of Binding Energies, ΔGcalc
In certain embodiments, the method comprises the step of calculating binding energies, ΔGcalc, for each of the combinations of compound (ligand) and protein (receptor) in the corresponding optimized preferred binding conformations.
Calculation of binding energies using a combination of molecular mechanics and solvation models is well known to one of ordinary skill in the art (see, e.g., Kollman et al., 2000, “Calculating Structures and Free Energies of Complex Molecules: Combining Molecular Mechanics and Continuum Models,” Acc. Chem. Res. 33, 889-897).
In certain embodiments, the method further comprises outputting the selected calculated binding energies, ΔGcalc, and comparing them to physiologically relevant concentrations for each of the combinations of protein and compound. In this regard, the IC50 (concentration at which 50% inhibition is observed) values measured from, for example, in vitro biological assays can be converted to the observed free energy change of binding, ΔGobs (cal mol−1), using the relation: ΔGobs=RT ln Ki, where R is the gas constant, R=1.987 cal K−1 mol−1, T is the absolute temperature, and Ki is approximated to be the IC50 measured for a particular compound, i. Accordingly, ΔGcalc may be compared to ΔGobs, and physiologically relevant concentrations (IC50) for each of the combinations of protein and compound.
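By way of non-limiting illustration, the conversion of a measured IC50 to an observed free energy of binding using the relation RT ln Ki, with Ki approximated by the IC50, may be sketched as follows. The default temperature of 298.15 K and the function name are assumptions; the IC50 is expressed in mol/L so that the logarithm is taken relative to a 1 M standard state.

```python
import math

R_CAL = 1.987  # gas constant in cal K^-1 mol^-1, as given in the text

def delta_g_obs(ic50_molar, temperature_k=298.15):
    """Observed binding free energy (cal/mol) from an IC50 in mol/L.

    Uses dG = R * T * ln(Ki), approximating Ki by the IC50.
    The default temperature is an assumption, not a value from the text.
    """
    return R_CAL * temperature_k * math.log(ic50_molar)

dg_1_micromolar = delta_g_obs(1e-6)      # roughly -8.2 kcal/mol at 298.15 K
dg_100_micromolar = delta_g_obs(100e-6)  # weaker binding, less negative
```

A lower IC50 gives a more negative ΔGobs, so the calculated and observed values can be compared on the same energy scale.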
6.2.5.12 Prediction of Cardiotoxicity and Selection of Compound
In certain embodiments, the method comprises prediction of cardiotoxicity and selection of a compound based on (i) classification of the compound as “blocker” versus “nonblocker”; and/or (ii) calculated binding energies.
In certain embodiments, where the compound does not block the ion channel in any of its preferred binding conformations, the compound is identified as a “non-blocker.” Under such circumstances, the “non-blocking” compound is predicted to have reduced risk of cardiotoxicity, and the compound is selected for further development or possible use in humans, or to be used as a compound for further drug design. In certain embodiments, further clinical development may comprise further testing for cardiotoxicity with other ion channels using the methods disclosed herein.
In certain embodiments, wherein the compound blocks the ion channel in one of its preferred binding conformations, the compound is identified as a “blocker.” Under such circumstances, the compound is predicted to be cardiotoxic, and the compound is not selected for further clinical development or for use in humans. However, under such circumstances, the method may further comprise the step of using a molecular modeling algorithm to chemically modify or redesign the compound such that it does not block the ion channel in its preferred binding conformations and retains biological activity to its primary biological target, as described in Sections 6.2.5.13 and 6.2.5.14 below, respectively. As a possible alternative to modification/redesign of the compound, a new compound may also be selected from the collections of a chemical or compound library, for example, a library of new drug candidates generated by organic or medicinal chemists as part of a drug discovery program, as described in Section 6.2.5.15 below.
In certain embodiments, where the calculated binding energies, ΔGcalc, for the preferred binding conformations compare to physiologically relevant compound concentrations of greater than or equal to 100 μM, binding affinity is predicted to be weak. Under such circumstances, the compound is predicted to have reduced risk of cardiotoxicity at therapeutically relevant concentrations. The compound may be selected for further development or possible use in humans, or to be used as a compound for further drug design. In certain embodiments, further clinical development may comprise further testing for cardiotoxicity with other ion channels using the methods disclosed herein.
In certain embodiments, where the calculated binding energies, ΔGcalc, for the preferred binding conformations compare to physiologically relevant compound concentrations of less than or equal to 1 μM, binding affinity is predicted to be moderate to strong. The compound is predicted to be cardiotoxic at therapeutically relevant concentrations, and the compound is not selected for further clinical development or for use in humans. However, under such circumstances, as described above, the method may further comprise the step of using a molecular modeling algorithm to chemically modify or redesign the compound, or as a possible alternative, selecting a new compound from the collections of a chemical or compound library, as described in the sections below.
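By way of non-limiting illustration, the two selection criteria described above (channel blocking, and the concentration equivalent of the calculated binding energy) may be sketched in Python as follows. The function names, the label strings, and the "unclassified" label for concentrations between 1 μM and 100 μM are assumptions made for the sketch; the disclosure itself specifies only the 1 μM and 100 μM thresholds and does not state how the two criteria are combined.

```python
from typing import Sequence

WEAK_THRESHOLD_UM = 100.0    # >= 100 uM: weak binding (threshold from the text)
STRONG_THRESHOLD_UM = 1.0    # <= 1 uM: moderate to strong binding (threshold from the text)


def classify_by_blocking(blocks_in_conformation: Sequence[bool]) -> str:
    """Label a compound from per-conformation blocking flags.

    A compound is a "blocker" if it blocks the channel in at least one
    preferred binding conformation, and a "non-blocker" otherwise.
    """
    return "blocker" if any(blocks_in_conformation) else "non-blocker"


def classify_by_affinity(equivalent_conc_um: float) -> str:
    """Label binding strength from the concentration equivalent of dG_calc.

    The "unclassified" label for the 1-100 uM range is an assumption of
    this sketch; the text does not assign a label to that range.
    """
    if equivalent_conc_um >= WEAK_THRESHOLD_UM:
        return "weak"
    if equivalent_conc_um <= STRONG_THRESHOLD_UM:
        return "moderate-to-strong"
    return "unclassified"
```

Keeping the two criteria in separate functions avoids implying a combination rule that the text does not give.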
6.2.5.13 Modification/Redesign of Compounds
In certain embodiments, the method further comprises the step of using a molecular modeling algorithm to chemically modify or design the compound such that it does not block the ion channel in any of its preferred binding conformations.
In certain embodiments, the method comprises repeating steps e) through i) for the modified or redesigned compound.
For example, if a chemical moiety of a compound identified as a “blocker” is found to be responsible for blocking, obstructing, or partially obstructing the ion channel, that chemical moiety may be modified in silico using any one of the molecular modeling algorithms disclosed herein or known to one of ordinary skill in the art. The modified compound may then be retested by repeating steps e) through i) of the methods disclosed herein.
Following re-testing, if the modified compound does not block, obstruct, or partially obstruct the ion channel in any of its preferred binding conformations, the modified compound may now be identified as a “non-blocker.” The modified compound may now be characterized as having reduced risk of cardiotoxicity, and selected for further development or possible use in humans, or to be used as a compound for further drug design. By such modification/redesign, potentially cardiotoxic compounds at risk for QT interval prolongation may be salvaged for further clinical development.
In certain embodiments, the modified or redesigned compound does not block the ion channel in its preferred binding conformations, but retains selective binding to a desired biological target, as described in Section 5.2.3.14 below.
6.2.5.14 Modification/Redesign of Compounds for Selective Binding to Primary Biological Target
In certain embodiments, the modified or redesigned compound retains or even increases selective binding to a primary biological target. In certain embodiments, binding of the compound or modified/redesigned compound to the primary biological target blocks hepatitis C virus (HCV) production. In certain embodiments, the primary biological target is HCV NS3/4A serine protease, HCV NS5B RNA dependent RNA polymerase, or HCV NS5A monomer protein.
In certain embodiments, the modified or redesigned compound is tested in an in vitro biological assay for selective binding to its biological target.
In certain embodiments, the modified or redesigned compound is tested for binding to its biological target in silico using any of the computational models or screening algorithms disclosed herein.
In certain embodiments, the modified or redesigned compound binds with high affinity to its biological target and/or retains biological activity. In certain embodiments, where the primary biological target is HCV NS3/4A serine protease, HCV NS5B RNA dependent RNA polymerase, or HCV NS5A monomer protein, the modified or redesigned compound retains antiviral activity.
In certain embodiments, the computational models or screening algorithms disclosed herein for selecting compounds that have reduced risk of cardiotoxicity may be combined with any computational models or screening algorithms known to those of ordinary skill in the art for modeling the binding of the compound or modified/redesigned compound to its primary biological target.
6.2.5.15 Selection of New Compound from a Chemical Library
As an alternative to modification/redesign of the compound, a new compound may also be selected from the collections of a chemical or compound library, for example, new drug candidates generated by organic or medicinal chemists as part of a drug discovery program.
For example, once the methods disclosed herein identify a chemical moiety of an original tested compound as a “blocker” that is responsible for blocking, obstructing, or partially obstructing the ion channel, a new compound from a chemical library may be selected wherein, for example, the new compound does not comprise the moiety found to be responsible for the blocking, obstructing, or partially obstructing of the ion channel.
The new compound may then be retested for cardiotoxicity by repeating steps e) through i) of the methods disclosed herein.
Following re-testing, if the new compound does not block, obstruct, or partially obstruct the ion channel in any of its preferred binding conformations, the new compound may be identified as a “non-blocker.” The new compound may be characterized as having reduced risk of cardiotoxicity, and selected for further development or possible use in humans, or to be used as a compound for further drug design. By such selection of a new compound from a chemical library, an entire drug discovery program with potentially cardiotoxic compounds at risk for QT interval prolongation may be salvaged by redirecting the program to safer lead compounds for further clinical development.
The new compound selected from the chemical library may also be tested for selective binding to a desired biological target, for example, a primary biological target, as described in Section 5.2.3.14 above for the modified/redesigned compound.
6.2.6 Biological Aspects
Optionally, the methods disclosed herein include checking in silico predicted cardiotoxicities with the results of an in vitro biological assay, or in vivo in an animal model. The methods disclosed herein may also include validating or confirming the in silico predicted cardiotoxicities with the results of an in vitro biological assay, or with the results of an in vivo study in an animal model.
Accordingly, in certain aspects, provided herein are biological methods for testing, checking, validating or confirming predicted cardiotoxicities.
In certain embodiments, the method comprises testing, checking, validating or confirming the predicted cardiotoxicity of the compound or modified compound using standard assaying techniques which are known to those of ordinary skill in the art.
In certain embodiments, the method comprises testing, checking, validating or confirming the predicted cardiotoxicity of the compound or modified compound in an in vitro biological assay.
In certain embodiments, the in vitro biological assay comprises high throughput screening of ion channel and transporter activities.
In certain embodiments, the in vitro biological assay is a hERG1 channel inhibition assay, for example, a FluxOR™ potassium ion channel assay, or electrophysiology measurements in single cells, as explained below.
In certain embodiments, the method comprises testing the cardiotoxicity of the compound or modified compound in vivo in an animal model.
In certain embodiments, the cardiotoxicity of the compound or modified compound is tested in vivo by measuring ECG in a wild type mouse or a transgenic mouse model expressing human hERG, as explained below.
6.2.6.1 FluxOR™ Potassium Ion Channel Assay
In certain embodiments, the in vitro biological assay is a FluxOR™ potassium ion channel assay (see, e.g. Beacham et al., 2010, “Cell-Based Potassium Ion Channel Screening Using FluxOR™ Assay,” J. Biomol. Screen., 15(4), 441-446), which allows high throughput screening of potassium ion channel and transporter activities.
The FluxOR™ assay monitors the permeability of potassium channels to thallium (Tl+) ions. When thallium is added to the extracellular solution with a stimulus to open channels, thallium flows down its concentration gradient into the cells, and channel or transporter activity is detected with a proprietary indicator dye that increases in cytosolic fluorescence. Accordingly, the fluorescence reported in the FluxOR™ system is an indicator of any ion channel activity or transport process that allows thallium into cells.
In certain embodiments, the FluxOR™ potassium channel assay is performed on HEK 293 cells stably expressing hERG1 or mouse cardiomyocyte cell line HL-1 cells.
In certain embodiments, the FluxOR™ potassium channel assay is performed on a human adult cardiomyocyte cell line expressing hERG1.
6.2.6.2 Electrophysiology Measurements in Single Cells
In certain embodiments, the in vitro biological assay comprises electrophysiology measurements, for example, patch clamp electrophysiology measurements, which use a high throughput single cell planar patch clamp approach (see, e.g., Schroeder et al., 2003, “Ionworks HT: A New High-Throughput Electrophysiology Measurement Platform,” J. Biomol. Screen. 8 (1), 50-64).
In certain embodiments, electrophysiology measurements are made in single cells. In certain embodiments, the single cells are Chinese hamster ovary (CHO) cells stably transfected with hERG1 (CHO-hERG). In certain embodiments, the single cells are from a human adult cardiomyocyte cell line expressing hERG1.
The cells are dispensed into the PatchPlate. Amphotericin is used as a perforating agent to gain electrical access to the cells. The hERG tail current is measured prior to the addition of the test compound by perforated patch clamping. Following addition of the test compound (typically 0.008, 0.04, 0.2, 1, 5, and 25 μM, n=4 cells per concentration, final DMSO concentration=0.25%), a second recording of the hERG current is performed.
Post-compound hERG currents are usually expressed as a percentage of pre-compound hERG currents (% control current) and plotted against concentration for each compound. Where concentration dependent inhibition is observed the Hill equation is used to fit a sigmoidal line to the data and an IC50 (concentration at which 50% inhibition is observed) is determined.
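By way of non-limiting illustration, an IC50 may be estimated from % control current data as sketched below in Python. The sketch swaps in a linearized Hill fit (ordinary least squares on the logit of fractional inhibition versus the logarithm of concentration) in place of a nonlinear sigmoidal fit, and it assumes full inhibition at saturating concentration. The function name fit_hill and the synthetic data set are hypothetical; only the six concentrations are taken from the text.

```python
import math


def fit_hill(concs_um, pct_control):
    """Estimate (IC50, Hill slope) from % control current values.

    Model: pct_control = 100 / (1 + (c / IC50) ** n), so that
    ln(f / (1 - f)) = n * ln(c) - n * ln(IC50), with f = 1 - pct / 100.
    Points with f outside (0, 1) carry no logit and are skipped.
    """
    xs, ys = [], []
    for c, pct in zip(concs_um, pct_control):
        f = 1.0 - pct / 100.0
        if 0.0 < f < 1.0:
            xs.append(math.log(c))
            ys.append(math.log(f / (1.0 - f)))
    if len(xs) < 2:
        raise ValueError("need at least two points with partial inhibition")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(-intercept / slope), slope


# Concentrations from the text (uM); responses are synthetic, generated
# from hypothetical parameters IC50 = 0.5 uM and Hill slope = 1.2.
concs = [0.008, 0.04, 0.2, 1.0, 5.0, 25.0]
synthetic = [100.0 / (1.0 + (c / 0.5) ** 1.2) for c in concs]
ic50, hill = fit_hill(concs, synthetic)
```

With noisy experimental data, a weighted nonlinear fit would ordinarily be preferred; the linearized form is used here only because it requires no external library.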
6.2.6.3 Cloe Screen IC50 hERG Safety Assay
In certain embodiments, the in vitro biological assay is a Cloe Screen IC50 hERG Safety assay, for example, as provided by the company CYPROTEX (see, e.g., http://www.cyprotex.com/toxicology/cardiotoxicity/hergsafety/).
In certain embodiments, the Cloe Screen IC50 hERG Safety assay is performed using an Ionworks™ HT platform (Molecular Devices) with a CHO-hERG cell line; the platform measures whole-cell current from multiple cells simultaneously using an automated patch clamp system.
Typically, the hERG Safety assay uses a high throughput single cell planar patch clamp approach. CHO-hERG cells are dispensed into a PatchPlate. Amphotericin is used as a perforating agent to gain electrical access to the cells. The hERG tail current is measured prior to the addition of the test compound by perforated patch clamping. Following addition of the test compound (typically 0.008, 0.04, 0.2, 1, 5, and 25 μM, n=4 cells per concentration, final DMSO concentration=0.25%), a second recording of the hERG current is performed. Post-compound hERG currents are expressed as a percentage of pre-compound hERG currents (% control current) and plotted against concentration for each compound. Where concentration dependent inhibition is observed, the Hill equation is used to fit a sigmoidal line to the data and an IC50 (concentration at which 50% inhibition is observed) is determined.
In certain embodiments, the hERG safety assay using the Ionworks™ HT system generates data comparable with traditional single cell patch clamp measurements.
6.2.6.4 Electrocardiography Studies in Transgenic Mouse Models
In certain embodiments, the method comprises testing the cardiotoxicity of the compound or modified compound in vivo by measuring ECG in a transgenic mouse model expressing human hERG1.
Electrocardiography to test anti-arrhythmic activity, in particular, QT prolongation, in transgenic mice expressing hERG specifically in the heart may be performed using previously published protocols (Royer et al., 2005, “Expression of Human ERG K+ Channels in the Mouse Heart Exerts Anti-Arrhythmic Activity,” Cardiovascular Res. 65, 128-137).
Alternatively, or in addition, electrocardiography to test anti-arrhythmic activity, in particular, QT prolongation, in wild type mice may be performed.
The following examples are included to demonstrate preferred embodiments of the disclosure. It should be appreciated by those of ordinary skill in the art that the techniques disclosed in the examples which follow represent techniques discovered by the inventor to function well in the practice of the disclosure, and thus can be considered to constitute preferred modes for its practice. However, those of ordinary skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the disclosure.
In certain embodiments, the MD simulations disclosed herein comprise simulations of at least 200,000 atoms and their coordinates (protein, membrane, water and ions). In certain embodiments, the equilibration process of at least 200 ns is equivalent to taking 100 billion steps (10^11 steps) updating the position coordinates and velocities of each atom in the system in each of these steps. In certain embodiments, the MD simulations using a current state-of-the-art supercomputer, for example, the “IBM Blue Gene/Q” supercomputer system, require an equivalent of 10 million CPU hours which scales approximately linearly with the size of the computational hardware available.
The methods disclosed herein as applied to potassium ion channels may be performed as described in Examples 1-15.
Combined de novo and homology protein modeling of the hERG1 channel protein was performed as previously described (Durdagi et al., 2012, “Modeling of Open, Closed, and Open-Inactivated States of the HERG1 Channel: Structural Mechanisms of the State-Dependent Drug Binding,” J. Chem. Inf. Model., 52, 2760-2774). FIGS. 4 and 5A-5B present molecular models of the hERG1 monomer subunit and the hERG1 tetramer, respectively.
In brief, homology modeling for parts of the hERG1 structure conserved among K+ channels with known crystal structures used target-template sequence alignment performed by the ClustalW algorithm (Thompson et al., 1994, “Improving the Sensitivity of Progressive Multiple Sequence Alignment Through Sequence Weighting, Position-Specific Gap Penalties and Weight Matrix Choice,” Nucleic Acids Res. 22 (22), 4673-4680). Homology models were produced by the Comparative Modeling module in ROSETTA (Raman et al., 2009, “Structure Prediction for CASP8 with All-Atom Refinement using Rosetta,” Proteins, 77, 89-99; Chivian et al., 2006, “Homology Modeling using Parametric Alignment Ensemble Generation with Consensus and Energy-Based Model Selection,” Nucleic Acids Res. 34 (17), e112) to produce reasonably good models with ~3-4 Å backbone Cα RMSD. Since the pore domain (PD) contains an unusually long S5-Pore linker or turret which forms an 8-12-residue helix above the selectivity filter, de novo modeling of the linker and missing parts in the model was performed by Loop Modeling (Wang et al. 2007, “Protein-Protein Docking with Backbone Flexibility,” J. Mol. Biol., 373 (2), 503-519; Canutescu et al., 2003, “Cyclic Coordinate Descent: A Robotics Algorithm for Protein Loop Closure,” Protein Sci., 12 (5), 963-972) in ROSETTA.
Five steps were used in the protein modeling: (i) sequence alignment for generation of alignment based on one or more template structures, (ii) threading for generation of initial models based on template structure by copying coordinates over the aligned regions, (iii) loop modeling for rebuilding the missing parts using de novo modeling, (iv) selection of models based on reported experimental data from biochemical, biophysical, and electrophysiological studies, and (v) refinement using all-atom molecular dynamics (MD) simulations with reported constraints for the interatomic distances of the salt-bridge interaction pair obtained from electrophysiology and mutagenesis experiments performed on hERG1 channels.
The previously published sequence alignment was used (Subbotina et al., 2010, “Structural Refinement of the HERG1 Pore and Voltage-Sensing Domains with ROSETTA-Membrane and Molecular Dynamics Simulations,” Proteins, 78 (14), 2922-2934) for modeling the hERG1 channel in open, closed, and inactivated states. Open and closed state S1-S6 TM models were built based on the refined Kv1.2 model which was derived from the Kv1.2 crystal structure (PDB ID 2A79) and the Kv1.2 closed state protein model, respectively (Chivian et al., 2006, Nucleic Acids Res. 34 (17), e112; Long et al., 2005, “Crystal Structure of a Mammalian Voltage-Dependent Shaker Family K+ Channel,” Science, 309 (5736), 897-903). Open state Kv1.2, closed state Kv1.2, and open-inactivated KcsA PD (PDB ID 3F5W) from Mus musculus were used as template structures. Intracellular (IC) and extracellular (EC) domains such as antibody light and heavy chains from the available PDB coordinate files were trimmed off for generating initial incomplete models of hERG1 in S1-S6 open and closed states and S5-S6 in the open-inactivated state.
For optimal loop prediction in hERG1, fragment-based loop modeling of ROSETTA was implemented (Wang et al., 2007, J. Mol. Biol., 373 (2), 503-519; Canutescu et al., 2003, Protein Sci., 12 (5), 963-972). Fragment-based conformational searching using cyclic coordinate descent (CCD) and kinematic loop closure (KLC) algorithms for inserting 3- and 9-residue-long fragments of protein structures from the PDB fragment library was performed, and secondary structure prediction was generated by PSIPRED (McGuffin et al., 2000, “The PSIPRED Protein Structure Prediction Server,” Bioinformatics, 16 (4), 404-405). Over 20,000 models for open, closed, and open-inactivated states were generated using loop modeling. Models with a 8-12-residue helix located in the outer mouth of the selectivity filter were selected for further analysis with the Molsoft ICM program (Abagyan et al., 1994, “ICM—A New Method for Protein Modeling and Design—Applications to Docking and Structure Prediction from the Distorted Native Conformation,” J. Comput. Chem., 15 (5), 488-506). The stable models complying with published experimental constraints were used for subsequent all-atom MD simulations.
The coordinates for hERG1 generated from the homology modeling described in EXAMPLE 1, above, are provided in the attached Table A. These coordinates were used as input for the MD simulations, described in EXAMPLE 3 below.
The software MOE (Molecular Operating Environment) from Chemical Computing Group (CCG) (http://www.chemcomp.com/press_releases/2010-11-30.htm) was used to translate the 2D information of a compound (ligand) into a 3D representative structure. MOE also generated variants of the same ligand with different tautomeric, stereochemical, and ionization properties. All generated structures were conformationally relaxed using energy minimization protocols included in MOE.
Alternatively, or in addition, the software LigPrep from the Schrödinger package (Schrödinger Release 2013-2: LigPrep, version 2.7, Schrödinger, LLC, New York, N.Y., 2013) may be used to translate the 2D information of a compound (ligand) into a 3D representative structure. LigPrep may also be used to generate variants of the same ligand with different tautomeric, stereochemical, and ionization properties. All generated structures may be conformationally relaxed using energy minimization protocols included in LigPrep.
All-atom MD simulations were carried out for the selected models using NAMD (Not (just) Another Molecular Dynamics program) (Phillips et al., 2005, “Scalable Molecular Dynamics with NAMD,” J. Comput. Chem., 26, 1781-1802; Kale et al., 1999, “NAMD2: Greater Scalability for Parallel Molecular Dynamics,” J. Comp. Phys. 151, 283-312) in a Molecular Operating Environment (MOE).
MD simulations were carried out at 300 K, and physiological pH (pH 7) and 1 atm using the all-hydrogen AMBER99SB force field for the protein (Hornak et al., 2006, “Comparison of Multiple Amber Force Fields and Development of Improved Protein Backbone Parameters,” Proteins 65, 712-725) and the generalized AMBER force field (GAFF) for the ligands (Wang et al., 2004, “Development and Testing of a General Amber Force Field,” J. Comput. Chem. 25, 1157-1174).
Similar to previous MD simulations (Chivian et al. 2006, “Homology modeling using parametric alignment ensemble generation with consensus and energy-based model selection.” Nucleic Acids Res., 34, 17) of K+ channels, the particle mesh Ewald (PME) algorithm was used for electrostatic interactions. K+ ions were placed in the selectivity filter at the S0:S2:S4 positions, in accordance with previous studies (Chivian et al., 2006). The protein model was embedded into the 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) membrane bilayer using the CHARMM-GUI membrane builder protocol (Kumar et al., 2007, “CHARMM-GUI: A Graphical User Interface for the CHARMM users,” Abstr. Pap. Am. Chem. Soc. 233, 273-273; Jo et al., 2008, “Software news and updates—CHARMM-GUI: A Web-Based Graphical User Interface for CHARMM,” J. Comput. Chem. 29 (11), 1859-1865). The simulation box contained 1 protein, 416 POPC molecules, 3 K+ ions, and pore water molecules in the intracellular cavity, solvated by 0.15 M KCl aqueous salt solution. The simulation system contained approximately 176,716 atoms in total.
Structures were minimized for 200,000 steps, heated for 2 ns, then equilibrated for 20 ns. During minimization and heating, backbone atoms were heavily restrained from motion, while during equilibration those restraints were strongly reduced (i.e., heating and minimization were carried out with 100.0 kcal mol−1 Å−2 for backbone, and gradually reduced to 10 kcal mol−1 Å−2 during equilibration). The system was then subjected to a 200 ns production run with no restraints.
Atomic coordinates were saved to the trajectory every 10 ps, producing 20,000 snapshots. Atomic fluctuations (B-factors) and root mean square deviations (RMSD) from the reference structures were then calculated, as explained below.
The root mean square deviation (RMSD) of Cα atoms relative to a reference structure was calculated as follows:
$$\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left|\mathbf{r}_{i}-\mathbf{r}_{i}^{ref}\right|^{2}}$$
where N is the number of atoms, and rref is a reference structure, and is presented in
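By way of non-limiting illustration, the RMSD of one trajectory frame relative to a reference structure may be computed as sketched below in Python. The sketch assumes that the frame has already been superimposed on the reference and that Cα coordinates are supplied as (x, y, z) tuples in the same atom order; the function name rmsd and the sample coordinates are hypothetical.

```python
import math


def rmsd(coords, ref):
    """Root mean square deviation between two equally ordered coordinate sets.

    coords, ref: sequences of (x, y, z) tuples in Angstrom.
    No superposition is performed here; frames are assumed pre-aligned.
    """
    if len(coords) != len(ref) or len(coords) == 0:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    total = 0.0
    for (x, y, z), (xr, yr, zr) in zip(coords, ref):
        total += (x - xr) ** 2 + (y - yr) ** 2 + (z - zr) ** 2
    return math.sqrt(total / len(coords))


# Hypothetical two-atom example: every atom displaced by 1 Angstrom along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
value = rmsd(frame, reference)
```

Applied to each of the saved snapshots in turn, such a function yields an RMSD time series for the trajectory.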
Iterative clustering of the MD trajectory was then performed to extract dominant conformations of hERG1. The clustering procedure has been previously described (Barakat et al., 2010, “Ensemble-Based Virtual Screening Reveals Dual-Inhibitors for the P53-MDM2/MDMX Interactions,” J. Mol. Graph. & Model. 28, 555-568; Barakat et al., 2011, “Relaxed Complex Scheme Suggests Novel Inhibitors for the Lyase Activity Of DNA Polymerase Beta,” J. Mol. Graph. & Model. 29, 702-716). An average-linkage algorithm was used to group similar conformations in the 200 ns trajectory into clusters. The optimal number of clusters was estimated by observing the values of the Davies-Bouldin index (DBI) (see, e.g., Davies et al., 1979, “A Cluster Separation Measure,” IEEE Trans. Pattern Anal. Mach. Intell. 1, 224) and the percentage of variance explained by the data (SSR/SST) (see, e.g., Shao et al., 2007, “Clustering Molecular Dynamics Trajectories: 1. Characterizing the Performance of Different Clustering Algorithms,” J. Chem. Theory & Computation. 3, 231) for different cluster counts ranging from 5 to 600. At the optimal number of clusters, a plateau in the SSR/SST is expected to match a local minimum in the DBI (Shao et al., 2007). Using this methodology, three hundred (300) distinct conformations for the intracellular hERG channel were identified.
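By way of non-limiting illustration, the Davies-Bouldin index used to assess a candidate cluster count may be computed as sketched below in Python. The sketch assumes that each conformation is represented by a feature vector compared by Euclidean distance, which is a simplification; the representation and distance metric used in the cited clustering procedure are not restated here. The function names and the sample points are hypothetical.

```python
import math


def centroid(points):
    """Arithmetic mean of a non-empty list of equal-length coordinate tuples."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def davies_bouldin(clusters):
    """Davies-Bouldin index for a partition given as a list of point lists.

    For each cluster i, take the worst-case ratio (s_i + s_j) / d_ij over all
    other clusters j, where s is the mean distance of members to the cluster
    centroid and d_ij is the distance between centroids; then average over i.
    Lower values indicate more compact, better separated clusters.
    Centroids of distinct clusters are assumed not to coincide.
    """
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")
    cents = [centroid(c) for c in clusters]
    scatter = [
        sum(math.dist(p, cen) for p in c) / len(c)
        for c, cen in zip(clusters, cents)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Hypothetical one-dimensional example with two well separated clusters.
example = [[(0.0,), (2.0,)], [(10.0,), (12.0,)]]
dbi = davies_bouldin(example)
```

Evaluating such an index for each candidate cluster count, together with SSR/SST, allows the local minimum and plateau described above to be located.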
Docking:
All docking simulations employed the software AutoDock, version 4.0 (Morris et al., 2009, “AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility,” J. Computational Chemistry, 16, 2785-91). The docking method and parameters were similar to ones previously used (Barakat et al., 2009, “Characterization of an Inhibitory Dynamic Pharmacophore for the ERCC1-XPA Interaction Using a Combined Molecular Dynamics and Virtual Screening Approach,” J. Mol. Graph. Model 28, 113-130). The screening method adopted the relaxed complex scheme (RCS) (Lin et al., 2002, “Computational Drug Design Accommodating Receptor Flexibility: The Relaxed Complex Scheme,” J. Am. Chem. Soc. 124, 5632-33) through docking of the tested compounds to the 300 hERG structures generated from the above-mentioned clustering methodology. All docking simulations used the Lamarckian Genetic Algorithm (LGA). The docking parameters included an initial population of 400 random individuals; a maximum number of 10,000,000 energy evaluations; 100 trials; 40,0000 maximum generations; and the requirement that only one individual can survive into the next generation. The remaining parameters were set to their default values.
Iterative Clustering:
Clustering of the docking results followed the same adaptive procedure as one previously employed (Barakat et al., 2009). In brief, for each docking simulation a modified version of the PTRAJ module of AMBER (Case et al., 2005, “The Amber Biomolecular Simulation Programs,” J. Comput. Chem. 26, 1668-1688) clustered the docking trials. Each time a set of clusters was produced, two clustering metrics (e.g., DBI and percentage of variance (Shao et al., 2007, “Clustering Molecular Dynamics Trajectories: 1. Characterizing the Performance of Different Clustering Algorithms,” J. Chem. Theory and Comput. 3, 2312)) were calculated to assess the quality of clustering. Once acceptable values for these metrics were reached, the clustering protocol extracted the clusters at the predicted cluster counts. The screening protocol then sorted the docking results by the lowest binding energy of the most populated cluster. The objective was to extract the docking solution, for each ligand, that had the largest cluster population and the lowest binding energy from all hERG structures. In this context, for each ligand, the docking results were clustered independently for the individual structures. The clustering results were then compared and the top 40 hits were considered for further analysis. The AutoDock scoring function (Equation 2) provided a preliminary ranking for the compounds:
$$\Delta G = \Delta G_{vdW}\sum_{i,j}\left(\frac{A_{ij}}{r_{ij}^{12}}-\frac{B_{ij}}{r_{ij}^{6}}\right) + \Delta G_{hbond}\sum_{i,j}E(t)\left(\frac{C_{ij}}{r_{ij}^{12}}-\frac{D_{ij}}{r_{ij}^{10}}\right) + \Delta G_{elec}\sum_{i,j}\frac{q_{i}q_{j}}{\varepsilon(r_{ij})\,r_{ij}} + \Delta G_{tor}N_{tor} + \Delta G_{sol}\sum_{i,j}S_{i}V_{j}\,e^{-r_{ij}^{2}/2\sigma^{2}} \quad (2)$$
Here, the five ΔG terms on the right-hand side are constants. The function includes three in vacuo interaction terms, namely a Lennard-Jones 12-6 dispersion/repulsion term, a directional 12-10 hydrogen bonding term, where E(t) is a directional weight based on the angle, t, between the probe and the target atom, and a screened Coulombic electrostatic potential. In addition, the unfavorable entropy contributions are proportional to the number of rotatable bonds in the ligand, and solvation effects are represented by a pairwise volume-based term that is calculated by summing up, for all ligand atoms, the fragmental volumes of their surrounding protein atoms weighted by an exponential function and then multiplied by the atomic solvation parameter of the ligand atom (Si).
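By way of non-limiting illustration, the hit selection described above for a single ligand (for each receptor structure, take the most populated cluster; then sort by the lowest binding energy of that cluster and keep the top hits) may be sketched in Python as follows. The Cluster record, the function name top_hits, the tie-breaking rule, and the sample values are hypothetical and are not taken from the cited clustering protocol.

```python
from typing import List, NamedTuple


class Cluster(NamedTuple):
    structure_id: int      # index of the hERG structure docked against
    population: int        # number of docking trials in the cluster
    lowest_energy: float   # lowest binding energy in the cluster, kcal/mol


def top_hits(clusters: List[Cluster], n_hits: int = 40) -> List[Cluster]:
    """Select the top docking solutions for one ligand.

    Step 1: per structure, keep the most populated cluster (ties broken by
            lower energy; the tie-break is an assumption of this sketch).
    Step 2: sort the kept clusters by lowest binding energy, most negative
            first, and return the first n_hits.
    """
    best = {}
    for c in clusters:
        cur = best.get(c.structure_id)
        if cur is None or (c.population, -c.lowest_energy) > (
            cur.population,
            -cur.lowest_energy,
        ):
            best[c.structure_id] = c
    ranked = sorted(best.values(), key=lambda c: c.lowest_energy)
    return ranked[:n_hits]


# Hypothetical docking clusters for one ligand against three structures.
docked = [
    Cluster(1, 60, -7.0),
    Cluster(1, 40, -9.0),   # lower energy but less populated: not kept
    Cluster(2, 80, -8.5),
    Cluster(3, 55, -6.0),
]
selected = top_hits(docked, n_hits=2)
```

In this sketch the less populated cluster for structure 1 is discarded even though its energy is lower, reflecting the stated preference for the most populated cluster.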
The 40 lowest-energy poses for each ligand with their representative hERG1 structures were used as starting configurations for MD simulations. The AMBER99SB force field (Hornak et al., 2006, “Comparison of Multiple AMBER Force Fields and Development of Improved Protein Backbone Parameters,” Proteins 65, 712-725) was used for protein parameterization, while the generalized AMBER force field (GAFF) provided parameters for ligands (Wang et al., 2004, “Development and Testing of a General AMBER Force Field,” J. Comput. Chem. 25, 1157-1174). For each ligand, partial charges were calculated with the AM1-BCC method using the Antechamber module of AMBER 10. Protonation states of all ionizable residues were calculated using the program PDB2PQR. All simulations were performed at 300 K and pH 7 using the NAMD program (Kalé et al., 1999, “NAMD2: Greater Scalability for Parallel Molecular Dynamics,” J. Comp. Phys. 151, 283-312). Following parameterization, the protein-ligand complexes were immersed in the center of a cube of TIP3P water molecules. The cube dimensions were chosen to provide at least a 10 Å buffer of water molecules around each system. When required, chloride or sodium counter-ions were added to neutralize the total charge of the complex by replacing water molecules having the highest electrostatic energies on their oxygen atoms. The fully solvated systems were then minimized and subsequently heated to the simulation temperature with heavy restraints placed on all backbone atoms. Following heating, the systems were equilibrated using periodic boundary conditions for 100 ps, and energy restraints were reduced to zero in successive steps of the MD simulation. The simulations were then continued for 2 ns, during which atomic coordinates were saved to the trajectory every 2 ps for subsequent binding energy analysis.
The molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) technique was used to re-score the preliminary ranked docking hits (Kollman et al., 2000, “Calculating Structures and Free Energies of Complex Molecules: Combining Molecular Mechanics and Continuum Models,” Acc. Chem. Res. 33, 889-897). This technique combines molecular mechanics with continuum solvation models. The total free energy is estimated as the sum of average molecular mechanical gas-phase energies (EMM), solvation free energies (Gsolv), and entropy contributions (−TSsolute) of the binding reaction:
$$G = E_{MM} + G_{solv} - TS_{solute} \quad (3)$$
The molecular mechanical (EMM) energy of each snapshot was calculated using the SANDER module of AMBER10 with all pair-wise interactions included using a dielectric constant (ε) of 1.0. The solvation free energy (Gsolv) was estimated as the sum of electrostatic solvation free energy, calculated by the finite-difference solution of the Poisson-Boltzmann equation in the Adaptive Poisson-Boltzmann Solver (APBS) and non-polar solvation free energy, calculated from the solvent-accessible surface area (SASA) algorithm. The solute entropy was approximated using the normal mode analysis. Applying the thermodynamic cycle for each protein-ligand complex, the binding free energy was calculated using the following equation:
$$\Delta G^{\circ}_{calc} = G_{gas}^{hERG-ligand} + G_{solv}^{hERG-ligand} - \left\{ G_{solv}^{hERG} + G_{solv}^{ligand} \right\} \quad (4)$$
Here, ($G_{gas}^{hERG-ligand}$) represents the free energy per mole for the non-covalent association of the ligand-protein complex in vacuum (gas phase) at a representative temperature, while ($-\Delta G_{solv}$) stands for the work required to transfer a molecule from its solution conformation to the same conformation in vacuum (assuming that the binding conformation of the ligand-protein complex is the same in solution and in vacuum).
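By way of non-limiting illustration, the MM-PBSA bookkeeping may be sketched in Python as follows. The sketch applies Equation 3 to each snapshot, averages over snapshots, and takes the generic end-state difference between the complex and the separated receptor and ligand; this end-state form is an assumption standing in for the thermodynamic cycle described above. The Snapshot record, the function names, and all energy values are hypothetical.

```python
from statistics import mean
from typing import NamedTuple, Sequence


class Snapshot(NamedTuple):
    e_mm: float    # gas-phase molecular mechanical energy, kcal/mol
    g_solv: float  # polar plus non-polar solvation free energy, kcal/mol
    t_s: float     # T*S solute entropy term, kcal/mol


def free_energy(snapshots: Sequence[Snapshot]) -> float:
    """Average of G = E_MM + G_solv - T*S over the supplied snapshots."""
    return mean(s.e_mm + s.g_solv - s.t_s for s in snapshots)


def binding_free_energy(
    complex_snaps: Sequence[Snapshot],
    receptor_snaps: Sequence[Snapshot],
    ligand_snaps: Sequence[Snapshot],
) -> float:
    """End-state estimate: G(complex) - G(receptor) - G(ligand)."""
    return (
        free_energy(complex_snaps)
        - free_energy(receptor_snaps)
        - free_energy(ligand_snaps)
    )


# Hypothetical per-snapshot energies, chosen only to exercise the arithmetic.
cplx = [Snapshot(-100.0, -50.0, 10.0), Snapshot(-102.0, -48.0, 10.0)]
rec = [Snapshot(-60.0, -45.0, 8.0)]
lig = [Snapshot(-20.0, -15.0, 4.0)]
dg_bind = binding_free_energy(cplx, rec, lig)
```

In practice the per-snapshot terms would be supplied by the molecular mechanics, Poisson-Boltzmann, surface area, and normal mode calculations named in the text.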
The calculated binding energies, ΔG°calc, can be compared directly to the physiologically relevant concentrations. In this regard, the IC50 (concentration at which 50% inhibition is observed) values measured from, for example, in vitro biological assays are converted to the observed free energy change of binding, ΔG°obs (cal mol−1), using the equation:
$$\Delta G^{\circ}_{obs} = RT \ln K_{i} \quad (5)$$
where R is the gas constant, R=1.987 cal K−1 mol−1, T is the absolute temperature, and Ki is approximated to be the IC50 measured for a particular test compound, i. Accordingly, the calculated binding energies in silico, ΔG°calc, are compared to the observed binding energy in vitro, ΔG°obs (e.g., from inhibition studies), and thus, also to the physiologically relevant concentrations (IC50) for each of the combinations of compound and protein, for example, hERG.
The calculated binding energy of a tested compound may also be compared to that of a known control (a known hERG blocker from a standardized panel of drugs). The following equation is used:
where Ki1 and Ki2 are the molar concentrations of the tested compound and the control, respectively.
VMD (Visual MD) (Humphrey et al., 1996, “Visual Molecular Dynamics,” J. Mol. Graphics, 14 (1), 33-38) was used to visually analyze the results of the MD trajectories of the selected complexes for preliminary ranking of the docking hits.
A channel blocker binds within the cavity so that the passage of potassium ions through the selectivity filter is blocked. On the other hand, a compound may bind to the channel in a way that does not interfere with potassium passage. With that in mind, and by visually inspecting the bound structures, one can classify the tested small molecules as “blockers,” i.e., compounds that blocked the hERG1 ion channel, or as “non-blockers,” i.e., compounds that did not block the hERG1 ion channel.
BMS-986094 (“(2R)-neopentyl 2-(((((2R,3R,4R)-5-(2-amino-6-methoxy-9H-purin-9-yl)-3,4-dihydroxy-4-methyltetrahydrofuran-2-yl)methoxy)(naphthalen-1-yloxy)phosphoryl)amino)propanoate”) is a nucleotide polymerase (NS5B) inhibitor that was in Phase II development for the treatment of hepatitis. BMS-986094 is an example of a compound that was placed on clinical hold by the FDA, after nine patients in a clinical trial had to be hospitalized and one of them died because of effects on QT interval prolongation. The structure of BMS-986094 is illustrated below, where the highlighted moiety corresponds to an “amino acid based prodrug”:
As demonstrated in EXAMPLE 9 and
According to the preferred binding conformations identified for BMS-986094 from the methods disclosed herein, the part of the BMS compound that blocks the hERG ion channel is the amino acid based prodrug hanging off the left-hand side of the 5-membered sugar. Without being limited by any theory, it is believed that by modifying or, if necessary, removing the prodrug portion of the compound, the modified BMS compound will no longer block the hERG ion channel, but will retain anti-HCV activity.
Mammalian cells expressing the hERG1 potassium channel were dispensed into 384-well planar arrays and hERG tail-currents were measured by whole-cell voltage-clamping. A range of concentrations (TBD) of the test compounds were then added to the cells and a second recording of the hERG current was made. The percent change in hERG current was calculated. IC50 values were derived by fitting a sigmoidal function to concentration-response data, where concentration-dependent inhibition was observed.
The experiments were performed on an IonWorks™ FIT instrument (Molecular Devices Corporation), which automatically performs electrophysiology measurements in 48 single cells simultaneously in a specialised 384-well plate (PatchPlate™). All cell suspensions, buffers and test compound solutions were at room temperature during the experiment.
The cells used were Chinese hamster ovary (CHO) cells stably transfected with hERG (cell-line obtained from Cytomyx, UK). A single-cell suspension was prepared in extracellular solution (Dulbecco's phosphate buffered saline with calcium and magnesium pH 7-7.2) and aliquots were added automatically to each well of a PatchPlate™. The cells were then positioned over a small hole at the bottom of each well by applying a vacuum beneath the plate to form an electrical seal. The vacuum was applied through a single compartment common to all wells which were filled with intracellular solution (buffered to pH 7.2 with HEPES). The resistance of each seal was measured via a common ground-electrode in the intracellular compartment and individual electrodes placed into each of the upper wells.
Electrical access to the cell was then achieved by circulating a perforating agent, amphotericin, underneath the PatchPlate™ and then measuring the pre-compound hERG current. An electrode was positioned in the extracellular compartment and a holding potential of −80 mV for 15 sec was applied. The hERG channels were then activated by applying a depolarising step to +40 mV for 5 sec and then clamped at −50 mV for 4 sec to elicit the hERG tail current, before returning to −80 mV for 0.3 s.
A test compound was then added automatically to the upper wells of the PatchPlate™ from a 96-well microtitre plate containing a range of concentrations of each compound. Solutions were prepared by diluting DMSO solutions of the test compound into extracellular buffer. The test compound was left in contact with the cells for 300 sec before recording currents using the same voltage-step protocol as in the pre-compound scan. Quinidine, an established hERG inhibitor, was included as a positive control and buffer containing 0.25% DMSO was included as a negative control. The results for all compounds on the plate were rejected and the experiment was repeated if the IC50 value for quinidine or the negative control results were outside quality-control limits.
Each concentration was tested in 4 replicate wells on the PatchPlate™. However, only cells with a seal resistance greater than 50 MOhm and a pre-compound current of at least 0.1 nA were used to evaluate hERG blockade.
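The acceptance rule above (seal resistance greater than 50 MOhm and pre-compound current of at least 0.1 nA) can be sketched as a simple filter. The well records and function names below are hypothetical and only illustrate how such a rule might be applied before the post-compound current is expressed as a percentage of the pre-compound current.

```python
MIN_SEAL_MOHM = 50.0        # seal resistance must exceed this value
MIN_PRE_CURRENT_NA = 0.1    # pre-compound current must be at least this value


def passes_qc(seal_resistance_mohm, pre_current_na):
    """Return True if a well meets both acceptance criteria stated in the text."""
    return seal_resistance_mohm > MIN_SEAL_MOHM and pre_current_na >= MIN_PRE_CURRENT_NA


def percent_of_control(pre_current_na, post_current_na):
    """Post-compound current as a percentage of the pre-compound current."""
    return post_current_na / pre_current_na * 100.0


if __name__ == "__main__":
    # Hypothetical replicate wells: (seal MOhm, pre-compound nA, post-compound nA).
    wells = [(120.0, 0.40, 0.10), (45.0, 0.50, 0.20), (80.0, 0.05, 0.04)]
    accepted = [w for w in wells if passes_qc(w[0], w[1])]
    for seal, pre, post in accepted:
        print(seal, round(percent_of_control(pre, post), 1))
```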
Post-compound currents were then expressed as a percentage of pre-compound currents and plotted against concentration for each compound. Where concentration-dependent inhibition is observed, the data are fitted to the following equation and an IC50 value calculated:
where Y=(post-compound current/pre-compound current)×100, x=concentration, X50=concentration required to inhibit current by 50% (IC50) and s=slope of the graph.
An IC50 was reported if concentration-dependent inhibition was observed. The standard error (SE) of the IC50 model and the number of data-points used to determine the IC50 were also reported. Results are presented in TABLE 6, below, and in FIGS. 10 and 11A-11D. According to the data, both astemizole and BMS-986094 inhibit the potassium channel.
The FluxOR™ potassium channel assay was performed on Human Embryonic Kidney 293 (HEK 293) cells stably expressing hERG1 or on cells of the mouse cardiomyocyte cell line HL-1 (a gift from Dr. William Claycomb, Louisiana, USA). Briefly, FluxOR™ loading buffer was made from Hank's Balanced Saline Solution (HBSS) buffered with 20 mM HEPES and pH adjusted with NaOH to 7.4. Powerload™ concentrate and water-soluble probenecid were used as directed by the kit to enhance the dye solubility and retention, respectively. Media were removed from the cell plates manually, and 20 μL of loading buffer containing the FluxOR™ dye mix was applied to each well. Once inside the cell, the nonfluorescent AM ester form of the FluxOR™ dye was cleaved by endogenous esterases into a thallium-sensitive indicator. The dye was loaded for 60 min at room temperature and then removed manually. The cell plates were subsequently washed once with dye-free assay buffer, before adding a final volume of 20 μL assay buffer containing water-soluble probenecid. Cell plates received 2 μL per well of the screening compounds, and were then incubated at room temperature (23-25° C.) for 30 min for HEK 293 cells to allow equilibration of the test compounds in the cultures or at 37° C. for 24 h for HL-1 cells. Prior to injection, stimulation buffer was prepared from the 5× chloride-free buffer, thallium, and potassium sulfate reagents provided in the kit to contain 10 mM free thallium (5 mM Tl2SO4) and 50 mM free potassium (25 mM K2SO4). These concentrations resulted in final added concentrations of 2 mM free Tl+ and 10 mM free K+ after 1:5 dilution upon injection of the stimulus buffer into cells that had been loaded with FluxOR™ dye. To each well, 20 μL of stimulation buffer was added, and fluorescence was measured every 1 sec for a total time of 180 sec.
Fluorescence measurements were made using a Perkin Elmer EnSpire Multimode Plate Reader (Massachusetts, USA) using excitation and emission wavelengths of 490 and 525 nm, respectively.
Electrocardiography to test anti-arrhythmic activity in transgenic mice expressing hERG1 specifically in the heart may be performed using previously published protocols (Royer et al., 2005, “Expression of Human ERG K+ Channels in the Mouse Heart Exerts Anti-Arrhythmic Activity,” Cardiovascular Res. 65, 128-137).
The computational model and methods disclosed herein were used to identify drug-mediated hERG blocking activity of a test panel of compounds with high sensitivity and specificity. These in silico results were validated using hERG binding assays and patch clamp electrophysiology. As demonstrated in the following Example, the computational models and methods disclosed herein can distinguish between potent, weak, and non-hERG blockers, and enable, for the first time, high throughput screening and modification of compounds with reduced cardiotoxicity early in the drug development process.
A.1. Molecular Dynamics (MD) Simulations:
A previously published homology structure for the hERG channel in its open state (Durdagi et al., 2012, “Modeling of Open, Closed, and Open-Inactivated States of the hERG1 Channel: Structural Mechanisms of the State-Dependent Drug Binding,” J. Chem. Inform. & Model. 52, 2760-2774) was used as the initial configuration. The protein structure was embedded in a bilayer of 416 POPC membrane lipids with a 15 Å-wide buffer of water molecules and a KCl salt concentration of 0.15 M using the CHARMM-GUI membrane builder protocol (Barakat et al., 2010, “Ensemble-based Virtual Screening Reveals Dual-Inhibitors for the p53-MDM2/MDMX Interactions,” J. Mol. Graph. & Model. 28, 555-568). Three potassium ions were positioned within the selectivity filter. Two force fields were used: the AMBER99SB force field (Hornak et al., 2006, “Comparison of Multiple Amber Force Fields and Development of Improved Protein Backbone Parameters,” Proteins 65, 712-725) for the protein structure and the Amber lipid11 force field (Skjevik et al., 2012, “LIPID11: a Modular Framework for Lipid Simulations using Amber,” J. Phys. Chem. B 116, 11124-11136) for the membrane structure. Overall, 155 MD simulations were carried out using the NAMD program (Hornak et al., 2006) at 310 K. The initial simulation was carried out for 500 ns on the membrane-bound structure with no ligands within the pocket to explore the conformational dynamics of the hERG cavity and to extract dominant conformations for subsequent docking analyses.
The protocol for the MD simulation employed 200,000 minimization steps with heavy restraints on the protein backbone and lipid molecules, gradual heating for 1 ns over 1000 steps with the same restraints, equilibration for 10 ns with the restraints weakened one hundred-fold relative to those used during heating, an additional equilibration phase for 10 ns with the restraints further reduced to one tenth of those used in the previous step, and finally, running the system for the rest of the 500 ns with no restraints. The remaining 154 MD simulations were used to relax the hERG-ligand complexes obtained from docking simulations and to generate an ensemble of protein-ligand structures for binding energy analysis. These MD simulations followed the same procedure as those previously described (Jordheim et al., 2013, “Small Molecule Inhibitors of ERCC1-XPF Protein-Protein Interaction Synergize Alkylating Agents in Cancer Cells,” Mol. Pharmacol. 84, 12-24; Barakat et al., 2010, “Ensemble-based Virtual Screening Reveals Dual-Inhibitors for the p53-MDM2/MDMX Interactions,” J. Mol. Graph. & Model. 28, 555-568; Barakat et al., 2012, “Virtual Screening and Biological Evaluation of Inhibitors Targeting the XPA-ERCC1 Interaction,” PLoS ONE 7, e51329, doi:10.1371/journal.pone.0051329).
For the ligand-bound systems, the ligand parameters were obtained using the generalized amber force field (GAFF) (Wang et al., 2004, “Development and Testing of a General Amber Force Field,” J. Comput. Chem. 25, 1157-1174). For each ligand, partial charges were calculated with the AM1-BCC method using the Antechamber module of AMBER 10. Root-mean-square deviations (RMSD) and B-factors were computed over the duration of the simulation time using the PTRAJ utility. The 1-D electron density profiles were calculated using the density profile tool as implemented in VMD (Barakat et al., 2012, “DNA Repair Inhibitors: the Next Major Step to Improve Cancer Therapy,” Curr. Topics Med. Chem. 12, 1376-1390) for the last 300 ns.
A.2. Clustering Analysis:
The RMSD conformational clustering was performed using the average-linkage algorithm with cluster counts ranging from 5 to 300 clusters. Clustering analysis was performed on the 500 ns MD simulation using residues 623, 624, 651, 652, 653, 654, 655 and 656 from each monomer. Structures were extracted at 10 ps intervals over the entire 500 ns simulation time. All Cα-atoms were RMSD fitted to the minimized initial structures in order to remove overall rotation and translation. The clustering quality was assessed by calculating two clustering metrics, namely, the Davies-Bouldin index (DBI) (Davies et al., 1979, “A Cluster Separation Measure,” IEEE Trans. Pattern Anal. Mach. Intelligence 1, 224) and the “elbow criterion” (Shao et al., 2007, “Clustering Molecular Dynamics Trajectories: 1. Characterizing the Performance of Different Clustering Algorithms,” J. Chem. Theor. & Comp., 2312). A high-quality clustering scheme is expected when the DBI exhibits a local minimum as a function of the number of clusters used. On the other hand, using the elbow criterion, the percentage of variance explained by the data is expected to plateau for cluster counts exceeding the optimal number of clusters (Shao et al., 2007). Thus, when the number of clusters is varied, adequate clustering is indicated by a local minimum in the DBI together with a plateau in the percentage of variance explained, which is exhibited by the data (see Results, below).
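The two clustering metrics can be illustrated with a short, self-contained Python sketch. It assumes Euclidean distances between low-dimensional points rather than RMSD between structures, and it takes cluster labels as given (no clustering algorithm is implemented); the function names are chosen for this example only.

```python
import math
from collections import defaultdict


def _centroid(members):
    dim = len(members[0])
    return [sum(m[d] for m in members) / len(members) for d in range(dim)]


def davies_bouldin(points, labels):
    """Davies-Bouldin index: mean over clusters of the worst scatter/separation ratio."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")
    centroids = {k: _centroid(v) for k, v in clusters.items()}
    scatter = {k: sum(math.dist(p, centroids[k]) for p in v) / len(v)
               for k, v in clusters.items()}
    total = 0.0
    for i in clusters:
        total += max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
                     for j in clusters if j != i)
    return total / len(clusters)


def variance_explained(points, labels):
    """Fraction of total variance explained by the clustering (elbow criterion)."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    overall = _centroid(points)
    total_ss = sum(math.dist(p, overall) ** 2 for p in points)
    within_ss = sum(math.dist(p, _centroid(v)) ** 2
                    for v in clusters.values() for p in v)
    return 1.0 - within_ss / total_ss


if __name__ == "__main__":
    # Toy data: two tight, well-separated groups.
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    print(davies_bouldin(pts, [0, 0, 1, 1]), variance_explained(pts, [0, 0, 1, 1]))
```

In practice both quantities would be evaluated for each cluster count in the scanned range and inspected for a DBI minimum and a variance plateau.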
A.3. Principal Component Analysis:
PCA can transform the original space of correlated variables from a large MD simulation into a reduced space of independent variables comprising the essential dynamics of the system (Barakat et al., 2011, “Relaxed Complex Scheme Suggests Novel Inhibitors for the Lyase Activity of DNA Polymerase Beta,” J. Mol. Graph. & Model. 29, 702-716). For a typical protein, the system's dimensionality is thereby reduced from tens of thousands to fewer than fifty degrees of freedom.
To perform PCA for a subset of N atoms, the entire MD trajectory was RMSD fitted to a reference structure, in order to remove all rotations and translations. The covariance matrix was then calculated from the Cartesian atomic co-ordinates as:
$$\sigma_{ij} = \left\langle \left(r_i - \langle r_i \rangle\right)\left(r_j - \langle r_j \rangle\right) \right\rangle \qquad (8)$$
where $r_i$ represents one of the three Cartesian co-ordinates (xi, yi or zi), the angle brackets denote an average over the trajectory, and the eigenvectors of the covariance matrix constitute the essential vectors of the motion.
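A minimal sketch of Equation (8) follows, assuming the frames have already been RMSD fitted and flattened into coordinate lists. The dominant essential vector is obtained here by power iteration, which is a simplification chosen to keep the example dependency-free; a full eigendecomposition would normally be used to obtain all modes.

```python
import math


def covariance_matrix(frames):
    """Covariance of Cartesian coordinates over frames (each frame: flat list of floats)."""
    n_frames = len(frames)
    n_coords = len(frames[0])
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coords)]
    cov = [[0.0] * n_coords for _ in range(n_coords)]
    for frame in frames:
        dev = [frame[k] - means[k] for k in range(n_coords)]
        for i in range(n_coords):
            for j in range(n_coords):
                cov[i][j] += dev[i] * dev[j] / n_frames
    return cov


def dominant_mode(cov, iterations=200):
    """Largest eigenvalue and its eigenvector, estimated by power iteration."""
    n = len(cov)
    vec = [1.0 / math.sqrt(n)] * n
    value = 0.0
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # no variance along the current direction
            break
        vec = [x / norm for x in nxt]
        value = norm
    return value, vec


if __name__ == "__main__":
    # Toy trajectory: one coordinate pair that moves only along x.
    frames = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
    print(dominant_mode(covariance_matrix(frames)))
```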
A.4. Docking:
The 45 representatives of all clusters were used as rigid targets for the docking simulations. All docking runs were performed using AUTODOCK (Osterberg et al., 2002, “Automated Docking to Multiple Target Structures: Incorporation of Protein Mobility and Structural Water Heterogeneity in Autodock,” Proteins 46, 34-40), version 4.028. For each ligand, an initial docking simulation was performed within the whole cavity against the 45 dominant conformations. Results from this ensemble docking procedure were clustered using RMSD clustering from AUTODOCK with 2 Å cutoff, followed by ranking of the docking binding energies. More comprehensive docking simulations against the 45 dominant conformations were then performed within the preferred halves of the cavity that were selected by the top hits from the initial docking simulation.
For the initial run, the docking box spanned 126 grid points in each direction, with a spacing of 0.238 Å between every two adjacent points, enough to cover twice the whole pocket. For the more focused docking simulations, the box size was confined to 52 × 82 × 126 grid points with the same spacing between points; however, the center of the box was moved to be more focused on the residues of the selected half pocket. For all docking simulations, the parameters were similar to those previously described (Barakat et al., 2012, “Virtual Screening and Biological Evaluation of Inhibitors Targeting the XPA-ERCC1 Interaction,” PLoS ONE 7, e51329, doi:10.1371/journal.pone.0051329; Barakat et al., 2013, “A Computational Model for Overcoming Drug Resistance Using Selective Dual-Inhibitors for Aurora Kinase A and Its T217D Variant,” Mol. Pharm. 10, 4572-4589). In brief, using the Lamarckian Genetic Algorithm (LGA), the docking parameters included an initial population of 350 random individuals; a maximum number of 25,000,000 energy evaluations; 100 trials; 34,000 maximum generations; a mutation rate of 0.02; a crossover rate of 0.80; and the requirement that only one individual can survive into the next generation.
A.5. Calculating the Shortest Distance from the Channel Mouth:
The shortest distance between a tested compound and one of the Thr623 residues at the mouth of the hERG channel was calculated using VMD to construct a table of all contact atoms within 20 Å for the four threonine residues and the tested compound. Distances were calculated for each atom pair and all distances were sorted to extract the shortest distance.
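The distance calculation can be sketched independently of VMD as a brute-force search over atom pairs. The coordinates below are made up for illustration, and the function name and the use of plain coordinate tuples are assumptions of this example rather than the VMD procedure itself.

```python
import math


def shortest_contact_distance(ligand_atoms, residue_atoms, cutoff=20.0):
    """Shortest ligand-residue atom-pair distance within the cutoff (in Å), or None."""
    distances = [math.dist(a, b) for a in ligand_atoms for b in residue_atoms]
    contacts = sorted(d for d in distances if d <= cutoff)
    return contacts[0] if contacts else None


if __name__ == "__main__":
    # Hypothetical coordinates in Å for a few ligand atoms and threonine atoms.
    ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
    thr623 = [(4.5, 0.0, 0.0), (0.0, 9.0, 0.0)]
    print(shortest_contact_distance(ligand, thr623))
```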
A.6. Binding Energy Analysis:
The MM-PBSA technique (Kollman et al., 2000, “Calculating Structures and Free Energies of Complex Molecules: Combining Molecular Mechanics and Continuum Models,” Acc. Chem. Res. 33, 889-897) was used to predict binding energies. Similar to the work described previously in the literature (Barakat et al., 2010, “Ensemble-Based Virtual Screening Reveals Dual-Inhibitors for the P53-MDM2/MDMX Interactions,” J. Mol. Graph. & Model. 28, 555-568; Barakat et al., 2013, “A Computational Model for Overcoming Drug Resistance Using Selective Dual-Inhibitors for Aurora Kinase A and Its T217D Variant,” Mol. Pharm. 10, 4572-4589; Barakat et al., 2013, “Detailed Computational Study of the Active Site of the Hepatitis C Viral RNA Polymerase to Aid Novel Drug Design,” J. Chem. Inform. & Model. 53, 3031-3043; Friesen et al., 2012, “Discovery of Small Molecule Inhibitors that Interact with Gamma-Tubulin,” Chem. Biol. & Drug Design 79, 639-652), the total free energy for each system was estimated as the sum of the average molecular mechanical gas-phase energies (EMM), solvation free energies (Gsolv), and entropy contributions (−TSsolute) of the binding reaction:
$$G = E_{\mathrm{MM}} + G_{\mathrm{solv}} - TS_{\mathrm{solute}} \qquad (9)$$
The molecular mechanical (EMM) energy of each snapshot was calculated using the SANDER module of AMBER10. The solvation free energy (Gsolv) was estimated as the sum of electrostatic solvation free energy, calculated by the finite-difference solution of the Poisson-Boltzmann equation in the Adaptive Poisson-Boltzmann Solver (APBS) and non-polar solvation free energy, calculated from the solvent-accessible surface area (SASA) algorithm:
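$$\Delta G^{\circ} = G_{\mathrm{gas}}^{\mathrm{hERG\text{-}ligand}} + G_{\mathrm{solv}}^{\mathrm{hERG\text{-}ligand}} - \left\{ G_{\mathrm{solv}}^{\mathrm{hERG\text{-}ligand}} + G_{\mathrm{gas}}^{\mathrm{hERG}} \right\} \qquad (10)$$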
The parameters used included a dielectric constant for the protein-ligand complex of 1, a dielectric constant for the water of 80, an ionic concentration of 0.15 M, and a surface tension of 0.005 with a zero surface offset to estimate the nonpolar contribution of the solvation energy.
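The bookkeeping behind Equation (9) can be sketched as follows. Per-snapshot energies are averaged and combined as EMM + Gsolv − TSsolute for each species; the binding estimate is then taken as the complex minus the separated hERG and ligand, which is the usual end-state convention and is an assumption of this sketch rather than a restatement of Equation (10). All numbers are invented, and the entropy term may be averaged over fewer snapshots than the other terms.

```python
from statistics import mean


def total_free_energy(e_mm, g_solv, ts_solute):
    """Equation (9): average E_MM plus average G_solv minus average T*S (same units)."""
    return mean(e_mm) + mean(g_solv) - mean(ts_solute)


def binding_free_energy(g_complex, g_receptor, g_ligand):
    """Assumed end-state difference: complex minus separated receptor and ligand."""
    return g_complex - (g_receptor + g_ligand)


if __name__ == "__main__":
    # Invented per-snapshot values in kcal/mol, for illustration only.
    g_complex = total_free_energy([-510.0, -490.0], [-210.0, -190.0], [48.0, 52.0])
    g_receptor = total_free_energy([-400.0, -380.0], [-205.0, -195.0], [38.0, 42.0])
    g_ligand = total_free_energy([-62.0, -58.0], [-31.0, -29.0], [19.0, 21.0])
    print(binding_free_energy(g_complex, g_receptor, g_ligand))
```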
Two thousand (2000) snapshots from each trajectory were selected to predict the molecular mechanics and solvation contributions; fifty (50) snapshots from each trajectory were selected to predict entropy. The snapshot frequency was selected by estimating the correlation time, similar to the work described by Genheden and Ryde (Genheden and Ryde, 2010, “How to Obtain Statistically Converged MM/GBSA Results,” J. Comput. Chem. 31, 837-846). That is, the delta MM-PBSA energy points from the whole MD trajectory (X) were divided into blocks (Yi) of equal time length (τ). The function Φ was then calculated according to the following equation:
where σ²(X) is the variance of the delta MM-PBSA energy points over the whole trajectory and σ²(Y)τ is the variance of the averages of the energy data points within the blocks of length τ (i.e., the average delta energy is calculated for each block, and the variance of the n block averages is then used in Equation 11 as σ²(Y)τ for a given τ). The length of the block (τ) is then varied; the values of Φ are expected to become constant once the block averages are statistically independent, and at that point the correlation time can be estimated.
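The block-averaging procedure can be sketched in a few lines of Python. The expression used for Φ below, τ·σ²(Y)τ/σ²(X), is the statistical-inefficiency form commonly used for block averaging and is assumed here; it is not taken from the text. Block length is expressed as a number of consecutive energy points, and the function name is hypothetical.

```python
from statistics import mean, pvariance


def block_phi(energies, block_len):
    """Assumed form: block_len * var(block averages) / var(all points used)."""
    n_blocks = len(energies) // block_len
    if n_blocks < 2:
        raise ValueError("need at least two complete blocks")
    used = energies[: n_blocks * block_len]  # drop the incomplete trailing block
    total_var = pvariance(used)
    if total_var == 0:
        raise ValueError("energies have zero variance")
    block_means = [mean(used[i * block_len:(i + 1) * block_len])
                   for i in range(n_blocks)]
    return block_len * pvariance(block_means) / total_var


if __name__ == "__main__":
    # Invented energy series; scan block lengths and look for a plateau in phi.
    series = [float((7 * k) % 11) for k in range(200)]
    for tau in (1, 2, 5, 10, 20):
        print(tau, round(block_phi(series, tau), 3))
```

With a block length of one point the block averages are the data themselves, so the function returns 1 by construction.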
A.7. Electrophysiology Buffers and Compounds:
Dulbecco's phosphate-buffered saline was purchased from Corning. Intracellular (IC) buffer was composed of (mM) ethylene glycol tetraacetic acid (EGTA) (11), MgCl2 (2), KCl (30), KF (90), 4-(2-hydroxyethyl)-1-piperazineethane sulfonic acid (HEPES) (10), and K2-ATP (5), and was pH adjusted with KOH to 7.3. Extracellular (EC) buffer was composed of (mM) CaCl2 (2), MgCl2 (1), HEPES (10), KCl (4), and NaCl (145), and was pH adjusted with NaOH to 7.4. Astemizole, pimozide, cisapride, rofecoxib, celecoxib, haloperidol, terfenadine, quinidine, amiodarone, E-4031, trimethoprim, resveratrol, ranitidine HCl, acetyl salicylic acid, naproxen, ibuprofen, diclofenac Na, acetaminophen, guanosine, and 1-naphthol were obtained from Sigma-Aldrich. 2-amino-6-O-methyl-2′C-methyl guanosine (MG) was purchased from Carbosynth (Berkshire, UK). BMS-986094 was locally synthesized by Syninnova (Edmonton, AB). Compounds were serially diluted in dimethylsulfoxide (DMSO) and then added to the EC buffer at a constant concentration of 0.01% DMSO. A reagent (part No. 910-0049, FLreagent; Fluxion Biosciences) that reduced compound loss due to adhesion/adsorption to the plate was also added to compound solutions (1:100 ratio).
A.8. Predictor™ hERG Fluorescence Polarization Assay:
Compounds that bind to the hERG channel proteins were identified by their ability to displace the tracer (Predictor hERG Tracer Red) and decrease the fluorescence polarization. The Tracer Red ligand was stored in 100% DMSO and diluted to 8 nM in assay buffer (50 mM Tris-HCl, 1 mM MgCl2, 10 mM KCl, 0.05% Pluronic F127, pH 7.4, 4° C.) on the day of the experiment. Test samples and controls were diluted in assay buffer to 16 concentrations with half-log intervals. Cell membranes were removed from the −80° C. freezer and placed on ice after defrosting. Membranes working solution protein concentration was 0.3 mg/mL. The assay was compiled by adding 5 μL of test compound or control buffers, 5 μL of the Tracer Red ligand and 10 μL of cell membranes to a black 384-well plate (Corning, Cat No. 3677). The plates were mixed and then incubated for 6 h prior to reading on a Perkin Elmer EnVision plate reader (Excitation 531/25 nm, Emission 579/25 nm). IC50 values were derived by fitting a sigmoidal function to concentration-response data, where concentration-dependent inhibition was observed. All IC50 data were calculated and analyzed using GraphPad Prism 6 (GraphPad Software).
A.9. Cell Culture and Transfection:
AC10 adult human cardiomyocytes (ATCC Cat. No. PTA-1501) were seeded one day before the transfection in a 6-well plate in complete growth media with 5% fetal bovine serum (FBS) at 37° C. and 5% CO2. Transfections were carried out according to the manufacturer's protocols. Briefly, x μg of lentiviral ORF expression plasmid DNA and y μl of Lenti-Pac HIV mix were first mixed in Opti-MEM I in one tube. In a separate tube, z μl of EndoFectin Lenti was diluted with Opti-MEM I. The diluted EndoFectin Lenti reagent was added dropwise to the DNA-containing tube. The mixture was incubated at room temperature to allow the DNA-EndoFectin complex to form. The complex mixture was then directly added to each well and the plate was gently swirled. After incubation at 37° C. and 5% CO2 for 12-16 h, medium containing the mixtures was gently removed, and fresh growth medium was added. At 48 hours post-transfection, pseudovirus-containing culture medium was collected in sterile capped tubes and centrifuged. The supernatant was filtered through 0.45 μm low protein-binding filters.
A.10. Transduction of AC10 Cells:
AC10 cells were plated into a 24-well plate two days before the viral infection so that the cells reached 70-80% confluency at the time of transduction. For each well, the viral suspension was diluted in complete medium in the presence of Polybrene, and the cells were infected with the diluted viral suspension. Cells were incubated at 37° C. in 5% CO2 overnight. Cells were then split 1:5 onto a 6-well plate and incubated for a further 48 hours in cell-specific medium. The infected target cells were analyzed for transient expression of transgenes by flow cytometry and fluorescence microscopy. To select stably transduced cells, the old medium was replaced with fresh selective medium containing the appropriate selection drug every 3-4 days until drug-resistant colonies became visible.
A.11. Patch Clamp Cell Culture:
AC10 cells constitutively expressing hERG channels and their corresponding negative control cells were validated in-house on IonFlux 16 (Molecular Devices). The medium was composed of 10% fetal bovine serum, 1% penicillin-streptomycin, and 89% Dulbecco's Modified Eagle Medium (DMEM)/F12 (Invitrogen Corporation). Cells were grown in T175 tissue culture flasks, split at 70%-90% confluency with trypsin/ethylene diamine-tetraacetic acid (0.05%; Invitrogen Corporation), and maintained at 37° C. and 5% CO2. When designated for experiments, passaged cells were moved to 28° C. for at least 24 h. Harvesting was performed with trypsin/ethylene diamine-tetraacetic acid 0.05% for 4 min, and detached cells were pelleted and resuspended in a solution of 97.5% serum free media (Gibco No. 12052; Invitrogen) and 2.5% HEPES buffer solution (Gibco No. 15630; Invitrogen) for 0.5-2.5 h at 23° C. Immediately before experiments, cells were washed once in EC buffer.
A.12. Automated Patch Clamp IonFlux Software and Experimental Protocols:
Compounds were diluted as described above, and distributed into compound wells (250 μL/well) manually. Cells were distributed to the designated wells and the plate was inserted into the IonFlux system. Plates were primed for 3 min according to the following protocol: (1) traps and compounds at 8 psi for t=0-160 s and 1.6 psi for t=160-175 s, (2) traps but not compounds at 1.6 psi for t=175-180 s, and (3) main channel at 1 psi for t=0-160 s and 0.2 psi for t=160-180 s. After cell introduction at 5-8×10⁶ cells/mL, the plates were reprimed: (1) traps and compounds at 5 psi for t=0-15 s and 2 psi for t=15-55 s, (2) traps but not compounds at 2 psi for t=55-60 s, and (3) main channel at 1 psi for t=0-20 s, 0.5 psi for t=20-40 s, and 0.2 psi for t=40-60 s. Then, cells were introduced into the main channel and trapped at lateral trapping sites with a trapping protocol: (1) trapping vacuum of 6 mmHg for t=0-30 s and 4 mmHg for t=30-85 s, (2) main channel pressure of 0.1 psi for t=0-2 s, followed by 15 repeated square pulses of 0-0.2 psi with baseline duration of 4.5 s and pulse duration of 0.5 s, followed by 0.1 psi for 8 s. One to five break protocols were performed and currents were stabilized before compound testing. A negative control (EC buffer with 0.01% DMSO) was tested before the compounds, which were infused for 5 to 15 min. Finally, cells were washed with EC buffer. The voltage command protocols used in the current study are similar to those employed in conventional patch clamping for hERG current: the holding potential (Vh) was −80 mV, an initial step to +50 mV for 800 ms inactivated the channels, and a subsequent 1-s step to −50 mV elicited the outward tail current that was measured.
A.13. Automated Patch Clamp Data Analysis:
The remaining percentage of current (REM) was calculated by subtracting the full-block current level (e.g., positive controls) from the measured current level, and then dividing by the difference between no block (e.g., negative controls) and full block (negative minus positive controls). The half maximal inhibitory concentration (IC50) and Hill slope (H) for compound concentrations (C) were fit to the following formula for the data points:
$$\mathrm{REM} = I_{100} + \frac{I_0 - I_{100}}{1 + \left([C]/\mathrm{IC}_{50}\right)^{H}} \qquad (12)$$
where I0 and I100 refer to no block and full block, respectively. IonFlux software (Molecular Devices), GraphPad Prism (GraphPad Software), and Microsoft Excel (Microsoft) were used to analyze and present IC50 values, currents, and seals.
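The normalization and the Hill-type relation of Equation (12) can be written out as a short Python sketch. The function names are assumptions for this example, the currents are invented, and no curve fitting is performed; the functions only evaluate the two expressions.

```python
def remaining_current(current, full_block, no_block):
    """Remaining fraction of current, normalized between full block and no block."""
    return (current - full_block) / (no_block - full_block)


def hill_rem(conc, ic50, hill, i0=1.0, i100=0.0):
    """Equation (12): REM at concentration conc for a given IC50 and Hill slope."""
    return i100 + (i0 - i100) / (1.0 + (conc / ic50) ** hill)


if __name__ == "__main__":
    # Invented example: IC50 of 1 uM and a Hill slope of 1, concentrations in uM.
    for c in (0.01, 0.1, 1.0, 10.0, 100.0):
        print(c, round(hill_rem(c, ic50=1.0, hill=1.0), 3))
```

At a concentration equal to the IC50 the expression returns the midpoint between I0 and I100, whatever the Hill slope.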
A.14. Patch Clamp Data Inclusion Criteria:
IC50 values were calculated at temperature (33° C.-35° C.) from seven-point concentration-response curves with a minimum of n=6 at each concentration. Data points were accepted if they passed the following inclusion criteria: (1) acceptable current run-up/run-down (<10%) during compound incubation and before the positive control, (2) the negative control associated with the same cell trap did not show current block, and (3) the positive control associated with the same cell trap showed complete current block. The rate of current recovery during washout of compound was monitored, and outliers were excluded to filter out recordings that were lost.
A 500 ns molecular dynamics (MD) simulation was performed using an explicitly solvated membrane-bound hERG channel, an IBM Blue Gene/Q supercomputer, and an automated relaxed complex scheme (RCS) docking algorithm (Barakat et al., 2013, “A Computational Model for Overcoming Drug Resistance Using Selective Dual-Inhibitors for Aurora Kinase A and Its T217D Variant,” Mol. Pharm. 10, 4572-4589). The protocol involved six steps: (1) extracting the dominant (45) conformations of hERG's inner cavity; (2) performing blind docking simulations within the inner cavity against these 45 conformations to identify the highest affinity binding locations; (3) performing focused ligand docking to the top-ranked locations; (4) using all-atom MD simulations with explicit solvent and ions to rescore top hits; (5) calculating the molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) binding energies of the refined complexes; (6) estimating the likelihood of channel blocking based on the ligand's lowest binding energy and shortest distance to the channel's pore. Since most hERG blockers bind within the inner hERG cavity in the channel's open state (Mitcheson et al., 2000, “A Structural Basis for Drug-Induced Long QT Syndrome,” Proc. Natl. Acad. Sci. USA 97, 12329-12333; Spector et al., 1996, “Class III Antiarrhythmic Drugs Block HERG, a Human Cardiac Delayed Rectifier K+ Channel. Open-Channel Block by Methanesulfonanilides,” Circ. Res. 78, 499-503), an open-state model (Durdagi et al., 2012, “Modeling of Open, Closed, and Open-Inactivated States of the hERG1 Channel: Structural Mechanisms of the State-Dependent Drug Binding,” J. Chem. Inform. & Model. 52, 2760-2774) was used as an initial configuration for MD simulations prior to extracting representative inner cavity structures for docking.
To confirm the model's reproducibility, electron density profiles were calculated for the lipid bilayer's heads and tails, protein, water and ions. The distance between the centroids of the average electron density profiles of the lipid head groups determines the membrane boundaries, illustrating the internal component distributions. As may be seen in
Sampling of the channel's conformational space allowed extracting the dominant hERG conformations for docking. Principal component analysis (PCA) helped reduce the system's dimensionality keeping the essential dynamics (see Methods of Materials, above). The dominant eigenvectors decay exponentially and the largest eigenvalues represent correlated hERG motions with the largest standard deviations along orthogonal directions.
The huge search space and many redundant docking solutions due to hERG symmetry pose additional challenges. Hence, the cavity was divided into four halves for two ensemble-based ligand screening simulations. The first identified preferred ligand binding locations used an ensemble-based blind docking with the 45 dominant conformations, involving the whole cavity (see
Finally, the degree of hERG blockage by ligands was quantified using both the binding energies and distances to the permeation pore. Binding affinity alone yields false positives since a ligand could bind tightly far from the permeation pore leading to a minor effect on the ions' channel passage. Binding weakly close to the permeation pore could be impermanent due to large thermal fluctuations. Hence, using either the binding energy or the shortest distance from the permeation pore alone is insufficient.
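By way of non-limiting illustration, the two-criterion test described above can be sketched as follows. The class name `DockedPose`, the function name `is_predicted_blocker`, and both cutoff values are hypothetical placeholders chosen only for the sketch; they are not thresholds taken from this disclosure. The sketch also assumes that both criteria must be met by the same pose, which is one possible reading of the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DockedPose:
    binding_energy: float  # kcal/mol; more negative means tighter binding
    pore_distance: float   # angstroms; shortest ligand-to-pore distance


def is_predicted_blocker(poses, energy_cutoff=-30.0, distance_cutoff=5.0):
    """Flag a compound only if some pose is BOTH tightly bound and near the pore.

    The cutoffs are illustrative assumptions, not values from the source.
    """
    return any(
        p.binding_energy <= energy_cutoff and p.pore_distance <= distance_cutoff
        for p in poses
    )


# A tight binder far from the pore and a weak binder near it are both rejected.
tight_but_far = [DockedPose(-45.0, 12.0)]
weak_but_near = [DockedPose(-10.0, 2.0)]
tight_and_near = [DockedPose(-45.0, 2.0)]
```

Requiring both conditions on a single pose avoids flagging a compound whose tightest pose is remote from the pore while a different, weakly bound pose happens to sit close to it.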
To determine parameter thresholds for hERG blockers, a panel of 22 compounds including hERG blockers and non-blockers (see TABLE 7, below) was used (see also
Three examples from TABLE 7 are particularly illustrative: acetaminophen (a non-hERG blocker), astemizole (a potent hERG blocker), and BMS-986094 (a potent HCV replication inhibitor, which caused sudden death and severe cardiotoxicity in patients (Sheridan, 2012, “Calamitous HCV trial casts shadow over nucleoside drugs,” Nat. Biotechnol. 30, 1015-1016)).
To validate these computational predictions, the 22 compounds were then tested for hERG binding using the Predictor™ assay and patch clamp electrophysiology using AC10 cardiomyocytes stably expressing the hERG channel (see
Consistent with the in silico predictions and with previously reported experimental data, the 10 already known hERG blockers, in addition to BMS-986094, displaced the hERG-bound dye. For example, these 10 positive controls were reported to block hERG in in vitro electrophysiology and binding assays with IC50 values similar to those obtained here (Wible et al., 2005, “A Novel Comprehensive High-Throughput Screen for Drug-Induced Herg Risk,” J. Pharmacol. Toxicol. Methods 52, 136-145; Deacon et al., 2007, “Early Evaluation of Compound QT Prolongation Effects: A Predictive 384-Well Fluorescence Polarization Binding Assay for Measuring HERG Blockade,” J. Pharmacol. Toxicol. Methods 55, 238-247; Diaz et al., 2004, “The [3H]Dofetilide Binding Assay is a Predictive Screening Tool for HERG Blockade and Proarrhythmia: Comparison of Intact Cell and Membrane Preparations and Effects of Altering [K+]o,” J. Pharmacol. Toxicol. Methods 50, 187-199). In contrast, none of the known non-hERG blockers displaced the dye, nor did they affect hERG tail currents, implying that the negative controls do not bind sufficiently closely to the channel permeation pore to block it (see
The computational models and methods disclosed herein were used to identify drug-mediated hERG blocking activity of BMS-986094 and its metabolites.
BMS-986094 and its metabolites (1-naphthol (1-NP), 2-amino-6-O-methyl-2′C-methyl guanosine (MG) and guanosine) were computationally and experimentally examined according to the methods in the previous example. Consistent with the results of these computational methods and models, experiments showed that BMS-986094 is a potent hERG blocker completely displacing the dye with IC50=0.003 μM (see
Using the methods described herein, BMS-986094 may be modified as described in EXAMPLE 10. For example, the amino acid based prodrug in the BMS-986094 structure depicted above may be modified to a new prodrug moiety, such as an alkoxyalkyl group (Ciesla et al., 2003, “Esterification of Cidofovir with Alkoxyalkanols Increases Oral Bioavailability and Diminishes Drug Accumulation in Kidney,” Antiviral Res. 59, 163-171; Hostetler, 2009, “Alkoxyalkyl Prodrugs of Acyclic Nucleoside Phosphonates Enhance Oral Antiviral Activity and Reduce Toxicity: Current State of the Art,” Antiviral Res. 82, A84-98), as shown in Examples 15a-d, below:
The methods disclosed herein as applied to sodium ion channels may be performed as described in Examples 16-19.
Homology protein modeling of the α-subunit of the human Nav1.5 was performed as follows.
The full-length amino acid sequence (2016 amino acid residues) of the α-subunit of the human Nav1.5 (Uniprot accession code: Q14524-1) was downloaded from the Uniprot database (Magrane et al., 2011, “Uniprot Knowledgebase: A Hub of Integrated Protein Data,” Database 2011). Initially, the full Nav1.5 sequence was dissected into nine sub-domains, four trans-membrane domains (TRM1-TRM4) and five cytoplasmic domains (CYT1-CYT5). Dissection was carried out based on the ProtParam tool (Wilkins et al., 1999, “Protein identification and analysis tools in the ExPASy server,” Methods Mol. Biol. 112: 531-552) on the ExPASy bioinformatics resource portal (Artimo et al., 2012, “ExPASy: SIB Bioinformatics Resource Portal,” Nucleic Acids Res 40: W597-603). Following dissection, 10 full models for each sub-domain were separately generated using the I-Tasser bioinformatics software (Roy et al., 2010, “I-TASSER: a unified platform for automated protein structure and function prediction,” Nat. Protoc. 5: 725-738) based on the NavAb bacterial sodium channel (Payandeh et al., 2012, “Crystal Structure of a Voltage-Gated Sodium Channel in two Potentially Inactivated States,” Nature 486: 135-139) as the main template for the TRM domains. NavAb crystal structures represent the closed-inactivated states of the channel (PDB codes: 3RVY, 3RVZ, 3RW0 and 4EKW) (Payandeh et al., 2011, “The Crystal Structure of a Voltage-Gated Sodium Channel,” Nature 475: 353-359). The resolved crystal structures of the two states are very similar with the exception of a very minor shift that is close to the intracellular end of the four S6 helices.
These two states of VGSCs are responsible for the binding of common Nav1.5 blockers, including the anti-anginal drug ranolazine (inactivated state) (Sokolov et al., 2013, “Proton-Dependent Inhibition of the Cardiac Sodium Channel Nav1.5 by Ranolazine,” Front Pharmacol 4: 78) and the antiarrhythmic drug mexiletine (closed state) (Undrovinas et al., 2006, “Ranolazine Improves Abnormal Repolarization and Contraction in Left Ventricular Myocytes of Dogs with Heart Failure by Inhibiting Late Sodium Current,” J Cardiovasc Electrophysiol, 17 Suppl 1: S169-S177). The open state of the Nav1.5 channel has been shown to bind VGSC activators (Tikhonov et al., 2005, “Sodium Channel Activators: Model of Binding Inside the Pore and a Possible Mechanism of Action,” FEBS Lett 579: 4207-4212), and rarely blockers, such as the antiarrhythmic flecainide (Ramos et al., 2004, “State-Dependent Trapping of Flecainide in the Cardiac Sodium Channel,” J Physiol 560: 37-49). Flecainide has been shown to bind strongly to the open activated state of the channel (IC50 7 μM) and only very weakly to the closed/inactivated state (IC50 345 μM). The amino acid sequences for each sub-domain selected from the main Nav1.5 sequence are given in TABLE 8, below.
A full homology modeling cycle by iterative threading assembly refinement (I-Tasser) started with a multi-threading procedure using the software LOMETS, followed by alignment of the query protein on the selected templates from the pool of PDB resolved NMR or X-ray crystal structures. Following these extensive threading and alignment procedures, secondary structures of the query protein domain were predicted using the PSIPRED tool. The correctly predicted domains were then assembled, and unaligned regions, such as loops, were predicted through ab initio modeling. Structure assembly was carried out through a modified replica-exchange Monte Carlo simulation. The simulation was guided by statistical as well as energetic potentials. This was followed by final ranking and refinement stages for the generated model. For Nav1.5, final model refinement was carried out using the ModRefiner algorithm of I-Tasser (Xu et al., 2011, “Improving the Physical Realism and Structural Accuracy of Protein Models by a Two-Step Atomic-Level Energy Minimization,” Biophys J 101: 2525-2534). ModRefiner enhanced the overall quality of the generated models, producing models with optimum side chain packing and minimal numbers of steric clashes. TABLE 8 also shows the I-Tasser calculated TM scores for the best model for each domain; all TRM domains had a high TM score (>0.5) (Zhang et al., 2004, “Scoring Function for Automated Assessment of Protein Structure Template Quality,” Proteins 57: 702-710). The relatively low TM score for TRM1 is believed to be due to the long loop (84 residues, Leu276-Ala359). Before incorporating this loop into the final model, it was first excised and then modeled separately with I-Tasser, followed by a structural refinement using a short, all-atom solvated MD simulation (≈1 ns).
Finally, the TRM domains were assembled by superposition on the NavAb wild type crystal structure (PDB code: 4EKW) and the final models were again refined with fragment-guided molecular dynamic simulation FG-MD (Zhang et al., 2011, “Atomic-Level Protein Structure Refinement using Fragment-Guided Molecular Dynamics Conformation Sampling,” Structure 19: 1784-1795).
To speed up the simulation, the N (CYT1) and C (CYT5) termini of the channel, the inactivation gate (CYT4) and the four trans-membrane domains (TRM1-TRM4) were included in the final models. The already crystallized small segments for the human Nav1.5 were added to the model without modification. These structures were extracted from the two available X-ray crystal structures for the calmodulin binding motif of the C-terminus (residues: 1773-1940) of Nav1.5. The first structure (PDB code: 4DCK) was resolved at a 2.2 Å resolution (Wang et al., 2012, “Crystal Structure of the Ternary Complex of a Nav C-Terminal Domain, a Fibroblast Growth Factor Homologous Factor, and Calmodulin,” Structure 20: 1167-1176) and the second one (PDB code: 4JQ0) was resolved at 3.84 Å resolution (Wang et al., 2014, “Structural Analyses of Ca(2)(+)/CaM Interaction with NaV Channel C-termini Reveal Mechanisms of Calcium-Dependent Regulation,” Nat Commun 5: 4896). Another crystal structure was available for residues 1491-1522 in the activation gate resolved at an atomic resolution of 1.35 Å (PDB code: 4DJC) (Sarhan et al., 2012, “Crystallographic basis for calcium regulation of sodium channels,” Proc Natl Acad Sci USA 109: 3558-3563). In the final model, 4DCK and 4DJC were included after brief protein refinement using the protein preparation wizard module of the Schrodinger software package. CYT2 (residues 417-709) and CYT3 (residues 941-1198) were omitted from the final model to speed up the simulations and also due to the low sequence similarity with other homologous proteins. Thus, the final models of Nav1.5 included 1465 residues that are topologically subdivided into 7 subdomains: 4 transmembrane (TRM1, TRM2, TRM3 and TRM4) sub-domains and three cytoplasmic domains (CYT1, CYT4 and CYT5).
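By way of non-limiting illustration, the residue count stated above follows directly from the full sequence length and the two omitted cytoplasmic segments. The short check below uses only numbers given in the text; the names `span_length` and `retained_residues` are hypothetical helpers introduced for the sketch.

```python
FULL_LENGTH = 2016  # residues in the full-length human Nav1.5 alpha-subunit

# Inclusive residue ranges of the segments omitted from the final model.
OMITTED = {"CYT2": (417, 709), "CYT3": (941, 1198)}


def span_length(first, last):
    """Number of residues in an inclusive range first..last."""
    return last - first + 1


def retained_residues(full_length, omitted):
    """Residues remaining after removing every omitted inclusive range."""
    return full_length - sum(span_length(a, b) for a, b in omitted.values())


# CYT2 spans 293 residues and CYT3 spans 258, so 2016 - 551 = 1465 remain.
remaining = retained_residues(FULL_LENGTH, OMITTED)
```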
To achieve the well-established four-fold symmetry, the four domains of Nav1.5 were assembled in a clockwise manner based on the resolved NavAb crystal structure. Assembly was carried out by superposing the domains on the 4EKW crystal structure using the Smith-Waterman local alignment (Smith et al., 1981, “Identification of Common Molecular Subsequences,” J Mol Biol 147: 195-197) algorithm with a 90% score for the secondary structure and an iteration threshold of 0.2 Å as implemented in UCSF Chimera (Pettersen et al., 2004, “UCSF Chimera—a Visualization System for Exploratory Research and Analysis,” J. Comput Chem 25: 1605-1612). As a final refinement step, and to remove potential severe steric clashes, the system was minimized using the protein preparation wizard in Schrodinger, with heavy atoms not allowed to move beyond 0.3 Å.
The coordinates for hNav1.5 generated from the homology modeling described in EXAMPLE 16, above, are provided in Table B. These coordinates were used as input for the MD simulations, described in EXAMPLE 17 below.
The system preparation and setup procedures for the MD simulation were carried out using the CHARMM-GUI routine for building membrane proteins. Ionization states of titratable residues were treated at physiological pH 7.4. The protein was then embedded in a double bilayer of 400 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) lipids in each layer. Upper (15 Å thickness from the protein) and lower (20 Å thickness from the protein) water layers of TIP3P waters and an ionic concentration of 150 mM NaCl solution were used. A 12 Å cutoff was used to calculate the short-range electrostatic interactions. The Particle Mesh Ewald summation method was used for calculating long-range electrostatic interactions. The NBFIX correction for sodium ions interaction with charged carboxylates was used.
Multistage heating and equilibration phases were applied for model relaxation and refinement prior to the production simulation. The system was first minimized for 50,000 minimization steps where only lipid tails were free to move and the rest of the system was held fixed. Four additional minimization stages of 25,000 steps each were carried out with constraints removed gradually from the rest of the system (protein and lipid heads) and with water molecules and ions freely moving. Constraints were gradually released from 100, 50, 5 and 1 kcal/mol. Dihedral lipid tails were also constrained and the constraints were gradually released from 100, 50, 5 and 1 kcal/mol. The system was then gradually heated to 310 K for 5 ns using a 1 fs integration time step with 1 kcal/mol constraints on the protein backbone, and equilibrated for an additional 2×10 ns of simulation, with a 1 fs and then a 2 fs time step and with weak 0.5 kcal/mol constraints on the protein backbone.
Production simulation was then carried out for 100 ns using 0.1 kcal/mol constraints on the Cα carbons of the TRM subdomains. The Langevin thermostat (Palovcak et al., 2014, “Evolutionary Imprint of Activation: The Design Principles of VSDs,” J Gen Physiol 143: 145-156; Tiwari-Woodruff et al., 2000, “Voltage-Dependent Structural Interactions in the Shaker K(+) Channel,” J Gen Physiol 115: 123-138) and an anisotropic pressure control were used to keep the temperature at 310 K and the pressure at 1 bar, respectively. Total system size was 573,763 atoms. All simulations were carried out using NAMD 2.9 on a Blue Gene/Q supercomputer. Atomic coordinates were saved to the trajectory every 10 ps. Atomic fluctuations (B-factors) and root-mean-square deviations (RMSD) from the reference structures were calculated according to the methodologies of EXAMPLE 4 above.
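By way of non-limiting illustration, an RMSD between a trajectory frame and a reference structure can be computed as sketched below. This is a minimal sketch only: the function name `rmsd` is hypothetical, and the sketch assumes the frame has already been superposed on the reference, so no fitting step is shown. It does not reproduce the methodology of EXAMPLE 4.

```python
import math


def rmsd(coords, reference):
    """Root-mean-square deviation between two equally sized lists of (x, y, z) points.

    Assumes the two structures are already aligned; no superposition is done here.
    """
    if not coords or len(coords) != len(reference):
        raise ValueError("structures must be non-empty and have the same atom count")
    squared = sum(
        (a - b) ** 2
        for point, ref_point in zip(coords, reference)
        for a, b in zip(point, ref_point)
    )
    return math.sqrt(squared / len(coords))


# Two-atom toy example: one atom is displaced by 2 angstroms along z.
frame = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
deviation = rmsd(frame, reference)
```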
Iterative clustering of the MD trajectory was then performed to extract dominant conformations of Nav1.5, according to the methodologies of EXAMPLE 5 above. Using this methodology, eleven (11) distinct conformations for the intracellular VGSC channel were identified, as shown in
Docking simulations were next performed. Three marketed cardiovascular drugs were tested: (1) one strong Nav1.5 blocker (Ranolazine, antianginal drug) (Sokolov et al., 2013, “Proton-Dependent Inhibition of the Cardiac Sodium Channel Nav1.5 by Ranolazine,” Front Pharmacol 4: 78) with an IC50 of 5.9 μM; (2) one weak blocker (Dofetilide, antiarrhythmic drug) (Roukoz et al., 2007, “Dofetilide: a New Class III Antiarrhythmic Agent,” Expert Rev Cardiovasc Ther 5: 9-19) with an IC50 of 300; and (3) one known non-blocker for Nav1.5 (Nadolol, anti-hypertensive) (Wang et al., 2010, “Propranolol Blocks Cardiac and Neuronal Voltage-Gated Sodium Channels,” Front Pharmacol 1: 144). The chemical structures of these three compounds are provided below:
The compounds were docked against the selected eleven (11) dominant conformations. Docking was carried out using the standard precision mode of the Glide docking module of the Schrodinger package (Glide SP). Top-ranked poses were re-scored with AMBER-MMGBSA over 60 snapshots produced from three short 200 ps MD simulations for each ligand. Docking and scoring results are given in TABLE 9, below.
As shown in TABLE 9, the model was able to correctly identify Ranolazine to be the top ranked compound. The AMBER/GBSA over the selected snapshots improved the ranking of the chosen compounds based on their corresponding IC50 values, such that the experimentally observed activity trend is reproduced (Ranolazine>Dofetilide>Nadolol).
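By way of non-limiting illustration, re-scoring over snapshots can be reduced to averaging per-snapshot binding energies for each ligand and ordering ligands from most to least favorable. The function name `rank_by_mean_energy`, the ligand labels, and all energy values below are hypothetical placeholders; they are not the values of TABLE 9 and do not come from an actual AMBER calculation.

```python
from statistics import mean


def rank_by_mean_energy(snapshot_energies):
    """Return (ligand, mean energy) pairs, most negative mean energy first.

    snapshot_energies maps a ligand name to its per-snapshot binding
    energies in kcal/mol.
    """
    averaged = {name: mean(values) for name, values in snapshot_energies.items()}
    return sorted(averaged.items(), key=lambda item: item[1])


# Placeholder energies for three unnamed ligands (three snapshots each).
example = {
    "ligand_b": [-30.0, -32.0, -31.0],
    "ligand_a": [-42.0, -40.0, -41.0],
    "ligand_c": [-20.0, -22.0, -21.0],
}
ranking = rank_by_mean_energy(example)
```

Averaging over many snapshots, rather than scoring a single docked pose, damps the effect of any one unfavorable or fortuitously favorable geometry on the final ordering.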
As shown in
Classification of the compounds as “blockers,” e.g., compounds that block the hNav1.5 ion channel, or as “non-blockers,” e.g., compounds that do not block the hNav1.5 ion channel, is performed as described in EXAMPLE 9, above, for the hERG ion channel.
Redesign of a hNav1.5 ion channel blocker to be a non-blocker is performed as described in EXAMPLE 10, above, for the hERG ion channel.
The methods disclosed herein as applied to calcium ion channels may be performed as described in Examples 20-23.
Homology protein modeling of the α-1 subunit of the human Cav1.2 is performed as follows.
The full-length amino acid sequence (2138 amino acid residues) of the α-1 subunit of the human Cav1.2 (Uniprot accession code: Q13936) is downloaded from the Uniprot database (Magrane et al., 2011, “Uniprot Knowledgebase: A Hub of Integrated Protein Data,” Database 2011). Initially, the full Cav1.2 sequence is dissected into sub-domains, trans-membrane domains and cytoplasmic domains. Dissection is carried out based on the ProtParam tool (Wilkins et al., 1999, “Protein identification and analysis tools in the ExPASy server,” Methods Mol. Biol. 112: 531-552) on the ExPASy bioinformatics resource portal (Artimo et al., 2012, “ExPASy: SIB Bioinformatics Resource Portal,” Nucleic Acids Res 40: W597-603). Following dissection, full models for each sub-domain are separately generated using the I-Tasser bioinformatics software (Roy et al., 2010, “I-TASSER: a unified platform for automated protein structure and function prediction,” Nat. Protoc. 5: 725-738) based on the NavAb bacterial sodium channel (Payandeh et al., 2012, “Crystal Structure of a Voltage-Gated Sodium Channel in two Potentially Inactivated States,” Nature 486: 135-139) as the main template for the transmembrane domains. NavAb crystal structures represent the closed-inactivated states of the channel (PDB codes: 3RVY, 3RVZ, 3RW0 and 4EKW) (Payandeh et al., 2011, “The Crystal Structure of a Voltage-Gated Sodium Channel,” Nature 475: 353-359). The coordinates for the template NavAb crystal structure used to model Cav1.2 are provided in Table C.
MD simulations are performed, as described herein, for example, according to the methodologies of EXAMPLES 3 and 17 above.
Iterative clustering of the MD trajectory is then performed to extract dominant conformations of hCav1.2, according to the methodologies of EXAMPLE 5 above. Using this methodology, distinct conformations for the intracellular hCav1.2 channel are identified.
Compounds prepared according to the methodologies of EXAMPLE 2, above, are docked against the selected dominant conformations. Docking is carried out using the standard precision mode of the Glide docking module of the Schrodinger package (Glide SP). Top ranked poses are re-scored with AMBER-MMGBSA.
Classification of the compounds as “blockers,” e.g., compounds that block the hCav1.2 ion channel, or as “non-blockers,” e.g., compounds that do not block the hCav1.2 ion channel, is performed as described in EXAMPLE 9, above, for the hERG ion channel.
Redesign of a hCav1.2 ion channel blocker to be a non-blocker is performed as described in EXAMPLE 10, above, for the hERG ion channel.
One or more data stores 1308 can store the data to be analyzed by the grid computing environment 1306 as well as any intermediate or final data generated by the grid computing environment. However, in certain embodiments, the configuration of the grid computing environment 1306 allows its operations to be performed such that intermediate and final data results can be stored solely in volatile memory (e.g., RAM), without a requirement that intermediate or final data results be stored to non-volatile types of memory (e.g., disk).
This can be useful in certain situations, such as when the grid computing environment 1306 receives ad hoc queries from a user and when responses, which are generated by processing large amounts of data, need to be generated on-the-fly. In this non-limiting situation, the grid computing environment 1306 is configured to retain the processed information within the grid memory so that responses can be generated for the user at different levels of detail as well as allow a user to interactively query against this information.
For example, the grid computing environment 1306 receives structural information describing the structure of the ion channel protein, and performs a molecular dynamics simulation of the protein structure. Then, the grid computing environment 1306 uses a clustering algorithm to identify dominant conformations of the protein structure from the molecular dynamics simulation, and select the dominant conformations of the protein structure identified from the clustering algorithm. In addition, the grid computing environment 1306 receives structural information describing conformers of one or more compounds, and uses a docking algorithm to dock the conformers of the one or more compounds to the dominant conformations. The grid computing environment 1306 further identifies a plurality of preferred binding conformations for each of the combinations of protein and compound, and optimizes the preferred binding conformations using molecular dynamics simulations so as to determine whether the compound blocks the ion channel of the protein in the preferred binding conformations.
Specifically, in response to user inquiries about cardiotoxicity of a compound, the grid computing environment 1306, without an OLAP or relational database environment being required, aggregates protein structural information and compound structural information from the data stores 1308. Then the grid computing environment 1306 uses the received protein structural information to perform molecular dynamics simulations for determining configurations of target protein flexibility (e.g., over a simulation length of greater than 50 ns). The molecular dynamics simulations involve the grid computing environment 1306 determining forces acting on an atom based upon an empirical force field that approximates intramolecular forces, where numerical integration is performed to update positions and velocities of atoms. The grid computing environment 1306 clusters molecular dynamic trajectories formed based upon the updated positions and velocities of the atoms into dominant conformations of the protein, and executes a docking algorithm that uses the compound's structural information in order to dock the compound's conformers to the dominant conformations of the protein. Based on information related to the docked compound's conformers, the grid computing environment 1306 identifies a plurality of preferred binding conformations for each of the combinations of protein and compound. If the compound does not block the ion channel of the protein in the preferred binding conformations, the grid computing environment 1306 predicts the compound has reduced risk of cardiotoxicity. Otherwise, the grid computing environment 1306 predicts the compound is cardiotoxic, and redesigns the compound in order to reduce risk of cardiotoxicity.
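By way of non-limiting illustration, the numerical integration that updates positions and velocities from forces can be sketched with the velocity Verlet scheme, a commonly used MD integrator. The sketch is an assumption in several respects: the text does not name the integrator used, a one-dimensional harmonic spring stands in for the empirical force field, and the names `harmonic_force` and `velocity_verlet` and all parameter values are hypothetical.

```python
def harmonic_force(x, k=1.0):
    """Toy stand-in for a force field: restoring force of a spring, F = -k x."""
    return -k * x


def velocity_verlet(x, v, force, mass, dt, steps):
    """Advance position x and velocity v by `steps` velocity Verlet updates."""
    f = force(x)
    for _ in range(steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity from current force
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
    return x, v


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


# Start at x = 1 with zero velocity; 628 steps of 0.01 is roughly one period.
x_end, v_end = velocity_verlet(1.0, 0.0, harmonic_force, mass=1.0, dt=0.01, steps=628)
```

In a production MD engine the same update is applied to every atom in three dimensions, with the force evaluated from bonded and non-bonded terms of the force field at each step.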
As an example of an implementation environment, the grid computing environment 1306 can comprise a number of blade servers, and a central coordinator 1406 and the node coordinators (1412, 1414) are associated with their own blade server. In other words, a central coordinator 1406 and the node coordinators (1412, 1414) execute on their own respective blade server. In some embodiments, each blade server contains multiple cores and a thread is associated with and executes on a core belonging to a node processor (e.g., node processor 1408). A network connects each blade server together.
The central coordinator 1406 comprises a node on the grid. For example, there might be 100 nodes, with only 50 nodes specified to be run as node coordinators. The grid computing environment 1306 will run the central coordinator 1406 as a 51st node, and selects the central coordinator node randomly from within the grid. Accordingly, the central coordinator 1406 has the same hardware configuration as a node coordinator.
The central coordinator 1406 may receive information and provide information to a user regarding queries that the user has submitted to the grid. The central coordinator 1406 is also responsible for communicating with the 50 node coordinator nodes, such as by sending them instructions on what to do as well as receiving and processing information from the node coordinators. In one implementation, the central coordinator 1406 is the central point of contact for the client with respect to the grid, and a user never directly communicates with any of the node coordinators.
With respect to data transfers involving the central coordinator 1406, the central coordinator 1406 communicates with the client (or another source) to obtain the input data to be processed. The central coordinator 1406 divides up the input data and sends the correct portion of the input data for routing to the node coordinators. The central coordinator 1406 also may generate random numbers for use by the node coordinators in simulation operations as well as aggregate any processing results from the node coordinators. The central coordinator 1406 manages the node coordinators, and each node coordinator manages the threads which execute on their respective machines.
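By way of non-limiting illustration, the division of input data among node coordinators and the aggregation of their results can be sketched as follows. The function names `partition` and `aggregate` are hypothetical, the even contiguous split is only one possible assignment policy, and the networking between blade servers is not modeled.

```python
def partition(items, n_nodes):
    """Split items into n_nodes contiguous chunks whose sizes differ by at most one."""
    if n_nodes <= 0:
        raise ValueError("n_nodes must be positive")
    base, extra = divmod(len(items), n_nodes)
    chunks, start = [], 0
    for node in range(n_nodes):
        size = base + (1 if node < extra else 0)  # first `extra` nodes get one more
        chunks.append(items[start:start + size])
        start += size
    return chunks


def aggregate(partial_results):
    """Concatenate per-node result lists in node order."""
    combined = []
    for result in partial_results:
        combined.extend(result)
    return combined


# Ten work items spread over three node coordinators, then recombined.
work = list(range(10))
chunks = partition(work, 3)
recombined = aggregate(chunks)
```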
A node coordinator allocates memory for the threads with which it is associated. Associated threads are those that are in the same physical blade server as the node coordinator. However, it should be understood that other configurations could be used, such as multiple node coordinators being in the same blade server to manage different threads which operate on the server. Similar to a node coordinator managing and controlling operations within a blade server, the central coordinator 1406 manages and controls operations within a chassis.
In certain embodiments, a node processor includes shared memory for use for a node coordinator and its threads. The grid computing environment 1306 is structured to conduct its operations (e.g., matrix operations, etc.) such that as many data transfers as possible occur within a blade server (i.e., between threads via shared memory on their node) versus performing data transfers between threads which operate on different blades. Such data transfers via shared memory are more efficient than a data transfer involving a connection with another blade server.
Specifically, the protein-structural-information data structure 1502 is configured to store data related to the structure of the potassium ion channel protein, for example, spatial relationship data between different atoms. The data related to the structure of the potassium ion channel protein may be obtained from a homology model, an NMR solution structure, an X-ray crystal structure, a molecular model, etc. Molecular dynamics simulations can be performed on data stored in the protein-structural-information data structure 1502. For example, the molecular dynamics simulations involve solving the equation of motion according to the laws of physics, e.g., the chemical bonds within proteins being allowed to flex, rotate, bend, or vibrate. Information about the time dependence and magnitude of fluctuations in both positions and velocities of the given molecule/atoms is obtained from the molecular dynamics simulations. For example, data related to coordinates and velocities of molecules/atoms at equal time intervals or sampling intervals are obtained from the molecular dynamics simulations. Atomistic trajectory data (e.g., at different time slices) are formed based on the positions and velocities of molecules/atoms resulting from the molecular dynamics simulations and stored in the molecular-dynamics-simulations data structure 1508. The molecular dynamics simulations can be of any duration. In certain embodiments, the duration of the molecular dynamics simulation is greater than 50 ns, for example, preferably greater than 200 ns.
Data stored in the molecular-dynamics-simulations data structure 1508 are processed using a clustering algorithm, and associated cluster population data are stored in the cluster data structure 1512. Dominant conformations of the potassium ion channel protein are identified based at least in part on the data stored in the molecular-dynamics-simulations data structure 1508 and the associated cluster population data stored in the cluster data structure 1512. Atomistic trajectory data (e.g., at different time slices) related to the identified dominant conformations are stored in the dominant-conformations data structure 1510.
Data stored in the candidate-compound-structure-information data structure 1504 are processed together with data related to the dominant conformations of the potassium ion channel protein stored in the dominant-conformations data structure 1510. The conformers of the one or more compounds are docked to the dominant conformations of the structure of the potassium ion channel protein using a docking algorithm (e.g., DOCK, AutoDock, etc.), so that data related to various combinations of potassium ion channel protein and compound is determined and stored in the binding-conformations data structure 1506. For example, the compound is an antiviral agent (e.g., hepatitis C inhibitor). As an example, the binding-conformations data structure includes data related to binding energies. 2D information of the compound may be translated into a 3D representative structure to be stored in the candidate-compound-structure-information data structure 1504 for docking. Data stored in the binding-conformations data structure 1506 are processed using a clustering algorithm, and associated cluster population data are stored in the cluster data structure 1512. One or more preferred binding conformations are identified based at least in part on the data stored in the binding-conformations data structure 1506 and the associated cluster population data stored in the cluster data structure 1512. For example, the preferred binding conformations include those with a largest cluster population and a lowest binding energy.
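By way of non-limiting illustration, selecting a preferred binding conformation by largest cluster population and lowest binding energy can be sketched as follows. The function name `preferred_pose`, the cluster labels, and the energies are hypothetical; the tie-breaking rule (population first, then lowest energy) is an assumption about how the two criteria are combined.

```python
def preferred_pose(clusters):
    """Pick the most populated cluster, breaking ties by lowest binding energy.

    clusters maps a cluster id to the binding energies (kcal/mol) of its
    member poses. Returns (cluster id, lowest energy in that cluster).
    """
    if not clusters or any(not energies for energies in clusters.values()):
        raise ValueError("every cluster must contain at least one pose")
    best_id = min(
        clusters,
        key=lambda cid: (-len(clusters[cid]), min(clusters[cid])),
    )
    return best_id, min(clusters[best_id])


# Clusters A and C are equally populated; C wins on its lower best energy.
example_clusters = {
    "A": [-8.1, -7.9, -9.0],
    "B": [-10.2],
    "C": [-7.0, -9.5, -8.8],
}
choice = preferred_pose(example_clusters)
```

Ranking by population before energy keeps a single, isolated low-energy pose (cluster B in the toy data) from being preferred over a binding mode that the docking runs reproduce repeatedly.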
The identified preferred binding conformations are optimized using scalable molecular dynamics simulations (e.g., through NAMD software, etc.). In certain embodiments, binding energies are calculated (e.g., using solvation models, etc.) for each of the combinations of protein and compound (receptor and ligand) in the corresponding optimized preferred binding conformation(s). The calculated binding energies are output as the predicted binding energies for each of the combinations of protein and compound.
The cardiotoxicity-analysis data structure 1514 includes data related to a blocking degree of one or more compounds, e.g., in the preferred binding conformations. For example, the data stored in the cardiotoxicity-analysis data structure 1514 includes identification of blocking sites and non-blocking sites. The data stored in the cardiotoxicity-analysis data structure 1514 indicates a potential cardiac hazard when (i) a pocket within the hERG channel is classified as a blocking site and (ii) a ligand fits within the pocket and is within a predetermined binding affinity level. The data stored in the cardiotoxicity-analysis data structure 1514 does not indicate a potential cardiac hazard when a ligand binds to a pocket within the hERG channel that is classified as a non-blocking site. In some embodiments, if the compound does not block the ion channel (e.g., the blocking degree being lower than a threshold) in the preferred binding conformation(s), the compound is predicted to have reduced risk of cardiotoxicity, and the compound can be selected. In other embodiments, if the compound blocks the ion channel (e.g., the blocking degree being higher than the threshold) in the preferred binding conformation(s), the compound is predicted to be cardiotoxic. A molecular modeling algorithm can be used to chemically modify or redesign the compound so as to reduce the risk of cardiotoxicity (e.g., to reduce the blocking degree).
A system can be configured such that a compound-selection system 2102 can be provided on a stand-alone computer for access by a user 2104, such as shown at 2100 in
Additionally, the methods and systems described herein may be implemented on many different types of processing devices by program code comprising program instructions that are executable by the device processing subsystem. The software program instructions may include source code, object code, machine code, or any other stored data that is operable to cause a processing system to perform the methods and operations described herein. Other implementations may also be used, however, such as firmware or even appropriately designed hardware configured to carry out the methods and systems described herein.
The systems' and methods' data (e.g., associations, mappings, data input, data output, intermediate data results, final data results, etc.) may be stored and implemented in one or more different types of computer-implemented data stores, such as different types of storage devices and programming constructs (e.g., RAM, ROM, Flash memory, flat files, databases, programming data structures, programming variables, IF-THEN (or similar type) statement constructs, etc.). It is noted that data structures describe formats for use in organizing and storing data in databases, programs, memory, or other computer-readable media for use by a computer program.
The systems and methods may be provided on many different types of computer-readable media including computer storage mechanisms (e.g., CD-ROM, diskette, RAM, flash memory, computer's hard drive, etc.) that contain instructions (e.g., software) for use in execution by a processor to perform the methods' operations and implement the systems described herein.
The computer components, software modules, functions, data stores and data structures described herein may be connected directly or indirectly to each other in order to allow the flow of data needed for their operations. It is also noted that a module or processor includes but is not limited to a unit of code that performs a software operation, and can be implemented for example as a subroutine unit of code, or as a software function unit of code, or as an object (as in an object-oriented paradigm), or as an applet, or in a computer script language, or as another type of computer code. The software components and/or functionality may be located on a single computer or distributed across multiple computers depending upon the situation at hand.
The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
While this specification contains many specifics, these should not be construed as limitations on the scope of what may be claimed, but rather as descriptions of features specific to particular embodiments. Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
Thus, particular embodiments have been described. Other embodiments are within the scope of the following claims. For example, the actions recited in the claims can be performed in a different order and still achieve desirable results.
All publications and patent applications cited in this specification are herein incorporated by reference as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference. Although the foregoing has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be readily apparent to those of ordinary skill in the art in light of the teachings of the specification that certain changes and modifications may be made thereto without departing from the spirit or scope of the appended claims.
The present application claims the benefit of priority of U.S. Provisional Application No. 61/916,093, filed Dec. 13, 2013, and U.S. Provisional Application No. 62/034,745, filed Aug. 7, 2014, the content of each of which is hereby incorporated by reference in its entirety.
Number | Date | Country
---|---|---
61916093 | Dec 2013 | US
62034745 | Aug 2014 | US