Large-scale biobanks offer the potential to link genes to health traits documented in electronic health records (EHR) with unprecedented power. In turn, these discoveries are expected to improve our understanding of the etiology of common and complex diseases as well as our ability to treat and prevent these conditions. To this end, the Million Veteran Program (MVP) was established by the U.S. Veterans Health Administration in 2011 as a nationwide research program within the Veterans Affairs (VA) healthcare system. The overarching goal of MVP is to reveal new biologic insights and clinical associations broadly relevant to human health and enhance the care of veterans through precision medicine.
Blood concentrations of total cholesterol (TC), low-density lipoprotein (LDL) cholesterol, high-density lipoprotein (HDL) cholesterol, and triglycerides (TG) are heritable risk factors for cardiovascular disease, a highly prevalent condition among U.S. veterans. Genome-wide association studies (GWAS) to date have identified at least 268 loci that influence these levels, many of which are under investigation as potential therapeutic targets. However, off-target effects have dampened enthusiasm for some of these molecules, and understanding the full spectrum of clinical consequences of a given DNA sequence variant through phenome-wide association scanning (“PheWAS”) may shed light on potential unintended effects as well as novel therapeutic indications.
A GWAS was performed, including a discovery phase in MVP and a replication phase in the Global Lipids Genetics Consortium (GLGC).
Disclosed are methods for determining a subject's susceptibility to having or developing coronary artery disease comprising determining in the subject the presence of one or more PDE3B loss of function or damaging variants, and wherein the presence of the variant indicates the subject's decreased susceptibility for having or developing coronary artery disease.
In some aspects, the PDE3B loss of function or damaging variant is Arg783Ter or rs150090666. In some aspects, the PDE3B loss of function or damaging variant results in a truncated PDE3B protein. In some aspects, the truncated PDE3B protein has the mutation Arg783Ter.
In some aspects, the PDE3B loss of function or damaging variant is determined from a sample obtained from the subject.
In some aspects, the PDE3B loss of function or damaging variant is determined by amplifying or sequencing a nucleic acid sample obtained from the subject. In some aspects, the amplifying is performed using polymerase chain reaction (PCR).
In some aspects, the amplifying or sequencing comprises using primers having sequences complementary to PDE3B DNA or RNA sequences. For example, disclosed are primers and probes having sequences complementary to a portion of the PDE3B nucleic acid sequence found in accession number NM_000922.3.
Disclosed are methods of detecting one or more PDE3B loss of function or damaging variants in a subject, said method comprising: obtaining a biological sample from a subject; detecting whether a PDE3B loss of function variant is present in the biological sample by performing whole genome or whole exome sequencing.
Disclosed are methods comprising: obtaining a biological sample from a subject; detecting whether one or more PDE3B loss of function variants are present in the sample; diagnosing the subject as having a greater likelihood of responding to PDE3B inhibitors when there is an absence of the one or more PDE3B loss of function variants; and administering an effective amount of a PDE3B inhibitor/antagonist to the subject.
Disclosed are methods of treating a patient with coronary artery disease comprising administering an effective amount of a PDE3B inhibitor/antagonist.
Disclosed are methods of treating/preventing coronary artery disease in a subject comprising administering a composition that antagonizes/inhibits PDE3B to the subject, wherein the subject has been determined to lack one or more loss of function mutations in PDE3B.
Disclosed are methods of screening for test compositions that cause a loss of function mutation in PDE3B comprising: contacting a PDE3B gene with a test composition; detecting the presence of one or more mutations in the PDE3B gene; and determining if the one or more mutations are loss of function mutations, wherein the presence of one or more loss of function mutations in PDE3B indicates a test composition that causes a loss of function in PDE3B.
Disclosed are methods of screening for therapeutic candidate compositions for treating coronary artery disease comprising: contacting a cell lacking one or more loss of function or damaging mutations in PDE3B with a test composition; and determining if the test composition inhibits PDE3B in the cell, wherein if the test composition inhibits PDE3B then it is a therapeutic candidate for treating coronary artery disease.
Disclosed are vectors comprising a loss of function or damaging PDE3B variant, wherein the PDE3B variant comprises a mutation that results in a truncated PDE3B protein.
Disclosed are cells comprising any of the disclosed vectors.
Disclosed are methods for identifying a subject in need of treatment for coronary artery disease comprising determining in the subject the presence of a PDE3B loss of function or damaging variant, wherein the presence of a PDE3B loss of function or damaging variant indicates that the subject is not in need of treatment for a coronary artery disease.
Disclosed are methods of identifying a subject in need of screening for the development of coronary artery disease comprising determining in the subject the absence of a PDE3B loss of function or damaging variant, wherein the absence of a PDE3B loss of function or damaging variant indicates a subject in need of screening for the development of coronary artery disease.
Disclosed are engineered, non-naturally occurring CRISPR-CAS systems comprising: a guide RNA that hybridizes with a target sequence, wherein the target sequence comprises a PDE3B loss of function variant, and a Cas protein or gene encoding a Cas protein.
Disclosed are methods of altering expression of at least one gene product, wherein the at least one gene product is a gene product from a PDE3B loss of function variant, wherein the method comprises administering a) a guide RNA that hybridizes with a target sequence, wherein the target sequence comprises the PDE3B loss of function variant, and b) a Cas protein or gene encoding a Cas protein, whereby the guide RNA targets the target sequence and the Cas9 protein cleaves the nucleic acid molecule which comprises the PDE3B loss of function variant, whereby expression of the at least one gene product is altered.
Disclosed are methods of altering expression of at least one gene product, wherein the at least one gene product is a gene product from a PDE3B loss of function variant, wherein the method comprises administering a vector that comprises a) a first regulatory element operable in a eukaryotic cell operably linked to at least one nucleotide sequence encoding a CRISPR-Cas system guide RNA that hybridizes with a target sequence, wherein the target sequence comprises the PDE3B loss of function variant, and b) a second regulatory element operable in a eukaryotic cell operably linked to a nucleotide sequence encoding a Cas9 protein, whereby the guide RNA targets the target sequence and the Cas9 protein cleaves the target sequence, whereby expression of the at least one gene product is altered.
Disclosed are methods of silencing or inhibiting expression of wild type PDE3B in a cell comprising providing at least one silencing agent to the cell, wherein said silencing agent silences or inhibits expression of the wild type PDE3B in the cell.
Disclosed are methods of silencing or inhibiting expression of wild type PDE3B in a cell comprising providing at least one RNA to the cell in an amount sufficient to inhibit the expression of PDE3B, wherein the RNA comprises or forms a double-stranded structure containing a first strand comprising a ribonucleotide sequence which corresponds to a nucleotide sequence of PDE3B and a second strand comprising a ribonucleotide sequence which is complementary to the nucleotide sequence of PDE3B, wherein the first and the second ribonucleotide sequences are separate complementary sequences that hybridize to each other to form said double-stranded structure, and the RNA comprising the double-stranded structure inhibits expression of PDE3B.
Disclosed are RNAs comprising a double-stranded structure containing a first strand comprising a ribonucleotide sequence which corresponds to a nucleotide sequence of PDE3B and a second strand comprising a ribonucleotide sequence which is complementary to the nucleotide sequence of PDE3B, wherein the first and the second ribonucleotide sequences are separate complementary sequences that hybridize to each other to form said double-stranded structure.
Disclosed are methods of inhibiting expression of PDE3B in a cell comprising: (a) isolating the cell; (b) contacting the cell with a RNA comprising a double-stranded structure comprising a first strand comprising a ribonucleotide sequence which corresponds to a nucleotide sequence of PDE3B and a second strand comprising a ribonucleotide sequence which is complementary to the nucleotide sequence of PDE3B, wherein the first and the second ribonucleotide sequences are separate sequences that hybridize to each other to form said double-stranded structure, and (c) subsequently introducing the cell into a host, wherein said RNA comprising the double-stranded structure inhibits expression of the target gene in the cell in the host.
Additional advantages of the disclosed method and compositions will be set forth in part in the description which follows, and in part will be understood from the description, or may be learned by practice of the disclosed method and compositions. The advantages of the disclosed method and compositions will be realized and attained by means of the elements and combinations particularly pointed out in the appended claims. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate several embodiments of the disclosed method and compositions and together with the description, serve to explain the principles of the disclosed method and compositions.
Abbreviations: EA, Effect Allele; NEA, Non-effect Allele; EAF, Effect Allele Frequency; SE, Standard Error.
The disclosed method and compositions may be understood more readily by reference to the following detailed description of particular embodiments and the Example included therein and to the Figures and their previous and following description.
It is to be understood that the disclosed method and compositions are not limited to specific synthetic methods, specific analytical techniques, or to particular reagents unless otherwise specified, and, as such, may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting.
Disclosed are materials, compositions, and components that can be used for, can be used in conjunction with, can be used in preparation for, or are products of the disclosed method and compositions. These and other materials are disclosed herein, and it is understood that when combinations, subsets, interactions, groups, etc. of these materials are disclosed that while specific reference to each of the various individual and collective combinations and permutations of these compounds may not be explicitly disclosed, each is specifically contemplated and described herein. Thus, if a class of molecules A, B, and C are disclosed as well as a class of molecules D, E, and F and an example of a combination molecule, A-D is disclosed, then even if each is not individually recited, each is individually and collectively contemplated. Thus, in this example, each of the combinations A-E, A-F, B-D, B-E, B-F, C-D, C-E, and C-F are specifically contemplated and should be considered disclosed from disclosure of A, B, and C; D, E, and F; and the example combination A-D. Likewise, any subset or combination of these is also specifically contemplated and disclosed. Thus, for example, the sub-group of A-E, B-F, and C-E are specifically contemplated and should be considered disclosed from disclosure of A, B, and C; D, E, and F; and the example combination A-D. This concept applies to all aspects of this application including, but not limited to, steps in methods of making and using the disclosed compositions. Thus, if there are a variety of additional steps that can be performed it is understood that each of these additional steps can be performed with any specific embodiment or combination of embodiments of the disclosed methods, and that each such combination is specifically contemplated and should be considered disclosed.
It is understood that the disclosed method and compositions are not limited to the particular methodology, protocols, and reagents described as these may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention which will be limited only by the appended claims.
It must be noted that as used herein and in the appended claims, the singular forms “a”, “an”, and “the” include plural reference unless the context clearly dictates otherwise. Thus, for example, reference to “a variant” includes a plurality of such variants, reference to “the variant” is a reference to one or more variants and equivalents thereof known to those skilled in the art, and so forth.
“Optional” or “optionally” means that the subsequently described event, circumstance, or material may or may not occur or be present, and that the description includes instances where the event, circumstance, or material occurs or is present and instances where it does not occur or is not present.
Ranges may be expressed herein as from “about” one particular value, and/or to “about” another particular value. When such a range is expressed, also specifically contemplated and considered disclosed is the range from the one particular value and/or to the other particular value unless the context specifically indicates otherwise. Similarly, when values are expressed as approximations, by use of the antecedent “about,” it will be understood that the particular value forms another, specifically contemplated embodiment that should be considered disclosed unless the context specifically indicates otherwise. It will be further understood that the endpoints of each of the ranges are significant both in relation to the other endpoint, and independently of the other endpoint unless the context specifically indicates otherwise. Finally, it should be understood that all of the individual values and sub-ranges of values contained within an explicitly disclosed range are also specifically contemplated and should be considered disclosed unless the context specifically indicates otherwise. The foregoing applies regardless of whether in particular cases some or all of these embodiments are explicitly disclosed.
Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of skill in the art to which the disclosed method and compositions belong. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present method and compositions, the particularly useful methods, devices, and materials are as described. Publications cited herein and the material for which they are cited are hereby specifically incorporated by reference. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such disclosure by virtue of prior invention. No admission is made that any reference constitutes prior art. The discussion of references states what their authors assert, and applicants reserve the right to challenge the accuracy and pertinency of the cited documents. It will be clearly understood that, although a number of publications are referred to herein, such reference does not constitute an admission that any of these documents forms part of the common general knowledge in the art.
Throughout the description and claims of this specification, the word “comprise” and variations of the word, such as “comprising” and “comprises,” means “including but not limited to,” and is not intended to exclude, for example, other additives, components, integers or steps. In particular, in methods stated as comprising one or more steps or operations it is specifically contemplated that each step comprises what is listed (unless that step includes a limiting term such as “consisting of”), meaning that each step is not intended to exclude, for example, other additives, components, integers or steps that are not listed in the step.
Disclosed are methods for determining a subject's risk for having or developing coronary artery disease comprising determining in the subject the presence of one or more PDE3B loss of function or damaging variants, and wherein the presence of the variant indicates the subject's reduced risk for having or developing coronary artery disease.
In some aspects, the PDE3B loss of function or damaging variant can be any of those found in
In some aspects, the PDE3B loss of function or damaging variant results in a PDE3B protein having the mutation Arg783Ter or rs150090666.
In some aspects, the PDE3B loss of function or damaging variant is determined from a sample obtained from the subject. The sample obtained from the subject can be, for example, blood, plasma, serum, cells, urine, mucus, spinal fluid, or sweat.
In some aspects, the PDE3B loss of function or damaging variant is determined by amplifying or sequencing a nucleic acid sample obtained from the subject. In some aspects, the amplifying can be performed using polymerase chain reaction (PCR). In some aspects, the amplifying or sequencing comprises using primers having sequences complementary to PDE3B DNA or RNA sequences. For example, disclosed are primers and probes having sequences complementary to a portion of the PDE3B nucleic acid sequence found in accession number NM_000922.3.
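As a non-limiting illustration of what it means for a primer to have a sequence complementary to a target nucleic acid, the following Python sketch locates where a forward primer and a reverse primer could anneal on a template. The template and primer sequences are made-up placeholders chosen for this example; they are not taken from NM_000922.3, and the function names are hypothetical.

```python
# Illustrative sketch only. TEMPLATE is a made-up placeholder sequence,
# not a portion of the PDE3B sequence in NM_000922.3.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_all(text: str, pattern: str) -> list:
    """Return every 0-based start position of pattern in text."""
    hits = []
    start = text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits


def primer_binding_sites(template: str, primer: str) -> dict:
    """Locate primer annealing sites on a double-stranded template.

    The template is given as its top strand (5'->3'). A "forward" hit means
    the primer sequence itself occurs in the top strand, so the primer is
    complementary to the bottom strand. A "reverse" hit means the reverse
    complement of the primer occurs in the top strand, so the primer is
    complementary to the top strand.
    """
    template = template.upper()
    primer = primer.upper()
    return {
        "forward": find_all(template, primer),
        "reverse": find_all(template, reverse_complement(primer)),
    }


TEMPLATE = "ATGGCCTTAGACGGTTCAAGCTTGACCGTA"      # hypothetical 30-nt template
FORWARD_PRIMER = TEMPLATE[:8]                    # matches the 5' end of the top strand
REVERSE_PRIMER = reverse_complement(TEMPLATE[-8:])  # anneals at the 3' end of the top strand

forward_sites = primer_binding_sites(TEMPLATE, FORWARD_PRIMER)
reverse_sites = primer_binding_sites(TEMPLATE, REVERSE_PRIMER)
```

Exact string matching is used here for simplicity; practical primer design would also consider melting temperature, mismatches, and specificity across the genome, none of which this sketch addresses.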
Disclosed are methods of detecting one or more PDE3B loss of function or damaging variants in a subject, said method comprising: obtaining a biological sample from a subject; detecting whether one or more PDE3B loss of function or damaging variants are present in the biological sample by contacting the biological sample with an anti-PDE3B loss of function or damaging variant antibody or antigen binding fragment thereof and detecting binding between the one or more PDE3B loss of function or damaging variants and the antibody, or fragment thereof.
Disclosed are methods of detecting one or more PDE3B loss of function or damaging variants in a subject, said method comprising: obtaining a biological sample from a subject; detecting whether one or more PDE3B loss of function or damaging variants are present in the biological sample by performing whole genome or whole exome sequencing. After detecting the presence of a variant, the effect of the variant on the function or expression of the protein can be predicted. Predicted loss of function variants (pLOFs) can lead to truncation of a protein, splice site problems, or frameshifts.
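As a non-limiting illustration of predicting the effect of a detected variant, the following Python sketch coarsely classifies an altered coding sequence as a premature stop (truncation) or a frameshift relative to a reference. The sequences are short made-up examples, not PDE3B sequence, and the function names and category labels are assumptions made for this sketch. The toy substitution changes an arginine codon (CGA) to a stop codon (TGA), which is the same kind of change as an Arg-to-Ter variant. Splice site effects are not modeled.

```python
# Illustrative sketch only. The sequences are toy examples, not PDE3B.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_index(cds: str):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def classify_coding_change(ref_cds: str, alt_cds: str) -> str:
    """Coarsely classify an altered coding sequence against a reference.

    Both sequences are assumed to begin at the start codon, in frame.
    Returns one of: "no_change", "frameshift", "stop_gained", "other".
    """
    ref_cds, alt_cds = ref_cds.upper(), alt_cds.upper()
    if ref_cds == alt_cds:
        return "no_change"
    # An insertion or deletion whose length is not a multiple of three
    # shifts the reading frame.
    if (len(alt_cds) - len(ref_cds)) % 3 != 0:
        return "frameshift"
    ref_stop = first_stop_index(ref_cds)
    alt_stop = first_stop_index(alt_cds)
    # A stop codon that appears earlier than in the reference truncates
    # the encoded protein.
    if alt_stop is not None and (ref_stop is None or alt_stop < ref_stop):
        return "stop_gained"
    return "other"


REF = "ATGGCTCGAGGTTAA"        # Met-Ala-Arg-Gly-stop (toy sequence)
ALT_STOP = "ATGGCTTGAGGTTAA"   # CGA (Arg) changed to TGA (stop)
ALT_SHIFT = "ATGGCCGAGGTTAA"   # one base deleted

stop_call = classify_coding_change(REF, ALT_STOP)
shift_call = classify_coding_change(REF, ALT_SHIFT)
```

Established variant annotation tools perform this kind of prediction against curated transcript models; the sketch only shows the underlying reading-frame logic.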
Disclosed are methods comprising: obtaining a sample from a subject; detecting whether one or more PDE3B loss of function or damaging variants are present in the sample; diagnosing the subject as having a greater likelihood of responding to PDE3B inhibitors when there is an absence of the one or more PDE3B loss of function or damaging variants; and administering an effective amount of a PDE3B inhibitor to the subject. In some aspects, the sample can be, but is not limited to, blood, plasma, serum, cells, urine, mucus, spinal fluid, or sweat. In some aspects, the sample can be DNA or protein.
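As a non-limiting illustration of the selection logic in the foregoing method, the following Python sketch flags a subject as more likely to respond to a PDE3B inhibitor when no loss of function or damaging variant is detected. The data structure, function names, and the representation of genotype calls are assumptions made for this sketch; only the variant identifier rs150090666 comes from the present disclosure. The sketch is not a clinical decision tool.

```python
# Illustrative sketch only; class and function names are hypothetical.
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class GenotypeCall:
    variant_id: str        # e.g. an rsID reported by a genotyping assay
    alt_allele_count: int  # 0, 1, or 2 copies of the variant allele


def carries_lof_variant(calls: Iterable[GenotypeCall], lof_ids: Set[str]) -> bool:
    """True if at least one listed variant is present in one or more copies."""
    return any(c.variant_id in lof_ids and c.alt_allele_count > 0 for c in calls)


def likely_inhibitor_responder(calls: Iterable[GenotypeCall], lof_ids: Set[str]) -> bool:
    """Absence of every listed variant marks the subject as a likely responder."""
    return not carries_lof_variant(calls, lof_ids)


LOF_IDS = {"rs150090666"}
carrier = [GenotypeCall("rs150090666", 1)]
non_carrier = [GenotypeCall("rs150090666", 0)]

carrier_result = likely_inhibitor_responder(carrier, LOF_IDS)
non_carrier_result = likely_inhibitor_responder(non_carrier, LOF_IDS)
```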
In some aspects of the disclosed methods, the PDE3B loss of function or damaging variant can be any of those found in
In some aspects, the PDE3B inhibitor can be a compound, protein, DNA, RNAi, CRISPR, or siRNA.
Disclosed are methods of treating a subject comprising administering a composition that inhibits the function of PDE3B to a subject, wherein the subject has been determined to lack one or more loss of function or damaging mutations in PDE3B. In some aspects, a PDE3B loss of function or damaging variant results in a PDE3B protein having the mutation Arg783Ter. Thus, a subject lacking the loss of function mutation in PDE3B can be a subject that does not contain the Arg783Ter mutation. In some aspects, a subject lacking the loss of function mutation in PDE3B can be a subject that does not contain any of the mutations in
In some aspects, the composition administered to the subject can be a compound, protein, DNA, RNAi, CRISPR, or siRNA.
Disclosed are methods for identifying a subject in need of treatment for coronary artery disease comprising determining in the subject the presence of a PDE3B loss of function or damaging variant, wherein the presence of a PDE3B loss of function or damaging variant indicates that the subject is not in need of treatment for coronary artery disease. Thus, also disclosed are methods for identifying a subject in need of treatment for coronary artery disease comprising determining in the subject the lack of a PDE3B loss of function or damaging variant, wherein the lack of a PDE3B loss of function or damaging variant indicates that the subject is in need of treatment for coronary artery disease. In some aspects, the PDE3B loss of function or damaging variant results in a PDE3B protein having the mutation Arg783Ter. In some aspects, the PDE3B loss of function or damaging variant results in a PDE3B protein having the mutation of any of those in
Disclosed are methods of screening for test compositions that cause a loss of function or damaging mutation in PDE3B comprising: contacting a PDE3B gene with a test composition; detecting the presence of one or more mutations in the PDE3B gene; and determining if the one or more mutations are loss of function or damaging mutations, wherein the presence of one or more loss of function or damaging mutations in PDE3B indicates a test composition that causes a loss of function or damaging mutation in PDE3B. In some aspects, prior to contacting a PDE3B gene with a test composition, the PDE3B gene is first analyzed for the presence of a loss of function or damaging mutation. If no loss of function or damaging mutation is detected, then the PDE3B gene can be contacted with a test composition.
In some aspects, the PDE3B loss of function or damaging variant results in a PDE3B protein having the mutation of any of those in
Disclosed are methods of screening for therapeutic candidate compositions for treating coronary artery disease comprising: contacting a cell lacking a loss of function or damaging mutation in PDE3B with a test composition; and determining if the test composition inhibits PDE3B in the cell, wherein if the test composition inhibits PDE3B then it is a therapeutic candidate for treating coronary artery disease.
Disclosed are methods of identifying a subject in need of screening for the development of coronary artery disease comprising determining in the subject the absence of a PDE3B loss of function or damaging variant, wherein the absence of a PDE3B loss of function or damaging variant indicates a subject in need of screening for the development of coronary artery disease. In some aspects, the loss of function or damaging mutation in PDE3B results in a PDE3B protein having the mutation Arg783Ter.
Disclosed are methods of inducing a loss of function or damaging mutation in PDE3B comprising administering a test composition determined from the disclosed methods of screening for test compositions that cause a loss of function or damaging mutation in PDE3B. In some aspects, the loss of function or damaging mutation in PDE3B results in a PDE3B protein having the mutation Arg783Ter.
Disclosed are vectors comprising a loss of function or damaging PDE3B variant, wherein the loss of function or damaging mutation in PDE3B results in a PDE3B protein having the mutation Arg783Ter.
In some aspects, the vectors can be viral or non-viral vectors. The term “vector”, as used herein, refers to a composition capable of transporting a nucleic acid. In some cases, a vector can be a plasmid, i.e., a circular double stranded piece of DNA into which additional DNA segments can be ligated. In some cases, a vector can be a viral vector, wherein additional DNA segments can be ligated into the viral genome. In some cases, a vector can autonomously replicate in a host cell into which it is introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). In other cases, vectors (e.g., non-episomal mammalian vectors) can be integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors can direct the expression of genes to which they are operatively linked. Such vectors can be referred to as “recombinant expression vectors” (or simply, “expression vectors”).
In some aspects, the proteins encoded by the PDE3B variants are expressed by inserting DNAs encoding the PDE3B variants into expression vectors such that the genes are operatively linked to necessary expression control sequences such as transcriptional and translational control sequences. Expression vectors include plasmids, retroviruses, adenoviruses, adeno-associated viruses (AAV), plant viruses such as cauliflower mosaic virus, tobacco mosaic virus, cosmids, YACs, EBV derived episomes, and the like. In some instances nucleic acids comprising the PDE3B variants can be ligated into a vector such that transcriptional and translational control sequences within the vector serve their intended function of regulating the transcription and translation of the PDE3B variant. The expression vector and expression control sequences are chosen to be compatible with the expression host cell used. Nucleic acid sequences comprising the PDE3B variants can be inserted into separate vectors or into the same expression vector. A nucleic acid sequence comprising the PDE3B variants can be inserted into the expression vector by standard methods (e.g., ligation of complementary restriction sites on the nucleic acid comprising the PDE3B variants and vector, or blunt end ligation if no restriction sites are present).
In addition to a nucleic acid sequence comprising the PDE3B variants, the recombinant expression vectors can carry regulatory sequences that control the expression of the genetic variant in a host cell. It will be appreciated by those skilled in the art that the design of the expression vector, including the selection of regulatory sequences, can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, etc. Preferred regulatory sequences for mammalian host cell expression include viral elements that direct high levels of protein expression in mammalian cells, such as promoters and/or enhancers derived from retroviral LTRs, cytomegalovirus (CMV) (such as the CMV promoter/enhancer), Simian Virus 40 (SV40) (such as the SV40 promoter/enhancer), adenovirus (e.g., the adenovirus major late promoter (AdMLP)), polyoma, and strong mammalian promoters such as native immunoglobulin and actin promoters. For further description of viral regulatory elements, and sequences thereof, see e.g., U.S. Pat. Nos. 5,168,062, 4,510,245 and 4,968,615. Methods of expressing polypeptides in bacterial cells or fungal cells, e.g., yeast cells, are also well known in the art.
In addition to a nucleic acid sequence comprising the PDE3B variants and regulatory sequences, the recombinant expression vectors can carry additional sequences, such as sequences that regulate replication of the vector in host cells (e.g., origins of replication) and selectable marker genes. The selectable marker gene facilitates selection of host cells into which the vector has been introduced (see e.g., U.S. Pat. Nos. 4,399,216, 4,634,665 and 5,179,017, incorporated herein by reference). For example, typically the selectable marker gene confers resistance to drugs, such as G418, hygromycin or methotrexate, on a host cell into which the vector has been introduced. Preferred selectable marker genes include the dihydrofolate reductase (DHFR) gene (for use in dhfr- host cells with methotrexate selection/amplification), the neo gene (for G418 selection), and the glutamine synthetase (GS) gene.
Disclosed are cells comprising the disclosed vectors. In some instances, a cell can be transfected with a nucleic acid comprising the PDE3B variants. In some instances, a cell comprising one or more of the PDE3B variants can express the protein encoded by the one or more disclosed genetic variants and therefore, also disclosed are cells comprising a protein encoded by one or more PDE3B variants.
The materials described above as well as other materials can be packaged together in any suitable combination as a kit useful for performing, or aiding in the performance of, the disclosed method. It is useful if the kit components in a given kit are designed and adapted for use together in the disclosed method. For example, disclosed are kits that can comprise an assay or assays for detecting one or more PDE3B variants in a sample of a subject.
Disclosed are engineered, non-naturally occurring CRISPR-CAS systems comprising: a guide RNA that hybridizes with a target sequence, wherein the target sequence comprises a PDE3B loss of function or damaging variant, and a Cas protein or gene encoding a Cas protein. In some aspects, the Cas protein can be a Type-II Cas9 protein or a gene encoding a Type-II Cas9 protein. In some aspects, the Cas9 protein and the guide RNA do not naturally occur together.
In some aspects of the engineered, non-naturally occurring CRISPR-CAS system, the PDE3B loss of function or damaging variant comprises the mutation Arg783Ter in the PDE3B protein.
In some aspects, the guide RNA sequence can comprise a sequence that binds to a portion of the PDE3B nucleic acid sequence found in accession number NM_000922.3.
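As a non-limiting illustration of selecting a guide RNA sequence that binds to a portion of a target nucleic acid, the following Python sketch scans one strand of a DNA sequence for candidate sites that cover a position of interest. The sketch assumes a 20-nucleotide protospacer immediately followed by an NGG protospacer adjacent motif, a convention commonly associated with Streptococcus pyogenes Cas9; that convention, the function names, and the toy sequence are assumptions of this sketch and are not taken from the present disclosure or from NM_000922.3.

```python
# Illustrative sketch only. TOY_TARGET is a made-up sequence, not PDE3B.
# Assumption: 20-nt protospacer followed by an NGG PAM on the same strand.


def find_protospacers(seq: str, target_pos: int, spacer_len: int = 20) -> list:
    """Return (start, protospacer, pam) tuples for sites covering target_pos.

    Only the given strand is scanned; target_pos is a 0-based index into seq.
    """
    seq = seq.upper()
    hits = []
    for start in range(0, len(seq) - spacer_len - 3 + 1):
        pam = seq[start + spacer_len:start + spacer_len + 3]
        covers_target = start <= target_pos < start + spacer_len
        if pam[1:] == "GG" and covers_target:
            hits.append((start, seq[start:start + spacer_len], pam))
    return hits


def spacer_rna(protospacer: str) -> str:
    """Write the guide RNA spacer matching a DNA protospacer (T replaced by U)."""
    return protospacer.upper().replace("T", "U")


TOY_TARGET = "TTACG" + "CTGACTGACTGATCGATCGA" + "AGG" + "TCAT"
candidates = find_protospacers(TOY_TARGET, target_pos=15)
guides = [spacer_rna(protospacer) for _, protospacer, _ in candidates]
```

A fuller design workflow would also scan the opposite strand and screen candidates for off-target matches, which this sketch omits.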
Disclosed are methods of altering expression of at least one gene product, wherein the at least one gene product is a gene product from a PDE3B loss of function or damaging variant, wherein the method comprises administering a guide RNA that hybridizes with a target sequence, wherein the target sequence comprises the PDE3B loss of function or damaging variant, and a Cas protein or gene encoding a Cas protein, whereby the guide RNA targets the target sequence and the Cas9 protein cleaves the nucleic acid molecule which comprises the PDE3B loss of function or damaging variant, whereby expression of the at least one gene product is altered. In some aspects, the PDE3B loss of function or damaging variant comprises the mutation Arg783Ter in the PDE3B protein.
Disclosed are methods of altering expression of at least one gene product, wherein the at least one gene product is a gene product from a PDE3B loss of function or damaging variant, wherein the method comprises administering a vector that comprises a first regulatory element operable in a eukaryotic cell operably linked to at least one nucleotide sequence encoding a CRISPR-Cas system guide RNA that hybridizes with a target sequence, wherein the target sequence comprises the PDE3B loss of function or damaging variant; and a second regulatory element operable in a eukaryotic cell operably linked to a nucleotide sequence encoding a Cas9 protein, whereby the guide RNA targets the target sequence and the Cas9 protein cleaves the target sequence, whereby expression of the at least one gene product is altered. In some aspects, the PDE3B loss of function or damaging variant comprises the mutation Arg783Ter in the PDE3B protein.
In some aspects, the guide RNA sequence can comprise a sequence that binds to a portion of the PDE3B nucleic acid sequence found in accession number NM_000922.3.
Disclosed are methods of silencing or inhibiting expression of wild type PDE3B in a cell comprising providing at least one silencing agent to the cell, wherein said silencing agent silences or inhibits expression of the wild type PDE3B in the cell.
In some aspects, the cell is inside a subject and thus the method occurs in vivo. In some aspects, the silencing or inhibiting expression of PDE3B in a cell occurs in vitro.
In some aspects, the silencing agent can be RNAi, CRISPR, or siRNA.
Disclosed are methods of silencing or inhibiting expression of wild type PDE3B in a cell comprising providing at least one RNA to the cell in an amount sufficient to inhibit the expression of PDE3B, wherein the RNA comprises or forms a double-stranded structure containing a first strand comprising a ribonucleotide sequence which corresponds to a nucleotide sequence of PDE3B and a second strand comprising a ribonucleotide sequence which is complementary to the nucleotide sequence of PDE3B, wherein the first and the second ribonucleotide sequences are separate complementary sequences that hybridize to each other to form said double-stranded structure, and the RNA comprising the double-stranded structure inhibits expression of PDE3B.
In some aspects, the first strand comprises a sequence which corresponds to a portion of the PDE3B nucleic acid sequence found in accession number NM_000922.3. In some aspects, the second strand comprises a sequence that can bind to, or is complementary to, a portion of the PDE3B nucleic acid sequence found in accession number NM_000922.3.
Disclosed are methods of inhibiting expression of PDE3B in a cell comprising: isolating the cell; contacting the cell with an RNA comprising a double-stranded structure comprising a first strand comprising a ribonucleotide sequence which corresponds to a nucleotide sequence of PDE3B and a second strand comprising a ribonucleotide sequence which is complementary to the nucleotide sequence of PDE3B, wherein the first and the second ribonucleotide sequences are separate sequences that hybridize to each other to form said double-stranded structure, and subsequently introducing the cell into a host, wherein said RNA comprising the double-stranded structure inhibits expression of the target gene in the cell in the host.
“Silencing” or “inhibiting,” as it is used herein, is a term generally used to refer to suppression, full or partial, of expression of a gene.
Disclosed are RNAs comprising a double-stranded structure containing a first strand comprising a ribonucleotide sequence which corresponds to a nucleotide sequence of PDE3B and a second strand comprising a ribonucleotide sequence which is complementary to the nucleotide sequence of PDE3B, wherein the first and the second ribonucleotide sequences are separate complementary sequences that hybridize to each other to form said double-stranded structure.
In some aspects, the first strand comprises a sequence corresponding to a portion of the PDE3B nucleic acid sequence found in accession number NM_000922.3. In some aspects, the second strand comprises a sequence that can bind to, or is complementary to, a portion of the PDE3B nucleic acid sequence found in accession number NM_000922.3.
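The relationship between the first and second strands described above can be sketched in a few lines. This is a minimal illustration only: the 19-nucleotide sequence is an invented placeholder and is not taken from NM_000922.3, and the helper names are hypothetical.

```python
# Watson-Crick pairing rules for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_strand(sense: str) -> str:
    """Return the reverse complement of an RNA strand, written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(sense.upper()))


def is_complementary(first: str, second: str) -> bool:
    """True if the second strand can hybridize fully to the first strand."""
    return antisense_strand(first) == second.upper()


# Placeholder sequence invented for illustration; NOT a PDE3B sequence.
first_strand = "GCAUUCGAUGGCUAACGUA"
second_strand = antisense_strand(first_strand)
print(first_strand, second_strand, is_complementary(first_strand, second_strand))
```

In practice the first strand would be chosen from the target transcript, and design considerations such as overhangs and off-target screening, which this sketch ignores, would apply.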
The disclosed nucleic acids that encode the PDE3B variants or their modified forms can also be used to generate either transgenic animals or “knock out” animals which, in turn, are useful in the development and screening of therapeutically useful reagents as well as studying the mechanism of action of the genetic variant. A transgenic animal (e.g., a mouse or rat) is an animal having cells that contain a transgene, which transgene was introduced into the animal or an ancestor of the animal at a prenatal, e.g., an embryonic stage. A transgene is a DNA that is integrated into the genome of a cell from which a transgenic animal develops. In some instances, cDNA encoding one or more of the PDE3B variants can be used to clone genomic DNA encoding the one or more of the disclosed genetic variants in accordance with established techniques and the genomic sequences used to generate transgenic animals that contain cells that express DNA encoding one or more of the PDE3B variants.
Large-scale biobanks offer the potential to link genes to health traits documented in electronic health records (EHR) with unprecedented power. In turn, these discoveries are expected to improve the understanding of the etiology of common and complex diseases as well as the ability to treat and prevent these conditions. To this end, the Million Veteran Program (MVP) was established by the U.S. Veterans Health Administration in 2011 as a nationwide research program within the Veteran Administration (VA) healthcare system. The overarching goal of MVP is to reveal new biologic insights and clinical associations broadly relevant to human health and enhance the care of veterans through precision medicine.
Blood concentrations of total cholesterol (TC), low-density lipoprotein (LDL) cholesterol, high-density lipoprotein (HDL) cholesterol, and triglycerides (TG) are heritable risk factors for cardiovascular disease, a highly prevalent condition among U.S. veterans. Genome-wide association studies (GWAS) to date have identified at least 268 loci that influence these levels, many of which are under investigation as potential therapeutic targets. However, off-target effects have dampened enthusiasm for some of these molecules, and understanding the full spectrum of clinical consequences of a given DNA sequence variant through phenome-wide association scanning (“PheWAS”) can shed light on potential unintended effects as well as novel therapeutic indications.
A GWAS was performed, including a discovery phase in MVP and a replication phase in the Global Lipids Genetics Consortium (GLGC) (
A transcriptome-wide association study (TWAS) and a competitive gene set pathway analysis were then performed. Novel, genome-wide lipid-associated, low-frequency missense variants unique to black and Hispanic individuals were examined. The analysis then focused on predicted loss of gene function (pLoF) mutations, as these associations have revealed target pathways for pharmacologic inactivation and modulation of cardiovascular risk. A PheWAS was performed for a set of DNA sequence variants within genes that have already emerged as therapeutic targets for lipid modulation, leveraging the full catalog of ICD-9 diagnosis codes in the VA EHR to better understand the consequences of pharmacologic modulation of these genes or their products. Lastly, the causal relationship between lipids and abdominal aortic aneurysm (AAA) development was explored through a multivariable Mendelian randomization analysis.
1. Results
i. Demographic and Clinical Characteristics of Genotyped Participants in the Million Veteran Program
A total of 353,323 veterans had genetic data available in MVP, with clinical phenotypes recorded in the VA EHR over 3,088,030 patient-years prior to enrollment (median of 10.0 years per participant) and 61,747,974 distinct clinical encounters (median of 99 per participant). Veterans were categorized into three mutually exclusive ancestral groups for association analysis: 1) non-Hispanic whites, 2) non-Hispanic blacks, and 3) Hispanics. Admixture plots depicting the genetic background of the black and Hispanic groups are shown in
A subset of 297,626 participants passing quality control had at least 1 laboratory measurement of blood lipids in their EHR. These individuals collectively had a total of 15,456,328 lab entries for blood lipids, or a median of 12 measures per lipid fraction per participant. To minimize potential confounding from the use of lipid-altering agents with variable adherence, a participant's maximum LDL cholesterol, TG, and TC as well as his or her minimum HDL cholesterol were selected for genetic association analysis. Table 2 summarizes characteristics at enrollment and the distribution of lipid levels for MVP participants included in the analysis. Participants were largely male, 72% white, and while 39-46% of participants in each ancestral group had statin therapy prescriptions at the time of enrollment, only 8-9% were prescribed statin therapy at the time of their maximum LDL or TC measurement used for GWAS analysis.
19.3, 31.4, and 30.4 million variants in white, black, and Hispanic veterans, respectively, were successfully imputed [INFO>0.3, minor allele frequency (MAF)>0.0003] using the 1000 Genomes Project reference panel (Table 2). Black and Hispanic participants had substantially more variants available for analysis, reflecting the known greater genetic diversity within these populations. We also identified 6,657 pLoF variants in 4,294 genes across the three ethnicities (
The Z scores and effect estimates from the published literature were compared with those observed in MVP for 444 previously reported exome-wide significant variants for lipids that were imputed using HapMap. A strong correlation of genetic associations was found across all four traits, validating the lipid phenotypes defined through EHR (
Association testing was performed separately among individuals of each of three ancestries (whites, blacks, and Hispanics) in the initial discovery analysis, and results were then meta-analyzed across ancestry groups using an inverse variance-weighted fixed effects method (
Replication for variants within MVP with suggestive associations (P<1×10−4) was sought in one of two independent studies (
At any given genetic locus, more than one variant may independently affect plasma lipid levels. A conditional analysis was performed using combined summary statistics from MVP and publicly available data from GLGC for each lipid trait (
ii. Variance Explained and Gain Using Multiple Lipid Measurements
The previously mapped 444 lipid variants explain about 7.5-10.5% of the phenotypic variance in lipid levels in the MVP population. The 118 novel loci explain an additional 0.38-0.74% in phenotypic variance, and the 826 independent variants identified in the conditional analysis increase the overall phenotypic variance explained to 8.8-12.3% (Table 3).
The impact of multiple lipid measurements was subsequently explored in an analysis restricted to 171,314 European MVP participants with >5 lipid measurements in their EHR. A weighted genetic risk score (GRS) of 223 variants was constructed across 268 of the previously mapped loci with effect estimates available in the 2017 GLGC exome array analysis summary statistics (
iii. Transcriptome-wide Association Study
A TWAS was performed using: 1) pre-computed weights from expression array data measured in peripheral blood from 1,245 unrelated control individuals from the Netherlands Twin Registry (NTR), RNA-seq data measured in adipose tissue from 563 control individuals from the Metabolic Syndrome in Men study (METSIM), and RNA-seq data from post-mortem liver (97 individuals) and tibial artery (285 individuals) tissue from the Genotype-Tissue Expression project (GTEx V6), and 2) combined MVP and GLGC summary statistics for each of the four lipid traits (
In total, the TWAS identified 655 genome-wide significant (P<5×10−8) gene-lipid associations (summed across expression reference panels) in a total of 333 distinct genes, including 194 that were significant in more than one tissue or lipid trait (
iv. Tissue Expression Enrichment and Competitive Gene Set Pathway Analysis
Multi-marker Analysis of GenoMic Annotation (MAGMA) was used as implemented in the FUMA pipeline to perform a competitive gene set analysis of curated gene sets and GO terms (pathways) obtained from the Molecular Signatures Database, as well as a gene-property analysis for gene expression of GTEx tissues for LDL-C, TG, and HDL-C. As expected, the pathway analysis revealed a significant enrichment for several biological processes related to lipoprotein metabolism including sterol homeostasis, acylglycerol homeostasis, chylomicron mediated transport, acyl reverse cholesterol transport, and regulation of lipoprotein lipase activity (P Bonferroni <0.05). MAGMA gene-property analysis revealed a significant enrichment of GWAS signal overlapping genes expressed in the liver, adrenal gland, and ovary for LDL-C; subcutaneous and visceral adipose tissue, liver, adrenal gland, and pancreas for TG; and liver for HDL-C.
v. Predicted Loss of Gene Function Lipid Associations
The subset of genotyped or imputed pLoF variants [variants annotated as: premature stop (nonsense), canonical splice-sites (splice-donor or splice-acceptor) or insertion/deletion variants that shifted frame (frameshift) by the Variant Effect Predictor software] was then studied. A total of 15 unique pLoF variants demonstrated genome-wide significant lipid associations across individuals of all three ethnic groups (
One novel pLoF association was identified. Among white MVP participants, carriers of a rare stop-gain mutation in PDE3B (p.Arg783Ter; carrier frequency of 1 in 625) exhibited a 4.72 mg/dL (0.41 standard deviations) higher blood HDL cholesterol (P<2.8×10−16) and 43.3 mg/dL (−0.27 standard deviations) lower blood TG (P=7.5×10−8). This signal is independent of the previously reported PDE3B genome-wide significant lead variant, rs103737811 (p.Arg783Ter conditional analysis P=8.91×10−8 for TG and 6.3×10−16 for HDL cholesterol, respectively). One individual was also identified who was homozygous for p.Arg783Ter. This PDE3B “human knockout” was in his sixth decade of life, with HDL cholesterol and TG levels of 73 and 56 mg/dL, respectively. He was not on lipid-lowering medication and was free of coronary artery disease (CAD). The TG and HDL associations were replicated for this pLoF variant in an independent sample of ˜45,000 participants of the DiscovEHR study (
vi. Loss of PDE3B Function and Risk of Coronary Artery Disease
Mutations damaging or causing a loss of function in PDE3B can protect against the development of CAD based on their association with lifelong lower TG levels in blood. A case-control analysis of CAD status was performed involving 5 cohorts: MVP, UK Biobank, Myocardial Infarction Genetics Consortium (MIGen), Penn Medicine Biobank (PMBB), and DiscovEHR. In studies with exome sequencing available (MIGen, PMBB, DiscovEHR), pLoF variants were combined with missense variants predicted to be damaging or possibly damaging by each of 5 computer prediction algorithms (LRT score, MutationTaster, PolyPhen-2 HumDiv, PolyPhen-2 HumVar, and SIFT) as performed previously. Because damaging mutations were individually rare, they were aggregated in subsequent association analysis with CAD (Table 4). Among 103,580 individuals with CAD and 566,813 controls available for meta-analysis in these 5 cohorts, carriers of damaging PDE3B mutations were found to have a 24% decreased risk of CAD (OR=0.76, 95% CI=0.65-0.90, P=0.0015,
vii. Novel Lipid Loci and Association with Coronary Disease
To further evaluate whether novel lipid variants identified in the analysis also influence the risk of CAD, the association with CAD was examined for the lead variants within the 118 novel lipid loci identified in the study. 115/118 of the lead variants were present in the CARDIoGRAMplusC4D 1000 Genomes GWAS; the remaining 3 (MAF<0.0035 for each) were present in the MIGen and CARDIoGRAM exome chip GWAS analysis. In total, 25 of the 118 loci showed at least nominal (P<0.05) association with CAD in the CARDIoGRAM studies (
2. Discussion
Data were leveraged from the Million Veteran Program to investigate the inherited basis of blood lipids using EHR-based laboratory measures in nearly 300,000 U.S. veterans. First, 188 previously identified loci were confirmed; furthermore, an additional 118 novel genome-wide significant loci were uncovered. Next, a total of 826 independent lipid-associated variants were identified, increasing the phenotypic variance explained by nearly 2%. A TWAS was performed in four tissues, identifying 5 additional novel lipid loci at a genome-wide level of significance, and a pathway analysis was performed highlighting lipid transport mechanisms in the GWAS results. Ancestry-specific effects of rare coding variation on lipids among white, black, and Hispanic participants were identified, as were 15 pLoF mutations associated with lipids at a genome-wide level of significance, including a protein-truncating variant in PDE3B that lowers TG, raises HDL cholesterol, and protects against CAD. Finally, the full spectrum of phenotypic consequences for mutations in lipid genes emerging as therapeutic targets was examined, identifying protective effects of pLoF mutations in PCSK9 for abdominal aortic aneurysm and in ANGPTL4 for type 2 diabetes.
A large-scale multi-ethnic biobank built within an integrated health care system has enormous potential for discovering the genetic basis of a broad spectrum of human traits. Specifically, the VA's mature nationwide EHR was leveraged to efficiently extract existing repeated laboratory measures of lipids collected during the course of clinical care in nearly 300,000 veterans over a median of 10 years for GWAS analysis. Subsequent meta-analysis (combined N>600,000) with existing datasets increased the number of known independent genetic lipid associations to nearly 400. These results highlight an increase in variance explained with multiple lipid measurements, and multiple lipid pathways with links to human disease. For example, common variants near genes such as COL4A2 and ITGA1 identified for LDL cholesterol/TC indicate links to extracellular matrix and cell adhesion biology, two pathways recently implicated by GWAS of CAD. Carriers of a rare missense mutation in the gene encoding Perilipin-1 (PLIN1 p.Leu90Pro) possess a markedly higher plasma HDL cholesterol (0.243 standard deviations). In humans, Perilipin-1 is required for lipid droplet formation, triglyceride storage, and free fatty acid metabolism, and frameshift pLoF mutations in Perilipin-1 have been reported to result in severe lipodystrophy. A variant downstream of BDNF (encoding Brain-Derived Neurotrophic Factor) was found to be associated with HDL cholesterol and TG levels, supporting recent evidence linking this gene with metabolic syndrome and diabetes. These findings not only improve the understanding of the genetic basis of dyslipidemia, but also provide insights into targets for the development of novel therapeutic agents.
There is a benefit to studying individuals with diverse ethnic backgrounds. Such a design can provide valuable incremental information on the nature of previously identified human genetic associations. In MVP, nearly 60,000 black and 25,000 Hispanic veterans were examined for analysis, representing one of the largest, if not the largest, single-cohort GWAS to date for these ethnic groups for any trait. Among these individuals, we compared the effect estimates and allele frequencies of lipid-associated variants across ancestral groups and identified 7 novel, low-frequency coding variants associated with lipids only in non-European populations. Conversely, a shared genetic architecture across all three racial groups for pLoF variation at the LPL and APOC3 loci was confirmed. Previous work identifying low-frequency missense and pLoF variation in lipid genes has led to the development of the next generation of pharmaceutical agents for cardiovascular disease. Expansion of these efforts to larger sample sizes and additional ancestries may help explain differences in blood lipid levels and risk of atherosclerosis among select populations.
These findings lend human genetic support to PDE3B inhibition as a therapeutic strategy for atherosclerosis. Cilostazol, an inhibitor of both the 3A and 3B isoforms of the phosphodiesterase enzyme, is known to have anti-platelet, vasodilatory, and inotropic effects via inhibition of PDE3A, and also has well documented substantial effects on TG and HDL cholesterol levels, likely through antagonism of PDE3B. A PDE3B pLoF variant recapitulates the known lipid effects of cilostazol, and damaging PDE3B mutations are also associated with reduced risk of CAD. Randomized controlled trials to date have demonstrated cilostazol's efficacy in intermittent claudication and prevention of restenosis following percutaneous coronary intervention. The drug is also currently used off-label for the prevention of stroke recurrence through a presumed anti-platelet effect. Mice genetically deficient in Pde3b display reduced atherosclerosis as well as decreased infarct size and improved cardiac function following experimental coronary artery ligation.
In conclusion, >100 new genetic signals were identified for blood lipid levels utilizing a biobank that exploits existing EHRs of U.S. veterans.
3. Methods
The design of the Million Veteran Program (MVP) has been previously described. Briefly, individuals aged 19 to 104 years have been recruited from more than 50 VA Medical Centers nationwide since 2011. Each veteran's EHR data are being integrated into the MVP biorepository, including inpatient International Classification of Diseases (ICD-9) diagnosis codes, Current Procedural Terminology (CPT) procedure codes, clinical laboratory measurements, and reports of diagnostic imaging modalities. The MVP received ethical and study protocol approval from the VA Central Institutional Review Board (IRB) in accordance with the principles outlined in the Declaration of Helsinki.
i. Genetic Data
DNA extracted from whole blood was genotyped using a customized Affymetrix Axiom biobank array, the MVP 1.0 Genotyping Array. With 723,305 total DNA sequence variants, the array is enriched for both common and rare variants of clinical significance in different ethnic backgrounds. Veterans of three mutually exclusive ethnic groups were identified for analysis: 1) non-Hispanic whites, 2) non-Hispanic blacks, and 3) Hispanics. Quality-control procedures used to assign ancestry, remove low-quality samples and variants, and perform genotype imputation to the 1000 Genomes reference panel were performed.
ii. Variant Quality Control
Prior to imputation, variants that were poorly called (genotype missingness >5%) or that deviated from their expected allele frequency based on reference data from the 1000 Genomes Project were excluded. After pre-phasing using EAGLE v2, genotypes from the 1000 Genomes Project phase 3, version 5 reference panel were imputed into Million Veteran Program (MVP) participants via Minimac3 software. Ethnicity-specific principal component analysis was performed using the EIGENSOFT software.
Following imputation, variant level quality control was performed using the EasyQC R package (www.R-project.org), and exclusion metrics included: ancestry specific Hardy-Weinberg equilibrium P<1×10−20, posterior call probability <0.9, imputation quality/INFO <0.3, minor allele frequency (MAF)<0.0003, call rate <97.5% for common variants (MAF>1%), and call rate <99% for rare variants (MAF<1%). Variants were also excluded if they deviated >10% from their expected allele frequency based on reference data from the 1000 Genomes Project.
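The exclusion metrics listed above can be expressed as a single pass/fail check per variant. The following is a minimal sketch and not the EasyQC implementation; the `VariantStats` fields and the `passes_qc` function are hypothetical names. Two points are assumptions because the text does not specify them: a variant with MAF of exactly 1% is treated as rare, and the 10% allele frequency deviation is read as an absolute difference of 0.10.

```python
from dataclasses import dataclass


@dataclass
class VariantStats:
    hwe_p: float                # ancestry-specific Hardy-Weinberg P value
    posterior_call_prob: float  # posterior genotype call probability
    info: float                 # imputation quality (INFO)
    maf: float                  # minor allele frequency
    call_rate: float            # fraction of samples with a genotype call
    allele_freq: float          # observed allele frequency
    ref_allele_freq: float      # expected frequency from the reference panel


def passes_qc(v: VariantStats) -> bool:
    """Apply the post-imputation exclusion thresholds described in the text."""
    if v.hwe_p < 1e-20:
        return False
    if v.posterior_call_prob < 0.9:
        return False
    if v.info < 0.3:
        return False
    if v.maf < 0.0003:
        return False
    # Common variants (MAF > 1%) need >= 97.5% call rate, rare ones >= 99%.
    # Assumption: MAF exactly equal to 1% is handled as rare.
    min_call_rate = 0.975 if v.maf > 0.01 else 0.99
    if v.call_rate < min_call_rate:
        return False
    # Assumption: "deviated >10%" means an absolute difference above 0.10.
    if abs(v.allele_freq - v.ref_allele_freq) > 0.10:
        return False
    return True
```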
iii. EHR-Based Lipid Phenotypes
EHR clinical laboratory data were available for MVP participants from as early as 2003. The maximum LDL cholesterol/TG/TC and minimum HDL cholesterol were extracted for each participant for analysis. These extreme values were selected to approximate plasma lipid concentrations in the absence of lipid lowering therapy. For each phenotype (LDL cholesterol, natural log transformed TG, HDL cholesterol, and TC), residuals were obtained after regressing on age, age², sex, and 10 principal components of ancestry. Residuals were subsequently inverse normal transformed for association analysis. Statin therapy prescription at enrollment was defined as the presence of a statin prescription in the EHR within 90 days before or after enrollment in MVP. Statin therapy prescription at the maximum lipid measurement was defined as the presence of a statin prescription in the EHR within 90 days prior to the maximum lipid laboratory measurement used in the GWAS analysis.
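Two of the steps above, selection of the extreme lipid value per participant and the inverse normal transformation, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the pipeline actually used: the function names are hypothetical, the covariate regression that produces the residuals is omitted, and the rank offset of 0.5 is an assumption because the text does not state which offset was applied.

```python
from statistics import NormalDist


def extreme_lipid_values(measurements):
    """Pick the maximum value for LDL, TG, and TC and the minimum for HDL.

    `measurements` maps a trait name to a list of lab values for one person.
    """
    return {
        trait: (min(values) if trait == "HDL" else max(values))
        for trait, values in measurements.items()
        if values
    }


def inverse_normal_transform(values, offset=0.5):
    """Rank-based inverse normal transform; ties receive their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    normal = NormalDist()
    return [normal.inv_cdf((rank - offset) / n) for rank in ranks]


# Hypothetical lab values for one participant (mg/dL).
person = {"LDL": [100, 160, 130], "HDL": [40, 35, 50], "TG": [150, 210]}
print(extreme_lipid_values(person))
```

In a full analysis the transform would be applied to the regression residuals across all participants within an ancestry group, not to raw values.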
iv. MVP Association Analysis
Genotyped and imputed DNA sequence variants with a MAF>0.0003 were tested for association with the inverse normal transformed residuals of lipid values through linear regression assuming an additive genetic model. In a discovery analysis, association testing was performed separately among individuals of each of three genetic ancestries (whites, blacks, and Hispanics), and results were then meta-analyzed across ethnic groups using an inverse variance-weighted fixed effects method. For variants with suggestive associations (association P<10−4), replication of the findings was sought in one of two independent studies: the 2017 GLGC exome array meta-analysis or the 2013 GLGC “joint meta-analysis.” Replication was first performed using summary statistics from the 2017 GLGC exome array study. A total of 242,289 variants in up to 319,677 individuals were analyzed after quality control and were available for replication.
If a DNA sequence variant was not available for replication in the above exome array-focused study, replication was sought from publicly available summary statistics from the 2013 GLGC “joint meta-analysis.” An additional 2,044,165 variants in up to 188,587 individuals were available for replication in this study. In total, 2,286,454 DNA sequence variants in up to 319,677 individuals were available for independent replication. If a variant was available for replication in both studies, replication was prioritized using summary statistics from the 2017 GLGC exome array study given its larger sample size. Significant novel associations were defined as those that were at least nominally significant in replication (P<0.05) and had an overall P<5×10−8 (genome-wide significance) in the discovery and replication cohorts combined. Novel loci were defined as being greater than 1 Mb away from a known lipid genome-wide associated lead variant. Additionally, linkage disequilibrium information from the 1000 Genomes Project was used to determine independent variants where a locus extended beyond 1 Mb.
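The criteria for a significant novel association can be summarized as a simple decision rule. The sketch below is illustrative only; the function and constant names are hypothetical, and the linkage disequilibrium refinement for loci extending beyond 1 Mb is not modeled.

```python
GENOME_WIDE_P = 5e-8   # combined discovery + replication threshold
NOMINAL_P = 0.05       # replication threshold
WINDOW_BP = 1_000_000  # 1 Mb distance to any known lead variant


def is_significant_novel(chrom, pos, replication_p, combined_p, known_leads):
    """Return True if a lead variant meets the novelty criteria in the text.

    `known_leads` is an iterable of (chromosome, position) pairs for
    previously reported genome-wide significant lipid lead variants.
    """
    replicated = replication_p < NOMINAL_P and combined_p < GENOME_WIDE_P
    near_known = any(
        known_chrom == chrom and abs(known_pos - pos) <= WINDOW_BP
        for known_chrom, known_pos in known_leads
    )
    return replicated and not near_known


# Hypothetical coordinates for illustration.
known = [("11", 14_000_000)]
print(is_significant_novel("11", 16_500_000, 0.01, 1e-9, known))
```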
v. Conditional Analysis
Given that individual level data for the prior GLGC lipid analyses are not publicly available, we used the COJO-GCTA software to perform an approximate, stepwise conditional analysis to identify independent variants within lipid-associated loci. We used summary statistics after a meta-analysis of 1.9 million overlapping variants across the GLGC (predominantly European) and European MVP datasets (
vi. Variance Explained and Gain Using Multiple Lipid Measurements
The proportion of variance explained by the set of 444 previously mapped independent lipid variants, the 118 novel lipid loci identified in the study, and the 826 independent lipid variants identified from conditional analysis was estimated using ridge regression with the glmnet R package. The variance explained was determined after tuning the hyperparameter (lambda) to approximate an optimal value, and then calculating the model R² after performing linear regression with the inverse normal transformed lipid outcome and each set (444, 118, 826) of independent genome-wide variants as predictors.
To assess the impact of multiple lipid measurements, the variance explained for a GRS of 223 previously described GWAS lipid variants weighted by their previously reported effect sizes as a function of the number of lipid measurements was estimated (
vii. Lipids Transcriptome-wide Association Study
A TWAS was performed using summary statistics after a meta-analysis of 1.9 million overlapping variants among GLGC (predominantly European) and European MVP datasets (
viii. Tissue Expression Analysis and Competitive Gene Set Pathway Analysis
MAGMA was used as implemented in the FUMA pipeline to perform a competitive gene set analysis for 10,655 gene sets (curated gene sets: 4,738, GO terms: 5,917) present in the Molecular Signatures Database (MSigDB 6.1) and a gene-property analysis for gene expression in GTEx v7 with 53 tissue types. The input for these analyses was the 1000 Genomes imputed summary statistics from Stage 1 for LDL-C, TG, and HDL-C. The analyses were run first on the combined trans-ethnic summary statistics and then on the summary statistics from the European subgroup of participants alone. For the gene-set analyses, a P adjusted for the total number of gene sets tested was calculated, and gene sets with P bon <0.05 were output. MAGMA gene-set and gene-property analyses use the full distribution of SNP P values and differ from pathway enrichment tests that only test for enrichment of prioritized genes.
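The multiple-testing adjustment for the gene-set analysis amounts to simple arithmetic, sketched below. The function names are hypothetical, and this assumes a plain Bonferroni adjustment over all 10,655 gene sets, as the text describes.

```python
N_CURATED_SETS = 4_738
N_GO_TERMS = 5_917
N_GENE_SETS = N_CURATED_SETS + N_GO_TERMS  # 10,655 gene sets in total


def bonferroni_adjust(p, n_tests=N_GENE_SETS):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p * n_tests)


def is_enriched(p, alpha=0.05):
    """True if a gene set remains significant after adjustment."""
    return bonferroni_adjust(p) < alpha


# Equivalent unadjusted threshold: 0.05 divided by the number of gene sets.
print(N_GENE_SETS, 0.05 / N_GENE_SETS)
```

Under this adjustment a gene set needs an unadjusted P below roughly 4.7×10−6 to be reported.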
ix. Identification of Independent Low-Frequency Coding Variant Lipid Associations Specific to Blacks and Hispanics
The P value and linkage disequilibrium-driven clumping procedure in PLINK version 1.90b (--clump) was used to identify associations between low-frequency coding variants and lipids specific to blacks and Hispanics. Input included summary lipid association statistics from our MVP 1000 Genomes imputed genome-wide association study of black and Hispanic individuals, and reference linkage disequilibrium panels of 661 African (AFR) and 347 Admixed American (AMR) samples from 1000 Genomes phase 3 whole genome sequencing data. Variants were clumped with stringent r² (<0.01) and P (<5×10−8) thresholds in a 1 megabase region surrounding the lead variant at each locus to reveal independent index variants at genome-wide significance. From this list of independent variants, we report novel protein-altering variants specific to blacks and Hispanics at a MAF<0.05.
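The idea behind P value and linkage disequilibrium-driven clumping can be sketched as a greedy procedure. This is a simplified illustration and not PLINK's implementation: the `clump` function, the variant dictionaries, and the `r2` lookup callable are hypothetical, and the window is assumed to extend 1 Mb on either side of the index variant.

```python
def clump(variants, r2, p_threshold=5e-8, r2_threshold=0.01,
          window_bp=1_000_000):
    """Greedy clumping: keep the most significant variant, drop its LD partners.

    `variants` is a list of dicts with keys "id", "chrom", "pos", and "p".
    `r2` is a callable returning the LD r-squared between two variant ids.
    """
    candidates = sorted(
        (v for v in variants if v["p"] < p_threshold), key=lambda v: v["p"]
    )
    index_variants = []
    assigned = set()  # ids already used as an index or clumped to one
    for variant in candidates:
        if variant["id"] in assigned:
            continue
        index_variants.append(variant)
        assigned.add(variant["id"])
        for other in candidates:
            if other["id"] in assigned:
                continue
            in_window = (
                other["chrom"] == variant["chrom"]
                and abs(other["pos"] - variant["pos"]) <= window_bp
            )
            if in_window and r2(variant["id"], other["id"]) >= r2_threshold:
                assigned.add(other["id"])
    return index_variants


# Hypothetical variants and LD values for illustration.
ld = {frozenset(("a", "b")): 0.5}
example = [
    {"id": "a", "chrom": "1", "pos": 100, "p": 1e-12},
    {"id": "b", "chrom": "1", "pos": 200, "p": 1e-9},
    {"id": "c", "chrom": "1", "pos": 5_000_000, "p": 1e-10},
    {"id": "d", "chrom": "1", "pos": 300, "p": 1e-3},
]
kept = clump(example, lambda x, y: ld.get(frozenset((x, y)), 0.0))
print([v["id"] for v in kept])
```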
x. Loss of Gene Function Analysis
The Variant Effect Predictor software was used to identify pLoF DNA sequence variants defined as: premature stop (nonsense), canonical splice-sites (splice-donor or splice-acceptor) or insertion/deletion variants that shifted frame (frameshift). These variants were then merged with data from the Exome Aggregation Consortium (Version 0.3.1), a publicly available catalogue of exome sequence data, to confirm consistency in variant annotation. pLoF DNA sequence variants were required to be observed in at least 50 individuals, and a statistical significance threshold of P<5×10−8 (genome-wide significance) was set.
xi. Loss of PDE3B Gene Function and Coronary Artery Disease
A novel lipid association was identified for a pLoF mutation in the PDE3B gene (rs150090666, p.Arg783Ter). For carriers of damaging mutations in Phosphodiesterase 3B, the mutations' effects on risk for CAD were examined using logistic regression in five separate cohorts: MVP, UK Biobank, and 3 cohorts with exome sequencing: the Myocardial Infarction Genetics Consortium (MIGen), the Penn Medicine Biobank (PMBB), and DiscovEHR. In studies with exome sequencing, pLoF variants were combined with missense variants predicted to be damaging or possibly damaging by each of 5 computer prediction algorithms (LRT score, MutationTaster, PolyPhen-2 HumDiv, PolyPhen-2 HumVar, and SIFT) as performed previously. Because any individual damaging mutation was rare, variants were aggregated together for subsequent phenotypic analysis. Logistic regression on disease status was performed, adjusting for age, sex, and principal components of ancestry as appropriate. Effects of PDE3B damaging mutations were pooled across studies using an inverse-variance weighted fixed effects meta-analysis. A P<0.05 threshold for statistical significance was set.
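The pooling step can be illustrated with a short sketch of an inverse-variance weighted fixed effects meta-analysis on the log odds ratio scale. The function name and the per-cohort inputs are hypothetical and are not the study's estimates; a normal approximation is assumed for the confidence interval and P value.

```python
import math


def ivw_fixed_effects(log_odds_ratios, standard_errors):
    """Pool per-cohort log odds ratios with inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_odds_ratios)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * pooled_se),
        "ci_high": math.exp(pooled + 1.96 * pooled_se),
        "p": p_value,
    }


# Hypothetical cohort-level estimates for illustration only.
result = ivw_fixed_effects([math.log(0.80), math.log(0.70)], [0.10, 0.20])
print(result)
```

Cohorts with smaller standard errors receive more weight, so the pooled odds ratio lies closer to the more precisely estimated cohort.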
xii. Novel Lipid Loci and Association with Coronary Disease
To assess whether novel lipid loci in our study modulate the risk of CAD, association results were extracted for the lead variant at each locus from either the CARDIoGRAMplusC4D 1000 Genomes imputed CAD GWAS (115/118 variants) or from the MIGen and CARDIoGRAM Exome Chip GWAS analysis for 3 variants not available in the former. A two-tailed exact binomial test for goodness of fit was performed examining the expected and observed distributions of 1) LDL-C and 2) TG raising alleles (P<10−4), and 3) HDL-C raising alleles (P<10−4) not also associated with LDL-C or TG (P>0.05) and their effect on CAD risk. The null hypothesis that the lipid-associated variants were equally likely to increase or decrease CAD risk was tested, and a P<0.05 threshold for statistical significance was set.
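The two-tailed exact binomial test can be sketched with the standard library alone. In this sketch the two-sided P value sums the probabilities of all outcomes that are no more likely than the observed one, which is one common convention and is an assumption here; the function name and the allele counts are hypothetical.

```python
from math import comb


def binomial_two_sided_p(successes, trials, p_null=0.5):
    """Exact two-sided binomial test against a null success probability."""
    pmf = [
        comb(trials, k) * p_null ** k * (1 - p_null) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = pmf[successes]
    # Small relative tolerance guards against floating-point ties.
    tail = sum(q for q in pmf if q <= observed * (1 + 1e-7))
    return min(1.0, tail)


# Hypothetical counts: of 30 LDL-C raising alleles, 24 also raise CAD risk.
print(binomial_two_sided_p(24, 30))
```

Under the null hypothesis that a lipid-raising allele is equally likely to raise or lower CAD risk, a small P value indicates a directional excess.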
xiii. Lipids and Abdominal Aortic Aneurysm Mendelian Randomization Analysis
Summary-level data for 223 genome-wide lipids-associated variants were obtained from the publicly available data from the Global Lipids Genetics Consortium. Results were utilized from a GWAS of 5,002 AAA cases and 139,968 controls performed in white MVP participants using the definition proposed by Denny et al. The effect alleles were matched across all lipid and AAA summary data, and 3 different Mendelian randomization analyses were performed: 1) inverse-variance-weighted; 2) multivariable; and 3) MR-Egger to account for pleiotropic bias. First, inverse-variance-weighted Mendelian randomization was performed using each set of variants for each lipid trait as instrumental variables. This method, however, does not account for possible pleiotropic bias. Therefore, inverse-variance-weighted multivariable Mendelian randomization was next performed. This method adjusts for possible pleiotropic effects across the included lipid traits in our analyses by using the variant-AAA effect estimates as the outcome and the variant-LDL-C, variant-HDL-C, and variant-TG effect estimates as predictors in 1 multivariable model. MR-Egger was additionally performed. This technique can be used to detect bias secondary to unbalanced pleiotropy in Mendelian randomization studies. In contrast to inverse-variance-weighted analysis, the regression line is not constrained to pass through the origin, and the intercept represents the average pleiotropic effect across all variants. A Bonferroni-corrected 2-sided significance threshold (P<0.0167; 0.05/3) for 3 tests was used to declare statistical significance.
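The contrast between the inverse-variance-weighted and MR-Egger estimators described above can be illustrated with a minimal sketch. It assumes summary statistics per variant (variant-lipid effect `bx`, variant-AAA effect `by`, and the standard error of `by`), uses first-order weights of 1/se², and orients each variant so that its lipid effect is positive before the Egger regression. The function names and the numbers are hypothetical; the multivariable analysis is not shown.

```python
def ivw_estimate(bx, by, se_y):
    """Inverse-variance-weighted MR: weighted regression of variant-outcome
    effects on variant-exposure effects, constrained through the origin."""
    w = [1.0 / s ** 2 for s in se_y]
    num = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    den = sum(wi * x * x for wi, x in zip(w, bx))
    return num / den


def mr_egger(bx, by, se_y):
    """MR-Egger: the same weighted regression with a free intercept.
    Returns (intercept, slope); the intercept estimates the average
    pleiotropic effect and the slope is the causal estimate."""
    # Orient every variant so the exposure effect is non-negative.
    xs = [abs(x) for x in bx]
    ys = [y if x >= 0 else -y for x, y in zip(bx, by)]
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, xs)) / sw
    y_bar = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - x_bar) * (y - y_bar) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical summary statistics (illustration only): a constant 0.02 offset
# mimics unbalanced (directional) pleiotropy on top of a causal slope of 0.5.
lipid_effects = [0.1, 0.2, 0.3, 0.4]
aaa_effects = [0.02 + 0.5 * x for x in lipid_effects]
aaa_ses = [0.05, 0.04, 0.06, 0.05]

bonferroni_alpha = 0.05 / 3  # threshold for 3 tests, as in the text
print("IVW estimate:", ivw_estimate(lipid_effects, aaa_effects, aaa_ses))
print("MR-Egger (intercept, slope):", mr_egger(lipid_effects, aaa_effects, aaa_ses))
print("Bonferroni alpha:", bonferroni_alpha)
```

In this toy input, the constrained estimator absorbs the offset into its slope, whereas the unconstrained regression separates the offset (intercept) from the slope, which is the motivation for reporting both.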
Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the method and compositions described herein. Such equivalents are intended to be encompassed by the following claims.
This invention was made with government support under #G002, I01-01BX03340, I01-BX002641, and I01-CX001025 awarded by the Veterans Administration (VA) Cooperative Studies Program (CSP), and under T32 HL007734 and R01HL127564 awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US19/40472 | 7/3/2019 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62694949 | Jul 2018 | US |