The field of this invention is identification and isolation of genes; more particularly, it is computational identification of consensus nucleotide sequences common to mRNAs that contain adenylate uridylate-rich elements (AREs), and use of these consensus sequences: i) to search gene databases to identify genes containing consensus ARE sequences, and ii) to design primers, and selectively amplify and clone isolated cellular mRNAs that contain ARE sequence elements. Genes encoding ARE-containing mRNAs or unique fragments thereof are used as probes on microarrays for analysis of gene expression.
Adenylate uridylate-rich elements (AREs) are cis-acting sequences, usually found in the 3′ untranslated region (3′UTR) of many labile mRNAs. Such ARE-containing mRNAs have relatively short half lives and are rapidly degraded after they have been transcribed. Studies have shown that certain AREs act as instability determinants (Chen and Shyu, 1995, Trends Biochem Sci, 20:465-70.). For example, the half lives of specific long-lived mRNAs were significantly decreased by inclusion of ARE sequences in the 3′UTR of such mRNAs (Shaw and Kamen, 1986, Cell, 46:659-67.). Early studies suggested the minimal necessary sequence for a functional ARE was UUAUUUAUU (Chen and Shyu, 1995, Trends Biochem Sci, 20:465-70; Lagnado, et al., 1994, Mol Cell Biol, 14:7984-95; Lewis, et al., 1998, J Biol Chem, 273:13781-6; Zubiaga, et al., 1995, Mol Cell Biol, 15:2219-30.). Studies have described the binding of specific proteins to the ARE elements in mRNA and it may be that these proteins mediate the short half life of such mRNAs (Bakheet, et al., 2001, Nucleic Acids Res, 29:246-54.).
Known ARE-containing mRNAs are encoded by many early response genes that function to regulate cell proliferation and respond to exogenous agents, such as inflammatory stimuli, radiation, and viruses. Among these gene products are proteins that participate in growth control, such as the proto-oncogene c-fos and the hematopoietic growth factor granulocyte macrophage colony stimulating factor (GM-CSF); cytokines that respond to inflammatory stimuli, such as TNF-α and IL-8; interferons, such as IFN-α and IFN-β, that are responsible for early defenses against viruses; and cellular receptors, such as tissue factor, an initiator of blood coagulation.
ARE-mediated changes in mRNA stability are important in processes that require transient responses such as cellular growth, immune response, cardiovascular toning, and external stress-mediated pathways. Abnormal expression of genes encoding ARE-containing mRNAs, by stabilization of the mRNAs for example, may cause increased concentrations of proteins encoded by such mRNAs and lead to disease. For example, removal of the ARE element of the proto-oncogene c-fos correlates with increased oncogenicity (Raymond, et al., 1989, Oncogene Res, 5:1-12). The ARE-containing Bcl-2 mRNA encodes an anti-apoptotic protein whose increased concentrations can lead to neoplastic transformation of follicular B-cells (Capaccioli, et al., 1996, Oncogene, 13:105-15; Schiavone, et al., 2000, Faseb J, 14:174-84.). Another example of disease possibly caused by misregulated ARE-containing mRNAs is the chronic inflammatory arthritis and Crohn's-like inflammatory bowel disease that were detected in mice in which the ARE-containing region was deleted from the TNF gene (Kontoyiannis, et al., 1999, Immunity, 10:387-98.). Chromosomal alterations led to deletion of the ARE-3′UTR in the CCND1 gene (cyclin D1, PRAD1, parathyroid adenomatosis 1) that resulted in overexpression of CCND1 mRNA in mantle cell lymphoma, a deregulation event that is thought to perturb the G1-S transition of the cell cycle and thereby contribute to tumor development (Rimokh, et al., 1994, Blood, 83:3689-96.). The tumorigenicity of small neuroblastic cells, when compared to substrate adherent cells, correlates with overexpression of the ARE-mRNA MYCN and also with a large amount of a p40 ELAV-protein that targets AREs and stabilizes ARE-mRNAs (Chagnovich and Cohn, 1997, Eur J Cancer, 33:2064-7.).
Tumor necrosis factor (TNF-α) is a typical ARE-mRNA and, although it is both pro-inflammatory and has anti-tumor activity to specific solid cancers, there is experimental evidence that it can act as a growth factor in certain leukemias and lymphomas (Liu, et al., 2000, J Biol Chem, 275:21086-93.).
Misregulation in ARE-mRNA pathways can result in other transiently regulated biological processes being affected. The 70-year phenomenon of the Warburg effect, which is the oxygen-dependent enhanced glycolysis in cancer cells, has been linked to the increased constitutive expression in cancer cells of a novel ARE-mRNA isoform for 6-phosphofructo-2-kinase, which was required for tumor growth in vitro and in vivo (Chesney, et al., 1999, Proc Natl Acad Sci USA, 96:3047-52.). In the same context of enhanced glucose metabolism in cancer, the stability of glucose transporter Glut1 mRNA has been shown to be regulated by AREs and ARE binding proteins and to be correlated with certain tumors including gliomas (Hamilton, et al., 1999, Biochem Biophys Res Commun, 261:646-51.). The high invasiveness of the breast cancer cell line, MDA-MB231, has been shown to be mediated by increased constitutive levels of urokinase-type plasminogen activator (uPA) due to impairment in the ARE-mediated decay of uPA mRNA (Montero and Nagamine, 1999, Cancer Res, 59:5286-93.). The increased activity of uPA and its receptor has been associated with invasiveness in a number of tumors (Reuning, et al., 1998, Int J Oncol, 13:893-906.). Interestingly, both uPA and its receptor belong to the ARE-gene family (Bakheet, et al., 2001, Nucleic Acids Res, 29:246-54.), indicating that cell adhesiveness is a tightly regulated process in normal situations. The mRNA of the transcription factor CHOP, which is involved in cell division and apoptosis in response to stress, is regulated by an ARE (Ubeda, et al., 1999, Biochem Biophys Res Commun, 262:31-8.). Increased production of hematopoietic growth factors, e.g., GM-CSF, acting as autocrine growth factors, due to defects in ARE-mediated stability, may contribute to the pathogenesis of leukemia (Hoyle, et al., 1997, Cytokines Cell Mol Ther, 3:159-68; Paul, et al., 1997, Am J Hematol, 56:79-85.).
Growth-regulated alterations in the abundance of the ARE-mRNA regulating proteins, AUF1 and HuR, may have pleiotropic effects on the expression of many highly regulated ARE-mRNAs, and this may significantly impact the onset, maintenance, and progression of the neoplastic phenotype (Blaxall, et al., 2000, Mol Carcinog, 28:76-83.).
Despite their significance, however, probably less than 100 ARE-containing mRNAs have so far been identified. Other ARE-containing genes likely exist whose misregulation may contribute to human disease. Therefore, it would be desirable to identify additional genes that encode ARE-containing mRNAs.
The present invention relates to a gene discovery system and gene expression systems specific for genes encoding ARE-containing mRNAs. In one aspect, the present invention relates to computational methods of selecting coding sequences of ARE genes from databases using one or more ARE search sequences. The ARE search sequences are from 10 to 80 nucleotides in length and comprise a sequence which is encompassed by one of the following two sequences: (a) WU/T(AU/TU/TU/TA)U/TWWW, SEQ ID NO. 1, wherein none or one of the nucleotides outside of the parentheses is replaced by a different nucleotide, and wherein W represents A, U, or T; and (b) U/T(AU/TU/TU/T)n, SEQ ID NO. 2, wherein n indicates that the search sequence comprises from 3 to 12 of the tetrameric sequences contained within the parentheses. The method comprises extracting from the databases those nucleic acids whose protein coding sequences are upstream of and contiguous with a 3′ untranslated region (UTR) that comprises one of the ARE search sequences. Examples of such databases are mRNA databases, cDNA databases, and genomic databases, including the human genome project. The invention also relates to methods of making DNA libraries and microarrays that comprise a plurality of the nucleic acids that are selected by the computational methods. The invention also relates to the DNA libraries and microarrays that are made by such methods. In one embodiment, the microarray comprises probes that hybridize to the coding sequences of a plurality of the genes that are listed in Table 6.
The present invention also relates to a method of identifying primer sets targeted to the initiation region of genes whose 3′UTRs comprise ARE sequences. In one preferred embodiment, the method employs the ARE search sequences. The ARE genes are grouped into four classes or sixteen classes. The four class grouping is based upon the nucleotide base that is attached to the 3′ end of the start codon, ATG, of the ARE genes. The sixteen class grouping is based on the nucleotide bases that are attached to both the 5′ end and the 3′ end of the start codon of the ARE genes. Using the ARE genes that are found in the database, consensus sequences for each of the classes are determined. The consensus sequences are useful for preparing 5′ primer sets, e.g., degenerate primers, which can be used to selectively amplify full-length and partial length ARE genes.
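By way of illustration only, the four class and sixteen class groupings described above can be sketched as follows. The sketch assumes that each cDNA is available as a sense-strand string and that the zero-based position of its ATG start codon is already known; the function names and the record layout are hypothetical and are not part of the described method.

```python
from collections import defaultdict

def start_codon_class(cdna, atg_index, scheme=16):
    """Return the class key for a cDNA, given the 0-based index of its ATG.

    scheme=4  -> key is the base immediately 3' of the ATG.
    scheme=16 -> key is the base immediately 5' of the ATG followed by
                 the base immediately 3' of the ATG.
    """
    seq = cdna.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index + 3 >= len(seq):
        raise ValueError("no base 3' of the start codon")
    three_prime = seq[atg_index + 3]
    if scheme == 4:
        return three_prime
    if scheme == 16:
        if atg_index == 0:
            raise ValueError("no base 5' of the start codon")
        return seq[atg_index - 1] + three_prime
    raise ValueError("scheme must be 4 or 16")

def group_by_class(records, scheme=16):
    """Group (name, cdna, atg_index) records by start-codon class."""
    groups = defaultdict(list)
    for name, cdna, atg_index in records:
        groups[start_codon_class(cdna, atg_index, scheme)].append(name)
    return dict(groups)
```

A consensus sequence for each class could then be computed from the members of each group.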
The present invention also relates to methods of selectively amplifying RNA and cDNA molecules using primers derived from and complementary to the consensus 5′ sequence motifs and primers derived from and complementary to the ARE search sequence. Such amplified RNA and cDNA molecules comprise the full-length or partial length sequences of new ARE genes.
The present invention also relates to methods of selectively amplifying ARE genes which employ a 3′ primer which is from 15 to 50 nucleotides in length and comprises from 2 to 10 pentamers having the sequence TAAAT. The pentameric sequences in the primers are either overlapping or non-overlapping. The 3′ primers are used in the reverse transcription step of the methods, the polymerase chain reaction (PCR) amplification step of the methods, or in both the reverse transcription step and the PCR amplification step of the methods. The present invention also relates to methods of making libraries which comprise portions of the ARE genes that are selectively amplified by the present methods and to methods of making microarrays which comprise probes that hybridize under stringent conditions to portions of the protein coding sequences of the ARE genes that are selectively amplified by the present methods. The present invention also relates to the libraries and the microarrays that are made by such methods.
The present invention also relates to microarrays comprising probes which hybridize under stringent conditions to the coding sequences of the genes which comprise the sequences shown in
The present invention also relates to methods of using the ARE genes for generation of PCR products or oligonucleotides for use as immobilized probes in cDNA or oligonucleotide microarrays, respectively.
The present invention also relates to methods of using the microarrays of the present invention to obtain the ARE expression profile of a subject, particularly a subject with a disease such as cancer.
The present invention relates to computational and laboratory methods for identifying ARE genes.
Generally, the term “gene” refers to a contiguous stretch of nucleotide bases within the genome that is transcribed into an RNA, more specifically an mRNA. Such mRNA is subsequently translated into a protein. As used herein, the term can refer not only to the DNA within the genome (i.e., genomic sequences), but also to the mRNA transcribed from the DNA, and a DNA copy of the mRNA, also called “cDNA.” Such a gene has multiple sections, parts or regions, as described below (i.e., coding sequence, 3′UTR and 5′UTR). A “complete” gene comprises all of the sections. A “fragment” of a gene consists of less than all the sections. A fragment of a gene may comprise less than one entire section of a gene. A fragment of a gene that is used for the purpose of hybridization is referred to as a “probe.”
As used herein, the terms “protein coding sequence” or “coding sequence” refer to an area of a gene (e.g., genomic DNA, mRNA or cDNA) that contains the genetic information responsible for the linear positioning of amino acids into a protein. The genetic information in such a coding region normally comprises contiguous groups of three nucleotide bases, called codons, each specifying a single amino acid within the encoded protein. Such coding sequence is said to be “full length” if it encodes a protein that is of the length and sequence normally found within a cell. Such coding sequence is said to be “partial length” if it encodes a protein that is shorter than the length of the protein normally found within a cell. Such partial length coding sequences can arise, for example, when enzymes that are used to copy DNA or RNA do not faithfully copy the entire length of the DNA or RNA being used as a template.
As used herein, “3′UTR” refers to an area of a gene, cDNA or mRNA that is located 3′ or downstream of the protein coding region of said gene, cDNA or mRNA.
As used herein, “5′UTR” refers to an area of a gene, cDNA or mRNA that is located 5′ or upstream of the protein coding region of said gene, cDNA or mRNA.
As used herein, “ARE” means “adenylate uridylate-rich element.” Such AREs are found in the 3′UTR of a gene. As used herein, an “ARE gene” refers to a gene which contains an ARE within its 3′UTR.
In one aspect, the present invention provides ARE search sequences which can be used to select ARE genes from public databases. One group of ARE search sequences comprises the sequence WU/T(AU/TU/TU/TA)U/TWWW, SEQ ID NO. 1, wherein none or one of the nucleotides outside of the parentheses is replaced by a different nucleotide, and wherein W represents A, U, or T. Another group of search sequences comprises the sequence U/T(AU/TU/TU/T)n, SEQ ID NO. 2, wherein n indicates that the search sequence comprises from 3 to 12 of the tetrameric sequences within the parentheses. The ARE search sequences were derived through analysis of the sequences of 57 mRNAs that are known to contain ARE sequences in their 3′UTR. An mRNA was included among the 57 mRNAs if it met either of two criteria: i) the ARE sequence of the mRNA has been shown to control mRNA stability or half-life, or ii) the ARE-containing mRNA is known to be transiently induced. From the 3′UTRs of these 57 mRNAs, consensus ARE sequences were generated through use of the Multiple Expectation Maximization for Motif Elicitation (MEME) program (Bailey and Gribskov, 1998, J Comput Biol, 5:211-21.). The sequence TATTTAWW (W=A or T) was obtained. Using the 57 sequences, a consensus analysis was then performed around the TATTTAWW motif. In one embodiment, the parameters of the analysis specify a 75% certainty of a stated nucleotide being at each position. Using these parameters, the ARE search sequences were derived.
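For illustration, the two groups of ARE search sequences can be written as regular expressions. The following is a minimal sketch only: it encodes the exact form of SEQ ID NO. 1 (it does not implement the allowance for one replaced nucleotide outside the parentheses), it treats U and T interchangeably, and the names SEQ1_EXACT, SEQ2 and find_are_matches are hypothetical.

```python
import re

# SEQ ID NO. 1, exact form: W U/T (A U/T U/T U/T A) U/T W W W, with W = A, U or T.
SEQ1_EXACT = re.compile(r"[ATU][TU]A[TU]{3}A[TU][ATU]{3}", re.IGNORECASE)

# SEQ ID NO. 2: U/T followed by 3 to 12 tetramers of the form A U/T U/T U/T.
SEQ2 = re.compile(r"[TU](?:A[TU]{3}){3,12}", re.IGNORECASE)

def find_are_matches(utr):
    """Return (label, start, matched_text) for each non-overlapping match."""
    hits = []
    for label, pattern in (("SEQ1", SEQ1_EXACT), ("SEQ2", SEQ2)):
        for match in pattern.finditer(utr):
            hits.append((label, match.start(), match.group()))
    return hits
```

Because finditer reports non-overlapping matches, a 3′UTR with tandem AREs may contain more occurrences than this sketch reports; for selecting genes, a single match per 3′UTR suffices.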
Derivation of the mRNA Database to be Searched with the ARE Search Sequence
A total of 36,951 human mRNA/cDNA sequences were extracted from GenBank Release 113 (National Center for Biotechnology Information, NCBI). Those sequences that encode full-length open reading frames were retained and others discarded. The 3′UTR sequences were extracted from each mRNA/cDNA sequence. The sequences containing no 3′UTR were discarded. A list of 13,057 sequences remained.
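The filtering steps above can be sketched as follows, purely as an illustration. The sketch assumes that each record supplies an accession, a sense-strand sequence, and annotated coding-region coordinates (zero-based, end-exclusive); the test for a full-length open reading frame is a simplified stand-in (ATG start, terminal stop codon, length divisible by three), and all names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def has_full_length_orf(seq, cds_start, cds_end):
    """Simplified check: CDS starts with ATG, ends with a stop codon,
    and has a length that is a multiple of three."""
    cds = seq[cds_start:cds_end].upper()
    return (
        len(cds) >= 6
        and len(cds) % 3 == 0
        and cds.startswith("ATG")
        and cds[-3:] in STOP_CODONS
    )

def extract_three_prime_utrs(records):
    """Map accession -> 3'UTR for records with a full-length ORF and a
    non-empty 3'UTR. records: iterable of (accession, seq, cds_start, cds_end)."""
    utrs = {}
    for accession, seq, cds_start, cds_end in records:
        if not has_full_length_orf(seq, cds_start, cds_end):
            continue  # discard sequences without a full-length ORF
        utr = seq[cds_end:]
        if utr:  # discard sequences containing no 3'UTR
            utrs[accession] = utr
    return utrs
```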
Searching the mRNA Database with ARE Search Sequences
In one embodiment, the 13,057 sequences were searched for the WWWTATTTATWWW sequence using the FindPattern analysis routine (Genetics Computer Group/Oxford Molecular Company; Madison, Wis.) allowing 1 bp mismatch on each side, outside of the core TATTTAT sequence. Redundant sequences were eliminated. The sequences found comprised 897 independent mRNA/cDNA sequences (see listing shown in Table 6 at end of examples).
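The search described above was performed with the FindPattern routine. As a rough stand-in for illustration only, the following sketch scans a 3′UTR for the core TATTTAT and accepts a hit when each three-nucleotide flank contains at most one base that is not A or T. The function name and return format are hypothetical, and the behavior is not claimed to reproduce FindPattern.

```python
CORE = "TATTTAT"
FLANK = 3  # number of W positions on each side of the core

def find_are_13mers(utr, max_flank_mismatches=1):
    """Return (start, 13-mer) for each WWWTATTTATWWW-like hit.

    A mismatch is a flank base other than A or T; up to
    max_flank_mismatches are tolerated on each side independently.
    """
    seq = utr.upper().replace("U", "T")
    hits = []
    # i is the index of the first base of the core
    for i in range(FLANK, len(seq) - len(CORE) - FLANK + 1):
        if seq[i:i + len(CORE)] != CORE:
            continue
        left = seq[i - FLANK:i]
        right = seq[i + len(CORE):i + len(CORE) + FLANK]
        if all(
            sum(base not in "AT" for base in flank) <= max_flank_mismatches
            for flank in (left, right)
        ):
            hits.append((i - FLANK, seq[i - FLANK:i + len(CORE) + FLANK]))
    return hits
```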
In other embodiments of the invention, other variations of the ARE search sequence were used to search the mRNA database. Examples of the ARE search sequences which can be used include: WWWT(ATTTA)TWWW, SEQ ID NO. _, WWWT(ATTTA)TWW, SEQ ID NO. _, WWWT(ATTTA)TTWW, SEQ ID NO. _, WWWT(ATTTA)TWWW, SEQ ID NO. _, WW(ATTTATTTA)WW, SEQ ID NO. _, ATTT(ATTTA)TTTA, SEQ ID NO. _, and A(TTTA)n, where n can be from 3 to 12. These search sequences can be further varied by allowing between 0 and 2 of the nucleotides outside of the parentheses shown above not to match (i.e., mismatches).
Searching Genomic Databases with ARE Search Sequences
In another embodiment, ARE search sequences are used to search existing databases of genomic DNAs. A major difference between searching a genomic database as compared to searching a database comprised of 3′UTR sequences is that the ARE search sequence can be found in regions of genes other than the 3′UTR. Identification of a sequence matching the ARE search sequence within the coding region of a gene is not useful. Only ARE search sequences present in the context of the 3′UTR likely function as determinants of mRNA stability.
To determine the possibility that ARE search sequences are found in a context other than the 3′UTR of a gene, diagnostic computational tests are performed. In one test, for example, the full protein coding sequence plus 3′UTR (not just the 3′UTR) of the 13,057 mRNAs/cDNAs described above are searched for the WWWTATTTATWWW sequence. The results of this search are 897 matches, the same number as found previously, when only the 3′UTR regions of these genes are searched. This result indicates that the ARE search sequence is not found within the coding region of these genes.
In another diagnostic computational test, the ARE search sequence is searched in a database of genomic sequences from the human genome project. While the ARE search sequence is not found with significant frequency in protein coding or 5′UTR regions of genes, ARE search sequences are frequently found in introns of genes throughout the genome.
Therefore, additional computational methods are used to eliminate from consideration those genes in which the ARE search sequence is found in regions other than the 3′UTR. These additional computational methods can also be used independently as methods of finding ARE-containing genes in genomic databases. The GENSCAN computer prediction program (Burge and Karlin, 1997, J Mol Biol, 268:78-94.) is one program used for this purpose. GENSCAN is a program that predicts the presence of genes within DNA databases using probabilistic models to detect gene structures such as exons, introns, transcriptional promoters and polyadenylation signals. Using GENSCAN, it is possible to rapidly determine whether ARE search sequences are found in regions other than the 3′UTR of genes. This eliminates genes in which the ARE search sequence is found in other areas of genes (e.g., within introns).
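The elimination step can be illustrated with a short sketch that classifies the position of an ARE match relative to a predicted gene structure. The sketch assumes that a gene prediction program has already supplied exon intervals and coding-region boundaries as zero-based, end-exclusive coordinates on the forward strand; this input layout and the function name are hypothetical and do not represent the output format of GENSCAN.

```python
def classify_hit(hit_pos, exons, cds_start, cds_end):
    """Classify a genomic position relative to one predicted gene.

    exons: sorted list of (start, end) intervals, end-exclusive.
    cds_start / cds_end: genomic bounds of the coding region.
    """
    gene_start, gene_end = exons[0][0], exons[-1][1]
    if not (gene_start <= hit_pos < gene_end):
        return "intergenic"
    if not any(start <= hit_pos < end for start, end in exons):
        return "intron"
    if hit_pos < cds_start:
        return "5'UTR"
    if hit_pos >= cds_end:
        return "3'UTR"
    return "CDS"

def keep_three_prime_utr_hits(hit_positions, exons, cds_start, cds_end):
    """Retain only those hits that fall within the predicted 3'UTR."""
    return [
        pos for pos in hit_positions
        if classify_hit(pos, exons, cds_start, cds_end) == "3'UTR"
    ]
```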
As an alternative to the GENSCAN program, the FGENESH program (Solovyev and Salamov, 1997, Proc Int Conf Intell Syst Mol Biol, 5:294-302; Solovyev, et al., 1995, Proc Int Conf Intell Syst Mol Biol, 3:367-75) is also used. FGENESH is based on exon recognition functions that use linear discriminant functions for recognition of splice sites, 5′-coding regions, internal exons, and 3′-coding regions.
Once the GENSCAN or FGENESH software has been used to identify ARE-containing genes, 6-20 kilobase pairs of contiguous sequence upstream of the ARE sequence and 1-3 kilobase pairs of contiguous sequence downstream of the ARE sequence are obtained. The open reading frames of the genes are obtained by analysis of these contiguous regions.
Selective Amplification of ARE mRNAs by Reverse Transcription
In addition to computational identification of ARE genes that are present in databases, laboratory methods allow identification and cloning of ARE genes that are not present in computer databases.
As a first step toward laboratory-based identification of ARE genes, cDNA is synthesized from cellular RNA using reverse transcriptase. The RNA may be total cellular RNA or mRNA. Isolation of such RNA is well known to those skilled in the art. Such RNA can come from cells or tissues.
In one embodiment, oligo(dT) is used as the primer in the reverse transcription reaction. Oligo(dT) hybridizes to the poly(A) tails of mRNAs during first strand cDNA synthesis. Since all mRNAs normally have a poly(A) tail, first strand cDNA is made from all mRNAs present in the reaction (i.e., there is no specificity).
In another embodiment, first strand cDNA is synthesized only from those mRNAs that contain an ARE sequence in their 3′UTR. Such selectivity is achieved by replacing oligo(dT) with degenerate universal 3′ primers that specifically hybridize to ARE sequences in the 3′UTR of such mRNAs. Such degenerate universal 3′ primers are based on the ARE search sequences derived earlier and are complementary to sequences encompassed by one or more of the search sequences. The 3′ primers are from 15 to 50 nucleotides in length and comprise from 2 to 10 pentamers having the sequence TAAAT. These pentameric sequences may be overlapping, i.e., where the fifth nucleotide in the upstream pentamer is the first nucleotide in the downstream pentamer, or non-overlapping. In those cases where the primers contain non-overlapping pentamers, the pentamers either are not separated, i.e., they are adjacent, or, preferably, are separated by from one to five nucleotides.
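The stated length and pentamer criteria can be checked computationally. The following minimal sketch counts TAAAT pentamers, including overlapping ones, and reports the spacing between consecutive pentamers (-1 for pentamers sharing one nucleotide, 0 for adjacent pentamers, 1 to 5 for separated pentamers). The function names are hypothetical, and the sketch handles only non-degenerate primer sequences.

```python
import re

PENTAMER = "TAAAT"

def pentamer_positions(primer):
    """Start indices of every TAAAT, counting overlapping occurrences."""
    # A lookahead matches at each start position without consuming bases,
    # so pentamers that share a nucleotide are both found.
    return [m.start() for m in re.finditer("(?=" + PENTAMER + ")", primer.upper())]

def pentamer_gaps(primer):
    """Nucleotides between consecutive pentamers (-1 means overlapping)."""
    positions = pentamer_positions(primer)
    return [b - a - len(PENTAMER) for a, b in zip(positions, positions[1:])]

def meets_primer_criteria(primer):
    """True if the primer is 15 to 50 nt long and has 2 to 10 pentamers."""
    count = len(pentamer_positions(primer))
    return 15 <= len(primer) <= 50 and 2 <= count <= 10
```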
Examples of 3′ primers suitable for use in the reverse transcription reaction are AATAAATAAATVA (Down-ATP), SEQ ID NO. 3, TAAATWVATAAAT (Down-TAP), SEQ ID NO. 4, AATAAATAAATAA (S-MOTIFP), SEQ ID NO. 5, CTCGAGWHWWAAATAAATA (TA-XHOP), SEQ ID NO. 6, and CTCGAGTAAATWNATAAAT (AT-XHOP), SEQ ID NO. 7, where W=A or T, H=A or C or T, V=A or G or C, and N=A or G or C or T.
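To illustrate what a degenerate primer represents, the following sketch expands a primer written with the degeneracy codes defined above (W, H, V and N) into the individual sequences it encompasses. The function name is hypothetical; only the four codes used in this paragraph are handled.

```python
from itertools import product

# Degeneracy codes as defined in the text above.
DEGENERATE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT",
    "H": "ACT",
    "V": "ACG",
    "N": "ACGT",
}

def expand_degenerate(primer):
    """List every non-degenerate sequence encompassed by the primer."""
    choices = [DEGENERATE[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

For example, the primer Down-ATP (SEQ ID NO. 3) expands to three sequences, one for each nucleotide permitted at the V position.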
In further embodiments, additional variations of the 3′ primers may be used. Such 3′ primers include: AATAAATAATCA, SEQ ID NO. 8, AATAAATAATGA, SEQ ID NO. 9, AWTAAATAAATWA, SEQ ID NO. 10, and WWWTAAATAAAT, SEQ ID NO. 11, for example. Longer primers can be used, such as those with multiple overlapping or non-overlapping ARE pentamer elements (i.e., ATTTA). Examples of such longer primers are AATAAATAAATAAATAAAT, SEQ ID NO. 12, and GGCGGATCCGGGCTAAATAAATAAA, SEQ ID NO. 13.
Preferably, the reverse transcriptase enzyme used in the reaction is stable at temperatures above 60° C., for example, SuperScript II RT (GIBCO-BRL). However, MMLV reverse transcriptase can also be used.
In a preferred embodiment, the disaccharide trehalose is added to the reverse transcriptase reaction. Trehalose has been shown to stabilize several enzymes, including RT, at temperatures as high as 60° C. (Mizuno, et al., 1999, Nucleic Acids Res, 27:1345-9.). Trehalose addition allows the use of high temperatures in the reverse transcription reaction (e.g., as high as 60° C.). Preferably, trehalose is added to the reverse transcriptase reaction such that it is present in a final concentration of between 20 and 30%. Preferably, the reverse transcriptase reaction is then performed at a temperature between 35 and 75° C., more preferably at a temperature between 50 and 75° C., most preferably at a temperature of 60° C.
Amplification of ARE cDNAs by PCR
To clone the cDNAs representative of new ARE-containing genes, PCR amplification of the synthesized first strand cDNAs is designed to be specific for first strand cDNAs that contain ARE sequences. In one embodiment this employs two primer sets, the 3′ set and the 5′ set, which are designed to selectively amplify ARE genes.
The first set of primers, the 3′ set, are similar, and could be identical, to the 3′ primers used in the aforementioned specific reverse transcription of ARE-containing mRNAs. Preferably, however, the primers of the 3′ set are longer than those used for reverse transcription and have a high percentage of GC in their sequence. Examples of the 3′ set of primers used for PCR are GGCGGATCCGGGCTAAATAWATAAATWA (MOTIF-AA), SEQ ID NO. 14, and GGCGGATCCGGGCAATAAATAWATAAAT (MOTIF-T), SEQ ID NO. 15. Other variations in sequence of these 3′ primers could be made to facilitate PCR or cloning in subsequent steps, such as inclusion of restriction enzyme cleavage sites, for example.
The second set of primers, directed to the 5′ end of the genes represented by the first strand cDNAs, is determined by computational analysis of sequences in known databases. For example, the analysis can use the 897 mRNA/cDNA sequences that were identified as containing ARE sequences in their 3′UTRs (these 897 genes were discussed above in the section entitled “Searching the mRNA Database with ARE Search Sequences”). The region in the 5′UTR that flanked the ATG start codon for each of these 897 sequences was compared. Some sequence conservation surrounding the translation start codon is known to be present in all eukaryotic genes (Kozak, 1987, Nucleic Acids Res, 15:8125-48; Kozak, 1987, J Mol Biol, 196:947-50.).
By analysis of this 5′ region of the 897 sequences, a set of four degenerate primers or, alternatively, sixteen degenerate primers is designed, such that the set of primers hybridizes to 99% of the first strand cDNAs derived from the 897 mRNA/cDNA sequences (Table 4). Individual degenerate primers are selected from this list to be used in PCR. The 5′ primers are designed in such a way that each hybridizes to the 5′ end of a subset of the 897 ARE genes. Therefore, to amplify all possible ARE-containing mRNAs, different PCR reactions using different sets of primers are used.
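The coverage of a candidate 5′ primer set can be estimated computationally. The sketch below is illustrative only: it assumes exact matching between a degenerate primer (written in the sense orientation with standard IUPAC codes) and the sense-strand sequence flanking the start codon, which is a simplification of actual hybridization; the primer sequences used in any example are hypothetical and are not those of Table 4.

```python
import re

# Standard IUPAC nucleotide codes expressed as regex character classes.
IUPAC_CLASS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def primer_regex(primer):
    """Compile a degenerate primer into a regular expression."""
    return re.compile("".join(IUPAC_CLASS[base] for base in primer.upper()))

def coverage(primers, start_regions):
    """Fraction of start regions matched exactly by at least one primer."""
    if not start_regions:
        raise ValueError("no start regions supplied")
    patterns = [primer_regex(p) for p in primers]
    covered = sum(
        1 for region in start_regions
        if any(p.search(region.upper()) for p in patterns)
    )
    return covered / len(start_regions)
```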
Using the 3′ and 5′ primers, the PCR reaction preferably is performed using Taq polymerase and is preferably hot start PCR (i.e., adding Taq polymerase to the reaction during heating for 10 min. at 95° C.) or uses anti-Taq antibody (i.e., Taq polymerase is pre-incubated with anti-Taq antibody, which renders the polymerase inactive until reactivated by heating). Preferably, the annealing temperature of the first four PCR cycles is between 32 and 50° C. Thereafter, the annealing temperature is raised to between 60 and 65° C. for 22 to 35 cycles. A final extension step is performed at 72° C. for 3 minutes.
RNA-Ligase Based cDNA Synthesis Followed by Specific PCR Amplification of ARE Sequences
In another embodiment, synthesis of cDNA uses an RNA ligase based method, followed by amplification of such cDNAs using PCR (
In such embodiment, total cellular RNA is reverse transcribed into first strand cDNA, preferably by SuperScript II reverse transcriptase and oligo(dT) primers that are modified at the 5′ ends by NH2 (amino group prevents self ligation or inter-ligation of the oligo(dT) and the RL oligo primer). The first strand cDNA that results has the modified oligo(dT) primer incorporated and, therefore, its 5′ end blocked by NH2 (see
Amplification of this resulting cDNA is performed by PCR using a 3′ primer containing the consensus ARE sequence, and a 5′ primer homologous to the RL oligomer.
The present invention also relates to cDNA libraries that comprise the protein coding sequences of the ARE genes that are identified by the present methods. To produce such libraries, double-stranded DNA produced after PCR amplification of first strand cDNA is cloned into plasmid vectors. The cDNA may or may not be fractionated by size before cloning. Cloning of cDNA uses appropriate vectors, such as, for example, T/A vectors, or other cloning techniques known to those skilled in the art. Such cDNA cloning of PCR products can be accomplished through the use of commercial kits from, for example, Clontech (Palo Alto, Calif.), Invitrogen (Carlsbad, Calif.), Novagen (Madison, Wis.), Stratagene (La Jolla, Calif.), or other companies.
Library clones containing inserts are selected, further cloned, DNA extracted and purified. DNA samples are sequenced using primers specific to vector sequences flanking the inserts. Performance of these procedures is well known among those experienced in the art.
Such ARE cDNA libraries contain a plurality of DNA molecules that together represent a plurality of different ARE genes. Such individual DNA molecules normally contain a fragment of a given ARE gene. Such fragments can comprise a full length or partial length coding sequence. Such partial length coding sequences can comprise from about 10% to about 90% of the full length coding sequence. Preferably, such a partial length coding sequence comprises a unique sequence which is not contained within the protein coding sequences of genes that are not ARE-genes. The uniqueness of such sequence is determined through computational search of publicly available sequence databases. Sequences of some ARE genes isolated in this way are not found in public databases. Some such sequences are shown in
The present invention also relates to microarrays that comprise probes which are nucleotide molecules derived from the nucleotide sequences of ARE genes. As used herein, the term “microarray” refers to a solid support that comprises a plurality of ARE gene probes. Preferably, fewer than 20%, more preferably fewer than 10% of the probes on the array bind under stringent hybridization conditions to the protein coding sequences of non-ARE genes. Such microarrays can comprise substantially the entire protein coding sequence of the ARE gene.
The probes that comprise the microarrays are derived from ARE genes which are identified both by computational search methods and by laboratory generation of ARE cDNA libraries as described above. The sequences derived from the ARE genes are matched to genes present in the publicly available Unigene database (http://www.ncbi.nlm.nih.gov/UniGene/) by searching for the sequence in the BLAST database and determining the Unigene number. The Unigene database is a resource for gene discovery in which each Unigene sequence, or cluster, represents a unique gene. Unigene cluster identification numbers are used to identify clones that are then obtained from either a commercial set of 40,000 cDNA clones (human 40K set; Research Genetics; Huntsville, Ala.) or from the I.M.A.G.E. Consortium clone set (http://image.llnl.gov/).
The sources of immobilized nucleic acids (i.e., probes) placed on the microarrays may depend on the microarray and comprise several different types of probe. Such probes may comprise nucleic acids amplified from clones present in an ARE library, or obtained from Research Genetics or the I.M.A.G.E. Consortium. In such case, the insert DNAs (i.e., ARE cDNAs) from these clones are amplified by PCR using primers that hybridize to vector DNA sequences that flank the cloned insert. Alternatively, they are amplified using the 3′ primers and 5′ primer specific to the sequence of the cloned insert. In addition to PCR products amplified from ARE clones, probes may comprise fragments from ARE clones, such as fragments generated through restriction endonuclease cleavage of the ARE clones.
In addition, other types of molecules may be used as the gene probes in the microarrays. For example, oligonucleotides which contain at least 10 nucleotides, preferably from about 10 to about 100 nucleotides, more preferably from about 10 to about 30 nucleotides, can be used. Sequence information from ARE genes is used to design and synthesize such oligonucleotides, which are then placed onto the microarrays. Such oligonucleotides can be designed based on any region of an ARE-containing gene (i.e., 5′UTR, coding region, 3′UTR) as long as the sequence of each such oligonucleotide is unique (i.e., the sequence is not present in any other gene within the genome). Such oligonucleotides preferably have a GC ratio (i.e., the percentage of the nucleotide bases that are G or C) of at least 40%. Such oligonucleotides also preferably do not internally hybridize to themselves (i.e., they do not form “hairpin” structures). In addition to oligonucleotides, other gene probes which comprise nucleobases, including synthetic gene probes such as, for example, peptide nucleic acids (PNAs), can also be used.
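The length, GC ratio and hairpin preferences stated above can be screened computationally. The following is a minimal sketch under stated assumptions: the hairpin test is a crude heuristic (a stem of at least four nucleotides whose reverse complement occurs downstream after a loop of at least three nucleotides), not a thermodynamic prediction; the stem and loop thresholds and all function names are hypothetical; and the uniqueness requirement is not checked.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def gc_fraction(oligo):
    """Fraction of bases that are G or C."""
    seq = oligo.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]

def has_hairpin(oligo, stem=4, min_loop=3):
    """Heuristic: does any stem-length segment have its reverse
    complement downstream, separated by at least min_loop bases?"""
    seq = oligo.upper()
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        arm = seq[i:i + stem]
        if reverse_complement(arm) in seq[i + stem + min_loop:]:
            return True
    return False

def passes_probe_filters(oligo):
    """Length 10 to 100 nt, GC ratio of at least 40%, no heuristic hairpin."""
    return (
        10 <= len(oligo) <= 100
        and gc_fraction(oligo) >= 0.40
        and not has_hairpin(oligo)
    )
```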
In addition to containing sequences representative of ARE genes, microarrays will, for control purposes, also contain a smaller number of sequences representative of genes that do not contain an ARE element. Such non-ARE genes are preferably so-called “housekeeping” genes, such as for example, β-actin or GAPDH.
Microarrays are made in a variety of ways. Probes can be loaded into a robotic instrument which precisely places a predetermined amount of the probe onto the solid support. In one embodiment, probes are spotted onto glass slides that had been coated with poly-L-lysine using a SDDC-2 microarray robot (Engineering Services Inc.; Toronto, Canada), followed by UV-crosslinking and neutralization of remaining poly-L-lysine. In another embodiment, oligonucleotide probes are synthesized directly on the surface of the solid support. Making of microarrays has been described in several publications (Southern, et al., 1999, Nat Genet, 21:5-9; Duggan, et al., 1999, Nat Genet, 21:10-4; Cheung, et al., 1999, Nat Genet, 21:15-9; Lipshutz, et al., 1999, Nat Genet, 21:20-4.) and (U.S. Pat. Nos. 5,837,832, 6,110,426 and 6,153,743, for example). These publications and patents are incorporated herein by reference.
The ARE microarrays are then used in hybridization experiments. Hybridization of mRNA, more preferably cDNA made from mRNA, from a cell line or tissue, to a probe on the microarray is indicative of expression, at the level of transcription, of the ARE gene in the cell line or tissue that corresponds to the specific probe on the microarray. Through determination of the amount of hybridization of the cell line or tissue RNA to the totality of probes on the microarray, the expression pattern of all ARE genes comprising that cell line or tissue can be determined.
The mRNA or cDNA made from the mRNA (i.e., target nucleic acids) is normally fluorescently labeled. In one embodiment, total RNA that is to be tested for the presence and amount of ARE transcripts, is extracted from cells or tissues, labeled with Cyanine-5-dUTP (Cy5, red, Amersham; Piscataway, N.J.) in a reverse transcriptase reaction using oligo(dT)11-18 primers and SuperScript II RT. Similarly, control RNA is labeled with Cyanine-3-dUTP (Cy3, green). The labeled cDNA samples are hydrolyzed by NaOH, purified by column chromatography and concentrated in TE buffer. The labeled cDNAs are mixed and hybridized to the sequences on the glass slide.
Conditions for hybridization of the target to the probe are based on the melting temperature (Tm) of the nucleic acid binding complex or probe, as described (Wahl, et al., 1987, Methods Enzymol, 152:399-407). The term “stringent conditions,” as used herein, is the “stringency” which occurs within a range from about Tm−5° C. (5° C. below the melting temperature of the probe) to about 20° C. below Tm. As used herein, “highly stringent” conditions employ at least 0.2×SSC buffer and at least 65° C. As recognized in the art, stringency conditions are attained by varying a number of factors such as the length and nature of the probe, the length and nature of the target sequences (i.e., the labeled cDNA), and the concentration of the salts and other components, such as formamide, dextran sulfate, and polyethylene glycol, of the hybridization solution. All of these factors may be varied to generate conditions of stringency which are equivalent to the conditions listed above.
In one embodiment, in addition to the labeled cDNA, the hybridization solution contains poly dA40-60 (8 mg/ml), yeast tRNA (4 mg/ml), and CoT1 DNA (10 mg/ml), 3 μl of 20×SSC, and 1 μl 50×Denhardt's blocking solution. Conditions for hybridization of such targets to the probes on the microarray are known to those experienced in the art. Such conditions have been well published. One source for such information is a series of articles in the January 1999 issue (supplement) of Nature Genetics (1999, Nat Genet, supplement, 21:1-60) which are incorporated herein by reference.
After hybridization, the amount of hybridization of the target nucleic acids to individual probes on the microarray is determined, and from this the expression pattern of ARE genes in the cell line or tissue from which the mRNA originated is determined. In one embodiment, the glass slides are washed and read by a GenePix 4000A scanner (Axon Instruments; Foster City, Calif.) to yield gene expression data. The scanner program allows normalization of Cy3 (control sample) and Cy5 (experimental sample) ratios using the β-actin control probe on the array. The intensity ratios (Cy3 versus Cy5) represent the relative expression profile of the ARE-genes. Through comparison of such ratios for a specific gene between different samples (e.g., two different cell lines, the same cell line wherein one sample is treated with a drug compared to the other sample which is untreated, two different tissues, etc.) changes in expression of specific ARE genes are determined.
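The normalization step can be illustrated with a short calculation. This is a sketch under stated assumptions, not the scanner software's actual procedure: it assumes each probe has one Cy5 (experimental) and one Cy3 (control) intensity, expresses the ratio as Cy5 over Cy3, and rescales every ratio so that the β-actin control probe has a ratio of 1. The function name and the dictionary layout are hypothetical.

```python
def normalized_ratios(intensities: dict[str, tuple[float, float]],
                      control: str = "beta-actin") -> dict[str, float]:
    """intensities maps probe name -> (Cy5, Cy3) signal.
    Returns Cy5/Cy3 ratios divided by the control probe's Cy5/Cy3 ratio,
    so the control probe itself comes out as 1.0."""
    control_cy5, control_cy3 = intensities[control]
    control_ratio = control_cy5 / control_cy3
    return {
        probe: (cy5 / cy3) / control_ratio
        for probe, (cy5, cy3) in intensities.items()
    }
```

A normalized ratio above 1 for a given ARE gene would then suggest higher expression in the experimental sample than in the control sample, relative to β-actin.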
The following examples are meant to illustrate the preferred aspects of the invention and are not to be construed as limiting the aspects of the invention in any way.
An ARE search sequence was defined computationally using sequences from 57 previously identified ARE-containing mRNAs.
The selection of these mRNAs for the analysis was based on the ability of the mRNA to meet one of two criteria: i) an mRNA in which the ARE in the 3′UTR had been experimentally shown to affect the half life of that mRNA or, ii) an mRNA in which the ARE in the 3′UTR had not been experimentally shown to affect half life, but the mRNA was known to be transiently induced.
Based on these criteria, the 57 previously identified ARE-containing mRNAs that were used for this computation are: early lymphocyte activation antigen CD69 (Santis, et al., 1995, Eur J Immunol, 25:2142-6.), 6-phosphofructo-2-kinase (PFK-2)/fructose-2,6-bisphosphatase (Chesney, et al., 1999, Proc Natl Acad Sci USA, 96:3047-52.), B-cell leukemia/lymphoma2 oncogene (Bcl-2) (Capaccioli, et al., 1996, Oncogene, 13:105-15), c-fos proto-oncogene (Chen, et al., 1994, Mol Cell Biol, 14:416-26.), CHOP/Growth arrest and DNA-damage inducible factor (Ubeda, et al., 1999, Biochem Biophys Res Commun, 262:31-8.), c-myb proto-oncogene (Reeves and Magnuson, 1990, Prog Nucleic Acid Res Mol Biol, 38:241-82), c-myc proto-oncogene (Brewer, 1991, Mol Cell Biol, 11:2460-6.), cyclin D1 (Rimokh, et al., 1994, Blood, 83:3689-96.), cyclooxygenase (Lasa, et al., 2000, Mol Cell Biol, 20:4265-74.), endothelin-2 (Saida, et al., 2000, Genomics, 64:51-61.), epidermal growth factor receptor (McCulloch, et al., 1998, Int J Biochem Cell Biol, 30:1265-78.), estrogen receptor α (Kenealy, et al., 2000, Endocrinology, 141:2805-13.), fibroblast growth factor 2 (Touriol, et al., 1999, J Biol Chem, 274:21402-8.), granulocyte monocyte colony stimulating factor (Reeves and Magnuson, 1990, Prog Nucleic Acid Res Mol Biol, 38:241-82; Brown, et al., 1996, J Biol Chem, 271:20108-12.), glucose transporter 1 (Hamilton, et al., 1999, Biochem Biophys Res Commun, 261:646-51.), granulocyte monocyte colony stimulating factor (Shaw and Kamen, 1986, Cell, 46:659-67; Winzen, et al., 1999, Embo J, 18:4969-80.), gro-α (Sirenko, et al., 1997, Mol Cell Biol, 17:3898-906.), inducible nitric oxide synthase (Rodriguez-Pascual, et al., 2000, J Biol Chem, 275:26040-9.), interferon-α (Reeves and Magnuson, 1990, Prog Nucleic Acid Res Mol Biol, 38:241-82; Caput, et al., 1986, Proc Natl Acad Sci USA, 83:1670-4.), interferon-αAA (Caput, et al., 1986, Proc Natl Acad Sci USA, 83:1670-4.), interferon-al (Reeves and Magnuson, 1990, Prog Nucleic Acid Res Mol Biol, 38:241-82; Caput, et al., 1986, Proc Natl Acad Sci USA, 83:1670-4.), interferon-α1B (Caput, et al., 1986, Proc Natl Acad Sci USA, 83:1670-4.), interferon-αF (Reeves and Magnuson, 1990, Prog Nucleic Acid Res Mol Biol, 38:241-82; Caput, et al., 1986, Proc Natl Acad Sci USA, 83:1670-4.), interferon-αG (Reeves and Magnuson, 1990, Prog Nucleic Acid Res Mol Biol, 38:241-82; Caput, et al., 1986, Proc Natl Acad Sci USA, 83:1670-4.), interferon-αH (Reeves and Magnuson, 1990, Prog Nucleic Acid Res Mol Biol, 38:241-82; Caput, et al., 1986, Proc Natl Acad Sci USA, 83:1670-4.), interleukin-1α (Gorospe and Baglioni, 1994, J Biol Chem, 269:11845-51.), interferon-β (Peppel, et al., 1991, J Exp Med, 173:349-55; Grafi, et al., 1993, Mol Cell Biol, 13:3487-93.), interferon-γ (Gillis and Malter, 1991, J Biol Chem, 266:3172-7.), interleukin-1β (Kastelic, et al., 1996, Cytokine, 8:751-61.), interleukin-10 (Kishore, et al., 1999, J Immunol, 162:2457-61.), interleukin-2 (Lindstein, et al., 1989, Science, 244:339-43; Henics, et al., 1994, J Biol Chem, 269:5377-83.), interleukin-3 (Stoecklin, et al., 2000, Mol Cell Biol, 20:3753-63.), interleukin-4 (Reeves and Magnuson, 1990, Prog Nucleic Acid Res Mol Biol, 38:241-82), interleukin-6 (Winzen, et al., 1999, Embo J, 18:4969-80.), interleukin-8 (Winzen, et al., 1999, Embo J, 18:4969-80.), interleukin-11 (Yang and Yang, 1994, J Biol Chem, 269:32732-9.), lymphotoxin (Reeves and Magnuson, 1990, Prog Nucleic Acid Res Mol Biol, 38:241-82), K-ras proto-oncogene (Quincoces and Leon, 1995, Cell Growth Differ, 6:271-9.), leukemia inhibitory factor (Carlson, et al., 1996, Glia, 18:141-51.), macrophage colony stimulating factor (Chambers and Kacinski, 1994, J Soc Gynecol Investig, 1:310-6.), macrophage chemotaxis protein-1 (Bhattacharya, et al., 1999, Nucleic Acids Res, 27:1464-72.), macrophage inflammatory protein-α (Wang, et al., 1999, Inflamm Res, 48:533-8.), macrophage inhibitory protein-2α (Hartner, et al., 1997, Kidney Int, 51:1754-60.), Mda-7 (Madireddi, et al., 2000, Oncogene, 19:1362-8.), Monocyte Chemotactic Protein-3 (Kondo, et al., 2000, Immunology, 99:561-8.), MYCN (Chagnovich and Cohn, 1997, Eur J Cancer, 33:2064-7.), Nerve growth factor (Caput, et al., 1986, Proc Natl Acad Sci USA, 83:1670-4; Sherer, et al., 1998, Exp Cell Res, 241:186-93.), platelet-derived growth factor/c-sis proto-oncogene (Liang and Pardee, 1992, Science, 257:967-71.), Pim-1 proto-oncogene (Wingett, et al., 1991, J Immunol, 147:3653-9.), plasminogen activator inhibitor type 2 (Maurer, et al., 1999, Nucleic Acids Res, 27:1664-73.), thioredoxin reductase (Gasdaska, et al., 1999, J Biol Chem, 274:25379-85.), tissue factor (Ahern, et al., 1993, J Biol Chem, 268:2154-9.), tumor necrosis factor (Shaw and Kamen, 1986, Cell, 46:659-67; Zubiaga, et al., 1995, Mol Cell Biol, 15:2219-30.), urokinase-type plasminogen receptor (Montero and Nagamine, 1999, Cancer Res, 59:5286-93.), urokinase-type plasminogen activator (Montero and Nagamine, 1999, Cancer Res, 59:5286-93.) and vascular endothelial growth factor (Pages, et al., 2000, J Biol Chem, 275:26484-91.).
The 3′UTR regions of these mRNA sequences were extracted computationally using the Assemble program (Genetics Computer Group; Madison, Wis.) which extracted the sequences downstream of the coding sequence (i.e., >CDS). The 57 3′ UTRs were then analyzed by the MEME (multiple expectation maximization for motif elicitations) program which finds conserved ungapped short motifs within a group of related, unaligned sequences (Bailey and Gribskov, 1998, J Comput Biol, 5:211-21.). MEME yielded the motif pattern UAUUUAWW. Next, a consensus analysis around this motif was performed, which resulted in the pattern WWWUAUUUAUWWW (W=A or U) with a certainty level of 75% at each position (Table 1).
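The consensus step can be illustrated with a small calculation over aligned motif instances. This is a sketch under assumptions, not the procedure actually used: it assumes the instances are already aligned and of equal length, and it applies a simple calling rule in which a column is reported as a single base if that base reaches the certainty level, as W if A and U together reach it, and as N otherwise. The function name and the rule itself are illustrative.

```python
from collections import Counter


def consensus(aligned: list[str], certainty: float = 0.75) -> str:
    """Column-by-column consensus of equal-length RNA sequences.
    Single base if its frequency >= certainty; 'W' if A + U >= certainty;
    otherwise 'N'."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    calls = []
    for column in zip(*(seq.upper() for seq in aligned)):
        counts = Counter(column)
        base, top = counts.most_common(1)[0]
        if top / len(column) >= certainty:
            calls.append(base)
        elif (counts["A"] + counts["U"]) / len(column) >= certainty:
            calls.append("W")
        else:
            calls.append("N")
    return "".join(calls)
```

With four aligned instances, a column containing A, A, A, U is called as A (75%), a column containing A, U, A, U is called as W, and a column containing G, C, A, U is called as N.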
The goal was to search a human database to identify sequences containing the ARE search sequence, WWWUAUUUAUWWW, that was determined in Example 1. To do this, the sequences to be searched had to be obtained. This was done as described below.
A total of 36,951 human mRNA/cDNA sequences were extracted from GenBank Release 113 (National Center for Biotechnology Information, NCBI) using Lookup program (Genetics Computer Group) that was used to find mRNA or cDNA in the Definition Field along with Homo sapiens in the Organism Field (Source) in GenBank entries. Subsequently, a PERL code (Practical Extraction and Report Language) was written to extract the sequences that contained the field CDS in the Features Table (indicating the sequence included a protein coding region) in order to exclude those sequences which did not have CDS. This resulted in 27,403 CDS-containing mRNA/cDNA sequences. This file was used as the input to another PERL program that extracted sequences with complete CDS (i.e., without ambiguous CDS such as <, >, complement or join). The output was 15,148 full-length CDS-containing sequences in an mRNA/cDNA file. The 3′UTRs of the sequences in this file were constructed using the Assemble program (Genetics Computer Group), which extracted the sequences downstream of CDS (i.e., >CDS). This was done in order to obtain the 3′UTR region of the genes where the ARE sequences would be found. This 3′UTR extraction step was necessary because most of the GenBank records lack the 3′UTR as an annotated Feature key, despite the fact this information can be extracted computationally from CDS Feature as executed here. The UNIX command, Stream Editor (Sed), was used to remove sequences that had no 3′UTR. A resultant list of 13,057 human full-length CDS/3′UTR-containing mRNA sequences was finally compiled.
The 13-bp pattern determined in Example 1 (WWWUAUUUAUWWW) was searched in the 13,057 sequences determined in Example 2 using FindPattern (Genetics Computer Group). The stringency was decreased by allowing one mismatch in each direction of the nucleotides flanking the core pattern (UAUUUAU), in order to allow maximum recovery from the search. This step was performed on the 3′UTRs of the full-length CDS/3′UTR-containing mRNA list. The resulting subset of sequences was made minimally redundant using the CLEANUP program (Grillo, et al., 1996, Comput Appl Biosci, 12:1-8.) with the parameters of 90% similarity and 90% overlap, which produced an output file that contained the longest available sequences. Approximately 17% redundancy in the ARE-mRNA list was computationally removed. A total of 897 minimally redundant sequences (see listing at end of examples), approximately 8% of the human mRNA sequences analyzed, were finally obtained and subsequently termed the “ARE-mRNA database (ARED).” This database was stored as flat GenBank files and imported for further analysis into the commercial Vector NTI software version 5.5 (InforMax; Bethesda, Md.). Each sequence in the database contained the 3′UTR, full-length CDS (i.e., protein coding sequence), and at least 10 bp of 5′UTR.
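The relaxed search can be illustrated in a few lines. This is not FindPattern; it is a sketch that assumes the core UAUUUAU must match exactly and that each three-nucleotide flank may contain at most one base that is not A or U. The function name is hypothetical, and DNA input is accepted by converting T to U.

```python
CORE = "UAUUUAU"
FLANK = 3  # WWW on each side of the core


def find_are_sites(utr: str, max_flank_mismatch: int = 1) -> list[int]:
    """Return 0-based start positions of 13-nt windows matching
    WWW-UAUUUAU-WWW, allowing up to max_flank_mismatch non-W bases
    in each flank independently."""
    rna = utr.upper().replace("T", "U")
    hits = []
    start = rna.find(CORE)
    while start != -1:
        left = rna[max(start - FLANK, 0):start]
        right = rna[start + len(CORE):start + len(CORE) + FLANK]
        if len(left) == FLANK and len(right) == FLANK:
            left_mismatch = sum(base not in "AU" for base in left)
            right_mismatch = sum(base not in "AU" for base in right)
            if (left_mismatch <= max_flank_mismatch
                    and right_mismatch <= max_flank_mismatch):
                hits.append(start - FLANK)
        start = rna.find(CORE, start + 1)
    return hits
```

Setting `max_flank_mismatch=0` gives the strict 13-nt pattern, which recovers fewer sequences.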
In Example 3, the consensus ARE sequence determined in Example 1 was used to search a database of 3′UTR sequences, as determined in Example 2. As an independent check on the specificity of the consensus ARE sequence (i.e., that it is specific to the 3′UTR), the ARE sequence was searched in the complete ARED database, which contained both 3′UTR sequences as well as coding sequences, using Assemble and FindPattern. The data show that the 13-bp ARE pattern with 2 mismatches (one on each side of the core UAUUUAU pattern) was highly selective (89% specificity) towards the 3′UTR when compared to CDS (P<0.0001). The selectivity could also be increased to 96%, although this was at the expense of losing some ARE-containing sequences (Table 2).
1No. of mRNA sequences with the 13-bp ARE search sequence present either in the 3′UTR or in the CDS (protein coding sequence) retrieved by the search.
2Indicates the number of ARE patterns found in each subset.
3Mean of finds of the 13-bp ARE pattern per 3′UTR or CDS.
4% Coverage = % (no. of 3′UTR with ARE pattern/total 897 mRNA sequences).
5% Specificity (% sp) = 1 − (CDS containing the pattern/total 897 mRNA sequences).
6P values indicate statistical significance between the mean of 13-bp ARE pattern per ARE mRNA using unpaired t-test with Welch correction (used because of the significantly different variances as verified by F test, P < 0.0001).
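The coverage and specificity definitions in notes 4 and 5 amount to the following arithmetic. The sketch expresses both as percentages of the total number of mRNA sequences; the function names are hypothetical and no values from Table 2 are reproduced here.

```python
def coverage_percent(utr_with_pattern: int, total_mrnas: int = 897) -> float:
    """% coverage: 3'UTRs containing the ARE pattern / total mRNA sequences."""
    return 100.0 * utr_with_pattern / total_mrnas


def specificity_percent(cds_with_pattern: int, total_mrnas: int = 897) -> float:
    """% specificity: 1 - (CDSs containing the pattern / total mRNA
    sequences), expressed as a percentage."""
    return 100.0 * (total_mrnas - cds_with_pattern) / total_mrnas
```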
A distinguishable feature of the 13-bp ARE search sequence in typical ARE-mRNAs is that a significant number of ARE mRNAs (about 40% of total ARE-mRNAs) have continuous patterns of AUUUA (n>1) with the predominant pattern of WWWUAUUUAUUUAWW.
GENSCAN is a software program designed to predict complete gene structures based on a probabilistic model of the gene structure of human genomic sequences (Burge and Karlin, 1997, J Mol Biol, 268:78-94.). Such model incorporates descriptions of the basic transcriptional, translational and splicing signals, as well as length distributions and compositional features of exons, introns and intergenic regions.
There are two instances in which the GENSCAN program is used. In the first instance, GENSCAN is used to analyze the gene sequences obtained after searching a genomic database for genes containing an ARE search sequence using a program such as FindPattern. Such an analysis is used to eliminate those genes that contain the ARE consensus sequence in a region of the gene other than the 3′UTR (e.g., in an intron or intergenic regions). In the second instance, the GENSCAN program is used as an alternative to using the FindPattern analysis routine. FindPattern identifies a gene that contains a consensus ARE sequence, for example, wherever that sequence occurs within the gene. GENSCAN, however, can be used to identify only those genes in which the ARE consensus sequence occurs in the 3′UTR of the gene. GENSCAN predicts the coding segments of a genomic area. Thus, GENSCAN can be used to predict an ARE gene. First, the FindPattern program is used to locate the ARE gene upstream of the ARE region. This upstream genomic region is then subjected to GENSCAN or another computer gene prediction program to give an output of protein coding region and predicted amino acid sequence.
In addition to computational identification of genes containing ARE sequences, laboratory isolation of these, as well as previously unidentified ARE-containing genes, was also performed. The first step in laboratory isolation of ARE-containing genes was isolation of RNA from cells.
In this study, the monocytic leukemia cell line, THP-1 (American Type Culture Collection; Rockville, Md.), was used. This cell line was known to produce the ARE-containing mRNA for interleukin-8 (IL-8) as well as β-actin mRNA, both of which will be discussed later. The cells were grown in RPMI 1640 supplemented with 10% fetal bovine serum. This cell line was treated with lipopolysaccharide (LPS), an inducer of cytokines (Al-Humidan, et al., 1998, Cell Immunol, 188:12-8.), and cycloheximide (CHX), which blocks protein synthesis and increases expression of early response genes that do not require protein synthesis for transcription (Reeves and Magnuson, 1990, Prog Nucleic Acid Res Mol Biol, 38:241-82) and increases ARE-mRNA stability (Shaw and Kamen, 1986, Cell, 46:659-67.).
Total RNA was extracted from the cells using the guanidine isothiocyanate method using Tri Reagent (Molecular Research Center; Cincinnati, Ohio). The RNA was subject to DNase I treatment, followed by chloroform extraction, precipitation and resuspension in diethyl pyrocarbonate-treated (DEPC) water.
To isolate ARE genes, the isolated RNA described in Example 6 was reverse transcribed into DNA. Reverse transcription of the isolated RNA used a 13 nucleotide long degenerate primer of sequence WWWTAAATAAAT. Reverse transcription was performed in a 20 μl volume in a nuclease-free microcentrifuge tube. Total RNA (0.5 μg) was heated with different concentrations of primer to 70° C. for 10 min before quick chill on ice. Contents were collected by brief centrifugation and the following were added: 1× First Strand Buffer (250 mM Tris-HCl, pH 8.3, 375 mM KCl, 15 mM MgCl2), 500 μM dNTP mixture (GIBCO BRL; Gaithersburg, Md.), 10 μM DTT (GIBCO BRL), and 20 U RNAsin (Pharmacia; Uppsala, Sweden). Contents of the tube were mixed gently and incubated at appropriate temperatures. SuperScript II (RNase H-minus MMLV; GIBCO BRL) enzyme was then added and incubated for two hours. The reaction was inactivated by boiling.
At this point, a pool of first strand cDNA was obtained. Because the WWWTAAATAAAT primer should have hybridized specifically to mRNAs containing ARE elements, those mRNAs should have been preferentially reverse transcribed into first strand cDNA. mRNAs that did not contain ARE elements should have been less preferentially reverse transcribed.
To test whether mRNAs containing ARE elements had been preferentially reverse transcribed, the amounts of cDNAs in the first strand cDNA pool corresponding to two sample genes were determined. The first gene, interleukin-8 (IL-8), contains discontinuous multiple nonamers, VWAUUUAUU, in its 3′UTR. IL-8, therefore, is a gene that encodes an ARE-containing mRNA. The second gene, the housekeeping gene β-actin, contains a single non-typical ARE pentamer, UCAGG(AUUUA)AAAA in its 3′UTR. β-actin, therefore, encodes an mRNA that is considered not to contain an ARE element. This is the control.
The first strand cDNA pool was used as a template for PCR amplification of IL-8 and β-actin. Determination of the ratio of PCR products of IL-8 relative to β-actin is a measure of the relative abundance of the two first strand cDNAs in the pool of cDNAs made by reverse transcription.
For amplification of IL-8 cDNA, the primers were as follows: IL-8, sense, ATGACTTCCAAGCTGGCCGTGGCT; IL-8 antisense, TCTCAGCCCTCTTCAAAAACTTCTC. For amplification of β-actin cDNA, the primers were as follows: β-actin sense; ATGGATGATGATATCGCCGCG; β-actin, antisense; CTCCTTAATGTCACGCACGATTTC. PCR was performed using 40 μg of cDNA with the following reagents in their final concentrations of: 1 unit of Taq polymerase (Perkin-Elmer), 1×PCR buffer (Perkin-Elmer), 10 μM of each of dATP, dCTP, dGTP, and dTTP, 1 μM of both sense and antisense primers. Hot start, (i.e., adding Taq polymerase to the reaction tubes during heating tubes for 10 min. at 95° C.) was used or, alternatively, Taq polymerase was pre-incubated with antibody to Taq (Sigma; St. Louis, Mo.) which rendered the Taq polymerase inactive until reactivated by heating in the first denaturation cycle. The cycling conditions were as follows: Four initial cycles of 94° C. for 1 min, 35° C. (variable temperature) for 2 min, 72° C. for 2 min; Twenty five cycles of 94° C. for 45 sec, 60° C. for 1 min, 72° C. for 2 min; Final extension cycle of 72° C. for 7 min, 4° C. for overnight storage.
The results of this experiment are shown in
The disaccharide, trehalose, was used for further refinement for suppression of β-actin cDNA abundance while maintaining selection of ARE cDNAs (
The result of trehalose addition to the reverse transcription reactions was higher specificity of the reverse transcription reaction for the ARE-containing mRNAs as compared to reverse transcription of mRNAs that did not contain an ARE consensus sequence.
As shown in
In order to clone the sequences representative of ARE-containing first-strand cDNAs made in Example 7, the cDNAs were amplified. In one embodiment, this was done by PCR amplification. This PCR amplification used the 3′ primers representative of the consensus ARE sequence motif. An additional primer, derived from the 5′ region of the ARE-containing cDNA was also required. Such 5′ primers were derived from the region of the gene encompassing the translation start site of the gene, which includes the ATG start codon. Design of the 5′ primers is described in this example below.
The 5′UTR initiation context sequences (i.e., those that flank the start codon, ATG) of sequences in the ARE-mRNA database (the 897 genes described in Example 3) were analyzed. It is known that nucleotide sequences surrounding ATG start codons are conserved (Kozak, 1987, Nucleic Acids Res, 15:8125-48; Kozak, 1987, J Mol Biol, 196:947-50.). Thus, this region was chosen to design 5′ primers with the idea that ARE genes would have a slightly different conservation of sequences surrounding the ATG as compared to all genes.
Out of 897 ARE genes, 605 had at least 10 bp upstream (or 5′) of the ATG start codon in the database. These 605 sequences were used to examine the region around the ATG start codon. The 605 sequences were divided into either four or sixteen subsets by using the sequence designations ATGN and NATGN, respectively (N=A or C or G or T). This was followed by alignment of the truncated 5′UTR (−7 bp ATG, +2 bp) of the 605 sequences using the PileUP program (Genetics Computer Group). Four and sixteen consensus patterns at a certainty level of 75% at each position were derived from the alignment (Table 3). It is important to note that the consensus sequences in Table 3 are the most frequently occurring. Therefore, not every sequence in the ARED database is represented here.
The overall consensus initiation site in the ARE mRNA database was SSMAMSATGRM at a 50% certainty level at each position. In comparison, the initiation consensus of non-clustered random human sequences was SSSRMSATGRM. The conserved pattern, CACCATGG was also noted in Table 3 and appears in approximately 30% of total ARE mRNAs. It is similar to the Kozak sequence CRCCATG previously reported and to the pattern of the larger lists available at the TransTerm database, CAMCATGGC. (TransTerm is a database containing sequence information on the start and stop codons, as well as the codon usage data, for many different species. The URL is: http://uther.otago.ac.nz/Transterm.html)
Statistical analysis of the four and sixteen 10-mer (−6 ATG, +1) consensus sequences was performed (Table 4). Sequences in each of the sixteen subsets were analyzed for initiation context sequences. Each consensus pattern contains five conserved nucleotides (i.e., ATG with one flanking nucleotide in each direction), and six additional upstream degenerate nucleotides and one additional downstream nucleotide. The most common consensus in initiation regions is Cg consensus VVVVRSCATGGM (Table 4). Other frequent initiation consensus are Ca, Ag, and Gg. Each accounts for approximately 9-10% of all ARE mRNAs.
Not all consensus sequences were unique to the initiation regions. This means that the consensus sequences could be found in areas of the mRNA sequence that did not contain the translation initiator ATG (e.g., within the protein coding sequence). Depending on the specific consensus sequence, there were varying degrees of internal sites in addition to the initiation region. The most common consensus sequence around any ATG was the Aa consensus (Table 4) which existed in 39% of the entire ARE-mRNA molecules. The least occurring consensus sequences were those flanked by a T upstream of ATG, e.g., Ta, Tc, Tg, and Tt consensus. The highest proportion of consensus in initiation regions in any subset was the Gc consensus in which 71% of the sites (initiation plus internal) were initiation sequences. The overall consensus site per mRNA ranged from 1.0 to 1.65 (i.e., >1 if the consensus sequence is found in mRNAs other than at the translation initiation region).
1Number of mRNA sequences (percentage) of the total ARE-containing mRNA sequences in each of the 4 (ATGN) or 16 (NATGN) subsets.
2Number of sequences (percentage) of ARE-containing mRNA sequences in the overall ARE-containing mRNA database (i.e., includes consensus sequences found other than at the translation initiation site).
3Total number of sites (hits) in mRNA sequences in the ARED database (includes consensus sequences found other than at the translation start).
4Average number of hits per mRNA.
5% full length mRNA (i.e., the percentage of mRNAs recognized by the consensus probe that are full length mRNAs) is obtained by dividing column 2 (No. mRNA/subset) by column 4 (No. total sites). If the consensus sequence is infrequently found at sites other than the translation start, this percentage will be nearer to 100%.
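The two derived columns described in notes 4 and 5 can be written out as simple ratios. This sketch assumes that the average number of hits per mRNA is the total number of sites divided by the number of mRNA sequences containing the consensus; that reading of note 4 is an assumption, and the function names are hypothetical. No values from Table 4 are reproduced.

```python
def hits_per_mrna(total_sites: int, mrnas_with_consensus: int) -> float:
    """Average number of consensus sites per mRNA containing the consensus
    (assumed reading of note 4)."""
    return total_sites / mrnas_with_consensus


def percent_full_length(mrnas_in_subset: int, total_sites: int) -> float:
    """% full length mRNA: No. mRNA/subset divided by No. total sites,
    expressed as a percentage (note 5)."""
    return 100.0 * mrnas_in_subset / total_sites
```

A value of `percent_full_length` near 100 indicates that the consensus is rarely found outside the translation start, which is the property wanted in a 5′ primer.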
Once first strand cDNA was synthesized from cellular RNA, the first strand cDNA had to be made into double-stranded DNA and the double-stranded DNA had to be amplified. In this example, amplification of the double-stranded DNA was done using PCR, 5′ primers comprising those described in Example 8 and 3′ ARE-specific primers described earlier in this application.
A PCR-protocol called ARE-cDNA PCR was used to selectively amplify ARE-cDNA. The selective amplification of ARE cDNA was verified using specific PCR to known ARE mRNA molecules with various numbers of ARE repeats (IL-8, c-fos, and TNF-α), and monitoring the abundance of the non-ARE β-actin signal, as in Example 7. TNF-α mRNA contains continuous stretches UUAUUUAUU (AUUUA)5, while IL-8 contains discontinuous multiple nonamers in the ARE flanking region. The proto-oncogene, c-fos, has two continuous overlapping nonamers, i.e., UAAUUUAUUUAUU. As discussed earlier, β-actin, encodes an mRNA that is considered not to contain an ARE element. The goal of ARE-cDNA PCR was to amplify the typical ARE-cDNAs and concurrently suppress amplification of non-ARE sequences.
Using the optimized ARE-cDNA PCR (as described in Example 6 and as modified in the Brief Description of
In all of the experiments, DNA contamination was monitored by lack of larger PCR products, as primers for the specific PCR were designed to span more than one exon. The specific amplification of TNF-α and IL-8 cDNA, which was performed following ARE-cDNA PCR, was not due to carryover cDNA, which amounted to 4 ng, and was performed under high stringency conditions including the use of 50 μM of dNTP and 25 cycles.
As an alternative to selective reverse transcription or selective amplification of ARE-containing mRNAs into first strand cDNA, RNA-ligase mediated amplification can be used (
To perform this procedure, called RL-ARE-PCR, total RNA was reverse transcribed by SuperScript II as described in Example 7 except that the primer used was oligo(dT) that had been modified at its 3′-end by the addition of NH2. To this cDNA reaction, 2 units of RNase H were added and incubated at 37° C. for 20 min, then incubated at 90° C. for 2 min. The cDNA in the reaction was then ligated with 5′-phosphorylated and NH2 3′-end modified oligomers (RL oligo; Operon Technologies, Inc.; Alameda, Calif.). The 3′ end of oligo(dT) and the RL oligo primer were blocked with the amino (NH2) groups to prevent the self ligation or the inter-ligation of the oligo(dT) and RL oligomers. The 25 μl reaction contained the following: 2.5 μl of 10× ligase buffer, 16.7 μl (2 μg) of cDNA, 1.0 μl (10 U) of T4 RNA ligase, 1.0 μl (0.5 μg) of the 3′-end NH2 blocked and 5′-end phosphorylated primer. This reaction was incubated at 37° C. for 1.5 hrs, followed by incubation at 16° C. for 1.5 hrs, and then at 100° C. for 2 min.
This was followed by amplification of the RL-ligated cDNA with a 5′-primer specific to the RL sequence and 3′primer specific to ARE-regions. PCR was performed as described in Example 7. The primers used for this PCR were GACTCCACAACCACGACACA and PTGTGTCGTGGTTGTGGAGTCL, where P=phosphate and L=amino linker. This PCR experiment verified amplification of the ARE-cDNA, TNF-α, but not β-actin (
Cloning of the PCR products was needed to construct libraries of the ARE genes. A pilot construction of a pUC19 mini-library was performed using the amplified ARE-PCR products generated from the optimum conditions of RL-ARE-PCR (
Bacterial colonies resulting from the transformation were randomly picked and mini-plasmid preparations were performed for evaluation purposes. The average size of the amplified inserts was 600 bp and the insert size ranged from 350-800 bp. This size range was satisfactory for the purpose of generating cDNA spotted probes of the microarray. The inserts of said clones were sequenced to provide DNA sequence information of said inserts. The sequences of many of these clones were found in publicly available sequence databases. The sequences of other of these clones were not found in such databases, suggesting that such clones identify previously unknown genes. The sequences of a number of such clones are shown in
This study describes making a microarray containing DNA sequences representative of ARE genes. Such microarrays are for use in gene expression analysis.
To make such a microarray, Unigene cluster IDs were obtained for the 897 genes in the ARE database (ARED). For genes among the 897 that had no Unigene cluster ID, and for ARE genes contained in the ARE libraries (Example 11), the sequences of those genes were used as BLASTN queries to retrieve the corresponding genes and their Unigene cluster IDs. The Unigene cluster IDs were then used to extract the corresponding clones from the 40K clone set of Research Genetics, Inc., which contains the majority of ARE-cDNAs. In addition, individual IMAGE clones were purchased and custom sequence-verified. A list of 30 housekeeping genes (control genes) was also compiled for inclusion on the array for purposes of quality control and normalization.
The cDNA clones, as glycerol culture stocks, were grown in 96-well growth blocks. The probe cDNAs that were spotted onto glass slides were obtained by PCR amplification of the insert DNAs from the clones. Purified plasmid DNA served as the template for the PCR reactions. The plasmids were prepared using commercial plasmid mini-preparation kits. All PCR reactions were carried out in 96-well thin-wall PCR plates. The reaction mixtures contained 20 mM Tris-HCl (pH 8.4), 50 mM KCl, 1.5 mM MgCl2, 0.8 mM each of dATP, dGTP, dTTP, and dCTP, 0.1 μM forward oligonucleotide primer (5′GTTGTAAAACGACGGCCAGTG), 0.1 μM reverse oligonucleotide primer (5′CACACAGGAAACAGCTATG), and 5 units Taq DNA polymerase. The reactions had a total volume of 100 μl and contained 100-300 ng of purified plasmid as template DNA. PCRs were performed using the following thermal cycler program: 1 cycle of 94° C. for 2 min; 27 cycles of 94° C. for 30 sec, 55° C. for 30 sec, and 72° C. for 2.5 min; and 1 cycle of 72° C. for 5 min. The PCR products (5 μl of the reaction) were then analyzed by agarose gel electrophoresis and could be stored at −20° C. until further processing. The PCR products were further processed in 96-well format either by ethanol precipitation or using commercially available DNA purification plates. Purified or precipitated PCR products were resuspended in a salt solution (e.g. 3×SSC).
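The cycling program and primer concentrations above translate into a total programmed hold time and a per-reaction primer amount. The sketch below works both out from the stated values; the data layout and variable names are illustrative, and ramp times between temperatures are not included because the source does not give them.

```python
# Thermal cycler program from the text, as (repeats, [(temp_C, seconds), ...]).
program = [
    (1, [(94, 120)]),                        # initial denaturation
    (27, [(94, 30), (55, 30), (72, 150)]),   # denature, anneal, extend
    (1, [(72, 300)]),                        # final extension
]

# Sum of programmed hold times only (no ramping between temperatures).
hold_seconds = sum(
    repeats * sum(seconds for _, seconds in steps)
    for repeats, steps in program
)

# Primer amount per reaction: 0.1 uM (100 nM) in a 100 ul reaction.
# 1 nM x 1 ul = 1 fmol, so nM x ul / 1000 gives pmol.
primer_nM = 100
reaction_ul = 100
primer_pmol = primer_nM * reaction_ul / 1000

print(f"programmed hold time: {hold_seconds / 60:.1f} min")
print(f"each primer: {primer_pmol:.0f} pmol per reaction")
```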
These resuspended DNAs were the probe DNAs that were spotted onto glass slides to give the ARE-containing gene array. The slides were first coated with poly-L-lysine, as follows. A batch of plain Gold Seal microscope slides was incubated in cleaning solution (2.5 M NaOH in 60% ethanol) under agitation for two hours. The slides were then rinsed with distilled water five times, each rinse lasting 5 minutes, and incubated in poly-L-lysine solution (0.01% poly-L-lysine in 0.1× standard tissue culture PBS) for one hour under agitation. Slides were then rinsed in distilled water for one minute, and any free liquid was removed by centrifugation of the slides at low speed. The coated slides were stored dust free and could be used for array printing for several weeks.
The probe DNAs were arrayed onto the slides using a SDDC-2 microarray robot from ESI (Engineering Services Inc.; Toronto, Canada). The setup used eight print-pins, delivering eight individual probe DNAs simultaneously to each slide, with the pins washed twice in water between probe pick-up steps. The probe DNAs were contained in 384-well plates to minimize loss by evaporation during the printing procedure. The size of the array area on each slide depended on the number of probe DNAs in the array. The distance between the centers of neighboring DNA spots was 200 μm. All probe DNAs were spotted onto each array at least in duplicate. For example, an array of 1000 genes (hence 2000 array spots) printed from a 384-well plate using eight print-pins covers an area on the slide of approximately 170 mm2. After the printing, the array slides were stored dust free for 2-4 days before UV cross-linking.
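The worked example above (1000 genes, duplicate spots, eight pins, 200 μm pitch) can be broken into a few simple quantities. The sketch below is illustrative only. It assumes that duplicates are printed from the same pick-up, which the source does not state, and it computes only the area of the spot lattice itself. Any spacing between the eight pin blocks is left out, so the figure it gives is a lower bound rather than a reproduction of the approximately 170 mm2 quoted in the text.

```python
import math

genes = 1000
replicates = 2          # every probe spotted at least in duplicate
pins = 8                # eight probes delivered simultaneously
pitch_um = 200          # center-to-center spot spacing from the text

spots = genes * replicates
spots_per_pin = spots // pins

# Assumption: both replicates of a probe come from one pick-up,
# so the number of pick-up steps depends on genes, not spots.
pickups = math.ceil(genes / pins)

# Area of the spot lattice alone, in mm^2 (um^2 -> mm^2 is a factor of 1e6).
# Gaps between pin blocks are NOT included, so this is a lower bound.
min_area_mm2 = spots * pitch_um**2 / 1_000_000

print(spots, spots_per_pin, pickups, min_area_mm2)
```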
The arrayed probe DNA was cross-linked to the poly-L-lysine coat using a Stratalinker (Stratagene) with a UV dose of 450 mJ. The positive charges of the lysine residues on the array slides were neutralized by incubating the slides in a freshly prepared solution of 1.7% succinic anhydride in 1-methyl-2-pyrrolidinone/77 mM borate buffer for 30 minutes. The slides were then submerged for two minutes, first in distilled water at 95° C. and then in 95% ethanol. Excess ethanol was removed by centrifugation at low speed, and the cDNA microarray was stored dust free at room temperature, ready to be used for hybridization.
To use the ARE microarrays for gene expression experiments, total RNA samples (100 μg) were extracted from THP-1 cells that had previously been treated with CHX and LPS, using the Qiagen RNeasy RNA purification kit, and were further purified with Trizol reagent (GibcoBRL). The RNA samples were labeled with Cyanine-3-dUTP (Cy3, green) and Cyanine-5-dUTP (Cy5, red; Amersham) in two separate RT reactions using oligo(dT)11-18 primers and SuperScript II RT. The labeled cDNA samples were hydrolyzed with NaOH, purified on a Micro Bio-Spin® 6 chromatography column (Bio-Rad), and concentrated in TE buffer. The labeled cDNA sample mixture was hybridized to the microarray. The hybridization solution contained poly dA40-60 (8 mg/ml), yeast tRNA (4 mg/ml), CoT1 DNA (10 mg/ml), 3 μl of 20×SSC, and 1 μl of 50×Denhardt's blocking solution. This mixture was applied to the ARE-cDNA glass slides and hybridized under stringent conditions. Subsequently, the glass slides were washed.
Hybridization to the microarray was analyzed by scanning the microarray with a GenePix 4000A scanner (Axon Instruments). The scanner program allowed normalization of the Cy3 (THP-1 control sample) and Cy5 (LPS+CHX treated THP-1 sample) ratios using the β-actin control on the array. Most of the duplicates gave similar readings. The intensity ratios from the two cDNA samples, measured using the ARE-cDNA microarray, represented the relative expression profile of the ARE genes in the two starting RNA samples.
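The idea of normalizing two-color ratios to a control gene, as described above, can be sketched as follows. This is not the scanner software's algorithm, which the source does not detail. It is a minimal illustration that scales every Cy5/Cy3 ratio so that the β-actin ratio equals 1. All intensities and the names gene_A and gene_B are hypothetical, and reporting the result on a log2 scale is a convention added here.

```python
import math
from statistics import mean

# Hypothetical background-subtracted intensities for duplicate spots,
# given as (Cy3 control, Cy5 treated). Values are made up for illustration.
spots = {
    "beta-actin": [(1000.0, 2000.0), (1000.0, 2000.0)],
    "gene_A": [(500.0, 4000.0), (500.0, 4000.0)],
    "gene_B": [(800.0, 800.0), (800.0, 800.0)],
}

def mean_ratio(pairs):
    """Average Cy5/Cy3 ratio over the replicate spots of one gene."""
    return mean(cy5 / cy3 for cy3, cy5 in pairs)

# Ratio observed for the control gene; dividing by it sets the control to 1.
control_ratio = mean_ratio(spots["beta-actin"])

# Normalized log2 ratios: positive means higher signal in the treated sample.
normalized_log2 = {
    gene: math.log2(mean_ratio(pairs) / control_ratio)
    for gene, pairs in spots.items()
}

for gene, value in normalized_log2.items():
    print(f"{gene}: {value:+.2f}")
```

With these made-up numbers, a gene whose raw ratio matches that of β-actin would come out at zero, which is the sense in which the control anchors the comparison between the two samples.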
H. sapiens IFN-omega 1 gene.
H. sapiens mRNA for thromboplastin (clone 2b-Apr5).
H. sapiens BTA 1916 mRNA for Pai-2.
H. sapiens BTA 1922 mRNA for Pai-2.
H. sapiens beta-casein cDNA.
H. sapiens u-PA cDNA sequence.
Homo sapiens mRNA for semaphorin E, complete cds.
Homo sapiens mRNA for TRAF5, complete cds.
Homo sapiens mRNA for glia maturation factor, complete cds.
Homo sapiens mRNA for Efs1, complete cds.
Homo sapiens mRNA for Efs2, complete cds.
Homo sapiens mRNA for hSLK, complete cds.
Homo sapiens mRNA for 26S proteasome subunit p55, complete
Homo sapiens mRNA for Cdc7-related kinase, complete cds.
Homo sapiens mRNA for LAK-1, complete cds.
Homo sapiens mRNA for KIAA0285 gene, complete cds.
Homo sapiens mRNA for KIAA0288 gene, complete cds.
Homo sapiens EXLM1 mRNA, complete cds.
Homo sapiens mRNA for chemokine LEC precursor, complete cds.
Homo sapiens KIAA0400 mRNA, complete cds.
Homo sapiens KIAA0406 mRNA, complete cds.
Homo sapiens KIAA0410 mRNA, complete cds.
Homo sapiens KIAA0414 mRNA, partial cds.
Homo sapiens KIAA0419 mRNA, complete cds.
Homo sapiens KIAA0426 mRNA, complete cds.
Homo sapiens mRNA for KIAA0458 protein, complete cds.
Homo sapiens mRNA for KIAA0470 protein, complete cds.
Homo sapiens mRNA for KIAA0471 protein, complete cds.
Homo sapiens mRNA for KIAA0473 protein, complete cds.
Homo sapiens mRNA for KIAA0475 protein, complete cds.
Homo sapiens mRNA for KIAA0476 protein, complete cds.
Homo sapiens mRNA for KIAA0480 protein, complete cds.
Homo sapiens mRNA for KIAA0481 protein, complete cds.
Homo sapiens FCMD mRNA for fukutin, complete cds.
Homo sapiens mRNA for KIAA0531 protein, complete cds.
Homo sapiens mRNA for KIAA0535 protein, complete cds.
Homo sapiens mRNA for KIAA0537 protein, complete cds.
Homo sapiens mRNA for KIAA0550 protein, complete cds.
Homo sapiens mRNA for KIAA0562 protein, complete cds.
Homo sapiens mRNA for KIAA0565 protein, complete cds.
Homo sapiens mRNA for KIAA0569 protein, complete cds.
Homo sapiens mRNA for KIAA0571 protein, complete cds.
Homo sapiens mRNA for DRAK1, complete cds.
Homo sapiens mRNA for Musashi, complete cds.
Homo sapiens mRNA for KIAA0617 protein, complete cds.
Homo sapiens mRNA for KIAA0626 protein, complete cds.
Homo sapiens mRNA for KIAA0628 protein, complete cds.
Homo sapiens mRNA for KIAA0651 protein, complete cds.
Homo sapiens mRNA for KIAA0652 protein, complete cds.
Homo sapiens mRNA for KIAA0660 protein, complete cds.
Homo sapiens mRNA for KIAA0669 protein, complete cds.
Homo sapiens mRNA for KIAA0685 protein, complete cds.
Homo sapiens mRNA for KIAA0688 protein, complete cds.
Homo sapiens mRNA for KIAA0698 protein, complete cds.
Homo sapiens mRNA for KIAA0705 protein, complete cds.
Homo sapiens Elk1 mRNA, complete cds.
Homo sapiens mRNA for sterol-C5-desaturase, complete cds.
Homo sapiens HGC6.1.1 mRNA, complete cds.
Homo sapiens mRNA for oxidative-stress responsive 1, complete
Homo sapiens mRNA for chondroitin 6-sulfotransferase, complete
Homo sapiens mRNA for KIAA0711 protein, complete cds.
Homo sapiens mRNA for KIAA0716 protein, complete cds.
Homo sapiens mRNA for KIAA0736 protein, complete cds.
Homo sapiens mRNA for KIAA0744 protein, complete cds.
Homo sapiens mRNA for KIAA0764 protein, complete cds.
Homo sapiens mRNA for KIAA0798 protein, complete cds.
Homo sapiens mRNA for KIAA0808 protein, complete cds.
Homo sapiens mRNA for Gab2, complete cds.
Homo sapiens PKIG mRNA for protein kinase inhibitor gamma,
Homo sapiens mRNA for dermatan/chondroitin sulfate
Homo sapiens mRNA for KIAA0832 protein, complete cds.
Homo sapiens mRNA for KIAA0835 protein, complete cds.
Homo sapiens mRNA for KIAA0844 protein, complete cds.
Homo sapiens mRNA for KIAA0848 protein, complete cds.
Homo sapiens mRNA for KIAA0852 protein, complete cds.
Homo sapiens mRNA for KIAA0879 protein, complete cds.
Homo sapiens mRNA for KIAA0893 protein, complete cds.
Homo sapiens HFB30 mRNA, complete cds.
Homo sapiens FUT9 mRNA for alpha-1,3-fucosyltransferase IX,
Homo sapiens mRNA for KIAA0924 protein, complete cds.
Homo sapiens mRNA for KIAA0936 protein, complete cds.
Homo sapiens mRNA for KIAA0938 protein, complete cds.
Homo sapiens mRNA for KIAA0941 protein, complete cds.
Homo sapiens mRNA for KIAA0952 protein, complete cds.
Homo sapiens mRNA for KIAA0955 protein, complete cds.
Homo sapiens mRNA for KIAA0966 protein, complete cds.
Homo sapiens mRNA for KIAA0970 protein, complete cds.
Homo sapiens mRNA for KIAA0971 protein, complete cds.
Homo sapiens mRNA for KIAA0990 protein, complete cds.
Homo sapiens mRNA for KIAA0997 protein, complete cds.
Homo sapiens mRNA for KIAA1008 protein, complete cds.
Homo sapiens mRNA for MALT1, complete cds.
Homo sapiens mRNA for Kelch motif containing protein, complete
Homo sapiens mRNA for KIAA1041 protein, complete cds.
Homo sapiens mRNA for KIAA1042 protein, complete cds.
Homo sapiens mRNA for KIAA1044 protein, complete cds.
Homo sapiens mRNA for KIAA1073 protein, complete cds.
Homo sapiens mRNA for KIAA1101 protein, complete cds.
Homo sapiens mRNA for epsilon-adaptin, complete cds.
Homo sapiens germinal center kinase related protein kinase mRNA,
Homo sapiens cdc14 homolog mRNA, complete cds.
Homo sapiens dead box, X isoform (DBX) mRNA, alternative
Homo sapiens dead box, Y isoform (DBY) mRNA, alternative
Homo sapiens ubiquitous TPR motif, X isoform (UTX) mRNA,
Homo sapiens RNA editase (RED1) mRNA, complete cds.
Homo sapiens dihydrolipoamide dehydrogenase-binding protein
Homo sapiens lymphoid phosphatase LyP1 mRNA, complete cds.
Homo sapiens E1B 19K/Bcl-2-binding protein Nip3 mRNA, nuclear
Homo sapiens Jagged1 (JAG1) mRNA, complete cds.
Homo sapiens germ cell nuclear factor (GCNF) mRNA, complete
Homo sapiens hUNC18a alternatively-spliced mRNA, complete
Homo sapiens hUNC18b alternatively-spliced mRNA, complete
Homo sapiens jerky gene product homolog mRNA, complete cds.
Homo sapiens CDO mRNA, complete cds.
Homo sapiens retinoic acid hydroxylase mRNA, complete cds.
Homo sapiens dishevelled 1 (DVL1) mRNA, complete cds.
Homo sapiens CHD2 mRNA, complete cds.
Homo sapiens embryonic lung protein (HUEL) mRNA, complete
Homo sapiens MDM2-like p53-binding protein (MDMX) mRNA,
Homo sapiens EVI5 homolog mRNA, complete cds.
Homo sapiens TEB4 protein mRNA, complete cds.
Homo sapiens homeodomain protein (BAPX1) mRNA, complete
Homo sapiens eIF4GII mRNA, complete cds.
Homo sapiens zinc finger protein (ZNF198) mRNA, complete cds.
Homo sapiens death receptor 5 (DR5) mRNA, complete cds.
Homo sapiens hamartin (TSC1) mRNA, complete cds.
Homo sapiens MTG8-like protein MTGR1b mRNA, complete cds.
Homo sapiens Cdc7 (CDC7) mRNA, complete cds.
Homo sapiens chromosome 1 atrophin-1 related protein (DRPLA)
Homo sapiens TRAIL receptor 2 mRNA, complete cds.
Homo sapiens death receptor 5 (DR5) mRNA, complete cds.
Homo sapiens maltase-glucoamylase mRNA, complete cds.
Homo sapiens apoptosis inducing receptor TRAIL-R2 (TRAILR2)
Homo sapiens receptor activator of nuclear factor kappa B ligand
Homo sapiens heparan sulfate 3-O-sulfotransferase-1 precursor
Homo sapiens macrophage inhibitory cytokine-1 (MIC-1) mRNA,
Homo sapiens PEN11B mRNA, complete cds.
Homo sapiens DNA damage-inducible RNA binding protein
Homo sapiens vascular endothelial growth factor mRNA, complete
Homo sapiens homeodomain protein (OG12) mRNA, complete cds.
Homo sapiens protein phosphatase with EF-hands-2 long form
Homo sapiens mRNA capping enzyme (HCE) mRNA, complete
Homo sapiens yotiao mRNA, complete cds.
Homo sapiens serine/threonine kinase RICK (RICK) mRNA,
Homo sapiens transmembrane protein Jagged 1 (HJ1) mRNA,
Homo sapiens neuralized mRNA, complete cds.
Homo sapiens glypican-4 (GPC4) mRNA, complete cds.
Homo sapiens sodium-hydrogen exchanger 6 (NHE-6) mRNA,
Homo sapiens epithelial V-like antigen precursor (EVA) mRNA,
Homo sapiens acyl-CoA synthetase 4 (ACS4) mRNA, complete cds.
Homo sapiens pendrin (PDS) mRNA, complete cds.
Homo sapiens interleukin 15 precursor (IL-15) mRNA, complete
Homo sapiens forkhead protein (FKHR) mRNA, complete cds.
Homo sapiens cell cycle related kinase mRNA, complete cds.
Homo sapiens CASK mRNA, complete cds.
Homo sapiens FGFR signalling adaptor SNT-2 mRNA, complete
Homo sapiens pre-mRNA splicing factor (PRP17) mRNA, complete
Homo sapiens atrophin-1 interacting protein 1 (AIP1) mRNA,
Homo sapiens anti-death protein (IEX-1L) mRNA, complete cds.
Homo sapiens cadherin-10 (CDH10) mRNA, complete cds.
Homo sapiens TATA binding protein associated factor (TAFII150)
Homo sapiens spindle pole body protein spc98 homolog GCP3
Homo sapiens lithium-sensitive myo-inositol monophosphatase A1
Homo sapiens CLCA homolog (hCLCA3) mRNA, complete cds.
Homo sapiens HCG-1 protein (HCG-1) mRNA, complete cds.
Homo sapiens protein regulating cytokinesis 1 (PRC1) mRNA,
Homo sapiens transcriptional regulatory protein p54 mRNA,
Homo sapiens cytokine receptor related protein 4 (CYTOR4)
Homo sapiens sodium bicarbonate cotransporter 3 (SLC4A7)
Homo sapiens ribosomal protein L33-like protein mRNA, complete
Homo sapiens spleen mitotic checkpoint BUB3 (BUB3) mRNA,
Homo sapiens cyclin T2a mRNA, complete cds.
Homo sapiens MMS2 (MMS2) mRNA, complete cds.
Homo sapiens TACC1 (TACC1) mRNA, complete cds.
Homo sapiens Src-associated adaptor protein (SAPS) mRNA,
Homo sapiens supervillin mRNA, complete cds.
Homo sapiens 15 kDa selenoprotein mRNA, complete cds.
Homo sapiens neuronal double zinc finger protein (ZNF231)
Homo sapiens mitotic checkpoint component Bub3 (BUB3) mRNA,
Homo sapiens osteoprotegerin ligand mRNA, complete cds.
Homo sapiens angiotensin/vasopressin receptor AII/AVP mRNA,
Homo sapiens clone 24695 guanine nucleotide-binding protein
Homo sapiens monotactin-1 mRNA, complete cds.
Homo sapiens leucine-rich glioma-inactivated protein precursor
Homo sapiens kynurenine 3-hydroxylase mRNA, complete cds.
Homo sapiens inducible 6-phosphofructo-2-kinase/fructose
Homo sapiens sarcosin mRNA, complete cds.
Homo sapiens estrogen-related receptor gamma mRNA, complete
Homo sapiens actin binding protein MAYVEN mRNA, complete
Homo sapiens nuclear matrix protein NRP/B (NRPB) mRNA,
Homo sapiens serum-inducible kinase mRNA, complete cds.
Homo sapiens Gz-selective GTPase-activating protein (ZGAP1)
Homo sapiens UDP-glucose dehydrogenase (UGDH) mRNA,
Homo sapiens T41p (C8orf1) mRNA, complete cds.
Homo sapiens protocadherin (PCDH8) mRNA, complete cds.
Homo sapiens diacylglycerol kinase iota (DGKi) mRNA, complete
Homo sapiens keratan sulfate proteoglycan mRNA, complete cds.
Homo sapiens brain my047 protein mRNA, complete cds.
Homo sapiens intersectin long form mRNA, complete cds.
Homo sapiens low-density lipoprotein receptor-related protein 5
Homo sapiens GC20 protein mRNA, complete cds.
Homo sapiens alpha endosulfine mRNA, complete cds.
Homo sapiens haemopoietic progenitor homeobox HPX42B
Homo sapiens choline/ethanolaminephosphotransferase (CEPT1)
Homo sapiens cytohesin binding protein HE mRNA, complete cds.
Homo sapiens WSB-1 mRNA, complete cds.
Homo sapiens MTG8-like protein MTGR1a mRNA, complete cds.
Homo sapiens inhibitor of apoptosis protein-1 (MIHC) mRNA,
Homo sapiens OPA-containing protein mRNA, complete cds.
Homo sapiens MMSET type I (MMSET) mRNA, complete cds.
Homo sapiens insulin receptor substrate-2 (IRS2) mRNA, complete
Homo sapiens small EDRK-rich factor 1, short isoform (SERF1)
Homo sapiens small EDRK-rich factor 1, long isoform (SERF1)
Homo sapiens cytokine-inducible SH2 protein 6 (CISH6) mRNA,
Homo sapiens Hus1-like protein (HUS1) mRNA, complete cds.
Homo sapiens HSPC012 mRNA, complete cds.
Homo sapiens SIH002 mRNA, complete cds.
Homo sapiens protein translation factor sui1 homolog mRNA,
Homo sapiens HSPC019 mRNA, complete cds.
Homo sapiens hypothetical SBBI03 protein mRNA, complete cds.
Homo sapiens LDL receptor member LR3 mRNA, complete cds.
Homo sapiens conductin mRNA, complete cds.
Homo sapiens ubiquitin-like protein activating enzyme (UBA2)
Homo sapiens testis-specific chromodomain Y-like protein (CDYL)
Homo sapiens sirtuin type 1 (SIRT1) mRNA, complete cds.
Homo sapiens WD repeat protein WDR3 (WDR3) mRNA,
Homo sapiens cyclin-D binding Myb-like protein mRNA, complete
Homo sapiens xenotropic and polytropic murine leukemia virus
Homo sapiens SUMO-1-activating enzyme E1 C subunit (UBA2)
Homo sapiens clone 628 unknown mRNA, complete sequence.
Homo sapiens beta-1,3-N-acetylglucosaminyltransferase mRNA,
Homo sapiens type 2 iodothyronine deiodinase mRNA, complete
Homo sapiens UDP-Gal:glucosylceramide
Homo sapiens Ste-20 related kinase SPAK mRNA, complete cds.
Homo sapiens ARF-family of Ras related GTPases mRNA,
Homo sapiens connective tissue growth factor related protein WISP-1
Homo sapiens bone morphogenetic protein 10 (BMP10) mRNA,
Homo sapiens placenta-specific ATP-binding cassette transporter
Homo sapiens L-type amino acid transporter subunit LAT1 mRNA,
Homo sapiens decoy receptor 3 (DcR3) mRNA, complete cds.
Homo sapiens K-Cl cotransporter KCC4 mRNA, complete cds.
Homo sapiens heparan sulfate D-glucosaminyl 3-O-
Homo sapiens mitochondrial inner membrane preprotein translocase
Homo sapiens WSB-1 mRNA, complete cds.
Homo sapiens WSB-1 isoform mRNA, complete cds.
Homo sapiens stromal cell-derived receptor-1 beta mRNA,
Homo sapiens Mcd4p homolog mRNA, complete cds.
Homo sapiens ubiquitous 6-phosphofructo-2-kinase/fructose
Homo sapiens eukaryotic translation initiation factor 2 alpha
Homo sapiens fibroblast growth factor 19 (FGF19) mRNA,
Homo sapiens TDE homolog mRNA, complete cds.
Homo sapiens integral inner nuclear membrane protein MAN1
Homo sapiens clone HH114 unknown mRNA.
Homo sapiens bright and dead ringer gene product homologous
Homo sapiens host cell factor 2 (HCF-2) mRNA, complete cds.
Homo sapiens thyroid hormone receptor-associated protein complex
Homo sapiens cytokine receptor-like molecule 9 (CREME9)
Homo sapiens CAAX prenyl protein protease RCE1 (RCE1)
Homo sapiens API2-MLT fusion protein (API2-MLT) mRNA,
Homo sapiens SH2-containing protein Nsp2 mRNA, complete cds.
Homo sapiens bisphosphate 3'-nucleotidase mRNA, complete cds.
Homo sapiens HSPC040 protein mRNA, complete cds.
Homo sapiens MALT lymphoma associated translocation (MLT)
Homo sapiens cytokine-inducible SH2-containing protein (G18)
Homo sapiens CGI-10 protein mRNA, complete cds.
Homo sapiens CGI-26 protein mRNA, complete cds.
Homo sapiens CGI-34 protein mRNA, complete cds.
Homo sapiens titin-like protein (TTID) mRNA, complete cds.
Homo sapiens corin mRNA, complete cds.
Homo sapiens cofilin isoform 1 mRNA, complete cds.
Homo sapiens cofilin isoform 2 mRNA, complete cds.
Homo sapiens AKT3 protein kinase mRNA, complete cds.
Homo sapiens origin recognition complex subunit 6 (ORC6)
Homo sapiens aggrecanase-1 mRNA, complete cds.
Homo sapiens hairy and enhancer of split related-1 (HESR-1)
Homo sapiens CGI-79 protein mRNA, complete cds.
Homo sapiens CGI-107 protein mRNA, complete cds.
Homo sapiens CGI-111 protein mRNA, complete cds.
Homo sapiens CGI-123 protein mRNA, complete cds.
Homo sapiens CGI-141 protein mRNA, complete cds.
Homo sapiens CGI-142 protein mRNA, complete cds.
Homo sapiens CGI-145 protein mRNA, complete cds.
Homo sapiens CGI-148 protein mRNA, complete cds.
Homo sapiens thiamine carrier 1 (TC1) mRNA, complete cds.
Homo sapiens glutaminase C mRNA, complete cds.
Homo sapiens ras-related GTP-binding protein 4b (RAB4B)
Homo Sapiens, RP58 cDNA for complete mRNA.
Homo sapiens mRNA for X-like 1 protein.
Homo sapiens mRNA for cartilage-associated protein (CASP).
Homo sapiens mRNA for Rho guanine nucleotide-exchange factor,
Homo sapiens mRNA for Six9 protein.
Homo sapiens mRNA for NAALADase II protein.
Homo sapiens mRNA for TL132.
Homo sapiens mRNA for protein kinase.
Homo sapiens mRNA for matrilin-3.
Homo sapiens mRNA for ZNF198 protein.
Homo sapiens mRNA for phospholipase A2 activating protein.
Homo sapiens mRNA for centaurin beta2.
Homo sapiens mRNA for alpha-3-fucosyltransferase.
Homo sapiens mRNA; cDNA DKFZp566D213 (from clone
Homo sapiens mRNA; cDNA DKFZp564M112 (from clone
Homo sapiens mRNA; cDNA DKFZp434K151 (from clone
Homo sapiens mRNA; cDNA DKFZp434F122 (from clone
Homo sapiens mRNA for phosphoribosyl pyrophosphate synthetase
Homo sapiens mRNA for ITK, complete cds.
Homo sapiens mRNA for epiregulin, complete cds.
Homo sapiens mRNA for placental leucine aminopeptidase,
Homo sapiens mRNA for ceramide glucosyltransferase, complete
Homo sapiens mRNA for T-cluster binding protein, complete cds.
Homo sapiens mRNA for CIRP, complete cds.
Homo sapiens mRNA for neuron derived orphan receptor, complete
Homo sapiens mRNA for NeuroD, complete cds.
Homo sapiens mRNA for nel-related protein, complete cds.
Homo sapiens WNT7a mRNA, complete cds.
Homo sapiens mRNA for ankyrin repeat protein, complete cds.
Homo sapiens mRNA for CD38, complete cds.
Homo sapiens mRNA for hyaluronan synthase, complete cds.
Homo sapiens mRNA for fungal sterol-C5-desaturase homolog,
Homo sapiens mRNA for HCS, complete cds.
Homo sapiens mRNA for HYA22, complete cds.
Homo sapiens T cell-specific tyrosine kinase mRNA, complete cds.
Homo sapiens cAMP phosphodiesterase PDE7 (PDE7A1) mRNA,
Homo sapiens proto-oncogene (Wnt-5a) mRNA, complete cds.
Homo sapiens galactocerebrosidase (GALC) mRNA, complete cds.
Homo sapiens (PWD) gene mRNA, 3′ end.
Homo sapiens paired box protein mRNA, complete cds.
Homo sapiens beta2-chimaerin mRNA, complete cds.
Homo sapiens ras GTPase-activating-like protein (IQGAP1)
Homo sapiens receptor protein-tyrosine kinase (HEK11) mRNA,
Homo sapiens GT198 mRNA, complete ORF.
Homo sapiens transcription factor SL1 mRNA, complete cds.
Homo sapiens TNFR2-TRAF signalling complex protein mRNA,
Homo sapiens COX17 mRNA, complete cds.
Homo sapiens inwardly rectifying potassium channel (Kir3.2)
Homo sapiens cap-binding protein mRNA, complete cds.
Homo sapiens secreted T cell protein (H400; SIS-gamma) mRNA,
Homo sapiens tyrosine kinase (ELK1) oncogene mRNA, complete
Homo sapiens (clone pAT 464) potential lymphokine/cytokine
Homo sapiens (clone pAT 744) potential lymphokine/cytokine
Homo sapiens nuclear-encoded mitochondrial branched chain
Homo sapiens phosphatidylcholine 2-acylhydrolase (cPLA2)
Homo sapiens MAD-3 mRNA encoding IkB-like activity, complete
Homo sapiens lutropin/choriogonadotropin receptor (LHCGR)
Homo sapiens transcription factor (HTF4A) mRNA, complete cds.
Homo sapiens skeletal muscle alpha 2 actinin (ACTN2) mRNA,
Homo sapiens cyclooxygenase-2 (Cox-2) mRNA, complete cds.
Homo sapiens cyclin D3 (CCND3) mRNA, complete cds.
H. sapiens zinc finger transcriptional regulator mRNA, complete cds.
Homo sapiens phospholipase C-beta-2 mRNA, complete cds.
Homo sapiens nucleolysin TIAR mRNA, complete cds.
Homo sapiens Kallmann syndrome (KAL) mRNA, complete cds.
Homo sapiens glutamine PRPP amidotransferase (GPAT) mRNA,
Homo sapiens ileal sodium-dependent bile acid transporter
Homo sapiens calcium dependent potassium channel alpha subunit
Homo sapiens A-kinase anchor protein (AKAP100) mRNA,
Homo sapiens putative tumor suppressor ST13 (ST13) mRNA,
Homo sapiens nuclear autoantigen GS2NA mRNA, complete cds.
Homo sapiens stanniocalcin precursor (STC) mRNA, complete cds.
Homo sapiens Rab27a mRNA, complete cds.
Homo sapiens myotubularin (MTM1) mRNA, complete cds.
Homo sapiens fragile X mental retardation protein FMR2p (FMR2)
Homo sapiens GOK (STIM1) mRNA, complete cds.
Homo sapiens haemochromatosis protein (HLA-H) mRNA,
Homo sapiens enterocyte differentiation associated factor EDAF-1
Homo sapiens A28-RGS14p mRNA, complete cds.
Homo sapiens orphan nuclear receptor GCNF mRNA, complete cds.
Homo sapiens post-synaptic density protein 95 (PSD95) mRNA,
Homo sapiens basic-leucine zipper transcription factor MafG
Homo sapiens lysyl hydroxylase isoform 2 (PLOD2) mRNA,
Homo sapiens dentin matrix acidic phosphoprotein 1 (DMP1)
H. sapiens endothelin 3 mRNA.
H. sapiens cDNA for CREB protein.
H. sapiens mRNA for 1D-myo-inositol-trisphosphate 3-kinase B
H. sapiens serotonin 5-HT2 receptor mRNA.
Homo sapiens D13S106 mRNA for a highly charged amino acid
H. sapiens RR2 mRNA for small subunit ribonucleotide reductase.
H. sapiens KALIG-1 mRNA for neural cell adhesion and axonal
H. sapiens mRNA for P2 protein of peripheral myelin.
H. sapiens mRNA (TRK-T1) for 55 KD protein.
Homo sapiens mRNA for serum response factor-related protein,
H. sapiens mRNA for tre oncogene (clone 210).
H. sapiens mRNA for tre oncogene (clone 213).
H. sapiens mRNA for plasma membrane calcium ATPase.
H. sapiens mRNA for APO-1 cell surface antigen.
H. sapiens mRNA for RDC-1 POU domain containing protein.
H. sapiens mRNA for transacylase (DBT).
H. sapiens MaTu MN mRNA for p54/58N protein.
H. sapiens mRNA for heterogeneous nuclear ribonucleoprotein.
H. sapiens mRNA for CD40 ligand.
H. sapiens mRNA for myocyte-specific enhancer factor 2 (MEF2).
H. sapiens TRAP mRNA for ligand of CD40.
H. sapiens interleukin-13 mRNA.
H. sapiens mRNA for calcitonin receptor.
H. sapiens FMR-1 mRNA.
H. sapiens mRNA for transforming growth factor alpha.
Homo sapiens mRNA for FUS-CHOP protein fusion.
H. sapiens ERGIC-53 mRNA.
H. sapiens mRNA encoding Rev-ErbAalpha (internal fragment).
H. sapiens mRNA for nicein B2 chain.
H. sapiens p130 mRNA for 130K protein.
H. sapiens mRNA for vacuolar H+ ATPase E subunit.
H. sapiens mRNA for 2-5A binding protein.
H. sapiens mRNA for cathepsin-O.
H. sapiens GD3 synthase mRNA.
H. sapiens HZF10 mRNA for zinc finger protein.
H. sapiens SCA1 mRNA for ataxin.
H. sapiens AUH mRNA.
H. sapiens ERK3 mRNA.
H. sapiens mRNA (clone p5) for archain.
H. sapiens hTFIIAs mRNA for smallest (gamma) TFIIA subunit.
H. sapiens mRNA for HE6 Tm7 receptor.
H. sapiens HOK-2 mRNA for zinc finger protein.
H. sapiens Staf50 mRNA.
H. sapiens mRNA for alpha-centractin.
H. sapiens Brain 4 mRNA.
H. sapiens SMA4 mRNA.
H. sapiens mRNA for phosphatidylinositol 3 kinase gamma.
H. sapiens APXL mRNA.
H. sapiens mRNA for cytokine inducible nuclear protein.
H. sapiens mRNA for prostaglandin E receptor (EP3c).
H. sapiens mRNA for protein kinase, PKX1.
H. sapiens mRNA for TRPC1 protein.
H. sapiens mRNA for ESM-1 protein.
H. sapiens mRNA for microsomal triglyceride transfer protein.
H. sapiens mRNA for MHC class I mic-B antigen.
H. sapiens mRNA for novel gene in Xq28 region.
H. sapiens mRNA for translin associated protein X.
H. sapiens mRNA for PWP2 protein.
Homo sapiens mRNA for translational inhibitor protein p14.5.
H. sapiens mRNA for TAFII100 protein.
H. sapiens mRNA for C1D protein.
H. sapiens mRNA for canalicular multidrug resistance protein.
H. sapiens mRNA for TIM17 preprotein translocase.
H. sapiens mRNA for protein induced by vitamin D.
H. sapiens mRNA for carcinoembryonic antigen family member 2,
H. sapiens mRNA for FAA protein.
H. sapiens mRNA for B-HLH DNA binding protein.
H. sapiens mRNA for hRTR/hGCNF protein.
H. sapiens mRNA for 46 kDa coxsackievirus and adenovirus
H. sapiens mRNA for SIRP-beta1.
Homo sapiens mRNA for WNT11 gene.
Homo sapiens mRNA for FIM protein.
Homo sapiens mRNA for farnesylated-proteins converting enzyme
Homo sapiens mRNA for farnesylated-proteins converting enzyme
Homo sapiens mRNA for DIF-2 protein.
Homo sapiens mRNA
Homo sapiens mRNA for leukemia associated gene 2.
Homo sapiens mRNA for nebulette.
Homo sapiens mRNA for serum response factor-related protein,
Homo sapiens mRNA for CDS2 protein.
Homo sapiens mRNA for monocyte chemotactic protein-2.
Homo sapiens mRNA for prefoldin subunit 3.
Homo sapiens mRNA for protein phosphatase 1 (PPP1R6).
H. sapiens mRNA for N-Oct 3, N-Oct5a, and N-Oct 5b proteins.
H. sapiens mRNA for laminin.
H. sapiens ALK-2 mRNA.
H. sapiens ALK-3 mRNA.
H. sapiens a2-chimaerin mRNA.
H. sapiens tropomyosin isoform mRNA, complete CDS.
H. sapiens (23k/3) mRNA for ubiquitin-conjugating enzyme UbcH2.
H. sapiens mRNA for 43 kDa inositol polyphosphate 5-phosphatase.
Homo sapiens mRNA for membrane transport protein (XK gene).
H. sapiens HK2 mRNA for hexokinase II.
Homo sapiens SOX9 mRNA.
H. sapiens CTLA8 mRNA.
H. sapiens mRNA for phosphotyrosine phosphatase kappa.
H. sapiens mRNA for TRPC1A.
H. sapiens mRNA for CC-chemokine, eotaxin variant (clone 53).
The present application is a divisional application of U.S. application Ser. No. 10/257,294, filed Apr. 9, 2003, which claims priority to International Application PCT/US01/11993, filed Apr. 12, 2001 which claims the benefit of the filing date of U.S. Provisional Application 60/196,870 filed Apr. 12, 2000, all of which are incorporated herein by reference in their entirety.
Number | Date | Country
---|---|---
60196870 | Apr 2000 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 10257294 | Jul 2003 | US
Child | 11774296 | | US