Cytomegalovirus gene function and methods for developing antivirals, anti-CMV vaccines, and CMV-based vectors

Information

  • Patent Grant
  • 7407744
  • Patent Number
    7,407,744
  • Date Filed
    Friday, July 23, 2004
  • Date Issued
    Tuesday, August 5, 2008
Abstract
A global functional analysis of HCMV genes is performed by constructing virus gene-deletion mutants and examining their growth phenotypes in different natural HCMV host cells. This systematic analysis of the HCMV genome identified 45 viral ORFs essential for viral replication and characterized 115 growth-dispensable viral genes. Of particular interest is the finding that HCMV encodes genes (temperance factors) that repress its own replication on a cell type-specific basis. In addition to HCMV, pathogen temperance may be a strategy employed by other infectious agents to enhance their long-term survivability within their respective host population.
Description

Human cytomegalovirus (HCMV) is among the largest of the DNA viruses, with a genome of over 230 kb. This virus infects various tissue and cell types and, hence, is responsible for a myriad of complications including mental retardation, AIDS-associated retinitis, and vascular diseases. HCMV is found universally throughout all geographic locations and socioeconomic groups, and infects between 50% and 85% of adults in the United States by 40 years of age. HCMV is also the virus most frequently transmitted to a developing child before birth. For most healthy persons who acquire CMV after birth there are few symptoms and no long-term health consequences, although there is usually a dormant virus infection for life.


However, HCMV infection is problematic for certain high-risk groups. Included among these are infection during pregnancy, and infection of immunocompromised individuals, such as organ transplant recipients and persons infected with human immunodeficiency virus (HIV). HCMV is a major cause of morbidity and mortality in AIDS patients with low CD4 counts, from either primary infection or reactivation of latent infection. Clinical illnesses in patients with HIV infection include chorioretinitis, pneumonia, esophagitis, colitis, encephalitis, polyradiculopathy, adrenalitis and hepatitis.


CMV also remains the most important cause of congenital viral infection in the United States. Generalized infection may occur in the infant if infected before birth, and symptoms may range from moderate enlargement of the liver and spleen (with jaundice) to fatal illness. With supportive treatment most infants with CMV disease usually survive. However, from 80% to 90% will have complications within the first few years of life that may include hearing loss, vision impairment, and varying degrees of mental retardation. Another 5% to 10% of infants who are infected but without symptoms at birth will subsequently have varying degrees of hearing and mental or coordination problems.


Although primary HCMV infection in an immunocompromised patient can cause serious disease, the more common problem is the reactivation of the dormant virus. Infection with CMV is a major cause of disease and death in immunocompromised patients, including organ transplant recipients, patients undergoing hemodialysis, patients with cancer, patients receiving immunosuppressive drugs, and HIV-infected patients. Pneumonia, retinitis (an infection of the eyes), and gastrointestinal disease are the common manifestations of disease. Because of this risk, exposing immunosuppressed patients to outside sources of CMV should be minimized. Whenever possible, patients without CMV infection should be given organs and/or blood products that are free of the virus.


Depending on the tissue type and the host's immune state, HCMV engages in three different modes of infection: acute infections with highly productive growth, persistent infections with low levels of replication, and latent infections where no viral progeny are produced. In different cell types, HCMV exhibits various growth rates, suggesting that its replication in a particular cell type is tightly regulated and thus, determines the outcome of diseases in specific tissues. Although there is evidence for a genetic basis of viral cell type-specific infection and growth regulation, many virus-encoded cell-tropism factors have not been identified, and their functional roles in viral replication are unclear.


Methods of controlling and preventing HCMV infection are of broad interest to the scientific community and to the pharmaceutical and biotechnology industries. The present invention addresses these issues.


Relevant Literature


The genomic sequence of human cytomegalovirus (AD169) has been deposited with Genbank; accession number NC001347. The sequence information is reviewed by Davison et al. (2003) J. Gen. Virol. 84 (Pt 1), 17-28; Dargan et al. (1997) J. Virol. 71 (12), 9833-9836; and Chee et al. (1990) Curr. Top. Microbiol. Immunol. 154, 125-169.


SUMMARY OF THE INVENTION

A global functional analysis of HCMV genes was performed by constructing virus gene-deletion mutants and examining their growth phenotypes in different natural HCMV host cells. This systematic analysis of the HCMV genome identified 45 viral ORFs essential for viral replication and characterized 115 growth-dispensable viral genes. Of particular interest is the finding that HCMV encodes genes (herein termed temperance factors) that repress its own replication on a cell type-specific basis. In addition to HCMV, pathogen temperance may be a strategy employed by other infectious agents to enhance their long-term survivability within their respective host population.


Viral temperance factors, genes encoding such temperance factors, and viruses having mutations in temperance factors are provided. Viruses with deletions in temperance factor genes exhibit enhanced growth phenotypes, as compared to the wild type virus. These repressors of growth facilitate pathogen temperance. The genetic sequence of such temperance factors in viruses is modified to modulate virus replication, e.g. in the development of vaccine strains, for research purposes, and the like. The temperance factor polypeptides are useful as targets for drug design, as targets for immunological agents, and the like. Drugs mimicking or activating growth inhibitors or temperance factors find use in therapies against infectious diseases. In vitro hyper-growth strains having diminished or absent temperance factors can be used for facile production of large quantities of subunit and attenuated live vaccines.


Genes essential, or dispensable, for replication of HCMV are also identified. The sequence of such essential or dispensable genes can be modified to modulate virus replication, e.g. in the development of vectors and vaccine strains, for research purposes, and the like. Protein products of these genes are useful as targets for drug design, as targets for immunological agents, and the like.


In another embodiment of the invention, methods and compositions for the functional analysis of cytomegalovirus are provided. Such methods include the construction of rescued mutants, and methods for tagging and introducing foreign genes into CMV genome. These approaches can be used for vector and vaccine development. A collection of mutant cytomegaloviruses is provided, where each virus contains a deletion corresponding to one open reading frame in the virus genome. The mutant HCMV are useful in a number of screening methods. Screening methods include the growth of HCMV in different human cell lines.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1. Genome organization and genes of HCMV (Towne strain) based on the genome-wide shotgun sequencing of the viral sequence cloned in a BAC. Similar to the HCMV AD169 genome, the Towne genome is composed of a unique long (UL) region and a unique short (US) region, both flanked by inverted repeat regions (RL and RS). RL and RS are shown in a thicker format than UL and US. Each of the ORFs (RL1-RL13, UL2-UL147, IRS1, US1-US34, and TRS1) is color-coded according to the growth properties of its corresponding virus-deletion mutant in HFFs (see Table 6). The ORFs (RL11 and RL12), for which a deletion mutant was not generated, are shown in white. Repeated attempts to delete these two ORFs failed, possibly due to the presence of two copies of these genes at the inverted repeat regions. The vertical dashed lines represent the splicing junctions.



FIG. 2. (A) Procedures for constructing deletion and rescued mutants, as described in Methods. (B) Multiple-step growth (multiplicity of infection [MOI]=0.05) of HCMV mutants in HFFs. Cells were infected with each virus and at different time points post-infection, cells and culture media were harvested and sonicated. The viral titers were determined by plaque assays on HFFs. The values of the viral titer represent the average obtained from triplicate experiments. The standard deviation is indicated by the error bars.



FIG. 3. Analysis of multiple-step growth of different mutants and TowneBAC in HFFs (A) (MOI=0.05), retinal pigment epithelial (RPE) cells (B) (MOI=0.25), and human microvascular endothelial cells (HMVEC) (C) (MOI=0.05). (D) Comparison of the growth properties of 15 mutants in these three cell types with those of TowneBAC. +++, peak titer similar to that of TowneBAC; +++++, peak titer at least 100 times higher than that of TowneBAC; +, peak titer at least 100 times lower than that of TowneBAC. The values of the viral titer represent the average obtained from triplicate experiments. The standard deviation is indicated by the error bars.
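The scoring used in panel (D) of FIG. 3 can be illustrated with a short Python sketch. This is not part of the disclosed methods. The function name `score_peak_titer`, the `similar_fold` tolerance used to decide what counts as "similar", and the "intermediate" label are all assumptions made for illustration; the caption defines only the three symbols +, +++, and +++++.

```python
def score_peak_titer(mutant_peak, parental_peak, similar_fold=10.0):
    """Score a mutant's peak titer against the parental (TowneBAC) peak titer.

    Only the three symbols named in the FIG. 3 caption are returned.
    `similar_fold` is an assumed tolerance for "similar"; the caption
    gives no numeric definition of similarity.
    """
    if mutant_peak <= 0 or parental_peak <= 0:
        raise ValueError("peak titers must be positive")
    if mutant_peak >= 100 * parental_peak:
        return "+++++"   # at least 100 times higher than parental
    if mutant_peak * 100 <= parental_peak:
        return "+"       # at least 100 times lower than parental
    if parental_peak / similar_fold <= mutant_peak <= parental_peak * similar_fold:
        return "+++"     # similar to parental (assumed tolerance)
    return "intermediate"  # band not defined in the caption


if __name__ == "__main__":
    # Hypothetical titers (PFU/ml), chosen only to exercise each branch.
    for mutant in (1e8, 2e6, 1e4, 5e7):
        print(mutant, score_peak_titer(mutant, 1e6))
```

The comparisons are written as multiplications rather than as a ratio so that values lying exactly on the 100-fold boundary are classified predictably.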



FIG. 4. Polymerase chain reaction (PCR) (lanes 1-3) and Southern analyses (lanes 4-6) of the DNAs of the deletion (ΔUL32) and rescued (Rescued-UL32) mutants, and TowneBAC that were isolated from E. coli (lanes 1-3) and human fibroblasts (lanes 4-6). In (A), PCR products were separated on 1% agarose gels and visualized using ethidium bromide staining. In (B), DNAs were digested with Hind III, separated on 0.8% agarose gels, transferred to membranes, and hybridized with a [32P]-labeled probe containing both the KanMX4 and HCMV UL32 sequences. The numbers represent the size of either the PCR DNA products (PCR) or the DNA fragments (Hind III) of BAC-DNAs that were digested with Hind III and hybridized to the radiolabeled probe in Southern analysis.



FIG. 5. Microscopic images of green fluorescent protein (GFP) staining of human foreskin fibroblasts (HFFs) transfected with the DNAs (20 μg/10⁵ cells) of TowneBAC, ΔUL32, and rescued-UL32 at 10 days post-transfection. Viral infection can be visualized using GFP staining since all BAC-DNAs contain a GFP expression cassette.





DETAILED DESCRIPTION OF THE EMBODIMENTS

Using bacterial artificial chromosome (BAC) engineering and RED recombinase technology in conjunction with growth curve analysis in human fibroblast cells in tissue culture, an open reading frame deletion library spanning the entire human cytomegalovirus genome was constructed. The complete sequence of HCMV Towne strain was determined, and is provided herein as SEQ ID NO:1. The BAC-based ORF deletion constructs were then transfected into human fibroblast cells in tissue culture. Constructs with deletions in 45 separate and distinct ORFs in the HCMV genome did not yield any viral progeny upon transfection into the fibroblast cells, indicating that those regions of the genome are essential for viral growth. These essential genes are drug targets for anti-CMV therapeutic applications.


In addition, the functional mapping of the genome identified regions in the viral genome dispensable for viral growth. All ORF deletion constructs that yielded viral progeny upon transfection were deemed dispensable for viral growth. Growth curve analyses were performed on the BAC-derived mutant viruses, and the ORF deletions were categorized as causing a severe growth defect, a moderate growth defect, growth like wild type, or enhanced growth. The identification of these non-essential genes indicates which genes can be deleted to create an attenuated virus for use as a vaccine, which genes can be deleted to create a gene therapy vector that accommodates a gene of interest for delivery without affecting viral propagation in vitro, etc. Further growth kinetic characterization of the constructed mutants was carried out on human retinal epithelial cells, human aortic smooth muscle cells, and human microvascular endothelial cells and compared to the results from the human foreskin fibroblast characterization. This comparative analysis identified open reading frame deletion viruses that replicated differentially, compared to the wild-type virus, in the cell types tested, indicating that these open reading frames encode factors important for cell tropism.


The various methods of the invention will be described below. Although particular methods of tumor suppression are exemplified in the discussion below, it is understood that any of a number of alternative methods, including those described above are equally applicable and suitable for use in practicing the invention. It will also be understood that an evaluation of the vectors and methods of the invention may be carried out using procedures standard in the art, including the diagnostic and assessment methods described above.


The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, biochemistry and immunology, which are within the scope of those of skill in the art. Such techniques are explained fully in the literature, such as, “Molecular Cloning: A Laboratory Manual”, second edition (Sambrook et al., 1989); “Oligonucleotide Synthesis” (M. J. Gait, ed., 1984); “Animal Cell Culture” (R. I. Freshney, ed., 1987); “Methods in Enzymology” (Academic Press, Inc.); “Handbook of Experimental Immunology” (D. M. Weir & C. C. Blackwell, eds.); “Gene Transfer Vectors for Mammalian Cells” (J. M. Miller & M. P. Calos, eds., 1987); “Current Protocols in Molecular Biology” (F. M. Ausubel et al., eds., 1987); “PCR: The Polymerase Chain Reaction”, (Mullis et al., eds., 1994); and “Current Protocols in Immunology” (J. E. Coligan et al., eds., 1991).


Unless otherwise indicated, all terms used herein have the same meaning as they would to one skilled in the art, and the practice of the present invention will employ conventional techniques of microbiology and recombinant DNA technology, which are within the knowledge of those of skill in the art.


“Replication” and “propagation” are used interchangeably and refer to the ability of a virus or viral vector of the invention to reproduce or proliferate. These terms are well understood in the art. For purposes of this invention, replication involves production of viral proteins and is generally directed to reproduction of virus. Replication can be measured using assays standard in the art and described herein, such as a virus yield assay, burst assay or plaque assay. “Replication” and “propagation” include any activity directly or indirectly involved in the process of virus manufacture, including, but not limited to, viral gene expression; production of viral proteins, nucleic acids or other components; packaging of viral components into complete viruses; and cell lysis.


An “individual” is a vertebrate, preferably a mammal, more preferably a human. Mammals include, but are not limited to, farm animals, sport animals, rodents, primates, and pets. A “host cell” includes an individual cell or cell culture which can be or has been a recipient of a viral vector of this invention. Host cells include progeny of a single host cell, and the progeny may not necessarily be completely identical (in morphology or in total DNA complement) to the original parent cell due to natural, accidental, or deliberate mutation and/or change. A host cell includes cells transfected or infected in vivo with an adenoviral vector of this invention.


A “biological sample” encompasses a variety of sample types obtained from an individual and can be used in a diagnostic or monitoring assay. The definition encompasses blood and other liquid samples of biological origin, solid tissue samples such as a biopsy specimen or tissue cultures or cells derived therefrom and the progeny thereof. The definition also includes samples that have been manipulated in any way after their procurement, such as by treatment with reagents, solubilization, or enrichment for certain components, such as proteins or polynucleotides. The term “biological sample” encompasses a clinical sample, and also includes cells in culture, cell supernatants, cell lysates, serum, plasma, biological fluid, and tissue samples.


An “effective amount” is an amount sufficient to effect beneficial or desired clinical results. An effective amount can be administered in one or more administrations. For purposes of this invention, an effective amount of a temperance factor or temperance factor mimetic or temperance factor enhancer is an amount that is sufficient to palliate, ameliorate, stabilize, reverse, slow or delay the progression of the viral infection. An effective amount of a virus used in a vaccine is the amount that is sufficient to generate a virus specific immune response in the individual to which it is administered.


As used herein, “treatment” is an approach for obtaining beneficial or desired clinical results. For purposes of this invention, beneficial or desired clinical results include, but are not limited to, alleviation of symptoms, diminishment of extent of disease, stabilized (i.e., not worsening) state of disease, preventing spread of disease, delay or slowing of disease progression, amelioration or palliation of the disease state, and remission (whether partial or total), whether detectable or undetectable. “Treatment” can also mean prolonging survival as compared to expected survival if not receiving treatment. “Palliating” a disease means that the extent and/or undesirable clinical manifestations of a disease state are lessened and/or time course of the progression is slowed or lengthened, as compared to not administering factors or compounds of the present invention.


The term “polynucleotide” as used herein refers to a polymeric form of nucleotides of any length, either ribonucleotides or deoxynucleotides. Thus, this term includes single-, double- and triple-stranded DNA, as well as single- and double-stranded RNA, RNA-DNA hybrids, or a polymer comprising purine and pyrimidine bases, or other natural, chemically, biochemically modified, non-natural or derivatized nucleotide bases. The backbone of the polynucleotide can comprise sugars and phosphate groups (as may typically be found in RNA or DNA), or modified or substituted sugar or phosphate groups. Alternatively, the backbone of the polynucleotide can comprise a polymer of synthetic subunits such as phosphoramidates and thus can be an oligodeoxynucleoside phosphoramidate (P-NH2) or a mixed phosphoramidate-phosphodiester oligomer. Peyrottes et al. (1996) Nucleic Acids Res. 24: 1841-8; Chaturvedi et al. (1996) Nucleic Acids Res. 24: 2318-23; Schultz et al. (1996) Nucleic Acids Res. 24: 2966-73. A phosphorothioate linkage can be used in place of a phosphodiester linkage. Braun et al. (1988) J. Immunol. 141: 2084-9; Latimer et al. (1995) Mol. Immunol. 32: 1057-1064. Preferably, the polynucleotide is DNA. As used herein, “DNA” includes not only bases A, T, C, and G, but also includes any of their analogs or modified forms of these bases, such as methylated nucleotides, internucleotide modifications such as uncharged linkages and thioates, use of sugar analogs, and modified and/or alternative backbone structures, such as polyamides. In addition, a double-stranded polynucleotide can be obtained from the single-stranded polynucleotide product of chemical synthesis either by synthesizing the complementary strand and annealing the strands under appropriate conditions, or by synthesizing the complementary strand de novo using a DNA polymerase with an appropriate primer.


The term “gene” is well understood in the art and is a polynucleotide encoding a polypeptide. In addition to the polypeptide coding regions, a gene includes non-coding regions including, but not limited to, introns, transcribed but untranslated segments, and regulatory elements upstream and downstream of the coding segments.


The term “virus target” is used to generally refer to a complete virus particle or virion, a nucleocapsid, capsid, or macromolecule from the virus, which may be a lipid, polysaccharide, protein, etc., usually an envelope or capsid protein. Viruses are infectious agents, usually comprising only one kind of nucleic acid as their genome. The nucleic acid is encased in a protein shell of capsid proteins, which forms the nucleocapsid particle. The nucleocapsid may be further surrounded by a lipid containing membrane, into which are typically inserted envelope proteins.


Viruses may be classified according to their genome composition. DNA viruses include parvoviruses, papovaviruses, adenoviruses, herpesviruses, poxviruses and hepadnaviruses. RNA containing viruses include caliciviruses, reoviruses, arboviruses, togaviruses, flaviviruses, arenaviruses, coronaviruses, retroviruses, bunyaviruses, orthomyxoviruses, paramyxoviruses, and rhabdoviruses.


Herpesvirus is a class of viruses containing several important human pathogens. An important property of herpesviruses is their ability to establish life-long persistent infection of the host, and to undergo periodic reactivation. Their reactivation in immunosuppressed patients frequently causes health problems. The reactivated infection may be clinically very different from the disease caused by primary infection.


There are eight herpesviruses known to infect humans: herpes simplex viruses 1 and 2, varicella-zoster virus, cytomegalovirus, Epstein-Barr virus, human herpesviruses 6 and 7, and Kaposi's Sarcoma associated herpesvirus (HHV-8). All herpesviruses have a core of double-stranded DNA surrounded by a protein coat having icosahedral symmetry. The nucleocapsid is surrounded by an envelope that is derived from the nuclear membrane of the host cell, and contains viral glycoprotein spikes.


The sub-family of β-herpesvirus includes Human herpesvirus 5 (Human cytomegalovirus); muromegalovirus Murid (beta) herpesvirus 1 (Mouse cytomegalovirus); Suid herpesvirus 2 (Pig cytomegalovirus); Equid (beta) herpesvirus 2 (Equine cytomegalovirus); Porcine herpesvirus 2 (inclusion body rhinitis virus); Bovine herpesvirus 4 (bovine cytomegalovirus); Murid herpesvirus 2 (Rat cytomegalovirus); and Caviid herpesvirus 1 (guinea pig cytomegalovirus). The sub-family of α-herpesvirus includes the simplexviruses: Simplexvirus Human herpesvirus 1 (Herpes simplex virus 1); Human herpesvirus 2 (Herpes simplex virus 2); Bovine herpesvirus 1 (Bovine Mammillitis virus 1); and the Varicellovirus Herpesviridae: Duck Enteritis Virus (Duck enteritis herpesvirus (DEHV), Duck enteritis virus, Duck plague virus, Anatid Herpesvirus, Avian herpesvirus 2); Human herpesvirus 3 (Varicella-zoster virus); Suid herpesvirus 1 (Pseudorabies/Aujeszky's disease virus); Bovine herpesvirus 1 (Infectious bovine rhinotracheitis virus); Equine herpesvirus 1 (Equine abortion virus); Equine herpesvirus 4 (Respiratory infection virus); Feline herpesvirus 1 (FHV-1); Canine herpesvirus (CHV) (“Fading puppy” syndrome virus); Equine herpesvirus 3 (Coital exanthema); and Avian herpesvirus (Infectious laryngotracheitis of chicken).


Characterization of HCMV Gene Sequences According to their Effect on Growth

The present invention provides for the classification of open reading frames (genes) in HCMV according to the effect that such sequences have on growth of the virus. Sequences are classified according to the effect on a virus when the sequence is deleted, and are: essential for growth, causing a severe growth deficit, causing a moderate growth deficit, having no effect on growth, and causing enhanced growth. In the tables setting forth the open reading frames in these categories, the sequences are referred to by the ORF, which are diagrammed in FIG. 1.


In order to unambiguously define the sequence of each ORF in the HCMV Towne strain, the genetic sequence of the HCMV is provided herein, as SEQ ID NO:1. Also provided are upstream primer sequences adjacent to the ATG start codon of each ORF; and downstream primers that are adjacent to the sequence 1 nt past the stop codon of each ORF. The sequence of the complete ORF can easily be determined by one of skill in the art, by using the primer sequences provided to delineate the ORF in SEQ ID NO:1. An ORF may thus be defined as the sequence of SEQ ID NO:1 that is bounded by the corresponding up and down primer. For example, the ORF of US26 comprises the sequence of SEQ ID NO:1 that is 3′ of the upstream primer and 5′, less 1 nucleotide, of the downstream primer.


The orientation of the primers (i.e. whether the primer is complementary or identical to the corresponding region of SEQ ID NO:1) with respect to SEQ ID NO:1 depends on the orientation of the open reading frame in question. This can be determined by looking at the numerical identifiers of the primers. These identifiers are three digit numbers followed by “Up2” and a letter, either “W” or “C” (e.g. 006_Up2W or 453_Up2C). If the letter is a “W” then the upstream primer is complementary to SEQ ID NO:1 and the downstream primer is identical to the sequence in SEQ ID NO:1. If the letter is “C” then the upstream primer is identical to the sequence and the downstream primer is complementary. The 3′ end of each upstream primer ends directly adjacent to the ATG start codon of the ORF. The 3′ end of each downstream primer stops 1 nt beyond the stop codon (i.e. there is a 1 nt gap between the stop codon and the 3′ end of the downstream primer).
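The primer conventions described above can be illustrated with a short Python sketch. This is a hedged illustration only, not part of the disclosed methods: the function names `parse_primer_id` and `extract_orf` are hypothetical, the toy sequences are invented rather than taken from SEQ ID NO:1, and the sketch assumes that each primer matches the reference exactly and occurs only once.

```python
import re

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (N is kept as N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_primer_id(primer_id):
    """Split an identifier such as '006_Up2W' into (number, role, letter)."""
    match = re.fullmatch(r"(\d{3})_(up|down)2([WC])", primer_id, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("unrecognized primer identifier: " + primer_id)
    number, role, letter = match.groups()
    return number, role.lower(), letter.upper()


def extract_orf(reference, up_primer, down_primer, letter):
    """Return the ORF delimited by a primer pair, read from ATG to stop codon.

    'C': upstream primer identical to the reference, downstream complementary.
    'W': upstream primer complementary to the reference, downstream identical.
    In both cases the ORF is read on the strand where the upstream primer
    appears as written.
    """
    strand = reference if letter == "C" else reverse_complement(reference)
    up_at = strand.find(up_primer)
    if up_at < 0:
        raise ValueError("upstream primer not found")
    orf_start = up_at + len(up_primer)      # primer ends adjacent to the ATG
    down_site = reverse_complement(down_primer)
    down_at = strand.find(down_site, orf_start)
    if down_at < 0:
        raise ValueError("downstream primer not found")
    return strand[orf_start:down_at - 1]    # skip the 1 nt gap after the stop codon


if __name__ == "__main__":
    # Invented toy sequences, used only to exercise the function.
    up, orf, down_site = "GGCCTTAA", "ATGAAACCCTAG", "CGCGATAT"
    toy_reference = "TT" + up + orf + "T" + down_site + "AA"
    print(parse_primer_id("006_Up2W"))
    print(extract_orf(toy_reference, up, reverse_complement(down_site), "C"))
```

For a “W” pair the same function is called with the reference as given; the reverse complement is taken internally so that the returned sequence always reads from the start codon to the stop codon.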














Primer      Sequence                                            ORF
006_up2W    AAGAAACTCCATAAAATAGGCTGCCAAGTGCCGCTCCACGCCGCGGCACC  US26
006_down2W  TTTATTTGTATTCCTTTCCTGTTTTGTACTCGTAAACTGTTGACGTTGTT  US26
014_up2W    CCCCACTTGCCGCTGTACAACGAATTCACCAGCTTTCGCCTGCCCACCTC  UL116
014_down2W  GTGCCACCGGTCCAGGTGAGAAAGAGAAGCCGCAATCCGGGCGGCGGCAC  UL116
017_up2W    GCCGCCCGAGCTGAAGCAGACGCGCGTCAACCTGCCGGCTCACTCGCGCT  UL114
017_down2W  TAGACATCACAGTTCACCACCTTGTCTCCCCGGTGTGTCTATTATCATCA  UL114
019_up2W    CCGCACTCGGTCAGCACCCGCAGAATCCCGGGATCTCGGGCCCTGCGGCC  UL117
019_down2W  AAAAGCACAGGGCCAGGAAAAGCAACCAGCCCCGCCATCGCCGCCGCCGC  UL117
020_up2W    CGGCGCCAACTGGCTCCTTACCGTCACACTCTCATCGTGCCGCAGACTTG  UL115
020_down2W  ACGCGAGCCTGCTCGTCGGGGGTTAACAGAGAGCCTTTATTATCAGCAAT  UL115
024_up2W    CCGCCATGAAGGCAAGAGCAGCAGCAGCAACGACGTCACTACGATGATTG  UL109
024_down2W  TGNGGCCTATAAGGTGTCTTCTATCACGGTGGCTTGTTCATCGCTTGGCG  UL109
036_up2W    CCTTCGTCCCAGACGGACGGCTATCGGTTCGCGCGCTCGTCTTTTCTTCG  UL110
036_down2W  GCTGCTCTTGCCTTCATGGCGGTATTTCTCTTCCTCCCCCCTAACCCCAT  UL110
046_up2W    TCGCCCCGAGGCGCTGCTCTGAAGCCAAGTGCCGACGGCGCTTTGGCTTT  US33
046_down2W  AGCGTCACAACTGACGTGGGTTGGGTACTGACGTGCAGGATATTACGCGA  US33
064_up2W    GACGCCGCGCCGACGCTCAAGCTCTGGGACTGGACTTGGCCACGGTGGTG  TRS1
064_down2W  TGTGAAAAAGAATTCTCGTAAGCATGTTGACAACTGCAAAATAAAACCAT  TRS1
070_up2W    ATTACTAATCCATAACATGGCTCTTTGCCACAACTATCTCTATTGGCTAT  UL125
070_down2W  GCACACTGGTGGTGGTGGGCATTGTGCTGTGCCTAAGTCTGGCCTCCACT  UL125
073_up2W    AGAGTAAAGATTAACTCTTGCATGTGAGCGGGGCATCGAGATAGCGATAA  UL123
073_down2W  ACAATAGTGACGTGGGATCCATAACAGTAACTGATATATATATACAATAG  UL123
079_up2W    CGCGTCCTTTCAAGGTGATTATTAAACCGCCCGTGCCTCCCGCGCCTATC  UL122
079_down2W  ACGGGGAATCACTATGTACAAGAGTCCATGTCTCTCTTTCCAGTTTTTCA  UL122
083_up2W    CTGTTTAATAAAAGTAGCTTTTTTTATACATCTCCGTCTCTGGTCTCGTG  UL121
083_down2W  TAGTTACCCTCTCGACGTCGCCGGCTGTCAATGACGTGCCTGCGTCAGTG  UL121
085_up2W    TCACCTATCCCATCTACGCCGTGTACGGGACTCGCTTGAACGCTACCACG  UL118
085_down2W  GAAGTCAGCGAAATAAAGACAACACAGCAGCCGCTCCTCTCGTTTCTGGC  UL118
094_up2W    CTCGGCCAGGGGGTACCGAGGCGGTGCCCGCGACTCGCCCCTCCTCCAAG  UL62
094_down2W  GTTGGGTGTGGCCGGAAGCGCTCGGGGTCGACGGTGGGCCGCCATGACAC  UL62
097_up2W    ATCAGCAGCTCGCACAGGCGCTGGGCTAGCTGCATCGTGCCGGCGCGACG  UL70
097_down2W  AGATGAGACCGCTGCCGGGGGGCGGGTCACCGGCGCCGTGGAAAGTGAGG  UL70
098_up2W    CTATATATACATCAGCGTGCCCGAACGTGACCTTCCTAGCGACGACGGCC  UL69
098_down2W  TAACGGGATAAGGGACAGCAATCATCACGCACAACACCCTTCACTCTCTT  UL69
099_up2W    GCCGCCGCCGCGGTTGCTACTACTTTCTTAAGTGATGCGAATTGGTGGCT  UL67
099_down2W  ATAAACGTTCTCAACAGGTATGAAATGAACAAACTAGATGATGCTATAAC  UL67
100_up2W    CCAGTGTTCCTTGGAGAGACGAAAAGCGAGCGTGTTTCACGAGATGGCTG  UL65
100_down2W  CAAATACGGTCGTGGCCGAGCGCAAAAAAACGCACCATCGACACCACACC  UL65
110_up2W    GAGCCTGAGATGATGATGATGGCTACGAAGGACGGGCGGACGGGCAAACG  UL64
110_down2W  TAATGACAGAATGAACTCCATGTTATACGCTCTTTATATAGTTTCTCTGC  UL64
114_up2W    GATGCTTAGAGCGTGGAGATTGATGGTACTACTTGCCGCGTACTGTTATT  UL4
114_down2W  TAAACACAATAGCTACAGCTGCGCGGTTCTGTGGAACTTCACGTGCGATC  UL4
115_up2W    TATTGTGTTTACGTTGCTTTTGAAATGTTAAGCGTCCCTACGGCGCTAAC  UL5
115_down2W  ACAAATATGCAAAAGCAAAACACAACAAACTATACACAGCTGGCTAACTA  UL5
116_up2W    TGGAAAGACAGTAAACAGTATGGACAAGTGTTCATGACGGACACAGAACT  UL9
116_down2W  TGAGCTGAAAAATAAACGTACATAGCTTTTAGTTTCCTCGACGGTGATTC  UL9
117_up2W    GTAAACATAATGACGTACATATACGTGGTTATACAACAGGTGTTTGTGCT  UL10
117_down2W  TATATTCAAACAGTGAGTTTGAAACCGGACATATCCGTCCGCTCACGATA  UL10
119_up2W    ACCGTGGCCTGTCCGCCCCGAGAACCCCCGCATCGTGCCCTGTTTCGTCT  UL14
119_down2W  TTCCGTTTTCCTGCCGTGACTGCGAATCATCCGCTTCATGGCTCTCCTCG  UL14
121_up2W    CCCGTGGACGGGTCTCTTTGACACGAGCGCGGCACGCCGTTGCCACGAGC  UL17
121_down2W  TTTGACCCCTCCTATCTTCTTTGATGATGTATCCTCTTAGCCGTGTGTTG  UL17
122_up2W    CTGAAAGTATATAACGCCGATCATGTCCGAGGAACTGTTAATAAAACGCC  UL18
122_down2W  CGGGGGCACGCGGTAACCGACGTCGAAACAGCTCATACAGGGCGTTGATG  UL18
129_up2W    AGTACTGTTTGAGCGTGACTGTTTCCAAATCGTACCGTGGTAAATAAATC  UL7
129_down2W  CGGGCTAGTCATTGTGGGCACAAAACCTTCTCCCTGATAAAAAGCACATT  UL7
130_up2W    CAGAATTATAGTAATGTGCTTTTTATCAGGGAGAAGGTTTTGTGCCCACA  UL8
130_down2W  GTGTACAAAGAATGATTGTTATCCATCGAAGTAATAACGCGTACCGGAAC  UL8
133_up2W    CCCTGATTCCCTTCATAAAGCTGTTGACCGGCCCTAGAAAGACCAAGAGC  UL13
133_down2W  ACGCATAAGCGACCGGGGATGGGGGGAAATAAAGGAATGGCTCGGTGTAT  UL13
136_up2W    GGGCTCCATGCTGACGTAGGTACCGACTGGGGTCAAAAGCCTGGGTACTT  UL16
136_down2W  GGCCTTCTTATAGCAGCGTGAACGTTGCACGTGGCCTTTGCGGTTATCCG  UL16
138_up2W    TGGAACGGTCTTTATATATACAAACGCCGTTATGCTCAGTGTCCGGCAAG  UL20
138_down2W  TTATGGAAAATATGTAGTCCGTACCGCTTGGGGCTCAGAGTCCAAAGTCC  UL20
143_up2W    GAGAGTCTGAAACGGGGTGGGAGGGACTTTTGCGGGTAGTGCACGCTAAG  UL6
143_down2W  TACCACGGTACGATTTGGAAACAGTCACGCTCAAACAGTACTTTTTATTT  UL6
147_up2W    GGGACAGTCCCTACGGAACCTGAGAACATGTGGAAATCACCTGTGGTAGA  UL 11 UL13
147_down2W  GGAGTTGGCGTTTCACAGTGATTTCATGCAATCATTTCCTACGCGACTTG  UL 11 UL13
153_up2W    TACCTACGTAACCTGGCCTTTGCGTGGCGCTATCGCAAGGTCCGGTCGTC  UL19
153_down2W  ACGGACGTAGGTTATTTTGAAAACCTACGTTAATCCTGAACGCGTTTCGT  UL19
179_up2W    CTCTCTAGGTAGGGGACTACCTCCTCGACGGTCCATTCTAGCGGGACGAC  US20
179_down2W  GCATGGCCATCTTTCTCACGTTGTTGCTCATGCTCTCGGGTCCCCGTTGG  US20
238_up2W    ATGGCTAATTGCCAATATTGATTCAATGTATAGATCGATATGCATTGGCC  UL127
238_down2W  ATCAGTACCTGGAGAGCGTTAAGAAACACAAACGGCTGGATGTGTGCCGC  UL127
249_up2W    GAAAAGTAAAAGATGACCGCGCCCTCGGAGTCCTTTTTTCCTTTTCAATC  TRL7
249_down2W  GATACATTAATAAATATATTATATCTGGTGTATATACTGAATGCTGCTGG  TRL7
250_up2W    GGGTACTAAAAAAGTGTTTAATATTGGGGTTTAATGATAAAATCCAGGTT  TRL6
250_down2W  AGTCATCATCCTAAAATTCAGATATAAATGAACACATGTCGTATGGGATT  TRL6
252_up2W    CCTTTTTATGTGAGTTTCTCTTCCGCGTCTCCCGGCCGTACCATCCACCC  TRL4
252_down2W  TGTGCAGGGCATGCGGGGAATCAGGACCGGACACGGGATAATTTCATCTA  TRL4
257_up2W    TGAGAGTCGATTCGATCGGTAAACATCGTAAGCATCGTGGCGGTGGTGTG  UL73
257_down2W  ATGGAAACCTTACCCCGCCGGAACACCGCCGGGCTGTGAACCTGTCCACC  UL73
261_up2W    TCCCCGGAGAGGGTATATTCGTTCGGCGAGAGCGGGCGGCGGTGGTGGGT  UL78
261_down2W  TGACGTAATTTATCTGCCACTTTTCTCCCCGCTGCCGTACAACGCCGCCG  UL78
263_up2W    GAGCTCAGCGGCTGTCCGCGCGACATCTTCTCGCTAATCTGTAATATTAG  UL80
263_down2W  TATCACGGTGTAGAAAAAAAAGAGAGGGAAGCCCTAAATATAGCGTCTCT  UL80
272_up2W    CCTTCTCCTGTTCCCTCCGCCCCCAAAACTGTCAGCGACGCTCAGACGTC  UL92
272_down2W  TGGTCGAGCACCAGATGTAGAGGCAATTGCTCATCGTCAGCGAACCGCGC  UL92
276_up2W    GTGCTAGACCGTTGGAGTCGCGACCTGTCCCGCAAGACGAACCTACCGAT  UL99
276_down2W  GTGTCCCATTCCCGACTCGCGAATCGTACGCGAGACCTGAAAGTTTATGG  UL99
278_up2W    CATGGCGATAGCGGCGGCCCGCTCGCTCGGGAGGCGATGGGGGCGCGCCG  UL101
278_down2W  GCGGCGTAGCTGGCGCGATGCACAGCACGCACCTCAGCCGGCGGCAGACG  UL101
285_up2W    CGATGTCATTGGCCGCTGCGAAGGGAGAAGAGGGGACACGCGAGTAAGTC  UL76
285_down2W  GCGGTCGCCGCGTCAGACGGGGTGGCGGGTCCCGTGATGGCATCGTGCCG  UL76
312_up2W    GTTGACGGCAGTTCTGAACCCACGTCGCCGCGAGCGCGGTTTGCATCACG  UL88
312_down2W  CATGGCCACCTACCTGTGTGACGAGATACACGCCATCCGTTTCAGGGTCA  UL88
316_up2W    GCGCGCCCATAAAAACGAAAGTGTCGTCGTCGCGACCCGCCACAGCCGCC  UL91 UL92
316_down2W  CGTAGAGCGAGTGTAACTGGATCTCCTCGGTAAACGCGTTCTGGACGTGC  UL91 UL92
317_up2W    TAGTCGTAAGAAGCGCGAGGACGCGCTTCTGAAACAGATGCGTTCCGAAT  UL93
317_down2W  CGGTAGAGCAACAGCAACTGGCATAAGATACACGAGCTGTCGTCCTCCGG  UL93
320_up2W    TCGGTGTGGTAGCTAGTGCAGCTCTAGGAACAGGGAAGACTGTCGCCACT  UL97
320_down2W  TACCTTCTCTGTCGCCTTTCCCCTCAGCAACCGTCACGTTCCGCGTCCCG  UL97
321_up2W    AGAAGGTACAAACCCACCGGCGGGGAAAATACCGAGGCGCCGCCATCATC  UL98
321_down2W  GAGGGATGTTGTCGTAGGAGCGTAGAGACACCTGGCGACCCAGAGCATCT  UL98
325_up2W    GTCGGCGAAAAAAGACCCCGCGGGCCTTCGCGACTCTCTTCTGTCCGAGG  UL102
325_down2W  TTTTTACTAGTATCCACGTCACTTACCCACGTAGTTCCCCTACGTGACTC  UL102
331_up2W    TTTCGACCTGTGTACCGATTCTGTTCTGGACTATCTGGGACGGCGTCAGG  UL77
331_down2W  CCCTCTCCGGGGACGCTCGCCCTTTATGCAGCAAGCGACACGTGGTGGAA  UL77
339_up2W    GGCGTGAGCGCGAGGCGTCGGAGCTCGGGGAAAGCAGCGCGACCCGGAGA  UL87
339_down2W  TCGGACGCTCCTCCGGACGAAACGCCGCGGCGGCAGCGGCCGCGGCTTCC  UL87





345_up2W
TTACTGGGTGCTGCCGGGCGGCTTTGCTGTGTTCTC
345_down2W
TCCTTTTTTTGTTGTTTCTTGTTTCTTCTCCCCGTG
UL94






GCGCGTCACTCTTC

AACTGTCAGACCCC





347_up2W
GCAGCTCCGCGTAGCGCTCCTGGATCTTGGCGGCCG
347_down2W
GCTGACGCGCTCGTCTCGACCGCACAAGCGCCGGCC
UL95






AGTCTCCGCGCAAC

CCGCCGCCGCCACC





348_up2W
TTGCTGGACGCCCTCTCGCTGAACGACGCGGGTCTC
348_down2W
TTTTTTTTTAATAAAATCTGAACAGAGGCGTGACGG
UL96






ATCACGTTGAATCT

GGATTGCTATACCT





362_up2W
TATAAAATTCACTCAGTGGCGGCGTAGCCATTGTCT
362_down2W
TGTTGCGATGCTCGTGGCTGCGGCGGCCGTTGTCGC
UL 57






TCCGTTCATCCACC

GGCGTCTGCTGGCG





366_up2W
CAAGAGACCACGACGCGCCTCATCGCTGCTGGATTT
366_down2W
ATCACAAGTCTCTGTCACTTTTTTTGTCTAGTTTTT
UL 55






GGCCCGCGACGAAC

TTTTCTCCTCTTGG





378_up2W
TCACTTTATTGAAATCTACCTGATTTCTTTGTTATT
378_down2W
AAGACGCCCGGCGTCTAATAATACAGCCGCGCCGAG
UL 45






TTCCTCGTAAACTT

CCAGCGGGCCCCCG





379_up2W
CTAGAGCGCGTGCCCGGGCACGCGGCCTGCGCGCAC
379_down2W
GACGGCGACGGTGGTAACTGTGGTGGAGACGGTACC
UL43






GGCGCGGTCCCGCG

GACGGCGTCCGCGG





380_up2W
TCGGTACCGTCTCCACCACAGTTACCACCGTCGCCG
380_down2W
TTATTCCGTAGCAGCAATGATGGTACAGTCAAGCAC
UL42






TCACTGCCACCGAC

ATGATCTATTTCCC





382_up2W
GATGTACGTACCACGGTACGGACATTAACGTCACTT
382_down2W
GAGAACTACGGCGCGGCGGCACGGCCTTTATAGACA
UL37






CCAACGCCACGAGT

CTATCAGCGTTGAC





384_up2W
GCTGTCAGGAATACCTGCACCCCTTTGGCTTCGTCG
384_down2W
AAACATGCACATAAACAAACGGGACCACCGTGCTCG
UL36






AGGGTCCGGGCTTT

TCATCCTCTCCTCA





388_up2W
CGGGCGCAGTCCGGGGCGACGACGCTTCCGGGTTCT
388_down2W
TCACTATCCGATGGTTTCATTAAAAAGTACGTCTGC
UL32






GGAGAAAAGCCAGC

GTGTGTGTTTATTA





393_up2W
GTTGAAAACGCGCATGATCTCGCGGAGCCATCTACG
393_down2W
TCCACACGCTCAGCCGCGACTGAGCGCCGGGGCGCG
UL30






CGCCTGTCAGGGAG

CCGCTACTTGGGTT





394_up2W
ACTGCTGCTTCTGCTTTTTTGTCTCCTGTGGATCGT
394_down2W
CGGTTATAAAAACACCGTCGCCCTATTTCTGGGCGT
UL29






CGCGGACTGCCGGC

GTGTACACTGATGA





397_up2W
GGGGCCCTCGGTGCGCTACCGGGCCCACATTCAAAA
397_down2W
CTCTGTCTTCTCCGGGTTTTTTTTTTCATGTTTTTT
UL26






GTTTGAGCGTCTTC

TTTCTTCCTATTTT





398_up2W
AGAGGCCCCGCCTAGGTGGGCGGAGCGGTAATTTTC
398_down2W
AATCATCTCTGATGACGTAGCGAGCGAAGCGAGCTA
UL60






CACCGCCGCGGCCC

CGTCATCAGTCCGT





400_up2W
CACCGCCTCGCCGGCCACGGGGTTGATTCCTGTTCT
400_down2W
AAAGATCCGAACTTTAAAATTGTGTATTTTTATTTT
UL59






TATGCCGACACCAG

CCCATCCCCCTCTT





407_up2W
ATTTGCTTTGTGATTTTGCTTCGTAAGCTGTCAGCC
407_down2W
AGTCTCAGCAGCATTATCACCGTCCCCAGTCACCAC
UL 54






TCTCACGGTCCGCT

CGCCGCCGCTGTTT





411_up2W
TACTCGGATTCATGGCGATCGGCGCCGCTGATTGAG
411_down2W
ATCCTGATGGAGAACCTTGTTCATCTCCATCGCACC
UL51






GACGCGGAAAAAGA

GACGCCACCGCCGA





423_up2W
CCCGCAGCTGCTCTATCAACTTTTTGAAATCTACCG
423_down2W
TGTGTTTATTTTTTTCTTCTGTGTCTCCTCCCCGTA
UL46






TGCGCCTCGCCATC

TGCTGTCAGCGCCG





426_up2W
TTTCAAGACGACGTGAGACCCACACGCGGGTTTCAC
426_down2W
AGTCCCTTCTTATACTATCCCGGAGTCTGTGGTTTT
UL37






TTCTTTCTTTAATT

TTTGTTTACCCCTG





452_up2W
GGCCGGCGCCAGACCGGACGACAGCGTCTCGTACGT
452_down2W
CCACGAGTAGAAGATGAGGAAACCGCAGCACCCAGA
UL56






GAGCGAGTCGAGTC

CAGACGATACACAA





459_up2W
CCCGCTGGTGCTGGCTCTCCTGCTGGTGCTGGCTCT
459_down2W
TGACGGTGTTTTTCGTCCCGCTTGTTGGCCACCGTG
UL49






GCTGTGGCGCGGTC

GGTCCCGGCGCGGT





471_up2W
TTTCGCTCGCTCGCGCCCGCTCCTTAGTCGAGACTT
471_down2W
TCCATCGCGGGACCGCGCCGTGCGCGCAGGCCGCGT
UL44






GCACGCTGTCCGGG

GCCCGGGCACGCGC





472_up2W
AGAAGGGACTTTACCGCTATTGCTGCTATTCATAGA
472_down2W
ACTACAAAAAAAAAAAGCTGAACATGGTCATCTAGC
UL38






GAAGGATAGAAAGG

AGCAAAGTTCTCCT





484_up2W
CCACGGCGGGTCGTTGGCTCCCGCTGTGCTGGCCGC
484_down2W
GGCGGTAAAGCCAAACACCGGCTATATAGCTAGTCA
UL28






CGCTGCACGGCATC

TCACAGTCTCCTCC





485_up2W
CCGCCGTCGCTCCGCGTCGCTTCGCCGCCACCTTCT
485_down2W
GCGCCTCGTCGGTCGATGACCCCACGGTGCTTATAA
UL27






TCTTCCTCTCAGTC

CGCGCCGCCACGGC





490_up2C
TTCAGAACGAGGTGCTCATCAACTACTGCGACATCG
490_down2C
GTGGTTTTTACCCTGCTCAATAAAGTCACGTTTTCC
UL105






CCGACAACTGGGTC

TTACACGGTGTTGT





504_up2C
TCCAACGCGCCTGTGGAGGGCCAATCGGACCGCGGG
504_down2C
AATACAAATAAAAAAAGACGCTGTGACACTTTGGCT
US25






AGCTCTCCAAGTGG

CTTTCCTGTGCACC





511_up2C
AGACGGTGCAGGAGTCCGAGGCGGCGGCGACGGCGG
511_down2C
AATGTCCAAGCGCGTCCTGTTTCATAATTTTTCCGG
UL 113






CGGCTGCGGGGTTA

TCTCGGCTCGGTTT





520_up2C
GCTCCACGGCCTCCGACGAGCGTTGCGCTCGCGCTT
520_down2C
CCACCAGCGCACCAACACCGCTCGCCTGCTCGCTCG
UL 112






TGCGCCGCCGCGTC

TGCGCTACGGGGGG





526_up2C
CTACCTGGGACGCGCAGTTGGGCGGCGGACTGGGGC
526_down2C
TCGAGCCACACGGAGTAGTCGTCCTCACGTTGCTAC
UL 111a






GGCATGCTGCGGTG

AAGAGGAAAACTAC





530_up2C
TCTTTTTTCTTTTTAGTCGATGGAACTTTTCTTCGG
530_down2C
AAGGATCATATATATCTCGTCAGGGAAATACAAGTT
UL108






TACGGGTTCTTGTT

AGACCATAATGTTG





542_up2C
CGACATCGGTGACACAGCTTCAGAAACAACGTGTGT
542_down2C
AAAGACAAATGAGACGCTGAAGGCCGCGATCAGCCT
US 30






GGCGCACGCTACTT

CCCGTCTCTTTATT





543_up2C
GTCGGTGTCTCGTCGGTGAGACGAGGCCGCCGCCCG
543_down2C
CCCCGCAGATATCCGGTTGATGTAGCCAGTCGCCTA
US31






ACAAGTTCGATCTC

CACGCGACTTATCG





544_up2C
CGTTGTCATCCGGCTTAGAGCAAACCGTCCTTTTAT
544_down2C
CACACATCACACGGGGATTTACGCTATGTTGTTTAT
US 32






CATCTTCCGTCGCC

TGTCATGCCGTGTT





546_up2C
CGCCGTCGGCACTTGGCTTCAGAGCAGCGCCTCGGG
546_down2C
ATCGCGGCACAACGACTGGACGACGTCGTTTACGTA
US 34






GCGATGCGACGGCG

ATTTTAAGAAGAAT





557_up2C
GTGCGTGGACCAGACGGCGTCCATGCACCGAGGGCA
557_down2C
AGAGGGGCGGACACGGGGTTTGTATGAAAAGGCCGA
US 28






GAACTGGTGCTATC

GGTAGCGCTTTTTT





558_up2C
CGGAAAAGTTTATGGGGAAAAAGACGTAGGAAAGGA
558_down2C
CGGCACTGTTCTCGAATGGACATGTTTCGTCCGACA
US 29






TCATGTAGAAAAAC

TCGACAGTGCAGCC





582_up2C
CTTGGCAGAGGACTCCATCGTGTCAAGGACGGTGAC
582_down2C
TTTACAAATTCACATATACAACAACGCCGTCCCCCG
UL124






TGCAGAAAAGACCC

TGCCCGCAGTTTTT





592_up2C
GGGAAGACGCAGTGATCCGTCGGTGTCTGCGAGAGT
592_down2C
GTACTCGTCGTGTCCGTGATCACGTACGTTTTCCAA
UL71






ACGTTGGCGACTAT

AACGTGCCAGGCTG





626_up2C
TTTTTTCCGGATCGGCCCGATTTCTTTTTGTCCACC
626_down2C
ATTTACAGGAACGGGGAAAAAAAAGGCACACGGTCC
UL23






GACGCGCGACCGCG

GTGGGAGACGCGGG





627_up2C
TTTTTAGAGCAGAACCTTACAGCTTTTTAATAAAAA
627_down2C
GCGCAGGTAAACAGGTAAGAAATACAAAAAATAACG
UL 20a






ACAAGATAGTCAAC

TGATTGTGAACGCG





639_up2C
AAAGAACAAAAAACACCCATCCCAGCGGTACCGTAC
639_down2C
CACGACCTGCGCCACTCGGACCGCTCCTGCGACCTA
UL15






CTCGGCGACGCTCC

GCTTTCGGATCTCG





642_up2C
GCAGCGGGAGCAGATGATAACGCAAGAAGCGACCGC
642_down2C
TACCGCAAAAGCTGTGGCTGCTCTGGCAGCATGACA
UL12






AGTGGGCCCACAGC

AGCACGGCATCGTG





650_up2C
TTTACCGTACCCAGACAACGGTGCTTTATAGACTCA
650_down2C
TACTGAGCGTGCGAACCGGGTAGGGTGCCGAACGAC
UL3






TCACTTAAGGCGGG

GGGTATGCGTCGTC





653_up2C
GGATTCTTCTCAGGGCGGCCAGAGCGTGCCGGTATC
653_down2C
CGTCGGTGTTTTATGCCCCAAGCAGCGTCGTCGTCA
UL48






TCAACGGATGGAAC

CTCGTGGCGTCACA





655_up2C
GCGTCTGGCTGTGTGCCGTTAAATACCTTGGGTGAC
655_down2C
GATGTAAATAAAATGCTTTTATTTAAAACTGGTCCC
UL21






GACATCTCGAGGTC

AATGTTCTTCGGGA





666_up2C
GATTCCAAACCGGATACGCTACATACCTGCCACAGT
666_down2C
GCTATGTTACCACAGGAGATCACGGAACATAAATGT
UL2






GGGCAGCTTTTACC

TTTCTGCGTATGTT





670_up2C
CGCTTTGTGTATTTAGACGAATCTCGGCGATAACCG
670_down2C
ACAAGCGAGCGAGTGGGGCACGGTGACGTGGTCACG
US 21






CCGGCGTTGCCGCC

CCGCGGACACGTCG





676_up2C
CGGAACTGGTTTTCGGACAGAGCAGCCGTTTCCAGA
676_down2C
TCTCCATGTCGGGACCGCAGCGCCCGGCGGCGTATC
US 15






GAACGCAGCGCACC

CGCAAGGTCTCGAA





679_up2C
TTTCGCGCAGCGCGCTTTATCCGACTCGCTGTCGAG
679_down2C
TGCAGAATCATAAGTTTATGATGAATAAAAACGGGG
US22






ACGGCTCCGCCGGC

AAAGGGAATCTGCT





680_up2C
CGTGACCTCGGTGGTGTGCGATACGCAGGACATCCT
680_down2C
AGCATGGCGACAAGCGCGGCTGCTGTGAAAACGGGC
US 20






GCACGACATCGAGT

GCGGTTTTATAGGC





681_up2C
GTTTTCACAGCAGCCGCGCTTGTCGCCATGCTTCAT
681_down2C
CGTCTTATCAGCACCCGGTTACCGCGGATTTGATTG
US 19






GTCGTCCCGCTAGA

ACGTCACGAGTGTG





682_up2C
ACTGTTTCATCGACGCCTACCTTAGACCGACAGCGG
682_down2C
GAAGGTGGGGAACGTTTAAGCGAGCAGGAGCGTGTC
US 18






TCGTAAGCGGCAGC

ATCTCCCCCATCTT





683_up2C
ACACTCTATAAACGGTTTTTCATACGCGCCTTTTGA
683_down2C
ATTGGTGGAGACGGCCGGCGCGGCGGGTGGGGGAAA
US 17






TCGCCACCGCCGTC

CGACGAGTTTTTCC
US12





684_up2C
CCCCACGGATCTCGCGCCTTAGACGCACGGTCATAT
684_down2C
GCGTTCTCTGGAAACGGCTGCTCTGTCCGAAAACCA
US 16






AGCCTCCGGCTGTC

GTTCCGAACGAAAA





686_up2C
AAGACTCCACCGAGACGCTCACCCGTTCACTCGGGC
686_down2C
GCTTCAGGTACCCGGCAAGTTTTATAGAGAAAGGGG
US 13






GCATCACCCGCCTC

GACGATGGGTGGTG





687_up2C
CTCTTTCTCTGCTTCTTTTCTGGGGTGTCTAGCTGG
687_down2C
AGCAGCGTCAGACGAATCGCGGCTGGTGGCCCTGGG
US 23






CGGCCTCTTTTGAC

GGTGGGACGCGCCG





692_up2C
CTAATGCCTATAAAACCGCGCCCGTTTTCACAGCAG
692_down2C
GACGTCACGAGTGTGGTCAAACCGTGGCGGCACCCT
US 19






CCGCGCTTGTCGCC

GTATCCGACCCGTC
US12






FAMILY





696_up2C
CTGTAGCTTCGAGACCTTGCGGATACGCCGCCGGGC
696_down2C
CGAGTGAACGGGTGAGCGTCTCGGTGGAGTCTTCTT
US 14






GCTGCGGTCCCGAC

ATAAACCAGCGGAG





700_up2C
CCTCGCCTATTTAACCTCCACCCACTTCAACACACA
700_down2C
GCGTGGCGGCGAAATACGCGATCCCTGGGCTGGTAG
TRL1






CCTGCCGCACAATC

ATCCCCCTACCCCG





710_up2C
GGACGAGGACGACGACGTCTGACAAGGAAGGCGAGA
710_down2C
TATTTGCGTATATGATGACTTGTTCCACCGTCGATG
TRL 11






ACGTGTTTTGCACC

TTGTGTGCGCATCT





720_up2C
GGGGTGGCGGTAGTGGTGCTGCTGATGGTAGTCGGG
720_down2C
ATACCATGGGACCCCTTTTCGTCACACACGTCTTTC
TRL5






ACGGAGGAGAGACG

CGCTTACTCAACGC





735_up2C
GAGTTCAGCGTGCGGCTCTTTGCCAACTAGCCTGCG
735_down2C
GACCCAATAGCAGCCACAACGCCGTCAAGAACGGCG
UL130






TCACGGGAAATAAT

TCAGGTTTTTGGGA





738_up2C
CCATCCCGAGCACTCCACACGCTATAACAGACCACG
738_down2C
CAAACCTCGGTTTCTTCCTATTCTTAAGTTTTCCCT
TRL2






GACACGGCAAATGC

AGTATATTTGCCTC





746_up2C
TGCGGCGGCGACGACGACAGCTGCGATTTGTCGGCC
746_down2C
AGGAAACTGGAGAGAGCCACAACAGAAACAGCGTGG
TRL8






GACATGCCGATGGT

GACTGTCCGCTGTT





747_up2C
GTGGTGAAAGAAGAGCACCAGCAATCCCAGGAGGAG
747_down2C
CTGTCCATCTCCCTGTCTTTTCGCGCCGCCGGTCCC
TRL9






CAACAAGCCCTCAC

CCCCAAACCATGTC





748_up2C
GTGCGGGGAGGATCGACGTGTGCGGTGCTTGTGGAA
748_down2C
AGGGGGGTGCTGTAGGTCTGCATGGTGCAAAACACG
TRL10






CACGGTGTTTTAAT

TTCTCGCCTTCCTT





755_up2C
ACACGTCGTTCGCGGACATAACGAGAAATCCACGTC
755_down2C
CGAGGTGATGGGGCGGGGAAAGAGTTGGAACCGAAA
UL132






GCCACGTCTCAAGA

GACAAAAAAAAAAG





758_up2C
TTGTGGCTGCTATTGGGTCACAGCCGCGTGCCGCGG
758_down2C
CTGTAGCAGACTTCGCCGTCCGGACACCGCAGCCTG
UL129






GTGCGCGCAGAAGA

TGGATTCATGAAAA





773_up2C
TAGTGGCGTGCGCGACCCCCAGTCGGTTGAGTTCCG
773_down2C
TTGTCCTCGGATGCTCTGTGTAGAGAGGAGACAGAA
UL90






CCAGCAACGAGTTC

AAGGGACTCTTATG





774_up2C
CCAGTGACGCCACGTGTTTCTTGACGCGCCTCAACA
774_down2C
TTCTGCCGATGCCGGCGTCAGTCGCCGGCACCTGGT
UL89






ATGCGCCCTTTGAC

GGCTCTGCTGCGTG





778_up2C
CGCGCTGCTTTCCCCGAGCTCCGACGCCTCGCGCTC
778_down2C
GGTGACTCGCCGCTAACCTGCGGTCGTCGCCGTCCT
UL 86






ACGCCGCCGCCGCG

CCTCACCGGACGGC





779_up2C
CGACGAGATCGCGCGGCTGTCGGCGCTTTTCGTCAT
779_down2C
CCGTATCGCGCGGACGCCTAGTGTCCGTTTCCCATC
UL 84






GCTGCGACAGCTGG

ACCAGGGTTCTCTG





780_up2C
GCCGCAGAGGGCGCGCCGCTCAGTCGCCTACACCCG
780_down2C
GTGGACGTGGGTTTTTATAGAGTCGTCCTAAGCGCG
UL 83






TACGCGCAGGCAGC

TGCGCGGCGGGTGG





781_up2C
CCGTTCACCTTTGCGCATCCCCTGACCCCCCCCCTC
781_down2C
AAATACAGGGAATGGGAAAAACACGCGGGGGGAAAA
UL 82






ATCCCGCCTTCGCG

CAAAGAAGTCTCTC





783_up2C
TCGTCCATCGTCATTGTCGTCACCGTCGCTACCCGC
783_down2C
GCGGCGTTGTACGGCAGCGGGGAGAAAAGTGGCAGA
UL79






TCACCGAGCGAACG

TAAATTACGTCAGG





794_up2C
CGGTAGTTGCGGCAGAGGGGTTGTTATCTGTCGTTC
794_down2C
CCGCGCACCGTAAAGTCGAGCACTTGCGGCTCCATG
UL104






GTTCAACGCGACTG

ATCATCACATTCTG





819_up2C
GCCAACCACCACCTGGATCACGCCGCTGAACCCAGC
819_down2C
ATGTCTTTAACTTTCTCTGTCCCTTTTCTCATAAAC
UL 75






GGCGCGGCCGCGCT

TGTCAGGTTCTACA





823_up2C
CACGGCAGACGAGGAGCGGCGCGGCCCAGAGCGTGT
823_down2C
ACTACGTGTTGCGTGTTTTTTTTTCTATGATATGCG
UL103






CGGCCGATTTCGAA

TGTCTAGTTCGCTT





827_up2C
CATCGGCGCGCCCCCATCGCCTCCCGAGCGAGCGGG
827_down2C
TGTCTCTTTTTTATGTCCATGTCTCCAAGTCTGGTG
UL 100






CCGCCGCTATCGCC

CGGGTGGCGGCGGG





832_up2C
CCTCTCGCCGCTGCCGCCTAACCTCCGCTCGCACCA
832_down2C
GTGTTCCTGTCCGGTGCTTAAGAACCTAGTGCACTA
UL89






CCGCCGCCGCCATC

ACGGGGTCTGACAG





839_up2C
GTTGTTCGTCTCCGCTTCTCCTCCGTCGCGGCCACG
839_down2C
TTGGGGTCGGCGCGTGGCATGCTTGGTGTCTGCGGG
UL85






ATTTCACCGCCGCT

CGCGAGAGGGCCGG





851_up2C
GCAAGCCAAACCACAAGGCAGACGGACGGTGCGGGG
851_down2C
TTCTCATGGGAGTTTTTTGTATCGTACTACGACATT
UL74






TCTCCTCCTCTGTC

GCTGTTTCCAGAAC





852_up2C
CATGTATGCAGGTAAGCAACTGAGCCGAACGCACCT
852_down2C
TCCTGTGACTTTTTATCATAAACCGTTCCGCCCTGC
UL25






CAGCAGACGAGAGG

TGCTTCGTTCCACC





857_up2C
CCGCCTAGAACCGCAGTACCAGTACTCCGCATGTCA
857_down2C
GGGGAAATGGCGACGGGTTCTGGTGCTTTCTGAATA
UL33






ACAGTACCTGTAAC

AAGTAACAGGAAAG





860_up2C
ACACACACCACACGTCACGACACCGATCGATTTTCT
860_down2C
GAAAGCGCTTTTGGGCTCACCCATCTGCAGTCCTGT
UL39






TTATTCTTAGTGTG

TGCCTGAACGAGCA





868_up2C
ATCGACCCGCCCGCCGGCTCGACATCGGTGTCCCTG
868_down2C
AAAAACGATAAAAAGCCTATTGTTTTTATTACCCGC
UL48






CCGCCGGCCTCGCC

TACTGTCAGTGTCG





896_up2C
GGCCCGCTCGCACGGACCTATACTATTACCGCCCCA
896_down2C
AAAACCAGAGCGGAACTTGAGAAATCAACGCTTTAT
UL34






CCGCCGTCGTCGTC

TGTTCTCCAGTGAC





897_up2C
TTCTCAAGTTCCGCTCTGGTTTTGGTTTCGTTTTCA
897_down2C
TATCAACGTCTCGTCCTGAGACAGACACGTATAAAA
UL35






AAGGGAGCCCCATC

AGAGGAAAACCGCG





911_up2C
TGTCCTCGTCGGCCGGGTCGCGCGGCCGTTTGGCCA
911_down2C
GCGCTCCAAAGCGAGCGATGTCGCCCTGGTGGCAGC
UL47






CCGCGCGCGCGTCC

TGGCCTGCGTGACT





918_up2C
TAGCCCAGGACATTCTTTTTCCGCGTCCTCAATCAG
918_down2C
AGGGAGCGCAAGGCTGAGCGTCGTTCGCGCGGCGTG
UL52






CGGCGCCGATCGCC

CGCACGCCGCTCAC





950_up2C
AGTCGGCTACATGCGCCCTGGGTCTGACGCTCCAAA
950_down2C
TAATGAAACCATCGGATAGTGACGTGTCGGGAAAGG
UL31






GCGTACGCAGTCTG

AGGACGGACGGAGG





986_up2C
GGAGAGTTGCGACATCAAGCTGGTGGACCCCACGTA
986_down2C
TGGTGCTGCCGCGGCGCTTGCACTTGGAGCCGGCTT
UL53






CGTGATAGACAAGT

TTCTGCCGTACAGT






Genes essential for replication of HCMV are identified. The ORFs essential for replication are set forth in Table 1:













TABLE 1

ORF | Sequence Conservation | Gene Function
UL32 | β-herpes | Tegument
UL34 | CMV | Unknown (Transcription)
UL37.1 | β-herpes/CMV | Anti-Apoptotic
UL44 | Core | DNA replication
UL46 | Core | Capsid
UL48 | Core | Tegument
UL48.5 | Core | Capsid protein
UL49 | Core | Unknown
UL50 | Core | Egress
UL51 | Core | DNA packaging/cleavage
UL52 | Core | DNA packaging/cleavage
UL53 | Core | Egress
UL54 | Core | DNA polymerase
UL55 | Core | Glycoprotein B
UL56 | Core | DNA packaging/cleavage
UL57 | Core | ssDNA binding protein
UL60 | CMV | Unknown (OriLyt)
UL70 | Core | Helicase/primase
UL71 | Core | Unknown
UL73 | Core | Glycoprotein N
UL75 | Core | Glycoprotein H
UL76 | Core | Unknown
UL77 | Core | DNA packaging/cleavage
UL79 | Core | Unknown
UL80 | Core | Capsid assembly
UL84 | β-herpes | DNA replication
UL85 | Core | Capsid
UL86 | Core | Capsid
UL87 | Core | Unknown
UL89.1 | Core | DNA packaging/cleavage
UL90 | CMV | Unknown
UL91 | β-herpes | Unknown
UL92 | β-herpes | Unknown
UL93 | Core | Unknown
UL94 | Core | Unknown (Tegument)
UL95 | Core | Unknown
UL96 | β-herpes | Unknown
UL98 | Core | Alkaline nuclease
UL99 | Core | Tegument
UL100 | Core | Glycoprotein M
UL102 | Core | Helicase/Primase
UL104 | Core | DNA packaging/cleavage
UL105 | Core | Helicase/Primase
UL115 | Core | Glycoprotein L
UL122 | β-herpes | IE2 (transcription)







The sequence conservation indicates whether an ORF is strongly conserved with the core group of herpesviruses, with the β-herpesviruses, or only with cytomegaloviruses. See Table 6 for genes previously identified as essential for replication.






In one embodiment of the invention, a cytomegalovirus comprising a deletion in one or more ORFs essential for replication is provided. As described below, libraries of such cytomegalovirus may also be provided.


In another embodiment of the invention, open reading frames essential for viral growth are targeted by antiviral drugs designed to treat a cytomegalovirus infection in humans. Screening for such agents may involve contacting a polypeptide encoded by an ORF essential for replication with a candidate agent. Therapeutic agents that may be developed against these identified viral genes include, but are not limited to, polynucleotide-based compounds that target the mRNA transcribed from these essential regions, small-molecule compounds designed to inhibit or bind to the proteins encoded by these essential genes, and recombinant protein-based molecules, such as monoclonal antibodies, that bind to the protein products encoded by these essential genes.


In one embodiment of the invention, a cytomegalovirus comprising a deletion in one or more ORFs designated as causing severe to moderate growth defects is provided. Such viruses can be used to construct human cytomegalovirus vaccines. As described below, libraries of such cytomegaloviruses may also be provided. Deletion of these genes results in attenuated viral growth in tissue culture, ranging from 10-fold less than wild type to a severe growth defect. These ORFs can be deleted to create an attenuated or weakened virus, which can then be used for vaccination against human cytomegalovirus infection.
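
As a rough illustration of the 10-fold figure above, the fold reduction of a deletion mutant relative to wild type can be computed from titers measured under the same conditions. The following Python sketch is not part of the described method: the function names, the example titers, and the use of the 10-fold figure as a cutoff for calling a mutant attenuated are assumptions made for illustration.

```python
def fold_reduction(wild_type_titer: float, mutant_titer: float) -> float:
    """Return how many times lower the mutant titer is than the wild-type titer."""
    if wild_type_titer <= 0 or mutant_titer <= 0:
        raise ValueError("titers must be positive (e.g. PFU/ml)")
    return wild_type_titer / mutant_titer


def is_attenuated(wild_type_titer: float, mutant_titer: float,
                  min_fold: float = 10.0) -> bool:
    """True when the mutant grows at least `min_fold` times worse than wild type.

    The default of 10 mirrors the lower bound quoted in the text; treating it
    as a hard cutoff is an assumption of this sketch.
    """
    return fold_reduction(wild_type_titer, mutant_titer) >= min_fold


# Hypothetical titers in PFU/ml, chosen only to exercise the functions.
print(fold_reduction(1e6, 1e4))
print(is_attenuated(1e6, 5e5))
```

A ratio of titers is used rather than a difference because growth defects of this kind are normally reported on a fold (logarithmic) scale.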


Open reading frames identified as non-essential for growth, but which have a severe or moderate growth defect when deleted include the following ORFs:









TABLE 2

SEVERE GROWTH DEFECT (12 mutants)

Genes | Conservation | Function
UL21 | CMV | Unknown
UL26 | CMV | Tegument (transcription)
UL28 | β-herpes | Unknown
UL30 | CMV | Unknown
UL69 | Core | Tegument (transcription)
UL82 | β-herpes | Tegument (transcription)
UL112 | β-herpes | Major early protein
UL113 | β-herpes | Major early protein
UL117 | β-herpes | Unknown
UL123 | CMV | IE1
UL124 | CMV | Latent transcript (ORF 152)
US26 | β-herpes | Unknown

















TABLE 3

MODERATE GROWTH DEFECT (23 mutants)

Genes | Conservation | Function
UL2 | CMV | Unknown
UL11 | CMV | Glycoprotein
UL12 | CMV | Unknown
UL14 | CMV | Unknown
UL20 | CMV | TCR homolog
UL29 | β-herpes | Unknown
UL31 | β-herpes | Transcription
UL35 | β-herpes | Tegument/Transcription
UL38 | β-herpes | Unknown
UL47 | Core | Tegument-DNA release
UL65 | CMV | Unknown (pp67 virion protein)
UL72 | Core | dUTPase
UL74 | β-herpes | Glycoprotein O
UL88 | β-herpes | Tegument
UL97 | Core | Protein kinase
UL103 | Core | Unknown
UL108 | CMV | Unknown
UL114 | Core | Uracil DNA glycosylase
UL129 | CMV | Unknown
UL132 | CMV | Unknown
US13 | CMV | Unknown
US23 | β-herpes | Unknown
TRS1 | CMV | Transcription/egress










Open reading frames identified as lacking an effect on growth can be deleted for construction of gene therapy vectors. Deletion of these wild type-like growth genes results in no significant deviation of viral growth from wild-type levels, indicating that these regions can be removed from the viral genome without affecting viral growth in vitro. Deleting these genes makes more space in the viral genome to accommodate foreign genes expressed in a gene therapy procedure. These wild type-like growth genes present an advantage over other dispensable genes whose deletion attenuates growth, in that high titers of the gene therapy vector can be attained because near wild-type growth characteristics are conserved in tissue culture.









TABLE 4

GROWTH LIKE WILD TYPE (66 mutants, 76 ORFs)

Genes | Conservation | Function
UL3 | CMV | Unknown
UL4 | CMV | Glycoprotein
UL5 | CMV | Unknown
UL6 | CMV | Unknown
UL7 | CMV | Unknown
UL8 | CMV | Unknown
UL10 | CMV | Unknown
UL13 | CMV | Unknown
UL15 | CMV | Unknown
UL16 | CMV | Immunomodulation
UL17 | CMV | Unknown
UL18 | CMV | MHC homolog
UL19 | CMV | Unknown
UL24 | β-herpes | Tegument
UL25 | β-herpes | Tegument
UL27 | β-herpes | Unknown
UL33 | β-herpes | G protein receptor
UL36 | β-herpes | Anti-apoptotic
UL37.3 | β-herpes | Unknown
UL39 | CMV | Unknown
UL42 | CMV | Unknown
UL43 | β-herpes | Tegument
UL45 | Core | Ribonucleotide reductase
UL59 | CMV | Unknown
UL62 | CMV | Unknown
UL64 | CMV | Unknown
UL67 | CMV | Unknown
UL78 | CMV | G protein receptor
UL83 | β-herpes | Tegument
UL89.2 | Core | DNA packaging/cleavage
UL109 | CMV | Unknown
UL110 | CMV | Unknown
UL111a | CMV | IL-10 homolog
UL116 | CMV | Unknown
UL119 | CMV | Fc receptor
UL121 | CMV | Unknown
UL127 | CMV | Unknown
UL130 | CMV | Unknown
UL146 | CMV | Chemokine
UL147 | CMV | Chemokine homolog
(US1) | CMV | Unknown
(US2) | CMV | Immunomodulation
(US3) | CMV | Immunomodulation
(US6) | CMV | Immunomodulation
(US7) | CMV | Unknown
(US8) | CMV | Immunomodulation
(US9) | CMV | Unknown
(US10) | CMV | Immunomodulation
(US11) | CMV | Immunomodulation
(US12) | CMV | Unknown
US14 | CMV | Unknown
US15 | CMV | Unknown
US16 | CMV | Unknown
US17 | CMV | Unknown
US18 | CMV | Unknown
US19 | CMV | Unknown
US20 | CMV | Unknown
US21 | CMV | Unknown
US22 | β-herpes | Unknown
US24 | CMV | Unknown
US25 | CMV | Unknown
US27 | CMV | G-protein receptor
US28 | β-herpes | G-protein receptor
US29 | CMV | Unknown
US31 | CMV | Unknown
US32 | CMV | Unknown
US33 | CMV | Unknown
US34 | CMV | Unknown
RL1 | CMV | Unknown
RL2 | CMV | Unknown
RL4 | CMV | Early protein
RL6 | CMV | Unknown
RL9 | CMV | Unknown
RL10 | CMV | Glycoprotein
RL13 | CMV | Unknown










Virus encoded temperance factors that suppress viral replication are identified as follows:









TABLE 5

ENHANCED GROWTH (4 mutants)

Genes | Conservation | Function
UL9 | CMV | Unknown
UL20a | CMV | Unknown
UL23 | β-herpes | Tegument
US30 | CMV | Unknown










These ORFs encode repressors of growth that facilitate pathogen temperance. Counterparts of temperance factors can be found in related viruses. The genetic sequence of such temperance factors can be modified to modulate virus replication, e.g. in the development of vaccine strains, for research purposes, and the like. The temperance factor polypeptides are useful as targets for drug design, as targets for immunological agents, and the like. Drugs mimicking or activating growth inhibitors or temperance factors find use in therapies against infectious diseases. Temperance factors may also be cell type specific, affecting viral tropism.


Furthermore, ORFs identified as encoding cell tropism factors can also be deleted in vaccine constructs in order to prevent the vaccine strain from potentially causing disease in specific tissues. For example, ORFs encoding tropism factors for HCMV replication in human retinal epithelial cells can be deleted from the vaccine construct to prevent the possibility that the vaccine may cause HCMV retinitis.


Among the tropism factors are the following: The UL24-deletion mutant grows normally in retinal epithelial cells and fibroblasts, but is significantly defective in growth in endothelial cells. The UL64-deletion mutant grows normally in fibroblasts and endothelial cells, but is significantly growth defective in retinal epithelial cells. The UL10-deletion mutant grows normally in fibroblasts and endothelial cells, but has increased growth relative to wild type in retinal epithelial cells. The US16-deletion mutant grows normally in retinal epithelial cells and fibroblasts, but has increased growth relative to wild type in endothelial cells.


UL10 and US16 encode cell-type specific functions for virus-growth inhibition. UL24 and UL64 encode cell-type specific functions for viral replication in HMVEC and RPE, respectively.


In one embodiment of the invention, a cytomegalovirus comprising a deletion in one or more ORFs designated as temperance factors is provided. As described below, libraries of such cytomegaloviruses may also be provided. In vitro hyper-growth strains having diminished or absent temperance factors can be used for facile production of large quantities of subunit and attenuated live vaccines.


Recombinant Cytomegalovirus

As described in the examples, a collection of viruses, each having a defined deletion in a single open reading frame, is generated. It will be understood by those of skill in the art that various methods can be used to alter a virus in a site-specific manner. Such mutant viruses are useful in vaccine construction, in testing candidate drugs, in investigating growth in different cell types, etc. The mutant virus also provides a basis for further genetic alteration, e.g. deletion of a second ORF, adding back genetically engineered versions of the deleted ORF, and the like. Of particular interest are sequences of herpesviruses, e.g. alpha-herpesviruses, beta-herpesviruses, etc., particularly cytomegaloviruses, more particularly human cytomegaloviruses.


The panel of viruses may be provided in the form of isolated polynucleotides, in the form of viral particles, in the form of cells comprising the virus polynucleotides, and the like. Where the panel is provided with cells, there may be an array of different cell types, e.g. retinal epithelial cells, fibroblasts, endothelial cells, neural cells, hematopoietic cells, etc. Further, cells may be of one or more species, preferably including human cells.


In one embodiment, a set of recombinant viruses is provided, which set is useful in investigating the effects of drugs, growth conditions, cells, etc. on a variety of mutations. The following sets of viruses may be used individually, or may be combined, e.g. normal growth and enhanced growth, normal growth and growth essential, and the like. Sets of mutant viruses may comprise, without limitation, at least 2, at least 5, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, up to 45 different viruses, each having deletions in unique growth essential genes, as described above. A set of mutant viruses may also comprise, without limitation, at least 2, at least 5, at least 10, at least 12 different viruses, each having deletions in unique severe growth defect genes, as described above. Another set of viruses may comprise, without limitation, at least 2, at least 5, at least 10, at least 15, at least 20, at least 23 different viruses, each having deletions in unique moderate growth defect genes, as described above.


Another virus collection of interest comprises the virus temperance factors, which may comprise 1, 2, 3, or 4 or more viruses having deletions in unique temperance factors. Such a virus collection may further comprise one or more viruses having deletions in unique tropism factors.


Another virus collection of interest includes viruses having deletions in the set of deletions resulting in normal growth. Sets of mutant viruses may comprise, without limitation, at least 2, at least 5, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, at least 75 and up to 76 different viruses, each having deletions in unique genes that do not affect growth.


Recombinant viruses may be constructed according to the following method. Two oligonucleotide primers are constructed, each containing: a sequence homologous to an antibiotic resistance cassette, a sequence providing a unique barcode tag, a common primer sequence, and a region homologous to the sequence adjacent to either the start or stop codon of the ORF targeted for deletion. Amplification with these primers generates a product having the antibiotic resistance cassette flanked by homologous sequences targeting the ORF to be deleted. Transformation of a host cell carrying a genetic construct of the CMV genome with the PCR product results in replacement of the target gene upon selection for antibiotic resistance. The unique barcode sequences are covalently linked to the sequence that targeted them to the HCMV genome, creating a permanent association and genetic linkage between a particular deletion strain and the tag sequence. The ability of the genetically altered virus to cause disease may be tested in one or more experimental models, e.g. using a variety of human cell lines.
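
The primer layout described above can be sketched in Python as a small data structure. This is an illustration only, not the protocol used in the examples: the class name `DeletionPrimer`, the helper `check_unique_barcodes`, the 5′-to-3′ order in which the four elements are joined, and every sequence in the usage lines are hypothetical. Only the four components themselves (homology region, common primer, barcode tag, cassette-homologous sequence) come from the text.

```python
from dataclasses import dataclass

DNA_BASES = set("ACGT")


def _clean(name: str, seq: str) -> str:
    """Upper-case a sequence and reject anything that is not plain A/C/G/T."""
    seq = seq.strip().upper()
    if not seq or set(seq) - DNA_BASES:
        raise ValueError(f"{name} must be a non-empty A/C/G/T string")
    return seq


@dataclass(frozen=True)
class DeletionPrimer:
    """One of the two primers used to amplify the resistance cassette."""
    homology_arm: str    # matches viral sequence next to the start or stop codon
    common_primer: str   # sequence shared by all primers in the collection
    barcode: str         # unique tag identifying this deletion strain
    cassette_site: str   # homologous to the antibiotic resistance cassette

    def __post_init__(self):
        # Normalise and validate each element (frozen, so bypass __setattr__).
        for name in ("homology_arm", "common_primer", "barcode", "cassette_site"):
            object.__setattr__(self, name, _clean(name, getattr(self, name)))

    @property
    def sequence(self) -> str:
        # Assumed 5'-to-3' order: arm, common primer, barcode, cassette site.
        return (self.homology_arm + self.common_primer
                + self.barcode + self.cassette_site)


def check_unique_barcodes(primers_by_orf: dict) -> None:
    """Raise if two ORFs share a barcode, which would break strain tracking."""
    seen = {}
    for orf, primer in primers_by_orf.items():
        if primer.barcode in seen:
            raise ValueError(
                f"{orf} and {seen[primer.barcode]} share barcode {primer.barcode}")
        seen[primer.barcode] = orf


# Placeholder sequences only; none of these are taken from the primer table.
up = DeletionPrimer("ATGGCCTTAGGCTA", "GATGTCCACGAGGTCTCT",
                    "ACGTTGCAAGCTTGACCGTA", "CGTACGCTGCAGGTCGAC")
print(len(up.sequence))
```

Checking barcode uniqueness across the whole collection reflects the point made in the text that each tag is permanently linked to one deletion strain.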


Nucleic Acids

The sequences of the provided HCMV Towne strain, the specifically identified ORFs and genes, and the recombinant viruses find use in research and therapeutic methods, for the recombinant production of the encoded polypeptides, and the like. The nucleic acids of the invention include nucleic acids having a high degree of sequence similarity or sequence identity to one of the sequences provided in Table 6. Of particular interest are sequences of other viruses, which may include, without limitation, other herpesviruses, e.g. alpha-herpesviruses, beta-herpesviruses, etc. Sequence identity can be determined by hybridization under stringent conditions, for example, at 50° C. or higher and 0.1×SSC (9 mM NaCl/0.9 mM Na citrate). Hybridization methods and conditions are well known in the art, see, e.g., U.S. Pat. No. 5,707,829. Nucleic acids that are substantially identical to the provided nucleic acid sequence, e.g. allelic variants, genetically altered versions of the gene, etc., bind to one of the sequences provided in Table 1 under stringent hybridization conditions. Further specific guidance regarding the preparation of nucleic acids is provided by Fleury et al. (1997) Nature Genetics 15:269-272; Tartaglia et al., PCT Publication No. WO 96/05861; and Chen et al., PCT Publication No. WO 00/06087, each of which is incorporated herein in its entirety.


The sequences can be isolated from suitable sources, or a suitable nucleic acid can be chemically synthesized. Direct chemical synthesis methods include, for example, the phosphotriester method of Narang et al. (1979) Meth. Enzymol. 68: 90-99; the phosphodiester method of Brown et al. (1979) Meth. Enzymol. 68: 109-151; the diethylphosphoramidite method of Beaucage et al. (1981) Tetra. Lett., 22: 1859-1862; and the solid support method of U.S. Pat. No. 4,458,066. Chemical synthesis produces a single stranded oligonucleotide. This can be converted into double stranded DNA by hybridization with a complementary sequence, or by polymerization with a DNA polymerase using the single strand as a template. While chemical synthesis of DNA is often limited to sequences of about 100 bases, longer sequences can be obtained by the ligation of shorter sequences. Alternatively, subsequences may be cloned and the appropriate subsequences cleaved using appropriate restriction enzymes.
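
For the hybridization route mentioned above, the complementary strand to be synthesized is the reverse complement of the first oligonucleotide. A minimal Python sketch follows; the function names are hypothetical, and the code handles only unambiguous A/C/G/T bases.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(oligo: str) -> str:
    """Return the strand that anneals to `oligo`, written 5' to 3'."""
    if set(oligo.upper()) - set("ACGT"):
        raise ValueError("only A, C, G and T are supported")
    # Complement every base, then reverse so the result also reads 5' to 3'.
    return oligo.translate(_COMPLEMENT)[::-1]


def duplex(oligo: str) -> tuple:
    """Return the pair of single strands that together form the double strand."""
    return oligo, reverse_complement(oligo)


top, bottom = duplex("ATGC")
print(top, bottom)
```

Applying `reverse_complement` twice returns the original sequence, which is a quick sanity check when preparing a pair of oligonucleotides for annealing.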


Coding sequences of interest comprise the nucleic acid present between the initiation codon and the stop codon, as defined in the listed sequences, including all of the introns that are normally present in a native chromosome. They can further include the 3′ and 5′ untranslated regions found in the mature mRNA, as well as specific transcriptional and translational regulatory sequences, such as promoters, enhancers, etc., including about 1 kb, but possibly more, of flanking genomic DNA at either the 5′ or 3′ end of the transcribed region. The genomic DNA flanking the coding region, either 3′ or 5′, may contain sequences required for expression.


Probes specific to the nucleic acid of the invention can be generated using the nucleic acid sequence disclosed in Table 1. The probes are preferably at least about 18 nt, 25 nt, 50 nt or more of the corresponding contiguous sequence of one of the sequences provided in Table 1, and are usually less than about 2, 1, or 0.5 kb in length. Preferably, probes are designed based on a contiguous sequence that remains unmasked following application of a masking program for masking low complexity. Double or single stranded fragments can be obtained from the DNA sequence by chemically synthesizing oligonucleotides in accordance with conventional methods, by restriction enzyme digestion, by PCR amplification, etc. The probes can be labeled, for example, with a radioactive, biotinylated, or fluorescent tag.
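
As a rough illustration of the probe-selection step described above, the following Python sketch lists candidate probes of a chosen length from a sequence in which low-complexity regions have already been masked. The function name, the use of "N" as the masking character, and the example sequence are assumptions made for illustration only; they are not taken from the disclosure, and no particular masking program is implied.

```python
def probe_candidates(seq, length=18, masked="nN"):
    """Return (start, probe) pairs for every window of `length` bases
    that contains no masked base.

    `seq` is assumed to have been passed through a low-complexity
    masking program that replaces masked bases with 'N' (hypothetical
    convention chosen for this sketch).
    """
    seq = seq.strip()
    candidates = []
    for start in range(len(seq) - length + 1):
        window = seq[start:start + length]
        if not any(base in masked for base in window):
            candidates.append((start, window))
    return candidates


# Hypothetical 42-nt sequence with a 2-nt masked stretch in the middle.
example = "ACGT" * 5 + "NN" + "ACGT" * 5
for start, probe in probe_candidates(example, length=18):
    print(start, probe)
```

Only windows that lie entirely within unmasked sequence are reported; the minimum length of 18 nt follows the preferred lower bound stated in the text.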


The nucleic acids of the subject invention are isolated and obtained in substantial purity, generally as other than an intact chromosome. Usually, the nucleic acids, either as DNA or RNA, will be obtained substantially free of other naturally-occurring nucleic acid sequences, generally being at least about 50%, usually at least about 90% pure and are typically “recombinant,” e.g., flanked by one or more nucleotides with which it is not normally associated on a naturally occurring chromosome.


The nucleic acids of the invention, including genomes of mutant HCMV, can be provided as a linear molecule or within a circular molecule, and can be provided within autonomously replicating molecules (vectors) or within molecules without replication sequences. The nucleic acids of the invention can be introduced into suitable host cells using a variety of techniques available in the art, such as transferrin polycation-mediated DNA transfer, transfection with naked or encapsulated nucleic acids, liposome-mediated DNA transfer, intracellular transportation of DNA-coated latex beads, protoplast fusion, viral infection, electroporation, gene gun, calcium phosphate-mediated transfection, and the like.


For use in amplification reactions, such as PCR, a pair of primers will be used. The exact composition of the primer sequences is not critical to the invention, but for most applications the primers will hybridize to the subject sequence under stringent conditions, as known in the art. It is preferable to choose a pair of primers that will generate an amplification product of at least about 50 nt, preferably at least about 100 nt. Algorithms for the selection of primer sequences are generally known, and are available in commercial software packages. Amplification primers hybridize to complementary strands of DNA, and will prime towards each other. For hybridization probes, it may be desirable to use nucleic acid analogs, in order to improve the stability and binding affinity. The term “nucleic acid” shall be understood to encompass such analogs.
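
The geometry of a primer pair described above (primers on complementary strands, priming towards each other, with a product of at least about 50 nt) can be sketched in Python as follows. This is a minimal sketch, not a primer-design algorithm: the function names, the fixed 20-nt primer length, and the example template are assumptions, and melting temperature, specificity and secondary structure are not considered.

```python
# Translation table for DNA complementation (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_pair(template, left_start, right_end, primer_len=20, min_product=50):
    """Return (forward, reverse, product_length) for template[left_start:right_end].

    The forward primer is the first `primer_len` bases of the region
    (same sense as the template strand given); the reverse primer is the
    reverse complement of the last `primer_len` bases, so the two
    primers anneal to opposite strands and prime towards each other.
    """
    product = template[left_start:right_end]
    if len(product) < max(min_product, 2 * primer_len):
        raise ValueError("amplified region is too short")
    forward = product[:primer_len]
    reverse = reverse_complement(product[-primer_len:])
    return forward, reverse, len(product)


# Hypothetical 120-nt template; amplify positions 10..110 (100-nt product).
template = "ATGC" * 30
fwd, rev, size = primer_pair(template, 10, 110)
print(fwd, rev, size)
```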


Polypeptides

Polypeptides encoded by the ORFs identified herein are of interest for screening methods, as reagents to raise antibodies, as therapeutics, and the like. Such polypeptides can be produced through isolation from natural sources, recombinant methods and chemical synthesis. In addition, functionally equivalent polypeptides may find use, where the equivalent polypeptide may contain deletions, additions or substitutions of amino acid residues that result in a silent change, thus producing a functionally equivalent gene product. Amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues involved. “Functionally equivalent”, as used herein, refers to a protein capable of exhibiting a substantially similar in vivo activity as the polypeptide encoded by an ORF as provided in Table 1.


The polypeptides may be produced by recombinant DNA technology using techniques well known in the art. Methods which are well known to those skilled in the art can be used to construct expression vectors containing coding sequences and appropriate transcriptional/translational control signals. These methods include, for example, in vitro recombinant DNA techniques, synthetic techniques and in vivo recombination/genetic recombination. Alternatively, RNA capable of encoding the polypeptides of interest may be chemically synthesized.


Typically, the coding sequence is placed under the control of a promoter that is functional in the desired host cell to produce relatively large quantities of the gene product. An extremely wide variety of promoters are well-known, and can be used in the expression vectors of the invention, depending on the particular application. Ordinarily, the promoter selected depends upon the cell in which the promoter is to be active. Other expression control sequences such as ribosome binding sites, transcription termination sites and the like are also optionally included. Constructs that include one or more of these control sequences are termed “expression cassettes.” Expression can be achieved in prokaryotic and eukaryotic cells utilizing promoters and other regulatory agents appropriate for the particular host cell. Exemplary host cells include, but are not limited to, E. coli, other bacterial hosts, yeast, and various higher eukaryotic cells such as the COS, CHO and HeLa cell lines and myeloma cell lines. In mammalian host cells, a number of viral-based expression systems may be used, including retrovirus, lentivirus, adenovirus, adeno-associated virus, and the like.


Specific initiation signals may also be required for efficient translation of the genes. These signals include the ATG initiation codon and adjacent sequences. In cases where a complete gene, including its own initiation codon and adjacent sequences, is inserted into the appropriate expression vector, no additional translational control signals may be needed. However, in cases where only a portion of the gene coding sequence is inserted, exogenous translational control signals must be provided. These exogenous translational control signals and initiation codons can be of a variety of origins, both natural and synthetic. The efficiency of expression may be enhanced by the inclusion of appropriate transcription enhancer elements, transcription terminators, etc.


In addition, a host cell strain may be chosen that modulates the expression of the inserted sequences, or modifies and processes the gene product in the specific fashion desired. Such modifications (e.g., glycosylation) and processing (e.g., cleavage) of protein products may be important for the function of the protein. Different host cells have characteristic and specific mechanisms for the post-translational processing and modification of proteins. Appropriate cell lines or host systems can be chosen to ensure the correct modification and processing of the foreign protein expressed. To this end, eukaryotic host cells that possess the cellular machinery for proper processing of the primary transcript, glycosylation, and phosphorylation of the gene product may be used. Such mammalian host cells include but are not limited to CHO, VERO, BHK, HeLa, COS, MDCK, 293, 3T3, WI38, etc.


For long-term, high-yield production of recombinant proteins, stable expression is preferred. For example, cell lines that stably express the differentially expressed or pathway gene protein may be engineered. Rather than using expression vectors that contain viral origins of replication, host cells can be transformed with DNA controlled by appropriate expression control elements, and a selectable marker. Following the introduction of the foreign DNA, engineered cells may be allowed to grow for 1-2 days in an enriched media, and then are switched to a selective media. The selectable marker in the recombinant plasmid confers resistance to the selection and allows cells to stably integrate the plasmid into their chromosomes and grow to form foci which in turn can be cloned and expanded into cell lines. This method may advantageously be used to engineer cell lines that express the target protein. Such engineered cell lines may be particularly useful in screening and evaluation of compounds that affect the endogenous activity of the *** protein. A number of selection systems may be used, including but not limited to the herpes simplex virus thymidine kinase, kanamycin resistance, hypoxanthine-guanine phosphoribosyltransferase, and adenine phosphoribosyltransferase genes. Antimetabolite resistance can be used as the basis of selection for dhfr, which confers resistance to methotrexate; gpt, which confers resistance to mycophenolic acid; neo, which confers resistance to the aminoglycoside G-418; and hygro, which confers resistance to hygromycin.


The polypeptide may be labeled, either directly or indirectly. Any of a variety of suitable labeling systems may be used, including but not limited to, radioisotopes such as 125I; enzyme labeling systems that generate a detectable colorimetric signal or light when exposed to substrate; and fluorescent labels. Indirect labeling involves the use of a protein, such as a labeled antibody, that specifically binds to the polypeptide of interest. Such antibodies include but are not limited to polyclonal, monoclonal, chimeric, single chain, Fab fragments and fragments produced by a Fab expression library.


Once expressed, the recombinant polypeptides can be purified according to standard procedures of the art, including ammonium sulfate precipitation, affinity columns, ion exchange and/or size exclusion chromatography, gel electrophoresis and the like (see, generally, R. Scopes, Protein Purification, Springer-Verlag, N.Y. (1982); Deutscher, Methods in Enzymology Vol. 182: Guide to Protein Purification, Academic Press, Inc. N.Y. (1990)).


As an option to recombinant methods, polypeptides and oligopeptides can be chemically synthesized. Such methods typically include solid-phase approaches, but can also utilize solution-based chemistries and combinations of solid-phase and solution approaches. Examples of solid-phase methodologies for synthesizing proteins are described by Merrifield (1964) J. Am. Chem. Soc. 85:2149; and Houghton (1985) Proc. Natl. Acad. Sci., 82:5132. Fragments of a *** protein can be synthesized and then joined together. Methods for conducting such reactions are described by Grant (1992) Synthetic Peptides: A User Guide, W. H. Freeman and Co., N.Y.; and in “Principles of Peptide Synthesis,” (Bodansky and Trost, ed.), Springer-Verlag, Inc. N.Y., (1993).


Compound Screening

Compound screening may be performed using an in vitro model, a cell infected with a mutant CMV as provided herein, or a panel of cells infected with individual mutant viruses as provided herein, or purified protein corresponding to any one of the provided ORFs. One can identify ligands or substrates that bind to, modulate or mimic the action of the encoded polypeptide.


The polypeptides include those encoded by the ORFs, as well as nucleic acids that, by virtue of the degeneracy of the genetic code, are not identical in sequence to the disclosed nucleic acids, and variants thereof. Variant polypeptides can include amino acid (aa) substitutions, additions or deletions. The amino acid substitutions can be conservative amino acid substitutions or substitutions to eliminate non-essential amino acids, such as to alter a glycosylation site, a phosphorylation site or an acetylation site, or to minimize misfolding by substitution or deletion of one or more cysteine residues that are not necessary for function. Variants can be designed so as to retain or have enhanced biological activity of a particular region of the protein (e.g., a functional domain and/or, where the polypeptide is a member of a protein family, a region associated with a consensus sequence). Variants also include fragments of the polypeptides disclosed herein, particularly biologically active fragments and/or fragments corresponding to functional domains. Fragments of interest will typically be at least about 10 aa to at least about 15 aa in length, usually at least about 50 aa in length, and can be as long as 300 aa in length or longer, but will usually not exceed about 500 aa in length, where the fragment will have a contiguous stretch of amino acids that is identical to the provided polypeptide sequence.


Compound screening identifies agents that modulate function of the HCMV polypeptides. Of particular interest are screening assays for agents that have a low toxicity for human cells. A wide variety of assays may be used for this purpose, e.g. binding assays of a compound to a polypeptide, effect of a compound on HCMV replication, effect on tissue specificity, and the like. Compounds may be assayed for inducing temperance of viral infection, for preventing infection, for preventing replication, etc.


The term “agent” as used herein describes any molecule, e.g. protein or pharmaceutical, with the capability of altering or mimicking the physiological function of an HCMV polypeptide according to any of the provided growth categories, e.g. growth essential, growth enhancing, and the like. Generally a plurality of assay mixtures are run in parallel with different agent concentrations to obtain a differential response to the various concentrations. Typically one of these concentrations serves as a negative control, i.e. at zero concentration or below the level of detection.
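
The parallel assay layout described above, in which several agent concentrations are tested alongside a zero-concentration negative control, can be sketched as a simple serial dilution. The function name, the ten-fold dilution factor and the example top concentration are assumptions chosen for illustration; the text does not prescribe any particular dilution scheme.

```python
def concentration_series(top, dilution_factor=10.0, points=5):
    """Return assay concentrations in ascending order.

    The first entry is 0.0 (the negative control with no agent); the
    remaining entries are `points` serial dilutions ending at `top`.
    Units are whatever units `top` is given in.
    """
    if top <= 0 or dilution_factor <= 1 or points < 1:
        raise ValueError("need top > 0, dilution_factor > 1, points >= 1")
    dilutions = [top / dilution_factor ** i for i in range(points)]
    return [0.0] + sorted(dilutions)


# Hypothetical example: top concentration 100 (arbitrary units), 3 dilutions.
print(concentration_series(100.0, dilution_factor=10.0, points=3))
```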


Candidate agents encompass numerous chemical classes, though typically they are organic molecules, preferably small organic compounds having a molecular weight of more than 50 and less than about 2,500 daltons. Candidate agents comprise functional groups necessary for structural interaction with proteins, particularly hydrogen bonding, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two of the functional chemical groups. The candidate agents often comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or more of the above functional groups. Candidate agents are also found among biomolecules including peptides, saccharides, fatty acids, steroids, purines, pyrimidines, derivatives, structural analogs or combinations thereof.
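
The molecular-weight window stated above (more than 50 and less than about 2,500 daltons) lends itself to a simple pre-filter on a compound library. The sketch below assumes a library held as a mapping from compound name to molecular weight; the compound names and weights are hypothetical placeholders, not data from the disclosure.

```python
def within_size_range(molecular_weight, lower=50.0, upper=2500.0):
    """True if the weight (in daltons) lies strictly between the bounds."""
    return lower < molecular_weight < upper


# Hypothetical library: name -> molecular weight in daltons.
library = {
    "compound_A": 32.0,     # below the lower bound
    "compound_B": 255.2,    # within the preferred range
    "compound_C": 3100.0,   # above the upper bound
}

selected = [name for name, mw in library.items() if within_size_range(mw)]
print(selected)
```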


Candidate agents are obtained from a wide variety of sources including libraries of synthetic or natural compounds. For example, numerous means are available for random and directed synthesis of a wide variety of organic compounds and biomolecules, including expression of randomized oligonucleotides and oligopeptides. Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available or readily produced. Additionally, natural or synthetically produced libraries and compounds are readily modified through conventional chemical, physical and biochemical means, and may be used to produce combinatorial libraries. Known pharmacological agents may be subjected to directed or random chemical modifications, such as acylation, alkylation, esterification, amidification, etc. to produce structural analogs. Test agents can be obtained from libraries, such as natural product libraries or combinatorial libraries, for example. A number of different types of combinatorial libraries and methods for preparing such libraries have been described, including for example, PCT publications WO 93/06121, WO 95/12608, WO 95/35503, WO 94/08051 and WO 95/30642, each of which is incorporated herein by reference.


Where the screening assay is a binding assay, one or more of the molecules may be joined to a label, where the label can directly or indirectly provide a detectable signal. Various labels include radioisotopes, fluorescers, chemiluminescers, enzymes, specific binding molecules, particles, e.g. magnetic particles, and the like. Specific binding molecules include pairs, such as biotin and streptavidin, digoxin and antidigoxin, etc. For the specific binding members, the complementary member would normally be labeled with a molecule that provides for detection, in accordance with known procedures.


A variety of other reagents may be included in the screening assay. These include reagents like salts, neutral proteins, e.g. albumin, detergents, etc. that are used to facilitate optimal protein-protein binding and/or reduce non-specific or background interactions. Reagents that improve the efficiency of the assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc. may be used. The components of the mixture are added in any order that provides for the requisite binding. Incubations are performed at any suitable temperature, typically between 4 and 40° C. Incubation periods are selected for optimum activity, but may also be optimized to facilitate rapid high-throughput screening. Typically between 0.1 and 1 hours will be sufficient.


Preliminary screens can be conducted by screening for compounds capable of binding to the polypeptide. The binding assays usually involve contacting a polypeptide with one or more test compounds and allowing sufficient time for the protein and test compounds to form a binding complex. Any binding complexes formed can be detected using any of a number of established analytical techniques. Protein binding assays include, but are not limited to, methods that measure co-precipitation, co-migration on non-denaturing SDS-polyacrylamide gels, and co-migration on Western blots (see, e.g., Bennet, J. P. and Yamamura, H. I. (1985) “Neurotransmitter, Hormone or Drug Receptor Binding Methods,” in Neurotransmitter Receptor Binding (Yamamura, H. I., et al., eds.), pp. 61-89).


Active test agents identified by the screening methods described herein that affect polypeptide activity and/or virus growth can serve as lead compounds for the synthesis of analog compounds. Typically, the analog compounds are synthesized to have an electronic configuration and a molecular conformation similar to that of the lead compound. Identification of analog compounds can be performed through use of techniques such as self-consistent field (SCF) analysis, configuration interaction (CI) analysis, and normal mode dynamics analysis. Computer programs for implementing these techniques are available. See, e.g., Rein et al., (1989) Computer-Assisted Modeling of Receptor-Ligand Interactions (Alan Liss, New York).


Therapeutic/Prophylactic Treatment Methods

Agents that modulate activity of the provided HCMV ORFs provide a point of therapeutic or prophylactic intervention, particularly agents that inhibit replication of the virus. Numerous agents are useful in modulating this activity, including agents that directly modulate expression, e.g. expression vectors, antisense specific for the targeted polypeptide; and agents that act on the protein, e.g. specific antibodies and analogs thereof, small organic molecules that block catalytic activity, etc.


Methods can be designed to selectively deliver nucleic acids to certain cells. When liposomes are utilized, substrates that bind to a cell-surface membrane protein associated with endocytosis can be attached to the liposome to target the liposome to targeted cells and to facilitate uptake.


Antisense molecules can be used to down-regulate expression in cells. The antisense reagent may be antisense oligonucleotides (ODN), particularly synthetic ODN having chemical modifications from native nucleic acids, or nucleic acid constructs that express such antisense molecules as RNA. The antisense sequence is complementary to the mRNA of the targeted gene, and inhibits expression of the targeted gene products. Antisense molecules inhibit gene expression through various mechanisms, e.g. by reducing the amount of mRNA available for translation, through activation of RNAse H, or steric hindrance. One or a combination of antisense molecules may be administered, where a combination may comprise multiple different sequences.


Antisense molecules may be produced by expression of all or a part of the target gene sequence in an appropriate vector, where the transcriptional initiation is oriented such that an antisense strand is produced as an RNA molecule. Alternatively, the antisense molecule is a synthetic oligonucleotide. Antisense oligonucleotides will generally be at least about 7, usually at least about 12, more usually at least about 20 nucleotides in length, and not more than about 500, usually not more than about 50, more usually not more than about 35 nucleotides in length, where the length is governed by efficiency of inhibition, specificity, including absence of cross-reactivity, and the like. It has been found that short oligonucleotides, of from 7 to 8 bases in length, can be strong and selective inhibitors of gene expression (see Wagner et al. (1996) Nature Biotechnology 14:840-844).


A specific region or regions of the endogenous sense strand mRNA sequence is chosen to be complemented by the antisense sequence. Selection of a specific sequence for the oligonucleotide may use an empirical method, where several candidate sequences are assayed for inhibition of expression of the target gene in vitro or in an animal model. A combination of sequences may also be used, where several regions of the mRNA sequence are selected for antisense complementation.
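
The empirical selection described above starts from a set of candidate antisense sequences, each complementary to a different region of the target mRNA. A minimal Python sketch of generating such candidates is given below. The function names, the tiling step and the example mRNA are assumptions for illustration; the 12 to 35 nt length check reflects the usual bounds given in the text, and the output is the unmodified DNA sequence only (no backbone or sugar chemistry is modeled).

```python
# Map each mRNA base to the complementary DNA base.
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_odn(mrna_region):
    """Return the DNA oligonucleotide (5'->3') complementary to an mRNA region."""
    return mrna_region.upper().translate(RNA_TO_DNA_COMPLEMENT)[::-1]


def tiled_candidates(mrna, length=20, step=10):
    """Return (start, oligo) pairs tiling the mRNA at the given step.

    `start` is the 0-based position of the targeted region on the mRNA.
    """
    if not 12 <= length <= 35:
        raise ValueError("length outside the usual 12-35 nt range")
    return [
        (start, antisense_odn(mrna[start:start + length]))
        for start in range(0, len(mrna) - length + 1, step)
    ]


# Hypothetical 40-nt mRNA fragment.
mrna = "AUGC" * 10
for start, oligo in tiled_candidates(mrna):
    print(start, oligo)
```

Each candidate would then be assayed for inhibition of target gene expression, as the text describes; the sketch does not rank or score the candidates.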


Antisense oligonucleotides may be chemically synthesized by methods known in the art (see Wagner et al. (1993) supra. and Milligan et al., supra.) Preferred oligonucleotides are chemically modified from the native phosphodiester structure, in order to increase their intracellular stability and binding affinity. A number of such modifications have been described in the literature, which alter the chemistry of the backbone, sugars or heterocyclic bases.


Among useful changes in the backbone chemistry are phosphorothioates; phosphorodithioates, where both of the non-bridging oxygens are substituted with sulfur; phosphoroamidites; alkyl phosphotriesters and boranophosphates. Achiral phosphate derivatives include 3′-O-5′-S-phosphorothioate, 3′-S-5′-O-phosphorothioate, 3′-CH2-5′-O-phosphonate and 3′-NH-5′-O-phosphoroamidate. Peptide nucleic acids replace the entire ribose phosphodiester backbone with a peptide linkage. Sugar modifications are also used to enhance stability and affinity. The α-anomer of deoxyribose may be used, where the base is inverted with respect to the natural β-anomer. The 2′-OH of the ribose sugar may be altered to form 2′-O-methyl or 2′-O-allyl sugars, which provides resistance to degradation without compromising affinity. Modification of the heterocyclic bases must maintain proper base pairing. Some useful substitutions include deoxyuridine for deoxythymidine; 5-methyl-2′-deoxycytidine and 5-bromo-2′-deoxycytidine for deoxycytidine. 5-propynyl-2′-deoxyuridine and 5-propynyl-2′-deoxycytidine have been shown to increase affinity and biological activity when substituted for deoxythymidine and deoxycytidine, respectively.


The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the subject invention, and are not intended to limit the scope of what is regarded as the invention. Efforts have been made to ensure accuracy with respect to the numbers used (e.g. amounts, temperature, concentrations, etc.) but some experimental errors and deviations should be allowed for. Unless otherwise indicated, parts are parts by weight, molecular weight is average molecular weight, temperature is in degrees centigrade; and pressure is at or near atmospheric.


Experimental

Genetic manipulation to generate herpesvirus mutants has been possible through mutagenesis of the viral genome either in human cells or while maintained as a bacterial artificial chromosome (BAC). A construct, TowneBAC, was produced by inserting a BAC sequence into the HCMV genome (Towne strain) and replacing the dispensable, 10 kb US1-US12 region (Marchini et al. (2001) J Virol 75, 1870-8). The TowneBAC DNA, while maintained as a BAC-based plasmid in E. coli, produces infectious progeny in human fibroblasts and retains wild-type growth characteristics in vitro.


The cloned HCMV Towne sequence in the TowneBAC construct was determined (Genbank accession number AY315197) using the shotgun sequencing approach (Venter et al. (1998) Science 280, 1540-2). The Towne sequence present in the TowneBAC construct is predicted to encode 152 unique ORFs, with nine of these present in two copies in the RL elements (FIG. 1). Taking into account the 10 putative ORFs within the deleted US1-US12 region, the Towne strain potentially encodes at least 162 unique ORFs, many of which have homologues in the recently-reanalyzed HCMV AD169 strain genome (Davison et al. (2003) J Gen Virol 84, 17-28).


To systematically analyze the function of each ORF in viral replication, we employed a rapid bacterial homologous recombination system and generated a collection of mutants in E. coli by deleting each of the predicted ORFs from TowneBAC (Lee et al. (2001) Genomics 73, 56-65). Each gene was precisely deleted from the start to stop codons and replaced with a kanamycin resistance cassette (FIG. 2A). Each deletion was verified using PCR screening, restriction digest profiling, and Southern analysis (FIG. 4). In total, 150 of the 152 genes were deleted (Table 1).


The mutant BAC-DNAs were isolated from bacteria and transfected into cultured human foreskin fibroblasts (HFFs). Of the 150 constructed mutants, 105 produced viral progeny, indicating that the mutated genes are not essential for HCMV replication in HFFs. In contrast, 45 mutants did not yield infectious progeny even after repeated transfection and extensive incubation. To further confirm their non-growth phenotype, revertant BAC clones were constructed for several mutants (e.g. ΔUL32) by restoring the deletion with the intact ORF sequence (FIG. 2A, FIG. 4). The rescued mutant (e.g. rescued-UL32) produced progeny and grew as well as the TowneBAC, thereby confirming that deleting the ORF sequence causes the no-growth phenotype (FIGS. 4-5).


Of the 45 essential ORFs in HFFs, 37 had not been previously reported, of which 15 had not even been suggested to be essential based on the studies of other herpesviruses (Table 6). Over 90% of the essential genes are conserved among all herpesviruses (core genes) or β-herpesviruses (Table 6). In contrast, about 70% of the non-essential genes are HCMV-specific and are not conserved among β-herpesviruses.
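
The counts reported in the two preceding paragraphs can be cross-checked with a short calculation. In the sketch below, the mutant totals are taken directly from the text, while the conservation counts are a hand tally of the NO GROWTH rows of Table 6 and should be treated as an assumption of this example; the single ORF listed as "β-herpes/CMV" is counted here as conserved.

```python
# Totals stated in the text.
constructed_mutants = 150
viable_in_hffs = 105
nonviable = constructed_mutants - viable_in_hffs  # mutants with no progeny

# Hand tally of the "Conservation" column for the NO GROWTH rows of Table 6
# (assumption of this sketch, not a figure given in the text).
essential_by_conservation = {
    "Core": 35,
    "β-herpes": 6,
    "β-herpes/CMV": 1,
    "CMV": 3,
}

total_essential = sum(essential_by_conservation.values())
conserved = total_essential - essential_by_conservation["CMV"]
fraction_conserved = conserved / total_essential

print(nonviable, total_essential)    # both should equal 45
print(f"{fraction_conserved:.1%}")   # share conserved beyond CMV
```

Under these assumptions, 42 of the 45 essential ORFs (about 93%) are conserved among all herpesviruses or among β-herpesviruses, which is consistent with the "over 90%" figure stated above.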









TABLE 6

A list of HCMV Towne strain genes categorized by the growth properties of their respective deletion mutants in cultured HFFs. Also shown are the sequence conservations of these ORFs with those in HCMV AD169 strain and other herpesviruses, the genome sequences of which are currently available (refs. 5-7, 30), and their functions and the functions of their homologues in other herpesviruses that have been shown or implicated from previous studies. Although virus mutants with a deletion in each of the 10 ORFs in the US1-US12 region (marked with parentheses) were not individually constructed, these ORFs are listed as dispensable since they were collectively deleted and were not present in TowneBAC. RL11 and RL12, for which deletion mutants were not generated, are not included.










Genes | Conservation | Function | Growth

NO GROWTH (45 mutants)

UL32 | β-herpes | Tegument | ¶Essential
UL34 | CMV | Unknown (Transcription) | *Essential
UL37.1 | β-herpes/CMV | Anti-Apoptotic | *Essential
UL44 | Core | DNA replication | *Essential
UL46 | Core | Capsid | *Essential
UL48 | Core | Tegument | *Essential
UL48.5 | Core | Capsid protein | *Essential
UL49 | Core | Unknown | *Essential
UL50 | Core | Egress | *Essential
UL51 | Core | DNA packaging/cleavage | *Essential
UL52 | Core | DNA packaging/cleavage | *Essential
UL53 | Core | Egress | *Essential
UL54 | Core | DNA polymerase | *Essential
UL55 | Core | Glycoprotein B | ¶Essential
UL56 | Core | DNA packaging/cleavage | *Essential
UL57 | Core | ssDNA binding protein | *Essential
UL60 | CMV | Unknown (OriLyt ?) | *Essential
UL70 | Core | Helicase/primase | *Essential
UL71 | Core | Unknown | *Essential
UL73 | Core | Glycoprotein N | ¶Essential
UL75 | Core | Glycoprotein H | ¶Essential
UL76 | Core | Unknown | *Essential
UL77 | Core | DNA packaging/cleavage | *Essential
UL79 | Core | Unknown | *Essential
UL80 | Core | Capsid assembly | ¶Essential
UL84 | β-herpes | DNA replication | *Essential
UL85 | Core | Capsid | *Essential
UL86 | Core | Capsid | *Essential
UL87 | Core | Unknown | *Essential
UL89.1 | Core | DNA packaging/cleavage | *Essential
UL90 | CMV | Unknown | *Essential
UL91 | β-herpes | Unknown | *Essential
UL92 | β-herpes | Unknown | *Essential
UL93 | Core | Unknown | *Essential
UL94 | Core | Unknown (Tegument) | *Essential
UL95 | Core | Unknown | *Essential
UL96 | β-herpes | Unknown | *Essential
UL98 | Core | Alkaline nuclease | *Essential
UL99 | Core | Tegument | *Essential
UL100 | Core | Glycoprotein M | ¶Essential
UL102 | Core | Helicase/Primase | *Essential
UL104 | Core | DNA packaging/cleavage | *Essential
UL105 | Core | Helicase/Primase | *Essential
UL115 | Core | Glycoprotein L | ¶Essential
UL122 | β-herpes | IE2 (transcription) | ¶Essential







SEVERE GROWTH DEFECT (12 mutants)

UL21 | CMV | Unknown | *<2 × 10⁻⁴
UL26 | CMV | Tegument (transcription) | *<2 × 10⁻⁴
UL28 | β-herpes | Unknown | *<2 × 10⁻⁴
UL30 | CMV | Unknown | *<2 × 10⁻⁴
UL69 | Core | Tegument (transcription) | ¶<2 × 10⁻⁴
UL82 | β-herpes | Tegument (transcription) | ¶<2 × 10⁻⁴
UL112 | β-herpes | Major early protein | *<2 × 10⁻⁴
UL113 | β-herpes | Major early protein | *<2 × 10⁻⁴
UL117 | β-herpes | Unknown | *<2 × 10⁻⁴
UL123 | CMV | IE1 | ¶<2 × 10⁻⁴
UL124 | CMV | Latent transcript (ORF152) | †<2 × 10⁻⁴
US26 | β-herpes | Unknown | *<2 × 10⁻⁴







MODERATE GROWTH DEFECT (23 mutants)

UL2 | CMV | Unknown | ¶10⁻¹ to 10⁻²
UL11 | CMV | Glycoprotein | *10⁻² to 10⁻³
UL12 | CMV | Unknown | *10⁻¹ to 10⁻²
UL14 | CMV | Unknown | *10⁻² to 10⁻³
UL20 | CMV | TCR homolog | ¶10⁻² to 10⁻³
UL29 | β-herpes | Unknown | *10⁻² to 10⁻³
UL31 | β-herpes | Transcription | *10⁻² to 10⁻³
UL35 | β-herpes | Tegument/Transcription | *10⁻² to 10⁻³
UL38 | β-herpes | Unknown | *10⁻² to 10⁻³
UL47 | Core | Tegument-DNA release | ¶10⁻³ to 10⁻⁴
UL65 | CMV | Unknown (pp67 virion protein) | *10⁻² to 10⁻³
UL72 | Core | dUTPase | *10⁻³ to 10⁻⁴
UL74 | β-herpes | Glycoprotein O | ¶10⁻³ to 10⁻⁴
UL88 | β-herpes | Tegument | *10⁻² to 10⁻³
UL97 | Core | Protein kinase | ¶10⁻² to 10⁻³
UL103 | Core | Unknown | *10⁻² to 10⁻³
UL108 | CMV | Unknown | *10⁻² to 10⁻³
UL114 | Core | Uracil DNA glycosylase | ¶10⁻³ to 10⁻⁴
UL129 | CMV | Unknown | *10⁻² to 10⁻³
UL132 | CMV | Unknown | *10⁻² to 10⁻³
US13 | CMV | Unknown | †10⁻¹ to 10⁻²
US23 | β-herpes | Unknown | *10⁻² to 10⁻³
TRS1 | CMV | Transcription/egress | ¶10⁻² to 10⁻³







GROWTH LIKE WILD TYPE (66 mutants, 76 ORFs)

UL3 | CMV | Unknown | ¶Dispensable
UL4 | CMV | Glycoprotein | ¶Dispensable
UL5 | CMV | Unknown | ¶Dispensable
UL6 | CMV | Unknown | ¶Dispensable
UL7 | CMV | Unknown | ¶Dispensable
UL8 | CMV | Unknown | ¶Dispensable
UL10 | CMV | Unknown | ¶Dispensable
UL13 | CMV | Unknown | *Dispensable
UL15 | CMV | Unknown | *Dispensable
UL16 | CMV | Immunomodulation | ¶Dispensable
UL17 | CMV | Unknown | *Dispensable
UL18 | CMV | MHC homolog | ¶Dispensable
UL19 | CMV | Unknown | *Dispensable
UL24 | β-herpes | Tegument | *Dispensable
UL25 | β-herpes | Tegument | *Dispensable
UL27 | β-herpes | Unknown | *Dispensable
UL33 | β-herpes | G protein receptor | ¶Dispensable
UL36 | β-herpes | Anti-apoptotic | ¶Dispensable
UL37.3 | β-herpes | Unknown | ¶Dispensable
UL39 | CMV | Unknown | *Dispensable
UL42 | CMV | Unknown | ¶Dispensable
UL43 | β-herpes | Tegument | ¶Dispensable
UL45 | Core | Ribonucleotide reductase | ¶Dispensable
UL59 | CMV | Unknown | *Dispensable
UL62 | CMV | Unknown | *Dispensable
UL64 | CMV | Unknown | *Dispensable
UL67 | CMV | Unknown | *Dispensable
UL78 | CMV | G protein receptor | ¶Dispensable
UL83 | β-herpes | Tegument | ¶Dispensable
UL89.2 | Core | DNA packaging/cleavage | *Dispensable
UL109 | CMV | Unknown | *Dispensable
UL110 | CMV | Unknown | *Dispensable
UL111a | CMV | IL-10 homolog | *Dispensable
UL116 | CMV | Unknown | *Dispensable
UL119 | CMV | Fc receptor | *Dispensable
UL121 | CMV | Unknown | *Dispensable
UL127 | CMV | Unknown | ¶Dispensable
UL130 | CMV | Unknown | *Dispensable
UL146 | CMV | Chemokine | *Dispensable
UL147 | CMV | Chemokine homolog | *Dispensable
IRS | CMV | Transcription | ¶Dispensable
(US1) | CMV | Unknown | ¶Dispensable
(US2) | CMV | Immunomodulation | ¶Dispensable
(US3) | CMV | Immunomodulation | ¶Dispensable
(US6) | CMV | Immunomodulation | ¶Dispensable
(US7) | CMV | Unknown | ¶Dispensable
(US8) | CMV | Immunomodulation | ¶Dispensable
(US9) | CMV | Unknown | ¶Dispensable
(US10) | CMV | Immunomodulation | ¶Dispensable
(US11) | CMV | Immunomodulation | ¶Dispensable
(US12) | CMV | Unknown | ¶Dispensable
US14 | CMV | Unknown | ¶Dispensable
US15 | CMV | Unknown | *Dispensable
US16 | CMV | Unknown | *Dispensable
US17 | CMV | Unknown | *Dispensable
US18 | CMV | Unknown | *Dispensable
US19 | CMV | Unknown | *Dispensable
US20 | CMV | Unknown | *Dispensable
US21 | CMV | Unknown | *Dispensable
US22 | β-herpes | Unknown | *Dispensable
US24 | CMV | Unknown | *Dispensable
US25 | CMV | Unknown | *Dispensable
US27 | CMV | G-protein receptor | ¶Dispensable
US28 | β-herpes | G-protein receptor | ¶Dispensable
US29 | CMV | Unknown | *Dispensable
US31 | CMV | Unknown | *Dispensable
US32 | CMV | Unknown | *Dispensable
US33 | CMV | Unknown | *Dispensable
US34 | CMV | Unknown | *Dispensable
RL1 | CMV | Unknown | *Dispensable
RL2 | CMV | Unknown | *Dispensable
RL4 | CMV | Early protein | ¶Dispensable
RL6 | CMV | Unknown | ¶Dispensable


RL9
CMV
Unknown
¶Dispensable


RL10
CMV
Glycoprotein
¶Dispensable


RL13
CMV
Unknown
¶Dispensable







ENHANCED GROWTH (4 mutants)

UL9 | CMV | Unknown | *1 × 10
UL20a | CMV | Unknown | *1 × 10
UL23 | β-herpes | Tegument | *1 × 10
US30 | CMV | Unknown | *1 × 10

*Results from this study.
¶Results in this study consistent with previous studies (ref. 4).
†Results in this study different from those in previous studies (ref. 4).






Based on their growth properties in fibroblasts, viral mutants carrying deletions in nonessential genes were further categorized into four groups: severe growth defect, moderate growth defect, growth like the wild type, and enhanced growth (Table 6). Twelve mutants were classified as having a severe growth defect in HFFs, which precluded the generation of sufficient titers for growth studies. Five of these ORFs have unknown functions, while the remaining seven genes are involved in regulating transcription or genome replication (Mocarski, E. S. & Courcelle, C. T. in Fields Virology (eds. Knipe, D. M. & Howley, P. M.) 2629-2673 (Lippincott Williams & Wilkins, Philadelphia, Pa., 2001)). “Moderate growth defect” mutants reached a peak titer 10- to 10,000-fold lower than that of TowneBAC after 14 days in a multiple-step growth analysis (e.g. ΔUL132, FIG. 2B). This group contains 23 viral mutants; 11 of the deleted ORFs have not been characterized, and their functions are currently unknown.


Sixty-six mutants retained growth properties that ranged from wild-type levels to less than 10-fold fewer plaque-forming units at 14 days post-infection (e.g. ΔUL27, FIG. 2B). These “growth like wild type” mutants (Table 1) are considered to have deletions in dispensable genes, the majority of which are HCMV-specific ORFs.


The mutant group that showed enhanced growth reached a 10-fold greater peak titer than the wild type virus during a 14-day infection (e.g. ΔUS30, FIG. 2B). We found it intriguing that these mutants were capable of reaching higher titers than the wild type virus. While their functions are currently unknown, recent bioinformatic analyses suggest that these ORFs are all either β-herpesvirus or HCMV-specific transmembrane proteins (Rigoutsos et al. (2003) J Virol 77, 4326-44).
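The four growth categories described above can be expressed as a simple rule on the ratio of mutant to wild-type peak titer at 14 days. The following is a minimal sketch only, not the procedure actually used in this study. The function name `classify_growth` is hypothetical. The text does not state a numeric cutoff for the severe group (those mutants were classified by their failure to yield usable titers), nor how a mutant with a 1- to 10-fold increase would be scored, so the treatment of those two cases below is an assumption.

```python
def classify_growth(mutant_peak_titer, wild_type_peak_titer):
    """Assign a growth category from peak titers (same units, e.g. PFU/ml).

    Thresholds follow the text: enhanced = at least 10-fold higher,
    wild-type-like = less than 10-fold lower, moderate = 10- to 10,000-fold lower.
    """
    if wild_type_peak_titer <= 0:
        raise ValueError("wild-type titer must be positive")
    ratio = mutant_peak_titer / wild_type_peak_titer
    if ratio >= 10:
        return "enhanced growth"
    if ratio > 0.1:
        # Assumption: ratios between 1 and 10 are also scored as wild-type-like.
        return "growth like wild type"
    if ratio >= 1e-4:
        return "moderate growth defect"
    # Assumption: anything below the moderate range is scored as severe.
    return "severe growth defect"


# Illustrative titers only (not data from the study).
for mutant_titer in (1e7, 5e5, 1e3, 10):
    print(mutant_titer, classify_growth(mutant_titer, 1e6))
```

A ratio-based rule of this kind makes the category boundaries explicit, but it cannot replace inspection of the full growth curve.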


Although 66 ORFs were found to be dispensable for viral replication in HFFs, it is possible that these ORFs encode important functions for HCMV infection in vivo, including those involved in immunomodulation. Due to the lack of an animal model for study of HCMV pathogenesis, cultured natural host cells have been used. In vivo, HCMV infects human retinal pigment epithelial (RPE) cells and microvascular endothelial cells (HMVEC), leading to viral-associated retinitis and vascular diseases, respectively. It is conceivable that some of the ORFs, while dispensable for HCMV growth in fibroblasts, are important for supporting viral replication in other cell types.


To test this hypothesis, HMVEC and RPE cells were individually infected with a collection of 15 viral mutants that grew as well as the wild type virus in HFFs. The growth of each virus in HMVEC and RPE cells was compared to the result found in HFFs. Diverse growth phenotypes of these mutants were observed in HMVEC and RPE cells (FIG. 3). For instance, the UL24-deletion mutant grew as well as the TowneBAC in HFFs and RPE cells, but was significantly defective in growth in HMVEC. Another mutant with a UL64 deletion replicated normally in HMVEC and HFFs, but barely produced viral progeny in RPE cells (FIG. 3). Our results suggest that UL24 and UL64 are important for viral replication in HMVEC and RPE, respectively. Interestingly, a UL10 deletion mutant grew normally in HFFs and HMVEC, but reached a 500-fold higher titer than TowneBAC in RPE cells, while a US16 deletion mutant replicated as well as the TowneBAC in HFFs and RPE cells but grew 100-fold better in HMVEC (FIG. 3). These observations imply that UL10 and US16 encode cell-type specific functions for virus-growth inhibition.


Research during the last two decades has collectively shown that the prototype herpesvirus, herpes simplex virus 1, encodes 37 essential genes and 48 nonessential genes. The majority (78%) of the 45 HCMV genes that are essential for replication in HFFs are highly conserved across all herpesviruses, suggesting that these core ORFs may represent the minimal ancestral genome of all herpesviruses. HCMV may have evolved from the progenitor genome through the acquisition of non-essential genes that are responsible for its infection and pathogenesis in various tissues. This hypothesis is supported by the identification of Epstein-Barr virus and Kaposi's sarcoma-associated herpesvirus-specific genes that are involved in their unique latent infections. The functional profiling of HCMV genes reported provides a step toward elucidating the role of each gene in viral infection.


Our analysis of the mutant library suggests the presence of virally encoded factors that regulate viral growth in different cell types. The identification of HCMV-encoded factors that repress viral replication on a cell type-specific basis is novel in the field of animal viruses. Deletion of distinct ORFs resulted in mutant viruses with enhanced growth in specific cell types (e.g. ΔUS30 in HFFs, ΔUL10 in RPE cells, and ΔUS16 in HMVEC). While the mechanism by which these genes repress viral replication is currently unknown, we speculate that the genes may either directly block CMV growth or activate cellular antiviral machinery to suppress viral replication.


The presence of these growth-repressor factors may initially seem counterproductive from the perspective of the virus; however, their existence is consistent with the observation that HCMV exhibits different growth rates in various cell types. In vivo, these inhibitors may moderate viral loads to levels optimal for transmission, but prevent viral replication from reaching levels that can result in severe tissue damage or host death. Furthermore, they may suppress productive lytic replication to low levels or halt viral replication, thereby facilitating persistent and latent infections. Therefore, these repressor factors may have the effect of enhancing virus survival. This strategy of pathogen temperance may be a fundamental component in a pathogen's repertoire of factors that function to enhance its long-term existence.


The presence of such temperance genes in viruses suggests that pathogen temperance is a prevalent survival strategy and is present in other higher-order organisms with greater genome content. This is consistent with recent observations in infectious organisms where deletion of certain pathogen-encoded factors resulted in a hypervirulent infection in the host (Parish et al. (2003) Infect Immun 71, 1134-40; Cunningham et al. (2001) Science 292, 285-7). Recognition of pathogen temperance may radically alter the way we perceive the emergence of hyper-growth virulent variants from benign pathogens. The underlying mechanism for hypervirulence may be the loss of these temperance factors, as opposed to the acquisition of virulence genes. Accordingly, drugs that mimic or activate temperance factors may lead to effective therapies against infectious diseases. Further studies of pathogen temperance will provide insight into the evolution of new and emerging virulent pathogens and facilitate the development of novel approaches for controlling future epidemics caused by these virulent strains.


Materials and Methods


Virus and cells. HCMV (Towne strain) (ATCC, Manassas, Va.) and human cells (Clonetics Inc., San Diego, Calif.) were propagated as described previously (Marchini et al. (2001) J Virol 75, 1870-8). The TowneBAC, which contains a green fluorescent protein (GFP) expression cassette, was maintained in human cells and in bacterial strains DH10B and DY380 (Lee et al. (2001) Genomics 73, 56-65).


Genomic sequencing and bioinformatic analysis. TowneBAC DNAs were subjected to genome-wide shotgun sequencing analysis at MWG-Biotech, Inc. (High Point, N.C.). The sequence was determined to an average redundancy of more than 10-fold. The sequence database was manually reviewed before it was deposited in GenBank (accession number AY315197). ORFs that potentially encode a protein greater than 100 amino acids were predicted using the standard genetic code, following the guidelines as previously described (Davison, supra.), or the manufacturer's suggestions (MWG-Biotech Inc., High Point, N.C.).


Construction of deletion and rescued mutants. To construct the deletion cassettes, two oligonucleotide primers (up1 and dn1) were synthesized, each containing the following components (from 3′ to 5′): 18 or 19 nucleotides homologous to the antibiotic resistance cassette KanMX4, a 20-nucleotide unique barcode tag, a common 19-nucleotide primer sequence, and a 25-nucleotide region homologous to the first 25 nucleotides adjacent to either the start or stop codon of the ORF being targeted for deletion. The up1 and dn1 primers were used to amplify the KanMX4 cassette, which contains the kanamycin resistance gene, nptI, fused with an efficient bacterial promoter. A second round of PCR using primers bearing 50 bases of homology to the region upstream and downstream of a particular HCMV ORF yielded a product in which the KanMX4 cassette was flanked by 50-nucleotide homologous sequences targeting the ORF to be deleted in the TowneBAC. Transformation of the TowneBAC-bearing DY380 strain with the PCR product resulted in the replacement of the target gene upon selection for kanamycin resistance. The unique 20-mer barcode sequences were covalently linked to the sequence that targeted them to the HCMV genome, creating a permanent association and genetic linkage between a particular deletion strain and the tag sequence.
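The layout of the up1 and dn1 primers described above can be illustrated by assembling the four segments and checking their lengths. This is a sketch under stated assumptions: the function name `build_deletion_primer` and all sequences shown are hypothetical placeholders, not the primers used in this study. The text lists the components from 3′ to 5′, so the sketch writes the primer in the conventional 5′ to 3′ direction by reversing that order (ORF homology first, KanMX4 homology last).

```python
VALID_BASES = set("ACGT")


def build_deletion_primer(orf_homology, common_primer, barcode, kanmx4_homology):
    """Join the four primer segments, written 5' -> 3', after validating them."""
    segments = [
        ("ORF homology", orf_homology, (25,)),
        ("common primer", common_primer, (19,)),
        ("barcode tag", barcode, (20,)),
        ("KanMX4 homology", kanmx4_homology, (18, 19)),
    ]
    for name, seq, allowed_lengths in segments:
        if len(seq) not in allowed_lengths:
            raise ValueError(f"{name}: length {len(seq)} not in {allowed_lengths}")
        if not set(seq.upper()) <= VALID_BASES:
            raise ValueError(f"{name}: contains a non-ACGT character")
    return "".join(seq.upper() for _, seq, _ in segments)


# Placeholder sequences (hypothetical, for illustration only).
primer = build_deletion_primer("A" * 25, "C" * 19, "G" * 20, "T" * 18)
print(len(primer))  # 25 + 19 + 20 + 18 = 82
```

Keeping the barcode as a separate validated segment mirrors the point made in the text, namely that each tag stays physically linked to the sequence that targets it to one ORF.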


All predicted ORFs that potentially encode proteins greater than 100 amino acids in size were initially selected for deletion. The deletion cassette was designed to remove the entire coding sequence for a given ORF. Although ~10% of HCMV ORFs overlapped with each other, the position of the deletions was not adjusted, nor was any attempt made to avoid essential genes, genes in which a previous deletion had been constructed, or genes with a well-defined function.


To verify the correct integration of the deletion cassette, BAC-DNAs were prepared from kanamycin-resistant clones and subjected to PCR screening using the primers for the corresponding deleted ORF. In restriction profiling and Southern analysis, BAC-DNAs were digested with restriction enzymes, separated on agarose gels, transferred onto membranes, and then probed with a [32P]-labeled probe containing both the target ORF and KanMX4 sequence. Only clones with insertions of the cassette, as confirmed by PCR, restriction profiles, and Southern analysis, were further studied.


Construction of rescued BAC mutants was carried out by adapting a two-step homologous recombination approach in E. coli (FIG. 2A), first replacing the kanamycin cassette of the deletion mutants with a tetracycline and streptomycin (tet/str) cassette by selecting tetracycline-resistant clones, and then replacing the tet/str cassette with the intact ORF sequence by selecting streptomycin-resistant clones. The latter selection takes advantage of the fact that only bacterial clones lacking the str cassette survive in the presence of streptomycin.


Growth analysis of viral mutants in cells. HFFs were electroporated with TowneBAC DNAs, then plated onto six-well plates, and observed for 3-15 weeks for GFP expression and cytopathic effect (CPE). No viral progeny were produced from TowneBAC DNAs containing deletions of essential genes. Mutants that did not reach more than 30% CPE by 15 weeks post-infection were considered to have severe growth defects, and their titers were not sufficient for the multiple-step growth analysis. Flasks of cells infected with mutants that exhibited moderate growth defects or growth like the wild type reached 30-100% CPE at 3-15 weeks post-infection and were used for the preparation of viral stocks.
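The screening criteria in the preceding paragraph amount to a short decision rule, sketched below for illustration. The function name `triage_mutant` and its return labels are hypothetical. The text leaves the case of exactly 30% CPE ambiguous; this sketch assumes it falls in the severe group, following the wording "did not reach more than 30% CPE".

```python
def triage_mutant(produced_progeny, max_cpe_percent):
    """Sort a transfected BAC mutant by the outcome of the 15-week observation.

    produced_progeny: True if any viral progeny were recovered.
    max_cpe_percent: highest cytopathic effect (0-100) seen within 15 weeks.
    """
    if not 0 <= max_cpe_percent <= 100:
        raise ValueError("CPE must be a percentage between 0 and 100")
    if not produced_progeny:
        return "essential gene deleted"
    if max_cpe_percent <= 30:
        # Assumption: exactly 30% CPE is treated as a severe growth defect.
        return "severe growth defect"
    return "prepare viral stock for growth analysis"


print(triage_mutant(False, 0))
print(triage_mutant(True, 20))
print(triage_mutant(True, 80))
```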


In multiple-step growth analyses, 1×10^5 cells were infected in duplicate with different viruses at a multiplicity of infection (MOI) of either 0.05 plaque-forming units (PFU) per cell (for HFFs and HMVEC) or 0.25 PFU per cell (for RPE). The cells and medium were harvested at different times post-infection, and viral stocks were prepared by adding an equal volume of 10% skim milk followed by sonication. The titers of the viral stocks were determined in triplicate as described previously.
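As a worked example of the infection conditions above, the inoculum for each cell type follows from multiplying the cell number by the MOI. The sketch below is illustrative only. The function names are hypothetical, and the stock titer of 1×10^6 PFU/ml is an assumed value chosen for the arithmetic, not a titer reported in this study.

```python
CELLS_PER_INFECTION = 100_000  # 1 x 10^5 cells, as stated in the text
MOI_BY_CELL_TYPE = {"HFF": 0.05, "HMVEC": 0.05, "RPE": 0.25}


def pfu_required(cell_count, moi):
    """Number of plaque-forming units needed to infect cell_count cells at moi."""
    return round(cell_count * moi)


def inoculum_volume_ml(cell_count, moi, stock_titer_pfu_per_ml):
    """Volume of viral stock (ml) that delivers the required PFU."""
    if stock_titer_pfu_per_ml <= 0:
        raise ValueError("stock titer must be positive")
    return pfu_required(cell_count, moi) / stock_titer_pfu_per_ml


ASSUMED_STOCK_TITER = 1e6  # PFU/ml, hypothetical value for illustration

for cell_type, moi in MOI_BY_CELL_TYPE.items():
    pfu = pfu_required(CELLS_PER_INFECTION, moi)
    volume = inoculum_volume_ml(CELLS_PER_INFECTION, moi, ASSUMED_STOCK_TITER)
    print(f"{cell_type}: {pfu} PFU, {volume * 1000:.1f} microliters of stock")
```

Under these conditions an infection of HFFs or HMVEC requires 5,000 PFU and an infection of RPE cells requires 25,000 PFU.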


It is to be understood that this invention is not limited to the particular methodology, protocols, formulations and reagents described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention which will be limited only by the appended claims.


It must be noted that as used herein and in the appended claims, the singular forms “a”, “an”, and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a complex” includes a plurality of such complexes and reference to “the formulation” includes reference to one or more formulations and equivalents thereof known to those skilled in the art, and so forth.


Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods, devices and materials similar or equivalent to those described herein can be used in the practice or testing of the invention, the preferred methods, devices and materials are now described.


All publications mentioned herein are incorporated herein by reference for the purpose of describing and disclosing, for example, the cell lines, constructs, and methodologies that are described in the publications which might be used in connection with the presently described invention. The publications discussed above and throughout the text are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the inventors are not entitled to antedate such disclosure by virtue of prior invention.

Claims
  • 1. A method for identifying biologically active agents that modulate cytomegalovirus replication, the method comprising: combining a candidate biologically active agent with a mutant virus comprising a defined deletion of a temperance factor open reading frame (ORF) selected from UL9, UL20a, UL23 and US30; determining the effect of said agent on virus replication; and comparing the effect of said agent to that of a control.
  • 2. The method according to claim 1, wherein said agent increases replication of said virus.
  • 3. The method according to claim 1, wherein said agent decreases replication of said virus.
  • 4. The method according to claim 1, wherein the mutant virus comprises a defined deletion of UL23 ORF.
  • 5. The method of claim 4, wherein the candidate agent mimics the activity of the UL23 temperance factor.
Parent Case Info

This application claims benefit of U.S. provisional application 60/490,200 filed 25 Jul. 2003.

Related Publications (1)
Number Date Country
20050064394 A1 Mar 2005 US
Provisional Applications (1)
Number Date Country
60490200 Jul 2003 US