Genetic locus for everninomicin biosynthesis

Information

  • Patent Application
  • 20030143666
  • Publication Number
    20030143666
  • Date Filed
    January 26, 2001
  • Date Published
    July 31, 2003
Abstract
The present invention relates to isolated genetic sequences encoding proteins which direct the biosynthesis of the antibiotic everninomicin in Micromonospora carbonacea. The isolated biosynthetic gene cluster serves as a substrate for bioengineering of antibiotic structures.
Description


FIELD OF INVENTION

[0002] The present invention relates to the field of antibiotics, specifically those active against gram-positive bacteria and more specifically to genes of the everninomicin biosynthetic pathway of Micromonospora carbonacea. In particular, this invention elucidates the gene cluster controlling the biosynthesis of everninomicin.



BACKGROUND

[0003] Everninomicin is one member of a class of oligosaccharide natural products collectively referred to as the orthosomycins. At least five active components of everninomicin have been obtained by fermentation of M. carbonacea, namely everninomicin A, B, C, D, and E, of which everninomicin D is the principal component (Weinstein et al., Antimicrobial Agents and Chemotherapy—1964, 24-32, 1964; U.S. Pat. No. 3,499,078). Additional everninomicins, including 13-384 component 1 and 13-384 component 5, have been described from other strains of M. carbonacea (Ganguly et al., Heterocycles, 1989, Vol. 28, pp. 83-88; U.S. Pat. Nos. 4,597,968 and 4,735,903). The structure of some of the known everninomicins is described in Encyclopedia of Chemical Technology, 4th edition, volume 3, 1992, pp. 60-261 ed. Mary Howe-Grant, from which the chemical structure of everninomicin, as illustrated in FIG. 2 of the present specification, was derived.


[0004] Everninomicins contain two sensitive orthoester moieties and one or more highly substituted aromatic moieties. They possess many unusual structural features, including a 1-1′ disaccharide bridge, a nitrosugar (evernitrose), thirteen rings, and thirty-five stereogenic centers (Ganguly A. K. et al., Tetrahedron Lett. 1997, 38, 7989-7991). It has been recognized that everninomicin constitutes a formidable challenge to organic synthesis because of its unusual connectivity and polyfunctional and sensitive nature (Nicolaou, K. C. et al., Angew. Chem. Int. Ed. 1999, 38, No. 22). Moreover, chemical synthesis of everninomicin compounds produces a poor yield of the desired everninomicin molecule due to the presence of the unusual structural features. As an alternative to making structural analogs of microbial metabolites by chemical synthesis, manipulating the genes governing secondary metabolism offers a promising approach and allows for preparation of these compounds biosynthetically. However, the success of a biosynthetic approach depends critically on the availability of novel genetic systems and on genes encoding novel enzyme activities. Elucidation of the everninomicin gene cluster contributes to the general field of combinatorial biosynthesis by expanding the repertoire of genes uniquely associated with everninomicin biosynthesis, leading to the making of novel everninomicins via combinatorial biosynthesis.


[0005] The emergence of multi-resistant, Gram-positive pathogens gives rise to an urgent need for new antimicrobial agents that display novel mechanisms of action and demonstrate activity against resistant strains. Everninomicin has demonstrated a wide spectrum of antibacterial activity against gram-positive organisms, including methicillin-resistant Staphylococcus aureus, vancomycin-resistant enterococci, and penicillin-resistant pneumococci. The production of everninomicin is recognized as a valuable source of antibiotics. For example, everninomicin (trade name Ziracin®) was under development by Schering-Plough as an intravenous treatment of severe resistant gram-positive bacterial infections. Consequently, it is desirable to develop cost-effective means to produce everninomicin. Elucidation of the everninomicin gene cluster would provide a means to construct everninomicin overproducing strains by de-regulating the biosynthetic machinery.


[0006] It is also desirable to produce chemical modifications of everninomicin to enhance certain properties. For example, everninomicin D presented pharmacokinetic problems when tested in vivo on mice and dogs (Ganguly A. K. et al., J. Antibiotics 35:5 561-570, 1982). Likewise, it has been reported that everninomicins have been unavailable for clinical use due to severe adverse reactions observed in laboratory animals, which reactions include lack of coordination and ataxia (Maertens, Current Opinion in Anti-infective Investigational Drugs, 1999 1(1):49-56). Elucidation of the everninomicin gene cluster would provide a means to produce, via genetic manipulation or combinatorial biosynthesis, modified everninomicin D with improved properties. Elucidation of the gene cluster controlling the biosynthesis of everninomicin would provide access to rational engineering of everninomicin biosynthesis for novel drug leads. Accordingly, there is a need for genetic information regarding the biosynthesis of everninomicin.



SUMMARY OF THE INVENTION

[0007] The invention provides purified and isolated polynucleotide molecules that encode polypeptides of the everninomicin biosynthetic pathway in Micromonospora carbonacea. In one form of the invention, polynucleotide molecules are selected from contiguous DNA sequences of FIG. 1 (SEQ ID NOS: 1, 3, 4, 8, 22, 36, 47 and 49). In another form, the invention provides polypeptides corresponding to the isolated DNA molecules. The amino acid sequences of the corresponding encoded polypeptides are also shown in FIG. 1.


[0008] Structural and functional characterization is provided for the 49 open reading frames (ORFs) comprising this cluster (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58). Thus, in one embodiment, this invention provides an isolated nucleic acid comprising a nucleic acid selected from the group consisting of a nucleic acid encoding any of everninomicin ORFs 1 to 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58); a nucleic acid encoding a polypeptide encoded by any of everninomicin ORFs 1 to 49; and a nucleic acid (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58) which is at least 75% (preferably 80%, more preferably 85% or more) identical in amino acid sequence to a polypeptide encoded by any of everninomicin ORFs 1 to 49. Certain embodiments of the invention specifically exclude one or more of ORFs 1 to 49. In one embodiment, preferred nucleic acids comprise a nucleic acid encoding at least two (more preferably at least three or more, and still more preferably at least 5 or more) ORFs selected from the group consisting of ORF 1 to 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58).
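The percent-identity thresholds recited above (75%, 80%, 85%) can be illustrated with a minimal sketch. It assumes two amino acid sequences that have already been aligned to equal length, with "-" marking gaps. The function names, the toy peptides, and the choice to count only columns in which both sequences carry a residue are illustrative assumptions; percent identity can be defined in other ways, and the specification does not prescribe one.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where both sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 75.0) -> bool:
    """True when the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned peptides that differ at a single column (Q versus E).
query = "MTKLAVGQRS"
subject = "MTKLAVGERS"
identity = percent_identity(query, subject)
```

Under this definition the toy pair would pass all three thresholds, whereas a pair sharing only half of its aligned residues would fail even the 75% test.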


[0009] Those skilled in the art will readily understand that the invention, having provided the polynucleotide sequences encoding polypeptides of the everninomicin biosynthetic pathway, also provides polynucleotides encoding fragments derived from such peptides. In one embodiment the invention provides an isolated nucleic acid comprising a nucleic acid that specifically hybridizes under stringent conditions to an ORF of the everninomicin biosynthesis gene cluster, and can substitute for the ORF to which it specifically hybridizes to direct the synthesis of an everninomicin. In certain embodiments this also includes nucleic acids that would stringently hybridize but for the degeneracy of the nucleic acid code. In other words, if silent mutations could be made in the subject sequence so that it hybridizes to the indicated sequences under stringent conditions, it would be included in certain embodiments. The invention also provides an isolated gene cluster comprising ORFs encoding polypeptides sufficient to direct the assembly of an everninomicin or an everninomicin analogue.
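The reference above to the degeneracy of the nucleic acid code can be illustrated with a short sketch: two DNA sequences that differ only by silent mutations encode the same peptide. The sequences are hypothetical and the codon table is deliberately partial (standard genetic code, only the codons used here); neither is taken from the everninomicin locus.

```python
# Partial codon table (standard genetic code), limited to the codons used below.
CODON_TABLE = {
    "ATG": "M",
    "CTG": "L", "CTC": "L",
    "GCC": "A", "GCG": "A",
    "GGC": "G", "GGT": "G",
    "TGA": "*",  # stop codon
}


def translate(dna: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    peptide = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


# Hypothetical sequences differing only at third (wobble) positions.
original = "ATGCTGGCCGGCTGA"
silent_variant = "ATGCTCGCGGGTTGA"

# Count the nucleotide positions at which the two sequences differ.
differences = sum(1 for a, b in zip(original, silent_variant) if a != b)
same_peptide = translate(original) == translate(silent_variant)
```

The two sequences differ at three nucleotide positions yet translate to the same peptide, which is the situation the paragraph describes for sequences that would hybridize but for codon degeneracy.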


[0010] Moreover, the invention is understood to provide naturally occurring variants or derivatives of such polypeptides and fragments derived therefrom, such variants or derivatives resulting from the addition, deletion, or substitution of non-essential amino acids or conservative substitutions of essential amino acids as described herein. Particularly preferred nucleic acids comprise a nucleic acid that specifically hybridizes under stringent conditions to a nucleic acid encoding a polypeptide selected from the group consisting of ORF 1, ORF 2, ORF 3, ORF 4, ORF 5, ORF 6, ORF 7, ORF 8, ORF 9, ORF 10, ORF 11, ORF 12, ORF 13, ORF 14, ORF 15, ORF 16, ORF 17, ORF 18, ORF 19, ORF 20, ORF 21, ORF 22, ORF 23, ORF 24, ORF 25, ORF 26, ORF 27, ORF 28, ORF 29, ORF 30, ORF 31, ORF 32, ORF 33, ORF 34, ORF 35, ORF 36, ORF 37, ORF 38, ORF 39, ORF 40, ORF 41, ORF 42, ORF 43, ORF 44, ORF 45, ORF 46, ORF 47, ORF 48, and ORF 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58 respectively). A particularly preferred isolated nucleic acid comprises a nucleic acid encoding a polypeptide selected from the group consisting of ORF 1, ORF 2, ORF 3, ORF 4, ORF 5, ORF 6, ORF 7, ORF 8, ORF 9, ORF 10, ORF 11, ORF 12, ORF 13, ORF 14, ORF 15, ORF 16, ORF 17, ORF 18, ORF 19, ORF 20, ORF 21, ORF 22, ORF 23, ORF 24, ORF 25, ORF 26, ORF 27, ORF 28, ORF 29, ORF 30, ORF 31, ORF 32, ORF 33, ORF 34, ORF 35, ORF 36, ORF 37, ORF 38, ORF 39, ORF 40, ORF 41, ORF 42, ORF 43, ORF 44, ORF 45, ORF 46, ORF 47, ORF 48, and ORF 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58 respectively). The nucleic acid may comprise a nucleic acid that is a single nucleotide polymorphism (SNP) of a nucleic acid encoding a polypeptide selected from the group consisting of ORF 1, ORF 2, ORF 3, ORF 4, ORF 5, ORF 6, ORF 7, ORF 8, ORF 9, ORF 10, ORF 11, ORF 12, ORF 13, ORF 14, ORF 15, ORF 16, ORF 17, ORF 18, ORF 19, ORF 20, ORF 21, ORF 22, ORF 23, ORF 24, ORF 25, ORF 26, ORF 27, ORF 28, ORF 29, ORF 30, ORF 31, ORF 32, ORF 33, ORF 34, ORF 35, ORF 36, ORF 37, ORF 38, ORF 39, ORF 40, ORF 41, ORF 42, ORF 43, ORF 44, ORF 45, ORF 46, ORF 47, ORF 48, and ORF 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58). Certain embodiments of the invention specifically exclude one or more of ORFs 1 to 49.


[0011] This invention also provides for a polypeptide encoded by any one or more of the nucleic acids described herein.


[0012] Those skilled in the art would also readily understand that the invention, having provided the polynucleotide sequences of the entire genetic locus from M. carbonacea, further provides naturally occurring variants or homologs of the genes of the everninomicin biosynthetic locus from other bacteria of the order Actinomycetales. It is also understood that the invention, having provided the polynucleotide sequences of the entire genetic locus as well as the coding sequences, further provides polynucleotides which regulate the expression of the polypeptides of the biosynthetic pathway. Such regulating polynucleotides include but are not limited to promoter and enhancer sequences, as well as sequences antisense to any of the aforementioned sequences. The antisense molecules are regulators of gene expression in that they are used to suppress expression of the gene from which they are derived.
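As a minimal illustration of what "antisense" means at the sequence level, the sketch below computes the reverse complement of a DNA string, i.e. the strand that would base-pair with it. The function name and the six-base example are hypothetical; the example sequence is not taken from the locus.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical sense-strand fragment and its antisense counterpart.
sense = "ATGGCC"
antisense = reverse_complement(sense)
```

Applying the function twice returns the original sequence, which is a convenient sanity check when deriving an antisense sequence from a coding region.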


[0013] The gene cluster may be present in a host cell, preferably in a bacterial cell. Preferred families of bacterial cells include but are not limited to: a) bacteria of the family Micromonosporaceae, of which preferred genera include Micromonospora, Actinoplanes and Dactylosporangium; b) bacteria of the family Streptomycetaceae, of which preferred genera include Streptomyces and Kitasatospora; and c) bacteria of the family Pseudonocardiaceae, of which preferred genera are Amycolatopsis, Kibdelosporangium, and Saccharopolyspora. The host cell is transformed with an exogenous nucleic acid comprising a gene cluster encoding polypeptides sufficient to direct the assembly of an everninomicin or an everninomicin analogue. In certain embodiments the heterologous nucleic acid may comprise only a portion of the gene cluster, but the cell will still be able to express an everninomicin. Expression cassettes and vectors comprising a polynucleotide as described herein, as well as cells transformed or transfected with such cassettes and vectors, are also within the scope of the invention.


[0014] The invention also provides methods of chemically modifying a biological molecule. The methods involve contacting a biological molecule that is a substrate for a polypeptide encoded by an everninomicin biosynthesis gene cluster ORF, with a polypeptide encoded by an everninomicin biosynthesis gene cluster ORF whereby the polypeptide chemically modifies the biological molecule. In one preferred embodiment, the polypeptide is an enzyme selected from the group consisting of an O-methyltransferase, an integral membrane antiporter, a methyltransferase, a blue copper oxidoreductase, a C-methyltransferase, a nucleotide binding protein, a mannosyltransferase, a sugar epimerase/reductase, an oxygenase, a tRNA/rRNA methylase, a 3-ketoacyl-[ACP]-synthase, a glycosyltransferase, an alpha-ketoglutarate-dependent dioxygenase, a halogenase, a glycosyltransferase, an acetoin dehydrogenase E1 alpha or beta subunit, a rhamnosyltransferase, a sugar dehydratase/epimerase, a sugar nucleotidyltransferase, a sugar 4,6-dehydratase, a sugar epimerase/ketoreductase, an iterative type 1 polyketide synthase, a hydrolase/phosphatase, a glucosyltransferase, a sugar ketoreductase, a sugar 2,3-dehydratase, a sugar dehydratase, a resistance rRNA methyltransferase, a flavoprotein oxidoreductase, a deoxyhexose aminotransferase, a sugar epimerase, a sugar ketoreductase, an endoglucanase, a transcriptional regulator and a glucokinase. In a preferred embodiment, the method involves contacting the biological molecule with at least two (preferably at least three or more) different polypeptides of everninomicin gene cluster ORFs 1 to 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58). The contacting may be in a host cell or the contacting can be ex vivo. The biological molecule can be an endogenous metabolite produced by the host cell or an exogenously supplied metabolite. In preferred embodiments, the host cell is a bacterial cell or eukaryotic cell (e.g., a mammalian cell, a yeast cell, a plant cell, a fungal cell, an insect cell, etc.). In certain preferred embodiments, the host cell synthesizes deoxyhexose precursors or a dichloroisoeverninic moiety for the biological molecule. In other preferred embodiments, the host cell synthesizes the nitrosugar evernitrose. In one preferred embodiment, the method comprises contacting the biological molecule with substantially all of the polypeptides of ORF 1 to 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58) and the method produces an everninomicin or everninomicin analogue.







BRIEF DESCRIPTION OF THE DRAWINGS

[0015] FIG. 1 illustrates contiguous nucleotide sequences and deduced amino acid sequences of the everninomicin biosynthetic locus from Micromonospora carbonacea (SEQ ID NOS: 1 to 58).


[0016] FIG. 2 illustrates the structure of some of the known everninomicins.


[0017] FIG. 3 illustrates a biosynthetic scheme for the production of deoxyhexose precursors for everninomicin biosynthesis.


[0018] FIG. 4 illustrates a biosynthetic scheme for the production of nitrosugar evernitrose.


[0019] FIG. 5 illustrates a biosynthetic scheme for the production of the dichloroisoeverninic moiety that is found in the ester linkage to the sugar residue B of everninomicin.







DETAILED DESCRIPTION OF THE INVENTION

[0020] Contiguous nucleotide sequences and deduced amino acid sequences of the everninomicin biosynthetic locus from Micromonospora carbonacea are illustrated in FIG. 1 (SEQ ID NOS: 1 to 58). In particular, FIG. 1 shows a complete gene cluster formed of eight contiguous DNA sequences, which gene cluster regulates the biosynthesis of everninomicin. FIG. 1 further shows the amino acid sequences of the isolated polynucleotide coding regions which encode 49 polypeptides of the everninomicin biosynthetic pathway (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58).


[0021] The contiguous nucleotide sequences are arranged such that, as found within the everninomicin biosynthetic locus, DNA contig 1 (SEQ ID NO 1) is adjacent to the 5′ end of DNA contig 2 (SEQ ID NO 3), which is in turn adjacent to DNA contig 3 (SEQ ID NO 4), etc. The ORFs represent open reading frames deduced from the nucleotide sequences. ORF 1 (SEQ ID NO 2) has been deduced from DNA contig 1 (SEQ ID NO 1); ORFs 2 to 4 (SEQ ID NOS: 5 to 7) have been deduced from DNA contig 3 (SEQ ID NO 4); ORFs 5 to 17 (SEQ ID NOS: 9 to 21) have been deduced from DNA contig 4 (SEQ ID NO 8); ORFs 18 to 30 (SEQ ID NOS: 23 to 35) have been deduced from DNA contig 5 (SEQ ID NO 22); ORFs 31 to 39 (SEQ ID NOS: 37 to 45) and the C-terminus of ORF 40 (SEQ ID NO 46) have been deduced from DNA contig 6 (SEQ ID NO 36); the N-terminus of ORF 40 (SEQ ID NO 48) has been deduced from DNA contig 7 (SEQ ID NO 47); ORFs 41 to 49 (SEQ ID NOS: 50 to 58) have been deduced from DNA contig 8 (SEQ ID NO 49). As pointed out in FIG. 1, some of the ORFs are incomplete. In addition, one nucleotide (at position 27 of DNA contig 6, SEQ ID NO 36) remains to be determined. The DNA contig coding regions giving rise to the ORFs are also shown in FIG. 1, along with the orientation of the ORFs (i.e., whether they are to be read off the positive (sense, coding) strand or the negative (antisense, non-coding) strand).
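The correspondence between ORF numbers, protein SEQ ID NOS and DNA contigs described above can be captured in a small lookup structure. The sketch below is illustrative only: the names are hypothetical, and ORFs 2 to 4 are taken as SEQ ID NOS 5 to 7, following the protein SEQ ID list used throughout the specification (2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58).

```python
# (first ORF, last ORF, first protein SEQ ID NO, DNA contig, contig SEQ ID NO)
ORF_BLOCKS = [
    (1, 1, 2, 1, 1),
    (2, 4, 5, 3, 4),      # assumption: SEQ ID NOS 5 to 7, per the protein list
    (5, 17, 9, 4, 8),
    (18, 30, 23, 5, 22),
    (31, 39, 37, 6, 36),
    (41, 49, 50, 8, 49),
]

# ORF 40 spans two contigs: (protein SEQ ID NO, DNA contig, contig SEQ ID NO).
# SEQ ID NO 46 is the C-terminus (contig 6); SEQ ID NO 48 is the N-terminus (contig 7).
SPLIT_ORFS = {40: [(46, 6, 36), (48, 7, 47)]}


def seq_ids_for_orf(orf: int) -> list:
    """Return the protein SEQ ID NO(s) assigned to a given ORF number."""
    if orf in SPLIT_ORFS:
        return [seq_id for seq_id, _, _ in SPLIT_ORFS[orf]]
    for first, last, first_seq_id, _, _ in ORF_BLOCKS:
        if first <= orf <= last:
            # SEQ ID NOS run consecutively within a block.
            return [first_seq_id + (orf - first)]
    raise KeyError(f"ORF {orf} is not part of the locus")


# Collect every protein SEQ ID NO across ORFs 1 to 49.
all_protein_seq_ids = sorted(
    seq_id for orf in range(1, 50) for seq_id in seq_ids_for_orf(orf)
)
```

Because ORF 40 is split across contigs 6 and 7, the 49 ORFs account for 50 protein SEQ ID NOS, which matches the length of the protein list recited in the specification.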


[0022] A deposit of three strains of E. coli DH10B cells, each harbouring a cosmid clone of the everninomicin locus, was made on Jan. 24, 2001 with the International Depositary Authority of Canada (IDAC), 1015 Arlington Street, Winnipeg, Manitoba, R3E 3R2, Canada according to the provisions of the Budapest Treaty. The deposits were assigned accession nos. IDAC 240101-1, IDAC 240101-2 and IDAC 240101-3. All restrictions on the availability to the public of the above IDAC deposits will be irrevocably removed upon the granting of a patent on this application.


[0023] Everninomicin is naturally produced by a number of microorganisms of the order Actinomycetales. Given the potential medical importance of this class of antibiotics, the genetic locus encoding the biosynthetic pathway for everninomicin production was isolated and sequenced from one known producer, Micromonospora carbonacea subspecies aurantiaca (strain number NRRL 2997, obtained from the Agricultural Research Service Culture Collection of the United States Department of Agriculture; everninomicin production by this strain is described in U.S. Pat. No. 3,499,078). The newly discovered locus encodes 49 individual proteins (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58) involved in the biosynthesis of everninomicin by this organism. The full-length locus and individual cloned genes are useful for a variety of purposes relating to synthesis of antibiotics of the orthosomycin class.


[0024] The entire everninomicin biosynthetic locus spans approximately 60 kb. Analysis of this 60 kb DNA sequence reveals the presence of individual genes encoding 49 individual proteins. Three of the genes show strong homology to the Streptomyces viridochromogenes avilamycin biosynthetic genes aviD, aviE and aviM, previously demonstrated to be involved in the biosynthesis of avilamycin, a member of the orthosomycin class of antibiotics (Gaisser et al., 1997, J. Bacteriol., Vol. 179, pp. 6271-6278). The gene encoding ORF 28 of FIG. 1 (SEQ ID NO 33) is homologous to the aviD gene, the gene encoding ORF 29 of FIG. 1 (SEQ ID NO 34) is homologous to the aviE gene, and the gene encoding ORF 32 of FIG. 1 (SEQ ID NO 38) is homologous to the aviM gene.


[0025] The functions of the 49 individual proteins of the everninomicin biosynthetic locus were assessed by computer comparison of each protein with proteins found in the GenBank database of protein sequences (National Center for Biotechnology Information, National Library of Medicine, Bethesda, Md. USA) using the BLASTP algorithm (Altschul et al., 1997, Nucleic Acids Res. Vol. 25, pp. 3389-3402). Significant amino acid sequence homologies and proposed function found for each protein in the everninomicin locus are shown in Table 1.
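When the probability values of Table 1 are processed programmatically, one practical detail is that the exponents are printed with a typographic minus sign (U+2212) rather than an ASCII hyphen. The sketch below parses three values quoted from Table 1 and separates strong matches from weak ones. The cut-off of 1e-5 and all names are illustrative assumptions, not a criterion stated in the specification.

```python
MINUS_SIGN = "\u2212"  # typographic minus used in the printed exponents


def parse_probability(text: str) -> float:
    """Convert a printed value such as '5.00E−83' into a float."""
    return float(text.replace(MINUS_SIGN, "-"))


# (ORF, GenBank accession, printed probability), quoted from Table 1.
hits = [
    ("ORF 1", "AAD41819", "5.00E−83"),
    ("ORF 9", "AAD45266", "3.60E+00"),
    ("ORF 29", "T30873", "1.00E−139"),
]

CUTOFF = 1e-5  # hypothetical threshold, chosen only for illustration

# Keep the hits whose probability is at or below the cut-off.
strong_hits = [
    (orf, accession)
    for orf, accession, printed in hits
    if parse_probability(printed) <= CUTOFF
]
```

With this cut-off the ORF 1 and ORF 29 matches are retained and the ORF 9 match is not, consistent with the much weaker probability listed for ORF 9.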
TABLE 1

ORF | # aa | Proposed function | GenBank homology | Probability | % identity | % similarity | Proposed function of GenBank match
1 | 250 | O-methyltransferase | AAD41819 | 5.00E−83 | 55 | 71 | TylF 3″″-O-methyltransferase in tylosin biosynthetic locus of Streptomyces fradiae
 | | | BAA03670 | 3.00E−80 | 54 | 71 | MycF mycinamicin III O-methyltransferase in the mycinamicin biosynthetic locus of Micromonospora griseorubida
 | | | AAG29794 | 1.00E−79 | 56 | 70 | CumN O-methyltransferase in coumermycin A1 biosynthetic locus of Streptomyces rishiriensis
 | | | AAF67509 | 2.00E−79 | 56 | 70 | NovP O-methyltransferase in the novobiocin biosynthetic locus of Streptomyces spheroides
2 | 345 | integral membrane antiporter (partial) | AAF26906 | 6.00E−38 | 31 | 48 | protein similar to Na/H and drug/H antiporters in epothilone biosynthetic locus of Sorangium cellulosum
 | | | CAB45049 | 2.00E−35 | 31 | 54 | putative integral membrane ion antiporter in chloroeremomycin biosynthetic locus of Amycolatopsis orientalis
 | | | BAA16991 | 6.00E−33 | 26 | 49 | Synechocystis sp. Na/H antiporter
3 | 385 | methyltransferase | BAA79525 | 6.00E−15 | 28 | 41 | hypothetical protein in Aeropyrum pernix with homology to N-6 adenine-specific DNA methylases
 | | | CAB88946 | 6.00E−05 | 31 | 40 | putative methyltransferase in Streptomyces coelicolor
4 | 480 | blue copper oxidoreductase (partial) | CAB12449 | 1.00E−60 | 33 | 44 | Bacillus subtilis spore coat protein involved in brown pigmentation during sporogenesis
 | | | BAA02123 | 6.00E−60 | 35 | 49 | bilirubin oxidase from Myrothecium verrucaria
 | | | CAB75422 | 7.00E−57 | 34 | 47 | polyphenol oxidase from Acremonium morurum
 | | | AAA86668 | 3.00E−35 | 26 | 37 | PhsA phenoxazinone synthase from Streptomyces antibioticus
5 | 274 | methyltransferase | AAF09939 | 9.00E−05 | 53 | 64 | probable methyltransferase, BioC family, from Deinococcus radiodurans
 | | | AAC01738 | 7.00E−05 | 35 | 45 | methyltransferase in rifamycin biosynthetic locus of Amycolatopsis mediterranei
 | | | CAB93437 | 3.00E−04 | 42 | 70 | putative methyltransferase from Streptomyces coelicolor
6 | 414 | C-methyltransferase | AAD41823 | 4.00E−79 | 43 | 55 | TylCIII NDP-hexose 3-C-methyltransferase in the tylosin biosynthetic locus of Streptomyces fradiae
 | | | CAA42926 | 4.00E−72 | 41 | 55 | protein in the erythromycin biosynthetic locus of Saccharopolyspora erythraea
 | | | AAG29803 | 5.00E−46 | 31 | 49 | CumW C-methyltransferase in the coumermycin A1 biosynthetic locus of Streptomyces rishiriensis
 | | | AAF01816 | 1.00E−45 | 31 | 47 | SnoG protein in the nogalamycin biosynthetic locus of Streptomyces nogalater
 | | | AAF67514 | 6.00E−44 | 30 | 47 | NovU C-methyltransferase in the novobiocin biosynthetic locus of Streptomyces spheroides
7 | 357 | O-methyltransferase | AAD12164 | 3.00E−79 | 45 | 59 | TylE O-methyltransferase in the tylosin biosynthetic locus of Streptomyces fradiae
 | | | CAA12021 | 6.00E−72 | 45 | 57 | SnogY O-methylase in the nogalamycin biosynthetic locus of Streptomyces nogalater
 | | | CAA05644 | 7.00E−52 | 42 | 56 | OleY protein in the oleandomycin biosynthetic locus of Streptomyces antibioticus
8 | 292 | mannosyltransferase | AAB89517 | 1.00E−05 | 26 | 47 | galactosyltransferase from Archaeoglobus fulgidus
 | | | CAB58332 | 6.00E−05 | 26 | 38 | putative glycosyl transferase from Streptomyces coelicolor
 | | | AAF12269 | 3.00E−04 | 25 | 45 | mannosyl transferase from Deinococcus radiodurans
9 | 137 | nucleotide-binding protein | AAD45266 | 3.60E+00 | 34 | 42 | Pseudomonas aeruginosa WbjC putative nucleotide-binding protein involved in O-antigen (sugar) biosynthesis
 | | | AAB63947 | 6.20E+00 | 38 | 60 | Streptococcus pneumoniae SulD bifunctional aldolase-pyrophosphokinase
10 | 314 | sugar epimerase/reductase | CAA12010 | 1.00E−51 | 42 | 53 | SnogG dTDP-4-keto-6-deoxyhexose reductase in the nogalamycin biosynthetic locus of Streptomyces nogalater
 | | | AAB63047 | 4.00E−46 | 38 | 52 | DnmV thymidine diphospho-4-keto-2,3,6-trideoxyhexulose reductase in the daunorubicin biosynthetic locus of Streptomyces peucetius
 | | | AAD13561 | 5.00E−45 | 39 | 50 | LanZ3 NDP-hexose 4-keto reductase in the landomycin biosynthetic locus of Streptomyces cyanogenus
 | | | AAF72549 | 4.00E−43 | 39 | 48 | UrdZ3 NDP-hexose 4-ketoreductase in the urdamycin biosynthetic locus of Streptomyces fradiae
11 | 285 | O-methyltransferase | BAA32132 | 2.00E−68 | 50 | 61 | methyltransferase in Streptomyces griseus
 | | | AAB00531 | 2.00E−63 | 46 | 59 | DmpM O-demethylpuromycin-O-methyltransferase in the puromycin biosynthetic locus of Streptomyces alboniger
 | | | AAD32742 | 8.00E−34 | 34 | 47 | MmcR O-methyltransferase in the mitomycin biosynthetic locus of Streptomyces lavendulae
 | | | AAA67518 | 4.00E−32 | 33 | 48 | TcmN O-methyltransferase in the tetracenomycin biosynthetic locus of Streptomyces glaucescens
12 | 276 | oxygenase | CAA07766 | 5.00E+00 | 27 | 39 | MtmOI oxygenase in the mithramycin biosynthetic locus of Streptomyces argillaceus
13 | 265 | tRNA/rRNA methylase | AAG32066 | 3.00E−73 | 54 | 70 | rRNA methyltransferase AviRb involved in avilamycin A resistance in Streptomyces viridochromogenes
 | | | AAF10591 | 7.00E−28 | 36 | 51 | rRNA methylase from Deinococcus radiodurans
 | | | AAF73591 | 1.00E−23 | 31 | 48 | SpoU rRNA methylase family protein from Chlamydia muridarum
 | | | AAC68000 | 1.00E−22 | 30 | 48 | SpoU family rRNA methylase from Chlamydia trachomatis
 | | | AAD18670 | 2.00E−22 | 27 | 48 | SpoU-1 rRNA methylase from Chlamydophila pneumoniae
14 | 344 | 3-ketoacyl-[ACP]-synthase | AAG29787 | 2.00E−76 | 43 | 58 | CumJ 3-ketoacyl-[ACP]-synthase in the coumermycin A1 biosynthetic locus of Streptomyces rishiriensis
 | | | AAA65208 | 2.00E−61 | 38 | 54 | DpsC daunorubicin-doxorubicin polyketide synthase from Streptomyces peucetius
 | | | CAB71914 | 3.00E−70 | 40 | 58 | beta-keto acyl synthase III homolog from Streptomyces coelicolor
 | | | AAF70109 | 5.00E−54 | 37 | 50 | AknE2 ketoacyl synthase involved in aclacinomycin biosynthesis in Streptomyces galilaeus
15 | 240 | methyltransferase | CAA70016 | 5.00E−04 | 33 | 41 | StsG methyltransferase involved in N-methyl-L-glucosamine pathway in streptomycin biosynthetic locus of Streptomyces griseus
 | | | AAG06559 | 2.00E−03 | 24 | 41 | UbiG 3-demethylubiquinone-9 3-methyltransferase from Pseudomonas aeruginosa
 | | | AAF09618 | 5.00E−03 | 27 | 47 | putative methyltransferase from Deinococcus radiodurans
 | | | AAD28458 | 1.50E−02 | 27 | 43 | MitN methyltransferase in the mitomycin biosynthetic locus of Streptomyces lavendulae
16 | 380 | glycosyltransferase | AAF00209 | 5.00E−80 | 44 | 58 | UrdGT2 glycosyl transferase in the urdamycin A biosynthetic locus of Streptomyces fradiae
 | | | AAD13553 | 7.00E−78 | 43 | 59 | LanGT2 glycosyl transferase in the landomycin biosynthetic locus of Streptomyces cyanogenus
 | | | CAA09635 | 8.00E−70 | 42 | 55 | Gra-orf14 putative glycosyl transferase in the granaticin biosynthetic locus of Streptomyces violaceoruber
 | | | AAC01731 | 3.00E−58 | 37 | 51 | dNTP-hexose glycosyl transferase in the rifamycin biosynthetic locus of Amycolatopsis mediterranei
17 | 405 | unknown | none | | | |
18 | 296* (partial) | alpha-ketoglutarate-dependent dioxygenase | AAC71711 | 0.005 | 27 | 42 | HtxA putative alpha-ketoglutarate-dependent hypophosphite dioxygenase from Pseudomonas stutzeri
19 | 243 | methyltransferase | JC5319 | 9.90E−02 | 43 | 61 | TlrD macrolide-lincosamide-streptogramin B resistance determinant from Streptomyces fradiae
 | | | CAB45043 | 2.20E−01 | 36 | 49 | putative rRNA methylase from Amycolatopsis orientalis
 | | | AAF86398 | 3.80E−01 | 26 | 35 | FkbM 31-O-methyltransferase in the FK520 biosynthetic locus of Streptomyces hygroscopicus var. ascomyceticus
 | | | AAC44360 | 3.80E−01 | 30 | 40 | FkbM 31-O-demethyl-FK506 methyltransferase in the FK506 biosynthetic locus of Streptomyces sp.
20 | 482 | halogenase | CAA11780 | 6.00E−60 | 32 | 50 | protein similar to non-heme oxygenase/halogenase in chloroeremomycin biosynthetic locus of Amycolatopsis orientalis
 | | | CAA76550 | 5.00E−59 | 32 | 49 | OxyD putative halogenase in the balhimycin biosynthetic locus of Amycolatopsis mediterranei
 | | | AAG38844 | 2.00E−34 | 31 | 47 | putative reductase/halogenase in the xanthomonadin biosynthetic locus of Xanthomonas oryzae
 | | | AAD24884 | 7.00E−29 | 27 | 43 | PltA putative halogenase in the pyoluteorin biosynthetic locus of Pseudomonas fluorescens
21 | 438 | glycosyltransferase | AAC64928 | 2.00E−44 | 32 | 44 | MtmGI glycosyltransferase involved in mithramycin biosynthesis in Streptomyces argillaceus
 | | | AAD55583 | 2.00E−43 | 32 | 46 | MtmGIII glycosyltransferase involved in mithramycin biosynthesis in Streptomyces argillaceus
 | | | AF077869 | 2.00E−41 | 32 | 44 | MtmGIV glycosyltransferase involved in mithramycin biosynthesis in Streptomyces argillaceus
 | | | AAC68677 | 3.00E−34 | 28 | 42 | DesVII glycosyl transferase in the methymycin/pikromycin biosynthetic locus of Streptomyces venezuelae
22 | 325 | acetoin dehydrogenase E1 alpha subunit | AAG07537 | 8.00E−71 | 48 | 60 | probable dehydrogenase E1 component from Pseudomonas aeruginosa
 | | | AAA21744 | 8.00E−69 | 46 | 61 | TPP-dependent acetoin dehydrogenase E1 alpha-subunit from Clostridium magnum
 | | | AAA21948 | 3.00E−65 | 46 | 57 | Acetoin:DCPIP oxidoreductase-alpha from Ralstonia eutropha
23 | 320 | acetoin dehydrogenase E1 beta subunit | AAA18916 | 2.00E−53 | 38 | 55 | Acetoin:DCPIP oxidoreductase beta subunit from Pelobacter carbinolicus
 | | | AAG07538 | 8.00E−53 | 40 | 54 | Acetoin catabolism protein AcoB from Pseudomonas aeruginosa
 | | | AAA21745 | 6.00E−52 | 37 | 57 | TPP-dependent acetoin dehydrogenase beta-subunit from Clostridium magnum
24 | 337 | rhamnosyltransferase | CAB50099 | 2.00E−18 | 31 | 48 | rhamnosyl transferase related protein from Pyrococcus abyssi
 | | | AAF04375 | 5.00E−18 | 29 | 42 | WbbL dTDP-Rha:a-D-GlcNAc-diphosphoryl polyprenol a-3-L-rhamnosyl transferase from Mycobacterium smegmatis
 | | | AAF12271 | 3.00E−16 | 27 | 45 | putative rhamnosyltransferase from Deinococcus radiodurans
 | | | AAB66522 | 2.00E−15 | 24 | 44 | putative rhamnosyl transferase involved in capsular polysaccharide biosynthesis in Streptococcus pneumoniae
25 | 350 | unknown | none | | | |
26 | 252 | alpha-ketoglutarate-dependent dioxygenase | AAF01812 | 1.00E−12 | 28 | 41 | SnoK protein in the nogalamycin biosynthetic locus of Streptomyces nogalater
 | | | AAC71711 | 3.00E−11 | 23 | 42 | HtxA putative alpha-ketoglutarate-dependent hypophosphite dioxygenase from Pseudomonas stutzeri
 | | | AAB81835 | 3.00E−06 | 23 | 35 | peroxisomal phytanoyl-CoA alpha-hydroxylase from Mus musculus
 | | | AAF15971 | 2.00E−05 | 23 | 38 | 2-oxoglutarate dependent peroxisomal phytanoyl-CoA hydroxylase (dioxygenase) from Rattus norvegicus
27 | 309 | sugar dehydratase/epimerase | AAG08838 | 4.00E−46 | 38 | 53 | Gmd GDP-mannose 4,6-dehydratase from Pseudomonas aeruginosa
 | | | AAC38668 | 7.00E−46 | 37 | 51 | LpsA putative GDP-mannose-4,6-dehydratase predicted to be involved in S-layer lipopolysaccharide biosynthesis in Caulobacter crescentus
 | | | AAC44117 | 6.00E−44 | 37 | 51 | Gca GDP-D-mannose dehydratase involved in common antigen biosynthesis in Pseudomonas aeruginosa
 | | | AAB84839 | 7.00E−43 | 34 | 50 | GDP-D-mannose dehydratase in Methanothermobacter thermoautotrophicus
 | | | AAD20373 | 2.00E−42 | 36 | 50 | MdhtA GDP-D-mannose-dehydratase found in glycopeptolipid biosynthetic locus of Mycobacterium avium
28 | 355 | sugar nucleotidyltransferase | P08075 | 1.00E−126 | 61 | 77 | StrD glucose-1-phosphate thymidylyltransferase found in the streptomycin biosynthetic locus in Streptomyces griseus
 | | | T30872 | 1.00E−125 | 60 | 78 | AviD dNDP-glucose synthase in the avilamycin biosynthetic locus of Streptomyces viridochromogenes
 | | | AAD28517 | 1.00E−124 | 59 | 77 | BlmD streptomycin strD protein homolog in the bluensomycin biosynthetic locus of Streptomyces bluensis
 | | | T48866 | 1.00E−123 | 60 | 77 | MtmD glucose-1-phosphate thymidylyltransferase in the mithramycin biosynthetic locus of Streptomyces argillaceus
29 | 329 | sugar 4,6-dehydratase | T30873 | 1.00E−139 | 74 | 82 | AviE dNDP-glucose dehydratase in the avilamycin biosynthetic locus of Streptomyces viridochromogenes
 | | | AAG18457 | 1.00E−123 | 66 | 75 | AprE dTDP-glucose 4,6-dehydratase from Streptomyces tenebrarius
 | | | AAA68211 | 1.00E−123 | 66 | 75 | TDP-D-glucose-4,6-dehydratase in the erythromycin biosynthetic locus of Saccharopolyspora erythraea
 | | | BAA84593 | 1.00E−115 | 63 | 76 | AveBII dTDP-glucose 4,6-dehydratase in the avermectin biosynthetic locus of Streptomyces avermitilis
 | | | AAC68681 | 1.00E−114 | 62 | 74 | DesIV TDP-glucose-4,6-dehydratase in the methymycin/pikromycin biosynthetic locus of Streptomyces venezuelae
30 | 342 | sugar epimerase/ketoreductase | AAD35594 | 6.00E−43 | 38 | 53 | UDP-glucose 4-epimerase from Thermotoga maritima
 | | | AAG07455 | 3.00E−37 | 37 | 51 | probable epimerase from Pseudomonas aeruginosa
 | | | A71183 | 2.00E−34 | 33 | 46 | probable UDP-glucose 4-epimerase from Pyrococcus horikoshii
 | | | CAB49227 | 1.00E−33 | 33 | 46 | GalE-1 UDP-glucose 4-epimerase from Pyrococcus abyssi
31 | 354 | alpha-ketoglutarate-dependent dioxygenase | AAF01812 | 1.00E−10 | 26 | 41 | SnoK protein in the nogalamycin biosynthetic locus of Streptomyces nogalater
 | | | AAB81835 | 3.00E−07 | 29 | 43 | peroxisomal phytanoyl-CoA alpha-hydroxylase from Mus musculus
 | | | AAC71711 | 4.00E−06 | 25 | 41 | HtxA putative alpha-ketoglutarate-dependent hypophosphite dioxygenase from Pseudomonas stutzeri
32 | 1267 | iterative type I polyketide synthase | CAA72713 | 0.00E+00 | 65 | 75 | AviM orsellinic acid synthase in the avilamycin biosynthetic locus of Streptomyces viridochromogenes
 | | | BAA20102 | 0.00E+00 | 40 | 56 | 6-methylsalicylic acid synthase from Aspergillus terreus
 | | | S13178 | 0.00E+00 | 41 | 55 | 6-methylsalicylic acid synthase from Penicillium griseofulvum
33 | 303 | hydrolase/phosphatase | AAF09992 | 1.00E−05 | 31 | 43 | hydrolase of the CbbY/CbbZ/GpH/YieH family from Deinococcus radiodurans
 | | | AAG19324 | 1.00E−05 | 32 | 46 | p-nitrophenyl phosphatase from Halobacterium sp.
 | | | AAC76410 | 4.00E−03 | 33 | 53 | phosphoglycolate phosphatase from Escherichia coli
34 | 307 | sugar epimerase/ketoreductase | AAD45554 | 2.00E−52 | 43 | 55 | Spcl putative dNDP-glucose-4,6-dehydratase in the spectinomycin biosynthetic locus of Streptomyces flavopersicus
 | | | CAA18814 | 1.00E−23 | 32 | 43 | putative sugar dehydratase from Mycobacterium leprae
 | | | AAD35594 | 2.00E−23 | 28 | 44 | UDP-glucose 4-epimerase from Thermotoga maritima
 | | | BAA84595 | 2.00E−17 | 30 | 42 | AviBIV dTDP-4-keto-6-deoxy-L-hexose 4-reductase in the avermectin biosynthetic locus of Streptomyces avermitilis
35 | 295 | glycosyltransferase | S37028 | 6.00E−05 | 28 | 42 | ExoM rhizobium succinoglycan biosynthesis glycosyltransferase from Sinorhizobium meliloti
 | | | AAB90621 | 2.20E−01 | 25 | 42 | ExoM succinoglycan biosynthesis protein from Archaeoglobus fulgidus
36 | 341 | sugar ketoreductase | AAF73453 | 6.00E−91 | 55 | 69 | AknQ putative 3-ketoreductase in the Streptomyces galilaeus aclacinomycin biosynthetic locus
 | | | AAD13550 | 2.00E−87 | 53 | 65 | LanT oxidoreductase homolog found in the landomycin biosynthetic locus of Streptomyces cyanogenus
 | | | AAA83425 | 3.00E−85 | 48 | 64 | RdmF oxidoreductase of Streptomyces purpurascens
 | | | AAF59931 | 4.00E−82 | 50 | 65 | dTDP-3,4-diketo-2,6-dideoxyglucose 3-ketoreductase involved in the 2-deoxygenation step in dTDP-L-oleandrose biosynthesis
37 | 470 | sugar 2,3-dehydratase | AAD55451 | 1.00E−127 | 52 | 64 | OleV involved in the C-2 deoxygenation step in dTDP-L-oleandrose biosynthesis in Streptomyces antibioticus
 | | | CAB96551 | 1.00E−122 | 52 | 63 | MtmV D-olivose, D-oliose and D-mycarose 2,3-dehydratase in the mithramycin biosynthetic locus of Streptomyces argillaceus
 | | | T46668 | 1.00E−119 | 51 | 64 | SnogH probable 2,3-dehydratase in the nogalamycin biosynthetic locus of Streptomyces nogalater
 | | | AAD13549 | 1.00E−118 | 50 | 63 | LanS NDP-hexose 2,3-dehydratase homolog in the landomycin biosynthetic locus of Streptomyces cyanogenus
38 | 346 | sugar dehydratase | AAF71765 | 1.00E−120 | 63 | 77 | NysDIII putative dGDP-mannose-4,6-dehydratase in the nystatin biosynthetic locus of Streptomyces noursei
 | | | AAG35360 | 4.00E−96 | 55 | 71 | Gmd GDP-mannose 4,6-dehydratase from Aneurinibacillus thermoaerophilus
 | | | AAD10232 | 5.00E−93 | 52 | 69 | putative GDP-D-mannose dehydratase from Anabaena sp.
 | | | AAC44117 | 3.00E−89 | 50 | 68 | Gca GDP-D-mannose dehydratase involved
incommon antigen biosynthesis inPseudomonas aeruginosaAAC386682.00E−884967LpsA putative GDP-mannose-4,6-dehydratasepredicted to be involved in S-layer lipopolysaccharidebiosynthesis in Caulobacter crescentusAAF071993.00E−874966Gmd1 GDP-D-mannose 4,6-dehydratase fromArabidopsis thaliana39277resistance rRNAAAG320672.00E−625265AviRa rRNA methyltransferase involved in avilamycinmethyltransferaseA resistance in Streptomyces viridochromogenes40159*sugar epimerase/AAD355942.00E−314363UDP-glucose 4-epimerase from Thermotoga maritimaketoreductase49*C705622.00E−294559robable dTDP-glucose 4-epimerase fromMycobacterium tuberculosis(partial)AAB981964.00E−284361GalE UDP-glucose 4-epimerase fromMethanococcus jannaschiiCAA188142.00E−274357putative sugar dehyratase from Mycobacterium leprae41400flavoproteinCAA516701.00E−1085568ORF3 flavoprotein in the daunorubicin biosyntheticoxidoreductaselocus of Streptomyces griseusAAB630454.00E−563947DnmZ putative flavoprotein required for biosynthesisof the daunorubicin precursor thymidine diphospho-L-daunosamine in Streptomyces peucetius42373deoxyhexoseCAA117821.00E−1577382PCZA361.5 sugar biosynthesis gene in theaminotransferasechloroeremomycin biosynthetic locus ofAmycolatopsis orientalisAAG139101.00E−1517083MegDII TDP-3-keto-6-deoxyhexose 3-aminotransaminase in the megalomicinbiosynthetic locus of Micromonospora megalomiceaAAF734621.00E−1457481AknZ putative aminotransferase in the aclacinomycinbiosynthetic locus of Streptomyces galilaeusAAF018211.00E−1437381Snogl putative aminotransferase in the nogalamycinbiosynthetic locus of Streptomyces nogalater43416C-methyltransferaseCAA117771.00E−1596779PCZA361.22 sugar biosynthesis gene in thechloroeremomycin biosynthetic locus ofAmycolatopsis orientalisAAC384441.00E−1526677DnrX daunorubicin/doxorubicin biosynthesis enzymefrom Streptomyces peucetiusCAB965492.00E−663751MtmC D-mycarose 3-C-methyltransferase in themithramycin biosynthetic locus ofStreptomyces argillaceusAAG298037.00E−623450CumW 
C-methyltransferase in the coumermycin A1biosynthetic locus of Streptomyces rishiriensis44207sugar epimeraseAAB630467.00E−686375DnmU putative epimerase involved in thebiosynthesis of daunorubicin precursorTDP-L-daunosamine in Streptomyces peucetiusAAF701012.00E−646073AknL dTDP-4-keto-6-deoxyhexose 3,5-epimerase inthe aclacinomycin biosynthetic locus ofStreptomyces galilaeusCAA117818.00E−645872Protein similar to epimerase in the chloroeremomycinbiosynthetic locus of Amycolatopsis orientalisCAA120111.00E−606072SnogF 3,5-epimerase in the nogalamycin biosyntheticlocus of Streptomyces nogalater45343sugar ketoreductaseAAG139133.00E−865464MegDV TDP-4-keto-6-deoxyhexose 4-ketoreductasein the megalomicin biosynthetic locus ofMicromonospora megalomiceaCAA117642.00E−845171protein similar to dTDP-dehydrogenase in thechloroeremomycin biosynthetic locus ofAmycolatopsis orientalisBAA845951.00E−795363AveBlVdTDP-4-keto-6-deoxy-L-hexose 4-reductase inthe avermectin biosynthetic locus ofStreptomyces avermitilisAAB840713.00E−734863EryBIV oxidoreductase involved in L-mycarosebiosynthesis in the erythromycin biosyntheticlocus of Saccharopolyspora erythraea46306unknownNone47518endoglucanaseAAA230842.00E−455263endoglucanase from Cellulomonas fimiCAC169704.00E−413547putative secreted endoglucanase fromStreptomyces coeticolorAAA622115.00E−365062beta-1,4-exocellulase precursor fromThermobifida fusca48286transcriptionalCAB619192.00E−564558putative lacl-family transcriptional regulatorregulatorin Streptomyces coelicolorCAA206098.00E−564659putative lacl-family transcriptional regulator inStreptomyces coelicolorCAB656542.00E−282848putative repressor of maltose transportgenes in AlicyclobacillusacidocaldariusAAD518264.00E−283449ThuR member of the Lacl-GalR family regulatoryproteins in Sinorhizobium meliloti49340glucokinaseCAB952964.00E−293448probable sugar kinase from Streptomyces coelicolorCAB655766.00E−283744putative transcriptional regulatory proteinwith similarity to glucokinase 
inStreptomyces coelicolorBAB051442.00E−273147glucose kinase from Bacillus haloduransAAD365379.00E−262945glucokinase from Thermotoga maritima


[0026] The everninomicin backbone is composed of eight saccharide residues joined by glycosidic and orthoester linkages. Many of the proteins encoded by the everninomicin locus are likely to be involved in the biosynthesis of the sugar precursors and their subsequent joining and modification.


[0027] Five of the eight saccharide residues of everninomicin (residues A-E of FIG. 2) are deoxyhexoses and are likely to be derived from D-glucose-6-phosphate. Deoxyhexoses are common constituents of microbial secondary metabolites. The first two steps in the biosynthesis of many deoxysugars are the synthesis of dNDP-D-glucose and its conversion to dNDP-4-keto-6-deoxyglucose, catalyzed respectively by dNDP-glucose synthases and dNDP-glucose dehydratases (Liu and Thorson, 1994, Annu. Rev. Microbiol., Vol. 48, pp. 223-256). ORF 28 (SEQ ID NO 33) is similar to many bacterial dNDP-glucose synthases while ORF 29 (SEQ ID NO 34) is similar to many bacterial dNDP-glucose dehydratases. These two proteins are likely to be involved in generating 6-deoxyhexose precursors for incorporation into everninomicin. Sugar residues at positions A-C, and occasionally D, also lack C-2 hydroxyl groups (see FIG. 2). ORFs 36 and 37 (SEQ ID NOS 42 and 43) encode proteins that are similar to bacterial proteins known to be involved in C-2 deoxygenation and are therefore likely to be involved in the generation of 2,6-dideoxyhexose precursors. ORFs 10, 27, 30, 34, 38 and 40 (SEQ ID NOS 14, 32, 35, 40, 44, and 46) are similar to bacterial proteins that catalyze dehydration, epimerization and/or ketoreduction of deoxyhexose precursors and are likely to catalyze 4-ketoreduction to generate sugars with the appropriate C-4 stereochemistry for everninomicin biosynthesis. A biosynthetic scheme for the production of deoxyhexose precursors for everninomicin biosynthesis is shown in FIG. 3.


[0028] The everninomicins are distinguished from other orthosomycin antibiotics by the presence of a nitrogen-containing sugar residue (residue A of FIG. 2). ORFs 41-45 (SEQ ID NOS 50 to 54) constitute a cluster of ORFs with strong similarity to proteins involved in the biosynthesis of aminodeoxyhexoses. In particular, these ORFs are similar to proteins proposed to catalyze the synthesis of the 3-amino-3-methyl-2,3,6-trideoxyhexose residue of chloroeremomycin (van Wageningen et al., 1998, Chem. & Biol., Vol. 5, pp. 155-162) and proteins involved in the synthesis of the 3-amino-2,3,6-trideoxyhexose residue of daunorubicin (Olano et al., 1999, Chem. & Biol., Vol. 6, pp. 845-855). ORFs 41-45 (SEQ ID NOS 50 to 54) are therefore likely to catalyze the biosynthesis of a 3-amino-3-methyl-2,3,6-trideoxyhexose intermediate that would subsequently be modified by O-methyl transfer and amino group oxidation to yield the evernitrose nitrosugar residue. Two proteins (ORFs 1, 7; SEQ ID NOS 2 and 11) found in the everninomicin locus are similar to bacterial proteins that catalyze O-methyl transfer to deoxyhexose groups of secondary metabolites and may catalyze O-methyl transfer in evernitrose biosynthesis. ORF 4 (SEQ ID NO 7) encodes an unusual oxidoreductase that shows similarity to bacterial blue-copper oxidoreductases involved in oxidizing nitrogen-containing compounds and as such provides a likely candidate for the amine oxidase required for the biosynthesis of evernitrose. A scheme for the biosynthesis of the nitrosugar evernitrose is shown in FIG. 4.


[0029] Five proteins (ORFs 8, 16, 21, 24 and 35; SEQ ID NOS 12, 20, 26, 29, and 41) are similar to bacterial glycosyltransferases and are therefore likely to catalyze the joining of saccharide precursors via glycosidic linkages to form the backbone oligosaccharide structure that is characteristic of the orthosomycins. Among the glycosyltransferases encoded by the everninomicin locus, one (ORF 16; SEQ ID NO 20) shows the greatest similarity to enzymes known to catalyze the transfer of aminodeoxyhexose residues. This glycosyltransferase is therefore likely to catalyze the incorporation of the aminodeoxyhexose precursor that is subsequently converted to the nitrosugar evernitrose. The protein encoded by ORF 35 (SEQ ID NO 41) is the most unusual of the glycosyltransferases and is therefore likely to form the unusual C-1 to C-1′ linkage that is characteristic of the orthosomycins.


[0030] The everninomicins may contain as many as 7 O-methyl groups (see FIG. 2). It is significant then that the everninomicin locus encodes seven proteins (ORFs 1, 3, 5, 7, 11, 15 and 19; SEQ ID NOS 2, 6, 9, 11, 15, 19, and 24) that show similarity to O-methyltransferases. It is likely that each of these proteins catalyzes a specific O-methylation reaction during the course of everninomicin biosynthesis. ORFs 1 and 7 (SEQ ID NOS 2 and 11) are discussed above as possible enzymes responsible for methylating the C-4 hydroxyl group of the nitrosugar evernitrose. ORF 11 (SEQ ID NO 15) is discussed in more detail below and is likely to catalyze methylation of the phenolic hydroxyl group found on the dichloroisoeverninic acid moiety.


[0031] Four proteins encoded by the everninomicin locus (ORFs 12, 18, 26 and 31; SEQ ID NOS 16, 23, 31 and 37) are similar to oxidoreductases and are likely to catalyze the unusual oxidative modifications of the oligosaccharide backbone that are typical of the orthosomycins. In particular, three of these oxidoreductases (ORFs 18, 26 and 31; SEQ ID NOS 23, 31 and 37) show significant similarity to alpha-ketoglutarate-dependent dioxygenases and may therefore be involved in generating the three orthoester/diether linkages found in all orthosomycins (the orthoester linkages between sugar rings C-D and rings G-H, and the aliphatic methylene dioxy group appended to ring H, as shown in FIG. 2).


[0032] Two proteins in the everninomicin locus (ORFs 6, 43; SEQ ID NOS 10 and 52) are similar to C-methyltransferases that transfer methyl groups to deoxyhexose residues, thus accounting for the source of the two deoxyhexose C-methyl groups found in everninomicin (see FIG. 2). ORF 43 (SEQ ID NO 52) forms part of the aminodeoxyhexose gene cluster discussed earlier and is likely to be responsible for incorporating the C-3 methyl group of the evernitrose residue. ORF 6 (SEQ ID NO 10) is thus the likely source of the only remaining C-methyl group of everninomicin, that found on C-3 of the deoxyhexose residue D.


[0033] Four proteins encoded by the everninomicin locus (ORFs 11, 14, 20 and 32; SEQ ID NOS 15, 18, 25 and 38) are likely to be involved in the biosynthesis of the dichloroisoeverninic acid moiety that is found in ester linkage to the sugar residue B of everninomicin (see FIG. 2). ORF 32 (SEQ ID NO 38) encodes a type I polyketide synthase that is similar to fungal 6-methylsalicylic acid synthases and to the AviM orsellinic acid synthase involved in avilamycin biosynthesis in Streptomyces viridochromogenes (Gaisser et al., 1997, J. Bacteriol., Vol. 179, pp. 6271-6278). ORF 32 (SEQ ID NO 38) is proposed to catalyze successive rounds of condensation of acyl-CoA precursors to form orsellinic acid, an aromatic precursor to isoeverninic acid. ORF 14 (SEQ ID NO 18) encodes a protein that is similar to 3-ketoacyl-[ACP]-synthases, including the DpsC protein in the daunorubicin biosynthetic locus of Streptomyces sp. strain C5. The DpsC protein has been proposed to interact with polyketide synthases and to confer specificity for the proper acyl-CoA starter unit (Rajgarhia et al., 1997, J. Bacteriol., Vol. 179, pp. 2690-2696). Similarly, the ORF 14 protein may interact with the ORF 32 (SEQ ID NO 38) polyketide synthase during the synthesis of the orsellinic acid precursor. ORF 11 (SEQ ID NO 15) encodes an O-methyltransferase that shows greatest similarity to bacterial proteins that transfer methyl groups to phenolic hydroxyls, and is therefore likely to catalyze the conversion of orsellinic acid to isoeverninic acid. ORF 20 (SEQ ID NO 25) encodes a protein that is similar to many bacterial non-heme halogenases, and is likely to catalyze the addition of 2 chlorine atoms to isoeverninic acid to form dichloroisoeverninic acid. A scheme for the biosynthesis of the dichloroisoeverninic acid moiety is shown in FIG. 5.


[0034] Three proteins encoded by the everninomicin locus (ORFs 22, 23 and 33; SEQ ID NOS 27, 28 and 39) are similar to enzymes involved in carbohydrate metabolism and may serve to generate short chain aliphatic alcohol precursors that are subsequently used to modify the variable positions on C-52 of residue H (see FIG. 2). ORFs 22 and 23 (SEQ ID NOS 27 and 28) are similar to subunits of the acetoin dehydrogenase component E1 involved in the catabolism of acetoin (3-hydroxy-2-butanone), while ORF 33 (SEQ ID NO 39) shows some similarity to bacterial phosphoglycolate phosphatases involved in glycolate (hydroxyacetic acid) metabolism.


[0035] Four proteins encoded by the everninomicin locus (ORFs 2, 13, 39 and 47; SEQ ID NOS 5, 17, 45 and 56) are likely to be involved in conferring resistance to everninomicin and/or transporting everninomicin out of the producing bacterial cell. Everninomicin inhibits bacterial protein synthesis, and thus exerts its antibacterial effect, by binding to a specific site on the bacterial 50S ribosomal subunit (McNicholas et al., 2000, Antimicrob. Agents Chemother., Vol. 44, pp. 1121-1126). ORFs 13 and 39 (SEQ ID NOS 17 and 45) encode proteins that are similar to ribosomal RNA methyltransferases and are therefore likely to confer resistance to everninomicin (or its intermediates) by modifying the ribosomes of the producing microorganism. ORF 47 (SEQ ID NO 56) encodes a protein with similarity to a number of bacterial endoglucanases, enzymes that catalyze the hydrolysis of internal beta-1,4-glycosidic linkages. The ORF 47 (SEQ ID NO 56) enzyme may confer resistance to everninomicin or its intermediates by cleaving the beta-1,4-endoglycosidic linkage that is found in the oligosaccharide backbone of all orthosomycins. ORF 2 (SEQ ID NO 5) encodes a protein that is similar to integral membrane antiporters associated with antibiotic biosynthesis in other bacteria and is therefore likely to be involved in transport of everninomicin or its intermediates across the bacterial cell membrane.


[0036] Two proteins encoded by the everninomicin locus (ORFs 48, 49; SEQ ID NOS 57 and 58) are likely to be involved in regulating the expression of one or more of the genes in the locus. The orthosomycins are composed of repeating saccharide units and the biosynthesis of these molecules may be sensitive to the availability of saccharide precursors from primary cellular metabolism. ORF 48 (SEQ ID NO 57) encodes a protein that is similar to LacI family transcriptional repressors that contain sugar binding sites and regulate transcription in response to the presence of small molecules such as saccharides. The ORF 49 (SEQ ID NO 58) protein is similar to glucose kinase and to ROK family transcriptional regulators that have glucose kinase homology. This protein may act as a sensor of hexose levels in the cell and interact with the ORF 48 (SEQ ID NO 57) transcriptional regulator in order to activate expression of one or more genes in the everninomicin locus in response to the availability of saccharide precursors.


[0037] Four proteins encoded by the everninomicin locus (ORFs 9, 17, 25 and 46; SEQ ID NOS 13, 21, 30 and 55) cannot be assigned a putative role in the biosynthesis of everninomicin. ORFs 17, 25 and 46 (SEQ ID NOS 21, 30 and 55) show no significant similarity to proteins in the GenBank database, while the ORF 9 (SEQ ID NO 13) protein shows weak similarity to putative nucleotide-binding proteins involved in sugar biosynthesis.
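
The role assignments proposed in paragraphs [0027] to [0037] can be collected into a single lookup structure. The following is a minimal sketch in Python, not part of the specification: the category labels and the helper names roles_for and unassigned are illustrative assumptions, and only the ORF numbers are taken from the text. It shows that ORFs 11 and 43 each appear under two proposed roles and that every one of ORFs 1 to 49 is accounted for.

```python
# Proposed roles for ORFs 1-49, transcribed from paragraphs [0027]-[0037].
# The category labels are illustrative; they are not terms defined in the text.
PROPOSED_ROLES = {
    "dNDP-glucose synthesis and 4,6-dehydration": [28, 29],
    "C-2 deoxygenation": [36, 37],
    "dehydration/epimerization/ketoreduction": [10, 27, 30, 34, 38, 40],
    "aminodeoxyhexose biosynthesis": [41, 42, 43, 44, 45],
    "amine oxidation": [4],
    "glycosyl transfer": [8, 16, 21, 24, 35],
    "O-methyl transfer": [1, 3, 5, 7, 11, 15, 19],
    "oxidative modification of the backbone": [12, 18, 26, 31],
    "C-methyl transfer": [6, 43],
    "dichloroisoeverninic acid biosynthesis": [11, 14, 20, 32],
    "aliphatic alcohol precursor supply": [22, 23, 33],
    "resistance and transport": [2, 13, 39, 47],
    "regulation": [48, 49],
    "no assigned role": [9, 17, 25, 46],
}


def roles_for(orf):
    """Return every proposed role listed for the given ORF number."""
    return sorted(role for role, orfs in PROPOSED_ROLES.items() if orf in orfs)


def unassigned(total=49):
    """Return ORF numbers in 1..total that appear under no category."""
    assigned = {orf for orfs in PROPOSED_ROLES.values() for orf in orfs}
    return sorted(set(range(1, total + 1)) - assigned)


if __name__ == "__main__":
    print(roles_for(43))   # ORF 43 is both a C-methyltransferase and part of the aminosugar cluster
    print(unassigned())    # empty list when all 49 ORFs are covered
```

A structure of this kind makes it straightforward to check a proposed engineering change (for example, deleting one O-methyltransferase) against the other roles attributed to the same ORF.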


[0038] Polynucleotide and Amino Acid Sequences:


[0039] The term “isolated polynucleotide” is defined as a polynucleotide removed from the environment in which it naturally occurs. For example, a naturally-occurring DNA molecule present in the genome of a living bacterium is not isolated, but the same molecule separated from the remaining part of the bacterial genome, as a result of, e.g., a cloning event (amplification), is isolated. Typically, an isolated DNA molecule is free from its natural chromosomal context. Such isolated polynucleotides may be part of a vector or a composition and still be defined as isolated in that such a vector or composition is not part of the natural environment of such polynucleotide.


[0040] The polynucleotide of the invention is either RNA or DNA (cDNA, genomic DNA, or synthetic DNA), or modifications, variants, homologs or fragments thereof. The DNA is either double-stranded or single-stranded, and, if single-stranded, is either the coding strand or the non-coding (anti-sense) strand. Any one of the polynucleotide sequences of the invention as shown in FIG. 1 is (a) a coding sequence; (b) a ribonucleotide sequence derived from transcription of (a); (c) a coding sequence which uses the redundancy or degeneracy of the genetic code to encode the same polypeptides; or (d) a regulatory sequence. By “polypeptide” or “protein” is meant any chain of amino acids, regardless of length or post-translational modification (e.g., proteolytic processing or phosphorylation). Both terms are used interchangeably in the present application.


[0041] Consistent with this aspect of the invention, amino acid sequences are provided which are homologous to any one of the amino acid sequences of FIG. 1. As used herein, “homologous amino acid sequence” is any polypeptide which is encoded, in whole or in part, by a nucleic acid sequence which hybridizes at 25-35° C. below critical melting temperature (Tm), to any portion of the coding region nucleic acid sequences of FIG. 1. A homologous amino acid sequence is one that differs from an amino acid sequence shown in FIG. 1 by one or more conservative amino acid substitutions. Such a sequence also encompasses allelic variants (defined below) as well as sequences containing deletions or insertions which retain the functional characteristics of the polypeptide. Preferably, such a sequence is at least 75%, more preferably 80%, and most preferably 90% identical to any amino acid sequence shown in FIG. 1.


[0042] Homologous amino acid sequences include sequences that are identical or substantially identical to the amino acid sequences of FIG. 1. By “amino acid sequence substantially identical” is meant a sequence that is at least 90%, preferably 95%, more preferably 97%, and most preferably 99% identical to an amino acid sequence of reference and that preferably differs from the sequence of reference by a majority of conservative amino acid substitutions.


[0043] Conservative amino acid substitutions are substitutions among amino acids of the same class. These classes include, for example, amino acids having uncharged polar side chains, such as asparagine, glutamine, serine, threonine, and tyrosine; amino acids having basic side chains, such as lysine, arginine, and histidine; amino acids having acidic side chains, such as aspartic acid and glutamic acid; and amino acids having nonpolar side chains, such as glycine, alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan, and cysteine.
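
The four side-chain classes listed above can be expressed as a simple lookup. The following Python sketch is offered only as an illustration: the names CLASSES, residue_class and is_conservative are hypothetical, and the one-letter codes are the standard abbreviations for the amino acids named in the paragraph. A substitution is treated as conservative when the two residues differ but fall in the same class.

```python
# Side-chain classes as listed in paragraph [0043], in one-letter code.
CLASSES = {
    "uncharged polar": set("NQSTY"),   # Asn, Gln, Ser, Thr, Tyr
    "basic": set("KRH"),               # Lys, Arg, His
    "acidic": set("DE"),               # Asp, Glu
    "nonpolar": set("GAVLIPFMWC"),     # Gly, Ala, Val, Leu, Ile, Pro, Phe, Met, Trp, Cys
}


def residue_class(aa):
    """Return the class name for a one-letter amino acid code."""
    aa = aa.upper()
    for name, members in CLASSES.items():
        if aa in members:
            return name
    raise ValueError(f"not one of the 20 standard amino acids: {aa!r}")


def is_conservative(original, replacement):
    """True when the residues differ but belong to the same class."""
    if original.upper() == replacement.upper():
        return False  # identical residues are not a substitution
    return residue_class(original) == residue_class(replacement)


if __name__ == "__main__":
    print(is_conservative("K", "R"))  # Lys -> Arg, both basic
    print(is_conservative("D", "K"))  # Asp -> Lys, acidic vs basic
```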


[0044] Homology is measured using sequence analysis software such as Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, Wis. 53705. Amino acid sequences are aligned to maximize identity. Gaps may be artificially introduced into the sequence to attain proper alignment. Once the optimal alignment has been set up, the degree of homology is established by recording all of the positions in which the amino acids of both sequences are identical, relative to the total number of positions.
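
The identity calculation described above (identical positions relative to the total number of positions) can be sketched as follows. This is an illustrative Python example, not the GCG software named in the text; it assumes the two sequences have already been aligned to equal length, that "-" marks an introduced gap, and that the total number of positions is the full alignment length. The function name percent_identity and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must be pre-aligned strings of equal length; gap columns count
    toward the total number of positions but never count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


if __name__ == "__main__":
    # Six columns, four identical (M, K, A, L): about 66.7 percent.
    print(round(percent_identity("MKT-AL", "MKSQAL"), 1))
```

Other conventions for the denominator exist (for example, the length of the shorter sequence), so a value computed this way is comparable only with values computed under the same convention.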


[0045] Homologous polynucleotide sequences are defined in a similar way. Preferably, a homologous sequence is one that is at least 45%, more preferably 60%, and most preferably 85% identical to any one of the coding sequences of FIG. 1.


[0046] Consistent with this aspect of the invention, polypeptides having a sequence homologous to any one of the amino acid sequences of FIG. 1 include naturally-occurring allelic variants, as well as mutants or any other non-naturally occurring variants that retain the inherent characteristics of any polypeptide of FIG. 1.


[0047] As is known in the art, an allelic variant is an alternate form of a polypeptide that is characterized as having a substitution, deletion, or addition of one or more amino acids that does not alter the biological function of the polypeptide. By “biological function” is meant the function of the polypeptide in the cells in which it naturally occurs. A polypeptide can have more than one biological function.


[0048] Also consistent with this aspect of the invention is a substantially purified polypeptide or polypeptide derivative having an amino acid sequence encoded by a polynucleotide of the invention. A “substantially purified polypeptide” as used herein is defined as a polypeptide that is separated from the environment in which it naturally occurs and/or that is free of the majority of the polypeptides that are present in the environment in which it was synthesized. For example, a substantially purified polypeptide is free from cellular polypeptides. Those skilled in the art would readily understand that the polypeptides of the invention may be purified from a natural source, i.e., a bacterial cell of the order Actinomycetales, or produced by recombinant means.


[0049] The nucleic acids of ORFs 1 to 49 can be isolated, optionally modified and inserted into a host cell to create and/or modify a metabolic (biosynthetic) pathway and thereby enable that host cell to synthesize and/or modify various metabolites.


[0050] Alternatively, the everninomicin gene cluster can be expressed in the host cell and the encoded everninomicin polypeptides recovered for use as chemical reagents, e.g. in the ex vivo synthesis and/or chemical modification of various metabolites. Either application typically entails insertion into the host cell of one or more nucleic acids encoding one or more isolated and/or modified everninomicin open reading frames. The nucleic acid(s) are typically in an expression vector, a construct containing control elements suitable to direct expression of the everninomicin polypeptides. The expressed everninomicin polypeptides in the host cell then act as components of a metabolic/biosynthetic pathway (in which case the synthetic product of the pathway is typically recovered) or the everninomicin polypeptides themselves are recovered. Using the sequence information provided herein, cloning and expression of everninomicin nucleic acids can be accomplished using routine and well-known methods.


[0051] The ORFs (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58) can be used to synthesize everninomicin antibiotics and/or analogues thereof. Alternatively, various components of the everninomicin gene cluster can be used to synthesize and/or chemically modify a wide variety of biomolecules/metabolites.


[0052] Polynucleotides encoding homologous polypeptides or allelic variants are retrieved by polymerase chain reaction (PCR) amplification of genomic bacterial DNA extracted by conventional methods. This involves the use of synthetic oligonucleotide primers matching sequences upstream and downstream of the 5′ and 3′ ends of the encoding domain. Suitable primers are designed according to the nucleotide sequence information provided in FIG. 1. The procedure is as follows: a primer is selected which consists of 10 to 40, preferably 15 to 25 nucleotides. It is advantageous to select primers containing C and G nucleotides in a proportion sufficient to ensure efficient hybridization; i.e., an amount of C and G nucleotides of at least 40%, preferably 50% of the total nucleotide content. A standard PCR reaction typically contains 0.5 to 5 Units of Taq DNA polymerase per 100 μL, 20 to 200 μM of each deoxynucleotide, preferably at equivalent concentrations, 0.5 to 2.5 mM magnesium over the total deoxynucleotide concentration, 10⁵ to 10⁶ target molecules, and about 20 pmol of each primer. About 25 to 50 PCR cycles are performed, with an annealing temperature 15° C. to 5° C. below the true Tm of the primers. A more stringent annealing temperature improves discrimination against incorrectly annealed primers and reduces incorporation of incorrect nucleotides at the 3′ end of primers. A denaturation temperature of 95° C. to 97° C. is typical, although higher temperatures may be appropriate for denaturation of G+C-rich targets. The number of cycles performed depends on the starting concentration of target molecules, although more than 40 cycles is typically not recommended as non-specific background products tend to accumulate.
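
The primer selection criteria stated above (10 to 40 nucleotides, preferably 15 to 25, and a G+C content of at least 40%) lend themselves to a simple automated check. The following Python sketch is illustrative only: the function name check_primer, its parameters and the example sequences are hypothetical and are not taken from FIG. 1.

```python
def check_primer(primer, min_len=10, max_len=40, min_gc=0.40):
    """Check a primer against the length and G+C criteria of paragraph [0052].

    Returns a dict with the G+C fraction and three boolean flags.
    """
    primer = primer.upper()
    if not primer or set(primer) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G and T")
    gc = (primer.count("G") + primer.count("C")) / len(primer)
    return {
        "gc_fraction": gc,
        "length_ok": min_len <= len(primer) <= max_len,
        "preferred_length": 15 <= len(primer) <= 25,
        "gc_ok": gc >= min_gc,
    }


if __name__ == "__main__":
    # Hypothetical 20-mer with 11 G/C bases (55% G+C).
    print(check_primer("ATGCGTACCGGTGACCTGAA"))
    # Hypothetical A/T-only 10-mer: acceptable length, insufficient G+C.
    print(check_primer("ATATATATAT"))
```

The check covers only the two criteria given in the text; it does not estimate the primer Tm or screen for secondary structure.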


[0053] An alternative method for retrieving polynucleotides encoding homologous polypeptides or allelic variants is by hybridization screening of a DNA or RNA library. Hybridization procedures are well-known in the art and are described in Ausubel et al., (Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons Inc., 1994), Silhavy et al. (Silhavy et al. Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, 1984), and Davis et al. (Davis et al. A Manual for Genetic Engineering: Advanced Bacterial Genetics, Cold Spring Harbor Laboratory Press, 1980). Important parameters for optimizing hybridization conditions are reflected in a formula used to obtain the critical melting temperature above which two complementary DNA strands separate from each other (Casey & Davidson, Nucl. Acid Res. (1977) 4:1539). For polynucleotides of about 600 nucleotides or larger, this formula is as follows: Tm=81.5+0.5×(% G+C)+1.6 log (positive ion concentration)−0.6×(% formamide). Under appropriate stringency conditions, hybridization temperature (Th) is approximately 20 to 40° C., 20 to 25° C., or, preferably 30 to 40° C. below the calculated Tm. Those skilled in the art will understand that optimal temperature and salt conditions can be readily determined.
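
The melting temperature formula and the hybridization temperature window given above can be evaluated numerically. The following Python sketch is an illustration under stated assumptions: the coefficients are copied verbatim from the formula as printed in paragraph [0053] and have not been checked against the cited reference, "log" is assumed to be the base-10 logarithm, the positive ion concentration is assumed to be in mol/L, and the function names are hypothetical.

```python
import math


def critical_tm(percent_gc, cation_molar, percent_formamide=0.0):
    """Tm in degrees C using the formula as printed in paragraph [0053].

    percent_gc and percent_formamide are percentages (0-100);
    cation_molar is the positive ion concentration (assumed mol/L).
    """
    if cation_molar <= 0:
        raise ValueError("positive ion concentration must be greater than zero")
    return (
        81.5
        + 0.5 * percent_gc
        + 1.6 * math.log10(cation_molar)  # base-10 log is an assumption
        - 0.6 * percent_formamide
    )


def hybridization_window(tm, low_offset=20.0, high_offset=40.0):
    """Return (lowest, highest) hybridization temperature, 20 to 40 degrees C below Tm."""
    return (tm - high_offset, tm - low_offset)


if __name__ == "__main__":
    # Hypothetical probe: 70% G+C, 1 M positive ions, 50% formamide.
    tm = critical_tm(70.0, 1.0, 50.0)
    print(tm, hybridization_window(tm))
```

Passing low_offset=30.0 restricts the window to the preferred range of 30 to 40° C. below the calculated Tm.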


[0054] For the polynucleotides of the invention, stringent conditions are achieved for both pre-hybridizing and hybridizing incubations (i) within 4-16 hours at 42° C., in 6×SSC containing 50% formamide, or (ii) within 4-16 hours at 65° C. in an aqueous 6×SSC solution (1 M NaCl, 0.1M sodium citrate (pH 7.0)).


[0055] The native everninomicin gene cluster ORFs can be re-ordered, modified and combined with other biosynthetic units to produce a wide variety of molecules. Large chemical libraries can be produced and screened for a desired activity.


[0056] Useful homologs and fragments thereof that do not occur naturally are designed using known methods for identifying regions of a polypeptide that are likely to tolerate amino acid sequence changes and/or deletions. As an example, homologous polypeptides from different species are compared; conserved sequences are identified. The more divergent sequences are the most likely to tolerate sequence changes. Homology among sequences may be analyzed using the BLAST homology searching algorithm of Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997).


[0057] Alternatively, identification of homologous polypeptides or polypeptide derivatives encoded by polynucleotides of the invention which have activity in the everninomicin biosynthetic pathway may be achieved by screening for cross-reactivity with an antibody raised against the polypeptide of reference having an amino acid sequence of FIG. 1. The procedure is as follows: an antibody is raised against a purified reference polypeptide, a fusion polypeptide (for example, an expression product of MBP, GST, or His-tag systems), or a synthetic peptide derived from the reference polypeptide. Where an antibody is raised against a fusion polypeptide, two different fusion systems are employed. Specific antigenicity can be determined according to a number of methods, including Western blot (Towbin et al., Proc. Natl. Acad. Sci. USA (1979) 76:4350), dot blot, and ELISA, as described below.


[0058] In a Western blot assay, the product to be screened, either as a purified preparation or a total E. coli extract, is submitted to SDS-PAGE electrophoresis as described by Laemmli (Nature (1970) 227:680). After transfer to a nitrocellulose membrane, the material is further incubated with the antibody diluted in the range of dilutions from about 1:5 to about 1:5000, preferably from about 1:100 to about 1:500. Specific antigenicity is shown once a band corresponding to the product exhibits reactivity at any of the dilutions in the above range.


[0059] In an ELISA assay, the product to be screened is preferably used as the coating antigen. A purified preparation is preferred, although a whole cell extract can also be used. Briefly, about 100 μl of a preparation at about 10 μg protein/ml are distributed into wells of a 96-well polycarbonate ELISA plate. The plate is incubated for 2 hours at 37° C. then overnight at 4° C. The plate is washed with phosphate-buffered saline (PBS) containing 0.05% Tween 20 (PBS/Tween buffer). The wells are saturated with 250 μl PBS containing 1% bovine serum albumin (BSA) to prevent non-specific antibody binding. After 1 hour incubation at 37° C., the plate is washed with PBS/Tween buffer. The antibody is serially diluted in PBS/Tween buffer containing 0.5% BSA. 100 μl of dilutions are added per well. The plate is incubated for 90 minutes at 37° C., washed and evaluated according to standard procedures. For example, a goat anti-rabbit peroxidase conjugate is added to the wells when specific antibodies were raised in rabbits. Incubation is carried out for 90 minutes at 37° C. and the plate is washed. The reaction is developed with the appropriate substrate and the reaction is measured by colorimetry (absorbance measured spectrophotometrically). Under the above experimental conditions, a positive reaction is shown by O.D. values greater than those of a non-immune control serum.


[0060] In a dot blot assay, a purified product is preferred, although a whole cell extract can also be used. Briefly, a solution of the product at about 100 μg/ml is serially two-fold diluted in 50 mM Tris-HCl (pH 7.5). 100 μl of each dilution are applied to a 0.45 μm nitrocellulose membrane set in a 96-well dot blot apparatus (Biorad). The buffer is removed by applying vacuum to the system. Wells are washed by addition of 50 mM Tris-HCl (pH 7.5) and the membrane is air-dried. The membrane is saturated in blocking buffer (50 mM Tris-HCl (pH 7.5), 0.15 M NaCl, 10 g/L skim milk) and incubated with an antibody dilution from about 1:50 to about 1:5000, preferably about 1:500. The reaction is revealed according to standard procedures. For example, a goat anti-rabbit peroxidase conjugate is added to the wells when rabbit antibodies are used. Incubation is carried out for 90 minutes at 37° C. and the blot is washed. The reaction is developed with the appropriate substrate and stopped. The reaction is measured visually by the appearance of a colored spot, e.g., by colorimetry. Under the above experimental conditions, a positive reaction is shown once a colored spot is associated with a dilution of at least about 1:5, preferably of at least about 1:500.


[0061] Another aspect of the invention provides a process for purifying a polypeptide or polypeptide derivative of the invention by affinity chromatography using as a ligand either an antibody or an orthosomycin-related compound which binds to the polypeptide. The antibody is either polyclonal or monoclonal. Purified IgGs are prepared from an antiserum using standard methods (see, e.g., Coligan et al., Current Protocols in Immunology (1994) John Wiley & Sons, Inc., New York, N.Y.). Conventional chromatography supports are described in, e.g., Antibodies: A Laboratory Manual, D. Lane, E. Harlow, Eds. (1988).


[0062] Consistent with this aspect of the invention, polypeptide derivatives are provided that are partial sequences of the amino acid sequences of FIG. 1, partial sequences of polypeptide sequences homologous to the amino acid sequences of FIG. 1, polypeptides derived from full-length polypeptides by internal deletion, and fusion proteins.


[0063] Polynucleotides of 30 to 600 nucleotides encoding partial sequences of sequences homologous to nucleotide sequences of FIG. 1 are retrieved by PCR amplification using the parameters outlined above and using primers matching the sequences upstream and downstream of the 5′ and 3′ ends of the fragment to be amplified. The template polynucleotide for such amplification is either the full length polynucleotide homologous to a polynucleotide sequence of FIG. 1, or a polynucleotide contained in a mixture of polynucleotides such as a DNA or RNA library. As an alternative method for retrieving the partial sequences, screening hybridization is carried out under conditions described above and using the formula for calculating Tm. Where fragments of 30 to 600 nucleotides are to be retrieved, the calculated Tm is corrected by subtracting (600/polynucleotide size in base pairs) and the stringency conditions are defined by a hybridization temperature that is 5 to 10° C. below Tm. Where oligonucleotides shorter than 20-30 bases are to be obtained, the formula for calculating the Tm is as follows: Tm=4×(G+C)+2×(A+T). For example, an 18 nucleotide fragment of 50% G+C would have an approximate Tm of 54° C. Short peptides that are fragments of the polypeptide sequences of FIG. 1, or of their homologous sequences, are obtained directly by chemical synthesis (E. Gross and H. J. Meienhofer, 4 The Peptides: Analysis, Synthesis, Biology; Modern Techniques of Peptide Synthesis, John Wiley & Sons (1981), and M. Bodanszky, Principles of Peptide Synthesis, Springer-Verlag (1984)).
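
As a worked illustration of the short-oligonucleotide formula Tm=4×(G+C)+2×(A+T) given above, the following Python sketch computes the approximate Tm of an 18-mer with 50% G+C content. The function name and the example sequence are hypothetical and chosen only to reproduce the 54° C. example.

```python
def short_oligo_tm(seq):
    """Approximate Tm in degrees C for a short oligonucleotide.

    Implements Tm = 4 x (G+C) + 2 x (A+T), where G+C and A+T are base counts.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at != len(seq):
        raise ValueError("sequence must contain only A, C, G and T")
    return 4 * gc + 2 * at


# Hypothetical 18-mer with 9 G/C bases and 9 A/T bases (50% G+C).
print(short_oligo_tm("ATGCATGCATGCATGCAG"))  # 54
```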


[0064] Polynucleotides encoding polypeptide fragments and polypeptides having large internal deletions are constructed using standard methods (Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons Inc., 1994). Such methods include standard PCR, inverse PCR, restriction enzyme treatment of cloned DNA molecules, or the method of Kunkel et al. (Kunkel et al., Proc. Natl. Acad. Sci. USA (1985) 82:448). Components for these methods and instructions for their use are readily available from various commercial sources such as Stratagene. Once the deletion mutants have been constructed, they are tested for their ability to improve production of everninomicin or generate novel analogues of the antibiotic or natural products of the orthosomycin class as described above.


[0065] As used herein, a fusion polypeptide is one that contains a polypeptide or a polypeptide derivative of the invention fused at the N- or C-terminal end to any other polypeptide (hereinafter referred to as a peptide tail). A simple way to obtain such a fusion polypeptide is by translation of an in-frame fusion of the polynucleotide sequences, i.e., a hybrid gene. The hybrid gene encoding the fusion polypeptide is inserted into an expression vector which is used to transform or transfect a host cell. Alternatively, the polynucleotide sequence encoding the polypeptide or polypeptide derivative is inserted into an expression vector in which the polynucleotide encoding the peptide tail is already present. Such vectors and instructions for their use are commercially available, e.g. the pMal-c2 or pMal-p2 system from New England Biolabs, in which the peptide tail is a maltose binding protein, the glutathione-S-transferase system of Pharmacia, or the His-Tag system available from Novagen. These and other expression systems provide convenient means for further purification of polypeptides and derivatives of the invention.
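
The notion of an in-frame fusion (a hybrid gene) described above can be illustrated with a minimal Python sketch. It joins a peptide-tail coding sequence to a target coding sequence and checks that the reading frame is preserved. The function name and the toy sequences are hypothetical; real constructs would also involve linker and restriction-site considerations that are not modeled here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(tail_orf, target_orf):
    """Join a peptide-tail ORF to a target ORF as an N-terminal fusion.

    A terminal stop codon on the tail is removed so that translation
    reads through into the target. Raises ValueError if either part is
    not a whole number of codons or if the hybrid gene contains a stop
    codon before its last codon.
    """
    tail = tail_orf.upper()
    target = target_orf.upper()
    if tail[-3:] in STOP_CODONS:
        tail = tail[:-3]
    if len(tail) % 3 or len(target) % 3:
        raise ValueError("each coding sequence must be a multiple of 3 nt")
    hybrid = tail + target
    codons = [hybrid[i:i + 3] for i in range(0, len(hybrid), 3)]
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("internal stop codon in hybrid gene")
    return hybrid


# Toy sequences for illustration only (not from FIG. 1).
print(fuse_in_frame("ATGAAATAA", "ATGGGCTGA"))  # ATGAAAATGGGCTGA
```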


[0066] Vectors, Transformed Cells, Primers and Probes:


[0067] A polynucleotide molecule according to the invention, including RNA, DNA, or modifications or combinations thereof, has various applications. A DNA molecule is used, for example, for producing a polypeptide of the invention in a recombinant host system. Another aspect of the invention encompasses (a) an expression cassette containing a DNA molecule of the invention placed under the control of the elements required for expression, in particular under the control of an appropriate promoter; (b) an expression vector containing an expression cassette of the invention; (c) a prokaryotic cell transformed with an expression cassette and/or vector of the invention, as well as (d) a process for producing a polypeptide or polypeptide derivative encoded by a polynucleotide of the invention, which involves culturing a prokaryotic cell transformed with an expression cassette and/or vector of the invention under conditions that allow expression of the DNA molecule of the invention, and recovering the encoded polypeptide or polypeptide derivative from the culture.


[0068] A recombinant expression system is selected from prokaryotic hosts. Bacterial cells are available to those skilled in the art from a number of different sources, including commercial sources, e.g., the American Type Culture Collection (ATCC; Rockville, Md.). Commercial sources of cells used for recombinant protein expression also provide instructions for usage of the cells.


[0069] The choice of the expression system depends on the features desired for the expressed polypeptide. For example, it may be useful to produce a polypeptide of the invention in a particular lipidated form or any other form.


[0070] One skilled in the art would readily understand that not all vectors and expression control sequences and hosts would be expected to express equally well the polynucleotides of this invention. With the guidelines described below, however, a selection of vectors, expression control sequences and hosts may be made without undue experimentation and without departing from the scope of this invention.


[0071] In selecting a vector, a host must be chosen that is compatible with the vector that is to exist, and possibly replicate, in it. Considerations are made with respect to the vector copy number, the ability to control the copy number, and the expression of other proteins such as those conferring antibiotic resistance. In selecting an expression control sequence, a number of variables are considered. Among the important variables are the relative strength of the sequence (e.g. the ability to drive expression under various conditions), the ability to control the sequence's function, and compatibility between the polynucleotide to be expressed and the control sequence (e.g. secondary structures are considered to avoid hairpin structures which prevent efficient transcription). In selecting the host, unicellular hosts are selected which are compatible with the selected vector, tolerant of any possible toxic effects of the expressed product, able to secrete the expressed product efficiently if such is desired, able to express the product in the desired conformation, easily scaled up, and amenable to easy purification of the final product, which may be the expressed polypeptide or the natural product, e.g. an antibiotic, which is a product of the biosynthetic pathway of which the expressed polypeptide is a part.


[0072] The choice of the expression cassette depends on the host system selected as well as the features desired for the expressed polypeptide or natural product. Typically, an expression cassette includes a promoter that is functional in the selected host system and can be constitutive or inducible; a ribosome binding site; a start codon (ATG) if necessary; optionally a region encoding a leader peptide; a DNA molecule of the invention; a stop codon; and optionally a 3′ terminal region (translation and/or transcription terminator). The leader peptide encoding region is adjacent to the polynucleotide of the invention and placed in proper reading frame. The leader peptide-encoding region, if present, is homologous or heterologous to the DNA molecule encoding the mature polypeptide and is compatible with the secretion apparatus of the host used for expression. The open reading frame constituted by the DNA molecule of the invention, solely or together with the leader peptide, is placed under the control of the promoter so that transcription and translation occur in the host system. Promoters and leader peptide encoding regions are widely known and available to those skilled in the art.
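
The ordering of the cassette elements listed above (promoter, ribosome binding site, start codon if necessary, optional leader region, DNA molecule of the invention, stop codon, optional 3′ terminal region) can be sketched as a small Python data structure. The class name, field names and placeholder sequences are hypothetical and purely illustrative; they do not represent any actual promoter or sequence of the invention.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class ExpressionCassette:
    promoter: str
    rbs: str
    coding_sequence: str
    start_codon: str = "ATG"
    leader: str = ""       # optional leader peptide-encoding region
    stop_codon: str = "TAA"
    terminator: str = ""   # optional 3' terminal region

    def assemble(self):
        """Concatenate the elements in 5' to 3' order."""
        orf = (self.leader + self.coding_sequence).upper()
        # Add a start codon only if the open reading frame lacks one.
        if not orf.startswith(self.start_codon):
            orf = self.start_codon + orf
        # Add a stop codon only if the last three bases are not one already
        # (a simplification: the reading frame itself is not verified).
        if orf[-3:] not in STOP_CODONS:
            orf += self.stop_codon
        return self.promoter.upper() + self.rbs.upper() + orf + self.terminator.upper()


# Placeholder sequences for illustration only.
cassette = ExpressionCassette(promoter="TTGACA", rbs="AGGAGG", coding_sequence="GGCAAA")
print(cassette.assemble())  # TTGACAAGGAGGATGGGCAAATAA
```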


[0073] The expression cassette is typically part of an expression vector, which is selected for its ability to replicate in the chosen expression system. Expression vectors (e.g., plasmids and cosmids) are widely known and are readily available to those skilled in the art. For bacterial vectors, the polynucleotide of the invention is inserted into the bacterial genome or remains in a free state as part of a plasmid. Methods for transforming host cells with expression vectors are well-known in the art.


[0074] The sequence information provided in the present application enables the design of specific nucleotide probes and primers that are used for identifying and isolating putative orthosomycin-producing microorganisms. Accordingly, an aspect of the invention provides a nucleotide probe or primer having a sequence found in or derived by degeneracy of the genetic code from a sequence shown in FIG. 1.


[0075] The term “probe” as used in the present application refers to DNA (preferably single stranded) or RNA molecules (or modifications or combinations thereof) that hybridize under the stringent conditions, as defined above, to nucleic acid molecules of FIG. 1 or to sequences homologous to those of FIG. 1, or to their complementary or anti-sense sequences. Generally, probes are significantly shorter than full-length sequences. Such probes contain from about 5 to about 100, preferably from about 10 to about 80, nucleotides. In particular, probes have sequences that are at least 75%, preferably at least 85%, more preferably 95% homologous to a portion of a sequence disclosed in FIG. 1 or that are complementary to such sequences. Probes may contain modified bases such as inosine, methyl-5-deoxycytidine, deoxyuridine, dimethylamino-5-deoxyuridine, or diamino-2,6-purine. Sugar or phosphate residues may also be modified or substituted. For example, a deoxyribose residue may be replaced by a polyamide (Nielsen et al., Science (1991) 254:1497) and phosphate residues may be replaced by ester groups such as diphosphate, alkyl, arylphosphonate and phosphorothioate esters. In addition, the 2′-hydroxyl group on ribonucleotides may be modified by including such groups as alkyl groups.
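
The percentage thresholds given above for probes (at least 75%, 85% or 95% to a portion of a reference sequence) can be illustrated with a simple Python sketch. As a simplifying assumption, homology is approximated here by ungapped percent identity against the best-matching window of the reference; the function names and the toy sequences are hypothetical.

```python
def best_window_identity(probe, reference):
    """Highest percent identity of probe against any ungapped window of reference."""
    probe, reference = probe.upper(), reference.upper()
    n = len(probe)
    if n == 0 or n > len(reference):
        raise ValueError("probe must be non-empty and no longer than reference")
    best = 0
    for start in range(len(reference) - n + 1):
        window = reference[start:start + n]
        matches = sum(a == b for a, b in zip(probe, window))
        best = max(best, matches)
    return 100.0 * best / n


def meets_threshold(probe, reference, threshold=75.0):
    """True if the probe reaches the given percent identity to some window."""
    return best_window_identity(probe, reference) >= threshold


# Toy sequences for illustration only (not from FIG. 1).
probe = "ACGTACGTAC"
reference = "TTTACGTACGAACGG"
print(best_window_identity(probe, reference))   # 90.0
print(meets_threshold(probe, reference, 95.0))  # False
```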


[0076] Probes of the invention are used for identifying and isolating putative orthosomycin-producing microorganisms, as capture or detection probes. Such capture probes are conventionally immobilized on a solid support, directly or indirectly, by covalent means or by passive adsorption. A detection probe is labeled by a detection marker selected from: radioactive isotopes, enzymes such as peroxidase, alkaline phosphatase, enzymes able to hydrolyze a chromogenic or fluorogenic or luminescent substrate, compounds that are chromogenic or fluorogenic or luminescent, nucleotide base analogs, and biotin.


[0077] Probes of the invention are used in any conventional hybridization technique, such as dot blot (Maniatis et al., Molecular Cloning: A Laboratory Manual (1982) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.), Southern blot (Southern, J. Mol. Biol. (1975) 98:503), northern blot (identical to Southern blot with the exception that RNA is used as a target), or the sandwich technique (Dunn et al., Cell (1977) 12:23). The latter technique involves the use of a specific capture probe and/or a specific detection probe with nucleotide sequences that at least partially differ from each other.


[0078] A primer is a probe of usually about 10 to about 40 nucleotides that is used to initiate enzymatic polymerization of DNA in an amplification process (e.g., PCR), in an elongation process, or in a reverse transcription method. Primers used in diagnostic methods involving PCR are labeled by methods known in the art.


[0079] As described herein, the invention also encompasses (i) a reagent comprising a probe of the invention for detecting and/or isolating putative orthosomycin-producing microorganisms; (ii) a method for detecting and/or isolating putative orthosomycin-producing microorganisms, in which DNA or RNA is extracted from the microorganism and denatured, and exposed to a probe of the invention, for example, a capture probe or detection probe or both, under stringent hybridization conditions, such that hybridization is detected; and (iii) a method for detecting and/or isolating putative orthosomycin-producing microorganisms, in which (a) a sample is recovered or derived from the microorganism, (b) DNA is extracted therefrom, (c) the extracted DNA is primed with at least one, and preferably two, primers of the invention and amplified by polymerase chain reaction, and (d) the amplified DNA fragment is produced.
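
The primer-based detection method of item (iii) above can be modeled in silico with the following Python sketch, which predicts the fragment that a forward and a reverse primer would amplify from a template. It assumes exact primer matches and a single binding site per primer, which is a simplification of real PCR; the function names and the toy sequences are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def in_silico_amplicon(template, forward, reverse):
    """Return the predicted amplicon, or None if a primer has no exact site.

    The forward primer is searched on the given strand; the reverse primer
    is searched as its reverse complement downstream of the forward primer.
    """
    template = template.upper()
    fwd = forward.upper()
    rev_site = reverse_complement(reverse)
    start = template.find(fwd)
    if start == -1:
        return None
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


# Toy template and primers for illustration only.
template = "GGATCCATGAAACCCGGGTTTTAAGAATTC"
print(in_silico_amplicon(template, "ATGAAA", "TTAAAA"))  # ATGAAACCCGGGTTTTAA
```

A return value of None in this sketch corresponds to the case in which no amplified fragment would be expected from the sample.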


[0080] It is understood that the embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.


Claims
  • 1. An isolated nucleic acid molecule comprising a nucleic acid sequence selected from any of: (a) a nucleic acid encoding any of everninomicin open reading frames (ORFs) 1 to 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58); (b) a nucleic acid encoding a polypeptide encoded by any of everninomicin open reading frames (ORFs) 1 to 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58); and (c) a nucleic acid encoding a polypeptide which is at least 75% identical in amino acid sequence to a polypeptide encoded by any of everninomicin open reading frames (ORFs) 1 to 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58).
  • 2. The isolated nucleic acid of claim 1, wherein said nucleic acid comprises a nucleic acid encoding at least two open reading frames (ORFs) selected from the group consisting of ORF 1 to ORF 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58).
  • 3. The isolated nucleic acid of claim 2, wherein said nucleic acid comprises a nucleic acid encoding at least three open reading frames (ORFs) selected from the group consisting of ORF 1 to ORF 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58).
  • 4. An isolated nucleic acid comprising a nucleic acid that hybridizes under stringent conditions to an open reading frame (ORF) of the everninomicin biosynthesis gene cluster and can substitute for the ORF to which it specifically hybridizes to direct the synthesis of an everninomicin.
  • 5. The isolated nucleic acid of claim 4, wherein the isolated nucleic acid specifically hybridizes under stringent conditions to a nucleic acid encoding a polypeptide selected from the group comprising of ORF 1, ORF 2, ORF 3, ORF 4, ORF 5, ORF 6, ORF 7, ORF 8, ORF 9, ORF 10, ORF 11, ORF 12, ORF 13, ORF 14, ORF 15, ORF 16, ORF 17, ORF 18, ORF 19, ORF 20, ORF 21, ORF 22, ORF 23 and ORF 24 (SEQ ID NOS: 2, 5 to 7, 9 to 21, and 23 to 29).
  • 6. The isolated nucleic acid of claim 4 wherein the nucleic acid specifically hybridizes under stringent conditions to a nucleic acid encoding a polypeptide selected from the group consisting of ORF 25, ORF 26, ORF 27, ORF 28, ORF 29, ORF 30, ORF 31, ORF 32, ORF 33, ORF 34, ORF 35, ORF 36, ORF 37, ORF 38, ORF 39, ORF 40, ORF 41, ORF 42, ORF 43, ORF 44, ORF 45, ORF 46, ORF 47, ORF 48 and ORF 49 (SEQ ID NOS 30 to 35, 37 to 46, 48 and 50 to 58).
  • 7. The isolated nucleic acid of claim 5 wherein the isolated nucleic acid encodes a polypeptide selected from the group consisting of ORF 1, ORF 2, ORF 3, ORF 4, ORF 5, ORF 6, ORF 7, ORF 8, ORF 9, ORF 10, ORF 11, ORF 12, ORF 13, ORF 14, ORF 15, ORF 16, ORF 17, ORF 18, ORF 19, ORF 20, ORF 21, ORF 22, ORF 23 and ORF 24 (SEQ ID NOS: 2, 5 to 7, 9 to 21, and 23 to 29).
  • 8. The isolated nucleic acid of claim 6 wherein the isolated nucleic acid encodes a polypeptide selected from the group consisting of ORF 25, ORF 26, ORF 27, ORF 28, ORF 29, ORF 30, ORF 31, ORF 32, ORF 33, ORF 34, ORF 35, ORF 36, ORF 37, ORF 38, ORF 39, ORF 40, ORF 41, ORF 42, ORF 43, ORF 44, ORF 45, ORF 46, ORF 47, ORF 48 and ORF 49 (SEQ ID NOS 30 to 35, 37 to 46, 48 and 50 to 58).
  • 9. An isolated gene cluster comprising open reading frames encoding polypeptides sufficient to direct the synthesis of an everninomicin or an everninomicin analogue.
  • 10. The isolated gene cluster of claim 9 wherein the gene cluster is present in a bacterium.
  • 11. The isolated gene cluster of claim 9 wherein the gene cluster is present in E. coli strains DH10B having accession nos. IDAC 240101-1, IDAC 240101-2 and IDAC 240101-3.
  • 12. An isolated polypeptide comprising a polypeptide sequence selected from any one of: a) a polypeptide of open reading frames 1 to 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58); and b) a polypeptide which is at least 75% identical in amino acid sequence to a polypeptide sequence of open reading frames (ORFs) 1 to 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58).
  • 13. The polypeptide of claim 12, wherein said polypeptide is a polypeptide containing at least two open reading frames selected from open reading frames (ORFs) 1 to 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58).
  • 14. The polypeptide of claim 12, wherein said polypeptide is a polypeptide containing at least three open reading frames selected from open reading frames (ORFs) 1 to 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58).
  • 15. The polypeptide of claim 12, wherein said polypeptide is a polypeptide containing at least three or more open reading frames selected from open reading frames 1 to 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58).
  • 16. An expression vector comprising a nucleic acid of claim 1.
  • 17. A host cell transformed with an expression vector of claim 16.
  • 18. The host cell of claim 17, wherein the cell is transformed with an exogenous nucleic acid comprising a gene cluster encoding polypeptides sufficient to direct the assembly of an everninomicin or an everninomicin analogue.
  • 19. A method of chemically modifying a biological molecule, said method comprising contacting a biological molecule that is a substrate for a polypeptide encoded by an everninomicin biosynthesis gene cluster open reading frame with a polypeptide encoded by an everninomicin biosynthesis gene cluster open reading frame whereby said polypeptide chemically modifies said biological molecule.
  • 20. The method of claim 19 wherein said method comprises contacting said biological molecule with at least two different polypeptides encoded by everninomicin biosynthesis gene cluster open reading frames 1 to 49 (SEQ ID NOS: 2, 5 to 7, 9 to 21, 23 to 35, 37 to 46, 48, and 50 to 58).
CROSS REFERENCE TO RELATED APPLICATION

[0001] This application claims benefit under 35 U.S.C. §119 of provisional application U.S. Ser. No. 60/177,170, filed on Jan. 27, 2000, which is herein incorporated by reference in its entirety for all purposes.

Provisional Applications (1)
Number Date Country
60177711 Jan 2000 US