The present disclosure relates to recombinant prokaryotic host cells and methods for producing an O-glycosylated protein.
Protein glycosylation is one of the most abundant and structurally complex posttranslational modifications (PTMs) (Khoury et al., “Proteome-Wide Post-Translational Modification Statistics: Frequency Analysis and Curation of the Swiss-Prot Database,” Sci. Rep. 1:90 (2011) and Walsh et al., “Protein Posttranslational Modifications: The Chemistry of Proteome Diversifications,” Angew. Chem. Int. Ed. Engl. 44:7342-7372 (2005)) and occurs in all domains of life (Abu-Qarn et al., “Not Just for Eukarya Anymore: Protein Glycosylation in Bacteria and Archaea,” Curr. Opin. Struct. Biol. 18:544-550 (2008)). Protein-linked glycans (mono-, oligo- or polysaccharide) play important roles in protein folding, solubility, stability, serum half-life, immunogenicity, and biological function (Varki, A., “Biological Roles of Glycans,” Glycobiology 27:3-49 (2017)). Glycan conjugation is also critical to the development of many biologics, with glycoproteins accounting for more than 70% of current protein-based drugs (Sethuraman & Stadheim, “Challenges in Therapeutic Glycoprotein Production,” Curr. Opin. Biotechnol. 17:341-346 (2006)) and glycoconjugate vaccines representing one of the safest and most successful vaccination approaches developed over the last 40 years (Rappuoli, R., “Glycoconjugate Vaccines: Principles and Mechanisms,” Sci. Transl. Med. 10:eaat4615 (2018)). The importance of glycosylation in both nature and the clinic has prompted widespread glycoengineering efforts that seek to: (i) create designer production platforms for controllable glycoprotein synthesis (Valderrama-Rincon et al., “An Engineered Eukaryotic Protein Glycosylation Pathway in Escherichia coli,” Nat. Chem. Biol. 8:434-436 (2012); Meuris et al., “GlycoDelete Engineering of Mammalian Cells Simplifies N-Glycosylation of Recombinant Proteins,” Nat. Biotechnol. 32:485-489 (2014); Hamilton et al., “Production of Complex Human Glycoproteins in Yeast,” Science 301:1244-1246 (2003); Jaroentomeechai et al., “Single-Pot Glycoprotein Biosynthesis Using a Cell-Free Transcription-Translation System Enriched with Glycosylation Machinery,” Nat. Commun. 9:2686 (2018); Kightlinger et al., “A Cell-Free Biosynthesis Platform for Modular Construction of Protein Glycosylation Pathways,” Nat. Commun. 10:5404 (2019); Feldman et al., “Engineering N-Linked Protein Glycosylation with Diverse O Antigen Lipopolysaccharide Structures in Escherichia coli,” Proc. Natl. Acad. Sci. USA 102:3016-3021 (2005); Tytgat et al., “Cytoplasmic Glycoengineering Enables Biosynthesis of Nanoscale Glycoprotein Assemblies,” Nat. Commun. 10:5403 (2019); Aumiller et al., “A Transgenic Insect Cell Line Engineered to Produce CMP-Sialic Acid and Sialylated Glycoproteins,” Glycobiology 13:497-507 (2003); Chang et al., “Small-Molecule Control of Antibody N-Glycosylation in Engineered Mammalian Cells,” Nat. Chem. Biol. 15:730-736 (2019); and Yang et al., “Engineering Mammalian Mucin-Type O-Glycosylation in Plants,” J. Biol. Chem. 287:11911-11923 (2012)); and (ii) rationally manipulate glycan structures and their attachment sites as a means to optimize the therapeutic and immunologic properties of proteins (Elliott et al., “Enhancement of Therapeutic Protein In Vivo Activities Through Glycoengineering,” Nat. Biotechnol. 21:414-421 (2003); Huang et al., “Chemoenzymatic Glycoengineering of Intact IgG Antibodies for Gain of Functions,” J. Am. Chem. Soc. 134:12308-12318 (2012); Broecker et al., “Multivalent Display of Minimal Clostridium difficile Glycan Epitopes Mimics Antigenic Properties of Larger Glycans,” Nat. Commun. 7:11224 (2016); Umana et al., “Engineered Glycoforms of an Antineuroblastoma IgG1 with Optimized Antibody-Dependent Cellular Cytotoxic Activity,” Nat. Biotechnol. 17:176-180 (1999); and Ilyushin et al., “Chemical Polysialylation of Human Recombinant Butyrylcholinesterase Delivers a Long-Acting Bioscavenger for Nerve Agents In Vivo,” Proc. Natl. Acad. Sci. USA 110:1243-1248 (2013)).
Genetically engineered eukaryotic expression hosts have provided extensive access to a chemically rich landscape of glycoproteins enabling efforts to generate defined glycoprotein epitopes and engineer proteins with advantageous properties (Meuris et al., “GlycoDelete Engineering of Mammalian Cells Simplifies N-Glycosylation of Recombinant Proteins,” Nat. Biotechnol. 32:485-489 (2014); Hamilton et al., “Production of Complex Human Glycoproteins in Yeast,” Science 301:1244-1246 (2003); Aumiller et al., “A Transgenic Insect Cell Line Engineered to Produce CMP-Sialic Acid and Sialylated Glycoproteins,” Glycobiology 13:497-507 (2003); Chang et al., “Small-Molecule Control of Antibody N-Glycosylation in Engineered Mammalian Cells,” Nat. Chem. Biol. 15:730-736 (2019); and Yang et al., “Engineering Mammalian Mucin-Type O-Glycosylation in Plants,” J. Biol. Chem. 287:11911-11923 (2012)). However, glycoengineering in eukaryotes is complicated by the fact that glycans are synthesized across several subcellular compartments by the coordinated activities of numerous glycosyltransferases (GTs) (Schwarz & Aebi, “Mechanisms and Principles of N-Linked Protein Glycosylation,” Curr. Opin. Struct. Biol. 21:576-582 (2011)) and that glycosylation is an essential process, with significant alteration of glycosylation pathways often leading to severe fitness defects (Choi et al., “Use of Combinatorial Genetic Libraries to Humanize N-Linked Glycosylation in the Yeast Pichia pastoris,” Proc. Natl. Acad. Sci. USA 100:5022-5027 (2003)). Glycoengineering in bacteria, on the other hand, is not constrained by these issues due to the non-essential nature of protein glycosylation in bacterial cells and thus has emerged as an attractive alternative that permits customizable glycan construction and protein glycosylation (Natarajan et al., “Metabolic Engineering of Glycoprotein Biosynthesis in Bacteria,” Emerg. Top Life Sci. 2: 419-432 (2018)). 
Moreover, some bacteria including laboratory strains of Escherichia coli lack endogenous glycosylation pathways, thereby providing a “clean” chassis for installation of orthogonal glycosylation pathways with little to no interference from endogenous GTs and the potential for more uniformly glycosylated protein products.
Over the last two decades, numerous efforts have collectively endowed E. coli and E. coli-derived cell-free extracts with the catalytic potential to produce diverse N-glycoproteins. Notably, this includes generation of structurally complex glycans, such as the eukaryotic Man3GlcNAc2 structure (Valderrama-Rincon et al., “An Engineered Eukaryotic Protein Glycosylation Pathway in Escherichia coli,” Nat. Chem. Biol. 8:434-436 (2012)), and their installation at authentic human glycosites (Ollis et al., “Engineered Oligosaccharyltransferases with Greatly Relaxed Acceptor-Site Specificity,” Nat. Chem. Biol. 10:816-822 (2014)). In contrast, the analogous construction of O-linked glycosylation pathways in bacteria has received relatively little attention. Two of the earliest examples involved reconstituting the initiating step of vertebrate mucin-type O-glycosylation in E. coli (Henderson et al., “Site-Specific Modification of Recombinant Proteins: A Novel Platform for Modifying Glycoproteins Expressed in E. coli,” Bioconjug. Chem. 22:903-912 (2011) and Mueller et al., “High level In Vivo Mucin-Type Glycosylation in Escherichia coli,” Microb. Cell Fact. 17:168 (2018)). Specifically, human polypeptide N-acetylgalactosaminyl-transferase 2 (GalNAcT2) was used to conjugate GalNAc onto threonine residues of peptides derived from different O-glycoproteins including human mucin 1 (MUC1) or an artificial rat-derived MUC10 in the cytoplasm of E. coli. Most recently, it was shown that the GalNAc installed by GalNAcT2 on threonine residues could be extended by a single galactose (Gal) residue using Campylobacter jejuni β1,3-galactosyltransferase CgtB, yielding acceptor proteins modified with Gal-β1,3-GalNAcα (T antigen or core 1) (Du et al., “A Bacterial Expression Platform for Production of Therapeutic Proteins Containing Human-Like O-Linked Glycans,” Cell Chem. Biol. 26:203-212 e205 (2019)). Bacterial protein O-glycosylation pathways have also been successfully reconstituted in E. coli; however, these systems are unlike the processive mechanism used by eukaryotes and instead operate according to an en bloc mechanism that is reminiscent of the canonical N-glycosylation process (Natarajan et al., “Metabolic Engineering of Glycoprotein Biosynthesis in Bacteria,” Emerg. Top Life Sci. 2: 419-432 (2018)).
The present application is directed to overcoming these and other deficiencies in the art.
Accordingly, a first aspect of the present disclosure relates to a recombinant prokaryotic host cell expressing one or more 4-epimerases, one or more glycosyl-1-phosphate transferases, and one or more O-oligosaccharyltransferases.
Another aspect of the present disclosure relates to a recombinant prokaryotic host cell expressing one or more 4-epimerases, one or more glycosyl-1-phosphate transferases, one or more O-oligosaccharyltransferases, and one or more β1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-acetylgalactosamine (GalNAc).
Another aspect of the present disclosure relates to a method for producing an O-glycosylated protein. This method involves providing a recombinant host cell expressing one or more 4-epimerases, one or more glycosyl-1-phosphate transferases, one or more O-oligosaccharyltransferases, and a glycoprotein target comprising one or more serine and/or threonine residues. This method further involves culturing the host cell under conditions effective to: (i) produce N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP); and (ii) transfer the N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP) en bloc to a serine or threonine amino acid of the glycoprotein target.
Another aspect of the present disclosure relates to a method for producing an O-glycosylated protein. This method involves providing a recombinant host cell expressing one or more 4-epimerases, one or more glycosyl-1-phosphate transferases, one or more O-oligosaccharyltransferases, one or more β1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-acetylgalactosamine (GalNAc), and a glycoprotein target comprising one or more serine and/or threonine residues. This method further involves culturing said host cell under conditions effective to: (i) produce N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP); (ii) extend Und-PP-GalNAc by a single galactose (Gal) monosaccharide to yield lipid-linked Gal-β1,3-GalNAc; and (iii) transfer the lipid-linked Gal-β1,3-GalNAc en bloc to a serine or threonine amino acid of the glycoprotein target.
Another aspect of the present disclosure relates to an in vitro method for producing an O-glycosylated protein. This method involves providing glycosylation reagents comprising one or more 4-epimerases, one or more glycosyl-1-phosphate transferases, and one or more O-oligosaccharyltransferases; providing a glycoprotein target comprising one or more serine and/or threonine residues; and incubating said glycosylation reagents and said glycoprotein target under conditions effective to: (i) yield N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP), and (ii) transfer the N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP) en bloc to a serine or threonine amino acid of the glycoprotein target.
Another aspect of the present disclosure relates to an in vitro method for producing an O-glycosylated protein. This method involves providing glycosylation reagents comprising one or more 4-epimerase enzymes, one or more heterologous diacetylbacillosaminyl-1-phosphate transferase enzymes, one or more heterologous O-oligosaccharyltransferases, and one or more β1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-acetylgalactosamine (GalNAc); providing a glycoprotein target comprising one or more serine and/or threonine residues; and incubating said glycosylation reagents and said glycoprotein target under conditions effective to: (i) yield lipid-linked Gal-β1,3-GalNAc, and (ii) transfer the lipid-linked Gal-β1,3-GalNAc and any additional sugars en bloc to a serine or threonine amino acid of the glycoprotein target.
Another aspect of the present disclosure relates to an in vitro method for producing an O-glycosylated protein. This method involves providing reagents suitable for synthesizing a glycoprotein target; providing glycosylation reagents comprising one or more 4-epimerases, one or more glycosyl-1-phosphate transferases, and one or more O-oligosaccharyltransferases; providing a nucleic acid molecule encoding a glycoprotein target; and incubating said reagents suitable for synthesizing a glycoprotein target, glycosylation reagents, and nucleic acid molecule encoding a glycoprotein target under conditions effective to: (i) synthesize the glycoprotein target encoded by the nucleic acid molecule encoding a glycoprotein target, (ii) yield N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP), and (iii) transfer the N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP) en bloc to a serine or threonine amino acid of the glycoprotein target.
Another aspect of the present disclosure relates to an in vitro method for producing an O-glycosylated protein. This method involves providing reagents suitable for synthesizing a glycoprotein target; providing glycosylation reagents comprising one or more 4-epimerase enzymes, one or more heterologous N,N′-diacetylbacillosaminyl-1-phosphate transferase enzymes, one or more heterologous O-oligosaccharyltransferases, and one or more β1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-acetylgalactosamine (GalNAc); providing a nucleic acid molecule encoding a glycoprotein target; and incubating said reagents suitable for synthesizing a glycoprotein target, glycosylation reagents, and nucleic acid molecule encoding a glycoprotein target under conditions effective to: (i) synthesize the glycoprotein target encoded by the nucleic acid molecule encoding a glycoprotein target, (ii) yield lipid-linked Gal-β1,3-GalNAc, and (iii) transfer the lipid-linked Gal-β1,3-GalNAc and any additional sugars en bloc to a serine or threonine amino acid of the glycoprotein target.
Another aspect of the present disclosure relates to a prokaryotic host cell expressing an α2,6-sialyltransferase and an α2,3-sialyltransferase, where the α2,6-sialyltransferase is the α2,6-sialyltransferase from Photobacterium sp. JT-ISH-224 (PspST6) and where the α2,3-sialyltransferase is the α2,3-sialyltransferase from E. coli O104 (EcWbwA).
Another aspect of the present disclosure relates to a prokaryotic host cell expressing one or more 4-epimerases, one or more glycosyl-1-phosphate transferases, one or more O-antigen ligases (e.g., EcWaaL), and, optionally, one or more β1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-acetylgalactosamine (GalNAc). In some embodiments according to this aspect of the invention, the prokaryotic host cell does not encode an O-oligosaccharyltransferase.
Another aspect of the present disclosure relates to a method for producing a lipid-linked Gal-β1,3-GalNAcα (T antigen or core 1). This method involves providing a recombinant host cell expressing one or more 4-epimerases, one or more glycosyl-1-phosphate transferases, one or more β1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-acetylgalactosamine (GalNAc), and one or more O-antigen ligases (e.g., EcWaaL). This method further involves culturing the host cell under conditions effective to: (i) produce Gal-β1,3-GalNAc linked to undecaprenyl pyrophosphate (Und-PP) and (ii) transfer Gal-β1,3-GalNAc linked to undecaprenyl pyrophosphate (Und-PP) en bloc to a lipid target.
The examples of the present disclosure demonstrate the implementation of a synthetic glycobiology approach to engineer E. coli with human-like O-glycosylation pathways based on the bacterial PglL/O paradigm. As proof-of-concept, a collection of orthogonal pathways for biosynthesis of proteins decorated with mucin-type O-glycans including Tn, T, sialyl-Tn (STn), and sialyl-T (ST) glycans was engineered. Each of these pathways involved cytoplasmic preassembly of desired O-glycan structures on Und-PP by a prescribed set of heterologous GTs expressed in E. coli cells metabolically engineered to produce required nucleotide sugar donors. The addition of heterologous O-OSTs enabled efficient site-directed O-glycosylation of acceptor sequences derived from different human glycoproteins.
Glycoengineered E. coli cells were also used to source crude cell extracts selectively enriched with O-glycosylation machinery, enabling a one-pot, cell-free reaction scheme for efficient and site-specific installation of O-glycans on target acceptor proteins. Overall, it is anticipated that the glycoengineered bacteria described herein will enable future efforts to produce structurally diverse O-glycoproteins for a variety of applications at the intersection of glycoscience, synthetic biology, and biomedicine.
Unless otherwise indicated, the definitions and embodiments described in this and other sections are intended to be applicable to all embodiments and aspects of the present application herein described for which they are suitable as would be understood by a person skilled in the art.
As used herein, the singular forms “a,” “an,” and “the” and the like include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a compound” includes both a single compound and a plurality of different compounds.
Terms of degree such as “substantially”, “about”, and “approximately” as used herein mean a reasonable amount of deviation of the modified term such that the end result is not significantly changed. These terms of degree should be construed as including a deviation of at least ±1% (and up to ±5% or ±10%) of the modified term if this deviation would not negate the meaning of the word it modifies.
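As a purely illustrative reading of the deviation bands recited above, the following minimal sketch checks whether a measured value falls within a chosen fractional deviation of a stated value. The function name `within_about` and its default tolerance of ±10% are hypothetical choices made for this illustration and are not part of the disclosure; whether a given deviation negates the meaning of the modified term remains a contextual judgment that no such check captures.

```python
def within_about(value: float, target: float, tolerance: float = 0.10) -> bool:
    """Return True when `value` lies within +/- `tolerance` of `target`.

    `tolerance` is a fraction of the target (0.01, 0.05, and 0.10 correspond
    to the 1%, 5%, and 10% deviations mentioned in the text). The name and
    the default are illustrative assumptions only.
    """
    return abs(value - target) <= tolerance * abs(target)


# Example: "about 100" read with a 5% band versus the 10% band.
print(within_about(104, 100, tolerance=0.05))  # inside the 5% band
print(within_about(106, 100, tolerance=0.05))  # outside the 5% band
print(within_about(109, 100))                  # inside the default 10% band
```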
The term “and/or” as used herein means that the listed items are present, or used, individually or in combination. In effect, this term means that “at least one of” or “one or more” of the listed items is used or present.
As will be understood by one skilled in the art, for any and all purposes, such as in terms of providing a written description, all ranges disclosed herein also encompass any and all possible subranges and combinations of subranges thereof. Any listed range can be easily recognized as sufficiently describing and enabling the same range being broken down into at least equal halves, thirds, quarters, fifths, tenths, and so on. As a non-limiting example, each range discussed herein can be readily broken down into a lower third, middle third, and upper third, and so on. As will also be understood by one skilled in the art, all language such as “up to,” “at least,” and the like includes the number recited and refers to ranges which can be subsequently broken down into subranges as discussed above. Finally, as will be understood by one skilled in the art, a range includes each individual member.
In understanding the scope of the present application, the term “comprising” and its derivatives, as used herein, are intended to be open ended terms that specify the presence of the stated features, elements, components, groups, integers, and/or steps, but do not exclude the presence of other unstated features, elements, components, groups, integers and/or steps. The foregoing also applies to words having similar meanings such as the terms, “including”, “involving”, “having”, and their derivatives. The term “consisting” and its derivatives, as used herein, are intended to be closed terms that specify the presence of the stated features, elements, components, groups, integers, and/or steps, but exclude the presence of other unstated features, elements, components, groups, integers and/or steps. The term “consisting essentially of”, as used herein, is intended to specify the presence of the stated features, elements, components, groups, integers, and/or steps as well as those that do not materially affect the basic and novel characteristic(s) of features, elements, components, groups, integers, and/or steps. In embodiments or claims where the term comprising (or the like) is used as the transition phrase, such embodiments can also be envisioned with replacement of the term “comprising” with the terms “consisting of” or “consisting essentially of.” The methods, kits, systems, and/or compositions of the present disclosure can comprise, consist essentially of, or consist of, the components disclosed.
In embodiments comprising an “additional” or “second” component, the second component as used herein is different from the other components or first component. A “third” component is different from the other, first, and second components, and further enumerated or “additional” components are similarly different.
Certain terms employed in the specification, examples and claims are collected herein. Unless defined otherwise, all technical and scientific terms used in this disclosure have the same meanings as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
Preferences and options for a given aspect, feature, embodiment, or parameter of the invention should, unless the context indicates otherwise, be regarded as having been disclosed in combination with any and all preferences and options for all other aspects, features, embodiments, and parameters of the invention.
As used herein, amino acid residues will be indicated either by their full name or according to the standard three-letter or one-letter amino acid code.
As used herein, the terms “polypeptide” and “protein” are used interchangeably, and refer to a polymeric form of amino acids of any length, which can include coded and non-coded amino acids, chemically or biochemically modified or derivatized amino acids, and polypeptides having modified peptide backbones. A “peptide” is also a polymer of amino acids, usually with a length of up to 50 amino acids. A polypeptide or peptide is represented by an amino acid sequence.
As used herein, the terms “nucleic acid molecule”, “polynucleotide”, “polynucleic acid”, and “nucleic acid” are used interchangeably and refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. A nucleic acid molecule is represented by a nucleic acid sequence, which is primarily characterized by its base sequence. Polynucleotides may have any three-dimensional structure, and may perform any function, known or unknown. Non-limiting examples of polynucleotides include a gene, a gene fragment, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, control regions, isolated RNA of any sequence, nucleic acid probes, and primers. The nucleic acid molecule may be linear or circular.
As used herein, the term “homology” denotes at least secondary structural identity or similarity between two macromolecules, particularly between two polypeptides or polynucleotides, from the same or different taxa, wherein said similarity is due to shared ancestry. Hence, the term “homologues” denotes so-related macromolecules having said secondary and optionally tertiary structural similarity. For comparing two or more nucleotide sequences, the “(percentage of) sequence identity” between a first nucleotide sequence and a second nucleotide sequence may be calculated using methods known by the person skilled in the art (e.g., by dividing the number of nucleotides in the first nucleotide sequence that are identical to the nucleotides at the corresponding positions in the second nucleotide sequence by the total number of nucleotides in the first nucleotide sequence and multiplying by 100%, or by using a known computer algorithm for sequence alignment such as NCBI Blast). In determining the degree of sequence similarity between two amino acid sequences, the skilled person may take into account so-called “conservative” amino acid substitutions, which can generally be described as amino acid substitutions in which an amino acid residue is replaced with another amino acid residue of similar chemical structure and which has little or essentially no influence on the function, activity or other biological properties of the polypeptide. Possible conservative amino acid substitutions have been already exemplified herein. Amino acid sequences and nucleic acid sequences are said to be “exactly the same” if they have 100% sequence identity over their entire length.
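By way of a non-limiting illustration, the position-by-position calculation of percentage sequence identity described above can be sketched as follows. The function name `percent_identity` is hypothetical, and the sketch assumes that the two sequences have already been placed in register so that corresponding positions share an index; it performs no alignment and no gap handling, for which an alignment program would be used instead.

```python
def percent_identity(first: str, second: str) -> float:
    """Percentage of positions in `first` that are identical in `second`.

    Assumes the sequences are already in register (index i of `first`
    corresponds to index i of `second`). The denominator is the total
    length of the first sequence, as in the description above, so any
    position of `first` with no counterpart in `second` counts as
    non-identical.
    """
    if not first:
        raise ValueError("first sequence must not be empty")
    identical = sum(
        1 for a, b in zip(first.upper(), second.upper()) if a == b
    )
    return 100.0 * identical / len(first)


# Example: one mismatch in ten positions.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

Because the denominator is the length of the first sequence, the value is not necessarily symmetric when the two sequences differ in length.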
Throughout this disclosure, each time one refers to a specific amino acid sequence SEQ ID NO (take SEQ ID NO: Y as example), one may replace it by: a polypeptide comprising an amino acid sequence that has at least 80% sequence identity or similarity with amino acid sequence SEQ ID NO: Y. Throughout this application, the wording “a sequence is at least X % identical with another sequence” may be replaced by “a sequence has at least X % sequence identity with another sequence”.
Each amino acid sequence described herein by virtue of its identity percentage (at least 80%) with a given amino acid sequence has, in a further preferred embodiment, at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identity with the given amino acid sequence. In some embodiments, sequence identity is determined by comparing the whole length of the sequences as identified herein. Each amino acid sequence described herein by virtue of its similarity percentage (at least 80%) with a given amino acid sequence has, in a further embodiment, at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more similarity with the given amino acid sequence. In some embodiments, sequence similarity is determined by comparing the whole length of the sequences as identified herein. Unless otherwise indicated herein, identity or similarity with a given SEQ ID NO means identity or similarity based on the full length of said sequence (i.e., over its whole length or as a whole).
“Sequence identity” is herein defined as a relationship between two or more amino acid (polypeptide or protein) sequences or two or more nucleic acid (polynucleotide) sequences, as determined by comparing the sequences. The identity between two amino acid sequences is preferably defined by assessing their identity within a whole SEQ ID NO as identified herein or part thereof. Part thereof may mean at least 50% of the length of the SEQ ID NO, or at least 60%, or at least 70%, or at least 80%, or at least 90%.
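For illustration only, the combination of an identity threshold with a minimum compared length (“part thereof”) can be sketched as below. The function name, its parameters, and the choice to compute identity over the compared region are assumptions made for this sketch; the defaults of 80% identity and 50% of the length of the SEQ ID NO are simply the lowest values recited herein.

```python
def meets_identity_criterion(
    identical: int,
    compared_length: int,
    reference_length: int,
    min_identity: float = 80.0,
    min_coverage: float = 50.0,
) -> bool:
    """Check an identity threshold over a sufficiently long compared region.

    identical        -- number of identical positions in the compared region
    compared_length  -- length of the region of the SEQ ID NO that was compared
    reference_length -- full length of the SEQ ID NO
    Identity is computed over the compared region (an assumption of this
    sketch); coverage is the compared region as a percentage of the full
    reference length.
    """
    if compared_length <= 0 or reference_length <= 0:
        raise ValueError("lengths must be positive")
    coverage = 100.0 * compared_length / reference_length
    identity = 100.0 * identical / compared_length
    return coverage >= min_coverage and identity >= min_identity


# 85 identical positions over a 100-residue region of a 120-residue reference.
print(meets_identity_criterion(85, 100, 120))
# High identity, but the compared region covers under half of the reference.
print(meets_identity_criterion(45, 50, 120))
```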
In the art, “identity” also means the degree of sequence relatedness between amino acid sequences, as the case may be, as determined by the match between strings of such sequences. “Similarity” between two amino acid sequences is determined by comparing the amino acid sequence and its conserved amino acid substitutes of one polypeptide to the sequence of a second polypeptide. “Identity” and “similarity” can be readily calculated by known methods, including but not limited to those described in Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M., and Devereux, J., eds., M Stockton Press, New York, 1991; and Carillo, H., and Lipman, D., SIAM J. Applied Math., 48:1073 (1988).
Preferred methods to determine identity are designed to give the largest match between the sequences tested. Methods to determine identity and similarity are codified in publicly available computer programs. Preferred computer program methods to determine identity and similarity between two sequences include, e.g., the GCG program package (Devereux, J., et al., Nucleic Acids Research 12 (1): 387 (1984)), BestFit, FASTA, BLASTN, and BLASTP (Altschul, S. F. et al., J. Mol. Biol. 215:403-410 (1990)), and EMBOSS Needle (Madeira, F., et al., Nucleic Acids Research 47(W1): W636-W641 (2019)). The BLAST program is publicly available from NCBI and other sources (BLAST Manual, Altschul, S., et al., NCBI NLM NIH Bethesda, Md. 20894; Altschul, S., et al., J. Mol. Biol. 215:403-410 (1990)). The EMBOSS program is publicly available from EMBL-EBI. The well-known Smith Waterman algorithm may also be used to determine identity. The EMBOSS Needle program is the preferred program used.
Preferred parameters for polypeptide sequence comparison include the following: Algorithm: Needleman and Wunsch, J. Mol. Biol. 48 (3):443-453 (1970); Comparison matrix: BLOSUM62 from Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA. 89:10915-10919 (1992); Gap Open Penalty: 10; and Gap Extend Penalty: 0.5. A program useful with these parameters is publicly available as the EMBOSS Needle program from EMBL-EBI. The aforementioned parameters are the default parameters for a Global Pairwise Sequence alignment of proteins (along with no penalty for end gaps).
Preferred parameters for nucleic acid comparison include the following: Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970); Comparison matrix: DNAfull; Gap Open Penalty: 10; and Gap Extend Penalty: 0.5. A program useful with these parameters is publicly available as the EMBOSS Needle program from EMBL-EBI. The aforementioned parameters are the default parameters for a Global Pairwise Sequence alignment of nucleotide sequences (along with no penalty for end gaps).
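To illustrate how the recited gap penalties enter a global pairwise alignment, the following minimal sketch scores two nucleotide sequences with a Needleman-Wunsch recurrence extended with affine gap penalties. It is not the EMBOSS Needle implementation and is not expected to reproduce its output exactly. The assumptions of the sketch are: a flat score of +5 for a match and -4 for a mismatch stands in for the unambiguous A/C/G/T entries of the DNAfull matrix; a gap of length k costs gap_open + (k - 1) × gap_extend; and unpenalized end gaps are modeled by zero-cost leading gaps and by taking the best score along the last row and column. The function name `global_alignment_score` is hypothetical.

```python
NEG_INF = float("-inf")


def global_alignment_score(a, b, match=5.0, mismatch=-4.0,
                           gap_open=10.0, gap_extend=0.5,
                           free_end_gaps=True):
    """Score of the best global alignment of `a` and `b` with affine gaps.

    Three tables are filled (assumed conventions of this sketch):
      M[i][j] -- best score with a[i-1] aligned to b[j-1]
      X[i][j] -- best score with a[i-1] opposite a gap
      Y[i][j] -- best score with b[j-1] opposite a gap
    """
    n, m = len(a), len(b)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    # Leading gaps: free when end gaps are unpenalized, affine otherwise.
    for i in range(1, n + 1):
        X[i][0] = 0.0 if free_end_gaps else -(gap_open + (i - 1) * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = 0.0 if free_end_gaps else -(gap_open + (j - 1) * gap_extend)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Opening a gap costs gap_open; continuing one costs gap_extend.
            X[i][j] = max(M[i - 1][j] - gap_open,
                          Y[i - 1][j] - gap_open,
                          X[i - 1][j] - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open,
                          X[i][j - 1] - gap_open,
                          Y[i][j - 1] - gap_extend)

    def best(i, j):
        return max(M[i][j], X[i][j], Y[i][j])

    if not free_end_gaps:
        return best(n, m)
    # Trailing gaps are free: one sequence may end before the other.
    return max(max(best(i, m) for i in range(n + 1)),
               max(best(n, j) for j in range(m + 1)))


# Eight matches (8 x 5) and one internal two-base gap (10 + 0.5).
print(global_alignment_score("ACGTTTACGT", "ACGTACGT"))
```

Setting `free_end_gaps=False` penalizes overhangs like any other gap, which makes the effect of the end-gap setting visible when one sequence is much shorter than the other.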
Also provided herein are embodiments wherein any embodiment described herein may be combined with any one or more other embodiments, provided the combination is not mutually exclusive.
As described herein above, glycoengineering in eukaryotes is complicated by the fact that glycans are synthesized across several subcellular compartments and that glycosylation is an essential process, with significant alteration of glycosylation pathways often leading to severe fitness defects (Choi et al., “Use of Combinatorial Genetic Libraries to Humanize N-Linked Glycosylation in the Yeast Pichia pastoris,” Proc. Natl. Acad. Sci. USA 100:5022-5027 (2003), which is hereby incorporated by reference in its entirety). Glycoengineering in bacteria, on the other hand, is not constrained by these issues due to the non-essential nature of protein glycosylation in bacterial cells.
The present disclosure provides recombinant prokaryotic host cells, as well as lysates of such recombinant prokaryotic host cells and related kits, devices, compositions, systems, and methods for producing O-glycosylated proteins. Specifically, the present disclosure provides for the development of a low-cost strategy for efficient production of O-linked glycoproteins in prokaryotic host cells (or using lysates thereof). In some embodiments, the recombinant prokaryotic host cells of the present disclosure have been genetically engineered with one or more genes encoding a novel O-glycosylation pathway that is capable of efficiently glycosylating target proteins at specific acceptor sites (e.g., O-linked glycosylation). Using these engineered recombinant prokaryotic host cells, virtually any recombinant protein-of-interest can be expressed and glycosylated.
A first aspect of the present disclosure relates to a recombinant prokaryotic host cell expressing one or more 4-epimerases, one or more glycosyl-1-phosphate transferases, and one or more O-oligosaccharyltransferases.
Another aspect of the present disclosure relates to a recombinant prokaryotic host cell expressing one or more 4-epimerases, one or more glycosyl-1-phosphate transferases, one or more O-oligosaccharyltransferases, and one or more β1,3-galactosyltransferase enzymes. In some embodiments, the one or more β1,3-galactosyltransferase enzymes are capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc).
Another aspect of the present disclosure relates to a prokaryotic host cell expressing an α2,6-sialyltransferase and an α2,3-sialyltransferase, where the α2,6-sialyltransferase is the α2,6-sialyltransferase from Photobacterium sp. JT-ISH-224 (PspST6) and where the α2,3-sialyltransferase is the α2,3-sialyltransferase from E. coli O104 (EcWbwA).
Another aspect of the present disclosure relates to a prokaryotic host cell expressing one or more 4-epimerases, one or more glycosyl-1-phosphate transferases, one or more O-antigen ligases (e.g., EcWaaL), and optionally one or more β1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc). In some embodiments according to this aspect of the invention, the prokaryotic host cell does not encode an O-oligosaccharyltransferase.
Recombinant prokaryotic cells according to the present disclosure serve as a host for expression of recombinant proteins for production of O-glycosylated proteins of interest. Suitable host cells include, without limitation, E. coli and other Enterobacteriaceae, Escherichia sp., Campylobacter sp., Wolinella sp., Desulfovibrio sp., Vibrio sp., Pseudomonas sp., Bacillus sp., Listeria sp., Staphylococcus sp., Streptococcus sp., Peptostreptococcus sp., Megasphaera sp., Pectinatus sp., Selenomonas sp., Zymophilus sp., Actinomyces sp., Arthrobacter sp., Frankia sp., Micromonospora sp., Nocardia sp., Propionibacterium sp., Streptomyces sp., Lactobacillus sp., Lactococcus sp., Leuconostoc sp., Pediococcus sp., Acetobacterium sp., Eubacterium sp., Heliobacterium sp., Heliospirillum sp., Sporomusa sp., Spiroplasma sp., Mycoplasma sp., Erysipelothrix sp., Corynebacterium sp., Enterococcus sp., Clostridium sp., Mycoplasma sp., Mycobacterium sp., Actinobacteria sp., Salmonella sp., Shigella sp., Moraxella sp., Helicobacter sp., Stenotrophomonas sp., Micrococcus sp., Neisseria sp., Bdellovibrio sp., Hemophilus sp., Klebsiella sp., Proteus mirabilis, Enterobacter cloacae, Serratia sp., Citrobacter sp., Proteus sp., Serratia sp., Yersinia sp., Acinetobacter sp., Actinobacillus sp., Bordetella sp., Brucella sp., Capnocytophaga sp., Cardiobacterium sp., Eikenella sp., Francisella sp., Haemophilus sp., Kingella sp., Pasteurella sp., Flavobacterium sp., Xanthomonas sp., Burkholderia sp., Aeromonas sp., Plesiomonas sp., Legionella sp., Rhizobium sp., and Azotobacter sp. (e.g., A. vinelandii).
One major advantage of E. coli as a prokaryotic host cell for O-glycoprotein expression is that, unlike yeast and all other eukaryotes, there are no native glycosylation systems. Thus, the addition (or subsequent removal) of glycosylation-related genes should have little to no bearing on the viability of glyco-engineered E. coli cells. Furthermore, the potential for non-human glycan attachment to target proteins by endogenous glycosylation reactions is eliminated in these cells. Accordingly, in various embodiments, a prokaryotic host cell (or lysate thereof) is used to produce O-linked glycoproteins, which provides an attractive solution for circumventing the significant hurdles associated with eukaryotic cell culture. Suitable E. coli host cells according to the present disclosure include, without limitation, laboratory strains of E. coli selected from the group consisting of DH5α, NEB 10-beta, BL21(DE3), W3110, CLM24, CLM25, MC4100, MCΔw, MCΔΔw, MCΔΔw-neuo-
In some embodiments of the present disclosure, the one or more 4-epimerases, the one or more glycosyl-1-phosphate transferases, the one or more O-oligosaccharyltransferases, and/or the one or more β1,3-galactosyltransferase enzymes are orthogonal and/or heterologous to the prokaryotic host cell. The term “orthogonal” refers to a molecule (e.g., an enzyme) that functions with endogenous components of a host cell with reduced efficiency as compared to a corresponding molecule that is endogenous to the host cell, or that fails to function with endogenous components of the cell. In some embodiments, the orthogonal enzyme lacks a functionally normal endogenous complementary enzyme in the host cell. A second orthogonal enzyme can be introduced into the cell that functions with the first orthogonal enzyme. For example, an orthogonal O-glycosylation pathway may include introduced complementary components that function together in the host cell (e.g., one or more 4-epimerases, the one or more glycosyl-1-phosphate transferases, the one or more O-oligosaccharyltransferases, and the one or more β1,3-galactosyltransferase enzymes).
The term “heterologous” refers to a molecule (e.g., an enzyme or a nucleic acid sequence encoding an enzyme) not normally found in the host organism. The term “heterologous” also includes a nucleic acid molecule comprising a native coding region, or portion thereof, that is reintroduced into the host organism in a form that is different from the corresponding native gene (e.g., not in its natural location in the host cell genome or in a codon-optimized format). Thus, in some embodiments, the one or more 4-epimerases, the one or more glycosyl-1-phosphate transferases, the one or more O-oligosaccharyltransferases, and/or the one or more β1,3-galactosyltransferase enzymes are heterologous to the prokaryotic host cell.
In some embodiments, the 4-epimerase is a uridine diphosphate-N-acetylglucosamine (UDP-GlcNAc) 4-epimerase. As used herein, the term “UDP-GlcNAc 4-epimerase” refers to an enzyme that catalyzes the epimerization of the hydroxyl group at position C-4 of UDP-GlcNAc (uridine diphosphate-N-acetylglucosamine) to generate uridine diphosphate-N-acetylgalactosamine (UDP-GalNAc) (see, e.g., Bernatchez et al., “A Single Bifunctional UDP-GlcNAc/Glc 4-Epimerase Supports the Synthesis of Three Cell Surface Glycoconjugates in Campylobacter jejuni,” J. Biol. Chem. 280(6):4792-4802 (2005), which is hereby incorporated by reference in its entirety). Thus, in some embodiments, the one or more 4-epimerases comprises a UDP-GlcNAc 4-epimerase (Gne). Suitable UDP-GlcNAc 4-epimerases include, without limitation, C. jejuni Gne (CjGne), Salmonella enterica O30 Gne (SeGne), Shigella boydii O18 Gne (SbGne), E. coli O55 Gne (EcGne), E. coli O86 Gne (EcGne), and E. coli O86 Gne2 (EcGne2). In other embodiments, the one or more 4-epimerases belongs to the KEGG orthology ID KO01784 (see, e.g., Maceratesi et al., “Human UDP-Galactose 4′ Epimerase (GALE) Gene and Identification of Five Missense Mutations in Patients with Epimerase-Deficiency Galactosemia,” Mol. Genet. Metab. 63:26-30 (1998) and Majumdar et al., “UDPgalactose 4-Epimerase from Saccharomyces cerevisiae. A Bifunctional Enzyme with Aldose 1-Epimerase activity,” Eur. J. Biochem. 271:753-759 (2004), which are hereby incorporated by reference in their entirety). Suitable 4-epimerases belonging to KEGG orthology ID KO01784 include, without limitation, UDP-galactose 4-epimerase (GALE) enzymes.
C. jejuni Gne (CjGne) has the amino acid sequence of SEQ ID NO: 1 below.
Additional exemplary UDP-GlcNAc 4-epimerases include, without limitation, Pseudomonas aeruginosa WbpP and Plesiomonas shigelloides WbgU (see, e.g., Ishiyama et al., “Crystal Structure of WbpP, a Genuine UDP-N-Acetylglucosamine 4-Epimerase from Pseudomonas aeruginosa: Substrate Specificity in UDP-Hexose 4-Epimerases,” J. Biol. Chem. 279(21):22635-22642 (2004) and Kowal & Wang, “New UDP-GlcNAc C4 Epimerase Involved in the Biosynthesis of 2-Acetamino-2-deoxy-L-Altruronic Acid in the O-Antigen Repeating Units of Plesiomonas shigelloides O17,” Biochemistry 41(51):15410-15414 (2002), which are hereby incorporated by reference in their entirety).
As used herein, the term “glycosyl-1-phosphate transferase” refers to an enzyme that transfers a phosphate-monosaccharide from a nucleotide diphosphate-monosaccharide to a polyprenol phosphate to generate a polyprenol diphosphate-linked monosaccharide.
In some embodiments, the glycosyl-1-phosphate transferase transfers N-acetylgalactosamine (GalNAc) from UDP-GalNAc to undecaprenol phosphate (Und-P) to form undecaprenol pyrophosphate (Und-PP)-linked GalNAc. Suitable glycosyl-1-phosphate transferases include, without limitation, PglC. In some embodiments, the PglC is an Acinetobacter baumannii PglC (AbPglC).
Suitable Acinetobacter baumannii PglC (AbPglC) enzymes may be selected from the group consisting of Acinetobacter baumannii strain ATCC 17978 PglC, Acinetobacter baumannii strain NIPH190, Acinetobacter baumannii strain D46, Acinetobacter baumannii strain LUH5541, Acinetobacter baumannii LUH5546, Acinetobacter baumannii RBH4, Acinetobacter baumannii strain A74, Acinetobacter baumannii strain ACICU, Acinetobacter baumannii strain LUH5533, Acinetobacter baumannii strain LUH5550, Acinetobacter baumannii strain NIPH70, and Acinetobacter baumannii strain 4190 (Harding et al., “Distinct Amino Acid Residues Confer One of Three UDP-Sugar Substrate Specificities in Acinetobacter baumannii PglC Phosphoglycosyltransferases,” Glycobiology 28(7):522-533 (2018), which is hereby incorporated by reference in its entirety).
Acinetobacter baumannii strain ATCC 17978 PglC has the amino acid sequence of SEQ ID NO: 2 below.
As used herein, the term “β1,3-galactosyltransferase” refers to an enzyme that transfers galactose (Gal) to a polyprenol diphosphate-linked monosaccharide (e.g., undecaprenol pyrophosphate (Und-PP)-linked GalNAc).
In some embodiments, the one or more β1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc) include a β-1,3-galactosyltransferase derived from the O-antigen biosynthesis pathway of an enterohemorrhagic Escherichia coli (EcWbwC). Suitable β-1,3-galactosyltransferases derived from the O-antigen biosynthesis pathway of an enterohemorrhagic Escherichia coli include, without limitation, E. coli strain O104 EcWbwC and E. coli strain O5 EcWbwC.
As described herein, E. coli strain O104 encodes the β-1,3-galactosyltransferase WbwC, which extends Und-PP-GalNAc by a single Gal residue, yielding lipid-linked Gal-β1,3-GalNAc. The Gal-β1,3-GalNAc disaccharide present on E. coli strain O104 O antigens is identical to the O-glycan core 1 of mammalian glycoproteins and the cancer-associated Thomsen-Friedenreich (TF or T) antigen (see, e.g., Wang et al., “Characterization of Two UDP-Gal:GalNAc-Diphosphate-Lipid β1,3-Galactosyltransferases WbwC from Escherichia coli Serotypes O104 and O5,” J. Bacteriol. 196(17):3122-3133 (2014), which is hereby incorporated by reference in its entirety).
E. coli strain O104 EcWbwC has the amino acid sequence of SEQ ID NO: 3 below.
As used herein, the term “O-oligosaccharyltransferase” refers to an enzyme that transfers an O-oligosaccharide from a lipid carrier molecule to an acceptor molecule (e.g., the hydroxyl group of a serine or threonine residue on a target protein). In some embodiments, the O-oligosaccharyltransferase described herein glycosylates the acceptor molecule via an en bloc mechanism.
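The sequence of enzymatic steps described above can be summarized, purely for illustration, as a small state model: the 4-epimerase converts UDP-GlcNAc to UDP-GalNAc, the glycosyl-1-phosphate transferase forms Und-PP-linked GalNAc, the β1,3-galactosyltransferase (when present) extends it to Gal-β1,3-GalNAc, and an O-oligosaccharyltransferase moves the lipid-linked glycan onto a serine or threonine residue. The sketch below is a toy bookkeeping aid and not a kinetic or biochemical model. The names `CONVERSIONS`, `O_OSTS`, and `final_state` are hypothetical, cosubstrates (Und-P, UDP-Gal, the target protein) are omitted, and the model assumes the oligosaccharyltransferase acts only after all other available steps have run.

```python
from typing import Iterable, Tuple

State = Tuple[str, str]  # (carrier, glycan)

# Each exemplary enzyme maps one (carrier, glycan) state to the next.
# The state labels are illustrative shorthand, not identifiers from the source.
CONVERSIONS = {
    "Gne": {("UDP", "GlcNAc"): ("UDP", "GalNAc")},
    "PglC": {("UDP", "GalNAc"): ("Und-PP", "GalNAc")},
    "WbwC": {("Und-PP", "GalNAc"): ("Und-PP", "Gal-β1,3-GalNAc")},
}
O_OSTS = {"PglL", "PglO"}


def final_state(expressed: Iterable[str],
                start: State = ("UDP", "GlcNAc")) -> State:
    """Return the furthest state reachable with the expressed enzymes."""
    enzymes = set(expressed)
    state = start
    progressed = True
    while progressed:  # repeat until no expressed enzyme can act on the state
        progressed = False
        for enzyme in enzymes:
            nxt = CONVERSIONS.get(enzyme, {}).get(state)
            if nxt is not None:
                state = nxt
                progressed = True
    # Toy assumption: the O-OST transfers whatever glycan is lipid-linked at the end.
    if state[0] == "Und-PP" and enzymes & O_OSTS:
        state = ("Ser/Thr", state[1])
    return state


if __name__ == "__main__":
    print(final_state({"Gne", "PglC", "WbwC", "PglL"}))
    print(final_state({"Gne", "PglC", "PglL"}))
```

In this toy model, omitting the β1,3-galactosyltransferase leaves a protein-linked GalNAc monosaccharide, which mirrors the first aspect of the disclosure.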
The one or more O-oligosaccharyltransferases (O-OST) may be PglL, PglO, or a combination thereof.
O-glycosylation of proteins in Neisseria meningitidis is catalyzed by PglL, which belongs to a protein family including WaaL O-antigen ligases. Neisseria meningitidis PglL (NmPglL) shows relaxed substrate specificity and is able to transfer O-oligosaccharides composed of different sugars, linkages, and lengths from an undecaprenyl pyrophosphate (Und-PP) carrier to proteins (Faridmoayer et al., “Extreme Substrate Promiscuity of the Neisseria Oligosaccharyl Transferase Involved in Protein O-glycosylation,” J. Biol. Chem. 283(50):34596-604 (2008) and Faridmoayer et al., “Functional Characterization of Bacterial Oligosaccharyl-Transferases Involved in O-linked Protein Glycosylation,” J. Bacteriol. 189(22):8088-8098 (2007), which are hereby incorporated by reference in their entirety). Neisseria gonorrhoeae PglO (NgPglO) is an O-oligosaccharyltransferase having 95% sequence identity with Neisseria meningitidis PglL (NmPglL), which glycosylates a wide range of periplasmic proteins containing serine and threonine residues in vivo (Hartley et al., “Biochemical Characterization of the O-Linked Glycosylation Pathway in Neisseria gonorrhoeae Responsible for Biosynthesis of Protein Glycans Containing N,N′-Diacetylbacillosamine,” Biochemistry 50(22):4936-4948 (2011), which is hereby incorporated by reference in its entirety).
In some embodiments, the PglL is a Neisseria meningitidis PglL (NmPglL) and the PglO is Neisseria gonorrhoeae PglO (NgPglO).
Neisseria meningitidis PglL (NmPglL) has the amino acid sequence corresponding to amino acid residues 1-615 of SEQ ID NO: 4 below.
Neisseria gonorrhoeae PglO (NgPglO) has the amino acid sequence corresponding to amino acid residues 1-613 of SEQ ID NO: 5 below.
Additional O-oligosaccharyltransferases are well known to those of skill in the art and include, without limitation, proteins comprising protein glycosylation ligase (PglL_A), O-antigen ligase (Wzy_C), and/or virulence factor membrane-bound polymerase, C-terminal (Wzy_C_2) domains (see, e.g., Musumeci et al., “Evaluating the Role of Conserved Amino Acids in Bacterial O-Oligosaccharyltransferases by In Vivo, In Vitro and Limited Proteolysis Assays,” Glycobiology 24(1):39-50 (2014); Klena et al., “Comparison of Lipopolysaccharide Biosynthesis Genes rfaK, rfaL, rfaY, and rfaZ of Escherichia coli K-12 and Salmonella typhimurium,” J. Bacteriol. 174(14):4746-4752 (1992); and Kadioglu et al., “The Role of Streptococcus pneumoniae Virulence Factors in Host Respiratory Colonization and Disease,” Nat. Rev. Microbiol. 6(4):288-301 (2008), which are hereby incorporated by reference in their entirety).
An exemplary embodiment of the present disclosure is shown schematically in
As described herein, undecaprenyl-phosphate alpha-N-acetylglucosaminyl 1-phosphate transferase enzymes catalyze the transfer of a GlcNAc-1-phosphate moiety (see
As described herein supra, O-glycosylation of proteins in Neisseria meningitidis is catalyzed by PglL, which belongs to a protein family including WaaL O-antigen ligases. WaaL O-antigen ligases are inner membrane glycosyltransferases that catalyze the transfer of O-antigen polysaccharide from a lipid-linked intermediate to a terminal sugar of the lipid A-core oligosaccharide, which is a conserved step in lipopolysaccharide biosynthesis (Ruan et al., “Escherichia coli and Pseudomonas aeruginosa Lipopolysaccharide O-antigen Ligases share Similar Membrane Topology and Biochemical Properties,” Mol. Microbiol. 110(1):95-113 (2018), which is hereby incorporated by reference in its entirety).
Recombinant prokaryotic host cells according to the present disclosure may be obtained by providing one or more nucleotide sequences encoding an enzyme of the present disclosure (e.g., the nucleotide sequences encoding the one or more 4-epimerases, the one or more glycosyl-1-phosphate transferases, the one or more O-oligosaccharyltransferases, the one or more β1,3-galactosyltransferase enzymes, the enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc), and/or the sialyltransferases of the present disclosure). Thus, each of the one or more 4-epimerases, the one or more glycosyl-1-phosphate transferases, the one or more O-oligosaccharyltransferases, the one or more β1,3-galactosyltransferase enzymes, the enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc), and/or the sialyltransferases of the present disclosure may be encoded by a nucleotide sequence that is independently located either on an extrachromosomal plasmid carried by the prokaryotic host cell or in the recombinant prokaryotic host cell's genome.
In some embodiments, the one or more nucleotide sequences encoding the one or more 4-epimerases, the one or more glycosyl-1-phosphate transferases, the one or more O-oligosaccharyltransferases, the one or more β1,3-galactosyltransferase enzymes, the enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc), and/or the sialyltransferases of the present disclosure is a recombinant genetic construct.
As used herein, the “recombinant genetic construct” of the disclosure refers to a nucleic acid molecule containing a combination of two or more genetic elements not naturally occurring together. The recombinant genetic construct may comprise non-naturally occurring nucleotide sequences that can be in the form of linear DNA, circular DNA, i.e., placed within a vector (e.g., a bacterial vector) or integrated into a genome.
As described in more detail infra, the recombinant genetic construct is introduced into the host cell of interest to effectuate the expression of the one or more 4-epimerases, the one or more glycosyl-1-phosphate transferases, the one or more O-oligosaccharyltransferases, the one or more β1,3-galactosyltransferase enzymes, the enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc), and/or the sialyltransferases, as disclosed herein.
Suitable nucleotide sequences encoding the one or more 4-epimerases, the one or more glycosyl-1-phosphate transferases, the one or more O-oligosaccharyltransferases, and/or the one or more β1,3-galactosyltransferase enzymes disclosed herein are set forth in Table 1 below. Suitable nucleotide sequences also include nucleotide sequences having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to the 4-epimerase, the glycosyl-1-phosphate transferase, the O-oligosaccharyltransferase, and the β1,3-galactosyltransferase coding sequences provided in Table 1 below (i.e., SEQ ID NOs. 6-10).
C. jejuni Gne
Acinetobacter baumannii
E. coli strain
Neisseria meningitidis
ATCACCATCACCATCACCATCACTAA
Neisseria meningitidis
ACCATCATCATCACTAA
In some embodiments of the present disclosure, the nucleotide sequences disclosed herein (e.g., the nucleotide sequences encoding the one or more 4-epimerases, the one or more glycosyl-1-phosphate transferases, the one or more O-oligosaccharyltransferases, the one or more β1,3-galactosyltransferase enzymes, enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc), and/or the sialyltransferases of the present disclosure) are codon optimized to overcome limitations associated with the codon usage bias between E. coli (and other bacteria) and higher organisms, such as yeast and mammalian cells. Codon usage bias refers to differences among organisms in the frequency of occurrence of codons in protein-coding DNA sequences (genes). A codon is a series of three nucleotides (a triplet) that encodes a specific amino acid residue in a polypeptide chain. Codon optimization can be achieved by making specific transversion nucleotide changes (i.e., a purine to pyrimidine or pyrimidine to purine nucleotide change) or transition nucleotide changes (i.e., a purine to purine or pyrimidine to pyrimidine nucleotide change). Exemplary codon optimized nucleic acid molecules corresponding to C. jejuni Gne (CjGne), Acinetobacter baumannii strain ATCC 17978 PglC, E. coli strain O104 EcWbwC, Neisseria meningitidis PglL (NmPglL), and Neisseria meningitidis PglO (NmPglO) are set forth herein as SEQ ID NOs: 6-10, SEQ ID NOs: 15-18, SEQ ID NO: 21, and SEQ ID NO: 22.
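By way of illustration only, the distinction between transition and transversion changes described above can be expressed as a short classification routine. The function names `classify_substitution` and `codon_changes` are hypothetical, and the example codons are chosen only because GCA/GCG both encode alanine and CGA/CGT both encode arginine in the standard genetic code; they are not taken from the sequences of the present disclosure.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-nucleotide change as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("expected one of A, C, G, T")
    if ref == alt:
        return "identical"
    # Transition: purine<->purine or pyrimidine<->pyrimidine; otherwise transversion
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def codon_changes(original: str, optimized: str):
    """List (position, old base, new base, change type) for two codons."""
    if len(original) != 3 or len(optimized) != 3:
        raise ValueError("codons must be exactly three nucleotides long")
    return [
        (pos, old, new, classify_substitution(old, new))
        for pos, (old, new) in enumerate(zip(original.upper(), optimized.upper()))
        if old != new
    ]


if __name__ == "__main__":
    print(codon_changes("GCA", "GCG"))  # synonymous alanine codons
    print(codon_changes("CGA", "CGT"))  # synonymous arginine codons
```

Positions in the returned tuples are zero-based within the codon.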
The nucleotide sequences encoding any of the enzymes disclosed herein (i.e., the nucleotide sequences encoding the one or more 4-epimerases, the one or more glycosyl-1-phosphate transferases, the one or more O-oligosaccharyltransferases, the one or more β1,3-galactosyltransferase enzymes, enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc), and/or the sialyltransferases of the present disclosure) may comprise sequences having at least 80% identity to any one of SEQ ID NOs: 6-10, SEQ ID NOs: 15-18, SEQ ID NO: 21, and SEQ ID NO: 22.
In another embodiment, the nucleotide sequences of the present disclosure encode a polypeptide having the amino acid sequence of any one or more of SEQ ID NOs: 1-5, or a modified amino acid sequence of any one of SEQ ID NOs: 1-5, where the modified sequence has at least 80% sequence identity to any one of SEQ ID NOs: 1-5.
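As a hedged illustration of how a percent-identity threshold such as the 80% figure recited above might be checked, the sketch below computes identity from two sequences that have already been globally aligned (with `-` marking gaps). It assumes identity is defined as identical aligned positions divided by total alignment length; other definitions exist and may give different values. The function names `percent_identity` and `meets_identity_threshold` and the example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the full alignment length (an assumed definition)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    a, b = aligned_a.upper(), aligned_b.upper()
    # Count columns where both sequences carry the same non-gap residue
    identical = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    return 100.0 * identical / len(a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True when the aligned pair is at or above the identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    print(round(percent_identity("ACGT-A", "ACGTTA"), 2))
    print(meets_identity_threshold("ACGT-A", "ACGTTA"))
```

The same routine applies to aligned amino acid sequences, since it only compares characters column by column.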
Methods for transforming/transfecting host cells with recombinant genetic constructs provided herein are well-known in the art and depend on the host system selected, as described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989), which is hereby incorporated by reference in its entirety.
As noted above, the further elaboration of Gal-β1,3-GalNAc with additional sugars such as N-Acetylneuraminic acid (NeuNAc) (see
In some embodiments, when the recombinant prokaryotic host cell comprises one or more β1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc), the recombinant prokaryotic host cell may further express: (i) the enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc); and (ii) one or more α2,3-sialyltransferases and/or one or more α2,6-sialyltransferases (see pathway beginning in the middle of
Suitable enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) are encoded by, e.g., the E. coli neuDBAC genes (e.g., E. coli K1 neuDBAC genes). As described herein, NeuC is a UDP-GlcNAc 2-epimerase that converts UDP-GlcNAc to ManNAc; NeuB is a sialic acid synthase that condenses ManNAc and PEP to form NeuNAc; NeuA is a CMP-NeuNAc synthetase that converts NeuNAc to CMP-NeuNAc; and NeuD promotes efficient sialic acid synthesis by enhancing the activity of NeuC, NeuB, and NeuA (see, e.g., Daines et al., “NeuD Plays a Role in the Synthesis of Sialic Acid in Escherichia coli K1,” FEMS Microbiology Letters 189(2):281-284 (2000) and Valentine et al., “Immunization with Outer Membrane Vesicles Displaying Designer Glycotopes Yields Class-switched, Glycan-specific Antibodies,” Cell. Chem. Biol. 23:655-665 (2016), which are hereby incorporated by reference in their entirety). Thus, in some embodiments, the enzymes of an enzymatic pathway capable of producing CMP-NeuNAc include one or more of NeuA, NeuB, NeuC, and NeuD (e.g., E. coli K1 NeuA, E. coli K1 NeuB, E. coli K1 NeuC, and E. coli K1 NeuD).
E. coli K1 NeuA has the amino acid sequence of SEQ ID NO: 11 below.
E. coli K1 NeuB has the amino acid sequence of SEQ ID NO: 12 below.
E. coli K1 NeuC has the amino acid sequence of SEQ ID NO: 13 below.
E. coli K1 NeuD has the amino acid sequence of SEQ ID NO: 14 below.
Suitable nucleotide sequences encoding the enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) disclosed herein are set forth in Table 2 below. Suitable nucleotide sequences also include nucleotide sequences having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to the coding sequences for the enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) provided in Table 2 below (i.e., SEQ ID NOs. 15-18).
E. coli K1
E. coli K1
E. coli K1
E. coli K1
N-acetylneuraminate lyase plays a role in the regulation of sialic acid metabolism in bacteria by catalyzing the reversible aldol cleavage of N-acetylneuraminic acid (sialic acid) to form pyruvate and N-acetyl-D-mannosamine (Izard et al., “The Three-Dimensional Structure of N-Acetylneuraminate Lyase from Escherichia coli,” Structure 2(5):361-369 (1994), which is hereby incorporated by reference in its entirety). Accordingly, N-acetylneuraminate lyase may interfere with the CMP-NeuNAc synthase pathway encoded by the E. coli K1 neuDBAC genes. Thus, in some embodiments, the recombinant prokaryotic host cell does not express an enzymatically active N-acetylneuraminate lyase. Exemplary N-acetylneuraminate lyases are well known in the art and include, without limitation, NanA.
In some embodiments, the recombinant prokaryotic host cell is an E. coli host cell that lacks a functional copy of the nanA gene (Valentine et al., “Immunization with Outer Membrane Vesicles Displaying Designer Glycotopes Yields Class-switched, Glycan-specific Antibodies,” Cell. Chem. Biol. 23:655-665 (2016), which is hereby incorporated by reference in its entirety).
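The relationship between the NeuC, NeuB, and NeuA steps and the potentially competing NanA lyase can be illustrated with a qualitative reachability check. This sketch only asks whether CMP-NeuNAc can be formed from an assumed starting pool of UDP-GlcNAc and PEP given a set of expressed enzymes, and flags when NanA is present alongside NeuNAc. It makes no quantitative claim. The names `REACTIONS`, `reachable_metabolites`, and `assess_sialylation_pathway` are hypothetical, other cosubstrates are omitted, and NeuD is treated as having no reaction of its own because the text describes it as enhancing the other enzymes.

```python
# enzyme -> (required substrates, products), following the steps described in the text
REACTIONS = {
    "NeuC": ({"UDP-GlcNAc"}, {"ManNAc"}),
    "NeuB": ({"ManNAc", "PEP"}, {"NeuNAc"}),
    "NeuA": ({"NeuNAc"}, {"CMP-NeuNAc"}),
}


def reachable_metabolites(expressed, pool=("UDP-GlcNAc", "PEP")):
    """Return every metabolite reachable from the starting pool."""
    available = set(pool)
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for enzyme in expressed:
            if enzyme not in REACTIONS:
                continue  # e.g. NeuD or NanA: no forward step modeled here
            substrates, products = REACTIONS[enzyme]
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


def assess_sialylation_pathway(expressed):
    """Qualitative summary: is CMP-NeuNAc reachable, and might NanA compete?"""
    expressed = set(expressed)
    available = reachable_metabolites(expressed)
    return {
        "cmp_neunac_reachable": "CMP-NeuNAc" in available,
        "lyase_may_compete": "NanA" in expressed and "NeuNAc" in available,
    }


if __name__ == "__main__":
    print(assess_sialylation_pathway({"NeuD", "NeuB", "NeuA", "NeuC"}))
    print(assess_sialylation_pathway({"NeuD", "NeuB", "NeuA", "NeuC", "NanA"}))
```

A host lacking a functional nanA gene corresponds, in this toy model, to leaving "NanA" out of the expressed set.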
Exemplary sialyltransferases for use in the present disclosure are identified in the schematic of
Photobacterium sp. JT-ISH-224 α2,6-sialyltransferase (PspST6) has the amino acid sequence of SEQ ID NO: 19 below.
In some embodiments, the one or more sialyltransferases is an α2,3-sialyltransferase. In accordance with such embodiments, the one or more α2,3-sialyltransferases is WbwA from E. coli O104.
E. coli O104 α2,3-sialyltransferase (EcWbwA) has the amino acid sequence of SEQ ID NO: 20 below.
Suitable nucleotide sequences encoding the sialyltransferases disclosed herein are set forth in Table 3 below. Suitable nucleotide sequences also include nucleotide sequences having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to the sialyltransferase coding sequences provided in Table 3 below (i.e., SEQ ID NOs. 21-22).
Photobacterium sp.
E. coli O104 α2,3-
As noted above, in some embodiments, the enzymes of the enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) are encoded by a polynucleotide sequence that is located on an extrachromosomal plasmid carried by the prokaryotic host cell. As described herein, expressing these enzymes from an extrachromosomal plasmid is effective to produce greater amounts of the enzymes than expressing them from the recombinant prokaryotic host cell's genome.
In some embodiments, the enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) are encoded by a polynucleotide sequence that is located in the recombinant prokaryotic host cell's genome. Thus, in some embodiments, the prokaryotic host cell is an E. coli cell and the one or more enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) (e.g., E. coli K1 NeuA, E. coli K1 NeuB, E. coli K1 NeuC, and E. coli K1 NeuD) are encoded by a polynucleotide sequence that is located at the native O-antigen locus of the host cell's genome. In accordance with such embodiments, the genomically integrated enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) is heterologous and orthogonal to the host cell. As described herein, the genomically integrated enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc), when integrated into the native O-antigen locus of the host cell genome, produces greater amounts of glycoprotein bearing sialic acid than when (i) the enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) is integrated at any other locus in the host cell genome and/or (ii) the enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) is located on an extrachromosomal plasmid carried by the host cell.
In some embodiments, each of the one or more α2,3-sialyltransferases or α2,6-sialyltransferases is encoded by a polynucleotide sequence that is independently located either on an extrachromosomal plasmid carried by the prokaryotic host cell or in the recombinant prokaryotic host cell's genome.
In some embodiments of the present disclosure, the recombinant prokaryotic host cell further expresses a “glycoprotein target” or a “target protein for glycosylation.” As used herein, the terms “glycoprotein target” or “target protein for glycosylation” refer to a protein of interest which comprises one or more acceptor sites for O-glycosylation. Thus, in some embodiments, the glycoprotein target comprises one or more serine and/or threonine residues. Suitable target proteins include prokaryotic and eukaryotic proteins. In some embodiments, the glycoprotein target is a mucin or mucin-like protein. Exemplary mucins and mucin-like proteins are well known in the art and include, e.g., MUC1, MUC2, MUC3, MUC4, MUC5, MUC6, MUC7, MUC8, MUC9, MUC10, MUC11, MUC12, MUC13, MUC14, MUC15, MUC16, MUC17, MUC18, MUC19, MUC20, MUC21, MUC22, MUC23, MUC24, MUC25, MUC26, MUC27, MUC28, MUC29, MUC30, MUC31, MUC32, MUC33, MUC34, MUC35, MUC36, MUC37, MUC38, MUC39, MUC40, MUC41, MUC42, human erythropoietin (EPO), human glycophorin C (GPC), and fragments thereof, as well as leukosialin (leucocyte sialoglycoprotein, sialophorin, CD43, galactoglycoprotein, GALGP) (Schmid et al., “Amino Acid Sequence of Human Plasma Galactoglycoprotein: Identity with the Extracellular Region of CD43 (Sialophorin),” Proc. Natl. Acad. Sci. USA 89(2):663-667 (1992); Fukuda, M., “Leukosialin, A Major O-Glycan-Containing Sialoglycoprotein Defining Leukocyte Differentiation and Malignancy,” Glycobiology 1(4):347-356 (1991); and Campos et al., “Probing the O-Glycosylation of Gastric Cancer Cell Lines for Biomarker Discovery,” Mol. Cell. Proteomics 14(6):1616-1629 (2015), which are hereby incorporated by reference in their entirety); glycophorin A (PAS-2, sialoglycoprotein alpha, MN sialoglycoprotein) (Pisano et al., “Glycosylation Sites Identified by Solid-Phase Edman Degradation: O-Linked Glycosylation Motifs on Human Glycophorin A,” Glycobiology 3(5):429-435 (1993), which is hereby incorporated by reference in its entirety); glycophorin C (glycophorin D, PAS-2′, GLPC) (Dahr & Beyreuther, “A Revision of the N-Terminal Structure of Sialoglycoprotein D (Glycophorin C) from Human Erythrocyte Membranes,” Biol. Chem. Hoppe Seyler 366(11):1067-1070 (1985), which is hereby incorporated by reference in its entirety); von Willebrand factor (Titani et al., “Amino Acid Sequence of Human von Willebrand Factor,” Biochemistry 25(11):3171-3184 (1986) and Samor et al., “Primary Structure of the Major O-Glycosidically Linked Carbohydrate Unit of Human von Willebrand Factor,” Glycoconj. J. 6(3):263-270 (1989), which are hereby incorporated by reference in their entirety); kininogen (Lottspeich et al., “The Amino Acid Sequence of the Light Chain of Human High-Molecular-Mass Kininogen,” Eur. J. Biochem. 152(2):307-314 (1985), which is hereby incorporated by reference in its entirety); and chorionic gonadotropin beta chain (Birken et al., “Characterization of Antisera Distinguishing Carbohydrate Structures in the Beta-Carboxyl-Terminal Region of Human Chorionic Gonadotropin,” Endocrinology 122(5):2054-2063 (1988), which is hereby incorporated by reference in its entirety) (see Table 4 below).
Aberrant mucin expression and glycosylation are reliable biomarkers of carcinomas in humans (Rata et al., “MUC Glycoproteins: Potential Biomarkers and Molecular Targets for Cancer Therapy,” Curr. Cancer Drug Targets 21(2):132-152 (2021), which is hereby incorporated by reference in its entirety). Indeed, the membrane-associated mucin MUC1 is aberrantly expressed in ~60% of all cancers diagnosed each year in the U.S. (Jonckheere & Van Seuningen, “The Membrane-Bound Mucins: From Cell Signalling to Transcriptional Regulation and Expression in Epithelial Cancers,” Biochimie 92(1):1-11 (2009), which is hereby incorporated by reference in its entirety), rendering MUC1 one of the most prominently dysregulated genes in cancer. Another mucin, MUC16 (also called CA125), is highly expressed in ovarian cancer and is used clinically as a biomarker for treatment efficacy and surveillance.
Additional suitable glycoprotein targets may be selected from the group consisting of a therapeutic protein, a diagnostic protein, an industrial enzyme, or a portion thereof.
In some embodiments, the therapeutic protein is selected from the group consisting of an enzyme, a cytokine, a hormone, a growth factor, an inhibitor protein, a protein receptor, a ligand that binds a protein receptor, or an antibody.
In some embodiments, the target protein is heterologous to the recombinant prokaryotic host cell.
In some embodiments, the glycoprotein target comprises a MOOR tag (WPAAASAP (SEQ ID NO: 24)).
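As an illustration only, the following minimal Python sketch shows how one might screen a candidate target sequence for the two features named above: serine/threonine residues (potential O-glycosylation acceptor sites) and the MOOR tag sequence WPAAASAP. The function names and the toy sequence are hypothetical and are not part of the disclosure; the presence of a serine or threonine does not by itself establish that a residue is glycosylated.

```python
import re
from typing import List

# MOOR tag sequence as given in the text (SEQ ID NO: 24).
MOOR_TAG = "WPAAASAP"


def find_ser_thr_positions(sequence: str) -> List[int]:
    """Return 1-based positions of every serine (S) or threonine (T) residue."""
    return [i + 1 for i, aa in enumerate(sequence.upper()) if aa in "ST"]


def find_moor_tags(sequence: str) -> List[int]:
    """Return 1-based start positions of each MOOR tag occurrence.

    A lookahead is used so that overlapping matches, if any, are also reported.
    """
    seq = sequence.upper()
    return [m.start() + 1 for m in re.finditer(f"(?={MOOR_TAG})", seq)]


# Hypothetical toy sequence used purely for demonstration.
example_target = "MKTWPAAASAPGG"
ser_thr_sites = find_ser_thr_positions(example_target)
moor_starts = find_moor_tags(example_target)
```

Such a scan only identifies candidate positions in a sequence; whether a given site is actually modified would be determined experimentally.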
In some embodiments, the target protein is encoded by a polynucleotide sequence that is located on an extrachromosomal plasmid carried by the prokaryotic host cell or in the recombinant prokaryotic host cell's genome.
In some embodiments, the expression of one or more of: the one or more 4-epimerases, the one or more glycosyl-1-phosphate transferases, the one or more O-oligosaccharyltransferases, the one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc), the glycoprotein target comprising one or more serine and/or threonine residues, the one or more α2,3-sialyltransferases, and/or the one or more α2,6-sialyltransferases is constitutive.
In some embodiments, the expression of one or more of: the one or more 4-epimerases, the one or more glycosyl-1-phosphate transferases, the one or more O-oligosaccharyltransferases, the one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc), the glycoprotein target comprising one or more serine and/or threonine residues, the one or more α2,3-sialyltransferases, and the one or more α2,6-sialyltransferases is inducible.
In some embodiments, membrane extracts may be prepared from the recombinant prokaryotic host cells disclosed herein. The membrane extracts may be prepared from different recombinant prokaryotic host cell strains as disclosed herein and the membrane extracts may be combined to prepare a mixed membrane extract. In some embodiments, one or more membrane extracts may be prepared from one or more recombinant prokaryotic host cell strains including a genomic modification (e.g., deletions of genes rendering the genes inoperable) that preferably result in membrane extracts comprising sugar precursors for glycosylation at relatively high concentrations (e.g., in comparison to a strain not having the genomic modification). In some embodiments, one or more membrane extracts may be prepared from one or more recombinant prokaryotic host cell strains that have been modified to express one or more orthogonal or heterologous genes or gene clusters that are associated with glycoprotein synthesis. Preferably, the membrane extracts or mixed membrane extracts are enriched in glycosylation components, such as the one or more 4-epimerases, the one or more glycosyl-1-phosphate transferases, the one or more O-oligosaccharyltransferases, the one or more ß1,3-galactosyltransferase enzymes, the enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc), and/or the sialyltransferases of the present disclosure.
Another aspect of the present disclosure relates to a method for producing an O-glycosylated protein. This method involves providing a recombinant host cell expressing one or more 4-epimerases, one or more glycosyl-1-phosphate transferases, one or more O-oligosaccharyltransferases, and a glycoprotein target comprising one or more serine and/or threonine residues. This method further involves culturing the host cell under conditions effective to: (i) produce N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP); and (ii) transfer the N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP) en bloc to a serine or threonine amino acid of the glycoprotein target.
Another aspect of the present disclosure relates to a method for producing an O-glycosylated protein. This method involves providing a recombinant host cell expressing one or more 4-epimerases, one or more glycosyl-1-phosphate transferases, one or more O-oligosaccharyltransferases, one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc), and a glycoprotein target comprising one or more serine and/or threonine residues. This method further involves culturing said host cell under conditions effective to: (i) produce N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP); (ii) extend Und-PP-GalNAc by a single galactose (Gal) monosaccharide to yield lipid-linked Gal-ß1,3-GalNAc; and (iii) transfer the lipid-linked Gal-ß1,3-GalNAc en bloc to a serine or threonine amino acid of the glycoprotein target.
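The order of steps in the two methods above can be summarized, purely for illustration, with the following Python sketch. It is a simplified presence/absence model written for this explanation and not an implementation of the disclosed methods: the enzyme labels are hypothetical strings, and real pathway output depends on expression levels, substrate availability, and culture conditions that are not modeled here.

```python
from typing import Optional, Set

# Hypothetical labels for the enzyme classes named in the text.
EPIMERASE = "4-epimerase"
PHOSPHO_TRANSFERASE = "glycosyl-1-phosphate transferase"
GAL_T = "beta1,3-galactosyltransferase"
O_OST = "O-oligosaccharyltransferase"


def lipid_linked_glycan(enzymes: Set[str]) -> Optional[str]:
    """Return the Und-PP-linked glycan expected from the enzymes present.

    Step (i): a 4-epimerase and a glycosyl-1-phosphate transferase are
    both needed to yield Und-PP-GalNAc.
    Step (ii): a beta1,3-galactosyltransferase, if present, extends it by
    a single Gal to give Gal-beta1,3-GalNAc.
    """
    if not {EPIMERASE, PHOSPHO_TRANSFERASE} <= enzymes:
        return None
    if GAL_T in enzymes:
        return "Gal-beta1,3-GalNAc"
    return "GalNAc"


def transferred_glycan(enzymes: Set[str], target_has_ser_thr: bool) -> Optional[str]:
    """Return the glycan transferred en bloc to the target, or None.

    Transfer requires an O-oligosaccharyltransferase and a target with at
    least one serine or threonine residue.
    """
    glycan = lipid_linked_glycan(enzymes)
    if glycan is None or O_OST not in enzymes or not target_has_ser_thr:
        return None
    return glycan
```

In this model, supplying the first three enzyme classes with a Ser/Thr-containing target corresponds to the first method (GalNAc transfer), and adding the galactosyltransferase corresponds to the second (Gal-ß1,3-GalNAc transfer).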
Suitable host cells, 4-epimerases, glycosyl-1-phosphate transferases, O-oligosaccharyltransferases, and ß1,3-galactosyltransferase enzymes for use in the methods described herein are described in more detail supra.
In some embodiments, the host cell is an Escherichia coli cell. Exemplary suitable E. coli host cells according to the present disclosure include, without limitation, laboratory strains of E. coli selected from the group consisting of DH5α, NEB 10-beta, BL21(DE3), W3110, CLM24, CLM25, MC4100, MCΔw, MCΔΔw, MCΔΔw-neuo-
In some embodiments, the 4-epimerase is a uridine diphosphate-N-acetylglucosamine (UDP-GlcNAc) 4-epimerase. For example, the 4-epimerase may be a Gne. The Gne may be C. jejuni Gne (CjGne).
In some embodiments, the one or more glycosyl-1-phosphate transferases is a PglC. For example, the PglC is Acinetobacter baumannii ATCC 17978 PglC (AbPglC).
In some embodiments, the one or more O-oligosaccharyltransferases is a PglL, a PglO, or a combination thereof. Thus, in some embodiments, the PglL is Neisseria meningitidis PglL (NmPglL) and the PglO is Neisseria gonorrhoeae PglO (NgPglO).
In some embodiments, when the recombinant host cell expresses one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc), the one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc) is the ß-1,3-galactosyltransferase derived from the O-antigen biosynthesis pathway of enterohemorrhagic Escherichia coli O104 (EcWbwC).
In some embodiments of the methods disclosed herein, each of the one or more 4-epimerases, the one or more glycosyl-1-phosphate transferases, the one or more O-oligosaccharyltransferases, and the one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc) are encoded by a polynucleotide sequence that is independently located either on an extrachromosomal plasmid carried by the prokaryotic host cell or in the recombinant prokaryotic host cell's genome.
In some embodiments, when the recombinant prokaryotic host cell does not express one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc), the recombinant prokaryotic host cell further expresses: (i) the enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc), and (ii) one or more α2,6-sialyltransferases. In accordance with such embodiments, the N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP) is extended by one or more sialic acid (NeuNAc) sugars before the transfer of the N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP) en bloc to a serine or threonine amino acid of the glycoprotein target.
Suitable enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) are described in more detail supra and include, e.g., E. coli K1 NeuA, NeuB, NeuC, and NeuD.
Suitable α2,6-sialyltransferases are described in more detail supra and include, e.g., the α2,6-sialyltransferase from Photobacterium sp. JT-ISH-224.
In some embodiments, when the recombinant prokaryotic host cell expresses one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc), the recombinant prokaryotic host cell further expresses: (i) the enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc); and (ii) one or more α2,3-sialyltransferases and/or one or more α2,6-sialyltransferases. In accordance with such embodiments, the lipid-linked Gal-ß1,3-GalNAc is extended by one or more sialic acid (NeuNAc) sugars before the transfer of the lipid-linked Gal-ß1,3-GalNAc en bloc to a serine or threonine amino acid of the glycoprotein target.
Suitable enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) are described in more detail supra and include, e.g., E. coli K1 NeuA, NeuB, NeuC, and NeuD. Thus, in some embodiments, the enzymes of the enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) are encoded by the E. coli K1 neuDBAC genes.
Suitable α2,3-sialyltransferases and α2,6-sialyltransferases are described in more detail supra. In some embodiments, the one or more α2,3-sialyltransferases is WbwA from E. coli O104 and the one or more α2,6-sialyltransferases is the α2,6-sialyltransferase from Photobacterium sp. JT-ISH-224.
Suitable glycoprotein targets are described in more detail supra.
Purified glycoproteins and/or glycoprotein reagents (e.g., one or more 4-epimerase enzymes, one or more N,N′-diacetylbacillosaminyl-1-phosphate transferase enzymes, one or more O-oligosaccharyltransferases, and one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc)) may be obtained from the recombinant prokaryotic host cell described herein by several methods readily known in the art, including ion exchange chromatography, hydrophobic interaction chromatography, affinity chromatography, gel filtration, and reverse phase chromatography. In some embodiments, the glycoproteins and/or glycoprotein reagents described herein are produced in purified form (preferably at least about 70% pure, at least about 75% pure, at least about 80% pure, at least about 85% pure, at least about 90% pure, at least about 95% pure, at least about 96% pure, at least about 97% pure, at least about 98% pure, at least about 99% pure, at least about 99.5% pure, at least about 99.9% pure, or more) by conventional techniques.
Another aspect of the present disclosure relates to an in vitro method for producing an O-glycosylated protein. This method involves providing glycosylation reagents comprising one or more 4-epimerases, one or more glycosyl-1-phosphate transferases, and one or more O-oligosaccharyltransferases; providing a glycoprotein target comprising one or more serine and/or threonine residues; and incubating said glycosylation reagents and said glycoprotein target under conditions effective to: (i) yield N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP), and (ii) transfer the N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP) en bloc to a serine or threonine amino acid of the glycoprotein target.
Another aspect of the present disclosure relates to an in vitro method for producing an O-glycosylated protein. This method involves providing glycosylation reagents comprising one or more 4-epimerase enzymes, one or more heterologous N,N′-diacetylbacillosaminyl-1-phosphate transferase enzymes, one or more heterologous O-oligosaccharyltransferases, and one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc); providing a glycoprotein target comprising one or more serine and/or threonine residues; and incubating said glycosylation reagents and said glycoprotein target under conditions effective to: (i) yield lipid-linked Gal-ß1,3-GalNAc, and (ii) transfer the lipid-linked Gal-ß1,3-GalNAc and any additional sugars en bloc to a serine or threonine amino acid of the glycoprotein target.
In some embodiments, the glycosylation reagents are provided as a membrane extract of a recombinant prokaryotic host cell of the present disclosure. In some embodiments, the recombinant prokaryotic host cell is an Escherichia coli cell.
In some embodiments, the glycosylation reagents are provided in the form of purified enzymes.
Suitable 4-epimerases, glycosyl-1-phosphate transferases, O-oligosaccharyltransferases, and ß1,3-galactosyltransferase enzymes for use in the methods described herein are described in more detail supra.
In some embodiments, the 4-epimerase is a uridine diphosphate-N-acetylglucosamine (UDP-GlcNAc) 4-epimerase. For example, the 4-epimerase may be a Gne. The Gne may be C. jejuni Gne (CjGne).
In some embodiments, the one or more glycosyl-1-phosphate transferases is a PglC. For example, the PglC may be Acinetobacter baumannii ATCC 17978 PglC (AbPglC).
In some embodiments, the one or more O-oligosaccharyltransferases is a PglL, a PglO, or a combination thereof. Thus, in some embodiments, the PglL is Neisseria meningitidis PglL (NmPglL) and the PglO is Neisseria gonorrhoeae PglO (NgPglO).
In some embodiments, when the recombinant host cell expresses one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc), the one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc) is the ß-1,3-galactosyltransferase derived from the O-antigen biosynthesis pathway of enterohemorrhagic Escherichia coli O104 (EcWbwC).
In some embodiments of the methods disclosed herein, when the glycosylation reagents do not comprise one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc), the glycosylation reagents further comprise: (i) the enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc), and (ii) one or more α2,6-sialyltransferases. In accordance with such embodiments, the N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP) is extended by one or more sialic acid (NeuNAc) sugars before the transfer of the N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP) en bloc to a serine or threonine amino acid of the glycoprotein target.
Suitable enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) are described in more detail supra and include, e.g., E. coli K1 NeuA, NeuB, NeuC, and NeuD.
Suitable α2,6-sialyltransferases are described in more detail supra and include, e.g., the α2,6-sialyltransferase from Photobacterium sp. JT-ISH-224.
In some embodiments, when the glycosylation reagents comprise one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc), the reagents further comprise: (i) the enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc); and (ii) one or more α2,3-sialyltransferases and/or one or more α2,6-sialyltransferases. In accordance with such embodiments, the lipid-linked Gal-ß1,3-GalNAc is extended by one or more sialic acid (NeuNAc) sugars before the transfer of the lipid-linked Gal-ß1,3-GalNAc en bloc to a serine or threonine amino acid of the glycoprotein target.
Suitable enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) are described in more detail supra and include, e.g., E. coli K1 NeuA, NeuB, NeuC, and NeuD. Thus, in some embodiments, the enzymes of the enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) are encoded by the E. coli K1 neuDBAC genes.
Suitable α2,3-sialyltransferases and α2,6-sialyltransferases are described in more detail supra. In some embodiments, the one or more α2,3-sialyltransferases is WbwA from E. coli O104 and the one or more α2,6-sialyltransferases is the α2,6-sialyltransferase from Photobacterium sp. JT-ISH-224.
Suitable glycoprotein targets are described in more detail supra.
Another aspect of the present disclosure relates to an in vitro method for producing an O-glycosylated protein. This method involves providing reagents suitable for synthesizing a glycoprotein target; providing glycosylation reagents comprising one or more 4-epimerases, one or more glycosyl-1-phosphate transferases, and one or more O-oligosaccharyltransferases; providing a nucleic acid molecule encoding a glycoprotein target; and incubating said reagents suitable for synthesizing a glycoprotein target, glycosylation reagents, and nucleic acid molecule encoding a glycoprotein target under conditions effective to: (i) synthesize the glycoprotein target encoded by the nucleic acid molecule encoding a glycoprotein target, (ii) yield N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP), and (iii) transfer the N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP) en bloc to a serine or threonine amino acid of the glycoprotein target.
Another aspect of the present disclosure relates to an in vitro method for producing an O-glycosylated protein. This method involves providing reagents suitable for synthesizing a glycoprotein target; providing glycosylation reagents comprising one or more 4-epimerase enzymes, one or more heterologous N,N′-diacetylbacillosaminyl-1-phosphate transferase enzymes, one or more heterologous O-oligosaccharyltransferases, and one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc); providing a nucleic acid molecule encoding a glycoprotein target; and incubating said reagents suitable for synthesizing a glycoprotein target, glycosylation reagents, and nucleic acid molecule encoding a glycoprotein target under conditions effective to: (i) synthesize the glycoprotein target encoded by the nucleic acid molecule encoding a glycoprotein target, (ii) yield lipid-linked Gal-ß1,3-GalNAc, and (iii) transfer the lipid-linked Gal-ß1,3-GalNAc and any additional sugars en bloc to a serine or threonine amino acid of the glycoprotein target.
Reagents suitable for synthesizing a glycoprotein target are well known in the art and include, e.g., translation reagents.
Reagents for synthesizing proteins from a nucleic acid molecule and/or a recombinant genetic construct in vitro (i.e., in a cell-free environment) are well known in the art. These reagents or systems typically consist of extracts from rabbit reticulocytes, wheat germ, or E. coli. The extracts contain all the macromolecular components necessary for translation of an exogenous RNA molecule, including, for example, ribosomes, tRNAs, aminoacyl-tRNA synthetases, and initiation, elongation, and termination factors. The other required components of the system include amino acids, energy sources (e.g., ATP, GTP), energy regenerating systems (e.g., creatine phosphate and creatine phosphokinase for eukaryotic systems, and phosphoenol pyruvate and pyruvate kinase for prokaryotic systems), and other cofactors (e.g., Mg²⁺, K⁺, etc.). If the nucleic acid molecule and/or recombinant genetic construct encoding the glycoprotein target is a DNA molecule, the cell-free translation reaction is coupled or linked to an initial transcription reaction that utilizes an RNA polymerase.
Suitable 4-epimerases, glycosyl-1-phosphate transferases, O-oligosaccharyltransferases, and ß1,3-galactosyltransferase enzymes for use in the methods described herein are described in more detail supra.
In some embodiments, the 4-epimerase is a uridine diphosphate-N-acetylglucosamine (UDP-GlcNAc) 4-epimerase. For example, the 4-epimerase may be a Gne. The Gne may be C. jejuni Gne (CjGne).
In some embodiments, the one or more glycosyl-1-phosphate transferases is a PglC. For example, the PglC may be Acinetobacter baumannii ATCC 17978 PglC (AbPglC).
In some embodiments, the one or more O-oligosaccharyltransferases is a PglL, a PglO, or a combination thereof. Thus, in some embodiments, the PglL is Neisseria meningitidis PglL (NmPglL) and the PglO is Neisseria gonorrhoeae PglO (NgPglO).
In some embodiments, when the recombinant host cell expresses one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc), the one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc) is the ß-1,3-galactosyltransferase derived from the O-antigen biosynthesis pathway of enterohemorrhagic Escherichia coli O104 (EcWbwC).
Suitable glycoprotein targets are described in more detail supra.
In some embodiments, when the glycosylation reagents do not comprise one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc), the glycosylation reagents may further comprise: (i) the enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc); and (ii) one or more α2,6-sialyltransferases, and the N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP) is extended by one or more sialic acid (NeuNAc) sugars before the transfer of the N-acetylgalactosamine (GalNAc) linked to undecaprenyl pyrophosphate (Und-PP) en bloc to a serine or threonine amino acid of the glycoprotein target.
Suitable enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) are described in more detail supra and include, e.g., E. coli K1 NeuA, NeuB, NeuC, and NeuD.
Suitable α2,6-sialyltransferases are described in more detail supra and include, e.g., the α2,6-sialyltransferase from Photobacterium sp. JT-ISH-224.
In some embodiments, when the glycosylation reagents comprise one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc), the glycosylation reagents may further comprise: (i) the enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc); and (ii) one or more α2,3-sialyltransferases and/or one or more α2,6-sialyltransferases. In accordance with such embodiments, the lipid-linked Gal-ß1,3-GalNAc is extended by one or more sialic acid (NeuNAc) sugars before the transfer of the lipid-linked Gal-ß1,3-GalNAc en bloc to a serine or threonine amino acid of the glycoprotein target.
Suitable enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) are described in more detail supra and include, e.g., E. coli K1 NeuA, NeuB, NeuC, and NeuD. Thus, in some embodiments, the enzymes of the enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) are encoded by the E. coli K1 neuDBAC genes.
Suitable α2,3-sialyltransferases and α2,6-sialyltransferases are described in more detail supra. In some embodiments, the one or more α2,3-sialyltransferases is WbwA from E. coli O104 and the one or more α2,6-sialyltransferases is the α2,6-sialyltransferase from Photobacterium sp. JT-ISH-224.
Another aspect of the present disclosure relates to a method for producing a lipid-linked Gal-β1,3-GalNAcα (T antigen or core 1). This method involves providing a recombinant host cell expressing one or more 4-epimerases, one or more glycosyl-1-phosphate transferases, one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc), and one or more O-Antigen ligases (e.g., EcWaaL). This method further involves culturing the host cell under conditions effective to: (i) produce Gal-ß1,3-GalNAc linked to undecaprenyl pyrophosphate (Und-PP); and (ii) transfer Gal-ß1,3-GalNAc linked to undecaprenyl pyrophosphate (Und-PP) en bloc to a lipid target (see, e.g.,
Suitable recombinant host cells for use in the methods described herein are described in more detail supra. In some embodiments, the recombinant prokaryotic host cell is an Escherichia coli cell.
Suitable lipid targets include, e.g., lipid A core.
Suitable 4-epimerases, glycosyl-1-phosphate transferases, and ß1,3-galactosyltransferase enzymes for use in the methods described herein are described in more detail supra.
In some embodiments, the 4-epimerase is a uridine diphosphate-N-acetylglucosamine (UDP-GlcNAc) 4-epimerase. For example, the 4-epimerase may be a Gne. The Gne may be C. jejuni Gne (CjGne).
In some embodiments, the one or more glycosyl-1-phosphate transferases is a PglC. For example, the PglC may be Acinetobacter baumannii ATCC 17978 PglC (AbPglC).
In some embodiments, the one or more ß1,3-galactosyltransferase enzymes capable of transferring galactose (Gal) to undecaprenyl pyrophosphate (Und-PP)-linked N-Acetylgalactosamine (GalNAc) is the ß-1,3-galactosyltransferase derived from the O-antigen biosynthesis pathway of enterohemorrhagic Escherichia coli O104 (EcWbwC).
In some embodiments, the one or more O-Antigen ligases (e.g., EcWaaL) for use in the methods described herein is endogenous to the prokaryotic host cell. In other embodiments, the one or more O-Antigen ligases (e.g., EcWaaL) for use in the methods described herein is heterologous to the host cell.
Each of the 4-epimerases, glycosyl-1-phosphate transferases, and ß1,3-galactosyltransferase enzymes for use in the methods described herein may be heterologous to the prokaryotic host cell.
In some embodiments according to this aspect of the invention, the prokaryotic host cell does not encode an O-oligosaccharyltransferase.
In some embodiments, the recombinant host cell further expresses: (i) the enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc); and (ii) one or more α2,3-sialyltransferases and/or one or more α2,6-sialyltransferases. In accordance with such embodiments, the lipid-linked Gal-ß1,3-GalNAc is extended by one or more sialic acid (NeuNAc) sugars before the transfer of the lipid-linked Gal-ß1,3-GalNAc en bloc to the lipid target.
Suitable enzymes of an enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) are described in more detail supra and include, e.g., E. coli K1 NeuA, NeuB, NeuC, and NeuD. Thus, in some embodiments, the enzymes of the enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) are encoded by the E. coli K1 neuDBAC genes.
In some embodiments of the methods described herein, the enzymes of the enzymatic pathway capable of producing cytidine-5′-monophosphate-5-N-acetylneuraminic acid (CMP-NeuNAc) are heterologous to the prokaryotic host cell.
Suitable α2,3-sialyltransferases and α2,6-sialyltransferases are described in more detail supra. In some embodiments, the one or more α2,3-sialyltransferases is WbwA from E. coli O104 and the one or more α2,6-sialyltransferases is the α2,6-sialyltransferase from Photobacterium sp. JT-ISH-224.
In some embodiments of the methods described herein, the one or more α2,3-sialyltransferases and/or the one or more α2,6-sialyltransferases are heterologous to the prokaryotic host cell.
The examples below are intended to exemplify the practice of embodiments of the disclosure but are by no means intended to limit the scope thereof.
All strains used in the study are listed in Table 5 below. E. coli strains DH5α and NEB 10-beta were used for cloning and maintenance of plasmids, while BL21(DE3) was used to produce purified acceptor proteins for IVG reactions. Unless otherwise noted, strain CLM25 was used for all O-glycoprotein expression and was constructed by deleting wecA from CLM24 (Feldman et al., “Engineering N-linked Protein Glycosylation with Diverse O Antigen Lipopolysaccharide Structures in Escherichia coli,” Proc. Natl. Acad. Sci. USA 102:3016-3021 (2005), which is hereby incorporated by reference in its entirety) through P1vir phage transduction where strain JW3758-2 (Δrfe-735::kan) from the Keio collection (Baba et al., “Construction of Escherichia coli K-12 in-frame, Single-gene Knockout Mutants: The Keio Collection,” Mol. Syst. Biol. 2:2006.0008 (2006), which is hereby incorporated by reference in its entirety) was used as the donor. MC4100 ΔwecA (MCΔw) and MC4100 ΔwecA ΔwaaL (MCΔΔw) were used as the hosts for flow cytometry screening and glyco-recoding to introduce the CMP-NeuNAc biosynthesis pathway. Strain MCΔw was generated by P1vir phage transduction of strain MC4100 to delete wecA using JW3758-2 (Δrfe-735::kan) as the donor. Subsequent P1vir phage transduction of MCΔw to delete waaL, using JW3597-1 (ΔrfaL734::kan) as the donor, yielded strain MCΔΔw. In all cases, after each deletion the linked kanamycin resistance (KanR) cassette was removed by transformation with the temperature-sensitive plasmid pCP20 as described in detail elsewhere (Datsenko K A & Wanner B L, “One-step Inactivation of Chromosomal Genes in Escherichia coli K-12 Using PCR Products,” Proc. Natl. Acad. Sci. USA 97:6640-6645 (2000), which is hereby incorporated by reference in its entirety). The E. coli K1 neuDBAC genes encoding the CMP-NeuNAc biosynthesis pathway (Valentine et al., “Immunization with Outer Membrane Vesicles Displaying Designer Glycotopes Yields Class-switched, Glycan-specific Antibodies,” Cell. Chem.
Biol. 23:655-665 (2016), which is hereby incorporated by reference in its entirety) were integrated into the chromosome of MCΔΔw using a previously described glyco-recoding strategy (Yates et al., “Glyco-recoded Escherichia coli: Recombineering-based Genome Editing of Native Polysaccharide Biosynthesis Gene Clusters,” Metab. Eng. 53:59-68 (2019), which is hereby incorporated by reference in its entirety). Briefly, the neuDBAC gene cluster was cloned into the pRecO-PS shuttle vector, which is uniquely designed to promote homologous recombination-based insertion of genes-of-interest in place of the existing genomic locus encoding the O-PS biosynthetic pathway between the glf and gnd genes (
TABLE 5. E. coli strains and plasmids used in the study
All cultures were grown at 37° C. in Luria-Bertani (LB) media containing D-glucose (0.2% w/v) as well as 20 μg/ml chloramphenicol (Cm), 100 μg/ml trimethoprim (Tmp), and 100 μg/ml ampicillin (Amp) as needed for plasmid maintenance. Induction of protein expression was always performed at mid-log phase (Abs600˜0.6) with 0.1 mM isopropyl β-D-thiogalactoside (IPTG) and 0.2% (w/v) L-arabinose at 16° C. for 16-20 hours. For yield determination experiments, cells were grown in 100 ml of Terrific Broth (TB) at 37° C. until mid-log phase and then induced with 1 mM IPTG and 0.2% (w/v) L-arabinose at 16° C. for 22 hours. Following expression, cells were harvested and protein purification was performed as described below.
All plasmids used in the study are listed in Table 5. Plasmid construction was performed according to standard cloning protocols using restriction enzymes from New England Biolabs. The pOG backbones were cloned in either the yeast recombineering plasmid pMW07 (Valderrama-Rincon et al., “An Engineered Eukaryotic Protein Glycosylation Pathway in Escherichia coli,” Nat. Chem. Biol. 8:434-436 (2012), which is hereby incorporated by reference in its entirety) or a modified derivative of pMW07, namely pMW08, in which the yeast origin of replication and URA3 gene were deleted. Plasmid pOG-Tn was generated by the Gibson assembly method. Briefly, the genes encoding CjGne and AbPglC were PCR amplified with overlapping regions, and subsequently cloned into pMW08 using the NEBuilder HiFi DNA Assembly Cloning Kit (New England Biolabs) to generate plasmid pOG-Tn. Each of the candidate GalT enzymes was cloned into pOG-Tn by first obtaining codon-optimized DNA corresponding to each GalT gene synthesized with overlapping regions to facilitate recombination (Twist Biosciences). These genes were then amplified by PCR and cloned into pOG-Tn by Gibson assembly. A similar strategy was followed to generate plasmid pOG-T. Briefly, the genes encoding CjGne, AbPglC, and EcWbwC were PCR amplified with overlapping regions, and subsequently cloned into pMW07 using the NEBuilder HiFi DNA Assembly Cloning Kit (New England Biolabs) to generate plasmid pOG-T. Genes encoding NgPglO and NmPglL were added to pOG-Tn and pOG-T as follows. First, codon-optimized DNA encoding the NgPglO and NmPglL genes was synthesized with overlapping regions to facilitate recombination (Twist Biosciences). The synthesized genes were then amplified by PCR to have overlapping ends and recombined with linearized versions of plasmids pOG-Tn and pOG-T using a modified “lazy bones” protocol (Shanks et al., “Saccharomyces Cerevisiae-based Molecular Tool Kit for Manipulation of Genes from Gram-negative Bacteria,” Appl. Environ. Microbiol. 
72:5027-5036 (2006), which is hereby incorporated by reference in its entirety). Briefly, 0.5 ml of an overnight yeast culture was pelleted and washed in sterile TE buffer (10 mM Tris-HCl pH 8.0 and 1 mM EDTA). 0.4 mg of salmon sperm carrier DNA (Sigma), plasmid DNA, and PCR products were added to the pellet along with 0.5 ml lazy bones solution (40% polyethylene glycol MW 3350, 0.1 M lithium acetate, 10 mM Tris-HCl pH 7.5 and 1 mM EDTA). After vortexing for 1 minute, the solution was incubated up to 4 days at room temperature. Cells were heat-shocked at 42° C., pelleted, and plated on selective medium. Plasmids were isolated from individual transformants and confirmed by DNA sequencing.
All acceptor proteins were cloned in plasmid pEXT20 (Dykxhoorn et al., “A Set of Compatible Tac Promoter Expression Vectors,” Gene 177:133-136 (1996), which is hereby incorporated by reference in its entirety). Briefly, the gene encoding E. coli MBP lacking its native 26-residue signal peptide was PCR amplified with primers that introduced the N-terminal signal peptide from E. coli DsbA, which permits periplasmic localization and glycosylation of fused proteins (Fisher et al., “Production of Secretory and Extracellular N-linked Glycoproteins in Escherichia coli,” Appl. Environ. Microbiol. 77:871-881 (2011), which is hereby incorporated by reference in its entirety). The resulting PCR product was cloned into pEXT20 using restriction cloning between the EcoRI and XbaI sites. The MOOR tag was comprised of an 8-residue core sequence (WPAAASAP (SEQ ID NO: 24)) that mimics the S63 glycosite in pilin (PilE), one of the native substrates of NmPglL (Pan et al., “Biosynthesis of Conjugate Vaccines Using an O-Linked Glycosylation System,” MBio. 7:e00443-00416 (2016), which is hereby incorporated by reference in its entirety), as well as two hydrophilic flanking sequences (DPRNVGGDLD (SEQ ID NO: 27) and QPGKPPR (SEQ ID NO: 28)) that are required for glycosylation. This sequence was synthesized as a G block (Integrated DNA Technologies) with a hexa-histidine epitope tag at its C-terminus and cloned between the XbaI and HindIII sites. All other acceptor proteins including GST, scFv 13-R4, CRM 197, PD, YebF-MBP, sfGFP, and sfGFPQ157 were synthesized as G blocks (Integrated DNA Technologies) and cloned in place of MBP by Gibson assembly using the EcoRI and XbaI sites to linearize the backbone. 
All additional acceptor peptides including MOORmut, the 8-residue EPO sequence, the 8-residue GPC sequence, the 9-residue SAP sequence, the 8-residue MUC1 sequence (MUC1_8), MUC1_12, MUC1_16, MUC1_20, MUC1_24, and MUC1_41 were synthesized as G blocks (Integrated DNA Technologies) and cloned in place of the MOOR sequence at the C-terminus of MBP by Gibson assembly using the XbaI and HindIII sites to linearize the backbone. The MUC1 sequence designs included motifs based on the most frequent minimal epitopes of natural MUC1 IgG and IgM antibodies including PPAHGVT (SEQ ID NO: 29), PDTRP (SEQ ID NO: 30), and RPAPGS (SEQ ID NO: 31) (von Mensdorff-Pouilly et al., “Reactivity of Natural and Induced Human Antibodies to MUC1 Mucin with MUC1 Peptides and n-acetylgalactosamine (GalNAc) Peptides,” Int J Cancer 86:702-712 (2000), which is hereby incorporated by reference in its entirety) and in epitopes that bind to specific human MHC class I molecules including STAPPAHGV (SEQ ID NO: 32), SAPDTRPAP (SEQ ID NO: 33), TSAPDTRPA (SEQ ID NO: 34), and APDTRPAPG (SEQ ID NO: 35) (Apostolopoulos et al., “Induction of HLA-A2-restricted CTLs to the Mucin 1 Human Breast Cancer Antigen,” J. Immunol. 159:5211-5218 (1997), which is hereby incorporated by reference in its entirety). The sialyltransferase used to produce the ST antigen was cloned adjacent to spDsbA-MBPMOOR in the pEXT20 acceptor plasmid. For sialylation of T antigen, E. coli O104 WbwA was acquired as a codon-optimized G block (Integrated DNA Technologies) and cloned downstream of spDsbA-MBPMOOR in plasmid pEXT20-spDsbA-MBPMOOR using Gibson assembly, yielding plasmid pEXT-spDsbA-MBPMOOR-EcWbwA. For sialylation of Tn antigen, the gene encoding EcWbwA was replaced with α2,6-sialyltransferase from Photobacterium sp. JT-ISH-224, yielding plasmid pEXT-spDsbA-MBPMOOR-PspST6. The plasmid for expression of the neuDBAC genes was constructed by yeast-based recombineering which involved cloning the E. 
coli K1 neuDBAC genes into plasmid pMLBy, which is a variant of plasmid pMLBAD that contains the yeast origin of replication and URA3 gene. The resulting plasmid was linearized with NheI after which the araC gene and pBAD promoter were replaced with the J23100 constitutive promoter from the Anderson library as described previously (Glasscock et al., “A Flow Cytometric Approach to Engineering Escherichia coli for Improved Eukaryotic Protein Glycosylation,” Metab. Eng. 47:488-495 (2018), which is hereby incorporated by reference in its entirety). The resulting pConNeuDBAC plasmid was used to transform strain ZLKA, a nanA-deficient host used previously for producing CMP-NeuNAc (Fierfort N & Samain E, “Genetic Engineering of Escherichia coli for the Economical Production of Sialylated Oligosaccharides,” J. Biotechnol. 134:261-265 (2008), which is hereby incorporated by reference in its entirety). Cell-free expression plasmids were generated by first PCR-amplifying the genes encoding MBPMOOR and MBPMOORmut from pEXT-spDsbA-MBPMOOR and pEXT-spDsbA-MBPMOORmut, respectively. The resulting PCR products were then ligated between NdeI and SalI restriction sites in plasmid pJL1, a pET-based vector used in cell-free glycoprotein synthesis reaction as described previously (Jaroentomeechai et al., “Single-pot Glycoprotein Biosynthesis Using a Cell-free Transcription-translation System Enriched with Glycosylation Machinery,” Nat Commun 9:2686 (2018), which is hereby incorporated by reference in its entirety).
Finally, a plasmid for expressing chimeric 5E5 antibody was constructed as described previously (Cox E C et al., “Antibody-mediated Endocytosis of Polysialic Acid Enables Intracellular Delivery and Cytotoxicity of a Glycan-directed Antibody-drug Conjugate,” Cancer Res. 79:1810-1821 (2019), which is hereby incorporated by reference in its entirety). First, DNA sequences for the VH and VL domains of mouse mAb 5E5 (Sorensen et al., “Chemoenzymatically Synthesized Multimeric Tn/STn MUC1 Glycopeptides Elicit Cancer-specific Anti-MUC1 Antibody Responses and Override Tolerance,” Glycobiology 16:96-107 (2006), which is hereby incorporated by reference in its entirety) were obtained from U.S. Pat. No. 10,189,908 B2 (which is hereby incorporated by reference in its entirety) and ordered as genes from GeneArt Gene Synthesis (Thermo Fisher). The 5E5 VH and VL sequences were then swapped with the existing variable region sequences in pVITRO1-Trastuzumab-IgG1/κ (Addgene plasmid #61883) to generate the vector pVITRO1-5E5-IgG1/κ according to a previously published method (Dodev T S et al., “A Tool Kit for Rapid Cloning and Expression of Recombinant Antibodies,” Sci. Rep. 4:5885 (2014), which is hereby incorporated by reference in its entirety). All plasmids were confirmed by DNA sequencing.
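The MUC1 acceptor designs described above can be sanity-checked computationally. The following is a minimal illustrative sketch, not part of the cloning workflow itself. It takes the 20-residue MUC1 tandem repeat (SEQ ID NO: 36) and the listed epitopes (SEQ ID NOs: 29-35), reports the serine and threonine positions in the repeat, and indicates whether each epitope lies within a single repeat or spans the junction between two adjacent repeats. The function names are hypothetical and chosen only for illustration.

```python
# Illustrative check only; names below are hypothetical helpers.
VNTR = "PDTRPAPGSTAPPAHGVTSA"  # 20-residue MUC1 tandem repeat (SEQ ID NO: 36)

EPITOPES = [
    "PPAHGVT", "PDTRP", "RPAPGS",                        # SEQ ID NOs: 29-31
    "STAPPAHGV", "SAPDTRPAP", "TSAPDTRPA", "APDTRPAPG",  # SEQ ID NOs: 32-35
]


def potential_o_sites(seq):
    """Return 1-based positions of Ser/Thr residues in a peptide sequence."""
    return [i + 1 for i, aa in enumerate(seq) if aa in "ST"]


def epitope_location(epitope, repeat=VNTR):
    """Classify where an epitope falls within a tandem array of the repeat."""
    if epitope in repeat:
        return "within one repeat"
    if epitope in repeat * 2:  # two head-to-tail copies expose the junction
        return "spans a repeat junction"
    return "not found"


sites = potential_o_sites(VNTR)
locations = {e: epitope_location(e) for e in EPITOPES}

if __name__ == "__main__":
    print("Ser/Thr positions in the repeat:", sites)
    for epitope, where in locations.items():
        print(f"{epitope}: {where}")
```

Under these assumptions the repeat contains five serine or threonine residues, and several of the MHC class I epitopes are found only when two repeats are placed head to tail, which is consistent with extending the MUC1 motif beyond a single short core sequence.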
Glycoprotein expression was carried out in 150-ml cultures for 16-20 hours. Cells were pelleted at 10,000×g for 30 minutes at 4° C., resuspended in 2 ml of lysis buffer containing 50 mM sodium phosphate, 300 mM sodium chloride, and 10 mM imidazole. Samples were frozen at −80° C. overnight. Cells were then thawed, gently agitated at room temperature with 200 μg/ml of lysozyme (Sigma) for 15 minutes, and lysed by sonication. Lysed samples were then centrifuged at 10,000×g for 30 minutes at 4° C., and the supernatant was subjected to Ni2+ affinity purification using Ni-NTA spin columns (Qiagen) according to the manufacturer's protocol. For preparation of extracellular culture supernatants, 10 ml of cells were pelleted by centrifugation at 10,000×g for 30 minutes. 5 ml of the cleared supernatant was then transferred to a fresh tube to which 5 ml of 20% chilled trichloroacetic acid was added. The mixture was vortexed and incubated at 4° C. without agitation for 16-20 hours. The sample was then centrifuged at 21,000×g for 30 minutes at 4° C. The supernatant was discarded and the pellet was resuspended in 1 ml of acetone. The sample was again centrifuged at 21,000×g for 30 minutes at 4° C., allowed to dry at 37° C. for 10 minutes, and resuspended in 60 μl of PBS.
Purified protein samples were prepared in Bolt LDS Sample Buffer (Thermo Fisher) and resolved on Bolt SDS-PAGE gels (Thermo Fisher). Following electrophoresis, proteins were transferred onto Immobilon-P polyvinylidene difluoride (PVDF) membranes (0.45 μm; Thermo Fisher) according to the manufacturer's protocol. Antibodies used included: HRP-conjugated anti-hexa-histidine polyclonal antibody (Abcam cat #ab1187; dilution 1:5,000), mouse anti-human MUC1 antibody (BD Biosciences cat #555925; dilution 1:1,000), biotinylated PNA (Vector labs cat #B-1075; dilution 1:1,000), biotinylated VVA (Vector labs cat #B-1235; dilution 1:500), and chimeric 5E5 antibody (dilution 1:250). The latter antibody was produced in-house using FreeStyle™ 293-F cells (Thermo Fisher) transfected with pVITRO1-5E5-IgG1/κ and purified from cell culture supernatants using Protein A/G agarose (Thermo Fisher) according to the manufacturer's recommendations. Secondary antibodies included: HRP-conjugated rabbit anti-human IgG (Fc) antibody (Thermo Fisher cat #31423; 1:2,500 dilution) and HRP-conjugated goat anti-mouse IgG (H&L) antibody (Abcam cat #ab6789; 1:2,500 dilution). Biotinylated lectins were detected using HRP-conjugated Extravidin (Sigma cat #E2886; dilution 1:2,000). Detection of blots was performed using Bio-Rad enhanced chemiluminescent (ECL) substrate. All immunoblots were visualized using a Chemidoc XRS+ system with Image Lab software (Bio-Rad).
All reagents were purchased from Sigma Aldrich unless otherwise mentioned. Proteins were separated on SDS-PAGE gels after which gel pieces containing the glycoprotein bands were excised, cut into small pieces of about 1 mm², and destained by treatment with 300 μl of a 1:1 mixture of acetonitrile and 50 mM aqueous NH4HCO3 followed by 500 μl of 100% acetonitrile. Since the glycoproteins did not have cysteine residues, reduction and alkylation were not performed. The glycoproteins were directly digested by adding 50 μl of digestion buffer with 12.5 μl of sequencing-grade trypsin (0.4 μg/μl; Promega) to the gel pieces and incubating at 37° C. for 12 hours. The digested peptides were extracted twice with 5% formic acid in 200 μl of 1:2 water:acetonitrile and filtered through a 0.2-μm filter. The digests were then dried using a SpeedVac, and subsequently re-dissolved in solvent A (0.1% formic acid in water) and stored at −30° C. until analysis by nano-LC-MS/MS.
The digests were analyzed on an Orbitrap Fusion Tribrid mass spectrometer (Thermo Fisher) equipped with a nanospray ion source and connected to a Dionex binary solvent system. Pre-packed nano-LC columns of 15-cm length with 75-μm internal diameter (id), filled with 3-μm C18 material (reverse phase) were used for chromatographic separation of samples. The precursor ion scan was acquired at 120,000 resolution in the Orbitrap analyzer and precursors at a time frame of 3 seconds were selected for subsequent MS/MS fragmentation in the Orbitrap analyzer at 15,000 resolution or in ion trap. The threshold for triggering an MS/MS event with either higher-energy collisional dissociation product-triggered electron-transfer dissociation (HCDpdETD) program or electron-transfer dissociation (ETD) was set to 1,000 counts. Charge state screening was enabled, and precursors with unknown charge state or a charge state of +1 were excluded (positive ion mode). Dynamic exclusion was enabled (exclusion duration of 30 secs).
The LC-MS/MS spectra of tryptic digests of glycoproteins were searched against the respective fasta sequence of mucin fragment using Byonic™ software versions 3.2 and 3.5 with the specific cleavage option enabled, and selecting trypsin as the digestion enzyme. Oxidation of methionine, deamidation of asparagine and glutamine, and O-glycan masses of HexNAc (m/z 203.079), HexHexNAc (m/z 365.132), and NeuNAcHexHexNAc (m/z 656.228) were used as variable modifications. The LC-MS/MS spectra were also analyzed manually for the glycopeptides using Xcalibur 4.2 software. The HCDpdETD and ETD MS/MS spectra of glycopeptides were evaluated for the glycan neutral loss pattern, oxonium ions, and the glycopeptide fragmentations to assign the sequence and the presence of glycans in the glycopeptides. The peptide fragments at high resolution from ETD spectra were analyzed for the localization of O-glycosylation sites.
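The three O-glycan masses used as variable modifications can be reproduced from monosaccharide compositions. The sketch below is illustrative only and is not the Byonic configuration itself. It assumes that the listed values are monoisotopic residue mass additions, and it uses standard monoisotopic residue masses for HexNAc, Hex, and NeuNAc. The dictionary and function names are hypothetical.

```python
# Monoisotopic residue masses (Da) of the monosaccharide building blocks.
# Treating the listed modification values as residue mass additions is an
# assumption made for this illustration.
RESIDUE_MASS = {
    "HexNAc": 203.07937,  # C8H13NO5
    "Hex": 162.05282,     # C6H10O5
    "NeuNAc": 291.09542,  # C11H17NO8
}

# Compositions of the three O-glycans named in the search settings.
GLYCANS = {
    "HexNAc": {"HexNAc": 1},                                   # Tn antigen
    "HexHexNAc": {"Hex": 1, "HexNAc": 1},                      # T antigen
    "NeuNAcHexHexNAc": {"NeuNAc": 1, "Hex": 1, "HexNAc": 1},   # sialylated T
}


def glycan_delta(composition):
    """Sum residue masses for a composition given as {residue: count}."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())


deltas = {name: round(glycan_delta(comp), 3) for name, comp in GLYCANS.items()}

if __name__ == "__main__":
    for name, mass in deltas.items():
        print(f"{name}: +{mass:.3f} Da")
```

Computed this way, the mass additions agree to three decimal places with the values given for the variable modifications.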
For detection and quantification of nucleotide sugars, E. coli cells equivalent to an Abs600 of ˜30 were pelleted, resuspended in 1 ml ultrapure water, and lysed by sonication. Following centrifugation at 30,000×g, the supernatant was collected and analyzed within 4 hours. Cleared E. coli lysates were diluted twofold in ultrapure water and injected into a UPLC-ESI-MS system (Waters) for analysis. The autosampler was set at 10° C. Separation was performed on an Acquity BEH C18 Column (1.7 μm, 2.1 mm×50 mm; Waters). The elution started from 95% mobile phase A (5 mM TBA aqueous solution, adjusted to pH 4.75 with acetic acid) and 5% mobile phase B (5 mM TBA in acetonitrile), was raised to 57% B in 2 minutes, further raised to 100% B in 0.5 minutes, held at 100% B for 2 minutes, returned to initial conditions over 0.1 minute, and held for 4 minutes to re-equilibrate the column. The flow rate was set at 0.6 ml/min with an injection volume of 2 μl. The column was preconditioned by pumping the starting mobile phase mixture for 10 minutes, followed by repeating twice the gradient protocol specified above prior to any injections. LC-ESI-MS chromatograms were acquired in negative ion mode under the following conditions: cone voltage of 10 V, dry temperature at 520° C., and an acquisition range of m/z 400-900. Selected ion recordings were specified for CMP-NeuNAc. A standard curve was generated using commercial CMP-NeuNAc (CarboSynth).
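The gradient described above can be written out as a timetable. The following sketch is illustrative only and is not instrument control code. It assumes that each change in mobile phase composition is a linear ramp, and it derives the cumulative time points, the composition at an arbitrary time, and the mobile phase consumed per run at the stated flow rate. All names are hypothetical.

```python
START_PERCENT_B = 5.0   # initial conditions: 95% A / 5% B
FLOW_ML_PER_MIN = 0.6

# (segment duration in minutes, %B at the end of the segment)
SEGMENTS = [
    (2.0, 57.0),   # ramp from 5% to 57% B
    (0.5, 100.0),  # ramp to 100% B
    (2.0, 100.0),  # hold at 100% B
    (0.1, 5.0),    # return to initial conditions
    (4.0, 5.0),    # re-equilibration hold
]


def timetable(start_b=START_PERCENT_B, segments=SEGMENTS):
    """Return [(time_min, percent_B), ...] with cumulative time points."""
    t, table = 0.0, [(0.0, start_b)]
    for duration, percent_b in segments:
        t += duration
        table.append((round(t, 2), percent_b))
    return table


def percent_b_at(time_min, table):
    """Linearly interpolate %B at a given time (assumes linear ramps)."""
    for (t0, b0), (t1, b1) in zip(table, table[1:]):
        if t0 <= time_min <= t1:
            if t1 == t0:
                return b1
            return b0 + (b1 - b0) * (time_min - t0) / (t1 - t0)
    raise ValueError("time outside the programmed gradient")


TABLE = timetable()
RUN_TIME_MIN = TABLE[-1][0]
VOLUME_PER_RUN_ML = RUN_TIME_MIN * FLOW_ML_PER_MIN

if __name__ == "__main__":
    for time_min, percent_b in TABLE:
        print(f"{time_min:4.1f} min  {percent_b:5.1f}% B")
    print(f"run time: {RUN_TIME_MIN} min, volume: {VOLUME_PER_RUN_ML:.2f} ml")
```

With these settings one gradient cycle lasts 8.6 minutes, so each injection consumes roughly 5.2 ml of mobile phase.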
To analyze the activity of candidate GalT enzymes, a flow cytometry-based screen was adapted from a previously described method (Glasscock et al., “A Flow Cytometric Approach to Engineering Escherichia coli for Improved Eukaryotic Protein Glycosylation,” Metab. Eng. 47:488-495 (2018), which is hereby incorporated by reference in its entirety). Briefly, overnight cultures of each strain were grown in LB with relevant antibiotics. Cells were subcultured to an Abs600 of ˜0.1 in 10 ml LB and grown for 16-20 hours at 30° C. The next day, 1 ml of culture was washed twice with 1 ml PBS and resuspended in 500 μl PBS. All samples were diluted to an Abs600 of ˜0.2 in 250 μl PBS. Detection of the disaccharide T antigen was performed with PNA-FITC conjugate (Vector labs cat #FL1071). PNA-FITC was diluted 1:500 in PBS and 250 μl of diluted lectin was added to cells, followed by incubation at 37° C. for 30 minutes. Cells were pelleted at 6,000×g for 4 minutes, washed in 1 ml PBS, resuspended in 1 ml PBS, and analyzed by flow cytometry using a FACSCalibur flow cytometer (BD Biosciences). All experiments were performed in triplicate with the resulting data generated through CellQuest Pro 6.0 and analyzed using FlowJo 10.5 software.
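The effective staining conditions follow from the volumes given above. The sketch below is a simple worked calculation and is not part of the published protocol. It assumes that the 250 μl of diluted lectin is added directly to the 250 μl of diluted cells with no other volume changes. The function name is hypothetical.

```python
def staining_mix(cell_volume_ul, cell_abs600, lectin_volume_ul, lectin_dilution):
    """Return (total volume in ul, final Abs600, final lectin dilution factor).

    lectin_dilution is the dilution factor of the lectin working stock,
    e.g. 500 for a 1:500 dilution.
    """
    total_ul = cell_volume_ul + lectin_volume_ul
    final_abs600 = cell_abs600 * cell_volume_ul / total_ul
    final_dilution = lectin_dilution * total_ul / lectin_volume_ul
    return total_ul, final_abs600, final_dilution


total_ul, final_abs600, final_dilution = staining_mix(250, 0.2, 250, 500)

if __name__ == "__main__":
    print(f"total volume: {total_ul} ul")
    print(f"cells during staining: Abs600 ~{final_abs600:.2f}")
    print(f"PNA-FITC during staining: 1:{final_dilution:.0f}")
```

Under this assumption the cells are stained at an Abs600 of about 0.1 with PNA-FITC at an effective dilution of 1:1,000.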
For IVG reactions, crude membrane extracts enriched with NgPglO and UndPP-linked T antigen were prepared as described previously (Jaroentomeechai et al., “Single-Pot Glycoprotein Biosynthesis Using a Cell-Free Transcription-Translation System Enriched with Glycosylation Machinery,” Nature Commun. 9:2686 (2018), which is hereby incorporated by reference in its entirety). Briefly, CLM25 cells carrying plasmid pOG-T-NgPglO were grown for 16-20 hours at 37° C. in LB media. The following day, cells were subcultured into 4 L LB media and allowed to grow at 37° C. until mid-log phase (Abs600˜0.6). Cells were then induced for 20 hours at 16° C. with 0.2% L-arabinose. Cells were harvested by centrifugation at 10,000×g for 30 minutes at 4° C., and then resuspended in buffer containing 50 mM Tris-HCl (pH 8.0) and 25 mM sodium chloride. Cells were lysed by passing the cell suspension through a high-pressure homogenizer (Avestin) five times and the resulting lysate was centrifuged at 15,000×g for 20 minutes at 4° C. The supernatant was collected and subjected to ultracentrifugation at 100,000×g for 2 hours at 4° C. The resulting pellet corresponding to the membrane fraction was collected and resuspended in 3 ml of buffer containing 50 mM Tris-HCl (pH 7.0), 25 mM sodium chloride, and 0.1% (w/v) n-dodecyl-β-D-maltoside (DDM). The resuspended pellet was incubated with mild agitation at room temperature for 1 hour to enable the solubilization of NgPglO and LLOs. Following incubation, the mixture was centrifuged at 16,000×g for 1 hour at 4° C., and the supernatant was retained as a crude membrane extract. In parallel, acceptor proteins MBPMOOR and MBPMOORmut were purified as described above from a 500-ml culture of BL21(DE3) cells carrying either pEXT-spDsbA-MBPMOOR or pEXT-spDsbA-MBPMOORmut. 
In vitro glycosylation of purified acceptor proteins was carried out in 1.5-ml reactions containing 50 μg of purified acceptor protein and 1 ml of crude membrane extract in reaction buffer containing 10 mM HEPES (pH 7.5), 10 mM manganese chloride, and 1% (w/v) DDM. The reaction was incubated at 30° C. for 16 hours with mild tumbling. Upon completion of the reaction, acceptor proteins were purified from the reaction mixture by standard Ni2+ affinity purification using Ni-NTA spin columns (Qiagen) followed by concentration of samples.
For single-pot CFGpS, crude S12 extracts enriched with NgPglO and UndPP-linked T antigen glycans were prepared as described previously (Jaroentomeechai et al., “Single-pot Glycoprotein Biosynthesis Using a Cell-free Transcription-translation System Enriched with Glycosylation Machinery,” Nat. Commun. 9:2686 (2018), which is hereby incorporated by reference in its entirety). Briefly, CLM25 cells carrying plasmid pOG-T-NgPglO were grown at 37° C. in 2×YTPG (10 g/L yeast extract, 16 g/L tryptone, 5 g/L NaCl, 7 g/L K2HPO4, 3 g/L KH2PO4, 18 g/L glucose, pH 7.2) until the Abs600 reached ˜1. The culture was then induced with 0.02% (w/v) L-arabinose and the protein expression was allowed to proceed at 30° C. until the Abs600 reached ˜3. All subsequent steps were carried out at 4° C. unless otherwise stated. Cells were harvested and washed twice using S12 buffer (10 mM tris acetate, 14 mM magnesium acetate, 60 mM potassium acetate, pH 8.2). The pellet was then resuspended in 1 ml per 1 g cells of S12 buffer. The resulting suspension was passed once through a EmulsiFlex-B15 high-pressure homogenizer (Avestin) at 20,000-25,000 psi to lyse cells. The extract was then centrifuged twice at 12,000×g for 30 minutes to remove cell debris and the supernatant was collected and incubated at 37° C. for 60 minutes. Following centrifugation at 15,000×g for 15 minutes at 4° C., the supernatant was collected, flash-frozen in liquid nitrogen, and stored at −80° C. CFGpS reactions were carried out at 1-ml reaction volumes in a 15-ml conical tube using a modified PANOx-SP system (Jewett M C & Swartz J R, “Mimicking the Escherichia coli Cytoplasmic Environment Activates Long-lived and Efficient Cell-free Protein Synthesis,” Biotechnol Bioeng 86:19-26 (2004), which is hereby incorporated by reference in its entirety). The reaction mixture contained the following components: 0.85 mM each of GTP, UTP, and CTP, 1.2 mM ATP, 34.0 μg/ml folinic acid, 170.0 μg/ml of E. 
coli tRNA mixture, 130 mM potassium glutamate, 10 mM ammonium glutamate, 12 mM magnesium glutamate, 2 mM each of 20 amino acids, 0.4 mM nicotinamide adenine dinucleotide (NAD), 0.27 mM coenzyme-A (CoA), 1.5 mM spermidine, 1 mM putrescine, 4 mM sodium oxalate, 33 mM phosphoenolpyruvate (PEP), 57 mM HEPES, 6.67 μg/ml plasmid, and 27% (v/v) of cell lysate. Protein synthesis was carried out for 30 minutes at 30° C., after which protein glycosylation was initiated by the addition of sucrose and tetracycline at the final concentration of 100 mM and 10 μg/ml, respectively, and carried out at 30° C. for 16 hours. To recover protein products, reaction mixtures were passed through a Ni-NTA spin column (Qiagen) twice, washed, and eluted with 300 mM imidazole. Samples were concentrated and analyzed by SDS-PAGE followed by immunoblotting analysis.
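The composition of the CFGpS reaction can be converted into per-reaction amounts. The following sketch is a worked calculation for planning purposes only. It covers a subset of the listed components, uses the final concentrations stated above, and scales them to the 1-ml reaction volume; stock concentrations are not specified in the text and are therefore not modeled. All names are hypothetical.

```python
# Final concentrations taken from the reaction mixture described above.
MILLIMOLAR = {
    "ATP": 1.2, "GTP": 0.85, "UTP": 0.85, "CTP": 0.85,
    "potassium glutamate": 130.0, "ammonium glutamate": 10.0,
    "magnesium glutamate": 12.0, "NAD": 0.4, "CoA": 0.27,
    "spermidine": 1.5, "putrescine": 1.0, "sodium oxalate": 4.0,
    "PEP": 33.0, "HEPES": 57.0,
}
MICROGRAMS_PER_ML = {
    "folinic acid": 34.0, "E. coli tRNA mixture": 170.0, "plasmid": 6.67,
}
LYSATE_FRACTION = 0.27  # 27% (v/v) cell lysate


def reaction_amounts(volume_ml):
    """Return per-reaction micromoles, micrograms, and lysate volume (ul)."""
    # mM x ml = micromoles; ug/ml x ml = micrograms.
    micromoles = {name: c * volume_ml for name, c in MILLIMOLAR.items()}
    micrograms = {name: c * volume_ml for name, c in MICROGRAMS_PER_ML.items()}
    lysate_ul = LYSATE_FRACTION * volume_ml * 1000.0
    return micromoles, micrograms, lysate_ul


micromoles, micrograms, lysate_ul = reaction_amounts(1.0)

if __name__ == "__main__":
    print(f"cell lysate: {lysate_ul:.0f} ul")
    for name, amount in micrograms.items():
        print(f"{name}: {amount:.2f} ug")
    for name, amount in micromoles.items():
        print(f"{name}: {amount:.2f} umol")
```

For the 1-ml reactions described here, this corresponds to 270 μl of cell lysate and 6.67 μg of plasmid per reaction; passing a different volume to the function scales every component proportionally.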
Enabling orthogonal O-glycosylation in E. coli required assembling an en bloc pathway for producing the simplest mucin-type O-glycoform, GalNAcα (Tn antigen) (
To transfer Und-PP-linked Tn antigen to hydroxylated amino acids in target proteins, the focus was on the bacterial O-OST NmPglL and its ortholog NgPglO (95% identity). It was hypothesized that these enzymes would recognize preassembled O-glycans on Und-PP and transfer them en bloc to Sec-translocated protein substrates in the periplasm (
The glycosylated MBPMOOR was further examined by nanoscale liquid chromatography coupled to tandem mass spectrometry (nano-LC-MS/MS) to identify the modification sites.
Glycosylation with only HexNAc was identified as the predominant species while a much smaller amount of aglycosylated peptide was also detected (
Next, biosynthesis of the T antigen (Gal-β1,3-GalNAcα), another mucin-type O-glycan that is absent in most normal tissues but present in many human cancers (Tarp M A & Clausen H, “Mucin-type O-glycosylation and its Potential Use in Drug and Vaccine Development,” Biochim. Biophys. Acta. 1780:546-563 (2008), which is hereby incorporated by reference in its entirety), was attempted. The challenge here was the fact that Und-PP-GalNAc represents an atypical substrate for eukaryotic Gal transferases (GalT) that prefer GalNAcα-O-S/T. Therefore, a panel of GalT enzymes was evaluated. The panel included: core 1 synthase glycoprotein-N-acetylgalactosamine 3-β-galactosyltransferase from Homo sapiens (HsC1GalT1) and Drosophila melanogaster (DmC1GalT1); Bifidobacterium infantis D-galactosyl-β1-3-N-acetyl-D-hexosamine phosphorylase (BiGalHexNAcP); the “S42” mutant of C. jejuni β1-3-galactosyltransferase (CjCgtB) engineered with improved catalytic activity (Yang et al., “Fluorescence Activated Cell Sorting as a General Ultra-high-throughput Screening Method for Directed Evolution of Glycosyltransferases,” J. Am. Chem. Soc. 132:10570-10577 (2010), which is hereby incorporated by reference in its entirety); and β-1,3-galactosyltransferases from enteropathogenic E. coli O86 (EcWbnJ) and enterohemorrhagic E. coli O104 (EcWbwC).
To screen GalT activity, a high-throughput flow cytometric assay was adapted (Valderrama-Rincon, “An Engineered Eukaryotic Protein Glycosylation Pathway in Escherichia coli.,” Nat. Chem. Biol. 8:434-436 (2012) and Glasscock et al., “A Flow Cytometric Approach to Engineering Escherichia coli for Improved Eukaryotic Protein Glycosylation,” Metab. Eng. 47:488-495 (2018), which are hereby incorporated by reference in their entirety). In this assay, Und-PP-linked glycans are flipped into the periplasm by the native E. coli flippase, Wzx, and transferred onto lipid A-core by the O-antigen ligase, WaaL (
To transfer T antigen to proteins, O-OST genes were added to the T antigen pathway, yielding plasmids pOG-T-NmPglL and pOG-T-NgPglO. CLM25 cells co-transformed with one of these plasmids along with the plasmid encoding MBPMOOR produced acceptor proteins that were glycosylated with T antigen as revealed by immunoblots probed with PNA (
To produce O-glycans bearing sialic acid (NeuNAc), including the STn (NeuNAc-α2,6-GalNAcα) and ST antigens (NeuNAc-α2,3-Gal-β1,3-GalNAcα) (
It was speculated that this low efficiency might be overcome by chromosomal integration of the multi-gene CMP-NeuNAc pathway, a strategy that previously increased glycosylation efficiency of an orthogonal N-linked pathway (Yates et al., “Glyco-recoded Escherichia coli: Recombineering-based Genome Editing of Native Polysaccharide Biosynthesis Gene Clusters,” Metab. Eng. 53:59-68 (2019), which is hereby incorporated by reference in its entirety). To test this notion, a glyco-recoding strategy (Yates et al., “Glyco-recoded Escherichia coli: Recombineering-based Genome Editing of Native Polysaccharide Biosynthesis Gene Clusters,” Metab. Eng. 53:59-68 (2019), which is hereby incorporated by reference in its entirety) was used to integrate the CMP-NeuNAc pathway in place of the non-essential O-polysaccharide (O-PS) antigen biosynthesis pathway in the genome (
A nearly identical strategy for producing STn antigen was carried out using the same glyco-recoded host strain carrying plasmid pOG-Tn-NgPglO in place of pOG-T-NgPglO and the pEXT-based acceptor protein plasmid with α2,6-sialyltransferase from Photobacterium sp. JT-ISH-224 in place of EcWbwA. These cells generated MBPMOOR bearing STn antigen albeit with relatively low sialylation (
On average, ˜30 mg/L of glycosylated MBPMOOR with each of the different O-glycan structures was produced from small-scale cultures (
Cell-free modalities are emerging as useful glycoscience tools and for on-demand biomanufacturing of glycoprotein products (Jaroentomeechai et al., “Single-Pot Glycoprotein Biosynthesis Using a Cell-Free Transcription-Translation System Enriched with Glycosylation Machinery,” Nat. Commun. 9:2686 (2018) and Kightlinger et al., “A Cell-Free Biosynthesis Platform for Modular Construction of Protein Glycosylation Pathways,” Nat. Commun. 10:5404 (2019), which are hereby incorporated by reference in their entirety). However, there are currently no cell-free platforms for total biosynthesis of O-glycoproteins. To address this gap, an in vitro glycosylation strategy that combined purified acceptor proteins with partially purified glycosylation machinery was first evaluated. Crude membrane extracts selectively enriched with NgPglO and UndPP-linked T antigen were prepared from CLM25 cells carrying plasmid pOG-T-NgPglO. Upon addition of purified acceptor protein to these “glyco-enriched” extracts, clear glycosylation was observed (
To determine the range of glycosylatable acceptor proteins, the MOOR tag was grafted onto the C-terminus of several proteins including: E. coli glutathione-S-transferase (GST); a single-chain Fv antibody fragment specific for β-galactosidase (scFv13-R4); and two conjugate vaccine carrier proteins, namely cross-reacting material 197 (CRM197) and Haemophilus influenzae protein D (PD). A chimera composed of E. coli secretory protein YebF fused to MBPMOOR as well as two variants of superfolder GFP (sfGFP), one with a C-terminal MOOR tag and the other with the MOOR motif grafted in an internal loop starting at Gln157, were also created. It should be noted that scFv13-R4, sfGFP, and YebF have all been N-glycosylated in E. coli previously (Valderrama-Rincon et al., “An Engineered Eukaryotic Protein Glycosylation Pathway in Escherichia coli,” Nat Chem. Biol. 8:434-436 (2012); Jaroentomeechai et al., “Single-pot Glycoprotein Biosynthesis Using a Cell-free Transcription-translation System Enriched with Glycosylation Machinery,” Nat. Commun. 9:2686 (2018); Ollis et al., “Engineered Oligosaccharyltransferases with Greatly Relaxed Acceptor-site Specificity,” Nat. Chem. Biol. 10:816-822 (2014); and Fisher et al., “Production of Secretory and Extracellular N-linked Glycoproteins in Escherichia coli,” Appl. Environ. Microbiol. 77:871-881 (2011), which are hereby incorporated by reference in their entirety) while CRM197 and PD represent carrier proteins used in licensed conjugate vaccines. When expressed in the presence of the T antigen pathway, each protein cross-reacted with PNA (
System modularity was further evaluated by swapping the 8-residue core sequence of the MOOR tag with different human or synthetic O-glycosylation motifs. These included: 8 residues surrounding the S126 O-glycosite in human erythropoietin (EPO) (Lai et al., “Structural Characterization of Human Erythropoietin,” J. Biol. Chem. 261:3116-3121 (1986), which is hereby incorporated by reference in its entirety); 8 residues surrounding the S24 O-glycosite in human glycophorin C (GPC), a surface glycoprotein found on red blood cells that marks the Gerbich antigen system (Maier et al., “Plasmodium Falciparum Erythrocyte Invasion Through Glycophorin C and Selection for Gerbich Negativity in Human Populations,” Nat. Med. 9:87-92 (2003), which is hereby incorporated by reference in its entirety); 8 residues derived from the ectodomain of human mucin 1 (MUC1), which is expressed on the apical surface of glandular epithelial cells at low levels but following oncogenic transformation is expressed at very high levels and with altered glycosylation (Tarp M A & Clausen H, “Mucin-type O-glycosylation and its Potential Use in Drug and Vaccine Development,” Biochim. Biophys. Acta 1780:546-563 (2008), which is hereby incorporated by reference in its entirety); and a synthetic “SAP” motif that was designed de novo based on known glycosite preferences of NmPglL (Pan et al., “Biosynthesis of Conjugate Vaccines Using an O-Linked Glycosylation System,” MBio. 7:e00443-00416 (2016), which is hereby incorporated by reference in its entirety). When each construct was expressed in the presence of NgPglO, strong glycosylation with T antigen was observed (
To generate additional MUC1 glycoforms with relevance to human cancer, the next focus of these studies was the variable number of tandem repeats (VNTR) region of MUC1, which consists of 20-120 repeats of a 20-amino acid sequence (PDTRPAPGSTAPPAHGVTSA (SEQ ID NO: 36)), each containing five potential O-glycosylation sites (the serine and threonine residues) (Gendler et al., “A Highly Immunogenic Region of a Human Polymorphic Epithelial Mucin Expressed by Carcinomas is Made Up of Tandem Repeats,” J. Biol. Chem. 263:12820-12823 (1988), which is hereby incorporated by reference in its entirety). Here, four VNTR-derived sequences were created by incrementally extending the MUC1_8 motif. Each of these was cloned between the hydrophilic flanking regions of the MOOR motif and subsequently expressed in CLM25 cells carrying either pOG-T-NgPglO or pOG-T-NmPglL. The T antigen-producing host strain was chosen because tumor-associated MUC1 is aberrantly glycosylated with truncated O-glycans including T antigen (Tarp M A & Clausen H, “Mucin-type O-glycosylation and its Potential Use in Drug and Vaccine Development,” Biochim. Biophys. Acta 1780:546-563 (2008), which is hereby incorporated by reference in its entirety). Following expression in bacteria carrying the T antigen pathway, each MUC1 motif was strongly glycosylated by NgPglO (
To generate more antigenically authentic glycoforms, a 41-residue MUC1 sequence containing the 20-residue VNTR flanked with additional stretches of the MUC1 repeat but without the original MOOR flanking residues was investigated. Importantly, both NgPglO and NmPglL were able to transfer T antigen to this construct (
As was seen for the other APDTRP-containing MUC1 sequences, T-modified MUC1_41 cross-reacted with H23 (
In this work, orthogonal O-glycoprotein biosynthesis in E. coli was engineered by rewiring the cell's metabolism to provide necessary sugar donors and ectopically expressing specific GTs and OSTs from diverse organisms. The system was highly modular as evidenced by the ability to generate multiple O-glycan structures and post-translationally modify a panel of acceptor protein targets. Unlike previous mucin-type O-glycoengineering in E. coli that focused on processive glycosylation mechanisms (Henderson et al., “Site-specific Modification of Recombinant Proteins: A Novel Platform for Modifying Glycoproteins Expressed in E. coli,” Bioconjug. Chem. 22:903-912 (2011); Mueller et al., “High Level In Vivo Mucin-Type Glycosylation in Escherichia coli,” Microb. Cell Fact. 17:168 (2018); and Du et al., “A Bacterial Expression Platform for Production of Therapeutic Proteins Containing Human-like O-Linked Glycans,” Cell Chem. Biol. 26:203-212 e205 (2019), which are hereby incorporated by reference in their entirety), an unconventional approach based on the en bloc O-glycosylation mechanism found natively in some bacteria was investigated. Although modeled after this process, the collection of synthetic O-glycosylation pathways described herein has no direct biological equivalent and includes the first biosynthetic routes to sialylated mucin-type O-glycosylation in E. coli.
One advantage of the strategy described herein is the opportunity to leverage diverse enzymes from all domains of life that naturally operate on lipids as well as proteins. A number of bacteria employ glycomimicry strategies in which endogenous GTs construct human-like oligosaccharides that serve to cloak cell-surface components as a means to evade host immune responses. By enlisting these bacterial GTs, one could further expand the repertoire of O-glycans that can be assembled in E. coli. Moreover, because many human GTs are difficult to functionally express in bacteria, often requiring specialized chaperones or solubility-enhancing fusion partners (Ju T & Cummings R D, “A Unique Molecular Chaperone Cosmc Required for Activity of the Mammalian Core 1 Beta 3-galactosyltransferase,” Proc. Natl. Acad. Sci. USA 99:16613-16618 (2002) and Skretas et al., “Expression of Active Human Sialyltransferase ST6GalNAcI in Escherichia coli,” Microb. Cell Fact. 8:50 (2009), which are hereby incorporated by reference in their entirety), GTs of microbial origin represent a potential workaround for construction of human-like O-glycans, as demonstrated here.
Another advantage of the strategy described herein is the utilization of bacterial O-OSTs that have an inbuilt ability to transfer glycans onto both serine and threonine residues, whereas human GalNAcT2 used previously is limited to threonine. These enzymes exhibit extreme glycan substrate permissiveness as exemplified by NmPglL (Faridmoayer et al., “Extreme Substrate Promiscuity of the Neisseria Oligosaccharyl Transferase Involved in Protein O-glycosylation,” J. Biol. Chem. 283:34596-34604 (2008) and Pan et al., “Biosynthesis of Conjugate Vaccines Using an O-Linked Glycosylation System,” MBio. 7:e00443-00416 (2016), which are hereby incorporated by reference in their entirety). Here, this promiscuity was leveraged to show that NmPglL and its NgPglO ortholog can transfer human-like O-glycan structures. The compatibility of acceptor sequences with these enzymes is much less understood. While it has been shown that individual O-OSTs can modify multiple protein substrates (Schulz et al., “Identification of Bacterial Protein O-oligosaccharyltransferases and Their Glycoprotein Substrates,” PLoS One 8:e62768 (2013), which is hereby incorporated by reference in its entirety), there is no clear sequon for glycosylation and the O-glycan attachment sites are in flexible, low-complexity regions, thereby hindering glycoprotein engineering efforts. A breakthrough in this regard was the identification of the MOOR motif that together with two additional hydrophilic flanking sequences could be recognized by NmPglL (Pan et al., “Biosynthesis of Conjugate Vaccines Using an O-Linked Glycosylation System,” MBio. 7:e00443-00416 (2016), which is hereby incorporated by reference in its entirety) and, as is shown in the examples presented herein, NgPglO. Using these hydrophilic flanking sequences, the list of glycosylatable sequences was expanded to include several human and synthetic O-glycosites.
The observation that NmPglL and NgPglO could glycosylate varying-length human MUC1 sequences suggested a much greater flexibility than was first reported for these enzymes (Pan et al., “Biosynthesis of Conjugate Vaccines Using an O-Linked Glycosylation System,” MBio. 7:e00443-00416 (2016), which is hereby incorporated by reference in its entirety).
Most surprising was the site-directed O-glycosylation of MUC1_41 that lacked the flanking sequences, addressing earlier skepticism about the ability of bacterial O-OSTs to discern mammalian O-glycosites (Du et al., “A Bacterial Expression Platform for Production of Therapeutic Proteins Containing Human-like O-Linked Glycans,” Cell Chem. Biol. 26:203-212 e205 (2019), which is hereby incorporated by reference in its entirety). The O-glycosylated MUC1_41 produced herein was structurally similar to glycopeptides that are reactive towards IgG/IgM antibodies (von Mensdorff-Pouilly et al., “Reactivity of Natural and Induced Human Antibodies to MUC1 Mucin with MUC1 Peptides and n-acetylgalactosamine (GalNAc) Peptides,” Int. J. Cancer 86:702-712 (2000), which is hereby incorporated by reference in its entirety) and human MHC class I molecules (Apostolopoulos et al., “A Glycopeptide in Complex with MHC Class 1 Uses the GalNAc Residue as an Anchor,” Proc. Natl. Acad. Sci. USA 100:15029-15034 (2003), which is hereby incorporated by reference in its entirety). Indeed, recognition of Tn-modified MUC1_41 by a glycoform-specific antibody indicated the creation of an antigenically authentic glycoform. Moreover, the relatively low glycan occupancy on MUC1_41 (˜1 or 2 O-glycans per repeat) may bode well for immunotherapeutic discovery given that a synthetic 60-residue MUC1 tandem-repeat peptide, which was extensively glycosylated (5 O-glycans per repeat), elicited only modest antibody responses (Sorensen et al., “Chemoenzymatically Synthesized Multimeric Tn/STn MUC1 Glycopeptides Elicit Cancer-specific Anti-MUC1 Antibody Responses and Override Tolerance,” Glycobiology 16:96-107 (2006), which is hereby incorporated by reference in its entirety). This weak humoral response results from an inability of antigen-presenting cells to process densely glycosylated MUC1 glycopeptides (Ninkovic T & Hanisch F G, “O-glycosylated Human MUC1 Repeats are Processed In Vitro by Immunoproteasomes,” J. Immunol. 179:2380-2388 (2007), which is hereby incorporated by reference in its entirety). In contrast, a glycopeptide modified with just a single O-glycan elicited more robust antibody titers and also activated cytotoxic T lymphocytes, which amounted to superior tumor prevention (Lakshminarayanan et al., “Immune Recognition of Tumor-associated Mucin MUC1 is Achieved by a Fully Synthetic Aberrantly Glycosylated MUC1 Tripartite Vaccine,” Proc. Natl. Acad. Sci. USA 109:261-266 (2012), which is hereby incorporated by reference in its entirety).
Looking forward, it is anticipated that the platform described herein could find use in the scalable biosynthesis of O-glycoprotein therapeutics and vaccines. Gaining access to greater O-glycoprotein structural space may require additional O-OSTs such as those from Bacteroidetes that modify proteins at a minimal 3-residue motif, D-(S/T)-(A/L/V/I/M/T) (Coyne et al., “Phylum-wide General Protein O-glycosylation System of the Bacteroidetes,” Mol. Microbiol. 88:772-783 (2013), which is hereby incorporated by reference in its entirety). Directed evolution of GTs to tailor substrate specificity and metabolic engineering to drive pathway performance towards higher conversion could be enabled through a high-throughput screen for O-glycosylation akin to ‘glycoSNAP’, a bacterial colony blot assay for N-linked glycosylation that was used previously to evolve bacterial N-OST variants with greatly relaxed sequon specificity (Ollis et al., “Engineered Oligosaccharyltransferases with Greatly Relaxed Acceptor-site Specificity,” Nat. Chem. Biol. 10:816-822 (2014), which is hereby incorporated by reference in its entirety). A first important step in this direction was the demonstration that O-glycoproteins can be secreted out of the cell by genetic fusion to the C-terminus of the secretory protein YebF, a feat that is not possible with cytoplasmic O-glycosylation systems. Beyond O-glycoprotein production, the ability of the glycoengineered strains to produce custom glyco-ligands such as O-glycosylated GST and sfGFP could facilitate pulldown assays and cell labeling experiments, respectively, with the potential to uncover and characterize binding partners of structurally defined O-glycoforms. Altogether, the results presented herein define a versatile platform for site-directed O-glycosylation of proteins with different mucin-type O-glycans, thereby expanding the bacterial glycoengineering toolkit.
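By way of illustration only, the minimal Bacteroidetes motif D-(S/T)-(A/L/V/I/M/T) mentioned above can be expressed as a simple pattern search over an acceptor protein sequence. The following Python sketch is not part of the described methods; the function names and the short test sequence are hypothetical, and a pattern match indicates only a candidate site, not that glycosylation would occur at that residue. The 20-residue MUC1 repeat is the sequence given above (SEQ ID NO: 36).

```python
import re

# 20-residue MUC1 tandem repeat quoted in the text (SEQ ID NO: 36).
MUC1_VNTR = "PDTRPAPGSTAPPAHGVTSA"

# Minimal Bacteroidetes motif from the text: D-(S/T)-(A/L/V/I/M/T).
# The lookahead reports every occurrence, including overlapping ones.
BACTEROIDETES_MOTIF = re.compile(r"(?=(D[ST][ALVIMT]))")


def ser_thr_positions(seq):
    """Return the 1-based positions of every Ser/Thr residue in seq."""
    return [i + 1 for i, aa in enumerate(seq.upper()) if aa in "ST"]


def find_motif_sites(seq):
    """Return (1-based position of the S/T residue, matched tripeptide) pairs."""
    seq = seq.upper()
    # m.start() is the 0-based index of D; the S/T residue follows it.
    return [(m.start() + 2, m.group(1)) for m in BACTEROIDETES_MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical acceptor sequence, invented purely for this illustration.
    example = "MKDSAPQDTLGG"
    print(find_motif_sites(example))
    print(ser_thr_positions(MUC1_VNTR))
    print(find_motif_sites(MUC1_VNTR))
```

Under this pattern, the MUC1 repeat contains five Ser/Thr residues but no match to the 3-residue motif, because its only Asp is followed by Thr and then Arg. A scan of this kind could serve as a first filter when considering which acceptor sequences might be candidates for a given O-OST, subject to experimental confirmation.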
Although preferred embodiments have been depicted and described in detail herein, it will be apparent to those skilled in the relevant art that various modifications, additions, substitutions, and the like can be made without departing from the spirit of the invention and these are therefore considered to be within the scope of the invention as defined in the claims which follow.
This application claims the priority benefit of U.S. Provisional Patent Application Ser. No. 63/014,486, filed Apr. 23, 2020, which is hereby incorporated by reference in its entirety.
This invention was made with government support under 1163167 awarded by the Defense Threat Reduction Agency; 1605242 awarded by the National Science Foundation Division of Chemical, Bioengineering, Environmental and Transport Systems; 1R01GM127578-01 awarded by the National Institutes of Health; and 1U54CA210184-01 awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2021/028901 | 4/23/2021 | WO | |

Number | Date | Country |
---|---|---|
63014486 | Apr 2020 | US |