A Sequence Listing in XML format is incorporated by reference into the specification. The name of the XML file containing the Sequence Listing is B21-125-2US.xml. The XML file is 62,097 bytes and was created on Jun. 17, 2024.
Ribosomally synthesized and post-translationally modified peptides (RiPPs) are a unique class of natural products (https://doi.org/10.1039/D0NP00027B). Their synthesis begins with a ribosomally translated precursor peptide that is elaborated further by enzymes that convert one or more traditional peptide linkages into those that contain heterocycles, D-amino acids, intrachain crosslinks, and backbone and side chain modifications (
Goto and Suga developed in vitro translation systems for the rapid prototyping of heterocyclase substrates (DOI: 10.1016/j.chembiol.2014.04.008; https://doi.org/10.1246/cl.160562). These systems were applied to study the influence of flanking sequences on the activity of the heterocyclase enzyme PatD. This work demonstrated that PatD activity is insensitive to polypeptide sequences flanking the core peptide. However, PatD was found to be sensitive to deletions and point mutations within the leader peptide.
Bowers, Goto, Hicks, and Suga combined in vitro translation, chimeric leader peptides, and unnatural amino acids (UAAs) to create a semi-synthetic approach to synthesizing thiopeptides such as thiocillin and lactazole (https://doi.org/10.1021/jacs.8b11521). This platform was used to investigate the influence of core peptide point mutations as well as deletions and insertions within the leader peptide on the activity of the heterocyclase enzyme LynD. Generally, core peptide point mutations were tolerated, but leader peptide modifications impacted yield. All changes introduced into the core sequence were natural α-amino acids.
Recently, Goto and Suga utilized in vitro translation and flexizyme-mediated tRNA charging of UAAs to further explore the substrate scope of PatD (https://doi.org/10.1002/cbic.201900521; https://doi.org/10.1002/anie.201910894). These works demonstrated that PatD, which natively accepts sulfhydryl and hydroxyl nucleophiles, also accepts amino nucleophiles. PatD also tolerated methylene insertion within the side chain, which gave rise to 6-membered heterocyclic products. Further, modifications to the side chain substituent of cysteine and threonine gave rise to 5-membered heterocycles with non-natural substituents.
Goto, Onaka, and Suga extended their work on PatD substrate tolerance by utilizing in vitro translation to recapitulate the biosynthesis of lactazole A (doi: https://doi.org/10.1101/807206). This work demonstrated that the heterocyclase LazD tolerates expansion and contraction of the macrocyclic loop of lactazole A. Further, UAAs and β-amino acids were tolerated within the macrocyclic loop. However, these unnatural residues were located at positions distant from the residues targeted for heterocyclization by LazD.
Other researchers have studied the substrate tolerance of native heterocyclases. Early work from Jaspars, Naismith, and Smith explored the substrate tolerance of PatD and TruD [68]. PatD and TruD were found to cyclize selenocysteine residues to yield selenazolines. In a later work, this in vitro synthesis scheme was expanded to include the heterocyclases LynD, MicD, and TenD [69]. A mutational study established that heterocyclases tolerate a variety of α-amino acid substitutions within the core peptide sequence.
Jaspars and Naismith developed a set of constitutively active heterocyclases in which the leader sequence is fused directly to the enzyme, such as LynD and MicD. These constructions are referred to as “fusions” (such as MicD fusion) (DOI: 10.1038/nchembio.1841). In line with previous studies of TruD (DOI: 10.1002/anie.201306302) and PatD (DOI: 10.1016/j.chembiol.2014.04.008), this work found that native LynD retains limited activity towards leaderless core peptides. Fusion of the leader peptide to the N-terminus of LynD led to the creation of a constitutively activated variant, LynD_fusion. LynD_fusion processed leaderless core peptides, but point mutations impacted the efficiency of heterocyclization. Recently, this concept was used to constitutively activate the heterocyclase MicD and to study the processivity of heterocyclases (https://doi.org/10.1021/acs.biochem.9b00084).
Migaud, Naismith and Westwood have since explored the substrate tolerance of constitutively activated heterocyclases (https://doi.org/10.1002/open.201600134; https://doi.org/10.1002/cbic.201500494). These efforts demonstrated that LynD_fusion tolerates the insertion of a 4-unit polyethylene glycol spacer or azidoalanine within the core peptide. However, much like work from Goto, Onaka, and Suga, these modifications were made at positions distant from the residue heterocyclized by LynD_fusion.
Gaps in Knowledge with Respect to Heterocyclase Substrate Scope
Existing approaches to the synthesis of novel compositions of matter using RiPP heterocyclases have notable limitations. Studies have demonstrated that heterocyclases process core peptides containing multiple proteinogenic α-amino acids, but studies of substrates containing non-natural monomers have been severely limited. No study has reported heterocyclase activity for a substrate containing a non-α-amino acid residue at position +1 or −1, or incorporated a non-natural backbone monomer at the +1 site. Reliance on the paradigm of leader peptide-directed modification has provided valuable insight into the biochemistry of native heterocyclases, but makes it difficult to apply the chemistry to valuable substrates such as antibodies, cytokines, replacement enzymes, or other therapeutic proteins. No study has modified the activity, processivity, or selectivity of native heterocyclases. These limitations restrict the structural diversity of compounds accessible via heterocyclase biosynthesis.
We demonstrate that constitutively active dehydratase enzymes involved in RiPP biosynthesis can accept substrates containing multiple, diverse, non-α-amino acid monomers at positions flanking the reaction site. We demonstrate that this reactivity can be used to engineer therapeutic proteins that express at higher levels, resist degradation, alter immune reactions, and encode functions that add additional value, such as but not limited to targeting the protein to distinct cells or tissues.
The invention expands the chemistry of the heterocyclase MicD in multiple unique ways. We reveal that multiple, structurally diverse aromatic rings are tolerated at the +1 position that precedes the site of cyclization. We reveal that multiple, structurally diverse β-amino acids are also tolerated at the +1 site. We reveal that aramid monomers are tolerated at the −1 site. Finally, we reveal that MicD can introduce an azoline heterocycle into one or more loops of GFP to generate proteins with altered and improved properties.
In an aspect the invention provides a method of polymer engineering: deploying a constitutively active dehydratase enzyme of RiPP biosynthesis to accept a substrate containing a non-α-amino acid monomer at a position flanking the reaction site.
In embodiments:
The invention encompasses all combinations of the particular embodiments recited herein, as if each combination had been laboriously recited.
Unless contraindicated or noted otherwise, in these descriptions and throughout this specification, the terms “a” and “an” mean one or more, the term “or” means and/or. It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein, including citations therein, are hereby incorporated by reference in their entirety for all purposes.
Expanded substrate scope at +1 position.
Tolerance to diverse aromatic +1 site residues. We explored the extent to which MicD fusion would accept non-α-amino acid substrates at the +1 position (
Tolerance to β-amino acids at +1 site. We explored the extent to which MicD fusion would accept β3-amino acid substrates at the +1 position (
Tolerance to diverse −1 site residues. We next sought to explore the extent to which MicD fusion would accept non-α-amino acid substrates at the −1 position (
The invention provides numerous practical applications. Currently, if one sought a protein therapeutic fused to a RiPP natural product, that material would need to be synthesized in multiple steps and would likely include a chemical bio-conjugation step that often requires significant optimization. Here, the desired material can be prepared in a single step, even in situ, and purified only once. The invention can be used to (a) improve the ease of production of therapeutic proteins, (b) improve the thermal and proteolytic stability of therapeutic proteins, (c) add a targeting function to a therapeutic protein, (d) modulate the immune response to a therapeutic protein, etc.
Abstract: Ribosomally synthesized and post-translationally modified peptides (RiPPs) are peptide-derived natural products that include the FDA-approved analgesic ziconotide1,2 as well as compounds with potent antibiotic, antiviral, and anticancer properties.3 RiPP enzymes known as cyclodehydratases and dehydrogenases represent an exceptionally well-studied enzyme class.3 These enzymes work together to catalyze intramolecular, inter-residue condensation3,4 and aromatization reactions that install oxazoline/oxazole and thiazoline/thiazole heterocycles within ribosomally produced polypeptide chains. Here we show that the previously reported enzymes MicD-F and ArtGox accept backbone-modified monomers, including aramids and β-amino acids, within leader-free polypeptides, even at positions immediately preceding or following the site of cyclization/dehydrogenation. The products are sequence-defined chemical polymers with multiple, diverse, non-α-amino acid subunits. We show further that MicD-F and ArtGox can install heterocyclic backbones within protein loops and linkers without disrupting the native tertiary fold. Calculations reveal the extent to which these heterocycles restrict conformational space; they also eliminate a peptide bond. Both features could improve the stability or add function to linker sequences now commonplace in emerging biotherapeutics. Moreover, as thiazoles and thiazoline heterocycles are replete in natural products,5-7 small molecule drugs,8,9 and peptide-mimetic therapeutics,10 their installation in protein-based biotherapeutics can be deployed to improve or augment performance, activity, stability, and/or selectivity. This invention provides a general strategy to expand the chemical diversity of the proteome beyond and in synergy with what can now be accomplished by expanding the genetic code.
Ribosomally synthesized and post-translationally modified peptides (RiPPs) are peptide-derived natural products that include the FDA-approved analgesic ziconotide1,2 as well as compounds with potent antibiotic, antiviral, and anticancer properties.3 RiPP biosynthesis begins with a ribosomally synthesized polypeptide whose N-terminal leader sequence (~20-110 aa) recruits one or more endogenous enzymes capable of diverse post-translational modification (PTM) of an adjacent C-terminal substrate sequence.3,11 Researchers have leveraged this leader-dependent mechanism to direct RiPP PTM enzymes to C-terminal substrate sequences containing diverse non-canonical α-amino acids (nc-α-AAs).12,13
Cyclodehydratases and dehydrogenases represent an exceptionally well-studied class of RiPP enzymes.3 These enzymes work together to catalyze intramolecular cyclization3,4 and subsequent aromatization reactions that install oxazoline/oxazole and thiazoline/thiazole heterocycles within polypeptide chains (
Previous work has also shown that certain cyclodehydratase enzymes retain activity for leader sequence-free substrates when the leader peptide is provided in trans (
Here we report that MicD-F26,29 and ArtGox26,29 act together to process polypeptide substrates containing diverse translation-compatible30-35 aramid and β-amino acid monomers, even at sites directly flanking the reaction site (
MicD-F and ArtGox Accept Substrates with Diverse Structures at the +1 Site
We began by exploring the tolerance of MicD-F for sequences containing non-α-amino acid monomers at the +1 site (
Next, we explored whether MicD-F and ArtGox could act in synergy to convert peptides containing non-α-amino acids at the +1 site directly into the corresponding thiazoles 3(a-i) (
MicD-F and ArtGox Accept Substrates with Diverse Structures at the −1 Site
Next we explored whether MicD-F and ArtGox would accept leader-free polypeptide substrates containing non-α-amino acid monomers at the −1 site (
All −1 site substrates (substrates 4(a-g),
To complete the exploration of the substrate tolerance of MicD-F and ArtGox, we synthesized a set of potential substrates containing a non-α-amino acid directly at the cyclization site. Each contained a C-terminal AYD sequence preceded by either L-β3- or D-β3-threonine (
Thiazolines and thiazole are replete in natural products42-44 and synthetic drug-like small molecules,8,9 and calculations confirm the expected decrease in conformational freedom that derives from aromatic and/or sp2 character within the peptide backbone.45 This finding and the leader-independent nature of MicD-F and ArtGox-mediated thiazol(in)e biosynthesis inspired us to explore substrates in which the site of cyclodehydration/dehydrogenation is embedded within a stable protein fold (
Treatment of mCherryC+ with 50 mol % MicD-F (pH 9.0, 24 hours, 37° C.) led to virtually complete conversion to the thiazoline product as indicated by a loss of water in the deconvoluted mass spectrum (
We hypothesized that the absence of cyclodehydration reactivity for mCherry174+ and mCherry192+ at 37° C. was due to neighboring structural elements that disfavor productive interaction with MicD-F and/or enzyme-promoted thiazoline formation. Therefore, we carried out a second set of cyclodehydration reactions at 42° C., the highest temperature at which MicD-F remained stable in our hands, which should increase the conformational flexibility of loop insertions. At this elevated temperature, mCherryC+ again displayed cysteine-specific loss of water characteristic of successful cyclodehydration (
To test this hypothesis, we sought a folded, globular protein with a lower melting temperature than mCherry with the expectation that it would be more amenable to insertion of an internal thiazol(in)e linkage. Rop is a homodimeric four-helix bundle protein formed by the antiparallel association of two helix-turn-helix monomers.48 Regan and coworkers reported many years ago that the native two-residue turn in Rop could be replaced by up to ten glycine residues without loss of the native dimer structure. The Rop variant with the longest insertion, Gly10, melted cooperatively at 50° C.,49 suggesting that it might tolerate an internal, intra-loop thiazole or thiazoline (
All three Rop variants exhibited high α-helical content at 20 μM as judged by wavelength-dependent CD measurements (
Although RopC, RopN, and RopM all contained the same CAYD recognition sequence, only one, RopC, underwent clean conversion into the corresponding thiazoline upon treatment with 50 mol % MicD-F (pH 9.0, 37° C., 16 h). RopM reacted partially under these conditions and RopN was unreactive (
The products of the reaction of RopC with MicD-F (RopC-U) and with MicD-F and ArtGox (RopC-Z) were purified and analyzed by size-exclusion chromatography and wavelength- and temperature-dependent CD. Thiazoline-containing RopC-U was a homogeneous dimer as judged by SEC (
To explore the effects of the cyclization reaction on local backbone flexibility, we examined the conformational space of the tetrapeptide Ac-AACA-NH2. The use of this simplified substrate allowed the inherent energetics of the backbone to be evaluated in the protein context without the complications of side chain fluctuations. Molecular mechanics methods (Macromodel, OPLS4 force field, implemented in Schrödinger Maestro software) were first used to generate and minimize large populations of conformers for cysteine-, thiazoline-, and thiazole-containing analogs (
The results of the conformational analysis appear in
One can imagine two mutually synergistic strategies to introduce non-natural monomers into polypeptide and protein oligomers.52 One “bottom-up” approach relies on extant or engineered ribosomes to accept and process tRNAs carrying diverse non-canonical α-amino or non-α-amino acids.53 Hundreds of non-canonical α-amino acids (as well as α-hydroxy acids54,55) have been introduced into proteins in cells and animals using genetic code expansion,56,57 which usually relies on novel orthogonal aminoacyl tRNA synthetases to generate the requisite acylated tRNAs. Select non-canonical α-amino acids58 and one β-amino acid30 have also been incorporated into proteins in vivo using endogenous α-aminoacyl tRNA synthetases. Alternatively, many non-canonical α-amino acids, as well as certain non-α-amino acids, including β-amino acids32,59 and certain polyketide precursors,33 can be introduced into short peptides in vitro and on small scale using genetic code reprogramming, in which a stoichiometric RNA co-reagent (Flexizyme60) generates the requisite acylated tRNA.
The second “top-down” approach is reminiscent of late-stage functionalization reactions used to manipulate complex small molecule natural products61,62 and the natural biosynthetic strategy used to assemble ribosomally synthesized and post-translationally modified peptides (RiPPs).13 In this approach, enzymes, chemical reagents, or chemical catalysts are employed to post-translationally modify a peptide12 or protein52 to install a new or modified monomer. Examples of this approach include reactions of natural or non-canonical protein side chains or modification of the N- or C-terminus.63-66 The only backbone-focused non-enzymatic reaction of which we are aware is the O-mesitylenesulfonylhydroxylamine-promoted oxidative elimination of Cys residues to generate a dehydroalanine backbone67 that is subsequently modified. We note that the top-down and bottom-up strategies are complementary, and both have the potential to operate in vivo where very high protein titers are possible.68
In this example we show that a constitutively active form of MicD and ArtGox, enzymes used in the biosynthesis of cyanobactin natural products,69 are sufficiently promiscuous to process substrates containing diverse backbone-modified monomers within substrate polypeptides, even at positions immediately preceding or following the site of cyclization/dehydrogenation. The backbone-modified monomers compatible with MicD-F and ArtGox include many accepted by extant ribosomes in small-scale in vitro reactions, including aramids and β2- and β3-amino acids. The products of these reactions are sequence-defined chemical polymers with multiple, diverse, non-α-amino acid monomers. We show further that cyclodehydration and dehydrogenation can install thiazoline or thiazole backbones within protein loops and linkers without disrupting the native tertiary fold. Calculations reported here reveal the extent to which these heterocycles restrict conformational space; they also eliminate a peptide bond. Both features could improve the stability or add function to linker sequences now commonplace in emerging biotherapeutics. Moreover, as thiazoles and thiazoline heterocycles are replete in natural products,5-7 small molecule drugs,8,9 and peptide-mimetic therapeutics,10 their installation in protein-based biotherapeutics can be used to improve or augment performance, activity, stability, and/or selectivity. More generally, this work represents a general strategy to expand the chemical diversity of the proteome without need for genetic manipulations.
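The mass-spectrometric readout described in this example (loss of water on cyclodehydration, followed by loss of two hydrogens on dehydrogenation) can be sketched as simple arithmetic. The following is an illustrative sketch only: the function name and the substrate mass are hypothetical, and monoisotopic atomic masses are used, whereas deconvoluted spectra of intact proteins are commonly interpreted with average masses.

```python
# Monoisotopic atomic masses in Da (standard values).
H = 1.007825
O = 15.994915

WATER = 2 * H + O        # lost on cyclodehydration (Cys -> thiazoline)
DIHYDROGEN = 2 * H       # lost on dehydrogenation (thiazoline -> thiazole)


def expected_mass(substrate_mass, n_sites=1, oxidized=False):
    """Expected mass after modifying n_sites residues.

    oxidized=False: thiazoline product (cyclodehydration only).
    oxidized=True:  thiazole product (cyclodehydration plus dehydrogenation).
    """
    if n_sites < 0:
        raise ValueError("n_sites must be non-negative")
    loss_per_site = WATER + (DIHYDROGEN if oxidized else 0.0)
    return substrate_mass - n_sites * loss_per_site


# Hypothetical substrate mass, for illustration only.
m0 = 1000.0
print(f"thiazoline shift: -{m0 - expected_mass(m0):.3f} Da")
print(f"thiazole shift:   -{m0 - expected_mass(m0, oxidized=True):.3f} Da")
```

A control substrate lacking the reactive cysteine would be expected to show no shift under the same treatment.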
The plasmid used to express MicD-F (pJExpress411-MicD-F) was graciously provided by Professor James Naismith (University of Oxford).2 The translation product encoded by this plasmid is the full sequence of MicD (heterocyclase from Microcystis aeruginosa, Uniprot ID: A8Y998) preceded by five repeats of a Gly-Ala spacer, residues Thr18 to Ala37 from PatE (Uniprot ID: A0MH79), a TEV protease recognition site, and an N-terminal 6×His purification tag. The full sequence of the translation product is provided below.
A starter culture of 5 mL of Miller's LB Broth (AmericanBio, catalog #AB01201) containing 50 μg/mL kanamycin was inoculated with a single colony of E. coli BL21 (DE3) harboring the pJExpress411-MicD-F plasmid and grown overnight at 37° C. with shaking at 200 rpm. The starter culture (5 mL) was used to inoculate a 500 mL expression culture of Miller's LB Broth which also contained 50 μg/mL kanamycin. The expression culture was grown at 37° C. with shaking at 200 rpm to an OD600 of 0.6 at which point the expression culture was induced with 1 mM IPTG, transferred to a 20° C. incubator, and grown for 24 hours with shaking at 200 rpm. The expression culture was harvested by centrifugation at 4,300×g at 4° C. for 20 minutes. The resulting cell pellet was suspended in 10 mL of Lysis Buffer (20 mM Tris-HCl, 500 mM NaCl, pH 8.0) containing 1 tablet of cOmplete, mini EDTA-free ULTRA protease inhibitor cocktail (Sigma-Aldrich, St. Louis, MO). The cell suspension was disrupted by sonication on ice (Branson Sonifier 250, 3 cycles of 30 second pulse at 30% duty cycle and microtip limit of 5 followed by 60 second pause). The cell lysate was cleared by centrifugation at 4,300×g at 4° C. for 20 minutes. A gravity flow Poly-Prep Chromatography Column (Bio-Rad Laboratories, Hercules, CA) was loaded with 2 mL of packed Ni-NTA agarose (Qiagen, Germantown, MD) and equilibrated with 10 mL of Lysis Buffer. The 6×His-tagged protein was bound to the Ni-NTA column by passing the cleared cell lysate over the column three times. Non-specifically bound proteins were removed by washing the Ni-NTA column with 10 mL of Wash Buffer (20 mM Tris-HCl, 500 mM NaCl, 50 mM imidazole pH 8.0). The 6×His-tagged protein was eluted by washing the Ni-NTA column with 5 mL of Elution Buffer (20 mM Tris-HCl, 500 mM NaCl, 250 mM imidazole pH 8.0). 
The purified protein was concentrated and exchanged into Storage Buffer (20 mM HEPES, 150 mM NaCl, 1 mM TCEP, pH 7.4) using Amicon Ultra-0.5 mL centrifugal filters (MilliporeSigma, Burlington, MA) with a 30 kDa molecular weight cut off according to the manufacturer's instructions. The purified protein was quantified using absorbance at 280 nm (molar extinction coefficient of 125,250 M−1 cm−1), diluted to 500 μM using Storage Buffer, snap frozen as single-use aliquots, and stored at −80° C. The typical expression yield of MicD-F using the above protocol was 30 mg/L of E. coli culture.
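The quantification and dilution step above follows the Beer-Lambert law and a C1V1 = C2V2 dilution, which can be sketched as follows. Only the extinction coefficient is taken from the text; the absorbance reading, pre-measurement dilution factor, sample volume, and function names are hypothetical, and the target concentration is assumed to be in micromolar.

```python
def molar_conc_from_a280(a280, ext_coeff, path_cm=1.0, dilution_factor=1.0):
    """Beer-Lambert law, c = A / (epsilon * l), scaled by any pre-measurement dilution."""
    if ext_coeff <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return a280 * dilution_factor / (ext_coeff * path_cm)


def buffer_to_add(c_stock, c_target, v_stock):
    """Volume of buffer to add so that v_stock at c_stock reaches c_target."""
    if not 0 < c_target <= c_stock:
        raise ValueError("target must be positive and no higher than the stock")
    return v_stock * (c_stock / c_target - 1.0)


EPSILON_MICD_F = 125_250   # M^-1 cm^-1, value stated in the text
TARGET_UM = 500.0          # storage concentration, assumed to be micromolar

a280_reading = 0.85        # hypothetical reading
dilution = 100             # hypothetical dilution made before the measurement
sample_volume_uL = 200.0   # hypothetical concentrate volume

conc_uM = molar_conc_from_a280(
    a280_reading, EPSILON_MICD_F, dilution_factor=dilution
) * 1e6

if conc_uM >= TARGET_UM:
    extra = buffer_to_add(conc_uM, TARGET_UM, sample_volume_uL)
    print(f"stock {conc_uM:.0f} uM; add {extra:.1f} uL Storage Buffer")
else:
    print(f"stock {conc_uM:.0f} uM is below target; concentrate further")
```

The same two functions apply to any protein once its own extinction coefficient and target concentration are substituted.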
The plasmid used to express ArtGox (pEHISTEVSUMO-ArtGox) was generously provided by Professor James Naismith (University of Oxford).3,4 The translation product encoded by this plasmid is the oxidase domain of ArtG (thiazoline oxidase from Arthrospira platensis, Uniprot ID: H1W8K1) preceded by a TEV protease recognition site, a SUMO fusion tag, and an N-terminal 6×His purification tag. The full sequence of the translation product is provided below.
A starter culture of 5 mL of Miller's LB Broth (AmericanBio, catalog #AB01201) supplemented with 50 μg/mL kanamycin was inoculated with a single colony of E. coli BL21 (DE3) harboring the pEHISTEVSUMO-ArtGox plasmid and grown overnight at 37° C. with shaking at 200 rpm. The starter culture (5 mL) was used to inoculate a 500 mL expression culture of Miller's LB Broth supplemented with 50 μM riboflavin and containing 50 μg/mL kanamycin. The expression culture was grown at 37° C. with shaking at 200 rpm to an OD600 of 0.6 at which point it was induced with 1 mM IPTG, transferred to a 20° C. incubator, and grown for 24 hours with shaking at 200 rpm. The expression culture was harvested by centrifugation at 4,300×g at 4° C. for 20 minutes. The resulting cell pellet was suspended in 10 mL of Lysis Buffer (20 mM Tris-HCl, 500 mM NaCl, 50 μM flavin mononucleotide, pH 8.0) containing 1 tablet of cOmplete, mini EDTA-free ULTRA protease inhibitor cocktail (Sigma-Aldrich, St. Louis, MO). The cell suspension was disrupted by sonication on ice (Branson Sonifier 250, 3 cycles of 30 second pulse at 30% duty cycle and microtip limit of 5 followed by 60 second pause). The cell lysate was cleared by centrifugation at 4,300×g at 4° C. for 20 minutes. A gravity flow Poly-Prep Chromatography Column (Bio-Rad Laboratories, Hercules, CA) was loaded with 2 mL of packed Ni-NTA agarose (Qiagen, Germantown, MD) and equilibrated with 10 mL of Lysis Buffer. The 6×His-tagged protein was bound to the Ni-NTA column by passing the cleared cell lysate over the column three times. Non-specifically bound proteins were removed by washing the Ni-NTA column with 10 mL of Wash Buffer (20 mM Tris-HCl, 500 mM NaCl, 50 mM imidazole, 50 μM flavin mononucleotide, pH 8.0). The 6×His-tagged protein was eluted by washing the Ni-NTA column with 5 mL of Elution Buffer (20 mM Tris-HCl, 500 mM NaCl, 250 mM imidazole, 50 μM flavin mononucleotide, pH 8.0).
The purified protein was concentrated and exchanged into Storage Buffer (20 mM HEPES, 150 mM NaCl, 1 mM TCEP, pH 7.4) using Amicon Ultra-0.5 mL centrifugal filters (MilliporeSigma, Burlington, MA) with a 30 kDa molecular weight cut off according to the manufacturer's instructions. The purified protein was quantified using absorbance at 280 nm (molar extinction coefficient of 75,290 M−1 cm−1), diluted to 800 μM using Storage Buffer, snap frozen as single-use aliquots, and stored at −80° C. The typical expression yield of ArtGox using the above protocol was 40 mg/L of E. coli culture.
SDS-PAGE. Purified samples of MicD-F and ArtGox (4 μg) were mixed (4:1) with SDS-PAGE sample buffer (5% β-mercaptoethanol, 0.02% bromophenol blue, 30% glycerol, 10% SDS, 250 mM Tris-HCl, pH 6.8). The protein samples were reduced and denatured by incubation at 95° C. for 5 minutes. Reduced and denatured samples were separated using a 4-15% mini-PROTEAN TGX gel (Bio-Rad Laboratories, Hercules, CA) run at 120 V for 60 minutes in Tris-Glycine-SDS running buffer (3 g/L tris, 14.4 g/L glycine, 1 g/L sodium dodecyl sulfate, pH 8.3) and their molecular weights compared against Precision Plus Protein Dual Color Standards (Bio-Rad Laboratories, Hercules, CA). Protein content was visualized using Coomassie stain (1 g/L Coomassie Brilliant Blue in methanol:water:acetic acid (5:4:1)) and imaged using a Bio-Rad Chemidoc MP Imaging System. Band intensities were quantified using the gel analysis tool of FIJI.5 Please see
Cloning, Expression, Purification, and Characterization of mCherry and Rop Variants
Design of mCherry Variants
C-terminally modified substrates (mCherryC+ and mCherryC−) are variants of the mCherry sequence associated with Uniprot ID: X5DSL3. Both contained an N-terminal 6×His purification tag, the natural three-residue chromophore-forming sequence MYG, and a C-terminal extension that included a TEV protease recognition site, a MicD-F/ArtGox compatible substrate, and a FLAG purification tag. Internally-modified substrates (mCherry137+, mCherry174+, mCherry192+ and mCherry211+) are also variants of the mCherry sequence associated with Uniprot ID: X5DSL3. Each contained an N-terminal FLAG purification tag, the natural three-residue chromophore-forming sequence MYG, and a C-terminal 6×His purification tag. A MicD-F/ArtGox compatible substrate sequence was inserted internally on the C-terminal side of the indicated residue (137, 174, 192, or 211). The full sequences of the mCherry translation products are provided below.
LYFQGMCAYDGDYKDDDDK
LYFQGMAAYDGDYKDDDDK
CAYDGGPVMQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAE
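The internal-insertion design described above (a substrate sequence placed on the C-terminal side of an indicated residue) can be sketched as a simple string operation. This is an illustration under stated assumptions: the function name and the toy parent sequence are hypothetical, the CAYD motif is used only as an example of a substrate sequence, and residue numbering in the actual constructs may be offset by tags or linkers that are not modeled here.

```python
def insert_after_residue(sequence, position, motif):
    """Insert `motif` on the C-terminal side of the 1-indexed residue `position`."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position must lie within the sequence")
    # Python slices are 0-indexed, so sequence[:position] ends at residue `position`.
    return sequence[:position] + motif + sequence[position:]


# Hypothetical ten-residue parent sequence, for illustration only.
toy_parent = "MVSKGEEDNM"
variant = insert_after_residue(toy_parent, 4, "CAYD")
print(variant)  # MVSKCAYDGEEDNM
```

Applying the same function at several positions of one parent sequence would generate a panel of variants analogous to the 137, 174, 192, and 211 insertion sites.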
Cloning mCherry Constructs into pET-32a(+) Vector
The sequences encoding mCherryC+, mCherryC−, mCherry137+, mCherry174+, mCherry192+ and mCherry211+ were cloned into a pET-32a(+) plasmid as follows. Circular pET-32a(+) vector (1 g, Millipore Sigma, catalog #69015-3) was incubated with 1 L each of restrictions enzymes NdeI (New England Biosciences, catalog #R0111S) and NotI-HF (New England Biosciences, catalog #R3189S) in cutSmart Buffer (New England Biolabs, catalog #B7204S) at 37° C. for 1 hour. The entire restriction digest reaction was run on a 0.8% agarose gel at 150V for 45 minutes and linear 5.4 kbp and 0.5 kbp fragments were observed using a blue light transilluminator. The larger 5.4 kbp fragment was excised from the gel and purified using a Monarch DNA Gel Extraction Kit (NEB, catalog #T1020S). The concentration of purified, linearized pET-32(a)+ was determined by absorbance at 260 nm. Next, 33.3 ng of purified, linearized pET-32(a)+ and 100 ng of respective gBlock DNA fragment (Integrated DNA Technologies, Coralville, IA) encoding mCherry constructs were combined in a 10 L Gibson Assembly reaction6 containing HiFI DNA Assembly Master Mix (NEB, catalog #E2621L) and incubated at 50° C. for 1 hour to generate circular pET-32(a)+ vectors containing coding sequences for mCherry constructs. Circularized plasmids from the previous step were transformed into NEB 5-alpha competent E. coli (NEB, catalog #C2987H) as follows. Frozen stocks of cells were thawed on ice for 10 minutes. Upon thawing 4 L of the previous Gibson Assembly reaction was added to cells and incubated on ice for 30 minutes. Cells incubated with plasmid were then subjected to heat shock at 42° C. for 30 seconds and placed on ice for 5 minutes. 900 L of SOC outgrowth medium (NEB, catalog #B9020S) was added to cells and cells were incubated at 37° C. for 1 hour with shaking at 200 rpm. Agar plates containing 100 g/mL carbenicillin were inoculated with 100 L of transformed cells and grown overnight at 37° C. 
5 single colonies per construct were picked and inoculated into liquid cultures containing 5 mL LB+100 μg/mL carbenicillin and grown for 16 hours at 37° C. Pure plasmid was isolated from 5 mL cultures using Qiaprep Spin Miniprep Kit (Qiagen, catalog #27106) and sequences were confirmed by Sanger sequencing at the UC Berkeley DNA Sequencing Facility. Plasmids containing the precise coding sequence for each construct were transformed into chemically competent BL21 E. coli (NEB, catalog #C2530H) following the same transformation protocol detailed above for large scale protein expression.
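The vector and insert masses used in the assembly reaction above can be converted to molar amounts with the common approximation of 650 Da per base pair of double-stranded DNA. The sketch below uses the stated 33.3 ng of 5.4 kbp vector and 100 ng of insert; the insert length of 900 bp is a hypothetical placeholder, since gBlock lengths are not given here, and the function name is illustrative.

```python
DA_PER_BP = 650.0  # approximate average mass of one base pair of dsDNA, in Da


def pmol_dsdna(mass_ng, length_bp):
    """Convert a mass of double-stranded DNA (ng) to picomoles."""
    if mass_ng < 0 or length_bp <= 0:
        raise ValueError("mass must be non-negative and length positive")
    return mass_ng * 1000.0 / (length_bp * DA_PER_BP)


vector_pmol = pmol_dsdna(33.3, 5400)   # linearized vector, values from the text
insert_pmol = pmol_dsdna(100.0, 900)   # 900 bp is a hypothetical insert length

print(f"vector: {vector_pmol:.4f} pmol")
print(f"insert: {insert_pmol:.4f} pmol")
print(f"insert:vector molar ratio ~ {insert_pmol / vector_pmol:.1f}:1")
```

Because the molar ratio scales inversely with insert length, substituting the true fragment length for each construct gives the actual ratio used.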
Loop-modified substrates (RopN, RopC, RopM, RopCG4) are analogs of the Rop sequence associated with Uniprot ID: P03051. All contained an N-terminal FLAG purification tag, and one of five ten-amino acid sequences in place of the native Asp30-Ala31 sequence of Rop. They also all contained a C-terminal 6×His purification tag. The full sequences of the Rop translation products are provided below.
Cloning Rop Constructs into pET-32(a)+ Vector
The sequences encoding RopN, RopM, RopC, RopC−, and RopCG4 were cloned into a pET-32a(+) plasmid as follows. Circular pET-32a(+) vector (1 μg, Millipore Sigma, catalog #69015-3) was incubated with 1 L each of restrictions enzymes NdeI (New England Biosciences, catalog #R0111S) and NotI-HF (New England Biosciences, catalog #R3189S) in cutSmart Buffer (New England Biolabs, catalog #B7204S) at 37° C. for 1 hour. The entire restriction digest reaction was run on a 0.8% agarose gel at 150V for 45 minutes and 5.4 kbp and 0.5 kbp fragments were observed using a blue light transilluminator. The larger 5.4 kbp fragment was excised from the gel and purified using a Monarch DNA Gel Extraction Kit (NEB, catalog #T1020S). Concentration of linearized pET-32(a)+ vector was determined by absorbance at 260 nm. 33.3 ng of purified, linearized pET-32(a)+ vector and 100 ng of respective gBlock DNA fragment (Integrated DNA Technologies, Coralville, IA) encoding Rop constructs were combined in a 10 μL Gibson Assembly reaction containing HiFI DNA Assembly Master Mix (NEB, catalog #E2621L) and incubated at 50° C. for 1 hour to generate circular pET-32(a)+ vectors containing coding sequences for Rop constructs. Circularized plasmids from the previous step were transformed into NEB 5-alpha competent E. coli (NEB, catalog #C2987H). First, cells were thawed on ice for 10 minutes after which 4 μL of the previous Gibson Assembly reaction was added to cells and placed back on ice for 30 minutes. After addition of plasmid, cells were subjected to heat shock at 42° C. for 30 seconds and placed on ice for 5 minutes. 900 L of SOC outgrowth medium (NEB, catalog #B9020S) was added to cells and cells were incubated at 37° C. for 1 hour with shaking at 200 rpm. Agar plates containing 100 μg/mL carbenicillin were inoculated with 100 L of transformed cells and grown overnight at 37° C. 
Five single colonies per construct were picked, inoculated into 5 mL of LB containing 100 μg/mL carbenicillin, and grown for 16 hours at 37° C. Plasmid was isolated from the 5 mL cultures using a QIAprep Spin Miniprep Kit (Qiagen, catalog #27106), and sequences were confirmed by Sanger sequencing at the UC Berkeley DNA Sequencing Facility. Plasmids containing the precise coding sequence for each construct were transformed into chemically competent BL21 E. coli (NEB, catalog #C2530H) for large-scale protein expression following the same transformation protocol detailed above.
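The DNA masses used in the assembly reaction above can be converted to molar amounts to check the insert-to-vector ratio. The following is a minimal sketch, assuming the common approximation of 650 g/mol per base pair of double-stranded DNA. The function names are illustrative, and the insert length used in the example is a hypothetical placeholder because the gBlock lengths are not stated in this section.

```python
# Approximate average mass of one base pair of double-stranded DNA (g/mol).
# This is a widely used rule of thumb, not a value taken from this protocol.
AVG_BP_MASS_G_PER_MOL = 650.0


def dsdna_pmol(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of double-stranded DNA (ng) to picomoles."""
    if mass_ng < 0 or length_bp <= 0:
        raise ValueError("mass must be non-negative and length must be positive")
    # ng -> g is 1e-9, mol -> pmol is 1e12, so the net factor is 1e3.
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS_G_PER_MOL)


def insert_to_vector_ratio(insert_ng: float, insert_bp: int,
                           vector_ng: float, vector_bp: int) -> float:
    """Molar ratio of insert to vector in an assembly reaction."""
    return dsdna_pmol(insert_ng, insert_bp) / dsdna_pmol(vector_ng, vector_bp)


if __name__ == "__main__":
    # Values from the protocol: 33.3 ng of the 5.4 kbp linearized vector.
    vector_pmol = dsdna_pmol(33.3, 5400)
    # HYPOTHETICAL insert length, for illustration only.
    example_insert_bp = 300
    ratio = insert_to_vector_ratio(100.0, example_insert_bp, 33.3, 5400)
    print(f"vector: {vector_pmol:.4f} pmol, insert:vector = {ratio:.1f}:1")
```

Because the ratio scales inversely with insert length, the same 100 ng of a longer fragment gives a proportionally lower molar excess.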
Expression and Purification of mCherry Variants
Starter cultures of 2 mL of Miller's LB Broth (AmericanBio, catalog #AB01201) supplemented with 100 μg/mL carbenicillin were inoculated with a single colony of E. coli BL21 (DE3) harboring the plasmid of interest and grown overnight at 37° C. with shaking at 200 rpm. The starter culture (2 mL) was used to inoculate a 200 mL expression culture of Miller's LB Broth also supplemented with 100 μg/mL carbenicillin. The expression culture was grown at 37° C. with shaking at 200 rpm to an OD600 of 0.6, at which point it was induced with 1 mM IPTG, transferred to a 20° C. incubator, and grown for 24 hours with shaking at 200 rpm. The expression culture was harvested by centrifugation at 4,300×g at 4° C. for 45 minutes. The resulting cell pellet was suspended in 10 mL of Lysis Buffer (20 mM Tris-HCl, 500 mM NaCl, pH 8.0) containing 1 tablet of cOmplete, mini EDTA-free ULTRA protease inhibitor cocktail (Sigma-Aldrich, St. Louis, MO). The cell suspension was disrupted by sonication on ice (Branson Sonifier 250, 3 cycles of 30 second pulse at 30% duty cycle and microtip limit of 5 followed by 60 second pause). The cell lysate was cleared by centrifugation at 23,000×g at 4° C. for 20 minutes. TALON® Metal Affinity Resin (2 mL) (Takara Biosciences, catalog #635504) was equilibrated with Lysis Buffer, added to the cleared cell lysate, and incubated on a rotisserie at 4° C. for 1 hour. The TALON® resin-lysate mixture was then passed through a gravity flow Poly-Prep Chromatography Column (Bio-Rad Laboratories, Hercules, CA). Non-specifically bound proteins were removed by washing the TALON® column with 10 mL of Wash Buffer (20 mM Tris-HCl, 500 mM NaCl, 20 mM imidazole, pH 8.0). The 6×His-tagged protein was eluted by washing the TALON® column with 5 mL of Elution Buffer (20 mM Tris-HCl, 500 mM NaCl, 250 mM imidazole, pH 8.0).
The purified protein was loaded into a 3-12 mL Slide-A-Lyzer Dialysis Cassette with a 10 kDa molecular weight cut off (Thermo Scientific, Waltham, MA) and dialyzed against 1 L of Storage Buffer (20 mM HEPES, 150 mM NaCl, 1 mM TCEP, pH 7.4) at 4° C. for at least 18 hours. The dialyzed protein was concentrated using Amicon Ultra-15 mL centrifugal filters with a 10 kDa molecular weight cut off (MilliporeSigma, Burlington, MA) according to the manufacturer's instructions. The purified protein was quantified using absorbance at 280 nm, diluted to 1 mM using Storage Buffer, snap frozen as single-use aliquots, and stored at −80° C. The typical expression yield of mCherry constructs using the above protocol was 90 mg/L of E. coli culture.
Expression and Purification of Rop Constructs
Starter cultures of 2 mL of Miller's LB Broth (AmericanBio, catalog #AB01201) supplemented with 100 μg/mL carbenicillin were inoculated with a single colony of E. coli BL21 (DE3) harboring the plasmid of interest and grown overnight at 37° C. with shaking at 200 rpm. The starter culture (2 mL) was used to inoculate a 200 mL expression culture of Miller's LB Broth also supplemented with 100 μg/mL carbenicillin. The expression culture was grown at 37° C. with shaking at 200 rpm to an OD600 of 0.6, at which point it was induced with 1 mM IPTG, transferred to a 20° C. incubator, and grown for 24 hours with shaking at 200 rpm. The expression culture was harvested by centrifugation at 4,300×g at 4° C. for 45 minutes. The resulting cell pellet was suspended in 10 mL of Lysis Buffer (20 mM Tris-HCl, 500 mM NaCl, pH 8.0) containing 1 tablet of cOmplete, mini EDTA-free ULTRA protease inhibitor cocktail (Sigma-Aldrich, St. Louis, MO). The cell suspension was disrupted by sonication on ice (Branson Sonifier 250, 3 cycles of 30 second pulse at 30% duty cycle and microtip limit of 5 followed by 60 second pause). The cell lysate was cleared by centrifugation at 23,000×g at 4° C. for 20 minutes. TALON® Metal Affinity Resin (2 mL) (Takara Biosciences, catalog #635504) was equilibrated with lysis buffer (20 mM Tris-HCl, 500 mM NaCl, pH 8.0). The equilibrated TALON® resin (2 mL) was added to the cleared cell lysate and incubated on a rotisserie at 4° C. for 1 hour. The TALON® resin-lysate mixture was then passed through a gravity flow Poly-Prep Chromatography Column (Bio-Rad Laboratories, Hercules, CA). Non-specifically bound proteins were removed by washing the TALON® column with 10 mL of wash buffer (20 mM Tris-HCl, 500 mM NaCl, 20 mM imidazole, pH 8.0). The 6×His-tagged protein was eluted by washing the TALON® column with 5 mL of elution buffer (20 mM Tris-HCl, 500 mM NaCl, 250 mM imidazole, pH 8.0).
The purified protein was loaded into a 3-12 mL Slide-A-Lyzer Dialysis Cassette with a 3.5 kDa molecular weight cut off (Thermo Scientific, Waltham, MA) and dialyzed against 1 L of storage buffer (20 mM HEPES, 150 mM NaCl, 1 mM TCEP, pH 7.4) at 4° C. for at least 18 hours. The dialyzed protein was concentrated using Amicon Ultra-15 mL centrifugal filters with a 3 kDa molecular weight cut off (MilliporeSigma, Burlington, MA) according to the manufacturer's instructions. The purified protein was quantified using absorbance at 280 nm, diluted to 1 mM using storage buffer (20 mM HEPES, 150 mM NaCl, 1 mM TCEP, pH 7.4), snap frozen as single-use aliquots, and stored at −80° C. The typical expression yield of Rop constructs using the above protocol was 40 mg/L of E. coli culture.
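The quantification and dilution steps above follow the Beer-Lambert law and a simple C1V1 = C2V2 dilution. The sketch below illustrates that arithmetic only. The function names are illustrative, and the extinction coefficient and absorbance reading in the example are hypothetical placeholders, since neither value is given in this section.

```python
def molar_conc_from_a280(a280: float, ext_coeff_per_m_cm: float,
                         path_cm: float = 1.0,
                         dilution_factor: float = 1.0) -> float:
    """Beer-Lambert law: concentration (mol/L) = A / (epsilon * path).

    dilution_factor corrects for any dilution made before the reading.
    """
    if ext_coeff_per_m_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return a280 * dilution_factor / (ext_coeff_per_m_cm * path_cm)


def buffer_to_add(stock_conc: float, target_conc: float,
                  stock_volume: float) -> float:
    """Volume of buffer needed to dilute stock_volume down to target_conc.

    Concentrations share one unit; the result has the unit of stock_volume.
    """
    if target_conc <= 0 or target_conc > stock_conc:
        raise ValueError("target must be positive and no greater than the stock")
    final_volume = stock_conc * stock_volume / target_conc  # C1*V1 = C2*V2
    return final_volume - stock_volume


if __name__ == "__main__":
    # HYPOTHETICAL reading and extinction coefficient, for illustration only.
    conc_m = molar_conc_from_a280(a280=0.45, ext_coeff_per_m_cm=30000.0,
                                  dilution_factor=100.0)
    conc_mm = conc_m * 1000.0
    print(f"concentration: {conc_mm:.2f} mM")
    print(f"buffer to add per 100 uL: {buffer_to_add(conc_mm, 1.0, 100.0):.1f} uL")
```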
Characterization of mCherry and Rop Constructs
SDS-PAGE. Purified proteins (4 μg) were mixed (4:1) with SDS-PAGE sample buffer (5% β-Mercaptoethanol, 0.02% bromophenol blue, 30% glycerol, 10% SDS, 250 mM Tris-HCl, pH 6.8) and incubated at 95° C. for 5 minutes before being applied to a 4-15% mini-PROTEAN TGX precast gel (Bio-Rad Laboratories, Hercules, CA) run at 200 V for 30 minutes in Tris-Glycine-SDS running buffer (3 g/L tris, 14.4 g/L glycine, 1 g/L sodium dodecyl sulfate, pH 8.3) alongside a lane containing Precision Plus Protein Dual Color Standards (Bio-Rad Laboratories, Hercules, CA). Protein bands were visualized using Coomassie stain (1 g/L Coomassie Brilliant Blue in methanol:water:acetic acid (5:4:1)) and imaged using a Bio-Rad Chemidoc MP Imaging System. Band intensities were quantified using the gel analysis tool of FIJI.5
LC-MS analysis was performed on an Agilent 1290 Infinity II HPLC connected to an Agilent 6530B QTOF AJS-ESI. The mobile phase for LC-MS was water and acetonitrile with 0.1% (v/v) formic acid and a flow rate of 0.4 mL/min. Each sample was injected onto a Poroshell 300SB-C8 column (2.1×75 mm, 5-Micron, room temperature, Agilent) using a linear gradient from 5 to 95% acetonitrile over 9 minutes after an initial hold at 5% acetonitrile for 0.5 minutes (0.4 mL/min). The following parameters were used during acquisition: Fragmentor voltage 225 V, gas temperature 300° C., gas flow 10 L/min, sheath gas temperature 350° C., sheath gas flow 11 L/min, nebulizer pressure 35 psi, skimmer voltage 65 V, Vcap 5000 V, 1 spectra/s. Intact protein masses were obtained via deconvolution using the Maximum Entropy algorithm in Mass Hunter Bioconfirm (V10, Agilent).
Reactions with MicD-F and/or ArtGox
Reaction of Peptides and Proteins with MicD-F
A typical reaction scale for analysis via LC-MS was 30 μL. Peptide substrate stock (3 μL) suspended at 1 mM in 10 mM bicine, 150 mM NaCl, 1 mM TCEP, pH 9.0 was added to 24 μL of reaction buffer (6.25 mM ATP, 6.25 mM MgCl2, 100 mM bicine, 150 mM NaCl, 1 mM TCEP, pH 8.0 or 9.0). Separately, an aliquot of MicD-F (500 μM in 20 mM HEPES, 150 mM NaCl, 1 mM TCEP, pH 7.4) was thawed on ice. The thawed enzyme aliquot was then diluted with cold storage buffer (20 mM HEPES, 150 mM NaCl, 1 mM TCEP, pH 7.4) to a concentration equal to 10× the desired enzyme concentration. The peptide and enzyme solutions were then incubated separately at 37° C. for 15 minutes. After temperature equilibration, enzyme solution (3 μL) was added to the peptide solution and gently pipette mixed. The resulting solution was then incubated at 37° C. for the indicated amount of time before LC-MS analysis.
Reactions of proteins with MicD-F were performed in the manner described above with minor modifications. Namely, the substrate stock was a 1 mM solution of protein in 20 mM HEPES, 150 mM NaCl, 1 mM TCEP, pH 7.4.
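The final concentrations in the 30 μL MicD-F reaction follow directly from the stock concentrations and volumes given above. The following sketch works through that dilution arithmetic. The function and constant names are illustrative, and the 500 μM enzyme working stock in the example is an assumed input corresponding to an undiluted MicD-F aliquot.

```python
def final_concentration(stock_conc: float, added_volume_ul: float,
                        total_volume_ul: float) -> float:
    """Concentration after dilution, in the same unit as stock_conc."""
    if total_volume_ul <= 0 or not 0 <= added_volume_ul <= total_volume_ul:
        raise ValueError("volumes must satisfy 0 <= added <= total and total > 0")
    return stock_conc * added_volume_ul / total_volume_ul


# Volumes from the analytical-scale protocol (microliters).
SUBSTRATE_UL = 3.0
BUFFER_UL = 24.0
ENZYME_UL = 3.0
TOTAL_UL = SUBSTRATE_UL + BUFFER_UL + ENZYME_UL  # 30 uL


def single_enzyme_reaction(enzyme_working_um: float) -> dict:
    """Final concentrations for the MicD-F-only reaction."""
    return {
        # 1 mM substrate stock expressed in micromolar
        "substrate_uM": final_concentration(1000.0, SUBSTRATE_UL, TOTAL_UL),
        "ATP_mM": final_concentration(6.25, BUFFER_UL, TOTAL_UL),
        "MgCl2_mM": final_concentration(6.25, BUFFER_UL, TOTAL_UL),
        "MicD-F_uM": final_concentration(enzyme_working_um, ENZYME_UL, TOTAL_UL),
    }


if __name__ == "__main__":
    # ASSUMED input: undiluted 500 uM MicD-F used as the 10x working stock.
    for name, value in single_enzyme_reaction(500.0).items():
        print(f"{name}: {value:g}")
```

With these volumes, the 6.25 mM ATP and MgCl2 in the reaction buffer correspond to 5 mM in the final mixture, and the 1 mM substrate stock corresponds to 100 μM.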
Reaction of Peptides and Proteins with MicD-F and ArtGox in Tandem
A typical reaction scale for analysis via LC-MS was 30 μL. Peptide substrate stock (3 μL) suspended at 1 mM in 10 mM bicine, 150 mM NaCl, 1 mM TCEP, pH 9.0 was added to 21 μL of reaction buffer (7.14 mM ATP, 7.14 mM MgCl2, 2.86 mM FMN, 100 mM bicine, 150 mM NaCl, 1 mM TCEP, pH 8.0 or 9.0). Separately, aliquots of MicD-F (500 μM in 20 mM HEPES, 150 mM NaCl, 1 mM TCEP, pH 7.4) and ArtGox (800 μM in 20 mM HEPES, 150 mM NaCl, 1 mM TCEP, pH 7.4) were thawed on ice. The thawed enzyme aliquots were then diluted with cold storage buffer (20 mM HEPES, 150 mM NaCl, 1 mM TCEP, pH 7.4) to a concentration equal to 10× the desired enzyme concentration. The peptide and enzyme solutions were then incubated separately at 37° C. for 15 minutes. After temperature equilibration, enzyme solutions (3 μL of each) were added to the peptide solution and gently pipette mixed. The resulting solution was then incubated at 37° C. for the indicated amount of time before LC-MS analysis.
Reactions of proteins with MicD-F and ArtGox were performed in the manner described above with minor modifications. The substrate stock was a 1 mM solution of protein in 20 mM HEPES, 150 mM NaCl, 1 mM TCEP, pH 7.4. Reactions were carried out at either 25° C. or 37° C. as indicated.
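The enzyme aliquots are diluted to 10× the desired final concentration before 3 μL of each is added to the 30 μL reaction. The sketch below shows one way to compute that dilution from the 500 μM MicD-F and 800 μM ArtGox stocks. The function name and the 10 μL working volume are illustrative assumptions, not values from the protocol.

```python
def working_stock_dilution(stock_um: float, final_um: float,
                           working_volume_ul: float,
                           fold: float = 10.0) -> tuple:
    """Return (stock volume, buffer volume) in uL for a 'fold'x working stock.

    The working stock is 'fold' times the desired final concentration, so that
    adding 1/fold of the reaction volume gives the final concentration.
    """
    if stock_um <= 0 or final_um <= 0 or working_volume_ul <= 0:
        raise ValueError("concentrations and volume must be positive")
    working_um = final_um * fold
    if working_um > stock_um * (1 + 1e-9):
        raise ValueError("working stock cannot be more concentrated than the stock")
    stock_ul = working_um * working_volume_ul / stock_um
    return stock_ul, working_volume_ul - stock_ul


if __name__ == "__main__":
    # ASSUMED 10 uL working volume; final concentrations are example inputs.
    for name, stock, final in [("MicD-F", 500.0, 5.0), ("ArtGox", 800.0, 80.0)]:
        stock_ul, buffer_ul = working_stock_dilution(stock, final, 10.0)
        print(f"{name}: {stock_ul:.2f} uL stock + {buffer_ul:.2f} uL storage buffer")
```

The highest final concentrations reachable without changing the 3 μL addition volume are therefore one tenth of the stock concentrations, 50 μM MicD-F and 80 μM ArtGox.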
LC-MS Analysis of Reactions with Peptide Substrates
To remove 6×His-tagged enzymes, Ni-NTA (Qiagen, Hilden, Germany) slurry and crude reaction were mixed 1:1 by volume and incubated on ice for 30 minutes with occasional agitation. The Ni-NTA resin was then removed by centrifugation at 21,300×g for 10 minutes at 4° C. Enzyme-depleted reaction mixture (1 μL) was used for LC-MS analysis which was performed on an Agilent 1290 Infinity II HPLC connected to an Agilent 6530B QTOF AJS-ESI. The mobile phase for LC-MS was water and acetonitrile with 0.1% (v/v) formic acid and a flow rate of 0.7 mL/min. Each sample was injected onto an Eclipse XDB C-18 column (2.1×50 mm, 1.8-Micron, room temperature, Agilent) and separated using a linear gradient from 5 to 95% acetonitrile over 4.5 minutes after an initial hold at 5% acetonitrile for 0.5 minutes. The following parameters were used during acquisition: Fragmentor voltage 175 V, gas temperature 300° C., gas flow 8 L/min, sheath gas temperature 350° C., sheath gas flow 11 L/min, nebulizer pressure 35 psi, skimmer voltage 65 V, Vcap 3500 V, 1 spectra/s.
LC-MS Analysis of Reactions with Protein Substrates
Unprocessed reaction mixture (1 μL) was used for LC-MS analysis which was performed on an Agilent 1290 Infinity II HPLC connected to an Agilent 6530B QTOF AJS-ESI. The mobile phase for LC-MS was water and acetonitrile with 0.1% (v/v) formic acid and a flow rate of 0.4 mL/min. Each sample was injected onto a Poroshell 300SB-C8 column (2.1×75 mm, 5-Micron, room temperature, Agilent) using a linear gradient from 5 to 55% acetonitrile over 8 minutes after an initial hold at 5% acetonitrile for 2 minutes (0.4 mL/min). The following parameters were used during acquisition: Fragmentor voltage 225 V, gas temperature 300° C., gas flow 10 L/min, sheath gas temperature 350° C., sheath gas flow 11 L/min, nebulizer pressure 35 psi, skimmer voltage 65 V, Vcap 5000 V, 1 spectra/s. Intact protein masses were obtained via deconvolution using the Maximum Entropy algorithm in Mass Hunter Bioconfirm (V10, Agilent).
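The LC gradients described above are a hold followed by a single linear ramp, so the mobile phase composition at any time can be written as a piecewise-linear function. The following sketch encodes the protein method (2 minute hold at 5% acetonitrile, then 5 to 55% over 8 minutes) and the peptide method (0.5 minute hold, then 5 to 95% over 4.5 minutes). The function name is illustrative, and composition after the end of the ramp is assumed to stay at the final value because post-gradient steps are not described.

```python
def percent_acetonitrile(t_min: float, hold_min: float, start_pct: float,
                         end_pct: float, ramp_min: float) -> float:
    """Percent acetonitrile at time t for a hold-then-linear-ramp gradient."""
    if t_min < 0 or hold_min < 0 or ramp_min <= 0:
        raise ValueError("times must be non-negative and ramp must be positive")
    if t_min <= hold_min:
        return start_pct
    if t_min >= hold_min + ramp_min:
        # ASSUMPTION: composition is held at the final value after the ramp.
        return end_pct
    fraction = (t_min - hold_min) / ramp_min
    return start_pct + (end_pct - start_pct) * fraction


# Method parameters taken from the text.
PROTEIN_METHOD = dict(hold_min=2.0, start_pct=5.0, end_pct=55.0, ramp_min=8.0)
PEPTIDE_METHOD = dict(hold_min=0.5, start_pct=5.0, end_pct=95.0, ramp_min=4.5)

if __name__ == "__main__":
    for t in (0.0, 2.0, 6.0, 10.0):
        print(f"protein method, t = {t} min: "
              f"{percent_acetonitrile(t, **PROTEIN_METHOD):.1f}% acetonitrile")
```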
Reactions to synthesize and purify thiazoline (RopC-U)- and thiazole (RopC-Z)-modified Rop variants were carried out in a total reaction volume of 1.5 mL. All stock solutions and reaction components were scaled up in accordance with the analytical scale reaction protocols described above. To synthesize RopC-U, RopC (100 μM) was reacted with MicD-F (50 μM) at pH 9.0 and 37° C. To synthesize RopC-Z, RopC (100 μM) was reacted with MicD-F (50 μM) and ArtGox (80 μM) at pH 9.0 and 37° C. Reaction progress was monitored in the manner described above for LC-MS analysis of analytical scale protein reactions. Once the reaction was complete as judged by LC-MS, RopC variants were separated from the crude reaction mixture via size exclusion chromatography (SEC).
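Scaling the 30 μL analytical reactions to 1.5 mL multiplies every component volume by the same factor. The following is a minimal sketch of that scaling, using the component volumes from the analytical protocols. The function and dictionary names are illustrative.

```python
def scale_volumes(volumes_ul: dict, target_total_ul: float) -> dict:
    """Scale all component volumes proportionally to a new total volume."""
    total = sum(volumes_ul.values())
    if total <= 0 or target_total_ul <= 0:
        raise ValueError("total volumes must be positive")
    factor = target_total_ul / total
    return {name: volume * factor for name, volume in volumes_ul.items()}


# Analytical-scale (30 uL) component volumes in microliters.
ANALYTICAL_MICD = {
    "substrate stock": 3.0,
    "reaction buffer": 24.0,
    "MicD-F (10x)": 3.0,
}
ANALYTICAL_TANDEM = {
    "substrate stock": 3.0,
    "reaction buffer": 21.0,
    "MicD-F (10x)": 3.0,
    "ArtGox (10x)": 3.0,
}

if __name__ == "__main__":
    for label, recipe in (("RopC-U", ANALYTICAL_MICD), ("RopC-Z", ANALYTICAL_TANDEM)):
        print(label)
        for name, volume in scale_volumes(recipe, 1500.0).items():
            print(f"  {name}: {volume:.0f} uL")
```

Because the proportions are unchanged, the final concentrations stated above (100 μM RopC, 50 μM MicD-F, 80 μM ArtGox) are the same as at analytical scale when the undiluted enzyme stocks are used as the 10× solutions.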
A HiLoad® 16/600 Superdex® 75 pg column (stored and operated at 4° C.) was washed with 2 column volumes (CV) of degassed and sterile filtered MilliQ water. The column was then equilibrated in running buffer (10 mM phosphate, 100 mM NaCl, 150 μM TCEP, pH 7.0) for 2 CV. The crude reaction mixture (1.5 mL) was applied to a 5 mL sample loop. The sample loop was washed with 10 mL of running buffer at 1 mL/min to load the sample onto the column. The sample was then eluted from the column by flowing running buffer at 1 mL/min for 1.5 CV. Fractions were collected in 1 mL aliquots for the entirety of sample application and elution. Fractions were analyzed via SDS-PAGE, and those containing protein of the correct molecular weight (approximately 10 kDa) were pooled and concentrated to 20 μM using Amicon Ultra-0.5 mL centrifugal filters with a 3 kDa molecular weight cut off (MilliporeSigma, Burlington, MA) according to the manufacturer's instructions. The 20 μM protein solution was equally divided into 3 parts and flash frozen in liquid nitrogen before analysis via circular dichroism.
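Volumes expressed in CV can be converted to milliliters and run times once the column bed volume is known. The sketch below estimates the bed volume from column geometry, assuming that the "16/600" designation denotes a 16 mm inner diameter and a 600 mm bed height. That reading of the product name is an assumption and should be checked against the manufacturer's data sheet. The function names are illustrative.

```python
import math


def bed_volume_ml(inner_diameter_mm: float, bed_height_mm: float) -> float:
    """Geometric volume of a cylindrical column bed in mL (1 mL = 1 cm^3)."""
    if inner_diameter_mm <= 0 or bed_height_mm <= 0:
        raise ValueError("dimensions must be positive")
    radius_cm = inner_diameter_mm / 20.0  # mm diameter -> cm radius
    height_cm = bed_height_mm / 10.0
    return math.pi * radius_cm ** 2 * height_cm


def run_time_min(volume_ml: float, flow_ml_per_min: float) -> float:
    """Time needed to pump a given volume at a constant flow rate."""
    if flow_ml_per_min <= 0:
        raise ValueError("flow rate must be positive")
    return volume_ml / flow_ml_per_min


if __name__ == "__main__":
    # ASSUMED geometry: 16 mm inner diameter, 600 mm bed height.
    cv = bed_volume_ml(16.0, 600.0)
    elution_ml = 1.5 * cv  # elution volume from the protocol, in CV
    print(f"estimated CV: {cv:.1f} mL")
    print(f"1.5 CV elution at 1 mL/min: {run_time_min(elution_ml, 1.0):.0f} min")
```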
To assess the homogeneity of isolated Rop variants, a solution of each protein was prepared (250 μg at 50 μM) in running buffer (10 mM phosphate, 100 mM NaCl, 150 μM TCEP, pH 7.0). A Superdex® 75 Increase 10/300 GL column (stored and operated at 4° C.) was washed with 2 column volumes (CV) of degassed and sterile filtered MilliQ water. The column was then equilibrated in running buffer for 2 CV. Each sample (500 μL) was applied to a 500 μL sample loop. The sample loop was washed with 2 mL of running buffer at 0.8 mL/min to load the sample onto the column. The sample was then eluted from the column by flowing running buffer at 0.8 mL/min for 1.30 CV. To assess column performance, a gel filtration standard (Bio-Rad Laboratories, Hercules, CA, catalog number 151-1901) containing 670 kDa, 158 kDa, 44 kDa, 17 kDa, and 1.35 kDa standards was used according to the manufacturer's instructions.
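The sample amount stated above (250 μg at 50 μM in 500 μL) can be cross-checked from concentration, volume, and molecular weight. The sketch below performs that check, assuming a molecular weight of roughly 10 kDa as noted for the Rop variants in the SDS-PAGE analysis. The function name is illustrative, and exact molecular weights of the individual constructs are not used.

```python
def protein_mass_ug(conc_um: float, volume_ul: float, mw_g_per_mol: float) -> float:
    """Mass of protein (micrograms) in a solution.

    uM * uL gives picomoles; pmol * g/mol gives picograms; 1e-6 converts
    picograms to micrograms.
    """
    if conc_um < 0 or volume_ul < 0 or mw_g_per_mol <= 0:
        raise ValueError("inputs must be non-negative and MW must be positive")
    picomoles = conc_um * volume_ul
    return picomoles * mw_g_per_mol * 1e-6


if __name__ == "__main__":
    # ASSUMED molecular weight of approximately 10 kDa.
    mass = protein_mass_ug(conc_um=50.0, volume_ul=500.0, mw_g_per_mol=10_000.0)
    print(f"protein loaded: {mass:.0f} ug")
```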
RopN, RopM, and RopC were exchanged into CD buffer (10 mM phosphate, 100 mM NaCl, 150 μM TCEP, pH 7.0) using Amicon Ultra-0.5 mL centrifugal filters (MilliporeSigma, Burlington, MA) with a 3 kDa molecular weight cut-off according to the manufacturer's instructions and then diluted with the same buffer to a concentration of 20 μM. For CD analysis, each Rop variant was transferred to a 1 mm quartz cuvette. Wavelength- and temperature-dependent CD spectra were collected with an AVIV Biomedical, Inc. (Lakewood, NJ) Circular Dichroism Spectrometer Model 410. For wavelength-dependent spectra, initial scans were performed from 200 to 300 nm at 25° C. in 2 nm increments with an averaging time of 5 seconds. For temperature melt experiments, the signal was monitored at 222 nm with an averaging time of 5 seconds. Temperature melts were performed from 5 to 90° C. in 1° C. increments with equilibration for 2 minutes before each measurement. Following the temperature melt, the sample was returned to 25° C. and the wavelength-dependent CD spectrum was measured once more to assess the reversibility of the melt. Raw data (θ, in mdeg) were converted to molar ellipticity ([Θ], in deg·cm²·dmol⁻¹) by

$$[\Theta] = \frac{\theta \cdot M}{10 \cdot L \cdot C}$$

where θ is the measured signal in millidegrees, M is the mean residue weight (116.00 g/mol for RopN, RopM, and RopC), L is the pathlength of the cuvette in centimeters, and C is the concentration of the sample in g/L.7 The melting temperature was determined by fitting the molar ellipticity as a function of temperature to a Boltzmann sigmoidal curve using GraphPad Prism (Version 7.04).
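The conversion and the melt model can be sketched in a few lines. The example below assumes the standard mean-residue form of the conversion, [Θ] = θ·M / (10·L·C), with θ in mdeg, M in g/mol, L in cm, and C in g/L, and the usual four-parameter Boltzmann sigmoid, Y = Bottom + (Top − Bottom) / (1 + exp((Tm − T) / Slope)). The function names, the raw signal, and the 0.2 g/L concentration (20 μM of an approximately 10 kDa protein) are illustrative assumptions. The curve fitting itself was performed in GraphPad Prism and is not reproduced here.

```python
import math


def molar_ellipticity(theta_mdeg: float, mean_residue_weight: float,
                      path_cm: float, conc_g_per_l: float) -> float:
    """Convert a raw CD signal (mdeg) to ellipticity in deg*cm^2/dmol."""
    if path_cm <= 0 or conc_g_per_l <= 0:
        raise ValueError("path length and concentration must be positive")
    return theta_mdeg * mean_residue_weight / (10.0 * path_cm * conc_g_per_l)


def boltzmann_sigmoid(temp_c: float, bottom: float, top: float,
                      tm_c: float, slope: float) -> float:
    """Four-parameter Boltzmann sigmoid; equals the midpoint at temp_c == tm_c."""
    if slope == 0:
        raise ValueError("slope must be non-zero")
    return bottom + (top - bottom) / (1.0 + math.exp((tm_c - temp_c) / slope))


if __name__ == "__main__":
    # HYPOTHETICAL raw signal and concentration, for illustration only.
    value = molar_ellipticity(theta_mdeg=-30.0, mean_residue_weight=116.0,
                              path_cm=0.1, conc_g_per_l=0.2)
    print(f"[theta] = {value:.0f} deg*cm^2/dmol")
```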
CD analysis of RopC-U and RopC-Z was performed as described for RopN, RopM, and RopC with the following modifications. Temperature melts were performed from 5 to 90° C. in 2.5° C. increments. Before measuring the temperature melt for RopC-Z, the protein was refolded by performing an initial melt from 5 to 90° C. in 2.5° C. increments as described.
This application is a continuation of PCT/US22/80459, filed Nov. 23, 2022, which claims priority to U.S. Provisional Application No. 63/295,885, filed Jan. 1, 2022, the disclosures of which are hereby incorporated by reference in their entirety for all purposes.
This invention was made with government support under the National Science Foundation, grant numbers 2002182 and 2021739. The government has certain rights in the invention.
| Number | Date | Country |
|---|---|---|
| 63295885 | Jan 2022 | US |

| | Number | Date | Country |
|---|---|---|---|
| Parent | PCT/US22/80459 | Nov 2022 | WO |
| Child | 18747424 | | US |