A Sequence Listing accompanies this application and is submitted as an ASCII text file of the sequence listing named “702581_02153_ST25.txt” which is 116,528 bytes in size and was created on May 19, 2022. The sequence listing is electronically submitted via EFS-Web with the application and is incorporated herein by reference in its entirety.
The present disclosure relates to methods to evolve macromolecular machines and improved macromolecular machines identified and made by the methods. In some embodiments, the improved macromolecular machines include improved tethered ribosomes. Also disclosed are systems and methods for making and using the same.
The ribosome is a molecular machine responsible for the polymerization of α-amino acids into proteins. In all kingdoms of life, the ribosome is made up of two subunits. In bacteria, these correspond to the small (30S) subunit and the large (50S) subunit. The 30S subunit contains the 16S ribosomal RNA (rRNA) and 21 ribosomal proteins (r-proteins), and is involved in translation initiation and decoding the mRNA message. The 50S subunit contains the 5S and 23S rRNAs and 33 r-proteins, and is responsible for accommodation of amino acid substrates, catalysis of peptide bond formation, and protein excretion.
The extraordinarily versatile catalytic capacity of the ribosome has driven extensive efforts to harness it for novel functions, such as reprogramming the genetic code. For example, the ability to modify the ribosome's active site to work with substrates beyond those found in nature such as mirror-image (D-α-) and backbone-extended (β- and γ-) amino acids, could enable the synthesis of new classes of sequence-defined polymers to meet many goals of biotechnology and medicine. Unfortunately, cell viability constraints limit the alterations that can be made to the ribosome.
To bypass this limitation, recent developments have focused on the engineering of specialized ribosome systems. The concept is to create an independent, or orthogonal, translation system within the cell dedicated to production of one or a few target proteins while wild-type ribosomes continue to synthesize genome-encoded proteins to ensure cell viability. Pioneering efforts by Hui and DeBoer, and subsequent improvements by Chin and colleagues, first created a specialized small ribosomal subunit. By modifying the Shine-Dalgarno (SD) sequence of an mRNA and the corresponding anti-Shine-Dalgarno (ASD) sequence in 16S rRNA, they generated orthogonal 30S subunits that primarily translate a specific kind of engineered mRNA and are largely excluded from translating endogenous cellular mRNAs. These advances enabled the selection of mutant 30S ribosomal subunits capable of re-programming cellular logic and enabling new decoding properties.
Unfortunately, such techniques have been restricted to the small subunit because the large subunits freely exchange between pools of native and orthogonal 30S. This limited the engineering potential of the large subunit, which contains the peptidyl transferase center (PTC) active site and the nascent peptide exit tunnel. This limitation has been addressed with a fully orthogonal ribosome (termed Ribo-T), whereby the small and large subunits are tethered together via helix h44 of the 16S rRNA and helix H101 of the 23S rRNA.
Since the initial discovery of Ribo-T and a subsequent stapled design (Ref. 15), new orthogonal Ribo-T/mRNA pairs as well as tether sequences have been optimized using directed evolution methods (Refs. 9 and 14). Specifically, tether residues have been randomized in sequence but not in length (Ref. 9), or mutations to residues surrounding a fixed RNA linker (the J5/5a junction from the Tetrahymena group I intron) have been investigated (Ref. 14). Despite these improvements, the potential of tethered ribosome systems remains limited by their low activity.
The untapped potential and existing inefficiencies of tethered ribosome systems motivate the need for new directed evolution-based approaches to engineer these systems for improving their activity.
Disclosed herein are tethered ribosomes and methods of making and using the ribosomes. Also disclosed are novel methods for evolving macromolecular machines, termed “Evolink.”
Disclosed herein are engineered ribosomes. In some embodiments, the engineered ribosomes comprise a) a small subunit comprising a 16S rRNA polynucleotide sequence or variant thereof; b) a large subunit comprising a 23S rRNA polynucleotide sequence or variant thereof; and c) a linking moiety comprising a T1 polynucleotide domain and a T2 polynucleotide domain, wherein the linking moiety links the 16S rRNA and the 23S rRNA, thereby linking the large and small ribosomal subunits. In some embodiments, the linking moiety covalently bonds helix 101 of the 23S rRNA of the large subunit to helix 44 of the 16S rRNA of the small subunit. In some embodiments, the T1 polynucleotide domain comprises 5′-GUUAUA-3′ or 5′-AGUCAAUAA-3′ and the T2 polynucleotide domain comprises 5′-UCACAAG-3′ or 5′-GACCUUCG-3′. In some embodiments, the T1 polynucleotide domain comprises 5′-GUUAUA-3′ and the T2 polynucleotide domain comprises 5′-UCACAAG-3′. In some embodiments, the T1 polynucleotide domain comprises 5′-AGUCAAUAA-3′ and the T2 polynucleotide domain comprises 5′-GACCUUCG-3′. In some embodiments, the engineered ribosome comprises SEQ ID NO: 1.
Also disclosed are polynucleotides encoding the rRNA of the engineered ribosomes, such as, for example, SEQ ID NO: 1, and cells comprising the polynucleotides. Also disclosed are methods for preparing a sequence-defined polymer using the engineered ribosomes disclosed herein.
Also disclosed are methods for evolving molecular machines comprising RNA and/or protein regions of interest that are far apart in primary sequence, but proximal in three-dimensional space. In some embodiments, the methods comprise a design step, a build step, a test step, and an analyze step, the test step involving Evolink, comprising a first PCR, a ligation, and a second PCR.
Non-limiting embodiments of the present invention will be described by way of example with reference to the accompanying figures, which are schematic and are not intended to be drawn to scale. In the figures, each identical or nearly identical component illustrated is typically represented by a single numeral. For purposes of clarity, not every component is labeled in every figure, nor is every component of each embodiment of the invention shown where illustration is not necessary to allow those of ordinary skill in the art to understand the invention.
Terminology
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of skill in the art to which the invention pertains. All definitions, as defined and used herein, should be understood to control over dictionary definitions, definitions in documents incorporated by reference, and/or ordinary meanings of the defined terms.
The indefinite articles “a” and “an,” as used herein in the specification and in the claims, unless clearly indicated to the contrary, should be understood to mean “at least one.”
A range includes each individual member. Thus, for example, a group having 1-3 members refers to groups having 1, 2, or 3 members.
It should also be understood that, unless clearly indicated to the contrary, in any methods claimed herein that include more than one step or act, the order of the steps or acts of the method is not necessarily limited to the order in which the steps or acts of the method are recited.
The modal verb “may” refers to the preferred use or selection of one or more options or choices among the several described embodiments or features contained within the same. Where no options or choices are disclosed regarding a particular embodiment or feature contained in the same, the modal verb “may” refers to an affirmative act regarding how to make or use an aspect of a described embodiment or feature contained in the same, or a definitive decision to use a specific skill regarding a described embodiment or feature contained in the same. In this latter context, the modal verb “may” has the same meaning and connotation as the auxiliary verb “can.”
In the claims, as well as in the specification above, all transitional phrases such as “comprising,” “including,” “carrying,” “having,” “containing,” “involving,” “holding,” “composed of,” and the like are to be understood to be open-ended, i.e., to mean including but not limited to. Only the transitional phrases “consisting of” and “consisting essentially of” shall be closed or semi-closed transitional phrases, respectively, as set forth in the United States Patent Office Manual of Patent Examining Procedures, Section 2111.03.
As used herein, “about,” “approximately,” “substantially,” and “significantly” will be understood by persons of ordinary skill in the art and will vary to some extent on the context in which they are used. If there are uses of the term which are not clear to persons of ordinary skill in the art given the context in which it is used, “about” and “approximately” will mean up to plus or minus 10% of the particular term and “substantially” and “significantly” will mean more than plus or minus 10% of the particular term.
The terms “nucleic acid” and “oligonucleotide,” as used herein, refer to polydeoxyribonucleotides (containing 2-deoxy-D-ribose), polyribonucleotides (containing D-ribose), and to any other type of polynucleotide that is an N-glycoside of a purine or pyrimidine base. There is no intended distinction in length between the terms “nucleic acid”, “oligonucleotide” and “polynucleotide”, and these terms will be used interchangeably. These terms refer only to the primary structure of the molecule. Thus, these terms include double- and single-stranded DNA, as well as double- and single-stranded RNA. For use in the present invention, an oligonucleotide also can comprise nucleotide analogs in which the base, sugar or phosphate backbone is modified as well as non-purine or non-pyrimidine nucleotide analogs.
Oligonucleotides can be prepared by any suitable method, including direct chemical synthesis by a method such as the phosphotriester method of Narang et al., 1979, Meth. Enzymol. 68:90-99; the phosphodiester method of Brown et al., 1979, Meth. Enzymol. 68:109-151; the diethylphosphoramidite method of Beaucage et al., 1981, Tetrahedron Letters 22:1859-1862; and the solid support method of U.S. Pat. No. 4,458,066, each incorporated herein by reference. A review of synthesis methods of conjugates of oligonucleotides and modified nucleotides is provided in Goodchild, 1990, Bioconjugate Chemistry 1(3): 165-187, incorporated herein by reference.
The term “primer,” as used herein, refers to an oligonucleotide capable of acting as a point of initiation of DNA synthesis under suitable conditions. Such conditions include those in which synthesis of a primer extension product complementary to a nucleic acid strand is induced in the presence of four different nucleoside triphosphates and an agent for extension (for example, a DNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature.
A primer is preferably a single-stranded DNA. The appropriate length of a primer depends on the intended use of the primer but typically ranges from about 6 to about 225 nucleotides, including intermediate ranges, such as from 15 to 35 nucleotides, from 18 to 75 nucleotides and from 25 to 150 nucleotides. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. A primer need not reflect the exact sequence of the template nucleic acid, but must be sufficiently complementary to hybridize with the template. The design of suitable primers for the amplification of a given target sequence is well known in the art and described in the literature cited herein.
Primers can incorporate additional features which allow for the detection or immobilization of the primer but do not alter the basic property of the primer, that of acting as a point of initiation of DNA synthesis. For example, primers may contain an additional nucleic acid sequence at the 5′ end which does not hybridize to the target nucleic acid, but which facilitates cloning or detection of the amplified product, or which enables transcription of RNA (for example, by inclusion of a promoter) or translation of protein (for example, by inclusion of a 5′-UTR, such as an Internal Ribosome Entry Site (IRES) or a 3′-UTR element, such as a poly(A)n sequence, where n is in the range from about 20 to about 200). The region of the primer that is sufficiently complementary to the template to hybridize is referred to herein as the hybridizing region.
The term “promoter” refers to a cis-acting DNA sequence that directs RNA polymerase and other trans-acting transcription factors to initiate RNA transcription from the DNA template that includes the cis-acting DNA sequence.
The terms “target,” “target sequence,” “target region,” and “target nucleic acid,” as used herein, are synonymous and refer to a region or sequence of a nucleic acid which is to be amplified, sequenced or detected.
The term “hybridization,” as used herein, refers to the formation of a duplex structure by two single-stranded nucleic acids due to complementary base pairing. Hybridization can occur between fully complementary nucleic acid strands or between “substantially complementary” nucleic acid strands that contain minor regions of mismatch. Conditions under which hybridization of fully complementary nucleic acid strands is strongly preferred are referred to as “stringent hybridization conditions” or “sequence-specific hybridization conditions”. Stable duplexes of substantially complementary sequences can be achieved under less stringent hybridization conditions; the degree of mismatch tolerated can be controlled by suitable adjustment of the hybridization conditions. Those skilled in the art of nucleic acid technology can determine duplex stability empirically considering a number of variables including, for example, the length and base pair composition of the oligonucleotides, ionic strength, and incidence of mismatched base pairs, following the guidance provided by the art (see, e.g., Sambrook et al., 1989, Molecular Cloning-A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.; Wetmur, 1991, Critical Review in Biochem. and Mol. Biol. 26(3/4):227-259; and Owczarzy et al., 2008, Biochemistry, 47: 5336-5353, which are incorporated herein by reference).
The term “amplification reaction” refers to any chemical reaction, including an enzymatic reaction, which results in increased copies of a template nucleic acid sequence or results in transcription of a template nucleic acid. Amplification reactions include reverse transcription, the polymerase chain reaction (PCR), including Real Time PCR (see U.S. Pat. Nos. 4,683,195 and 4,683,202; PCR Protocols: A Guide to Methods and Applications (Innis et al., eds, 1990)), and the ligase chain reaction (LCR) (see Barany et al., U.S. Pat. No. 5,494,810). Exemplary “amplification reaction conditions” or “amplification conditions” typically comprise either two- or three-step cycles. Two-step cycles have a high temperature denaturation step followed by a hybridization/elongation (or ligation) step. Three-step cycles comprise a denaturation step followed by a hybridization step followed by a separate elongation step.
As used herein, a “polymerase” refers to an enzyme that catalyzes the polymerization of nucleotides. “DNA polymerase” catalyzes the polymerization of deoxyribonucleotides. Known DNA polymerases include, for example, Pyrococcus furiosus (Pfu) DNA polymerase, E. coli DNA polymerase I, T7 DNA polymerase and Thermus aquaticus (Taq) DNA polymerase, among others. “RNA polymerase” catalyzes the polymerization of ribonucleotides. The foregoing examples of DNA polymerases are also known as DNA-dependent DNA polymerases. RNA-dependent DNA polymerases also fall within the scope of DNA polymerases. Reverse transcriptase, which includes viral polymerases encoded by retroviruses, is an example of an RNA-dependent DNA polymerase. Known examples of RNA polymerase (“RNAP”) include, for example, T3 RNA polymerase, T7 RNA polymerase, SP6 RNA polymerase and E. coli RNA polymerase, among others. The foregoing examples of RNA polymerases are also known as DNA-dependent RNA polymerase. The polymerase activity of any of the above enzymes can be determined by means well known in the art.
As used herein, the term “sequence defined polymer” refers to a polymer having a specific primary sequence. A sequence defined polymer can be equivalent to a genetically-encoded defined polymer in cases where a gene encodes the polymer having a specific primary sequence.
As used herein, a primer is “specific,” for a target sequence if, when used in an amplification reaction under sufficiently stringent conditions, the primer hybridizes primarily to the target nucleic acid. Typically, a primer is specific for a target sequence if the primer-target duplex stability is greater than the stability of a duplex formed between the primer and any other sequence found in the sample. One of skill in the art will recognize that various factors, such as salt conditions as well as base composition of the primer and the location of the mismatches, will affect the specificity of the primer, and that routine experimental confirmation of the primer specificity will be needed in many cases. Hybridization conditions can be chosen under which the primer can form stable duplexes only with a target sequence. Thus, the use of target-specific primers under suitably stringent amplification conditions enables the selective amplification of those target sequences that contain the target primer binding sites.
As used herein, “expression template” refers to a nucleic acid that serves as substrate for transcribing at least one RNA that can be translated into a polypeptide or protein. Expression templates include nucleic acids composed of DNA or RNA. Suitable sources of DNA for use as a nucleic acid for an expression template include genomic DNA, plasmid DNA, cDNA and RNA that can be converted into cDNA. Genomic DNA, cDNA and RNA can be from any biological source, such as a tissue sample, a biopsy, a swab, sputum, a blood sample, a fecal sample, a urine sample, a scraping, among others. The genomic DNA, cDNA and RNA can be from host cell or virus origins and from any species, including extant and extinct organisms. As used herein, “expression template” and “transcription template” have the same meaning and are used interchangeably.
As used herein, “tethered,” “conjoined,” “linked,” “connected,” “coupled” and “covalently-bonded” have the same meaning as modifiers.
As used herein, “tethered ribosome,” “engineered ribosome,” and “Ribo-T” will be used interchangeably.
As used herein, “CP” refers to a circularly permuted subunit. When CP is followed by “23S,” it refers to a circularly permuted 23S rRNA. When CP is followed by a number, the number may refer, depending on context, either to the location of the new 5′ end in a secondary structure (e.g., CP101 means the new 5′ end is in helix 101 of the 23S rRNA) or to the location of the new 5′ nucleotide (e.g., CP2861 means the new 5′ nucleotide is nucleotide 2861 of the 23S rRNA).
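The nucleotide-numbered sense of “CP” can be illustrated with a minimal Python sketch. The function name, the toy sequence, and the optional connector argument are assumptions made for illustration only; they are not taken from the sequence listing, and the sketch operates on plain strings rather than on any actual rRNA.

```python
def circularly_permute(seq: str, new_5prime_position: int, connector: str = "") -> str:
    """Rotate seq so that the 1-based position becomes the new 5' nucleotide.

    The native 3' and 5' ends are joined through an optional connector
    (hypothetical parameter; an empty string joins them directly).
    """
    if not 1 <= new_5prime_position <= len(seq):
        raise ValueError("position must lie within the sequence")
    i = new_5prime_position - 1  # convert to 0-based index
    return seq[i:] + connector + seq[:i]


# Toy 8-nucleotide sequence (hypothetical, for illustration only).
toy = "AUGCCGUA"
print(circularly_permute(toy, 4))                    # CCGUAAUG
print(circularly_permute(toy, 4, connector="GAGA"))  # CCGUAGAGAAUG
```

In this notation, a call with position 2861 on a full-length 23S rRNA string would correspond to the CP2861 example given above.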
As used herein, “translation template” refers to an RNA product of transcription from an expression template that can be used by ribosomes to synthesize polypeptide or protein.
Methods for Improved Molecular Evolution of Biological Machines and Compositions Derived Therefrom
Disclosed herein is a new technique for evolving macromolecular machines, which combines molecular biology techniques with next-generation sequencing to allow co-evolution of functionally-linked residues previously out of reach for next-generation sequencing reads with length limitations (˜300 nts). Termed Evolink, this technique is broadly applicable to large RNA or protein machines, and can be implemented with very basic techniques available in many molecular biology laboratories.
Also disclosed herein is a new sequence for an RNA machine, the ribosome, which improves upon the previous tethered ribosome (see e.g., Ref 9). The new ribosome system, termed Ribo-T v3, is capable of orthogonal protein synthesis and improved cellular fitness when supporting life.
Ribo-T v3 features new ribosomal RNA sequences that link together the 16S and 23S rRNA of the small (30S) and large (50S) subunits of the E. coli ribosome. This new RNA sequence was achieved by applying a newly invented technique called Evolink, in which distal regions of a machine (e.g., functional protein or nucleic acid sequence) encoded on a plasmid can be linked together in an amplicon for next-generation reads to enable co-evolution of previously separated parts. Evolink can be applied to any machine encoded on a plasmid, and can link together multiple regions. Such regions are abundant in many macromolecular machines (both protein and RNA), and have been precluded from high throughput evolution due to limitations in assay techniques.
Ribosome engineering is emerging as a powerful approach for expanding the catalytic potential of the protein synthesis apparatus and for elucidating its origin, evolution and function. Because the properties of the engineered ribosome might be detrimental for the general protein synthesis, the designer ribosome needs to be functionally isolated from the translation machinery synthesizing cellular proteins. The initial solution to this problem has been offered by Ribo-T, an engineered ribosome with the tethered subunits which, while translating a desired protein, could be excluded from translation of the cellular proteome. In the present disclosure, the inventors present a new paradigm for designing and evolving macromolecular machines. The inventors herein demonstrate the combination of computational modeling with a molecular biology workflow that enables high-throughput evolution of distant regions in a large molecular machine. To showcase the utility of the approach, the inventors evolved a tethered ribosome which improves upon the previous state-of-the-art by over 50% in orthogonal protein translation.
Applications and Advantages of Evolink
The improved molecular evolution methods for biological machines, and compositions derived therefrom, e.g., improved tethered ribosomes, have many applications and advantages. The following are examples only, and are not intended to be limiting.
Ribosome evolution/engineering (for example towards more efficient non-canonical amino acid incorporation); expanded genetic codes for non-canonical amino acid incorporation; enabling detailed in vivo studies of antibiotic resistance mechanisms, enabling antibiotic development process; biopharmaceutical production; orthogonal circuits in cells; synthetic biology; production of engineered peptides by incorporating new functionality inaccessible to peptides synthesized by native (or wild type) ribosome or their post-translationally modified derivatives; production of novel protease-resistant peptides that could transform medicinal chemistry.
For evolution of the ribosomes, the inventors present a new paradigm for directed evolution that integrates computational structural modeling of RNA machines as well as a new molecular biology technique that enables evolution of distant regions on molecular machines compatible with next-generation sequencing.
The resulting tethered ribosome improves upon a previous state-of-the-art design for a tethered ribosome (Ribo-T v2.0, see Ref. 9). It outperforms Ribo-T v2.0 both in supporting cellular life (faster and more robust growth) and in orthogonal protein production (improved orthogonal GFP synthesis).
Improvements to orthogonal ribosomes could play a vital role in successful directed evolution towards new functions, such as new polymerization chemistries and orthogonal genetic circuits.
The inventors further show compatibility of orthogonal, tethered ribosomes with other synthetic translation machinery, specifically the flexizyme system for non-standard amino acid incorporation to produce a peptide containing a coumarin derivative non-canonical monomer in an in vitro translation reaction. This combination of engineered translation machinery has not previously been shown.
The novel evolutionary molecular method disclosed herein greatly increases throughput of directed evolution efforts on large protein or RNA enzymes. The unmet need is the current limitation in the number of genotypes that can be linked to advantageous phenotypes. Notably, it is impossible to evolve sequence-distal parts of molecular machines and interactions between those sequences although based on structure they are likely linked in function. The invention described herein allows a research group to rapidly assess which parts of macromolecular machines are functionally linked, and then to perform directed evolution on them with readouts that allow them to link sequence-distal parts in Next Generation Sequencing (NGS) readouts without having to rely on clonal screening or using statistics to infer functional linkage. This invention could increase throughput by orders of magnitude and with greater fidelity than previously available methods.
Engineered Ribosomes
Engineered ribosomes and methods of making and using the ribosomes, are described in U.S. Pat. No. 10,590,456, Ref. 9, and Ref. 18, each of which is incorporated herein by reference in its entirety.
The engineered ribosome comprises a small subunit, a large subunit, and a linking moiety, wherein the linking moiety tethers the small subunit with the large subunit. In some embodiments, the engineered ribosome is capable of supporting translation of a sequence-defined polymer. In some embodiments, the engineered ribosome comprises a linking moiety that links the 16S and 23S rRNA of the small (30S) and large (50S) subunits of the E. coli ribosome.
In the following discussion, the rRNA component of ribosomes is the focus. As is well known in the art, ribosomes, including the engineered ribosomes disclosed herein, comprise ribosomal proteins as well as RNA. For example, bacterial ribosomes, such as E. coli ribosomes, include 31 ribosomal proteins in the 50S (large) subunit, and 21 ribosomal proteins in the 30S (small) subunit. Ribosomal proteins and methods of making ribosomes are well known in the art (see e.g., references above). While the RNA is the focus of the discussion, it is to be understood that ribosomes and their subunits also include ribosomal proteins.
In contrast to a naturally occurring ribosome, the engineered ribosome has a large and a small subunit that are not separable.
An embodiment of a portion of an engineered tethered ribosome is illustrated in
Large Subunit
The large ribosome subunit 301 comprises a subunit capable of joining amino acids to form a polypeptide chain. The large subunit 301 may comprise a ribosomal RNA comprising a first large subunit domain (“L1 polynucleotide domain” or “L1 domain”), a second large subunit domain (“L2 polynucleotide domain” or “L2 domain”), and a connector domain (“C polynucleotide domain” or “C domain”) 304, wherein the L1 domain is followed, in order, by the C domain and the L2 domain, from 5′ to 3′.
In some embodiments, the large subunit rRNA 301 may be a permuted variant of a separable large subunit rRNA. In some embodiments, the permuted variant is a circularly permuted variant of a separable large subunit rRNA. The separable large subunit may be any functional large subunit. In some embodiments, the separable large subunit may be a 23S rRNA. In some embodiments, the separable large subunit comprises a wild-type large subunit rRNA. In some embodiments, the separable large subunit is a wild-type 23S rRNA. In some embodiments, the separable large subunit is at least 90% identical, at least 91% identical, at least 92% identical, at least 93% identical, at least 94% identical, at least 95% identical, at least 96% identical, at least 97% identical, at least 98% identical, or at least 99% identical to a wild-type 23S rRNA.
In some embodiments, if the large subunit 301 is a permuted variant of a large subunit rRNA, then the polynucleotide sequence consisting essentially of the L2 domain, followed by the L1 domain, from 5′ to 3′, may be substantially identical to a large subunit rRNA. In some embodiments, the polynucleotide sequence consisting essentially of the L2 domain followed by the polynucleotide sequence consisting essentially of the L1 domain, from 5′ to 3′, is at least 90% identical, at least 91% identical, at least 92% identical, at least 93% identical, at least 94% identical, at least 95% identical, at least 96% identical, at least 97% identical, at least 98% identical, or at least 99% identical to the large subunit rRNA.
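The percent-identity thresholds recited above can be made concrete with a short Python sketch. It assumes the two sequences have already been aligned to equal length (with “-” marking gaps) and counts identity over alignment columns; the disclosure does not specify an alignment tool or an identity convention, so the function names, the convention, and the toy sequences are all hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences carry the same nucleotide.

    Assumes pre-aligned, equal-length inputs; gap columns ('-') never count as matches.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("inputs must be non-empty and of equal aligned length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True if the two aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy example (hypothetical sequences): one mismatch in ten positions.
reference = "GGCUAAGCGU"
variant = "GGCUAAGCGA"
print(percent_identity(reference, variant))                # 90.0
print(meets_identity_threshold(reference, variant, 95.0))  # False
```

For a permuted variant, the comparison described above would be made between the concatenation of the L2 and L1 domains and the separable large subunit rRNA.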
In some embodiments where the large subunit 301 is a permuted variant of a separable large subunit rRNA, the large subunit 301 may further comprise a C domain 304 that connects the native 5′ and 3′ ends of the separable large subunit rRNA. The C domain may comprise a polynucleotide having a length ranging from 1-200 nucleotides. In some embodiments, the C domain 304 comprises a polynucleotide having a length ranging from 1-150 nucleotides, 1-100 nucleotides, 1-90 nucleotides, 1-80 nucleotides, 1-70 nucleotides, 1-60 nucleotides, 1-50 nucleotides, 1-40 nucleotides, 1-30 nucleotides, 1-20 nucleotides, 1-10 nucleotides, 1-9 nucleotides, 1-8 nucleotides, 1-7 nucleotides, 1-6 nucleotides, 1-5 nucleotides, 1-4 nucleotides, 1-3 nucleotides, or 1-2 nucleotides. In certain embodiments, the C domain comprises a GAGA polynucleotide.
Small Subunit
The small subunit 302 is capable of binding mRNA. The small subunit 302 comprises a first small subunit rRNA domain (“S1 polynucleotide domain” or “S1 domain”) and a second small subunit domain (“S2 polynucleotide domain” or “S2 domain”), wherein the S1 domain is followed, in order, by the S2 domain, from 5′ to 3′. Referring again to
The small subunit rRNA 302 may be a permuted variant of a separable small subunit rRNA. In certain embodiments, the permuted variant is a circularly permuted variant of a separable small subunit rRNA. The separable small subunit may be any functional small subunit. In certain embodiments, the separable small subunit may be a 16S rRNA. In certain embodiments, the separable small subunit is a wild-type small subunit rRNA. In specific embodiments, the separable small subunit is a wild-type 16S rRNA. In some embodiments, the separable small subunit is at least 90% identical, at least 91% identical, at least 92% identical, at least 93% identical, at least 94% identical, at least 95% identical, at least 96% identical, at least 97% identical, at least 98% identical, or at least 99% identical to the small subunit rRNA.
In some embodiments, if the small subunit 302 is a permuted variant of a small subunit rRNA, then the polynucleotide sequence consisting essentially of the S1 domain followed by the polynucleotide sequence consisting essentially of the S2 domain, from 5′ to 3′, may be substantially identical to a small subunit rRNA. In certain embodiments, the polynucleotide sequence consisting essentially of the S1 domain followed by the S2 domain, from 5′ to 3′, is at least 90% identical, at least 91% identical, at least 92% identical, at least 93% identical, at least 94% identical, at least 95% identical, at least 96% identical, at least 97% identical, at least 98% identical, or at least 99% identical to the small subunit rRNA.
The small subunit may further comprise a modified anti-Shine-Dalgarno sequence. In some embodiments, the modified anti-Shine-Dalgarno sequence facilitates the translation of templates having a complementary Shine-Dalgarno sequence that differs from the Shine-Dalgarno sequence of endogenous cellular mRNAs.
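The complementarity between a Shine-Dalgarno sequence and its anti-Shine-Dalgarno partner can be sketched in Python as a reverse-complement check. This is a simplified illustration that considers only Watson-Crick pairs over the full length (G-U wobble pairs and partial pairing are ignored); the function names and the altered sequence are hypothetical, while AGGAGG/CCUCCU is the familiar E. coli core pair.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA sequence (5' to 3')."""
    seq = seq.upper()
    if set(seq) - set("ACGU"):
        raise ValueError("sequence must contain only A, C, G, U")
    return seq.translate(_COMPLEMENT)[::-1]


def is_cognate_pair(sd: str, anti_sd: str) -> bool:
    """True if the anti-SD is the exact reverse complement of the SD."""
    return reverse_complement_rna(sd) == anti_sd.upper()


wild_type_sd = "AGGAGG"      # core E. coli Shine-Dalgarno sequence
wild_type_anti_sd = "CCUCCU" # complementary stretch near the 16S rRNA 3' end
orthogonal_sd = "AUCCCA"     # hypothetical altered SD, for illustration only

print(is_cognate_pair(wild_type_sd, wild_type_anti_sd))   # True
print(is_cognate_pair(orthogonal_sd, wild_type_anti_sd))  # False
print(reverse_complement_rna(orthogonal_sd))              # UGGGAU
```

The last line shows the anti-Shine-Dalgarno sequence that would pair with the hypothetical altered Shine-Dalgarno sequence under these simplifying assumptions.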
Linking Moiety
Referring again to
In some embodiments, the linking moiety comprises a first tether domain (“T1 polynucleotide domain” or “T1 domain”) and a second tether domain (“T2 polynucleotide domain” or “T2 domain”). Referring again to
In some embodiments, the T1 domain links the S1 domain and the L1 domain, wherein the S1 domain is followed, in order, by the T1 domain and the L1 domain, from 5′ to 3′. In some embodiments, the T1 domain comprises a polynucleotide having a length ranging from 5-200 nucleotides, 5-150 nucleotides, 5-100 nucleotides, 5-90 nucleotides, 5-80 nucleotides, 5-70 nucleotides, 5-60 nucleotides, 5-50 nucleotides, 5-40 nucleotides, 5-30 nucleotides, or 5-20 nucleotides, including polynucleotides having 5 nucleotides, 6 nucleotides, 7 nucleotides, 8 nucleotides, 9 nucleotides, 10 nucleotides, 11 nucleotides, 12 nucleotides, 13 nucleotides, 14 nucleotides, 15 nucleotides, 16 nucleotides, 17 nucleotides, 18 nucleotides, 19 nucleotides, or 20 nucleotides. In some embodiments, T1 comprises polyadenine. In some embodiments, T1 comprises polyuridine. In some embodiments, T1 comprises an unstructured polynucleotide. In some embodiments, T1 comprises nucleotides that base-pair with the T2 domain.
In some embodiments, the T2 domain links the L2 domain and the S2 domain, wherein the L2 domain is followed, in order, by the T2 domain and the S2 domain, from 5′ to 3′. In some embodiments, the T2 domain comprises a polynucleotide having a length ranging from 5-200 nucleotides, 5-150 nucleotides, 5-100 nucleotides, 5-90 nucleotides, 5-80 nucleotides, 5-70 nucleotides, 5-60 nucleotides, 5-50 nucleotides, 5-40 nucleotides, 5-30 nucleotides, or 5-20 nucleotides, including polynucleotides having 5 nucleotides, 6 nucleotides, 7 nucleotides, 8 nucleotides, 9 nucleotides, 10 nucleotides, 11 nucleotides, 12 nucleotides, 13 nucleotides, 14 nucleotides, 15 nucleotides, 16 nucleotides, 17 nucleotides, 18 nucleotides, 19 nucleotides, or 20 nucleotides. In certain embodiments, T2 comprises polyadenine. In certain embodiments, T2 comprises polyuridine. In certain embodiments, T2 comprises an unstructured polynucleotide. In certain embodiments, T2 comprises nucleotides that base-pair with the T1 domain.
In embodiments having a T1 domain and a T2 domain, the T1 domain and the T2 domain may have the same number of nucleotides. In other embodiments, the T1 domain and the T2 domain may have a different number of nucleotides.
In some embodiments, the engineered ribosome may comprise a S1 domain followed, in order, by a T1 domain, a L1 domain, a C domain, a L2 domain, a T2 domain, and a S2 domain, from 5′ to 3′. In specific embodiments, the engineered ribosome may consist essentially of a S1 domain followed, in order, by a T1 domain, a L1 domain, a C domain, a L2 domain, a T2 domain, and a S2 domain, from 5′ to 3′.
In some embodiments, the ribosomal RNA and the linking moiety of an engineered ribosome comprise the general structure shown below, from 5′ to 3′, wherein 16S (5′) represents S1, 23S includes L1 and L2, and optionally, a connector (not shown), and 16S (3′) represents S2:
In some embodiments, the T1 domain comprises 5′-GUUAUA-3′ and the T2 domain comprises 5′-UCACAAG-3′. In some embodiments, the T1 domain comprises 5′-AGUCAAUAA-3′ and T2 comprises 5′-GACCUUCG-3′.
An engineered ribosome, which includes T1 5′-AGUCAAUAA-3′ and T2 5′-GACCUUCG-3′ and which comprises a variant of a 16S and a 23S rRNA sequence, adapted to accommodate the T1 and T2 sequences as disclosed herein, is termed Ribo-T v3 and is shown below as SEQ ID NO: 1.
Mutations
In certain embodiments, the engineered ribosomes disclosed herein, such as Ribo-T v3, may comprise one or more mutations (in addition to those of the rRNA of Ribo-T v3, for example). In some embodiments, the mutation is a change-of-function mutation. A change-of-function mutation may be a gain-of-function mutation or a loss-of-function mutation. A gain-of-function mutation may be any mutation that confers a new function. A loss-of-function mutation may be any mutation that results in the loss of a function possessed by the parent.
In some embodiments, the change-of-function mutation may be in the peptidyl transferase center of the ribosome. In specific embodiments, the change-of-function mutation may be in an A-site of the peptidyl transferase center. In other embodiments, the change-of-function mutation may be in the exit tunnel of the engineered ribosome.
In some embodiments, the change-of-function mutation may be an antibiotic resistance mutation. The antibiotic resistance mutation may be either in the large subunit or the small subunit. In some embodiments, the antibiotic resistance mutation may render the engineered ribosome resistant to an aminoglycoside, a tetracycline, a pactamycin, a streptomycin, an edein, or any other antibiotic that targets the small ribosomal subunit. In some embodiments, the antibiotic resistance mutation may render the engineered ribosome resistant to a macrolide, a chloramphenicol, a lincosamide, an oxazolidinone, a pleuromutilin, a streptogramin, or any other antibiotic that targets the large ribosomal subunit.
Methods
In some embodiments, methods for preparing a sequence defined polymer are provided. In some embodiments, an engineered ribosome as disclosed herein (e.g., Ribo-T v3, or functional variants thereof) is contacted with a nucleic acid encoding the sequence defined polymer under conditions for transcription (if the nucleic acid encoding the sequence defined polymer comprises DNA) by transcriptional components, and/or translation (if the nucleic acid encoding the sequence defined polymer comprises mRNA) by the tethered ribosomes. In some embodiments, translation by the tethered ribosomes may include the use of non-canonical or unnatural codons and corresponding tRNAs (e.g., using the flexizyme system). Such codons, in combination with a system such as flexizyme, may allow for the production of polymers comprising, for example, non-canonical amino acids, or non-amino acid monomers.
In some embodiments, conditions for translation by the tethered ribosomes may include the use of tethered ribosomes comprising modified anti Shine-Dalgarno sequences, and mRNA comprising complementary modified Shine-Dalgarno sequences.
In some embodiments, the sequence defined polymer is prepared in vitro, for example, in a ribosome-depleted cellular extract or purified translation system.
In some embodiments, the sequence defined polymer is prepared in vivo, for example, in a host cell, such as a bacterial host cell, e.g., an Escherichia coli cell.
Polynucleotides
Disclosed herein are polynucleotides encoding the rRNA of the engineered ribosomes of the present technology (e.g., Ribo-T v3, or functional variants thereof). In some embodiments, the polynucleotide comprises a vector. In some embodiments, a vector encoding the rRNA of an engineered ribosome of the present technology also encodes a gene, gene fragment, or other nucleic acid sequence that, after transcription, can be translated by the engineered ribosomes. By way of example, in some embodiments, the gene, gene fragment, or other nucleic acid sequence is first transcribed, either in vitro or in vivo (e.g., by bacterial host cell transcription machinery) and is then translated by the engineered ribosomes. In some embodiments, a gene, gene fragment, or other nucleic acid sequence is provided as a separate vector or as a separate nucleic acid (either as DNA or mRNA).
Cells
Disclosed herein are cells comprising one or more polynucleotides encoding rRNA of the engineered ribosomes of the present technology (e.g., Ribo-T v3, or functional variants thereof). In some embodiments, one or more of the polynucleotides comprises a vector. In some embodiments, the cells express the encoded rRNA and comprise a functional tethered ribosome as described herein (e.g., Ribo-T v3, or functional variants thereof). In some embodiments, the cell comprises a mammalian cell, a yeast cell, an insect cell, an algal cell, a plant cell, a protozoan cell, or a bacterial cell. In some embodiments, the cell is an Escherichia coli cell.
In some embodiments, the cell comprises a first protein translation mechanism and a second protein translation mechanism. In some embodiments, the first protein translation mechanism comprises a ribosome, wherein the ribosome lacks a linking moiety between the large subunit and the small subunit. In some embodiments, the first translation mechanism comprises canonical ribosomes. In some embodiments, the first translation mechanism comprises non-canonical ribosomes. In some embodiments, the second protein translation mechanism comprises an engineered, tethered ribosome as disclosed herein (e.g., Ribo-T v3, or functional variants thereof).
Methods of Directed Evolution
Disclosed herein are methods for directed evolution of a target nucleic acid sequence. In some embodiments, the target nucleic acid sequence comprises at least two regions of interest, wherein the regions of interest are separated by an intervening sequence of at least 300 nucleotides in length. In some embodiments, the methods include generating a library of test nucleic acid sequences, wherein each test nucleic acid sequence has a different nucleotide sequence for at least one of the regions of interest; screening the library for functional test nucleic acid sequences; and sequencing the functional test nucleic acid sequences. In some embodiments, the sequencing comprises: performing a first polymerase chain reaction (PCR), wherein the first PCR provides a first PCR product comprising the at least two regions of interest but does not include at least a portion of the intervening sequence; performing a ligation reaction, wherein the ligation reaction provides a first ligation product comprising the two regions of interest, wherein the two regions of interest are positioned less than 300 nucleotides apart; performing a second PCR, wherein the second PCR provides a second PCR product comprising the two regions of interest; and sequencing the second PCR product and the two regions of interest. In some embodiments, the sequencing comprises next generation sequencing (NGS).
In some embodiments, the two regions of interest are positioned more than about 5, 10, 50, 100, 200, 300, 500, 1000, 1500, 2000, 2500, or 5000 nucleotides apart. In some embodiments, the two regions of interest are positioned more than about 300 nucleotides apart.
NGS methods are well known in the art, with a variety of platforms and chemistries. One non-limiting example includes the Illumina sequencing methods.
Exemplary Advantages of Ribo-T v3
Ribo-T v3 features new tether RNA sequences (T1: 5′-AGUCAAUAA-3′ and T2: 5′-GACCUUCG-3′) as well as designed base pairs at the Tether-H101 junction. Ribo-T v3 exhibits up to a 58% improvement in orthogonal sfGFP translation and a 97% improvement in growth rate as well as a 77% improvement in lag time in SQ171 cells growing in M9 minimal media. Interestingly, cells supported by Ribo-T v3 in rich LB-Miller media exhibit comparable growth rates to those living on Ribo-T v2, but a 59% improvement in lag time. This is consistent with our evolution experiments which favor cells that emerge from stationary phase most quickly.
Additionally, we showcase Ribo-T v3's potential for expanding the chemical toolbox of orthogonal translation systems through the incorporation of a new non-canonical amino acid featuring a bulky side chain (DECP) into a peptide using an in vitro transcription and translation reaction supplemented with synthetic tRNAs charged with flexizyme.
Miscellaneous
All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention.
Preferred aspects of this invention are described herein, including the best mode known to the inventors for carrying out the invention. Variations of those preferred aspects may become apparent to those of ordinary skill in the art upon reading the foregoing description. The inventors expect a person having ordinary skill in the art to employ such variations as appropriate, and the inventors intend for the invention to be practiced otherwise than as specifically described herein. Accordingly, this invention includes all modifications and equivalents of the subject matter recited in the claims appended hereto as permitted by applicable law. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the invention unless otherwise indicated herein or otherwise clearly contradicted by context.
Abstract
RNA-based macromolecular machines, such as the ribosome, have functional parts reliant on structural interactions spanning sequence-distant regions. These features hamper the engineering potential of such machines because they limit evolutionary exploration of mutant libraries and confound 3D structure-guided design. To address these challenges, the inventors describe Evolink (evolution and linkage), a method that enables high-throughput evolution of sequence-distant regions in large molecular machines, and library design guided by computational RNA modeling to enable thorough exploration of structurally stable designs. To showcase the utility of this approach, the inventors evolved a tethered ribosome that improves upon previous iterations, with a 58% increase in orthogonal protein translation and a nearly two-fold improvement in growth in minimal media. The Evolink approach enhances the engineering of macromolecular machines for new and improved functions with implications for synthetic biology.
Introduction
Directed evolution of RNA- and protein-based enzymes can elucidate principles of biological design and generate new catalytic activities for synthetic biology1-8. Unfortunately, methods for directed evolution can be hindered by practical considerations. For example, the combinatorial space for evolution is immense (i.e., for an average protein of length 300 amino acids, there are a seemingly infinite number of theoretically possible amino acid sequences (~20^300)), and random mutagenesis alone cannot screen all possible variants9-12. In addition, macromolecular machines often have complex tertiary structures that contribute to their function13, which bring residues that are distant in primary sequence close in three-dimensional space
Despite these challenges, directed evolution of the ribosome has emerged as a promising opportunity in chemical and synthetic biology1-5,7-9,14-21. A major goal of ribosome evolution efforts is to repurpose the ribosome for diverse genetically encoded chemistries to create new classes of enzymes, therapeutics, and materials by selectively incorporating non-canonical monomers into peptides and proteins. While the natural ribosome works well for many noncanonical α-amino acids, there is poor compatibility with the natural translation apparatus for numerous classes of non-α-amino acids (e.g., backbone-extended amino acids (γ-, δ-, ε-, etc.)) leading to inefficiencies in incorporation1-4,22,23.
Methods for engineering ribosomes have been developed to address these inefficiencies7,16,17,24,25. In vivo, ribosome engineering methods have focused on the development of specialized ribosome systems. Recently, the advent of tethered ribosomes has made possible the first fully orthogonal ribosome-mRNA system in cells, where a sub-population of ribosomes are available for engineering and are independent from wild-type ribosomes supporting cell life18. Tethered ribosome systems have two key features. First, the anti-Shine-Dalgarno sequence of the 16S ribosomal RNA (rRNA) of the small 30S subunit can be mutated to function as orthogonal ribosomes that selectively initiate translation of orthogonal messenger RNAs (mRNAs) with mutated Shine-Dalgarno sequences19,26,27. Second, the small and large subunits are covalently linked together
The untapped potential and existing inefficiencies of tethered ribosome systems motivate the need for new directed evolution-based approaches to engineer these systems for improving their activity. Previous works were limited in throughput in evaluating designs (e.g., 48 and 108 members were evaluated in two different efforts9,14) due to their reliance on clonal isolation and functional testing. A bottleneck in these efforts has been that the regions of interest in the tethered ribosomes are separated by around 2,900 nucleotides (the length of the circularly permuted 23S rRNA18), and current readily available methods for next-generation sequencing are typically limited to overlapping read lengths of ˜300 nucleotides. While methods have been developed to address these shortcomings28,29, they face limitations that hinder broad applications to macromolecular machines as large as the ribosome, which feature many examples of distantly sequence encoded, but physically interacting regions
Here, to address existing limitations and facilitate evolution of ribosomes, we present a molecular biology technique called Evolink (evolution and linkage)
Results
Linking of Sequence-Distant Regions on a Single Next-Generation Sequencing Read
We aimed to develop a generalizable method, guided by computational design, for directed evolution of sequence-distant sites of macromolecular machines. As a model, we focused on evolving the tether sequences of covalently tethered ribosomes. To achieve our goal, we first developed the molecular biology methods needed, termed Evolink. Evolink is a three-step process that uses polymerase chain reaction (PCR), ligation, and a second PCR reaction to bring together sequence-separated regions of a plasmid into a single next-generation sequencing (NGS) read. This process is analogous to amplifying and closing the “backbone” of a plasmid, where the “insert” omitted from amplification is the RNA sequence separating the two regions of interest. Because Evolink relies on simple, general-purpose molecular biology (e.g., PCR and ligation), it can be adapted to any plasmid-encoded molecular machine
To start, we demonstrated the three key molecular biology steps of Evolink (termed PCR-1, LIG-1, PCR-2)
Following the first PCR, LIG-1 was carried out to cyclize the product of PCR-1 in a unimolecular ligation, proximally linking the previously distant regions. Prior to ligation, PCR-1 products that used primers compatible with restriction enzyme digests were processed with enzymatic digest and purification. Those that used 5′ phosphorylated primers or enzymatic digestion were purified and used in ligation with T4 ligase, and those which featured overlapping complementary sequences were ligated together using isothermal assembly31.
Finally, we carried out PCR-2 with a different set of primers to amplify the now-linked regions of interest. In this step, the primers are designed with the forward primer upstream of T1 and the new reverse primer downstream of T2, such that now the primers are “outside” of the regions of interest. The sequences between each respective primer and region of interest (forward primer-T1 and reverse primer-T2 in this case) contribute to the final amplicon length for sequencing. We designed primers such that the final amplicon product is ˜200 nucleotides in length and can be directly used in NGS library preparation. To demonstrate robustness, we tested the PCR-2 with four different ligation methods (Type I/II restriction enzyme digestion and ligation, blunt end ligation, and isothermal assembly), each with eight different input template amounts into the ligation (1, 2, 5, 10, 20, 30, 40, 50 ng). We observed successful generation of the desired amplicon for NGS for all 32 reactions tested
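The three steps described above (PCR-1, LIG-1, PCR-2) can be illustrated with a minimal in silico sketch that operates on plain sequence strings. Everything in it is a toy assumption made for illustration: the random backbone, flanking, and intervening sequences, the 20-nucleotide primer sites, the `keep` parameter (how much of the intervening sequence PCR-1 retains at each end), and the function names are all hypothetical and are not taken from the disclosed primers or plasmids. Only the T1 and T2 strings are the DNA equivalents of the Ribo-T v3 tethers given elsewhere in this document.

```python
import random


def pcr1_linearize(plasmid, t1_end, t2_start, keep=20):
    """PCR-1: amplify the circular plasmid outward from inside the intervening
    sequence, so the linear product omits its middle. `keep` is the (assumed)
    number of intervening nucleotides retained next to each region of interest."""
    return plasmid[t2_start - keep:] + plasmid[:t1_end + keep]


def lig1_circularize(linear):
    """LIG-1: unimolecular ligation joins the two ends of the PCR-1 product.
    A circle is stored as a string; rotating it moves the new junction
    away from the string ends."""
    half = len(linear) // 2
    return linear[half:] + linear[:half]


def pcr2_amplify(circle, fwd_primer, rev_site):
    """PCR-2: return the top-strand amplicon spanning the forward primer
    through the reverse primer's binding site."""
    doubled = circle + circle  # lets the amplicon wrap past the string end
    start = doubled.find(fwd_primer)
    end = doubled.find(rev_site, start) if start >= 0 else -1
    if start < 0 or end < 0:
        raise ValueError("primer site not found")
    return doubled[start:end + len(rev_site)]


# Toy plasmid: backbone + upstream flank + T1 + intervening + T2 + downstream flank + backbone
rng = random.Random(0)


def random_dna(n):
    return "".join(rng.choice("ACGT") for _ in range(n))


backbone_a, backbone_b = random_dna(500), random_dna(500)
upstream, downstream = random_dna(40), random_dna(40)
t1, t2 = "AGTCAATAA", "GACCTTCG"      # DNA form of the Ribo-T v3 tethers
intervening = random_dna(2900)        # stands in for the ~2,900 nt permuted 23S

plasmid = backbone_a + upstream + t1 + intervening + t2 + downstream + backbone_b
t1_end = len(backbone_a + upstream + t1)
t2_start = t1_end + len(intervening)

linear = pcr1_linearize(plasmid, t1_end, t2_start)
circle = lig1_circularize(linear)
amplicon = pcr2_amplify(circle, fwd_primer=upstream[:20], rev_site=downstream[-20:])
print(len(plasmid), len(linear), len(amplicon))
```

In this toy case the two tethers, originally 2,900 nucleotides apart, end up on a single amplicon short enough for one paired-end read, which is the property the method relies on.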
Applying Evolink to Tethered Ribosomes
With the Evolink method in hand, we sought to apply it to develop mutant tethered ribosomes for improved activity, with a focus on tether design and evolution
In the first library, we elected to broadly sample possible lengths and sequences of T1 and T2, with a degenerate library ranging from 5-15 nucleotides
Structural Fragility of the Tether-H101 Junction
Based on previous literature that showed stapled ribosome function is sensitive to the connection between the tether and 23S rRNA residues14 (henceforth referred to as the Tether-H101 junction), we wondered if the Tether-H101 junction would also be significant in the Ribo-T design context9,18
To further test and understand this hypothesis, we turned to computational modeling to gauge structural stability of the Tether-H101 junction
We conducted 3D modeling of these tethers to augment our understanding
Evolink and Computational Validation of a Designed Tether Library
With the range of tether lengths informed by the Broad Sampling Library and the designed base pairs at the Tether-H101 junction, we next performed Evolink on a tether library followed by 3D structure analysis. The library featured 6 to 9 random nucleotides for both T1 and T2 regions, with the addition of three synthetic base pairs at the Tether-H101 junction to encourage its formation and increase the independence of tether folding from junction folding
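To give a sense of scale for a library with 6 to 9 random nucleotides in each tether, the short calculation below counts the theoretical number of distinct T1/T2 pairs. It assumes fully degenerate positions (any of four bases), independently chosen lengths for T1 and T2, and fixed junction base pairs that add no diversity; the function name is hypothetical, and the result is an upper bound on sequence space rather than a statement about how many variants were actually sampled.

```python
def n_variants(min_len, max_len, alphabet_size=4):
    """Number of distinct sequences with lengths min_len..max_len (inclusive)."""
    return sum(alphabet_size ** k for k in range(min_len, max_len + 1))


t1_variants = n_variants(6, 9)   # 4^6 + 4^7 + 4^8 + 4^9
t2_variants = n_variants(6, 9)
total_pairs = t1_variants * t2_variants

print(t1_variants)   # 348160
print(total_pairs)   # 121215385600
```

A space of roughly 10^11 pairs cannot be covered by clonal isolation and individual testing, which is consistent with the motivation given above for a pooled, sequencing-based readout.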
Clonal Isolation of Enriched Genotypes and Test of Orthogonal Protein Synthesis
We then carried out a final round of randomized library building and selection. The goal of this selection was to identify candidates for clonal isolation and characterization of improved tethered ribosome genotypes. The library combined the lessons learned from our three previous libraries. First, we tested tether lengths ranging from 5 to 9 nucleotides for T1 and 6 to 9 nucleotides for T2 based on the previous round of Evolink converging to 6 and 8 nucleotides for T1 and T2, respectively
To test this cooperativity hypothesis and isolate a final winning genotype, we built 16 individual genotypes from the final library by combining the top 4 enriched sequences for the T1 and T2 regions from this round of Evolink and tested the combinations individually for their ability to carry out orthogonal superfolder GFP (sfGFP) synthesis compared to a previously improved orthogonal tethered ribosome, oRibo-T v2
Functional Characterization of Ribo-T v3
We next tested the ability of Ribo-T v3 to support cellular life in the SQ171 strain as a general measure of ribosome function9,18. We compared growth rates of cells supported by Ribo-T v3 and Ribo-T v2 on both minimal M9 media as well as rich LB-Miller media
Towards this vision of genetic code expansion with tethered ribosomes, we tested the ability of Ribo-T v3 to incorporate a non-canonical amino acid into a peptide. The idea was not to engineer Ribo-T v3 further to be better than a natural ribosome at incorporating non-canonical amino acids, but rather to show that oRibo-T was compatible with applications geared towards expanding the chemistry of life1-4,14,23. We chose a non-canonical L-α-amino acid ((R)-2-amino-3-(7-(diethylamino)-2-oxo-2H-chromene-3-carboxamido)propanoic acid, DECP) featuring a diethylamino coumarin group on its sidechain. The monomer, which features a bulky side chain, has not yet been shown to be incorporated into a peptide ribosomally, and thus presented a new and attractive target to showcase Ribo-T v3's ability to expand the chemical biology toolbox of engineered translation machinery. For demonstration purposes, and since evolved aminoacyl-tRNA synthetases do not exist for this monomer, we used a cell-free transcription and translation platform based on the PURExpress system37-39. In this platform, the monomer DECP was charged onto tRNAfMet(CAU) using a flexizyme38
In this work, we present an improved tethered ribosome platform, termed Ribo-T v3, evolved from the previous state-of-the-art (Ribo-T v2). Key to our effort was the development of Evolink, a technique for evolving regions in macromolecular machines far apart in primary sequence but proximal (and potentially functionally linked) in three-dimensional space. Evolink uses widely available molecular biology protocols (PCR and ligation) to link together distant sites of a plasmid in a single next-generation sequencing (NGS) read, alleviating previous limitations to ribosome evolution enforced by short NGS read lengths (˜300 nucleotides). We carried out four iterations of our design-build-test-analyze directed evolution experiment, featuring library designs informed by NGS results as well as structural modeling. Libraries explored simultaneous variation of tether sequence and length, as well as interaction between the tether and its junction with H101, culminating in design of a library that yielded Ribo-T v3.
Ribo-T v3 features new tether RNA sequences (T1: 5′-AGUCAAUAA-3′ and T2: 5′-GACCUUCG-3′) as well as designed base pairs at the Tether-H101 junction. Ribo-T v3 exhibits up to a 58% improvement in orthogonal sfGFP translation and a 97% improvement in growth rate as well as a 77% improvement in lag time in SQ171 cells growing in M9 minimal media. Interestingly, cells supported by Ribo-T v3 in rich LB-Miller media exhibit comparable growth rates to those living on Ribo-T v2, but a 59% improvement in lag time. This is consistent with our evolution experiments which favor cells that emerge from stationary phase most quickly. Additionally, we showcase Ribo-T v3's potential for expanding the chemical toolbox of orthogonal translation systems through the incorporation of a new non-canonical amino acid featuring a bulky side chain (DECP) into a peptide using an in vitro transcription and translation reaction supplemented with synthetic tRNAs charged with flexizyme. Looking forward, we predict that Ribo-T v3 will accelerate new advances in orthogonal translation systems to expand the palette of genetically encoded chemistries9,14,16,40. Moreover, we expect Evolink will advance directed evolution efforts, especially those for large macromolecular machines, for synthetic biology.
Materials and Methods
Library Construction
Plasmid libraries of Ribo-T tethers were generated using polymerase chain reaction (PCR) with the plasmid encoding Ribo-T v2.09 as the template. Oligonucleotides (IDT, USA) encoding degenerate bases (Ns) in place of the tethers were used to amplify the region that includes both tethers and the 23S rRNA (referred to as the insert).
Resulting amplicons were purified using the Omega Cycle-Pure kit (Omega Bio-Tek), then digested with DpnI (NEB) to remove the template. The insert and backbone were ligated using isothermal DNA assembly31, and transformed into POP2136 cells via electroporation. Post-transformation, the cells were recovered in 800 μL of SOC at 30° C. for 90-120 minutes, then plated on LB-agar plates containing 100 μg/mL carbenicillin. The plates were incubated at 30° C. for 16-18 h until colonies appeared. All colonies were scraped from the agar plates and plasmid extraction was performed using a Zymo-PURE Midiprep II kit (Zymo Research).
Selection of Tethered Ribosomes
The libraries of Ribo-T tethers were transformed into SQ171 cells lacking chromosomal ribosomes32. 100 ng of the plasmid library was transformed into 50 μL of SQ171 cells via electroporation, then recovered with 500 μL SOC at 37° C. with shaking at 250 rpm for 2 h. Afterward, another 1.5 mL of SOC was added to the cells and the final 2 mL culture was brought to 100 μg/mL carbenicillin and 0.25% sucrose. These cells were then incubated at 37° C. with shaking at 250 rpm for 16-18 h. After incubation, cells were plated onto LB-agar plates containing carbenicillin (100 μg/mL), sucrose (5% w/v), and erythromycin (250 μg/mL) and incubated at 37° C. for 20-24 h until colonies appeared. Colonies were then washed from the agar plates with LB containing 100 μg/mL carbenicillin (~5 mL of LB-carbenicillin per 100 mm petri dish) and grown to saturation at 37° C. with 250 rpm shaking. 1 mL of the solution was reserved and plasmids were extracted using the Zymo-PURE Miniprep kit (Zymo Research). The saturated culture was then subjected to passaging over 4 days in LB containing 100 μg/mL carbenicillin, and plasmids were extracted each day for sequencing.
Preparation of Amplicons for Next-Generation Sequencing
Plasmids extracted from selection cultures were linearized using PCR and purified using the Omega Cycle-Pure kit. 20 ng of the purified product was then used in a 20 μL ligation reaction containing T4 ligase (NEB) and the appropriate accompanying buffer. After incubation at 37° C. for 2 h, 2 μL of the ligation reaction was used directly in a 20 μL PCR with 15 cycles of amplification, which generated the amplicon for next-generation sequencing. The resulting product was then purified and prepared for next-generation sequencing using the NEBNext Ultra II DNA Library Prep kit (NEB). The resulting library was run on a MiSeq (Illumina) using a 150-cycle MiSeq Reagent Kit v3 (Illumina).
Analysis of Next-Generation Sequencing Results
Paired end reads from Illumina sequencing were assembled using PANDASeq39. Reads that had coverage (number of redundant reads) of less than ten were filtered and excluded from analysis. Pairs of sequences were then identified, and the following parameters were calculated.
Abundance was calculated using the following formula:
for a specific genotype i at timepoint n, and S represents the total number of unique genotypes at timepoint n after filtering as described above.
Fold-enrichment was calculated using the following formula:
for a specific genotype i at timepoint n, and abundance0 represents the abundance after selection on agar plates as previously described before any liquid culture.
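The formulas themselves are not reproduced in the text above, so the following is only a sketch of one reading that fits the surrounding definitions. It assumes that the abundance of genotype i at timepoint n is its read count divided by the summed read counts of all S genotypes that pass the coverage filter at that timepoint, and that fold-enrichment is the abundance at timepoint n divided by abundance0. The function names and the example read counts are hypothetical.

```python
def abundances(read_counts, min_coverage=10):
    """Fractional abundance of each genotype at one timepoint.

    read_counts maps a genotype, here a (T1, T2) pair, to its read count.
    Genotypes with fewer than min_coverage reads are excluded before
    normalization, following the filtering step described above.
    """
    kept = {g: c for g, c in read_counts.items() if c >= min_coverage}
    total = sum(kept.values())
    return {g: c / total for g, c in kept.items()}


def fold_enrichment(abundance_n, abundance_0):
    """Abundance at timepoint n relative to the post-plating baseline (assumed ratio)."""
    return {g: a / abundance_0[g] for g, a in abundance_n.items() if g in abundance_0}


# Made-up read counts for illustration only
counts_0 = {("AGTCAATAA", "GACCTTCG"): 50, ("GTTATA", "TCACAAG"): 150, ("AAAAA", "TTTTT"): 5}
counts_n = {("AGTCAATAA", "GACCTTCG"): 300, ("GTTATA", "TCACAAG"): 100}

a0 = abundances(counts_0)
an = abundances(counts_n)
enrichment = fold_enrichment(an, a0)
for genotype, value in sorted(enrichment.items(), key=lambda kv: -kv[1]):
    print(genotype, round(value, 3))
```

Genotypes that are absent from the baseline are skipped here rather than assigned an infinite enrichment; that choice is also an assumption of the sketch.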
Post Facto Computational Modeling of Tether
For 3D modeling studies, we set up FARFAR2 simulations34 using a crystal structure of the E. coli ribosome40 (PDB code: 4YBB). Starting from that structure, we truncated the stem-loops of 23S rRNA helix 101 (H101) and 16S rRNA helix 44 (h44), removing the residues that are deleted in all tethered ribosome constructs, and renumbered those residues to facilitate building a continuous RNA chain.
Using that initial structure as a template, we built the remaining residues of the tether using the FARFAR2 algorithm, conducted on 200 CPUs for 24 h, generating several thousand structures. We conducted simulations under two conditions: in one, only tether residues were resampled; in another, a junction on the 23S side of the tether was resampled as well.
All inputs and command files used in setting up computational modeling are available at github.com/everyday847/ribotv3_simulations.
Measurement of Orthogonal GFP Production
Combinations of potentially high-performing tether designs were identified from next-generation sequencing results and built into a plasmid containing both an orthogonal tethered ribosome gene (oRibo-T) and an orthogonal superfolder GFP (o-sfGFP) coding sequence (mutated Shine-Dalgarno sequence)9. 10 ng of sequence-confirmed plasmids were then transformed into 25 μL of BL21(DE3) cells via electroporation, recovered in 1 mL of SOC, and plated on agar plates containing 100 μg/mL of carbenicillin. Individual colonies were picked (n=3) for inoculation of 100 μL of LB media containing 100 μg/mL carbenicillin. Cultures were incubated at 37° C. for 14-16 h with 2 mm continuous linear shaking in a plate reader (Agilent BioTek Synergy H1) and absorbance at 600 nm (OD600) was monitored to ensure saturation. After cultures reached saturation, each culture was diluted to an OD600 of ~0.01 in fresh LB media containing 100 μg/mL of carbenicillin and 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) to induce transcription of the orthogonal GFP gene. Cultures were incubated at 37° C. for 14-16 h with 2 mm continuous linear shaking in a plate reader and OD600 was monitored along with fluorescence (485/528 nm excitation/emission). Orthogonal GFP production (fluorescence) was normalized by OD600.
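The normalization step can be sketched as follows. The text states only that fluorescence was normalized by OD600; the replicate averaging, the percent-change helper, and all numbers below are illustrative assumptions, and any blank or background subtraction is left out because it is not described here.

```python
from statistics import mean, stdev


def normalized_fluorescence(fluorescence, od600):
    """Per-culture fluorescence divided by OD600."""
    return [f / od for f, od in zip(fluorescence, od600)]


def summarize(fluorescence, od600):
    """Mean and sample standard deviation across replicate colonies."""
    values = normalized_fluorescence(fluorescence, od600)
    return mean(values), stdev(values)


def percent_change(test_mean, reference_mean):
    """Relative difference of a test construct versus a reference, in percent."""
    return 100.0 * (test_mean - reference_mean) / reference_mean


# Made-up endpoint readings for n=3 colonies of one construct
fluorescence = [500.0, 300.0, 800.0]
od600 = [0.5, 0.25, 1.0]
avg, sd = summarize(fluorescence, od600)
print(avg, sd)
```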
Growth Rate Characterization of Ribo-Tv3
A plasmid encoding tether sequences corresponding to Ribo-Tv3 (named pRTv3) was constructed using Gibson assembly31. pRTv3 (10 ng) was transformed into 50 μL of SQ171 cells 18 via electroporation and recovered in 500 μL of SOC at 37° C. for 2 h with shaking at 250 rpm.
After recovery, 1.5 mL of SOC was added and supplemented with 100 μg/mL carbenicillin and 0.25% (w/v) sucrose (final concentrations). After overnight (16-18 h) recovery at 37° C. with 250 rpm shaking, the cells were spun down (4000×g, 10 minutes) and plated on LB-agar plates containing 100 μg/mL carbenicillin, 5% sucrose, and 250 μg/mL erythromycin. Individual colonies were picked, and resistance to 100 μg/mL carbenicillin and sensitivity to 50 μg/mL kanamycin were checked on LB-agar plates to confirm successful swapping of ribosome plasmids in the SQ171 cells. Colonies that successfully replaced pCSacB32 with pRTv3 were carried through for analysis.
In a 96 well plate, 100 μL of LB media containing 100 μg/mL carbenicillin, 5% sucrose, and 250 μg/mL erythromycin was inoculated with a colony from an LB agar plate containing 100 μg/mL carbenicillin, 5% sucrose, and 250 μg/mL erythromycin and incubated for 14-16 h at 37° C. with 2 mm lateral shaking in a plate reader (Agilent BioTek Synergy H1). Absorbance at 600 nm was monitored to ensure cultures reached saturation. After incubation, cultures were diluted to A600 ˜0.05 (˜20-fold) in 100 μL of LB media containing 100 μg/mL carbenicillin, 5% sucrose, and 250 μg/mL erythromycin and incubated for 18 h at 37° C. with 2 mm lateral shaking, and absorbance at 600 nm (A600) was monitored.
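The source does not state how growth rates were derived from the A600 curves. One common approach, shown here only as a sketch, is a least-squares fit of ln(A600) against time over an exponential-phase window. The function name, the window bounds, and the synthetic data are assumptions made for illustration.

```python
import math


def doubling_time(times_h, a600, lo=0.05, hi=0.4):
    """Fit ln(A600) vs. time for readings with lo <= A600 <= hi.

    Returns (specific growth rate per hour, doubling time in hours).
    The window bounds are assumed values, not taken from the source.
    """
    pts = [(t, math.log(a)) for t, a in zip(times_h, a600) if lo <= a <= hi]
    if len(pts) < 2:
        raise ValueError("need at least two readings inside the window")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    if sxx == 0:
        raise ValueError("readings must span more than one time point")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    mu = sxy / sxx  # slope of the log-linear fit
    if mu <= 0:
        raise ValueError("no net growth inside the window")
    return mu, math.log(2) / mu


# Synthetic curve: starts at A600 = 0.05 and doubles every hour
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
readings = [0.05 * 2 ** t for t in times]

mu, td = doubling_time(times, readings)
print(f"growth rate: {mu:.3f} per h, doubling time: {td:.2f} h")
```

Restricting the fit to low optical densities avoids the non-linear range of plate-reader absorbance and the approach to stationary phase.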
Preparation of DECP-CME
Cyanomethyl-2-amino-3-(7-(diethylamino)-2-oxo-2H-chromene-3-carboxamido) propanoate (DECP-CME, 5) was prepared in three steps using the synthetic methods previously described 36,41. First, 268 mg (1 mmol) of 7-(diethylamino)-2-oxo-2H-chromene-3-carboxylic acid (1) and 162 mg (1 mmol) of carbonyldiimidazole (CDI) were added to a flask, which was sealed with a septum. 5 mL of anhydrous DMF was added to the flask using an oven-dried syringe, and the mixture was stirred at room temperature for 2 h. 204 mg (1 mmol) of (R)-3-amino-2-((tert-butoxycarbonyl)amino)propanoic acid (2) was added, and the mixture was stirred overnight. The product was extracted with ethyl acetate after washing the crude reaction mixture with 1 M HCl, water, and brine. Second, 38 μL (0.6 mmol) of chloroacetonitrile and 104 μL (0.75 mmol) of triethylamine were added to 223 mg (0.5 mmol) of the purified 2-((tert-butoxycarbonyl)amino)-3-(7-(diethylamino)-2-oxo-2H-chromene-3-carboxamido) propanoic acid (3) in 1 mL of DCM, and the mixture was stirred overnight. The organic layer was washed with 1 M HCl, water, and brine and dried over MgSO4. Third, 1 mL of a 50% TFA solution in DCM was added to the purified cyanomethyl 2-((tert-butoxycarbonyl)amino)-3-(7-(diethylamino)-2-oxo-2H-chromene-3-carboxamido)propanoate (4) to remove the Boc group. The final product was dried under high vacuum and obtained as a pale yellow powder (yield: 57%).
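The reagent quantities can be cross-checked against the stated millimole amounts with a short calculation. This is a sketch only: the molecular weights and densities are handbook values assumed here rather than taken from the source, and the liquid volumes are entered in microliters, the scale consistent with the stated millimole amounts.

```python
# (molecular weight in g/mol, density in g/mL); assumed handbook values
LIQUIDS = {
    "chloroacetonitrile": (75.50, 1.193),
    "triethylamine": (101.19, 0.726),
}

# Molecular weights in g/mol; assumed handbook values
SOLIDS = {
    "coumarin acid (1)": 261.27,
    "CDI": 162.15,
    "Boc-protected amino acid (2)": 204.22,
}


def mmol_from_volume(name, volume_ul):
    """Convert a liquid volume in microliters to millimoles."""
    mw, density = LIQUIDS[name]
    mass_mg = volume_ul * density  # uL x g/mL = mg
    return mass_mg / mw            # mg / (g/mol) = mmol


def mmol_from_mass(name, mass_mg):
    """Convert a solid mass in milligrams to millimoles."""
    return mass_mg / SOLIDS[name]


print(f"chloroacetonitrile, 38 uL: {mmol_from_volume('chloroacetonitrile', 38):.2f} mmol")
print(f"triethylamine, 104 uL: {mmol_from_volume('triethylamine', 104):.2f} mmol")
print(f"CDI, 162 mg: {mmol_from_mass('CDI', 162):.2f} mmol")
```

Under these assumed constants, 38 μL of chloroacetonitrile and 104 μL of triethylamine work out to roughly 0.60 mmol and 0.75 mmol.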
In brief, SQ171 cells harboring pRTv3 as the sole source of ribosomes were grown to mid-exponential phase (A600 of 0.3-0.8) in 500 mL of LB media containing 100 μg/mL carbenicillin and 250 μg/mL erythromycin. Cells were spun down and lysed by homogenization, and ribosomes were harvested using a sucrose cushion as described previously 25. Ribosome pellets were resuspended in Buffer C (10 mM Tris-acetate, pH 7.5, 60 mM ammonium chloride, 7.5 mM magnesium acetate, 0.5 mM ethylenediaminetetraacetic acid, and 2 mM dithiothreitol) and brought to a concentration of 15 μM (A260=625). Resuspended ribosomes were used directly in in vitro translation reactions.
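The relationship between the A260 value and the molar ribosome concentration can be sketched as below. The conversion factor of 24 pmol of 70S-sized ribosomes per mL per A260 unit is a commonly used approximation and is an assumption here, since the source does not state which factor was applied; the function name is hypothetical.

```python
# Assumed conversion: 1 A260 unit corresponds to about 24 pmol ribosomes per mL
PMOL_PER_ML_PER_A260 = 24.0


def ribosome_conc_um(a260):
    """Convert an A260 value (1 cm path, undiluted equivalent) to micromolar."""
    pmol_per_ml = a260 * PMOL_PER_ML_PER_A260  # pmol/mL is the same as nM
    return pmol_per_ml / 1000.0                # nM -> uM


print(f"A260 = 625 corresponds to about {ribosome_conc_um(625):.1f} uM")
```

With this factor, an A260 of 625 corresponds to approximately 15 μM ribosomes.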
Preparation of DNA Templates for RNAs.
The DNA templates for the preparation of flexizymes and tRNAs were synthesized as previously described 22,36. Sequences of the final DNA templates used for in vitro transcription by T7 polymerase are:
Preparation of Fxs and tRNAs.
Flexizymes (Fxs) and tRNAs were prepared using an in vitro transcription kit (HiScribe™ T7 High yield RNA synthesis kit, NEB E2040S) and purified by the previously reported methods 22.
Charging DECP into tRNA by Fx.
The acylation experiment was first performed with three flexizymes (eFx, dFx, and aFx). The Fx reaction was carried out as follows: 1 μL of 0.5 M HEPES (pH 7.5) or bicine (pH 8.8), 1 μL of 10 μM microhelix (mihx, a tRNA mimic), and 3 μL of nuclease-free water were mixed in a PCR tube with 1 μL of 10 μM eFx, dFx, or aFx. The mixture was heated for 2 min at 95° C. and cooled to room temperature over 5 min. 2 μL of 0.3 M MgCl2 in water was added, and the mixture was incubated for 5 min at room temperature. After the reaction mixture was incubated on ice for 2 min, 2 μL of 25 mM DECP-CME in DMSO was added. The reaction mixture was then incubated for 16 h on ice in a cold room. The optimal acylation condition was determined by measuring the acylation yield on an acidic polyacrylamide gel (pH 5.2). tRNAfMet(AUG) was charged with DECP under the condition obtained from the mihx acylation experiment. The charged tRNA was precipitated using ethanol and used for in vitro translation without further purification.
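The final concentrations in the flexizyme reaction follow from the stock concentrations and volumes listed above. The sketch below works them out; the helper name and the data layout are hypothetical, and the numbers are the ones given in the text.

```python
def final_concentrations(components):
    """components: list of (name, stock_conc, unit, volume_ul).

    A stock_conc of None marks a pure diluent such as water.
    Returns (total volume in uL, {name: (final concentration, unit)}).
    """
    total_ul = sum(vol for _, _, _, vol in components)
    finals = {}
    for name, stock, unit, vol in components:
        if stock is None:
            continue
        finals[name] = (stock * vol / total_ul, unit)
    return total_ul, finals


FX_REACTION = [
    ("HEPES or bicine buffer", 500.0, "mM", 1.0),
    ("microhelix", 10.0, "uM", 1.0),
    ("flexizyme", 10.0, "uM", 1.0),
    ("nuclease-free water", None, "", 3.0),
    ("MgCl2", 300.0, "mM", 2.0),
    ("DECP-CME in DMSO", 25.0, "mM", 2.0),
]

total_ul, finals = final_concentrations(FX_REACTION)
print(f"total volume: {total_ul:g} uL")
for name, (conc, unit) in finals.items():
    print(f"{name}: {conc:g} {unit}")
```

The listed volumes sum to 10 μL, giving 50 mM buffer, 1 μM microhelix, 1 μM flexizyme, 60 mM MgCl2, and 5 mM DECP-CME in 20% (v/v) DMSO.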
In Vitro Protein Translation Reaction.
The non-canonical substrate incorporation experiment was performed using the PURExpress™ (Δribosome, Δaa, ΔtRNA, E3315Z) system. DECP-charged tRNAfMet(CAU) was dissolved in 1 μL of 1 mM NaOAc (pH 5.2) and added to a 9 μL solution mixture containing 2 μL of Solution A, 1.2 μL of Factor mix, 1.8 μL of Ribo-T v3 (2.4 μM in final reaction), 1 μL of endogenous tRNA mixture, 1 μL of DNA plasmid (130 ng/μL), 1 μL of nuclease-free water, and 1 μL of 5 mM amino acid mixtures (Trp, Ser, His, Pro, Gln, Phe, Glu, Lys, and Thr). The reaction mixture was incubated at 37° C. for 2 h.
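As a bookkeeping check on the translation reaction, the component volumes can be summed and the stated final Ribo-T v3 concentration converted back to the stock concentration it implies. This is a sketch using only the volumes and concentrations given in the text; the variable names are hypothetical.

```python
import math

# Component volumes in microliters, as listed in the text
PURE_REACTION_UL = {
    "DECP-charged tRNA in NaOAc": 1.0,
    "Solution A": 2.0,
    "Factor mix": 1.2,
    "Ribo-T v3": 1.8,
    "endogenous tRNA mixture": 1.0,
    "DNA plasmid": 1.0,
    "nuclease-free water": 1.0,
    "amino acid mixture": 1.0,
}

reaction_ul = sum(PURE_REACTION_UL.values())

# Stock concentration implied by 2.4 uM final from 1.8 uL of ribosomes
implied_ribo_stock_um = 2.4 * reaction_ul / PURE_REACTION_UL["Ribo-T v3"]

# Final plasmid concentration from a 130 ng/uL stock
plasmid_final_ng_per_ul = 130.0 * PURE_REACTION_UL["DNA plasmid"] / reaction_ul

print(f"total volume: {reaction_ul:.1f} uL")
print(f"implied Ribo-T v3 stock: {implied_ribo_stock_um:.1f} uM")
print(f"final plasmid: {plasmid_final_ng_per_ul:.0f} ng/uL")
```

The volumes sum to 10 μL, so each 1 μL component is diluted 10-fold in the final reaction.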
The target peptide produced in the PURE reaction was purified by using MagStrep (type 3) XT beads 5% suspension (IBA Lifesciences) which selectively pull down the target peptide bearing the Strep tag (WSHPQFEK) at the C-terminal region. After pulling down the target peptide, the magnetic beads were washed with the Strep-Tactin XT wash buffer (IBA Lifesciences) and treated with 0.1% SDS solution in water. The beads were heated at 95° C. in a PCR machine to denature the target peptide bound to the beads. The magnetic beads were removed on a magnet rack and the obtained peptide was analyzed by mass spectrometry.
DNA Primers Used in this Study.
Sequences are listed 5′ to 3′. For primers indicated with ‘\Phos\’, oligos were phosphorylated with polynucleotide kinase (PNK) prior to PCR for use in blunt-end ligation. ‘N’ indicates degenerate nucleotide positions. All oligonucleotides were purchased from Integrated DNA Technologies (IDT).
Comparisons of orthogonal sfGFP production by multiple Ribo-T v3 candidates (
Full Sequences of modified ribosome RNA including tether pairs 1-16 of
It will be readily apparent to one skilled in the art that varying substitutions and modifications may be made to the invention disclosed herein without departing from the scope and spirit of the invention. The invention illustratively described herein suitably may be practiced in the absence of any element or elements, limitation or limitations which is not specifically disclosed herein. The terms and expressions which have been employed are used as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention. Thus, it should be understood that although the present invention has been illustrated by specific embodiments and optional features, modification and/or variation of the concepts herein disclosed may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this invention.
Citations to a number of patent and non-patent references may be made herein. Any cited references are incorporated by reference herein in their entireties. In the event that there is an inconsistency between a definition of a term in the specification as compared to a definition of the term in a cited reference, the term should be interpreted based on the definition in the specification.
2. The engineered ribosome of claim 1, wherein the T1 polynucleotide domain comprises 5′-GUUAUA-3′ and the T2 polynucleotide domain comprises 5′-UCACAAG-3′ (Pair 1).
3. The engineered ribosome of claim 1, wherein the T1 polynucleotide domain comprises 5′-AGUCAAUAA-3′ and the T2 polynucleotide domain comprises 5′-GACCUUCG-3′ (Pair 2).
4. The engineered ribosome of claim 1 comprising SEQ ID NO: 1, or any one of SEQ ID NOs: 1-16.
5. A polynucleotide, the polynucleotide encoding the rRNA of the engineered ribosome of claim 1.
6. The polynucleotide of claim 5, wherein the polynucleotide is in a vector.
7. The polynucleotide of claim 6, wherein the polynucleotide further comprises a gene to be expressed by the engineered ribosome.
8. The polynucleotide of claim 7, wherein the engineered ribosome comprises a modified anti-Shine-Dalgarno sequence and the gene comprises a complementary Shine-Dalgarno sequence to the engineered ribosome.
9. The polynucleotide of claim 8 wherein the gene comprises one or more codons, wherein at least one of the one or more codons comprises a non-canonical codon or an unnatural codon.
10. The polynucleotide of claim 9, wherein the non-canonical codon or the unnatural codon codes for a non-canonical amino acid, or a non-amino acid monomer.
11. A method for preparing an engineered ribosome, the method comprising expressing the polynucleotide of claim 5.
12. A cell, the cell comprising (i) the polynucleotide of claim 5, (ii) the engineered ribosome of claim 1, or both (i) and (ii).
13. A cell, the cell comprising a first protein translation mechanism and a second protein translation mechanism;
This application claims the benefit of U.S. Provisional Patent Application No. 63/202,555 filed Jun. 16, 2021, the entire content of which is incorporated herein by reference in its entirety.
This invention was made with government support under W911NF-16-1-0372 awarded by the Army Research Office, Department of Defense, and 1716766 awarded by the National Science Foundation. The government has certain rights in the invention.
Number | Date | Country
---|---|---
63202555 | Jun 2021 | US