A Sequence Listing accompanies this application and is submitted as an ASCII text file of the sequence listing named “112624-01242_ST25.txt” which is 17.6 KB in size and was created on Mar. 1, 2021. The sequence listing is electronically submitted via EFS-Web with the application and is incorporated herein by reference in its entirety.
Precise regulation of gene expression at the level of transcription or translation plays a pivotal role in establishing basic cell function, ensuring appropriate responses to environmental cues, and even enabling robust therapeutics and diagnostics1-6. Therefore, effective strategies are required to enable accurate and predictable control of the production and degradation of RNA and protein molecules7-9. In bacteria, such control has largely been achieved through engineering of the production of RNA (transcription) or protein (translation). Modulation of the −35 and −10 consensus elements has allowed for engineering of synthetic promoter libraries with a broad range of transcription efficiencies10-12. This mechanism-driven methodology has also been applied to develop tools to manipulate translation, where RNAs featuring low folding energy coupled with high-affinity Shine-Dalgarno (SD) sequences encourage efficient ribosome binding, thereby leading to accelerated translation rates13,14. Libraries of ribosome binding sites (RBSs) with varying strengths have been developed to predict and tune protein yields14-16. Other attempts have been made to control the production of gene products by developing synthetic transcriptional terminators17-19, riboregulators20-23, thermosensors24, ribozymes25, CRISPR activation and interference systems26-29, switchable guide RNAs30-32, engineering regions nearby open reading frames (ORFs)33-37, and through optimization of codon usage38,39. However, there are significant limitations to these approaches. For example, methods such as engineering strong promoters require overproduction of genetic materials and thus introduce metabolic burden to the host cells, while others such as riboregulators or thermosensors are translation-dependent and therefore unable to manipulate RNA levels.
RNA molecules in prokaryotes are typically unstable, with half-lives on the minute timescale, which allows cells to rapidly adapt to changes in the environment40,41. This rapid degradation is orchestrated by an ensemble of bacterial ribonucleases (RNases) that have been extensively studied42,43. In E. coli, which lacks 5′→3′ exonucleases, the vast majority of RNA degradation processes combine the actions of endonucleases and 3′→5′ exonucleases. Specifically, the endonucleases RNase E or RNase III target the underlying RNA molecule for primary cleavage followed by complete degradation via 3′→5′ exonucleases44. Previous studies have discovered several naturally occurring 5′ UTRs, termed RNA stabilizers, or rationally designed synthetic DNA cassettes that can increase RNA half-life by forming 5′ secondary structures45-50. These 5′ hairpin structures have been shown to be able to control heterologous mRNA half-life and have been used to regulate recombinant protein expression without introducing stress to host cells50. However, most engineered 5′ stabilizing elements have been designed and tested on an ad-hoc basis. Thus, an understanding of the relationship between stabilizer structural features and mRNA half-life has remained elusive.
Accordingly, there remains a need in the art for improved, versatile methods of modifying gene expression in a cell.
In a first aspect, the present invention provides degradation tuning RNAs (dtRNAs). The dtRNAs comprise the following components, ordered from 5′ to 3′: (a) a leader sequence comprising zero to six nucleotides, (b) a first stem-forming region, (c) a loop-forming region comprising at least three nucleotides, (d) a second stem-forming region, and (e) an insulator sequence comprising at least five nucleotides. The first stem-forming region and the second stem-forming region of the dtRNAs form a stem that is at least three nucleotides in length.
In a second aspect, the present invention provides methods of modulating the stability of an RNA. The methods comprise: (a) forming a dtRNA described herein; and (b) inserting the dtRNA into the RNA in a position that is 5′ to the functional portion of the RNA. In some embodiments, the methods increase the stability of the RNA. In other embodiments, the methods decrease the stability of the RNA.
In a third aspect, the present invention provides DNA constructs comprising a promoter that is operably connected to a sequence encoding a dtRNA described herein and a multi-cloning site or a functional RNA.
The present application is based on the inventors' development of a new class of RNA modules known as “degradation tuning RNAs” or “dtRNAs,” which are designed to form stabilizing (or destabilizing) secondary structures. As described in the paragraphs that follow and the Example, the inventors engineered a library of dtRNAs that can be inserted at the 5′ end of RNAs of interest to manipulate their stability. Based on in silico analyses, the dtRNA modules form secondary structures that impact RNA degradation without interfering with downstream RNA features, including ribosome binding site (RBS) context. As is described in the Examples, the inventors systematically characterized dtRNA structures and discovered that RNA stability is strongly correlated with several structural features, including stem length, GC content, loop size, 5′ leader sequence, and the presence of ribonuclease (RNase) cleavage sites. Manipulation of these features yielded a library of 82 dtRNAs that can be used to tune gene expression upwards by 5-fold or downwards by 8-fold, resulting in an overall dynamic range of 40-fold. The sequences of these dtRNAs are provided herein as SEQ ID NO:1-82.
This disclosure further provides methods of using dtRNAs to modulate the stability of RNAs. In the Examples, the inventors demonstrate that integration of these dtRNAs can be used to tune the dynamics of a positive feedback loop or to increase noncoding RNA levels for improved CRISPR interference. They also show that dtRNAs can be used to tune gene and RNA aptamer production in in vitro cell-free systems, and can be used to improve paper-based viral diagnostics via integration into toehold switch sensors. This disclosure, therefore, provides a variety of dtRNAs that offer non-leaky and robust transcriptional regulation.
The present invention provides degradation tuning RNAs (dtRNAs) comprising or consisting essentially of the following components, ordered from 5′ to 3′: (a) a leader sequence comprising zero to six nucleotides, (b) a first stem-forming region, (c) a loop-forming region, (d) a second stem-forming region, and (e) an insulator sequence comprising at least five nucleotides. The first stem-forming region and the second stem-forming region form a stem that is at least three nucleotides in length. In some embodiments, the dtRNA contains only one stem loop.
The terms “stem loop,” “hairpin,” and “hairpin structure” refer to a lollipop-shaped RNA secondary structure formed when two regions of a nucleic acid molecule (which are usually complementary when read in opposite directions) base pair to form a double helix that ends in an unpaired loop. As used herein, the term “stem” refers to the double-stranded portion of a stem loop, and the term “loop” refers to the single-stranded, unpaired portion of a stem loop.
The dtRNAs of the present invention comprise five essential components. On the 5′ end, the dtRNAs comprise a leader sequence comprising zero to six nucleotides. As used herein, the term “leader sequence” refers to the single-stranded region upstream (5′) of the stem loop-forming region within a dtRNA. The inventors have determined that a leader sequence is sometimes required for dtRNA stability. For example, the inventors have determined that a short leader sequence (i.e., GGG) is required for transcription by T7 RNA polymerase. Preferably, the leader sequence is about three to six nucleotides in length.
The dtRNAs of the present invention comprise a first stem-forming region and a second stem-forming region that form a stem that is at least three nucleotides in length. As used herein, the term “stem-forming region” refers to the portion of the dtRNA that forms the stem, i.e., the fully or partially double-stranded portion of a stem loop, formed via complementary base pairing. In some cases, the first and second stem-forming regions are perfectly complementary, such that all of the nucleotides in these regions participate in complementary base pairing. In other cases, the first and second stem-forming regions are not perfectly complementary, such that the stem loop comprises bulges, mismatches, and/or inner loops.
The dtRNAs of the present invention also comprise a loop-forming region comprising at least three nucleotides. As used herein, the term “loop forming region” refers to the portion of the dtRNA that forms the single-stranded loop of the stem-loop.
The dtRNAs of the present invention also comprise an insulator sequence comprising at least five nucleotides. As used herein, the term “insulator sequence” refers to a nucleotide sequence that has the ability to block the interaction of functional portions of a nucleic acid. In the present case, the insulator sequence is positioned downstream (3′) of the stem loop-forming region of the dtRNA to prevent interactions between the stem loop and any downstream portion of an RNA into which the dtRNA is inserted. Preferably, the insulator sequence is single stranded and does not form any unwanted hairpin structure that could affect the function of downstream RNA. Preferably, the insulator sequence is about 10 nucleotides in length. In certain embodiments, the insulator sequence is 5′-AAAACCAAAA-3′ (SEQ ID NO:88), a sequence that was designed by the inventors (i.e., using NUPACK) to interact minimally with surrounding sequences. However, the insulator sequence should be selected in view of the particular sequence context at hand. Ideally, the local RNA structure should be analyzed to prevent unwanted structure formation. Additionally, the insulator sequence should not contain functional sequences, such as transcriptional terminators or potential RNase cleavage sites that could negatively impact RNA function.
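The five-component layout described above can be sketched in code. The following Python example is only an illustration under stated assumptions: the class name DtRNA, the helper perfect_hairpin, and the example stem and loop sequences are hypothetical and are not taken from SEQ ID NO:1-82. The sketch covers only a perfectly complementary stem built from Watson-Crick pairs, and it does not predict secondary structure; folding should still be checked with a dedicated tool.

```python
from dataclasses import dataclass

# Watson-Crick complements for RNA. G-U wobble pairs are ignored in this sketch.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


@dataclass(frozen=True)
class DtRNA:
    """Hypothetical container for the five dtRNA components, ordered 5' to 3'."""
    leader: str      # zero to six nucleotides
    stem5: str       # first stem-forming region
    loop: str        # loop-forming region, at least three nucleotides
    stem3: str       # second stem-forming region
    insulator: str   # at least five nucleotides

    def __post_init__(self) -> None:
        for part in (self.leader, self.stem5, self.loop, self.stem3, self.insulator):
            if set(part) - set("ACGU"):
                raise ValueError(f"non-RNA character in {part!r}")
        if len(self.leader) > 6:
            raise ValueError("leader must be zero to six nucleotides")
        # Simplification: require each strand to be at least three nucleotides.
        if len(self.stem5) < 3 or len(self.stem3) < 3:
            raise ValueError("stem must be at least three nucleotides")
        if len(self.loop) < 3:
            raise ValueError("loop must be at least three nucleotides")
        if len(self.insulator) < 5:
            raise ValueError("insulator must be at least five nucleotides")

    @property
    def sequence(self) -> str:
        """Full dtRNA sequence, 5' to 3'."""
        return self.leader + self.stem5 + self.loop + self.stem3 + self.insulator


def perfect_hairpin(leader: str, stem5: str, loop: str,
                    insulator: str = "AAAACCAAAA") -> DtRNA:
    """Build a dtRNA whose second stem strand is the exact reverse complement
    of the first. The default insulator is the sequence quoted in the text."""
    return DtRNA(leader, stem5, loop, reverse_complement(stem5), insulator)


example = perfect_hairpin("GGG", "GGCAC", "GAAA")
print(example.sequence)
```

Stems containing bulges, mismatches, or inner loops can be represented by constructing DtRNA directly with a second stem strand that is not the exact reverse complement.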
In some embodiments, the dtRNAs comprise one of the 82 synthetic dtRNAs that were tested by the inventors, which are disclosed herein as SEQ ID NO:1-82.
In another aspect, the present invention provides DNA constructs comprising the dtRNAs described herein. In some embodiments, the DNA constructs comprise a promoter that is operably connected to a sequence encoding the dtRNA and a protein. As used herein, the term “DNA construct” refers to an artificially constructed segment of DNA. In some cases, the DNA construct is a vector. The term “vector,” as used herein, refers to a nucleic acid molecule capable of propagating another nucleic acid to which it is linked. The term includes the vector as a self-replicating nucleic acid structure as well as the vector incorporated into the genome of a host cell into which it has been introduced. Certain vectors are capable of directing the expression of nucleic acids to which they are operatively linked. Such vectors are referred to herein as “expression vectors”. Vectors often comprise regulatory sequences, such as promoters and enhancers, which allow for expression of a polypeptide.
As used herein, the term “promoter” refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3′ to a promoter sequence. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. Promoters that cause a gene to be expressed in most cell types at most times are commonly referred to as “constitutive promoters”. Promoters that allow the selective expression of a gene in most cell types are referred to as “inducible promoters”. Pol II or Pol III promoters may be utilized in the constructs provided herein and can be chosen by those of skill in the art for the particular purpose and RNA being generated by the construct.
As used herein, the term “operably linked” refers to a relationship between two nucleic acid sequences wherein the production or expression of one of the nucleic acid sequences is controlled by the other nucleic acid sequence. For instance, a promoter is operably linked to a nucleic acid sequence if the promoter is capable of affecting the expression of that sequence (i.e., the sequence is under the transcriptional control of the promoter).
The DNA constructs may encode any protein of interest. In the Examples, the inventors demonstrate that dtRNAs can also be applied to genes with very different sequence composition, i.e., GFP and mRFP, which have only 3% homology. Suitable proteins that may be encoded by the DNA constructs of the present invention include, for example, detectable reporter proteins (e.g., β-galactosidase, alkaline phosphatase, GFP, RFP, mCherry, luciferase), therapeutic proteins, and proteins of industrial interest. The DNA constructs may also encode RNA molecules such as iRNA, shRNAs, sgRNA for use in CRISPR/Cas gene editing or other functional RNA molecules encoded by a DNA and described more fully below. These RNAs may be under the control of a Pol III promoter.
The present invention provides methods of modulating the stability of an RNA. The methods comprise: (a) forming a dtRNA described herein; and (b) inserting the dtRNA into the RNA in a position that is 5′ to the functional portion of the RNA. When the modified RNA is transcribed, the dtRNA forms a hairpin structure that stabilizes and protects the transcribed RNA from RNase degradation.
In the Examples, the inventors demonstrate the ability of a dtRNA to stabilize or destabilize a RNA into which it is inserted. As used herein, the “stability” of a RNA refers to its half-life. In some cases, the addition of a dtRNA increases RNA stability by at least 2-fold. In some cases, RNA stability is increased at least 2-, 3-, 4-, 5-, 6-, 7-, 8-, or 9-fold or more, relative to a control lacking the dtRNA. In some cases, the addition of a dtRNA decreases RNA stability by at least 2-fold. In some cases, RNA stability is decreased at least 2-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 11-, 12-fold or more, relative to a control lacking the dtRNA.
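Because stability is defined here as half-life, a fold change in stability can be read as a ratio of half-lives. The short Python sketch below illustrates that arithmetic. It assumes simple first-order (exponential) decay, which is a common simplification and is not stated in this disclosure; the function names and the numbers used in the example are hypothetical.

```python
import math


def half_life_from_rate(k_deg: float) -> float:
    """Half-life for first-order decay with rate constant k_deg (assumed model).

    The result is in the reciprocal of the units of k_deg (e.g. minutes for 1/min).
    """
    if k_deg <= 0:
        raise ValueError("decay rate constant must be positive")
    return math.log(2) / k_deg


def stability_fold_change(t_half_with_dtrna: float, t_half_control: float) -> float:
    """Ratio of half-lives relative to a control lacking the dtRNA.

    Values above 1 indicate stabilization; values below 1 indicate destabilization.
    """
    if t_half_with_dtrna <= 0 or t_half_control <= 0:
        raise ValueError("half-lives must be positive")
    return t_half_with_dtrna / t_half_control


# Hypothetical numbers for illustration only.
print(stability_fold_change(t_half_with_dtrna=10.0, t_half_control=2.0))
```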
In the present methods, the dtRNA may be formed using any suitable method, such as chemical synthesis or PCR mutagenesis. In some cases, the dtRNA will be provided as a complementary DNA (cDNA) that encodes the desired dtRNA sequence. The dtRNA sequence is inserted into a RNA of interest (or a DNA sequence encoding a RNA of interest) using standard molecular cloning techniques.
The methods of the present invention can be used to modulate the stability of any form of RNA. Suitable forms of RNA include, without limitation, messenger RNAs (mRNAs), transfer RNAs (tRNAs), ribosomal RNAs (rRNAs), microRNAs (miRNAs), siRNAs, piRNAs, snoRNAs, snRNAs, exRNAs, scaRNAs, long ncRNAs, and other synthetic RNAs (e.g., guide RNAs used in CRISPR-based systems). Thus, the term “functional portion” is used herein to refer generally to the portion of an RNA that either (1) encodes a protein, or (2) provides a non-coding RNA function (e.g., the portion of a miRNA that binds to a target sequence).
In the Examples, the inventors demonstrate that the ability of a dtRNA to stabilize or destabilize a RNA into which it is inserted is strongly correlated with several features of the dtRNA, which include the GC content of the stem-forming region, stem length, loop size, the length of the 5′ leader sequence, and the presence of RNase cleavage sites. Thus, the dtRNAs used in the methods of the present invention are characterized in terms of these features.
As used herein, the term “GC content” refers to the percentage of nitrogenous bases in a DNA or RNA molecule that are either guanine (G) or cytosine (C). As is demonstrated in
As shown in
As shown in
The leader sequence of the dtRNAs also affects RNA stability. Long single-stranded regions make RNAs unstable because they are targets for digestion by RNases. In the Examples, the inventors determined that the leader sequence only began to destabilize the RNAs when it was at least 18 nucleotides in length. Thus, in some embodiments, the leader sequence is less than 18 nucleotides in length. In other embodiments, the leader sequence is at least 18 nucleotides in length.
The addition of RNase cleavage sites is known to destabilize RNAs. For example, the inventors have demonstrated that the addition of the RNase E cleavage site UCUUCC to an unstable dtRNA loop decreases RNA stability. Thus, in some embodiments, the dtRNA comprises one or more RNase E cleavage sites. However, any RNase cleavage site may be used in the dtRNAs of the present invention to allow for tunability of expression, i.e. to allow for precise regulation or alteration of gene expression.
The methods of the present invention may be used to either increase or decrease the stability of a RNA. The overall effect of inserting a dtRNA into a RNA sequence will depend on the dtRNA's specific combination of features that affect RNA stability. The dtRNAs tested by the inventors were capable of tuning expression upwards by up to 5-fold or downwards by up to 8-fold. Thus, the stabilizing or destabilizing effect of dtRNAs is tunable, and is readily modulated by the manipulation of the features described herein.
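The features discussed above (stem GC content, stem length, loop size, leader length, and the presence of an RNase cleavage site) can be tabulated for a candidate design with a few lines of code. The Python sketch below is a hypothetical helper, not the inventors' analysis pipeline. It assumes the four regions are supplied separately, approximates stem length as the shorter of the two stem strands (which ignores bulges and mismatches), and searches only for the RNase E motif UCUUCC quoted in the text.

```python
RNASE_E_SITE = "UCUUCC"  # RNase E cleavage motif quoted in the text


def gc_content(seq: str) -> float:
    """Percentage of bases in seq that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def dtrna_features(leader: str, stem5: str, loop: str, stem3: str) -> dict:
    """Summarize the structural features of a candidate dtRNA.

    Assumption: stem length is approximated by the shorter stem strand.
    """
    return {
        "leader_length": len(leader),
        "leader_under_18_nt": len(leader) < 18,
        "stem_length": min(len(stem5), len(stem3)),
        "stem_gc_percent": gc_content(stem5 + stem3),
        "loop_size": len(loop),
        "has_rnase_e_site": RNASE_E_SITE in (leader + stem5 + loop + stem3),
    }


# Hypothetical example sequences for illustration only.
for name, value in dtrna_features("GGG", "GGCAC", "GAAA", "GUGCC").items():
    print(name, value)
```

A summary of this kind describes a design; it does not by itself predict the resulting half-life.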
The methods of the present invention can be used to alter the stability of any form of RNA. In some embodiments, the methods are used to alter the stability of messenger RNA. The term “messenger RNA (mRNA)” refers to an RNA that encodes at least one protein. In these embodiments, the dtRNA is inserted between the transcription start site and the ribosome binding site of a DNA molecule encoding the mRNA. The term “transcription start site (TSS)” refers to the position in a DNA molecule from which transcription begins. The term “ribosome binding site (RBS)” refers to a sequence of nucleotides found upstream of the start codon of an mRNA transcript that is responsible for the recruitment of a ribosome during the initiation of translation. Positioning the dtRNA between the TSS and RBS ensures that the dtRNA is transcribed and that the insulator sequence at the 3′ end of the dtRNA will prevent the dtRNA stem loop from interfering with translation. For example, as is illustrated in
In some embodiments, the methods are used to alter the stability of a noncoding RNA. The term “noncoding RNA (ncRNA)” refers to a RNA that is not translated into a protein. In these embodiments, the dtRNA is inserted on the 5′ end of the ncRNA, i.e., 5′ to the functional portion of the ncRNA.
In some embodiments, the ncRNA is part of a CRISPR-based system. As used herein, the term “CRISPR-based system” refers to any system that utilizes CRISPR technology. Examples of CRISPR-based systems include, without limitation, CRISPR-mediated genome editing, CRISPR-mediated epigenetic editing, CRISPR-mediated chromatin immunoprecipitation, CRISPR-mediated transcriptional activation, CRISPR-mediated transcriptional repression, CRISPR-mediated live imaging of DNA/RNA, and CRISPR libraries for screening. In some embodiments, a dtRNA is used to modulate the stability of a guide RNA used in a CRISPR-based system. As illustrated in
In some embodiments, the ncRNA comprises a toehold switch. As used herein, the term “toehold switch” refers to a class of RNAs that comprise a hairpin loop that unfolds upon binding to a cognate “trigger RNA” (i.e., an RNA comprising a region that is complementary to a portion of the toehold switch). Unfolding of the hairpin loops exposes a ribosome binding site (RBS) and permits translation of a downstream protein (Green et al., 2014, Cell 159:925-939). Thus, toehold switches are programmable RNA devices that are used to regulate translation. Generally, a toehold switch is designed to comprise a long 5′ single-stranded region that is complementary to the trigger RNA. Such long single-stranded regions make RNAs unstable, as they are targets for digestion by RNases. As is demonstrated in
In some embodiments, the dtRNAs are used to amplify detection signals in a diagnostic method or device. For example, in some cases, dtRNAs of this disclosure are added to RNAs that are used to detect the presence of a pathogen-associated nucleic acid in a sample. In some embodiments, the methods described herein are adapted for high-throughput or rapid detection, for example, in a clinical setting or in the field. When a dtRNA output is coupled to a reporter element, such as fluorescence emission or a color-change through enzymatic activity, the resulting synthetic molecule serves as a genetically encoded sensor for nucleic acid detection. In the Examples, the inventors demonstrate that adding dtRNAs to toehold switch sensors designed to detect norovirus-associated nucleic acids enhances the performance of a paper-based norovirus diagnostic assay. However, the methods of the present invention can be used to improve detection of any nucleic acid of interest. For example, other applications of the methods provided herein include, without limitation, detecting pathogens or environmental contaminants; profiling species in an environment (e.g., water, mosquito populations carrying mosquito-borne viruses); profiling species in a human or animal microbiome; food safety applications (e.g., detecting the presence of a pathogenic species, determining or confirming food source/origin such as type of animal or crop plant); obtaining patient expression profiles (e.g., detecting expression of a gene or panel of genes, such as biomarkers); and wastewater monitoring applications (e.g., detecting the presence of pathogens in sewage for pathogen surveillance).
In the Examples, the inventors demonstrate that insertion of a dtRNA modulates the stability of an RNA both in vivo and in in vitro cell-free expression systems. Thus, in some embodiments, the RNA modulated by the methods of the present invention is expressed in a cell-free expression system. As used herein, the term “cell-free expression system” refers to a system in which protein is expressed in a crude extract rather than in a cell.
The present disclosure is not limited to the specific details of construction, arrangement of components, or method steps set forth herein. The compositions and methods disclosed herein are capable of being made, practiced, used, carried out and/or formed in various ways that will be apparent to one of skill in the art in light of the disclosure that follows. The phraseology and terminology used herein is for the purpose of description only and should not be regarded as limiting to the scope of the claims. Ordinal indicators, such as first, second, and third, as used in the description and the claims to refer to various structures or method steps, are not meant to be construed to indicate any specific structures or steps, or any particular order or configuration to such structures or steps. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to facilitate the disclosure and does not imply any limitation on the scope of the disclosure unless otherwise claimed. No language in the specification, and no structures shown in the drawings, should be construed as indicating that any non-claimed element is essential to the practice of the disclosed subject matter. The use herein of the terms “including,” “comprising,” or “having,” and variations thereof, is meant to encompass the elements listed thereafter and equivalents thereof, as well as additional elements. Embodiments recited as “including,” “comprising,” or “having” certain elements are also contemplated as “consisting essentially of” and “consisting of” those certain elements.
Recitation of ranges of values herein are merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. For example, if a concentration range is stated as 1% to 50%, it is intended that values such as 2% to 40%, 10% to 30%, or 1% to 3%, etc., are expressly enumerated in this specification. These are only examples of what is specifically intended, and all possible combinations of numerical values between and including the lowest value and the highest value enumerated are to be considered to be expressly stated in this disclosure. Use of the word “about” to describe a particular recited amount or range of amounts is meant to indicate that values very near to the recited amount are included in that amount, such as values that could or naturally would be accounted for due to manufacturing tolerances, instrument and human error in forming measurements, and the like. All percentages referring to amounts are by weight unless indicated otherwise.
No admission is made that any reference, including any non-patent or patent document cited in this specification, constitutes prior art. In particular, it will be understood that, unless otherwise stated, reference to any document herein does not constitute an admission that any of these documents forms part of the common general knowledge in the art in the United States or in any other country. Any discussion of the references states what their authors assert, and the applicant reserves the right to challenge the accuracy and pertinence of any of the documents cited herein. All references cited herein are fully incorporated by reference, unless explicitly indicated otherwise. The present disclosure shall control in the event there are any disparities between any definitions and/or description found in the cited references.
The following examples are meant only to be illustrative and are not meant as limitations on the scope of the invention or of the appended claims.
The ability to tune RNA and gene expression dynamics is greatly needed for biotechnological applications and has motivated the development of an assortment of engineered libraries of components, including promoters, ribosome binding sites, and transcriptional terminators. RNA and protein levels are both strongly affected by transcript stability. Native RNA stabilizers or engineered 5′ stability hairpins have been utilized to regulate transcript half-life to control recombinant protein expression. However, these methods have been mostly ad-hoc and hence lack predictability and modularity. In the following Example, the inventors report a library of RNA modules called degradation tuning RNAs (dtRNAs) that can increase or decrease transcript stability in vivo and in vitro. dtRNAs enable modulation of transcript stability over a 40-fold dynamic range in Escherichia coli while having a minimal influence on translation initiation. They harness dtRNAs in mRNAs and noncoding RNAs to tune gene circuit dynamics and enhance CRISPR interference in vivo. Use of stabilizing dtRNAs in cell-free transcription-translation reactions also tunes gene and RNA aptamer production in vitro. Finally, they combine dtRNAs with toehold switch sensors to enhance the performance of paper-based norovirus diagnostics, illustrating the potential of synthetic dtRNAs for biotechnological applications.
Results:
Modulation of RNA stability by variants of the native ompA stabilizer. Inspired by previous studies showing that naturally occurring stabilizers can be used to tune gene expression in synthetic gene circuits48,50, we inserted the 5′ UTR sequence from the E. coli ompA transcript between the promoter and RBS region to tune downstream GFP expression33,45,47 (
To explore the impact of extra secondary structures formed close to the RBS on GFP expression, another three stabilizer variants were designed and synthesized: “WT_I”, “Hp1_I” and “Hp2_I” which, compared to the above designs, form eight extra base pairs with their downstream sequence to establish a short hairpin structure near the RBS (red structure in
To rule out the possibility that the observed increase in GFP fluorescence was due to enhanced translation rather than increased RNA stability, RT-qPCR experiments were carried out for Ctrl, WT, Hp1, and Hp2 to measure their RNA levels.
Identifying functional structural features of synthetic dtRNAs. Using these two general principles for stabilizers as a framework, we proceeded to design a library of synthetic dtRNAs with a range of structural features to systematically evaluate their influence on RNA stability. In silico analyses highlighted stem length, stem GC content, loop size, 5′ spacing sequence, and 3′ insulation as the primary candidate features to investigate as part of the library (
To investigate the impact of stem length on RNA stability, another ten dtRNAs sharing the same loop sequence and optimal stem GC content but varying stem length were designed and tested (
Finally, to identify the relationship between loop size and RNA stability, we designed and tested another set of twelve dtRNA structures containing optimal stem features but varying loop sizes. In theory, tetraloops, which are hairpin loops of 4 nt, endow an RNA structure with strong thermal stability and make it highly nuclease resistant54. This effect is confirmed experimentally in
Having designed the necessary structural features to enhance RNA stability, we next explored incorporating motifs to decrease RNA stability. We first attempted to insert the previously reported RNase E cleavage site (UCUUCC, 6-nt) into dtRNA structures24,55. No significant GFP fluorescence decrease was observed when cleavage sites were inserted into the stable hairpin (
We also investigated other features such as the presence of bulges within the stem and loop GC content and found that they have insignificant effects on RNA stability (
To confirm that the observed gene expression tuning could be attributed to RNA levels, RT-qPCR experiments were performed to measure RNA levels for selected dtRNAs with a range of GFP fluorescence enhancement levels. The results show a strong correlation between relative RNA level and relative GFP fluorescence (R²=0.9406), indicating that GFP fluorescence variation is mainly due to the change in RNA levels (
Table 3. Information on additional dtRNAs constructed (a-i) followed by combined design rules and their predicted relative GFP. We defined three factors α, β and γ, which are calculated through the fitted GFP fluorescence of each feature (stem GC content (α), stem length (β) and loop size (γ)) based on
We further averaged Max GFP from all three features as Max GFP (average) and the predicted relative GFP can be calculated by the equation (we assume that each feature impacts the GFP fluorescence independently):
Predicted relative GFP=α·β·γ·Max GFP(average)
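The combined design rule above is a simple product, which the following Python sketch makes explicit. It is an illustration only: the function names are hypothetical, the factors α, β and γ are taken as inputs because their derivation from the fitted curves is not reproduced here, and the numerical values in the example are invented and are not taken from Table 3.

```python
def average_max_gfp(max_gfp_values: list[float]) -> float:
    """Average the Max GFP values obtained for the individual features."""
    if not max_gfp_values:
        raise ValueError("at least one Max GFP value is required")
    return sum(max_gfp_values) / len(max_gfp_values)


def predicted_relative_gfp(alpha: float, beta: float, gamma: float,
                           max_gfp_average: float) -> float:
    """Predicted relative GFP = alpha * beta * gamma * Max GFP (average).

    alpha: stem GC content factor; beta: stem length factor; gamma: loop size factor.
    The product form reflects the stated assumption that each feature affects
    GFP fluorescence independently.
    """
    return alpha * beta * gamma * max_gfp_average


# Invented values for illustration only.
max_avg = average_max_gfp([3.0, 4.0, 5.0])
print(predicted_relative_gfp(alpha=0.5, beta=0.5, gamma=1.0, max_gfp_average=max_avg))
```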
In all, we systematically designed and tested a library of 82 synthetic dtRNAs and identified the functional structural features affecting RNA stability. Each dtRNA shares a single hairpin structure with an insulator sequence at the 3′ end to prevent interference between the stability hairpin and RBS region. By tuning combinations of structural features, dtRNAs enable quantitative control over gene expression with a wide dynamic range of 40-fold from the least to the most stable sequences (
Modulation of gene circuit dynamics and noncoding RNA levels. As an initial test of the utility of dtRNAs, we selected two dtRNAs with the top GFP enhancement performance (dR1 and dR6) to incorporate into a LuxR/LuxI quorum sensing (QS) regulatory circuit and measure their impact on downstream GFP expression. It can be seen in
To explore this impact on nonlinear gene circuit dynamics, synthetic dtRNAs were inserted into a LuxR/LuxI QS-based positive feedback loop to tune the bistability of each circuit56,57. The constitutive promoter in circuits C_dR1 and C_dR6 was replaced with a pLux promoter such that the LuxR gene can activate itself to form a positive feedback topology (circuit H_dR1 and H_dR6) (
To explore the tunability dtRNAs offer for noncoding RNA levels, we built a CRISPR interference system to control small guide RNA (sgRNA) levels by redesigning the 5′ sequence of an sgRNA targeting a GFP promoter with dR1 and dR6, and two other top-performing dtRNAs (dR15 and dR19). When transcribed from a weak promoter, each redesigned sgRNA can guide dCas9 to bind with the cognate promoter region to inhibit downstream GFP expression (
In vitro regulation of gene and RNA aptamer production by synthetic dtRNAs. Cell-free expression systems have been widely used in synthetic biology, metabolic engineering and in vitro diagnostics6,60-62. To test whether synthetic dtRNAs enable regulation of gene expression in cell-free expression systems, we constructed two circuits with dtRNAs that showed good performance with sgRNAs (dR15 and dR19) along with two additional circuits with randomly selected top-performing dtRNAs (dR4 and dR7) to measure their impact on GFP expression in cell-free transcription-translation expression systems (
We first performed measurements without the addition of RNase inhibitor to each reaction (-RNase inhibitor group). The result in
To better quantify gene expression enhancement due to RNA stability increases, we constructed a dynamic model to describe dtRNA-regulated GFP expression enhancement in both scenarios (
Stabilizing efficacy, defined as the ratio between the steady state GFP concentration without RNase inhibitor and with RNase inhibitor treatment, measures the robustness of dtRNAs in vitro against RNase activities, which could impact dtRNAs effectiveness (compare
To further investigate the effect of dtRNA on RNA stability in vitro, we next coupled dtRNAs to the RNA aptamer Broccoli to directly measure whether dtRNAs can influence RNA levels in cell-free expression systems. 65 dtRNAs spanning the dynamic range of the library were selected, designed and ligated to the 5′ end of the Broccoli aptamers, and their fluorescence was measured using a plate reader. It can be seen that most of the dtRNAs significantly enhanced the aptamer fluorescence (
Improved viral diagnostics using hybrid dtRNA/toehold switch sensors. The toehold switch is a programmable RNA device that can interact with a user-specified target RNA to activate translation of a protein of interest20 and has been widely applied in areas including in vitro viral diagnostics6,63, gene circuit engineering22,60,64 and education65. Toehold switches feature a long single-stranded region known as a toehold at their 5′ end that is designed to initiate binding with the target RNA. However, transcripts with excessive 5′ single-stranded regions could be easily targeted and digested by RNases (
To test these hybrid sensors in paper-based diagnostic systems, synthetic norovirus RNA was introduced to paper-based devices containing cell-free reactions and DNA templates for transcription of the sensors without RNase inhibitor present. We observed that sensors with dtRNAs (dR19_1, dR19_4 and dR19_5) exhibited faster detection speed (1.22 hours, ΔOD575=0.4) without leaky expression, while the original sensor (Ori) without dtRNA only showed detectable signals after 1.74 hours of induction (
A great many methods have been developed to meet the increasing demand for precise and predictable control of gene expression. Naturally occurring RNA stabilizers or engineered 5′ stability hairpins that thwart RNase activity hold the potential to directly control RNA half-life and have been applied to regulate cellular RNA levels as well as heterologous protein yields45-48. In this study, we systematically identify the RNA structural features that influence stability, design a library of synthetic dtRNAs, and use them to tune gene expression levels in vivo and in vitro. We find that application of structure-stability relationships discerned from the library enables semi-quantitative predictions of the performance of newly designed dtRNAs. Moreover, we demonstrate multiple applications of dtRNAs by using them to increase the strength of CRISPR interference, tune gene circuit behavior and aptamer stability, and to enhance the speed and stability of paper-based viral diagnostics.
Previous studies have investigated 5′ stabilizing elements with an interest in increasing mRNA stability and understanding RNase substrate specificity46,48,50, while others have designed 5′ UTRs to manipulate translation of mRNAs14,51. Specifically, portable mRNA-stabilizing 5′-UTR sequences have been demonstrated to increase GFP mRNA stability50. In this work, we engineered and tested a more comprehensive set of hairpins with systematically designed secondary structures that can not only increase RNA stability but also destabilize RNA molecules. Expanding on earlier work analyzing the free energy (ΔG) of hairpin designs48, we systematically explored the structural feature space with the aim of elucidating the structure-stability relationships of dtRNAs. Our results demonstrate that 5′ UTR RNA secondary structure can be engineered with varying features such as stem-loop length, sequence context and RNase cleavage sites to achieve a wide dynamic range of RNA stability regulation, in turn allowing precise control over gene expression and non-coding RNA activity. Moreover, compared to engineered synthetic promoter and RBS libraries, it is relatively easy to construct dtRNAs following our design rules in diverse engineering scenarios. Similar to previous studies, our work also confirms that gene expression regulation by dtRNA modules exerts little effect on cell growth, indicating that compared to the other gene expression regulation methods, RNA manipulation imposes less burden on the cell economy (
When assessing mRNA lifetime, it is important to note that degradation and translation are closely intertwined processes. Thus, only considering one to determine the final protein yield could overestimate the capabilities of dtRNAs. After being transcribed, mRNA is competitively targeted by RNases and ribosome subunits, where, in theory, a stable mRNA has a higher chance for ribosome binding than unstable mRNA. Furthermore, highly translated genes can also be shielded by active ribosomes that serve to protect against RNase activities. This positive side effect of enhanced RNA stability can be observed in our RT-qPCR results where RNA fold increase can account for over 94% but still not all GFP expression increases (
We also applied our dtRNA modules to directly upregulate gene expression and tune RNA aptamer levels in cell-free expression systems with a 10-fold dynamic range. An RNA-based device, the toehold switch sensor, is optimized with our dtRNAs for rapid paper-based viral diagnostics. Higher detection sensitivity with low expression leakage is achieved using the redesigned sensors, making them more compatible for potential field-ready diagnostics. More importantly, dtRNA robustness against RNase activities suggests that they can also be used to enhance expression in crude-extract-based cell lysates, which are substantially cheaper to produce but have higher RNase levels68,69. Previous work has shown that native 5′ UTR structures can be used to enhance gene expression in such cell-free reactions70. Overall, our work provides a purely RNA-based method to regulate gene expression in vivo and in vitro that can be used for a variety of different biotechnological applications.
Strain, media and culture condition. All molecular cloning experiments were performed in Escherichia coli DH10B (Invitrogen). Synthetic circuits (
Plasmid construction. Most genes were obtained from the iGEM Registry (http://parts.igem.org/Main_Page). Plasmids were constructed based on general molecular biology techniques and standardized Biobrick cloning methods as previously described. For example, to assemble the GFP gene (E0040) with a strong RBS (B0034), plasmids with the GFP gene were digested with XbaI and PstI as the cloning insert while plasmids containing the RBS were digested with SpeI and PstI as the cloning vector. Digested plasmids were then separated on a 1% TAE agarose gel by gel electrophoresis. Gel bands with the correct insert or vector size were selected and purified using the PureLink gel extraction Kit (Invitrogen). Gel extraction products with insert and vector were ligated by T4 DNA ligase (New England Biolabs, NEB) and transformed into E. coli DH10B. Transformed cells were plated on LB agar plates with 100 μg/mL ampicillin or 50 μg/mL kanamycin for screening. Finally, plasmids extracted with the GenElute HP MiniPrep Kit (SIGMA-ALDRICH) were confirmed through gel electrophoresis (digested by EcoRI and PstI) and Sanger DNA Sequencing (Biodesign Sequencing Core, ASU). Similar Biobrick cloning steps were taken for the following genetic components until the entire circuit was constructed. All names and Biobrick numbers of the genetic components can be found in Table 1.
For construction of the circuits with dtRNAs or sgRNAs, each structure was analyzed and designed with the NUPACK design package72 and the respective DNA oligos were synthesized by IDT. Biobrick XbaI and PstI cleavage sites were added at the 5′ or 3′ end of the DNA oligos. DNA oligos for the same dtRNA were diluted with ddH2O, hetero duplexed on a heat block, and ligated into the plasmids with the promoter digested by XbaI and PstI. The guide sequences of the sgRNA and redesigned sgRNAs were designed and then synthesized by IDT. The sequences 5′-GCTA-3′ and 5′-AAC-3′ were added to the sgRNA forward and reverse primers, respectively. DNA oligos for the same sgRNA were diluted with ddH2O, hetero duplexed on a heat block and ligated to the vector digested by SapI as previously described. The remaining cloning steps were the same as for the general gene circuit construction.
Plate reader OD and fluorescence measurements. All sequencing-confirmed gene circuits were transformed into E. coli DH10B. Single colonies were picked and cultured in 4 mL of LB medium with 100 μg/mL ampicillin. Cells were shaken until they were evenly distributed in the medium, of which 300 μL was transferred into a 96-well plate for OD and fluorescence measurements. Optical density (OD600) and fluorescence (excitation: 485 nm; emission: 530 nm) were measured every 15 minutes at 37° C. under continuous plate shaking (Synergy H1 Hybrid Reader, BioTek) at 220 rpm over 21 hr. For all the experiments, at least three random colonies were picked as biological replicates. For stable protein expression, we chose the 16-hour data point for further analysis in the study unless otherwise specified.
Flow cytometry measurements. We used an Accuri C6 flow cytometer (Becton Dickinson) to perform the flow cytometry measurements. Cultured samples were collected and run through the flow cytometer. For each sample, 20,000 individual cells were analyzed at the slow flow rate; the fluorescence intensity was not normalized by cell density because flow cytometry measures single cells. All the results were then collected in log mode and further analyzed in MATLAB (MathWorks).
RT-qPCR. For selected gene circuits, three biological replicates were used to quantify the mRNA levels. Total RNA was extracted from 2 mL of cell culture using the Quick-RNA Fungal/Bacterial Miniprep Kit (Zymo Research). Purified RNA was treated on-column with DNaseI (Zymo Research) to remove residual DNA. Total RNA was eluted with nuclease-free water and its concentration was quantified for the following experiments. cDNA was then synthesized from each RNA sample using iScript Reverse Transcription Supermix for RT-qPCR (Bio-Rad). For each 20-μL reaction, about 1 μg RNA was used for reverse transcription. qPCR was performed for each cDNA sample using iTaq Universal SYBR Green Supermix (Bio-Rad) and the reaction was detected using the iQ5 Real-Time PCR detection system (Bio-Rad). Specifically, each cDNA sample was run with an additional technical replicate, the total reaction volume for each sample was 10 μL, and prokaryotic 16S rRNA was set as the endogenous control. We used previously reported primers (IDT) for both 16S rRNA and GFP amplification32. The primer sequences for 16S rRNA are 5′-GAATGCCACGGTGAATACGTT-3′ (SEQ ID NO:83) (rrnB, forward, starting at the 1361st nucleotide) and 5′-CACAAAGTGGTAAGCGCCCT-3′ (SEQ ID NO:84) (rrnB, reverse, starting at the 1475th nucleotide), and the GFP primer sequences are 5′-CAGTGGAGAGGGTGAAGGTGA-3′ (SEQ ID NO:85) (forward, starting at the 87th nucleotide) and 5′-CCTGTACATAACCTTCGGGCAT-3′ (SEQ ID NO:86) (reverse, starting at the 283rd nucleotide). Bio-Rad CFX Manager software version 3.1 was used to analyze the data. To determine the fold change in mRNA levels, we averaged the Ct values of 16S rRNA and GFP across their biological replicates and calculated ΔCt as Ct(target) − Ct(16S). The fold change for each sample was then calculated relative to the biological control (circuit without dtRNA regulation) as 2^(−ΔΔCt).
The minimum information for publication of quantitative real-time PCR (MIQE) is also provided in Table 2.
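The fold-change calculation described above can be sketched as follows. This is an illustrative implementation of the standard 2^(−ΔΔCt) calculation with Ct values averaged across replicates before subtraction; the function and argument names are hypothetical and this is not the CFX Manager workflow itself.

```python
from statistics import mean


def fold_change(ct_gfp_sample, ct_16s_sample, ct_gfp_control, ct_16s_control):
    """Relative mRNA level of a sample versus the no-dtRNA control.

    Each argument is a list of Ct values from replicates. 16S rRNA serves
    as the endogenous reference.
    """
    # Delta Ct = Ct(target) - Ct(16S), using replicate-averaged Ct values
    delta_ct_sample = mean(ct_gfp_sample) - mean(ct_16s_sample)
    delta_ct_control = mean(ct_gfp_control) - mean(ct_16s_control)
    # Delta-delta Ct compares the sample with the biological control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)
```

A sample whose GFP Ct is two cycles lower than the control at equal 16S Ct gives a fold change of 4.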
Hysteresis experiments. We used our previously reported protocol to perform the hysteresis experiments33. In detail, gene circuits of the synthetic positive feedback loop were constructed in a low-copy plasmid and transformed into E. coli K-12 MG1655 strain with lacI−/−. Single colonies for three replicates were picked for each sample and cultured at 37° C., 220 rpm overnight in LB medium with 50 μg/mL kanamycin. For OFF-ON experiments, overnight cultured cells (initial OFF cells) were diluted into fresh LB medium at a 1:100 ratio and distributed into 5-mL polypropylene round-bottom tubes (Falcon) with various 3OC6HSL concentrations. Fluorescence of each sample was measured using an Accuri C6 flow cytometer (Becton Dickinson). In our experiments, GFP fluorescence became stable after ˜12 hours of induction. For ON-OFF experiments, cells were first induced by 2 nM 3OC6HSL for 12 hours to ensure full induction as the initial ON state. These ON state cells were then collected through low speed centrifugation, washed once and further diluted into fresh LB medium at a 1:100 ratio. Various 3OC6HSL concentrations were then added to each sample for culture. Flow cytometry measurements were performed at 12 and 16 hours, respectively. We used 16-hour results as the ON-OFF dataset in
RNA aptamer assay. Sequences of dtRNA-regulated Broccoli aptamers were designed using NUPACK and synthesized by IDT. T7 promoter and terminator sequences were added to each redesigned aptamer through PCR. Amplified double-stranded DNA molecules were purified using the MinElute PCR purification kit (QIAGEN) and their concentrations were measured with a Nanodrop spectrophotometer. Purified DNA was then diluted and mixed with cell-free transcription-translation systems (PURExpress, NEB). Each sample, with 4 μL of reaction mix, was loaded into a 384-well plate for a five-hour plate reader measurement, and the fluorescence of each sample reached its peak value after about two hours of incubation at 37° C. In this experiment, we used a 30-nM DNA concentration for each reaction and the fluorescence was measured every 90 seconds.
Hybrid dtRNA/toehold sensor plasmid construction. Synthetic DNAs encoding the redesigned norovirus-specific toehold sensors were synthesized by IDT. All cloning steps followed general molecular biology techniques. Synthetic DNAs were amplified by PCR and inserted into the plasmid backbone using Gibson assembly74. Complete plasmids were further confirmed by Sanger sequencing (Biodesign Sequencing Core, ASU). Plasmids and primers were described previously63.
Paper-based cell-free systems preparation. The protocols used for the paper-based cell-free reactions have been described previously63. Briefly, cell-free transcription-translation systems (PURExpress, NEB) were used to prepare the freeze-dried samples. The volume for each component of the reaction sample is 40% of cell-free solution A, 30% of cell-free solution B, 2% RNase inhibitor (Roche, 03335402001, distributed by MilliporeSigma) if needed, 2.5% chlorophenol red-β-D-galactopyranoside (Roche, 10884308001, distributed by MilliporeSigma, 24 mg/mL) and the remaining volume for toehold sensor DNA, lacZω and nuclease-free water. The final concentration for the synthetic DNA plasmid of each paper device is 30 ng/μL. The paper for the assays was first cut to a 2-mm diameter using a biopsy punch and transferred into PCR tubes. The prepared cell-free reaction mix (1.8 μL for each device) was then added into the PCR tubes with the paper disks and flash frozen in liquid nitrogen. Frozen devices were transferred to a lyophilizer to freeze-dry overnight. Completely dry paper devices were ready for use as viral diagnostics and can be stored at room temperature as previously described60,63.
This section describes the method for in silico design of the synthetic dtRNA library through NUPACK design package1. The same method is also used to design new dtRNAs for in vitro gene expression regulation and toehold sensor optimization for paper-based viral diagnostics.
Definition of dtRNA secondary structure domains. We first specify the secondary structure domains of the dtRNA library. A single hairpin is set to be the basic structural frame for each dtRNA. As shown in
Optimization of NUPACK scripts and dtRNA library sequence generation. After defining the domains of the dtRNA structure, NUPACK scripts are needed to generate sequences that fit the design principles. We first determined the basic settings for the design: the material is RNA, the temperature is set to 37° C., and the trial number is set to 10, which is the number of independent sequence designs performed in a single NUPACK design run (maximum 10).
We then define the base structure of each dtRNA in the library. In particular, we use DU+ notation to specify the single-stranded or base-paired nucleotides: U denotes the single-stranded nucleotides and D denotes the base-paired nucleotides. To define a hairpin structure with a 4-bp stem and 4-nt loop, for example, the algorithm format should be “D4 U4”. Accordingly, the general format for the dtRNA structure with a 6-nt 5′ spacing, 12-bp stem, 6-nt loop, 10-nt insulator sequence, and 64-nt downstream sequence is “U6 D12 U6 U10 U64”. Specifically, for designs with an imperfect hairpin structure such as the introduction of a bulge within the stem region, we use brackets to specify the structural hierarchies. For example, “D3 (U3 D3 U6 U3)” denotes the structure with 9 bp stem interrupted by 3-nt symmetrical bulge. To ensure each domain will not interfere with the others, we maintain all sequences to be single stranded except the dtRNA hairpin structure during design process.
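To make the DU+ notation concrete, the sketch below expands a DU+ string into dot-parens notation. It is an illustrative helper written for this description, not part of the NUPACK package; it handles only the D, U and parenthesis tokens used here (no "+" strand breaks), and the function name is hypothetical.

```python
import re


def du_to_dot_parens(du):
    """Expand DU+ notation (e.g. "D4 U4") into dot-parens (e.g. "((((....))))").

    Un is n unpaired nucleotides. Dn is an n-bp duplex that encloses the
    next element, which is either a single motif or a parenthesized group.
    """
    compact = du.replace(" ", "")
    tokens = re.findall(r"[DU]\d+|[()]", compact)
    if "".join(tokens) != compact:
        raise ValueError("unsupported characters in DU+ string")
    pos = 0

    def parse_element():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("a duplex must enclose a substructure")
        tok = tokens[pos]
        pos += 1
        if tok == ")":
            raise ValueError("unexpected ')'")
        if tok == "(":
            inner = parse_group()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1
            return inner
        n = int(tok[1:])
        if tok[0] == "U":
            return "." * n
        # Duplex: n opening pairs, the enclosed element, n closing pairs
        return "(" * n + parse_element() + ")" * n

    def parse_group():
        parts = []
        while pos < len(tokens) and tokens[pos] != ")":
            parts.append(parse_element())
        return "".join(parts)

    result = parse_group()
    if pos != len(tokens):
        raise ValueError("unbalanced parentheses")
    return result
```

Calling du_to_dot_parens("D4 U4") gives "((((....))))", and the general dtRNA format "U6 D12 U6 U10 U64" expands to a 110-nt structure with a single 12-bp hairpin.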
We next assign specific sequences to each domain. If a nucleotide is unspecified or is to be determined by the NUPACK design package, we use the letter “N” to denote it. Otherwise, we use A, U, C and G to represent the four ribonucleotides. For example, a script with dtRNA=U6 D12 U6 U10, dtRNA.seq=a b c b* d (b* represents the complementary sequence to b), domain a=UCUUCC, domain b=N3UCUUCCN3, domain c=UCUUCC and domain d=N10 represents a dtRNA with three RNase E cleavage sites UCUUCC inserted into the 6-nt 5′ spacing (domain “a”), the middle 6 bp of the stem (domain “b” and “b*”), and the 6-nt loop (domain “c”) while keeping the other nucleotides random.
For the final output of the synthetic dtRNA library, we choose Serra and Turner, 1995 as the basic RNA energy parameters and use 1.0 M Na+ and 0 M Mg2+ for the design algorithm4. To prevent runs of nucleotides or pairs of nucleotides, the following sequences were disallowed in the resulting designs: AAAAA, CCCCC, GGGGG, UUUUU, KKKKKK, MMMMMM, RRRRRR, SSSSSS, WWWWWW, YYYYYY.
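The disallowed patterns use IUPAC degenerate codes (K = G/U, M = A/C, R = A/G, S = C/G, W = A/U, Y = C/U), so a pattern such as KKKKKK forbids any run of six G or U nucleotides. As a rough illustration of what the constraint excludes, the sketch below screens a candidate sequence against the same list. It is a stand-alone check written for this description and does not call NUPACK; the names are hypothetical.

```python
# IUPAC codes used in the prevent list, expanded to RNA nucleotides
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "K": "GU", "M": "AC", "R": "AG", "S": "CG", "W": "AU", "Y": "CU",
}

PREVENT = ["AAAAA", "CCCCC", "GGGGG", "UUUUU", "KKKKKK",
           "MMMMMM", "RRRRRR", "SSSSSS", "WWWWWW", "YYYYYY"]


def prevented_hits(seq, patterns=PREVENT):
    """Return (pattern, index) for the first match of each disallowed pattern."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    for pat in patterns:
        k = len(pat)
        for i in range(len(seq) - k + 1):
            if all(seq[i + j] in IUPAC[pat[j]] for j in range(k)):
                hits.append((pat, i))
                break  # report only the first occurrence per pattern
    return hits
```

An empty list means the sequence passes the screen; for example, "GUGUGU" is rejected by KKKKKK even though it contains no homopolymer run.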
Analysis and removal of unwanted designs. The NUPACK design package reports a normalized ensemble defect for each design, which is the average percentage of nucleotides that are incorrectly paired at equilibrium relative to the target secondary structure, evaluated over the Boltzmann-weighted ensemble of (unpseudoknotted) secondary structures. The best normalized ensemble defect is 0%, while 100% is the worst. We select the designs with the lowest normalized ensemble defect, discarding the others, as the seed dtRNAs for each design criterion listed in
The same method is used to design dtRNAs for regulating gene expression in vitro and hybrid toehold sensors for viral diagnostics. In short, we select the desired hairpin from the dtRNA library as the basal structure and define a new 5′ spacing and insulator sequence as the design requires (e.g., adding GGG at the beginning of the 5′ spacing to suit T7 promoter transcription preference). All designed dtRNAs are further analyzed and finalized as described above.
Examples of the Scripts for dtRNA Design:
#
# Basic Settings of dtRNA structure design
material=rna
temperature=37.0
trials=10
sodium=1.0
#
#
# Basic Sequence information
# Rnase E cleavage site=UCUUCC
# Common 3′ end sequence (RBS to first 38 nt of GFP sequence, total 64 nt)=TACTAGAGAAAGAGGAGAAATACTAGATGCGTAAAGGAGAAGAACTTTTCACTGGAGTTGTCCC (SEQ ID NO:89)
# Common 3′ end sequence structure=U13 D3 (U2 D3 (U1 D4 (U1 D2 (U3 D4 U8 U1)) U1) U1)
#
#
# dtRNA Structure Design
# example of dtRNA DU+ notation design of 6 nt 5′ spacing, 12 bp stem 6 nt loop with 10 nt insulator
structure dtRNA=U6 D12 U6 U10 U64
#
#
# Sequence denotation of each dtRNA domain
TACTAGAGAAAGAGGAGAAATACTAGATGCGTAAAGGAGAAGAACTTTTCACTGGAGTTGTCCC (SEQ ID NO:89) #64 nt Common 3′ end sequence
#
#
# Define each domain of dtRNA structure
dtRNA.seq=a b c b* d e
#
#
# Following sequence patterns are disallowed to prevent runs of nucleotides or pairs of nucleotides
prevent=AAAAA, CCCCC, GGGGG, UUUUU, KKKKKK, MMMMMM, RRRRRR, SSSSSS, WWWWWW, YYYYYY
#
#
# Output=qualified dtRNA sequence
#
Mathematical modeling for positive feedback circuit analysis. We constructed a mathematical model to clarify the mechanism underlying the dynamic changes of a positive feedback circuit regulated by dtRNAs. We used a two-dimensional system of ordinary differential equations (ODEs) describing the transcription and translation processes:
[Eq1] describes the luxR mRNA transcription and degradation process. M is the abundance of luxR mRNA. v0 stands for leakage transcription rate of lux promoter without binding of LuxR, while
represents the transcription rate with the [LuxR-3OC6HSL]2 complex bound to the lux promoter, given in Hill Equation form5. v1 is the maximum transcription rate when all lux promoters are fully bound by [LuxR-3OC6HSL]2. Rf stands for the abundance of functional LuxR protein, i.e. LuxR activated through binding with 3OC6HSL. KB is the square of the dissociation constant of lux promoter and [LuxR-3OC6HSL]2 binding. The mRNA degradation process is given by a linear form
δM is the degradation rate without dtRNA. The effect of dtRNA is measured by α, the relative dtRNA strength. α=1 if there is no dtRNA regulation. α>1 if dtRNA stabilizes mRNA and thus increases protein expression, while α<1 if dtRNA facilitates mRNA degradation and thus decreases protein expression.
[Eq2] describes LuxR protein translation and degradation process. RT is total LuxR protein concentration in system, including free LuxR and LuxR bound with 3OC6HSL and/or lux promoter. The mRNA translation is given by a Michaelis-Menten kinetics form
where v2 is the maximum translation rate and KM is the Michaelis-Menten constant, i.e. mRNA abundance when translation rate reaches half of maximum value v2. LuxR protein degradation takes simple linear form δRRT, where δR is the degradation rate of LuxR protein.
The relationship of total LuxR abundance RT and functional LuxR abundance Rf is given by [Eq3] in Hill Equation form6. L stands for concentration of 3OC6HSL. n describes the cooperativity of 3OC6HSL-LuxR binding. KL is the dissociation constant of 3OC6HSL-LuxR binding.
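For reference, one set of equations consistent with the verbal definitions of [Eq1] to [Eq3] is sketched below. The placement of α as a divisor of the mRNA degradation rate (so that α>1 slows degradation) and the exact Hill form of [Eq3] are assumptions inferred from the stated behavior of each term, not a transcription of the original equations.

```latex
\begin{align}
\frac{dM}{dt} &= v_0 + v_1\,\frac{R_f^{2}}{K_B + R_f^{2}} - \frac{\delta_M}{\alpha}\,M \tag{Eq1}\\
\frac{dR_T}{dt} &= v_2\,\frac{M}{K_M + M} - \delta_R\,R_T \tag{Eq2}\\
R_f &= R_T\,\frac{L^{n}}{K_L^{n} + L^{n}} \tag{Eq3}
\end{align}
```

In this form, setting α=1 recovers the unregulated degradation term δM·M, and the positive feedback enters only through the dependence of Rf on RT.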
Using these ODE equations, we analyzed the dynamics of the self-activation system with XPPAUT (XPPAUT 8.0 January 2016)7. Parameter values used during analysis are shown in Table 4. A two-parameter bifurcation regarding α and L is performed (
Mathematical modeling and data fitting for cell-free gene expression analysis. When analyzing in vitro experiments, we used a mathematical model to help interpret results. Modeling of transcription and translation steps can take different forms depending on the level of detail considered. A simple model only considers them as linear production and degradation6,8. Some models use Michaelis-Menten or Hill-function-like terms to describe production or degradation processes, to account for nonlinear bottleneck or saturation effects due to the limitations of cellular machineries5,9,10. Since the cell-free system provides abundant molecular machinery for transcription and translation, we chose to use a simplified model that includes only transcription and translation steps without nonlinear terms. There are only four parameters in our simplified model. The two production rates can be freely scaled to fit experimental results, and the protein degradation rate is fixed according to literature reported values. The parameter studied in detail is the RNA degradation rate, which is directly related to the different versions of dtRNAs used in each experiment.
Translation and transcription in vitro can be described by a simple 2D ordinary differential equation (ODE):
where M stands for mRNA abundance and P stands for GFP abundance over time. α, β, γ, and δ are the mRNA production rate, mRNA degradation rate, GFP translation rate, and GFP degradation rate, respectively. This simple ODE can be solved analytically and give us the expression of GFP abundance over time:
Using this formula, we can fit time series data in
After fitting, the equation
is used to compute GFP accumulation rate over time (
So when
Rate′(t)=0 and Rate(t) reaches its maximal value. As we can see, the location of the peak depends only on the degradation rates. When protein degradation remains constant, the only factor affecting the peak location is the mRNA degradation rate, which is tuned by the dtRNA. As the mRNA becomes more stable, β decreases and the curve peak shifts to the right.
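The dependence of the peak location on the two degradation rates can be checked with a short derivation. The sketch below assumes the linear model implied by the four rates defined above (constant mRNA production α, first-order mRNA decay β, translation γ, first-order GFP decay δ), zero initial mRNA and GFP, and β≠δ; these initial conditions are an assumption made for illustration.

```latex
\begin{align}
\frac{dM}{dt} &= \alpha - \beta M, \qquad \frac{dP}{dt} = \gamma M - \delta P \\
M(t) &= \frac{\alpha}{\beta}\left(1 - e^{-\beta t}\right) \\
P(t) &= \frac{\alpha\gamma}{\beta\delta}\left[1 - \frac{\delta e^{-\beta t} - \beta e^{-\delta t}}{\delta - \beta}\right] \\
\mathrm{Rate}(t) &= \frac{dP}{dt} = \frac{\alpha\gamma}{\delta - \beta}\left(e^{-\beta t} - e^{-\delta t}\right) \\
\mathrm{Rate}'(t) = 0 &\iff \beta e^{-\beta t} = \delta e^{-\delta t} \iff t_{\mathrm{peak}} = \frac{\ln(\delta/\beta)}{\delta - \beta}
\end{align}
```

Under these assumptions the production rates α and γ scale the height of the rate curve but cancel out of the peak time, which increases as β decreases at fixed δ.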
Mathematical modeling and data fitting for cell-free RNA expression analysis. We used a simple mathematical model to calculate the RNA half-life and to interpret the results of in vitro RNA expression with data fitting. During the experiment, RNAs are generated through transcription and digested through degradation. We describe this RNA expression process with the following ordinary differential equation:
R stands for RNA concentration, which is indicated by the value of fluorescence measurements. v is the transcription rate which can vary in a relatively small range due to the variation of sequence. δ is the degradation rate, which is affected by dtRNAs. We then analytically solved this simple ODE and obtained the function of RNA concentration over time:
where R0 is the initial RNA concentration. With this formula, we can fit RNA expression data of all 64 dtRNAs and control. In our fitting, we set a small boundary, 10±4 min−1, for v, since this transcription rate is affected by the variation of dtRNA sequence. δ is the main parameter changing over different dtRNA designs. Also, considering the error caused by the initial small value of experimental data, we used the mean value of first three data points as the initial RNA concentration R0. We fitted v and δ against experimental data using lsqcurvefit from Matlab. Then we calculated RNA concentration over time with Eq5 and RNA half-life with the formula:
As shown in Table 5, most RNAs with dtRNAs (45/64) have a longer half-life than the control.
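The model above can be illustrated with a short numerical sketch. It assumes the first-order form dR/dt = v − δR implied by the parameter definitions, whose solution is R(t) = v/δ + (R0 − v/δ)·exp(−δt), and the usual first-order half-life ln(2)/δ. The function names are hypothetical, and the fitting step itself (performed with lsqcurvefit in the original analysis) is not reproduced.

```python
import math


def rna_over_time(t, v, delta, r0):
    """RNA level at time t for constant transcription v and decay rate delta.

    Solution of dR/dt = v - delta * R with R(0) = r0.
    """
    steady_state = v / delta
    return steady_state + (r0 - steady_state) * math.exp(-delta * t)


def rna_half_life(delta):
    """Half-life of a first-order decay process with rate constant delta."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    return math.log(2) / delta
```

A dtRNA that lowers the fitted δ therefore raises both the plateau v/δ and the half-life, which is the comparison summarized against the control.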
This application claims priority to U.S. Provisional Application No. 62/984,622 filed on Mar. 3, 2020, the contents of which are incorporated by reference in their entireties.
This invention was made with government support under 1100309 awarded by the National Science Foundation and GM106081, GM131405, GM126892, and AI136571 awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2021/020698 | 3/3/2021 | WO |
Number | Date | Country | |
---|---|---|---|
62984622 | Mar 2020 | US |