This invention relates to the field of biotechnology and to the control of gene expression in organisms; more particularly the control of gene expression in genetically modified cells of organisms, wherein the genetic modification allows for expression of a protein or polypeptide of interest (POI) by the cell in reaction to a control mechanism, usually involving a switch responsive to an inducer molecule. The invention also relates to the field of CRISPR-Cas gene modification and the control of protein expression in artificial CRISPR-Cas systems.
In biotechnology, control of gene expression is a desired attribute. Such control may provide several advantages, such as avoiding toxicity, avoiding by-product formation and tuning metabolic pathways. Several methods have been developed for controlling gene expression, e.g. promoters of different strengths (strong, weak or inducible promoters), ribosome binding sites (RBSs) of different strengths and riboswitches (amongst others).
Riboswitches are used for controlling gene expression, more specifically protein translation, and they have been used widely in the area of biotechnology (Breaker, R. R. (2011) “Prospects for riboswitch discovery and analysis” Molecular Cell 43 (6): 867-879). Riboswitches see a wide application in biotechnology due to their simple design and their inducibility through an inducer molecule (e.g. theophylline).
Whilst riboswitches are a useful technology for regulating gene expression, they can be leaky (depending on their structure), they require knowledge of the 5′ untranslated region (UTR) sequence of the gene of interest and they may be complex to engineer for a non-model organism. For this reason, a more universal, easily applicable riboswitch is needed that will simplify the engineering process and will ensure control of gene expression/translation amongst a variety of organisms and genes.
A good example of the applicability of a universal gene expression/translation system is the control of Clustered Regularly Interspaced Short Palindromic Repeat-Cas (CRISPR-Cas) systems. Homologous recombination (HR) combined with CRISPR-Cas counterselection is a powerful approach to perform genome editing with high editing efficiencies. However, to achieve high editing efficiencies, HR should precede CRISPR-Cas counterselection. CRISPR-Cas tools will, therefore, be inefficient unless the CRISPR-Cas module is tightly regulated. Several regulation systems have been developed to control the expression and activity of the CRISPR-Cas module (Davis, K. M., et al., (2015) “Small molecule-triggered Cas9 protein with improved genome-editing specificity” Nature Chemical Biology 11 (5): 316-318; Zetsche et al., (2015); Nihongaki et al. (2015) “Photoactivatable CRISPR-Cas9 for optogenetic genome editing” Nature Biotechnology 33 (7): 755-760; Liu et al. (2016) “A chemical-inducible CRISPR-Cas9 system for rapid control of genome editing” Nature Chemical Biology 12 (11): 980; Cañadas et al. (2019) “RiboCas: a universal CRISPR-based editing tool for Clostridium” ACS Synthetic Biology 8 (6): 1379-1390; Tang et al. (2017) “Aptazyme-embedded guide RNAs enable ligand-responsive genome editing and transcriptional activation” Nature Communications 8 (1): 1-8; Siu and Chen (2019) “Riboregulated toehold-gated gRNA for programmable CRISPR-Cas9 function” Nature Chemical Biology 15 (3): 217-220; Kundert et al. (2019) “Controlling CRISPR-Cas9 with ligand-activated and ligand-deactivated sgRNAs” Nature Communications 10 (1): 1-11; Moroz-Omori et al. (2020) “Photoswitchable gRNAs for Spatiotemporally Controlled CRISPR-Cas-Based Genomic Regulation” ACS Central Science). Whilst the existing approaches are suitable for the particular organism, Cas protein or gRNA of interest for which they were developed, these solutions are typically not universally applicable.
For this reason, a more universal, easily applicable riboswitch is needed that will simplify the engineering process and will ensure control of gene expression amongst a variety of organisms and genes.
Thompson, K. M., et al., (2002) “Group I aptazymes as genetic regulatory switches” BMC Biotechnology 2 (1): 21 describes the attachment of a theophylline aptamer to the group I self-splicing T4 td intron to control and induce the thymidylate synthase gene in E. coli. The wild type (WT)/parental td intron in the thymidylate synthase gene was substituted with a number of different theophylline-dependent self-splicing introns. Importantly, the insertion position of the theophylline-controlled self-splicing intron was exactly the same as that of the parent td intron and so no modification was made to either the 5′ or the 3′ exon of the intron-gene complex. Instead, the P6a stem loop was modified to be theophylline responsive, and in this way Thompson et al. created an inducible version of the self-splicing td intron.
A problem with the inducer-controlled riboswitch described by Thompson et al. (2002) is that it cannot be transferred to other genes (other than the td gene), because this causes disruption of the amino acid sequence of the expressed protein. Therefore, the Thompson et al. (2002) riboswitch is not universal and it is restricted to the td gene.
Recently, a Cas system controlled by a ligand-responsive riboswitch, known as RiboCas, was reported as a universal genome editing tool in Clostridium species by Cañadas et al., (2019) “RiboCas: a universal CRISPR-based editing tool for Clostridium” ACS Synthetic Biology 8 (6): 1379-1390. RiboCas works by placing a riboswitch within the 5′ untranslated region (5′ UTR) of the mRNA. The riboswitch creates a loop which prevents the ribosome from binding at the Shine-Dalgarno (SD) sequence of the ribosome binding site (RBS), thereby inhibiting translation initiation. In the presence of a ligand (theophylline), the loop shifts and releases the SD sequence, allowing translation initiation to occur. This strategy of controlling gene expression is a useful alternative when an inducible promoter is not an option.
The interaction between SD and anti-SD (the complementary sequence on the 16S rRNA of the ribosome) has been assumed to be a conserved and universal mechanism of translation initiation in prokaryotes (see Schmeing and Ramakrishnan (2009) “What recent ribosome structures have revealed about the mechanism of translation” Nature 461 (7268): 1234-1242). However, Nakagawa et al. (2010) “Dynamic evolution of translation initiation mechanisms in prokaryotes” Proceedings of the National Academy of Sciences 107 (14): 6382-6387 is a more recent comparative analysis of several prokaryotes and indicates that SD-independent translation is much more widespread than previously estimated. Alternative mechanisms are also reported in Nakagawa et al. (2010), which use ribosomal protein S1 (RPS1) or leaderless mRNAs that lack their 5′ UTR.
Chen, S. et al. (2007) “Characterization of strong promoters from an environmental Flavobacterium hibernum strain by using a green fluorescent protein-based reporter system” Appl. Environ. Microbiol. 73 (4): 1089-1100 describe how in some AT-rich prokaryotes, the common prokaryotic SD (GGAGG) (SEQ ID NO: 1) appears not to be conserved in the 5′UTR, although the anti-SD sequence (CCUCC) (SEQ ID NO: 2) is present at the 3′ end of the 16S rRNA. Accetto, T., & G. Avguštin. (2011) “Inability of Prevotella bryantii to form a functional Shine-Dalgarno interaction reflects unique evolution of ribosome binding sites in Bacteroidetes” PloS one 6 (8) describe how low GC content at the 5′UTR indicates a reduced tendency to form secondary structures, implying that utilizing a riboswitch in the 5′UTR region may result in a leaky system.
All of the aforementioned limitations make the recently described RiboCas tool difficult to apply in species with SD-independent translation systems.
CRISPR-Cas has become a regular genome engineering tool for many prokaryotes and eukaryotes. Application of this technology ranges from gene editing to controlling the expression of a gene of interest (GOI) through silencing or induction. Typical CRISPR-Cas applications include:
CRISPR-Cas-mediated genome editing involving non-homologous end joining (NHEJ) and the homology-directed repair (HDR) systems that repair the double strand breaks (DSBs) generated by active Cas proteins. In case of a DSB in a gene, NHEJ creates insertions or deletions (indels) that often disrupt the function of the gene. HDR can be used for any type of precision editing, for example, substituting a base pair (gene therapy), removing a gene (knock-out) or introducing a gene (knock-in).
CRISPR base editing involving fusion of a dead Cas (dCas) or a nickase Cas (nCas) to a base editor. A cytidine deaminase (e.g. APOBEC1) or an adenine deaminase (TadA) has been used so far for base editing (Eid et al., (2018) “CRISPR base editors: genome editing without double-stranded breaks” Biochemical Journal 475 (11): 1955-1964). Such a tool is used to create single base edits in the target of interest and thereby fix or destroy the GOI. This tool can be considered as CRISPR-Cas-mediated genome editing as well, but an important difference with the aforementioned editing approaches is that it does not generate DSBs.
CRISPR prime editing involving a fusion of a dead Cas (dCas) to an engineered reverse transcriptase. A prime editing guide RNA (pegRNA) is sequence specific for the target site and encodes the desired edit.
CRISPR transposition involving a catalytically inactive Cas protein (dCas) linked to a transposase, which results in guide-dependent integration of DNA fragments.
CRISPR interference involving a catalytically inactive Cas protein (dCas) which mediates the downregulation of gene expression by binding to the promoter or the coding sequence of the GOI. This can be considered as gene silencing.
CRISPR activation involving the fusion of a dCas protein to a transcription factor or an induction element that mediates the recruitment of the RNA Polymerase (RNAP) and thereby activates the expression of the GOI.
The problem is that whilst many state of the art technologies have been developed, and two main Cas proteins (Cas9 and Cas12a) have been widely used for genome engineering, strict control of the expression of the Cas proteins is limited in various ways.
Inducible promoters may be used, whereby expression of the Cas protein is under the control of the inducible promoter, e.g. a tetracycline promoter, which blocks the expression of the Cas protein when the inducer (tetracycline in this case) is absent. Addition of the inducer allows for the expression of the protein which mediates gene editing, interference, activation, etc. A strict level of control of the Cas protein is especially important in HDR applications in prokaryotes, which often lack NHEJ and often have a poorly active HDR system. The Cas nuclease is used for counter-selection (i.e. to eliminate the wild type and enrich the desired recombinant), which implies that HDR must precede the nuclease activity of the Cas protein. In other words, HDR should take place before the Cas protein is able to target the genome and bring about cell death. Strict regulation of Cas protein expression by an inducible promoter is often the best option to delay the activation of the Cas protein as a counter-selection tool and to provide enough time for homologous recombination to occur (see Mougiakos et al., (2017) “Efficient genome editing of a facultative thermophile using mesophilic spCas9” ACS Synthetic Biology 6.5: 849-861; and Cañadas et al., (2019) Supra).
Despite the existence of inducible promoters useful in model microorganisms such as E. coli and S. cerevisiae (e.g. Lactose/IPTG, Arabinose, Rhamnose, Galactose, Maltose, Xylose) or useful in human cells (e.g. TetR), the “leakiness” of such promoters may hinder the efficiency of HDR CRISPR-Cas. This problem becomes more apparent in non-model organisms like Flavobacterium species or Clostridia species, for which strictly inducible promoters are not known. For example, the only reported inducible promoter that works in Flavobacterium requires a low temperature (12° C.) to be active (see Gómez et al., (2015) “Development of a markerless deletion system for the fish-pathogenic bacterium Flavobacterium psychrophilum” PLoS One 10.2: e0117969), and this low temperature sacrifices activity of the Cas nuclease. Such absence of suitable inducible promoters for controlling expression of Cas proteins, together with very low HR efficiencies, means that there is little motivation to attempt to engineer a non-model organism using a CRISPR-Cas system.
Several microorganisms (e.g. Bacillus smithii) can grow at elevated temperatures e.g. 55° C. and above. At such high temperatures the basic cellular functions, including homologous recombination, are active whereas the Cas protein (spCas9 in this case) is inactive (see Mougiakos et al. (2017) Supra). The ability of the microorganism to grow and replicate at high temperatures allows for sufficient time for homologous recombination to occur before shifting to a temperature (37° C.) where the Cas protein is active and acts for counter-selection. Whilst this approach is successful for Bacillus smithii, it is a thermophile-specific method and so cannot be applied to other organisms.
CRISPR-Cas base editing, although very promising, suffers from many off-target effects on DNA and RNA (see Zuo et al., (2019) “Cytosine base editor generates substantial off-target single-nucleotide variants in mouse embryos” Science 364.6437: 289-292; Xin et al., (2019) “Off-Targeting of Base Editors: BE3 but not ABE induces substantial off-target single nucleotide variants” Signal transduction and targeted therapy 4.1: 1-2; Zhou et al., (2019) “Off-target RNA mutation induced by DNA base editing and its elimination by mutagenesis” Nature 571.7764: 275-278). This off-target problem has been attributed to the high protein levels (Base Editor-Cas fusion) followed by the lack of specificity of the base editor.
Pichler A. & Schroeder R. (2002) “Folding Problems of the 5′ Splice Site Containing the P1 Stem of the Group I Thymidylate Synthase Intron” J. Biol. Chem 277 (20) 17987-17993 is a scientific publication describing an in vitro cleaving assay for the thymidylate synthase (td) group I intron. Pichler et al. checked the effect of the 5′ splice site and showed that it can tolerate substitutions. However, they did not use this knowledge to characterize the substitutions further or to use them to control the expression of any gene of interest. Pichler et al. describe a limited assessment of the effect of alterations at the P1 and P2 stem loop (5′ exon of the intron) and how they affect the self-splicing activity of the T4 td intron. Pichler et al. found that alterations at the −4 to −6 positions can alter the splicing activity of the intron in vitro. Pichler et al. describe how both a stable variant (−4P, −5P, −6P) and a destabilized variant (−4M, −5M and −6M) have better self-splicing activity when compared to the unmodified (WT) intron. Pichler et al. do not describe the effect of modifying the −7 or +296 bases on self-splicing. Whilst Pichler et al. describe particular alterations at the 5′ exon of the T4 td intron, they do not describe or suggest the use of any modified T4 td intron for the purpose of controlling gene expression.
WO2016/166310 (Wageningen Universiteit) discloses an intronic, self-splicing riboswitch configured for enzyme-product specificity by introducing an appropriate aptamer. This then provides a sensing-expression construct, whereby the presence of an enzyme product in the cell triggers self-splicing of the intron sequence to restore the reading frame of the reporter gene and as such to drive expression of the gene product. The sensing construct expresses a protein which marks the cell or permits its growth or survival in or on an otherwise selective medium. In this way, introduction or the presence of such product sensing-reporter constructs in cells can be harnessed to provide a multi-parallel rapid screening of cells or libraries for desirable enzyme variants. Modifications of the td intron strictly followed those made by Pichler et al. (see above). Also described is the use of at least 2 self-splicing introns to minimise “leakiness” of the expression control. In particular, the self-splicing introns were introduced into a T7 polymerase for downstream control of a GFP gene serving as the GOI.
WO2018/083128 (Wageningen Universiteit) discloses how, in the absence of efficient non-homologous end joining (NHEJ) repair mechanisms in the majority of microbes, a double stranded DNA break (DSDB) typically leads to cell death. Therefore methods of microbial gene editing are provided involving plasmid transformation. Both homologous recombination and Cas9 site-specific gene editing events can be used together. Single or multiple plasmid approaches are used. In a method of counter-selection of microbes for a desired genetic change, a two-phase approach is used whereby a switch is made from a higher growth temperature phase favouring homologous recombination (HR), as opposed to Cas9 site-directed nuclease activity, to a lower growth temperature phase at which the Cas9 site-directed nuclease activity takes place. This has the effect whereby the Cas9 site-directed nuclease activity has counter-selecting activity, removing microbes which do not have a desired modification introduced beforehand by HR. The population of microbes surviving after the temperature switch counter-selection is thereby enriched for the desired modification.
The inventors found that the splicing activities of the modified T4 td intron seen by Pichler et al. in vitro do not correspond to what is found to happen in vivo. By assessing a range of modifications in an in vivo system the inventors have surprisingly found that it is possible to modify the exon sequence portions of the T4 td self-splicing intron and achieve differential splicing activities. Therefore the inventors have discovered how to “tune” a self-splicing intron to work when inserted into a gene in an in vivo td expression system. The modified introns were placed either into the ORF of a gene of interest, just after the start codon of a gene of interest, or just before a start codon of a gene of interest.
The inventors have therefore discovered and developed a universal riboswitch that can be applied to virtually any gene and organism of interest. This universal riboswitch can be applied to prokaryotes and eukaryotes for the induction of protein expression, including RNA transcription. More particularly the invention has utility in relation to CRISPR-Cas engineering, including in organisms which have so far proved intractable for genetic modification.
More specifically, the inventors modified the 5′ and 3′ exons of the T4 td intron so that it functions as a tuneable self-splicing intron that can be introduced into any GOI at multiple positions in the ORF, e.g. allowing the intron to be inserted without changing the amino acid sequence of the protein of interest. The inventors have also introduced Tag sequences which offer an additional advantage of simpler genetic manipulation work when constructing particular desired genetic sequences, since they can simply be added to the N-terminus of the POI. Each Tag sequence can have a different splicing activity and therefore a titration of inducible effect is achievable.
The combination of modification of the 5′ and 3′ exon sequence together with the provision of Tag sequences makes the inducible riboswitch tool universal for any GOI and for any organism of interest.
In accordance with the present invention there is provided a method for controlling expression of a polypeptide of interest (POI) in a cell, comprising:
The invention further provides a method for controlling expression of an RNA of interest (ROI) in a cell, comprising:
In accordance with the invention, the locating of an inducible self-splicing intron in a gene of interest (GOI), which is almost always a different gene from where the self-splicing intron is found in nature, means that the inducible self-splicing intron is located in a non-native, non-WT position in a gene. This surprising effect is tunable via certain modifications to the 5′ and 3′ exon regions of the self-splicing intron, allowing for universal applicability to prokaryotes and eukaryotes, and permitting addition of tag sequences which allow differential splicing activity in reaction to inducer. What is unexpected in this invention is that, when the location of the intron is changed from a naturally occurring position in a genetic sequence to a novel position in a different genetic sequence, the self-splicing functionality can be maintained. This is surprising because hitherto it was expected that intron functionality would be dependent on a multiplicity of factors, all of which have to be in place, such as secondary structure of the mRNA, which is dependent on the exact sequence context (i.e. the primary sequence of the total RNA molecule). By introducing different mutations in the 5′ and/or 3′ exon sequences of the T4 td intron, the inventors have successfully managed to decrease or increase the splicing activity of the intron, thereby creating a library of tunable self-splicing intron variants. The tuned activity of the self-splicing intron is unexpected, especially since increased activity shown by some of the variants (see for example in
What is also advantageous with the self-splicing introns of the invention is that the inducible system they provide is SD-independent and can be used for any GOI in any organism (without substantially interfering with the coding sequence). This also allows inducible expression of any ROI or POI in organisms where there are limited or no inducible promoters available.
The protein of interest (POI) may be any desired protein which needs to be expressed in the cell. POI are typically polypeptide macromolecules comprising 20 or more contiguous amino acid residues and may include, but are not limited to, enzymes, structural proteins, binding proteins and/or surface-active proteins. The methods of the present invention are useful in the production of desirable proteins in the agricultural, chemical, industrial and pharmaceutical fields. POI may include those of therapeutic value or industrial value. Examples of POI include enzymes, binding proteins, antibodies and chimeric antibodies.
The POI may be a protein which is already endogenous to the cell, optionally wherein the POI is modified in amino acid sequence compared to the native endogenous protein of the cell. Often the POI is a heterologous protein which is not normally expressed by the unmodified WT cell.
Advantageously, the present invention is of broad applicability and host cells may be selected from any archaeal, prokaryotic or eukaryotic cells. The invention is applicable to commonly used host cells, for example prokaryotic cells, fungal cells, plant cells and animal cells commonly used for recombinant heterologous protein expression. Equally the invention is applicable to less commonly used host cells, including prokaryotic cells, or cells of prokaryotes or eukaryotes which have not yet been subjected to genetic modification.
Self-splicing introns may be adapted as described herein via the 5′ and 3′ exon sequences to provide the optimum inducer-controlled self-splicing for the ROI or POI at the selected site of insertion. The most suitable sites of insertion for a given ROI or POI may be chosen using a selection algorithm of the kind described in Example 4 and/or 11. Following this, the better performing 5′ and 3′ exon variant sequences of the self-splicing introns of the invention are readily identified in accordance with methods described herein. The self-splicing introns and methods of the invention are thereby of universal applicability to any desired ROI or POI. There is additionally the opportunity of altering selected nucleotide residues of an exon sequence encoding a ROI or POI. Therefore the particular choice of variants of the 5′ and 3′ exons of the self-splicing introns of the invention may be married together with a modification of nucleotides in the sequence which is receiving the self-splicing intron. Such modifications are silent in the sense that they do not alter the encoded amino acid of the ROI or POI. In this way further optimization and universal application of the self-splicing introns of the invention may be achieved.
Polynucleotide constructs as described herein in accordance with any aspect of the invention are preferably in the form of an expression vector. Suitable expression vectors will vary according to the recipient host cell and suitably may incorporate regulatory elements which allow expression in the host cell of interest and preferably which facilitate high levels of expression. Such regulatory sequences may be capable of influencing transcription or translation of a gene or gene product, for example in terms of initiation, accuracy, rate, stability, downstream processing and mobility.
Polynucleotides, usually expression constructs, in accordance with any aspect of the invention defined herein, may be in the form of plasmids used to transform a host cell. Methods of transformation are well known to persons of skill in the art and include, but are not limited to: heat shock, electroporation, particle bombardment, chemical induction, microinjection and viral transformation.
In any of the methods of the invention for expressing POI, the self-splicing intron may be located 3′ of and in-frame with the start codon (i.e. not causing a frameshift in the polynucleotide portion encoding the POI upon splicing) and the expressed POI comprises an amino acid tag sequence encoded by a polynucleotide sequence which includes the 5′ and 3′ exon nucleotide sequences of the self-splicing intron rendered contiguous by self-splicing of the intron; preferably wherein the amino acid tag sequence is an N-terminal amino acid tag in the expressed POI. For example, in this way an N-terminal tag can be added to or fused to the POI, whereby methionine encoded by the in-frame start codon is followed by a tag encoded by the 5′ and 3′ exon sequences rendered contiguous by the splicing, which is then followed by the (further) amino acids of the POI. The methionine can be directly followed by the amino acids encoded by the 5′ and 3′ exon sequences, i.e. the self-splicing intron is then directly adjacent to the start codon. Alternatively, one or more amino acids can be included in between the start codon and the self-splicing intron to create a longer tag.
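By way of illustration only, the way in which the 5′ and 3′ exon sequences become a short in-frame tag after splicing can be sketched in Python. The function name `spliced_tag` and the reduced codon table are assumptions made for this sketch and do not form part of the described methods; the exon sequences used are the T4 td exon variants recited elsewhere in this document (e.g. TCCTCAGGT and CTA).

```python
# Reduced codon table (standard genetic code) covering only the codons
# that occur in the exon variants used in this illustration.
CODON_TABLE = {
    "TCC": "S", "TCA": "S", "TCG": "S", "TCT": "S",
    "TTC": "F", "TTG": "L", "GGT": "G", "CTA": "L",
}


def spliced_tag(five_prime_exon: str, three_prime_exon: str) -> str:
    """Join the exon sequences as splicing would and translate the result.

    Raises ValueError if the joined exons are not a whole number of codons,
    i.e. if the tag would shift the reading frame of the POI.
    """
    joined = five_prime_exon + three_prime_exon
    if len(joined) % 3 != 0:
        raise ValueError("joined exon sequences would cause a frameshift")
    codons = [joined[i:i + 3] for i in range(0, len(joined), 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


# Tag that would follow the N-terminal methionine for two exon variants.
print(spliced_tag("TCCTCAGGT", "CTA"))  # SSGL
print(spliced_tag("TTCTTGGGT", "CTA"))  # FLGL
```

In this sketch the 9 nt 5′ exon and the 3 nt 3′ exon together give 12 nt, i.e. four complete codons, so the reading frame of the downstream POI is preserved.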
In alternative methods of the invention, the self-splicing intron may be located within the polynucleotide portion encoding the ROI or POI, e.g. 3′ of and in-frame with the start codon (i.e. not causing a frameshift in the polynucleotide portion encoding the POI upon splicing), and preferably the expressed ROI or POI does not comprise a tag added to the ROI or POI, i.e. preferably the intron is inserted such that no changes in the ROI or amino acid sequence of the POI are made, e.g. by making use of the herein described modifications to the 5′ and/or 3′ exon sequences. Possible insertion sites in any ROI or POI can thus be determined, e.g. using the herein described script.
In alternative methods of the invention, the self-splicing intron is 5′ of the polynucleotide portion which encodes the POI and therefore is at or 5′ of the start codon, such that the polynucleotide portion is not disrupted by the self-splicing activity of the intron.
When a self-splicing intron is located 5′ of the start codon of a polynucleotide coding a POI, then this may be directly adjacent to the start codon, or upstream of the start codon but downstream of the ribosome binding site. In other words, the self-splicing intron may be inserted anywhere into the stretch of contiguous nucleotides between the ribosome binding site and the start codon. In some embodiments, this can be at most about 12 nt upstream of the start codon; and in other embodiments at most about 11 nt, 10 nt, 9 nt, 8 nt, 7 nt, 6 nt, 5 nt, 4 nt, 3 nt, 2 nt or 1 nt upstream of the start codon.
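Purely as a hypothetical sketch, the candidate insertion points between the ribosome binding site and the start codon can be enumerated as follows. The function name, the 0-based index convention and the default limit of 12 nt are assumptions made for illustration; an insertion point `p` here means that the intron is placed immediately before nucleotide `p` of the sequence.

```python
def upstream_insertion_points(start_codon_index: int,
                              rbs_end_index: int,
                              max_distance: int = 12) -> list[int]:
    """List insertion points between the RBS and the start codon.

    start_codon_index: 0-based index of the first nucleotide of the start codon.
    rbs_end_index:     0-based index of the first nucleotide after the RBS.
    max_distance:      largest allowed distance (nt) upstream of the start codon.
    """
    if rbs_end_index > start_codon_index:
        raise ValueError("RBS must end at or before the start codon")
    lowest = max(rbs_end_index, start_codon_index - max_distance)
    # The last point equals start_codon_index: directly adjacent to the start codon.
    return list(range(lowest, start_codon_index + 1))


# Hypothetical leader: 6 nt RBS, 6 nt spacer, then the start codon at index 12.
print(upstream_insertion_points(start_codon_index=12, rbs_end_index=6))
```

With a 6 nt spacer every position in the spacer qualifies; with a longer spacer only the points within `max_distance` of the start codon are returned.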
A person of skill in the art will understand that in the case of an intron located 3′ of a start codon, the 5′ and 3′ exon sequences of the self-splicing intron will need to be such that no frameshift is caused in the polynucleotide encoding the POI, e.g. by presenting complete codons, and as such are inserted in the reading frame of the nucleotide sequence.
In aforementioned methods of the invention, the polynucleotide construct may further comprise a polynucleotide sequence encoding an additional amino acid sequence. When present, an additional amino acid sequence may be a functional moiety, e.g. a protein purification or detection tag, a cellular localization sequence, or a fluorescent moiety. In some embodiments, this additional amino acid sequence, e.g. functional moiety, can be included in or added to the N-terminal tag sequence, so as to fuse this to the POI upon splicing (and translation).
In the methods of the invention two or more self-splicing introns may be present. In such embodiments where two or more self-splicing introns are present, at least one of them may be comprised in the polynucleotide portion from which the POI is expressed; optionally directly adjacent to and in-frame with the start codon (i.e. not causing a frameshift in the polynucleotide encoding the POI). Additionally or alternatively at least one of the two or more self-splicing introns is 5′ of the polynucleotide portion which encodes the POI, i.e. 5′ of the start codon of the polynucleotide encoding the POI.
In embodiments concerning POI, all of the self-splicing introns may be comprised in the polynucleotide portion which encodes the POI, such as 3′ of the start codon; optionally wherein at least one of said two or more self-splicing introns is directly adjacent to and in-frame with the start codon of the POI (i.e. not causing a frameshift).
In embodiments concerning ROIs which are mRNA, the aforementioned aspects relating to methods for controlling expression of POI apply; the mRNA ROI as will be appreciated by a person of skill in the art, being an intermediate step in the process of protein expression.
In other embodiments concerning ROIs which are other than mRNA, then an RNA of interest (ROI) herein may be any of transfer RNA (tRNA), ribosomal RNA (rRNA), long non coding RNA (lncRNA), micro RNA (miRNA), small nucleolar RNA (snoRNA), PIWI-interacting RNA (piRNA), circular RNA (circRNA), small interfering RNA (siRNA), antisense RNA (aRNA), CRISPR guide RNA (gRNA) or crRNA or single guide RNA (sgRNA), trans-activating CRISPR RNA (tracrRNA), double stranded RNA (dsRNA), short hairpin RNA (shRNA), trans-acting siRNA (tasiRNA), repeat associated siRNA (rasiRNA), enhancer RNA (eRNA).
In the aforementioned examples of ROI, the RNA is encoded by a transcription unit and so at least one self-splicing intron in accordance with the invention may be present either within the transcription unit, or upstream thereof. More than one self-splicing intron may be used, with at least one present within the transcription unit and at least one present in the polynucleotide portion between the promoter and the transcription unit.
In methods of the invention, the self-splicing intron preferably comprises an aptamer which has binding affinity for the inducer molecule. The aptamers may be DNA, cDNA, RNA, or preferably RNA. Suitable aptamers may be 20-30 nt in length; optionally they are 20 nt, 21 nt, 22 nt, 23 nt, 24 nt, 25 nt, 26 nt, 27 nt, 28 nt, 29 nt or 30 nt in length. New aptamers can readily be developed for a required inducer molecule by means known in the art, preferable by a selection procedure such as Systematic Evolution of Ligands by Exponential Enrichment (SELEX) of aptamer fragments. In this way the specificity of each self-splicing intron in the invention can be adjusted for the cell or inducer required to be worked in.
Inducers of aptamers useful in accordance with the invention may be selected from: flavin mononucleotide, thiamine pyrophosphate, s-adenosylmethionine, s-adenosylhomocysteine, adenosylcobalamin, cyclic diguanylate, adenine, guanine, glycine, lysine, theophylline, 3-methylxanthine, caffeine, 1-methylxanthine, 7-methylxanthine, 1,3-dimethyl uric acid, hypoxanthine, xanthine, theobromine, tetracycline, neomycin or malachite green, 2′-Deoxyguanosine, Magnesium, glucosamine-6-phosphate, 7-aminomethyl-7-deazaguanine, 7-cyano-7-deazaguanine, Aquacobalamin, Molybdenum cofactor, Tungsten cofactor, Tetrahydrofolate, Prequeuosine-1, c-di-adenosine monophosphate, Cyclic guanosine monophosphate-adenosine monophosphate. Preferably the inducer is theophylline.
In various embodiments of the invention, the 5′ exon nucleotide sequence and/or 3′ exon nucleotide sequence of the self-splicing intron are preferably modified compared to the respective wild type exon nucleotide sequence(s) of the intron.
Ordinarily, the self-splicing introns employed in any aspect of the invention described herein are Group I introns, although it is possible for Group II or Group III self-splicing introns to be used.
In preferred aspects, the self-splicing intron is the T4 td self-splicing intron wherein the 5′ exon sequence is NNNNNNGGT (SEQ ID NO: 3) and the 3′ exon sequence is CTN (SEQ ID NO: 4), preferably wherein the 5′ exon sequence is TTBYBDGGT (SEQ ID NO: 5) and the 3′ exon sequence is CTH (SEQ ID NO: 6) (wherein B=G/T/C, Y=C/T, D=G/A/T and H=A/T/C) optionally wherein the 5′ exon sequence is selected from TCCTCAGGT (SEQ ID NO: 7), TCCTCGGGT (SEQ ID NO: 8), TCCTTGGGT (SEQ ID NO: 9), TCCTCTGGT (SEQ ID NO: 10) or TTCTTGGGT (SEQ ID NO: 11); and the 3′ exon sequence is CTA (SEQ ID NO: 12).
In some aspects, the native exon portions of self-splicing introns may be modified at any or all of positions of −9, −8, −7, −6, −5, −4 and +296. Additionally or alternatively any or all of positions −3, −2, −1, +294 and +295 are not modified and so are kept as wild type in order to maintain self-splicing ability of the intron. This will provide more flexibility in placing the intron within a coding sequence without changing the amino acid sequence of the POI (i.e. no tag is added).
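The degenerate exon definitions above can be checked mechanically. The following is a minimal sketch, assuming the standard IUPAC reading of the codes given in the text (B=G/T/C, Y=C/T, D=G/A/T, H=A/T/C, N=any base) and assuming that a 9-nt 5′ exon occupies positions −9 to −1 and a 3-nt 3′ exon occupies positions +294 to +296. The function names are hypothetical and are not part of the invention as described.

```python
import re

# IUPAC degenerate nucleotide codes used in the exon definitions above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "B": "CGT",   # B = G/T/C
    "Y": "CT",    # Y = C/T
    "D": "AGT",   # D = G/A/T
    "H": "ACT",   # H = A/T/C
    "N": "ACGT",  # N = any base
}


def pattern_to_regex(pattern):
    """Compile a degenerate DNA pattern into a regular expression."""
    return re.compile("".join(f"[{IUPAC[code]}]" for code in pattern))


def matches(pattern, sequence):
    """Return True if the whole sequence fits the degenerate pattern."""
    return pattern_to_regex(pattern).fullmatch(sequence.upper()) is not None


def five_prime_positions(exon):
    """Label each base of a 5' exon with its position (-len(exon) .. -1)."""
    return {i - len(exon): base for i, base in enumerate(exon.upper())}


def conserved_core_intact(five_prime_exon, three_prime_exon):
    """Check that positions -3..-1 read GGT and +294/+295 read CT."""
    return (five_prime_exon.upper().endswith("GGT")
            and three_prime_exon.upper().startswith("CT"))
```

For example, `matches("NNNNNNGGT", "TCCTCAGGT")` and `matches("TTBYBDGGT", "TTCTTGGGT")` both hold, whereas under this reading TCCTCAGGT fits only the general pattern and not the narrower TTBYBDGGT, because its second base is C.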
In some aspects, the self-splicing intron is the T4 td self-splicing intron and the exon sequences are modified compared to the respective wild type exon nucleotide sequence(s) of the intron, but the present invention optionally does not include a self-splicing intron with the following 5′ exon sequence: 5′ caccuuaggu 3′ (SEQ ID NO: 13) and/or the following 3′ exon sequence: 5′ cuat 3′ (SEQ ID NO: 14). Additionally or alternatively the present invention optionally does not include a self-splicing intron with the following 5′ exon sequence: 5′ uucuugggu 3′ (SEQ ID NO: 15) and/or the following 3′ exon sequence: 5′ cuac 3′ (SEQ ID NO: 16).
The invention also may exclude the following 5′ exon sequence: 5′ CACCTTAGGT 3′ (SEQ ID NO: 17) and/or the following 3′ exon sequence: 5′ CTAT 3′ (SEQ ID NO: 18). Additionally or alternatively the present invention optionally does not include a self-splicing intron with the following 5′ exon sequence: 5′ TTTCTTGGGT 3′ (SEQ ID NO: 19) and/or the following 3′ exon sequence: 5′ CTAC 3′ (SEQ ID NO: 20).
In other aspects, the T4 td self-splicing intron in the present invention does not include a 5′ exon sequence of CAAGGGT (SEQ ID NO: 21) or CTTGGGT (SEQ ID NO: 22) and/or a 3′ exon sequence of CTAC (SEQ ID NO: 20) or CTAA (SEQ ID NO: 23).
In accordance with any aspect of the invention, a POI may be selected from any of:
In circumstances wherein the POI is ii) or iii), then the polynucleotide preferably further comprises a portion encoding a targeting RNA molecule, e.g. a guide RNA (gRNA) which directs ii) or iii) to a target locus in a DNA sequence. The CRISPR-Cas nucleases may be selected from any Cas Type I, Type II or Type III. More particularly, the Cas may be selected from Cas9, Cas12a (previously known as Cpf1) or Cas13 (previously known as C2c2); also any of Caw, Cas12b, Cas12c, Cas13a,b,c,d, Cas4, Csn2, Csf1, Csx10, Csx11, Cmr5, Csm2, Cas10, Csy1,2,3, Cse1,2, Cas10d, Cas8a,b,c, Cas5 or Cas3. The CRISPR-Cas nucleases may be any variant from any species, whether well-known, e.g. from Streptococcus pyogenes (SpyCas9), or less commonly used such as from Geobacillus thermodenitrificans T12 (ThermoCas9) or Geobacillus stearothermophilus (GeoCas9).
In any of the methods described herein, separately of the inducer controlled self-splicing introns described herein, polynucleotides and expression vectors in host cells may be under primary inducer control using well-known induction systems for cell expression. As such, the polynucleotide constructs and expression vectors will comprise the necessary elements and an inducer molecule can be provided to the cell. When an inducer is provided exogenously to a host cell, then it may be a chemical compound (e.g. Dimethyl sulfoxide (DMSO), Doxycycline, Muristerone A, Ponasterone A). Suitable inducers also include, but are not limited to: Rhamnose, Arabinose or Isopropyl β-D-1-thiogalactopyranoside (IPTG).
Alternatively, expression of heterologous polynucleotides and expression vectors may be under the control of environmental factors such as heat shock, high or low pH; physiological factors such as glucose elevation or hypoxia; or developmental cues, such as cell density or growth phase. This permits a generic level of switching that can be used to turn cells on and off as regards heterologous gene expression.
The invention therefore includes an isolated polynucleotide, also to be termed a polynucleotide construct, comprising:
Using such polynucleotides of the invention, the ROI may be translatable into the POI when the ROI is an mRNA.
In preferred aspects, the self-splicing intron is 3′ of an in-frame start codon (e.g. 3′ of the start codon) and a POI when expressed from the polynucleotide comprises an amino acid tag sequence encoded by a polynucleotide sequence which includes the 5′ and 3′ exon nucleotide sequences of the self-splicing intron rendered contiguous by self-splicing of the intron; preferably wherein the amino acid tag sequence is an N-terminal amino acid tag in or fused to the expressed POI, such as described in relation to the method aspects of the invention as hereinbefore described.
In other aspects, the self-splicing intron may be located within the polynucleotide portion encoding the ROI or POI, e.g. 3′ of and in-frame with the start codon (i.e. not causing a frameshift in the polynucleotide portion encoding the POI upon splicing), and preferably the expressed ROI or POI does not comprise a tag added to the ROI or POI, i.e. preferably the intron is inserted such that no changes in the ROI or amino acid sequence of the POI are made, e.g. by making use of the herein described modifications to the 5′ and/or 3′ exon sequences, such as described in relation to the method aspects of the invention as hereinbefore described.
In other aspects, the self-splicing intron is 5′ of the polynucleotide portion from which the ROI or POI is expressed and the said polynucleotide is not disrupted by the self-splicing activity of the intron; preferably wherein the self-splicing intron is 5′ of a start codon (e.g. 5′ of the start codon) of the polynucleotide encoding the POI.
The polynucleotide constructs of the invention described herein may further comprise a polynucleotide sequence encoding an additional amino acid sequence; optionally wherein the additional amino acid sequence is a functional moiety, e.g. a protein purification or detection tag, a cellular localization sequence or a fluorescent moiety, such as described in relation to the method aspects of the invention as hereinbefore described.
In some embodiments, two or more self-splicing introns may be used. Each of the self-splicing introns may be at a different location rather than contiguous or substantially contiguous. For example, a first self-splicing intron may be 5′ of the polynucleotide portion from which the ROI or POI is expressed, e.g. 5′ of a start codon (e.g. 5′ of the start codon), and a second self-splicing intron may be located at any position 3′ of and in-frame with the start codon (i.e. not causing a frameshift in the polynucleotide encoding the POI); optionally directly adjacent to the start codon. There may be three, four or more self-splicing introns present and the positions may be selected independently.
Where there are two or more self-splicing introns, at least one is 3′ of the start codon; or at least one is 5′ of the start codon. The positioning of any self-splicing intron in polynucleotides of the invention may be as described in relation to the method aspects of the invention as hereinbefore described.
Where there are two or more self-splicing introns, each intron may be induced by the same inducer, or each intron may be induced by a respective different inducer molecule.
For self-splicing introns which are induced by an inducer molecule, these may preferably comprise an aptamer as the binding site for the inducer. The aptamer will have a binding affinity and degree of specificity for the inducer molecule; optionally wherein the inducer molecule is one selected from flavin mononucleotide, thiamine pyrophosphate, s-adenosylmethionine, s-adenosylhomocysteine, adenosylcobalamin, cyclic diguanylate, adenine, guanine, glycine, lysine, theophylline, 3-methylxanthine, caffeine, 1-methylxanthine, 7-methylxanthine, 1,3-dimethyl uric acid, hypoxanthine, xanthine, theobromine, tetracycline, neomycin or malachite green, 2′-Deoxyguanosine, Magnesium, glucosamine-6-phosphate, 7-aminomethyl-7-deazaguanine, 7-cyano-7-deazaguanine, Aquacobalamin, Molybdenum cofactor, Tungsten cofactor, Tetrahydrofolate, Prequeuosine-1, c-di-adenosine monophosphate, Cyclic guanosine monophosphate-adenosine monophosphate. A preferred inducer is theophylline.
In polynucleotides of the invention described herein, the 5′ exon nucleotide sequence and/or 3′ exon nucleotide sequence of the self-splicing intron are preferably modified compared to the respective wild type exon nucleotide sequence(s) of the intron.
In a preferred embodiment of a polynucleotide of the invention, including for use in the methods of the invention, the self-splicing intron is the T4 td self-splicing intron. Modifications and variations of the 5′ and 3′ exons of this self-splicing intron are as described in connection with the aforementioned method aspects of the invention.
There are additional applications of the methods and polynucleotides of the invention with respect to the genetic modification of organisms and cells. Therefore in one aspect, the POI is selected from:
Therefore, when the POI is ii) or iii), a polynucleotide of the invention as used in method applications of the invention may further comprise for convenience a portion encoding a targeting RNA molecule, e.g. a guide RNA (gRNA) which directs ii) or iii) to a target locus in a DNA sequence; optionally wherein the gRNA is under the control of a self-splicing intron. Alternatively, the targeting RNA molecule may be supplied directly or via expression from a separate expression construct introduced into the cell.
In an aspect of the invention, the POI may be an Argonaute.
The invention includes expression vectors comprising any of the polynucleotides as herein described.
The invention also provides transformed cells for inducer molecule-controlled expression of an RNA of interest (ROI) or polypeptide of interest (POI) thereby, wherein such a cell comprises a polynucleotide as herein described, or an expression vector comprising such polynucleotides.
The invention further provides kits for expressing an RNA of interest (ROI) or a polypeptide of interest (POI) as hereinbefore described, and wherein the expression is in transformed host cells under the control of an inducer molecule. The kits can comprise:
The invention includes a system for generating an RNA of interest (ROI) or a polypeptide of interest (POI), comprising a transformed cell as herein defined.
In other aspects, the invention provides a method of inducer controlled modification of a target genomic locus in a cell, comprising introducing or generating in the cell a ribonuclease complex comprising a CRISPR-Cas nuclease and a gRNA molecule for the target genetic locus; wherein the CRISPR-Cas nuclease and/or the gRNA is comprised as the ROI and/or POI in a polynucleotide construct or an expression vector as hereinbefore defined; and
2. subjecting the cell to a condition which causes a concentration of inducer molecule to promote the self-splicing activity of the intron, thereby resulting in expression of the CRISPR-Cas nuclease and/or gRNA in the cell; optionally wherein a homologous repair (HR) template is encoded by the same or a different polynucleotide or expression vector, and the HR template is expressed in the cell.
The invention also includes a method of inducer-controlled base editing of a target genomic locus in a cell, comprising:
The invention further includes a method of inducer-controlled prime editing of a target genomic locus in a cell, comprising:
In any of the aforementioned methods of the invention for genetic modification, an exogenous inducer molecule is preferably provided to the cell. Alternatively, (a) the inducer molecule may be generated as a result of expression of a separate gene in the cell, wherein the separate gene is under the control of different expression regulatory elements; optionally wherein the different expression regulatory elements are responsive to a different inducer molecule and/or physical condition, e.g. temperature; or (b) the inducer molecule is naturally synthesized by the cell in response to a chemical and/or physical condition to which the cell is subjected. Such a range of physical conditions has been referred to previously.
In any of the aforementioned methods of the invention for genetic modification of cells, a first polynucleotide may comprise a self-splicing intron under the control of a first inducer molecule, and a second polynucleotide may comprise a self-splicing intron which is under the control of a second different inducer molecule. Similarly, further polynucleotides and inducer molecule combinations can be added. The inducer molecules may be the same or different.
The invention herein includes a system for inducer controlled genetic modification of a cell, comprising at least a first expression vector, the first expression vector comprising a polynucleotide, a polynucleotide construct or expression vector as hereinbefore defined, wherein the respective POI or ROI is selected from:
The invention herein also includes a system for inducer controlled genetic modification of a cell, comprising at least a first expression vector, the first expression vector comprising a polynucleotide, polynucleotide construct or expression vector as hereinbefore defined,
wherein the respective POI or ROI is selected from:
Further, the invention includes a system for inducer controlled genetic modification of a cell, comprising at least a first expression vector, the first expression vector comprising a polynucleotide, polynucleotide construct or expression vector as hereinbefore defined, wherein the respective POI or ROI is selected from:
In any of the aforementioned systems of the invention, each individual POI and/or ROI is preferably under the control of a respective self-splicing intron. More particularly, a first polynucleotide may comprise a self-splicing intron under the control of a first inducer molecule, and a second polynucleotide may comprise a self-splicing intron which is under the control of a second different inducer molecule.
The ROI or POI from transcription/expression of the first polynucleotide in reaction to the respective first inducer molecule may provide the inducer molecule for a second, optionally third or more self-splicing introns encoded within second, third, etc. polynucleotides of the invention. The invention can therefore include modes of operation whereby an initial exogenously applied inducer molecule effect can be amplified by two or more inducer molecules within the cell which have been produced within the cell in reaction to the applied inducer molecule. Similarly, a physical induction can be used as the primary inducer resulting in production of a secondary ROI/POI which goes on to induce activity of a second self-splicing intron and production of a further ROI/POI. Clearly a person of skill in the art can design and operate a multiplicity of possibilities for cascade in control of expression of a desired, ultimate ROI/POI.
The genetic modification method aspects, kits and systems of the invention defined herein provide a SIBR-Cas tool, which can often provide the only solution for Homology Directed Recombination (HDR) in prokaryotes. SIBR-Cas can be used in virtually any organism of interest for any of the previously demonstrated CRISPR-Cas applications and whenever it makes sense to control the expression of the Cas protein.
The invention therefore provides certain technical advantages of:
Industrial applications of the invention described herein include:
Embodiments of the invention are further described hereinafter with reference to the accompanying drawings, in which:
Ribozymes and riboswitches are gene regulation systems found in a wide range of bacterial species. The catalytic and/or regulatory functionality of these RNA molecules relies on their primary, secondary and tertiary structures, making them great candidates for developing universal tools for regulating gene expression, without the use of proteins (Breaker, R. R. Riboswitches and the RNA world. Cold Spring Harbor perspectives in biology 4, a003566 (2012); Park, S. V. et al. Catalytic RNA, ribozyme, and its applications in synthetic biology. Biotechnology advances 37, 107452 (2019); Serganov, A. & Nudler, E. A decade of riboswitches. Cell 152, 17-24 (2013); Serganov, A. & Patel, D. J. Ribozymes, riboswitches and beyond: regulation of gene expression without proteins. Nature Reviews Genetics 8, 776-790 (2007); Weinberg, C. E., Weinberg, Z. & Hammann, C. Novel ribozymes: discovery, catalytic mechanisms, and the quest to understand biological function. Nucleic acids research 47, 9480-9494 (2019)). To this end, several studies used ribozymes and riboswitches to control the expression of a gene of interest (GOI), but also for regulating the activity and function of CRISPR-Cas (Zhao, J., et al. Development of aptamer-based inhibitors for CRISPR/Cas system. Nucleic Acids Research (2020); Cañadas, I.s.C., et al. RiboCas: a universal CRISPR-based editing tool for Clostridium. ACS synthetic biology 8, 1379-1390 (2019); Tang, W., Hu, J. H. & Liu, D. R. Aptazyme-embedded guide RNAs enable ligand-responsive genome editing and transcriptional activation. Nature communications 8, 1-8 (2017); Siu, K.-H. & Chen, W. Riboregulated toehold-gated gRNA for programmable CRISPR-Cas9 function. Nature chemical biology 15, 217-220 (2019); Kundert, K. et al. Controlling CRISPR-Cas9 with ligand-activated and ligand-deactivated sgRNAs. Nature communications 10, 1-11 (2019); Park, S. V. et al. Catalytic RNA, ribozyme, and its applications in synthetic biology. Biotechnology advances 37, 107452 (2019)). Although quite successful, these approaches leave room for improvement. For example, the technology developed by Tang et al. (2017) requires base pairing of the CRISPR spacer sequence with the 5′ end of the hammerhead ribozyme; something that requires modification in case the CRISPR spacer needs to be changed. Moreover, the studies by Kundert et al. (2019), Siu et al. (2019) and Zhao et al. (2020) rely on the secondary structure of the Cas9 single guide RNA (sgRNA), which rules out the use of other CRISPR-Cas systems. Lastly, the RiboCas technology developed by Cañadas et al. (2019) regulates the expression of Cas9 by masking the RBS with a theophylline-dependent riboswitch. Whereas this technology is a smart alternative to previous approaches, it can be cumbersome to use either in organisms that do not use the canonical RBS sequence, or in cases where the secondary structure of the 5′ UTR sequence interferes with the theophylline aptamer (Chen, S., Bagdasarian, M., Kaufman, M. & Walker, E. Characterization of strong promoters from an environmental Flavobacterium hibernum strain by using a green fluorescent protein-based reporter system. Appl. Environ. Microbiol. 73, 1089-1100 (2007); Gómez, E., Álvarez, B., Duchaud, E. & Guijarro, J. A. Development of a markerless deletion system for the fish-pathogenic bacterium Flavobacterium psychrophilum. PLoS One 10, e0117969 (2015); Accetto, T. & Avguštin, G. Inability of Prevotella bryantii to form a functional Shine-Dalgarno interaction reflects unique evolution of ribosome binding sites in Bacteroidetes. PLoS One 6 (2011)).
The inventors substituted the Wild Type (WT) P6a loop of the T4 td intron with a theophylline responsive aptamer (see
To create a universal T4 td intron riboswitch, the inventors introduce modifications to the intron allowing it to be transferred to any gene of interest without compromising its splicing activity. The modifications are located in the 5′ and 3′ exon sequences of the T4 td intron (
When converting the T4 td intron into a universal riboswitch, certain modifications were introduced to the intron, allowing it to be transferred into any gene of interest without compromising its activity. The use of the inducer controlled self-splicing intron to control CRISPR-Cas proteins was found to solve the problem of how to engineer some prokaryotes (e.g. Flavobacterium IR1) which had previously proved intractable to attempts to modify them with a CRISPR-Cas approach.
In more detail, the inventors explored the role of the 5′ exon and 3′ exon sequences of the td intron and determined its splicing activity by substituting the relevant bases in the 5′ exon and 3′ exon (see
Initially the inventors substituted the −7 and +296 positions of the 5′ exon and 3′ exon, respectively, inserted the different variants into the LacZa gene and performed assays in E. coli (see Examples 1 to 3). Then, positions −6, −5 and −4 of the 5′ exon of the td intron were tested. This defined several base substitutions which either allowed more self-splicing and therefore more LacZa activity, or less self-splicing and therefore less LacZa activity.
The inventors then further modified the 5′ exon and 3′ exon sequences of the intron in order to control/titrate its self-splicing activity, or to introduce it in multiple sites in the Open Reading Frame (ORF) of the Gene Of Interest (GOI). The inventors were successful in transferring the self-splicing intron to any GOI at different positions in the ORF.
Altered splicing efficiency by changing the base pair interactions at the P1 stem of the T4 td intron was previously observed by Pichler A. & Schroeder R. (2002) “Folding Problems of the 5′ Splice Site Containing the P1 Stem of the Group I Thymidylate Synthase Intron” J. Biol. Chem 277 (20) 17987-17993, who created two mutant variants to either stabilize (−4A, −5C, −6T) or destabilize (−4C, −5A, −6C) the base pair interactions at the P1 stem and noticed increased splicing efficiency for both the stabilized and the destabilized variants compared to the WT intron. However, those results contradict the present results, as stabilization (−4A, −5C, −6T) of the P1 stem decreased the splicing efficiency by approximately 80% (compared to the WT intron) in our setup (
The inventors further successfully provide a universal “TAG” sequence whereby the intron is introduced just after the ATG “start” codon and therefore is gene/protein independent. The TAG sequence leaves a 4 amino acid tag at the N-terminus of the protein of interest (POI) just after the methionine (M) encoded by the start codon. This tag sequence does not usually hinder the activity of the expressed protein as it consists only of 4 amino acids. A cleavage sequence, e.g. a TEV protease cleavage site, can be added directly after the “Tag” sequence and then cleaved with the corresponding protease afterwards. The cleavage leaves a single amino acid attached to the protein of interest. Other cleavage sequences and proteases well known in the art may be used, e.g. those listed at https://web.expasy.org/peptide_cutter/ and https://web.expasy.org/peptide_cutter/peptidecutter_enzymes.html.
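The 4 amino acid tag follows from the exon lengths: a 9-nt 5′ exon joined to a 3-nt 3′ exon gives 12 nt, i.e. four codons. The following is a minimal sketch of that arithmetic, assuming the standard genetic code and assuming that the joined exons are read in frame immediately after the ATG start codon. The function name `tag_peptide` is hypothetical, and the example exon sequences are the ones listed earlier as SEQ ID NO: 7, 11 and 12.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def tag_peptide(five_prime_exon, three_prime_exon):
    """Translate the exon junction left behind after the intron is spliced out.

    Assumption: the joined exons sit directly after the start codon and are
    read in frame, so 9 nt + 3 nt yields four codons.
    """
    joined = (five_prime_exon + three_prime_exon).upper()
    if len(joined) % 3 != 0:
        raise ValueError("joined exons must form a whole number of codons")
    return "".join(CODON_TABLE[joined[i:i + 3]] for i in range(0, len(joined), 3))
```

Under these assumptions, `tag_peptide("TCCTCAGGT", "CTA")` gives the four residues Ser-Ser-Gly-Leu and `tag_peptide("TTCTTGGGT", "CTA")` gives Phe-Leu-Gly-Leu, each following the initiating methionine.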
Using different versions of tag-introns, the inventors are able to control expression of a GOI at the protein level which gives the advantage of titration. Tag sequences are chosen from those shown in
The addition of Tags has been successfully tested in E. coli, P. putida and Flavobacterium IR1 by inserting Tagged introns after the start codon of Cas12a. This approach allowed efficient editing of the bacterium of interest. More specifically, for P. putida editing efficiencies of up to 75% were reached with Tag4 (
The invention is applicable to any self-splicing intron and these are found in many species of bacteriophage, bacteria, protozoa and fungi, for example. The self-splicing introns are usually found embedded in specific genes of a species or strain. For example, the T4 td self-splicing intron is located in the td gene of the T4 bacteriophage.
Other self-splicing introns from bacteriophages are: T6: td, RB3: td, LZ2: td, TulA: td, ϕ1: DNA polymerase, W31: DNA polymerase, Pf-WMP3: DNA polymerase, 822: td, SPO1: DNA polymerase, SP82: DNA polymerase, cpe: DNA polymerase, SPb prophage (Ribonucleotide reductase (bnrdE and bnrdF)), Sb3: lysin, rlt: ORF40, LLH: Terminase, Twort (introns nrdE-11 & nrdE-12): ORF142.
Examples of self-splicing introns from bacteria are: Agrobacterium tumefaciens A136: tRNAArgCCU, Azoarcus sp. strain BH72: tRNAIleCAU, Coxiella burnetii (Cbu.L1917): 23S rRNA, Coxiella burnetii (Cbu.L1951): 23S rRNA, Thermotoga neapolitana NS-E Tna.bL1931: 23S rRNA, Thermotoga subterranea SL1 Tsu.bL1926: 23S rRNA, Clostridium botulinum: tmma pos. 338, Geobacillus stearothermophilus (NBRC 12550): flagellin, Bacillus sp. Kps3: flagellin, Clostridium difficile strain 630: CD3246, Anabaena PCC7120: tRNALeuUAA, Scytonema hofmanii: tRNAfMet, Synechocystis PCC 6803: tRNAfMet, Neochloris aquatica: rnl pos. 1931, Calothrix sp. strain PCC7601: Cal.x1, Calothrix sp. strain PCC7101: Cal.x2, L. lactis ML3: Ll.LtrB, L. lactis 712: IntL, S. meliloti GR4: RmInt1.
Examples of self-splicing introns from Protozoa are: Tetrahymena thermophila (Tth.L1925): 26S rRNA, Didymium iridis (Dir.S956-1): SSU rDNA, Didymium iridis (Dir.S956-2): SSU rDNA, Physarum polycephalum (Ppo.L1925): LSU rDNA, Amoebidium parasiticum: rnl, pos. 2500 and rnl, pos. 1403, Naegleria (NaGIR1 and NaGIR2): SSU rRNA.
Examples of self-splicing introns from Fungi are: Neurospora crassa: rnl, pos. 2449, Saccharomyces cerevisiae (Sc.OX1,3): SSU rDNA, Candida albicans: 25S rRNA, Scytalidium dimidiatum (rns, pos. 1199).
Examples of self-splicing introns from other miscellaneous organisms are: Simkania negevensis ZT: 23S rRNA, Chlamydomonas nivalis: rnl, pos 2593, Dunaliella parva: rnl, pos. 1931, Aureoumbra lagunensis: SSU rRNA, Bangia atropurpurea: SSU rRNA.
Calothrix sp. strain PCC7601: Cal.x1, Calothrix sp. strain PCC7101: Cal.x2, L. lactis ML3: Ll.LtrB, L. lactis 712: IntL, S. meliloti GR4: RmInt1 are Group II introns, while all others are Group I introns.
Examples of Group III introns include the Euglena gracilis introns found in the psbC, rps18, ycf8, ycf13, rpoC1, rpl16, psbF, rps3, rpl23, rps18, rps19, rpl14, rps8, rps14, rpl16, psbK genes.
A unique type of ribozymes includes the self-splicing Group I introns. Group I introns have been described to control gene expression and RNA processing in bacteria and phages but also in some eukaryotes (protozoa and plants) (Hausner, G., Hafez, M. & Edgell, D. R. Bacterial group I introns: mobile RNA catalysts. Mobile DNA 5, 1-12 (2014); Edgell, D. R., Belfort, M. & Shub, D. A. Barriers to intron promiscuity in bacteria. Journal of Bacteriology 182, 5281-5289 (2000); Nielsen, H. & Johansen, S. D. Group I introns: moving in new directions. RNA biology 6, 375-383 (2009)). Due to their prevalence and simple nature, Group I introns have the potential to be used as universal, synthetic ribozymes to control gene expression. Especially when ribozymes are associated with a specific ligand-binding sequence (RNA aptamer), the presence/absence of such a ligand allows for switching ON/OFF the splicing activity (riboswitch), potentially controlling the expression of an associated gene. An example of a natural Group I intron-based riboswitch has been discovered in the bacterium Clostridium difficile, where its sequence resides between the RBS and the ATG start codon of an adjacent gene. After transcription, this results in a secondary structure in the 5′-UTR that prevents recruitment of the ribosome, hence hampering translation initiation. After induction by intracellular GTP or c-di-GMP, this ribozyme induces its splicing from the precursor transcript, resulting in appropriate re-positioning of the RBS upstream of the start codon, thereby allowing the ribosome to start the translation process (Lee, E. R., Baker, J. L., Weinberg, Z., Sudarsan, N. & Breaker, R. R. An allosteric self-splicing ribozyme triggered by a bacterial second messenger. Science 329, 845-848 (2010); Chen, A. G., Sudarsan, N. & Breaker, R. R. Mechanism for gene control by a natural allosteric group I ribozyme. RNA 17, 1967-1972 (2011)).
Although this natural mechanism is a beautiful case of gene expression control, its requirement for specific endogenous inducers (GTP and c-di-GMP) as well as its dependency on specific secondary structures (including both the ribozyme and the coding sequence) complicates its general applicability. A synthetic alternative was provided by Thompson et al. (2002), when they combined the self-splicing Group I intron of the T4 bacteriophage with a theophylline aptamer towards a functional inducible gene expression system (Thompson, K. M., Syrett, H. A., Knudsen, S. M. & Ellington, A. D. Group I aptazymes as genetic regulatory switches. BMC biotechnology 2, 21 (2002)). Although this system was restricted to controlling the original thymidylate synthase (td) gene, we here describe its repurposing as a generic system to tune gene expression.
The inventors have also created a novel system termed Self-splicing Intron Based Riboswitch Cas (SIBR-Cas). This is created using the Group I-based aptazyme to enhance recombination in prokaryotes. The inducer controlled T4 td intron (containing an in-frame stop codon) is inserted into a CRISPR-Cas nuclease gene (Cas12a, for example), resulting in incomplete translation and avoiding formation of a functional CRISPR-Cas nuclease. Exposure to theophylline then triggers a conformational change in the synthetic riboswitch, which induces the self-splicing activity of the td intron, resulting in the excision of the intron and the joining of the 5′ exon to the 3′ exon. This restores the complete mRNA of the CRISPR-Cas gene, which consequently leads to the functional expression/translation of the CRISPR-Cas nuclease. In the particular example of the Cas12a protein, by controlling the expression, a time series can be made to find the appropriate induction time for counter-selection by Cas12a, thereby increasing the chances of generating correct HDR-based mutants.
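The ON/OFF logic described above can be illustrated with a toy model. The following is a minimal sketch, assuming the standard genetic code and treating splicing as an all-or-nothing event controlled by the inducer, which is a simplification of a ligand-dependent splicing efficiency. The intron and downstream sequences are hypothetical placeholders chosen only so that the intron begins with an in-frame stop codon; they are not the T4 td intron or any Cas gene sequence.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(mrna):
    """Translate from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def splice(precursor, intron):
    """Excise the first occurrence of the intron and join the flanking exons."""
    start = precursor.index(intron)
    return precursor[:start] + precursor[start + len(intron):]


# Hypothetical toy sequences, for illustration only.
TOY_INTRON = "TAAGGCCTTAAGGCC"             # starts with an in-frame stop (TAA)
UPSTREAM = "ATG" + "TCCTCAGGT"             # start codon + 5' exon
DOWNSTREAM = "CTA" + "GCTAAAGAA" + "TAA"   # 3' exon + toy coding region + stop
PRECURSOR = UPSTREAM + TOY_INTRON + DOWNSTREAM


def expressed_protein(precursor, intron, inducer_present):
    """Return the peptide made with or without inducer-triggered splicing."""
    mrna = splice(precursor, intron) if inducer_present else precursor
    return translate(mrna)
```

In this toy model, the unspliced transcript is translated only up to the stop codon inside the intron, giving a truncated peptide, whereas the spliced transcript is translated through the full toy coding region.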
So long as the relevant inducer, e.g. theophylline, can reach the self-splicing intron, then the SIBR-Cas system can be used in any organism. The advantages of such a technology are:
The SIBR-Cas tool can be applied for editing virtually any GOI in any cell of interest. The inventors have applied SIBR-Cas to Flavobacterium IR1.
Suitable nucleases to be used in the methods described herein are selectable at the option of the skilled person. A choice may depend upon the optimal growth temperature of the particular microbe being used. The CRISPR-Cas nucleases may be selected from any Cas Type I, Type II or Type III. More particularly, the Cas may be selected from Cas9, Cas12a (previously known as Cpf1) or Cas13 (previously known as C2c2); also any of Caw, Cas12b, Cas12c, Cas13a,b,c,d, Cas4, Csn2, Csf1, Csx10, Csx11, Cmr5, Csm2, Cas10, Csy1,2,3, Cse1,2, Cas10d, Cas8a,b,c, Cas5 or Cas3. The CRISPR-Cas nucleases may be any variant from any species, whether well-known, e.g. from Streptococcus pyogenes (SpyCas9), or less commonly used such as from Geobacillus thermodenitrificans T12 (ThermoCas9) or Geobacillus stearothermophilus (GeoCas9). Methods described herein may preferably use Cas9, preferably Streptococcus pyogenes Cas9; or C2c1. Alternatively, methods described herein may preferably use Cas12a (Cpf1). Further alternative nucleases suitable for the methods described herein are C2c3 or Argonaute. It is also contemplated that the methods described herein may use other nucleases such as zinc finger nucleases (ZFNs), meganucleases or transcription activator-like effector nucleases (TALENs).
In order that expression of any of the polynucleotide constructs or expression vectors of the invention described herein can be carried out in a chosen host cell, these incorporate regulatory elements which allow expression in the host cell of interest and which preferably facilitate high levels of expression. Such regulatory sequences may be capable of influencing transcription or translation of a gene or gene product, for example in terms of initiation, accuracy, rate, stability, downstream processing and mobility.
Such elements may include, for example, strong and/or constitutive promoters, 5′ and 3′ UTRs, transcriptional and/or translational enhancers, transcription factor or protein binding sequences, start sites and termination sequences, ribosome binding sites, recombination sites, polyadenylation sequences, sense or antisense sequences, sequences ensuring correct initiation of transcription and optionally poly-A signals ensuring termination of transcription and transcript stabilisation in the host cell. The regulatory sequences may be plant-, animal-, bacteria-, fungal- or virus-derived, and preferably may be derived from the same organism as the host cell. Clearly, appropriate regulatory elements will vary according to the host cell of interest. For example, regulatory elements which facilitate high-level expression in prokaryotic host cells such as in E. coli may include the pLac, T7, P(Bla), P(Cat), P(Kat), trp or tac promoters. Regulatory elements which facilitate high-level expression in eukaryotic host cells might include the AOX1 or GAL1 promoter in yeast or the CMV- or SV40-promoters, CMV-enhancer, SV40-enhancer, Herpes simplex virus VP16 transcriptional activator or inclusion of a globin intron in animal cells. In plants, constitutive high-level expression may be obtained using, for example, the Zea mays ubiquitin 1 promoter or 35S and 19S promoters of cauliflower mosaic virus.
Suitable regulatory elements may be constitutive, whereby they direct expression under most environmental conditions or developmental stages, developmental stage specific or inducible. Suitably, promoters may be chosen which permit expression of the protein of interest at particular developmental stages or in response to extra- or intra-cellular conditions, signals or externally applied stimuli. For example, a range of promoters exist for use in E. coli which give high-level expression at particular stages of growth (e.g. osmY stationary phase promoter) or in response to particular stimuli (e.g. HtpG Heat Shock Promoter).
Suitable expression vectors may comprise additional sequences encoding selectable markers which allow for the selection of said vector in a suitable host cell and/or under particular conditions.
Regarding transformation of a host cell with a heterologous gene sequence, expression constructs comprising the polynucleotide sequences of the invention may be located in plasmids (expression vectors) which are used to transform the host cell. Methods of transformation include, but are not limited to: heat shock, electroporation, particle bombardment, chemical induction, microinjection, viral transformation, Agrobacterium-mediated transformation, PEG-mediated transformation and lipofection.
As well as a ROI or POI, the polynucleotides of the invention as described herein may encode a selectable marker protein. This may be used to screen cell populations positively or negatively. For example, where the expression of a particular POI in a host cell is coupled to relief of an auxotrophic deficit, it will be appreciated that such selectable markers may include polynucleotide sequences encoding proteins to which the cell is fatally sensitive. In these embodiments of the invention, the presence of the desired product may be coupled to the restoration of translation of the reporter protein. In this way host cells expressing the protein of interest may be selected from those which do not express the protein of interest.
Where the expression of a particular POI in a host cell is coupled to promotion of cell growth and/or division, it will be appreciated that such selectable markers may include polynucleotide sequences encoding proteins which promote cell growth and/or division. In these embodiments of the invention, the presence of the desired product may be coupled to the restoration of translation of the reporter protein. In this way host cells expressing the protein of interest may be selected from those which do not express the protein of interest.
The polynucleotides may encode a reporter protein which may be assayed or monitored. Such reporter proteins include, for example, Green Fluorescent Protein (GFP), Yellow Fluorescent Protein (YFP), Red Fluorescent Protein (RFP), Cyan Fluorescent Protein (CFP), or Luciferase fusion tags. The reporter protein may be an enzyme which can be used to generate an optical signal. Alternatively, the expression vector may incorporate a polynucleotide reporter encoding a luminescent protein, such as a luciferase (e.g. firefly luciferase). Alternatively, the reporter gene may encode a chromogenic enzyme which can be used to generate an optical signal, such as beta-galactosidase (LacZ) or beta-glucuronidase (Gus).
Tags used for detection of reporter protein expression may also be antigen peptide tags. A cleavable tag may also be provided for affinity purification, e.g. a polyhistidine tag. It is envisaged that other types of label may also be used to indicate expression of the reporter protein including, for example, organic dye molecules or radiolabels. In particular, preferred expression vectors will include sequences encoding a fluorescent protein, for example GFP which will enable the screening and optionally separation (selection) of a cell which expresses the protein of interest for example by Fluorescence Activated Cell Sorting (FACS).
The flanking regions (5′ and 3′ exons) of the group I introns are part of the coding sequence as well as of the ribozyme (see
When inserting the intron into another gene, it is almost impossible to retain both the intron flanking regions and the CDS. Applying minor changes to the CDS with synonymous codons may create a site that resembles the wild type intron flanking regions. However, it is not clear to what extent the flanking regions determine the splicing efficiency.
To investigate the effect of the flanking regions of the T4 td intron on its splicing efficiency and on the expression of the target gene, a series of constructs were made containing the lacZ gene from E. coli with the intron in between amino acids D6 and S7 (see
GATCTTAAGGATG
TGActgcagAATATTAA
TTCT
GGTTAAT
ACGGTAGCATTATGT
TGAGGCCTGAGTA
TCAGATAAGGTCG
TAAGGTG (SEQ ID
GATCTTAAGGATG
TGActgcagAATATTAA
TTCT
GGTTAAT
ACGGTAGCATTATGT
TGAGGCCTGAGTA
TCAGATAAGGTCG
TAAGGTG (SEQ ID
GATCTTAAGGATG
TGActgcagAATATTAA
TTTT
GGTTAAT
ACGGTAGCATTATGT
TGAGGCCTGAGTA
TCAGATAAGGTCG
TAAGGTG (SEQ ID
The wild type interactions are shown in
In more detail in
Thompson et al. (2002) Supra do not show any interaction between position +296 (the +3 position in the 3′ exon) and the P1 loop of the T4 td intron. This is similar to the situation with the −7 position. Therefore, point mutations at the +296 position of the T4 td intron were made to see if they might impact on the splicing activity of the intron. The WT +296 position (mismatch) was mutated by PCR (Table 2) to form either a pair or a wobble pair with the P1 loop. All mutants were assayed for β-galactosidase activity after overnight growth.
GATCTTAAGGA
TGActgcagAATATTAA
TGTTCTcttgGGT
ACGG
AGCATTATGTT
TAATTGAGGCC
CAGATAAGGTCG
TGAGTATAAGG
TG (SEQ ID NO:
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTTcttgGGT
ACGG
AGCATTATGT
TAATTGAGGCC
TCAGATAAGGTCG
TGAGTATAAGG
TG (SEQ ID NO:
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTTcttgGGT
ACGG
AGCATTATGT
TAATTGAGGCC
TCAGATAAGGTCG
TGAGTATAAGG
TG (SEQ ID NO:
This investigated the effect of altering positions −4 to −6 in all possible combinations of pair (P), mismatch (M) and wobble pair (W) (if applicable) whilst preserving all the other bases as WT. With reference to
In
GATCTTAAGGA
TGActgcagAATATTAA
TGTTCT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GGT
ACGGTAGCATTATGT
TAATTGAGGCC
TCAGATAAGGTCG
TGAGTATAAGG
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
GATCTTAAGGA
TGActgcagAATATTAA
TGTTTT
GG
ACGGTAGCATTATGT
TTAATTGAGGC
TCAGATAAGGTCG
CTGAGTATAAG
GTG (SEQ ID
Transferring the T4 td intron into the open reading frame (ORF) of genes other than the WT thymidylate synthase gene can be achieved by following the script provided in Example 11 and by introducing silent mutations to the 5′ and 3′ flanking regions of the intron. The script retains the WT interactions of positions −1 to −3, +294 and +295, but changes positions −4 to −6 and +296 in order to find an insertion site in the gene of interest (GOI). The script ensures that the insertion site preserves the amino acids of the encoded protein from the GOI by introducing silent mutations, and it also ensures sufficient splicing activity of the intron according to our previous results.
An example of the insertion site for the FnCas12a gene is shown in
To control the splicing activity of the T4 td intron, Thompson et al. (2002) Supra attached a theophylline aptamer at the P6 stem loop of the T4 td intron. In a similar fashion, the theophylline aptamer was also added to the modified (changes at positions −4 to −6) T4 td introns developed in this example. In this way, tight, titratable and inducible control of the GOI was obtained. A schematic representation of the T4 td intron with the theophylline aptamer at the P6 stem loop is shown in
To further control the splicing activity of the modified introns, a theophylline aptamer was added at the P6 stem loop of the T4 td intron as previously described (see Thompson et al. (2002) Supra) and shown in
As shown in
Splicing of the intron results in a short (four amino acid long) tag sequence attached to the N-terminus of the POI (when not counting the M encoded by the start codon) whereas unspliced mRNA results in a small, non-functional peptide sequence (due to stop codons present in the T4 td intron).
Electrocompetent E. coli MG1655 were transformed (2.5 kV, 200 Ω, 25 μF) with 10 ng μL−1 of the respective plasmid and recovered for 1 hour in 500 μL LB medium [10 g L−1 tryptone (Oxoid), 5 g L−1 yeast extract (BD), 10 g L−1 NaCl (Acros)] at 37° C. Then, the recovered culture was serially diluted and drop plated on selective (50 μg mL−1 kanamycin) LB agar plates in the presence or absence of 2 mM theophylline. The agar plates were incubated at 30° C. for 24 hours and the CFUs were counted.
For efficient genome editing in bacteria, HR should precede CRISPR-Cas counterselection. To assess whether tight control over CRISPR-Cas targeting could bolster the efficiency of CRISPR-Cas mediated genome editing by allowing more time for HR to occur, we used SIBR-Cas and targeted the LacZ gene of E. coli MG1655 for knock-out through HR and CRISPR-Cas counterselection using a blue/white screening colony assay. To facilitate HR, we added 500 bp up- and down-stream homology arms to the plasmids expressing the four SIBR-Cas (Int1-4) and WT-FnCas12a variants that target the LacZ gene. After 1 hour recovery, we induced the expression of the SIBR-Cas variants to counterselect the WT from the mutant colonies.
The WT-FnCas12a variant targeting the LacZ gene produced no colonies, demonstrating the targeting efficiency of WT-FnCas12a but also the inefficient HR system of the WT E. coli MG1655 strain (
Since disruption of LacZ can also be achieved through non-HR mediated approaches (spontaneous mutations or occasional error-prone DNA repair following DNA cleavage by Cas12a), not all gene deletions can be screened phenotypically. Therefore, we repeated our experiment, but X-gal was omitted from the medium to eliminate the possibility of false-positives. Randomly selected colonies that were obtained were screened by PCR for LacZ deletion showing a 0%, 0%, 29% and 38% editing efficiency for Int1, Int2, Int3 and Int4 SIBR-Cas variants, respectively (
Following successful demonstration of inducible expression of FnCas12a in E. coli, the system was transferred to Pseudomonas putida, an organism with very low HR efficiencies. Plasmids bearing the four T4 td intron-FnCas12a or the intron-less FnCas12a and an EndA T or an NT crRNA were transformed to P. putida and the targeting efficiency was assessed by comparing the CFUs μg−1 in the presence or absence of the theophylline inducer. All the plasmids used for this experiment are listed in Table 5.
In more detail, electrocompetent P. putida cells were transformed (2.5 kV, 200 Ω, 25 μF) with 200 ng plasmid and recovered in 1 ml LB for 2 hours at 30° C. Then, the culture was serially diluted and drop plated on selective (50 μg mL−1 kanamycin) LB agar plates in the presence or absence of 2 mM theophylline. The agar plates were incubated at 30° C. for 24 hours and the CFUs were counted.
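The CFUs μg−1 comparison used in these targeting assays is simple arithmetic on drop-plate counts. The following is a minimal sketch, assuming that a known volume of each serial dilution is spotted on the plate; the function and parameter names are hypothetical, and the colony count, dilution factor and drop volume in the usage lines are illustrative values that are not taken from this specification (only the 1 ml recovery volume and 200 ng of plasmid are).

```python
def cfu_per_microgram(colonies, dilution_factor, plated_volume_ul,
                      recovery_volume_ul, dna_ug):
    """Estimate transformation efficiency (CFU per microgram of plasmid DNA).

    colonies           -- colonies counted in one drop
    dilution_factor    -- fold dilution of the plated sample (e.g. 1000 for 10^-3)
    plated_volume_ul   -- volume of the drop that was plated, in microlitres
    recovery_volume_ul -- total volume of the recovered culture, in microlitres
    dna_ug             -- amount of plasmid DNA used, in micrograms
    """
    if plated_volume_ul <= 0 or recovery_volume_ul <= 0 or dna_ug <= 0:
        raise ValueError("volumes and DNA amount must be positive")
    cfu_per_ul = colonies * dilution_factor / plated_volume_ul
    total_cfu = cfu_per_ul * recovery_volume_ul
    return total_cfu / dna_ug


def fold_reduction(cfu_non_targeting, cfu_targeting):
    """Fold reduction in CFU per microgram caused by a targeting crRNA."""
    if cfu_targeting <= 0:
        raise ValueError("no colonies with the targeting crRNA; report as below detection limit")
    return cfu_non_targeting / cfu_targeting


# Illustrative numbers only: 12 colonies in an assumed 5 uL drop of a 10^-3
# dilution, 1 mL recovery volume and 200 ng (0.2 ug) of plasmid.
efficiency = cfu_per_microgram(12, 1000, 5, 1000, 0.2)
print(f"{efficiency:.2e} CFU per microgram")
```

Comparing the value obtained with and without theophylline, for targeting and non-targeting crRNAs, gives the targeting efficiency discussed above.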
Further genome editing experiments were conducted to knock out the FlgM gene of P. putida. A repair template (1125 bp) bearing approximately 500 bp homologous regions upstream and downstream of the FlgM gene was included on the plasmids. The repair template was introduced to each of the four tagged-intron-FnCas12a variants along with the T crRNA for counterselection or the NT crRNA as a control. A list of the plasmids is given in Table 6. Plasmids were transformed to P. putida through electroporation and the transformed cells were recovered in LB medium for 2 hours before plating on LB agar plates containing 50 μg ml−1 kanamycin and 2 mM theophylline. Plates were incubated at 30° C. overnight and the colonies formed were screened through colony PCR for the knock-out of the FlgM gene.
Flavobacterium IR1 is a non-model organism known for its iridescent colour (see Johansen, V., et al., (2018) “Genetic manipulation of structural colour in bacterial colonies” Proceedings of the National Academy of Sciences 115 (11): 2652-2657; and Schertel, L., G. T. et al., (2020) “Complex photonic response reveals three-dimensional self-organization of structural coloured bacterial colonies” Journal of the Royal Society Interface 17 (166): 20200196). The lack of genomic tools and the low HR efficiency of IR1 are currently the main bottlenecks limiting the fundamental characterization and commercial exploitation of this phenomenon (i.e. development of new paints). As IR1 is a recently discovered non-model organism, inducible promoters are not characterized. Therefore, the control of CRISPR-Cas cannot succeed without a promoter-independent regulatory system such as is disclosed herein.
To establish controllable genetic engineering tools for IR1, plasmids were constructed by inserting the 300 bp self-splicing aptazyme intron of Thompson et al., (2002) Supra into the fncas12a gene to provide a module, and subsequently inserting this module into two editing plasmids yielding pSIBRFnCas12a_sprF_HR_NT (no-target spacer) and pSIBRFnCas12a_sprF_HR_S3 (spacer targeting sprF gene). For this, the theophylline T4 td intron was introduced in the ORF of FnCas12a. The insertion position was generated by using the algorithm of Example 11. The insertion position is illustrated in
The constructed plasmids were then transformed into IR1 and cultured following the experimental design shown in
Prior to theophylline induction, the liquid cultures of IR1 transformed with pSIBRFnCas12a_sprF_HR_S3 and incubated for 0, 24, and 48 h showed no obvious growth (data not shown). Correspondingly, no colonies were obtained after plating these cultures following theophylline induction in the liquid culture. In contrast, IR1 transformed with the non-targeting plasmid pSIBRFnCas12aFb_sprF_HR_NT showed growth following 24 and 48 h incubation, both prior to and after induction with theophylline.
Interestingly, after 72 h and 96 h incubation, cultures transformed with pSIBRFnCas12a_sprF_HR_S3 started to show some growth (data not shown). Likewise, colonies were also obtained when plating these cultures after theophylline induction.
To demonstrate the inefficient HR mechanism of IR1, the organism was transformed using electroporation with a plasmid expressing an intron-less (WT) FnCas12a under a constitutive promoter (OmpA-P), and a T crRNA targeting the SprF gene under the constitutive promoter HU-P. A repair template (2963 bp) for knocking out the SprF gene through HR was also included on the plasmid resulting in the final plasmid pFnCas12aFb_sprF_HR_T. As control, the crRNA was replaced with an NT crRNA resulting in pFnCas12aFb_sprF_HR_NT. Also, the pCP11 empty vector was used as an indicator for transformation efficiency.
IR1 electrocompetent cells were prepared as follows: IR1 was grown overnight in 10 mL of ASW at 25° C., 200 rpm. The overnight culture was used to inoculate 2×100 mL ASW broth in 500 ml baffled flask and grown until it reached an OD600 of 0.3. Thereafter, the cells were harvested by centrifugation at 4000 rpm for 10 minutes, 4° C. The cells were washed two times with 1×volume of washing buffer (10 mM MgCl2 and 5 mM CaCl2) at 4° C. and washed once with 10% (v/v) glycerol (Gilchrist and Smit, (1991) “Transformation of freshwater and marine caulobacters by electroporation” Journal of Bacteriology 173.2: 921-925). The pellet was suspended using 10% glycerol to 1/100 of the initial volume. Cells were divided into aliquots of 100 μL in 1.5 mL Eppendorf tubes and stored at −80° C. until use.
IR1 electrocompetent cells were transformed with 1 μg μl−1 plasmid in 1-mm cuvette using the following settings: 1.5 kV, 200 Ω, 25 μF. 900 μL of ASW medium [5 g L−1 peptone (Sigma #70173), 1 g L−1 yeast extract (BD), 10 g L−1 sea salt (Sel marin)] was added immediately and the cells were incubated at 25° C. for 4 hours for recovery. The cells were plated on ASWBC agar [ASW medium, 15 g L−1 agar (Oxoid), 100 mg L−1 nigrosine (Aldrich #198285), and 5 g L−1 Kappa Carrageenan (Special Ingredients)] supplemented with 200 μg mL−1 erythromycin and incubated at 25° C. for 2 to 3 days.
Clearly, the constitutive expression of the WT FnCas12a and the T crRNA along with the inefficient HR machinery of IR1 resulted in the targeting of the genome of IR1 causing cell death. To overcome this limitation, it is suggested to use the four T4 td intron-FnCas12a variants (as developed for E. coli and P. putida) and this would result in the tight and controlled expression of FnCas12a in order to allow HR to precede counterselection.
Because theophylline uptake by IR1 appears to be a prerequisite, a toxicity assay was carried out on the growth of IR1 with varying theophylline concentrations (0, 0.1, 2, 5 and mM) grown for 24 hours at 25° C.
To achieve efficient HR in IR1, the WT FnCas12a in pFnCas12aFb_sprF_HR_S3 was replaced with the four T4 td intron-FnCas12a variants developed previously for E. coli and P. putida, resulting in the plasmids listed in Table 7. As a control, the WT FnCas12a of pFnCas12aFb_sprF_HR_NT was replaced with the four T4 td intron-FnCas12a variants (Table 7). In addition, an improved method was developed to increase the number of colonies obtained, as this increases the chances of obtaining knock-outs.
By using the following script, the user can upload the sequence of the gene of interest and the script will return possible insertion sites for the T4 td intron. The insertion sites will require point mutations to be introduced when inserting the intron at the target site. Multiple sites are possible options but one at the beginning of the gene is recommended to eliminate potential function of partially produced proteins.
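The kind of search described above can be sketched as follows. This is an independent, minimal illustration and not the inventors' script: it assumes the splicing requirements can be written as a fixed motif on each side of the insertion point (with N for unconstrained positions), and the motifs used in the usage lines are placeholders that must be replaced by the flanking requirements established for the intron. The function names are hypothetical. For every candidate position, the sketch tries synonymous codons for the codons overlapping the junction and keeps the recoding with the fewest point mutations.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
SYNONYMS = {}
for _codon, _aa in CODON_TABLE.items():
    SYNONYMS.setdefault(_aa, []).append(_codon)


def translate(cds):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def matches(window, motif):
    """True if window agrees with motif at every position that is not 'N'."""
    return all(m == "N" or m == w for w, m in zip(window, motif))


def find_insertion_sites(cds, left_motif, right_motif):
    """Return (position, recoded_cds, n_point_mutations) for each usable site.

    position is the 0-based index in the CDS before which the intron would be
    inserted; recoded_cds encodes the same protein as the input.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    motif = left_motif + right_motif
    sites = []
    for pos in range(len(left_motif), len(cds) - len(right_motif) + 1):
        start, end = pos - len(left_motif), pos + len(right_motif)
        first, last = start // 3, (end - 1) // 3      # codons overlapping the window
        original = [cds[3 * c:3 * c + 3] for c in range(first, last + 1)]
        offset = start - 3 * first
        best = None
        # Try every synonymous recoding of the overlapping codons.
        for combo in product(*(SYNONYMS[CODON_TABLE[c]] for c in original)):
            segment = "".join(combo)
            if not matches(segment[offset:offset + len(motif)], motif):
                continue
            changes = sum(a != b for a, b in zip(segment, "".join(original)))
            if best is None or changes < best[1]:
                best = (segment, changes)
        if best is not None:
            recoded = cds[:3 * first] + best[0] + cds[3 * (last + 1):]
            sites.append((pos, recoded, best[1]))
    return sites


# Placeholder motifs and a made-up CDS (Met-Gly-Leu-Lys), for illustration only.
example_cds = "ATGGGACTGAAA"
sites = find_insertion_sites(example_cds, "GGT", "CT")
for position, recoded, n_mut in sites:
    print(position, recoded, n_mut, translate(recoded) == translate(example_cds))
```

A fuller implementation would additionally rank candidate sites by their expected splicing efficiency and by their distance from the start of the gene, in line with the recommendation above.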
To show the functionality and applicability of SIBR in eukaryotic systems, we transferred SIBR into the eukaryotic model organism baker's yeast (Saccharomyces cerevisiae) and controlled the expression of the FnCas12a protein.
To control the activity of FnCas12a, we sought to disrupt its activity by disrupting the encoded protein through the placement of SIBR. To this end, by using the acquired knowledge from Example 1, 2 and 3 and the generated script from Example 4 and/or 11, we introduced SIBR before the RuvC I domain at amino acid position 859 (
PL-319, PL-320 or pUDE731 were co-transformed in the yeast S. cerevisiae with either a plasmid containing a non-targeting spacer (PL-207) or a plasmid containing a targeting spacer (PL-074). The targeting spacer was targeting the ADE2 gene. To transform S. cerevisiae, the LiAc/SS carrier DNA/PEG method by Gietz and Schiestl (Gietz, R. D. and Schiestl, R. H., 2007. High-efficiency yeast transformation using the LiAc/SS carrier DNA/PEG method. Nature protocols, 2(1), pp. 31-34) was used. 500 ng of each plasmid was used per transformation. After transformation, the transformed yeast cells were recovered in YPD medium for 3 hours at 30° C. and then serially diluted in PBS and plated on drop-out (omitting uracil; for the selection of PL-319, PL-320 or pUDE731) minimal agar medium (1.7 g/L bacto-yeast nitrogen base w/o amino acids and without ammonium sulfate; 1 g/L monosodium glutamate; 20 g/L glucose; 20 g/L agar) containing 200 μg/mL Geneticin (G418 sulfate) antibiotic (for the selection of PL-074 or PL-207; targeting and non-targeting plasmids) and containing different concentrations of theophylline (0, 5, 10, 20 mM).
The results of this experiment are depicted in
As noted herein above, Group I introns, like the T4 td intron, form core secondary structures consisting of multiple paired regions. In principle, any Group I intron can be turned into a Self-splicing Intron Based Riboswitch (SIBR) according to the invention by following a stepwise approach, for example as described herein below.
As a first step, a library of mutant 5′ and 3′ exonic sequences is developed, since the 5′ and 3′ exonic sequences of Group I introns interact with the intron sequence and affect the secondary and tertiary structure of the intron. This mutant library will serve as the basis to define the effect of the 5′ and 3′ exonic sequences on the splicing efficiency of the intron. Moreover, this library will contain introns with a range (low to high) of splicing efficiencies. It is likely that the mutant intron library will contain introns with better splicing efficiency than the wild type intron (similar to the results observed in Examples 1-3). Also, this library will allow the transfer of the intron of interest to the open reading frame of any gene of interest without disturbing the amino acid sequence of the target gene/protein, for example when applying the script of Example 4 and/or 11.
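Designing such a library amounts to enumerating exonic variants and classifying each position as a pair (P), wobble pair (W) or mismatch (M) against the intron bases it faces. The following is a minimal sketch under stated assumptions: the intron-side bases ("GAC") are a placeholder rather than the guide sequence of any particular intron, the bases are assumed to be listed already aligned position by position with the exon bases, and the function names are hypothetical.

```python
from itertools import product

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def classify(exon_base, intron_base):
    """Classify one RNA base pair as pair (P), wobble pair (W) or mismatch (M)."""
    pair = (exon_base, intron_base)
    if pair in WATSON_CRICK:
        return "P"
    if pair in WOBBLE:
        return "W"
    return "M"


def build_library(intron_bases):
    """Group every possible exon sequence by its P/W/M pattern.

    intron_bases lists the intron bases facing the mutated exon positions,
    in the same order as the exon bases (already aligned).
    """
    library = {}
    for exon in product("ACGU", repeat=len(intron_bases)):
        pattern = "".join(classify(e, i) for e, i in zip(exon, intron_bases))
        library.setdefault(pattern, []).append("".join(exon))
    return library


# Placeholder intron-side bases for three mutated exon positions.
library = build_library("GAC")
for pattern in sorted(library):
    print(pattern, library[pattern])
```

A wobble pair is only available where the facing intron base is G or U, which is why not every P/W/M combination exists for a given intron sequence.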
Next, to achieve inducible control over the splicing of the intron, an aptamer moiety which responds to specific small molecules (e.g. theophylline) is introduced at one or multiple pairing (P) domains of the intron. For example, as described by Thompson et al., 2002, and also shown in Examples 5 and 7, the theophylline aptamer is introduced at the P6 domain of the T4 td intron, turning it into an inducible self-splicing gene regulator. Another example is described by Kertsburg and Soukup, 2002 (Nucleic Acids Research, Volume 30, Issue 21, 1 Nov. 2002, pages 4599-4606), where they turned the Tetrahymena group I intron into an inducible self-splicing intron by replacing the P6 or P8 or both P6 and P8 domains with a theophylline aptamer. Similar approaches (to that of Thompson et al., 2002 and that of Kertsburg and Soukup, 2002) can be taken for any other Group I intron where one of their P domains is altered to contain an aptamer moiety that responds to specific small molecules and can consequently control the splicing of the intron.
After generating the mutant intron library (mutations at the 5′ and 3′ exonic sequences) and achieving inducible control over the splicing of the intron (through the introduction of an aptamer in one of the P domains of the intron), the generated intron variants can be moved to the ATG start codon, or 5′ to the start codon, of the polynucleotide portion encoding the POI, or within the polynucleotide portion encoding the POI. When transferring the intron to a location of choice, attention should be given to avoiding codon frameshifting after splicing, as this will result in a nonsense protein.
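The frameshift check mentioned above can be automated before a construct is built. The following is a minimal sketch, assuming that the 5′ exon starts at the ATG start codon and the 3′ exon ends at the stop codon, so that the joined exons should form one complete open reading frame; the function name and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_spliced_frame(five_exon, three_exon):
    """Check the exon-exon product of splicing; return (ok, reason).

    Assumes five_exon begins with the start codon and three_exon ends with
    the stop codon of the intended open reading frame.
    """
    spliced = (five_exon + three_exon).upper()
    if len(spliced) % 3 != 0:
        return False, "frameshift: spliced length is not a multiple of 3"
    codons = [spliced[i:i + 3] for i in range(0, len(spliced), 3)]
    if codons[0] != "ATG":
        return False, "no ATG start codon"
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        return False, "premature stop codon"
    if codons[-1] not in STOP_CODONS:
        return False, "no terminal stop codon"
    return True, "in frame"


# Made-up exons: the second pair carries one extra nucleotide at the junction.
print(check_spliced_frame("ATGGCTGGT", "CTGAAATAA"))
print(check_spliced_frame("ATGGCTGGT", "CCTGAAATAA"))
```

The same check applies when any short tag sequence left at the junction after splicing is counted as part of one of the exons.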
Group II introns are found in higher (plants) and lower eukaryotes (fungi and yeasts) but also in bacteria. Similar to Group I introns, Group II introns reside within genes (separating them into 5′ and 3′ exons) and, upon excision (forming a lariat product rather than the linear product observed for Group I introns), allow the formation of a functional protein. Group II introns can self-splice, although some intron-encoded proteins (IEPs) may facilitate splicing by stabilizing the intron RNA structure. The 5′ and 3′ exonic sequences of the Group II introns (called intron-binding site or IBS) interact with conserved domains of the intron (called exon-binding site or EBS) to form long-range tertiary interactions. The intron-exon interactions are necessary for splicing as they bring the intron to the active site of the exons in order to facilitate the typical transesterification reaction that mediates the excision of the intron. The necessity of intron-exon interactions for splicing translates into a limitation in transferring any Group II intron into any gene of interest (GOI), as the exon sequences need to be conserved. To overcome this, an approach similar to the one developed in this patent for Group I introns can be used.
First, a mutant library of Group II introns can be generated in which the exon sequences (IBS1 and IBS2 for the 5′ exon and IBS3 for the 3′ exon) are mutated. In some cases, and especially when the IBS is heavily mutated, the EBS might need to be modified as well to maintain the IBS-EBS base-pairing necessary for the formation of long-range tertiary interactions. The generated mutant library is then assessed for the efficiency of the self-splicing activity of the intron, by following an approach similar to that employed for LacZ as described in Examples 1-3. It is important to note that self-splicing efficiency can be assessed by any other in vitro or in vivo method (other than LacZ) as long as it can distinguish the formation of spliced products from un-spliced products, or the formation and quantity of active protein from inactive proteins.
In the case where the Group II intron mutant library is assayed through a protein (similar to that of LacZ; Examples 1-3) then, for convenience and to maintain the coding sequence of the protein, the Group II intron can be transferred directly after the ATG start codon. This approach was described in Examples 1-3 (LacZ) and Example 5 (FnCas12a). The outcome of the Group II intron mutant library assay is expected to yield a range of well and poorly splicing introns, which can then be used to modulate/tune the expression of the gene/RNA/protein of interest.
After establishing the requirements for splicing as defined by the IBS-EBS interactions, a script similar to Example 4 and/or 11 can be developed that allows for transferring the mutant Group II intron to virtually any gene/RNA/protein of interest.
In case inducible self-splicing is required, an aptamer moiety which responds to specific small molecules (e.g. theophylline) is introduced at one or multiple pairing (P) domains of the intron. To achieve this, the approaches developed and applied by Thompson et al. (2002) Ibid, and Kertsburg and Soukup (2002) Ibid, can be used.
After generating the mutant Group II intron library (mutations at the 5′ and 3′ exonic sequences) and achieving inducible control over the splicing of the Group II intron (through the introduction of an aptamer in one of the P domains of the intron), the generated Group II intron variants can be moved to the ATG start codon, or 5′ of the ATG start codon, of the polynucleotide portion encoding the POI, or within the polynucleotide portion encoding the POI. When transferring the intron to the location of choice, attention should be given to avoiding codon frameshifting after splicing, as this will result in a nonsense protein.
In general, Group III introns are short (approx. 100 nt) U-rich introns which are predominantly found in Euglena gracilis. Group III introns are considered streamlined versions of Group II introns as they retain the 5′ splice site of Group II introns but lack the catalytic domain V and the domains II-IV. To splice, a mechanism similar to that of Group II introns is used, where the IBS1 pairs with EBS1 to form long-range tertiary interactions and facilitate splicing (Hong, L. and Hallick, R. B., 1994 “A group III intron is formed from domains of two individual group II introns” Genes & Development, 8(13), pp. 1589-1599). In principle, Group III introns can be turned into SIBR by changing/mutating the IBS-EBS interactions as described in Example 14 and by introducing a ligand-dependent aptamer to one of its domains (e.g. at the VI domain). The defined mutant libraries can then be used to modulate the splicing efficiency of the introns.
Throughout the description and claims of this specification, the words “comprise” and “contain” and variations of them mean “including but not limited to”, and they are not intended to (and do not) exclude other moieties, additives, components, integers or steps. Throughout the description and claims of this specification, the singular encompasses the plural unless the context otherwise requires. In particular, where the indefinite article is used, the specification is to be understood as contemplating plurality as well as singularity, unless the context requires otherwise.
Features, integers, characteristics, compounds, chemical moieties or groups described in conjunction with a particular aspect, embodiment or example of the invention are to be understood to be applicable to any other aspect, embodiment or example described herein unless incompatible therewith. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive. The invention is not restricted to the details of any foregoing embodiments. The invention extends to any novel one, or any novel combination, of the features disclosed in this specification (including any accompanying claims, abstract and drawings), or to any novel one, or any novel combination, of the steps of any method or process so disclosed.
The reader's attention is directed to all papers and documents which are filed concurrently with or previous to this specification in connection with this application and which are open to public inspection with this specification, and the contents of all such papers and documents are incorporated herein by reference.
Nucleotide Sequences
Number | Date | Country | Kind |
---|---|---|---|
2015944.8 | Oct 2020 | GB | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2021/077682 | 10/7/2021 | WO |