Embodiments relate to one-pot methods for synthesis of double stranded polynucleotides comprising at least a whole open reading frame, and in particular, embodiments relate to covalently joining oligonucleotides to form synthetic genes that are transcribed and translated in vivo.
The chemical synthesis of oligonucleotides and their enzyme-mediated assembly into genes and genomes has significantly advanced multiple scientific disciplines. While current approaches are widely employed, they are not without their shortcomings. First, the reliance on enzymes for assembly is not amenable to automation, increasing the time and effort required. Second, enzymatic assembly does not allow the incorporation of epigenetic information and/or modified bases.
The ability to design and synthesize large fragments of DNA has underpinned and revolutionized multiple fields, including cell biology, biotechnology and synthetic biology. While chemical synthesis of short oligonucleotide fragments (<100 bases) is routine, the synthesis of longer fragments is often plagued by poor yields and high error rates which occur as a function of oligonucleotide length. As a result, large DNA fragments need to be assembled from multiple short oligonucleotides using enzymes. Current assembly approaches typically make use of PCR amplification or enzymatic ligation, and although these approaches are well established and form the cornerstone of current gene and genome synthesis efforts, they have some limitations. First, chemical modifications and epigenetic information cannot be introduced into a gene or genome site-specifically as modified bases are not differentiated by PCR enzymes. Second, current assembly methods do not readily lend themselves to automation, and therefore require significant effort and time. Third, the assembly reactions are often low yielding and carried out at a small scale, so require a final PCR amplification step to isolate the full length product from the partially assembled fragments.
Regardless of these limitations, enzymatic assembly of genes and genomes has been used to prepare genomes of over a million base pairs and is routinely employed on a smaller scale for the preparation of genes in everyday research. Previous studies have attempted to chemically ligate synthesised oligonucleotides to form longer DNA molecules as described in WO2008/120016, Kumar et al. 2007, J Am Chem Soc 129, 6859-6864, Kocalka et al. 2008, Chem Bio Chem, 9, 1280-1285, and El-Sagheer et al. 2009, J Am Chem Soc. 131(11), 3958-3964. The drawback with these molecules was that, because they contained unnatural linkages between the oligonucleotides they were not fully active in a biological system. DNA and RNA polymerases could not read these nucleotide sequences accurately and mis-read or missed out nucleotides when trying to replicate the sequences.
Given the challenge of preparing ever increasing lengths of DNA at larger scale and the need for chemically modified DNA constructs, alternative approaches to current DNA assembly methods are needed.
Embodiments will be readily understood by the following detailed description in conjunction with the accompanying drawings. Embodiments are illustrated by way of example and not by way of limitation in the figures of the accompanying drawings.
In the following detailed description, reference is made to the accompanying drawings which form a part hereof, and in which are shown by way of illustration embodiments that may be practiced. It is to be understood that other embodiments may be utilized and structural or logical changes may be made without departing from the scope. Therefore, the following detailed description is not to be taken in a limiting sense, and the scope of embodiments is defined by the appended claims and their equivalents.
Various operations may be described as multiple discrete operations in turn, in a manner that may be helpful in understanding embodiments; however, the order of description should not be construed to imply that these operations are order dependent.
The description may use perspective-based descriptions such as up/down, back/front, and top/bottom. Such descriptions are merely used to facilitate the discussion and are not intended to restrict the application of disclosed embodiments.
The terms “coupled” and “connected,” along with their derivatives, may be used. It should be understood that these terms are not intended as synonyms for each other. Rather, in particular embodiments, “connected” may be used to indicate that two or more elements are in direct physical or electrical contact with each other. “Coupled” may mean that two or more elements are in direct physical or electrical contact. However, “coupled” may also mean that two or more elements are not in direct contact with each other, but yet still cooperate or interact with each other.
For the purposes of the description, a phrase in the form “A/B” or in the form “A and/or B” means (A), (B), or (A and B). For the purposes of the description, a phrase in the form “at least one of A, B, and C” means (A), (B), (C), (A and B), (A and C), (B and C), or (A, B and C). For the purposes of the description, a phrase in the form “(A)B” means (B) or (AB) that is, A is an optional element.
The description may use the terms “embodiment” or “embodiments,” which may each refer to one or more of the same or different embodiments. Furthermore, the terms “comprising,” “including,” “having,” and the like, as used with respect to embodiments, are synonymous.
All publications referred to in this application are hereby incorporated by reference in their entirety.
Embodiments herein provide purely chemical methods for linking and assembling synthetic oligonucleotides into larger fragments, which methods overcome the limitations of existing enzymatic DNA assembly methods, and which reduce the time and cost associated with gene and genome synthesis. It is an aim of the present disclosure to provide a method of synthesising a double stranded polynucleotide comprising at least a whole open reading frame that can be transcribed and translated in vivo to express a functional protein. In a first aspect the present disclosure provides a method of synthesising a double stranded polynucleotide comprising at least a whole open reading frame, wherein the method comprises:
I) providing a set of self-templating oligonucleotides wherein the set of self-templating oligonucleotides encodes a double stranded polynucleotide that is at least a whole open reading frame;
II) annealing the oligonucleotides together so that they form the double stranded polynucleotide, with the oligonucleotides in the correct order and wherein adjacent oligonucleotides in the same strand are close enough to each other that a covalent bond could form between the 5′ end of one oligonucleotide and the 3′ end of an adjacent oligonucleotide; and
III) covalently bonding the backbones of adjacent oligonucleotides in the same strand to each other so that a covalent bond is formed between the 5′ end of an oligonucleotide and the 3′ end of the adjacent oligonucleotide to provide the double stranded polynucleotide comprising at least a whole open reading frame, wherein the covalent bonds between adjacent oligonucleotides can be read-through accurately by a DNA polymerase or an RNA polymerase in vivo.
The polynucleotide may be at least a whole open reading frame, for example an open reading frame encoding one or more proteins or protein fragments. The polynucleotide may comprise a whole gene or genome. The polynucleotide may comprise an open reading frame encoding one or more proteins or protein fragments as well as regulatory sequences for correct expression of the protein or protein fragment in a cell. For example, the polynucleotide may encode upstream regulatory sequences, such as promoters and enhancers, start codons, stop codons, downstream regulatory sequences as well as the polypeptide sequence. The polynucleotide sequence, gene or genome may include epigenetically modified bases, including methyl-cytosine or hydroxymethyl-cytosine, or other chemical moieties, such as a fluorescent group, or added chemical functional groups for further decoration of the gene after assembly (for example bromination).
The oligonucleotides may be a set of oligonucleotides that, together as a set, encode the whole sequence of the polynucleotide. The set of oligonucleotides comprises two or more, three or more, five or more or ten or more oligonucleotides that encode the whole sequence of the sense strand of the polynucleotide and two or more, three or more, five or more or ten or more oligonucleotides that encode the whole sequence of the antisense strand of the polynucleotide.
The set of oligonucleotides may comprise more than 2, more than 3, more than 4, more than 5, more than 6, more than 7, more than 8, more than 9, more than 10, more than 15 or more than 20, more than 50, more than 100 or more than 200 oligonucleotides. The set of oligonucleotides may comprise the same number of oligonucleotides encoding the sense strand and the antisense strand.
The set of oligonucleotides may be self-templating. Self-templating means that each oligonucleotide sequence overlaps with at least part of the sequences of two or more adjacent oligonucleotides from the opposite strand. Each oligonucleotide anneals to two or more adjacent oligonucleotides from the opposite strand so that each oligonucleotide acts as a template to ensure that some of the oligonucleotides on the opposite strand are in the correct order and correctly positioned. The set of oligonucleotides may be designed so that the oligonucleotides on each strand assemble in the correct order because they use the oligonucleotides of the other strand as templates which cross the joins between adjacent oligonucleotides. An example of self-templating oligonucleotides is shown in
When the set of oligonucleotides is annealed together, they template each other and assemble in the correct order to provide the whole double stranded polynucleotide sequence but with breaks in the phosphodiester backbone where the gaps between adjacent oligonucleotides are. These gaps are only the distance between adjacent base pairs so the 3′ end of one oligonucleotide is very close to the 5′ end of the next adjacent oligonucleotide.
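The self-templating layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the design procedure of the disclosure: the function names, the blunt-ended duplex, the toy sequence and the default minimum overlap of 10 bases (borrowed from the overlap reported for the iLOV design) are all hypothetical choices made for the example. The sketch splits a duplex into sense and antisense oligonucleotides and checks that every join on one strand is bridged by an intact oligonucleotide on the other strand.

```python
# Hypothetical helper names; a sketch of the self-templating layout only.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def split_at(seq, cuts):
    """Split seq at cut positions; a cut at p falls between seq[p-1] and seq[p]."""
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def is_self_templating(length, sense_cuts, antisense_cuts, min_overlap):
    """True if every join on one strand is spanned by an opposite-strand
    oligonucleotide reaching at least min_overlap bases on both sides.
    Assumes a blunt-ended duplex, so positions 0 and length are boundaries."""
    def bridged(cuts, opposite_cuts):
        boundaries = [0, length] + list(opposite_cuts)
        return all(abs(c - b) >= min_overlap for c in cuts for b in boundaries)

    return bridged(sense_cuts, antisense_cuts) and bridged(antisense_cuts, sense_cuts)


def design_oligos(sense, sense_cuts, antisense_cuts, min_overlap=10):
    """Return (sense_oligos, antisense_oligos), each oligonucleotide written 5'->3'.
    Cut positions for both strands are given in sense-strand coordinates."""
    if not is_self_templating(len(sense), sense_cuts, antisense_cuts, min_overlap):
        raise ValueError("joins on opposite strands are too close together")
    sense_oligos = split_at(sense, sense_cuts)
    # Antisense fragments are cut in sense coordinates, then reverse-complemented,
    # so the list runs 3'->5' along the antisense strand.
    antisense_oligos = [reverse_complement(f) for f in split_at(sense, antisense_cuts)]
    return sense_oligos, antisense_oligos


# Toy 60-base sequence (not the iLOV gene) with staggered joins.
toy_sense = "ATGC" * 15
toy_sense_oligos, toy_antisense_oligos = design_oligos(
    toy_sense, sense_cuts=[20, 40], antisense_cuts=[10, 30, 50]
)
```

In this toy layout each sense join sits 10 bases from the nearest antisense join, so each oligonucleotide anneals to two or more oligonucleotides of the opposite strand, which is the property the description relies on.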
A covalent bond can be formed chemically between the adjacent 3′ and 5′ ends of adjacent oligonucleotides on the same strand to form a double stranded polynucleotide with no breaks in either of the strands.
The chemistry of the covalent bond joining the 3′ end of one oligonucleotide to the 5′ end of the next oligonucleotide may be chosen to form a covalent linkage that can be read through in vitro and in vivo by natural polymerases and transcription factors so that the resulting polynucleotide can be transcribed and translated in vivo to express a protein encoded by the polynucleotide. The covalent linker may be a linker that can be repaired by the cellular DNA machinery to a phosphodiester bond prior to transcription.
The covalent bond joining the oligonucleotides may be formed by a chemical process in vitro. The covalent bond joining the oligonucleotides may not be a natural phosphodiester bond. The covalent bond joining the oligonucleotides may not be made by enzymes. The covalent bond joining the oligonucleotides may not be made by a ligase.
In order to use a chemical method to create a covalent bond between the 3′ end of one oligonucleotide and the 5′ end of the adjacent oligonucleotide, the 3′ end and/or the 5′ end of one or more, or each, oligonucleotide may be chemically modified to introduce a functional group that is able to react to form a bond with the functional group on the adjacent oligonucleotide. The functional groups on all of the 3′ ends of all of the oligonucleotides may be the same functional group. The functional groups on all of the 5′ ends of all of the oligonucleotides may be the same functional group. The functional groups on the 3′ ends of all of the oligonucleotides may be different from the functional groups on the 5′ ends of all of the oligonucleotides. All of the functional groups may be the same. The functional group on the 5′ end of one oligonucleotide may be specifically chosen to react with and form a covalent linkage with the functional group on the 3′ end of an adjacent oligonucleotide to form a linkage that can be read through by polymerases and transcription factors in vivo.
Functionalised means that a functional group has been added. Where the end of an oligonucleotide has been functionalised, a functional group has been added to the end of the oligonucleotide.
The functional groups may be any groups that react to form disulphide bonds, amides, alkenes, alkanes, or may be any two heteroatoms that can be joined together or any heteroatom joined to a carbon atom.
The chemical reaction between the functional group at the 3′ end of one oligonucleotide and the functional group at the 5′ end of the adjacent oligonucleotide may be spontaneous or initiated by addition of a catalyst or further chemical.
Where the functional groups are an alkyne and an azide the catalyst may be Cu (I) which catalyses the reaction to form a triazole phosphodiester mimic or no catalyst may be required.
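The pairing of end groups described above can be illustrated with a short Python sketch. It is a hedged illustration only: the pairing table, the group labels and the function names are assumptions introduced for the example, and the table lists only the azide/alkyne pair that yields a triazole, in either orientation. The sketch walks along one strand in the 5′ to 3′ direction and reports the linkage expected at each join.

```python
# Assumed, illustrative table: (3'-end group, 5'-end group) -> linkage formed.
REACTIVE_PAIRS = {
    ("alkyne", "azide"): "triazole",
    ("azide", "alkyne"): "triazole",
}


def linkage_formed(three_prime_group, five_prime_group):
    """Return the linkage formed between two facing end groups, or None."""
    return REACTIVE_PAIRS.get((three_prime_group, five_prime_group))


def check_strand(oligos):
    """oligos: list of (five_prime_group, three_prime_group) tuples,
    ordered 5'->3' along one strand. Returns the linkage at each join and
    raises ValueError if the facing groups at any join cannot react."""
    linkages = []
    for index, (upstream, downstream) in enumerate(zip(oligos, oligos[1:])):
        link = linkage_formed(upstream[1], downstream[0])
        if link is None:
            raise ValueError(f"join {index} has no reactive pair of end groups")
        linkages.append(link)
    return linkages


# Five oligonucleotides on one strand, each with a 5'-azide and a 3'-alkyne.
example_strand = [("azide", "alkyne")] * 5
example_linkages = check_strand(example_strand)
```

Because every 3′ end carries the same group and every 5′ end carries its reactive partner, all joins on a strand can react in the same step.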
The catalyst may not be an enzyme. For example, the catalyst may not be a ligase. The covalent linkage between the oligonucleotides is formed using a chemical process in vitro. This is advantageous because a chemical process has a higher yield and can be done on a larger scale. A chemical process does not depend on enzymes, which can contain impurities.
The reaction may occur upon the addition of additional reagents (e.g. coupling reagents) to initiate the reaction or occur due to the proximity of the functional groups once self-assembly of the gene has occurred.
When synthetic oligonucleotides are ligated together chemically any epigenetic markers or modified bases that are in the oligonucleotides are unchanged and this allows large open reading frames, whole genes or a whole genome to be synthesised chemically on a large scale including epigenetic markers.
The chemical linkages between the oligonucleotides are read through by enzymes in vivo. A polynucleotide or gene with more than 2, more than 3, more than 4, more than 5, more than 10 or more than 15, more than 30, more than 50 or more than 100 chemical linkages may be read through by RNA and/or DNA polymerases, prokaryotic and/or eukaryotic transcription factors and/or DNA replication machinery in vitro or in vivo.
The method may be a one-pot ligation. This means that the oligonucleotides are designed to template each other so that the oligonucleotides assemble in the correct orientation and the correct order when they are annealed together and the oligonucleotides can all be ligated together in one chemical reaction that forms bonds between each of the oligonucleotides in the same reaction to form a double stranded polynucleotide. The double stranded polynucleotide may encode a whole gene.
An advantage of the present method is that assembly of the oligonucleotides and joining of the oligonucleotides to form the final product can be carried out without the need for a solid support such as resin beads. Because the oligonucleotides are self-templating they can self-assemble in the correct order in solution without splints or additional oligonucleotides or solid supports. Therefore, an advantage of the present method is that it provides a one-pot method for assembling oligonucleotides to form a polynucleotide that can be a whole gene or genome in solution.
The double stranded polynucleotide may be at least 300 base pairs, at least 400 base pairs, at least 500 base pairs, at least 800 base pairs, at least 1000 base pairs, at least 1500 base pairs, at least 2000 or at least 5000 base pairs long.
Each of the oligonucleotides may be at least 30 base pairs, at least 50 base pairs, at least 70 base pairs, at least 100 base pairs or at least 200 base pairs long. Each oligonucleotide may be between 30 and 200 base pairs long.
One or more of the oligonucleotides may be chemically synthesised using standard chemical oligonucleotide synthesis methods known in the art. One or more of the oligonucleotides may be produced using any standard laboratory technique. The oligonucleotides may all be chemically synthesised. The oligonucleotides may be chemically synthesised with functional groups attached to the 5′ and/or 3′ ends.
The oligonucleotides may be synthesised including non-standard bases and/or epigenetic modifications, for example methylated bases. An advantage of the present method is that many copies of the polynucleotide can be made by chemical synthesis and chemical ligation and the epigenetic modifications remain unchanged. This is in contrast to amplification by PCR techniques, which do not recognise non-standard bases and epigenetic modifications such as methylation and do not replicate them.
The polynucleotide sequence, gene or genome may include epigenetically modified bases, including methyl-cytosine, hydroxymethyl-cytosine, formyl cytosine, methyl adenine, hydroxymethyladenine or other chemical moieties, such as a fluorescent group, or added chemical functional groups for further decoration of the gene after assembly (for example bromination).
One method of chemical ligation that can ligate oligonucleotides together by forming a link between the oligonucleotide backbones is a triazole phosphodiester mimic in RNA as described in El-Sagheer and Brown 2010, PNAS vol. 107 no. 35, 15329-15334 and is also a phosphodiester mimic in DNA that can be read through by DNA and RNA polymerases as described in El-Sagheer et al. 2011, PNAS vol. 108 no. 28, 11338-11343. Both of the above publications are incorporated herein in their entirety. The 5′ end of an oligonucleotide is functionalised with an alkyne group and the 3′ end of an adjacent oligonucleotide is functionalised with an azide group and the covalent bond formed between the two functionalised groups is a triazole phosphodiester mimic. The reaction is catalysed by Cu(I).
The reaction to link the oligonucleotides may form a triazole phosphodiester mimic and follow the reaction scheme below or an RNA equivalent thereof:
This reaction provides a triazole phosphodiester mimic that has an overall shape similar to that of a phosphodiester group.
The method of the present disclosure may further comprise the steps of:
I) Ligating the double stranded polynucleotide into a vector, preferably an expression vector; and
II) Transforming cells with the expression vector such that the cells express the open reading frame.
The polynucleotide may be a whole open reading frame or a whole gene and may include upstream and downstream regulatory elements. The polynucleotide may be ligated into a vector. The polynucleotide may be ligated into an expression vector that is suitable for expressing the protein encoded by the open reading frame in cells. The vector may be suitable for expressing the open reading frame in prokaryotic cells, for example bacterial cells, eukaryotic cells, such as animal cells and/or human cells.
The polynucleotide may be designed or arranged to complement the vector that it will be ligated into to ensure that, once the polynucleotide is ligated into the vector, all of the necessary sequences are present to allow the polypeptide to be expressed in the chosen cell type.
The polynucleotide may be directly transformed into a cell (prokaryote or eukaryote) or may be designed with ends that are suitable for ligating into a vector. For example, the polynucleotide may be made with overhangs at each end or with restriction sites at each end that can be cleaved to provide ends that are suitable for ligating into the chosen vector.
The vector comprising the polynucleotide may be transformed into any suitable type of cells. As the vector/polynucleotide construct may be constructed with all of the necessary sequences for expression the cells may express a protein encoded by the open reading frame.
In order to facilitate expression in prokaryotic or eukaryotic cells, each terminus of the gene may be pre-designed to include sticky ends for direct ligation into an expression plasmid. The start of the assembled polynucleotide may contain transcription factor or polymerase binding sites and/or non-natural nucleosides (such as locked nucleic acids) to prevent degradation of the linear gene product.
The polynucleotide may not be ligated into a vector.
The polynucleotide may be designed to comprise 2-3 or more locked nucleic acids at each end so that it can remain linear and be transformed into cells as a linear piece of DNA. Transforming linear DNA is particularly advantageous for eukaryotic cells, for example human cells.
In a second aspect the present disclosure provides a double stranded DNA sequence made by the method of the disclosure. The double stranded DNA sequence may comprise at least a complete open reading frame and may comprise regulatory sequences or a gene.
In a further aspect, the present disclosure provides an expression vector comprising a double stranded DNA sequence made by the method of the disclosure and comprising at least a complete open reading frame.
In a further aspect, the present disclosure provides a cell comprising a DNA sequence made by the disclosed methods and comprising a double stranded DNA sequence comprising at least a complete open reading frame.
Thus, in various embodiments, the present disclosure relates to the use of click-DNA ligation for the fully chemical, one-pot assembly of oligonucleotides into a gene. The potential of this method is demonstrated by synthesizing the 335 base-pair gene encoding the green fluorescent protein iLOV from ten functionalized oligonucleotides containing 5′-azide and 3′-alkyne units. The resulting click-linked iLOV contains eight triazoles at the sites of click-ligation in its backbone, yet is fully biocompatible. Click-linked iLOV is replicated by DNA polymerases in vitro and encodes a functional iLOV protein in E. coli. This fully chemical approach to gene synthesis may be employed for the construction of epigenetically modified genes and genomes.
In this strategy, a highly specific and selective chemical reaction is used to join oligonucleotides with the required functional groups at each terminus (
A possible chemical reaction for this purpose is the copper-catalyzed azide-alkyne cycloaddition (CuAAC), which has been used by us and others to link oligonucleotides functionalized with a 3′-propargyl and a 5′-methylene azide (FIG. 1B). The resulting click-linked DNA backbone has been shown to be accurately replicated and transcribed in bacterial and human cells when a single triazole linker was incorporated into each strand of a DNA duplex. In addition, the linker has been shown to be accurately transcribed in vitro by T7 RNA polymerase, while structural studies illustrated minor disturbance to the double helix structure from a single click-linker. The present inventors have surprisingly shown that it is possible to incorporate multiple non-natural linkers, in this case multiple click-linkers, into the same DNA strand and that they are tolerated by living systems. This is surprising because the efficiency of natural polymerases was thought to decrease with just one non-natural linkage in an in vivo system. The present inventors have taken this a step further by combining the use of multiple non-natural linkers, for example multiple click-DNA ligation reactions, with the self-assembling properties of DNA for one-pot gene synthesis that creates a gene that can be expressed with significant efficiency in vivo and can retain epigenetic information.
All previous gene synthesis methods use enzymes to assemble synthetic oligonucleotides into gene- or genome-sized fragments. While these methods have been pushed to their limit, achieving challenging feats such as the synthesis of whole prokaryotic genomes and a eukaryotic chromosome, the reliance on enzymatic assembly limits scalability as well as the ability to encode epigenetic information (which cannot be read or transferred by the enzymes used, and/or is erased during assembly). Yet the ability to include epigenetic information in synthetic genes will be critical in meeting the challenge of the next phase of DNA synthesis, namely the goal of synthesizing a functional human genome. It may be reasonably argued that given the extensive level of cytosine methylation and hydroxymethylation in the human genome, and the critical role it plays in gene regulation, synthesizing the human genome will only be meaningful and biologically relevant if it also contains epigenetic information. To overcome these limitations and enable the synthesis of epigenetically modified genes and genomes, the present inventors have demonstrated a fully chemical approach to gene assembly, using chemically modified oligonucleotides that are covalently joined into genes by a suitable chemical reaction. The inventors have demonstrated this possibility using click-chemistry and the CuAAC reaction; but it should be noted that the principle of chemical DNA ligation is not limited to this reaction and can (and should) be applied to a variety of chemical reactions. The one key requirement, however, is that the functional group produced on the DNA backbone by chemical ligation is biocompatible. The inventors chose the CuAAC reaction owing to its fast reaction rate, high yield, compatibility with aqueous media and biocompatibility, and because azides and unactivated alkynes are orthogonal to the functional groups present in oligonucleotides.
However, there are many other examples of DNA backbone mimics that are biocompatible and could be successfully used in place of the CuAAC reaction for chemical DNA ligation.
Combining conventional oligonucleotide synthesis with chemical DNA ligation as demonstrated in the present disclosure, not only allows the synthesis of genes on the μg-mg scale bearing site-specific modifications ranging from epigenetic bases to larger bulky groups such as fluorophores, but also enables the automation of gene assembly, a critical step for scaling-up production and reducing the time taken to make a gene or genome.
Here, the inventors demonstrate the one-pot synthesis of the 335 bp iLOV gene by click-DNA ligation of ten doubly-modified oligonucleotides. The resulting click-iLOV construct has eight triazole moieties in its backbone, yet is fully biocompatible in E. coli. The inventors isolated 95 μg of the click-linked iLOV gene after purification, a challenging feat when using conventional enzymatic methods. The inventors initially compared the properties of click-linked iLOV to the canonical equivalent (generated by PCR) in vitro; CD spectroscopy showed that both had similar secondary structures, and despite the presence of eight triazole backbone linkers, the melting temperature of click iLOV was only 3° C. lower than the canonical gene. They also observed that DNA polymerases can read through the click-linked iLOV.
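The figures quoted above can be related by simple arithmetic, sketched here in Python. This is a rough illustration under labeled assumptions rather than a calculation from the disclosure: the function names are hypothetical, the split of the ten oligonucleotides into five per strand is assumed, and the average mass of 650 g/mol per base pair is a common approximation for canonical double stranded DNA that ignores the mass difference introduced by the triazole linkers.

```python
# Assumption: rough average mass of one base pair of double stranded DNA.
AVG_MASS_PER_BP = 650.0  # g/mol per base pair


def backbone_joins(n_sense, n_antisense):
    """Joins in a linear duplex: one fewer than the number of
    oligonucleotides on each strand, summed over both strands."""
    return (n_sense - 1) + (n_antisense - 1)


def picomoles_of_duplex(mass_ug, length_bp, mass_per_bp=AVG_MASS_PER_BP):
    """Approximate amount of a duplex, in picomoles, from its mass in micrograms."""
    grams = mass_ug * 1e-6
    moles = grams / (length_bp * mass_per_bp)
    return moles * 1e12


# Ten oligonucleotides, assumed to be five per strand.
joins = backbone_joins(5, 5)
# 95 micrograms of a 335 bp duplex, under the approximation above.
amount_pmol = picomoles_of_duplex(95, 335)
```

With five oligonucleotides per strand the count comes to eight joins, matching the eight triazoles reported, and the isolated mass corresponds to a few hundred picomoles of duplex under the stated approximation.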
The inventors next assessed the biocompatibility of the click-iLOV gene in E. coli. To distinguish mutations in the progeny of the click-iLOV gene caused by the triazole linkers from those arising from oligonucleotide synthesis, or cloning, the inventors assembled a control gene using T4 DNA ligase with ten oligonucleotides (canonical equivalents of those used for click-ligation). Interestingly, it was found that there are fewer mutations in iLOV genes isolated from cells transformed with click-linked iLOV than in those isolated from cells transformed with ligase-linked iLOV, both in terms of ratio of functional genes produced (58.3±11.2% for click-linked iLOV versus 39.5±13.6% for ligase-linked iLOV), and ratio of functional genes containing errors (2% for click-linked iLOV versus 16% for ligase-linked iLOV). The mutations observed in the click-linked gene were not located at, or adjacent to, the sites of click-ligation, and are therefore unlikely to be a consequence of the triazole-linked backbone. Given the similarity in errors between the click-linked and canonical control gene, the most likely cause of these errors lies in the oligonucleotide synthesis and purification steps, and this may account for the higher error rates when using non-modified oligonucleotides. As they observed an effect from the multiple triazole linkers on PCR amplification in vitro, they next assessed the contribution of NER to the observed biocompatibility in cells. The inventors used a UvrB-deficient E. coli strain (incapable of NER) and observed a similar rate of error in the iLOV sequence isolated from the progeny of cells transformed with click-linked iLOV as for those transformed with ligase-linked iLOV. These data indicate that the triazole-linkers are truly biocompatible and not repaired. 
In this respect, had repair-mediated conversion of the triazole linkers to phosphodiester linkers been the origin of the observed biocompatibility, this would not necessarily have been a problem, as the sequence of the opposing strand (to the click-linker) used as a repair template is canonical and contains the correct genetic information. Regardless, the experiments demonstrate the viability of using a chemical ligation strategy for gene synthesis.
Gene synthesis has been driven forward by new techniques and technologies. The one-pot click-mediated DNA ligation approach presented here offers an alternative, fully chemical approach to gene assembly and has the potential to respond to the challenge of synthesizing epigenetically modified genes and genomes.
Results
Synthesis of Oligonucleotides Comprising the iLOV Gene.
The 335 bp gene encoding iLOV was codon-optimized for expression in E. coli. The sites of ligation were between adjacent deoxycytidine (dC) and deoxythymidine (dT) residues (CpT steps), and positioned throughout the gene to give an overlap of at least 10 base pairs between the sense and antisense strands (
The purity of the oligonucleotides and the integrity of the azide and alkyne functional groups were crucial factors in ensuring successful assembly of the full length gene, and expected function of the protein product. Two different methods of purifying the oligonucleotide were therefore evaluated. The crude solutions of each oligonucleotide were divided into two fractions; one fraction purified by semi-preparative HPLC using a hexylammonium acetate/acetonitrile buffer system and the other via polyacrylamide gel electrophoresis (PAGE). The purity of the oligonucleotides was then quantified via analytical HPLC and capillary electrophoresis. It was observed that HPLC purification yielded purer oligonucleotides than PAGE purification. Furthermore, the inventors found that capillary electrophoresis tended to overestimate the purity of the oligonucleotides compared to analytical HPLC. Consequently, only the HPLC-purified oligonucleotides were used to assemble the iLOV gene.
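The placement of ligation sites at CpT steps, described at the start of this section, lends itself to a simple search. The Python sketch below is a hypothetical illustration, not the procedure used to design the iLOV oligonucleotides: the function names and the toy sequences are assumptions. It lists candidate cut positions between a dC and the following dT on the sense strand, and the corresponding positions for the antisense strand expressed in sense-strand coordinates.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def cpt_cut_sites(strand):
    """Cut positions p on a strand written 5'->3' such that strand[p-1] is C
    and strand[p] is T, i.e. the join falls inside a CpT step."""
    s = strand.upper()
    return [i + 1 for i in range(len(s) - 1) if s[i:i + 2] == "CT"]


def antisense_cpt_cut_sites(sense):
    """CpT steps on the antisense strand, reported in sense-strand coordinates.
    A CpT step on the antisense strand appears as ApG on the sense strand."""
    n = len(sense)
    return sorted(n - p for p in cpt_cut_sites(reverse_complement(sense)))


# Toy sequences (not the iLOV gene).
sense_candidates = cpt_cut_sites("GGCTAA")
antisense_candidates = antisense_cpt_cut_sites("GGAGTT")
```

Candidate positions found this way would still need to be filtered so that joins on opposite strands are staggered by the required overlap.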
Click-Mediated Assembly of Gene Encoding iLOV.
The ten oligonucleotide fragments synthesized above were combined in ascorbate salt solution, heated at 95° C. for 3 min, then cooled to room temperature over 2 h to enable annealing. The inventors hypothesized that the self-templating properties of DNA would cause the oligonucleotides to anneal to give the unligated iLOV gene. The alkyne- and azide-functionalized termini of these thermally assembled oligonucleotides were simultaneously reacted to form triazoles via addition of copper sulphate. A control assembly reaction containing modified oligonucleotides, but no copper was also carried out to assess the importance of click-linking the annealed DNA. The crude click reactions were purified by PAGE under denaturing conditions to ensure that any unreacted fragments or by-products migrated separately from the assembled gene. As expected, in the absence of the copper catalyst, the individual oligonucleotide fragments did not assemble to form the iLOV gene, but rather migrated individually towards the bottom of the gel (
The Effect of Multiple Click-Linkers on the Secondary Structure of DNA.
A single triazole incorporated into the backbone of DNA has been shown to cause small distortions that result in displacement of the deoxyribose sugar and an increase in the distance between the bases flanking the triazole. Given this, the inventors were concerned that the presence of multiple triazoles might drastically perturb the secondary structure of the click-linked gene and affect its biocompatibility. The inventors therefore conducted circular-dichroism (CD) spectroscopy analysis to probe this. CD analysis of the click-linked iLOV gene gave bisignate signals with maxima at λ=+276/−248 nm which are characteristic of the B-type DNA helix, identical to that observed for the canonical iLOV gene (
Replication of Click-Linked iLOV In Vitro.
The ability of DNA polymerases to replicate click-linked iLOV in vitro was next assessed. The inventors used click-linked iLOV as the template for PCR amplification by either Taq or Pfu polymerase, and the unligated, modified oligonucleotides were used as a negative control. Click-linked iLOV was amplified by both DNA polymerases, giving a single band that appeared at the ~350 bp marker in the ladder (
These oligonucleotides were assembled in the correct order using four shorter complementary splints overlapping the terminal regions of each modified oligonucleotide by 20-25 bases either side of the ligation points. The assembled sense strand was observed as a high molecular weight band at the top of the gel; as with the one-pot gene assembly, cyclization and truncation products were observed in addition to the residual unreacted oligonucleotides. The click-linked sense strand of iLOV was extended successfully by Klenow, depleting all the primer added to the reaction, with only a single band corresponding to the full length iLOV gene being observed (
The inventors further probed the effect of DNA backbone triazoles on DNA polymerases using real-time PCR (qPCR). The inventors designed four primers to bind upstream of each triazole on the sense strand of click-linked iLOV, and monitored the rate of replication through the increasing number of triazoles by Taq polymerase. For comparison, the experiment was repeated using canonical iLOV as template. The inventors observed an inverse relationship between the number of triazoles in the DNA backbone and rate of PCR product formation (
Probing the Biocompatibility of the Click-Linked iLOV Gene in E. coli.
The biocompatibility of the click-linked gene was next probed in E. coli. For comparison, the inventors assembled the canonical iLOV gene with T4-DNA ligase, using 5′-phosphorylated equivalents of the oligonucleotides used for click-ligation. Both the click-linked iLOV and ligase-assembled iLOV were designed to contain sticky ends for ligation into the pRSET-mCherry plasmid24 cleaved by NdeI and EcoRI restriction endonucleases (
Assessing the Role of DNA Repair in the Biocompatibility of Click-Linked iLOV.
The green fluorescent phenotype observed in the majority of colonies transformed with click-linked iLOV and subsequent sequencing analysis confirmed the biocompatibility and high fidelity replication of this click-linked gene in E. coli (
Methods
For complete experimental methods see Supplementary Information.
One-Pot Assembly of iLOV Gene.
The oligonucleotides which comprised the sense and antisense strands of iLOV (F1-F5 and R1-R5, 4 nmol each) were combined and lyophilised. The oligonucleotides were resuspended in 0.2 M NaCl (400 μL), then annealed by heating at 95° C. for 15 min followed by gradual cooling to room temperature over 2 h; the temperature was reduced by 10° C. every 15 min. A Cu(I) click catalyst solution was prepared from tris(3-hydroxypropyltriazolylmethyl)amine (0.7 μmol in 0.2 M NaCl, 154 μL), sodium ascorbate (1.0 μmol in 0.2 M NaCl, 14 μL) and CuSO4·5H2O (0.1 μmol in 0.2 M NaCl, 7 μL). The pre-mixed catalyst solution (160 μL) was added and the mixture was thoroughly degassed using argon, then left at room temperature for 2 h. Formamide (560 μL) was added and the samples were analyzed by denaturing 8% polyacrylamide gel electrophoresis by applying 550 V for 3.5 h. Bands corresponding to the assembled gene were excised and extracted from the gel using the ‘crush and soak’ method. In brief, the excised polyacrylamide was broken down into small pieces then suspended in distilled water (25 mL). The suspension was shaken at 37° C. for 18 h then filtered through a plug of cotton wool. The filtrate was concentrated to approximately 2 mL then desalted through two NAP-25 columns and one NAP-10 column. The desalted eluent was lyophilised prior to use.
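The reagent arithmetic implied by the one-pot procedure above can be checked with a short script. The following Python sketch is illustrative only and is not part of the described method: the function and variable names are hypothetical, the ligand is labelled THPTA (the common abbreviation for tris(3-hydroxypropyltriazolylmethyl)amine), and room temperature is assumed to be 25° C. for the purpose of generating the cooling set-points.

```python
"""Reagent arithmetic for the one-pot click assembly (illustrative sketch)."""

# Values copied from the protocol text.
ANNEAL_VOLUME_UL = 400.0     # oligonucleotides resuspended in 0.2 M NaCl
CATALYST_ADDED_UL = 160.0    # portion of the pre-mixed catalyst that is added
# name -> (amount in umol, volume in uL) of each pre-mix component
CATALYST_PREMIX = {
    "THPTA": (0.7, 154.0),
    "sodium ascorbate": (1.0, 14.0),
    "CuSO4": (0.1, 7.0),
}


def delivered_catalyst():
    """Return ({name: (nmol delivered, final uM)}, final reaction volume in uL)."""
    premix_ul = sum(vol for _, vol in CATALYST_PREMIX.values())
    fraction = CATALYST_ADDED_UL / premix_ul      # share of the pre-mix used
    final_ul = ANNEAL_VOLUME_UL + CATALYST_ADDED_UL
    delivered = {}
    for name, (umol, _) in CATALYST_PREMIX.items():
        nmol = umol * 1000.0 * fraction
        # nmol per uL equals mM, so multiply by 1000 to express as uM
        delivered[name] = (nmol, nmol / final_ul * 1000.0)
    return delivered, final_ul


def annealing_schedule(start_c=95.0, hold_min=15, step_c=10.0,
                       step_min=15, room_c=25.0):
    """Return (elapsed minutes, set-point in deg C) pairs for the stepwise ramp.

    room_c is an ASSUMPTION; the text only says "room temperature".
    """
    points = [(0, start_c)]
    elapsed, temp = hold_min, start_c
    while temp > room_c:
        temp = max(room_c, temp - step_c)
        points.append((elapsed, temp))
        elapsed += step_min
    return points


if __name__ == "__main__":
    amounts, volume = delivered_catalyst()
    print(f"reaction volume: {volume:.0f} uL")
    for name, (nmol, micromolar) in amounts.items():
        print(f"{name}: {nmol:.1f} nmol, {micromolar:.0f} uM")
    for minute, temp in annealing_schedule():
        print(f"{minute:>3} min -> {temp:.0f} C")
```

Under these figures the three catalyst components total 175 μL, of which 160 μL is added, and the resulting 560 μL reaction volume matches the 560 μL of formamide added before electrophoresis, consistent with a 1:1 dilution.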
Ligation of Click-Assembled iLOV Gene into pRSET Backbone.
The pRSET backbone required for ligation was prepared by double digestion of pRSET mCherry. The plasmid was digested sequentially at the NdeI and EcoRI restriction sites using enzymes and buffer supplied by New England Biolabs, UK. Restriction digestions were performed in a 50 μL reaction volume with 1-2.5 μg of plasmid, 10× CutSmart® buffer (5 μL) and the restriction enzyme (20 U/μg plasmid). The restriction digestion reactions were incubated at 37° C. (60 min/μg plasmid), then the 5′-termini were dephosphorylated by addition of shrimp alkaline phosphatase (1 U/μg plasmid, New England Biolabs, UK). The reactions were incubated at 37° C. for a further 30 min, then the enzymes were inactivated by incubating at 70° C. for 10 min. The digested plasmid was analyzed by gel electrophoresis using a 1% agarose gel in 1× Tris/Borate/EDTA buffer (TBE) by applying 100 V for 30 min. The band corresponding to the backbone was excised and the DNA was isolated using a GeneJet PCR Purification Kit (Thermo Fisher Scientific, UK). Ligation of the click-ligated iLOV gene was performed in a total volume of 10 μL using 50 ng of pRSET vector, T4 DNA Ligase (1 μL, 3 U, Promega UK) and T4 DNA Ligase 10× reaction buffer (1 μL, Promega). The lyophilised click-ligated gene was resuspended in ultrapure water and an aliquot was diluted to a concentration of 19.7 ng/μL. The click-ligated gene was added to the ligation reaction to give molar ratios of 1:1, 1:3 and 1:5 backbone:insert. Negative control ligations were set up as above, using water instead of insert. The ligation reactions were incubated at 4° C. for 16 h then at room temperature for 1 h. The T4 DNA ligase enzyme was subsequently deactivated by heating at 70° C. for 10 min.
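The per-microgram scaling of the digestion and the backbone:insert molar ratios described above lend themselves to a short calculation. The Python sketch below is a hedged illustration rather than the procedure actually used: the function names are hypothetical, the insert length of 350 bp is only approximated from the PCR band size reported earlier, and the backbone length of 2900 bp is a placeholder assumption because the length of the digested pRSET backbone is not stated in this section. It relies on the standard relationship that, for double stranded DNA, moles are proportional to mass divided by length.

```python
"""Digest and ligation planning (illustrative sketch)."""

# Values copied from the protocol text.
ENZYME_U_PER_UG = 20.0          # restriction enzyme units per ug plasmid
DIGEST_MIN_PER_UG = 60.0        # incubation time per ug plasmid
SAP_U_PER_UG = 1.0              # shrimp alkaline phosphatase per ug plasmid
VECTOR_NG = 50.0                # backbone mass per 10 uL ligation
INSERT_STOCK_NG_PER_UL = 19.7   # diluted click-ligated gene
INSERT_RATIOS = (1, 3, 5)       # insert equivalents per backbone

# ASSUMPTIONS: neither length is given in this section.
INSERT_BP = 350                 # approximate, from the ~350 bp PCR band
BACKBONE_BP = 2900              # hypothetical placeholder


def digest_plan(plasmid_ug):
    """Scale enzyme units and incubation time to the plasmid mass."""
    if not 1.0 <= plasmid_ug <= 2.5:
        raise ValueError("the protocol describes 1-2.5 ug of plasmid per digest")
    return {
        "enzyme_units": ENZYME_U_PER_UG * plasmid_ug,
        "digest_min": DIGEST_MIN_PER_UG * plasmid_ug,
        "sap_units": SAP_U_PER_UG * plasmid_ug,
    }


def insert_volume_ul(ratio, insert_bp=INSERT_BP, backbone_bp=BACKBONE_BP):
    """Volume of insert stock giving `ratio` insert molecules per backbone."""
    # Equal moles of dsDNA have masses in proportion to their lengths.
    insert_ng = VECTOR_NG * (insert_bp / backbone_bp) * ratio
    return insert_ng / INSERT_STOCK_NG_PER_UL


if __name__ == "__main__":
    print(digest_plan(2.0))
    for ratio in INSERT_RATIOS:
        print(f"1:{ratio} backbone:insert -> {insert_volume_ul(ratio):.2f} uL insert")
```

The insert volumes scale linearly with the chosen ratio, so substituting the true backbone and insert lengths changes only the constant factor and leaves the structure of the calculation unchanged.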
Transformation of pRSET-iLOV into E. coli.
The inactivated ligation reactions were dialysed for 1.5 h against ultrapure water using a 0.025 μm membrane filter (Millipore, Cat No: VSWP02500). The recovered ligation mixtures (approximately 7 μL) were added to frozen aliquots of electrocompetent KRX cells (100 μL). Electroporation of the plasmids was achieved using a MicroPulser system (BioRad) and standard protocols. The transformants were immediately recovered using ice-cold SOC medium (890 μL) then incubated at 37° C. for 1 h. An aliquot (100 μL) of the recovered cells was spread on LB agar plates supplemented with carbenicillin (100 μg/μL) and rhamnose (0.1% w/v) then incubated at 37° C. for 18 h. Individual colonies were selected and grown in LB Broth (25 mL) supplemented with carbenicillin (100 μg/μL) at 37° C. for 16 h. Plasmid DNA was extracted from these cells using a QiaPrep® Miniprep kit. The plasmids were sequenced by Eurofins MWG Operon (Ebersberg, Germany) using the T7 forward and reverse primers. Trace files were aligned against reference sequences using the Clustal Omega web-based software.
Although certain embodiments have been illustrated and described herein, it will be appreciated by those of ordinary skill in the art that a wide variety of alternate and/or equivalent embodiments or implementations calculated to achieve the same purposes may be substituted for the embodiments shown and described without departing from the scope. Those with skill in the art will readily appreciate that embodiments may be implemented in a very wide variety of ways. This application is intended to cover any adaptations or variations of the embodiments discussed herein. Therefore, it is manifestly intended that embodiments be limited only by the claims and the equivalents thereof.