The disclosure relates to a method of cloning and assembly of deoxyribonucleic acid (DNA).
Conventionally, short DNA fragments are obtained by chemical synthesis and amplified by PCR, while cloning of DNA fragments larger than 10-kb depends on the construction and screening of DNA libraries. Conventional methods of library construction and screening are time-consuming and laborious, and the target DNA fragments are often located on several different clones.
Large DNA fragments larger than 10-kb can also be assembled from small fragments using Gibson assembly in vitro or DNA assembler in vivo. However, in the assembly process, small fragments tend to mutate randomly.
Direct cloning refers to cloning a specific DNA sequence of a genomic DNA into a vector. The process involves recombinant proteins RecE and RecT. However, RecET direct cloning fails to clone DNA fragments larger than 50 kb from a bacterial genome or DNA fragments larger than 10 kb from a mammalian genome. In addition, RecET direct cloning can only assemble up to five DNA fragments.
In one aspect, the disclosure provides a method of homologous recombination, the method comprising in vitro joining two or more target nucleic acid molecules with a first exonuclease, and recombining the two or more target nucleic acid molecules in the presence of a second exonuclease and an annealing protein, wherein recombined target nucleic acid molecules share at least one homologous sequence.
In another aspect, the disclosure provides a method of homologous recombination, the method comprising in vitro joining a first nucleic acid molecule and a second nucleic acid molecule in the presence of a first exonuclease, and recombining the first nucleic acid molecule and the second nucleic acid molecule in the presence of a second exonuclease and an annealing protein, wherein the first nucleic acid molecule and the second nucleic acid molecule share at least one homologous sequence.
In still another aspect, the disclosure provides a method of assembling a linear nucleic acid molecule, the method comprising in vitro joining two or more nucleic acid molecules with a first exonuclease, and recombining the two or more nucleic acid molecules in the presence of a second exonuclease and an annealing protein, wherein each nucleic acid molecule shares at least one homologous sequence with an adjacent nucleic acid molecule in a resulting assembly product.
Also, the disclosure provides a method of cloning genomic DNA, the method comprising in vitro joining a linear cloning vector and a mixture of genomic DNA fragments with a first exonuclease, and recombining the linear cloning vector and a target DNA fragment of the mixture of genomic DNA fragments, wherein the linear cloning vector shares at least one homologous sequence with the target DNA fragment of the mixture of genomic DNA fragments.
The at least one homologous sequence can be located within, or at one end of, the two or more target nucleic acid molecules. Particularly, at least one homologous sequence is at one end of a target nucleic acid molecule, and more particularly, all the homologous sequences are at one end of the target nucleic acid molecules.
The at least one homologous sequence has a length of at least 6, at least 10, at least 20, at least 25, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80 nucleotides, particularly 25, 40 or 80 nucleotides, and more particularly 80 nucleotides.
The first exonuclease can be a 5′ to 3′ exonuclease or a 3′ to 5′ exonuclease, particularly T4 DNA polymerase, Klenow fragment of DNA polymerase I, T5 exonuclease, T7 exonuclease, and more particularly T4 DNA polymerase or T5 exonuclease.
In vitro joining in the presence of the first exonuclease comprises cleaving and annealing. The in vitro joining can join two or more target nucleic acid molecules or the first nucleic acid molecule to the second nucleic acid molecule, or join the treated linear cloning vector to the target DNA fragment of a mixture of genomic DNA fragments.
In vitro joining in the presence of the first exonuclease comprises enzyme digestion and annealing; the enzyme digestion of different nucleic acid molecules can be performed separately or simultaneously, such as in a mixture of samples.
In vitro joining in the presence of the first exonuclease further comprises addition of a DNA polymerase, dNTPs and a DNA ligase.
In vitro joining in the presence of the first exonuclease further comprises addition of a DNA polymerase having 3′ to 5′ exonuclease activity.
In vitro joining in the presence of the first exonuclease further excludes the addition of dNTPs.
In vitro joining in the presence of the first exonuclease further comprises T4 DNA polymerase treatment or Gibson assembly.
The second exonuclease is RecE, and particularly, the RecE is a recombinant expression product.
The annealing protein includes RecA, RAD51, Redβ, RecT, Pluβ or RAD52, and particularly, the annealing protein is RecT, more particularly, the RecT is a recombinant expression product.
The annealing protein is RecT, particularly, the RecT is a recombinant expression product.
The homologous recombination is carried out in vitro or in a host cell.
The host cell can be a yeast cell, particularly the yeast cell is a Saccharomyces cerevisiae cell; or a bacterial cell, particularly the bacterial cell is Bacillus subtilis or Escherichia coli.
The host cell expresses an exonuclease, particularly a second exonuclease and an annealing protein.
The host cell expresses an exonuclease, an annealing protein, and Redγ. Particularly, the host cell further expresses RecA, more particularly, the host cell expresses RecE, RecT, Redγ, and RecA.
The host cell is an E. coli cell expressing full length RecE and/or RecT; particularly, the host cell is an E. coli cell expressing full length RecE, RecT and Redγ; more particularly, the host cell is an E. coli cell expressing full length RecE, RecT, Redγ and RecA.
The host cell is an E. coli cell expressing truncated RecE and RecT.
The host cell is an E. coli cell expressing Redα and Redβ.
The host cell expresses an exonuclease, particularly the second exonuclease, the annealing protein, Redγ and/or RecA, from a plasmid vector and/or a chromosome, particularly from a plasmid vector, and more particularly from a plasmid vector and a chromosome simultaneously.
The target nucleic acid molecule or the target DNA fragment is a linear DNA segment selected from a DNA fragment digested by endonuclease, a DNA fragment amplified by PCR, a genomic DNA fragment, a member of cDNA library, a fragment derived from bacterial artificial chromosomes (BACs), and a fragment of cloning vectors.
The endonuclease can be a restriction enzyme or a programmable endonuclease, such as Cas9.
The number of the target nucleic acid molecules or DNA fragments is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more.
The target nucleic acid molecule comprises a sequence of 0.5 kb or longer (e.g., 1-kb or longer, 2.5-kb or longer, 4-kb or longer, 5-kb or longer, 7.5-kb or longer, 10-kb or longer, 15-kb or longer, 20-kb or longer, 25-kb or longer, 40-kb or longer, 50-kb or longer, 75-kb or longer or 100-kb or longer).
The two or more target nucleic acid molecules, the first target nucleic acid molecule, and the second target nucleic acid molecule or target DNA fragments comprise one or more PCR amplified DNA fragments, genomic DNA fragments, cDNA library members, and/or a fragment derived from BAC.
The first exonuclease has 3′ to 5′ exonuclease activity, and particularly, is T4 DNA polymerase.
In vitro joining in the presence of a first exonuclease is carried out in vitro in the absence of dNTPs.
The second exonuclease is a full length RecE.
The annealing protein is RecT.
The homologous recombination is carried out in a bacterial host cell expressing the full-length RecE and RecT, particularly in Escherichia coli.
The two or more target nucleic acid molecules comprise one or more PCR amplified DNA fragments, genomic DNA fragments, cDNA library members, and/or a fragment derived from BAC, linear plasmid and/or cloning vector fragment, and particularly, three or more linear plasmids and/or cloning vector fragments.
The first exonuclease comprises a Gibson assembly.
The second exonuclease is full length RecE.
The annealing protein is RecT.
The homologous recombination is carried out in a bacterial host cell expressing the full length RecE and RecT, particularly in E. coli.
The two or more target nucleic acid molecules include three or more PCR amplified DNA fragments, genomic DNA fragments, cDNA library members, and/or a fragment derived from BAC, linear plasmids and/or cloning vector fragments, particularly three or more linear plasmids and/or cloning vector fragments.
The first exonuclease comprises a Gibson assembly.
The second exonuclease is full length RecE.
The annealing protein is RecT.
The homologous recombination is carried out in a bacterial host cell expressing the full length RecE and RecT, particularly in E. coli.
Also provided is a kit comprising the first exonuclease and the second exonuclease described in the preceding method, or a nucleic acid encoding the first exonuclease and the second exonuclease described in the preceding method.
A kit comprises the first exonuclease and the second exonuclease described in the preceding method, or a nucleic acid encoding the first exonuclease and the second exonuclease. Particularly, the kit further comprises a host cell expressing the second exonuclease. Particularly, the host cell expresses an exonuclease, an annealing protein and Redγ; particularly, the host cell also expresses RecA; more particularly, the host cell expresses RecE, RecT, Redγ and RecA. The host cell can be a yeast cell, particularly a Saccharomyces cerevisiae cell, or a bacterial cell, particularly Bacillus subtilis or Escherichia coli. The host cell expresses the exonuclease, the annealing protein, Redγ and/or RecA from a plasmid vector and/or a chromosome simultaneously, particularly from a plasmid vector, and more particularly from a plasmid vector or a chromosome. Further particularly, the kit may further comprise one or more pre-prepared linear vectors.
The first exonuclease is a DNA polymerase having 3′ to 5′ exonuclease activity, such as T4 DNA polymerase, Klenow fragment of DNA polymerase I, T5 exonuclease or T7 exonuclease; the second exonuclease is full length RecE.
The kit further comprises a host cell expressing a second exonuclease; particularly, the host cell comprises a nucleic acid encoding full length RecE, RecT, Redγ and RecA.
The kit further comprises one or more pre-prepared linear vectors.
Use of the aforesaid method or kit in construction of a targeting vector.
Use of the aforesaid method or kit in genotyping of mammalian cells.
Use of the aforesaid method or kit in DNA synthesis.
DNA recombination engineering is a genetic engineering technique for modifying DNA molecules in E. coli cells, which is mediated by homologous recombination of the phage syn/exo proteins (mainly Redα and Redβ). DNA recombination engineering was first discovered in the E. coli sbcA (suppressor of recBC) strain, which has an activity that efficiently mediates homologous recombination between DNA molecules with homology boxes. The sbcA strain was discovered in a classic experiment by A. J. Clark looking for a homologous recombination pathway in E. coli. He used the recBC strain, which is very sensitive to DNA damage, to screen for its suppressors, and found sbcA mutant strains with RecE and RecT expression activities. Subsequent studies have shown that RecE and RecT are expressed by the Rac phage integrated on the chromosome and function identically to phage Redα and Redβ, and that only 280 amino acids at the C-terminus of the RecE protein are expressed in the sbcA mutant strain. The truncated RecE is similar to Redα (266 amino acids) and is a 5′ to 3′ exonuclease. RecT is similar to Redβ and is a single-strand annealing protein (SSAP).
RecE/RecT and Redα/Redβ belong to the 5′ to 3′ exonuclease/SSAP syn/exo protein pairs, and a specific protein-protein interaction between the two proteins of each pair is necessary for homologous recombination of double-stranded DNA. Redα/Redβ-mediated homologous recombination occurs mainly at the replication fork and requires simultaneous replication. Although recombination engineering was initially discovered through truncated RecE/RecT, Redα/Redβ was subsequently used to modify DNA molecules because it was more efficient. When the characteristics of RecE/RecT were studied, it was found that the 600 amino acid residues at the N-terminus of RecE change the recombination activity from replication-dependent to replication-independent. Therefore, two linear DNA molecules can form a circular plasmid by efficient homologous recombination through very short homology boxes. Compared with Redα/Redβ recombination engineering, this linear-linear recombination mechanism has different applications, such as directly cloning large DNA fragments from a genome or performing multiple DNA fragment assembly.
The disclosure provides a method of homologous recombination (linear-linear recombination) between two or more target linear nucleic acid molecules sharing at least one homologous sequence. The method comprises treating a mixture of the target linear nucleic acid molecules with a first exonuclease, and then subjecting the treated target linear nucleic acid molecules to homologous recombination in the presence of a second exonuclease and an annealing protein. The second exonuclease can be RecE; the amino acid sequence of full length RecE from E. coli K12 is disclosed in WO2011/154927. Alternatively, the second exonuclease may be a truncated RecE; truncated forms of RecE include RecE proteins consisting of amino acids 588-866, 595-866, 602-866 or 606-866.
Homologous recombination is mediated by the second exonuclease and the annealing protein. In some embodiments, the annealing protein used in the methods is one known in the art. Particularly, the annealing protein is RecT or a fragment thereof (derived from Rac phage). More particularly, the annealing protein is the full length RecT and the second exonuclease is the full length RecE. However, any other suitable annealing protein can be used as long as the annealing protein interacts with the exonuclease used. Linear-linear recombination can occur in host cells lacking RecT expression, such as E. coli strain GB2005, possibly due to the presence of certain endogenous RecT-like activities. However, the efficiency of linear-linear recombination mediated by full length RecE is significantly increased in the presence of RecT.
The methods of the disclosure can be effected in whole or in part in a host cell. Suitable host cells include cells of many species, including parasites, prokaryotes, and eukaryotes, but bacteria such as Gram-negative bacteria are preferred hosts. More particularly, the host cell is an enteric bacterial cell such as Salmonella, Klebsiella, Bacillus, Neisseria, Photorhabdus or Escherichia coli cells (the method of the disclosure plays an effective role in all E. coli strains that have been tested). A preferred host cell is E. coli K12. However, it should be noted that the methods of the disclosure are equally applicable to eukaryotic cells or organisms, such as fungal, yeast, plant or animal cells. This system has been shown to be functional in mouse ES cells, and it is reasonable to expect that it is also functional in other eukaryotic cells. The host cell is typically an isolated host cell, but unisolated host cells can also be used.
The host cell of the disclosure comprises a nucleic acid encoding an exonuclease (particularly full length RecE), an annealing protein (particularly RecT) and Redγ. In some embodiments, the host cell further comprises a nucleic acid encoding RecA. Particularly, the host cell expresses RecE, RecT and Redγ, and optionally RecA. More particularly, the host cell expresses RecE, RecT, Redγ and RecA.
The exonuclease, annealing protein, Redγ and/or RecA of the disclosure can be a recombinant expression product from a foreign DNA in a host cell, for example, expressed by a vector transformed into a host cell. An example of a suitable vector is the pSC101 plasmid, although other suitable vectors can also be used. Any suitable promoter can be used to drive the expression of these proteins. However, in the case of expressing RecE, an inducible promoter such as an arabinose-inducible promoter (PBAD) or a rhamnose-inducible promoter (PRhaSR) is preferred. These promoters are well known in the art.
The host cell of the disclosure expresses an exonuclease, an annealing protein, Redγ and/or RecA by the inducible promoters on a plasmid vector or a chromosome. Particularly, the exonuclease, annealing protein, Redγ and/or RecA are expressed in the host cell by a plasmid vector. More particularly, the exonuclease, annealing protein, Redγ and/or RecA are simultaneously expressed in the host cell by the plasmid vector and the chromosome.
The genome of the E. coli K12 host cell contains an endogenous copy of the full-length recE gene and the recT gene, which are present in the Rac phage that has been integrated into the host genome. However, since the gene is silent, the expression of full-length RecE cannot naturally occur from the integrated gene. Thus, in embodiments where the 5′ to 3′ exonuclease is expressed by exogenous DNA, the method can be carried out in the absence of an endogenous recE gene.
Host cells transformed with a nucleic acid molecule encoding the exonuclease as described above are also provided. Particularly, the exonuclease is expressed by the nucleic acid molecule, and thus the disclosure also provides a host cell expressing the exonuclease enumerated in the method of the disclosure. The exonuclease is particularly expressed under the control of an inducible promoter, such as an arabinose-inducible promoter (PBAD) or a rhamnose-inducible promoter (PRhaSR).
In the foregoing embodiments, the methods of the disclosure may be effected in whole or in part in vitro. For example, purified 5′ to 3′ exonuclease and annealing protein (particularly purified RecE and RecT proteins) can be used, or an extract of E. coli cells expressing the 5′ to 3′ exonuclease and the annealing protein can be used. When the method is carried out in vitro, it is advantageous to pretreat the first and second linear target nucleic acid molecules to expose single-stranded homologous ends.
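By way of illustration only, the exposure of single-stranded homologous ends by exonuclease pretreatment can be modeled with a short Python sketch. The model is a deliberate simplification and not part of the disclosed method: each duplex is represented by its top strand written 5′ to 3′, the exonuclease is assumed to resect every end to the same depth, and annealing is assumed to require that the whole homology box be exposed. The function names and the `polarity` labels are hypothetical.

```python
# Simplified model (assumption): a duplex is given by its top strand, 5'->3'.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def right_end_overhang(top: str, depth: int, polarity: str = "3to5") -> str:
    """Single strand exposed at the right end of the duplex, written 5'->3'.

    A 3'->5' exonuclease removes the 3' end of the top strand, so the bottom
    strand is exposed; a 5'->3' exonuclease removes the 5' end of the bottom
    strand, so the top strand is exposed.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    region = top[-depth:]
    return revcomp(region) if polarity == "3to5" else region


def left_end_overhang(top: str, depth: int, polarity: str = "3to5") -> str:
    """Single strand exposed at the left end of the duplex, written 5'->3'."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    region = top[:depth]
    return region if polarity == "3to5" else revcomp(region)


def ends_can_anneal(left_top: str, right_top: str, homology_len: int,
                    resection_depth: int, polarity: str = "3to5") -> bool:
    """True if the right end of one fragment can anneal to the left end of another.

    Assumption: the homology box must be fully exposed, i.e. the resection
    depth must be at least the homology length.
    """
    if resection_depth < homology_len:
        return False
    exposed_left = right_end_overhang(left_top, homology_len, polarity)
    exposed_right = left_end_overhang(right_top, homology_len, polarity)
    # Two single strands anneal when one is the reverse complement of the other.
    return exposed_left == revcomp(exposed_right)


# Hypothetical fragments sharing a 10-nt homology box at their junction.
HOMOLOGY = "ACGTACGTAC"
vector_arm = "GGGGTTTT" + HOMOLOGY
insert = HOMOLOGY + "CCCCAAAA"
print(ends_can_anneal(vector_arm, insert, homology_len=10, resection_depth=20))
```

In this model both polarities expose complementary strands at the junction, which is consistent with either a 3′ to 5′ or a 5′ to 3′ first exonuclease being usable for the pretreatment.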
Linear-linear recombination requires that at least one homologous sequence must be shared between the target linear nucleic acid molecules in which homologous recombination occurs. In some embodiments, the first target nucleic acid molecule shares a homologous sequence with the second target nucleic acid molecule to perform the linear-linear recombination between the first and second target nucleic acid molecules, to produce a linear product. In embodiments in which linear-linear recombination occurs between the first and second linear nucleic acids and one or more additional linear nucleic acids to form a linear product, each linear nucleic acid shares a homologous sequence with the linear nucleic acid that forms its neighbor in the linear products of the linear-linear recombination reaction. In embodiments in which linear recombination occurs between the first and second linear nucleic acids and one or more additional linear target nucleic acid molecules, to form a cyclic product, each linear nucleic acid shares a homologous sequence with linear nucleic acid that forms its neighbor in the cyclic product of the linear-linear recombination reaction. In some embodiments, the first target nucleic acid molecule and the second target nucleic acid molecule share two homologous sequences to perform a linear-linear recombination between the first and second target nucleic acid molecules, to form a cyclic molecule. Those skilled in the art know how to design homologous sequences to form linear or cyclic molecules.
Particularly, at least one homology box is at the very end of each linear fragment. The optimal configuration is obtained when one of these homologous sequences, or ‘homology boxes’, is at the very end of each linear fragment and a different homology box is at the other end; constructing the homology boxes in this way enables recombination to generate a circle. Linear-linear recombination can still occur when the homology box is not at the end, but the efficiency is reduced. Thus, in a preferred embodiment, the at least one homologous sequence is located at the outermost end of one or both ends of the target nucleic acid molecule. In some embodiments, the at least one homologous sequence is internal to the target nucleic acid molecule.
The homologous sequences of the disclosure are at least 4, at least 6, at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100 nucleotides in length. For example, in some embodiments, the homologous sequences are 4-6, 6-9, 6-30, 6-100, 10-20, 20-29, 20-40, 20-50, 10-100, 25-30, 25-40, 25-50, 30-40, 30-50, 40-50, 40-80 or more than 80 nucleotides. The efficiency of homologous recombination increases with the length of the homology boxes used, so longer homology boxes can be used.
‘Homologous’ between two nucleic acid molecules means that when the sequences of two nucleic acid molecules are aligned, there are many nucleotide residues that are identical at the same position in the sequence. The degree of homology is easy to calculate.
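As one possible illustration of such a calculation, the following Python sketch computes percent identity over two sequences that have already been aligned. It assumes that gaps are written as `-`, that both aligned strings have equal length, and that columns in which both sequences have a gap are ignored; the function name is hypothetical and other definitions of the degree of homology are equally possible.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumptions: the inputs are pre-aligned, '-' marks a gap, and columns
    where both sequences have a gap are excluded from the count.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# One mismatch in eight aligned positions.
print(percent_identity("ACGTACGT", "ACGTTCGT"))
```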
In some embodiments, the methods of the disclosure comprise joining together a plurality of linear nucleic acid molecules to form a circular nucleic acid molecule, such as a circular plasmid. Each target nucleic acid molecule shares at least one homologous sequence with the target nucleic acid molecule that forms its neighbor in the resulting cyclic product and is subjected to linear-linear recombination in accordance with the methods of the disclosure. The number of target nucleic acid molecules is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more.
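The neighbor requirement for a cyclic product can be checked in silico before an experiment. The Python sketch below is illustrative only: it assumes all fragments are supplied as top strands in the same orientation and in the intended cyclic order, that each homology box is an exact match at the very ends of adjacent fragments, and that a minimum box length of 6 nucleotides is wanted. The function names and the toy sequences are hypothetical.

```python
from typing import List


def terminal_overlap(left: str, right: str, min_len: int = 6) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 when no exact terminal overlap of at least `min_len` exists.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    longest_possible = min(len(left), len(right))
    for n in range(longest_possible, min_len - 1, -1):
        if left[-n:] == right[:n]:
            return n
    return 0


def check_circular_design(fragments: List[str], min_len: int = 6) -> List[int]:
    """Overlap length between each fragment and its neighbor in cyclic order.

    The last fragment is compared with the first, closing the circle.
    A 0 in the result marks a junction that lacks a terminal homology box.
    """
    count = len(fragments)
    return [
        terminal_overlap(fragments[i], fragments[(i + 1) % count], min_len)
        for i in range(count)
    ]


# Hypothetical three-fragment design with 6-nt homology boxes at each junction.
box_1, box_2, box_3 = "ACGTAC", "TTGCAA", "GGATCA"
design = [
    box_3 + "AAAA" + box_1,
    box_1 + "CCCC" + box_2,
    box_2 + "TTTT" + box_3,
]
print(check_circular_design(design))
```

A design in which every entry of the returned list is nonzero satisfies the condition that each molecule shares a homologous sequence with its neighbor in the cyclic product; the sketch says nothing about recombination efficiency.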
In some embodiments, at least one of the target linear nucleic acid molecules comprises a selection marker to allow selection of the correct recombinant. Any suitable selection marker can be used in the disclosure. In some embodiments, the selection marker is an antibiotic resistance gene, e.g., a resistance gene for chloramphenicol, ampicillin, kanamycin, or blasticidin.
The target linear nucleic acid molecule can be derived from any suitable source. For example, nucleic acid sequences from eukaryotes or prokaryotes can be included. In some embodiments, the first target linear nucleic acid molecule is genomic DNA. Typically, the genomic DNA is a genomic DNA fragment. The genomic DNA particularly comprises a target sequence. In some embodiments, a genomic DNA fragment containing the complete target sequence can be obtained by cleavage or digestion of genomic DNA (for example, using a restriction enzyme). In some embodiments, the first target linear nucleic acid molecule (such as a genomic DNA fragment, a cDNA library member, or a BAC-derived fragment) comprises a target sequence of 2-kb or longer (e.g., 2.5-kb or longer, 4-kb or longer, 5-kb or longer, 7.5-kb or longer, 10-kb or longer, 15-kb or longer, 20-kb or longer, 25-kb or longer, 40-kb or longer, 50-kb or longer, 75-kb or longer or 100-kb or longer). Particularly, the target sequence is the entire region between the homology boxes at either end of the first target linear nucleic acid molecule, for example, a gene cluster encoding a secondary metabolite pathway or a fatty acid synthesis pathway. In some embodiments, the methods of the disclosure can be used to directly clone a DNA region from a human or non-human animal genome, for example, for regenerative therapies, for health research, or for correction by gene targeting. For example, in some embodiments, the first target nucleic acid molecule comprises or consists of a genomic DNA fragment from a human or non-human animal. The genomic DNA fragment can comprise a target sequence, such as a gene comprising a mutation, wherein the mutation results in a disease or condition and the modification of the mutation to a wild type sequence can treat or prevent the disease or condition.
In embodiments where the first target nucleic acid molecule is a genomic DNA fragment, the second target nucleic acid molecule is particularly a linear cloning vector.
In embodiments where the first target nucleic acid molecule is a genomic DNA fragment, the method comprises generating the first target nucleic acid molecule by digesting or cleaving genomic DNA to obtain a linear genomic DNA fragment comprising the target sequence. The mixture of the genomic DNA fragment and the linear cloning vector is then treated with the first exonuclease, which digests the target nucleic acid molecules so that they anneal and join, and the mixture of treated nucleic acid molecules is then transferred into host cells. The second target nucleic acid molecule particularly comprises a selection marker.
In one embodiment, the methods of the disclosure comprise the step of joining of DNA molecules in vitro.
The joining process in vitro comprises exonuclease digestion followed by annealing.
The exonuclease is T4 polymerase.
The joining process in vitro comprises Gibson assembly.
The joining process in vitro comprises DNA synthesis by DNA polymerase with or without exonuclease followed by annealing.
The joining process in vitro comprises annealing by a single-stranded annealing protein, such as RecA/RAD51, Redβ, RecT, Pluβ or RAD52.
Host cells used for homologous recombination are E. coli cells.
Host cells for homologous recombination are E. coli cells expressing full length RecE and/or RecT.
Host cells for homologous recombination are E. coli cells expressing full length RecE, RecT and/or Redγ.
Host cells for homologous recombination are E. coli cells expressing truncated RecE, RecT and/or Redγ.
Host cells for homologous recombination are any bacterial host cells expressing full length RecE and/or RecT.
Host cells for homologous recombination are E. coli cells expressing Redα, Redβ and/or Redγ.
Host cells for homologous recombination are Saccharomyces cerevisiae cells.
Kits for use in the disclosure are provided. In some embodiments, the kits comprise a nucleic acid encoding an exonuclease as described herein. In some embodiments, the kits comprise an exonuclease as described herein. Particularly, the first exonuclease is T4 DNA polymerase (T4pol), Klenow fragment of DNA Polymerase I (Kle), T7 DNA polymerase (T7pol), Exonuclease III (ExoIII), Phusion DNA polymerase (Phu), T5 exonuclease (T5exo), T7 exonuclease (T7exo) or Lambda exonuclease (λexo), and the second exonuclease is full length RecE. More particularly, the kits comprise a host cell as described herein. For example, in some embodiments, the host cell in a kit comprises a nucleic acid encoding the full length RecE, RecT, Redγ, and RecA described herein under the control of an inducible promoter. The kits may also include one or more pre-prepared linear cloning vectors.
Another preferred application of the disclosure relates to the assembly of linear nucleic acid molecules, particularly linear DNA, in synthetic biology. Thus, in some embodiments, the first and second target nucleic acid molecules are linear, and the method further comprises contacting the first and second target nucleic acid molecules with one or more other linear target nucleic acid molecules (for example, 1, 2, 3, 4, 5, 6, 7, 8, 9, at least 10, or at least 20 other target nucleic acid molecules) in the presence of a 5′ to 3′ exonuclease and an annealing protein, to produce a linear or circular product. In a preferred embodiment, homologous recombination between the first and second target nucleic acid molecules and one or more other target nucleic acid molecules results in the production of genes, operons, chromosomes or whole genomes. Synthetic biological assembly of DNA has been used to generate genes, operons and chromosomes, and more recently whole genomes. In an embodiment of the disclosure, the combination of the first exonuclease and the second exonuclease significantly increases the assembly efficiency of the linear nucleic acid molecules, so the disclosure will be a preferred method for the assembly of synthetic biological DNA in commerce and research.
Another preferred application of the disclosure is the construction of a haplotype isogenic targeting (HIT) vector. A 5 to 10-kb DNA fragment is directly cloned from a mammalian genome using the method of the disclosure and serves as an isogenic homology box; because these DNA fragments are isogenic and maintain the polymorphic haplotype, the resulting vector is called a haplotype isogenic targeting (HIT) vector. The selection marker and other functional elements are then inserted into the HIT vector by recombinant engineering to obtain a vector for targeting. Another preferred application of the disclosure is the genotyping of mammalian cells. The DNA fragment containing the complete targeting element is cloned from the genome of candidate targeted embryonic stem cells by the method of the disclosure, the recombinant plasmid obtained by the cloning is subjected to restriction analysis and DNA sequencing, and whether the cell has been successfully targeted is determined according to the result.
E. coli GB2005 was derived from DH10B by deleting fhuA, ybcC and recET. GB05-dir was derived from GB2005 by integrating the PBAD-ETgA operon (full length recE, recT, redγ and recA under the arabinose-inducible PBAD promoter) at the ybcC locus. GB08-red was derived from GB2005 by integrating the PBAD-gbaA operon (redγ, redβ, redα and recA under the arabinose-inducible PBAD promoter) at the ybcC locus.
pSC101-BAD-ETgA-tet conveys tetracycline resistance and carries the PBAD-full length ETgA operon and a temperature sensitive pSC101 replication origin which replicates at 30° C. but not at 37° C. so it can be easily eliminated from the host by temperature shift in the absence of selection.
Gram-negative Photobacterium phosphoreum ANT-2200 and Photorhabdus luminescens DSM15139 were cultured overnight in 50 mL of medium. After centrifugation the cells were resuspended thoroughly in 8 mL of 10 mM Tris-Cl (pH 8.0). Five hundred microliters of 20 mg mL−1 proteinase K and 1 mL of 10% SDS were added and incubated at 50° C. for 2 h until the solution became clear. Genomic DNA was recovered from the lysate by phenol-chloroform-isoamyl alcohol (25:24:1, pH 8.0) extraction and ethanol precipitation. The DNA was dissolved in 10 mM Tris-Cl (pH 8.0) and digested with BamHI+KpnI for cloning of the 14-kb lux gene cluster.
Gram-positive Streptomyces albus DSM41398 was cultured in 50 mL of tryptic soy broth at 30° C. for 2 days. The genomic DNA was isolated. After centrifugation the cells were resuspended thoroughly in 8 mL of SET buffer (75 mM NaCl, 25 mM EDTA, 20 mM Tris, pH 8.0) and 10 mg lysozyme was added. After incubation at 37° C. for 1 h, 500 μL of 20 mg mL−1 proteinase K and 1 mL of 10% SDS were added and incubated at 50° C. for 2 h until the solution became clear. Three and a half milliliters of 5 M NaCl was added into the lysate. Genomic DNA was recovered from the lysate by phenol-chloroform-isoamyl alcohol (25:24:1, pH 8.0) extraction and ethanol precipitation. The DNA was dissolved in 10 mM Tris-Cl (pH 8.0).
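The working concentrations implied by the volumes above follow from simple dilution arithmetic (C1·V1 = C2·V2). The Python sketch below is a worked illustration only; it assumes that the liquid volumes are additive and that the volumes of the cell pellet and of the solid lysozyme are negligible. The function and variable names are hypothetical.

```python
def final_concentration(stock_conc: float, stock_volume_ml: float,
                        total_volume_ml: float) -> float:
    """Concentration after diluting `stock_volume_ml` of stock into `total_volume_ml`."""
    return stock_conc * stock_volume_ml / total_volume_ml


# Volumes taken from the S. albus lysis procedure described above.
SET_BUFFER_ML = 8.0     # 75 mM NaCl, 25 mM EDTA, 20 mM Tris
PROTEINASE_K_ML = 0.5   # 20 mg/mL stock
SDS_ML = 1.0            # 10% stock
NACL_ML = 3.5           # 5 M (5000 mM) stock, added after lysis

# Assumption: pellet and lysozyme powder do not change the volume.
lysis_volume_ml = SET_BUFFER_ML + PROTEINASE_K_ML + SDS_ML
proteinase_k_mg_per_ml = final_concentration(20.0, PROTEINASE_K_ML, lysis_volume_ml)
sds_percent = final_concentration(10.0, SDS_ML, lysis_volume_ml)

salted_volume_ml = lysis_volume_ml + NACL_ML
# NaCl comes from both the SET buffer and the 5 M addition.
nacl_mm = (final_concentration(75.0, SET_BUFFER_ML, salted_volume_ml)
           + final_concentration(5000.0, NACL_ML, salted_volume_ml))

print(f"proteinase K: {proteinase_k_mg_per_ml:.2f} mg/mL")
print(f"SDS: {sds_percent:.2f} %")
print(f"NaCl after salt addition: {nacl_mm:.0f} mM")
```

Under these assumptions the lysis step runs at roughly 1 mg/mL proteinase K and 1% SDS, and the NaCl concentration before phenol-chloroform extraction is roughly 1.4 M.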
Genomic DNA was purified from mouse melanoma B16 cells, human embryonic kidney 293T cells and human blood using Qiagen Blood & Cell Culture DNA Kits according to the manufacturer's instructions, except DNA was recovered from the Proteinase K treated lysate by phenol-chloroform-isoamyl alcohol (25:24:1, pH 8.0) extraction and ethanol precipitation. The DNA was dissolved in 10 mM Tris-Cl (pH 8.0). Restriction digested genomic DNA was extracted with phenol-chloroform-isoamyl alcohol (25:24:1, pH 8.0) and precipitated with ethanol. The DNA was dissolved in 10 mM Tris-Cl (pH 8.0). End cut pipette tips were used to avoid shearing genomic DNA.
The genomic DNA of P. luminescens DSM15139 was digested with XbaI for plu3535-plu3532 cloning, and XbaI+XmaI for plu2670 cloning. The genomic DNA of S. albus was digested with EcoRV or Cas9-gRNA complexes for cloning of the salinomycin gene cluster. The mouse genomic DNA was digested with HpaI for Prkar1a cloning, BamHI+KpnI for Dpy30 cloning, and SwaI for Wnt4 or Lmbr1l-Tuba1a cloning. The human genomic DNA was digested with SpeI for DPY30 cloning, NdeI+BstZ17I for IGFLR1-LIN37 cloning, BstZ17I for IGFLR1-ARHGAP33 cloning and NdeI for ZBTB32-LIN37 cloning. Digested genomic DNA was extracted with phenol-chloroform-isoamyl alcohol (25:24:1, pH 8.0) and precipitated with ethanol. The DNA was dissolved in ddH2O and concentrated to 1 μg μL−1. End cut pipette tips were used to avoid shearing genomic DNA. Ten micrograms of digested genomic DNA were used for ExoCET cloning.
S. pyogenes Cas9 protein was purchased from New England Biolabs. Cas9 digestion of S. albus genomic DNA was carried out in an 800 μL reaction system containing 80 μL of 10×Cas9 reaction buffer (NEB), 80 μg of genomic DNA, 40 μg of gRNA-2, 40 μg of gRNA-7 and 20 μg of Cas9. Since the cleavage efficiency of Cas9 was severely affected by the purity of the DNA substrate, the S. albus genomic DNA used in this experiment was extracted three times with phenol-chloroform-isoamyl alcohol (25:24:1, pH 8.0) beforehand. After the reaction was incubated at 37° C. for 6 h, 100 μg of RNase A (Thermo Scientific) was added and incubation was continued at 37° C. for 1 h; 100 μg of proteinase K (Roche) was then added, and incubation was continued at 50° C. for 1 h. The genomic DNA was then extracted once with phenol-chloroform-isoamyl alcohol (25:24:1, pH 8.0) and, after ethanol precipitation, dissolved in an appropriate amount of ddH2O to a final concentration of about 1 μg μL−1. Finally, 10 μg of genomic DNA cleaved by Cas9 protein was used for the cloning experiment of the method of the disclosure.
Using the p15A-Pamp-luxABECD plasmid (Gene Bridges) as a template, the p15A-cm vector was PCR-amplified with PrimeSTAR Max DNA Polymerase (Takara). The primers used (Table 1) contained 80-nucleotide homology boxes and were purified by PAGE. The PCR product was gel-purified with the QIAquick gel extraction kit (Qiagen) to eliminate interference of the primers in subsequent experiments. Finally, the DNA was eluted with ddH2O at a concentration of approximately 200 ng/μL, and 200 ng was used for the ExoCET cloning experiment.
The pBeloBAC11 vector used to clone the salinomycin gene cluster and the pBAC2015 vector used to clone plu3535-3532 were constructed. BAC vectors were linearized with BamHI to expose both homology arms, extracted with phenol-chloroform-isoamyl alcohol (25:24:1, pH 8.0) and precipitated with isopropanol. The DNA was dissolved in ddH2O and concentrated to 1 μg/μL. One microgram of linear BAC vector was used for ExoCET cloning.
Lower-case letters in Table 1 represent the homologous box sequence.
The DNA fragments with 40 bp homologies used in the multi-piece assembly experiment were PCR amplified using genomic DNA of P. luminescens DSM15139 as the template and PrimeSTAR Max DNA Polymerase (Takara) according to the manufacturer's instructions. The PCR products were extracted from agarose gels after electrophoresis and purified using the QIAquick gel extraction kit (Qiagen) according to the manufacturer's instructions, except that DNA was eluted from the column with ddH2O and concentrated to 200 ng μL−1. 250 ng of each fragment was used for DNA assembly.
The mVenus-PGK-neo cassette was amplified from pR6K-2Ty1-2PreS-mVenus-Biotin-PGK-em7-neo with PCR using the proof reading PrimeSTAR Max DNA Polymerase (Takara) according to the manufacturer's instructions. The primers are listed in Table 1. The PCR products were purified with QIAquick PCR Purification Kit (Qiagen) according to the manufacturer's instructions, except that DNA was eluted from the column with ddH2O and concentrated to 100 ng μL−1. Two hundred nanograms of the cassette was used for recombineering.
Ten micrograms of genomic DNA and 200 ng of 2.2-kb p15A-cm linear vector (or 1 μg of 8-kb linear BAC vector) were assembled in 20 μL reactions consisting of 2 μL of 10×NEBuffer 2.1 and 0.13 μL of 3 U μL−1 T4pol (NEB, cat. no. M0203). Assembly reactions were prepared in 0.2 mL PCR tubes and cycled in a thermocycler as follows: 25° C. for 1 h, 75° C. for 20 min, 50° C. for 30 min, then held at 4° C. For multi-piece assembly, 250 ng of each fragment was added and a chew-back time of 20 min was used. The in vitro assembly products were desalted at room temperature for 30 min by drop dialysis against ddH2O using Millipore Membrane Filters (Merck-Millipore, cat. no. VSWP01300) prior to electroporation. All experiments were performed in triplicate.
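By way of illustration only, the composition of the reaction described above can be checked with a short calculation. The sketch below computes the units of T4pol per 20 μL reaction and the approximate molar amount of each linear vector. The function names are hypothetical, and the average mass of 650 g/mol per base pair of double-stranded DNA is a commonly used approximation that is assumed here and is not stated in the disclosure.

```python
# Illustrative arithmetic for one 20 uL in vitro assembly reaction.
# Assumption (not from the disclosure): dsDNA averages 650 g/mol per base pair.
AVG_MASS_PER_BP = 650.0  # g/mol per bp, approximation


def enzyme_units(volume_ul: float, stock_u_per_ul: float) -> float:
    """Units of enzyme delivered by a given volume of stock solution."""
    return volume_ul * stock_u_per_ul


def dsdna_fmol(mass_ng: float, length_bp: int) -> float:
    """Approximate femtomoles of a linear double-stranded DNA fragment."""
    grams = mass_ng * 1e-9
    moles = grams / (length_bp * AVG_MASS_PER_BP)
    return moles * 1e15


t4pol_units = enzyme_units(0.13, 3.0)   # 0.13 uL of a 3 U/uL stock
p15a_fmol = dsdna_fmol(200, 2200)       # 200 ng of the 2.2-kb p15A-cm vector
bac_fmol = dsdna_fmol(1000, 8000)       # 1 ug of the 8-kb linear BAC vector

print(f"T4pol per reaction: {t4pol_units:.2f} U")
print(f"p15A-cm vector:     {p15a_fmol:.0f} fmol")
print(f"BAC vector:         {bac_fmol:.0f} fmol")
```

Under these assumptions the two vector amounts are of the same order (roughly 140 fmol and 190 fmol), even though the masses differ five-fold.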
Assembly reactions with other exonucleases were cycled as follows: T5exo: 50° C. for 30 min, then held at 4° C.; T7exo: 25° C. for 20 min, 50° C. for 30 min, then held at 4° C.; Kle, T7pol and λexo: 25° C. for 20 min, 75° C. for 20 min, 50° C. for 30 min, then held at 4° C.; ExoIII: 37° C. for 20 min, 75° C. for 20 min, 50° C. for 30 min, then held at 4° C.; Phu: 37° C. for 20 min, 50° C. for 30 min, then held at 4° C. Gibson assembly was performed at 50° C. for 30 min with Gibson Assembly Master Mix (NEB, cat. E2611), then held at 4° C.
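For convenience of comparison, the cycling conditions recited above may be collected into a single table-like structure. The following is a minimal sketch, not a thermocycler control program: the dictionary keys and helper names are hypothetical, the T4pol entry is taken from the standard (not multi-piece) assembly reaction, and the open-ended 4° C. hold is omitted from each program.

```python
# Cycling programs transcribed from the text as (temperature_C, minutes) steps.
# The final hold at 4 C has no fixed duration and is therefore not listed.
PROGRAMS = {
    "T4pol": [(25, 60), (75, 20), (50, 30)],
    "T5exo": [(50, 30)],
    "T7exo": [(25, 20), (50, 30)],
    "Kle": [(25, 20), (75, 20), (50, 30)],
    "T7pol": [(25, 20), (75, 20), (50, 30)],
    "lambda_exo": [(25, 20), (75, 20), (50, 30)],
    "ExoIII": [(37, 20), (75, 20), (50, 30)],
    "Phu": [(37, 20), (50, 30)],
    "Gibson": [(50, 30)],
}


def total_minutes(program):
    """Sum of the timed steps, excluding the final hold."""
    return sum(minutes for _, minutes in program)


def has_step_at(program, temp_c=75):
    """True if the program contains a step at the given temperature."""
    return any(temp == temp_c for temp, _ in program)


for name, program in PROGRAMS.items():
    print(f"{name:<11}{total_minutes(program):>4} min   "
          f"75 C step: {has_step_at(program)}")
```

Laid out this way, it is easy to see that every program ends with the same 50° C. for 30 min step, and that only some of the programs include a 75° C. step before it.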
E. coli GB05-dir containing plasmid pSC101-BAD-ETgA-tet was cultured overnight at 30° C. in LB supplemented with 4 μg/mL tetracycline (OD600=3-4). 40 μL of the overnight culture was transferred to 1.4 mL LB supplemented with appropriate antibiotics, then the mixture was placed on an Eppendorf thermomixer at 30° C. and incubated at 950 rpm for 2 h (OD600=0.35-0.4). 35 μL of 10% L-arabinose (w/v, in ddH2O) was added to induce expression of recombinant enzyme (ETgA or gbaA), and the incubation was continued for 40 min at 37° C. (OD600=0.7 to 0.8). The cells were collected by centrifugation at 9,400 g for 30 sec at 2° C.
The supernatant was discarded and the cell pellet was suspended in 1 mL of ice-cold ddH2O. The cells were collected by centrifugation at 9,400 g for 30 sec at 2° C. The supernatant was discarded and the cell pellet was suspended in 1 mL of ice-cold ddH2O. The centrifugation and resuspension were repeated, the cells were centrifuged again, and the pellet was suspended in 20 μL of ice-cold ddH2O. Then 5 μL of desalted in vitro assembled product was added; in the mVenus-PGK-neo element insertion experiment, a mixture of 200 ng of plasmid and 200 ng of PCR product was added instead. The mixture of cells and DNA was transferred to a 1-mm cuvette and electroporated with an Eppendorf electroporator 2510 at a voltage of 1350 V, a capacitance of 10 μF, and a resistance of 600Ω. One milliliter of LB was added to the cuvette to wash out the cells, and the suspension was transferred to a 1.5 mL tube with holes and placed on the Eppendorf thermomixer at 950 rpm for 1 h at 37° C. Finally, an appropriate amount of the bacterial solution was spread on an LB plate supplemented with a suitable antibiotic (15 μg/mL chloramphenicol or 15 μg/mL kanamycin) and incubated at 37° C. overnight.
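As an illustrative check of the induction conditions above, the final L-arabinose concentration and the dilution of the overnight culture can be estimated from the stated volumes. The sketch below assumes that the volumes are additive and that no volume is lost during the 2 h incubation; the function name is hypothetical.

```python
# Estimate the final inducer concentration from the volumes given in the text.
# Assumptions: volumes are additive and evaporation is negligible.


def final_percent(stock_percent: float, stock_ul: float, culture_ul: float) -> float:
    """Final concentration (% w/v) after adding stock to a culture."""
    return stock_percent * stock_ul / (culture_ul + stock_ul)


overnight_ul = 40.0      # overnight culture used as inoculum
fresh_lb_ul = 1400.0     # 1.4 mL of fresh LB
culture_ul = overnight_ul + fresh_lb_ul

inoculum_dilution = culture_ul / overnight_ul          # fold dilution of the inoculum
arabinose_percent = final_percent(10.0, 35.0, culture_ul)

print(f"Inoculum dilution: 1:{inoculum_dilution:.0f}")
print(f"Final L-arabinose: {arabinose_percent:.2f}% (w/v)")
```

With the stated volumes this gives a 1:36 dilution of the overnight culture and a final L-arabinose concentration of about 0.24% (w/v).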
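The electroporation settings above can be converted into a nominal field strength and a nominal time constant. The sketch below is illustrative only: the function names are hypothetical, the field strength is simply the set voltage divided by the cuvette gap, and the time constant is the product of the set resistance and capacitance. The time constant actually delivered also depends on the conductivity of the sample, which is not modeled here.

```python
# Nominal electroporation parameters from the instrument settings in the text.
# Assumption: sample resistance is ignored, so the time constant is nominal only.


def field_strength_kv_per_cm(voltage_v: float, gap_mm: float) -> float:
    """Set voltage divided by the electrode gap, in kV/cm."""
    return (voltage_v / 1000.0) / (gap_mm / 10.0)


def nominal_time_constant_ms(resistance_ohm: float, capacitance_uf: float) -> float:
    """RC time constant in milliseconds."""
    return resistance_ohm * (capacitance_uf * 1e-6) * 1e3


field = field_strength_kv_per_cm(1350, 1.0)   # 1350 V across a 1-mm cuvette
tau = nominal_time_constant_ms(600, 10)       # 600 ohm and 10 uF

print(f"Field strength:        {field:.1f} kV/cm")
print(f"Nominal time constant: {tau:.1f} ms")
```

For the stated settings this corresponds to 13.5 kV/cm and a nominal time constant of 6 ms.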
A series of exonucleases and annealing methods were tested by direct cloning of a 14-kb lux gene cluster of Photobacterium phosphoreum ANT-2200 (
The longer the homology box, the higher the cloning efficiency (
Since RecT is a single-stranded DNA annealing protein, RecT may anneal the single-stranded DNA region produced by T4pol (3′ exonuclease), so RecE may not be required in the ExoCET technique system. To test this conjecture, the T4pol-treated DNA substrate was transformed into E. coli cells expressing RecT and Redγ (pSC101-Tg) but not RecE, and no interaction between RecT and T4pol was found. Therefore, both RecE and RecT were required for ExoCET (
To verify the superiority of the ExoCET technique, some experiments that were difficult to complete with the RecET technique were performed. There are two large gene clusters on the genome of Photorhabdus luminescens: the 37.5-kb plu3535-3532 and the 52.6-kb plu2670. These two gene clusters were very difficult to clone directly with the RecET technique, which gave correct rates of only 2/12 and 0/48, respectively, whereas the ExoCET technique achieved correct rates of 10/12 and 11/17, respectively (Table 2).
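The correct rates quoted above can be expressed as percentages for easier comparison. The following sketch only restates the four counts given in the text; the data structure and function name are hypothetical, and no statistical test is implied.

```python
from fractions import Fraction

# (correct clones, clones screened) as quoted in the text for each
# gene cluster and cloning technique.
RESULTS = {
    ("plu3535-3532", "RecET"): (2, 12),
    ("plu3535-3532", "ExoCET"): (10, 12),
    ("plu2670", "RecET"): (0, 48),
    ("plu2670", "ExoCET"): (11, 17),
}


def correct_rate(correct: int, screened: int) -> Fraction:
    """Fraction of screened clones that were correct."""
    return Fraction(correct, screened)


for (cluster, method), (correct, screened) in RESULTS.items():
    rate = correct_rate(correct, screened)
    print(f"{cluster:<13}{method:<7}{correct:>3}/{screened:<3} = {float(rate):.1%}")
```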
Previously, an attempt to directly clone the 106-kb salinomycin gene cluster from the Streptomyces albus genome failed, so the gene cluster was divided into three fragments, which were cloned step by step and then integrated to form a complete gene cluster. Through ExoCET, however, the 106-kb salinomycin gene cluster was directly cloned into a BAC vector with a homology box, using EcoRV-digested genomic DNA, with a correct rate of 2/24 (Table 2 and
Next, it was tested whether the efficiency of ExoCET could meet the requirement of directly cloning large DNA fragments from a mammalian genome. SwaI was used to release a 45-kb fragment containing the Wnt4 gene from the mouse genome (
Gibson assembly is a method for multiple-fragment DNA assembly, and ExoCET and Gibson assembly were compared in several multiple-fragment DNA assembly experiments (
ExoCET can also be used to directly clone DNA fragments from mammalian genomic DNA, including DNA from blood and disease-associated cell lines, to facilitate haplotype studies of SNPs and to rapidly construct haplotype isogenic targeting (HIT) vectors for nuclease-mediated targeting of human stem cells. The importance of human stem cells isolated from patients, from cord blood or by somatic cell reprogramming has received increasing attention in biomedical research, as has research on the precise modification of stem cell genomes. Engineering the human genome is more challenging than engineering the genome of laboratory mice because human genetic diversity is complex. The importance of isogenicity (sequence similarity) for homologous recombination was realized many years ago, when mouse embryonic stem cells were used for gene targeting.
Unlike the method of amplifying homology boxes from the genome by PCR, ExoCET was not limited by fragment size, did not introduce mutations, and was capable of maintaining a DNA haplotype. Furthermore, the ends of the homology boxes could also be selected according to the manner of genotyping (such as Southern blotting or joining PCR), so the length of the homology boxes could be optimized. ExoCET therefore offers advantages for individualized genomic surgery, especially when combined with CRISPR/Cas9.
ExoCET was used to construct isogenic targeting vectors to modify mammalian genomes. Given the experience in mouse embryonic stem cell research, one purpose was to clone a 5 to 10-kb DNA fragment directly from human or mouse genome as an isogenic homology box (
ExoCET could also be used as the most reliable method for genotyping a modified genome, while Southern blotting and joining PCR could produce false positive signals. Since long range PCR was prone to false positive signals in mammalian genotyping studies, it was necessary to confirm by Southern blotting the Kmt2d-AID-neo-targeted mouse embryonic stem cells that had been screened by long range PCR. However, no probe was available. Therefore, a DNA fragment containing the entire targeting element was cloned from the genome of four possible Kmt2d-AID-neo-targeted mouse embryonic stem cells using the ExoCET method shown in
Compared to long range PCR, ExoCET genotyping did not produce false positive signals. Compared to Southern blotting, ExoCET genotyping was simpler and did not require cumbersome screening of hybridization probes. In ExoCET genotyping, restriction enzyme sites for the release of intact targeting elements were easily available, and with well-prepared genomic DNA, genotyping results were obtained in three days. Since the targeting element had a selection marker, as little as 500 ng of restriction-digested genomic DNA was sufficient to obtain good cloning efficiency (Table 4). To increase the throughput of ExoCET genotyping, cells cultured in 96-well plates can be used.
Functional analysis of whole genome sequencing results requires a simple and rapid method of expression vector construction. According to the method of the disclosure, a DNA fragment of up to 50-kb can be cloned from a 3.0×10^9-bp genome. To test whether the method also applies to metagenomic DNA, 1 ng of P. phosphoreum genomic DNA was diluted into 10 μg of Bacillus subtilis genomic DNA to mimic a metagenomic sample. The 14-kb lux gene cluster was successfully cloned by ExoCET with considerable efficiency (Table 5). Environmental samples usually contain more than 10^4 species, so the results show that the ExoCET cloning technique can be applied to metagenomic samples.
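The dilution used to mimic a metagenomic sample can be related to the figures quoted above with a short calculation. The sketch below is illustrative only; the function names are hypothetical, and the comparison with an environmental sample assumes, for simplicity, that each species contributes an equal mass of DNA, which is not stated in the disclosure.

```python
# Relate the 1 ng : 10 ug spike-in to the figures quoted in the text.
# Assumption: in the hypothetical environmental sample, every species
# contributes the same mass of DNA.


def fold_excess(background_ng: float, spike_ng: float) -> float:
    """Mass of background DNA per unit mass of spiked-in DNA."""
    return background_ng / spike_ng


def target_fraction(target_bp: float, genome_bp: float) -> float:
    """Fraction of a genome occupied by the target fragment."""
    return target_bp / genome_bp


spike_ng = 1.0               # P. phosphoreum genomic DNA
background_ng = 10_000.0     # 10 ug of B. subtilis genomic DNA
species_in_sample = 10**4    # order of magnitude quoted for environmental samples

excess = fold_excess(background_ng, spike_ng)
share_per_species = 1 / species_in_sample
fraction_50kb = target_fraction(50_000, 3.0e9)

print(f"Background excess over spike-in: {excess:.0f}-fold")
print(f"Share of one species among {species_in_sample}: {share_per_species:.0e}")
print(f"50-kb target in a 3.0e9-bp genome: {fraction_50kb:.2e} of the DNA")
```

The 10,000-fold excess of background DNA is of the same order as the share of a single species in a sample of 10^4 equally represented species, which is the sense in which the spike-in experiment mimics a metagenomic sample.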
In this disclosure, the genomic DNA and the cloning vector are treated with exonuclease in vitro, and the in vitro reaction product is then homologously recombined in the presence of RecET recombinase, thereby establishing the ExoCET cloning technique. The ExoCET technique can clone DNA fragments larger than 100-kb directly from a bacterial genome, and DNA fragments larger than 50-kb from mammalian cells and human blood. The ExoCET technique is also capable of efficiently assembling at least twenty DNA fragments to form a complete plasmid. Like the RecET direct cloning technique, ExoCET presents advantages over PCR for amplification of DNA because it has a much higher fidelity, is not limited in size and does not scramble haplotypes. The target DNA is directly cloned into a plasmid vector to facilitate expression studies. In addition, ExoCET is more efficient than Gibson assembly because Gibson assembly relies on circular DNA molecules produced by in vitro assembly. Moreover, due to the self-circularization of the empty vector, Gibson assembly may produce a very high background in the process of direct cloning.
Through the ability to selectively acquire large DNA segments from complex genomic preparations including blood, ExoCET also presents options for diagnostics and pathology tests, such as directed sequence acquisition for personalized medicine or the isolation of DNA viruses from patient materials. ExoCET will have broad applications in functional and comparative genomics, especially for direct cloning of biosynthetic pathways from prokaryotes or assembling multiple DNA molecules for synthetic biology.
It will be obvious to those skilled in the art that changes and modifications can be made, and therefore, the aim in the appended claims is to cover all such changes and modifications.
Number | Date | Country | Kind |
---|---|---|---|
201710177676.1 | Mar 2017 | CN | national |
This application is a continuation-in-part of International Patent Application No. PCT/CN2017/000483 with an international filing date of Aug. 2, 2017, designating the United States, now pending, and further claims foreign priority benefits to Chinese Patent Application No. 201710177676.1 filed Mar. 23, 2017. The contents of all of the aforementioned applications, including any intervening amendments thereto, are incorporated herein by reference.
| | Number | Date | Country |
|---|---|---|---|
| Parent | PCT/CN2017/000483 | Aug 2017 | US |
| Child | 16578385 | | US |