The present disclosure generally relates to compositions and methods for the assembly of nucleic acid molecules into larger nucleic acid molecules. Provided are compositions and methods for seamless connection of nucleic acid molecules, in many instances, with high sequence fidelity.
As genetic engineering has developed, a need for the generation of larger and larger nucleic acid molecules has also developed. In many instances, nucleic acid assembly methods involve the production of sub-assemblies (e.g., chemically synthesized oligonucleotides), followed by the generation of larger (e.g., annealing of oligonucleotides to form double-stranded nucleic acid molecules) and larger assemblies (e.g., ligation of double-stranded nucleic acid molecules).
The present disclosure generally relates to compositions and methods for efficient assembly of nucleic acid molecules.
The present disclosure relates, in part, to compositions and methods for efficient assembly of nucleic acid molecules. Three aspects of the invention, which may be used in combination or separately, are as follows:
1. The use of nuclease resistant regions near the termini (e.g., within 12, 15, 20, 30, 40, or 50 base pairs) of nucleic acid segments to limit digestion of these nucleic acid segments during the formation of single-stranded regions (e.g., single-stranded regions designed for hybridization to other nucleic acid segments).
2. The reconstitution of functional nucleic acid elements (e.g., selectable markers, origins of replication, etc.) for the purpose of selecting for correctly assembled nucleic acid molecules.
3. The stopping/inhibition of assembly reaction processes that can affect the stability of nucleic acid molecules prepared during the assembly process.
In some aspects, the invention relates to compositions and methods for covalently linking two nucleic acid segments, these methods comprising: (a) incubating the two nucleic acid segments with one or more nuclease (e.g., exonuclease) under conditions that allow for digestion of termini of the two nucleic acid segments to form complementary single-stranded regions on each nucleic acid segment and hybridization of the complementary single-stranded regions, wherein each of the two nucleic acid segments comprises a nuclease resistant region within 30 nucleotides of the end of the complementary terminus, and (b) covalently connecting at least one strand of the hybridized termini formed in (a) resulting in the linkage of the two nucleic acid segments.
Steps (a) and (b), referred to above, may be performed in the same tube and/or at the same time. Further, the two or more nucleic acid segments may be simultaneously contacted with one or more nuclease (e.g., exonuclease) and one or more molecule with ligase activity (e.g., ligase, topoisomerase, etc.) in step (a). In such instances, the two or more nucleic acid segments may be contacted with the one or more nuclease first, followed by contacting with the one or more molecule with ligase activity, or the two or more nucleic acid segments may be contacted with the one or more nuclease and the one or more molecule with ligase activity at the same time.
The invention also includes compositions and methods in which three or more (e.g., four, five, eight, ten, twelve, fifteen, etc.) nucleic acid segments are covalently linked to each other. Further, some of these nucleic acid segments may not contain a nuclease (e.g., exonuclease) resistant region, some may contain a single nuclease resistant region and some may contain two nuclease resistant regions. In most cases, nuclease resistant regions, when present, will be within 30 base pairs of a terminus of the nucleic acid segment in which they are present.
In many instances, nucleic acid molecules prepared by methods of the invention will be replicable. Further, many of these replicable nucleic acid molecules will be circular (e.g., plasmids). Replicable nucleic acid molecules, regardless of whether they are circular, will generally be formed from the assembly of two or more (e.g., three, four, five, eight, ten, twelve, etc.) nucleic acid segments. In some instances, methods of the invention employ selection based upon the reconstitution of one or more (e.g., two, three, four, etc.) selection marker or one or more (e.g., two, three, four, etc.) origin of replication resulting from the linking of different nucleic acid segments. Further selection may result from the formation of a circular nucleic acid molecule, in instances where circularity is required for replication.
The invention also relates, in part, to compositions and methods for storing assembled nucleic acid molecules (e.g., nucleic acid molecules assembled by methods disclosed herein). Stabilization of nucleic acid molecules is often facilitated by the inhibition of nucleic acid assembly activities (e.g., nuclease activities). Thus, the invention includes methods for the stabilization of nucleic acid molecules associated with the inhibition or elimination of activities (e.g., enzymatic activities) associated with the assembly process. For example, methods of the invention include those involving the partial or full inactivation of one or more enzymes contacting assembled nucleic acid molecules. This may be accomplished by the use of enzymatic inhibitors, pH changes, or other means.
In some instances, inhibition of enzymatic activity will be mediated by heating. While the temperatures required to inactivate enzymes differ with the particular enzyme or enzymes in the mixture, typically, heating will be to a temperature greater than 65° C. (e.g., 70° C., 75° C., 80° C., or 85° C.) for at least 10 minutes (e.g., 15 minutes, 20 minutes, 25 minutes, 30 minutes, etc.).
In many instances, after the partial or full inactivation of one or more enzymes contacting assembled nucleic acid molecules, the assembled nucleic acid molecules will be stored at a temperature equal to or below 4° C. (e.g., −20° C., −30° C., −60° C., or −70° C.) for at least 24 hours (e.g., 36 hours, two days, five days, seven days, two weeks, three weeks, one month, three months, six months, nine months, one year).
The invention also includes methods for assembling nucleic acid molecules, these methods comprising: (a) incubating a first nucleic acid segment with a nuclease (e.g., an exonuclease) under conditions that allow for partial digestion of at least one terminus of the first nucleic acid segment to form a single-stranded region, wherein the first nucleic acid segment contains a nuclease resistant region within 30 nucleotides of the at least one terminus, (b) preparing a reaction mixture containing the digested first nucleic acid segment formed in (a) with an undigested second nucleic acid segment under conditions that allow for the hybridization of termini with sequence complementarity, and (c) covalently connecting at least one strand of the hybridized termini formed in (b). The second nucleic acid segment of (b) may or may not contain a nuclease resistant region. In many instances, the at least one terminus of the second nucleic acid segment of (b) will contain a single-stranded region with sequence complementarity to the single-stranded region of the first nucleic acid segment formed in step (a). Further, the nuclease of (a) may be an exonuclease and, more specifically, a 5′ to 3′ exonuclease or a 3′ to 5′ exonuclease. Additionally, two or more nucleases may be present in step (a). Further, the nuclease(s) present may retain partial or full functionality in step (b) or may be partially or fully inactivated.
The invention also includes methods for assembling nucleic acid molecules, these methods comprising: (a) incubating two or more nucleic acid segments with a nuclease (e.g., an exonuclease) under conditions that allow for partial digestion of at least one terminus of each of the two or more nucleic acid segments to generate single-stranded termini, wherein at least two of the two or more nucleic acid segments contain a nuclease resistant region within 30 nucleotides of at least one of their termini, (b) preparing a reaction mixture containing the digested nucleic acid segments prepared in (a) with one or more undigested nucleic acid segment under conditions that allow for the hybridization of termini with sequence complementarity, wherein at least one of the one or more undigested nucleic acid segment has a region of sequence complementarity with at least one single-stranded terminus formed in (a), and (c) covalently connecting at least one strand of the hybridized termini formed in (b).
For a more complete understanding of the principles disclosed herein, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
Definitions:
As used herein the term “sequence fidelity” refers to the level of sequence identity of a nucleic acid molecule as compared to a reference sequence, with full identity being 100% identity over the full length of the nucleic acid molecules being scored for sequence identity. Sequence fidelity can be measured in a number of ways, for example, by the comparison of the actual nucleotide sequence of a nucleic acid molecule to a desired nucleotide sequence (e.g., a nucleotide sequence that one wishes to be used to generate a nucleic acid molecule). Another way sequence fidelity can be measured is by comparison of sequences of two nucleic acid molecules in a reaction mixture. In many instances, the difference on a per base basis will be, on average, the same.
As used herein the term “exonuclease” refers to an enzyme that cleaves nucleotides one at a time from the end (exo) of a polynucleotide chain. Typically, the enzymatic mechanism involves a hydrolysis reaction that breaks phosphodiester bonds at either the 3′ or the 5′ end. Exemplary exonucleases include Escherichia coli exonuclease I, Escherichia coli exonuclease III (3′ to 5′), Escherichia coli exonuclease VII, Escherichia coli exonuclease VIII, bacteriophage lambda exonuclease (5′ to 3′), exonuclease T (3′ to 5′), bacteriophage T5 exonuclease, and bacteriophage T7 exonuclease (5′ to 3′).
As used herein the term “error correction” refers to changes in the nucleotide sequence of a nucleic acid molecule to alter a defect. These defects can be mis-matches, insertions, and/or substitutions. Defects can occur when a nucleic acid molecule that is being generated (e.g., by chemical or enzymatic synthesis) is intended to contain a particular base at a location but a different base is present at that location. One error correction workflow is set out in
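By way of illustration only, the comparison of an actual nucleotide sequence to a desired nucleotide sequence may be sketched as a per-base identity calculation. The function name `sequence_fidelity` and the example sequences are hypothetical, and the sketch assumes the two sequences are of equal length and already aligned (insertions and deletions would require an alignment step that is not shown).

```python
def sequence_fidelity(observed: str, reference: str) -> float:
    """Percent identity of `observed` against `reference`, base by base.

    Assumption: both sequences are already aligned and of equal length.
    """
    if len(observed) != len(reference):
        raise ValueError("sequences must be the same length for this simple comparison")
    if not reference:
        raise ValueError("reference sequence is empty")
    matches = sum(1 for o, r in zip(observed.upper(), reference.upper()) if o == r)
    return 100.0 * matches / len(reference)


# Hypothetical 15-base sequences used only to exercise the function.
desired = "ATGGCCAAGTTGACC"
actual = "ATGGCCAAGTTGACC"
with_error = "ATGGCCTAGTTGACC"  # single substitution at the seventh base

print(sequence_fidelity(actual, desired))                # 100.0
print(round(sequence_fidelity(with_error, desired), 2))  # 93.33
```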
As used herein the term “selectable marker” refers to a nucleic acid segment that allows one to select for or against a nucleic acid molecule or a cell that contains it, often under particular conditions. Examples of selectable markers include but are not limited to: (1) nucleic acid segments that encode products which provide resistance against otherwise toxic compounds (e.g., antibiotics); (2) nucleic acid segments that encode products which are otherwise lacking in the recipient cell (e.g., tRNA genes, auxotrophic markers); (3) nucleic acid segments that encode products which suppress the activity of a gene product; (4) nucleic acid segments that encode products which can be readily identified (e.g., phenotypic markers such as β-galactosidase, green fluorescent protein (GFP), yellow fluorescent protein (YFP), red fluorescent protein (RFP), cyan fluorescent protein (CFP), and cell surface proteins); (5) nucleic acid segments that bind products which are otherwise detrimental to cell survival and/or function; (6) nucleic acid segments that bind products that modify a substrate (e.g., restriction endonucleases); (7) nucleic acid segments that can be used to isolate or identify a desired molecule (e.g., specific protein binding sites); (8) nucleic acid segments, which when absent, directly or indirectly confer resistance or sensitivity to particular compounds; and/or (9) nucleic acid segments that encode products which either are toxic (e.g., Diphtheria toxin) or convert a relatively non-toxic compound to a toxic compound (e.g., Herpes simplex thymidine kinase, cytosine deaminase) in recipient cells.
A “counter selectable” marker (also referred to herein as a “negative selectable marker”) or marker gene as used herein refers to any gene or functional variant thereof that allows for selection of wanted vectors, clones, cells or organisms by eliminating unwanted elements. These markers are often toxic or otherwise inhibitory to replication under certain conditions, which often involve exposure to specific substrates or a shift in growth conditions. Counter selectable marker genes are often incorporated into genetic modification schemes in order to select for rare recombination or cloning events that require the removal of the marker or to selectively eliminate plasmids or cells from a given population. One example of a negative selectable marker system widely used in bacterial cloning methods is the CcdA/CcdB Type II toxin-antitoxin system.
Overview:
The invention relates, in part, to compositions and methods for the preparation of nucleic acid molecules. While the invention has numerous aspects and variations associated with it, some of these aspects and variations of applicability of the technology may be represented with the exemplary work flow shown in
Multiple variations of the work flow represented in
In one aspect, methods are provided for the production of nucleic acid molecules having high “sequence fidelity”. This high sequence fidelity can be achieved by, for example, one, two, or all three of the following: accurate nucleic acid synthesis, error correction, and sequence verification.
Described herein are a number of technologies with applicability to work flows such as those shown in
Nucleic Acid Assembly:
One exemplary embodiment of assembly technology described herein is set out in
Nucleic acid segments such as those used in the work flow of
Only one terminus of each nucleic acid segment represented in
Numerous parameters may be designed and chosen to assemble, for example, different numbers of segments and nucleic acid segments of different length. Parameters may also be altered that result in increased efficiency of nucleic acid assembly for particular applications.
Using the schematic representation of
With respect to lengths of Regions A and/or D, when a nucleic acid molecule is longer than a certain length, the termini act as though they are, for purposes of association with other nucleic acid molecules, in effect different molecules. This, and other factors associated with long nucleic acid molecules (e.g., fragility), means that nucleic acid segment length is one factor for optimization with respect to assembly efficiency.
In some aspects of the invention, nucleic acid segment length will vary from about 20 base pairs to about 5,000 base pairs, from about 100 base pairs to about 5,000 base pairs, from about 150 base pairs to about 5,000 base pairs, from about 200 base pairs to about 5,000 base pairs, from about 250 base pairs to about 5,000 base pairs, from about 300 base pairs to about 5,000 base pairs, from about 350 base pairs to about 5,000 base pairs, from about 400 base pairs to about 5,000 base pairs, from about 500 base pairs to about 5,000 base pairs, from about 700 base pairs to about 5,000 base pairs, from about 800 base pairs to about 5,000 base pairs, from about 1,000 base pairs to about 5,000 base pairs, from about 100 base pairs to about 4,000 base pairs, from about 150 base pairs to about 4,000 base pairs, from about 200 base pairs to about 4,000 base pairs, from about 300 base pairs to about 4,000 base pairs, from about 500 base pairs to about 4,000 base pairs, from about 50 base pairs to about 3,000 base pairs, from about 100 base pairs to about 3,000 base pairs, from about 200 base pairs to about 3,000 base pairs, from about 250 base pairs to about 3,000 base pairs, from about 300 base pairs to about 3,000 base pairs, from about 400 base pairs to about 3,000 base pairs, from about 600 base pairs to about 3,000 base pairs, from about 800 base pairs to about 3,000 base pairs, from about 100 base pairs to about 2,000 base pairs, from about 200 base pairs to about 2,000 base pairs, from about 300 base pairs to about 1,500 base pairs, etc.
Nucleic acid segments used for assembly may be derived from a number of sources, for example, they may be cloned, derived from polymerase chain reactions, or chemically synthesized. Chemically synthesized nucleic acids tend to be of less than 100 nucleotides in length. PCR and cloning can be used to generate much longer nucleic acids. Further, the percentage of erroneous bases present in nucleic acids (e.g., nucleic acid segment) is, to some extent, tied to the method by which it is made. Typically, chemically synthesized nucleic acids have the highest error rate.
The length of the “hybridization” region, Region C, may also vary. The lengths of Region C may vary on each nucleic acid segment.
Typically, Region C will be, independently, on one or both segments in ranges of from about 1 to about 100 base pairs, from about 2 to about 100 base pairs, from about 10 to about 100 base pairs, from about 15 to about 100 base pairs, from about 20 to about 100 base pairs, from about 5 to about 80 base pairs, from about 10 to about 80 base pairs, from about 20 to about 80 base pairs, from about 30 to about 80 base pairs, from about 40 to about 80 base pairs, from about 25 to about 65 base pairs, from about 35 to about 65 base pairs, from about 1 to about 50 base pairs, from about 2 to about 50 base pairs, from about 3 to about 50 base pairs, from about 5 to about 50 base pairs, from about 6 to about 50 base pairs, from about 7 to about 50 base pairs, from about 8 to about 50 base pairs, from about 10 to about 50 base pairs, from about 12 to about 50 base pairs, from about 13 to about 50 base pairs, from about 14 to about 50 base pairs, from about 15 to about 50 base pairs, from about 18 to about 50 base pairs, from about 20 to about 50 base pairs, from about 1 to about 35 base pairs, from about 5 to about 30 base pairs, from about 5 to about 25 base pairs, from about 5 to about 20 base pairs, from about 5 to about 18 base pairs, from about 8 to about 50 base pairs, from about 8 to about 35 base pairs, from about 8 to about 30 base pairs, from about 8 to about 25 base pairs, from about 8 to about 20 base pairs, from about 10 to about 40 base pairs, from about 10 to about 35 base pairs, from about 10 to about 30 base pairs, from about 10 to about 25 base pairs, from about 10 to about 20 base pairs, etc.
The invention includes compositions and methods for nucleic acid assembly where the length of Region C varies with the sequence of this region. In particular, the invention includes reaction mixtures where nucleic acid segments with a higher proportion of As and Ts in Region C have a longer Region C than nucleic acid segments with a higher proportion of Cs and Gs. As an example, Region C of a nucleic acid segment with 60% C and G and 40% A and T may be 12 base pairs in length. Region C of a nucleic acid segment with 60% A and T and 40% C and G may be 18 base pairs in length. Further, both of these nucleic acid segments may be assembled in the same reaction mixture.
Table 1 shows an exemplary relationship between the A/T:C/G content and length of Region C. Region C may also be of different lengths when present at both termini of a nucleic acid segment.
The invention thus includes methods for assembling two or more nucleic acid segments, wherein one nucleic acid segment comprises at least one terminus with sequence homology to a second nucleic acid segment (e.g., Region C), wherein the region of homology varies in length as a function of the A/T:C/G ratio, with longer regions of sequence homology being present where the termini have higher A/T:C/G ratios. In some instances, one or both nucleic acid segments with sequence homology at their termini will contain an exonuclease resistant region (e.g., Region B).
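By way of illustration only, the dependence of Region C length on A/T content may be sketched as follows. The only values taken from the description are the two examples given above (40% A and T with 12 base pairs; 60% A and T with 18 base pairs). The straight-line interpolation between those two points, and the names `at_percent` and `suggested_region_c_length`, are assumptions made for this sketch and are not a statement of the relationship actually used.

```python
def at_percent(sequence: str) -> float:
    """Percentage of A and T bases in a sequence."""
    if not sequence:
        raise ValueError("sequence is empty")
    seq = sequence.upper()
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def suggested_region_c_length(at_pct: float) -> int:
    """Hypothetical Region C length (base pairs) for a given A/T percentage.

    Assumption: a straight line through the two example points in the text,
    (40% A/T, 12 bp) and (60% A/T, 18 bp). This is illustrative only.
    """
    return round(12 + (at_pct - 40.0) * (18 - 12) / (60.0 - 40.0))


print(suggested_region_c_length(40))  # 12
print(suggested_region_c_length(60))  # 18

# Hypothetical terminus: 6 of 10 bases are A or T.
terminus = "ATATTAGCGC"
print(at_percent(terminus))  # 60.0
```

A longer overlap for A/T-rich termini reflects the general observation that A:T base pairs contribute less to duplex stability than C:G base pairs.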
In many instances, Regions C will be designed such that the two regions share 100% sequence complementarity after nuclease digestion. In some instances, sequence complementarity will be below 100% (e.g., greater than 75%, greater than 80%, greater than 85%, greater than 90%, greater than 95%, between 75% and 99%, between 75% and 95%, between 75% and 90%, between 75% and 85%, between 80% and 99%, between 80% and 95%, between 85% and 99%, between 85% and 95%, etc.).
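By way of illustration only, the percent sequence complementarity between two single-stranded Region C termini may be sketched as follows. Both strands are written 5′ to 3′, so one strand is compared with the reverse complement of the other. The function names and example sequences are hypothetical, and the sketch assumes termini of equal length with no gaps.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(sequence: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence.upper()))


def percent_complementarity(terminus_1: str, terminus_2: str) -> float:
    """Percent of positions at which two 5'->3' strands can base pair.

    Assumption: equal-length termini, compared without gaps.
    """
    if len(terminus_1) != len(terminus_2) or not terminus_1:
        raise ValueError("termini must be non-empty and of equal length")
    partner = reverse_complement(terminus_2)
    paired = sum(1 for a, b in zip(terminus_1.upper(), partner) if a == b)
    return 100.0 * paired / len(terminus_1)


# Hypothetical 12-base termini.
region_c_1 = "GATTACAGGCTA"
region_c_2 = reverse_complement(region_c_1)  # fully complementary partner
region_c_2_mismatch = region_c_2[:-1] + "G"  # last base changed from C to G

print(percent_complementarity(region_c_1, region_c_2))  # 100.0
print(round(percent_complementarity(region_c_1, region_c_2_mismatch), 2))
```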
Further, incubation conditions may be adjusted such that there is, on average, partial or complete nuclease digestion of one strand of Region C. Also, conditions may be adjusted such that either the 3′ strand or the 5′ strand is digested. This may be determined by the choice of nuclease used (e.g., exonuclease). In particular, one or more 3′-exonuclease or 5′ exonuclease may be used. For example, two or more exonucleases may be used to digest termini of nucleic acid segments.
The length, number and spacing of nuclease resistant bases in Region B may also vary. In some instances, Region B will be bounded by nuclease resistant bases. In other instances, Region B will contain non-resistant bases abutting Region C. This may be useful in instances where one seeks to add one or more bases (e.g., restriction sites) to final assembly products that may or may not be translated. With reference to
Regions B and C will generally be determined by the overlap region (Region C) between nucleic acid 1 (NA1) and nucleic acid 2 (NA2).
Nuclease resistant bases will normally be in only one strand of nucleic acid segments to be joined but may be present in both strands.
The length of Region B may be as short as one base pair or substantially longer than one base pair. In some instances, the length of Region B will be from about one to about twenty base pairs, from about one to about fifteen base pairs, from about one to about ten base pairs, from about one to about six base pairs, from about one to about four base pairs, from about one to about two base pairs, from about two to about twenty base pairs, from about two to about ten base pairs, from about two to about five base pairs, from about three to about twenty base pairs, from about three to about ten base pairs, from about three to about five base pairs, etc.
The number of nuclease resistant bases in Region B may also vary. For example, the number of bases may be from about one to about ten, from about two to about ten, from about three to about ten, from about four to about ten, from about five to about ten, from about two to about five, from about two to about four, etc.
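By way of illustration only, the effect of nuclease resistant bases on exonuclease digestion may be sketched as follows. The sketch models one strand written 5′ to 3′ and an idealized 5′ to 3′ exonuclease that removes bases until it meets the first resistant base. The function name `digest_5_to_3`, the sequences, and the assumption that digestion always proceeds fully up to the resistant region are hypothetical simplifications; actual digestion is on average partial or complete depending on incubation conditions.

```python
from typing import Set, Tuple


def digest_5_to_3(strand: str, resistant: Set[int]) -> Tuple[str, int]:
    """Idealized 5'->3' exonuclease digestion of one strand.

    Removes bases from the 5' end until a nuclease resistant position
    (0-based index into `strand`) is reached. Returns the remaining strand
    and the number of bases removed.
    """
    removed = 0
    while removed < len(strand) and removed not in resistant:
        removed += 1
    return strand[removed:], removed


# Hypothetical segment terminus: Region C (hybridization region),
# Region B (three nuclease resistant bases), then the rest of the segment.
region_c = "GATTACAGGCTA"
region_b = "CCG"
remainder = "ATGAAACGTCTGTTC"
strand = region_c + region_b + remainder
resistant_positions = set(range(len(region_c), len(region_c) + len(region_b)))

remaining, removed = digest_5_to_3(strand, resistant_positions)
print(removed)                           # 12: only Region C is digested
print(remaining == region_b + remainder)  # True

# Without a resistant region, nothing limits digestion in this model.
unprotected, lost = digest_5_to_3(strand, set())
print(lost == len(strand))               # True
```

Digesting Region C on one strand exposes the complementary strand as a single-stranded region available for hybridization, while Region B protects the remainder of the segment.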
Other parameters that may be varied include the concentration of nucleic acid segments present and the ratio of these segments. In many instances, the nucleic acid segment concentration will be adjusted in combination with the concentration of nuclease and enzyme with ligase activity. Further, the ratio of nucleic acid segments to each other will often be essentially 1:1 but ratios may vary for particular applications. For example, when hybridization termini are AT rich (e.g., greater than 50%, 55%, 60%, 65% AT), these nucleic acid segments may be present in a higher ratio than nucleic acid segments with non-AT rich hybridization termini.
Nucleic acid segments such as those represented in
In some embodiments, two or more nucleic acid segments may be digested with exonucleases together or separately, then combined for assembly. In such instances, the same or different exonucleases may be used to digest termini of each fragment. Similarly, digestion reaction conditions may be the same or different for the nucleic acid segments.
If desired, amplification of these nucleic acid molecules (e.g., polymerase chain reaction) may also be employed to generate nucleic acid molecules without phosphorothioate bonds.
In many instances of embodiments shown in
When a replicable, circular vector is generated, two types of selection are employed in the workflow of
A second type of selection involves the use of selectable markers. Vector Segment A and Vector Segment B shown in
The invention further includes methods involving multiple selection methods for obtaining assembled nucleic acid molecules containing desired nucleic acid segments. In one embodiment, the invention includes methods for selecting assembled nucleic acid molecules through a combination of the generation of replicable vectors (e.g., recircularized vectors) and one or more selectable marker.
In some instances, vector segments may be distinguished from other nucleic acid segments in that they will generally contain components (e.g., functional components) normally found on vectors. Examples of such components include origins of replication, long terminal repeats, selectable markers, promoters and antidote coding sequences (e.g., ccdA coding sequences for counter-acting toxic effects of ccdB). However, all nucleic acid segments assembled by methods described herein may contain such components. For example, when nucleic acid segments are assembled to form an operon, the assembled nucleic acid segments will often contain promoter and terminator sequences. Further, in some instances when a vector is assembled, the only segments that will be assembled will be vector segments.
The invention thus includes methods for the assembly of nucleic acid segments where some of the nucleic acid segments contain selectable markers or have functionalities that are otherwise required for replication (e.g., contain an origin of replication). As noted above, the number of nucleic acid segments assembled by methods of the invention may vary greatly. For example, the number of nucleic acid fragments/segments that may be assembled by methods of the invention include from about two to about fifty, from about three to about fifty, from about four to about fifty, from about two to about five, from about two to about ten, from about two to about fifteen, from about two to about twenty, from about three to about five, from about three to about ten, from about three to about twenty, from about four to about six, from about four to about ten, from about four to about fifteen, from about four to about twenty, from about five to about ten, from about five to about twenty, from about five to about thirty, from about five to about forty, from about eight to about fifteen, etc.
Further, the number of nucleic acid segments that do not contain components that confer selective or other replication related functionality may also vary. In general, the number of “non-selective” assembly components will be greater than the number of “selective” assembly components and the ratio of these two components may vary from about 2:1 to about 1:1, from about 2:1 to about 1.1:1, from about 3:1 to about 1.1:1, from about 5:1 to about 1.1:1, from about 6:1 to about 1.1:1, from about 7:1 to about 1.1:1, from about 8:1 to about 1.1:1, from about 10:1 to about 1.1:1, from about 15:1 to about 1.1:1, from about 20:1 to about 1.1:1, from about 10:1 to about 2:1, from about 10:1 to about 3:1, from about 10:1 to about 4:1, from about 10:1 to about 5:1, from about 10:1 to about 6:1, etc.
In the representation of
The six nucleic acid segments represented in
The right hand side of
Correctly assembled nucleic acid molecules resulting from the work flow shown in
The invention thus provides compositions and methods for the preparation of shuttle vectors. These shuttle vectors may be screened for full length, correct assembly in one organism (e.g., a eukaryotic cell), followed by transfer to another organism (e.g., a prokaryotic cell).
The invention also provides compositions and methods for the assembly of nucleic acid segments involving the reconstitution of one or more selectable markers and/or one or more origin of replication. In many instances, two functional components required for cell survival will be reconstituted in methods of the invention.
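By way of illustration only, selection based on reconstitution of a functional element may be sketched as follows. A marker is divided between two vector segments, so that the intact marker is present only when those two segments are joined directly to each other in the circular product. All sequences are short hypothetical placeholders rather than real marker or vector sequences, and the function name `carries_intact_element` is an assumption of this sketch.

```python
def carries_intact_element(circular_product: str, element: str) -> bool:
    """True if `element` occurs in a circular molecule.

    The circle is written as a linear string from an arbitrary start point,
    so the search also covers an element spanning that start point.
    """
    wrapped = circular_product + circular_product[: len(element) - 1]
    return element in wrapped


# Hypothetical placeholder marker, split between two vector segments.
marker = "ATGAGCCATATTCAACGG"
marker_5_half, marker_3_half = marker[:9], marker[9:]

vector_segment_a = "CCGGAA" + marker_5_half  # ends with the 5' half
vector_segment_b = marker_3_half + "TTAACC"  # begins with the 3' half
insert = "GGGCCCGGGCCC"

correct = vector_segment_a + vector_segment_b + insert
misassembled = vector_segment_a + insert + vector_segment_b
same_circle_other_start = vector_segment_b + insert + vector_segment_a

print(carries_intact_element(correct, marker))                  # True
print(carries_intact_element(misassembled, marker))             # False
print(carries_intact_element(same_circle_other_start, marker))  # True
```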
Compositions and methods of the invention are also useful for the preparation of nucleic acid molecules that encode counter-selectable markers (e.g., ccdB). Such vectors may be generated in a number of different ways. Vectors with counter-selectable markers may be generated by introducing assembled nucleic acid molecules into a cell that is not susceptible to the marker. Two types of such cells are ones that are not naturally susceptible to the marker (e.g., introduction of a ccdB counter-selectable marker into a yeast cell) and ones that encode an antidote or are otherwise resistant to the counter-selectable marker product (e.g., ccdA and ccdB).
Error Identification and Correction:
Errors may find their way into nucleic acid molecules in a number of ways. Examples of such ways include chemical synthesis errors, amplification/polymerase mediated errors (especially when non-proof reading polymerases are used), and assembly mediated errors (usually occurring at nucleic acid segment junctions).
Two ways to lower the number of errors in assembled nucleic acid molecules are (1) selection of nucleic acid segments with correct sequences for assembly and (2) correction of errors in nucleic acid segments, partially assembled sub-assemblies, or fully assembled nucleic acid molecules.
In many instances, errors are incorporated into nucleic acid molecules regardless of the method by which the nucleic acid molecules are generated. Even when nucleic acid segments known to have correct sequences are used for assembly, errors can find their way into the final assembly products. Thus, in many instances, error reduction will be desirable. Error correction can be achieved by any number of means.
One method is by individually sequencing nucleic acid segments (e.g., chemically synthesized nucleic acid segments), followed by assembly of only nucleic acid segments determined to have correct sequences. This may be done by the selection of a single nucleic acid segment for amplification, then sequencing of the amplification products to determine if any errors are present. Thus, the invention also includes selection methods for the reduction of sequence errors. Methods for amplifying and sequence verifying nucleic acid molecules are set out in U.S. Pat. No. 8,173,368, the disclosure of which is incorporated herein by reference. Similar methods are set out in Matzas et al., Nature Biotechnology, 28:1291-1294 (2010).
Another way to reduce the number of sequence errors is by error correction. An exemplary error correction workflow is set out in
In the optional second step, the nucleic acid molecules are amplified to obtain more of each nucleic acid molecule. The amplification may be accomplished by any method, for example, by PCR. Introduction of additional errors into the nucleotide sequences of any of the nucleic acid molecules may occur during amplification.
In the third step, the amplified nucleic acid molecules are assembled into a first set of molecules intended to have a desired length, which may be the intended full length of the desired nucleotide sequence. Assembly of amplified nucleic acid molecules into full-length molecules may be accomplished in any way, for example, by using a PCR-based method.
In the fourth step, the first set of full-length molecules is denatured. Denaturation renders single-stranded molecules from double-stranded molecules. Denaturation may be accomplished by any means. In some embodiments, denaturation is accomplished by heating the molecules.
In the fifth step, the denatured molecules are annealed. Annealing renders a second set of full-length, double-stranded molecules from single-stranded molecules. Annealing may be accomplished by any means. In some embodiments, annealing is accomplished by cooling the molecules.
In the sixth step, the second set of full-length molecules are reacted with one or more endonucleases to yield a third set of molecules intended to have lengths less than the length of the complete desired gene sequence. The endonucleases cut one or more of the molecules in the second set into shorter molecules. The cuts may be accomplished by any means. Cuts at the sites of any nucleotide sequence errors are particularly desirable, in that assembly of pieces of one or more molecules that have been cut at error sites offers the possibility of removal of the cut errors in the final step of the process. In an exemplary embodiment, the molecules are cut with T7 endonuclease I, E. coli endonuclease V, and Mung Bean endonuclease in the presence of manganese. In this embodiment, the endonucleases are intended to introduce blunt cuts in the molecules at the sites of any sequence errors, as well as at random sites where there is no sequence error.
In the last step, the third set of molecules is assembled into a fourth set of molecules, whose length is intended to be the full length of the desired nucleotide sequence. Because of the late-stage error correction enabled by the provided method, the set of molecules is expected to have many fewer nucleotide sequence errors than can be provided by methods in the prior art.
The process set out above and in
Another process for effectuating error correction in chemically synthesized nucleic acid molecules is a commercial process referred to as ERRASE™ (Novici Biotech). Error correction methods and reagents suitable for use in error correction processes are set out in U.S. Pat. Nos. 7,838,210 and 7,833,759, U.S. Patent Publication No. 2008/0145913 A1 (mismatch endonucleases), and PCT Publication WO 2011/102802 A1, the disclosures of which are incorporated herein by reference.
Exemplary mismatch endonucleases include endonuclease VII (encoded by the T4 gene 49), RES I endonuclease, CEL I endonuclease, and SP endonuclease or methyl-directed endonucleases such as MutH, MutS or MutL. The skilled person will recognize that other methods of error correction may be practiced in certain embodiments of the invention such as those described, for example, in U.S. Patent Publication Nos. 2006/0127920 AA, 2007/0231805 AA, 2010/0216648 A1, 2011/0124049 A1 or U.S. Pat. No. 7,820,412, the disclosures of which are incorporated herein by reference.
One error correction method involves the following steps. The first step is to denature DNA contained in a reaction buffer (e.g., 200 mM Tris-HCl (pH 8.3), 250 mM KCl, 100 mM MgCl2, 5 mM NAD, and 0.1% TRITON® X-100) at 98° C. for 2 minutes, followed by cooling to 4° C. for 5 minutes, then warming the solution to 37° C. for 5 minutes, followed by storage at 4° C. At a later time, T7 endonuclease I and DNA ligase are added to the solution, which is then incubated at 37° C. for 1 hour. The reaction is stopped by the addition of EDTA. A similar process is set out in Huang et al., Electrophoresis 33:788-796 (2012).
Once nucleic acid segments are assembled, their sequences may be confirmed by sequence analysis. Sequence analysis may be used to confirm that “junction” sequences are correct and that no other nucleotide sequence “errors” are located within assembled nucleic acid molecules.
A number of nucleic acid sequencing methods are known in the art and include Maxam-Gilbert sequencing, chain-termination sequencing (e.g., Sanger sequencing), pyrosequencing, sequencing by synthesis and sequencing by ligation.
The invention thus includes compositions and methods for the assembly of nucleic acid molecules with high sequence fidelity. High sequence fidelity can be achieved by several means, including sequencing of nucleic acid segments prior to assembly or partially assembled nucleic acid molecules, sequencing of fully assembled nucleic acid molecules to identify ones with correct sequences, and/or error correction.
High Order Assembly:
Large nucleic acid molecules are relatively fragile and, thus, shear readily. One method for stabilizing such molecules is by maintaining them intracellularly. Thus, in some aspects, the invention involves the assembly and/or maintenance of large nucleic acid molecules in host cells. Large nucleic acid molecules will typically be 20 kb or larger (e.g., larger than 25 kb, larger than 35 kb, larger than 50 kb, larger than 70 kb, larger than 85 kb, larger than 100 kb, larger than 200 kb, larger than 500 kb, larger than 700 kb, larger than 900 kb, etc.).
Methods for producing and even analyzing large nucleic acid molecules are known in the art. For example, Karas et al., “Assembly of eukaryotic algal chromosomes in yeast,” Journal of Biological Engineering 7:30 (2013), shows the assembly of an algal chromosome in yeast and pulsed-field gel analysis of such large nucleic acid molecules.
As suggested above, one group of organisms known to perform homologous recombination fairly efficiently is yeasts. Thus, host cells used in the practice of the invention may be yeast cells (e.g., Saccharomyces cerevisiae, Schizosaccharomyces pombe, Pichia pastoris, etc.).
Yeast hosts are particularly suitable for manipulation of donor genomic material because of their unique set of genetic manipulation tools. The natural capacities of yeast cells, and decades of research, have created a rich set of tools for manipulating DNA in yeast. These advantages are well known in the art. For example, yeast, with their rich genetic systems, can assemble and re-assemble nucleotide sequences by homologous recombination, a capability not shared by many readily available organisms. Yeast cells can be used to clone larger pieces of DNA, for example, entire cellular, organelle, and viral genomes that are not able to be cloned in other organisms. Thus, in some embodiments, the invention employs the enormous capacity of yeast genetics to generate large nucleic acid molecules (e.g., synthetic genomics) by using yeast as host cells for assembly and maintenance.
Exemplary yeast host cells include yeast strain VL6-48N (developed for high transformation efficiency; parent strain: VL6-48 (ATCC Number MYA-3666™)), the W303a strain, the MaV203 strain (Thermo Fisher Scientific, cat. no. 11281-011), and recombination-deficient yeast strains, such as the RAD54 gene-deficient strain, VL6-48-Δ54G (MATα his3-Δ200 trp1-Δ1 ura3-52 lys2 ade2-101 met14 rad54-Δ1::kanMX), which can decrease the occurrence of a variety of recombination events in yeast artificial chromosomes (YACs).
Sample Preparation and Storage:
In some instances, enzymes associated with nucleic acid assembly reactions interfere with nucleic acid molecule stability. As a result, some assembly protocols call for the transformation of cells within a short time period (e.g., less than one hour) after assembly has been performed. This is not always convenient and, in some cases (e.g., high-throughput applications), may not be practical. The invention thus provides compositions and methods for stabilizing partially and/or fully assembled nucleic acid molecules.
This aspect of methods of the invention involves the use of conditions for inhibiting enzymatic reactions employed in the assembly of nucleic acid segments. One enzyme that may be inhibited is exonuclease. Exonucleases, as well as other enzymes (e.g., polymerases and ligases), may be inhibited by (1) the addition of an inhibitor, a proteinase, and/or an antibody with binding affinity for a reaction component (e.g., an exonuclease) and/or (2) physical means such as alteration of pH, metal ion concentration, heating, or salt concentration. Also, compositions and methods of the invention may involve a combination of inhibition methods. One goal of such methods is to reduce the activity of enzymatic function to a desired level, including essentially complete inactivation (i.e., unidentifiable levels of activity).
In terms of reduction of exonuclease activity, the level of inhibition will typically be measured under conditions and at a temperature (e.g., 37° C.) where the particular enzyme exhibits high levels of activity. This provides a benchmark for comparison. Exemplary reaction conditions include 67 mM glycine-KOH, 2.5 mM MgCl2, 50 μg/ml BSA, pH 9.4, 37° C. (Lambda Exonuclease); and 67 mM glycine-KOH, 6.7 mM MgCl2, 10 mM β-mercaptoethanol, pH 9.5, 37° C. (E. coli Exonuclease I). Typically, the goal will be to achieve a reduction in enzymatic activity of at least 80% (e.g., at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, from about 80% to about 99%, from about 80% to about 98%, from about 80% to about 97%, from about 80% to about 95%, from about 80% to about 93%, from about 85% to about 99%, from about 85% to about 98%, from about 85% to about 97%, from about 85% to about 95%, from about 90% to about 99%, from about 90% to about 98%, from about 90% to about 97%, from about 90% to about 96%, from about 90% to about 95%, from about 90% to about 94%, from about 90% to about 93%, etc.) as compared to benchmark conditions.
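The comparison against benchmark conditions can be illustrated with a minimal Python sketch. The function names and the activity readings are hypothetical and used only for illustration; the 80% default mirrors the lower bound stated above.

```python
def percent_reduction(benchmark_activity: float, inhibited_activity: float) -> float:
    """Percent reduction in activity relative to the benchmark measurement."""
    if benchmark_activity <= 0:
        raise ValueError("benchmark activity must be positive")
    return 100.0 * (benchmark_activity - inhibited_activity) / benchmark_activity


def meets_target(benchmark_activity: float, inhibited_activity: float,
                 target_percent: float = 80.0) -> bool:
    """True if the reduction reaches the target (80% is the floor named in the text)."""
    return percent_reduction(benchmark_activity, inhibited_activity) >= target_percent


# Hypothetical readings in arbitrary activity units, for illustration only.
print(percent_reduction(200.0, 20.0))  # 90.0
print(meets_target(200.0, 50.0))       # False (75% reduction)
```

Both measurements are assumed to be taken under the same benchmark conditions (e.g., 37° C. in the buffers listed above), so that the ratio reflects inhibition rather than a change in assay conditions.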
Methods for identifying degradation of nucleic acid molecules include transformation efficiency and gel electrophoresis. With gel electrophoresis, a portion of a reaction mixture may be run on a gel and the amount of “smearing” may be determined. The level of smearing may then be used to calculate the amount (e.g., percentage) of the nucleic acids present that have been damaged. Thus, in some aspects, assays that may be used for determining whether a sample has been stabilized by methods of the invention involve the measurement of degradation of nucleic acid molecules in reaction mixtures maintained under defined storage conditions (e.g., −20° C. for 2 weeks, −20° C. for 4 weeks, −20° C. for 8 weeks, −20° C. for 12 weeks, −20° C. for 20 weeks, −20° C. for 24 weeks, −20° C. for 30 weeks, −20° C. for 36 weeks, −20° C. for 40 weeks, −20° C. for 48 weeks, −20° C. for 52 weeks, −70° C. for 2 weeks, −70° C. for 4 weeks, −70° C. for 8 weeks, −70° C. for 12 weeks, −70° C. for 20 weeks, −70° C. for 24 weeks, −70° C. for 30 weeks, −70° C. for 36 weeks, −70° C. for 40 weeks, −70° C. for 48 weeks, −70° C. for 52 weeks, etc.).
Enzymatic reaction rates normally decrease as the temperature decreases from the optimum temperature for the particular enzyme catalyzing the reaction. Further, enzymatic reactions continue to occur even when reaction mixtures are frozen. Also, the lower the temperature after a sample is frozen, the lower the enzymatic reaction rate. Thus, enzymatic reaction rates are expected to be lower at −70° C. than at −20° C. The benchmark temperature referenced above is used for convenience because assaying of enzymatic activity under common laboratory sample storage conditions (e.g., −20° C.) is generally more difficult than under optimal reaction conditions. Also, high levels of enzymatic activities typically associated with optimal reaction conditions (or reaction conditions close thereto) provide sufficient activity to accurately measure the effects of inhibitory conditions.
Exonuclease inhibitors that may be used in the practice of the invention include 8-oxoguanine, mononucleotides, nucleoside 5′-monophosphates, 6-mercaptopurine ribonucleoside 5′-monophosphate, sodium fluoride, fludarabine (9-β-D-arabinofuranosyl-2-fluoroadenine 5′-monophosphate)-terminated DNA, and nucleic acid binding proteins (e.g., poly(U)-binding protein). Exonuclease inhibitors may inhibit specific exonucleases, groups of exonucleases (e.g., 3′ to 5′ exonucleases, 5′ to 3′ exonucleases, etc.), or essentially all exonucleases.
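One way to turn a gel smear into a percentage is sketched below in Python. This is only an illustration: it assumes that densitometry yields one integrated signal for the intact band and one for the smear, which is an assumption and not a procedure stated in this disclosure. The 5% default reflects the "less than 5%" figure used elsewhere in this disclosure for stabilized samples, and all names and values are hypothetical.

```python
def percent_degraded(intact_band_signal: float, smear_signal: float) -> float:
    """Share of total lane signal found in the smear, as a percentage."""
    total = intact_band_signal + smear_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return 100.0 * smear_signal / total


def is_stable(intact_band_signal: float, smear_signal: float,
              max_percent: float = 5.0) -> bool:
    """True if the degraded share stays below the chosen threshold."""
    return percent_degraded(intact_band_signal, smear_signal) < max_percent


# Hypothetical densitometry values (arbitrary units) after storage.
print(percent_degraded(970.0, 30.0))  # 3.0
print(is_stable(900.0, 100.0))        # False (10% degraded)
```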
As noted above, pH may also be altered to inhibit enzymatic activities (e.g., exonuclease activity). Many exonucleases, for example, exhibit significant nuclease activity at pHs in ranges of 7.5 to 9.5. A shifting of the pH away from the optimum for the particular exonuclease or exonucleases used will generally decrease enzymatic activities. Further, the farther the pH is shifted from the optimum pH, the less enzymatic activity is expected. Also, pH may be shifted higher or lower. In instances where the removal of RNA is desired, pH may be shifted higher because RNA, but generally not DNA, is hydrolyzed under basic conditions.
In many instances, pH shifts will be greater than one pH unit from the optimum pH of at least one of the exonucleases present in a nucleic acid segment assembly reaction mixture. Thus, if the optimum pH for a particular enzyme is 7.5, then the pH would be shifted to at least either pH 6.5 or 8.5. pH shifts will typically be in the ranges of from about 1 to about 7 pH units, from about 1.5 to about 7 pH units, from about 2 to about 7 pH units, from about 2.5 to about 7 pH units, from about 3 to about 7 pH units, from about 3.5 to about 7 pH units, from about 4 to about 7 pH units, from about 4.5 to about 7 pH units, from about 5 to about 7 pH units, from about 1 to about 6 pH units, from about 1.5 to about 6 pH units, from about 2 to about 6 pH units, from about 2.5 to about 6 pH units, from about 3 to about 6 pH units, from about 3.5 to about 6 pH units, from about 4 to about 6 pH units, from about 4.5 to about 6 pH units, from about 5 to about 6 pH units, from about 1 to about 5 pH units, from about 1.5 to about 5 pH units, from about 2 to about 5 pH units, from about 2.5 to about 5 pH units, from about 3 to about 5 pH units, from about 3.5 to about 5 pH units, from about 4 to about 5 pH units, from about 4.5 to about 5 pH units, etc.
Many enzymes, including exonucleases, require divalent metal ions (e.g., magnesium, manganese, and calcium) for enzymatic activity. Removal or sequestration of divalent metal ions may also be used to inhibit enzymatic activities. For example, divalent metal ion sequestration may occur by the addition of a chelating agent such as EDTA, EGTA, or 1,2-bis(o-aminophenoxy)ethane-N,N,N′,N′-tetraacetic acid (BAPTA). Many chelating agents have higher affinity for some metal ions than other metal ions. For example, EGTA is more selective for calcium ions than magnesium ions.
Final divalent metal ion concentrations in exonuclease reaction mixtures, for example, tend to be in the range of 2 to 7 mM. Sequestration agents, when used, will typically be present in an amount sufficient to bind greater than 95% of the total amount of divalent metal ion present. The stoichiometry will often be determined by the affinity of the sequestration agent for the divalent metal ion, the amount of divalent metal ion present, the amount of sequestration agent present, the amount of ions present that compete for the sequestration agent, and other reaction mixture conditions. Typically, sequestration agents will be present in an amount that is at least equal to the divalent metal ion (1:1) but may be present in a greater amount (e.g., from about 5:1 to about 1:1, from about 4:1 to about 1:1, from about 3:1 to about 1:1, from about 2:1 to about 1:1, from about 1.5:1 to about 1:1, from about 1.25:1 to about 1:1, from about 5:1 to about 1.1:1, from about 2.5:1 to about 1.1:1, from about 5:1 to about 1.5:1, from about 2.5:1 to about 1.5:1, from about 5:1 to about 2:1, from about 4:1 to about 2:1, etc.). In many instances, the amount of sequestration agent will be adjusted to achieve a reduction in enzymatic activity of at least 80% under the selected benchmark conditions.
One method of inhibiting thermolabile enzymes (e.g., exonucleases, ligases and polymerases) is by heating reaction mixtures (e.g., aqueous reaction mixtures) containing these enzymes for a sufficient period of time to allow for enzymatic inactivation. In most instances, this will result in irreversible inactivation by denaturation of enzyme(s) present in the reaction mixtures. Suitable heating conditions will vary with the thermal properties of particular enzymes present but will generally be greater than 60° C. (e.g., from about 60° C. to about 95° C., from about 65° C. to about 95° C., from about 70° C. to about 95° C., from about 75° C. to about 95° C., from about 80° C. to about 95° C., from about 60° C. to about 90° C., from about 60° C. to about 85° C., from about 60° C. to about 80° C., from about 60° C. to about 75° C., from about 65° C. to about 90° C., from about 65° C. to about 85° C., from about 70° C. to about 90° C., etc.) for at least 5 minutes (e.g., from about 5 min. to about 30 min., from about 5 min. to about 20 min., from about 5 min. to about 15 min., from about 5 min. to about 10 min., from about 10 min. to about 30 min., from about 10 min. to about 25 min., from about 10 min. to about 20 min., etc.).
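The pH 7.5 example above can be expressed as a short Python sketch. The helper names are hypothetical, and the 0 to 14 bounds are an assumption added only to keep the illustration within the conventional aqueous pH scale.

```python
from typing import Optional, Tuple


def shifted_ph_targets(optimum_ph: float,
                       min_shift: float = 1.0) -> Tuple[Optional[float], Optional[float]]:
    """Return the (lower, upper) pH values lying min_shift units from the optimum.

    A side is reported as None if it would fall outside the 0-14 scale.
    """
    lower = optimum_ph - min_shift
    upper = optimum_ph + min_shift
    return (lower if lower >= 0.0 else None, upper if upper <= 14.0 else None)


def shift_is_sufficient(optimum_ph: float, actual_ph: float,
                        min_shift: float = 1.0) -> bool:
    """True if the actual pH is at least min_shift units away from the optimum."""
    return abs(actual_ph - optimum_ph) >= min_shift


print(shifted_ph_targets(7.5))        # (6.5, 8.5)
print(shift_is_sufficient(7.5, 7.0))  # False (only 0.5 units away)
```

When several exonucleases are present, the same check could be run against the optimum of each enzyme, since the text only requires the shift relative to at least one of them.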
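The relationship between molar ratio and the fraction of metal ion bound can be sketched in Python under a deliberately simple model: a single 1:1 metal-chelator equilibrium with no competing ions. The dissociation constant below is a hypothetical placeholder; real apparent affinities depend on pH, temperature, and the other mixture conditions noted above.

```python
import math


def chelator_for_ratio(metal_total_mM: float, molar_ratio: float) -> float:
    """Chelator concentration (mM) for a given chelator:metal molar ratio."""
    return metal_total_mM * molar_ratio


def fraction_metal_bound(metal_total_mM: float, chelator_total_mM: float,
                         kd_mM: float) -> float:
    """Fraction of metal in the 1:1 complex, from the exact quadratic solution.

    Solves Kd = [M][L]/[ML] with mass balance on total metal and total chelator.
    """
    s = metal_total_mM + chelator_total_mM + kd_mM
    complex_mM = (s - math.sqrt(s * s - 4.0 * metal_total_mM * chelator_total_mM)) / 2.0
    return complex_mM / metal_total_mM


# 6.7 mM MgCl2 is taken from the exemplary buffer; the Kd is a hypothetical value.
metal = 6.7
chelator = chelator_for_ratio(metal, 1.5)
bound = fraction_metal_bound(metal, chelator, kd_mM=1e-3)
print(f"chelator: {chelator:.2f} mM, fraction bound: {bound:.4f}")
print("exceeds 95% bound:", bound > 0.95)
```

With a sub-stoichiometric amount of chelator, the bound fraction cannot exceed the chelator:metal ratio, which is why the text starts its ranges at 1:1.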
One advantage of heating to inactivate exonucleases is that, in many instances, it will not be necessary to open containers (e.g., tubes) or add reagents as part of the inactivation step. This is especially useful when high-throughput methods are used.
Another way in which assembly reactions may be inhibited is through degradation of one or more assembly reaction components (e.g., an exonuclease). This may be done, for example, using one or more proteinases. Exemplary proteinases include serine endopeptidases (e.g., Proteinase K of Tritirachium album Limber) and aspartate proteinases (e.g., pepsin and cathepsin D), threonine proteases, cysteine proteases, glutamic acid proteases, and metalloproteases. Thus, the invention includes methods in which assembled nucleic acid molecules are exposed to one or more proteinases for a time sufficient to inhibit assembly reaction components.
Inhibition of assembly reaction components may be measured in a number of ways. One way is by measuring the reduction in one or more assembly reaction activities (e.g., exonuclease or ligase activity). For example, when inhibition of exonuclease activity is measured, the amount of reduction in activity is as discussed above and will often be greater than 75%. Further, this reduction in activity may be measured in units, with, for example, a decrease in activity of at least 75 units as compared to a control.
Exonuclease units may be defined as the amount of enzyme that will catalyze the release of 10 nanomoles of acid-soluble nucleotide in 30 minutes at 37° C. in a total reaction volume of 50 μl, with the reaction mixture containing 67 mM Glycine-KOH, 6.7 mM MgCl2, 10 mM β-ME, pH 9.5 at 25° C. and 0.17 mg/ml single-stranded [3H]-DNA.
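The unit definition above lends itself to a short Python sketch. The function name is hypothetical, and scaling a measurement taken over a different time window assumes a linear release rate, which is an assumption rather than something stated in the definition.

```python
NMOL_PER_UNIT = 10.0   # nanomoles of acid-soluble nucleotide per unit (from the definition)
ASSAY_MINUTES = 30.0   # assay duration in the definition


def exonuclease_units(nmol_released: float, minutes: float = ASSAY_MINUTES) -> float:
    """Convert measured nucleotide release into units.

    Release measured over a different time is scaled to the 30-minute window,
    assuming the release rate is linear over that period.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return (nmol_released * ASSAY_MINUTES / minutes) / NMOL_PER_UNIT


# Hypothetical measurements, for illustration only.
print(exonuclease_units(25.0))               # 2.5
print(exonuclease_units(5.0, minutes=15.0))  # 1.0
```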
Methods for assessing exonuclease activity based on the preferential binding of single-stranded DNA over double-stranded DNA to graphene oxide are set out, for example, in Lee et al., “A simple fluorometric assay for DNA exonuclease activity based on graphene oxide,” Analyst 137:2024-2026 (2012).
Another way in which assembly reactions can be inhibited is through the use of antibodies with binding affinity for assembly reaction components (e.g., ligase and exonuclease). A number of antibodies with binding affinity for, for example, ligases and exonucleases are commercially available from companies such as abcam (1 Kendall Square, Suite B2304, Cambridge, Mass. 02139), including Anti-DNA Ligase III antibody [6G9] (ab587), Anti-DNA Ligase I antibody [10H5] (ab615), Anti-DNA Ligase IV antibody (ab26039), and Anti-Exonuclease 1 antibody (ab106303).
More than one (e.g., two, three or four) enzyme (e.g., exonuclease) inhibition method may be used in the practice of the invention. For example, a pH shift may be used in conjunction with heating. When a thermostable enzyme is used, heat-based inactivation will generally not be used.
The invention thus provides compositions and methods for stabilizing assembled nucleic acid molecules present in reaction mixtures. These reaction mixtures will generally contain components (e.g., enzymes) that can cause damage to the nucleic acid molecules present therein. Nucleic acid molecules in reaction mixtures prepared using methods of the invention will typically show little (less than 5% of the total nucleic acid molecules present) or no degradation upon storage at −20° C. for 8 weeks, −20° C. for 12 weeks, −20° C. for 24 weeks, −70° C. for 12 weeks, −70° C. for 24 weeks, −70° C. for 36 weeks, or −70° C. for 52 weeks.
Kits:
The invention also provides kits for the assembly and storage of nucleic acid molecules. As part of these kits, materials and instructions are provided for both the assembly of nucleic acid molecules and the preparation of reaction mixtures for storage.
Kits of the invention will often contain one or more of the following components:
1. One or more exonuclease,
2. One or more polymerase,
3. One or more ligase,
4. One or more partial vector (e.g., one or more nucleic acid segment containing an origin of replication and/or a selectable marker) or complete vector,
5. One or more enzymatic (e.g., an exonuclease) inhibitor (e.g., a solution with a pH above 9 or below 6.5, a sequestration agent, and, optionally, one or more of the following
6. One or more non-vector nucleic acid segments in may
7. Instructions for how to prepare and store samples (e.g., directing the addition of one or more inhibitory compounds and/or heating of the sample, followed by storage at low temperature (e.g., −20° C. or below)).
There is increasing demand for large, high-fidelity, synthetic DNA constructs. However, the most commonly synthesized genes range in size from 600 to 1,200 bp. Further seamless assembly is required to obtain large nucleic acid (e.g., DNA) constructs. A seamless, sequence-independent nucleic acid assembly method, based on phosphorothioate chemistry, is set out in this example. Some features of methods set out in this example are:
1. The use of phosphorothioate chemistry stops the “chew back” reaction of exonuclease at a specified location, allowing the generation of controllable overhangs and correct assembly.
2. Synthetic DNA fragments are generated by PCR using a pair of phosphorothioate end primers, followed by one-step reaction using, for example, the GeneArt® Seamless Cloning and Assembly Kit (Life Technologies Corporation, now part of Thermo Fisher Scientific, cat. no. A13288).
3. Data indicate that the efficiency of cloning ten 1 kb PCR fragments is around 98%, with about 2000 colonies, although the efficiency of cloning ten synthetic strings reduces to about 64%.
4. DNA sequencing analysis confirms the integrity of the DNA junctions.
5. Optimization of assembly reactions can be achieved by the alteration of factors such as PCR conditions, length of overhangs, amount of DNAs, and incubation times. In brief, these are highly efficient in vitro assembly methods applicable, for example, to gene synthesis.
Introduction: Long synthetic DNA fragments (e.g., >10 kb), commonly used for the construction of large genes and multigene pathways, are often challenging to assemble. Traditional restriction-based ligation methods are sequence-specific and often generate “scars”.
Homologous recombination-based methods, such as those employed by the GeneArt® Seamless Cloning and Assembly Kit (Life Technologies Corporation, now part of Thermo Fisher Scientific, cat. no. A13288), utilize exonuclease to generate single-stranded DNA overhangs for joining of overlapping fragments. However, the “chew back” reaction is often difficult to control, which leads to non-specific annealing amongst DNA fragments and decreases the efficiency of large DNA assembly.
In this example, a highly efficient DNA assembly method is described, which utilizes phosphorothioate chemistry in conjunction with GeneArt® Seamless Cloning and Assembly Enzyme Mix (cat. no. A14606). This method allows for one-step assembly of, for example, ten 1 kb PCR fragments, as well as repetitive DNA fragments.
Material and Methods.
Materials: Phusion DNA polymerase (NEB), GeneArt® Seamless Cloning and Assembly Kit (Life Technologies Corporation, now part of Thermo Fisher Scientific, cat. no. A13288), AccuPrime™ Pfx DNA polymerase (Thermo Fisher, cat. no. 12344-032), T4 DNA ligase (Thermo Fisher, cat. no. 15224-090), PureLink™ Quick PCR purification kit (Thermo Fisher, cat. no. K3100-1), pType-IIs recipient vector (vector map can be viewed at www.lifetechnologies.com), One-Shot TOP10 Chemically Competent Cells (Thermo Fisher, cat. no. C4040-10), BigDye terminator v3.1 cycle sequencing kit (Thermo Fisher, cat. no. 4337457), and E-gel (Thermo Fisher, cat. no. G5018-8). Synthetic DNAs and the trimers of Tal assembly repeats were synthesized by GeneArt® (Thermo Fisher).
Methods:
Oligo Design: Two adjacent PCR fragments share 15 bases of homology at each end (
The following phosphorothioate primers were used for DNA amplification and assembly:
Mycoplasma genitalium
Assembly Method: Ten 1 kb DNA fragments from either M. genitalium, V. cholerae or C. violaceum were PCR-amplified using phosphorothioate primers in the presence of either Phusion® DNA polymerase or AccuPrime™ Pfx DNA polymerase. To assemble synthetic DNA strings, synthetic DNAs were PCR-amplified using phosphorothioate primers. Linearized pType IIs vector was also prepared by PCR amplification using phosphorothioate primers. PCR fragments were purified using a standard PCR column. If the DNA concentration was too low (below 50 ng/μl), the DNA fragments were mixed and concentrated using a Speed Vac. The DNA fragments were resuspended in 7 μl water. In a 10 μl assembly reaction, 75 ng of linear vector, 75 ng each of 10 PCR fragments, 2 μl of 5× reaction buffer, and 1 μl of 10× enzyme mix were added. The reaction was initiated by the addition of enzyme mix, followed by incubation at room temperature for 1 hour. 3 μl of reaction mix was added to TOP10 competent cells, which were then incubated on ice for 30 minutes, followed by heat shock at 42° C. for 30 seconds. After incubation on ice for 2 minutes, 250 μl of SOC medium was added to the transformation reaction, which was then incubated at 37° C. for 1 hour. One hundred μl of cell suspension was spread on LB+Amp plates and incubated at 37° C. overnight. Colonies were randomly picked and subjected to plasmid DNA isolation, followed by analysis by both restriction enzyme digestion and sequencing.
Results and Discussion: Because the phosphorothioate bonds stop the chew back reaction catalyzed by exonucleases at a specified location and generate perfect overhangs for homologous recombination, it was expected that the efficiency for DNA assembly would be higher than for assembly reactions using molecules not having phosphorothioate bonds, especially for large fragment assembly. To examine this, two sets of ten 1 kb fragments were designed and PCR-amplified from M. genitalium and V. cholerae, respectively, using phosphorothioate primers. The DNA fragments share 15 bp homology at their ends. The assembly of ten 1 kb fragments plus linear vector was performed in triplicate as described above. The DNA fragments of M. genitalium also harbor a functional LacZ gene which was intentionally split into two adjacent fragments so that blue colonies were produced on X-gal plates once the DNA fragments were assembled correctly. As depicted in Table 5, about 2000 colonies per transformation were obtained. The cloning efficiency was more than 98% based on the calculation of percentage of blue colonies. To confirm the identity of the construct, 11 blue colonies were picked. Plasmid DNA was isolated from each of these colonies for restriction digestion analysis and sequencing analysis. Digestion of the 11 plasmids with BglII generated, in every case, the three expected sizes of DNA fragments, which are 640 bp, 2003 bp and 8743 bp (data not shown), respectively. Sequencing of three individual plasmids revealed that all three constructs had the correct sequences at the 11 junctions connecting the fragments and vector. Similar results were observed with the second set of ten 1 kb DNA fragments that were amplified from V. cholerae. As shown in Table 6, around 2000 CFU were obtained in two individual experiments. Ten colonies were randomly picked and subjected to restriction analysis with NcoI.
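Because the reaction is set up by mass (75 ng of each DNA), fragments of different lengths are present in different molar amounts. The Python sketch below converts mass to picomoles using an approximate average of 650 g/mol per base pair of double-stranded DNA, which is a common rule of thumb and not a value taken from this disclosure. The 3000 bp vector length is a hypothetical placeholder, since the vector size is not stated here.

```python
AVG_DALTONS_PER_BP = 650.0  # approximate average mass of one base pair (assumption)


def ng_to_pmol(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of double-stranded DNA (ng) to picomoles of molecules."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return mass_ng * 1000.0 / (length_bp * AVG_DALTONS_PER_BP)


def insert_to_vector_ratio(insert_ng: float, insert_bp: int,
                           vector_ng: float, vector_bp: int) -> float:
    """Molar ratio of one insert to the vector."""
    return ng_to_pmol(insert_ng, insert_bp) / ng_to_pmol(vector_ng, vector_bp)


# 75 ng of a 1 kb fragment, as used in the assembly reaction above.
print(f"{ng_to_pmol(75.0, 1000):.3f} pmol per 1 kb fragment")

# Hypothetical 3000 bp vector at the same 75 ng mass.
print(f"insert:vector molar ratio = {insert_to_vector_ratio(75.0, 1000, 75.0, 3000):.1f}")
```

At equal mass, a shorter fragment is present in proportionally more copies, which may be worth keeping in mind when adjusting the amount of DNA as one of the optimization factors.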
Upon digestion, all ten clonal isolates showed the expected sizes of DNA fragments, which are 1263 bp and 10396 bp (data not shown), respectively.
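The efficiency figures reported here are simple proportions, as the following Python sketch illustrates. The colony counts are hypothetical and chosen only to match the order of magnitude described above (about 2000 colonies, with more than 98% blue); they are not the values from Table 5.

```python
def cloning_efficiency(blue_colonies: int, white_colonies: int) -> float:
    """Percentage of colonies scored as correct by blue/white screening."""
    total = blue_colonies + white_colonies
    if total <= 0:
        raise ValueError("at least one colony is required")
    return 100.0 * blue_colonies / total


def fraction_correct(correct_clones: int, clones_screened: int) -> float:
    """Percentage of picked clones confirmed by restriction digest or sequencing."""
    if clones_screened <= 0:
        raise ValueError("at least one clone must be screened")
    return 100.0 * correct_clones / clones_screened


# Hypothetical plate counts, for illustration only.
print(cloning_efficiency(1960, 40))  # 98.0

# All ten picked isolates showing the expected digest pattern.
print(fraction_correct(10, 10))      # 100.0
```

The two measures answer different questions: the first estimates efficiency across the whole plate from a phenotypic screen, while the second confirms construct identity on a small sample of picked clones.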
Next, this method was evaluated on the assembly of ten synthetic DNA fragments (strings). The synthetic strings were produced by GeneArt (Thermo Fisher) and PCR amplified using phosphorothioate primers. The quality of the PCR products was fair as some of the DNA fragments had minor truncated products. An average of 248 colonies was observed in the triplicate experiments (Table 7). Restriction digestion analysis with XmnI produced three expected sizes of DNA fragments of 1563 bp, 2317 bp and 6 kb (data not shown), suggesting that the efficiency of assembly is around 60%.
The feasibility of using this PS approach for assembly of repetitive DNA fragments was also examined. Tal repeat trimers having more than 90% homology were obtained from GeneArt® (Thermo Fisher). To minimize the cross-reactivity, the length of overlap was reduced from 15 bp to 12 bp. Four trimers of Tal repeats were PCR amplified using phosphorothioate primers and assembled simultaneously to produce a Tal effector containing 12 repeats. Around 28,000 colonies were observed. Five colonies were randomly picked for DNA sequencing. The results indicated that 4 out of 5 contained all four trimers of Tal repeats.
In conclusion, a robust assembly method was developed using phosphorothioate chemistry. Since T7 exonuclease hydrolyzes double-stranded DNA from 5′ to 3′, it generates a 5′ phosphate at a specified phosphorothioate nucleotide. Upon annealing to a complementary strand, the double-stranded DNA contains a nick bounded by 3′-OH and 5′-P termini. Ligase may be used to seal the nick.
Summary: Here, a technique based on positive-selection vectors is presented. The strategy relies on vectors with a truncated and inactive replication origin and selection marker, whose short missing sequences are provided in trans during the cloning procedure. The approach i) confers selective survivability on the assembly products that have correctly assembled outermost fragments and ii) reduces background colony growth due to recircularized vectors.
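The chew-back and annealing logic can be modeled with a highly simplified Python sketch. The model assumes that a 5′ to 3′ exonuclease removes nucleotides from the 5′ end of a strand until it reaches the first phosphorothioate linkage, and that the exposed 3′ tails of adjacent fragments anneal when they are complementary over the full overlap. The sequences, function names, and the way the protected linkage is indexed are all hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def chew_back(strand_5to3: str, first_protected_linkage: int) -> str:
    """Strand remaining after 5'->3' digestion stops at a protected linkage.

    Linkage i joins nucleotides i and i+1, so i nucleotides are removed.
    """
    if not 0 <= first_protected_linkage < len(strand_5to3):
        raise ValueError("protected linkage must lie within the strand")
    return strand_5to3[first_protected_linkage:]


def exposed_3prime_tail(top_strand: str, overlap: int, end: str) -> str:
    """Single-stranded 3' tail (5'->3') left after the partner strand is chewed back."""
    if end == "right":   # bottom strand digested; top-strand tail exposed
        return top_strand[-overlap:]
    if end == "left":    # top strand digested; bottom-strand tail exposed
        return reverse_complement(top_strand[:overlap])
    raise ValueError("end must be 'left' or 'right'")


def tails_can_anneal(frag_a_top: str, frag_b_top: str, overlap: int) -> bool:
    """True if the right tail of A is fully complementary to the left tail of B."""
    tail_a = exposed_3prime_tail(frag_a_top, overlap, "right")
    tail_b = exposed_3prime_tail(frag_b_top, overlap, "left")
    return tail_a == reverse_complement(tail_b)


shared = "ATGCGTACCGTTAGC"          # hypothetical 15-base homology
frag_a = "GGGTTTAAACCC" + shared
frag_b = shared + "TTTGGGCCCAAA"
print(chew_back(frag_b, 15))                 # TTTGGGCCCAAA
print(tails_can_anneal(frag_a, frag_b, 15))  # True
```

Because digestion stops at a defined position in this model, the annealed product has no single-stranded gap beyond the overlap, which is consistent with the nick described above.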
Materials and Methods
Strains: Chemically competent or electrocompetent Escherichia coli strains, DH10B-T1 and TOP10, were obtained from Thermo Fisher Scientific. E. coli strain S17-1::λ-pir (de Lorenzo and Timmis, Analysis and construction of stable phenotypes in gram-negative bacteria with Tn5- and Tn10-derived minitransposons, Methods Enzymol. 235:386-405 (1994)) was used to maintain the positive-selection vector pASE101. Chemically competent yeast MaV203 strain (a part of the GeneArt® High-Order Genetic Assembly System kit) was obtained from Thermo Fisher Scientific. E. coli strains were grown in LB medium with appropriate antibiotics: ampicillin (Ap, 50 μg/ml), kanamycin (Km, 25 μg/ml), and chloramphenicol (Cm, 20 μg/ml). Yeast MaV203 transformants were grown on CSM-Trp medium.
Oligonucleotides, synthetic DNAs, and plasmids: Oligonucleotides used in this study are listed in Table 8. Synthetic DNA strings were obtained from Thermo Fisher Scientific (GeneArt, Germany). A subset of these synthetic DNA fragments was cloned into pCR®-Blunt II-TOPO® Vector (Thermo Fisher Scientific) as indicated below. These pre-cloned DNA fragments were used as templates to produce PCR-amplified inserts. These three different types of DNA (synthetic, pre-cloned, and PCR-amplified) were then used for DNA assembly tests. All DNA fragments used for the assembly tests are listed in Table 9.
A 4,255-bp DNA fragment was amplified from pYES3/CT (Thermo Fisher Scientific) using a primer set (CH316 & CH371) and circularized by self-ligation to generate pYES8. A 2,848-bp linear positive-selection vector pYES8D for in vivo DNA assembly in yeast was PCR amplified from pYES8 using a primer set (CH327 and CH353), and was also circularized by self-ligation to maintain in E. coli. Three DNA fragments, 2micron ori-TR_pUC ori (1045 bp, CH353 & CH397) and TRP1-TR (871 bp, CH399 & CH401) from pYES8D and KmR (1006 bp, CH396 & CH400) from pCR®-Blunt II-TOPO® Vector (ThermoFisher), were assembled using GeneArt® Seamless PLUS Cloning and Assembly Kit (ThermoFisher) to generate pYES10. A 1815 bp DNA fragment harboring pUC ori and ApR gene from pYES8 was amplified by PCR (CH423 & CH418) and self-ligated to produce pUC-Ap. A 1794 bp DNA fragment harboring pUC ori and ApR gene from pYES10 was amplified by PCR using a primer set (CH423 & CH418) and self-ligated to produce pUC-Km. A 1581 bp DNA fragment harboring truncated pUC ori (pUC ori-TR) and KmR (KmR-TR) was amplified from pUC-Km by PCR using a primer set (CH428 & CH438) and assembled with a 1223 bp PCR-amplified (CH450 & CH451) synthetic DNA fragment using GeneArt® Seamless PLUS Cloning and Assembly Kit (ThermoFisher) to generate pASE101. This vector can be maintained only in an E. coli strain harboring the pir gene, such as S17-1::λ-pir. A linear 1581 bp positive selection vector pASE101L was amplified from pASE101 by PCR using a primer set (CH428 & CH438). A linear 1603 bp control vector pASE_cont harboring functional pUC ori and KmR was amplified from pASE101 by PCR using a primer set (CH476 & CH477). Phosphorothioate versions of pASE101L and pASE_cont were amplified using phosphorothioate primer sets, CHPT1 & CHPT2 and CHPT3 & CHPT4.
DNA assembly: For in vivo assembly in yeast, the protocol for GeneArt® High-Order Genetic Assembly System (ThermoFisher) was followed using a modified amount of vector (50 ng) and inserts (50 ng each). For in vitro assembly and cloning in E. coli, both GeneArt® Seamless Cloning and Assembly Kit and GeneArt® Seamless PLUS Cloning and Assembly Kit (ThermoFisher) were used following the manufacturer's protocol using vector (75 ng) and inserts (75 ng each).
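Because the amounts above are given as equal masses, the molar amount of each fragment scales inversely with its length. The following is a minimal sketch of that conversion. It assumes the commonly used approximation of about 650 g/mol per base pair of double-stranded DNA; the constant, the function name, and the 1,000-bp insert length are illustrative assumptions and are not taken from the protocol.

```python
# Approximate average molar mass of one base pair of double-stranded DNA.
# This value is a widely used rule of thumb, not a figure from the protocol.
AVG_BP_MASS_G_PER_MOL = 650.0


def ng_to_pmol(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of double-stranded DNA (ng) to picomoles (hypothetical helper)."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    # ng -> g is 1e-9, mol -> pmol is 1e12, so the net factor is 1e3.
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS_G_PER_MOL)


# 50 ng of the 2,848-bp vector versus 50 ng of an assumed 1,000-bp insert.
vector_pmol = ng_to_pmol(50, 2848)
insert_pmol = ng_to_pmol(50, 1000)
print(f"vector: {vector_pmol:.4f} pmol, insert: {insert_pmol:.4f} pmol")
print(f"insert:vector molar ratio = {insert_pmol / vector_pmol:.3f}")
```

Under these assumptions, equal masses place the shorter fragment in molar excess by the ratio of the two lengths.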
Results and Conclusions
Positive selection in Saccharomyces cerevisiae: A map and sequence of the 2848 bp vector pYES8D is shown in
In a first cloning example 10 different fragments accounting for a total of 9868 bp were PCR amplified from Vibrio cholerae's genomic DNA and mixed with the linearized vector above (
The fragments and vector were transformed into competent MaV203 yeast cells, which were subsequently plated onto CSM-Trp agar plates as indicated in materials and methods. The cells are unable to grow on media lacking tryptophan unless they are complemented by a plasmid harboring an active trp1 gene.
A series of control experiments was performed. First, a linear plasmid with intact and functional trp1 and 2μ ori elements, pYES8 (
The results showed that the positive selection vector pYES8D promoted the recombination of the expected construct with an efficiency of 94%. In other words, 94 out of 100 colonies contained the right clone. In the absence of the positive selection feature, no correct clone could be obtained, despite the fact that a comparable number of colonies appeared on the plates. Lastly, the negative control experiments (no fragment number 6 and no insert controls) produced a significantly reduced number of colonies only when the positive selection vector was employed.
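The efficiency figure quoted above is the fraction of screened colonies that carry the expected construct. A minimal sketch of that arithmetic follows; the function name is a hypothetical helper, and the only data used are the 94 correct clones out of 100 colonies stated in the text.

```python
def cloning_efficiency(correct_clones: int, colonies_screened: int) -> float:
    """Return the percentage of screened colonies that contain the expected clone."""
    if colonies_screened <= 0:
        raise ValueError("colonies_screened must be positive")
    if not 0 <= correct_clones <= colonies_screened:
        raise ValueError("correct_clones must lie between 0 and colonies_screened")
    return 100.0 * correct_clones / colonies_screened


# Values stated in the text for the positive selection vector pYES8D.
print(cloning_efficiency(94, 100))  # 94.0
```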
In a second example, 10 synthetic DNA fragments were in vitro synthesized employing standard gene synthesis procedures (
The results showed that with the use of the positive selection vector pYES8D, the expected final construct could be obtained with cloning efficiencies ranging from 77 to 100%. Without positive selection, the expected clone was obtained at a significantly lower rate (compare assemblies 1 and 9 in
In conclusion, the positive selection vector approach in yeast significantly reduces the downstream screening effort compared with standard selection procedures, shortening the hands-on and overall time required to obtain the expected clone.
Positive selection in Escherichia coli: In this second example, the performance of the positive selection approach is shown in the context of E. coli cloning. In this case the vector, pASE101 (
A 10-fragment array similar to the one described in the previous section was employed as a source of inserts. In this particular case the fragments were PCR amplified using oligonucleotides harboring phosphorothioate bonds as described in materials and methods. The construct was assembled using the GeneArt® Seamless Cloning and Assembly Kit (Thermo Fisher Scientific) and transformed into TOP10 cells (Thermo Fisher Scientific). As a control, a similar construct was assembled using the vector pASE_cont (
The results show that the positive selection vector strategy significantly increases the cloning efficiency compared with the approach where no positive selection is employed (cloning efficiencies of 71% and 45%, respectively).
In conclusion, the positive selection approach can be applied to the most common E. coli-based cloning workflows, complementing and boosting the performance of otherwise standard cloning methodologies.
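The effect of cloning efficiency on downstream screening effort can be illustrated with a simple probability estimate. The sketch below assumes that each colony independently carries the correct clone with probability equal to the reported efficiency, and asks how many colonies must be screened to find at least one correct clone with 95% confidence. The independence model, the 95% target, and the function name are assumptions made for illustration; only the efficiencies (71% and 45% in E. coli, 94% in yeast) come from the text.

```python
import math


def colonies_to_screen(efficiency_pct: float, confidence: float = 0.95) -> int:
    """Smallest n with P(at least one correct clone among n colonies) >= confidence.

    Assumes independent colonies, each correct with probability efficiency_pct / 100.
    """
    p = efficiency_pct / 100.0
    if not 0.0 < p <= 1.0:
        raise ValueError("efficiency_pct must be in (0, 100]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    if p == 1.0:
        return 1  # every colony is correct
    # Solve (1 - p)^n <= 1 - confidence for the smallest integer n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


for label, eff in [("E. coli, positive selection", 71),
                   ("E. coli, no positive selection", 45),
                   ("yeast, positive selection", 94)]:
    print(f"{label}: screen {colonies_to_screen(eff)} colonies")
# E. coli, positive selection: screen 3 colonies
# E. coli, no positive selection: screen 6 colonies
# yeast, positive selection: screen 2 colonies
```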
The protocol below is directed, in part, to the termination of enzymatic reactions related to nucleic acid assembly. Once nucleic acid segments are fully assembled, the continued action of enzymes (e.g., exonucleases) can damage assembled nucleic acid molecules.
A linearized vector and DNA fragments are prepared as instructed in the GeneArt® Seamless DNA assembly kit (Life Technologies, Catalog number A14606) manual. Add the DNA mix in a volume of 10 μl to a thin-walled PCR tube or a well of a PCR plate. Add 10 μl of GeneArt Seamless DNA assembly enzyme mix and mix by pipetting up and down or flicking the tube. Briefly spin the liquid down to the bottom of the tube (DO NOT exceed 5 seconds and 500 rpm). If the final construct is smaller than 13 kb, incubate in a PCR machine with the following protocol: 30 minutes at 25° C., then 10 minutes at 75° C., hold at 4° C. If the final construct is larger than 13 kb, use the following protocol: 30 minutes at 25° C., 75 minutes at 75° C., then 60 minutes at 25° C., hold at 4° C. The reaction mixture can be stored at 25° C. or lower for up to 48 hours until transformation.
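The choice between the two incubation programs can be expressed as a small lookup keyed on construct size. The sketch below is illustrative only: the function and constant names are hypothetical, and because the protocol does not say which program applies to a construct of exactly 13 kb, the code assumes the longer program in that case. Each program is followed by a hold at 4° C., which is not modeled as a timed step.

```python
from typing import List, Tuple

Step = Tuple[int, int]  # (duration in minutes, temperature in degrees C)

SIZE_THRESHOLD_KB = 13.0  # threshold stated in the protocol


def incubation_program(construct_kb: float) -> List[Step]:
    """Return the timed incubation steps for a construct of the given size."""
    if construct_kb <= 0:
        raise ValueError("construct_kb must be positive")
    if construct_kb < SIZE_THRESHOLD_KB:
        # Constructs smaller than 13 kb: assembly, then heat step.
        return [(30, 25), (10, 75)]
    # Constructs larger than 13 kb (exactly 13 kb is an assumption here).
    return [(30, 25), (75, 75), (60, 25)]


def total_minutes(program: List[Step]) -> int:
    """Sum the durations of all timed steps, excluding the final 4 degree hold."""
    return sum(minutes for minutes, _ in program)


for size_kb in (9.868, 15.0):
    program = incubation_program(size_kb)
    print(f"{size_kb} kb: {program}, {total_minutes(program)} min before the hold")
```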
GGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAA
CTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCG
CCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAAT
AGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAA
CTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGC
CCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATG
CCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCT
ACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGT
ACATCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCA
AGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAA
AATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATT
GACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAA
GCAGAGCTCGTTTAGTGAACCGTCAGATCGCCTGGAGACGCCAT
CCACGCTGTTTTGACCTCCATAGAAGACACCGGGACCGATCCAG
CCTCCGGACTCTAGAGGATCGAATGGCAA (SEQ ID NO: 157)
TGCAGATGCAGCTTGAAGCAAATGCAGATACTTCAGTGGAAGAA
GAAAGCTTTGGCCCACAACCCATTTCACGGTTAGAGCAGTGTGG
CATAAATGCCAACGATGTGAAGAAATTGGAAGAAGCTGGATTCC
ATACTGTGGAGGCTGTTGCCTATGCGCCAAAGAAGGAGCTAATA
AATATTAAGGGAATTAGTGAAGCCAAAGCTGATAAAATTCTGGC
TGAGGCAGCTAAATTAGTTCCAATGGGTTTCACCACTGCAACTG
AATTCCACCAAAGGCGGTCAGAGATCATACAGATTACTACTGGC
TCCAAAGAGCTTGACAAACTACTTCAAGGTGGAATTGAGACTGG
ATCTATCACAGAAATGTTTGGAGAATTCCGAACTGGGAAGACCC
AGATCTGTCATACGCTAGCTGTCACCTGCCAGCTTCCCATTGACC
GGGGTGGAGGTGAAGGAAAGGCCATGTACATTGACACTGAGGG
TACCTTTAGGCCAGAACGGCTGCTGGCAGTGGCTGAGAGGTATG
GTCTCTCTGGCAGTGATGTCCTGGATAATGTAGCATATGCTCGAG
CGTTCAACACAGACCACCAGACCCAGCTCCTTTATCAAGCATCA
GCCATGATGGTAGAATCTAGGTATGCACTGCTTATTGTAGACAGT
GCCACCGCCCTTTACAGAACAGACTACTCGGGTCGAGGTGAGCT
TTCAGCCAGGCAGATGCACTTGGCCAGGTTTCTGCGGATGCTTCT
GCGACTCGCTGATGAGTTTGGTGTAGCAGTGGTAATCACTAATC
AGGTGGTAGCTCAAGTGGATGGAGCAGCGATGTTTGCTGCTGAT
CCCAAAAAACCTATTGGAGGAAATATCATCGCCCATGCATCAAC
AACCAGATTGTATC (SEQ ID NO: 158)
TGAGGAAAGGAAGAGGGGAAACCAGAATCTGCAAAATCTACGA
CTCTCCCTGTCTTCCTGAAGCTGAAGCTATGTTCGCCATTAATGC
AGATGGAGTGGGAGATGCCAAAGACGGAAGCGGAGCTACTAAC
AACGTCACTCAGCCAGAACACTTAATAATAAATTAAGTCTTTCA
AAACCAAAATTTTCAGGTTTCACTTTTAAAAAGAAAACATCTTCA
GATAACAATGTATCTGTAACTAATGTGTCAGTAGCAAAAACACC
TGTATTAAGAAATAAAGATGTTAATGTTACCGAAGACTTTTCCTT
CAGTGAACCTCTACCCAACACCACAAATCAGCAAAGGGTCAAGG
ACTTCTTTAAAAATGCTCCAGCAGGACAGGAAACACAGAGAGGT
GGATCAAAATCATTATTGCCAGATTTCTTGCAGACTCCGAAGGA
AGTTGTATGCACTACCCAAAACACACCAACTGTAAAGAAATCCC
GGGATACTGCTCTCAAGAAATTAGAATTTAGTTCTTCACCAGATT
CTTTAAGTACCATCAATGATTGGGATGATATGGATGACTTTGATA
CTTCTGAGACTTCAAAATCATTTGTTACACCACCCCAAAGTCACT
TTGTAAGAGTAAGCACTGCTCAGAAATCAAAAAAGGGTAAGAG
AAACTTTTTTAAAGCACAGCTTTATACAACAAACACAGTAAAGA
CTGATTTGCCTCCACCCTCCTCTGAAAGCGAGCAAATAGATTTGA
CTGAGGAACAGAAGGATGACTCAGAATGGTTAAGCAGCGATGTG
ATTTGCATCGATGATGGCCCCATT (SEQ ID NO: 159)
GCTGAAGTGCATATAAATGAAGATGCTCAGGAAAGTGACTCTCT
GAAAACTCATTTGGAAGATGAAAGAGATAATAGCGAAAAGAAG
AAGAATTTGGAAGAAGCTGAATTACATTCAACTGAGAAAGTTCC
ATGTATTGAATTTGATGATGATGATTATGATACGGATTTTGTTCC
ACCTTCTCCAGAAGAAATTATTTCTGCTTCTTCTTCCTCTTCAAAA
TGCCTTAGTACGTTAAAGGACCTTGACACATCTGACAGAAAAGA
GGATGTTCTTAGCACATCAAAAGATCTTTTGTCAAAACCTGAGA
AAATGAGTATGCAGGAGCTGAATCCAGAAACCAGCACAGACTGT
GACGCTAGACAGATAAGTTTACAGCAGCAGCTTATTCATGTGAT
GGAGCACATCTGTAAATTAATTGATACTATTCCTGATGATAAACT
GAAACTTTTGGATTGTGGGAACGAACTGCTTCAGCAGCGGAACA
TAAGAAGGAAACTTCTAACGGAAGTAGATTTTAATAAAAGTGAT
GCCAGTCTTCTTGGCTCATTGTGGAGATACAGGCCTGATTCACTT
GATGGCCCTATGGAGGGTGATTCCTGCCCTACAGGGAATTCTAT
GAAGGAGTTAAATTTTTCACACCTTCCCTCAAATTCTGTTTCTCCT
GGGGACTGTTTACTGACTACCACCCTAGGAAAGACAGGATTCTC
TGCCACCAGGAAGAATCTTTTTGAAAGGCCTTTATTCAATACCCA
TTTACAGAAGTCCTTTGTAAGTAGCAACTGGGCTGAAACACCAA
GACTAGGAAAAAAAAATGAAAGCTCTTATTTCCCAGGAAATGTT
CTCACAAGCACTGCTGTGAAAGATCAGAATAAACATACTGCTTC
AATAAATGACTTAGAAAGAGAAACCCAACCTTCCTATGATATTG
ATAATTTTGACATAGATGACTTTGATGATGATGATGACTGGGAA
GACATAATGCATAATTTAGCAGCCAGCAAATCTTCCACAGCTGC
CTATCAACCCATCAAGGAAGGTCGGCCAATTAAATCAGTATCAG
AAAGACTTTCCTCAGCCAAGACAGACTGTCTTCCAGTGTCATCTA
CTGCTCAAAATATAAACTTCTCAGAGTCAATTCAGAATTATACTG
ACAAGTCAGCACAAAATTTAGCATCCAGAAATCTGAAACATGAG
CGTTTCCAAAGTCTTAGTTTTCCTCATACAAAGGAAATGATGAAG
ATTTTTCATAAAAAATTTGGCCTGCATAATTTTAGAACTAATCAG
CTAGAGGCGATCAATGCTGCACTGCTTGGTGAAGACTGTTTTATC
CTGATGCCGACTGGAGGTGGTAAGAGTTTGTGTTACCAGCTCCCT
GCCTGTGTTTCTCCTGGGGTCACTGTTGTCATTTCTCCCTTGAGAT
CACTTATCGTAGATCAAGTCCAAAAGCTGACTTCCTTGGATATTC
CAGCTACATATCTGACAGGTGATAAGACTGACTCAGAAGCTACA
AATATTTACCTCCAGTTATCAAAAAAAGACCCAATCATAAAACT
TCTATATGTCACTCCAGAAAAGATCTGTGCAAGTAACAGACTCA
TTTCTACTCTGGAGAATCTCTATGAGAGGAAGCTCTTGGCACGTT
TTGTTATTGATGAAGCACATTGTGTCAGTCAGTGGGGACATGATT
TTCGTCAAGATTACAAAAGAATGAATATGCTTCGCCAGAAGTTT
CCTTCTGTTCCGGTGATGGCTCTTACGGCCACAGCTAATCCCAGG
GTACAGAAGGACATCCTGACTCAGCTGAAGATTCTCAGACCTCA
GGTGTTTAGCATGAGCTTTAACAGACATAATCTGAAATACTATGT
ATTACCGAAAAAGCCTAAAAAGGTGGCATTTGATTGCCTAGAAT
GGATCAGAAAGCACCACCCATATGATTCAGGGATAATTTACTGC
CTCTCCAGGCGAGAATGTGACACCATGGCTGACACGTTACAGAG
AGATGGGCTCGCTGCTCTTGCTTACCATGCTGGCCTCAGTGATTC
TGCCAGAGATGAAGTGCAGCAGAAGTGGATTAATCAGGATGGCT
GTCAGGTTATCTGTGCTACAATTGCATTTGGAATGGGGATTGACA
AACCGGACGTGCGATTTGTGATTCATGCATCTCTCCCTAAATCTG
TGGAGGGTTACTACCAAGAATCTGGCAGAGCTGGAAGAGATGGG
AGACTGAAAAGACTTATAATGATGGAAAAAGATGGAAACCATC
ATACAAGAGAAACTCACTTCAATAATTTGTATAGCATGGTACATT
ACTGTGAAAATATAACGGAATGCAGGAGAATACAGCTTTTGGCC
TACTTTGGTGAAAATGGATTTAATCCTGATTTTTGTAAGAAACAC
CCAGATGTTTCTTGTGATAATTGCTGTAAAACAAAGGATTATAAA
ACAAGAGATGTGACTGACGATGTGAAAAGTATTGTAAGATTTGT
TCAAGAACATAGTTCATCACAAGGAATGAGAAATATAAAACATG
TAGGTCCTTCTGGAAGATTTACTATGAATATGCTGGTCGACATTT
TCTTGGGGAGTAAGAGTGCAAAAATCCAGTCAGGTATATTTGGA
AAAGGATCTGCTTATTCACGACACAATGCCGAAAGACTTTTTAA
AAAGCTGATACTTGACAAGATTTTGGATGAAGACTTATATATCA
ATGCCAATGACCAGGCGATCGCTTATGTGATGCTCGGAAATAAA
GCCCAAACTGTACTAAATGGCAATTTAAAGGTAGACTTTATGGA
AACAGAAAATTCCAGCAGTGTGAAAAAACAAAAAGCGTTAGTA
GCAAAAGTGTCTCAGAGGGAAGAGATGGTTAAAAAATGTCTTGG
AGAACTTACAGAAGTCTGCAAATCTCTGGGGAAAGTTTTTGGTG
TCCATTACTTCAATATTTTTAATACCGTCACTCTCAAGAAGCTTG
CAGAATCTTTATCTTCTGATCCTGAGGTTTTGCTTCAAATTGATG
GTGTTACTGAAGACAAACTGGAAAAATATGGTGCGGAAGTGATT
TCAGTATTACAGAAATACTCTGAATGGACATCGCCAGCTGAAGA
CAGTTCCCCAGGGATAAGCCTGTCCAGCAGCAGAGGCCCCGGAA
GAAGTGCCGCTGAGGAGCTTGACGAGGAAATACCCGTATCTTCC
CACTACTTTGCAAGTAAAACCAGAAATGAAAGGAAGAGGAAAA
AGATGCCAGCCTCCCAAAGGTCTAAGAGGAGAAAAACTGCTTCC
AGTGGTTCCAAGGCAAAGGGGGGGTCTGCCACATGTAGAAAGAT
ATCTTCCAAAACGAAATCCTCCAGCATCATTGGATCCAGTTCAGC
CTCACATACTTCTCAAGCGACATCAGGAGCCAATAGCAAATTGG
GGATTATGGCTCCACCGAAGCCTATAAATAGACCGTTTCTTAAGC
CTTCATATGCATTCT (SEQ ID NO: 160)
CATAAGGGGGAGGCTAACTGAAACACGGAAGGAGACAATACCG
GAAGGAACCCGCGCTATGACGGCAATAAAAAGACAGAATAAAA
CGCACGGGTGTTGGGTCGTTTGTTCATAAACGCGGGGTTCGGTCC
CAGGGCTGGCACTCTGTCGATACCCCACCGAGACCCCATTGGGG
CCAATACGCCCGCGTTTCTTCCTTTTCCCCACCCCACCCCCCAAG
TTCGGGTGAAGGCCCAGGGCTCGCAGCCAACGTCGGGGCGGCAG
GCCCTGCCATAGCAGATCTGCGCAGCTGGGGCTCTAGGGGGTAT
GGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCCTAGCGC
CCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGG
CTTTCCCCGTCAAGCTCTAAATCGGGGCATCCCTTTAGGGTTCCG
ATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTAGGG
TGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCG
CCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTT
CCAAACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGA
TTTATAAGGGATTTTGGGGATTTCGGCCTATTGGTTAAAAAATGA
GCTGATTTAACAAAAATTTAACGCGAATTAATTCTGTGGAATGTG
TGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAG
AAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCAGGTGTG
GAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATG
CATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCC
ATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCAT
GGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCT
GCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGG
CCTAGGCTTTTGCAAAAAGCTCCCGGGAGCTTGTATATCCATTTT
CGGATCTGATCAAGAGACAGGATGAGGATCGTTTCGCATGATTG
CAGGGATTGCTACAATTTATCAAAGAAGCTTCAGAACCCATCCA
TGTGAGGAAGTATAAAGGGCAGGTAGTAGCTGTGGATACATATT
GCTGGCTTCACAAAGGAGCTATTGCTTGTGCTGAAAAACTAGCC
AAAGGTGAACCTACTGATAGGTATGTAGGATTTTGTATGAAATTT
GTAAATATGTTACTATCTCATGGGATCAAGCCTATTCTCGTATTT
GATGGATGTACTTTACCTTCTAAAAAGGAAGTAGAGAGATCTAG
AAGAGAAAGACGACAAGCCAATCTTCTTAAGGGAAAGCAACTTC
TTCGTGAGGGGAAAGTCTCGGAAGCTCGAGAGTGTTTCACCCGG
TCTATCAATATCACACATGCCATGGCCCACAAAGTAATTAAAGC
TGCCCGGTCTCAGGGGGTAGATTGCCTCGTGGCTCCCTATGAAGC
TGATGCGCAGTTGGCCTATCTTAACAAAGCGGGAATTGTGCAAG
CCATAATTACAGAGGACTCGGATCTCCTAGCTTTTGGCTGTAAAA
AGGTAATTTTAAAGATGGACCAGTTTGGAAATGGACTTGAAATT
GATCAAGCTCGGCTAGGAATGTGCAGACAGCTTGGGGATGTATT
CACGGAAGAGAAGTTTCGTTACATGTGTATTCTTTCAGGTTGTGA
CTACCTGTCATCACTGCGTGGGATTGGATTAGCAAAGGCATGCA
AAGTCCTAAGACTAGCCAATAATCCAGATATAGTAAAGGTTATC
AAGAAAATTGGACATTATCTCAAGATGAATATCACGGTACCAGA
GGATTACATCAACGGGTTTATTCGGGCCAACAATACCTTCCTCTA
TCAGCTAGTTTTTGATCCCATCAAAAGGAAACTTATTCCTCTGAA
CGCCTATGAAGATGATGTTGATCCTGAAACACTAAGCTACGCTG
GGCAATATGTTGATGATTCCATAGCTCTTCAAATAGCACTTGGAA
ATAAAGATATAAATACTTTTGAACAGATCGATGACTACAATCCA
GACACTGCTATGCCTGCCCATTCAAGAAGTCGTAGTTGGGATGA
CAAAACATGTCAAAAGTCAGCTAATGTTAGCAGCATTTGGCATA
GGAATTACTCTCCCAGACCAGAGTCGGGTACTGTTTCAGATGCCC
CACAATTGAAGGAAAATCCAAGTACTGTGGGAGTGGAACGAGTG
ATTAGTACTAAAGGGTTAAATCTCCCAAGGAAATCATCCATTGT
GAAAAGACCAAGAAGTGCAGAGCTGTCAGAAGATGACCTGTTG
AGTCAGTATTCTCTTTCATTTACGAAGAAGACCAAGAAAAATAG
CTCTGAAGGCAATAAATCATTGAGCTTTTCTGAAGTGTTTGTGCC
TGACCTGGTAAATGGACCTACTAACAAAAAGAGTGTAAGCACTC
CACCTAGGACGAGAAATAAATTTGCAACATTTTTACAAAGGAAA
AATGAAGAAAGTGGTGCAGTTGTGGTTCCAGGGACCAGAAGCAG
GTTTTTTTGCAGTTCAGATTCTACTGACTGTGTATCAAACAAAGT
GAGCATCCAGCCTCTGGATGAAACTGCTGTCACAGATAAAGAGA
ACAATCTGCATGAATCAGAGTATGGAGACCAAGAAGGCAAGAG
ACTGGTTGACACAGATGTAGCACGTAATTCAAGTGATGACATTC
CGAATAATCATATTCCAGGTGATCATATTCCAGACAAGGCAACA
GTGTTTACAGATGAAGAGTCCTACTCTTTTAAGAGCAGCAAATTT
ACAAGGACCATTTCACCACCCACTTTGGGAACACTAAGAAGTTG
TTTTAGTTGGTCTGGAGGTCTTGGAGATTTTTCAAGAACGCCGAG
CCCCTCTCCAAGCACAGCATTGCAGCAGTTCCGAAGAAAGAGCG
ATTCCCCCACCTCTTTGCCTGAGAATAATATGTCTGATGTGTCGC
AGTTAAAGAGCGAGGAGTCCAGTGACGATGAGTCTCATCCCTTA
CGAGAAGGGGCATGTTCTTCACAGTCCCAGGAAAGTGGAGAATT
CTCACTGCAGAGTTCAAATGCATCAAAGCTTTCTCAGTGCTCTAG
TAAGGACTCTGATTCAGAGGAATCTGATTGCAATATTAAGTTACT
TGACAGTCAAAGTGACCAGACCTCCAAGCTATGTTTATCTCATTT
CTCAAAAAAAGACACACCTCTAAGGAACAAGGTTCCTGGGCTAT
ATAAGTCCAGTTCTGCAGACTCTCTTTCTACAACCAAGATCAAAC
AGCATCCAGAAGAGAAAGCATCATAATGCCGAGAACAAGCCGG
GGTTACAGATCAAACTCAATGGAGCTCTGGAAAAACTTTGGATT
AAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCA
TAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAA
GTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGC
GTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCT
GCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGT
GGCGCTTTCTCAATGCTCACGCTGTAGGTATCTCAGTTCGGTGTA
GGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCA
GCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAA
CCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTA
ACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTC
TTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATT
TGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAG
TTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGT
GGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGG
ATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCA
TTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTA
CGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATA
CCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAA
CCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTT
TATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAG
TAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTG
CTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCAT
TCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCC
ATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTT
GTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGC
AGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTT
TTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTG
TATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATA
ATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGA
AAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTG
AGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCA
GCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGA
AGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAAT
GTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTA
Table 14 shows the nucleotide sequence of the pcDNA Rad51 BLM Exo1 vector. Also indicated in Table 14 are the nucleotide sequences of a number of vector elements. As shown in
Embodiments of apparatuses, systems and methods for providing a simplified workflow for nucleic acid sequencing are described in this specification. The section headings used herein are for organizational purposes only and are not to be construed as limiting the described subject matter in any way.
While the foregoing embodiments have been described in some detail for purposes of clarity and understanding, it will be clear to one skilled in the art from a reading of this disclosure that various changes in form and detail can be made without departing from the true scope of the embodiments disclosed herein. For example, all the techniques, apparatuses, systems and methods described above can be used in various combinations.
The application is a divisional of U.S. application Ser. No. 14/800,384, filed Jul. 15, 2015, now pending, which claims the benefit of U.S. Provisional Application No. 62/024,650, filed Jul. 15, 2014, whose disclosure is incorporated by reference in its entirety. The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Jul. 13, 2015, is named LT00899_SL.txt and is 82,197 bytes in size.
Relation | Number | Date | Country
---|---|---|---
Provisional | 62024650 | Jul 2014 | US
Parent | 14800384 | Jul 2015 | US
Child | 15873477 | | US