This invention relates to compositions of matter, methods, and instruments for tracking nucleic acid-guided editing events in live cells, particularly mammalian cells.
In the following discussion certain articles and methods will be described for background and introductory purposes. Nothing contained herein is to be construed as an “admission” of prior art. Applicant expressly reserves the right to demonstrate, where appropriate, that the methods referenced herein do not constitute prior art under the applicable statutory provisions.
The ability to make precise, targeted changes to the genome of living cells has been a long-standing goal in biomedical research and development. Recently, various nucleases have been identified that allow manipulation of gene sequence, and hence gene function. The nucleases include nucleic acid-guided nucleases, which enable researchers to generate permanent edits in live cells. Of course, it is not only desirable to attain the highest editing rates possible in a cell population, but also to track the genomic edits in the cells, especially when multiple rounds of editing are performed and/or combinatorial libraries of edits are prepared. However, current tracking methods are inefficient and may lead to random genomic integration of tracking sequences, and/or require successive rounds of editing for targeted integration.
There is thus a need in the art of nucleic acid-guided nuclease editing for improved methods, compositions, modules, and instruments for efficient tracking of genomic edits, particularly in mammalian cells. The present disclosure addresses this need.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Other features, details, utilities, and advantages of the claimed subject matter will be apparent from the following written Detailed Description including those aspects illustrated in the accompanying drawings and defined in the appended claims.
In some aspects, the present disclosure provides a method for performing nucleic acid-guided nuclease/reverse transcriptase fusion editing in a genome of a live cell, comprising: (a) providing the live cell, wherein the live cell comprises a first target locus and a second target locus; (b) providing a nucleic acid-guided nuclease/reverse transcriptase fusion enzyme; (c) providing a CF editing cassette, the CF editing cassette comprising: (i) a nucleic acid sequence encoding a first CFgRNA having a region of complementarity to a sequence of the first target locus; and (ii) a nucleic acid sequence encoding a first repair template; (d) providing a CF barcoding cassette, the CF barcoding cassette comprising: (i) a nucleic acid sequence encoding a second CFgRNA having a region of complementarity to a sequence of the second target locus; and (ii) a nucleic acid sequence encoding a second repair template; (e) providing conditions to allow the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme, the first CFgRNA, and the first repair template to bind to the first target locus; (f) allowing the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme, the first CFgRNA, and the first repair template to edit the first target locus; (g) providing conditions to allow the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme, the second CFgRNA, and the second repair template to bind to the second target locus; and (h) allowing the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme, the second CFgRNA, and the second repair template to integrate the barcode into the second target locus.
In some aspects, the present disclosure provides an editing system comprising one or more vectors comprising: a nucleic acid sequence encoding a nucleic acid-guided nuclease/reverse transcriptase fusion enzyme; a CF editing cassette comprising: a nucleic acid sequence encoding a first CFgRNA having a region of complementarity to a sequence of a first target locus in a cell; and a nucleic acid sequence encoding a first repair template; a CF barcoding cassette comprising: a nucleic acid sequence encoding a second CFgRNA having a region of complementarity to a sequence of a second target locus in the cell; and a nucleic acid sequence encoding a second repair template.
In some aspects, the present disclosure provides a vector comprising: a nucleic acid sequence encoding a nucleic acid-guided nuclease/reverse transcriptase fusion enzyme; a CF editing cassette comprising: a nucleic acid sequence encoding a first CFgRNA having a region of complementarity to a sequence of a first target locus in a cell; and a nucleic acid sequence encoding a first repair template; and a CF barcoding cassette comprising: a nucleic acid sequence encoding a second CFgRNA having a region of complementarity to a sequence of a second target locus in the cell; and a nucleic acid sequence encoding a second repair template.
In some aspects, the present disclosure provides a method for performing nucleic acid-guided nuclease/reverse transcriptase fusion editing in a genome of a live cell, comprising: (a) providing a live cell suitable for the editing; (b) introducing a nucleic acid-guided nuclease/reverse transcriptase fusion enzyme; (c) providing conditions to allow the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme to bind to a first target locus; (d) allowing the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme to edit the first target locus; (e) providing conditions to allow the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme to bind to a second target locus; and (f) allowing the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme to integrate a barcode into the second target locus.
These aspects and other features and advantages of the invention are described below in more detail.
The foregoing and other features and advantages of the present invention will be more fully understood from the following detailed description of illustrative embodiments taken in conjunction with the accompanying drawings in which:
Green fluorescent protein (GFP)-to-blue fluorescent protein (BFP) edit rates (y-axis) improved from 17% BFP+ in unsorted cells to 50% BFP+ in MACS-enriched cells.
16D illustrate examples of hypoxanthine phosphoribosyltransferase (HPRT) loss-of-function edits allowing for negative selection by resistance to 6-thioguanine (6-TG).
It should be understood that the drawings are not necessarily to scale, and that like reference numbers refer to like features.
All the functionalities described in connection with one aspect are intended to be applicable to the additional aspects described herein except where expressly stated or where the feature or function is incompatible with the additional aspects. For example, where a given feature or function is expressly described in connection with one aspect but not expressly mentioned in connection with an alternative aspect, it should be understood that the feature or function may be deployed, utilized, or implemented in connection with the alternative aspect unless the feature or function is incompatible with the alternative aspect.
The practice of the techniques described herein may employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry and sequencing technology, which are within the skill of those who practice in the art. Such conventional techniques include polymer array synthesis, hybridization and ligation of polynucleotides, and detection of hybridization using a label. Specific illustrations of suitable techniques can be had by reference to the examples herein. However, other equivalent conventional procedures can, of course, also be used. Such conventional techniques and descriptions can be found in standard laboratory manuals such as Green, et al., Eds. (1999), Genome Analysis: A Laboratory Manual Series (Vols. I-IV); Weiner, Gabriel, Stephens, Eds. (2007), Genetic Variation: A Laboratory Manual; Dieffenbach, Dveksler, Eds. (2003), PCR Primer: A Laboratory Manual; Mount (2004), Bioinformatics: Sequence and Genome Analysis; Sambrook and Russell (2006), Condensed Protocols from Molecular Cloning: A Laboratory Manual; and Sambrook and Russell (2002), Molecular Cloning: A Laboratory Manual (all from Cold Spring Harbor Laboratory Press); Stryer, L. (1995) Biochemistry (4th Ed.) W.H. Freeman, New York N.Y.; Gait, “Oligonucleotide Synthesis: A Practical Approach” 1984, IRL Press, London; Nelson and Cox (2000), Lehninger, Principles of Biochemistry 3rd Ed., W. H. Freeman Pub., New York, N.Y.; Berg et al. (2002) Biochemistry, 5th Ed., W.H. Freeman Pub., New York, N.Y.; all of which are herein incorporated in their entirety by reference for all purposes.
CRISPR-specific techniques can be found in, e.g., Genome Editing and Engineering from TALENs and CRISPRs to Molecular Surgery, Appasani and Church (2018); and CRISPR: Methods and Protocols, Lindgren and Charpentier (2015); both of which are herein incorporated in their entirety by reference for all purposes.
Note that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “an oligonucleotide” refers to one or more oligonucleotides, and reference to “an automated system” includes reference to equivalent steps and methods for use with the system known to those skilled in the art, and so forth. Additionally, it is to be understood that terms such as “left,” “right,” “top,” “bottom,” “front,” “rear,” “side,” “height,” “length,” “width,” “upper,” “lower,” “interior,” “exterior,” “inner,” “outer” that may be used herein merely describe points of reference and do not necessarily limit aspects of the present disclosure to any particular orientation or configuration. Furthermore, terms such as “first,” “second,” “third,” etc., merely identify one of a number of portions, components, steps, operations, functions, and/or points of reference as disclosed herein, and likewise do not necessarily limit aspects of the present disclosure to any particular configuration or orientation.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. All publications mentioned herein are incorporated by reference herein in their entireties.
When a range of numbers is provided herein, the range is understood to be inclusive of the edges of the range as well as any number between the defined edges of the range. For example, “between 1 and 10” includes any number between 1 and 10, as well as the number 1 and the number 10.
The term “about” means plus or minus 10% of the numerical value of the number with which it is being used. For example, “about 100” refers to numbers between (and including) 90 and 110.
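The two numeric conventions above (inclusive ranges, and “about” as plus or minus 10%) can be illustrated with a minimal sketch. The function names `in_range`, `about_bounds`, and `is_about` are hypothetical and chosen only for this illustration; they are not part of the disclosure.

```python
def in_range(x, low, high):
    """Inclusive range test: both edges count as inside the range."""
    return low <= x <= high


def about_bounds(value, tolerance_percent=10):
    """Return the (low, high) bounds implied by "about value".

    The default of 10 percent follows the definition of "about" above.
    """
    delta = abs(value) * tolerance_percent / 100
    return (value - delta, value + delta)


def is_about(x, value, tolerance_percent=10):
    """True if x falls within "about value", edges included."""
    low, high = about_bounds(value, tolerance_percent)
    return in_range(x, low, high)


if __name__ == "__main__":
    print(in_range(10, 1, 10))   # the edge value 10 is inside "between 1 and 10"
    print(about_bounds(100))     # bounds for "about 100"
    print(is_about(89, 100))     # 89 lies outside "about 100"
```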
When a grouping of alternatives is presented, any and all combinations of the members that make up that grouping of alternatives are specifically envisioned. For example, if an item is selected from a group consisting of A, B, C, and D, the inventors specifically envision each alternative individually (e.g., A alone, B alone, etc.), as well as combinations such as A, B, and D; A and C; B and C; etc.
The term “and/or” when used in a list of two or more items means any one of the listed items by itself or in combination with any one or more of the other listed items. For example, the expression “A and/or B” is intended to mean either or both of A and B—i.e., A alone, B alone, or A and B in combination. The expression “A, B and/or C” is intended to mean A alone, B alone, C alone, A and B in combination, A and C in combination, B and C in combination, or A, B, and C in combination.
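As an illustration of the “and/or” convention, the listed items can be expanded into every non-empty combination. This is a minimal sketch; the helper name `and_or` is hypothetical.

```python
from itertools import combinations


def and_or(items):
    """Enumerate every non-empty combination of the listed items.

    Each combination is returned as a tuple, ordered from single items
    up to the full set.
    """
    items = list(items)
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


if __name__ == "__main__":
    for combo in and_or(["A", "B", "C"]):
        print(" and ".join(combo))
```

For the expression “A, B and/or C” this yields the seven alternatives listed above: three single items, three pairs, and the full combination.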
In the following description, numerous specific details are set forth to provide a more thorough understanding of the present invention. However, it will be apparent to one of ordinary skill in the art that the present invention may be practiced without one or more of these specific details. In other instances, features and procedures well known to those skilled in the art have not been described in order to avoid obscuring the invention.
The term “complementary” as used herein refers to Watson-Crick base pairing between nucleotides and specifically refers to nucleotides hydrogen bonded to one another, with thymine or uracil residues linked to adenine residues by two hydrogen bonds and cytosine and guanine residues linked by three hydrogen bonds. The terms “percent complementarity” or “percent complementary” as used herein in reference to two nucleotide sequences are similar to the concept of percent identity but refer to the percentage of nucleotides of a query sequence that optimally base-pair or hybridize to nucleotides of a subject sequence when the query and subject sequences are linearly arranged and optimally base paired without secondary folding structures, such as loops, stems or hairpins. Such a percent complementarity can be between two DNA strands, two RNA strands, or a DNA strand and an RNA strand. The “percent complementarity” can be calculated by (i) optimally base-pairing or hybridizing the two nucleotide sequences in a linear and fully extended arrangement (e.g., without folding or secondary structures) over a window of comparison, (ii) determining the number of positions that base-pair between the two sequences over the window of comparison to yield the number of complementary positions, (iii) dividing the number of complementary positions by the total number of positions in the window of comparison, and (iv) multiplying this quotient by 100% to yield the percent complementarity of the two sequences. Optimal base pairing of two sequences can be determined based on the known pairings of nucleotide bases, such as G-C, A-T, and A-U, through hydrogen bonding. If the “percent complementarity” is being calculated in relation to a reference sequence without specifying a particular comparison window, then the percent complementarity is determined by dividing the number of complementary positions between the two linear sequences by the total length of the reference sequence.
Thus, for purposes of the present application, when two sequences (query and subject) are optimally base-paired (with allowance for mismatches or non-base-paired nucleotides), the “percent complementarity” for the query sequence is equal to the number of base-paired positions between the two sequences divided by the total number of positions in the query sequence over its length, which is then multiplied by 100%. In general, a nucleic acid includes a nucleotide sequence described as having a “percent complementarity” or being a “percent complementary” to a specified second nucleotide sequence. For example, a nucleotide sequence may have 70%, 80%, 90%, 95%, 99%, or 100% complementarity to a specified second nucleotide sequence, indicating that, for example, 7 of 10, 8 of 10, 9 of 10, 19 of 20, 99 of 100, or 10 of 10 nucleotides, respectively, of a sequence are complementary to the specified second nucleotide sequence. For example, the nucleotide sequence 3′-TCGA-5′ is 100% complementary to the nucleotide sequence 5′-AGCT-3′; and the nucleotide sequence 3′-TCGA-5′ is 100% complementary to a region of the nucleotide sequence 5′-TAGCTG-3′.
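The percent complementarity calculation can be sketched as follows. This is a simplified illustration, not a definitive implementation: it assumes both sequences are written 5′ to 3′, considers only ungapped antiparallel pairing (no bulges or non-base-paired insertions), slides the query along the subject to find the best window, and divides by the query length. The names `PAIRS` and `percent_complementarity` are hypothetical.

```python
# Watson-Crick pairs named in the definition above (G-C, A-T, A-U).
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(query_5to3, subject_5to3):
    """Best ungapped antiparallel pairing of the query against the subject.

    Returns the number of base-paired positions divided by the query
    length, multiplied by 100.
    """
    query = query_5to3.upper()
    subject = subject_5to3.upper()
    if not query or len(query) > len(subject):
        raise ValueError("query must be non-empty and no longer than subject")
    # Read the query 3'->5' so it lines up antiparallel to the subject.
    query_3to5 = query[::-1]
    best = 0
    for start in range(len(subject) - len(query) + 1):
        window = subject[start:start + len(query)]
        paired = sum((q, s) in PAIRS for q, s in zip(query_3to5, window))
        best = max(best, paired)
    return 100.0 * best / len(query)


if __name__ == "__main__":
    # 3'-TCGA-5' is the same molecule as 5'-AGCT-3'.
    print(percent_complementarity("AGCT", "AGCT"))
    print(percent_complementarity("AGCT", "TAGCTG"))
```

Both calls correspond to the 3′-TCGA-5′ examples in the text: the query pairs fully with 5′-AGCT-3′ and with a region of 5′-TAGCTG-3′.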
The term DNA “control sequences” refers collectively to promoter sequences, polyadenylation signals, transcription termination sequences, upstream regulatory domains, origins of replication, internal ribosome entry sites, nuclear localization sequences, enhancers, and the like, which collectively provide for the replication, transcription and translation of a coding sequence in a recipient cell. Not all of these types of control sequences need to be present so long as a selected coding sequence is capable of being replicated, transcribed and—for some components—translated in an appropriate host cell.
A “regulatory sequence” or “regulatory region” refers to the region of a gene where RNA polymerase and other accessory transcription modulator proteins (e.g., transcription factors) bind and interact to control transcription of the gene. Non-limiting examples of regulatory sequences or regions include promoters, enhancers, and terminators. Regulatory sequences or regions are capable of increasing or decreasing gene expression. As a result, these elements can control net protein expression from the gene.
The terms “CREATE fusion gRNA” or “CFgRNA” refer to a gRNA engineered to function with a nucleic acid-guided nickase/reverse transcriptase fusion enzyme (a “nickase-RT fusion”) where the CFgRNA is designed to bind to and facilitate editing or barcoding of one or both DNA strands in a target locus of a cell genome. In certain aspects, “CREATE fusion gRNA” or “CFgRNA” refer to one of two gRNAs engineered to function with a nucleic acid-guided nickase/reverse transcriptase fusion enzyme (a “nickase-RT fusion”) where the two CFgRNAs are designed to bind to and edit/barcode opposite DNA strands in a target locus. The two CFgRNAs specific to a target locus have regions of complementarity to one another at least at the site of the edit and preferably at regions 5′ and 3′ to the site of the edit. The term “complementary CFgRNAs” refers to two CFgRNAs engineered to bind to opposite DNA strands in a target locus which often create the complementary edit at a site in the target locus.
The terms “CREATE fusion barcoding cassette” or “CF barcoding cassette” in the context of the current methods and compositions refers to a nucleic acid molecule comprising a coding sequence for transcription of a CREATE fusion gRNA or “CFgRNA” to effect barcoding in a nucleic acid-guided nickase/reverse transcriptase fusion system where the CFgRNA is designed to bind to and facilitate incorporation of a barcode sequence into one or both DNA strands in a target locus. In certain aspects, “CF barcoding cassette” refers to a nucleic acid molecule comprising a coding sequence for transcription of two gRNAs to effect barcoding in a nucleic acid-guided nickase/reverse transcriptase fusion system where the two gRNAs are designed to bind to and integrate barcode sequences into opposite DNA strands in a target locus.
The terms “CREATE fusion editing cassette” or “CF editing cassette” in the context of the current methods and compositions refers to a nucleic acid molecule comprising a coding sequence for transcription of a CREATE fusion gRNA or “CFgRNA” to effect editing in a nucleic acid-guided nickase/reverse transcriptase fusion system where the CFgRNA is designed to bind to and facilitate editing of one or both DNA strands in a target locus. In certain aspects, “CF editing cassette” refers to a nucleic acid molecule comprising a coding sequence for transcription of two gRNAs to effect editing in a nucleic acid-guided nickase/reverse transcriptase fusion system where the two gRNAs are designed to bind to and edit opposite DNA strands in a target locus.
The terms “CREATE fusion editing system” or “CF editing system” refer to the combination of a nucleic acid-guided nickase enzyme/reverse transcriptase fusion protein (“nickase-RT fusion”) and a CREATE fusion editing cassette (“CF editing cassette”) to effect editing in live cells. In certain aspects, a CF editing system further includes a CREATE fusion barcoding cassette (“CF barcoding cassette”).
The terms “guide nucleic acid” or “guide RNA” or “gRNA” refer to a polynucleotide comprising 1) a guide sequence capable of hybridizing to a target genomic locus, and 2) a scaffold sequence capable of interacting or complexing with a nucleic acid-guided nuclease.
“Homology” or “identity” or “similarity” refers to sequence similarity between two peptides or, more often in the context of the present disclosure, between two nucleic acid molecules. The term “homologous region” or “homology arm” refers to a region on a donor DNA with a certain degree of homology with a target genomic DNA sequence. Homology can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base or amino acid, then the molecules are homologous at that position. A degree of homology between sequences is a function of the number of matching or homologous positions shared by the sequences.
The terms “percent identity” or “percent identical” as used herein in reference to two or more nucleotide or amino acid sequences is calculated by (i) comparing two optimally aligned sequences (nucleotide or amino acid) over a window of comparison (the “alignable” region or regions), (ii) determining the number of positions at which the identical nucleic acid base (for nucleotide sequences) or amino acid residue (for proteins and polypeptides) occurs in both sequences to yield the number of matched positions, (iii) dividing the number of matched positions by the total number of positions in the window of comparison, and then (iv) multiplying this quotient by 100% to yield the percent identity. If the “percent identity” is being calculated in relation to a reference sequence without a particular comparison window being specified, then the percent identity is determined by dividing the number of matched positions over the region of alignment by the total length of the reference sequence. Accordingly, for purposes of the present application, when two sequences (query and subject) are optimally aligned (with allowance for gaps in their alignment), the “percent identity” for the query sequence is equal to the number of identical positions between the two sequences divided by the total number of positions in the query sequence over its length (or a comparison window), which is then multiplied by 100%. When percentage of sequence identity is used in reference to amino acids it is recognized that residue positions which are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. When sequences differ in conservative substitutions, the percent sequence identity can be adjusted upwards to correct for the conservative nature of the substitution. 
Sequences that differ by such conservative substitutions are said to have “sequence similarity” or “similarity.”
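The percent identity calculation can be sketched for two sequences that have already been optimally aligned. This is a minimal sketch under stated assumptions: the alignment itself (e.g., by an external alignment program) is assumed to have been done beforehand, gaps are written as `-`, and no upward adjustment for conservative substitutions is made. The function name `percent_identity` and its `reference_length` parameter are hypothetical.

```python
def percent_identity(aligned_query, aligned_subject, reference_length=None):
    """Percent identity over a pre-computed pairwise alignment.

    aligned_query / aligned_subject: equal-length strings, '-' marks a gap.
    reference_length: if given, matched positions are divided by this
        length instead of by the length of the comparison window.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    # A position matches only if both residues are identical and not gaps.
    matches = sum(
        q == s and q != "-"
        for q, s in zip(aligned_query.upper(), aligned_subject.upper())
    )
    denominator = (
        reference_length if reference_length is not None else len(aligned_query)
    )
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * matches / denominator


if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGA"))     # 3 of 4 window positions match
    print(percent_identity("AC-GT", "ACTGT"))   # one gap in a 5-position window
    print(percent_identity("ACGT", "ACGT", reference_length=8))
```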
For optimal alignment of sequences to calculate their percent identity, various pair-wise or multiple sequence alignment algorithms and programs are known in the art, such as ClustalW or Basic Local Alignment Search Tool® (BLAST™), etc., that can be used to compare the sequence identity or similarity between two or more nucleotide or amino acid sequences. Although other alignment and comparison methods are known in the art, the alignment and percent identity between two sequences (including the percent identity ranges described above) can be as determined by the ClustalW algorithm, see, e.g., Chenna et al., “Multiple sequence alignment with the Clustal series of programs,” Nucleic Acids Research 31:3497-3500 (2003); Thompson et al., “Clustal W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice,” Nucleic Acids Research 22:4673-4680 (1994); Larkin M A et al., “Clustal W and Clustal X version 2.0,” Bioinformatics 23:2947-48 (2007); and Altschul et al. “Basic local alignment search tool.” J. Mol. Biol. 215:403-410 (1990), the entire contents and disclosures of which are incorporated herein by reference.
The term “nucleic acid” or “polynucleotide” refers to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form. Unless otherwise indicated, the terms encompass nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, in addition to the sequence specifically stated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles, orthologues, SNPs, and complementary sequences. The term nucleic acid is used interchangeably with DNA, RNA, cDNA, gene, and mRNA encoded by a gene.
As used herein, “nucleic acid-guided nickase/reverse transcriptase fusion” or “nickase-RT fusion” refers to a nucleic acid-guided nickase—or nucleic acid-guided nuclease or CRISPR nuclease that has been engineered to act as a nickase rather than a nuclease that initiates double-stranded DNA breaks—where the nucleic acid-guided nickase is fused to a reverse transcriptase, which is an enzyme used to generate cDNA from an RNA template. In certain aspects, “nucleic acid-guided nickase/reverse transcriptase fusion” or “nickase-RT fusion” refers to two or more nucleic acid-guided nickases—or nucleic acid-guided nucleases or CRISPR nucleases that have been engineered to act as nickases rather than nucleases that initiate double-stranded DNA breaks—where the nucleic acid-guided nickases are fused to a reverse transcriptase. For information regarding nickase-RT fusions see, e.g., U.S. Pat. No. 10,689,669 and U.S. Ser. No. 16/740,421.
“Nucleic acid-guided editing components” refers to one or both of a nickase-RT fusion and CREATE fusion guide nucleic acids (CFgRNAs).
“Operably linked” refers to an arrangement of elements where the components so described are configured so as to perform their usual function. Thus, control sequences operably linked to a coding sequence are capable of effecting the transcription, and in some cases, the translation, of a coding sequence. The control sequences need not be contiguous with the coding sequence so long as they function to direct the expression of the coding sequence. Thus, for example, intervening untranslated yet transcribed sequences can be present between a promoter sequence and the coding sequence and the promoter sequence can still be considered “operably linked” to the coding sequence. In fact, such sequences need not reside on the same contiguous DNA molecule (e.g. chromosome) and may still have interactions resulting in altered regulation.
A “PAM mutation” refers to one or more edits to a target sequence that removes, mutates, or otherwise renders inactive a protospacer adjacent motif (PAM) or spacer region in the target sequence.
A “promoter” or “promoter sequence” is a DNA regulatory region capable of binding RNA polymerase and initiating transcription of a polynucleotide or polypeptide coding sequence such as messenger RNA, ribosomal RNA, small nuclear or nucleolar RNA, guide RNA, or any kind of RNA. In some aspects, a promoter is an endogenous promoter, synthetically produced, varied, or derived from a known or naturally occurring promoter sequence or other promoter sequence. In some aspects, a promoter is a constitutive promoter. In some aspects, a promoter is an inducible promoter. In some aspects, a promoter is a heterologous promoter.
A “terminator” or “terminator sequence” or “transcription termination sequence” refers to a DNA regulatory region of a gene that signals termination of transcription of the gene to an RNA polymerase. Without being limiting, terminators cause transcription of an operably linked nucleic acid molecule to stop.
A “coding sequence” or “coding region” refers to the region of a gene's DNA or RNA which codes for a protein. In DNA, the coding region of a gene is flanked by the promoter sequence on the 5′ end of the template strand and the termination sequence on the 3′ end. After transcription, the coding region in an mRNA is flanked by the 5′ untranslated region (5′-UTR) and the 3′ untranslated region (3′-UTR), along with the 5′ cap and poly-A tail.
A “non-coding sequence” or “non-coding region” refers to the region of a gene's DNA which does not code for a protein. However, some non-coding DNA is transcribed into functional non-coding RNA molecules (e.g., transfer RNA, microRNA, siRNA, piRNA, ribosomal RNA, and regulatory RNAs). Other functional non-coding DNA include, for example, regulatory sequences of a gene that control its expression.
As used herein “gene product” refers to a biochemical material, either RNA or protein, resulting from expression of a gene. In some aspects, a gene product is an RNA molecule, e.g., transfer RNA, microRNA, siRNA, piRNA, ribosomal RNA, or regulatory RNA. In some aspects, the gene product is a protein. In some aspects, the gene product is an enzyme. In some aspects, the gene product is a membrane protein. In some aspects, the gene product is a protein involved in the expression of a gene. In some aspects, the gene product is a transcription factor. In some aspects, the gene product is a coactivator protein.
In some aspects, the gene product is a corepressor protein. In some aspects, the gene product is a chromatin-binding protein.
As used herein, the terms “protein,” “peptide,” and “polypeptide” are used interchangeably and refer to a polymer of amino acid residues. In some aspects, proteins are made up entirely of amino acids and are encoded by sequences transcribed by any class of RNA polymerase (I, II, or III).
As used herein, the term “repair template” in the context of a CREATE fusion editing system employing a nickase-RT fusion enzyme refers to a nucleic acid (e.g., a ribonucleic acid) that is designed to serve as a template (including a desired edit or barcode) to be incorporated into target DNA via reverse transcription (e.g., by reverse transcriptase).
As used herein the term “selectable marker” refers to a gene introduced into a cell, which confers a trait suitable for artificial selection. General use selectable markers are well-known to those of ordinary skill in the art. Drug selectable markers such as ampicillin/carbenicillin, kanamycin, chloramphenicol, nourseothricin N-acetyl transferase, erythromycin, tetracycline, gentamicin, bleomycin, streptomycin, puromycin, hygromycin, blasticidin, and G418 may be employed. In other aspects, selectable markers include, but are not limited to human nerve growth factor receptor (detected with a MAb, such as described in U.S. Pat. No. 6,365,373); truncated human growth factor receptor (detected with MAb); mutant human dihydrofolate reductase (DHFR; fluorescent MTX substrate available); secreted alkaline phosphatase (SEAP; fluorescent substrate available); human thymidylate synthase (TS; confers resistance to anti-cancer agent fluorodeoxyuridine); human glutathione S-transferase alpha (GSTA1; conjugates glutathione to the stem cell selective alkylator busulfan; chemoprotective selectable marker in CD34+ cells); CD24 cell surface antigen in hematopoietic stem cells; human CAD gene to confer resistance to N-phosphonacetyl-L-aspartate (PALA); human multi-drug resistance-1 (MDR-1; P-glycoprotein surface protein selectable by increased drug resistance or enriched by FACS); human CD25 (IL-2a; detectable by Mab-FITC); Methylguanine-DNA methyltransferase (MGMT; selectable by carmustine); rhamnose; and Cytidine deaminase (CD; selectable by Ara-C). In some aspects, a selectable marker comprises an antibiotic resistance gene. In some aspects, a selectable marker comprises a puromycin resistance gene. “Selective medium” as used herein refers to cell growth medium to which has been added a chemical compound or biological moiety that selects for or against selectable markers. In some aspects, a selectable marker provides a phenotypic handle for live-cell selection. 
In some aspects, the selection is a positive selection for a knock-in edit. In some aspects, the selection is a negative selection for a knockout edit. In some aspects, an HA epitope tag can be used as a phenotypic selection marker. In some aspects, 6-thioguanine (6-TG) can be used to select for HPRT knockout edits.
A “locus” refers to a fixed position in a genome. In some aspects, a locus comprises a coding region. In some aspects, a locus comprises a non-coding region. In some aspects, a locus comprises a gene. In an aspect, a locus comprises at least 1 nucleotide. In an aspect, a locus comprises at least 10 nucleotides. In an aspect, a locus comprises at least 25 nucleotides. In an aspect, a locus comprises at least 50 nucleotides. In an aspect, a locus comprises at least 100 nucleotides. In an aspect, a locus comprises at least 250 nucleotides. In an aspect, a locus comprises at least 500 nucleotides. In an aspect, a locus comprises at least 1000 nucleotides. In an aspect, a locus comprises at least 2500 nucleotides. In an aspect, a locus comprises at least 5000 nucleotides.
The terms “target sequence”, “target genomic DNA locus”, “target locus”, or “target genomic locus” refer to any locus in vitro or in vivo, or in a nucleic acid (e.g., genome or episome) of a cell or population of cells, in which a change of at least one nucleotide is desired using a nucleic acid-guided nuclease editing system. The target sequence can be a genomic locus or extrachromosomal locus. In some aspects, a target locus refers to a position in a genome targeted to be edited by the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme and the CF editing cassette. In some aspects, a target locus comprises a gene, including its regulatory regions and coding regions. In some aspects, a target locus comprises a regulatory region of a gene, e.g., a promoter region or a terminator region. In some aspects, the target locus is within a nuclear genome. In some aspects, the target locus is within a mitochondrial genome. In some aspects, the target locus is within a vector.
In some aspects, an “integration locus” refers to a position in a genome targeted for the integration of a CF editing cassette. In some aspects, an integration locus comprises a coding region. In some aspects, an integration locus comprises a non-coding region. In some aspects, an integration locus comprises a “safe harbor locus.” A “safe harbor locus” as used herein refers to an intergenic region that has a reduced potential for the CF editing cassette integration adversely affecting genes neighboring (e.g., within 10 kb) the integrated CF editing cassette.
The term “gene” refers to a nucleic acid region which includes a coding region operably linked to a suitable regulatory region capable of regulating the expression of a gene product (e.g., a protein or functional non-coding RNA) in some manner. Genes include untranslated regulatory regions (e.g., promoters, enhancers, repressors, etc.) in the DNA before (upstream) and after (downstream) the coding region (open reading frame, ORF), and, where applicable, intervening sequences (e.g., introns) between individual coding regions (e.g., exons).
The term “variant” refers to a polypeptide or polynucleotide that differs from a reference polypeptide or polynucleotide. A typical variant of a polypeptide differs in amino acid sequence from another reference polypeptide. Generally, differences may be limited so that the sequences of the reference polypeptide and the variant are closely similar overall (e.g., at least 90% identical) and, in many regions, identical. A variant and reference polypeptide may differ in amino acid sequence by one or more modifications (e.g., substitutions, additions, and/or deletions). A variant of a polypeptide may be a conservatively modified variant (e.g., at least 95% identical to the reference polypeptide). A substituted or inserted amino acid residue may or may not be one encoded by the genetic code (e.g., a non-natural amino acid). A variant of a polypeptide may be naturally occurring, such as an allelic variant, or it may be a variant that is not known to occur naturally.
A “vector” is any of a variety of nucleic acids that comprise a desired sequence or sequences to be delivered to and/or expressed in a cell. Vectors are typically composed of DNA, although RNA vectors are also available. Vectors include, but are not limited to, plasmids, fosmids, phagemids, virus genomes, BACs, YACs, PACs, synthetic chromosomes, and the like. In the present disclosure, a single vector may include a coding sequence for a nickase-RT fusion enzyme and a CF editing cassette and/or CFgRNA sequence to be transcribed. In other aspects, however, two vectors—e.g., an engine vector comprising the coding sequence for the nickase-RT fusion enzyme, and an editing vector, comprising the CFgRNA sequence to be transcribed—may be used.
As used herein, a “mutation” refers to an inheritable genetic modification introduced into a gene to alter the expression or activity of a product encoded by the gene. In some aspects, “mutation,” “modification,” and “edit” may be used interchangeably in the present disclosure. In some aspects, a modification can be in any sequence region of a gene, for example, in a promoter, 5′ UTR, exon, 3′ UTR, or terminator region. In some aspects, a modification is in the regulatory region of a gene. In some aspects, a modification is in the coding region of a gene. In some aspects, a modification is in an exon. In some aspects, a modification is in an intron. In some aspects, a modification spans an intron/exon junction. In some aspects, a modification reduces, inhibits, or eliminates the expression or activity of a gene product as compared to an unmodified control. In some aspects, a modification increases, elevates, strengthens, or augments the expression or activity of a gene product as compared to an unmodified control.
In some aspects, a mutation, or modification is a “non-natural” or “non-naturally occurring” mutation or modification. As used herein, a “non-natural” or “non-naturally occurring” mutation or modification refers to a non-spontaneous mutation or modification generated via human intervention, and does not correspond to a spontaneous mutation or modification generated without human intervention. Non-limiting examples of human intervention include mutagenesis (e.g., chemical mutagenesis, ionizing radiation mutagenesis) and targeted genetic modifications (e.g., nucleic-acid guided nuclease-based methods, CREATE fusion-based methods, CRISPR-based methods, TALEN-based methods, zinc finger-based methods). Non-natural mutations or modifications and non-naturally occurring mutations or modifications do not include spontaneous mutations that arise naturally (e.g., via aberrant DNA replication).
Several types of mutations or modifications are known in the art. In some aspects, a mutation or modification comprises an insertion. An “insertion” refers to the addition of one or more nucleotides or amino acids to a given polynucleotide or amino acid sequence, respectively, as compared to an endogenous reference polynucleotide or amino acid sequence. In an aspect, an insertion comprises an insertion of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 25, at least 50, at least 75, at least 100, at least 200, at least 300, at least 400, at least 500, at least 750, at least 1000, or at least 2500 nucleotides.
In some aspects, a mutation or modification comprises a deletion. A “deletion” refers to the removal of one or more nucleotides or amino acids from a given polynucleotide or amino acid sequence, respectively, as compared to an endogenous reference polynucleotide or amino acid sequence. In an aspect, a deletion comprises a deletion of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 25, at least 50, at least 75, at least 100, at least 200, at least 300, at least 400, at least 500, at least 750, at least 1000, or at least 2500 nucleotides.
In some aspects, a mutation or modification comprises a substitution or a swap. A “substitution” or “swap” refers to the replacement of one or more nucleotides or amino acids in a given polynucleotide or amino acid sequence, respectively, as compared to an endogenous reference polynucleotide or amino acid sequence. In some aspects, a “substitution allele” refers to a nucleic acid sequence at a particular locus comprising a substitution. In an aspect, a substitution comprises the substitution of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, or at least 50 nucleotides. When more than 1 nucleotide is substituted, the substitutions can be contiguous or non-contiguous.
In some aspects, a mutation or modification comprises an inversion. An “inversion” refers to when a segment of a polynucleotide or amino acid sequence is reversed end-to-end. In an aspect, an inversion comprises an inversion of at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 25, at least 50, at least 75, at least 100, at least 200, at least 300, at least 400, at least 500, at least 750, at least 1000, or at least 2500 nucleotides.
In some aspects, a mutation or modification provided herein comprises a mutation selected from the group consisting of an insertion, a deletion, a substitution, and an inversion. In some aspects, a mutation or modification provided herein comprises an insertion. In some aspects, a mutation or modification provided herein comprises a deletion. In some aspects, a mutation or modification provided herein comprises a substitution. In some aspects, a mutation or modification provided herein comprises an inversion.
In some aspects, a mutation or modification comprises one or more mutation types selected from the group consisting of a nonsense mutation, a missense mutation, a frameshift mutation, a splice-site mutation, and any combinations thereof. As used herein, a “nonsense mutation” refers to a mutation to a nucleic acid sequence that introduces a premature stop codon to an amino acid sequence encoded by the nucleic acid sequence. As used herein, a “missense mutation” refers to a mutation to a nucleic acid sequence that causes a substitution within the amino acid sequence encoded by the nucleic acid sequence. As used herein, a “frameshift mutation” refers to an insertion or deletion to a nucleic acid sequence that shifts the frame for translating the nucleic acid sequence to an amino acid sequence. A “splice-site mutation” refers to a mutation in a nucleic acid sequence that causes an intron to be retained for protein translation, or, alternatively, for an exon to be excluded from protein translation. Splice-site mutations can cause nonsense, missense, or frameshift mutations.
Mutations or modifications in coding regions of genes (e.g., exonic mutations) can result in a truncated protein or polypeptide when a mutated messenger RNA (mRNA) is translated into a protein or polypeptide. In some aspects, this disclosure provides a mutation that results in the truncation of a protein or polypeptide. As used herein, a “truncated” protein or polypeptide comprises at least one fewer amino acid as compared to an endogenous control protein or polypeptide. For example, if endogenous Protein A comprises 100 amino acids, a truncated version of Protein A can comprise between 1 and 99 amino acids.
Without being limited by any scientific theory, one way to cause a protein or polypeptide truncation is by the introduction of a premature stop codon in an mRNA transcript of an endogenous gene. In some aspects, this disclosure provides a mutation that results in a premature stop codon in an mRNA transcript of an endogenous gene. As used herein, a “stop codon” refers to a nucleotide triplet within an mRNA transcript that signals a termination of protein translation. A “premature stop codon” refers to a stop codon positioned earlier (e.g., on 5′-side) than the normal stop codon position in an endogenous mRNA transcript. Without being limiting, several stop codons are known in the art, including “UAG,” “UAA,” “UGA,” “TAG,” “TAA,” and “TGA.” In some aspects, multiple (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, or more than 10) premature stop codons are introduced.
In some aspects, a mutation or modification provided herein comprises a null mutation. As used herein, a “null mutation” or “knockout edit” refers to a mutation that confers a decreased function or complete loss-of-function for a protein encoded by a gene comprising the mutation, or, alternatively, a mutation that confers a decreased function or complete loss-of-function for a small RNA encoded by a genomic locus. A null mutation can cause lack or decrease of mRNA transcript production, small RNA transcript production, protein function, or a combination thereof. As used herein, a “null allele” refers to a nucleic acid sequence at a particular locus where a null mutation has conferred a decreased function or complete loss-of-function to the allele.
In some aspects, a “synonymous edit” or “synonymous substitution” is the substitution of one base for another in an exon of a gene coding for a protein, such that the produced amino acid sequence is not modified. This is possible because the genetic code is “degenerate”, meaning that some amino acids are coded for by more than one three-base-pair codon; since some of the codons for a given amino acid differ by just one base pair from others coding for the same amino acid, a mutation that replaces the “normal” base by one of the alternatives will result in incorporation of the same amino acid into the growing polypeptide chain when the gene is translated.
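By way of illustration only, the nonsense, missense, frameshift, and synonymous categories defined above can be distinguished computationally by comparing the translation of a reference coding sequence with that of an edited coding sequence. The following is a minimal sketch, assuming the standard genetic code, DNA-alphabet coding sequences that begin in frame, and hypothetical function names (`translate`, `classify`) that are not part of this disclosure. Splice-site mutations are outside its scope, and in-frame insertions or deletions are simply reported as such.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate an in-frame DNA coding sequence up to the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def classify(ref_cds, alt_cds):
    """Label an edited coding sequence relative to a reference (illustrative)."""
    if len(alt_cds) != len(ref_cds):
        # A length change that is not a multiple of 3 shifts the reading frame.
        if (len(alt_cds) - len(ref_cds)) % 3 != 0:
            return "frameshift"
        return "in-frame indel"
    ref_protein, alt_protein = translate(ref_cds), translate(alt_cds)
    if alt_protein == ref_protein:
        return "synonymous"
    # A shorter protein that is a prefix of the reference implies a premature stop.
    if len(alt_protein) < len(ref_protein) and ref_protein.startswith(alt_protein):
        return "nonsense"
    return "missense"


# Hypothetical 4-codon reference: Met-Lys-Trp-stop.
REFERENCE = "ATGAAATGGTAA"
print(classify(REFERENCE, "ATGAAGTGGTAA"))  # AAA -> AAG, both encode Lys
print(classify(REFERENCE, "ATGAAATGATAA"))  # TGG -> TGA introduces a stop
```

The sequences shown are placeholders chosen for readability and do not correspond to any gene described herein.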
In an aspect, an edit is a “knock-in” edit. As used herein, a “knock-in edit” or a “knock-in mutation” refers to the substitution or replacement of a non-functioning or low-functioning allele of a gene with a functional or higher-functioning allele of the gene. Knock-in edits are sometimes referred to in the art as gain-of-function edits.
In some aspects, “codon optimization” refers to experimental approaches designed to improve the codon composition of a recombinant gene based on various criteria without altering the amino acid sequence. This is possible because most amino acids are encoded by more than one codon. Codon optimization may be used to improve gene expression and increase the translation efficiency of a gene of interest by accommodating for codon bias of the host organism. In some aspects, a nucleic acid molecule provided herein encodes a polypeptide that is codon optimized for a prokaryote. In some aspects, a nucleic acid molecule provided herein encodes a polypeptide that is codon optimized for an Escherichia coli cell. In some aspects, a nucleic acid molecule provided herein encodes a polypeptide that is codon optimized for a eukaryote. In some aspects, a nucleic acid molecule provided herein encodes a polypeptide that is codon optimized for a mammalian cell. In some aspects, a nucleic acid molecule provided herein encodes a polypeptide that is codon optimized for a human cell. In some aspects, a nucleic acid molecule provided herein encodes a polypeptide that is codon optimized for a non-human mammalian cell. In some aspects, a nucleic acid molecule provided herein encodes a polypeptide that is codon optimized for a fungal cell. In some aspects, a nucleic acid molecule provided herein encodes a polypeptide that is codon optimized for a Saccharomyces cerevisiae cell. In some aspects, a nucleic acid molecule provided herein encodes a polypeptide that is codon optimized for a plant cell. In some aspects, a nucleic acid molecule provided herein encodes a polypeptide that is codon optimized for an archaeal cell.
The present disclosure includes methods of trackable nucleic acid-guided nuclease editing in cell populations, e.g., prokaryotic, archaeal, and eukaryotic cells. In some aspects, the cells include mammalian cells. In some aspects, the cells include human cells. In some aspects, the cells include non-human mammalian cells. In some aspects, the cells include bacterial cells. In some aspects, the cells include E. coli cells. In some aspects, the cells include fungal cells. In some aspects, the cells include S. cerevisiae cells. In some aspects, the cells include plant cells.
In some aspects, a mutation or modification provided herein can be positioned in any part of a gene. In some aspects, a mutation or modification provided herein can be positioned in the coding region of a gene. In some aspects, a mutation or modification provided herein can be positioned in the non-coding region of a gene. In some aspects, a mutation or modification provided herein can be positioned in the regulatory region of a gene. In some aspects, a mutation or modification provided herein is positioned within an exon of a gene. In some aspects, a mutation or modification provided herein is positioned within an intron of a gene. In some aspects, a mutation or modification provided herein is positioned within an exon and an intron of a gene. In a further aspect, a mutation or modification provided herein is positioned within a 5′-untranslated region (UTR) of a gene. In still another aspect, a mutation or modification provided herein is positioned within a 3′-UTR of a gene. In yet another aspect, a mutation or modification provided herein is positioned within a promoter of a gene. In yet another aspect, a mutation or modification provided herein is positioned within a terminator of a gene.
The present disclosure relates to methods and compositions for improved tracking of nucleic acid-guided nuclease editing. With the present compositions and methods, targeted editing and tracking of the intended edit(s) is facilitated using a barcoding gRNA covalently linked to a barcode sequence and designed to precisely insert the barcode sequence into a desired genomic locus. When introduced into cells along with an editing gRNA covalently linked to an intended edit and a corresponding nucleic acid-guided nuclease or nickase, the barcoding gRNA facilitates simultaneous editing and tracking (e.g., barcoding) of the edit(s), wherein the edit is incorporated into a first target locus and the corresponding barcode is integrated into a second, separate target locus. The integrated barcode may then be tracked or analyzed via genomic sequencing (e.g., amplicon-based next-generation sequencing), or RNA sequencing (“RNASeq”) if the second target locus is a gene-coding region, to identify the edit, plasmid, or other construct that was co-delivered with the barcoding gRNA. And, because the barcode is integrated into the genome, the barcode may be tracked beyond the timeframe of any transient plasmid reagents utilized to facilitate editing, cell differentiation, and the like.
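By way of illustration only, the tracking step described above can be pictured as reading the integrated barcode out of amplicon sequencing reads and looking up the edit with which that barcode was co-delivered. The following is a minimal sketch, assuming exact-match flanking sequences on either side of the barcode integration site and a pre-built barcode-to-edit lookup table. The flank sequences, barcodes, edit labels, and function names are hypothetical placeholders, and a practical pipeline would likely also need to tolerate sequencing errors.

```python
from collections import Counter


def extract_barcode(read, left_flank, right_flank):
    """Return the sequence between the two flanks, or None if a flank is absent."""
    start = read.find(left_flank)
    if start == -1:
        return None
    start += len(left_flank)
    end = read.find(right_flank, start)
    if end == -1:
        return None
    return read[start:end]


def tally_edits(reads, left_flank, right_flank, barcode_to_edit):
    """Count reads per edit by decoding the barcode found in each read."""
    counts = Counter()
    for read in reads:
        barcode = extract_barcode(read, left_flank, right_flank)
        if barcode is None:
            continue  # read does not span the integration site
        if barcode == "":
            counts["no barcode"] += 1  # flanks are adjacent: locus not barcoded
        else:
            counts[barcode_to_edit.get(barcode, "unassigned")] += 1
    return counts


# Hypothetical flanks, barcodes, and reads for illustration.
LEFT, RIGHT = "GGTACC", "CTGCAG"
BARCODE_TO_EDIT = {"ACGTACGTAC": "edit_A", "TTGCAATTGC": "edit_B"}
READS = [
    "AAGGTACCACGTACGTACCTGCAGTT",
    "AAGGTACCACGTACGTACCTGCAGTT",
    "GGTACCTTGCAATTGCCTGCAG",
    "GGTACCCTGCAG",
    "AAAA",
]
print(tally_edits(READS, LEFT, RIGHT, BARCODE_TO_EDIT))
```

Because the lookup is keyed on the barcode rather than on the edit itself, the same approach would apply whether the reads derive from genomic amplicons or from RNA sequencing of a transcribed second target locus.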
In certain aspects, the barcoding gRNA is a component of a barcoding cassette for performing tracking of nucleic acid-guided nuclease editing, the barcoding cassette comprising the barcoding gRNA having a region of complementarity to a sequence of a target locus in which a barcode sequence is to be integrated, and a barcode sequence for integration into the cell genome having a unique sequence by which a corresponding edit may be identified.
In certain aspects, the barcoding gRNA is a CREATE fusion gRNA (“CFgRNA,” defined infra) and the barcoding cassette is a CREATE fusion barcoding cassette (“CF barcoding cassette,” defined infra) comprising from 5′ to 3′: (A) a nucleic acid sequence encoding a barcoding gRNA having a region of complementarity to a sequence of a target locus in which a barcode sequence is to be integrated, the barcoding gRNA comprising: a guide or spacer sequence, and a scaffold region recognized by a corresponding nuclease or nickase; and (B) a nucleic acid sequence encoding a repair template covalently linked to the barcoding gRNA comprising from 5′ to 3′: an optional post-barcode homology region, a barcode sequence, a nick-to-barcode region, and a primer binding site (PBS). In some aspects, the components of the barcoding cassette are contiguous. In some aspects, the barcoding cassette is agnostic to the order of the barcoding gRNA and repair template. In some aspects, the barcoding gRNA is under the control of a promoter at the 5′ end of the barcoding cassette.
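By way of illustration only, the 5′ to 3′ ordering of the CF barcoding cassette components recited above can be represented as a simple concatenation. The following is a minimal sketch, assuming each component is supplied as a DNA string already written in cassette orientation. The class name, field names, and all sequences are hypothetical placeholders (the scaffold shown is a truncated stand-in, not a functional scaffold), and the sketch covers only the embodiment in which the gRNA precedes the repair template.

```python
from dataclasses import dataclass


@dataclass
class BarcodingCassette:
    """Components of a barcoding cassette, listed in 5' to 3' order."""
    spacer: str                      # guide or spacer sequence
    scaffold: str                    # scaffold recognized by the nuclease/nickase
    barcode: str                     # barcode sequence to be integrated
    nick_to_barcode: str             # nick-to-barcode region
    pbs: str                         # primer binding site
    post_barcode_homology: str = ""  # optional post-barcode homology region

    def repair_template(self):
        """Repair template: homology, barcode, nick-to-barcode, then PBS."""
        return (self.post_barcode_homology + self.barcode
                + self.nick_to_barcode + self.pbs)

    def sequence(self):
        """Full cassette: gRNA (spacer + scaffold) followed by the repair template."""
        return self.spacer + self.scaffold + self.repair_template()


# Hypothetical placeholder sequences for illustration.
cassette = BarcodingCassette(
    spacer="GACCTTGGCATCAAGTCCGA",
    scaffold="GTTTTAGAGC",
    barcode="ACGTACGTAC",
    nick_to_barcode="TTG",
    pbs="CCGATTGCAAGGT",
    post_barcode_homology="AGCTA",
)
print(cassette.repair_template())
print(len(cassette.sequence()))
```

Leaving `post_barcode_homology` at its default empty value models the aspect in which that region is omitted.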
In certain aspects, the nick-to-barcode region of the repair template is between 1 nucleotide and 100 nucleotides in length, between 1 nucleotide and 75 nucleotides in length, between 1 nucleotide and 50 nucleotides in length, between 2 nucleotides and 250 nucleotides in length, between 5 nucleotides and 150 nucleotides in length, or between 1 nucleotide and 150 nucleotides in length. In some aspects of this method, the nick-to-barcode region of the repair template is up to 10,000 nucleotides in length, up to 5000 nucleotides in length, up to 3000 nucleotides in length, up to 1000 nucleotides in length, up to 500 nucleotides in length, up to 250 nucleotides in length, up to 100 nucleotides in length, up to 50 nucleotides in length, or up to 25 nucleotides in length.
In certain aspects, the post-barcode homology region of the repair template is between 2 nucleotides and 20 nucleotides in length, between 2 nucleotides and 15 nucleotides in length, between 2 nucleotides and 50 nucleotides in length, between 4 nucleotides and 40 nucleotides in length, between 3 nucleotides and 30 nucleotides in length or between 5 nucleotides and 25 nucleotides in length.
In certain aspects, the editing gRNA is a component of an editing cassette for performing nucleic acid-guided nuclease editing, the editing cassette comprising the editing gRNA having a region of complementarity to a sequence of a target locus in which an edit is to be incorporated, and an edit for incorporation into the cell genome.
In certain aspects of the present disclosure, the editing gRNA is a CREATE fusion gRNA (“CFgRNA,” defined infra) and the editing cassette is a CREATE fusion editing cassette (“CF editing cassette,” defined infra) comprising from 5′ to 3′: (A) a nucleic acid sequence encoding an editing gRNA having a region of complementarity to a sequence of a target locus in which an edit is to be incorporated, the editing gRNA comprising: a guide or spacer sequence, and a scaffold region recognized by a corresponding nuclease or nickase; and (B) a nucleic acid sequence encoding a repair template covalently linked to the editing gRNA comprising from 5′ to 3′: an optional post-edit homology region, an edit, an optional nick-to-edit region, and a primer binding site (“PBS”). In some aspects, the components of the editing cassette are contiguous. In some aspects, the editing cassette is agnostic to the order of the editing gRNA and repair template. In some aspects, the editing gRNA is under the control of a promoter at the 5′ end of the CF editing cassette.
In certain aspects, the nick-to-edit region of the repair template is between 1 nucleotide and 100 nucleotides in length, between 1 nucleotide and 75 nucleotides in length, between 1 nucleotide and 50 nucleotides in length, between 2 nucleotides and 250 nucleotides in length, between 5 nucleotides and 150 nucleotides in length, or between 1 nucleotide and 150 nucleotides in length. In some aspects of this method, the nick-to-edit region of the repair template is up to 10,000 nucleotides in length, up to 5000 nucleotides in length, up to 3000 nucleotides in length, up to 1000 nucleotides in length, up to 500 nucleotides in length, up to 250 nucleotides in length, up to 100 nucleotides in length, up to 50 nucleotides in length, or up to 25 nucleotides in length.
In certain aspects, the post-edit homology region of the repair template is between 2 nucleotides and 20 nucleotides in length, between 2 nucleotides and 15 nucleotides in length, between 2 nucleotides and 50 nucleotides in length, between 4 nucleotides and 40 nucleotides in length, between 3 nucleotides and 30 nucleotides in length, or between 5 nucleotides and 25 nucleotides in length.
In certain aspects, the editing cassette is designed to facilitate incorporation of an intended edit at a first target locus (e.g., target site or target region) of the cells, and the barcoding cassette is designed to facilitate integration of a barcode at a second target locus of the cells different from the first target locus.
In certain aspects, the editing cassette and/or the barcoding cassette (e.g., the repair templates) further comprise an edit (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 edits) to immunize a target locus to prevent re-nicking or re-cutting thereof. As discussed herein, in some aspects, an edit to immunize a target locus to prevent re-nicking is one that alters the proto-spacer adjacent motif (PAM) (or other element) such that subsequent binding at the target locus by the nucleic acid-guided polypeptide (e.g., nuclease, nickase, inactive nuclease or inactive nickase) is impaired or prevented.
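By way of illustration only, whether an immunizing edit has altered the PAM can be checked by inspecting the bases immediately adjacent to the protospacer before and after editing. The following is a minimal sketch, assuming a 20-nucleotide protospacer followed on its 3′ side by a three-nucleotide PAM of the form NGG. Both the PAM pattern and the protospacer length are assumptions made for the example, since the relevant motif depends on the nuclease or nickase used, and the function name and sequences are hypothetical.

```python
import re


def has_pam(site, protospacer_len=20, pam_pattern="[ACGT]GG"):
    """Return True if the bases directly 3' of the protospacer match the PAM."""
    pam = site[protospacer_len:protospacer_len + 3]
    return re.fullmatch(pam_pattern, pam) is not None


# Hypothetical target site: 20-nt protospacer, then PAM, then 2 trailing bases.
PROTOSPACER = "GACCTTGGCATCAAGTCCGA"
unedited_site = PROTOSPACER + "TGG" + "AC"
immunized_site = PROTOSPACER + "TGC" + "AC"  # G -> C change disrupts the NGG motif

print(has_pam(unedited_site))   # PAM present, site can be re-nicked
print(has_pam(immunized_site))  # PAM altered, re-nicking is impaired
```

A different `pam_pattern` can be passed for enzymes that recognize other motifs.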
In certain aspects, the editing cassette and/or the barcoding cassette further comprise an RNA G-quadruplex region at a 3′ end of the repair template to stabilize the cassette and improve target nicking or cleavage efficiency without inducing off-target activity.
In certain aspects, the editing cassette and/or barcoding cassette further comprise an amplification priming site or subpool primer binding sequence at a 3′ end thereof. In specific aspects, the editing cassette and/or barcoding cassette further comprise a melting temperature booster sequence at a 5′ end thereof, which is a short protective DNA buffer sequence. In addition, in specific aspects, the editing cassette and/or barcoding cassette comprise regions of homology to a vector for gap-repair insertion of the cassette into the vector, such as an editing vector or engine vector.
In some aspects, a region of complementarity between the barcoding gRNA and a target locus is between 4 nucleotides and 120 nucleotides in length, between 5 nucleotides and 80 nucleotides in length, or between 6 nucleotides and 60 nucleotides in length. In certain aspects, a region of complementarity between the barcoding gRNA and a target locus is between 1 nucleotide and 10 nucleotides in length, between 10 nucleotides and 20 nucleotides in length, between 20 nucleotides and 50 nucleotides in length, or between 50 nucleotides and 100 nucleotides in length.
In some aspects, the barcode sequence of the repair template of the barcoding cassette is between 1 nucleotide and 750 nucleotides in length, between 1 nucleotide and 500 nucleotides in length, between 1 nucleotide and 150 nucleotides in length, between 1 nucleotide and 100 nucleotides in length, between 4 nucleotides and 50 nucleotides in length, or between 4 nucleotides and 25 nucleotides in length.
In some aspects, a region of complementarity between the editing gRNA and a target locus is between 4 nucleotides and 120 nucleotides in length, between 5 nucleotides and 80 nucleotides in length, between 6 nucleotides and 60 nucleotides in length, between 1 nucleotide and 10 nucleotides in length, between 10 nucleotides and 20 nucleotides in length, between 20 nucleotides and 50 nucleotides in length, or between 50 nucleotides and 100 nucleotides in length.
In some aspects, the edit region of the repair template of the editing cassette is between 1 nucleotide and 750 nucleotides in length, between 1 nucleotide and 500 nucleotides in length, between 1 nucleotide and 150 nucleotides in length, between 1 nucleotide and 10 nucleotides in length, between 10 nucleotides and 20 nucleotides in length, between 20 nucleotides and 50 nucleotides in length, between 50 nucleotides and 100 nucleotides in length, between 100 nucleotides and 250 nucleotides in length, between 250 nucleotides and 500 nucleotides in length, or between 500 nucleotides and 750 nucleotides in length.
In certain aspects, the edit region of the repair template of the editing cassette comprises two or more edits, or three or more edits, or four or more edits, or five or more edits.
In some aspects, the edit created by the editing cassette in a target locus includes one or more nucleotide swaps in the target locus.
In some aspects, the edit created by the editing cassette in a target locus is an insertion in the target locus.
In some aspects, the edit created by the editing cassette is an insertion of recombinase sites, protein degron tags, promoters, terminators, alternative-splice sites, CpG islands, etc.
In some aspects, the edit created by the editing cassette in a target locus is a deletion in the target locus.
In some aspects, the editing cassette is designed to provide a deletion of between 1 nucleotide and 750 nucleotides at a target locus. In some aspects, the editing cassette is designed to provide a deletion of between 1 nucleotide and 10 nucleotides, between 10 nucleotides and 20 nucleotides, between 20 nucleotides and 50 nucleotides, between 50 nucleotides and 100 nucleotides, between 100 nucleotides and 200 nucleotides, between 200 nucleotides and 500 nucleotides, or between 250 nucleotides and 750 nucleotides at a target locus.
In some aspects, the edit created is a deletion of introns, exons, repetitive elements, promoters, terminators, insulators, CpG islands, non-coding elements, retrotransposons, etc.
In some aspects, the edit comprises several types of edits and/or comprises more than one of one or more types of edits. For example, in some aspects, the edit comprises two or more nucleotide swaps or substitutions (e.g., 2, 3, 4, 5, or between 1 and 20 nucleotide swaps), some or all of which can be adjacent to each other or nonadjacent to each other. In some aspects, the edit comprises one or more nucleotide swaps (e.g., 2, 3, 4, 5, or between 1 and 20 nucleotide swaps) and an insertion of one or more nucleotides (e.g., 2, 3, 4, 5, or between 1 and 20 nucleotides). In some aspects, the edit comprises one or more nucleotide swaps (e.g., 2, 3, 4, 5, or between 1 and 20 nucleotide swaps) and a deletion of one or more nucleotides (e.g., 2, 3, 4, 5, or between 1 and 20 nucleotides).
In some aspects, the edit created by the editing cassette in a target locus is in a coding region in the target locus.
In some aspects, the edit created by the editing cassette in a target locus is in a noncoding region in the target locus.
In some aspects, a barcode sequence integrated at a second target locus facilitates tracking of an incorporated edit at a first target locus. In some aspects, the second target locus comprises a neutral integration site, or “safe spot,” that facilitates stable integration of the barcode sequence without significant impact on cell growth or function. In some aspects, the second target locus is a safe harbor locus disposed centrally in a large intergenic region to reduce the potential of barcode sequence integration adversely affecting genes neighboring the integrated barcode. In some aspects, where a plurality of barcodes are integrated (e.g., during recursive or iterative editing methods), the barcodes are embedded into one or more clustered neutral safe harbor loci. In some aspects, the integration locus of the barcode sequence is selected based on inspection of a host GFP fusion localization database.
In some aspects, the second target locus is disposed within a coding region (e.g., exon). In some aspects, the second target locus is disposed within a noncoding region (e.g., intronic or intergenic region). In some aspects, the second target locus comprises the adeno-associated virus site 1 (“AAVS1”), the chemokine (C-C motif) receptor 5 (“CCR5”) gene, the DNA methyltransferase 3B (“DNMT3b”) gene, the eukaryotic translation initiation factor 4E-binding protein 2 (“4EBP2”) gene, the ornithine decarboxylase antizyme 1 (“OAZ1”) gene, or an orthologue of the Rosa26 locus.
In some aspects, the second target locus is adjacent to the first target locus, or within close proximity to the first target locus. As used herein, “close proximity” refers to within 5000 nucleotides.
In some aspects, successfully barcoded and/or edited cells may be enriched for based on the selection of the second target locus. For example, in specific aspects, the second target locus may be disposed within the coding region of a non-essential cell surface receptor such that integration of a barcode sequence in the second target locus eliminates the receptor from the cell surface. Accordingly, antibody and/or bead-based affinity purification may be utilized to remove cells that were not successfully barcoded, leaving only barcoded cells (e.g., negative selection). In some aspects, the barcode sequence may comprise a frameshifting edit. In some aspects, the barcode sequence may comprise a frameshifting edit and be 8 or more nucleotides in length, wherein the number of nucleotides is not a multiple of 3 (e.g., 10, 11, 13, 14, etc.). In some aspects, the barcode sequence may comprise an in-frame STOP codon (TAGTGA) edit and be 9 or more nucleotides in length.
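By way of illustration only, the length and stop-codon conditions described above can be checked computationally. The following Python sketch is not part of the disclosed methods; the function names `is_frameshifting` and `has_in_frame_stop` are hypothetical, and the reading frame is assumed to begin at the first nucleotide of the barcode unless an offset is supplied.

```python
# Standard DNA stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_frameshifting(barcode: str, min_length: int = 8) -> bool:
    """Return True if the barcode is at least min_length nucleotides long
    and its length is not a multiple of 3 (i.e., an insertion of this
    length would shift the downstream reading frame)."""
    return len(barcode) >= min_length and len(barcode) % 3 != 0


def has_in_frame_stop(barcode: str, frame_offset: int = 0) -> bool:
    """Return True if a stop codon occurs in the reading frame that starts
    at frame_offset (assumed here to be 0, 1, or 2)."""
    seq = barcode.upper()
    for i in range(frame_offset, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return True
    return False
```

For example, a 10-nucleotide barcode satisfies the frameshift condition, whereas a 9-nucleotide barcode does not and would instead rely on an in-frame STOP codon to disrupt expression of the receptor.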
In some aspects, the second target locus may be disposed within the coding region of the CD9 cell surface glycoprotein, CD81 cell surface receptor, CD63 cell surface receptor, or other non-essential cell surface receptor.
In specific aspects, the second target locus may be disposed within a locus corresponding to a 5′ and/or 3′ untranslated region (UTR) of a gene, including CD81, OAZ2, CD9, CD63, SGK1, CARHSP1, MAP4, SLC38A1, WNK1, DIAPH1, LRRC8A, FAF2, NKTR, TBC1D16, GJC1, NUCKS1, CAPZB, MPP6, WDR83OS, PMEPA1, SERINC5, HTT, SLC29A1, PPP3CA, EZR, HEBP2, SLC7A1, LSM14A, ERBB2, CYP51A1, GPATCH8, and/or the like. In some aspects, the second target locus may be disposed within a locus corresponding to a 5′ and/or 3′ UTR of an mRNA, so that the integrated barcode may be detected by RNA sequencing methods.
In some aspects, the integrated barcode sequence is tracked or analyzed via RNA sequencing (e.g., transcriptome sequencing) or genomic sequencing. In some aspects, the integrated barcode sequence is tracked or analyzed via single cell whole genome sequencing methods, which enable combinatorial and/or linked edit tracking.
In some aspects, the nuclease includes a MAD-series nuclease, nickase, or a variant (e.g., orthologue) thereof. In some aspects, the nuclease includes a MAD1, MAD2, MAD3, MAD4, MAD5, MAD6, MAD7®, MAD8, MAD9, MAD10, MAD11, MAD12, MAD13, MAD14, MAD15, MAD16, MAD17, MAD18, MAD19, MAD20, MAD2001, MAD2007, MAD2008, MAD2009, MAD2011, MAD2017, MAD2019, MAD297, MAD298, MAD299, or other MAD-series nuclease, nickase, variants thereof, and/or combinations thereof.
In some aspects, the nuclease includes a Cas9 nuclease (also known as Csn1 and Csx12), nickase, or a variant thereof.
In some aspects, the nuclease includes C2c1, C2c2, C2c3, Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas10, Cpf1, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx100, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, Csf4, or similar nuclease, nickase, variants thereof, and/or combinations thereof.
In some aspects, such as aspects wherein a CF barcoding cassette and/or CF editing cassette is utilized, the nuclease is a fusion protein—e.g., a nucleic acid-guided nickase/reverse transcriptase fusion enzyme (a “nickase-RT fusion”)—that retains certain characteristics of nucleic acid-directed nucleases (e.g., the binding specificity and ability to cleave one or more DNA strands in a targeted manner) combined with another enzymatic activity, namely, reverse transcriptase activity. In some aspects, the reverse transcriptase portion of the nickase-RT fusion may use a CFgRNA (e.g., the barcoding gRNA in a CF barcoding cassette or editing gRNA in a CF editing cassette) to synthesize and edit at a “flap” created by the nickase portion on one or both DNA strands of a target locus, thereby circumventing the endogenous mismatch repair systems to integrate a barcode or incorporate an edit.
In some aspects, the barcoding cassette, the editing cassette, and the nuclease are introduced into the cells on a single vector (e.g., a single-part system). In certain aspects, the barcoding cassette, the editing cassette, and/or the nuclease are introduced into the cells as a multi-part system, wherein the barcoding cassette may be introduced separately from the editing cassette and/or the nuclease. In some aspects, the barcoding cassette may be comprised on a first vector, and the editing cassette and/or the nuclease may be comprised on a second vector co-delivered with the first vector. In some aspects, the nuclease is introduced to the cells before the introduction of the barcoding cassette and the editing cassette. In some aspects, one or more of the barcoding cassette, the editing cassette, and the nuclease are introduced into the cells prior to the introduction of the remaining components for effecting editing and/or barcoding. In some aspects, the cell comprises the barcoding cassette, the editing cassette, and the nuclease. In some aspects, the cell comprises the barcoding cassette and the editing cassette. In some aspects, the cell comprises the nuclease.
In some aspects, the nuclease is introduced into the cells as a DNA molecule coding for the nuclease separately or linked to the barcoding cassette and/or the editing cassette. In some aspects, the nuclease may be introduced separately into the cells in protein form or as part of a complex. In some aspects, the same nuclease may be utilized to incorporate the intended edit into a first genomic locus of the cells and the barcode into a second genomic locus of the cells. In some aspects, different nucleases may be utilized to incorporate the intended edit into a first genomic locus of the cells and the barcode into a second genomic locus of the cells. In some aspects, the nuclease is endogenously expressed in the cells. In some aspects, the nuclease is transiently expressed in the cells. In some aspects, the nuclease is delivered into the cells as a protein molecule. In some aspects, the nuclease is delivered into the cells as a complex with nucleic acid.
In some aspects, the barcoding cassette, the editing cassette, and/or the nucleic acid-guided nuclease are introduced into the cell on a linear or circular plasmid. In some aspects, the barcoding cassette, the editing cassette, and/or the nucleic acid-guided nuclease are under the control of a constitutive or inducible promoter at a 5′ end thereof.
In some aspects, a vector comprising the barcoding cassette, the editing cassette, and/or the nucleic acid-guided nuclease further comprises an origin of replication and a selectable marker component, e.g., an antibiotic resistance gene or a fluorescent protein gene, for selection or enrichment of cells that have been edited and/or barcoded. In some aspects, the selectable marker may be utilized for selective enrichment of edited and/or barcoded cells. In some aspects, the selectable marker comprises an antibiotic resistance gene or a fluorescent protein. In some aspects, the selectable marker comprises the PuroR gene.
In some aspects, there is provided a library of vector or plasmid backbones, and/or a library of editing cassettes, and/or a library of barcoding cassettes to be transformed into cells. In some aspects, the utilization of a library of cassettes and/or a library of vector or plasmid backbones enables combinatorial or multiplex editing in the cells. In some aspects, a library of cassettes or vectors may comprise cassettes or vectors that have any combination of common elements and non-common or different elements as compared to other cassettes or vectors within the pool. In some aspects, a library of editing cassettes may comprise common priming sites or common nick-to-edit or post-edit homology regions, while also containing non-common or unique edits. In some aspects, a library of barcoding cassettes may comprise common priming sites or common nick-to-barcode or post-barcode homology regions, while also containing non-common or unique barcode sequences. In some aspects, combinations of common and non-common elements are advantageous for multiplexing or combinatorial techniques disclosed herein.
In some aspects, a library of cassettes comprises at least 2 cassettes, at least 10 cassettes, at least 100 cassettes, at least 500 cassettes, at least 1,000 cassettes, at least 5,000 cassettes, at least 10,000 cassettes, at least 100,000 cassettes, or at least 1,000,000 cassettes. In some aspects, a library of cassettes comprises between 5 cassettes and 1,000,000 cassettes, between 100 cassettes and 500,000 cassettes, between 1,000 cassettes and 100,000 cassettes, between 1,000 cassettes and 10,000 cassettes, or between 10,000 cassettes and 50,000 cassettes.
In some aspects, one or more editing cassettes in a library of editing cassettes each comprise a different editing gRNA targeting a different target locus within the cell genome. In some aspects, one or more editing cassettes in a library of editing cassettes each comprise a different edit to be incorporated within the cell genome. In some aspects, one or more barcoding cassettes in the library of barcoding cassettes each comprise a different barcoding gRNA targeting a different target locus within the cell genome. In some aspects, one or more barcoding cassettes in a library of barcoding cassettes each comprise a different barcode to be incorporated within the cell genome.
In some aspects, there is provided a trackable library comprising a plurality of cassettes or a plurality of vectors comprising cassettes as disclosed herein. In some aspects, within the trackable library are distinct editing cassette and barcoding cassette combinations, which when sequenced upon editing, facilitate tracking of editing events in a population of cells. Accordingly, when edits and barcodes are incorporated into a target genome, the incorporation of an edit is determined based on sequencing the barcode.
In some aspects, there is provided a gene-wide or genome-wide library of cassettes or vectors comprising cassettes as disclosed herein.
In some aspects, there are provided methods of recursive or iterative rounds of editing operations. In some aspects, during each round of editing, a new or unique barcode is incorporated into the cell genome, such that following multiple editing rounds to construct combinatorial diversity throughout the genome, sequencing of the barcodes can be used to reconstruct each combinatorial genotype or to confirm that the edit from each round or operation has been incorporated into the genome. In some aspects, methods disclosed herein comprise 2 or more rounds of editing, 3 or more rounds of editing, 5 or more rounds of editing, 7 or more rounds of editing, or 10 or more rounds of editing. In some aspects, methods disclosed herein comprise one round of editing.
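As a non-limiting illustration of how sequenced barcodes might be used to reconstruct a combinatorial genotype after several editing rounds, the following Python sketch assumes that each round has its own lookup table relating barcode sequences to edit designs. The function name `reconstruct_genotype`, the table layout, and the barcode and edit labels in the usage example are all hypothetical and are not taken from the present disclosure.

```python
from typing import Dict, Iterable, List, Tuple


def reconstruct_genotype(
    cell_barcodes: Iterable[str],
    round_tables: List[Dict[str, str]],
) -> List[Tuple[int, List[str]]]:
    """Map the barcodes observed in one cell back to per-round edits.

    round_tables[i] is an assumed lookup table for editing round i + 1,
    relating each barcode sequence to the edit it was designed to track.
    An empty edit list for a round means no barcode from that round
    was observed in the cell.
    """
    observed = {bc.upper() for bc in cell_barcodes}
    genotype = []
    for round_number, table in enumerate(round_tables, start=1):
        edits = sorted(table[bc] for bc in observed if bc in table)
        genotype.append((round_number, edits))
    return genotype


# Usage with hypothetical barcodes and edit labels:
tables = [{"ACGTACGT": "edit_A"}, {"TTGCAAGC": "edit_B"}]
print(reconstruct_genotype(["ACGTACGT", "TTGCAAGC"], tables))
```

A round with no matching barcode is reported explicitly rather than omitted, so that a missing integration event in a given round remains visible in the reconstructed genotype.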
In some aspects, one or more unique barcodes can be inserted in each round of multiple iterative or recursive editing operations. In some aspects, the unique barcodes may be inserted adjacent or in proximity to each other (e.g., in a single target region), or at a distance and/or in separate target regions.
In some aspects, recursive or iterative editing methods may be used. In some aspects, recursive or iterative editing methods may be used for analyzing combinatorial mutational effects on large populations, or for inserting entire pathways within cells.
In some aspects, the methods described herein facilitate parallel analysis of two or more target proteins.
In some aspects, the methods described herein enable preparation of a comprehensive library of genetic variations encompassing all residue changes of one or more target proteins, such as one or more target proteins that contributed to a trait.
The present disclosure provides, in selected aspects, modules, instruments, and systems for automated multi-module cell processing for trackable nucleic acid-guided genome editing in multiple cells. Automated systems for cell processing that may be used can be found, e.g., in U.S. Pat. Nos. 10,253,316; 10,329,559; 10,323,242; 10,421,959; 10,465,185; 10,519,437; 10,584,333; 10,584,334; 10,647,982; 10,689,645; 10,738,301; and 10,738,663.
In some aspects, the automated multi-module cell processing instruments of the present disclosure are designed for recursive genome editing, e.g., sequentially introducing multiple edits into genomes inside one or more cells of a cell population through two or more editing operations within the instruments.
In some aspects, the methods, compositions, modules, and instruments described herein may be utilized for efficient tracking of barcodes utilized during editing, for efficient tracking of ribonucleoprotein (RNP) based transfections, and for efficient tracking of non-plasmid based barcode delivery via homologous recombination (HR) or non-homologous end joining (NHEJ) based integration.
Certain aspects described herein provide an alternative to traditional nucleic acid-guided nuclease editing (e.g., RNA-guided nuclease or CRISPR editing) used to introduce desired edits to a population of cells, that is, the compositions and methods described herein may employ a nucleic acid-guided nickase/reverse transcriptase fusion enzyme (“nickase-RT fusion”) as opposed to a nucleic acid-guided nuclease (e.g., a “CRISPR nuclease”). The nickase-RT fusion employed herein differs from traditional CRISPR editing in that instead of initiating double-stranded breaks in the target genome and homologous recombination to effect an intended edit and/or integrate a barcode corresponding with an intended edit, the nickase initiates a nick in a single strand of the target genome, e.g., the non-complementary strand. Further, the fusion of the nickase to a reverse transcriptase, in combination with an editing or barcoding cassette comprising a gRNA and repair template, eliminates the need for a donor DNA to be incorporated by homologous recombination. Instead, a nucleic acid sequence encoding the repair template of the corresponding cassette—typically a ribonucleic acid—may serve as a template for the reverse transcription (“RT”) portion of the fusion enzyme to add an intended edit or barcode to the nicked strand at the target locus. That is, utilization of a nickase-RT fusion enables incorporation of the edit or barcode in the target genome by copying an RNA sequence (e.g., at the RNA level) rather than replacing a portion of the target locus with a donor DNA (e.g., at the DNA level).
The nickase, functioning as a single-strand cutter and having the specificity of a nucleic acid-guided nuclease, engages the target locus and nicks a strand of the target locus creating one or more free 3′ terminal nucleotides. The 3′ end of the repair template encoded by the editing cassette or barcoding cassette is then annealed to the nicked strand, and the reverse transcriptase utilizes 3′ terminal nucleotide(s) of the nicked strand to copy the repair template and create a “flap” containing the desired edit or barcode. Thereafter, endogenous repair mechanisms of the cells repair the nick in favor of the desired edit by hybridizing the flap to the wild-type (e.g., unedited) DNA strand. In summary, in certain aspects, the present methods and compositions are drawn to using the nickase-RT fusion to nick a strand of DNA at the target locus and, using an editing cassette or barcoding cassette, to effect the desired edit or barcode on the strand via the reverse transcriptase portion of the nickase-RT fusion.
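The annealing and reverse transcription steps described above can be pictured with a simplified sequence-level model. The following Python sketch is offered only as an aid to understanding and makes several assumptions not stated in the disclosure: the repair template is treated as an RNA string written 5′ to 3′, its last `anneal_length` nucleotides are assumed to be perfectly complementary to the 3′ end of the nicked strand, and the function names `reverse_transcribe` and `synthesize_flap` are hypothetical.

```python
# RNA base -> complementary DNA base.
RNA_TO_DNA = {"A": "T", "C": "G", "G": "C", "U": "A"}


def reverse_transcribe(rna: str) -> str:
    """Return the DNA strand (5'->3') complementary to an RNA (5'->3')."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna.upper()))


def synthesize_flap(nicked_strand: str, repair_template_rna: str,
                    anneal_length: int) -> str:
    """Extend the free 3' end of a nicked DNA strand by copying the
    repair template, returning the nicked strand plus the new flap.

    nicked_strand: DNA 5' of the nick, written 5'->3'.
    repair_template_rna: RNA written 5'->3'; its 3'-most anneal_length
        nucleotides are assumed to anneal to the 3' end of nicked_strand.
    """
    if anneal_length <= 0 or anneal_length >= len(repair_template_rna):
        raise ValueError("anneal_length must leave a template to copy")
    anneal_region = repair_template_rna[-anneal_length:]
    # The annealing region must pair with the 3' end of the nicked strand.
    if not nicked_strand.upper().endswith(reverse_transcribe(anneal_region)):
        raise ValueError("repair template does not anneal to nicked strand")
    # The remainder of the template is copied into DNA to form the flap.
    flap = reverse_transcribe(repair_template_rna[:-anneal_length])
    return nicked_strand.upper() + flap
```

In this model the flap is simply the reverse complement of the portion of the repair template lying 5′ of the annealing region; subsequent flap equilibration and repair by the cell are not modeled.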
Generally, nucleic acid-guided nuclease editing begins with a nucleic acid-guided nuclease complexing with an appropriate guide nucleic acid in a cell, forming a complex that can cut the genome of the cell at a desired location. The guide nucleic acid helps the nucleic acid-guided nuclease recognize and cut the DNA at a specific target sequence. By manipulating the nucleotide sequence of the guide nucleic acid, the nucleic acid-guided nuclease may be programmed to target any DNA sequence for cleavage as long as an appropriate protospacer adjacent motif (PAM) is nearby. For some nucleic acid-guided nucleases, two separate guide nucleic acid molecules that combine to function as a guide nucleic acid are used, e.g., a CRISPR RNA (crRNA) and trans-activating CRISPR RNA (tracrRNA). For other nucleic acid-guided nucleases, the guide nucleic acid may be a single guide nucleic acid that includes both the crRNA and tracrRNA sequences.
In general, a guide nucleic acid (e.g., gRNA or CFgRNA) complexes with a compatible nucleic acid-guided nuclease and can then hybridize with a target sequence, thereby directing the nuclease to the target sequence. A guide nucleic acid can be DNA or RNA; alternatively, a guide nucleic acid may comprise both DNA and RNA. In some aspects, a guide nucleic acid may comprise modified or non-naturally occurring nucleotides. In the present methods and compositions, the guide nucleic acid is RNA.
A guide nucleic acid comprises a guide sequence, where the guide sequence (as opposed to the scaffold sequence portion of the guide nucleic acid) is a polynucleotide sequence having sufficient complementarity with a target sequence to hybridize with the target sequence and direct sequence-specific binding of a complexed nucleic acid-guided nuclease to the target sequence. The degree of complementarity between a guide sequence and the corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 92.5%, 95%, 97.5%, 99%, or more. Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences (e.g., without being limiting, BLAST™). In some aspects, a guide sequence is about or more than about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length. In some aspects, a guide sequence is less than about 75, 50, 45, 40, 35, 30, 25, 20 nucleotides in length. Preferably the guide sequence is between 10 nucleotides and 30 nucleotides long, between 15 nucleotides and 20 nucleotides long, or 15, 16, 17, 18, 19, or 20 nucleotides in length.
In some aspects of the present methods and compositions, the guide nucleic acids are provided as mRNAs or sequences to be expressed from a plasmid or vector, and/or as sequences to be expressed from a cassette optionally inserted into a plasmid or vector, and comprise both the guide sequence and the scaffold sequence as a single transcript. The guide nucleic acids are engineered to target a desired target sequence by altering the guide sequence so that the guide sequence is complementary to a desired target sequence, thereby allowing hybridization between the guide sequence and the target sequence. In general, to generate an edit or integrate a barcode in the target sequence, the gRNA/nuclease complex binds to a target sequence as determined by the guide RNA, and the nuclease recognizes a protospacer adjacent motif (PAM) sequence adjacent to the target sequence. The target sequence can be any polynucleotide endogenous or exogenous to a prokaryotic or eukaryotic cell, or in vitro. For example, the target sequence can be a polynucleotide residing in the nucleus of a eukaryotic cell. A target sequence can be a sequence encoding a gene product (e.g., a protein) or a non-coding sequence (e.g., a regulatory polynucleotide, an intron, a PAM, or “junk” DNA).
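For illustration, the degree of complementarity between a guide sequence and a target sequence may be estimated as sketched below in Python. This sketch assumes an ungapped, end-to-end comparison of two sequences of equal length, which is a simplification of the optimal alignment referred to above; it does not use BLAST or any other alignment tool, and the function name `percent_complementarity` is hypothetical.

```python
# DNA base -> complementary DNA base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def percent_complementarity(guide: str, target: str) -> float:
    """Percent of guide positions that base-pair with the target strand.

    guide: guide sequence written 5'->3' (RNA or DNA; U is read as T).
    target: the DNA strand the guide hybridizes to, written 5'->3'.
    Assumes an ungapped comparison of equal-length sequences.
    """
    guide_dna = guide.upper().replace("U", "T")
    target = target.upper()
    if not target or len(guide_dna) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    # A perfectly complementary guide equals the reverse complement
    # of the target strand.
    perfect_guide = "".join(COMPLEMENT[base] for base in reversed(target))
    matches = sum(g == p for g, p in zip(guide_dna, perfect_guide))
    return 100.0 * matches / len(target)
```

Under this simplified measure, a 20-nucleotide guide sequence with a single mismatched position would have 95% complementarity with its target sequence.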
As described above, in certain aspects, the guide nucleic acids may be part of editing cassettes or barcoding cassettes that also encode for repair templates, which are used as templates for reverse transcription by the reverse transcriptase portion of the nickase-RT fusion. Each repair template generally comprises a desired edit or barcode corresponding with a desired edit to be incorporated into the target DNA sequence. Accordingly, the edit or barcode is integrated into the target DNA sequence via copying of the repair template by the nickase-RT fusion, therefore not depending on HDR mechanisms between the target genome and a donor nucleic acid to effect the edit or barcode.
The target sequence is associated with a proto-spacer adjacent motif (PAM), which is a short nucleotide sequence recognized by the gRNA/nuclease complex. The precise preferred PAM sequence and length requirements for different nucleic acid-guided nucleases vary; however, PAMs typically are 2-8 base-pair sequences adjacent or in proximity to the target sequence and, depending on the nuclease, can be 5′ or 3′ to the target sequence. Engineering of the PAM-interacting domain of a nucleic acid-guided nuclease may allow for alteration of PAM specificity, improve target site recognition fidelity, decrease target site recognition fidelity, or increase the versatility of a nucleic acid-guided nuclease.
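As an illustrative sketch only, candidate target sequences adjacent to a PAM may be located by scanning a DNA sequence as shown below in Python. The sketch assumes a PAM located 3′ of the target sequence, uses the NGG motif and a 20-nucleotide target length purely as example defaults, and scans only the strand supplied; nucleases with a 5′ PAM or other motifs would require a different search. The function name `find_pam_sites` is hypothetical.

```python
import re
from typing import List, Tuple

# Subset of IUPAC nucleotide codes, expressed as regular-expression classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "Y": "[CT]", "V": "[ACG]"}


def find_pam_sites(sequence: str, pam: str = "NGG",
                   target_length: int = 20) -> List[Tuple[str, int]]:
    """Return (target_sequence, pam_start) pairs for every PAM match that
    has a full-length target sequence immediately 5' of it.

    Only the given strand is scanned; pam_start is a 0-based index.
    """
    seq = sequence.upper()
    # A lookahead lets overlapping PAM matches be reported.
    pattern = re.compile("(?=" + "".join(IUPAC[c] for c in pam.upper()) + ")")
    sites = []
    for match in pattern.finditer(seq):
        pam_start = match.start()
        target_start = pam_start - target_length
        if target_start >= 0:
            sites.append((seq[target_start:pam_start], pam_start))
    return sites
```

PAM matches lying too close to the start of the supplied sequence to accommodate a full-length target sequence are skipped rather than reported with a truncated target.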
In certain aspects, the editing or barcoding of a cellular target sequence both introduces a desired DNA change to the cellular target sequence, e.g., the genomic DNA of a cell, and removes, mutates, or renders inactive a PAM region or spacer region in the cellular target sequence. Rendering the PAM at the cellular target sequence inactive precludes additional editing of the cell genome at that cellular target sequence, e.g., upon subsequent exposure to a nucleic acid-guided nuclease complexed with a synthetic guide nucleic acid in later rounds of editing.
The range of target sequences that nucleic acid-guided nucleases can recognize is constrained by the need for a specific PAM to be located near the desired target sequence. As a result, it often can be difficult to target edits with the precision that is necessary for genome editing. It has been found that nucleases can recognize some PAMs very well (e.g., canonical PAMs), and other PAMs less well or poorly (e.g., non-canonical PAMs).
As for the nuclease or nickase-RT fusion component of the nucleic acid-guided nuclease editing system, a polynucleotide sequence encoding the nucleic acid-guided nuclease or nickase-RT fusion can be codon optimized for expression in particular cell types, such as archaeal, prokaryotic or eukaryotic cells. Eukaryotic cells can be yeast, fungi, algae, plant, animal, or human cells. Eukaryotic cells may be those of or derived from a particular organism, such as a mammal, including but not limited to human, mouse, rat, rabbit, dog, or non-human mammals including non-human primates. The choice of nucleic acid-guided nuclease or nickase-RT fusion to be employed depends on many factors, such as what type of edit is to be made in the target sequence and whether an appropriate PAM is located close to the desired target sequence.
Nucleases of use in the methods described herein include but are not limited to nickases engineered from nucleic acid-guided nucleases such as Cas9, Cas12/Cpf1, MAD2, MAD2007, MAD2017, MAD2019, MAD297, MAD298, MAD299, MAD7®, or other MADZYME®, variants thereof, and nuclease or nickase fusions thereof. Nickase-RT fusion enzymes typically comprise one or more CRISPR nucleic acid-guided nucleases, each engineered to nick one DNA strand in the target DNA rather than making a double-stranded cut, and the nickase portion(s) are fused to a reverse transcriptase. In certain aspects of the present methods, the nickase-RT fusion nicks both strands of the target locus, albeit where the two nicks are staggered rather than at the same position which would result in a double-stranded cut. As with the guide nucleic acid, the nucleases or nickases may be encoded by one or more DNA sequences on a vector (e.g., an engine vector or an editing or barcoding vector also comprising the editing and/or barcoding cassette) and be under the control of a promoter (including inducible or constitutive promoters), or the nickase-RT fusion may be delivered as a protein or RNA-protein complex.
In addition to a nucleic acid sequence encoding the gRNA and a nucleic acid sequence encoding the repair template, an editing cassette, barcoding cassette, or editing vector backbone may comprise one or more primer sites. The primer sites can be used to amplify the cassette or editing vector backbone by using oligonucleotide primers; for example, if the primer sites flank one or more of the other components of the cassette or editing vector backbone, e.g., the nucleic acid sequence encoding the gRNA and/or the nucleic acid sequence encoding the repair template.
Additionally, in some aspects, a vector encoding the nickase-RT fusion enzyme and/or the CF editing cassette further encodes one or more nuclear localization sequences (NLSs), such as about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs. In some aspects, the engineered nuclease comprises NLSs at or near the amino-terminus, NLSs at or near the carboxy-terminus, or a combination.
Improved Nucleic Acid-Guided Nickase/Reverse Transcriptase Fusion Editing and Tracking of Edits
Creating a library of genomic edits requires tracking (e.g., identification) of editing events. Traditionally, in order to track editing events during one or more rounds of nucleic acid-guided nuclease editing, lentivector-based barcodes or episomal components are introduced into the host cells along with the editing guide nucleic acids, donor DNA, and/or nucleases for integration into the cell genomes. However, random integration of lentivector-based systems may adversely affect phenotype-genotype reagents, and episomes are inefficient and have low establishment rates, leading to a loss in library diversity. The present disclosure addresses the deficiencies of these and other trackable integration technologies.
In particular, aspects of the present disclosure provide compositions of matter, methods and instruments for nucleic acid-guided nickase/reverse transcriptase fusion (“nickase-RT fusion”) editing of live cells using editing and barcoding cassettes, e.g., CREATE fusion editing and barcoding cassettes, each comprising a gRNA covalently linked to a repair template comprising an intended edit or barcode. The editing cassettes and barcoding cassettes are engineered to edit and barcode genomic DNA, respectively, at separate target loci, wherein each barcode may correspond with one or more editing events.
Thus, nickase-RT fusion editing events may be tracked on a one-to-one basis utilizing single cell genomic DNA or RNA sequencing methods to identify integrated barcodes. In such examples, each integrated barcode may serve as a proxy for one or more corresponding edits.
Utilizing the compositions and methods described herein, a single nickase-RT fusion enzyme may be used to facilitate the incorporation of a desired edit into a cell genome at a first target locus, as well as the integration of a barcode sequence into the genome at a second target locus. Further, a single barcode integration locus, e.g., a safe harbor locus, intronic region, or non-essential gene exon, once optimized, may enable consistent integration of multiple trackable barcodes, thereby facilitating tracking of multiple editing events that may target many different genomic loci via sequencing of the single barcode integration locus. Accordingly, sequencing of a single locus enables relative quantitation of design diversity and abundance without having to directly sequence the exact targeted locus or construct used in an edit-driving library.
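The relative quantitation described above can be sketched in Python as follows. This is a minimal illustration and not the disclosed analysis method: it assumes that sequencing reads from the single barcode integration locus have already been trimmed to the barcode sequence, that only exact matches to known barcodes are counted, and that the function name `barcode_abundance` is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def barcode_abundance(reads: Iterable[str],
                      known_barcodes: Set[str]) -> Dict[str, float]:
    """Return the fraction of recognized reads carrying each barcode.

    reads: barcode sequences extracted from reads of the integration locus.
    known_barcodes: the barcode sequences designed into the library.
    Reads that do not exactly match a known barcode are ignored.
    """
    counts = Counter(r.upper() for r in reads if r.upper() in known_barcodes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {barcode: n / total for barcode, n in counts.items()}
```

The number of distinct barcodes returned gives a measure of design diversity, and the fractions give relative abundance, in each case without sequencing the edited loci themselves.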
Looking at
Similarly, at 104, CF barcoding cassettes (e.g., a library of CF barcoding cassettes) are designed and synthesized, each cassette comprising a covalently-linked barcoding CFgRNA and repair template designed to incorporate a unique barcode into one or both DNA strands at a second target locus, which may be different from the first target locus. In other words, each CF barcoding cassette encodes a barcoding CFgRNA sequence and a repair template sequence to be reverse transcribed comprising a barcode sequence corresponding with an edit carried by the CF editing cassettes, as well as a PAM and/or spacer mutation(s). Once the CF barcoding cassettes have been synthesized, the individual cassettes may be amplified.
In certain aspects, the second target locus comprises a neutral integration site, or “safe spot,” that facilitates stable integration of the barcode sequence without significant impact on cell growth or function. In certain aspects, the second target locus is a safe harbor locus disposed centrally in a large intergenic or intronic region to reduce the potential of barcode sequence integration adversely affecting genes neighboring the integrated barcode.
In certain aspects, the integration locus of the barcode sequence is selected based on inspection of a host GFP fusion localization database. In further aspects, the second target locus is adjacent to the first target locus, or within close proximity to the first target locus.
At 106, a nickase-RT fusion enzyme is designed. As described above, the nickase-RT fusion enzyme comprises, in order from amino terminus to carboxy terminus, or from carboxy terminus to amino terminus, a nucleic acid-guided nickase and a reverse transcriptase. The nickase-RT fusion enzyme may be delivered to the cells as a coding sequence in a vector (in some aspects under the control of an inducible promoter), such as the same or different vector as the CF editing cassette and/or CF barcoding cassette, or the nickase-RT fusion enzyme may be delivered to the cells as a protein or protein complex.
In method 100, the nickase-RT fusion enzyme is delivered to the cells via a coding sequence in an editing vector further comprising at least the CF editing cassette.
At 108, the CF editing cassettes, the CF barcoding cassettes, and/or the nickase-RT fusion enzymes are assembled with vector backbones, such as plasmid backbones, to create editing vectors, e.g., a library of editing vectors. In certain aspects, a CF editing cassette, CF barcoding cassette, and nickase-RT fusion enzyme are assembled together on a single editing vector. An example of an editing vector comprising all the aforementioned components is illustrated in
At 110, the engine and editing vectors are introduced into the live cells. A variety of delivery systems may be used to introduce (e.g., transform, transfect, or transduce) nucleic acid-guided nickase fusion editing system components into a host cell 110. These delivery systems include the use of yeast systems, lipofection systems, microinjection systems, biolistic systems, virosomes, liposomes, immunoliposomes, polycations, lipid:nucleic acid conjugates, virions, artificial virions, viral vectors, electroporation, cell permeable peptides, nanoparticles, nanowires, and exosomes. Alternatively, molecular trojan horse liposomes may be used to deliver nucleic acid-guided nuclease components across the blood brain barrier. Of particular interest is the use of electroporation, particularly flow-through electroporation (either as a stand-alone instrument or as a module in an automated multi-module system) as described in, e.g., U.S. Pat. No. 10,253,316, issued 9 May 2019; U.S. Pat. No. 10,329,559, issued 25 Jun. 2019; U.S. Pat. No. 10,323,242, issued 18 Jun. 2019; U.S. Pat. No. 10,421,959, issued 24 Sep. 2019; U.S. Pat. No. 10,465,185, issued 5 Nov. 2019; U.S. Pat. No. 10,519,437, issued 31 Dec. 2019; U.S. Pat. No. 10,584,333, issued 10 Mar. 2020; U.S. Pat. No. 10,584,334, issued 10 Mar. 2020; U.S. Pat. No. 10,647,982, issued 12 May 2020; U.S. Pat. No. 10,689,645, issued 23 Jun. 2020; U.S. Pat. No. 10,738,301, issued 11 Aug. 2020; U.S. Pat. No. 10,738,663, issued 29 Sep. 2020; and U.S. Pat. No. 10,894,958, issued 19 Jan. 2021.
Once transformed 110, the next steps in method 100 include providing conditions for nucleic acid-guided nuclease editing 112 and for barcoding 114. “Providing conditions” includes incubation of the cells in appropriate medium and may also include providing conditions to induce transcription of an inducible promoter (e.g., adding antibiotics, adding inducers, increasing temperature) for transcription of a CF editing cassette, CF barcoding cassette, and/or nickase-RT fusion enzyme. In certain aspects, the conditions for editing 112 and for genomic integration of the barcode 114 are the same and thus, these steps are performed simultaneously. In certain aspects, the conditions for editing 112 and for genomic integration of the barcode 114 are different (e.g., the barcoding CFgRNA of the CF barcoding cassette may be under the control of a different inducible promoter than other components of the editing system), and these steps may be performed either simultaneously or in sequence.
Once editing and barcoding are complete, the cells are allowed to recover and are preferably enriched for cells that have been edited and/or cells in which the barcode has integrated into the genome 116. Enrichment can be performed directly, such as by selecting cells from the population that express a selectable marker, or by using surrogates, e.g., cell surface handles co-introduced with one or more components of the editing system. At this point in method 100, the cells can be characterized phenotypically or genotypically or, optionally, steps 102 to 114 or steps 110 to 114 may be repeated to make additional trackable edits 118 in recursive or iterative editing rounds. In certain aspects, steps 102 to 114 are repeated to create or construct a defined combination of edits or a combinatorial library.
After recovery and enrichment of edited cells, the genomic DNA or RNA transcripts of the cells may be sequenced to track or analyze the editing events 120, wherein the integrated barcode(s) serve as accurate proxies for corresponding edits. For example, the cells may be lysed and DNA or RNA extracted, purified, amplified, prepared into libraries, and sequenced to track for integrated barcodes. A simplified graphic depiction of step 120 is depicted in
At right in
At this stage, one DNA strand contains the edit while the second DNA strand does not. A mismatch repair or DNA replication process is likely responsible for copying the edit into both strands. Note that DNA replication and mismatch repair can also favor the wt strand as opposed to the edited strand. If the flap equilibration favors the wt (wildtype) 5′ flap, the newly synthesized flap is likely degraded and sealed in the same manner described above.
Although described with reference to a CF editing cassette, the mechanism depicted in
In certain aspects, the single-vector system further includes a selectable marker to facilitate enrichment or selection of cells successfully transformed with the vector, e.g., while performing the method 100. Accordingly, the selectable marker may be used to “tag” and enrich for transformation events, and may also be under the transcriptional control of a promoter. Examples of suitable markers include antibiotic resistance genes and fluorescent proteins. In
Similar to
Similarly, the CF barcoding cassette comprises from 5′ to 3′ an optional GG transcription initiation sequence (denoted “GG”); a barcoding CFgRNA configured to target a second target locus and comprising from 5′ to 3′ a spacer region (denoted “SR”) having complementarity to the second target locus and a scaffold (denoted “CR”); the repair template comprising from 5′ to 3′ an optional post-barcode homology region (denoted “PBH”), a barcode sequence (denoted “BC”), a nick-to-barcode region (denoted “NB”), and a primer binding site (denoted “PBS”); an RNA G-quadruplex region (denoted “QG”); and a PolyT transcription terminator (denoted “TT”).
In such examples, the barcode sequence may comprise a frameshifting edit (bottom of
In some implementations, the reagent cartridges 210 are disposable kits comprising reagents and cells for use in the automated multi-module cell processing/editing instrument 200. For example, a user may open and position each of the reagent cartridges 210 comprising various desired inserts and reagents within the chassis of the automated multi-module cell editing instrument 200 prior to activating cell processing. Further, each of the reagent cartridges 210 may be inserted into receptacles in the chassis having different temperature zones appropriate for the reagents contained therein.
Also illustrated in
Inserts or components of the reagent cartridges 210, in some implementations, are marked with machine-readable indicia (not shown), such as bar codes, for recognition by the robotic liquid handling system 258. For example, the robotic liquid handling system 258 may scan one or more inserts within each of the reagent cartridges 210 to confirm contents. In other implementations, machine-readable indicia may be marked upon each reagent cartridge 210, and a processing system (not shown, but see element 237 of
Inside the chassis 290, in some implementations, will be most or all of the components described in relation to
The drive engagement mechanism 312 engages with a motor (not shown) to rotate the vial. In some aspects, the motor drives the drive engagement mechanism 312 such that the rotating growth vial 300 is rotated in one direction only, and in other aspects, the rotating growth vial 300 is rotated in a first direction for a first amount of time or periodicity, rotated in a second direction (e.g., the opposite direction) for a second amount of time or periodicity, and this process may be repeated so that the rotating growth vial 300 (and the cell culture contents) are subjected to an oscillating motion. Further, the choice of whether the culture is subjected to oscillation and the periodicity therefor may be selected by the user. The first amount of time and the second amount of time may be the same or may be different. The amount of time may be 1, 2, 3, 4, 5, or more seconds, or may be 1, 2, 3, 4 or more minutes. In another aspect, in an early stage of cell growth the rotating growth vial 300 may be oscillated at a first periodicity (e.g., every 60 seconds), and then at a later stage of cell growth the rotating growth vial 300 may be oscillated at a second periodicity (e.g., every one second) different from the first periodicity.
The rotating growth vial 300 may be reusable or, preferably, the rotating growth vial is consumable. In some aspects, the rotating growth vial is consumable and is presented to the user pre-filled with growth medium, where the vial is hermetically sealed at the open end 304 with a foil seal. A medium-filled rotating growth vial packaged in such a manner may be part of a kit for use with a stand-alone cell growth device or with a cell growth module that is part of an automated multi-module cell processing system. To introduce cells into the vial, a user need only pipette up a desired volume of cells and use the pipette tip to punch through the foil seal of the vial. Open end 304 may optionally include an extended lip 302 to overlap and engage with the cell growth device. In automated systems, the rotating growth vial 300 may be tagged with a barcode or other identifying means that can be read by a scanner or camera (not shown) that is part of the automated system.
The volume of the rotating growth vial 300 and the volume of the cell culture (including growth medium) may vary greatly, but the volume of the rotating growth vial 300 must be large enough to generate a specified total number of cells. In practice, the volume of the rotating growth vial 300 may be between 1 and 250 mL, between 2 and 100 mL, between 5 and 80 mL, between 10 and 50 mL, or between 12 and 35 mL. Likewise, the volume of the cell culture (cells+growth media) should be appropriate to allow proper aeration and mixing in the rotating growth vial 300. Proper aeration promotes uniform cellular respiration within the growth media. Thus, the volume of the cell culture should be approximately 5-85% of the volume of the growth vial or between 20% and 60% of the volume of the growth vial. For example, for a 30 mL growth vial, the volume of the cell culture would be from about 1.5 mL to about 26 mL, or from 6 mL to about 18 mL.
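The fill-fraction arithmetic above can be checked with a short sketch. The function name and its parameters are illustrative assumptions and are not part of the disclosure; only the 5-85% and 20-60% fractions and the 30 mL example come from the text.

```python
def culture_volume_range_ml(vial_volume_ml, low_fraction=0.05, high_fraction=0.85):
    """Return (minimum, maximum) culture volume in mL for a given vial volume.

    The default fractions are the broad 5-85% range stated in the text;
    the function itself is a hypothetical helper for illustration only.
    """
    if vial_volume_ml <= 0:
        raise ValueError("vial volume must be positive")
    if not 0 <= low_fraction <= high_fraction <= 1:
        raise ValueError("fractions must satisfy 0 <= low <= high <= 1")
    return (vial_volume_ml * low_fraction, vial_volume_ml * high_fraction)


# Broad range (5-85%) and narrower range (20-60%) for a 30 mL vial.
broad_range = culture_volume_range_ml(30.0)
narrow_range = culture_volume_range_ml(30.0, 0.20, 0.60)
```

For the 30 mL vial, the broad range evaluates to 1.5 mL and 25.5 mL (stated in the text as about 26 mL), and the narrower range to 6 mL and 18 mL.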
The rotating growth vial 300 preferably is fabricated from a bio-compatible optically transparent material, or at least the portion of the vial comprising the light path(s) is transparent. Additionally, material from which the rotating growth vial is fabricated should be able to be cooled to about 4° C. or lower and heated to about 55° C. or higher to accommodate both temperature-based cell assays and long-term storage at low temperatures. Further, the material that is used to fabricate the vial must be able to withstand temperatures up to 55° C. without deformation while spinning. Suitable materials include cyclic olefin copolymer (COC), glass, polyvinyl chloride, polyethylene, polyamide, polypropylene, polycarbonate, poly(methyl methacrylate) (PMMA), polysulfone, polyurethane, and co-polymers of these and other polymers. Preferred materials include polypropylene, polycarbonate, or polystyrene. In some aspects, the rotating growth vial is inexpensively fabricated by, e.g., injection molding or extrusion.
The motor 338 engages with drive mechanism 312 and is used to rotate the rotating growth vial 300. In some aspects, motor 338 is a brushless DC type drive motor with built-in drive controls that can be set to hold a constant revolution per minute (RPM) between 0 and about 3000 RPM. Alternatively, other motor types such as a stepper, servo, brushed DC, and the like can be used. Optionally, the motor 338 may also have direction control to allow reversing of the rotational direction, and a tachometer to sense and report actual RPM. The motor is controlled by a processor (not shown) according to, e.g., standard protocols programmed into the processor and/or user input, and the motor may be configured to vary RPM to cause axial precession of the cell culture thereby enhancing mixing, e.g., to prevent cell aggregation, increase aeration, and optimize cellular respiration.
Main housing 336, end housings 352 and lower housing 332 of the cell growth device 330 may be fabricated from any suitable, robust material including aluminum, stainless steel, and other thermally conductive materials, including plastics. These structures or portions thereof can be created through various techniques, e.g., metal fabrication, injection molding, creation of structural layers that are fused, etc. Whereas the rotating growth vial 300 is envisioned in some aspects to be reusable, though preferably consumable, the other components of the cell growth device 330 are preferably reusable and function as a stand-alone benchtop device or as a module in a multi-module cell processing system.
The processor (not shown) of the cell growth device 330 may be programmed with information to be used as a “blank” or control for the growing cell culture. A “blank” or control is a vessel containing cell growth medium only, which yields 100% transmittance and 0 OD (optical density), while the cell sample will deflect light rays and will have a lower percent transmittance and higher OD. As the cells grow in the media and become denser, transmittance will decrease and OD will increase. The processor (not shown) of the cell growth device 330 may be programmed to use wavelength values for blanks commensurate with the growth media typically used in cell culture (whether, e.g., mammalian cells, bacterial cells, animal cells, yeast cells, etc.). Alternatively, a second spectrophotometer and vessel may be included in the cell growth device 330, where the second spectrophotometer is used to read a blank at designated intervals.
In use, cells are inoculated (cells can be pipetted, e.g., from an automated liquid handling system or by a user) into pre-filled growth media of a rotating growth vial 300 by piercing through the foil seal or film. The programmed software of the cell growth device 330 sets the control temperature for growth, typically 30° C., then slowly starts the rotation of the rotating growth vial 300. The cell/growth media mixture slowly moves vertically up the wall due to centrifugal force allowing the rotating growth vial 300 to expose a large surface area of the mixture to a normal oxygen environment. The growth monitoring system takes either continuous readings of the OD or OD measurements at pre-set or pre-programmed time intervals. These measurements are stored in internal memory and if requested the software plots the measurements versus time to display a growth curve. If enhanced mixing is required, e.g., to optimize growth conditions, the speed of the vial rotation can be varied to cause an axial precession of the liquid, and/or a complete directional change can be performed at programmed intervals. The growth monitoring can be programmed to automatically terminate the growth stage at a pre-determined OD, and then quickly cool the mixture to a lower temperature to inhibit further growth.
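The automatic termination of the growth stage at a pre-determined OD can be illustrated with a minimal sketch. The function name, the (time, OD) data layout, and the sample readings are hypothetical and chosen only for illustration; the disclosure does not specify how the software is implemented.

```python
def first_time_at_target_od(readings, target_od):
    """Return the time of the first reading whose OD meets or exceeds target_od.

    `readings` is an iterable of (time_in_seconds, od) pairs in chronological
    order. Returns None if the target OD is never reached. This is a
    hypothetical helper, not the device's actual software.
    """
    for time_s, od in readings:
        if od >= target_od:
            return time_s
    return None


# Made-up readings taken every 60 seconds, for illustration only.
example_readings = [(0, 0.05), (60, 0.08), (120, 0.21), (180, 0.43), (240, 0.52)]
stop_time = first_time_at_target_od(example_readings, target_od=0.4)
```

In this sketch the growth stage would be terminated at the reading taken at 180 seconds, after which the mixture would be cooled as described above.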
One application for the cell growth device 330 is to constantly measure the optical density of a growing cell culture. One advantage of the described cell growth device is that optical density can be measured continuously (kinetic monitoring) or at specific time intervals; e.g., every 5, 10, 15, 20, 30, 45, or 60 seconds, or every 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 minutes. While the cell growth device 330 has been described in the context of measuring the OD of a growing cell culture, it should, however, be understood by a skilled artisan given the teachings of the present specification that other cell growth parameters can be measured in addition to or instead of cell culture OD. As with optional measure of cell growth in relation to the solid wall device or module described supra, spectroscopy using visible, ultraviolet (UV), or near infrared (NIR) light allows monitoring the concentration of nutrients and/or wastes in the cell culture and other spectroscopic measurements may be made; that is, other spectral properties can be measured via, e.g., dielectric impedance spectroscopy, visible fluorescence, fluorescence polarization, or luminescence. Additionally, the cell growth device 330 may include additional sensors for measuring, e.g., dissolved oxygen, carbon dioxide, pH, conductivity, and the like. For additional details regarding rotating growth vials and cell growth devices see U.S. Pat. Nos. 10,435,662; 10,443,031; 10,590,375; and 10,717,959.
As described above in relation to the rotating growth vial and cell growth module, in order to obtain an adequate number of cells for transformation or transfection, cells typically are grown to a specific optical density in medium appropriate for the growth of the cells of interest; however, for effective transformation or transfection, it is desirable to decrease the volume of the cells as well as render the cells competent via buffer or medium exchange. Thus, one sub-component or module that is desired in cell processing systems to perform the methods described herein is a module or component that can grow, perform buffer exchange, and/or concentrate cells and render them competent so that they may be transformed or transfected with the nucleic acids needed for engineering or editing the cell's genome.
Permeate/filtrate member 420 is seen in the middle of
On the left of
A membrane or filter is disposed between the retentate and permeate members, where fluids can flow through the membrane but cells cannot and are thus retained in the flow channel disposed in the retentate member. Filters or membranes appropriate for use in the TFF device/module are those that are solvent resistant, are contamination free during filtration, and are able to retain the types and sizes of cells of interest. For example, in order to retain small cell types such as bacterial cells, pore sizes can be as low as 0.2 μm, however for other cell types, the pore sizes can be as high as 20 μm. Indeed, the pore sizes useful in the TFF device/module include filters with sizes from 0.20 μm, 0.21 μm, 0.22 μm, 0.23 μm, 0.24 μm, 0.25 μm, 0.26 μm, 0.27 μm, 0.28 μm, 0.29 μm, 0.30 μm, 0.31 μm, 0.32 μm, 0.33 μm, 0.34 μm, 0.35 μm, 0.36 μm, 0.37 μm, 0.38 μm, 0.39 μm, 0.40 μm, 0.41 μm, 0.42 μm, 0.43 μm, 0.44 μm, 0.45 μm, 0.46 μm, 0.47 μm, 0.48 μm, 0.49 μm, 0.50 μm and larger. The filters may be fabricated from any suitable non-reactive material including cellulose mixed ester (cellulose nitrate and acetate) (CME), polycarbonate (PC), polyvinylidene fluoride (PVDF), polyethersulfone (PES), polytetrafluoroethylene (PTFE), nylon, glass fiber, or metal substrates as in the case of laser or electrochemical etching.
The length of the channel structure 402 may vary depending on the volume of the cell culture to be grown and the optical density of the cell culture to be concentrated. The length of the channel structure typically is between 60 mm and 300 mm, or between 70 mm and 200 mm, or between 80 mm and 100 mm. The cross-section configuration of the flow channel 402 may be round, elliptical, oval, square, rectangular, trapezoidal, or irregular. If square, rectangular, or another shape with generally straight sides, the cross section may be between about 10 μm and 1000 μm wide, or between 200 μm and 800 μm wide, or between 300 μm and 700 μm wide, or between 400 μm and 600 μm wide; and between about 10 μm and 1000 μm high, or between 200 μm and 800 μm high, or between 300 μm and 700 μm high, or between 400 μm and 600 μm high. If the cross section of the flow channel 402 is generally round, oval or elliptical, the radius of the channel may be between about 50 μm and 1000 μm in hydraulic radius, or between 5 μm and 800 μm in hydraulic radius, or between 200 μm and 700 μm in hydraulic radius, or between 300 μm and 600 μm in hydraulic radius, or between about 200 and 500 μm in hydraulic radius.
Moreover, the volume of the channel in the retentate 422 and permeate 420 members may be different depending on the depth of the channel in each member.
The TFF device may be fabricated from any robust material in which channels (and channel branches) may be milled including stainless steel, silicon, glass, aluminum, or plastics including cyclic-olefin copolymer (COC), cyclo-olefin polymer (COP), polystyrene, polyvinyl chloride, polyethylene, polyamide, polypropylene, acrylonitrile butadiene, polycarbonate, polyetheretherketone (PEEK), poly(methyl methacrylate) (PMMA), polysulfone, and polyurethane, and co-polymers of these and other polymers. If the TFF device/module is disposable, preferably it is made of plastic. In some aspects, the material used to fabricate the TFF device/module is thermally-conductive so that the cell culture may be heated or cooled to a desired temperature. In certain aspects, the TFF device is formed by precision mechanical machining, laser machining, electro discharge machining (for metal devices); wet or dry etching (for silicon devices); dry or wet etching, powder or sandblasting, photostructuring (for glass devices); or thermoforming, injection molding, hot embossing, or laser machining (for plastic devices) using the materials mentioned above that are amenable to these mass production techniques.
The overall work flow for cell growth comprises loading a cell culture to be grown into a first retentate reservoir, optionally bubbling air or an appropriate gas through the cell culture, passing or flowing the cell culture through the first retentate port then tangentially through the TFF channel structure while collecting medium or buffer through one or both of the permeate ports 406, collecting the cell culture through a second retentate port 404 into a second retentate reservoir, optionally adding additional or different medium to the cell culture and optionally bubbling air or gas through the cell culture, then repeating the process, all while measuring, e.g., the optical density of the cell culture in the retentate reservoirs continuously or at desired intervals. Measurements of optical densities at programmed time intervals are accomplished using a 600 nm Light Emitting Diode (LED) that has been collimated through an optic into the retentate reservoir(s) containing the growing cells. The light continues through a collection optic to the detection system which consists of a (digital) gain-controlled silicon photodiode. Generally, optical density is shown as the absolute value of the logarithm with base 10 of the power transmission factors of an optical attenuator: OD = −log10(Power out/Power in). Since OD is the measure of optical attenuation (that is, the sum of absorption, scattering, and reflection), the TFF device OD measurement records the overall power transmission, so as the cells grow and become denser in population, the OD (the loss of signal) increases. The OD system is pre-calibrated against OD standards with these values stored in an on-board memory accessible by the measurement program.
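The relationship between transmitted power and optical density stated above can be sketched as follows. The function names are assumptions made for illustration; only the formula, the negative base-10 logarithm of power out over power in, is taken from the text, and the sketch does not model the calibration against OD standards.

```python
import math


def optical_density(power_out, power_in):
    """Compute OD as -log10(power_out / power_in) for positive power values."""
    if power_in <= 0 or power_out <= 0:
        raise ValueError("power values must be positive")
    return -math.log10(power_out / power_in)


def percent_transmittance(od):
    """Invert the OD formula to recover percent transmittance."""
    return 100.0 * 10.0 ** (-od)


# A blank transmits all of the light, so its OD is 0 (100% transmittance).
blank_od = optical_density(1.0, 1.0)
# A culture that transmits one tenth of the incident power has an OD of 1.
dense_od = optical_density(0.1, 1.0)
```

As the culture becomes denser, less power reaches the photodiode, the ratio falls, and the computed OD rises, which matches the behavior described for the growing cell culture.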
In the channel structure, the membrane bifurcating the flow channels retains the cells on one side of the membrane (the retentate side 422) and allows unwanted medium or buffer to flow across the membrane into a filtrate or permeate side (e.g., permeate member 420) of the device. Bubbling air or other appropriate gas through the cell culture both aerates and mixes the culture to enhance cell growth. During the process, medium that is removed during the flow through the channel structure is removed through the permeate/filtrate ports 406. Alternatively, cells can be grown in one reservoir with bubbling or agitation without passing the cells through the TFF channel from one reservoir to the other.
The overall work flow for cell concentration using the TFF device/module involves flowing a cell culture or cell sample tangentially through the channel structure. As with the cell growth process, the membrane bifurcating the flow channels retains the cells on one side of the membrane and allows unwanted medium or buffer to flow across the membrane into a permeate/filtrate side (e.g., permeate member 420) of the device. In this process, a fixed volume of cells in medium or buffer is driven through the device until the cell sample is collected into one of the retentate ports 404, and the medium/buffer that has passed through the membrane is collected through one or both of the permeate/filtrate ports 406. All types of prokaryotic and eukaryotic cells—both adherent and non-adherent cells—can be grown in the TFF device. Adherent cells may be grown on beads or other cell scaffolds suspended in medium that flow through the TFF device.
The medium or buffer used to suspend the cells in the cell concentration device/module may be any suitable medium or buffer for the type of cells being transformed or transfected, such as LB, SOC, TPD, YPG, YPAD, MEM, DMEM, IMDM, RPMI, Hanks', PBS and Ringer's solution, where the media may be provided in a reagent cartridge as part of a kit. For culture of adherent cells, cells may be disposed on beads, microcarriers, or other type of scaffold suspended in medium. Most normal mammalian tissue-derived cells—except those derived from the hematopoietic system—are anchorage dependent and need a surface or cell culture support for normal proliferation. In the rotating growth vial described herein, microcarrier technology is leveraged. Microcarriers of particular use typically have a diameter of 100-300 μm and have a density slightly greater than that of the culture medium (thus facilitating an easy separation of cells and medium for, e.g., medium exchange) yet the density must also be sufficiently low to allow complete suspension of the carriers at a minimum stirring rate in order to avoid hydrodynamic damage to the cells. Many different types of microcarriers are available, and different microcarriers are optimized for different types of cells. There are positively charged carriers, such as Cytodex 1 (dextran-based, GE Healthcare), DE-52 (cellulose-based, Sigma-Aldrich Labware), DE-53 (cellulose-based, Sigma-Aldrich Labware), and HLX 11-170 (polystyrene-based); collagen- or ECM-(extracellular matrix) coated carriers, such as Cytodex 3 (dextran-based, GE Healthcare) or HyQ-sphere Pro-F 102-4 (polystyrene-based, Thermo Scientific); non-charged carriers, like HyQ-sphere P 102-4 (Thermo Scientific); or macroporous carriers based on gelatin (Cultisphere, Percell Biolytica) or cellulose (Cytopore, GE Healthcare).
In both the cell growth and concentration processes, passing the cell sample through the TFF device and collecting the cells in one of the retentate ports 404 while collecting the medium in one of the permeate/filtrate ports 406 is considered “one pass” of the cell sample. The transfer between retentate reservoirs “flips” the culture. The retentate and permeate ports collecting the cells and medium, respectively, for a given pass reside on the same end of TFF device/module with fluidic connections arranged so that there are two distinct flow layers for the retentate and permeate/filtrate sides, but if the retentate port 404 resides on the retentate member of device/module (that is, the cells are driven through the channel above the membrane and the filtrate (medium) passes to the portion of the channel below the membrane), the permeate/filtrate port 406 will reside on the permeate member of device/module and vice versa (that is, if the cell sample is driven through the channel below the membrane, the filtrate (medium) passes to the portion of the channel above the membrane). Due to the high pressures used to transfer the cell culture and fluids through the flow channel of the TFF device, the effect of gravity is negligible.
At the conclusion of a “pass” in either of the growth and concentration processes, the cell sample is collected by passing through the retentate port 404 and into the retentate reservoir (not shown). To initiate another “pass”, the cell sample is passed again through the TFF device, this time in a flow direction that is reversed from the first pass. The cell sample is collected by passing through the retentate port 404 and into retentate reservoir (not shown) on the opposite end of the device/module from the retentate port 404 that was used to collect cells during the first pass. Likewise, the medium/buffer that passes through the membrane on the second pass is collected through the permeate port 406 on the opposite end of the device/module from the permeate port 406 that was used to collect the filtrate during the first pass, or through both ports. This alternating process of passing the retentate (the concentrated cell sample) through the device/module is repeated until the cells have been grown to a desired optical density, and/or concentrated to a desired volume, and both permeate ports (e.g., if there are more than one) can be open during the passes to reduce operating time. In addition, buffer exchange may be effected by adding a desired buffer (or fresh medium) to the cell sample in the retentate reservoir, before initiating another “pass”, and repeating this process until the old medium or buffer is diluted and filtered out and the cells reside in fresh medium or buffer. Note that buffer exchange and cell growth may (and typically do) take place simultaneously, and buffer exchange and cell concentration may (and typically do) take place simultaneously. For further information and alternative aspects on TFFs see, e.g., U.S. Ser. Nos. 62/728,365, filed 7 Sep. 2018; 62/857,599, filed 5 Jun. 2019; and 62/867,415, filed 27 Jun. 2019.
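The dilution of old medium over repeated buffer-exchange passes can be estimated with a simple model. This sketch rests on assumptions that are not stated in the disclosure: each pass concentrates the sample to a fixed volume, fresh buffer then restores a fixed total volume, mixing is ideal, and the old medium passes freely through the membrane. The function names and the example volumes are hypothetical.

```python
def residual_fraction(concentrated_ml, refilled_ml, passes):
    """Fraction of the original medium remaining after a number of exchange passes.

    Idealized model: every pass reduces the old medium by the ratio
    concentrated_ml / refilled_ml.
    """
    if not 0 < concentrated_ml <= refilled_ml:
        raise ValueError("need 0 < concentrated_ml <= refilled_ml")
    if passes < 0:
        raise ValueError("passes must be non-negative")
    return (concentrated_ml / refilled_ml) ** passes


def passes_needed(concentrated_ml, refilled_ml, target_fraction):
    """Smallest number of passes that brings the old medium to target_fraction or less."""
    if not 0 < target_fraction <= 1:
        raise ValueError("target_fraction must be in (0, 1]")
    if not 0 < concentrated_ml < refilled_ml:
        raise ValueError("need 0 < concentrated_ml < refilled_ml for any dilution")
    passes, fraction = 0, 1.0
    while fraction > target_fraction:
        fraction *= concentrated_ml / refilled_ml
        passes += 1
    return passes


# Illustrative volumes only: concentrate to 10 mL, refill to 50 mL each pass.
remaining_after_three = residual_fraction(10.0, 50.0, 3)
passes_for_one_percent = passes_needed(10.0, 50.0, 0.01)
```

Under these assumed volumes, three passes leave about 0.8% of the original medium, so three passes suffice to reach a 1% target. Real exchange efficiency would depend on the membrane, the cells, and the medium components.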
In one aspect, the reagent receptacles or reservoirs 504 of reagent cartridge 500 are configured to hold various size tubes, including, e.g., 250 mL tubes, 25 mL tubes, 10 mL tubes, 5 mL tubes, and Eppendorf or microcentrifuge tubes. In yet another aspect, all reservoirs may be configured to hold the same size tube, e.g., 5 mL tubes, and reservoir inserts may be used to accommodate smaller tubes in the reagent reservoir. In yet another aspect, particularly in an aspect where the reagent cartridge is disposable, the reagent reservoirs hold reagents without inserted tubes. In this disposable aspect, the reagent cartridge may be part of a kit, where the reagent cartridge is pre-filled with reagents and the receptacles or reservoirs sealed with, e.g., foil, heat seal acrylic or the like and presented to a consumer where the reagent cartridge can then be used in an automated multi-module cell processing instrument. As one of ordinary skill in the art will appreciate given the present disclosure, the reagents contained in the reagent cartridge will vary depending on work flow; that is, the reagents will vary depending on the processes to which the cells are subjected in the automated multi-module cell processing instrument, e.g., protein production, cell transformation and culture, cell editing, etc.
Reagents such as cell samples, enzymes, buffers, nucleic acid vectors, expression cassettes, proteins or peptides, reaction components (such as, e.g., MgCl2, dNTPs, nucleic acid assembly reagents, gap repair reagents, and the like), wash solutions, ethanol, and magnetic beads for nucleic acid purification and isolation, etc. may be positioned in the reagent cartridge at a known position. In some aspects of cartridge 500, the cartridge comprises a script (not shown) readable by a processor (not shown) for dispensing the reagents. Also, the cartridge 500 as one component in an automated multi-module cell processing instrument may comprise a script specifying two, three, four, five, ten or more processes to be performed by the automated multi-module cell processing instrument. In certain aspects, the reagent cartridge is disposable and is pre-packaged with reagents tailored to performing specific cell processing protocols, e.g., genome editing or protein production. Because the reagent cartridge contents vary while components/modules of the automated multi-module cell processing instrument or system may not, the script associated with a particular reagent cartridge matches the reagents used and cell processes performed. Thus, e.g., reagent cartridges may be pre-packaged with reagents for genome editing and a script that specifies the process steps for performing genome editing in an automated multi-module cell processing instrument, or, e.g., reagents for protein expression and a script that specifies the process steps for performing protein expression in an automated multi-module cell processing instrument.
For example, the reagent cartridge may comprise a script to pipette competent cells from a reservoir, transfer the cells to a transformation module, pipette a nucleic acid solution comprising a vector with expression cassette from another reservoir in the reagent cartridge, transfer the nucleic acid solution to the transformation module, initiate the transformation process for a specified time, then move the transformed cells to yet another reservoir in the reagent cartridge or to another module such as a cell growth module in the automated multi-module cell processing instrument. In another example, the reagent cartridge may comprise a script to transfer a nucleic acid solution comprising a vector from a reservoir in the reagent cartridge, a nucleic acid solution comprising editing oligonucleotide cassettes from a reservoir in the reagent cartridge, and a nucleic acid assembly mix from another reservoir to the nucleic acid assembly/desalting module, if present. The script may also specify process steps performed by other modules in the automated multi-module cell processing instrument. For example, the script may specify that the nucleic acid assembly/desalting reservoir be heated to 50° C. for 30 minutes to generate an assembled product; and desalting and resuspension of the assembled product via magnetic bead-based nucleic acid purification involving a series of pipette transfers and mixing of magnetic beads, ethanol wash, and buffer.
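One way to picture such a script is as an ordered list of steps that a processor reads and dispatches to modules. The sketch below is purely illustrative: the Step type, the module and action names, and the helper functions are hypothetical and do not describe the actual script format of the instrument. Only the sequence of transfers and the 50° C., 30 minute assembly step come from the text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Step:
    """One hypothetical script step: which module acts, what it does, and with what settings."""
    module: str
    action: str
    params: dict = field(default_factory=dict)


# Hypothetical encoding of the nucleic acid assembly example described above.
ASSEMBLY_SCRIPT = [
    Step("liquid_handler", "transfer", {"reagent": "vector", "to": "assembly_module"}),
    Step("liquid_handler", "transfer", {"reagent": "editing cassettes", "to": "assembly_module"}),
    Step("liquid_handler", "transfer", {"reagent": "assembly mix", "to": "assembly_module"}),
    Step("assembly_module", "heat", {"temperature_c": 50, "minutes": 30}),
    Step("assembly_module", "desalt", {"method": "magnetic beads"}),
]


def unknown_modules(script, available_modules):
    """Return the sorted names of modules a script needs that the instrument lacks."""
    return sorted({step.module for step in script} - set(available_modules))


def total_timed_minutes(script):
    """Sum the durations of all steps that declare a 'minutes' parameter."""
    return sum(step.params.get("minutes", 0) for step in script)


missing = unknown_modules(ASSEMBLY_SCRIPT, ["liquid_handler", "assembly_module"])
timed_minutes = total_timed_minutes(ASSEMBLY_SCRIPT)
```

Keeping the steps as data rather than code reflects the idea in the text that the script travels with the cartridge and matches its reagents, while the modules of the instrument stay the same.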
As described in relation to
Electrical stimulation may also be used for cell fusion in the production of hybridomas or other fused cells. During a typical electroporation procedure, cells are suspended in a buffer or medium that is favorable for cell survival. For bacterial cell electroporation, low conductance media, such as water, glycerol solutions and the like, are often used to reduce the heat production by transient high current. In traditional electroporation devices, the cells and material to be electroporated into the cells (collectively “the cell sample”) are placed in a cuvette embedded with two flat electrodes for electrical discharge. For example, Bio-Rad (Hercules, Calif.) makes the GENE PULSER XCELL™ line of products to electroporate cells in cuvettes. Traditionally, electroporation requires high field strength; however, the flow-through electroporation devices included in the reagent cartridges achieve high efficiency cell electroporation with low toxicity. The reagent cartridges of the disclosure allow for particularly easy integration with robotic liquid handling instrumentation, such as air displacement pipettors, that is typically used in automated instruments and systems. Such automated instrumentation includes, but is not limited to, off-the-shelf automated liquid handling systems from Tecan (Männedorf, Switzerland), Hamilton (Reno, NV), Beckman Coulter (Fort Collins, CO), etc.
Additional details of the FTEP devices are illustrated in
In the FTEP devices of the disclosure, the toxicity level of the transformation results in greater than 30% viable cells after electroporation, preferably greater than 35%, 40%, 45%, 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95% or even 99% viable cells following transformation, depending on the cell type and the nucleic acids being introduced into the cells.
The housing of the FTEP device can be made from many materials depending on whether the FTEP device is to be reused, autoclaved, or is disposable, including stainless steel, silicon, glass, resin, polyvinyl chloride, polyethylene, polyamide, polystyrene, polypropylene, acrylonitrile butadiene, polycarbonate, polyetheretherketone (PEEK), polysulfone, polyurethane, and co-polymers of these and other polymers. Similarly, the walls of the channels in the device can be made of any suitable material including silicone, resin, glass, glass fiber, polyvinyl chloride, polyethylene, polyamide, polypropylene, acrylonitrile butadiene, polycarbonate, polyetheretherketone (PEEK), polysulfone, polyurethane, and co-polymers of these and other polymers. Preferred materials include crystal styrene, cyclo-olefin polymer (COP) and cyclic olefin co-polymers (COC), which allow the device to be formed entirely by injection molding in one piece with the exception of the electrodes and, e.g., a bottom sealing film if present.
The FTEP devices described herein (or portions of the FTEP devices) can be created or fabricated via various techniques, e.g., as entire devices or by creation of structural layers that are fused or otherwise coupled. For example, for metal FTEP devices, fabrication may include precision mechanical machining or laser machining; for silicon FTEP devices, fabrication may include dry or wet etching; for glass FTEP devices, fabrication may include dry or wet etching, powderblasting, sandblasting, or photostructuring; and for plastic FTEP devices fabrication may include thermoforming, injection molding, hot embossing, or laser machining. The components of the FTEP devices may be manufactured separately and then assembled, or certain components of the FTEP devices (or even the entire FTEP device except for the electrodes) may be manufactured (e.g., using 3D printing) or molded (e.g., using injection molding) as a single entity, with other components added after molding. For example, housing and channels may be manufactured or molded as a single entity, with the electrodes later added to form the FTEP unit. Alternatively, the FTEP device may also be formed in two or more parallel layers, e.g., a layer with the horizontal channel and filter, a layer with the vertical channels, and a layer with the inlet and outlet ports, which are manufactured and/or molded individually and assembled following manufacture.
In specific aspects, the FTEP device can be manufactured using a circuit board as a base, with the electrodes, filter and/or the flow channel formed in the desired configuration on the circuit board, and the remaining housing of the device containing, e.g., the one or more inlet and outlet channels and/or the flow channel formed as a separate layer that is then sealed onto the circuit board. The sealing of the top of the housing onto the circuit board provides the desired configuration of the different elements of the FTEP devices of the disclosure. Also, two to many FTEP devices may be manufactured on a single substrate, then separated from one another thereafter or used in parallel. In certain aspects, the FTEP devices are reusable and, in some aspects, the FTEP devices are disposable. In additional aspects, the FTEP devices may be autoclavable.
The electrodes 508 can be formed from any suitable metal, such as copper, stainless steel, titanium, aluminum, brass, silver, rhodium, gold or platinum, or from graphite. One preferred electrode material is alloy 303 (UNS S30300) austenitic stainless steel. An applied electric field can destroy electrodes made from metals like aluminum. If a multiple-use (e.g., non-disposable) flow-through FTEP device is desired, as opposed to a disposable, one-use flow-through FTEP device, the electrode plates can be coated with metals resistant to electrochemical corrosion. Conductive coatings like noble metals, e.g., gold, can be used to protect the electrode plates.
As mentioned, the FTEP devices may comprise push-pull pneumatic means to allow multi-pass electroporation procedures; that is, cells to be electroporated may be “pulled” from the inlet toward the outlet for one pass of electroporation, then be “pushed” from the outlet end of the flow-through FTEP device toward the inlet end to pass between the electrodes again for another pass of electroporation. This process may be repeated one to many times.
Depending on the type of cells to be electroporated (e.g., bacterial, yeast, mammalian) and the configuration of the electrodes, the distance between the electrodes in the flow channel can vary widely. For example, where the flow channel decreases in width, the flow channel may narrow to between 10 μm and 5 mm, or between 25 μm and 3 mm, or between 50 μm and 2 mm, or between 75 μm and 1 mm. The distance between the electrodes in the flow channel may be between 1 mm and 10 mm, or between 2 mm and 8 mm, or between 3 mm and 7 mm, or between 4 mm and 6 mm. The overall size of the FTEP device may be between 3 cm and 15 cm in length, or between 4 cm and 12 cm in length, or between 4.5 cm and 10 cm in length. The overall width of the FTEP device may be between 0.5 cm and 5 cm, or between 0.75 cm and 3 cm, or between 1 cm and 2.5 cm, or between 1 cm and 1.5 cm.
The region of the flow channel that is narrowed is wide enough so that at least two cells can fit in the narrowed portion side-by-side. For example, a typical bacterial cell is 1 μm in diameter; thus, the narrowed portion of the flow channel of the FTEP device used to transform such bacterial cells will be at least 2 μm wide. In another example, if a mammalian cell is approximately 50 μm in diameter, the narrowed portion of the flow channel of the FTEP device used to transform such mammalian cells will be at least 100 μm wide. That is, the narrowed portion of the FTEP device will not physically contort or “squeeze” the cells being transformed.
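The sizing rule in the preceding paragraph (the narrowed region fits at least two cells side-by-side) can be illustrated with a minimal sketch. The function name and its parameters are hypothetical and serve only to reproduce the two worked examples given above; they are not part of the disclosure.

```python
def min_channel_width_um(cell_diameter_um: float, cells_side_by_side: int = 2) -> float:
    """Smallest narrowed-channel width (in µm) that fits the given number of
    cells side-by-side without squeezing them (illustrative helper only)."""
    if cell_diameter_um <= 0:
        raise ValueError("cell diameter must be positive")
    if cells_side_by_side < 2:
        raise ValueError("the text calls for at least two cells side-by-side")
    return cell_diameter_um * cells_side_by_side


# The two examples from the text: a ~1 µm bacterial cell and a ~50 µm mammalian cell.
bacterial_width = min_channel_width_um(1.0)
mammalian_width = min_channel_width_um(50.0)
print(bacterial_width, mammalian_width)
```

The returned values are lower bounds; the ranges recited for the narrowed flow channel leave room for considerably wider channels.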
In aspects of the FTEP device where reservoirs are used to introduce cells and exogenous material into the FTEP device, the reservoirs range in volume between 100 μL and 10 mL, or between 500 μL and 75 mL, or between 1 mL and 5 mL. The flow rate in the FTEP ranges between 0.1 mL and 5 mL per minute, or between 0.5 mL and 3 mL per minute, or between 1.0 mL and 2.5 mL per minute. The pressure in the FTEP device ranges between 1 and 30 PSI, or between 2 and 10 PSI, or between 3 and 5 PSI.
To avoid different field intensities between the electrodes, the electrodes should be arranged in parallel. Furthermore, the surface of the electrodes should be as smooth as possible without pin holes or peaks. Electrodes having a roughness Rz of between 1 μm and 10 μm are preferred. In another aspect of the invention, the flow-through electroporation device comprises at least one additional electrode which applies a ground potential to the FTEP device.
After editing 6053, many cells in the colonies of cells that have been edited die as a result of the nicks caused by active editing or by fitness effects from the edits themselves, and there is a lag in growth for the edited cells that do survive but must repair and recover following editing (microwells 6058), whereas cells that do not undergo editing thrive (microwells 6059) (vi). All cells are allowed to continue to grow to establish colonies and normalize, where the colonies of edited cells in microwells 6058 catch up in size and/or cell number with the cells in microwells 6059 that do not undergo editing (vii). Once the cell colonies are normalized, either pooling 6060 of all cells in the microwells can take place, in which case the cells are enriched for edited cells by eliminating the bias from non-editing cells and fitness effects from editing; or, alternatively, colony growth in the microwells is monitored after editing, and slow-growing colonies (e.g., the cells in microwells 6058) are identified and selected 6061 (e.g., “cherry picked”), resulting in even greater enrichment of edited cells.
In growing the cells, the medium used will depend on the type of cells being edited, e.g., bacterial, yeast or mammalian. For example, medium for bacterial growth includes LB and SOC; medium for yeast cell growth includes TPD, YPG, and YPAD; and medium for mammalian cell growth includes MEM and DMEM.
A module useful for performing the method depicted in
The SWIIN module 650 in
In this
In this aspect of a SWIIN module, the perforated member includes through-holes to accommodate ultrasonic tabs disposed on the permeate member. Thus, in this aspect the perforated member is fabricated from 316 stainless steel, and the perforations form the walls of microwells while a filter or membrane is used to form the bottom of the microwells. Typically, the perforations (microwells) are approximately 150 μm to 200 μm in diameter, and the perforated member is approximately 125 μm deep, resulting in microwells having a volume of approximately 2.5 nL, with a total of approximately 200,000 microwells. The distance between the microwells is approximately 279 μm center-to-center. Though here the microwells have a volume of approximately 2.5 nL, the volume of the microwells may be between 1 nL and 25 nL, or preferably between 2 nL and 10 nL, and even more preferably between 2 nL and 4 nL. As for the filter or membrane, like the filter described previously, filters appropriate for use are solvent resistant, contamination free during filtration, and are able to retain the types and sizes of cells of interest. For example, in order to retain small cell types such as bacterial cells, pore sizes can be as low as 0.10 μm; however, for other cell types (e.g., mammalian cells), the pore sizes can be as high as from 10.0 μm to 20.0 μm or more. Indeed, the pore sizes useful in the cell concentration device/module include filters with sizes from 0.10 μm, 0.11 μm, 0.12 μm, 0.13 μm, 0.14 μm, 0.15 μm, 0.16 μm, 0.17 μm, 0.18 μm, 0.19 μm, 0.20 μm, 0.21 μm, 0.22 μm, 0.23 μm, 0.24 μm, 0.25 μm, 0.26 μm, 0.27 μm, 0.28 μm, 0.29 μm, 0.30 μm, 0.31 μm, 0.32 μm, 0.33 μm, 0.34 μm, 0.35 μm, 0.36 μm, 0.37 μm, 0.38 μm, 0.39 μm, 0.40 μm, 0.41 μm, 0.42 μm, 0.43 μm, 0.44 μm, 0.45 μm, 0.46 μm, 0.47 μm, 0.48 μm, 0.49 μm, 0.50 μm and larger.
The filters may be fabricated from any suitable material including cellulose mixed ester (cellulose nitrate and acetate) (CME), polycarbonate (PC), polyvinylidene fluoride (PVDF), polyethersulfone (PES), polytetrafluoroethylene (PTFE), nylon, or glass fiber.
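As a consistency check on the microwell dimensions recited above, the following minimal sketch computes the volume of a cylindrical microwell from its diameter and depth. It assumes an ideal cylinder (an assumption not stated in the text), and the function name is hypothetical.

```python
import math


def microwell_volume_nl(diameter_um: float, depth_um: float) -> float:
    """Volume of an ideal cylindrical microwell in nanoliters.

    1 nL equals 1e6 cubic micrometers, so the volume in µm^3 is scaled by 1e-6.
    """
    radius_um = diameter_um / 2.0
    volume_um3 = math.pi * radius_um ** 2 * depth_um
    return volume_um3 * 1e-6


# Diameters of 150 µm and 200 µm at a depth of 125 µm, as recited in the text.
v150 = microwell_volume_nl(150.0, 125.0)
v200 = microwell_volume_nl(200.0, 125.0)
print(f"{v150:.2f} nL to {v200:.2f} nL per microwell")

# Aggregate volume for approximately 200,000 microwells, in microliters.
total_ul_low = v150 * 200_000 / 1000.0
print(f"about {total_ul_low:.0f} µL total at the smaller diameter")
```

Under this assumption the per-well volume falls between roughly 2.2 nL and 3.9 nL, which is consistent with the stated value of approximately 2.5 nL and with the most preferred range of 2 nL to 4 nL.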
The cross-section configuration of the mated serpentine channel may be round, elliptical, oval, square, rectangular, trapezoidal, or irregular. If square, rectangular, or another shape with generally straight sides, the cross section may be between about 2 mm and 15 mm wide, or between 3 mm and 12 mm wide, or between 5 mm and 10 mm wide. If the cross section of the mated serpentine channel is generally round, oval or elliptical, the channel may be between about 3 mm and 20 mm in hydraulic radius, or between 5 mm and 15 mm in hydraulic radius, or between 8 mm and 12 mm in hydraulic radius.
Serpentine channels 660a and 660b can have approximately the same volume or a different volume. For example, each “side” or portion 660a, 660b of the serpentine channel may have a volume of, e.g., 2 mL, or serpentine channel 660a of permeate member 608 may have a volume of 2 mL, and the serpentine channel 660b of retentate member 604 may have a volume of, e.g., 3 mL. The volume of fluid in the serpentine channel may range from about 2 mL to about 80 mL, or from about 4 mL to about 60 mL, or from about 5 mL to about 40 mL, or from about 6 mL to about 20 mL (note that these volumes apply to a SWIIN module comprising, e.g., a 50-500K perforation member). The volume of the reservoirs may range between 5 mL and 50 mL, or between 7 mL and 40 mL, or between 8 mL and 30 mL or between 10 mL and 20 mL, and the volumes of all reservoirs may be the same or the volumes of the reservoirs may differ (e.g., the volume of the permeate reservoirs is greater than that of the retentate reservoirs).
The serpentine channel portions 660a and 660b of the permeate member 608 and retentate member 604, respectively, are approximately 200 mm long, 130 mm wide, and 4 mm thick, though in other aspects, the retentate and permeate members can be between 75 mm and 400 mm in length, or between 100 mm and 300 mm in length, or between 150 mm and 250 mm in length; between 50 mm and 250 mm in width, or between 75 mm and 200 mm in width, or between 100 mm and 150 mm in width; and between 2 mm and 15 mm in thickness, or between 4 mm and 10 mm in thickness, or between 5 mm and 8 mm in thickness. In some aspects, the retentate (and permeate) members may be fabricated from PMMA (poly(methyl methacrylate)), or other materials may be used, including polycarbonate, cyclic olefin co-polymer (COC), glass, polyvinyl chloride, polyethylene, polyamide, polypropylene, polysulfone, polyurethane, and co-polymers of these and other polymers. Preferably at least the retentate member is fabricated from a transparent material so that the cells can be visualized (see, e.g.,
Because the retentate member preferably is transparent, colony growth in the SWIIN module can be monitored by automated devices such as those sold by JoVE (ScanLag™ system, Cambridge, MA) (also see Levin-Reisman, et al., Nature Methods, 7:737-39 (2010)). Cell growth for, e.g., mammalian cells may be monitored by, e.g., the growth monitor sold by IncuCyte (Ann Arbor, MI) (see also, Choudhry, PLoS One, 11(2): e0148469 (2016)). Further, automated colony pickers may be employed, such as those sold by, e.g., TECAN (Pickolo™ system, Mannedorf, Switzerland); Hudson Inc. (RapidPick™, Springfield, NJ); Molecular Devices (QPix 400™ system, San Jose, CA); and Singer Instruments (PIXL™ system, Somerset, UK).
Due to the heating and cooling of the SWIIN module, condensation may accumulate on the retentate member, which may interfere with accurate visualization of the growing cell colonies. Condensation on the SWIIN module 650 may be controlled by, e.g., moving heated air over the top (e.g., the retentate member) of the SWIIN module 650, or by applying a transparent heated lid over at least the serpentine channel portion 660b of the retentate member 604. See, e.g.,
In SWIIN module 650, cells and medium (at a dilution appropriate for Poisson or substantial Poisson distribution of the cells in the microwells of the perforated member) are flowed into serpentine channel 660b from ports in retentate member 604, and the cells settle in the microwells while the medium passes through the filter into serpentine channel 660a in permeate member 608. The cells are retained in the microwells of perforated member 601 as the cells cannot travel through filter 603. Appropriate medium may be introduced into permeate member 608 through permeate ports 611. The medium flows upward through filter 603 to nourish the cells in the microwells (perforations) of perforated member 601. Additionally, buffer exchange can be effected by cycling medium through the retentate and permeate members. In operation, the cells are deposited into the microwells and are grown for an initial, e.g., 2 to 100 doublings; editing is then induced by, e.g., raising the temperature of the SWIIN to 42° C. to induce a temperature inducible promoter or by removing growth medium from the permeate member and replacing the growth medium with a medium comprising a chemical component that induces an inducible promoter.
Once editing has taken place, the temperature of the SWIIN may be decreased, or the inducing medium may be removed and replaced with fresh medium lacking the chemical component thereby de-activating the inducible promoter. The cells then continue to grow in the SWIIN module 650 until the growth of the cell colonies in the microwells is normalized. For the normalization protocol, once the colonies are normalized, the colonies are flushed from the microwells by applying fluid or air pressure (or both) to the permeate member serpentine channel 660a and thus to filter 603 and pooled. Alternatively, if cherry picking is desired, the growth of the cell colonies in the microwells is monitored, and slow-growing colonies are directly selected; or, fast-growing colonies are eliminated.
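The Poisson loading of cells into the microwells mentioned above can be illustrated with a short sketch. For a given mean number of cells per microwell, it estimates the fraction of microwells that are empty, hold exactly one cell, or hold more than one cell. The mean of 0.3 cells per microwell is a hypothetical value chosen for illustration, and the function names are assumptions rather than part of the disclosure.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k cells in a microwell under a Poisson model."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def loading_fractions(mean_cells_per_well: float) -> dict:
    """Fractions of microwells that are empty, singly occupied, or multiply occupied."""
    empty = poisson_pmf(0, mean_cells_per_well)
    single = poisson_pmf(1, mean_cells_per_well)
    multiple = 1.0 - empty - single
    return {"empty": empty, "single": single, "multiple": multiple}


# Hypothetical loading density; not a value taken from the disclosure.
fractions = loading_fractions(0.3)

# Of the occupied microwells, the share that started from a single cell.
clonal_share = fractions["single"] / (1.0 - fractions["empty"])

for name, value in fractions.items():
    print(f"{name}: {value:.3f}")
print(f"clonal share of occupied wells: {clonal_share:.3f}")
```

Lower loading densities raise the share of occupied microwells that are clonal at the cost of leaving more microwells empty, which is the trade-off behind choosing a dilution appropriate for Poisson loading.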
Imaging of cell colonies growing in the wells of the SWIIN is desired in most implementations for, e.g., monitoring both cell growth and device performance, and imaging is necessary for cherry-picking implementations. Real-time monitoring of cell growth in the SWIIN requires backlighting, retentate plate (top plate) condensation management and a system-level approach to temperature control, air flow, and thermal management. In some implementations, imaging employs a camera or CCD device with sufficient resolution to be able to image individual wells. For example, in some configurations a camera with a 9-pixel pitch is used (that is, there are 9 pixels center-to-center for each well). Processing the images may, in some implementations, utilize reading the images in grayscale, rating each pixel from low to high, where wells with no cells will be brightest (due to full or nearly-full light transmission from the backlight) and wells with cells will be dim (due to cells blocking light transmission from the backlight). After processing the images, thresholding is performed to determine which pixels will be called “bright” or “dim”, spot finding is performed to find bright pixels and arrange them into blocks, and then the spots are arranged on a hexagonal grid of pixels that correspond to the spots. Once arranged, the measure of intensity of each well is extracted by, e.g., looking at one or more pixels in the middle of the spot, looking at several to many pixels at random or pre-set positions, or averaging X number of pixels in the spot. In addition, background intensity may be subtracted. Thresholding is again used to call each well positive (e.g., containing cells) or negative (e.g., no cells in the well). The imaging information may be used in several ways, including taking images at time points for monitoring cell growth.
Monitoring cell growth can be used to, e.g., remove the “muffin tops” of fast-growing cells followed by removal of all cells or removal of cells in “rounds” as described above, or recover cells from specific wells (e.g., slow-growing cell colonies); alternatively, wells containing fast-growing cells can be identified and areas of UV light covering the fast-growing cell colonies can be projected (or rastered with shutters) onto the SWIIN to irradiate or inhibit growth of those cells. Imaging may also be used to assure proper fluid flow in the serpentine channel 660.
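The final well-calling step of the image processing described above (averaging pixels within each spot, subtracting background, and thresholding) might be sketched as follows. This is a minimal illustration only: it assumes that spot finding and grid assignment have already produced a list of grayscale pixel values per well, and the well identifiers, pixel values, background level, and threshold are all hypothetical.

```python
from statistics import mean


def call_wells(well_pixels: dict, background: float = 0.0, threshold: float = 128.0) -> dict:
    """Call each well positive (True, cells present) or negative (False, empty).

    Wells without cells transmit the backlight and appear bright; wells with
    cells block light and appear dim, so a background-corrected mean intensity
    below the threshold is called positive.
    """
    calls = {}
    for well_id, pixels in well_pixels.items():
        intensity = mean(pixels) - background
        calls[well_id] = intensity < threshold
    return calls


# Hypothetical 8-bit grayscale samples taken from the middle of three spots.
example_wells = {
    "well_1": [250, 245, 248],  # bright: no cells expected
    "well_2": [90, 100, 95],    # dim: cells expected
    "well_3": [200, 60, 70],    # mixed pixels, dim on average
}
calls = call_wells(example_wells, background=10.0, threshold=128.0)
print(calls)
```

Repeating such calls over images taken at successive time points would give a per-well growth record, from which slow-growing or fast-growing colonies could be identified.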
After recovery, the cells may be transferred to a storage module 712, where the cells can be stored at, e.g., 4° C. or −20° C. for later processing, or the cells may be diluted and transferred to a selection/singulation/growth/induction/editing/normalization (SWIIN) module 720. In the SWIIN 720, the cells are arrayed such that there is an average of one to twenty or fifty or so cells per microwell. The arrayed cells may be in selection medium to select for cells that have been transformed or transfected with the editing vector(s). Once singulated, the cells grow through 2 to 50 doublings and establish colonies. Once colonies are established, editing is induced by providing conditions (e.g., temperature, addition of an inducing or repressing chemical) to induce editing. Editing is then initiated and allowed to proceed, the cells are allowed to grow to terminal size (e.g., normalization of the colonies) in the microwells and then are treated to conditions that cure the editing vector from this round. Once cured, the cells can be flushed out of the microwells and pooled, then transferred to the storage (or recovery) unit 712 or can be transferred back to the growth module 704 for another round of editing. In between pooling and transfer to a growth module, there typically are one or more additional steps, such as cell recovery, medium exchange (rendering the cells electrocompetent), and cell concentration (typically concurrently with medium exchange by, e.g., filtration).
Note that the selection/singulation/growth/induction/editing/normalization and curing modules may be the same module, where all processes are performed in, e.g., a solid wall device, or selection and/or dilution may take place in a separate vessel before the cells are transferred to the solid wall singulation/growth/induction/editing/normalization/editing module (SWIIN). Similarly, the cells may be pooled after normalization, transferred to a separate vessel, and cured in the separate vessel. Once the putatively-edited cells are pooled, they may be subjected to another round of editing, beginning with growth, cell concentration and treatment to render electrocompetent, and transformation by yet another donor nucleic acid in another editing cassette via the electroporation module 708.
In electroporation device 708, the cells selected from the first round of editing are transformed by a second set of editing vectors and the cycle is repeated until the cells have been transformed and edited by a desired number of, e.g., CF editing cassettes. The multi-module cell processing instrument exemplified in
It should be apparent to one of ordinary skill in the art given the present disclosure that the process described may be recursive and multiplexed; that is, cells may go through the workflow described in relation to
In any recursive process, it is advantageous to “cure” the editing vectors comprising the CF editing cassette. “Curing” is a process in which one or more editing vectors used in the prior round of editing is eliminated from the transformed cells. Curing can be accomplished by, e.g., cleaving the editing vector(s) using a curing plasmid, thereby rendering the editing vectors nonfunctional; diluting the editing vector(s) in the cell population via cell growth (that is, the more growth cycles the cells go through, the fewer daughter cells will retain the editing vector(s)); or utilizing a heat-sensitive origin of replication on the editing vector. The conditions for curing will depend on the mechanism used for curing; that is, in this example, how the curing plasmid cleaves the editing vector.
A variety of further modifications and improvements in and to the composition, methods, and modified cells of the present disclosure will be apparent to those skilled in the art. The following non-limiting embodiments are specifically envisioned:
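The dilution route to curing can be made concrete with a simplified model. The sketch below assumes that the editing vector no longer replicates (for example, a heat-sensitive origin of replication held at a non-permissive temperature) and that each cell starts with a fixed number of vector copies; both are illustrative assumptions, as are the function names and the copy number used. Because the total number of vector copies is fixed while the number of cells doubles each generation, the fraction of cells still carrying a copy cannot exceed the copy number divided by two raised to the number of doublings.

```python
def max_fraction_retaining(copies_per_cell: int, doublings: int) -> float:
    """Upper bound on the fraction of cells still carrying a non-replicating
    vector after the given number of population doublings."""
    return min(1.0, copies_per_cell / 2 ** doublings)


def doublings_to_reach(copies_per_cell: int, target_fraction: float) -> int:
    """Smallest number of doublings after which the upper bound on the
    vector-carrying fraction is at or below the target fraction."""
    if not 0.0 < target_fraction <= 1.0:
        raise ValueError("target fraction must be in (0, 1]")
    doublings = 0
    while max_fraction_retaining(copies_per_cell, doublings) > target_fraction:
        doublings += 1
    return doublings


# Hypothetical starting point: 10 vector copies per cell, target of at most 1% carriers.
needed = doublings_to_reach(10, 0.01)
print(needed)
```

This bound only shows the trend described in the text, that more growth cycles leave fewer daughter cells retaining the vector; actual curing conditions depend on the mechanism used.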
1. A method for performing nucleic acid-guided nuclease/reverse transcriptase fusion editing in a genome of a live cell, comprising:
2. The method of embodiment 1, wherein the first CFgRNA comprises a spacer region and a structural region recognized by the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme.
3. The method of embodiment 1 or 2, wherein the first repair template comprises an edit and a primer binding site (PBS).
4. The method of embodiment 3, wherein the first repair template further comprises a post-edit homology region.
5. The method of embodiment 3 or 4, wherein the first repair template comprises a nick-to-edit region.
6. The method of any one of embodiments 1 to 5, wherein the second CFgRNA comprises a spacer region and a structural region recognized by the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme.
7. The method of any one of embodiments 1 to 6, wherein the second repair template comprises a barcode and a primer binding site (PBS).
8. The method of embodiment 7, wherein the second repair template further comprises a post-barcode homology region.
9. The method of embodiment 7 or 8, wherein the second repair template comprises a nick-to-barcode region.
10. The method of any one of embodiments 1 to 9, further comprising:
11. The method of embodiment 10, wherein the genome is sequenced by single-cell amplicon-based next-generation sequencing.
12. The method of embodiment 10, wherein the transcriptome is sequenced by single-cell RNA sequencing.
13. The method of any one of embodiments 1 to 12, wherein the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme comprises a nucleic acid-guided nickase and a reverse transcriptase.
14. The method of embodiment 13, wherein the nucleic acid-guided nickase comprises a MAD nickase or a variant thereof.
15. The method of embodiment 13, wherein the nucleic acid-guided nickase comprises a Cas nickase or a variant thereof.
16. The method of any one of embodiments 1 to 15, wherein a nucleic acid sequence encoding the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme and the CF editing cassette are assembled into a vector for introduction of the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme and CF editing cassette into the cell.
17. The method of embodiment 16, wherein the CF barcoding cassette is further assembled into the vector for introduction of the CF barcoding cassette into the cell.
18. The method of any one of embodiments 1 to 15, wherein the CF barcoding cassette is assembled into a barcoding vector for introduction of the CF barcoding cassette into the cell, and wherein the CF editing cassette is assembled into an editing vector for introduction of the CF editing cassette into the cell, wherein the editing vector is different from the barcoding vector.
19. The method of any one of embodiments 1 to 18, further comprising: providing a selectable marker.
20. The method of embodiment 19, wherein the selectable marker is for selection and enrichment of cells having an integrated barcode or an effected edit.
21. The method of embodiment 19 or 20, further comprising selecting and enriching for cells having an integrated barcode or an effected edit.
22. The method of any one of embodiments 1 to 21, wherein the second target locus is a safe harbor locus disposed centrally in an intergenic or intronic region of the cell.
23. The method of any one of embodiments 1 to 21, wherein the second target locus is disposed within a coding region of the cell.
24. The method of any one of embodiments 1 to 21, wherein the second target locus is disposed within a noncoding region of the cell.
25. The method of any one of embodiments 1 to 24, wherein the CF editing cassette further comprises an edit to immunize the first target locus and prevent re-nicking.
26. The method of any one of embodiments 1 to 25, wherein the CF barcoding cassette further comprises an edit to immunize the second target locus and prevent re-nicking.
27. An editing system comprising one or more vectors comprising:
28. The editing system of embodiment 27, wherein the first CFgRNA comprises a spacer region and a structural region recognized by the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme.
29. The editing system of embodiment 27 or 28, wherein the first repair template comprises an edit and a primer binding site (PBS).
30. The editing system of embodiment 29, wherein the first repair template further comprises a post-edit homology region.
31. The editing system of embodiment 29 or 30, wherein the first repair template further comprises a nick-to-edit region.
32. The editing system of any one of embodiments 27 to 31, wherein the second CFgRNA comprises a spacer region and a structural region recognized by the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme.
33. The editing system of any one of embodiments 27 to 32, wherein the second repair template comprises a barcode and a primer binding site (PBS).
34. The editing system of embodiment 33, wherein the second repair template further comprises a post-barcode homology region.
35. The editing system of embodiment 33 or 34, wherein the second repair template comprises a nick-to-barcode region.
36. The editing system of any one of embodiments 27 to 35, wherein the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme comprises a nucleic acid-guided nickase and a reverse transcriptase.
37. The editing system of embodiment 36, wherein the nucleic acid-guided nickase comprises a MAD nickase or a variant thereof.
38. The editing system of embodiment 36, wherein the nucleic acid-guided nickase comprises a Cas nickase or a variant thereof.
39. The editing system of any one of embodiments 27 to 38, wherein the one or more vectors comprises an editing vector, and wherein the editing vector comprises the CF editing cassette.
40. The editing system of embodiment 39, wherein the editing vector further comprises a nucleic acid sequence encoding the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme.
41. The editing system of embodiment 39 or 40, wherein the editing vector further comprises the CF barcoding cassette.
42. The editing system of any one of embodiments 27 to 38, wherein an editing vector comprises the CF editing cassette, and a barcoding vector comprises the CF barcoding cassette, and wherein the editing vector is different from the barcoding vector.
43. The editing system of any one of embodiments 27 to 38, wherein one or more of the one or more vectors comprises a selectable marker.
44. The editing system of embodiment 43, wherein the selectable marker is for selection and enrichment of cells having an integrated barcode or an effected edit.
45. The editing system of any one of embodiments 27 to 44, wherein the second target locus is a safe harbor locus disposed centrally in an intergenic or intronic region of the cell.
46. The editing system of any one of embodiments 27 to 44, wherein the second target locus is disposed within a coding region of the cell.
47. The editing system of any one of embodiments 27 to 44, wherein the second target locus is disposed within a noncoding region of the cell.
48. The editing system of any one of embodiments 27 to 47, wherein the CF editing cassette further comprises an edit to immunize the first target locus and prevent re-nicking.
49. The editing system of any one of embodiments 27 to 48, wherein the CF barcoding cassette further comprises an edit to immunize the second target locus and prevent re-nicking.
50. A vector comprising:
51. The vector of embodiment 50, wherein the first CFgRNA comprises a spacer region and a structural region recognized by the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme.
52. The vector of embodiment 50 or 51, wherein the first repair template comprises an edit and a primer binding site (PBS).
53. The vector of embodiment 52, wherein the first repair template further comprises a post-edit homology region.
54. The vector of embodiment 52 or 53, wherein the first repair template further comprises a nick-to-edit region.
55. The vector of any one of embodiments 50 to 54, wherein the second CFgRNA comprises a spacer region and a structural region recognized by the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme.
56. The vector of any one of embodiments 50 to 55, wherein the second repair template comprises a barcode and a primer binding site (PBS).
57. The vector of embodiment 56, wherein the second repair template further comprises a post-barcode homology region.
58. The vector of embodiment 56 or 57, wherein the second repair template comprises a nick-to-barcode region.
59. The vector of any one of embodiments 50 to 58, wherein the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme comprises a nucleic acid-guided nickase and a reverse transcriptase.
60. The vector of embodiment 59, wherein the nucleic acid-guided nickase comprises a MAD nickase or a variant thereof.
61. The vector of embodiment 59, wherein the nucleic acid-guided nickase comprises a Cas nickase or a variant thereof.
62. The vector of any one of embodiments 50 to 61, wherein the vector further comprises a selectable marker.
63. The vector of embodiment 62, wherein the selectable marker is for selection and enrichment of cells having an integrated barcode or an effected edit.
64. The vector of embodiment 62 or 63, wherein the selectable marker is a puromycin resistance gene.
65. The vector of any one of embodiments 50 to 64, wherein the second target locus is a safe harbor locus disposed centrally in an intergenic or intronic region of the cell.
66. The vector of any one of embodiments 50 to 64, wherein the second target locus is disposed within a coding region of the cell.
67. The vector of any one of embodiments 50 to 64, wherein the second target locus is disposed within a noncoding region of the cell.
68. The vector of any one of embodiments 50 to 67, wherein the CF editing cassette further comprises an edit to immunize the first target locus and prevent re-nicking.
69. The vector of any one of embodiments 50 to 68, wherein the CF barcoding cassette further comprises an edit to immunize the second target locus and prevent re-nicking.
70. A method for performing nucleic acid-guided nuclease/reverse transcriptase fusion editing in a genome of a live cell, comprising:
71. The method of embodiment 70, wherein the live cell comprises a CF editing cassette, the CF editing cassette comprising a nucleic acid sequence encoding a first CFgRNA having a region of complementarity to a sequence of the first target locus, and a nucleic acid sequence encoding a first repair template.
72. The method of embodiment 71, wherein the first CFgRNA comprises a spacer region and a structural region recognized by the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme.
73. The method of any one of embodiments 70 to 72, wherein the first repair template comprises an edit and a primer binding site (PBS).
74. The method of any one of embodiments 70 to 73, wherein the live cell comprises a CF barcoding cassette, the CF barcoding cassette comprising a nucleic acid sequence encoding a second CFgRNA having a region of complementarity to a sequence of the second target locus, and a nucleic acid sequence encoding a second repair template.
75. The method of embodiment 74, wherein the second CFgRNA comprises a spacer region and a structural region recognized by the nucleic acid-guided nuclease/reverse transcriptase fusion enzyme.
76. The method of any one of embodiments 70 to 75, wherein the second repair template comprises a barcode and a primer binding site (PBS).
The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the present invention, and are not intended to limit the scope of what the inventors regard as their invention, nor are they intended to represent or imply that the experiments below are all of or the only experiments performed. It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the invention as shown in the specific aspects without departing from the spirit or scope of the invention as broadly described. The present aspects are, therefore, to be considered in all respects as illustrative and not restrictive.
A GFP to BFP reporter cell line is created using mammalian cells with a stably integrated genomic copy of the GFP gene (HEK293T-GFP). These cell lines enable phenotypic detection of genomic edits of different classes by various mechanisms, including flow cytometry, fluorescent cell imaging, and genotypic detection by sequencing of the genome-integrated GFP gene. Lack of editing, or perfect repair of cut events in the GFP gene, results in cells that remain GFP-positive. Cut events that are repaired by the Non-Homologous End-Joining (NHEJ) pathway often result in nucleotide insertion or deletion events (indels), resulting in frame-shift mutations in the coding sequence that cause loss of GFP gene expression and fluorescence. Cut events that are repaired by the Homology-Directed Repair (HDR) pathway using the GFP to BFP HDR donor as a repair template or by the use of CFgRNAs, e.g., complementary CFgRNAs, result in conversion of the cell fluorescence profile from that of GFP to that of BFP.
CREATE Fusion Editing (CFE) is a technique that uses a nucleic acid nickase fusion protein (e.g., MAD2007 nickase) fused to a peptide with reverse transcriptase activity along with a nucleic acid encoding a gRNA comprising a region complementary to a target region of a nucleic acid in one or more cells, which comprises a mutation of at least one nucleotide relative to the target region in the one or more cells and a protospacer adjacent motif (PAM) mutation.
In a first design, a nickase enzyme derived from the MAD2007 nuclease (see, e.g., U.S. Pat. Nos. 9,982,279 and 10,337,028), e.g., Cas9 H840A nickase or MAD7® nickase (see, e.g., U.S. Ser. Nos. 16/837,212 and 17/084,522), is fused to an engineered reverse transcriptase (RT) on the C-terminus and cloned downstream of a CMV promoter. In this instance, the RT used is derived from Moloney Murine Leukemia Virus (M-MLV).
RNA guides (gRNAs) are designed that are complementary to a single region proximal to the EGFP-to-BFP editing site. The gRNA is extended on the 3′ end to include a region of 13 bp that includes the TY-to-SH edit and a second region of 13 bp that is complementary to the nicked EGFP DNA sequence. This allows the nicked genomic DNA to anneal to the 3′ end of the gRNA, which can then be extended by the reverse transcriptase to incorporate the edit in the genome. A second gRNA targets a region in the EGFP DNA sequence that is 86 bp upstream of the edit site. This gRNA is designed such that it enables the nickase to cut the opposite strand relative to the first gRNA. Both of these gRNAs are cloned downstream of a U6 promoter. A poly-T sequence is also included that terminates the transcription of the gRNA.
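By way of illustration only, the layout of the 3′ extension described above (an edit-bearing region followed by a region that anneals to the nicked strand) may be sketched as follows. The sketch assumes that the reverse transcriptase copies the extension so that the newly written DNA continues from the free 3′ end of the nicked strand; the toy sequences, the function names, and the nick position are hypothetical placeholders and are not the EGFP sequences or guide designs used in this Example.

```python
# Illustrative sketch only: all sequences are made-up placeholders, not EGFP.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_three_prime_extension(nicked_strand: str, nick_index: int,
                                edited_downstream: str, pbs_len: int = 13) -> str:
    """Assemble a 3' extension, written 5'->3' in DNA letters.

    nicked_strand     : 5'->3' sequence of the strand that is nicked
    nick_index        : nicked_strand[:nick_index] ends at the free 3' end
    edited_downstream : desired 5'->3' sequence to be written after the nick,
                        carrying the intended edit
    pbs_len           : length of the region that anneals to the nicked strand
    """
    if nick_index < pbs_len or nick_index > len(nicked_strand):
        raise ValueError("nick_index must leave pbs_len bases upstream of the nick")
    # Bases immediately upstream of the nick act as the primer for the RT.
    primer = nicked_strand[nick_index - pbs_len:nick_index].upper()
    # Both regions are complementary to the strand being written, so each is
    # a reverse complement; the annealing region sits at the very 3' end.
    edit_region = reverse_complement(edited_downstream)
    annealing_region = reverse_complement(primer)
    return edit_region + annealing_region


toy_strand = "AAAACCCCGGGGTTTTACGTACGTACGTA"   # hypothetical target strand
toy_edit = "ACGAACGTACGTA"                    # hypothetical 13-base edited sequence
extension = build_three_prime_extension(toy_strand, 16, toy_edit)
```

Reverse-complementing the returned extension yields the primer bases followed by the edited sequence, which is a convenient sanity check on any design produced this way.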
The plasmids are transformed into NEB Stable E. coli (Ipswich, MA) and grown overnight in 25 mL LB cultures. The following day the plasmids are purified from E. coli using the Qiagen Midi Prep kit (Venlo, Netherlands). The purified plasmid is then treated with RNase A (ThermoFisher, Waltham, MA) and re-purified using the DNA Clean and Concentrator kit (Zymo, Irvine, CA).
HEK293T cells are cultured in DMEM medium which is supplemented with 10% FBS and 1× Penicillin and Streptomycin. 100 ng of total DNA (50 ng of gRNA plasmid and 50 ng of CFE plasmids) is mixed with 1 μL of PolyFect (Qiagen, Venlo, Netherlands) in 25 μL of OptiMEM in a 96 well plate. The complex is incubated for 10 minutes and then 20,000 HEK293T cells resuspended in 100 μL of DMEM are added to the mixture. The resulting mixture is then incubated for 80 hours at 37° C. and 5% CO2.
The cells are harvested from flat bottom 96 well plates using TrypLE Express reagent (ThermoFisher, Waltham, MA) and transferred to v-bottom 96 well plate. The plate is then spun down at 500×g for 5 minutes. The TrypLE solution is then aspirated and the cell pellet is resuspended in FACS buffer (1×PBS, 1% FBS, 1 mM EDTA and 0.5% BSA). The GFP+, BFP+ and RFP+ cells are then analyzed on the Attune NxT flow cytometer and the data is analyzed on FlowJo software.
The RFP+BFP+ cells that are identified are indicative of the proportion of enriched cells that have undergone a precise or imprecise editing process. BFP+ cells indicate cells that have undergone a successful editing process and express BFP. The GFP-negative cells indicate cells that have been imprecisely edited, leading to disruption of the GFP open reading frame and loss of expression.
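For illustration, the interpretation of the fluorescence readout described above may be expressed as a simple lookup. The following is a minimal sketch that assumes each cell has already been called GFP-positive or GFP-negative and BFP-positive or BFP-negative by upstream gating; the function names are hypothetical, and treating any BFP-positive cell as precisely edited is a simplifying assumption of the sketch.

```python
from collections import Counter


def classify_cell(gfp_positive: bool, bfp_positive: bool) -> str:
    """Map fluorescence calls for one cell to an editing outcome."""
    if bfp_positive:
        return "precise_edit"     # GFP-to-BFP conversion
    if gfp_positive:
        return "unedited"         # no edit, or perfect repair
    return "imprecise_edit"       # GFP reading frame disrupted


def outcome_fractions(cells):
    """cells: iterable of (gfp_positive, bfp_positive) calls, one per cell."""
    counts = Counter(classify_cell(gfp, bfp) for gfp, bfp in cells)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no cells supplied")
    return {name: counts[name] / total
            for name in ("unedited", "precise_edit", "imprecise_edit")}


# Synthetic calls for 100 cells, used only to exercise the functions.
example_cells = ([(True, False)] * 45 + [(False, True)] * 45
                 + [(False, False)] * 10)
fractions = outcome_fractions(example_cells)
```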
In this experiment, the edit is immediately 3′ of the gRNA, and 3′ of the edit is a further region complementary to the nicked genome, although the intended edit could also be present further 5′ within the region homologous to the nicked genome. A nickase RT fusion enzyme (Cas9 H840A nickase) creates a nick in the target site and the nicked DNA anneals to its complementary sequence on the 3′ end of the gRNA. The RT then extends the DNA, thereby incorporating the intended edit directly in the genome.
The effectiveness of CREATE Fusion Editing in GFP+ HEK293T cells is tested. In the assay system devised, a successful precise edit results in a BFP+ cell whereas an imprecise edit turns the cell both BFP and GFP negative. CREATE Fusion gRNA in combination with CFE2.1 or CFE2.2 gives approximately 40-45% BFP+ cells, indicating that almost half the cell population undergoes successful editing (data not shown). The GFP-negative cells are ~10% of the population. The use of a second nicking gRNA, as described in Anzalone et al. (Nature, 576 (7785): 149-157 (2019)), does not increase the precision edit rate any further; in fact, it significantly increases the imprecisely edited, GFP-negative cell population and the editing rate is lower.
Previous literature has shown that double nicks on opposite strands (<90 bp away) result in a double-strand break, which tends to be repaired via NHEJ, resulting in imprecise insertions or deletions. Overall, the results indicate that CREATE Fusion Editing predominantly yields precisely edited cells and that the proportion of imprecisely edited cells is much lower (data not shown).
An enrichment handle, specifically a fluorescent reporter (RFP) linked to nuclease expression, is included in this experimentation as a proxy for cells receiving the editing machinery. When only the RFP-positive cells are analyzed (computational enrichment) after 3 to 4 cell divisions, up to 75% of the cells are BFP+ when tested with gRNA (data not shown), indicating that uptake or expression-linked reporters can be used to enrich for a population of cells with higher rates of CREATE Fusion-mediated gene editing. In fact, the combined use of CREATE Fusion Editing and the described enrichment methods results in a significantly improved rate of intended edits (data not shown).
CREATE Fusion Editing is carried out in mammalian cells using a single guide RNA covalently linked to a homology arm having an intended edit to the native sequence and an edit that disrupts nuclease cleavage at this site. Briefly, lentiviral vectors are produced using the following protocol: 1000 ng of lentiviral transfer plasmid containing the CREATE Fusion cassettes along with 1500 ng of lentiviral packaging plasmids (ViraSafe Lentivirus Packaging System, Cell BioLabs) are transfected into HEK293T cells using Lipofectamine LTX in 6-well plates. Media containing the lentivirus is collected 72 hours post transfection. Two clones of a lentiviral CREATE Fusion gRNA-HA design are chosen, and an empty lentiviral backbone is included as a negative control.
The day before the transduction, 200,000 HEK293T cells are seeded in six well plates. Different volumes of CREATE lentivirus (10 μL to 1000 μL) are added to HEK293T cells in six well plates along with 10 μg/mL of Polybrene. 48 hours after transduction, media with 15 μg/mL of Blasticidin is added to the wells. Cells are maintained in selection for one week. Following selection, the well with lowest number of surviving cells is selected for future experiments (<5% cells).
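The rationale for selecting the well with the fewest surviving cells may be illustrated with a short calculation. The sketch below assumes, as a common simplification that is not stated in this Example, that the number of lentiviral integrations per cell follows a Poisson distribution and that every cell with at least one integration survives selection; the function names are hypothetical.

```python
import math


def mean_integrations(fraction_surviving: float) -> float:
    """Poisson mean m such that P(at least one integration) = fraction_surviving."""
    if not 0.0 < fraction_surviving < 1.0:
        raise ValueError("fraction_surviving must be between 0 and 1")
    # P(k >= 1) = 1 - exp(-m)  =>  m = -ln(1 - fraction_surviving)
    return -math.log(1.0 - fraction_surviving)


def single_copy_fraction(fraction_surviving: float) -> float:
    """Fraction of surviving cells expected to carry exactly one integration."""
    m = mean_integrations(fraction_surviving)
    # P(k = 1) / P(k >= 1) under the assumed Poisson model
    return m * math.exp(-m) / (1.0 - math.exp(-m))


# At 5% survival the assumed model predicts mostly single-copy cells.
single_at_5_percent = single_copy_fraction(0.05)
```

Under these assumptions, roughly 97% of the cells surviving at 5% survival carry a single integration, and the proportion falls as survival rises.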
The experimental constructs or wild-type SpCas9 are electroporated into HEK293T cells using the Neon Transfection System (Thermo Fisher Scientific). Briefly, 400 ng of total plasmid DNA is mixed with 100,000 cells in Buffer R in a total volume of 15 μL. The 10 μL Neon tip is used to electroporate cells using 2 pulses of 20 ms and 1150 V. Cells are analyzed on the flow cytometer 80 hours post electroporation. Unenriched editing rates of up to 15% are achieved from single copy delivery of gRNA (data not shown).
When the editing is combined with computational selection of RFP+ cells, however, enriched editing rates of up to 30% are achieved from a single copy delivery gRNA. This enrichment via selection of cells receiving the editing machinery is shown to result in a 2-fold increase in precise, complete intended edits (data not shown). Two or more enrichment/delivery steps can also be used to achieve higher editing rates of CREATE Fusion Editing in an automated instrument, e.g., use of a module for cell handle enrichment and identification of cells having BFP expression. When the method enriches for cells that have higher gRNA expression levels, the editing rate is even further increased, and thus a growth and/or enrichment module of the instrument may include gRNA enrichment.
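The computational selection described above amounts to restricting the analysis to RFP+ events before computing the BFP+ fraction. The following minimal sketch assumes per-cell RFP and BFP calls are already available from gating; the function name and the synthetic data are hypothetical and serve only to reproduce the 15% and 30% rates mentioned in the text.

```python
def editing_rates(cells):
    """cells: iterable of (rfp_positive, bfp_positive) calls, one per cell.

    Returns (unenriched_rate, enriched_rate, fold_enrichment).
    """
    cells = list(cells)
    rfp_cells = [cell for cell in cells if cell[0]]
    if not cells or not rfp_cells:
        raise ValueError("need at least one cell and at least one RFP+ cell")
    # Unenriched rate: BFP+ fraction over every analyzed cell.
    unenriched = sum(1 for _, bfp in cells if bfp) / len(cells)
    # Enriched rate: BFP+ fraction over RFP+ cells only.
    enriched = sum(1 for _, bfp in rfp_cells if bfp) / len(rfp_cells)
    fold = enriched / unenriched if unenriched else float("nan")
    return unenriched, enriched, fold


# Synthetic population: 50 RFP+ cells (15 of them BFP+) and 50 RFP- cells.
example = ([(True, True)] * 15 + [(True, False)] * 35
           + [(False, False)] * 50)
unenriched_rate, enriched_rate, fold_enrichment = editing_rates(example)
```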
CREATE Fusion Barcoding (“CFB”) is simultaneously carried out with CREATE Fusion Editing in mammalian iPS-GFP cells using CF barcoding cassettes and CF editing cassettes having different CFgRNAs for targeting separate genomic loci. CF editing cassettes include a single CFgRNA covalently linked to a repair template for targeting a model GFP locus and effecting a GFP-to-BFP swap mutation at this site. CF barcoding cassettes include a single CFgRNA covalently linked to a repair template for targeting either a model eukaryotic translation initiation factor 4E-binding protein 2 (“4EBP2”) locus or model DNA methyltransferase 3 beta (“DNMT3b”) locus and integrating a 9 bp barcode at these sites. The CF editing cassettes are assembled into editing plasmids encoding the CF editing cassettes and a CREATE Fusion Enzyme (“CFE”), and the CF barcoding cassettes are assembled into separate barcoding plasmids.
The editing and barcoding plasmids are co-transfected into the iPS-GFP cells, and conditions for editing/barcoding are provided.
Similar to Example IV, CREATE Fusion Barcoding (“CFB”) is simultaneously carried out with CREATE Fusion Editing in mammalian iPS-GFP cells using CF barcoding cassettes and CF editing cassettes having different CFgRNAs for targeting separate loci. CF editing cassettes include a single CFgRNA covalently linked to a repair template for targeting a model GFP locus and effecting a GFP-to-BFP swap mutation at this site. CF barcoding cassettes include a single CFgRNA covalently linked to a repair template for targeting a model DNA methyltransferase 3 beta (“DNMT3b”) locus and integrating either a 9 bp barcode (MAM10 or Pac1 insertion) or an 18 bp barcode (ISceI insertion) at these sites. The CF editing cassettes and CF barcoding cassettes are either assembled into a single plasmid (e.g., in “tandem”) further encoding a CREATE fusion enzyme (“CFE”), or the CF editing cassettes and CF barcoding cassettes are assembled into separate editing and barcoding plasmids, respectively.
The single or dual plasmids are transfected into the iPS-GFP cells, and conditions for editing/barcoding are provided.
CREATE Fusion Barcoding (“CFB”) is simultaneously carried out with CREATE Fusion Editing in mammalian iPS-GFP cells using CF barcoding cassettes and CF editing cassettes having different CFgRNAs for targeting separate loci. CF editing cassettes include a single CFgRNA covalently linked to a repair template for targeting a model GFP locus and effecting a GFP-to-BFP swap mutation at this site. CF barcoding cassettes include a single CFgRNA covalently linked to a repair template for targeting a model ornithine decarboxylase antizyme 1 (“OAZ1”) locus and effecting a 2 bp swap at this site. The CF editing cassettes and CF barcoding cassettes are assembled into separate editing and barcoding plasmids, and are co-transfected into the iPS-GFP cells.
As previously discussed, barcoding may be carried out in genomic safe harbor loci to reduce the potential of barcode sequence integration adversely affecting genes neighboring the integrated barcode(s). In other words, a target genomic locus for barcoding may include a safe harbor region of the genome that minimally affects the biology of the cell. In this experiment, genomic loci corresponding to 3′ untranslated regions (UTRs) of transcribed genes are considered and tested for barcoding due to 3′ UTRs being benign to cell function while also being readily detectable by RNAseq. In this experiment, ninety-six (96) different 3′-UTR loci are tested and screened for barcoding efficiency utilizing a 9 bp barcode encoded by barcoding cassettes.
Briefly, CREATE Fusion Barcoding (“CFB”) is carried out in mammalian iPS-GFP cells using CF barcoding cassettes having CFgRNAs designed for targeting one of a plurality of genomic loci corresponding to 3′ UTRs of transcribed genes. The CF barcoding cassettes include a single CFgRNA covalently linked to a repair template for targeting and integrating the 9 bp barcode at a respective 3′-UTR locus.
With the aforementioned cassettes, each of the ninety-six (96) different 3′UTR loci is individually tested for barcoding efficiency using the following protocol: PGP168-GFP cells are cultured in mTeSR Plus medium (STEMCELL Technologies, Vancouver, Canada) at 37° C. and 5% CO2. 24 hours before transfection, the cells are seeded at 15k cells per well in Matrigel-coated (Corning Life Sciences, Corning, NY) flat bottom, 96-well culture plates and supplemented with 10 μM Y-27632 ROCK inhibitor (STEMCELL Technologies). 100 μL of the medium is replaced (without Y-27632) immediately before transfection. 100 ng of total DNA (50 ng of barcoding gRNA plasmids plus 50 ng of CREATE Fusion Enzyme (CFE) plasmids) is mixed with 1 μL Lipofectamine Stem Transfection Reagent (Thermo Fisher Scientific) in 10 μL of OptiMEM (Thermo Fisher Scientific). The resulting mixture is incubated for 10-30 minutes at room temperature, then added to a single well of PGP168-GFP cells in 96-well culture plates and transferred to a 37° C. incubator with 5% CO2. After 24 hours of incubation, the transfection medium is removed and replaced with mTeSR Plus Medium and 2 μg/mL puromycin (InvivoGen, San Diego, CA). 48 hours after transfection, the medium is replaced with mTeSR Plus and 0.25 μg/mL puromycin.
Ninety-six hours after transfection, genomic DNA is purified from the cells using a DNAdvanced Kit (Beckman Coulter Life Sciences, Brea, CA) for genomic DNA extraction. Regions of genes containing the target loci of the CFE-gRNA complexes are then amplified via PCR. These PCR amplicons are prepared for next generation sequencing (NGS) using a TruSeq DNA Sample Prep Kit (Illumina, Inc., San Diego, CA) according to the manufacturer's instructions. The samples are then sequenced using an Illumina MiSeq benchtop sequencer and 2×150 reagent kit (Illumina). NGS analysis is performed using a custom NGS analysis and sequencing read alignment pipeline to bin read counts according to sequence identity to target genomic loci with a complete targeted 9 base insertion or wild-type sequence.
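The read-binning step described above may be illustrated with a simplified stand-in for the custom pipeline, whose details are not given here. The sketch assumes exact string matching against the sequence flanking the insertion site, and assumes the barcoding rate is computed as barcoded reads over barcoded plus wild-type reads; the flank sequences, the barcode, and the function names are hypothetical.

```python
from collections import Counter


def bin_read(read: str, left_flank: str, right_flank: str, barcode: str) -> str:
    """Bin one amplicon read as barcoded, wild-type, or other."""
    # Complete insertion: barcode sits exactly between the two flanks.
    if left_flank + barcode + right_flank in read:
        return "barcoded"
    # Wild-type: flanks are directly adjacent.
    if left_flank + right_flank in read:
        return "wild_type"
    return "other"                # partial insertions, indels, off-target reads


def barcoding_rate(reads, left_flank: str, right_flank: str, barcode: str):
    """Return (bin counts, barcoded / (barcoded + wild_type))."""
    counts = Counter(bin_read(r, left_flank, right_flank, barcode) for r in reads)
    informative = counts["barcoded"] + counts["wild_type"]
    rate = counts["barcoded"] / informative if informative else float("nan")
    return counts, rate


# Hypothetical locus and 9-base barcode, used only to exercise the functions.
LEFT, RIGHT, BARCODE = "ACGTACGTAC", "TTGACCGATG", "GCTAGCTAG"
example_reads = (["GG" + LEFT + BARCODE + RIGHT + "AA"] * 3
                 + ["GG" + LEFT + RIGHT + "AA"] * 6
                 + ["GGGGGGGG"])
bin_counts, rate = barcoding_rate(example_reads, LEFT, RIGHT, BARCODE)
```

A production pipeline would typically align reads and tolerate sequencing errors rather than rely on exact matches; the exact-match rule is used here only to keep the sketch short.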
CREATE Fusion Barcoding (“CFB”) is simultaneously carried out with CREATE Fusion Editing in mammalian iPS-GFP cells using CF barcoding cassettes and CF editing cassettes having different CFgRNAs for targeting separate genomic loci. CF editing cassettes include a single CFgRNA covalently linked to a repair template for targeting a model GFP locus and effecting a GFP-to-BFP swap mutation at this site. CF barcoding cassettes include a single CFgRNA covalently linked to a repair template for targeting one of four different genomic loci corresponding to 3′ UTRs of transcribed genes (“guide1”, “guide2”, “guide3”, and “guide4”—shown in
The CF editing cassettes and CF barcoding cassettes are either assembled into a single plasmid (e.g., in “tandem”) further encoding a CREATE fusion enzyme (“CFE”), or the CF editing cassettes and CF barcoding cassettes are assembled into separate editing and barcoding plasmids, respectively.
Briefly, CF barcoding and editing with the single plasmid system is carried out with the following protocol: PGP168-GFP cells are transfected in 96-well culture plates at 15,000 cells per well with 100 ng total plasmid DNA, as described in Example VII. Here, a CFE, barcoding cassette (e.g., comprising a barcoding gRNA), library editing cassette (e.g., comprising an editing gRNA targeting the GFP-to-BFP conversion target described in Example I), and puromycin deacetylase gene are expressed from the single plasmid. 24 hours after transfection, the transfection medium is replaced with mTeSR Plus and 4, 8, or 10 μg/mL puromycin, and is then reduced to 0.25 μg/mL puromycin at 48 hours after transfection.
Meanwhile, CF barcoding and editing with the dual-plasmid system is carried out with the following protocol: PGP168-GFP cells are transfected in 96-well culture plates at 15,000 cells per well with 100 ng total plasmid DNA, as described in Example VII. The cells are transfected with 25, 50, or 75 ng of a first plasmid containing a CFE, editing cassette, and puromycin deacetylase gene, as well as 75, 50, or 25 ng of a second plasmid containing a barcoding cassette, respectively. Twenty-four hours after transfection, the transfection medium is replaced with mTeSR Plus and 10 μg/mL puromycin, and is then reduced to 0.25 μg/mL puromycin at 48 hours after transfection.
Ninety-six hours after transfection for each of the two experiments, genomic DNA is extracted from a first aliquot of each sample and barcoding rates are determined via genomic amplicon sequencing, as described in Example VII. Another aliquot of each sample is then collected and analyzed for GFP-to-BFP conversion by flow cytometry, as described in Example II.
CREATE fusion barcode insertion in genomic loci corresponding with the 3′UTRs of transcribed genes enables efficient barcode detection and sequencing using common poly-A mRNA sequencing techniques. Here, CREATE Fusion barcodes inserted into such 3′ UTR loci are detected by single-cell RNAseq using the following protocol: PGP168-GFP cells are transfected in triplicate with plasmids comprising a CFE and barcode editing cassette, and are thereafter enriched by puromycin selection as described in Example VII. Ninety-six hours after transfection, samples having about 500 cells per sample are collected and processed with a 10× Genomics Chromium Next GEM Single Cell 3′ Reagent Kit v3.1 with feature barcoding for cell multiplexing, per the manufacturer's instructions (10× Genomics, Pleasanton, CA). Sequencing libraries are then prepared using a NextSeq 550 v2.5 150 cycle High Output Kit (Illumina) and sequenced with paired end, dual indexing on an Illumina NextSeq System.
NGS analysis is performed using 10× Genomics Cell Ranger software and a custom NGS analysis and sequencing read alignment pipeline to de-multiplex single cell transcriptomes and bin read counts according to sequence identity (e.g., similarity) to barcoded target genomic loci with a complete 9 base insertion or wild-type sequence. Raw counts of unique molecular identifiers associated with barcode or wild-type reads are then used to determine the relative expression level of barcoded and non-barcoded transcripts, as shown in
In
CREATE fusion barcoding and editing is carried out in mammalian cells (iPSCs), where the CFgRNA design allows for barcoded HA epitope tag knock-in edits at various genomic loci that encode endogenous surface receptors (BST2, CD151, CD63, CD81, and CD9) (
CREATE fusion barcoding and editing is carried out in mammalian cells (iPSCs), where the cells are co-transfected with CFgRNAs targeting a GFP-to-BFP edit and a CD81-HA tag knock-in barcoding edit. Physical magnetic sorting (MACS) is performed with anti-HA antibody functionalized beads, and enriches the population of cells with a successful CD81-HA tag knock-in edit from 2.2% (left) to 84.1% (right) (
CREATE fusion barcoding and editing is carried out in mammalian cells using CFgRNA designs for HPRT loss-of-function edit, for example wherein the HPRT loss of function is introduced by a frame shift barcode insertion (
CREATE fusion barcoding and editing is carried out in mammalian cells, where the cells are co-transfected with CFgRNAs targeting various genomic edits and a CD81-HA knock-in barcoding edit, wherein each genomic edit CFgRNA design is paired with a unique barcode CFgRNA during co-transfection. HA-tag knock-in provides a phenotypic handle for live-cell selection which enriches the population of cells with the desired genomic edits. NGS data shows genomic DNA-level insertion rates across diverse barcode sequences. NGS data also shows correlation of the barcode-genomic edit pairs, and shows a proof-of-concept for using barcodes to track intended genomic edits.
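The use of barcode-edit pairs for tracking may be illustrated as a lookup from an observed barcode to the edit design with which it was co-delivered. The sketch below is a hypothetical illustration rather than the analysis used to generate the NGS data referred to above; the barcode sequences, edit labels, and function name are placeholders.

```python
from collections import Counter

# Hypothetical design table: each barcode was co-delivered with one edit design.
BARCODE_TO_EDIT = {"GCTAGCTAG": "edit_A", "TTGACCATG": "edit_B"}


def pair_concordance(observations, barcode_to_edit):
    """observations: iterable of (barcode_sequence, observed_edit_or_None).

    Returns (tally, concordance), where concordance is the fraction of
    known-barcode observations whose edit matches the paired design.
    """
    tally = Counter()
    for barcode, observed_edit in observations:
        expected = barcode_to_edit.get(barcode)
        if expected is None:
            tally["unknown_barcode"] += 1     # barcode not in the design table
        elif observed_edit == expected:
            tally["concordant"] += 1          # barcode predicts the edit
        else:
            tally["discordant"] += 1          # edit absent or mismatched
    known = tally["concordant"] + tally["discordant"]
    concordance = tally["concordant"] / known if known else float("nan")
    return dict(tally), concordance


# Synthetic observations, used only to exercise the function.
example_observations = ([("GCTAGCTAG", "edit_A")] * 8
                        + [("GCTAGCTAG", None)]
                        + [("TTGACCATG", "edit_A")]
                        + [("AAAAAAAAA", "edit_B")] * 2)
tally, concordance = pair_concordance(example_observations, BARCODE_TO_EDIT)
```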
Further to the aforementioned examples, in some aspects, the compositions, methods, and modified cells of the current disclosure apply to the use of gRNA. In some aspects, the compositions, methods, and modified cells of the current disclosure apply to the use of any type of gRNA. In some aspects, the compositions, methods, and modified cells of the current disclosure apply to the use of one or more types of gRNAs.
In some aspects, the compositions, methods, and modified cells of the current disclosure apply to gene editing via endogenous repair mechanisms, e.g., Homology-Directed Repair (HDR), recombination pathways, or other DNA repair pathways. In some aspects, the compositions, methods, and modified cells of the current disclosure apply to HDR-based gene editing. In some aspects, the compositions, methods, and modified cells of the current disclosure apply to any method to introduce a genetic mutation into a genome (e.g., knock-in). In some aspects, the compositions, methods, and modified cells of the current disclosure apply to the use of gRNAs and HDR-based gene editing.
While this invention is satisfied by aspects in many different forms, as described in detail in connection with preferred aspects of the invention, it is understood that the present disclosure is to be considered as an example of the principles of the invention and is not intended to limit the invention to the specific aspects illustrated and described herein. Numerous variations may be made by persons skilled in the art without departure from the spirit of the invention. The scope of the invention will be measured by the appended claims and their equivalents. The abstract and the title are not to be construed as limiting the scope of the present invention, as their purpose is to enable the appropriate authorities, as well as the general public, to quickly determine the general nature of the invention. In the claims that follow, unless the term “means” is used, none of the features or elements recited therein should be construed as means-plus-function limitations pursuant to 35 U.S.C. § 112.
This application claims the benefit of U.S. Provisional Application No. 63/291,550, filed Dec. 20, 2021, and U.S. Provisional Application No. 63/347,709, filed Jun. 1, 2022, the contents of which are incorporated herein by reference in their entireties.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2022/053377 | 12/19/2022 | WO |
Number | Date | Country | |
---|---|---|---|
63347709 | Jun 2022 | US | |
63291550 | Dec 2021 | US |