The present invention relates generally to single cell analysis and more specifically to simultaneous analysis of DNA and RNA from the same cell.
Single cell transcriptomic sequencing (scRNA-seq) has shown that individual cells are unique, heterogeneous units with their own subtle but importantly different gene expression profiles. Separately, single-cell DNA sequencing has highlighted genetic heterogeneity in multicellular organisms and its role in inherited diseases, cancer, and aging. Genomic sequences and transcriptomic profiles each independently reflect intercellular differences, and coupling targeted genomic information to the transcriptome of a single cell can provide a high-resolution view of the direct and diverse relationship between genotype and phenotype.
The ability to simultaneously assay genomic DNA (gDNA) and messenger RNA (mRNA) from the same single cell is important for attributing differential transcriptomic output to specific cell genotypes. A high-throughput method for single cell measurements of mRNA and DNA can provide advantages to multiple applications, ranging from clinical diagnostics to CRISPR screens and cellular barcoding. Existing approaches for generating this particular combination of data are limited by time consumption and low throughput. More recent high-throughput methodologies exist that employ cDNA as a proxy for genomic sequence, but these restrict the measurable DNA sequence to only expressed regions of the genome. Thus, there exists a need for efficient and cost-effective methods of simultaneously analyzing DNA and RNA from the same cell.
The present invention is based on the seminal discovery that DNA and RNA from the same single cell can be captured and analyzed simultaneously using sequencing methodologies.
While the illustrative examples herein provide DNA and RNA analysis in droplets, it is understood that the methods of the invention can be performed in non-droplet cell encapsulation methods as well, including for example, Fluorescence activated cell sorting (FACS), gravity, microfluidics and the like.
In some embodiments, provided herein are methods of simultaneously analyzing DNA and RNA from the same cell including: (a) providing a droplet including a single cell, wherein the single cell is lysed providing nucleic acid in the droplet; (b) performing a first polymerase chain reaction (PCR) reaction on DNA in the droplet, thereby generating amplicons including a 3′ poly(dA) sequence and a bead oligonucleotide sequence; (c) capturing the RNA on a microparticle comprising a bead oligonucleotide sequence; (d) breaking the droplet and separating the supernatant comprising the amplicons from the RNA captured on the microparticle; and (e) performing a reverse transcription reaction transcribing the RNA. In one aspect, methods provided herein further include preparing libraries of the separated amplicons and transcribed RNA. In another aspect, methods provided herein further include performing a PCR reaction on the transcribed RNA. In another aspect, the methods provided herein further comprise enzymatically modifying the amplicons. In an additional aspect, the amplicons are modified by a lambda nuclease and a terminal transferase. In a further aspect, the methods provided herein further comprise biotinylating the amplicons by biotinylated second strand synthesis. In some aspects, the methods provided herein further comprise subjecting the amplicons to mung bean nuclease modification. In another aspect, the methods provided herein further include performing a second PCR reaction on the amplicons. In an additional aspect, the methods provided herein further include performing a third PCR reaction on the amplicons. In certain aspects, the forward primers for the second PCR reaction on the amplicons include sites for sequencing. In a further aspect, the forward and reverse primers in the third PCR reaction on the amplicons include sites for sequencing. In yet another aspect, methods provided herein further include sequencing the transcribed RNA and amplicons after the third PCR reaction. In one aspect, the microparticle that captures RNA is a bead. In another aspect, the bead includes bead oligonucleotide sequences. In yet another aspect, the bead oligonucleotide sequences include a barcode. In a further aspect, the barcode includes a cellular barcode and a unique molecular identifier. In certain aspects, the oligonucleotide sequences further include a poly(dT) sequence. In some aspects, the oligonucleotide sequences further include a PCR handle for reverse transcription and PCR. In certain aspects, methods provided herein further include mapping sequences of separated molecules that include a matching cellular barcode to the same cell. In one aspect, the DNA analyzed by the methods provided herein is genomic DNA, mitochondrial DNA, or a combination thereof. In another aspect, the RNA analyzed by the methods provided herein is messenger RNA (mRNA), long non-coding RNA, or a combination thereof. In a further aspect, first PCR reverse primers include a poly(dT) sequence. In yet a further aspect, the reverse transcription reaction includes Moloney Murine Leukemia Virus (M-MLV)-reverse transcriptase and template switching oligonucleotides.
In some embodiments, provided herein are methods of simultaneously analyzing DNA and RNA from the same cell including: (a) providing a droplet including a single cell, wherein the single cell is lysed providing nucleic acid in the droplet; (b) performing a first polymerase chain reaction (PCR) reaction on DNA in the droplet, thereby generating amplicons including a 3′ poly(dA) sequence and a bead oligonucleotide sequence; (c) capturing the RNA with a microparticle comprising a bead oligonucleotide sequence; (d) breaking the droplet and separating the supernatant containing the amplicons from the RNA captured on the microparticle; (e) performing a reverse transcription reaction on the RNA including the bead oligonucleotide sequence; (f) enzymatically modifying the amplicons and performing a second PCR reaction on the amplicons; and (g) performing a PCR reaction on the transcribed RNA. In one aspect, methods provided herein further include performing a third PCR reaction on the amplicons.
In some embodiments, provided herein are methods of analyzing a transcriptome of a genome-edited cell including: (a) determining a genotype of a single cell by sequencing amplicons prepared by methods of simultaneously analyzing DNA and RNA from the same cell provided herein, thereby identifying edited and unedited cells; (b) sequencing transcribed RNA prepared by methods of simultaneously analyzing DNA and RNA from the same cell provided herein; (c) mapping sequences of amplicons and sequences of transcribed RNA that include a matching cellular barcode to the same cell; and (d) grouping sequences of amplicons and sequences of transcribed RNA from edited and unedited cells according to matching genome edits. In another aspect, methods provided herein include preparing a sequencing library of amplicons and preparing a sequencing library of transcribed RNA before sequencing transcribed molecules. In a further aspect, single cells include a genomic barcode. In yet a further aspect, edited cells include one or more mutations in the genomic barcode.
In some embodiments, provided herein are methods of determining tumor heterogeneity that include simultaneously analyzing DNA and RNA from a tumor cell using any of the methods provided herein.
In some embodiments, provided herein are methods of determining somatic mosaicism that include simultaneously analyzing DNA and RNA from a cell using any of the methods provided herein. In one aspect, the cell is a normal cell. In another aspect, the cell is a disease cell. In yet another aspect, the disease cell is a tumor cell. In a further aspect, somatic mosaicism includes a mutation or a chromosomal rearrangement.
In some embodiments, provided herein are methods of screening for perturbations in cells modified with guide RNAs that include simultaneously analyzing DNA and RNA of a cell in a population of modified cells using any of the methods provided herein. In one aspect, the cells are modified with a library (e.g., lentiviral) of guide RNAs representative of a range of genes. A readout of integrated guide RNAs provides information as to perturbed genes. In one aspect, the cells are modified with a gene modifying agent selected from a CRISPR-associated (Cas) protein, a Cre DNA recombinase, a TALEN, a zinc finger nuclease, a homing endonuclease, or a targeted SPO11 nuclease.
In some embodiments, provided herein are methods of probing genetic thresholds on a phenotype that include simultaneously analyzing DNA and RNA from a cell using any of the methods provided herein. In one aspect, the phenotype is a normal phenotype. In another aspect, the phenotype is a disease phenotype.
In some embodiments, provided herein are methods of genotyping cells that include simultaneously analyzing DNA and RNA from a cell using any of the methods provided herein. In one aspect, the cell is a tumor cell, a genome-edited cell, a disease cell, or a normal cell.
In some embodiments, provided herein are methods of tracing a lineage of a cell that include simultaneously analyzing DNA and RNA from the cell the lineage of which is being traced using any of the methods provided herein. In one aspect, methods of tracing cell lineage provided herein further include marking the cell with a barcode. In another aspect the barcode is a DNA barcode. In yet another aspect, the DNA barcode is an editable DNA barcode.
In some embodiments, provided herein are oligonucleotides that include a PCR handle for reverse transcription and PCR, a barcode, and a poly(dT) sequence. In one aspect, the barcode includes a cellular barcode and a unique molecular identifier. In another aspect, the oligonucleotides are attached to a microparticle. In a further aspect, the microparticle includes a bead. In yet a further aspect, oligonucleotides provided herein further include an amplified DNA sequence including a 3′ poly(dA) sequence.
In another embodiment a kit including a combination lysis and PCR buffer comprising a lysis component, PCR components, and a reaction buffer, wherein the lysis component comprises a nonionic, non-denaturing detergent, wherein the PCR component comprises a polymerase, deoxynucleoside triphosphates, and PCR primers, and wherein the reaction buffer comprises MgCl2, Tween-80, carrier protein, Tris and NaCl is provided. In a specific aspect, the Tris buffer is at pH 8.0. In some aspects, the reaction buffer comprises dimethyl sulfoxide (DMSO). In one aspect, the kit further includes droplet generation oil; and a microparticle. In one aspect, the kit further includes instructions to generate aqueous solution-in-oil droplets comprising a cell and a microparticle. In another aspect, the kit further includes a polymerase and instructions to perform a polymerase chain reaction (PCR) reaction on DNA in the droplet. In one aspect, the PCR reaction generates amplicons comprising a 3′ poly(dA) sequence and bead oligonucleotide sequence. In another aspect, the kit further includes a reverse transcriptase and instructions to perform a reverse transcription reaction on the RNA. In a further aspect, the kit also includes instructions to separate the amplicons and the RNA prior to performing the reverse transcriptase reaction on the RNA. In some aspects, the combination lysis and PCR buffer includes a lysis component, PCR components, and a reaction buffer. In various aspects, the lysis component includes Igepal CA-630. In many aspects, the PCR components include a polymerase, deoxynucleoside triphosphates, and PCR primers. In various aspects, the polymerase is a Q5 Hot Start High-Fidelity DNA polymerase, Phusion polymerase or a KOD polymerase. In many aspects, the carrier protein is BSA or ubiquitin. In some aspects, the reaction buffer includes MgCl2, Tween-80, a carrier protein, Tris and NaCl.
Before the present compositions and methods are described, it is to be understood that this invention is not limited to particular compositions, methods, and experimental conditions described, as such compositions, methods, and conditions may vary. It is also to be understood that the terminology used herein is for purposes of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only in the appended claims.
Provided herein, in some illustrative embodiments, are methods of simultaneously analyzing DNA and RNA from the same cell that include (a) providing a droplet including a single cell, wherein the single cell is lysed providing nucleic acid in the droplet; (b) performing a first polymerase chain reaction (PCR) reaction on DNA in the droplet, thereby generating amplicons including a 3′ poly(dA) sequence and a bead oligonucleotide sequence; (c) capturing the RNA on a microparticle containing a bead oligonucleotide sequence; (d) breaking the droplet and separating the supernatant containing the amplicons from the RNA captured on the microparticles; and (e) performing a reverse transcription reaction on the RNA including the bead oligonucleotide sequence, thereby transcribing the RNA.
Also provided herein, in some embodiments, are methods of simultaneously analyzing DNA and RNA from the same cell that include (a) providing a droplet including a single cell, wherein the single cell is lysed providing nucleic acid in the droplet; (b) performing a first polymerase chain reaction (PCR) reaction on DNA in the droplet, thereby generating amplicons including a 3′ poly(dA) sequence and a bead oligonucleotide sequence; (c) capturing the RNA on a microparticle containing a bead oligonucleotide sequence; (d) breaking the droplet and separating the supernatant containing the amplicons from the captured RNA; (e) performing a reverse transcription reaction on the RNA including the bead oligonucleotide sequence, thereby transcribing the RNA; (f) enzymatically modifying the amplicons and performing a second PCR reaction on the modified amplicons; and (g) performing a second PCR reaction on the transcribed RNA, thereby amplifying transcribed RNA.
Methods of simultaneously analyzing DNA and RNA from the same cell provided herein include providing droplets that include nucleic acid from a single cell. Droplets encompass single cells that are lysed within the droplets, thus releasing nucleic acid into the droplets. As used herein, the term “nucleic acid” refers to any deoxyribonucleic acid (DNA) molecule, ribonucleic acid (RNA) molecule, or nucleic acid analogues. A DNA or RNA molecule can be double-stranded or single-stranded and can be of any size. Exemplary nucleic acids include, but are not limited to, chromosomal DNA, mitochondrial DNA, chloroplast DNA, plasmid DNA, cDNA, cell-free DNA (cfDNA), mRNA, tRNA, rRNA, siRNA, micro RNA (miRNA or miR), hnRNA, and long non-coding RNA. As used herein, the term “nucleic acid molecule” is meant to include fragments of nucleic acid molecules as well as any full-length or non-fragmented nucleic acid molecule, for example.
Any DNA and RNA can be analyzed using the methods provided herein. In one aspect, DNA analyzed using the methods provided herein is genomic DNA, mitochondrial DNA, or a combination thereof. In another aspect, the RNA is messenger RNA (mRNA), long non-coding RNA, or a combination thereof.
Methods of simultaneously analyzing DNA and RNA from the same cell provided herein include performing a first polymerase chain reaction (PCR) reaction on DNA in the droplets. A first PCR on DNA in the droplets generates amplicons that include a 3′ poly(dA) sequence and a bead oligonucleotide sequence. The 3′ poly(dA) sequence of amplicons derived from DNA can be used to prime the bead oligonucleotides and incorporate their sequences into the amplicons. Accordingly, in one aspect, first PCR reverse primers include a poly(dT) sequence. The poly(dT) sequence included in first PCR reverse primers results in the introduction of the 3′ poly(dA) sequence of amplicons derived from DNA, such as genomic DNA, mitochondrial DNA, or a combination thereof. In another aspect, a bead reverse primer can be used that binds to the bead oligo sequences that any amplicons have incorporated, allowing them to be further amplified in the subsequent cycles of the first PCR. The bead oligonucleotide sequence can be used to match to the RNA with the same bead oligonucleotide sequence.
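By way of illustration only, the relationship between a poly(dT) sequence on the first PCR reverse primer and the resulting 3′ poly(dA) sequence on the amplicon can be sketched in a few lines of Python. The target sequence, the tail length, and the function names below are hypothetical and chosen only for the example; the sketch models strand complementarity and does not represent any particular primer design.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def amplicon_strands(target_top: str, tail_len: int = 20) -> tuple[str, str]:
    """Model the two strands of an amplicon made with a poly(dT)-tailed reverse primer.

    target_top is the top strand of the amplified region (hypothetical input).
    The bottom strand starts with the primer's 5' poly(dT) tail, so copying it
    places a poly(dA) run at the 3' end of the top strand.
    """
    bottom = "T" * tail_len + reverse_complement(target_top)
    top = reverse_complement(bottom)
    return top, bottom


top, bottom = amplicon_strands("GATTACAGG", tail_len=5)
print(top)     # GATTACAGGAAAAA
print(bottom)  # TTTTTCCTGTAATC
```

In this simplified model, the 3′ poly(dA) run on the top strand is what is available to anneal to a poly(dT) sequence on a bead oligonucleotide.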
Methods of simultaneously analyzing DNA and RNA from the same cell provided herein include capturing the RNA. In one aspect, the RNA is captured on a support. Any suitable support can be used to capture the RNA, such as a microparticle, for example. In some aspects, the microparticle for capture of amplicons and RNA is a bead. In a further aspect, the beads for capture of the RNA include oligonucleotide sequences. Oligonucleotide sequences can be attached to the bead by any suitable method, including covalent and non-covalent interactions. In one aspect, oligonucleotides are covalently attached to the bead.
In another aspect, bead oligonucleotide sequences include a barcode. Barcodes can be used to identify the origin or source of a nucleic acid molecule or of a nucleic acid sequence. A barcode can include any number of nucleotides. As an example, a barcode can include about 10 to about 35 nucleotides. As another example, a barcode can include about 12 to about 25 nucleotides. As yet another example, a barcode can include about 5, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, about 30, about 31, about 32, about 33, about 34, about 35, about 36, about 37, about 38, about 39, about 40, or more nucleotides. As yet another example, a barcode can include at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, at least 30, at least 31, at least 32, at least 33, at least 34, at least 35, at least 36, at least 37, at least 38, at least 39, at least 40, or more nucleotides. In yet another aspect, the barcode includes a cellular barcode and a unique molecular identifier (UMI). Cellular barcodes can be used to identify the cell a nucleic acid molecule came from and allow for grouping of sequence reads generated from nucleic acid molecules into cell categories. Thus, sequence reads of nucleic acid molecules from the same cell can be grouped together. UMIs can be used to identify a nucleic acid molecule that gave rise to a sequence read. Accordingly, UMIs can be used to group together sequence reads generated from the same nucleic acid molecule.
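As a non-limiting sketch of how cellular barcodes and UMIs can be used downstream, the following Python example groups sequence reads by cellular barcode and counts distinct UMIs, so that reads generated from the same nucleic acid molecule are counted once. The read tuples, barcode sequences, and the function name are hypothetical, and the sketch assumes error-free barcodes; practical pipelines typically also correct sequencing errors in barcodes.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per (cellular barcode, target) pair.

    reads is an iterable of (cell_barcode, umi, target) tuples (hypothetical
    layout). Reads sharing all three values are treated as copies of one
    original molecule.
    """
    umis = defaultdict(set)
    for cell_barcode, umi, target in reads:
        umis[(cell_barcode, target)].add(umi)
    return {key: len(value) for key, value in umis.items()}


reads = [
    ("AACCGGTTAACC", "ACGTACGT", "geneA"),
    ("AACCGGTTAACC", "ACGTACGT", "geneA"),  # duplicate read of the same molecule
    ("AACCGGTTAACC", "TTGGCCAA", "geneA"),  # second molecule, same cell
    ("GGTTAACCGGTT", "ACGTACGT", "geneA"),  # same UMI, different cell
]
print(count_molecules(reads))
```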
In a further aspect, oligonucleotide sequences include a poly(dT) sequence. A poly(dT) sequence allows for the capture and tagging of nucleic acid molecules that include a poly(dA) sequence. Exemplary molecules that can interact with an oligonucleotide poly(dT) sequence include mRNA, long non-coding RNA, and amplicons derived from DNA with poly(dA) sequences generated by PCR. In yet a further aspect, oligonucleotide sequences include a PCR handle for reverse transcription and PCR. The PCR handle can be used for reverse transcription and/or PCR of captured nucleic acid molecules.
Methods of simultaneously analyzing DNA and RNA from the same cell provided herein further include breaking the droplets that include nucleic acid from a single cell and separating the supernatant containing the amplicons from the captured RNA. Further, the method provides performing a reverse transcription reaction on the captured RNA, thereby transcribing the RNA. In one aspect, the reverse transcription reaction is performed on beads after capture of RNA and breaking of the droplets. In another aspect, the reverse transcription reaction is performed using Moloney Murine Leukemia Virus (M-MLV)-reverse transcriptase and template switching oligonucleotides. In yet another aspect, transcribing the amplicons and the RNA incorporates microparticle oligonucleotide sequences into transcribed molecules.
Methods of simultaneously analyzing DNA and RNA from the same cell provided herein further include separating amplicons and captured RNA. In one aspect, a PCR reaction is performed on the transcribed RNA. Any suitable method can be used for separating transcribed molecules. In one aspect, the amplicons are enzymatically modified. In certain aspects, the amplicons are enzymatically modified using a lambda nuclease to prevent the reverse strand from acting as a downstream template amplicon without a bead oligonucleotide sequence. In some aspects, the amplicons are further enzymatically modified using a terminal transferase and ddNTPs to prevent forward strands without a bead oligonucleotide sequence from priming reverse strands with a bead oligonucleotide sequence and inducing template switching. In a further aspect, the amplicons are then biotinylated by a biotinylated second strand synthesis reaction. In some aspects, the amplicons are subjected to mung bean nuclease modification. In some aspects, the amplicons are removed from the supernatant using streptavidin beads. In certain aspects, a second PCR reaction is performed on the amplicons using forward primers including sequencing sites. Primers that target regions of interest in transcribed amplicons can be used in the PCR reactions. Any region or sequence of transcribed amplicons can be a region of interest and be targeted by primers for amplification. More than one primer pair or more than one set of primers can be used to amplify regions of interest. In yet another aspect, forward primers for PCR on tagged amplicons include sites for sequencing. Accordingly, a PCR reaction on tagged amplicons using primers can be performed before preparing sequencing libraries and sequencing transcribed molecules.
Methods of simultaneously analyzing DNA and RNA provided herein further include preparing libraries of separated amplicons and transcribed RNA. Any suitable method for library preparation can be used. In one aspect, libraries of transcribed RNA (i.e., cDNA sequencing libraries) are prepared using tagmentation methods followed by amplification. In another aspect, amplicon sequencing libraries are prepared from the supernatant derived from breaking the droplet and subjected to enzymatic modification using lambda nuclease and terminal transferase with ddNTPs followed by being subjected to a biotinylated second strand synthesis reaction, mung bean nuclease modification, and selection of biotinylated molecules with streptavidin beads followed by two rounds of PCR, as detailed above.
Methods of simultaneously analyzing DNA and RNA from the same cell provided herein further include sequencing the separated transcribed molecules. Any sequencing method can be used, including Sanger sequencing using labeled terminators or primers and gel separation in slab or capillary systems, and Next Generation Sequencing (NGS), for example. Exemplary NGS methodologies include the Roche 454 sequencer, Life Technologies SOLiD systems, the Life Technologies Ion Torrent, BGI/MGI systems, Genapsys systems, and Illumina systems such as the Illumina Genome Analyzer II, Illumina MiSeq, Illumina HiSeq, Illumina NextSeq, and Illumina NovaSeq instruments. In one aspect, methods of simultaneously analyzing DNA and RNA from the same cell provided herein further include mapping sequences of separated transcribed molecules having a matching cellular barcode to the same cell.
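The mapping of sequences having a matching cellular barcode to the same cell can be sketched as follows. This is a minimal illustration, assuming that reads from the amplicon library and the cDNA library have already been reduced to (cellular barcode, sequence) pairs; the data layout and the function name are hypothetical.

```python
def map_reads_to_cells(amplicon_reads, cdna_reads):
    """Collect amplicon and cDNA sequences under their shared cellular barcode.

    Each input is an iterable of (cell_barcode, sequence) pairs (hypothetical
    layout). Returns {cell_barcode: {"dna": [...], "rna": [...]}}.
    """
    cells = {}
    for library, reads in (("dna", amplicon_reads), ("rna", cdna_reads)):
        for cell_barcode, sequence in reads:
            entry = cells.setdefault(cell_barcode, {"dna": [], "rna": []})
            entry[library].append(sequence)
    return cells


amplicon_reads = [("AACCGGTTAACC", "ACGTTCGT"), ("GGTTAACCGGTT", "ACGTACGT")]
cdna_reads = [("AACCGGTTAACC", "TTGACCA"), ("CCAATTGGCCAA", "GGATCCA")]

cells = map_reads_to_cells(amplicon_reads, cdna_reads)
# Keep only cells for which both DNA and RNA information was recovered.
paired = {cb: data for cb, data in cells.items() if data["dna"] and data["rna"]}
print(sorted(paired))  # ['AACCGGTTAACC']
```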
In some embodiments, provided herein are methods of analyzing a transcriptome of a genome-edited cell that include (a) determining a genotype of a single cell by sequencing transcribed amplicons prepared by any of the methods provided herein that include simultaneously analyzing DNA and RNA from the same cell, thereby identifying edited and unedited cells; (b) sequencing transcribed RNA prepared by any of the methods provided herein that include simultaneously analyzing DNA and RNA from the same cell; (c) mapping sequences of transcribed amplicons and sequences of transcribed RNA that include a matching cellular barcode to the same cell; and (d) grouping sequences of transcribed amplicons and sequences of transcribed RNA from edited and unedited cells according to matching genome edits.
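Steps (a) through (d) can be illustrated with a simplified sketch. The example below assumes one consensus amplicon sequence per cellular barcode and per-cell transcript counts keyed by the same barcode; cells whose amplicon matches a reference sequence are labeled unedited, and edited cells are grouped by their observed allele. The reference sequence, the exact-match rule, and all names are hypothetical simplifications, not a prescribed genotyping procedure.

```python
from collections import defaultdict


def genotype_cells(amplicon_by_cell, reference):
    """Label each cell 'unedited' or 'edited:<allele>' by exact comparison."""
    return {
        cell: "unedited" if allele == reference else f"edited:{allele}"
        for cell, allele in amplicon_by_cell.items()
    }


def mean_expression_by_genotype(genotypes, counts_by_cell):
    """Average transcript counts over the cells sharing a genotype label."""
    totals = defaultdict(lambda: defaultdict(float))
    n_cells = defaultdict(int)
    for cell, genotype in genotypes.items():
        if cell not in counts_by_cell:
            continue  # no RNA recovered for this cellular barcode
        n_cells[genotype] += 1
        for gene, count in counts_by_cell[cell].items():
            totals[genotype][gene] += count
    return {
        genotype: {gene: total / n_cells[genotype] for gene, total in genes.items()}
        for genotype, genes in totals.items()
    }


reference = "ACGTACGT"
amplicon_by_cell = {"c1": "ACGTACGT", "c2": "ACGTTCGT", "c3": "ACGTTCGT", "c4": "ACGACGT"}
counts_by_cell = {"c1": {"G1": 4}, "c2": {"G1": 1, "G2": 2}, "c3": {"G1": 3}, "c4": {"G1": 5}}

genotypes = genotype_cells(amplicon_by_cell, reference)
print(mean_expression_by_genotype(genotypes, counts_by_cell))
```

A gene absent from a cell's counts contributes zero to the group mean, because each total is divided by the number of cells in the group.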
As used herein, the term “transcriptome” means all RNA transcripts in a cell or in a population of cells. Accordingly, RNA transcripts include coding and non-coding RNA, and the term “transcriptome” encompasses both coding and non-coding RNA, unless context clearly indicates otherwise. As used herein, the term “genome editing” means insertion, deletion, modification, or replacement of DNA in the genome of a cell or organism. Any type of genetic engineering can be used for genome editing, including gene targeting, conditional gene targeting, homologous recombination, and use of nucleases, such as meganucleases, zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the CRISPR/Cas system. As used herein, the term “edited cell” means a cell whose genome includes a desired insertion, deletion, modification, or replacement of genomic DNA. The genome of an edited cell may include a mutation as a result of editing. Alternatively, editing can be used to correct a mutation, i.e., restore a mutation or alteration to wild-type or to an equivalent of wild-type. As used herein, “unedited cell” means a cell whose genome does not include a desired insertion, deletion, modification, or replacement of genomic DNA. Accordingly, the genome of an unedited cell may be wild-type or include a mutation, based on the nature or purpose of genome editing.
Methods of analyzing a transcriptome of a genome-edited cell provided herein include performing a first polymerase chain reaction (PCR) reaction on DNA in a droplet, thereby generating amplicons including a 3′ poly(dA) sequence. Generation of amplicons that include a poly(dA) sequence allows the amplicons to be tagged by a support that includes an oligonucleotide sequence, such as a bead that includes an oligonucleotide sequence having a poly(dT) sequence. Accordingly, in one aspect, first PCR reverse primers include a poly(dT) sequence.
In one aspect, single cells in the methods of analyzing a transcriptome of a genome-edited cell provided herein include a genomic barcode. In another aspect, edited cells of the methods provided herein include one or more mutations in the genomic barcode. In yet another aspect, mutations in the genomic barcode result from genome editing. In one aspect, the barcodes are introduced by genome editing. In other aspects, the barcodes are introduced, for example by methods that cause random or untargeted genomic changes (e.g., use of transposases or lentiviruses).
In one aspect, methods of analyzing a transcriptome of a genome-edited cell provided herein include separating the supernatant containing amplicons from the captured RNA. The captured RNA is subjected to a reverse transcription reaction, thereby transcribing the RNA. The isolated amplicons are subjected to enzymatic modification as described previously and to further PCR reactions. In another aspect, methods of analyzing a transcriptome of a genome-edited cell further include preparing a sequencing library of transcribed amplicons and preparing a sequencing library of transcribed RNA before sequencing transcribed molecules. Any of the methods provided herein can be used for preparing sequencing libraries, including tagmentation methods for cDNA sequencing library preparation and PCR. In one aspect, the amplicons are enzymatically modified using lambda nuclease and a terminal transferase followed by a biotinylated second strand synthesis reaction and modification by mung bean nuclease prior to a second and third PCR reaction.
Methods of analyzing a transcriptome of a genome-edited cell provided herein include capturing and transcribing RNA. In one aspect, transcribed RNAs are captured on a support, such as a microparticle, for example. In another aspect, the microparticle is a bead. In yet another aspect, beads for capture of transcribed RNA include oligonucleotide sequences. In another aspect, bead oligonucleotide sequences include a barcode that includes a cellular barcode and a unique molecular identifier (UMI). In a further aspect, bead oligonucleotide sequences further include a poly(dT) sequence, a PCR handle for reverse transcription and PCR, or any combination thereof. Any suitable reverse transcriptase, including engineered reverse transcriptase, can be used to transcribe captured RNA. In one aspect, the reverse transcriptase is Moloney Murine Leukemia Virus (M-MLV)-reverse transcriptase, using template switching oligonucleotides.
Any type of nucleic acid can be analyzed using the methods of analyzing a transcriptome of a genome-edited cell provided herein. In one aspect, a first PCR reaction is performed on DNA in droplets as provided herein, wherein the DNA is genomic DNA, mitochondrial DNA, or a combination thereof. In another aspect, transcribed and/or captured RNA includes mRNA, long non-coding RNA, or a combination thereof.
Any of the methods provided herein can be used to determine tumor heterogeneity, for example. Methods provided herein can also be used to determine somatic mosaicism. In one aspect, single cells analyzed by the methods provided herein for determining somatic mosaicism are normal cells. In another aspect, single cells analyzed by the methods provided herein for determining somatic mosaicism are tumor cells. In certain aspects, somatic mosaicism includes a mutation or a chromosomal rearrangement.
Methods provided herein can also be used for screening for perturbations in cells modified with guide RNAs in a population of modified cells. In one aspect, the cells are modified with a library (e.g., lentiviral) of guide RNAs representative of a range of genes. Any suitable type of library can be used to generate genome edits. A readout of integrated guide RNAs provides information as to perturbed genes. In one aspect, the cells are modified using a gene modifying agent selected from a CRISPR-associated (Cas) protein, a Cre DNA recombinase, a TALEN, a zinc finger nuclease, a homing endonuclease, or a targeted SPO11 nuclease.
Methods provided herein can be used for probing genetic thresholds on phenotype. In one aspect, the phenotype is a normal phenotype. In another aspect, the phenotype is a disease phenotype. Accordingly, methods provided herein can be used to determine the number and/or type of genetic markers or genetic changes that contribute to a phenotype of interest.
Any of the methods for simultaneously analyzing DNA and RNA from a cell provided herein can be used to genotype cells. In one aspect, the cell is a tumor cell. In another aspect, the cell is a genome-edited cell. In a further aspect, the cell is a disease cell. In yet a further aspect, the cell is a normal cell. Any of these cell types are optionally barcoded cells. Accordingly, the genotype of any cell can be determined using the methods provided herein. Exemplary cells that can be analyzed include single cells from any organ, single cells from any cell culture, primary cells, and cells that have been preserved by any suitable method, including single frozen cells, single formalin-fixed cells, single methanol-fixed cells, or single cells from formalin-fixed paraffin-embedded (FFPE) tissue, by way of example.
Methods provided herein for simultaneously analyzing DNA and RNA from the same cell can be used for tracing the lineage of the cell. In one aspect, the cell whose lineage is being traced is marked with a barcode. In another aspect, the barcode is a DNA barcode. In a further aspect, the DNA barcode is an editable barcode. Any of the genome-editing methods provided herein can be used to edit a DNA barcode, including use of a gene modifying agent such as a CRISPR-associated (Cas) protein, a Cre DNA recombinase, a TALEN, a zinc finger nuclease, a homing endonuclease, or a targeted SPO11 nuclease, for example.
Provided herein, in some embodiments, are oligonucleotides including a PCR handle for reverse transcription, a barcode, and a poly(dT) sequence. In one aspect, the barcode includes a cellular barcode and a unique molecular identifier (UMI). In yet another aspect, the oligonucleotides are attached to microparticles, such as beads, for example. In a further aspect, oligonucleotides provided herein include tagging DNA with a 3′ poly(dA) sequence and/or a bead oligonucleotide sequence. In yet a further aspect, the DNA that includes a 3′ poly(dA) sequence is genomic DNA, mitochondrial DNA, or a combination thereof.
Generally, oligonucleotides provided herein are single-stranded. Oligonucleotides can be of any length. For example, oligonucleotides can have a length of 10 nucleotides, 11 nucleotides, 12 nucleotides, 13 nucleotides, 14 nucleotides, 15 nucleotides, 16 nucleotides, 17 nucleotides, 18 nucleotides, 19 nucleotides, 20 nucleotides, 25 nucleotides, 30 nucleotides, 35 nucleotides, 40 nucleotides, 45 nucleotides, 50 nucleotides, 60 nucleotides, 70 nucleotides, 80 nucleotides, 90 nucleotides, 100 nucleotides, 125 nucleotides, 150 nucleotides, 200 nucleotides, or more nucleotides, and any number or range in between. PCR handles, barcodes, including cellular barcodes and UMIs, and poly(dT) sequences can be arranged in any order and be separated by any number of nucleotides or be contiguous, i.e., be located next to each other without other nucleotides in between. Cellular barcodes and UMIs included in barcodes of oligonucleotides provided herein can be contiguous, i.e., located next to each other, or located apart from each other. For example, cellular barcodes and UMIs can be separated by 1 nucleotide, 2 nucleotides, 3 nucleotides, 4 nucleotides, 5 nucleotides, 6 nucleotides, 7 nucleotides, 8 nucleotides, 9 nucleotides, 10 nucleotides, 11 nucleotides, 12 nucleotides, 13 nucleotides, 14 nucleotides, 15 nucleotides, 16 nucleotides, 17 nucleotides, 18 nucleotides, 19 nucleotides, 20 nucleotides, or more nucleotides.
Cellular barcodes, UMIs, poly(dT) sequences, and PCR handles can be of any length, including 1 nucleotide, 2 nucleotides, 3 nucleotides, 4 nucleotides, 5 nucleotides, 6 nucleotides, 7 nucleotides, 8 nucleotides, 9 nucleotides, 10 nucleotides, 11 nucleotides, 12 nucleotides, 13 nucleotides, 14 nucleotides, 15 nucleotides, 16 nucleotides, 17 nucleotides, 18 nucleotides, 19 nucleotides, 20 nucleotides, 25 nucleotides, 30 nucleotides, 35 nucleotides, 40 nucleotides, 45 nucleotides, 50 nucleotides, or more nucleotides, and any number or range in between.
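By way of illustration only, the following Python sketch shows how a sequencing read derived from such an oligonucleotide might be split into its cellular barcode and UMI when the two are contiguous. The function name, the 12-nucleotide cellular barcode length, and the 8-nucleotide UMI length are assumptions chosen for the example and are not values taken from this disclosure; as stated above, these elements can be of any length and arranged in any order.

```python
from typing import NamedTuple


class BeadTag(NamedTuple):
    """Cellular barcode and UMI extracted from one read."""
    cell_barcode: str
    umi: str


def split_bead_read(read: str, cell_barcode_len: int = 12, umi_len: int = 8) -> BeadTag:
    """Split the start of a read into a contiguous cellular barcode and UMI.

    The default lengths are illustrative assumptions, not values from the text.
    """
    needed = cell_barcode_len + umi_len
    if len(read) < needed:
        raise ValueError(f"read is {len(read)} nt; need at least {needed} nt")
    # Cellular barcode comes first, immediately followed by the UMI.
    return BeadTag(read[:cell_barcode_len], read[cell_barcode_len:needed])


tag = split_bead_read("ACGTACGTACGTTTGGCCAAT")
print(tag.cell_barcode, tag.umi)
```

Reads sharing a cellular barcode would be attributed to the same cell, and reads sharing both the cellular barcode and the UMI to the same captured molecule.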
In another embodiment, a kit including a combination lysis and PCR buffer; droplet generation oil; and a microparticle is provided. In one aspect, the combination lysis and PCR buffer includes a lysis component, PCR components, and a reaction buffer. In another aspect, the lysis component includes Igepal CA-630. In one aspect, the PCR components include a polymerase, deoxynucleoside triphosphates, and PCR primers. In various aspects, the polymerase is a Q5 Hot Start High-Fidelity DNA polymerase. In another aspect, the reaction buffer includes MgCl2, Tween-80, a carrier protein, Tris, and NaCl. In many aspects, the carrier protein is BSA or ubiquitin.
In one aspect, the kit further includes instructions to generate aqueous solution-in-oil droplets comprising a cell and a microparticle.
In another aspect, the kit further includes a polymerase and instructions to perform a polymerase chain reaction (PCR) on DNA in the droplet. In some aspects, the PCR generates amplicons comprising a 3′ poly(dA) sequence.
In one aspect, the kit further includes a reverse transcriptase and instructions to perform a reverse transcription reaction on RNA. In various aspects, the reverse transcription reaction includes Moloney Murine Leukemia Virus (M-MLV)-reverse transcriptase and template switching oligonucleotides.
In other aspects, the kit further includes instructions to separate the supernatant containing the amplicons from the RNA prior to performing the reverse transcription reaction on the RNA.
In some aspects, the microparticle includes a bead. In one aspect, the bead includes oligonucleotide sequences. In many aspects, the oligonucleotide sequences include a barcode. In some aspects, the barcode includes a cellular barcode and a unique molecular identifier. In one aspect, the oligonucleotide sequences further include a poly(dT) sequence. In other aspects, the oligonucleotide sequences further include a PCR handle for reverse transcription and PCR.
The following examples are provided to further illustrate the embodiments of the present invention but are not intended to limit the scope of the invention. While they are typical of those that might be used, other procedures, methodologies, or techniques known to those skilled in the art may alternatively be used.
This example illustrates design of a method for analysis of DNA and RNA from the same cell.
Collecting information from multiple “omes” from the same single cells allows for a more refined analysis and understanding of cellular heterogeneity and cellular function, as shown in
To overcome these limitations, a system that adapts 3′ poly(dA) capture-based scRNA-seq methods to simultaneously barcode both targeted DNA and RNA from single cells was developed using a microfluidics-based platform (
DREAM-seq is performed as follows (
DREAM-seq relies on PCR amplification of genomic targets directly from cells lysed after droplet encapsulation. This requires overcoming PCR inhibition introduced by the limited reaction volume, beads, and cell lysis. A buffer was developed to effectively lyse cells and proceed directly to PCR in droplets (
Before creating sequencing libraries from gDNA-derived amplicons, droplets were broken open using a filter and supernatant containing the amplicons was purified and concentrated. The purified supernatant was then subjected to a series of enzymatic reactions (
To create sequencing libraries for the DNA amplicons, the molecules bound to streptavidin-coated beads were subjected to a second round of PCR using a nested forward primer to append the first part of the sequencing adapters (
This example shows an exemplary protocol for performing simultaneous DNA and RNA analysis from the same cell.
DREAM-Seq (DNA/RNA Extraction, Amplification, and Multiplexing) Stepwise Protocol:
Equipment:
Real-time PCR detection machine
Thermocycler capable of ramping temperatures (preferably with the ability to hold PCR plates)
E-Gel Power Snap Electrophoresis Device (Thermo Fisher)
2100 Bioanalyzer (Agilent)
Qubit fluorometer (Thermo Fisher)
Tube rotator
Magnetic bead rack
Drop-seq setup (i.e., pumps, microscope, chips, etc.; see mccarrolllab.org/dropseq/ for instructions on building one)
Fuchs-Rosenthal Hemocytometer
MiSeq (Illumina)
1. CS10 freezing medium can be used to freeze single cell suspensions for DREAM-seq. For frozen cells that are particularly buoyant, samples may be frozen at high concentration in CS10 and diluted directly (at least 1:20) with no negative effects on the reactions. Other freezing media have not been tested for their effects on in-droplet cell lysis/PCR.
2. Both PBS and HBSS have been successfully tested as cell suspension buffers; either may be used.
3. PCR plates or tubes used for droplet PCR must be emulsion safe to prevent droplets from merging.
4. Both polymerases have been successfully tested and used in the direct droplet PCR reaction. Other polymerases may be substituted as well, although it should be noted that Taq-based polymerases do not work in this system. It is important that any polymerase used for droplet PCR is a hot start polymerase.
Buffer Formulations and Reaction Mixes:
2× DREAM-seq direct PCR/lysis buffer (1 mL):1
TE-SDS
TE-TW
PBS (or HBSS)-BSA
2× BW-Tween
1DREAM-seq_F
1DREAM-
1. It is extremely important that DREAM-seq primers are PAGE-purified.
Designing and Testing Primers for DREAM-Seq:
It is critical to test primers designed for DREAM-seq before performing any runs. DREAM-seq primers with shorter products (100-300 bp) amplify more efficiently and are preferred. When beads are present in the reaction, ideal primers produce clean bands with no visible primer dimers. Primer sets should be tested with the poly(T) tract already incorporated in the reverse primer.
Testing DREAM-seq primers:
After initial DREAM-seq primer testing, XT_Nested_F primers need to be tested and the whole serial PCR optimized. The 5′ portion of these primers adds the first part of the Nextera sequencing adapter to the amplicons, while the 3′ portion binds specifically to the amplicon. XT_Nested_F primers should bind at least 15 bases downstream of the DREAM-seq_F sequence. The product from this adapter PCR reaction should be clean with no visible off-target amplification. Nested primers and serial PCR can be tested and optimized in bulk by modeling the droplet reactions. Supernatant can be mixed from separate bulk PCR reactions that represent the distribution of empty droplets, droplets with only cells, droplets with only beads, and droplets with a cell and a bead. The mixed supernatant can then be used as a template for testing XT_Nested_F primers in a qPCR reaction, and the resulting curves will determine the number of droplet PCR cycles and adapter PCR cycles that will be needed for actual DREAM-seq experiments.
XT_Nested_F Primer Testing and Optimization:
After verifying that the adapter PCR product is clean, use the qPCR results to determine the number of droplet PCR cycles and adapter PCR cycles for your experiment. If the qPCR reaches early exponential phase before 20 cycles, consider decreasing the number of droplet PCR cycles to no lower than 25 for actual experiments. If the qPCR reaches early exponential phase after 25 cycles, consider increasing the number of droplet PCR cycles (cDNA quality has not been tested at higher than 32 cycles). The number of adapter PCR cycles should be determined based on how many cycles it takes for the reaction to reach early-to-mid exponential phase.
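The decision rule in the preceding paragraph can be sketched in Python as follows. This is a minimal illustration only: the function name is hypothetical, and the "keep the current number" outcome for results between 20 and 25 cycles is an assumption, since the text does not state what to do in that interval.

```python
def droplet_pcr_guidance(early_exponential_cycle: float) -> str:
    """Suggest a droplet PCR cycle adjustment from a qPCR test result.

    early_exponential_cycle: qPCR cycle at which the adapter PCR test
    reaction reaches early exponential phase.
    """
    if early_exponential_cycle < 20:
        # Text: decrease droplet PCR cycles, but to no lower than 25.
        return "consider decreasing droplet PCR cycles (no lower than 25)"
    if early_exponential_cycle > 25:
        # Text: increase droplet PCR cycles; cDNA quality untested above 32.
        return "consider increasing droplet PCR cycles (cDNA quality untested above 32)"
    # Assumption: the text gives no instruction for 20-25 cycles.
    return "keep the current number of droplet PCR cycles"


for cycle in (18, 22, 27):
    print(cycle, "->", droplet_pcr_guidance(cycle))
```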
Running DREAM-Seq:
Bead/Buffer Preparation:
Cell Preparation:
Direct PCR in Droplets:
Tip: Begin preparing the RT mix (below) during the droplet PCR.
Same-Day Bead Processing:
Exonuclease:
Total cDNA Amplification:
Tip: After counting the beads (step 1 below), it may be useful to optimize the number of PCR cycles needed for each experiment before amplifying cDNA from the total pool of beads. Optimization can be done by following the below steps with one or two aliquots of beads run for different numbers of cycles and analyzing the cDNA, or by adding SYBR Green to the PCR reaction and performing qPCR to identify how many cycles are required to reach early-to-mid exponential phase.
Total cDNA Purification and Analysis:
Occasionally in DREAM-seq, particularly in experiments that require higher numbers of droplet PCR cycles, the total cDNA yields a wider or bimodal size distribution, possibly because of RNA degradation during the PCR. When this occurs, the shorter fragments tend to bind more efficiently to the sequencing flow cell and consist largely of polyA, so it is important to remove them during purification.
It was found that two sequential 0.6× AMPure bead purifications (following the manufacturer's instructions) were sufficient when the cDNA was uncompromised, yielding total cDNA with an average size of 1200-1500 bp and a smooth, normal distribution, as assessed on a Bioanalyzer. With this protocol, final purified cDNA was eluted in ~6-8 µl per original PCR reaction.
However, if a lower size distribution of cDNA is observed, it is recommended to purify the total cDNA by concentrating it with Zymo clean and concentrator columns, running it on a 2% E-Gel EX agarose gel, and gel-extracting fragments larger than 500 bp (which may or may not be visible on the gel). One well of the gel was loaded for every 50 PCR reactions to avoid overloading. It was found that the NEB Monarch gel extraction kit combined with the E-Gel EX provided very favorable cDNA yield. With this protocol, final purified cDNA was eluted in ~1-2 µl per original PCR reaction.
Purified total cDNA should be assessed on a Bioanalyzer High Sensitivity DNA chip, according to the manufacturer's instructions. The total cDNA concentration should be at least 150 pg/µl.
cDNA Library Preparation and Purification:
The final library should average between 500 and 800 bp, with an ideal concentration of 4 nM or greater. Libraries can be stored at 4° C. or −20° C. before sequencing.
Droplet Supernatant/Amplicon Purification:
Lambda Exonuclease Digestion:
This reaction removes the reverse strands of amplicons, which is necessary to minimize extreme template switching in the subsequent PCR steps.
Terminal Transferase 3′ End Blocking: This reaction blocks the 3′ end of the amplicons, preventing shorter fragments without incorporated bead oligo sequences from priming longer ones in the subsequent PCR reactions.
Biotin Labelling of Tagged Amplicons:
Single-Stranded DNA Digestion and Streptavidin Pulldown:
Amplicon Adapter PCR and Purification:
For the adapter PCR, we suggest you use between ¼ and ½ of your total purified supernatant volume, to ensure that you have material remaining in case the reaction needs to be repeated.
Expected band size = droplet PCR target size + 30 bp polyA + 45 bp bead oligo − nested primer depth + 34 bp adapter. Actual band size may vary slightly based on polyA/polyT binding.
Amplicon Indexing PCR and Purification:
Expected band size = adapter PCR amplicon size + 41 bp P5-SMART sequence + 32 bp N70X index sequence.
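The two size relationships given above for the adapter PCR and indexing PCR products can be worked through with a short Python sketch. The fixed lengths (30, 45, 34, 41, and 32 bp) are taken from the text; the function names and the example inputs (a 180 bp droplet PCR target and a nested primer depth of 20 bp) are hypothetical and serve only to show the arithmetic.

```python
POLY_A_BP = 30      # polyA tract appended to the amplicon
BEAD_OLIGO_BP = 45  # bead oligonucleotide sequence
ADAPTER_BP = 34     # first part of the sequencing adapter
P5_SMART_BP = 41    # P5-SMART sequence added during indexing PCR
INDEX_BP = 32       # index sequence added during indexing PCR


def adapter_pcr_size(droplet_target_bp: int, nested_primer_depth_bp: int) -> int:
    """Expected adapter PCR product size in bp.

    nested_primer_depth_bp: bases lost from the 5' end of the droplet PCR
    product because the nested primer binds downstream of DREAM-seq_F.
    """
    return (droplet_target_bp + POLY_A_BP + BEAD_OLIGO_BP
            - nested_primer_depth_bp + ADAPTER_BP)


def indexing_pcr_size(adapter_pcr_bp: int) -> int:
    """Expected indexing PCR product size in bp."""
    return adapter_pcr_bp + P5_SMART_BP + INDEX_BP


adapter_bp = adapter_pcr_size(droplet_target_bp=180, nested_primer_depth_bp=20)
print(adapter_bp)                     # 269
print(indexing_pcr_size(adapter_bp))  # 342
```

As noted above, observed band sizes may differ slightly from these calculated values because of variable polyA/polyT binding.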
Sequencing Your Libraries:
Libraries should be sequenced on a MiSeq for quality control and determining the number of cells, before moving on to deep sequencing. If you are sequencing cDNA libraries alone or in combination with the amplicon libraries, it is not necessary to spike in a PhiX control. If you are sequencing amplicon libraries alone, PhiX is required to introduce diversity into your library.
The Custom Read 1 Primer is required for sequencing the Drop-seq bead barcodes. If you are not using PhiX, the custom primer should be diluted and loaded as per Illumina's instructions: support.illumina.com/content/dam/illumina-support/documents/documentation/system_documentation/miseq/miseq-system-custom-primers-guide-15041638-01.pdf. When setting up your sequencing run parameters, you need to specify that a custom Read 1 primer is being used.
If using PhiX, then the custom primer should be diluted and loaded as per Illumina's primer spike-in instructions: support.illumina.com/bulletins/2016/04/spiking-custom-primers-into-the-illumina-sequencing-primers-.html. Otherwise, the PhiX will not be amplified during the run. It was determined that the Custom Read 1 Primer does not interfere with any of the Illumina sequencing primers when spiked in. When setting up the sequencing run parameters in this case, do not select the option for a custom read one primer.
Additional Sequencing Parameters:
Read 1: 25 bp
Read 2: 100 bp (or more, enough to cover the regions of your amplicon sequence that need to be read for genotyping)
Read 1 index: 8 bp
Lastly, when determining the number of reads wanted, cDNA reads should be based on the number of cells, while amplicon reads should be based on the total number of beads.
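The scaling described above can be expressed as a simple calculation. The following Python sketch is illustrative only: the function name is hypothetical, and the per-cell and per-bead read depths used in the example are placeholder numbers, not recommendations from this disclosure.

```python
def read_budget(n_cells: int, n_beads: int,
                cdna_reads_per_cell: int, amplicon_reads_per_bead: int) -> dict:
    """Estimate reads to request for each library type.

    cDNA reads scale with the number of cells; amplicon reads scale with
    the total number of beads, as stated in the text. The per-unit depths
    are user-supplied and not specified by the text.
    """
    values = (n_cells, n_beads, cdna_reads_per_cell, amplicon_reads_per_bead)
    if min(values) < 0:
        raise ValueError("all inputs must be non-negative")
    return {
        "cdna_reads": n_cells * cdna_reads_per_cell,
        "amplicon_reads": n_beads * amplicon_reads_per_bead,
    }


# Placeholder inputs for illustration only.
print(read_budget(n_cells=1000, n_beads=20000,
                  cdna_reads_per_cell=50000, amplicon_reads_per_bead=100))
```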
This example illustrates analysis of mutated cell populations with DREAM-seq.
DREAM-seq was used for in-droplet genotyping of a population of cells with mixed mutational profiles. When CRISPR/Cas9 is used to introduce a genetic mutation into cells, the editing efficiency is less than 100% (
After transfection, entire cell populations were analyzed using DREAM-seq with primers to amplify respective target sites. After processing captured molecules into amplicon and cDNA libraries, libraries were sequenced on a MiSeq and genotypes were assigned to individual cells (
These results show that direct in-droplet genotyping can be used for the identification of gene-edited cells, thereby bypassing a need for clonal expansion of potentially edited cells after transfection and recovery and repeated rounds of colony picking and genotyping that may take several months. Using primers that target potential edits, cells can be individually genotyped inside droplets, with amplicon sequences used to bioinformatically determine whether a cell is edited or unedited and what the edit is. Cells can then be clustered by identifying labels and gene expression can be compared directly between subpopulations.
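A minimal Python sketch of such a bioinformatic genotype call is given below. It is not the analysis pipeline used in the examples; the function name, the input format (pairs of cellular barcode and amplicon sequence), and the rule of taking the most frequent amplicon sequence per cell are assumptions made for illustration. The rule reports only one allele per cell and therefore does not resolve heterozygous edits.

```python
from collections import Counter, defaultdict


def call_genotypes(reads, reference):
    """Label each cell as edited or unedited from its amplicon reads.

    reads: iterable of (cell_barcode, amplicon_sequence) pairs.
    reference: unedited amplicon sequence at the target site.
    Returns {cell_barcode: (label, consensus_sequence)}.
    """
    by_cell = defaultdict(Counter)
    for cell_barcode, amplicon in reads:
        by_cell[cell_barcode][amplicon] += 1

    calls = {}
    for cell_barcode, counts in by_cell.items():
        # Use the most frequent sequence observed for this cell as its call.
        consensus, _ = counts.most_common(1)[0]
        label = "unedited" if consensus == reference else "edited"
        calls[cell_barcode] = (label, consensus)
    return calls


# Toy data: barcodes and sequences are made up for the example.
reads = [
    ("AAAC", "ACGTTGCA"), ("AAAC", "ACGTTGCA"), ("AAAC", "ACGTGCA"),
    ("GGGT", "ACGTGCA"), ("GGGT", "ACGTGCA"),
]
print(call_genotypes(reads, reference="ACGTTGCA"))
```

The resulting labels could then be joined to the per-cell expression data by cellular barcode so that expression is compared between edited and unedited subpopulations.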
This example illustrates use of simultaneous analysis of RNA and DNA from the same single cell for cell lineage tracing and DNA barcoding.
The DREAM-seq method for simultaneous analysis of RNA and DNA from the same single cell can be used to combine transcriptomics and lineage barcoding from single cells to analyze cell differentiation, for example. As shown in
Editable DNA barcodes can be used to record and track cell relationships, reflecting a cell's true biological lineage (
The individual limitations of scRNA-seq and lineage tracing in single cells can be overcome by combining the two techniques (
Tracing of cell lineages in retinal organoids from barcoded human stem cells or barcoded mice is shown in
This example illustrates use of simultaneous analysis of RNA and DNA from the same single cell to distinguish cells from different species.
To validate the ability of DREAM-seq to characterize single cells simultaneously based on their genotype and transcriptomes, we performed species mixing using our method to target an intronic region of the PAX6 locus with a high degree of sequence conservation between mouse and human genomes. One pair of primers was designed to generate 180 base-pair amplicons from a single-cell suspension of equal parts human induced pluripotent cells and mouse Neuro-2a cells. Within the amplified region, three single nucleotide polymorphisms between mouse and human sequences were used to determine the species of the originating cell, as per the DNA readout. Based on transcriptomic alignment of the cDNA, 95% of cells were confidently identified as either human or mouse (
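The species assignment from the DNA readout can be sketched in Python as follows. This is an illustration under stated assumptions: the function name, the SNP positions, and the human and mouse bases in the example are hypothetical and are not the actual PAX6 polymorphisms used in this example; the rule that all three positions must agree is likewise an assumption.

```python
def call_species(amplicon: str, snps: dict) -> str:
    """Assign a species to an amplicon read from diagnostic SNPs.

    snps: {position (0-based): (human_base, mouse_base)}.
    Returns "human" or "mouse" only when every SNP agrees; otherwise
    returns "ambiguous".
    """
    if any(pos >= len(amplicon) for pos in snps):
        return "ambiguous"  # read does not cover every diagnostic position
    human_hits = sum(amplicon[pos] == h for pos, (h, _) in snps.items())
    mouse_hits = sum(amplicon[pos] == m for pos, (_, m) in snps.items())
    if human_hits == len(snps):
        return "human"
    if mouse_hits == len(snps):
        return "mouse"
    return "ambiguous"


# Hypothetical positions and bases, for illustration only.
example_snps = {2: ("A", "G"), 5: ("C", "T"), 9: ("G", "A")}
print(call_species("TTACCCTTTG", example_snps))  # human
print(call_species("TTGCCTTTTA", example_snps))  # mouse
print(call_species("TTACCTTTTG", example_snps))  # ambiguous
```

The DNA-based call for each cellular barcode could then be compared with the species assigned from transcriptomic alignment of the cDNA from the same barcode.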
In summary, the above examples (Examples 1-5) show that DREAM-seq lends itself to a versatile set of applications, ranging from clinical diagnostics to developmental biology to high-throughput screens.
As used in this specification and the appended claims, the singular forms “a”, “an”, and “the” include plural references unless the context clearly dictates otherwise. Thus, for example, reference to “the method” includes one or more methods, and/or steps of the type described herein which will become apparent to those persons skilled in the art upon reading this disclosure and so forth.
All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the invention, it will be understood that modifications and variations are encompassed within the spirit and scope of the instant disclosure.
Ranges: throughout this disclosure, various aspects of the invention can be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 2.1, 2.2, 2.7, 3, 4, 5, 5.5, 5.75, 5.8, 5.85, 5.9, 5.95, 5.99, and 6. This applies regardless of the breadth of the range.
Although the invention has been described with reference to the above examples, it will be understood that modifications and variations are encompassed within the spirit and scope of the invention. Accordingly, the invention is limited only by the following claims.
This application claims benefit of priority under 35 U.S.C. § 119(e) of U.S. Provisional Application No. 63/065,433, filed Aug. 13, 2020. The disclosure of the prior application is considered part of and is herein incorporated by reference in the disclosure of this application in its entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2021/045802 | 8/12/2021 | WO |
Number | Date | Country | |
---|---|---|---|
63065433 | Aug 2020 | US |