METHODS AND COMPOSITIONS FOR LABELING CELLS

BACKGROUND

Biological samples, such as cellular samples, may be processed for various purposes, for example, to analyze gene and/or protein expression levels within cells. Such analysis may be useful for a variety of applications, such as in detection of a disease (e.g., cancer), the study of disease progression, and detection of contamination. There are various approaches for processing samples, such as polymerase chain reaction (PCR) and sequencing.

Biological samples may be processed within various reaction environments, such as partitions. Partitions may be wells or droplets. Droplets or wells may be employed to process biological samples in a manner that enables the biological samples to be partitioned and processed separately. For example, such droplets may be fluidically isolated from other droplets, enabling accurate control of respective environments in the droplets.

Partitioning biological samples into separate partitions for separate processing, for example, enables single-cell analysis in a relatively high-throughput manner. In some cases, biological samples are transformed into well-mixed single cell suspensions followed by random partitioning. Currently available sample processing techniques are limited by the inability to process multiple samples in parallel.

SUMMARY

In view of the foregoing, improved methods and compositions for sample analysis are needed. The present disclosure provides methods and compositions for sample analysis, for example processing multiple samples in parallel. The methods of the present disclosure may comprise analyzing a cell. For example, a cell may be provided with a barcode moiety (e.g., a nucleic acid barcode molecule, such as a nucleic acid barcode molecule coupled to a lipophilic or amphiphilic moiety) prior to undergoing further processing (e.g., partitioning within a partition, analyzing nucleic acid molecules or other analytes from within the cells, sequencing nucleic acid molecules associated with the cell, etc.) and the barcode moiety may later be used to identify the cell (e.g., as deriving from a given sample, as being of a certain type, as being associated with a given partition, etc.). Identification of the cell may comprise, for example, performing a nucleic acid sequencing assay. The present disclosure also provides methods for analyzing the cellular occupancy of partitions (e.g., droplets or wells). Such methods may comprise, for example, labeling a plurality of cells with a plurality of barcodes (e.g., nucleic acid barcode sequences, such as nucleic acid barcode sequences coupled to lipophilic or amphiphilic moieties) to provide a plurality of labeled cells. Labeled cells of the plurality of labeled cells may be labeled with different barcodes. Labeled cells may be partitioned within a plurality of partitions (e.g., droplets or wells) and may be further labeled with additional barcodes (e.g., partition nucleic acid barcode sequences). The barcodes of the labeled cells may then be used to, e.g., identify labeled cells as originating from the same partition. 
The methods of the present disclosure may also be useful for determining the relative sizes of cells within a cellular sample, e.g., based at least in part on the uptake of barcodes (e.g., nucleic acid barcode sequences coupled to lipophilic or amphiphilic moieties) by the cells. The uptake of such barcodes may be measured by, for example, directly detecting barcodes associated with the cells or by performing a nucleic acid sequencing assay and measuring an abundance of various barcode sequences identified in the sequencing assay.
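As a non-limiting illustration, barcode uptake measured from sequencing data might be compared across cells as sketched below. The sketch assumes that each record pairs a partition barcode with a unique molecular identifier (UMI) read from a lipophilic barcode molecule, and that each partition holds a single cell. The function name `relative_uptake` and the record layout are hypothetical and are not part of the disclosure.

```python
from statistics import median


def relative_uptake(records):
    """Return barcode uptake per partition, scaled to the median partition.

    records: iterable of (partition_barcode, umi) tuples, one per sequencing
    read that carried a lipophilic cell barcode (hypothetical layout).
    """
    umis_by_partition = {}
    for partition_bc, umi in records:
        # A set collapses duplicate reads of the same original molecule.
        umis_by_partition.setdefault(partition_bc, set()).add(umi)

    counts = {bc: len(umis) for bc, umis in umis_by_partition.items()}
    if not counts:
        return {}

    mid = median(counts.values())
    # Values above 1.0 indicate more uptake than the median cell.
    return {bc: n / mid for bc, n in counts.items()}
```

A ratio computed this way is only a proxy for relative cell size, since uptake may also depend on factors such as membrane composition and labeling conditions.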

In an aspect, the present disclosure provides a method for analyzing a cell, comprising: (a) labeling the cell with a cell nucleic acid barcode sequence to generate a labeled cell, wherein a cell nucleic acid barcode molecule comprises the cell nucleic acid barcode sequence and a cell labeling agent; (b) generating a partition comprising the labeled cell and a plurality of partition nucleic acid barcode molecules, wherein each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules comprises a partition nucleic acid barcode sequence; (c) permeabilizing or lysing the cell to provide access to a plurality of nucleic acid molecules therein; (d) generating (i) a barcoded nucleic acid molecule comprising the cell nucleic acid barcode sequence, or a complement thereof, and the partition nucleic acid barcode sequence, or a complement thereof, and (ii) a plurality of barcoded nucleic acid products each comprising a sequence of a nucleic acid molecule of the plurality of nucleic acid molecules and the partition nucleic acid barcode sequence, or a complement thereof; and (e) identifying the plurality of nucleic acid molecules as originating from the cell.

In some embodiments, the cell nucleic acid barcode sequence identifies a sample from which the cell originates. In some embodiments, the sample is derived from a biological fluid.

In some embodiments, the cell is an immune cell.

In some embodiments, each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules comprises a priming sequence. In some embodiments, the priming sequence is a targeted priming sequence or a random N-mer sequence.

In some embodiments, the barcoded nucleic acid molecule and the plurality of barcoded nucleic acid products are synthesized via one or more primer extension reactions, ligation reactions, or nucleic acid amplification reactions.

In some embodiments, the method further comprises sequencing the barcoded nucleic acid molecule and the barcoded nucleic acid products, or derivatives thereof, to yield a plurality of sequencing reads. In some embodiments, the method further comprises associating each sequencing read of the plurality of sequencing reads with the partition via its partition nucleic acid barcode sequence.
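By way of a hedged example, associating sequencing reads with a partition via the partition nucleic acid barcode sequence might be sketched as follows. The sketch assumes a hypothetical read layout in which the partition barcode occupies the first `bc_len` bases of each read, and it assumes a known set of valid partition barcodes (`known_barcodes`). Neither assumption is drawn from the disclosure.

```python
from collections import defaultdict


def group_reads_by_partition(reads, known_barcodes, bc_len=16):
    """Group read inserts by their leading partition barcode.

    reads: iterable of read strings (hypothetical layout: barcode, then insert).
    known_barcodes: set of valid partition barcode sequences.
    bc_len: assumed barcode length in bases.
    """
    groups = defaultdict(list)
    for read in reads:
        barcode, insert = read[:bc_len], read[bc_len:]
        # Reads whose barcode is not in the known set are discarded.
        if barcode in known_barcodes:
            groups[barcode].append(insert)
    return dict(groups)
```

Exact matching is used here for simplicity. A practical pipeline would likely also tolerate sequencing errors in the barcode, which this sketch does not attempt.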

In some embodiments, the method further comprises, in (b), partitioning the labeled cell with a bead, which bead comprises the plurality of partition nucleic acid barcode molecules. In some embodiments, the partition nucleic acid barcode sequence of each nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules is releasably coupled to the bead. In some embodiments, the method further comprises, after (b), releasing partition nucleic acid barcode sequences of the plurality of partition nucleic acid barcode molecules from the bead. In some embodiments, releasing partition nucleic acid barcode sequences of the plurality of partition nucleic acid barcode molecules from the bead comprises application of a stimulus. In some embodiments, the bead is a gel bead.

In some embodiments, the partition is a well or a droplet.

In some embodiments, the plurality of nucleic acid molecules comprises a plurality of deoxyribonucleic acid molecules or a plurality of ribonucleic acid molecules.

In some embodiments, the priming sequence is capable of hybridizing to a sequence of at least a subset of the plurality of nucleic acid molecules. In some embodiments, the priming sequence is capable of hybridizing to a sequence of the cell nucleic acid barcode molecule.

In some embodiments, prior to (b), the cell nucleic acid barcode molecule is at least partially disposed within the labeled cell.

In some embodiments, the plurality of nucleic acid molecules comprises a plurality of nucleic acid sequences corresponding to a V(D)J region of the genome of the cell. In some embodiments, the V(D)J region of the genome of the cell comprises a T cell receptor variable region sequence, a B cell receptor variable region sequence, or an immunoglobulin variable region sequence. In some embodiments, the partition further comprises a primer molecule, which primer molecule comprises a sequence complementary to a sequence of the plurality of nucleic acid molecules. In some embodiments, the plurality of nucleic acid molecules comprises a plurality of messenger ribonucleic acid (mRNA) molecules, and wherein the sequence of the plurality of nucleic acid molecules is a poly(A) sequence. In some embodiments, the plurality of barcoded nucleic acid products comprises a plurality of complementary deoxyribonucleic acid (cDNA) molecules, or derivatives thereof. In some embodiments, (d) comprises hybridizing the sequence of the primer molecule to the sequence of a nucleic acid molecule of the plurality of nucleic acid molecules and using an enzyme to extend the sequence of the primer molecule to provide a nucleic acid product comprising a complementary deoxyribonucleic acid (cDNA) sequence corresponding to a sequence of the nucleic acid molecule. In some embodiments, the enzyme incorporates a sequence at an end of the nucleic acid product. In some embodiments, the sequence is a poly(C) sequence. In some embodiments, at least a subset of the partition nucleic acid barcode molecules comprise a sequence complementary to the poly(C) sequence. In some embodiments, (d) further comprises using the nucleic acid product and a partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules to generate a barcoded nucleic acid product of the plurality of barcoded nucleic acid products.

In some embodiments, the cell labeling agent is selected from the group consisting of a lipophilic moiety, a fluorophore, a dye, a peptide, and a nanoparticle.

In an aspect, the present disclosure provides a method for analyzing cellular occupancy of partitions, comprising: (a) providing a plurality of cell nucleic acid barcode molecules comprising a plurality of cell nucleic acid barcode sequences, each cell nucleic acid barcode molecule of the plurality of cell nucleic acid barcode molecules comprising (i) a single cell nucleic acid barcode sequence of the plurality of cell nucleic acid barcode sequences and (ii) a lipophilic moiety; (b) labeling a plurality of cells with the plurality of cell nucleic acid barcode sequences to generate a plurality of labeled cells, wherein each labeled cell of the plurality of labeled cells comprises a different cell nucleic acid barcode sequence of the plurality of cell nucleic acid barcode sequences; (c) generating a plurality of partitions comprising the plurality of labeled cells and a plurality of partition nucleic acid barcode sequences, wherein each partition of the plurality of partitions comprises a different partition nucleic acid barcode sequence of the plurality of partition nucleic acid barcode sequences, and wherein at least a fraction of the plurality of partitions comprises more than one labeled cell of the plurality of labeled cells; and (d) identifying at least two labeled cells of the plurality of labeled cells as originating from a same partition using (i) cell nucleic acid barcode sequences of the plurality of cell nucleic acid barcode sequences, or complements thereof, and (ii) partition nucleic acid barcode sequences of the plurality of partition nucleic acid barcode sequences, or complements thereof.

In some embodiments, a given cell nucleic acid barcode sequence of the plurality of cell nucleic acid barcode sequences identifies a sample from which an associated cell of the plurality of labeled cells originates. In some embodiments, the sample is derived from a biological fluid. In some embodiments, the biological fluid comprises blood or saliva.

In some embodiments, the method further comprises, after (c), synthesizing a plurality of barcoded nucleic acid products from the plurality of labeled cells, wherein a given barcoded nucleic acid product of the plurality of barcoded nucleic acid products comprises (i) a cell identification sequence comprising a given cell nucleic acid barcode sequence of the plurality of cell nucleic acid barcode sequences, or a complement of the given cell nucleic acid barcode sequence; and (ii) a partition identification sequence comprising a given partition nucleic acid barcode sequence of the plurality of partition nucleic acid barcode sequences, or a complement of the given partition nucleic acid barcode sequence.

In some embodiments, a plurality of partition nucleic acid barcode molecules comprises the plurality of partition nucleic acid barcode sequences, each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules comprising a single partition nucleic acid barcode sequence of the plurality of partition nucleic acid barcode sequences. In some embodiments, a given partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules comprises a priming sequence that is capable of hybridizing to a sequence of a given cell nucleic acid barcode molecule of the plurality of cell nucleic acid barcode molecules. In some embodiments, each cell nucleic acid barcode molecule of the plurality of cell nucleic acid barcode molecules comprises the sequence. In some embodiments, the priming sequence is a targeted priming sequence. In some embodiments, the priming sequence is a random N-mer sequence. In some embodiments, the plurality of barcoded nucleic acid products is synthesized via one or more primer extension reactions. In some embodiments, the plurality of barcoded nucleic acid products is synthesized via one or more ligation reactions. In some embodiments, the plurality of barcoded nucleic acid products is synthesized via one or more nucleic acid amplification reactions.

In some embodiments, the method further comprises sequencing the plurality of barcoded nucleic acid products or derivatives thereof to yield a plurality of sequencing reads. In some embodiments, the method further comprises associating each sequencing read of the plurality of sequencing reads with a labeled cell of the plurality of labeled cells via its respective cell identification sequence, and associating each sequencing read of the plurality of sequencing reads with a partition of the plurality of partitions via its respective partition identification sequence.
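For illustration only, the association of sequencing reads with labeled cells and partitions, and the resulting identification of partitions containing more than one labeled cell, might be sketched as follows. The sketch assumes that each read has already been reduced to a pair of (partition identification sequence, cell identification sequence). The function names and the `min_reads` threshold are hypothetical and are introduced only to show one way of suppressing spurious barcode detections.

```python
from collections import Counter, defaultdict


def cells_per_partition(pairs, min_reads=1):
    """Map each partition barcode to the cell barcodes observed with it.

    pairs: iterable of (partition_barcode, cell_barcode) tuples, one per read.
    min_reads: assumed minimum read support for a cell barcode to be counted.
    """
    counts = defaultdict(Counter)
    for partition_bc, cell_bc in pairs:
        counts[partition_bc][cell_bc] += 1
    return {
        partition_bc: sorted(c for c, n in counter.items() if n >= min_reads)
        for partition_bc, counter in counts.items()
    }


def multiply_occupied(pairs, min_reads=1):
    """Return only the partitions associated with two or more cell barcodes."""
    return {
        partition_bc: cells
        for partition_bc, cells in cells_per_partition(pairs, min_reads).items()
        if len(cells) > 1
    }
```

Because each labeled cell carries a different cell nucleic acid barcode sequence in this aspect, two cell barcodes sharing one partition barcode indicate two cells that originated from the same partition.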

In some embodiments, the method further comprises, in (c), partitioning the plurality of labeled cells with a plurality of beads, wherein each bead of the plurality of beads comprises a partition nucleic acid barcode sequence of the plurality of partition nucleic acid barcode sequences. In some embodiments, each partition of the plurality of partitions comprises a single bead of the plurality of beads. In some embodiments, each bead of the plurality of beads comprises a plurality of partition nucleic acid barcode molecules, wherein each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules comprises a single partition nucleic acid barcode sequence of the plurality of partition nucleic acid barcode sequences. In some embodiments, each partition nucleic acid barcode sequence of the plurality of partition nucleic acid barcode sequences is releasably coupled to its respective bead of the plurality of beads. In some embodiments, each partition nucleic acid barcode sequence of the plurality of partition nucleic acid barcode sequences is releasable from its respective bead of the plurality of beads upon application of a stimulus. In some embodiments, the stimulus is a chemical stimulus. In some embodiments, the method further comprises, after (c), releasing partition nucleic acid barcode sequences of the plurality of partition nucleic acid barcode sequences from each bead of the plurality of beads. In some embodiments, the method further comprises degrading each bead of the plurality of beads to release the partition nucleic acid barcode sequences from each bead of the plurality of beads. In some embodiments, each partition of the plurality of partitions comprises an agent that is capable of degrading each bead of the plurality of beads. In some embodiments, the plurality of beads is a plurality of gel beads.

In some embodiments, the plurality of partitions is a plurality of droplets. In some embodiments, the plurality of partitions is a plurality of wells.

In some embodiments, in (b), the plurality of cells is labeled with the plurality of cell nucleic acid barcode sequences by binding cell binding moieties, each coupled to a given cell nucleic acid barcode sequence of the plurality of cell nucleic acid barcode sequences, to each cell of the plurality of cells. In some embodiments, the cell binding moieties are antibodies, cell surface receptor binding molecules, receptor ligands, small molecules, pro-bodies, aptamers, monobodies, affimers, darpins, or protein scaffolds. In some embodiments, the cell binding moieties are antibodies. In some embodiments, the cell binding moieties bind to a protein of cells of the plurality of cells. In some embodiments, the cell binding moieties bind to a cell surface species of cells of the plurality of cells. In some embodiments, the cell binding moieties bind to a species common to each cell of the plurality of cells.

In some embodiments, in (b), the plurality of cells is labeled with the plurality of cell nucleic acid barcode sequences by delivering nucleic acid barcode molecules each comprising an individual cell nucleic acid barcode sequence of the plurality of cell nucleic acid barcode sequences to each cell of the plurality of cells with the aid of a cell-penetrating peptide.

In some embodiments, in (b), the plurality of cells is labeled with the plurality of cell nucleic acid barcode sequences with the aid of liposomes, nanoparticles, electroporation, or mechanical force. In some embodiments, the mechanical force comprises the use of nanowires or microinjection.

In some embodiments, the lipophilic moiety of each nucleic acid barcode molecule of the plurality of cell nucleic acid barcode molecules is a cholesterol.

In some embodiments, the lipophilic moiety is linked to the plurality of cell nucleic acid barcode molecules via a linker.

In some embodiments, each cell of the plurality of cells comprises a plurality of nucleic acid molecules. In some embodiments, the plurality of nucleic acid molecules comprises a plurality of deoxyribonucleic acid molecules. In some embodiments, the plurality of nucleic acid molecules comprises a plurality of ribonucleic acid molecules. In some embodiments, the labeled cells are lysed or permeabilized to provide access to the plurality of nucleic acid molecules. In some embodiments, a plurality of partition nucleic acid barcode molecules comprises the plurality of partition nucleic acid barcode sequences, each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules comprising a single partition nucleic acid barcode sequence of the plurality of partition nucleic acid barcode sequences and a priming sequence that is capable of hybridizing to a sequence of at least a subset of the plurality of nucleic acid molecules. In some embodiments, the priming sequence is a targeted priming sequence. In some embodiments, the priming sequence is a random N-mer sequence.

In some embodiments, prior to (c), at least a subset of the cell nucleic acid barcode molecules of the plurality of cell nucleic acid barcode molecules are at least partially disposed within the plurality of labeled cells.

In another aspect, the present disclosure provides a method for analyzing cellular occupancy of a partition, comprising: (a) providing a first cell nucleic acid barcode molecule comprising (i) a first cell nucleic acid barcode sequence and (ii) a lipophilic moiety, and a second cell nucleic acid barcode molecule comprising (i) a second cell nucleic acid barcode sequence and (ii) a lipophilic moiety, wherein the first cell nucleic acid barcode sequence has a different sequence than the second cell nucleic acid barcode sequence; (b) labeling a first cell with the first cell nucleic acid barcode sequence to generate a first labeled cell and labeling a second cell with the second cell nucleic acid barcode sequence to generate a second labeled cell; (c) generating a partition comprising the first labeled cell and the second labeled cell, wherein the partition further comprises a partition nucleic acid barcode sequence; (d) generating (i) a first barcoded nucleic acid molecule comprising the first cell nucleic acid barcode sequence, or a complement thereof, and the partition nucleic acid barcode sequence, or a complement thereof, and (ii) a second barcoded nucleic acid molecule comprising the second cell nucleic acid barcode sequence, or a complement thereof, and the partition nucleic acid barcode sequence, or a complement thereof; and (e) identifying the first labeled cell and the second labeled cell as originating from the partition based on the first barcoded nucleic acid molecule and the second barcoded nucleic acid molecule having the same partition nucleic acid barcode sequence, or a complement thereof.

In some embodiments, the first cell nucleic acid barcode sequence and the second cell nucleic acid barcode sequence identify a sample from which the first cell and the second cell originate. In some embodiments, the sample is derived from a biological fluid. In some embodiments, the biological fluid comprises blood or saliva.

In some embodiments, the first barcoded nucleic acid molecule and the second barcoded nucleic acid molecule each comprise a priming sequence. In some embodiments, the priming sequence is a targeted priming sequence. In some embodiments, the priming sequence is a random N-mer sequence.

In some embodiments, the first barcoded nucleic acid molecule and the second barcoded nucleic acid molecule are synthesized via one or more primer extension reactions, ligation reactions, or nucleic acid amplification reactions.

In some embodiments, the method further comprises sequencing the first barcoded nucleic acid molecule and the second barcoded nucleic acid molecule, or derivatives thereof, to yield a plurality of sequencing reads. In some embodiments, the method further comprises associating each sequencing read of the plurality of sequencing reads with the first labeled cell or the second labeled cell via its cell nucleic acid barcode sequence, and associating each sequencing read of the plurality of sequencing reads with the partition via its respective partition nucleic acid barcode sequence.

In some embodiments, the method further comprises, in (c), partitioning the first labeled cell and the second labeled cell with a bead, which bead comprises a plurality of nucleic acid barcode molecules, each of which comprises the partition nucleic acid barcode sequence. In some embodiments, the partition nucleic acid barcode sequence of each nucleic acid barcode molecule of the plurality of nucleic acid barcode molecules is releasably coupled to the bead. In some embodiments, the method further comprises, after (c), releasing partition nucleic acid barcode sequences of the plurality of partition nucleic acid barcode molecules from the bead. In some embodiments, the bead is a gel bead.

In some embodiments, the partition is a well. In some embodiments, the partition is a droplet.

In some embodiments, the lipophilic moiety of the first cell nucleic acid barcode molecule and the second cell nucleic acid barcode molecule is a cholesterol.

In some embodiments, the first cell and the second cell each comprise a plurality of nucleic acid molecules. In some embodiments, the first labeled cell and the second labeled cell are lysed or permeabilized to provide access to the pluralities of nucleic acid molecules. In some embodiments, a plurality of partition nucleic acid barcode molecules each comprise the partition nucleic acid barcode sequence and a priming sequence that is capable of hybridizing to a sequence of at least a subset of the plurality of nucleic acid molecules. In some embodiments, the priming sequence is a targeted priming sequence. In some embodiments, the priming sequence is a random N-mer sequence.

In another aspect, the present disclosure provides a method for analyzing a cell, comprising: (a) labeling the cell with a cell nucleic acid barcode sequence to generate a labeled cell, wherein a cell nucleic acid barcode molecule comprises the cell nucleic acid barcode sequence and a lipophilic moiety; (b) generating a partition comprising the labeled cell and a plurality of partition nucleic acid barcode molecules, wherein each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules comprises a partition nucleic acid barcode sequence; (c) permeabilizing the cell to provide access to a plurality of nucleic acid molecules therein; (d) generating (i) a barcoded nucleic acid molecule comprising the cell nucleic acid barcode sequence, or a complement thereof, and the partition nucleic acid barcode sequence, or a complement thereof, and (ii) a plurality of barcoded nucleic acid products each comprising a sequence of a nucleic acid molecule of the plurality of nucleic acid molecules and the partition nucleic acid barcode sequence, or a complement thereof; and (e) identifying the plurality of nucleic acid molecules as originating from the cell.

In some embodiments, the cell nucleic acid barcode sequence identifies a sample from which the cell originates. In some embodiments, the sample is derived from a biological fluid. In some embodiments, the biological fluid comprises blood or saliva.
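As one possible sketch of how a cell nucleic acid barcode sequence could be used to identify the sample of origin, the example below assigns each partition to the sample whose cell barcode is most frequently observed with that partition's barcode. The lookup table `barcode_to_sample`, the function name, and the majority-vote rule are assumptions made for illustration and are not prescribed by the disclosure.

```python
from collections import Counter


def assign_samples(pairs, barcode_to_sample):
    """Assign each partition to the sample supported by the most reads.

    pairs: iterable of (partition_barcode, cell_barcode) tuples, one per read.
    barcode_to_sample: dict mapping cell barcode sequence to a sample name
    (hypothetical lookup table recorded at the time of labeling).
    """
    votes = {}
    for partition_bc, cell_bc in pairs:
        sample = barcode_to_sample.get(cell_bc)
        # Reads with an unrecognized cell barcode do not vote.
        if sample is not None:
            votes.setdefault(partition_bc, Counter())[sample] += 1
    return {p: counter.most_common(1)[0][0] for p, counter in votes.items()}
```

Partitions with no recognized cell barcode are omitted from the result rather than assigned a default sample.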

In some embodiments, the barcoded nucleic acid molecule comprises a priming sequence. In some embodiments, each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules comprises a priming sequence. In some embodiments, the priming sequence is a targeted priming sequence. In some embodiments, the priming sequence is a random N-mer sequence. In some embodiments, the priming sequence is capable of hybridizing to a sequence of at least a subset of the plurality of nucleic acid molecules. In some embodiments, the priming sequence is capable of hybridizing to a sequence of the cell nucleic acid barcode molecule.

In some embodiments, the method further comprises, in (b), partitioning the labeled cell with a bead, which bead comprises the plurality of partition nucleic acid barcode molecules. In some embodiments, the partition nucleic acid barcode sequence of each nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules is releasably coupled to the bead. In some embodiments, the method further comprises, after (b), releasing partition nucleic acid barcode sequences of the plurality of partition nucleic acid barcode molecules from the bead. In some embodiments, the bead is a gel bead.

In some embodiments, the partition is a well. In some embodiments, the partition is a droplet.

In some embodiments, the lipophilic moiety of the cell nucleic acid barcode molecule is a cholesterol.

In some embodiments, the plurality of nucleic acid molecules comprises a plurality of deoxyribonucleic acid molecules. In some embodiments, the plurality of nucleic acid molecules comprises a plurality of ribonucleic acid molecules.

In some embodiments, prior to (b), the cell nucleic acid barcode molecule is at least partially disposed within the labeled cell.

Additional aspects and advantages of the present disclosure will become readily apparent to those skilled in this art from the following detailed description, wherein only illustrative embodiments of the present disclosure are shown and described. As will be realized, the present disclosure is capable of other and different embodiments, and its several details are capable of modifications in various obvious respects, all without departing from the disclosure. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.

INCORPORATION BY REFERENCE

All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference. To the extent publications and patents or patent applications incorporated by reference contradict the disclosure contained in the specification, the specification is intended to supersede and/or take precedence over any such contradictory material.

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings (also “Figure” and “FIG.” herein), of which:

FIG. 1 shows an example of a microfluidic channel structure for partitioning individual biological particles.

FIG. 2 shows an example of a microfluidic channel structure for delivering barcode carrying beads to droplets.

FIG. 3 shows an example of a microfluidic channel structure for co-partitioning biological particles and reagents.

FIG. 4 shows an example of a microfluidic channel structure for the controlled partitioning of beads into discrete droplets.

FIG. 5 shows an example of a microfluidic channel structure for increased droplet generation throughput.

FIG. 6 shows another example of a microfluidic channel structure for increased droplet generation throughput.

FIG. 7A shows an example arrangement of nine sets of nucleic acid barcode molecules arranged in a two-dimensional configuration; FIG. 7B shows an example of a sample overlaying a two-dimensional arrangement of nucleic acid barcode molecules.

FIG. 8 shows a computer system that is programmed or otherwise configured to implement methods provided herein.

FIG. 9 shows an exemplary lipophilic moiety-conjugated feature barcode comprising a cholesterol, a linker, and a nucleic acid attachment region.

FIG. 10 schematically depicts representative lipophilic barcodes as well as exemplary nucleic acid extension schemes to couple cell barcodes to lipophilic barcodes.

FIGS. 11A-11B show BioAnalyzer results of barcode libraries prepared from a first cell population (FIG. 11A) and a second cell population (FIG. 11B) incubated with ~1 μM of feature barcodes without a lipophilic moiety, while FIGS. 11C-11D show BioAnalyzer results of barcode libraries prepared from a first cell population (FIG. 11C) and a second cell population (FIG. 11D) incubated with ~1 μM of cholesterol-conjugated feature barcodes.

FIGS. 12A-12J show representative graphs from pooled cell populations incubated with 0.1 μM cholesterol-conjugated feature barcodes showing the number of unique molecular identifier (UMI) counts on the x-axis versus number of cells on the y-axis. FIGS. 12A-B show log₁₀ UMI counts of a first feature barcode sequence (“BC1”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 12A—replicate 1; FIG. 12B—replicate 2). FIGS. 12C-D show log₁₀ UMI counts of a second feature barcode sequence (“BC2”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 12C—replicate 1; FIG. 12D—replicate 2). FIGS. 12E-F show log₁₀ UMI counts of a third feature barcode sequence (“BC3”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 12E—replicate 1; FIG. 12F—replicate 2). FIGS. 12G-H show log₁₀ UMI counts of a fourth feature barcode sequence (“BC4”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 12G—replicate 1; FIG. 12H—replicate 2). FIGS. 12I-12J show 3D representations of UMI counts obtained from the pooled cell populations for replicate 1. Graphs depict UMI counts in linear (FIG. 12I) and in log₁₀ scale (FIG. 12J).

FIGS. 13A-13J show representative graphs from pooled cell populations incubated with 0.01 μM cholesterol-conjugated feature barcodes showing the number of unique molecular identifier (UMI) counts on the x-axis versus number of cells on the y-axis. FIGS. 13A-13B show log₁₀ UMI counts of a first feature barcode sequence (“BC1”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 13A, replicate 1; FIG. 13B, replicate 2). FIGS. 13C-13D show log₁₀ UMI counts of a second feature barcode sequence (“BC2”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 13C, replicate 1; FIG. 13D, replicate 2). FIGS. 13E-13F show log₁₀ UMI counts of a third feature barcode sequence (“BC3”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 13E, replicate 1; FIG. 13F, replicate 2). FIGS. 13G-13H show log₁₀ UMI counts of a fourth feature barcode sequence (“BC4”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 13G, replicate 1; FIG. 13H, replicate 2). FIGS. 13I-13J show 3D representations of UMI counts obtained from the pooled cell populations for replicate 1. Graphs depict UMI counts in linear scale (FIG. 13I) and in log₁₀ scale (FIG. 13J).

FIGS. 14A-14I show representative graphs from pooled cell populations incubated with antibody-conjugated feature barcodes showing the number of unique molecular identifier (UMI) counts on the x-axis versus number of cells on the y-axis. FIGS. 14A-14B show UMI counts of a first feature barcode sequence (“BC18”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 14A, replicate 1; FIG. 14B, replicate 2). From these results, a BC18-containing cell population can be clearly distinguished: 1401a (replicate 1) and 1401b (replicate 2). FIGS. 14C-14D show UMI counts of a second feature barcode sequence (“BC19”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 14C, replicate 1; FIG. 14D, replicate 2). From these results, a BC19-containing cell population can be clearly distinguished: 1402a (replicate 1) and 1402b (replicate 2). FIGS. 14E-14F show UMI counts of a third feature barcode sequence (“BC20”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 14E, replicate 1; FIG. 14F, replicate 2). From these results, a BC20-containing cell population can be clearly distinguished: 1403a (replicate 1) and 1403b (replicate 2). FIG. 14G shows UMI counts of feature barcode sequences identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population with log₁₀ UMI counts for BC18 on the y-axis and log₁₀ UMI counts for BC20 on the x-axis. FIG. 14H shows UMI counts of feature barcode sequences identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population with log₁₀ UMI counts for BC18 on the y-axis and log₁₀ UMI counts for BC19 on the x-axis. FIG. 14I shows UMI counts of feature barcode sequences identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population with log₁₀ UMI counts for BC19 on the y-axis and log₁₀ UMI counts for BC20 on the x-axis.

FIGS. 15A-15B show clustering of UMI counts prepared using antibody t-distributed stochastic neighbor embedding (t-SNE) (FIG. 15A), as well as in gene expression (GEX) t-SNE analyses (FIG. 15B).

FIG. 16 depicts an example of a tissue section with barcode staining using a fixed array of needles.

FIG. 17 depicts a diffusion map to spatially localize barcodes and associated cells.

FIG. 18 shows the position of cells (designated “C1” to “C7”) defined by a barcode and its relative amount.

FIG. 19 depicts a three dimensional application of spatial mapping.

FIG. 20 depicts a three dimensional application of spatial mapping.

FIG. 21A depicts regions of a mouse brain with delivery devices for delivering barcode molecules.

FIG. 21B shows a pattern for injection of barcodes to a sample.

FIG. 22 shows a correlation between cell diameter and cell surface area.

FIG. 23 shows the uptake of lipophilic barcodes of given cell diameters (μm).

FIG. 24 shows an example graph of barcode counts vs. cell counts.

FIG. 25 shows a schematic for enriching V(D)J sequences from immune molecules such as TCRs, BCRs, and immunoglobulins.

FIGS. 26A and 26B show variations of a schematic for generating labeled polynucleotides.

FIG. 27 shows a schematic for enhanced cell multiplexing.

FIG. 28 shows an exemplary fluorophore-conjugated-feature barcode molecule.

FIG. 29 shows exemplary nucleic acid barcode molecules comprising different capture sequences.

FIG. 30 shows exemplary moiety conjugated oligonucleotides.

DETAILED DESCRIPTION

While various embodiments of the invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions may occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed.

Where values are described as ranges, it will be understood that such disclosure includes the disclosure of all possible sub-ranges within such ranges, as well as specific numerical values that fall within such ranges irrespective of whether a specific numerical value or specific sub-range is expressly stated.

The term “barcode,” as used herein, generally refers to a label, or identifier, that conveys or is capable of conveying information about an analyte. A barcode can be part of an analyte. A barcode can be independent of an analyte. A barcode can be a tag attached to an analyte (e.g., nucleic acid molecule) or a combination of the tag in addition to an endogenous characteristic of the analyte (e.g., size of the analyte or end sequence(s)). A barcode may be unique. Barcodes can have a variety of different formats. For example, barcodes can include: polynucleotide barcodes; random nucleic acid and/or amino acid sequences; and synthetic nucleic acid and/or amino acid sequences. A barcode can be attached to an analyte in a reversible or irreversible manner. A barcode can be added to, for example, a fragment of a deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) sample before, during, and/or after sequencing of the sample. Barcodes can allow for identification and/or quantification of individual sequencing reads.

The term “real time,” as used herein, can refer to a response time of less than about 1 second, a tenth of a second, a hundredth of a second, a millisecond, or less. The response time may be greater than 1 second. In some instances, real time can refer to simultaneous or substantially simultaneous processing, detection or identification.

The term “subject,” as used herein, generally refers to an animal, such as a mammal (e.g., human) or avian (e.g., bird), or other organism, such as a plant. The subject can be a vertebrate, a mammal, a rodent (e.g., a mouse), a primate, a simian or a human. Animals may include, but are not limited to, farm animals, sport animals, and pets. A subject can be a healthy or asymptomatic individual, an individual that has or is suspected of having a disease (e.g., cancer) or a pre-disposition to the disease, and/or an individual that is in need of therapy or suspected of needing therapy. A subject can be a patient.

The term “genome,” as used herein, generally refers to genomic information from a subject, which may be, for example, at least a portion or an entirety of a subject's hereditary information. A genome can be encoded either in DNA or in RNA. A genome can comprise coding regions (e.g., that code for proteins) as well as non-coding regions. A genome can include the sequence of all chromosomes together in an organism. For example, the human genome ordinarily has a total of 46 chromosomes. The sequence of all of these together may constitute a human genome.

The terms “adaptor(s)”, “adapter(s)” and “tag(s)” may be used synonymously. An adaptor or tag can be coupled to a polynucleotide sequence to be “tagged” by any approach, including ligation, hybridization, or other approaches.

The term “sequencing,” as used herein, generally refers to methods and technologies for determining the sequence of nucleotide bases in one or more polynucleotides. The polynucleotides can be, for example, nucleic acid molecules such as deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), including variants or derivatives thereof (e.g., single stranded DNA). Sequencing can be performed by various systems currently available, such as, without limitation, a sequencing system by Illumina®, Pacific Biosciences (PacBio®), Oxford Nanopore®, or Life Technologies (Ion Torrent®). Alternatively or in addition, sequencing may be performed using nucleic acid amplification, polymerase chain reaction (PCR) (e.g., digital PCR, quantitative PCR, or real time PCR), or isothermal amplification. Such systems may provide a plurality of raw genetic data corresponding to the genetic information of a subject (e.g., human), as generated by the systems from a sample provided by the subject. In some examples, such systems provide sequencing reads (also “reads” herein). A read may include a string of nucleic acid bases corresponding to a sequence of a nucleic acid molecule that has been sequenced. In some situations, systems and methods provided herein may be used with proteomic information.

The term “bead,” as used herein, generally refers to a particle. The bead may be a solid or semi-solid particle. The bead may be a gel bead. The gel bead may include a polymer matrix (e.g., matrix formed by polymerization or cross-linking). The polymer matrix may include one or more polymers (e.g., polymers having different functional groups or repeat units). Cross-linking can be via covalent, ionic, or inductive interactions, or physical entanglement. The bead may be a macromolecule. The bead may be formed of nucleic acid molecules bound together. The bead may be formed via covalent or non-covalent assembly of molecules (e.g., macromolecules), such as monomers or polymers. Such polymers or monomers may be natural or synthetic. Such polymers or monomers may be or include, for example, nucleic acid molecules (e.g., DNA or RNA). The bead may be formed of a polymeric material. The bead may be magnetic or non-magnetic. The bead may be rigid. The bead may be flexible and/or compressible. The bead may be disruptable or dissolvable. The bead may be a solid particle (e.g., a metal-based particle including but not limited to iron oxide, gold or silver) covered with a coating comprising one or more polymers. Such coating may be disruptable or dissolvable.

The term “sample,” as used herein, generally refers to a biological sample of a subject. The biological sample may comprise any number of macromolecules, for example, cellular macromolecules. The biological sample may be a nucleic acid sample or protein sample. The biological sample may also be a carbohydrate sample or a lipid sample. The biological sample may be derived from another sample. The sample may be a tissue sample, such as a biopsy, core biopsy, needle aspirate, or fine needle aspirate. The sample may be a fluid sample, such as a blood sample, urine sample, or saliva sample. The sample may be a skin sample. The sample may be a cheek swab. The sample may be a plasma or serum sample. The sample may be a cell-free or cell free sample. A cell-free sample may include extracellular polynucleotides. Extracellular polynucleotides may be isolated from a bodily sample that may be selected from the group consisting of blood, plasma, serum, urine, saliva, mucosal excretions, sputum, stool and tears.

The term “biological particle,” as used herein, generally refers to a discrete biological system derived from a biological sample. The biological particle may be a virus. The biological particle may be a cell or derivative of a cell. The biological particle may be an organelle. The biological particle may be a rare cell from a population of cells. The biological particle may be any type of cell, including without limitation prokaryotic cells, eukaryotic cells, bacterial, fungal, plant, mammalian, or other animal cell type, mycoplasmas, normal tissue cells, tumor cells, or any other cell type, whether derived from single cell or multicellular organisms. The biological particle may be or may include a matrix (e.g., a gel or polymer matrix) comprising a cell or one or more constituents from a cell (e.g., cell bead), such as DNA, RNA, organelles, proteins, or any combination thereof, from the cell. The biological particle may be obtained from a tissue of a subject. The biological particle may be a hardened cell. Such hardened cell may or may not include a cell wall or cell membrane. The biological particle may include one or more constituents of a cell, but may not include other constituents of the cell. An example of such constituents is a nucleus or an organelle. A cell may be a live cell. The live cell may be capable of being cultured, for example, being cultured when enclosed in a gel or polymer matrix, or cultured when comprising a gel or polymer matrix.

The term “macromolecular constituent,” as used herein, generally refers to a macromolecule contained within or from a biological particle. The macromolecular constituent may comprise a nucleic acid. The macromolecular constituent may comprise DNA. The macromolecular constituent may comprise RNA. The RNA may be coding or non-coding. The RNA may be messenger RNA (mRNA), ribosomal RNA (rRNA) or transfer RNA (tRNA), for example. The RNA may be a transcript. The RNA may comprise small RNA that are less than 200 nucleic acid bases in length, or large RNA that are greater than 200 nucleic acid bases in length. Small RNAs mainly include 5.8S ribosomal RNA (rRNA), 5S rRNA, transfer RNA (tRNA), microRNA (miRNA), small interfering RNA (siRNA), small nucleolar RNA (snoRNAs), Piwi-interacting RNA (piRNA), tRNA-derived small RNA (tsRNA) and small rDNA-derived RNA (srRNA). The RNA may be double-stranded RNA or single-stranded RNA. The RNA may be circular RNA. The macromolecular constituent may comprise a protein. The macromolecular constituent may comprise a peptide. The macromolecular constituent may comprise a polypeptide.

The term “molecular tag,” as used herein, generally refers to a molecule capable of binding to a macromolecular constituent. The molecular tag may bind to the macromolecular constituent with high affinity. The molecular tag may bind to the macromolecular constituent with high specificity. The molecular tag may comprise a nucleotide sequence. The molecular tag may comprise a nucleic acid sequence. The nucleic acid sequence may be at least a portion or an entirety of the molecular tag. The molecular tag may be a nucleic acid molecule or may be part of a nucleic acid molecule. The molecular tag may be an oligonucleotide or a polypeptide. The molecular tag may comprise a DNA aptamer. The molecular tag may be or comprise a primer. The molecular tag may be, or comprise, a protein. The molecular tag may comprise a polypeptide. The molecular tag may be a barcode.

The term “partition,” as used herein, generally refers to a space or volume that may be suitable to contain one or more species or conduct one or more reactions. The partition may isolate space or volume from another space or volume. The partition may be a droplet or well, for example. The droplet may be a first phase (e.g., aqueous phase) in a second phase (e.g., oil) immiscible with the first phase. The droplet may be a first phase in a second phase that does not phase separate from the first phase, such as, for example, a capsule or liposome in an aqueous phase.

The term “epitope binding fragment,” as used herein, generally refers to a portion of a complete antibody capable of binding the same epitope as the complete antibody, albeit not necessarily to the same extent. Although multiple types of epitope binding fragments are possible, an epitope binding fragment typically comprises at least one pair of heavy and light chain variable regions (VH and VL, respectively) held together (e.g., by disulfide bonds) to preserve the antigen binding site, and does not contain all or a portion of the Fc region. Epitope binding fragments of an antibody can be obtained from a given antibody by any suitable technique (e.g., recombinant DNA technology or enzymatic or chemical cleavage of a complete antibody), and typically can be screened for specificity in the same manner in which complete antibodies are screened. In some embodiments, an epitope binding fragment comprises an F(ab′)₂ fragment, Fab′ fragment, Fab fragment, Fd fragment, or Fv fragment. In some embodiments, the term “antibody” includes antibody-derived polypeptides, such as single chain variable fragments (scFv), diabodies or other multimeric scFvs, heavy chain antibodies, single domain antibodies, or other polypeptides comprising a sufficient portion of an antibody (e.g., one or more complementarity determining regions (CDRs)) to confer specific antigen binding ability to the polypeptide.

Provided herein are methods, systems, and compositions for processing cellular and/or polynucleotide samples. In various aspects, the methods, systems, and compositions herein enable parallel processing of multiple samples. Parallel processing of samples can enable high-throughput analysis. For example, using methods and compositions provided herein, multiple cell samples or polynucleotides derived therefrom can be processed in parallel for gene expression analysis.

Parallel Analysis of Cell Samples

Provided herein are methods, systems, and compositions for analysis of a plurality of samples in parallel. The samples can comprise cells, cell beads, or in some cases, cellular derivatives (e.g., components of cells, such as cell nuclei, or matrices comprising cells or components thereof, such as cell beads). A cell bead can be a biological particle and/or one or more of its macromolecular constituents encased inside of a gel or polymer matrix, such as via polymerization of a droplet containing the biological particle and precursors capable of being polymerized or gelled. In an aspect, the present disclosure provides a method of analyzing nucleic acids (e.g., deoxyribonucleic acids (DNAs) or ribonucleic acids (RNAs)) of a plurality of different cell samples. The method may comprise labeling cells and/or cell beads of one or more different cell samples using a plurality of nucleic acid barcode molecules to yield a plurality of labeled cell samples, wherein an individual nucleic acid barcode molecule of the plurality of nucleic acid barcode molecules comprises a sample barcode sequence (e.g., a moiety-conjugated barcode molecule, also referred to herein as a feature barcode), and wherein nucleic acid barcode molecules of a given labeled cell sample are distinguishable from nucleic acid barcode molecules of another labeled cell sample by the sample barcode sequence. Nucleic acid molecules of the plurality of labeled cell samples may then be subjected to one or more reactions to yield a plurality of nucleic acid barcode products, wherein an individual nucleic acid barcode product of the plurality of nucleic acid barcode products comprises (i) a sample barcode sequence (e.g., a nucleic acid barcode sequence) and (ii) a sequence corresponding to a nucleic acid molecule of the plurality of labeled cell samples. The sequence corresponding to the nucleic acid molecule of the plurality of labeled cell samples may be, for example, a partition nucleic acid barcode molecule.
The plurality of nucleic acid barcode products may be subjected to a sequencing reaction to yield a plurality of sequencing reads, which sequencing reads may be associated with individual labeled cell samples based on the sample barcode sequence, thereby analyzing nucleic acids of the plurality of different cell samples. In some embodiments, individual cells of a cell sample are labeled with two or more nucleic acid barcode molecules. In some cases, each of the two or more nucleic acid barcode molecules has a unique barcode sequence (e.g., a unique nucleic acid barcode sequence). In some cases, the barcode sequences of the two or more nucleic acid barcode molecules are not unique amongst the different cell samples but the combination of the barcode sequences of the two or more nucleic acid barcode molecules is a unique combination.
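By way of non-limiting illustration, the combinatorial labeling described above, in which individual barcode sequences are shared between samples but each combination is unique, may be sketched as follows. The function name `assign_barcode_pairs`, the barcode labels, and the sample names are hypothetical and serve only to illustrate the idea; they are not part of the disclosed methods.

```python
from itertools import combinations

def assign_barcode_pairs(barcodes, n_samples):
    """Give each sample a unique unordered pair of barcode labels.

    Individual barcodes may be reused across samples; only the pair
    (the combination) needs to be unique. Hypothetical helper.
    """
    pairs = list(combinations(barcodes, 2))  # every unordered pair, no repeats
    if n_samples > len(pairs):
        raise ValueError(
            f"{len(barcodes)} barcodes give only {len(pairs)} unique pairs"
        )
    return {f"sample_{i + 1}": pair for i, pair in enumerate(pairs[:n_samples])}

# Four barcode labels (illustrative) are enough to label up to six samples.
pair_map = assign_barcode_pairs(["BC1", "BC2", "BC3", "BC4"], 5)
for sample, pair in pair_map.items():
    print(sample, pair)
```

In this sketch, a barcode label such as BC1 appears in several samples, yet no two samples share the same pair, so the pair alone may be sufficient to identify the sample of origin.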

A nucleic acid barcode molecule can be used to label individual cells and/or cell beads of a cell sample. The label can be used in downstream processes, for example in sequencing analysis, as a mechanism to associate a cell and/or cell bead and a particular cell sample. For example, a plurality of cell samples (e.g., a plurality of cell samples from a plurality of different subjects (e.g., human or animal subjects), or a plurality of cell samples from a plurality of different biological fluids or tissues of a given subject, or a plurality of cell samples taken at different times from the same subject) can be uniquely labeled with nucleic acid barcode molecules such that the cells of a particular sample can be identified as originating from the particular sample, even if the particular cell sample was mixed with other cell samples and subjected to nucleic acid processing and/or sequencing in parallel. Accordingly, the present methods provide means of deconvoluting complex samples and enable massively parallel, high throughput sequencing.
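As a hedged illustration of how sequencing reads from pooled samples might be associated with their sample of origin, the following minimal sketch counts unique molecular identifiers (UMIs) per cell for each sample barcode and assigns the cell to the dominant sample. The read tuple format, the `min_fraction` threshold, and all names below are assumptions made for this example rather than features of any particular sequencing platform or analysis pipeline.

```python
from collections import Counter, defaultdict

def assign_cells_to_samples(reads, barcode_to_sample, min_fraction=0.8):
    """Assign each cell to a sample from its sample-barcode UMI counts.

    reads: iterable of (cell_id, sample_barcode, umi) tuples (assumed format).
    barcode_to_sample: dict mapping sample barcode sequence to sample name.
    min_fraction: illustrative cutoff; a cell whose top sample holds a
        smaller share of its UMIs is reported as "ambiguous".
    """
    counts = defaultdict(Counter)
    # A (cell, barcode, UMI) triple is counted once, collapsing duplicate reads.
    for cell_id, sample_barcode, _umi in set(reads):
        if sample_barcode in barcode_to_sample:
            counts[cell_id][barcode_to_sample[sample_barcode]] += 1

    assignments = {}
    for cell_id, per_sample in counts.items():
        sample, top = per_sample.most_common(1)[0]
        total = sum(per_sample.values())
        assignments[cell_id] = sample if top / total >= min_fraction else "ambiguous"
    return assignments

reads = [
    ("cell1", "ACGT", "u1"), ("cell1", "ACGT", "u2"), ("cell1", "ACGT", "u2"),
    ("cell2", "TTAG", "u1"), ("cell2", "TTAG", "u3"),
    ("cell3", "ACGT", "u1"), ("cell3", "TTAG", "u1"),
]
result = assign_cells_to_samples(reads, {"ACGT": "sample_A", "TTAG": "sample_B"})
print(result)
```

A cell carrying comparable counts of two sample barcodes (such as `cell3` above) is flagged rather than forced into one sample; in practice such a profile could reflect, for example, two cells from different samples sharing a partition.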

Cells and/or cell beads of a given sample may be labeled with the same or different labels. For example, a first cell of a cell sample may be labeled with a first label and a second cell of the cell sample may be labeled with a second label. In some cases, the first and second labels may be the same. In other cases, the first and second labels may be different. Labels may differ in different aspects. For example, a first label and a second label used to label cells of the same sample may comprise the same nucleic acid barcode sequence but differ in another aspect, such as a unique molecular identifier sequence. Alternatively or in addition, a first label and a second label may both comprise a first nucleic acid barcode sequence and a second nucleic acid barcode sequence, where the first nucleic acid barcode sequences are the same and the second nucleic acid barcode sequences are different. Similarly, labels applied to different cellular samples may have one or more common features. For example, labels for cells of a first sample from a given subject may include a first common barcode sequence (e.g., identical nucleic acid barcode sequence) and a second common barcode sequence, while labels for cells of a second sample from the same subject may include a third common barcode sequence and a fourth common barcode sequence, which first common barcode sequence and third common barcode sequence are identical and which second common barcode sequence and fourth common barcode sequence are different.

The methods provided herein may comprise labeling and/or analysis of cell beads. Cell beads may comprise biological particles and/or their macromolecular constituents encased in a gel or polymer matrix. For example, a cell bead may comprise an entrapped cell. A cell bead may be generated prior to labeling of the cell bead, or components thereof. Alternatively, a cell bead may be generated after labeling and partitioning of a cell. For example, a labeled cell may be co-partitioned with polymerizable materials, and a cell bead comprising the labeled cell may be generated within the partition. A stimulus may be used to promote polymerization of the polymerizable materials within the partition.

Labeling individual cells and/or cell beads of a cell sample with nucleic acid barcode molecules for different cell samples can yield a plurality of labeled cell samples. An individual nucleic acid barcode molecule for labeling a cell and/or cell bead (e.g., a moiety-conjugated barcode molecule) can comprise a sample barcode sequence (also referred to as a feature barcode). Individual cell samples of a plurality of cell samples can each be labeled with nucleic acid barcode molecules having a barcode sequence unique to the cell sample. In embodiments herein, nucleic acid barcode molecules of a given labeled cell sample are distinguishable from nucleic acid barcode molecules of another labeled cell sample by the sample barcode sequence. In some instances, labeled cell samples can be combined and subjected to downstream sample processing in bulk. Sample barcode sequences can later be used to determine from which cell sample a particular cell originated.

Individual nucleic acid barcode molecules may form a part of a barcoded oligonucleotide. A barcoded oligonucleotide (e.g., a moiety-conjugated barcode molecule) can comprise sequence elements (e.g., functional sequences) in addition to the nucleic acid barcode molecule or sample barcode sequence. The additional sequence elements may be useful for a variety of downstream applications, including, but not limited to, sample preparation for sequencing analysis, e.g., next-generation sequence analysis. Non-limiting examples of additional sequence elements that can be present on barcoded oligonucleotides in embodiments herein include amplification primer annealing sequences or complements thereof; sequencing primer annealing sequences or complements thereof; common sequences shared among multiple different barcoded oligonucleotides; restriction enzyme recognition sites; probe binding sites or sequencing adapters (e.g., for attachment to a sequencing platform, such as a flow cell for parallel sequencing); molecular identifier sequences, e.g., unique molecular identifiers (UMIs); lipophilic molecules; and antibodies or epitope fragments thereof. For example, the barcoded oligonucleotide may comprise an amplification primer binding sequence. In another example, the barcoded oligonucleotide may comprise a sequencing primer binding sequence. In another example, the barcoded oligonucleotide may comprise a lipophilic molecule. In another example, the barcoded oligonucleotide may comprise an antibody or epitope fragment thereof. A sequence element may include a label, such as an optical label. Such a label may, for example, enable detection of a moiety with which the sequence element is associated. For example, a sequence element such as a lipophilic molecule may comprise a fluorescent moiety. The fluorescent moiety may permit optical detection of the lipophilic molecule and moieties with which it is associated.
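For illustration only, a barcoded oligonucleotide comprising several sequence elements can be modeled as a concatenation of fixed-length segments. The layout below (primer annealing site, then sample barcode, then UMI, then a trailing capture sequence) and the segment lengths are hypothetical assumptions chosen for the example; the disclosure does not prescribe this order or these lengths.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OligoLayout:
    """Hypothetical fixed-length layout: primer site | sample barcode | UMI | rest."""
    primer_len: int
    barcode_len: int
    umi_len: int

def split_oligo(sequence, layout):
    """Slice an oligonucleotide sequence into its assumed sequence elements."""
    p = layout.primer_len
    b = p + layout.barcode_len
    u = b + layout.umi_len
    if len(sequence) < u:
        raise ValueError("sequence is shorter than the assumed layout")
    return {
        "primer_site": sequence[:p],
        "sample_barcode": sequence[p:b],
        "umi": sequence[b:u],
        "capture_sequence": sequence[u:],  # whatever follows the UMI
    }

layout = OligoLayout(primer_len=4, barcode_len=6, umi_len=4)
parts = split_oligo("AAAA" + "CGTACG" + "TTGA" + "TTTTTTTT", layout)
print(parts)
```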

A nucleic acid barcode molecule or a barcoded oligonucleotide comprising the nucleic acid barcode molecule may be linked to a moiety (“barcoded moiety”) such as an antibody or an epitope binding fragment thereof, a cell surface receptor binding molecule, a receptor ligand, a small molecule, a pro-body, an aptamer, a monobody, an affimer, a darpin, or a protein scaffold. The moiety to which a nucleic acid barcode molecule or barcoded oligonucleotide can be linked may bind a molecule expressed on the surface of individual cells of the plurality of cell samples. A labeled cell sample may refer to a sample in which the cells and/or cell beads are bound to barcoded moieties.

A molecule of a cell and/or cell bead to which a moiety (e.g., barcoded moiety) may bind may be common to all cells of a given sample and/or all cells and/or cell beads of a plurality of different cell samples. Such a molecule may be a protein. For example, a protein to which a moiety may bind may be a transmembrane receptor, major histocompatibility complex protein, cell-surface protein, glycoprotein, glycolipid, protein channel, or protein pump. A non-limiting example of a cell-surface protein can be a cell adhesion molecule. A molecule to which a moiety (e.g., barcoded moiety) may bind may be expressed at similar levels for all cells and/or cell beads of a given sample and/or all cells of a plurality of different cell samples. The expression of the molecule for all cells and/or cell beads of a sample and/or all cells of a plurality of different cell samples may be within biological variability. Alternatively, the molecule may be differentially expressed for certain cells and/or cell beads of the cell sample or a plurality of different cell samples. For example, the expression of the molecule for all cells and/or cell beads of a sample or a plurality of different cell samples may not be within biological variability, and/or some of the cells and/or cell beads of a cell sample or a plurality of different cell samples may be abnormal cells. A barcoded moiety may bind a molecule that is present on a majority of the cells and/or cell beads of a cell sample and/or a plurality of different cell samples. The molecule may be present on at least 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% of the cells and/or cell beads in a cell sample and/or a plurality of different cell samples.

A nucleic acid barcode molecule or barcoded oligonucleotide comprising the nucleic acid barcode molecule may be linked to an antibody or an epitope binding fragment thereof, and labeling cells and/or cell beads may comprise subjecting the antibody-linked barcode molecule or the epitope binding fragment-linked barcode molecule to conditions suitable for binding the antibody to a molecule present on a cell surface. The binding affinity between the antibody or the epitope binding fragment thereof and the molecule present on the cell surface may be within a desired range to ensure that the antibody or the epitope binding fragment thereof remains bound to the molecule. For example, the binding affinity may be within a desired range to ensure that the antibody or the epitope binding fragment thereof remains bound to the molecule during various sample processing steps, such as partitioning and/or nucleic acid amplification or extension. A dissociation constant (Kd) between the antibody or an epitope binding fragment thereof and the molecule to which it binds may be less than about 100 μM, 90 μM, 80 μM, 70 μM, 60 μM, 50 μM, 40 μM, 30 μM, 20 μM, 10 μM, 9 μM, 8 μM, 7 μM, 6 μM, 5 μM, 4 μM, 3 μM, 2 μM, 1 μM, 900 nM, 800 nM, 700 nM, 600 nM, 500 nM, 400 nM, 300 nM, 200 nM, 100 nM, 90 nM, 80 nM, 70 nM, 60 nM, 50 nM, 40 nM, 30 nM, 20 nM, 10 nM, 9 nM, 8 nM, 7 nM, 6 nM, 5 nM, 4 nM, 3 nM, 2 nM, 1 nM, 900 pM, 800 pM, 700 pM, 600 pM, 500 pM, 400 pM, 300 pM, 200 pM, 100 pM, 90 pM, 80 pM, 70 pM, 60 pM, 50 pM, 40 pM, 30 pM, 20 pM, 10 pM, 9 pM, 8 pM, 7 pM, 6 pM, 5 pM, 4 pM, 3 pM, 2 pM, or 1 pM. For example, the dissociation constant may be less than about 10 μM.
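To illustrate why a low dissociation constant helps a barcoded antibody remain bound, the following sketch evaluates the standard equilibrium relation for simple 1:1 binding, fraction bound = [L] / ([L] + Kd), where [L] is the free concentration of the binding moiety. The simple 1:1 model (with the moiety in excess, so that [L] is approximately constant), the function name, and the example concentrations are assumptions for this example and are not values taken from the disclosure.

```python
def fraction_bound(free_ligand_molar, kd_molar):
    """Equilibrium fraction of target sites occupied for simple 1:1 binding.

    Assumes the free ligand concentration is not depleted by binding.
    Both arguments are in mol/L.
    """
    if free_ligand_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_ligand_molar / (free_ligand_molar + kd_molar)

# When the free concentration equals Kd, half of the sites are occupied.
occupancy_at_kd = fraction_bound(10e-9, 10e-9)

# At 100-fold excess over Kd, occupancy approaches saturation.
occupancy_high = fraction_bound(1e-6, 10e-9)

print(occupancy_at_kd, round(occupancy_high, 3))
```

Under this model, for a fixed labeling concentration, a smaller Kd gives a larger fraction of occupied sites, which is consistent with selecting a binding affinity within a desired range.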

A nucleic acid barcode molecule or barcoded oligonucleotide comprising the nucleic acid barcode molecule may be coupled to a cell-penetrating peptide (CPP), and labeling cells may comprise delivering the CPP coupled nucleic acid barcode molecule into a cell and/or cell bead by the cell-penetrating peptide. The nucleic acid barcode molecule or barcoded oligonucleotide comprising the nucleic acid barcode molecule may be conjugated to a cell-penetrating peptide (CPP), and labeling cells and/or cell beads may comprise delivering the CPP conjugated nucleic acid barcode molecule into a cell and/or cell bead by the cell-penetrating peptide. A cell-penetrating peptide that can be used in the methods provided herein can comprise at least one non-functional cysteine residue, which may be either free or derivatized to form a disulfide link with an oligonucleotide that has been modified for such linkage. Non-limiting examples of cell-penetrating peptides that can be used in embodiments herein include penetratin, transportan, pIsl, TAT(48-60), pVEC, MTS, and MAP. Cell-penetrating peptides useful in the methods provided herein can have the capability of inducing cell penetration for at least about 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% of cells of a cell population. The cell-penetrating peptide may be an arginine-rich peptide transporter. The cell-penetrating peptide may be Penetratin or the Tat peptide.

A nucleic acid barcode molecule or barcoded oligonucleotide comprising a nucleic acid barcode molecule may be coupled to a fluorophore or dye, and labeling cells may comprise subjecting the fluorophore-linked barcode molecule to conditions suitable for binding the fluorophore to the cell surface. See, e.g., FIG. 28. In some instances, fluorophores can interact strongly with lipid bilayers and labeling cells may comprise subjecting the fluorophore-linked barcode molecule to conditions such that the fluorophore binds to or is inserted into the cell membrane. In some cases, the fluorophore is a water-soluble, organic fluorophore. In some instances, the fluorophore is Alexa 532 maleimide, tetramethylrhodamine-5-maleimide (TMR maleimide), BODIPY-TMR maleimide, Sulfo-Cy3 maleimide, Alexa 546 carboxylic acid/succinimidyl ester, Atto 550 maleimide, Cy3 carboxylic acid/succinimidyl ester, Cy3B carboxylic acid/succinimidyl ester, Atto 565 biotin, Sulforhodamine B, Alexa 594 maleimide, Texas Red maleimide, Alexa 633 maleimide, Abberior STAR 635P azide, Atto 647N maleimide, Atto 647 SE, or Sulfo-Cy5 maleimide. See, e.g., Hughes L D, et al. PLoS One. 2014 Feb. 4; 9(2):e87649, which is hereby incorporated by reference in its entirety for a description of organic fluorophores.

A nucleic acid barcode molecule or barcoded oligonucleotide comprising the nucleic acid barcode molecule may be coupled to a lipophilic molecule, and labeling cells and/or cell beads may comprise delivering the nucleic acid barcode molecule to a cell membrane or a nuclear membrane by the lipophilic molecule. Lipophilic molecules can associate with and/or insert into lipid membranes such as cell membranes and nuclear membranes. In some cases, the insertion can be reversible. In some cases, the association between the lipophilic molecule and the cell and/or cell bead may be such that the cell and/or cell bead retains the lipophilic molecule (e.g., and associated components, such as nucleic acid barcode molecules, thereof) during subsequent processing (e.g., partitioning, cell permeabilization, amplification, pooling, etc.). The nucleic acid barcode molecule or barcoded oligonucleotide comprising the nucleic acid barcode molecule may enter into the intracellular space and/or a cell nucleus. Non-limiting examples of lipophilic molecules that can be used in the methods provided herein include sterol lipids such as cholesterol, tocopherol, and derivatives thereof, steryl lipids, lignoceric acid, and palmitic acid. Other lipophilic molecules that may be used in the methods provided herein comprise amphiphilic molecules wherein the headgroup (e.g., charge, aliphatic content, and/or aromatic content) and/or fatty acid chain length (e.g., C12, C14, C16, or C18) can be varied. For instance, fatty acid side chains (e.g., C12, C14, C16, or C18) can be coupled to glycerol or glycerol derivatives (e.g., 3-t-butyldiphenylsilylglycerol), which can also comprise, e.g., a cationic head group. The nucleic acid feature barcode molecules disclosed herein can then be coupled (either directly or indirectly) to these amphiphilic molecules. An amphiphilic molecule may associate with and/or insert into a membrane (e.g., a cell/cell bead or nuclear membrane). 
In some cases, an amphiphilic or lipophilic moiety may cross a cell membrane and provide a nucleic acid barcode molecule to an internal region of a cell and/or cell bead.

A nucleic acid barcode molecule may be attached to a lipophilic moiety (e.g., a cholesterol molecule). A nucleic acid barcode molecule may be attached to the lipophilic moiety via a linker, such as a tetra-ethylene glycol (TEG) linker. Other exemplary linkers include, but are not limited to, Amino Linker C6, Amino Linker C12, Spacer C3, Spacer C6, Spacer C12, Spacer 9, and Spacer 18. A nucleic acid barcode molecule may be attached to the lipophilic moiety or the linker on the 5′ end of the nucleic acid barcode molecule. Alternatively, a nucleic acid barcode molecule may be attached to the lipophilic moiety or the linker on the 3′ end of the nucleic acid barcode molecule. In some instances, a first nucleic acid barcode molecule is attached to the lipophilic moiety or the linker at the 5′ end of the first nucleic acid barcode molecule and a second nucleic acid barcode molecule is attached to the lipophilic moiety or the linker at the 3′ end of the second nucleic acid barcode molecule. The linker may be a glycol or derivative thereof. For example, the linker may be tetra-ethylene glycol (TEG) or polyethylene glycol (PEG). A nucleic acid barcode molecule may be releasably attached to the linker or lipophilic moiety (e.g., as described elsewhere herein for releasable attachment of nucleic acid molecules) such that the nucleic acid barcode molecule or a portion thereof can be released from the lipophilic molecule.

In some cases, a lipophilic molecule may comprise a label, such as an optical label. Such a label may, for example, enable detection of a moiety with which the lipophilic molecule is associated. For example, a lipophilic molecule may comprise a fluorescent moiety. The fluorescent moiety may permit optical detection of the lipophilic molecule and moieties with which it is associated.

An example of reagents and schemes suitable for analysis of barcoded lipophilic molecules is shown in panels I and II of FIG. 10. Although a lipophilic moiety is shown in FIG. 10, any moiety described herein (e.g., an antibody) can be conjugated to barcode oligonucleotides as described below. As shown in FIG. 10 (panel I), a lipophilic moiety (e.g., a cholesterol) 1001 is directly (e.g., covalently bound, bound via a protein-protein interaction, etc.) coupled to an oligonucleotide 1002 comprising a feature barcode sequence 1003 that functions to identify a cell or cell population. In some embodiments, oligonucleotide 1002 also includes additional sequences suitable for downstream reactions (e.g., sequence 1004 comprising a reverse complement of a sequence on second nucleic acid molecule 1006 and optionally sequence 1005 comprising a sequence configured to function as a PCR primer binding site). FIG. 10 (panel I) also shows an additional oligonucleotide 1006 (e.g., which in some instances, may be attached to a bead as described elsewhere herein) comprising a cell barcode sequence 1008 (also referred to herein as a bead barcode sequence or a nucleic acid barcode sequence), and a sequence 1010 complementary to a sequence 1004 on oligonucleotide 1002. See also FIGS. 29 and 30 for exemplary sequences (e.g., 1010, 1030) complementary to moiety bound oligonucleotides (e.g., 1002, 1022). In some instances, oligonucleotide 1006 also comprises additional functional sequences suitable for downstream reactions such as a UMI sequence 1009 and an adapter sequence 1007 (e.g., a sequence 1007 comprising a sequencing primer binding site, e.g., a Read 1 (“R1”) or a Read 2 (“R2”) sequence, and in some instances, a P5 or P7 flow cell attachment sequence). Sequence 1010 represents a sequence that is complementary to sequence 1004. In some instances, sequence 1004 comprises a poly-A sequence and sequence 1010 comprises a poly-T sequence. 
In some instances, sequence 1010 comprises a poly-A sequence and sequence 1004 comprises a poly-T sequence. In some instances, sequence 1004 comprises a GGG-containing sequence and sequence 1010 comprises a complementary CCC-containing sequence. In some instances, sequence 1010 comprises a GGG-containing sequence and sequence 1004 comprises a complementary CCC-containing sequence. In some instances, the CCC-containing or GGG-containing sequences comprise one or more ribonucleotides. During analysis, sequence 1010 hybridizes with sequence 1004 and oligonucleotides 1002 and/or 1006 are extended via the action of a polymerizing enzyme (e.g., a reverse transcriptase, a polymerase), where oligonucleotide 1006 then comprises complement sequences to oligonucleotide 1002 at its 3′ end. These constructs can then be optionally processed as described elsewhere herein and subjected to nucleic acid sequencing to, for example, identify cells associated with a specific feature barcode 1003 and a specific cell barcode 1008. While the sequences included in panel I of FIG. 10 are presented in a given order, the sequences may be included in a different order, and/or with additional sequences or nucleotides disposed between one or more of the sequences. For example, the UMI 1009 and the barcode sequence 1008 may be transposed.
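
The hybridization and extension step described above can be illustrated with a minimal Python sketch. It assumes, for illustration only, that oligonucleotide 1002 is ordered 5′ to 3′ as sequence 1005, feature barcode 1003, and capture sequence 1004, and that oligonucleotide 1006 is ordered as adapter 1007, cell barcode 1008, UMI 1009, and sequence 1010; as noted above, other orders are possible. All sequences, lengths, and function names below are hypothetical placeholders and are not taken from FIG. 10.

```python
# Illustrative sketch only; sequences and names are hypothetical.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def extend_bead_oligo(bead_oligo: str, feature_oligo: str, capture_len: int):
    """Model extension of the bead oligonucleotide (1006) along 1002.

    The 3' end of the bead oligonucleotide (sequence 1010) must be the
    reverse complement of the 3' capture region (sequence 1004) of the
    feature oligonucleotide. If it is, the bead oligonucleotide is extended
    with the reverse complement of the remaining template; otherwise
    None is returned.
    """
    if capture_len <= 0:
        raise ValueError("capture_len must be positive")
    capture = feature_oligo[-capture_len:]
    if not bead_oligo.endswith(reverse_complement(capture)):
        return None  # no hybridization, so no extension
    template = feature_oligo[:-capture_len]
    return bead_oligo + reverse_complement(template)


# Hypothetical feature oligonucleotide 1002 (5' -> 3').
pcr_handle = "GTCTCGTG"       # stands in for sequence 1005
feature_barcode = "ACGGTCAT"  # stands in for feature barcode 1003
capture = "A" * 12            # stands in for sequence 1004 (poly-A)
oligo_1002 = pcr_handle + feature_barcode + capture

# Hypothetical bead oligonucleotide 1006 (5' -> 3').
adapter = "CTACACGA"          # stands in for adapter 1007
cell_barcode = "TTGCCAAG"     # stands in for cell barcode 1008
umi = "GATTACAG"              # stands in for UMI 1009
oligo_1006 = adapter + cell_barcode + umi + "T" * 12  # poly-T as sequence 1010

extended = extend_bead_oligo(oligo_1006, oligo_1002, capture_len=12)
print(extended)
```

In this sketch the extended product carries the cell barcode, the UMI, and the complement of the feature barcode on a single molecule, which is what allows a sequencing read to associate feature barcode 1003 with cell barcode 1008.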

In another example, shown in FIG. 10 (panel II), a lipophilic moiety (e.g., a cholesterol) 1021 is indirectly (e.g., via hybridization or ligand-ligand interactions, such as biotin-streptavidin) coupled to an oligonucleotide 1022 comprising a feature barcode sequence 1023 that functions to identify a cell or cell population. Lipophilic molecule 1021 is directly (e.g., covalently bound, bound via a protein-protein interaction) coupled to a hybridization oligonucleotide 1032 that hybridizes with sequence 1031 of oligonucleotide 1022, thereby indirectly coupling oligonucleotide 1022 to the lipophilic moiety. In some embodiments, oligonucleotide 1022 includes additional sequences suitable for downstream reactions (e.g., sequence 1024 comprising a reverse complement of a sequence on second nucleic acid molecule 1026 and optionally sequence 1025 comprising a sequence configured to function as a PCR primer binding site). FIG. 10 (panel II) also shows an additional oligonucleotide 1026 (e.g., which in some instances, may be attached to a bead as described elsewhere herein) comprising a cell barcode sequence 1028 (e.g., a nucleic acid barcode sequence), and a sequence 1030 complementary to a sequence 1024 on oligonucleotide 1022. In some instances, oligonucleotide 1026 also comprises additional functional sequences suitable for downstream reactions such as a UMI sequence 1029 and an adapter sequence 1027 (e.g., a sequence 1027 comprising a sequencing primer binding site, e.g., a Read 1 (“R1”) or a Read 2 (“R2”) sequence, and in some instances, a P5 or P7 flow cell attachment sequence). Sequence 1030 represents a sequence that is complementary to sequence 1024. In some instances, sequence 1024 comprises a poly-A sequence and sequence 1030 comprises a poly-T sequence. In some instances, sequence 1030 comprises a poly-A sequence and sequence 1024 comprises a poly-T sequence. 
In some instances, sequence 1024 comprises a GGG-containing sequence and sequence 1030 comprises a complementary CCC-containing sequence. In some instances, sequence 1030 comprises a GGG-containing sequence and sequence 1024 comprises a complementary CCC-containing sequence. In some instances, the CCC-containing or GGG-containing sequences comprise one or more ribonucleotides. During analysis, sequence 1030 hybridizes with sequence 1024 and oligonucleotides 1022 and/or 1026 are extended via the action of a polymerizing enzyme (e.g., a reverse transcriptase, a polymerase), where oligonucleotide 1026 then comprises complement sequences to oligonucleotide 1022 at its 3′ end. These constructs can then be optionally processed as described elsewhere herein and subjected to nucleic acid sequencing to, for example, identify cells associated with a specific feature barcode 1023 and a specific cell barcode 1028. While the sequences included in panel II of FIG. 10 are presented in a given order, the sequences may be included in a different order, and/or with additional sequences or nucleotides disposed between one or more of the sequences. For example, the UMI 1029 and the barcode sequence 1028 may be transposed. See, e.g., FIG. 30 for additional exemplary oligonucleotides suitable for use with the labeling moieties (e.g., lipophilic, antibody, fluorophore, etc.) described herein.

In an example, a method provided herein may be used to label cells using feature barcodes linked to cell surfaces. A cell surface feature (e.g., a lipophilic moiety, such as a cholesterol) of a plurality of cells may be linked (e.g., conjugated) to a feature barcode. The feature barcode may include, for example, a sequence configured to hybridize to a nucleic acid barcode molecule, such as a sequence comprising multiple cytosine nucleotides (e.g., a CCC sequence). Each feature barcode may comprise a barcode sequence and/or a unique molecular identifier sequence. A plurality of beads (e.g., gel beads) each comprising a plurality of nucleic acid barcode molecules may be provided. The nucleic acid barcode molecules of each bead (e.g., releasably attached to each bead) may comprise a barcode sequence (e.g., cell barcode sequence), a unique molecular identifier sequence, and a sequence configured to hybridize to a feature barcode linked to a cell surface. Nucleic acid barcode molecules of each different bead may comprise the same barcode sequence, which barcode sequence differs from barcode sequences of nucleic acid barcode molecules of other beads of the plurality of beads. The feature barcode-linked cells may be partitioned with the plurality of beads into a plurality of partitions (e.g., droplets, such as aqueous droplets in an emulsion) such that at least a subset of the plurality of partitions each comprise a single cell and a single bead. One or more nucleic acid barcode molecules of the bead of each partition may attach (e.g., hybridize or ligate) to one or more feature barcodes of the cell of the same partition. The one or more nucleic acid barcode molecules of the bead may be released (e.g., via application of a stimulus, such as a chemical stimulus) from the bead within the partition prior to attachment of the one or more nucleic acid barcode molecules to the one or more feature barcodes of the cell. 
The cell may be lysed or permeabilized within the partition to provide access to analytes therein, such as nucleic acid molecules therein (e.g., deoxyribonucleic acid (DNA) molecules and/or ribonucleic acid (RNA) molecules). One or more analytes (e.g., nucleic acid molecules) of the cell may also be barcoded within the partition with one or more nucleic acid barcode molecules of the bead to provide a plurality of barcoded analytes (e.g., barcoded nucleic acid molecules). The plurality of partitions comprising barcoded analytes and barcoded cell surface features may be combined (e.g., pooled). Additional processing may be performed to, for example, prepare the barcoded analytes and barcoded cell surface features for subsequent analysis. For example, barcoded nucleic acid molecules may be derivatized with flow cell adapters to facilitate nucleic acid sequencing. Barcodes of barcoded analytes may be detected (e.g., using nucleic acid sequencing) and used to identify the barcoded analytes as deriving from particular cells or cell types of the plurality of cells.

In another example, a method provided herein may be used to label cells using lipophilic feature barcodes. Feature barcodes comprising a lipophilic moiety (e.g., a cholesterol moiety) may be incubated with a plurality of cells. The feature barcodes may comprise an optical label such as a fluorescent moiety. The feature barcodes may include, for example, a sequence configured to hybridize to a nucleic acid barcode molecule, such as a sequence comprising multiple cytosine nucleotides (e.g., a CCC sequence). Each feature barcode may also comprise a barcode sequence and/or a unique molecular identifier sequence. A plurality of beads (e.g., gel beads) each comprising a plurality of nucleic acid barcode molecules may be provided. The nucleic acid barcode molecules of each bead (e.g., releasably attached to each bead) may comprise a barcode sequence (e.g., cell barcode sequence), a unique molecular identifier sequence, and a sequence configured to hybridize to a feature barcode. Nucleic acid barcode molecules of each different bead may comprise the same barcode sequence, which barcode sequence differs from barcode sequences of nucleic acid barcode molecules of other beads of the plurality of beads. The cells incubated with feature barcodes may be partitioned (e.g., subsequent to one or more washing processes) with the plurality of beads into a plurality of partitions (e.g., droplets, such as aqueous droplets in an emulsion) such that at least a subset of the plurality of partitions each comprise a single cell and a single bead. Within each partition of the at least a subset of the plurality of partitions, one or more nucleic acid barcode molecules of the bead may attach (e.g., hybridize or ligate) to one or more feature barcodes of the cell. 
The one or more nucleic acid barcode molecules of the bead may be released (e.g., via application of a stimulus, such as a chemical stimulus) from the bead within the partition prior to attachment of the one or more nucleic acid barcode molecules to the one or more feature barcodes of the cell to provide a barcoded feature barcode. The cell may be lysed or permeabilized within the partition to provide access to analytes therein, such as nucleic acid molecules therein (e.g., deoxyribonucleic acid (DNA) molecules and/or ribonucleic acid (RNA) molecules), and/or to the feature barcode therein (e.g., if the feature barcode has permeated the cell membrane). One or more analytes (e.g., nucleic acid molecules) of the cell may also be barcoded within the partition with one or more nucleic acid barcode molecules of the bead to provide a plurality of barcoded analytes (e.g., barcoded nucleic acid molecules). The plurality of partitions comprising barcoded analytes and barcoded feature barcodes may be combined (e.g., pooled). Additional processing may be performed to, for example, prepare the barcoded analytes and barcoded feature barcodes for subsequent analysis. For example, barcoded nucleic acid molecules and/or barcoded feature barcodes may be derivatized with flow cell adapters to facilitate nucleic acid sequencing. Barcodes of barcoded analytes and barcoded feature barcodes may be detected (e.g., using nucleic acid sequencing) and used to identify the barcoded analytes and barcoded feature barcodes as deriving from particular cells or cell types of the plurality of cells.

Cells and/or cell beads may be contacted with one or more additional agents along with moiety-conjugated feature barcodes (e.g., the lipophilic molecules described herein). For example, cells and/or cell beads may be contacted with a lipophilic moiety-conjugated barcode molecule and one or more additional moiety (e.g., lipophilic moiety) conjugated “anchor” molecules. In some instances, a cell and/or cell bead is contacted with (1) a lipophilic-moiety conjugated to a first nucleic acid molecule comprising a capture sequence (e.g., a poly-A sequence), a feature barcode sequence, and a primer sequence; and (2) an anchor molecule comprising a lipophilic moiety conjugated to a second nucleic acid molecule comprising a sequence complementary to the primer sequence. In other instances, a cell and/or cell bead is contacted with (1) a lipophilic-moiety conjugated to a first nucleic acid molecule comprising a capture sequence (e.g., a poly-A sequence), a feature barcode sequence, and a primer sequence; (2) an anchor molecule comprising a lipophilic moiety conjugated to a second nucleic acid molecule comprising an anchor sequence and a sequence complementary to the primer sequence; and (3) a co-anchor molecule comprising a lipophilic moiety conjugated to a third nucleic acid molecule comprising a sequence complementary to the anchor sequence. Moiety-conjugated oligonucleotides can comprise any number of modifications, such as modifications which prevent extension by a polymerase and other such modifications described elsewhere herein.
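
The anchor and co-anchor arrangement described above relies on pairwise complementarity between defined regions of the three oligonucleotides. The following Python sketch checks those relationships for a hypothetical design. The sequences, the 5′ to 3′ ordering of elements within each oligonucleotide, and the requirement of perfect complementarity are all assumptions made for illustration.

```python
# Illustrative sketch only; sequences and element order are hypothetical.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def can_hybridize(region_a: str, region_b: str) -> bool:
    """True if the two regions are perfect reverse complements."""
    return region_a == reverse_complement(region_b)


primer_seq = "GTCTCGTGGGCTCGG"    # primer sequence on the barcode oligo
feature_barcode = "ACGGTCAT"      # feature barcode sequence
capture_seq = "A" * 12            # capture sequence (poly-A)
anchor_seq = "TGGAATTCTCGGGTGCC"  # anchor sequence on the anchor oligo

# (1) Lipophilic-moiety-conjugated barcode oligonucleotide.
barcode_oligo = primer_seq + feature_barcode + capture_seq
# (2) Anchor oligonucleotide: anchor sequence plus complement of the primer.
anchor_oligo = anchor_seq + reverse_complement(primer_seq)
# (3) Co-anchor oligonucleotide: complement of the anchor sequence.
co_anchor_oligo = reverse_complement(anchor_seq)

anchor_binds_barcode = can_hybridize(
    anchor_oligo[len(anchor_seq):], barcode_oligo[:len(primer_seq)]
)
co_anchor_binds_anchor = can_hybridize(
    co_anchor_oligo, anchor_oligo[:len(anchor_seq)]
)
print(anchor_binds_barcode, co_anchor_binds_anchor)
```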

The structure of the moiety-attached barcode oligonucleotides may include a number of sequence elements in addition to the feature barcode sequence. The oligonucleotide may include functional sequences that are used in subsequent processing, which may include one or more of a sequencer specific flow cell attachment sequence, e.g., a P5 or P7 sequence for Illumina sequencing systems, as well as sequencing primer sequences, e.g., a R1 or R2 sequencing primer sequence for Illumina sequencing systems. A specific priming and/or capture sequence, such as poly-A sequence, may be also included in the oligonucleotide structure.

As described above, moiety-attached barcode oligonucleotides can be processed to attach a cell barcode sequence. Cell barcode oligonucleotides (which can be attached to a bead) may comprise a poly-T sequence designed to hybridize and capture poly-A containing moiety-attached barcode oligonucleotides. A poly-T cell barcode molecule may comprise an anchoring sequence segment to ensure that the poly-T sequence hybridizes to the poly-A sequence of the moiety-attached barcode oligonucleotides. This anchoring sequence can include a random short sequence of nucleotides, e.g., 1-mer, 2-mer, 3-mer or longer sequence. An additional sequence segment may be included within the cell barcode oligonucleotide molecules. This additional sequence may provide a unique molecular identifier (UMI) sequence segment, e.g., as a random sequence (e.g., such as a random N-mer sequence) that varies across individual oligonucleotides (e.g., cell barcode molecules coupled to a single bead), whereas the cell barcode sequence is constant among the oligonucleotides (e.g., cell barcode molecules coupled to a single bead). This unique sequence may serve to provide a unique identifier of the starting nucleic acid molecule that was captured, in order to allow quantitation of the number of original molecules present (e.g., the number of moiety-conjugated nucleic acid barcode molecules).
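
The use of UMI sequences for quantitation can be sketched as follows. This minimal Python example assumes that each sequencing read has already been parsed into a cell barcode, a feature barcode, and a UMI; the tuple layout, the placeholder identifiers, and the function name are hypothetical. Reads that share all three values are treated as copies of one original molecule, so the number of distinct UMIs estimates the number of captured molecules.

```python
from collections import defaultdict


def count_feature_molecules(reads):
    """Count distinct UMIs per (cell barcode, feature barcode) pair.

    `reads` is an iterable of (cell_barcode, feature_barcode, umi) tuples.
    """
    seen = defaultdict(set)
    for cell_barcode, feature_barcode, umi in reads:
        seen[(cell_barcode, feature_barcode)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}


# Hypothetical parsed reads; identifiers are placeholders, not real data.
reads = [
    ("CELL_A", "FB_1", "AACGT"),
    ("CELL_A", "FB_1", "AACGT"),  # same UMI: counted once (amplification copy)
    ("CELL_A", "FB_1", "GGTCA"),
    ("CELL_B", "FB_2", "AACGT"),
]
counts = count_feature_molecules(reads)
print(counts)
```

This sketch does not correct for sequencing errors within the UMI, which a practical workflow may need to address.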

Nucleic acid barcode molecules or barcoded oligonucleotides comprising the nucleic acid barcode molecules may be coupled to a plurality of beads, such as a plurality of gel beads. An individual bead of a plurality of beads can include tens to hundreds of thousands or millions of individual oligonucleotide molecules (e.g., at least about 10,000, 50,000, 100,000, 500,000, 1,000,000 or 10,000,000 oligonucleotide molecules), where a barcode segment of the oligonucleotide molecules can be constant or relatively constant for all of the oligonucleotide molecules coupled to a given bead. Oligonucleotide molecules coupled to a given bead may also comprise a variable or unique sequence segment that may vary across the oligonucleotide molecules coupled to the given bead. The variable or unique sequence segment may be a unique molecular identifier (UMI) sequence segment that may include from 5 to about 8 or more nucleotides within the sequence of the oligonucleotides. In some cases, the unique molecular identifier (UMI) sequence segment can be 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 nucleotides in length or longer. In some cases, the unique molecular identifier (UMI) sequence segment can be at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 nucleotides in length or longer. In some cases, the unique molecular identifier (UMI) sequence segment can be at most 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 nucleotides in length. In some cases, the sample oligonucleotide (e.g., partition nucleic acid barcode molecule) may comprise a target-specific primer (e.g., a primer sequence specific for a sequence in the moiety-conjugated oligonucleotides). For example, the specific sequence may be a sequence that is not in the capture sequence (e.g., not the poly-A or CCC-containing capture sequence).
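
The UMI lengths listed above determine how many distinct UMI sequences are available. As a rough illustration, the following Python sketch computes the number of possible sequences for a random N-mer (4 raised to the power N) and, under the simplifying assumption that each captured molecule receives a UMI drawn uniformly and independently, the expected number of distinct UMIs observed. The uniform-draw model and the function names are assumptions for illustration and are not part of the methods described herein.

```python
def umi_space(length: int) -> int:
    """Number of distinct sequences for a random N-mer over A, C, G, T."""
    return 4 ** length


def expected_distinct_umis(length: int, n_molecules: int) -> float:
    """Expected count of distinct UMIs among n_molecules uniform draws."""
    n = umi_space(length)
    return n * (1.0 - (1.0 - 1.0 / n) ** n_molecules)


for length in (5, 8, 10, 12):
    print(length, umi_space(length), expected_distinct_umis(length, 1000))
```

Under this model, a short UMI assigned to many molecules yields noticeably fewer distinct UMIs than molecules, so counts are underestimated, whereas a longer UMI makes such collisions rare.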

Labeling cells and/or cell beads may comprise delivering a nucleic acid barcode molecule or barcoded oligonucleotide comprising the nucleic acid barcode molecule into a cell and/or cell bead using a physical force or chemical compound. A labeled cell sample may refer to a sample in which one or more cells and/or cell beads have nucleic acid barcode molecules introduced to the cells and/or cell beads (e.g., coupled to the surface of the cells and/or cell beads) and/or within the cells and/or cell beads.

Use of physical force (e.g., to deliver a nucleic acid barcode molecule or barcoded oligonucleotide to a cell and/or cell bead) can refer to the use of a physical force to counteract the cell membrane barrier in facilitating intracellular delivery of oligonucleotides. Examples of physical methods that can be used in embodiments herein include the use of a needle, ballistic DNA, electroporation, sonoporation, photoporation, magnetofection, and hydroporation.

Labeling cells and/or cell beads may comprise the use of a needle, for example for injection (e.g., microinjection). Alternatively or in addition, labeling cells and/or cell beads may comprise particle bombardment. With particle bombardment, nucleic acid barcode molecules can be coated on heavy metal particles and delivered to a cell and/or cell bead at a high speed. Labeling cells and/or cell beads may comprise electroporation. With electroporation, nucleic acid barcode molecules can enter a cell and/or cell bead through one or more pores in the cellular membrane formed by applied electricity. The pore of the membrane can be reversible based on the applied field strength and pulse duration. Labeling cells and/or cell beads may comprise sonoporation. Cell membranes can be temporarily permeabilized using sound waves, allowing cellular uptake of nucleic acid barcode molecules. Labeling cells and/or cell beads may comprise photoporation. A transient pore in a cell membrane can be generated using a laser pulse, allowing cellular uptake of nucleic acid barcode molecules. Labeling individual cells and/or cell beads may comprise magnetofection. Nucleic acid barcode molecules can be coupled to a magnetic particle (e.g., magnetic nanoparticle, nanowires, etc.) and localized to a target cell and/or cell bead via an applied magnetic field. Labeling cells and/or cell beads may comprise hydroporation. Nucleic acid barcode molecules can be delivered to cells and/or cell beads via hydrodynamic pressure.

Various chemical compounds can be used in embodiments herein to deliver nucleic acid barcode molecules into a cell and/or cell bead. Chemical vectors can include inorganic particles, lipid-based vectors, polymer-based vectors and peptide-based vectors. Non-limiting examples of inorganic particles that can be used in embodiments herein to deliver nucleic acid barcode molecules into a cell and/or cell bead include inorganic nanoparticles prepared from metals (e.g., iron, gold, and silver), inorganic salts, and ceramics (e.g., phosphate or carbonate salts of calcium, magnesium, or silicon). The surface of a nanoparticle can be coated to facilitate nucleic acid molecule binding or chemically modified to facilitate nucleic acid molecule attachment. Magnetic nanoparticles (e.g., superparamagnetic iron oxide), fullerenes (e.g., soluble carbon molecules), carbon nanotubes (e.g., cylindrical fullerenes), quantum dots and supramolecular systems may be used.

Labeling cells and/or cell beads may comprise use of a cationic lipid, such as a liposome. Various types of lipids can be used in liposome delivery. In some cases, a nucleic acid barcode molecule is delivered to a cell via a lipid nanoemulsion. A lipid emulsion refers to a dispersion of one immiscible liquid in another, stabilized by an emulsifying agent. Labeling cells and/or cell beads may comprise use of a solid lipid nanoparticle.

Labeling cells and/or cell beads may comprise use of a peptide-based chemical vector. Cationic peptides may be rich in basic residues like lysine and/or arginine. Labeling cells and/or cell beads may comprise use of a polymer-based chemical vector. Cationic polymers, when mixed with nucleic acid molecules, can form nanosized complexes called polyplexes. Polymer-based vectors may comprise natural proteins, peptides and/or polysaccharides. Polymer-based vectors may comprise synthetic polymers. Labeling cells may comprise use of a polymer-based vector comprising polyethylenimine (PEI). PEI can condense DNA into positively charged particles which bind to anionic cell surface residues and are brought into the cell via endocytosis. Labeling cells and/or cell beads may comprise use of a polymer-based chemical vector comprising poly-L-lysine (PLL), poly(DL-lactic acid) (PLA), poly(DL-lactide-co-glycolide) (PLGA), polyornithine, polyarginine, histones, or protamines. Polymer-based vectors may comprise a mixture of polymers, for example PEG and PLL. Other polymers include dendrimers, chitosans, synthetic amino derivatives of dextran, and cationic acrylic polymers.

Following cell labeling, a majority of the cells and/or cell beads of individual cell samples can be labeled with nucleic acid barcode molecules having a sample barcode sequence (e.g., a moiety-conjugated barcode molecule, also referred to herein as a feature barcode). At least 50%, 60%, 70%, 75%, 80%, 85%, 90%, or 95% of cells of a cell sample may be labeled. In some cases, not all of the cells are labeled. For example, less than 100%, 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, or 50% of cells of a cell sample may be labeled.

The plurality of labeled cell samples may be subjected to one or more reactions. The one or more reactions may comprise one or more nucleic acid extension reactions. The one or more reactions may comprise one or more nucleic acid amplification reactions. Alternatively or in addition, the one or more reactions may comprise one or more ligation reactions.

Individual labeled cells and/or cell beads of the plurality of labeled cell samples may be co-partitioned into a plurality of partitions (e.g., a plurality of wells or droplets). For example, labeled cells and/or cell beads may be partitioned into a plurality of partitions prior to undergoing one or more reactions. Labeled cells may be partitioned into partitions with one or more polymerizable materials such that labeled cell beads may be generated within the partitions. One or more labeled cells and/or cell beads may be included in a given partition of the plurality of partitions. Subjecting the nucleic acid molecules of the plurality of labeled cell samples to one or more reactions may comprise partitioning individual cells and/or cell beads of the plurality of labeled cell samples into partitions and, within individual partitions, synthesizing a nucleic acid molecule comprising (i) a sample barcode sequence and (ii) a sequence corresponding to a nucleic acid molecule. By partitioning the labeled cell samples into a plurality of partitions, the one or more reactions can be performed for individual cells and/or cell beads in isolated environments. Individual partitions may comprise at most a single cell and/or cell bead. Alternatively, a subset of partitions may contain at least a single cell and/or cell bead.

A partition may be an aqueous droplet in a non-aqueous phase such as oil. For example, a partition may be a droplet in an emulsion. Alternatively or in addition, partitions may comprise wells or tubes.

A partition may contain a bead comprising a reagent for synthesizing a nucleic acid molecule. The reagent may be releasably attached to the bead. The reagent may comprise a nucleic acid, such as a nucleic acid primer. The nucleic acid may comprise a partition-specific barcode sequence. Two cells from a given cell sample may have an identical sample (e.g., cell) barcode sequence but different partition-specific barcode sequences (e.g., if the two cells are partitioned in two different partitions comprising the different partition-specific barcode sequences). In an example, a first cell from a first cell sample has a first sample barcode sequence and a first partition-specific barcode sequence and a second cell from a second cell sample has a second sample barcode sequence and a second partition-specific barcode sequence. The first sample barcode sequence and the second sample barcode sequence may be different. The first partition-specific barcode sequence and the second partition-specific barcode sequence may also be different (e.g., if the two cells are partitioned in two different partitions comprising the different partition-specific barcode sequences). Alternatively, the first partition-specific barcode sequence and the second partition-specific barcode sequence may be the same (e.g., if the two cells are partitioned in the same partition).
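
The relationship between sample barcode sequences and partition-specific barcode sequences can be illustrated with a short Python sketch that groups observed sample barcodes by partition-specific barcode and reports partitions in which more than one sample barcode was observed. The input layout (pairs of partition barcode and sample barcode), the placeholder identifiers, and the function names are assumptions for illustration. Two cells from the same cell sample that share a partition carry the same sample barcode sequence and would not be distinguished by this check.

```python
from collections import defaultdict


def samples_per_partition(observations):
    """Map each partition-specific barcode to the sample barcodes seen with it.

    `observations` is an iterable of (partition_barcode, sample_barcode) pairs.
    """
    by_partition = defaultdict(set)
    for partition_barcode, sample_barcode in observations:
        by_partition[partition_barcode].add(sample_barcode)
    return dict(by_partition)


def partitions_with_multiple_samples(observations):
    """Return partition barcodes associated with more than one sample barcode."""
    grouped = samples_per_partition(observations)
    return sorted(p for p, samples in grouped.items() if len(samples) > 1)


# Hypothetical observations; identifiers are placeholders, not real data.
observations = [
    ("PART_1", "SAMPLE_A"),
    ("PART_1", "SAMPLE_A"),
    ("PART_2", "SAMPLE_A"),
    ("PART_2", "SAMPLE_B"),  # cells from two samples in the same partition
    ("PART_3", "SAMPLE_B"),
]
mixed = partitions_with_multiple_samples(observations)
print(mixed)
```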

A bead to which one or more oligonucleotides or nucleic acid barcode molecules are attached may be degradable upon application of a stimulus. The stimulus may comprise a chemical stimulus. A bead may be degraded within a partition. Where a bead comprises a reagent for synthesizing a nucleic acid molecule, the reagent may be released, e.g., into a partition comprising the bead, upon degradation of the bead.

A plurality of nucleic acid barcode products can be subjected to nucleic acid sequencing to yield a plurality of sequencing reads. Individual sequencing reads can be associated with individual labeled cell samples based on the sample barcode sequence.

A method of the present disclosure may comprise pooling a plurality of nucleic acid barcode products from partitions prior to subjecting the nucleic acid barcode products, or derivatives thereof, to an assay such as nucleic acid sequencing. Nucleic acid barcode products may be subjected to processing such as nucleic acid amplification. In some cases, one or more features such as one or more functional sequences (e.g., sequencing primers and/or flow cell adapter sequences) may be added to nucleic acid barcode products, e.g., after pooling of nucleic acid barcode products from the partitions. For example, pooled amplification products may be subjected to one or more reactions prior to sequencing. For example, the pooled nucleic acid barcode products may be subjected to one or more additional reactions (e.g., nucleic acid extension, polymerase chain reaction, or adapter ligation). Adapter ligation may include, for example, fragmenting the nucleic acid barcode products (e.g., by mechanical shearing or enzymatic digestion) and enzymatic ligation.

A cell sample may comprise a plurality of cells and/or cell beads. A cell sample may comprise constituents in addition to cells and/or cell beads. For example, a cell sample can contain at least one of proteins, cell-free polynucleotides (e.g., cell-free DNA), cell stabilizing agents, protein stabilizing agents, enzyme inhibitors, cell nuclei, and ions.

Cell samples can be obtained from any of a variety of sources. For example, cell samples can be obtained from tissue samples. A tissue sample can be obtained from any suitable tissue source. Tissue samples can be obtained from components of the circulatory system, the digestive system, the endocrine system, the immune system, the lymphatic system, the nervous system, the muscular system, the reproductive system, the skeletal system, the respiratory system, the urinary system, and the integumentary system. A cell sample may be obtained from a tissue sample of the circulatory system such as the heart or blood vessels (e.g., arteries, veins, etc). A cell sample may be obtained from a tissue sample of the digestive system (e.g., mouth, esophagus, stomach, small intestine, large intestine, rectum, and anus). A cell sample may be obtained from a tissue sample of the endocrine system (e.g., pituitary gland, pineal gland, thyroid gland, parathyroid gland, adrenal gland, and pancreas). A cell sample may be obtained from a tissue sample of the immune system (e.g., lymph nodes, spleen, and bone marrow). A cell sample may be obtained from a tissue sample of the lymphatic system (e.g., lymph nodes, lymph ducts, and lymph vessels). In some embodiments, a cell sample is obtained from a tissue sample of the nervous system (e.g., brain and spinal cord). In some embodiments, a cell sample is obtained from a tissue sample of the muscular system (e.g., skeletal muscle, smooth muscle, and cardiac muscle). In some embodiments, a cell sample is obtained from a tissue sample of the reproductive system (e.g., penis, testes, vagina, uterus, and ovaries). In some embodiments, a cell sample is obtained from a tissue sample of the skeletal system (e.g., tendons, ligaments, and cartilage). In some embodiments, a cell sample is obtained from a tissue sample of the respiratory system (e.g., trachea, diaphragm, and lungs). 
In some embodiments, a cell sample is obtained from a tissue sample of the urinary system (e.g., kidneys, ureters, bladder, sphincter muscle, and urethra). In some embodiments, a cell sample is obtained from a tissue sample of the integumentary system (e.g., skin).

A tissue sample can be obtained by invasive, minimally invasive, or non-invasive procedures. Tissue samples can be obtained, for example, by surgical excision, biopsy, cell scraping, or swabbing. A tissue sample may be a tissue sample obtained during a surgical procedure or a sample obtained for diagnostic purposes. A tissue sample can be a fresh tissue sample, a frozen tissue sample, or a fixed tissue sample.

In some cases, a tissue and/or cell sample may be embedded, embalmed, preserved, and/or fixed. For example, a tissue and/or cell sample may be both fixed and embedded. A tissue and/or cell sample may comprise one or more fixed cells. Fixation is a process that preserves biological tissue or a cell from decay, thereby preventing autolysis or putrefaction. A fixed tissue may preserve its cells, its tissue components, or both. Fixation may be done through a crosslinking fixative by forming covalent bonds between proteins in the tissue or cell to be fixed. Fixation may anchor soluble proteins to the cytoskeleton of a cell. Fixation may form a rigid cell, a rigid tissue, or both. Fixation may be achieved through use of chemicals such as formaldehyde (e.g., formalin), glutaraldehyde, ethanol, methanol, acetic acid, osmium tetroxide, potassium dichromate, chromic acid, potassium permanganate, Zenker's fixative, picrates, Hepes-glutamic acid buffer-mediated organic solvent protection effect (HOPE), or any combination thereof. Formaldehyde may be used as a mixture of about 37% formaldehyde gas in aqueous solution on a weight by weight basis. The aqueous formaldehyde solution may additionally comprise about 10-15% of an alcohol (e.g., methanol), forming a solution termed “formalin.” A fixative-strength (10%) solution would equate to a 3.7% solution of formaldehyde gas in water. Formaldehyde may be used as at least 5%, 8%, 10%, 12% or 15% Neutral Buffered Formalin (NBF) solution (i.e., fixative strength). Formaldehyde may be used as 3.7% to 4.0% formaldehyde in phosphate buffered saline (i.e., formalin). In some instances, fixation is performed using at least 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10, 10.5, 11.0, 11.5, 12.0, 12.5, 13.0, 13.5, 14.0, 14.5, or 15.0 percent (%) or more formalin flush or immersion. In some instances, fixation is performed using about 10% formalin flush. Fixative volume can be 10, 15, 20, 25 or 30 times that of tissue on a weight per volume basis. Subsequent to fixation in formaldehyde, the tissue or cell may be submerged in alcohol for long term storage. In some cases, the alcohol is methanol, ethanol, propanol, butanol, an alcohol containing five or more carbon atoms, or any combination thereof. The alcohol may be linear or branched. The alcohol may be at least 50%, 60%, 70%, 80% or 90% alcohol in aqueous solution. In some examples, the alcohol is 70% ethanol in aqueous solution.

Cell samples can be obtained from biological fluids. A biological fluid can be obtained from any suitable source. Exemplary biological fluid sources from which cell samples can be obtained include amniotic fluid, bile, blood, cerebral spinal fluid, lymph fluid, pericardial fluid, peritoneal fluid, pleural fluid, saliva, seminal fluid, sputum, sweat, tears, and urine. Biological fluids can be obtained by invasive, minimally invasive, or non-invasive procedures. A biological fluid comprising blood can be obtained, for example, by venipuncture, pinprick, or aspiration.

The plurality of different cell samples analyzed by methods provided herein may be a plurality of samples from a single subject. The plurality of different cell samples may be obtained from the single subject at different time points over the course of a pre-defined or un-defined length of time. For example, the plurality of cell samples may be obtained from a subject at multiple time points before and/or after the administration of a therapeutic treatment. The plurality of cell samples can be analyzed to assess and/or monitor the subject's response to the therapeutic treatment. In some embodiments, the plurality of different cell samples are cell samples obtained from different sources from the single subject. For example, the subject may be diagnosed with cancer and cell samples from a plurality of tissue sources are examined to determine the extent of cancer metastasis. The plurality of different cell samples may be obtained from different regions of a tissue sample. For example, a subject may undergo surgical treatment to excise a tumorous region. A plurality of different cell samples from different regions of a tissue sample can be assessed to identify the boundary between normal and abnormal tissue. The plurality of different cell samples may comprise cancerous and non-cancerous cell samples.

The plurality of different cell samples analyzed by methods provided herein may be a plurality of samples from a plurality of subjects. Alternatively or in addition, the plurality of different cell samples may comprise a plurality of different cell samples from the same subject. For example, different cell samples may be taken from the same subject at different times (e.g., at different time points during a treatment regimen). In another example, different cell samples may be taken from different areas or features of the same subject. For instance, a first cell sample may be a blood sample, and a second cell sample may be a tissue sample. For parallel processing, a plurality of samples (e.g., from a plurality of subjects) can be combined for simultaneous processing. In some cases, at least two different cell samples from at least two different subjects (e.g., at least 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, or 25 samples) are combined and processed in parallel.

Spatial Mapping

In an aspect, the present disclosure provides methods and compositions for spatial mapping. A plurality of nucleic acid barcode molecules can be arranged according to a spatial relationship. The method of spatially mapping a plurality of cells in a sample may comprise spotting or otherwise distributing a plurality of nucleic acid barcode molecules comprising a labelling barcode sequence onto a cell sample comprising cells and/or cell beads (e.g., a three-dimensional tissue sample or a tissue section on a substrate) to yield a plurality of labeled cells in said cell sample. The plurality of nucleic acid barcode molecules may be modified to penetrate the cell membrane of cells and/or cell beads in said cell sample. The nucleic acid barcode molecules may be modified with a lipophilic moiety. In some instances, the cell sample is spotted with the plurality of nucleic acid barcode molecules according to a pre-defined spatial configuration or pattern. For example, nine sets of nucleic acid barcode molecules (e.g., 9 sets of nucleic acid barcode molecules having 9 unique sample barcode sequences) can be arranged in a 3×3 square grid. All sample barcodes located in a particular square of the grid (e.g., #1) can have the same sample barcode sequence (e.g., sample barcode sequence #1). The sample barcode sequence in a given square may be different from all other sample barcode sequences in other squares. The sample barcodes and corresponding sample barcode sequences of the various sets can have a pre-defined spatial relationship. For example, with reference to FIG. 7A, a sample barcode sequence #1 can be positioned in proximity to sample barcode sequence #2 and #4; sample barcode sequence #2 can be positioned in proximity to sample barcode sequence #1, #3 and #5; sample barcode sequence #3 can be positioned in proximity to sample barcode sequence #2 and #6; sample barcode sequence #4 can be positioned in proximity to sample barcode sequence #1, #5 and #7; sample barcode sequence #5 can be positioned in proximity to sample barcode sequence #2, #4, #6, and #8; sample barcode sequence #6 can be positioned in proximity to sample barcode sequence #3, #5 and #9; sample barcode sequence #7 can be positioned in proximity to sample barcode sequence #4 and #8; sample barcode sequence #8 can be positioned in proximity to sample barcode sequence #5, #7 and #9; and sample barcode sequence #9 can be positioned in proximity to sample barcode sequence #6 and #8. Other spatial arrangements and relationships are contemplated herein. A plurality of nucleic acid barcode molecules can be arranged in any suitable configuration, for example deposited onto a planar or non-planar two-dimensional surface.
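As a non-limiting illustration, the proximity relationships recited above for the 3×3 grid can be derived programmatically. The following Python sketch assumes, for illustration only, that the squares are numbered #1 to #9 row by row from left to right and that "proximity" means sharing an edge; the function name is hypothetical.

```python
def grid_neighbors(rows=3, cols=3):
    """Return {square number: sorted list of edge-adjacent square numbers}.

    Squares are numbered from 1, row by row, left to right (an assumption
    made for this sketch).
    """
    neighbors = {}
    for r in range(rows):
        for c in range(cols):
            label = r * cols + c + 1
            adjacent = []
            # Check the squares above, left, right, and below.
            for dr, dc in ((-1, 0), (0, -1), (0, 1), (1, 0)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    adjacent.append(nr * cols + nc + 1)
            neighbors[label] = sorted(adjacent)
    return neighbors


neighbors = grid_neighbors()
```

Under these assumptions the computed neighbors of square #5 are #2, #4, #6, and #8, matching the relationships recited for sample barcode sequence #5. Other numbering schemes or definitions of proximity (e.g., including diagonal squares) would give different relationships.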

In some instances, the modified nucleic acid barcode molecule is coupled to a lipophilic molecule which enables the delivery of the nucleic acid molecule across the cell membrane or the nuclear membrane. Non-limiting examples of lipophilic molecules that can be used in embodiments described herein include sterol lipids such as cholesterol, tocopherol, and derivatives thereof. In other instances, the modified nucleic acid barcode molecule is coupled to a cell-penetrating peptide which can enable the molecule to penetrate the cell in the sample. In other cases, the modified nucleic acid barcode molecules are delivered into the cells and/or cell beads using liposomes, nanoparticles, or electroporation. In some cases, the modified nucleic acid barcode molecule may be delivered into the cells and/or cell beads by mechanical force (e.g. nanowires, or microinjection). In some examples, the unique sample barcode sequences are generated using antibodies, which may bind to proteins coupled to cells and/or cell beads in each of the regions in which the sample is located. The antibodies or sequences derived from the antibodies may then be used to identify the regions within which the sample is located. In yet another embodiment, the modified nucleic acid barcode molecule is coupled to a fluorophore or dye, as further described herein. In one other embodiment, the modified nucleic acid barcode molecule is coupled to an inorganic nanoparticle, as further described herein.

In some instances, nucleic acid barcode molecules are spotted or otherwise distributed onto a cell sample comprising cells and/or cell beads present in the cell sample in at least two dimensions. Nucleic acid barcode molecules may be spotted onto the cell sample in known locations or in a regular pattern, e.g., in a grid pattern as described above and as shown in FIG. 7A. In some cases, nucleic acid barcode molecules spotted into a known location are distributed radially from the spotting location. The spotting or distribution pattern of nucleic acid barcode molecules may be such that some cells and/or cell beads will comprise two or more different nucleic acid barcode molecules, each comprising a unique barcode sequence. For example, nucleic acid barcode molecules (e.g., nucleic acid barcode molecules conjugated to a lipophilic moiety) are spotted onto a cell sample in a 3×3 grid pattern (see, e.g., FIG. 7A) such that a different set of nucleic acid barcode molecules is deposited onto each “square” of the grid (i.e., each “square” of the grid has a unique barcode sequence). In some cases, the nucleic acid barcode molecules diffuse out (e.g., radially) from the spotting or distribution point, creating a concentration gradient of nucleic acid barcode molecules such that cells and/or cell beads closer to the spotting position will have relatively more nucleic acid barcode molecules compared to cells further from the spotting point. Furthermore, in some instances, a labeled cell and/or cell bead will comprise nucleic acid barcode molecules comprising 2 or more different nucleic acid barcode sequences. A cell and/or cell bead can then be analyzed for particular barcode sequences to infer the spatial relationship of cells (or the relative spatial relationship of a cell to another cell) within the cell sample. For example, cells and/or cell beads present in grid #1 of FIG. 7A are labelled by a set of nucleic acid barcode molecules, each comprising a common barcode sequence (e.g., barcode sequence #1), while cells and/or cell beads present in grid #2 are labelled by a different set of nucleic acid barcode molecules, each comprising a common barcode sequence (e.g., barcode sequence #2). The labelling procedure is repeated for each area of the grid or pattern such that a different set of nucleic acid barcode molecules is distributed across the relevant portions of the cell sample. Dependent upon their position in the cell sample, cells and/or cell beads can be labelled with one or more unique barcode sequences (e.g., a cell can be labelled with both barcode sequence #1 and barcode sequence #2, etc.). Individual cells and/or cell beads are then dissociated from the cell sample and analyzed for the presence of nucleic acid barcode molecules comprising one or more barcode sequences. In some instances, cells and/or cell beads are analyzed for both the presence of specific barcode sequences and also the amount of each nucleic acid barcode molecule associated with each cell and/or cell bead (e.g., using a UMI). Thus, in some instances, the known spotting pattern of the nucleic acid barcode molecules, the presence of particular barcode sequences, and the amount of each nucleic acid barcode molecule are utilized to determine the spatial position of a cell and/or cell bead in the cell sample or the relative spatial position of a cell and/or cell bead to another cell and/or cell bead in the cell sample.
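One way the known spotting pattern and the per-cell barcode amounts could be combined is sketched below in Python. The sketch is an illustrative heuristic, not a method recited in this disclosure: it assumes a 3×3 grid of unit squares numbered row by row, assumes that UMI counts per barcode sequence are already available for a cell, and estimates the cell position as the count-weighted average of the centers of the squares whose barcodes were detected. A real concentration gradient need not be linear, so the estimate is only a rough relative position.

```python
def square_center(label, cols=3):
    """Center (x, y) of a unit grid square numbered from 1, row by row."""
    row, col = divmod(label - 1, cols)
    return (col + 0.5, row + 0.5)


def estimate_position(umi_counts, cols=3):
    """Count-weighted centroid of the squares whose barcodes label a cell.

    `umi_counts` maps a barcode/square number to the UMI count observed
    for that barcode in one cell (hypothetical input format).
    Returns None when no barcode molecules were detected.
    """
    total = sum(umi_counts.values())
    if total == 0:
        return None
    x = sum(square_center(b, cols)[0] * n for b, n in umi_counts.items()) / total
    y = sum(square_center(b, cols)[1] * n for b, n in umi_counts.items()) / total
    return (x, y)


# A cell carrying mostly barcode #1 and some barcode #2 (invented counts).
position = estimate_position({1: 30, 2: 10})
```

For the invented counts above, the estimate falls inside square #1 but is displaced toward square #2, consistent with a cell located nearer the spotting point of barcode sequence #1.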

A sample 700 having at least two dimensions, for example a tissue sample or a cross-section of a tissue, may be labeled with a plurality of nucleic acid barcode molecules, for example, as shown in FIG. 7B. In some cases, cells and/or cell beads present in different locations of a tissue sample or a cross-section of a tissue can be labeled with different sample barcode sequences (e.g., a moiety-conjugated barcode molecule, also referred to herein as a feature barcode). Nucleic acid analysis, for example sequencing analysis, can utilize the sample barcode sequences and spatial relationship of the barcode sequences to analyze various differences among subpopulations of cells and/or cell beads in the sample.

In some examples, a method for spatially mapping a plurality of cells and/or cell beads comprises labeling cells and/or cell beads of different cell samples using nucleic acid barcode molecules to yield a plurality of labeled cell samples. An individual nucleic acid barcode molecule may comprise a sample barcode sequence, and nucleic acid barcode molecules of a given labeled cell sample can be distinguished from nucleic acid barcode molecules of another labeled cell sample by the sample barcode sequence. The nucleic acid barcode molecules may be arranged in at least a pre-defined two-dimensional configuration.

Next, nucleic acid molecules of the plurality of labeled cell samples may be subjected to one or more reactions to yield a plurality of barcoded nucleic acid products. Individual nucleic acid barcode products can comprise (i) a sample barcode sequence and (ii) a sequence corresponding to a nucleic acid molecule.

Next, the plurality of nucleic acid barcode products (or derivatives thereof) may be sequenced to yield sequencing reads. Spatial relationships may then be inferred between individual cell samples based on the sample barcode sequence and the pre-defined two-dimensional arrangement of nucleic acid barcode molecules, thereby spatially mapping a plurality of cell samples to at least a two dimensional configuration.

For example, a cell sample having at least two dimensions (e.g., a tissue section on a slide or a three-dimensional tissue sample from a subject, such as a fixed tissue sample) may be spotted with labelling nucleic acid barcode molecules comprising a labeling barcode sequence in a predefined pattern as described above. Cells are then dissociated from the cell sample and partitioned into a plurality of partitions, each partition comprising (1) a single cell from the cell sample, the single cell comprising at least one labelling nucleic acid barcode molecule comprising a labeling barcode sequence; and (2) a plurality of sample nucleic acid barcode molecules comprising a sample barcode sequence, wherein each partition comprises sample nucleic acid barcode molecules comprising a different sample barcode sequence. The plurality of sample nucleic acid barcode molecules further may comprise a unique molecular identifier (UMI) sequence. The plurality of sample nucleic acid barcode molecules may be attached to a bead (e.g., a gel bead) and each partition comprises a single bead. In some cases, the labelling nucleic acid barcode molecules comprise one or more functional sequences, such as a primer sequence or a UMI sequence. In some instances, cells are lysed to release the labelling nucleic acid barcode molecule or other analytes present in or associated with the cells. In each partition, the labelling nucleic acid barcode molecules associated with each cell are barcoded by the sample nucleic acid barcode molecule to generate a nucleic acid molecule comprising the labeling barcode sequence and the sample barcode sequence. In addition to the barcoding of the labelling nucleic acid barcode molecules, another analyte such as RNA or DNA molecules may also be barcoded with a sample barcode sequence. Nucleic acid molecules barcoded with a sample barcode sequence can then be processed as necessary to generate a library suitable for sequencing as described elsewhere herein.
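By way of non-limiting illustration, the barcoded molecules generated in the partitions can be tallied per cell once sequenced. The Python sketch below assumes that each sequenced molecule has been parsed into a (per-partition sample barcode, labeling barcode, UMI) record; this record format, the function name, and the barcode strings are hypothetical. Reads sharing the same three values are counted once, so that amplification duplicates do not inflate the count.

```python
from collections import defaultdict


def count_label_umis(records):
    """Count distinct UMIs per labeling barcode within each partition.

    `records` is an iterable of (partition_barcode, labeling_barcode, umi)
    tuples. Returns {partition_barcode: {labeling_barcode: n_unique_umis}}.
    """
    umis = defaultdict(set)
    for partition_bc, label_bc, umi in records:
        umis[(partition_bc, label_bc)].add(umi)
    counts = defaultdict(dict)
    for (partition_bc, label_bc), umi_set in umis.items():
        counts[partition_bc][label_bc] = len(umi_set)
    return dict(counts)


# Toy records; all barcode and UMI strings are invented for illustration.
example_records = [
    ("CELL_1", "LABEL_1", "AAAC"),
    ("CELL_1", "LABEL_1", "AAAC"),  # duplicate read of the same molecule
    ("CELL_1", "LABEL_1", "GGTA"),
    ("CELL_1", "LABEL_2", "CCAT"),
    ("CELL_2", "LABEL_2", "TTGA"),
]
label_counts = count_label_umis(example_records)
```

The resulting per-cell counts of each labeling barcode sequence are the kind of quantity that can be compared against the pre-defined spotting pattern.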

Three-Dimensional Spatial Mapping

Barcoded molecules (e.g., oligonucleotide-lipophilic moiety conjugates) may be used to target or label cells in suspension. In one aspect, cells within an intact tissue sample (e.g., a solid tissue sample) are contacted with these barcode molecules for spatial analysis. The present invention concerns methods and devices or instruments for injecting barcode molecules in situ into a tissue sample and subsequently identifying positions that correspond to uptake of the barcode molecules by cells within the tissue sample. In one aspect, oligonucleotide-lipophilic moiety conjugates (e.g., oligonucleotide-cholesterol conjugates) are used to label cells in a tissue sample. In one embodiment, the conjugates are injected into a tissue sample with a very fine needle (or array of needles). Each barcode molecule would be delivered to a defined position, e.g., in two dimensions (2D in one plane) or in three dimensions (3D in several planes). After injection of the conjugate, the barcode molecules insert into the plasma membrane of cells (e.g., via the lipophilic moiety) and diffuse within the tissue. At the point of injection, the concentration of the barcode would be the highest, and as it diffuses in the tissue its concentration would decrease. Considering this diffusion, the uptake of the barcode by a cell would define the location of the cell relative to the point of injection. With an array of needles (e.g., FIG. 16), it would be possible to reconstruct cell position as cells take up different barcodes at different concentrations, thereby indicating the relative position of cells to each other. The barcoded molecules may also be applied to cells within a tissue sample using microarray nucleic acid printing methods known to those of ordinary skill in the art.

FIG. 16 depicts an example of a tissue section with barcode staining using one fixed array of needles (one 2-dimensional plane). The x, y, and z positions may be determined depending on diffusion of the barcode. By way of example, a cell diameter of 10 μm means the diffusion of barcodes will be on a scale of about 10-15 cells or about 100 μm-150 μm. A very fine needle can be used to infuse barcodes with or without pressure, where the infusion can be in a skewer-like pattern with needles separated by x μm in all directions (defined by the desired diffusion of the barcode). Each needle can infuse a different barcode.

FIG. 17 depicts a diffusion map to spatially localize barcodes and associated cells (one plane in 2D view). FIG. 18 shows the position of cells (designated “C1” to “C7”) defined by the barcode and its relative amount (higher amount at the point of infusion, lower as cells are farther from the point of infusion). The amounts of the different barcodes in each cell define its position in the tissue spatially. The following table illustrates this for cells C1 to C7 in a hypothetical scenario.

TABLE 1

Distribution of barcodes throughout cells.

Cell #    BC level: solid line    BC level: dashed line    BC level: dotted line
C1        ++                      −                        −
C2        +++                     +                        −
C3        ++                      ++                       −
C4        +                       +++                      +
C5        −                       ++                       ++
C6        −                       +                        +++
C7        −                       −                        ++
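As a non-limiting illustration of how the relative barcode levels in Table 1 can indicate relative cell position, the following Python sketch converts each level to a number (the count of “+” symbols, with “−” treated as zero) and computes a weighted average of assumed infusion positions. The assumptions are illustrative only: the three barcodes (solid, dashed, and dotted lines) are taken to be infused at evenly spaced positions 0, 1, and 2 along a line, and the weighting is a simple heuristic rather than a model of diffusion.

```python
# Levels from Table 1 as (solid, dashed, dotted); "-" stands for no signal.
TABLE_1 = {
    "C1": ("++", "-", "-"),
    "C2": ("+++", "+", "-"),
    "C3": ("++", "++", "-"),
    "C4": ("+", "+++", "+"),
    "C5": ("-", "++", "++"),
    "C6": ("-", "+", "+++"),
    "C7": ("-", "-", "++"),
}


def level(symbol):
    """Convert a '+'-style level to a number: the count of '+' characters."""
    return symbol.count("+")


def relative_position(levels, needle_positions=(0.0, 1.0, 2.0)):
    """Weighted average of assumed needle positions, weighted by barcode level."""
    weights = [level(s) for s in levels]
    total = sum(weights)
    if total == 0:
        return None  # no barcode detected, position cannot be estimated
    return sum(w * p for w, p in zip(weights, needle_positions)) / total


positions = {cell: relative_position(lv) for cell, lv in TABLE_1.items()}
ordered_cells = sorted(positions, key=positions.get)
```

Under these assumptions the cells are ordered C1 through C7 along the line of infusion points, with C4 estimated at the position of the dashed-line barcode.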

FIG. 19 depicts a three dimensional application. A fused needle at 3 levels is used to deliver 3 different barcodes. FIG. 20 depicts a three dimensional application to maximize 3D space with barcode staining.

In one embodiment, the present disclosure provides methods and compositions for spatial mapping where different barcode molecules are contacted with different regions of a 3D biological sample (e.g., a solid tissue sample). In one other embodiment, the biological sample comprises different regions of interest that may be contacted with barcode molecules. For instance, FIG. 21A depicts regions of a mouse brain (P0-P8) with delivery devices (e.g., needles including fused or multipoint needles) for delivering barcode molecules (e.g., oligonucleotide-lipophilic moiety conjugates). The tissue sample (e.g., mouse brain or other solid tissue sample) is washed with a suitable medium such as Hibernate Medium or HEB medium (Thermo Fisher Scientific) and removed from the medium, and any excess medium is allowed to drain before application of the barcode molecules. Multiple syringes (e.g., 2-3 μL volume, each mounted with a 30 to 31 gauge needle) are loaded with oligonucleotide-lipophilic moiety conjugates at a suitable concentration (e.g., about 0.1 μM) for injection into the tissue sample at a depth of about 1 mm. At a fixed injection volume, the concentration of the conjugate can be adjusted depending on the resulting labeling of cells and the diffusion speed within the tissue. As depicted in FIG. 21B, a first conjugate is injected at position A, a second conjugate at position B, a third conjugate at position C, and a fourth conjugate at position D according to a pattern. In one embodiment, position B is a first distance away from position A, position C is a second distance away from positions A and B, and position D is a third distance away from positions A and B. In other embodiments, the first distance is less than the second distance and/or greater than the third distance (e.g., Pattern 1 in FIG. 21B).

In another embodiment, positions A-D are injected in a linear pattern, wherein each position is the same distance from the other in sequence. For example, position A is a first distance away from position B and a second distance away from position C, wherein the first distance is half of the second distance (e.g., Pattern 2 in FIG. 21B). Those of ordinary skill in the art will appreciate that different conjugates can be injected into a tissue sample according to the patterns shown in FIG. 21B or any other suitable pattern.

Following injection, the tissue sample is incubated at room temperature or any other suitable temperature to allow the conjugates to diffuse into the tissue at their respective points of injection. After incubation, the tissue sample is placed in a 15 mL conical tube and washed again in HEB medium (e.g., washed twice). Following removal of the medium, the tissue sample is dissociated according to a suitable sample preparation protocol for single cell sequencing (e.g., 10× Genomics Sample Preparation Demonstrated Protocol—Dissociation of Mouse Embryonic Neural Tissue for Single Cell RNA Sequencing CG00055). Following dissociation, the suspension of cells from the tissue sample is processed to generate a sequencing library. As described herein, single cells (with the oligonucleotide-lipophilic moiety (e.g., cholesterol) conjugates inserted into their cell membranes) from the suspension of cells are provided in individual partitions with reagents for one or more additional barcoding reactions that involve analytes from the same single cells. Analytes from the suspension of cells are processed to provide nucleic acid libraries for sequencing (see, e.g., U.S. Pat. Nos. 10,011,872, 9,951,386, 10,030,267, and 10,041,116, which are incorporated herein by reference in their entireties). In one embodiment, barcode sequences of the plurality of oligonucleotide-lipophilic moiety conjugates are identified via sequencing along with barcode sequences associated with the analyte(s) processed from the single cells in suspension. In one embodiment, one or more barcode sequences from the plurality of oligonucleotide-lipophilic moiety conjugates are associated with one or more spatial positions corresponding to one or more cells within the tissue sample (see FIGS. 21A-21B). 
In another embodiment, the spatial position corresponds to one or more cells where a particular oligonucleotide-lipophilic moiety conjugate diffused into the tissue sample (as determined by the pattern by which the oligonucleotide-lipophilic moiety conjugates were delivered to the tissue). In other embodiments, the one or more spatial positions are then associated with the analyte(s) detected and identified in the cell or cells into which the oligonucleotide-lipophilic moiety conjugate diffused. In one additional embodiment, a method of spatial analysis (e.g., three dimensional spatial analysis) using oligonucleotide-lipophilic moiety conjugates is provided. In one embodiment, the method comprises contacting a tissue sample (e.g., a solid tissue sample) with a plurality of oligonucleotide-lipophilic moiety conjugates at a plurality of locations within the sample. In another embodiment, the plurality of oligonucleotide-lipophilic moiety conjugates comprises a first, second, third, fourth, fifth, sixth, etc. types of oligonucleotide-lipophilic moiety conjugates. The type of oligonucleotide-lipophilic moiety conjugate may differ as to the sequence of the barcode and/or the type of lipophilic moiety. In one other embodiment, the method comprises allowing the plurality of oligonucleotide-lipophilic moiety conjugates to diffuse into the tissue sample, such that the plurality of oligonucleotide-lipophilic moiety conjugates insert into cell membranes of the cells within the tissue sample. In additional embodiments, the method comprises providing a suspension of cells (e.g., single cells) that are derived from the tissue sample (containing the diffused oligonucleotide-lipophilic moiety conjugates), such that the suspension comprises one or more cells that retain one or more oligonucleotide-lipophilic moiety conjugates of the plurality of oligonucleotide-lipophilic moiety conjugates. 
In one more embodiment, the method comprises providing a nucleic acid library for sequencing from the suspension of cells. In one embodiment, the nucleic acid library comprises nucleic acid barcode molecules corresponding to an oligonucleotide-lipophilic moiety conjugate and an analyte (as described herein), including without limitation, a nucleic acid analyte, a metabolite analyte, and a protein analyte.

In one aspect, the present invention provides methods of processing a tissue sample for spatial analysis. In one embodiment, the method comprises the step of delivering a plurality of spatial oligonucleotides to a location in a tissue sample, wherein a spatial oligonucleotide of the plurality of spatial oligonucleotides comprises (i) a spatial barcode sequence and (ii) a cell membrane labeling (or targeting) agent to label a cell at the location in the tissue sample. In one embodiment, the cell membrane labeling agent interacts with or associates with the cell membrane as further described herein (e.g., lipophilic molecules, fluorophores, dyes, etc.). In another embodiment, the spatial oligonucleotide further comprises a cleavable linker (such as a linker described herein) to allow separation of the spatial barcode sequence from the cell membrane labeling agent. In another embodiment, the plurality of spatial oligonucleotides may be delivered to the tissue sample in a pattern as described herein. In another embodiment, the method further comprises the step of dissociating the tissue sample into a plurality of cells, wherein a cell of the plurality of cells is a single cell that comprises the spatial oligonucleotide and an analyte of interest. In another embodiment, the single cell comprises the spatial oligonucleotide via the cell membrane labeling agent. In another embodiment, the method further comprises the step of partitioning the single cell with a (i) plurality of cell barcode nucleic acid molecules each comprising a cell barcode sequence and configured to couple to the analyte and (ii) a plurality of spatial barcode nucleic acid molecules configured to couple to the spatial oligonucleotide. 
In another embodiment, the method further comprises the step of, in the partition, lysing the single cell and using the spatial oligonucleotide and the analyte of interest to generate (i) a first barcoded nucleic acid molecule comprising the spatial barcode sequence or a complement thereof, and (ii) a second barcoded nucleic acid molecule comprising the cell barcode sequence or a complement thereof. In other embodiments, the method further comprises the step of sequencing (i) the first barcoded nucleic acid molecule to determine the spatial barcode sequence, and (ii) the second barcoded nucleic acid molecule to determine the cell barcode sequence. In further embodiments, the method also comprises the step of using (i) the determined spatial barcode sequence to identify the location in the tissue sample at which the single cell was labeled and/or from which the single cell originated, and (ii) the determined cell barcode sequence to identify the analyte as originating from the single cell. In another embodiment, the cell membrane labeling agent is selected from the group consisting of a lipid (e.g., a lipophilic moiety), a fluorophore, a dye, a peptide, and a nanoparticle. In another embodiment, the analyte is a nucleic acid molecule or a protein labeling agent capable of specifically binding to a surface protein on the cell. In another embodiment, each cell barcode nucleic acid molecule further comprises a cleavable linker (such as a linker described herein) to allow separation of the cell barcode sequence from the protein labeling agent. In other embodiments, the method is suitable for processing tissue samples for two-dimensional (e.g., tissue section or sample on a slide) and three-dimensional (e.g., biopsy from a subject) spatial analysis.

Doublet Reduction and Detection

The present disclosure also provides methods and compositions for doublet reduction. In an aspect, a method of analyzing polynucleotides may comprise labeling cells and/or cell beads of different cell samples (e.g., cell samples from different subjects, such as different humans or animals; cell samples from the same subject taken at different times; and/or cell samples from the same subject taken from different areas or features of a subject, such as from different tissues) using nucleic acid barcode molecules or oligonucleotides comprising the nucleic acid barcode molecules to yield a plurality of labeled cell samples, wherein an individual nucleic acid barcode molecule comprises a sample barcode sequence (e.g., a moiety-conjugated barcode molecule, also referred to herein as a feature barcode), and wherein nucleic acid barcode molecules of a given labeled cell sample are distinguishable from nucleic acid barcode molecules of another labeled cell sample by the sample barcode sequence. Different cells and/or cell beads from the same cell sample may have the same sample barcode sequence. Labeled cells and/or cell beads of the plurality of cell samples may be co-partitioned into a plurality of partitions. The labeled cells and/or cell beads may be co-partitioned with a plurality of beads, such as a plurality of gel beads. Beads of the plurality of beads may comprise a plurality of bead nucleic acid barcode molecules attached (e.g., releasably coupled) thereto, wherein an individual bead nucleic acid barcode molecule attached to a bead comprises a bead barcode sequence. Bead nucleic acid barcode molecules of a given bead may be distinguishable from bead nucleic acid barcode molecules of another bead by their bead barcode sequence(s).
Nucleic acid molecules of the at least one labeled cell and/or cell bead of a given partition may be subjected to one or more reactions to yield nucleic acid barcode products comprising (i) a sample barcode sequence, (ii) a bead barcode sequence, and (iii) a sequence corresponding to a nucleic acid molecule of the nucleic acid molecules of the at least one labeled cell and/or cell bead. Nucleic acid barcode products may be subjected to sequencing to yield a plurality of sequencing reads. In some cases, contents of a plurality of partitions may be pooled to provide a plurality of nucleic acid barcode products corresponding to the plurality of partitions. Sequencing reads may be processed to identify bead and sample barcode sequences, which sequences may be used to identify the cell and/or cell bead to which a sequencing read corresponds. For example, sequencing reads corresponding to two different cells and/or cell beads from different cell samples that are co-partitioned in the same partition may be identified as having identical bead barcode sequences and different sample barcode sequences. Sequencing reads corresponding to two different cells and/or cell beads from the same cell sample partitioned in different partitions may be identified as having different bead barcode sequences and identical sample barcode sequences.

As described elsewhere herein, a sample barcode sequence which is used to label individual cells and/or cell beads of a cell sample can later be used as a mechanism to associate a cell and/or cell bead and a given cell sample. For example, a plurality of cell samples can be uniquely labeled with nucleic acid barcode molecules such that the cells and/or cell beads of a particular sample can be identified as originating from the particular sample, even if the particular cell sample were mixed with additional cell samples and subjected to nucleic acid processing in bulk.

Individual nucleic acid barcode molecules may form a part of a barcoded oligonucleotide. A barcoded oligonucleotide, as described elsewhere herein, can comprise sequence elements in addition to a sample barcode sequence that may serve a variety of purposes, for example in sample preparation for sequencing analysis, e.g., next-generation sequence analysis.

Cells and/or cell beads can be labeled with nucleic acid barcode molecules by any of a variety of suitable mechanisms described elsewhere herein. A nucleic acid barcode molecule or a barcoded oligonucleotide comprising the nucleic acid barcode molecule may be linked to a moiety (“barcoded moiety”) such as an antibody or an epitope binding fragment thereof, a cell surface receptor binding molecule, a receptor ligand, a small molecule, a pro-body, an aptamer, a monobody, an affimer, a darpin, or a protein scaffold. The moiety to which a nucleic acid barcode molecule or barcoded oligonucleotide can be linked may bind a molecule expressed on the surface of individual cells of the plurality of cell samples. A labeled cell sample may refer to a sample in which the cells and/or cell beads are bound to barcoded moieties. A labeled cell sample may refer to a sample in which the cells have nucleic acid barcode molecules within the cells and/or cell beads.

A molecule (e.g., a molecule expressed on the surface of individual cells of the plurality of cell samples) may be common to all cells and/or cell beads of the plurality of the different cell samples. The molecule may be a protein. Exemplary proteins in embodiments herein include, but are not limited to, transmembrane receptors, major histocompatibility complex proteins, cell-surface proteins, glycoproteins, glycolipids, protein channels, and protein pumps. A non-limiting example of a cell-surface protein can be a cell adhesion molecule. The molecule may be expressed at similar levels for all cells and/or cell beads of the sample. The expression of the molecule for all cells and/or cell beads of a sample may be within biological variability. The molecule may be differentially expressed in cells and/or cell beads of the cell sample. The expression of the molecule for all cells and/or cell beads of a sample may not be within biological variability, and some of the cells and/or cell beads of a cell sample may be and/or comprise abnormal cells. A moiety linked to a nucleic acid barcode molecule or barcoded oligonucleotide may bind a molecule that is present on a majority of the cells and/or cell beads of a cell sample. The molecule may be present on at least 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% of the cells and/or cell beads in a cell sample.

Cells and/or cell beads can be labeled in (a) by any suitable mechanism, including those described elsewhere herein. The nucleic acid barcode molecule or barcoded oligonucleotide comprising the nucleic acid barcode molecule may be linked to an antibody or an epitope binding fragment thereof, and labeling cells and/or cell beads may comprise subjecting the antibody-linked nucleic acid barcode molecule or the epitope binding fragment-linked nucleic acid barcode molecule to conditions suitable for binding the antibody or the epitope binding fragment thereof to a molecule present on a cell surface. The nucleic acid barcode molecule or barcoded oligonucleotide comprising the nucleic acid barcode molecule may be coupled to a cell-penetrating peptide (CPP), and labeling cells and/or cell beads may comprise delivering the CPP coupled nucleic acid barcode molecule into a cell and/or cell bead by the CPP. The nucleic acid barcode molecule or barcoded oligonucleotide comprising the nucleic acid barcode molecule may be conjugated to a cell-penetrating peptide (CPP), and labeling cells and/or cell beads may comprise delivering the CPP conjugated nucleic acid barcode molecule into a cell and/or cell bead by the CPP. The nucleic acid barcode molecule or barcoded oligonucleotide comprising the nucleic acid barcode molecule may be coupled to a lipophilic molecule, and labeling cells and/or cell beads may comprise delivering the nucleic acid barcode molecule to a cell membrane by the lipophilic molecule. The nucleic acid barcode molecule or barcoded oligonucleotide comprising the nucleic acid barcode molecule may enter into the intracellular space. The nucleic acid barcode molecule or barcoded oligonucleotide comprising the nucleic acid barcode molecule may be coupled to a lipophilic molecule, and labeling cells may comprise delivering the nucleic acid barcode molecule to a nuclear membrane by the lipophilic molecule. 
The nucleic acid barcode molecule or barcoded oligonucleotide comprising the nucleic acid barcode molecule may enter into a cell nucleus. Labeling cells and/or cell beads may comprise use of a physical force or chemical compound to deliver the nucleic acid barcode molecule or barcoded oligonucleotide into the cell and/or cell bead. Examples of physical methods that can be used in the methods provided herein include the use of a needle, ballistic DNA, electroporation, sonoporation, photoporation, magnetofection, and hydroporation. Various chemical compounds can be used in the methods provided herein to deliver nucleic acid barcode molecules to a cell. Chemical vectors, as previously described herein, can include inorganic particles, lipid-based vectors, polymer-based vectors and peptide-based vectors. In some cases, labeling cells and/or cell beads may comprise use of a cationic lipid, such as a liposome. A labeled cell sample may refer to a sample in which the cells and/or cell beads have nucleic acid barcode molecules within the cells and/or cell beads.

Following labeling of cells and/or cell beads, a majority of the cells and/or cell beads of a particular cell sample can be labeled with nucleic acid barcode molecules having a sample-specific barcode sequence. At least 50%, 60%, 70%, 75%, 80%, 85%, 90%, or 95% of cells of a cell sample may be labeled. In some cases, not all of the cells and/or cell beads of a given cell sample of a plurality of cell samples are labeled. Less than 100%, 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, or 50% of cells and/or cell beads of a cell sample may be labeled. In some cases, cells and/or cell beads of multiple different cell samples of the plurality of cell samples may not be labeled.

The plurality of labeled cell samples can be co-partitioned with a plurality of beads into a plurality of partitions. Individual beads can comprise a plurality of bead nucleic acid barcode molecules attached thereto. Bead nucleic acid barcode molecules of a given bead can be distinguishable from bead nucleic acid barcode molecules of another bead by a bead barcode sequence. The bead nucleic acid barcode molecule may be releasably attached to the bead. The bead may be degradable upon application of a stimulus. The stimulus may comprise a chemical stimulus.

By partitioning the labeled cell samples into a plurality of partitions, one or more reactions can be performed individually for single cells in isolated partitions. The partitions may comprise droplets. For example, a partition can be an aqueous droplet in a non-aqueous phase such as oil (e.g., a droplet in an emulsion). Alternatively, the partitions may comprise wells or tubes.

Individual partitions may comprise a single cell and/or cell bead. Alternatively or in addition, a subset of partitions may contain more than a single cell and/or cell bead.

Nucleic acids generated in partitions having more than a single cell and/or cell bead may undesirably assign the same bead barcode sequence to two different cells and/or cell beads. While the nucleic acids may share the same bead barcode sequence, the two different cells and/or cell beads can be distinguished by different sample barcode sequences if the two cells and/or cell beads originated from different cell samples. By using both a sample barcode sequence (e.g., a moiety-conjugated barcode molecule) and a bead (or partition) barcode sequence, sequencing reads from partitions comprising more than one labeled cell and/or cell bead can be identified.
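As a non-limiting illustration, the identification of partitions comprising cells from more than one cell sample may be sketched as follows. The sketch assumes that each sequencing read has already been reduced to a pair of a bead barcode sequence and a sample barcode sequence; the function name classify_partitions, the labels it returns, and the example barcode strings are hypothetical and are used only for illustration. The sketch also assumes error-free reads; in practice, a minimum read count per sample barcode may be needed to exclude background signal.

```python
from collections import defaultdict


def classify_partitions(reads):
    """Group reads by bead barcode and flag multi-sample partitions.

    reads: iterable of (bead_barcode, sample_barcode) tuples
           (an assumed, simplified representation of sequencing reads).
    Returns a dict mapping each bead barcode to a label.
    """
    samples_by_bead = defaultdict(set)
    for bead_barcode, sample_barcode in reads:
        samples_by_bead[bead_barcode].add(sample_barcode)

    calls = {}
    for bead_barcode, samples in samples_by_bead.items():
        # One bead barcode seen with several sample barcodes indicates
        # cells from different samples shared a partition.
        if len(samples) > 1:
            calls[bead_barcode] = "multiplet"
        else:
            calls[bead_barcode] = "singlet_or_same_sample"
    return calls


# Hypothetical example reads.
reads = [
    ("BEAD_A", "SAMPLE_1"),
    ("BEAD_A", "SAMPLE_2"),
    ("BEAD_B", "SAMPLE_1"),
    ("BEAD_B", "SAMPLE_1"),
]
calls = classify_partitions(reads)
```

In this sketch, a partition holding two cells from the same cell sample cannot be distinguished from a partition holding a single cell, which is why the second label is named accordingly.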

Cell Characterization

In an aspect, the methods provided herein may be useful in identifying and/or characterizing cells and/or cell beads. For example, the present disclosure provides a method of identifying a size of a cell and/or cell bead. By identifying the size of the cell, other properties, such as its type and/or tissue of origin may also be determined.

Cells of different sizes (e.g., diameters) will have different associated cell surfaces. For example, a first cell of a first size may have a different surface area and surface features than a second cell of a second size that is larger than the first size. As described herein, lipophilic or amphiphilic moieties (e.g., coupled to nucleic acid barcode molecules) may associate with and/or insert into membranes of cells and/or cell beads. At a non-saturating concentration of lipophilic or amphiphilic moieties (e.g., coupled to nucleic acid barcode molecules), uptake of the lipophilic or amphiphilic moieties by a cell or cell bead may be proportional to the surface of the cell or cell bead. Accordingly, a second cell or cell bead that is larger than a first cell or cell bead (e.g., has a larger diameter and, accordingly, a larger surface area, than the first cell or cell bead) may uptake more lipophilic or amphiphilic moieties than the first cell or cell bead (see, e.g., FIGS. 22 and 23).

Identifying or characterizing cells and/or cell beads may comprise measuring uptake of lipophilic or amphiphilic moieties (e.g., coupled to nucleic acid barcode molecules) by the cells and/or cell beads. A known amount of lipophilic and/or amphiphilic moieties (e.g., coupled to nucleic acid barcode molecules) may be provided to a cell or cell bead or a collection of cells or cell beads and the uptake of such moieties may be measured. Uptake of such moieties by cells may be measured by, for example, measuring a residual amount of such moieties that are not taken up by cells and subtracting this amount from the initial known amount. In another example, lipophilic and/or amphiphilic moieties may be labeled (e.g., with optically detectable labels such as fluorescent moieties) and the labels may be used to determine a relative uptake of the lipophilic and/or amphiphilic moieties by the cell/cell bead and/or cells/cell beads (e.g., using an optical detection method). In another example, the amount of lipophilic/amphiphilic moieties (e.g., coupled to nucleic acid barcode molecules) taken up by cells and/or cell beads may be determined by measuring the amount of nucleic acid barcode molecules associated with the cells and/or cell beads (e.g., using nucleic acid sequencing). Such a method may provide an alternative to other methods of determining cell size, such as flow cytometry.
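As a non-limiting illustration, the subtraction-based uptake measurement and the relationship between uptake and cell surface may be sketched as follows. The function names are hypothetical, and the second function rests on an assumption that is not required by the methods described herein: that cells are approximately spherical, so that surface area scales with the square of the diameter.

```python
def uptake_from_residual(initial_amount, residual_amount):
    """Uptake = known initial amount minus the residual amount not taken up."""
    if residual_amount > initial_amount:
        raise ValueError("residual amount cannot exceed the initial amount")
    return initial_amount - residual_amount


def expected_uptake_ratio(diameter_a, diameter_b):
    """Expected uptake of cell A relative to cell B.

    Assumes spherical cells and non-saturating labeling, so that uptake
    is proportional to surface area, i.e., to diameter squared.
    """
    return (diameter_a / diameter_b) ** 2


# Hypothetical values in arbitrary units.
taken_up = uptake_from_residual(100.0, 35.0)
ratio = expected_uptake_ratio(20.0, 10.0)
```

Under these assumptions, a cell having twice the diameter of another cell would be expected to take up about four times as many lipophilic or amphiphilic moieties.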

In an example, a plurality of cells may be labeled with lipophilic or amphiphilic feature barcodes (e.g., as described herein). Feature barcodes comprising a lipophilic moiety (e.g., a cholesterol moiety) may be incubated with the plurality of cells. The feature barcodes may comprise an optical label such as a fluorescent moiety. The feature barcodes may include, for example, a sequence configured to hybridize to a nucleic acid barcode molecule, such as a sequence comprising multiple cytosine nucleotides (e.g., a CCC sequence). Each feature barcode may also comprise a barcode sequence and/or a unique molecular identifier (UMI) sequence. Each lipophilic or amphiphilic moiety may be coupled to a different UMI sequence. For example, where about 1 million lipophilic or amphiphilic moieties will be used, about 1 million different UMI sequences may be used. Alternatively, each lipophilic or amphiphilic moiety may be coupled to a different combination of UMI and barcode sequences. For example, where about 1 million lipophilic or amphiphilic moieties will be used, about 1 million different combinations may be used. Cells may be partitioned into a plurality of partitions (e.g., a plurality of droplets, such as aqueous droplets in an emulsion) with a plurality of partition nucleic acid barcode molecules, where each nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules comprises a barcode sequence. Each partition may comprise at most one cell. The plurality of partition nucleic acid barcode molecules may be distributed throughout the partitions such that each partition includes nucleic acid barcode molecules having a different barcode sequence, where a given partition of the plurality of partitions may include multiple nucleic acid barcode molecules having the same barcode sequence. Nucleic acid barcode molecules may be coupled (e.g., releasably coupled) to beads (e.g., gel beads). 
In addition to barcode sequences, nucleic acid barcode molecules may further comprise unique molecular identifier sequences and/or sequences configured to hybridize to feature barcodes coupled to the lipophilic or amphiphilic moieties (e.g., GGG sequences). Within each partition comprising a cell, partition nucleic acid barcode molecules may couple to feature barcodes coupled to lipophilic or amphiphilic moieties, such that cells comprise a plurality of lipophilic or amphiphilic moieties coupled to i) feature barcodes and ii) partition nucleic acid barcode molecules. The barcode sequences of the partition nucleic acid barcode molecules are uniform across the plurality of lipophilic or amphiphilic moieties and identify the cell as corresponding to a given partition, while the diversity of barcode and/or UMI sequences of the feature barcodes is proportional to the uptake of lipophilic or amphiphilic moieties by the cell, and thus to the cell size. Accordingly, upon sequencing the feature barcodes coupled to the partition nucleic acid barcode molecules (e.g., subsequent to derivatization of the feature barcodes coupled to the partition nucleic acid barcode molecules with, e.g., flow cell adapters), a plurality of sequencing reads may be obtained that may be associated with the cells to which the feature barcodes and partition nucleic acid barcode molecules corresponded. The number of barcode and/or UMI sequences of the feature barcodes may be used to determine a relative size of the cells with which they are associated (e.g., a larger cell will have more barcode and/or UMI sequences associated therewith than a smaller cell) (see, e.g., FIG. 24).

In another example, a plurality of cells may be labeled with lipophilic or amphiphilic feature barcodes (e.g., as described herein). Feature barcodes comprising a lipophilic moiety (e.g., a cholesterol moiety) may be incubated with a plurality of cells. The feature barcodes may comprise an optical label such as a fluorescent moiety. The feature barcodes may include, for example, a sequence configured to hybridize to a nucleic acid barcode molecule, such as a sequence comprising multiple cytosine nucleotides (e.g., a CCC sequence). Each feature barcode may also comprise a barcode sequence and/or a unique molecular identifier sequence. A plurality of beads (e.g., gel beads) each comprising a plurality of nucleic acid barcode molecules may be provided. The nucleic acid barcode molecules of each bead (e.g., releasably attached to each bead) may comprise a barcode sequence (e.g., cell barcode sequence), a unique molecular identifier sequence, and a sequence configured to hybridize to a feature barcode. Nucleic acid barcode molecules of each different bead may comprise the same barcode sequence, which barcode sequence differs from barcode sequences of nucleic acid barcode molecules of other beads of the plurality of beads. The cells incubated with feature barcodes may be partitioned (e.g., subsequent to one or more washing processes) with the plurality of beads into a plurality of partitions (e.g., droplets, such as aqueous droplets in an emulsion) such that at least a subset of the plurality of partitions each comprise a single cell and a single bead. Within each partition of the at least a subset of the plurality of partitions, one or more nucleic acid barcode molecules of the bead may attach (e.g., hybridize or ligate) to one or more feature barcodes of the cell. 
The one or more nucleic acid barcode molecules of the bead may be released (e.g., via application of a stimulus, such as a chemical stimulus) from the bead within the partition prior to attachment of the one or more nucleic acid barcode molecules to the one or more feature barcodes of the cell to provide a barcoded feature barcode. The cell may be lysed or permeabilized within the partition to provide access to analytes therein, such as nucleic acid molecules therein (e.g., deoxyribonucleic acid (DNA) molecules and/or ribonucleic acid (RNA) molecules), and/or to the feature barcode therein (e.g., if the feature barcode has permeated the cell membrane). One or more analytes (e.g., nucleic acid molecules) of the cell may also be barcoded within the partition with one or more nucleic acid barcode molecules of the bead to provide a plurality of barcoded analytes (e.g., barcoded nucleic acid molecules). The plurality of partitions comprising barcoded analytes and barcoded feature barcodes may be combined (e.g., pooled). Additional processing may be performed to, for example, prepare the barcoded analytes and barcoded feature barcodes for subsequent analysis. For example, barcoded nucleic acid molecules and/or barcoded feature barcodes may be derivatized with flow cell adapters to facilitate nucleic acid sequencing. Barcodes of barcoded analytes and barcoded feature barcodes may be detected using nucleic acid sequencing and used to identify the barcoded analytes and barcoded feature barcodes as deriving from particular cells or cell types of the plurality of cells. The relative abundance of a given sequence (e.g., barcode or UMI sequence) measured in a sequencing assay may provide an estimate of the size of various cells of the plurality of cells. 
For example, a first barcode sequence associated with a first cell (e.g., via a feature barcode and/or a partition nucleic acid barcode sequence of a nucleic acid barcode molecule of a bead co-partitioned with the first cell) may appear in greater number than a second barcode sequence associated with a second cell, indicating that the first cell is larger than the second cell. Barcode sequences and UMIs associated with cellular debris (e.g., cellular components and/or damaged cells) may have few lipophilic or amphiphilic moieties associated therewith and may therefore contribute only minimally to distributions of barcode sequences vs. cell counts (see, e.g., FIG. 24).
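As a non-limiting illustration, the estimation of relative cell size from feature barcode UMI diversity may be sketched as follows. The sketch assumes that each sequencing read has been reduced to a pair of a partition (cell) barcode sequence and a feature barcode UMI sequence. The function name relative_cell_sizes, the min_umis cutoff used to set aside likely cellular debris, and the example identifiers are hypothetical.

```python
from collections import defaultdict


def relative_cell_sizes(reads, min_umis=1):
    """Estimate relative cell size from distinct feature-barcode UMIs.

    reads: iterable of (partition_barcode, feature_umi) tuples
           (an assumed, simplified representation of sequencing reads).
    min_umis: hypothetical cutoff; barcodes with fewer distinct UMIs are
              treated as likely debris and dropped.
    Returns a dict of partition barcode -> size relative to the largest cell.
    """
    umis_by_barcode = defaultdict(set)
    for partition_barcode, feature_umi in reads:
        # A set collapses duplicate reads of the same UMI.
        umis_by_barcode[partition_barcode].add(feature_umi)

    counts = {
        barcode: len(umis)
        for barcode, umis in umis_by_barcode.items()
        if len(umis) >= min_umis
    }
    if not counts:
        return {}
    largest = max(counts.values())
    return {barcode: n / largest for barcode, n in counts.items()}


# Hypothetical example reads; "U2" for CELL_2 appears twice (duplicate read).
reads = [
    ("CELL_1", "U1"), ("CELL_1", "U2"), ("CELL_1", "U3"), ("CELL_1", "U4"),
    ("CELL_2", "U1"), ("CELL_2", "U2"), ("CELL_2", "U2"),
    ("DEBRIS", "U9"),
]
sizes = relative_cell_sizes(reads, min_umis=2)
```

The values returned are relative only; converting them to absolute sizes would require calibration against cells of known size, which is outside the scope of this sketch.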

Cell Multiplexing and Hashing

As described herein, in an aspect, the present disclosure provides methods for simultaneously processing multiple analytes derived from the same or different samples. Such a method may comprise, for example, providing a first nucleic acid barcode sequence (e.g., as a component of a cell nucleic acid barcode molecule) to a first sample and a second nucleic acid barcode sequence to a second sample such that cells or other analytes associated with the first sample are labeled with the first nucleic acid barcode sequence and cells or other analytes associated with the second sample are labeled with the second nucleic acid barcode sequence. The nucleic acid barcode sequences may be components of nucleic acid barcode molecules that also comprise lipophilic moieties (such as cholesterol moieties, e.g., as described herein). Cells may be labeled by, for example, binding cell binding moieties coupled to nucleic acid barcode sequences to the cells. Such cell binding moieties may be, for example, antibodies, cell surface receptor binding molecules, receptor ligands, small molecules, pro-bodies, aptamers, monobodies, affimers, darpins, or protein scaffolds (e.g., as described herein). Cell binding moieties may bind to a protein and/or a cell surface species of the cells. Alternatively, cells may be labeled by delivering nucleic acid barcode molecules (e.g., as described herein) to the cells, optionally using cell-penetrating peptides, liposomes, nanoparticles, electroporation, or mechanical force (e.g., as described herein). Nucleic acid barcode molecules may comprise barcode sequences unique to a cell sample and/or to an individual cell within a cellular sample. Labeled cells (and/or other analytes) may be partitioned between a plurality of partitions (e.g., as described herein), which partitions may comprise one or more reagents, such as one or more partition nucleic acid barcode sequences. Each partition may comprise a different partition nucleic acid barcode sequence. 
Some partitions may comprise more than one labeled cell (e.g., as described herein). For example, partitions (e.g., droplets or wells) may be intentionally loaded in such a manner that more partitions include more than one cell than would be achieved according to Poisson statistics (e.g., partitions may be overloaded). At least two labeled cells may be identified as originating from a same partition using the nucleic acid barcode sequences with which the cells are labeled, or complements thereof, and the partition nucleic acid barcode sequences associated with the partition, or complements thereof. Such identification may be facilitated by synthesizing barcoded nucleic acid products from the plurality of labeled cells (e.g., as described herein), wherein a given barcoded nucleic acid product may comprise a cell identification sequence comprising a cell nucleic acid barcode sequence or complement thereof and a partition identification sequence comprising a partition nucleic acid barcode sequence or complement thereof. Synthesizing the barcoded nucleic acid products may comprise hybridizing a sequence of a partition nucleic acid barcode molecule to a cell nucleic acid barcode molecule and performing an extension reaction (e.g., as described herein). Such methods may facilitate assignment of cells to their samples of origin, as well as the identification of multiplets originating from multiple samples (e.g., as described herein).
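As a non-limiting illustration, the effect of loading on partition occupancy may be sketched as follows. The sketch assumes, as a baseline only, that the number of cells per partition follows a Poisson distribution with a given mean, and that cells from the multiplexed samples are pooled in equal proportions. Both function names are hypothetical, and neither assumption is required by the methods described herein.

```python
import math


def multiplet_fraction(mean_cells_per_partition):
    """Fraction of occupied partitions holding more than one cell.

    Assumes Poisson-distributed occupancy with the given mean.
    """
    lam = mean_cells_per_partition
    if lam <= 0:
        raise ValueError("mean occupancy must be positive")
    p_empty = math.exp(-lam)
    p_single = lam * math.exp(-lam)
    p_occupied = 1.0 - p_empty
    return (1.0 - p_empty - p_single) / p_occupied


def detectable_doublet_fraction(num_samples):
    """Fraction of two-cell partitions whose cells come from different samples.

    Assumes equal mixing of the samples, so two cells share a sample
    with probability 1 / num_samples.
    """
    if num_samples < 1:
        raise ValueError("at least one sample is required")
    return 1.0 - 1.0 / num_samples


low_loading = multiplet_fraction(0.1)
high_loading = multiplet_fraction(1.0)
detectable = detectable_doublet_fraction(4)
```

Under these assumptions, raising the mean occupancy increases the fraction of multiplets, while labeling a larger number of samples with distinct barcode sequences increases the fraction of two-cell partitions that can be recognized by differing sample barcode sequences.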

Single cell processing and analysis methods and systems such as those described herein can be utilized for a wide variety of applications, including analysis of specific individual cells, analysis of different cell types within populations of differing cell types, and analysis and characterization of large populations of cells for environmental, human health, epidemiological, forensic, or any of a wide variety of different applications.

One application of the methods described herein is in the sequencing and characterization of immune cells. Methods and compositions disclosed herein can be utilized for sequence analysis of the immune repertoire. Analysis of sequence information underlying the immune repertoire can provide a significant improvement in understanding the status and function of the immune system.

Non-limiting examples of immune cells which can be analyzed utilizing the methods described herein include B cells, T cells (e.g., cytotoxic T cells, natural killer T cells, regulatory T cells, and T helper cells), natural killer cells, cytokine induced killer (CIK) cells, and myeloid cells, such as granulocytes (basophil granulocytes, eosinophil granulocytes, neutrophil granulocytes/hypersegmented neutrophils), monocytes/macrophages, mast cells, thrombocytes/megakaryocytes, and dendritic cells. In some embodiments, individual T cells are analyzed using the methods disclosed herein. In some embodiments, individual B cells are analyzed using the methods disclosed herein.

Immune cells express various adaptive immunological receptors relating to immune function, such as T cell receptors and B cell receptors. T cell receptors and B cell receptors play a part in the immune response by specifically recognizing and binding to antigens and aiding in their destruction.

The T cell receptor, or TCR, is a molecule found on the surface of T cells that is generally responsible for recognizing fragments of antigen as peptides bound to major histocompatibility complex (MHC) molecules. The TCR is generally a heterodimer of two chains, each of which is a member of the immunoglobulin superfamily, possessing an N-terminal variable (V) domain and a C-terminal constant domain. In humans, in 95% of T cells the TCR consists of an alpha (α) and beta (β) chain, whereas in 5% of T cells the TCR consists of gamma and delta (γ/δ) chains. This ratio can change during ontogeny and in diseased states as well as in different species. When the TCR engages with antigenic peptide and MHC (peptide/MHC), the T lymphocyte is activated through signal transduction.

Each of the two chains of a TCR contains multiple copies of gene segments—a variable ‘V’ gene segment, a diversity ‘D’ gene segment, and a joining ‘J’ gene segment. The TCR alpha chain is generated by recombination of V and J segments, while the beta chain is generated by recombination of V, D, and J segments. Similarly, generation of the TCR gamma chain involves recombination of V and J gene segments, while generation of the TCR delta chain occurs by recombination of V, D, and J gene segments. The intersection of these specific regions (V and J for the alpha or gamma chain, or V, D and J for the beta or delta chain) corresponds to the CDR3 region that is important for antigen-MHC recognition. Complementarity determining regions (e.g., CDR1, CDR2, and CDR3), or hypervariable regions, are sequences in the variable domains of antigen receptors (e.g., T cell receptor and immunoglobulin) that can complement an antigen. Most of the diversity of CDRs is found in CDR3, with the diversity being generated by somatic recombination events during the development of T lymphocytes. A unique nucleotide sequence that arises during the gene arrangement process can be referred to as a clonotype.

The B cell receptor, or BCR, is a molecule found on the surface of B cells. The antigen binding portion of a BCR is composed of a membrane-bound antibody that, like most antibodies (e.g., immunoglobulins), has a unique and randomly determined antigen-binding site. The antigen binding portion of a BCR includes a membrane-bound immunoglobulin molecule of one isotype (e.g., IgD, IgM, IgA, IgG, or IgE). When a B cell is activated by its first encounter with a cognate antigen, the cell proliferates and differentiates to generate a population of antibody-secreting plasma B cells and memory B cells. The various immunoglobulin isotypes differ in their biological features, structure, target specificity and distribution. A variety of molecular mechanisms exist to generate initial diversity, including genetic recombination at multiple sites.

The BCR is composed of two genes IgH and IgK (or IgL) coding for antibody heavy and light chains. Immunoglobulins are formed by recombination among gene segments, sequence diversification at the junctions of these segments, and point mutations throughout the gene. Each heavy chain gene contains multiple copies of three different gene segments—a variable ‘V’ gene segment, a diversity ‘D’ gene segment, and a joining ‘J’ gene segment. Each light chain gene contains multiple copies of two different gene segments for the variable region of the protein—a variable ‘V’ gene segment and a joining ‘J’ gene segment. The recombination can generate a molecule with one of each of the V, D, and J segments. Furthermore, several bases may be deleted and others added (called N and P nucleotides) at each of the two junctions, thereby generating further diversity. After B cell activation, a process of affinity maturation through somatic hypermutation occurs. In this process progeny cells of the activated B cells accumulate distinct somatic mutations throughout the gene with higher mutation concentration in the CDR regions leading to the generation of antibodies with higher affinity to the antigens. In addition to somatic hypermutation activated B cells undergo the process of isotype switching. Antibodies with the same variable segments can have different forms (isotypes) depending on the constant segment. Whereas all naïve B cells express IgM (or IgD), activated B cells mostly express IgG but also IgM, IgA and IgE. This expression switching from IgM (and/or IgD) to IgG, IgA, or IgE occurs through a recombination event causing one cell to specialize in producing a specific isotype. A unique nucleotide sequence that arises during the gene arrangement process can similarly be referred to as a clonotype.

In some embodiments, the methods, compositions and systems disclosed herein are utilized to analyze the various sequences of TCRs and BCRs from immune cells, for example various clonotypes. In some embodiments, methods, compositions and systems disclosed herein are used to analyze the sequence of a TCR alpha chain, a TCR beta chain, a TCR delta chain, a TCR gamma chain, or any fragment thereof (e.g., variable regions including VDJ or VJ regions, constant regions, transmembrane regions, fragments thereof, combinations thereof, and combinations of fragments thereof). In some embodiments, methods, compositions and systems disclosed herein are used to analyze the sequence of a B cell receptor heavy chain, B cell receptor light chain, or any fragment thereof (e.g., variable regions including VDJ or VJ regions, constant regions, transmembrane regions, fragments thereof, combinations thereof, and combinations of fragments thereof).

Where immune cells are to be analyzed, primer sequences useful in any of the various operations for attaching barcode sequences and/or extension/amplification reactions may comprise gene specific sequences which target genes or regions of genes of immune cell proteins, for example immune receptors. Such gene sequences include, but are not limited to, sequences of various T cell receptor alpha variable genes (TRAV genes), T cell receptor alpha joining genes (TRAJ genes), T cell receptor alpha constant genes (TRAC genes), T cell receptor beta variable genes (TRBV genes), T cell receptor beta diversity genes (TRBD genes), T cell receptor beta joining genes (TRBJ genes), T cell receptor beta constant genes (TRBC genes), T cell receptor gamma variable genes (TRGV genes), T cell receptor gamma joining genes (TRGJ genes), T cell receptor gamma constant genes (TRGC genes), T cell receptor delta variable genes (TRDV genes), T cell receptor delta diversity genes (TRDD genes), T cell receptor delta joining genes (TRDJ genes), and T cell receptor delta constant genes (TRDC genes).

Additionally, the methods and compositions disclosed herein allow the determination of not only the immune repertoire and different clonotypes, but also the functional characteristics (e.g., the transcriptome) of the cells associated with a clonotype or plurality of clonotypes that bind to the same or similar antigen. These functional characteristics can comprise transcription of cytokine, chemokine, or cell-surface associated molecules, such as costimulatory molecules, checkpoint inhibitors, cell surface maturation markers, or cell-adhesion molecules. Such analysis allows a cell or cell population expressing a particular T cell receptor, B cell receptor, or immunoglobulin to be associated with certain functional characteristics. For example, for any given antigen there will be multiple clonotypes of T cell receptor, B cell receptor, or immunoglobulin that specifically bind to that antigen. Multiple clonotypes that bind to the same antigen are known as the idiotype.

The present disclosure also provides methods for reducing nonspecific priming in a single-cell 5′ gene expression assay. In generating an assay that allows simultaneous measurement of 1) a cell barcode sequence (barcode), 2) a unique molecular identifier sequence (UMI) and 3) the 5′ sequence of an mRNA transcript, one strategy is to place these sequences on a sequence that attaches to the 5′ end of an mRNA transcript. In the present disclosure, this may be accomplished by placing the barcode and UMI on a template switching oligonucleotide (TSO). This oligonucleotide may be attached to the first-strand cDNA via a template switching reaction in which the reverse transcription (RT) enzyme 1) reverse transcribes a messenger RNA (mRNA) sequence into first-strand complementary DNA (cDNA) from a primer targeting the 3′ end of the mRNA, 2) adds nontemplated cytidines to the 3′ end of the first-strand cDNA, and 3) switches template to the TSO, which may contain 3′ guanosines or guanosine derivatives that hybridize to the added cytidines. The result is a first-strand cDNA molecule comprising sequence complementary to the TSO (cell barcode, UMI, and guanosines) and to the 5′ end of the mRNA.

In some cases, the TSO may co-exist in solution with the RT enzyme and the total RNA contents of a cell. If the TSO is a single stranded DNA (ssDNA) molecule, it can participate as an RT primer rather than as a template-switching substrate. Given, for example, that over 90% of the total RNA content of a cell may be noncoding ribosomal RNA (rRNA), this may produce barcoded off-target products that do not contribute to the 5′ gene expression or V(D)J sequencing assay but do consume sequencing reads, increasing the cost required to achieve the same sequencing depth. In addition, if the UMI is implemented as a randomer, the presence of this randomer at the 3′ end of the TSO greatly increases its ability to serve as a primer on an rRNA template.

In some cases, a TSO that is less likely to serve as an RT primer via the introduction of a particular spacer sequence between the UMI and terminal riboGs may be used. Another approach is to design and include a set of auxiliary blocking oligonucleotides that may hybridize to rRNA and prevent binding of the TSO.

The spacer sequence can be optimized by selecting a sequence that minimizes the predicted melting temperature of the (spacer-GGG):rRNA duplex against all human ribosomal RNA molecules.
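By way of illustration only, spacer selection of this kind can be sketched in software. The sketch below is a simplified assumption rather than the optimization used to derive any sequence in this disclosure: it replaces a thermodynamic melting temperature prediction with the Wallace rule applied to the longest 3′-anchored stretch of the (spacer-GGG) oligonucleotide that is perfectly complementary to a template, and it writes rRNA templates in the DNA alphabet. The function names and the toy template sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA-alphabet sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def wallace_tm(seq):
    """Wallace-rule estimate for a short duplex: 2*(A+T) + 4*(G+C)."""
    at = sum(base in "AT" for base in seq)
    gc = sum(base in "GC" for base in seq)
    return 2 * at + 4 * gc


def worst_case_priming_tm(oligo, templates):
    """Highest Wallace-rule value over all templates for the longest
    3'-anchored, perfectly complementary stretch of `oligo`."""
    rc = reverse_complement(oligo)  # prefix of rc pairs with the oligo 3' end
    best = 0
    for template in templates:
        for k in range(len(rc), 0, -1):
            if rc[:k] in template:
                best = max(best, wallace_tm(rc[:k]))
                break  # longer prefixes always score at least as high
    return best


def pick_spacer(candidates, templates, tail="GGG"):
    """Choose the spacer whose (spacer + tail) oligo has the lowest
    worst-case priming score; ties are broken alphabetically."""
    return min(
        candidates,
        key=lambda s: (worst_case_priming_tm(s + tail, templates), s),
    )


# Toy templates standing in for rRNA sequences (hypothetical data).
toy_rrna = ["ACGTCCCGATTACCCAGGT", "GGGTTTAAACCCGGG"]
chosen = pick_spacer(["TTTTGTAATC", "TTTCTTATAT"], toy_rrna)
```

In this toy input, the candidate TTTTGTAATC is penalized because its 3′ end, together with the GGG tail, is complementary to a nine-base stretch of the first template. A practical implementation would likely substitute a nearest-neighbor thermodynamic model and the full set of human rRNA sequences.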

The blocker sequences can be optimized by selecting sequences that maximize the predicted melting temperature of the (blocker):rRNA duplex against all human ribosomal RNA molecules.

Provided herein are TSO that are less likely to serve as an RT primer via the introduction of a particular spacer sequence between the UMI and terminal riboGs. Additionally, described herein are auxiliary blocking oligonucleotides that hybridize to rRNA and prevent binding of the TSO.

Examples of spacer sequences, blocker sequences, and full construct barcodes that may be of use in the methods provided herein can be found in at least U.S. Patent Publication No. 201801058008, which is herein incorporated by reference in its entirety.

In some examples, a cell barcode may be a 16 base sequence that is randomly chosen from about 737,000 sequences. The length of the barcode (16) can be altered. The diversity of potential barcode sequences (737 k) can be altered. The defined nature of the barcode can be altered; for example, it may also be completely random (16 Ns) or semi-random (16 bases that come from a biased distribution of nucleotides).

The canonical UMI sequence may be a 10 nucleotide randomer. The length of the UMI can be altered. The random nature of the UMI can be altered; for example, it may be semi-random (bases that come from a biased distribution of nucleotides). In certain cases, the distribution of UMI nucleotides may be biased; for example, UMI sequences that do not contain Gs or Cs may be less likely to serve as primers.
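As a non-limiting illustration, random and semi-random barcode or UMI sequences of the kind described above can be drawn in software as follows. This is a minimal sketch; the function names, the weight values, and the three-entry whitelist are hypothetical and are not taken from this disclosure.

```python
import random


def random_oligo(length, weights=None, rng=None):
    """Draw a sequence of `length` bases. `weights` maps each base to a
    relative probability; a base with weight 0 (or missing) never appears."""
    rng = rng or random.Random()
    bases = "ACGT"
    if weights is None:
        weights = {base: 1.0 for base in bases}  # fully random (all Ns)
    base_weights = [weights.get(base, 0.0) for base in bases]
    return "".join(rng.choices(bases, weights=base_weights, k=length))


def pick_barcode(whitelist, rng=None):
    """Choose one barcode at random from a defined set of sequences."""
    rng = rng or random.Random()
    return rng.choice(whitelist)


rng = random.Random(0)  # seeded only so the example is repeatable
umi = random_oligo(10, rng=rng)  # 10 nucleotide randomer
at_only_umi = random_oligo(10, weights={"A": 1.0, "T": 1.0}, rng=rng)  # no Gs or Cs
toy_whitelist = ["AAACCCAAGAAACACT", "AAACCCAAGAAACCAT", "AAACCCAAGAAACCCA"]
barcode = pick_barcode(toy_whitelist, rng=rng)
```

Setting unequal non-zero weights gives a semi-random sequence, while setting the weights of G and C to zero gives a UMI that contains only A and T.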

The spacer may be alterable within given or predetermined parameters. For example, one method may give an optimal sequence of TTTCTTATAT, but a slightly different optimization strategy may result in a sequence that is likely just as good or nearly as good.

The selected template switching region can comprise 3 or more consecutive riboGs. The selected template switching region can comprise 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or more consecutive riboGs. Alternative nucleotides may be used, such as deoxyribo Gs, LNA Gs, or any combination thereof.
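For illustration, the order of elements in such a construct (barcode, then UMI, then spacer, then terminal riboGs) can be written out as a simple string assembly. The sketch below is an assumption made for clarity: the function name is hypothetical, the "rG" notation for a riboG is a convention chosen here, upstream functional sequences are omitted, and the minimum of three riboGs reflects one example from the text rather than a requirement.

```python
def build_tso(barcode, umi, spacer, n_ribo_g=3):
    """Concatenate barcode, UMI, spacer and terminal riboGs (5' to 3').
    Each riboG is written as 'rG' to distinguish it from a DNA base."""
    for name, seq in (("barcode", barcode), ("umi", umi), ("spacer", spacer)):
        if not seq or set(seq) - set("ACGTN"):
            raise ValueError(f"{name} must be a non-empty ACGTN string")
    if n_ribo_g < 3:
        raise ValueError("this sketch assumes at least 3 consecutive riboGs")
    return barcode + umi + spacer + "rG" * n_ribo_g


# 16 base barcode, 10 nucleotide UMI placeholder, example spacer from the text.
example_tso = build_tso("ACGTACGTACGTACGT", "NNNNNNNNNN", "TTTCTTATAT")
```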

The present disclosure also provides methods of enriching cDNA sequences. Enrichment may be useful for TCR, BCR, and immunoglobulin gene analysis since these genes may possess similar yet polymorphic variable region sequences. These sequences can be responsible for antigen binding and peptide-MHC interactions. For example, due to gene recombination events in individual developing T cells, a single human or mouse will naturally express many thousands of different TCR genes. This T cell repertoire can exceed 100,000 or more different TCR rearrangements occurring during T cell development, yielding a total T cell population that is highly polymorphic with respect to its TCR gene sequences especially for the variable region. For immunoglobulin genes, the same may apply, except even greater diversity may be present. As previously noted, each distinct sequence may correspond to a clonotype. In certain embodiments, enrichment increases accuracy and sensitivity of methods for sequencing TCR, BCR and immunoglobulin genes at a single cell level. In certain embodiments, enrichment increases the number of sequencing reads that map to a TCR, BCR, or immunoglobulin gene. In some embodiments, enrichment leads to greater than or equal to 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% or more of total sequencing reads mapping to a TCR, BCR or immunoglobulin gene. In some embodiments, enrichment leads to greater than or equal to 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% or more of total sequencing reads mapping to a variable region of a TCR, BCR or immunoglobulin gene.

In order to aid in sequencing, detection, and analysis of sequences of interest, an enrichment step can be employed. Enrichment may be useful for the sequencing and analysis of genes that may be related yet highly polymorphic. In some embodiments, an enriched gene comprises a TCR sequence, a BCR sequence, or an immunoglobulin sequence. In some embodiments, an enriched gene comprises a mitochondrial gene or a cytochrome family gene. In some embodiments, enrichment is employed after an initial round of reverse transcription (e.g., cDNA production). In some embodiments, enrichment is employed after an initial round of reverse transcription and cDNA amplification for at least 5, 10, 15, 20, 25, 30, 40 or more cycles. In some embodiments, enrichment is employed after a cDNA amplification. In some embodiments, the amplified cDNA can be subjected to a clean-up step before the enrichment step using a column, gel extraction, or beads in order to remove unincorporated primers, unincorporated nucleotides, very short or very long nucleic acid fragments and enzymes. In some embodiments, enrichment is followed by a clean-up step before sequencing library preparation.

Enrichment of gene or cDNA sequences can be facilitated by a primer that anneals within a known sequence of the target gene. In some embodiments, for enrichment of a TCR, BCR, or immunoglobulin gene, a primer that anneals to a constant region of the gene or cDNA can be paired with a sequencing primer that anneals to a TSO functional sequence. In some embodiments, the enriched cDNA falls into a length range that approximately corresponds to that gene's variable region. In some embodiments, greater than about 50%, 60%, 70%, 80%, 85%, 90%, 95% or more cDNA or cDNA fragments fall within a range of about 300 base pairs to about 900 base pairs, of about 400 base pairs to about 800 base pairs, of about 500 base pairs to about 700 base pairs, or of about 500 base pairs to about 600 base pairs.
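As an illustrative check of such a size criterion, the fraction of cDNA fragments falling within a given length range can be computed as sketched below. The function name and the fragment lengths are hypothetical example values, not measured data.

```python
def fraction_in_range(lengths, low, high):
    """Fraction of fragment lengths (in base pairs) with low <= length <= high."""
    if not lengths:
        return 0.0
    return sum(low <= n <= high for n in lengths) / len(lengths)


# Hypothetical fragment lengths, e.g. from a fragment analyzer trace.
toy_lengths = [250, 450, 550, 650, 950]
share_300_900 = fraction_in_range(toy_lengths, 300, 900)
share_500_700 = fraction_in_range(toy_lengths, 500, 700)
```

The same calculation applies to read-level enrichment metrics, for example the fraction of total sequencing reads that map to a TCR, BCR, or immunoglobulin gene, if the per-read mapping outcome is supplied in place of fragment lengths.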

FIG. 25 shows an example enrichment scheme. In operation 2001, an oligonucleotide with a poly-T sequence 2014, and in some cases an additional sequence 2016 that binds to, for example, a sequencing or PCR primer, anneals to a target RNA 2020. In operation 2002 the oligonucleotide is extended, yielding an anti-sense strand 2022 to which multiple cytidines are appended on the 3′ end. A barcode oligonucleotide attached to a bead 2038 (such as a gel bead) is provided, and a riboG of the barcode oligonucleotide 2008 pairs with the cytidines of the anti-sense strand and is extended to create a sense and an antisense strand. In some cases, the barcode oligonucleotide is released from the gel bead during extension. In some cases, the barcode oligonucleotide is released from the gel bead prior to extension. In some cases, the barcode oligonucleotide is released from the gel bead after extension. In addition to the riboG sequence, the barcode oligonucleotide comprises a barcode 2012 sequence (which, in some instances, may also comprise a unique molecular index) and one or more additional functional sequences 2010. The additional functional sequences can comprise a primer/primer binding sequence (such as a sequencing primer sequence, e.g., R1 or R2, or partial sequences thereof), a sequence for attachment to an Illumina sequencing flow cell (such as a P5 or P7 sequence), etc. Operations 2001 and 2002 may be performed in a partition (e.g., droplet or well). Subsequent to operation 2002, the nucleic acid product from operations 2001 and 2002 may be removed from the partition and in some cases pooled with other products from other partitions for subsequent processing. In some cases, the barcode oligonucleotide may be a template switching oligonucleotide.

Next, additional functional sequences can be added that allow for amplification or sample identification. This may occur in a partition or in bulk. This reaction yields amplified cDNA molecules as in 2003 comprising a barcode and, e.g., sequencing primers. In some cases, not all of these cDNA molecules will comprise a target variable region sequence (e.g., from a TCR or immunoglobulin). In one enrichment scheme, shown in operation 2004, a primer 2018 that anneals to a sequence 3′ of a TCR, BCR or immunoglobulin variable region 2020 specifically amplifies the cDNAs comprising the variable region, yielding products as shown in operation 2005. Such enrichment may be performed for various approaches described herein.

In certain aspects, primer 2018 anneals in a constant region of a TCR (e.g., TCR-alpha or TCR-beta), BCR or immunoglobulin gene. After amplification the products are sheared, adaptors ligated and amplified a second time to add additional functional sequences 2007 and 2011 and a sample index 2009 as shown in operation 2006. The additional functional sequences can be, for example a primer/primer binding sequence (such as a sequencing primer sequence, e.g., R1 or R2, or partial sequences thereof), a sequence for attachment to an Illumina sequencing flow cell (such as a P5 or P7 sequence), etc. In some embodiments, the initial poly-T primer, comprising sequences 2016 and 2014 can be attached to a gel bead as opposed to the barcode oligonucleotide or template switching oligonucleotide (TSO). In some embodiments, the poly-T comprising primer comprises functional sequences and barcode sequences 2008, 2010, 2012, and the barcode oligonucleotide (e.g., TSO, which, in some instances, is free in solution) comprises sequence 2016. Operations 2003-2006 may be performed in bulk.

In some embodiments, clonotype information derived from next-generation sequencing data of cDNA prepared from cellular RNA is combined with other targeted or non-targeted cDNA enrichment to illuminate functional and ontological aspects of B cells and T cells that express a given TCR, BCR, or immunoglobulin. In some embodiments, clonotype information is combined with analysis of expression of an immunologically relevant cDNA. In some embodiments, the cDNA encodes a cell lineage marker, a cell surface functional marker, immunoglobulin isotype, a cytokine and/or chemokine, an intracellular signaling polypeptide, a cell metabolism polypeptide, a cell-cycle polypeptide, an apoptosis polypeptide, a transcriptional activator/inhibitor, an miRNA or lncRNA.

Also disclosed herein are methods and systems for reference-free clonotype identification. Such methods may be implemented by way of software executing algorithms. Tools for assembling T-cell Receptor (TCR) sequences may use known sequences of V and C regions to “anchor” assemblies. This may make such tools applicable only to organisms with well characterized references (e.g., human and mouse). However, most mammalian T cell receptors have similar amino acid motifs and similar structure. In the absence of a reference, a method can scan assembled transcripts for regions that are diverse or semi-diverse, find the junction region, which should be highly diverse, and then scan for known amino acid motifs. In some cases, it may not be critical that the complementarity determining regions (CDRs), such as the CDR1, CDR2, or CDR3 region, be accurately delimited, only that a diverse sequence is found that can uniquely identify the clonotype. One advantage of this method is that the software may not require a set of reference sequences and can operate fully de novo. This method can thus enable immune research in eukaryotes with poorly characterized genomes/transcriptomes.
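The scan for a highly diverse region can be illustrated with a short sketch. The approach shown, per-position Shannon entropy followed by a sliding window, is one possible way to quantify diversity and is an assumption of this example rather than a description of any particular software. It further assumes that the assembled transcripts have already been aligned to equal length; the function names and toy sequences are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the bases observed at one aligned position."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def most_diverse_window(aligned, window):
    """Return (start, end) of the window with the highest summed entropy.
    `aligned` is a list of equal-length sequences; `end` is exclusive."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    if not 0 < window <= length:
        raise ValueError("window must be between 1 and the alignment length")
    entropies = [
        column_entropy([seq[i] for seq in aligned]) for i in range(length)
    ]
    start = max(
        range(length - window + 1),
        key=lambda i: sum(entropies[i:i + window]),
    )
    return start, start + window


# Toy alignment: conserved flanks around a four-base hypervariable stretch.
toy_transcripts = ["AAAACGTCCC", "AAACGTACCC", "AAAGTACCCC", "AAATACGCCC"]
junction = most_diverse_window(toy_transcripts, window=4)
```

The window returned for the toy alignment spans the four central positions, which are the only positions that vary between sequences. A subsequent step could translate the flanking sequence and search it for conserved amino acid motifs.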

The methods described herein allow simultaneously obtaining single-cell gene expression information with single-cell immune receptor sequences (TCRs/BCRs). This can be achieved using the methods described herein, such as by amplifying genes relevant to lymphocyte function and state (either in a targeted or unbiased way) while simultaneously amplifying the TCR/BCR sequences for clonotyping. This can allow such applications as 1) interrogating changes in lymphocyte activation/response to an antigen, at the single clonotype or single cell level; or 2) classifying lymphocytes into subtypes based on gene expression while simultaneously sequencing their TCR/BCRs. UMIs are typically ignored during TCR (or generally transcriptome) assembly.

Key analytical operations involved in clonotype sequencing according to the methods described herein include: 1) Assemble each UMI separately, then merge highly similar assembled sequences. High depth per molecule in TCR sequencing makes this feasible. This may result in a reduced chance of “chimeric” assemblies; 2) Assemble all UMIs from each cell together but use UMI information to choose paths in the assembly graph. This is analogous to using barcode and read-pair information to resolve “bubbles” in WGS assemblies; 3) Base quality estimation. UMI information and alignment of short reads to assembled contigs may be used to compute per-base quality scores. Base quality scoring may be important as a few base differences in a CDR sequence may differentiate one clonotype from another. This may be in contrast to other methods that rely on using long-read sequencing.
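Operations 1) and 3) above can be sketched in simplified form. The example below is an illustrative assumption, not the assembly or scoring method of any particular software: it treats reads as pre-aligned, equal-length strings, builds a majority-vote consensus per UMI, merges the per-UMI consensus sequences, and reports an empirical Phred-style score from the fraction of UMIs that disagree at each position. All names and the toy reads are hypothetical.

```python
import math
from collections import Counter, defaultdict


def consensus(seqs):
    """Majority base at each position of equal-length sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def umi_consensus(reads):
    """Group (umi, sequence) pairs by UMI and return one consensus per UMI."""
    by_umi = defaultdict(list)
    for umi, seq in reads:
        by_umi[umi].append(seq)
    return {umi: consensus(seqs) for umi, seqs in by_umi.items()}


def contig_with_quality(reads, max_q=40):
    """Merge per-UMI consensus sequences into a contig and score each base.
    Quality is -10*log10(fraction of UMIs disagreeing), capped at max_q."""
    per_umi = list(umi_consensus(reads).values())
    contig = consensus(per_umi)
    floor = 10 ** (-max_q / 10)  # smallest error rate that is reported
    quals = []
    for i, base in enumerate(contig):
        agree = sum(seq[i] == base for seq in per_umi)
        error = 1 - agree / len(per_umi)
        quals.append(round(-10 * math.log10(max(error, floor))))
    return contig, quals


toy_reads = [
    ("U1", "ACGT"), ("U1", "ACGT"), ("U1", "ACGA"),  # one read error in U1
    ("U2", "ACGT"),
    ("U3", "ACTT"), ("U3", "ACTT"),                  # U3 disagrees at base 3
]
contig, quals = contig_with_quality(toy_reads)
```

In the toy input, the isolated read error within U1 is removed by the per-UMI consensus, while the disagreement carried by every read of U3 lowers the quality of the third base of the contig. With so few UMIs the scores are coarse and are not calibrated error probabilities.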

Thus, base quality estimates for assembled contigs can inform clonotype inference. Errors can make cells with the same (real) clonotype have mismatching assembled sequences. Further, base-quality estimates and clonotype abundances may be combined to correct clonotype assignments. For example, if 10 cells have clonotype X and one cell has a clonotype that differs from X in only a few bases and these bases have low quality, then this cell may be assigned to clonotype X. In some embodiments, clonotypes that differ by a single amino acid or nucleic acid may be discriminated. In some embodiments, clonotypes that differ by less than 50, 40, 30, 20, 15, 10, 9, 8, 7, 6, 5, 4, 3, or 2 amino acids or nucleic acids may be discriminated.
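The correction described in the example above can be sketched as follows. The thresholds (minimum quality, maximum number of mismatches, minimum abundance) and all names are hypothetical values chosen for illustration; the sketch also assumes that the compared sequences have equal length.

```python
from collections import Counter


def correct_clonotypes(cells, min_q=20, max_mismatches=3, min_abundance=2):
    """Reassign rare clonotypes to an abundant clonotype when every
    differing base is low quality. `cells` maps a cell identifier to a
    (sequence, per-base qualities) pair; returns cell -> clonotype."""
    counts = Counter(seq for seq, _ in cells.values())
    corrected = {}
    for cell, (seq, quals) in cells.items():
        corrected[cell] = seq
        if counts[seq] >= min_abundance:
            continue  # already an abundant clonotype
        for candidate, n in counts.most_common():
            if n < min_abundance or len(candidate) != len(seq):
                continue
            diffs = [i for i, (a, b) in enumerate(zip(seq, candidate)) if a != b]
            if 0 < len(diffs) <= max_mismatches and all(
                quals[i] < min_q for i in diffs
            ):
                corrected[cell] = candidate
                break
    return corrected


clonotype_x = "ACGTACGT"
toy_cells = {f"cell{i}": (clonotype_x, [40] * 8) for i in range(10)}
# Differs from X at one low-quality base: expected to be merged into X.
toy_cells["noisy"] = ("ACGTACCT", [40, 40, 40, 40, 40, 40, 5, 40])
# Differs from X at one high-quality base: expected to remain distinct.
toy_cells["distinct"] = ("TCGTACGT", [40] * 8)
assignments = correct_clonotypes(toy_cells)
```

Because the single mismatch in the cell labeled "distinct" is supported by a high quality score, that cell keeps its own clonotype, which is how clonotypes that differ by a single base can still be discriminated.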

The present disclosure provides methods combining cell multiplexing methods and immune cell analysis methods. In an example, the present disclosure provides a method for analyzing a cell, which cell may be an immune cell such as a T cell or B cell. The cell may comprise a plurality of nucleic acid molecules (e.g., RNA molecules and/or DNA molecules). The plurality of nucleic acid molecules may comprise a plurality of nucleic acid sequences corresponding to a V(D)J region of the genome of the cell. The V(D)J region of the genome of the cell may comprise a T cell receptor variable region sequence, a B cell receptor variable region sequence, or an immunoglobulin variable region sequence. The cell may be labeled with a cell nucleic acid barcode sequence to generate a labeled cell. The cell nucleic acid barcode sequence may be a component of a cell nucleic acid barcode molecule. The cell nucleic acid barcode molecule may also comprise a cell labeling agent that may couple to the cell, such as to a cell surface feature. The cell labeling agent may be, for example, a lipophilic moiety (e.g., a cholesterol), a fluorophore, a dye, a peptide, a nanoparticle, an antibody, or another moiety. The cell nucleic acid barcode sequence may identify a sample from which the cell originates. The sample may be derived from a biological fluid, such as a biological fluid comprising blood or saliva. The cell nucleic acid barcode molecule may be at least partially disposed within the labeled cell.

The labeled cell may be partitioned in a partition (e.g., a droplet or well) with a plurality of partition nucleic acid barcode molecules. Each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules may comprise a partition nucleic acid barcode sequence. Each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules may comprise a priming sequence, such as a targeted priming sequence or a random N-mer sequence. Each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules may comprise a TSO sequence as described elsewhere herein. The priming sequence may be capable of hybridizing to a sequence of at least a subset of the plurality of nucleic acid molecules. The priming sequence may be capable of hybridizing to a sequence of the cell nucleic acid barcode molecule. The TSO sequence may be capable of facilitating a template switching reaction and/or serve as a priming/hybridization sequence for a cell nucleic acid molecule present in a labeled cell (e.g., a lipophilic or other moiety as described elsewhere herein). The partition nucleic acid barcode molecules may be coupled to a bead, such as a gel bead. The gel bead may be dissolvable or degradable. The partition nucleic acid barcode molecules may be releasably coupled to the bead. Some or all of the partition nucleic acid barcode molecules may be released from the bead within the partition (e.g., upon application of a stimulus, such as a chemical stimulus). Within the partition, the cell may be lysed or permeabilized to provide access to the plurality of nucleic acid molecules therein. The partition may also include a primer molecule, which primer molecule may comprise a sequence complementary to a sequence of the plurality of nucleic acid molecules. Where the plurality of nucleic acid molecules is a plurality of messenger RNA (mRNA) molecules, such a sequence may be a poly(A) sequence.

A barcoded nucleic acid molecule comprising the cell nucleic acid barcode sequence, or a complement thereof, and the partition nucleic acid barcode sequence, or a complement thereof may be generated within the partition. A plurality of barcoded nucleic acid products each comprising a sequence of a nucleic acid molecule of the plurality of nucleic acid molecules and the partition nucleic acid barcode sequence, or a complement thereof may also be generated within the partition. The plurality of barcoded nucleic acid products may comprise a plurality of complementary DNA (cDNA) molecules, or derivatives thereof. Generating the plurality of barcoded nucleic acid products may comprise hybridizing a sequence of a primer molecule within the partition to a sequence (e.g., a poly(A) sequence) of a nucleic acid molecule of the plurality of nucleic acid molecules (e.g., mRNA molecules) and using an enzyme (e.g., a reverse transcriptase) to extend the sequence of the primer molecule to provide a nucleic acid product comprising a cDNA sequence corresponding to a sequence of the nucleic acid molecule. The enzyme may have terminal transferase activity and may incorporate a sequence at an end of the nucleic acid product. Such a sequence may be, for example, a poly(C) sequence. Some or all of the plurality of partition nucleic acid barcode molecules may comprise a sequence complementary to the poly(C) sequence (e.g., a poly(riboG) sequence). Generating the plurality of barcoded nucleic acid products may comprise using the nucleic acid product and a partition nucleic acid barcode molecule to generate a barcoded nucleic acid product. The barcoded nucleic acid molecule and the plurality of barcoded nucleic acid products may be synthesized via one or more primer extension reactions, ligation reactions, or nucleic acid amplification reactions. 
The barcoded nucleic acid molecule and the barcoded nucleic acid products, or derivatives thereof (e.g., the barcoded nucleic acid molecule and the barcoded nucleic acid products having functional sequences appended thereto, such as flow cell sequences and sequencing primers), may be sequenced to yield a plurality of sequencing reads. Each sequencing read of the plurality of sequencing reads may be associated with the partition via its partition nucleic acid barcode sequence. The plurality of nucleic acid molecules may subsequently be identified as originating from the cell.

Such a method may be extended to a plurality of labeled cells. Each cell of the plurality of labeled cells may be labeled with a cell nucleic acid barcode sequence of a plurality of cell nucleic acid barcode sequences. A plurality of cell nucleic acid barcode molecules may comprise the plurality of cell nucleic acid barcode sequences, wherein each cell nucleic acid barcode molecule of the plurality of cell nucleic acid barcode molecules may comprise (i) a single cell nucleic acid barcode sequence of the plurality of cell nucleic acid barcode sequences and (ii) a cell labeling agent. The cell labeling agent may be, for example, a lipophilic moiety, a nanoparticle, a fluorophore, a dye, a peptide, an antibody, or another moiety. A lipophilic moiety of each nucleic acid barcode molecule of the plurality of nucleic acid barcode molecules may comprise cholesterol. The cell labeling agent may be linked to the plurality of cell nucleic acid barcode molecules via a linker. The cell labeling agent may be linked to a cell via a cell surface feature, such as a protein. Each labeled cell of the plurality of labeled cells may comprise a target nucleic acid molecule of a plurality of target nucleic acid molecules. The plurality of target nucleic acid molecules may comprise a plurality of messenger RNA (mRNA) molecules. The plurality of target nucleic acid molecules may comprise a plurality of nucleic acid sequences corresponding to V(D)J regions of genomes of the plurality of labeled cells. The V(D)J regions of the genomes of the plurality of labeled cells may comprise T cell receptor variable region sequences, B cell receptor variable region sequences, immunoglobulin variable region sequences, or a combination thereof. The plurality of labeled cells may be a plurality of immune cells, such as a plurality of T cells or B cells. The plurality of labeled cells may derive from a plurality of cellular samples. 
A given cell nucleic acid barcode sequence of the plurality of cell nucleic acid barcode sequences may identify a cellular sample from which an associated cell of the plurality of labeled cells originates, such as a sample derived from a biological fluid (e.g., a biological fluid comprising saliva or blood). The plurality of cells may be labeled according to the methods provided herein. For example, cells may be labeled using cell binding moieties (e.g., antibodies, cell surface receptor binding molecules, receptor ligands, small molecules, pro-bodies, aptamers, monobodies, affimers, darpins, or protein scaffolds) that may bind to a protein, cell surface species, or other feature of the cells. Cells may alternatively be labeled by delivering nucleic acid barcode molecules to cells using cell-penetrating peptides, liposomes, nanoparticles, electroporation, or mechanical force (e.g., nanowires or microinjection). The cell nucleic acid barcode molecules utilized to label cells may comprise a barcode sequence and one or more functional sequences including a unique molecular index, a primer/primer binding sequence (such as a sequencing primer sequence, e.g., R1, R2, or partial sequences thereof), a sequence configured to attach to the flow cell of a sequencer (such as P5 or P7), an adapter sequence (such as a sequence configured to be complementary or hybridize to a sequence on a partition barcode molecule, e.g., attached to a bead), etc.

The plurality of labeled cells and a plurality of nucleic acid barcode molecules may be co-partitioned within a plurality of partitions (e.g., droplets or wells). Each partition of the plurality of partitions may comprise at least one labeled cell of the plurality of labeled cells and a partition nucleic acid barcode molecule of a plurality of partition nucleic acid barcode molecules. At least a subset of the plurality of partitions may comprise at least two labeled cells of the plurality of labeled cells. Each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules may comprise a partition nucleic acid barcode sequence of a plurality of partition nucleic acid barcode sequences, and each partition of the plurality of partitions may comprise a different partition nucleic acid barcode sequence. The plurality of partition nucleic acid barcode molecules may be coupled to a plurality of beads, such as a plurality of gel beads. Each bead of the plurality of beads may comprise at least 10,000 partition nucleic acid barcode molecules of the plurality of partition nucleic acid barcode molecules coupled thereto. The plurality of gel beads may be dissolvable or degradable. Each partition of the plurality of partitions may comprise a single bead of the plurality of beads. The plurality of partition nucleic acid barcode molecules may be releasably coupled to the plurality of beads. The plurality of partition nucleic acid barcode molecules may be releasable from the beads upon application of a stimulus, such as a chemical stimulus. Partition nucleic acid barcode molecules of the plurality of partition nucleic acid barcode molecules may be released from each bead of the plurality of beads within the plurality of partitions. Each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules may comprise a common partition nucleic acid barcode sequence. 
Each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules may comprise a common partition nucleic acid barcode sequence and one or more functional sequences including a unique molecular index, a primer/primer binding sequence (such as a sequencing primer sequence, e.g., R1, R2, or partial sequences thereof), a sequence configured to attach to the flow cell of a sequencer (such as P5 or P7), an adapter sequence (such as a sequence configured to be complementary or hybridize to a sequence on a cell barcode molecule, e.g., coupled to a labeled cell, such as via a lipophilic moiety), etc. A given bead may comprise multiple different types of partition nucleic acid barcode molecules. For example, the given bead may comprise a first set of partition nucleic acid barcode molecules and a second set of partition nucleic acid barcode molecules. The first set of partition nucleic acid barcode molecules may comprise a sequence complementary to a sequence of the cell nucleic acid barcode sequence of a given partition comprising the given bead, while the second set of partition nucleic acid barcode molecules may comprise a sequence useful in processing target nucleic acid molecules of a labeled cell of the given partition.

Within the partitions, the plurality of labeled cells may be subjected to conditions sufficient to provide access to the plurality of target nucleic acid molecules therein. For example, the plurality of labeled cells may be lysed or permeabilized. The plurality of partition nucleic acid barcode molecules may be used to synthesize (i) a first plurality of barcoded nucleic acid products comprising a cell nucleic acid barcode sequence of the plurality of cell nucleic acid barcode sequences, or a complement thereof, and a partition nucleic acid barcode sequence of the plurality of partition nucleic acid barcode sequences, or a complement thereof; and (ii) a second plurality of barcoded nucleic acid products comprising a sequence of a target nucleic acid molecule (e.g., a V(D)J sequence as described herein) of the plurality of target nucleic acid molecules, or a complement thereof, and the partition nucleic acid barcode sequence of the plurality of partition nucleic acid barcode sequences, or a complement thereof. This process may comprise reverse transcribing mRNA molecules to generate cDNA molecules (e.g., as described herein). A reverse transcriptase, such as a reverse transcriptase having terminal transferase activity, may be used to reverse transcribe mRNA. Template switching may be performed (e.g., using partition nucleic acid barcode molecules comprising terminal poly(riboG) sequences) to generate the second plurality of barcoded nucleic acid products (e.g., as described herein). In some cases, multiplet reduction techniques such as those described herein may also be employed. 
For example, at least two labeled cells of the plurality of labeled cells may be identified as originating from a same partition of the plurality of partitions using (i) cell nucleic acid barcode sequences of the plurality of cell nucleic acid barcode sequences, or complements thereof, and (ii) partition nucleic acid barcode sequences of the plurality of partition nucleic acid barcode sequences, or complements thereof. Relative cell sizes of the plurality of labeled cells may also be determined (e.g., as described herein).

In some instances, different cell barcode sequences may be attached to different samples of cells, which are then pooled for partition barcoding. For example, in some embodiments, (1) a first population of cells is labeled with a first cell barcode sequence using, e.g., a lipophilic moiety as described herein and (2) a second population of cells is labeled with a second cell barcode sequence using, e.g., a lipophilic moiety as described herein. The labeled first and second populations of cells may then be pooled and co-partitioned with partition barcode molecules (e.g., attached to a bead, such as a gel bead) for barcoding as described elsewhere herein. Any suitable number of samples (e.g., populations of cells) may be labeled with cell barcodes as described herein and pooled (e.g., multiplexed) for analysis, thereby increasing the throughput and reducing the cost of sample analysis.
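
As a non-limiting illustration of how pooled samples might be resolved after sequencing, the following Python sketch assigns each partition barcode to a sample according to the dominant cell barcode observed with it. The barcode sequences, the sample names, the `assign_samples` function, and the 80% dominance threshold are all hypothetical and are chosen only for illustration; they are not taken from the present disclosure.

```python
from collections import Counter, defaultdict

# Hypothetical mapping of cell barcode sequences to the samples they labeled.
SAMPLE_BY_CELL_BARCODE = {
    "ACGTACGT": "sample_1",
    "TGCATGCA": "sample_2",
}

def assign_samples(observations, min_fraction=0.8):
    """Assign each partition barcode to a sample.

    observations: iterable of (partition_barcode, cell_barcode) pairs, one per
    barcoded product. A partition is called for a sample only when that
    sample's cell barcode accounts for at least min_fraction of the recognized
    cell barcode observations (an assumed threshold); otherwise "ambiguous".
    """
    per_partition = defaultdict(Counter)
    for partition_bc, cell_bc in observations:
        sample = SAMPLE_BY_CELL_BARCODE.get(cell_bc)
        if sample is not None:  # ignore unrecognized cell barcodes
            per_partition[partition_bc][sample] += 1

    calls = {}
    for partition_bc, counts in per_partition.items():
        sample, top = counts.most_common(1)[0]
        total = sum(counts.values())
        calls[partition_bc] = sample if top / total >= min_fraction else "ambiguous"
    return calls

# Toy data: P1 is dominated by sample_1, P2 by sample_2, P3 is evenly mixed.
observations = (
    [("P1", "ACGTACGT")] * 9 + [("P1", "TGCATGCA")]
    + [("P2", "TGCATGCA")] * 5
    + [("P3", "ACGTACGT")] * 3 + [("P3", "TGCATGCA")] * 3
)
calls = assign_samples(observations)
```

In this sketch, a partition called "ambiguous" may correspond to a partition containing cells from more than one sample, which is one way the cell barcodes could be used to flag such partitions.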

Enhanced Cell Multiplexing

The methods provided herein may make use of multiple cellular barcodes or tags (e.g., multiple different cell nucleic acid barcode sequences for a given cell). The use of multiple tags may facilitate higher level multiplexing with a reduced number of reagents. Accordingly, the present disclosure provides a method comprising the use of multiple (e.g., two or more) different tags to label a single population of cells. Cell identification in such a scheme is based on a combination of tags, rather than a single tag. Such a method may be referred to as “combinatorial tagging.”

In some cases, the combinatorial tagging methods provided herein may be used to specifically label different populations and conditions. For example, a first set of tags may be used for sample identification, while a second set of tags may be used to associate cells with a given condition. Multiple additional layers of tagging may be incorporated. For example, a first set of tags may be used to indicate a subject from which a cellular sample derives, a second set of tags may be used to indicate a bodily area of the subject from which a cellular sample or portion thereof derives, a third set of tags may be used to indicate a first processing or storage condition, a fourth set of tags may be used to indicate a second processing or storage condition, etc. Tagging of cells may be performed simultaneously or sequentially. For example, a first tag may be provided to a cell prior to provision of a second tag. Alternatively, the first and second tags may be provided at the same time (e.g., in a mixture of tags). In some cases, a matrix-based method may be used for staining. For example, FIG. 27 shows tagging of cells assigned to specific spatial positions (e.g., wells within a well plate). For a microwell plate having 96 microwells, a total of 20 barcodes (8 for 8 rows and 12 for 12 columns) may be used to provide 96 unique cell identifier combinations. Accordingly, many more cell identifiers may be generated with fewer total reagents.
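
The row-and-column scheme described above for a 96-well plate can be sketched in Python as follows. The tag names (`ROW_A` to `ROW_H`, `COL_1` to `COL_12`) and the helper functions are hypothetical placeholders for actual barcode sequences; the sketch only illustrates that 8 + 12 = 20 tags yield 8 × 12 = 96 distinct identifier combinations.

```python
from itertools import product
import string

# Hypothetical tag names standing in for barcode sequences.
row_tags = [f"ROW_{letter}" for letter in string.ascii_uppercase[:8]]  # A-H
col_tags = [f"COL_{number}" for number in range(1, 13)]                # 1-12

def build_tag_matrix(rows, cols):
    """Map every (row tag, column tag) pair to a well name such as 'C7'."""
    return {
        (r, c): r.split("_")[1] + c.split("_")[1]
        for r, c in product(rows, cols)
    }

tag_to_well = build_tag_matrix(row_tags, col_tags)

def identify_well(observed_tags):
    """Return the well for a cell carrying exactly one row tag and one
    column tag; return None if the combination is incomplete or ambiguous."""
    rows = [t for t in observed_tags if t in row_tags]
    cols = [t for t in observed_tags if t in col_tags]
    if len(rows) == 1 and len(cols) == 1:
        return tag_to_well[(rows[0], cols[0])]
    return None

n_reagents = len(row_tags) + len(col_tags)  # number of distinct tags needed
n_identifiers = len(tag_to_well)            # number of distinct combinations
```

For example, `identify_well({"ROW_C", "COL_7"})` returns the well name `"C7"`, while a cell carrying only a row tag returns `None`.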

In addition to providing for greater levels of multiplexing, the use of multiple tags may also provide greater confidence in sample identification, which may be particularly relevant for clinical samples. For example, if each tag is assumed to be about 95% sensitive (e.g., binds to 95% of the intended cells) and 1% non-specific (e.g., binds to 1% of the wrong cells, possibly after pooling and prior to partitioning of cells), and the tags are assumed to bind independently, requiring just 2 tags per sample would result in much better specificity (about 0.01% non-specific labeling, i.e., 1% × 1%) without significant loss of sensitivity (net sensitivity of about 90.2%, i.e., 95% × 95%). Using 2 tags per sample, N(N−1)/2 pairs can be achieved from N tags. Using 3 tags per sample, this increases to N(N−1)(N−2)/6 combinations, i.e., O(N³). Additional schemes may also be used.
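
The arithmetic in the preceding example can be reproduced with the short Python sketch below. It assumes, as a simplification, that tags bind independently of one another; the function names are hypothetical.

```python
from math import comb

def combined_rates(sensitivity, nonspecific, tags_per_sample):
    """Net sensitivity and net non-specific rate when a cell must carry all
    tags_per_sample tags to be assigned to a sample.

    Assumes each tag binds independently (a simplifying assumption).
    """
    net_sensitivity = sensitivity ** tags_per_sample
    net_nonspecific = nonspecific ** tags_per_sample
    return net_sensitivity, net_nonspecific

def n_combinations(n_tags, tags_per_sample):
    """Number of distinct unordered tag combinations available from n_tags."""
    return comb(n_tags, tags_per_sample)

# Per-tag values from the example above: 95% sensitive, 1% non-specific.
net_sens, net_nonspec = combined_rates(0.95, 0.01, tags_per_sample=2)
# net_sens is 0.95 * 0.95 and net_nonspec is 0.01 * 0.01.

pairs_from_10 = n_combinations(10, 2)    # N(N-1)/2 with N = 10
triples_from_10 = n_combinations(10, 3)  # N(N-1)(N-2)/6 with N = 10
```

With 10 tags, for instance, this gives 45 pairs or 120 triples, which illustrates how the number of identifiable samples grows faster than the number of reagents.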

In some cases, first tags and second tags may be provided to a population of cells simultaneously (e.g., within a mixture). In other cases, a cell may be labeled with a first tag (e.g., as described herein) prior to provision of the second tag. Subsequent to labeling with the first tag, the cell may be labeled with the second tag (e.g., as described herein). In some cases, the second tag may couple to the first tag (e.g., via hybridization of complementary sequences of the first and second tags, ligation, chemical binding (e.g., formation of a covalent bond), or another process). In other cases, the second tag may not be directly coupled to the first tag.

First and second tags may label cells according to the same or different mechanisms. The present disclosure provides numerous examples of labeling of cells with tags (e.g., cell nucleic acid barcode molecules comprising cell nucleic acid barcode sequences). In an example, first and second tags may each include lipophilic moieties capable of coupling to cells (e.g., as described herein).

First and second tags may have the same or different characteristics. For example, first tags may comprise barcode sequences having a first length (e.g., between 6 and 20 nucleotides) while second tags may comprise barcode sequences having a second length (e.g., between 6 and 20 nucleotides) that is different than the first length. In another example, first tags may comprise nucleic acid barcode sequences (e.g., as described herein) while second tags may comprise optical labels. Optical labels may be distinguished by, for example, the intensity and wavelength of fluorescence emission upon excitation. Optical labels may comprise fluorescent labels such as fluorescent dyes.

In an example, the present disclosure provides a method of analyzing a plurality of cells, comprising providing a first plurality of cell nucleic acid barcode molecules comprising a first plurality of cell nucleic acid barcode sequences and a second plurality of cell nucleic acid barcode molecules comprising a second plurality of cell nucleic acid barcode sequences. Each cell nucleic acid barcode molecule of the first plurality of cell nucleic acid barcode molecules and the second plurality of cell nucleic acid barcode molecules may comprise a single cell nucleic acid barcode sequence of the first plurality of cell nucleic acid barcode sequences or the second plurality of cell nucleic acid barcode sequences. In some cases, each cell nucleic acid barcode molecule of the first plurality of cell nucleic acid barcode molecules or the second plurality of cell nucleic acid barcode molecules comprises a lipophilic moiety. The lipophilic moiety may comprise cholesterol. The lipophilic moiety may be linked to the first plurality of cell nucleic acid barcode molecules or the second plurality of cell nucleic acid barcode molecules via a linker.

The plurality of cells may be labeled with the first plurality of cell nucleic acid barcode sequences and the second plurality of cell nucleic acid barcode sequences (e.g., as described herein) to generate a plurality of labeled cells. Each labeled cell of the plurality of labeled cells may comprise (i) a different cell nucleic acid barcode sequence of the first plurality of cell nucleic acid barcode sequences and (ii) a different cell nucleic acid barcode sequence of the second plurality of cell nucleic acid barcode sequences. In some cases, the plurality of cells may be labeled with the first plurality of cell nucleic acid barcode sequences and the second plurality of cell nucleic acid barcode sequences simultaneously. In other cases, the plurality of cells is labeled with the first plurality of cell nucleic acid barcode sequences prior to the second plurality of cell nucleic acid barcode sequences. A cell nucleic acid barcode molecule of the second plurality of cell nucleic acid barcode molecules may be coupled to a cell nucleic acid barcode molecule of the first plurality of cell nucleic acid barcode molecules coupled to a given cell of the plurality of cells. In some cases, the second plurality of cell nucleic acid barcode sequences may comprise a sequence complementary to a sequence of the first plurality of cell nucleic acid barcode sequences. The plurality of cells may be labeled with the first plurality of cell nucleic acid barcode sequences and/or the second plurality of cell nucleic acid barcode sequences by binding cell binding moieties, each coupled to a given cell nucleic acid barcode sequence of the first plurality of cell nucleic acid barcode sequences and/or the second plurality of cell nucleic acid barcode sequences, to each cell of the plurality of cells. 
The cell binding moieties may be, for example, antibodies, cell surface receptor binding molecules, receptor ligands, small molecules, pro-bodies, aptamers, monobodies, affimers, darpins, or protein scaffolds. The cell binding moieties may bind to a protein or a cell surface species of cells of the plurality of cells. In some cases, the cell binding moieties may bind to a species common to each cell of the plurality of cells. In some cases, the plurality of cells may be labeled with the first plurality of cell nucleic acid barcode sequences and/or the second plurality of cell nucleic acid barcode sequences by delivering nucleic acid barcode molecules each comprising an individual cell nucleic acid barcode sequence of the first plurality of cell nucleic acid barcode sequences and/or the second plurality of cell nucleic acid barcode sequences to each cell of the plurality of cells with the aid of a cell-penetrating peptide. Alternatively, the plurality of cells may be labeled with the first plurality of cell nucleic acid barcode sequences and/or the second plurality of cell nucleic acid barcode sequences with the aid of liposomes, nanoparticles, electroporation, or mechanical force (e.g., using nanowires or microinjection).

A plurality of partitions (e.g., droplets or wells) comprising the plurality of labeled cells and a plurality of partition nucleic acid barcode sequences may be generated (e.g., as described herein). Each partition of the plurality of partitions may comprise a different partition nucleic acid barcode sequence of the plurality of partition nucleic acid barcode sequences. The plurality of partition nucleic acid barcode sequences may be components of a plurality of partition nucleic acid barcode molecules, which plurality of partition nucleic acid barcode molecules may be coupled to a plurality of beads (e.g., gel beads that may be dissolvable or degradable). Each partition of the plurality of partitions may comprise a single bead of the plurality of beads. The plurality of partition nucleic acid barcode molecules may be releasably coupled to the plurality of beads. The plurality of partition nucleic acid barcode molecules may be releasable from the bead upon application of a stimulus (e.g., a chemical stimulus). In some cases, subsequent to partitioning, partition nucleic acid barcode molecules of the plurality of partition nucleic acid barcode molecules may be released from each bead of the plurality of beads. Each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules coupled to a given bead may comprise a common partition nucleic acid barcode sequence. Each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules may comprise a unique molecular identifier sequence and/or a priming sequence (e.g., a targeted priming sequence or a random priming sequence). In some cases, the plurality of labeled cells may be lysed or permeabilized after partitioning, e.g., to provide access to nucleic acid molecules therein.

A plurality of barcoded nucleic acid products may be synthesized from the plurality of labeled cells, wherein a given barcoded nucleic acid product of the plurality of barcoded nucleic acid products comprises (i) a cell identification sequence comprising a given cell nucleic acid barcode sequence of the first plurality of cell nucleic acid barcode sequences or the second plurality of cell nucleic acid barcode sequences, or a complement of the given cell nucleic acid barcode sequence; and (ii) a partition identification sequence comprising a given partition nucleic acid barcode sequence of the plurality of partition nucleic acid barcode sequences, or a complement of the given partition nucleic acid barcode sequence.

The plurality of labeled cells may be derived from a plurality of cellular samples. A given cell nucleic acid barcode sequence of the first plurality of cell nucleic acid barcode sequences or the second plurality of cell nucleic acid barcode sequences may identify a cellular sample from which an associated cell of the plurality of labeled cells originates. The sample may be derived from a biological fluid (e.g., blood or saliva). In some cases, the first plurality of cell nucleic acid barcode sequences may identify the cellular sample. In some cases, the second plurality of cell nucleic acid barcode sequences may identify a condition to which an associated cell of the plurality of labeled cells is subjected. In some cases, the first plurality of cell nucleic acid barcode sequences and the second plurality of cell nucleic acid barcode sequences may identify a spatial position of an associated cell of the plurality of labeled cells prior to cell partitioning.

In some cases, at least a subset of the plurality of partitions may comprise at least two labeled cells of the plurality of labeled cells. The method may further comprise identifying at least two labeled cells of the plurality of labeled cells as originating from a same partition of the plurality of partitions using (i) cell nucleic acid barcode sequences of the first plurality of cell nucleic acid barcode sequences, or complements thereof, (ii) cell nucleic acid barcode sequences of the second plurality of cell nucleic acid barcode sequences, or complements thereof, and/or (iii) partition nucleic acid barcode sequences of the plurality of partition nucleic acid barcode sequences, or complements thereof. The method may further comprise identifying the first plurality of barcoded nucleic acid products and the second plurality of barcoded nucleic acid products as originating from labeled cells of the plurality of labeled cells.
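
One possible way to use the first and second cell barcode sequences together with the partition barcode sequences to flag multiply occupied partitions is sketched below in Python. The input format (tuples of partition barcode, tag set, and cell barcode), the labels `"first"` and `"second"`, and the function names are hypothetical. The sketch assumes that each labeled cell carries one barcode from each set and that different cells carry different barcodes, so the number of distinct barcodes of either set seen in a partition gives a lower bound on the number of cells in that partition.

```python
from collections import defaultdict

def min_cells_per_partition(observations):
    """Lower bound on the number of labeled cells in each partition.

    observations: iterable of (partition_barcode, tag_set, cell_barcode),
    where tag_set is "first" or "second" (hypothetical labels for the two
    pluralities of cell barcode sequences). Any other tag_set raises KeyError.
    """
    seen = defaultdict(lambda: {"first": set(), "second": set()})
    for partition_bc, tag_set, cell_bc in observations:
        seen[partition_bc][tag_set].add(cell_bc)
    return {
        partition_bc: max(len(sets["first"]), len(sets["second"]))
        for partition_bc, sets in seen.items()
    }

def co_partitioned(observations):
    """Partition barcodes for which at least two labeled cells are inferred."""
    bounds = min_cells_per_partition(observations)
    return sorted(p for p, n in bounds.items() if n >= 2)

# Toy data: P1 shows one barcode of each set; P2 shows two first-set barcodes.
observations = [
    ("P1", "first", "F1"), ("P1", "second", "S1"),
    ("P2", "first", "F1"), ("P2", "first", "F2"), ("P2", "second", "S1"),
]
bounds = min_cells_per_partition(observations)
flagged = co_partitioned(observations)
```

In practice, a read-count threshold would likely be needed to exclude background barcode observations; that step is omitted here for brevity.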

Systems and Methods for Sample Compartmentalization

In an aspect, the systems and methods described herein provide for the compartmentalization, depositing, or partitioning of macromolecular constituent contents of individual biological particles into discrete compartments or partitions (referred to interchangeably herein as partitions), where each partition maintains separation of its own contents from the contents of other partitions. The partition can be a droplet in an emulsion. A partition may comprise one or more other partitions.

A partition of the present disclosure may comprise biological particles and/or macromolecular constituents thereof. A partition may comprise one or more gel beads. A partition may comprise one or more cell beads. A partition may include a single gel bead, a single cell bead, both a single cell bead and single gel bead, two cell beads and a single gel bead, three cell beads and a single gel bead, etc. A cell bead can be a biological particle and/or one or more of its macromolecular constituents encased inside of a gel or polymer matrix, such as via polymerization of a droplet containing the biological particle and precursors capable of being polymerized or gelled. Unique identifiers, such as barcodes, may be injected into the droplets previous to, subsequent to, or concurrently with droplet generation, such as via a microcapsule (e.g., bead), as described further below. Microfluidic channel networks (e.g., on a chip) can be utilized to generate partitions as described herein. Alternative mechanisms may also be employed in the partitioning of individual biological particles, including porous membranes through which aqueous mixtures of cells are extruded into non-aqueous fluids.

The partitions can be flowable within fluid streams. The partitions may comprise, for example, micro-vesicles that have an outer barrier surrounding an inner fluid center or core. In some cases, the partitions may comprise a porous matrix that is capable of entraining and/or retaining materials within its matrix. The partitions can comprise droplets of aqueous fluid within a non-aqueous continuous phase (e.g., oil phase). The partitions can comprise droplets of a first phase within a second phase, wherein the first and second phases are immiscible. A variety of different vessels are described in, for example, U.S. Patent Application Publication No. 2014/0155295, which is entirely incorporated herein by reference for all purposes. Emulsion systems for creating stable droplets in non-aqueous or oil continuous phases are described in, for example, U.S. Patent Application Publication No. 2010/0105112, which is entirely incorporated herein by reference for all purposes.

In the case of droplets in an emulsion, allocating individual biological particles to discrete partitions may in one non-limiting example be accomplished by introducing a flowing stream of biological particles in an aqueous fluid into a flowing stream of a non-aqueous fluid, such that droplets are generated at the junction of the two streams. By providing the aqueous stream at a certain concentration and/or flow rate of biological particles, the occupancy of the resulting partitions (e.g., number of biological particles per partition) can be controlled. Where single biological particle partitions are used, the relative flow rates of the immiscible fluids can be selected such that, on average, the partitions may contain less than one biological particle per partition in order to ensure that those partitions that are occupied are primarily singly occupied. In some cases, partitions among a plurality of partitions may contain at most one biological particle (e.g., bead, cell or cellular material). In some embodiments, the relative flow rates of the fluids can be selected such that a majority of partitions are occupied, for example, allowing for only a small percentage of unoccupied partitions. The flows and channel architectures can be controlled so as to ensure a given number of singly occupied partitions, less than a certain level of unoccupied partitions and/or less than a certain level of multiply occupied partitions. In some embodiments, a partition contains more than one biological particle.
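
The relationship between the average number of biological particles per partition and the resulting occupancy can be illustrated with the Python sketch below. It assumes that particles are loaded independently and at random, so that occupancy follows a Poisson distribution; this is a modeling assumption for illustration, and the function name is hypothetical.

```python
from math import exp

def poisson_occupancy(mean_particles_per_partition):
    """Expected fractions of empty, singly occupied, and multiply occupied
    partitions, assuming Poisson loading with the given positive mean."""
    lam = mean_particles_per_partition
    if lam <= 0:
        raise ValueError("mean loading must be positive")
    empty = exp(-lam)            # P(0 particles)
    single = lam * exp(-lam)     # P(exactly 1 particle)
    multiple = 1.0 - empty - single
    return {
        "empty": empty,
        "single": single,
        "multiple": multiple,
        # Fraction of the occupied partitions holding more than one particle.
        "multiple_among_occupied": multiple / (1.0 - empty),
    }

# Example: an average of 0.1 particles per partition.
occupancy = poisson_occupancy(0.1)
```

Under this model, a mean loading of 0.1 particles per partition leaves roughly nine in ten partitions empty while keeping fewer than 5% of the occupied partitions multiply occupied, consistent with loading at less than one particle per partition on average to favor single occupancy.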

FIG. 1 shows an example of a microfluidic channel structure 100 for partitioning individual biological particles. The channel structure 100 can include channel segments 102, 104, 106 and 108 communicating at a channel junction 110. In operation, a first aqueous fluid 112 that includes suspended biological particles (or cells) 114 may be transported along channel segment 102 into junction 110, while a second fluid 116 that is immiscible with the aqueous fluid 112 is delivered to the junction 110 from each of channel segments 104 and 106 to create discrete droplets 118, 120 of the first aqueous fluid 112 flowing into channel segment 108, and flowing away from junction 110. The channel segment 108 may be fluidically coupled to an outlet reservoir where the discrete droplets can be stored and/or harvested. A discrete droplet generated may include an individual biological particle 114 (such as droplets 118). A discrete droplet generated may include more than one individual biological particle 114 (not shown in FIG. 1), for example at least two biological particles. A discrete droplet may contain no biological particle 114 (such as droplet 120). Each discrete partition may maintain separation of its own contents (e.g., individual biological particle 114) from the contents of other partitions.

The second fluid 116 can comprise an oil, such as a fluorinated oil, that includes a fluorosurfactant for stabilizing the resulting droplets, for example, inhibiting subsequent coalescence of the resulting droplets 118, 120. Examples of particularly useful partitioning fluids and fluorosurfactants are described, for example, in U.S. Patent Application Publication No. 2010/0105112, which is entirely incorporated herein by reference for all purposes.

As will be appreciated, the channel segments described herein may be coupled to any of a variety of different fluid sources or receiving components, including reservoirs, tubing, manifolds, or fluidic components of other systems. As will be appreciated, the microfluidic channel structure 100 may have other geometries. For example, a microfluidic channel structure can have more than one channel junction. For example, a microfluidic channel structure can have 2, 3, 4, or 5 channel segments each carrying biological particles, cell beads, and/or gel beads that meet at a channel junction. Fluid may be directed to flow along one or more channels or reservoirs via one or more fluid flow units. A fluid flow unit can comprise compressors (e.g., providing positive pressure), pumps (e.g., providing negative pressure), actuators, and the like to control flow of the fluid. Fluid may also or otherwise be controlled via applied pressure differentials, centrifugal force, electrokinetic pumping, vacuum, capillary or gravity flow, or the like.

The generated droplets may comprise two subsets of droplets: (1) occupied droplets 118, containing one or more biological particles 114, and (2) unoccupied droplets 120, not containing any biological particles 114. Occupied droplets 118 may comprise singly occupied droplets (having one biological particle) and multiply occupied droplets (having more than one biological particle). As described elsewhere herein, in some cases, the majority of occupied partitions can include no more than one biological particle per occupied partition and some of the generated partitions can be unoccupied (of any biological particle). In some cases, though, some of the occupied partitions may include more than one biological particle. In some cases, the partitioning process may be controlled such that fewer than about 25% of the occupied partitions contain more than one biological particle, and in many cases, fewer than about 20% of the occupied partitions have more than one biological particle, while in some cases, fewer than about 10% or even fewer than about 5% of the occupied partitions include more than one biological particle per partition.

In some cases, it may be desirable to minimize the creation of excessive numbers of empty partitions, such as to reduce costs and/or increase efficiency. While this minimization may be achieved by providing a sufficient number of biological particles (e.g., biological particles 114) at the partitioning junction 110, such as to ensure that at least one biological particle is encapsulated in a partition, the Poissonian distribution may expectedly increase the number of partitions that include multiple biological particles. As such, where singly occupied partitions are to be obtained, at most about 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5% or less of the generated partitions can be unoccupied.
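
The trade-off noted above between empty partitions and multiply occupied partitions can be made concrete with the following Python sketch. Assuming purely Poisson loading (a modeling assumption), it finds by bisection the largest mean loading that keeps the multiply occupied fraction of occupied partitions at or below a target, and reports the corresponding fraction of unoccupied partitions. The function names, search bounds, and iteration count are hypothetical choices.

```python
from math import exp, expm1

def multiplet_fraction(lam):
    """Fraction of occupied partitions holding more than one particle under
    Poisson loading with mean lam > 0.

    Equals 1 - P(1) / (1 - P(0)) = 1 - lam / (e**lam - 1).
    """
    return 1.0 - lam / expm1(lam)

def max_loading_for(target, lo=1e-6, hi=10.0, iterations=100):
    """Largest mean loading whose multiplet fraction is <= target.

    Uses bisection, which is valid because multiplet_fraction increases
    monotonically with the mean loading.
    """
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if multiplet_fraction(mid) <= target:
            lo = mid
        else:
            hi = mid
    return lo

def unoccupied_fraction(lam):
    """Fraction of partitions expected to contain no particle (Poisson)."""
    return exp(-lam)

# Targets taken from the multiple-occupancy levels discussed herein.
for target in (0.25, 0.10, 0.05):
    lam = max_loading_for(target)
    print(f"multiplets <= {target:.0%}: mean loading {lam:.3f}, "
          f"unoccupied {unoccupied_fraction(lam):.1%}")
```

Under the Poisson assumption, tighter limits on multiple occupancy force a larger share of empty partitions, which is why approaches that depart from Poisson loading may be of interest when both quantities are to be kept low.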

In some cases, the flow of one or more of the biological particles (e.g., in channel segment 102), or other fluids directed into the partitioning junction (e.g., in channel segments 104, 106) can be controlled such that, in many cases, no more than about 50% of the generated partitions, no more than about 25% of the generated partitions, or no more than about 10% of the generated partitions are unoccupied. These flows can be controlled so as to present a non-Poissonian distribution of single-occupied partitions while providing lower levels of unoccupied partitions. The above noted ranges of unoccupied partitions can be achieved while still providing any of the single occupancy rates described above. For example, in many cases, the use of the systems and methods described herein can create resulting partitions that have multiple occupancy rates of less than about 25%, less than about 20%, less than about 15%, less than about 10%, and in many cases, less than about 5%, while having unoccupied partitions of less than about 50%, less than about 40%, less than about 30%, less than about 20%, less than about 10%, less than about 5%, or less.

As will be appreciated, the above-described occupancy rates are also applicable to partitions that include both biological particles and additional reagents, including, but not limited to, microcapsules carrying barcoded nucleic acid molecules (e.g., oligonucleotides) (described in relation to FIG. 2). The occupied partitions (e.g., at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 99% of the occupied partitions) can include both a microcapsule (e.g., bead) comprising barcoded nucleic acid molecules and a biological particle.

In another aspect, in addition to or as an alternative to droplet based partitioning, biological particles may be encapsulated within a microcapsule that comprises an outer shell, layer or porous matrix in which is entrained one or more individual biological particles or small groups of biological particles. The microcapsule may include other reagents. Encapsulation of biological particles may be performed by a variety of processes. Such processes may combine an aqueous fluid containing the biological particles with a polymeric precursor material that may be capable of being formed into a gel or other solid or semi-solid matrix upon application of a particular stimulus to the polymer precursor. Such stimuli can include, for example, thermal stimuli (e.g., either heating or cooling), photo-stimuli (e.g., through photo-curing), chemical stimuli (e.g., through crosslinking, polymerization initiation of the precursor (e.g., through added initiators)), or a combination thereof.

Preparation of microcapsules comprising biological particles may be performed by a variety of methods. For example, air knife droplet or aerosol generators may be used to dispense droplets of precursor fluids into gelling solutions in order to form microcapsules that include individual biological particles or small groups of biological particles. Likewise, membrane based encapsulation systems may be used to generate microcapsules comprising encapsulated biological particles as described herein. Microfluidic systems of the present disclosure, such as that shown in FIG. 1, may be readily used in encapsulating cells as described herein. In particular, and with reference to FIG. 1, the aqueous fluid 112 comprising (i) the biological particles 114 and (ii) the polymer precursor material (not shown) is flowed into channel junction 110, where it is partitioned into droplets 118, 120 through the flow of non-aqueous fluid 116. In the case of encapsulation methods, non-aqueous fluid 116 may also include an initiator (not shown) to cause polymerization and/or crosslinking of the polymer precursor to form the microcapsule that includes the entrained biological particles. Examples of polymer precursor/initiator pairs include those described in U.S. Patent Application Publication No. 2014/0378345, which is entirely incorporated herein by reference for all purposes.

For example, in the case where the polymer precursor material comprises a linear polymer material, such as a linear polyacrylamide, PEG, or other linear polymeric material, the activation agent may comprise a cross-linking agent, or a chemical that activates a cross-linking agent within the formed droplets. Likewise, for polymer precursors that comprise polymerizable monomers, the activation agent may comprise a polymerization initiator. For example, in certain cases, where the polymer precursor comprises a mixture of acrylamide monomer with a N,N′-bis-(acryloyl)cystamine (BAC) comonomer, an agent such as tetramethylethylenediamine (TEMED) may be provided within the second fluid streams 116 in channel segments 104 and 106, which can initiate the copolymerization of the acrylamide and BAC into a cross-linked polymer network, or hydrogel.

Upon contact of the second fluid stream 116 with the first fluid stream 112 at junction 110, during formation of droplets, the TEMED may diffuse from the second fluid 116 into the aqueous fluid 112 comprising the linear polyacrylamide, which will activate the crosslinking of the polyacrylamide within the droplets 118, 120, resulting in the formation of gel (e.g., hydrogel) microcapsules, as solid or semi-solid beads or particles entraining the cells 114. Although described in terms of polyacrylamide encapsulation, other ‘activatable’ encapsulation compositions may also be employed in the context of the methods and compositions described herein. For example, formation of alginate droplets followed by exposure to divalent metal ions (e.g., Ca²⁺ ions), can be used as an encapsulation process using the described processes. Likewise, agarose droplets may also be transformed into capsules through temperature based gelling (e.g., upon cooling, etc.).

In some cases, encapsulated biological particles can be selectively releasable from the microcapsule, such as through passage of time or upon application of a particular stimulus that degrades the microcapsule sufficiently to allow the biological particles (e.g., cells), or other contents of the microcapsule, to be released from the microcapsule, such as into a partition (e.g., droplet). For example, in the case of the polyacrylamide polymer described above, degradation of the microcapsule may be accomplished through the introduction of an appropriate reducing agent, such as DTT or the like, to cleave disulfide bonds that cross-link the polymer matrix. See, for example, U.S. Patent Application Publication No. 2014/0378345, which is entirely incorporated herein by reference for all purposes.

The biological particle can be subjected to other conditions sufficient to polymerize or gel the precursors. The conditions sufficient to polymerize or gel the precursors may comprise exposure to heating, cooling, electromagnetic radiation, and/or light. The conditions sufficient to polymerize or gel the precursors may comprise any conditions sufficient to polymerize or gel the precursors. Following polymerization or gelling, a polymer or gel may be formed around the biological particle. The polymer or gel may be diffusively permeable to chemical or biochemical reagents. The polymer or gel may be diffusively impermeable to macromolecular constituents of the biological particle. In this manner, the polymer or gel may act to allow the biological particle to be subjected to chemical or biochemical operations while spatially confining the macromolecular constituents to a region of the droplet defined by the polymer or gel. The polymer or gel may include one or more of disulfide cross-linked polyacrylamide, agarose, alginate, polyvinyl alcohol, polyethylene glycol (PEG)-diacrylate, PEG-acrylate, PEG-thiol, PEG-azide, PEG-alkyne, other acrylates, chitosan, hyaluronic acid, collagen, fibrin, gelatin, or elastin. The polymer or gel may comprise any other polymer or gel.

The polymer or gel may be functionalized to bind to targeted analytes, such as nucleic acids, proteins, carbohydrates, lipids or other analytes. The polymer or gel may be polymerized or gelled via a passive mechanism. The polymer or gel may be stable in alkaline conditions or at elevated temperature. The polymer or gel may have mechanical properties similar to the mechanical properties of the bead. For instance, the polymer or gel may be of a similar size to the bead. The polymer or gel may have a mechanical strength (e.g., tensile strength) similar to that of the bead. The polymer or gel may be of a lower density than an oil. The polymer or gel may be of a density that is roughly similar to that of a buffer. The polymer or gel may have a tunable pore size. The pore size may be chosen to, for instance, retain denatured nucleic acids. The pore size may be chosen to maintain diffusive permeability to exogenous chemicals such as sodium hydroxide (NaOH) and/or endogenous chemicals such as inhibitors. The polymer or gel may be biocompatible. The polymer or gel may maintain or enhance cell viability. The polymer or gel may be biochemically compatible. The polymer or gel may be polymerized and/or depolymerized thermally, chemically, enzymatically, and/or optically.

The polymer may comprise poly(acrylamide-co-acrylic acid) crosslinked with disulfide linkages. The preparation of the polymer may comprise a two-step reaction. In the first activation step, poly(acrylamide-co-acrylic acid) may be exposed to an acylating agent to convert carboxylic acids to esters. For instance, the poly(acrylamide-co-acrylic acid) may be exposed to 4-(4,6-dimethoxy-1,3,5-triazin-2-yl)-4-methylmorpholinium chloride (DMTMM). The poly(acrylamide-co-acrylic acid) may be exposed to other salts of 4-(4,6-dimethoxy-1,3,5-triazin-2-yl)-4-methylmorpholinium. In the second cross-linking step, the ester formed in the first step may be exposed to a disulfide crosslinking agent. For instance, the ester may be exposed to cystamine (2,2′-dithiobis(ethylamine)). Following the two steps, the biological particle may be surrounded by polyacrylamide strands linked together by disulfide bridges. In this manner, the biological particle may be encased inside of or comprise a gel or matrix (e.g., polymer matrix) to form a “cell bead.” A cell bead can contain biological particles (e.g., a cell) or macromolecular constituents (e.g., RNA, DNA, proteins, etc.) of biological particles. A cell bead may include a single cell or multiple cells, or a derivative of the single cell or multiple cells. For example, after lysing and washing the cells, inhibitory components from cell lysates can be washed away and the macromolecular constituents can be bound as cell beads. Systems and methods disclosed herein can be applicable to both cell beads (and/or droplets or other partitions) containing biological particles and cell beads (and/or droplets or other partitions) containing macromolecular constituents of biological particles.

Encapsulated biological particles can provide certain potential advantages of being more storable and more portable than droplet-based partitioned biological particles. Furthermore, in some cases, it may be desirable to allow biological particles to incubate for a select period of time before analysis, such as in order to characterize changes in such biological particles over time, either in the presence or absence of different stimuli. In such cases, encapsulation may allow for longer incubation than partitioning in emulsion droplets, although in some cases, droplet partitioned biological particles may also be incubated for different periods of time, e.g., at least 10 seconds, at least 30 seconds, at least 1 minute, at least 5 minutes, at least 10 minutes, at least 30 minutes, at least 1 hour, at least 2 hours, at least 5 hours, or at least 10 hours or more. The encapsulation of biological particles may constitute the partitioning of the biological particles into which other reagents are co-partitioned. Alternatively or in addition, encapsulated biological particles may be readily deposited into other partitions (e.g., droplets) as described above.

Beads

A partition may comprise one or more unique identifiers, such as barcodes. Barcodes may be previously, subsequently or concurrently delivered to the partitions that hold the compartmentalized or partitioned biological particle(s). For example, barcodes may be injected into droplets previous to, subsequent to, or concurrently with droplet generation. The delivery of the barcodes to a particular partition allows for the later attribution of the characteristics of the individual biological particle to the particular partition. Barcodes may be delivered, for example on a nucleic acid molecule (e.g., an oligonucleotide), to a partition via any suitable mechanism. Barcoded nucleic acid molecules can be delivered to a partition via a microcapsule. A microcapsule, in some instances, can comprise a bead. Beads are described in further detail elsewhere herein.

In some cases, barcoded nucleic acid molecules can be initially associated with the microcapsule and then released from the microcapsule. Release of the barcoded nucleic acid molecules can be passive (e.g., by diffusion out of the microcapsule). In addition or alternatively, release from the microcapsule can be upon application of a stimulus which allows the barcoded nucleic acid molecules to dissociate or to be released from the microcapsule. Such stimulus may disrupt the microcapsule, an interaction that couples the barcoded nucleic acid molecules to or within the microcapsule, or both. Such stimulus can include, for example, a thermal stimulus, photo-stimulus, chemical stimulus (e.g., change in pH or use of a reducing agent(s)), a mechanical stimulus, a radiation stimulus, a biological stimulus (e.g., enzyme), or any combination thereof.

FIG. 2 shows an example of a microfluidic channel structure 200 for delivering barcode carrying beads to droplets. The channel structure 200 can include channel segments 201, 202, 204, 206 and 208 communicating at a channel junction 210. In operation, the channel segment 201 may transport an aqueous fluid 212 that includes a plurality of beads 214 (e.g., with nucleic acid molecules, oligonucleotides, molecular tags) along the channel segment 201 into junction 210. The plurality of beads 214 may be sourced from a suspension of beads. For example, the channel segment 201 may be connected to a reservoir comprising an aqueous suspension of beads 214. The channel segment 202 may transport the aqueous fluid 212 that includes a plurality of biological particles 216 along the channel segment 202 into junction 210. The plurality of biological particles 216 may be sourced from a suspension of biological particles. For example, the channel segment 202 may be connected to a reservoir comprising an aqueous suspension of biological particles 216. In some instances, the aqueous fluid 212 in either the first channel segment 201 or the second channel segment 202, or in both segments, can include one or more reagents, as further described below. A second fluid 218 that is immiscible with the aqueous fluid 212 (e.g., oil) can be delivered to the junction 210 from each of channel segments 204 and 206. Upon meeting of the aqueous fluid 212 from each of channel segments 201 and 202 and the second fluid 218 from each of channel segments 204 and 206 at the channel junction 210, the aqueous fluid 212 can be partitioned as discrete droplets 220 in the second fluid 218 and flow away from the junction 210 along channel segment 208. The channel segment 208 may deliver the discrete droplets to an outlet reservoir fluidly coupled to the channel segment 208, where they may be harvested.

As an alternative, the channel segments 201 and 202 may meet at another junction upstream of the junction 210. At such junction, beads and biological particles may form a mixture that is directed along another channel to the junction 210 to yield droplets 220. The mixture may provide the beads and biological particles in an alternating fashion, such that, for example, a droplet comprises a single bead and a single biological particle.

Beads, biological particles and droplets may flow along channels at substantially regular flow profiles (e.g., at regular flow rates). Such regular flow profiles may permit a droplet to include a single bead and a single biological particle. Such regular flow profiles may permit the droplets to have an occupancy (e.g., droplets having beads and biological particles) greater than 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95%. Such regular flow profiles and devices that may be used to provide such regular flow profiles are provided in, for example, U.S. Patent Publication No. 2015/0292988, which is entirely incorporated herein by reference.
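By way of comparison for the occupancy figures above, the following is a minimal sketch, in Python, of the occupancy that would be expected if beads and biological particles were instead loaded into droplets randomly and independently. The Poisson loading model, the mean loadings, and the function names are assumptions introduced for illustration only and are not taken from the present disclosure.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability that a droplet holds exactly k items under Poisson loading."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def fraction_single_bead_single_particle(mean_beads: float, mean_particles: float) -> float:
    """Fraction of droplets with exactly one bead and exactly one biological particle.

    Assumes (hypothetically) that beads and particles load independently.
    """
    return poisson_pmf(1, mean_beads) * poisson_pmf(1, mean_particles)


def fraction_co_occupied(mean_beads: float, mean_particles: float) -> float:
    """Fraction of droplets holding at least one bead and at least one particle."""
    return (1.0 - poisson_pmf(0, mean_beads)) * (1.0 - poisson_pmf(0, mean_particles))


if __name__ == "__main__":
    # Hypothetical mean loadings of one bead and one particle per droplet.
    print(fraction_single_bead_single_particle(1.0, 1.0))
    print(fraction_co_occupied(1.0, 1.0))
```

Under these assumed mean loadings, the single-bead, single-particle fraction equals e^-2, or roughly 13.5%, which illustrates why a regular (non-random) delivery of beads and biological particles may be desirable for reaching the higher occupancies recited above.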

The second fluid 218 can comprise an oil, such as a fluorinated oil, that includes a fluorosurfactant for stabilizing the resulting droplets, for example, inhibiting subsequent coalescence of the resulting droplets 220.

A discrete droplet that is generated may include an individual biological particle 216. A discrete droplet that is generated may include a barcode or other reagent carrying bead 214. A discrete droplet generated may include both an individual biological particle and a barcode carrying bead, such as droplets 220. In some instances, a discrete droplet may include more than one individual biological particle or no biological particle. In some instances, a discrete droplet may include more than one bead or no bead. A discrete droplet may be unoccupied (e.g., no beads, no biological particles).

Beneficially, a discrete droplet partitioning a biological particle and a barcode carrying bead may effectively allow the attribution of the barcode to macromolecular constituents of the biological particle within the partition. The contents of a partition may remain discrete from the contents of other partitions.
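The attribution described above can be sketched computationally as follows. This is an illustrative Python example only: it assumes that each sequencing read has already been reduced to a (barcode sequence, captured sequence) pair, and the function name and example sequences are hypothetical placeholders.

```python
from collections import defaultdict


def group_reads_by_barcode(reads):
    """Group captured sequences by the barcode sequence attached to them.

    `reads` is an iterable of (barcode, captured_sequence) pairs. Reads that
    share a barcode are attributed to the same partition, and hence to the
    biological particle(s) that occupied that partition.
    """
    grouped = defaultdict(list)
    for barcode, captured_sequence in reads:
        grouped[barcode].append(captured_sequence)
    return dict(grouped)


if __name__ == "__main__":
    # Hypothetical placeholder reads; not real data.
    example_reads = [
        ("AAAC", "GGTACC"),
        ("TTTG", "CCATGG"),
        ("AAAC", "GATTACA"),
    ]
    for barcode, sequences in group_reads_by_barcode(example_reads).items():
        print(barcode, sequences)
```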

As will be appreciated, the channel segments described herein may be coupled to any of a variety of different fluid sources or receiving components, including reservoirs, tubing, manifolds, or fluidic components of other systems. As will be appreciated, the microfluidic channel structure 200 may have other geometries. For example, a microfluidic channel structure can have more than one channel junction. For example, a microfluidic channel structure can have 2, 3, 4, or 5 channel segments each carrying beads that meet at a channel junction. Fluid may be directed to flow along one or more channels or reservoirs via one or more fluid flow units. A fluid flow unit can comprise compressors (e.g., providing positive pressure), pumps (e.g., providing negative pressure), actuators, and the like to control flow of the fluid. Fluid may also or otherwise be controlled via applied pressure differentials, centrifugal force, electrokinetic pumping, vacuum, capillary or gravity flow, or the like.

A bead may be porous, non-porous, solid, semi-solid, semi-fluidic, fluidic, and/or a combination thereof. In some instances, a bead may be dissolvable, disruptable, and/or degradable. In some cases, a bead may not be degradable. In some cases, the bead may be a gel bead. A gel bead may be a hydrogel bead. A gel bead may be formed from molecular precursors, such as a polymeric or monomeric species. A semi-solid bead may be a liposomal bead. Solid beads may comprise metals including iron oxide, gold, and silver. In some cases, the bead may be a silica bead. In some cases, the bead can be rigid. In other cases, the bead may be flexible and/or compressible.

A bead may be of any suitable shape. Examples of bead shapes include, but are not limited to, spherical, non-spherical, oval, oblong, amorphous, circular, cylindrical, and variations thereof.

Beads may be of uniform size or heterogeneous size. In some cases, the diameter of a bead may be at least about 1 micrometer (μm), 5 μm, 10 μm, 20 μm, 30 μm, 40 μm, 50 μm, 60 μm, 70 μm, 80 μm, 90 μm, 100 μm, 250 μm, 500 μm, 1 mm, or greater. In some cases, a bead may have a diameter of less than about 1 μm, 5 μm, 10 μm, 20 μm, 30 μm, 40 μm, 50 μm, 60 μm, 70 μm, 80 μm, 90 μm, 100 μm, 250 μm, 500 μm, 1 mm, or less. In some cases, a bead may have a diameter in the range of about 40-75 μm, 30-75 μm, 20-75 μm, 40-85 μm, 40-95 μm, 20-100 μm, 10-100 μm, 1-100 μm, 20-250 μm, or 20-500 μm.

In certain aspects, beads can be provided as a population or plurality of beads having a relatively monodisperse size distribution. Where it may be desirable to provide relatively consistent amounts of reagents within partitions, maintaining relatively consistent bead characteristics, such as size, can contribute to the overall consistency. In particular, the beads described herein may have size distributions that have a coefficient of variation in their cross-sectional dimensions of less than 50%, less than 40%, less than 30%, less than 20%, and in some cases less than 15%, less than 10%, less than 5%, or less.
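As an illustration of how the coefficient of variation referred to above may be evaluated for a measured bead population, a minimal Python sketch follows. The use of the sample standard deviation, the function names, the default 15% threshold, and the example diameters are assumptions made for illustration.

```python
import statistics


def coefficient_of_variation_percent(diameters_um):
    """Coefficient of variation (standard deviation / mean) as a percentage.

    Uses the sample standard deviation, which is an assumption; a population
    standard deviation could be substituted.
    """
    mean = statistics.fmean(diameters_um)
    return statistics.stdev(diameters_um) / mean * 100.0


def is_relatively_monodisperse(diameters_um, max_cv_percent=15.0):
    """True if the population's CV is below the chosen (hypothetical) threshold."""
    return coefficient_of_variation_percent(diameters_um) < max_cv_percent


if __name__ == "__main__":
    # Hypothetical bead diameters in micrometers; not measured data.
    diameters = [50.0, 52.0, 48.0, 51.0, 49.0]
    print(coefficient_of_variation_percent(diameters))
    print(is_relatively_monodisperse(diameters, max_cv_percent=5.0))
```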

A bead may comprise natural and/or synthetic materials. For example, a bead can comprise a natural polymer, a synthetic polymer or both natural and synthetic polymers. Examples of natural polymers include proteins and sugars such as deoxyribonucleic acid, rubber, cellulose, starch (e.g., amylose, amylopectin), proteins, enzymes, polysaccharides, silks, polyhydroxyalkanoates, chitosan, dextran, collagen, carrageenan, ispaghula, acacia, agar, gelatin, shellac, sterculia gum, xanthan gum, corn sugar gum, guar gum, gum karaya, agarose, alginic acid, alginate, or natural polymers thereof. Examples of synthetic polymers include acrylics, nylons, silicones, spandex, viscose rayon, polycarboxylic acids, polyvinyl acetate, polyacrylamide, polyacrylate, polyethylene glycol, polyurethanes, polylactic acid, silica, polystyrene, polyacrylonitrile, polybutadiene, polycarbonate, polyethylene, polyethylene terephthalate, poly(chlorotrifluoroethylene), poly(ethylene oxide), poly(ethylene terephthalate), polyisobutylene, poly(methyl methacrylate), poly(oxymethylene), polyformaldehyde, polypropylene, poly(tetrafluoroethylene), poly(vinyl acetate), poly(vinyl alcohol), poly(vinyl chloride), poly(vinylidene dichloride), poly(vinylidene difluoride), poly(vinyl fluoride) and/or combinations (e.g., co-polymers) thereof. Beads may also be formed from materials other than polymers, including lipids, micelles, ceramics, glass-ceramics, material composites, metals, other inorganic materials, and others.

In some instances, the bead may contain molecular precursors (e.g., monomers or polymers), which may form a polymer network via polymerization of the molecular precursors. In some cases, a precursor may be an already polymerized species capable of undergoing further polymerization via, for example, a chemical cross-linkage. In some cases, a precursor can comprise one or more of an acrylamide or a methacrylamide monomer, oligomer, or polymer. In some cases, the bead may comprise prepolymers, which are oligomers capable of further polymerization. For example, polyurethane beads may be prepared using prepolymers. In some cases, the bead may contain individual polymers that may be further polymerized together. In some cases, beads may be generated via polymerization of different precursors, such that they comprise mixed polymers, co-polymers, and/or block co-polymers. In some cases, the bead may comprise covalent or ionic bonds between polymeric precursors (e.g., monomers, oligomers, linear polymers), nucleic acid molecules (e.g., oligonucleotides), primers, and other entities. In some cases, the covalent bonds can be carbon-carbon bonds or thioether bonds.

Cross-linking may be permanent or reversible, depending upon the particular cross-linker used. Reversible cross-linking may allow for the polymer to linearize or dissociate under appropriate conditions. In some cases, reversible cross-linking may also allow for reversible attachment of a material bound to the surface of a bead. In some cases, a cross-linker may form disulfide linkages. In some cases, the chemical cross-linker forming disulfide linkages may be cystamine or a modified cystamine.

In some cases, disulfide linkages can be formed between molecular precursor units (e.g., monomers, oligomers, or linear polymers) or precursors incorporated into a bead and nucleic acid molecules (e.g., oligonucleotides). Cystamine (including modified cystamines), for example, is an organic agent comprising a disulfide bond that may be used as a crosslinker agent between individual monomeric or polymeric precursors of a bead. Polyacrylamide may be polymerized in the presence of cystamine or a species comprising cystamine (e.g., a modified cystamine) to generate polyacrylamide gel beads comprising disulfide linkages (e.g., chemically degradable beads comprising chemically-reducible cross-linkers). The disulfide linkages may permit the bead to be degraded (or dissolved) upon exposure of the bead to a reducing agent.

In some cases, chitosan, a linear polysaccharide polymer, may be crosslinked with glutaraldehyde via hydrophilic chains to form a bead. Crosslinking of chitosan polymers may be achieved by chemical reactions that are initiated by heat, pressure, change in pH, and/or radiation.

In some cases, a bead may comprise an acrydite moiety, which in certain aspects may be used to attach one or more nucleic acid molecules (e.g., barcode sequence, barcoded nucleic acid molecule, barcoded oligonucleotide, primer, or other oligonucleotide) to the bead. In some cases, an acrydite moiety can refer to an acrydite analogue generated from the reaction of acrydite with one or more species, such as, the reaction of acrydite with other monomers and cross-linkers during a polymerization reaction. Acrydite moieties may be modified to form chemical bonds with a species to be attached, such as a nucleic acid molecule (e.g., barcode sequence, barcoded nucleic acid molecule, barcoded oligonucleotide, primer, or other oligonucleotide). Acrydite moieties may be modified with thiol groups capable of forming a disulfide bond or may be modified with groups already comprising a disulfide bond. The thiol or disulfide (via disulfide exchange) may be used as an anchor point for a species to be attached or another part of the acrydite moiety may be used for attachment. In some cases, attachment can be reversible, such that when the disulfide bond is broken (e.g., in the presence of a reducing agent), the attached species is released from the bead. In other cases, an acrydite moiety can comprise a reactive hydroxyl group that may be used for attachment.

Functionalization of beads for attachment of nucleic acid molecules (e.g., oligonucleotides) may be achieved through a wide range of different approaches, including activation of chemical groups within a polymer, incorporation of active or activatable functional groups in the polymer structure, or attachment at the pre-polymer or monomer stage in bead production.

For example, precursors (e.g., monomers, cross-linkers) that are polymerized to form a bead may comprise acrydite moieties, such that when a bead is generated, the bead also comprises acrydite moieties. The acrydite moieties can be attached to a nucleic acid molecule (e.g., oligonucleotide), which may include a priming sequence (e.g., a primer for amplifying target nucleic acids, random primer (e.g., a random N-mer), primer sequence for messenger RNA (e.g., a polyT sequence)) and/or one or more barcode sequences. The one or more barcode sequences may include sequences that are the same for all nucleic acid molecules coupled to a given bead and/or sequences that are different across all nucleic acid molecules coupled to the given bead. The nucleic acid molecule may be incorporated into the bead.

In some cases, the nucleic acid molecule can comprise a functional sequence, for example, for attachment to a sequencing flow cell, such as, for example, a P5 sequence for Illumina® sequencing. In some cases, the nucleic acid molecule or derivative thereof (e.g., oligonucleotide or polynucleotide generated from the nucleic acid molecule) can comprise another functional sequence, such as, for example, a P7 sequence for attachment to a sequencing flow cell for Illumina sequencing. In some cases, the nucleic acid molecule can comprise a barcode sequence. In some cases, the primer can further comprise a unique molecular identifier (UMI). In some cases, the primer can comprise an R1 primer sequence for Illumina sequencing. In some cases, the primer can comprise an R2 primer sequence for Illumina sequencing. Examples of such nucleic acid molecules (e.g., oligonucleotides, polynucleotides, etc.) and uses thereof, as may be used with compositions, devices, methods and systems of the present disclosure, are provided in U.S. Patent Pub. Nos. 2014/0378345 and 2015/0376609, each of which is entirely incorporated herein by reference.
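The arrangement of sequences described above can be illustrated with a minimal Python sketch. The ordering of the segments (functional sequence, barcode sequence, UMI, priming sequence), the segment lengths, the function name, and all sequences shown are hypothetical placeholders chosen for illustration; they are not actual P5, P7, R1 or R2 sequences and are not taken from the present disclosure.

```python
import random


def build_bead_oligos(functional_seq, bead_barcode, priming_seq, n_oligos,
                      umi_length=10, seed=0):
    """Assemble illustrative oligonucleotide sequences for a single bead.

    All oligos on the bead share `functional_seq`, `bead_barcode` and
    `priming_seq`; each oligo receives its own randomly drawn UMI, so the UMI
    is the portion that differs across molecules on the same bead.
    """
    rng = random.Random(seed)
    oligos = []
    for _ in range(n_oligos):
        umi = "".join(rng.choice("ACGT") for _ in range(umi_length))
        oligos.append(functional_seq + bead_barcode + umi + priming_seq)
    return oligos


if __name__ == "__main__":
    # Placeholder segments only (hypothetical, for illustration).
    placeholder_functional = "GATTACAGATTACA"
    placeholder_barcode = "ACGTACGTACGTACGT"
    poly_t = "T" * 30  # example priming sequence for messenger RNA
    for oligo in build_bead_oligos(placeholder_functional, placeholder_barcode,
                                   poly_t, n_oligos=3):
        print(oligo)
```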

In some cases, precursors comprising a functional group that is reactive or capable of being activated such that it becomes reactive can be polymerized with other precursors to generate gel beads comprising the activated or activatable functional group. The functional group may then be used to attach additional species (e.g., disulfide linkers, primers, other oligonucleotides, etc.) to the gel beads. For example, some precursors comprising a carboxylic acid (COOH) group can co-polymerize with other precursors to form a gel bead that also comprises a COOH functional group. In some cases, acrylic acid (a species comprising free COOH groups), acrylamide, and bis(acryloyl)cystamine can be co-polymerized together to generate a gel bead comprising free COOH groups. The COOH groups of the gel bead can be activated (e.g., via 1-Ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDC) and N-Hydroxysuccinimide (NHS) or 4-(4,6-Dimethoxy-1,3,5-triazin-2-yl)-4-methylmorpholinium chloride (DMTMM)) such that they are reactive (e.g., reactive to amine functional groups where EDC/NHS or DMTMM are used for activation). The activated COOH groups can then react with an appropriate species (e.g., a species comprising an amine functional group where the carboxylic acid groups are activated to be reactive with an amine functional group) comprising a moiety to be linked to the bead.

Beads comprising disulfide linkages in their polymeric network may be functionalized with additional species via reduction of some of the disulfide linkages to free thiols. The disulfide linkages may be reduced via, for example, the action of a reducing agent (e.g., DTT, TCEP, etc.) to generate free thiol groups, without dissolution of the bead. Free thiols of the beads can then react with free thiols of a species or a species comprising another disulfide bond (e.g., via thiol-disulfide exchange) such that the species can be linked to the beads (e.g., via a generated disulfide bond). In some cases, free thiols of the beads may react with any other suitable group. For example, free thiols of the beads may react with species comprising an acrydite moiety. The free thiol groups of the beads can react with the acrydite via Michael addition chemistry, such that the species comprising the acrydite is linked to the bead. In some cases, uncontrolled reactions can be prevented by inclusion of a thiol capping agent such as N-ethylmaleimide or iodoacetate.

Activation of disulfide linkages within a bead can be controlled such that only a small number of disulfide linkages are activated. Control may be exerted, for example, by controlling the concentration of a reducing agent used to generate free thiol groups and/or concentration of reagents used to form disulfide bonds in bead polymerization. In some cases, a low concentration (e.g., molecules of reducing agent:gel bead ratios of less than or equal to about 1:100,000,000,000, less than or equal to about 1:10,000,000,000, less than or equal to about 1:1,000,000,000, less than or equal to about 1:100,000,000, less than or equal to about 1:10,000,000, less than or equal to about 1:1,000,000, less than or equal to about 1:100,000, less than or equal to about 1:10,000) of reducing agent may be used for reduction. Controlling the number of disulfide linkages that are reduced to free thiols may be useful in ensuring bead structural integrity during functionalization. In some cases, optically-active agents, such as fluorescent dyes may be coupled to beads via free thiol groups of the beads and used to quantify the number of free thiols present in a bead and/or track a bead.

In some cases, addition of moieties to a gel bead after gel bead formation may be advantageous. For example, addition of an oligonucleotide (e.g., barcoded oligonucleotide) after gel bead formation may avoid loss of the species during chain transfer termination that can occur during polymerization. Moreover, smaller precursors (e.g., monomers or cross linkers that do not comprise side chain groups and linked moieties) may be used for polymerization and can be minimally hindered from growing chain ends due to viscous effects. In some cases, functionalization after gel bead synthesis can minimize exposure of species (e.g., oligonucleotides) to be loaded with potentially damaging agents (e.g., free radicals) and/or chemical environments. In some cases, the generated gel may possess an upper critical solution temperature (UCST) that can permit temperature driven swelling and collapse of a bead. Such functionality may aid in oligonucleotide (e.g., a primer) infiltration into the bead during subsequent functionalization of the bead with the oligonucleotide. Post-production functionalization may also be useful in controlling loading ratios of species in beads, such that, for example, the variability in loading ratio is minimized. Species loading may also be performed in a batch process such that a plurality of beads can be functionalized with the species in a single batch.

A bead injected or otherwise introduced into a partition may comprise releasably, cleavably, or reversibly attached barcodes. A bead injected or otherwise introduced into a partition may comprise activatable barcodes. A bead injected or otherwise introduced into a partition may be a degradable, disruptable, or dissolvable bead.

Barcodes can be releasably, cleavably or reversibly attached to the beads such that barcodes can be released or be releasable through cleavage of a linkage between the barcode molecule and the bead, or released through degradation of the underlying bead itself, allowing the barcodes to be accessed or be accessible by other reagents, or both. In non-limiting examples, cleavage may be achieved through reduction of disulfide bonds, use of restriction enzymes, photo-activated cleavage, or cleavage via other types of stimuli (e.g., chemical, thermal, pH, enzymatic, etc.) and/or reactions, such as described elsewhere herein. Releasable barcodes may sometimes be referred to as being activatable, in that they are available for reaction once released. Thus, for example, an activatable barcode may be activated by releasing the barcode from a bead (or other suitable type of partition described herein). Other activatable configurations are also envisioned in the context of the described methods and systems.

In addition to, or as an alternative to, the cleavable linkages between the beads and the associated molecules, such as barcode containing nucleic acid molecules (e.g., barcoded oligonucleotides), the beads may be degradable, disruptable, or dissolvable spontaneously or upon exposure to one or more stimuli (e.g., temperature changes, pH changes, exposure to particular chemical species or phase, exposure to light, reducing agent, etc.). In some cases, a bead may be dissolvable, such that material components of the beads are solubilized when exposed to a particular chemical species or an environmental change, such as a change in temperature or a change in pH. In some cases, a gel bead can be degraded or dissolved at elevated temperature and/or in basic conditions. In some cases, a bead may be thermally degradable such that when the bead is exposed to an appropriate change in temperature (e.g., heat), the bead degrades. Degradation or dissolution of a bead bound to a species (e.g., a nucleic acid molecule, e.g., barcoded oligonucleotide) may result in release of the species from the bead.

As will be appreciated from the above disclosure, the degradation of a bead may refer to the disassociation of a bound or entrained species from a bead, both with and without structurally degrading the physical bead itself. For example, the degradation of the bead may involve cleavage of a cleavable linkage via one or more species and/or methods described elsewhere herein. In another example, entrained species may be released from beads through osmotic pressure differences due to, for example, changing chemical environments. By way of example, alteration of bead pore sizes due to osmotic pressure differences can generally occur without structural degradation of the bead itself. In some cases, an increase in pore size due to osmotic swelling of a bead can permit the release of entrained species within the bead. In other cases, osmotic shrinking of a bead may cause a bead to better retain an entrained species due to pore size contraction.

A degradable bead may be introduced into a partition, such as a droplet of an emulsion or a well, such that the bead degrades within the partition and any associated species (e.g., oligonucleotides) are released within the droplet when the appropriate stimulus is applied. The free species (e.g., oligonucleotides, nucleic acid molecules) may interact with other reagents contained in the partition. For example, a polyacrylamide bead comprising cystamine and linked, via a disulfide bond, to a barcode sequence, may be combined with a reducing agent within a droplet of a water-in-oil emulsion. Within the droplet, the reducing agent can break the various disulfide bonds, resulting in bead degradation and release of the barcode sequence into the aqueous, inner environment of the droplet. In another example, heating of a droplet comprising a bead-bound barcode sequence in basic solution may also result in bead degradation and release of the attached barcode sequence into the aqueous, inner environment of the droplet.

Any suitable number of molecular tag molecules (e.g., primer, barcoded oligonucleotide) can be associated with a bead such that, upon release from the bead, the molecular tag molecules (e.g., primer, e.g., barcoded oligonucleotide) are present in the partition at a pre-defined concentration. Such pre-defined concentration may be selected to facilitate certain reactions for generating a sequencing library, e.g., nucleic acid extension, within the partition. In some cases, the pre-defined concentration of the primer can be limited by the process of producing nucleic acid molecule (e.g., oligonucleotide) bearing beads.
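
By way of illustration only, the relationship between the number of molecules carried by a bead and the concentration reached in the partition after release can be sketched as below. The function names, the assumption of complete release, and the example values (molecules per bead, partition volume) are hypothetical and are not taken from the present disclosure.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def released_concentration_molar(molecules_per_bead: float,
                                 partition_volume_pl: float,
                                 beads_per_partition: int = 1) -> float:
    """Molar concentration in a partition, assuming every bead-associated
    molecule is released and mixes uniformly in the partition volume."""
    if partition_volume_pl <= 0:
        raise ValueError("partition volume must be positive")
    volume_liters = partition_volume_pl * 1e-12  # 1 pL = 1e-12 L
    moles = molecules_per_bead * beads_per_partition / AVOGADRO
    return moles / volume_liters


def molecules_needed(target_molar: float, partition_volume_pl: float) -> float:
    """Number of molecules a bead would need to carry to reach a target
    concentration in a partition of the given volume (same assumptions)."""
    return target_molar * partition_volume_pl * 1e-12 * AVOGADRO


# Hypothetical example: about 6e8 molecules released into a 1000 pL droplet.
conc = released_concentration_molar(6.02214076e8, 1000.0)
print(f"{conc * 1e6:.3f} micromolar")
```

Under these assumptions, the same relationship can be inverted (as in `molecules_needed`) to estimate how heavily a bead would have to be loaded for a given pre-defined concentration.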

In some cases, beads can be non-covalently loaded with one or more reagents. The beads can be non-covalently loaded by, for instance, subjecting the beads to conditions sufficient to swell the beads, allowing sufficient time for the reagents to diffuse into the interiors of the beads, and subjecting the beads to conditions sufficient to de-swell the beads. The swelling of the beads may be accomplished, for instance, by placing the beads in a thermodynamically favorable solvent, subjecting the beads to a higher or lower temperature, subjecting the beads to a higher or lower ion concentration, and/or subjecting the beads to an electric field. The swelling of the beads may also be accomplished by various other swelling methods. The de-swelling of the beads may be accomplished, for instance, by transferring the beads into a thermodynamically unfavorable solvent, subjecting the beads to a lower or higher temperature, subjecting the beads to a lower or higher ion concentration, and/or removing an electric field. The de-swelling of the beads may also be accomplished by various other de-swelling methods. Transferring the beads into the unfavorable solvent may cause pores in the beads to shrink. The shrinking may then hinder reagents within the beads from diffusing out of the interiors of the beads. The hindrance may be due to steric interactions between the reagents and the interiors of the beads. The transfer may be accomplished microfluidically. For instance, the transfer may be achieved by moving the beads from one co-flowing solvent stream to a different co-flowing solvent stream. The swellability and/or pore size of the beads may be adjusted by changing the polymer composition of the beads.

In some cases, an acrydite moiety linked to a precursor, another species linked to a precursor, or a precursor itself can comprise a labile bond, such as a chemically, thermally, or photo-sensitive bond (e.g., a disulfide bond, a UV sensitive bond, or the like). Once acrydite moieties or other moieties comprising a labile bond are incorporated into a bead, the bead may also comprise the labile bond. The labile bond may be, for example, useful in reversibly linking (e.g., covalently linking) species (e.g., barcodes, primers, etc.) to a bead. In some cases, a thermally labile bond may include a nucleic acid hybridization based attachment, e.g., where an oligonucleotide is hybridized to a complementary sequence that is attached to the bead, such that thermal melting of the hybrid releases the oligonucleotide, e.g., a barcode containing sequence, from the bead or microcapsule.

The addition of multiple types of labile bonds to a gel bead may result in the generation of a bead capable of responding to varied stimuli. Each type of labile bond may be sensitive to an associated stimulus (e.g., chemical stimulus, light, temperature, enzymatic, etc.) such that release of species attached to a bead via each labile bond may be controlled by the application of the appropriate stimulus. Such functionality may be useful in controlled release of species from a gel bead. In some cases, another species comprising a labile bond may be linked to a gel bead after gel bead formation via, for example, an activated functional group of the gel bead as described above. As will be appreciated, barcodes that are releasably, cleavably or reversibly attached to the beads described herein include barcodes that are released or releasable through cleavage of a linkage between the barcode molecule and the bead, or that are released through degradation of the underlying bead itself, allowing the barcodes to be accessed or accessible by other reagents, or both.

The barcodes that are releasable as described herein may sometimes be referred to as being activatable, in that they are available for reaction once released. Thus, for example, an activatable barcode may be activated by releasing the barcode from a bead (or other suitable type of partition described herein). Other activatable configurations are also envisioned in the context of the described methods and systems.

In addition to thermally cleavable bonds, disulfide bonds and UV sensitive bonds, other non-limiting examples of labile bonds that may be coupled to a precursor or bead include an ester linkage (e.g., cleavable with an acid, a base, or hydroxylamine), a vicinal diol linkage (e.g., cleavable via sodium periodate), a Diels-Alder linkage (e.g., cleavable via heat), a sulfone linkage (e.g., cleavable via a base), a silyl ether linkage (e.g., cleavable via an acid), a glycosidic linkage (e.g., cleavable via an amylase), a peptide linkage (e.g., cleavable via a protease), or a phosphodiester linkage (e.g., cleavable via a nuclease (e.g., DNase)). A bond may be cleavable via other nucleic acid molecule targeting enzymes, such as restriction enzymes (e.g., restriction endonucleases), as described further below.
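
As a non-limiting illustration, the labile linkages listed above and their example cleaving stimuli can be arranged as a lookup table, which makes it straightforward to check which linkages a given stimulus would be expected to cleave and whether two linkages on the same bead could be released independently. The table only restates the examples given above; the function names and the data layout are hypothetical.

```python
# Example stimuli for each labile linkage, restating the list in the text.
LABILE_BOND_STIMULI = {
    "ester": {"acid", "base", "hydroxylamine"},
    "vicinal diol": {"sodium periodate"},
    "Diels-Alder": {"heat"},
    "sulfone": {"base"},
    "silyl ether": {"acid"},
    "glycosidic": {"amylase"},
    "peptide": {"protease"},
    "phosphodiester": {"nuclease"},
}


def linkages_cleaved_by(stimulus: str) -> list[str]:
    """Linkages whose listed example stimuli include the given stimulus."""
    return sorted(bond for bond, stimuli in LABILE_BOND_STIMULI.items()
                  if stimulus in stimuli)


def share_stimulus(bond_a: str, bond_b: str) -> bool:
    """True if one listed stimulus would cleave both linkages, i.e. the two
    linkages are not independently releasable by every listed stimulus."""
    return bool(LABILE_BOND_STIMULI[bond_a] & LABILE_BOND_STIMULI[bond_b])


print(linkages_cleaved_by("base"))                 # ['ester', 'sulfone']
print(share_stimulus("peptide", "vicinal diol"))   # False
```

In this sketch, a bead carrying one species via a peptide linkage and another via a vicinal diol linkage shares no listed stimulus between the two, which is consistent with the controlled, stimulus-specific release described above.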

Species may be encapsulated in beads during bead generation (e.g., during polymerization of precursors). Such species may or may not participate in polymerization. Such species may be entered into polymerization reaction mixtures such that generated beads comprise the species upon bead formation. In some cases, such species may be added to the gel beads after formation. Such species may include, for example, nucleic acid molecules (e.g., oligonucleotides), reagents for a nucleic acid ligation, extension, or amplification reactions (e.g., primers, polymerases, dNTPs, co-factors (e.g., ionic co-factors), buffers) including those described herein, reagents for enzymatic reactions (e.g., enzymes, co-factors, substrates, buffers), reagents for nucleic acid modification reactions such as polymerization, ligation, or digestion, and/or reagents for template preparation (e.g., tagmentation) for one or more sequencing platforms (e.g., Nextera® for Illumina®). Such species may include one or more enzymes described herein, including without limitation, polymerase, reverse transcriptase, restriction enzymes (e.g., endonuclease), transposase, ligase, proteinase K, DNAse, etc. Such species may include one or more reagents described elsewhere herein (e.g., lysis agents, inhibitors, inactivating agents, chelating agents, stimulus). Trapping of such species may be controlled by the polymer network density generated during polymerization of precursors, control of ionic charge within the gel bead (e.g., via ionic species linked to polymerized species), or by the release of other species. Encapsulated species may be released from a bead upon bead degradation and/or by application of a stimulus capable of releasing the species from the bead. Alternatively or in addition, species may be partitioned in a partition (e.g., droplet) during or subsequent to partition formation. Such species may include, without limitation, the abovementioned species that may also be encapsulated in a bead.

A degradable bead may comprise one or more species with a labile bond such that, when the bead/species is exposed to the appropriate stimuli, the bond is broken and the bead degrades. The labile bond may be a chemical bond (e.g., covalent bond, ionic bond) or may be another type of physical interaction (e.g., van der Waals interactions, dipole-dipole interactions, etc.). In some cases, a crosslinker used to generate a bead may comprise a labile bond. Upon exposure to the appropriate conditions, the labile bond can be broken and the bead degraded. For example, upon exposure of a polyacrylamide gel bead comprising cystamine crosslinkers to a reducing agent, the disulfide bonds of the cystamine can be broken and the bead degraded.

A degradable bead may be useful in more quickly releasing an attached species (e.g., a nucleic acid molecule, a barcode sequence, a primer, etc.) from the bead when the appropriate stimulus is applied to the bead as compared to a bead that does not degrade. For example, for a species bound to an inner surface of a porous bead or in the case of an encapsulated species, the species may have greater mobility and accessibility to other species in solution upon degradation of the bead. In some cases, a species may also be attached to a degradable bead via a degradable linker (e.g., disulfide linker). The degradable linker may respond to the same stimuli as the degradable bead or the two degradable species may respond to different stimuli. For example, a barcode sequence may be attached, via a disulfide bond, to a polyacrylamide bead comprising cystamine. Upon exposure of the barcoded bead to a reducing agent, the bead degrades and the barcode sequence is released upon breakage of both the disulfide linkage between the barcode sequence and the bead and the disulfide linkages of the cystamine in the bead.

Where degradable beads are provided, it may be beneficial to avoid exposing such beads to the stimulus or stimuli that cause such degradation prior to a given time, in order to, for example, avoid premature bead degradation and issues that arise from such degradation, including for example poor flow characteristics and aggregation. By way of example, where beads comprise reducible cross-linking groups, such as disulfide groups, it will be desirable to avoid contacting such beads with reducing agents, e.g., DTT or other disulfide cleaving reagents. In such cases, treatments of the beads described herein will, in some cases, be provided free of reducing agents, such as DTT. Because reducing agents are often provided in commercial enzyme preparations, it may be desirable to provide reducing agent free (or DTT free) enzyme preparations in treating the beads described herein. Examples of such enzymes include, e.g., polymerase enzyme preparations, reverse transcriptase enzyme preparations, ligase enzyme preparations, as well as many other enzyme preparations that may be used to treat the beads described herein. The terms “reducing agent free” or “DTT free” preparations can refer to a preparation having less than about 1/10th, less than about 1/50th, or even less than about 1/100th of the lower ranges for such materials used in degrading the beads. For example, for DTT, the reducing agent free preparation can have less than about 0.01 millimolar (mM), 0.005 mM, 0.001 mM, 0.0005 mM, or even less than about 0.0001 mM DTT. In many cases, the amount of DTT can be undetectable.
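
For illustration only, the fractional definition of a “reducing agent free” preparation given above can be expressed as a simple check. The function names are hypothetical, and the 0.1 mM lower degrading concentration used in the example is an assumed input chosen for the arithmetic, not a value prescribed for any particular bead.

```python
def reducing_agent_free_limit_mm(lower_degrading_conc_mm: float,
                                 fraction: float = 1 / 10) -> float:
    """Upper limit (mM) for a 'reducing agent free' preparation, taken as a
    fraction (e.g., 1/10, 1/50, 1/100) of the lower concentration used to
    degrade the beads."""
    if lower_degrading_conc_mm <= 0:
        raise ValueError("degrading concentration must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return lower_degrading_conc_mm * fraction


def is_reducing_agent_free(measured_mm: float,
                           lower_degrading_conc_mm: float,
                           fraction: float = 1 / 10) -> bool:
    """True if the measured concentration is below the fractional limit."""
    return measured_mm < reducing_agent_free_limit_mm(
        lower_degrading_conc_mm, fraction)


# Assumed lower degrading concentration of 0.1 mM (hypothetical input).
for frac in (1 / 10, 1 / 50, 1 / 100):
    limit = reducing_agent_free_limit_mm(0.1, frac)
    print(f"fraction {frac:.2f}: limit {limit:.4f} mM")
```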

Numerous chemical triggers may be used to trigger the degradation of beads. Examples of these chemical changes may include, but are not limited to pH-mediated changes to the integrity of a component within the bead, degradation of a component of a bead via cleavage of cross-linked bonds, and depolymerization of a component of a bead.

In some embodiments, a bead may be formed from materials that comprise degradable chemical crosslinkers, such as BAC or cystamine. Degradation of such degradable crosslinkers may be accomplished through a number of mechanisms. In some examples, a bead may be contacted with a chemical degrading agent that may induce oxidation, reduction or other chemical changes. For example, a chemical degrading agent may be a reducing agent, such as dithiothreitol (DTT). Additional examples of reducing agents may include β-mercaptoethanol, (2S)-2-amino-1,4-dimercaptobutane (dithiobutylamine or DTBA), tris(2-carboxyethyl) phosphine (TCEP), or combinations thereof. A reducing agent may degrade the disulfide bonds formed between gel precursors forming the bead, and thus, degrade the bead. In other cases, a change in pH of a solution, such as an increase in pH, may trigger degradation of a bead. In other cases, exposure to an aqueous solution, such as water, may trigger hydrolytic degradation, and thus degradation of the bead.

Beads may also be induced to release their contents upon the application of a thermal stimulus. A change in temperature can cause a variety of changes to a bead. For example, heat can cause a solid bead to liquefy. A change in heat may cause melting of a bead such that a portion of the bead degrades. In other cases, heat may increase the internal pressure of the bead components such that the bead ruptures or explodes. Heat may also act upon heat-sensitive polymers used as materials to construct beads.

Any suitable agent may degrade beads. In some embodiments, changes in temperature or pH may be used to degrade thermo-sensitive or pH-sensitive bonds within beads. In some embodiments, chemical degrading agents may be used to degrade chemical bonds within beads by oxidation, reduction or other chemical changes. For example, a chemical degrading agent may be a reducing agent, such as DTT, wherein DTT may degrade the disulfide bonds formed between a crosslinker and gel precursors, thus degrading the bead. In some embodiments, a reducing agent may be added to degrade the bead, which may or may not cause the bead to release its contents. Examples of reducing agents may include dithiothreitol (DTT), β-mercaptoethanol, (2S)-2-amino-1,4-dimercaptobutane (dithiobutylamine or DTBA), tris(2-carboxyethyl) phosphine (TCEP), or combinations thereof. The reducing agent may be present at a concentration of about 0.1 mM, 0.5 mM, 1 mM, 5 mM, or 10 mM. The reducing agent may be present at a concentration of at least about 0.1 mM, 0.5 mM, 1 mM, 5 mM, 10 mM, or greater than 10 mM. The reducing agent may be present at a concentration of at most about 10 mM, 5 mM, 1 mM, 0.5 mM, 0.1 mM, or less.

Any suitable number of molecular tag molecules (e.g., primer, barcoded oligonucleotide) can be associated with a bead such that, upon release from the bead, the molecular tag molecules (e.g., primer, e.g., barcoded oligonucleotide) are present in the partition at a pre-defined concentration. Such pre-defined concentration may be selected to facilitate certain reactions for generating a sequencing library, e.g., nucleic acid extension, amplification, or ligation within the partition. In some cases, the pre-defined concentration of the primer can be limited by the process of producing oligonucleotide bearing beads.

Although FIG. 1 and FIG. 2 have been described in terms of providing substantially singly occupied partitions, above, in certain cases, it may be desirable to provide multiply occupied partitions, e.g., containing two, three, four or more cells and/or microcapsules (e.g., beads) comprising barcoded nucleic acid molecules (e.g., oligonucleotides) within a single partition. Accordingly, as noted above, the flow characteristics of the biological particle and/or bead containing fluids and partitioning fluids may be controlled to provide for such multiply occupied partitions. In particular, the flow parameters may be controlled to provide a given occupancy rate at greater than about 50% of the partitions, greater than about 75%, and in some cases greater than about 80%, 90%, 95%, or higher.
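
As a hedged illustration of how occupancy rates relate to loading, the sketch below assumes that particles are distributed among partitions at random according to a Poisson distribution with a given mean number of particles per partition. This statistical model is an assumption introduced here for illustration; the flow-controlled loading described above need not follow it, and the function names are hypothetical.

```python
import math


def occupancy_probability(k: int, mean_per_partition: float) -> float:
    """Poisson probability that a partition holds exactly k particles."""
    return (math.exp(-mean_per_partition)
            * mean_per_partition ** k / math.factorial(k))


def occupied_fraction(mean_per_partition: float) -> float:
    """Fraction of partitions holding at least one particle."""
    return 1.0 - math.exp(-mean_per_partition)


def multiply_occupied_fraction(mean_per_partition: float) -> float:
    """Fraction of partitions holding two or more particles."""
    return 1.0 - math.exp(-mean_per_partition) * (1.0 + mean_per_partition)


def mean_for_occupied_fraction(target: float) -> float:
    """Mean loading needed so that `target` of partitions are occupied."""
    if not 0 <= target < 1:
        raise ValueError("target must be in [0, 1)")
    return -math.log(1.0 - target)


for target in (0.50, 0.75, 0.90):
    lam = mean_for_occupied_fraction(target)
    print(f"occupied {target:.0%}: mean {lam:.2f}, "
          f"multiply occupied {multiply_occupied_fraction(lam):.1%}")
```

Under this idealized model, raising the occupied fraction necessarily raises the multiply occupied fraction as well, which is one reason controlled (non-random) delivery of particles or beads may be preferred when a specific pairing per partition is desired.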

In some cases, additional microcapsules can be used to deliver additional reagents to a partition. In such cases, it may be advantageous to introduce different beads into a common channel or droplet generation junction, from different bead sources (e.g., containing different associated reagents) through different channel inlets into such common channel or droplet generation junction (e.g., junction 210). In such cases, the flow and frequency of the different beads into the channel or junction may be controlled to provide for a certain ratio of microcapsules from each source, while ensuring a given pairing or combination of such beads into a partition with a given number of biological particles (e.g., one biological particle and one bead per partition).

The partitions described herein may comprise small volumes, for example, less than about 10 microliters (μL), 5 μL, 1 μL, 900 picoliters (pL), 800 pL, 700 pL, 600 pL, 500 pL, 400 pL, 300 pL, 200 pL, 100 pL, 50 pL, 20 pL, 10 pL, 1 pL, 500 nanoliters (nL), 100 nL, 50 nL, or less.

For example, in the case of droplet based partitions, the droplets may have overall volumes that are less than about 1000 pL, 900 pL, 800 pL, 700 pL, 600 pL, 500 pL, 400 pL, 300 pL, 200 pL, 100 pL, 50 pL, 20 pL, 10 pL, 1 pL, or less. Where co-partitioned with microcapsules, it will be appreciated that the sample fluid volume, e.g., including co-partitioned biological particles and/or beads, within the partitions may be less than about 90% of the above described volumes, less than about 80%, less than about 70%, less than about 60%, less than about 50%, less than about 40%, less than about 30%, less than about 20%, or less than about 10% of the above described volumes.
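
For illustration only, the droplet volumes recited above can be related to droplet diameter by treating a droplet as a sphere, and the fractional limits on sample fluid volume can be applied by simple multiplication. The spherical approximation, the function names, and the example diameter are assumptions made for this sketch.

```python
import math


def droplet_volume_pl(diameter_um: float) -> float:
    """Volume in picoliters of a spherical droplet of the given diameter.

    V = pi * d**3 / 6 in cubic micrometers, and 1 pL = 1000 cubic micrometers.
    """
    if diameter_um <= 0:
        raise ValueError("diameter must be positive")
    return math.pi * diameter_um ** 3 / 6.0 / 1000.0


def sample_fluid_cap_pl(partition_volume_pl: float, fraction: float) -> float:
    """Upper bound on sample fluid volume for a fractional limit
    (e.g., 0.9 for 'less than about 90%') of the partition volume."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return partition_volume_pl * fraction


volume = droplet_volume_pl(100.0)  # hypothetical 100 micrometer droplet
print(f"{volume:.1f} pL total, "
      f"under {sample_fluid_cap_pl(volume, 0.9):.1f} pL sample fluid at 90%")
```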

As is described elsewhere herein, partitioning species may generate a population or plurality of partitions. In such cases, any suitable number of partitions can be generated or otherwise provided. For example, at least about 1,000 partitions, at least about 5,000 partitions, at least about 10,000 partitions, at least about 50,000 partitions, at least about 100,000 partitions, at least about 500,000 partitions, at least about 1,000,000 partitions, at least about 5,000,000 partitions, at least about 10,000,000 partitions, at least about 50,000,000 partitions, at least about 100,000,000 partitions, at least about 500,000,000 partitions, at least about 1,000,000,000 partitions, or more partitions can be generated or otherwise provided. Moreover, the plurality of partitions may comprise both unoccupied partitions (e.g., empty partitions) and occupied partitions.

Reagents

In accordance with certain aspects, biological particles may be partitioned along with lysis reagents in order to release the contents of the biological particles within the partition. In such cases, the lysis agents can be contacted with the biological particle suspension concurrently with, or immediately prior to, the introduction of the biological particles into the partitioning junction/droplet generation zone (e.g., junction 210), such as through an additional channel or channels upstream of the channel junction. In accordance with other aspects, additionally or alternatively, biological particles may be partitioned along with other reagents, as will be described further below.

FIG. 3 shows an example of a microfluidic channel structure 300 for co-partitioning biological particles and reagents. The channel structure 300 can include channel segments 301, 302, 304, 306 and 308. Channel segments 301 and 302 communicate at a first channel junction 309. Channel segments 302, 304, 306, and 308 communicate at a second channel junction 310.

In an example operation, the channel segment 301 may transport an aqueous fluid 312 that includes a plurality of biological particles 314 along the channel segment 301 into the second junction 310. Alternatively or in addition, the channel segment 301 may transport beads (e.g., gel beads). The beads may comprise barcode molecules.

For example, the channel segment 301 may be connected to a reservoir comprising an aqueous suspension of biological particles 314. Upstream of, and immediately prior to reaching, the second junction 310, the channel segment 301 may meet the channel segment 302 at the first junction 309. The channel segment 302 may transport a plurality of reagents 315 (e.g., lysis agents) suspended in the aqueous fluid 312 along the channel segment 302 into the first junction 309. For example, the channel segment 302 may be connected to a reservoir comprising the reagents 315. After the first junction 309, the aqueous fluid 312 in the channel segment 301 can carry both the biological particles 314 and the reagents 315 towards the second junction 310. In some instances, the aqueous fluid 312 in the channel segment 301 can include one or more reagents, which can be the same or different reagents as the reagents 315. A second fluid 316 that is immiscible with the aqueous fluid 312 (e.g., oil) can be delivered to the second junction 310 from each of channel segments 304 and 306. Upon meeting of the aqueous fluid 312 from the channel segment 301 and the second fluid 316 from each of channel segments 304 and 306 at the second channel junction 310, the aqueous fluid 312 can be partitioned as discrete droplets 318 in the second fluid 316 and flow away from the second junction 310 along channel segment 308. The channel segment 308 may deliver the discrete droplets 318 to an outlet reservoir fluidly coupled to the channel segment 308, where they may be harvested.

The second fluid 316 can comprise an oil, such as a fluorinated oil, that includes a fluorosurfactant for stabilizing the resulting droplets, for example, inhibiting subsequent coalescence of the resulting droplets 318.

A discrete droplet generated may include an individual biological particle 314 and/or one or more reagents 315. In some instances, a discrete droplet generated may include a barcode carrying bead (not shown), such as via other microfluidics structures described elsewhere herein. In some instances, a discrete droplet may be unoccupied (e.g., no reagents, no biological particles).

Beneficially, when lysis reagents and biological particles are co-partitioned, the lysis reagents can facilitate the release of the contents of the biological particles within the partition. The contents released in a partition may remain discrete from the contents of other partitions.

As will be appreciated, the channel segments described herein may be coupled to any of a variety of different fluid sources or receiving components, including reservoirs, tubing, manifolds, or fluidic components of other systems. As will be appreciated, the microfluidic channel structure 300 may have other geometries. For example, a microfluidic channel structure can have more than two channel junctions. For example, a microfluidic channel structure can have 2, 3, 4, 5, or more channel segments, each carrying the same or different types of beads, reagents, and/or biological particles, that meet at a channel junction. Fluid flow in each channel segment may be controlled to control the partitioning of the different elements into droplets. Fluid may be directed to flow along one or more channels or reservoirs via one or more fluid flow units. A fluid flow unit can comprise compressors (e.g., providing positive pressure), pumps (e.g., providing negative pressure), actuators, and the like to control flow of the fluid. Fluid may also or otherwise be controlled via applied pressure differentials, centrifugal force, electrokinetic pumping, vacuum, capillary or gravity flow, or the like.

Examples of lysis agents include bioactive reagents, such as lysis enzymes that are used for lysis of different cell types, e.g., gram positive or negative bacteria, plants, yeast, mammalian, etc., such as lysozymes, achromopeptidase, lysostaphin, labiase, kitalase, lyticase, and a variety of other lysis enzymes available from, e.g., Sigma-Aldrich, Inc. (St Louis, Mo.), as well as other commercially available lysis enzymes. Other lysis agents may additionally or alternatively be co-partitioned with the biological particles to cause the release of the biological particles' contents into the partitions. For example, in some cases, surfactant-based lysis solutions may be used to lyse cells, although these may be less desirable for emulsion based systems where the surfactants can interfere with stable emulsions. In some cases, lysis solutions may include non-ionic surfactants such as, for example, Triton X-100 and Tween 20. In some cases, lysis solutions may include ionic surfactants such as, for example, sarcosyl and sodium dodecyl sulfate (SDS). Electroporation, thermal, acoustic or mechanical cellular disruption may also be used in certain cases, e.g., non-emulsion based partitioning such as encapsulation of biological particles that may be in addition to or in place of droplet partitioning, where any pore size of the encapsulate is sufficiently small to retain nucleic acid fragments of a given size, following cellular disruption.

In addition to the lysis agents co-partitioned with the biological particles described above, other reagents can also be co-partitioned with the biological particles, including, for example, DNase and RNase inactivating agents or inhibitors, such as proteinase K, chelating agents, such as EDTA, and other reagents employed in removing or otherwise reducing negative activity or impact of different cell lysate components on subsequent processing of nucleic acids. In addition, in the case of encapsulated biological particles, the biological particles may be exposed to an appropriate stimulus to release the biological particles or their contents from a co-partitioned microcapsule. For example, in some cases, a chemical stimulus may be co-partitioned along with an encapsulated biological particle to allow for the degradation of the microcapsule and release of the cell or its contents into the larger partition. In some cases, this stimulus may be the same as the stimulus described elsewhere herein for release of nucleic acid molecules (e.g., oligonucleotides) from their respective microcapsule (e.g., bead). In alternative aspects, this may be a different and non-overlapping stimulus, in order to allow an encapsulated biological particle to be released into a partition at a different time from the release of nucleic acid molecules into the same partition.

Additional reagents may also be co-partitioned with the biological particles, such as endonucleases to fragment a biological particle's DNA, DNA polymerase enzymes and dNTPs used to amplify the biological particle's nucleic acid fragments and to attach the barcode molecular tags to the amplified fragments. Other enzymes may be co-partitioned, including without limitation, polymerase, transposase, ligase, proteinase K, DNAse, etc. Additional reagents may also include reverse transcriptase enzymes, including enzymes with terminal transferase activity, primers and oligonucleotides, and switch oligonucleotides (also referred to herein as “switch oligos” or “template switching oligonucleotides”) which can be used for template switching. In some cases, template switching can be used to increase the length of a cDNA. In some cases, template switching can be used to append a predefined nucleic acid sequence to the cDNA. In an example of template switching, cDNA can be generated from reverse transcription of a template, e.g., cellular mRNA, where a reverse transcriptase with terminal transferase activity can add additional nucleotides, e.g., polyC, to the cDNA in a template independent manner. Switch oligos can include sequences complementary to the additional nucleotides, e.g., polyG. The additional nucleotides (e.g., polyC) on the cDNA can hybridize to the additional nucleotides (e.g., polyG) on the switch oligo, whereby the switch oligo can be used by the reverse transcriptase as template to further extend the cDNA. Template switching oligonucleotides may comprise a hybridization region and a template region. The hybridization region can comprise any sequence capable of hybridizing to the target. In some cases, as previously described, the hybridization region comprises a series of G bases to complement the overhanging C bases at the 3′ end of a cDNA molecule. The series of G bases may comprise 1 G base, 2 G bases, 3 G bases, 4 G bases, 5 G bases or more than 5 G bases. 
The template sequence can comprise any sequence to be incorporated into the cDNA. In some cases, the template region comprises at least 1 (e.g., at least 2, 3, 4, 5 or more) tag sequences and/or functional sequences. Switch oligos may comprise deoxyribonucleic acids; ribonucleic acids; modified nucleic acids including 2-Aminopurine, 2,6-Diaminopurine (2-Amino-dA), inverted dT, 5-Methyl dC, 2′-deoxyInosine, Super T (5-hydroxybutynl-2′-deoxyuridine), Super G (8-aza-7-deazaguanosine), locked nucleic acids (LNAs), unlocked nucleic acids (UNAs, e.g., UNA-A, UNA-U, UNA-C, UNA-G), Iso-dG, Iso-dC, 2′ Fluoro bases (e.g., Fluoro C, Fluoro U, Fluoro A, and Fluoro G), or any combination thereof.
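
The template switching example described above can be sketched at the sequence level as follows. This is a simplified string model written for illustration: sequences are given 5′ to 3′ in DNA letters, the non-templated tail is assumed to be exactly three C bases, hybridization is assumed to be perfect, and the function names and example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def template_switch(cdna: str, switch_oligo: str, tail_len: int = 3) -> str:
    """Extend a first-strand cDNA using a switch oligo as template.

    The reverse transcriptase is modeled as adding `tail_len` non-templated
    C bases to the 3' end of the cDNA. The switch oligo is written as
    5'-[template region][G bases]-3'; its G bases pair with the C tail, and
    the cDNA is then extended by the reverse complement of the template region.
    """
    if tail_len < 1:
        raise ValueError("tail_len must be at least 1")
    tail = "C" * tail_len
    hybridization_region = reverse_complement(tail)  # e.g., "GGG"
    if not switch_oligo.endswith(hybridization_region):
        raise ValueError("switch oligo lacks the G bases that pair with the tail")
    template_region = switch_oligo[:-tail_len]
    return cdna + tail + reverse_complement(template_region)


# Hypothetical sequences chosen only to show the mechanics.
extended = template_switch("ATGGCA", "AAGCAGTGGG")
print(extended)  # ATGGCACCCACTGCTT
```

In this model the extended cDNA ends with the full reverse complement of the switch oligo, so any predefined tag or functional sequence placed in the template region is appended to the cDNA.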

In some cases, the length of a switch oligo may be at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249 or 250 nucleotides or longer.

In some cases, the length of a switch oligo may be at most about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249 or 250 nucleotides.

Once the contents of the cells are released into their respective partitions, the macromolecular components (e.g., macromolecular constituents of biological particles, such as RNA, DNA, or proteins) contained therein may be further processed within the partitions. In accordance with the methods and systems described herein, the macromolecular component contents of individual biological particles can be provided with unique identifiers such that, upon characterization of those macromolecular components, they may be attributed as having been derived from the same biological particle or particles. The ability to attribute characteristics to individual biological particles or groups of biological particles is provided by the assignment of unique identifiers specifically to an individual biological particle or groups of biological particles. Unique identifiers, e.g., in the form of nucleic acid barcodes, can be assigned or associated with individual biological particles or populations of biological particles, in order to tag or label the biological particle's macromolecular components (and as a result, its characteristics) with the unique identifiers. These unique identifiers can then be used to attribute the biological particle's components and characteristics to an individual biological particle or group of biological particles.

In some aspects, this is performed by co-partitioning the individual biological particle or groups of biological particles with the unique identifiers, such as described above (with reference to FIG. 2). In some aspects, the unique identifiers are provided in the form of nucleic acid molecules (e.g., oligonucleotides) that comprise nucleic acid barcode sequences that may be attached to or otherwise associated with the nucleic acid contents of an individual biological particle, or to other components of the biological particle, and particularly to fragments of those nucleic acids. The nucleic acid molecules are partitioned such that, as between nucleic acid molecules in a given partition, the nucleic acid barcode sequences contained therein are the same, but as between different partitions, the nucleic acid molecules can, and do, have differing barcode sequences, or at least represent a large number of different barcode sequences across all of the partitions in a given analysis. In some aspects, only one nucleic acid barcode sequence can be associated with a given partition, although in some cases, two or more different barcode sequences may be present.
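By way of illustration only, the attribution of sequenced molecules to a common partition by a shared barcode sequence may be sketched as follows. The read layout (barcode occupying the first bases of each read), the function name group_reads_by_barcode, and the example sequences are hypothetical assumptions made for this sketch and are not taken from the present disclosure.

```python
from collections import defaultdict


def group_reads_by_barcode(reads, barcode_length):
    """Group reads by their leading barcode.

    Assumption (hypothetical layout): the partition barcode occupies the
    first `barcode_length` bases of every read, and the remainder is the
    sequence derived from the biological particle.
    """
    if barcode_length <= 0:
        raise ValueError("barcode_length must be positive")
    groups = defaultdict(list)
    for read in reads:
        barcode = read[:barcode_length]
        insert = read[barcode_length:]
        groups[barcode].append(insert)
    return dict(groups)


# Made-up example reads: two share one barcode, the third carries another.
reads = ["ACGTACGGATTACA", "ACGTACCCGGAATT", "TTGCAAGGATTACA"]
grouped = group_reads_by_barcode(reads, barcode_length=6)
for barcode, inserts in grouped.items():
    print(barcode, inserts)
```

In this sketch, reads sharing a barcode are treated as deriving from the same partition, while identical inserts carrying different barcodes remain attributed to different partitions.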

The nucleic acid barcode sequences can include from about 6 to about 20 or more nucleotides within the sequence of the nucleic acid molecules (e.g., oligonucleotides). In some cases, the length of a barcode sequence may be about 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 nucleotides or longer. In some cases, the length of a barcode sequence may be at least about 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 nucleotides or longer. In some cases, the length of a barcode sequence may be at most about 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 nucleotides or shorter. These nucleotides may be completely contiguous, i.e., in a single stretch of adjacent nucleotides, or they may be separated into two or more separate subsequences that are separated by 1 or more nucleotides. In some cases, separated barcode subsequences can be from about 4 to about 16 nucleotides in length. In some cases, the barcode subsequence may be about 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 nucleotides or longer. In some cases, the barcode subsequence may be at least about 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 nucleotides or longer. In some cases, the barcode subsequence may be at most about 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 nucleotides or shorter.
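By way of illustration only, the number of distinct sequences available at a given barcode length may be computed as below. The sketch assumes a four-letter nucleotide alphabet with every sequence permitted, which gives a theoretical upper bound; the function name barcode_space is hypothetical.

```python
def barcode_space(length, alphabet_size=4):
    """Return the number of distinct sequences of `length` nucleotides.

    Assumption: each position may independently be any of `alphabet_size`
    bases (four for A, C, G, T), with no sequences excluded.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    return alphabet_size ** length


# Upper bounds across the range of barcode lengths discussed above.
for n in (6, 10, 16, 20):
    print(f"{n} nt: {barcode_space(n):,} possible sequences")
```

Where a barcode is split into separated subsequences, the same bound applies to the combined length of the subsequences, since the intervening nucleotides do not contribute to the barcode.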

The co-partitioned nucleic acid molecules can also comprise other functional sequences useful in the processing of the nucleic acids from the co-partitioned biological particles. These sequences include, e.g., targeted or random/universal amplification/extension primer sequences for amplifying or extending the genomic DNA from the individual biological particles within the partitions while attaching the associated barcode sequences, sequencing primers or primer recognition sites, hybridization or probing sequences, e.g., for identification of presence of the sequences or for pulling down barcoded nucleic acids, or any of a number of other potential functional sequences. Other mechanisms of co-partitioning oligonucleotides may also be employed, including, e.g., coalescence of two or more droplets, where one droplet contains oligonucleotides, or microdispensing of oligonucleotides into partitions, e.g., droplets within microfluidic systems.

In an example, microcapsules, such as beads, are provided that each include large numbers of the above described barcoded nucleic acid molecules (e.g., barcoded oligonucleotides) releasably attached to the beads, where all of the nucleic acid molecules attached to a particular bead will include the same nucleic acid barcode sequence, but where a large number of diverse barcode sequences are represented across the population of beads used. In some embodiments, hydrogel beads, e.g., comprising polyacrylamide polymer matrices, are used as a solid support and delivery vehicle for the nucleic acid molecules into the partitions, as they are capable of carrying large numbers of nucleic acid molecules, and may be configured to release those nucleic acid molecules upon exposure to a particular stimulus, as described elsewhere herein. In some cases, the population of beads provides a diverse barcode sequence library that includes at least about 1,000 different barcode sequences, at least about 5,000 different barcode sequences, at least about 10,000 different barcode sequences, at least about 50,000 different barcode sequences, at least about 100,000 different barcode sequences, at least about 1,000,000 different barcode sequences, at least about 5,000,000 different barcode sequences, or at least about 10,000,000 different barcode sequences, or more. Additionally, each bead can be provided with large numbers of nucleic acid (e.g., oligonucleotide) molecules attached. 
In particular, the number of nucleic acid molecules including the barcode sequence on an individual bead can be at least about 1,000 nucleic acid molecules, at least about 5,000 nucleic acid molecules, at least about 10,000 nucleic acid molecules, at least about 50,000 nucleic acid molecules, at least about 100,000 nucleic acid molecules, at least about 500,000 nucleic acid molecules, at least about 1,000,000 nucleic acid molecules, at least about 5,000,000 nucleic acid molecules, at least about 10,000,000 nucleic acid molecules, at least about 50,000,000 nucleic acid molecules, at least about 100,000,000 nucleic acid molecules, at least about 250,000,000 nucleic acid molecules and in some cases at least about 1 billion nucleic acid molecules, or more. Nucleic acid molecules of a given bead can include identical (or common) barcode sequences, different barcode sequences, or a combination of both. Nucleic acid molecules of a given bead can include multiple sets of nucleic acid molecules. Nucleic acid molecules of a given set can include identical barcode sequences. The identical barcode sequences can be different from barcode sequences of nucleic acid molecules of another set.

Moreover, when the population of beads is partitioned, the resulting population of partitions can also include a diverse barcode library that includes at least about 1,000 different barcode sequences, at least about 5,000 different barcode sequences, at least about 10,000 different barcode sequences, at least about 50,000 different barcode sequences, at least about 100,000 different barcode sequences, at least about 1,000,000 different barcode sequences, at least about 5,000,000 different barcode sequences, or at least about 10,000,000 different barcode sequences. Additionally, each partition of the population can include at least about 1,000 nucleic acid molecules, at least about 5,000 nucleic acid molecules, at least about 10,000 nucleic acid molecules, at least about 50,000 nucleic acid molecules, at least about 100,000 nucleic acid molecules, at least about 500,000 nucleic acid molecules, at least about 1,000,000 nucleic acid molecules, at least about 5,000,000 nucleic acid molecules, at least about 10,000,000 nucleic acid molecules, at least about 50,000,000 nucleic acid molecules, at least about 100,000,000 nucleic acid molecules, at least about 250,000,000 nucleic acid molecules and in some cases at least about 1 billion nucleic acid molecules.

In some cases, it may be desirable to incorporate multiple different barcodes within a given partition, either attached to a single or multiple beads within the partition. For example, in some cases, a mixed, but known set of barcode sequences may provide greater assurance of identification in the subsequent processing, e.g., by providing a stronger address or attribution of the barcodes to a given partition, as a duplicate or independent confirmation of the output from a given partition.

The nucleic acid molecules (e.g., oligonucleotides) are releasable from the beads upon the application of a particular stimulus to the beads. In some cases, the stimulus may be a photo-stimulus, e.g., through cleavage of a photo-labile linkage that releases the nucleic acid molecules. In other cases, a thermal stimulus may be used, where elevation of the temperature of the beads' environment will result in cleavage of a linkage or other release of the nucleic acid molecules from the beads. In still other cases, a chemical stimulus can be used that cleaves a linkage of the nucleic acid molecules to the beads, or otherwise results in release of the nucleic acid molecules from the beads. In one case, such compositions include the polyacrylamide matrices described above for encapsulation of biological particles, and may be degraded for release of the attached nucleic acid molecules through exposure to a reducing agent, such as DTT.

In some aspects, provided are systems and methods for controlled partitioning. Droplet size may be controlled by adjusting certain geometric features in channel architecture (e.g., microfluidics channel architecture). For example, an expansion angle, width, and/or length of a channel may be adjusted to control droplet size.

FIG. 4 shows an example of a microfluidic channel structure for the controlled partitioning of beads into discrete droplets. A channel structure 400 can include a channel segment 402 communicating at a channel junction 406 (or intersection) with a reservoir 404. The reservoir 404 can be a chamber. Any reference to “reservoir,” as used herein, can also refer to a “chamber.” In operation, an aqueous fluid 408 that includes suspended beads 412 may be transported along the channel segment 402 into the junction 406 to meet a second fluid 410 that is immiscible with the aqueous fluid 408 in the reservoir 404 to create droplets 416, 418 of the aqueous fluid 408 flowing into the reservoir 404. At the juncture 406 where the aqueous fluid 408 and the second fluid 410 meet, droplets can form based on factors such as the hydrodynamic forces at the juncture 406, flow rates of the two fluids 408, 410, fluid properties, and certain geometric parameters (e.g., w, h₀, α, etc.) of the channel structure 400. A plurality of droplets can be collected in the reservoir 404 by continuously injecting the aqueous fluid 408 from the channel segment 402 through the juncture 406.

A discrete droplet generated may include a bead (e.g., as in occupied droplets 416). Alternatively, a discrete droplet generated may include more than one bead. Alternatively, a discrete droplet generated may not include any beads (e.g., as in unoccupied droplet 418). In some instances, a discrete droplet generated may contain one or more biological particles, as described elsewhere herein. In some instances, a discrete droplet generated may comprise one or more reagents, as described elsewhere herein.

In some instances, the aqueous fluid 408 can have a substantially uniform concentration or frequency of beads 412. The beads 412 can be introduced into the channel segment 402 from a separate channel (not shown in FIG. 4). The frequency of beads 412 in the channel segment 402 may be controlled by controlling the frequency in which the beads 412 are introduced into the channel segment 402 and/or the relative flow rates of the fluids in the channel segment 402 and the separate channel. In some instances, the beads can be introduced into the channel segment 402 from a plurality of different channels, and the frequency controlled accordingly.

In some instances, the aqueous fluid 408 in the channel segment 402 can comprise biological particles (e.g., described with reference to FIGS. 1 and 2). In some instances, the aqueous fluid 408 can have a substantially uniform concentration or frequency of biological particles. As with the beads, the biological particles can be introduced into the channel segment 402 from a separate channel. The frequency or concentration of the biological particles in the aqueous fluid 408 in the channel segment 402 may be controlled by controlling the frequency in which the biological particles are introduced into the channel segment 402 and/or the relative flow rates of the fluids in the channel segment 402 and the separate channel. In some instances, the biological particles can be introduced into the channel segment 402 from a plurality of different channels, and the frequency controlled accordingly. In some instances, a first separate channel can introduce beads and a second separate channel can introduce biological particles into the channel segment 402. The first separate channel introducing the beads may be upstream or downstream of the second separate channel introducing the biological particles.

The second fluid 410 can comprise an oil, such as a fluorinated oil, that includes a fluorosurfactant for stabilizing the resulting droplets, for example, inhibiting subsequent coalescence of the resulting droplets.

In some instances, the second fluid 410 may not be subjected to and/or directed to any flow in or out of the reservoir 404. For example, the second fluid 410 may be substantially stationary in the reservoir 404. In some instances, the second fluid 410 may be subjected to flow within the reservoir 404, but not in or out of the reservoir 404, such as via application of pressure to the reservoir 404 and/or as affected by the incoming flow of the aqueous fluid 408 at the juncture 406. Alternatively, the second fluid 410 may be subjected and/or directed to flow in or out of the reservoir 404. For example, the reservoir 404 can be a channel directing the second fluid 410 from upstream to downstream, transporting the generated droplets.

The channel structure 400 at or near the juncture 406 may have certain geometric features that at least partly determine the sizes of the droplets formed by the channel structure 400. The channel segment 402 can have a height, h₀, and width, w, at or near the juncture 406. By way of example, the channel segment 402 can comprise a rectangular cross-section that leads to a reservoir 404 having a wider cross-section (such as in width or diameter). Alternatively, the cross-section of the channel segment 402 can be other shapes, such as a circular shape, trapezoidal shape, polygonal shape, or any other shapes. The top and bottom walls of the reservoir 404 at or near the juncture 406 can be inclined at an expansion angle, α. The expansion angle, α, allows the tongue (portion of the aqueous fluid 408 leaving channel segment 402 at junction 406 and entering the reservoir 404 before droplet formation) to increase in depth and facilitate a decrease in curvature of the intermediately formed droplet. Droplet size may decrease with increasing expansion angle. The resulting droplet radius, R_d, may be predicted by the following equation for the aforementioned geometric parameters of h₀, w, and α:

$R_{d} \approx 0.44 \left(1 + 2.2 \sqrt{\tan \alpha}\, \frac{w}{h_{0}}\right) \frac{h_{0}}{\sqrt{\tan \alpha}}$

By way of example, for a channel structure with w=21 μm, h₀=21 μm, and α=3°, the predicted droplet size is 121 μm. In another example, for a channel structure with w=25 μm, h₀=25 μm, and α=5°, the predicted droplet size is 123 μm. In another example, for a channel structure with w=28 μm, h₀=28 μm, and α=7°, the predicted droplet size is 124 μm.
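By way of illustration only, the relation above may be evaluated numerically as sketched below. The function name predicted_droplet_radius is hypothetical. The sketch also assumes that the droplet sizes quoted in the examples above correspond to droplet diameters, i.e., approximately twice the predicted radius; this reading is an inference from the numbers and is not stated in the text.

```python
import math


def predicted_droplet_radius(w_um, h0_um, alpha_deg):
    """Evaluate R_d ~ 0.44 * (1 + 2.2*sqrt(tan a)*w/h0) * h0/sqrt(tan a).

    w_um and h0_um are the channel width and height in micrometers and
    alpha_deg is the expansion angle in degrees (0 < alpha < 90).
    """
    if w_um <= 0 or h0_um <= 0:
        raise ValueError("w_um and h0_um must be positive")
    if not 0 < alpha_deg < 90:
        raise ValueError("alpha_deg must lie strictly between 0 and 90")
    root_tan = math.sqrt(math.tan(math.radians(alpha_deg)))
    return 0.44 * (1 + 2.2 * root_tan * w_um / h0_um) * h0_um / root_tan


# The three example geometries; 2 * R_d is compared with the quoted sizes
# under the assumption that "droplet size" denotes a diameter.
examples = [(21, 21, 3, 121), (25, 25, 5, 123), (28, 28, 7, 124)]
for w, h0, alpha, quoted in examples:
    r_d = predicted_droplet_radius(w, h0, alpha)
    print(f"w={w} um, h0={h0} um, alpha={alpha} deg: "
          f"R_d={r_d:.1f} um, 2*R_d={2 * r_d:.1f} um (quoted {quoted} um)")
```

Under that assumption, twice the predicted radius falls within about 1 μm of each quoted size. At fixed w and h₀, the predicted radius falls as α increases, consistent with the statement that droplet size may decrease with increasing expansion angle.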

In some instances, the expansion angle, α, may be in a range of from about 0.5° to about 4°, from about 0.1° to about 10°, or from about 0° to about 90°. For example, the expansion angle can be at least about 0.01°, 0.1°, 0.2°, 0.3°, 0.4°, 0.5°, 0.6°, 0.7°, 0.8°, 0.9°, 1°, 2°, 3°, 4°, 5°, 6°, 7°, 8°, 9°, 10°, 15°, 20°, 25°, 30°, 35°, 40°, 45°, 50°, 55°, 60°, 65°, 70°, 75°, 80°, 85°, or higher. In some instances, the expansion angle can be at most about 89°, 88°, 87°, 86°, 85°, 84°, 83°, 82°, 81°, 80°, 75°, 70°, 65°, 60°, 55°, 50°, 45°, 40°, 35°, 30°, 25°, 20°, 15°, 10°, 9°, 8°, 7°, 6°, 5°, 4°, 3°, 2°, 1°, 0.1°, 0.01°, or less. In some instances, the width, w, can be in a range of from about 100 micrometers (μm) to about 500 μm. In some instances, the width, w, can be in a range of from about 10 μm to about 200 μm. Alternatively, the width can be less than about 10 μm. Alternatively, the width can be greater than about 500 μm. In some instances, the flow rate of the aqueous fluid 408 entering the junction 406 can be between about 0.04 microliters (μL)/minute (min) and about 40 μL/min. In some instances, the flow rate of the aqueous fluid 408 entering the junction 406 can be between about 0.01 μL/min and about 100 μL/min. Alternatively, the flow rate of the aqueous fluid 408 entering the junction 406 can be less than about 0.01 μL/min. Alternatively, the flow rate of the aqueous fluid 408 entering the junction 406 can be greater than about 40 μL/min, such as 45 μL/min, 50 μL/min, 55 μL/min, 60 μL/min, 65 μL/min, 70 μL/min, 75 μL/min, 80 μL/min, 85 μL/min, 90 μL/min, 95 μL/min, 100 μL/min, 110 μL/min, 120 μL/min, 130 μL/min, 140 μL/min, 150 μL/min, or greater. At lower flow rates, such as flow rates of less than or equal to about 10 μL/min, the droplet radius may not be dependent on the flow rate of the aqueous fluid 408 entering the junction 406.

In some instances, at least about 50% of the droplets generated can have uniform size. In some instances, at least about 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or greater of the droplets generated can have uniform size. Alternatively, less than about 50% of the droplets generated can have uniform size.

The throughput of droplet generation can be increased by increasing the points of generation, such as increasing the number of junctions (e.g., junction 406) between aqueous fluid 408 channel segments (e.g., channel segment 402) and the reservoir 404. Alternatively or in addition, the throughput of droplet generation can be increased by increasing the flow rate of the aqueous fluid 408 in the channel segment 402.

FIG. 5 shows an example of a microfluidic channel structure for increased droplet generation throughput. A microfluidic channel structure 500 can comprise a plurality of channel segments 502 and a reservoir 504. Each of the plurality of channel segments 502 may be in fluid communication with the reservoir 504. The channel structure 500 can comprise a plurality of channel junctions 506 between the plurality of channel segments 502 and the reservoir 504. Each channel junction can be a point of droplet generation. The channel segment 402 from the channel structure 400 in FIG. 4 and any description to the components thereof may correspond to a given channel segment of the plurality of channel segments 502 in channel structure 500 and any description to the corresponding components thereof. The reservoir 404 from the channel structure 400 and any description to the components thereof may correspond to the reservoir 504 from the channel structure 500 and any description to the corresponding components thereof.

Each channel segment of the plurality of channel segments 502 may comprise an aqueous fluid 508 that includes suspended beads 512. The reservoir 504 may comprise a second fluid 510 that is immiscible with the aqueous fluid 508. In some instances, the second fluid 510 may not be subjected to and/or directed to any flow in or out of the reservoir 504. For example, the second fluid 510 may be substantially stationary in the reservoir 504. In some instances, the second fluid 510 may be subjected to flow within the reservoir 504, but not in or out of the reservoir 504, such as via application of pressure to the reservoir 504 and/or as affected by the incoming flow of the aqueous fluid 508 at the junctures. Alternatively, the second fluid 510 may be subjected and/or directed to flow in or out of the reservoir 504. For example, the reservoir 504 can be a channel directing the second fluid 510 from upstream to downstream, transporting the generated droplets.

In operation, the aqueous fluid 508 that includes suspended beads 512 may be transported along the plurality of channel segments 502 into the plurality of junctions 506 to meet the second fluid 510 in the reservoir 504 to create droplets 516, 518. A droplet may form from each channel segment at each corresponding junction with the reservoir 504. At the juncture where the aqueous fluid 508 and the second fluid 510 meet, droplets can form based on factors such as the hydrodynamic forces at the juncture, flow rates of the two fluids 508, 510, fluid properties, and certain geometric parameters (e.g., w, h₀, α, etc.) of the channel structure 500, as described elsewhere herein. A plurality of droplets can be collected in the reservoir 504 by continuously injecting the aqueous fluid 508 from the plurality of channel segments 502 through the plurality of junctures 506. Throughput may significantly increase with the parallel channel configuration of channel structure 500. For example, a channel structure having five inlet channel segments comprising the aqueous fluid 508 may generate droplets five times as frequently as a channel structure having one inlet channel segment, provided that the fluid flow rates in the channel segments are substantially the same. The fluid flow rate in the different inlet channel segments may or may not be substantially the same. A channel structure may have as many parallel channel segments as is practical and allowed for by the size of the reservoir. For example, the channel structure may have at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000, 1500, 5000 or more parallel or substantially parallel channel segments.
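By way of illustration only, the scaling of droplet generation frequency with the number of inlet channel segments may be estimated with a simple volume balance, as sketched below. The sketch assumes that all of the aqueous fluid entering each junction is converted into spherical droplets of a single diameter and that every segment carries the same flow rate; these assumptions, the function name droplets_per_second, and the example values are introduced for the sketch and are not taken from the present disclosure.

```python
import math

UM3_PER_UL = 1e9  # 1 microliter = 1 mm^3 = 1e9 cubic micrometers


def droplets_per_second(flow_ul_per_min, droplet_diameter_um, n_segments=1):
    """Estimate total droplet generation frequency by volume conservation.

    Assumptions (hypothetical): each of `n_segments` inlet segments carries
    the same aqueous flow rate, and all of that fluid forms spherical
    droplets of diameter `droplet_diameter_um`.
    """
    if flow_ul_per_min <= 0 or droplet_diameter_um <= 0 or n_segments < 1:
        raise ValueError("inputs must be positive")
    droplet_volume_um3 = (math.pi / 6.0) * droplet_diameter_um ** 3
    flow_um3_per_s = flow_ul_per_min * UM3_PER_UL / 60.0
    return n_segments * flow_um3_per_s / droplet_volume_um3


# Example values chosen for illustration: 10 uL/min per segment, 121 um droplets.
single = droplets_per_second(10, 121, n_segments=1)
five = droplets_per_second(10, 121, n_segments=5)
print(f"one segment: {single:.0f} droplets/s; five segments: {five:.0f} droplets/s")
```

Under these assumptions the total frequency scales linearly with the number of segments, matching the five-fold example above; where the flow rates differ between segments, the per-segment contributions would instead be summed individually.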

The geometric parameters, w, h₀, and α, may or may not be uniform for each of the channel segments in the plurality of channel segments 502. For example, each channel segment may have the same or different widths at or near its respective channel junction with the reservoir 504. For example, each channel segment may have the same or different height at or near its respective channel junction with the reservoir 504. In another example, the reservoir 504 may have the same or different expansion angle at the different channel junctions with the plurality of channel segments 502. When the geometric parameters are uniform, beneficially, droplet size may also be controlled to be uniform even with the increased throughput. In some instances, when it is desirable to have a different distribution of droplet sizes, the geometric parameters for the plurality of channel segments 502 may be varied accordingly.

FIG. 6 shows another example of a microfluidic channel structure for increased droplet generation throughput. A microfluidic channel structure 600 can comprise a plurality of channel segments 602 arranged generally circularly around the perimeter of a reservoir 604. Each of the plurality of channel segments 602 may be in fluid communication with the reservoir 604. The channel structure 600 can comprise a plurality of channel junctions 606 between the plurality of channel segments 602 and the reservoir 604. Each channel junction can be a point of droplet generation. The channel segment 402 from the channel structure 400 in FIG. 4 and any description to the components thereof may correspond to a given channel segment of the plurality of channel segments 602 in channel structure 600 and any description to the corresponding components thereof. The reservoir 404 from the channel structure 400 and any description to the components thereof may correspond to the reservoir 604 from the channel structure 600 and any description to the corresponding components thereof.

Each channel segment of the plurality of channel segments 602 may comprise an aqueous fluid 608 that includes suspended beads 612. The reservoir 604 may comprise a second fluid 610 that is immiscible with the aqueous fluid 608. In some instances, the second fluid 610 may not be subjected to and/or directed to any flow in or out of the reservoir 604. For example, the second fluid 610 may be substantially stationary in the reservoir 604. In some instances, the second fluid 610 may be subjected to flow within the reservoir 604, but not in or out of the reservoir 604, such as via application of pressure to the reservoir 604 and/or as affected by the incoming flow of the aqueous fluid 608 at the junctures. Alternatively, the second fluid 610 may be subjected and/or directed to flow in or out of the reservoir 604. For example, the reservoir 604 can be a channel directing the second fluid 610 from upstream to downstream, transporting the generated droplets.

In operation, the aqueous fluid 608 that includes suspended beads 612 may be transported along the plurality of channel segments 602 into the plurality of junctions 606 to meet the second fluid 610 in the reservoir 604 to create a plurality of droplets 616. A droplet may form from each channel segment at each corresponding junction with the reservoir 604. At the juncture where the aqueous fluid 608 and the second fluid 610 meet, droplets can form based on factors such as the hydrodynamic forces at the juncture, flow rates of the two fluids 608, 610, fluid properties, and certain geometric parameters (e.g., widths and heights of the channel segments 602, expansion angle of the reservoir 604, etc.) of the channel structure 600, as described elsewhere herein. A plurality of droplets can be collected in the reservoir 604 by continuously injecting the aqueous fluid 608 from the plurality of channel segments 602 through the plurality of junctures 606. Throughput may significantly increase with the substantially parallel channel configuration of the channel structure 600. A channel structure may have as many substantially parallel channel segments as is practical and allowed for by the size of the reservoir. For example, the channel structure may have at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000, 1500, 5000 or more parallel or substantially parallel channel segments. The plurality of channel segments may be substantially evenly spaced apart, for example, around an edge or perimeter of the reservoir. Alternatively, the spacing of the plurality of channel segments may be uneven.

The reservoir 604 may have an expansion angle, α (not shown in FIG. 6), at or near each channel juncture. Each channel segment of the plurality of channel segments 602 may have a width, w, and a height, h₀, at or near the channel juncture. The geometric parameters, w, h₀, and α, may or may not be uniform for each of the channel segments in the plurality of channel segments 602. For example, each channel segment may have the same or different widths at or near its respective channel junction with the reservoir 604. For example, each channel segment may have the same or different height at or near its respective channel junction with the reservoir 604.

The reservoir 604 may have the same or different expansion angle at the different channel junctions with the plurality of channel segments 602. For example, a circular reservoir (as shown in FIG. 6) may have a conical, dome-like, or hemispherical ceiling (e.g., top wall) to provide the same or substantially the same expansion angle for each of the channel segments 602 at or near the plurality of channel junctions 606. When the geometric parameters are uniform, beneficially, resulting droplet size may be controlled to be uniform even with the increased throughput. In some instances, when it is desirable to have a different distribution of droplet sizes, the geometric parameters for the plurality of channel segments 602 may be varied accordingly.

The channel networks, e.g., as described above or elsewhere herein, can be fluidly coupled to appropriate fluidic components. For example, the inlet channel segments are fluidly coupled to appropriate sources of the materials they are to deliver to a channel junction. These sources may include any of a variety of different fluidic components, from simple reservoirs defined in or connected to a body structure of a microfluidic device, to fluid conduits that deliver fluids from off-device sources, manifolds, fluid flow units (e.g., actuators, pumps, compressors) or the like. Likewise, the outlet channel segment (e.g., channel segment 208, reservoir 604, etc.) may be fluidly coupled to a receiving vessel or conduit for the partitioned cells for subsequent processing. Again, this may be a reservoir defined in the body of a microfluidic device, or it may be a fluidic conduit for delivering the partitioned cells to a subsequent process operation, instrument or component.

The methods and systems described herein may be used to greatly increase the efficiency of single cell applications and/or other applications receiving droplet-based input. For example, following the sorting of occupied cells and/or appropriately-sized cells, subsequent operations that can be performed can include generation of amplification products, purification (e.g., via solid phase reversible immobilization (SPRI)), further processing (e.g., shearing, ligation of functional sequences, and subsequent amplification (e.g., via PCR)). These operations may occur in bulk (e.g., outside the partition). In the case where a partition is a droplet in an emulsion, the emulsion can be broken and the contents of the droplet pooled for additional operations. Additional reagents that may be co-partitioned along with the barcode bearing bead may include oligonucleotides to block ribosomal RNA (rRNA) and nucleases to digest genomic DNA from cells. Alternatively, rRNA removal agents may be applied during additional processing operations. The configuration of the constructs generated by such a method can help minimize (or avoid) sequencing of the poly-T sequence during sequencing and/or sequence the 5′ end of a polynucleotide sequence. The amplification products, for example, first amplification products and/or second amplification products, may be subject to sequencing for sequence analysis. In some cases, amplification may be performed using the Partial Hairpin Amplification for Sequencing (PHASE) method.

A variety of applications require the evaluation of the presence and quantification of different biological particle or organism types within a population of biological particles, including, for example, microbiome analysis and characterization, environmental testing, food safety testing, epidemiological analysis, e.g., in tracing contamination or the like.

Computer Systems

The present disclosure provides computer systems that are programmed to implement methods of the disclosure. FIG. 8 shows a computer system 801 that is programmed or otherwise configured to process multiple cellular or nucleic acid samples in parallel, for example (i) control a microfluidics system (e.g., fluid flow) for the generation of partitions, (ii) sort occupied droplets from unoccupied droplets, (iii) polymerize droplets, (iv) perform sequencing applications, (v) generate and maintain a library of sequencing reads, and (vi) analyze sequencing reads. The computer system 801 can regulate various aspects of the present disclosure, such as, for example, regulating fluid flow rate in one or more channels in a microfluidic structure during the formation of partitions comprising droplets, regulating polymerization application units, nucleic acid extension or amplification, etc. The computer system 801 can be an electronic device of a user or a computer system that is remotely located with respect to the electronic device. The electronic device can be a mobile electronic device.

The computer system 801 includes a central processing unit (CPU, also “processor” and “computer processor” herein) 805, which can be a single core or multi core processor, or a plurality of processors for parallel processing. The computer system 801 also includes memory or memory location 810 (e.g., random-access memory, read-only memory, flash memory), electronic storage unit 815 (e.g., hard disk), communication interface 820 (e.g., network adapter) for communicating with one or more other systems, and peripheral devices 825, such as cache, other memory, data storage and/or electronic display adapters. The memory 810, storage unit 815, interface 820 and peripheral devices 825 are in communication with the CPU 805 through a communication bus (solid lines), such as a motherboard. The storage unit 815 can be a data storage unit (or data repository) for storing data. The computer system 801 can be operatively coupled to a computer network (“network”) 830 with the aid of the communication interface 820. The network 830 can be the Internet, an internet and/or extranet, or an intranet and/or extranet that is in communication with the Internet. The network 830 in some cases is a telecommunication and/or data network. The network 830 can include one or more computer servers, which can enable distributed computing, such as cloud computing. The network 830, in some cases with the aid of the computer system 801, can implement a peer-to-peer network, which may enable devices coupled to the computer system 801 to behave as a client or a server.

The CPU 805 can execute a sequence of machine-readable instructions, which can be embodied in a program or software. The instructions may be stored in a memory location, such as the memory 810. The instructions can be directed to the CPU 805, which can subsequently program or otherwise configure the CPU 805 to implement methods of the present disclosure. Examples of operations performed by the CPU 805 can include fetch, decode, execute, and writeback.

The CPU 805 can be part of a circuit, such as an integrated circuit. One or more other components of the system 801 can be included in the circuit. In some cases, the circuit is an application specific integrated circuit (ASIC).

The storage unit 815 can store files, such as drivers, libraries and saved programs. The storage unit 815 can store user data, e.g., user preferences and user programs. The computer system 801 in some cases can include one or more additional data storage units that are external to the computer system 801, such as located on a remote server that is in communication with the computer system 801 through an intranet or the Internet.

The computer system 801 can communicate with one or more remote computer systems through the network 830. For instance, the computer system 801 can communicate with a remote computer system of a user (e.g., operator). Examples of remote computer systems include personal computers (e.g., portable PC), slate or tablet PCs (e.g., Apple® iPad, Samsung® Galaxy Tab), telephones, smart phones (e.g., Apple® iPhone, Android-enabled device, Blackberry®), or personal digital assistants. The user can access the computer system 801 via the network 830.

Methods as described herein can be implemented by way of machine (e.g., computer processor) executable code stored on an electronic storage location of the computer system 801, such as, for example, on the memory 810 or electronic storage unit 815. The machine executable or machine readable code can be provided in the form of software. During use, the code can be executed by the processor 805. In some cases, the code can be retrieved from the storage unit 815 and stored on the memory 810 for ready access by the processor 805. In some situations, the electronic storage unit 815 can be precluded, and machine-executable instructions are stored on memory 810.

The code can be pre-compiled and configured for use with a machine having a processor adapted to execute the code, or can be compiled during runtime. The code can be supplied in a programming language that can be selected to enable the code to execute in a pre-compiled or as-compiled fashion.

Aspects of the systems and methods provided herein, such as the computer system 801, can be embodied in programming. Various aspects of the technology may be thought of as “products” or “articles of manufacture” typically in the form of machine (or processor) executable code and/or associated data that is carried on or embodied in a type of machine readable medium. Machine-executable code can be stored on an electronic storage unit, such as memory (e.g., read-only memory, random-access memory, flash memory) or a hard disk. “Storage” type media can include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, from a management server or host computer into the computer platform of an application server. Thus, another type of media that may bear the software elements includes optical, electrical and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links. The physical elements that carry such waves, such as wired or wireless links, optical links or the like, also may be considered as media bearing the software. As used herein, unless restricted to non-transitory, tangible “storage” media, terms such as computer or machine “readable medium” refer to any medium that participates in providing instructions to a processor for execution.

Hence, a machine readable medium, such as computer-executable code, may take many forms, including but not limited to, a tangible storage medium, a carrier wave medium or physical transmission medium. Non-volatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer(s) or the like, such as may be used to implement the databases, etc. shown in the drawings. Volatile storage media include dynamic memory, such as main memory of such a computer platform. Tangible transmission media include coaxial cables, copper wire and fiber optics, including the wires that comprise a bus within a computer system. Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during radio frequency (RF) and infrared (IR) data communications. Common forms of computer-readable media therefore include, for example: a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM, a ROM, a PROM and EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer may read programming code and/or data. Many of these forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to a processor for execution.

The computer system 801 can include or be in communication with an electronic display 835 that comprises a user interface (UI) 840 for providing, for example, results of sequencing analysis. Examples of UIs include, without limitation, a graphical user interface (GUI) and web-based user interface.

Methods and systems of the present disclosure can be implemented by way of one or more algorithms. An algorithm can be implemented by way of software upon execution by the central processing unit 805. The algorithm can, for example, perform sequencing and analyze sequencing reads.
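By way of non-limiting illustration, an algorithm that analyzes sequencing reads may group reads by partition nucleic acid barcode sequence and count distinct unique molecular identifier (UMI) sequences per partition. The following Python sketch is provided as an example only. The read layout (a 16-base barcode followed by a 12-base UMI), the constant names, and the function name are assumptions made for illustration and are not specified by the present disclosure.

```python
from collections import defaultdict

# Hypothetical read layout (an assumption, not taken from the disclosure):
# the first BARCODE_LEN bases are the partition barcode, followed by the UMI.
BARCODE_LEN = 16
UMI_LEN = 12


def count_molecules_per_partition(reads):
    """Return {partition_barcode: number of distinct UMIs observed}."""
    umis_by_barcode = defaultdict(set)
    for read in reads:
        if len(read) < BARCODE_LEN + UMI_LEN:
            continue  # read is too short to contain both fields; skip it
        barcode = read[:BARCODE_LEN]
        umi = read[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
        # A set collapses repeated (barcode, UMI) observations into one molecule.
        umis_by_barcode[barcode].add(umi)
    return {bc: len(umis) for bc, umis in umis_by_barcode.items()}
```

Reads sharing both a barcode and a UMI are counted once in this sketch, on the assumption that they derive from the same original molecule. Error correction of barcodes and UMIs is omitted for brevity.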

Devices, systems, compositions and methods of the present disclosure may be used for various applications, such as, for example, processing a single analyte (e.g., RNA, DNA, or protein) or multiple analytes (e.g., DNA and RNA, DNA and protein, RNA and protein, or RNA, DNA and protein) from a single cell. For example, a biological particle (e.g., a cell or cell bead) is partitioned in a partition (e.g., droplet), and multiple analytes from the biological particle are processed for subsequent processing. The multiple analytes may be from the single cell. This may enable, for example, simultaneous proteomic, transcriptomic and genomic analysis of the cell.

Embodiments

In some aspects, the present disclosure provides a method according to any of the following embodiments:

1. A method for analyzing a cell, comprising:
- (a) labeling the cell with a cell nucleic acid barcode sequence to generate a labeled cell, wherein a cell nucleic acid barcode molecule comprises the cell nucleic acid barcode sequence and a cell labeling agent;
- (b) generating a partition comprising the labeled cell and a plurality of partition nucleic acid barcode molecules, wherein each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules comprises a partition nucleic acid barcode sequence;
- (c) permeabilizing or lysing the cell to provide access to a plurality of nucleic acid molecules therein;
- (d) generating (i) a barcoded nucleic acid molecule comprising the cell nucleic acid barcode sequence, or a complement thereof, and the partition nucleic acid barcode sequence, or a complement thereof, and (ii) a plurality of barcoded nucleic acid products each comprising a sequence of a nucleic acid molecule of the plurality of nucleic acid molecules and the partition nucleic acid barcode sequence, or a complement thereof; and
- (e) identifying the plurality of nucleic acid molecules as originating from the cell.
2. The method of embodiment 1, wherein the cell nucleic acid barcode sequence identifies a sample from which the cell originates.
3. The method of embodiment 2, wherein the sample is derived from a biological fluid.
4. The method of embodiment 3, wherein the biological fluid comprises blood or saliva.
5. The method of embodiment 1, wherein the cell is an immune cell.
6. The method of embodiment 5, wherein the immune cell is a T cell.
7. The method of embodiment 5, wherein the immune cell is a B cell.
8. The method of embodiment 1, wherein each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules comprises a priming sequence.
9. The method of embodiment 8, wherein the priming sequence is a targeted priming sequence.
10. The method of embodiment 8, wherein the priming sequence is a random N-mer sequence.
11. The method of embodiment 1, wherein the barcoded nucleic acid molecule and the plurality of barcoded nucleic acid products are synthesized via one or more primer extension reactions, ligation reactions, or nucleic acid amplification reactions.
12. The method of embodiment 1, further comprising sequencing the barcoded nucleic acid molecule and the barcoded nucleic acid products, or derivatives thereof, to yield a plurality of sequencing reads.
13. The method of embodiment 12, further comprising associating each sequencing read of the plurality of sequencing reads with the partition via its partition nucleic acid barcode sequence.
14. The method of embodiment 1, further comprising, in (b), partitioning the labeled cell with a bead, which bead comprises the plurality of partition nucleic acid barcode molecules.
15. The method of embodiment 14, wherein the partition nucleic acid barcode sequence of each nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules is releasably coupled to the bead.
16. The method of embodiment 15, further comprising, after (b), releasing partition nucleic acid barcode sequences of the plurality of partition nucleic acid barcode molecules from the bead.
17. The method of embodiment 16, wherein releasing partition nucleic acid barcode sequences of the plurality of partition nucleic acid barcode molecules from the bead comprises application of a stimulus.
18. The method of embodiment 14, wherein the bead is a gel bead.
19. The method of embodiment 18, wherein the gel bead is dissolvable or degradable.
20. The method of embodiment 1, wherein the partition is a well.
21. The method of embodiment 1, wherein the partition is a droplet.
22. The method of embodiment 119, wherein the cell labeling agent is a lipophilic moiety, and wherein the lipophilic moiety of the cell nucleic acid barcode molecule is a cholesterol.
23. The method of embodiment 1, wherein the plurality of nucleic acid molecules comprise a plurality of deoxyribonucleic acid molecules.
24. The method of embodiment 1, wherein the plurality of nucleic acid molecules comprise a plurality of ribonucleic acid molecules.
25. The method of embodiment 8, wherein the priming sequence is capable of hybridizing to a sequence of at least a subset of the plurality of nucleic acid molecules.
26. The method of embodiment 8, wherein the priming sequence is capable of hybridizing to a sequence of the cell nucleic acid barcode molecule.
27. The method of embodiment 1, wherein, prior to (b), the cell nucleic acid barcode molecule is at least partially disposed within the labeled cells.
28. The method of embodiment 1, wherein the plurality of nucleic acid molecules comprises a plurality of nucleic acid sequences corresponding to a V(D)J region of the genome of the cell.
29. The method of embodiment 28, wherein the V(D)J region of the genome of the cell comprises a T cell receptor variable region sequence, a B cell receptor variable region sequence, or an immunoglobulin variable region sequence.
30. The method of embodiment 29, wherein the partition further comprises a primer molecule, which primer molecule comprises a sequence complementary to a sequence of the plurality of nucleic acid molecules.
31. The method of embodiment 30, wherein the plurality of nucleic acid molecules comprises a plurality of messenger ribonucleic acid (mRNA) molecules, and wherein the sequence of the plurality of nucleic acid molecules is a poly(A) sequence.
32. The method of embodiment 31, wherein the plurality of barcoded nucleic acid products comprises a plurality of complementary deoxyribonucleic acid (cDNA) molecules, or derivatives thereof.
33. The method of embodiment 30, wherein (d) comprises hybridizing the sequence of the primer molecule to the sequence of a nucleic acid molecule of the plurality of nucleic acid molecules and using an enzyme to extend the sequence of the primer molecule to provide a nucleic acid product comprising a complementary deoxyribonucleic acid (cDNA) sequence corresponding to a sequence of the nucleic acid molecule.
34. The method of embodiment 33, wherein the enzyme is a reverse transcriptase.
35. The method of embodiment 33, wherein the enzyme incorporates a sequence at an end of the nucleic acid product.
36. The method of embodiment 35, wherein the sequence is a poly(C) sequence.
37. The method of embodiment 36, wherein at least a subset of the partition nucleic acid barcode molecules comprise a sequence complementary to the poly(C) sequence.
38. The method of embodiment 33, wherein (d) further comprises using the nucleic acid product and a partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules to generate a barcoded nucleic acid product of the plurality of barcoded nucleic acid products.
39. A method of analyzing a plurality of cells, comprising:
- (a) providing a plurality of cell nucleic acid barcode molecules comprising a plurality of cell nucleic acid barcode sequences, each cell nucleic acid barcode molecule of the plurality of cell nucleic acid barcode molecules comprising a single cell nucleic acid barcode sequence of the plurality of cell nucleic acid barcode sequences;
- (b) labeling the plurality of cells with the plurality of cell nucleic acid barcode sequences to generate a plurality of labeled cells, wherein each labeled cell of the plurality of labeled cells comprises a different cell nucleic acid barcode sequence of the plurality of cell nucleic acid barcode sequences;
- (c) generating a plurality of partitions comprising the plurality of labeled cells and a plurality of partition nucleic acid barcode sequences, wherein each partition of the plurality of partitions comprises a different partition nucleic acid barcode sequence of the plurality of partition nucleic acid barcode sequences;
- (d) synthesizing a plurality of barcoded nucleic acid products from the plurality of labeled cells, wherein a given barcoded nucleic acid product of the plurality of barcoded nucleic acid products comprises (i) a cell identification sequence comprising a given cell nucleic acid barcode sequence of the plurality of cell nucleic acid barcode sequences, or a complement of the given cell nucleic acid barcode sequence; and (ii) a partition identification sequence comprising a given partition nucleic acid barcode sequence of the plurality of partition nucleic acid barcode sequences, or a complement of the given partition nucleic acid barcode sequence; and
- (e) based at least in part on (d), determining a relative size of cells of the plurality of cells.
40. The method of embodiment 39, further comprising sequencing the plurality of barcoded nucleic acid products or derivatives thereof to yield a plurality of sequencing reads.
41. The method of embodiment 40, further comprising associating each sequencing read of the plurality of sequencing reads with a labeled cell of the plurality of labeled cells via its respective cell identification sequence, and associating each sequencing read of the plurality of sequencing reads with a partition of the plurality of partitions via its respective partition identification sequence.
42. The method of embodiment 40, wherein (e) comprises determining a number of cell identification sequences and/or partition identification sequences in the plurality of sequencing reads and using the number to determine the relative size of the cells.
43. The method of embodiment 42, wherein (e) comprises determining a number of cell identification sequences in the plurality of sequencing reads and using the number to determine the relative size of the cells.
44. The method of embodiment 42, wherein (e) comprises determining a number of partition identification sequences in the plurality of sequencing reads and using the number to determine the relative size of the cells.
45. The method of embodiment 39, wherein each cell nucleic acid barcode molecule of the plurality of cell nucleic acid barcode molecules comprises an optical label.
46. The method of embodiment 45, wherein the optical label is a fluorescent moiety.
47. The method of embodiment 45, wherein (e) comprises determining a relative number of cell nucleic acid barcode molecules of the plurality of cell nucleic acid barcode molecules associated with a given cell of the plurality of cells.
48. The method of embodiment 39, wherein each cell nucleic acid barcode molecule of the plurality of cell nucleic acid barcode molecules comprises a lipophilic moiety.
49. The method of embodiment 48, wherein the lipophilic moiety of each nucleic acid barcode molecule of the plurality of nucleic acid barcode molecules comprises cholesterol.
50. The method of embodiment 48, wherein the lipophilic moiety is linked to the plurality of cell nucleic acid barcode molecules via a linker.
51. The method of embodiment 39, wherein each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules comprises a unique molecular identifier sequence.
52. The method of embodiment 39, wherein the plurality of cells are derived from a plurality of cellular samples.
53. The method of embodiment 39, wherein a given cell nucleic acid barcode sequence of the plurality of cell nucleic acid barcode sequences identifies a cellular sample from which an associated cell of the plurality of cells originates.
54. The method of embodiment 53, wherein the sample is derived from a biological fluid.
55. The method of embodiment 54, wherein the biological fluid comprises blood or saliva.
56. The method of embodiment 39, wherein at least a subset of the plurality of partitions comprise at least two cells of the plurality of cells.
57. The method of embodiment 56, further comprising identifying at least two cells of the plurality of cells as originating from a same partition of the plurality of partitions using (i) cell nucleic acid barcode sequences of the plurality of cell nucleic acid barcode sequences, or complements thereof, and (ii) partition nucleic acid barcode sequences of the plurality of partition nucleic acid barcode sequences, or complements thereof.
58. The method of embodiment 39, wherein the plurality of partition nucleic acid barcode molecules are coupled to a plurality of beads.
59. The method of embodiment 58, wherein the plurality of beads is a plurality of gel beads.
60. The method of embodiment 59, wherein the plurality of gel beads is dissolvable or degradable.
61. The method of embodiment 58, wherein each partition of the plurality of partitions comprises a single bead of the plurality of beads.
62. The method of embodiment 58, wherein the plurality of partition nucleic acid barcode molecules is releasably coupled to the plurality of beads.
63. The method of embodiment 62, wherein the plurality of partition nucleic acid barcode sequences is releasable from the bead upon application of a stimulus.
64. The method of embodiment 63, wherein the stimulus is a chemical stimulus.
65. The method of embodiment 63, further comprising, subsequent to (b), releasing partition nucleic acid barcode molecules of the plurality of partition nucleic acid barcode molecules from each bead of the plurality of beads.
66. The method of embodiment 58, wherein each bead of the plurality of beads comprises at least 10,000 partition nucleic acid barcode molecules of the plurality of partition nucleic acid barcode molecules coupled thereto.
67. The method of embodiment 39, wherein the plurality of partitions is a plurality of droplets.
68. The method of embodiment 39, wherein the plurality of partitions is a plurality of wells.
69. The method of embodiment 39, wherein, in (b), the plurality of cells is labeled with the plurality of cell nucleic acid barcode sequences by binding cell binding moieties, each coupled to a given cell nucleic acid barcode sequence of the plurality of cell nucleic acid barcode sequences, to each cell of the plurality of cells.
70. The method of embodiment 69, wherein the cell binding moieties are antibodies, cell surface receptor binding molecules, receptor ligands, small molecules, pro-bodies, aptamers, monobodies, affimers, darpins, or protein scaffolds.
71. The method of embodiment 70, wherein the cell binding moieties are antibodies.
72. The method of embodiment 69, wherein the cell binding moieties bind to a protein of cells of the plurality of cells.
73. The method of embodiment 69, wherein the cell binding moieties bind to a cell surface species of cells of the plurality of cells.
74. The method of embodiment 69, wherein the cell binding moieties bind to a species common to each cell of the plurality of cells.
75. The method of embodiment 39, wherein, in (b), the plurality of cells is labeled with the plurality of cell nucleic acid barcode sequences by delivering nucleic acid barcode molecules each comprising an individual cell nucleic acid barcode sequence of the plurality of cell nucleic acid barcode sequences to each cell of the plurality of cells with the aid of a cell-penetrating peptide.
76. The method of embodiment 39, wherein, in (b), the plurality of cells is labeled with the plurality of cell nucleic acid barcode sequences with the aid of liposomes, nanoparticles, electroporation, or mechanical force.
77. The method of embodiment 76, wherein the mechanical force comprises the use of nanowires or microinjection.
78. A method of analyzing a plurality of cells, comprising:
- (a) providing a first plurality of cell nucleic acid barcode molecules comprising a first plurality of cell nucleic acid barcode sequences and a second plurality of cell nucleic acid barcode molecules comprising a second plurality of cell nucleic acid barcode sequences, each cell nucleic acid barcode molecule of the first plurality of cell nucleic acid barcode molecules and the second plurality of cell nucleic acid barcode molecules comprising a single cell nucleic acid barcode sequence of the first plurality of cell nucleic acid barcode sequences or the second plurality of cell nucleic acid barcode sequences;
- (b) labeling the plurality of cells with the first plurality of cell nucleic acid barcode sequences and the second plurality of cell nucleic acid barcode sequences to generate a plurality of labeled cells, wherein each labeled cell of the plurality of labeled cells comprises (i) a different cell nucleic acid barcode sequence of the first plurality of cell nucleic acid barcode sequences and (ii) a different cell nucleic acid barcode sequence of the second plurality of cell nucleic acid barcode sequences;
- (c) generating a plurality of partitions comprising the plurality of labeled cells and a plurality of partition nucleic acid barcode sequences, wherein each partition of the plurality of partitions comprises a different partition nucleic acid barcode sequence of the plurality of partition nucleic acid barcode sequences; and
- (d) synthesizing a plurality of barcoded nucleic acid products from the plurality of labeled cells, wherein a given barcoded nucleic acid product of the plurality of barcoded nucleic acid products comprises (i) a cell identification sequence comprising a given cell nucleic acid barcode sequence of the first plurality of cell nucleic acid barcode sequences or the second plurality of cell nucleic acid barcode sequences, or a complement of the given cell nucleic acid barcode sequence; and (ii) a partition identification sequence comprising a given partition nucleic acid barcode sequence of the plurality of partition nucleic acid barcode sequences, or a complement of the given partition nucleic acid barcode sequence.
79. The method of embodiment 78, wherein the plurality of labeled cells are derived from a plurality of cellular samples.
80. The method of embodiment 78, wherein a given cell nucleic acid barcode sequence of the first plurality of cell nucleic acid barcode sequences or the second plurality of cell nucleic acid barcode sequences identifies a cellular sample from which an associated cell of the plurality of labeled cells originates.
81. The method of embodiment 80, wherein the sample is derived from a biological fluid.
82. The method of embodiment 81, wherein the biological fluid comprises blood or saliva.
83. The method of embodiment 80, wherein the first plurality of cell nucleic acid barcode sequences identifies the cellular sample.
84. The method of embodiment 80, wherein the second plurality of cell nucleic acid barcode sequences identifies a condition to which an associated cell of the plurality of labeled cells is subjected.
85. The method of embodiment 80, wherein the first plurality of cell nucleic acid barcode sequences and the second plurality of cell nucleic acid barcode sequences identify a spatial position of an associated cell of the plurality of labeled cells prior to (c).
86. The method of embodiment 78, wherein each cell nucleic acid barcode molecule of the first plurality of cell nucleic acid barcode molecules or the second plurality of cell nucleic acid barcode molecules comprises a lipophilic moiety.
87. The method of embodiment 86, wherein the lipophilic moiety comprises cholesterol.
88. The method of embodiment 86, wherein the lipophilic moiety is linked to the first plurality of cell nucleic acid barcode molecules or the second plurality of cell nucleic acid barcode molecules via a linker.
89. The method of embodiment 78, wherein, subsequent to (c), the plurality of labeled cells are lysed or permeabilized.
90. The method of embodiment 78, wherein at least a subset of the plurality of partitions comprise at least two labeled cells of the plurality of labeled cells.
91. The method of embodiment 90, further comprising identifying at least two labeled cells of the plurality of labeled cells as originating from a same partition of the plurality of partitions using (i) cell nucleic acid barcode sequences of the first plurality of cell nucleic acid barcode sequences, or complements thereof, (ii) cell nucleic acid barcode sequences of the second plurality of cell nucleic acid barcode sequences, or complements thereof, and/or (iii) partition nucleic acid barcode sequences of the plurality of partition nucleic acid barcode sequences, or complements thereof.
92. The method of embodiment 78, wherein the plurality of partition nucleic acid barcode molecules are coupled to a plurality of beads.
93. The method of embodiment 92, wherein the plurality of beads is a plurality of gel beads.
94. The method of embodiment 93, wherein the plurality of gel beads is dissolvable or degradable.
95. The method of embodiment 92, wherein each partition of the plurality of partitions comprises a single bead of the plurality of beads.
96. The method of embodiment 92, wherein the plurality of partition nucleic acid barcode molecules is releasably coupled to the plurality of beads.
97. The method of embodiment 96, wherein the plurality of partition nucleic acid barcode molecules is releasable from the bead upon application of a stimulus.
98. The method of embodiment 97, wherein the stimulus is a chemical stimulus.
99. The method of embodiment 96, further comprising, subsequent to (b), releasing partition nucleic acid barcode molecules of the plurality of partition nucleic acid barcode molecules from each bead of the plurality of beads.
100. The method of embodiment 92, wherein each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules comprises a common partition nucleic acid barcode sequence.
101. The method of embodiment 78, wherein each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules comprises a unique molecular identifier sequence.
102. The method of embodiment 78, wherein each partition nucleic acid barcode molecule of the plurality of partition nucleic acid barcode molecules comprises a priming sequence.
103. The method of embodiment 102, wherein the priming sequence is a targeted priming sequence.
104. The method of embodiment 102, wherein the priming sequence is a random priming sequence.
105. The method of embodiment 78, further comprising identifying the first plurality of barcoded nucleic acid products and the second plurality of barcoded nucleic acid products as originating from labeled cells of the plurality of labeled cells.
106. The method of embodiment 78, wherein the plurality of partitions is a plurality of droplets.
107. The method of embodiment 78, wherein the plurality of partitions is a plurality of wells.
108. The method of embodiment 78, wherein the plurality of cells are labeled with the first plurality of cell nucleic acid barcode sequences and the second plurality of cell nucleic acid barcode sequences simultaneously.
109. The method of embodiment 78, wherein the plurality of cells are labeled with the first plurality of cell nucleic acid barcode sequences prior to the second plurality of cell nucleic acid barcode sequences.
110. The method of embodiment 109, wherein a cell nucleic acid barcode molecule of the second plurality of cell nucleic acid barcode sequences is coupled to a cell nucleic acid barcode molecule of the first plurality of cell nucleic acid barcode sequences coupled to a given cell of the plurality of cells.
111. The method of embodiment 109, wherein the second plurality of cell nucleic acid barcode sequences comprise a sequence complementary to a sequence of the first plurality of cell nucleic acid barcode sequences.
112. The method of embodiment 78, wherein the plurality of cells is labeled with the first plurality of cell nucleic acid barcode sequences and/or the second plurality of cell nucleic acid barcode sequences by binding cell binding moieties, each coupled to a given cell nucleic acid barcode sequence of the first plurality of cell nucleic acid barcode sequences and/or the second plurality of cell nucleic acid barcode sequences, to each cell of the plurality of cells.
113. The method of embodiment 112, wherein the cell binding moieties are antibodies, cell surface receptor binding molecules, receptor ligands, small molecules, pro-bodies, aptamers, monobodies, affimers, darpins, or protein scaffolds.
114. The method of embodiment 112, wherein the cell binding moieties bind to a protein of cells of the plurality of cells.
115. The method of embodiment 112, wherein the cell binding moieties bind to a cell surface species of cells of the plurality of cells.
116. The method of embodiment 112, wherein the cell binding moieties bind to a species common to each cell of the plurality of cells.
117. The method of embodiment 78, wherein the plurality of cells is labeled with the first plurality of cell nucleic acid barcode sequences and/or the second plurality of cell nucleic acid barcode sequences by delivering nucleic acid barcode molecules each comprising an individual cell nucleic acid barcode sequence of the first plurality of cell nucleic acid barcode sequences and/or the second plurality of cell nucleic acid barcode sequences to each cell of the plurality of cells with the aid of a cell-penetrating peptide.
118. The method of embodiment 78, wherein the plurality of cells is labeled with the first plurality of cell nucleic acid barcode sequences and/or the second plurality of cell nucleic acid barcode sequences with the aid of liposomes, nanoparticles, electroporation, or mechanical force.
119. The method of embodiment 1, wherein the cell labeling agent is selected from the group consisting of a lipophilic moiety, a nanoparticle, a dye, a fluorophore, and a peptide.

EXAMPLES
Example 1. Cells Incubated with Cholesterol-Conjugated Feature Barcodes can be Detected in Sequencing Libraries

Single cell sequencing libraries were prepared and analyzed from cells incubated with and without a cholesterol-conjugated feature barcode to assess the ability to detect the feature barcode in processed libraries.

Briefly, cells were washed in medium followed by a wash in PBS. The cells were counted and separated into 2 mL Eppendorf tubes and incubated for five minutes at room temperature with: (1) cholesterol-conjugated feature barcodes at a concentration of 1 uM; or (2) 1 uM of feature barcodes only (i.e., barcodes not conjugated to a cholesterol moiety). Following the incubation, the cells were washed three times in medium. The cells were then pooled and counted. The pooled cell population was then partitioned into droplets as generally described elsewhere herein to generate droplets comprising: (1) a single cell; and (2) a single gel bead comprising releasable nucleic acid barcode molecules attached thereto. The nucleic acid barcode molecules attached to the gel bead comprise a barcode sequence, a UMI sequence, and a GGG-containing capture sequence. The cholesterol-conjugated feature barcodes comprise a CCC-containing sequence complementary to the gel bead oligonucleotide capture sequence.
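
The complementarity between the GGG-containing capture sequence and the CCC-containing feature barcode sequence can be illustrated with a minimal sketch. The sequences below are hypothetical placeholders chosen only for illustration; the actual oligonucleotide sequences are not given in this example, and the function names are likewise assumptions.

```python
# Translation table for DNA base complements (A<->T, C<->G).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def can_hybridize(capture_seq: str, feature_seq: str) -> bool:
    """True if feature_seq is the exact reverse complement of capture_seq.

    This is a simplified, exact-match test; real hybridization tolerates
    mismatches and depends on reaction conditions.
    """
    return reverse_complement(capture_seq) == feature_seq.upper()


# Hypothetical GGG-containing capture sequence on the gel bead oligonucleotide.
capture = "ACGTAGGG"
# Hypothetical CCC-containing sequence on the feature barcode.
feature = "CCCTACGT"

print(reverse_complement(capture))
print(can_hybridize(capture, feature))
```

Under these assumptions, the GGG run at the 3′ end of the capture sequence pairs with the CCC run at the 5′ end of the feature barcode sequence.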

Cells in each droplet were then lysed and the cellular nucleic acids (including feature barcodes if present) were barcoded with the cell barcode sequences. Cell barcoded nucleic acids were then pooled and processed to complete library preparation. Fully constructed barcode libraries were analyzed on a BioAnalyzer to detect the presence of the feature barcode.

FIGS. 11A-11D show BioAnalyzer results for sequencing libraries prepared from four different cell populations: two cell populations incubated with cholesterol-conjugated feature barcodes (“oligo133”) and two cell populations incubated with feature barcodes only (“oligo131”, i.e., no cholesterol conjugation). As seen in FIGS. 11A-11B, the signal (measured in fluorescence units (FU), y-axis) at ~150 base pairs (the expected size of the feature barcodes; see x-axis) was about 500 FU (see arrows in FIGS. 11A-11B) for the two cell populations incubated with feature barcodes that were not conjugated to a cholesterol moiety. In contrast, as seen in FIGS. 11C-11D, a signal of over 5,000 FU (FIG. 11C, see arrow) and 10,000 FU (FIG. 11D, see arrow) was observed in libraries prepared from cells incubated with the cholesterol-conjugated feature barcodes. These results indicate that feature barcodes were successfully introduced into the cell populations and that the feature barcodes can be successfully detected when present in a mixed, pooled cell population.

Example 2. DNA Sequencing Results of Cholesterol-Conjugated Feature Barcode Libraries

Jurkat cells were washed in medium followed by a wash in PBS, and then counted. 100,000 such cells were split into 5 Eppendorf tubes (2 mL) to generate 5 different cell populations. Individual cell populations (four in total) were then incubated with 0.1 uM or 0.01 uM cholesterol-conjugated feature barcodes (four in total, one for each cell population) for five minutes at room temperature to yield one cell population “tagged” with a first barcode (BC1), one cell population “tagged” with a second barcode (BC2), one cell population “tagged” with a third barcode (BC3), and one cell population “tagged” with a fourth barcode (BC4). One cell population was not incubated with a cholesterol-conjugated feature barcode (background population). The 5 cell populations were then washed in media, pooled into a single tube, and then counted to determine cell numbers. The pooled cell population was then partitioned into single-cell containing droplets for single-cell barcoding as described above. Fully constructed barcode libraries were then sequenced on an Illumina sequencer to detect the presence of the cell and feature barcodes.

A summary of the analysis of the sequencing results is presented in Table 2. As seen in Table 2, sequencing reads corresponding to cells containing feature barcodes BC1, BC2, BC3, and BC4 were successfully detected from the pooled cell sample at both the 0.1 uM and 0.01 uM concentrations of cholesterol-conjugated feature barcodes tested. The “# background” column indicates the number of cells associated with the unlabeled population. Two replicates were performed at each concentration (replicate 1 and replicate 2).

TABLE 2

Sequence Analysis of Pooled Cell Populations

Description	Total cells	# BC1 cells	# BC2 cells	# BC3 cells	# BC4 cells	# doublets	# background	mean purity BC1 cells	mean purity BC2 cells	mean purity BC3 cells	mean purity BC4 cells
5′Chol-BC 0.1 uM (Replicate 1)	1593	285	314	303	344	8	339	0.953	0.966	0.961	0.923
5′Chol-BC 0.1 uM (Replicate 2)	1776	303	335	373	361	15	389	0.951	0.964	0.956	0.908
5′Chol-BC 0.01 uM (Replicate 1)	1676	325	337	348	313	11	342	0.936	0.945	0.951	0.871
5′Chol-BC 0.01 uM (Replicate 2)	1602	292	330	326	320	12	322	0.939	0.949	0.955	0.876

FIGS. 12A-12L show graphs from pooled cell populations incubated with 0.1 μM cholesterol-conjugated feature barcodes, with unique molecular identifier (UMI) counts on the x-axis and the number of cells on the y-axis. FIGS. 12A-12B show log₁₀ UMI counts of a first feature barcode sequence (“BC1”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 12A, replicate 1; FIG. 12B, replicate 2). From these results, a BC1-containing cell population can be clearly distinguished (1201a, replicate 1; 1201b, replicate 2). FIGS. 12C-12D show log₁₀ UMI counts of a second feature barcode sequence (“BC2”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 12C, replicate 1; FIG. 12D, replicate 2). From these results, a BC2-containing cell population can be clearly distinguished (1202a, replicate 1; 1202b, replicate 2). FIGS. 12E-12F show log₁₀ UMI counts of a third feature barcode sequence (“BC3”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 12E, replicate 1; FIG. 12F, replicate 2). From these results, a BC3-containing cell population can be clearly distinguished (1203a, replicate 1; 1203b, replicate 2). FIGS. 12G-12H show log₁₀ UMI counts of a fourth feature barcode sequence (“BC4”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 12G, replicate 1; FIG. 12H, replicate 2). From these results, a BC4-containing cell population can be clearly distinguished (1204a, replicate 1; 1204b, replicate 2).

FIGS. 12I-12J show 3D representations of UMI counts obtained from the pooled cell populations barcoded with 0.1 uM cholesterol-conjugated feature barcodes for replicate 1. Graphs depict UMI counts in linear scale (FIG. 12I) and log₁₀ scale (FIG. 12J). The three axes of the graphs show UMI counts corresponding to sequencing reads found to contain BC1 (1205, 1209), BC2 (1206, 1210), or BC3 (1207, 1211). UMI counts associated with sequencing reads containing BC4 and unlabeled cells (1208, 1212) are clustered together.

FIGS. 13A-13L show graphs from pooled cell populations incubated with 0.01 μM cholesterol-conjugated feature barcodes, with unique molecular identifier (UMI) counts on the x-axis and the number of cells on the y-axis. FIGS. 13A-13B show log₁₀ UMI counts of a first feature barcode sequence (“BC1”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 13A, replicate 1; FIG. 13B, replicate 2). From these results, a BC1-containing cell population can be clearly distinguished (1301a, replicate 1; 1301b, replicate 2). FIGS. 13C-13D show log₁₀ UMI counts of a second feature barcode sequence (“BC2”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 13C, replicate 1; FIG. 13D, replicate 2). From these results, a BC2-containing cell population can be clearly distinguished (1302a, replicate 1; 1302b, replicate 2). FIGS. 13E-13F show log₁₀ UMI counts of a third feature barcode sequence (“BC3”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 13E, replicate 1; FIG. 13F, replicate 2). From these results, a BC3-containing cell population can be clearly distinguished (1303a, replicate 1; 1303b, replicate 2). FIGS. 13G-13H show log₁₀ UMI counts of a fourth feature barcode sequence (“BC4”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 13G, replicate 1; FIG. 13H, replicate 2). From these results, a BC4-containing cell population can be clearly distinguished (1304a, replicate 1; 1304b, replicate 2).

FIGS. 13I-13J show 3D representations of UMI counts obtained from the pooled cell populations barcoded with 0.01 uM cholesterol-conjugated feature barcodes for replicate 1. Graphs depict UMI counts in linear scale (FIG. 13I) and log₁₀ scale (FIG. 13J). The three axes of the graphs show UMI counts corresponding to sequencing reads found to contain BC1 (1305, 1309), BC2 (1306, 1310), or BC3 (1307, 1311). UMI counts associated with sequencing reads containing BC4 and unlabeled cells (1308, 1312) are clustered together.

Example 3. DNA Sequencing Results of Antibody-Conjugated Feature Barcode Libraries

BioLegend “hashing” antibodies that broadly target cell surface proteins across human cell types were provided. The antibodies included a mixture of clones LNH94 (anti-CD298) and 2M2 (anti-β2-microglobulin). The antibodies were pooled into different populations and barcoded with different feature barcodes. Jurkat, Raji, and 293T cells were provided in separate populations and incubated with different antibody-associated feature barcodes. Jurkat cells were stained with antibodies barcoded with Barcode #18 (BC18); Raji cells were stained with antibodies barcoded with Barcode #19 (BC19); and 293T cells were stained with antibodies barcoded with Barcode #20 (BC20). A total of 9,000 cells were loaded. The separate cell populations were subsequently pooled. The pooled mixture was expected to include Jurkat cells comprising feature barcode BC18, Raji cells comprising feature barcode BC19, and 293T cells comprising feature barcode BC20. The number of cells in the pooled mixture was counted to determine cell numbers. The pooled cell population was then partitioned into single-cell containing droplets for single-cell barcoding as described above. Fully constructed barcode libraries were then sequenced on an Illumina sequencer to detect the presence of the cell and feature barcodes.

Feature barcode UMI counts were used to group cells after pooling and library preparation. Barcode purity was calculated as (target barcode UMIs)/(sum of all barcode UMIs). Multiplets were identified by high UMI count for more than 1 barcode.
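
The purity calculation and multiplet identification described above can be sketched as follows. This is a minimal illustration, not the analysis pipeline actually used: the per-cell UMI counts are made-up values, and the `min_umis` threshold standing in for a “high UMI count” is a hypothetical parameter, since no cutoff is specified in this example.

```python
from typing import Dict, List, Tuple


def barcode_purity(umi_counts: Dict[str, int], target: str) -> float:
    """Purity = (target barcode UMIs) / (sum of all barcode UMIs)."""
    total = sum(umi_counts.values())
    if total == 0:
        return 0.0  # no feature barcode UMIs observed for this cell
    return umi_counts.get(target, 0) / total


def classify_cell(umi_counts: Dict[str, int],
                  min_umis: int = 100) -> Tuple[str, List[str]]:
    """Classify a cell by how many barcodes have a high UMI count.

    min_umis is a hypothetical threshold chosen for illustration only.
    """
    high = sorted(bc for bc, n in umi_counts.items() if n >= min_umis)
    if not high:
        return ("background", high)
    if len(high) == 1:
        return ("singlet", high)
    return ("multiplet", high)


# Made-up UMI counts for two cells.
singlet = {"BC18": 985, "BC19": 10, "BC20": 5}
doublet = {"BC18": 500, "BC19": 3, "BC20": 450}

print(barcode_purity(singlet, "BC18"))
print(classify_cell(singlet))
print(classify_cell(doublet))
```

A fixed threshold is the simplest possible choice; in practice the cutoff would likely be chosen per barcode from the observed UMI count distributions.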

A summary of the analysis of the sequencing results is presented in Table 3. As seen in Table 3, sequencing reads corresponding to cells containing feature barcodes BC18, BC19, and BC20 were successfully detected from the pooled cell sample labeled with antibody-conjugated feature barcodes. The “# background” column indicates the number of cells not associated with any of the feature barcodes. Two replicates were performed (replicate 1 and replicate 2).

TABLE 3

Sequence Analysis of Pooled Cell Populations

Description	Total cells	# BC18 cells	# BC19 cells	# BC20 cells	# doublets	# background	mean purity BC18 cells	mean purity BC19 cells	mean purity BC20 cells
Cell multiplexing_9000_rep1_3′ ver_meta	8595	2866	2338	2800	506	85	0.985	0.99	0.813
Cell multiplexing_9000_rep2_3′ ver_meta	8175	2582	2407	2613	513	60	0.984	0.99	0.822

FIGS. 14A-14I show graphs from pooled cell populations incubated with antibody-conjugated feature barcodes, with unique molecular identifier (UMI) counts on the x-axis and the number of cells on the y-axis. FIGS. 14A-14B show UMI counts of a first feature barcode sequence (“BC18”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 14A, replicate 1; FIG. 14B, replicate 2). From these results, a BC18-containing cell population can be clearly distinguished (1401a, replicate 1; 1401b, replicate 2). FIGS. 14C-14D show UMI counts of a second feature barcode sequence (“BC19”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 14C, replicate 1; FIG. 14D, replicate 2). From these results, a BC19-containing cell population can be clearly distinguished (1402a, replicate 1; 1402b, replicate 2). FIGS. 14E-14F show UMI counts of a third feature barcode sequence (“BC20”) identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population (FIG. 14E, replicate 1; FIG. 14F, replicate 2). From these results, a BC20-containing cell population can be clearly distinguished (1403a, replicate 1; 1403b, replicate 2).

FIGS. 14G-14I show graphs from pooled cell populations incubated with antibody-conjugated feature barcodes, plotting the unique molecular identifier (UMI) counts of the various barcode sequences against one another. Cells enriched for one barcode, two barcodes (cell doublets), and three barcodes (cell triplets) are categorized. FIG. 14G shows UMI counts of feature barcode sequences identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population with log₁₀ UMI counts for BC18 on the y-axis and log₁₀ UMI counts for BC20 on the x-axis. The graph shows clustered UMI counts in which the majority of sequencing reads were found to contain BC18 (1404), BC19 (1405), BC20 (1406), and BC18 and BC20 (1407). FIG. 14H shows UMI counts of feature barcode sequences identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population with log₁₀ UMI counts for BC18 on the y-axis and log₁₀ UMI counts for BC19 on the x-axis. The graph shows clustered UMI counts in which the majority of sequencing reads were found to contain BC18 (1408), BC19 (1410), BC20 (1409), and BC18 and BC19 (1411). FIG. 14I shows UMI counts of feature barcode sequences identified from sequencing reads generated from sequencing libraries prepared from the pooled cell population with log₁₀ UMI counts for BC19 on the y-axis and log₁₀ UMI counts for BC20 on the x-axis. The graph shows clustered UMI counts in which the majority of sequencing reads were found to contain BC18 (1413), BC19 (1412), BC20 (1414), and BC19 and BC20 (1415). Additional UMI counts corresponding to other doublets and to triplets for each of FIGS. 14G-14I are less pronounced in these visualizations.

Cell types and multiplets are identifiable using feature barcode UMI counts. As shown in FIGS. 15A-15B, doublets identified by antibody UMI counts cluster together in antibody t-distributed stochastic neighbor embedding (t-SNE) (FIG. 15A), as well as in gene expression (GEX) t-SNE analyses (FIG. 15B). Clustering is driven by cell type in GEX t-SNE, and by antibody label in antibody t-SNE. Overlap between clusters shows that antibody-based doublet identification matches the expected gene expression profiles. FIG. 15A shows clusters corresponding to single barcodes BC18, BC19, and BC20 (1503, 1502, 1501, respectively); doublets including BC18 and BC19 (1505), BC18 and BC20 (1504), and BC19 and BC20 (1506); triplets including BC18, BC19, and BC20 (1507); and absence of any barcode (1508). FIG. 15B shows clusters corresponding to single barcodes BC18, BC19, and BC20 (1513, 1512, 1511, respectively); doublets including BC18 and BC19 (1515), BC18 and BC20 (1514), and BC19 and BC20 (1516); and absence of any barcode (1518). A cluster corresponding to triplets including BC18, BC19, and BC20 is not pronounced in FIG. 15B.

Example 4: Generating Labeled Polynucleotides

In this example, and with reference to FIGS. 26A and 26B, individual cells are lysed in partitions comprising gel bead emulsions (GEMs). GEMs, for example, can be aqueous droplets comprising gel beads. Within GEMs, a template polynucleotide comprising an mRNA molecule can be reverse transcribed by a reverse transcriptase and a primer comprising a poly(dT) region. A template switching oligo (TSO) present in the GEM, for example a TSO delivered by the gel bead, can facilitate template switching so that a resulting polynucleotide product or cDNA transcript from reverse transcription comprises the primer sequence, a reverse complement of the mRNA molecule sequence, and a sequence complementary to the template switching oligo. The template switching oligo can comprise additional sequence elements, such as a unique molecular identifier (UMI), a barcode sequence (BC), and a Read1 sequence. See FIG. 26A. In some cases, a plurality of mRNA molecules from the cell is reverse transcribed within the GEM, yielding a plurality of polynucleotide products having various nucleic acid sequences. Following reverse transcription, the polynucleotide product can be subjected to target enrichment in bulk. Prior to target enrichment, the polynucleotide product can be optionally subjected to additional reaction(s) to yield double-stranded polynucleotides. The target may comprise VDJ sequences of a T cell and/or B cell receptor gene sequence. As shown at the top of the right panel of FIG. 26A, the polynucleotide product (shown as a double-stranded molecule, but can optionally be a single-stranded transcript) can be subjected to a first target enrichment polymerase chain reaction (PCR) using a primer that hybridizes to the Read 1 region and a second primer that hybridizes to a first region of the constant region (C) of the receptor sequence (e.g., TCR or BCR). The product of the first target enrichment PCR can be subjected to a second, optional target enrichment PCR. 
In the second target enrichment PCR, a second primer that hybridizes to a second region of the constant region (C) of the receptor can be used. This second primer can, in some cases, hybridize to a region of the constant region that is closer to the VDJ region than the primer used in the first target enrichment PCR. Following the first and second (optional) target enrichment PCR, the resulting polynucleotide product can be further processed to add additional sequences useful for downstream analysis, for example sequencing. The polynucleotide products can be subjected to fragmentation, end repair, A-tailing, adapter ligation, and one or more clean-up/purification operations.

In some cases, a first subset of the polynucleotide products from cDNA amplification can be subjected to target enrichment (FIG. 26B, right panel) and a second subset of the polynucleotide products from cDNA amplification is not subjected to target enrichment (FIG. 26B, bottom left panel). The second subset can be subjected to further processing without enrichment to yield an unenriched, sequencing ready population of polynucleotides. For example, the second subset can be subjected to fragmentation, end repair, A-tailing, adapter ligation, and one or more clean-up/purification operations.

The labeled polynucleotides can then be subjected to sequencing analysis. Sequencing reads of the enriched polynucleotides can yield sequence information about a particular population of the mRNA molecules in the cell, whereas sequencing reads of the unenriched polynucleotides can yield sequence information about various mRNA molecules in the cell.

Example 5: Multiplexing Immune Samples

The multiplexing and sample pooling described herein may be applied to the analysis of immune cells (e.g., T cells and B cells) and immune receptors (e.g., TCRs, BCRs, and immunoglobulins). For example, a first population of cells comprising immune cells (such as peripheral blood mononuclear cells (PBMCs) or immune cells isolated from PBMCs) is labeled with a plurality of nucleic acid label molecules comprising a first cell barcode sequence and a universal capture sequence. A second population of cells comprising immune cells (such as PBMCs or immune cells isolated from PBMCs) is labeled with a plurality of nucleic acid label molecules comprising a second cell barcode sequence and the universal capture sequence. Additional populations of cells (e.g., from additional samples or treatment conditions) can be labeled with additional cell barcode sequences as necessary. Additional labels can also be added to the cells, such as in a “combinatorial tagging” scheme as described elsewhere herein. Further, in some instances, the labels on cell populations can be stabilized through use of one or more anchor oligonucleotides (e.g., attached to a lipophilic moiety) as described herein.

Labeled cell populations are then pooled and partitioned into a plurality of partitions (e.g., a plurality of aqueous droplets or wells of a microwell array) such that at least some partitions of the plurality of partitions comprise a single labelled cell and a single bead (e.g., a gel bead) comprising a plurality of nucleic acid barcode molecules comprising a common partition barcode sequence and a template switch oligonucleotide (TSO) sequence. The TSO sequence is configured to facilitate a template switching reaction as described herein to generate barcoded molecules comprising a sequence corresponding to an immune transcript (e.g., TCR, BCR, immunoglobulin). In some instances, the TSO sequence is also complementary to and/or capable of hybridizing to the universal capture sequence of the label molecules. In other instances, the nucleic acid barcode molecules comprise (1) a first plurality of nucleic acid barcode molecules comprising (i) a common partition barcode sequence; and (ii) a TSO sequence configured to facilitate a template switching reaction; and (2) a second plurality of nucleic acid barcode molecules comprising (i) the common partition barcode sequence and (ii) a capture sequence complementary to and/or capable of hybridizing to the universal capture sequence of the label molecules. See, e.g., FIG. 25.

Subsequent to partitioning, cells are lysed to release mRNA, which is then barcoded, e.g., as described in Example 4. Nucleic acid label molecules are then hybridized to the partition barcode molecules and a nucleic acid molecule is generated comprising the label barcode and the partition barcode. Barcoded products may then be pooled and subjected to one or more reactions to generate a sequencing library, such as a library suitable for an Illumina sequencer.
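
Because each resulting molecule carries both a label barcode and a partition barcode, pooled cells can be traced back to their sample of origin after sequencing. The following is a minimal sketch of that bookkeeping, assuming reads have already been parsed into (partition barcode, UMI, label barcode) tuples. The read tuples, barcode names, and sample names are hypothetical, and assigning each partition to its most abundant label is a simplification chosen for illustration rather than a method stated herein.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Read = Tuple[str, str, str]  # (partition barcode, UMI, label barcode)


def count_label_umis(reads: Iterable[Read]) -> Dict[str, Dict[str, int]]:
    """Count unique UMIs per label barcode within each partition barcode."""
    seen = set()
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for partition_bc, umi, label_bc in reads:
        key = (partition_bc, umi, label_bc)
        if key in seen:
            continue  # same molecule seen again (e.g., a PCR duplicate)
        seen.add(key)
        counts[partition_bc][label_bc] += 1
    return {p: dict(c) for p, c in counts.items()}


def assign_samples(counts: Dict[str, Dict[str, int]],
                   label_to_sample: Dict[str, str]) -> Dict[str, str]:
    """Assign each partition to the sample of its most abundant label."""
    assignments = {}
    for partition_bc, label_counts in counts.items():
        top_label = max(label_counts, key=label_counts.get)
        assignments[partition_bc] = label_to_sample.get(top_label, "unknown")
    return assignments


# Hypothetical parsed reads from two partitions.
reads = [
    ("PART_A", "UMI1", "LABEL_1"),
    ("PART_A", "UMI1", "LABEL_1"),  # duplicate read of the same molecule
    ("PART_A", "UMI2", "LABEL_1"),
    ("PART_A", "UMI3", "LABEL_2"),
    ("PART_B", "UMI4", "LABEL_2"),
    ("PART_B", "UMI5", "LABEL_2"),
]
label_to_sample = {"LABEL_1": "sample_1", "LABEL_2": "sample_2"}

counts = count_label_umis(reads)
print(counts)
print(assign_samples(counts, label_to_sample))
```

A partition with comparable UMI counts for two labels would more plausibly be flagged as a multiplet than assigned to either sample; that check is omitted here for brevity.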

While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. It is not intended that the invention be limited by the specific examples provided within the specification. While the invention has been described with reference to the aforementioned specification, the descriptions and illustrations of the embodiments herein are not meant to be construed in a limiting sense. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. Furthermore, it shall be understood that all aspects of the invention are not limited to the specific depictions, configurations or relative proportions set forth herein which depend upon a variety of conditions and variables. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is therefore contemplated that the invention shall also cover any such alternatives, modifications, variations or equivalents. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.

Number	Date	Country
62596557	Dec 2017	US
62723960	Aug 2018	US
62596557	Dec 2017	US
62596557	Dec 2017	US

	Number	Date	Country
Parent	16439568	Jun 2019	US
Child	17462712		US

	Number	Date	Country
Parent	PCT/US2018/064600	Dec 2018	US
Child	16439568		US
Parent	16107685	Aug 2018	US
Child	PCT/US2018/064600		US
Parent	16107685	Aug 2018	US
Child	16439568		US
