This specification describes technologies to evaluate clonotypes.
The discovery of patterns in a dataset facilitates a number of technical applications such as the validation, in the biological arts, of RNA-extraction protocols and associated methodologies that result in mRNA sequencing of mRNA in single cells. Such techniques have given rise to high throughput transcript identification and sequencing of genes in hundreds or even thousands of individual cells in a single dataset. Thus, in the art, datasets containing attribute values (e.g., transcript reads mapped to individual genes in a particular cell) have been generated. While this is a significant advancement in the art, a number of technical problems need to be addressed to make such data more useful.
In particular, the adaptive human immune system is comprised of B-cells and T-cells. During T-cell and B-cell development these cells express unique heterodimeric receptors that are used for recognition of pathogens. Each of these receptor chains is generated by a somatic rearrangement process that joins different segments of the TCR and BCR genes and creates a novel gene. This joining process is imprecise, with insertion of nontemplated nucleotides (N nucleotides) in the junction site, as well as 3′- and 5′-nucleotide deletion from the germline genes participating in the rearrangement. This region of random nucleotide insertion or deletion is referred to as the third complementarity-determining region (CDR3). The resulting CDR3 has a unique nucleotide sequence that is specific to that particular B-cell or T-cell and all its progeny, hence the clonotypic nature of the receptors. The CDR3 is the portion of these receptors that is most involved in interactions with intact soluble antigens (B-cells) or intracellular processed antigens presented as immunogenic peptides loaded in MHC molecules (T-cells). See Yassai et al., 2009, “A clonotype nomenclature for T-cell receptors,” Immunogenetics 61, pp. 493-502. Given the ability to generate large amounts of data, what is needed in the art are improved systems and methods for analyzing such data.
The following presents a summary of the invention in order to provide a basic understanding of some of the aspects of the invention. This summary is not an extensive overview of the invention. It is not intended to identify key/critical elements of the invention or to delineate the scope of the invention. Its sole purpose is to present some of the concepts of the invention in a simplified form as a prelude to the more detailed description that is presented later.
In the present disclosure, data representing a plurality of cells from a single subject is obtained. The data represents a plurality of clonotypes. The data includes a plurality of contigs for each respective clonotype in the plurality of clonotypes. Each respective contig in the plurality of contigs comprises (i) an indication of chain type for the respective contig, and (ii) a contig (consensus) sequence of an mRNA of the respective cell. There is determined, using the data, for each respective clonotype in the plurality of clonotypes, a number of the plurality of cells that represent the respective clonotype. In some embodiments, respective clonotypes in the plurality of clonotypes are ordered by the number of the plurality of cells that have the respective clonotype. In some embodiments, more than one cell in the plurality of cells has the same clonotype in the plurality of clonotypes. In some embodiments, more than ten cells in the plurality of cells have the same clonotype in the plurality of clonotypes. In some embodiments, the plurality of clonotypes comprises 25 clonotypes and the plurality of cells includes at least one cell for each clonotype in the plurality of clonotypes. In some embodiments, the plurality of clonotypes comprises 100 clonotypes and the plurality of cells includes at least one cell for each clonotype in the plurality of clonotypes. In some embodiments, the plurality of cells consists of B-cells from the single subject. In some embodiments, the single subject is mammalian. In some embodiments, the single subject is mammalian, a reptile, avian, amphibian, fish, ungulate, ruminant, bovine, equine, caprine, ovine, swine, camelid, monkey, ape, ursid, poultry, dog, cat, mouse, rat, dolphin, whale or shark.
One aspect of the present disclosure provides a system comprising one or more processing cores, a memory, and a display, the memory storing instructions for performing a method for analyzing one or more datasets using the one or more processing cores. The method comprises obtaining a first dataset representing a first plurality of cells from a single first subject. The first dataset represents a first plurality of clonotypes. The first dataset includes a plurality of contigs for each respective clonotype in the first plurality of clonotypes, where each respective contig in the plurality of contigs comprises an indication of chain type for the respective contig, a barcode, from among a plurality of barcodes, for the respective contig, wherein the barcode is associated with a respective cell in the first plurality of cells from which the respective contig was constructed, and a contig consensus sequence of an mRNA of the respective cell. In the method, there is determined, using the first dataset, for each respective clonotype in the first plurality of clonotypes, a percentage, absolute number or proportion of the first plurality of cells that represent the respective clonotype. There is provided, on a first portion of the display, a first two-dimensional visualization. A first axis of the first two-dimensional visualization represents individual clonotypes in the first plurality of clonotypes and a second axis of the first two-dimensional visualization represents the percentage, the absolute number or the proportion of the first plurality of cells that represent respective clonotypes. There is provided, on a second portion of the display, a listing of the first plurality of clonotypes.
In some embodiments, the first visualization is a bar chart.
In some embodiments, respective clonotypes in the first plurality of clonotypes are ordered on the second axis of the two-dimensional visualization by the percentage, absolute number or proportion of the first plurality of cells that have the respective clonotype.
In some embodiments, respective clonotypes in the first plurality of clonotypes are ordered in the listing by the percentage, absolute number or proportion of the first plurality of cells that have the respective clonotype.
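By way of illustration only, the per-clonotype cell count and the frequency ordering described above can be sketched as follows. The sketch assumes that each contig record has already been assigned a clonotype identifier; the record keys `barcode` and `clonotype` and the function name are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict


def clonotype_cell_counts(contigs):
    """Count distinct cells (barcodes) per clonotype.

    `contigs` is a list of dicts with the hypothetical keys 'barcode'
    and 'clonotype'. Returns (clonotype, cell_count, proportion) tuples
    with the most frequent clonotype first.
    """
    cells_by_clonotype = defaultdict(set)
    all_cells = set()
    for contig in contigs:
        # Several contigs (e.g., one per receptor chain) can share one
        # barcode, so cells are counted as distinct barcodes.
        cells_by_clonotype[contig["clonotype"]].add(contig["barcode"])
        all_cells.add(contig["barcode"])

    total = len(all_cells)
    rows = [
        (clonotype, len(cells), len(cells) / total)
        for clonotype, cells in cells_by_clonotype.items()
    ]
    # Order by descending cell count; ties are broken by identifier.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows
```

The second and third elements of each tuple correspond to the absolute number and the proportion, respectively; a percentage is the proportion multiplied by 100.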
In some embodiments, more than one cell in the first plurality of cells have the same clonotype in the first plurality of clonotypes. In some embodiments, more than ten cells in the first plurality of cells have the same clonotype in the first plurality of clonotypes.
In some embodiments, the first plurality of clonotypes comprises 25 clonotypes and the first plurality of cells includes at least one cell for each clonotype in the first plurality of clonotypes.
In some embodiments, the first plurality of clonotypes comprises 100 clonotypes and the first plurality of cells includes at least one cell for each clonotype in the first plurality of clonotypes.
In some embodiments, the first plurality of cells consists of B-cells from the single first subject.
In some embodiments, the listing includes for a first contig in the plurality of contigs for a first clonotype in the first plurality of clonotypes: an identifier for a V segment in the first contig, an identifier for a J region in the first contig, and an identifier for a C region in the first contig. In some such embodiments, the first contig is for an α chain or a γ chain. In some embodiments, the first contig is for a β chain or a δ chain and the first contig further includes an identifier for a D region in the first contig.
In some embodiments, the method further comprises providing an affordance on the display that allows a user to limit the number of clonotypes that are displayed in the first two-dimensional visualization and the listing to a number that is less than the first plurality of clonotypes in the first dataset.
In some embodiments, the method further comprises providing a first affordance, where, when a user toggles the first affordance, the display of the first two-dimensional visualization is replaced with a second two-dimensional visualization while maintaining the listing of the first plurality of clonotypes. In such embodiments, the second two-dimensional visualization provides a first filter for selection of one or more genes of a lymphocyte receptor represented by the first dataset. The second two-dimensional visualization also provides a second filter for one or more chain types. A first axis of the second two-dimensional visualization represents the one or more individual genes. A second axis of the second two-dimensional visualization represents the percentage, the absolute number or the proportion of the plurality of contigs present in the first dataset that include the one or more individual genes, independently of how the one or more individual genes have been incorporated into a clonotype. When a user toggles the first filter, an identity of the one or more genes is selected. When a user toggles the second filter, one or more chain types are selected, thereby limiting the percentage, the absolute number or the proportion of the plurality of contigs present in the first dataset that include the one or more individual genes to those contigs in the one or more chain types identified by the second filter that include the one or more individual genes. In some such embodiments, the first plurality of cells consists of B-cells from the single first subject, and the one or more genes is any combination of V gene, D gene, J gene, and C gene.
In some embodiments, a first contig in the plurality of contigs for a first clonotype in the first plurality of clonotypes in the first dataset for a respective cell in the first plurality of cells is between 600 and 800 bases in length and is determined from overlaying a plurality of sequence reads of the first contig, the plurality of sequence reads has an average read length that is less than 600 bases, and each sequence read in the plurality of sequence reads has the same unique molecular identifier.
In some embodiments, the first plurality of cells consists of B-cells from the single first subject.
In some embodiments, the single first subject is mammalian.
In some embodiments, the single first subject is mammalian, a reptile, avian, amphibian, fish, ungulate, ruminant, bovine, equine, caprine, ovine, swine, camelid, monkey, ape, ursid, poultry, dog, cat, mouse, rat, dolphin, whale or shark.
In some embodiments, the method further comprises providing a first affordance, where, when a user toggles the first affordance, the display of the first two-dimensional visualization is replaced with a second two-dimensional visualization while maintaining the listing of the first plurality of clonotypes. In such embodiments, the second two-dimensional visualization provides a first filter for selection of a pair of genes of a lymphocyte receptor represented by the first dataset, and a second filter for one or more chain types. A first axis of the second two-dimensional visualization represents a first individual gene in the pair of genes, and a second axis of the second two-dimensional visualization represents a second individual gene in the pair of genes. Each respective cell in a plurality of two-dimensional cells in the second two-dimensional visualization that intersects the first and second axes indicates a number of contigs of the one or more chain types designated by the second filter in the first dataset that include the respective gene on the first axis and the respective gene on the second axis for the respective two-dimensional cell. In some such embodiments, the second two-dimensional visualization is a heat map, and the heat map provides a scale that provides a numeric indication, in a color coded format, of the number of contigs of the one or more chain types designated by the second filter in the first dataset that include the respective gene on the first axis and the respective gene on the second axis for each two-dimensional cell in the plurality of two-dimensional cells of the second two-dimensional visualization.
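The values underlying such a heat map can be sketched, under stated assumptions, as a tally of gene pairs over the contigs that pass the chain-type filter. The record keys `chain`, `v_gene` and `j_gene`, the default choice of a V/J pair, and the function name are hypothetical; the disclosure does not prescribe a particular data layout.

```python
from collections import Counter


def gene_pair_counts(contigs, chain_types, first="v_gene", second="j_gene"):
    """Count contigs per (first gene, second gene) pair.

    Only contigs whose hypothetical 'chain' key is in `chain_types`
    are counted. Contigs lacking either gene annotation are skipped.
    """
    wanted = set(chain_types)
    counts = Counter()
    for contig in contigs:
        if contig["chain"] not in wanted:
            continue  # excluded by the chain-type filter
        gene_a = contig.get(first)
        gene_b = contig.get(second)
        if gene_a is None or gene_b is None:
            continue  # no annotation for one of the two genes
        counts[(gene_a, gene_b)] += 1
    return counts
```

Each key of the returned counter corresponds to one two-dimensional cell of the heat map, and each value to the number that the color scale would encode.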
In some embodiments, the method further comprises providing one or more affordances on the display, wherein the one or more affordances are configured to receive a user specified selection criterion. Responsive to receiving the user specified selection criterion, the listing is limited to those clonotypes in the first plurality of clonotypes that match the selection criterion. Further, the selection criterion is at least one contig, at least one barcode, at least one amino acid sequence, or at least one nucleic acid sequence.
In some embodiments, the method further comprises responsive to receiving the user specified selection criterion, further limiting the first two-dimensional visualization to the display of those clonotypes in the first plurality of clonotypes that match the selection criterion.
In some embodiments, the selection criterion includes a wild card thereby matching more than one contig, barcode, amino acid sequence, or nucleic acid sequence.
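A minimal sketch of wild card matching against a selection criterion follows. It assumes shell-style wild cards (`*` for any run of characters, `?` for a single character) as implemented by the Python standard library; the disclosure does not specify a wild card syntax, and the record keys and function name are hypothetical.

```python
from fnmatch import fnmatchcase


def matching_clonotypes(contigs, field, pattern):
    """Return the sorted clonotype identifiers that match a criterion.

    `field` names the hypothetical contig key to test (for example a
    barcode or an amino acid sequence) and `pattern` is the
    user-specified criterion, which may contain wild cards.
    """
    return sorted(
        {
            contig["clonotype"]
            for contig in contigs
            if fnmatchcase(contig[field], pattern)
        }
    )
```

A pattern without a wild card matches only an identical value, so the same function also covers the exact-match case.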
In some embodiments, the listing includes a plurality of rows, and each respective row in the plurality of rows specifies the indication of a chain type of a contig in the plurality of contigs for a clonotype in the first plurality of clonotypes. In such embodiments, the method further comprises, responsive to user selection of a row in the plurality of rows, replacing the display of the first two-dimensional visualization with a panel of summary information for the chain represented by the selected row, while maintaining the display of the listing. In some such embodiments, the panel of summary information comprises: a reference sequence that is a published curated sequence of the selected chain type, a consensus sequence from all the contigs in the first dataset that include the selected chain type, and a representation of each respective contig in the first dataset that includes the selected chain type, where the reference sequence, the consensus sequence, and each representation of each respective contig occupy different rows in the panel and are sequence aligned with respect to each other. In some embodiments, a representation of a respective contig includes one or more indicators, where the one or more indicators include a start codon of the respective contig, a mismatch between the respective contig and the consensus sequence, a deletion incurred in the respective contig with respect to the consensus sequence, a stop codon of the respective contig, or a coding region of the respective contig. In some such embodiments, responsive to selection of the consensus sequence, the method further comprises displaying the entire consensus sequence in a format that is configured for user cutting and pasting into a separate application running on the system.
In some embodiments, responsive to selection of a representation of a contig displayed in the panel of summary information, the method further comprises displaying information about the selected contig that includes one or more of a barcode for the contig, an identifier for the contig, a number of unique molecular identifiers supporting the contig, a number of sequence reads supporting the contig, a reference identity of a V gene for the contig, a reference identity of a D gene for the contig, a reference identity of a J gene for the contig, and a reference identity of a C gene for the contig.
In some embodiments, the method further comprises displaying a toggle, and user selection of the toggle switches the representation of each respective contig in the first dataset that includes the selected chain type from one of (i) a graphical representation of each respective contig and (ii) a sequence of each respective contig, to the other of (i) the graphical representation of each respective contig and (ii) the sequence of each respective contig.
In some embodiments, responsive to selection of a representation of a first contig displayed in the panel of summary information, the method further comprises displaying an alignment of each sequence read in a plurality of sequence reads to the first contig, wherein each sequence read in the plurality of sequence reads has a unique molecular identifier that is associated with the first contig. In some embodiments, a plurality of unique molecular identifiers is associated with the first contig, and the method further comprises displaying a unique molecular identifier affordance that affords choosing between (i) selection of all the unique molecular identifiers in the plurality of unique molecular identifiers and (ii) selection of a single unique molecular identifier in the plurality of unique molecular identifiers. When the single unique molecular identifier is selected, only those sequence reads for the first contig that have the single unique molecular identifier are displayed in the alignment of each sequence read in the plurality of sequence reads to the first contig.
In some embodiments, the method further comprises obtaining a second dataset representing a second plurality of cells from a single second subject, where the second dataset represents a second plurality of clonotypes, and the second dataset includes a plurality of contigs for each respective clonotype in the second plurality of clonotypes, wherein each respective contig in the plurality of contigs comprises: an indication of chain type for the respective contig, a barcode for the respective contig, wherein the barcode is associated with a respective cell in the second plurality of cells from which the respective contig was constructed, and a contig consensus sequence of an mRNA of the respective cell. In the method, a determination is made, using the second dataset, for each respective clonotype in the second plurality of clonotypes, of a percentage, absolute number or proportion of the second plurality of cells that represent the respective clonotype. Further in the method, a comparison of the first dataset to the second dataset is performed at a paired-clonotype, single-cell level that evaluates a number of cells with a given clonotype in the first dataset that match the clonotype of cells with the same clonotype in the second dataset, thereby identifying a pairwise clonotype commonality between the first dataset and the second dataset. In some such embodiments, the pairwise clonotype commonality between the first dataset and the second dataset is a Morisita-Horn metric. In some such embodiments, the method further comprises displaying for each clonotype in a subset of the first plurality of clonotypes: a percentage, an absolute number or a proportion of the first plurality of cells that represent the respective clonotype in the first dataset, and a percentage, an absolute number or a proportion of the second plurality of cells that represent the respective clonotype in the second dataset.
In some instances, the subset of the first plurality of clonotypes are those clonotypes in the first plurality of clonotypes that are each represented by at least a threshold percentage, absolute number or proportion of the first plurality of cells.
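For illustration, the Morisita-Horn metric mentioned above can be computed from per-clonotype cell counts in the two datasets. With x_i and y_i denoting the number of cells of clonotype i in the first and second dataset, and X and Y the respective totals, the standard form of the index is 2·Σ x_i·y_i / ((Σ x_i²/X² + Σ y_i²/Y²)·X·Y). The sketch below assumes the counts are supplied as dictionaries keyed by clonotype identifier; the function and argument names are hypothetical.

```python
def morisita_horn(counts_a, counts_b):
    """Morisita-Horn overlap between two clonotype count tables.

    `counts_a` and `counts_b` map a clonotype identifier to the number
    of cells with that clonotype in each dataset. The result is 0.0
    when no clonotype is shared and 1.0 when the two datasets have
    identical clonotype proportions.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both datasets must contain at least one cell")

    # Only clonotypes present in both datasets contribute to the
    # cross term; all other products are zero.
    shared = set(counts_a) & set(counts_b)
    cross = sum(counts_a[c] * counts_b[c] for c in shared)

    # Simpson-type dominance terms for each dataset.
    d_a = sum(n * n for n in counts_a.values()) / total_a**2
    d_b = sum(n * n for n in counts_b.values()) / total_b**2

    return 2 * cross / ((d_a + d_b) * total_a * total_b)
```

Because the index depends on proportions rather than raw totals, two datasets of different sizes with the same clonotype proportions still score 1.0.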
In some embodiments, the method further comprises displaying for each respective clonotype element in a plurality of clonotype elements: a percentage, an absolute number or a proportion of the contigs in the first dataset that include the respective clonotype element, and a percentage, an absolute number or a proportion of the contigs in the second dataset that include the respective clonotype element. In some such embodiments, each clonotype element in the plurality of clonotype elements is a different V gene sequence. In some embodiments, each clonotype element in the plurality of clonotype elements is a different D gene sequence. In some embodiments, each clonotype element in the plurality of clonotype elements is a different J gene sequence. In some embodiments, each clonotype element in the plurality of clonotype elements is a different C gene sequence.
In some embodiments, the first plurality of cells consists of B-cells from the single first subject, and the second plurality of cells consists of B-cells from the single second subject, and the method further comprises displaying for each respective B-cell isotype in a plurality of B-cell isotypes: a percentage, an absolute number or a proportion of the first dataset that has the respective B-cell isotype, and a percentage, an absolute number or a proportion of the second dataset that has the respective B-cell isotype.
In some embodiments, the single first subject and the single second subject are the same subject.
In some embodiments, the single first subject and the single second subject are different subjects.
In some embodiments, the method further comprises obtaining a second dataset representing a second plurality of cells from a single second subject, where the second dataset comprises a corresponding discrete attribute value for mRNA for each gene in a plurality of genes for each respective cell in the second plurality of cells, each corresponding discrete attribute value for mRNA for each gene in the plurality of genes for each respective cell in the second plurality of cells is supported by one or more barcodes in the plurality of barcodes, and individual respective cells in the first plurality of cells represented by the first dataset are present in the second plurality of cells and mappable between the first dataset and the second dataset through the plurality of barcodes. In the method, the second dataset is clustered using the discrete attribute value for mRNA for each gene in the plurality of genes, or principal components derived therefrom, for each respective cell in the second plurality of cells, thereby assigning each respective cell in the second plurality of cells to a corresponding cluster in a plurality of clusters, where each respective cluster in the plurality of clusters consists of a unique different subset of the second plurality of cells. In the method, a subset of the first plurality of cells is selected by selecting those cells in the first plurality of cells that map onto the cells in the second plurality of cells in a cluster selected from among the plurality of clusters. In the method, clonotype information from the first dataset for the subset of the first plurality of cells is displayed without displaying clonotype information for cells in the first plurality of cells outside of the subset of the first plurality of cells.
In some such embodiments, the displaying clonotype information comprises providing a second two-dimensional visualization, where a first axis of the second two-dimensional visualization represents individual clonotypes represented in the subset of the first plurality of cells, and a second axis (e.g., orthogonal to the first axis) of the second two-dimensional visualization represents a percentage, an absolute number or a proportion of the subset of the first plurality of cells that represent respective clonotypes in the subset of the first plurality of cells.
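The barcode-based mapping between the two datasets described above can be sketched as follows, assuming that the cluster assignments from the second dataset and the clonotype assignments from the first dataset are each available as a dictionary keyed by cell barcode. The argument and function names are hypothetical.

```python
from collections import Counter


def clonotypes_in_cluster(cluster_by_barcode, clonotype_by_barcode, cluster):
    """Tally clonotypes for the cells that fall in one cluster.

    `cluster_by_barcode` maps barcode -> cluster label (second dataset).
    `clonotype_by_barcode` maps barcode -> clonotype (first dataset).
    Cells of the selected cluster that have no clonotype in the first
    dataset are ignored; cells outside the cluster are never counted.
    """
    selected = {
        barcode
        for barcode, label in cluster_by_barcode.items()
        if label == cluster
    }
    counts = Counter(
        clonotype_by_barcode[barcode]
        for barcode in selected
        if barcode in clonotype_by_barcode
    )
    # Most frequent clonotype first; ties are broken by identifier.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

The returned pairs supply the two axes of the second two-dimensional visualization: the clonotype on the first axis and its cell count within the selected cluster on the second.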
In some embodiments, the single first subject and the single second subject are the same subject.
In some embodiments, the clustering the second dataset comprises hierarchical clustering, agglomerative clustering using a nearest-neighbor algorithm, agglomerative clustering using a farthest-neighbor algorithm, agglomerative clustering using an average linkage algorithm, agglomerative clustering using a centroid algorithm, or agglomerative clustering using a sum-of-squares algorithm.
In some embodiments, the clustering the second dataset comprises application of a Louvain modularity algorithm, k-means clustering, a fuzzy k-means clustering algorithm, or Jarvis-Patrick clustering.
In some embodiments, the clustering the second dataset comprises k-means clustering of the discrete attribute value dataset into a predetermined number of clusters. In some such embodiments, the predetermined number of clusters is an integer between 2 and 50.
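As a non-limiting sketch of k-means clustering into a predetermined number of clusters, the following applies Lloyd's algorithm to per-cell feature vectors (for example, discrete attribute values or principal components derived therefrom). The random initialization from the input points, the fixed seed, the iteration cap, and the function names are assumptions made for illustration; they are not prescribed by the disclosure.

```python
import random


def _sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of numeric tuples) into k clusters.

    Returns (labels, centroids), where labels[i] is the cluster index
    assigned to points[i].
    """
    rng = random.Random(seed)
    # Assumed initialization: k distinct input points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: _sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Each cell receives exactly one label, so the resulting clusters are disjoint subsets of the cells, consistent with the cluster definition given above.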
Another aspect of the present disclosure provides a method for analyzing one or more datasets. The method comprises, at a computer system comprising a memory, a processor and a display: obtaining, using the processor, a first dataset representing a first plurality of cells from a single first subject, where the first dataset represents a first plurality of clonotypes, and the first dataset includes a plurality of contigs for each respective clonotype in the first plurality of clonotypes, wherein each respective contig in the plurality of contigs comprises: an indication of chain type for the respective contig, a barcode, from among a plurality of barcodes, for the respective contig, wherein the barcode is associated with a respective cell in the first plurality of cells from which the respective contig was constructed, and a contig consensus sequence of an mRNA of the respective cell. In the method, a determination is made, using the first dataset and the processor, for each respective clonotype in the first plurality of clonotypes, of a percentage, absolute number or proportion of the first plurality of cells that represent the respective clonotype. Further in the method, there is provided on a first portion of the display a first two-dimensional visualization. A first axis of the first two-dimensional visualization represents individual clonotypes in the first plurality of clonotypes and a second axis of the first two-dimensional visualization represents the percentage, the absolute number or the proportion of the first plurality of cells that represent respective clonotypes. Further in the method, there is provided on a second portion of the display a listing of the first plurality of clonotypes.
Still another aspect of the present disclosure provides a non-transitory computer readable storage medium. The non-transitory computer readable storage medium stores instructions, which, when executed by a computer system having a display, cause the computer system to perform a method for analyzing one or more datasets, the method comprising: obtaining a first dataset representing a first plurality of cells from a single first subject, where the first dataset represents a first plurality of clonotypes, and the first dataset includes a plurality of contigs for each respective clonotype in the first plurality of clonotypes. Each respective contig in the plurality of contigs comprises: an indication of chain type for the respective contig, a barcode, from among a plurality of barcodes, for the respective contig, wherein the barcode is associated with a respective cell in the first plurality of cells from which the respective contig was constructed, and a contig consensus sequence of an mRNA of the respective cell. In the method, a determination is made, using the first dataset, for each respective clonotype in the first plurality of clonotypes, of a percentage, absolute number or proportion of the first plurality of cells that represent the respective clonotype. Further in the method, there is provided on a first portion of the display a first two-dimensional visualization, wherein a first axis of the first two-dimensional visualization represents individual clonotypes in the first plurality of clonotypes and a second axis of the first two-dimensional visualization represents the percentage, the absolute number or the proportion of the first plurality of cells that represent respective clonotypes. Further in the method, there is provided on a second portion of the display a listing of the first plurality of clonotypes.
Various embodiments of systems, methods and devices within the scope of the appended claims each have several aspects, no single one of which is solely responsible for the desirable attributes described herein. Without limiting the scope of the appended claims, some prominent features are described herein. After considering this discussion, and particularly after reading the section entitled “Detailed Description” one will understand how the features of various embodiments are used.
All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference in their entireties to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.
The implementations disclosed herein are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings. Like reference numerals refer to corresponding parts throughout the several views of the drawings.
Reference will now be made in detail to embodiments, examples of which are illustrated in the accompanying drawings. In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. However, it will be apparent to one of ordinary skill in the art that the present disclosure may be practiced without these specific details. In other instances, well-known methods, procedures, components, circuits, and networks have not been described in detail so as not to unnecessarily obscure aspects of the embodiments.
The implementations described herein provide various technical solutions to analyze datasets. An example of such datasets are datasets arising from sequencing pipelines that sequence the VDJ regions in single cells, such as B-cells and T-cells. Details of implementations are now described in conjunction with the Figures.
In some implementations, one or more of the above identified elements are stored in one or more of the previously mentioned memory devices, and correspond to a set of instructions for performing a function described above. The above identified modules, data, or programs (e.g., sets of instructions) need not be implemented as separate software programs, procedures, datasets, or modules, and thus various subsets of these modules and data may be combined or otherwise re-arranged in various implementations. In some implementations, the non-persistent memory 111 optionally stores a subset of the modules and data structures identified above. Furthermore, in some embodiments, the memory stores additional modules and data structures not described above. In some embodiments, one or more of the above identified elements is stored in a computer system, other than that of visualization system 100, that is addressable by visualization system 100 so that visualization system 100 may retrieve all or a portion of such data when needed.
In some embodiments, clonotype dataset 122 is organized as a series of data blocks with a master JSON table of contents at the beginning of the file and a JSON table of contents describing the addresses and structure of each block at the end of the file. In some embodiments, there is a plurality of blocks in the clonotype dataset 122.
In some embodiments, one such block constitutes a database (e.g., a sqlite3 database) containing one table each for clonotypes, lymphocyte (e.g., T-cell, B-cell) receptor chain reference sequences, lymphocyte (e.g., T-cell, B-cell) receptor chain consensus sequences 126, contigs 128, and a secondary table mapping cell barcodes 130 to clonotypes 124. This database is queried to create the clonotype list, sorted by frequency, and is queried again to populate the chain visualization with data when a chain is clicked in the user interface disclosed herein. Each row in the reference, consensus and contig tables also includes file offsets and lengths that encode the location of more detailed and hierarchical information about that entity within a set of JSON files, stored within other blocks in the plurality of blocks. Finally, alignment and sequence information for each reference and consensus is stored in the database for future debugging and troubleshooting.
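The query that produces the clonotype list sorted by frequency might look like the following sketch. The table name `barcode_clonotype`, its column names, and the helper functions are hypothetical; the actual schema of the database block is not given in this description.

```python
import sqlite3


def demo_connection():
    """Build an in-memory database with a hypothetical mapping table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE barcode_clonotype ("
        "barcode TEXT PRIMARY KEY, clonotype_id TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO barcode_clonotype VALUES (?, ?)",
        [
            ("AAAC", "clonotype1"),
            ("CCGT", "clonotype1"),
            ("GGTA", "clonotype2"),
        ],
    )
    return conn


def clonotypes_by_frequency(conn):
    """Return (clonotype_id, n_cells) rows, most frequent first."""
    query = """
        SELECT clonotype_id, COUNT(DISTINCT barcode) AS n_cells
        FROM barcode_clonotype
        GROUP BY clonotype_id
        ORDER BY n_cells DESC, clonotype_id ASC
    """
    return conn.execute(query).fetchall()
```

Calling `clonotypes_by_frequency(demo_connection())` on the made-up rows above yields one row per clonotype with its cell count, in descending order of frequency.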
In some embodiments, one or more blocks contain a reference annotation JSON file, which is a complete set of information about each reference per lymphocyte (e.g. T-cell, B-cell) receptor chain. This block is equivalent to VDJ chain reference sequence table 140. Accordingly, in some embodiments, VDJ chain reference sequence table 140 is a component of the clonotype dataset 122.
In some embodiments, one or more blocks contain a consensus annotation, e.g., as a JSON file, which is a complete set of information about each consensus sequence 126 per lymphocyte (e.g., T-cell, B-cell) receptor chain.
In some embodiments, one or more blocks contain a contig annotation, e.g., as a JSON file, which is a complete set of information about each contig 128. A contig 128 is the assembled sequence of a transcript that encodes a chain (e.g., T-cell α chain, T-cell β chain, B-cell heavy chain, B-cell light chain) of a lymphocyte receptor (e.g., T-cell receptor, B-cell immunoglobulin). Thus, in the example case of a single T-cell, it is expected that there would be at least one contig 128 for the α chain and at least one contig 128 for the β chain.
In some embodiments, one or more blocks contain a reference sequence, e.g., in FASTA format, that is used during clonotype dataset 122 file creation, not during VDJ browser 120 operation, for debugging purposes.
In some embodiments, one or more blocks contain a reference alignment, e.g. as a BAM file, which stores how chain consensus sequence/contigs 128 differ from the reference sequence. This is typically used during clonotype dataset 122 creation as opposed to during VDJ browser 120 operation, for instance, for debugging purposes.
In some embodiments, one or more blocks contain a reference alignment BAM index for the above identified BAM file that accelerates sequence alignment queries.
In some embodiments, one or more blocks contain a consensus sequence, e.g., in FASTA format, that is typically used during clonotype dataset 122 creation as opposed to during VDJ browser 120 operation.
In some embodiments, one or more blocks contain a consensus alignment BAM file that stores how contig sequences differ from the consensus and that is typically used during clonotype dataset 122 creation as opposed to during VDJ browser 120 operation.
In some embodiments, one or more blocks contain a contig BAM index which stores where to find read information for individual contigs.
In some embodiments, one or more blocks contain a contig BED file that stores gene annotations for each contig.
In some embodiments, one or more blocks contain a contig FASTA file that stores sequences of each contig.
In some embodiments, two processes are initiated when a user runs the VDJ browser 120: (i) a backend server process that reads the clonotype dataset 122 and returns JSON responses and (ii) a front-end web application that processes the JSON into a visualization and handles user input. In some embodiments, the backend server process extracts the sqlite3 database bytes out of the clonotype dataset 122 into a temporary location. The server process holds, in memory, a relation between a clonotype dataset 122 and its associated sqlite3 database file, discussed above, and directs all queries pertaining to a clonotype dataset 122 to that database. When shutting down, the server process cleans up after itself by removing all database files that were opened during the session.
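The backend behavior described above (extracting the database bytes to a temporary location, keeping a dataset-to-database relation in memory, and removing the files on shutdown) can be sketched as follows. This is a minimal sketch under stated assumptions: the class name DatasetServer, the method names, and the assumption that the byte offset and length of the database block are already known (for example, from a table of contents) are all hypothetical.

```python
import os
import sqlite3
import tempfile


class DatasetServer:
    """Hypothetical server-side helper; names are illustrative only."""

    def __init__(self):
        self._db_paths = {}  # dataset path -> temporary database path

    def open_dataset(self, dataset_path, offset, length):
        # Assumption: offset/length of the database block are already known.
        with open(dataset_path, "rb") as fh:
            fh.seek(offset)
            db_bytes = fh.read(length)
        fd, tmp_path = tempfile.mkstemp(suffix=".sqlite3")
        with os.fdopen(fd, "wb") as out:
            out.write(db_bytes)
        self._db_paths[dataset_path] = tmp_path

    def query(self, dataset_path, sql, params=()):
        # All queries for a dataset are directed to its extracted database.
        conn = sqlite3.connect(self._db_paths[dataset_path])
        try:
            return conn.execute(sql, params).fetchall()
        finally:
            conn.close()

    def shutdown(self):
        # Remove every database file extracted during the session.
        for tmp_path in self._db_paths.values():
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
        self._db_paths.clear()


def demo_round_trip():
    """Build a toy dataset file (8 padding bytes + database block) and query it."""
    fd, db_file = tempfile.mkstemp()
    os.close(fd)
    conn = sqlite3.connect(db_file)
    conn.execute("CREATE TABLE clonotypes (clonotype_id TEXT, n_cells INTEGER)")
    conn.execute("INSERT INTO clonotypes VALUES ('c1', 32)")
    conn.commit()
    conn.close()
    with open(db_file, "rb") as fh:
        db_bytes = fh.read()
    os.remove(db_file)

    fd, dataset_path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as out:
        out.write(b"\0" * 8 + db_bytes)

    server = DatasetServer()
    server.open_dataset(dataset_path, offset=8, length=len(db_bytes))
    rows = server.query(dataset_path, "SELECT clonotype_id, n_cells FROM clonotypes")
    extracted = server._db_paths[dataset_path]
    server.shutdown()
    os.remove(dataset_path)
    return rows, os.path.exists(extracted)
```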
Although
While a system in accordance with the present disclosure has been disclosed with reference to
The scRNAseq microfluidics system builds on the GemCode technology, which has been used for genome haplotyping, structural variant analysis and de novo assembly of a human genome. See Zheng et al., 2016, “Haplotyping germline and cancer genomes with high-throughput linked-read sequencing,” Nat. Biotechnol. 34, pp. 303-311; Narasimhan et al., 2016, “Health and population effects of rare gene knockouts in adult humans with related parents,” Science 352, pp. 474-477; and Mostovoy et al., 2016, “A hybrid approach for de novo human genome sequence assembly and phasing,” Nat. Methods 13, pp. 587-590, each of which is incorporated by reference, for a general description of GemCode technology. Such sequencing uses a gel bead-in-emulsion (GEM).
GEM generation takes place in a multi-channel microfluidic chip that encapsulates single gel beads at a predetermined fill rate, such as approximately 80%. For the clonotype datasets 122 of the present disclosure, in some embodiments, a 5′ gene expression protocol is followed rather than a 3′ gene expression protocol. In the case where the sample comprises T-cells, this provides full-length (5′ UTR to constant region), paired T-cell receptor (TCR) transcripts from a number of (e.g., 100-10,000) individual lymphocytes per sample. In the case where the sample comprises B-cells, this provides full-length (5′ UTR to constant region), paired B-cell immunoglobulin heavy chain and light chain transcripts from a number of (e.g., 100-10,000) individual lymphocytes per sample.
In some embodiments, as in the case of the 3′ gene expression protocol described in Zheng et al., id., the 5′ expression protocol includes partitioning the cells into GEMs. In particular, in some embodiments, single cell resolution is achieved by delivering the cells at a limiting dilution, such that the majority (˜90-99%) of generated GEMs contain no lymphocyte (cell), while the remainder largely contain a single lymphocyte. In some embodiments, upon dissolution of the single cell 5′ gel bead in a GEM, oligonucleotides containing (i) a read 1 sequencing primer (e.g., ILLUMINA R1 sequence), (ii) a barcode 130, (iii) a unique molecular identifier (UMI) 132, and (iv) a switch oligonucleotide are released and mixed with cell lysate and a master mix that contains poly(dT) primers. Incubation of the GEMs then produces barcoded, full-length cDNA from poly-adenylated mRNA. After incubation, the GEMs are broken and the pooled fractions are recovered. In some embodiments, magnetic beads (e.g., silane beads) are used to remove leftover biochemical reagents and primers from the post GEM reaction mixture. The barcoded, full-length V(D)J segments from lymphocyte cDNA are enriched by PCR amplification prior to library construction. In some embodiments, enzymatic fragmentation and size selection are used to generate variable length fragments that collectively span the V(D)J segments of the enriched receptor chains prior to library construction.
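The effect of limiting dilution can be illustrated numerically. The following sketch assumes, purely for illustration, that the number of cells per GEM follows a Poisson distribution; that statistical model is an assumption of this example and is not stated above. Under that assumption, the fraction of empty GEMs determines the mean loading, and from it the fraction of cell-containing GEMs that hold exactly one cell.

```python
import math


def poisson_loading(empty_fraction):
    """Return (mean cells per GEM, fraction of occupied GEMs with exactly one cell).

    Assumes Poisson-distributed cell counts per GEM (an illustrative model).
    """
    if not 0.0 < empty_fraction < 1.0:
        raise ValueError("empty_fraction must be strictly between 0 and 1")
    mean_cells = -math.log(empty_fraction)          # P(0 cells) = exp(-mean)
    p_single = mean_cells * math.exp(-mean_cells)   # P(exactly 1 cell)
    p_occupied = 1.0 - empty_fraction               # P(at least 1 cell)
    return mean_cells, p_single / p_occupied


for empty in (0.90, 0.99):
    mean_cells, single_share = poisson_loading(empty)
    print(f"empty={empty:.2f} mean={mean_cells:.4f} single-cell share={single_share:.4f}")
```

Under this model, raising the empty fraction from 90% to 99% increases the share of occupied GEMs that contain a single cell, at the cost of using fewer of the generated GEMs.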
In some embodiments, R1 (read 1 primer sequence) is added to the molecules during GEM incubation. P5 is added during target enrichment. P7, a sample index and R2 (read 2 primer sequence) are added during library construction via end repair, A-tailing, adaptor ligation and implementation of the polymerase chain reaction (PCR). The resulting single cell V(D)J libraries contain the P5 and P7 primers used in Illumina bridge amplification. See the Internet, at assets.contentful.com/an68im79xiti/26tufAiwI0KCYA0ou2gCWK/8d313d2b126a7a1652d13810 73e72015/CG000086_SingleCellVDJReagentKitsUserGuide_RevA.pdf, last accessed May 18, 2017, pp. 2-4, which is hereby incorporated by reference. See also, “Multiplexed Sequencing with the Illumina Genome Analyzer System,” copyright 2008, on the Internet at www.illumina.com/documents/products/datasheets/datasheet_sequencing_multiplex.pdf, last accessed May 18, 2017, hereby incorporated by reference, for documentation on the P5 and P7 primers. In some embodiments, the sequenced single cell V(D)J library is in the form of a standard ILLUMINA BCL data output folder. In some such embodiments, the BCL data includes the paired-end Read 1 (comprising the barcode 130, the UMI 132, the switch oligonucleotide, as well as the 5′ end of a receptor chain cDNA) and Read 2 (comprising a random part of the same receptor chain cDNA) and the sample index in the i7 index read. In some embodiments, a computer program such as the 10× CELL RANGER analysis pipeline performs secondary analysis on the BCL data, such as using the barcodes 130 to group read pairs from the same cells, assembling full-length V(D)J segments in the form of contigs 128, and thereby creating the clonotype dataset 122.
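The use of barcodes 130 and UMIs 132 to group read pairs can be sketched as follows. This is not the CELL RANGER implementation; it is a minimal illustration in which the barcode length (16 bases), the UMI length (10 bases), the function names, and the toy reads are all assumptions made for the example.

```python
from collections import defaultdict

BARCODE_LEN = 16  # assumed length, for illustration only
UMI_LEN = 10      # assumed length, for illustration only


def split_read1(read1):
    """Split Read 1 into (barcode, UMI, remainder).

    The remainder holds the switch oligonucleotide and the 5' end of the cDNA.
    """
    barcode = read1[:BARCODE_LEN]
    umi = read1[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
    remainder = read1[BARCODE_LEN + UMI_LEN:]
    return barcode, umi, remainder


def group_read_pairs(read_pairs):
    """Group (read1, read2) pairs first by barcode (cell), then by UMI (molecule)."""
    grouped = defaultdict(lambda: defaultdict(list))
    for read1, read2 in read_pairs:
        barcode, umi, remainder = split_read1(read1)
        grouped[barcode][umi].append((remainder, read2))
    return grouped


# Toy reads (not real data): two cells, the first with two distinct UMIs.
BC_A, BC_B = "A" * 16, "C" * 16
UMI_1, UMI_2 = "G" * 10, "T" * 10
toy_pairs = [
    (BC_A + UMI_1 + "TTTCTT", "ACGT"),
    (BC_A + UMI_1 + "TTTCAA", "GGCC"),
    (BC_A + UMI_2 + "TTTC", "AAAA"),
    (BC_B + UMI_1 + "TT", "CC"),
]
grouped = group_read_pairs(toy_pairs)
```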
The multiple sequence reads 134 with the same barcode 130 form at least one contig 128, and each such contig 128 represents a chain (e.g., T-cell receptor α chain, T-cell receptor β chain, B-cell heavy chain, B-cell light chain) of a single cell. The contig consensus sequences 126 for the contigs 128 of a cell are collectively used to determine the clonotype 124 of the cell. Stated differently, sequence reads 134 are grouped by barcode 130, and contigs 128 are assembled by looking at sequence reads 134 with the same UMI identifier 132. A set of chain consensus sequences, including a CDR3 region, is created by analyzing the common bases in the contigs 128. Cells with like CDR3 regions within these consensus sequences are grouped into clonotypes 124, and bar chart 302 of
In some embodiments, the clonotype dataset 122 includes the V(D)J clonotype of the T-cell receptor of any T-cells or B-cell immunoglobulins of any B-cells that were in the biological sample represented by the clonotype dataset 122. The clonotypes of T-cells and B-cells are described below.
T-Cell Clonotypes.
Most T-cell receptors are composed of an alpha chain and a beta chain. The T-cell receptor genes are similar to B-cell immunoglobulin genes discussed below in that they too contain multiple V, D and J gene segments in their beta chains (and V and J gene segments in their alpha chains) that are rearranged during the development of the lymphocyte to provide the cell with a unique antigen receptor. The T-cell receptor in this sense is the topological equivalent to an antigen-binding fragment of the antibody, both being part of the immunoglobulin superfamily. B-cells and T-cells are defined by their clonotype, that is the identity of the final rearrangement of the V(D)J regions into the heavy and light chains of a B-cell immunoglobulin, in the case of B-cells, or into each chain of the T-cell receptor in the case of T-cells.
There are two subsets of T-cells based on the exact pair of receptor chains expressed. These are either the alpha (α) and beta (β) chain pair, or the gamma (γ) and delta (δ) chain pair, identifying the αβ or γδ T-cell subsets, respectively. The expression of the β and δ chain is limited to one chain in each of their respective subsets and this is referred to as allelic exclusion (Bluthmann et al., 1988, “T-cell-specific deletion of T-cell receptor transgenes allows functional rearrangement of endogenous alpha- and beta-genes,” Nature 334, pp. 156-159; and Uematsu et al., 1988, “In transgenic mice the introduced functional T-cell receptor beta gene prevents expression of endogenous beta genes,” Cell 52, pp. 831-841, each of which is hereby incorporated by reference). These two chains are also characterized by the use of an additional DNA segment, referred to as the diversity (D) region, during the rearrangement process. The D region is flanked by N nucleotides, and together these constitute the NDN region of the CDR3 in these two chains. The CDR3 of each of the two receptor chains defines the clonotype 124 that is analyzed in
B-Cell Clonotypes.
B-cells are highly diverse, each expressing a practically unique B-cell immunoglobulin (e.g., a B-cell immunoglobulin receptor, BCR). There are approximately 10^10-10^11 B-cells in a human adult. See Ganusov et al., 2007, “Do most lymphocytes in humans really reside in the gut?,” Trends Immunol. 28(12), pp. 514-518, which is hereby incorporated by reference. B-cells are important components of adaptive immunity, and directly bind to pathogens through B-cell immunoglobulin receptors (BCRs) expressed on the cell surface of the B-cells. Each B-cell in an organism (e.g., a human) expresses a different BCR that allows it to recognize a particular set of molecular patterns. Individual B-cells gain this specificity during their development in the bone marrow, where they undergo a somatic rearrangement process that combines multiple germline-encoded gene segments to produce the BCR, as illustrated in
Because of the rearrangement that the V(D)J region undergoes in T-cells and B-cells, only parts of the V(D)J regions (the V, D, and J segments) can be traced back to segments encoded in highly repetitive regions of the germline that are not typically sequenced directly from the germline DNA. Furthermore, the V, D, and J segments can be significantly modified during the V(D)J rearrangement process and, in the case of B-cells, through somatic hypermutation. As such, there are typically no pre-existing full-length templates to which sequence reads of the V(D)J regions of T-cell receptors and B-cell immunoglobulins can be aligned. Clonal grouping, referred to herein as clonotyping, involves clustering the set of B-cell immunoglobulin V(D)J sequences (in the case of B-cells) or the set of T-cell receptor V(D)J sequences (in the case of T-cells) into clones, each of which is defined as a group of cells that are descended from a common ancestor. Unlike the case of T-cells, members of a B-cell clone do not carry identical V(D)J sequences, but differ because of somatic hypermutation. Thus, defining clones (clonotyping) based on BCR sequence data requires machine learning techniques in some instances. See, for example, Chen et al., 2010, “Clustering-based identification of clonally-related immunoglobulin gene sequence sets,” Immunome Res. 6 Suppl 1: S4; and Hershberg and Prak, 2015, “The analysis of clonal expansion in normal and autoimmune B-cell repertoires,” Philos Trans R Soc Lond B Biol Sci. 370(1676), each of which is hereby incorporated by reference.
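To illustrate why B-cell clonal grouping calls for clustering rather than exact matching, the following sketch links sequences of equal length that differ at no more than a chosen number of positions and reports the resulting single-linkage clusters. This is a deliberately simple stand-in, not the method of the references cited above; the function names, the mismatch threshold, and the toy sequences are assumptions made for the example.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(ch1 != ch2 for ch1, ch2 in zip(a, b))


def single_linkage_clusters(sequences, max_mismatches):
    """Cluster sequences; two sequences are linked if they have equal length
    and differ at no more than max_mismatches positions (union-find)."""
    parent = list(range(len(sequences)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(sequences)):
        for j in range(i + 1, len(sequences)):
            same_length = len(sequences[i]) == len(sequences[j])
            if same_length and hamming(sequences[i], sequences[j]) <= max_mismatches:
                parent[find(i)] = find(j)

    clusters = {}
    for i, seq in enumerate(sequences):
        clusters.setdefault(find(i), []).append(seq)
    return sorted(clusters.values())


# Toy sequences (not real data): the first three form a chain of single mismatches.
toy_sequences = ["TGTGCAAGA", "TGTGCAAGG", "TGTGCTAGG", "TGTACCTTT"]
clusters = single_linkage_clusters(toy_sequences, max_mismatches=1)
```

Note that the first and third toy sequences differ at two positions yet fall into the same cluster through the second sequence, which is the behavior expected of descendants that accumulate mutations step by step.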
In general, the VDJ cell browser 120 can be used to analyze clonotyping datasets prepared from T-cells or B-cells. In the case of T-cells, clonotyping identifies the unique nucleotide CDR3 sequences of a T-cell receptor chain, which constitute V, D, and J segments. In accordance with the systems and methods of the present disclosure, this generally involves PCR amplification of the mRNA obtained using the above described scRNAseq microfluidics system in which each GEM encapsulates a single cell, employing V-region-specific primers and either constant region (C) specific or J-region-specific primer pairs, followed by nucleotide sequencing of the amplicon.
The VDJ cell browser 120 is applicable to genes that code for the antigen receptors of B-cells (the antibodies) and T-cells (the T-cell receptors). As discussed above, T-cells and B-cells get their diversity by a recombination process involving the V, D, J and C germline regions, so each T-cell and B-cell encodes a unique clonotype.
Sequence reads 134 obtained from mRNA encoding all or portions of a cell receptor chain for an individual cell are used to derive a contig 128 that includes the CDR3 region. Each of the contigs 128 for a given cell will have a common barcode 130 thereby defining the set of contigs for the given cell and, correspondingly, the set of CDR3 sequences for the given cell. The CDR3 region across the set of contig consensus sequences 126 for the given cell thereby determines the clonotype 124 of the cell. Thus, graph 302 represents the frequency of clonotype 124 occurrence across the plurality of cells represented in a clonotype dataset 122. In the biological sample represented by the clonotype dataset 122, each clonotype is represented by some number of cells. These clonotypes are sorted by frequency of clonotype occurrence. Table 304 lists the clonotype information that is summarized in graph 302. Each box 306 in table 304 is the clonotype 124 of a particular set of contigs. There may be multiple cells represented by this clonotype in the clonotype dataset 122. For instance, in the biological sample represented by dataset 122, there are 32 T-cells that have the clonotype described in box 306-1, 9 T-cells that have the clonotype described in box 306-2, 6 T-cells that have the clonotype described in box 306-3, 6 T-cells that have the clonotype described in box 306-4, and 5 T-cells that have the clonotype described in box 306-5.
Clonotype 306-1 includes one contig type for a T-cell α chain and another contig type for a T-cell β chain. That is, each of the contigs for a T-cell α chain for clonotype 306-1 has the same first CDR3 sequence, and each of the contigs for a T-cell β chain for clonotype 306-1 has the same second CDR3 sequence. By contrast, clonotype 306-5 includes two contig types for a T-cell α chain and another two contig types for a T-cell β chain. That is, each of the contigs for a T-cell α chain for clonotype 306-5 has either a first or second CDR3 sequence, and each of the contigs for a T-cell β chain for clonotype 306-5 has either a third or fourth CDR3 sequence.
Further, toggle 308 can be used to scroll further down in table 304 to reveal the clonotypes and frequency (or number) of additional T-cells in the biological sample represented by dataset 122. For each clonotype, table 304 details each chain type 310 represented in the clonotype 124. A clonotype may have multiple chain consensus sequences; these chain consensus sequences are grouped into clonotypes for the reasons cited above. Two cells have the same clonotype if they share the same set of CDR3s for each distinct chain consensus sequence derived from their contigs.
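The frequency ranking summarized by graph 302 and table 304 can be sketched as a simple count over a barcode-to-clonotype mapping. The function name and the mapping layout below are assumptions made for illustration; the cell counts reuse the example values given above (32, 9, 6, 6 and 5 cells), and the proportions are computed over those listed cells only.

```python
from collections import Counter


def clonotype_frequencies(barcode_to_clonotype):
    """Return [(clonotype, cell count, proportion)], most frequent first."""
    counts = Counter(barcode_to_clonotype.values())
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(clonotype, n, n / total) for clonotype, n in ranked]


# Cell counts taken from the example above; barcode labels are placeholders.
example_counts = {"306-1": 32, "306-2": 9, "306-3": 6, "306-4": 6, "306-5": 5}
barcode_to_clonotype = {
    f"{clonotype}-cell{i}": clonotype
    for clonotype, n in example_counts.items()
    for i in range(n)
}

table = clonotype_frequencies(barcode_to_clonotype)
for clonotype, n_cells, proportion in table:
    print(clonotype, n_cells, round(proportion, 3))
```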
For each clonotype 306, table 304 details each chain type 310 represented by that clonotype. In the case of clonotype 306-1, there is a single α chain type and a single β chain type, meaning that all of the α chains for this clonotype 306-1 have the same first CDR3 sequence and all of the β chains for this clonotype 306-1 have the same second CDR3 sequence. For each chain type 310 represented in a clonotype, table 304 provides an identifier for the V segment 312, an identifier for the diversity region 314 (present in the case of T-cell β chains and δ chains, but not α chains and γ chains), an identifier for the J region 316, and an identifier for the C region 318. Two cells are deemed to have the same clonotype if their respective receptor chains have the same corresponding CDR3 sequences.
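The rule that two cells share a clonotype when their receptor chains have the same corresponding CDR3 sequences can be sketched by keying each cell on the set of its (chain type, CDR3) pairs. The data layout, the function names, and the toy CDR3 strings below are hypothetical and serve only to illustrate the grouping rule.

```python
from collections import defaultdict


def clonotype_key(chains):
    """Order-independent key: the set of distinct (chain type, CDR3) pairs."""
    return frozenset(chains)


def group_cells(cells):
    """Map each clonotype key to the list of barcodes sharing it."""
    groups = defaultdict(list)
    for barcode, chains in cells.items():
        groups[clonotype_key(chains)].append(barcode)
    return groups


# Toy cells (not real data); chain labels and CDR3 strings are placeholders.
cells = {
    "bc1": [("TRA", "TGTGCA"), ("TRB", "TGTGCC")],
    "bc2": [("TRB", "TGTGCC"), ("TRA", "TGTGCA")],  # same pairs, different order
    "bc3": [("TRA", "TGTGCA"), ("TRB", "TGTAGC")],  # β chain CDR3 differs
}
groups = group_cells(cells)
```

Using a set as the key means that a cell with two α chain types, as in the clonotype 306-5 example, is grouped only with cells carrying both of those α chain CDR3 sequences.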
In the case where the sample comprises T-cells, due to the heterozygous nature of the cells being sampled, it is possible for a single cell in the sample represented by the clonotype dataset illustrated in
Advantageously, VDJ browser allows for the analysis of the clonotype information in a variety of different ways.
Affordance 322 is used to specify the total number of clonotypes, from among all the clonotypes in a clonotype dataset 122 under analysis that are displayed in chart 302 and table 304. Presently, as illustrated in
Toggle 324 is used to select other chart types that can be applied to the clonotype 124 dataset. For instance, turning to
In some embodiments, if a cell represented in the clonotype dataset 122 does not have a V region or a J region, it is filtered out of the views provided by the VDJ browser. This occurs in some instances. The VDJ region is about 700 bases in length whereas, in some embodiments, the sequence reads 134 are about 150 base pairs long. Therefore, situations arise in which some mRNA molecules encoding the VDJ region only get sequence reads 134 on one part of the VDJ region (V only or J only) and not the other part of the VDJ region and so the V region or the J region is not represented for such mRNA molecules. In such instances, it is not possible to determine the clonotype of such cells. In some instances, in order to have an assigned clonotype, some embodiments of the present disclosure impose the condition that there has to be within a single cell a read with a particular UMI code that aligns to a V gene and another read with the particular UMI code that aligns to a J gene. In the alternative, longer sequence reads are employed that align to the entire VDJ region. In the alternative still, sequence reads having the same UMI are employed that collectively align to the entire VDJ region.
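The condition described above, namely that a cell is assigned a clonotype only if some UMI has one read aligning to a V gene and another read aligning to a J gene, can be sketched as a filter. The input layout (a list of (UMI, segment) pairs per barcode) and the function names are assumptions made for this example.

```python
def has_assignable_clonotype(alignments):
    """True if at least one UMI has both a V-aligned and a J-aligned read.

    alignments: iterable of (umi, segment) pairs, segment in {"V", "D", "J", "C"}.
    """
    segments_by_umi = {}
    for umi, segment in alignments:
        segments_by_umi.setdefault(umi, set()).add(segment)
    return any({"V", "J"} <= segments for segments in segments_by_umi.values())


def filter_cells(cells):
    """Keep only the cells that satisfy the V-and-J condition."""
    return {
        barcode: alignments
        for barcode, alignments in cells.items()
        if has_assignable_clonotype(alignments)
    }


# Toy alignments (not real data).
toy_cells = {
    "bc1": [("umi1", "V"), ("umi1", "J")],  # kept: same UMI covers V and J
    "bc2": [("umi1", "V"), ("umi2", "J")],  # dropped: V and J on different UMIs
    "bc3": [("umi1", "V")],                 # dropped: no J alignment
}
kept = filter_cells(toy_cells)
```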
The advantage of the clonotype data illustrated in
Turning to
As noted above, each chain has a V region 312 and a J region 316. Each x-y cell in the heat map of chart 602 provides an indication of the number of contigs present in the clonotype dataset 122 whose CDR3 region contains a receptor chain that contains a corresponding pair of a respective V region and a respective J region from among the V regions and J regions represented. For instance, in the case of B-cells, each x-y cell in the heat map of chart 602 provides an indication of the number of contigs present in the clonotype dataset 122 whose CDR3 region contains a heavy chain or a light chain that contains a corresponding pair of a respective V region and a respective J region from among the V regions and J regions represented. In the case of T-cells, each x-y cell in the heat map of chart 602 provides an indication of the number of contigs present in the clonotype dataset 122 whose CDR3 region contains an α chain or a β chain that contains a corresponding pair of a respective V region and a respective J region from among the V regions and J regions represented. Turning to
Accordingly some embodiments of the present disclosure provide a second two-dimensional visualization (602) while maintaining the listing of the plurality of clonotypes (304). The second two-dimensional visualization (602) provides a first filter (324) for selection of a pair of genes of a lymphocyte receptor represented by the dataset. The second two-dimensional visualization (602) provides a second filter (320) for one or more chain types. A first axis of the second two-dimensional visualization represents a first individual gene (e.g., J Region axis of visualization 602 of
Scale 604 provides a basis for interpreting the x-y cells in the chart 602. In some embodiments, the heat map is color coded between a first color that indicates a first number of contigs (e.g., green, representing zero contigs) and a second color that indicates a second number of contigs (e.g., blue, representing 120 contigs). Thus, when this color coding is used in the heat map 602, if the x-y cell in the chart 602 indicating the number of contigs present in the clonotype dataset 122 whose clonotype contains a TRAV-1-1 V region and a TRAJ3 J region is colored green, that means that there are no contigs in the clonotype dataset 122 that contain a TRAV-1-1 V region and a TRAJ3 J region. On the other hand, if the x-y cell in the chart 602 indicating the number of contigs present in the clonotype dataset 122 that contain a TRAV-1-1 V region and a TRAJ3 J region is colored blue, that means there are 120 contigs in the clonotype dataset 122 that contain a TRAV-1-1 V region and a TRAJ3 J region. In such embodiments, intermediate values between zero and 120 are represented by intermediate color shades between green and blue. It will be appreciated that scale 604 adjusts to the values of the data being represented, with the maximum value representing the maximum number of contigs present in the dataset with a particular V region/J region pair. It will further be appreciated that different color palettes can be used in the heat map or, in fact, the heat map can be grey scaled. As such, referring to
It will be noted that heat map 602 includes large blank regions in the upper left and lower right coordinates that include no data. This is because heat map 602 is showing data for the CDR3 region from both α chains and β chains of T-cells. It is typically not of interest to match the V region of a given α chain with the J region of a given β chain even when the two chains are from the same cell. It is further typically not of interest to match the J region of a given α chain with the V region of a given β chain even when the two chains are from the same cell. Exclusion of such matchings gives rise to the blank regions in the upper left quadrant and lower right quadrant of heat map 602. In the view illustrated in
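The counts underlying a V region/J region heat map, and the scaling of each count against the largest observed count, can be sketched as follows. The dictionary keys (v_gene, j_gene), the function names, and the β chain gene labels are assumptions made for this example; only the TRAV-1-1 and TRAJ3 labels come from the discussion above, and the counts are toy values.

```python
from collections import Counter


def vj_counts(contigs):
    """Count contigs for each (V region, J region) pair taken from the same chain."""
    return Counter((c["v_gene"], c["j_gene"]) for c in contigs)


def to_matrix(counts):
    """Return (sorted V labels, sorted J labels, matrix of counts with zeros filled in)."""
    v_genes = sorted({v for v, _ in counts})
    j_genes = sorted({j for _, j in counts})
    matrix = [[counts.get((v, j), 0) for j in j_genes] for v in v_genes]
    return v_genes, j_genes, matrix


def shade(count, max_count):
    """Position on the scale: 0.0 is the zero-contig color, 1.0 the maximum color."""
    return 0.0 if max_count == 0 else count / max_count


# Toy contigs (not real data): three α chain contigs and one β chain contig.
toy_contigs = [
    {"v_gene": "TRAV-1-1", "j_gene": "TRAJ3"},
    {"v_gene": "TRAV-1-1", "j_gene": "TRAJ3"},
    {"v_gene": "TRAV-1-1", "j_gene": "TRAJ3"},
    {"v_gene": "TRBV2", "j_gene": "TRBJ1-1"},
]
v_genes, j_genes, matrix = to_matrix(vj_counts(toy_contigs))
max_count = max(max(row) for row in matrix)
shades = [[shade(n, max_count) for n in row] for row in matrix]
```

Because each pair is drawn from a single contig, an α chain V region is never paired with a β chain J region, which is why the off-diagonal entries of the toy matrix are zero, mirroring the blank quadrants noted above.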
Turning to column 320 of
The number of possible clonotypes in a given clonotype dataset 122 can be quite large. Accordingly, referring to
In
Turning to
Turning to
In some embodiments, when two genes are inputted, only those contigs in the clonotype dataset that contain both of the selected genes are displayed in list 304 and the corresponding left hand graph. In some embodiments, selection of two genes in this manner does not update the filters on the left hand graph.
In some embodiments, when three genes are inputted, only those contigs in the clonotype dataset that contain all three of the selected genes are displayed in list 304 and the corresponding left hand graph. In some embodiments, selection of three genes in this manner does not update the filters on the left hand graph.
In some embodiments, when four genes are inputted, only those contigs in the clonotype dataset that contain all four of the selected genes are displayed in list 304 and the corresponding left hand graph. In some embodiments, selection of four genes in this manner does not update the filters on the left hand graph.
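The multi-gene filter described above, in which only contigs containing every selected gene are retained, can be sketched as a subset test. The contig layout (a dictionary with an id and a list of genes), the function name, and the toy gene labels are assumptions made for this example.

```python
def contigs_with_all_genes(contigs, selected_genes):
    """Return the contigs whose gene annotations include every selected gene."""
    wanted = set(selected_genes)
    return [c for c in contigs if wanted <= set(c["genes"])]


# Toy contigs (not real data); gene labels are placeholders.
toy_contigs = [
    {"id": "contig1", "genes": ["TRBV2", "TRBD1", "TRBJ1-1", "TRBC1"]},
    {"id": "contig2", "genes": ["TRBV2", "TRBJ1-2", "TRBC1"]},
    {"id": "contig3", "genes": ["TRAV-1-1", "TRAJ3", "TRAC"]},
]

two_genes = contigs_with_all_genes(toy_contigs, ["TRBV2", "TRBC1"])
three_genes = contigs_with_all_genes(toy_contigs, ["TRBV2", "TRBC1", "TRBD1"])
```

The same function covers the two-, three- and four-gene cases, since each additional selected gene simply narrows the subset test.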
Continuing with
Turning to
It will be noted that each of the contigs has the same sequence in the region denoted by box 918 because this region defines the single clonotype which was used to select the contigs represented in panel 902. However, the contigs can have differences outside of box 918 in some clonotype datasets 122. In other clonotype datasets, where the cells are essentially the same, for instance a clonal expansion from a single cell, where one cell has been expanded into hundreds of cells, there are not expected to be any differences in the V regions and the J regions of each of the contigs. Advantageously, panel 902 of the VDJ browser allows a user to quickly ascertain if this is the case.
It will be appreciated that there will be bars, such as bars 920 at the ends of the reads, that are also mismatches. These bars represent artifacts of analysis because the 5′ ends of sequence reads tend to vary, so mismatches are expected at those points, but that is outside the region that is of concern. The protein coding region starts after box 908 for each contig and goes to the right. As such, panel 902 provides a graphical representation that validates that a clonal expansion, represented by clonotype dataset 122, was successful in the embodiment of the VDJ browser 120 illustrated by
The region of the consensus spanning box 918 is about 12 amino acids long in some embodiments and defines the clonotype. However, panel 902 shows more of the VDJ region of the chain to assist users in analyzing the VDJ genes. For instance, some users seek to synthesize the VDJ region. Such users need to know the entire coding sequence, which is the entire V and the entire J sequence. The CDR3 region denoted by box 918 is the clonotype, but that is not the only important sequence; the regions 5′ and 3′ of it are needed in many use cases to establish fidelity.
Referring to
Continuing to
Referring to
Accordingly, in some embodiments, the VDJ chain reference sequence table is all the human V, D, J and C regions that are found in the human genome in accordance with the Ensembl gene annotation system database and the reference sequences that best match the selected chain of the selected clonotype serve as the reference sequence when affordance 1504 is set to align contigs to reference sequence. That is, the reference sequence is the concatenation of the canonical assembly of the individual V, D, J, and C genes from the Ensembl gene annotation system database that best match the contigs of the selected chain of the selected clonotype.
In some embodiments, the VDJ chain reference sequence table is all the V, D, J and C regions that are found in a mammalian genome. In some embodiments, the VDJ chain reference sequence table is all the V, D, J, and C regions that are found in a non-human animal. Examples of the animal include, but are not limited to, mammal, reptile, avian, amphibian, fish, ungulate, ruminant, bovine (e.g., cattle), equine (e.g., horse), caprine and ovine (e.g., sheep, goat), swine (e.g., pig), camelid (e.g., camel, llama, alpaca), monkey, ape (e.g., gorilla, chimpanzee), ursid (e.g., bear), poultry, dog, cat, mouse, rat, dolphin, whale and shark.
Accordingly,
Referring to
Turning to
In
Moreover, there are several different UMIs that support the contig represented in
Referring to
In some embodiments, the VDJ browser provides counts of the number of clonotypes and number of barcodes that will update based on filtering criteria entered into fields 326 and 328.
Multi-Sample Comparison.
Referring to
In
In
Advantageously, the comparison is done at the paired-clonotype, single-cell level. That is, as noted above in conjunction with
Morisita's overlap index is a statistical measure of dispersion of individuals (e.g., clonotypes) in a population (e.g., in a biological sample comprising cells). It is used to compare overlap among samples. This formula is based on the assumption that increasing the size of the samples will increase the diversity because it will include different clonotypes. The Morisita formula is:
$$C_D = \frac{2 \sum_{i=1}^{S} x_i y_i}{(D_x + D_y)\, X\, Y}$$
where,
X is the number of cells represented by the first clonotype dataset 122 of the pairwise comparison,
Y is the number of cells represented by the second clonotype dataset 122 of the pairwise comparison,
x_i is the number of cells having clonotype i in the first clonotype dataset 122,
y_i is the number of cells having clonotype i in the second clonotype dataset 122,
D_x and D_y are the Simpson's index values for the x and y clonotype datasets 122 respectively, and
S is the number of unique clonotypes 124 across the two clonotype datasets 122 being compared.
Here, C_D=0 if the two clonotype datasets 122 do not overlap in terms of clonotypes 124, and C_D=1 if the clonotypes 124 occur in the same proportions of cells in both clonotype datasets 122. Horn's modification of the index, which is used as the basis for each pairwise clonotype dataset 122 comparison in
as set forth in Horn, 1966, ‘Measurement of “Overlap” in comparative ecological studies,’ The American Naturalist 100, pp. 419-424, which is hereby incorporated by reference.
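A pairwise overlap computation of this kind can be sketched as follows. The sketch uses the commonly published Morisita-Horn form, in which the Simpson-type terms are taken as D_x = \sum_i x_i^2 / X^2 and D_y = \sum_i y_i^2 / Y^2, so that the index is 2 \sum_i x_i y_i / ((D_x + D_y) X Y). That choice of form, the function name, and the toy counts are assumptions of this example and are not taken from the text above.

```python
def morisita_horn(x_counts, y_counts):
    """Morisita-Horn overlap between two clonotype -> cell-count mappings."""
    total_x = sum(x_counts.values())  # X: cells in the first dataset
    total_y = sum(y_counts.values())  # Y: cells in the second dataset
    if total_x == 0 or total_y == 0:
        raise ValueError("each dataset must contain at least one cell")
    clonotypes = set(x_counts) | set(y_counts)  # the S unique clonotypes
    cross = sum(x_counts.get(c, 0) * y_counts.get(c, 0) for c in clonotypes)
    d_x = sum(n * n for n in x_counts.values()) / total_x ** 2
    d_y = sum(n * n for n in y_counts.values()) / total_y ** 2
    return 2 * cross / ((d_x + d_y) * total_x * total_y)


# Toy counts (not real data).
same_proportions = morisita_horn({"a": 2, "b": 2}, {"a": 4, "b": 4})
no_overlap = morisita_horn({"a": 3}, {"b": 5})
partial = morisita_horn({"a": 3, "b": 1}, {"a": 1, "b": 3})
print(same_proportions, no_overlap, partial)
```

The toy cases reproduce the two limiting behaviors stated above: datasets with identical clonotype proportions give an index of one even though their sizes differ, and datasets with no shared clonotypes give an index of zero.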
Referring to
Referring to
In some embodiments, the VDJ cell browser 120 provides an indication of clonotype distribution in the open clonotype datasets 122. For example, referring to
In some embodiments, the VDJ cell browser 120 provides a sample table 3106 that provides a comparison of the statistics of two selected clonotype datasets 122 that have been read by the VDJ cell browser 120. For example, referring to
In some embodiments, the VDJ cell browser 120 provides a graph 3202 that provides a comparison of the frequency of occurrence of clonotypes within two selected clonotype datasets 122 that have been read by the VDJ cell browser 120. For example, referring to
In
The clonotypes 124 that appear with a frequency of “2” divided by the total number of cells represented by the “44914-CRC_1_UB” dataset 122 (2/454, or approximately 0.0044) but do not appear in the “44915-CRC_2_UB” dataset 122 are represented by icon 3202-2. Although not shown in
The clonotypes 124 that appear with a frequency of “3” divided by the total number of cells represented by the “44914-CRC_1_UB” dataset 122 (3/454, or approximately 0.0066) but do not appear in the “44915-CRC_2_UB” dataset 122 are represented by icon 3202-3. Although not shown in
In
Icon 3208 is the frequency intersection between icons 3202-1 and 3204-1. As such, icon 3208 represents the number of clonotypes that appear with a frequency of 1/454 in the “44914-CRC_1_UB” dataset (1/total cells in the first dataset) and a frequency of 1/365 in the “44915-CRC_2_UB” dataset (1/total cells in the second dataset). Although not shown in
Turning to
Thus, turning to
Turning to
Turning to
Turning to
The comparison of
Referring to
B-Cell Paired Isotypes.
The relative distribution of heavy+light chain combinations for all loaded B-cell samples is illustrated in
Referring to
Integration of Gene Expression Data.
Referring to
In some embodiments, mRNA from a single cell is amplified and barcoded with the same barcode. In some such embodiments, discrete attribute values are measured from single cells, and microfluidic partitions are used to capture such individual cells within respective microfluidic droplets and then pools of single barcodes within each of those droplets are used to tag all of the contents (e.g., mRNA corresponding to genes) of a given cell. For example, in some embodiments, a pool (e.g., of ~750,000 barcodes) is sampled to separately index each second entity's transcriptome by partitioning thousands of second entities into nanoliter-scale Gel Bead-In-EMulsions (GEMs), where all generated cDNA share a common barcode. In some embodiments, each respective droplet (GEM) is assigned its own barcode and all the contents (e.g., first entities) in a respective droplet are tagged with the barcode unique to the respective droplet. In some embodiments, such droplets are formed as described in Zheng et al., 2016, Nat Biotechnol. 34(3): 303-311; in the Chromium, Single Cell 3′ Reagent Kits v2. User Guide, 2017, 10× Genomics, Pleasanton, California, Rev. B; or in the Chromium Single Cell V(D)J Reagent Kits User Guide, 2017, 10× Genomics, Pleasanton, California, each of which is hereby incorporated by reference.
The amplified DNA from such mRNA, now barcoded, is pooled across the population of cells in a test sample (e.g., a tumor biopsy) and then divided into two or more aliquots, three or more aliquots, four or more aliquots, ten or more aliquots, etc. Each such respective aliquot includes one or more barcoded cDNA constructs for each of the mRNA in each cell in the original sample. That is, each respective aliquot fully represents the relative expression of each expressed gene from each cell in the original sample. Moreover, because the expressed gene (e.g., in the form of mRNA) was barcoded upon amplification to cDNA, it is possible to identify a cDNA from one of the aliquots as being from the same gene as the cDNA from the other aliquots, because they will have matching barcodes. As such, one of the respective aliquots is applied to the general V(D)J transcript library construction and selection protocol described above, thereby populating the clonotype dataset 122, and another of the aliquots follows a 5′ gene expression library construction protocol, such as that described in the section entitled “Discrete attribute value pipeline” in U.S. Patent Application No. 62/572,544, entitled “Systems and Methods for Visualizing a Pattern in a Dataset,” filed Oct. 15, 2017, thereby populating the discrete attribute values for each gene for each cell in the test sample in a discrete attribute value dataset. In some embodiments, the test sample comprises 10 or more second entities, 100 or more second entities, or 1000 or more second entities. In some embodiments, the test sample is a biopsy from a subject, such as a human subject. In some embodiments, the sample is a biopsy of a tumor and contains several different cell types.
As such, barcoded sequence reads from each library generated using the original barcoded amplified cDNA that share the same barcode will most likely have come from the same cell. Moreover, as further discussed below, other aliquots in the plurality of aliquots can be subjected to other forms of single cell sequence or expression analysis and data derived from such pipelines can be indexed to individual cells in the discrete attribute value dataset based on common barcodes.
Thus, in a joint gene expression/targeted V(D)J experiment, users will create the above-described libraries (e.g., the first and second aliquots described above) and run the respective analysis pipeline for each library, such as the pipeline disclosed in the section entitled “Discrete attribute value pipeline” in U.S. Patent Application No. 62/572,544, entitled “Systems and Methods for Visualizing a Pattern in a Dataset,” filed Oct. 15, 2017, as well as the pipeline disclosed in the present disclosure that forms a clonotype dataset 122, thereby respectively populating the discrete attribute value dataset and the clonotype dataset 122. In other words, once the analysis pipelines have completed, the discrete attribute value (e.g., gene expression) pipeline will yield a discrete attribute value dataset (e.g., a Loupe Cell Browser (cloupe) file, as disclosed in U.S. Provisional Patent Application No. 62/572,544, filed Oct. 15, 2017, entitled “Systems and Methods for Visualizing a Pattern in a Dataset”). The targeted VDJ pipeline will yield a clonotype dataset 122 (e.g., a Loupe VDJ Browser (vloupe) file, as disclosed herein). Because the discrete attribute value dataset and the clonotype dataset 122 share common barcodes, being derived from the same cells in the same biological sample under study, the VDJ browser 120 is able to import the clustered dataset 180 derived from the discrete attribute set into the workspace of the corresponding clonotype dataset 122. The discrete attribute values 120 of the genes of the discrete attribute value dataset are directly traceable to corresponding single cells in both the discrete attribute value dataset and the corresponding clonotype dataset 122.
This feature advantageously provides an example of integrated single cell genomic analysis, where a worker can combine information about the same cells arising from two or more different data processing pipelines (e.g., the clonotype dataset 122 and the discrete attribute value dataset) in order to provide new, multi-faceted information about those cells. In addition, such embodiments of the VDJ cell browser 120, which can access both the clonotype dataset 122 and the discrete attribute value dataset 120 in which genes have been indexed to a single cell and to a clonotype 124 through common barcodes in the clonotype dataset 122 and the corresponding discrete attribute value dataset, enable the review of the discrete attribute values using clonotype as a filter.
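The use of clonotype as a filter over discrete attribute values can be sketched as a join on shared barcodes. The following Python function is a hypothetical illustration only: the function name, the in-memory dictionary layout, and the example barcodes and gene names are assumptions and do not reflect the actual file formats of the clonotype dataset 122 or the discrete attribute value dataset.

```python
def expression_for_clonotype(clonotype_barcodes, expression_by_barcode, clonotype_id):
    """Return {barcode: {gene: count}} for the cells of one clonotype.

    clonotype_barcodes: dict mapping clonotype id -> list of cell barcodes
        (hypothetical view of a clonotype dataset).
    expression_by_barcode: dict mapping barcode -> dict of gene -> count
        (hypothetical view of a discrete attribute value dataset).
    """
    selected = {}
    for barcode in clonotype_barcodes.get(clonotype_id, []):
        # Only barcodes present in both datasets can be joined.
        if barcode in expression_by_barcode:
            selected[barcode] = expression_by_barcode[barcode]
    return selected


# Toy data: barcode "AAAG" was observed in the V(D)J library only.
clonotypes = {"clonotype1": ["AAAC", "AAAG"], "clonotype2": ["TTTG"]}
expression = {"AAAC": {"CD3E": 4}, "TTTG": {"CD19": 7}}
filtered = expression_for_clonotype(clonotypes, expression, "clonotype1")
```

Barcodes that appear in only one of the two datasets are silently skipped here; an implementation could instead report them, since they indicate cells recovered by one library but not the other.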
The discrete attribute values in the discrete attribute value dataset are used by a clustering module in the cell browser disclosed in U.S. Patent Application 62/572,544 to cluster the cells into clusters in the form of a clustered dataset 180 (equivalent to clustered dataset 128 in U.S. patent application Ser. No. 15/891,607). As such, the clustered dataset 180 identifies the bar codes 130 that map to each cluster. In embodiments where the same biological sample was used to construct both the clonotype dataset and the discrete attribute set, the cluster information from the clustered dataset derived from the discrete attribute set includes bar codes that map onto the bar codes in the clonotype dataset. Thus, it is possible to use the expression cluster information (e.g., the barcode) of the clustered dataset to identify which cells in the clonotype set belong to which clusters in the clustered dataset.
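The mapping from expression clusters onto clonotype cells described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the two dictionary inputs are hypothetical stand-ins for the barcode lists of the clustered dataset 180 and the barcode-to-clonotype assignments of the clonotype dataset 122.

```python
from collections import Counter


def clonotypes_per_cluster(cluster_barcodes, barcode_to_clonotype):
    """Count, for each expression cluster, the cells of each clonotype.

    cluster_barcodes: dict mapping cluster id -> list of barcodes
        (hypothetical view of a clustered dataset).
    barcode_to_clonotype: dict mapping barcode -> clonotype id
        (hypothetical view of a clonotype dataset).
    """
    result = {}
    for cluster, barcodes in cluster_barcodes.items():
        counts = Counter()
        for barcode in barcodes:
            clonotype = barcode_to_clonotype.get(barcode)
            # Cells without a clonotype call are left out of the counts.
            if clonotype is not None:
                counts[clonotype] += 1
        result[cluster] = counts
    return result
```

Restricting the returned mapping to a single cluster id gives the set of clonotypes, with cell counts, found among the cells of that cluster.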
In typical embodiments, principal component values stored in the discrete attribute value dataset that have been computed by the method of principal component analysis using the discrete attribute values of the genes (first entities) across the plurality of cells (second entities) of the discrete attribute value dataset are used by the clustering module of the cell browser to take the discrete attribute value dataset and cluster the cells into a clustered dataset 180.
Principal component analysis (PCA) is a mathematical procedure that reduces a number of correlated variables into a smaller number of uncorrelated variables called “principal components.” The first principal component is selected such that it accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible. The purpose of PCA is to discover or to reduce the dimensionality of the dataset, and to identify new meaningful underlying variables. PCA is accomplished by establishing actual data in a covariance matrix or a correlation matrix. The mathematical technique used in PCA is called eigen analysis: one solves for the eigenvalues and eigenvectors of a square symmetric matrix with sums of squares and cross products. The eigenvector associated with the largest eigenvalue has the same direction as the first principal component. The eigenvector associated with the second largest eigenvalue determines the direction of the second principal component. The sum of the eigenvalues equals the trace of the square matrix and the maximum number of eigenvectors equals the number of rows (or columns) of this matrix. See, for example, Duda, Hart, and Stork, Pattern Classification, Second Edition, John Wiley & Sons, Inc., NY, 2000, pp. 115-116, which is hereby incorporated by reference.
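The eigen analysis described above can be illustrated with a short, standard-library-only Python sketch that builds a sample covariance matrix and finds its leading eigenpair by power iteration. Power iteration is a technique chosen here for brevity and is not stated to be the method used by the clustering module; the function names and iteration parameters are likewise assumptions.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length numeric rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_eigenpair(matrix, iterations=500, tol=1e-12):
    """Largest eigenvalue and its unit eigenvector, by power iteration.

    Assumes a symmetric positive semi-definite matrix and a start vector
    that is not orthogonal to the leading eigenvector.
    """
    d = len(matrix)
    vec = [1.0 / math.sqrt(d)] * d
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(p * p for p in product))
        if norm == 0.0:
            break
        new_vec = [p / norm for p in product]
        # Rayleigh quotient of the normalized vector estimates the eigenvalue.
        new_val = sum(new_vec[i] * sum(matrix[i][j] * new_vec[j] for j in range(d))
                      for i in range(d))
        converged = abs(new_val - eigenvalue) < tol
        vec, eigenvalue = new_vec, new_val
        if converged:
            break
    return eigenvalue, vec


# Points lying on the line y = 2x: all variance falls on one component.
cov = covariance_matrix([[1, 2], [2, 4], [3, 6], [4, 8]])
first_value, first_direction = leading_eigenpair(cov)
```

For this toy input the covariance matrix has rank one, so the leading eigenvalue equals the trace of the matrix and the first principal component points along the direction (1, 2).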
For clustering in accordance with one embodiment of the clustering module of U.S. Patent Application 62/572,544, consider the case in which each second entity is associated with ten first entities in a discrete attribute value dataset that is to be clustered into a corresponding clustered dataset. In such instances, each second entity can be expressed as a vector:
$\vec{X}_{10} = \{x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8, x_9, x_{10}\}$
where $x_i$ is the discrete attribute value for the first entity i associated with the second entity. Thus, if there are one thousand second entities, 1000 vectors are defined. Those cells that exhibit similar discrete attribute values across the set of genes of the discrete attribute value dataset will tend to cluster together. For instance, in the case where each second entity is an individual cell, the first entities correspond to mRNA mapped to individual genes within such individual cells, and the discrete attribute values are mRNA counts for such mRNA, it is the case in some embodiments that the discrete attribute value dataset includes mRNA data from one or more cell types (e.g., diseased state and non-diseased state), two or more cell types, or three or more cell types. In such instances, it is expected that cells of like type will tend to have like values for mRNA across the set of first entities (mRNA) and therefore cluster together. For instance, if the discrete attribute value dataset includes class a: cells from subjects that have a disease, and class b: cells from subjects that do not have a disease, an ideal clustering classifier will cluster the discrete attribute value dataset into two groups, with one cluster group uniquely representing class a and the other cluster group uniquely representing class b.
For clustering in accordance with another embodiment of the clustering module of U.S. Patent Application 62/572,544, consider the case in which each second entity is associated with ten principal component values that collectively represent the variation in the discrete attribute values of a large number of first entities of a given second entity with respect to the discrete attribute values of corresponding first entities of other second entities in the dataset. In such instances, each second entity can be expressed as a vector:
$\vec{X}_{10} = \{x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8, x_9, x_{10}\}$
where $x_i$ is the principal component value i associated with the second entity. Thus, if there are one thousand second entities, one thousand such vectors are defined. Those second entities that exhibit similar discrete attribute values across the set of principal component values will tend to cluster together. For instance, in the case where each second entity is an individual cell, the first entities correspond to mRNA mapped to individual genes within such individual cells, and the discrete attribute values are mRNA counts for such mRNA, it is the case in some embodiments that the discrete attribute value dataset includes mRNA data from one or more cell types (e.g., diseased state and non-diseased state), two or more cell types, or three or more cell types. In such instances, it is expected that cells of like type will tend to have like values for mRNA across the set of first entities (mRNA) and therefore cluster together. For instance, if the discrete attribute value dataset includes class a: cells from subjects that have a disease, and class b: cells from subjects that do not have a disease, an ideal clustering classifier will cluster the discrete attribute value dataset into two groups, with one cluster group uniquely representing class a and the other cluster group uniquely representing class b.
Clustering is described on pages 211-256 of Duda and Hart, Pattern Classification and Scene Analysis, 1973, John Wiley & Sons, Inc., New York, (hereinafter “Duda 1973”) which is hereby incorporated by reference in its entirety. As described in Section 6.7 of Duda 1973, the clustering problem is one of finding natural groupings in a dataset. To identify natural groupings, two issues are addressed. First, a way to measure similarity (or dissimilarity) between two samples is determined. This metric (similarity measure) is used to ensure that the samples in one cluster are more like one another than they are to samples in other clusters. Second, a mechanism for partitioning the data into clusters using the similarity measure is determined.
Similarity measures are discussed in Section 6.7 of Duda 1973, where it is stated that one way to begin a clustering investigation is to define a distance function and to compute the matrix of distances between all pairs of samples in a dataset. If distance is a good measure of similarity, then the distance between samples in the same cluster will be significantly less than the distance between samples in different clusters. However, as stated on page 215 of Duda 1973, clustering does not require the use of a distance metric. For example, a nonmetric similarity function s(x, x′) can be used to compare two vectors x and x′. Conventionally, s(x, x′) is a symmetric function whose value is large when x and x′ are somehow “similar.” An example of a nonmetric similarity function s(x, x′) is provided on page 216 of Duda 1973.
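The matrix of distances between all pairs of samples mentioned above can be sketched in a few lines of Python. Euclidean distance is used here purely as an example of a distance function; the function name and the tuple-based sample format are assumptions of this sketch.

```python
import math


def distance_matrix(samples):
    """Symmetric matrix of Euclidean distances between all sample pairs."""
    n = len(samples)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # math.dist computes the Euclidean distance between two points.
            d = math.dist(samples[i], samples[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


distances = distance_matrix([(0, 0), (3, 4), (0, 1)])
```

If distance is a good measure of similarity for the data at hand, entries for pairs of samples in the same cluster will be noticeably smaller than entries for pairs drawn from different clusters.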
Once a method for measuring “similarity” or “dissimilarity” between points in a dataset has been selected, clustering requires a criterion function that measures the clustering quality of any partition of the data. Partitions of the dataset that extremize the criterion function are used to cluster the data. See page 217 of Duda 1973. Criterion functions are discussed in Section 6.8 of Duda 1973.
More recently, Duda et al., Pattern Classification, Second edition, John Wiley & Sons, Inc. New York, which is hereby incorporated by reference, has been published. Pages 537-563 describe clustering in detail. More information on clustering techniques can be found in Kaufman and Rousseeuw, 1990, Finding Groups in Data: An Introduction to Cluster Analysis, Wiley, New York, N.Y.; Everitt, 1993, Cluster analysis (Third Edition), Wiley, New York, N.Y.; and Backer, 1995, Computer Assisted Reasoning in Cluster Analysis, Prentice Hall, Upper Saddle River, N.J. Particular exemplary clustering techniques that can be used by the clustering module of U.S. Patent Application 62/572,544 to cluster a plurality of vectors, where each respective vector in the plurality of vectors comprises the discrete attribute values across the first entities of a corresponding second entity (or principal components derived therefrom), include, but are not limited to, hierarchical clustering (agglomerative clustering using the nearest-neighbor algorithm, the farthest-neighbor algorithm, the average linkage algorithm, the centroid algorithm, or the sum-of-squares algorithm), k-means clustering, fuzzy k-means clustering, and Jarvis-Patrick clustering.
Thus, in some embodiments, the clustering module of U.S. Patent Application 62/572,544 clusters the discrete attribute value dataset using the discrete attribute value for each first entity (e.g., mRNA of genes) in the plurality of first entities for each respective second entity (e.g., cell) in the plurality of second entities (e.g., plurality of cells), or principal component values derived from the discrete attribute values, thereby assigning each respective second entity in the plurality of second entities to a corresponding cluster in a plurality of clusters and thereby assigning a cluster attribute value to each respective second entity in the plurality of second entities.
In some embodiments, the clustering module of U.S. Patent Application No. 62/572,544 makes use of k-means clustering to form a clustered dataset 180. The goal of k-means clustering is to cluster the discrete attribute value dataset based upon the principal components or the discrete attribute values of individual second entities into K partitions. In some embodiments, K is a number between 2 and 50 inclusive. In some embodiments, the number K is set to a predetermined number such as 10. In some embodiments, the number K is optimized for a particular discrete attribute value dataset. In some embodiments, a user sets the number K using the cell browser 150.
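For illustration, the partitioning of per-cell vectors into K clusters can be sketched with Lloyd's algorithm, a standard way of performing k-means clustering. The sketch below is not stated to be the implementation used by the clustering module: the function name, the deterministic farthest-first initialization, and the iteration limit are all assumptions made to keep the example small and reproducible.

```python
import math


def kmeans(vectors, k, max_iterations=100):
    """Lloyd's algorithm; returns (labels, centroids).

    vectors: list of equal-length numeric tuples, e.g. principal component
        values or discrete attribute values for each cell.
    k: number of partitions K.
    """
    if not 1 <= k <= len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")

    # Farthest-first initialization (an assumption of this sketch): start
    # from the first vector, then repeatedly add the vector that is
    # farthest from all centroids chosen so far.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors,
                       key=lambda v: min(math.dist(v, c) for c in centroids))
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: math.dist(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments are stable, so the partition has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


cells = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(cells, k=2)
```

The returned label of each vector plays the role of the cluster attribute value assigned to the corresponding second entity. With randomized initialization, which is common in practice, results may differ between runs.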
As noted in U.S. Patent Application No. 62/572,544, in some embodiments, the discrete attribute value dataset that is clustered includes discrete attribute values for 1000 or more, 3000 or more, 5000 or more, 10,000 or more, or 15,000 or more mRNAs in each cell represented by the dataset. In some such embodiments, the discrete attribute value dataset includes discrete attribute values for the mRNAs of 500 or more cells, 5000 or more cells, 100,000 or more cells, 250,000 or more cells, 500,000 or more cells, 1,000,000 or more cells, 10 million or more cells or 50 million or more cells. In some embodiments, each single cell is a human cell. In some embodiments, each second entity represents a different human cell. In some embodiments, the discrete attribute value dataset includes data for human cells of several different classes (e.g., representing different diseased states and/or wild type states). In such embodiments, the discrete attribute value for a respective mRNA (first entity) in a given cell (second entity) is the number of mRNAs for the respective mRNA that were measured in the given cell. This will either be zero or some positive integer. In some embodiments, the discrete attribute value for a given first entity for a given second entity is a number in the set {0, 1, . . . , 100}. In some embodiments, the discrete attribute value for a given first entity for a given second entity is a number in the set {0, 1, . . . , 50}. In some embodiments, the discrete attribute value for a given first entity for a given second entity is a number in the set {0, 1, . . . , 30}. In some embodiments, the discrete attribute value for a given first entity for a given second entity is a number in the set {0, 1, . . . , N}, where N is a positive integer.
Referring to
In the case where the VDJ cell browser 120 has opened one or more clonotype datasets 122 as well as a clustered dataset 180 that were formed using a common sample of barcoded amplified cDNA, the relation between the gene expression barcodes 130 of the clustered dataset 180 and the barcodes 130 of the clonotype dataset 122 is tracked by the VDJ cell browser 120 using the exemplary data structure disclosed in
Thus, referring to
For instance, referring to
Additionally, once a clustered dataset 180 has been loaded, the clusters 5002 can be applied to the single clonotype dataset 122 analyses to thereby filter the view of clonotypes 124 in the single clonotype dataset 122 to those clonotypes 124 from cells that are in a particular cluster 5002 in the clustered dataset 180. For instance, referring to
The comparison of
Single-Sample Charts.
Referring to
Plural instances may be provided for components, operations or structures described herein as a single instance. Finally, boundaries between various components, operations, and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of the implementation(s). In general, structures and functionality presented as separate components in the example configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements fall within the scope of the implementation(s).
It will also be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first subject could be termed a second subject, and, similarly, a second subject could be termed a first subject, without departing from the scope of the present disclosure. The first subject and the second subject are both subjects, but they are not the same subject.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the description of the invention and the appended claims, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
As used herein, the term “if” may be construed to mean “when” or “upon” or “in response to determining” or “in response to detecting,” depending on the context.
The foregoing description included example systems, methods, techniques, instruction sequences, and computing machine program products that embody illustrative implementations. For purposes of explanation, numerous specific details were set forth in order to provide an understanding of various implementations of the inventive subject matter. It will be evident, however, to those skilled in the art that implementations of the inventive subject matter may be practiced without these specific details. In general, well-known instruction instances, protocols, structures and techniques have not been shown in detail.
The foregoing description, for purpose of explanation, has been described with reference to specific implementations. However, the illustrative discussions above are not intended to be exhaustive or to limit the implementations to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The implementations were chosen and described in order to best explain the principles and their practical applications, to thereby enable others skilled in the art to best utilize the implementations and various implementations with various modifications as are suited to the particular use contemplated.
This application claims priority to U.S. Provisional Application No. 62/508,947, filed May 19, 2017, entitled “SYSTEMS AND METHODS FOR ANALYZING DATASETS,” and U.S. Provisional Application No. 62/582,866, filed Nov. 7, 2017, entitled “SYSTEMS AND METHODS FOR ANALYZING DATASETS,” each of which is herein incorporated by reference.
20130157870 | Pushkarev et al. | Jun 2013 | A1 |
20130157899 | Adler et al. | Jun 2013 | A1 |
20130178368 | Griffiths et al. | Jul 2013 | A1 |
20130189700 | So et al. | Jul 2013 | A1 |
20130203605 | Shendure et al. | Aug 2013 | A1 |
20130210639 | Link et al. | Aug 2013 | A1 |
20130225418 | Watson | Aug 2013 | A1 |
20130268206 | Porreca et al. | Oct 2013 | A1 |
20130274117 | Church et al. | Oct 2013 | A1 |
20130311106 | White et al. | Nov 2013 | A1 |
20140057799 | Johnson et al. | Feb 2014 | A1 |
20140065234 | Shum et al. | Mar 2014 | A1 |
20140155295 | Hindson et al. | Jun 2014 | A1 |
20140194323 | Gillevet | Jul 2014 | A1 |
20140199730 | Agresti et al. | Jul 2014 | A1 |
20140199731 | Agresti et al. | Jul 2014 | A1 |
20140200166 | Rooyen et al. | Jul 2014 | A1 |
20140206554 | Hindson et al. | Jul 2014 | A1 |
20140214334 | Plattner et al. | Jul 2014 | A1 |
20140227684 | Hindson et al. | Aug 2014 | A1 |
20140227706 | Kato et al. | Aug 2014 | A1 |
20140228255 | Hindson et al. | Aug 2014 | A1 |
20140235506 | Hindson et al. | Aug 2014 | A1 |
20140287963 | Hindson et al. | Sep 2014 | A1 |
20140302503 | Lowe et al. | Oct 2014 | A1 |
20140323316 | Drmanac et al. | Oct 2014 | A1 |
20140378322 | Hindson et al. | Dec 2014 | A1 |
20140378345 | Hindson et al. | Dec 2014 | A1 |
20140378349 | Hindson et al. | Dec 2014 | A1 |
20140378350 | Hindson et al. | Dec 2014 | A1 |
20150005199 | Hindson et al. | Jan 2015 | A1 |
20150005200 | Hindson et al. | Jan 2015 | A1 |
20150011430 | Saxonov | Jan 2015 | A1 |
20150011432 | Saxonov | Jan 2015 | A1 |
20150066385 | Schnall-Levin et al. | Mar 2015 | A1 |
20150111256 | Church et al. | Apr 2015 | A1 |
20150133344 | Shendure et al. | May 2015 | A1 |
20150220532 | Wong | Aug 2015 | A1 |
20150224466 | Hindson et al. | Aug 2015 | A1 |
20150292988 | Bharadwaj et al. | Oct 2015 | A1 |
20150298091 | Weitz et al. | Oct 2015 | A1 |
20150299772 | Zhang | Oct 2015 | A1 |
20150376605 | Jarosz et al. | Dec 2015 | A1 |
20150376609 | Hindson et al. | Dec 2015 | A1 |
20150376700 | Schnall-Levin et al. | Dec 2015 | A1 |
20150379196 | Schnall-Levin et al. | Dec 2015 | A1 |
20160232291 | Kyriazopoulou-Panagiotopoulou et al. | Aug 2016 | A1 |
20160289760 | Suzuki et al. | Oct 2016 | A1 |
20160304860 | Hindson et al. | Oct 2016 | A1 |
20160350478 | Chin et al. | Dec 2016 | A1 |
20170235876 | Jaffe et al. | Aug 2017 | A1 |
20180196781 | Wong | Jul 2018 | A1 |
20180225416 | Wong et al. | Aug 2018 | A1 |
20180265928 | Schnall-Levin et al. | Sep 2018 | A1 |
Number | Date | Country |
---|---|---|
0249007 | Dec 1987 | EP |
0637996 | Jul 1997 | EP |
1019496 | Sep 2004 | EP |
1482036 | Oct 2007 | EP |
1594980 | Nov 2009 | EP |
1967592 | Apr 2010 | EP |
2258846 | Dec 2010 | EP |
2145955 | Feb 2012 | EP |
1905828 | Aug 2012 | EP |
2136786 | Oct 2012 | EP |
1908832 | Dec 2012 | EP |
2540389 | Jan 2013 | EP |
2485850 | May 2012 | GB |
5949832 | Mar 1984 | JP |
2006507921 | Mar 2006 | JP |
2006289250 | Oct 2006 | JP |
2007268350 | Oct 2007 | JP |
2009208074 | Sep 2009 | JP |
2321638 | Apr 2008 | RU |
1996029629 | Sep 1996 | WO |
1996041011 | Dec 1996 | WO |
1999009217 | Feb 1999 | WO |
1999052708 | Oct 1999 | WO |
2000008212 | Feb 2000 | WO |
2000026412 | May 2000 | WO |
2001014589 | Mar 2001 | WO |
2001089787 | Nov 2001 | WO |
2002031203 | Apr 2002 | WO |
2002086148 | Oct 2002 | WO |
2004002627 | Jan 2004 | WO |
2004010106 | Jan 2004 | WO |
2004069849 | Aug 2004 | WO |
2004091763 | Oct 2004 | WO |
2004102204 | Nov 2004 | WO |
2004103565 | Dec 2004 | WO |
2004105734 | Dec 2004 | WO |
2005002730 | Jan 2005 | WO |
2005021151 | Mar 2005 | WO |
2005023331 | Mar 2005 | WO |
2005040406 | May 2005 | WO |
2005049787 | Jun 2005 | WO |
2005082098 | Sep 2005 | WO |
2006030993 | Mar 2006 | WO |
2006078841 | Jul 2006 | WO |
2006096571 | Sep 2006 | WO |
2007001448 | Jan 2007 | WO |
2007002490 | Jan 2007 | WO |
2007024840 | Mar 2007 | WO |
2007081385 | Jul 2007 | WO |
2007081387 | Jul 2007 | WO |
2007089541 | Aug 2007 | WO |
2007114794 | Oct 2007 | WO |
2007121489 | Oct 2007 | WO |
2007133710 | Nov 2007 | WO |
2007138178 | Dec 2007 | WO |
2007140015 | Dec 2007 | WO |
2007149432 | Dec 2007 | WO |
2008021123 | Feb 2008 | WO |
2008091792 | Jul 2008 | WO |
2008102057 | Aug 2008 | WO |
2008109176 | Sep 2008 | WO |
2008121342 | Oct 2008 | WO |
2008134153 | Nov 2008 | WO |
2007139766 | Dec 2008 | WO |
2009005680 | Jan 2009 | WO |
2009011808 | Jan 2009 | WO |
2009023821 | Feb 2009 | WO |
2009061372 | May 2009 | WO |
2009085215 | Jul 2009 | WO |
2010004018 | Jan 2010 | WO |
2010033200 | Mar 2010 | WO |
2010115154 | Oct 2010 | WO |
2010127304 | Nov 2010 | WO |
2010148039 | Dec 2010 | WO |
2010151776 | Dec 2010 | WO |
2011047870 | Apr 2011 | WO |
2011056546 | May 2011 | WO |
2011066476 | Jun 2011 | WO |
2011074960 | Jun 2011 | WO |
2012012037 | Jan 2012 | WO |
2012048341 | Apr 2012 | WO |
2012055929 | May 2012 | WO |
2012061832 | May 2012 | WO |
2012100216 | Jul 2012 | WO |
2012083225 | Aug 2012 | WO |
2012106546 | Aug 2012 | WO |
2012112804 | Aug 2012 | WO |
2012116331 | Aug 2012 | WO |
2012142531 | Oct 2012 | WO |
2012142611 | Oct 2012 | WO |
2012149042 | Nov 2012 | WO |
2012166425 | Dec 2012 | WO |
2013035114 | Mar 2013 | WO |
2013055955 | Apr 2013 | WO |
2013123125 | Aug 2013 | WO |
2013177220 | Nov 2013 | WO |
2014028537 | Feb 2014 | WO |
2014093676 | Jun 2014 | WO |
2015002908 | Jan 2015 | WO |
2015157567 | Oct 2015 | WO |
2015200891 | Dec 2015 | WO |
2016130578 | Aug 2016 | WO |
Entry |
---|
Usoskin et al. Nature Neuroscience Nov. 2014, vol. 18, No. 1, pp. 145-153. |
Ross et al. Genome Biology (2016) 17:69 DOI 10.1186/s13059-016-0929-9. |
Schmidt et al. Pharmaceuticals 2016, 9, 33; doi:10.3390/ph9020033. |
“bedtools: General Usage,” http://bedtools.readthedocs.io/en/latest/content/generalusage.html; Retrieved from the Internet Jul. 8, 2016. |
“SSH Tunnel—Local and Remote Port Forwarding Explained With Examples,” Trackets Blog, http://blog.trackets.com/2014/05/17/ssh-tunnel-local-and-remote-port-forwarding-explained-with-examples.html; Retrieved from the Internet Jul. 7, 2016. |
Abate et al., Valve-based flow focusing for drop formation. Appl Phys Lett. 2009;94. 3 pages. |
Abate, A.R et al. “Beating Poisson encapsulation statistics using close-packed ordering” Lab on a Chip (Sep. 21, 2009) 9(18):2628-2631. |
Abate, et al. High-throughput injection with microfluidics using picoinjectors. Proc Natl Acad Sci U S A. Nov. 9, 2010;107(45):19163-6. doi: 10.1073/pnas.1006888107. Epub Oct. 20, 2010. |
Agresti, et al. Selection of ribozymes that catalyse multiple-turnover Diels-Alder cycloadditions by using in vitro compartmentalization. Proc Natl Acad Sci U S A. Nov. 8, 2005;102(45): 16170-5. Epub Oct. 31, 2005. |
Aitman, et al. Copy number polymorphism in Fcgr3 predisposes to glomerulonephritis in rats and humans. Nature. Feb. 16, 2006;439(7078):851-5. |
Akselband, “Enrichment of slow-growing marine microorganisms from mixed cultures using gel microdrop (GMD) growth assay and fluorescence-activated cell sorting”, J. Exp. Marine Biol., 329: 196-205 (2006). |
Akselband, “Rapid mycobacteria drug susceptibility testing using gel microdrop (GMD) growth assay and flow cytometry”, J. Microbiol. Methods, 62:181-197 (2005). |
Anna et al., “Formation of dispersions using ‘flow focusing’ in microchannels”, Appln. Phys. Letts. 82:3 364 (2003). |
Attia, U.M et al., “Micro-injection moulding of polymer microfluidic devices” Microfluidics and nanofluidics (2009) 7(1):1-28. |
Balikova, et al. Autosomal-dominant microtia linked to five tandem copies of a copy-number-variable region at chromosome 4p16. Am J Hum Genet. Jan. 2008;82(1):181-7. doi: 10.1016/j.ajhg.2007.08.001. |
Bansal et al. “An MCMC algorithm for haplotype assembly from whole-genome sequence data,” (2008) Genome Res 18:1336-1346. |
Bansal et al. “HapCUT: an efficient and accurate algorithm for the haplotype assembly problem,” Bioinformatics (2008) 24:i153-i159. |
Baret et al. “Fluorescence-activated droplet sorting (FADS): efficient microfluidic cell sorting based on enzymatic activity” Lab on a Chip (2009) 9(13):1850-1858. |
Bentley et al. “Accurate whole human genome sequencing using reversible terminator chemistry,” (2008) Nature 456:53-59. |
Boone, et al. Plastic advances microfluidic devices. The devices debuted in silicon and glass, but plastic fabrication may make them hugely successful in biotechnology application. Analytical Chemistry. Feb. 2002; 78A-86A. |
Braeckmans et al., Scanning the Code. Modern Drug Discovery. 2003:28-32. |
Bransky, et al. A microfluidic droplet generator based on a piezoelectric actuator. Lab Chip. Feb. 21, 2009;9(4):516-20. doi: 10.1039/b814810d. Epub Nov. 20, 2008. |
Bray, “The JavaScript Object Notation (JSON) Data Interchange Format,” Mar. 2014, retrieved from the Internet Feb. 15, 2015; https://tools.ietf.org/html/rfc7159. |
Brouzes, E et al., “Droplet microfluidic technology for single-cell high-throughput screening” PNAS (2009) 106(34):14195-14200. |
Browning, S.R. et al. “Haplotype Phasing: Existing Methods and New Developments” Nat Rev Genet (Sep. 16, 2011) 12(10):703-714. |
Cappuzzo, et al. Increased HER2 gene copy number is associated with response to gefitinib therapy in epidermal growth factor receptor-positive non-small-cell lung cancer patients. J Clin Oncol. Aug. 1, 2005;23(22):5007-18. |
Carroll, “The selection of high-producing cell lines using flow cytometry and cell sorting”, Exp. Op. Biol. Ther., 4:11 1821-1829 (2004). |
Chaudhary “A rapid method of cloning functional variable-region antibody genes in Escherichia coli as single-chain immunotoxins” Proc. Natl. Acad. Sci USA 87: 1066-1070 (Feb. 1990). |
Chechetkin et al., Sequencing by hybridization with the generic 6-mer oligonucleotide microarray: an advanced scheme for data processing. J Biomol Struct Dyn. Aug. 2000;18(1):83-101. |
Chen et al. “BreakDancer: an algorithm for high-resolution mapping of genomic structural variation,” Nature Methods (2009) 6(9):677-681. |
Chen, F. et al. “Chemical transfection of cells in picoliter aqueous droplets in fluorocarbon oil” Anal Chem (2011) 83(22):8816-8820. |
Choi et al. “Identification of novel isoforms of the EML4-ALK transforming gene in non-small cell lung cancer,” Cancer Res (2008) 68:4971-4976. |
Chokkalingam, V et al., “Probing cellular heterogeneity in cytokine-secreting immune cells using droplet-based microfluidics” Lab Chip (2013) 13:4740-4744. |
Chou, H-P. et al. “Disposable Microdevices for DNA Analysis and Cell Sorting” Proc. Solid-State Sensor and Actuator Workshop Hilton Head, SC Jun. 8-11, 1998, pp. 11-14. |
Chu, L-Y. et al., “Controllable monodisperse multiple emulsions” Angew. Chem. Int. Ed. (2007) 46:8970-8974. |
Clausell-Tormos et al., “Droplet-based microfluidic platforms for the encapsulation and screening of mammalian cells and multicellular organisms”, Chem. Biol. 15:427-437 (2008). |
Cleary et al. “Joint variant and de novo mutation identification on pedigrees from high-throughput sequencing data,” J Comput Biol (2014) 21:405-419. |
Cook, et al. Copy-number variations associated with neuropsychiatric conditions. Nature. Oct. 16, 2008;455(7215):919-23. doi: 10.1038/nature07458. |
Fabi, et al. Correlation of efficacy between EGFR gene copy number and lapatinib/capecitabine therapy in HER2-positive metastatic breast cancer. J. Clin. Oncol. 2010; 28:15S. 2010 ASCO Meeting abstract Jun. 14, 2010:1059. |
De Bruin et al., UBS Investment Research. Q-Series®: DNA Sequencing. UBS Securities LLC. Jul. 12, 2007. 15 pages. |
Demirci, et al. “Single cell epitaxy by acoustic picolitre droplets” Lab Chip. Sep. 2007; 7(9):1139-45. Epub Jul. 10, 2007. |
Doerr, “The smallest bioreactor”, Nature Methods, 2:5 326 (2005). |
Dowding, et al. “Oil core/polymer shell microcapsules by internal phase separation from emulsion droplets. II: controlling the release profile of active molecules” Langmuir. Jun. 7, 2005;21(12):5278-84. |
Draper, M.C. et al., “Compartmentalization of electrophoretically separated analytes in a multiphase microfluidic platform” Anal. Chem. (2012) 84:5801-5808. |
Dressler, O.J. et al., “Droplet-based microfluidics enabling impact on drug discovery” J. Biomol. Screen (2014) 19(4):483-496. |
Drmanac et al., Sequencing by hybridization (SBH): advantages, achievements, and opportunities. Adv Biochem Eng Biotechnol. 2002;77:75-101. |
Droplet Based Sequencing (slides) dated (Mar. 12, 2008). |
Eastburn, D.J. et al., “Ultrahigh-throughput mammalian single-cell reverse-transcriptase polymerase chain reaction in microfluidic droplets” Anal. Chem. (2013) 85:8016-8021. |
Eid et al. “Real-time sequencing from single polymerase molecules,” Science (2009) 323:133-138. |
Ekblom, R. et al. “A field guide to whole-genome sequencing, assembly and annotation” Evolutionary Apps (Jun. 24, 2014) 7(9):1026-1042. |
Esser-Kahn, et al. Triggered release from polymer capsules. Macromolecules. 2011; 44:5539-5553. |
Makino, K. et al. “Preparation of hydrogel microcapsules Effects of preparation conditions upon membrane properties” Colloids and Surfaces: B Biointerfaces (1998) 12:97-104. |
Marcus. Gene method offers diagnostic hope. The Wall Street Journal. Jul. 11, 2012. |
Margulies et al. “Genome sequencing in microfabricated high-density picoliter reactors,” Nature (2005) 437:376-380. |
Matochko, W.L. et al., “Uniform amplification of phage display libraries in monodisperse emulsions,” Methods (2012) 58:18-27. |
Mazutis, et al. Selective droplet coalescence using microfluidic systems. Lab Chip. Apr. 24, 2012;12(10):1800-6. doi: 10.1039/c2lc40121e. Epub Mar. 27, 2012. |
McCoy, R. et al. “Illumina TruSeq Synthetic Long-Reads Empower De Novo Assembly and Resolve Complex, Highly-Repetitive Transposable Elements” PLOS (2014) 9(9):e1016689. |
McKenna et al. “The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data,” Genome Research (2010) pp. 1297-1303. |
Merriman, et al. Progress in ion torrent semiconductor chip based sequencing. Electrophoresis. Dec. 2012;33(23):3397-3417. doi: 10.1002/elps.201200424. |
Microfluidic ChipShop. Microfluidic product catalogue. Mar. 2005. |
Microfluidic ChipShop. Microfluidic product catalogue. Oct. 2009. |
Miller et al. “Assembly Algorithms for next-generation sequencing data,” Genomics, 95 (2010), pp. 315-327. |
Mirzabekov, “DNA Sequencing by Hybridization—a Megasequencing Method and A Diagnostic Tool?” Trends in Biotechnology 12(1): 27-32 (1994). |
Moore, J.L. et al., “Behavior of capillary valves in centrifugal microfluidic devices prepared by three-dimensional printing” Microfluid Nanofluid (2011) 10:877-888. |
Mouritzen et al., Single nucleotide polymorphism genotyping using locked nucleic acid (LNA). Expert Rev Mol Diagn. Jan. 2003;3(1):27-38. |
Myllykangas et al. “Efficient targeted resequencing of human germline and cancer genomes by oligonucleotide-selective sequencing,” Nat Biotechnol, (2011) 29:1024-1027. |
Nagashima, S. et al. “Preparation of monodisperse poly(acrylamide-co-acrylic acid) hydrogel microspheres by a membrane emulsification technique and their size dependent surface properties” Colloids and Surfaces: B Biointerfaces (1998) 11:47-56. |
Navin, N.E. “The first five years of single-cell cancer genomics and beyond” Genome Res. (2015) 25:1499-1507. |
Nguyen, et al. In situ hybridization to chromosomes stabilized in gel microdrops. Cytometry. 1995; 21:111-119. |
Novak, R. et al., “Single cell multiplex gene detection and sequencing using microfluidically generated agarose emulsions” Angew. Chem. Int. Ed. Engl. (2011) 50(2):390-395. |
Oberholzer, et al. Polymerase chain reaction in liposomes. Chem Biol. Oct. 1995;2(10):677-82. |
Ogawa, et al. Production and characterization of O/W emulsions containing cationic droplets stabilized by lecithin-chitosan membranes. J Agric Food Chem. Apr. 23, 2003;51(9):2806-12. |
Okushima, “Controlled production of monodisperse double emulsions by two-step droplet breakup in microfluidic devices”, Langmuir, 20:9905-9908 (2004). |
Perez, C., et al., “Poly(lactic acid)-poly(ethylene glycol) nanoparticles as new carriers for the delivery of plasmid DNA,” Journal of Controlled Release, vol. 75, pp. 211-224 (2001). |
Peters et al., “Accurate whole-genome sequencing and haplotyping from 10 to 20 human cells,” Nature, Jul. 12, 2012, vol. 487, pp. 190-195. |
Pinto, et al. Functional impact of global rare copy number variation in autism spectrum disorders. Nature. Jul. 15, 2010;466(7304):368-72. doi: 10.1038/nature09146. Epub Jun. 9, 2010. |
Plunkett, et al. Chymotrypsin responsive hydrogel: application of a disulfide exchange protocol for the preparation of methacrylamide containing peptides. Biomacromolecules. Mar.-Apr. 2005;6(2):632-7. |
Pushkarev et al. “Single-molecule sequencing of an individual human genome,” Nature Biotech (2009) 27:847-850. |
Ritz, A. et al. “Characterization of structural variants with single molecule and hybrid sequencing approaches” Bioinformatics (2014) 30(24):3458-3466. |
Ropers. New perspectives for the elucidation of genetic disorders. Am J Hum Genet. Aug. 2007;81(2):199-207. Epub Jun. 29, 2007. |
Rotem, A. et al. “Single Cell Chip-Seq Using Drop-Based Microfluidics” Abstract #50. Frontiers of Single Cell Analysis, Stanford University Sep. 5-7, 2013. |
Rotem, A. et al., “High-throughput single-cell labeling (Hi-SCL) for RNA-Seq using drop-based microfluidics” PLOS One (May 22, 2015) 0116328 (14 pages). |
Ryan, et al. Rapid assay for mycobacterial growth and antibiotic susceptibility using gel microdrop encapsulation. J Clin Microbiol. Jul. 1995;33(7):1720-6. |
Schirinzi et al., Combinatorial sequencing-by-hybridization: analysis of the NF1 gene. Genet Test. 2006 Spring;10(1):8-17. |
Schmitt, “Bead-based multiplex genotyping of human papillomaviruses”, J. Clinical Microbiol., 44:2 504-512 (2006). |
Sebat, et al. Strong association of de novo copy number mutations with autism. Science. Apr. 20, 2007;316(5823):445-9. Epub Mar. 15, 2007. |
Seiffert, S. et al., “Smart microgel capsules from macromolecular precursors” J. Am. Chem. Soc. (2010) 132:6606-6609. |
Shah, “Fabrication of monodisperse thermosensitive microgels and gel capsules in microfluidic devices”, Soft Matter, 4:2303-2309 (2008). |
Shendure et al. “Accurate Multiplex Polony Sequencing of an Evolved bacterial Genome” Science (2005) 309:1728-1732. |
Shimkus et al. “A chemically cleavable biotinylated nucleotide: Usefulness in the recovery of protein-DNA complexes from avidin affinity columns” PNAS (1985) 82:2593-2597. |
Shlien, et al. Copy number variations and cancer. Genome Med. Jun. 16, 2009;1(6):62. doi: 10.1186/gm62. |
Shlien, et al. Excessive genomic DNA copy number variation in the Li-Fraumeni cancer predisposition syndrome. Proc Natl Acad Sci U S A. Aug. 12, 2008;105(32):11264-9. doi: 10.1073/pnas.0802970105. Epub Aug. 6, 2008. |
Simeonov et al., Single nucleotide polymorphism genotyping using short, fluorescently labeled locked nucleic acid (LNA) probes and fluorescence polarization detection. Nucleic Acids Res. Sep. 1, 2002;30(17):e91. |
Sorokin et al., Discrimination between perfect and mismatched duplexes with oligonucleotide gel microchips: role of thermodynamic and kinetic effects during hybridization. J Biomol Struct Dyn. Jun. 2005;22(6):725-34. |
Su, et al., Microfluidics-Based Biochips: Technology Issues, Implementation Platforms, and Design-Automation Challenges. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 2006;25(2):211-23. (Feb. 2006). |
Sun et al., Progress in research and application of liquid-phase chip technology. Chinese Journal Experimental Surgery. May 2005;22(5):639-40. |
Tawfik, D.S. et al. “Man-made cell-like compartments for molecular evolution” Nature Biotech (Jul. 1998) 16:652-656. |
Tewhey et al. “The importance of phase information for human genomics,” Nat Rev Genet (2011) 12:215-223. |
Tewhey, R. et al., “Microdroplet-based PCR enrichment for large-scale targeted sequencing” Nature Biotech. (2009) 27(11):1025-1031 and Online Methods (11 pages). |
The SAM/BAM Format Specification Working Group, “Sequence Alignment/Map Format Specification,” Dec. 28, 2014. |
Theberge, A.B, et al. Microdroplets in microfluidics: an evolving platform for discoveries in chemistry and biology. Angew Chem Int Ed Engl. Aug. 9, 2010;49(34):5846-68. doi: 10.1002/anie.200906653. |
Tonelli, C. et al., “Perfluoropolyether functional oligomers: unusual reactivity in organic chemistry” J. Fluorine Chem. (2002) 118:107-121. |
Tubeleviciute, et al. Compartmentalized self-replication (CSR) selection of Thermococcus litoralis Sh1B DNA polymerase for diminished uracil binding. Protein Eng Des Sel. Aug. 2010;23(8):589-97. doi: 10.1093/protein/gzq032. Epub May 31, 2010. |
Turner, et al. “Methods for genomic partitioning” Annu Rev Genomics Human Genet. (2009) 10:263-284. doi: 10.1146/annurev-genom-082908-150112. Review. |
Voskoboynik, A. et al. “The genome sequence of the colonial chordate, Botryllus schlosseri.” eLife Jul. 2, 2013, 2:e00569. |
Wagner, O et al., “Biocompatible fluorinated polyglycerols for droplet microfluidics as an alternative to PEG-based copolymer surfactants” Lab Chip DOI:10.1039/C5LC00823A. (2015). |
Wang et al., Single nucleotide polymorphism discrimination assisted by improved base stacking hybridization using oligonucleotide microarrays. Biotechniques. 2003;35:300-08. |
Wang, et al. A novel thermo-induced self-bursting microcapsule with magnetic-targeting property. Chemphyschem. Oct. 5, 2009;10(14):2405-9. |
Wang, et al. Digital karyotyping. Proc Natl Acad Sci U S A. Dec. 10, 2002;99(25):16156-61. Epub Dec. 2, 2002. |
Weaver, J.C. et al. “Rapid clonal growth measurements at the single-cell level: gel microdroplets and flow cytometry”, Biotechnology, 9:873-877 (1991). |
Wheeler et al., “Database resources of the National Center for Biotechnology Information,” Nucleic Acids Res. (2007) 35 (Database issue): D5-12. |
Whitesides, “Soft lithography in biology and biochemistry”, Annual Review of Biomedical Engineering, 3:335-373 (2001). |
Williams, R. et al. “Amplification of complex gene libraries by emulsion PCR” Nature Methods (Jul. 2006) 3(7):545-550. |
Woo, et al. G/C-modified oligodeoxynucleotides with selective complementarity: synthesis and hybridization properties. Nucleic Acids Res. Jul. 1, 1996;24(13):2470-5. |
Xia, “Soft lithography”, Annual Review of Material Science, 28: 153-184 (1998). |
Yamamoto, et al. Chemical modification of Ce(IV)/EDTA-base artificial restriction DNA cutter for versatile manipulation of double-stranded DNA. Nucleic Acids Research. 2007; 35(7):e53. |
Zerbino et al. “Velvet: Algorithms for de novo short read assembly using de Bruijn graphs,” Genome Research (2008) 18:821-829. |
Zerbino, D.R. “Using the Velvet de novo assembler for short-read sequencing technologies” Curr Protoc Bioinformatics (Sep. 1, 2010) 31:11.5:11.5.1-11.5.12. |
Zerbino, Daniel, “Velvet Manual—version 1.1,” Aug. 15, 2008, pp. 1-22. |
Zhang, “Combinatorial marking of cells and organelles with reconstituted fluorescent proteins”, Cell, 119:137-144 (Oct. 1, 2004). |
Zhang, et al. Degradable disulfide core-cross-linked micelles as a drug delivery system prepared from vinyl functionalized nucleosides via the RAFT process. Biomacromolecules. Nov. 2008;9(11):3321-31. doi: 10.1021/bm800867n. Epub Oct. 9, 2008. |
Zhao, J., et al., “Preparation of hemoglobin-loaded nano-sized particles with porous structure as oxygen carriers,” Biomaterials, vol. 28, pp. 1414-1422 (2007). |
Zheng, X.Y. et al. “Haplotyping germline and cancer genomes with high-throughput linked-read sequencing” Nature Biotech (Feb. 1, 2016) 34(3):303-311 and Supplemental Material. |
Zhu, S. et al., “Synthesis and self-assembly of highly incompatible polybutadiene-poly(hexafluoropropylene oxide) diblock copolymers” J. Polym. Sci. (2005) 43:3685-3694. |
Zimmermann et al., Microscale production of hybridomas by hypo-osmolar electrofusion. Human Antibodies Hybridomas. Jan. 1992;3(1):14-8. |
Zong, C. et al. “Genome-wide detection of single-nucleotide and copy-number variations of a single human cell” Science. Dec. 21, 2012;338(6114):1622-6. doi: 10.1126/science.1229164. |
Freeman et al., “Profiling the T-cell receptor beta-chain repertoire by massively parallel sequencing”, Genome Research, Dec. 31, 2009, 9 pages. |
Bischof et al., “bcRep: R Package for Comprehensive Analysis of B Cell Receptor Repertoire Data”, PLOS One, Aug. 23, 2016, 15 pages. |
Fisher, S. et al. “A Scalable, fully automated process for construction of sequence-ready human exome targeted capture libraries” Genome Biology (2011) 12:R1-R15. doi: 10.1186/GB-2011-12-1-r1. Epub Jan. 4, 2011. |
Fredrickson, C.K. et al., “Macro-to-micro interfaces for microfluidic devices” Lab Chip (2004) 4:526-533. |
Freiberg, et al. “Polymer microspheres for controlled drug release” Int J Pharm. Sep. 10, 2004;282(1-2):1-18. |
Fu, A.Y. et al. “A microfabricated fluorescence-activated cell sorter” Nature Biotech (Nov. 1999) 17:1109-1111. |
Fulton et al., “Advanced multiplexed analysis with the FlowMetrix system” Clin Chem. Sep. 1997;43(9): 1749-56. |
Garstecki, P. et al. “Formation of monodisperse bubbles in a microfluidic flow-focusing device” Appl. Phys. Lett (2004) 85(13):2649-2651. DOI: 10.1063/1.1796526. |
Gartner, et al. The Microfluidic Toolbox: examples for fluidic interfaces and standardization concepts. Proc. SPIE 4982, Microfluidics, BioMEMS, and Medical Microsystems, (Jan. 17, 2003); doi: 10.1117/12.479566. |
Ghadessy, et al. Directed evolution of polymerase function by compartmentalized self-replication. Proc Natl Acad Sci U S A. Apr. 10, 2001;98(8):4552-7. Epub Mar. 27, 2001. |
Gonzalez, et al. The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility. Science. Mar. 4, 2005;307(5714):1434-40. Epub Jan. 6, 2005. |
Gordon et al. “Consed: A Graphical Tool for Sequence Finishing,” Genome Research (1998) 8:198-202. |
Granieri, Lucia “Droplet-based microfluidics and engineering of tissue plasminogen activator for biomedical applications” Ph.D. Thesis, Nov. 13, 2009 (131 pages). |
Grasland-Mongrain, E. et al. “Droplet coalescence in microfluidic devices” Internet Citation, 2003, XP002436104, Retrieved from the Internet: URL:http://www.eleves.ens.fr./home/grasland/rapports/stage4.pdf [retrieved on Jun. 4, 2007]. |
Guo, M.T. et al., “Droplet microfluidics for high-throughput biological assays” Lab Chip (2012) 12:2146-2155. |
Gyarmati et al., “Reversible Disulphide Formation in Polymer Networks: A Versatile Functional Group from Synthesis to Application,” European Polymer Journal, 2013, 49, 1268-1286. |
Hashimshony, T et al. “CEL-Seq: Single-Cell RNA-Seq by Multiplexed Linear Amplification” Cell Rep. Sep. 27, 2012;2(3):666-73. doi: 10.1016/j.celrep.2012.08.003. Epub Aug. 30, 2012. |
He “Selective Encapsulation of Single Cells and Subcellular Organelles into Picoliter- and Femtoliter-Volume Droplets” Anal. Chem 77: 1539-1544 (2005). |
Heng et al. “Fast and accurate long-read alignment with Burrows-Wheeler transform,” Bioinformatics (2010) 25(14):1754-1760. |
Holtze, C. et al. Biocompatible surfactants for water-in-fluorocarbon emulsions. Lab Chip. Oct. 2008;8(10):1632-9. doi: 10.1039/b806706f. Epub Sep. 2, 2008. |
Huang et al. “EagleView: A genome assembly viewer for next-generation sequencing technologies,” Genome Research (2008) 18:1538-1543. |
Huebner, “Quantitative detection of protein expression in single cells using droplet microfluidics”, Chem. Commun. 1218-1220 (2007). |
Hug, H. et al. “Measurement of the number of molecules of a single mRNA species in a complex mRNA preparation” J Theor Biol. Apr. 21, 2003;221(4):615-24. |
Illumina, Inc. An Introduction to Next-Generation Sequencing Technology. Feb. 28, 2012. |
Jena et al., “Cyclic olefin copolymer based microfluidic devices for biochip applications: Ultraviolet surface grafting using 2-methacryloyloxyethyl phosphorylcholine” Biomicrofluidics (Mar. 15, 2012) 6:012822 (12 pages). |
Jung, W-C et al., “Micromachining of injection mold inserts for fluidic channel of polymeric biochips” Sensors (2007) 7:1643-1654. |
Kanehisa et al. “KEGG: Kyoto Encyclopedia of Genes and Genomes,” Nucleic Acids Research (2000) 28:27-30. |
Khomiakov A et al., “Analysis of perfect and mismatched DNA duplexes by a generic hexanucleotide microchip”. Mol Biol (Mosk). Jul.-Aug. 2003;37(4):726-41. Russian. Abstract only. |
Kim et al. “HapEdit: an accuracy assessment viewer for haplotype assembly using massively parallel DNA-sequencing technologies,” Nucleic Acids Research (2011) pp. 1-5. |
Kim, et al. Albumin loaded microsphere of amphiphilic poly(ethylene glycol)/ poly(alpha-ester) multiblock copolymer. Eur J Pharm Sci. Nov. 2004;23(3):245-51. |
Kim, et al. Fabrication of monodisperse gel shells and functional microgels in microfluidic devices. Angew Chem Int Ed Engl. 2007;46(11):1819-22. |
Kim, J et al., “Rapid prototyping of microfluidic systems using a PDMS/polymer tape composite” Lab Chip (2009) 9:1290-1293. |
Kirkness et al. “Sequencing of isolated sperm cells for direct haplotyping of a human genome,” Genome Res (2013) 23:826-832. |
Kitzman et al. “Haplotype-resolved genome sequencing of a Gujarati Indian individual.” Nat Biotechnol (2011) 29:59-63. |
Kitzman, et al. Noninvasive whole-genome sequencing of a human fetus. Sci Transl Med. Jun. 6, 2012;4(137):137ra76. doi: 10.1126/scitranslmed.3004323. |
Klein, et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell. May 21, 2015;161:1187-1201. |
Knight, et al. Subtle chromosomal rearrangements in children with unexplained mental retardation. Lancet. Nov. 13, 1999;354(9191):1676-81. |
Koster et al., “Drop-based microfluidic devices for encapsulation of single cells”, Lab on a Chip The Royal Soc. of Chem. 8: 1110-1115 (2008). |
Kutyavin, et al. Oligonucleotides containing 2-aminoadenine and 2-thiothymine act as selectively binding complementary agents. Biochemistry. Aug. 27, 1996;35(34):11170-6. |
Lagus, T.P. et al., “A review of the theory, methods and recent applications of high-throughput single-cell droplet microfluidics” J. Phys. D: Appl. Phys. (2013) 46:114005 (21 pages). |
Layer et al. “Lumpy: A probabilistic framework for structural variant discovery,” Genome Biology (2014) 15(6):R84. |
Li, Y., et al., “PEGylated PLGA Nanoparticles as protein carriers: synthesis, preparation and biodistribution in rats,” Journal of Controlled Release, vol. 71, pp. 203-211 (2001). |
Lippert et al. “Algorithmic strategies for the single nucleotide polymorphism haplotype assembly problem,” Brief. Bioinform (2002) 3:23-31. |
Liu, et al. Preparation of uniform-sized PLA microcapsules by combining Shirasu porous glass membrane emulsification technique and multiple emulsion-solvent evaporation method. J Control Release. Mar. 2, 2005;103(1):31-43. Epub Dec. 21, 2004. |
Liu, et al. Smart thermo-triggered squirting capsules for Nanoparticle delivery. Soft Matter. 2010; 6(16):3759-3763. |
Lo, et al. On the design of clone-based haplotyping. Genome Biol. 2013;14(9):R100. |
Loscertales, I.G., et al., “Micro/Nano Encapsulation via Electrified Coaxial Liquid Jets,” Science, vol. 295, pp. 1695-1698 (2002). |
Love, “A microengraving method for rapid selection of single cells producing antigen-specific antibodies”, Nature Biotech, 24(6):703-707 (Jun. 2006). |
Lowe, Adam J. “Norbornenes and [n]polynorbornanes as molecular scaffolds for anion recognition” Ph.D. Thesis (May 2010). (361 pages). |
Lupski. Genomic rearrangements and sporadic disease. Nat Genet. Jul. 2007;39(7 Suppl):S43-7. |
Macosko, et al. Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets. Cell. May 21, 2015;161(5):1202-14. doi: 10.1016/j.cell.2015.05.002. |
Mair, D.A. et al., “Injection molded microfluidic chips featuring integrated interconnects” Lab Chip (2006) 6:1346-1354. |
Bluthmann et al., 1988, “T-cell-specific deletion of T-cell receptor transgenes allows functional rearrangement of endogenous alpha- and beta-genes,” Nature 334, pp. 156-159. |
Ganusov et al., 2007, “Do most lymphocytes in humans really reside in the gut?,” Trends Immunol, 28(12), pp. 514-518. |
Mostovoy et al., 2016, “A hybrid approach for de novo human genome sequence assembly and phasing,” Nat. Methods 13, 587-590. |
Narasimhan et al., 2016, “Health and population effects of rare gene knockouts in adult humans with related parents,” Science 352, pp. 474-477. |
Rudolph et al., 2006, “How TCRs bind MHCs, peptides, and coreceptors,” Annu Rev Immunol 24, pp. 419-466, doi:10.1146/annurev.immunol.23.021704.115658. |
Uematsu et al., 1988, “In transgenic mice the introduced functional T-cell receptor beta gene prevents expression of endogenous beta genes,” Cell 52, pp. 831-841. |
Yassai et al., 2009, “A clonotype nomenclature for T-cell receptors,” Immunogenetics 61, pp. 493-502. |
Yaari and Kleinstein, 2015, “Practical guidelines for B-cell repertoire sequencing analysis,” Genome Medicine 7:121. |
Matsuda et al., 1998, “The complete nucleotide sequence of the human immunoglobulin heavy chain variable region locus,” The Journal of Experimental Medicine. 188 (11): 2151-62, doi: 10.1084/jem.188.11.2151. |
Li et al., 2004, “Utilization of Ig heavy chain variable, diversity, and joining gene segments in children with B-lineage acute lymphoblastic leukemia: implications for the mechanisms of VDJ recombination and for pathogenesis,” Blood. 103 (12): 4602-9, doi:10.1182/blood-2003-11-3857. |
Chen et al., 2010, “Clustering-based identification of clonally related immunoglobulin gene sequence sets,” Immunome Res. 6 Suppl 1:S4. |
Hershberg and Prak, 2015, “The analysis of clonal expansion in normal and autoimmune B-cell repertoires,” Philos Trans R Soc Lond B Biol Sci. 370(1676). |
Zheng, 2017, “Massively parallel digital transcriptional profiling of single cells,” Nature Communications, DOI: 10.1038/ncomms14049. |
Aken et al., 2015, “The Ensembl gene annotation system,” Database, baw093, doi: 10.1093/database/baw093. |
McLaren et al., 2016, “The Ensembl Variant Effect Predictor,” Genome Biology 17, p. 122, doi: 10.1186/s13059-016-0974-4. |
Chromium, Single Cell 3' Reagent Kits v2. User Guide, 2017, 10X Genomics, Pleasanton, California, Rev. B. |
Chromium Single Cell V(D)J Reagent Kits User Guide, 2017, 10X Genomics. |
10x Genomics Announces the Addition of Unbiased Gene Expression and B-cell Repertoire to the Chromium Single Cell V(D)J Solution, Oct. 18, 2017, https://www.businesswire.com/news/home/20171018005362/en/10x-Genomics-Announces-Addition-Unbiased-Gene-Expression#.XLfZzQ4GaAw.email. |
U.S. Appl. No. 62/572,544, filed Oct. 15, 2017. |
Greiff et al., “Bioinformatic and Statistical Analysis of Adaptive Immune Repertoires,” Trends in Immunology, Nov. 2015, vol. 36 No. 11, pp. 738-749. |
Turchaninova et al., “High-quality full-length immunoglobulin profiling with unique molecular barcoding,” Nature Protocols, Aug. 4, 2016, vol. 11 No. 9, pp. 1599-1616. |
Zhang et al., “IMonitor: A Robust Pipeline for TCR and BCR Repertoire Analysis,” Genetics, Aug. 21, 2015, vol. 201 No. 2, pp. 459-472. |
Extended European Search Report dated Feb. 15, 2021, for European Patent Application No. 18802746.0. |
Number | Date | Country
---|---|---
20180371545 A1 | Dec 2018 | US

Number | Date | Country
---|---|---
62582866 | Nov 2017 | US
62508947 | May 2017 | US