The present application relates to biotechnologies, and more particularly, to a method and a system for integrating morphological characteristics and gene expression of individual cells.
Single-cell RNA sequencing (scRNA-seq) technologies have revolutionized the transcriptomic analysis of multicellular tissues. By investigating thousands of cells at single-cell resolution, scRNA-seq can provide quantitative expression profiles of individual cells with valuable insights into cellular differences, such as cell types and cell states, which are usually elusive in traditional bulk RNA-seq analysis. The distinct advantages of single-cell transcriptomics enable multi-dimensional investigation of individual cells to, for example, decipher tumor heterogeneity, reveal complex and rare cell populations, and uncover regulatory relationships between genes, offering a strong basis for designing precision medicine and targeted therapy.
On the other hand, due to the rapid evolution of computational technologies (e.g., deep learning and neural networks), cellular morphological profiling based on high-throughput and high-content image processing has become an emerging tool for various biological and medical applications. For example, imaging flow cytometry (IFC), the representative technology in this field, can extract hundreds of morphological features (e.g., shape, size, intensity, and texture) from each individual cell with AI-driven algorithms, allowing classification of complex cell phenotypes, identification of rare cells, and discovery of useful targets for disease diagnosis, personalized medicine, and drug development.
Since morphological characteristics are highly correlated with gene expression patterns, researchers have shown substantial interest in integrating phenomics and transcriptomics data to obtain novel biological insights. However, there is currently no approach that can link gene expression profiles to morphological phenotypes at the single-cell level in a high-throughput manner. Some studies are based on bulk RNA-seq analysis and therefore ignore cellular heterogeneity. Others rely on manual collection of each target cell with a pipette, which has limited throughput.
Implementations of the present disclosure will now be described, by way of embodiments only, with reference to the attached figures.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by persons skilled in the art. The terms used herein are only for the purpose of describing specific implementation manners, and are not intended to limit the embodiments of the present application.
In the case of no conflict, the following embodiments and features in the embodiments can be combined with each other.
Referring to
S1: a microfluidic device 10 is provided.
Referring to
Referring to
In some embodiments, the capture oligonucleotide 1010 can be synthesized by an inkjet printing technology, and the printing process is shown in
To validate that the cell barcode sequences 1011 are correctly synthesized, 100 spots with known oligonucleotide sequences are distributed at different locations of the microwell array for hybridization with fluorescent probes. The hybridization rate of the 100 spots and the CV (coefficient of variation) of the fluorescent signal intensity are characterized by fluorescence imaging. In the present application, the probe hybridization rate is greater than 95% and the CV of the fluorescent signal intensity is less than 10%, indicating that the cell barcode sequences 1011 are synthesized with high accuracy.
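The two quality metrics described above can be illustrated with a minimal sketch. The function name `hybridization_qc`, the intensity threshold used to call a spot "hybridized", and the choice of the population standard deviation over hybridized spots are all assumptions made for illustration; the application does not specify how the metrics are computed.

```python
from statistics import mean, pstdev


def hybridization_qc(intensities, threshold):
    """Return (hybridization_rate, cv) for a list of control-spot intensities.

    A spot counts as hybridized when its intensity is at or above `threshold`
    (a hypothetical cutoff). The CV is the population standard deviation
    divided by the mean, computed over the hybridized spots only.
    """
    if not intensities:
        raise ValueError("no spots measured")
    hybridized = [x for x in intensities if x >= threshold]
    if not hybridized:
        raise ValueError("no hybridized spots, CV is undefined")
    rate = len(hybridized) / len(intensities)
    cv = pstdev(hybridized) / mean(hybridized)
    return rate, cv


# Illustrative values only: 96 bright spots and 4 dark spots out of 100.
rate, cv = hybridization_qc([100.0] * 96 + [0.0] * 4, threshold=50.0)
passes = rate > 0.95 and cv < 0.10  # criteria stated in the description
```

The pass criteria in the last line mirror the thresholds given in the text (hybridization rate above 95%, CV below 10%).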
S2: cells are injected into the microwells 101, and the interdigital electrode 103 is used to capture a single cell in the microwell 101. The morphological characteristics of the cells in the microwells 101 are recorded for morphological analysis.
During operation, cells are injected through the inlet 102 and flow into the microwell array through the glass channel 109. The dielectrophoretic (DEP) force generated by the interdigital electrode 103 can trap single cells above the microwells 101. After the cells stop flowing, the interdigital electrode 103 is turned off, allowing the cells to settle into the microwells 101. The excess cells outside the microwells 101 are washed away through the outlet 104. Ideally, every microwell 101 traps exactly one cell. To approach this ideal, the key parameters, including but not limited to the dimensions of the microwell 101 (diameter and depth), DEP force intensity, channel height, input cell concentration, and flow rate, are optimized to maximize the single-cell purity and the microwell occupancy rate. Referring to
After individual cells are trapped in the microwell array, bright-field and fluorescent images of each cell are recorded for morphological profiling via a CCD (Charge Coupled Device) camera connected to a microscope. Meanwhile, the cell barcode sequences 1011 on the capture oligonucleotides 1010 are assigned to the cells based on their locations in the array. For example, if the cell barcode sequence of the microwell 101 in the first row and the first column is known to be TACGAGC (TACGAGC being unique among all cell barcode sequences), then TACGAGC is assigned to the cell located in the microwell 101 in the first row and the first column.
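The position-based assignment can be sketched as a simple lookup. In this sketch, `barcode_layout`, `occupied_wells`, and `assign_barcodes` are hypothetical names, and every barcode other than TACGAGC, as well as the image file names, is invented for illustration.

```python
def assign_barcodes(barcode_layout, occupied_wells):
    """Map each cell barcode to the image of the cell in the same microwell.

    barcode_layout: dict {(row, col): barcode} known from array synthesis.
    occupied_wells: dict {(row, col): image_id} recorded during imaging.
    """
    # Position-based assignment only works if every barcode is unique.
    if len(set(barcode_layout.values())) != len(barcode_layout):
        raise ValueError("cell barcode sequences must be unique")
    return {
        barcode_layout[position]: image_id
        for position, image_id in occupied_wells.items()
    }


# (1, 1) -> TACGAGC follows the example in the text; the rest is made up.
layout = {(1, 1): "TACGAGC", (1, 2): "GGTTAAC", (2, 1): "CATGCTA"}
images = {(1, 1): "img_r1_c1.tif", (2, 1): "img_r2_c1.tif"}
barcode_to_image = assign_barcodes(layout, images)
```

Empty microwells simply do not appear in `occupied_wells`, so their barcodes are not assigned to any cell.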
Air bubbles in the flow cell cause reverse transcription and PCR to fail because the bubbles expand at high temperature. To remove the air bubbles, the glass channel 107 is treated with 80% ethanol before buffer injection. In addition, applying a voltage to the interdigital electrode 103 makes the surface hydrophilic through electrowetting, which helps remove the bubbles trapped in the microwells 101.
S3: cells are lysed so that the mRNA released by the cell is captured by the capture oligonucleotide in the microwell 101 where the cell is located.
Referring to
S4: the captured mRNA is reverse transcribed to obtain cDNA, wherein each cDNA comprises a capture oligonucleotide sequence 1010 and a nucleotide sequence complementary to the captured mRNA.
Referring to
S5: a PCR amplification reaction is performed on the cDNA to obtain a cDNA library, and the cDNA library is sequenced.
Referring to
S6: the cell barcode sequence 1011 and the unique molecular identifier sequence 1012 are read according to a sequencing result, and the morphological characteristics and gene expression information of the cell in the microwell 101 are integrated together.
Specifically, after the genes are identified, reads are grouped by their cell barcode sequence 1011 and individual UMIs are counted for each gene in each cell. By reading the cell barcode sequence 1011 in the sequencing result, the corresponding microwell 101 can be located, and the morphological characteristics (the bright-field image and the fluorescence image obtained in step S2) of the cell in that microwell 101 can be retrieved. Furthermore, the morphological characteristics and gene expression of the cell are integrated by a controller that includes analysis software. In some embodiments, the analysis software implements t-distributed stochastic neighbor embedding (tSNE), a data visualization method that reduces high-dimensional data to two or three dimensions so that it can be plotted. Referring to FIG. 8, the single-cell expression profiles are eventually plotted two-dimensionally (tSNE) for visual analysis and integrated with their morphological features. It is understandable that other analysis software or methods can also be used to analyze the morphological characteristics and gene expression profiles of single cells.
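The counting and linking logic of step S6 can be sketched as follows. This is an illustrative sketch only: it assumes reads have already been parsed into hypothetical `(cell_barcode, umi, gene)` tuples, and the names `count_umis` and `integrate`, the gene names, and the image file name are not taken from the application. The tSNE embedding itself is not shown.

```python
from collections import defaultdict


def count_umis(reads):
    """Count distinct UMIs per gene per cell barcode.

    reads: iterable of (cell_barcode, umi, gene) tuples. Reads sharing the
    same barcode, gene, and UMI are treated as PCR duplicates of one molecule.
    """
    seen = defaultdict(set)
    for barcode, umi, gene in reads:
        seen[(barcode, gene)].add(umi)
    counts = defaultdict(dict)
    for (barcode, gene), umis in seen.items():
        counts[barcode][gene] = len(umis)
    return dict(counts)


def integrate(counts, barcode_to_image):
    """Attach the recorded image to each barcode's expression profile."""
    return {
        barcode: {"image": barcode_to_image[barcode], "expression": genes}
        for barcode, genes in counts.items()
        if barcode in barcode_to_image  # skip barcodes with no imaged cell
    }


reads = [
    ("TACGAGC", "AAAA", "GAPDH"),
    ("TACGAGC", "AAAA", "GAPDH"),  # duplicate of the read above
    ("TACGAGC", "CCCC", "GAPDH"),
    ("TACGAGC", "AAAA", "ACTB"),
    ("GGTTAAC", "TTTT", "GAPDH"),
]
counts = count_umis(reads)
linked = integrate(counts, {"TACGAGC": "img_r1_c1.tif"})
```

Barcodes that appear in the sequencing result but have no imaged cell are dropped from the integrated output here; whether to keep or flag them is a design choice left open by the text.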
Referring to
The application will be further described below in conjunction with specific embodiments.
HEK 293T cells (a human embryonic kidney cell line) and mouse 3T3 cells were mixed at the same concentration, and the mixture was analyzed for single-cell morphological characteristics and gene expression profiles. Individual cells were isolated in the microwells 101 for imaging, as shown in
If cells from the two cell lines are isolated into the same microwell 101 and their mRNAs are captured by the same capture oligonucleotides 1010, transcripts from both species will share the same cell barcode sequence 1011; the proportion of such mixed-species barcodes reveals the single-cell purity. The single-cell purity can be further improved by analyzing the cell images to screen out cell doublets and multiplets. In this example, the single-cell purity is greater than 95%.
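A species-mixing purity estimate along these lines might look like the sketch below. The cutoff `max_minor_fraction`, the function name, and the example counts are assumptions for illustration; the application does not state how a barcode is classified as mixed.

```python
def single_cell_purity(species_counts, max_minor_fraction=0.1):
    """Fraction of barcodes dominated by a single species.

    species_counts: dict {barcode: (human_umis, mouse_umis)}.
    A barcode is called mixed when the minority species contributes more
    than `max_minor_fraction` of its UMIs (a hypothetical cutoff).
    """
    evaluated = 0
    mixed = 0
    for human, mouse in species_counts.values():
        total = human + mouse
        if total == 0:
            continue  # no transcripts recovered, cannot be classified
        evaluated += 1
        if min(human, mouse) / total > max_minor_fraction:
            mixed += 1
    if evaluated == 0:
        raise ValueError("no barcodes with UMIs")
    return 1 - mixed / evaluated


# Made-up counts: one clearly mixed barcode out of four.
example = {"A": (100, 0), "B": (0, 80), "C": (50, 50), "D": (95, 5)}
purity = single_cell_purity(example)
```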
The recovery rate is calculated as the number of recovered cells (the number of cell barcode sequences detected) expressed as a percentage of the total number of input cells. In this example, the recovery rate is greater than 80%.
Contamination by ambient RNA from the original biofluids or from cell disruption decreases the accuracy of interpretation. Synthetic RNA controls of known concentration and sequence are spiked into the sample to evaluate the RNA contamination, which is defined as the number of recovered spiked RNAs (the number of individual UMIs with the spike-RNA sequence) expressed as a percentage of the total amount of input spiked RNA. In this example, the RNA contamination is less than 5%.
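Both the recovery rate and the RNA contamination are simple ratios, as the following sketch shows. The function names and the input numbers are illustrative assumptions, not measured values from the application.

```python
def percentage(part, total):
    """Return part / total as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


def recovery_rate(recovered_barcodes, input_cells):
    """Recovered cells (distinct cell barcodes) as a percentage of input cells."""
    return percentage(recovered_barcodes, input_cells)


def rna_contamination(spike_umis_recovered, spike_molecules_input):
    """Recovered spike-RNA UMIs as a percentage of the input spike-RNA amount."""
    return percentage(spike_umis_recovered, spike_molecules_input)


# Hypothetical run: 850 barcodes from 1000 cells, 30 spike UMIs from 1000 molecules.
recovery = recovery_rate(850, 1000)
contamination = rna_contamination(30, 1000)
```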
To characterize the detection sensitivity, mouse cells are spiked into human cells at different concentration ratios ranging from 1:1 (50%) to 1:99 (1%). The sensitivity is determined as the percentage above which the mouse cells can be detected. In this example, the sensitivity is less than 5%; that is, the lower limit of detection of the mouse cell concentration is no higher than 5%, and mouse cells can be detected whenever their concentration is greater than 5%.
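One way to read a lower limit of detection off such a dilution series is sketched below. The function name, the representation of the series as a dictionary, and the detection outcomes are hypothetical; the application does not describe the criterion used to call a detection.

```python
def limit_of_detection(series):
    """Lowest spiked fraction detected without a miss at any higher fraction.

    series: dict {spiked_fraction: detected (bool)}.
    Returns None if even the highest fraction was not detected.
    """
    lod = None
    for fraction in sorted(series, reverse=True):
        if not series[fraction]:
            break  # first miss ends the run of consecutive detections
        lod = fraction
    return lod


# Made-up outcomes for a series from 1:1 (50%) down to 1:99 (1%).
dilutions = {0.50: True, 0.10: True, 0.05: True, 0.01: False}
lod = limit_of_detection(dilutions)
```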
In this application, a plurality of individual cells are placed in a plurality of microwells 101 in the microfluidic device, and from the beginning (imaging) to the end (sequencing), each cell is assigned a unique, known cell barcode sequence so that its phenotype can be observed before it is processed for sequencing. The cell barcode sequence in the capture oligonucleotide can be "read" from the position of the microwell 101 and can also be "read" from the sequence reads obtained from the cDNA library. The transcriptome data (mRNA sequence information) is thereby linked to the observed phenotype of the single cell, so that the morphological phenotype is directly related to gene expression. This method focuses on integrating the morphological characteristics and gene expression profiles of isolated single cells. It offers high efficiency, high single-cell purity (greater than 95%), a high recovery rate (greater than 80%), high sensitivity (less than 5%), and low RNA contamination (less than 5%). It can facilitate fundamental biological studies, the development of multi-dimensional biomarker signatures for diseases, and faster drug discovery and development.
The above descriptions are some specific implementation manners of the present application, but in the actual application process, they should not be limited to these implementation manners. For persons skilled in the art, other modifications and changes made according to the technical concept of this application should all belong to the protection scope of this application.
Number | Date | Country
---|---|---
63197331 | Jun 2021 | US

Relation | Number | Date | Country
---|---|---|---
Parent | PCT/IB2021/060684 | Nov 2021 | US
Child | 17548790 | | US