Single-cell measurements are essential for understanding biological systems composed of different cell types. Recent advances in single-cell RNA and protein methods have enabled the analysis of single-cell heterogeneity at unprecedented scale and depth. These emerging single-cell methods have the potential to go beyond classifying cell types and to help characterize intrinsically single-cell processes, such as the cell division cycle (CDC) and its coordination with metabolism and cell growth. Crucial aspects of the CDC are regulated post-transcriptionally by protein synthesis and degradation, and their characterization demands single-cell protein analysis. There is a need to improve single-cell proteomic sample preparation toward, for example, improved quantification of proteins and/or protein variabilities.
Embodiments of the present invention include methods of single-cell proteomic sample preparation for analyzing peptides in samples with a low abundance of proteins.
In one aspect, the disclosure provides a method of forming a single-cell proteomic sample, said method comprising:
In some embodiments, each of the n droplets in step a), b), c), and/or d) has a volume of about 25 nanoliters (nl) or less. In particular embodiments, each of the n droplets in step a), b), c) and d) has a volume of about 25 nanoliters (nl) or less.
In some embodiments, the substantially planar solid surface is provided by a uniform glass slide. In certain embodiments, the substantially planar solid surface is etched with a geometric pattern. In particular embodiments, the substantially planar solid surface is fluorocarbon-coated.
In certain embodiments, n is ≥10.
In some embodiments, the lysis buffer comprises about 4-8 nanoliters of 90-100% dimethyl sulfoxide (DMSO).
In some embodiments, step b) comprises dispensing the single cell in a cell suspension buffer with a volume of about 100-1,000 picoliters. In particular embodiments, step b) comprises dispensing the single cell in a cell suspension buffer with a volume of about 300 picoliters.
In certain embodiments, the single cell is lysed in a total volume of about 4-10 nl for about 10-20 minutes.
In some embodiments, step c) comprises:
In certain embodiments, the chemical tag comprises a “light” version of TMT label reagents dissolved in DMSO. In other embodiments, the chemical tag comprises a “heavy” version of TMT label reagents dissolved in DMSO.
In some embodiments, step d) comprises dispensing about 18-22 nl of a chemical tag into each of the n droplets comprising the peptides; and enabling the chemical tag to react with the peptides at room temperature and a relative humidity of about 75% for about 1 hour to produce the labeled peptides. In certain embodiments, each droplet of the n droplets receives a unique chemical tag, thereby enabling the labeled peptides in each droplet to be distinguishable from the labeled peptides in each other droplet.
In certain embodiments, the fluid is water. In particular embodiments, the fluid has a volume of about 1 μl.
In some embodiments, steps a) to e) are repeated at least once to form two or more single-cell proteomic samples on the substantially planar solid surface.
In certain embodiments, at least 100 droplets of lysis buffer are dispensed onto the substantially planar solid surface.
In some embodiments, at least 500-3,000 droplets of lysis buffer are dispensed onto the substantially planar solid surface.
In certain embodiments, the two or more single-cell proteomic samples comprise peptides from at least 100 cells. In particular embodiments, the two or more single-cell proteomic samples comprise peptides from about 100-10,000 cells.
In some embodiments, the disclosed methods further comprise performing at least one proteomic analysis on the single-cell proteomic sample. In particular embodiments, the at least one single-cell proteomic analysis enables identifying and/or quantifying protein covariation across the single cells.
In another aspect, the disclosure provides a method of performing a proteomic analysis comprising analyzing a single-cell proteomic sample formed by any of the methods described herein. In some embodiments, the analyzing comprises identifying and/or quantifying protein covariation across the single cells.
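By way of a non-limiting illustration of quantifying protein covariation across single cells, the following minimal sketch computes a Pearson correlation between the relative abundances of two proteins measured in the same set of cells. The function name, the protein labels and the abundance values are hypothetical and serve only as an example; the disclosure does not prescribe this particular statistic.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation between two equal-length abundance vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical relative abundances of two proteins across five single cells.
protein_a = [1.0, 2.0, 3.0, 4.0, 5.0]
protein_b = [2.1, 3.9, 6.2, 8.0, 9.8]
r = pearson_correlation(protein_a, protein_b)
```

A value of r near 1 would indicate that the two proteins rise and fall together across the cells, and a value near -1 that they vary in opposition.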
In another aspect, the disclosure provides a single-cell proteomic sample, for example, one formed by any one of the methods of single-cell proteomic sample formation described herein.
In another aspect, the disclosure provides kits and systems comprising reagents described herein (for example, one or more buffers) and/or an element that provides for a substantially planar surface and/or devices described herein.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
The foregoing will be apparent from the following more particular description of example embodiments, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments.
A description of example embodiments follows.
Traditionally, single-cell proteomic analyses have been performed by using fluorescent proteins or affinity reagents. While these approaches are powerful, mass spectrometry (MS) has the potential to increase the specificity and depth of single-cell protein quantification. For decades, MS has been a powerful tool for quantitative measurements of thousands of proteins in bulk samples consisting of thousands of cells or more.
Bulk samples are often prepared for liquid chromatography tandem MS analysis by using relatively large volumes (hundreds of microliters) and chemicals (detergents or chaotropic agents like urea) that are incompatible with MS analysis and require removal by cleanup procedures. The large volumes and cleanup procedures entail sample losses that may be prohibitive for small samples, such as single mammalian cells.
Unless otherwise defined, all terms of art, notations and other scientific terms or terminology used herein are intended to have the meanings commonly understood by those of skill in the art to which this disclosure pertains. In some cases, terms with commonly understood meanings are defined herein for clarity and/or for ready reference, and the inclusion of such definitions herein should not necessarily be construed to represent a substantial difference over what is generally understood in the art. It will be further understood that terms, such as those defined in commonly-used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and/or as otherwise defined herein.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting.
When introducing elements disclosed herein, the articles “a,” “an,” “the,” and “said” are intended to mean that there are one or more of the elements. Further, the one or more elements may be the same or different.
Throughout this specification and the claims which follow, unless the context requires otherwise, the word “comprise,” and variations such as “comprises” and “comprising,” will be understood to imply the inclusion of, e.g., a stated integer or step or group of integers or steps, but not the exclusion of any other integer or step or group of integers or steps. When used herein, the term “comprising” can be substituted with the term “containing” or “including.”
As used herein, “consisting of” excludes any element, step, or ingredient not specified in the claim element. When used herein, “consisting essentially of” does not exclude materials or steps that do not materially affect the basic and novel characteristics of the claim. Any of the terms “comprising,” “containing,” “including,” and “having,” whenever used herein in the context of an aspect or embodiment of the disclosure, can, in some embodiments, be replaced with the term “consisting of” or “consisting essentially of” to vary the scope of the disclosure.
As used herein, the conjunctive term “and/or” between multiple recited elements is understood as encompassing both individual and combined options. For instance, where two elements are conjoined by “and/or,” a first option refers to the applicability of the first element without the second. A second option refers to the applicability of the second element without the first. A third option refers to the applicability of the first and second elements together. Any one of these options is understood to fall within the meaning, and, therefore, satisfy the requirement of the term “and/or” as used herein. Concurrent applicability of more than one of the options is also understood to fall within the meaning, and, therefore, satisfy the requirement of the term “and/or.”
It should be understood that for all numerical bounds describing some parameter in this application, such as “about,” “at least,” “less than,” and “more than,” the description also necessarily encompasses any range bounded by the recited values. Accordingly, for example, the description “at least 1, 2, 3, 4, or 5” also describes, inter alia, the ranges 1-2, 1-3, 1-4, 1-5, 2-3, 2-4, 2-5, 3-4, 3-5, and 4-5, et cetera.
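The enumeration of ranges in the preceding example can be sketched as follows, purely as an illustration. The function name is hypothetical; the sketch pairs every two distinct recited values into a range bounded by them.

```python
from itertools import combinations


def ranges_from_bounds(values):
    """Return every range "low-high" bounded by two distinct recited values."""
    ordered = sorted(set(values))
    return [f"{low}-{high}" for low, high in combinations(ordered, 2)]


# The recited values "1, 2, 3, 4, or 5" yield ten bounded ranges.
ranges = ranges_from_bounds([1, 2, 3, 4, 5])
```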
In various aspects, the disclosure provides methods of forming single-cell proteomic samples.
In one aspect, the disclosure provides a method of forming a single-cell proteomic sample, said method comprising:
In some embodiments, each of the n droplets in step a), b), c), and/or d) has a volume of less than 100 nanoliters (nl or nL), for example, less than 80, 60, 50, 40, 35, 30, 25, 22 or 20 nl. In certain embodiments, each of the n droplets in step a), b), c), and/or d) has a volume of about 100 nl or less, for example, about: 80, 60, 50, 40, 35, 30, 25, 22 or 20 nl or less. In particular embodiments, each of the n droplets in step a), b), c), and/or d) has a volume of about 25 nanoliters (nl) or less. In more particular embodiments, each of the n droplets in step a), b), c), and d) has a volume of about 25 nl or less.
In certain embodiments, the lysis buffer, the digestion buffer, the chemical tag, or a combination thereof is dispensed in a volume of about 1-20 nl per droplet, for example, about: 1-18, 1-16, 1-14, 1-12, 1-10, 1-8, 1-6, 1-4, 2-18, 2-16, 2-14, 2-12, 2-10, 2-8, 2-6, 2-4, 4-20, 4-18, 4-16, 4-14, 4-12, 4-10, 4-8, 4-6, 6-20, 6-18, 6-16, 6-14, 6-12, 6-10, 6-8, 8-20, 8-18, 8-16, 8-14, 8-12, 8-10, 10-20, 10-18, 10-16, 10-14, 10-12, 12-20, 12-18, 12-16, 12-14, 14-20, 14-18, 14-16, 16-20, 16-18 or 18-20 nl.
In some embodiments, the disclosure provides a method of forming at least two single-cell proteomic samples, wherein steps a) to e) are repeated at least once to form two or more single-cell proteomic samples. In certain embodiments, steps a) to e) are repeated at least 3 times, for example, at least 5, 10, 20, 30, 50, 80, 100, 120, 150, 180, 200, 250, 300, 350, 400, 500 or 1,000 times. In particular embodiments, steps a) to e) are repeated about 200 times.
As used herein, the term “substantially planar solid surface” refers to a surface that is substantially flat. In some embodiments, a substantially planar solid surface is a smooth surface. In certain embodiments, a substantially planar solid surface comprises etching, one or more (e.g., arrays of) very shallow dimples, or a combination thereof. A substantially planar solid surface enables small droplets (e.g., about 10-200 nl) of liquids to merge into a combined droplet when applying a fluid of a discrete volume (e.g., about 1 microliter (μl or μL)). A member (such as a multi-well plate or a microfuge tube) where its contents are closed off or surrounded, for example, by a wall, does not have a substantially planar solid surface. In particular embodiments, the substantially planar solid surface is provided by a slide, for example, a uniform glass slide.
In some embodiments, at least 90% of the points in the substantially planar surface are located on one of or between a pair of planes which are parallel and which are spaced from each other by a distance of not more than 5% of the largest dimension of the surface. In certain embodiments, the radius of curvature of the space is much greater than the cross-sectional dimensions, and the curvature does not substantially alter the function of the space. In particular embodiments, the substantially planar surface has a generally uniform thickness and has surface dimensions that are both much larger (e.g., ten to 100 times or more) than the thickness.
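The 90%/5% criterion above can be illustrated with the following minimal sketch. It rests on simplifying assumptions that are not part of the disclosure: the surface is sampled as a list of (x, y, z) points in a single length unit, the pair of parallel planes is taken to be horizontal (constant z), and the "largest dimension" is taken as the larger of the x and y extents. The function name is hypothetical.

```python
import math


def is_substantially_planar(points, fraction=0.90, tolerance=0.05):
    """Check whether enough points lie between two horizontal parallel planes.

    points    -- list of (x, y, z) coordinates in one length unit
    fraction  -- minimum share of points that must lie between the planes
    tolerance -- maximum plane spacing as a share of the largest dimension
    """
    if not points:
        raise ValueError("at least one point is required")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    largest_dimension = max(max(xs) - min(xs), max(ys) - min(ys))
    max_gap = tolerance * largest_dimension

    heights = sorted(p[2] for p in points)
    # Number of points that must fit between the two planes.
    needed = max(1, math.ceil(fraction * len(heights) - 1e-9))
    # Slide a window of `needed` consecutive heights over the sorted list.
    for i in range(len(heights) - needed + 1):
        if heights[i + needed - 1] - heights[i] <= max_gap:
            return True
    return False


# A 75 mm by 25 mm surface sampled every 5 mm, with one shallow dimple.
flat = [(x, y, 0.0) for x in range(0, 76, 5) for y in range(0, 26, 5)]
flat[10] = (flat[10][0], flat[10][1], -0.05)
flat_ok = is_substantially_planar(flat)

# Half of the points sit 10 mm below the rest, as in a walled well.
walled = [(x, 0, 0.0) for x in range(0, 50)] + [(x, 0, -10.0) for x in range(50, 100)]
walled_ok = is_substantially_planar(walled)
```

Under these assumptions the dimpled slide satisfies the criterion, whereas the walled geometry does not.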
In certain embodiments, the substantially planar solid surface is etched, for example, with a laser. An “etched surface” refers to a surface that is made by etching.
In some embodiments, the substantially planar solid surface comprises etchings arranged in spaced relation to each other (e.g., into clusters of a discrete number of spots (see, e.g.,
In some embodiments, the substantially planar solid surface is unetched.
In some embodiments, the distance between two spots (e.g., two closest spots) within a cluster is about 0.1-10.0 mm, for example, about: 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5 or 10.0 mm. In certain embodiments, the distance between two spots (e.g., two closest spots) within a cluster is about: 0.1-9.5, 0.15-9.5, 0.15-9.0, 0.2-9.0, 0.2-8.5, 0.25-8.5, 0.25-8.0, 0.3-8.0, 0.3-7.5, 0.35-7.5, 0.35-7.0, 0.4-7.0, 0.4-6.5, 0.45-6.5, 0.45-6.0, 0.5-6.0, 0.5-5.5, 0.55-5.5, 0.55-5.0, 0.6-5.0, 0.6-4.5, 0.65-4.5, 0.65-4.0, 0.7-4.0, 0.7-3.5, 0.75-3.5, 0.75-3.0, 0.8-3.0, 0.8-2.5, 0.85-2.5, 0.85-2.0, 0.9-2.0, 0.9-1.5, 0.95-1.5 or 0.95-1.0 mm. In particular embodiments, the distance between two spots (e.g., two closest spots) within a cluster is about 1.0 mm.
In some embodiments, the distance between the centers of two clusters (e.g., two neighboring clusters) is about 3.0-50 mm, for example, about: 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 8.0, 10, 15, 20, 30, 40 or 50 mm. In certain embodiments, the distance between the centers of two clusters (e.g., two neighboring clusters) is about: 3.0-40, 3.5-40, 3.5-30, 4.0-30, 4.0-20, 4.5-20, 4.5-15, 5.0-15, 5.0-10, 5.5-10, 5.5-8 or 6-8 mm. In particular embodiments, the distance between the centers of two clusters (e.g., two neighboring clusters) is about 6 mm.
The distance between two spots (e.g., two closest spots) within a cluster and/or the distance between the centers of two clusters (e.g., two neighboring clusters) may be designed by a person of ordinary skill in the art based on the goal of the proteomic analysis, sample multiplexing strategy and/or desired throughput.
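As a non-limiting illustration of how such spacings might be turned into dispensing coordinates, the following minimal sketch lays out clusters on a rectangular grid with a given center-to-center pitch and places the spots of each cluster on a smaller grid centered on the cluster. The rectangular arrangement, the function name and the particular grid sizes (12 by 3 clusters, 4 by 4 spots) are hypothetical choices made for the example and are not taken from the disclosure; only the 1.0 mm and 6 mm spacings come from the text above.

```python
import math


def cluster_layout(cluster_cols, cluster_rows, cluster_pitch_mm,
                   spot_cols, spot_rows, spot_pitch_mm):
    """Return {cluster_index: [(x_mm, y_mm), ...]} for a rectangular layout."""
    layout = {}
    for row in range(cluster_rows):
        for col in range(cluster_cols):
            center_x = col * cluster_pitch_mm
            center_y = row * cluster_pitch_mm
            # Offset so that the spot grid is centered on the cluster center.
            x0 = center_x - (spot_cols - 1) * spot_pitch_mm / 2
            y0 = center_y - (spot_rows - 1) * spot_pitch_mm / 2
            layout[row * cluster_cols + col] = [
                (x0 + i * spot_pitch_mm, y0 + j * spot_pitch_mm)
                for j in range(spot_rows)
                for i in range(spot_cols)
            ]
    return layout


# Hypothetical format: 12 x 3 clusters at 6 mm pitch, 4 x 4 spots at 1.0 mm.
layout = cluster_layout(12, 3, 6.0, 4, 4, 1.0)

# Closest spot-to-spot distance within the first cluster.
first = layout[0]
closest_mm = min(math.dist(a, b)
                 for i, a in enumerate(first) for b in first[i + 1:])
```

With these hypothetical grid sizes the cluster centers span 66 mm by 12 mm, so the layout depends only on the two spacing parameters and can be regenerated for a different multiplexing strategy by changing them.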
Methods disclosed herein can be compatible with many types of substantially planar solid surfaces with a wide range of sizes. In some embodiments, the length of the substantially planar solid surface is about 10 mm to 50 cm, for example, about: 20 mm to 50 cm, 20 mm to 25 cm, 40 mm to 25 cm, 40 mm to 12 cm, 50 mm to 12 cm, 50 mm to 10 cm, 100 mm to 10 cm, 100 mm to 5 cm, 200 mm to 5 cm, 200 mm to 2.5 cm, 500 mm to 2.5 cm or 500 mm to 1.0 cm.
In certain embodiments, the width of the substantially planar solid surface is about 5.0 mm to 30 cm, for example, about: 10 mm to 30 cm, 20 mm to 30 cm, 20 mm to 15 cm, 50 mm to 15 cm, 50 mm to 10 cm, 100 mm to 10 cm, 100 mm to 5.0 cm, 200 mm to 5.0 cm, 200 mm to 2.5 cm, 500 mm to 2.5 cm, 500 mm to 2.0 cm or 1.0 to 2.0 cm.
In particular embodiments, the substantially planar solid surface is provided by microscopic glass slides with dimensions of 75 mm by 25 mm (3″ by 1″) and about 1 mm thickness.
In certain embodiments, the substantially planar solid surface is coated with a compound (e.g., a compound that is neither hydrophobic nor hydrophilic) to stabilize the individual droplets. In particular embodiments, the substantially planar solid surface is fluorocarbon-coated. The term “fluorocarbon” refers to a compound formed by replacing one or more of the hydrogen atoms in a hydrocarbon with fluorine atoms.
In certain embodiments, movement of the substantially planar solid surface is minimized.
Lysing single cells comprises dispensing n droplets of lysis buffer onto the substantially planar solid surface (e.g., an etched or unetched uniform glass slide), wherein n≥2; and dispensing a single cell into each of the n droplets of lysis buffer to produce n droplets, each comprising a lysed single cell.
As used herein, the term “liquid droplet” refers to a very small drop of a liquid. In some embodiments, each individual droplet comprising the lysis buffer has a volume of about 1.0-10.0 nl, for example, about: 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 1.0-4.0, 1.0-6.0, 1.0-8.0, 2.0-4.0, 2.0-6.0, 2.0-8.0, 2.0-10.0, 4.0-6.0, 4.0-8.0, 4.0-10.0, 6.0-8.0, 6.0-10.0 or 8.0-10.0 nl. In certain embodiments, each individual droplet comprising the lysis buffer has a volume of about 10.0 nl or less, for example, about: 9.5, 9.0, 8.5, 8.0, 7.5, 7.0, 6.5, 6.0, 5.5, 5.0, 4.5 or 4.0 nl or less. In some embodiments, each individual droplet comprising the lysis buffer has a volume of about 4 nl. In particular embodiments, each individual droplet comprising the lysis buffer has a volume of about 8 nl.
In certain embodiments, the individual droplets of lysis buffer are dispensed using a first piezo dispensing capillary (PDC), for example, that of cellenONE® (SCIENION GmbH, Berlin, Germany). In some embodiments, the first PDC is dedicated to handling organic solvents, protein solutions, or a combination thereof. In other embodiments, the individual droplets of lysis buffer are dispensed with MANTIS® Liquid Handler (FORMULATRIX®, Bedford, MA) or HP D300e Digital Dispenser (Hewlett-Packard, Palo Alto, CA).
In some embodiments, n is ≥3, for example, ≥4, ≥5, ≥6, ≥7, ≥8, ≥9, ≥10, ≥11, ≥12, ≥13, ≥14, ≥15, ≥16, ≥17, ≥18, ≥19 or ≥20. In certain embodiments, n is about 2-20, for example, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20, or 2-18, 3-18, 3-16, 4-16, 4-14, 5-14, 5-12, 6-12 or 6-10. In particular embodiments, n is about 12-20. In more particular embodiments, n is about 14-18. In some embodiments, the n droplets are arranged in spaced relation to each other (e.g., into a cluster (see, e.g.,
In some embodiments, the method comprises dispensing m times n droplets of lysis buffer onto a substantially planar solid surface, wherein n (corresponding to the number of droplets per subgroup/cluster) ≥2, and m (corresponding to the number of subgroups/clusters) ≥2.
For example, a multiplexing format may be designed by a person of ordinary skill in the art based on the goal of the proteomic analysis, sample multiplexing strategy, etc., or a combination thereof. A suitable multiplexing format may include about 1-120 clusters per substantially planar solid surface, for example, about: 18, 36, 54, 72, 90 or 108 clusters per substantially planar solid surface; and each cluster may include about 1-20 droplets, for example, about: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 droplets. In certain embodiments, the multiplexing format comprises at least about 10 clusters, for example, at least about: 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 90 or 100 clusters.
In some embodiments, each cluster has at least about 6 droplets, for example, at least about: 7, 8, 9, 10, 11, 12, 13, 14, 15 or 16 droplets. In particular embodiments, the multiplexing format includes about 14 droplets per cluster. In more particular embodiments, the multiplexing format includes about 16 droplets per cluster.
In certain embodiments, the multiplexing format includes at least 10 clusters, and each cluster comprising at least 10 droplets (e.g., 14-16 droplets). In particular embodiments, the multiplexing format includes 36 clusters with 14 droplets per cluster.
In some embodiments, a total of about 100-10,000 individual droplets comprising the lysis buffer are dispensed onto the substantially planar solid surface, for example, about: 100-9,000, 150-9,000, 150-8,000, 200-8,000, 200-6,000, 300-6,000, 300-5,000, 500-5,000, 500-4,000, 750-4,000, 750-3,000, 1,000-3,000, 1,500-3,000, 1,500-2,000 or 2,000-3,000 individual droplets. In certain embodiments, about 2,000 (e.g., 2016) individual droplets are dispensed onto the substantially planar solid surface.
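The relationship between a multiplexing format and the total droplet count is simple arithmetic, sketched below for illustration. The function names are hypothetical. The sketch only relates the recited numbers to one another (m clusters times n droplets per cluster); the disclosure does not state how the 2016 figure is partitioned into clusters.

```python
import math


def total_droplets(clusters, droplets_per_cluster):
    """Total droplets for m clusters of n droplets each (m times n)."""
    return clusters * droplets_per_cluster


def clusters_needed(target_droplets, droplets_per_cluster):
    """Smallest number of clusters that reaches the target droplet count."""
    return math.ceil(target_droplets / droplets_per_cluster)


# 36 clusters with 14 droplets per cluster.
per_format = total_droplets(36, 14)

# Number of 14-droplet clusters that would account for 2016 droplets.
clusters_for_2016 = clusters_needed(2016, 14)
```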
In some embodiments, the lysis buffer is devoid of any compound incompatible with the proteomic analysis (e.g., mass spectrometry (MS)). In certain embodiments, the method is devoid of one or more steps for removing one or more incompatible compounds (“cleanup steps”).
In certain embodiments, the lysis buffer comprises a mass-spec compatible organic solvent and/or detergent, such as acetonitrile, n-Dodecyl-β-D-maltopyranoside (DDM), n-Decyl-β-D-maltopyranoside (DM) and Rapigest.
In some embodiments, the lysis buffer comprises a compound compatible with the intended proteomic analysis (e.g., MS). In certain embodiments, the compound has a vapor pressure of about 0.500-0.700 mm Hg or less at 25° C. In particular embodiments, the compound has a vapor pressure of about 0.600 mm Hg at 25° C. In more particular embodiments, the compound is an organosulfur compound, for example, dimethyl sulfoxide (DMSO).
In some embodiments, the lysis buffer comprises 33-100% DMSO, for example, 40-100%, 50-100%, 60-100%, 70-100%, 80-100%, 90-100%, 92-100%, 94-100%, 95-100%, 96-100%, 97-100%, 98-100% or 99-100% DMSO. In certain embodiments, the lysis buffer comprises about 4.0-8.0 nl of 90-100% DMSO. In particular embodiments, the lysis buffer comprises (e.g., consists of) about 4.0 nl 90-100% DMSO. In more particular embodiments, the lysis buffer comprises (e.g., consists of) about 8.0 nl 90-100% DMSO.
In some embodiments, droplets of water (e.g., mass spectrometry grade water) are dispensed in a perimeter surrounding each grid (see, e.g.,
In some embodiments, the single cell is a prokaryotic cell. In certain embodiments, the single cell is a eukaryotic cell (e.g., an animal cell, a plant cell, a fungus cell, or a protist cell). Non-limiting examples of animals include humans, domestic animals, such as laboratory animals (e.g., cats, dogs, monkeys, pigs, rats, mice, etc.), household pets (e.g., cats, dogs, rabbits, etc.), livestock (e.g., pigs, cattle, sheep, goats, horses, etc.), and non-domestic animals. In particular embodiments, the single cell is a mammalian cell (e.g., a human cell).
In some embodiments, the single cell is a germ-line cell. In certain embodiments, the single cell is a somatic cell. Non-limiting examples of somatic cells include stem cells, red blood cells, white blood cells (e.g., neutrophils, eosinophils, basophils, or lymphocytes), platelets, nerve cells, neuroglial cells, muscle cells (e.g., skeletal muscle cells, cardiac muscle cells, or smooth muscle cells), cartilage cells, and skin cells. In certain embodiments, the individual cells comprise tumor cells (e.g., melanoma cells).
In some embodiments, the single cell has a diameter of less than 100 μm. In certain embodiments, the single cell has a diameter of about 10-20 μm. In particular embodiments, the single cell has a diameter of about 10-15 μm.
In some embodiments, the single-cell proteomic sample comprises peptides from at least two cells, for example, from at least about: 10, 15, 20, 30, 50, 80, 100, 150, 200, 250, 300, 500, 750, 1,000, 1,500, 2,000, 2,500, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000 or 10,000 cells. In certain embodiments, the single-cell proteomic sample comprises peptides from about 10-10,000 cells, for example, about: 10-9,000, 15-9,000, 15-8,000, 30-8,000, 30-6,000, 50-6,000, 50-5,000, 100-5,000, 100-4,000, 150-4,000 or 150-3,000 cells. In some embodiments, the single-cell proteomic sample comprises peptides from at least 100 cells. In certain embodiments, the single-cell proteomic sample comprises peptides from at least 1,000 cells. In particular embodiments, the single-cell proteomic sample comprises peptides from at least 1,500 cells.
In some embodiments, the cells are a homogeneous cell population (of the same cell type). In other embodiments, two or more cell types are dispensed into the n droplets of lysis buffer, for example, 3, 4, 5, 6, 7, 8, 9 or 10 or more cell types. Each cell type may comprise multiple subpopulations based on certain characteristics, for example, cell division cycle (CDC). In particular embodiments, the method further comprises enriching a subpopulation of cells, for example, with fluorescence-activated cell sorting (FACS) (e.g., based on size, DNA content, cellular state, and/or surface marker), culture condition, reporter-based selection, or a combination thereof.
In some embodiments, (isolating and) dispensing the single cell uses a second piezo dispensing capillary (PDC), for example, that of cellenONE® (SCIENION GmbH, Berlin, Germany). In some embodiments, the second PDC is dedicated to handling cell suspensions. In other embodiments, (isolating and) dispensing the single cell uses MANTIS® Liquid Handler (FORMULATRIX®, Bedford, MA) or HP D300e Digital Dispenser (Hewlett-Packard, Palo Alto, CA).
In some embodiments, step b) comprises dispensing the single cell in a buffer (e.g., phosphate buffered saline (PBS)) with a measured volume. In certain embodiments, the measured volume is from about 30 picoliters to about 3,000 picoliters, for example, about: 30-2,400, 45-2,400, 45-1,800, 60-1,800, 60-1,200, 90-1,200, 90-900, 100-1,000, 120-900, 120-600, 150-600, 150-450, 200-450, 200-400, 200-300, 250-350, 260-340, 270-330, 280-320, 290-310 or 300-450 picoliters. In certain embodiments, the measured volume is less than 3,000 picoliters, for example, less than: 2,500, 2,400, 2,000, 1,800, 1,500, 1,200, 1,000, 800, 500, 450 or 400 picoliters. In particular embodiments, the measured volume is about 300 picoliters.
In certain embodiments, step b) comprises dispensing the single cell in a cell suspension buffer with a volume of about 100-1,000 picoliters. In particular embodiments, step b) comprises dispensing the single cell in a cell suspension buffer with a volume of about 300 picoliters.
In some embodiments, the method further comprises dispensing a cell suspension buffer devoid of any cell into one or more droplets of lysis buffer, for example, as a negative control for detecting background noise, contamination, etc., or a combination thereof.
In some embodiments, step b) enables lysing the single cell in a total volume of about 5.0-12.0 nl, for example, of about: 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.5, 11.0, 11.5, 12.0, 5.5-12.0, 5.5-11.5, 6.0-11.5, 6.0-11.0, 6.5-11.0, 6.5-10.5, 7.0-10.5, 7.0-10.0, 7.5-10.0, 7.5-9.5, 8.0-9.5 or 8.0-8.5 nl. In particular embodiments, step b) enables lysing the single cell in a total volume of about 7.5-8.5 nl. In more particular embodiments, step b) enables lysing the single cell in a total volume of about 8.0-8.5 nl.
In certain embodiments, step b) enables lysing the single cell for about 10-20 minutes. In some particular embodiments, step b) enables lysing the single cell in a total volume of about 8-8.5 nl for about 10-20 minutes.
In some embodiments, the sum of the volume of the lysis buffer plus the volume of the single cell in its dispensing solution is about 5.0-12.0 nl, for example, about: 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.5, 11.0, 11.5, 12.0, 5.5-12.0, 5.5-11.5, 6.0-11.5, 6.0-11.0, 6.5-11.0, 6.5-10.5, 7.0-10.5, 7.0-10.0, 7.5-10.0, 7.5-9.5, 8.0-9.5 or 8.0-8.5 nl. In particular embodiments, the sum of the volume of the lysis buffer plus the volume of the single cell in its dispensing solution is about 4-10 nl.
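For illustration, the total lysis volume can be computed by adding the lysis buffer volume (in nanoliters) to the volume of the cell suspension buffer (in picoliters, converted to nanoliters). The following minimal sketch uses about 8 nl of lysis buffer and about 300 picoliters of cell suspension buffer as inputs. The function names are hypothetical, and the sketch assumes that evaporation and the volume of the cell itself are negligible, which the disclosure does not state.

```python
def lysis_volume_nl(lysis_buffer_nl, cell_suspension_pl):
    """Total lysis volume in nl; 1 nl equals 1,000 pl."""
    return lysis_buffer_nl + cell_suspension_pl / 1000.0


def dmso_percent_after_cell(lysis_buffer_nl, dmso_percent, cell_suspension_pl):
    """DMSO content (% v/v) once the cell suspension has been added."""
    total_nl = lysis_volume_nl(lysis_buffer_nl, cell_suspension_pl)
    return lysis_buffer_nl * dmso_percent / total_nl


# About 8 nl of 100% DMSO plus a cell dispensed in about 300 pl of buffer.
total_nl = lysis_volume_nl(8.0, 300.0)
dmso_pct = dmso_percent_after_cell(8.0, 100.0, 300.0)
```

Under these assumptions the total is about 8.3 nl, and the DMSO content remains above 96% (v/v) after the cell is added.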
In certain embodiments, the digestion buffer is a trypsin buffer, and dispensing digestion buffer into each of the n droplets produces a solution comprising about 100-150 ng/μl trypsin. In particular embodiments, dispensing digestion buffer into each of the n droplets produces a solution comprising about 120 ng/μl trypsin in about 5 mM HEPES buffer.
In certain embodiments, dispensing digestion buffer into each of the n droplets produces a solution with a volume of about 15-25 nl, for example, about: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 15-20, 16-20, 16-19, 17-19 or 18-19 nl. In particular embodiments, dispensing digestion buffer into each of the n droplets produces a solution with a volume of about 18 nl.
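As a worked illustration of these quantities, the mass of trypsin in a droplet follows from the concentration and the droplet volume. The sketch below uses about 120 ng/μl trypsin in a droplet of about 18 nl. The function names are hypothetical. The second function estimates how much digestion buffer is added, assuming no evaporation and a lysed-cell volume of about 8.3 nl, which is an illustrative assumption rather than a value recited for this step.

```python
def trypsin_mass_ng(concentration_ng_per_ul, droplet_volume_nl):
    """Mass of trypsin in a droplet; 1 ul equals 1,000 nl."""
    return concentration_ng_per_ul * droplet_volume_nl / 1000.0


def digestion_buffer_added_nl(final_volume_nl, lysed_volume_nl):
    """Volume of digestion buffer added, assuming no evaporation."""
    if final_volume_nl < lysed_volume_nl:
        raise ValueError("final volume cannot be smaller than starting volume")
    return final_volume_nl - lysed_volume_nl


# About 120 ng/ul trypsin in a droplet of about 18 nl.
mass_ng = trypsin_mass_ng(120.0, 18.0)

# Hypothetical starting volume of 8.3 nl for the lysed cell.
added_nl = digestion_buffer_added_nl(18.0, 8.3)
```

With these inputs each droplet holds about 2.16 ng of trypsin.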
In some embodiments, step c) comprises enabling the proteins from each lysed single cell to be digested at about 1° C. above the dew point, for example, about: 0.4-1.6° C., 0.5-1.5° C., 0.6-1.4° C., 0.7-1.3° C., 0.8-1.2° C. or 0.9-1.1° C. above the dew point. In certain embodiments, step c) comprises enabling the proteins from each lysed single cell to be digested at a relative humidity of about 70-80%, for example, about: 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 71-79%, 72-78%, 73-77%, 74-76%, or 74.5-75.5%. As used herein, the term “relative humidity” refers to the amount of water vapor present in air expressed as a percentage of the amount needed for saturation at the same temperature. In particular embodiments, step c) comprises enabling the proteins from each lysed single cell to be digested at a relative humidity of about 75%. In more particular embodiments, step c) comprises enabling the proteins from each lysed single cell to be digested at about 1° C. above the dew point and at a relative humidity of about 75%. In some embodiments, the temperature, the relative humidity, or both are dynamically regulated.
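For illustration only, a dew point can be estimated from air temperature and relative humidity with the Magnus approximation, and a temperature set point about 1° C. above it can then be derived. The Magnus approximation and its coefficients (17.62 and 243.12° C.) are a commonly used choice and are not taken from the disclosure. The function names, the 25° C. air temperature, and the assumption that the regulated temperature is that of the surface carrying the droplets are likewise hypothetical.

```python
import math

# Commonly used Magnus coefficients for water vapor over liquid water.
MAGNUS_A = 17.62
MAGNUS_B = 243.12  # degrees Celsius


def dew_point_c(air_temp_c, relative_humidity_pct):
    """Approximate dew point (degrees C) from air temperature and RH (%)."""
    if not 0 < relative_humidity_pct <= 100:
        raise ValueError("relative humidity must be in (0, 100]")
    gamma = (math.log(relative_humidity_pct / 100.0)
             + MAGNUS_A * air_temp_c / (MAGNUS_B + air_temp_c))
    return MAGNUS_B * gamma / (MAGNUS_A - gamma)


def surface_setpoint_c(air_temp_c, relative_humidity_pct, offset_c=1.0):
    """Set point held a fixed offset above the estimated dew point."""
    return dew_point_c(air_temp_c, relative_humidity_pct) + offset_c


# Hypothetical conditions: 25 degrees C air at 75% relative humidity.
dew_c = dew_point_c(25.0, 75.0)
setpoint_c = surface_setpoint_c(25.0, 75.0)
```

When the temperature or the relative humidity is dynamically regulated, such a set point could be recomputed whenever the measured conditions change.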
In some embodiments, step c) comprises enabling the proteins from each lysed single cell to be digested for about 3-5 hours, for example, for about: 3, 3.5, 4, 4.5, 5, 3.1-4.9, 3.2-4.8, 3.3-4.7, 3.4-4.6, 3.5-4.5, 3.6-4.4, 3.7-4.3, 3.8-4.2 or 3.9-4.1 hours.
In some embodiments, step c) comprises:
In certain embodiments, the digestion buffer is dispensed using the first piezo dispensing capillary (PDC), for example, that of cellenONE® (Lyon, France). In other embodiments, the digestion buffer is dispensed using MANTIS® Liquid Handler (FORMULATRIX®, Bedford, MA) or HP D300e Digital Dispenser (Hewlett-Packard, Palo Alto, CA).
In some embodiments, the one or more single-cell proteomic samples are intended for tandem mass spectrometry. Tandem mass spectrometry, also referred to herein as MS/MS or MS2, involves multiple steps of mass spectrometry selection, with some form of fragmentation occurring in between the stages. In a tandem mass spectrometer, ions are formed in the ion source and separated by mass-to-charge ratio in the first stage of mass spectrometry (MS1). Ions of a particular mass-to-charge ratio (precursor ions) are selected and fragment ions (product ions) are created by collision-induced dissociation, ion-molecule reaction, photodissociation, or other processes known to those skilled in the art. The resulting ions are then separated and detected in a second stage of mass spectrometry (MS2). A common use is for analysis of proteins and peptides.
In certain embodiments, the one or more single-cell proteomic samples are intended for quantitative proteomics. Quantitative proteomics can be used, for example, to determine the relative or absolute amount of proteins in a sample.
Several quantitative proteomics methods are based on MS/MS. One method commonly used for quantitative proteomics is isobaric tag labeling. Isobaric tag labeling enables simultaneous identification and quantification of proteins from multiple samples in a single analysis. To quantify proteins, peptides are labeled with chemical tags that have the same structure and nominal mass, but vary in the distribution of heavy isotopes in their structure. These tags, commonly referred to as tandem mass tags (TMT), are designed so that the mass tag is cleaved at a specific linker region upon higher-energy collisional dissociation during tandem mass spectrometry, yielding reporter ions of different masses. Protein quantitation is accomplished by comparing the intensities of the reporter ions in the MS/MS spectra.
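The comparison of reporter ion intensities can be illustrated with the following minimal sketch, which expresses each reporter ion intensity as a fraction of the summed intensity in one MS/MS spectrum. The function name, the channel labels and the intensity values are hypothetical and are used only for the example; normalizing to the sum is one simple choice among several possible ways of comparing the intensities.

```python
def relative_reporter_abundance(reporter_intensities):
    """Express each reporter ion intensity as a fraction of the total."""
    total = sum(reporter_intensities.values())
    if total <= 0:
        raise ValueError("total reporter intensity must be positive")
    return {channel: intensity / total
            for channel, intensity in reporter_intensities.items()}


# Hypothetical reporter ion intensities from a single MS/MS spectrum,
# one channel per uniquely labeled droplet.
spectrum = {"126": 1200.0, "127N": 800.0, "128N": 2000.0}
relative = relative_reporter_abundance(spectrum)
```

Because each droplet receives its own tag, each channel in such a table would correspond to the peptides from one single cell.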
MS/MS can also be used for protein sequencing, as is understood by those skilled in the art. When intact proteins are introduced to a mass analyzer, it is termed “top-down proteomics,” and when proteins are digested into smaller peptides and subsequently introduced into the mass spectrometer, it is termed “bottom-up proteomics”. Shotgun proteomics is a variant of bottom-up proteomics in which proteins in a mixture are digested prior to separation and tandem mass spectrometry.
In some embodiments, the one or more single-cell proteomic samples are generated for cell classification, uncovering a regulatory process, associating a regulatory process with a functional outcome, or a combination thereof. In particular embodiments, the one or more single-cell proteomic samples are generated for understanding cell cycle regulation. In some embodiments, the one or more single-cell proteomic samples are generated for identifying proteins whose abundance differs in G1, S, and/or G2/M phase for two or more cell types.
In some embodiments, the one or more single-cell proteomic samples comprise 10 or more cells of the same cell type to minimize batch effects, background noise, or a combination thereof. In certain embodiments, the one or more single-cell proteomic samples comprise at least 10 cells of the same cell type, for example, at least: 15 cells, 20 cells, 30 cells, 50 cells, 80 cells, 100 cells, 150 cells, 200 cells, 250 cells, 300 cells, 500 cells, 750 cells, 1,000 cells, 1,500 cells, 2,000 cells, 2,500 cells or 3,000 cells of the same cell type. In certain embodiments, the one or more single-cell proteomic samples comprise about 10-10,000 cells of the same cell type, for example, about: 10-9,000 cells, 15-9,000 cells, 15-8,000 cells, 30-8,000 cells, 30-6,000 cells, 50-6,000 cells, 50-5,000 cells, 100-5,000 cells, 100-4,000 cells, 150-4,000 cells, 150-3,000, 500-3,000, 1,000-3,000, 1,000-2,500, 1,000-2,000, 1,500-3,000, 1,500-2,500 or 1,500-2,000 cells of the same cell type. In particular embodiments, the one or more single-cell proteomic samples comprise about 1,500-2,000 cells.
In some embodiments, the disclosed methods enable performing parallel sample preparation of multiple (e.g., hundreds or thousands of) single cells; obviating sample cleanup and associated losses; minimizing bias for cellular compartments; supporting accurate relative protein quantification, or a combination thereof.
In some embodiments, the disclosed methods further comprise performing at least one proteomic analysis on the single-cell proteomic sample. In particular embodiments, the at least one single-cell proteomic analysis enables identifying and/or quantifying protein covariation across the single cells.
In certain embodiments, the single-cell proteomic analysis is performed on a non-substantially planar solid surface, for example, in a multi-well plate or in a tube (such as a microfuge tube).
Peptide labeling comprises dispensing a chemical tag into each of the n droplets comprising the peptides to produce labeled peptides, wherein at least one droplet of the n droplets receives a different chemical tag from at least one other droplet of the n droplets, thereby enabling the labeled peptides in the at least one droplet to be distinguishable from the labeled peptides in the at least one other droplet (e.g., to be distinguishable from labeled peptides in any other droplet within a cluster/subgroup).
In particular embodiments, each droplet of the n droplets receives a unique chemical tag, thereby enabling the labeled peptides in each droplet to be distinguishable from the labeled peptides in each other droplet.
In isobaric labeling for tandem mass spectrometry, proteins are extracted from cells, digested, and labeled with tags of the same mass. When fragmented during MS/MS, the reporter ions show the relative amount of the peptides in the samples.
In some embodiments, the chemical tag comprises (e.g., consists of) an isobaric tag. Two commercially available isobaric tags are iTRAQ® and tandem mass tag (TMT) reagents. A TMT comprises four regions: mass reporter, cleavable linker, mass normalization, and protein reactive group. TMT reagents can be used to simultaneously analyze, e.g., 2-18 different peptide samples prepared from individual cells. TMT reagents include three types: (1) a reactive NHS ester functional group for labeling primary amines (e.g., TMTduplex™, TMTsixplex™, TMT10plex plus™, TMT11-131C™, TMTpro 16plex, TMTpro 18plex), (2) a reactive iodoacetyl functional group for labeling free sulfhydryls (e.g., iodoTMT™) and (3) a reactive alkoxyamine functional group for labeling of carbonyls (e.g., aminoxyTMT™).
In certain embodiments, the peptides are labeled by isobaric mass tags (e.g., TMT or TMTpro) for multiplexed analysis. In particular embodiments, the chemical tag comprises (e.g., consists of) TMTpro 16plex or TMTpro 18plex.
In certain embodiments, the chemical tag comprises (e.g., consists of) an isobaric tag for relative and absolute quantitation (iTRAQ®). iTRAQ® is a reagent for tandem mass spectrometry that is used to determine the amount of proteins from different sources in a single experiment. iTRAQ® uses stable isotope labeled molecules that can form a covalent bond with the N-terminus and side chain amines of proteins. The iTRAQ® reagents are used to label peptides from different samples that are pooled and analyzed by liquid chromatography and tandem mass spectrometry. The fragmentation of the attached tag generates a low molecular mass reporter ion that can be used to relatively quantify the peptides and the proteins from which they originated.
The sample preparation methods described herein are also compatible with non-isobaric mass tags, for example, as demonstrated with mTRAQ (
In some embodiments, the methods further comprise reducing the volumes of the individual droplets before labeling (e.g., by drying down the individual droplets). In certain embodiments, the volumes of the individual droplets are reduced to about 3-5 nl before dispensing the chemical tag into the corresponding droplet comprising the peptides, for example, about 3.0, 3.5, 4.0, 4.5, 5.0, 3.1-4.9, 3.2-4.8, 3.3-4.7, 3.4-4.6, 3.5-4.5, 3.6-4.4, 3.7-4.3, 3.8-4.2 or 3.9-4.1 nl. In particular embodiments, the volumes of the individual droplets are reduced to about 4 nl before dispensing the chemical tag into the corresponding droplet comprising the peptides.
In certain embodiments, step d) comprises dispensing a chemical tag in a volume of about 15-25 nl into each of the n droplets comprising the peptides, for example, the volume is about: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 16-24, 17-23, 18-22, 19-21, 19.5-20.5, 19.6-20.4, 19.7-20.3, 19.8-20.2 or 19.9-20.1 nl. In some embodiments, step d) comprises dispensing a chemical tag in a volume of about 20 nl into each of the n droplets comprising the peptides. In particular embodiments, step d) comprises dispensing TMTpro™ (e.g., “light” version of TMTpro™ 14plex or TMTpro™ 16plex) in a volume of about 20 nl into each of the n droplets comprising the peptides.
In some embodiments, the chemical tag (e.g., TMT) is dissolved in DMSO. In certain embodiments, the chemical tag comprises TMT label reagents (such as those of TMTpro™ 14plex or TMTpro™ 16plex) dissolved in DMSO. In particular embodiments, the chemical tag comprises a “light” version of TMT label reagents, also known as TMT0, dissolved in DMSO. In certain embodiments, the chemical tag comprises a “heavy” version of TMT label reagents, also known as TMT super heavy (TMTsh), dissolved in DMSO.
In some embodiments, the concentration of the chemical label (e.g., TMTpro™ 14plex) is about 28 mM.
In some embodiments, step d) comprises enabling the chemical tag to react with the peptides at room temperature. In certain embodiments, step d) comprises enabling the chemical tag to react with the peptides at about 18-25° C., for example, at about: 18, 18.5, 19, 19.5, 20, 20.5, 21, 21.5, 22, 22.5, 23, 23.5, 24, 24.5, 25, 18.5-25, 19-24.5, 19.5-24, 20-23.5, 20.5-23, 21-22.5 or 21.5-22° C. In particular embodiments, step d) comprises enabling the chemical tag to react with the peptides at about 20-23.5° C.
In certain embodiments, step d) comprises enabling the chemical tag to react with the peptides in a total volume of about 18-30 nl, for example, about: 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 19-29, 20-28, 21-27, 22-26, 23-25, 23-24 or 24-25 nl. In particular embodiments, step d) comprises enabling the chemical tag to react with the peptides in a total volume of about 24 nl.
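The volumes recited for the labeling step can be checked with simple arithmetic. The following Python sketch is a worked example under the particular embodiments stated above (a droplet reduced to about 4 nl, about 20 nl of chemical tag at about 28 mM). The tag amount per droplet and the tag concentration after mixing are derived here for illustration and are not separately recited in this disclosure; the function and variable names are hypothetical.

```python
def labeling_mix(droplet_nl=4.0, tag_nl=20.0, tag_stock_mm=28.0):
    """Return total volume (nl), tag amount (pmol) and tag concentration after mixing (mM).

    Unit note: 1 mM x 1 nl = 1e-3 mol/l x 1e-9 l = 1e-12 mol = 1 pmol.
    """
    total_nl = droplet_nl + tag_nl
    tag_pmol = tag_stock_mm * tag_nl
    tag_mixed_mm = tag_pmol / total_nl
    return total_nl, tag_pmol, tag_mixed_mm


total_nl, tag_pmol, tag_mixed_mm = labeling_mix()

# Check the result against the 18-30 nl range recited for the reaction volume.
within_recited_range = 18.0 <= total_nl <= 30.0
```

Under these inputs the total reaction volume is 24 nl, which matches the particular embodiment recited above.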
In certain embodiments, dispensing a chemical tag into each of the n droplets comprising the peptides uses the first piezo dispensing capillaries (PDC), for example, that of cellenONE® (SCIENION GmbH, Berlin, Germany). In other embodiments, dispensing a chemical tag into each of the n droplets uses MANTIS® Liquid Handler (FORMULATRIX®, Bedford, MA) or HP D300e Digital Dispenser (Hewlett-Packard, Palo Alto, CA).
In certain embodiments, greater than 90.0% of all peptides are labeled with the (corresponding) chemical tag, for example, greater than: 92.5%, 95.0%, 96.0%, 97.0%, 98.0%, 99.0%, 99.5%, 99.8% or 99.9% of all peptides are labeled. In particular embodiments, greater than 99% of all peptides are labeled.
In some embodiments, the methods of the disclosure further comprise dispensing a quenching reagent into each of the n droplets to quench unconjugated chemical tag.
In certain embodiments, the quenching reagent comprises about 20-30 nl of 5% hydroxylamine.
In particular embodiments, step d) further comprises:
In some embodiments, step d) further comprises enabling unconjugated chemical tag to be quenched at about 1° C. above the dew point, for example, about: 0.4-1.6° C., 0.5-1.5° C., 0.6-1.4° C., 0.7-1.3° C., 0.8-1.2° C. or 0.9-1.1° C. above the dew point. In certain embodiments, step d) further comprises enabling unconjugated chemical tag to be quenched at a relative humidity of about 70-80%, for example, about: 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 71-79%, 72-78%, 73-77%, 74-76%, or 74.5-75.5%. In particular embodiments, step d) further comprises enabling unconjugated chemical tag to be quenched at about 1° C. above the dew point and at a relative humidity of about 75%. In some embodiments, the temperature, the relative humidity, or both are dynamically regulated.
Pooling comprises applying a fluid to merge at least a subset of the n droplets into a combined droplet on the substantially planar surface, thereby combining the labeled peptides to form a single-cell proteomic sample. In some embodiments, the fluid is water. In certain embodiments, the fluid has a volume of about 1 μl.
In some embodiments, the at least a subset of the n droplets comprises n droplets. In certain embodiments, the at least a subset of the n droplets comprises ≤n-1 droplets. In particular embodiments, the at least a subset of the n droplets comprises ≤n-2 droplets.
In certain embodiments, step e) further comprises aspirating each combined droplet off the substantially planar solid surface in an acetonitrile solution. In some embodiments, the acetonitrile solution comprises about 100% acetonitrile, for example, about: 99.0-100%, 99.5-100%, 99.8-100% or 99.9-100% acetonitrile. In particular embodiments, the acetonitrile solution has a volume of about 5-15 μl, for example, about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 9.0-11.0, 9.1-10.9, 9.2-10.8, 9.3-10.7, 9.4-10.6, 9.5-10.5, 9.6-10.4, 9.7-10.3, 9.8-10.2 or 9.9-10.1 μl. In more particular embodiments, the total volume for aspirating each combined droplet is about 10 μl. Each single-cell proteomic sample can be transferred into a single well of a multi-well (e.g., a 384-well) plate.
In some embodiments, the combined droplet (comprising the labeled peptides) is transferred onto a non-substantially planar solid surface. In certain embodiments, the combined droplet (comprising the labeled peptides) is transferred into a container (e.g., a well within a multi-well plate, a tube such as a microfuge tube).
In some embodiments, the method further comprises drying the single-cell proteomic samples, for example in a speed-vacuum.
In some embodiments, the one or more single-cell proteomic samples are stored (for example, frozen at −80° C.) for future proteomic analysis. In certain embodiments, the one or more single-cell proteomic samples are reconstituted (for example, each in about 1.1 μl of 0.1% formic acid) for proteomic analysis (e.g., mass spectrometry analysis).
In another aspect, the disclosure provides a single-cell proteomic sample formed with any one of the methods described herein.
Many biological processes and regulatory dynamics, such as the cell division cycle, are reflected in protein covariation across single cells. Variabilities within a cell type are challenging to analyze with existing single-cell omics methods. In some embodiments, the sample preparation methods described herein enable quantifying and interpreting the covariations by single-cell proteomics with sufficiently high throughput and accuracy. As shown below, the sample preparation methods have been used to prepare 1,888 single cells and 128 negative controls in a single batch. Their analysis enabled quantifying the covariation among thousands of proteins and cell-cycle protein markers. The results demonstrate that protein covariation across single cells may reveal functionally concerted biological differences between closely related cell states.
A substantially planar solid surface enables parallel processing of a large number of multiplexed single-cell samples at a high density, thereby significantly increasing the throughput of single-cell proteomic analysis. Said surface also enables efficient merging of each multiplexed single-cell proteomic sample, thereby significantly reducing sample loss and sample processing time. A substantially planar solid surface also enables precise dispensing of very small volumes of single cells and reagents and keeping the droplets separated.
Single cells are isolated in very small volumes (e.g., about 300 picoliter), and all preparation steps, including cell lysing, protein digesting, and peptide labeling are performed in droplets of small volumes (e.g., below about 20 nl) on a substantially planar surface. Reduced volumes during sample preparation and increased throughput result in reductions in background signal, increased sample consistencies, and increased sensitivities.
Single-cell measurements are commonly used to identify different cell types from tissues composed of diverse cells (Regev et al., Elife 6:e27041 (2017) and Specht & Slavov, J Proteome Res. 17(8):2565-71 (2018)). This analysis is powering the construction of cell atlases, which can pinpoint cell types affected by various physiological processes. This cell classification requires analyzing a large number of cells and may tolerate measurement errors (Regev et al., Elife 6:e27041 (2017), Ziegenhain et al., Mol Cell 65(4):631-43 (2017), and Slavov, Science 367(6477):512-13 (2020)). In addition to classifying cells by type, single-cell measurements may reveal regulatory processes within a cell type and even associate them with different functional outcomes (Slavov, PLOS Biol. 20(1):e3001512 (2022), Shaffer et al., Nature. 546(7658):431-35 (2017) and Emert et al., Nat Biotechnol. 39(7):865-76 (2021)). For example, the covariation among proteins across single cells from the same type may reflect cell intrinsic dynamics, such as the cell division cycle (Slavov, PLOS Biol. 20(1):e3001512 (2022) and Mahdessian et al., Nature 590(7847):649-54 (2021)). Furthermore, protein covariation may reflect protein interactions within complexes or cellular states, such as senescence (Slavov, PLOS Biol. 20(1):e3001512 (2022)). However, estimating and interpreting protein covariation within a cell type requires high quantitative accuracy and high throughput (Slavov, PLOS Biol. 20(1):e3001512 (2022) and Slavov, Mol Cell Proteomics 21(1):100179 (2022)). Indeed, protein differences within a cell type are smaller than differences across cell types and can be easily swamped by batch effects and measurement noise. A goal is to minimize measurement noise to levels consistent with estimating and interpreting protein covariation across single cells from the same cell type. 
Towards this goal, an aim was to reduce batch effects and background noise, since these factors undermine the accuracy of single-cell proteomics by mass spectrometry (MS) (Slavov, Curr Opin Chem Biol. 60:1-9 (2021), Vanderaa & Gatto, Expert Rev Proteomics 18(10):835-43 (2021), Kelly, Mol Cell Proteomics 19(11):1739-48 (2020), and Specht et al., Genome Biol. 22(1):50 (2021)). Specifically, an aim was to develop a widely accessible, robust, and automated sample preparation method that reduces volumes to a few nanoliters. A goal was to perform parallel sample preparation of thousands of single cells to increase the size of experimental batches and thus reduce batch effects (Vanderaa & Gatto, Expert Rev Proteomics 18(10):835-43 (2021), Klein et al., Cell 161(5):1187-201 (2015) and Macosko et al., Cell 161(5):1202-14 (2015)). To achieve high precision, an aim was to avoid any movement of the samples during the sample preparation stage, so that 1-10 nl volumes of reagents could be repeatedly dispensed to each droplet containing a single cell. The CellenONE cell sorting and liquid handling system was used to develop nano-Proteomic sample Preparation (nPOP), which allowed a 100-fold reduction of the sample volumes over the Minimal ProteOmic sample Preparation (mPOP) method (Specht et al., Genome Biol. 22(1):50 (2021), Harrison et al., bioRxiv 399774 (2018), Petelski et al., Nat Protoc. 16(12):5398-425 (2021) and Marx, Nat Methods 16(9):809-12 (2019)). nPOP enabled analysis of protein covariation within two cell lines, monocytes and melanoma. This enabled classifying cells by cell division cycle (CDC) phase and identifying a sub-population of melanoma cells. Comparative analysis between the cell lines identified both similar and differential patterns of CDC-associated protein covariation.
Further, this analysis was applied within melanoma sub-populations, and differences in CDC associated protein covariation as well as a differential distribution of cells throughout phases of the CDC were identified.
U-937 and Jurkat cells were grown as suspension cultures in RPMI medium (HyClone 16777-145, Cytiva, Marlborough, MA) supplemented with 10% fetal bovine serum and 1% penicillin-streptomycin (pen/strep) (15140122, ThermoFisher, Waltham, MA). Cells were passaged when a density of 10^6 cells/ml was reached, approximately every two days.
The melanoma cells (WM989-A6-G3, a gift from Arjun Raj, University of Pennsylvania) were grown as adherent cultures in TU2% media which is composed of 80% MCDB 153 (M7403, Sigma-Aldrich, St. Louis, MO), 10% Leibovitz L-15 (11415064, ThermoFisher, Waltham, MA), 2% fetal bovine serum, 0.5% penicillin-streptomycin and 1.68 mM Calcium Chloride (499609, Sigma-Aldrich, St. Louis, MO). Cells were passaged at 80% confluence (approximately every 3-4 days) in T75 flasks (Z707546, MilliporeSigma, Burlington, MA) using 0.25% Trypsin-EDTA (25200072, ThermoFisher, Waltham, MA) and re-plated at 30% confluence.
HPAF-II cells (CRL-1997™, ATCC, Manassas, VA) were cultured in EMEM (30-2003, ATCC, Manassas, VA), CFPAC-I cells (CRL-1918™, ATCC, Manassas, VA) were cultured in IMDM (30-2005), and BxPC-3 cells (CRL-1687™, ATCC, Manassas, VA) were cultured in RPMI 1640 (30-2001, ATCC, Manassas, VA). All media were supplemented with 10% fetal bovine serum (FBS) (F4135, MilliporeSigma, Burlington, MA) and 1% penicillin-streptomycin. Cells were passaged at 70% confluence.
Jurkat cells and U-937 cells cultured in heavy SILAC media (containing +10 Da Arg and +8 Da Lys) were washed and re-suspended in PBS at 20,000 cells per μl. Two solutions of equal cell count were made, each containing Jurkat and U-937 cells mixed in a 1:1 ratio. One sample was lysed by diluting cells in 90% DMSO and the other was lysed in 6 M urea. The DMSO cell lysate was diluted to a concentration of 33% DMSO and the urea lysate was diluted to 0.5 M. Both solutions were digested in 15 ng/μl of trypsin for 12 hours. Each sample was then desalted using C18 stage tips and run using data dependent acquisition.
The isobaric carrier consisting of a 1:1 mixture of melanoma and monocyte cells was prepared in bulk and aliquoted into carriers corresponding to 200 cells each. A single cell suspension of 22,000 cells was transferred to a 200 μl PCR tube (1402-3900, USA Scientific, Inc., Ocala, FL) and then processed via the mPOP sample preparation method (Harrison et al. bioRxiv 399774 (2018)). The reference channel was made from the same sample.
Additional bulk samples of melanoma and monocyte cells were prepared for validating quantification of single cells. Cell pellets of 100,000 monocyte and melanoma cells were suspended in 50 μl of mass spectrometry grade water and lysed and digested via mPOP sample preparation (Harrison et al. bioRxiv 399774 (2018)). Samples were then labeled with TMT-16plex, combined, and diluted down to a concentration of 400 cells/μl for analysis by LC-MS.
Reagent Handling with CellenONE
The CellenONE (see, e.g., www.cellenion.com/technology/) was equipped with two piezo dispensing capillaries (PDC). One PDC was dedicated to handling cell suspensions. The other PDC was dedicated to all other reagent handling, including organic solvents and protein solutions. Reagents were loaded into a 384-well plate in volumes of 30 μl. When aspirating protein solutions, 20 μl was aspirated to ensure the mixture was not diluted with system water. When dispensing DMSO, it was important to deactivate the humidifier. This allowed residual DMSO left on the tip of the PDC to evaporate quickly so dispensing was not affected. After each sample preparation, PDCs were washed with ethanol and cleaned under sonication to remove any build-up of material from inside of the PDC and ensure optimal performance.
nPOP reactions were carried out on the surface of a fluorocarbon coated glass slide. The array layout was very flexible and adjustable to the experimental parameters. The droplets used for single-cell sample preparation were arranged in clusters, and the number of droplets per cluster equals the number of single cells per SCOPE2 (Single Cell ProtEomics 2) set. TMTpro 18plex and 14 droplets per cluster, corresponding to the 14 isobaric labels used for single cells, were used. The design allowed fitting 36 clusters per slide and 4 glass slides on the temperature controlled target holder, which enabled simultaneous processing of up to 14×36×4=2,016 single cells. Reducing the space between clusters can further increase the number of clusters per slide and thus the number of simultaneously prepared single cells. The array layout was optimized to keep droplets from the same set close in proximity but prevent reaction volumes from merging. Once an array layout was selected, 8 nl of DMSO was dispensed to each location of the array, forming the initial reaction volume for each single cell reaction. Lysis began when cells were dispensed inside a droplet of about 300 pl of PBS into these reaction volumes of DMSO. After lysis, 10 nl of solution containing trypsin and HEPES buffer was added to each reaction volume, for a final concentration of 120 ng/μl of trypsin and 5 mM HEPES and total volume of 18 nl.
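The array capacity and digestion-mix figures above can be reproduced with a short Python sketch. The geometry (14 droplets per cluster, 36 clusters per slide, 4 slides) and the final concentrations (120 ng/μl trypsin, 5 mM HEPES in 18 nl) are taken from the text. The stock concentrations of the dispensed 10 nl solution and the trypsin mass per droplet are back-calculated here for illustration and are not separately stated; the function and variable names are hypothetical.

```python
# Array geometry stated in the text.
DROPLETS_PER_CLUSTER = 14
CLUSTERS_PER_SLIDE = 36
SLIDES_PER_HOLDER = 4


def array_capacity(droplets_per_cluster, clusters_per_slide, slides):
    """Number of single-cell reaction droplets that can be processed simultaneously."""
    return droplets_per_cluster * clusters_per_slide * slides


capacity = array_capacity(DROPLETS_PER_CLUSTER, CLUSTERS_PER_SLIDE, SLIDES_PER_HOLDER)

# Digestion step: 10 nl of trypsin/HEPES solution brings the droplet to 18 nl total.
FINAL_VOLUME_NL = 18.0
DIGEST_ADDITION_NL = 10.0
FINAL_TRYPSIN_NG_PER_UL = 120.0
FINAL_HEPES_MM = 5.0


def stock_concentration(final_conc, final_volume_nl, added_volume_nl):
    """Concentration the added solution must have to reach final_conc after dilution."""
    return final_conc * final_volume_nl / added_volume_nl


# Derived values (assumptions implied by the stated final concentrations).
trypsin_per_droplet_ng = FINAL_TRYPSIN_NG_PER_UL * FINAL_VOLUME_NL / 1000.0  # nl -> ul
trypsin_stock_ng_per_ul = stock_concentration(
    FINAL_TRYPSIN_NG_PER_UL, FINAL_VOLUME_NL, DIGEST_ADDITION_NL
)
hepes_stock_mm = stock_concentration(FINAL_HEPES_MM, FINAL_VOLUME_NL, DIGEST_ADDITION_NL)
```

The computed capacity of 2,016 droplets equals the sum of the 1,888 single cells and 128 negative controls reported for a single batch.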
The humidifier and cooling system were then turned on to prevent droplet evaporation. Relative humidity inside the CellenONE was set to 75%, and the chiller temperature was set to dynamically chase one degree above the dew point. Mass spectrometry grade water was dispensed in a perimeter surrounding each grid to provide further control for the local humidity of the reaction volumes. The system was set to refresh the water droplet perimeter to control local humidity every 40 minutes for 5 hours as proteins digest.
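To illustrate what "one degree above the dew point" means numerically, the following Python sketch estimates the dew point with the Magnus approximation and adds a 1° C. offset. The Magnus coefficients are a commonly used parameterization and are an assumption of this sketch. The 22° C. air temperature is hypothetical, since the air temperature inside the instrument is not stated. The sketch does not represent the CellenONE control software.

```python
import math

# Commonly used Magnus coefficients for water vapor over liquid water (assumption).
MAGNUS_B = 17.62
MAGNUS_C = 243.12  # degrees Celsius


def dew_point_c(air_temp_c, relative_humidity_pct):
    """Approximate dew point (deg C) from air temperature and relative humidity."""
    if not 0.0 < relative_humidity_pct <= 100.0:
        raise ValueError("relative humidity must be in (0, 100]")
    gamma = math.log(relative_humidity_pct / 100.0) + (
        MAGNUS_B * air_temp_c / (MAGNUS_C + air_temp_c)
    )
    return MAGNUS_C * gamma / (MAGNUS_B - gamma)


def chiller_setpoint_c(air_temp_c, relative_humidity_pct, offset_c=1.0):
    """Slide temperature held a fixed offset above the dew point."""
    return dew_point_c(air_temp_c, relative_humidity_pct) + offset_c


# Hypothetical enclosure conditions: 22 deg C air at the 75% relative humidity set point.
setpoint = chiller_setpoint_c(air_temp_c=22.0, relative_humidity_pct=75.0)
```

Holding the slide slightly above the dew point keeps evaporation from the nanoliter droplets slow while avoiding condensation onto the slide, which is consistent with the stated purpose of preventing droplet evaporation.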
After proteins were digested for 5 hours, the humidity and cooling controls were turned off. 20 nl of TMT labels suspended in DMSO at a concentration of 28 mM were then dispensed to each reaction volume using the organic dispensing tip. When dispensing labels, the humidifier was turned off to assist with dispensing. After single cells were left to label for 1 hour, 20 nl of 5% hydroxylamine solution was added to each reaction volume to quench the labeling reaction. Humidity and cooling controls were returned to their previous settings for quenching the labeling reaction. After 20 minutes, another 30 nl of 5% hydroxylamine was added.
After quenching proceeded for another 20 minutes, sample clusters were pooled by aspirating them off the slide surface in 10 μl of a 100% acetonitrile solution via CellenONE PDC and syringe pump controls. Pooled samples were then transferred into a 384-well plate (AB1384, ThermoFisher, Waltham, MA), dried to dryness in a speed-vacuum (Eppendorf, Germany), and either frozen at −80° C. for later analysis or immediately reconstituted in 1.1 μl of 0.1% formic acid (85178, ThermoFisher, Waltham, MA) for mass spectrometry analysis.
Melanoma and monocyte cells were incubated using Vybrant DyeCycle (V35003, ThermoFisher, Waltham, MA) following the manufacturer's instructions. Cells were sorted via the Beckman CytoFLEX SRT (Beckman Coulter, Brea, CA). After sorting, cells were pelleted, washed with mass spectrometry grade water, and resuspended in water at a concentration of 2000 cells/μl. Cells were then frozen at −80° C. for 10 minutes and then heated to 90° C. for 10 minutes for lysis. Proteins were then digested overnight in a solution of 15 ng/μl of trypsin. Samples were analyzed via data independent acquisition.
MS analysis was designed and performed according to the SCOPE2 guidelines and protocol (Specht et al., Genome Biol. 22(1):50 (2021), Petelski et al., Nat Protoc. 16(12):5398-425 (2021) and Specht & Slavov, J Proteome Res. 20(1):880-87 (2021)). Specifically, the single cells pooled into SCOPE2 sets were separated via online nLC on a Dionex UltiMate 3000 UHPLC; 1 μl out of 1.1 μl of sample was picked up out of a 384-well plate (AB1384, ThermoFisher, Waltham, MA) placed on an auto sampler height adjuster for PCR plates (6820.4089, ThermoFisher, Waltham, MA) and loaded onto a 25 cm×75 μm IonOpticks Aurora Series UHPLC column (AUR2-25075C18A). Buffer A was 0.1% formic acid in water and buffer B was 0.1% formic acid in 80% acetonitrile/20% water. A constant flow rate of 200 nl/min was used throughout sample loading and separation. Samples were loaded onto the column for 20 minutes at 1% B buffer, then ramped to 5% B buffer over two minutes. The active gradient then ramped from 5% B buffer to 25% B buffer over 53 minutes. The gradient was then ramped to 95% B buffer over 2 minutes and stayed at that level for 3 minutes. The gradient then dropped to 1% B buffer over 0.1 minutes and stayed at that level for 4.9 minutes. Loading and separating each sample took 95 minutes total. All samples were analyzed by a Thermo Scientific Q-Exactive mass spectrometer from minute 20 to 95 of the LC loading and separation process. Electrospray voltage was set to 1.8 kV, applied at the end of the analytical column. To reduce atmospheric background ions and enhance the peptide signal-to-noise ratio, an Active Background Ion Reduction Device (ABIRD, ESI Source Solutions, LLC, Woburn, MA) was used at the nanospray interface. The temperature of the ion transfer tube was 250° C. and the S-lens RF level was set to 80.
A prioritized analysis workflow (Huffman et al., bioRxiv 484655 (2022)) was used to increase consistency of identification and depth of coverage for the nPOP-prepared single-cell data shown in
LC-Settings for pSCOPE-Associated Experiments
Samples were analyzed using a 95-minute method with the following gradient characteristics: samples were loaded onto the column at 4% B; the gradient was then ramped to 8% at minute 12, 35% at minute 75, 95% at minute 77, 4% from minute 80.1 onward.
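The gradient above can be read as a piecewise-linear profile of %B over time. The following Python sketch encodes it as a table of breakpoints and interpolates between them. Several points are assumptions of this sketch rather than statements of the text: that %B ramps linearly between the listed time points, that the ramp to 8% B begins at minute 0, and that 95% B is held from minute 77 to minute 80 before dropping to 4% B at minute 80.1. The names are hypothetical.

```python
from bisect import bisect_right

# (time in minutes, %B) breakpoints. Entries at minute 0 and minute 80 encode
# the assumptions described in the lead-in; the others are from the text.
GRADIENT = [
    (0.0, 4.0),
    (12.0, 8.0),
    (75.0, 35.0),
    (77.0, 95.0),
    (80.0, 95.0),
    (80.1, 4.0),
    (95.0, 4.0),
]


def percent_b(t_min, gradient=GRADIENT):
    """Linearly interpolate %B at time t_min (minutes) from the breakpoint table."""
    times = [t for t, _ in gradient]
    if t_min <= times[0]:
        return gradient[0][1]
    if t_min >= times[-1]:
        return gradient[-1][1]
    i = bisect_right(times, t_min)
    (t0, b0), (t1, b1) = gradient[i - 1], gradient[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


# Sample the profile at a few time points of interest.
profile = {t: percent_b(t) for t in (6.0, 12.0, 43.5, 75.0, 90.0)}
```

Under these assumptions the active separation window runs from 8% B at minute 12 to 35% B at minute 75, that is, 63 minutes of the 95-minute method.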
Spectronaut search results of the retention-time-calibration run were filtered to EG.PEP ≤ 0.02 and EG.Qvalue ≤ 0.05. Additionally, precursors without TMTpro modifications (+304.2071 Da) on the peptide N-terminus or lysine residue were filtered out. The distribution of precursor intensities for the remaining precursors was then subset into tertiles for use in priority tier assignment. These precursors were then filtered such that a maximum of four peptides per protein were selected, with the most intense peptides per protein being selected. Filtered peptides with precursor intensities in the top intensity tertile were placed on the top priority tier, peptides with intensities in the middle intensity tertile were placed on the middle priority tier, and peptides with intensities in the bottom intensity tertile were placed on the bottom priority tier. All species matching the original EG.PEP and EG.Qvalue filtration characteristics that were not previously selected for a priority tier were assigned a priority below the previous bottom tier. These priority-tier-assigned peptides were then enabled for participation in MaxQuant.Live's real-time-retention-time-alignment algorithm, as well as MS2 upon detection. Any remaining PSMs outside of the original filtration criteria (EG.PEP ≤ 0.02 and EG.Qvalue ≤ 0.05) were enabled for participation in MaxQuant.Live's real-time-retention-time-alignment algorithm, but not sent for MS2 upon detection.
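The intensity-tertile tier assignment described above can be sketched in Python as follows. The input is assumed to be a list of precursors that already passed the EG.PEP, EG.Qvalue and TMTpro-modification filters. The record fields, the numeric tier codes (3 = top, 2 = middle, 1 = bottom, 0 = below the bottom tier) and the rank-based tertile cut points are hypothetical choices made for this illustration; the actual implementation may differ.

```python
from collections import defaultdict


def assign_priority_tiers(precursors, max_per_protein=4):
    """Map precursor id -> tier code using intensity tertiles and a per-protein cap.

    precursors: list of dicts with keys 'id', 'protein', 'intensity'
    (hypothetical record layout; entries are assumed to be pre-filtered).
    """
    if not precursors:
        return {}

    # Tertile cut points computed on the full filtered intensity distribution.
    intensities = sorted(p["intensity"] for p in precursors)
    n = len(intensities)
    low_cut = intensities[n // 3]
    high_cut = intensities[(2 * n) // 3]

    by_protein = defaultdict(list)
    for p in precursors:
        by_protein[p["protein"]].append(p)

    tiers = {}
    for members in by_protein.values():
        ranked = sorted(members, key=lambda p: p["intensity"], reverse=True)
        for rank, p in enumerate(ranked):
            if rank >= max_per_protein:
                tiers[p["id"]] = 0  # passed filters but exceeded the per-protein cap
            elif p["intensity"] >= high_cut:
                tiers[p["id"]] = 3  # top intensity tertile
            elif p["intensity"] >= low_cut:
                tiers[p["id"]] = 2  # middle intensity tertile
            else:
                tiers[p["id"]] = 1  # bottom intensity tertile
    return tiers


# Hypothetical example: five peptides from protein A and one from protein B.
example = [
    {"id": "a1", "protein": "A", "intensity": 60.0},
    {"id": "a2", "protein": "A", "intensity": 50.0},
    {"id": "a3", "protein": "A", "intensity": 40.0},
    {"id": "a4", "protein": "A", "intensity": 30.0},
    {"id": "a5", "protein": "A", "intensity": 20.0},
    {"id": "b1", "protein": "B", "intensity": 10.0},
]
tiers = assign_priority_tiers(example)
```

In this example the fifth-ranked peptide of protein A exceeds the four-per-protein cap and therefore falls below the bottom tier, even though it passed the identification filters.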
1 μl injections of a 1× concentrated aliquot of mixed carrier-reference material were analyzed using the instrument method detailed in the prioritized acquisition parameters section and MaxQuant.Live parameters indicated in the associated table. The two raw files associated with these experiments were then searched using MaxQuant (v. 1.6.17.0) using a FASTA containing all entries from the human SwissProt database (swissprot_human_20211005.fasta, 20,386 proteins). TMTpro 16plex was enabled as a fixed modification on peptide N-termini and lysines via the reporter ion MS2 submenu. Methionine oxidation (+15.99492 Da) and protein N-terminal acetylation (+42.01056 Da) were enabled as variable modifications, and trypsin was selected for in silico digestion with enzyme mode set to specific. Up to 2 missed cleavages were allowed per peptide with a minimum length of 7 amino acids. Second peptide identifications were disabled, calculate peak properties was enabled, and msScans was enabled as an output file. PSM FDR and protein FDR were set to 1.
One 1 μl injection of a 1× concentrated aliquot of mixed carrier-reference material was analyzed using the LC settings indicated above. The following MS1 settings were used: 70k resolution, 1e6 AGC target, 100 ms maximum injection time, and a scan range of 450Th to 1600Th. MS2 scans were acquired with the following settings: 70k resolution, 1e6 AGC target, 300 ms maximum injection time, loop count (i.e., top-n) of 7, isolation window of 0.7Th with a 0.3Th offset, fixed first mass of 100 m/z, NCE of 33, and a centroid spectrum data type. The minimum AGC target was 2e4, apex triggering was disabled, and charge exclusion was enabled for unassigned charge states, as well as charge states greater than 6. The peptide match setting was disabled, exclude isotopes was enabled, and dynamic exclusion was set to 30 seconds. Voltage was set to 0 for the first 25 minutes, and sweep gas was applied from minute 24.6 to 25 to dislodge any accumulated droplets from the capillary tip. From minute 25 to 80, voltage was set to 1.7 kV, capillary temp to 250° C., and the S-lens RF level to 80. From minute 94.20 to 94.60, sweep gas was applied to dislodge any accumulated droplets from the capillary tip.
The raw file generated by this analysis was searched using the same MaxQuant settings as indicated in the Scout experiment instrument method and raw data analysis section.
The PSMs generated from the scout runs using intensity-dependent tiers (wAL00191 and wAL00192) were partitioned into three categories: PSMs at PEP≤0.02 (set α), PSMs with 0.02<PEP≤0.05 (set β), and PSMs with PEPs >0.05 (set γ). Then the same set of PEP filters defined above for wAL00191 and wAL00192 was applied to the results of a DDA analysis conducted on an injection of a 1× concentrated aliquot of carrier and reference material to generate sets δ, ε, and ξ. Furthermore, these last three precursor sets were assembled such that they each contained a unique set of precursors with respect to one another and the preceding set of precursors.
Sets α and δ were combined and filtered such that a maximum of 4 peptides per protein were selected, choosing those precursors with the highest precursor intensities, to form the top priority tier candidates. The excluded precursors from this filtration were then combined with sets β and ε to make up the middle priority tier candidates. Peptides from sets γ and ξ were then combined to form the bottom priority tier candidates.
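The tier construction described above can be illustrated with a minimal Python sketch. It assumes each precursor is a dictionary with the hypothetical keys `id`, `protein`, `pep`, and `intensity`, and that the scout-run and DDA precursors have already been merged into one non-redundant list; neither the field names nor the function names come from the original analysis.

```python
from collections import defaultdict


def pep_class(pep):
    """Classify a PSM by its posterior error probability (PEP)."""
    if pep <= 0.02:
        return "high"  # corresponds to sets α and δ
    if pep <= 0.05:
        return "mid"   # corresponds to sets β and ε
    return "low"       # corresponds to sets γ and ξ


def assign_tiers(precursors, max_per_protein=4):
    """Split precursors into top, middle, and bottom priority tiers.

    precursors: list of dicts with hypothetical keys
    'id', 'protein', 'pep', 'intensity'.
    """
    tiers = {"top": [], "middle": [], "bottom": []}
    high_by_protein = defaultdict(list)
    for p in precursors:
        cls = pep_class(p["pep"])
        if cls == "high":
            high_by_protein[p["protein"]].append(p)
        elif cls == "mid":
            tiers["middle"].append(p["id"])
        else:
            tiers["bottom"].append(p["id"])
    # Keep at most max_per_protein of the most intense confident precursors
    # per protein in the top tier; the remainder drops to the middle tier.
    for group in high_by_protein.values():
        ranked = sorted(group, key=lambda p: p["intensity"], reverse=True)
        tiers["top"].extend(p["id"] for p in ranked[:max_per_protein])
        tiers["middle"].extend(p["id"] for p in ranked[max_per_protein:])
    return tiers
```

In this sketch the per-protein cap is applied only to the confident (PEP≤0.02) precursors, which mirrors the filtering of the combined α and δ sets.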
The results from the retention-time-calibration experiment were then intersected with the priority tier sets, and the PSMs matching each set were given a corresponding priority index for use by MaxQuant.Live. Up to 8,600 of the most abundant remaining retention-time-calibration-experiment-associated PSMs were then added to the bottom priority tier to provide additional identifiable precursors when higher priority precursors were not detected. All selected precursors were then enabled for participation in the MaxQuant.Live real-time-retention-time-alignment algorithm, and for MS2 upon detection. All remaining PSMs that were not part of the priority tiers were then selected for participation in the MaxQuant.Live real-time-retention-time-alignment algorithm, but not for MS2 upon detection.
All single-cell samples were resuspended in 1 μl of 0.1% formic acid (85178, Thermo Fisher, Waltham, MA) and injected from a 384-well plate (AB1384, Thermo Fisher, Waltham, MA). All 1× concentrated carrier and reference samples were resuspended in 1 μl of 0.1% formic acid (85178, Thermo Fisher, Waltham, MA) and injected from a glass HPLC insert (C4010-630, Thermo Fisher, Waltham, MA). LC settings indicated above were used in these analyses. Scan parameters were implemented following the MQ.live listening scan guidelines: Two Full MS-SIM scans were applied from minute 25 to 30 to trigger MaxQuant.Live. Both MS-SIM scans had the following parameters in common: 70k resolution, 1e6 AGC target, and a 300 ms maximum injection time. The first MS-SIM scan covered 908 to 1070Th, since the acquisition started at minute 25 and ended at minute 95. The second MS-SIM scan covered the scan space from 909Th to the numeric MaxQuant.Live method index to call. The total Xcalibur MS method time was 95 minutes. Tune files governing voltage and sweep gas were implemented as in the pre-prioritization shotgun method.
The swissprot_human_20211005.fasta file was read into the R environment using the seqinr (Charif & Lobry, Biological and Medical Physics, Biomedical Engineering 207-32 (2007)) package, and only those proteins with peptides present on the inclusion list were retained to generate the AndrewsnPOP_FASTA_v2.fasta file, containing 3535 proteins, used to search the resulting prioritized single-cell experiments.
After a precursor scan from 450 to 1600 m/z at 70,000 resolving power, the top 7 most intense precursor ions with charges 2 to 4 and above the AGC min threshold of 20,000 were isolated for MS2 analysis via a 0.7 Th isolation window with a 0.3 Th offset. These ions were accumulated for at most 300 ms before being fragmented via HCD at a normalized collision energy of 33 eV (normalized to m/z 500, z=1). The fragments were analyzed by an MS2 scan with 70,000 resolution. Dynamic exclusion was used with a duration of 30 seconds with a mass tolerance of 10 ppm.
Samples were run using the V1 method from Derks et al. (Derks et al., bioRxiv 467007 (2021)). This method contains 140k resolution MS1 scans for improved MS1 level quantification.
Raw data were searched by MaxQuant (Cox et al., Nat Biotechnol. 26(12): 1367-72 (2008) and Cox et al., J Proteome Res. 10(4):1794-805 (2011)) version 1.6.17.0 against a protein sequence database including entries from the appropriate human SwissProt database (downloaded Jul. 30, 2018) and known contaminants such as human keratins and common lab contaminants. The FASTA was limited to proteins that were included on the prioritization list. MaxQuant searches were performed using the standard workflow (Tyanova et al., Nat Protoc. 11(12):2301-19 (2016)). Trypsin specificity was specified and up to two missed cleavages for peptides having from 5 to 26 amino acids were allowed. Methionine oxidation (+15.99492 Da) and protein N-terminal acetylation (+42.01056 Da) were set as variable modifications. Carbamidomethylation was disabled as a fixed modification. All peptide-spectrum matches (PSMs) and peptides found by MaxQuant were exported in the msms.txt and the evidence.txt files.
Data Independent Acquisition runs were searched with DIA-NN v1.8.0 (Demichev et al., Nat Methods. 17(1):41-44 (2020)) using an in silico fasta generated library enabled by deep learning.
When comparing relative protein levels in Jurkat and U-937 cells, SILAC ratios for peptides were computed by dividing each channel by its median, and then taking the ratio of the light and heavy channels. When comparing absolute abundances between heavy and light U-937 cells to measure efficiency of extraction, label swap experiments were run so that both lysis conditions were measured with both heavy and light labels. The raw intensities for corresponding lysis methods were averaged and the ratio between different lysis methods was plotted.
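As a minimal illustration of the median-normalized SILAC ratio described above, the following Python sketch divides each channel by its own median before taking the light-to-heavy ratio per peptide. The function name and the assumption that both channels are equal-length lists of positive intensities are illustrative only.

```python
from statistics import median


def silac_ratios(light, heavy):
    """Return per-peptide light/heavy ratios after median-normalizing each channel.

    light, heavy: equal-length lists of positive peptide intensities.
    """
    light_median = median(light)
    heavy_median = median(heavy)
    return [
        (l / light_median) / (h / heavy_median)
        for l, h in zip(light, heavy)
    ]
```

Normalizing each channel by its median removes differences in total loading between the two channels, so that the ratios reflect relative rather than absolute differences.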
The single-cell data were processed and normalized by the SCOPE2 pipeline (Specht et al., Genome Biol. 22(1):50 (2021) and Petelski et al., Nat Protoc. 16(12):5398-425 (2021)). This pipeline is also implemented by the scp Bioconductor package (Vanderaa & Gatto, Expert Rev Proteomics 18(10):835-43 (2021) and Vanderaa & Gatto, Bioconductor (2020)). Briefly, single cells with suboptimal quantification were removed prior to data normalization and analysis based on objective criteria: the internal consistency of protein quantification for each single cell was evaluated by calculating the coefficient of variation (CV) for proteins (leading razor proteins) identified with over 5 peptides for that cell. The coefficient of variation is defined as the standard deviation divided by the mean. The CVs were computed for the relative reporter ion (RI) intensities, i.e., the RI intensities of each peptide were divided by their mean, resulting in a vector of fold changes relative to the mean. Cells that fell outside the distribution were removed from analysis with a threshold of 0.41. Data were normalized by the procedure outlined by Specht et al. (Specht et al., Genome Biol. 22(1):50 (2021)).
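The CV-based quality filter can be sketched as follows. This is a simplified illustration, not the SCOPE2 implementation: it assumes that the relative RI intensities for one cell are already grouped per protein in a dictionary, that the sample standard deviation is used, and that per-protein CVs are summarized per cell by their median. All of these choices, and the function names, are assumptions.

```python
from statistics import mean, median, stdev


def cv(values):
    """Coefficient of variation: standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def cell_median_cv(peptides_by_protein, min_peptides=6):
    """Median CV over proteins quantified by more than 5 peptides in one cell.

    peptides_by_protein: dict mapping protein -> list of relative RI
    intensities of its peptides in this cell.
    """
    cvs = [
        cv(values)
        for values in peptides_by_protein.values()
        if len(values) >= min_peptides
    ]
    return median(cvs) if cvs else None


def keep_cell(peptides_by_protein, threshold=0.41):
    """Keep a cell only if its median CV does not exceed the threshold."""
    m = cell_median_cv(peptides_by_protein)
    return m is not None and m <= threshold
```

A cell with no protein meeting the peptide-count requirement has no CV estimate and is dropped in this sketch; a real pipeline may handle that case differently.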
From the protein x single cell matrix, all pairwise protein correlations (Pearson) were computed. Thus, for each protein, there was a computed vector of correlations with a length the same as the number of rows in the matrix (number of proteins). The dot product of this vector with itself was used to weight each protein prior to principal component analysis. The principal component analysis was performed on the correlation matrix of the weighted data.
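A minimal NumPy sketch of this weighting scheme is given below. It assumes a proteins × cells matrix without missing values, and it interprets "the correlation matrix of the weighted data" as the cell-by-cell correlation matrix, which is an assumption; the function name is hypothetical.

```python
import numpy as np


def weighted_pca(X, n_components=2):
    """Weight proteins by their correlation vectors, then run PCA.

    X: proteins x cells array with no missing values.
    Returns (weights, eigenvalues, eigenvectors); each eigenvector holds
    one coordinate per cell.
    """
    # Protein x protein Pearson correlations (rows are variables).
    R = np.corrcoef(X)
    # Dot product of each protein's correlation vector with itself.
    weights = np.sum(R * R, axis=1)
    # Scale each protein (row) by its weight.
    Xw = X * weights[:, None]
    # Cell x cell correlation matrix of the weighted data (assumption).
    C = np.corrcoef(Xw, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1][:n_components]
    return weights, eigvals[order], eigvecs[:, order]
```

Because each protein correlates perfectly with itself, every weight is at least 1, and proteins that covary with many others receive larger weights.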
Protein set enrichment analysis was performed by t-test between Cluster A and B on the un-imputed data. It was required that a given gene set had at least 4 proteins measured in the single cells and that each population had at least 80% of cells with protein observations. The distribution of p-values was corrected for multiple hypothesis testing with the BH method. Only GO terms with a Q value less than 0.0001 were reported.
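For reference, the Benjamini-Hochberg (BH) adjustment mentioned above can be written in a few lines of Python. This is a generic sketch of the standard procedure with a hypothetical function name, not the code used in the analysis.

```python
def bh_adjust(pvals):
    """Return Benjamini-Hochberg adjusted p-values (Q values) in input order."""
    m = len(pvals)
    # Indices of p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min
    return q
```

Gene sets would then be reported only if their adjusted value falls below the chosen cutoff, for example `[q < 0.0001 for q in bh_adjust(pvals)]`.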
Phase markers were constructed from proteins identified as differentially abundant in each CDC phase in both monocyte and melanoma cells. These proteins were first identified on the bulk level. To further narrow the list of proteins used to create phase markers, proteins that contained multiple, positively correlated peptides in the single cell samples were used. Phase markers were then constructed by averaging the abundances of all possible combinations of 2 or 3 proteins corresponding to each phase of the cell cycle. Groups of two markers for each CDC phase that were positively correlated were selected. This served as validation as it was expected that proteins that are highly abundant in the same phase would positively covary. Groups of protein markers were then further filtered.
Markers were first constructed in the space of monocyte cells, and correlations between markers were validated in melanoma cells.
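The enumeration of candidate phase markers can be sketched in Python as follows. The sketch assumes complete (no missing) per-cell abundances stored in a dictionary keyed by protein name for a single CDC phase; the data layout and function name are hypothetical.

```python
from itertools import combinations


def candidate_markers(abundance, sizes=(2, 3)):
    """Average every combination of 2 or 3 proteins into a candidate marker.

    abundance: dict mapping protein -> list of per-cell abundances
    (all lists have the same length) for one CDC phase.
    Returns a dict mapping each protein combination to its per-cell average.
    """
    markers = {}
    for k in sizes:
        for combo in combinations(sorted(abundance), k):
            per_cell = zip(*(abundance[p] for p in combo))
            markers[combo] = [sum(values) / k for values in per_cell]
    return markers
```

Pairs of candidate markers built from non-overlapping proteins could then be compared by correlation, keeping those that covary positively within a phase.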
Identifying Proteins that Covary with CDC Markers
To identify proteins that covary with the phase marker vectors, the phase marker vectors were correlated to the measured levels of each protein using Spearman correlation. The distribution of p-values obtained from the Spearman correlation test was adjusted using the BH method and the results were filtered at 1% FDR.
To identify functionally coherent sets of proteins that covary with the CDC phase markers, each protein was correlated to the median abundance of CDC proteins that showed similarity between melanoma and monocyte cells as plotted in
A greedy approach was taken to assign cells to a CDC phase. First, a vector of length 3× the number of cells was created, where each value was the average abundance of G1, S, or G2 marker proteins. This vector was then sorted from highest to lowest. The sorted list was then iterated from top to bottom, and cells were sorted into the G1, S, or G2 bin based on the phase of each value. 50% of cells were sorted into the G1 bin, and 25% of cells were sorted into each of the S and G2 bins, based on the distribution observed from the bulk FACS CDC sorting.
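One way to read this greedy procedure is sketched below in Python. The handling of rounding (any remainder goes to the last phase) and the rule that a cell is skipped once assigned or once a bin is full are assumptions, as are the function name and data layout.

```python
def assign_phases(scores, fractions=(("G1", 0.5), ("S", 0.25), ("G2", 0.25))):
    """Greedily assign each cell to one CDC phase.

    scores: dict mapping cell -> dict mapping phase -> average marker abundance.
    fractions: target fraction of cells per phase.
    """
    n = len(scores)
    capacity = {phase: round(frac * n) for phase, frac in fractions}
    # Assumption: any rounding remainder is absorbed by the last phase.
    last = fractions[-1][0]
    capacity[last] = n - sum(c for ph, c in capacity.items() if ph != last)

    # One entry per (cell, phase): a vector of length 3x the number of cells.
    entries = sorted(
        ((value, cell, phase)
         for cell, by_phase in scores.items()
         for phase, value in by_phase.items()),
        key=lambda e: e[0],
        reverse=True,
    )
    assigned = {}
    for value, cell, phase in entries:
        if cell not in assigned and capacity[phase] > 0:
            assigned[cell] = phase
            capacity[phase] -= 1
    return assigned
```

Because the bin capacities sum to the number of cells and every cell has one entry per phase, every cell ends up in exactly one bin in this sketch.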
A null distribution that consists of all pairwise Euclidean distances was computed for each protein. Euclidean distances were only calculated between observed values, and vectors were subsequently normalized to the number of pairwise observed values in each vector. Euclidean distances were then calculated in the same fashion from all proteins within complexes from the CORUM protein database (Giurgiu et al., Nucleic Acids Res. 47(D1):D559-D563 (2019)).
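The distance calculation restricted to jointly observed values can be sketched as follows. The sketch assumes that missing values are encoded as NaN and that the normalization is a division of the distance by the number of jointly observed cells; the exact form of the normalization is not specified above, so this is only one plausible reading, and the function names are hypothetical.

```python
import math
from itertools import combinations


def observed_distance(x, y):
    """Euclidean distance over cells where both proteins are observed,
    divided by the number of such cells. Returns None if there is no overlap."""
    pairs = [
        (a, b) for a, b in zip(x, y)
        if not (math.isnan(a) or math.isnan(b))
    ]
    if not pairs:
        return None
    dist = math.sqrt(sum((a - b) ** 2 for a, b in pairs))
    return dist / len(pairs)


def pairwise_distances(profiles):
    """All pairwise distances between protein profiles.

    profiles: dict mapping protein -> list of per-cell values (NaN = missing).
    """
    out = {}
    for p, q in combinations(sorted(profiles), 2):
        d = observed_distance(profiles[p], profiles[q])
        if d is not None:
            out[(p, q)] = d
    return out
```

Applying `pairwise_distances` to all proteins would give a null distribution, against which the distances among members of a given complex can be compared.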
To reduce batch effects and background signal, the goal was to maximize the number of single cells prepared in parallel while minimizing the volumes of sample preparation. To this end, the idea of performing all sample preparation steps in droplets on the surface of a uniform glass slide was explored (
These data supported the use of DMSO for cell lysis performed by first dispensing an experimenter defined regular array of DMSO droplets, and subsequently adding a cell to each droplet for lysis (
For the next step, labeling peptides, it was found that the commonly used approach of dissolving labels in acetonitrile was unreliable due to low density and low surface tension of acetonitrile. To overcome this problem, DMSO dissolved labels were introduced, and robust performance of sub-nanoliter droplets over hundreds of samples was observed. This approach was validated by measuring labeling efficiency in pooled samples, and over 99% of all possible peptides were found to be TMT labeled. The final step of nPOP entailed collecting the samples and delivering them for LC-MS analysis. Clusters of labeled single cells were pooled into a single set, aspirated, and dispensed into a 384-well plate in a fully automated fashion for streamlined sample injection (
The nPOP sample preparation was combined with prioritized quantification of proteins introduced by Huffman et al. (Huffman et al., bioRxiv 484655 (2022)) and followed the guidelines of the SCOPE2 protocol (Specht et al., Genome Biol. 22(1):50 (2021) and Petelski et al., Nat Protoc. 16(12):5398-25 (2021)). The AL-01 sample layout design, which prepares 2,016 single cells in one day, was employed (
To evaluate nPOP's ability to analyze protein covariation within and across cell types, two cell lines, WM989 melanoma and U-937 monocyte cells, were analyzed. The average number of proteins and peptides per single cell were 997 proteins and 2,630 peptides, with 2,844 proteins quantified across the 1,543 single cells prepared by nPOP (
As an additional QC metric, the agreement in relative quantification derived from different peptides originating from the same protein was evaluated. The agreement was significantly higher in the single cells than the negative controls (
In addition to the increased throughput, nPOP reduced sample preparation batch effects that could introduce technical artifacts. Indeed, because all single cells were prepared on the same day, no sample preparation batch corrections needed to be applied to the data.
Next, principal component analysis (PCA) of the single-cell protein dataset was performed using all quantified proteins (
The first step towards identifying within cell type protein covariation was to identify proteins that correlate significantly within monocyte and melanoma cells. Computing all pairwise correlations, 5,089 significant correlations were found in monocyte, and 4,679 correlations were found in melanoma cells at FDR <5%. 2,353 of these correlations were between the same pair of proteins. While most of these correlations shared the same trend, interestingly, 15 proteins showed opposite correlation trends. The joint distributions for proteins from these two cases were plotted in
A primary factor for observed protein covariation within a cell type may reflect proteins belonging to a complex. The goal was to identify whether observed protein covariation could be explained by proteins belonging to complexes. To this end, all pairwise Euclidean distances between proteins in known complexes from the CORUM database were computed (Giurgiu et al., Nucleic Acids Res. 47(D1):D559-D563 (2019)), and the distribution against all pairwise distances was tested. 96 protein complexes were identified in melanoma cells, and 89 were identified in monocytes at FDR <10%. Both cell types had similar agreement between ribosomal proteins (
A more challenging problem was quantifying CDC-related protein covariation within a cell type. As a first step towards this analysis, the potential to classify individual cells by their cell cycle phase was evaluated. To obtain a list of proteins whose abundance varies periodically with the cell division cycle, populations of each cell type were first sorted based off their DNA content (
To construct robust markers for each phase, the abundances of groups of proteins corresponding to each phase of the cell cycle were averaged. For each CDC phase, two markers from non-overlapping sets of proteins were constructed. Positive correlation between markers from the same phase served as internal validation based on the expectation that proteins peaking in the same phase positively covary. Conversely, markers for different phases were expected to negatively correlate to each other (
The proteomes of both melanoma and monocyte cells were then projected into a joint 2-dimensional space of the CDC marker proteins defined by principal component analysis (
To identify proteins that covary with the CDC periodic markers, the phase marker vectors were correlated to the measured protein abundances of all proteins quantified across many single cells. For 121 of these proteins in the melanoma and 113 in the monocyte, the correlations were statistically significant, FDR <0.01, suggesting that these proteins are CDC periodic. Specifically, NPM1, which facilitates ribosome biogenesis, positively correlated with G1 phase in both melanoma and monocyte populations (p<10⁻¹⁵ and p<10⁻⁸, respectively).
To increase the statistical power and identify functional covariation with the CDC, the next focus was the covariation of phase markers and proteins with similar functions as defined by the gene ontology (GO). The distributions of correlations between the 3 phase marker vectors and all quantified proteins from a GO term were compared (see the boxplots in
In addition to finding groups of proteins that showed similar cell cycle covariation between cell types, several GO terms also varied differentially with CDC markers (
Next, the two distinct clusters of melanoma cells observed in
To test if the clusters mapped to the same distinct cell states previously identified, the cells were color coded by the abundance of proteins whose transcripts were reported (Emert et al., Nat Biotechnol. 39(7):865-76 (2021)) to mark either the non-primed population (Cluster A) or the primed sub-population (Cluster B) (
To explore CDC differences further, the distribution of cells in each CDC phase across the two sub-populations was quantified. A substantially larger fraction of the cells in cluster B were found in G1 phase, 78%, while only 4% of cells were found to be assigned to G2 phase (
Lastly, 234 additional proteins were differential between cluster A and cluster B cells at FDR <1%. Some of these proteins were displayed in
nPOP was also applied to specifically study surface proteins in an additional experimental system, pancreatic ductal adenocarcinoma (PDAC) (
Existing single-cell omics methods excel at classifying cells by cell type. However, the regulatory dynamics resulting in cell-to-cell variability within a cell type are more challenging to analyze. To support such analysis, a highly parallel sample preparation that enables preparation of hundreds to thousands of single cells in a given experiment was introduced. It allows for reduced volumes and increased consistency of single-cell proteomic sample preparation. Furthermore, it can enable processing thousands of single cells in parallel and thus empower high-throughput, high-power biological analysis (Slavov, Nat Biotechnol. 39(7):809-10 (2021)).
To maximize access and flexibility, nPOP used only commercially available equipment and prepared single cells on an open surface that could be programmatically reconfigured and adapted to different experimental designs. The open environment also obviated all sample movements and maximized the consistency and precision of the sample preparation. The open layout using a hydrophobic slide can be scaled up to simultaneously prepare thousands of single cells. Furthermore, nPOP is amenable to different coatings or hydrophobic surfaces which have the potential to further improve recovery.
nPOP allowed for deeper single-cell proteomic analysis of the cell division cycle than the CDC analysis using the minimal sample preparation method (mPOP) (Specht et al., bioRxiv. 399774 (2018)). The data allowed identification of new proteins and functional groups of proteins associated with the cell cycle without the artifacts associated with synchronizing cell cultures (Cooper, FEBS J. 286(23):4650-56 (2019)). Furthermore, functional groups of proteins associated with the cell cycle were determined in an identified subpopulation of cells within the melanomas. These initial results demonstrate the feasibility of inferring co-regulation of biological processes from single-cell proteomics measurements.
A non-limiting example of an overall work flow for nPOP sample preparation includes cell isolation, cell lysis, protein digestion, peptide labeling, and pooling as illustrated in
To pool all single-cell samples into a set, 1 μl of water is pipetted by hand onto each array of labeled samples. Samples are then pipetted directly into glass inserts containing carrier and reference previously prepared using the mPOP protocol (Specht & Slavov, J Proteome Res. 17(8):2565-71 (2018)) for injection vials. To improve the recovery of labeled peptides, the footprint of each array can be washed by 4 μl of acetonitrile, which is collected and added to the corresponding combined set. This wash is optional and is used to maximize the recovery of labeled peptides from the slide.
nPOP is a general sample preparation method that can be used for either label-free MS analysis or multiplexed MS analysis as part of existing workflows reviewed by Slavov, Curr Opin Chem Biol. 60:1-9 (2021) and Kelly, Mol Cell Proteomics 19(11): 1739-48 (2020). Here, sample preparation by nPOP as part of the SCOPE2 protocol (Specht et al., Genome Biol. 22(1):50 (2021) and Petelski et al., Nat Protoc. 16(12):5398-25 (2021)) is demonstrated. Specifically, the Minimal ProteOmic sample Preparation (mPOP) module (Specht & Slavov, J Proteome Res. 17(8):2565-71 (2018)) was replaced with nPOP, and all other modules of the SCOPE2 workflow were used, including an isobaric carrier (Specht & Slavov, J Proteome Res. 20(1):880-87 (2021)), Data-Driven Optimization of Mass Spectrometry (DO-MS) (Huffman et al., bioRxiv. 512152 (2019)), Data-driven Alignment of Retention Times for IDentification (DART-ID) (Chen et al., PLOS Comput Biol. 15(7):e1007082 (2019)), and the SCOPE2 data analysis pipeline (Specht et al., Genome Biol. 22(1):50 (2021) and Vanderaa et al., Bioconductor (2020)).
To evaluate the performance of nPOP for single-cell sample preparation, proteins in 176 single cells of two distinct cell types, HeLa cells and U-937 monocytes, were measured. The sample preparation was done on two different days so that the data may reflect day-specific batch effects. The resulting SCOPE2 sets were run using less than 24 hours of instrument time. Samples were analyzed and data processed via the SCOPE2 pipeline (Specht et al., Genome Biol. 22(1):50 (2021)). To evaluate the single-cell data, the pipeline calculated the coefficient of variation (CV) of relative peptide levels belonging to the same protein. The relatively low CV values indicate that protein quantification from different peptides was internally consistent (
Next, principal component analysis (PCA) of the single-cell protein dataset was performed using all quantified proteins (
To validate further that the cell type separation was driven by accurate quantification of proteins (rather than by secondary factors such as cell size or missing data), bulk samples of HeLa cells and monocytes were included in the PCA. Similar to previous analysis (Specht et al., Genome Biol. 22(1):50 (2021), Petelski et al., Nat Protoc. 16(12):5398-25 (2021) and Budnik et al., Genome Biol. 19(1):161 (2018)), the bulk samples clustered with the corresponding single cells. This clustering indicated that the single-cell protein quantification was consistent with the proteomic measurements of established bulk methods.
To test further the quantitative accuracy of the data, the heterogeneity within a cell type was studied. Differences in cell state were measured by analyzing the variation in known cell division cycle proteins. To do this, the data for CDC proteins were filtered and cells along the first two principal components were plotted. Each cell was then color coded based on the mean abundance of markers for M/G1 and G2/S phases in the cell. The color-coded cells clustered along the first and second principal component, indicating the feasibility of inferring cell cycle phase from the cells analyzed with nPOP.
A method that prepares single cells in 4-15 nanoliter volumes using only commercially available equipment is demonstrated. The current method prepares single cells in an open environment without a need to move samples in the process to maximize the consistency and precision of the sample preparation. The open layout using a glass slide is scalable to preparing hundreds of single cells at a time. Furthermore, the current method is amenable to different coatings or hydrophobic surfaces which have the potential to further improve recovery.
The teachings of all patents, published applications and references cited herein are incorporated by reference in their entirety.
While example embodiments have been particularly shown and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the embodiments encompassed by the appended claims.
This application claims the benefit of U.S. Provisional Application No. 63/179,035, filed on Apr. 23, 2021 and U.S. Provisional Application No. 63/179,184, filed on Apr. 23, 2021. The entire teachings of the above applications are incorporated herein by reference.
This invention was made with government support under Grant No. GM123497 awarded by the National Institutes of Health. The government has certain rights in the invention. The subject matter disclosed in this application was developed, and the claimed invention was made by, or on behalf of, one or more parties to a Joint Research Agreement that was in effect on or before the effective filing date of the claimed invention. The parties to the Joint Research Agreement are as follows: Northeastern University and SCIENION GmbH.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2022/071883 | 4/22/2022 | WO |
Number | Date | Country | |
---|---|---|---|
63179035 | Apr 2021 | US | |
63179184 | Apr 2021 | US |