This application relates to methods for scoring the spatial relationships or patterns of cells within tissue biopsies. More specifically, the invention relates to the use of spatial analysis to determine distributions of cells in tissues.
The majority of current tissue-based in vitro diagnostic assays (e.g., immunohistochemistry (IHC), chromogenic or fluorescent in situ hybridization (CISH or FISH, respectively), immunofluorescence (IF), mass spectrometry imaging (MSI)), companion diagnostics, laboratory developed tests, and research use only assays are based on: (i) measuring the expression level of a single biomarker within a tissue sample, or (ii) assessing the frequency of biomarker-positive cells, or of cells with graded expression levels, within a tissue sample. These attributes of the tissue and biomarker(s) are scored using quantitative (e.g., image analysis), semi-quantitative (e.g., manual pathologist H-score), or qualitative (e.g., manual pathology 0, 1, 2, 3+ scoring paradigm) methods to conduct medical research, to inform a physician's determination of diagnosis or prognosis, or to guide future treatment decisions.
However, tissue-based assays only evaluate biomarker positivity or expression levels in tissue with limited granularity in regard to the spatial distribution or pattern of cells. Previous efforts to evaluate the relationships between multiple cell types within tissues are limited to relatively simplistic assignments of cell location (e.g., biomarker positive cells in the tumor tissue compartment) or distance (e.g., biomarker positive cells within a distance from the tumor/stroma interface), and do not rely on the tissue-wide spatial distribution statistics embodied in this invention.
Currently, sophisticated quantification of the distribution or pattern of cells within a tissue (e.g., modeling cell-to-cell interactions relative to random distributions, extraction of higher order statistics from density surface renderings, spatial assessment of autocorrelation between marker-positive cells) is neither performed nor utilized in a manner that guides a physician's determination of diagnosis, prediction of prognosis, or assessment of future treatments with a drug. While specific components of these approaches may be implemented in medical research, a defined process resulting in a summary score with medical utility has not been described.
Herein, we present a method that utilizes digital image analysis to extract sophisticated statistics pertaining to the distribution and patterns of cells within a tissue assayed by a tissue-based test, relative to or independent of biomarker staining, cell types, and overall tissue architecture, to be used in medical research and practice.
In accordance with the embodiments herein, a method is provided for extracting distribution statistics from patient tissue samples assayed with a tissue-based test, for the purpose of scoring said patient sample and guiding medical research or treatment based on said score. The method described herein utilizes digital image analysis of an image of one or more tissue sections to extract object-based (e.g., cells) features to generate a dataset that associates a quantity of a specific analyte or biomolecule with a specific location in a tissue object in the tissue section. The numerical representation of the image of the tissue section is further processed using one or more algorithm processes to extract sophisticated distribution features of one or more object types or sub-types in the tissue. Statistics describing the spatial distribution features are summarized to generate a patient-specific diagnostic score, and this score is evaluated to guide patient treatment decisions.
In the following description, for purposes of explanation and not limitation, details and descriptions are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to those skilled in the art that the present invention may be practiced in other embodiments that depart from these details and descriptions without departing from the spirit and scope of the invention.
Within this disclosure, a multitude of spatial analysis techniques are contemplated. For the purpose of example, and not limitation, some of these spatial analysis techniques include geographic distribution analysis, cluster analysis, topological analysis, spatial autocorrelation, network analysis, connectivity analysis, and spatial interaction. Other types of spatial analysis are contemplated.
For purposes of definition, a tissue object is one or more of a cell (e.g., immune cell), a cell sub-compartment (e.g., nucleus, cytoplasm, membrane, organelle), a cell neighborhood, a tissue compartment (e.g., tumor, tumor microenvironment (TME), stroma, lymphoid follicle, healthy tissue), a blood vessel, a lymphatic vessel, extra-cellular matrix, a medical device (e.g., stent, implant), a gel, a parasitic body (e.g., virus, bacterium), a nanoparticle, a polymer, and/or a non-dyed object (e.g., metal particle, carbon particle). Tissue objects are visualized by histologic stains which highlight the presence and localization of a tissue object. Tissue objects can be identified directly by stains specifically applied to highlight the presence of said tissue object (e.g., hematoxylin to visualize nuclei, IHC stain for a protein specifically found in a muscle fiber membrane), indirectly by stains which non-specifically highlight the tissue compartment (e.g., DAB background staining) or by biomarkers known to be localized to a specific tissue compartment (e.g., nuclear-expressed protein, carbohydrates only found in the cell membrane), or can be visualized without staining (e.g., carbon residue in lung tissue).
For the purpose of this disclosure, patient status includes diagnosis of inflammatory status, disease state, disease severity, disease progression, therapy efficacy, and changes in patient status over time. Other patient statuses are contemplated.
In one embodiment, the method includes: (i) obtaining digital images of stained tissue sections; (ii) extracting staining, morphologic, and location features of tissue objects within each image of the digital images using a digital image analysis algorithm implemented by a computer; (iii) identifying appropriate statistical framework(s) to analyze the spatial distribution features; (iv) summarizing the spatial distribution feature(s) using one or more statistical method(s) to extract appropriate summary statistic(s); (v) deriving a patient-specific diagnostic score which is a summarization of one or more spatial distribution feature summary statistics; (vi) applying patient selection criteria to the patient-specific summary score; and (vii) identifying patients as eligible/ineligible for a particular therapy based on said selection criteria.
The following described method is utilized to evaluate one or more patient tissue samples assayed with a tissue-based test to determine whether or not said patient or patients are candidates for a specified therapy. For the purposes of this invention, tissue-based assays refer to an assay modality which enables evaluation of tissue samples while retaining tissue architecture. A tissue-based assay enables evaluation of tissue objects and marker stains (e.g., presence and intensity) for biologic molecules (e.g., chromatin, biomarkers) relative to position (e.g., x-y coordinates, polar coordinates) in the tissue.
For example, and not limitation, tissue-based assays of relevance to this invention are IHC, IF, CISH, FISH, and MSI methods. These methods retain overall tissue architecture and enable the evaluation of biomolecules and underlying tissue objects of the sample relative to position in the tissue.
Patient tissue samples for evaluation are generated using standard processes and practices pertaining to IHC, IF, CISH, FISH, and MSI to produce tissue sections which can be evaluated for one or more biomarkers or tissue features. One or more biomarkers or tissue features of interest may be highlighted by one of the above-mentioned assay modalities in each tissue section (i.e., mono- and multiplexed assay formats) or on multiple sections from a patient's tissue sample (e.g., one biomarker per serial section for a single patient).
Digitization, using standard practices (i.e., digital slide scanning, imaging with a digital camera mounted on a microscope, MSI), is performed to generate a real (e.g., brightfield image from an IHC-stained tissue) or false image (e.g., color stack from IF-stained tissue, molecule expression stack from MSI evaluated tissue) of the tissue which will be utilized for visualization of the tissue for the biomolecules and features of interest as well as downstream analysis. The digital images of each tissue sample are stored in computer memory or in a database for future recall and analysis.
In another embodiment of this invention, a digital tissue image analysis algorithm implemented by a computer is applied to each image of a tissue sample assayed with a tissue-based test to extract the image analysis features (e.g., morphometric, staining, and location features) pertaining to tissue objects in each image. Image analysis features are extracted for image objects which are groupings of pixels which relate to tissue objects and groupings of cells with similar attributes (e.g., cells of common biomarker staining levels), and interfaces between groupings of cells with similar attributes (e.g., tumor/stroma interface).
Morphometric features pertain to the size, shape, area, texture, organization, organizational relationship, and staining appearance of stains within tissue objects observed in a digital image. For example, and not limitation, morphometric features can be the area of a cell nucleus, the completeness of biomarker staining in a cell membrane, the diameter of a cell nucleus, the roundness of a cell, or lacunarity of biomarker staining in a nucleus.
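For illustration only, one of the morphometric features named above, the roundness of a cell, can be sketched as follows. The sketch assumes that a segmented cell outline is available as an ordered list of (x, y) vertices and that roundness is taken as the circularity 4πA/P², which is one common definition and is not prescribed by this disclosure. The function names are hypothetical.

```python
import math

def polygon_area_perimeter(vertices):
    """Return (area, perimeter) of a simple polygon given as [(x, y), ...]."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]          # wrap around to close the outline
        twice_area += x0 * y1 - x1 * y0         # shoelace formula term
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return abs(twice_area) / 2.0, perimeter

def roundness(vertices):
    """Circularity 4*pi*A / P**2 (assumed definition): 1.0 for a circle, lower otherwise."""
    area, perimeter = polygon_area_perimeter(vertices)
    return 4.0 * math.pi * area / perimeter ** 2
```

Under this assumed definition, a square outline has a roundness of π/4, and outlines approaching a circle approach 1.0.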
Staining features pertain to the pixel intensities of specified IHC, ISH, and IF stains or dyes or amount of a molecule determined by MSI-based methodologies. Staining features are evaluated relative to tissue objects (e.g., average staining intensity across pixels in each cell in an image, staining level in a cell membrane, biomolecule expression in a nucleus).
Localization features pertain to the location of objects within a tissue section. Location can be determined based on an absolute (e.g., x and y location based on pixel dimensions of image, μm from center of image defined by pixel dimensions of image) or relative (e.g., x and y position of cells relative to a tissue feature of interest such as a vessel, polar coordinates referenced to the center of mass of a tumor nest) coordinate system (e.g., x-y-z coordinates, polar coordinates). Location for specific image objects can be defined as the centroid of the object or any position extending from the centroid to the exterior limits of the object.
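As a minimal sketch of a relative coordinate system of the kind described above, the following converts absolute x-y centroids into polar coordinates referenced to the center of mass of a group of objects (e.g., a tumor nest). It assumes each object is represented by its centroid as an (x, y) tuple; the function names are hypothetical.

```python
import math

def centroid(points):
    """Center of mass of a list of (x, y) object centroids."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def to_polar(points, origin):
    """Return (radius, angle in radians) of each point relative to origin."""
    ox, oy = origin
    return [(math.hypot(x - ox, y - oy), math.atan2(y - oy, x - ox))
            for x, y in points]
```

For example, `to_polar(cells, centroid(nest_cells))` would express every cell position as a distance and bearing from the nest's center of mass.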
The image analysis algorithm implemented by a computer extracts the image analysis features for each tissue object of interest within an image and stores said values for further analysis in computer memory or to a database.
In a further embodiment, images of one or more tissues assayed by a tissue-based test (e.g., IHC staining for PD-L1, CD8, and Granzyme B) for a single patient are analyzed by an algorithm process implemented by a computer. In this embodiment, the tissues are assayed with monoplex assays (e.g., three slides each stained for one of PD-L1, CD8, and Granzyme B), with one or more multiplex assays (e.g., one slide stained in triplex for PD-L1, CD8, and Granzyme B), or one or more combinations of monoplex and multiplex assays (e.g., one slide stained for PD-L1 and one slide stained with a CD8 and Granzyme B multiplex assay). One or more algorithm processes are applied by a computer to extract image analysis features for tissue objects within the image which results in one or more data arrays representing the tissue(s) in image analysis feature space.
Tissue objects can be identified by assay techniques developed to specifically highlight an object (e.g., hematoxylin staining to identify a cell nucleus, false image of nuclear protein expression generated by MSI). These assay techniques can identify each object directly by enabling visualization of the object itself (e.g., hematoxylin staining for nuclei) or the assay techniques can identify an alternative object or tissue feature which allows indirect definition of the tissue object of interest (e.g., cell membrane staining to define cell object, hematoxylin staining and a digital offset distance to define cell cytoplasm). Within the scope of the claimed invention, one or more objects can be identified from a single analysis and multiple image analysis features associated with the one or more tissue objects can be extracted by the algorithm process. Furthermore, tissue objects can be associated with other tissue objects (e.g., cell membrane and cytoplasm associated with the cell's nucleus) or combined (e.g., cell membrane and nucleus combined to define a cell-object) to generate a composite object. The image analysis features of each object or composite object can be extracted by the algorithm process implemented by the computer.
Each tissue object has an associated plurality of features, and each tissue section has a plurality of tissue objects, wherein the spatial distribution of tissue objects within the tissue can have spatial distribution feature summary statistics extracted and analyzed in whole, or in part, in any of the embodiments following.
In another embodiment, image analysis features are extracted from one or more tissues assayed for one or more biomarkers of interest. For the purpose of this embodiment, the localization features are evaluated and transformed into a format whereby spatial distribution feature statistics can be extracted from the spatial distribution features of tissue objects within the one or more images and associated biomarkers. This transformation can be one or more of: adjustment of location to a different coordinate system from the original analysis (e.g., Cartesian coordinates to polar coordinates), alignment of tissue object location(s) to a common coordinate system (e.g., multiple tissue sections from a single patient translated to a common center of mass), identification of a tissue object subset of interest (e.g., biomarker A positive cells, biomarker A positive and biomarker B negative cells), determination whether or not tissue objects located in different aligned tissue sections are the same tissue object (e.g., cell sectioned into two serial tissue sections), classification of tissue object subsets (e.g., tumor cell class, stroma cell class, inflammatory cell subsets defined by one or more stains), and adjustment to or addition of a third dimension coordinate for each object described in a two-dimensional space (e.g., alignment of serial tissue sections with consideration to section thickness to produce a 3-D description of tissue object position).
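Two of the transformations listed above, translation of serial sections to a common center of mass and addition of a third coordinate from section thickness, can be sketched as follows. This is a simplified illustration: it assumes translation alone is sufficient (no rotation or deformation correction), that sections are supplied in cutting order, and that section thickness is uniform. The function name and the `thickness_um` parameter are hypothetical.

```python
def align_sections(sections, thickness_um):
    """Translate each section to its own center of mass and add a z coordinate.

    sections: list of sections, each a list of (x, y) object centroids,
              ordered as the serial sections were cut (assumed).
    Returns a flat list of (x, y, z) positions in a common coordinate system.
    """
    aligned = []
    for index, cells in enumerate(sections):
        cx = sum(x for x, _ in cells) / len(cells)
        cy = sum(y for _, y in cells) / len(cells)
        z = index * thickness_um                 # depth of this section in the stack
        aligned.extend((x - cx, y - cy, z) for x, y in cells)
    return aligned
```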
Extraction of Spatial Distribution Feature Statistic(s) describing the Spatial Distribution of Image Objects and Image Object Subsets
According to one embodiment of this invention, spatial distribution statistics are extracted from the data array representation of the assayed tissue section(s) for a patient in image analysis feature space once a data structure is generated which contains one or more image analysis features. An algorithm process is applied to the data structure in whole or in part to extract one or more spatial distribution features and associated summary statistics describing the spatial distribution of tissue objects or tissue object subsets (e.g., tumor cells, biomarker-positive cells, tumor nests). A variety of spatial distribution features and summary statistic types can be extracted and are calculated using point-pattern, point-referenced, or areal analysis frameworks.
In an embodiment of this invention, the spatial distribution of tissue objects is analyzed as a spatial point-pattern. Spatial point-patterns can be used to identify spatial trends in point density and position. For example, and not limitation, a data set in this embodiment could include a set of positions of cells in an IHC-stained tissue slide.
In the point-pattern analysis framework, image objects can be unmarked (e.g., all cell-objects identified without sub-classification), where only density of tissue objects is evaluated independent of tissue object classification. Alternatively, tissue objects can be marked (e.g., classified into sub-classes) based on one or more image analysis features, where the value of the covariate feature is assessed as a spatial density. Marks can be discrete values (e.g., biomarker positive/negative, biomarker 0, 1, 2, 3+ staining category) or continuous (e.g., biomarker staining intensity, cell size).
Once a point-pattern analysis approach is defined, quantification of the pattern would use summative methods such as, but not limited to: nearest neighbor distances, pair correlation functions, Ripley's K-function and related functions, or variogram indexing.
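By way of a hedged example, two of the summative methods named above can be sketched for an unmarked point pattern. The sketch assumes cell positions are given as (x, y) tuples, that at least two cells are present, and that the tissue area is known; it uses a naive Ripley's K estimator without edge correction, which is a simplification rather than the form required by this disclosure. Function names are hypothetical.

```python
import math

def nearest_neighbor_distances(points):
    """Distance from each point to its closest other point (needs >= 2 points)."""
    distances = []
    for i, (xi, yi) in enumerate(points):
        distances.append(min(math.hypot(xi - xj, yi - yj)
                             for j, (xj, yj) in enumerate(points) if j != i))
    return distances

def ripley_k(points, radius, area):
    """Naive Ripley's K estimate at one radius, without edge correction."""
    n = len(points)
    ordered_pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= radius:
                ordered_pairs += 1
    return area * ordered_pairs / (n * (n - 1))
```

Under complete spatial randomness, K at radius r is approximately πr², so values well above that reference suggest clustering and values below it suggest dispersion.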
In an illustrative example of this embodiment,
In another embodiment of this invention, the spatial distribution of tissue objects is analyzed as a continuous spatial distribution, using a point-referenced analysis framework. Point-referenced data models the continuous change in density of tissue objects or subclasses of objects across the tissue sample(s). For example, a data set in this embodiment would be a function where the density of cells described by one or more image analysis feature values varies across the sample, which could be built from a set of positions of cells in an IHC slide and a covariate feature, if needed.
Once the function is determined, quantification of the function would use summative methods such as, but not limited to: curvature features, vector calculus features (i.e., gradient features, curl features, divergence features, and Laplacian features), global and local significance tests, point-pattern analysis of global and local minima and maxima, and areas under the curve (AUC).
Additionally, for situations whereby it is of interest to derive a spatial distribution summary statistic for features describing the interaction or localization between two or more tissue object subsets (e.g., cells evaluated for two biomarkers, tumor cells and stroma cells, cell membranes evaluated for a plurality of biomarkers), statistics describing the interactions between two or more point-referenced analysis functions can be extracted. Summary statistics can relate to, but are not limited to, one or more of: the methods described above applied to the function resulting from the subtraction, multiplication, division, or addition of two or more point-referenced analysis functions; the methods described above applied to the function resulting from the thresholding of one or more point-referenced analysis functions by another; and point-pattern analysis of the point-referenced analysis function summary values described above, with the one or more functions defining the mark.
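As an illustrative sketch of the gradient and Laplacian features named above, the following applies central finite differences to a density function that has been sampled on a regular grid. It assumes the grid is a list of equal-length rows with uniform sample spacing and evaluates interior grid cells only; border cells are left at zero. The function name is hypothetical.

```python
import math

def gradient_magnitude_and_laplacian(grid, spacing=1.0):
    """Central-difference gradient magnitude and 5-point Laplacian.

    grid: 2-D list of density values, grid[row][col], uniformly spaced.
    Only interior cells are computed; border cells remain 0.0.
    """
    rows, cols = len(grid), len(grid[0])
    grad = [[0.0] * cols for _ in range(rows)]
    lap = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dx = (grid[r][c + 1] - grid[r][c - 1]) / (2.0 * spacing)
            dy = (grid[r + 1][c] - grid[r - 1][c]) / (2.0 * spacing)
            grad[r][c] = math.hypot(dx, dy)
            lap[r][c] = (grid[r][c + 1] + grid[r][c - 1] + grid[r + 1][c]
                         + grid[r - 1][c] - 4.0 * grid[r][c]) / spacing ** 2
    return grad, lap
```

Summary values such as the maximum gradient magnitude or the mean Laplacian over a region could then serve as candidate spatial distribution feature statistics.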
Modeling the continuous change in density between objects, or object sub-sets, in a point-referenced analysis framework can be achieved through several approaches. For example, and not limitation, Kriging-, spline-, and/or KDE- (kernel density estimate) based methods can be utilized to model the continuous change in density from tissue object point to tissue object point. The resulting surface representation of tissue object, or tissue object sub-set, density and localization can be further evaluated to derive summary spatial statistics.
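A minimal sketch of the KDE-based approach mentioned above follows. It assumes an isotropic two-dimensional Gaussian kernel with a single fixed bandwidth and evaluates the estimate on a user-supplied grid; the kernel choice, the bandwidth, and the function name are illustrative assumptions rather than requirements of this disclosure.

```python
import math

def gaussian_kde_grid(points, bandwidth, x_coords, y_coords):
    """Evaluate a 2-D Gaussian kernel density estimate on a grid.

    points: list of (x, y) tissue object positions.
    Returns surface[row][col] for y_coords[row], x_coords[col].
    """
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    surface = []
    for gy in y_coords:
        row = []
        for gx in x_coords:
            total = sum(math.exp(-((gx - px) ** 2 + (gy - py) ** 2)
                                 / (2.0 * bandwidth ** 2))
                        for px, py in points)
            row.append(norm * total)
        surface.append(row)
    return surface
```

The resulting surface is the kind of density representation from which curvature, gradient, or extremum statistics could subsequently be derived.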
Point-pattern and point-referenced spatial analysis frameworks result in a function describing the spatial distribution of tissue objects or tissue object subsets which defines a curve or surface. Spatial distribution features of these curves or surfaces are extracted to derive summary statistics which can be used to derive a patient-specific diagnostic score. Summary statistics describing curve or surface features can be derived from the original curve or surface itself or from the composite curve or surface which results from combination (e.g., addition, subtraction, division, multiplication, powers, exponents, logical operators, thresholding) of two or more curves or surfaces. Spatial statistics can be derived for features of the overall shape of the curves or surfaces and can also be derived from the fine and hyper-fine structure of the curves or surfaces. Furthermore, the functions describing the curve or surface of the spatial distribution analysis can be integrated or differentiated to identify and evaluate features (e.g., change in density across a tissue indicated by the derivative of the density curve) associated with the integrated or differentiated curves or surfaces.
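The integration and differentiation of a curve described above can be sketched numerically for a sampled curve, for example a K-function or a density profile evaluated at a series of distances. The sketch assumes the curve is supplied as matched lists of x and y samples and uses the trapezoidal rule and first-order finite differences; both function names are hypothetical.

```python
def trapezoid_auc(xs, ys):
    """Area under a sampled curve by the trapezoidal rule."""
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
               for i in range(len(xs) - 1))

def finite_difference(xs, ys):
    """Slope between each pair of consecutive samples (one fewer value than xs)."""
    return [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
            for i in range(len(xs) - 1)]
```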
In another embodiment of this invention, the spatial distribution of cells is analyzed as areal data within an imposed grid. Areal units are defined as sections of tissue that are continuously connected by boundaries to create a lattice. This lattice can be regularly (e.g., square grid) or irregularly shaped (e.g., tumor nest regions, stroma regions). Image analysis features are summarized within each areal unit giving a spatial pattern of values. This can then be subsequently analyzed and quantified using summative methods such as, but not limited to: spatial autocorrelation (Moran's I, Join Count statistic, Geary's C), hotspot analysis, or spatial cross correlation between biomarkers.
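For purposes of illustration, Moran's I can be sketched for a regular square lattice as follows. The sketch assumes binary weights with rook adjacency (areal units sharing an edge are neighbors) and a per-unit summary value such as the count of biomarker-positive cells; other weighting schemes are equally possible. The function name is hypothetical.

```python
def morans_i(grid):
    """Moran's I for a rectangular lattice with rook adjacency and binary weights.

    grid: 2-D list where grid[row][col] is the summarized feature value
          of one areal unit (e.g., positive-cell count).
    """
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        raise ValueError("Moran's I is undefined when all values are equal")
    cross_products = 0.0
    weight_sum = 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross_products += (grid[r][c] - mean) * (grid[rr][cc] - mean)
                    weight_sum += 1
    return (n / weight_sum) * (cross_products / denom)
```

Positive values indicate that similar values tend to occupy neighboring areal units, and negative values indicate an alternating pattern.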
Guiding Patient Treatment Decisions based on Spatial Distribution Summary Statistics:
In a further embodiment of the present invention, one or more summary statistic(s) derived from the spatial distribution features for one or more image objects are utilized to stratify patients into two or more groups for guiding identification of a patient as a candidate for a specified therapy. The patient selection criteria will be pre-defined criteria which can be applied to a patient cohort to stratify patients into one or more groups selected to receive a therapy, and one or more groups which are excluded from receiving a therapy. Additionally, the summary statistic(s) are used to infer diagnosis or disease severity or to monitor the efficacy of therapy.
In this embodiment of the invention, the patient-specific diagnostic score can be derived from one or more spatial distribution feature statistics calculated from one or more spatial distribution analysis frameworks (point-pattern, point-referenced, and areal) applied to one or more image objects (e.g., cells, vessels, cell nuclei) or image object subsets (e.g., tumor cells, biomarker 1 positive cells, biomarker n positive cells).
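One possible way to combine summary statistics into a patient-specific diagnostic score and apply a pre-defined selection criterion is sketched below. The weighted-sum form, the statistic names, the weights, and the cutoff are all hypothetical placeholders chosen for illustration; this disclosure does not specify a particular combination rule or threshold.

```python
def diagnostic_score(stats, weights):
    """Weighted sum of named summary statistics (assumed combination rule)."""
    return sum(weights[name] * stats[name] for name in weights)

def stratify(score, cutoff):
    """Apply a pre-defined cutoff (hypothetical) to assign a patient group."""
    return "eligible" if score >= cutoff else "ineligible"

# Illustrative values only; not derived from any patient data.
example_stats = {"ripley_k": 2.0, "morans_i": 0.5}
example_weights = {"ripley_k": 0.25, "morans_i": 1.0}
example_score = diagnostic_score(example_stats, example_weights)
example_group = stratify(example_score, cutoff=0.8)
```

In practice the weights and cutoff would be established from a reference cohort before being applied to new patients.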
This application is a continuation-in-part (CIP) of U.S. Ser. No. 15/181,156, filed Jun. 13, 2016, and titled “METHODS FOR ASSESSMENT OF SPATIAL STATISTICS OF CELLS IN TISSUE SAMPLES”, which is a CIP of U.S. Ser. No. 14/189,833, filed Feb. 25, 2014, and titled “CELL-BASED IMAGE REGISTRATION”, which is a CIP of U.S. Ser. No. 13/247,991, filed Sep. 28, 2011, titled “METHODS FOR FEATURE ANALYSIS ON CONSECUTIVE TISSUE SECTIONS”, which issued as U.S. Pat. No. 8,787,651 on Jul. 22, 2014; the contents of each of which are hereby incorporated by reference.
| | Number | Date | Country |
|---|---|---|---|
| Parent | 15181156 | Jun 2016 | US |
| Child | 15828587 | | US |
| Parent | 14189833 | Feb 2014 | US |
| Child | 15181156 | | US |
| Parent | 13247991 | Sep 2011 | US |
| Child | 14189833 | | US |