The present invention relates generally to image analysis methods for the assessment of tissue samples. More specifically, the present invention relates to image analysis methods for the evaluation of tissue objects within a tissue sample based on image analysis feature clustering within those tissue objects.
Several methods exist which allow grouping of data points into categories based on their measured similarities. Current big data approaches simultaneously measure hundreds to thousands of features, or ‘dimensions’, of each data point. Multidimensional data clustering takes into account every feature of a data point in order to group the data into categories of similarity. Methods such as k-means, HDBSCAN, and t-SNE are some of the most popular algorithms for clustering and visualizing multidimensional data in use today. Many of these methods are used solely for the purpose of visualizing graphic representations of the organization of the data itself.
In accordance with the embodiments herein, a method for analyzing tissue objects using image analysis feature clustering is disclosed. The method described herein generally utilizes digital image analysis of tissue objects within a digital image of at least one tissue sample. The tissue objects have one or more common image analysis features extracted from the image and are then grouped into clusters based on the commonalities of the image analysis feature or features. These individual clusters are then used to generate a cluster map, which can be used to coordinate the tissue object clusters with the location of the clustered tissue objects within the digital images. Each tissue object is identified in the original image based on the category into which the object has been clustered.
In the following description, for purposes of explanation and not limitation, details and descriptions are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to those skilled in the art that the present invention may be practiced in other embodiments that depart from these details and descriptions without departing from the spirit and scope of the invention.
For purposes of definition, a tissue object is one or more of a cell (e.g., immune cell), a cell sub-compartment (e.g., nucleus, cytoplasm, membrane, organelle), a cell neighborhood, a tissue compartment (e.g., tumor, tumor microenvironment (TME), stroma, lymphoid follicle, healthy tissue), a blood vessel, a lymphatic vessel, a vacuole, collagen, a region of necrosis, extra-cellular matrix, a medical device (e.g., stent, implant), a gel, a parasitic body (e.g., virus, bacterium), a nanoparticle, a polymer, and/or a non-dyed object (e.g., metal particle, carbon particle). Tissue objects are visualized by histologic stains which highlight the presence and localization of a tissue object. Tissue objects can be identified directly by stains specifically applied to highlight the presence of said tissue object (e.g., hematoxylin to visualize nuclei, IHC stain for a protein specifically found in a muscle fiber membrane), identified indirectly by stains which non-specifically highlight the tissue compartment (e.g., DAB background staining), identified by biomarkers known to be localized to a specific tissue compartment (e.g., nuclear-expressed protein, carbohydrates only found in the cell membrane), or visualized without staining (e.g., carbon residue in lung tissue).
For the purpose of this disclosure, patient status includes diagnosis of inflammatory status, disease state, disease severity, disease progression, therapy efficacy, and changes in patient status over time. Other patient statuses are contemplated.
In an illustrative embodiment of the invention, as summarized in
In a second illustrative embodiment of the invention, the method may be summarized in the following four steps: i) acquiring a digital image of each of a plurality of tissue samples; ii) extracting image analysis features from the tissue objects within the digital images using a computer system; iii) grouping the tissue objects into tissue object clusters based on the similarities of the extracted image analysis features; and iv) generating at least one cluster map for at least one of the digital images through coordination of at least one of the tissue object clusters with the location of the clustered tissue objects within the digital image. As with the previous embodiment, the plurality of tissue samples will typically be stained with a number of stains to ensure that the tissue objects within the samples will be easily distinguishable. However, it is understood that staining the samples is not required for the method to function.
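For example, and not limitation, steps ii) through iv) may be sketched in Python as follows. The sketch assumes that image acquisition and object segmentation have already produced one record per tissue object; the record fields (`x`, `y`, `area`, `stain`), the choice of k-means, and the helper names are illustrative assumptions and are not prescribed by this disclosure.

```python
from math import dist

# Hypothetical tissue objects: pixel location plus two extracted features.
tissue_objects = [
    {"x": 10, "y": 12, "area": 30.0, "stain": 0.10},
    {"x": 14, "y": 40, "area": 32.0, "stain": 0.12},
    {"x": 80, "y": 15, "area": 90.0, "stain": 0.80},
    {"x": 85, "y": 44, "area": 95.0, "stain": 0.85},
]

def min_max_scale(rows):
    """Rescale every feature column to [0, 1] so no feature dominates the distance."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

def kmeans(points, k, iterations=20):
    """Lloyd's k-means with a deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centers[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

# Step ii: gather the extracted features; step iii: group objects into clusters.
features = min_max_scale([(o["area"], o["stain"]) for o in tissue_objects])
labels = kmeans(features, k=2)

# Step iv: a minimal cluster map pairing each object's location with its cluster.
cluster_map = [(o["x"], o["y"], label) for o, label in zip(tissue_objects, labels)]
```

Any other multidimensional clustering method could be substituted for the `kmeans` helper; the cluster map only requires that each tissue object carry both a location and a cluster assignment.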
In a further embodiment, the image analysis features include morphometric features, localization features, neighborhood features, and staining features of the tissue objects within the tissue sample. Morphometric features are features related to the size, shape, area, texture, organization, and organizational relationship of tissue objects observed in a digital image. For example, and not limitation, morphometric features could be the area of a cell nucleus, the completeness of biomarker staining in a cell membrane, the diameter of a cell nucleus, the roundness of a blood vessel, lacunarity of biomarker staining in a nucleus, etc.
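By way of illustration only, several of the morphometric features named above (area, diameter, roundness) may be computed from the outline of a segmented tissue object. The sketch below assumes the outline is available as an ordered list of polygon vertices; the function name, the use of an equivalent-circle diameter, and the roundness definition 4πA/P² are illustrative choices rather than requirements of the method.

```python
from math import dist, pi, sqrt

def morphometrics(outline):
    """Area, perimeter, equivalent diameter and roundness of a closed polygon."""
    # Pair every vertex with the next one, wrapping around to close the outline.
    edges = list(zip(outline, outline[1:] + outline[:1]))
    # Shoelace formula for the enclosed area.
    area = abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in edges)) / 2.0
    perimeter = sum(dist(p, q) for p, q in edges)
    return {
        "area": area,
        "perimeter": perimeter,
        # Diameter of the circle that has the same area as the object.
        "equivalent_diameter": 2.0 * sqrt(area / pi),
        # 1.0 for a perfect circle, smaller for less round shapes.
        "roundness": 4.0 * pi * area / perimeter ** 2,
    }

# Hypothetical nucleus outline: a 10 x 10 pixel square.
square_nucleus = [(0, 0), (10, 0), (10, 10), (0, 10)]
nucleus_features = morphometrics(square_nucleus)
```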
Localization features are features related to position of a feature in the tissue section, spatial relationships of tissue objects relative to each other, relationship of image analysis features between tissue objects in the tissue section, and distribution of image analysis features within a tissue object. Location can be determined based on an absolute coordinate system (e.g., x and y location based on the pixel dimensions of the image, μm from the center of the image as defined by its pixel dimensions) or a relative coordinate system (e.g., x and y position of cells relative to a tissue feature of interest such as a vessel, polar coordinates referenced to the center of mass of a tumor nest), and can be expressed, for example, in x-y-z coordinates or polar coordinates. Location for specific image objects can be defined as the centroid of the object or any position enclosed by the object extending from the centroid to the exterior limits of the object.
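As a non-limiting sketch of the absolute and relative coordinate systems described above, the centroid of a tissue object may be taken as its absolute location and then re-expressed in polar coordinates relative to a reference feature. The pixel lists, the vessel center, and the function names below are hypothetical.

```python
from math import atan2, degrees, hypot

def centroid(pixels):
    """Absolute location of an object: mean x and mean y of its pixels."""
    xs, ys = zip(*pixels)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def relative_polar(point, reference):
    """Distance and angle (degrees) of a point measured from a reference location."""
    dx, dy = point[0] - reference[0], point[1] - reference[1]
    # The angle follows image axes, so positive y points down the image.
    return hypot(dx, dy), degrees(atan2(dy, dx))

# Hypothetical cell occupying four pixels, and a hypothetical vessel center.
cell_pixels = [(13, 24), (14, 24), (13, 25), (14, 25)]
vessel_center = (10.5, 20.5)

cell_location = centroid(cell_pixels)
distance_px, angle_deg = relative_polar(cell_location, vessel_center)
```

Multiplying `distance_px` by the image resolution (μm per pixel) would convert the relative distance to physical units.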
Neighborhood features are features related to tissue object morphology within a distance of an anchor tissue object, tissue object staining within a distance of an anchor tissue object, and morphology and/or staining between tissue objects within a distance of an anchor tissue object. For example, and not limitation, neighborhood features could be the average size or area of cells within 100 microns of an anchor cell or the quality or quantity of staining of cell nuclei within 500 microns of an anchor cell nucleus.
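The first example above, the average area of cells within a fixed distance of an anchor cell, may be sketched as follows. The cell records and field names are hypothetical, and excluding the anchor cell from its own neighborhood is an assumption made for this sketch; the disclosure does not specify it either way.

```python
from math import dist

def neighborhood_mean_area(anchor, cells, radius_um):
    """Mean area of all cells within radius_um of the anchor (anchor excluded)."""
    neighbors = [
        cell for cell in cells
        if cell is not anchor and dist(cell["xy_um"], anchor["xy_um"]) <= radius_um
    ]
    if not neighbors:
        return None  # no neighbors inside the radius
    return sum(cell["area_um2"] for cell in neighbors) / len(neighbors)

# Hypothetical cells with centroid positions in microns and areas in square microns.
cells = [
    {"xy_um": (0.0, 0.0), "area_um2": 50.0},     # anchor cell
    {"xy_um": (30.0, 40.0), "area_um2": 60.0},   # 50 microns from the anchor
    {"xy_um": (60.0, 70.0), "area_um2": 80.0},   # about 92 microns from the anchor
    {"xy_um": (300.0, 0.0), "area_um2": 500.0},  # 300 microns away, outside the radius
]

mean_area = neighborhood_mean_area(cells[0], cells, radius_um=100.0)
```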
Staining features are features related to stain appearance, stain intensity, stain completeness, stain shape, stain texture, stain area, and stain distribution of specified IHC, ISH, and IF stains or dyes or amount of a molecule determined by MSI-based methodologies. Staining features are evaluated relative to tissue objects (e.g., average staining intensity in each cell in an image, staining level in a cell membrane, biomolecule expression in a nucleus).
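For example, and not limitation, the average staining intensity of each tissue object may be computed from a stain intensity image together with a label mask that assigns every pixel to a tissue object. The sketch below assumes both are available as equally sized grids, with label 0 denoting background; these conventions and the function name are illustrative assumptions.

```python
def mean_stain_per_object(label_mask, intensity):
    """Average stain intensity of every labeled object (label 0 is background)."""
    totals, counts = {}, {}
    for mask_row, intensity_row in zip(label_mask, intensity):
        for label, value in zip(mask_row, intensity_row):
            if label == 0:
                continue
            totals[label] = totals.get(label, 0.0) + value
            counts[label] = counts.get(label, 0) + 1
    return {label: totals[label] / counts[label] for label in totals}

# Hypothetical 2 x 4 image: object 1 on the left, object 2 on the right.
label_mask = [
    [1, 1, 0, 2],
    [1, 1, 0, 2],
]
intensity = [
    [0.2, 0.4, 0.9, 0.8],
    [0.2, 0.4, 0.9, 0.6],
]

stain_means = mean_stain_per_object(label_mask, intensity)
```

The same traversal could accumulate other staining features, such as stained area (the count of pixels above a threshold) within each object.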
In another embodiment, the cluster map can be a chart of data points taken from the feature cluster, a graphical representation of a chart of data points taken from the feature cluster, or a digital image of the feature cluster. The graphical cluster maps, the graphical representation, or the digital image, can be overlaid on top of the digital image of the tissue sample, or samples, such that the cluster map highlights the underlying tissue objects in the tissue sample.
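One possible way to overlay a cluster map on the digital image, offered only as a sketch, is to tint the pixels of each tissue object with a color assigned to that object's cluster. The color table, the blending weight `alpha`, and the label-mask representation are assumptions introduced for illustration.

```python
# Hypothetical colors per cluster: cluster 0 in blue, cluster 1 in red.
CLUSTER_COLORS = {0: (0, 0, 255), 1: (255, 0, 0)}

def overlay_cluster_map(image, label_mask, object_clusters, alpha=0.4):
    """Blend each object's cluster color into the RGB image at that object's pixels."""
    blended = [list(row) for row in image]
    for r, mask_row in enumerate(label_mask):
        for c, object_id in enumerate(mask_row):
            if object_id == 0:
                continue  # background pixels keep their original color
            tint = CLUSTER_COLORS[object_clusters[object_id]]
            blended[r][c] = tuple(
                round((1 - alpha) * base + alpha * color)
                for base, color in zip(image[r][c], tint)
            )
    return blended

# Hypothetical 2 x 3 gray image with two objects separated by background.
image = [[(100, 100, 100)] * 3 for _ in range(2)]
label_mask = [
    [1, 0, 2],
    [1, 0, 2],
]
object_clusters = {1: 1, 2: 0}  # object id -> cluster assignment

overlay = overlay_cluster_map(image, label_mask, object_clusters)
```

Because the tint is partially transparent, the underlying tissue objects remain visible beneath the cluster colors.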
In a further embodiment, the cluster map can be used to assign the tissue objects biological descriptions, such as a cell type, structural formation of cells, disease state, or features of clinical or anatomical pathology.
In another embodiment, the cluster map is used to calculate a score for each patient from whom the tissue samples were taken. This score is then used to determine the patient status for that patient. This can be performed whether the method is used for a single tissue section or for a plurality of tissue sections, such that the determination of patient status for each patient is agnostic to how the cluster map was developed.
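The disclosure does not fix a particular scoring formula. Purely as an illustrative assumption, a score could be the fraction of a patient's tissue objects that fall into a cluster of interest, compared against a cutoff to assign a status. The cluster of interest, the cutoff value, the status labels, and the patient data below are all hypothetical.

```python
from collections import Counter

def cluster_fraction_score(cluster_labels, cluster_of_interest):
    """Fraction of a patient's tissue objects assigned to the cluster of interest."""
    counts = Counter(cluster_labels)
    return counts[cluster_of_interest] / len(cluster_labels)

def patient_status(score, cutoff=0.5):
    """Map a score to a status using a hypothetical cutoff."""
    return "elevated" if score >= cutoff else "not elevated"

# Hypothetical cluster assignments of the tissue objects for two patients.
labels_by_patient = {
    "patient_A": [0, 0, 1, 1, 1, 1, 0, 1],
    "patient_B": [0, 0, 0, 1, 0, 0, 0, 0],
}

scores = {
    patient: cluster_fraction_score(labels, cluster_of_interest=1)
    for patient, labels in labels_by_patient.items()
}
statuses = {patient: patient_status(score) for patient, score in scores.items()}
```

The scoring step consumes only the per-object cluster assignments, so it is unaffected by whether those assignments came from a single tissue section or from a plurality of sections.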