The present systems, methods, and apparatuses relate generally to digital histopathological imaging and, more particularly, to analyzing histopathology images to determine the presence of certain predetermined abnormalities.
Pathologists review histopathology images to determine whether any abnormalities are present in the tissue within the image. For example, a pathologist may review histopathology images of tissue collected during a visit to a dermatologist to determine whether a mole is a carcinoma. Or, a pathologist may review tissue collected from a patient during surgery (while the patient is still under anesthesia) to determine whether a tumor has been completely removed. Anytime a pathologist reviews a histopathology image, the pathologist must review the entire image to determine whether a particular abnormality is present, which can be a slow and tedious process. In some cases, a resident physician is assigned the task of initially reviewing histopathology images to identify potential abnormalities for review by the pathologist, which can further delay the histopathology review process.
Therefore, there is a long-felt but unresolved need for a system, method, or apparatus that quickly and efficiently analyzes histopathology images to determine the presence of certain predetermined abnormalities.
Briefly described, and according to one embodiment, aspects of the present disclosure generally relate to systems, methods, and apparatuses for analyzing histopathology images to determine the presence of certain predetermined abnormalities.
Histopathology images generally comprise digitized versions of microscopic views of tissue slides that may contain pieces of tissue from various human organs or abnormal masses within the body (e.g., lymph node, tumor, skin, etc.). These histopathology images may be collected for review by a trained professional (e.g., pathologist) to diagnose a particular disease, determine whether a tumor is malignant or benign, determine whether a surgeon has completely excised a tumor, etc. To expedite the review process (because pathologists view hundreds of histopathology images daily and often make only a binary decision regarding a particular histopathology image: either the image contains cancerous cells or it does not), the tissue analysis system described in the present disclosure processes histopathology images to identify/highlight regions of interest (e.g., a region that may comprise parts of a tumor, cancerous cells, or other predetermined abnormality), typically for subsequent review by a pathologist or other trained professional. Generally, by identifying/highlighting regions of interest within the histopathology images, the pathologist's review time of a particular histopathology image is reduced because the review need only cover the regions of interest and not the entire histopathology image. In some embodiments, professionals need not review the results of the processing because the tissue analysis system automatically identifies abnormalities within the histopathology images.
The tissue analysis system may, in various embodiments, process multiple histopathology images either concurrently or simultaneously to identify regions of interest in each of the histopathology images. In various embodiments, the identification of regions of interest within a histopathology image comprises the following processes: tissue identification, artifact removal, low-resolution analysis, and high-resolution analysis. In one embodiment, tissue identification is the process by which the present tissue analysis system identifies tissue regions (and, in one embodiment, a particular type of tissue) within the histopathology image (e.g., separating the tissue regions from the blank background regions). Generally, tissue identification increases the accuracy and efficiency of the tissue analysis system. Artifact removal, in one embodiment, is the process by which the tissue analysis system removes from the histopathology image artifacts (e.g., blurry regions, fingerprints, foreign objects such as dust or hair, etc.) that may have accidentally been included on the tissue slide, also increasing the accuracy and efficiency of the tissue analysis system. In one embodiment, low-resolution analysis is the process by which the tissue analysis system identifies potential regions of interest, with an emphasis on speed and/or low-resource processing (not necessarily accuracy), for subsequent confirmation as regions of interest based on certain predefined features within the identified tissue (e.g., cellular structures, nuclei patterns, etc.). High-resolution analysis, in one embodiment, is the process by which the tissue analysis system confirms whether a particular potential region of interest should be considered a region of interest, based on predefined nuclei patterns, for subsequent analysis by a professional.
In various embodiments, the identified regions of interest (and other parts of the process as disclosed herein) are flagged and stored with the histopathology image as a layer(s) on top of the histopathology image that may be viewed (or removed from view) by the professional.
For example, to determine whether a patient has cancer, the tissue analysis system may process the histopathology image(s) of lymph node tissue to identify regions of interest that may contain cancerous cells. Accordingly, the histopathology image(s) undergo the tissue identification process to identify the lymph node tissue within the histopathology image(s) and confirm that the tissue is lymph node tissue and not some other tissue (e.g., adipose tissue, etc.). Similarly, the histopathology image(s) undergo the artifact removal process to remove any artifacts contained within the histopathology image(s). The histopathology image(s) then undergo the low-resolution analysis process to quickly identify potential regions of interest, followed by the high-resolution analysis process, during which the tissue analysis system identifies/flags, for subsequent review by a pathologist, regions of interest that may contain cancerous cells. Thus, the pathologist may quickly review the histopathology image(s) of the lymph node tissue to determine whether a patient has cancer.
In one embodiment, a method for processing images of cells to identify cellular nuclei within the cells for use in connection with identifying a possible abnormality with respect to the cells, comprising the steps of: receiving an image of one or more cells, each cell having a cellular nucleus, wherein the image of the one or more cells comprises a plurality of pixels of varying brightness; applying a sampling matrix to each of the plurality of pixels of the image of the one or more cells, wherein the sampling matrix determines one or more first and second derivatives with respect to the brightness of a particular pixel to which the sampling matrix was applied; determining a consistency for each of the one or more first and second derivatives; and selecting, based on the determined consistency for each of the one or more first and second derivatives, one or more edges of a cellular nucleus within the image of the one or more cells, wherein the selected one or more edges of the cellular nucleus help define the shape of the cellular nucleus.
In one embodiment, a system for processing images of cells to identify cellular nuclei within the cells for use in connection with identifying a possible abnormality with respect to the cells, comprising: one or more electronic computing devices; and a processor operatively connected to the one or more electronic computing devices, wherein the processor is operative to: receive an image of one or more cells from the one or more electronic computing devices, each cell having a cellular nucleus, wherein the image of the one or more cells comprises a plurality of pixels of varying brightness; apply a sampling matrix to each of the plurality of pixels of the image of the one or more cells, wherein the sampling matrix determines one or more first and second derivatives with respect to the brightness of a particular pixel to which the sampling matrix was applied; determine a consistency for each of the one or more first and second derivatives; and select, based on the determined consistency for each of the one or more first and second derivatives, one or more edges of a cellular nucleus within the image of the one or more cells, wherein the selected one or more edges of the cellular nucleus help define the shape of the cellular nucleus.
In one embodiment, a method for processing images of cells to identify cellular nuclei within the cells and to determine nuclei shapes within the cells for use in connection with identifying a possible abnormality with respect to the cells, comprising the steps of: receiving an image of one or more cells, each cell having a cellular nucleus, wherein the image of the one or more cells comprises a plurality of pixels of varying brightness and shape data regarding at least one particular nucleus within the image of the one or more cells; selecting, based on the shape data, an initial pixel within the at least one particular nucleus from which to determine the shape of the at least one particular nucleus; adding additional pixels to the initial pixel, based on one or more predefined rules, until the number of pixels within the at least one particular nucleus exceeds a predetermined threshold value; and determining, based on the additional pixels, the shape of the at least one particular nucleus.
In one embodiment, a system for processing images of cells to identify cellular nuclei within the cells and to determine nuclei shapes within the cells for use in connection with identifying a possible abnormality with respect to the cells, comprising: one or more electronic computing devices; and a processor operatively connected to the one or more electronic computing devices, wherein the processor is operative to: receive, from the one or more electronic computing devices, an image of one or more cells, each cell having a cellular nucleus, wherein the image of the one or more cells comprises a plurality of pixels of varying brightness and shape data regarding at least one particular nucleus within the image of the one or more cells; select, based on the shape data, an initial pixel within the at least one particular nucleus from which to determine the shape of the at least one particular nucleus; add additional pixels to the initial pixel, based on one or more predefined rules, until the number of pixels within the at least one particular nucleus exceeds a predetermined threshold value; and determine, based on the additional pixels, the shape of the at least one particular nucleus.
In one embodiment, a method for processing images of cells comprising cellular nuclei to determine the presence of an abnormality within the cells, comprising the steps of: receiving an image of one or more cells, each cell having a cellular nucleus, wherein the image of the one or more cells comprises a plurality of pixels of varying brightness; identifying, based on the brightness of the plurality of pixels, one or more edges corresponding to a particular cellular nucleus; defining, based on the identified one or more edges and the plurality of pixels, a shape of the particular cellular nucleus; and comparing the shape of the particular cellular nucleus to one or more predefined rules to determine whether the shape of the particular cellular nucleus indicates the presence of the abnormality in the one or more cells.
In one embodiment, a system for processing images of cells comprising cellular nuclei to determine the presence of an abnormality within the cells, comprising: one or more electronic computing devices; and a processor operatively connected to the one or more electronic computing devices, wherein the processor is operative to: receive, from the one or more electronic computing devices, an image of one or more cells, each cell having a cellular nucleus, wherein the image of the one or more cells comprises a plurality of pixels of varying brightness; identify, based on the brightness of the plurality of pixels, one or more edges corresponding to a particular cellular nucleus; define, based on the identified one or more edges and the plurality of pixels, a shape of the particular cellular nucleus; and compare the shape of the particular cellular nucleus to one or more predefined rules to determine whether the shape of the particular cellular nucleus indicates the presence of the abnormality in the one or more cells.
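To make the final comparison step concrete, the following sketch scores a candidate nucleus shape (a boolean pixel mask) against two illustrative rules: abnormal size and abnormal irregularity. The feature definitions, the perimeter approximation, and the thresholds are hypothetical examples chosen for demonstration, not values taken from the disclosure.

```python
import numpy as np

def nucleus_features(mask):
    # Area in pixels, plus a rough circularity estimate.  Perimeter is
    # approximated by counting exposed 4-neighbor sides of mask pixels;
    # circularity = 4*pi*area / perimeter**2 is 1.0 for a perfect circle
    # and falls toward 0 for elongated or irregular shapes.
    area = int(mask.sum())
    padded = np.pad(mask, 1)
    perimeter = 0
    for y, x in zip(*np.nonzero(padded)):
        perimeter += sum(not padded[y + dy, x + dx]
                         for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)))
    circ = 4 * np.pi * area / perimeter ** 2 if perimeter else 0.0
    return area, circ

def is_abnormal(mask, max_area=40, min_circularity=0.5):
    # Hypothetical rule set: flag nuclei that are abnormally large or
    # abnormally irregular in shape.
    area, circ = nucleus_features(mask)
    return area > max_area or circ < min_circularity
```

A compact 3x3 square mask passes both rules, while a thin 1x10 line mask fails the circularity rule and is flagged.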
According to one aspect of the present disclosure, the method, wherein the sampling matrix comprises an arc-shaped filter. Furthermore, the method, wherein determining the consistency for each of the one or more first and second derivatives further comprises the steps of: determining an arithmetic mean for each of the one or more first and second derivatives; and determining a standard deviation for each of the one or more first and second derivatives. Moreover, the method, wherein the one or more first and second derivatives comprise one or more arc lengths. Further, the method, wherein the consistency for each of the one or more first and second derivatives is determined for each of the one or more arc lengths. Additionally, the method, wherein selecting the one or more edges of the cellular nucleus within the image of the one or more cells further comprises the steps of: converting the determined consistency for each of the one or more first and second derivatives into a normalized signal-to-noise ratio value; and selecting the one or more edges of the cellular nucleus within the image of the one or more cells corresponding to the determined consistency for each of the one or more first and second derivatives with the maximum normalized signal-to-noise ratio value. Also, the method, wherein the image of the one or more cells comprises a preprocessed image of the one or more cells.
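The arc-sampling and consistency steps recited above can be illustrated with a much-simplified sketch. Here the sampling matrix is reduced to a ring of radial first differences around a candidate nucleus center, and consistency is summarized as mean over standard deviation, i.e., a signal-to-noise ratio; the ring with the maximum SNR is selected as the nucleus edge. The second-derivative samples, the arc-length variants, and the normalization recited in the disclosure are omitted for brevity, so this is a sketch of the idea rather than the disclosed algorithm.

```python
import numpy as np

def radial_derivative_snr(img, cy, cx, radius, n_samples=16):
    # First (central) difference of brightness across a ring of the given
    # radius around (cy, cx).  A boundary at this radius produces the same
    # jump at every sample, so the derivative is consistent: large |mean|,
    # small standard deviation, and hence a large signal-to-noise ratio.
    diffs = []
    for k in range(n_samples):
        a = 2 * np.pi * k / n_samples
        dy, dx = np.sin(a), np.cos(a)
        inner = img[int(round(cy + (radius - 1) * dy)),
                    int(round(cx + (radius - 1) * dx))]
        outer = img[int(round(cy + (radius + 1) * dy)),
                    int(round(cx + (radius + 1) * dx))]
        diffs.append((outer - inner) / 2.0)
    diffs = np.asarray(diffs)
    mu, sd = abs(diffs.mean()), diffs.std()
    if sd < 1e-12:
        return float("inf") if mu > 1e-12 else 0.0
    return mu / sd

def select_edge_radius(img, cy, cx, radii):
    # The most consistent ring (maximum SNR) is selected as the edge.
    return max(radii, key=lambda r: radial_derivative_snr(img, cy, cx, r))
```

On a synthetic dark disk of radius 5, the ring at radius 5 yields a perfectly consistent derivative and is selected over rings inside or outside the boundary.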
According to one aspect of the present disclosure, the system, wherein the sampling matrix comprises an arc-shaped filter. Furthermore, the system, wherein to determine the consistency for each of the one or more first and second derivatives, the processor is further operative to: determine an arithmetic mean for each of the one or more first and second derivatives; and determine a standard deviation for each of the one or more first and second derivatives. Moreover, the system, wherein the one or more first and second derivatives comprise one or more arc lengths. Further, the system, wherein the consistency for each of the one or more first and second derivatives is determined for each of the one or more arc lengths. Additionally, the system, wherein to select the one or more edges of the cellular nucleus within the image of the one or more cells, the processor is further operative to: convert the determined consistency for each of the one or more first and second derivatives into a normalized signal-to-noise ratio value; and select the one or more edges of the cellular nucleus within the image of the one or more cells corresponding to the determined consistency for each of the one or more first and second derivatives with the maximum normalized signal-to-noise ratio value. Also, the system, wherein the processor is further operative to, prior to applying the sampling matrix to each of the plurality of pixels, preprocess the image of the one or more cells. In addition, the system, wherein the one or more electronic computing devices further comprise one or more slide scanners.
According to one aspect of the present disclosure, the method, wherein the shape data comprises data corresponding to one or more edges of the at least one particular nucleus and data regarding one or more initial pixels within the at least one particular nucleus. Furthermore, the method, wherein the one or more predefined rules define, based on one or more multivariate normal distribution intensities of the brightness of the additional pixels, the additional pixels most likely to be within the at least one particular nucleus. Moreover, the method, wherein the one or more multivariate normal distribution intensities are determined based on the brightness of the additional pixels and the shape data. Further, the method, wherein the shape data comprises the predetermined threshold value. Additionally, the method, further comprising the step of determining, after each additional pixel is added to the initial pixel, a fitness of a current shape of the at least one particular nucleus, wherein the fitness corresponds to the accuracy of the current shape of the at least one particular nucleus. Also, the method, wherein the shape of the at least one particular nucleus is determined based on the fitness determined after each additional pixel was added to the initial pixel. In addition, the method, wherein the image of the one or more cells comprises a preprocessed image of the one or more cells.
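The growth procedure described above can be sketched as simple seeded region growing. In this toy version, a plain brightness-similarity rule stands in for the multivariate normal intensity model, and the fitness of each intermediate shape is scored as the negated spread of brightness within the region; the snapshot with the best fitness is returned as the nucleus shape. The rule, the fitness measure, and the pixel threshold are all illustrative assumptions.

```python
import numpy as np

def grow_nucleus(img, seed, max_pixels=20):
    # Start from the initial (seed) pixel, repeatedly absorb the 4-neighbor
    # whose brightness is closest to the current region mean, and score a
    # fitness after every addition.  Growth stops once the pixel count
    # reaches max_pixels; the best-scoring snapshot defines the shape.
    region = {seed}
    best, best_fit = set(region), float("-inf")
    while len(region) < max_pixels:
        mean = np.mean([img[p] for p in region])
        frontier = {(y + dy, x + dx)
                    for (y, x) in region
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= y + dy < img.shape[0]
                    and 0 <= x + dx < img.shape[1]} - region
        if not frontier:
            break
        region.add(min(frontier, key=lambda p: abs(img[p] - mean)))
        # Fitness: a homogeneous region (low brightness spread) is a more
        # plausible nucleus than one that has leaked into the background.
        fit = -np.std([img[p] for p in region])
        if fit >= best_fit:
            best, best_fit = set(region), fit
    return best
```

Seeded inside a uniform dark blob, the region grows to cover the blob, and the fitness score prevents background pixels absorbed after the blob is exhausted from entering the final shape.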
According to one aspect of the present disclosure, the system, wherein the shape data comprises data corresponding to one or more edges of the at least one particular nucleus and data regarding one or more initial pixels within the at least one particular nucleus. Furthermore, the system, wherein the one or more predefined rules define, based on one or more multivariate normal distribution intensities of the brightness of the additional pixels, the additional pixels most likely to be within the at least one particular nucleus. Moreover, the system, wherein the one or more multivariate normal distribution intensities are determined based on the brightness of the additional pixels and the shape data. Further, the system, wherein the shape data comprises the predetermined threshold value. Additionally, the system, wherein the processor is further operative to determine, after each additional pixel is added to the initial pixel, a fitness of a current shape of the at least one particular nucleus, wherein the fitness corresponds to the accuracy of the current shape of the at least one particular nucleus. Likewise, the system, wherein the shape of the at least one particular nucleus is determined based on the fitness determined after each additional pixel was added to the initial pixel. Also, the system, wherein the processor is further operative to, prior to selecting the initial pixel, preprocess the image of the one or more cells. In addition, the system, wherein the one or more electronic computing devices further comprise one or more slide scanners.
According to one aspect of the present disclosure, the method, wherein identifying the one or more edges further comprises the steps of: applying a sampling matrix to each of the plurality of pixels of the image of the one or more cells, wherein the sampling matrix determines one or more first and second derivatives with respect to the brightness of a particular pixel to which the sampling matrix was applied; determining a consistency for each of the one or more first and second derivatives; and selecting, based on the determined consistency for each of the one or more first and second derivatives, one or more edges of a cellular nucleus within the image of the one or more cells, wherein the selected one or more edges of the cellular nucleus help define the shape of the cellular nucleus. Furthermore, the method, wherein defining the shape of the particular cellular nucleus further comprises the steps of: selecting, based on the identified one or more edges and the plurality of pixels, an initial pixel within the particular nucleus from which to determine the shape of the particular nucleus; adding additional pixels to the initial pixel, based on one or more predefined rules, until the number of pixels within the particular nucleus exceeds a predetermined threshold value; and determining, based on the additional pixels, the shape of the particular nucleus. Moreover, the method, wherein the one or more predefined rules comprise data regarding the characteristics of cellular nuclei comprising the particular abnormality. Further, the method, wherein the characteristics of nuclei are selected from the group comprising: a shape of the cellular nuclei, a size of the cellular nuclei, a spatial relationship between the cellular nuclei, and a number of the cellular nuclei within a region of predetermined size.
According to one aspect of the present disclosure, the method, further comprising the step of, prior to identifying the one or more edges, preprocessing the image of the one or more cells. Additionally, the method, wherein preprocessing the image of the one or more cells further comprises the step of identifying tissue comprising the one or more cells within the image of the one or more cells. Also, the method, wherein preprocessing the image of the one or more cells further comprises the steps of identifying one or more artifacts within the image of the one or more cells and removing the identified one or more artifacts from the image of the one or more cells. Furthermore, the method, wherein preprocessing the image of the one or more cells further comprises the step of converting the image of the one or more cells to a particular color space. Moreover, the method, wherein preprocessing the image of the one or more cells further comprises the step of extracting one or more particular color channels from the image of the one or more cells. Further, the method, wherein preprocessing the image of the one or more cells further comprises the step of selecting a particular image size for the image of the one or more cells. Additionally, the method, wherein preprocessing the image of the one or more cells further comprises the step of identifying one or more texture features within the image of the one or more cells. Also, the method, wherein preprocessing the image of the one or more cells further comprises the step of dividing the plurality of pixels into one or more groups of predetermined size.
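For illustration, several of the preprocessing steps listed above (color channel extraction, image resizing, and dividing the pixels into fixed-size groups) can be sketched as below. The channel choice, downscale factor, and tile size are arbitrary examples; tissue identification, artifact removal, and texture-feature extraction are omitted from this sketch.

```python
import numpy as np

def preprocess(rgb, scale=2, tile=2):
    # Illustrative preprocessing chain for an RGB histopathology image:
    # extract one color channel, downsample to a working size by block
    # averaging, and divide the pixels into fixed-size tiles for analysis.
    #
    # Color-channel step: take the green channel, which often carries
    # strong contrast in H&E-stained tissue (an example choice).
    chan = rgb[:, :, 1].astype(float)
    # Resize step: average scale-by-scale pixel blocks.
    h, w = chan.shape
    small = (chan[:h - h % scale, :w - w % scale]
             .reshape(h // scale, scale, w // scale, scale)
             .mean(axis=(1, 3)))
    # Grouping step: split the resized image into tile-by-tile groups.
    th, tw = small.shape[0] // tile, small.shape[1] // tile
    tiles = (small[:th * tile, :tw * tile]
             .reshape(th, tile, tw, tile)
             .swapaxes(1, 2)
             .reshape(th * tw, tile, tile))
    return small, tiles
```

An 8x8 input, for example, yields a 4x4 working image and four 2x2 pixel groups with the defaults above.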
According to one aspect of the present disclosure, the system, wherein to identify the one or more edges, the processor is further operative to: apply a sampling matrix to each of the plurality of pixels of the image of the one or more cells, wherein the sampling matrix determines one or more first and second derivatives with respect to the brightness of a particular pixel to which the sampling matrix was applied; determine a consistency for each of the one or more first and second derivatives; and select, based on the determined consistency for each of the one or more first and second derivatives, one or more edges of a cellular nucleus within the image of the one or more cells, wherein the selected one or more edges of the cellular nucleus help define the shape of the cellular nucleus. Furthermore, the system, wherein to define the shape of the particular cellular nucleus, the processor is further operative to: select, based on the identified one or more edges and the plurality of pixels, an initial pixel within the particular nucleus from which to determine the shape of the particular nucleus; add additional pixels to the initial pixel, based on one or more predefined rules, until the number of pixels within the particular nucleus exceeds a predetermined threshold value; and determine, based on the additional pixels, the shape of the particular nucleus. Moreover, the system, wherein the one or more predefined rules comprise data regarding the characteristics of cellular nuclei comprising the particular abnormality. Further, the system, wherein the characteristics of nuclei are selected from the group comprising: a shape of the cellular nuclei, a size of the cellular nuclei, a spatial relationship between the cellular nuclei, and a number of the cellular nuclei within a region of predetermined size. Additionally, the system, wherein the one or more electronic computing devices further comprise one or more slide scanners.
According to one aspect of the present disclosure, the system, wherein the processor, prior to identifying the one or more edges, is further operative to preprocess the image of the one or more cells. Also, the system, wherein to preprocess the image of the one or more cells, the processor is further operative to identify tissue comprising the one or more cells within the image of the one or more cells. Furthermore, the system, wherein to preprocess the image of the one or more cells, the processor is further operative to identify one or more artifacts within the image of the one or more cells and remove the identified one or more artifacts from the image of the one or more cells. Moreover, the system, wherein to preprocess the image of the one or more cells, the processor is further operative to convert the image of the one or more cells to a particular color space. Further, the system, wherein to preprocess the image of the one or more cells, the processor is further operative to extract one or more particular color channels from the image of the one or more cells. Additionally, the system, wherein to preprocess the image of the one or more cells, the processor is further operative to select a particular image size for the image of the one or more cells. Also, the system, wherein to preprocess the image of the one or more cells, the processor is further operative to identify one or more texture features within the image of the one or more cells. Additionally, the system, wherein to preprocess the image of the one or more cells, the processor is further operative to divide the plurality of pixels into one or more groups of predetermined size.
These and other aspects, features, and benefits of the claimed invention(s) will become apparent from the following detailed written description of the preferred embodiments and aspects taken in conjunction with the following drawings, although variations and modifications thereto may be effected without departing from the spirit and scope of the novel concepts of the disclosure.
The accompanying drawings illustrate one or more embodiments and/or aspects of the disclosure and, together with the written description, serve to explain the principles of the disclosure. Wherever possible, the same reference numbers are used throughout the drawings to refer to the same or like elements of an embodiment, and wherein:
For the purpose of promoting an understanding of the principles of the present disclosure, reference will now be made to the embodiments illustrated in the drawings and specific language will be used to describe the same. It will, nevertheless, be understood that no limitation of the scope of the disclosure is thereby intended; any alterations and further modifications of the described or illustrated embodiments, and any further applications of the principles of the disclosure as illustrated therein are contemplated as would normally occur to one skilled in the art to which the disclosure relates. All limitations of scope should be determined in accordance with and as expressed in the claims.
Whether a term is capitalized is not considered definitive or limiting of the meaning of a term. As used in this document, a capitalized term shall have the same meaning as an uncapitalized term, unless the context of the usage specifically indicates that a more restrictive meaning for the capitalized term is intended.
Aspects of the present disclosure generally relate to systems, methods, and apparatuses for analyzing histopathology images to determine the presence of certain predetermined abnormalities.
The tissue analysis system may, in various embodiments, process multiple histopathology images either concurrently or simultaneously to identify regions of interest in each of the histopathology images. In various embodiments, the identification of regions of interest within a histopathology image comprises the following processes: tissue identification, artifact removal, low-resolution analysis, and high-resolution analysis. In one embodiment, tissue identification is the process by which the present tissue analysis system identifies tissue regions (and, in one embodiment, a particular type of tissue) within the histopathology image (e.g., separating the tissue regions from the blank background regions). Generally, tissue identification increases the accuracy and efficiency of the tissue analysis system. Artifact removal, in one embodiment, is the process by which the tissue analysis system removes from the histopathology image artifacts (e.g., blurry regions, fingerprints, foreign objects such as dust or hair, etc.) that may have accidentally been included on the tissue slide, also increasing the accuracy and efficiency of the tissue analysis system. In one embodiment, low-resolution analysis is the process by which the tissue analysis system identifies potential regions of interest, with an emphasis on speed and/or low-resource processing (not necessarily accuracy), for subsequent confirmation as regions of interest based on certain predefined features within the identified tissue (e.g., cellular structures, nuclei patterns, etc.). High-resolution analysis, in one embodiment, is the process by which the tissue analysis system confirms whether a particular potential region of interest should be considered a region of interest, based on predefined nuclei patterns, for subsequent analysis by a professional.
In various embodiments, the identified regions of interest (and other parts of the process as disclosed herein) are flagged and stored with the histopathology image as a layer(s) on top of the histopathology image that may be viewed (or removed from view) by the professional.
For example, to determine whether a patient has cancer, the tissue analysis system may process the histopathology image(s) of lymph node tissue to identify regions of interest that may contain cancerous cells. Accordingly, the histopathology image(s) undergo the tissue identification process to identify the lymph node tissue within the histopathology image(s) and confirm that the tissue is lymph node tissue and not some other tissue (e.g., adipose tissue, etc.). Similarly, the histopathology image(s) undergo the artifact removal process to remove any artifacts contained within the histopathology image(s). The histopathology image(s) then undergo the low-resolution analysis process to quickly identify potential regions of interest, followed by the high-resolution analysis process, during which the tissue analysis system identifies/flags, for subsequent review by a pathologist, regions of interest that may contain cancerous cells. Thus, the pathologist may quickly review the histopathology image(s) of the lymph node tissue to determine whether a patient has cancer.
Referring now to the figures, for the purposes of example and explanation of the fundamental processes and components of the disclosed systems, methods, and apparatuses, reference is made to
Generally, the tissue analysis system 102 processes histopathology images to identify regions of interest of the images, often for subsequent analysis by a trained professional (e.g., pathologist). In various embodiments, histopathology images are digitized versions of microscopic views of tissue slides 104 (e.g., whole slide images, etc.) that may contain pieces of tissue from various human organs (e.g., lymph node, tumor, skin, etc.). These histopathology images may be collected to diagnose a particular disease, determine whether a tumor is malignant or benign, determine whether a surgeon has completely excised a tumor, etc. Accordingly, a pathologist may view hundreds of these histopathology images every day to make those determinations. To expedite the review process (because pathologists often make only a binary decision regarding a particular histopathology image: either the image contains cancerous cells or it does not), the tissue analysis system 102 processes histopathology images to identify/highlight regions of interest (e.g., a region that may contain parts of a tumor, cancerous cells, etc.). In one embodiment, a professional (e.g., pathologist) reviews the results of the processing to confirm the presence of the abnormality. In another embodiment, professionals need not review the results of the processing because the tissue analysis system 102 automatically identifies abnormalities within the histopathology images.
In various embodiments, the tissue slides 104 may comprise any slide capable of receiving tissue samples (e.g., glass, plastic, etc.). Thus, the tissue slides 104 usually contain a small, thin piece of tissue that has been excised from a patient for a specific purpose (e.g., diagnose cancer, confirm removal of tumor, etc.). In various embodiments, the tissue is stained to increase the visibility of certain features of the tissue (e.g., using a hematoxylin and eosin/H&E stain, etc.). Traditionally, tissue slides 104 were viewed in painstaking fashion via a microscope. More recently, the tissue slides 104 are scanned by a slide scanner 106 to generate histopathology images, so that pathologists need not use microscopes to view the tissue. Regardless, the pathologist must still review the entirety of the histopathology images to detect abnormalities. Generally, the tissue slides 104 may be loaded automatically into the slide scanner 106 at a rapid rate or may be fed individually into the slide scanner 106 by a technician or other professional. Accordingly, the slide scanner 106 generates a histopathology image that may comprise a detailed, microscopic view of the tissue slide 104 (e.g., an image with dimensions of 80,000×60,000 pixels, wherein 1 pixel is approximately 0.25 microns).
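As a rough sanity check on the scale described above, the stated resolution implies the physical field of view directly. The following is a minimal sketch (the function name is illustrative; the 80,000×60,000 pixel dimensions and the 0.25 micron/pixel figure are taken from the text):

```python
# Rough field-of-view check for an 80,000 x 60,000 pixel scan at
# approximately 0.25 microns per pixel (figures from the text above).
MICRONS_PER_PIXEL = 0.25

def field_of_view_mm(width_px, height_px, microns_per_pixel=MICRONS_PER_PIXEL):
    """Return the (width, height) of the scanned area in millimeters."""
    return (width_px * microns_per_pixel / 1000.0,
            height_px * microns_per_pixel / 1000.0)

w_mm, h_mm = field_of_view_mm(80_000, 60_000)
# An 80,000 x 60,000 pixel image covers roughly 20 mm x 15 mm of tissue,
# which is consistent with a whole-slide scan of a biopsy specimen.
```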
For example, to confirm whether a patient has cancer, a biopsy of the patient's lymph nodes may be performed so that a pathologist may determine whether cancerous cells are present within the patient's lymph nodes. Continuing with this example, the tissue retrieved by that biopsy may be placed on a tissue slide 104, which is fed into the slide scanner 106 to convert the tissue slide into a histopathology image that comprises a detailed view of the lymph node tissue sample. In another example, to confirm whether a surgeon has completely removed a tumor during surgery, a biopsy of the exterior portions of the removed tumor may be taken so that a pathologist may determine whether the tissue comprises the exterior of the tumor (thus, signifying that the tumor has been completely removed) or an interior portion of the tumor (thus, indicating that the surgeon must remove more of the tumor). The tissue retrieved by that biopsy may be placed on a tissue slide 104, which is fed into the slide scanner 106 to convert the tissue slide into a histopathology image that comprises a detailed view of the tumor tissue sample.
After the histopathology image(s) has been generated, the histopathology image(s) is transmitted to the tissue analysis system 102 for identification of regions of interest. Generally, the tissue analysis system 102 may process multiple histopathology images either concurrently or simultaneously to identify regions of interest in each of the histopathology images. In various embodiments, the identification of regions of interest within a histopathology image comprises the following processes: tissue identification, artifact removal, low-resolution analysis, and high-resolution analysis. In one embodiment, tissue identification is the process by which the tissue analysis system 102 identifies tissue regions (and, in one embodiment, a particular type of tissue) within the histopathology image (e.g., separating the tissue regions from the blank background regions), which will be explained in further detail in association with the description of
For example, continuing with the lymph node tissue example, the tissue analysis system 102 processes the histopathology image(s) of the lymph node tissue to identify regions of interest that may contain cancerous cells. Accordingly, the histopathology image(s) undergo a tissue identification process to identify the lymph node tissue within the histopathology image(s) and confirm that the tissue is lymph node tissue and not other tissue (e.g., fat tissue, etc.). Similarly, the histopathology image(s) undergo an artifact removal process to remove any artifacts contained within the histopathology image(s). The histopathology image(s) undergo a low-resolution analysis process to quickly identify potential regions of interest for further analysis during a high-resolution analysis process, during which the tissue analysis system 102 identifies, for subsequent review by a pathologist, regions of interest that may contain cancerous cells.
Thus, in various embodiments, the processed histopathology image with the identified regions of interest is viewed on an electronic computing device 108 by a professional. Generally, the electronic computing device 108 may be any device capable of displaying the processed histopathology image with sufficient resolution so that a professional may confirm whether a region of interest contains a certain predefined abnormality (e.g., computer, laptop, smartphone, tablet computer, etc.). In various embodiments, the professional may view multiple layers of the processed histopathology image as part of the subsequent analysis of the histopathology image. For example, the pathologist may view the lymph node tissue 802 in a view 110 without any of the layers from the process disclosed herein so that the pathologist is not influenced by the identifications made by the tissue analysis system 102 (e.g., in a view 110 as if the pathologist were viewing the original tissue slide 104). Similarly, the pathologist may view the lymph node tissue in a view 112 that shows cell nuclei groups 816 that were identified by the tissue analysis system 102 (as will be discussed in association with the description of
Now referring to
In various embodiments, the slide scanner 106 is any device that is capable of performing the functionality disclosed herein, such as ultra-high resolution scans of many tissue slides 104 at once (e.g., Ultra-Fast Scanner, available from Philips Digital Pathology, Best, Netherlands). In various embodiments, the slide scanner 106 communicates via network 204 with the tissue analysis system 102 and tissue analysis system database 202 to provide histopathology images for processing and storage, respectively.
Generally, the electronic computing device 108 is any device that is capable of performing the functionality disclosed herein and comprises a high-resolution display (e.g., desktop computer, laptop computer, tablet computer, smartphone, etc.). In various embodiments, the electronic computing device 108 communicates via network 204 with the tissue analysis system 102 and tissue analysis system database 202 to view processed histopathology images and, in one embodiment, provide certain administrative functionality with respect to the tissue analysis system 102 (e.g., defining preferences, calibrating, etc.).
Still referring to
Generally, the tissue analysis system 102 (and its engines) may be any computing device (e.g., desktop computer, laptop, servers, tablets, etc.), combination of computing devices, software, hardware, or combination of software and hardware that is capable of performing the functionality disclosed herein. In various embodiments, the tissue analysis system 102 may comprise a tissue identification engine 401, artifact removal engine 501, low-resolution engine 601, and high-resolution engine 701. In one embodiment, the tissue identification engine 401 conducts the tissue identification process (further details of which will be discussed in association with the description of
Referring now to
In various embodiments, the tissue analysis process 300 begins at step 302 when the tissue analysis system receives one or more histopathology images. In one embodiment, the histopathology images may come directly from a slide scanner (e.g., slide scanner 106 from
In one embodiment, the tissue analysis process 300 continues with the tissue identification process 400, wherein the system identifies tissue regions and, in one embodiment, a particular type of tissue within the histopathology image (further details of which will be explained in association with the description of
After the tissue identification process 400, in one embodiment, the system proceeds with the artifact removal process 500 (further details of which will be explained in association with the description of
In one embodiment, the tissue analysis process 300 continues with the low-resolution analysis process 600, wherein the system identifies potential regions of interest within the histopathology images (further details of which will be explained in association with the description of
After the low-resolution analysis process 600, in one embodiment, the system proceeds with the high-resolution analysis process 700 (further details of which will be explained in association with the description of
After the high-resolution analysis process 700, in one embodiment, the system determines, at step 306, whether there are additional histopathology images to process/analyze. If there are additional histopathology images to process/analyze, then the tissue analysis process 300 returns to step 304 and selects a histopathology image to process/analyze. If there are no additional histopathology images to process/analyze, then the tissue analysis process 300 ends thereafter. To further understand the tissue analysis process 300, additional explanation may be useful.
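The loop of steps 302 through 306 can be sketched as a simple driver over the four processes. The stage callables below are hypothetical stand-ins for the engines described herein, not an interface specified by the disclosure:

```python
def analyze_images(images, identify_tissue, remove_artifacts,
                   low_res_analysis, high_res_analysis):
    """Sketch of the tissue analysis process 300 (steps 302-306): for
    each received histopathology image, run tissue identification (400),
    artifact removal (500), low-resolution analysis (600), and
    high-resolution analysis (700), collecting the flagged regions of
    interest per image. The four callables are hypothetical stand-ins
    for the engines 401, 501, 601, and 701."""
    results = {}
    for name, image in images.items():              # step 304: select an image
        mask = identify_tissue(image)               # tissue identification
        mask = remove_artifacts(image, mask)        # artifact removal
        candidates = low_res_analysis(image, mask)  # low-resolution analysis
        results[name] = high_res_analysis(image, candidates)  # high-resolution
    return results                                  # step 306: no images remain
```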
Now referring to
In one embodiment, the exemplary tissue identification process 400 begins at step 402 when the system (e.g., tissue identification engine 401 from
Thus, at step 406, in one embodiment, the system selects an appropriately-sized image from the histopathology image for the analysis based on predefined criteria. Generally, the histopathology image may be stored as an image pyramid with multiple levels (e.g., the base level is the highest-resolution image and each level above corresponds to a lower-resolution version of the image below it). In one embodiment, the histopathology image may comprise an image with dimensions of 80,000×60,000 pixels, so the system selects a level that is less than 1,000 pixels in each dimension for ease of processing (and to limit the amount of necessary processing). In various embodiments, after selecting the appropriately-sized image, the system converts, at step 408, the selected image into the appropriate color space for the subsequent processing (e.g., a specific organization of colors such as sRGB, CIELCh, CMYK, etc.). As will occur to one having ordinary skill in the art, the appropriate color space may depend on the parameters of the subsequent processing algorithms. For example, a given histopathology image may be stored in sRGB color space, whereas an embodiment of the system may require CIELCh color space, so the system converts the histopathology image to CIELCh color space. In one embodiment, if the appropriately-sized image was created in the appropriate color space, then the system skips step 408 because the conversion is unnecessary.
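For illustration, selecting the appropriately-sized level from an image pyramid (step 406) might look like the following sketch. The factor-of-two relationship between pyramid levels is an assumption (a common pyramid convention) rather than something the disclosure specifies:

```python
def select_pyramid_level(base_width, base_height, max_dim=1000):
    """Step 406 sketch: walk up an image pyramid (assumed here to halve
    both dimensions at each level) until both dimensions fall below
    max_dim, limiting the amount of necessary processing. Returns
    (level, width, height); level 0 is the full-resolution base."""
    level, w, h = 0, base_width, base_height
    while w >= max_dim or h >= max_dim:
        level += 1
        w, h = w // 2, h // 2
    return level, w, h

# For the 80,000 x 60,000 pixel example from the text, this selects a
# level whose dimensions are both under 1,000 pixels.
```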
Still referring to
At step 414, in various embodiments, when the distribution of foreground or background does not fit a Gaussian mixture model, the system selects a threshold foreground/background value and marks pixels above the threshold as background (e.g., non-tissue) and below the threshold as foreground (e.g., tissue). Generally, steps 410, 412, and 414 perform similar functionality; thus, in one embodiment, the system only performs one of steps 410, 412, and 414 to identify the background (e.g., non-tissue) and foreground (e.g., tissue) regions of the histopathology image. At step 416, in various embodiments, once the tissue and non-tissue regions are identified, a binary layer (alternatively referred to herein as a “mask” or a “tissue mask”) is generated that identifies the tissue and non-tissue regions. Generally, the mask may be refined with a sequence of morphological filters to remove small holes in the mask (which likely should be identified as tissue) or small islands of tissue (which likely should be identified as non-tissue). In one embodiment, the mask is stored with the histopathology image so that subsequent processes may utilize the mask. Thus, at step 418, the system determines whether to confirm the particular tissue type of the tissue within the mask/histopathology image. If the system determines, at step 418, to not confirm the particular type of tissue within the mask, then the system initiates the artifact removal process 500. If, however, the system determines, at step 418, to confirm the particular tissue type of the tissue within the mask, then the system proceeds to step 420.
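A minimal sketch of the thresholding and mask refinement of steps 414 and 416 follows. The 3×3 majority vote stands in for the sequence of morphological filters, and the threshold semantics (darker pixels treated as tissue) are an assumption appropriate to a stained specimen on a bright background:

```python
import numpy as np

def tissue_mask(gray, threshold):
    """Steps 414-416 sketch: pixels darker than the threshold are marked
    foreground (tissue) and brighter pixels background (non-tissue); a
    3x3 majority vote then stands in for the morphological filters that
    remove small holes in the mask and small islands of tissue."""
    mask = gray < threshold                        # True = tissue
    padded = np.pad(mask, 1, mode="edge").astype(int)
    h, w = mask.shape
    # Count tissue votes in each pixel's 3x3 neighborhood (self included).
    votes = sum(padded[r:r + h, c:c + w] for r in range(3) for c in range(3))
    return votes >= 5                              # keep the majority

# A bright slide with a 3x3 block of darkly stained tissue in the middle;
# the majority vote trims the weakly supported corner pixels.
gray = np.full((5, 5), 200.0)
gray[1:4, 1:4] = 50.0
mask = tissue_mask(gray, threshold=128)
```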
After determining to confirm the particular tissue type of the tissue within the histopathology image (either at step 404 or 418), then the system proceeds to step 420, wherein the system selects an appropriately-sized image from the histopathology image for the analysis (as performed at step 406). In one embodiment, if an appropriately-sized image was previously selected at step 406 (or if the histopathology image is not stored as an image pyramid), then the system selects the previously-selected image or mask (or skips step 420). In various embodiments, after selecting the appropriately-sized image, the system converts, at step 422, the selected image into the appropriate color space for the subsequent processing (e.g., as performed at step 408). In one embodiment, if the appropriately-sized image was already converted into the appropriate color space, then the system skips step 422 because the conversion is unnecessary.
Generally, the system confirms the presence of a particular tissue type within the histopathology image by determining that the tissue comprises a particular expected threshold color value (based on the particular stain used to generate the histopathology image). For example, lymph node tissue, after receiving an H&E stain, is expected to be very blue in color. Accordingly, any tissue that is not very blue is likely not lymph node tissue (e.g., is adipose tissue, tissue of another organ, etc.). In various embodiments, at step 424, the system eliminates non-tissue regions of the histopathology image from the analysis. Generally, the system may automatically eliminate any pixels within the histopathology image that do not fall within the mask. In one embodiment, the system further selects a particular threshold value and eliminates pixels below that value (e.g., 20th percentile of a particular hue channel). As will occur to one having ordinary skill in the art, the functionality of step 424 may occur in subsequent processes discussed herein even if it is not explicitly described. At step 426, in various embodiments, the system generates an image based on the prototypical color value for the particular tissue by reducing the hue channel for that particular tissue by that prototypical color value. For example, in an H&E stained histopathology image, the prototypical color value would be a certain blue value; thus, the system would reduce the blue hue channel by that certain blue value to generate an image with more contrast. In various embodiments, at step 428, the system applies a filter (e.g., a 2D order statistic filter) to the generated image to determine the number of pixels within a certain area that are of a certain color value (e.g., at least 10% of the pixels within a 0.32 mm diameter circle are of the expected value). Thus, at step 430, the system selects a threshold value to which to compare all of the pixels that pass through the filter.
Accordingly, at step 432, a binary mask is generated with all of the pixels within the threshold and the mask is refined using morphological filters to remove isolated pixels, wherein the mask identifies the particular tissue type. Generally, this mask is stored with the histopathology image so that subsequent processes may utilize the mask. After storing the mask, in various embodiments, the system initiates the artifact removal process 500. In one embodiment, after initiating the artifact removal process 500, the exemplary tissue identification process 400 ends thereafter.
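The hue-based confirmation of steps 424 through 432 might be sketched as follows. The sliding-window fraction stands in for the 2D order-statistic filter, and the tolerance and window size are illustrative parameters (the 10% fraction echoes the example given above):

```python
import numpy as np

def confirm_tissue_type(hue, mask, expected_hue, tol=0.05,
                        window=5, min_fraction=0.10):
    """Steps 424-432 sketch: a masked (tissue) pixel supports the
    expected tissue type when at least min_fraction of the pixels in its
    window x window neighborhood lie within tol of the expected hue.
    The sliding-window fraction stands in for the 2D order-statistic
    filter; tol and window are assumptions, not disclosure values."""
    near = (np.abs(hue - expected_hue) <= tol) & mask
    pad = window // 2
    padded = np.pad(near.astype(float), pad, mode="constant")
    h, w = hue.shape
    frac = sum(padded[r:r + h, c:c + w]
               for r in range(window) for c in range(window)) / window ** 2
    return (frac >= min_fraction) & mask

# Left half near the expected (blue-ish) hue, right half far from it:
# only the left, well-supported side survives as the particular tissue.
hue = np.full((6, 6), 0.60)
hue[:, 3:] = 0.10
typed = confirm_tissue_type(hue, np.ones((6, 6), bool), expected_hue=0.60)
```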
Referring now to
In one embodiment, the exemplary artifact removal process 500 begins at step 502 when the system (e.g., artifact removal engine 501 from
If, at step 506, the system determines that it should remove blurry regions (because blurry regions were identified at step 504 and/or according to a predefined rule), then the system proceeds at step 508 to extract a predetermined color channel from the histopathology image (e.g., red, etc.) to improve the contrast of the histopathology image. At step 510, the system divides the histopathology image into regions of predetermined size (e.g., 100×100 pixels). Thus, at step 512, the system calculates the sharpness of each region by calculating a direction of the edge within the region, determining the pixels that correspond to that edge, calculating a thickness of each edge pixel (e.g., using a Taylor approximation, etc.), and calculating the sharpness of the region (e.g., the inverse of the median of the edge pixels' thicknesses, etc.). Based on the sharpness, the system, at step 514, classifies regions below a predetermined threshold as blurry and removes them from the mask/subsequent analysis. After classifying regions as blurry, the system, in one embodiment, initiates the low-resolution analysis process 600 (or, not shown in
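A sketch of the region-wise sharpness classification of steps 510 through 514 follows. The disclosure measures sharpness via edge-pixel thickness (e.g., a Taylor approximation); the variance of a discrete Laplacian is substituted here as a common, simpler proxy, and the tile size and threshold are illustrative:

```python
import numpy as np

def classify_blurry_regions(channel, region=4, threshold=50.0):
    """Steps 510-514 sketch: split an extracted color channel into
    region x region tiles and score each tile's sharpness. Here the
    variance of a discrete Laplacian stands in for the edge-thickness
    measure described in the text. Tiles scoring below the threshold
    are classified blurry and would be removed from the mask."""
    h, w = channel.shape
    blurry = []
    for r0 in range(0, h - region + 1, region):
        for c0 in range(0, w - region + 1, region):
            tile = channel[r0:r0 + region, c0:c0 + region].astype(float)
            # Discrete Laplacian over the tile interior.
            lap = (4 * tile[1:-1, 1:-1] - tile[:-2, 1:-1] - tile[2:, 1:-1]
                   - tile[1:-1, :-2] - tile[1:-1, 2:])
            if lap.var() < threshold:
                blurry.append((r0, c0))
    return blurry

# A sharp checkerboard tile next to a featureless (blurry) tile.
channel = np.zeros((4, 8))
channel[:, :4] = (np.add.outer(np.arange(4), np.arange(4)) % 2) * 255.0
channel[:, 4:] = 100.0
blurry = classify_blurry_regions(channel)
```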
If, however, at step 506, the system determines that it should not remove blurry regions (because none exist within the histopathology image and/or according to a predefined rule indicating, for example, certain areas from which to remove blurry regions, etc.), then the system proceeds at step 516 to remove the other identified artifacts from the histopathology image using image processing techniques similar to those used in steps 508-514. After removing the other identified artifacts, the system, in various embodiments, initiates the low-resolution analysis process 600. In one embodiment, after initiating the low-resolution analysis process 600, the exemplary artifact removal process 500 ends thereafter.
Now referring to
In one embodiment, the exemplary low-resolution analysis process 600 begins at step 602 when the system (e.g., low-resolution analysis engine 601 from
If the system determines, at step 604, not to calibrate the low-resolution analysis engine, then the system proceeds, in various embodiments, to step 614, wherein the system uniformly splits the histopathology image (e.g., the particular tissue type mask) into potential regions of interest (e.g., 100×100 pixel squares) that will each be analyzed to determine whether they potentially contain abnormalities. In one embodiment, the system splits the histopathology image into random, non-uniform potential regions of interest. Thus, at step 616, in various embodiments, the system calculates the texture features within each of the potential regions of interest. In various embodiments, at step 618, the system classifies the texture features by calculating a confidence metric for each potential region of interest that indicates the likelihood that a particular region of interest comprises the abnormality (e.g., how similar the calculated texture features are to the texture features of the representative histopathology images from calibration). Accordingly, at step 618, in various embodiments, the system identifies regions of interest for high-resolution analysis by generating a “map” of the confidence metrics for the histopathology image, eliminating any false positives/other outliers of the confidence metrics for regions of interest as compared to the surrounding regions of interest, identifying regions of interest that comprise local maximums of the confidence metrics within the histopathology image, and using those local maximums as seed points to grow/generate/identify particular regions of interest that should be analyzed using the high-resolution analysis process (e.g., assessing the size, shape, and confidence of each identified region of interest).
In one embodiment, the system attempts to identify the smallest possible number of regions of interest corresponding to the smallest number of the largest abnormalities that were potentially identified (e.g., three regions of interest corresponding to three large tumors instead of six regions of interest corresponding to six small tumors, wherein the three large tumors comprise the six small tumors). Thus, the system initiates the high-resolution analysis process 700 to confirm that the identified regions of interest comprise the particular abnormality.
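One way to sketch the seed-point identification at the end of the low-resolution analysis (step 618) is a local-maximum search over the per-tile confidence map. The minimum-confidence cutoff and neighborhood radius below are illustrative assumptions, and the subsequent region growing from each seed is omitted:

```python
import numpy as np

def seed_regions(confidence, neighborhood=1, min_conf=0.5):
    """Step 618 sketch: given a per-tile confidence "map" (one value per
    uniform potential region of interest), keep tiles that are local
    maxima of confidence over their neighborhood and exceed a minimum
    confidence, to serve as seed points for growing the regions of
    interest passed to the high-resolution analysis. The parameter
    values are assumptions, not from the disclosure."""
    h, w = confidence.shape
    seeds = []
    for r in range(h):
        for c in range(w):
            v = confidence[r, c]
            if v < min_conf:
                continue
            r0, r1 = max(0, r - neighborhood), min(h, r + neighborhood + 1)
            c0, c1 = max(0, c - neighborhood), min(w, c + neighborhood + 1)
            if v >= confidence[r0:r1, c0:c1].max():
                seeds.append((r, c))
    return seeds

# The 0.6 tile is suppressed because a stronger neighbor dominates it,
# consistent with preferring fewer, larger regions of interest.
confidence = np.array([[0.1, 0.2, 0.1],
                       [0.2, 0.9, 0.2],
                       [0.1, 0.2, 0.6]])
seeds = seed_regions(confidence)
```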
Referring now to
Referring now to
Now referring to
At step 716, in various embodiments, the system selects a particular nucleus to segment/further define its shape. Accordingly, the system processes the selected nucleus through the nuclear edge detection process 700C (further details of which will be explained in association with the description of
In various embodiments, at step 722, the system resolves/eliminates overlapping nuclei that may occur because the system accidentally segmented multiple nuclei from the same singular nucleus in the histopathology image. As will occur to one having ordinary skill in the art, because nuclei are segmented at seed pixels, one or more of the segmented nuclei may overlap. As nuclei in actuality do not overlap, the conflict of the overlapping nuclei may be resolved to increase the accuracy of the system. Thus, in one embodiment, a fitness score for each overlapping nucleus (e.g., a combination of size score indicating whether the size of the detected nucleus is appropriate, shape score indicating whether the shape of the detected nucleus is appropriate, and edge strength at the detected edge indicating how likely the edge has been detected) is calculated to indicate which nucleus should be retained. In one embodiment, to confirm that the resolution was correct, the system masks out the pixels of the retained nucleus and conducts steps 716 through 720 again on the eliminated seed pixel to determine whether an entire nucleus may be segmented from that point without the masked pixels (e.g., if it cannot be done, then the resolution was correct). In various embodiments, at step 724, the system detects nuclei clumps (e.g., detected nuclei that are likely larger and more heterogeneous than other detected nuclei because they contain multiple nuclei) that may reduce the accuracy of subsequent processes. Generally, regions with the least probability of being a singular nucleus are identified as potential nuclei clumps. Thus, at step 726, in one embodiment, the system attempts to split images of nuclei clumps into their individual nuclei using a hypothesis-driven model.
In one embodiment, all potential nuclei clump splits (e.g., hypotheses) are evaluated using a multiseed approach to the edge-driven region growing process 700D, wherein multiple seed pixels are grown at the same time (e.g., thereby competing for pixels, instead of overlapping to form the clump). The most probable nuclei clump split, in one embodiment, is selected for further processing. Accordingly, at step 728, the system removes false nuclei from the analysis. In various embodiments, nuclei may be unintentionally detected in other cell structures (e.g., stroma, etc.). Generally, these nuclei have irregular shape and color, so the system identifies (e.g., compared to all of the other segmented nuclei) the nuclei that are not the expected shape/color. After removing false nuclei, the system initiates feature extraction (at step 706 from
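The fitness-score resolution of step 722 can be sketched as follows. The equal weighting of the three score components is an assumption, as the disclosure does not specify how they are combined, and the confirmation pass (masking the retained nucleus and re-growing from the eliminated seed pixel) is omitted:

```python
def resolve_overlaps(nuclei, overlaps):
    """Step 722 sketch: for each pair of overlapping segmented nuclei,
    keep the one with the higher fitness score, a combination of the
    size score, shape score, and edge strength described above. Equal
    weights are assumed here; the disclosure leaves the combination
    unspecified."""
    def fitness(n):
        return n["size_score"] + n["shape_score"] + n["edge_strength"]

    eliminated = set()
    for a, b in overlaps:            # ids of two overlapping nuclei
        loser = a if fitness(nuclei[a]) < fitness(nuclei[b]) else b
        eliminated.add(loser)
    return [nid for nid in nuclei if nid not in eliminated]

# The better-fitting nucleus n1 is retained; n2 is eliminated.
retained = resolve_overlaps(
    {"n1": {"size_score": 0.9, "shape_score": 0.8, "edge_strength": 0.9},
     "n2": {"size_score": 0.4, "shape_score": 0.5, "edge_strength": 0.3}},
    overlaps=[("n1", "n2")])
```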
Referring now to
Now referring to
Referring now to
In various embodiments, at step 766, the system calculates the features of each subgroup that are relevant to a determination of abnormality. For example, the system may calculate the width, length, area, and aspect ratio of the group's convex hull; width, length, area, and aspect ratio of the group's locally-convex hull (e.g., the hull that would be created if there was a maximum allowed edge length in the convex hull); number of nuclei in the group; inverse strength of the nuclei's abnormality probability (e.g., the negative of the log of the mean probability of the group); number of benign nuclei (e.g., nuclei whose abnormality probability is below the threshold to be considered for grouping) near or within the group's boundary; mean and median probability of benign nuclei near or within the group's boundary; other aggregations of individual nuclei features including size and shape variability measures, texture measures of the nuclear interiors, mean, median, or other order statistics of the probability of a nucleus being a histiocyte, and normalized color measures inside the nuclei; and other features of the areas between the nuclei but inside (or near) the group including texture measures of the extra-nuclear area, normalized color measures, and aggregate statistics of approximations to the nuclear/cytoplasm area ratio for the cells included in the group. At step 768, in various embodiments, the system classifies the subgroups based on the calculated features to generate a probability of abnormality for each group/subgroup (e.g., the likelihood that the group/subgroup comprises the abnormality). Thus, at step 770, in various embodiments, the system calculates a probability of abnormality for the region of interest (e.g., the likelihood that the region of interest comprises the abnormality). 
In one embodiment, the probability of abnormality for a region of interest is the maximum of the probability of abnormality for all of the groups/subgroups within the region of interest. Generally, at step 772, the system flags a region of interest for further review by a professional (e.g., pathologist) if the probability of abnormality for the region of interest is above a predetermined threshold. As will occur to one having ordinary skill in the art, the predetermined threshold is determined based on the number of regions of interest that the professional wishes to be flagged (e.g., a lower predetermined threshold will result in more flagged regions of interest). Thus, at step 774, the results of the exemplary regional analysis process 700E are stored with the histopathology image for subsequent analysis by the professional and the exemplary regional analysis process 700E ends thereafter. In one embodiment, the system automatically determines whether the histopathology image comprises the abnormality and does not pass the image along to a professional for confirmation of the same. To further understand the tissue analysis process, a description of exemplary histopathology images may be helpful.
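The aggregation and flagging of steps 770 through 772 reduce to a short sketch; the 0.5 default threshold is illustrative only, since per the text the threshold is chosen according to how many flagged regions the professional wishes to review:

```python
def flag_regions(region_groups, threshold=0.5):
    """Steps 770-772 sketch: each region of interest's probability of
    abnormality is the maximum over its groups/subgroups, and regions
    at or above the predetermined threshold are flagged for review by a
    professional. The threshold value here is an illustrative
    assumption; a lower threshold flags more regions of interest."""
    flagged = {}
    for region_id, group_probs in region_groups.items():
        prob = max(group_probs) if group_probs else 0.0
        if prob >= threshold:
            flagged[region_id] = prob
    return flagged

# Only the region whose strongest group exceeds the threshold is flagged.
flagged = flag_regions({"roi_1": [0.2, 0.8], "roi_2": [0.1, 0.3]})
```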
Referring now to
In one embodiment,
In one embodiment,
In one embodiment,
From the foregoing, it will be understood that various aspects of the processes described herein are software processes that execute on computer systems that form parts of the system. Accordingly, it will be understood that various embodiments of the system described herein are generally implemented as specially-configured computers including various computer hardware components and, in many cases, significant additional features as compared to conventional or known computers, processes, or the like, as discussed in greater detail herein. Embodiments within the scope of the present disclosure also include computer-readable media for carrying or having computer-executable instructions or data structures stored thereon. Such computer-readable media can be any available media which can be accessed by a computer, or downloadable through communication networks. By way of example, and not limitation, such computer-readable media can comprise various forms of data storage devices or media such as RAM, ROM, flash memory, EEPROM, CD-ROM, DVD, or other optical disk storage, magnetic disk storage, solid state drives (SSDs) or other data storage devices, any type of removable non-volatile memories such as secure digital (SD), flash memory, memory stick, etc., or any other medium which can be used to carry or store computer program code in the form of computer-executable instructions or data structures and which can be accessed by a general purpose computer, special purpose computer, specially-configured computer, mobile device, etc.
When information is transferred or provided over a network or another communications connection (either hardwired, wireless, or a combination of hardwired or wireless) to a computer, the computer properly views the connection as a computer-readable medium. Thus, any such connection is properly termed and considered a computer-readable medium. Combinations of the above should also be included within the scope of computer-readable media. Computer-executable instructions comprise, for example, instructions and data which cause a general purpose computer, special purpose computer, or special purpose processing device such as a mobile device processor to perform one specific function or a group of functions.
Those skilled in the art will understand the features and aspects of a suitable computing environment in which aspects of the disclosure may be implemented. Although not required, some of the embodiments of the claimed inventions may be described in the context of computer-executable instructions, such as program modules or engines, as described earlier, being executed by computers in networked environments. Such program modules are often reflected and illustrated by flow charts, sequence diagrams, exemplary screen displays, and other techniques used by those skilled in the art to communicate how to make and use such computer program modules. Generally, program modules include routines, programs, functions, objects, components, data structures, application programming interface (API) calls to other computers whether local or remote, etc. that perform particular tasks or implement particular defined data types, within the computer. Computer-executable instructions, associated data structures and/or schemas, and program modules represent examples of the program code for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represent examples of corresponding acts for implementing the functions described in such steps.
Those skilled in the art will also appreciate that the claimed and/or described systems and methods may be practiced in network computing environments with many types of computer system configurations, including personal computers, smartphones, tablets, hand-held devices, multi-processor systems, microprocessor-based or programmable consumer electronics, networked PCs, minicomputers, mainframe computers, and the like. Embodiments of the claimed invention are practiced in distributed computing environments where tasks are performed by local and remote processing devices that are linked (either by hardwired links, wireless links, or by a combination of hardwired or wireless links) through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
An exemplary system for implementing various aspects of the described operations, which is not illustrated, includes a computing device including a processing unit, a system memory, and a system bus that couples various system components including the system memory to the processing unit. The computer will typically include one or more data storage devices for reading data from and writing data to. The data storage devices provide nonvolatile storage of computer-executable instructions, data structures, program modules, and other data for the computer.
Computer program code that implements the functionality described herein typically comprises one or more program modules that may be stored on a data storage device. This program code, as is known to those skilled in the art, usually includes an operating system, one or more application programs, other program modules, and program data. A user may enter commands and information into the computer through keyboard, touch screen, pointing device, a script containing computer program code written in a scripting language or other input devices (not shown), such as a microphone, etc. These and other input devices are often connected to the processing unit through known electrical, optical, or wireless connections.
The computer that effects many aspects of the described processes will typically operate in a networked environment using logical connections to one or more remote computers or data sources, which are described further below. A remote computer may be another personal computer, a server, a router, a network PC, a peer device, or another common network node, and typically includes many or all of the elements described above relative to the main computer system in which the inventions are embodied. The logical connections between computers include a local area network (LAN), a wide area network (WAN), virtual networks (virtual WAN or virtual LAN), and wireless LANs (WLAN), which are presented here by way of example and not limitation. Such networking environments are commonplace in office-wide or enterprise-wide computer networks, intranets, and the Internet.
When used in a LAN or WLAN networking environment, a computer system implementing aspects of the invention is connected to the local network through a network interface or adapter. When used in a WAN or WLAN networking environment, the computer may include a modem, a wireless link, or other mechanisms for establishing communications over the wide area network, such as the Internet. In a networked environment, program modules depicted relative to the computer, or portions thereof, may be stored in a remote data storage device. It will be appreciated that the network connections described or shown are exemplary and other mechanisms of establishing communications over wide area networks or the Internet may be used.
While various aspects have been described in the context of a preferred embodiment, additional aspects, features, and methodologies of the claimed inventions will be readily discernible from the description herein by those of ordinary skill in the art. Many embodiments and adaptations of the disclosure and claimed inventions other than those herein described, as well as many variations, modifications, and equivalent arrangements and methodologies, will be apparent from or reasonably suggested by the disclosure and the foregoing description thereof, without departing from the substance or scope of the claims. Furthermore, any sequence(s) and/or temporal order of steps in the various processes described and claimed herein are those considered to be the best mode contemplated for carrying out the claimed inventions. It should also be understood that, although steps of various processes may be shown and described as being in a preferred sequence or temporal order, the steps of any such processes are not limited to being carried out in any particular sequence or order, absent a specific indication that such sequence or order is required to achieve a particular intended result. In most cases, the steps of such processes may be carried out in a variety of different sequences and orders while still falling within the scope of the claimed inventions. In addition, some steps may be carried out simultaneously, contemporaneously, or in synchronization with other steps.
The embodiments were chosen and described in order to explain the principles of the claimed inventions and their practical application so as to enable others skilled in the art to utilize the inventions in various embodiments and with various modifications as are suited to the particular use contemplated. Alternative embodiments will become apparent to those skilled in the art to which the claimed inventions pertain without departing from their spirit and scope. Accordingly, the scope of the claimed inventions is defined by the appended claims rather than by the foregoing description and the exemplary embodiments described therein.
This application claims priority to, the benefit under 35 U.S.C. § 119 of, and incorporates by reference herein in its entirety U.S. Provisional Patent Application No. 62/136,051, filed Mar. 20, 2015, and entitled “Systems, Methods, and Apparatuses for Digital Whole Slide Imaging for Prescreened Detection of Cancer and Other Abnormalities.”
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/US16/23421 | 3/21/2016 | WO | 00 |
| Number | Date | Country |
|---|---|---|
| 62/136,051 | Mar. 20, 2015 | US |