The present invention relates to an information processing system, an information processing method, and a program.
Regarding cancers, such as lung cancer, colorectal cancer, stomach cancer, breast cancer, gastrointestinal stromal tumor (GIST), and skin cancer (e.g., malignant melanoma), in response to a medical doctor's decision, cancer genetic testing is carried out for examination of one or some genes. Then, diagnosis and treatment with medicine selected based on a test result are carried out (refer to Non Patent Literature 1).
Because it takes one to two months to obtain a result of cancer genetic testing, administration of medicine effective against the gene abnormality of a patient is delayed. The patient's condition may deteriorate during the delay, so that treatment is likely to come too late. This problem is particularly severe for a stage 4 cancer patient, who does not have much time left.
There is a similar problem regarding diseases other than cancers for which medicine effective against gene abnormality can be administered.
The object of an aspect of the present invention having been made in consideration of the above problems is to provide an information processing system, an information processing method, and a program that enable inhibition of delay of administration of medication effective against the gene abnormality of a target disease.
The object of another aspect of the present invention having been made in consideration of the above problems is to provide an information processing system, an information processing method, and a program that enable inhibition of delay of administration of medication effective against the gene abnormality of a patient having colorectal cancer.
The object of another aspect of the present invention having been made in consideration of the above problems is to provide an information processing system enabling inhibition of delay of administration of medication effective against the gene abnormality of a cancer patient.
An information processing system according to a first aspect of the present invention comprises: an acquisition unit configured to acquire a pathologic tissue image of a patient having a target disease; a division unit configured to divide the pathologic tissue image of the patient into a plurality of region images; a feature prediction unit configured to input each of the plurality of region images to each of a plurality of feature prediction models constructed one-to-one for types of histopathological features, to acquire prediction information on presence or absence of the histopathological features; a sorting unit configured to sort a plurality of region images of which respective combinations of presence or absence of the histopathological features based on the acquisition match a previously set combination of presence or absence of the histopathological features at time of sorting; a gene mutation prediction unit configured to input each of the plurality of region images selected by the sorting to each of gene mutation prediction models constructed one-to-one for types of gene mutations, the gene mutation prediction models each having a combination of presence or absence of the histopathological features, to acquire prediction information on presence or absence of the gene mutations; and a prediction result output unit configured to output, with the prediction information on presence or absence of the gene mutations acquired for each region image, a prediction result of presence or absence of at least one gene mutation in the patient.
According to the configuration, input of a pathologic tissue image of a patient having a target disease enables prompt acquisition of a prediction result of the presence or absence of gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of gene mutation, so that administration of medication effective against the gene abnormality of the target disease can be inhibited from being delayed.
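As an illustration only, the flow of the first aspect (feature prediction, sorting, gene mutation prediction, and output) can be sketched in Python as follows. The function names, the representation of a region image as a plain sequence of numbers, the per-gene dictionary of required feature combinations, and in particular the aggregation of per-region scores by a mean compared with a threshold are assumptions made for this sketch; the description above does not specify them.

```python
from typing import Callable, Dict, List, Optional, Sequence

Tile = Sequence[float]                   # hypothetical stand-in for one region image
FeatureModel = Callable[[Tile], bool]    # one model per histopathological feature
MutationModel = Callable[[Tile], float]  # one model per gene mutation, score in [0, 1]


def predict_features(tile: Tile, feature_models: Dict[str, FeatureModel]) -> Dict[str, bool]:
    """Run every feature prediction model on one region image."""
    return {name: model(tile) for name, model in feature_models.items()}


def matches(features: Dict[str, bool], required: Dict[str, bool]) -> bool:
    """True if the predicted presence/absence agrees with the preset combination."""
    return all(features.get(name) == value for name, value in required.items())


def predict_mutations(
    tiles: List[Tile],
    feature_models: Dict[str, FeatureModel],
    mutation_models: Dict[str, MutationModel],
    required_features: Dict[str, Dict[str, bool]],
    threshold: float = 0.5,
) -> Dict[str, Optional[bool]]:
    """Per gene: True/False for predicted presence, None if no region image was selected."""
    predicted = [predict_features(tile, feature_models) for tile in tiles]
    results: Dict[str, Optional[bool]] = {}
    for gene, model in mutation_models.items():
        # Sorting: keep only region images matching the combination preset for this gene.
        selected = [t for t, f in zip(tiles, predicted) if matches(f, required_features[gene])]
        if not selected:
            results[gene] = None
            continue
        scores = [model(t) for t in selected]
        # Assumed aggregation: mean per-region score compared with a threshold.
        results[gene] = sum(scores) / len(scores) >= threshold
    return results
```

In this sketch each gene mutation prediction model only ever sees region images whose predicted feature combination matches the combination preset for that model, which mirrors the role of the sorting unit.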
An information processing system according to a second aspect of the present invention, according to the first aspect of the information processing system, wherein the acquisition unit further acquires a primary lesion site of the target disease, and the sorting unit sorts the plurality of region images, with the acquired primary lesion site of the target disease together with combinations of presence or absence of the histopathological features based on the acquisition.
According to the configuration, sorting of a region image with the primary lesion site of the target disease leads to a rise in the probability that a region image to be input to a gene mutation prediction model can be limited to a region image relating to the target disease, so that an improvement can be made in the accuracy of prediction of the presence or absence of gene mutation.
An information processing system according to a third aspect of the present invention, according to the first or the second aspect of the information processing system, wherein the plurality of feature prediction models each result from machine learning with learning data including a divided region image of a pathologic tissue image as an input and a histopathological feature given to the divided region image as an output, and the gene mutation prediction models each result from machine learning with learning data including a region image sorted with a combination of presence or absence of the histopathological features as an input and information on presence or absence of a particular gene mutation as an output.
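A minimal sketch of sorting that also takes the primary lesion site into account might look as follows. The `SortingRule` class, its field names, and the site labels used in the usage note are hypothetical; the description only states that the primary lesion site is used together with the feature combination.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass
class SortingRule:
    """Preset sorting condition (hypothetical structure)."""
    features: Dict[str, bool]                      # required presence/absence per feature
    primary_sites: Optional[FrozenSet[str]] = None  # None means any primary lesion site


def is_selected(tile_features: Dict[str, bool], primary_site: str, rule: SortingRule) -> bool:
    """Keep a region image only if both the site and the feature combination match."""
    if rule.primary_sites is not None and primary_site not in rule.primary_sites:
        return False
    return all(tile_features.get(name) == value for name, value in rule.features.items())
```

For example, a rule built as `SortingRule({"cribriform": True}, frozenset({"site A"}))` would reject every region image of a patient whose primary lesion site is recorded as `"site B"`, regardless of the predicted features.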
According to the configuration, the feature prediction models and the gene mutation prediction models correspond to models after machine learning, enabling an improvement in the accuracy of prediction of the presence or absence of gene mutation.
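The two kinds of learning data described for the third aspect can be sketched as input/output pairs. This is only an assumed data layout: the tuple format, the function names, and the use of a dictionary of annotated features per region image are not taken from the description.

```python
from typing import Any, Dict, List, Tuple

AnnotatedTile = Tuple[Any, Dict[str, bool]]  # (region image, annotated features), hypothetical


def feature_training_pairs(tiles: List[AnnotatedTile], feature: str) -> List[Tuple[Any, bool]]:
    """Pairs for one feature prediction model: region image in, feature label out."""
    return [(image, labels[feature]) for image, labels in tiles if feature in labels]


def mutation_training_pairs(
    tiles: List[AnnotatedTile],
    required: Dict[str, bool],
    mutation_present: bool,
) -> List[Tuple[Any, bool]]:
    """Pairs for one gene mutation prediction model.

    Only region images sorted by the given feature combination are used as inputs;
    the output is the known presence or absence of the mutation for the patient.
    """
    return [
        (image, mutation_present)
        for image, labels in tiles
        if all(labels.get(name) == value for name, value in required.items())
    ]
```

Any supervised learner could then be fitted on these pairs; the choice of learner is left open here because the description does not fix one at this point.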
An information processing system according to a fourth aspect of the present invention, for predicting a gene having mutated in colorectal cancer tissue of a patient having colorectal cancer, the information processing system comprises: an acquisition unit configured to acquire a colorectal-cancer pathologic tissue image of the patient; a division unit configured to divide the colorectal-cancer pathologic tissue image of the patient into a plurality of region images; a feature prediction unit configured to input each of the plurality of region images to each of a plurality of feature prediction models constructed one-to-one for types of histopathological features, to acquire prediction information on presence or absence of the histopathological features; a sorting unit configured to sort a plurality of region images of which respective combinations of presence or absence of the histopathological features based on the acquisition match a combination of presence or absence of the histopathological features at time of sorting previously set to at least one of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI; a gene mutation prediction unit configured to input each of the sorted plurality of region images to each of gene mutation prediction models constructed one-to-one for types of gene mutations, the gene mutation prediction models each having a combination of presence or absence of the histopathological features, to acquire prediction information on presence or absence of the gene mutations; and a prediction result output unit configured to output, with the prediction information on presence or absence of the gene mutations acquired for each region image, a prediction result of presence or absence of at least one gene mutation of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI in the patient.
According to the configuration, input of a colorectal-cancer pathologic tissue image of a patient having colorectal cancer enables prompt acquisition of a prediction result of the presence or absence of gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of gene mutation, so that administration of medication effective against the gene abnormality of the colorectal cancer can be inhibited from being delayed.
An information processing system according to a fifth aspect of the present invention, according to the fourth aspect of the information processing system, wherein the acquisition unit further acquires a primary lesion site of the colorectal cancer, and the sorting unit sorts the plurality of region images, with the acquired primary lesion site of the colorectal cancer together with at least one histopathological feature based on the determination.
According to the configuration, sorting of a region image with the primary lesion site of the target disease leads to a rise in the probability that a region image to be input to a gene mutation prediction model can be limited to a region image relating to the target disease, so that an improvement can be made in the accuracy of prediction of the presence or absence of gene mutation.
An information processing system according to a sixth aspect of the present invention, according to the fourth or fifth aspect of the information processing system, wherein the plurality of feature prediction models each result from machine learning with learning data including a divided region image of a pathologic tissue image as an input and a histopathological feature given to the divided region image as an output, and the gene mutation prediction models each result from machine learning with learning data including a region image sorted with a combination of presence or absence of the histopathological features as an input and information on presence or absence of a particular gene mutation as an output.
According to the configuration, the feature prediction models and the gene mutation prediction models correspond to models after machine learning, enabling an improvement in the accuracy of prediction of the presence or absence of gene mutation.
An information processing method according to a seventh aspect of the present invention, comprises: an acquisition process of acquiring a pathologic tissue image of a patient having a target disease; a division process of dividing the pathologic tissue image of the patient into a plurality of region images; a feature prediction process of inputting each of the plurality of region images to each of a plurality of feature prediction models constructed one-to-one for types of histopathological features, to acquire prediction information on presence or absence of the histopathological features; a sorting process of sorting a plurality of region images of which respective combinations of presence or absence of the histopathological features based on the acquisition match a previously set combination of presence or absence of the histopathological features at time of sorting; a gene mutation prediction process of inputting each of the plurality of region images selected by the sorting to each of gene mutation prediction models constructed one-to-one for types of gene mutations, the gene mutation prediction models each having a combination of presence or absence of the histopathological features, to acquire prediction information on presence or absence of the gene mutations; and a prediction result output process of outputting, with the prediction information on presence or absence of the gene mutations acquired for each region image, a prediction result of presence or absence of at least one gene mutation in the patient. 
According to the configuration, input of a pathologic tissue image of a patient having a target disease enables prompt acquisition of a prediction result of the presence or absence of gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of gene mutation, so that administration of medication effective against the gene abnormality of the target disease can be inhibited from being delayed.
An information processing method according to an eighth aspect of the present invention, for predicting a gene having mutated in colorectal cancer tissue of a patient having colorectal cancer, the information processing method comprises: an acquisition process of acquiring a colorectal-cancer pathologic tissue image of the patient; a division process of dividing the colorectal-cancer pathologic tissue image of the patient into a plurality of region images; a feature prediction process of inputting each of the plurality of region images to each of a plurality of feature prediction models constructed one-to-one for types of histopathological features, to acquire prediction information on presence or absence of the histopathological features; a sorting process of sorting a plurality of region images of which respective combinations of presence or absence of the histopathological features based on the acquisition match a combination of presence or absence of the histopathological features at time of sorting previously set to at least one of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI; a gene mutation prediction process of inputting each of the sorted plurality of region images to each of gene mutation prediction models constructed one-to-one for types of gene mutations, the gene mutation prediction models each having a combination of presence or absence of the histopathological features, to acquire prediction information on presence or absence of the gene mutations; and a prediction result output process of outputting, with the prediction information on presence or absence of the gene mutations acquired for each region image, a prediction result of presence or absence of at least one gene mutation of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI in the patient.
According to the configuration, input of a colorectal-cancer pathologic tissue image of a patient having colorectal cancer enables prompt acquisition of a prediction result of the presence or absence of gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of gene mutation, so that administration of medication effective against the gene abnormality of the colorectal cancer can be inhibited from being delayed.
A program according to a ninth aspect of the present invention, for causing a computer to carry out: an acquisition process of acquiring a pathologic tissue image of a patient having a target disease; a division process of dividing the pathologic tissue image of the patient into a plurality of region images; a feature prediction process of inputting each of the plurality of region images to each of a plurality of feature prediction models constructed one-to-one for types of histopathological features, to acquire prediction information on presence or absence of the histopathological features; a sorting process of sorting a plurality of region images of which respective combinations of presence or absence of the histopathological features based on the acquisition match a previously set combination of presence or absence of the histopathological features at time of sorting; a gene mutation prediction process of inputting each of the plurality of region images selected by the sorting to each of gene mutation prediction models constructed one-to-one for types of gene mutations, the gene mutation prediction models each having a combination of presence or absence of the histopathological features, to acquire prediction information on presence or absence of the gene mutations; and a prediction result output process of outputting, with the prediction information on presence or absence of the gene mutations acquired for each region image, a prediction result of presence or absence of at least one gene mutation in the patient.
According to the configuration, input of a pathologic tissue image of a patient having a target disease enables prompt acquisition of a prediction result of the presence or absence of gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of gene mutation, so that administration of medication effective against the gene abnormality of the target disease can be inhibited from being delayed.
A program according to a tenth aspect of the present invention, for predicting a gene having mutated in colorectal cancer tissue of a patient having colorectal cancer, the program causing a computer to carry out: an acquisition process of acquiring a colorectal-cancer pathologic tissue image of the patient; a division process of dividing the colorectal-cancer pathologic tissue image of the patient into a plurality of region images; a feature prediction process of inputting each of the plurality of region images to each of a plurality of feature prediction models constructed one-to-one for types of histopathological features, to acquire prediction information on presence or absence of the histopathological features; a sorting process of sorting a plurality of region images of which respective combinations of presence or absence of the histopathological features based on the acquisition match a combination of presence or absence of the histopathological features at time of sorting previously set to at least one of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI; a gene mutation prediction process of inputting each of the sorted plurality of region images to each of gene mutation prediction models constructed one-to-one for types of gene mutations, the gene mutation prediction models each having a combination of presence or absence of the histopathological features, to acquire prediction information on presence or absence of the gene mutations; and a prediction result output process of outputting, with the prediction information on presence or absence of the gene mutations acquired for each region image, a prediction result of presence or absence of at least one gene mutation of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI in the patient.
According to the configuration, input of a colorectal-cancer pathologic tissue image of a patient having colorectal cancer enables prompt acquisition of a prediction result of the presence or absence of gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of gene mutation, so that administration of medication effective against the gene abnormality of the colorectal cancer can be inhibited from being delayed.
An information processing system according to an eleventh aspect of the present invention, for estimating presence or absence of BRAF gene mutation in a tumor, the information processing system comprising: means of dividing a pathologic tissue image of the tumor into one or a plurality of images; means of selecting, from among the divided images, at least one image of an image having a tumor-cell ratio larger than 50%, an image including a papillary structure, an image including a non-serrated and papillary structure, an image including a non-serrated and papillary structure and mucus present, an image including a cribriform structure, an image including a cribriform structure and no mucus present, or an image including a cribriform structure and mucus present; and means of estimating, with the selected at least one image, the presence or absence of the BRAF gene mutation.
According to the configuration, input of a pathologic tissue image enables prompt acquisition of a prediction result of the presence or absence of the BRAF gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of the BRAF gene mutation, so that administration of medication effective against the BRAF gene abnormality of cancer can be inhibited from being delayed.
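The selection step of the eleventh aspect can be pictured as a list of alternative conditions, any one of which qualifies a divided image for BRAF estimation. The sketch below is an assumption-laden illustration: the dictionary keys (`tumor_cell_ratio`, `papillary`, `serrated`, `cribriform`, `mucus`) and the encoding of mucus as `"absent"`, `"present"`, or `"leakage"` are hypothetical, and treating the listed image types as simple alternatives is one possible reading.

```python
from typing import Any, Callable, Dict, List

TileInfo = Dict[str, Any]  # hypothetical per-image annotations

# One predicate per image type listed for the BRAF gene mutation.
BRAF_SELECTION_CONDITIONS: List[Callable[[TileInfo], bool]] = [
    lambda t: t["tumor_cell_ratio"] > 0.5,
    lambda t: t["papillary"],
    lambda t: t["papillary"] and not t["serrated"],
    lambda t: t["papillary"] and not t["serrated"] and t["mucus"] == "present",
    lambda t: t["cribriform"],
    lambda t: t["cribriform"] and t["mucus"] == "absent",
    lambda t: t["cribriform"] and t["mucus"] == "present",
]


def select_for_braf(
    tiles: List[TileInfo],
    conditions: List[Callable[[TileInfo], bool]] = BRAF_SELECTION_CONDITIONS,
) -> List[TileInfo]:
    """Keep the divided images that satisfy at least one of the given conditions."""
    return [t for t in tiles if any(cond(t) for cond in conditions)]
```

Passing a shorter `conditions` list restricts the selection to a single image type, for instance only images including a cribriform structure with no mucus present.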
An information processing system according to a twelfth aspect of the present invention, for estimating presence or absence of BRAF V600E gene mutation in a tumor, the information processing system comprises: means of dividing a pathologic tissue image of the tumor into one or a plurality of images; means of selecting, from among the divided images, at least one image of an image including a rail pattern, an image including a small solid nest, an image including a small solid nest of typical cells, an image including a large solid nest of typical cells, an image including an elongated elliptic nucleus, an image including mucus present, an image including a non-serrated and papillary structure and no mucus present, an image including a cribriform structure and no mucus present, or an image including a cribriform structure and mucus present; and means of estimating, with the selected at least one image, the presence or absence of the BRAF V600E gene mutation.
According to the configuration, input of a pathologic tissue image enables prompt acquisition of a prediction result of the presence or absence of the BRAF V600E gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of the BRAF V600E gene mutation, so that administration of medication effective against the BRAF V600E gene abnormality of cancer can be inhibited from being delayed.
An information processing system according to a thirteenth aspect of the present invention, for estimating presence or absence of ERBB2 gene mutation in a tumor, the information processing system comprises: means of dividing a pathologic tissue image of the tumor into one or a plurality of images; means of selecting, from among the divided images, at least one image of an image including a trabecular structure and mucus present or an image including a trabecular structure and mucus leakage; and means of estimating, with the selected at least one image as an analysis target, the presence or absence of the ERBB2 gene mutation.
According to the configuration, input of a pathologic tissue image enables prompt acquisition of a prediction result of the presence or absence of the ERBB2 gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of the ERBB2 gene mutation, so that administration of medication effective against the ERBB2 gene abnormality of cancer can be inhibited from being delayed.
An information processing system according to a fourteenth aspect of the present invention, for estimating presence or absence of TP53 gene mutation in a tumor, the information processing system comprises: means of dividing a pathologic tissue image of the tumor into one or a plurality of images; means of selecting, from among the divided images, at least one image of an image including a signet ring cell and mucus leakage, an image including a germ cell, an image including a trabecular structure and no mucus present, or an image including mucus leakage; and means of estimating, with the selected at least one image, the presence or absence of the TP53 gene mutation.
According to the configuration, input of a pathologic tissue image enables prompt acquisition of a prediction result of the presence or absence of the TP53 gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of the TP53 gene mutation, so that administration of medication effective against the TP53 gene abnormality of cancer can be inhibited from being delayed.
An information processing system according to a fifteenth aspect of the present invention, for estimating presence or absence of MSI gene abnormality in a tumor, the information processing system comprises: means of dividing a pathologic tissue image of the tumor into one or a plurality of images; means of selecting, from among the divided images, at least one image of an image including a rail pattern, an image including a trabecular structure and no mucus present, an image having a tumor-cell ratio larger than 50%, an image including a solid nest, an image including a small solid nest of typical cells, an image including an elongated elliptic nucleus, an image including a papillary structure, an image including a germ cell, an image including a non-serrated and papillary structure, an image including a roundish nucleus, an image including a cribriform structure and no mucus present, an image including a cribriform structure and mucus present, or an image including a cribriform structure and mucus leakage; and means of estimating, with the selected at least one image, the presence or absence of the MSI gene abnormality.
According to the configuration, input of a pathologic tissue image enables prompt acquisition of a prediction result of the presence or absence of the MSI gene abnormality. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of the MSI gene abnormality, so that administration of medication effective against the MSI gene abnormality of cancer can be inhibited from being delayed.
An information processing system according to a sixteenth aspect of the present invention, for estimating presence or absence of RAS gene mutation in a tumor, the information processing system comprising: means of dividing a pathologic tissue image of the tumor into one or a plurality of images; means of selecting, from among the divided images, at least one image of an image including a signet ring cell and mucus leakage, an image including a tubular structure and no mucus present, an image including a trabecular structure and no mucus present, an image including a small solid nest, an image including a large solid nest, an image including a large solid nest of typical cells, an image including a papillary structure, an image including no mucus present, an image including mucus present, an image including a germ cell, an image including a non-serrated and papillary structure, an image including a non-serrated and papillary structure and no mucus present, an image including a non-serrated and papillary structure and mucus present, an image including a cribriform structure, an image including a cribriform structure and no mucus present, an image including a cribriform structure and mucus present, an image including a cribriform structure and mucus leakage, an image including budding present, an image having a tumor-cell ratio larger than 50%, an image including a small solid nest, an image including a small solid nest of typical cells, an image including a large solid nest, an image including an elongated elliptic nucleus, an image including mucus present, an image including mucus leakage, an image including a non-serrated and papillary structure and no mucus present, an image including a cribriform structure, an image including a cribriform structure and no mucus present, an image including a cribriform structure and mucus present, an image including a tubular structure and mucus present, an image including a tubular structure and mucus leakage, an image including a rail pattern, an image including a high-grade cell atypia, an image including a trabecular structure and mucus present, an image including a trabecular structure and mucus leakage, an image including a solid nest, an image including a non-serrated and papillary structure, an image including a non-serrated and papillary structure and mucus present, or an image including a roundish nucleus; and means of estimating, with the selected at least one image, the presence or absence of the RAS gene mutation.
According to the configuration, input of a pathologic tissue image enables prompt acquisition of a prediction result of the presence or absence of the RAS gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of the RAS gene mutation, so that administration of medication effective against the RAS gene abnormality of cancer can be inhibited from being delayed.
According to an aspect of the present invention, input of a pathologic tissue image of a patient having a target disease enables prompt acquisition of a prediction result of the presence or absence of gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of gene mutation, so that administration of medication effective against the gene abnormality of the target disease can be inhibited from being delayed.
According to another aspect of the present invention, input of a colorectal-cancer pathologic tissue image of a patient having colorectal cancer enables prompt acquisition of a prediction result of the presence or absence of gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of gene mutation, so that administration of medication effective against the gene abnormality of the colorectal cancer can be inhibited from being delayed.
Embodiments will be described below with reference to the drawings. Note that unnecessarily detailed description will be omitted. For example, detailed description of already well-known matters or duplicate description of substantially identical configurations will be omitted. This is to avoid redundancy in the following description and to facilitate understanding by those skilled in the art.
Examples of a target disease to be subjected to prediction of the presence or absence of gene mutation according to the present embodiment include diseases with gene mutation in tissue, such as cancers (or diseases for which medicine effective against gene abnormality can be administered). Colorectal cancer will be used as an exemplary target disease in the following description. Note that, in the description of the present embodiment, the term gene mutation includes gene abnormality.
The terminals 1-1 to 1-N are used by different users and are each, for example, a mobile phone, such as a multifunctional mobile phone (so-called smartphone), a tablet, a laptop personal computer, or a desktop personal computer. For example, the terminals 1-1 to 1-N may each display, through a web browser, information transmitted from the computer system 2 or may display, on the respective screens of applications installed on the terminals 1-1 to 1-N, information transmitted from the computer system 2. An example in which the terminals 1-1 to 1-N each display, through a web browser, information transmitted from the computer system 2 will be given in the following description.
The computer system 2 is capable of communicating with the terminals 1-1 to 1-N, and such communications may be wired or wireless. That is, the computer system 2 is connected to the terminals 1-1 to 1-N such that information can be exchanged. For example, the computer system 2 is used by an administrator who manages the information processing system S according to the present embodiment. The computer system 2 may be a single computer or may include a plurality of computers.
The input interface 21 receives an input from the administrator of the computer system 2 and outputs an input signal corresponding to the received input to the processor 25.
The communication module 22 is connected to the communication circuit network NW and communicates with the terminals 1-1 to 1-N. Such communications may be wired or wireless. In the following description, such communications are wired.
The storage 23 stores a program that the processor 25 reads out and executes, a feature prediction model after machine learning, a mutation prediction model after machine learning, and various types of data.
The memory 24 temporarily retains data and a program. The memory 24 is a volatile memory and serves, for example, as a random access memory (RAM).
The processor 25 loads a program from the storage 23 to the memory 24 and executes a series of commands included in the program, to function as an acquisition unit 251 and an output unit 250. The output unit 250 has, for example, a division unit 252, a feature prediction unit 253, a sorting unit 254, a gene mutation prediction unit 255, and a prediction result output unit 256. Respective pieces of processing thereof will be described below.
Next, a learning process for a gene mutation prediction model will be described with
(Step S1) The division unit 252 divides, into one or a plurality of region images (e.g., tiled images), an image (e.g., a slide image) of the pathologic tissue (herein, colorectal cancer tissue as an example) of a patient having a target disease (herein, colorectal cancer as an example). Here, the gene mutation of the disease tissue (herein, colorectal cancer tissue as an example) of the patient is already known. Here, division into tiled images is also referred to as tiling division.
(Step S2) Subsequently, for example, with an annotation system, a pathologist inputs a histopathological feature (e.g., elongated elliptic nucleus) regarding each of the plurality of region images. At that time, a terminal device (not illustrated) used by the pathologist receives the histopathological feature. The acquisition unit 251 acquires the histopathological feature given for each of the plurality of region images.
(Step S3) Subsequently, per histopathological feature, the processor 25 learns the relationship between a region image and the histopathological feature and outputs a feature prediction model. Specifically, for example, with learning data including a divided region image of the pathologic tissue image as an input and the histopathological feature given to the region image as an output, the processor 25 learns by machine learning (e.g., deep learning) and then outputs a feature prediction model. Thus, as many feature prediction models are output as there are histopathological features (L models, where L is a natural number), and the feature prediction models FM-1 to FM-L are stored in the storage 23 by the processor 25. Each such feature prediction model results from machine learning with learning data including a divided region image of the pathologic tissue image as an input and the histopathological feature given to the region image as an output.
(Step S4) Next, with the feature prediction models FM-1 to FM-L, the processor 25 predicts the presence or absence of each histopathological feature for a divided image (herein, a tiled image as an example) to which no histopathological feature has been annotated. Thus, a prediction of the presence or absence of each histopathological feature is obtained for every divided image that lacks an annotation.
(Step S5) Next, with the predicted combinations of the presence or absence of the histopathological features, the sorting unit 254 sorts a region image (herein, a tiled image as an example) from the plurality of region images. Specifically, with the histopathological features predicted in step S4, the sorting unit 254 extracts a region image (herein, a tiled image as an example) corresponding to a particular combination of the presence or absence of the histopathological features.
(Step S6) Next, the processor 25 learns the relationship between a tiled image group corresponding to the particular combination of the presence or absence of the histopathological features and gene mutation and then outputs a gene mutation prediction model. Specifically, for example, with learning data including a region image corresponding to the particular combination of the presence or absence of the histopathological features (the region image extracted in step S5) as an input and information on the presence or absence of a particular gene mutation as an output, the processor 25 learns by machine learning (e.g., deep learning) and outputs a gene mutation prediction model. Thus, as many gene mutation prediction models are output as there are gene mutations (M models, where M is a natural number), and the gene mutation prediction models GM-1 to GM-M are stored in the storage 23 by the processor 25. Each such gene mutation prediction model results from machine learning with learning data including a region image sorted with a particular combination of the presence or absence of the histopathological features as an input and information on the particular gene mutation as an output.
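The tiling division of step S1 can be illustrated with a minimal sketch. It assumes the slide image is held as a plain list of pixel rows and that tiles are square and non-overlapping; the function name `tile_image`, the parameter `tile_size`, and the choice to drop incomplete edge tiles are assumptions made for illustration and are not specified by the embodiment.

```python
from typing import List, Tuple

# A grayscale image as a list of pixel rows; a stand-in for a slide image.
Image = List[List[int]]


def tile_image(image: Image, tile_size: int) -> List[Tuple[int, int, Image]]:
    """Divide an image into non-overlapping square region images (tiles).

    Returns (top, left, tile) triples, where top/left are the pixel offsets
    of the tile's upper-left corner. Edge regions smaller than tile_size
    are dropped (an assumption of this sketch).
    """
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    height = len(image)
    width = len(image[0]) if height else 0
    tiles = []
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            # Cut tile_size rows, then tile_size columns from each row.
            tile = [row[left:left + tile_size]
                    for row in image[top:top + tile_size]]
            tiles.append((top, left, tile))
    return tiles


# A 4x6 toy image whose pixel value encodes its position (row * 10 + column).
toy_image = [[r * 10 + c for c in range(6)] for r in range(4)]
toy_tiles = tile_image(toy_image, tile_size=2)
```

In this toy case the 4x6 image yields six 2x2 tiles. A real slide image would typically be read through an imaging library and tiled at a much larger tile size.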
Next, a process of annotating a histopathological feature to a pathologic tissue image will be described with
Next, learning and testing at the time of construction of a feature prediction model for a certain histopathological feature will be described with
As an example, at the time of learning, 5-fold cross validation is carried out with 80% of the entire learning data. That is, learning is carried out with 64% of the entire learning data and validation is carried out with 16% of the entire learning data. Specifically, validation is carried out as follows. First, 80% of the entire learning data is divided into five. Here, for easy understanding, subsets of data resulting from the division into five are defined as s1, s2, s3, s4, and s5.
For example, after 80% of the learning data is divided into five, first, with the subsets s1, s2, s3, and s4 as training subsets, model learning starts. Subsequently, with s5 as a validation subset, model validation is carried out. The evaluation indicator acquired at that time (e.g., accuracy or F1 score) is defined as e1.
Next, learning is carried out with s2, s3, s4, and s5 and evaluation is carried out with s1. Similarly, learning and evaluation are repeated while rotating the subsets. Learning and evaluation are repeated for all the combinations, so that five evaluation indicators are acquired. Then, the model highest in performance based on the five evaluation indicators is determined as a feature prediction model.
For accuracy validation of the acquired feature prediction model, with the remaining 20% of the entire learning data, blind testing is carried out on the model highest in performance based on the 5-fold cross validation. Thus, the performance of the determined feature prediction model is validated, and the feature prediction model is adopted if its performance fulfills the criterion.
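The data split and 5-fold cross validation described above can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the function names `split_holdout`, `k_folds`, and `cross_validate`, the shuffling seed, and the caller-supplied `train_fn` and `score_fn` callbacks are all assumptions introduced here.

```python
import random
from typing import Any, Callable, List, Sequence, Tuple


def split_holdout(samples: Sequence[Any], test_fraction: float = 0.2,
                  seed: int = 0) -> Tuple[List[Any], List[Any]]:
    """Shuffle and split into (development set, blind-test set)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_folds(samples: Sequence[Any], k: int = 5) -> List[List[Any]]:
    """Divide samples into k subsets (s1..sk) of near-equal size."""
    return [list(samples[i::k]) for i in range(k)]


def cross_validate(dev: Sequence[Any],
                   train_fn: Callable[[List[Any]], Any],
                   score_fn: Callable[[Any, List[Any]], float],
                   k: int = 5) -> Tuple[Any, List[float]]:
    """Train k models, each validated on a different held-out subset.

    Returns the model with the highest evaluation indicator and the list
    of all k indicators (e1..ek).
    """
    folds = k_folds(dev, k)
    best_model, best_score, scores = None, float("-inf"), []
    for i in range(k):
        validation = folds[i]
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = train_fn(training)
        score = score_fn(model, validation)
        scores.append(score)
        if score > best_score:
            best_model, best_score = model, score
    return best_model, scores


# With 100 samples: 80 for cross validation (64 train / 16 validate per
# round) and 20 reserved for blind testing of the selected model.
dev_set, test_set = split_holdout(range(100))
fold_sizes = [len(fold) for fold in k_folds(dev_set)]
```

The blind-test set is touched only once, after the best model has been selected, which is what makes the final performance check independent of model selection.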
Next, giving the prediction value of a histopathological feature to a region image in step S4 of
The prediction value of a histopathological feature corresponds to the degree of the histopathological feature determined for each region image based on a prediction result from each of the feature prediction models FM-1 to FM-L. The prediction value of a histopathological feature may be a prediction result itself (e.g., a numerical value of 0 to 1) or may be a value (e.g., the value of 0 or 1) obtained by comparing the prediction result with a threshold. An example in which the prediction value of a histopathological feature is a numerical value of 0 to 1 will be given herein.
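The two forms of prediction value described above can be sketched in a few lines. The function name `prediction_value` is hypothetical, and treating a score exactly equal to the threshold as "present" is an assumption of this sketch, since the text does not specify how ties are handled.

```python
from typing import Optional


def prediction_value(score: float, threshold: Optional[float] = None) -> float:
    """Return the prediction value of one feature for one region image.

    score is the raw model output in [0, 1]. Without a threshold the score
    itself is the prediction value; with a threshold the value is 1 or 0.
    A score equal to the threshold counts as 1 (assumption of this sketch).
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if threshold is None:
        return score
    return 1.0 if score >= threshold else 0.0


raw_value = prediction_value(0.73)          # continuous form
binary_value = prediction_value(0.73, 0.5)  # thresholded form
```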
Next, learning and testing at the time of construction of a gene mutation prediction model for a certain gene mutation in step S6 of
As an example, at the time of learning, 5-fold cross validation is carried out with 80% of the entire learning data. That is, learning is carried out with 64% of the entire learning data and validation is carried out with 16% of the entire learning data. Specifically, validation is carried out as follows. First, 80% of the entire learning data is divided into five. Here, for easy understanding, subsets of data resulting from the division into five are defined as s1, s2, s3, s4, and s5.
For example, after 80% of the learning data is divided into five, first, with the subsets s1, s2, s3, and s4 as training subsets, model learning starts. Subsequently, with s5 as a validation subset, model validation is carried out. The evaluation indicator acquired at that time (e.g., accuracy or F1 score) is defined as e1.
Next, learning is carried out with s2, s3, s4, and s5 and evaluation is carried out with s1. Similarly, learning and evaluation are repeated while rotating the subsets. Learning and evaluation are repeated for all the combinations, so that five evaluation indicators are acquired. Then, the model highest in performance based on the five evaluation indicators is determined as a gene mutation prediction model.
For accuracy validation of the acquired gene mutation prediction model, with the remaining 20% of the entire learning data, blind testing is carried out on the model highest in performance based on the 5-fold cross validation. Thus, the performance of the determined gene mutation prediction model is validated, and the gene mutation prediction model is adopted if its performance fulfills the criterion.
A method of setting the “combination of the presence or absence of the histopathological features at the time of sorting” used when sorting a region image in a step of predicting an unknown gene mutation of a colorectal cancer patient will be described with
<Combination of Features Enabling Prediction of Gene Mutation (or Gene Abnormality)>
As an exemplary embodiment, per type of gene mutation (or gene abnormality) as a target, from all experimental data, a combination of a first feature (primary lesion site) and a second feature (histopathological feature) that fulfills all the following conditions (1) to (3) is sorted.
Here, the case import technique 1 and the case import technique 2 are as follows.
The method includes predicting the presence of gene mutation in case 1 in a case where the mean value of the respective prediction values of a gene mutation model for the region images included in a region image group after sorting (e.g., a region image group in
The method includes predicting the presence of gene mutation in the target patient in a case where the ratio of the number of tiles corresponding to region images, each having a prediction value not less than a first threshold th1 based on a gene mutation model, included in a region image group after sorting (e.g., the region image group in
Referring to
Referring to
In an ERBB2 table T3 in
In an RAS table T6 in
Next, an estimation process for the presence or absence of each gene mutation will be described with
(Step S11) First, the acquisition unit 251 acquires a pathologic tissue image of a patient in which gene mutation is unknown because gene analysis has not been carried out yet.
(Step S12) Next, the division unit 252 divides the pathologic tissue image of the patient into a plurality of region images.
(Step S13) Next, with each of feature prediction models constructed one-to-one for types of histopathological features, the feature prediction unit 253 predicts the presence or absence of the target histopathological feature for each region image. More particularly, for example, the feature prediction unit 253 inputs each region image to each of a plurality of feature prediction models constructed one-to-one for types of histopathological features and acquires prediction information on the presence or absence of the histopathological feature.
(Step S14) Next, with the acquired combinations of the presence or absence of the histopathological features, the sorting unit 254 sorts the plurality of region images. More particularly, for example, the sorting unit 254 extracts, from the plurality of region images, a region image of which the acquired combination of the presence or absence of the histopathological features matches a particular combination of the presence or absence of the histopathological features set for each gene abnormality.
Here, for example, a particular combination of the presence or absence of the histopathological features set to BRAF gene abnormality is one in the “combination of the presence or absence of the histopathological features at the time of sorting” in the BRAF table T1 of
(Step S15) Next, with each of gene mutation prediction models constructed one-to-one for types of gene mutations, the gene mutation prediction models each having a combination of the presence or absence of the histopathological features, the gene mutation prediction unit 255 predicts the presence or absence of the target gene mutation for each region image. More particularly, for example, the gene mutation prediction unit 255 inputs each region image selected by sorting to each of gene mutation prediction models constructed one-to-one for types of gene mutations, the gene mutation prediction models each having a combination of the presence or absence of the histopathological features, and acquires prediction information on the presence or absence of the gene mutation.
(Step S16) Next, with respective prediction results of the region images, the prediction result output unit 256 predicts the presence or absence of each gene mutation in the patient.
(Step S17) Next, for example, the prediction result output unit 256 outputs a list of respective prediction results regarding the presence or absence of the gene mutations. Note that a prediction result of the presence or absence of gene mutation does not necessarily correspond to all prediction results of the presence or absence of the gene mutations and thus may be a prediction result of the presence or absence of at least one gene mutation. As above, with acquired prediction information on the presence or absence of gene mutation for each region image, the prediction result output unit 256 outputs a prediction result of the presence or absence of at least one gene mutation in the patient.
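The sorting of step S14 can be illustrated with a minimal sketch. It assumes each region image carries one prediction value per histopathological feature and that a feature counts as present when its value is at least 0.5. The names `matches` and `sort_tiles`, the presence threshold, the feature keys, and the convention that features not listed in the combination are ignored are all assumptions made for illustration.

```python
from typing import Dict, List


def matches(prediction: Dict[str, float], combination: Dict[str, bool],
            presence_threshold: float = 0.5) -> bool:
    """Check whether one region image matches a presence/absence combination.

    combination maps a feature name to True (must be present) or False
    (must be absent). Features not listed are ignored (assumption).
    """
    return all((prediction[name] >= presence_threshold) == required
               for name, required in combination.items())


def sort_tiles(tile_predictions: Dict[str, Dict[str, float]],
               combination: Dict[str, bool]) -> List[str]:
    """Return the IDs of region images matching the combination."""
    return [tile_id for tile_id, prediction in tile_predictions.items()
            if matches(prediction, combination)]


# Toy prediction values for three tiles and two features.
toy_predictions = {
    "tile-1": {"cribriform_structure": 0.9, "mucus_present": 0.1},
    "tile-2": {"cribriform_structure": 0.8, "mucus_present": 0.7},
    "tile-3": {"cribriform_structure": 0.2, "mucus_present": 0.1},
}
# "Cribriform structure and no mucus present" as a sorting combination.
selected = sort_tiles(
    toy_predictions,
    {"cribriform_structure": True, "mucus_present": False},
)
```

Only the selected tiles would then be passed to the gene mutation prediction model in step S15.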
In the above example, regarding the target gene mutation, region images that match the “combination of the presence or absence of the histopathological features at the time of sorting” are sorted and the sorted region images are each input to the gene mutation prediction model, resulting in acquisition of respective gene mutation prediction results of the region images. Then, the presence or absence of the target gene mutation in the target patient is predicted with the respective prediction results of the region images. However, this is not limiting. Such a series of steps may be carried out per “combination of the presence or absence of the histopathological features at the time of sorting”. In this case, the collective output may be information indicating for which “combination of the presence or absence of the histopathological features at the time of sorting” the presence of the target gene mutation has been predicted. Furthermore, the above processing may be performed for a plurality of gene mutations and the above output may be made for the plurality of gene mutations.
Note that the acquisition unit 251 may further acquire the primary lesion site of the target disease. In this case, with the acquired primary lesion site of the target disease together with the acquired combination of the presence or absence of the histopathological features, the sorting unit 254 may sort the plurality of region images.
For example, in order to predict the presence or absence of BRAF gene mutation regarding colorectal cancer, with reference to the BRAF table T1 stored in the storage 23 (refer to
In this case, for example, regarding the target gene mutation, region images that match a set of the “primary lesion site” and the “combination of the presence or absence of the histopathological features at the time of sorting” are sorted and the sorted region images are each input to the gene mutation prediction model, resulting in acquisition of respective gene mutation prediction results of the region images. Then, the presence or absence of the target gene mutation in the target patient is predicted with the respective prediction results of the region images. Such a series of steps may be carried out per set of the “primary lesion site” and the “combination of the presence or absence of the histopathological features at the time of sorting”. In this case, collectively, output may be information as to with which set of the “primary lesion site” and the “combination of the presence or absence of the histopathological features at the time of sorting” the presence of the target gene mutation has been predicted. Furthermore, the above processing may be performed for a plurality of gene mutations and the above output may be made for the plurality of gene mutations.
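Predicting the presence or absence of a gene mutation in the patient from the per-region-image prediction results can be sketched in two ways, following the mean-based and ratio-based techniques mentioned earlier. The first threshold th1 appears in the text; the parameters `mean_threshold` and `ratio_threshold`, the function names, and the use of "not less than" for the final comparison are assumptions of this sketch and are not taken from the embodiment.

```python
from statistics import mean
from typing import Sequence


def predict_by_mean(tile_scores: Sequence[float],
                    mean_threshold: float) -> bool:
    """Predict mutation present when the mean tile score is high enough.

    mean_threshold is a hypothetical parameter of this sketch.
    """
    if not tile_scores:
        raise ValueError("no region images remained after sorting")
    return mean(tile_scores) >= mean_threshold


def predict_by_ratio(tile_scores: Sequence[float], th1: float,
                     ratio_threshold: float) -> bool:
    """Predict mutation present when enough tiles score at least th1.

    th1 is the per-tile threshold; ratio_threshold is a hypothetical
    parameter of this sketch.
    """
    if not tile_scores:
        raise ValueError("no region images remained after sorting")
    positive = sum(1 for score in tile_scores if score >= th1)
    return positive / len(tile_scores) >= ratio_threshold


# Toy gene mutation prediction values for four sorted region images.
toy_scores = [0.9, 0.8, 0.2, 0.1]
present_by_mean = predict_by_mean(toy_scores, mean_threshold=0.4)
present_by_ratio = predict_by_ratio(toy_scores, th1=0.5, ratio_threshold=0.5)
```

The mean-based rule is sensitive to how confident each tile is, while the ratio-based rule depends only on how many tiles clear th1, so the two can disagree on the same patient.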
Next, a method for importing from a prediction result for a region image to a prediction result per patient will be described with
For example, in a case where the number of tiles included in a region image group after sorting (e.g., a region image group in
Next, exemplary screen transition on a terminal 1 will be described with
Next, a flow of processing of screen transition in
(Step S110) First, the terminal 1 receives a pathologic tissue image of a patient and a primary lesion site of colorectal cancer.
(Step S120) Next, the terminal 1 transmits, to the computer system 2, the pathologic tissue image of the patient and the primary lesion site of colorectal cancer.
(Step S130) Next, the computer system 2 receives the pathologic tissue image of the patient and the primary lesion site of colorectal cancer, and then outputs information for displaying a prediction result of the presence or absence of each gene mutation, with the pathologic tissue image of the patient and the primary lesion site of colorectal cancer. This processing has been described in detail with
(Step S140) Next, the computer system 2 transmits, to the terminal 1, the information for displaying a prediction result of the presence or absence of each gene mutation.
(Step S150) Next, the terminal 1 receives the information for displaying a prediction result of the presence or absence of each gene mutation, and displays a prediction result regarding the presence or absence of each gene mutation, with the information. Then, the processing of the present flowchart terminates.
As above, according to the present embodiment, provided is an information processing system S including: an acquisition unit 251 that acquires a pathologic tissue image of a patient having a target disease; a division unit 252 that divides the pathologic tissue image of the patient into a plurality of region images; a feature prediction unit 253 that inputs each of the plurality of region images to each of a plurality of feature prediction models constructed one-to-one for types of histopathological features, to acquire prediction information on presence or absence of the histopathological features; a sorting unit 254 that sorts a plurality of region images of which respective combinations of presence or absence of the histopathological features based on the acquisition match a previously set combination of presence or absence of the histopathological features at time of sorting; a gene mutation prediction unit 255 that inputs each of the plurality of region images selected by the sorting to each of gene mutation prediction models constructed one-to-one for types of gene mutations, the gene mutation prediction models each having a combination of presence or absence of the histopathological features, to acquire prediction information on presence or absence of the gene mutations; and a prediction result output unit 256 that outputs, with the prediction information on presence or absence of the gene mutations acquired for each region image, a prediction result of presence or absence of at least one gene mutation in the patient.
According to the present embodiment, provided is an information processing system for predicting a gene having mutated in colorectal cancer tissue of a patient having colorectal cancer. The information processing system S includes an acquisition unit 251 that acquires a colorectal-cancer pathologic tissue image of the patient. The information processing system S further includes a division unit 252 that divides the colorectal-cancer pathologic tissue image of the patient into a plurality of region images. The information processing system S further includes a feature prediction unit 253 that inputs each of the plurality of region images to each of a plurality of feature prediction models constructed one-to-one for types of histopathological features, to acquire prediction information on presence or absence of the histopathological features. The information processing system S further includes a sorting unit 254 that sorts a plurality of region images of which respective combinations of presence or absence of the histopathological features based on the acquisition match a combination of presence or absence of the histopathological features at time of sorting previously set to at least one of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI. The information processing system S further includes: a gene mutation prediction unit 255 that inputs each of the sorted plurality of region images to each of gene mutation prediction models constructed one-to-one for types of gene mutations, the gene mutation prediction models each having a combination of presence or absence of the histopathological features, to acquire prediction information on presence or absence of the gene mutations; and a prediction result output unit 256 that outputs, with the prediction information on presence or absence of the gene mutations acquired for each region image, a prediction result of presence or absence of at least one gene mutation of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI in the patient.
According to the configuration, input of a colorectal-cancer pathologic tissue image of a patient having colorectal cancer enables prompt acquisition of a prediction result of the presence or absence of gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of gene mutation, so that administration of medication effective against the gene abnormality of the colorectal cancer can be inhibited from being delayed.
According to the present embodiment, provided is an information processing system for estimating presence or absence of BRAF gene mutation in a tumor, the information processing system including: means of dividing a pathologic tissue image of the tumor into one or a plurality of images; means of selecting, from among the divided images, at least one image of an image having a tumor-cell ratio larger than 50%, an image including a papillary structure, an image including a non-serrated and papillary structure, an image including a non-serrated and papillary structure and mucus present, an image including a cribriform structure, an image including a cribriform structure and no mucus present, or an image including a cribriform structure and mucus present; and means of estimating, with the selected at least one image, the presence or absence of the BRAF gene mutation.
According to the configuration, input of a pathologic tissue image enables prompt acquisition of a prediction result of the presence or absence of the BRAF gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of the BRAF gene mutation, so that administration of medication effective against the BRAF gene abnormality of cancer can be inhibited from being delayed.
According to the present embodiment, provided is an information processing system for estimating presence or absence of BRAF V600E gene mutation in a tumor, the information processing system including: means of dividing a pathologic tissue image of the tumor into one or a plurality of images; means of selecting, from among the divided images, at least one image of an image including a rail pattern, an image including a small solid nest, an image including a small solid nest of typical cells, an image including a large solid nest of typical cells, an image including an elongated elliptic nucleus, an image including mucus present, an image including a non-serrated and papillary structure and no mucus present, an image including a cribriform structure and no mucus present, or an image including a cribriform structure and mucus present; and means of estimating, with the selected at least one image, the presence or absence of the BRAF V600E gene mutation.
According to the configuration, input of a pathologic tissue image enables prompt acquisition of a prediction result of the presence or absence of the BRAF V600E gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of the BRAF V600E gene mutation, so that administration of medication effective against the BRAF V600E gene abnormality of cancer can be inhibited from being delayed.
According to the present embodiment, provided is an information processing system for estimating presence or absence of ERBB2 gene mutation in a tumor, the information processing system including: means of dividing a pathologic tissue image of the tumor into one or a plurality of images; means of selecting, from among the divided images, at least one image of an image including a trabecular structure and mucus present or an image including a trabecular structure and mucus leakage; and means of estimating, with the selected at least one image as an analysis target, the presence or absence of the ERBB2 gene mutation.
According to the configuration, input of a pathologic tissue image enables prompt acquisition of a prediction result of the presence or absence of the ERBB2 gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of the ERBB2 gene mutation, so that administration of medication effective against the ERBB2 gene abnormality of cancer can be inhibited from being delayed.
According to the present embodiment, provided is an information processing system for estimating presence or absence of TP53 gene mutation in a tumor, the information processing system including: means of dividing a pathologic tissue image of the tumor into one or a plurality of images; means of selecting, from among the divided images, at least one image of an image including a signet ring cell and mucus leakage, an image including a germ cell, an image including a trabecular structure and no mucus present, or an image including mucus leakage; and means of estimating, with the selected at least one image, the presence or absence of the TP53 gene mutation.
According to the configuration, input of a pathologic tissue image enables prompt acquisition of a prediction result of the presence or absence of the TP53 gene mutation. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of the TP53 gene mutation, so that administration of medication effective against the TP53 gene abnormality of cancer can be inhibited from being delayed.
According to the present embodiment, provided is an information processing system for estimating presence or absence of MSI gene abnormality in a tumor, the information processing system including: means of dividing a pathologic tissue image of the tumor into one or a plurality of images; means of selecting, from among the divided images, at least one image of an image including a rail pattern, an image including a trabecular structure and no mucus present, an image having a tumor-cell ratio larger than 50%, an image including a solid nest, an image including a small solid nest of typical cells, an image including an elongated elliptic nucleus, an image including a papillary structure, an image including a germ cell, an image including a non-serrated and papillary structure, an image including a roundish nucleus, an image including a cribriform structure and no mucus present, an image including a cribriform structure and mucus present, or an image including a cribriform structure and mucus leakage; and means of estimating, with the selected at least one image, the presence or absence of the MSI gene abnormality.
According to the configuration, input of a pathologic tissue image enables prompt acquisition of a prediction result of the presence or absence of the MSI gene abnormality. Thus, with reference to the prediction result, a clinician can prescribe the patient medication corresponding to the prediction result of the presence or absence of the MSI gene abnormality, so that administration of medication effective against the MSI gene abnormality of cancer can be inhibited from being delayed.
According to the present embodiment, provided is an information processing system for estimating presence or absence of RAS gene mutation in a tumor, the information processing system including: means of dividing a pathologic tissue image of the tumor into one or a plurality of images; means of selecting, from among the divided images, at least one image of an image including a signet ring cell and mucus leakage, an image including a tubular structure and no mucus present, an image including a trabecular structure and no mucus present, an image including a small solid nest, an image including a large solid nest, an image including a large solid nest of typical cells, an image including a papillary structure, an image including no mucus present, an image including mucus present, an image including a germ cell, an image including a non-serrated and papillary structure, an image including a non-serrated and papillary structure and no mucus present, an image including a non-serrated and papillary structure and mucus present, an image including a cribriform structure, an image including a cribriform structure and no mucus present, an image including a cribriform structure and mucus present, an image including a cribriform structure and mucus leakage, an image including budding present, an image having a tumor-cell ratio larger than 50%, an image including a small solid nest, an image including a small solid nest of typical cells, an image including a large solid nest, an image including an elongated elliptic nucleus, an image including mucus present, an image including mucus leakage, an image including a non-serrated and papillary structure and no mucus present, an image including a cribriform structure, an image including a cribriform structure and no mucus present, an image including a cribriform structure and mucus present, an image including a tubular structure and mucus present, an image including a tubular structure and mucus leakage, an image including a rail pattern, an image including a high-grade cell atypia, an image including a trabecular structure and mucus present, an image including a trabecular structure and mucus leakage, an image including a solid nest, an image including a non-serrated and papillary structure, an image including a non-serrated and papillary structure and mucus present, or an image including a roundish nucleus; and means of estimating, with the selected at least one image, the presence or absence of the RAS gene mutation.
According to this configuration, inputting a pathologic tissue image enables prompt acquisition of a prediction result for the presence or absence of the RAS gene mutation. With reference to the prediction result, a clinician can therefore prescribe the patient medication corresponding to the predicted presence or absence of the RAS gene mutation, so that delay in administering medication effective against the RAS gene abnormality of cancer can be inhibited.
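For illustration only, the three means described above for the RAS gene mutation (dividing, selecting, and estimating) can be sketched as the following minimal pipeline. The embodiment does not specify this implementation. The image representation, the function names, the tag strings, the averaging of per-image scores, and the 0.5 threshold are all assumptions, and `toy_tag_tile` and `toy_score_tile` are hypothetical stand-ins for whatever trained feature classifier and mutation estimator an actual system would use.

```python
from typing import Callable, List, Optional, Set

# Assumption: a grayscale image is a list of rows of 0-255 pixel values.
Image = List[List[int]]


def divide_into_tiles(image: Image, tile_size: int) -> List[Image]:
    """Divide an image into non-overlapping square tiles (partial edge tiles are dropped)."""
    height = len(image)
    width = len(image[0]) if image else 0
    tiles: List[Image] = []
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            tiles.append([row[left:left + tile_size] for row in image[top:top + tile_size]])
    return tiles


def select_tiles(
    tiles: List[Image],
    tag_tile: Callable[[Image], Set[str]],
    wanted_tags: Set[str],
) -> List[Image]:
    """Keep the tiles that show at least one of the wanted morphological features."""
    return [tile for tile in tiles if tag_tile(tile) & wanted_tags]


def estimate_mutation(
    selected: List[Image],
    score_tile: Callable[[Image], float],
    threshold: float = 0.5,
) -> Optional[bool]:
    """Average per-tile scores and compare with a threshold (an assumed aggregation rule).

    Returns None when no tile was selected, so that "no evidence" is not
    reported as "mutation absent".
    """
    if not selected:
        return None
    mean_score = sum(score_tile(tile) for tile in selected) / len(selected)
    return mean_score >= threshold


def mean_intensity(tile: Image) -> float:
    """Helper for the toy stand-ins below: mean pixel value of a tile."""
    return sum(sum(row) for row in tile) / (len(tile) * len(tile[0]))


def toy_tag_tile(tile: Image) -> Set[str]:
    """Hypothetical feature tagger: bright tiles are tagged as 'mucus_present'."""
    return {"mucus_present"} if mean_intensity(tile) > 100 else {"no_mucus_present"}


def toy_score_tile(tile: Image) -> float:
    """Hypothetical per-tile mutation score in [0, 1]; not a real model."""
    return mean_intensity(tile) / 255.0
```

In this sketch, a caller would chain the three functions, for example `estimate_mutation(select_tiles(divide_into_tiles(image, 2), toy_tag_tile, {"mucus_present"}), toy_score_tile)`. Keeping the tagger and the scorer as injected callables mirrors the separation between the selecting means and the estimating means, so either can be replaced without changing the other.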
Note that at least part of the computer system 2 in the embodiment described above may be implemented in hardware or in software. In a case where at least part of the computer system 2 is implemented in software, a program for achieving the function thereof may be stored in a recording medium, such as a flexible disk or a CD-ROM, and a computer may read and execute the program. Such a recording medium is not limited to a removable recording medium, such as a magnetic disk or an optical disc, and may be a fixed recording medium, such as a hard disk drive or a memory.
Such a program for achieving the function of at least part of the computer system 2 may be distributed through a communication channel (including wireless communication), such as the Internet. Furthermore, the program may be distributed through a wired or wireless channel, such as the Internet, in an encoded, modulated, or compressed state, or may be distributed in a state of being stored in a recording medium.
Furthermore, the computer system 2 may be implemented by one or a plurality of information apparatuses. In a case where a plurality of information apparatuses is used, a computer provided as one of them may execute a predetermined program to achieve the function of at least one means of the computer system 2.
All steps in a method according to the present invention may be achieved by automated control with a computer. Alternatively, each step may be carried out by a computer while progression from one step to the next is controlled manually. Furthermore, at least part of the steps may be carried out manually.
The present invention is not limited to the above embodiment as it is; in the implementation phase, it can be embodied with modified constituent elements without departing from the gist thereof. Various modifications of the present invention can be made by any appropriate combination of a plurality of constituent elements disclosed in the above embodiment. For example, some constituent elements may be removed from all the constituent elements in the embodiment. Furthermore, constituent elements of different embodiments may be combined as appropriate.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2020/038843 | 10/14/2020 | WO | |