The technique of the present disclosure relates to an image processing apparatus, an operation method of an image processing apparatus, and an operation program of an image processing apparatus.
US2021/0342570A describes the following technique. By using a machine learning model such as an autoencoder, a feature amount is extracted from each patch image into which a specimen image depicting a tissue specimen of a liver or the like of an animal is subdivided. Based on the extracted feature amount, it is determined whether a morphological abnormality (such as hyperplasia, infiltration, congestion, inflammation, tumor, carcinogenesis, proliferation, bleeding, or glycogen decrease) is present in the tissue specimen depicted in the patch image. Based on the feature amount, each patch image determined to have a morphological abnormality is clustered (hard clustered) into one of a plurality of clusters.
To determine whether a morphological abnormality is present, not only image information of the region where the abnormality is present but also image information of the surrounding region is needed. For example, in the case of determining whether bile duct hyperplasia is present as a morphological abnormality, even a specialized pathologist has difficulty determining whether a bile duct region is hyperplastic by observing that region alone and needs to observe a wider field of view including the region and its surroundings. Therefore, in US2021/0342570A, it is desirable that the patch images input to the machine learning model for extracting a feature amount have a size that covers not only the region of the morphological abnormality but also the surrounding region.
However, when the size of the patch images is set as described above, the feature amount is derived not only from the region of the morphological abnormality but also from the surrounding region. Therefore, a distribution of the feature amounts of the patch images determined to have a morphological abnormality is not discrete. When the distribution of the feature amounts is not discrete, it becomes difficult to perform hard clustering by computer processing as described in US2021/0342570A on the patch images determined to have a morphological abnormality.
One embodiment of the technique of the present disclosure provides an image processing apparatus, an operation method of an image processing apparatus, and an operation program of an image processing apparatus capable of clustering patch images of specimen images of which a distribution of feature amounts is not discrete.
An image processing apparatus according to the present disclosure includes a processor configured to acquire a first specimen image depicting a tissue specimen of a subject; extract, using a machine learning model, first feature amounts from respective first patch images into which the first specimen image is subdivided; determine, based on each of the first feature amounts, whether a morphological abnormality is present in the tissue specimen depicted in a corresponding first patch image of the first patch images; and perform one of manual clustering processing of receiving, from a user, a designation indicating which cluster a first patch image determined to have the morphological abnormality in the tissue specimen among the first patch images belongs to, and clustering, based on the designation, the first patch image into one of a plurality of clusters, or soft clustering processing of calculating a degree of belonging of the first patch image determined to have the morphological abnormality in the tissue specimen to each of the plurality of clusters.
Preferably, the processor is configured to perform control to display a result of the manual clustering processing or the soft clustering processing.
Preferably, the result is displayed by a plurality of cluster images generated by processing the first specimen image, and the plurality of cluster images are images that enable the plurality of clusters to be identified based on a display format preset for each of the plurality of clusters.
Preferably, the processor is configured to display at least one cluster image among the plurality of cluster images to be superimposed on the first specimen image.
Preferably, the processor is configured to receive, from the user, a designation of the at least one cluster image displayed to be superimposed on the first specimen image.
Preferably, the processor is configured to display statistical information based on the result.
Preferably, the processor is configured to, in the manual clustering processing, reduce a number of dimensions of the first feature amounts to two or three dimensions, display a graph in which the first feature amounts with a reduced number of dimensions are plotted in a two-dimensional space or a three-dimensional space, and receive the designation on the graph.
Preferably, the machine learning model is a model trained using, as labeled training data, second patch images into which second specimen images are subdivided, the second specimen images depicting tissue specimens of a plurality of subjects constituting a control group to which a candidate substance for a medicine is not administered in a past evaluation test of the candidate substance.
Preferably, the labeled training data further includes a patch image depicting the tissue specimen in which the morphological abnormality is present.
Preferably, the machine learning model is a model that performs a task of identifying a type of the morphological abnormality.
Preferably, the processor is configured to acquire information on a distribution of second feature amounts extracted using the machine learning model from the second patch images into which the second specimen images are subdivided; calculate a distance between the distribution and each of the first feature amounts; and perform the determination based on the distance.
An operation method of an image processing apparatus according to the present disclosure includes acquiring a first specimen image depicting a tissue specimen of a subject; extracting, using a machine learning model, first feature amounts from respective first patch images into which the first specimen image is subdivided; determining, based on each of the first feature amounts, whether a morphological abnormality is present in the tissue specimen depicted in a corresponding first patch image of the first patch images; and performing one of manual clustering processing of receiving, from a user, a designation indicating which cluster a first patch image determined to have the morphological abnormality in the tissue specimen among the first patch images belongs to, and clustering, based on the designation, the first patch image into one of a plurality of clusters, or soft clustering processing of calculating a degree of belonging of the first patch image determined to have the morphological abnormality in the tissue specimen to each of the plurality of clusters.
An operation program of an image processing apparatus according to the present disclosure causes a computer to execute a process including acquiring a first specimen image depicting a tissue specimen of a subject; extracting, using a machine learning model, first feature amounts from respective first patch images into which the first specimen image is subdivided; determining, based on each of the first feature amounts, whether a morphological abnormality is present in the tissue specimen depicted in a corresponding first patch image of the first patch images; and performing one of manual clustering processing of receiving, from a user, a designation indicating which cluster a first patch image determined to have the morphological abnormality in the tissue specimen among the first patch images belongs to, and clustering, based on the designation, the first patch image into one of a plurality of clusters, or soft clustering processing of calculating a degree of belonging of the first patch image determined to have the morphological abnormality in the tissue specimen to each of the plurality of clusters.
The technique of the present disclosure can provide an image processing apparatus, an operation method of an image processing apparatus, and an operation program of an image processing apparatus capable of clustering patch images of specimen images of which a distribution of feature amounts is not discrete.
Exemplary embodiments according to the technique of the present disclosure will be described in detail based on the following figures, wherein:
As illustrated in
First specimen images 151 are input to the image processing apparatus 10. The first specimen images 151 are images used for evaluating the efficacy and toxicity of the candidate substance 11. The first specimen images 151 are each generated, for example, in the following procedure. First, a subject S such as a rat prepared for evaluation of the candidate substance 11 is autopsied to collect a plurality of tissue specimens (hereinafter, referred to as liver specimens) LVS of cross sections of an organ, in this case a liver LV, of the subject S. Next, the collected liver specimens LVS are each placed on a slide glass 16. Then, the liver specimens LVS are stained, in this case, with hematoxylin-eosin stain. Subsequently, each of the stained liver specimens LVS is covered with a cover glass 17 to complete a slide specimen 18. Then, the slide specimen 18 is set in an imaging device 19 such as a digital optical microscope, and the imaging device 19 captures the first specimen image 151. The first specimen image 151 thus obtained is assigned a subject identification (ID) for uniquely identifying the subject S, a specimen image ID for uniquely identifying the first specimen image 151, an imaging date and time, and the like. Note that the tissue specimen is also referred to as a tissue section. In addition, the staining may be staining using hematoxylin dye alone, staining using nuclear fast red dye, or the like.
An administration group and a control group will be described. The administration group is constituted by a plurality of subjects S to which the candidate substance 11 is administered. In contrast to the administration group, the control group is constituted by a plurality of subjects S to which the candidate substance 11 is not administered. The number of subjects S constituting the administration group and the number of subjects S constituting the control group are both, for example, approximately 5 to 10. The subjects S constituting the administration group and the subjects S constituting the control group are subjects S having the same attributes and placed under the same rearing environment. The same attributes refer to, for example, the same week age, the same sex, and/or the like. The same attributes also include the same week age composition ratio and/or the same sex composition ratio (such as five males and five females). The same rearing environment refers to, for example, the same food that is given, the same temperature and humidity in a rearing space, the same rearing space size, and/or the like. The term “same” in the same rearing environment encompasses not only being completely the same but also being the same within a margin of error which is generally allowed in the technical field to which the technique of the present disclosure pertains and which is of a degree not inconsistent with the gist of the technique of the present disclosure.
In the administration group, there are a plurality of subgroups with different doses of the candidate substance 11. For example, the dose of the candidate substance 11 is varied over three levels: a high-dose group, a medium-dose group, and a low-dose group. This makes it possible to inspect the effects of different doses of the candidate substance 11 on the subjects S.
The first specimen images 151 may be images depicting the liver specimen LVS of the subject S of the administration group or images depicting the liver specimen LVS of the subject S of the control group.
As illustrated in
The storage 30 is a hard disk drive built in or connected, via a cable or network, to the computer constituting the image processing apparatus 10. Alternatively, the storage 30 may be a disk array of a plurality of linked hard disk drives. The storage 30 stores a control program such as an operating system, various application programs, and various kinds of data associated with these programs. Note that a solid-state drive may be used instead of the hard disk drive.
The memory 31 is a work memory for the CPU 32 to execute processing. The CPU 32 loads the program stored in the storage 30 into the memory 31 and executes processing according to the program. Thus, the CPU 32 comprehensively controls each part of the computer. The CPU 32 is an example of a “processor” according to the technique of the present disclosure. The memory 31 may be built into the CPU 32. The communication unit 33 controls transmission and reception of various kinds of information to and from an external device such as the imaging device 19.
As illustrated in
When the operation program 40 is started, the CPU 32 of the computer constituting the image processing apparatus 10 functions, in cooperation with the memory 31 and the like, as a read/write (hereinafter abbreviated as RW) control unit 50, a feature amount extraction unit 51, a determination unit 52, and a clustering processing unit 53.
The RW control unit 50 controls storing of various kinds of data in the storage 30 and reading out of various kinds of data from the storage 30. For example, the RW control unit 50 stores the first specimen images 151 from the imaging device 19 in the storage 30. Since a plurality of first specimen images 151 are actually obtained from a single subject S, the plurality of first specimen images 151 are stored in the storage 30 for the single subject S.
The RW control unit 50 reads out, from the storage 30, and thus acquires the first specimen images 151 corresponding to a designation made by the drug discovery staff member DS through the input device 13. The RW control unit 50 outputs the read out first specimen images 151 to the feature amount extraction unit 51 and the clustering processing unit 53. The first specimen images 151 output from the RW control unit 50 to the feature amount extraction unit 51 and the like are subjected to determination as to whether a morphological abnormality is present in the liver specimen LVS. Hereinafter, the first specimen image 151 that is subjected to determination as to whether a morphological abnormality is present in the liver specimen LVS is referred to as a target first specimen image 151T (see
The RW control unit 50 reads out the feature amount extractor 41 from the storage 30, and outputs the read out feature amount extractor 41 to the feature amount extraction unit 51. The RW control unit 50 also reads out, from the storage 30, and thus acquires the second feature amount distribution information 42. The RW control unit 50 outputs the read out second feature amount distribution information 42 to the determination unit 52.
The feature amount extraction unit 51 extracts, using the feature amount extractor 41, a first feature amount 601 from the target first specimen image 151T. The feature amount extraction unit 51 outputs the first feature amount 601 to the determination unit 52.
The determination unit 52 determines whether a morphological abnormality is present in the liver specimen LVS depicted in the target first specimen image 151T, based on the first feature amount 601 and the second feature amount distribution information 42. A morphological abnormality refers to a lesion not observed in a normal liver specimen LVS, for example, hyperplasia, infiltration, congestion, inflammation, tumor, carcinogenesis, proliferation, bleeding, glycogen decrease, or the like. The determination unit 52 outputs a determination result 61 indicating whether a morphological abnormality is present in the liver specimen LVS depicted in the target first specimen image 151T, to the clustering processing unit 53.
The clustering processing unit 53 performs manual clustering processing in the present embodiment. The manual clustering processing will be described later.
As illustrated in
As illustrated in
As illustrated in
As is well known, the encoder unit 71 has layers such as a convolutional layer that performs convolution processing using a filter and a pooling layer that performs pooling processing such as maximum value pooling processing. The same applies to the decoder unit 72. The encoder unit 71 extracts the feature amount 60 by repeatedly performing the convolution processing using the convolutional layer and the pooling processing using the pooling layer on the input patch image 65. The extracted feature amount 60 represents characteristics of the shape and texture of the liver specimen LVS depicted in the patch image 65.
The feature amount 60 is a set of a plurality of numerical values. In other words, the feature amount 60 is multidimensional data. The number of dimensions of the feature amount 60 is, for example, 512, 1024, or 2048. The first feature amount 601 and a second feature amount 602 (see
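As a concrete illustration of this extraction step, the following is a minimal sketch of an encoder of the kind described for the encoder unit 71, written in Python with PyTorch. The framework choice, the layer sizes, the patch size, and the name PatchEncoder are assumptions made for illustration and are not taken from the present disclosure.

```python
# Hypothetical sketch of an encoder such as the encoder unit 71.
# Framework (PyTorch), layer sizes, and names are illustrative assumptions.
import torch
import torch.nn as nn

class PatchEncoder(nn.Module):
    def __init__(self, feature_dim: int = 512):
        super().__init__()
        # Repeated convolution + max pooling, as described for the encoder unit.
        self.blocks = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)      # collapse spatial dimensions
        self.fc = nn.Linear(128, feature_dim)    # final feature amount vector

    def forward(self, patch: torch.Tensor) -> torch.Tensor:
        x = self.blocks(patch)
        x = self.pool(x).flatten(1)
        return self.fc(x)                        # shape: (batch, feature_dim)

# Example: one 256x256 RGB patch image -> one 512-dimensional feature amount.
encoder = PatchEncoder()
feature = encoder(torch.randn(1, 3, 256, 256))
```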
As illustrated in
In the training phase of the autoencoder 70, the series of processing of inputting the second patch image 652L for training to the autoencoder 70, outputting the reconstructed image 73L for training from the autoencoder 70, performing the loss calculation, making the update settings, and updating the autoencoder 70 is repeatedly performed while the second patch image 652L for training is replaced. The repetition of the series of processing is terminated when the reconstruction accuracy from the second patch image 652L for training to the reconstructed image 73L for training reaches a predetermined set level. The encoder unit 71 of the autoencoder 70 of which the reconstruction accuracy has reached the set level is stored in the storage 30 of the image processing apparatus 10 as the feature amount extractor 41. Note that training may be terminated after the series of processing is repeated a set number of times, regardless of the reconstruction accuracy from the second patch image 652L for training to the reconstructed image 73L for training.
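The training loop described above can be sketched as follows. The mean squared error loss, the optimizer, and the stopping threshold are illustrative assumptions; the present disclosure specifies only the overall flow of loss calculation, update settings, updating, and termination at a set level or after a set number of repetitions.

```python
# Hypothetical training-loop sketch for an autoencoder such as the autoencoder 70.
# Loss function, optimizer, and thresholds are assumptions for illustration.
import torch
import torch.nn as nn

def train_autoencoder(autoencoder: nn.Module, loader, max_steps=10_000, target_loss=1e-3):
    optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-4)
    criterion = nn.MSELoss()
    step = 0
    for patch in loader:                         # second patch images for training
        reconstructed = autoencoder(patch)       # reconstructed image for training
        loss = criterion(reconstructed, patch)   # loss calculation
        optimizer.zero_grad()
        loss.backward()                          # update settings
        optimizer.step()                         # update the autoencoder
        step += 1
        # Terminate when the reconstruction loss reaches the set level,
        # or after a set number of repetitions.
        if loss.item() <= target_loss or step >= max_steps:
            break
    return autoencoder
```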
The training of the autoencoder 70 may be performed by the image processing apparatus 10 or by an apparatus different from the image processing apparatus 10. In the latter case, the feature amount extractor 41 is transmitted from the different apparatus to the image processing apparatus 10, and the RW control unit 50 stores the feature amount extractor 41 in the storage 30.
As illustrated in
Next, how the second feature amount distribution information 42 is formed will be described. First, as illustrated in
In a graph 80 illustrated in
As in training of the autoencoder 70, the second feature amount distribution information 42 may be created by the image processing apparatus 10 or by an apparatus different from the image processing apparatus 10. In the latter case, the second feature amount distribution information 42 is transmitted from the different apparatus to the image processing apparatus 10, and the RW control unit 50 stores the second feature amount distribution information 42 in the storage 30. Additionally, the second feature amount distribution information 42 may be the plurality of second feature amounts 602 themselves. In this case, the representative position coordinates 82 are derived by the image processing apparatus 10.
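A minimal sketch of deriving distribution information such as the representative position coordinates 82 from the plurality of second feature amounts is shown below. Using the centroid (mean vector) as the representative position, and recording the per-dimension spread alongside it, are assumptions made for illustration.

```python
# Hypothetical sketch of second feature amount distribution information.
# The centroid as "representative position" is an illustrative assumption.
import numpy as np

def second_feature_distribution_info(second_features: np.ndarray) -> dict:
    """second_features: array of shape (num_patches, feature_dim)."""
    return {
        "representative_position": second_features.mean(axis=0),  # centroid of the distribution
        "std": second_features.std(axis=0),                       # spread of the distribution
    }
```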
As illustrated in
As illustrated in
On the other hand, as illustrated in
As illustrated in
As illustrated in
As illustrated in
The target designation screen 95 is a screen for designating a single target first specimen image 151T from among the plurality of first specimen images 151. The target designation screen 95 is provided with a single selection frame 97 movable among the plurality of first specimen images 151. The target designation screen 95 is also provided with an analyze button 98 at a lower portion thereof. The drug discovery staff member DS places the selection frame 97 at a desired first specimen image 151, and then selects the analyze button 98. As a result, the first specimen image 151 at which the selection frame 97 is placed is set as the target first specimen image 151T, the feature amount extraction unit 51 extracts the first feature amounts 601, and the determination unit 52 determines whether a morphological abnormality is present.
As illustrated in
When receiving the dimension-reduced data 100 from the dimension reduction unit 90, the display control unit 91 performs control to display a cluster designation screen 105, illustrated in
As described above, the first patch image 651 has a size that covers not only a region of a morphological abnormality but also the surrounding region. Therefore, the distribution of the first feature amounts 601R is not discrete.
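A minimal sketch of the dimension reduction performed before plotting is shown below. The present disclosure does not name a specific method here, so t-SNE from scikit-learn is used purely as an example; principal component analysis or UMAP would fit the described flow equally well.

```python
# Hypothetical sketch of reducing the first feature amounts to two dimensions
# for plotting in the feature amount space.  The method (t-SNE) is an assumption.
import numpy as np
from sklearn.manifold import TSNE

def reduce_to_2d(first_features: np.ndarray) -> np.ndarray:
    """first_features: (num_abnormal_patches, feature_dim) -> (num_abnormal_patches, 2)."""
    return TSNE(n_components=2, init="pca", random_state=0).fit_transform(first_features)
```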
A cluster designation toolbar 108 is provided to the right of the graph 106. The cluster designation toolbar 108 is a toolbar in which icons of various tools are collected. The various tools are used by the drug discovery staff member DS to designate, on the graph 106, which cluster each first patch image 651 determined to have a morphological abnormality in the liver specimen LVS belongs to. The tools include a tool for enclosing a group of the plots of the first feature amounts 601R, which is considered to be a cluster, using a rectangle, circle, ellipse, or free curve. The tools also include a tool for erasing the rectangle, circle, ellipse, or free curve enclosing the plots of the first feature amounts 601R and a tool for changing the designated cluster.
As illustrated in
The cluster designation screen 105 is provided with a complete designation button 109 at a lower portion thereof. After designating the clusters on the graph 106, the drug discovery staff member DS selects the complete designation button 109. As a result, the designation reception unit 92 receives the designation made by the drug discovery staff member DS and indicating which cluster each first patch image 651 determined to have a morphological abnormality in the liver specimen LVS belongs to.
The designation reception unit 92 generates clustering information 112, based on the received designation. The clustering information 112 is information in which, for each patch image ID 85, the cluster to which the corresponding first patch image 651 belongs is registered. A first patch image 651 whose plot of the first feature amount 601R does not fall within any of the clusters is not registered in the clustering information 112. The designation reception unit 92 outputs the clustering information 112 to the display control unit 91.
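One possible way to convert the received designation into clustering information such as the clustering information 112 is sketched below. Representing each user-drawn enclosure as a polygon and testing the plotted points with matplotlib's Path is an implementation assumption, not a detail stated in the present disclosure.

```python
# Hypothetical sketch: map user-drawn enclosures on the graph to cluster labels.
# Polygon representation and matplotlib's point-in-polygon test are assumptions.
import numpy as np
from matplotlib.path import Path

def build_clustering_info(points_2d: np.ndarray, patch_ids: list, enclosures: dict) -> dict:
    """points_2d: (N, 2) dimension-reduced first feature amounts.
    enclosures: {cluster_name: list of (x, y) polygon vertices drawn by the user}.
    Returns {patch_id: cluster_name}; plots outside every enclosure are not registered."""
    clustering_info = {}
    for cluster_name, vertices in enclosures.items():
        inside = Path(vertices).contains_points(points_2d)
        for patch_id, is_inside in zip(patch_ids, inside):
            if is_inside and patch_id not in clustering_info:
                clustering_info[patch_id] = cluster_name
    return clustering_info
```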
As illustrated in
The display control unit 91 generates the cluster images 115 to 117 in accordance with a display format 118 preset for each of the clusters 1 to 3. For example, the display format 118 indicates that the cluster 1, the cluster 2, and the cluster 3 are displayed in indigo, yellow-green, and gray, respectively. The display control unit 91 generates the cluster image 115 by filling, in indigo in the target first specimen image 151T, the positions (which can be identified from the position information 86) of the first patch images 651 whose patch image IDs 85 are registered in association with the cluster 1 in the clustering information 112. Likewise, the display control unit 91 generates the cluster image 116 by filling, in yellow-green, the positions of the first patch images 651 whose patch image IDs 85 are registered in association with the cluster 2 in the clustering information 112. The display control unit 91 further generates the cluster image 117 by filling, in gray, the positions of the first patch images 651 whose patch image IDs 85 are registered in association with the cluster 3 in the clustering information 112. By changing the display colors in this manner, the cluster images 115 to 117 become images that enable identification of the clusters 1 to 3. The cluster images 115 to 117 are an example of a “result of manual clustering processing” according to the technique of the present disclosure. Note that the drug discovery staff member DS may be allowed to change the setting of the display format 118 in any manner.
The display control unit 91 generates a superimposed image 119 in which at least one of the cluster images 115 to 117 is superimposed on the target first specimen image 151T.
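The generation of a cluster image and of the superimposed image 119 can be sketched as follows. The specific RGB values, the blending factor, and the array layout are assumptions made for illustration; only the idea of filling the patch positions of each cluster with its preset display color and overlaying the result on the specimen image is taken from the description above.

```python
# Hypothetical sketch of generating cluster overlays and a superimposed image.
# Colors, alpha value, and array layout are illustrative assumptions.
import numpy as np

DISPLAY_FORMAT = {"cluster1": (75, 0, 130),    # indigo
                  "cluster2": (154, 205, 50),  # yellow-green
                  "cluster3": (128, 128, 128)} # gray

def superimpose_clusters(specimen_rgb: np.ndarray, clustering_info: dict,
                         patch_positions: dict, patch_size: int, alpha: float = 0.5):
    """patch_positions: {patch_id: (row, col)} top-left corner of each first patch image."""
    overlay = specimen_rgb.astype(np.float32).copy()
    for patch_id, cluster in clustering_info.items():
        r, c = patch_positions[patch_id]
        color = np.array(DISPLAY_FORMAT[cluster], dtype=np.float32)
        region = overlay[r:r + patch_size, c:c + patch_size]
        overlay[r:r + patch_size, c:c + patch_size] = (1 - alpha) * region + alpha * color
    return overlay.astype(np.uint8)
```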
As illustrated in
When the display switching button 127 is selected, the display control unit 91 displays the cluster image 115 to be superimposed on the target first specimen image 151T. On the other hand, when the display switching button 127 is not selected, the display control unit 91 does not display the cluster image 115 to be superimposed on the target first specimen image 151T. Likewise, when the display switching button 128 is selected, the display control unit 91 displays the cluster image 116 to be superimposed on the target first specimen image 151T. When the display switching button 128 is not selected, the display control unit 91 does not display the cluster image 116 to be superimposed on the target first specimen image 151T. In addition, when the display switching button 129 is selected, the display control unit 91 displays the cluster image 117 to be superimposed on the target first specimen image 151T. When the display switching button 129 is not selected, the display control unit 91 does not display the cluster image 117 to be superimposed on the target first specimen image 151T.
Therefore, as illustrated in
Next, an operation performed by the configuration described above will be described with reference to flowcharts illustrated in
The imaging device 19 captures the first specimen images 151 depicting the liver specimen LVS of the subject S. The first specimen images 151 are output from the imaging device 19 to the image processing apparatus 10. In the image processing apparatus 10, the first specimen images 151 from the imaging device 19 are stored in the storage 30 by the RW control unit 50.
In
When the drug discovery staff member DS places the selection frame 97 at a desired one of the first specimen images 151 in the target designation screen 95 and selects the analyze button 98 (YES in step ST110), the RW control unit 50 reads out and thus acquires the first specimen image 151 at which the selection frame 97 is placed, from the storage 30 as the target first specimen image 151T (step ST115). The target first specimen image 151T is output from the RW control unit 50 to the feature amount extraction unit 51 and the display control unit 91.
The feature amount extractor 41 is read out from the storage 30 by the RW control unit 50, and the read out feature amount extractor 41 is output to the feature amount extraction unit 51. In addition, the second feature amount distribution information 42 is read out and thus acquired from the storage 30 by the RW control unit 50, and the read out second feature amount distribution information 42 is output to the determination unit 52.
As illustrated in
As illustrated in
The processing of steps ST125 and ST130 is performed for all the first patch images 651. After the processing of steps ST125 and ST130 is performed for all the first patch images 651 (YES in step ST135), the process proceeds to step ST140 illustrated in
As illustrated in
The cluster designation screen 105 includes the graph 106 in which the dimension-reduced two-dimensional first feature amounts 601R are plotted in the two-dimensional feature amount space 107. On this graph 106, as illustrated in
As illustrated in
The drug discovery staff member DS views the superimposed image 119 via the clustering result display screen 125 and evaluates the efficacy and toxicity of the candidate substance 11. At this time, the drug discovery staff member DS operates the display switching buttons 127 to 129 if necessary to switch the cluster images 115 to 117 that are displayed to be superimposed on the target first specimen image 151T.
As described above, the CPU 32 of the image processing apparatus 10 includes the RW control unit 50, the feature amount extraction unit 51, the determination unit 52, and the clustering processing unit 53. The RW control unit 50 reads out and thus acquires the target first specimen image 151T from the storage 30. The target first specimen image 151T is an image depicting the liver specimen LVS of the subject S.
The feature amount extraction unit 51 extracts, using the feature amount extractor 41, the first feature amounts 601 from the respective first patch images 651 into which the target first specimen image 151T is subdivided. The determination unit 52 determines, based on each of the first feature amounts 601, whether a morphological abnormality is present in the liver specimen LVS depicted in the corresponding first patch image 651. The clustering processing unit 53 receives, from the drug discovery staff member DS, a designation indicating which cluster the first patch image 651 determined to have a morphological abnormality in the liver specimen LVS belongs to. Then, based on the designation, the clustering processing unit 53 performs manual clustering processing of clustering the first patch image 651 into one of the plurality of clusters. Therefore, it becomes possible to cluster the first patch images 651 of the target first specimen image 151T of which the distribution of the first feature amounts 601 is not discrete.
The display control unit 91 performs control to display, on the display 12, the superimposed image 119 including at least one of the cluster images 115 to 117, which are a result of the manual clustering processing. Therefore, the drug discovery staff member DS can easily know the result of the manual clustering processing.
The result of the manual clustering processing is displayed by the plurality of cluster images 115 to 117 generated by processing the target first specimen image 151T. The plurality of cluster images 115 to 117 are images that enable the plurality of clusters 1 to 3 to be identified based on the display format 118 preset for each of the plurality of clusters 1 to 3. Therefore, the drug discovery staff member DS can easily grasp the result of the manual clustering processing.
The display control unit 91 displays at least one of the plurality of cluster images 115 to 117 to be superimposed on the target first specimen image 151T. Therefore, the drug discovery staff member DS can easily grasp which part of the liver specimen LVS belongs to which cluster and, consequently, what type of morphological abnormality is present in which part of the liver specimen LVS.
The type of morphological abnormality can be estimated with a certain probability depending on the part where the morphological abnormality is present. Therefore, the drug discovery staff member DS can identify, based on the superimposed image 119, whether the morphological abnormality is caused by the toxicity of the candidate substance 11, such as hyperplasia, or is a natural phenomenon not caused by the toxicity of the candidate substance 11, such as glycogen decrease. Therefore, based on the superimposed image 119, the drug discovery staff member DS can examine a mechanism of action for both the efficacy and toxicity of the candidate substance 11.
The designation reception unit 92 receives, from the drug discovery staff member DS, the designation of the cluster images 115 to 117 displayed to be superimposed on the target first specimen image 151T. Therefore, the drug discovery staff member DS can display only one cluster image of interest to be superimposed on the target first specimen image 151T or display all the cluster images 115 to 117 to be superimposed on the target first specimen image 151T. This expedites the evaluation of the efficacy and toxicity of the candidate substance 11 by the drug discovery staff member DS.
The dimension reduction unit 90 reduces the number of dimensions of the first feature amounts 601 to two dimensions. The display control unit 91 displays, on the display 12, the cluster designation screen 105 including the graph 106 in which the dimension-reduced two-dimensional first feature amounts 601R are plotted in the two-dimensional feature amount space 107. The designation reception unit 92 receives the designation on the graph 106. Therefore, the drug discovery staff member DS can easily designate which cluster each first patch image 651 belongs to.
As illustrated in
The RW control unit 50 reads out and thus acquires, from the storage 30, the second feature amount distribution information 42 that is information on the distribution 83 of the second feature amounts 602 extracted using the feature amount extractor 41 from the second patch images 652 into which the second specimen images 152 are subdivided. The determination unit 52 calculates the distance D between the distribution 83 and the first feature amount 601, more specifically, the distance D between the representative position of the distribution 83 and the position of the first feature amount 601, and performs the determination based on the distance D.
The distance D indicates the degree of deviation between the target first specimen image 151T and the second specimen image 152. At least no morphological abnormality due to the toxicity of the candidate substance is present in the liver specimen LVS depicted in the second specimen image 152. Therefore, if the determination is performed based on the distance D, the reasonable determination result 61 can be obtained.
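A minimal sketch of this distance-based determination is shown below. Using the Euclidean distance to the representative position and a fixed threshold is an assumption for illustration; another distance measure, such as the Mahalanobis distance, could be substituted without changing the overall flow.

```python
# Hypothetical sketch of the determination based on the distance D.
# Euclidean distance and a fixed threshold are illustrative assumptions.
import numpy as np

def has_morphological_abnormality(first_feature: np.ndarray,
                                  representative_position: np.ndarray,
                                  threshold: float) -> bool:
    distance = np.linalg.norm(first_feature - representative_position)  # distance D
    return distance > threshold  # far from the control-group distribution -> abnormal
```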
The number of dimensions of the dimension-reduced first feature amount 601R is not limited to the two dimensions illustrated in
As illustrated in
As illustrated in
The degree-of-belonging calculation unit 146 generates degree-of-belonging information 160 in which the degrees of belonging and the probability of not belonging are registered for each patch image ID 85 of the first patch image 651. The degree-of-belonging calculation unit 146 outputs the degree-of-belonging information 160 to the clustering information generation unit 147.
The clustering information generation unit 147 specifies the cluster to which each of the plurality of first patch images 651 belongs, based on the degree-of-belonging information 160. More specifically, the clustering information generation unit 147 specifies, as the cluster to which the first patch image 651 belongs, the cluster having the largest value among the degrees of belonging registered in the degree-of-belonging information 160. If the probability of not belonging is the largest value, the clustering information generation unit 147 specifies the corresponding first patch image 651 as not belonging to any of the clusters.
The clustering information generation unit 147 generates clustering information 161 representing the specified result. The clustering information 161 is information in which, for each patch image ID 85 of the first patch image 651, the cluster to which the first patch image 651 belongs is registered, similarly to the clustering information 112 of the first embodiment. The clustering information generation unit 147 outputs the clustering information 161 to the display control unit 91. Thereafter, the display control unit 91 generates a plurality of cluster images by processing the target first specimen image 151T in accordance with the clustering information 161, as in the first embodiment described above. Then, the display control unit 91 generates a superimposed image in which the cluster images are superimposed on the target first specimen image 151T, and performs control to display a clustering result display screen including the superimposed image on the display 12.
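A minimal sketch of the soft clustering processing is shown below. Modeling the clusters with a Gaussian mixture model from scikit-learn, and approximating the "probability of not belonging" with a floor on the largest degree of belonging, are implementation assumptions and are not methods named in the present disclosure.

```python
# Hypothetical soft clustering sketch: degrees of belonging per first patch image,
# then the cluster with the largest degree is selected.  GMM is an assumption.
import numpy as np
from sklearn.mixture import GaussianMixture

def soft_cluster(first_features: np.ndarray, patch_ids: list, n_clusters: int = 3,
                 belonging_floor: float = 0.4):
    gmm = GaussianMixture(n_components=n_clusters, random_state=0).fit(first_features)
    degrees = gmm.predict_proba(first_features)   # degree-of-belonging information
    clustering_info = {}
    for patch_id, row in zip(patch_ids, degrees):
        best = int(np.argmax(row))
        # Treat a patch whose largest degree is still small as belonging to no cluster
        # (a stand-in for the "probability of not belonging" described above).
        if row[best] >= belonging_floor:
            clustering_info[patch_id] = best
    return degrees, clustering_info
```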
As in a flowchart illustrated in
The clustering information generation unit 147 generates the clustering information 161, based on the degree-of-belonging information 160. As a result, the first patch images 651 are clustered into the plurality of clusters (step ST205). The clustering information 161 is output from the clustering information generation unit 147 to the display control unit 91. Then, the soft clustering processing is completed. Since the subsequent processing is substantially the same as that in the first embodiment, the description thereof is omitted.
As described above, the clustering processing unit 145 of the present embodiment performs soft clustering processing to calculate the degree of belonging of the first patch image 651, determined to have a morphological abnormality in the liver specimen LVS, to each of the plurality of clusters. Therefore, even with the present embodiment, it is possible to cluster the first patch images 651 of the target first specimen image 151T of which the distribution of the first feature amounts 601 is not discrete. In addition, since the drug discovery staff member DS does not need to designate which cluster the first patch image 651 belongs to, the burden on the drug discovery staff member DS can be reduced.
As in the first embodiment, the degrees of belonging may be calculated by the degree-of-belonging calculation unit 146 after the number of dimensions of the first feature amount 601 is reduced.
In the present embodiment, when only one cluster image is displayed to be superimposed on the target first specimen image 151T, a display manner such as a clustering result display screen 165 illustrated in
In the present embodiment, as in a clustering result display screen 170 illustrated in
By thus displaying the statistical information 171, the drug discovery staff member DS can easily grasp the proportion of the first patch images 651 belonging to each of the clusters 1 to 3. In addition to or instead of the number and percentage of first patch images 651 belonging to each of the clusters 1 to 3, the area of the first patch images 651 belonging to each of the clusters 1 to 3 may be displayed as the statistical information.
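The statistical information 171 described above can be computed as in the following sketch; the optional conversion of patch counts into an area is an illustrative assumption.

```python
# Hypothetical sketch of the statistical information: number, percentage, and
# (optionally) area of first patch images belonging to each cluster.
from collections import Counter

def cluster_statistics(clustering_info: dict, patch_area_mm2: float = None) -> dict:
    counts = Counter(clustering_info.values())
    total = sum(counts.values())
    stats = {}
    for cluster, n in counts.items():
        stats[cluster] = {"count": n, "percentage": 100.0 * n / total}
        if patch_area_mm2 is not None:
            stats[cluster]["area_mm2"] = n * patch_area_mm2  # assumed per-patch area
    return stats
```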
As illustrated in
As illustrated in
The third patch images 653L for training are obtained by choosing, from among the plurality of third patch images 653, the third patch images 653 depicting the liver specimen LVS in which a morphological abnormality is present. The third patch images 653L for training are chosen manually, for example, by a pathologist or the like. Alternatively, the third patch images 653L for training may be chosen by extracting a third feature amount from each third patch image 653 using the method described in the first embodiment and determining, based on the third feature amount, whether a morphological abnormality is present in the liver specimen LVS.
As described above, in the third embodiment, not only the second patch images 652L for training but also the third patch images 653L for training depicting the liver specimen LVS in which a morphological abnormality is present are used as labeled training data. Therefore, the autoencoder 70, and consequently the feature amount extractor 41, can learn the liver specimens LVS having more diverse characteristics of the shape and texture. As a result, it is possible to extract the first feature amount 601 that better represents the characteristics of the shape and texture of the liver specimen LVS.
Note that the patch images depicting the liver specimen LVS in which a morphological abnormality is present are not limited to the third patch images 653L for training obtained from the subjects S constituting the past administration group 175 described as an example. A morphological abnormality may also occur in the subjects S constituting the past control group 75. Therefore, it does not matter whether the subject S belongs to the past control group 75 or the past administration group 175, as long as the patch image depicts the liver specimen LVS in which a morphological abnormality is present. That is, the second patch images 652L for training depicting the liver specimen LVS in which a morphological abnormality is present may be used as labeled training data. Further, the patch images depicting the liver specimen LVS in which a morphological abnormality is present may also be images obtained from the subject S intentionally subjected to various stresses to induce such a morphological abnormality. Additionally, the patch images depicting the liver specimen LVS in which a morphological abnormality is present may also be images obtained by processing the patch image depicting a normal liver specimen LVS to artificially create a morphological abnormality.
As illustrated in
The CNN 180 has an output unit 183 in addition to the encoder unit 181. The patch image 65, such as the first patch image 651, is input to the encoder unit 181. The encoder unit 181 converts the patch image 65 into the feature amount 60. The encoder unit 181 passes the feature amount 60 to the output unit 183. The output unit 183 outputs a prediction result 184, based on the feature amount 60. The prediction result 184 indicates which one of a plurality of types of morphological abnormality, such as hyperplasia, infiltration, congestion, and inflammation, is present in the liver specimen LVS depicted in the patch image 65.
In a training phase of the CNN 180, in addition to the second patch images 652L for training, patch images depicting the liver specimen LVS in which a morphological abnormality is present, presented in the third embodiment, are mainly used as labeled training data. As described above, a morphological abnormality may also occur in the subjects S constituting the past control group 75. Therefore, the second patch images 652L for training may also depict the liver specimen LVS in which a morphological abnormality is present. Accordingly, in the training phase, it is preferable to apply label smoothing to the ground truth data to allow for a certain error rate, rather than strictly labeling the type of the morphological abnormality as 1 or 0. In addition, regions that are likely to be erroneously detected as a morphological abnormality in the liver specimen LVS depicted in the third patch images 653L for training may be masked and then used in training. Examples of the regions that are likely to be erroneously detected as a morphological abnormality include dust or the like adhering when the slide specimen 18 is prepared.
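A minimal sketch of training a classifier such as the CNN 180 with label smoothing is shown below. The smoothing value, the optimizer, and the training length are assumptions made for illustration.

```python
# Hypothetical sketch of training an abnormality-type classifier with label smoothing.
# Smoothing value, optimizer, and step count are illustrative assumptions.
import torch
import torch.nn as nn

def train_abnormality_classifier(cnn: nn.Module, loader, num_steps: int = 1000):
    # label_smoothing spreads a small amount of probability mass over the other
    # abnormality types instead of using a hard 1/0 ground truth.
    criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
    optimizer = torch.optim.Adam(cnn.parameters(), lr=1e-4)
    for step, (patch, abnormality_type) in enumerate(loader):
        logits = cnn(patch)                       # prediction over abnormality types
        loss = criterion(logits, abnormality_type)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if step + 1 >= num_steps:
            break
    return cnn
```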
As described above, in the fourth embodiment, the CNN 180, and consequently the feature amount extractor 182, is a model that performs a task of identifying the type of morphological abnormality. Therefore, the first feature amount 601 that still better represents the characteristics of the shape and texture of the liver specimen LVS can be extracted.
Note that the machine learning model to be repurposed as the feature amount extractor is not limited to the encoder unit 71 of the autoencoder 70 and the encoder unit 181 of the CNN 180 given as an example. A generator of a generative adversarial network (GAN) may also be repurposed as the feature amount extractor. A machine learning model without convolutional layers, such as a Vision Transformer (ViT), may also be repurposed as the feature amount extractor.
Contrastive learning may be performed, in which the model is trained so that the distance in the feature amount space between feature amounts derived from the same image becomes smaller and the distance between feature amounts derived from different images becomes larger. As the contrastive learning, a learning method such as a simple framework for contrastive learning of visual representations (SimCLR) is known, for example. In addition, a learning method such as Bootstrap Your Own Latent (BYOL), which does not use the aforementioned pairs of different images (also referred to as negative samples), may be used. In addition, constraints may be applied to the distribution of the extracted feature amounts, such as constraining the distribution to lie on a unit sphere or to follow a standard normal distribution.
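As one illustration of the contrastive learning mentioned above, the following sketch implements a SimCLR-style NT-Xent loss that pulls together feature amounts derived from two augmented views of the same patch image and pushes apart feature amounts derived from different patch images. The temperature value and the batch layout (two views concatenated) are assumptions made for illustration.

```python
# Hypothetical SimCLR-style NT-Xent loss sketch for contrastive learning of the
# feature amount extractor.  Temperature and batch layout are assumptions.
import torch
import torch.nn.functional as F

def nt_xent_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5):
    """z1, z2: (N, dim) feature amounts of two augmented views of the same N patches."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, dim), projected to unit sphere
    sim = z @ z.t() / temperature                        # pairwise cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float("-inf"))                # ignore self-similarity
    # The positive for sample i is its other augmented view: i + n (or i - n).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```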
As illustrated in
The CPU 32 of the image processing apparatus 10 according to the fifth embodiment functions as an identification unit 191, in addition to the processing units 50 to 53 described in the first embodiment_1 above. The identification unit 191 identifies tissue specimens of respective organs from the first specimen image 151 using, for example, templates for identifying the tissue specimens of the respective organs or a machine learning model. The identification unit 191 outputs coordinate information of frames 192 to 195 enclosing the tissue specimens of the respective organs as an identification result. The frame 192 is a frame enclosing the heart specimen HS. The frame 193 is a frame enclosing the liver specimen LVS. In addition, the frame 194 is a frame enclosing the brain specimen BS. The frame 195 is a frame enclosing the bone marrow specimen BMS.
As described above, in the fifth embodiment, the first specimen image 151 is an image obtained by imaging the slide specimen 190 on which tissue specimens of a plurality of types of organs are placed. The identification unit 191 identifies the tissue specimens of the respective organs from such a first specimen image 151. The feature amount extraction unit 51 extracts the first feature amount 601 for each of the tissue specimens of the identified organs. The determination unit 52 performs determination for each of the tissue specimens of the organs depicted in the target first specimen image 151T, based on the first feature amount 601. Therefore, the slide specimen 190 on which the tissue specimens of the plurality of types of organs are placed can be handled. As for the slide specimen, the slide specimen 190 on which tissue specimens of a plurality of types of organs are placed as in the present embodiment is more common than the slide specimen 18 on which a tissue specimen of a single organ is placed as in the first to fourth embodiments. Therefore, processing that matches the more common operational practices can be performed.
The frame indicating the tissue specimen of each organ in the first specimen image 151 may be defined manually by the drug discovery staff member DS.
The feature amount 60 is not limited to that extracted by the feature amount extractor 41. The feature amount 60 may be an average value, a maximum value, a minimum value, a mode value, or a variance of the pixel values of the patch image 65.
The organ is not limited to the liver LV used as an example. The organ may be a stomach, a lung, a small intestine, a large intestine, or the like. Furthermore, the subject S is not limited to a rat. The subject S may be a mouse, a guinea pig, a gerbil, a hamster, a ferret, a rabbit, a dog, a cat, a monkey, or the like.
The image processing apparatus 10 may be a personal computer installed in a pharmaceutical development facility as described in
In a case where the image processing apparatus 10 is a server computer, the first specimen image 151 is transmitted from a personal computer installed in each pharmaceutical development facility to the server computer via a network such as the Internet. The server computer distributes various screens such as the target designation screen 95 to the personal computer in the form of screen data for web distribution created using a markup language such as Extensible Markup Language (XML), for example. The personal computer reproduces the screen to be displayed on a web browser based on the screen data, and displays the screen on a display. In addition, other data description languages such as JavaScript (registered trademark) Object Notation (JSON) may be used instead of XML.
The image processing apparatus 10 according to the technique of the present disclosure can be widely used throughout all stages of pharmaceutical development, from the earliest stage of setting a drug discovery target to the final stage of clinical trials.
The hardware configuration of the computer constituting the image processing apparatus 10 according to the technique of the present disclosure can be modified in various ways. For example, the image processing apparatus 10 may be constituted by a plurality of computers that are separate in terms of hardware in order to improve the processing capacity and the reliability. For example, the functions of the feature amount extraction unit 51 and the determination unit 52 and the function of the clustering processing unit 53 or 145 can be distributed between two computers. In this case, the image processing apparatus 10 is constituted by the two computers.
As described above, the hardware configuration of the computer of the image processing apparatus 10 can be appropriately modified according to the required performance, such as the processing capacity, safety, and reliability. Further, not only the hardware but also an application program such as the operation program 40 may be duplicated or stored in a distributed manner across a plurality of storages, in order to ensure the safety and reliability.
In each of the embodiments described above, various processors mentioned below can be used as a hardware structure of the processing units that execute various types of processing such as the RW control unit 50, the feature amount extraction unit 51, the determination unit 52, the clustering processing units 53 and 145, the dimension reduction unit 90, the display control unit 91, the designation reception unit 92, the degree-of-belonging calculation unit 146, the clustering information generation unit 147, and the identification unit 191, for example. The various processors include the CPU 32, which is a general-purpose processor that executes software (the operation program 40) to function as various processing units as described above, a programmable logic device (PLD) such as a field programmable gate array (FPGA), which is a processor having a circuit configuration that can be changed after manufacture, and a dedicated electric circuit such as an application specific integrated circuit (ASIC), which is a processor with a circuit configuration specifically designed to execute particular processing.
One processing unit may be constituted by a single processor from among the various processors or by a combination of two or more processors of the same or different types (for example, a combination of a plurality of FPGAs and/or a combination of a CPU and an FPGA). Additionally, a plurality of processing units may be constituted by a single processor.
As an example in which a plurality of processing units are constituted by a single processor, firstly, there is a form represented by computers such as a client and a server, where one processor is constituted by a combination of one or more CPUs and software and this processor functions as the plurality of processing units. Secondly, there is a form that uses a processor that implements the functions of the entire system including the plurality of processing units, on a single integrated circuit (IC) chip, as represented by a system on chip (SoC) or the like. As described above, the various processing units are constituted by one or more of the aforementioned processors in terms of the hardware structure.
Further, as the hardware structure of these various processors, more specifically, an electric circuit (circuitry) in which circuit elements such as semiconductor elements are combined can be used.
From the above description, the technique described in appendices below can be grasped.
An image processing apparatus comprising:
The image processing apparatus according to appendix 1, wherein the processor is configured to perform control to display a result of the manual clustering processing or the soft clustering processing.
The image processing apparatus according to appendix 2, wherein
The image processing apparatus according to appendix 3, wherein the processor is configured to display at least one cluster image among the plurality of cluster images to be superimposed on the first specimen image.
The image processing apparatus according to appendix 4, wherein the processor is configured to receive, from the user, a designation of the at least one cluster image displayed to be superimposed on the first specimen image.
The image processing apparatus according to any one of appendices 2 to 5, wherein the processor is configured to display statistical information based on the result.
The image processing apparatus according to any one of appendices 1 to 6, wherein the processor is configured to, in the manual clustering processing:
The image processing apparatus according to any one of appendices 1 to 7, wherein the machine learning model is a model trained using, as labeled training data, second patch images into which second specimen images are subdivided, the second specimen images depicting tissue specimens of a plurality of subjects constituting a control group to which a candidate substance for a medicine is not administered in a past evaluation test of the candidate substance.
The image processing apparatus according to appendix 8, wherein the labeled training data further includes a patch image depicting the tissue specimen in which the morphological abnormality is present.
The image processing apparatus according to appendix 9, wherein the machine learning model is a model that performs a task of identifying a type of the morphological abnormality.
The image processing apparatus according to any one of appendices 8 to 10, wherein the processor is configured to:
The technique of the present disclosure can also be implemented by appropriately combining the various embodiments and/or various modifications described above. Furthermore, the present disclosure is not limited to the embodiments described above, and various configurations may be obviously adopted without departing from the gist. Further, the technique of the present disclosure encompasses not only a program but also a storage medium that stores the program in a non-transitory manner.
The contents and illustrations presented above are a detailed description of part related to the technique of the present disclosure, and are merely an example of the technique of the present disclosure. For example, the description regarding the configuration, functions, operations, and effects described above is a description about an example of the configuration, functions, operations, and effects of the part related to the technique of the present disclosure. Therefore, needless to say, within the scope not departing from the gist of the technique of the present disclosure, an unnecessary portion may be omitted, a new element may be added, or an element may be replaced in the contents described and illustrated above. Furthermore, to avoid confusion and facilitate understanding of the part related to the technique of the present disclosure, the contents described and illustrated above omit descriptions of common technical knowledge and the like that do not particularly require the description for enabling the implementation of the technique of the present disclosure.
In the present specification, “A and/or B” is synonymous with “at least one of A or B”. That is, “A and/or B” may mean only A, only B, or a combination of A and B. In addition, in the present specification, when three or more items are expressed by linked with “and/or”, the same concept as “A and/or B” is applied.
All documents, patent applications, and technical standards described in this specification are incorporated herein by reference to the same extent as if individual documents, patent applications, and technical standards were specifically and individually described to be incorporated by reference.
Number | Date | Country | Kind |
---|---|---|---|
2022-119115 | Jul 2022 | JP | national |
This application is a continuation application of International Application No. PCT/JP2023/026383 filed on Jul. 19, 2023, the disclosure of which is incorporated herein by reference in its entirety. Further, this application claims priority from Japanese Patent Application No. 2022-119115 filed on Jul. 26, 2022, the disclosure of which is incorporated herein by reference in its entirety.
Relation | Number | Date | Country
---|---|---|---
Parent | PCT/JP2023/026383 | Jul 2023 | WO
Child | 19035368 | | US