CROSS-REFERENCE TO THE RELATED APPLICATIONS
This application is based upon and claims priority to Chinese Patent Application No. 202311524364.5, filed on Nov. 16, 2023, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The present disclosure relates to the technical field of medical image processing, and in particular to a multi-organ nuclei segmentation method based on prompt learning.
BACKGROUND
At present, most medical image segmentation models focus only on method improvement and exploration for image data. They rarely involve research on multi-modality guided segmentation, consider only visual factors, and lack comprehensive learning for target region segmentation. In pathological section analysis, multi-organ cell segmentation is recognized as a difficult topic, and available datasets are usually dedicated to nuclei segmentation for a specific organ. Therefore, segmentation models are usually constructed based on a partially labeled dataset or a dataset for a specific organ and can only segment the nuclei of that organ. More and more segmentation models show excellent performance in single-organ nuclei segmentation, but research on multi-organ nuclei segmentation using a single segmentation model is still limited. To enable a segmentation model to recognize more organs, the model must be retrained. Therefore, it is a challenge to guide a segmentation model in multi-organ nuclei segmentation based on multi-modality information including text and images.
SUMMARY
To overcome the above technical deficiencies, the present disclosure provides a method for accurately performing an unsupervised nuclei segmentation task for a specific organ through a sufficient text prompt on an unlabeled dataset.
In order to solve the technical problem, the present disclosure adopts the following technical solution.
A multi-organ nuclei segmentation method based on prompt learning includes the following steps:
- a) acquiring N brain nuclei images, N kidney nuclei images, N liver nuclei images, N breast nuclei images, N colon nuclei images, and N stomach nuclei images to form a nuclei image-text dataset T,
where Tibrain denotes an i-th brain nuclei image, and a nuclei photo of [brain] is a medical prompt text template for brain; Tikidney denotes an i-th kidney nuclei image, and a nuclei photo of [kidney] is a medical prompt text template for kidney; Tiliver denotes an i-th liver nuclei image, and a nuclei photo of [liver] is a medical prompt text template for liver; Tibreast denotes an i-th breast nuclei image, and a nuclei photo of [breast] is a medical prompt text template for breast; Ticolon denotes an i-th colon nuclei image, and a nuclei photo of [colon] is a medical prompt text template for colon; and Tistomach denotes an i-th stomach nuclei image, and a nuclei photo of [stomach] is a medical prompt text template for stomach;
- b) inputting the i-th brain nuclei image Tibrain and the medical prompt text template a nuclei photo of [brain] for brain, the i-th kidney nuclei image Tikidney and the medical prompt text template a nuclei photo of [kidney] for kidney, the i-th liver nuclei image Tiliver and the medical prompt text template a nuclei photo of [liver] for liver, the i-th breast nuclei image Tibreast and the medical prompt text template a nuclei photo of [breast] for breast, the i-th colon nuclei image Ticolon and the medical prompt text template a nuclei photo of [colon] for colon, and the i-th stomach nuclei image Tistomach and the medical prompt text template a nuclei photo of [stomach] for stomach in the nuclei image-text dataset T into a CLIP model to acquire an optimized CLIP model;
- c) acquiring a brain nuclei images, b kidney nuclei images, c liver nuclei images, d breast nuclei images, e colon nuclei images, and f stomach nuclei images from a nuclei instance segmentation (NuInsSeg) dataset, where a+b+c+d+e+f=n, to form a nuclei image-text dataset Y,
where Yibrain denotes an i-th brain nuclei image, Yikidney denotes an i-th kidney nuclei image, Yiliver denotes an i-th liver nuclei image, Yibreast denotes an i-th breast nuclei image, Yicolon denotes an i-th colon nuclei image, and Yistomach denotes an i-th stomach nuclei image;
- d) dividing the nuclei image-text dataset Y into a training set and a test set, and scaling each nuclei image in the training set to 572×572;
- e) constructing a segmentation network model, including a text module, an image module, and a multilayer perceptron (MLP) module;
- f) inputting the medical prompt text template in the training set into the text module to acquire a text vector;
- g) inputting the nuclei image in the training set into the image module of the segmentation network model to acquire a nuclei segmentation result image and a feature vector;
- h) inputting the feature vector into the MLP module of the segmentation network model to acquire a parameter;
- i) updating the segmentation network model through the parameter to acquire an updated segmentation network model;
- j) training the updated segmentation network model to acquire an optimized segmentation network model; and
- k) inputting the nuclei image in the test set into the optimized segmentation network model to acquire a final segmentation result image.
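For illustration only, the image-text pairing of step a) can be sketched in Python as follows; the `build_prompt` and `build_image_text_dataset` names and the dictionary input format are hypothetical conveniences, not part of the claimed method:

```python
# Illustrative sketch of step a): pair every nuclei image with its
# organ-specific medical prompt text template "a nuclei photo of [organ]".
ORGANS = ["brain", "kidney", "liver", "breast", "colon", "stomach"]

def build_prompt(organ: str) -> str:
    """Return the medical prompt text template for one organ."""
    return f"a nuclei photo of {organ}"

def build_image_text_dataset(images_per_organ: dict) -> list:
    """Form the nuclei image-text dataset T as (image, prompt) pairs."""
    dataset = []
    for organ in ORGANS:
        prompt = build_prompt(organ)
        for image in images_per_organ.get(organ, []):
            dataset.append((image, prompt))
    return dataset
```

Each resulting (image, prompt) pair is what step b) feeds to the CLIP model for fine-tuning.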
Preferably, N is 500.
Further, the step a) includes: acquiring the N brain nuclei images from a multi-organ nuclei segmentation (MoNuSeg) dataset and/or a CMP-15 dataset and/or a CMP-17 dataset and/or a nuclei instance segmentation dataset of cryosectioned hematoxylin-eosin-stained histological images (CryoNuSeg) and/or the NuInsSeg dataset; acquiring the N kidney nuclei images from the MoNuSeg dataset and/or a Kumar dataset and/or an Irshad dataset and/or the CryoNuSeg dataset and/or a multi-organ nuclei segmentation and classification (MoNuSAC) dataset and/or the NuInsSeg dataset; acquiring the N liver nuclei images from the MoNuSeg dataset and/or the CryoNuSeg dataset and/or a Crowdsourced dataset and/or the Kumar dataset and/or the NuInsSeg dataset; acquiring the N breast nuclei images from the MoNuSeg dataset and/or the MoNuSAC dataset and/or a nucleus classification, localization and segmentation (NuCLS) dataset and/or a triple negative breast cancer (TNBC) dataset and/or a Janowczyk dataset and/or a Gelasca dataset and/or a Naylor dataset and/or the NuInsSeg dataset and/or the Kumar dataset; acquiring the N colon nuclei images from a colorectal nuclei segmentation and phenotypes (CoNSeP) dataset and/or a CRCHisto dataset and/or the CryoNuSeg dataset and/or the NuInsSeg dataset and/or the Kumar dataset; and acquiring the N stomach nuclei images from the MoNuSeg dataset and/or the CryoNuSeg dataset and/or a Wienert dataset and/or the NuInsSeg dataset and/or the Kumar dataset.
Preferably, the step d) includes: dividing the nuclei image-text dataset Y into the training set and the test set at a ratio of 7:3.
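The 7:3 division of step d) can be sketched with a reproducible random split; the function name and the fixed seed are illustrative assumptions:

```python
import random

def split_dataset(samples: list, train_ratio: float = 0.7, seed: int = 42):
    """Shuffle the samples reproducibly, then split them into a
    training set and a test set at the given ratio (7:3 by default)."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```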
Further, the step f) includes:
- f-1) constructing the text module of the segmentation network model, including the optimized CLIP model;
- f-2) inputting the medical prompt text template a nuclei photo of [brain] for brain in the training set into the text module to acquire a text vector Nbrain, Nbrain ∈ RL×N, where R denotes a real number space, L denotes a length of a text, and N denotes a length of a last word in the text; and expanding, by a torch.unsqueeze function in a PyTorch library of Python, the text vector Nbrain by one channel dimension to acquire a text vector N′brain;
- f-3) inputting the medical prompt text template a nuclei photo of [kidney] for kidney in the training set into the text module to acquire a text vector Nkidney, Nkidney ∈ RL×N; and expanding, by the torch.unsqueeze function in the PyTorch library of Python, the text vector Nkidney by one channel dimension to acquire a text vector N′kidney;
- f-4) inputting the medical prompt text template a nuclei photo of [liver] for liver in the training set into the text module to acquire a text vector Nliver, Nliver ∈ RL×N; and expanding, by the torch.unsqueeze function in the PyTorch library of Python, the text vector Nliver by one channel dimension to acquire a text vector N′liver;
- f-5) inputting the medical prompt text template a nuclei photo of [breast] for breast in the training set into the text module to acquire a text vector Nbreast, Nbreast ∈ RL×N; and expanding, by the torch.unsqueeze function in the PyTorch library of Python, the text vector Nbreast by one channel dimension to acquire a text vector N′breast;
- f-6) inputting the medical prompt text template a nuclei photo of [colon] for colon in the training set into the text module to acquire a text vector Ncolon, Ncolon ∈ RL×N; and expanding, by the torch.unsqueeze function in the PyTorch library of Python, the text vector Ncolon by one channel dimension to acquire a text vector N′colon; and
- f-7) inputting the medical prompt text template a nuclei photo of [stomach] for stomach in the training set into the text module to acquire a text vector Nstomach, Nstomach ∈ RL×N; and expanding, by the torch.unsqueeze function in the PyTorch library of Python, the text vector Nstomach by one channel dimension to acquire a text vector N′stomach.
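The text-module steps above can be sketched in PyTorch as follows. The real encoder would be the optimized CLIP text encoder; here a random stand-in (`encode_prompt`) is used only to demonstrate the shapes and the torch.unsqueeze expansion, and the dimensions 77 and 512 are assumed CLIP-like defaults:

```python
import torch

# Assumed dimensions: L is the tokenized text length and N the embedding
# width; 77 and 512 are typical CLIP values, used here only to fix shapes.
L, N = 77, 512

def encode_prompt(prompt: str) -> torch.Tensor:
    """Stand-in for the optimized CLIP text encoder, returning a text
    vector in R^(L x N). A real implementation would tokenize the prompt
    and run it through the fine-tuned CLIP text tower."""
    seed = abs(hash(prompt)) % (2 ** 31)
    generator = torch.Generator().manual_seed(seed)
    return torch.randn(L, N, generator=generator)

def text_module(prompt: str) -> torch.Tensor:
    """Step f): encode a prompt template, then expand one channel
    dimension with torch.unsqueeze, as described in steps f-2) to f-7)."""
    n_organ = encode_prompt(prompt)     # shape (L, N)
    return torch.unsqueeze(n_organ, 0)  # shape (1, L, N)
```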
Further, the step g) includes:
- g-1) constructing the image module of the segmentation network model, including an image encoder, an image decoder, and a generalizable approximate partitioning (GAP) module;
- g-2) constructing the image encoder of the image module, including a first CRM, a second CRM, a third CRM, a fourth CRM, and a fifth CRM; constructing the first CRM, the second CRM, the third CRM, and the fourth CRM, each including a first convolutional layer, a first rectified linear unit (ReLU) activation function, a second convolutional layer, a second ReLU activation function, and a max pooling layer in sequence; constructing the fifth CRM, including a first convolutional layer, a first ReLU activation function, a second convolutional layer, and a second ReLU activation function in sequence; inputting the i-th brain nuclei image Yibrain in the training set into the first CRM to acquire a feature PE1brain, inputting the feature PE1brain into the second CRM to acquire a feature PE2brain, inputting the feature PE2brain into the third CRM to acquire a feature PE3brain, inputting the feature PE3brain into the fourth CRM to acquire a feature PE4brain, and inputting the feature PE4brain into the fifth CRM to acquire a feature PE5brain; inputting the i-th kidney nuclei image Yikidney in the training set into the first CRM to acquire a feature PE1kidney, inputting the feature PE1kidney into the second CRM to acquire a feature PE2kidney, inputting the feature PE2kidney into the third CRM to acquire a feature PE3kidney, inputting the feature PE3kidney into the fourth CRM to acquire a feature PE4kidney, and inputting the feature PE4kidney into the fifth CRM to acquire a feature PE5kidney; inputting the i-th liver nuclei image Yiliver in the training set into the first CRM to acquire a feature PE1liver, inputting the feature PE1liver into the second CRM to acquire a feature PE2liver, inputting the feature PE2liver into the third CRM to acquire a feature PE3liver, inputting the feature PE3liver into the fourth CRM to acquire a feature PE4liver, and inputting the feature PE4liver into the fifth CRM to acquire a feature PE5liver; inputting the i-th breast nuclei image Yibreast in the training set into the first CRM to acquire a feature PE1breast, inputting the feature PE1breast into the second CRM to acquire a feature PE2breast, inputting the feature PE2breast into the third CRM to acquire a feature PE3breast, inputting the feature PE3breast into the fourth CRM to acquire a feature PE4breast, and inputting the feature PE4breast into the fifth CRM to acquire a feature PE5breast; inputting the i-th colon nuclei image Yicolon in the training set into the first CRM to acquire a feature PE1colon, inputting the feature PE1colon into the second CRM to acquire a feature PE2colon, inputting the feature PE2colon into the third CRM to acquire a feature PE3colon, inputting the feature PE3colon into the fourth CRM to acquire a feature PE4colon, and inputting the feature PE4colon into the fifth CRM to acquire a feature PE5colon; and inputting the i-th stomach nuclei image Yistomach in the training set into the first CRM to acquire a feature PE1stomach, inputting the feature PE1stomach into the second CRM to acquire a feature PE2stomach, inputting the feature PE2stomach into the third CRM to acquire a feature PE3stomach, inputting the feature PE3stomach into the fourth CRM to acquire a feature PE4stomach, and inputting the feature PE4stomach into the fifth CRM to acquire a feature PE5stomach;
- g-3) constructing the image decoder of the image module, including a first GRU module, a second GRU module, a third GRU module, and a fourth GRU module; constructing the first GRU module, the second GRU module, the third GRU module, and the fourth GRU module, each including a first convolutional layer, a first ReLU activation function, a second convolutional layer, a second ReLU activation function, and an upsampling layer in sequence; inputting the feature PE5brain into the first GRU module to acquire a feature PO1brain, inputting the feature PO1brain into the second GRU module to acquire a feature PO2brain, inputting the feature PO2brain into the third GRU module to acquire a feature PO3brain, and inputting the feature PO3brain into the fourth GRU module to acquire a brain nuclei segmentation result image PO4brain; inputting the feature PE5kidney into the first GRU module to acquire a feature PO1kidney, inputting the feature PO1kidney into the second GRU module to acquire a feature PO2kidney, inputting the feature PO2kidney into the third GRU module to acquire a feature PO3kidney, and inputting the feature PO3kidney into the fourth GRU module to acquire a kidney nuclei segmentation result image PO4kidney; inputting the feature PE5liver into the first GRU module to acquire a feature PO1liver, inputting the feature PO1liver into the second GRU module to acquire a feature PO2liver, inputting the feature PO2liver into the third GRU module to acquire a feature PO3liver, and inputting the feature PO3liver into the fourth GRU module to acquire a liver nuclei segmentation result image PO4liver; inputting the feature PE5breast into the first GRU module to acquire a feature PO1breast, inputting the feature PO1breast into the second GRU module to acquire a feature PO2breast, inputting the feature PO2breast into the third GRU module to acquire a feature PO3breast, and inputting the feature PO3breast into the fourth GRU module to acquire a breast nuclei segmentation result image PO4breast; inputting the feature PE5colon into the first GRU module to acquire a feature PO1colon, inputting the feature PO1colon into the second GRU module to acquire a feature PO2colon, inputting the feature PO2colon into the third GRU module to acquire a feature PO3colon, and inputting the feature PO3colon into the fourth GRU module to acquire a colon nuclei segmentation result image PO4colon; and inputting the feature PE5stomach into the first GRU module to acquire a feature PO1stomach, inputting the feature PO1stomach into the second GRU module to acquire a feature PO2stomach, inputting the feature PO2stomach into the third GRU module to acquire a feature PO3stomach, and inputting the feature PO3stomach into the fourth GRU module to acquire a stomach nuclei segmentation result image PO4stomach; and
- g-4) constructing the GAP module of the image module, including a batch normalization (BN) layer, a ReLU activation function, and an adaptive average pooling layer; inputting the feature PE5brain into the GAP module to acquire a feature PGbrain, inputting the feature PE5kidney into the GAP module to acquire a feature PGkidney, inputting the feature PE5liver into the GAP module to acquire a feature PGliver, inputting the feature PE5breast into the GAP module to acquire a feature PGbreast, inputting the feature PE5colon into the GAP module to acquire a feature PGcolon, and inputting the feature PE5stomach into the GAP module to acquire a feature PGstomach; and concatenating the feature PGbrain and the text vector N′brain to acquire a feature vector Nmerbrain, concatenating the feature PGkidney and the text vector N′kidney to acquire a feature vector Nmerkidney, concatenating the feature PGliver and the text vector N′liver to acquire a feature vector Nmerliver, concatenating the feature PGbreast and the text vector N′breast to acquire a feature vector Nmerbreast, concatenating the feature PGcolon and the text vector N′colon to acquire a feature vector Nmercolon, and concatenating the feature PGstomach and the text vector N′stomach to acquire a feature vector Nmerstomach.
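The encoder blocks of step g-2) and the GAP module of step g-4) can be sketched in PyTorch as follows; `CRM` and `GAPModule` are hypothetical class names for the modules the method describes, and the channel widths and 3×3 kernel size are illustrative assumptions:

```python
import torch
import torch.nn as nn

class CRM(nn.Module):
    """Encoder block from step g-2): conv, ReLU, conv, ReLU, then a max
    pooling layer (the pooling is omitted for the fifth CRM)."""
    def __init__(self, c_in: int, c_out: int, pool: bool = True):
        super().__init__()
        layers = [nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
                  nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True)]
        if pool:
            layers.append(nn.MaxPool2d(2))
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return self.body(x)

class GAPModule(nn.Module):
    """GAP module from step g-4): batch normalization, ReLU, and adaptive
    average pooling down to a 1x1 spatial map."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(nn.BatchNorm2d(channels),
                                  nn.ReLU(inplace=True),
                                  nn.AdaptiveAvgPool2d(1))

    def forward(self, x):
        return self.body(x)
```

A PE5 feature passed through `GAPModule` yields the pooled feature PG, which can then be concatenated with the expanded text vector N′ to form the fused feature vector Nmer.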
Further, the step h) includes:
- h-1) constructing the MLP module of the segmentation network model, including a first convolutional layer, a second convolutional layer, and a third convolutional layer in sequence, where the first convolutional layer, the second convolutional layer, and the third convolutional layer each include a convolutional kernel with a size of 1×1;
- h-2) inputting the feature vector Nmerbrain into the MLP module to acquire a feature N1brain, inputting the feature vector Nmerkidney into the MLP module to acquire a feature N1kidney, inputting the feature vector Nmerliver into the MLP module to acquire a feature N1liver, inputting the feature vector Nmerbreast into the MLP module to acquire a feature N1breast, inputting the feature vector Nmercolon into the MLP module to acquire a feature N1colon, and inputting the feature vector Nmerstomach into the MLP module to acquire a feature N1stomach; and
- h-3) inputting the feature N1brain into a Sigmoid function to acquire a parameter θ1brain, inputting the feature N1kidney into the Sigmoid function to acquire a parameter θ1kidney, inputting the feature N1liver into the Sigmoid function to acquire a parameter θ1liver, inputting the feature N1breast into the Sigmoid function to acquire a parameter θ1breast, inputting the feature N1colon into the Sigmoid function to acquire a parameter θ1colon, and inputting the feature N1stomach into the Sigmoid function to acquire a parameter θ1stomach.
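The MLP module of step h) can be sketched as three 1×1 convolutions followed by a Sigmoid; the class name and channel widths are illustrative assumptions:

```python
import torch
import torch.nn as nn

class MLPModule(nn.Module):
    """MLP module from step h): three convolutional layers, each with a
    1x1 kernel, applied to the fused feature vector Nmer; a Sigmoid then
    maps the result to a parameter theta in (0, 1)."""
    def __init__(self, c_in: int, c_hidden: int, c_out: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c_in, c_hidden, kernel_size=1),
            nn.Conv2d(c_hidden, c_hidden, kernel_size=1),
            nn.Conv2d(c_hidden, c_out, kernel_size=1),
        )

    def forward(self, n_mer: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.body(n_mer))
```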
Further, the step i) includes:
- i-1) performing, by a reshape function in the PyTorch library of Python, a reshape operation through the parameter θ1brain on the first convolutional layer and the second convolutional layer of each of the first CRM, the second CRM, the third CRM, the fourth CRM, and the fifth CRM of the image encoder, and performing a convolution and a ReLU activation operation in sequence to complete a first update of the image encoder; and performing, by the reshape function in the PyTorch library of Python, a reshape operation through the parameter θ1brain on the first convolutional layer and the second convolutional layer of each of the first GRU module, the second GRU module, the third GRU module, and the fourth GRU module of the image decoder, and performing a convolution and a ReLU activation operation in sequence to complete a first update of the image decoder;
- i-2) performing, by the reshape function in the PyTorch library of Python, a reshape operation through the parameter θ1kidney on the first convolutional layer and the second convolutional layer of each of the first CRM, the second CRM, the third CRM, the fourth CRM, and the fifth CRM of the image encoder, and performing a convolution and a ReLU activation operation in sequence to complete a second update of the image encoder; and performing, by the reshape function in the PyTorch library of Python, a reshape operation through the parameter θ1kidney on the first convolutional layer and the second convolutional layer of each of the first GRU module, the second GRU module, the third GRU module, and the fourth GRU module of the image decoder, and performing a convolution and a ReLU activation operation in sequence to complete a second update of the image decoder;
- i-3) performing, by the reshape function in the PyTorch library of Python, a reshape operation through the parameter θ1liver on the first convolutional layer and the second convolutional layer of each of the first CRM, the second CRM, the third CRM, the fourth CRM, and the fifth CRM of the image encoder, and performing a convolution and a ReLU activation operation in sequence to complete a third update of the image encoder; and performing, by the reshape function in the PyTorch library of Python, a reshape operation through the parameter θ1liver on the first convolutional layer and the second convolutional layer of each of the first GRU module, the second GRU module, the third GRU module, and the fourth GRU module of the image decoder, and performing a convolution and a ReLU activation operation in sequence to complete a third update of the image decoder;
- i-4) performing, by the reshape function in the PyTorch library of Python, a reshape operation through the parameter θ1breast on the first convolutional layer and the second convolutional layer of each of the first CRM, the second CRM, the third CRM, the fourth CRM, and the fifth CRM of the image encoder, and performing a convolution and a ReLU activation operation in sequence to complete a fourth update of the image encoder; and performing, by the reshape function in the PyTorch library of Python, a reshape operation through the parameter θ1breast on the first convolutional layer and the second convolutional layer of each of the first GRU module, the second GRU module, the third GRU module, and the fourth GRU module of the image decoder, and performing a convolution and a ReLU activation operation in sequence to complete a fourth update of the image decoder;
- i-5) performing, by the reshape function in the PyTorch library of Python, a reshape operation through the parameter θ1colon on the first convolutional layer and the second convolutional layer of each of the first CRM, the second CRM, the third CRM, the fourth CRM, and the fifth CRM of the image encoder, and performing a convolution and a ReLU activation operation in sequence to complete a fifth update of the image encoder; and performing, by the reshape function in the PyTorch library of Python, a reshape operation through the parameter θ1colon on the first convolutional layer and the second convolutional layer of each of the first GRU module, the second GRU module, the third GRU module, and the fourth GRU module of the image decoder, and performing a convolution and a ReLU activation operation in sequence to complete a fifth update of the image decoder; and
- i-6) performing, by the reshape function in the PyTorch library of Python, a reshape operation through the parameter θ1stomach on the first convolutional layer and the second convolutional layer of each of the first CRM, the second CRM, the third CRM, the fourth CRM, and the fifth CRM of the image encoder, and performing a convolution and a ReLU activation operation in sequence to complete a sixth update of the image encoder; and performing, by the reshape function in the PyTorch library of Python, a reshape operation through the parameter θ1stomach on the first convolutional layer and the second convolutional layer of each of the first GRU module, the second GRU module, the third GRU module, and the fourth GRU module of the image decoder, and performing a convolution and a ReLU activation operation in sequence to complete a sixth update of the image decoder, thereby acquiring the updated segmentation network model.
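One possible reading of the per-organ update in step i) is sketched below: the parameter θ is reshaped so that it broadcasts over a convolution kernel and rescales its weights before the convolution and ReLU are applied. The exact modulation rule (here, per-output-channel scaling) is an assumption, not the definitive mechanism of the method:

```python
import torch
import torch.nn as nn

def update_conv_with_theta(conv: nn.Conv2d, theta: torch.Tensor) -> None:
    """Reshape the organ-specific parameter theta and rescale the kernel
    weights of one convolutional layer in place. This per-channel scaling
    is an illustrative assumption about the update described in step i)."""
    scale = torch.reshape(theta, (-1, 1, 1, 1))  # one factor per output channel
    with torch.no_grad():
        conv.weight.mul_(scale)
```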
Further, the step j) includes: training, by an adaptive moment estimation (Adam) optimizer, the updated segmentation network model through a Dice similarity coefficient (DSC) loss function to acquire the optimized segmentation network model.
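The DSC loss named in step j) can be sketched as follows for binary nuclei masks; `dice_loss` is a hypothetical name and the smoothing term `eps` is an assumption:

```python
import torch

def dice_loss(pred_logits: torch.Tensor, target: torch.Tensor,
              eps: float = 1e-6) -> torch.Tensor:
    """Dice similarity coefficient (DSC) loss: 1 minus the Dice
    coefficient between the predicted mask and the ground-truth mask."""
    pred = torch.sigmoid(pred_logits)
    intersection = (pred * target).sum()
    union = pred.sum() + target.sum()
    return 1.0 - (2.0 * intersection + eps) / (union + eps)
```

Training would pair this loss with the Adam optimizer, e.g. `torch.optim.Adam(model.parameters(), lr=1e-4)` followed by the usual backward/step cycle; the learning rate is an assumption.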
Further, the step k) includes:
- k-1) inputting the i-th brain nuclei image Yibrain in the test set into the image encoder and the image decoder of the image module in the optimized segmentation network model in sequence to acquire a final brain nuclei segmentation result image PO′4brain;
- k-2) inputting the i-th kidney nuclei image Yikidney in the test set into the image encoder and the image decoder of the image module in the optimized segmentation network model in sequence to acquire a final kidney nuclei segmentation result image PO′4kidney;
- k-3) inputting the i-th liver nuclei image Yiliver in the test set into the image encoder and the image decoder of the image module in the optimized segmentation network model in sequence to acquire a final liver nuclei segmentation result image PO′4liver;
- k-4) inputting the i-th breast nuclei image Yibreast in the test set into the image encoder and the image decoder of the image module in the optimized segmentation network model in sequence to acquire a final breast nuclei segmentation result image PO′4breast;
- k-5) inputting the i-th colon nuclei image Yicolon in the test set into the image encoder and the image decoder of the image module in the optimized segmentation network model in sequence to acquire a final colon nuclei segmentation result image PO′4colon; and
- k-6) inputting the i-th stomach nuclei image Yistomach in the test set into the image encoder and the image decoder of the image module in the optimized segmentation network model in sequence to acquire a final stomach nuclei segmentation result image PO′4stomach.
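The inference of step k) can be sketched as a forward pass followed by binarization; the function name and the sigmoid threshold of 0.5 are illustrative assumptions:

```python
import torch

def segment(model: torch.nn.Module, image: torch.Tensor,
            threshold: float = 0.5) -> torch.Tensor:
    """Step k): run a test nuclei image through the optimized model and
    binarize the output into a final segmentation result image."""
    model.eval()
    with torch.no_grad():
        logits = model(image)
    return (torch.sigmoid(logits) > threshold).float()
```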
The present disclosure has the following beneficial effects. More and more segmentation models show excellent performance in single-organ nuclei segmentation, but research on multi-organ nuclei segmentation using a single segmentation model is still limited, and such models are difficult to transfer to new fields. To solve these problems, the present disclosure proposes a segmentation network model based on text prompt learning. The present disclosure fully mines image information based on text and image information and learns an association between semantic information and a segmentation target, thereby achieving comprehensive learning for target region segmentation. The segmentation network model learns a large amount of paired text and image knowledge from six publicly available nuclei datasets based on the CLIP model to acquire prior knowledge for semantic understanding of nuclei, making the model fully suitable for nuclei segmentation tasks. The constructed model takes images and text prompts as inputs, and utilizes text and image information to achieve nucleus recognition and accurate nuclei segmentation for six different organs, improving computational efficiency. The segmentation network model can also utilize sufficient text prompts to complete accurate segmentation tasks on unlabeled datasets, achieving practicality and scalability.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a structural diagram of a network according to the present disclosure;
FIG. 2 shows sub-images of different organs cropped from the MoNuSeg-2018 dataset and their segmentation results;
Table 1 shows model comparison results on the MoNuSeg-2018 dataset and the NuInsSeg dataset;
Table 2 shows AJI, F1-score, and Dice on the MoNuSeg-2018 dataset; and
Table 3 shows comparison results between the proposed model and state-of-the-art (SOTA) models on the MoNuSeg-2018 dataset.
DETAILED DESCRIPTION OF THE EMBODIMENTS
The present disclosure will be described in detail below with reference to FIG. 1.
A multi-organ nuclei segmentation method based on prompt learning includes the following steps.
- a) N brain nuclei images, N kidney nuclei images, N liver nuclei images, N breast nuclei images, N colon nuclei images, and N stomach nuclei images are acquired to form nuclei image-text dataset T,
where Tibrain denotes an i-th brain nuclei image, and a nuclei photo of [brain] is a medical prompt text template for brain; Tikidney denotes an i-th kidney nuclei image, and a nuclei photo of [kidney] is a medical prompt text template for kidney; Tiliver denotes an i-th liver nuclei image, and a nuclei photo of [liver] is a medical prompt text template for liver; Tibreast denotes an i-th breast nuclei image, and a nuclei photo of [breast] is a medical prompt text template for breast; Ticolon denotes an i-th colon nuclei image, and a nuclei photo of [colon] is a medical prompt text template for colon; and Tistomach denotes an i-th stomach nuclei image, and a nuclei photo of [stomach] is a medical prompt text template for stomach.
- b) The i-th brain nuclei image Tibrain and the medical prompt text template a nuclei photo of [brain] for brain, the i-th kidney nuclei image Tikidney and the medical prompt text template a nuclei photo of [kidney] for kidney, the i-th liver nuclei image Tiliver and the medical prompt text template a nuclei photo of [liver] for liver, the i-th breast nuclei image Tibreast and the medical prompt text template a nuclei photo of [breast] for breast, the i-th colon nuclei image Ticolon and the medical prompt text template a nuclei photo of [colon] for colon, and the i-th stomach nuclei image Tistomach and the medical prompt text template a nuclei photo of [stomach] for stomach in the nuclei image-text dataset T are input into a CLIP model to acquire an optimized CLIP model.
- c) a brain nuclei images, b kidney nuclei images, c liver nuclei images, d breast nuclei images, e colon nuclei images, and f stomach nuclei images are acquired from a nuclei instance segmentation (NuInsSeg) dataset, where a+b+c+d+e+f=n, to form nuclei image-text dataset Y,
where Yibrain denotes an i-th brain nuclei image, Yikidney denotes an i-th kidney nuclei image, Yiliver denotes an i-th liver nuclei image, Yibreast denotes an i-th breast nuclei image, Yicolon denotes an i-th colon nuclei image, and Yistomach denotes an i-th stomach nuclei image.
- d) The nuclei image-text dataset Y is divided into a training set and a test set, and the nuclei images in the training set are scaled to 572×572.
- e) A segmentation network model is constructed, including a text module, an image module, and a multilayer perceptron (MLP) module.
- f) The medical prompt text template in the training set is input into the text module to acquire a text vector.
- g) The nuclei image in the training set is input into the image module of the segmentation network model to acquire a nuclei segmentation result image and a feature vector.
- h) The feature vector is input into the MLP module of the segmentation network model to acquire a parameter.
- i) The segmentation network model is updated through the parameter to acquire an updated segmentation network model.
- j) The updated segmentation network model is trained to acquire an optimized segmentation network model.
- k) The nuclei image in the test set is input into the optimized segmentation network model to acquire a final segmentation result image.
The present disclosure provides a multi-organ nuclei segmentation method based on prompt learning, which can segment multiple different types of organ nuclei through text and image information. In addition, the present disclosure can also accurately complete the unsupervised segmentation task of specific organ nuclei on an unlabeled dataset through sufficient text prompts.
In an embodiment of the present disclosure, N is 500. That is, there are 500 nuclei images for each organ.
In an embodiment of the present disclosure, in the step a), the N brain nuclei images are acquired from a multi-organ nuclei segmentation (MoNuSeg) dataset and/or a CMP-15 dataset and/or a CMP-17 dataset and/or a nuclei instance segmentation dataset of cryosectioned hematoxylin-eosin-stained histological images (CryoNuSeg) and/or the NuInsSeg dataset. The N kidney nuclei images are acquired from the MoNuSeg dataset and/or a Kumar dataset and/or an Irshad dataset and/or the CryoNuSeg dataset and/or a multi-organ nuclei segmentation and classification (MoNuSAC) dataset and/or the NuInsSeg dataset. The N liver nuclei images are acquired from the MoNuSeg dataset and/or the CryoNuSeg dataset and/or a Crowdsourced dataset and/or the Kumar dataset and/or the NuInsSeg dataset. The N breast nuclei images are acquired from the MoNuSeg dataset and/or the MoNuSAC dataset and/or a nucleus classification, localization and segmentation (NuCLS) dataset and/or a triple negative breast cancer (TNBC) dataset and/or a Janowczyk dataset and/or a Gelasca dataset and/or a Naylor dataset and/or the NuInsSeg dataset and/or the Kumar dataset. The N colon nuclei images are acquired from a colorectal nuclei segmentation and phenotypes (CoNSeP) dataset and/or a CRCHisto dataset and/or the CryoNuSeg dataset and/or the NuInsSeg dataset and/or the Kumar dataset. The N stomach nuclei images are acquired from the MoNuSeg dataset and/or the CryoNuSeg dataset and/or a Wienert dataset and/or the NuInsSeg dataset and/or the Kumar dataset.
In an embodiment of the present disclosure, in the step d), the nuclei image-text dataset Y is divided into the training set and the test set at a ratio of 7:3.
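For illustration only, the 7:3 division of the nuclei image-text dataset Y described above can be sketched as follows. The sample identifiers and the shuffling seed are hypothetical; the total of 3,000 image-text pairs assumes N = 500 images for each of the six organs.

```python
import random

# Hypothetical nuclei image-text dataset Y: 500 pairs per organ, 6 organs.
samples = [f"pair_{i}" for i in range(3000)]

# Shuffle with an arbitrary fixed seed, then split at 70%.
random.Random(0).shuffle(samples)
cut = int(len(samples) * 0.7)
train_set, test_set = samples[:cut], samples[cut:]

print(len(train_set), len(test_set))  # 2100 900
```

The split is disjoint by construction, so no image-text pair appears in both the training set and the test set.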
In an embodiment of the present disclosure, the step f) is as follows.
- f-1) The text module of the segmentation network model is constructed, including the optimized clip model.
- f-2) The medical prompt text template a nuclei photo of [brain] for brain in the training set is input into the text module to acquire text vector Nbrain, Nbrain ∈ RL×N, where R denotes a real number space, L denotes the length of the text, and N denotes the length of the last word in the text. The text vector Nbrain is expanded by one channel dimension by a torch.unsqueeze function in the PyTorch library of Python to acquire text vector N′brain.
- f-3) The medical prompt text template a nuclei photo of [kidney] for kidney in the training set is input into the text module to acquire text vector Nkidney, Nkidney ∈ RL×N. The text vector Nkidney is expanded by one channel dimension by the torch.unsqueeze function in the PyTorch library of Python to acquire text vector N′kidney.
- f-4) The medical prompt text template a nuclei photo of [liver] for liver in the training set is input into the text module to acquire text vector Nliver, Nliver ∈RL×N. The text vector Nliver is expanded by one channel dimension by the torch.unsqueeze function in the PyTorch library of Python to acquire text vector N′liver.
- f-5) The medical prompt text template a nuclei photo of [breast] for breast in the training set is input into the text module to acquire text vector Nbreast, Nbreast ∈RL×N. The text vector Nbreast is expanded by one channel dimension by the torch.unsqueeze function in the PyTorch library of Python to acquire text vector N′breast.
- f-6) The medical prompt text template a nuclei photo of [colon] for colon in the training set is input into the text module to acquire text vector Ncolon, Ncolon ∈RL×N. The text vector Ncolon is expanded by one channel dimension by the torch.unsqueeze function in the PyTorch library of Python to acquire text vector N′colon.
- f-7) The medical prompt text template a nuclei photo of [stomach] for stomach in the training set is input into the text module to acquire text vector Nstomach, Nstomach ∈ RL×N. The text vector Nstomach is expanded by one channel dimension by the torch.unsqueeze function in the PyTorch library of Python to acquire text vector N′stomach.
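The text-vector construction of steps f-2) to f-7) can be sketched as follows. The stand-in encoder below, the token length L = 77, and the embedding width N = 512 are assumptions for illustration; the disclosure uses the optimized clip model as the text module.

```python
import torch

# Assumed dimensions: L = token length of the prompt, N = embedding width.
L_TEXT, N_DIM = 77, 512

def encode_prompt(prompt: str) -> torch.Tensor:
    """Stand-in text encoder (NOT the optimized clip model): returns a
    seeded random (L, N) vector so the shape pipeline can be shown."""
    g = torch.Generator().manual_seed(sum(prompt.encode()))
    return torch.randn(L_TEXT, N_DIM, generator=g)

organs = ["brain", "kidney", "liver", "breast", "colon", "stomach"]
text_vectors = {}
for organ in organs:
    n = encode_prompt(f"a nuclei photo of [{organ}]")  # N_organ ∈ R^{L×N}
    # Expand by one channel dimension, as in steps f-2) to f-7).
    text_vectors[organ] = torch.unsqueeze(n, 0)        # N'_organ: (1, L, N)

print(text_vectors["brain"].shape)  # torch.Size([1, 77, 512])
```

Each of the six organ prompts thus yields an expanded text vector N′ of identical shape, ready for the later concatenation with the image feature.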
In an embodiment of the present disclosure, the step g) is as follows.
- g-1) The image module of the segmentation network model is constructed, including an image encoder, an image decoder, and a generalizable approximate partitioning (GAP) module.
- g-2) The image encoder of the image module is constructed, including a first CRM, a second CRM, a third CRM, a fourth CRM, and a fifth CRM. The first CRM, the second CRM, the third CRM, and the fourth CRM are constructed, each including a first convolutional layer, a first rectified linear unit (ReLU) activation function, a second convolutional layer, a second ReLU activation function, and a max pooling layer in sequence. The fifth CRM is constructed, including a first convolutional layer, a first ReLU activation function, a second convolutional layer, and a second ReLU activation function in sequence. The i-th brain nuclei image Yibrain in the training set is input into the first CRM to acquire feature PE1brain, Yibrain ∈ RC×H×W, where C, H, and W denote a channel number, a height, and a width of the image, respectively. The feature PE1brain is input into the second CRM to acquire feature PE2brain, the feature PE2brain is input into the third CRM to acquire feature PE3brain, the feature PE3brain is input into the fourth CRM to acquire feature PE4brain, and the feature PE4brain is input into the fifth CRM to acquire feature PE5brain. The i-th kidney nuclei image Yikidney in the training set is input into the first CRM to acquire feature PE1kidney, Yikidney ∈ RC×H×W, the feature PE1kidney is input into the second CRM to acquire feature PE2kidney, the feature PE2kidney is input into the third CRM to acquire feature PE3kidney, the feature PE3kidney is input into the fourth CRM to acquire feature PE4kidney, and the feature PE4kidney is input into the fifth CRM to acquire feature PE5kidney.
The i-th liver nuclei image Yiliver in the training set is input into the first CRM to acquire feature PE1liver, Yiliver ∈ RC×H×W, the feature PE1liver is input into the second CRM to acquire feature PE2liver, the feature PE2liver is input into the third CRM to acquire feature PE3liver, the feature PE3liver is input into the fourth CRM to acquire feature PE4liver, and the feature PE4liver is input into the fifth CRM to acquire feature PE5liver. The i-th breast nuclei image Yibreast in the training set is input into the first CRM to acquire feature PE1breast, Yibreast ∈ RC×H×W, the feature PE1breast is input into the second CRM to acquire feature PE2breast, the feature PE2breast is input into the third CRM to acquire feature PE3breast, the feature PE3breast is input into the fourth CRM to acquire feature PE4breast, and the feature PE4breast is input into the fifth CRM to acquire feature PE5breast. The i-th colon nuclei image Yicolon in the training set is input into the first CRM to acquire feature PE1colon, Yicolon ∈ RC×H×W, the feature PE1colon is input into the second CRM to acquire feature PE2colon, the feature PE2colon is input into the third CRM to acquire feature PE3colon, the feature PE3colon is input into the fourth CRM to acquire feature PE4colon, and the feature PE4colon is input into the fifth CRM to acquire feature PE5colon. The i-th stomach nuclei image Yistomach in the training set is input into the first CRM to acquire feature PE1stomach, Yistomach ∈ RC×H×W, the feature PE1stomach is input into the second CRM to acquire feature PE2stomach, the feature PE2stomach is input into the third CRM to acquire feature PE3stomach, the feature PE3stomach is input into the fourth CRM to acquire feature PE4stomach, and the feature PE4stomach is input into the fifth CRM to acquire feature PE5stomach.
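The encoder of step g-2) can be sketched in PyTorch as follows. The kernel size of 3×3 with padding 1, the 2×2 max pooling, and the channel widths are assumptions, since the disclosure does not specify them.

```python
import torch
from torch import nn

class CRM(nn.Module):
    """Conv-ReLU-Conv-ReLU(-MaxPool) module, per step g-2).
    The fifth CRM omits the max pooling layer."""
    def __init__(self, c_in: int, c_out: int, pool: bool = True):
        super().__init__()
        layers = [
            nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
        ]
        if pool:
            layers.append(nn.MaxPool2d(2))
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return self.body(x)

# Image encoder: four pooling CRMs followed by a fifth CRM without pooling.
encoder = nn.Sequential(
    CRM(3, 64), CRM(64, 128), CRM(128, 256), CRM(256, 512),
    CRM(512, 1024, pool=False),
)

y = torch.randn(1, 3, 256, 256)  # Y_i ∈ R^{C×H×W}, with a batch dimension
pe5 = encoder(y)                 # PE5 feature after the fifth CRM
print(pe5.shape)                 # torch.Size([1, 1024, 16, 16])
```

The same encoder is applied unchanged to the brain, kidney, liver, breast, colon, and stomach images; only the input tensor differs per organ.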
- g-3) The image decoder of the image module is constructed, including a first GRU module, a second GRU module, a third GRU module, and a fourth GRU module. The first GRU module, the second GRU module, the third GRU module, and the fourth GRU module are constructed, each including a first convolutional layer, a first ReLU activation function, a second convolutional layer, a second ReLU activation function, and an upsampling layer in sequence. The feature PE5brain is input into the first GRU module to acquire feature PO1brain, the feature PO1brain is input into the second GRU module to acquire feature PO2brain, the feature PO2brain is input into the third GRU module to acquire feature PO3brain, and the feature PO3brain is input into the fourth GRU module to acquire brain nuclei segmentation result image PO4brain. The feature PE5kidney is input into the first GRU module to acquire feature PO1kidney, the feature PO1kidney is input into the second GRU module to acquire feature PO2kidney, the feature PO2kidney is input into the third GRU module to acquire feature PO3kidney, and the feature PO3kidney is input into the fourth GRU module to acquire kidney nuclei segmentation result image PO4kidney. The feature PE5liver is input into the first GRU module to acquire feature PO1liver, the feature PO1liver is input into the second GRU module to acquire feature PO2liver, the feature PO2liver is input into the third GRU module to acquire feature PO3liver, and the feature PO3liver is input into the fourth GRU module to acquire liver nuclei segmentation result image PO4liver. The feature PE5breast is input into the first GRU module to acquire feature PO1breast, the feature PO1breast is input into the second GRU module to acquire feature PO2breast, the feature PO2breast is input into the third GRU module to acquire feature PO3breast, and the feature PO3breast is input into the fourth GRU module to acquire breast nuclei segmentation result image PO4breast.
The feature PE5colon is input into the first GRU module to acquire feature PO1colon, the feature PO1colon is input into the second GRU module to acquire feature PO2colon, the feature PO2colon is input into the third GRU module to acquire feature PO3colon, and the feature PO3colon is input into the fourth GRU module to acquire colon nuclei segmentation result image PO4colon. The feature PE5stomach is input into the first GRU module to acquire feature PO1stomach, the feature PO1stomach is input into the second GRU module to acquire feature PO2stomach, the feature PO2stomach is input into the third GRU module to acquire feature PO3stomach, and the feature PO3stomach is input into the fourth GRU module to acquire stomach nuclei segmentation result image PO4stomach.
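The decoder of step g-3) can be sketched as follows. Note that the "GRU module" named in the disclosure is a Conv-ReLU-Conv-ReLU-Upsample block, not a recurrent gated unit; the 3×3 kernels, bilinear 2× upsampling, and channel widths are assumptions.

```python
import torch
from torch import nn

class GRUModule(nn.Module):
    """Conv-ReLU-Conv-ReLU-Upsample block, per step g-3)."""
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
        )

    def forward(self, x):
        return self.body(x)

# Image decoder: four GRU modules; the last yields the PO4 result image.
decoder = nn.Sequential(
    GRUModule(1024, 512), GRUModule(512, 256),
    GRUModule(256, 128), GRUModule(128, 1),
)

pe5 = torch.randn(1, 1024, 16, 16)  # PE5 feature from the encoder
po4 = decoder(pe5)                  # nuclei segmentation result image PO4
print(po4.shape)                    # torch.Size([1, 1, 256, 256])
```

Four 2× upsampling steps undo the four pooling steps of the encoder, so PO4 recovers the spatial size of the input image.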
- g-4) The GAP module of the image module is constructed, including a batch normalization (BN) layer, a ReLU activation function, and an adaptive average pooling layer. The feature PE5brain is input into the GAP module to acquire feature PGbrain, the feature PE5kidney is input into the GAP module to acquire feature PGkidney, the feature PE5liver is input into the GAP module to acquire feature PGliver, the feature PE5breast is input into the GAP module to acquire feature PGbreast, the feature PE5colon is input into the GAP module to acquire feature PGcolon, and the feature PE5stomach is input into the GAP module to acquire feature PGstomach. The feature PGbrain and the text vector N′brain are concatenated to acquire feature vector Nmerbrain. The feature PGkidney and the text vector N′kidney are concatenated to acquire feature vector Nmerkidney. The feature PGliver and the text vector N′liver are concatenated to acquire feature vector Nmerliver. The feature PGbreast and the text vector N′breast are concatenated to acquire feature vector Nmerbreast. The feature PGcolon and the text vector N′colon are concatenated to acquire feature vector Nmercolon. The feature PGstomach and the text vector N′stomach are concatenated to acquire feature vector Nmerstomach.
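The GAP module of step g-4) and the image-text concatenation can be sketched as follows. The channel count of 1024 and the flatten-then-concatenate fusion are assumptions, since the disclosure does not specify the tensor layouts of the concatenation.

```python
import torch
from torch import nn

# GAP module per step g-4): BN -> ReLU -> adaptive average pooling to 1x1.
gap = nn.Sequential(
    nn.BatchNorm2d(1024),
    nn.ReLU(inplace=True),
    nn.AdaptiveAvgPool2d(1),
)

pe5 = torch.randn(2, 1024, 16, 16)       # PE5 feature (batch of 2)
pg = gap(pe5).flatten(1)                 # PG feature: (2, 1024)

n_text = torch.randn(77, 512)            # N_organ ∈ R^{L×N} (assumed L, N)
n_text = torch.unsqueeze(n_text, 0)      # N'_organ: (1, L, N)
# Flatten the text vector and broadcast it over the batch (assumption).
n_flat = n_text.flatten(1).expand(pg.size(0), -1)   # (2, 77*512)

n_mer = torch.cat([pg, n_flat], dim=1)   # fused feature vector Nmer
print(n_mer.shape)                       # torch.Size([2, 40448])
```

The fused vector Nmer carries both the pooled image feature PG and the prompt embedding N′, and is what the MLP module consumes in step h).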
In an embodiment of the present disclosure, the step h) is as follows.
- h-1) The MLP module of the segmentation network model is constructed, including a first convolutional layer, a second convolutional layer, and a third convolutional layer in sequence, where the first convolutional layer, the second convolutional layer, and the third convolutional layer each include a convolutional kernel with a size of 1×1.
- h-2) The feature vector Nmerbrain is input into the MLP module to acquire feature N1brain, the feature vector Nmerkidney is input into the MLP module to acquire feature N1kidney, the feature vector Nmerliver is input into the MLP module to acquire feature N1liver, the feature vector Nmerbreast is input into the MLP module to acquire feature N1breast, the feature vector Nmercolon is input into the MLP module to acquire feature N1colon, and the feature vector Nmerstomach is input into the MLP module to acquire feature N1stomach.
- h-3) The feature N1brain is input into a Sigmoid function to acquire parameter θ1brain, the feature N1kidney is input into the Sigmoid function to acquire parameter θ1kidney, the feature N1liver is input into the Sigmoid function to acquire parameter θ1liver, the feature N1breast is input into the Sigmoid function to acquire parameter θ1breast, the feature N1colon is input into the Sigmoid function to acquire parameter θ1colon, and the feature N1stomach is input into the Sigmoid function to acquire parameter θ1stomach.
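Steps h-1) to h-3) can be sketched as follows. The channel widths and the treatment of the fused vector as a 1×1 spatial map (so that 1×1 convolutions apply) are assumptions for illustration.

```python
import torch
from torch import nn

# MLP module per step h-1): three 1x1 convolutional layers in sequence.
# Channel widths (1280 -> 256 -> 128 -> 64) are assumed for the sketch.
mlp = nn.Sequential(
    nn.Conv2d(1280, 256, kernel_size=1),
    nn.Conv2d(256, 128, kernel_size=1),
    nn.Conv2d(128, 64, kernel_size=1),
)

n_mer = torch.randn(2, 1280)             # fused image-text vector Nmer
n1 = mlp(n_mer.view(2, -1, 1, 1))        # feature N1, per step h-2)
theta = torch.sigmoid(n1)                # parameter θ1, per step h-3)

print(theta.shape)                       # torch.Size([2, 64, 1, 1])
```

The Sigmoid squeezes every entry of θ1 into (0, 1), which makes θ1 usable as a bounded modulation parameter in the step i) updates.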
In an embodiment of the present disclosure, the step i) is as follows.
- i-1) A reshape operation is performed by a reshape function in the PyTorch library of Python, through the parameter θ1brain, in the first convolutional layer and the second convolutional layer of each of the first CRM, the second CRM, the third CRM, the fourth CRM, and the fifth CRM of the image encoder. A convolution and a ReLU activation operation are then performed in sequence to complete a first update of the image encoder. A reshape operation is performed by the reshape function in the PyTorch library of Python, through the parameter θ1brain, in the first convolutional layer and the second convolutional layer of each of the first GRU module, the second GRU module, the third GRU module, and the fourth GRU module of the image decoder. A convolution and a ReLU activation operation are then performed in sequence to complete a first update of the image decoder.
- i-2) A reshape operation is performed by the reshape function in the PyTorch library of Python, through the parameter θ1kidney, in the first convolutional layer and the second convolutional layer of each of the first CRM, the second CRM, the third CRM, the fourth CRM, and the fifth CRM of the image encoder. A convolution and a ReLU activation operation are then performed in sequence to complete a second update of the image encoder. A reshape operation is performed by the reshape function in the PyTorch library of Python, through the parameter θ1kidney, in the first convolutional layer and the second convolutional layer of each of the first GRU module, the second GRU module, the third GRU module, and the fourth GRU module of the image decoder. A convolution and a ReLU activation operation are then performed in sequence to complete a second update of the image decoder.
- i-3) A reshape operation is performed by the reshape function in the PyTorch library of Python, through the parameter θ1liver, in the first convolutional layer and the second convolutional layer of each of the first CRM, the second CRM, the third CRM, the fourth CRM, and the fifth CRM of the image encoder. A convolution and a ReLU activation operation are then performed in sequence to complete a third update of the image encoder. A reshape operation is performed by the reshape function in the PyTorch library of Python, through the parameter θ1liver, in the first convolutional layer and the second convolutional layer of each of the first GRU module, the second GRU module, the third GRU module, and the fourth GRU module of the image decoder. A convolution and a ReLU activation operation are then performed in sequence to complete a third update of the image decoder.
- i-4) A reshape operation is performed by the reshape function in the PyTorch library of Python, through the parameter θ1breast, in the first convolutional layer and the second convolutional layer of each of the first CRM, the second CRM, the third CRM, the fourth CRM, and the fifth CRM of the image encoder. A convolution and a ReLU activation operation are then performed in sequence to complete a fourth update of the image encoder. A reshape operation is performed by the reshape function in the PyTorch library of Python, through the parameter θ1breast, in the first convolutional layer and the second convolutional layer of each of the first GRU module, the second GRU module, the third GRU module, and the fourth GRU module of the image decoder. A convolution and a ReLU activation operation are then performed in sequence to complete a fourth update of the image decoder.
- i-5) A reshape operation is performed by the reshape function in the PyTorch library of Python, through the parameter θ1colon, in the first convolutional layer and the second convolutional layer of each of the first CRM, the second CRM, the third CRM, the fourth CRM, and the fifth CRM of the image encoder. A convolution and a ReLU activation operation are then performed in sequence to complete a fifth update of the image encoder. A reshape operation is performed by the reshape function in the PyTorch library of Python, through the parameter θ1colon, in the first convolutional layer and the second convolutional layer of each of the first GRU module, the second GRU module, the third GRU module, and the fourth GRU module of the image decoder. A convolution and a ReLU activation operation are then performed in sequence to complete a fifth update of the image decoder.
- i-6) A reshape operation is performed by the reshape function in the PyTorch library of Python, through the parameter θ1stomach, in the first convolutional layer and the second convolutional layer of each of the first CRM, the second CRM, the third CRM, the fourth CRM, and the fifth CRM of the image encoder. A convolution and a ReLU activation operation are then performed in sequence to complete a sixth update of the image encoder. A reshape operation is performed by the reshape function in the PyTorch library of Python, through the parameter θ1stomach, in the first convolutional layer and the second convolutional layer of each of the first GRU module, the second GRU module, the third GRU module, and the fourth GRU module of the image decoder. A convolution and a ReLU activation operation are then performed in sequence to complete a sixth update of the image decoder, thereby acquiring the updated segmentation network model.
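The step i) updates are described only at a high level, so the following sketch shows one plausible reading: the organ-specific parameter θ1 is reshaped and used to scale the convolutional kernel channel-wise before the convolution and ReLU are applied. The channel-wise scaling rule is an assumption, not the disclosure's exact update.

```python
import torch
import torch.nn.functional as F
from torch import nn

# One convolutional layer of a CRM (sizes assumed for illustration).
conv = nn.Conv2d(64, 64, 3, padding=1)

# θ1: one sigmoid-bounded value per output filter (assumed granularity).
theta = torch.sigmoid(torch.randn(64))

# Reshape θ1 so it broadcasts over the kernel, then reweight the kernel.
w = conv.weight * theta.reshape(-1, 1, 1, 1)

# Convolution followed by the ReLU activation, as in step i).
x = torch.randn(1, 64, 32, 32)
out = F.relu(F.conv2d(x, w, conv.bias, padding=1))
print(out.shape)  # torch.Size([1, 64, 32, 32])
```

Under this reading, repeating the operation with θ1brain, θ1kidney, θ1liver, θ1breast, θ1colon, and θ1stomach yields the six organ-specific updates of the encoder and decoder.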
In an embodiment of the present disclosure, in the step j), the updated segmentation network model is trained by an adaptive moment estimation (Adam) optimizer through a Dice similarity coefficient (DSC) loss function to acquire the optimized segmentation network model.
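The training of step j) can be sketched as follows. The one-layer model stands in for the updated segmentation network, and the Dice smoothing constant of 1.0, learning rate, and step count are assumptions.

```python
import torch
from torch import nn

def dice_loss(pred: torch.Tensor, target: torch.Tensor,
              smooth: float = 1.0) -> torch.Tensor:
    """DSC loss: 1 - mean Dice coefficient over the batch."""
    pred = torch.sigmoid(pred).flatten(1)
    target = target.flatten(1)
    inter = (pred * target).sum(dim=1)
    dice = (2 * inter + smooth) / (pred.sum(dim=1) + target.sum(dim=1) + smooth)
    return 1 - dice.mean()

# Stand-in for the updated segmentation network model.
model = nn.Conv2d(3, 1, 3, padding=1)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(2, 3, 64, 64)                    # nuclei images
mask = (torch.rand(2, 1, 64, 64) > 0.5).float()  # binary ground-truth masks

for _ in range(3):  # a few Adam optimization steps
    opt.zero_grad()
    loss = dice_loss(model(x), mask)
    loss.backward()
    opt.step()
```

Because 2·|P∩T| ≤ |P| + |T|, the Dice coefficient lies in (0, 1] and the loss in [0, 1), so the Adam optimizer minimizes a bounded objective.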
In an embodiment of the present disclosure, the step k) is as follows.
- k-1) The i-th brain nuclei image Yibrain in the training set is input into the image encoder and the image decoder of the image module in the optimized segmentation network model in sequence to acquire final brain nuclei segmentation result image PO′4brain.
- k-2) The i-th kidney nuclei image Yikidney in the training set is input into the image encoder and the image decoder of the image module in the optimized segmentation network model in sequence to acquire final kidney nuclei segmentation result image PO′4kidney.
- k-3) The i-th liver nuclei image Yiliver in the training set is input into the image encoder and the image decoder of the image module in the optimized segmentation network model in sequence to acquire final liver nuclei segmentation result image PO′4liver.
- k-4) The i-th breast nuclei image Yibreast in the training set is input into the image encoder and the image decoder of the image module in the optimized segmentation network model in sequence to acquire final breast nuclei segmentation result image PO′4breast.
- k-5) The i-th colon nuclei image Yicolon in the training set is input into the image encoder and the image decoder of the image module in the optimized segmentation network model in sequence to acquire final colon nuclei segmentation result image PO′4colon.
- k-6) The i-th stomach nuclei image Yistomach in the training set is input into the image encoder and the image decoder of the image module in the optimized segmentation network model in sequence to acquire final stomach nuclei segmentation result image PO′4stomach.
Finally, it should be noted that the above descriptions are only preferred embodiments of the present disclosure, and are not intended to limit the present disclosure. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments, or equivalently substitute some technical features thereof. Any modification, equivalent substitution, improvement, etc. within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.
The proposed model is qualitatively analyzed. As shown in FIG. 2, the seven sub-images were cropped from seven different MoNuSeg-2018 test images of seven different organs, where three organs (bladder, lung, prostate) were not included in the training set or the validation set. The original images and nuclei segmentation images are listed in rows.
FIG. 2 shows that the proposed model can successfully segment individual nuclei, indicating that the proposed model has strong generalization ability. Owing to the overlap of certain nuclei and the diversity of tissue types, nucleus appearance, and H&E staining, there are some false positive cases (over-segmentation) and false negative cases (under-segmentation), but their number is insignificant relative to the total number of true positive cases.
For quantitative analysis, Table 1 lists the average values of aggregated Jaccard index (AJI), F1-score, and Dice for two test datasets. Some images in the NuInsSeg dataset show the nuclei of organs that are not present in the training set and the validation set. The test results are similar to those on the MoNuSeg-2018 dataset, indicating that the proposed model has strong generalization ability.
Table 2 shows the AJI, F1-score, and Dice of the proposed model in segmenting six types of organs in the MoNuSeg-2018 dataset. These values are relatively stable across different organ segmentation tasks. The stability of these scores is positive for the performance of the model, indicating that the model can maintain high consistency and stability in different organ segmentation tasks, which suggests that the model has strong applicability.
Table 3 shows that the proposed model achieves good performance in the nuclei segmentation task. Specifically, the model achieves a Dice score of 0.8311 and an AJI score of 0.6413 on the MoNuSeg dataset, an obvious improvement over Hover-Net and CIA-Net. It should be noted that the proposed model is designed for multi-organ segmentation and has strong generalization ability. However, if it encounters an unfamiliar organ, it needs to be retrained.