This application claims priority to Japanese Patent Application No. 2019-217159, filed on Nov. 29, 2019, the entire content of which is incorporated herein by reference.
The present invention relates to a cell analysis method, cell analysis device, cell analysis system, and cell analysis program, and trained artificial intelligence algorithm generation method, generation device, and generation program.
WIPO Patent Publication No. 2015/065697 discloses a method of applying a filtered microscope image to a trained machine learning model to determine centers and boundaries of cells of a specific type, count the determined cells, and output an image of the cells.
In examinations of patients who may have a tumor, it is necessary to understand the presence of abnormal cells such as peripheral circulating tumor cells and the proportion of cells having chromosome abnormality in a sample containing multiple types of cells to determine the presence or absence of a tumor, the effect of anticancer therapy, the presence or absence of recurrence and the like.
The number of abnormal cells contained in a sample may be very small compared with the number of normal cells that should originally be present in the sample. Therefore, it is necessary to analyze more cells in order to detect abnormal cells contained in the sample. However, since the method described in WIPO Patent Publication No. 2015/065697 uses a microscope image, increasing the number of cells to be determined increases the time required to acquire the microscope image.
The present invention provides a cell analysis method, a cell analysis device, a cell analysis system, a cell analysis program, and a trained artificial intelligence algorithm generation method, generation device, and generation program to facilitate high-accuracy and high-speed analysis of more cells in the sample.
One embodiment of the present invention relates to a cell analysis method for analyzing cells using an artificial intelligence algorithm (60, 63, 97). The cell analysis method causes a sample (10) containing cells to flow through a flow path (111), images cells passing through the flow path (111) to generate analysis target images (80, 85, 95), generates analysis data (82, 87, 96) from the generated analysis target images (80, 85, 95), inputs the generated analysis data to the artificial intelligence algorithm (60, 63, 97), and generates data (84, 88, 98) indicating the properties of the cells contained in the analysis target images (80, 85, 95) by the artificial intelligence algorithm.
One embodiment of the present invention relates to a cell analysis device (400A, 200B, 200C) that analyzes cells using an artificial intelligence algorithm (60, 63, 97). The cell analysis device (400A, 200B, 200C) includes a control unit (40A, 20B, 20C) configured to cause a sample (10) containing cells to flow in a flow path (111), input analysis data (82, 87, 96) generated from analysis target images (80, 85, 95) of cells passing through the flow path (111) into the artificial intelligence algorithm (60, 63, 97), and generate data (84, 88, 98) indicating the properties of the cells contained in the analysis target images (80, 85, 95) by the artificial intelligence algorithm (60, 63, 97).
One embodiment of the present invention relates to a cell analysis system (1000, 2000, 3000). The cell analysis system (1000, 2000, 3000) includes a flow cell (110) through which a sample (10) containing cells flows, light sources (120, 121, 122, 123) for irradiating light on the sample (10) flowing in the flow cell (110), an imaging unit (160) for imaging the cells in the sample (10) irradiated with the light, and a control unit (40A, 20B, 20C). The control unit (40A, 20B, 20C) is configured to acquire, as analysis target images (80, 85, 95), images of the cells passing through the inside of the flow path (111) captured by the imaging unit (160), generate analysis data (82, 87, 96) from the analysis target images (80, 85, 95), input the analysis data (82, 87, 96) to an artificial intelligence algorithm (60, 63, 97), and generate data (84, 88, 98) indicating the properties of cells included in the analysis target images (80, 85, 95).
One embodiment of the present invention relates to a cell analysis program for analyzing cells. The cell analysis program executes processing including a step (S22) of flowing a sample (10) containing cells into a flow path (111) and inputting analysis data (82, 87, 96) generated from analysis target images (80, 85, 95) obtained by imaging cells passing through the flow path (111) into an artificial intelligence algorithm (60, 63, 97), and a step (S23) of generating data (84, 88, 98) indicating the properties of cells included in the analysis target images (80, 85, 95) by the artificial intelligence algorithm (60, 63, 97).
The cell analysis device (400A, 200B, 200C), cell analysis system (1000, 2000, 3000), and cell analysis program facilitate high-accuracy and high-speed analysis of more cells contained in a sample.
One embodiment of the invention relates to a trained artificial intelligence algorithm (60, 63, 97) generation method for analyzing cells. The generation method includes inputting training data (73, 78, 92) generated from training images (70, 75, 90) which capture a cell passing through a flow path (111) when flowing a sample (10) containing cells in the flow path (111), and inputting a label (74P, 74N, 79P, 79N, 93P, 93N) showing the properties of cells contained in the training image (70, 75, 90) into an artificial intelligence algorithm (50, 53, 94) to train the artificial intelligence algorithm (50, 53, 94).
One embodiment of the present invention relates to a trained artificial intelligence algorithm (60, 63, 97) generation device (200A, 200B, 200C) for analyzing cells. The generation device (200A, 200B, 200C) is provided with a control unit (20A, 20B, 20C) configured to input training data (73, 78, 92) generated from training image (70, 75, 90) of a cell passing through a flow path (111) when flowing a sample (10) containing cells in the flow path (111), and input a label (74P, 74N, 79P, 79N, 93P, 93N) indicating a property of a cell included in the training image (70, 75, 90) to an artificial intelligence algorithm (50, 53, 94) to train the artificial intelligence algorithm (50, 53, 94).
One embodiment of the present invention relates to a trained artificial intelligence algorithm (60, 63, 97) generation program for analyzing cells. The generation program executes processing including a step (S12) of inputting training data (73, 78, 92) generated from training images (70, 75, 90) of a cell passing through a flow path (111) when flowing a sample (10) containing cells in the flow path (111) and inputting a label (74P, 74N, 79P, 79N, 93P, 93N) indicating the properties of cells contained in the training image (70, 75, 90) into the artificial intelligence algorithm (50, 53, 94), and a step (S12) of training the artificial intelligence algorithm (50, 53, 94).
An artificial intelligence algorithm (60, 63, 97) can be generated to facilitate high-speed high-accuracy analysis of cells contained in a sample by a trained artificial intelligence algorithm (60, 63, 97) generation method, generation device (200A, 200B, 200C), and generation program.
It is possible to facilitate high-accuracy and high-speed analysis of more cells contained in a sample.
Hereinafter, the summary and embodiments of the present invention will be described in detail with reference to the accompanying drawings. Note that in the following description and drawings, the same reference numeral denotes the same or similar component, and thus the description of the same or similar component may be omitted.
The present embodiment relates to a cell analysis method for analyzing cells using an artificial intelligence algorithm. In the cell analysis method, an analysis target image obtained by capturing an image of an analysis target cell is acquired by causing a sample containing cells to flow in a flow path and imaging the cells passing through the flow path. The analysis data to be input to the artificial intelligence algorithm are generated from the acquired analysis target image. When the analysis data are input to the artificial intelligence algorithm, the artificial intelligence algorithm generates data indicating the properties of the cells included in the analysis target image. The analysis target image is preferably an image of individual cells passing through the flow path.
In the present embodiment, the sample may be a sample prepared from a specimen collected from a subject. The sample may include, for example, blood samples such as peripheral blood, venous blood, and arterial blood; urine samples; and body fluid samples other than blood and urine. Body fluids other than blood and urine may include bone marrow, ascites, pleural effusion, spinal fluid and the like. Body fluids other than blood and urine may be simply referred to as “body fluid”. The blood is preferably peripheral blood. For example, the blood may be peripheral blood collected by using an anticoagulant such as ethylenediaminetetraacetate (sodium salt or potassium salt) or heparin sodium.
The sample can be prepared from the specimen according to a known method. For example, an examiner collects nucleated cells by subjecting a blood sample collected from a subject to centrifugation or the like using a cell separation medium such as Ficoll. Instead of recovering the nucleated cells by centrifugation, the nucleated cells may be left behind by hemolyzing red blood cells and the like using a hemolytic agent. The target site of the recovered nucleated cells is labeled by at least one method selected from the Fluorescence In Situ Hybridization (FISH) method, immunostaining method, intracellular organelle staining method and the like described below, preferably by fluorescent labeling; the suspension of the labeled cells is then supplied as a sample to, for example, an imaging flow cytometer to image the analysis target cells.
The sample can include multiple cells. Although the number of cells contained in the sample is not particularly limited, the sample should contain at least 10^2 or more, desirably 10^3 or more, preferably 10^4 or more, more preferably 10^5 or more, and ideally 10^6 or more cells. Also, the plurality of cells may include different types of cells.
In the present embodiment, cells that can be analyzed are also referred to as analysis target cells. The analysis target cell may be a cell contained in a sample collected from a subject. Preferably, the cells may be nucleated cells. The cells can include normal cells and abnormal cells.
Normal cell means a cell that should originally be contained in the sample depending on the body part from which the sample is collected. Abnormal cell means a cell other than a normal cell. Abnormal cells can include cells with chromosomal abnormalities and/or tumor cells. Here, the tumor cells are preferably peripheral circulating tumor cells. More preferably, the peripheral circulating tumor cells are not hematopoietic tumor cells, in which the tumor cells are present in the blood in the normal pathological state, but rather tumor cells that originate from a cell lineage other than a hematopoietic cell lineage and are in circulation. In the present specification, tumor cells circulating peripherally are also referred to as circulating tumor cells (CTC).
When detecting a chromosomal abnormality, the target site is the nucleus of the cell to be analyzed. Examples of chromosomal abnormalities include chromosomal translocations, deletions, inversions, duplications, and the like. Examples of diseases involving cells having such chromosomal abnormalities include myelodysplastic syndrome, acute myeloblastic leukemia, acute promyelocytic leukemia, acute myelomonocytic leukemia, acute monocytic leukemia, erythroleukemia, acute megakaryoblastic leukemia, acute myelogenous leukemia, acute lymphocytic leukemia, lymphoblastic leukemia, chronic myelogenous leukemia, chronic lymphocytic leukemia, Hodgkin lymphoma, non-Hodgkin lymphoma, malignant lymphoma and multiple myeloma.
The chromosomal abnormality can be detected by a known method such as the FISH method. In general, test items for detecting chromosomal abnormalities are set according to the type of abnormal cells to be detected. The gene or locus to be analyzed is set as an analysis item depending on what kind of test item is to be performed on the sample. In the detection of chromosomal abnormalities by the FISH method, an abnormal chromosome position or an abnormal number can be detected by hybridizing a probe that specifically binds to the locus or gene present in the nucleus of the cell to be analyzed. The probe is labeled with a labeling substance. The labeling substance is preferably a fluorescent dye. When the labeling substance is a fluorescent dye, multiple genes or loci can be detected in one cell by labeling the respective probes with fluorescent dyes having different fluorescence wavelength regions.
The abnormal cell is a cell that appears when a subject suffers from a predetermined disease, and may include, for example, a tumor cell such as a cancer cell or a leukemia cell. In the case of hematopoietic organs, the predetermined disease can be selected from a group consisting of myelodysplastic syndrome, acute myeloblastic leukemia, acute promyelocytic leukemia, acute myelomonocytic leukemia, acute monocytic leukemia, erythroleukemia, acute megakaryoblastic leukemia, acute myelogenous leukemia, acute lymphocytic leukemia, lymphoblastic leukemia, chronic myelogenous leukemia, chronic lymphocytic leukemia, Hodgkin lymphoma, non-Hodgkin lymphoma, malignant lymphoma and multiple myeloma. In the case of organs other than hematopoietic organs, the predetermined disease may be a gastrointestinal malignant tumor originating from the rectum or anal region, upper pharynx, esophagus, stomach, duodenum, jejunum, ileum, cecum, appendix, ascending colon, transverse colon, descending colon, or sigmoid colon; liver cancer; cholangiocarcinoma; pancreatic cancer; urinary malignancies originating from the bladder, ureter or kidney; female reproductive system malignancies originating from the ovaries, Fallopian tubes, or uterus; breast cancer; prostate cancer; skin cancer; endocrine malignancies originating from the hypothalamus, pituitary gland, thyroid gland, parathyroid gland, adrenal gland, or pancreas; central nervous system malignancies; and solid tumors such as malignant tumors that develop from bone and soft tissue.
Abnormal cells can be detected using at least one selected from bright-field images, immunostaining images for various antigens, and organelle-stained images that specifically stain organelles.
A bright-field image can be obtained by irradiating a cell with light and imaging the transmitted light from the cell or the reflected light from the cell. Preferably, the bright-field image is an image obtained by capturing the phase difference of cells using transmitted light.
Immunostained images can be obtained by imaging cells that have been immunostained by labeling with a labeling substance, using an antibody capable of binding to an antigen present at at least one intracellular or cell-surface target site selected from the nucleus, cytoplasm, and cell surface. As the labeling substance, it is preferable to use a fluorescent dye as in the FISH method. When the labeling substance is a fluorescent dye, multiple antigens can be detected in one cell by using, for the respective antigens, fluorescent dyes having different fluorescence wavelength regions.
Organelle-stained images can be obtained by imaging stained cells using dyes that can selectively bind to proteins, sugar chains, lipids, nucleic acids and the like present in at least one cell or cell membrane target site selected from the nucleus, cytoplasm, and cell membrane. Examples of nuclear-specific stains include Hoechst™ 33342, Hoechst™ 33258, 4′,6-diamidino-2-phenylindole (DAPI), Propidium Iodide (PI), DNA-binding dyes such as ReadyProbes™ nuclear staining reagents, and Histone protein binding reagents such as Cell Light™ reagent. Examples of the nucleolus and RNA-specific staining reagent include SYTO™ RNA Select™, which specifically binds to RNA. Examples of the cytoskeleton-specific staining reagent include fluorescently labeled phalloidin. The CytoPainter series from Abcam plc (Cambridge, UK) can be used as dye to stain other organelles, such as lysosomes, endoplasmic reticulum, Golgi apparatus, mitochondria and the like. These staining dyes or staining reagents are fluorescent dyes or reagents containing fluorescent dyes, and different fluorescence wavelength regions can be selected depending on the wavelength range of the fluorescence of the organelles and the fluorescent dyes used as another stain applied jointly to one cell.
When detecting abnormal cells, inspection items are set according to what kind of abnormal cells are detected. The inspection items may include analysis items necessary for detecting abnormal cells. The analysis items may be set corresponding to the above-mentioned bright-field image, each antigen, and each organelle. Fluorescent dyes having different wavelength regions of fluorescence correspond to each analysis item except for the bright field, and different analysis items can be detected in one cell.
The analysis data to be input to the artificial intelligence algorithm is acquired by a method described later. The data indicating the properties of the cells included in the analysis target image generated by the artificial intelligence algorithm are, for example, data indicating whether the analysis target cells are normal or abnormal. More specifically, the data indicating the properties of the cells included in the analysis target image are data indicating whether the analysis target cell is a cell having a chromosomal abnormality or a peripheral circulating tumor cell.
For convenience of description in the present specification, “analysis target image” may be referred to as “analysis image”, “data to be analyzed” may be referred to as “analysis data”, “image for training” may be referred to as “training image”, and “data for training” may be referred to as “training data”. The “fluorescent image” is intended to be a training image obtained by imaging a fluorescent label or an analysis image obtained by imaging a fluorescent label.
The training method of the first artificial intelligence algorithms 50 and 53 and the cell analysis method using the trained first artificial intelligence algorithms 60 and 63 will be described with reference to
As the artificial intelligence algorithm, for example, the artificial intelligence algorithm provided by Python can be used.
This embodiment is related to a method for training a first artificial intelligence algorithm 60 for detecting a chromosomal abnormality, and a cell analysis method using the first artificial intelligence algorithm 60 for detecting a chromosomal abnormality. Here, the term “train” or “training” may be used in place of the term “generate” or “generating”.
A training method of the first artificial intelligence algorithm 50 for detecting a chromosomal abnormality will be described with reference to
As shown in
Here, the case of detecting the PML-RARA chimeric gene will be exemplified. In this example, a probe for detecting the PML locus is bound to a first fluorescent dye that fluoresces in the green wavelength region, and a probe for detecting the RARA locus is bound to a second fluorescent dye that fluoresces in the red wavelength region different from that of the first fluorescent dye. The nucleus of the first positive control cell and the nucleus of the first negative control cell can each be labeled with the first fluorescent dye and the second fluorescent dye by the FISH method using the probe bound with the first fluorescent dye and the probe bound with the second fluorescent dye. The label with the first fluorescent dye at the target site may be referred to as the first fluorescent label, and the label with the second fluorescent dye at the target site may be referred to as the second fluorescent label.
A sample containing cells having the first fluorescent label and the second fluorescent label can be subjected to analysis in a cell imaging device such as an imaging flow cytometer to capture an image of the cells. An image taken of a cell may include multiple images for the same field of view of the same cell. Since the fluorescent dyes of the first fluorescent label and the second fluorescent label have different fluorescence wavelength regions, a first filter for transmitting the light emitted from the first fluorescent dye differs from a second filter for transmitting the light emitted from the second fluorescent dye. Therefore, the light transmitted through the first filter and the light transmitted through the second filter are taken into the imaging unit 160 described later via a corresponding first channel and second channel, respectively, and are captured as separate images of the same cell in the same field of view. That is, in the imaging unit 160, a plurality of images corresponding to the number of labeling substances labeling the cell are acquired for the same field of view of the same cell.
Therefore, in the example of
A method of generating the first positive numerical training data 71PA will be described using the first positive training image 70PA. In order to extract the cell region, each image captured by the imaging unit 160 is trimmed to a predetermined number of pixels, for example, 100 pixels in the vertical direction and 100 pixels in the horizontal direction, to generate a training image 70. At this time, trimming is performed so that the images acquired from each channel for one cell have the same field of view. For example, the trimming process determines the center of gravity of the cell and cuts out a region within a range of a predetermined number of pixels centered on the center of gravity. Because the cells are flowing through the flow cell, the position of the cell may differ from image to image; trimming makes more accurate training possible. The first positive training image 70PA is represented, for example, as a 16-bit grayscale image. Therefore, in each pixel, the brightness of the pixel can be indicated by a numerical value of the brightness of 65,536 gradations from 1 to 65,536. As shown in
Similar to the first positive numerical training data 71PA, the second positive numerical training data 71PB indicating the brightness of the imaged light at each pixel in the image can be generated from the second positive training image 70PB.
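As a minimal sketch of the trimming described above, the following Python code cuts a region of a predetermined number of pixels, centered on the center of gravity of the cell, out of every channel image of one cell, so that the trimmed images share the same field of view. The use of NumPy, the function name `crop_channels`, the brightness-weighted center of gravity computed from the sum of all channels, and the zero padding at the image border are assumptions made for illustration; the present description does not specify them.

```python
import numpy as np


def crop_channels(channels, size=100):
    """Cut the same size x size window out of every channel image of one cell."""
    # Assumption: the center of gravity is weighted by the summed brightness
    # of all channels; the description does not state how it is computed.
    weight = np.sum([c.astype(np.float64) for c in channels], axis=0)
    total = weight.sum()
    rows, cols = np.indices(weight.shape)
    if total > 0:
        cy = int(round(float((rows * weight).sum() / total)))
        cx = int(round(float((cols * weight).sum() / total)))
    else:
        # Completely dark image: fall back to the geometric center.
        cy, cx = weight.shape[0] // 2, weight.shape[1] // 2

    half = size // 2
    cropped = []
    for c in channels:
        # Zero padding (an assumption) keeps the window size constant
        # even when the cell lies close to the image border.
        padded = np.pad(c, half, mode="constant")
        # In padded coordinates the center of gravity sits at (cy + half, cx + half),
        # so the window starts at (cy, cx).
        cropped.append(padded[cy:cy + size, cx:cx + size])
    return cropped
```

Because one window position is computed per cell and reused for all channels, the numerical data obtained from the first and second channels refer to the same pixels of the same cell.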
Next, the first positive numerical training data 71PA and the second positive numerical training data 71PB are integrated for each pixel to generate positive integrated training data 72P. As shown in
Next, the positive integrated training data 72P are labeled with a label value 74P indicating that the positive integrated training data 72P are derived from the first positive control cell, and the labeled positive integrated training data 73P are generated. The numeral “2” is attached in
From the negative training image 70N, the labeled negative integrated training data 73N are generated in the same manner as in the case of generating the labeled positive integrated training data 73P.
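The integration and labeling steps above can be sketched as follows. This is an illustrative example only: the function names are hypothetical, and modeling "integrated for each pixel" as stacking the two brightness values of each pixel along a new last axis is an assumption. The label values 2 and 1 follow the numerals mentioned in the present description for the label values 74P and 74N.

```python
import numpy as np

POSITIVE_LABEL = 2  # numeral mentioned in the description for label value 74P
NEGATIVE_LABEL = 1  # numeral mentioned in the description for label value 74N


def integrate_channels(first, second):
    """Combine two single-channel brightness arrays pixel by pixel."""
    first = np.asarray(first)
    second = np.asarray(second)
    if first.shape != second.shape:
        raise ValueError("both channels must cover the same field of view")
    # Assumption: the two brightness values of each pixel are kept side by
    # side along a new last axis, giving an array of shape (height, width, 2).
    return np.stack([first, second], axis=-1)


def label_integrated_data(integrated, positive):
    """Pair integrated training data with the label value of its control cell."""
    label = POSITIVE_LABEL if positive else NEGATIVE_LABEL
    return integrated, label
```

For example, `label_integrated_data(integrate_channels(ch1, ch2), positive=True)` would correspond to labeled positive integrated training data, where `ch1` and `ch2` are hypothetical 100×100 arrays obtained from the first and second channels.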
As shown in
Similarly, from the second negative training image 70NB, it is possible to generate the second negative numerical training data 71NB that numerically indicates the brightness of the captured light at each pixel in the image.
As shown in
Next, the negative integrated training data 72N is labeled with a label value 74N indicating that the negative integrated training data 72N is derived from the first negative control cell, and labeled negative integrated training data 73N are generated. A “1” is attached in
A cell analysis method in which cells flowing through a flow cell 110 are imaged, integrated analysis data 82 are generated from the generated analysis image 80, and a trained first artificial intelligence algorithm 60 is used will be described with reference to
As shown in
Similarly, from the second analysis image 80B, it is possible to generate the second numerical analysis data 81B which numerically indicates the brightness of the captured light in each pixel in the image.
As shown in
As shown in
i. In the present embodiment, the imaging flow cytometer uses an Extended Depth of Field (EDF) filter for expanding the depth of field when imaging cells, and the cell image provided to the examiner is one in which the focal depth has been restored after imaging. However, the training image 70 and the analysis image 80 used in the present embodiment are preferably images captured using the EDF filter that have not been restored. An example of an image that has not been restored is shown in
ii. Out-of-focus images can be excluded from the training image 70 and the analysis image 80 during imaging. Whether an image is in focus can be determined from the difference in brightness between each pixel and its adjacent pixel: if the entire image contains no part where the gradient of this difference changes drastically, the image can be determined to be out of focus.
iii. The training image 70 and the analysis image 80 used in the present embodiment are typically trimmed so that the number of pixels is 100 pixels in the vertical direction and 100 pixels in the horizontal direction, but the size of the image is not limited to this. The number of pixels can be appropriately set from 50 to 500 pixels in the vertical direction and from 50 to 500 pixels in the horizontal direction. The number of pixels in the vertical direction and the number of pixels in the horizontal direction of the image do not necessarily have to be the same. However, a training image 70 for training the first artificial intelligence algorithm 50 and an analysis image 80 for generating integrated analysis data 82 to be input to the first artificial intelligence algorithm 60 trained using the training image 70 preferably have the same number of pixels, with the same number of pixels in each of the vertical direction and the horizontal direction.
iv. In this embodiment, the training image 70 and the analysis image 80 use a 16-bit grayscale image. However, the gradation of brightness may be 8 bits, 32 bits, or the like in addition to 16 bits. Although the numerical values for brightness expressed in 16 bits (65,536 gradations) are used directly in the present embodiment, these numerical values may also be subjected to low-dimensional processing that summarizes them into gradations having a constant width, and the resulting low-dimensional numerical values may be used as the numerical training data 71PA, 71PB, 71NA, 71NB. In this case, it is preferable to perform the same processing on the training image 70 and the analysis image 80.
v. The chromosomal abnormalities that can be detected in this embodiment are not limited to the PML-RARA chimeric gene. For example, BCR/ABL fusion gene, AML1/ETO (MTG8) fusion gene (t (8; 21)), PML/RARα fusion gene (t (15; 17)), AML1 (21q22) translocation, MLL (11q23) translocation, TEL (12p13) translocation, TEL/AML1 fusion gene (t (12; 21)), IgH (14q32) translocation, CCND1 (BCL1)/IgH fusion gene (t (11; 14)), BCL2 (18q21) translocation, IgH/MAF fusion gene (t (14; 16)), IgH/BCL2 fusion gene (t (14; 18)), c-myc/IgH fusion gene (t (8; 14)), FGFR3/IgH fusion gene (t (4; 14)), BCL6 (3q27) translocation, c-myc (8q24) translocation, MALT1 (18q21) translocation, API2/MALT1 fusion gene (t (11; 18) translocation), TCF3/PBX1 fusion gene (t (1; 19) translocation), EWSR1 (22q12) translocation, PDGFRβ (5q32) translocation and the like can be detected.
Also, translocations can include various variations.
FIG. 7 also shows an example of chromosome 8 trisomy. The first fluorescently labeled probe binds, for example, to the centromere on chromosome 8. The positive pattern has three first fluorescent labels. The negative pattern has two first fluorescent labels. The fluorescent labeling pattern is the same for chromosome 12 trisomy. In chromosome 7 monosomy, for example, when a first fluorescently labeled probe that binds to the centromere of chromosome 7 is used, the positive pattern has one first fluorescent label. The negative pattern has two first fluorescent labels.
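The exclusion of out-of-focus images (item ii) and the low-dimensional processing of brightness values (item iv) can be sketched in Python as follows. This is a hedged illustration rather than the processing of the present embodiment: the function names, the use of second-order differences between adjacent pixels as the "change in the gradient of the difference", the threshold value, and the bin width are all assumptions.

```python
import numpy as np


def is_in_focus(image, threshold):
    """Return True if some part of the image shows a drastic change in the
    brightness difference between adjacent pixels."""
    img = np.asarray(image).astype(np.int64)
    # Second-order differences measure how much the adjacent-pixel
    # difference itself changes (an assumed reading of item ii).
    change_rows = np.abs(np.diff(img, n=2, axis=0))
    change_cols = np.abs(np.diff(img, n=2, axis=1))
    strongest = max(change_rows.max(initial=0), change_cols.max(initial=0))
    # The threshold is a hypothetical tuning parameter.
    return bool(strongest >= threshold)


def reduce_gradations(image, width):
    """Summarize brightness values into gradations of constant width."""
    # Example: width=256 maps 16-bit values onto 256 coarser gradations.
    return np.asarray(image) // width
```

If such processing is used, the same `width` would be applied to both the training image 70 and the analysis image 80, in line with item iv.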
The present embodiment relates to a method for training a first artificial intelligence algorithm 63 for detecting peripheral circulating tumor cells and a method for analyzing cells using the first artificial intelligence algorithm 63 for detecting peripheral circulating tumor cells. Here, the term “train” or “training” may be used in place of the term “generate” or “generating”.
The training method of the first artificial intelligence algorithm 53 for detecting peripheral circulating tumor cells will be described with reference to
As shown in
When detecting peripheral circulating tumor cells, the image captured by the imaging unit 160 may include a bright-field image and a fluorescent image. The bright-field image can be an image of the phase difference of the cells. This image can be obtained, for example, on the first channel. The fluorescent image is an image of a fluorescent label attached to a target site in the cell by immunostaining or intracellular organelle staining. Fluorescent labeling is performed with fluorescent dyes that have different fluorescence wavelength regions for each antigen and/or each organelle.
For example, the first antigen can be labeled with a first fluorescent dye that emits fluorescence in a green wavelength region by binding the first fluorescent dye to an antibody that directly or indirectly binds to the first antigen.
The second antigen can be labeled with a second fluorescent dye that emits fluorescence in a red wavelength region different from that of the first fluorescent dye by binding the second fluorescent dye to an antibody that directly or indirectly binds to the second antigen.
The third antigen can be labeled with a third fluorescent dye that emits fluorescence in a yellow wavelength region different from those of the first fluorescent dye and the second fluorescent dye by binding the third fluorescent dye to an antibody that directly or indirectly binds to the third antigen.
In this way, the first fluorescent label to the Xth fluorescent label can be applied using fluorescent dyes with different fluorescence wavelength regions.
A sample containing cells having the first fluorescent label to the Xth fluorescent label can be subjected to imaging with a cell imaging device such as an imaging flow cytometer, and an image of the cells can be obtained. An image taken of a cell may include multiple images for the same field of view of the same cell. Since the first fluorescent label to the Xth fluorescent label use fluorescent dyes having different fluorescence wavelength regions, the filter for transmitting the light emitted from each fluorescent dye is different for each fluorescent dye. The bright-field image requires the use of a filter different from the filters that transmit light from the fluorescent dyes. Therefore, the light transmitted through each filter is taken into the imaging unit 160 (described later) via the corresponding channel, and is captured as a separate image of the same cell in the same field of view. That is, in the imaging unit 160, for the same field of view of the same cell, a number of images equal to the number of labeling substances labeling the cells plus the number of bright-field images is acquired.
The first channel (Ch1) indicates a bright-field image in
As shown in
A method of generating the first positive numerical training data 76P1 will be described with reference to the first positive training image 75P1. Each image captured by the imaging unit 160 is trimmed, for example, to 32 pixels in length×32 pixels in width by the above-mentioned preprocessing to obtain a training image 75. The first positive training image 75P1 is represented, for example, as a 16-bit grayscale image. Therefore, the brightness of each pixel can be indicated by a numerical value in 65,536 gradations from 0 to 65,535. As shown in
Similar to the first positive numerical training data 76P1, the second positive numerical training data 76P2 to the Xth positive numerical training data 76Px, which numerically indicate the brightness of the imaged light for each pixel in the image, can be generated from the second positive training image 75P2 to the Xth positive training image 75Px, respectively.
Next, the first positive numerical training data 76P1 to the Xth positive numerical training data 76Px are integrated for each pixel to generate positive integrated training data 77P. As shown in
Next, the positive integrated training data 77P is labeled with a label value 79P indicating that the positive integrated training data 77P is derived from the second positive control cell, and labeled positive integrated training data 78P are generated. “2” is attached in
From the negative training image 75N, the labeled negative integrated training data 78N are generated in the same manner as in the case of generating the labeled positive integrated training data 78P.
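The generation of labeled integrated training data described above can be sketched as follows. This is a minimal illustration only, assuming that "integrating for each pixel" means collecting the brightness values of all channels at the same pixel position into one tuple, and assuming a brightness range of 0 to 65,535 for 16-bit images. The function names `to_numerical_data`, `integrate_channels`, and `attach_label` are hypothetical and do not appear in the specification; the label values 2 (positive) and 1 (negative) follow the text.

```python
from typing import List, Tuple

Image = List[List[int]]                      # rows of brightness values for one channel
Integrated = List[List[Tuple[int, ...]]]     # per-pixel tuples across channels


def to_numerical_data(image: Image, bits: int = 16) -> Image:
    """Copy an image as numerical data, checking the assumed brightness range."""
    max_value = (1 << bits) - 1
    for row in image:
        for value in row:
            if not 0 <= value <= max_value:
                raise ValueError(f"brightness {value} outside 0..{max_value}")
    return [list(row) for row in image]


def integrate_channels(channels: List[Image]) -> Integrated:
    """Combine same-sized channel data pixel by pixel (assumed meaning of integration)."""
    height, width = len(channels[0]), len(channels[0][0])
    for channel in channels:
        if len(channel) != height or any(len(row) != width for row in channel):
            raise ValueError("all channel images must have the same size")
    return [
        [tuple(channel[y][x] for channel in channels) for x in range(width)]
        for y in range(height)
    ]


def attach_label(integrated: Integrated, label_value: int) -> Tuple[Integrated, int]:
    """Pair integrated data with a label value (2 = positive, 1 = negative in the text)."""
    return integrated, label_value


# Toy 2x2 example with two channels standing in for 32x32 images.
bright_field = to_numerical_data([[1, 2], [3, 4]])
fluorescence = to_numerical_data([[5, 6], [7, 8]])
labeled_positive = attach_label(integrate_channels([bright_field, fluorescence]), 2)
```

In practice the per-pixel tuples would more likely be held as a multi-channel array, but the pairing of each integrated data set with its label value is the same.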
As shown in
Similarly, the second negative numerical training data 76N2 to the Xth negative numerical training data 76Nx, which numerically indicate the brightness of the imaged light for each pixel in the image, can be generated from the second negative training image 75N2 to the Xth negative training image 75Nx, respectively.
As shown in
Next, the negative integrated training data 77N is labeled with a label value 79N indicating that the negative integrated training data 77N is derived from the second negative control cell, and labeled negative integrated training data 78N are generated. “1” is attached in
With these inputs, each weight in the intermediate layer 53c of the neural network is calculated, the first artificial intelligence algorithm 53 is trained, and the trained first artificial intelligence algorithm 63 is generated.
The method of generating the integrated analysis data 87 from the analysis image 85 and the cell analysis method using the trained first artificial intelligence algorithm 63 will be described with reference to
As shown in
Similarly, the second numerical analysis data 86T2 to the Xth numerical analysis data 86Tx, which numerically indicate the brightness of the captured light in each pixel in the image, can be generated from the second analysis image 85T2 to the Xth analysis image 85Tx, respectively.
As shown in
As shown in
i. The training image 75 and the analysis image 85 used in the present embodiment are preferably images captured using the EDF filter that have not been subjected to restoration processing.
ii. Out-of-focus images can be excluded from the training image 75 and the analysis image 85 during imaging.
iii. Although the training image 75 and the analysis image 85 used in the present embodiment are typically trimmed so that the number of pixels is 32 pixels in the vertical direction and 32 pixels in the horizontal direction, the size of the image is not limited insofar as the entire cell is contained in the image. The number of pixels in the vertical direction and the number of pixels in the horizontal direction of the image do not necessarily have to be the same. However, a training image 75 for training the first artificial intelligence algorithm 53 and an analysis image 85 for generating integrated analysis data 87 to be input to the first artificial intelligence algorithm 63 trained using the training image 75 preferably have the same number of pixels in the vertical direction and the horizontal direction.
iv. In this embodiment, the training image 75 and the analysis image 85 are 16-bit grayscale images. However, the gradation of brightness may be 8 bits, 32 bits, or the like instead of 16 bits. Although the brightness values represented by 16 bits (65,536 gradations) are used directly as the numerical training data 76P1 to 76Px and the numerical training data 76N1 to 76Nx in the present embodiment, these values may instead be subjected to dimension-reduction processing that groups them into gradations of a certain width, and the values after this processing may be used as the numerical training data 76P1 to 76Px and 76N1 to 76Nx. In this case, it is preferable to perform the same processing on the training image 75 and the analysis image 85.
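The gradation-summarizing processing mentioned in item iv can be illustrated as follows. This is only a sketch under the assumption that the processing is a simple binning of brightness values by integer division; the specification does not state the exact method, and the names `reduce_gradation` and `reduce_image` as well as the bin width of 256 are hypothetical.

```python
from typing import List


def reduce_gradation(value: int, bin_width: int, bits: int = 16) -> int:
    """Map a brightness value to the index of its bin of the given width."""
    max_value = (1 << bits) - 1
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    if not 0 <= value <= max_value:
        raise ValueError(f"brightness {value} outside 0..{max_value}")
    return value // bin_width


def reduce_image(image: List[List[int]], bin_width: int) -> List[List[int]]:
    """Apply the same binning to every pixel of an image."""
    return [[reduce_gradation(value, bin_width) for value in row] for row in image]


# A bin width of 256 reduces 65,536 gradations to 256 gradations.
reduced = reduce_image([[0, 255, 256], [512, 65535, 1000]], bin_width=256)
```

Whatever binning is chosen, the same function and bin width would be applied to both the training images and the analysis images, in line with the preference stated above.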
The training method of the second artificial intelligence algorithm 94 and the cell analysis method using the trained second artificial intelligence algorithm 97 will be described with reference to
In this embodiment, examples of the algorithms that can be used as the second artificial intelligence algorithms 94 and 97 include random forest, gradient boosting, support vector machine (SVM), relevance vector machine (RVM), naive Bayes, logistic regression, feedforward neural network, deep learning, K-nearest neighbor method, AdaBoost, bagging, C4.5, kernel approximation, stochastic gradient descent (SGD) classifier, Lasso, ridge regression, Elastic Net, SGD regression, kernel regression, Lowess regression, matrix factorization, non-negative matrix factorization, kernel matrix factorization, interpolation method, kernel smoother, collaborative filtering and the like. The second artificial intelligence algorithms 94 and 97 are preferably random forest or gradient boosting.
As the second artificial intelligence algorithms 94 and 97, for example, implementations provided by Python libraries such as scikit-learn can be used.
Here, the term “train” or “training” may be used in place of the term “generate” or “generating”.
As shown in
When detecting peripheral circulating tumor cells, the image captured by the imaging unit 160 (described later), which is used as the training image 90, may be a bright-field image and/or a fluorescent image as in section 2-2 above. The bright-field image can be an image of the phase difference of the cells. The training image 90 can be acquired in the same manner as in section 2-2(1) above.
As the training data 91, for example, the feature amount shown in
As shown in
Although only the bright field image is shown in the example of
Here, the labeled positive training data 92A and the labeled negative training data 92B are also collectively referred to as training data 92.
The second artificial intelligence algorithm 94 is trained with the training data 92, and the trained second artificial intelligence algorithm 97 is generated.
As data indicating normality or abnormality of cells, data 98 indicating whether the cells to be analyzed are peripheral circulating tumor cells are generated by inputting the analysis data 96 into the trained second artificial intelligence algorithm 97. For example, “1” is output as a label value when it is determined that the cell to be analyzed is not a peripheral circulating tumor cell, and “2” is output as a label value when it is determined that the cell is a peripheral circulating tumor cell. Instead of the label value, labels such as “none”, “yes”, “normal”, and “abnormal” also may be output.
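The handling of the output label values can be sketched as follows. This is a minimal illustration, assuming that label value 1 is displayed as "normal" and label value 2 as "abnormal"; the names `LABEL_TEXT`, `describe_label`, and `summarize_labels` are hypothetical and are not taken from the specification.

```python
from collections import Counter
from typing import Dict, Iterable

# Label values follow the text: 1 = not a peripheral circulating tumor cell, 2 = one.
LABEL_TEXT: Dict[int, str] = {1: "normal", 2: "abnormal"}


def describe_label(label_value: int) -> str:
    """Convert a numeric label value into a text label."""
    try:
        return LABEL_TEXT[label_value]
    except KeyError:
        raise ValueError(f"unknown label value: {label_value}") from None


def summarize_labels(label_values: Iterable[int]) -> Dict[str, int]:
    """Count how many analyzed cells received each text label."""
    counts = Counter(describe_label(value) for value in label_values)
    return {text: counts.get(text, 0) for text in LABEL_TEXT.values()}


summary = summarize_labels([1, 1, 2, 1])
```

A summary of this kind gives the number of abnormal cells relative to the number of cells analyzed, which is the quantity of interest when abnormal cells are rare in the sample.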
Hereinafter, the cell analysis systems 1000, 2000, and 3000 according to the first to third embodiments will be described with reference to
The hardware structure of the training device 200A will be described with reference to
The control unit 20A includes a CPU (Central Processing Unit) 21 that performs data processing described later, a memory 22 used as a work area for data processing, a storage unit 23 that records a program and processing data described later, a bus 24 for transmitting data among the units, an interface (I/F) unit 25 for inputting/outputting data to/from an external device, and a GPU (Graphics Processing Unit) 29. The input unit 26 and the output unit 27 are connected to the control unit 20A via the I/F unit 25. Illustratively, the input unit 26 is an input device such as a keyboard or a mouse, and the output unit 27 is a display device such as a liquid crystal display. The GPU 29 functions as an accelerator that assists in arithmetic processing (for example, parallel arithmetic processing) performed by the CPU 21. In the following description, processing performed by the CPU 21 also includes processing performed by the CPU 21 using the GPU 29 as an accelerator. Here, instead of the GPU 29, a chip suited to neural network calculation may be provided. Examples of such a chip include an FPGA (Field-Programmable Gate Array), an ASIC (Application Specific Integrated Circuit), Myriad X (Intel), and the like.
The control unit 20A stores, in advance and in an executable format, a training program for training the artificial intelligence algorithm and the artificial intelligence algorithm itself in the storage unit 23, for example, in order to perform the processing of each step described with reference to
In the following description, unless otherwise specified, the processing performed by the control unit 20A means the processing performed by the CPU 21 or the CPU 21 and the GPU 29 based on the program and the artificial intelligence algorithm stored in the storage unit 23 or the memory 22. The CPU 21 temporarily stores necessary data (intermediate data during processing and the like) using the memory 22 as a work area, and appropriately records data to be stored for a long period of time, such as a calculation result, in the storage unit 23.
The training images 70PA, 70PB, 70NA, 70NB, 75P1 to 75Px, 75N1 to 75Nx, 90A, and 90B are acquired beforehand from the cell imaging device 100A by the cell analysis device 400A, and prestored in the storage unit 23 or the memory 22 of the control unit 20A of the training device 200A. The training device 200A also may acquire the training images 70PA, 70PB, 70NA, 70NB, 75P1 to 75Px, 75N1 to 75Nx, 90A, and 90B from the cell analysis device 400A via the network 99 or the media drive D98. The training data database (DB) 204 stores the generated training data 73, 78, 92. The pre-training artificial intelligence algorithm is pre-stored in the algorithm database 205. The trained first artificial intelligence algorithm 60 can be recorded in the algorithm database 205 in association with the test items and analysis items for testing for chromosomal abnormalities. The trained first artificial intelligence algorithm 63 can be recorded in the algorithm database 205 in association with the test items and analysis items for testing peripheral circulating tumor cells. The trained second artificial intelligence algorithm 97 can be recorded in the algorithm database 205 in association with the feature quantity item to be input.
The control unit 20A of the training device 200A performs the training process shown in
First, in response to a request from the user to start processing, the CPU 21 of the control unit 20A acquires, from the storage unit 23 or the memory 22, the training images 70PA, 70PB, 70NA, and 70NB; the training images 75P1 to 75Px and 75N1 to 75Nx; or the training images 90A and 90B. The training images 70PA, 70PB, 70NA, and 70NB are used to train the first artificial intelligence algorithm 50; the training images 75P1 to 75Px and 75N1 to 75Nx are used to train the first artificial intelligence algorithm 53; and the training images 90A and 90B are used to train the second artificial intelligence algorithm 94.
In step S11 of
Next, the control unit 20A inputs the generated labeled positive integrated training data 73P and the labeled negative integrated training data 73N into the first artificial intelligence algorithm 50 in step S12 of
Subsequently, in step S13 of
When the training results are accumulated for a predetermined number of trials, in step S14, the control unit 20A updates the weighting (w) (coupling weight) of the first artificial intelligence algorithm 50 using the training results accumulated in step S12.
Next, in step S15, the control unit 20A determines whether the first artificial intelligence algorithm 50 has been trained with a predetermined number of labeled positive integrated training data 73P and labeled negative integrated training data 73N. When the training is performed with the specified number of labeled positive integrated training data 73P and the labeled negative integrated training data 73N (in the case of “YES”), the training process is terminated. The control unit 20A stores the trained first artificial intelligence algorithm 60 in the storage unit 23.
When the first artificial intelligence algorithm 50 is not trained with the specified number of labeled positive integrated training data 73P and the labeled negative integrated training data 73N (in the case of “NO”), the control unit 20A advances from step S15 to step S16 and the processes from step S11 to step S15 are performed on the next positive training images 70PA and 70PB and the negative training images 70NA and 70NB.
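The control flow of steps S11 to S16 can be sketched as follows. This is a simplified illustration only: the class `ToyAlgorithm` is a hypothetical stand-in that counts operations rather than training a neural network, the function name `run_training` and its parameters are assumptions, and the accumulation of training results is assumed to happen as each labeled data item is input.

```python
from typing import Iterable, List, Tuple

LabeledData = Tuple[list, int]  # (integrated training data, label value)


class ToyAlgorithm:
    """Hypothetical stand-in for the untrained algorithm; not a real neural network."""

    def __init__(self) -> None:
        self.pending: List[LabeledData] = []  # results accumulated since the last update
        self.trained_count = 0                # labeled data items used so far
        self.weight_updates = 0               # number of weight updates performed

    def accumulate(self, item: LabeledData) -> None:
        self.pending.append(item)
        self.trained_count += 1

    def update_weights(self) -> None:
        # A real implementation would adjust the coupling weights w here.
        self.pending.clear()
        self.weight_updates += 1


def run_training(
    algorithm: ToyAlgorithm,
    labeled_items: Iterable[LabeledData],
    trials_per_update: int,
    required_count: int,
) -> ToyAlgorithm:
    """Loop over labeled data until the required number of items has been used."""
    for item in labeled_items:                           # S11 / S16: next labeled data
        algorithm.accumulate(item)                       # S12, S13: input and accumulate
        if len(algorithm.pending) >= trials_per_update:  # S14: update after enough trials
            algorithm.update_weights()
        if algorithm.trained_count >= required_count:    # S15: "YES" ends the training
            break
    return algorithm


items = [([[0]], 1 + i % 2) for i in range(10)]
trained = run_training(ToyAlgorithm(), items, trials_per_update=2, required_count=6)
```

The same loop structure applies to the first artificial intelligence algorithm 53, with the labeled integrated training data 78P and 78N in place of 73P and 73N.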
In step S11 of
Next, the control unit 20A inputs the generated labeled positive integrated training data 78P and the labeled negative integrated training data 78N into the first artificial intelligence algorithm 53 in step S12 of
Subsequently, in step S13 of
When the training results are accumulated for a predetermined number of trials, in step S14, the control unit 20A uses the training results accumulated in step S12 to update the weight w (coupling weight) of the first artificial intelligence algorithm 53.
Next, in step S15, the control unit 20A determines whether the first artificial intelligence algorithm 53 has been trained with a predetermined number of labeled positive integrated training data 78P and labeled negative integrated training data 78N. When training is performed with the specified number of labeled positive integrated training data 78P and labeled negative integrated training data 78N (in the case of “YES”), the training process is completed. The control unit 20A stores the trained first artificial intelligence algorithm 63 in the storage unit 23.
When the first artificial intelligence algorithm 53 is not trained with the specified number of labeled positive integrated training data 78P and the labeled negative integrated training data 78N (in the case of “NO”), the control unit 20A advances from step S15 to step S16, and the processes from step S11 to step S15 are performed on the next positive training images 75P1 to 75Px and the negative training images 75N1 to 75Nx.
In step S111 of
Next, the control unit 20A inputs the generated labeled positive training data 92A and labeled negative training data 92B into the second artificial intelligence algorithm 94 in step S112 of
Next, in step S113, the control unit 20A determines whether the second artificial intelligence algorithm 94 has been trained with a predetermined number of labeled positive training data 92A and labeled negative training data 92B. When training is performed with the specified number of labeled positive training data 92A and labeled negative training data 92B (in the case of “YES”), the training process is completed. The control unit 20A stores the trained second artificial intelligence algorithm 97 in the storage unit 23.
When the second artificial intelligence algorithm 94 is not trained with the specified number of labeled positive training data 92A and the labeled negative training data 92B (in the case of “NO”), the control unit 20A advances from step S113 to step S114, and performs the processes from step S111 to step S113 on the next positive training image 90A and negative training image 90B.
The present embodiment includes a computer program for training an artificial intelligence algorithm that causes a computer to execute the processes of steps S11 to S16 or S111 to S114.
An implementation of the present embodiment relates to a program product such as a storage medium that stores the computer program. That is, the computer program can be stored on a hard disk, a semiconductor memory element such as a flash memory, or a storage medium such as an optical disk. The recording format of the program on the storage medium is not limited insofar as the training device 200A can read the program. Recording on the storage medium is preferably non-volatile.
Here, the “program” is a concept including not only a program that can be directly executed by the CPU, but also a source format program, a compressed program, an encrypted program, and the like.
For example, as described above, one or more fluorescent dyes are used to detect the target site when detecting chromosomal abnormalities or peripheral circulating tumor cells. Preferably, the FISH method uses two or more fluorescent dyes to detect a target site on a first chromosome and a target site on a second chromosome (“first” and “second” here are ordinal designations and do not denote chromosome numbers). For example, in the probe that hybridizes with the PML locus, a nucleic acid having a sequence complementary to the base sequence of the PML locus is labeled with a first fluorescent dye that produces a first fluorescence of wavelength λ21 when irradiated with light of wavelength λ11. With this probe, the PML locus is labeled with the first fluorescent dye. In the probe that hybridizes with the RARA locus, a nucleic acid having a sequence complementary to the base sequence of the RARA locus is labeled with a second fluorescent dye that produces a second fluorescence of wavelength λ22 when irradiated with light of wavelength λ12. With this probe, the RARA locus is labeled with the second fluorescent dye. The nucleus is stained with a nuclear staining dye that produces a third fluorescence of wavelength λ23 when irradiated with light of wavelength λ13. The light of wavelengths λ11, λ12, and λ13 is so-called excitation light. The light of wavelength λ14 is light emitted from a halogen lamp or the like for bright-field observation.
The cell imaging device 100A includes a flow cell 110, light sources 120 to 123, condenser lenses 130 to 133, dichroic mirrors 140 and 141, a condenser lens 150, an optical unit 151, a condenser lens 152, and an imaging unit 160. The sample 10 flows through the flow path 111 of the flow cell 110.
The light sources 120 to 123 irradiate light on the sample 10 flowing from the bottom to the top of the flow cell 110. The light sources 120 to 123 are composed of, for example, a semiconductor laser light source. Lights having wavelengths λ11 to λ14 are emitted from the light sources 120 to 123, respectively.
The condenser lenses 130 to 133 collect light having wavelengths λ11 to λ14 emitted from the light sources 120 to 123, respectively. The dichroic mirror 140 transmits light having a wavelength of λ11 and reflects light having a wavelength of λ12. The dichroic mirror 141 transmits light having wavelengths λ11 and λ12 and reflects light having a wavelength of λ13. In this way, light having wavelengths λ11 to λ14 is applied to the sample 10 flowing through the flow path 111 of the flow cell 110. The number of semiconductor laser light sources included in the cell imaging device 100A is not limited insofar as it is 1 or more. The number of semiconductor laser light sources can be selected from, for example, 1, 2, 3, 4, 5 or 6.
When the sample 10 flowing through the flow cell 110 is irradiated with light having wavelengths λ11 to λ13, fluorescence is generated from the fluorescent dyes labeling the cells flowing through the flow path 111. Specifically, when the first fluorescent dye that labels the PML locus is irradiated with light of wavelength λ11, the first fluorescent dye produces a first fluorescence of wavelength λ21. When the second fluorescent dye that labels the RARA locus is irradiated with light of wavelength λ12, the second fluorescent dye produces a second fluorescence of wavelength λ22. When the nuclear staining dye that stains the nucleus is irradiated with light of wavelength λ13, the nuclear staining dye produces a third fluorescence of wavelength λ23. When the sample 10 flowing through the flow cell 110 is irradiated with light having a wavelength of λ14, this light passes through the cells. The light of wavelength λ14 transmitted through the cells is used to generate a bright-field image. For example, in the embodiment, the first fluorescence is in the wavelength region of green light, the second fluorescence is in the wavelength region of red light, and the third fluorescence is in the wavelength region of blue light.
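The correspondence between irradiated light, detected light, and target described above can be summarized as a small lookup structure. The following is an illustrative sketch only; the class `OpticalChannel`, the tuple `CHANNELS`, and the function `channel_for` are hypothetical names, and the wavelengths are kept as symbolic strings because the specification gives no numerical values.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class OpticalChannel:
    target: str            # what the light reveals
    illumination: str      # symbolic wavelength of the irradiated light
    detected: str          # symbolic wavelength of the detected light
    color: Optional[str]   # color region of the fluorescence, if stated in the text


CHANNELS: Tuple[OpticalChannel, ...] = (
    OpticalChannel("PML locus", "lambda11", "lambda21", "green"),
    OpticalChannel("RARA locus", "lambda12", "lambda22", "red"),
    OpticalChannel("nucleus", "lambda13", "lambda23", "blue"),
    # Bright field uses transmitted light, so the detected wavelength is unchanged.
    OpticalChannel("bright field", "lambda14", "lambda14", None),
)


def channel_for(target: str) -> OpticalChannel:
    """Look up the optical channel used for a given target."""
    for channel in CHANNELS:
        if channel.target == target:
            return channel
    raise KeyError(target)


rara = channel_for("RARA locus")
```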
The condenser lens 150 collects the first fluorescence to the third fluorescence generated from the sample 10 flowing through the flow path 111 of the flow cell 110 and the transmitted light transmitted through the sample 10 flowing through the flow path 111 of the flow cell 110. The optical unit 151 has a configuration in which four dichroic mirrors are combined. The four dichroic mirrors of the optical unit 151 reflect the first fluorescence to the third fluorescence and the transmitted light at slightly different angles, and separate them on the light receiving surface of the imaging unit 160. The condenser lens 152 collects the first fluorescence to the third fluorescence and the transmitted light.
The imaging unit 160 is configured by a TDI (Time Delay Integration) camera. The imaging unit 160 captures the first fluorescence to the third fluorescence and the transmitted light to obtain fluorescence images corresponding to the first fluorescence to the third fluorescence and a bright-field image corresponding to the transmitted light, which are output as imaging signals to the cell analysis device 400A. The image to be captured may be a color image or a grayscale image.
The cell imaging device 100A also may be provided with a pretreatment device 300 as necessary. The pretreatment device 300 samples a part of the sample and performs FISH, immunostaining, intracellular organelle staining, or the like on the cells contained in the sample to prepare the sample 10.
The hardware structure of the cell analyzer 400A will be described with reference to
The structure of the control unit 40A is the same as the structure of the control unit 20A of the training device 200A. Here, the CPU 21, the memory 22, the storage unit 23, the bus 24, the I/F unit 25, and the GPU 29 in the control unit 20A of the training device 200A are replaced with the CPU 41, the memory 42, the storage unit 43, the bus 44, the I/F unit 45, and the GPU 49, respectively. However, the storage unit 43 stores the trained artificial intelligence algorithms 60, 63, and 97 generated by the training device 200A and acquired by the CPU 41 from the I/F unit 45 via the network 99 or the media drive D98.
The analysis images 80, 85, and 95 can be acquired by the cell imaging device 100A and stored in the storage unit 43 or the memory 42 of the control unit 40A of the cell analysis device 400A.
The trained first artificial intelligence algorithm 60 can be recorded in the algorithm database 405 in association with the test items and analysis items for testing for chromosomal abnormalities. The trained first artificial intelligence algorithm 63 can be recorded in the algorithm database 405 in association with the test items and analysis items for testing peripheral circulating tumor cells. The trained second artificial intelligence algorithm 97 can be recorded in the algorithm database 405 in association with the feature quantity item to be input.
The control unit 40A of the cell analysis device 400A performs the cell analysis process shown in
The CPU 41 of the control unit 40A starts the cell analysis process according to a request from the user to start the process or when the cell imaging device 100A starts the analysis.
The control unit 40A generates integrated analysis data 82 from the analysis images 80A and 80B in step S21 shown in
In step S22 shown in
In step S23 shown in
The control unit 40A determines whether all the analysis images 80A and 80B have been determined in step S24 shown in
The control unit 40A generates integrated analysis data 87 from the analysis images 85T1 to 85Tx in step S21 shown in
In step S22 shown in
In step S23 shown in
The control unit 40A determines whether all the analysis images 85T1 to 85Tx have been determined in step S24 shown in
The control unit 40A generates integrated analysis data 96 from the analysis image 95 in step S21 shown in
In step S22 shown in
In step S23 shown in
The control unit 40A determines whether all the analysis images 95 have been determined in step S24 shown in
The present embodiment includes a computer program for performing cell analysis that causes a computer to perform the processes of steps S21 to S26.
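The overall flow of the cell analysis process can be sketched as follows. This is a hedged illustration based only on the parts of the process stated in the text: analysis data are generated from the analysis images (step S21), the data are input to the trained algorithm to obtain a label value, and the process repeats until all analysis images have been determined (step S24). The function names `flatten_images`, `toy_threshold_algorithm`, and `analyze_cells` are hypothetical, and the threshold rule merely stands in for a trained artificial intelligence algorithm.

```python
from typing import Callable, Iterable, List

Image = List[List[int]]


def flatten_images(images: List[Image]) -> List[int]:
    """Hypothetical analysis-data generator: list all pixel values of all images."""
    return [value for image in images for row in image for value in row]


def toy_threshold_algorithm(data: List[int]) -> int:
    """Stand-in for a trained algorithm: label 2 if any pixel is very bright, else 1."""
    return 2 if max(data) > 1000 else 1


def analyze_cells(
    image_sets: Iterable[List[Image]],
    make_analysis_data: Callable[[List[Image]], List[int]],
    trained_algorithm: Callable[[List[int]], int],
) -> List[int]:
    """Generate analysis data for each cell and collect the output label values."""
    label_values = []
    for images in image_sets:                      # repeat until all images are determined
        data = make_analysis_data(images)          # generate analysis data (S21)
        label_values.append(trained_algorithm(data))  # input to the trained algorithm
    return label_values


cells = [[[[10, 20]]], [[[10, 5000]]]]
labels = analyze_cells(cells, flatten_images, toy_threshold_algorithm)
```

Passing the data generator and the trained algorithm as arguments mirrors the way the same step sequence is reused for the algorithms 60, 63, and 97 with different analysis data.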
An implementation of the present embodiment relates to a program product such as a storage medium that stores the computer program. That is, the computer program can be stored on a hard disk, a semiconductor memory element such as a flash memory, or a storage medium such as an optical disk. The recording format of the program on the storage medium is not limited insofar as the cell analysis device 400A can read the program. Recording on the storage medium is preferably non-volatile.
Here, the “program” is a concept including not only a program that can be directly executed by the CPU, but also a source format program, a compressed program, an encrypted program, and the like.
As shown in
The hardware structure of the training/analysis device 200B is the same as that of the cell analysis device 400A shown in
The training process and the cell analysis process are described in section 4-1 above which is incorporated herein by reference. However, various data generated in the process are stored in the storage unit 23 or the memory 22 of the training/analysis device 200B.
As shown in
The hardware structures of the training device 200C and the image acquisition device 400B are the same as that of the cell analysis device 400A shown in
The training process and the cell analysis process are described in section 4-1 above which is incorporated herein by reference. However, various data generated in the process are stored in the storage unit 23 or the memory 22 of the training device 200C.
The present invention shall not be construed as being limited to the embodiments described above.
For example, although a plurality of different images of the same cell in the same field are used in the generation of training data and analysis data in the above-described embodiment, one training datum may be generated from one cell image, and one analysis datum may be generated from one cell image.
Although analysis data are generated from a plurality of images obtained by capturing images of different light wavelength regions of the same field of view of one cell in the above-described embodiment, one cell may be imaged multiple times to obtain a plurality of images by another method. For example, analysis data may be generated from a plurality of images obtained by imaging one cell from different angles, or from a plurality of images obtained by imaging one cell at staggered timings.
In the above-described embodiment, the normality or abnormality of the cell is determined, but the cell type and the cell morphology also may be determined.
Examples will be used to describe embodiments in more detail. However, the present invention shall not be construed as being limited to the examples.
The breast cancer cell line MCF7 and peripheral blood mononuclear cells (PBMC) were used as model samples of CTC and blood cells, respectively. The cells were stained with Hoechst reagent and then measured with an imaging flow cytometer (ImageStream Mark II, Luminex) to obtain bright-field images and nuclear-stained images. The imaging flow cytometer conditions were a magnification of 40 times, a flow velocity of Medium, and use of the EDF filter.
Python 3.7.3 and TensorFlow 2.0 alpha (Keras) were used as the language and library. A convolutional neural network (CNN) was used as the artificial intelligence algorithm.
Details of the data set are shown in
Two classes, MCF7 and PBMC, were discriminated. First, a discriminant model was created using the training data set.
Python 3.7.3 and scikit-learn were used as the language and library. Random forest and gradient boosting were used as the artificial intelligence algorithms.
For each of the bright-field image and the nuclear-stained image of the data set shown in
Two classes, MCF7 and PBMC, were discriminated. A discriminant model was created by random forest and gradient boosting using the above dataset. The correct answer rate when each model is used is shown in
Python 3.7.3 and TensorFlow 2.0 alpha were used as the language and library. A convolutional neural network (CNN) was used as the artificial intelligence algorithm. The training was conducted up to 50 times.
PML-RARA chimeric gene-positive cells were measured with the imaging flow cytometer MI-1000 to acquire images of channel 2 (green) and channel 4 (red). The images were taken at a magnification of 60 times with an EDF filter.
Negative integrated training data were generated from the image sets of channel 2 (green) and channel 4 (red) of negative control cells determined by known methods to be free of chromosomal abnormalities (G2R2F0), according to the analysis method for chromosomally abnormal cells using the first artificial intelligence algorithm 60 described in the specification. The negative integrated training data were labeled with a “nega label” indicating that the chromosomal abnormality was negative, and labeled negative integrated training data were generated. Similarly, positive integrated training data were generated from the channel 2 and channel 4 image sets of positive control cells determined by known methods to have chromosomal abnormalities (G3R3F2). The positive integrated training data were labeled with a “posi label” indicating that the chromosomal abnormality was positive, and labeled positive integrated training data were generated. Here, for example, “G” and “R” of G2R2F0 denote the green and red channels, respectively, and “F” denotes a fusion signal. The numbers indicate the number of signals in one cell.
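The signal pattern notation used above (for example, G2R2F0 and G3R3F2) can be decoded as in the following sketch. The function names `parse_signal_pattern` and `has_fusion_signal` are hypothetical and are given only to illustrate how the letters and numbers of the notation relate; they are not part of the described method.

```python
import re
from typing import Dict

# One letter (G, R, or F) followed by the number of signals of that kind in one cell.
_TOKEN = re.compile(r"([GRF])(\d+)")


def parse_signal_pattern(code: str) -> Dict[str, int]:
    """Split a pattern such as 'G3R3F2' into signal counts per letter."""
    tokens = _TOKEN.findall(code)
    # Reject codes containing anything other than letter-number tokens.
    if not tokens or "".join(letter + count for letter, count in tokens) != code:
        raise ValueError(f"unrecognized signal pattern: {code!r}")
    return {letter: int(count) for letter, count in tokens}


def has_fusion_signal(code: str) -> bool:
    """Return True when the pattern reports at least one fusion signal."""
    return parse_signal_pattern(code).get("F", 0) > 0


negative_pattern = parse_signal_pattern("G2R2F0")
positive_pattern = parse_signal_pattern("G3R3F2")
```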
A total of 3741 sets of labeled negative integrated training data and 2052 sets of labeled positive integrated training data were prepared, of which 3475 sets (60%) were used as training data, 1737 sets (30%) as test data, and 581 sets (10%) as validation data.
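The data split stated above can be checked with simple arithmetic. The following sketch uses only the counts given in the text; the variable names are hypothetical.

```python
# Counts taken from the text.
negative_sets = 3741
positive_sets = 2052
split = {"training": 3475, "test": 1737, "validation": 581}

total_sets = negative_sets + positive_sets

# Fraction of all prepared sets assigned to each purpose.
fractions = {name: count / total_sets for name, count in split.items()}

for name, fraction in fractions.items():
    print(f"{name}: {split[name]} sets ({fraction:.1%} of {total_sets})")
```

The three subsets add up to the total number of prepared sets, and the fractions are approximately 60%, 30%, and 10%.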
The correct answer rate was 100%. In addition,
Python 3.7.3 and TensorFlow 2.0 alpha were used as the language and library. A convolutional neural network (CNN) was used as the artificial intelligence algorithm. The training was conducted up to 100 times.
Three PML-RARA chimeric gene-positive samples (sample IDs: 03-532, 04-785, 11-563) were measured with the imaging flow cytometer MI-1000 to acquire images of channel 2 (green) and channel 4 (red). The images were taken at a magnification of 60 times with an EDF filter. Negative integrated training data were generated from the channel 2 and channel 4 image sets of cells determined by known methods to be free of chromosomal abnormalities (G2R2F0), according to the method described in the specification. The negative integrated training data were labeled with a “nega label” indicating that the chromosomal abnormality was negative, and labeled negative integrated training data were generated. Similarly, positive integrated training data were generated from the channel 2 and channel 4 image sets of cells determined by known methods to have chromosomal abnormalities (G3R3F2). The positive integrated training data were labeled with a “posi label” indicating that the chromosomal abnormality was positive, and labeled positive integrated training data were generated. Here, for example, “G” and “R” of G2R2F0 denote the green and red channels, respectively, and “F” denotes a fusion signal. The numbers indicate the number of signals in one cell.
Using the images of these samples, we attempted to detect the PML-RARA chimeric gene by a deep learning algorithm. The number of training data was 20537 and the number of analysis data was 5867.
The determination results for each sample are shown in
Number | Date | Country | Kind |
---|---|---|---|
2019-217159 | Nov 2019 | JP | national |