The present invention relates to a method for determining whether a cell shown in a nuclear fluorescence image acquired through a confocal microscope is a diseased cell, in particular a tumorous cell.
In particular, said method is designed to automatically identify any diseased cells (which may also be tumorous cells) starting from an image which shows the nuclei of respective cells, wherein said nuclei have been marked with a fluorescence technique and said image has been acquired through a confocal microscope.
More specifically, the method is designed to determine whether a cell of interest is diseased or healthy on the basis of results obtained by applying a plurality of statistical functions chosen to characterize the texture and preferably also the size and morphology of the nucleus of a cell of interest, wherein said statistical functions are calculated starting from a Co-occurrence matrix which contains information on the nucleus of the cell of interest in terms of texture, size and morphology.
The expression “nuclear fluorescence image” means an image containing a plurality of cells, wherein the nuclei of said cells have been marked with a fluorescence technique.
In particular, the fluorescence is obtained through a DNA intercalating agent.
In the following the description will be directed to a method for determining whether a cell of a liver tissue is a tumorous cell.
However, the method is not to be considered limited to this specific use.
In fact, the same method can be used to determine whether a cell in any body tissue is a diseased cell (and not necessarily a tumorous cell).
Data reported by the World Health Organization (WHO) shows that cancer is the second leading cause of death globally, with approximately 9.6 million deaths in 2018.
Among the most common malignant tumours are those of the lung, prostate, colon and rectum, stomach and liver in men, while breast, lung, cervical, thyroid and colorectal cancers are the most common among women.
Despite advances in research and technology in recent decades, the fight against cancer is not over.
Fortunately, mortality and morbidity from different types of cancer have significantly decreased over the past two decades.
However, therapy resistance problems, progression and recurrence of a tumour still plague many cancer survivors.
For this reason, an early diagnosis aimed at identifying the type and stage of a tumour is a fundamental element in the fight against cancer.
The current approach for diagnosing a tumour is based on a pathological analysis of the tumour and its characteristics.
The histopathological visualization phase or the morphometric analysis phase is performed by a pathologist and represents a key element in the pathological labelling of a tumour such as carcinoma, sarcoma or melanoma and is often the basis for the choice of treatment to be followed.
A disadvantage of this approach is that the morphometric analysis of a cell is often subjective and depends on the interpretation of the pathologist, since the tissue microenvironment can be highly heterogeneous.
In addition, a detailed morphometric analysis is required for the early identification of abnormal cells that may represent the start or trigger of the metastatic phase.
A further disadvantage is that a detailed morphometric analysis takes time and is subject to false positives and/or false negatives.
This is mainly due to the difficulty of identifying a small number of abnormal cells in a heterogeneous population of normal cells, such as in fine needle biopsies or in blood stains.
Currently, various technical solutions have been developed to reduce the subjectivity margin and the diagnosis time, as well as to increase the accuracy of the diagnosis.
Such solutions involve the use of computational methods that help the pathologist in the diagnosis.
Morphometric information of cell nuclei is one of the main clinical diagnostic approaches used by pathologists to determine the malignant potential of an abnormal cell.
The nucleus, in fact, reflects the potential and biological activity of a cell.
Normal healthy cells usually have a single nucleus with a rounded or oval shape and a uniform chromatin distribution, as well as a normal edge, one or two inconspicuous nucleoli and normal mitotic figures.
During the development of cancer, the nucleus of a cell undergoes numerous alterations in terms of number, shape, size, chromatin distribution (pattern and organization), as well as in terms of the nuclear membrane and nucleoli.
Machine learning techniques (such as deep learning) applied to the image of a cell nucleus make it possible to classify healthy and diseased cells with high precision on the basis of nuclear morphology [1].
Given the prominent role of changes of the nuclear structure in diseased cells, several machine learning techniques have been developed based on quantitative information about the size and shape of a cell nucleus, as well as the nucleus-cytoplasm relationship and chromatin consistency.
In this regard, a recent publication [2] shows how, by means of a deep learning technique, it is possible to correlate changes in heterochromatin with the ratios of euchromatin in normal and cancerous cell lines, so as to recognize any cancer cells in the case of breast cancer.
Methods using machine learning techniques involve dividing individual tissue images into areas with a predetermined number of pixels.
There are several studies and algorithms that have been implemented for the medical diagnosis of hepatocellular cancer and use the machine learning technique or the deep learning technique.
Most of these algorithms are applied to images obtained through Computed Axial Tomography (CT), Nuclear Magnetic Resonance (MRI) or ultrasound scanners.
The deep learning technique often involves a segmentation phase and the use of a convolutional neural network (CNN).
However, a disadvantage of this technique is that a large amount of data (Big Data) is required to train the neural network.
There are some examples of automated diagnostic methods for hepatocarcinoma using a convolutional neural network [3-4].
However, these methods are based on the analysis of diagnostic images in resonance and ultrasound.
Consequently, the quality of these images depends on the operator.
Another method uses the deep learning technique to understand how sick a cell is compared to other diseased cells to determine the severity of a tumour [5].
However, this method is not capable of distinguishing a healthy cell from a diseased cell.
A further disadvantage of this method is that the segmentation step of an image of a cell is coarse, as background portions are taken together with the cell.
A method for classifying auto-antibodies is disclosed in a study titled “HEp-2 Cell Classification using Multilevel Wavelet Decomposition” by Katyal et al.
The analysis of anti-nuclear antibodies in HEp-2 cells by Indirect Immunofluorescence (IIF) is considered a powerful and sensitive test for the analysis of auto-antibodies in autoimmune diseases.
The aim is to explore the use of the analysis of texture for automated categorization of auto-antibodies into one of the six categories of immunofluorescent staining which are frequently used in the daily diagnostic practice: centromere, nucleolar, homogeneous, fine speckled, coarse speckled, cytoplasmic.
The images of HEp-2 cells are acquired by a fluorescence microscope coupled with a 50W mercury vapour lamp and with a digital camera.
The data-set consists of 14 immunofluorescence images based on HEp-2 substrate contributing to a total of 721 cells.
The images are first manually segmented by cropping the cell shown in color, and the method consists of two main steps: extracting the characteristics of the cells by using a two-dimensional wavelet decomposition and classifying the cells by using a neural network.
The two-dimensional wavelet decomposition is a wavelet decomposition performed on an image in grey scale of each of 721 cell images.
Consequently, after segmentation and before the wavelet decomposition, each image is transformed in an image in grey scale.
In particular, the extraction process of cell characteristics involves the repeated application of a Wavelet transform as shown in
A first Wavelet transform is applied to an image in grey scale and a first group of images is generated from said initial image in grey scale.
In the Wavelet field, the images of said first group of images are four and said images are called sub-bands. As a result, a first group of four sub-bands is generated from the first Wavelet transform.
The four sub-bands are the following: a first sub-band, a second sub-band concerning horizontal components of said image in grey scale, a third sub-band concerning vertical components of said image in grey scale and a fourth sub-band concerning diagonal components of said image in grey scale.
A second Wavelet transform is applied to the first sub-band of the first group of sub-bands and a second group of four sub-bands is generated.
A third Wavelet transform is applied to the first sub-band of the second group of four sub-bands and a third group of four sub-bands is generated. The characteristics of the cells are extracted through a respective Co-occurrence matrix applied to three sub-bands: the second sub-band, the third sub-band and the fourth sub-band.
In particular, the characteristics are 19: Autocorrelation, Contrast, Correlation, Cluster Prominence, Cluster Shade, Dissimilarity, Energy, Entropy, Homogeneity, Maximum probability, Variance, Sum average, Sum variance, Sum entropy, Difference variance, Difference entropy, Information measure of correlation, Normalized inverse difference, Normalized inverse difference moment.
A feed-forward neural network is used for the classification of cells. The data-set of images is divided into three sets of images and each set of images is provided as input to the neural network to classify the cells.
A first disadvantage of said known method is that the results are not accurate for carrying out a quantitative analysis of the images.
The images of the cell obtained through a fluorescence microscope are blurred and consequently some information necessary for the analysis of a cell cannot be taken into consideration.
A second disadvantage is that manual segmentation does not allow the cell to be cut out precisely and consequently the texture analysis is not accurate.
A further disadvantage is given by the fact that the Wavelet transform is applied only to the first sub-band and the Co-occurrence matrix is always applied to the remaining three sub-bands, different from said first sub-band.
The fact that each Wavelet transform is carried out only on the first sub-band implies the loss of information contained in the other sub-bands and the fact that each Co-occurrence matrix is applied to the remaining three sub-bands (and not to four sub-bands) implies the loss of information contained in the first sub-band. This involves an analysis of the cell texture with reduced accuracy.
The aim of the present invention is to overcome said disadvantages, providing an automatic and efficient method for determining whether a cell shown in a nuclear fluorescence image obtained through a confocal microscope is a diseased cell, in particular a tumorous cell.
In particular, the method is conceived to determine whether the cell is a diseased cell on the basis of an analysis of the nucleus of said cell, taking into account one or more characteristics of said nucleus, i.e. texture and preferably size and morphology of said nucleus.
Advantageously, by means of the method object of the present invention it is possible to diagnose the type of tumour.
It is therefore object of the invention a method for determining whether at least a cell of body tissue shown in a nuclear fluorescence image acquired through a confocal microscope is a diseased cell, in particular a tumorous cell, wherein said fluorescence is obtained through a DNA intercalating agent and wherein said method comprises the following steps:
Further embodiments of the method are disclosed in the dependent method claims.
It is also object of the invention a system for determining whether at least a cell of body tissue shown in a nuclear fluorescence image acquired through a confocal microscope is a diseased cell, in particular a tumorous cell, wherein said fluorescence is obtained through a DNA intercalating agent and wherein said system comprises:
Further embodiments of the system are disclosed in the system dependent claims.
The present invention relates also to the computer program, comprising code means configured in such a way that, when executed on a computer, perform the steps of the method disclosed above.
Furthermore, the present invention relates to a computer-readable storage medium comprising instructions which, when executed by a computer, cause the computer to carry out the steps of the method disclosed above.
The present invention will be now described, for illustrative, but not limitative purposes, according to its embodiment, making particular reference to the enclosed figures, wherein:
With reference to
In particular, the method is conceived to verify whether the nucleus of a cell is the nucleus of a diseased cell, through an analysis of some characteristics of the nucleus itself.
Although the method can be applied to a nucleus of a healthy cell of a liver tissue (shown in
In particular,
Furthermore, the image shown in
In particular, said fluorescence is obtained through a DNA intercalating agent, i.e. a chemical agent capable of binding to the cell's DNA and emitting fluorescence.
Said DNA intercalating agent can be a fluorochrome and preferably DRAQ5.
DRAQ5 is an anthraquinone-based dye that binds stoichiometrically to the DNA present in the nucleus of a cell and emits fluorescence.
The fact that the image of the cell is a nuclear fluorescence image (in which the fluorescence is obtained through said DNA intercalating agent and not through an antibody), and that said image is obtained with a confocal microscope allows the method object of the present invention to accurately determine if a cell is diseased on the basis of the analysis of some characteristics of the nucleus of said cell, such as texture, size and morphology.
This makes the method object of the present patent application different from the methods of the known type which are designed to analyse whether a cell expresses a protein or not to answer a diagnostic question.
In the disclosed embodiment, the fluorescence technique was performed on images of sections of a diseased liver tissue fixed in formalin and embedded in paraffin.
The nuclei of the cells present in said sections of liver tissue have been marked using a fluorochrome, DRAQ5, diluted 1:5000 and incubated for 5 minutes at room temperature.
After washing the liver tissue sections, a drop of phosphate buffer saline (PBS)/glycerol (1:1) was placed on those liver tissue sections which were subsequently covered with a coverslip.
The images concerning liver tissue sections have been acquired through an Olympus Fluoview FV1000 confocal microscope provided with software FV10-ASW version 4.1, using a 40× objective and a further 20× objective (numerical aperture: 0.75).
Individual liver tissue sections have been acquired with a scan format of 1024×1024 pixels, a sampling rate equal to 20 μs/pixel, and the images are 12-bit/pixel images.
The mixing of the fluorochromes was carried out through the automatic sequential acquisition of multi-channel images, in order to reduce the spectral crosstalk between the channels.
The fluorochrome is a molecule which, when excited by photons emitted from a light radiation source, emits further photons having a wavelength greater than the wavelength of the photons with which the fluorochrome was excited.
In particular, the DRAQ5 has an optimal excitation wavelength of 647 nm and its emission spectrum has a peak value in the 681/697 nm band.
This fluorochrome is used to highlight the DNA present in the cell nucleus.
Hepatocarcinoma is difficult to identify and is characterized by abnormal groups of hepatocytes, as well as by anomalies of the nucleus.
Therefore, one or more liver cells will have a high N/C (nucleus/cytoplasm) ratio.
The essential features that will be highlighted will concern the alteration of the nuclei of the liver cells that will appear large and often joined together.
With reference to the method object of the invention, said method comprises the following steps:
With reference to step A, a segmented image Is of the nucleus C of a single cell is obtained.
In the embodiment being described, as already said, said cell is a cell of a diseased liver tissue.
The number of pixels of the segmented image Is does not depend on the dimensions of the nucleus of the cell.
In the embodiment being described, the segmentation is a binary segmentation.
It is known that binary segmentation applies to an image in grey scale and makes it possible to distinguish an object (in the specific case the nucleus of a cell) from its background. As a result, if the image originally acquired was an image in color, it would be necessary to transform said image in color into an image in grey scale before performing a binary segmentation.
If the grey level of a pixel is greater than a predetermined threshold value, this pixel belongs to the object, otherwise this pixel belongs to the background.
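By way of non-limiting illustration, the thresholding rule stated above can be sketched in Python as follows. The function name `binary_segment`, the sample grey levels and the threshold value of 100 are assumptions made for the example only and are not taken from the described embodiment.

```python
def binary_segment(gray, threshold):
    """Return a mask with 1 for object pixels and 0 for background pixels."""
    # A pixel belongs to the object only if its grey level exceeds the threshold.
    return [[1 if level > threshold else 0 for level in row] for row in gray]


# Hypothetical 3x3 grey-scale image (values chosen for illustration only).
gray = [
    [3, 10, 200],
    [5, 180, 220],
    [2, 4, 90],
]
mask = binary_segment(gray, threshold=100)
print(mask)
```

In this sketch a pixel whose grey level is exactly equal to the threshold is assigned to the background, in line with the rule that only grey levels greater than the threshold belong to the object.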
With reference to step B, as said, the segmented image Is of the nucleus C of the cell is inserted in a background of a predetermined color, so that the resulting image is a reference image IREF.
A reference matrix MREF is associated with said reference image IREF.
A respective number in said reference matrix MREF is associated with each pixel of said reference image IREF and the value of said number is the respective grey level of said pixel.
As already said, the predetermined color for the background is preferably the black color.
Advantageously, from the computational point of view, a number equal to 0 is associated with each pixel having black color.
The scale of grey levels goes from black color to the white color and the number 0 corresponds to the black color.
Consequently, the reference image IREF is the real image of the nucleus C of the cell, since the background of black color is not taken into account.
However, the predetermined color for the background can be a color different from the black color, such as dark blue, without departing from the scope of the invention.
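A minimal sketch of how a reference matrix of this kind could be built is given below. It assumes that a binary mask of the nucleus is already available and that the background occupies the same pixel grid as the nucleus; the function name `build_reference_matrix` and the sample values are hypothetical.

```python
def build_reference_matrix(gray, mask, background=0):
    """Keep the grey level of nucleus pixels and set all other pixels to the background value."""
    return [
        [level if inside else background for level, inside in zip(gray_row, mask_row)]
        for gray_row, mask_row in zip(gray, mask)
    ]


# Hypothetical grey-scale image and nucleus mask (1 = nucleus, 0 = background).
gray = [
    [3, 10, 200],
    [5, 180, 220],
    [2, 4, 90],
]
mask = [
    [0, 0, 1],
    [0, 1, 1],
    [0, 0, 0],
]
m_ref = build_reference_matrix(gray, mask)  # background=0 corresponds to black
print(m_ref)
```

With a black background every background pixel contributes the number 0 to the matrix, so only the pixels of the nucleus carry non-zero grey levels.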
With reference to step C, the discrete Wavelet transform reveals the texture of the nucleus of the cell.
The discrete Wavelet transform is applied to the reference matrix MREF associated with reference image IREF (i.e. the image obtained by inserting the segmented image Is on a background of a predetermined color) and yields four further matrices M1,M2,M3,M4 associated with respective further images I1,I2,I3,I4 of the nucleus of the same cell.
Each further matrix M1,M2,M3,M4 has dimensions M′×N′.
The sum of said further matrices M1,M2,M3,M4 is a matrix of dimensions M×N.
If, on the one hand, said further first image I1 is an image of the nucleus of the cell shown in said reference image IREF with a resolution lower than the resolution of said reference image IREF, on the other hand said further first image I1 is the only further image in which the real perimeter of the nucleus of the cell is visible.
The other further images (i.e. the further second image I2, the further third image I3 and the further fourth image I4) are images of the same nucleus C of the cell respectively referring to the horizontal components of the nucleus of the cell, to the vertical components of the nucleus of the cell and to the diagonal components of the nucleus of the cell.
Furthermore, the discrete Wavelet transform mentioned in step C of the method is a transform of first order.
However, the discrete Wavelet transform can be a transform of any order, without departing from the invention.
In case of discrete Wavelet transforms of order higher than the first order, for example up to the third order, the Wavelet transform of second order will be applied to the further images I1, I2, I3, I4 which are the four sub-bands obtained from the Wavelet transform of first order and the Wavelet transform of third order will be applied to the further images which will be the four sub-bands obtained from the Wavelet transform of second order.
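The decomposition of step C can be illustrated with the following sketch. It assumes a Haar wavelet, even matrix dimensions, and a reading in which every sub-band is decomposed again at each higher order, so that an order n yields 4^n sub-bands. The wavelet family, the labelling of the three detail sub-bands and the function names `haar_dwt2` and `decompose` are assumptions, since they are not specified above.

```python
def haar_dwt2(m):
    """One Haar level: return [approximation, horizontal, vertical, diagonal] sub-bands."""
    bands = ([], [], [], [])
    for r in range(0, len(m), 2):
        rows = ([], [], [], [])
        for c in range(0, len(m[0]), 2):
            # The four pixels of the current 2x2 block.
            a, b, p, q = m[r][c], m[r][c + 1], m[r + 1][c], m[r + 1][c + 1]
            values = (
                (a + b + p + q) / 2,  # low-resolution approximation
                (a + b - p - q) / 2,  # difference between the two rows
                (a - b + p - q) / 2,  # difference between the two columns
                (a - b - p + q) / 2,  # diagonal difference
            )
            for row, value in zip(rows, values):
                row.append(value)
        for band, row in zip(bands, rows):
            band.append(row)
    return list(bands)


def decompose(m, order):
    """Apply the transform 'order' times, each time to every sub-band obtained so far."""
    bands = [m]
    for _ in range(order):
        bands = [sub for band in bands for sub in haar_dwt2(band)]
    return bands


# Hypothetical 8x8 reference matrix.
m_ref = [[(r * 8 + c) % 7 for c in range(8)] for r in range(8)]
first_order = decompose(m_ref, 1)   # 4 sub-bands of 4x4 elements
second_order = decompose(m_ref, 2)  # 16 sub-bands of 2x2 elements
print(len(first_order), len(second_order))
```

Decomposing every sub-band, rather than only the first one, is what distinguishes this scheme from the known method discussed earlier, in which the information of the other sub-bands is not carried to the next level.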
With reference to step D, a respective Co-occurrence matrix P1(i,j|Δx, Δy) P2 (i,j|Δx, Δy) P3(i,j|Δx, Δy) P4(i,j|Δx, Δy) is created for each further matrix M1,M2,M3,M4 obtained through the discrete Wavelet transform (and associated with a respective further image I1,I2,I3,I4).
In general, the Co-occurrence matrix contains information on the characteristics of the nucleus C of the cell and the information on the texture, on the size and on morphology is present among this information.
Each Co-occurrence matrix P1(i,j|Δx, Δy) P2 (i,j|Δx, Δy) P3 (i,j|Δx, Δy) P4(i,j|Δx, Δy) is calculated according to the following formula:
Pz(i,j|Δx,Δy)=Wz·Qz(i,j|Δx,Δy)
Each Co-occurrence matrix P1(i,j|Δx, Δy) P2 (i,j|Δx, Δy) P3 (i,j|Δx, Δy) P4(i,j|Δx, Δy) is a matrix of dimensions G×G, wherein G is the number of grey levels associated with the pixels present in said further matrices M1, M2, M3, M4.
Each Co-occurrence matrix P1(i,j|Δx, Δy) P2 (i,j|Δx, Δy) P3 (i,j|Δx, Δy) P4(i,j|Δx, Δy) has in a respective position i,j the number of pairs of elements of a respective further matrix M1,M2,M3,M4, wherein each pair of elements is associated with a respective pair of pixels.
In particular, each pair of elements is formed by a first element associated with a first pixel of said pair of pixels having a grey level equal to i and by a second element associated with a second pixel of said pair of pixels, different from said first pixel and having a grey level equal to j.
Consequently, in each element in position i,j of a respective Co-occurrence matrix P1(i,j|Δx, Δy) P2 (i,j|Δx, Δy) P3(i,j|Δx, Δy) P4(i,j|Δx, Δy) a triple contribution is present: the grey level of a first pixel, the grey level of a second pixel, different from said first pixel, and the number of pairs of pixels formed by a first pixel and by a second pixel with respective grey levels.
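The counting of pixel pairs described above can be sketched as follows. The sketch assumes that the elements of the matrix have already been quantized to integer grey levels between 0 and G-1, and it returns the raw pair counts only, leaving out the factor Wz of the formula. The function name `cooccurrence` and the sample matrix are hypothetical.

```python
def cooccurrence(m, levels, dx=1, dy=0):
    """Count pairs (first pixel, second pixel) with grey levels (i, j) at offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(m), len(m[0])
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            # Only pairs whose second pixel lies inside the matrix are counted.
            if 0 <= y2 < rows and 0 <= x2 < cols:
                counts[m[y][x]][m[y2][x2]] += 1
    return counts


# Hypothetical 3x3 matrix with G = 3 grey levels.
m = [
    [0, 0, 1],
    [0, 1, 2],
    [2, 2, 2],
]
p = cooccurrence(m, levels=3, dx=1, dy=0)  # second pixel immediately to the right
print(p)
```

In this sketch the list index 0 corresponds to grey level 0, that is, to the first row and the first column of the Co-occurrence matrix.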
With reference to step E, a plurality of statistical functions SF1,SF2 . . . SFN are calculated starting from each Co-occurrence matrix P1(i,j|Δx, Δy) P2 (i,j|Δx, Δy) P3 (i,j|Δx, Δy) P4(i,j|Δx, Δy).
Said statistical functions are predetermined and chosen to characterize at least the texture and preferably the size and the morphology of the nucleus C of the cell, as explained below.
In other words, a respective plurality of statistical functions SF1,SF2 . . . SFN is calculated for each of said Co-occurrence matrices P1(i,j|Δx, Δy) P2 (i,j|Δx, Δy) P3(i,j|Δx, Δy) P4(i,j|Δx, Δy).
The result of each statistical function SF1,SF2 . . . SFN is a respective number, so that a vector V of numbers comprising four sub-vectors v1,v2,v3,v4 (i.e. V=[v1;v2;v3;v4]) is associated with the nucleus C of said cell.
Each of said sub-vectors v1,v2,v3,v4 is associated with a respective further image I1,I2,I3,I4 and contains k elements wherein k is the number of the used statistical functions (i.e. the number of elements is equal to the number of statistical functions).
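The structure of the vector V can be illustrated with the short sketch below, in which each Co-occurrence matrix produces one sub-vector with as many elements as there are statistical functions. The two functions `total_pairs` and `largest_entry` are placeholders chosen only to keep the example small and are not statistical functions of the method; the function name `build_feature_vector` is likewise an assumption.

```python
def build_feature_vector(matrices, functions):
    """Return the sub-vectors (one per matrix) and their concatenation V."""
    sub_vectors = [[f(p) for f in functions] for p in matrices]
    vector = [value for sub in sub_vectors for value in sub]
    return sub_vectors, vector


# Placeholder statistics, for illustration only.
def total_pairs(p):
    return sum(sum(row) for row in p)


def largest_entry(p):
    return max(max(row) for row in p)


# Four hypothetical Co-occurrence matrices, one per further image.
matrices = [
    [[1, 2], [0, 3]],
    [[4, 0], [1, 1]],
    [[0, 0], [2, 2]],
    [[5, 1], [1, 0]],
]
sub_vectors, vector = build_feature_vector(matrices, [total_pairs, largest_entry])
print(sub_vectors)
print(vector)
```

With k functions and four matrices the vector has 4·k elements, which gives 28 elements for seven statistical functions.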
In the embodiment being described, said plurality of statistical functions comprises seven statistical functions SF1,SF2 . . . SF7, mentioned below.
A first statistical function SF1 named Inverse Difference Moment (IDM) is conceived to indicate a homogeneity in the distribution of grey levels:
Said first statistical function SF1 is a measure of the homogeneity of the image (i.e. of a homogeneity of the grey levels) and therefore offers an indication of how much the image is free of significant variations between two grey levels.
The greater the numerical result of said first statistical function SF1, the lower the numerical result of a further statistical function called Contrast mentioned below.
A second statistical function SF2 named Energy (EN) is conceived to indicate a homogeneity in the structure of the texture of the nucleus of the cell:
In other words, said second statistical function SF2 relates to the structure of the texture of the nucleus of the cell intended as a macrostructure of the texture, since it refers to the nucleus of the cell in its entirety.
A third statistical function SF3 named Norm Entropy (NE) is conceived to take into account the level of clutter between pixels:
In other words, the closer the numerical values associated with the respective grey levels are to the maximum value of the grey levels, the higher the numerical result of said third statistical function SF3, wherein said maximum value depends on the number of grey levels with which it has been chosen to encode the reference image.
For example, if the grey levels range up to 256, the numerical result of said third statistical function will be greater the closer the grey levels are to 256.
In a further example, if the grey levels range from 0 to 56, the numerical result of said third statistical function will be greater the closer the grey levels are to 56.
A fourth statistical function SF4 named Local Homogeneity (LO) is conceived to indicate the presence of homogeneous areas or non-homogeneous areas:
The numerical result of said fourth statistical function SF4 increases with the number of homogeneous areas inside the nucleus of the cell and decreases with the number of inhomogeneous areas inside the nucleus of the cell.
A fifth statistical function SF5 named Cluster Shade (CS) is conceived to indicate an asymmetry of the Co-occurrence matrix:
A sixth statistical function SF6 named Cluster Prominence (CP) is conceived to indicate a further asymmetry of the Co-occurrence matrix:
The higher the numerical results of said fifth statistical function SF5 and of said sixth statistical function SF6 the more the Co-occurrence matrix is asymmetric with respect to its diagonal.
A seventh statistical function SF7 named Contrast (CO) is conceived to identify the difference in intensity between two grey levels, a first grey level associated with said first pixel and a second grey level associated with said second pixel:
The higher the numerical result of said seventh statistical function SF7, the greater the difference in intensity between the grey levels of the two pixels of a pair of pixels.
As mentioned, said two pixels can be placed side by side or at a predetermined distance from each other.
As regards said seventh statistical function SF7, it is preferable that said two pixels are side by side.
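As a purely illustrative sketch, five of the statistical functions named above can be computed from a matrix of pair counts using their common textbook (Haralick-type) definitions. These definitions are an assumption and may differ from the formulas of the method, for instance in normalization. Norm Entropy and Local Homogeneity are left out because no widely agreed definition can be assumed for them. The function names are hypothetical.

```python
def normalise(counts):
    """Turn pair counts into relative frequencies (assumes at least one counted pair)."""
    total = sum(sum(row) for row in counts)
    return [[c / total for c in row] for row in counts]


def texture_statistics(counts):
    """Textbook definitions (assumed): IDM, Energy, Cluster Shade, Cluster Prominence, Contrast."""
    p = normalise(counts)
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean_i = sum(i * v for i, _, v in cells)
    mean_j = sum(j * v for _, j, v in cells)
    centre = mean_i + mean_j
    return {
        "IDM": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "EN": sum(v * v for _, _, v in cells),
        "CS": sum((i + j - centre) ** 3 * v for i, j, v in cells),
        "CP": sum((i + j - centre) ** 4 * v for i, j, v in cells),
        "CO": sum((i - j) ** 2 * v for i, j, v in cells),
    }


# Hypothetical pair counts: all pairs lie on the diagonal (identical grey levels).
uniform = texture_statistics([[2, 0], [0, 2]])
# Hypothetical pair counts with some pairs of different grey levels.
mixed = texture_statistics([[1, 2, 0], [0, 0, 1], [0, 0, 2]])
print(uniform)
print(mixed)
```

Under these assumed definitions the example reproduces the relationship stated for the first statistical function: the matrix with the higher Inverse Difference Moment has the lower Contrast.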
With reference to vector V, said vector V is given by four sub-vectors v1,v2,v3,v4, each of which is formed by the numerical results of the seven statistical functions SF1,SF2 . . . SF7 mentioned above and referred to a respective Co-occurrence matrix P1(i,j|Δx, Δy) P2(i,j|Δx, Δy) P3(i,j|Δx, Δy) P4(i,j|Δx, Δy).
In other words, the vector V=[IDM1, EN1, NE1, LO1, CS1, CP1, CO1; IDM2, EN2, NE2, LO2, CS2, CP2, CO2; IDM3, EN3, NE3, LO3, CS3, CP3, CO3; IDM4, EN4, NE4, LO4, CS4, CP4, CO4].
Consequently, in the embodiment being described, each sub-vector v1,v2,v3,v4 is so defined:
However, it is preferable that said plurality of statistical functions comprises two further statistical functions to also characterize the size and morphology of the nucleus of said cell: an eighth statistical function SF8 and a ninth statistical function SF9.
The eighth statistical function SF8 called Extension is conceived to offer an estimate of the size of the cell nucleus C through the number of pairs of pixels, each of which is formed by a respective first pixel and a respective second pixel, different from said first pixel and positioned next to said first pixel, in which the first pixel and the second pixel of each pair of pixels have a grey level equal to 0:
EX=1/Pz(i=1,j=1|Δx,Δy)
The greater the number of pixel pairs with both pixels having a grey level equal to 0, the smaller the size of the cell nucleus.
Consequently, this eighth statistical function offers an estimate of the size of the cell's nucleus.
A ninth statistical function SF9 named EdgeLengthEstimate is conceived to offer an estimate of the perimeter of the nucleus C of the cell through the number of pairs of pixels, each of which is formed by a respective first pixel and a respective second pixel, different from said first pixel and positioned next to said first pixel, in which one of said two pixels has a grey level equal to 0:
As can be seen from the formula, the ninth statistical function adds a first number, which is the sum of all the elements of the first row of the Co-occurrence matrix, to a second number, which is the sum of the elements of the first column of the same Co-occurrence matrix.
The result obtained by adding said first number and said second number is the number of pairs of pixels arranged on the edge of the nucleus of the cell.
This ninth statistical function offers an estimate of the perimeter of the cell nucleus.
The values of the eighth statistical function and the ninth statistical function offer an estimate of the size and morphology of a nucleus of a cell.
In fact, if the value of the eighth statistical function is low and the value of the ninth statistical function is high, it means that the nucleus of the cell has a jagged edge and a jagged edge may be characteristic of a tumorous cell.
To determine the size and morphology of the nucleus of the cell, the same matrix, from which information on the texture of said nucleus was obtained, has been used, so as to simplify the calculations and optimize the calculation time.
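The two size and morphology estimates can be sketched directly from their description: Extension is the reciprocal of the element in the first row and first column, and EdgeLengthEstimate is the sum of the first row plus the sum of the first column. The sketch assumes that the list index 0 holds grey level 0 (the position i=1, j=1 of the formula) and that at least one pair of background pixels exists; the function names and the sample counts are hypothetical.

```python
def extension(p):
    """EX = 1 / P(1,1), where P(1,1) counts the pairs with both grey levels equal to 0."""
    return 1 / p[0][0]


def edge_length_estimate(p):
    """Sum of the first row plus sum of the first column of the Co-occurrence matrix."""
    first_row = sum(p[0])
    first_column = sum(row[0] for row in p)
    # As described, the element P(1,1) enters both sums.
    return first_row + first_column


# Hypothetical Co-occurrence matrix with three grey levels.
p = [
    [10, 2, 1],
    [3, 5, 0],
    [1, 0, 4],
]
ex = extension(p)
ele = edge_length_estimate(p)
print(ex, ele)
```

Both values are read from the same matrix that is used for the texture, so no additional pass over the image is needed.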
If nine statistical functions are used, each of the four sub-vectors v1,v2,v3,v4 mentioned above would be formed by the numerical results of nine statistical functions SF1,SF2 . . . SF9 and referred to a respective Co-occurrence matrix P1(i,j|Δx, Δy) P2(i,j|Δx, Δy) P3(i,j|Δx, Δy) P4(i,j|Δx, Δy).
In other words, the vector V=[IDM1, EN1, NE1, LO1, CS1, CP1, CO1, EX1, ELE1; IDM2, EN2, NE2, LO2, CS2, CP2, CO2, EX2, ELE2; IDM3, EN3, NE3, LO3, CS3, CP3, CO3, EX3, ELE3; IDM4, EN4, NE4, LO4, CS4, CP4, CO4, EX4, ELE4].
Consequently, each sub-vector v1,v2,v3,v4 would be so defined:
With reference to step F, as said, said predetermined neural network NN is designed to provide at least a first numerical value between 0 and 1 at a respective output node, i.e. the first output node.
In particular, in the embodiment being described, said predetermined neural network is a feed-forward neural network.
Furthermore, the learning method for said neural network is a quasi-Newton method.
With reference to steps G and H, said first numerical value will be compared with a predetermined threshold and the cell will be considered a diseased cell, if said first numerical value is greater than said predetermined threshold.
With particular reference to steps G and H, said step G can comprise a sub-step G1 of approximating said first numerical value to 1, when said first numerical value is greater than said predetermined threshold, and to 0, when said first numerical value is less than or equal to said predetermined threshold, and with reference to step H said cell is a diseased cell, in particular a tumorous cell, when said first numerical value is approximated to 1.
Returning to step F, in the embodiment being described, said predetermined neural network NN comprises a second output node NOUT2.
Furthermore, said predetermined neural network NN is configured to provide as output a second numerical value between 0 and 1 at said second output node NOUT2.
Said second numerical value is compared with the same predetermined threshold with which the first numerical value is compared.
After the comparison with said predetermined threshold, said second numerical value is approximated to 1 or 0.
A diseased cell (in the embodiment being described) is identified by a first numerical value (at the first output node NOUT1) which has been approximated to 1 and by a second numerical value (at the second output node NOUT2) which has been approximated to 0.
A healthy cell is identified by a first numerical value (at the first output node NOUT1) which has been approximated to 0 and by a second numerical value (at the second output node NOUT2) which has been approximated to 1.
In other words, the steps from F to H have been modified as follows.
The step F of the method provides that said predetermined neural network NN is configured to provide as output a second numerical value at said second output node NOUT2.
The step G of the method comprises the comparison of said second numerical value at said second output node NOUT2 with said predetermined threshold.
The step H of the method allows determining that the nucleus C of said cell is the nucleus of a diseased cell, in particular a tumorous cell, when said first numerical value is greater than said predetermined threshold and said second numerical value is less than or equal to said predetermined threshold.
In particular, the step G can comprise a sub-step G2 of approximating the second numerical value to 1, when said second numerical value is greater than said predetermined threshold, and to 0, when said second numerical value is less than or equal to said predetermined threshold. In this case, with reference to step H, said cell is a diseased cell, in particular a tumorous cell, when said first numerical value is approximated to 1 and said second numerical value is approximated to 0.
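Purely as a sketch, the sub-steps G1 and G2 and the decision of step H can be expressed as follows. The function names and the label "undetermined" are assumptions: the description only defines the combinations (1, 0) for a diseased cell and (0, 1) for a healthy cell, so the remaining combinations are merely flagged here.

```python
def binarize(value, threshold):
    """Sub-steps G1/G2: 1 if value > threshold, 0 if value <= threshold."""
    return 1 if value > threshold else 0


def classify(first_value, second_value, threshold):
    """Step H applied to the values at the output nodes NOUT1 and NOUT2."""
    b1 = binarize(first_value, threshold)
    b2 = binarize(second_value, threshold)
    if (b1, b2) == (1, 0):
        return "diseased"
    if (b1, b2) == (0, 1):
        return "healthy"
    # Assumption: (0, 0) and (1, 1) are not covered by the description.
    return "undetermined"


# Hypothetical output values, compared with a threshold of 0.8.
label = classify(0.93, 0.05, 0.8)
```

Since the comparison is strict, a value exactly equal to the threshold is approximated to 0.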
Said two output nodes NOUT1, NOUT2 are included in an output layer of said predetermined neural network NN.
As is clear from the system capable of implementing this method, shown in
With reference to the input layer, in the embodiment being described, said input layer comprises twenty-eight input nodes NIN1,NIN2 . . . NIN28, each of which is associated with a respective numerical result of each of said seven statistical functions SF1,SF2 . . . SF7 for each of the four Co-occurrence matrices M1,M2,M3,M4.
With reference to the hidden layer, in the embodiment being described, said hidden layer comprises ten hidden nodes NN1,NN2 . . . NN10.
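A minimal sketch of a forward pass through a feed-forward network with twenty-eight input nodes, ten hidden nodes and two output nodes is given below. The sigmoid activation, the random placeholder weights and all names are assumptions made for illustration: the description only requires output values between 0 and 1, and real weights would result from training.

```python
import math
import random


def sigmoid(x):
    """Logistic function; its output always lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """One fully connected layer followed by a sigmoid activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(features, params):
    """Input layer -> hidden layer -> output layer (NOUT1, NOUT2)."""
    hidden = dense(features, params["w_hidden"], params["b_hidden"])
    return dense(hidden, params["w_out"], params["b_out"])


def random_params(n_in=28, n_hidden=10, n_out=2, seed=0):
    """Placeholder (untrained) weights, used only to make the sketch runnable."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

    return {
        "w_hidden": matrix(n_hidden, n_in),
        "b_hidden": [0.0] * n_hidden,
        "w_out": matrix(n_out, n_hidden),
        "b_out": [0.0] * n_out,
    }


params = random_params()
features = [0.5] * 28  # hypothetical results of the statistical functions
nout1, nout2 = forward(features, params)
```

The two returned values correspond to the first and second numerical values that are subsequently compared with the predetermined threshold.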
The present invention also relates to a system, shown in
Said system comprises:
In particular, said logic control unit U is configured to approximate said first numerical value to 1, when said first numerical value is greater than said predetermined threshold, and to 0, when said first numerical value is less than or equal to said predetermined threshold, and to determine whether the nucleus C of a cell is the nucleus of a diseased cell, in particular a tumorous cell, when said first numerical value is approximated to 1.
Furthermore, as said for the method, said first output node is included in the output layer of said predetermined neural network NN.
Said predetermined neural network NN can comprise a second output node NOUT2 (also included in said output layer) and said predetermined neural network NN can be configured to provide a second numerical value between 0 and 1 at said second output node NOUT2 (in addition to the first numerical value and always on the basis of the results of the statistical functions provided as input to the neural network), and said logic control unit U can be configured to compare said second numerical value with said predetermined threshold and determine that said cell C is a diseased cell, in particular a tumorous cell, when said second numerical value is less than or equal to said predetermined threshold, in addition to said first numerical value being greater than said predetermined threshold.
In particular, said logic control unit U can be configured to approximate said second numerical value to 1, when said second numerical value is greater than said predetermined threshold, and to 0, when said second numerical value is less than or equal to said predetermined threshold, and to determine that the nucleus C of said cell is the nucleus of a diseased cell, in particular the nucleus of a tumorous cell, when said second numerical value is approximated to 0, in addition to said first numerical value being approximated to 1.
As said for the method, said plurality of statistical functions can comprise seven statistical functions to characterize the texture and preferably two further statistical functions to characterize the size and the morphology of the nucleus of a cell.
The present invention also relates to a computer program comprising code means configured to perform, when executed on a computer, the steps of the method described above.
Furthermore, the present invention also relates to a computer-readable storage medium comprising instructions, which, when executed by a computer, cause the computer to carry out the steps of the method described above.
Example of Creating a Co-Occurrence Matrix
Below is an example of how a Co-occurrence matrix is created starting from a further matrix associated with a further image, wherein said further matrix has dimensions 5×5 (consequently M′ is equal to 5 and N′ is equal to 5) and said further image is coded with 5 levels of grey (i.e. through the values 0,1,2,3,4).
It is assumed that said further matrix is the further first matrix M1 for convenience.
Below is an example of said further first matrix:
As mentioned, the Co-occurrence matrix is defined by the following general formula:
$$P_z(i,j,\Delta x,\Delta y)=W_z\cdot Q_z(i,j\mid\Delta x,\Delta y)$$
In the example being described Δx=1 and Δy=0.
This means that the pairs of elements of said further matrix taken into consideration (in which each element corresponds to a respective pixel) are formed by two elements side by side, i.e. a first element and a second element arranged within said further matrix in the position immediately following said first element.
Consequently, the general formula indicated above becomes:
$$P_1(i,j,1,0)=W_1\cdot Q_1(i,j\mid 1,0)$$
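The counting that underlies Q1 for Δx=1 and Δy=0 can be sketched as follows. The 5×5 matrix used here is hypothetical and is not the further first matrix of the example. The function name is an assumption, and Δx is assumed to act along the row, consistently with elements placed side by side.

```python
def cooccurrence_counts(matrix, levels, dx=1, dy=0):
    """Count, for each (i, j), the pairs with grey level i then grey level j.

    The first element is at (row, col) and the second at (row + dy, col + dx).
    Assumes non-negative offsets dx and dy.
    """
    rows, cols = len(matrix), len(matrix[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows - dy):
        for c in range(cols - dx):
            i = matrix[r][c]
            j = matrix[r + dy][c + dx]
            counts[i][j] += 1
    return counts


# Hypothetical 5x5 matrix with grey levels 0..4 (illustrative values only).
M = [
    [0, 1, 2, 3, 4],
    [1, 1, 2, 3, 3],
    [0, 0, 2, 4, 4],
    [2, 2, 2, 1, 0],
    [4, 3, 3, 1, 0],
]
Q = cooccurrence_counts(M, levels=5)
total_pairs = sum(sum(row) for row in Q)
```

For a 5×5 matrix and horizontally adjacent elements there are 5×4=20 pairs in total, which is the value of `total_pairs`. Multiplying each count by W1 then gives the corresponding element of P1 according to the formula above.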
In the example being described the parameter W1 (i.e. the parameter related to the number of possible pairs of elements, each pair being associated with a respective pair of pixels) becomes:
As regards the calculation of the parameter Q1 (i.e. the number of pairs of elements of a further matrix, wherein each pair of elements is formed by said first element associated with said first pixel with grey level equal to i and by said second element associated with said second pixel with grey level equal to j), a table is shown below which reports the number of pairs of elements as i and j vary, in order to facilitate the calculation of this parameter.
With reference to the first row of the table:
With reference to the second row of the table:
With reference to the third row of the table:
With reference to the fourth row of the table:
As a result:
A nuclear fluorescence image of a liver tissue containing a number of cells equal to 573 (including healthy cells and diseased cells) has been processed through the method described above, using a neural network already trained with other nuclear fluorescence images concerning a plurality of cells present in a healthy and diseased liver tissue. The results have been compared with the results of traditional anatomical pathology methods.
Furthermore, in order to evaluate the robustness of the method described above, it has been chosen to apply different predetermined threshold values to determine whether the cell is healthy or diseased.
In the example being disclosed, said threshold values have been chosen between 0 and 1.
In particular, the chosen threshold values are the following: 0.2, 0.4, 0.6 and 0.8.
Below is a table showing the results obtained by varying the threshold values.
In the table above:
and
The values shown in the table have been used to construct a respective confusion matrix for each predetermined threshold value and to construct a ROC curve concerning all the confusion matrices.
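As an illustration only, the construction of one confusion matrix per threshold value and of the corresponding ROC points can be sketched as follows. The scores and reference labels are hypothetical and are not the values of the table. Classifying a cell as diseased when its score is greater than the threshold follows the comparison described for step G.

```python
def confusion_matrix(scores, labels, threshold):
    """Return (TP, FP, TN, FN); a cell is predicted diseased if score > threshold."""
    tp = fp = tn = fn = 0
    for score, diseased in zip(scores, labels):
        predicted = score > threshold
        if predicted and diseased:
            tp += 1
        elif predicted and not diseased:
            fp += 1
        elif not predicted and not diseased:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_point(tp, fp, tn, fn):
    """Return (false positive rate, true positive rate) of a confusion matrix."""
    return fp / (fp + tn), tp / (tp + fn)


# Hypothetical network outputs and reference labels (True = diseased).
scores = [0.95, 0.85, 0.70, 0.30, 0.65, 0.45, 0.25, 0.10]
labels = [True, True, True, True, False, False, False, False]
thresholds = [0.2, 0.4, 0.6, 0.8]
points = [roc_point(*confusion_matrix(scores, labels, t)) for t in thresholds]
```

Each threshold value thus contributes one point of the ROC curve. With these hypothetical data, raising the threshold from 0.2 to 0.8 lowers the false positive rate from 0.75 to 0 and the true positive rate from 1 to 0.5.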
The
The accuracy of the method described above to determine whether the cells are healthy cells or diseased cells is directly proportional to the area subtended by the ROC curve.
The area under the ROC curve is called AUC and measures the probability that the result of a test on a sick person randomly chosen from a group of sick people is greater than the result of a test on a healthy person randomly chosen from a group of healthy people.
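This probabilistic reading of the AUC can be sketched by comparing every pair formed by one diseased score and one healthy score. The scores are hypothetical, and counting ties as one half is a common convention assumed here, not a detail taken from the description.

```python
def auc_by_pairs(diseased_scores, healthy_scores):
    """Fraction of (diseased, healthy) pairs in which the diseased score is greater."""
    wins = 0.0
    for d in diseased_scores:
        for h in healthy_scores:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5  # assumption: a tie counts as half a win
    return wins / (len(diseased_scores) * len(healthy_scores))


# Hypothetical test results for diseased and healthy subjects.
diseased = [0.95, 0.85, 0.70, 0.30]
healthy = [0.65, 0.45, 0.25, 0.10]
auc = auc_by_pairs(diseased, healthy)
```

With these values, 14 of the 16 pairs are ordered correctly, so the estimate is 0.875.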
In addition, several methods are known to estimate the area subtended by the ROC curve (AUC).
In particular, a known method for estimating the area subtended by the ROC curve (AUC) provides for a numerical integration, for example by calculating different areas each of which is associated with a respective polygon subtended by the curve and then adding the area of all polygons.
The result of the sum of the areas of all polygons will provide a lower estimate of the real area subtended by the ROC curve (AUC).
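A minimal sketch of such a numerical integration, assuming that the polygons are the trapezoids between consecutive ROC points, is given below. The ROC points are hypothetical, and adding the end points (0, 0) and (1, 1) is an assumption made so that the whole interval is covered.

```python
def trapezoid_auc(roc_points):
    """Sum the trapezoid areas under a ROC curve given as (FPR, TPR) points."""
    # Assumption: the curve is closed with the end points (0, 0) and (1, 1).
    pts = sorted(set(roc_points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        # Area of the trapezoid between two consecutive points.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical ROC points, one per threshold value.
points = [(0.75, 1.0), (0.5, 0.75), (0.25, 0.75), (0.0, 0.5)]
auc_estimate = trapezoid_auc(points)
```

For these points the sum of the trapezoid areas is 0.8125. Because only a few thresholds are sampled, the polygons can lie below the true curve, which is why the sum is described as a lower estimate.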
In particular, it is possible to use a known method to interpret the value of the area subtended by the ROC curve (AUC), according to which:
Regardless of the predetermined threshold value, the method is moderately accurate or highly accurate.
The method described above is accurate with respect to the predetermined threshold value and robust with respect to the choice of each predetermined threshold value.
In the example being disclosed, it is preferable that the predetermined threshold value is greater than 0.2 and more preferably greater than or equal to 0.8.
Advantages
Advantageously, as said, the method object of the present invention allows determining automatically whether a cell shown in a nuclear fluorescence image obtained through a confocal microscope is a diseased cell, in particular a tumorous cell.
A second advantage is given by the fact that, through said method, it is possible to distinguish diseased cells from healthy cells.
A further advantage is due to the reliability of the method.
The present invention has been described for illustrative, but not limitative, purposes according to its preferred embodiment, but it is to be understood that variations and/or modifications can be carried out by a person skilled in the art without departing from the scope thereof, as defined by the enclosed claims.
Number | Date | Country | Kind |
---|---|---|---|
102020000022801 | Sep 2020 | IT | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/IT2021/050293 | 9/28/2021 | WO |