The present invention relates to a computer-implemented method and corresponding apparatus for the accurate identification of subcellular structures from tomographic reconstructions, thus making it possible to extract 3D subcellular specificity directly from phase-contrast data in a typical cytometry configuration. In particular, subcellular structures can be identified by using a novel computational segmentation method based on statistical inference, applied to cells suspended in a flow channel.
Traditional tools of histopathology analysis are expected to evolve soon, and the future of precision medicine will depend on the accurate screening and measurement of single cells. A key challenge for the next step forward is achieving more informative label-free microscopy. In fact, avoiding staining is of fundamental importance, as it gives access to non-destructive, rapid, and chemistry-free analysis in biology, thus prompting new questions and new answers in biology and medicine. Statistically relevant investigations on a large number of cells will be mandatory, as cellular populations are heterogeneous, and thus possible therapies will necessarily be based on high-throughput screening technologies. Nowadays, the gold standard imaging tool in cell biology is fluorescence microscopy (FM). Since biological samples are transparent objects, they mainly introduce phase delays on the incident optical radiation, changing its amplitude component only minimally. Therefore, in FM, stains or fluorescent tags are used to make them visible on a selective basis. However, fluorescent imaging is generally qualitative and limited by photobleaching and phototoxicity of the fluorescent proteins or dyes (exogenous contrast agents). Moreover, chemical toxicity can alter the normal cellular morphology and physiology, thus introducing undesired artifacts in the observed images. Finally, sample preparation protocols can be expensive, time-consuming, and operator-sensitive, i.e. different users can obtain different FM results within the same experiment.
Quantitative Phase Imaging (QPI) is emerging as a very useful tool in label-free microscopy and recently many significant results have been achieved in this field. QPI can allow a non-invasive and quantitative measurement of significant parameters in unlabelled cells correlated to their health state, e.g. cellular dry-mass and dry-density. In QPI, phase-contrast is due to the optical path length difference between the unlabelled biological specimen and its background due to the combination of thickness and refractive index (RI). These two quantities can be decoupled by recording multiple two-dimensional (2D) Quantitative Phase Maps (QPMs) at different viewing angles around the sample and performing the three-dimensional (3D) Optical Diffraction Tomography (ODT). Hence, ODT is a label-free optical microscopy technique that allows the 3D RI mapping of a biological specimen. RI is an intrinsic optical feature associated with cell biophysical properties (mass density, biochemical, mechanical, electrical, and optical), therefore ODT provides a full quantitative measurement of 3D morphologies and RI volumetric distribution at the single-cell level. ODT has been exploited for studying different cells, e.g. red blood cells, yeast cells, cancer cells, chromosomes, white blood cells, lipid droplets, and cell pathophysiology.
However, the advantages of stain-free imaging of QPI are counterbalanced by the lack of subcellular specificity. In fact, it is very difficult to ascertain and extract the 3D boundary of subcellular structures based solely on the RI values. Among all subcellular structures, the nucleus is the principal one in the eukaryotic cells, since it contains most of the cellular genetic material and it is responsible for the cellular lifecycle. Quantitative and label-free morphological biomarkers identified from nuclei could greatly enlarge the knowledge about cell physiology and in particular cancer diagnosis in histopathology (Backman, V. et al. Detection of preinvasive cancer cells. Nature 406, 35-36 (2000)). In fact, it has been proven that the nucleus-to-cytoplasm ratio increases in a cancer cell, as well as the phase value. For instance, significant changes in nuclear RI have been measured in breast cancer. Moreover, the efficacy of cancer therapies can be enhanced through a precise nuclear characterization. Identification of the nucleus-like region in label-free 3D imaging is a challenging task since the nuclear size and RI vary among different cell lines, as well as within the same cell line, and even within the same cell depending on its lifecycle. In addition, different subcellular structures show similar RI values, thus making any threshold-based detection method ineffective. A possible solution is to isolate the nucleus from the outer cell by a chemical etching process and then make direct label-free measurements. However, this approach is destructive and also led to debated results.
Recently, significant progress has been reported in introducing specificity in QPI by computational approaches based on artificial intelligence (AI). In particular, a Generative Adversarial Network (GAN) has been employed to virtually stain unlabelled tissues (Rivenson, Y. et al. PhaseStain: the digital staining of label-free quantitative phase microscopy images using deep learning. Light: Sci. Appl. 8, 23 (2019)) as well as single cells (Nygate, Y. N. et al. Holographic virtual staining of individual biological cells. Proc. Natl. Acad. Sci. USA 117, 9223-9231 (2020)) in QPMs, i.e. in a 2D imaging case. Instead, for the 3D imaging case, nuclei of unlabelled and adhered cells have been identified using a Convolutional Neural Network (CNN) to introduce specificity in ODT reconstructions. Moreover, digital staining through the application of deep neural networks has been successfully applied to multi-modal multi-photon microscopy in histopathology of tissues. In another work, a neural network has been used to translate autofluorescence images into images that are equivalent to the bright-field images of histologically stained versions of the same samples, thus achieving virtual histological staining (Rivenson, Y. et al. Deep learning-based virtual histology staining using auto-fluorescence of label-free tissue. arXiv: 1803.11293 (2018)).
In particular, about nucleus specificity, virtual staining-based segmentation makes 2D label-free QPI equivalent to 2D FM, both in static and flow cytometry environments. Moreover, CNN-based segmentation makes 3D label-free ODT equivalent to well-established 3D confocal microscopy, but only for static analysis of fixed cells at rest on a surface. Instead, the specificity property of confocal microscopy has not been replicated yet on suspended cells in a label-free manner. Furthermore, in FM, cyto-fluorimetry is the gold standard for histopathological analysis of biological samples, while its high throughput allows statistically relevant results. In fact, unlike other methods that measure averaged signals from a population of cells, in cyto-fluorimetry, thousands of cells per second can be analyzed individually. However, FM cyto-fluorimetry involves only 2D images.
Thus, there is still a need to provide methods and systems to identify and/or quantify subcellular structures without chemical staining.
The authors of the present invention have developed an alternative strategy to achieve specificity in label-free bioimaging, proposing a new 3D shape retrieval method, named herein Computational Segmentation based on Statistical Inference (CSSI), to identify subcellular structures in 3D Optical Diffraction Tomography (ODT) reconstructions in flow cytometry. To this aim, two recently established methods have been combined, namely tomographic flow cytometry by digital holography (DH), which records Quantitative Phase Maps (QPMs) of flowing and rotating cells, and a learning tomographic reconstruction algorithm, thus obtaining an in-flow 3D learning cyto-tomography system. The method of the present invention is completely different from the others known in the art. In fact, deep learning techniques are based on neural networks that must be previously trained on FM images. They can replicate results only for the specific situations they have been exposed to during the learning process. Moreover, the deep learning process needs a more complex recording system, which must be multimodal in order to acquire simultaneously both FM and label-free images of the same sample to feed the neural network. Instead, the proposed CSSI method is fully applicable to any type of cell, since it avoids the learning step and exploits a robust ad hoc clustering algorithm, i.e. it recognizes statistical similarities among the subcellular structure voxels. Furthermore, it is important to underline that this is the first time the problem has been posed of delineating stain-free subcellular structures in 3D within a single suspended cell and within single cells flowing in a microfluidic cytometer, for which no corresponding 3D FM technology exists.
Therefore, beyond the numerical assessment through virtual 3D cell phantoms, the CSSI algorithm can fill the specificity gap with 2D FM cyto-fluorimetry and with 3D FM confocal microscopy, but in the more difficult case of suspended cells. Imaging of suspended cells has a great advantage as the cells can be individually analyzed in-flow, i.e. in a high-throughput modality. However, while FM specificity leads to an indirect qualitative visualization of the subcellular elements, CSSI allows for direct measurements of intrinsic 3D parameters (morphology, RI, and their derivatives) of subcellular structures, thus providing a whole label-free quantitative characterization exploitable for analyzing large numbers of single-cells.
Hence, objects of the present invention are:
Additional advantages and/or embodiments of the present invention will be evident from the following detailed description.
The present invention and the following detailed description of preferred embodiments thereof may be better understood with reference to the following figures:
Block diagram of the CSSI method to segment the nucleus-like region from a stain-free 3D RI tomogram. d Visual comparison between the simulated 3D nucleus and the 3D nucleus-like region segmented from the simulated RI tomogram in (a). The simulated nucleus and the segmented nucleus are the dark structures within the outer cell shell. The clustering performances obtained in this simulation are reported below.
In the following, several embodiments of the invention will be described. It is intended that the features of the various embodiments can be combined, where compatible. In general, subsequent embodiments will be disclosed only with respect to the differences with the previously described ones.
As previously mentioned, a first object of the present invention is represented by a computer-implemented method for identifying a subcellular structure of a cell analysed by a cyto-tomographic technique, which method comprises the following steps:
As used herein, the expression “retrieving 3D Refractive Index (RI) tomograms” means that the tomograms acquired from the analysis of a cell by a cyto-tomographic technique are retrieved and used to carry out the method described herein.
The terms “clustering” and “grouping” refer to the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of exploratory data analysis, and a common technique for statistical data analysis, used in many fields, including pattern recognition, image analysis, information retrieval, bioinformatics, data compression, computer graphics and machine learning. Cluster analysis itself is not one specific algorithm, but the general task to be solved. It can be achieved by various algorithms that differ significantly in their understanding of what constitutes a cluster and how to efficiently find them. Popular notions of clusters include groups with small distances between cluster members, dense areas of the data space, intervals or particular statistical distributions. Clustering can therefore be formulated as a multi-objective optimization problem. The appropriate clustering algorithm and parameter settings (including parameters such as the distance function to use, a density threshold or the number of expected clusters) depend on the individual data set and intended use of the results. Cluster analysis as such is not an automatic task, but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and failure. It is often necessary to modify data preprocessing and model parameters until the result achieves the desired properties.
Hereinafter, the term “voxel” refers to the three-dimensional counterpart of the two-dimensional pixel (representing the unit of area), and therefore the volume buffer (a large 3D array of voxels) can be considered as the three-dimensional counterpart of the two-dimensional frame buffer of pixels. The term “voxel cloud” refers to a group of voxels inside the cell, not necessarily connected to each other.
As described herein, the term “subcellular structure” refers to structures that are within a cell.
The term “outlier” means a data point that differs significantly from other observations. An outlier may be due to variability in the measurement or it may indicate experimental error; the latter are sometimes excluded from the data set. An outlier can cause serious problems in statistical analyses. Outliers can occur by chance in any distribution, but they often indicate either measurement error or that the population has a heavy-tailed distribution. In the former case one wishes to discard them or use statistics that are robust to outliers, while in the latter case they indicate that the distribution has high skewness and that one should be very cautious in using tools or intuitions that assume a normal distribution. A frequent cause of outliers is a mixture of two distributions, which may be two distinct sub-populations, or may indicate ‘correct trial’ versus ‘measurement error’; this is modelled by a mixture model. In most larger samplings of data, some data points will be further away from the sample mean than what is deemed reasonable. This can be due to incidental systematic error or flaws in the theory that generated an assumed family of probability distributions, or it may be that some observations are far from the center of the data. Outlier points can therefore indicate faulty data, erroneous procedures, or areas where a certain theory might not be valid. However, in large samples, a small number of outliers is to be expected (and not due to any anomalous condition).
In an embodiment, said statistical similarity test exploited in step iv) of the method herein described is the Wilcoxon-Mann-Whitney (WMW) test, used for testing a null hypothesis H0, namely that the two sets of values have been drawn from the same distribution.
In a further embodiment, according to any one of the embodiments disclosed, said null hypothesis H0 of said WMW test can or cannot be rejected depending on the following cases, respectively:
The significance level γ is the probability of making a type I error (error of the first kind), i.e. of rejecting the null hypothesis H0 when it is true. The confidence level is defined as 1−γ, i.e. it is the probability of not rejecting the null hypothesis H0 when it is true.
The p-value is the observed significance level, i.e. the smallest significance level at which H0 is rejected. It can also be defined as the probability of obtaining results at least as extreme as the results actually observed, when the null hypothesis H0 is true. Therefore, a low p-value leads to rejecting the null hypothesis H0, because it means that such an extreme observed result is very unlikely when the null hypothesis H0 is true.
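By way of non-limiting illustration, the WMW test and the p-value decision described above can be sketched as follows. The sketch assumes the large-sample normal approximation of the U statistic, without tie or continuity correction; this choice, the rejection rule p ≤ γ, and the function names `average_ranks`, `wmw_p_value` and `reject_h0` are assumptions introduced here and are not prescribed by the present description.

```python
import math

def average_ranks(values):
    """Return 1-based ranks of the values, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks

def wmw_p_value(reference, investigated):
    """Two-sided WMW p-value (normal approximation, assumed for illustration)."""
    n1, n2 = len(reference), len(investigated)
    if n1 == 0 or n2 == 0:
        raise ValueError("both sets must be non-empty")
    ranks = average_ranks(list(reference) + list(investigated))
    # U statistic of the reference set, from its rank sum.
    u_stat = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    std_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u_stat - mean_u) / std_u
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))

def reject_h0(p_value, gamma):
    """Assumed decision rule: H0 is rejected when the p-value does not exceed gamma."""
    return p_value <= gamma
```

In this sketch, two sets of RI values drawn from clearly separated ranges yield a p-value close to zero, whereas two identical sets yield a p-value of 1.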
In an embodiment of the present invention, according to any one of the embodiments herein described, the cloud of reference (CR) is a cube containing ε³ voxels assumed to belong to the subcellular structure of interest, chosen among the cubes obtained by centering the cell 3D Refractive Index Tomogram in its Lx×Ly×Lz array and dividing it into distinct cubes, each of which has an edge measuring ε pixels.
In a further embodiment, according to any one of the embodiments herein disclosed, said investigated clouds (CI) are cubes completely contained within the cell shell of the 3D Refractive Index Tomogram of the analyzed cell centered in its Lx×Ly×Lz array, and divided into distinct cubes, each of which has an edge measuring ε pixels. The investigated cubes do not comprise the reference cube.
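As a minimal sketch of the cube partition just described, the following code tiles an Lx×Ly×Lz array into distinct cubes of edge ε, locates the central cube, and lists the cubes entirely contained within the cell shell, excluding the reference cube. The representation of the cell shell as a nested list of 0/1 values (`cell_mask`) and all function names are hypothetical choices made only for illustration.

```python
from itertools import product

def cube_origins(shape, eps):
    """Origins (x, y, z) of the distinct eps-edge cubes tiling the array."""
    if any(side % eps for side in shape):
        raise ValueError("each array side must be a multiple of eps")
    return list(product(*(range(0, side, eps) for side in shape)))

def central_cube_origin(shape, eps):
    """Origin of the cube at the array centre (requires an odd number of cubes per side)."""
    counts = [side // eps for side in shape]
    if any(side % eps or count % 2 == 0 for side, count in zip(shape, counts)):
        raise ValueError("each side divided by eps must be an odd integer")
    return tuple((count // 2) * eps for count in counts)

def cube_inside_shell(cell_mask, origin, eps):
    """True if every voxel of the cube belongs to the cell (mask value 1)."""
    x0, y0, z0 = origin
    return all(
        cell_mask[x][y][z]
        for x in range(x0, x0 + eps)
        for y in range(y0, y0 + eps)
        for z in range(z0, z0 + eps)
    )

def investigated_cubes(cell_mask, shape, eps, reference_origin):
    """Cubes fully inside the cell shell, excluding the reference cube."""
    return [
        origin
        for origin in cube_origins(shape, eps)
        if origin != reference_origin and cube_inside_shell(cell_mask, origin, eps)
    ]
```

For instance, a 6×6×6 array with ε=2 is tiled by 27 cubes, and its central cube has origin (2, 2, 2).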
Furthermore, in an embodiment said WMW test is carried out by computing the p-value of each of said investigated cubes (CI) with respect to said reference cube (CR). A variable threshold TP is then obtained from these p-values as the maximum value less than or equal to τ such that at least one CI has a p-value greater than or equal to TP.
τ is an upper bound for the TP threshold. It can preferably be set to 0.99.
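The definition of the variable threshold TP given above admits a closed form: the largest value not exceeding τ for which at least one investigated cube has a p-value greater than or equal to it is the smaller of τ and the largest p-value. A minimal sketch follows; the function names are hypothetical.

```python
def variable_threshold(p_values, tau):
    """T_P = min(tau, max p-value): the largest value <= tau reached by at least one cube."""
    if not p_values:
        raise ValueError("at least one investigated cube is required")
    return min(tau, max(p_values))

def cubes_passing_threshold(p_values, tau):
    """Indices of the investigated cubes whose p-value is >= T_P."""
    t_p = variable_threshold(p_values, tau)
    return [index for index, p in enumerate(p_values) if p >= t_p]
```

With this rule at least one investigated cube always passes the threshold, even when no p-value reaches τ.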
In another embodiment of the invention, in accordance with any one of the embodiments herein disclosed, said grouping step v) is performed through repeated M-iterations loops, thus creating a preliminary subcellular structure set NP. M is an integer number. It can preferably be set to 10. Each M-iterations loop comprises the following steps:
In a further embodiment of this invention, according to any one of the embodiments herein disclosed, said step iv) of removing statistical outliers comprises the following steps in order to delete outlier cubes from the preliminary subcellular structure set P, thus creating a filtered subcellular structure set F:
Let be a cube within P, with i=1,2, . . . , n.
Both vectors
where & is the logical and operator, diSF and piS are elements of vectors
In an embodiment, according to any one of the embodiments herein disclosed, said step viii) further comprises the following steps in order to transform the filtered subcellular structure set F into a refined subcellular structure set R.
The parameters α and β are multiplicative coefficients with values between 0 and 1. They can preferably be set to 0.9 and 0.5, respectively. In various embodiments, the described method exploits multiple WMW tests, in some of which the reference set is randomly drawn from a larger one; therefore, two slightly different results are obtained if the method is repeated twice on the same cell. Repeating steps iii) to vii) K times makes it possible to create a tomogram of occurrences as the sum of all the K partial subcellular structure sets, wherein each voxel can take integer values k∈[0, K], since each voxel may have been classified as the subcellular structure k times.
An adaptive threshold k* is set to segment the tomogram of occurrences, thus obtaining the final 3D subcellular structure as the group of voxels that have been classified as the subcellular structure at least k* times.
with k=1,2, . . . , K.
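The construction of the tomogram of occurrences and its segmentation can be sketched as follows. Each of the K partial subcellular structure sets is assumed here to be given as a flattened binary volume (a list of 0/1 values of equal length). The rule that determines the adaptive threshold k* is not reproduced in this sketch, so k* is simply passed as an input; the function names are hypothetical.

```python
def tomogram_of_occurrences(partial_sets):
    """Voxel-wise sum of the K binary volumes; each voxel takes a value in [0, K]."""
    if not partial_sets:
        raise ValueError("at least one partial set is required")
    size = len(partial_sets[0])
    if any(len(volume) != size for volume in partial_sets):
        raise ValueError("all partial sets must have the same number of voxels")
    return [sum(voxel_values) for voxel_values in zip(*partial_sets)]

def segment_by_occurrences(occurrences, k_star):
    """Keep the voxels classified as the subcellular structure at least k_star times."""
    return [1 if count >= k_star else 0 for count in occurrences]
```

As an example, with K=3 partial sets a voxel selected in two of the three repetitions is retained when k*=2 and discarded when k*=3.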
In a further embodiment, according to any one of the embodiments herein disclosed, said resolution factor ε parameter must be an even number and, after dividing each side of the 3D array by ε, an odd number must be obtained. A resolution factor ε=10 px is preferable since it is an optimum compromise between the need of having both high resolution in nucleus segmentation and high statistical power in WMW test. Anyway, in the case of tomograms with lower resolution, it can also be reduced, and all the other parameters change accordingly. However, a resolution factor ε greater than 5 px is suggested in order to avoid a low statistical power in WMW test.
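The constraints on the resolution factor ε stated above can be checked before running the segmentation. The following sketch reports violated constraints and, as a convenience that is an assumption of this example rather than part of the described method, computes the smallest array side that satisfies them for a given ε. The names `check_resolution_factor` and `padded_side` are hypothetical.

```python
def check_resolution_factor(shape, eps):
    """Return a list of messages for the constraints on eps that are not met."""
    messages = []
    if eps % 2 != 0:
        messages.append("eps must be an even number")
    for axis, side in enumerate(shape):
        if side % eps != 0:
            messages.append(f"side {axis} ({side}) is not a multiple of eps")
        elif (side // eps) % 2 == 0:
            messages.append(f"side {axis} ({side}) divided by eps must be odd")
    if eps <= 5:
        messages.append("warning: eps greater than 5 px is suggested for WMW power")
    return messages

def padded_side(side, eps):
    """Smallest length >= side that is a multiple of eps with an odd quotient."""
    quotient = -(-side // eps)  # integer ceiling of side / eps
    if quotient % 2 == 0:
        quotient += 1
    return quotient * eps
```

For example, with ε=10 a side of 95 voxels would be enlarged to 110 voxels (11 cubes per side), so that a central cube exists.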
In another embodiment of the present invention, said K parameter is a number greater than or equal to 10. It can be preferably set to 20, since for too high values only the computational time increases, but there is no appreciable improvement in segmentation performances.
Furthermore, in an embodiment, said M parameter is a number from 5 to 15, preferably 10.
In an embodiment, said τ parameter is a number less than or equal to 0.99, preferably 0.99.
In another embodiment, said α parameter is a number from 0 to 1, preferably 0.9.
In another embodiment, said β parameter is a number from 0 to 1, preferably 0.5.
In a further embodiment, in accordance with any one of the embodiments herein disclosed, said resolution factor ε parameter is 10 px, said K parameter is 20, said M parameter is 10, said τ parameter is 0.99, said α parameter is 0.9 and said β parameter is 0.5.
In a further embodiment of the present invention, in accordance with any one of the embodiments herein disclosed, said cyto-tomographic technique is a flow cyto-tomography.
In an embodiment, said subcellular structure is selected from the following ones: nucleus, mitochondria, rough endoplasmic reticulum, smooth endoplasmic reticulum, Golgi apparatus, peroxisome, lysosome, centrosome, centriole, cell membrane, cytoplasm, lipid droplets, nucleolus.
In a preferred embodiment, said subcellular structure is the nucleus, whose reference cloud for the majority of cell types is the central cube.
In an embodiment, according to any one of the embodiments herein described, the method further comprises a step of analysing a cell by a cyto-tomographic technique before step i).
In a preferred embodiment, said step of analysing a cell by a cyto-tomographic technique comprises the following steps:
In another embodiment, according to any one of the embodiments herein disclosed, the method further comprises a step of culturing said cells to be analysed in a culture medium before step i) and before the step of injecting said cell into a microfluidic channel or device.
Said culture medium is selected from the following ones: RPMI 1640 (Sigma Aldrich), mammary epithelial cell growth medium (MEGM SingleQuots, Lonza Clonetics), Minimum Essential Medium (MEM Gibco-21090-022).
In an embodiment, said culture medium is further supplemented with fetal bovine serum in a concentration ranging from X to Y, preferably 10% by weight, L-glutamine in a concentration ranging from X to Y, preferably 2 mM, penicillin in a concentration ranging from X to Y, preferably 100 U ml−1, streptomycin in a concentration ranging from X to Y, preferably 100 μg ml−1.
In another embodiment of the present invention, said step c) of processing of holographic data to retrieve the 3D RI tomogram of said cell is performed by recovering the rolling angles from the y-positions (y is the flow axis) of said cell within the imaged field of view. Let N be the number of digital holograms (i.e. frames) of said cell collected within the field of view. Let ϑ1=0° be the rolling angle of the first frame (k=1) and let ψ be a known rotation angle of said cell with respect to the first frame. The rolling angles recovery method is performed through the following steps:
where k=1, . . . , N is the frame index.
In a further embodiment, said step a) of Computing a Phase Image Similarity Metric, is performed by using the Tamura Similarity Index (TSI), based on the local contrast image calculated by the Tamura coefficient or any other numerical criterion useful for the same purpose.
TSI is a Phase Image Similarity Metric based on local contrast measurements through the Tamura Coefficient (TC), which is the square root of the ratio between the standard deviation and the average value of a signal. In particular, let QPM(k) be the N×M quantitative phase map at the k-th frame, and let Si,j(k) be a 3×3 patch within QPM(k), centred on the pixel of coordinates (i,j), with i=2, . . . , N−1 and j=2, . . . , M−1. Denoting by σ[·] and μ[·] the standard deviation and the average value of a patch, each pixel (i,j) within QPM(k) is substituted with the TC of the Si,j(k) patch, thus obtaining the local contrast image (LCI), whose generic element is
$$\mathrm{LCI}_{i,j}(k) = \mathrm{TC}\left[S_{i,j}(k)\right] = \sqrt{\frac{\sigma\left[S_{i,j}(k)\right]}{\mu\left[S_{i,j}(k)\right]}}$$
with i=2, . . . , N−1, j=2, . . . , M−1 and k=1, . . . , K. Finally, the TSI is obtained as follows
where ./ denotes an elementwise division.
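The local contrast image on which the TSI is based can be sketched as follows, using the definition of the Tamura Coefficient given above (square root of the ratio between the standard deviation and the average value). The sketch covers only the TC and the LCI and does not reproduce the TSI itself. The use of the population standard deviation, the value 0 returned for patches with non-positive average, and the function names are assumptions introduced for illustration.

```python
import math

def tamura_coefficient(values):
    """Square root of (standard deviation / average) of a set of values."""
    mean = sum(values) / len(values)
    if mean <= 0:
        # Assumption of this sketch: undefined contrast is reported as 0.
        return 0.0
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(math.sqrt(variance) / mean)

def local_contrast_image(qpm):
    """Replace each interior pixel of a 2D phase map with the TC of its 3x3 patch."""
    rows, cols = len(qpm), len(qpm[0])
    lci = [[0.0] * (cols - 2) for _ in range(rows - 2)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            patch = [qpm[a][b] for a in (i - 1, i, i + 1) for b in (j - 1, j, j + 1)]
            lci[i - 1][j - 1] = tamura_coefficient(patch)
    return lci
```

The border pixels, for which a full 3×3 patch is not available, are omitted, so an N×M phase map yields an (N−2)×(M−2) local contrast image, consistently with the index ranges i=2, . . . , N−1 and j=2, . . . , M−1.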
According to any one of the embodiments herein disclosed, said step b) of generating a 1D pointwise curve namely f104 , is performed by recovering the global minimum of the TSI or any other numerical criterion useful for the same purpose.
In another embodiment, said step c) of computing the unknown rolling angles is performed by defining ψ as a number such that ψ=Q×180° where Q can be in the set {1,2,3,4}, preferably Q=1.
Furthermore, in an embodiment, said TSI is calculated between pairs consisting of the QPM obtained from the first frame and all the other QPMs, flipped in the y direction if Q∈{1,3}.
In an embodiment, according to any one of the embodiments previously described, said step c) of computing the unknown rolling angles is performed by using any tomographic reconstruction algorithm, preferably the Learning Tomography method.
According to any one of the embodiments herein described, said identification of a subcellular structure consists of the full statistical characterization of the RI distribution (central moments) and the full 3D morphometric analysis, along with the dry mass and dry mass density, of said subcellular structure.
A computer program comprising instructions which, when the program is executed by a computer, cause the computer to carry out the method described in the present description also forms part of the present invention.
A further object of the present invention is a computer-readable data carrier having stored thereon the computer program described by any one of the embodiments herein disclosed.
In an embodiment, said computer-readable data carrier is in the form of a USB pen-drive, an external hard disk (HDD), a compact disc (CD), or a digital versatile disc (DVD).
Another object of the present invention is an apparatus suitable for carrying out tomographic analysis on a cell or on a group of cells, comprising a data processing device configured to execute the computer program herein described, and/or the computer-readable data carrier according to the previous embodiments.
In an embodiment, said apparatus is a flow cytometer comprising a microfluidic module. In a further embodiment, said apparatus comprises a light source to illuminate the cells flowing in said microfluidic module, having appropriate coherence features such that an interference pattern, preferably a digital hologram, can be recorded on a digital camera, and a microscope objective to image the cells with the appropriate resolution and magnification and to project said in-focus or out-of-focus images onto the sensor of the camera.
In an embodiment according to the present description, said light source can be selected among any source having appropriate coherence, preferably lasers, diode pumped solid state lasers, and Light Emitting Diodes (LEDs).
In another embodiment, said light source can be selected among any wavelength in the visible range or any other regions of the electromagnetic spectrum including X-ray regions.
Furthermore, in an embodiment of the present invention, the polarization state of said light source can be selected among any possible polarization state. The polarization is the phenomenon in which waves of light or other radiation are restricted in direction of vibration.
In an embodiment of the present invention, according to any of the embodiments herein disclosed, said apparatus is used with a single illumination source or with multiple sources having the same wavelength or multiple wavelengths.
In an additional embodiment, said apparatus is used with a single polarization state or with multiple sources having multiple polarization states.
In an embodiment of the present description, said apparatus is characterized in that the imaged Field of View of the digital hologram is such that the flowing cell undergoes a sufficient rotation while it travels inside said field of view for a useful tomogram to be retrieved. The Field of View of the digital hologram is the interference area imaged by the sensor of the camera.
In an embodiment, said apparatus is further characterized in that the acquisition frame rate of the camera is fast enough with respect to the longitudinal and angular cell velocity to record the cells at different rotating positions with sufficient angular resolution to retrieve a useful tomogram of each cell.
Additionally, in an embodiment of the present invention, said microfluidic module of said apparatus is engineered or customized and operated in such a way as to guarantee that the flowing cells rotate along the microfluidic channel in a manner that ensures the recovery of the tomogram.
In another embodiment according to the present description, said microfluidic module has channel dimensions selected according to the size of the object and the flow properties in order to guarantee rotation around a single axis.
In a further embodiment, said microfluidic module is a multi-channel microfluidic module with parallel channels in which the cells flow, thus allowing the simultaneous recording of a large number of cells and, accordingly, high-throughput operation.
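The relationship between field of view, cell velocities and camera frame rate can be illustrated by a simple calculation. The sketch below assumes, purely for illustration, that the cell travels at constant longitudinal speed and rotates at constant angular speed; the function name, the parameter names and any numerical values used with it are hypothetical and do not describe a specific embodiment.

```python
def acquisition_summary(fov_length_um, flow_speed_um_s, angular_speed_deg_s, frame_rate_hz):
    """Frames recorded, total rotation and angular step for a cell crossing the field of view."""
    if min(fov_length_um, flow_speed_um_s, frame_rate_hz) <= 0:
        raise ValueError("lengths, speeds and frame rate must be positive")
    transit_time_s = fov_length_um / flow_speed_um_s
    return {
        # Number of holograms recorded while the cell is inside the field of view.
        "frames": int(transit_time_s * frame_rate_hz),
        # Rotation accumulated by the cell during the transit.
        "total_rotation_deg": angular_speed_deg_s * transit_time_s,
        # Rotation between two consecutive frames (angular resolution).
        "angular_step_deg": angular_speed_deg_s / frame_rate_hz,
    }
```

Under these assumptions, a longer field of view or a slower flow increases the total rotation covered, while a higher frame rate reduces the angular step between consecutive quantitative phase maps.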
Furthermore, in an embodiment of the present invention according to any one of the embodiments herein disclosed, the apparatus is used in a diagnostic liquid biopsy method for the detection and classification of cells in a biological fluid sample.
In an embodiment, said biological fluid sample is selected from the following ones: blood, urine, cerebrospinal fluid, saliva, tear fluid.
In an additional embodiment, the objects analysed by the apparatus can be any type of living object (cells, diatoms, microorganisms, small live animals) or inanimate object (particles, pollen grains, etc.).
Advantageously, said apparatus is very useful for the search for and identification of circulating tumor cells (CTC), and more generally for the recognition and classification of cells in fluids in the human, zootechnical and plant fields, as well as for pathology diagnostics, clinical studies and/or pharmacological tests.
In compliance with Art. 170bis of the Italian patent law it is herein declared that:
In any part of the present description and claims, the term “comprising” can be substituted by the term “consisting of”.
Examples are reported below for the purpose of better illustrating the methodologies disclosed in the present description; such examples are in no way to be considered as a limitation of the previous description and the subsequent claims.
To reconstruct the 3D RI distribution at the single-cell level in-flow, a DH setup in microscope configuration has been used, as sketched in
In order to validate the novel CSSI method, it has been preliminarily tested and assessed on a 3D numerical cell-phantom simulation. It has then been shown that the experimental results were consistent with confocal fluorescence microscopy data and microfluidic cyto-fluorimeter outputs. To this aim, 3D RI tomograms of five human neuroblastoma cancer cells (SK-N-SH cell line) and three human breast cancer cells (MCF-7 cell line) have been reconstructed through the 3D learning flow cyto-tomography technique.
A numerical 3D cell phantom has been modeled and simulated (see the Materials and Methods section). As reported in
In fact, it is confirmed by the 2D images of SK-N-SH cells recorded through a FM cyto-fluorimeter (see the Materials and Methods section), by the 3D morphological parameters reported in the literature for MCF-7 cells imaged through a confocal microscope, and more generally by the increase of the nucleus-cytoplasm ratio demonstrated in cancer cells. The CSSI method is based on the Wilcoxon-Mann-Whitney (WMW) test, a statistical test used to reject or not reject the hypothesis that a test set has been drawn from the same distribution as a designated reference set. In particular, the steps depicted in the scheme in
Notice that, whenever the WMW test is used, the reference set is randomly selected from the last estimation of the nucleus cluster until that moment, to match its dimensionality with that of the test set, thus preserving the fairness of the statistical test. Due to this random selection, by repeating several times the described steps, at each iteration j=1,2, . . . , K it is possible to obtain a slightly different estimation of the nucleus-like region. The output of each iteration is a binary 3D volume whose non-null values correspond to the voxels associated with the nucleus. Therefore, the sum of all the K outputs provides a tomogram of occurrences, from which the probability that a voxel belongs to the nucleus can be inferred through a normalization operation. Finally, the nucleus-like region is identified by a suitable probability threshold. A more detailed description of the CSSI algorithm is reported in the Materials and Methods section.
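The aggregation of the K iteration outputs described above can be sketched as follows. This is only a minimal illustration, not the implementation used in the experiments: each binary output is assumed to be given as a set of (x, y, z) coordinates of its non-null voxels, and the function names and the threshold value are hypothetical.

```python
from collections import Counter


def nucleus_probability(outputs):
    """Return, for each voxel, the fraction of iterations that labelled it nucleus.

    `outputs` is a list of K iteration results, each a set of (x, y, z)
    coordinates (assumed representation of the non-null voxels of a binary volume).
    """
    k = len(outputs)
    occurrences = Counter()
    for voxels in outputs:
        # Each voxel is counted once per iteration, since every output is a set.
        occurrences.update(voxels)
    # Normalizing the tomogram of occurrences by K gives a probability estimate.
    return {voxel: count / k for voxel, count in occurrences.items()}


def segment_nucleus(outputs, threshold):
    """Keep the voxels whose estimated probability reaches the threshold."""
    probability = nucleus_probability(outputs)
    return {voxel for voxel, p in probability.items() if p >= threshold}


# Toy example with K = 4 iterations (illustrative data only).
outputs = [
    {(0, 0, 0), (0, 0, 1), (0, 1, 1)},
    {(0, 0, 0), (0, 0, 1)},
    {(0, 0, 0), (1, 1, 1)},
    {(0, 0, 0), (0, 0, 1)},
]
probability = nucleus_probability(outputs)
nucleus = segment_nucleus(outputs, threshold=0.5)  # hypothetical threshold
```

In this toy case, only the two voxels selected in at least half of the iterations are retained in the nucleus-like region.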
On the left in
Therefore, accuracy is the percentage of voxels correctly classified, sensitivity is the percentage of voxels correctly classified nucleus with respect to the number of nucleus voxels, and specificity is the percentage of voxels correctly classified non-nucleus with respect to the number of non-nucleus voxels. The visual comparison in
Direct experimental validation of the proposed CSSI method is very difficult in flow cytometry and too prone to errors that could alter the reliability of the experiment, as the cells should be stained and recorded simultaneously by both the holographic and the fluorescent channels. Therefore, an indirect experimental assessment of the CSSI has been performed by retrieving the stain-free nuclei for two distinct tumor cell lines starting from the 3D learning flow cyto-tomograms. In the following, the experimental results are exploited to discuss the consistency between the Inventor's approach presented here and the classical FM methods (see the two thickest dashed lines in
The proposed CSSI method has been used to retrieve the 3D nucleus-like regions from five stain-free SK-N-SH cells, reconstructed by 3D learning flow cyto-tomography. The isolevels representation of an SK-N-SH cell is shown in
In order to experimentally assess the validity of the 3D segmentation technique, the segmented 3D ODT reconstruction has been digitally projected back to 2D where the experimental 2D FM images are available for comparison. In particular, the segmented RI tomogram is digitally rotated from 0° to 150° with 30° angular step around x-, y- and z-axes, and then its silhouettes along the z-, x- and y-axes, respectively, are considered to create 2D ODT segmented projections, as sketched in
For the second experimental assessment, three stain-free MCF-7 cells have been reconstructed by 3D learning flow cyto-tomography and then segmented by the CSSI method, as shown in the example in
In this case, the experimental assessment is based on a quantitative comparison with the 3D morphological parameters measured in Wen, Y. et al., in which a confocal microscope has been employed to find differences between viable and apoptotic MCF-7 cells through the extraction of 3D morphological features. In that study, 206 suspended cells were stained with three fluorescent dyes in order to measure average values and standard deviations of the 3D morphological parameters of the overall cell and of its nucleus and mitochondria. A concise description of the 3D nucleus size, shape, and position is given by the nucleus-cell volume ratio (NCVR), the nucleus surface-volume ratio (NSVR), and the normalized nucleus-cell centroid distance (NNCCD), respectively. In particular, in this case, the nucleus-cell centroid distance refers to 3D centroids and has been normalized with respect to the radius of a sphere having the same volume as the cell, thus obtaining the NNCCD. Moreover, it is worth underlining that NCVR and NSVR are direct measurements reported in Wen, Y. et al., while NNCCD is an indirect measurement, since it has been computed from the direct ones in Wen, Y. et al. In the 2D scatter plots in
MCF-7 cells were cultured in RPMI 1640 (Sigma Aldrich) supplemented with 10% fetal bovine serum, 2 mM L-glutamine and 100 U ml−1 penicillin, 100 μg ml−1 streptomycin. MCF-10A cells were cultured in mammary epithelial cell growth medium (MEGM SingleQuots, Lonza Clonetics) at 37° C. in a CO2 atmosphere. Subsequently, they were harvested from the Petri dish by incubation with a 0.05% trypsin-EDTA solution (Sigma, St. Louis, MO) for 5 min. The cells were then centrifuged for 5 min at 1500 rpm, resuspended in complete medium and injected into the microfluidic channel.
SK-N-SH cells were cultured in Minimum Essential Medium (MEM) (Gibco-21090-022) supplemented with 10% fetal bovine serum, 2 mM L-glutamine and 100 U ml−1 penicillin, 100 μg ml−1 streptomycin at 37° C. in a CO2 atmosphere. Subsequently, they were harvested from the Petri dish by incubation with a 0.05% trypsin-EDTA solution (Sigma, St. Louis, MO) for 5 min. For in-flow studies, after centrifugation, the cells were resuspended in complete medium and injected into the microfluidic channel at a final concentration of 4×10^5 cells/ml.
The light beam generated by the laser (Laser Quantum-Torus, emitting at a wavelength of 532 nm) is coupled into an optical fiber, which splits it into object and reference beams in order to constitute a Mach-Zehnder interferometer in off-axis configuration. The object beam exits from the fiber and is collimated to probe the biological sample that flows at 7 nL/s along a commercial microfluidic channel with a cross section of 200 μm×200 μm (Microfluidic Chip-Shop). The flux velocity is controlled by a pumping system (CETONI-neMESYS) that ensures temporal stability of the parabolic velocity profile in the microchannel. The wavefield passing through the sample is collected by the Microscope Objective (Zeiss 40× oil immersion, 1.3 numerical aperture) and directed to the 2048×2048 CMOS camera (USB 3.0 U-eye, from IDS) by means of a Beam-Splitter that allows the interference with the reference beam. The interference patterns of the single cells rotating within a 170 μm×170 μm Field of View (FOV) are recorded at 35 fps. The microfluidic properties ensure that cells flow along the y-axis and continuously rotate around the x-axis. Each hologram of the recorded sequence is demodulated by extracting the real diffraction order through a band-pass filter, because of the off-axis configuration [52]. Then, a holographic tracking algorithm [53] is used to estimate the 3D positions of the flowing cells along the microfluidic channel. It consists of two successive steps. The first one is the axial z-localization, in which the hologram is numerically propagated at different z-positions through the Angular Spectrum formula [54], and, for each of them, the Tamura Coefficient (TC) [53] is computed on the region of interest (ROI) containing the cell within the amplitude of the reconstructed complex wavefront. By minimizing this contrast-based metric, the cell z-position in each frame can be recovered, and the cell can be refocused.
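The contrast-based focus search of the z-localization step can be sketched as follows. This is a hedged illustration only: the numerical propagation is not shown, the amplitude ROIs at the candidate z-positions are assumed to be already available as nested lists, the TC is taken in its commonly used form (square root of the ratio between standard deviation and mean), and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def tamura_coefficient(amplitude_roi):
    """Tamura coefficient of a 2D amplitude ROI: sqrt(std / mean) (assumed form)."""
    values = [value for row in amplitude_roi for value in row]
    return sqrt(pstdev(values) / mean(values))


def best_focus(z_positions, amplitude_rois):
    """Return the z-position whose amplitude ROI minimizes the TC, plus all scores."""
    scores = [tamura_coefficient(roi) for roi in amplitude_rois]
    index = min(range(len(scores)), key=scores.__getitem__)
    return z_positions[index], scores


# Toy stack of three refocused amplitude ROIs (illustrative values only).
z_positions = [-10.0, 0.0, 10.0]  # hypothetical propagation distances, in micrometres
amplitude_rois = [
    [[1.0, 3.0], [1.0, 3.0]],  # strong amplitude contrast (out of focus)
    [[2.0, 2.0], [2.0, 2.0]],  # flat amplitude (in focus for a phase object)
    [[1.5, 2.5], [1.5, 2.5]],  # intermediate contrast
]
z_focus, scores = best_focus(z_positions, amplitude_rois)
```

In practice, each ROI would be cropped from the amplitude of the wavefront propagated to the corresponding z-position.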
After computing the in-focus complex wavefront, the corresponding QPM is obtained by performing the phase unwrapping algorithm [55]. The second step of the holographic tracking is the transversal xy-localization, which is obtained by computing the weighted centroid of the cell in its QPM [53]. The 3D holographic tracking allows centering each cell in all the QPM-ROIs of its recorded rolling sequence, thus avoiding motion artefacts in the subsequent tomographic reconstruction. Moreover, the y-positions can be exploited to estimate the unknown rolling angles [46]. A phase image similarity metric, namely the Tamura Similarity Index (TSI), based on the evaluation of the local contrast by TC, is computed on all the QPMs of the rolling cell. It has been demonstrated to reach its minimum at the frame f180, at which a rotation of 180° with respect to the first frame of the sequence has occurred. Thanks to the above-mentioned microfluidic properties and to the high recording frame rate, a linear relationship between the angular and the translational speeds can be assumed. Hence, all the N unknown rolling angles are computed as follows
where k=1, . . . , N is the frame index. Finally, the tomographic reconstruction is performed by the Filtered Back Projection (FBP) algorithm.
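As an illustration of the angle recovery, the sketch below assumes that the linear relationship between angular and translational speeds takes the form θk = 180°·(yk − y1)/(yf180 − y1), i.e. the angle grows proportionally to the y-displacement and equals 180° at frame f180. This specific form is an assumption made here for illustration, and the function name is hypothetical.

```python
def rolling_angles(y_positions, f180):
    """Estimate the rolling angle (in degrees) of every frame from its y-position.

    `y_positions[k - 1]` is the y-coordinate of the cell in frame k, and `f180`
    is the 1-based index of the frame at which a 180 degree rotation has occurred.
    Assumes angle proportional to y-displacement (linearized model).
    """
    y_first = y_positions[0]
    span = y_positions[f180 - 1] - y_first
    if span == 0:
        raise ValueError("the cell has not moved between frame 1 and frame f180")
    return [180.0 * (y - y_first) / span for y in y_positions]


# Toy sequence: the cell advances 2 micrometres per frame and has turned
# by 180 degrees at the fifth frame (illustrative values only).
y_positions = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
angles = rolling_angles(y_positions, f180=5)
```

The resulting angles would then be paired with the centered QPMs and passed to the tomographic reconstruction.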
Learning tomography (LT) is an iterative reconstruction algorithm based on a nonlinear forward model, the beam propagation method (BPM), to capture high orders of scattering. Using the BPM, the incident illumination is propagated through an initial guess obtained by the inverse Radon transform, and the resulting field is compared with the experimentally recorded field. The error between the two fields is backpropagated to calculate the gradient. At each iteration of LT, the gradient calculation is repeated for 8 randomly selected rotation angles, and the corresponding gradients are rotated and summed to update the current solution. As an intermediate step, total variation regularization is employed. The total number of iterations is 200, with a step size of 0.00025 and a regularization parameter of 0.005.
In order to run LT, two electric fields are needed, namely the incident and the total electric fields. The amplitude of the incident field was estimated from the amplitude of the total electric field by low-pass filtering in the Fourier domain with a circular aperture whose radius is 0.176 k0, where k0=2π/λ is the wavenumber, given a wavelength λ=532 nm in a vacuum. In other words, we assume that the high-frequency information in the amplitude of the total electric field is attributable only to the light interference caused by the sample when it is illuminated by a beam with slowly varying amplitude.
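As a worked example of the aperture size, the sketch below evaluates the cut-off radius numerically. It assumes that k0 denotes the standard vacuum wavenumber 2π/λ and that spatial frequencies are expressed as angular frequencies in rad/μm; the helper name is hypothetical.

```python
from math import hypot, pi

WAVELENGTH_UM = 0.532  # vacuum wavelength, 532 nm expressed in micrometres

# Vacuum wavenumber (assumed definition: k0 = 2*pi/lambda), in rad/um.
k0 = 2 * pi / WAVELENGTH_UM

# Radius of the circular low-pass aperture in the Fourier domain, in rad/um.
cutoff = 0.176 * k0


def in_aperture(kx, ky, radius=cutoff):
    """True if the angular spatial frequency (kx, ky) lies inside the aperture."""
    return hypot(kx, ky) <= radius


print(f"k0 = {k0:.3f} rad/um, aperture radius = {cutoff:.3f} rad/um")
```

Under these assumptions, k0 is about 11.81 rad/μm and the aperture radius is about 2.08 rad/μm; only the Fourier components of the amplitude inside this circle would be retained to estimate the incident field.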
In order to record thousands of 2D FM images, a commercial multispectral flow cyto-fluorimeter has been employed, i.e. Amnis ImageStreamX®. Cells are hydrodynamically focused within a micro-channel, and then they are probed both by a transversal brightfield light source and by orthogonal lasers. The fluorescence emissions and the light scattered and transmitted from the cells are collected by an objective lens. After passing through a spectral decomposition element, the collected light is divided into multiple beams at different angles according to their spectral bands. The separated light beams propagate up to 6 different physical locations of one of the two CCD cameras (256 rows of pixels), which operates in time operation. Therefore, the image of each single flowing cell is decomposed into 6 separate sub-images on each of the two CCD cameras, based on their spectral band, thus allowing the simultaneous acquisition of up to 12 images of the same cell, including brightfield, scatter, and multiple fluorescent images. Hence, Amnis ImageStreamX® combines the single-cell analysis of standard FM with the statistical significance due to the large number of samples provided by standard flow cytometry.
The Amnis ImageStreamX® allows selecting the magnification of the Microscope Objective (MO) among 20×, 40× and 60×, and then the Field of View (FoV), Pixel Size (PIX), Depth of Field (DoF), Numerical Aperture (NA) and Core Velocity (CV) change accordingly. In this experiment a 60× MO has been set, thus resulting in FoV=40 μm, PIX=0.33 μm, DoF=2.5 μm, NA=0.9 and CV=40 mm/s. With these settings, 1280 SK-N-SH cells were observed. For each of them, two simultaneous images have been recorded, i.e. a brightfield image of the flowing cell and its corresponding FM image with the stained nucleus. To segment the nucleus, a global threshold is applied to the FM signal by the associated software. Three of the recorded brightfield and fluorescent images are shown at the top and at the bottom of
In Wen, Y. et al., a confocal microscope has been employed to find differences between viable and apoptotic MCF-7 cells through the extraction of 3D morphological features. In particular, 206 cells were stained with three fluorescent dyes in order to measure the average value and standard deviation of the 3D morphological parameters of the overall cell and of its nucleus and mitochondria. We exploit these measurements to simulate a 3D numerical cell phantom, by setting 1 px=0.12 μm. It is made of four sub-cellular structures, i.e. cell membrane, cytoplasm, nucleus and mitochondria. We shape the cell, nucleus and mitochondria as ellipsoids, then we make the external cell surface irregular, and finally we obtain the cytoplasm through a morphological erosion of the cell shape. Moreover, in each simulation, the number of mitochondria is drawn from a uniform distribution U1{a1, b1}. A 3D numerical cell phantom is displayed in
In order to describe the steps of the proposed CSSI algorithm, sketched in
The ε parameter is the resolution factor at which the 3D array is first analyzed. It must be an even number and, after dividing each side of the 3D array by ε, an odd number must be obtained. Therefore, each distinct cube contains ε^3 voxels (i.e. RI values). The cubes completely contained within the cell shell are the investigated cubes CI (dark gray cubes within the black cell shell in
As discussed, the CSSI algorithm is based on the WMW test. It is a rank-based non-parametric statistical test, thus the distributions do not have to be normal. With a certain significance level γ, it allows accepting or rejecting the null hypothesis H0 that two sets of values have been drawn from the same distribution. An important parameter in a statistical test is the p-value, which ranges from 0 to 1. In fact, if p-value≥γ, H0 is not rejected with significance level γ, while if p-value<γ, H0 is rejected with significance level γ. Therefore, the greater the p-value, the weaker the evidence against the hypothesis that the two sets of values have been extracted from the same population. Hence, our algorithm performs multiple comparisons between the investigated cubes CI and the reference cube CR through the WMW test because, as discussed above, we are assuming that the CR voxels belong to the nucleus; thus, if a certain CI has been drawn from the same distribution, then the CI voxels also belong to the nucleus.
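The decision rule above can be illustrated with a minimal, self-contained sketch of a two-sided WMW test. It is a simplified stand-in, not the implementation used in the experiments: the p-value is obtained from the normal approximation of the U statistic without tie or continuity correction, the significance level γ=0.05 is a hypothetical choice, and the RI values are synthetic.

```python
from math import erfc, sqrt


def average_ranks(values):
    """1-based ranks of `values`, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for position in range(i, j + 1):
            ranks[order[position]] = shared_rank
        i = j + 1
    return ranks


def wmw_p_value(test_set, reference_set):
    """Two-sided WMW p-value via the normal approximation (no tie correction)."""
    n1, n2 = len(test_set), len(reference_set)
    ranks = average_ranks(list(test_set) + list(reference_set))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    z = (u1 - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return erfc(abs(z) / sqrt(2))


GAMMA = 0.05  # hypothetical significance level

# Synthetic RI values: a reference cube CR and two investigated cubes CI.
reference = [1.3700 + 0.001 * i for i in range(30)]
similar_cube = [1.3705 + 0.001 * i for i in range(30)]     # overlaps the reference
different_cube = [1.3500 + 0.0005 * i for i in range(30)]  # lower RI values

p_similar = wmw_p_value(similar_cube, reference)
p_different = wmw_p_value(different_cube, reference)
is_nucleus_like = [p >= GAMMA for p in (p_similar, p_different)]
```

With these synthetic values, H0 is not rejected for the first cube and is rejected for the second one, so only the first would be associated with the nucleus.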
Both vectors
where & is the logical and operator, diSF and piS are elements of vectors
However, to build a filtered nucleus set F, a strong spatial and statistical filtering has been made, in order to store only cubes that belong to the nucleus with high probability, thus leading to a strong underestimation of the nucleus region. Moreover, to increase the statistical power of the WMW test, the resolution factor ε should not be too small. As a consequence, the ε-cubic structuring element leads to a low spatial resolution.
with k=1,2, . . . , K, as reported in
The 3D segmented nucleus-like region should be computed as the set of voxels that have occurred at least kopt times. The parameter kopt should maximize simultaneously the accuracy, sensitivity, and specificity of the proposed CSSI method. In
In
In
All the parameters involved in the proposed CSSI algorithm are described in Table S2. It is worth underlining that, in our experiments, a resolution factor ε=10 px has been set to analyse arrays made of at least 190×190×190 voxels, since it was an optimum compromise between the need of having both high resolution in nucleus segmentation and high statistical power in WMW test. Anyway, in the case of tomograms with lower resolution, it can also be reduced, and all the other parameters change accordingly. However, a resolution factor ε greater than 5 px is suggested in order to avoid a low statistical power in WMW test.
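The constraints on the resolution factor can be checked with a short helper such as the one sketched below. It reads the requirement stated in the description (ε even, and an odd number of cubes along each side) as also requiring that ε divides each side exactly, which is an interpretation made here; the function names and the default lower bound are hypothetical.

```python
def is_valid_resolution_factor(epsilon, shape):
    """Check that epsilon is even and yields an odd number of cubes per side.

    Assumes that each side of the 3D array must be an exact multiple of epsilon.
    """
    if epsilon <= 0 or epsilon % 2 != 0:
        return False
    return all(
        side % epsilon == 0 and (side // epsilon) % 2 == 1 for side in shape
    )


def candidate_resolution_factors(shape, minimum=6):
    """List the admissible factors greater than 5 px for a given array shape."""
    return [
        epsilon
        for epsilon in range(minimum, min(shape) + 1)
        if is_valid_resolution_factor(epsilon, shape)
    ]


shape = (190, 190, 190)
candidates = candidate_resolution_factors(shape)
print(candidates)
```

For a 190×190×190 array, ε=10 px satisfies both constraints, giving 19 cubes of 1000 voxels along each side.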
Number | Date | Country | Kind |
---|---|---|---|
102021000019490 | Jul 2021 | IT | national |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/IB2022/056625 | 7/19/2022 | WO | |