The technical field generally relates to methods and systems for holographic image reconstruction with phase recovery and autofocusing using a trained neural network. While the invention has particular application to phase recovery and image reconstruction for holographic images, the method may also be applied to other intensity-only measurements where phase recovery is needed.
Holography provides a powerful tool to image biological samples with minimal sample preparation, i.e., without the need for staining, fixation, or labeling. The past decades have seen impressive progress in the field of digital holography, especially in image reconstruction and quantitative phase imaging (QPI) methods, which also provide unique advantages over traditional microscopic imaging modalities by enabling field-portable and cost-effective microscopes for high-throughput imaging, biomedical, and sensing applications, among others. One core element in all of these holographic imaging systems is the phase recovery step, since an opto-electronic sensor array only records the intensity of the electromagnetic field impinging on the sensor plane. To retrieve the missing phase information of a sample, a wide range of phase retrieval algorithms have been developed; some of these existing algorithms follow a physical model of wave propagation and involve multiple iterations, typically between the hologram and object planes, to recover the missing phase information. Recently, deep learning-based phase retrieval algorithms have also been demonstrated to reconstruct a hologram using a trained neural network. These deep learning-based algorithms outperform conventional iterative phase recovery methods by creating speckle- and twin-image artifact-free object reconstructions in a single forward pass through a neural network (i.e., without iterations), and they provide additional advantages such as improved image reconstruction speed and extended depth-of-field (DOF), also enabling cross-modality image transformations, for example, matching the color and spatial contrast of brightfield microscopy in the reconstructed hologram.
Here, a new deep learning-based holographic image reconstruction and phase retrieval algorithm is disclosed that is based on a convolutional recurrent neural network (RNN), trained using a generative adversarial network (GAN). This recurrent holographic (RH) imaging framework uses multiple (M) input hologram images that are back-propagated using zero-phase onto a common axial plane to simultaneously perform autofocusing and phase retrieval at its output inference. The efficacy of this method, which is termed RH-M herein, was demonstrated by holographic imaging of human lung tissue sections. Furthermore, by enhancing RH-M with a dilated (D) convolution kernel, a variant termed RH-MD was developed, which performs phase recovery and autofocusing directly from the raw input holograms.
Different from other deep learning-based phase retrieval methods, this method enhances image reconstruction quality by incorporating multiple holograms, which encode the sample phase in their axial intensity differences. When compared with existing phase retrieval and holographic image reconstruction algorithms, the RH-M and RH-MD frameworks introduce important advantages, including superior reconstruction quality and speed, as well as an extended DOF through their autofocusing feature. As an example, for imaging lung tissue sections, RH-M achieved a ˜40% quality improvement over existing deep learning-based holographic reconstruction methods in terms of the amplitude root mean squared error (RMSE), and was ˜15-fold faster in its inference speed compared to iterative phase retrieval algorithms using the same input holograms. These results establish the first demonstration of RNNs in holographic imaging and phase recovery, and the presented framework would be broadly useful for various coherent imaging modalities.
In one embodiment, a method of performing autofocusing and phase recovery using a plurality of holographic intensity or amplitude images of a sample (i.e., a sample volume) includes obtaining a plurality of holographic intensity or amplitude images of the sample volume at different sample-to-sensor distances using an image sensor and back-propagating each one of the holographic intensity or amplitude images to a common axial plane with image processing software to generate a real input image and an imaginary input image of the sample volume calculated from each one of the holographic intensity or amplitude images. A trained convolutional recurrent neural network (RNN) is executed by the image processing software using one or more processors, wherein the trained RNN is trained with holographic images obtained at different sample-to-sensor distances and back-propagated to a common axial plane, along with their corresponding in-focus phase-recovered ground truth images, and wherein the trained RNN is configured to receive a set of real input images and imaginary input images of the sample volume calculated from the plurality of holographic intensity or amplitude images obtained at different sample-to-sensor distances and to output an in-focus output real image and an in-focus output imaginary image of the sample volume that substantially match the image quality of the ground truth images.
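By way of illustration, the following is a minimal sketch of the inference pipeline of this embodiment; it relies on an angular spectrum propagation routine (such as the `angular_spectrum_propagate` sketch provided later in the Methods discussion) and a `trained_rnn` callable standing in for the trained RNN 10, with all names being illustrative rather than taken from the source.

```python
import numpy as np

def prepare_rnn_input(holograms, z_list, z_common, wavelength, dx):
    """Back-propagate M holographic intensity images with zero phase onto a
    common axial plane and stack the real/imaginary parts as input channels."""
    channels = []
    for holo, z in zip(holograms, z_list):
        # amplitude with zero-phase padding (amplitude inputs can be used directly)
        field = np.sqrt(holo).astype(complex)
        # back-propagate from the hologram plane to the common axial plane
        field = angular_spectrum_propagate(field, z_common - z, wavelength, dx)
        channels.append(np.stack([field.real, field.imag], axis=-1))
    return np.stack(channels, axis=0)  # shape: (M, height, width, 2)

# Inference sketch: the trained RNN maps the M back-propagated fields to a
# single in-focus complex field (the output real and imaginary images).
# rnn_input = prepare_rnn_input(holograms, z_list, z_common, wavelength, dx)
# out_real, out_imag = trained_rnn(rnn_input)
```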
In another embodiment, a method of performing autofocusing and phase recovery using a plurality of holographic intensity or amplitude images of a sample volume includes the operation of obtaining a plurality of holographic intensity or amplitude images of the sample volume at different sample-to-sensor distances using an image sensor. A trained convolutional recurrent neural network (RNN) is executed by image processing software using one or more processors, wherein the trained RNN is trained with holographic images obtained at different sample-to-sensor distances and their corresponding in-focus phase-recovered ground truth images, and wherein the trained RNN is configured to receive a plurality of holographic intensity or amplitude images obtained at different sample-to-sensor distances and to output an in-focus output real image and an in-focus output imaginary image of the sample volume that substantially match the image quality of the ground truth images.
As seen in
The image sensor 24 may include a CMOS type image sensor that is well known and commercially available. The hologram images 20 are obtained using an imaging device 110, for example, a holographic microscope, a lens-free microscope device, a device that creates or generates an electron hologram image, a device that creates or generates an x-ray hologram image, or other diffraction-based imaging device. The sample volume 22 may include tissue that is disposed on or in an optically transparent substrate 23 (e.g., a glass or plastic slide or the like) such as that illustrated in
The system 2 and methods described herein rapidly output autofocused output images 50, 52, as explained below. The images 50, 52 substantially match the corresponding ground truth images obtained using the more complicated multi-height phase recovery (e.g., MH-PR) approach. The output images 50, 52 illustrated in
In some embodiments, the input hologram images 20 may include raw hologram images without any further processing. In still other embodiments, the input hologram images 20 may include pixel super-resolution (PSR) images. These PSR images 20 may be obtained by performing lateral scanning of the sample volume 22 and/or the image sensor 24 using a moveable stage 25.
To demonstrate the efficacy of the RH-M imaging method for phase recovery and autofocusing, the RNN 10 was trained and tested using hologram images 20 of human lung tissue sections.
To further analyze RH-M inference performance, a study was performed by feeding the trained RNN 10 with M=2 input holograms, captured at various different combinations of defocusing distances, i.e., Δz2,1, and Δz2,2; the results of this analysis are summarized in
The hyperparameter M is one of the key factors affecting RH-M's performance. Generally, networks with larger M learn higher-order correlations of the input hologram images 20 to better reconstruct a sample's complex field, but they can also be vulnerable to overfitting on small training datasets and to converging to local minima. Table 1 below summarizes the performance of RH-M trained with different numbers of input holograms (M) and different training set sizes. All in all, RH-M benefits from training sets with higher diversity and from larger M. A general discussion on the selection of M in practical applications is also provided in the Discussion section.
Table 1: RH-M network reconstruction quality (amplitude RMSE/phase RMSE/ECC) with respect to M and training dataset size.
The RH-M framework can also be extended to perform phase recovery and autofocusing directly from the input hologram images 20, without the need for free-space backpropagation using zero-phase and a rough estimate of the sample-to-sensor distance. This extension, RH-MD, was demonstrated using M=2 input holograms.
Table 2: Quantitative comparison of RH-M and RH-MD image reconstruction results on back-propagated holograms. Metrics were calculated based on 64 different input hologram combinations.
To further demonstrate the advantages of the presented RNN-based system 2 and method over existing neural network-based phase recovery and holographic image reconstruction methods, the performance of RH-M was compared to an earlier method, termed Holographic Imaging using Deep Learning for Extended Focus (HIDEF), described in Wu et al., Extended depth-of-field in holographic imaging using deep-learning-based autofocusing and phase recovery, Optica, 5: 704 (2018), which is incorporated herein by reference. This previous framework, HIDEF, used a trained convolutional neural network (CNN) to perform both autofocusing and phase retrieval with a single hologram that is back-propagated using zero-phase. To provide a quantitative comparison of RH-M and HIDEF, both algorithms were tested using three (3) holograms of lung tissue sections that were acquired at different sample-to-sensor distances of 383, 438.4 and 485.5 μm.
Another important advantage of RH-M is the extended DOF that it offers over both the HIDEF and MH-PR results.
Finally, in Table 3, the output inference (or image reconstruction) speeds of the RH-M, RH-MD, HIDEF and MH-PR algorithms were compared using Pap smear and lung tissue samples. As shown in Table 3, among these phase retrieval and holographic image reconstruction algorithms, RH-M and RH-MD are the fastest, achieving ˜50-fold and ˜15-fold reconstruction speed improvements compared with MH-PR (M=8 and M=2, respectively); unlike RH-M or RH-MD, the performance of MH-PR is also dependent on the accuracy of the knowledge/measurement of the sample-to-sensor distance for each raw hologram, which is not the case for the RNN-based hologram reconstruction methods reported here.
The system 2 uses an RNN-based phase retrieval method that incorporates sequential input hologram images 20 to perform holographic image reconstruction with autofocusing. The trained RNN network 10 is applicable to a wide spectrum of imaging modalities and applications, including, e.g., volumetric fluorescence imaging. Recurrent blocks learn to integrate information from a sequence of 2D microscopic scans that can be acquired rapidly, to reconstruct the 3D sample information with high fidelity and achieve unique advantages such as an extended imaging DOF. In practice, when applying the presented RNN framework to different microscopy modalities and specific imaging tasks, two important factors should be taken into consideration: (1) the image sequence length M, and (2) physics-informed data preprocessing. As Table 1 suggests, increasing the input sequence length generally improves the reconstruction quality, whereas the training process with a larger M requires a more diverse dataset and generally takes more time. Moreover, in view of the inference time increasing linearly with the input sequence length, users should select a proper M to balance the tradeoff between the imaging system throughput and the improvement gained by multiple inputs, M. During the blind testing phase, an RNN network 10 trained with a larger M is in general flexible enough to take in shorter sequences, but adequate padding needs to be applied to match the sequence length. Furthermore, physics-informed preprocessing can transform the raw microscopic images into a domain that has an easier and physically meaningful mapping to the target domain. Here, for example, free-space propagation was applied before RH-M to reduce the diffraction pattern size of the object field (despite the missing phase information and the twin-image artifacts that are present). Overall, the design of this preprocessing step should be based on the underlying physical imaging model and human knowledge/expertise.
Raw hologram images 20 were collected using a lens-free in-line holographic microscopy setup shown in
All the human samples (of sample volume 22) imaged were obtained after deidentification of the patient information and were prepared from existing specimens; therefore, this work did not interfere with standard practices of medical care or sample collection procedures.
A pixel super-resolution algorithm was implemented to enhance the hologram resolution in the hologram images 20 and bring the effective image pixel size from 2.24 μm down to 0.37 μm. To perform this, in-line holograms at 6-by-6 lateral positions were captured with sub-pixel spacing using a 3D positioning stage (MAX606, Thorlabs, Inc.). The accurate relative displacements/shifts were estimated by an image correlation-based algorithm and the high-resolution hologram was generated using the shift-and-add algorithm. The resulting super-resolved holograms (also referred to as raw hologram images 20) were used for phase retrieval and holographic imaging, as reported in the Results section.
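For illustration, a minimal sketch of the shift-and-add step is given below; it assumes the sub-pixel shifts have already been estimated by the image correlation-based algorithm, and it uses simple nearest-grid placement, so all names and details are illustrative rather than the exact implementation used. (A factor of ~6 corresponds to the 2.24 μm to 0.37 μm effective pixel size improvement reported above.)

```python
import numpy as np

def shift_and_add(low_res_frames, shifts, factor=6):
    """Shift-and-add pixel super-resolution sketch.
    low_res_frames: 2D low-resolution holograms (e.g., 6x6 = 36 frames)
    shifts: estimated (dy, dx) sub-pixel shifts, in low-res pixel units
    factor: super-resolution upsampling factor
    """
    h, w = low_res_frames[0].shape
    acc = np.zeros((h * factor, w * factor))
    cnt = np.zeros_like(acc)
    for frame, (dy, dx) in zip(low_res_frames, shifts):
        # place each low-res frame onto the high-res grid at its shifted position
        iy = np.round(np.arange(h) * factor + dy * factor).astype(int) % (h * factor)
        ix = np.round(np.arange(w) * factor + dx * factor).astype(int) % (w * factor)
        acc[np.ix_(iy, ix)] += frame
        cnt[np.ix_(iy, ix)] += 1
    cnt[cnt == 0] = 1  # avoid division by zero on unfilled grid points
    return acc / cnt
```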
The angular spectrum-based field propagation was employed for both the holographic autofocusing and the multi-height phase recovery algorithms. This numerical propagation procedure enables one to propagate the initial complex optical field at z=z0 to obtain the complex optical field at z=z0+Δz. A 2D Fourier transform is first applied to the initial complex optical field U(x, y; z0), and the resulting angular spectrum is then multiplied by a spatial frequency-dependent phase factor parametrized by the wavelength, the refractive index of the medium, and the free-space propagation distance (Δz). Finally, to retrieve the complex optical field at z=z0+Δz, i.e., U(x, y; z0+Δz), an inverse 2D Fourier transform is applied. It should be appreciated that the plurality of obtained holographic intensity or amplitude images may be back-propagated by angular spectrum propagation (ASP), or by a transformation that approximates ASP, executed by the image processing software 104.
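As a concrete illustration, the following is a minimal NumPy sketch of angular spectrum propagation under these definitions; variable names and the evanescent-wave masking convention are illustrative choices, not taken from the source.

```python
import numpy as np

def angular_spectrum_propagate(field, dz, wavelength, dx, n_medium=1.0):
    """Propagate a complex field U(x, y; z0) by dz via the angular spectrum.
    field: 2D complex array sampled with pixel size dx
    dz: propagation distance (negative dz back-propagates)
    """
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=dx)  # spatial frequencies along x
    fy = np.fft.fftfreq(ny, d=dx)  # spatial frequencies along y
    fxx, fyy = np.meshgrid(fx, fy)
    k2 = (n_medium / wavelength) ** 2 - fxx ** 2 - fyy ** 2
    kz = np.sqrt(np.maximum(k2, 0.0))
    # spatial frequency-dependent phase factor; evanescent components masked out
    H = np.exp(2j * np.pi * kz * dz) * (k2 > 0)
    return np.fft.ifft2(np.fft.fft2(field) * H)
```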
In-line holograms at different axial positions, with e.g., ˜15 μm spacing, were captured to perform MH-PR. The relative axial distances between different holograms were estimated using an autofocusing algorithm based on the edge sparsity criterion. The iterative MH-PR algorithm first takes the amplitude of the hologram captured at the first height (i.e., z2,1) and pads an all-zero phase channel to it. It then propagates the resulting field to different hologram heights, where the amplitude channel is updated at each height by averaging the amplitude channel of the propagated field with the measured amplitude of the hologram acquired at that corresponding height. This iterative algorithm converges typically after 10-30 iterations, where one iteration is complete after all the measured holograms have been used as part of the multi-height amplitude updates. Finally, the converged complex field is backpropagated onto the sample plane using the sample-to-sensor distance determined by the autofocusing algorithm. To generate the ground truth images for the network training and testing phases, in-line holograms at eight (8) different heights were used for both the lung and the Pap smear samples reported herein.
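For illustration, a minimal sketch of this iterative MH-PR procedure is given below, reusing the `angular_spectrum_propagate` sketch from above; the equal-weight amplitude averaging follows the description in the text, while the remaining names and defaults are illustrative.

```python
import numpy as np

def multi_height_phase_recovery(amplitudes, z_list, wavelength, dx, n_iter=30):
    """Iterative MH-PR sketch.
    amplitudes: measured hologram amplitudes, one per height
    z_list: sample-to-sensor distance of each hologram (e.g., from autofocusing)
    """
    field = amplitudes[0].astype(complex)  # zero-phase initialization at z_{2,1}
    z_cur = z_list[0]
    for _ in range(n_iter):  # typically converges after 10-30 iterations
        for amp, z in zip(amplitudes, z_list):
            field = angular_spectrum_propagate(field, z - z_cur, wavelength, dx)
            z_cur = z
            # update: average the propagated amplitude with the measured
            # amplitude at this height, keeping the current phase estimate
            avg_amp = 0.5 * (np.abs(field) + amp)
            field = avg_amp * np.exp(1j * np.angle(field))
    # back-propagate the converged field onto the sample plane
    return angular_spectrum_propagate(field, -z_cur, wavelength, dx)
```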
RH-M and RH-MD adapt the GAN framework for their training, which is depicted in
After the pixel super-resolution and multi-height based phase recovery steps, the resulting hologram images 20, along with the retrieved ground truth images, were cropped into non-overlapping patches of 512×512 pixels, each corresponding to a unique ˜0.2×0.2 mm² sample field of view. For a given M, any combination of M out of the H acquired input holograms can be selected for each ground truth image patch during the training and testing phases, where H stands for the number of heights (H=8 in this work). As an example, Table 3 summarizes the training dataset size for the RH-M and RH-MD networks for Pap smear tissue sections. RH-M and RH-MD were implemented using TensorFlow with Python and CUDA environments, and trained on a computer with an Intel Xeon W-2195 processor, 256 GB of memory and one NVIDIA RTX 2080 Ti graphics processing unit (GPU). In the training phase, for each image patch, Mtrain holograms were randomly selected from different heights (sample-to-sensor distances) as the network input, and the corresponding output field of RH-M or RH-MD was then sent to the discriminator (D) network.
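A brief sketch of this combinatorial hologram selection is shown below for illustration; the names are illustrative and M_TRAIN=2 is just an example value.

```python
import itertools
import random

H = 8        # number of hologram heights acquired per field of view
M_TRAIN = 2  # example training input sequence length

# every possible combination of M_TRAIN holograms out of the H heights
ALL_COMBOS = list(itertools.combinations(range(H), M_TRAIN))

def sample_training_input(patch_holograms):
    """Randomly select M_TRAIN holograms (by height index) for one patch."""
    height_indices = random.choice(ALL_COMBOS)
    return [patch_holograms[i] for i in height_indices]
```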
The generator loss LG is the weighted sum of three different loss terms: (1) the pixel-wise mean absolute error (MAE) loss, LMAE, (2) the multi-scale structural similarity (MSSSIM) loss, LMSSSIM, between the network output ŷ and the ground truth y, and (3) the adversarial loss, LG,D, from the discriminator network. Based on these, the total generator loss can be expressed as:

$$L_G = \alpha L_{MAE} + \beta L_{MSSSIM} + \gamma L_{G,D}$$

where α, β, γ are relative weights, empirically set as 3, 1 and 0.5, respectively. The MAE and MSSSIM losses are defined as:

$$L_{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{y}_i - y_i \right|$$

$$L_{MSSSIM} = 1 - \frac{2\mu_{\hat{y}_P}\,\mu_{y_P} + C_1}{\mu_{\hat{y}_P}^2 + \mu_{y_P}^2 + C_1} \prod_{j=1}^{P} \frac{2\sigma_{\hat{y}_j y_j} + C_2}{\sigma_{\hat{y}_j}^2 + \sigma_{y_j}^2 + C_2}$$

where n is the total number of pixels in y, ŷj and yj are the 2^(j−1)× downsampled images of ŷ and y, respectively, and P is the number of scales. μy and σy² represent the mean and variance of the image y, respectively, while σŷy denotes the cross-covariance between ŷ and y; C1 and C2 are stabilization constants.
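A minimal TensorFlow sketch of this weighted generator loss is given below; the least-squares form of the adversarial term and the use of `tf.image.ssim_multiscale` for the MSSSIM term are illustrative assumptions, not taken from the source.

```python
import tensorflow as tf

def generator_loss(y_hat, y, d_of_fake, alpha=3.0, beta=1.0, gamma=0.5):
    """Weighted generator loss sketch; y_hat, y are [batch, H, W, C] tensors
    and d_of_fake is the discriminator's output on the generated field."""
    l_mae = tf.reduce_mean(tf.abs(y_hat - y))
    # tf.image.ssim_multiscale returns a similarity score in [0, 1]
    l_msssim = 1.0 - tf.reduce_mean(tf.image.ssim_multiscale(y_hat, y, max_val=1.0))
    # adversarial term (assumed least-squares GAN form)
    l_adv = tf.reduce_mean(tf.square(1.0 - d_of_fake))
    return alpha * l_mae + beta * l_msssim + gamma * l_adv
```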
Adam optimizers with decaying learning rates, initially set as 5×10−5 and 1×10−6, were employed for the optimization of the generator and discriminator networks, respectively. After ˜30 hours of training, corresponding to ˜10 epochs, the training was stopped to avoid possible overfitting.
In the testing phase of RH-M and RH-MD, the convolutional RNN was optimized for mixed-precision computation. In general, a trained RNN can be fed with input sequences of variable length. In the experiments, RH-M/RH-MD was trained on datasets with a fixed number of input holograms to save time, i.e., fixed Mtrain, and later tested on data with no more than Mtrain input holograms (i.e., Mtest ≤ Mtrain). In consideration of the convergence of the recurrent units, shorter testing sequences (where Mtest < Mtrain) were replication-padded to match the length of the training sequences, Mtrain.
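A minimal sketch of this replication padding is shown below; repeating the last hologram in the sequence is an illustrative choice, as the text does not specify which entry is replicated.

```python
import numpy as np

def replication_pad(test_holograms, m_train):
    """Pad a shorter test sequence (M_test < M_train) by repeating entries
    so that the recurrent units see a full-length input sequence."""
    seq = list(test_holograms)
    while len(seq) < m_train:
        seq.append(seq[-1])  # repeat the last hologram (illustrative choice)
    return np.stack(seq)
```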
HIDEF networks were trained in the same way as detailed in Wu et al., Extended depth-of-field in holographic imaging using deep-learning-based autofocusing and phase recovery, Optica, 5: 704 (2018). Blind testing and comparison of all the algorithms (HIDEF, RH-M, RH-MD and MH-PR) were implemented on a computer with an Intel Core i9-9820X processor, 128 GB of memory and one NVIDIA TITAN RTX graphics card using GPU acceleration; the details, including the number of parameters and inference times, are summarized in Table 3.
While embodiments of the present invention have been shown and described, various modifications may be made without departing from the scope of the present invention. The invention, therefore, should not be limited, except to the following claims, and their equivalents.
This application claims priority to U.S. Provisional Patent Application No. 63/148,545, filed on Feb. 11, 2021, which is hereby incorporated by reference. Priority is claimed pursuant to 35 U.S.C. § 119 and any other applicable statute.
Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/US2022/015843 | 2/9/2022 | WO |
Number | Date | Country
---|---|---
63148545 | Feb 2021 | US