The invention generally relates to systems and method for deep learning driven adaptive optics for single molecule localization microscopy.
Fluorescent microscopy is an indispensable tool in visualizing cellular and tissue machinery with molecular specificity, however, its resolution is limited to 250-700 nm laterally and axially due to the diffraction of light. Molecular features smaller than this limit cannot be resolved. Super resolution microscopies such as Stimulated Emission Depletion Microscopy (STED), Structured Illumination Microscopy (SIM)3, and Single Molecule Localization Microscopy (SMLM) have overcome this barrier, allowing biological observations well beyond this fundamental limit of light. In particular, SMLM detects isolated photo-switchable or convertible fluorescent dyes or proteins, pinpoints the centers of individual probes from their emission patterns, and reconstructs the molecular centers into a super-resolution image. Localization precision as low as 1-10 nm can be achieved in fixed and living cells.
SMLM in tissues, however, is challenging. One major reason is the distortion and blurring of single molecule emission patterns (i.e. PSFs) caused by the inhomogeneous refractive indices within the tissue. Such alteration often reduces the information content12 carried by each detected photon, increases localization uncertainty, and thus causes significant resolution loss, which is irreversible by post-processing13. Reversing these sample induced aberrations requires optical path modifications in a microscopy system, commonly with a deformable mirror or a spatial light modulator, responsive towards each specimen and field-of-view to adaptively restore the PSFs of single emitters, and thus the achievable resolution. This process is known as adaptive optics (AO).
Guiding a deformable mirror to compensate sample induced aberrations, the distorted wavefront needs to be measured. For point-scanning microscopes, such as confocal and two-photon, the detection focus serves as a ‘guide star’ providing a stable wavefront measurable both directly and indirectly. In contrast, wavefronts of single molecule emissions, in spite of their abundance in SMLM experiments, cannot be directly measured as the signals from individual molecules blink stochastically with limited photons. Besides, wavefronts passing through the system are composed of not only the aberrated wavefront induced by the specimen, but also the wavefront variations induced by lateral and axial positions from a collection of emitters in a volume. For this reason, current sensorless AO-SMLM developments focus on iteratively introducing mirror changes then evaluating the changes with image-quality metrics. Despite that these iterative methods require a large number of cycles, each including image acquisition and mirror changes, to reach the optimal correction, these approaches provide robust corrections for tissue induced aberrations only when the target tissue structures are planar or with small axial extent. This is because emission patterns from single molecules at different axial positions results in inconsistent, and, in some cases, even opposite metric responses and thus fundamentally limit the efficacy of these approaches for aberration correction in tissues.
Bypassing the previous iterative trial-then-evaluate processes, the invention herein provides new systems and methods based on deep learning driven adaptive optics for SMLM to allow direct inference of wavefront distortion and near real time compensation. The systems and methods herein make use of a trained deep neural network (DNN) that monitors the individual emission patterns from single molecule experiments, infers their shared wavefront distortion, feeds the estimates through a dynamic filter (Kalman), and drives a deformable mirror to compensate sample induced aberrations. The method, referred to as deep learning driven adaptive optics (DL-AO) for single molecule imaging, simultaneously estimates and compensates 28 types of wavefront deformation shapes, restores single molecule emission patterns approaching the conditions untouched by specimen, and improves the resolution and fidelity of 3D SMLM through thick tissue specimens over 130 micrometer, with as few as 3-20 mirror changes.
In certain aspects, the invention provides a system comprising: an imaging apparatus configured for conducting single-molecule localization microscopy (SMLM), wherein the imaging apparatus comprises a deformable mirror and a dynamic filter (such as a Kalman filter); and a processor operably associated with the imaging apparatus and configured to: monitor individual emission patterns produced via the imaging apparatus from a plurality of different single molecules in a sample; infer shared wavefront distortion for each of the individual emission patterns; provide the shared wavefront distortion for each of the individual emission patterns through the dynamic filter; and operate the deformable mirror to compensate for sample induced aberrations.
In other aspects, the invention provides a method for improving single-molecule localization microscopy (SMLM), the method comprising: monitor, via a processor operably associated with an imaging apparatus configured for conducting single-molecule localization microscopy (SMLM), individual emission patterns produced via the imaging apparatus from a plurality of different single molecules in a sample; inferring, via the processor, shared wavefront distortion for each of the individual emission patterns; providing, via the processor, the shared wavefront distortion for each of the individual emission patterns through a dynamic filter of the imaging apparatus; and operating, via the processor, a deformable mirror of the imaging apparatus to compensate for sample induced aberrations, thereby improving SMLM.
In certain embodiments of the systems and methods, the processor simultaneously estimates and compensates for 28 types of wavefront deformation shapes. In certain embodiments of the systems and methods, the processor restores single molecule emission patterns approaching pre-analysis conditions. In certain embodiments of the systems and methods, the processor improves resolution and fidelity of three dimensional SMLM through tissue specimens over 130 micrometers. In certain embodiments of the systems and methods, the improvement is accomplished in as few as 3-20 mirror changes.
In certain embodiments of the systems and methods, the processor is trained via a training data set, wherein the training data set that comprises segmenting single molecule-containing sub-regions. In certain embodiments of the systems and methods, each sub-region goes through a sequence of template matching processes, which are organized as convolutional layers and residual blocks with PReLU activations and batch normalizations in between. In certain embodiments of the systems and methods, the processor then fully connects through 1×1 convolutional layers to an output vector of values amplitude estimates for wavefront shapes in terms of the native mirror deformation modes. In certain embodiments of the systems and methods, the wavefront is represented with coefficients of orthogonal basis.
Single molecule emission patterns generated by individual fluorescence molecules carry information not only about their molecular center positions, but also about the shared wavefront distortion25. The random lateral and axial positions of the blinking fluorescent molecules and their limited photons emitted in SMLM experiments, make these emission patterns unsuitable for direct wavefront measurement14,15. Single molecule deep neural network (smNet)26 was demonstrated in its capacity to infer wavefront distortions from individual PSFs in simulation and its responsiveness in experimental datasets. Moving from the inference task to active control of a deformable mirror driven by deep learning is, however, nontrivial. Here, we describe our developments in experimental wavefront based training, stacked estimation networks, and stabilized feedback controls through Kalman filter (
Upon detection of SMLM frames, single molecule-containing sub-regions are segmented and sent to the network. Each input sub-region goes through a sequence of template matching processes, which are organized as convolutional layers and
residual blocks with PReLU activations30 and batch normalizations in between, then “fully connects” through 1×1 convolutional layers to an output vector of values—amplitude estimates for wavefront shapes in terms of the native mirror deformation modes (hereafter referred to as mirror modes). Representing wavefront with coefficients of orthogonal basis helps cut down on the number of outputs and network parameters to be optimized in training. Forming this orthogonal basis directly from native mirror deformations further ensured the coefficients' accuracy in representing mirror responses. With this consideration, the conversion from mirror modes to Zernike polynomials—commonly used as the analytical basis to describe aberrations—is dropped to minimize mismatches between mirror responses and Zernike-based wavefront shapes. The residual differences between theoretical expectations and experimental mirror deformations are incorporated into training data generation.
To build an accurate link between experimentally detected emission patterns and the mirror control with neural networks, it is imperative to train the network with data that match those obtained experimentally. However, experimental training data of single molecules are challenging to obtain, since the ground-truth wavefronts are usually unknown and the extensive variations of the intensity, background, and the lateral and axial locations of single emitters, are impractical to cover experimentally. To this end, we simulate wavefront distortions by linearly combining the mirror deformations obtained experimentally in the SMLM system. We then use the coefficients of these experimental patterns to form the output of the network. The static residue of system aberration after optimizing the microscope system is also incorporated as the baseline of the wavefront shapes. This allows us to efficiently generate millions of training PSFs based on experimentally measured wavefronts with highly accurate training ground truth (normalized cross correlation (NCC) value of >0.95, comparing measured PSFs with those generated from network estimation).
Compensating wavefront distortions inferred from PSFs of blinking molecules, we found that the network proposed mirror change fluctuates with non-vanishing uncertainty before/after each mirror update. This uncertainty increases with the network training range, resulting in a trade-off between the compensation range and stability. To this end, we drive the deformable mirror by dynamically switching three networks trained with different aberration scales where the transitions between networks are based on the inference uncertainty. To stabilize network transitions, we employ Kalman filter to reduce the estimation uncertainty by recursively combining wavefront measurements before and after each correction. Due to the uncontrollable availability of single molecule emission patterns with a high signal-to-background ratio and the evolving PSFs after each correction, this process weighs heavily on high precision measurements against the uncertain ones to ensure stable feedback from the network.
First, we characterized the response accuracy of DL-AO network using controlled wavefront distortions generated by the deformable mirror. These wavefront distortions resulted in aberrated emission patterns, which were then collected and sent to DL-AO network (Methods). By comparing the induced deformation amplitudes with those estimated by DL-AO, we observed that DL-AO network responded towards individual mirror deformations mostly in a one-to-one manner. And this behavior was consistently observed with both beads samples and blinking single molecules from immune-fluorescence-labeled cell specimens. At the same time, we also observed that DL-AO sensed changes in other mirror modes besides the one actually being changed, an expected behavior considering that mirror modes are coupled experimentally. Due to such coupling, mapping between the wavefront shape and mirror mode amplitudes is no longer unique, and therefore we further quantified the network response accuracy through wavefront shape errors and PSF similarities. We observed that independent measurements from DL-AO and phase retrieval using PSFs of fluorescent beads resulted in nearly identical wavefront shapes with a small difference of 0.13±0.02 rad (mean±s.t.d, N=28) quantified in root mean square wavefront error (Wrms, Methods, Supplementary
DL-AO aims to restore PSFs to the level unmodified by the specimen. To characterize DL-AO's capacity for PSF restoration, we introduced random wavefront distortions using the deformable mirror and compensated these distortions with DL-AO during SMLM experiments with immune fluorescence-labeled TOM20 in COS-7 cells. Visualizing the raw blinking data during the correction, we found the PSFs became less distorted even after a single compensation, and the mirror shape became stable after ˜4 mirror updates (
Next, we evaluated the robustness of DL-AO on compensating different levels of wavefront distortion, from 0.25 to 2.75 radians in Wrms, by assessing the residual wavefront error post correction using both simulation and single molecule blinking data. After one mirror update, we observed that 51.9±9.3% and 64.3±12.8% (mean±s.t.d, N=165) of the induced level was compensated for experimental and simulated data, respectively (
Inhomogeneous refractive indices within cells and tissues redirect and scatter light. In particular, the mismatches between refractive indices in sample media and objective immersion media reduce the shape modulation of the single molecule emission patterns axially and broaden the focus laterally (
Here, we demonstrate DL-AO's capacity in compensating significant index mismatch induced aberrations using constructed specimens from 35 μm to 134 μm in thickness with water-based imaging media. Imaging immune-fluorescence-labeled Tom20 in COS-7 cells through such thickness without AO correction, the super resolution images of Tom20 proteins showed nearly no axial distributions (visualized by color differences,
Next, we illustrate the mechanism behind such resolution improvement (
We further demonstrated DL-AO on arbitrary tissue-induced aberrations by imaging through 200-μm thick unlabeled brain sections resolving membrane of mitochondria using immune fluorescence-labeled Tom20 in COS-7 cells (
Combing the power of single molecule deep neural network with careful designs in network training, feedback, and instrument control, we demonstrated that DL-AO optimizes PSFs approaching the instrument optimum during SMLM experiments, and restores the resolution of 3D SMLM through >130 μm depth of tissue. However, DL-AO requires at least two isolated and detectable PSFs to start compensation, and this requirement might be challenging to meet when the aberration level or imaging depth is significantly higher than the demonstrated cases where single molecule emissions are no longer identifiable. We also expect that further development in designing training data and neural network architecture will improve inference accuracy of DL-AO in an increasing compensation range, ultimately enabling single shot compensation during SMLM imaging. Additionally, the demonstrated DL-AO applications are limited by the working distance of the silicone-oil objective, and thus the imaging depth could potentially be extended when combined with long working distance objectives. To further improve the achievable resolution and imaging fidelity, we expect that DL-AO can be combined with light-sheet illumination49,50 for an increased signal to background ratio of single molecule detections, tissue clearing51 for labeling penetration and reduced aberration level, and expansion methods52 for further improved spatial resolution, thereby opening doors to observe nanoscale conformation in tissues and small animals.
References and citations to other documents, such as patents, patent applications, patent publications, journals, books, papers, web contents, have been made throughout this disclosure, including to the Supplementary. The Supplementary, and all other such documents are hereby incorporated herein by reference in their entirety for all purposes.
The invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The foregoing embodiments are therefore to be considered in all respects illustrative rather than limiting on the invention described herein.
We cleaned 25-mm-diameter coverslips (CSHP-No 1.5-25, Bioscience Tools) successively in ethanol (2701, Decon) and HPLC-grade water (W5-4, Fisher Chemical) for three times and then dried them with compressed air. To promote fluorescent beads adhesion on coverslip, 200 μL of 546 poly-l-lysine solution (P4707, Sigma-Aldrich) was added to one coverslip and incubated for 20 min at room temperature (RT). Following poly-l-lysine treatment, the coverslip was subsequently rinsed with deionized water. For beads incubation, we first diluted 100-nm-diameter crimson beads (custom-designed, Invitrogen) to 1:1,000,000 in deionized water. Then we added 200 μL of the diluted bead solution to the center of the coverslip and incubated for 20 min at RT. The coverslip was subsequently rinsed with deionized water. The treated coverslip was placed on a custom-made holder, and 20 μL of 38% 2,2′-thiodiethanol (166782, Sigma-Aldrich) in 1×PBS (10010023, Gibco) was added to its center. Another 25-mm-diameter coverslip (also cleaned by using the above protocol) was placed on top of this coverslip. This coverslip sandwich was sealed with two-component silicone dental glue (Twinsil speed 22, Dental-Produktions und Vertriebs GmbH).
COS-7 cells (CRL-1651, ATCC) were grown on coverslips placed in six-well plates and cultured in DMEM (30-2002, ATCC) with 10% FBS (30-2020, ATCC) and 1% penicillin-streptomycin (15140122, Gibco) at 37° C. with 5% CO2. The cells are passaged when their confluence reaches 80%. And the cells were fixed for imaging when their confluence reaches about 30%.
Cultured cells were first fixed with 37° C. pre-warmed 3% Formaldehyde aqueous solution (diluted in 1×PBS from 16% Formaldehyde aqueous solution, 15710, Electron Microscopy Sciences) and 0.5% Glutaraldehyde aqueous solution (diluted in 1×PBS from 8% Glutaraldehyde aqueous solution, 16019, Electron Microscopy Sciences), with gently rocking at room temperature (RT) for 15 min. After fixation, cells were rinsed twice with 1×PBS and then quenched for 7 min with freshly prepared 0.1% NaBH4 (452882, Sigma-Aldrich) in 1×PBS. The cells were rinsed three times with 1×PBS and blocked with solution containing 3% BSA (001-000-162, Jackson ImmunoResearch) and 0.2% Triton X-100 in 1×PBS, with gently rocking at RT for 1 h. After blocking, the cells were incubated at 4° C. overnight with primary antibody (sc-11415, Santa Cruz Biotechnology), 1:500 diluted in antibody dilution buffer (1% BSA and 0.2% Triton X-100 in 1×PBS). We then washed cells three times with 5 min each time in 0.05% Triton X-100 in 1×PBS, and incubated cells at RT for 5 h with secondary antibody (A21245, Invitrogen, for Alexa Fluor 647), 1:500 diluted in antibody dilution buffer (1% BSA and 0.2% Triton X-100 in 1×PBS). After being washed three times with 5 min each time in 0.05% Triton X-100 in 1×PBS, cells were post-fixed with 4% Formaldehyde aqueous solution (1:4 diluted with 1×PBS from 16% Formaldehyde aqueous solution, Electron Microscopy Sciences) at RT for 10 min. Cells were then rinsed three times with 1×PBS and stored in 1×PBS at 4° C.
The 5×FAD Alzheimer's disease (AD) mouse model was used for immunostaining amyloid β. Mice were maintained on the C57BL/6J (B6) background, which were purchased from the Jackson Laboratory (JAX MMRRC Stock #034848). The 5×FAD transgenic mice overexpress the following five familial Alzheimer's disease (FAD) mutations under control of the Thy1 promoter: the APP (695) transgene containing the Swedish (K670N, M671L), Florida (I716V), and London (V7171) mutations, and the PSEN1 transgene containing the M146L and L286V FAD mutations33. Up to five mice were housed per cage with SaniChip bedding and LABDIET (animal food) 5K52/5K67 (6% fat) feed. The colony room was kept on a 12:12 h light/dark schedule with the lights on from 7:00 am to 7:00 pm daily. The mice were bred and housed in specific-pathogen-free conditions. Only female mice were used. Mice were euthanized by perfusion with ice-cold phosphate-buffered saline (PBS) following full anesthetization with AVERTIN (Tribromoethanol) (125-250 mg/kg intraperitoneal injection)53. Animals used in the study were housed in the Stark Neurosciences Research Institute Laboratory Animal Resource Center, Indiana University School of Medicine. All animals were maintained, and experiments performed in accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health. The protocol was approved by the Institutional Animal Care and Use Committee (IACUC) at Indiana University School of Medicine.
Perfused brains from mice at 7.5 months of age were fixed in 4% formaldehyde in aqueous solution (1:4 diluted with 1×PBS from 16% Formaldehyde Aqueous Solution, Electron Microscopy Sciences) for 24 h at 4° C. Following fixation, brains were cryoprotected in 30% sucrose at 4° C., and then cut into sections of 150 μm by a vibratome (7000smz-2, Campden Instruments). For immunostaining, free-floating sections were washed and permeabilized with 0.1% Triton X-100 in 1×PBS (PBST), and antigen retrieval was subsequently performed using 1× Reveal Decloaker (Biocare Medical) at 85° C. for 10 min. Sections were blocked in 5% normal donkey serum (D9663 Sigma-Aldrich) in PBST for 1 h at RT. The sections were then incubated with β-Amyloid Antibody (Cell Signaling Technology #2454, rabbit), 1:1000 diluted in 5% normal donkey serum in PBST at 4° C. overnight. Sections were washed and stained for 1 h at RT with secondary antibody (A31573, Invitrogen, for Alexa Fluor 647) diluted at 1:1000 in 5% normal donkey serum in PBST54.
To obtain mice expressing the proper amount of ChR2-EYFP in Thy1+ pyramidal cells, the litters of Thy1-ChR2-EYFP (B6.Cg-Tg (Thy1-COP4/EYFP)18Gfng/J, Jackson Lab) cross with B6 (C57BL/6, Jackson Lab) were used for the labeling. To extract the brains for sectioning, the litters of seven-week-old were first anesthetized by intraperitoneal injections of a mix of 90 mg/kg ketamine (59399-114-10, Akron) and 10 mg/kg xylazine (343750, HVS). After confirmation of deep anesthesia, the abdomen was open to expose the diaphragm. The chest cavity was then opened by cutting through the diaphragm and ribs to expose the heart. The trans-cardiac perfusion was performed by inserting the needle into the left ventricle and a small incision at the right atrium. Mice were perfused with 1×PBS (1:10 diluted from DSP32060, Dot Scientific). After the liver was pale, mice were continuously perfused with 4% Formaldehyde Aqueous Solution (1:8 diluted with 1×PBS from 32% Formaldehyde Aqueous Solution, Electron Microscopy Sciences) to pre-fix the brain until the muscle turned stiff. Brains were carefully collected and placed in 4% Formaldehyde Aqueous Solution to post-fix at 4° C. overnight. The fixed brains were trimmed for coronal slicing. The trimmed brains were fixed and cut into sections of 150 μm, 200 μm and 250 μm by a vibratome (1000 Plus, TPI Vibratome). The brain sections were washed three times, 15 min for each time, in wash buffer (0.1% Triton X-100 in 1×PBS) with a gentle shake (120 rpm, Orbi-Shaker, Benchmark), and then were incubated in blocking butter (5% BSA (A9647, Sigma-Aldrich) in 1×PBS) for 1.5 h with a gentle shake. The blocked brain sections were incubated with chicken anti-GFP antibody (ab13970, Abcam, diluted to 1:1,000 in blocking buffer) at 4° C. overnight. After being washed three times in the wash buffer as in the first step, the slices were incubated with goat anti-chicken Alexa Fluor 647-conjugated antibody (A21449, Invitrogen, diluted to 1:600 in wash buffer) at room temperature for 2 h with a gentle rocking. All animals were maintained, and experiments performed in accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health. The protocol was approved by the Institutional Animal Care and Use Committee (IACUC) at Purdue University.
Immediately before SMLM imaging, the coverslip with specimens on top was placed on a custom-made holder6. Imaging buffer55 (10% (wt/vol) glucose in 50 mM Tris, 50 mM NaCl, 10 mM MEA, 50 mM BME, 2 mM COT, 2.5 mM PCA and 50 nM PCD, pH 8.0) was added to the coverslip. Then another cleaned coverslip was placed on top of the imaging buffer. This coverslip sandwich was sealed with two-component silicone dental glue. Samples with immune fluorescence-labeled cells on the top coverslips were prepared as described below: 200 μL of poly-l-lysine solution was added to the bottom coverslip, incubated for 20 min and subsequently rinsed with deionized water. Then 20 μL of microsphere suspension (134 μm in diameter, 7640A, Thermo Scientific) was spread around the outer ring area of the coverslip, and incubated at RT until the coverslip was dried. Then we placed this coverslip with microspheres at the bottom, added 50-80 μL imaging buffer without touching the microspheres, and added the coverslip with cells on top of it, with the cell-side surface facing down.
All experimental data were recorded on a custom-designed SMLM setup built around an Olympus IX-73 microscope stand (Olympus America). This system is equipped with a 100×/1.35-NA (numerical aperture) silicone oil-immersion objective lens (UPLSAPO100XS, Olympus America), a PIFOC objective positioner (ND72Z2LAQ, Physik Instrumente), a three axis piezo nano-positioning systems (Nano-LP100, Mad City Labs) and a manual XY stage (MicroStage-LT, Mad City Labs Inc.). A continuous-wave laser at wavelength of 642 nm (2RU659 VFL-P-2000-642-B1R, MPB Communications) was coupled into a polarization-maintaining single-mode fiber (PM-S405-XP, Thorlabs) after passing through an acousto-optic tunable filter (AOTFnC-400.650-TN, AA Opto-electronic) for power modulation. The excitation light coming out of the fiber was focused to the pupil plane of the objective lens after passing through a filter cube holding a quadband dichroic mirror (Di03-R405/488/561/635-t1, Semrock). The emission fluorescence was split with a 50/50 non-polarizing beam splitter (BS016, Thorlabs) mounted on a kinematic base (KB25/M, Thorlabs). The separated fluorescent signals were delivered by two mirrors onto a 90° specialty mirror (47-005, Edmund Optics), passed through a band-pass filter (FF01-731/137-25), and were then projected on an sCMOS camera (Orca-Flash4.0v3, Hamamatsu) with an effective pixel size of 119 nm on the sample plane. The detection planes that received the signals transmitted and reflected by the beam splitter were referred to as plane 1 and plane 2, respectively. The pupil plane of the objective lens was imaged onto a deformable mirror (Multi-3.5, Boston Micromachines). The imaging system was controlled by a custom written program in LabVIEW (National Instruments).
The experimental mirror deformation modes25 (Supplementary Note 1) were measured using fluorescent bead sample described above. We introduced a positive and a negative (unit amplitude) mirror changes for each of the 28 mirror deformation modes. For each mirror shape setting, we acquired PSFs at z positions from −1.5 μm to 1.5 μm, with a step size of 100 nm, a frame rate of 10 Hz, and 3 frames per z position. Pupil phase was extracted through phase retrieval algorithm for each mirror change. To obtain the experimental mirror deformation bases without the influences of instrument or sample induced aberrations, we calculated the differences of the retrieved pupil phases between the positive and negative unit changes of mirror modes and divided them by two. The actual distortion level introduced by each experimental mirror mode is quantified through root mean square wavefront error28 (Methods, Supplementary Note 3).
We define instrument optimum as the status where optical hardware was optimized to limit the inherent system aberrations. To obtain this optimized status, we followed a previously described method6, where the deformable mirror was adjusted as follows. Starting from the flat voltage map (provided by the manufacturer) of the deformable mirror, 28 mirror modes were applied sequentially. For each mirror mode, 11 different amplitudes were applied while recording the corresponding fluorescence signal from an in-focus 100-nm crimson bead sample. To extract the fluorescence signal from individual beads, the symmetry center of each imaged bead was obtained using the radial symmetry method56. Subsequently, a symmetric 2D Gaussian was generated at the symmetry center and was multiplied by the isolated emission pattern from the fluorescent bead, generating a Gaussian-masked image, and then the total intensity of the masked image was calculated to extract the center peak signal of the beads in focus. For each mirror mode, images of the bead were acquired at 11 different mirror mode amplitudes and the corresponding center peak signals of the bead were extracted as described above. The optimal amplitude (i.e. the amplitude providing the highest center peak signal from the beads) was determined from a quadratic fit of these 11 signal measurements vs. mirror mode amplitudes. After identifying optimal amplitudes for each of the 28 modes, these amplitudes were added to the flat voltage map (provided by the manufacturer), serving as the starting point for another iteration. This iterative process was repeated five times to achieve optimal system aberration correction. PSFs under instrument optimum were measured using fluorescent beads sample described above. Data were acquired at a series of z positions from 1.5 μm to 1.5 μm, with a step size of 100 nm, a frame rate of 10 Hz, and 3 frames per z position. Phase retrieval algorithm was then performed on the bead stack to obtain the pupil function under instrument optimum. The instrument optimum can be further verified by decomposing the pupil phase into Zernike mode12 and checking whether the absolute values of first 64 Zernike coefficients (Wyant order28) are smaller than 0.2λ/2π.
The root mean square wavefront errors (Wrms) were calculated by the root mean square among all pixels within in the image of pupil phase angle Wrms for experimental wavefronts were either calculated using the pupil phase obtained by phase retrieval from fluorescent beads (Supplementary,
The aberrated PSFs for characterizing network responses (Supplementary
SMLM Acquisition with DL-AO
In SMLM data acquisition, the fluorescently labeled samples were first excited with a 642-nm laser at a low intensity of ˜50 W/cm2 to find a region of interest. Imaging depths of mitochondria specimens were measured by the differences of PIFOC readings between the apparent focus of the region-of-interest and the bottom coverslip surface. The imaging depths for immune fluorescence-labeled tissue specimens were measured by the differences of PIFOC readings between apparent focuses of the region-of-interest and the fluorescent signal closest to bottom coverslip surface. Before SMLM experiments, bright-field images of this region were recorded over an axial range from −1 to +1 μm with a step size of 100 nm as reference images for focus stabilization57. Then the blinking data were collected at a laser intensity of 2-6 kW/cm2 and a frame rate of 50 Hz, where the first 3-20 cycles were used for DL-AO, with 20-100 frames per cycle. In the case where significant background photons were observed (100 per pixel per frame), a temporal median filter was used to estimate structured background for each pixel. This background map was subtracted from each camera frame before the frames are segmented into sub-regions for DL-AO processing. After DL-AO correction, 2000 frames were collected per cycle, and 20-120 cycles (50000-236000 frames, Supplementary Table 2) were collected per imaging area. For the interleaved SMLM imaging without and with AO, deformable mirror shape was set to switch between DL-AO compensated shape and the shape used for instrument optimum (Methods) per imaging cycle (2000 frames). Acquisition of no-AO data was performed first in the interleaved sequence for fair comparison. Upon each switch between no-AO and DL-AO acquisitions, PIFOC objective positioner was moved to compensate apparent focal shift in the case of index mismatch induced aberration58. The focal shifts were determined by an estimated linear relationship between the apparent focus shift and the amplitudes of two radially symmetric mirror deformation modes. The shifts per unit amplitude changes were empirically estimated to be −0.3 μm for mirror mode 5 and −0.2 μm for mirror mode 15 (Supplementary, Fig. SS4). Here, a negative movement of PIFOC objective positioner corresponds to shifting the imaging plane closer to the bottom coverslip surface.
The neck sizes of dendritic spines are measured as follows. First we selected a profile line at the location where measurement is to be made. A rectangular box was then cropped along the line, with its width ranging from 50-500 nm (depending on the spine neck length and the number of localizations). The localization result inside this rectangular box was isolated and rendered into an image with 3 nm pixel size. Each point in the rendered image is blurred with a Gaussian kernel of 3 pixels in width. Intensity profile was generated along the profile line by sum projection and subsequently the histogram was normalized by dividing its maximum value. The spine neck sizes were calculated by the full width at the half maximum of the intensity histogram. Spine head sizes were measured the same way as that for the spine necks. The Amyloid β fibrils' widths were measured the same way as that for the spine necks, except for a Gaussian function was used to fit the line profile (‘fit’, Curve Fitting Toolbox 2020a, MATLAB R2020a, The MathWorks, Inc.), with Gaussian function switched between ‘gauss1’ (single Gaussian fit) and ‘gauss2’ (two Gaussians) depending on the number of peaks observed in intensity histogram. The half width at the half maximum of the fitted Gaussian curve is treated as the width of each fibril.
The 3D structures of amyloid-β (Aβ) fibrils are a focus of interest in the studies of Alzheimer's disease (AD) and are of particular importance with the success of amyloid-directed therapeutics38,39. Visualizing the formation and aggregation of these fibrils within the brain has been limited by the significant resolution loss when imaging through tissues. With DL-AO adaptively optimizing single molecule emission patterns during SMLM imaging, we can now clearly resolve the organization of immune-fluorescence-labeled β-amyloid fibrils in 125 μm thick brain sections from 5×FAD mice, a transgenic AD model that exhibits robust amyloid plaque pathology similar to that found in the human AD brain40 (
Using deep learning driven adaptive optics to correct sample induced aberrations, and in situ PSF model to perform super resolution reconstruction post-AO correction, we performed SMLM imaging through 150-250 μm thick brain tissues resolving dendritic spines, the 300-800 nm tiny protrusions from the dendrites whose morphology changes in response to neuronal activities associated with learning and memory41,42. Insufficient spatial resolution leads to an erroneous classification of spines43,44 due to their miniature sizes. The capacity to resolve spines' ultrastructure within their tissue environment is critical in detecting morphological changes in the same area of the functional measurements. This technological advancement will allow electrophysiological and morphological mapping of the same neural circuits linking functional and structural synaptic plasticity with animal behavior45. We imaged Thy1-ChR2-EYFP transgenic mice, expressing Channelrhodopsin-2 enhanced yellow fluorescent protein (EYFP) fusion protein in cortical L5 Thy1+ pyramidal cells46. Through a 250-μm-thick brain section, we resolved the distinct membrane distribution of the fluorescently tagged target decorating the dendritic spines (
+The output of “shortcut” connection here is the input to the residual block
++The output of “shortcut” connection here is result of the input to the residual block going through a 1 × 1 convolutional layer, with a stride of 4.
+Mirror mode 1-28 can have different shapes and levels in Wrms for different optical systems. The coefficient values here are the scaling factors for deformable mirror voltage control. They have arbitrary units. Their actual influences can be estimated by linear combining of measured mirror modes then calculating Wrms of the composed wavefront. This estimation is accurate only when mirror deforms linearly with input voltages.
Wavefront of a single emitter can be measured directly or indirectly if the fluorescent signal is stable and contain enough photon budget. Signals from photo-switchable or photoconvertible dyes in SMLM experiments blink stochastically with limited photons, making it difficult to measure wavefront. Still, wavefront can be obtained when building an in situ PSF model. However, this process requires accumulating thousands of emission patterns, and removing wavefront variations induced by lateral and axial positions of emitters during the iterative phase-retrieving process. Besides, the retrieved pupil phase wraps when wavefront deviation is larger than a wavelength, which cannot be used to feedback the correction element. Due to these difficulties in inferring aberration during SMLM imaging, current sensorless AO methods compensate aberration by iteratively introducing mirror changes then evaluating these changes with image-quality metrics. Though different optimization algorithms have been developed to improve the efficiency and robustness of the compensation, the effectiveness relies on the evaluation criteria, i.e. how to quantify the distortion changes. For conventional fluorescent microscopes, the intensity value can reflect distortion level. However, it cannot be adapted to SMLM due to the intrinsic intensity fluctuations in blinking molecules. Observed that high frequency components are less sensitive to intensity fluctuations, current sensorless AO methods describe the distortion level with a weighted sum of spatial frequency components in a SMLM frame, which is usually called an image-sharpness metric. Different weighting methods are designed to make the metric values respond correctly to distortion changes. Ideally, the metric values should change quadratically with an increasing amplitude of each mirror shape (when we scan large enough range of amplitude), and reach a maximum/minimum when the amplitude corresponds to an optimal compensation.
To test the current state-of-art metric design, we simulated the aberration inferring process. To rule out the sample structure induced variations, we simulated cases where there is only one molecule being detected. When the molecule being detected is right in-focus, we observed that the metric changes quadratically with increasing amplitude, and reaches a maximum value when amplitude is the ground-truth amplitude for optimal compensation (
It is difficult to design a feature extractor that summarizes aberration-related information from a SMLM frame, while ignoring irrelevant variations, such as intensity, background, and molecules' positions. Deep neural networks rely on backpropagation16 to turn the first few layers into an appropriate feature extractor, which has been demonstrated to learn the complex relationship from fluorescent beads and single molecule emission patterns to aberration. But the downside of automatic feature extraction is that result can become uncontrollable when irrelevant features are extracted for inference. Therefore, covering the sample space in training dataset is of key importance18. Through a careful characterization of DL-AO performance, we demonstrated that a trained DL-AO network can give estimation to experimental PSFs achieving a 3D normalized cross correlation (NCC) value of >0.95 when comparing measured PSFs with those generated from network estimation (
For SMLM, the shape of PSF is important for achieving optimal resolution. To compare the capability to restore PSF shapes, we tested DL-AO and current state-of-art metric-based AO, REALM2, on compensating sample induced aberrations based on same area of blinking molecules from immune-fluorescence-labeled Tom20 in COS-7 cells. For sample at coverslip surface, where the aberration is dominated by Coma, both of the methods can restore the PSF shapes. To quantify the restoration, we calculated 3D normalized cross correlation (NCC) between PSFs after AO and the PSFs measured under instrument optimum (Methods). We found that the PSFs after DL-AO is closer to the instrument optimum, with a similarity of 0.964±0.06 (mean±s.t.d, N=6) in NCC, versus a similarity of 0.939±0.031 (mean±s.t.d, N=6) after metric-based AO (
We observed that DL-AO reduces the distortion level to below 0.2 rad in with <5 mirror updates, when estimating from volumetric blinking data (Supplementary
General Workflow of SMLM Imaging with DL-AO
During SMLM imaging, the blinking data were collected at a laser intensity of 2-6 kW/cm2 and a frame rate of 50 Hz, where the first 100-2000 frames were used for DL-AO. Mirror shape is updated based on DL-AO networks' output every 20-100 frames. In the case where significant background photons were observed (about 100 per pixel per frame), a temporal median filter was used to estimate structured background for each pixel, and 100 frames were used to compute this background map. This background map was then subtracted from each camera frame before the frames are segmented into sub-regions for DL-AO processing. After DL-AO correction, 2000 frames were collected per cycle, and 20-120 cycles (50000-236000 frames, Table 1) were collected per imaging area. For the interleaved SMLM imaging without and with AO, deformable mirror shape was set to switch between DL-AO compensated shape and the shape used for instrument optimum (Methods) per imaging cycle (2000 frames). Acquisition of no-AO data was performed first in the interleaved sequence for fair comparison. Upon each switch between no-AO and DL-AO acquisitions, PIFOC objective positioner was moved to compensate apparent focal shift in the case of index mismatch induced aberration. The focal shifts were determined by an estimated linear relationship between the apparent focus shift and the amplitudes of two radially symmetric mirror deformation modes. The shifts per unit amplitude changes were empirically estimated to be −0.3 μm for mirror mode 5 and −0.2 μm for mirror mode 15. Here, a negative movement of PIFOC objective positioner corresponds to shifting the imaging plane closer to the bottom coverslip surface.
Before segmentation, the camera offset, with an estimated value of 100 ADU per pixel were removed from each detected camera frame. In the case where significant background photons were observed (about 100 per pixel per frame), a background map estimated by the temporal median filter was subtracted from each camera frame. Then we removed the camera gain by dividing each pixel value with an estimated gain of 2 ADU/e−. The non-positive pixel values in the each processed camera frame are also set to be 1×10−6. Then each pair of camera frames from two detection planes were sum together, after performing affine transformation to the frames detected on the second plane (‘imwarp’ function with ‘cubic’ interpolation type, MATLAB R2020a, The MathWorks, Inc.) to align the detections from two planes. The transformation matrix was obtained following the previously described method12: We first calculated the maximum intensity projection map of 1000-2000 frames containing single-molecule blinking events for each detection plane. Then we calculated the affine matrix based on these projection images in two planes (‘imregtform’ function, MATLAB R2020a, The MathWorks, Inc.). The same transformation matrix was calculated once and used for different specimen to avoid extra time delay in calculating transformation matrix for each frame.
We then segmented out biplane sub-regions, each of 32×32 pixels, using a segmentation algorithm26. To locate center coordinates of isolated PSFs in SMLM frames, two uniform filters with different kernel sizes (3×3 pixels and 9×9 pixels) were applied to each image, where the image is a summation of two frames from two detection planes. The images filtered with larger kernel size were subtracted from the images filtered with smaller kernel size. Then we applied a maximum filter to the resulting image to locate the pixels containing local maximum intensities. For pixels with local maximum intensities, we considered their positions as candidate sub-region centers if their pixel values are larger than an initial threshold (empirically chosen as 20 photon counts). We then discarded those candidate center coordinates that are closer than 26 pixels to prevent overlapping PSFs in one sub-region. Then we chose center coordinates for cropping sub-regions as the candidate coordinates whose pixel values are larger than a segmentation threshold (empirically chosen as 40-80 photon counts). The center coordinates were used to crop sub-regions out from the first detection plane. For cropping sub-regions in the second detection plane, we transformed coordinates in detection plane 1 to crop the PSF from SMLM frame in detection plane 2.
Aberration Estimation with Deep Neural Network
Pixels in each plane of the bi-plane sub-regions are normalized separately by dividing the maximum pixel value of that plane. Each input sub-region goes through a sequence of template matching processes, which are organized as convolutional layers and residual blocks with PReLU activations and batch normalizations in between, then “fully connects” through 1×1 convolutional layers to an output vector of 28 values—amplitude estimates for wavefront shapes in terms of the native mirror deformation modes.
The neural network resembles the architecture as previously developed single molecule network (smNet) for 21 Zernike coefficients' estimations, with slight modification to accommodate for the input and output size change. The detailed structure of neural network architecture is shown in Table 2. Convolutional layers are used throughout the architecture, as studies have shown that deep neural network architectures with the help of convolutional layers are capable of learning relevant features. Each convolutional kernel slide through the input images or feature maps, outputting a high value when local features have high similarity to the kernels. This process is similar to feature extraction process, and the output of each convolutional process is called a feature map. Except for the first two layers and the final layer connecting to the output, all other convolutional layers are packed into residual blocks6, which add outputs of “shortcut” connections (Supplementary Table 2) to the outputs of the stacked layers. The residual blocks were developed to address common issues in training deep architectures, e.g. overfitting, vanishing/exploding gradient6, and excessive inactive neurons. Due to limited GPU memory and computational resource, it is not practical to update network parameters using the entire training dataset, usually ˜6 million images, at the same time. Instead, a batch of 128 images (empirically chosen) is processed together and the gradient for updating network parameter is the average gradient of the batch. In each iteration, the images are processed batch by batch. The normalization step (Batch Normalization28) is used to normalize the output distribution of each convolutional layer. Activation functions are added to perform non-linear transform in between linear transformations with convolutional layers. We chose PReLU7 (Parametric Rectified Linear Unit) as the activation function due to the advantages of no saturation, computational efficient and fast convergence.
Combining Estimation with Kalman Filter
Our initial compensation starts when N0 sub-regions are segmented. The sub-regions are sent to the trained network, which then output N0 vectors of 28 mirror mode coefficients. Then we calculated the mean and variance for each mirror mode coefficient among the N0 estimations. The initial mirror update is applied according to the mean of estimation. Then we measure new SMLM frames under the updated mirror shape, and obtain N1 sub-regions. After calculating the mean and standard deviation among the N1 estimations, we update deformable mirror by multiplying the current estimation with a Kalman Gain29 (KG), which is the ratio between variance of the compensation history (till now it is the variance of initial estimation) and the sum of the new variance with history variance. Then the history variance is updated by multiplying with (1−KG). Similarly, future compensations will be new estimations damped by Kalman Gain, and we keep updating history variance after each mirror update. Each compensation starts only when Ni≥2 (for each compensation i=0, 1, 2, . . . ) to make sure there are enough sub-regions for estimating variance. The intuition behind the Kalman filter was to combine noisy measurements that are related, such that the combined prediction become closer to the ground truth in the criterion of mean squared error. Kalman Gain here is a weighting factor between prediction based on previous compensation and a new measurement after previous compensations. When the new measurement has higher variance comparing to the previous compensations, we damp the estimation value based on a variance ratio. The variance of the accumulated compensation will keep decreasing (or keep constant) through combining new measurements. Due to the uncontrollable availability of single molecule emission patterns with high signal-to-background ratio and the evolving PSFs after each correction, we use this process to weigh heavily on high precision measurements against the uncertain ones to ensure stable feedbacks from the network.
Compensating wavefront distortions inferred from PSFs of blinking molecules, we found that the network proposed mirror change fluctuates with non-vanishing uncertainty before/after each mirror update. This uncertainty increases with the network training range, resulting in a trade-off between the compensation range and stability (
The initial compensation starts with estimation from Network 1 (Table 4). By comparing the current estimation variance with two empirically chosen variance-thresholds th1 and th2, we decide whether we will switch to Network 2 or Network 3 respectively. For example, if current estimation variation is larger than th1, the program will use Network1 to continue estimate after the following compensation. With the help of Kalman filter, the history variance will continue decrease and reach below th1 or th2, the program will then switch to Network 2 or Network 3 for the following estimation respectively. Whenever we switch to a different network, we reset history variance to be the first estimation variance of the new network. Kalman filter was not applied to Network 3, due to its small uncertainty observed and the ignorable PSF shape changes caused by its small uncertainty. If the current estimation variance with Network 3 is larger than th1 or th2, the program will switch back to Network1 or Network2 respectively. We either manually stop the compensation after 10-20 compensations, or stop when the wavefront change proposed by Network 3 has a peak-to-valley value smaller than 1/20λ for three consecutive compensations.
Training neural network for deformable mirror control requires incorporating accurate wavefront deformations in training data generation. To incorporate these, we can measure the wavefront deformations induced by changes of either individual mirror actuator or several actuators together. However, representing wavefront with coefficients of orthogonal basis helps cut down on the number of outputs and network parameters to be optimized in training. Besides, nonorthogonal basis result in non-unique coefficients for representing the same wavefront, which cause vanishing compensation due to the requirement of averaging coefficients from estimations on different sub-regions. Forming this orthogonal basis directly from native mirror deformations further ensured the coefficients' accuracy in representing mirror responses. With this consideration, the conversion from mirror modes to Zernike polynomials—commonly used as the analytical basis to describe aberrations—is dropped to minimize mismatches between mirror responses and Zernike-based wavefront shapes.
Mirror mode generation process follows previously described methods27. In brief, the steps are: (1) simulating actuators' influence functions, i.e. wavefront deformations introduced by poking individual actuators. The simulation is performed by generating a Gaussian blur at the actuator's location, with Gaussian
where “a” represents the distance between nearby actuators' centers. (2) multiplying the wavefront deformation with a binary mask representing 2D shape of pupil in the optical system. (3) calculating the cross-talk between deformations induced by each actuator. (4) generating orthogonal deformation types by linear combining actuators' influence functions. The combination coefficients are found through singular value decomposition of the cross-talk matrix. The final orthogonal deformation types are called mirror modes. The relative actuator voltages for inducing these wavefront deformations in the optical system with deformable mirror are voltage maps of mirror modes.
We observed mismatch between expected wavefront deformations and experimental wavefront deformations, when loading the voltage maps generated in above process to the optical system. The expected wavefront deformations come from mirror mode and voltage map generation process as described above. The experimental wavefront deformations are measured through phase retrieval. Although the expected wavefront deformations are orthogonal with each other, due to the singular value decomposition process, the measured mirror modes are non-orthogonal with each other. We observed that the experimental wavefront deformations look similar to center areas of the expected wavefront deformations, indicating that the expected wavefront deformations were being cut on the boundary in the optical setup. After re-adjusting the relative positions between pupil and mirror actuators, we observed experimental mirror modes become more similar to the expected shapes. Such adjustment makes experimental mirror modes less coupled with each other. This coupling is verified by performing pixel-wise multiplication between a pair of measured mirror modes, where each mirror mode is normalized by dividing its root mean square. The final relationship between pupil and actuators are adjusted based on physical size of mirror actuators, with slight modification based on the measured influence functions. The residual difference between expected and experimental mirror modes is potentially caused by the mismatch between simulated and actual influence function of each actuator. Replacing the simulated actuator influence function with an experimental actual actuator influence function in the system during mirror mode generation is feasible, however, the generated mirror modes are noisy. Therefore, we proceed with this difference and incorporate this into PSF simulation process by using measured mirror modes.
The expected mirror deformations simulated from mirror mode generation process have arbitrary unit, which cannot be directly used for generating PSFs. The experimental deformation in optical system needs to be measured. The residual differences between theoretical expectations and experimental mirror deformations (
To build an accurate link between experimentally detected emission patterns and the mirror control with neural networks, it is imperative to train the network with data that match those obtained experimentally. However, experimental training data of single molecules are challenging to obtain, since the ground-truth wavefronts are usually unknown and the extensive variations of the intensity, background and the lateral and axial locations of single emitters, are impractical to cover experimentally. To this end, we simulate PSFs for training neural network. This allows us to efficiently generate millions of training PSFs based on experimentally measured wavefronts with highly accurate training ground truth (normalized cross correlation (NCC) value of >0.95, comparing measured PSFs with those generated from network estimation).
The static residue of system aberration after optimizing the microscope system is also incorporated as the baseline of the wavefront shapes. We measured the wavefront shape under instrument optimum (Methods) with the following steps: (1) collecting a stack of experimental PSFs at z positions from −1.5 to 1.5 μm, with a step size of 100 nm (Methods). (2) preprocessing the data to reduce the noise. (3) obtaining the pupil function through an iterative process based on Gerchberg-Saxton algorithm. Following the same process, we obtained two pupil functions h1(kx, ky) and h2(kx, ky) for the two detection planes. The relative defocus between two pupil functions were removed during phase retrieval process. The phase term contains the best achievable wavefront shape when compensating for sample induced aberrations. The common step of decomposing the obtained wavefront ψ0 into Zernike polynomials are excluded to avoid residual errors for representing the wavefront after decomposition. We used the Zernike expansion (Wyant ordering) of the pupil function to simulate PSFs at arbitrary positions (x, y, z). Using biplane setup comes with additional benefits: (1) Simultaneous detection at two axial planes provides improved Fisher information24 about wave-front distortion than one detection plane30. (2) The relative small PSF size compared to that of the Astigmatism setup results in increased number of sub-regions containing well-isolated emitters and thus the reliability of real-time aberration measurements.
Simulating PSFs with Wavefront Distortions
Each PSF was generated as follows: (1) generating wavefront distortion by linear combining measured mirror modes with coefficients (c1, c2, . . . , c28). These coefficients serve as ground truth label for each PSF. (2) generating normalized PSFs for two detection planes, μ01 and μ02, at position (x, y, z):
where
represent measured mirror modes. The terms
describe the defocus phase, where
is the axial component of the wave vector k and zd represents the axial distance between the two focal planes. (3) multiplying the normalized PSFs with photon count, I, and background count, bg to obtain μ1 and μ2.
where r represents intensity ratio between two detection planes.
Training a neural network to output the correct mirror mode coefficients while ignoring irrelevant pixel value variations, such as intensity, background, positions shift, is important. This is because we cannot control whether the network learns the correct features relate to the mirror mode coefficients. To avoid irrelevant features being taken into consideration, we generate these aberration-irrelevant parameters from a uniform distribution. The variation range of parameters for generating training dataset are included in Table 4.
Estimating a parameter based on a single measurement can deviate from the ground truth, as the uncertainty is identical to the standard deviation of measurement noise. Averaging a large number of repeated measurements reduces the estimation uncertainty, however, these repeats are difficult to carry out in practice. One situation is that the parameter is varying in time, e.g. tracking a car's position with GPS. Another case is that successive measurements can have different inaccuracies and uncertainties over time, e.g. weather change can affect the uncertainties of GPS readings. To prevent our estimation from fluctuating wildly before/after incorporating each new measurement, we can constraint our estimation with the knowledge that successive measurements of the same object are highly related with each other, e.g. car's position readings within 10 minutes cannot be larger than 100 miles. Other prior knowledge can come from either a theoretical model or reading from different sensors, e.g. measurement of a car's speed from odometer.
Kalman filter combines noisy measurements from different sources and the uncertain predictions from theoretical models, such that it tends to give an estimate closer to the ground truth than each single measurement/prediction. More specifically, Kalman filter computes a sequential minimum mean squared error (MMSE) estimator that allows us to estimate the parameter at each time point n based on available measurements on and before time n as n increases29. The optimal estimation at time point n is recursively computed by a weighted average between the prediction based on all previous information obtained before time point n and the new measurements obtained at time point n. The weighting factor, named Kalman Gain, is computed based on their uncertainties. Intuitively, this design judges if we should trust more on the new measurements at current time step or on the prediction made based on all previous measurements, according to the uncertainties.
Here we describe our application of a scalar Kalman filter in computing an upcoming compensation based on all available wavefront measurements pre-/post- each correction. Before applying nth compensation, we describe current ground truth mirror mode coefficient for mirror mode m as with sm[n]. We accumulate N[n] sub-regions and send to smNet for wavefront measurement. The measurement for mirror mode m, xm[n], is noisy due to the uncertainty of neural network estimation and the uncertainty of deformable mirror mechanical movement. To describe this relationship between noisy measurement and ground truth, we can write an equation for each mirror mode m as follow, which is called an observation equation:
where sm[n] represent the ground truth value of mirror mode m before nth compensation, wm[n] represent the measurement noise for mirror mode m. We assume that wm[n] is a Gaussian noise with mean
both changing at different n. We assume that the noise at different n are independent with each other and are uncorrelated with the ground truth sm[n]. We also assume that the measurements of different modes are uncorrelated with each other, so that we can use scalar Kalman filter to compute for each mirror mode separately. The standard deviation
σm[n]
can be calculated among smNet measurements based on different sub-regions. Instead of estimating sm[n] directly with the measurement xm[n] (i.e. set them to be equal), which results in an estimation uncertainty same as standard deviation of measurement noise wm[n], we estimate sm[n] by combining all available measurements
{xm[1],xm[2], . . . ,xm[n]}
to obtain an estimation closer to the ground truth. The criterion of “being closer to the ground truth”
ŝ
m
[n|n] closer to the ground truth sm[n].
is defined by minimizing the Bayesian MSE:
where the expectation is taken with respect to
p(xm[1],xm[2], . . . ,xm[n],sm[n]).
The solution, i.e. the optimal minimum mean squared error estimator (MMSE estimator), is the expectation of the posterior distribution:
To use Kalman filter, the expectation on the right hand side can be simplified as the following equation, assuming the estimator is linear:
where am,k are coefficients for linear combining all available measurements.
We then apply our nth compensation by changing the current wavefront shape using deformable mirror. With an ideally behaved deformable mirror, the amount of change is exactly our estimation
ŝ
m
[n|n].
Thus, for each mirror mode m, the ground truth value is changed to sm[n+1], which can be described by the following equation, commonly referred as the state equation:
Since the uncertainty of deformable mirror mechanical movement for each correction is unapproachable, we didn't include it in our state equation.
To restore the coefficients of each mode approaching 0, we need to find an optimal estimate (measured by MMSE),
ŝ
m
[n|n]
before each compensation. To find this, we can explicitly solve am,k in equation (7) for each compensation n. But this requires repeated computations with each new measurement arrives. Kalman filter calculates
the prediction of current mirror mode coefficient based on all previous measurements {xm[1], xm[2], . . . , xm[n−1]}:
Km[n] is the Kalman Gain, which acts as a weighting factor between new measurements xm[n] and prediction based on previous measurements, which is defined as:
where the
M
m
[n|n−1]
represents the Bayesian MSE error of the prediction, i,e, estimating sm[n] based on previous data before xm[n] is observed, which can be written as following, according to its definition in equation (3):
According to our state equation (8),
Thus we can relate the prediction error
M
m
[n|n−1]
with the estimation error before (n−1)th compensation,
by:
The Bayesian MSE error for nth compensation can be updated recursively based on the error in (n−1)th compensation by:
Intuitively, this process says that when the uncertainty of the new measurement is much larger than the uncertainty of prediction based on previous measurements, we will trust more on the prediction when estimating for the upcoming compensation. Therefore, as n increases, if we can keep obtaining new measurement which has significantly smaller uncertainty comparing to previous measurements, the accumulated Bayesian MSE error will be reduced. Once the error is reduced to be smaller than certain threshold, we will change the current driving network to another network trained with smaller range.
For the initial condition at n=1, we define km[1]=[1] and Mm[1][1]. According to our definition in equation (8), we assumed that the deformable mirror is behaving ideally, therefore, the prediction based on our previous measurements will be 0, i.e.
Thus, equation (9) for combining prediction and new measurement can be simplified as:
To construct samples with fluorescent beads nearby immune-fluorescence-labeled cells, we diluted 100-nm-diameter crimson beads (custom-designed, Invitrogen) to 1:1,000,000 in deionized water. Then 500 μL of poly-l-lysine solution (P4707, Sigma-Aldrich) was added to the coverslip with immune-fluorescence-labeled cells, incubated for 20 min and subsequently rinsed with deionized water (pipette gently to avoid washing out cells). We added 1 mL of the diluted bead solution to the coverslip, which was incubated for 20 min at room temperature (RT). Immediately before SMLM imaging, the coverslip without or with specimens attached was placed on a custom-made holder for imaging cells away from or at the bottom coverslip surface respectively. And imaging buffer (10% (wt/vol) glucose in 50 mM Tris, 50 mM NaCl, 10 mM MEA, 50 mM BME, 2 mM COT, 2.5 mM PCA and 50 nM PCD, pH 8.0) was added on top of the bottom coverslip. Then another coverslip with or without specimens was placed on top of the imaging buffer for imaging cells away from or at the bottom coverslip surface respectively. This coverslip sandwich was sealed with two-component silicone dental glue. To control the distance between specimens and bottom coverslip, we used following steps: 200 μL of poly-l-lysine solution was added to cleaned coverslip on bottom, incubated for 20 min and subsequently rinsed with deionized water. Then 20 μL of microsphere suspension (134 μm diameter, 7640A, Thermo Scientific) was spread around the outer ring area of the coverslip, and incubated at RT until the coverslip was dried. Then we placed this coverslip with microspheres at the bottom and added the coverslip with cells on top of it, with the cell-side surface facing down.
After finishing DL-AO compensation on a cell area, we moved manual stage (Manual MicroStage-LT, Mad City Labs Inc.) in lateral dimension to search for area containing fluorescent beads. An area with one isolated bead, i.e. with no other fluorescent structures within its neighborhood of 60 pixels×60 pixels, was chosen to measure the 3D PSF stack. The axial position of the bead was adjusted to be approximately in focus in the one detection plane. The PSFs were acquired at a series of z positions from −1.5 μm to 1.5 μm, with a step size of 100 nm, and 3 frames per z position, where z positions are moved by the PIFOC objective positioner (ND72Z2LAQ, Physik Instrumente). The frame rate was chosen from 5-10 Hz and laser power was ˜50 W/cm2. Both frame rate and laser power needs to be adjusted accordingly, such that the PSFs at ±1.5 μm contrast against background and the highest pixel value among the stack doesn't saturate the camera. To acquired 3D PSF stack without DL-AO compensation, we reset deformable mirror the shape as the one obtained for instrument optimum (Methods), and following the measurement process as described above. Only one detection plane is used for comparing PSFs without and with DL-AO.
To obtain the pupil function, which describes how wavefront from a single molecule is affected upon transmission through specimen and imaging system, we performed phase retrieval algorithm on the PSFs measured without and with DL-AO. Multiple phase retrieval methods have been developed for SMLM10,11,31,32. In this work, we chose a phase retrieval method10 based on Gerchberg-Saxon algorithm9,33,34, which is an iterative process of retrieval the pupil function from a series of PSFs with various amounts of known defocus.
Before performing phase retrieval, the following process was used to preprocess the measured PSFs, which is modified from previous developed method. First, we cropped out an area of 50×50 pixels in each acquired camera frame. We then computed an average of the 3 frames acquired at each z position. The camera offset, with an estimated value of 100 ADU per pixel were removed from each detected camera frame. Then we removed the camera gain by dividing each pixel value with an estimated gain of 2 ADU/e−. The non-positive pixel values in the each processed camera frame are also set to be 1×10−4. The PSF stack is first shifted to its center laterally, with the lateral shift relative to the center of the cropped region estimated by fitting a 2D Gaussian to the most in-focus PSF. Then the background is subtracted from each frame, with background estimated from the minimum of four values, each represent the mean value of the pixels at the four edges of the corresponding cropped image. Following that, a circular mask with diameter of 50 pixels are multiplied to the resulting image to set pixel values outside the mask to be zero. The negative pixel values in the resulting image are also set to be zero. Then the images sizes are restored to 128×128 pixels by padding zeros on the boundaries. Following that, each image is normalized to sum to 1.
Our parameters used for performing phase retrieval are: numerical aperture of the objective lens NA=1.35, the emission wavelength λ=680 nm, the refractive index of the objective immersion medium n=1.406, the refractive index of the sample medium n=1.35 or 1.406 (measured by Abbe refractometer, 334610, Thermo Scientific) and the effective pixel size of 119 nm on the sample plane. Phase retrieval algorithm was performed on eight PSFs selected from the measured PSF stacks (−1.5 μm, −1.1 μm, −0.7 μm, −0.3 μm, 0.1 μm, 0.5 μm, 0.9 μm, 1.3 μm).
For PSF at each axial position, an initial empty pupil was set to start phase retrieval. A defocus phase computed from the expected axial distance to the approximated focus was added to the pupil phase. After performing Fourier transform of the composed pupil, we replaced the magnitude of the resulting image with the square root of the measured PSF. Following an inverse Fourier transform, we updated our pupil function. The defocus phase added before transform was removed from the resulting pupil to estimate the system pupil. Pupil functions computed from all eight PSFs were averaged to complete the first iteration. And the averaged pupil function will be the initial pupil function for the next iteration. The above process was repeated for 25 iterations to complete the initial phase retrieval process. From the resulting pupil, we estimated the tip, tilt and defocus phase and removed them from the pupil phase to re-center the PSF stack. Subsequently, phase retrieval algorithm was performed again on the measured PSF stack with the updated lateral and axial positions, then a radial modification of the magnitude components of the pupil function was performed on the phase retrieved PSF model to account for the difference between measured PSF and phase retrieved PSF model. The final pupil function is obtained with 10 repeats of re-centering PSF stack, performing phase retrieval to the PSF stack, and modifying the magnitude component of the pupil function.
We constructed three different in vitro PSF models and pupil functions for comparing SMLM reconstruction without and with AO. The first one is in vitro PSF model and pupil function from bottom beads under instrument optimum, which is used in ‘DL-AO+PR’ certain figures herein. This is obtained by performing phase retrieval on the PSF stacks measured under instrument optimum. The second one is in vitro PSF model and pupil function with index mismatched aberration, which is used in ‘no AO+PR’ in certain figures herein. This is obtained by performing phase retrieval on the PSF stacks measured next to the compensation area. The third one is in vitro PSF model with theoretical index mismatch aberration, which is used in ‘no AO+PR’ in in certain figures herein. This is obtained by adding a theoretically derived index mismatch induced aberration phase to the pupil function measured under instrument optimum.
We constructed in situ PSF models and pupil functions for each compensation area for SMLM reconstruction without and with AO. The models were obtained by performing in situ phase retrieval on the blinking dataset measured for SMLM reconstruction, with INSPR software12. For segmentation process, we set the sub-region size to 32×32 pixels, initial intensity threshold to 25, segmentation threshold to 40, distance threshold to 26. And we accumulated at least 3000 PSFs for INSPR model generation. We set similarity threshold to 0.5, and group threshold to 30 during INSPR model generation.
The present application claims the benefit of and priority to U.S. provisional patent application Ser. No. 63/469,610, filed May 30, 2023, the content of which is incorporated by reference herein in its entirety.
Number | Date | Country | |
---|---|---|---|
63469610 | May 2023 | US |