The present invention relates generally to mosaicing algorithm, and relates more particularly to a method for mosaicing frames from a video sequence acquired from a fibered confocal scanning device.
Fibered confocal microscopy (FCM) is based on the principle of confocal microscopy which is the ability to reject light from out-of-focus planes and provide a clear in-focus image of a thin section within the sample. This optical sectioning property is what makes the confocal microscope ideal for imaging thick biological samples. The adaptation of a fibered confocal microscope for in vivo and in situ imaging can be viewed as replacing the microscope objective by a flexible microprobe of adequate length and diameter in order to be able to perform in situ imaging. For such purpose, a flexible fiber bundle is used as the link between the scanning device and the miniaturized microscope objective (see
The laser scanning unit, performs a scanning of the proximal surface of the flexible optical microprobe with the laser source by using two mirrors. Horizontal line scanning is performed using a 4 kHz oscillating mirror while a galvanometric mirror performs frame scanning at 12 Hz. A custom synchronization hardware controls the mirrors and digitizes, synchronously with the scanning, the signal coming back from the tissue using a mono-pixel photodetector. When organized according to the scanning, the output of the FCM can be viewed as a raw image of the surface of the flexible image bundle. Scanning amplitude and signal sampling frequency have been adjusted to perform a spatial over-sampling of the image bundle. This is clearly visible on the raw image in
Each fiber of the bundle provides one and only one sampling point on the tissue. Associated with these sampling points comes a signal that depends on the imaged tissue and on the single fiber characteristics. The role of the image processing is first to build a mapping between the FCM raw image and the fibers composing the image bundle. Once the mapping between the raw data and each individual fiber is obtained, characteristics of each fiber are measured and the effective signal coming back from the tissue is estimated.
The input of the mosaicing algorithm will therefore be composed of non-uniformly sampled frames where each sampling point corresponds to a center of a fiber in the flexible fiber bundle. A collection of typical frames acquired in vivo is shown in
Fibered confocal microscopy (FCM) is a promising tool for in vivo and in situ optical biopsy. This imaging modality unveils in real-time the cellular structure of the observed tissue. However, as interesting as dynamic sequences may be during the time of the medical procedure or biological experiment, there is a need for the expert to get an efficient and complete representation of the entire imaged region. The goal of the present invention is to enhance the possibilities offered by FCM. Image sequence mosaicing techniques can be used to provide this efficient and complete representation and widen the field of view (FOV). Several possible applications are targeted. First of all, the rendering of wide-field micro-architectural information on a single image will help experts to interpret the acquired data. This representation will also make quantitative and statistical analysis possible on a wide field of view. Moreover, mosaicing for microscopic images is a mean of filling the gap between microscopic and macroscopic scales. It allows multi-modality and multi-scale information fusion for the positioning of the optical microprobe. FCM is a direct contact imaging technique. In order to image and explore a region of interest, the optical microprobe is glided along the soft tissue. The displacement of the optical microprobe across the tissue can be described by a rigid motion. Since FCM is a laser scanning device, an input frame does not represent a single point in time. In contrast, each sampling point corresponds to a different instant. This induces motion artifacts when the optical microprobe moves with respect to the imaged tissue. Furthermore, the interaction of the contact optical microprobe with the soft tissue creates additional small non-rigid deformations. Due to these non-linear deformations, motion artifacts and irregular sampling of the input frames, classical video mosaicing techniques need to be adapted.
However the invention has a broader scope because it can apply to non-fibered confocal scanning microscope or to any scanning device providing a video sequence.
Another aim of the present invention is to improve the gain in both noise level and resolution of input frames whatever the acquisition system used.
According to one aspect of this invention, a method for mosaicing frames from a video sequence acquired from a scanning device, each frame being constituted by a point cloud, comprises the steps of:
a) compensating for motion and motion distortion due to the scanning device, particularly in order to map all frames according to a common reference coordinate system,
b) applying a global optimisation of inter-frame registration to align consistently the frames,
c) applying a construction algorithm on the registered frames to construct a mosaic, and
d) applying a fine frame-to-mosaic non rigid registration.
Preferably, the frame motion speed is between zero and x*f mm/s, x being the imaged field of view diameter in mm, and f the frame rate in Hz (e.g. 6 mm/s for 0.5 mm field of view, at 12 Hz).
More precisely, the step a) comprises the steps of:
a1—applying a local pairwise rigid registration,
a2—estimating a set of globally consistent transformations by considering a least square criterion, and
a3—using linear transformation to correct motion distortions
Steps a1, a2 and a3 constitute a first iteration loop, said first iteration loop being carried out three or four times. As a matter of fact, the number of iteration loop depends on the content of the video sequence; “three or four times” being a typical non limitative example.
During step a2, the least square criterion is applied to a transformation space which is a Lie Group.
According to another aspect of the invention, an estimation of transformation parameters is obtained by solving the following non linear least square equation:
According to yet another aspect of the invention, step c) comprises a discrete Shepard's like algorithm in which the value associated with each point in a chosen regular grid is given by:
where (pk, ik) is a set of sampling points and their associated signal,
More precisely, step d) consists in a demons algorithm applied between each frame and the constructed mosaic from step c). In the case of a scanning microscope with flexible fibered microprobe, the demons algorithm can be modified:
Advantageously, steps c) and d) constitute a second iteration loop. Preferably, the scanning device is a fibered confocal microscope using a bundle of several thousand of fiber optics.
In other words, the present invention is based on a hierarchical framework that is able to recover a globally consistent alignment of the input frames, to compensate for the motion-induced distortion of the input frames (simply called motion distortion hereafter) and to capture the non-rigid deformations. The global positioning is presented as an estimation problem on a Lie group. An efficient optimization scheme is proposed to solve this estimation problem. Because the motion distortions are induced by the motion of the optical microprobe, this relationship is modelled and used to recover the motion distortions. An efficient scattered data fitting method is also proposed to reconstruct on a regular grid the irregularly sampled images that arise from the inputs and from the mosaic construction process. This reconstruction method is also used when non-rigid deformations are recovered with an adapted demons algorithm.
In order that the present invention may be more clearly ascertained, preferred embodiments of the present invention will now be described, by way of example, with reference to the accompanying drawings in which:
Although the invention is not limited to this, the method according to the present invention implemented in a confocal microscope with laser scanning in fibre mode will now be described.
The scanning of the laser can be decomposed into a fast horizontal sinusoidal component and a slow linear uniform vertical component. Horizontally, the imaging is done only on the central part of the trajectory, where the spot velocity is maximal and nearly constant. Since in this part, the spot horizontal velocity Vx (>5 m/s) is several orders of magnitude higher than the spot vertical velocity Vy (˜2 mm/s), we assume that the horizontal spot velocity is infinite.
An interesting point of scanning imaging devices is that the output image is not a representation of a given instant, but a juxtaposition of points acquired at different times. Consequently, if the flexible microprobe moves with respect to the imaged tissue, what we observe is not a frozen picture of the tissue, but a skewed image of it. Each scan line indeed relates to a different instant, and the flexible microprobe moved between each scan line.
Considering a standard 2D+t volume, without scanning each acquired frame would correspond to a single time instant to and thus to a 2D horizontal slice of the volume. With scanning a point of ordinate y corresponds to the time t0+y/Vy. The process of image formation therefore comes down to imaging an oblique plane of the volume.
The goal of many existing mosaicing algorithms is to estimate the reference-to-frame mappings and use these estimates to construct the mosaic. Small residual misregistrations are then of little importance because the mosaic is reconstructed by segmenting the field into disjoint regions that use a single source image for the reconstruction. Even if these reconstruction techniques can ignore small local registration errors, a globally consistent alignment framework is needed to avoid large accumulated registration errors as shown in
Since input frames of the present invention are rather noisy, all the available information will be used to recover an approximation of the true underlying scene. The frame-to-reference transformations (instead of the usual reference-to-frame) will therefore be estimated and all the input sampling points as shifted sampling points of the mosaic will be considered. This has several advantages. First of all, this is really adapted to irregularly sampled input frames because we will always use the original sampling points and never interpolate the input data. This approach is also more consistent with a model of noise appearing on the observed frames rather than on the underlying truth. Finally in this framework, it will be possible to get a mosaic at a higher resolution than the input frames. The aim is to obtain an accurate estimate of the unknown transformations.
Each frame of the input sequence is modeled as a noisy and deformed partial view of a ground truth 2D scene (the mosaic we want to recover). Let I be the unknown underlying truth and In be the observed frames. Each observed sampling point p in the coordinate system Ωn associated with the nth input frame can be mapped to a point in the reference coordinate system Ω by the nth frame-to-reference mapping fn. Each observed sampled value In(p) can then be seen as a noisy observation of the ground truth signal I(fn(p)):
I
n(p)=I(fn(p))+εn(p)∀pεΩn (1)
where Σn(p) is a noise term. Note that according to this observation model, we need to recover the frame-to-reference mappings as opposed to many existing mosaicing algorithms that estimate the reference-to-frame mappings (see
We specifically designed our transformation model for fibered confocal microscopy. The displacement of the optical microprobe across the tissue can be described by a rigid shift denoted by rn. Since FCM is a laser scanning device, this motion of the optical microprobe with respect to the imaged tissue induces some motion artifacts. These distortions can be modeled by a linear transformation vn that will be described in more details in Section 4. Finally, due to the interaction of the contact optical microprobe with the soft tissue, a small non-rigid deformation bn appears. The frame-to-reference mappings are thus modeled by:
f
n(p)=bn·rn·vn(p). (2)
As depicted in
We start by assuming that the motion distortions as well as the non-rigid tissue deformations can be ignored. By making the reasonable assumption that consecutive frames are overlapping, an initial estimate of the global rigid mappings can be obtained by using a rigid registration technique to estimate the motion between the consecutive frames. Global alignment is then obtained by composing the local motions. This initial estimate suffers from the well-known accumulation of error problem illustrated in
The first loop of our algorithm (steps 1, 2 and 3 in
Once a globally consistent set of transformations is found, the algorithm constructs a point cloud by mapping all observed sampling points onto a common reference coordinate system. An efficient scattered data fitting technique is then used on this point cloud to construct an initial mosaic. The residual non-rigid deformations are finally taken into account by iteratively registering an input frame to the mosaic and updating the mosaic based on the new estimate of the frame-to-mosaic mapping.
In step 2 of our algorithm, we use all available pairwise rigid registration results to estimate a set of globally consistent transformations. A sound choice is to consider a least-square approach. However, the space of rigid body transformations is not a vector space but rather a Lie group that can be considered as a Riemannian manifold.
Many sets of primitives used in image processing and computer vision can be considered as real Lie Groups or as quotients of real Lie groups (e.g. 2D rigid body transformations, tensors (Pennec et al., 2006), quaternions, upper triangular matrices, M-reps. Most of them are not vector spaces and paradoxes such as Bertrand's paradox appear when one considers a Lie Group as a vector space within an estimation problem.
The goal of this section is to provide a basic toolbox for estimation problems on real Lie groups. This synthesis uses classical tools from differential geometry and focuses on Lie groups to simplify the results and notations. Detailed treatment of differential geometry and Lie groups can be found in textbooks. By using differential geometry, we will be able to generalize many algorithms designed for the usual vector space case.
A Lie Group G is a smooth manifold together with a smooth composition map (usually denoted as multiplication) and a smooth inverse map that satisfy the group axioms:
closure, associativity, existence of a neutral element (denoted hereafter as Id) and existence of an inverse. Two important mappings for us are the left-compositions and right-compositions by an element m:
L
m
:xεG
L
m(x)=m·xεG,
R
m
:xεG
R
m(x)=x·mεG,
They are diffeomorphisms by definition. Hence they naturally induce the following differential maps (in a particular local coordinate system or chart):
which allow us to map the space TxG of tangent vectors to G at x to its counterpart Tm·xG or Tx·mG.
Because many estimation problems involve a measure of discrepancy between two elements, it is natural to look for a definition of a distance between two elements of a Lie group. This can be done by providing the Lie group with a Riemannian metric which is a continuous collection of dot products on the tangent space Tx(G) at x:<v|w)x=vT·G(x)·w. Because of the smoothness of the Lie Group, we can smoothly translate a dot product at the Id-tangent space to any other tangent space by left or right composition thanks to the differential maps above.
In the sequel we focus on left invariant metrics. Thanks to the left-composition differential map DLx, they can be represented by the matrix field,
G(x)=DLx(Id)−T·G(Id)·DLx(Id)−1. (3)
The Riemannian metric provides the intrinsic way of measuring the length of any smooth curve on the Lie group. The distance between two points of a Lie Group is then the minimum length among the curves joining these points. The curves realizing this minimum for any two points of the manifold are called geodesics. The calculus of variation shows that the geodesics follow a second order differential system depending on the Riemannian metric. Therefore, we know that there exists one and only one geodesic γ(x,u)(·) defined for all times (thanks to the left-invariance), going through x at time t=0 and having u as a tangent vector.
A key notion of differential geometry is the exponential map. Let us consider a given point x of the Lie group, a vector u in the corresponding tangent space and the uniquely associated geodesic γ(x,u). The exponential map is the function that maps u to the point γ(x,u)(1) reached after a unit time by the geodesic:
This function realizes a local diffeomorphism from a neighbourhood of 0 to a neighborhood of x by developing the tangent space along the geodesics (see
expx=yu=logxy.
In the context of Lie groups, the exponential map notion can be ambiguous as one can use the Lie group exponential or the Riemannian manifold one. Both definitions agree if and only if the Lie group admits a left-and-right-invariant Riemannian metric (such as for compact Lie groups) used to define the geodesics. Unfortunately, for the group of rigid body transformations, it can be shown that no bi-invariant metric exists. In the context of estimation problems, we are mainly interested in distance measurements and we will therefore stick to the Riemannian exponential. When a left-invariant metric is used, the exponential and log maps at any point of a Lie group can be related through left composition to their counterpart at the identity with the following equations:
logx(y)=DLx(Id)logId(x−1·y), (4)
expx(u)=x·expId(DLx(Id)−1u). (5)
With the log map, the (geodesic) distance between two points is given by:
As shown above, we see that even if we mainly use the Lie group as a Riemannian manifold, its additional structure is of great practical interest because it allows us to map every tangent space to the one at the identity. We therefore only need to have computational tools for one tangent space, the tangent space at the identity.
Lie groups are usually not vector spaces, and the notion of expectation can not readily be extended to it. However, for any metric space, it is possible to define probabilistic spaces, random elements and probability density functions. Expectations and other usual tools are then defined for random variables, which are real-valued functions of the probabilistic events, but not directly for random elements of the group. However, by changing the definition of the expectation and using the Riemannian geometry tools presented in Section 2.2, it turns out that a consistent statistical framework (including the mean, the covariance and the Mahalanobis distance) can be defined.
In a vector space, the mean can be seen as the element that minimizes the expected distance to a random vector. This point of view allows us to generalize the mean for Lie groups. Let x be a random element and let
σx2(y)=E[dist(y,x)2]
be its variance at the (fixed) element y. Note that this is well defined because dist(y,·) is a real-valued function.
Let x be a random element of a Lie group G. If the variance σx2(y) is finite for all elements yεG, every element minimizing the variance is called a Fréchet mean element.
The set of Fréchet mean elements is thus given by
It can be shown that under suitable conditions (that are assumed to be fulfilled here), there exists one and only one Fréchet mean which we denote as E[x].
To define higher moments of a distribution, on a Lie group, the exponential map at the mean point is used. The random feature is thus represented as a random vector with zero mean in a star-shaped domain. With this representation, the covariance matrix can be defined by:
The Mahalanobis distance, which plays a key role to get a robust estimation of the global positioning, can be defined. The Mahalanobis distance is a classical tool to define a statistical distance in a sample space with known covariance matrix. Its definition can easily be extended to Lie groups by using the above definition of the covariance matrix. The Mahalanobis distance of a point y to a random feature with Fréchet mean E[x] and covariance matrix Σxx is given by
u
x
2(y)=logE[x](y)TΣxx−1 logE[x](y). (9)
Now, we will show how the problem of global positioning can be cast to an estimation problems on a Lie group. The first step of our algorithm is to find a globally consistent set of transformations to map the input frames to a common coordinate system. When the input frames arise from a single gliding of the flexible microprobe along a straight line, it may be possible to generate decent alignments by computing only pairwise registrations between the consecutive frames.
However in the general case, all spatial neighbors frames might not be temporal neighbors, and accumulated errors can lead to a poor global alignment as shown in
Given two input frames, we need to estimate the (pairwise) frame-to-frame transformation.
At this step of the algorithm, we assume that the nonrigid tissue deformations can be ignored and that the motion distortions have been correctly estimated. With these assumptions, we need to perform rigid registrations of distortion corrected input frames. For that purpose, we use a classical registration framework based on a similarity criterion optimization but any other technique can be used. Let rj,i(obs) be the pairwise rigid registration result between the distortion corrected input frames i and j. This result is considered as a noisy observation of r1, . . . , rN. Based on the set of all available observations, our algorithm looks for a globally consistent estimate of the global parameters r1, . . . , rN. In Sawhney et al. (1998), the authors propose a more general approach. Some chosen corner points are transformed through ri and rj·rj,i(obs). The squared distance between the transformed points added to a regularization term is then minimized. These techniques are sensitive to outliers, and are either tailored to a specific type of transformation or need a somewhat ad hoc choice of points. In this paper, we chose to rely on the tools presented in Section 2 in order to provide consistent and robust estimates of the global rigid body transformations. Practical instantiations of the generic tools presented in Section 2 are given, for 2D rigid body transformations, in the Appendix. The computational cost of registering all input frames pairs is prohibitive and not all pairs of input frames are overlapping. It is therefore necessary to choose which pairs could provide informative registration results. For that purpose, we chose the topology refinement approach proposed in Sawhney et al. (1998). An initial guess of the global parameters r1, . . . , rN is obtained by registering the consecutive frames, the algorithm then iteratively chooses a next pair of input frames to register (thus providing a new observation rj,i(obs) and updates the global parameters estimation. As we only consider the pairwise registration results as noisy observations, we need many of them. In order to minimize the computational cost of those numerous registrations, we use a multiresolution registration technique using a Gaussian image pyramid that stops at a coarse level of the optimization.
Let e be a random error whose Fréchet mean is the identity and whose covariance matrix is Σee. The observation model is given by:
r
j,i
(obs)
=r
j
−1
·r
i
·e
j,i
(obs) (10)
where ej,i(obs) is a realization of the random error e.
Given the set of all available pairwise rigid registration results Θ, we need to estimate the true transformations. A natural choice is to minimize the statistical distance (i.e. Mahalanobis distance) between the observations rj,i(obs) and the transformations rj−1·ri predicted by our model. Our global criterion is thus given by:
It can be seen that this criterion is insensitive to a composition of all transformations with a fixed arbitrary transformation. We can therefore choose any transformation in r1, . . . , rN to be the identity transformation Id. Without loss of generality we can look for any minimizer in (11) and then compose all estimates with a common rigid body transformation so that we get for example
The covariance matrix used for the Mahalanobis metric depends on the application. We typically start with a diagonal matrix that weights the angles with respect to the translation. This matrix can further be estimated from the data as explained in Section 3.3. According to (9), each term of the sum in (11) is given by:
where See is a matrix square root (e.g. Cholesky factorization) of Σee−1. We see that we need to solve the non-linear least squares problem:
where: φ(r1, . . . , rN)=Vect({See·logId(ri−1·rj·rj,i(obs))}(i,j)εΘ).
Note that this criterion is indeed non-linear because of the composition involved in the computation of ej,i(obs).
Many usual optimization algorithms work by making a series of straight steps towards the optimum. Most often, it first looks for a walking direction and then finds a step length according to a predefined rule (e.g. Gauss-Newton) or by using a line search technique (e.g. conjugate gradient). In our Lie group framework, the straightest paths are given by the geodesics. The idea is thus to walk towards an optimum by a series of steps taken along a geodesic of the Lie group rather than walking in the tangent vector space (i.e. using a usual optimization routine on the representation of the elements).
Thanks to the exponential map, we have a canonical way to follow a geodesic starting from a given point and having a given initial tangent vector. It is thus possible to combine the power of intrinsic geodesic walking and the ease of use of classical optimization routines in a natural way.
The Gauss-Newton method forms the basis for the efficient methods that have been developed for non-linear least squares optimization. We now show how it can be used in the Lie group setting. Let x be an element of a Lie group G and let
be the function we want to minimize. The Gauss-Newton method is based on a linear approximation of φ(·) in a neighborhood of the current estimate. We denote by φx(·)=φ(expx(·)) the expression of φ(·) in a normal coordinate system at x. Its Taylor expansion around the origin of this chart is given by:
By keeping only the linear part we get the following approximation:
The Gauss-Newton step minimizes this approximation:
It is well known that if Jφ(x) has full rank, this has a unique minimizer which is the solution of J100 (x)T·J100(x)·v=−J100(x)T·φ(r1:N).
Our optimization routine now follows the geodesic starting from the current estimate x(old) and whose tangent vector is vgn. We thus get the following update equation:
The classical Gauss-Newton routine uses λ=1 at all steps, but a line search could also be used.
We have shown how to adapt the Gauss-Newton procedure for a non-linear least squares problem on a Lie group.
A very similar approach would also provide extensions of other classical non-linear least squares optimizers (such as the Levenberg-Marquardt method or Powell's dog leg method) for Lie groups.
This method is applied to solve (11), where the Lie group we use is the Cartesian product of N rigid body transformation groups. The Jacobian Jφ[(r1, . . . , rN)] can easily be computed by seeing that:
Within this general framework, several improvements can easily be added. In order to make the rigid registration between the distortion corrected input frames, we chose to optimize the squared correlation coefficient. It is then straightforward to weight the terms of the cost function (11) by this confidence measure. A well-known problem of pure least squares approach is the sensitivity of the solution to outliers in the observations. Many solutions have been proposed to handle the presence of outliers. The most common ones rely on using only a subset of the observations (e.g. least trimmed squares, reweighted least squares) or on a minimization of the sum of a function of the residuals (e.g. M-estimators).
In order to be able to use the efficient least-squares optimizer presented above, the easiest is to use the first approach or to solve the M-estimator problem using iteratively reweighted least squares.
In our particular setting we chose the simple reweighted least squares approach:
where ρj,i is the correlation coefficient between the registered distortion corrected frames i and j, and γ is the 95% quantile of the X2 distribution with 3 degrees of freedom. If enough observations are available, our procedure also includes an estimation of the covariance matrix Σee.
We have shown that when using a laser scanning device, the relative motion of the imaged object with respect to the acquisition device induces distortions. Without any further assumption, these distortions can have a very general form. This can for example be seen in some famous photographs by Henri Lartigue or Robert Doisneau shot with a slit-scan camera. In our particular case, the main relative motion is due to the gliding of the flexible microprobe along the tissue and some residual movements can be produced by the deformation of the soft tissue.
We again use a hierarchical approach and ignore the effect of the tissue deformation. We thus face the problem of recovering the gliding motion of the flexible microprobe.
This motion will typically be smooth and will mainly be composed of a translation part because of an important torsion resistance of the flexible microprobe. Note that even if consecutive frames can only be slightly rotated, more time distant frames can have a large rotational difference. In order to be able to robustly recover the gliding motion, we need to further constrain it. We will therefore assume that during the time period Tscan taken by the laser to scan a complete input frame, the flexible microprobe is only animated with a translational motion with constant velocity vector {tilde over (η)}=[{tilde over (η)}x, {tilde over (η)}y].
Let p=[x,y] be a point in the coordinate system related to the input frame. As shown before, each laser scan line, indexed by the ordinate y of a point, corresponds to a different time instant t(y)=t(0)+y/Vy, where we recall that Vy is the vertical velocity of the laser scanning process. The point p will therefore be shifted by (t(y)−t(0)). {tilde over (η)}=y/Vy{tilde over (η)} with respect to the center of the coordinate system. In order to simplify the notations, we will from now on use the normalized velocity η=1/Vy{tilde over (η)}. We now have a way to map a point p of an input frame to a point pd in the corresponding distortion compensated frame by using the following linear transformation:
This is the explicit expression for the motion distortion transformations, i.e. vk(p)=M(η)·p for a given frame k (see
From this distortion model, we can derive the transformation model mapping an input frame k to another frame j. With the assumptions that the non-rigid deformations can be ignored, we have fj=rj·vj, fk=rk·vk and fj,k=fj−1·fk, where we recall that fn denotes a global frame-to-reference mapping. This implies that:
f
j,k
=v
j
−1
·r
k
·v
k
=v
j
−1
·r
j,k
·v
k (14)
In the local-to-global positioning scheme presented in Section 3, we mentioned that we needed to perform rigid registrations of distortion compensated frames. By using (14), we see that it is possible to make these registrations without the need to explicitly create the motion compensated frames. We simply look for the best rigid body transformation rj,k while keeping vk and vj fixed.
In order to recover the motion distortions, one could try to register the frames using the complete transformation model (14). However, this would imply to ignore the relationship between positions and velocity and would thus not really be robust. We therefore chose to compute the velocities using the displacements information only.
Using finite difference equations, we can relate the global positioning and the velocity η. Since our motion model ignores the rotational velocity, we can focus on the displacement of the center of the flexible microprobe. Let r=[θ/τ] be the rigid mapping between two consecutive distortion compensated input frames. The center [0,0] of the first distortion compensated frame is mapped through r to the point τ in the second motion distortion compensated frame. The displacement of the center of the flexible microprobe during the corresponding interframe time period Tframe is thus given by τ. By using a forward difference equation to relate the speed vector and the displacement vector, we get:
Now let us assume that, with the current estimates of the rigid mapping r(old) and velocity η(old), the two consecutive input frames are correctly aligned. By using (14), we see that the center [0,0] of the first (uncompensated) input frame is mapped to the point q=M(η(old))−1·τ(old) in the (uncompensated) second input frame. Let r(new) and n(new) be the updated rigid mapping and velocity. In order to keep a correct alignment, the center of the first (uncompensated) input frame should still be mapped to the point q in the (uncompensated) second input frame. We therefore need to have:
M(η(new))−1·τ(new)=M(η(old))−1·τ(old) (16)
On the other hand, according to (15), we should have
The above system is solved to get updated estimations. A similar procedure is used with the backward difference equation and the different estimations are averaged to get the final update. As shown in
Once a globally consistent, motion compensated, mosaic has been constructed, it is possible to make a refined multi-image registration by iteratively registering each input frame to the mosaic and updating the mosaic. The mosaicing problem can be written as an optimization problem over the unknown underlying image I and the unknown transformations [f1, . . . , fN] of the following multi-image criterion,
where S(Ia,Ib) is a usual pairwise similarity criterion between the two images Ia and Ib. With this framework our mosaic refinement procedure can be seen as an alternate optimization scheme. We again use a hierarchical approach and divide the fine frame-to-mosaic registrations into two loops of increasing model complexity. First we refine the global linear mappings. Then, in order to account for the small non-rigid deformations, we have adapted the demons algorithm (Thirion, 1998) to our special case of irregularly sampled input frames. The general demons scheme is modified as follows:
The optical flow is computed for all demons. Its computation requires the gradient of the reference image which in our case is a non-uniformly sampled input frame. In order to get an approximation of this gradient, we reconstruct the non-uniformly sampled input frame on a regular grid using the scattered data approximation method that will be presented in Section 5. We then use the gradient of this reconstruction.
We mainly used the above scheme because of its efficiency and its adaptation to our particular data, but other schemes could also be used. The residual deformation fields [b1, . . . , bN] could be modeled by using B-splines tensor products on a predefined grid. The framework can also easily be extended to use any other non-rigid registration methods using landmarksbased schemes or more accurate deformation models.
The iterative mosaic refinement scheme presented in Section 4.3 requires a new mosaic construction at each iteration. Furthermore, the adapted demons algorithm needs a method for smoothing deformation fields that are defined on a sparse grid, together with a method being able to construct a regularly sampled image from an irregularly sampled input. These goals can be achieved with a single method for scattered data approximation provided that it allows us to control the degree of smoothness of the reconstruction. If we want to register the input frames with the mosaic, controlling the smoothness of the reconstruction is also important. We indeed need a mosaic that is smooth enough for the registration process not to be trapped in a local minimum but detailed enough for the registration result to be accurate.
As appears above, the scattered data approximation method will be used many times. It is therefore necessary to use a very efficient algorithm. The usual algorithms for scattered data interpolation or approximation, such as triangulation based methods, Kriging methods, radial basis functions interpolations, B-Spline approximations or moving least squares methods (see e.g. Amidror, 2002) do not simultaneously meet the requirements of efficiency, and control over the smoothness of the approximation. In the sequel we develop a main contribution which is an efficient scattered data fitting algorithm that allows a control over the smoothness of the reconstruction.
5.1. Discrete Shepard's like method
Let {(pk,ik)εΩ×R} be the set of sampling points and their associated signal. Our goal is to get an approximation of the underlying function on a regular grid Γ defined in Ω.
The main idea is to use a method close to Shepard's interpolation.
The value associated with a point in Γ is a weighted average of the nearby sampled values,
The usual choice is to take weights that are the inverse of the distance, hk(p)=dist(p,pk)−1. In such a case we get a true interpolation (Amidror, 2002). An approximation is obtained if a bounded weighting function hk(p) is chosen.
We choose a Gaussian weight hk(p)=G(p−pk)∝exp(−∥p−pk∥2/2σa2) and (16) can thus be rewritten as
where δpk is a Dirac distribution centered at pk.
Finding the sampling points that are close enough to a given point is a time consuming task. Moreover, the positions of the sampling points are only known up to a certain accuracy and the resolution of the reconstruction is always limited by the spacing of the chosen grid Γ. Our method will therefore convert the point cloud to a list of pixels in the reconstruction grid Γ by using a nearest neighbor rule. This mapping is the key to the very efficient method presented here.
This scattered data approximation technique requires only two Gaussian filtering and one division and is thus very efficient. The smoothness is controlled by the variance of the Gaussian kernel.
5.2. Mosaic Construction
The above algorithm can further be tailored for the problem of mosaic reconstruction. Once an estimate {circumflex over (f)}n of the frame-to-reference mapping fn is available, we get a point cloud composed of all transformed sampling points from all the input frames
{(pk,ik)}={({circumflex over (f)}n(p),In(p))|pεΛn,nε[0, . . . , N]} (20)
where Λn is the set of sampling points in the input frame n.
A common approach, when constructing mosaic images, is to minimize the seam artifacts by using feathering or alpha blending. With this approach, the mosaic image is a weighted average on the input images and the weighting factor depends, for example, on the distance to the center of the image. This allows to smooth the transitions between the input images. Moreover one could have a confidence measure for each input frame based, for example, on the estimated velocity of the frame. Weighting the input frames with this confidence measure for the reconstruction of the mosaic could help getting a visually more pleasing mosaic image. Both approaches reduce to weighting the importance of a given mapped input sampling point pk by some factor ρk.
This weight can readily be used in (17) and we get:
Adapting the algorithm is thus straightforward. We only need to change the N and D update steps by N(pks)←N(pks)+ρkik and D(pks)←D(pks)+ρk. Finally there is often no need to extrapolate the data too much or to reconstruct zones that have a very small confidence. This can easily be achieved by setting the value of the mosaic to an arbitrary (but fixed) value when the total weight factor D(p) is below a predefined threshold.
The experimental evaluation of our approach was carried out on a reflectance fibered confocal microscope from Mauna Kea Technologies shown in
In order to validate the global positioning and motion distortion compensation framework, image sequences of a rigid object were acquired. The object needed to have structures that could be seen with the reflectance fibered confocal microscope. For the mosaicing to be of interest, we also needed to see shapes whose size were larger than the field of view of our imaging device. We therefore chose to image a silicon wafer.
A fair evaluation can only be made by comparing the output of the algorithm with independent information. Apart from simulated data, a ground truth of the imaged region is very difficult to get. Even with a standard microscope having a comparable resolution but a greater FOV it is not easy to see on the wafer whether the exact same region is being imaged or not. In addition to the mosaic, our algorithm estimates the motion of the flexible microprobe. The following evaluations use a computer numerical control (CNC) milling machine, shown in
In the first experiment, the milling machine was programmed to perform two consecutive circles with a radius of 125 μm. The first circle is a clockwise one whereas the second one is a counterclockwise one. The final shape thus looks like an “8”. The motion starts and ends at the center of this shape. The milling machine was programmed to keep a constant tangential velocity vector during the experiment. The radius of the circles were chosen so that in the middle of the circles a small blind zone remains. We therefore have both regions where many frames overlap (the center of the “8”), and regions where fewer frames do (the top and bottom of the “8”).
As a point of comparison, we compute an initial mosaic by registering the pairs of consecutive frames under a rigid motion assumption. An initial global positioning is computed by composing these local rigid body transformations. No frame distortion compensation is carried out.
In
In order to further evaluate the quality of our reconstruction we used a standard microscope to acquire images of the silicon wafer on a similar zone (being able to image the very same spot is a very difficult task due to the redundancy of the wafer pattern).
In the second experiment, the milling machine was again programmed to perform two circles with radius 125 Im. In addition to this motion, the table holding the silicon wafer was programmed to perform a rotation of angle −π/3 around a fixed axis. Both motions have been synchronized to start and end at the same time instant. The rotational velocity of the table was programmed to remain constant. Once again, the milling machine was programmed to keep a constant tangential velocity vector during the experiment. Because of the custom made part needed for the milling machine to hold the flexible microprobe, it was not possible to program the center of rotation of the table to be aligned with the center of the “8” motion imposed by the milling machine. We have however been able to roughly position it there.
a shows the mosaic reconstructed using the present global frame positioning and motion distortions compensation algorithms. We have superimposed the estimated trajectory of the flexible microprobe together with two axes showing the estimated orientation of the flexible microprobe. For obvious reasons of clarity, only a subset of the estimated positions are shown. Note that the first and last frame (denoted by F and L in
The power of our hierarchical scheme is shown in
Fibered confocal microscopy is designed to visualize cellular structures in living animals. We applied our mosaicing algorithm to different types of tissue imaged in vivo with the Celivizio© as presented in
In the field of colon cancer research, the development of methods to enable reliable and early detection of tumors is a major goal. In the colon, changes in crypt morphology are known to be early indicators of cancer development. The crypts that undergo these morphological changes are referred to as Aberrant Crypt Foci (ACF) and they can develop into cancerous lesions.
In laboratory rodents, ACF can either be induced by colon-specific carcinogens or through transgenic mutations. As in humans, ACF in mice are known to be reliable biomarkers for colon cancer and are used to study initiation, promotion and chemoprevention of colorectal cancer. Currently, ACF are routinely imaged, detected and counted under a dissecting microscope following staining with methylene blue. In order to do this, the animal is sacrificed and the whole colon is excised and opened flat.
Compared to this method, fluorescence FCM enables the operator to see the lesions in real-time and to make an almost immediate evaluation without sacrificing the animal. This offers the possibility of studying groups of individual animals over extended periods with the benefits of reduced inter-animal statistical variation and reduced number of animals used per experiment. However, in many cases the limited field of view restricts the confidence that the operator has in the ACF counting.
By offering an extended field of view, mosaicing techniques can be an answer to this restriction.
The effectiveness and relevance of the proposed algorithm for this study are shown on a sequence that has been acquired in vivo on a mouse colon stained with acriflavine at 0.001%. The mouse was treated with azoxymethane (AOM) to induce a colon cancer. The input sequence is composed of fifty frames, each with a field of view of 425 μm by 303 μm. As shown in
b shows the improvement we make with respect to a simple mosaic construction, shown in
Our method has also been successfully applied to many other types of sequences acquired in both mouse and human colon as shown in
The Cellvizio© offers a new way to image and characterize many types of tissue. In many cases, mosaicing can help move beyond the limitations of FCM by offering an extended field of view. We provide some insight of this by showing the result of our algorithm on the remaining sample sequences shown in
a shows a mosaic constructed from 21 input frames, each with a FOV of 417 μm by 297 μm. In this figure, we can see mouse tumoral angiogenesis. The need for in vivo imaging is urgent in this field. It can indeed help assess the efficiency of angiogenesis therapy. Mosaicing techniques can further help in getting objective quantitative measurements. The result shown in
d shows the ability of the Cellvizio© to image nervous tissue down to the dendritic endings and shows how mosaicing can help seeing many of those dendritic endings simultaneously. 70 input frames all with a FOV of 397 μm by 283 μm were used to produce the mosaic.
The results shown in
In conclusion, the present invention concerns a new method of video mosaicing for in vivo soft tissue fibered confocal microscopy. A fully automatic robust hierarchical approach is proposed. For example, rigorous tools for estimation problems on Lie groups are used to develop a robust algorithm to recover consistent global alignment from local pairwise registration results. A model of the relationship between the motion and the motion distortion is developed and used to robustly compensate for the motion distortions arising when using a laser scanning device. An efficient scattered data fitting technique is proposed for the construction of the mosaic. This tool is also used to adapt the demons algorithm for non-rigid registration with irregularly sampled images.
Lie group tools for 2D rigid transformations
In this section, we provide the explicit expressions of the tools defined in Section 4 for the specific case of interest, the 2D rigid body transformations R.
A rigid body transformation r is composed of a rotation of angle θ (expressed in [0,2π] mod(2π)) followed by a translation τ=(τx, τy). This set of parameters provides a chart or local coordinate system. The composition and inversion are easily expressed in this coordinate system. Let r1 and r2 be two rigid body transformations represented by (θ1, τ1) and (θ2, τ2), we have:
is a rotation matrix. In this coordinate system, the differential maps previously presented are given by:
We can notice here that the left composition differential map DLr(s) does not depend on s.
We now look at the left-invariant Riemannian metric G(x) we use on this Lie group. Let u1=(dθ1, dτ1) and u2=(dθ2, dτ2) be elements of the Id-tangent space TId
We choose the canonical dot product on TId
<u1|u2>Id=dθ1dθ2+dτ1T·dτ2G(Id)=I3,
According to (3), the left invariant metric is thus given, at all points, by the matrix G(x)=DLr−T·I3·DLr−1=I3. The Riemannian metric is, in this case, of the simplest possible form and does not depend on the tangent space. This particular Lie group is thus flat and geodesics are straight line. The Riemannian logarithm at the identity is therefore simply given by the parameters expressed in the chart we chose:
logId(r)=[θ,τx,τy].
Another common choice, would be to take G(Id)=−diag([λ2,1,1]) in order to weight the influence of the angles with respect to the translation, but using the Mahalanobis distance defined in Section 2.3 makes this weighting unnecessary.
Although the present invention and its advantages have been described in detail, it should be understood that various changes and substitutions can be made herein without departing from the principle and the scope of the invention. Accordingly, the scope of the present invention should be determined by the following claims and their legal equivalents.