The present disclosure relates to image processing and, in particular, systems and methods for blob detection using deep learning.
Imaging biomarkers play a significant role in medical diagnostics and in monitoring disease progression and response to therapy. Development and validation of imaging biomarkers involves detection, segmentation, and classification of imaging features, and various conventional deep learning tools have been developed to perform these functions. However, these deep learning tools are strongly affected by image quality. Moreover, there are challenges in detecting objects in images, particularly small objects known as blobs. These challenges include low image resolution, image noise, and overlap among the blobs. Conventional imaging tools were developed to, e.g., precisely map and measure individual glomeruli in kidneys, which would allow detection of kidney pathology. However, such conventional blob detectors are not robust to noise and/or require large datasets for training, leading to high false positive rates and/or limited applicability in medical applications where sample sizes are often small. Thus, these conventional blob detectors are unable to perform measurement of glomeruli in kidneys efficiently and reliably. Accordingly, improved blob detection systems and methods are desirable.
In various embodiments, systems, methods, and articles of manufacture (collectively, the “system”) for blob detection using deep learning are disclosed. In various embodiments, the system may include a non-transitory computer-readable storage medium configured to store a plurality of instructions thereon which, when executed by a processor, cause the system to: train a U-Net and generate a probability map including a plurality of centroids of a plurality of corresponding blobs; derive, from the U-Net, two distance maps with bounded probabilities; apply Difference of Gaussian (DoG) with an adaptive scale constrained by the two distance maps with the bounded probabilities; and apply Hessian analysis and perform a blob segmentation.
In various embodiments, the system may include a multi-threshold, multi-scale small blob detector.
In various embodiments, the two distance maps may include binarized maps of distances between the plurality of centroids of the plurality of corresponding blobs utilized to bound a search space for scales of the DoG.
In various embodiments, the plurality of instructions may be further configured to cause the system to generate a Hessian convexity map using the adaptive scale.
In various embodiments, the plurality of instructions may be further configured to cause the system to eliminate an under-segmentation of the U-Net.
In various embodiments, the system may include a Bi-Threshold Constrained Adaptive Scale (BTCAS) blob detector configured to perform the plurality of instructions.
In various embodiments, the plurality of instructions may include an implementation of a modified fully convolutional network including one or more concatenation paths.
In various embodiments, a method for blob detection using deep learning is disclosed. The method may include: obtaining an image for detecting a plurality of blobs; pre-training a U-Net to generate a probability map to detect a plurality of corresponding centroids of the plurality of blobs; deriving, from the U-Net, two distance maps including bounded probabilities; deriving, from the two distance maps, a plurality of bounded scales; smoothing each window of each centroid of the plurality of centroids with a Difference of Gaussian (DoG) filter, wherein the DoG filter includes an adaptive optimum scale constrained by the bounded scales; conducting a Hessian analysis on the smoothed window of the each centroid; and identifying a plurality of final segmented voxels sets corresponding to the plurality of blobs.
In various embodiments, the image may include an image of kidney glomeruli.
In various embodiments, the deriving the two distance maps may include minimizing a global loss function

min_Θ (1/N) Σ_{n=1}^{N} loss(U(X_n; Θ), Y_n)

wherein: the X is an input image; the Y is a denoised image; the N is a sample size; the U(X; Θ) is a probability map mapping the X to the Y by learning and optimizing the parameters Θ of convolutional and deconvolutional kernels, followed by a sigmoid activation function; and the loss(·) is a binary cross entropy loss function.
In various embodiments, the binary cross entropy loss function may be defined as

loss(U(X; Θ), Y) = −(1/(I1·I2·I3)) Σ_{k=1}^{I1·I2·I3} [y_k log U_k(X; Θ) + (1−y_k) log(1−U_k(X; Θ))]

wherein: the I1, I2, and I3 are dimensions of the image; the y_k is a true label; and the U_k(X; Θ) is a predicted probability for a voxel k.
In various embodiments, the method may further include outputting, to a computer in communicative connection with the U-Net, a count of the plurality of blobs in the image.
In various embodiments, the conducting the Hessian analysis may include eliminating an under-segmentation of the U-Net.
In various embodiments, an apparatus for blob detection using deep learning is disclosed. The apparatus may include a non-transitory computer-readable storage medium configured to store a plurality of instructions thereon which, when executed by a processor, cause the apparatus to: train a U-Net and generate a probability map including a plurality of centroids of a plurality of corresponding blobs; derive, from the U-Net, two distance maps with bounded probabilities; apply Difference of Gaussian (DoG) with an adaptive scale constrained by the two distance maps with the bounded probabilities; and apply Hessian analysis and perform a blob segmentation.
In various embodiments, the apparatus may include a multi-threshold, multi-scale small blob detector.
In various embodiments, the plurality of instructions may be further configured to cause the apparatus to generate a Hessian convexity map using the adaptive scale.
In various embodiments, the plurality of instructions may be further configured to cause the apparatus to eliminate an under-segmentation of the U-Net.
In various embodiments, the apparatus may include a Bi-Threshold Constrained Adaptive Scale (BTCAS) blob detector configured to perform the plurality of instructions.
The foregoing features and elements may be combined in various combinations without exclusivity, unless expressly indicated herein otherwise. These features and elements as well as the operation of the disclosed embodiments will become more apparent in light of the following description and accompanying drawings. The contents of this section are intended as a simplified introduction to the disclosure, and are not intended to limit the scope of any claim.
The subject matter of the present disclosure is particularly pointed out and distinctly claimed in the concluding portion of the specification. A more complete understanding of the present disclosure, however, may best be obtained by referring to the detailed description and claims when considered in connection with the following illustrative figures. In the following figures, like reference numbers refer to similar elements and steps throughout the figures.
The detailed description of various embodiments herein makes reference to the accompanying drawings and pictures, which show various embodiments by way of illustration. While these various embodiments are described in sufficient detail to enable those skilled in the art to practice the disclosure, it should be understood that other embodiments may be realized and that logical and mechanical changes may be made without departing from the spirit and scope of the disclosure. Thus, the detailed description herein is presented for purposes of illustration only and not of limitation. For example, the steps recited in any of the method or process descriptions may be executed in any order and are not limited to the order presented. Moreover, any of the functions or steps may be outsourced to or performed by one or more third parties. Furthermore, any reference to singular includes plural embodiments, and any reference to more than one component may include a singular embodiment.
For the sake of brevity, conventional techniques for blob detection, deep learning, and/or the like may not be described in detail herein. Furthermore, the connecting lines shown in various figures contained herein are intended to represent exemplary functional relationships and/or physical couplings between various elements. It should be noted that many alternative or additional functional relationships or physical connections may be present in practical blob detection systems or methods.
As used herein, “electronic communication” means communication of at least a portion of the electronic signals with physical coupling (e.g., “electrical communication” or “electrically coupled”) and/or without physical coupling and via an electromagnetic field (e.g., “inductive communication” or “inductively coupled” or “inductive coupling”). As used herein, “transmit” may include sending at least a portion of the electronic data from one system component to another (e.g., over a network connection). Additionally, as used herein, “data,” “information,” or the like may encompass information such as commands, queries, files, messages, data for storage, and/or the like, in digital or any other form.
As used herein, “satisfy,” “meet,” “match,” “associated with,” or similar phrases may include an identical match, a partial match, meeting certain criteria, matching a subset of data, a correlation, satisfying certain criteria, a correspondence, an association, an algorithmic relationship, and/or the like. Similarly, as used herein, “authenticate” or similar terms may include an exact authentication, a partial authentication, authenticating a subset of data, a correspondence, satisfying certain criteria, an association, an algorithmic relationship, and/or the like.
Systems, methods, and computer program products are provided. In the detailed description herein, references to “various embodiments,” “one embodiment,” “an embodiment,” “an example embodiment,” etc. indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described. After reading the description, it will be apparent to one skilled in the relevant art(s) how to implement the disclosure in alternative embodiments.
Principles of the present disclosure contemplate use of advanced techniques for detecting and enumerating blobs, for example, kidney glomeruli. The systems and methods disclosed herein may be utilized to process and/or evaluate images, such as medical images of kidneys.
Systems and methods disclosed herein include Bi-Threshold Constrained Adaptive Scale (BTCAS) blob detector configured for determining a relationship between U-Net threshold and Difference of Gaussian (DoG) scale to derive a multi-threshold, multi-scale small blob detector. With lower and upper bounds on the probability thresholds from U-Net, two binarized maps of distance may be rendered between blob centers. Each blob may be transformed into a DoG space with an adaptively identified local optimum scale. Furthermore, a Hessian convexity map may be rendered using the adaptive scale, and advantageously, an under-segmentation typical of the U-Net may be resolved thereby.
To validate the performance of the BTCAS blob detector described herein, a 3D (three-dimensional) simulated dataset (n=20) of blobs, a 3D MRI (magnetic resonance imaging) dataset of human kidneys, and a 3D MRI dataset of mouse kidneys were studied. During the validation, the BTCAS systems and methods were compared against four other methods—HDoG (Hessian-based Difference of Gaussians), U-Net with standard thresholding, U-Net with optimal thresholding, and UH-DoG (U-Net and Hessian-based Difference of Gaussians)—using Precision, Recall, F-score, Dice coefficient, and IoU (Intersection over Union), and the BTCAS systems and methods were found to statistically outperform the compared detectors.
With respect to the present disclosure, monotonicity of the U-Net probability map was first shown, laying a foundation for the blob detector described herein. With lower and upper bounds on probability thresholds, two binarized maps of distance between blob centers were then rendered. Since the true blob would fall between the two distance maps with a specified level of certainty, the search space for the DoG scales was bounded. Each blob could then be transformed to a local optimum DoG space (e.g., instead of using a single global optimum scale). Then, a Hessian convexity map was rendered using an adaptive scale, and the under-segmentation typical of U-Net was resolved.
To validate the performance of the BTCAS blob detector described herein, a 3D simulated dataset (n=20) where the locations of blobs were known was studied and compared against other methods. Blob detection using these methods applied to 3D images of three human kidneys and a set of 3D images of mouse kidneys from CFE-MRI (cationic ferritin-enhanced magnetic resonance imaging) was then compared against the HDoG, UH-DoG, and stereology.
Methods
Systems and methods for Bi-Threshold Constrained Adaptive Scale (BTCAS) blob detector disclosed herein include two steps to detect blobs (e.g., glomeruli) from CFE-MRI of kidneys: (1) training U-Net to generate a probability map (e.g., denoising a raw image) to detect the centroids of the blobs, and then deriving two distance maps with bounded probabilities; and (2) applying the Difference of Gaussian (DoG) with an adaptive scale constrained by the bounded distance maps, followed by Hessian analysis for final blob segmentation.
Bi-Threshold Distance Maps from U-Net
In various embodiments and with reference to
In some embodiments including supervised learning applications where the output labeling is known, U-Net 100 may be directly used as a model for segmentation. In some embodiments wherein the output labeling is unknown, U-Net 100 may be used to process and denoise the images. Here, since the ground truth is unknown, the denoising capabilities of U-Net 100 are investigated. It is noted that, for CFE-MRI of kidneys, the glomeruli are extremely small—similar to noise that can potentially be removed by autoencoders. A major difference between U-Net and autoencoders, however, is that U-Net has concatenation paths that can transfer fine-grained information from low layers to high layers to improve segmentation performance. Thus, U-Net may advantageously (e.g., as compared to an autoencoder model) remove background noise from the MR images while simultaneously enhancing glomerular detection.
Let X∈[0,1]^(I1×I2×I3) be a normalized 3D input image with noise and Y∈[0,1]^(I1×I2×I3) the corresponding noise-free image, such that:

X = Y + ε, ε ~ N(0, σ²I). (1)

In various embodiments, U-Net 100 is to obtain a function U(·; Θ) mapping X to Y by learning and optimizing the parameters Θ of convolutional and deconvolutional kernels. This may be achieved by minimizing the global loss function:

min_Θ (1/N) Σ_{n=1}^{N} loss(U(X_n; Θ), Y_n) (2)

where N is a sample size, U(X; Θ)∈[0,1]^(I1×I2×I3) is the probability map from U-Net followed by a sigmoid activation function, and loss(·) is the binary cross entropy loss function:

loss(U(X; Θ), Y) = −(1/(I1·I2·I3)) Σ_{k=1}^{I1·I2·I3} [y_k log U_k(X; Θ) + (1−y_k) log(1−U_k(X; Θ))] (3)

where y_k∈{0,1} is the true label and U_k(X; Θ)∈[0,1] is the predicted probability for voxel k. After denoising, the output of U(·) may approximate Y:

U(X; Θ) ≈ Y. (4)
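The denoising objective of (1)–(4) can be sketched numerically. The following is a minimal illustration (the name `bce_loss` is illustrative, not from the disclosure): a prediction equal to the clean labels Y scores a lower binary cross entropy than the noisy input X itself.

```python
import numpy as np

def bce_loss(y_true, y_pred, eps=1e-7):
    """Binary cross entropy averaged over all voxels, per Eq. (3)."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred)
                    + (1 - y_true) * np.log(1 - y_pred))

# Toy data following Eq. (1): clean labels Y plus Gaussian noise.
rng = np.random.default_rng(0)
Y = (rng.random((8, 8, 8)) > 0.5).astype(float)     # clean labels
X = np.clip(Y + rng.normal(0, 0.1, Y.shape), 0, 1)  # noisy input
```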
In various embodiments, glomeruli in CFE-MR images are roughly spherical in shape, with varying image magnitudes. Based on this observation, Proposition 1 may be developed. Proposition 1 as well as Proposition 2 discussed herein are described with more detail in a separate section of the present disclosure.
A first use of Proposition 1 may be to identify the centroid of any blob. From Proposition 1, the centroid of any bright blob may reach maximum probability. Therefore, a regional maximum function RM may be applied to the probability map U(x, y, z) to find voxels with maximum probability from the connected neighborhood voxels as blob centroids:

RM(U(x,y,z)) = {(x,y,z) | U(x,y,z) ≥ U(x′,y′,z′), ∀(x′,y′,z′): d((x,y,z),(x′,y′,z′)) ≤ k} (5)

where k is the Euclidean distance between each voxel and its neighborhood voxels. The blob centroid set C={C_i}_{i=1}^{N} may be defined as:

C = {(x,y,z) | (x,y,z) ∈ arg RM(U(x,y,z))}. (6)
Here, k=1. Each blob centroid Ci∈C may have maximum probability within 6-connected neighborhood voxels.
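The centroid extraction of (5)–(6) with k=1 (6-connected neighborhood) can be sketched with a maximum filter. This is an illustrative sketch, not the disclosed implementation; the helper name `blob_centroids` is assumed for illustration.

```python
import numpy as np
from scipy.ndimage import generate_binary_structure, maximum_filter

def blob_centroids(prob_map):
    """Regional maxima of a 3D probability map within a 6-connected
    neighborhood (k = 1), per Eqs. (5)-(6)."""
    footprint = generate_binary_structure(3, 1)  # 6-connectivity + center
    neighborhood_max = maximum_filter(prob_map, footprint=footprint,
                                      mode="constant", cval=0.0)
    # A centroid has the maximum probability among its face neighbors;
    # exclude flat zero background.
    is_max = (prob_map == neighborhood_max) & (prob_map > 0)
    return np.argwhere(is_max)
```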
A second use of Proposition 1 may be to binarize the probability map with a confidence level. Otsu's thresholding may be used first to separate noise voxels from blob voxels and to extract the probability distribution of the blob voxels. Next, instead of using a single threshold, the two-sigma rule may be applied to the distribution to identify the lower probability δL and the higher probability δH covering a 95% range of the probabilities. As a result, the probability map may then be binarized to B_L(x,y,z)∈{0,1}^(I1×I2×I3) with threshold δL and to B_H(x,y,z)∈{0,1}^(I1×I2×I3) with threshold δH.
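The bi-threshold selection just described can be sketched as follows, assuming a minimal Otsu implementation and the two-sigma rule on the blob-voxel probabilities (the names `otsu_threshold` and `bi_thresholds` are illustrative):

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Minimal Otsu's thresholding over values in [0, 1] (illustrative)."""
    hist, edges = np.histogram(values, bins=bins, range=(0.0, 1.0))
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(hist).astype(float)       # class-0 counts per cut
    w1 = w0[-1] - w0                         # class-1 counts per cut
    mu0 = np.cumsum(hist * centers)
    mu1 = mu0[-1] - mu0
    valid = (w0 > 0) & (w1 > 0)
    between = np.zeros(bins)
    between[valid] = w0[valid] * w1[valid] * (
        mu0[valid] / w0[valid] - mu1[valid] / w1[valid]) ** 2
    return centers[np.argmax(between)]       # maximize between-class variance

def bi_thresholds(prob_map):
    """Two-sigma rule on blob-voxel probabilities: (delta_L, delta_H)
    covering ~95% of the distribution above the Otsu cut."""
    cut = otsu_threshold(np.ravel(prob_map))
    blob_probs = prob_map[prob_map > cut]
    m, s = blob_probs.mean(), blob_probs.std()
    return max(m - 2 * s, 0.0), min(m + 2 * s, 1.0)
```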
From Proposition 2, B_L(x, y, z) may approximate a blob with larger size and B_H(x, y, z) may approximate a blob with smaller size. Without loss of generality, B(x,y,z) may be a binarized probability map, Ω={(x, y, z)|B(x, y, z)=1} may be defined as the set of blob voxels, and ∂Ω the set of boundary voxels. d(·) is the Euclidean distance function of any two voxels. The Euclidean distance between each voxel and the nearest boundary voxels may be:

D(x,y,z) = min_{(x′,y′,z′)∈∂Ω} d((x,y,z),(x′,y′,z′)), (x,y,z)∈Ω. (7)

Given B_L(x,y,z) and B_H(x,y,z), two distance maps may be derived by (7), D_L(x,y,z)∈R^(I1×I2×I3) from B_L(x,y,z) and D_H(x,y,z)∈R^(I1×I2×I3) from B_H(x,y,z). (8)
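The two distance maps of (7)–(8) can be sketched with a Euclidean distance transform. This is an illustrative sketch (the helper name `bi_distance_maps` is assumed); since B_H ⊆ B_L, the low-threshold map upper-bounds the high-threshold map everywhere.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def bi_distance_maps(prob_map, delta_L, delta_H):
    """Binarize at the two bounds (delta_L < delta_H), then take the
    Euclidean distance to the nearest background voxel, per Eq. (7)."""
    B_L = prob_map >= delta_L   # approximates larger blobs
    B_H = prob_map >= delta_H   # approximates smaller blobs
    return distance_transform_edt(B_L), distance_transform_edt(B_H)
```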
With reference to
For each blob centroid C_i∈C (see (6)), radius r_i of blob i may be approximated as:

r_i ∈ (D_H(C_i), D_L(C_i)). (9)
As known in the relevant art, the smoothing scale in DoG is positively correlated with the blob radius. Here, the bounded radius information in (9) may be used to constrain the adaptive scales in DoG imaging smoothing, as described further herein.
Bounded Adaptive Scales in DoG and Hessian Analysis
For a normalized 3D image X(x, y, z)∈[0,1]^(I1×I2×I3), the DoG transform may be defined as:

DoG(x,y,z;s) = (G(x,y,z;s+Δs) − G(x,y,z;s)) * X(x,y,z) (10)

where s is a scale value, * is the convolution operator, and the Gaussian kernel is G(x,y,z;s) = (1/((2π)^(3/2) s³)) exp(−(x²+y²+z²)/(2s²)).
The DoG filter smooths the image more efficiently in 3D than the Laplacian of Gaussian (LoG) filter does. In various embodiments addressing challenge(s) in determining the optimum DoG scale in blob detection, the distance maps (D_L and D_H) from U-Net 100 may be applied to constrain the DoG scale for scale inference. Specifically, for d-dimensional images, the DoG may reach a maximum response under scale s = r/√d. In a 3D image, the range of scale for each blob may be s_i∈(s_i^L, s_i^H), and substituting r with (9) results in:

s_i^L = D_H(C_i)/√3 (11)

s_i^H = D_L(C_i)/√3 (12)
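The DoG filtering of (10) and the per-blob scale bounds of (11)–(12) can be sketched as follows. This is an illustrative sketch assuming `scipy.ndimage.gaussian_filter` as the Gaussian smoother; the helper names `dog` and `scale_bounds` are not from the disclosure.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog(window, s, delta_s=0.001):
    """Difference of Gaussians of a local window at scale s, per Eq. (10)."""
    return gaussian_filter(window, s + delta_s) - gaussian_filter(window, s)

def scale_bounds(d_h_at_c, d_l_at_c):
    """Per-blob DoG scale bounds from the two distance maps evaluated at
    the centroid C_i, per Eqs. (11)-(12)."""
    return d_h_at_c / np.sqrt(3), d_l_at_c / np.sqrt(3)
```

For a bright blob, increasing the smoothing scale lowers the peak, so the DoG response at the blob center is negative.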
For each blob, a normalized DoG_nor(x,y,z;s_i) with multi-scale s_i∈(s_i^L, s_i^H) may be applied on a small 3D window with size N×N×N (N > 2·D_L(C_i)) and window center being the blob centroid C_i∈C. For each voxel (x, y, z) in DoG_nor(x,y,z;s_i) at scale s_i, the Hessian matrix for this voxel may be:

H(x,y,z;s_i) =
[∂²D/∂x² ∂²D/∂x∂y ∂²D/∂x∂z]
[∂²D/∂y∂x ∂²D/∂y² ∂²D/∂y∂z]
[∂²D/∂z∂x ∂²D/∂z∂y ∂²D/∂z²] (13)

where D denotes DoG_nor(x,y,z;s_i).
In a normalized DoG-transformed 3D image, each voxel of a transformed bright or dark blob may have a negative or positive definite Hessian, respectively. Taking a bright blob as an example, the Hessian convexity window, HW(x, y, z; s_i), may be defined as a binary indicator matrix:

HW(x,y,z;s_i) = 1 if H(x,y,z;s_i) is negative definite; 0 otherwise. (14)
In various embodiments, for each blob with centroid C_i∈C, the average DoG value for each window BW_DoG may be:

BW_DoG(s_i) = (1/|{(x,y,z): HW(x,y,z;s_i)=1}|) Σ_{(x,y,z): HW(x,y,z;s_i)=1} |DoG_nor(x,y,z;s_i)|. (15)
The optimum scale s_i* for each blob may be determined if BW_DoG(s_i*) is maximum with s_i∈(s_i^L, s_i^H). The optimum scale s_i* may be derived for each blob with centroid C_i∈C, and the final segmented blob set S_blob may be:

S_blob = {(x,y,z) | (x,y,z)∈DoG_nor(x,y,z;s_i*), HW(x,y,z;s_i*)=1}. (16)
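The Hessian convexity test of (13)–(14) can be sketched with finite differences: a voxel belongs to a bright blob when all eigenvalues of its local Hessian are negative. This is an illustrative sketch using `numpy.gradient` for the second derivatives (the helper name `hessian_convexity_window` is assumed); the edge voxels of the window use one-sided differences and are less reliable.

```python
import numpy as np

def hessian_convexity_window(dog_win):
    """Mark voxels of a DoG window whose Hessian is negative definite
    (bright-blob voxels), per Eqs. (13)-(14)."""
    gx, gy, gz = np.gradient(dog_win)
    hxx, hxy, hxz = np.gradient(gx)
    _, hyy, hyz = np.gradient(gy)
    _, _, hzz = np.gradient(gz)
    hw = np.zeros(dog_win.shape, dtype=bool)
    for idx in np.ndindex(dog_win.shape):
        H = np.array([[hxx[idx], hxy[idx], hxz[idx]],
                      [hxy[idx], hyy[idx], hyz[idx]],
                      [hxz[idx], hyz[idx], hzz[idx]]])
        # negative definite <=> all eigenvalues strictly negative
        hw[idx] = bool(np.all(np.linalg.eigvalsh(H) < 0))
    return hw
```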
The details of an exemplary BTCAS blob detector discussed herein are summarized in Table I.
Training Dataset and Data Augmentation
As part of an experiment, a public dataset of optical images of cell nuclei was used to train U-Net (e.g., U-Net 100). This dataset contained 141 pathology images (2,000×2,000 pixels). The 12,000 ground truth annotations were provided by a domain expert, who delineated object boundaries over 40 hours. Since the aim was to facilitate U-Net denoising of blob images based on ground truth labeled images, Gaussian distributed noise with μnoise=0 and σnoise²=0.01 was generated and added to the labeled images, resulting in 141 synthetic training images as shown in
Twenty 3D images with 10 different numbers of blobs and two different levels of noise were simulated. From each 3D image (sized 256×256×256), blobs were generated using the Gaussian function with parameter s=1 for blob size. The radii of the blobs were approximated as (2×s+0.5) voxels, based on observation. Blobs were spread on the images at random locations. The number of blobs (N) ranged from 5,000 to 50,000 with a step size of 5,000. Noise was generated by the Gaussian function with μnoise=0 and σnoise² defined by:

σnoise² = σsignal²/10^(SNR/10) (17)
The signal-to-noise ratio (SNR) was set at 1 dB and 5 dB for high noise and low noise, respectively. As the quantity of blobs increased, so did blob density, which resulted in a large number of blobs being closely clumped together (see
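The synthetic data generation just described can be sketched as follows, on a reduced grid for illustration. The function name `synth_blobs` and the default sizes are assumptions, not from the disclosure; the SNR-to-variance conversion follows (17).

```python
import numpy as np

def synth_blobs(shape=(32, 32, 32), n_blobs=5, s=1.0, snr_db=5.0, seed=0):
    """Gaussian blobs (radius ~ 2*s + 0.5 voxels) at random locations,
    plus Gaussian noise whose variance is set from the target SNR (dB)."""
    rng = np.random.default_rng(seed)
    zz, yy, xx = np.indices(shape)
    img = np.zeros(shape)
    centers = rng.integers(4, np.array(shape) - 4, size=(n_blobs, 3))
    for cz, cy, cx in centers:
        img += np.exp(-((zz - cz) ** 2 + (yy - cy) ** 2
                        + (xx - cx) ** 2) / (2 * s ** 2))
    sigma_noise2 = img.var() / (10 ** (snr_db / 10))   # Eq. (17)
    noisy = img + rng.normal(0, np.sqrt(sigma_noise2), shape)
    return img, noisy, centers
```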
The ratio of overlap (O) of blobs in the 3D image was derived as:
Five methods were applied to the synthetic 3D blob images: four comparison methods, including the HDoG, U-Net with standard thresholding, U-Net with optimal thresholding (OT U-Net), and the UH-DoG, as well as the BTCAS blob detector disclosed herein. The parameter settings of the DoG were as follows: window size N of 7, γ of 2, and Δs of 0.001. To denoise the images of the 3D blobs using a trained U-Net, each 256×256 slice was first resized to 512×512, and then each slice was fed into U-Net. The Adam optimizer was used in U-Net with a learning rate set to 0.0001. The dropout rate was set to 0.5. The threshold for the U-Net probability map in UH-DoG was set to 0.5. U-Net was implemented on an NVIDIA TITAN XP GPU with 12 GB of memory. Moreover, a 2D (two-dimensional) U-Net was used, and 2D probability maps were rendered on each slice and then stacked together to form a 3D probability map.
Evaluating the Number of Blobs Detected
First, the number of blobs detected by the different algorithms under the noisy image settings was compared. See
The HDoG suffered from significant over-detection, yielding a high error rate in both experiments. In other methods, for the experiment on images with low noise, as the number of true blobs increased from 5,000 to 50,000, error rates for U-Net, OT U-Net, and UH-DoG ranged from 4.96-38.78%, 4.28-32.22%, and 1.36-12.60%, respectively. The error rates observed with the BTCAS system disclosed herein were significantly lower, ranging from 0.06-1.44%. For the experiment using images with high noise, as the number of true blobs increased from 5,000 to 50,000, error rates for U-Net, OT U-Net, and UH-DoG ranged from 4.68-39.87%, 4.08-32.96%, and 1.38-12.79%, respectively. BTCAS had error rates of 0.08-10.20%. By integrating U-Net, the detection error decreased, and over-detection was reduced. However, both U-Net and OT U-Net detected fewer blobs than the ground truth. This may be due to overlapping blobs—that is, if the probability values at the boundaries of overlapping blobs are larger than the threshold, under-segmentation occurs, leading to fewer detected blobs. OT U-Net used Otsu's thresholding to find the optimal threshold to reduce under-segmentation. With Hessian analysis, under-segmentation may be eliminated. The UH-DoG and BTCAS outperformed both U-Net and OT U-Net. The error rate of BTCAS slowly increased when the number of blobs increased from 5,000 to 50,000 with low noise and from 5,000 to 40,000 with high noise. Although the error rate of BTCAS increased when the number of blobs increased from 40,000 to 50,000 under high noise, this error rate was significantly lower than that for UH-DoG. Thus, the BTCAS system showed much more robustness in the presence of noise compared to the other four methods.
Evaluating Blob Detection and Segmentation Accuracy
Further, algorithm performance was evaluated by Precision, Recall, F-score, Dice coefficient, and Intersection over Union (IoU). For detection, Precision measures the fraction of retrieved candidates confirmed by the ground truth. Recall measures the fraction of ground-truth data retrieved. F-score is an overall measure of precision and recall. For segmentation, the Dice coefficient measures the similarity between the segmented blob mask and the ground truth. IoU measures the amount of overlap between the segmented blob mask and the ground truth. Ground truth voxels and blob locations (the coordinates of the blob centers) were already generated when synthesizing the 3D blob images. A blob candidate i was considered a true positive if it formed a detection pair (i, j) for which the nearest ground truth center j had not yet been paired and the Euclidean distance Dij between ground truth center j and blob candidate i was less than or equal to d. To avoid duplicate counting, the number (#) of true positives TP was calculated by (19). Precision, recall, and F-score were calculated by (20), (21), and (22), respectively:

TP = #{(i, j) | Dij ≤ d, each ground truth center j paired with at most one candidate i} (19)

Precision = TP/n (20)

Recall = TP/m (21)

F-score = 2·Precision·Recall/(Precision + Recall) (22)
where m is the number of true glomeruli, n is the number of blob candidates, and d is a thresholding parameter set to a positive value in (0, +∞). If d is small, fewer blob candidates may be counted, since the distance between the blob candidate centroid and the ground truth would need to be small. If d is too large, more blob candidates are counted. Here, since local intensity extremes may be anywhere within a small blob with an irregular shape, d was set to the average diameter of the blobs, i.e., d = 2×(2×s+0.5) voxels.
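The pairing rule of (19)–(22) can be sketched with a greedy nearest-center match. This is an illustrative sketch (the names `count_true_positives` and `precision_recall_f` are assumptions); each candidate claims its nearest ground-truth center only if that center is still unpaired and within distance d.

```python
import numpy as np

def count_true_positives(candidates, truths, d):
    """Greedy sketch of Eq. (19): pair each candidate with its nearest
    unpaired ground-truth center within distance d."""
    truths = [np.asarray(t, float) for t in truths]
    paired = [False] * len(truths)
    tp = 0
    for c in candidates:
        c = np.asarray(c, float)
        dists = [np.linalg.norm(c - t) for t in truths]
        j = int(np.argmin(dists))
        if not paired[j] and dists[j] <= d:
            paired[j] = True
            tp += 1
    return tp

def precision_recall_f(tp, n_candidates, n_truth):
    p, r = tp / n_candidates, tp / n_truth     # Eqs. (20)-(21)
    return p, r, 2 * p * r / (p + r)           # Eq. (22)
```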
The Dice coefficient and IoU were calculated by comparing the segmented blob mask and the ground truth mask by (23) and (24):

Dice = 2|B∩G|/(|B|+|G|) (23)

IoU = |B∩G|/|B∪G| (24)

where B is the binary mask for the segmentation result and G is the binary mask for the ground truth.
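Equations (23)–(24) on boolean masks can be sketched directly (the helper name `dice_iou` is illustrative):

```python
import numpy as np

def dice_iou(B, G):
    """Dice coefficient and IoU of two binary masks, per Eqs. (23)-(24)."""
    B, G = np.asarray(B, bool), np.asarray(G, bool)
    inter = np.logical_and(B, G).sum()
    union = np.logical_or(B, G).sum()
    dice = 2 * inter / (B.sum() + G.sum())
    iou = inter / union
    return dice, iou
```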
Comparisons between the models are shown in Tables II and III. An ANOVA test was performed with Tukey's HSD multi-comparison at a significance level of 0.05. BTCAS significantly outperformed the other four methods on Recall and F-score for images with low and high noise. Compared to UH-DoG, BTCAS provided better performance on Recall and F-score and was comparable on Precision, Dice, and IoU. In this synthetic data, the blobs were generated with similar size (s=1); the BTCAS system may nevertheless resolve under-segmentation by U-Net.
In this experiment, blob segmentation applied to 3D CFE-MR images was investigated to measure number (Nglom) and apparent volume (aVglom) of glomeruli in healthy and diseased human donor kidneys that were not accepted for transplant. Three human kidneys were obtained at autopsy through a donor network (The International Institute for the Advancement of Medicine, Edison, N.J.) after receiving Institutional Review Board (IRB) approval and informed consent from Arizona State University, and they were imaged by CFE-MRI.
Each human MR image had pixel dimensions of 896×512×512. The HDoG, UH-DoG, and BTCAS blob detectors were utilized to segment glomeruli. The parameter settings of the DoG were as follows: window size N=7, γ=2, and Δs=0.001. First, 14,336 2D patches were generated, each patch being 128×128 in size, and each patch was then fed into U-Net. The threshold for the U-Net probability map in UH-DoG was 0.5. Quality control was performed by visually checking the identified glomeruli, visible as black spots in the images. For illustration, example results from CF2, which had a more heterogeneous pattern, are shown in
Nglom and aVglom are shown in Tables IV and V, where HDoG, UH-DoG, and BTCAS blob detector described herein were compared to data from unbiased dissector-fractionator stereology representing a ground truth in the average measurements in each kidney. The stereology data and the method of calculating aVglom known in the art were used. The differences between the results of HDoG, UH-DoG, BTCAS methods, and stereology data are also listed in Tables IV and V. Compared to stereology, HDoG identified more glomeruli, and the difference from stereology was much larger for HDoG than for the other two methods, indicating over-detection under the single optimal scale of DoG and lower mean aVglom than stereology. UH-DoG identified fewer glomeruli due to under-segmentation when using the single thresholding (0.5) on the probability map of U-Net combined with the Hessian convexity map. BTCAS provided the most accurate measurements of Nglom and mean aVglom when compared to the other two methods.
Moreover, experiments were conducted on CF-labeled glomeruli from a dataset of 3D MR images to measure Nglom and aVglom of glomeruli in healthy and diseased mouse kidneys. This dataset included chronic kidney disease (CKD, n=3) vs. controls (n=6) and acute kidney injury (AKI, n=4) vs. control (n=5). The animal experiments were approved by the Institutional Animal Care and Use Committee (IACUC) under protocol #3929 on Apr. 7, 2020 at the University of Virginia, in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals. They were imaged by CFE-MRI.
Each MRI image had pixel dimensions of 256×256×256. HDoG, HDoG with VBGMM (Variational Bayesian Gaussian Mixture Model), UH-DoG, and the BTCAS blob detector were utilized to segment glomeruli. The parameter settings of the DoG were: window size N=7, γ=2, and Δs=0.001. To denoise the 3D blob images using the trained U-Net, each slice was first resized to 512×512 and then fed into U-Net. The threshold for the U-Net probability map in UH-DoG was 0.5.
Nglom and mean aVglom are shown in Table VI and Table VII, where HDoG, UH-DoG, and BTCAS blob detector described herein are compared to HDoG with VBGMM. The differences between the results are also listed in Tables VI and VII. Compared to HDoG with VBGMM, HDoG identified more glomeruli, and the difference from HDoG with VBGMM for HDoG was much larger than for the other two methods, indicating over-detection under the single optimal scale of the DoG and lower mean aVglom than HDoG with VBGMM. UH-DoG identified fewer glomeruli and larger mean aVglom due to under-segmentation when using the single thresholding (0.5) on the probability map of U-Net combined with the Hessian convexity map. BTCAS provided the most accurate measurements of Nglom and mean aVglom compared to the other two methods.
Discussion of Computation Time
Various embodiments of the systems and methods described herein use U-Net for pre-processing, followed by DoG where the scales vary depending on sizes of blobs (e.g., glomeruli). The computational time of U-Net was satisfactory. For example, it took less than 5 minutes for training and less than 1 second per slice or per patch for testing.
With respect to the computation efforts related to the DoG implementation, given a 3D image in N1×N2×N3 and a convolutional filtering kernel of size r1×r2×r3, the computational complexity of HDoG is O(N1N2N3(r1+r2+r3)). Considering the BTCAS method described herein, with NS being the number of scales searched (NS>1), the computational complexity is O(NSN1N2N3(r1+r2+r3)). Thus, BTCAS may involve more computing effort compared to HDoG since NS>1; however, for HDoG, the single-scale approach suffers in performance, as shown in the comparison experiments (see
Discussion of Proposition 1 and Proposition 2
Proposition 1 and Proposition 2 referenced herein are described with more detail below.
Proposition 1
For any blob, with the normalized intensity distribution of the blob being I_b(x,y,z)∈[0,1]^(I1×I2×I3) and following a Gaussian distribution centered at the blob centroid (μx, μy, μz):

I_b(x,y,z) = exp(−((x−μx)² + (y−μy)² + (z−μz)²)/(2σ_b²)) (25)

the probability predicted by U-Net increases or decreases monotonically from the centroid to the boundary of the dark or bright blob, respectively.
As a way of proof, the following is provided. With respect to bright blobs, with the input intensity distribution of a blob with noise being I_N∈[0,1]^(I1×I2×I3):

I_N = I_b + ε, ε ~ N(0, σ²I). (26)

Then, the probability map from U-Net may be defined as U(x, y, z)∈[0,1]^(I1×I2×I3), and from (4):

U_b(x,y,z) = U(I_N; Θ) = U(I_b + ε; Θ) ≈ I_b(x,y,z). (27)

The probabilities from U_b(x, y, z) thus follow a Gaussian distribution, and the probabilities monotonically decrease from the centroid to the boundary of a blob, with U_b(μx, μy, μz) reaching maximum probability.
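A quick numeric check of this monotonicity, under the Gaussian blob model of (25), can be sketched as follows (illustrative only; σ_b is an assumed value):

```python
import numpy as np

# Intensity of a Gaussian bright blob (Eq. (25)) sampled along a ray from
# the centroid outward; per Eq. (27) this approximates the U-Net probability.
sigma_b = 2.0
r = np.linspace(0.0, 5.0, 50)              # distance from the centroid
prob = np.exp(-r ** 2 / (2 * sigma_b ** 2))
```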
Proposition 2
Given a binarized probability map, a blob may be identified with a radius r. With BL(x, y, z) and BH(x, y, z) being the probability map binarized at the low threshold δL and at the high threshold δH, respectively, r(δL)>r(δH): the radius obtained under the low threshold is larger than the radius obtained under the high threshold.
As a way of proof, the following is provided, building on (25) and (27).
With the radius of a blob being r(δ)∈ℝ, the distance between the thresholding voxel (xδ, yδ, zδ) and the centroid of the blob may be approximated by the radius of the blob:
r(δ)≈√((xδ−μx)²+(yδ−μy)²+(zδ−μz)²). (29)
Given the high probability threshold δH and the low probability threshold δL,
Ub(xδH, yδH, zδH)=δH (30)
and
Ub(xδL, yδL, zδL)=δL. (31)
From Proposition 1, the blob centroid has the maximum probability, and the probability monotonically decreases from the centroid to the boundary:
Ub(μx, μy, μz)>Ub(xδH, yδH, zδH)=δH>Ub(xδL, yδL, zδL)=δL (32)
and
r(δL)>r(δH)>r(Ub(μx, μy, μz))=0. (33)
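Proposition 2 may likewise be checked numerically. The sketch below again assumes an idealized Gaussian probability map rather than a real U-Net output, with illustrative thresholds δL=0.3 and δH=0.7; it binarizes the map at both thresholds and confirms r(δL)>r(δH)>0:

```python
import numpy as np

# Idealized Gaussian probability map, as in the Proposition 1 sketch.
size, mu, sigma = 33, 16, 4.0
z, y, x = np.mgrid[0:size, 0:size, 0:size]
dist = np.sqrt((x - mu)**2 + (y - mu)**2 + (z - mu)**2)
Ub = np.exp(-dist**2 / (2 * sigma**2))

def blob_radius(prob_map, delta):
    # Radius of the binarized blob: the largest centroid distance among
    # voxels whose probability passes the threshold delta.
    return dist[prob_map >= delta].max()

delta_L, delta_H = 0.3, 0.7          # illustrative low/high thresholds
r_L = blob_radius(Ub, delta_L)       # radius under the low threshold
r_H = blob_radius(Ub, delta_H)       # radius under the high threshold

# Proposition 2: r(delta_L) > r(delta_H) > 0 (= radius at the centroid).
print(r_L > r_H > 0)  # True
```

For a Gaussian map the analytic radii are r(δ)=σ√(2 ln(1/δ)), so the grid radii land close to those values, with r(0.3) well above r(0.7).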
Description of Methods
The exemplary BTCAS systems and methods described herein provide an adaptive and effective tuning-free detector for blob detection and segmentation, which may be utilized for kidney biomarker identification for clinical use.
Principles of the present disclosure contemplate novel systems and methods for blob detection using deep learning techniques. In various embodiments and with reference to
In various exemplary embodiments, exemplary systems and methods offer various advantages and improvements over prior approaches. For example, an exemplary system including a U-Net reduces over-detection when the U-Net is used in an initial denoising step, resulting in a probability map identifying the centroids of blob candidates. Moreover, in exemplary systems, distance maps may be rendered with lower and upper probability bounds, which may be used as constraints for the local scale search of the DoG. Additionally, in some exemplary embodiments, a local optimum DoG scale may be adapted to the range of blob sizes to better separate touching blobs. In two experiments described herein, the advantages of exemplary embodiments were confirmed: an adaptive scale based on deep learning greatly decreased under-segmentation by U-Net, with an over 80% increase in Dice and IoU, and decreased over-detection by DoG, with an over 100% decrease in the error rate of blob detection.
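A minimal sketch of a bounded local scale search follows. It assumes the blob-radius bounds are taken as the maxima of Euclidean distance transforms of the two binarized maps, uses the conventional relation r≈√3·σ for 3D blobs, and compares raw DoG peak responses for simplicity; the name bounded_scale_search is illustrative, not part of the disclosed system:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt, gaussian_filter

def bounded_scale_search(volume, mask_low, mask_high, n_scales=5, k=1.6):
    # Radius bounds from the two binarized probability maps:
    # r_H <= true radius <= r_L (approximated via distance transforms).
    r_L = distance_transform_edt(mask_low).max()
    r_H = max(distance_transform_edt(mask_high).max(), 1.0)
    # Search DoG scales only within the bounded interval, r = sqrt(3)*sigma.
    sigmas = np.linspace(r_H, r_L, n_scales) / np.sqrt(3)
    best_sigma, best_resp = sigmas[0], -np.inf
    for s in sigmas:
        # Raw DoG peak response (illustrative; no scale normalization here).
        resp = (gaussian_filter(volume, s) - gaussian_filter(volume, k * s)).max()
        if resp > best_resp:
            best_sigma, best_resp = s, resp
    return best_sigma

# Toy probability map: one Gaussian blob, binarized at low/high thresholds.
size, mu = 33, 16
z, y, x = np.mgrid[0:size, 0:size, 0:size]
prob = np.exp(-((x - mu)**2 + (y - mu)**2 + (z - mu)**2) / (2 * 4.0**2))
sigma = bounded_scale_search(prob, prob >= 0.3, prob >= 0.7)
print(round(float(sigma), 2))  # a scale inside the bounded interval
```

Constraining the search interval in this manner illustrates how the two distance maps keep the per-blob scale search local, rather than sweeping a global scale range as in single-scale HDoG.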
Moreover, in some embodiments, the DoG and the Hessian analysis may be integrated as layers of an overall deep learning network for comprehensive blob (e.g., glomerular) segmentation. In some embodiments, a 3D U-Net may be utilized instead of a 2D U-Net. Furthermore, in some embodiments related to glomeruli detection, semi-supervised learning may be utilized by, e.g., incorporating domain knowledge of glomeruli to further improve glomerular detection and segmentation. The BTCAS systems and methods described herein were shown to be an adaptive and effective tuning-free detector for blob detection and segmentation and may be utilized for, e.g., kidney biomarker identification for clinical use.
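As one illustrative sketch of the Hessian analysis step, the fragment below builds a voxel-wise 3×3 Hessian of a smooth response by repeated finite differences and keeps voxels where the Hessian is negative definite, which holds in the interior of a bright blob; the function name is illustrative, and a toy Gaussian stands in for the DoG response:

```python
import numpy as np

def negative_definite_hessian_mask(response):
    """Voxel-wise 3x3 Hessian via repeated finite differences; a voxel
    lies in the convex core of a bright blob when all Hessian
    eigenvalues are negative (negative definite)."""
    grads = np.gradient(response)              # first derivatives per axis
    H = np.empty(response.shape + (3, 3))
    for i, g in enumerate(grads):
        second = np.gradient(g)                # derivatives of derivative i
        for j in range(3):
            H[..., i, j] = second[j]
    eigvals = np.linalg.eigvalsh(H)            # ascending eigenvalues
    return eigvals[..., -1] < 0                # largest eigenvalue < 0

# Toy smooth response: a single Gaussian blob.
size, mu = 33, 16
z, y, x = np.mgrid[0:size, 0:size, 0:size]
blob = np.exp(-((x - mu)**2 + (y - mu)**2 + (z - mu)**2) / (2 * 4.0**2))
mask = negative_definite_hessian_mask(blob)
print(bool(mask[mu, mu, mu]), bool(mask[mu, mu, mu + 10]))  # True False
```

The centroid voxel passes the convexity test while a voxel in the blob tail does not, which is the property the Hessian analysis exploits to delineate blob regions from background.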
In some exemplary embodiments, a blob detection system may include software operating on a general-purpose processor. In other exemplary embodiments, a blob detection system may include an application-specific integrated circuit (ASIC). In still other exemplary embodiments, a blob detection system may include instructions operative on a reconfigurable computing device, for example a field-programmable gate array (FPGA). Moreover, a blob detection system and methods thereof may be implemented as distributed software operative on multiple processors.
While the principles of this disclosure have been shown in various embodiments, many modifications of structure, arrangements, proportions, the elements, materials and components, used in practice, which are particularly adapted for a specific environment and operating requirements may be used without departing from the principles and scope of this disclosure. These and other changes or modifications are intended to be included within the scope of the present disclosure.
Benefits, other advantages, and solutions to problems have been described herein with regard to specific embodiments. Furthermore, the connecting lines shown in the various figures contained herein are intended to represent exemplary functional relationships and/or physical couplings between the various elements. It should be noted that many alternative or additional functional relationships or physical connections may be present in a practical system. However, the benefits, advantages, solutions to problems, and any elements that may cause any benefit, advantage, or solution to occur or become more pronounced are not to be construed as critical, required, or essential features or elements of any embodiment. In the claims, reference to an element in the singular is not intended to mean “one and only one” unless explicitly so stated, but rather “one or more.”
Systems, methods, and apparatus are provided herein. In the detailed description herein, references to “various exemplary embodiments”, “one embodiment”, “an embodiment”, “an exemplary embodiment”, etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. The word “exemplary” is used herein to mean “serving as an example, instance or illustration”. Any embodiment described as “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments and/or to exclude the incorporation of features from other embodiments. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to affect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described. After reading the description, it will be apparent to one skilled in the relevant art(s) how to implement the disclosure in alternative embodiments.
The present disclosure has been described with reference to various embodiments. However, one of ordinary skill in the art appreciates that various modifications and changes can be made without departing from the scope of the present disclosure. Accordingly, the specification is to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of the present disclosure. Likewise, benefits, other advantages, and solutions to problems have been described above with regard to various embodiments. However, benefits, advantages, solutions to problems, and any element(s) that may cause any benefit, advantage, or solution to occur or become more pronounced are not to be construed as a critical, required, or essential feature or element.
As used herein, the terms “comprises,” “comprising,” or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Also, as used herein, the terms “coupled,” “coupling,” or any other variation thereof, are intended to cover a physical connection, an electrical connection, a magnetic connection, an optical connection, a communicative connection, a functional connection, and/or any other connection. When language similar to “at least one of A, B, or C” or “at least one of A, B, and C” is used in the specification or claims, the phrase is intended to mean any of the following: (1) at least one of A; (2) at least one of B; (3) at least one of C; (4) at least one of A and at least one of B; (5) at least one of B and at least one of C; (6) at least one of A and at least one of C; or (7) at least one of A, at least one of B, and at least one of C.
Furthermore, no element, component, or method step in the present disclosure is intended to be dedicated to the public regardless of whether the element, component, or method step is explicitly recited in the claims. No claim element herein is to be construed under the provisions of 35 U.S.C. 112(f), unless the element is expressly recited using the phrase “means for.” The term “non-transitory” is to be understood to remove only propagating transitory signals per se from the claim scope and does not relinquish rights to all standard computer-readable media that are not only propagating transitory signals per se. Stated another way, the meaning of the term “non-transitory computer-readable medium” and “non-transitory computer-readable storage medium” should be construed to exclude only those types of transitory computer-readable media which were found in In re Nuijten to fall outside the scope of patentable subject matter under 35 U.S.C. § 101.
This application is a non-provisional of, and claims priority to and the benefit of U.S. Provisional Application No. 63/164,699, entitled “DEEP LEARNING BASED BLOB DETECTION SYSTEMS AND METHODS,” filed on Mar. 23, 2021. The disclosure of the foregoing application is incorporated herein by reference in its entirety, including but not limited to those portions that specifically appear hereinafter, but except for any subject matter disclaimers or disavowals, and except to the extent that the incorporated material is inconsistent with the express disclosure herein, in which case the language in this disclosure shall control.
This invention was made with government support under DK110622, RO1 DK111861, and S10 RR019911 awarded by the National Institutes of Health. The government has certain rights in the invention.