The present disclosure relates generally to Ductal Carcinoma observation and/or determination, and more specifically, to exemplary embodiments of exemplary system, method and computer-accessible medium for patient selection for Ductal Carcinoma in Situ observation and/or determination of possible actions based on the same.
Attempts to minimize over-diagnosis and over-treatment of Ductal Carcinoma in Situ (“DCIS”) have led to clinical trials in which patients with DCIS are observed instead of undergoing surgery. Despite careful selection of “low risk” DCIS patients, occult invasive cancers can occur in a significant number of these patients.
Thus, it may be beneficial to provide an exemplary system, method and computer-accessible medium for patient selection for ductal carcinoma in situ observation and/or determination of possible actions based on the same which can overcome at least some of the deficiencies described herein above.
An exemplary system, method and computer-accessible medium for determining ductal carcinoma in situ (DCIS) information regarding a patient(s) can include for example, receiving image(s) of internal portion(s) of a breast of the patient(s), and automatically determining the DCIS information by applying a neural network(s) to the image(s). The DCIS information can include predicting (i) pure DCIS or (ii) DCIS with invasion. Input information of the patient(s) can be selected for a DCIS observation for determining the DCIS information. The image(s) can be a mammographic image(s). The image(s) can be one of a magnetic resonance image or a computer tomography image.
In some exemplary embodiments of the present disclosure, the image(s) can contain a calcification(s). The image can be segmented and/or resized. The image can be centered using a histogram-based z score normalization of non-air pixel intensity values. The image(s) can be (i) randomly flipped, (ii) randomly rotated, or (iii) randomly cropped. A random affine shear can be applied to the image(s). The neural network(s) can be a convolutional neural network (CNN). The CNN can include a plurality of layers. The CNN can include 15 hidden layers. The CNN can include five residual layers. The CNN can include an inception style layer(s) after a ninth hidden layer. The CNN can include a fully connected layer(s) after a 13th layer thereof. The fully connected layer(s) can include 16 neurons. The CNN can include a linear layer(s) after a 13th layer. The linear layer(s) can include 8 neurons. A determination can be made as to what action to perform or whether to perform any action based on the determined DCIS information.
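As a non-limiting illustration, the exemplary centering step can be sketched in pure Python. The “histogram-based” variant described above is approximated here by a plain z score computed over non-air pixels only; the air threshold value is an assumption for illustration:

```python
import math

def zscore_normalize(image, air_threshold=0.0):
    """Center an image using a z score over non-air pixels only:
    pixels at or below `air_threshold` (assumed background air) are
    excluded from the mean/std computation and left at zero."""
    vals = [v for row in image for v in row if v > air_threshold]
    mean = sum(vals) / len(vals)
    std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
    return [[(v - mean) / std if v > air_threshold else 0.0 for v in row]
            for row in image]

# A 2x2 patch with one air pixel (0.0); only the three tissue
# pixels contribute to the mean and standard deviation.
normalized = zscore_normalize([[0.0, 10.0], [20.0, 30.0]])
```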
These and other objects, features and advantages of the exemplary embodiments of the present disclosure will become apparent upon reading the following detailed description of the exemplary embodiments of the present disclosure, when taken in conjunction with the appended claims.
Further objects, features and advantages of the present disclosure will become apparent from the following detailed description taken in conjunction with the accompanying Figures showing illustrative embodiments of the present disclosure, in which:
Throughout the drawings, the same reference numerals and characters, unless otherwise stated, are used to denote like features, elements, components or portions of the illustrated embodiments. Moreover, while the present disclosure will now be described in detail with reference to the figures, it is done so in connection with the illustrative embodiments and is not limited by the particular embodiments illustrated in the figures and the appended claims.
Conventional neural networks: Conventional neural networks can be, but not limited to, networks composed of neurons with learnable weights and biases. Raw data (e.g., an image) is input into the machine, which encodes defining characteristics into the network architecture. Each neuron receives multiple inputs, calculates a weighted sum that goes through an activation function, and creates an output.
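The weighted-sum-plus-activation behavior of a single neuron described above can be sketched as follows (the sigmoid activation and the sample weights are illustrative assumptions):

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of the inputs plus a
    bias, passed through a sigmoid activation function."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid squashes to (0, 1)

# Example: three pixel intensities feeding one neuron.
out = neuron([0.2, 0.5, 0.1], [0.4, -0.3, 0.8], bias=0.05)
```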
Convolutional layer: The convolutional layer can apply a filter that slides over the entire image to calculate the dot product of each particular region. In this procedure, one image can become a stack of filtered images.
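The sliding-filter dot product described above can be sketched in plain Python (valid-mode convolution, no padding; the example kernel is an illustrative assumption):

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and take the dot product at
    each position, turning one image into one filtered image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out

# A 3x3 image filtered with a 2x2 kernel yields a 2x2 feature map.
feature_map = convolve2d([[1, 2, 3],
                          [4, 5, 6],
                          [7, 8, 9]],
                         [[1, 0],
                          [0, -1]])
```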
Pooling layer: The pooling layer can reduce the spatial size of each feature map. Maximum pooling can apply a filter that slides over the entire image and keeps only the maximum value for each particular region.
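A minimal sketch of the maximum-pooling operation described above (a 2×2 window with stride 2 is assumed for illustration):

```python
def max_pool(image, size=2, stride=2):
    """Keep only the maximum value in each size x size window,
    reducing the spatial dimensions of the feature map."""
    out = []
    for i in range(0, len(image) - size + 1, stride):
        row = []
        for j in range(0, len(image[0]) - size + 1, stride):
            row.append(max(image[i + di][j + dj]
                           for di in range(size) for dj in range(size)))
        out.append(row)
    return out

# A 4x4 map is reduced to 2x2, keeping the largest value per window.
pooled = max_pool([[1, 3, 2, 4],
                   [5, 6, 1, 2],
                   [7, 2, 9, 3],
                   [1, 4, 0, 8]])
```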
Rectified linear units: Rectified linear units can be, but not limited to, computation units that perform normalization of the stack of images. In a rectified linear unit, for example, all negative values can be changed to zero.
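The zeroing of negative values described above can be sketched as:

```python
def relu(feature_map):
    """Rectified linear unit: every negative value becomes zero,
    positive values pass through unchanged."""
    return [[max(0, v) for v in row] for row in feature_map]

rectified = relu([[-4, 2], [3, -1]])
```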
Inception layer: The inception layer can reduce the computational burden by making use of parallel computational paths within a single layer.
Fully connected layer: In the fully connected layer, as an example, every feature value from the created stack of filtered images can have a weighted output, which can be averaged to create a prediction.
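A minimal sketch of the fully connected mapping described above, in which each flattened feature value contributes a weighted amount to the output score (the sample weights are illustrative assumptions):

```python
def fully_connected(features, weights, bias):
    """Every feature value from the flattened stack of filtered
    images receives a weight; the weighted sum plus a bias yields
    one output score."""
    return sum(f * w for f, w in zip(features, weights)) + bias

score = fully_connected([0.0, 2.0, 3.0, 0.0],
                        [0.5, -0.2, 0.1, 0.3], bias=0.1)
```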
Back propagation: In back propagation, the error of the final prediction can be calculated, and can be used to adjust each feature value to improve future predictions.
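The error-driven weight adjustment described above can be sketched with a one-weight model trained by gradient descent on a squared-error loss (the learning rate and target function are illustrative assumptions):

```python
def sgd_step(w, x, y_true, lr=0.1):
    """One back-propagation update for a one-weight linear model
    with squared error: the prediction error is used to adjust the
    weight so that future predictions improve."""
    y_pred = w * x
    grad = 2 * (y_pred - y_true) * x  # d/dw of (w*x - y_true)^2
    return w - lr * grad

# Repeated updates drive the weight toward the target mapping y = 2x.
w = 0.0
for _ in range(50):
    w = sgd_step(w, x=2.0, y_true=4.0)
```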
Dropout: Dropout can be, but not limited to, a regularization procedure used to reduce overfitting of the network by preventing co-adaptation of neurons on the training data. Dropout randomly selects neurons to be ignored during training.
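The random ignoring of neurons described above can be sketched as inverted dropout (the scaling of surviving activations is a common convention assumed here, not stated in the text):

```python
import random

def dropout(activations, rate=0.25, training=True, seed=None):
    """Randomly zero out a fraction `rate` of activations during
    training; survivors are scaled by 1/(1-rate) so the expected
    sum is unchanged. At inference time, everything passes through."""
    if not training:
        return list(activations)
    rng = random.Random(seed)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0
            for a in activations]

# Roughly a quarter of the eight activations are zeroed out.
dropped = dropout([1.0] * 8, rate=0.25, seed=0)
```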
L2 regularization: L2 regularization can be, but not limited to, a regularization procedure used to reduce overfitting by decreasing the weighted value of features to simplify the model.
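The weight-shrinking penalty described above can be sketched as (the regularization strength is an illustrative assumption):

```python
def l2_penalty(weights, lam=0.01):
    """L2 regularization adds lam * sum(w^2) to the loss, pushing
    the weights toward smaller magnitudes and a simpler model."""
    return lam * sum(w * w for w in weights)

penalty = l2_penalty([0.5, -1.0, 2.0], lam=0.01)
```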
The exemplary system, method, and computer-accessible medium, according to an exemplary embodiment of the present disclosure, can utilize a convolutional neural network (“CNN”) for predicting patients with pure DCIS versus DCIS with invasion using, for example, mammographic images; however, it should be understood that other imaging modalities can be used.
A retrospective study utilizing the exemplary CNN was performed, which included 246 unique images from 123 patients: 164 images from 82 patients diagnosed with DCIS by stereotactic-guided biopsy of calcifications without any upgrade at the time of surgical excision (e.g., the pure DCIS group), and 82 images from 41 patients with mammographic calcifications yielding occult invasive carcinoma as the final upgraded diagnosis on surgery (e.g., the occult invasive group). Two mammographic magnification views (e.g., bilateral craniocaudal and mediolateral/lateromedial) of the calcifications were used for analysis. Calcifications were segmented using the exemplary 3D Slicer software, and the segmented regions were then resized to fit a 128×128 pixel bounding box. A 15 hidden layer topology was used to implement the exemplary CNN. The exemplary network architecture included 5 residual layers and a dropout of 0.25 after each convolution. Cases were randomly separated into a training set (e.g., 80%) and a validation set (e.g., 20%).
An original pathology report was determined to be ground truth information and was used as the basis for dividing patients. Eighty percent of the available patients were randomly selected to develop the exemplary network, and the remaining 20% of patients were used to test the exemplary CNN.
The magnification views of each patient's mammogram were loaded into a 3D segmentation program. Segments were extracted using an exemplary automatic segmentation procedure to include the regions of the magnification view that contained calcifications. Each image was scaled in size on the basis of the radius of the segmentations and was resized to fit a bounding box of 128×128 pixels.
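The resizing step described above can be sketched as follows. The interpolation method is not specified in the text, so nearest-neighbour sampling is assumed here purely for illustration:

```python
def resize_nearest(image, out_h=128, out_w=128):
    """Nearest-neighbour resize of a segmented patch into a fixed
    128 x 128 pixel bounding box. Each output pixel is sampled from
    the proportionally corresponding input pixel."""
    in_h, in_w = len(image), len(image[0])
    return [[image[i * in_h // out_h][j * in_w // out_w]
             for j in range(out_w)]
            for i in range(out_h)]

# A tiny 2x2 patch is expanded to fill the 128x128 bounding box.
resized = resize_nearest([[1, 2],
                          [3, 4]])
```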
A topology with multiple layers, for example, 15 hidden layers, can be used to implement the exemplary CNN. The exemplary CNN can include fully convolutional (“FC”) layers. The exemplary CNN can include the application of a series of convolutional matrices to a vectorized input image that can iteratively separate the input to a target vector space. The exemplary CNN can include five residual layers. The residual neural networks can be used to stabilize gradients during back propagation, facilitating improved optimization and greater network depth. For example, starting with the 10th hidden layer, inception V2 style layers can be used. The inception layer architecture can facilitate a computationally efficient procedure for facilitating a network to selectively determine the appropriate filter architectures for an input feature map, providing improved learning rates.
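The gradient-stabilizing property of the residual layers mentioned above comes from adding the block's input back onto its transformed output, which can be sketched as (the toy transformation is an illustrative assumption):

```python
def residual_block(x, transform):
    """A residual connection: the block's output is its learned
    transformation plus the unchanged input, which keeps gradients
    stable during back propagation and permits deeper networks."""
    return [t + xi for t, xi in zip(transform(x), x)]

# With an identity-like skip path, even a weak transformation
# (here, scaling by 0.1) still lets the input flow through.
out = residual_block([1.0, 2.0, 3.0], lambda v: [0.1 * a for a in v])
```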
A fully connected layer with, for example, 16 neurons can be implemented after, as an example, the 13th hidden layer, which can be followed by implementation of a linear layer with eight neurons. A final softmax function output layer with two classes can be inserted as the last layer. Training was performed using an exemplary optimization procedure (e.g., the AdamOptimizer optimization procedure) (see, e.g., Reference 20), combined with an exemplary accelerated gradient procedure (e.g., the Nesterov accelerated gradient procedure). (See, e.g., References 21 and 22). Parameters were initialized using an exemplary heuristic. (See, e.g., Reference 23). L2 regularization was performed to prevent over-fitting of data by limiting the squared magnitude of the kernel weights. Dropout (e.g., 25% randomly) was also used to prevent overfitting by limiting unit coadaptation. (See, e.g., Reference 24). Batch normalization was used to improve network training speed and regularize performance by reducing internal covariate shift. (See, e.g., Reference 25).
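The Nesterov accelerated gradient procedure mentioned above evaluates the gradient at a "look-ahead" point. A minimal sketch on a toy quadratic objective (the learning rate, momentum value, and objective are illustrative assumptions, not the exemplary training configuration):

```python
def nesterov_step(w, v, grad_fn, lr=0.01, momentum=0.9):
    """One Nesterov accelerated gradient update: the gradient is
    evaluated at the look-ahead point w + momentum * v rather than
    at w itself, then folded into the velocity."""
    g = grad_fn(w + momentum * v)
    v = momentum * v - lr * g
    return w + v, v

# Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, v = 0.0, 0.0
for _ in range(200):
    w, v = nesterov_step(w, v, lambda u: 2 * (u - 3.0))
```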
Softmax with cross-entropy hinge loss was used as the primary objective function of the network to provide a more intuitive output of normalized class probabilities. A class-sensitive cost function penalizing incorrect classification of the underrepresented class was used. A final softmax score threshold of 0.5 from the mean of raw logits from the ML and CC views was used for two-class classification. The area under the curve (“AUC”) value was used as the primary performance metric. Sensitivity, specificity, and accuracy were also calculated as secondary performance metrics.
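The two-view decision rule described above — averaging the raw logits from the ML and CC views, applying a softmax, and thresholding at 0.5 — can be sketched as (the example logit values are illustrative assumptions):

```python
import math

def softmax(logits):
    """Normalized class probabilities from raw logits
    (max-shifted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def classify_two_views(ml_logits, cc_logits, threshold=0.5):
    """Average the raw logits from the ML and CC views, apply a
    softmax, and threshold the positive-class probability at 0.5
    for the two-class decision."""
    mean_logits = [(a + b) / 2 for a, b in zip(ml_logits, cc_logits)]
    probs = softmax(mean_logits)
    return (1 if probs[1] >= threshold else 0), probs

label, probs = classify_two_views(ml_logits=[2.0, 0.5],
                                  cc_logits=[1.0, 1.5])
```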
Visualization of network predictions was performed using an exemplary gradient-weighted class activation mapping (“Grad-CAM”) procedure. (See, e.g., Reference 26). Each Grad-CAM map was generated by an exemplary prediction model along with every input image. The salient region of the averaged Grad-CAM map illustrates where important features come from when the exemplary prediction model makes classification decisions.
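The core of the Grad-CAM procedure referenced above — weighting each feature map by the mean gradient of the class score with respect to that channel, summing, and rectifying — can be sketched as (the tiny 2×2 maps are illustrative assumptions):

```python
def grad_cam(feature_maps, channel_grads):
    """Grad-CAM sketch: weight each feature map by the spatial mean
    of the class-score gradient for that channel, sum the weighted
    maps, and apply a ReLU so only positively salient regions
    remain in the class activation map."""
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    weights = [sum(sum(row) for row in g) / (h * w)
               for g in channel_grads]
    return [[max(0.0, sum(wk * feature_maps[k][i][j]
                          for k, wk in enumerate(weights)))
             for j in range(w)] for i in range(h)]

# Channel 0 has uniformly positive gradients, channel 1 uniformly
# negative; the map highlights where channel 0 activates.
cam = grad_cam(
    feature_maps=[[[1.0, 0.0], [0.0, 1.0]],
                  [[0.0, 1.0], [1.0, 0.0]]],
    channel_grads=[[[1.0, 1.0], [1.0, 1.0]],
                   [[-1.0, -1.0], [-1.0, -1.0]]],
)
```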
The exemplary CNN procedure for predicting patients with pure DCIS achieved an overall accuracy of about 74.6% (e.g., about 95% CI, ±5%), with an area under the ROC curve of about 0.71 (e.g., about 95% CI, ±0.04), a specificity of about 49.4% (e.g., about 95% CI, ±6%) and a sensitivity of about 91.6% (e.g., about 95% CI, ±5%).
Thus, as described above, the exemplary system, method, and computer-accessible medium, according to an exemplary embodiment of the present disclosure, can utilize the exemplary CNN to distinguish pure DCIS from DCIS with invasion using, for example, mammographic images.
As shown in
Further, the exemplary processing arrangement 405 can be provided with or include input/output ports 435, which can include, for example, a wired network, a wireless network, the internet, an intranet, a data collection probe, a sensor, etc. As shown in
The foregoing merely illustrates the principles of the disclosure. Various modifications and alterations to the described embodiments will be apparent to those skilled in the art in view of the teachings herein. It will thus be appreciated that those skilled in the art will be able to devise numerous systems, arrangements, and procedures which, although not explicitly shown or described herein, embody the principles of the disclosure and can be thus within the spirit and scope of the disclosure. Various different exemplary embodiments can be used together with one another, as well as interchangeably therewith, as should be understood by those having ordinary skill in the art. In addition, certain terms used in the present disclosure, including the specification, drawings and claims thereof, can be used synonymously in certain instances, including, but not limited to, for example, data and information. It should be understood that, while these words, and/or other words that can be synonymous to one another, can be used synonymously herein, that there can be instances when such words can be intended to not be used synonymously. Further, to the extent that the prior art knowledge has not been explicitly incorporated by reference herein above, it is explicitly incorporated herein in its entirety. All publications referenced are incorporated herein by reference in their entireties.
The following references are hereby incorporated by reference in their entireties, as follows:
This application relates to and claims priority from U.S. Patent Application No. 62/672,945, filed on May 17, 2018, the entire disclosure of which is incorporated herein by reference.
Number | Date | Country
---|---|---
62672945 | May 2018 | US
Relationship | Number | Date | Country
---|---|---|---
Parent | PCT/US2019/032946 | May 2019 | US
Child | 16950043 | | US