The present disclosure relates generally to breast cancer, and more specifically, to exemplary embodiments of an exemplary system, method, and computer-accessible medium for determining breast cancer risk.
Breast cancer is a leading cause of death worldwide and is the second most common cause of cancer deaths among women in the United States. (See, e.g., Reference 1). One in eight women can develop breast cancer; however the risk is not homogeneously distributed throughout the population. While some risk factors have been established, the majority of women diagnosed with breast cancer have no identifiable risk. (See, e.g., Reference 2). This can limit the ability of the medical community to determine high risk versus low risk women.
Evidence for stratifying the risk of developing breast cancer lies in mammographic breast density, defined as the proportion of radiopaque epithelial and stromal tissue compared to radiolucent fat. (See, e.g., Reference 3). In 1976, breast density as a cancer risk factor was determined, with four distinct classifications based on parenchymal patterns: primarily fat (“N1”), ductal prominence involving up to one-fourth of the breast (“P1”), ductal prominence involving more than one-fourth of the breast (“P2”), and severe ductal prominence (“DY”). (See, e.g., Reference 4).
Further studies have described more quantitative categorization of breast density as it relates to cancer predisposition, such as the Tabar classification. (See, e.g., References 5 and 6). The American College of Radiology Breast Imaging Reporting and Data System (“BI-RADS”) defines four categories: (i) entirely fatty, (ii) scattered fibroglandular densities, (iii) heterogeneously dense, and (iv) extremely dense. Several studies have examined the correlation of breast cancer risk and BI-RADS breast density criteria.
A large prospective study showed that risk increases with a higher BI-RADS category: compared to entirely fatty breasts (“BI-RADS 1”), heterogeneously dense breasts (“BI-RADS 3”) were 2.8 times, and extremely dense breasts (“BI-RADS 4”) were 4.0 times, more likely to develop cancer. (See, e.g., Reference 7). Similarly, an increase in BI-RADS breast density has been shown to correlate with an increased risk of breast cancer over a 3 year follow up. (See, e.g., Reference 8). Beyond the correlation of breast density and cancer risk, evidence has shown increased density to be an independent risk factor beyond a masking effect, as it represents the amount of stromal and epithelial tissue from which breast cancer derives. (See, e.g., Reference 3).
The current climate of changing breast cancer screening recommendations by the United States Preventive Services Task Force and American Cancer Society has demonstrated a consistent trend toward later, less frequent, screening, unless a woman is considered to be high risk. This makes the challenge of defining the high risk group within the general population even more important. (See, e.g., References 9 and 10). According to the Breast Cancer Surveillance Consortium database, almost half (e.g., 47%) of the population falls into the category of dense breasts (e.g., BI-RADS 3 and 4) and therefore can be classified as high risk. (See, e.g., Reference 11). Thus, a more individualized stratification is needed to appropriately predict breast cancer risk and therefore designate the most appropriate screening regimen. While advances in imaging technology have provided high quality mammograms with increased clarity, the question remains: is there something beyond the amount of breast density that is not appreciated by the human eye that may affect risk?
Thus, it may be beneficial to provide an exemplary system, method, and computer-accessible medium for determining breast cancer risk, which can address and/or overcome at least some of the deficiencies described herein above.
An exemplary system, method, and computer-accessible medium for determining a risk of developing breast cancer for a patient(s) can include, for example, receiving an image(s) of an internal portion(s) of a breast of the patient(s), and determining the risk by applying a neural network(s) to the image(s). The neural network can be a convolutional neural network (CNN). The CNN can include a plurality of layers. Each of the layers can have a different number of feature channels. The CNN can include at least four layers. A first layer of the at least four layers can have 256×256×16 feature channels, a second layer of the at least four layers can have 128×128×32 feature channels, a third layer of the at least four layers can have 64×64×64 feature channels, and a fourth layer of the at least four layers can have 32×32×128 feature channels.
In some exemplary embodiments of the present disclosure, the CNN can include 3×3 convolutional kernels. Overfitting of the risk can be prevented using the 3×3 convolutional kernels. The CNN can exclude pooling layers. The image can be downsampled using, for example, a 3×3 convolutional kernel. The 3×3 convolutional kernel can have a stride length of 2. The risk can be determined by modeling non-linear functions using a rectified linear unit (ReLU) to limit drift of layer activations. Batch normalization can be performed between the ReLU layer(s) and a convolutional layer. The CNN can include four strided convolutions. The risk(s) can be a score.
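The feature-map sizes listed above are consistent with four stride-2, 3×3 convolutions applied in sequence. The following is a minimal sketch of that arithmetic only, not the disclosed network: the 512×512 input size and the padding of 1 are assumptions chosen so that the first layer comes out at 256×256, and the function names are hypothetical.

```python
def conv_output_size(size, kernel=3, stride=2, padding=1):
    """Spatial size after one convolution, using the standard floor formula."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_shapes(input_size=512, channels=(16, 32, 64, 128)):
    """Shapes (height, width, channels) after each strided 3x3 convolution.

    input_size=512 and padding=1 are assumptions, not values from the text.
    """
    shapes = []
    size = input_size
    for ch in channels:
        size = conv_output_size(size)  # each stride-2 convolution halves the size
        shapes.append((size, size, ch))
    return shapes


for shape in feature_map_shapes():
    print(shape)
```

Under these assumptions the four shapes are 256×256×16, 128×128×32, 64×64×64, and 32×32×128, so the strided convolutions perform the downsampling that pooling layers would otherwise provide.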
These and other objects, features and advantages of the exemplary embodiments of the present disclosure will become apparent upon reading the following detailed description of the exemplary embodiments of the present disclosure, when taken in conjunction with the appended claims.
Further objects, features and advantages of the present disclosure will become apparent from the following detailed description taken in conjunction with the accompanying Figures showing illustrative embodiments of the present disclosure, in which:
Throughout the drawings, the same reference numerals and characters, unless otherwise stated, are used to denote like features, elements, components or portions of the illustrated embodiments. Moreover, while the present disclosure will now be described in detail with reference to the figures, it is done so in connection with the illustrative embodiments and is not limited by the particular embodiments illustrated in the figures and the appended claims.
The exemplary system, method, and computer-accessible medium, according to an exemplary embodiment of the present disclosure, can include the determination of breast cancer risk using various exemplary imaging modalities. For example, the exemplary system, method, and computer-accessible medium is described below using mammographic images. However, the exemplary system, method, and computer-accessible medium can also be used on other suitable imaging modalities, including, but not limited to, magnetic resonance imaging, positron emission tomography, ultrasound, and computed tomography.
A case-control study was performed retrospectively utilizing a screening mammogram database. Average risk screening women were evaluated by excluding women who have a personal history of breast cancer, a family history of breast cancer, or any known genetic mutation that increases the risk for breast cancer. After applying the exclusion criteria, 210 patients were identified consecutively with a new first time diagnosis of breast cancer. Mammograms from these patients, at least 2 years (e.g., median 3.3 years, range 2.0-5.3 years) prior to developing breast cancer, were identified and made up the “high risk” case group composed of the bilateral craniocaudal mammographic dataset (e.g., 420 total). The control group consisted of 527 patients without breast cancer from the same time period. Prior mammograms from these patients made up the “low risk” control group composed of the bilateral craniocaudal mammographic dataset (e.g., 1054 total). These 527 patients in the control group had a documented negative follow-up mammogram for at least 2 years (e.g., median 3.1 years, range 2.0-4.8 years).
From each patient, the age and the BI-RADS mammographic density assessment were recorded on a 4-point scale (e.g., 1-fatty, 2-scattered, 3-heterogeneously dense, and 4-extremely dense) by one of five breast fellowship trained radiologists. Mammograms were performed on dedicated mammography units (e.g., Senographe Essential, GE Healthcare). Of patients who developed breast cancer, histologic subtype was recorded based on the World Health Organization classification. (See, e.g., Reference 13). Statistical analysis was performed using the IBM SPSS software (version 24).
As shown in the exemplary flow diagram of
As shown in
the random affine transformation was initialized with random uniform distributions of interval s1,s2 ϵ[0.8, 1.2], t1,t2 ϵ[−0.03, 0.3] and r1, r2 ϵ[−128, 128]. Four increasing layers (e.g., layer 235, layer 240, layer 245, and layer 250) can be used to produce an output 255, and a final softmax score 260 that can be used for risk classification. A softmax score of about 0.5 or above (e.g., above 0.45) can indicate a high risk of developing breast cancer.
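The parameter sampling described above can be sketched as follows. Only the three uniform intervals come from the text; the matrix that defines the transformation is not reproduced here, so the 3×3 layout below, and the reading of s as scale, t as shear, and r as a translation in pixels, are assumptions made purely for illustration. The function name is hypothetical.

```python
import random


def sample_affine(rng):
    """Draw one random affine matrix from the stated uniform intervals.

    The matrix layout (s on the diagonal, t off the diagonal, r in the
    last column) is an ASSUMPTION; the text gives only the intervals.
    """
    s1, s2 = rng.uniform(0.8, 1.2), rng.uniform(0.8, 1.2)
    t1, t2 = rng.uniform(-0.03, 0.3), rng.uniform(-0.03, 0.3)
    r1, r2 = rng.uniform(-128, 128), rng.uniform(-128, 128)
    return [
        [s1, t1, r1],
        [t2, s2, r2],
        [0.0, 0.0, 1.0],
    ]


rng = random.Random(0)  # fixed seed so the draw is repeatable
for row in sample_affine(rng):
    print(row)
```

A fresh matrix would typically be drawn for each training image, so that the network rarely sees exactly the same input twice.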
Training was implemented using the Adam optimizer, a procedure for first-order gradient-based optimization of stochastic objective functions based on adaptive estimates of lower-order moments. (See, e.g., References 17-20). Parameters can be initialized using an exemplary heuristic. (See, e.g., Reference 16). L2 regularization can be implemented to prevent over-fitting of data by limiting the squared magnitude of the kernel weights. To account for training dynamics, the learning rate can be annealed and the mini-batch size can be increased whenever training loss plateaus. Furthermore, a normalized gradient procedure can be utilized to facilitate locally adaptive learning rates that adjust according to changes in the input signal. (See, e.g., References 17-20). The overall training time was 6 hours.
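For readers unfamiliar with the update rule, the following is a minimal sketch of one Adam step with an L2 penalty folded into the gradient. It is a generic textbook form, not the training code used in the study: the hyperparameter defaults are the commonly used ones, the L2 strength is a placeholder, and the function names are hypothetical.

```python
import math


def init_state(n_params):
    """Zeroed first and second moment estimates plus a step counter."""
    return {"t": 0, "m": [0.0] * n_params, "v": [0.0] * n_params}


def adam_step(params, grads, state, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, l2=0.0):
    """Return updated parameters after one Adam step.

    l2 adds the gradient of 0.5 * l2 * w**2 to each gradient, which
    penalizes the squared magnitude of the weights.
    """
    state["t"] += 1
    t = state["t"]
    updated = []
    for i, (p, g) in enumerate(zip(params, grads)):
        g = g + l2 * p
        # Exponential moving averages of the gradient and its square.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        # Bias correction for the zero initialization of the moments.
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        updated.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return updated


# Toy usage: minimize f(w) = (w - 3)^2 starting from w = 0.
w = [0.0]
state = init_state(1)
for _ in range(2000):
    w = adam_step(w, [2.0 * (w[0] - 3.0)], state, lr=0.05)
print(w[0])
```

Annealing the learning rate, as described above, would amount to passing a smaller `lr` once the training loss stops improving.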
For statistical analysis, cases were separated into 80% training (e.g., 590/737) and 20% test sets (e.g., 147/737). A 5-fold cross validation was performed. A final softmax score threshold of 0.5 from the average of raw logits from each pixel was used for two class classification.
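The evaluation steps above can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: it assumes the folds are formed at the patient level, that each pixel carries a pair of raw logits (low risk, high risk), and that a score at or above the threshold is labeled high risk. All function names are hypothetical.

```python
import math
import random


def five_fold_splits(n_patients=737, k=5, seed=0):
    """Yield (train_ids, test_ids) for k-fold cross validation."""
    ids = list(range(n_patients))
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]  # near-equal folds, about 20% each
    for i in range(k):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield train, folds[i]


def risk_score(pixel_logits):
    """Softmax probability of the high risk class from averaged logits.

    pixel_logits is a list of (low_risk_logit, high_risk_logit) pairs.
    """
    n = len(pixel_logits)
    mean_low = sum(low for low, _ in pixel_logits) / n
    mean_high = sum(high for _, high in pixel_logits) / n
    shift = max(mean_low, mean_high)  # subtract the max for numerical stability
    e_low = math.exp(mean_low - shift)
    e_high = math.exp(mean_high - shift)
    return e_high / (e_low + e_high)


def classify(pixel_logits, threshold=0.5):
    """Two-class decision from the final softmax score."""
    return "high risk" if risk_score(pixel_logits) >= threshold else "low risk"


for train, test in five_fold_splits():
    print(len(train), len(test))
```

With 737 patients, five folds give test sets of 147 or 148 patients, in line with the 147/737 test fraction quoted above.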
The average age of patients between the case and the control groups was not statistically different [case: 57.4 years (SD, 10.4) and control: 58.2 years (SD, 10.9), p=0.33]. All 210 patients had unilateral breast cancers; 69.5% (e.g., 146/210) had invasive ductal carcinoma; 19% (e.g., 40/210) had ductal carcinoma in situ; 7.1% (e.g., 15/210) had invasive lobular carcinoma; 4.3% (e.g., 9/210) had mixed lobular and ductal invasive carcinoma, and 17.6% (e.g., 37/210) of the patients had multifocal disease.
Breast Density (“BD”) was significantly higher in the case group [2.39 (SD, 0.7)] than the control group [1.98 (SD, 0.75), p<0.0001]. Using multivariate logistic regression analysis, both CNN pixel-wise mammographic risk model and BD were significant independent predictors of breast cancer risk (e.g., p<0.0001). The CNN risk model showed greater predictive potential [OR=4.42 (95% CI, 3.4-5.7)] compared to BD [OR=1.67 (95% CI, 1.4-1.9)].
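In logistic regression, an odds ratio is the exponential of the fitted coefficient, so the reported values can be mapped back to the log-odds scale. The sketch below shows that standard relation only. It assumes a Wald-type interval (coefficient ± 1.96 standard errors), which the text does not state, and the function name is hypothetical.

```python
import math


def coef_from_odds_ratio(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover the log-odds coefficient and an approximate standard error.

    Assumes the 95% CI was formed as exp(beta +/- z * se); this is an
    assumption about how the interval was computed.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


for name, (or_, lo, hi) in {"CNN": (4.42, 3.4, 5.7), "BD": (1.67, 1.4, 1.9)}.items():
    beta, se = coef_from_odds_ratio(or_, lo, hi)
    print(name, round(beta, 3), round(se, 3))
```

Read this way, a one-unit increase in a predictor multiplies the odds of being in the case group by its odds ratio, with the other predictor held fixed.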
There was a strong significant correlation of CNN pixel-wise mammographic risk results between the left and right breast (e.g., Pearson correlation, r=0.90, n=737). In the case group, there was a significant correlation between the left and right breast (e.g., Pearson correlation, r=0.86, n=210). In the control group, there was a significant correlation between the left and right breast (e.g., Pearson correlation, r=0.86, n=527).
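The left versus right comparison above uses the Pearson correlation coefficient, which can be computed as in the following minimal sketch. The score lists are made-up placeholders for illustration and are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical per-patient risk scores for the left and right breast.
left_scores = [0.21, 0.35, 0.48, 0.62, 0.80]
right_scores = [0.25, 0.30, 0.52, 0.58, 0.77]
print(pearson_r(left_scores, right_scores))
```

A value near 1 indicates that a patient whose left breast scores high also tends to score high on the right.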
The CNN risk model achieved an overall accuracy of 72% (e.g., 95% CI, 69.8-74.4%) in predicting patients in the case group.
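A confidence interval of this kind can be approximated with the normal (Wald) formula for a proportion. The sketch below is a worked example under two assumptions that the text does not confirm: that the interval is of the Wald type, and that the sample size is the 1474 mammograms in the dataset (420 case plus 1054 control). The function name is hypothetical.

```python
import math


def wald_interval(p, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# n = 420 + 1054 mammograms is an assumption about the denominator.
low, high = wald_interval(0.72, 420 + 1054)
print(round(100 * low, 1), round(100 * high, 1))
```

Under these assumptions the interval comes out within about a tenth of a percentage point of the reported bounds; a small gap of that size would be expected if the 72% figure is itself rounded.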
The exemplary CNN was trained for a total of 144,000 iterations (e.g., approximately 1170 epochs with a batch size of 12) before convergence. A single forward pass during test time for classification of new cases can be achieved in 0.063 seconds.
The exemplary system, method, and computer-accessible medium, according to an exemplary embodiment of the present disclosure, can include pixel-wise cancer risk assessment using mammograms to define risk on an individual basis. For example, an overall accuracy of 72% was achieved in predicting high versus low cancer risk mammograms.
With changing screening guidelines, it can be beneficial to better define an individual's risk for breast cancer. While mammographic breast density categorization procedures exist, accurate identification of who can be at high risk remains a challenge. In contrast, the exemplary system, method, and computer-accessible medium, which can utilize heat maps, can provide and show a breast cancer risk heterogeneity among mammographic breast density categories. For example, not all heterogeneously dense breasts are high risk, with a subset demonstrating a stronger resemblance to a low risk pattern. Similarly, not all breasts with scattered fibroglandular density demonstrate a low risk pattern. While approximately half the population can be categorized as having dense breasts (e.g., BI-RADS 3 and 4) (see, e.g., Reference 11), the exemplary system, method, and computer-accessible medium, according to an exemplary embodiment of the present disclosure, can be used to better classify low and high risk patients.
The exemplary CNN, according to an exemplary embodiment of the present disclosure, did not show any significant bias toward the cancer side. In addition, significant correlation was observed between the two breasts (e.g., the side that developed cancer and the contralateral non-cancer side), indicating that the exemplary CNN can predict risk for breast cancer based on features that are largely conserved on an individual basis. In particular, the heat maps shown in
Individualized breast cancer risk stratification can significantly impact clinical management. This risk assessment could be implemented into screening guidelines. In the setting of evolving screening guidelines that favor later and less frequent screening for average risk women, accurately categorized high risk women can benefit from earlier and more frequent screening.
Beyond screening, individualized risk assessment can be used for chemoprevention strategies. The American Society of Clinical Oncology, National Comprehensive Cancer Network, and United States Preventive Services Task Force recommend counseling high risk women above the age of 35 on pharmacologic interventions for breast cancer risk reduction. (See, e.g., References 26-28). Two selective estrogen receptor modulators, tamoxifen and raloxifene, approved for chemoprevention in the United States, show up to a 50% cancer risk reduction. Additionally, two aromatase inhibitors, exemestane and anastrozole, not yet approved for use in the United States, have shown significant chemopreventive potential in preliminary studies. (See, e.g., Reference 29). Individualized breast cancer risk assessment can have potential to aid in the selection of high risk patients and counseling on chemoprevention.
As shown in
Further, the exemplary processing arrangement 505 can be provided with or include input/output ports 535, which can include, for example, a wired network, a wireless network, the internet, an intranet, a data collection probe, a sensor, etc. As shown in
The foregoing merely illustrates the principles of the disclosure. Various modifications and alterations to the described embodiments will be apparent to those skilled in the art in view of the teachings herein. It will thus be appreciated that those skilled in the art will be able to devise numerous systems, arrangements, and procedures which, although not explicitly shown or described herein, embody the principles of the disclosure and can be thus within the spirit and scope of the disclosure. Various different exemplary embodiments can be used together with one another, as well as interchangeably therewith, as should be understood by those having ordinary skill in the art. In addition, certain terms used in the present disclosure, including the specification, drawings and claims thereof, can be used synonymously in certain instances, including, but not limited to, for example, data and information. It should be understood that, while these words, and/or other words that can be synonymous to one another, can be used synonymously herein, that there can be instances when such words can be intended to not be used synonymously. Further, to the extent that the prior art knowledge has not been explicitly incorporated by reference herein above, it is explicitly incorporated herein in its entirety. All publications referenced are incorporated herein by reference in their entireties.
The following references are hereby incorporated by reference in their entireties:
This application relates to and claims priority from U.S. Patent Application No. 62/585,452, filed on Nov. 13, 2017, the entire disclosure of which is incorporated herein by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US18/60271 | 11/12/2018 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62585452 | Nov 2017 | US |