PROJECT ABSTRACT: The overall objective of this proposal is to develop and validate rapid, deep learning-based digital pathology biomarkers for predicting the risk and biological evolution of premalignant lesion progression to invasive oral cavity squamous cell carcinoma (OCSCC). To accomplish this, we have assembled a team of clinical experts in oral premalignancy biology and patient care, and will deploy a cutting-edge computational pipeline to build models from a massive data set of more than 4000 samples. This project seeks to meet a critical unmet need in characterizing and risk-stratifying the precursor lesions of OCSCC. Premalignant oral dysplasias transform to OCSCC at an overall rate of 15%. Unfortunately, no unified grading schema exists to reflect the risk of progression from premalignancy to OCSCC, and interrater agreement between pathologists is low. The ability to rank dysplasias by aggressiveness would be of significant benefit for guiding oral premalignancy management, as patients with OCSCC face a five-year survival of 55%, a high post-treatment relapse rate, and morbid diagnostic and treatment options. The current histopathology standard for assessing premalignancy risk is hematoxylin and eosin (HE) slide evaluation by an expert pathologist. With a prevalence approaching 10%, oral premalignant lesions are common enough that HE data are abundant, and this large volume of data is an ideal application for artificial intelligence methods. Deep learning is an emerging discipline within artificial intelligence in which digital image information is automatically processed to recognize patterns in digital histology images and accomplish classification tasks. Deep learning is the focus of our lab, and we have both published and preliminary data relevant to premalignancy progression prediction.
The central hypothesis of this proposal is that tissue patterns within oral premalignancy digital pathology samples contain important hidden information about individual prognosis and can be automatically risk-stratified using deep learning. To evaluate this hypothesis, this proposal seeks to train, validate, and interrogate a series of deep learning-based predictors to automatically assess progression risk in oral premalignancy digital pathology images. Each deep learning classifier will be built from patient histology samples and directly observed clinical outcomes. To accomplish the proposal tasks, we will build a global, multi-institution repository of more than 4000 patient samples: the Oral Premalignancy Repository for Deep Learning (OPR-DL), designed for deep learning research. This application responds to Notice of Special Interest NOT-CA-20-031, which seeks to "advance early detection of head and neck cancer (HNC) ... distinguishing benign from malignant lesions"; "Discover ... technologies that could complement currently available clinical methods (e.g. pathology)"; and "validate imaging-based applications and data analysis tools". This tool could guide treatments, minimize unnecessary diagnostic interventions, and improve care in resource-constrained environments.