Project Summary

Idiopathic pulmonary fibrosis (IPF) is the most common and severe form of interstitial lung disease. IPF occurs in middle-aged and older adults and affects over 50,000 Americans each year. Most IPF patients die from respiratory failure within five years of diagnosis. Current therapies target downstream disease mechanisms, and while they modestly slow the decline in lung function, they have not been shown to improve survival or quality of life for IPF patients. Clinical outcomes vary considerably among IPF patients, and we believe this heterogeneity reflects distinct mechanisms and programs of disease initiation that culminate in a common pathology of end-stage lung fibrosis. As such, the development of transformative treatments hinges on our ability to better understand and target "upstream" disease mechanisms. However, progress toward this goal has been held back by the limited study of the cell types and molecular changes that initiate IPF pathogenesis. Recently developed technologies enable mRNA levels to be quantified in individual cells in a parallel, high-throughput manner (single-cell RNA sequencing, scRNA-seq). Our proposed studies will leverage these technologies, together with the regional heterogeneity of disease within the IPF lung, to determine the mechanisms and mediators that underlie the early pathogenesis of IPF. We will use scRNA-seq to determine the gene expression profiles and programs in non-fibrotic control lungs (n=50) and in paired, differentially affected regions of IPF lungs (n=100; paired distal, more fibrotic vs. proximal, less fibrotic samples). We will use computational methods to group cells into putative cell types based on transcriptional similarity and canonical marker gene expression.
We will then quantify the relative abundance of each cell type in these different disease states and use innovative bioinformatic approaches to determine the gene expression programs that drive different phases of disease pathogenesis. Then, to determine the role of genetic variation in regulating these disease pathways, we will use the inter-individual genetic variation present in our sample to identify single nucleotide polymorphisms that are associated with gene expression changes (expression quantitative trait loci, eQTLs) in each cell type independently. Next, to begin to interrogate the mechanisms underlying disease heterogeneity, we will determine cell-type-specific gene expression changes that are associated with genetic predictors of disease outcome (MUC5B genotype, peripheral blood telomere length). Finally, we will define novel disease endotypes based on cell-type-specific gene expression patterns. The localization and spatial patterns of identified genes will be determined using matched formalin-fixed, paraffin-embedded (FFPE) samples, and key findings will be validated in primary cell and organoid culture systems. This work will generate the most comprehensive molecular characterization of healthy and IPF lungs to date, and promises to answer fundamental questions about the cell types, genetic variants, and gene expression changes driving IPF pathogenesis.