PROJECT SUMMARY/ABSTRACT
Genome-wide association studies (GWAS) have identified thousands of genetic variants associated with metabolic, cardiovascular, autoimmune and other diseases. These variants have the potential to reveal molecular mechanisms that underpin human diseases, but their interpretation is extremely challenging because most lie within non-coding genomic regions of unknown function. The long-term goal of the proposed research is to elucidate the molecular basis of complex diseases by assembling comprehensive catalogs of regulatory sequences and illuminating how non-coding genetic variants affect gene regulation. This proposal will leverage the power of high-throughput genomic perturbations and computational analyses to discover regulatory sequences, interpret non-coding genetic variants, and connect disease-associated variants to the genes they regulate. Research Focus 1 will systematically discover novel regulatory sequences using CRISPR-directed tiling deletion screens, which can detect regulatory sequences that are invisible to other approaches. These screens will be performed in primary T cells and applied to megabase-scale regions surrounding T cell differentiation genes, which are rich in uncharacterized GWAS hits. To determine how frequently GWAS hits affect novel regulatory sequences lacking canonical enhancer marks, fine-mapped GWAS variants will be intersected with regulatory sequences discovered by the screens. The function of novel regulatory sequences will be determined with deletions followed by experiments to measure 3D chromatin contacts, gene expression, and cellular proliferation. Research Focus 2 will use single-cell genome perturbations to connect thousands of variants associated with human diseases to the genes they regulate across multiple cell types.
Sequences containing potentially causal GWAS variants will be targeted with CRISPR interference, and gene expression will be measured with single-cell RNA-seq in a mixture of disease-relevant immune cells. Using the single-cell data, perturbed sequences will be connected to changes in gene expression in specific cell types. Variants predicted to regulate gene expression will be validated by modifying alleles with genome editing. The expected outcomes of this project are (i) systematic catalogs of regulatory sequences for genes involved in T cell differentiation, (ii) molecular characterization of novel unmarked regulatory sequences that contain GWAS hits, and (iii) connections between sequences containing GWAS hits and the genes they regulate in specific cell types. This proposal will establish genomic perturbations as a new strategy to interpret non-coding variants, uncover important new regulatory biology, and accelerate mechanistic understanding of disease-associated variants.