Fighting viral epidemics is one of the major challenges faced by the modern, globally connected world. Recent technological advances have had a profound effect on how we answer that challenge. They allow for rapid and cost-effective sequencing (i.e., reading) of pathogen genomes and can generate enormous amounts of data in almost real time. Genomic epidemiology is an interdisciplinary research area that uses the large-scale analysis of viral genomes to understand how viruses evolve and spread. The methods of genomic epidemiology are becoming major instruments not only for research, but also for public-health decision making of broad societal importance. However, the field's computational toolkit is still developing, and this process faces many hard algorithmic challenges. Some of the major problems are: (i) how to extract the whole spectrum of viral genetic diversity, including newly emerging mutations and variants, from noisy and fragmented sequencing data; (ii) how to use genomic data to investigate outbreaks and reconstruct virus-transmission networks; and (iii) how to identify highly pathogenic or transmissible viral variants. The algorithms for these problems should be accurate, reproducible, interpretable, and scalable to the volumes of "big data" produced by modern sequencing platforms. The development of such algorithms and the study of the corresponding algorithmic problems are the goal of this project.
Other major objectives are to help bring computational genomics into high-school and undergraduate classrooms, to broaden participation in computational biology via advanced pedagogical techniques, and to facilitate training of the next generation of interdisciplinary researchers, who will combine expertise in computer science, epidemiology, and molecular biology and will be able to develop innovative algorithms and apply them to real-life problems.

This project will undertake a systematic study of the fundamental computational problems of genomic epidemiology from the perspective of theoretical computer science. The overarching objective is the development of new methods based on the cross-disciplinary convergence of techniques from algorithmic graph theory, network theory, and mathematical (particularly combinatorial) optimization. The first specific scientific goal is the development of methods for assessing viral genetic diversity using networks of statistically linked mutations and a graph-decomposition approach. The second goal is the development of a family of combinatorial algorithms for reconstructing viral transmission networks by fusing phylogenetics with a network-theory approach to the social networks relevant to infection dissemination. The final goal is the design of scalable computational techniques for quantifying viral phenotypic diversity using combinatorial and convex optimization. The investigator will collaborate closely with biologists and epidemiologists to ensure the biomedical relevance and applicability of the developed algorithms. It is also expected that some of the new machinery will be applicable to non-biomedical problems arising in graph theory and in studies of complex networks.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.