The assembly of full RNA and DNA sequences from short sequence data is a fundamental computational problem in biology. The goals of this project are to develop state-of-the-art algorithmic approaches that assemble sequences more accurately, with guarantees on both computational efficiency and the quality of the assemblies produced. We will evaluate these computational methods on biological datasets and develop the most promising into well-documented software tools that anyone in the biological research community can easily use. The project will provide support and training for two PhD students at Montana State University (MSU) and may also involve undergraduate students in research, both through existing MSU programs designed to engage undergraduates in research and through a new REU program in the School of Computing. Additionally, outreach activities are planned for K-12 students in the local community.<br/><br/><br/>The technical focus of the project is to advance computational methods and algorithms for both RNA and DNA multiassembly: the assembly of multiple related but different genetic sequences from next-generation sequencing reads. Multiassembly is essential for many tasks in biological research, from assembling gene transcripts in order to characterize gene function to assembling viral variants in an evolving population. Existing tools in this space are often built around heuristic methods that do not guarantee that an optimal solution is found. The objectives of this project center on explicit formulations of the problems and on exact computational methods that can find optimal solutions. Specific aims include developing scalable methods for flow decomposition problems and identifying “safe” properties of the solutions, adapting methods to cope with newer technologies such as paired-end and long reads, as well as with uncertainty in the data, and comparing and relating data from multiple samples. 
Results of the project will be available on the website "www.cs.montana.edu/multiassembly". The project is jointly funded by the Division of Biological Infrastructure (DBI) and the Established Program to Stimulate Competitive Research (EPSCoR).<br/><br/>This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.