Scaling Flow Decomposition to Multi-Sample Multiassembly and Comparison

Information

  • NSF Award
    2309902
Owner
  • Award Id
    2309902
  • Award Effective Date
    6/1/2023
  • Award Expiration Date
    5/31/2026
  • Award Amount
    $200,000.00
  • Award Instrument
    Continuing Grant

The assembly of full RNA and DNA sequences from short sequence data is a fundamental computational problem in biology. The goals of this project are to develop state-of-the-art algorithmic approaches that assemble sequences with guarantees on both computational efficiency and the quality of the assemblies produced. We will evaluate these methods on biological datasets and develop the most promising ones into software tools that are well documented and easy for anyone in the biological research community to use. The project will provide support and training for two PhD students at Montana State University (MSU) and may also involve undergraduate students in research, both through existing MSU programs designed to engage undergraduates in research and through a new REU program in the School of Computing. Additionally, outreach activities are planned for K-12 students in the local community.

The technical focus of the project is to advance computational methods and algorithms for both RNA and DNA multiassembly: the assembly of multiple related but different genetic sequences from next-generation sequencing reads. Multiassembly is essential for many tasks in biological research, from the assembly of gene transcripts to characterize gene function to the assembly of viral variants in an evolving population. While there are existing tools in this space, they are often built around heuristic methods that do not guarantee that an optimal solution is found. The objectives of this project center on explicit formulations of the problems and on exact computational methods that can find optimal solutions. Specific aims include developing scalable methods for flow decomposition problems and identifying “safe” properties of the solutions; adapting methods to cope with newer technologies such as paired-end and long reads, as well as uncertainty in the data; and comparing and relating data from multiple samples. Results of the project will be available at www.cs.montana.edu/multiassembly. The project is jointly funded by the Division of Biological Infrastructure (DBI) and the Established Program to Stimulate Competitive Research (EPSCoR).

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

  • Program Officer
    Jennifer Weller, jweller@nsf.gov, 7032922224
  • Min Amd Letter Date
    5/31/2023
  • Max Amd Letter Date
    5/31/2023
  • ARRA Amount

Institutions

  • Name
    Montana State University
  • City
    BOZEMAN
  • State
    MT
  • Country
    United States
  • Address
    216 MONTANA HALL
  • Postal Code
    59717
  • Phone Number
    4069942381

Investigators

  • First Name
    Lucia
  • Last Name
    Williams
  • Email Address
    lucia.williams@montana.edu
  • Start Date
    5/31/2023 12:00:00 AM
  • First Name
    Brendan
  • Last Name
    Mumey
  • Email Address
    brendan.mumey@montana.edu
  • Start Date
    5/31/2023 12:00:00 AM

Program Element

  • Text
    EPSCoR Co-Funding
  • Code
    9150

Program Reference

  • Text
    ADVANCES IN BIO INFORMATICS
  • Code
    1165