ATD Collaborative Research: New theorems and algorithms for comprehensive analysis of metagenomic data via statistical phylogenetics

Information

  • NSF Award Id
    1223057
  • Award Effective Date
    9/1/2012
  • Award Expiration Date
    8/31/2017
  • Award Amount
    $496,702.00
  • Award Instrument
    Standard Grant

Current whole-metagenome analysis tools are primarily based on sequence similarity (assembly, BLAST) and taxonomies ("binning" approaches). While useful, these approaches have the following limitations. Assembly methods require read overlap and thus reconstruct only the most abundant organisms in a mixed sample. Sequence similarity approaches such as BLAST cannot relate reads to ancestral organisms and do not indicate the evolutionary significance of mutations. Taxonomic methods are too coarse to reflect the subtle DNA sequence changes that may characterize a biological threat. The investigators propose to overcome these limitations by developing the theoretical underpinnings of methods to: (1) reconstruct the cellular compartmentalization of DNA in environmental samples, even when read counts are small; (2) detect synthetic genomes and evidence of directed evolution within a metagenomic sample by performing a phylogenetic comparison with extant genomes; (3) detect combinations of genetic material that are anomalous given their location or time of observation; and (4) statistically distinguish meaningful shifts in microbial community composition from noise, even when those shifts happen at a level below that detectable using currently available methods.

The tools of genetic engineering are in the hands of scientists in many countries, and these tools can be used to synthesize biological weapons. Preventing casualties from these weapons depends on their prompt detection and identification. Although high-throughput DNA sequencing could be used to monitor biological threats, the currently available tools for analyzing the wealth of information it generates are insufficient to statistically analyze threat risk. A biodefense monitoring approach informed by a statistical analysis of evolutionary signal could yield a means to detect genetic anomalies and threats directly from "metagenomic" data, that is, high-throughput shotgun sequencing data from environmental samples.

  • Program Officer
    Leland M. Jameson
  • Min Amd Letter Date
    8/28/2012
  • Max Amd Letter Date
    8/28/2012
  • ARRA Amount

Institutions

  • Name
    Fred Hutchinson Cancer Research Center
  • City
    Seattle
  • State
    WA
  • Country
    United States
  • Address
    1100 FAIRVIEW AVE N J6-300
  • Postal Code
    98109-4433
  • Phone Number
    (206) 667-4868

Investigators

  • First Name
    Aaron
  • Last Name
    Darling
  • Email Address
    aarondarling@ucdavis.edu
  • Start Date
    8/28/2012
  • First Name
    Frederick
  • Last Name
    Matsen
  • Email Address
    matsen@fhcrc.org
  • Start Date
    8/28/2012

Program Element

Program Reference

  • Text
    ALGORITHMS IN THREAT DETECTION
  • Code
    6877