Integrated Assembly Software for Sanger and Next Generation Sequence Technologies

Information

  • Research Project
  • ApplicationId
    7328463
  • Core Project Number
    R43GM082117
  • Full Project Number
    1R43GM082117-01
  • Serial Number
    82117
  • FOA Number
    PA-06-20
  • Sub Project Id
  • Project Start Date
    9/1/2007
  • Project End Date
    2/29/2008
  • Program Officer Name
    BONAZZI, VIVIEN
  • Budget Start Date
    9/1/2007
  • Budget End Date
    2/29/2008
  • Fiscal Year
    2007
  • Support Year
    1
  • Suffix
  • Award Notice Date
    8/20/2007

DESCRIPTION (provided by applicant): Obtaining complete genome DNA sequences of organisms from bacteria to man is a hallmark achievement of 20th century science and has had a huge impact on the biological sciences as well as the practice of medicine. Sequence assembly software played a critical role in making this possible. Now the cost of sequence data acquisition is falling rapidly and dramatically, thanks to revolutionary new sequencing machines that turn out data in great quantities at a fraction of the previous cost. This is initiating a second wave of sequencing on a far greater scale and, beyond the sequencing of new genomes, opens up many applications that were previously undreamed of or too costly to carry out. Projects such as using whole genome "snapshots" to identify crucial sequence changes that occur during evolution, metagenomic sequencing of communities, and medical diagnostic applications that help track down disease-related alterations in genomic DNA are all becoming feasible. Medical scientists and entrepreneurs are even anticipating the time when a complete human genome can be determined for about $1000, although this awaits yet another revolutionary advance in sequence data acquisition. Utilizing the present flood of new data will require corresponding progress in assembly software performance. We propose here to develop and evaluate prototype software approaches that can take advantage of all the new data gathering techniques. This will require developing new algorithms that use the abundant data now generated to overcome the bottlenecks that have previously limited the accuracy and completeness of assemblies and made expensive manual correction necessary.
As an independent software developer without vested ties to particular technical approaches, DNASTAR is uniquely positioned to provide a trusted implementation that can effectively combine data from all approaches, giving researchers the freedom to use whatever combination of technologies best fits their needs. Our proposed new software techniques will make repeat handling, scaffold ordering and annotation far more efficient, and we predict that the speed of assembly will also increase dramatically. These feasibility experiments will be conducted on the strong foundation of DNASTAR's high-performance SeqMan Genome Assembler platform. This will maximize the probability that a robust, commercially attractive product will emerge after Phase II, with the high performance and accuracy needed by the large government-sponsored sequencing centers, commercial and clinical sequencing operations, and individual research laboratories.
We are embarking on a new era in medicine in which the genetic basis for diseases will be determined, both for the general population and for individuals. This is being made possible, in part, by dramatic improvements in the ability to decipher vast amounts of DNA at a greatly reduced cost. This proposal aims to develop the computational software needed to convert that mass of coded data into medically useful information.

  • IC Name
    NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES
  • Activity
    R43
  • Administering IC
    GM
  • Application Type
    1
  • Direct Cost Amount
  • Indirect Cost Amount
  • Total Cost
    136286
  • Sub Project Total Cost
  • ARRA Funded
  • CFDA Code
    859
  • Ed Inst. Type
  • Funding ICs
    NIGMS:136286
  • Funding Mechanism
  • Study Section
    ZRG1
  • Study Section Name
    Special Emphasis Panel
  • Organization Name
    DNASTAR, INC.
  • Organization Department
  • Organization DUNS
    130194947
  • Organization City
    MADISON
  • Organization State
    WI
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    53705
  • Organization District
    UNITED STATES