Association Analysis Software for Mining Clinical Next-Gen Sequencing Data

Information

Research Project
8236680

ApplicationId
8236680
Core Project Number
R44HG006578
Full Project Number
1R44HG006578-01
Serial Number
006578
FOA Number
RFA-HG-10-019
Sub Project Id

Project Start Date
9/25/2012
Project End Date
6/24/2013
Program Officer Name
SOFIA, HEIDI J
Budget Start Date
9/25/2012
Budget End Date
6/24/2013
Fiscal Year
2012
Support Year
01
Suffix
Award Notice Date
9/24/2012

Organizations

DNASTAR, INC.

Information

DESCRIPTION (provided by applicant): Remarkable improvements in the throughput, accuracy and cost-effectiveness of next-generation sequencing (next-gen) technologies are ushering in a new era of clinical medicine. Genome-wide association studies (GWAS) in particular have begun to leverage these advances to determine the complete catalog of common and rare variants for each member of a cohort. The resolving power of this approach has the potential to greatly accelerate our understanding, diagnosis and treatment of human disease. Unfortunately, analysis of these massive data sets requires that several disparate pieces of software be cobbled together, including a large-capacity next-gen sequencing assembler, variation detection modules, mapping and comparison tools for tens to hundreds of variant reports, statistical analysis packages, reporting tools, and so on. Combining and using these tools typically requires extensive bioinformatic expertise, as the software is rarely well documented or supported and often depends on elaborate hardware. These hurdles make next-gen-based GWAS inaccessible to the vast majority of the crucial user base, the physician researchers.

The goal of this proposal is to assemble the essential next-gen-based GWAS software components into a single coherent pipeline that is fully equipped to meet the needs of the medical research community. Consistent with DNASTAR's 28-year tradition, the software will be easy to use, will run on a reasonably priced (<$3000) desktop computer, and will be fully documented and supported. The pipeline will consist of two modules already available through DNASTAR, SeqMan NGen 3.0 (SM NGen 3.0) and ArrayStar. SM NGen 3.0, our recently released human-genome-scale assembly and analysis package, forms the front end of the pipeline. Reference-guided assemblies of whole human genome or exome next-gen data sets produce variation reports that include the impact on gene features and associations with the dbSNP database. Putative variations can be verified by direct inspection of the alignment through the SeqMan Pro component of the package. Variation reports from each member of a GWAS cohort will then be fed into our multi-sample comparison and analysis program, ArrayStar, at the back end of the pipeline. ArrayStar has the infrastructure for multi-sample management and processing, which can be easily adapted to GWAS analysis. These adaptations and their documentation are a central focus of this application.

Critical to the successful development of this software is our collaboration with Dr. Douglas McNeel (Dept. of Oncology, UW-Madison). The exomes from a panel of prostate cancer vaccine recipients from the McNeel lab, including responders and non-responders, will be sequenced as input from which to build the pipeline, using iterative cycles of development followed by evaluation by the McNeel group. This relationship offers an ideal opportunity to build the analysis and reporting software needed by physician researchers to form, test and validate GWAS-generated hypotheses.

PUBLIC HEALTH RELEVANCE: The easy-to-use tools to be developed and integrated in this project will dramatically enhance the efficiency of clinical and diagnostic research for a wide range of life scientists and medical professionals using next-generation DNA sequencing technologies, allowing new treatments to be brought to market sooner, enhancing scientists' understanding of treatment efficacy, and supporting the tailoring of different treatments to specific groups of individuals based on their genetic composition. These tools will be flexible enough to support critical analysis of large populations for clinical research and easy enough to use that all life scientists and medical professionals will feel comfortable with them.
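The multi-sample comparison step described in the abstract (per-sample variation reports compared across responders and non-responders) can be illustrated with a minimal sketch. The abstract does not name a statistical test or a report format, so everything here is an assumption made for illustration: the sample names, the variant identifiers, the `reports` and `responders` structures, and the choice of a two-sided Fisher's exact test. It is not DNASTAR's implementation or ArrayStar's API.

```python
from math import comb

# Hypothetical per-sample variation reports: sample name -> set of variant IDs.
# All sample names and variant IDs below are made up for illustration.
reports = {
    "S1": {"chr1:12345:A>G", "chr2:500:C>T"},
    "S2": {"chr1:12345:A>G"},
    "S3": {"chr1:12345:A>G", "chr7:900:G>A"},
    "S4": {"chr2:500:C>T"},
    "S5": {"chr7:900:G>A"},
    "S6": set(),
}
responders = {"S1", "S2", "S3"}  # assumed cohort labels; the rest are non-responders


def carrier_table(variant, reports, responders):
    """Return (a, b, c, d): responder carriers, responder non-carriers,
    non-responder carriers, non-responder non-carriers."""
    a = b = c = d = 0
    for sample, variants in reports.items():
        carries = variant in variants
        if sample in responders:
            a, b = (a + 1, b) if carries else (a, b + 1)
        else:
            c, d = (c + 1, d) if carries else (c, d + 1)
    return a, b, c, d


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among the responders
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Rank every observed variant by its association with response status
all_variants = set().union(*reports.values())
ranked = sorted(
    (fisher_exact_two_sided(*carrier_table(v, reports, responders)), v)
    for v in all_variants
)
for p_value, variant in ranked:
    print(f"{variant}\tp={p_value:.3f}")
```

A real cohort analysis would also need multiple-testing correction and attention to genotype (heterozygous versus homozygous) and coverage, none of which this sketch attempts.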

IC Name

NATIONAL HUMAN GENOME RESEARCH INSTITUTE

Activity
R44
Administering IC
HG
Application Type
1

Direct Cost Amount
Indirect Cost Amount
Total Cost
150000
Sub Project Total Cost

ARRA Funded
False
CFDA Code
172
Ed Inst. Type
Funding ICs
NHGRI:150000
Funding Mechanism
SBIR-STTR RPGs
Study Section
ZHG1
Study Section Name
Special Emphasis Panel

Organization Name
DNASTAR, INC.
Organization Department
Organization DUNS
130194947
Organization City
MADISON
Organization State
WI
Organization Country
UNITED STATES
Organization Zip Code
537055202
Organization District
UNITED STATES
