Curated copy number variation TCGA database and analytical platform for use in ca

Information

Research Project
8715268

ApplicationId
8715268
Core Project Number
R43CA186358
Full Project Number
1R43CA186358-01
Serial Number
186358
FOA Number
PA-13-234
Sub Project Id

Project Start Date
6/11/2014
Project End Date
5/31/2015
Program Officer Name
LOU, XING-JIAN
Budget Start Date
6/11/2014
Budget End Date
5/31/2015
Fiscal Year
2014
Support Year
01
Suffix
Award Notice Date
6/11/2014

Organizations

BioDiscovery, Inc.

DESCRIPTION (provided by applicant): Cancers selected for the NIH's The Cancer Genome Atlas (TCGA) project have been chosen because of their poor prognosis and overall public health impact. Select tissue samples have been profiled for gene and miRNA expression, promoter methylation, DNA sequence and mutation analysis, and copy number variation (CNV), with total expenditures of $275 million [13]. The CNV information, derived from the raw array-based comparative genomic hybridization (aCGH) and SNP-array data, has been successfully used in specific application areas, such as identifying significant recurrent aberrations in each tumor type through population-wide, tumor-specific analysis. However, the full potential of these data has not yet been exploited. The two major obstacles have been the method used for the initial data processing, which has somewhat limited the data's utility, and the lack of a comprehensive, integrated data access and analytical platform for copy number analysis. We have demonstrated that the copy number data can be successfully re-processed to more closely reflect the underlying genomic events, which, in turn, would open several high-impact avenues for further research. Examples of such new research areas include identification of CNVs predictive of survival, genomic stratification of like tumors by phenotype, and correlation of copy number and gene expression information. A product that enables the research community to take advantage of this substantial national investment in a much broader way is highly significant, both as an advancement in cancer research and as a viable business opportunity.
Hypothesis: We hypothesize that using the BioDiscovery Nexus pre-processing and calling algorithms, with optimized statistical parameters confirmed by a clinical laboratory, combined with sample review by scientists trained in copy number analysis, will yield a database of structural variants that is substantially more concordant with the underlying tumor genomes than currently available data. Delivering such curated data, integrated with powerful, easy-to-use analysis tools, will have great scientific benefit.
Preliminary data: We have performed a proof of principle by processing the glioblastoma multiforme (GBM) level-1 data (raw data) through our pipeline, and have demonstrated that the copy number profiles generated better reflect the true genomic profile of the samples (showing the correct ploidy and break points as compared to the expected profiles for these samples).
Specific Aims: This project involves establishing the statistical and review methods for performing high-quality copy number analysis on existing TCGA level-1 data, creating a resultant data product, and delivering it through an integrated analytical platform. In SPECIFIC AIM 1, we will optimize the statistical parameters of our commercially developed Hidden Markov Model (HMM) algorithm for the TCGA data set and apply them to the data set along with baseline ploidy correction. In SPECIFIC AIM 2, we will develop quality control methods and metrics to identify samples that should be excluded from the data set, or that require manual review and analysis. In SPECIFIC AIM 3, we will create a database of the re-processed and curated TCGA copy number data with associated clinical annotations. In SPECIFIC AIM 4, we will integrate this data product with a scientifically accepted analytical platform for accessing the data and performing downstream analyses. Phase II would involve extending the curation methodology and analytical platform to incorporate the other dimensions of data described.
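To make the idea behind SPECIFIC AIM 1 concrete, the following is a minimal sketch of HMM-based copy number calling on probe-level log2 ratios. It is not the BioDiscovery Nexus algorithm, whose details are not given in this abstract. The three states, their mean log2 ratios, the noise level, the transition probability, and the function names (recenter, viterbi, segments) are all illustrative assumptions, and median recentering stands in here as only a crude proxy for baseline ploidy correction.

```python
import math

# Hypothetical three-state model. These values are assumptions chosen for
# illustration, not parameters of the Nexus algorithm.
STATES = ("loss", "neutral", "gain")
MEANS = {"loss": -0.5, "neutral": 0.0, "gain": 0.4}  # assumed log2 ratios
SIGMA = 0.2        # assumed probe noise (standard deviation)
STAY_PROB = 0.95   # assumed probability of staying in the same state


def recenter(log2_ratios):
    """Shift values so their median is 0 (a crude baseline correction)."""
    ordered = sorted(log2_ratios)
    n = len(ordered)
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    return [x - median for x in log2_ratios]


def log_emission(x, state):
    """Log density of x under the Gaussian for the given state."""
    z = (x - MEANS[state]) / SIGMA
    return -0.5 * z * z - math.log(SIGMA * math.sqrt(2 * math.pi))


def viterbi(log2_ratios):
    """Return the most likely state label for each probe."""
    if not log2_ratios:
        return []
    switch = (1 - STAY_PROB) / (len(STATES) - 1)
    log_trans = {
        (a, b): math.log(STAY_PROB if a == b else switch)
        for a in STATES
        for b in STATES
    }
    # Uniform prior over states for the first probe.
    score = {
        s: math.log(1 / len(STATES)) + log_emission(log2_ratios[0], s)
        for s in STATES
    }
    back = []  # back[i][s] = best predecessor of state s at probe i + 1
    for x in log2_ratios[1:]:
        prev, score, pointers = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + log_trans[(p, s)])
            pointers[s] = best
            score[s] = prev[best] + log_trans[(best, s)] + log_emission(x, s)
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def segments(path):
    """Collapse a per-probe path into (start, end, state) runs, end exclusive."""
    runs = []
    start = 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            runs.append((start, i, path[start]))
            start = i
    return runs
```

A call such as segments(viterbi(recenter(values))) turns a list of probe log2 ratios into candidate break points. In a real pipeline the state means, noise level, and transition probabilities would be tuned per platform and data set, which is the kind of parameter optimization the aim describes.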

IC Name

NATIONAL CANCER INSTITUTE

Activity
R43
Administering IC
CA
Application Type
1

Direct Cost Amount
Indirect Cost Amount
Total Cost
156809
Sub Project Total Cost

ARRA Funded
False
CFDA Code
393
Ed Inst. Type
Funding ICs
NCI:156809
Funding Mechanism
SBIR-STTR RPGs
Study Section
ZRG1
Study Section Name
Special Emphasis Panel

Organization Name
BIODISCOVERY, INC.
Organization Department
Organization DUNS
005290924
Organization City
EL SEGUNDO
Organization State
CA
Organization Country
UNITED STATES
Organization Zip Code
902454743
Organization District
UNITED STATES
