Curated copy number variation TCGA database and analytical platform for use in ca

Information

  • Research Project
  • ApplicationId
    8715268
  • Core Project Number
    R43CA186358
  • Full Project Number
    1R43CA186358-01
  • Serial Number
    186358
  • FOA Number
    PA-13-234
  • Sub Project Id
  • Project Start Date
    6/11/2014
  • Project End Date
    5/31/2015
  • Program Officer Name
    LOU, XING-JIAN
  • Budget Start Date
    6/11/2014
  • Budget End Date
    5/31/2015
  • Fiscal Year
    2014
  • Support Year
    01
  • Suffix
  • Award Notice Date
    6/11/2014
Organizations


DESCRIPTION (provided by applicant): Cancers selected for the NIH's The Cancer Genome Atlas (TCGA) project were chosen because of their poor prognosis and overall public health impact. Select tissue samples have been profiled for gene and miRNA expression, promoter methylation, DNA sequence and mutation analysis, as well as copy number variation (CNV), with total expenditures of $275 million [13]. The CNV information, derived from the raw array-based comparative genomic hybridization (aCGH) and SNP-array data, has been successfully utilized in specific application areas, such as identification of significant recurrent aberrations in each tumor type from population-wide, tumor-specific analysis. However, the full potential of this data has not yet been exploited. The two major obstacles have been the method used for the initial data processing, which has somewhat limited the data's utility, and the lack of a comprehensive, integrated data access and analytical platform for copy number analysis. We have demonstrated that the copy number data can be successfully re-processed to more closely reflect the underlying genomic events, which, in turn, would open several high-impact avenues for further research. Examples of such new research areas include identification of CNVs predictive of survival, genomic stratification of like-tumors by phenotype, and correlation of copy number and gene expression information. A product that enables the research community to take advantage of this substantial national investment in a much broader way is highly significant both as an advancement in cancer research and as a viable business opportunity.
Hypothesis: We hypothesize that using the BioDiscovery Nexus pre-processing and calling algorithms, with optimized statistical parameters confirmed by a clinical laboratory, combined with sample review by scientists trained in copy number analysis, will yield a database of structural variants that is substantially more concordant with the underlying tumor genomes than currently available data. Delivering such curated data, integrated with powerful, easy-to-use analysis tools, will have great scientific benefit.
Preliminary data: We have performed a proof-of-principle study in which glioblastoma multiforme (GBM) level-1 data (raw data) were processed through our pipeline, and have demonstrated that the resulting copy number profiles better reflected the true genomic profile of the samples (showing the correct ploidy and breakpoints as compared to the expected profiles for these samples).
Specific Aims: This project involves establishing the statistical and review methods for performing high-quality copy number analysis on existing TCGA level-1 data, creating a resultant data product, and delivering it through an integrated analytical platform. In SPECIFIC AIM 1, we will optimize the statistical parameters of our commercially developed Hidden Markov Model (HMM) algorithm for the TCGA data set and apply them to the data set along with baseline ploidy correction. In SPECIFIC AIM 2, we will develop quality control methods and metrics to identify samples that should be excluded from the data set or that require manual review and analysis. In SPECIFIC AIM 3, we will create a database of the re-processed and curated TCGA copy number data with associated clinical annotations. In SPECIFIC AIM 4, we will integrate this data product with a scientifically accepted analytical platform for accessing the data and performing downstream analyses. Phase II would involve extending the curation methodology and analytical platform to incorporate the other dimensions of data described above.
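As a rough illustration of the kind of processing named in Specific Aims 1 and 2, the sketch below recenters a log2-ratio profile on its median (a crude stand-in for baseline correction), calls loss/neutral/gain states with a toy three-state Gaussian HMM decoded by the Viterbi algorithm, and computes a simple probe-to-probe noise statistic that could feed a quality control threshold. It is not the BioDiscovery Nexus algorithm, whose details the abstract does not give. The state means, noise level, stay probability, and all function names are hypothetical values chosen for the example.

```python
import math
from statistics import median

# Hypothetical model parameters (assumptions, not taken from the source).
STATE_MEANS = {"loss": -0.5, "neutral": 0.0, "gain": 0.4}  # expected log2 ratios
SIGMA = 0.2       # assumed probe noise (standard deviation)
STAY_PROB = 0.98  # assumed probability of remaining in the same state


def recenter(log2_ratios):
    """Shift the profile so that its median is zero (crude baseline correction)."""
    baseline = median(log2_ratios)
    return [x - baseline for x in log2_ratios]


def log_gauss(x, mean, sigma):
    """Log density of a normal distribution at x."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def viterbi_calls(log2_ratios):
    """Return the most likely state for each probe under the toy HMM."""
    states = list(STATE_MEANS)
    stay = math.log(STAY_PROB)
    switch = math.log((1 - STAY_PROB) / (len(states) - 1))

    def trans(prev, cur):
        return stay if prev == cur else switch

    # Uniform start prior, so only the emission term matters for the first probe.
    score = {s: log_gauss(log2_ratios[0], STATE_MEANS[s], SIGMA) for s in states}
    back = []
    for x in log2_ratios[1:]:
        new_score, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: score[p] + trans(p, s))
            pointers[s] = best_prev
            new_score[s] = (score[best_prev] + trans(best_prev, s)
                            + log_gauss(x, STATE_MEANS[s], SIGMA))
        back.append(pointers)
        score = new_score

    # Trace back from the best final state.
    state = max(states, key=score.get)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def noise_metric(log2_ratios):
    """Median absolute difference between adjacent probes (needs >= 2 probes)."""
    return median(abs(b - a) for a, b in zip(log2_ratios, log2_ratios[1:]))
```

For example, `viterbi_calls(recenter(profile))` returns one label per probe, and a sample whose `noise_metric` exceeds some chosen cutoff could be flagged for manual review. A real pipeline would estimate the parameters from data and correct the baseline per sample ploidy rather than by a simple median shift.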

IC Name
NATIONAL CANCER INSTITUTE
  • Activity
    R43
  • Administering IC
    CA
  • Application Type
    1
  • Direct Cost Amount
  • Indirect Cost Amount
  • Total Cost
    156809
  • Sub Project Total Cost
  • ARRA Funded
    False
  • CFDA Code
    393
  • Ed Inst. Type
  • Funding ICs
    NCI:156809
  • Funding Mechanism
    SBIR-STTR RPGs
  • Study Section
    ZRG1
  • Study Section Name
    Special Emphasis Panel
  • Organization Name
    BIODISCOVERY, INC.
  • Organization Department
  • Organization DUNS
    005290924
  • Organization City
    EL SEGUNDO
  • Organization State
    CA
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    902454743
  • Organization District
    UNITED STATES