Evaluation and Development of Statistical Methods for Data Harmonization in Molecular Prognostication

Information

  • Research Project
  • ApplicationId
    10303963
  • Core Project Number
    R21HG012124
  • Full Project Number
    1R21HG012124-01
  • Serial Number
    012124
  • FOA Number
    PAR-18-843
  • Sub Project Id
  • Project Start Date
    9/3/2021
  • Project End Date
    8/31/2023
  • Program Officer Name
    LI, RONGLING
  • Budget Start Date
    9/3/2021
  • Budget End Date
    8/31/2023
  • Fiscal Year
    2021
  • Support Year
    01
  • Suffix
  • Award Notice Date
    9/3/2021

PROJECT SUMMARY: Survival analysis plays a foundational role in biomedical transcriptomics studies for developing reliable predictors of patient prognosis and treatment response. While survival analysis methods are available to address the issues of high dimensionality and signal sparsity, research is still lacking on the issue of data artifacts associated with disparate experimental handling, which is a pivotal feature of transcriptomics data. Published studies often deal with handling artifacts by borrowing methods that were developed for differential expression analysis, the most popular of which are quantile normalization for microarray data and scaling normalization for sequencing data. Despite the unfounded optimism for such "off-label" uses, we found that normalization may distort a marker's ordering across samples and subsequently compromise the detection of outcome-associated markers and the accuracy of outcome prediction. Thus, there is a pressing need to re-evaluate existing methods for dealing with these data artifacts and to tailor new ones specifically for the derivation of molecular prognosticators, so that this derivation can be done accurately and reproducibly. In this proposal, we will first fill the knowledge gap for microRNAs (a class of small RNAs that play an important role in regulating gene expression in humans) using data that are realistically distributed and robustly benchmarked. We will then develop new methods for managing handling artifacts, leveraging the survival regression framework. We will assess the performance of the new methods in comparison with existing methods using simulation tools and demonstrate their use with an application to ovarian cancer data from The Cancer Genome Atlas.
Our project is expected to advance the knowledge needed to optimize data harmonization for microRNA data, thereby accelerating the reproducible translation of these data into clinically useful predictors, and to pave the way for addressing the same issues in RNA data and their translation.
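
The concern that normalization may distort a marker's ordering across samples can be illustrated with a minimal sketch of textbook quantile normalization. This is not the project's code or data: the function name `quantile_normalize`, the toy matrix `expr` (rows are markers, columns are samples), and the no-ties assumption are all illustrative choices.

```python
import numpy as np


def quantile_normalize(x):
    """Quantile-normalize the columns (samples) of a markers-by-samples matrix.

    Illustrative sketch only; assumes no tied values within a column.
    """
    x = np.asarray(x, dtype=float)
    # Rank of each entry within its own column (0 = smallest).
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = np.sort(x, axis=0).mean(axis=1)
    # Replace every entry by the reference value at its within-sample rank.
    return reference[ranks]


# Hypothetical toy data: 3 markers (rows) measured in 3 samples (columns).
expr = np.array([
    [5.0, 6.0, 7.0],
    [2.0, 1.0, 9.0],
    [3.0, 8.0, 10.0],
])

normalized = quantile_normalize(expr)

# Ordering of the samples by the first marker, before and after normalization.
raw_order = np.argsort(expr[0])
normalized_order = np.argsort(normalized[0])
print("raw ordering of samples:       ", raw_order)
print("normalized ordering of samples:", normalized_order)
```

In this toy matrix the first marker increases from the first sample to the third before normalization and decreases afterwards, because its within-sample rank falls while the overall intensity of the samples rises. A survival model that relates this marker to outcome would therefore see a reversed ordering of patients.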

IC Name
NATIONAL HUMAN GENOME RESEARCH INSTITUTE
  • Activity
    R21
  • Administering IC
    HG
  • Application Type
    1
  • Direct Cost Amount
    302000
  • Indirect Cost Amount
    194342
  • Total Cost
    496342
  • Sub Project Total Cost
  • ARRA Funded
    False
  • CFDA Code
    172
  • Ed Inst. Type
  • Funding ICs
    NHGRI:496342
  • Funding Mechanism
    Non-SBIR/STTR RPGs
  • Study Section
    BMRD
  • Study Section Name
    Biostatistical Methods and Research Design Study Section
  • Organization Name
    SLOAN-KETTERING INST CAN RESEARCH
  • Organization Department
  • Organization DUNS
    064931884
  • Organization City
    NEW YORK
  • Organization State
    NY
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    100656007
  • Organization District
    UNITED STATES