Multi-study Genomic Data Analysis

Information

  • NSF Award
  • Award Id
    1041698
  • Award Effective Date
    9/1/2009
  • Award Expiration Date
    10/31/2011
  • Award Amount
    $212,427.00
  • Award Instrument
    Continuing grant

Multi-study Genomic Data Analysis

This project's goal is to develop innovative statistical approaches to multi-study genomic data analysis. Specific targets include generalization of meta-analysis tools used in medicine and the social sciences to the genomics context, metrics for evaluating the reproducibility of expression measurements across platforms in the absence of a gold standard, approaches for deriving and validating common expression scales across platforms, and a novel reformulation of the combination problem based on constructing "coexpression matrices" in which an element represents the coexpression of a subset of genes in a given study. The project includes software implementation, application to a set of representative genomic analyses, and development of a public-domain support website.

Genomics studies measure simultaneously the activity of a large portion of the thousands of genes in a biological system. These studies have given great impetus to the life sciences in the past decade and changed the way in which biology, medicine, and biotechnology make progress. A large number and variety of genomics studies are accruing. Because of the cost and difficulty of acquiring biological samples, especially in medicine, the majority of genomic investigations are carried out using a limited number of samples and focus on highly specific problems. This scenario poses two important questions for the genomics community. First, given the wide variety of genomic technologies and protocols, there is concern about the reproducibility of genomic findings across technologies and laboratories. How can one systematically use the large body of genomic information available to assess reproducibility? Second, given the large, but fragmented and heterogeneous, set of studies that are accruing, there is concern about the ability of the scientific community to efficiently integrate the resulting knowledge. How can one analyze genomics data across studies, across technologies, and across related biological systems? This project's overall goal is to address these two questions by developing data analysis tools for comparison and integration of genomic information across studies, measurement technologies, and biological systems. Today, multi-study genomic analyses are rare, despite the wide availability of genomic data in the public domain. The premise underlying this proposal is that this is due in large part to the lack of specific, systematic, and rigorous statistical approaches and the associated software tools. This project aims to provide such tools and therefore, if the investigator's premise is correct, will promote a more extensive, more efficient, and more rigorous use of the vast resources made available by the massive investment in genomic studies.
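
As a rough illustration of the kind of meta-analysis tool the abstract mentions, the sketch below pools one gene's effect estimates across studies using fixed-effect inverse-variance weighting. This is a standard textbook technique and is not taken from the project's own methods or software; the function name `pool_fixed_effect` and the example numbers are assumptions introduced only for illustration.

```python
import math


def pool_fixed_effect(effects, variances):
    """Pool per-study effect estimates for one gene by inverse-variance weighting.

    Hypothetical helper for illustration; not part of the project's software.
    """
    if len(effects) != len(variances) or not effects:
        raise ValueError("need one variance per effect, and at least one study")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    # Each study is weighted by the reciprocal of its sampling variance.
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    # Standard error of the pooled estimate under the fixed-effect model.
    return pooled, math.sqrt(1.0 / total)


# Made-up log fold changes for a single gene measured in three studies.
estimate, std_err = pool_fixed_effect([0.8, 1.2, 1.0], [0.04, 0.09, 0.05])
print(f"pooled effect = {estimate:.3f}, standard error = {std_err:.3f}")
```

A fixed-effect model assumes every study estimates the same underlying effect. When studies differ in platform or protocol, as the abstract anticipates, a random-effects model that allows between-study variation may be the more appropriate choice.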

  • Program Officer
    Junping Wang
  • Min Amd Letter Date
    1/25/2011
  • Max Amd Letter Date
    1/25/2011
  • ARRA Amount

Institutions

  • Name
    Dana-Farber Cancer Institute
  • City
    Boston
  • State
    MA
  • Country
    United States
  • Address
    Office of Grants and Contracts
  • Postal Code
    02215-5450
  • Phone Number
    (617) 632-3940

Investigators

  • First Name
    Giovanni
  • Last Name
    Parmigiani
  • Email Address
    gp@jimmy.harvard.edu
  • Start Date
    1/25/2011