Computational Methods for Enhancing Privacy in Biomedical Data Sharing

Information

  • Research Project
  • ApplicationId
    10260457
  • Core Project Number
    DP5OD029574
  • Full Project Number
    5DP5OD029574-02
  • Serial Number
    029574
  • FOA Number
    RFA-RM-19-008
  • Sub Project Id
  • Project Start Date
    9/10/2020
  • Project End Date
    8/31/2025
  • Program Officer Name
    MILLER, BECKY
  • Budget Start Date
    9/1/2021
  • Budget End Date
    8/31/2022
  • Fiscal Year
    2021
  • Support Year
    02
  • Suffix
  • Award Notice Date
    8/31/2021

Project Summary

Data sharing is essential to modern biomedical data science. Access to large amounts of genomic and clinical data can help us better understand human genetics and its impact on health and disease. However, the sensitive nature of biomedical information presents a key bottleneck in data sharing and collection efforts, limiting the utility of these data for science. The goal of this project is to leverage recent advances in cryptography and information theory to develop computational frameworks for privacy-preserving sharing and analysis of biomedical data. We will draw upon our recent success in developing secure pipelines for collaborative biomedical analyses to address the pressing need to share sensitive data securely and at scale.

Practical adoption of existing privacy-preserving techniques in biomedicine has so far been limited by two major pitfalls, which this project aims to overcome with new technical advances. First, emerging cryptographic data sharing frameworks promise to enable collaborative analysis pipelines that securely combine data across multiple institutions with theoretical privacy guarantees, but they are too costly to support the complex, large-scale computations required in biomedical analyses. In this project, we will build upon recent advances in cryptography (e.g., secure distributed computation, pseudorandom correlation, zero-knowledge proofs) to significantly enhance the scalability and security of cryptographic biomedical data sharing pipelines. Second, existing approaches that locally transform data to protect sensitive information before sharing (e.g., de-identification techniques) either offer insufficient protection or require excessive perturbation to ensure privacy. We will draw upon recent tools from information theory to develop local privacy protection methods that achieve better utility-privacy tradeoffs on a range of biomedical data, including genomes, transcriptomes, and medical images, by directly exploiting the latent correlation structure of the data.

To promote the use of our privacy techniques, we will create production-grade software implementations of our tools and release them publicly. We will also actively participate in international standard-setting organizations in genomics (e.g., GA4GH and ICDA) to incorporate our insights into community guidelines for biomedical privacy. Successful completion of these aims will result in computational methods and software tools that open the door to secure sharing and analysis of massive sets of sensitive genomic and clinical data. Our long-term goal is to broadly enable data sharing and collaboration efforts in biomedicine, thus empowering researchers to better understand the molecular basis of human health and to drive the translation of new biological insights to the clinic.

  • IC Name
    OFFICE OF THE DIRECTOR, NATIONAL INSTITUTES OF HEALTH
  • Activity
    DP5
  • Administering IC
    OD
  • Application Type
    5
  • Direct Cost Amount
    250000
  • Indirect Cost Amount
    142800
  • Total Cost
    392800
  • Sub Project Total Cost
  • ARRA Funded
    False
  • CFDA Code
    310
  • Ed Inst. Type
  • Funding ICs
    NIDCR: 1; OD: 392799
  • Funding Mechanism
    Non-SBIR/STTR RPGs
  • Study Section
    ZRG1
  • Study Section Name
    Special Emphasis Panel
  • Organization Name
    BROAD INSTITUTE, INC.
  • Organization Department
  • Organization DUNS
    623544785
  • Organization City
    CAMBRIDGE
  • Organization State
    MA
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    02142-1027
  • Organization District
    UNITED STATES