Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs

Information

  • Award Id
    2402816
  • Award Effective Date
    5/1/2024
  • Award Expiration Date
    4/30/2028
  • Award Amount
    $400,000.00
  • Award Instrument
    Standard Grant

Graph-structured data captures intricate interactions between diverse agents and is widespread in scientific and engineering applications such as communication theory and computer science, medical research, computational biology, and social sciences. In many scenarios, graph information is sensitive and has to be kept private. It also often requires updates to accommodate changes in permissions, which leads to the need to retrain sophisticated large-scale machine learning models from the ground up. To simultaneously ensure that the data is kept private and easily removable without complete relearning, and that its utility for inference and prediction remains uncompromised, innovative and efficient privacy-preserving machine learning algorithms for graph data are essential. In addition to establishing a framework for novel graph-learning method development, the project will provide unique cross-disciplinary training opportunities for students in biological, physics, and financial graph data analysis; broaden the participation of women and other under-represented groups in STEM research via targeted recruiting and specialized student exchange programs; and, in the process, establish new collaborations among various machine learning, data acquisition, and modeling centers and institutes housed at the participating institutions.

This project aims to address fundamental challenges in designing privacy-preserving and efficiently updatable graph neural network models by leveraging interdisciplinary techniques from machine learning, data security, information theory, theoretical computer science, and statistics. The main difficulties are that (i) the graph attributes and topology are heterogeneous, yet highly correlated, data types; (ii) privatization reduces utility; and (iii) inference attacks that aim to determine how much information is leaking from sub-optimally privatized graph learners are generally unreliable. To resolve these issues, the team will devise novel non-uniform privatization protocols that trade accuracy for varied degrees of privacy protection; implement provably efficient methods to remove graph information from graph neural network models without retraining; and, in the process, implement a new cohort of membership inference approaches that can accurately measure information retention and leakage of machine learning models.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

  • Program Officer
    Alfred Hero, ahero@nsf.gov, 7032920000
  • Min Amd Letter Date
    3/29/2024
  • Max Amd Letter Date
    3/29/2024
  • ARRA Amount

Institutions

  • Name
    Georgia Tech Research Corporation
  • City
    ATLANTA
  • State
    GA
  • Country
    United States
  • Address
    926 DALNEY ST NW
  • Postal Code
    30318-6395
  • Phone Number
    4048944819

Investigators

  • First Name
    Pan
  • Last Name
    Li
  • Email Address
    panli@gatech.edu
  • Start Date
    3/29/2024 12:00:00 AM

Program Element

  • Text
    Comm & Information Foundations
  • Code
    7797

Program Reference

  • Text
    Machine Learning Theory
  • Text
    MEDIUM PROJECT
  • Code
    7924