CAREER: The Intersection of Spatial Statistics and Differential Privacy

Information

  • NSF Award
    2427447
Owner
  • Award Id
    2427447
  • Award Effective Date
    10/1/2023
  • Award Expiration Date
    4/30/2025
  • Award Amount
    $ 133,795.00
  • Award Instrument
    Continuing Grant

This CAREER award will develop methods for generating high-quality, spatially referenced public-use data while addressing data confidentiality concerns. Access to high-quality public-use data is critical for many research disciplines. However, analyses of fine-scale geographic regions with small population sizes (e.g., census tracts) often yield statistically unreliable inference. Small areas also may contain few study participants, which increases the risk of disclosing sensitive information about a participant, such as an individual's disease or employment status. This project will create a unifying framework between the formal privacy literature and the spatial statistics literature that gives equal weight to privacy considerations and the utility of the resulting data. The results of this research will be of value both to academic researchers and to staff at the Federal statistical agencies. The investigator will collaborate with researchers at the Centers for Disease Control and Prevention and the National Center for Health Statistics. The investigator will develop workshops and short courses on spatial statistics and data privacy for staff at the Federal statistical agencies. The project also will create undergraduate research opportunities in Bayesian inference and statistical computing and provide educational opportunities related to spatial statistics and data privacy.

This project will develop Bayesian statistical methods for generating spatially referenced synthetic data that achieve or exceed the privacy protections currently implemented by U.S. Federal statistical agencies. Small area estimation methods from the spatial statistics literature provide a framework to leverage complex dependencies in the data to improve the precision of an estimate. Emerging methods from the data privacy literature may be used to mask or otherwise conceal information from these areas to protect the privacy guarantees made to the data subjects in exchange for their participation. Taken together, these two approaches present an analytic tension between providing accurate and reliable local estimates and the need to obscure detailed linkage between small area estimates and the data subjects residing therein. This project will tackle the following issues. First, the project will devise a statistical framework for producing massive, differentially private public-use data repositories composed of spatially referenced synthetic aggregate count data. A key aspect of this work will be to strike a balance between computational efficiency and data utility. Second, the project will establish criteria for synthetic data from a broad class of spatial models to satisfy formal privacy protections. The result of this work will be methods that provide substantial gains in utility and help combine the tasks of data analysis and the generation of synthetic data to avoid redundancies.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

  • Program Officer
    Cheryl Eavey, ceavey@nsf.gov, 7032927269
  • Min Amd Letter Date
    4/11/2024
  • Max Amd Letter Date
    4/11/2024
  • ARRA Amount

Institutions

  • Name
    University of Minnesota-Twin Cities
  • City
    MINNEAPOLIS
  • State
    MN
  • Country
    United States
  • Address
    200 OAK ST SE
  • Postal Code
    554552009
  • Phone Number
    6126245599

Investigators

  • First Name
    Harrison
  • Last Name
    Quick
  • Email Address
    hsq23@drexel.edu
  • Start Date
    4/11/2024 12:00:00 AM

Program Element

  • Text
    Methodology, Measuremt & Stats
  • Code
    133300
  • Text
    SCIENCE RESOURCES STATISTICS
  • Code
    880000

Program Reference

  • Text
    CAREER-Faculty Erly Career Dev
  • Code
    1045
  • Text
    UNDERGRADUATE EDUCATION
  • Code
    9178