ATD: Development of Statistical Methods for Detection and Characterization of Latent Subpopulations of Classes

Information

  • NSF Award
  • 2428037
Owner
  • Award Id
    2428037
  • Award Effective Date
    9/1/2024
  • Award Expiration Date
    8/31/2026
  • Award Amount
    $ 350,796.00
  • Award Instrument
    Standard Grant

When a large data set is divided into many categories with only a few examples in each, traditional statistical and machine learning methods struggle. Problems of this kind are known as few-shot or one-shot learning. Handling them better can help in areas such as classification, forensic source identification, and large-scale hypothesis testing. Current methods such as linear discriminant analysis are too rigid and do not adapt well to these complex data sets. Quadratic discriminant analysis is more flexible but unstable, because each category has too few samples relative to the complexity of the model. A promising solution is to use models that share parameters across multiple categories, making them more stable and effective. The goal of this research is to create a range of models and algorithms that can better handle few-shot learning problems. The investigators will develop methods with desirable statistical properties that facilitate probabilistic conclusions, with a focus on applications in forensic source identification and geotemporal intelligence. Implementing these well-studied and trustworthy algorithms in forensic statistics will lead to an unbiased and fair assessment of the value of forensic evidence, whether used to support the intelligence community or in the criminal justice system. This will help avoid miscarriages of justice, which are widely reported, especially for minority populations. In the near future, the developed methods will be used to identify the sources of illicit drugs and contribute to the disruption of the illicit economy in collaboration with the South Dakota Governor’s Center. The research will be integrated into classrooms, and the results will be presented at several conferences and appear in peer-reviewed publications. Additionally, the project results will be implemented in open-source software packages, and user interfaces will be developed to make the results of this research available to other researchers and practitioners.

Within the realm of probabilistic few-shot learning, the project will establish theoretical guarantees for, and characterize the behavior of, methods that use parameter pooling. The project consists of three main tasks. The first is to develop parameter pooling methods that allow stable estimation of second-order moments shared among classes with few observations. These models will be developed assuming Gaussian distributions with a finite number of shared covariance matrices; later, transformation mixtures will be used to account for skewness and heavy tails. The second task addresses the problem for spatiotemporal data. It is motivated by keystroke dynamics and satellite image data, where the within-class independence assumption is relaxed to incorporate information about dependence in space and time. The third task is to develop a general framework that allows the pooling of parameters constrained by various parsimonious structures and to study the asymptotic properties of the resulting parameter estimates. Expected outcomes include thoroughly developed methodologies and algorithms to address the subpopulation of classes and sampling problems, resulting in stable, trustworthy, and explainable models.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
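
The stability argument in the abstract can be illustrated with a small simulation. The sketch below is not the investigators' method; it only contrasts the two textbook extremes the abstract mentions, namely one covariance matrix per class (as in quadratic discriminant analysis) and a single covariance matrix pooled over all classes (as in linear discriminant analysis). The function names, the simulated data, and the sizes (50 classes, 3 observations each, 5 features) are assumptions chosen for illustration.

```python
import numpy as np

def class_covariances(X, y):
    """Separate sample covariance for each class (QDA-style estimate)."""
    return {c: np.cov(X[y == c], rowvar=False) for c in np.unique(y)}

def pooled_covariance(X, y):
    """One covariance shared by all classes (LDA-style pooled estimate)."""
    classes = np.unique(y)
    p = X.shape[1]
    scatter = np.zeros((p, p))
    for c in classes:
        Xc = X[y == c]
        centered = Xc - Xc.mean(axis=0)  # remove the class mean
        scatter += centered.T @ centered  # accumulate within-class scatter
    # Divide by total observations minus number of estimated class means.
    return scatter / (len(y) - len(classes))

# Simulated few-shot setting: many classes, few observations per class.
rng = np.random.default_rng(0)
n_classes, n_per_class, p = 50, 3, 5
true_cov = np.diag([1.0, 2.0, 3.0, 4.0, 5.0])  # hypothetical shared covariance
means = rng.normal(scale=10.0, size=(n_classes, p))
y = np.repeat(np.arange(n_classes), n_per_class)
X = means[y] + rng.multivariate_normal(np.zeros(p), true_cov, size=len(y))

per_class = class_covariances(X, y)
pooled = pooled_covariance(X, y)

# With 3 observations in 5 dimensions, each per-class estimate has rank
# at most 2, so it cannot be inverted; the pooled estimate is full rank.
per_class_ranks = [int(np.linalg.matrix_rank(S)) for S in per_class.values()]
pooled_rank = int(np.linalg.matrix_rank(pooled))
print("largest per-class rank:", max(per_class_ranks))
print("pooled rank:", pooled_rank)
```

With fewer observations per class than features, every per-class estimate is singular, while the pooled estimate borrows strength from all classes. The project's first task sits between these extremes: a finite number of covariance matrices shared among groups of classes rather than one matrix for all classes or one per class.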

  • Program Officer
    Jun Zhu, jzhu@nsf.gov, 7032924551
  • Min Amd Letter Date
    8/14/2024
  • Max Amd Letter Date
    8/14/2024
  • ARRA Amount

Institutions

  • Name
    South Dakota State University
  • City
    BROOKINGS
  • State
    SD
  • Country
    United States
  • Address
    940 ADMINISTRATION LN
  • Postal Code
    57007-0001
  • Phone Number
    6056886696

Investigators

  • First Name
    Yana
  • Last Name
    Melnykov
  • Email Address
    Ymelnykov@ua.edu
  • Start Date
    8/14/2024 12:00:00 AM
  • First Name
    Paul
  • Last Name
    May
  • Email Address
    paul.may@sdsmt.edu
  • Start Date
    8/14/2024 12:00:00 AM
  • First Name
    Semhar
  • Last Name
    Michael
  • Email Address
    Semhar.Michael@sdstate.edu
  • Start Date
    8/14/2024 12:00:00 AM
  • First Name
    Christopher
  • Last Name
    Saunders
  • Email Address
    christopher.saunders@sdstate.edu
  • Start Date
    8/14/2024 12:00:00 AM

Program Element

  • Text
    ATD-Algorithms for Threat Dete
  • Text
    OFFICE OF MULTIDISCIPLINARY AC
  • Code
    125300

Program Reference

  • Text
    ALGORITHMS IN THREAT DETECTION
  • Code
    6877
  • Text
    EXP PROG TO STIM COMP RES
  • Code
    9150