ATD: Development of Statistical Methods for Detection and Characterization of Latent Subpopulations of Classes

Information

NSF Award
2428037

Owner

Dakota State University

Award Id
2428037
Award Effective Date
9/1/2024
Award Expiration Date
8/31/2026
Award Amount
$ 350,796.00
Award Instrument
Standard Grant

Information

ATD: Development of Statistical Methods for Detection and Characterization of Latent Subpopulations of Classes

When dealing with large sets of data that are divided into many categories but have only a few examples in each category, traditional statistics and machine learning methods struggle. These classes of problems are known as few- or one-shot learning. Improving how we handle these types of problems can help in various areas like classification, identifying forensic evidence sources, and testing large-scale hypotheses. Current methods, like linear discriminant analysis, are too rigid and don’t adapt well to these complex data sets. Another method, quadratic discriminant analysis, is more flexible but unstable because there aren’t enough samples in each category compared to the complexity of the model. A promising solution is to use models that share parameters across multiple categories, making them more stable and effective. The goal of this research is to create a range of models and algorithms that can better handle few-shot learning problems. The investigators will develop methods with desirable statistical properties that facilitate probabilistic conclusions, with a focus on applications in forensic source identification and geotemporal intelligence. Implementing these well-studied and trustworthy algorithms in forensic statistics will lead to an unbiased and fair value of forensic evidence, whether used to support the intelligence community or in the criminal justice system. This will help avoid a miscarriage of justice, which is widely reported, especially for minority populations. In the near future, the developed methods will be used to identify the sources of illicit drugs and contribute to the disruption of the illicit economy in collaboration with the South Dakota Governor’s Center. The research will be integrated into classrooms, and the results will be presented at several conferences and appear in peer-reviewed publications. Additionally, the project results will be implemented in open-source software packages, and user interfaces will be developed to make the results of this research available to other researchers and practitioners.

Within the realm of probabilistic few-shot learning, the project will establish theoretical guarantees and behaviors on methods that use parameter pooling. The project consists of three main tasks. The first is to develop parameter pooling methods for allowing a stable estimation of second-order moments shared among classes with few observations. These models will be developed assuming Gaussian distributions with a finite number of shared covariance matrices, and later transformation mixtures will be used to account for skewness and heavy tails. The second task addresses the problem for spatiotemporal data, which is motivated by keystroke dynamics and satellite image data where the within-class independence assumption is relaxed to incorporate the information about space/time dependence. The third task is to develop a general framework that allows the pooling of parameters constrained by various parsimonious structures, and the asymptotic properties of the resulting parameter estimates are studied. Expected outcomes include thoroughly developed methodologies and algorithms to address the subpopulation of classes and sampling problems, resulting in stable, trustworthy, and explainable models.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
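
The contrast the abstract draws between per-class covariance estimation (as in quadratic discriminant analysis) and covariance shared across classes can be illustrated with a small sketch. The Python below is a minimal illustration under stated assumptions, not the investigators' method: the toy data, the function names, and the restriction to two features are all hypothetical. It shows only the simplest form of pooling, a single covariance matrix shared by every class, whereas the project proposes a finite number of shared matrices.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Matrix = List[List[float]]

# Hypothetical toy data: three classes, two observations each, two features.
CLASSES: Dict[str, List[Point]] = {
    "a": [(0.0, 0.0), (2.0, 1.0)],
    "b": [(5.0, 5.0), (6.0, 3.0)],
    "c": [(9.0, 1.0), (9.0, 4.0)],
}


def scatter(points: List[Point]) -> Matrix:
    """Sum of outer products of deviations from the class mean."""
    n = len(points)
    mean = (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = (p[0] - mean[0], p[1] - mean[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def class_covariance(points: List[Point]) -> Matrix:
    """Per-class sample covariance (the QDA-style estimate)."""
    dof = len(points) - 1
    return [[v / dof for v in row] for row in scatter(points)]


def pooled_covariance(classes: Dict[str, List[Point]]) -> Matrix:
    """One covariance shared by all classes: summed within-class scatter
    divided by the summed within-class degrees of freedom."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    dof = 0
    for points in classes.values():
        s = scatter(points)
        for i in range(2):
            for j in range(2):
                total[i][j] += s[i][j]
        dof += len(points) - 1
    return [[v / dof for v in row] for row in total]


def det2(m: Matrix) -> float:
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


if __name__ == "__main__":
    for name, points in CLASSES.items():
        print(name, "per-class determinant:", det2(class_covariance(points)))
    print("pooled determinant:", det2(pooled_covariance(CLASSES)))
```

With two observations per class, each per-class covariance matrix has rank one and cannot be inverted, which is the instability the abstract attributes to quadratic discriminant analysis. Pooling the within-class scatter across the three classes yields an invertible estimate, at the cost of assuming every class shares the same covariance.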

Program Officer
Jun Zhu, jzhu@nsf.gov, 7032924551
Min Amd Letter Date
8/14/2024
Max Amd Letter Date
8/14/2024
ARRA Amount

Institutions

Name
South Dakota State University
City
BROOKINGS
State
SD
Country
United States
Address
940 ADMINISTRATION LN
Postal Code
570070001
Phone Number
6056886696

Investigators

First Name
Yana
Last Name
Melnykov
Email Address
Ymelnykov@ua.edu
Start Date
8/14/2024 12:00:00 AM

First Name
Paul
Last Name
May
Email Address
paul.may@sdsmt.edu
Start Date
8/14/2024 12:00:00 AM

First Name
Semhar
Last Name
Michael
Email Address
Semhar.Michael@sdstate.edu
Start Date
8/14/2024 12:00:00 AM

First Name
Christopher
Last Name
Saunders
Email Address
christopher.saunders@sdstate.edu
Start Date
8/14/2024 12:00:00 AM

Program Element

Text
ATD-Algorithms for Threat Dete

Text
OFFICE OF MULTIDISCIPLINARY AC
Code
125300

Program Reference

Text
ALGORITHMS IN THREAT DETECTION
Code
6877

Text
EXP PROG TO STIM COMP RES
Code
9150
