LEAPS-MPS: Importance, Significance, and Fairness in Large-Scale Estimation and Testing of Heteroscedastic Data

Information

NSF Award
2316746

Owner

SAN FRANCISCO STATE UNIVERSITY

Award Id
2316746
Award Effective Date
9/1/2023
Award Expiration Date
8/31/2025
Award Amount
$249,999.00
Award Instrument
Standard Grant

Information

The proposed research addresses two issues of increasing concern in the scientific community. One is that data has become so cheap and easy to collect that it is all too often difficult or impossible to reproduce the data or even entire studies. The second is that algorithms often encode biases with unintended consequences. The so-called reproducibility crisis and algorithmic fairness problems each have multiple causes, but one that they share is grounded in the unequal variability, or heteroscedasticity, of most large-scale data. Ameliorating these issues in modern data problems requires two fields of statistics for large-scale data: multiple testing and large-scale estimation. Advances in both fields are combined, as they have not been before, to create powerful, theoretically backed statistical methodology. The economic, moral, and social impacts affect STEM research, public policy, and technology, as well as entertainment, finance, and sports, fields from which we analyze data in this proposal. The broader impacts and wide application of this project also provide underrepresented minority (URM) students with opportunities to produce scholarly work; bolster research infrastructure at a primarily undergraduate institution (PUI) and Hispanic-Serving Institution (HSI); train URM members in professional development to help remove barriers to their participation in technological innovation; and invite the industry sector into the academic research feedback loop.

This project focuses on developing new inferential procedures for estimating, selecting, and ranking heteroscedastic data. When heteroscedastic data needs to be selected and ranked, as is often the case with large-scale data, replicability and fairness issues arise together. The goal of this proposal is to resolve the dual issues of estimation and significance by developing new inferential procedures for ranking and selecting heteroscedastic data. The three objectives are to: (i) select ordered hypotheses based on the importance of data results, (ii) create more efficient and accurate estimators for large-scale data, and (iii) create rankings to address inequities. The first objective will be accomplished by reversing the typical multiple hypothesis testing steps of ordering and then selecting hypotheses, which will better balance estimated effect size and variability; the second objective, by developing a convex optimization technique; and the third objective, by creating a ranking algorithm that accounts for protected characteristics. New algorithms will be developed to implement each of these methods and publicly shared via GitHub, DataLore, and R packages. Extensive mentorship and professional development opportunities will be offered to undergraduate and graduate students who participate in these research activities.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Program Officer
Jun Zhu, jzhu@nsf.gov, 7032924551
Min Amd Letter Date
7/31/2023
Max Amd Letter Date
7/31/2023
ARRA Amount

Institutions

Name
San Francisco State University
City
SAN FRANCISCO
State
CA
Country
United States
Address
1600 HOLLOWAY AVE BUILDING NAD R
Postal Code
941321722
Phone Number
4153387090

Investigators

First Name
Luella
Last Name
Fu
Email Address
luella@sfsu.edu
Start Date
7/31/2023 12:00:00 AM

Program Element

Text
LEAPS-MPS
