Collaborative Research: Multi-source Learning: Data-driven Algorithms, Optimality Theory, and Applications

Information

NSF Award
2413106

Owner

UNIVERSITY OF PENNSYLVANIA, A CORP. OF PA

Award Id
2413106
Award Effective Date
8/1/2024
Award Expiration Date
7/31/2027
Award Amount
$250,000.00
Award Instrument
Standard Grant

Information

Massive and diverse high-dimensional datasets are now routinely collected in a wide range of scientific fields. In many instances, in addition to the primary data from the target study, other datasets with a similar structure have been collected from different populations or under different environments. Incorporating such related auxiliary data is desirable for making more accurate and informative decisions. For example, the availability of large-scale genomic and proteomic data promises a better understanding of disease processes and suggests the possibility of more accurate prediction of disease outcomes. Efficiently extracting meaningful information from multiple such datasets has become a critical problem in medical research, and it presents unprecedented opportunities to statisticians and data scientists. The project's goal is to devise a collection of advanced statistical tools for efficient integrative analysis of electronic health record (EHR) and genomics data.

The PIs aim to address the pressing need for novel statistical methods that perform efficient integrative analysis combining multiple data sources. The PIs plan to develop new methodologies and optimality theory for efficiently integrating large-scale data from multiple sources and to address critical biomedical problems using the newly developed methods. There are three major research goals. The first is to develop data-driven algorithms with theoretical optimality guarantees for transfer learning in various settings, including estimation and inference for high-dimensional covariance matrices, covariance functions for functional data, instrumental variable regression, and conformal inference. The second is to develop a class of adversarially robust algorithms that efficiently integrate the heterogeneous information in multi-source data, including constructing guided adversarially robust learning procedures and conducting group significance tests for high-dimensional and nonparametric models. The third is to address urgent needs and new challenges in biomedical studies through the analysis of EHR data and integrative genomics, using the newly developed methods for transfer learning and adversarially robust learning.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Program Officer
Tapabrata Maiti, tmaiti@nsf.gov, 7032925307
Min Amd Letter Date
7/17/2024
Max Amd Letter Date
7/17/2024
ARRA Amount

Institutions

Name
University of Pennsylvania
City
PHILADELPHIA
State
PA
Country
United States
Address
3451 WALNUT ST STE 440A
Postal Code
191046205
Phone Number
2158987293

Investigators

First Name
T. Tony
Last Name
Cai
Email Address
tcai@wharton.upenn.edu
Start Date
7/17/2024 12:00:00 AM

Program Element

Text
STATISTICS
Code
126900
