In many modern big data applications, data is often collected from diverse sources. To improve prediction or clustering accuracy, multi-task learning and transfer learning techniques are widely employed to exploit possible similarities across tasks. For example, reliable inference procedures are crucial for applications such as the Federal Reserve Economic Data (FRED) database, where the goal is to identify latent factors and the individual compositions of significant macroeconomic variables associated with typical macroeconomic indicators. Similarly, for databases such as the Alzheimer's Disease Neuroimaging Initiative (ADNI) and the National Alzheimer's Coordinating Center (NACC), early detection of dementia, such as Alzheimer's disease, and identification of its risk factors are vital. In these contexts, different economic indicators or patients in different hospitals may share certain similarities. However, it remains largely unclear how to develop flexible inference procedures for high-dimensional multi-task learning and transfer learning. This research project could have significant impact across diverse fields, including economics, business, engineering, and medicine. The new theoretical and methodological developments will build rigorous statistical foundations for high-dimensional multi-task and transfer learning inference under practical conditions, and provide interpretable, flexible, and robust tools for researchers and practitioners in data science applications. The project also provides research training opportunities for graduate students.

Inference for high-dimensional multi-task and transfer learning, in both supervised and unsupervised settings, is a challenging and important topic in statistical machine learning and data science.
In this project, the PIs address these fundamental challenges by conducting systematic studies to develop novel methodologies, algorithms, theories, and applications through three interrelated aims. First, the PIs plan to investigate high-dimensional manifold-based multi-task learning inference, which involves learning a shared representation of multiple tasks that lie on a low-dimensional manifold. Here, the project will develop robust and scalable algorithms that can handle high-dimensional data and incorporate manifold constraints, providing much-needed inference tools for latent singular value decomposition (SVD) structures. Second, the PIs plan to tackle high-dimensional robust multi-task clustering inference, where the goal is to simultaneously cluster data from multiple tasks in the presence of outliers and noise. Here, the project will develop novel robust multi-task clustering algorithms that can handle high-dimensional data and outlier tasks. Third, the PIs plan to investigate high-dimensional adaptive and robust multi-task learning and transfer learning from similar linear representations, which involves learning a shared representation of multiple tasks that share similar linear structures. Here, the project will develop adaptive and robust algorithms that can handle high-dimensional data, adapt to different noise levels, and transfer knowledge across similar linear representations.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.