A sustainable society needs a diverse and competent workforce equipped with the skills to extract patterns from large and complex datasets, turn them into actionable insights, and develop solutions to real-world problems relevant to society. With the advancement of generative artificial intelligence (AI), machines are increasingly capable of writing code according to specific instructions and performing specific data analysis tasks. It is therefore becoming increasingly important for students to develop higher-order problem-solving skills, which are less likely to be replaced by AI. Thus, a scalable, innovative solution is urgently needed to help graduate students develop their critical-thinking skills. This National Science Foundation Innovations in Graduate Education (IGE) award to the University of Maryland Baltimore County (UMBC) and the University of Central Florida (UCF) will augment, refine, and pilot Caselet, a scalable case-based practice tool, by leveraging AI, machine learning, and data analytics approaches, including large language models (LLMs). This project will support the development of data science problem-solving skills in both the cognitive domain (the knowledge and skills themselves) and the metacognitive domain (the skills for learning how to learn). The project will address the rapidly changing landscape of education in computing and data-intensive courses in terms of both “what we teach” and “how we teach.”

This project will augment and refine the Caselet practice tool in three dimensions to support scalable deployment and adoption through an iterative design-and-test framework. The research team will enhance the Caselet tool with new features, to be piloted and tested by up to 1,000 students drawn from three graduate programs over a three-year period at the University of Maryland Baltimore County, a minority-serving institution. The project will focus on three tasks to address scale-up challenges.
The first task will explore using large language models (LLMs) to scale up Caselet authoring. This approach aims to expedite the authoring process by identifying appropriate case studies and drafting relevant questions and explanations before submitting them for expert review. The second task aims to scale up the assessment of cognitive skills in data science problem-solving, using machine learning models to track students’ skill mastery at a refined level of precision. The third task will focus on the scalable assessment of metacognitive competencies related to data science problem-solving through multichannel, multimodal data collection in controlled lab environments and in course-based and self-paced settings. Along with technology development, the research team will conduct pilot studies among UMBC graduate students from three different programs in various educational contexts, including online or in-person and instructor-led or self-paced formats. In addition to the research findings, a guidebook will be created to support the adoption of Caselet by students and instructors from other educational institutions. The findings, the pedagogically enhanced Caselet, and the associated data science problems stemming from this project will be disseminated to graduate-level faculty across UMBC, UCF, and other partnering institutions, as well as at scholarly conferences.

The Innovations in Graduate Education (IGE) program is focused on research in graduate education. The goals of IGE are to pilot, test, and validate innovative approaches to graduate education and to generate the knowledge required to move these approaches into the broader community.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.