The amount of data that computing systems must analyze has been increasing drastically, to exascale (i.e., billions of gigabytes) and beyond. Meanwhile, owing to the boom in artificial intelligence (AI), especially deep neural networks (DNNs), there is a need for high-performance, efficient, fast, and adaptive AI-based big data processing systems. However, existing computing solutions do not sufficiently meet those requirements because of the power wall in silicon-based semiconductor devices, the memory wall in the traditional von Neumann computing architecture, and the extremely computation- and memory-intensive nature of DNN-based AI algorithms. This project brings together an interdisciplinary group of researchers, with expertise spanning materials science, device fabrication, integrated circuit design, computer architecture, and AI algorithms, to undertake innovative device-circuit-algorithm co-design for developing an AI Processing-In-Memory (AI-PIM) system that could leverage emerging non-volatile magnetic memory technology to implement efficient AI data processing, as well as situation-aware on-chip continual learning. This project aims to significantly improve the energy efficiency of AI data processing, targeting 100X higher efficiency than that of state-of-the-art Graphics Processing Units (GPUs). The project will greatly benefit various application areas, such as autonomous driving, robotics, personalized cognitive speech, and smart connected health. This project will also involve education and workforce development activities, including K-12 STEM outreach, undergraduate and graduate training, semiconductor curriculum development, semiconductor industry internship mentoring, cleanroom fabrication internships, and advanced integrated circuit design courses. It will also encourage broader participation of women and underrepresented minorities in the microelectronics and semiconductor chip industry.
<br/><br/>This project will advance knowledge through cross-layer research on emerging Spin-Orbit Torque Magnetic Random Access Memory (SOT-MRAM), spanning materials, devices, circuits, architecture, and AI algorithms, organized into three interwoven thrusts. Thrust 1 will explore unconventional spins in SOT materials, e.g., MnPd3, and novel device geometry to fabricate a new design of 2-terminal SOT-MRAM, which simultaneously delivers unlimited endurance, nanosecond programming time, very high cell density, deterministic programming without an external magnetic field, zero leakage, and non-volatility. Leveraging the developed 2-terminal SOT-MRAM, Thrust 2 will design and tape out an AI Processing-in-Memory (PIM) chip to implement fully digital ‘in-memory sparse multiplication-and-accumulation (MAC)’ operations that support both forward and backward computations of neural networks. Following a co-design methodology, Thrust 3 will first investigate automated network architecture search methods to construct the AI model best suited to a given situation while respecting the constraints of the AI-PIM system. This thrust will further develop novel PIM-friendly, compute- and memory-efficient, situation-aware continual learning algorithms that could minimize the complexity of power-hungry on-chip weight updates (i.e., memory writes) while learning new situation- and user-specific data.<br/><br/>This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.