The rapid growth of artificial intelligence (AI), especially Large Language Models (LLMs) like ChatGPT, is revolutionizing the way people work, learn, communicate, and access healthcare. Due to the complexity of AI workloads and the limitations of today's semiconductor hardware, computing with powerful LLMs incurs enormous energy costs and generates a significant carbon footprint. This Future of Semiconductors project will develop a holistic computing solution that provides reliable, private, and energy-efficient computing capabilities directly on end users' devices, democratizing access to advanced AI. Thinning oxide semiconductors (a class of mature semiconductor materials used in display panels) down to near-atomic thickness through nanofabrication can unlock breakthrough performance and new technological advantages. By synergizing these material advancements with novel chip designs and algorithms, the new computing platform developed through this research will run AI workloads more efficiently than existing silicon-based platforms. <br/><br/>This project will engage in device-architecture-algorithm co-design to develop a complementary metal-oxide semiconductor (CMOS) compatible, indium-oxide-based computing platform, enabling reliable and energy-efficient edge inference with transformer models (e.g., LLMs). The research will advance scientific understanding of oxide semiconductors by experimenting with high-k/ferroelectric gate stacks in a unified atomic-layer-deposition (ALD) material system. By harnessing unique features of nearly atomically thin indium-oxide transistors, novel compute primitives based on ferroelectric oxide-semiconductor field-effect transistors (Fe-OSFETs) will be developed, natively supporting scalable precision. New sparsity and quantization algorithms will be created for transformers to fully exploit the search-in-memory and compute-in-memory functionalities native to the indium-oxide compute primitives.
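To make the quantization-plus-compute-in-memory idea concrete, the sketch below shows a minimal, illustrative model of how a low-precision matrix-vector product might be computed the way an in-memory crossbar would: weights and activations are uniformly quantized to integers, accumulation happens in the integer domain, and the result is rescaled. All function names, bit widths, and numerics here are hypothetical illustrations, not part of the proposed Fe-OSFET design.

```python
import numpy as np

def quantize(x, bits):
    # Uniform symmetric quantization to signed integers (illustrative only).
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax
    return np.round(x / scale).astype(np.int32), scale

def cim_matvec(W, x, w_bits=4, x_bits=8):
    # Quantize weights and activations, accumulate in the integer domain
    # (as an analog compute-in-memory array effectively does), then rescale.
    Wq, w_scale = quantize(W, w_bits)
    xq, x_scale = quantize(x, x_bits)
    acc = Wq @ xq                      # integer accumulation
    return acc * (w_scale * x_scale)   # dequantize back to real values

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))
x = rng.standard_normal(16)
approx = cim_matvec(W, x)
exact = W @ x
# The low-precision result tracks the exact product within quantization error.
```

The sparsity algorithms mentioned above would further skip zero (or pruned) entries of `W` so that the in-memory array performs fewer analog accumulations per inference.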
In the proposed neural computing platform, the coexistence of a non-volatile mode for long-term data reuse and a high-endurance charge-based mode allows effective co-optimization of both dynamic and static layers in transformers for energy-efficient, reliability-aware inference. The development of an open-source process development kit (PDK) through this project will further enhance device-to-system co-design and prototyping capabilities with emerging OSFETs. Ultimately, a system demonstrator for an LLM edge inference engine will advance the frontiers of edge machine intelligence, supporting a broad range of societally impactful applications. Finally, the project team is committed to educating a diverse group of students with vertically integrated mindsets and skillsets to form the next-generation workforce, fostering a sustainable and equitable future for society through joint innovations in AI and semiconductor technologies.<br/><br/>This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.