The future of science-enabled discovery depends critically on the speed of high-performance simulations conducted at large scales and high resolutions. Current approaches lack the necessary performance and scale, and so cannot keep up with the backlog of problems in areas of paramount societal consequence, such as climate science and the spread of pandemics. A principal reason for this shortfall is the rising cost of moving huge amounts of simulation data between supercomputer memories and processors, a cost that increasingly dwarfs the time spent on actual computation. Developing techniques that reduce the volume of data exchanged without sacrificing accuracy is therefore key to future progress in computation-enabled research. Such data reduction is even more important in the emerging area of Scientific Machine Learning (SciML), where simulations are assisted by surrogate models based on artificial intelligence (AI) and where data exchange needs are often much higher. The investigators’ combined expertise in scientific machine learning, data compression, compilers, and program correctness will be central to achieving the project’s goal of fast and reliable AI-assisted scientific simulations. The project will establish new technologies that reduce data volume without sacrificing accuracy in both high-performance computing and SciML. These technologies, in turn, translate directly into societal benefits such as improved healthcare and safer environments. The project will broaden participation in this area through undergraduate research plans that reach out to students from groups underrepresented in computing.

This research project, entitled SciOPT, will rely principally on data compression to reduce the amount of data moved: simulation data will be compressed before transmission and decompressed upon receipt, before computations are applied.
The investigators will also pursue the potentially even more impactful approach of applying computations directly to the compressed data, without decompressing it first. SciOPT will evaluate both approaches in the context of challenging SciML applications that are currently bottlenecked by data exchanges. To achieve higher degrees of automation and productivity, SciOPT will develop efficient compiler-based methods to manage the layout and locality of compressed data. Moreover, it will automatically generate high-speed compression algorithms that are tailored to the data. To ensure the veracity of the results produced by these compressed-data simulations, SciOPT will include rigorous correctness-checking methods at multiple stages of the overall simulation workflow.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.