Data in science and engineering are often managed by software libraries that store data and data-describing metadata together. These “self-describing” file formats and their software libraries, such as HDF5 and netCDF, are standardized, machine-independent, and portable, and they support flexible and performant organization of large amounts of data. As a result, numerous scientific, engineering, and industry applications use these formats for storing and analyzing their data. With many critical fields using self-describing formats, these data and their corresponding data management software libraries have become critical cyberinfrastructure that must be secured to perform accurate and reproducible science. Unfortunately, existing data management software libraries were designed decades ago, before cybersecurity was a major concern, and there has never been targeted testing and evaluation of the trustworthiness, integrity, and resilience of these libraries. This project is exploring strategies to integrate both well-known and advanced security algorithms into prominent data management libraries. The research performed in this project will be a foundational step towards building next-generation secure data management cyberinfrastructure for the rapidly changing landscape of science and AI, where security, privacy, and trustworthiness are critically required. <br/><br/>This project will apply comprehensive testing, evaluation, issue identification, hardening, and validation to correct security deficiencies in self-describing file formats and libraries. The specific R&D tasks include: (1) assessing and fixing file format vulnerabilities, (2) protecting data access libraries, (3) exploring security solutions for metadata and data, and (4) constructing a security framework, called S2-D2. The S2-D2 project will have a direct impact on securing data in a variety of scientific domains.
Additionally, bolstering the HDF5 library with robust security will make it more usable in applications that require increased security, such as the financial and medical fields.<br/><br/>This award by the NSF Office of Advanced Cyberinfrastructure is jointly supported by the NSF National Discovery Cloud for Climate (NDC-C) initiative.<br/><br/>This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.