A surging human population and climate change have placed unprecedented stress on agriculture. Meeting the world’s evolving food needs necessitates the development of resilient, nutritious, and bountiful crops. Collaborative and innovative approaches to plant science, combined with outreach initiatives for STEM education and recruitment, are needed to meet these challenges. This project will expand the Plant Metabolic Network (PMN), a collection of databases used to study the metabolism of plants and how their genomes relate to metabolism. This expansion will grow the databases in the PMN from 126 to over 1000 species. An expanded PMN will give researchers access to a rich, evolving suite of tools to accelerate fundamental and translational plant science research and applications to meet real-world demands. The expansion of the PMN will also create a valuable educational tool for secondary and post-secondary education. Educational resources about plant genetics, biochemistry, and metabolism will be produced and distributed, including video tutorials on using PMN and lesson plans demonstrating how plants produce various chemicals useful to humans. At the university level, a pilot Course-based Undergraduate Research Experience (CURE) will be established to provide undergraduate students with cutting-edge, experiential opportunities in genome and metabolism research. An advisory board of schoolteachers will guide engagement efforts with middle- and high-school audiences, focusing on student bodies predominantly composed of underrepresented minorities.
This project offers ample opportunities for increased collaboration and advancement across the plant sciences, as well as initiatives to inspire and guide students who identify with underrepresented groups to pursue STEM education.

To expand the PMN databases, this project will retool the existing PMN pipeline to operate at scale, packaging it into a containerized application and automating much of the process of running it. The retooled pipeline will be used to rapidly create new PMN databases from 1000 plant genomes taken from GenBank, 10KP, and other sources. This project aims to expand the contents and improve the accuracy of the databases by experimentally determining and integrating enzyme biochemical data to enhance computational prediction of plant metabolism. As a project component, an enzyme consortium will be established to conduct prediction-guided enzyme function studies at scale, targeting enzyme classes that perform poorly in computational annotations, and to integrate the resulting biochemical data into PMN. This initial consortium will focus on terpene synthases, cytochrome P450s, and aminoacyltransferases. To effectively incorporate these data into PMN, improvements will be made to the PMN enzyme prediction software to better identify differentiating features among enzyme family members, including enzyme classes with few existing members. The scope of information available in PMN will be expanded to include new cell-type- and tissue-specific databases. Workshops will be organized at major plant biology conferences to introduce users to PMN’s capabilities. Workshops at phytochemistry and plant metabolism conferences will focus on distributing the data generated by the consortium.
These conferences offer an opportunity to recruit scientists to expand the consortium beyond the enzyme families targeted in this proposal.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.