Rising atmospheric carbon dioxide (CO2) is a major cause of global climate change. One of the most effective nature-based solutions to this challenge lies right under our feet: the soil. Globally, soil contains more carbon than the Earth's atmosphere and vegetation combined. National and international initiatives are in place to increase soil organic carbon (SOC) content and storage capacity to combat climate change. The multifaceted benefits of SOC storage can also ensure food and nutritional security for the Earth's human population and help meet many of the United Nations Sustainable Development Goals. However, it is not clear how long soil can provide these ecosystem services to our global community. This is partly because SOC data available from various sources and predictions based on computer models do not agree with each other. This project aims to provide a robust estimate of SOC for the conterminous United States (CONUS), which can help identify potential reasons for inconsistency across different models and ultimately help policymakers make informed decisions about climate change. It will also offer research training opportunities for students as well as workshops and training courses for teachers.<br/> <br/>For the U.S., there is a unique opportunity to use spatial clustering approaches to reduce uncertainties in SOC dynamics and constrain models at the continental scale by upscaling site-based measurements across the National Ecological Observatory Network (NEON). Emergent ecosystem properties will be evaluated by using multivariate quantitative methods to extrapolate or interpolate point-scale SOC measurements from a spatial constellation of NEON terrestrial sites to CONUS. Data collected across NEON terrestrial sites will be coupled with an array of multivariate geographic clustering algorithms (k-means clustering, ensemble clustering) and machine-learning approaches (convolutional neural network, artificial neural network).
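The clustering-based upscaling described above can be illustrated with a minimal sketch. It is not the project's implementation: the function names (`standardize`, `kmeans`, `upscale_soc`), the two environmental variables, and all numbers are hypothetical toy values chosen for illustration, not NEON data. The sketch groups map cells by k-means on standardized environmental variables, assigns each cell the mean SOC of the sites that fall in its cluster, and reports the distance to the nearest site in variable space as a simple proxy for how well that cell is represented.

```python
import math
import random


def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def dist(a, b):
    """Euclidean distance between two points in variable space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda j: dist(point, centroids[j]))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, [nearest(p, centroids) for p in points]


def upscale_soc(grid_env, site_env, site_soc, k):
    """Per-cell SOC estimate and distance to the nearest site."""
    scaled = standardize(grid_env + site_env)
    grid, sites = scaled[:len(grid_env)], scaled[len(grid_env):]
    centroids, grid_labels = kmeans(grid, k)
    site_labels = [nearest(s, centroids) for s in sites]
    estimates, distances = [], []
    for cell, label in zip(grid, grid_labels):
        in_cluster = [soc for soc, lab in zip(site_soc, site_labels)
                      if lab == label]
        # None marks a cluster with no site: a candidate for a new site
        estimates.append(sum(in_cluster) / len(in_cluster)
                         if in_cluster else None)
        distances.append(min(dist(cell, s) for s in sites))
    return estimates, distances


# Hypothetical toy inputs: [mean annual temperature, annual precipitation]
grid_env = [[5, 300], [6, 320], [25, 1500], [24, 1450]]
site_env = [[5.5, 310], [24.5, 1480]]
site_soc = [12.0, 4.0]  # illustrative SOC values at the two sites
estimates, distances = upscale_soc(grid_env, site_env, site_soc, k=2)
print(estimates, distances)
```

Cells whose cluster contains no site get no estimate, and cells with a large nearest-site distance are poorly represented; in a scheme like this, both would be natural places to consider additional measurement sites.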
These quantitative analyses will also enable uncertainty quantification of the spatial representativeness of SOC and help identify potential future relocatable (or mobile) sites for additional ground-truth measurements of variables related to terrestrial C cycle processes. Existing NEON biogeochemistry, microbial, hydrology, sensor, and remote sensing data products will be leveraged to produce quantitative SOC regional maps for CONUS using similar combinations of climatic, ecological, environmental, geochemical, and microbial variables. The algorithms developed with NEON data will be validated with other point-scale datasets such as SoDaH (SOils DAta Harmonization database) and ISCN (International Soil Carbon Network). The spatial mismatch of the derived representativeness-based SOC regional maps for CONUS will be evaluated against existing gridded databases: SoilGrids, Harmonized World Soil Database (HWSD), Northern Circumpolar Soil Carbon Database (NCSCD), and gridded U.S. Soil Survey Geographic Database (gSSURGO). NEON-based SOC regional maps for CONUS will also be integrated with downscaled historical SOC predictions from participating models of the Coupled Model Intercomparison Project Phase 6 (CMIP6). The robust (and scalable) estimate of SOC for CONUS will enable the diagnosis of terrestrial C cycle processes using historical CMIP6 model runs. Broader impacts will involve training opportunities at the undergraduate and graduate levels, and workshops and training courses to teach data analysis workflow methods.<br/><br/>This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.