Understanding the structure and function of plants, especially of roots as they change over time, is essential to learning how plants adapt to changing climates and to ensuring sustainable production. Advances in this understanding are benefiting from huge increases in the diversity and sheer amount of data available from sensors. For example, plant scientists use a sensor technology called minirhizotron (MR) systems to study root development. These systems capture color images of plant roots through cameras placed into the soil in clear tubes. Preparing these images for use in scientific research requires enormous amounts of time and labor because the data must be interpreted by humans. Machine learning algorithms can automate some of the preparation, but it is not yet known how to design a system in which humans and machine learning algorithms work best together. This project develops methods and tools to support plant scientists (of varying backgrounds and expertise levels, including youth and other novices) working in tandem with machine learning to better utilize MR systems. Outcomes will include advances in both the interactive machine learning experience for human image labelers and the understanding of the relationship between participating in labeling and self-identification as a scientist. The project will have broad implications for sensor-based data science in plant science and beyond, addressing multiple issues of global importance. The U.S. continues to experience a shortage of scientists-in-training, and the project will advance and evaluate efforts to draw more students into science. Science education programs in this project, which involve youth in designing the human-machine system, can help youth from marginalized backgrounds learn how science works and help them see themselves as future scientists. 
<br/><br/>This project provides the tools needed to significantly reduce the analysis bottleneck of the plant root data generated by MR systems and, in the long term, enable larger-scale MR-based studies that may have significant global importance. The focus of this project is to develop interactive machine learning tools that support plant scientists of varying expertise levels, using a human-centered design approach. To accomplish these goals, this project triangulates findings from mixed methods, including laboratory studies of experienced labelers, participatory co-design workshops with stakeholders from diverse backgrounds, and summer participatory science experiences with Florida 4-H partner programs. The laboratory studies contribute new understanding of how human labeler behavior (such as annotation quantity and quality) affects machine learning algorithm performance, and vice versa. The participatory co-design workshops focus on designing interactive machine learning data visualization and labeling tools grounded in a human-centered understanding of end users of varying expertise, including scientists, emerging scientists, and non-scientists, both youth and adults. Finally, the summer science experiences inform how to scale this approach to broader domains and to user populations beyond those traditionally engaged in STEM as youth. This project will facilitate higher throughput in the analysis of MR systems data in plant science, enabling future impacts on the productivity, sustainability, and resilience of agricultural and natural ecosystems. It will also impact the throughput of human-centered machine learning in science in general. Methods from this project will also generalize to other similar labeling domains, such as human anatomy (blood vessels, neurons) or hydrology (river deltas, coastlines). 
Involving marginalized youth through partnerships with 4-H also grows the nation’s prospective STEM workforce.<br/><br/>This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.