The genome sequences of many crops have been determined over the last two decades, providing the blueprints to discover genes that underlie key agricultural traits. However, a major challenge is identifying the DNA differences between related varieties of the same crop that are responsible for the subtle trait variation plant breeders exploit to improve productivity. A major contributor to this trait variation is 'genome structural variation', in which pieces of DNA are deleted, inserted, or rearranged, resulting in changes in gene expression. This project will focus on how structural variation contributed to the domestication and breeding of tomatoes. A related goal is to expand and develop new molecular tools to create structural variation for crop improvement. This project will improve US agriculture by providing new knowledge and tools to efficiently and predictably enhance crop productivity. A major part of the project will be training young scientists in fundamental principles of plant genome research that can be applied to agriculture. This knowledge will also be shared through outreach programs in inner-city New York schools that do not have access to research opportunities. Project personnel will develop hands-on teaching activities that highlight the importance of plant genomics and new genome editing technologies for improving crops and meeting the agricultural needs of the 21st century.

Limited knowledge of the extent and diversity of structural variation in plant genomes is hindering the ability to link genes to important crop phenotypes. This project will unite new long-read sequencing technologies, computational biology, developmental and quantitative genetics, and genome editing to elucidate and manipulate structural variation (SV) at a scale never before achieved for a major crop. Tomato provides a powerful system due to its relatively small, high-quality reference genome and the availability of resequenced genomes.
By applying SV-detection algorithms to existing short-read Illumina sequencing data from hundreds of accessions, more than 40 genomes that together capture the majority of predicted SV diversity will be selected to establish new reference genomes using the latest long-read sequencing technology (PacBio and 10x Genomics). From these data, a compendium of validated SVs will be generated and integrated with ongoing genome-wide association studies. Significant gene-associated SVs, including those affecting gene activity as measured by genome-wide transcript profiling, will be characterized using CRISPR/Cas9 gene editing and quantitative phenotypic analyses, focusing on reproductive traits that drive crop productivity. In parallel, CRISPR/Cas9 gene editing will be used to generate a collection of SV mutations in known yield and fruit quality genes in two related wild Solanaceae with agricultural potential, with the goals of achieving major steps toward domestication and enabling comparative developmental genetics studies. This project will greatly expand our knowledge of genomic diversity in tomato and provide a road map for dissecting SVs in other crops, where such knowledge can be exploited to improve productivity.