PI: Christopher D. Town, J. Craig Venter Institute, Inc.<br/><br/>Co-PI: David C. Schwartz, University of Wisconsin<br/><br/>Medicago truncatula, a close relative of alfalfa, is the preeminent model for legume genomics and has been the target of an international sequencing initiative for the past five years. When the international sequencing efforts wind down in the fall of 2008, there will be ~280 Mb of high-quality DNA sequence distributed across the plant's eight chromosomes, in blocks ranging in size from a few hundred thousand bases to between five and ten million bases, with 10 to 30 gaps in each of the euchromatic arms. Based upon the projected capture rate of expressed sequence tags, the euchromatic, gene-rich portion of the genome will be around 80% complete. There will also be ~200 Mb of unsequenced DNA in the centromeres that is gene-poor and was not targeted for sequencing. This project will integrate, manage and enhance our understanding of both the structure and annotation of the Medicago genome, and comprises three goals:<br/><br/>1. Creation of the best possible sequence-based representation of the M. truncatula genome. This will involve construction of an optical map, which is a physical scaffold derived by methods totally independent of the DNA sequencing process. The map will allow the contiguous stretches of DNA sequence (contigs) produced by the sequencing centers to be placed in the correct order and orientation, and will allow the sizes of the gaps between these contigs, as well as the locations and sizes of the gene-poor centromeric regions, to be determined.<br/><br/>2. Capturing and localizing as much as possible of the remaining gene-containing regions of the M. truncatula genome in the most cost-efficient fashion, thus providing a close-to-complete inventory of its gene content.<br/><br/>3. 
Supporting and maintaining the IMGAG (International Medicago Genome Annotation Group) annotation pipeline, which represents a consensus annotation process for the entire Medicago (and legume) community, and enriching the annotation of the M. truncatula genome by overlaying other data types, including expression data (both microarray and next-generation sequencing), proteomic data, locations of transposon- and fast-neutron-induced mutations, links to genetic resources, etc.<br/><br/>With the sequencing projects terminating, members of the IMGAG consortium (including the JCVI group in the US) will have few resources to devote to continued annotation. This project will maintain and update both the sequence content and the annotation of the Medicago genome, and will provide a critical, central and stable resource for legume researchers for years to come. The project will also host a community annotation portal that will allow researchers to enrich the largely automated, computer-generated annotation with more refined structural and functional details based upon their own knowledge and experience. All information generated by this project will be freely accessible through the project web site maintained at the J. Craig Venter Institute (http://www.jcvi.org/cms/research/projects/medicago-truncatula-database/overview/).<br/>The project will be monitored by an advisory committee composed of members of the community-elected International Medicago Steering Committee.<br/><br/>At the educational level, both participating institutions will host visiting students in their laboratories for summer internships. In addition, annual workshops will be held to provide education in genome annotation and analysis to graduate students, postdoctoral fellows and interested faculty in the legume community.