UniProt: A centralized protein sequence and function resource

Information

  • Research Project
  • ApplicationId
    10372430
  • Core Project Number
    U24HG007822
  • Full Project Number
    3U24HG007822-07S2
  • Serial Number
    007822
  • FOA Number
    PA-20-272
  • Sub Project Id
  • Project Start Date
    9/8/2014
  • Project End Date
    9/30/2021
  • Program Officer Name
    PILLAI, AJAY
  • Budget Start Date
    6/1/2021
  • Budget End Date
    9/30/2021
  • Fiscal Year
    2021
  • Support Year
    07
  • Suffix
    S2
  • Award Notice Date
    5/13/2021

PROJECT SUMMARY/ABSTRACT

The mission of the Universal Protein Resource (UniProt) is to support biomedical research by providing a freely available, stable, comprehensive, richly and accurately annotated protein sequence knowledgebase (www.uniprot.org). UniProt integrates, interprets and standardizes data from a multitude of sources to achieve the most comprehensive catalog of protein sequences and functional annotation available to date, providing information from hundreds of thousands of publications for tens of millions of proteins from tens of thousands of species. The activities proposed here will increase the utility of UniProt for biomedical research and precision medicine. The expert-curated functional information provided by UniProt is widely acknowledged to be of exceptional quality and is continuously updated as new knowledge becomes available. Our first aim will be to continue to curate the scientific literature to ensure UniProt remains up to date. We will also work with the text-mining community to continue to improve curation efficiency. The 0.5 million curated records are complemented by 80 million records for uncharacterized proteins. To ensure their usefulness for the community, we will continue to develop our automatic annotation systems to annotate these proteins based on the knowledge of characterized proteins. Our third aim is to connect to and integrate protein data from resources around the world to make UniProt the global hub of protein information. The integration of clinical variation data as well as metabolomics information with proteins will help to support the multi-omics approaches of precision medicine. Our fourth aim describes the production of the resource to ensure that our data is freely available according to the FAIR principles. UniProt forms a foundation for hundreds of life sciences data resources. Continuous software development is needed to ensure delivery of this key component of the life science infrastructure. The UniProt website is used by hundreds of thousands of scientists every month. The final aim describes how we will enable this community to make the best use of UniProt through user training, outreach and improved user interfaces, driven by user testing.

  • IC Name
    NATIONAL HUMAN GENOME RESEARCH INSTITUTE
  • Activity
    U24
  • Administering IC
    HG
  • Application Type
    3
  • Direct Cost Amount
    1857774
  • Indirect Cost Amount
    59053
  • Total Cost
    1916827
  • Sub Project Total Cost
  • ARRA Funded
    False
  • CFDA Code
    172
  • Ed Inst. Type
  • Funding ICs
    NHGRI:1916827
  • Funding Mechanism
    OTHER RESEARCH-RELATED
  • Study Section
  • Study Section Name
  • Organization Name
    EUROPEAN MOLECULAR BIOLOGY LABORATORY
  • Organization Department
  • Organization DUNS
    321691735
  • Organization City
    HEIDELBERG
  • Organization State
  • Organization Country
    GERMANY
  • Organization Zip Code
    69117
  • Organization District
    GERMANY