GENCODE Resource Informatics

Information

Research Project
9980962

ApplicationId
9980962
Core Project Number
U41HG007234
Full Project Number
5U41HG007234-08
Serial Number
007234
FOA Number
PAR-14-191
Sub Project Id
7901

Project Start Date
4/1/2013
Project End Date
5/31/2021
Program Officer Name
Budget Start Date
6/1/2020
Budget End Date
5/31/2021
Fiscal Year
2020
Support Year
08
Suffix
Award Notice Date
9/11/2020

Organizations

European Molecular Biology Laboratory

Information

GENCODE Resource Informatics

RESOURCE INFORMATICS: PROJECT SUMMARY

The creation, advancement and maintenance of the GENCODE resource require both adherence to and optimization of defined processes, so that genome annotation created now and in the future is always of the same or a better standard than what has already been created. The GENCODE resource must also be attuned to the new technologies and opportunities that arise as the field of genomics evolves.

A primary objective of the GENCODE resource is to ensure quality control (QC) and data validation of annotations. Ensembl will compare the GENCODE gene set to other gene sets (e.g. UniProt) to check for missing genes or transcripts; CNIO will validate the coding genes; the CNIO/CNIC proteomics pipeline will validate the gene models; and CNIO/CNIC will perform manual verification for QC of proteomics data. Project stability will be ensured through a well-maintained computational infrastructure, adequate QC processes that will ensure the highest possible quality, and regular releases of freely available annotation in high-value formats.

The annotation curation for human and mouse will be completed: in particular, the existing human partial transcript models will be extended to full length, the human lncRNA annotation will be expanded, and the initial full pass of the mouse annotation will be completed. GENCODE will incorporate individual genome representation and population data represented by available human variation data at both the sequence level (e.g. 1000 Genomes) and the transcriptomic level (e.g. GTEx), and by the 16 mouse strain genomes produced by the Mouse Genomes Project led by the WTSI. Data from individuals and populations will be annotated. A personal genome resource will be developed, which will produce an accurate representation of an individual's gene set.

Two pilot projects will help to define the most effective way to support future GENCODE annotations. The first pilot project will use GENCODE's experience in developing population reference genome graphs to pilot a scalable and potentially universal approach to population-based genome annotation. The second pilot project will focus on connecting regulatory regions to regulated genes. GENCODE will enhance the current annotation of genes with their regulatory elements so that the annotation is dependent on tissue and cell type.

The demand for manual annotation of transcripts across strains and species may outstrip GENCODE's ability to provide such services via existing mechanisms; therefore, a system to enable the submission of annotated data will be developed. The described measures will ensure that GENCODE in 2020 will be significantly more valuable for research and clinical applications in genomics than it is today.

IC Name

NATIONAL HUMAN GENOME RESEARCH INSTITUTE

Activity
U41
Administering IC
HG
Application Type
5

Direct Cost Amount
1030867
Indirect Cost Amount
38449
Total Cost
Sub Project Total Cost
1069316

ARRA Funded
False
CFDA Code
Ed Inst. Type
Funding ICs
NHGRI:1069316
Funding Mechanism
RESEARCH CENTERS
Study Section
ZHG1
Study Section Name
Special Emphasis Panel

Organization Name
EUROPEAN MOLECULAR BIOLOGY LABORATORY
Organization Department
Organization DUNS
321691735
Organization City
HEIDELBERG
Organization State
Organization Country
GERMANY
Organization Zip Code
69117
Organization District
GERMANY
