GENCODE: comprehensive genome annotation for human and mouse

Information

Research Project
9980954

ApplicationId
9980954
Core Project Number
U41HG007234
Full Project Number
5U41HG007234-08
Serial Number
007234
FOA Number
PAR-14-191
Sub Project Id

Project Start Date
4/1/2013
Project End Date
5/31/2021
Program Officer Name
GILCHRIST, DANIEL A
Budget Start Date
6/1/2020
Budget End Date
5/31/2021
Fiscal Year
2020
Support Year
08
Suffix
Award Notice Date
9/11/2020

Organizations

European Molecular Biology Laboratory

Information

OVERALL - PROJECT SUMMARY
The objective of the GENCODE consortium is to create a foundational reference genome annotation, in which all gene features in the human and mouse genomes are identified and classified with high accuracy based on biological evidence, and then to release these annotations for the benefit of biomedical research and genome interpretation. GENCODE aims for a better understanding of a 'normal' human genome; using genome sequences of the most commonly used mouse strains will facilitate the most effective use of these key models for large-scale knockout analysis and disease-specific research. To produce regular annotation releases of high accuracy, GENCODE will continue to follow its well-established and conservative research design, supplemented by targeted investigations into the value of new technologies, new data and new sources of evidence. GENCODE focuses on protein-coding and non-coding loci, including their alternatively spliced isoforms and pseudogenes. Over the course of this proposal GENCODE will follow major directions in genomics, including graph-based genome representations, long-read transcriptome sequencing, connecting genes and the associated regulatory regions that affect their transcription, and identifying genes that are not present on the current reference assembly. The GENCODE consortium has four fundamental components: (1) a comprehensive gene annotation pipeline leveraging manual annotation; (2) an integrated approach to pseudogene identification and classification; (3) a set of computational methods to evaluate and enhance gene annotation; and (4) complementary experimental pipelines for validation and functional annotation.
More specifically, in the next four years GENCODE aims to (1) extend the human and mouse GENCODE gene sets to as near completion as possible given current experimental technology; (2) deploy population-based genome annotation to ensure that any transcript isoform expressed in an individual human will be present in the reference annotation set; (3) extend the gene annotation to include core regulatory regions and tissue-specific enhancers from selected datasets; and (4) distribute GENCODE annotations and engage with community annotation efforts. Current popular distribution channels for GENCODE data, including the GENCODE web site and the Ensembl and UCSC Genome Browsers, will be maintained. Finally, new mechanisms for prioritizing genes for manual annotation with community input will be established, with the long-term aim of establishing GENCODE as the standard annotation set for research and clinical genomics efforts.

IC Name

NATIONAL HUMAN GENOME RESEARCH INSTITUTE

Activity
U41
Administering IC
HG
Application Type
5

Direct Cost Amount
2143527
Indirect Cost Amount
83442
Total Cost
2226969
Sub Project Total Cost

ARRA Funded
False
CFDA Code
172
Ed Inst. Type
Funding ICs
NHGRI:2226969
Funding Mechanism
RESEARCH CENTERS
Study Section
ZHG1
Study Section Name
Special Emphasis Panel

Organization Name
EUROPEAN MOLECULAR BIOLOGY LABORATORY
Organization Department
Organization DUNS
321691735
Organization City
HEIDELBERG
Organization State
Organization Country
GERMANY
Organization Zip Code
69117
Organization District
GERMANY
