Computational framework for identifiable and phase-consistent allele-specific expression quantification

Information

  • Award Id
    2505285
  • Award Effective Date
    10/1/2024
  • Award Expiration Date
    8/31/2025
  • Award Amount
    $ 595,875.00
  • Award Instrument
    Standard Grant


Haplotype inference and allele-specific transcript expression quantification are two fundamental problems in genetics and genomics. Haplotype inference aligns maternal and paternal alleles of genetic variants along two diploid chromosomes, whereas allele-specific expression quantification obtains the expression levels of transcripts of maternal and paternal origins from RNA-seq reads. These two problems are coupled in that one can affect the accuracy of the other: accurate allele-specific expression quantification requires accurate haplotypes to map RNA-seq reads to, and the accuracy of haplotype inference can be enhanced by allele-specific RNA-seq reads. While existing works have considered these two problems separately, this project develops a computational framework to address them jointly in a single statistical framework to enhance the accuracy of both inferred haplotypes and allele-specific expression quantification. The computational methods to be developed in this research will advance various aspects of biological research that require accurate allele-specific expression estimates and haplotypes, including mapping allele-specific eQTLs, detecting imprinted genes, imputing untyped variants, finding signatures of natural selection, and detecting recombination events. The outcome of the research will be used in outreach activities in minority-serving institutions to recruit graduate students.

The project develops a computational framework for obtaining accurate allele-specific expression measurements and haplotypes from RNA-seq and genotype data. Two existing frameworks, one for transcript expression quantification and the other for haplotype inference (e.g., Beagle), are combined into a single framework, while keeping the computational efficiency of the original frameworks. Each of these two existing frameworks is modified to address two previously unmet challenges regarding allele-specific reads: for the RNA-seq quantification, the project develops a mathematically rigorous approach to obtaining identifiable allele-specific expression estimates at gene level, at transcript-set level, or at individual transcript level, whereas for haplotype inference, the project couples the model in Beagle with RNA-seq quantification methods of these investigators to jointly estimate identifiable allele-specific expression levels and haplotypes that are consistent with each other. The computational methods are benchmarked on allele-specific eQTL mapping, using genotypes and RNA-seq reads from human trios and LG/SM intercross mice with known haplotypes. The outcome of the research is available at http://www.cs.cmu.edu/~sssykim.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

  • Program Officer
    David Liberles, dliberle@nsf.gov, 7032920000
  • Min Amd Letter Date
    11/13/2024
  • Max Amd Letter Date
    11/13/2024
  • ARRA Amount

Institutions

  • Name
    University of Pittsburgh
  • City
    PITTSBURGH
  • State
    PA
  • Country
    United States
  • Address
    4200 FIFTH AVENUE
  • Postal Code
    152600001
  • Phone Number
    4126247400

Investigators

  • First Name
    Seyoung
  • Last Name
    Kim
  • Email Address
    sssykim@acm.org
  • Start Date
    11/13/2024 12:00:00 AM

Program Element

  • Text
    Innovation: Bioinformatics

Program Reference

  • Text
    ADVANCES IN BIO INFORMATICS
  • Code
    1165