Methods for Evolutionary Genomics Analysis

Information

Research Project
10405153

ApplicationId
10405153
Core Project Number
R35GM139540
Full Project Number
3R35GM139540-01S1
Serial Number
139540
FOA Number
PA-20-272
Sub Project Id

Project Start Date
2/1/2021
Project End Date
1/31/2026
Program Officer Name
JANES, DANIEL E
Budget Start Date
7/1/2021
Budget End Date
1/31/2022
Fiscal Year
2021
Support Year
01
Suffix
S1
Award Notice Date
8/31/2021

Organizations

TEMPLE UNIV OF THE COMMONWEALTH

Information

PROJECT SUMMARY: This administrative supplement request aims to develop a cloud-enabled, highly scalable version of the computational core of the Molecular Evolutionary Genetics Analysis software (MEGA-CC: www.megasoftware.net). The development of MEGA-CC is a significant component of the NIH-funded research project to develop machine learning methods and tools for comparative analysis of molecular sequences. With major advances in genome sequencing, researchers are assembling datasets containing large numbers of species, strains, genes, and genomic segments. Phylogenomic analyses of these data are essential to understanding the dynamics of evolutionary change in pathogens, humans, and species across the tree of life. Machine learning methods and software tools for phylogenomics are now necessary because the expanding size of phylogenomic datasets limits the practical utility of currently available methods and tools, which demand excessive computational time and memory. One component of the funded grant is implementing our new machine learning methods in the MEGA software suite (www.megasoftware.net), an extremely popular bioinformatics package (more than 20,000 peer-reviewed citations and 350,000 software downloads in 2020 alone). The MEGA software includes a large repertoire of tools for assembling sequence alignments, inferring evolutionary trees, estimating genetic distances and diversities, inferring ancestral sequences, computing timetrees, and testing selection. These analyses are now required in all research investigations and fields in which multiple DNA or RNA sequences are used. However, MEGA and its computational core (MEGA-CC) are not optimized for distribution and execution on cloud infrastructure and high-performance computing clusters.
This supplement to the funded grant will enable us to make MEGA cloud-ready (MEGA-CR) so that it can harness the scalability, elastic computing power, and easy software upgrades and maintenance enabled by cloud infrastructure. It will also make MEGA interoperable with existing and future cloud infrastructure. Additionally, this supplement will facilitate the practical use of the new machine learning methods in MEGA with big genomic data, thus addressing an imminent and fast-growing need of an increasingly large community of researchers using MEGA. MEGA-CR will increase the usability of MEGA for the scientific community analyzing very large datasets, for which the accessibility, cost-efficiency, and scalability of a cloud-ready tool are becoming crucial.

IC Name

NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES

Activity
R35
Administering IC
GM
Application Type
3

Direct Cost Amount
87480
Indirect Cost Amount
51176
Total Cost
138656
Sub Project Total Cost

ARRA Funded
False
CFDA Code
310
Ed Inst. Type
SCHOOLS OF ARTS AND SCIENCES
Funding ICs
OD:138656
Funding Mechanism
Non-SBIR/STTR RPGs
Study Section
Study Section Name

Organization Name
TEMPLE UNIV OF THE COMMONWEALTH
Organization Department
BIOLOGY
Organization DUNS
057123192
Organization City
PHILADELPHIA
Organization State
PA
Organization Country
UNITED STATES
Organization Zip Code
191226003
Organization District
UNITED STATES
