Collaborative Research: CAIG: Using Deep Learning to Learn about the Deep Sea: Application of AI to Elucidate Drivers of Global Biogeochemical Cycles

Information

  • NSF Award
  • 2425834
Owner
  • Award Id
    2425834
  • Award Effective Date
    9/15/2024
  • Award Expiration Date
    8/31/2027
  • Award Amount
    $ 541,276.00
  • Award Instrument
    Standard Grant

The deep sea is an epicenter of biogeochemical cycling that is globally important but poorly understood. Big data generated by emergent gene sequencing technology provides a new avenue to link genes with biological processes. In the deep sea, the vast majority of genes are unknown. This project will focus on methane seep systems. New microbial samples will be collected from methane seeps off the coast of Oregon and Washington. This research will employ a novel natural language processing artificial intelligence approach to predict what these unknown genes do. This will be a critical step toward quantifying oceanic ecosystem function based on genomics. The artificial intelligence models developed using these samples will be broadly applicable. They can provide a foundation to answer many questions across scientific fields ranging from ecology to human health. A tutorial for the models will be written and a workshop will be run to explain the techniques. Further, artists will be involved in the research and a documentary will be produced to share the results of the research.

This research will build two new artificial intelligence models that use gene sequence data to understand ecosystem processes, and apply them to methane seep habitats. A new model incorporating genes and ribosomal amplicon co-occurrence will code genes and classify them into pathways. In parallel, generative models with text and sequence protein representation will be developed. The models will identify putative genes responsible for each of the cycles identified, termed dl-genes. These two models will be applied to new samples collected from methane seeps offshore Oregon and Washington. Methane seep habitats are areas where methane is consumed by microbial activity; they also have strong redox gradients, leading to diverse methane and nitrogen cycling over a small spatial area. Both artificial intelligence models will be applied to these habitats, and the results will be used to empirically validate the dl-genes by testing whether they are transcribed when the associated geochemical process is observed. The main outcome will be a scalable artificial intelligence approach that will advance key questions in Earth system science. To broaden the use of the methods developed in this project to solve similar problems, a tutorial and workshop will help others learn and use the models. Further, the results of this work will include exhibits by artists involved in the research as well as a documentary about how artificial intelligence can harness big data to help advance the understanding of Earth systems.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

  • Program Officer
    Scott M. White, scwhite@nsf.gov, 7032928369
  • Min Amd Letter Date
    8/29/2024
  • Max Amd Letter Date
    8/29/2024
  • ARRA Amount

Institutions

  • Name
    Oregon State University
  • City
    CORVALLIS
  • State
    OR
  • Country
    United States
  • Address
    1500 SW JEFFERSON AVE
  • Postal Code
    973318655
  • Phone Number
    5417374933

Investigators

  • First Name
    Maude
  • Last Name
    David
  • Email Address
    maude.david@oregonstate.edu
  • Start Date
    8/29/2024 12:00:00 AM

Program Element

  • Text
    GEO CI - GEO Cyberinfrastrctre