NSF-SNSF: Generative Graph Models at Scale: Discrete Diffusion, Transferability and Requirements

Information

  • NSF Award
  • 2444713
  • Award Id
    2444713
  • Award Effective Date
    10/1/2024
  • Award Expiration Date
    9/30/2027
  • Award Amount
    $ 450,000.00
  • Award Instrument
    Standard Grant

Graphs are structures that describe relationships between objects, and they are pervasive in science and engineering. They are at the core of disciplines such as chemistry, communications, and transportation, and are of increasing importance in a broader set of domains that includes integrated circuits, robotics, health, and biology. The goal of this project is to extend generative artificial intelligence (AI) to the generation of graphs at scale. That is, it seeks to develop generative graph models capable of generating graphs with large numbers of nodes (objects) and correspondingly large numbers of edges (relationships). Scalability is a challenge that sets generative graph models apart from other generative models, such as those for language and images, because it is more difficult to learn relationships than it is to learn objects. To attain scalability, the project will advance the state of the art in three directions: (D1) Generative processes that focus on learning relationships, not objects. (D2) Generative processes that work independently of scale and can therefore be trained at small scale and transferred to larger scales. (D3) Generative processes that incorporate user-specified constraints in the generated graphs.

Research Directions (D1)-(D3) are addressed in this international collaborative proposal in three research thrusts. Thrust I builds discrete diffusion processes that progressively add or remove edges from a random graph. The use of discrete diffusion is intended to reduce the combinatorial complexity of exploring the graph space and stands in contrast to the Gaussian diffusion processes used in audio and image generative AI systems. Thrust II learns by transference: it trains generative models on small graphs that are later transferred to larger graphs. This reduces the computational complexity of training diffusion models on large graphs. To build generative models that work independently of scale, the project relies on graphon and manifold abstractions of graphs and on the corresponding abstract learning architectures, namely graphon and manifold neural networks. Thrust III incorporates requirements in generative models. This reduces the sample complexity of the search space by guiding the diffusion process towards graphs that satisfy user-specified constraints. The fundamental research is applied to the generation of molecular graphs that mimic molecules with known properties, as is needed in, e.g., AI-aided drug discovery. Overall, this project will develop novel theory and methods for effective, informed graph generation at scale. Advances in the use of AI for drug discovery are anticipated, and impacts in communications, robotics, circuit design, and health are likely. Further impact of this project will come from education activities related to the undergraduate AI major at the University of Pennsylvania. This research project will impact the major's introductory course for first-year students, in which students learn about machine learning architectures and training procedures. Labs in this course include simple examples of speech recognition, image classification, and recommendation systems that illustrate the role of learning architectures. Students also have labs on dynamical systems, reinforcement learning, generative diffusion models, and language models that illustrate the different ways in which learning architectures are trained.

This collaborative U.S.-Swiss project is supported by the U.S. National Science Foundation (NSF) and the Swiss National Science Foundation (SNSF), where NSF funds the U.S. investigator and SNSF funds the partners in Switzerland.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
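
As a purely illustrative aside, the idea in Thrust I of a discrete process that progressively adds or removes edges can be sketched as a forward noising chain on an adjacency matrix. The sketch below is not the project's method: the function name `forward_edge_flip`, the independent per-edge flip rule, and the parameter `flip_prob` are assumptions chosen for simplicity. A generative model would be trained to reverse such a chain, which is not shown here.

```python
import numpy as np

def forward_edge_flip(adj, num_steps, flip_prob, rng):
    """Hypothetical forward noising chain on an undirected simple graph.

    At every step each node pair is flipped independently with
    probability flip_prob: an existing edge is removed, a missing
    edge is added. Returns the list of adjacency matrices visited,
    starting with the input graph.
    """
    n = adj.shape[0]
    rows, cols = np.triu_indices(n, k=1)      # each unordered node pair once
    current = adj.astype(np.int64).copy()
    trajectory = [current.copy()]
    for _ in range(num_steps):
        flips = (rng.random(rows.size) < flip_prob).astype(np.int64)
        new_vals = current[rows, cols] ^ flips  # XOR toggles the chosen pairs
        current[rows, cols] = new_vals
        current[cols, rows] = new_vals          # keep the matrix symmetric
        trajectory.append(current.copy())
    return trajectory

# Example: noise a 5-node path graph for 10 steps.
rng = np.random.default_rng(0)
path = np.zeros((5, 5), dtype=np.int64)
for i in range(4):
    path[i, i + 1] = path[i + 1, i] = 1

traj = forward_edge_flip(path, num_steps=10, flip_prob=0.1, rng=rng)
print([int(a.sum()) // 2 for a in traj])  # edge count after each step
```

Under this assumed flip rule, repeated steps drive every node pair towards being an edge with probability 1/2, so the chain ends near a uniformly random graph regardless of the starting graph.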

  • Program Officer
    Anthony Kuh, akuh@nsf.gov, 7032924714
  • Min Amd Letter Date
    8/16/2024
  • Max Amd Letter Date
    8/16/2024
  • ARRA Amount

Institutions

  • Name
    University of Pennsylvania
  • City
    PHILADELPHIA
  • State
    PA
  • Country
    United States
  • Address
    3451 WALNUT ST STE 440A
  • Postal Code
    191046205
  • Phone Number
    2158987293

Investigators

  • First Name
    Alejandro
  • Last Name
    Ribeiro
  • Email Address
    aribeiro@seas.upenn.edu
  • Start Date
    8/16/2024 12:00:00 AM

Program Element

  • Text
    EPCN-Energy-Power-Ctrl-Netwrks
  • Code
    760700

Program Reference

  • Text
    International Partnerships
  • Text
    U.S. NSF-Swiss Resrch Corp
  • Text
    SWITZERLAND
  • Code
    5950
  • Text
    LEARNING & INTELLIGENT SYSTEMS
  • Code
    8888