Collaborative Research: RI: Medium: Understanding, Visualizing, and Attributing Multimodal Generative Models

Information

  • Award Id
    2403303
  • Award Effective Date
    9/15/2024
  • Award Expiration Date
    8/31/2028
  • Award Amount
    $400,000.00
  • Award Instrument
    Standard Grant

Abstract

Large-scale AI models that generate text and images are transforming visual synthesis and recognition tasks. These models can not only generate realistic images from text but also serve as backbone models for object recognition, tracking, and segmentation. Despite their capabilities, they are complex, challenging to interpret, and sometimes unpredictable. This unpredictability is problematic for safety-critical applications such as autonomous driving and healthcare. Moreover, these models can easily generate copyrighted content, produce harmful content, or amplify stereotypes. Understanding such complex models, with billions of parameters trained on billions of images, remains challenging. Key questions include why certain input prompts lead to failures or artifacts, how these models might amplify biases, and how the training data impacts output quality. This project seeks to provide a systematic, interpretable, rational basis for understanding and controlling the learned computations of multimodal generative models, with the potential to increase the accountable and safe use of state-of-the-art multimodal AI models and to mitigate potential harms. To disseminate the research effectively, the investigators will freely release all materials (code, models, and datasets); host tutorials, workshops, and courses to engage the research community; broaden student participation at the K-12, undergraduate, and graduate levels; and engage with policymakers to inform them of the latest technology and future trends.

The project aims to develop a new systematic framework for visualizing, understanding, and rewriting the learned computation of multimodal generative models, and to leverage this knowledge to trace how the training data influences the internal representation and, ultimately, the model outputs. The project focuses on three research thrusts. First, new research methodologies will be developed to visualize the internal mechanisms and hierarchical structures of pre-trained multimodal generative models, understand their roles at different stages of the generation process, and extract visual concepts and their relationships. Second, the project will explore model-editing algorithms that manipulate these discovered concepts and relationships to pinpoint and fix inconsistencies, failures, biases, and safety concerns in existing multimodal generative models. Finally, the investigators will develop new attribution methods that assess the influence of training images on generated results based on an analysis of internal representations, and will demonstrate several potential applications of the attribution algorithms.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
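The first thrust, visualizing internal mechanisms, typically starts by recording what a model computes layer by layer. Below is a minimal, hedged sketch of one standard way to do this with PyTorch forward hooks; `model` and `layer_names` are hypothetical placeholders, not code or names released by this project.

```python
# Sketch: capture intermediate activations of a pre-trained generative
# model via PyTorch forward hooks, for later visualization or concept
# mining. `model` and `layer_names` are hypothetical placeholders.
import torch

activations = {}

def make_hook(name):
    def hook(module, inputs, output):
        # Detach and move to CPU so stored tensors do not retain the
        # autograd graph or GPU memory.
        activations[name] = output.detach().cpu()
    return hook

def register_hooks(model: torch.nn.Module, layer_names):
    handles = []
    for name, module in model.named_modules():
        if name in layer_names:
            handles.append(module.register_forward_hook(make_hook(name)))
    return handles  # call h.remove() on each handle when finished
```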
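The second thrust, model editing, is often framed as updating a layer's weights so that a chosen internal "key" direction maps to a new "value", in the spirit of prior model-rewriting work. A minimal closed-form rank-one version is sketched below; the tensors `W`, `k`, and `v_star` are assumed inputs, and the project's actual algorithms may differ.

```python
import torch

def rank_one_edit(W: torch.Tensor, k: torch.Tensor, v_star: torch.Tensor) -> torch.Tensor:
    """Smallest Frobenius-norm change to W such that the edited weight
    maps key k to v_star, i.e. W' @ k == v_star.

    W: (out, in) weight matrix; k: (in,) key; v_star: (out,) target value.
    """
    residual = v_star - W @ k                      # error of the current weight on k
    return W + torch.outer(residual, k) / (k @ k)  # rank-one correction
```

Because the update is rank one along `k`, inputs orthogonal to `k` are mapped exactly as before, which is why such edits can fix a targeted behavior without retraining the whole model.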
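The third thrust, data attribution, asks which training images most influenced a generated result. The project proposes attribution based on internal representations; as a simpler point of reference, the sketch below ranks training images by feature similarity. The embeddings `generated_feat` and `train_feats` are assumed to come from some fixed feature extractor and are placeholders.

```python
import torch
import torch.nn.functional as F

def attribute_by_similarity(generated_feat: torch.Tensor,
                            train_feats: torch.Tensor,
                            top_k: int = 5):
    """Rank candidate training images by cosine similarity to a
    generated image in feature space.

    generated_feat: (d,) embedding of the generated image.
    train_feats: (N, d) embeddings of training images; top_k <= N.
    """
    sims = F.cosine_similarity(train_feats, generated_feat.unsqueeze(0), dim=1)
    scores, indices = sims.topk(top_k)
    return indices, scores  # indices of the most influential candidates
```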

  • Program Officer
    Jie Yang, jyang@nsf.gov, (703) 292-4768
  • Min Amd Letter Date
    9/3/2024
  • Max Amd Letter Date
    9/3/2024

Institutions

  • Name
    Carnegie-Mellon University
  • City
    Pittsburgh
  • State
    PA
  • Country
    United States
  • Address
    5000 Forbes Ave
  • Postal Code
    15213-3815
  • Phone Number
    (412) 268-8746

Investigators

  • First Name
    Jun-Yan
  • Last Name
    Zhu
  • Email Address
    junyanz@andrew.cmu.edu
  • Start Date
    9/3/2024

Program Element

  • Text
    Robust Intelligence
  • Code
    749500

Program Reference

  • Text
    ROBUST INTELLIGENCE
  • Code
    7495
  • Text
    MEDIUM PROJECT
  • Code
    7924