Collaborative Research: RI: Medium: Understanding, Visualizing, and Attributing Multimodal Generative Models

Information

  • Award Id
    2403303
  • Award Effective Date
    9/15/2024
  • Award Expiration Date
    8/31/2028
  • Award Amount
    $400,000.00
  • Award Instrument
    Standard Grant

Abstract

Large-scale AI models that generate text and images are transforming visual synthesis and recognition tasks. These models can not only generate realistic images from text but also serve as backbone models for object recognition, tracking, and segmentation. Despite their capabilities, they are complex, challenging to interpret, and sometimes unpredictable. This unpredictability is problematic for safety-critical applications such as autonomous driving and healthcare. Moreover, these models can easily generate copyrighted content, produce harmful content, or amplify stereotypes. Understanding such complex models, with billions of parameters trained on billions of images, remains challenging. Key questions include why certain input prompts lead to failures or artifacts, how these models might amplify biases, and how the training data impacts output quality. This project seeks to provide a systematic, interpretable, rational basis for understanding and controlling the learned computations of multimodal generative models, with the potential to increase the accountable and safe use of state-of-the-art multimodal AI models and to mitigate potential harms. To disseminate the research effectively, the investigators will freely release all materials (code, models, and datasets); host tutorials, workshops, and courses to engage the research community; broaden student participation at the K-12, undergraduate, and graduate levels; and engage with policymakers to inform them of the latest technology and future trends.

The project aims to develop a new systematic framework for visualizing, understanding, and rewriting the learned computation of multimodal generative models, and to leverage this knowledge to trace how the training data influences the internal representation and, ultimately, the model outputs. The project focuses on three research thrusts. First, new research methodologies will be developed to visualize the internal mechanisms and hierarchical structures of pre-trained multimodal generative models, understand their roles at different stages of the generation process, and extract visual concepts and their relationships. Second, the project will explore model-editing algorithms that manipulate these discovered concepts and relationships to pinpoint and fix inconsistencies, failures, biases, and safety concerns in existing multimodal generative models. Finally, the investigators will develop new attribution methods that assess the influence of training images on generated results based on an analysis of internal representations, and will demonstrate several potential applications of the attribution algorithms.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
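The first thrust, visualizing internal mechanisms, typically starts by recording what a model computes layer by layer. Below is a minimal, hedged sketch of one standard way to do this with PyTorch forward hooks; `model` and `layer_names` are hypothetical placeholders, not code or names released by this project.

```python
# Sketch: capture intermediate activations of a pre-trained generative
# model via PyTorch forward hooks, for later visualization or concept
# mining. `model` and `layer_names` are hypothetical placeholders.
import torch

activations = {}

def make_hook(name):
    def hook(module, inputs, output):
        # Detach and move to CPU so stored tensors do not retain the
        # autograd graph or GPU memory.
        activations[name] = output.detach().cpu()
    return hook

def register_hooks(model: torch.nn.Module, layer_names):
    handles = []
    for name, module in model.named_modules():
        if name in layer_names:
            handles.append(module.register_forward_hook(make_hook(name)))
    return handles  # call h.remove() on each handle when finished
```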
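The second thrust, model editing, is often framed as updating a layer's weights so that a chosen internal "key" direction maps to a new "value", in the spirit of prior model-rewriting work. A minimal closed-form rank-one version is sketched below; the tensors `W`, `k`, and `v_star` are assumed inputs, and the project's actual algorithms may differ.

```python
import torch

def rank_one_edit(W: torch.Tensor, k: torch.Tensor, v_star: torch.Tensor) -> torch.Tensor:
    """Smallest Frobenius-norm change to W such that the edited weight
    maps key k to v_star, i.e. W' @ k == v_star.

    W: (out, in) weight matrix; k: (in,) key; v_star: (out,) target value.
    """
    residual = v_star - W @ k                      # error of the current weight on k
    return W + torch.outer(residual, k) / (k @ k)  # rank-one correction
```

Because the update is rank one along `k`, inputs orthogonal to `k` are mapped exactly as before, which is why such edits can fix a targeted behavior without retraining the whole model.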
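The third thrust, data attribution, asks which training images most influenced a generated result. The project proposes attribution based on internal representations; as a simpler point of reference, the sketch below ranks training images by feature similarity. The embeddings `generated_feat` and `train_feats` are assumed to come from some fixed feature extractor and are placeholders.

```python
import torch
import torch.nn.functional as F

def attribute_by_similarity(generated_feat: torch.Tensor,
                            train_feats: torch.Tensor,
                            top_k: int = 5):
    """Rank candidate training images by cosine similarity to a
    generated image in feature space.

    generated_feat: (d,) embedding of the generated image.
    train_feats: (N, d) embeddings of training images; top_k <= N.
    """
    sims = F.cosine_similarity(train_feats, generated_feat.unsqueeze(0), dim=1)
    scores, indices = sims.topk(top_k)
    return indices, scores  # indices of the most influential candidates
```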

  • Program Officer
    Jie Yang, jyang@nsf.gov, (703) 292-4768
  • Min Amd Letter Date
    9/3/2024
  • Max Amd Letter Date
    9/3/2024

Institutions

  • Name
    Carnegie-Mellon University
  • City
    Pittsburgh
  • State
    PA
  • Country
    United States
  • Address
    5000 Forbes Ave
  • Postal Code
    15213-3815
  • Phone Number
    (412) 268-8746

Investigators

  • First Name
    Jun-Yan
  • Last Name
    Zhu
  • Email Address
    junyanz@andrew.cmu.edu
  • Start Date
    9/3/2024

Program Element

  • Text
    Robust Intelligence
  • Code
    749500

Program Reference

  • Text
    ROBUST INTELLIGENCE
  • Code
    7495
  • Text
    MEDIUM PROJECT
  • Code
    7924