Collaborative Research: RI: Medium: Programmatic Foundation Models for Visual Analysis on a Planetary Scale

Information

NSF Award
2403016

Owner

Columbia University

Award Id
2403016
Award Effective Date
8/15/2024
Award Expiration Date
7/31/2028
Award Amount
$95,677.00
Award Instrument
Continuing Grant

Information

Imagery from around the world, from satellites to drones and social media photographs, provides vital information about our planet. There is a unique opportunity in the fields of artificial intelligence and computer vision to understand global and local phenomena from these images, providing insight into climate change, public health, and agriculture. However, state-of-the-art methods in computer vision are not designed for these applications, where decision-making is complex and accuracy, robustness, and interpretability are required. Existing large-scale AI models, such as ChatGPT, only process individual images on the internet and cannot synthesize conclusions from planet-scale image collections. Even on single images, these models cannot reliably perform sophisticated logical reasoning, and building models to do such reasoning reliably requires infeasibly large datasets. Creating such large models and datasets is a significant barrier for scientific and societal applications of computer vision, particularly for organizations that do not have the computational resources of large corporations. This project will create a new class of machine learning models, called programmatic foundation models, that have the capability and efficiency to scale to planetary-scale image and video datasets. These models can be queried by experts using natural language, thus empowering scientists and other experts to benefit from AI-related visual discovery from the vast amounts of visual information available in satellite imagery, even if they lack expertise in machine learning. The proposed research has applications across public health, climate change, agriculture, security, and the economy.

The research objective of this project is to tightly integrate visual representations and program synthesis, thereby delivering an accurate, interpretable, and robust machine learning framework for answering questions about what is visible in image collections. Across two research thrusts, the project will drive the creation of these new programmatic foundation models. The first thrust proposes new techniques for building open-world recognition primitives across multiple sensing modalities based on vision-language models, but without any language annotations. It introduces new cross-modal contrastive learning techniques, as well as approaches for reasoning about temporal change. The second thrust proposes new techniques for learning to synthesize programs, incorporating uncertainty, learning from feedback, and adaptive computation. Given a query, the proposed framework learns to synthesize a customized program that breaks the task down into constituent steps and control flow that can be directly executed to solve the vision task. To execute each step, the project proposes new methods for training open-world classification, detection, and segmentation models for satellite, aerial, and ground imagery. Unlike prior foundation models, this integrated approach has many potential benefits in interpretability, logical soundness, modularity, compositionality, efficiency, and generality to different tasks. Taken together, the two thrusts combine program synthesis with open-world recognition models for analyzing satellite, drone, and ground imagery around the world.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
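
As an illustration of the abstract's idea of synthesizing a program that breaks a query into executable steps and control flow, the following is a minimal Python sketch. It is not the project's method or API. The `Tile` type, the `detect` primitive, the example query, and the tile labels are all hypothetical stand-ins; in particular, `detect` is a stub that checks precomputed labels where a real system would call an open-world recognition model.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Tile:
    """Hypothetical image tile; `labels` stands in for actual pixel data."""
    tile_id: str
    labels: FrozenSet[str]


def detect(tile: Tile, concept: str) -> bool:
    """Stub recognition primitive (assumption): looks up a precomputed label
    instead of running a vision model."""
    return concept in tile.labels


def synthesized_program(tiles: List[Tile]) -> int:
    """A program a synthesizer might emit for the hypothetical query
    'How many tiles show flooding next to a road?'."""
    count = 0
    for tile in tiles:                    # control flow over the image collection
        if detect(tile, "flood"):         # step 1: look for the first concept
            if detect(tile, "road"):      # step 2: run only when step 1 succeeds
                count += 1
    return count


# Toy collection of three tiles with made-up labels.
tiles = [
    Tile("a", frozenset({"flood", "road"})),
    Tile("b", frozenset({"road"})),
    Tile("c", frozenset({"flood"})),
]
result = synthesized_program(tiles)
print(result)
```

Because every step is an explicit call, the reasoning behind the final count can be inspected line by line, and the second check is skipped whenever the first fails, which loosely mirrors the interpretability and efficiency benefits the abstract describes.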

Program Officer
Jie Yang, jyang@nsf.gov, 7032924768
Min Amd Letter Date
8/15/2024
Max Amd Letter Date
8/15/2024
ARRA Amount

Institutions

Name
Columbia University
City
NEW YORK
State
NY
Country
United States
Address
615 W 131ST ST
Postal Code
10027-7922
Phone Number
2128546851

Investigators

First Name
Carl
Last Name
Vondrick
Email Address
cv2428@columbia.edu
Start Date
8/15/2024

Program Element

Text
Robust Intelligence
Code
749500

Program Reference

Text
ROBUST INTELLIGENCE
Code
7495

Text
MEDIUM PROJECT
Code
7924
