Collaborative Research: SHF: Medium: Toward Understandability and Interpretability for Neural Language Models of Source Code

Information

NSF Award
2311468

Owner

GEORGE MASON UNIVERSITY

Award Id
2311468
Award Effective Date
10/1/2023
Award Expiration Date
9/30/2027
Award Amount
$745,197.00
Award Instrument
Standard Grant

Information

Advances in artificial intelligence (AI) have led to the development of several new types of tools for software developers that aim to help automate various parts of the software development process of building and maintaining software. However, the combination of complex underlying deep-learning models and massive training datasets makes it difficult to interpret why these models, and the developer tools powered by them, behave the way they do. Given the increasingly important role that these tools are beginning to play in software engineering (SE), it is imperative that techniques be developed that allow stakeholders to better understand and work with these tools such that critical software infrastructure can be maintained. This project will develop a framework and methodology that enables both researchers who build AI-powered developer tools, and software engineers who use these tools, to interpret why the underlying models make the predictions they do. The objective is to allow researchers to obtain detailed insights into why a model may not be performing as expected, allowing for targeted improvement and informed creation of new models. The methodology will be integrated into AI-powered software development tools, allowing software engineers to make informed decisions about when a tool’s suggestion may be helpful or harmful, thus building trust in their use. The interpretability framework will also enable new forms of interaction with these tools, providing a mechanism for natural language feedback that improves over time. This project will produce and disseminate educational materials on best practices related to building and using AI-powered programming tools. These materials are intended to be integrated into existing computer-literacy courses at all levels of education. In addition, the project will focus on recruiting and retaining computer science students from traditionally underrepresented categories.

This project has three specific goals. First, it will design an automated approach for generating global explanations of the behavior of “context-free” neural language models for source code. This component of the project will map predictions from large language models to human-interpretable programming language concepts using causal inference theory, wherein explanations of behavior will be generated via causal interventions. Second, it will develop automated techniques for local explanations of contextualized language models of code by developing a set of interpretability techniques that generate behavioral, feature-based, and textual explanations defined for given SE tasks (e.g., program repair). Finally, the project will create techniques that enable researchers and developers to provide feedback to models based on generated explanations.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Program Officer
Sol Greenspan, sgreensp@nsf.gov, 7032927841
Min Amd Letter Date
8/8/2023
Max Amd Letter Date
8/8/2023
ARRA Amount

Institutions

Name
George Mason University
City
FAIRFAX
State
VA
Country
United States
Address
4400 UNIVERSITY DR
Postal Code
22030-4422
Phone Number
7039932295

Investigators

First Name
Kevin
Last Name
Moran
Email Address
kpmoran@gmu.edu
Start Date
8/8/2023 12:00:00 AM

First Name
Ziyu
Last Name
Yao
Email Address
ziyuyao@gmu.edu
Start Date
8/8/2023 12:00:00 AM

Program Element

Text
Software & Hardware Foundation
Code
7798

Program Reference

Text
MEDIUM PROJECT
Code
7924

Text
SOFTWARE ENG & FORMAL METHODS
Code
7944
