Collaborative Research: SHF: Medium: Toward Understandability and Interpretability for Neural Language Models of Source Code

Information

  • NSF Award
  • 2311468
Owner
  • Award Id
    2311468
  • Award Effective Date
    10/1/2023
  • Award Expiration Date
    9/30/2027
  • Award Amount
    $745,197.00
  • Award Instrument
    Standard Grant

Advances in artificial intelligence (AI) have led to the development of several new types of tools that aim to help software developers automate parts of building and maintaining software. However, the combination of complex underlying deep-learning models and massive training datasets makes it difficult to interpret why these models, and the developer tools powered by them, behave the way they do. Given the increasingly important role that these tools are beginning to play in software engineering (SE), it is imperative to develop techniques that allow stakeholders to better understand and work with these tools so that critical software infrastructure can be maintained. This project will develop a framework and methodology that enables both researchers who build AI-powered developer tools and software engineers who use these tools to interpret why the underlying models make the predictions they do. The objective is to give researchers detailed insights into why a model may not be performing as expected, allowing for targeted improvement and informed creation of new models. The methodology will be integrated into AI-powered software development tools, allowing software engineers to make informed decisions about when a tool’s suggestion may be helpful or harmful, thus building trust in the use of these tools. The interpretability framework will also enable new forms of interaction with these tools, providing a mechanism for natural language feedback that improves over time. This project will produce and disseminate educational materials on best practices related to building and using AI-powered programming tools. These materials are intended to be integrated into existing computer-literacy courses at all levels of education. In addition, the project will focus on recruiting and retaining computer science students from traditionally underrepresented categories.

This project has three specific goals. First, it will design an automated approach for generating global explanations of the behavior of “context-free” neural language models for source code. This component of the project will map predictions from large language models to human-interpretable programming language concepts using causal inference theory, wherein explanations of behavior will be generated via causal interventions. Second, it will develop a set of interpretability techniques that automatically generate local explanations of contextualized language models of code, in the form of behavioral, feature-based, and textual explanations defined for given SE tasks (e.g., program repair). Finally, the project will create techniques that enable researchers and developers to provide feedback to models based on generated explanations.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

  • Program Officer
    Sol Greenspan, sgreensp@nsf.gov, 7032927841
  • Min Amd Letter Date
    8/8/2023
  • Max Amd Letter Date
    8/8/2023
  • ARRA Amount

Institutions

  • Name
    George Mason University
  • City
    FAIRFAX
  • State
    VA
  • Country
    United States
  • Address
    4400 UNIVERSITY DR
  • Postal Code
    220304422
  • Phone Number
    7039932295

Investigators

  • First Name
    Kevin
  • Last Name
    Moran
  • Email Address
    kpmoran@gmu.edu
  • Start Date
    8/8/2023 12:00:00 AM
  • First Name
    Ziyu
  • Last Name
    Yao
  • Email Address
    ziyuyao@gmu.edu
  • Start Date
    8/8/2023 12:00:00 AM

Program Element

  • Text
    Software & Hardware Foundation
  • Code
    7798

Program Reference

  • Text
    MEDIUM PROJECT
  • Code
    7924
  • Text
    SOFTWARE ENG & FORMAL METHODS
  • Code
    7944