CCRI: ENS: Collaborative Research: Enabling Automated Language Support for the srcML Infrastructure

Information

  • NSF Award
  • 2016452
Owner
  • Award Id
    2016452
  • Award Effective Date
    7/15/2020
  • Award Expiration Date
    6/30/2023
  • Award Amount
    $254,636.00
  • Award Instrument
    Standard Grant

srcML is an infrastructure for the exploration, analysis, and manipulation of source code. The infrastructure currently supports the translation of C, C++, C#, and Java source code to the srcML format. The srcML format contains all of the original source code plus grammatical information from the specific programming language used. The self-contained parsing technology is robust and scales well in both time and memory. Researchers and practitioners can readily construct source code analysis tools on top of the infrastructure. srcML has been leveraged to construct tools for software quality assessment, error detection, and security risk assessment of software systems. The freely available srcML parser is used by a wide variety of researchers and practitioners in the fields of software engineering and programming languages, as well as in computer science education. srcML has been used in the dissertation and thesis research of dozens (and counting) of computer science graduate students at a number of institutions across the country. The proposed enhancements to the srcML infrastructure will extend the parsing and markup to a broad variety of widely used programming languages, for example Python, JavaScript, Go, and Ruby. These enhancements will drastically reduce the entry cost for individuals to conduct research by enabling them to explore, analyze, and manipulate software in an easy and flexible manner, thus allowing them more time to pursue novel and transformative research on software, software engineering, and programming languages. Furthermore, they provide practical tools for engineers to improve the quality and lower the cost of the software applications we all use daily.

The proposed enhancements to the srcML infrastructure extend it to a wider variety of popular programming languages. These extensions will be accomplished by developing a parser generator for the srcML format. The generator's input is a programming language grammar, and its output is a parser that takes source code in that programming language and inserts the srcML markup into the code. This basic approach is similar to those taken by parser generators such as yacc or ANTLR. The grammar-based approach will significantly broaden the audience for the srcML infrastructure. It will not only allow new languages to be added easily but also make it possible to support dialects, legacy languages, and domain-specific languages. Many current research tools and techniques do not work on mixed-language or multi-language systems, or are not validated on such real-world systems, due to the lack of tools that can be applied. The enhanced srcML will be one of the only mixed-language source code analysis tools that is open source and freely available.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
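
The grammar-in, markup-out idea described in the abstract can be illustrated with a toy sketch. It is not the srcML parser generator and does not use srcML's actual schema: the function names (`make_marker`, `strip_markup`), the rule list `TOY_RULES`, and the tag names are all hypothetical, and a list of regular expressions stands in for a real grammar. The sketch only shows the two properties the abstract names: the markup is produced from a language description, and the marked-up output still contains all of the original source text.

```python
import re
from xml.sax.saxutils import escape, unescape


def make_marker(rules):
    """Build a markup function from (tag, regex) rules.

    Toy stand-in for a parser generator: the "grammar" is only a list of
    token patterns, and the tag names are hypothetical.
    """
    # Combine the rules into one alternation of named groups.
    pattern = re.compile("|".join(f"(?P<{tag}>{regex})" for tag, regex in rules))

    def mark_up(source):
        out = []
        pos = 0
        for match in pattern.finditer(source):
            # Copy text between tokens (whitespace, unknown characters) unchanged.
            out.append(escape(source[pos:match.start()]))
            tag = match.lastgroup
            out.append(f"<{tag}>{escape(match.group())}</{tag}>")
            pos = match.end()
        out.append(escape(source[pos:]))
        return "<unit>" + "".join(out) + "</unit>"

    return mark_up


def strip_markup(marked):
    """Recover the original source text by removing the tags."""
    return unescape(re.sub(r"<[^>]+>", "", marked))


# Hypothetical token rules for a tiny expression language.
TOY_RULES = [
    ("literal", r"\d+"),
    ("name", r"[A-Za-z_]\w*"),
    ("operator", r"[=+\-*/<>]"),
]

mark_up = make_marker(TOY_RULES)
source = "total = count + 42"
marked = mark_up(source)
print(marked)
print(strip_markup(marked) == source)
```

Swapping in a different rule list yields a marker for a different toy language without changing `make_marker`, which is the property the grammar-based approach relies on. A real generator would work from a full grammar and emit nested syntactic structure rather than flat token tags.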

  • Program Officer
    Sol Greenspan
  • Min Amd Letter Date
    7/9/2020
  • Max Amd Letter Date
    7/9/2020
  • ARRA Amount

Institutions

  • Name
    University of Akron
  • City
    Akron
  • State
    OH
  • Country
    United States
  • Address
    302 Buchtel Common
  • Postal Code
    44325-0001
  • Phone Number
    (330) 972-2760

Investigators

  • First Name
    Michael
  • Last Name
    Collard
  • Email Address
    collard@uakron.edu
  • Start Date
    7/9/2020 12:00:00 AM

Program Element

  • Text
    CCRI-CISE Cmnty Rsrch Infrstrc
  • Code
    7359

Program Reference

  • Text
    COMPUTING RES INFRASTRUCTURE
  • Code
    7359
  • Text
    EXP PROG TO STIM COMP RES
  • Code
    9150