srcML is an infrastructure for the exploration, analysis, and manipulation of source code. It currently supports the translation of C, C++, C#, and Java source code to the srcML format, which contains all of the original source code plus grammatical information from the specific programming language used. The self-contained parsing technology is robust and scales well in both time and memory. Using the infrastructure, researchers and practitioners can readily construct source code analysis tools. srcML has been leveraged to build tools for software quality assessment, error detection, and security risk assessment of software systems. The freely available srcML parser is used by a wide variety of researchers and practitioners in software engineering and programming languages, as well as in computer science education. srcML has been used in the dissertation and thesis research of dozens (and counting) of computer science graduate students at institutions across the country. The proposed enhancements will extend the parsing and markup to a broad variety of widely used programming languages, for example Python, JavaScript, Go, and Ruby. They will drastically reduce the entry cost of conducting research by enabling individuals to explore, analyze, and manipulate software in an easy and flexible manner, leaving more time to pursue novel and transformative research on software, software engineering, and programming languages. Furthermore, they will provide practical tools for engineers to improve the quality and lower the cost of the software applications we all use daily.

The proposed enhancements extend the srcML infrastructure to a wider variety of popular programming languages. These extensions will be accomplished by developing a parser generator for the srcML format.
The input to the generator is a programming language grammar, and the output is a parser that takes source code in that language and inserts srcML markup into the code. This basic approach is similar to that taken by parser generators such as yacc and ANTLR. The grammar-based approach will significantly broaden the audience for the srcML infrastructure. It will not only allow new languages to be added easily but also make it possible to support dialects, legacy languages, and domain-specific languages. Many current research tools and techniques do not work on mixed-language or multi-language systems, or have not been validated on such real-world systems, because tools that can be applied to them are lacking. With these enhancements, srcML will be one of the few open-source, freely available source code analysis tools that support mixed-language systems.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.