Collaborative Research: Frameworks: Scalable Performance and Accuracy analysis for Distributed and Extreme-scale systems (SPADE)

Information

  • Award Id
    2311709
  • Award Effective Date
    9/15/2023
  • Award Expiration Date
    8/31/2027
  • Award Amount
    $447,016.00
  • Award Instrument
    Standard Grant

Advances in computer simulations have made scientific discoveries more accessible. However, as computing technology evolves, each new generation of hardware and software presents unique performance and reliability challenges. These challenges must be addressed to fully harness the potential of these technologies. SPADE is a project that aims to tackle these issues head-on. At its core, SPADE builds on the PAPI performance monitoring library, a tool used by the High-Performance Computing (HPC) community for over two decades. SPADE aims to enhance this legacy by creating methods that can assess and improve performance and accuracy on a wide range of advanced and evolving hardware and software technologies. This endeavor is not just about improving computational science but also about fostering diversity and educating a new generation of application scientists, engineers, and computer scientists. By providing an understanding of, and the ability to navigate, the intricate details of emerging computing technologies, SPADE contributes directly to the advancement of this field. This will also democratize access to HPC, allowing a more diverse range of researchers and institutions to contribute to scientific discovery. Moreover, as SPADE aims to improve the capabilities of computer simulations, it enhances the ability to tackle a broad range of challenges, from understanding climate change to drug discovery. In essence, beyond advancing the HPC field, SPADE intends to deliver real-world impact by unlocking the full potential of computational science.

The SPADE project focuses on advancing the monitoring, optimization, evaluation, and decision-making capabilities for extreme-scale systems. These capabilities are pivotal for both the HPC community and the scientific applications community that leverages these systems. With the evolution of HPC resources toward extreme scale, there is an increasing need for integrated performance and accuracy analysis frameworks to understand and mitigate performance and reliability challenges. To meet these needs, SPADE aims to deliver software and application programming interfaces (APIs) that broaden support for heterogeneity and scalability across a diverse range of computing platforms, including emerging vendor technologies. The SPADE project intends to use the established PAPI performance monitoring library to address the demands of scientific and machine learning applications effectively. Specifically, SPADE's mission includes: (1) developing monitoring capabilities for innovative and advanced technologies across the hardware stack; (2) designing novel abstractions that encapsulate the internal behavior of software components and facilitate interoperability across the software stack; (3) implementing a new performance and accuracy analysis framework that capitalizes on the efficiency and flexibility of C++'s object-oriented nature; (4) integrating new analysis functionality with various software stack layers and scientific and machine learning applications; and (5) examining new accuracy vs. performance trade-offs introduced with low-precision floating-point types. In essence, SPADE facilitates innovations in cyberinfrastructure development by enabling efficient and comprehensive resource utilization of extreme-scale platforms.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
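
Item (5) of the abstract concerns the accuracy side of low-precision floating-point types. The following is a minimal, illustrative sketch of that kind of accuracy loss, not SPADE or PAPI code: it emulates IEEE 754 half precision with Python's standard `struct` module (format code `e`) and compares a running sum kept in half precision against the same sum in double precision. The helper names `to_half` and `accumulate` are hypothetical, chosen only for this example, and the sketch measures accuracy only, not performance.

```python
import struct


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def accumulate(values, round_fn=lambda v: v):
    """Sum values, applying round_fn to each input and to every partial sum."""
    total = 0.0
    for v in values:
        total = round_fn(total + round_fn(v))
    return total


values = [0.1] * 1000   # the exact sum is 100
exact = 100.0

full = accumulate(values)            # double precision throughout
half = accumulate(values, to_half)   # emulated half precision throughout

full_err = abs(full - exact)
half_err = abs(half - exact)

print(f"double precision: sum={full!r}, abs error={full_err:.3e}")
print(f"half precision:   sum={half!r}, abs error={half_err:.3e}")
```

The half-precision error grows because the gap between representable values widens as the running total increases, so each added 0.1 is rounded to a coarser and coarser grid. An analysis framework of the kind the abstract describes would presumably weigh this sort of error against the speed and memory benefits of the narrower type.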

  • Program Officer
    Ashok Srinivasan, asriniva@nsf.gov, 7032922122
  • Min Amd Letter Date
    9/12/2023
  • Max Amd Letter Date
    9/12/2023
  • ARRA Amount

Institutions

  • Name
    University of Maine
  • City
    ORONO
  • State
    ME
  • Country
    United States
  • Address
    5717 CORBETT HALL
  • Postal Code
    04469-5717
  • Phone Number
    2075811484

Investigators

  • First Name
    Vincent
  • Last Name
    Weaver
  • Email Address
    vincent.weaver@maine.edu
  • Start Date
    9/12/2023 12:00:00 AM

Program Element

  • Text
    Software Institutes
  • Code
    8004

Program Reference

  • Text
    CSSI-1: Cyberinfr for Sustained Scientif
  • Text
    Software Institutes
  • Code
    8004
  • Text
    WOMEN, MINORITY, DISABLED, NEC
  • Code
    9102
  • Text
    EXP PROG TO STIM COMP RES
  • Code
    9150