SHF: Small: Intelligent Management of Hybrid Workloads for Extreme Scale Computing

Information

  • NSF Award
  • 2109316
Owner
  • Award Id
    2109316
  • Award Effective Date
    10/1/2021
  • Award Expiration Date
    9/30/2024
  • Award Amount
    $ 499,999.00
  • Award Instrument
    Standard Grant

The high-performance computing (HPC) community is embracing artificial intelligence (AI) techniques for countless pursuits, from driving ground-breaking scientific discoveries to protecting our national security. As newly emerging machine learning and data-centric workloads proliferate in HPC, current workload-management systems cannot keep up with the significant challenges introduced by the diverse mix of applications co-running on heterogeneous systems. This project tackles the problem by developing new workload-management methods to catalyze the convergence of HPC, AI, and data analytics. It will develop fundamental improvements in HPC workload management to promote the use of large-scale supercomputers for emerging data-centric applications (HPC4AI). Meanwhile, it will exploit advanced AI technologies, especially multi-objective reinforcement learning, to empower job scheduling and resource allocation in HPC (AI4HPC).

The project aims to develop an intelligent workload-management framework named MINT, in which the distinctive computational resource requirements of hybrid workloads will be automatically identified and fulfilled to achieve extreme resource efficiency and a satisfactory user experience. Key research thrusts are: understanding performance implications of diverse workloads on supercomputers via model-driven analysis; new intelligent multi-resource scheduling methods; smart resource-allocation strategies for minimal workload interference; and extensive evaluation of the proposed framework through trace-based simulation and testing. The deliverables include a new workload-management framework and open-source software releases for intelligent management of hybrid workloads on extreme-scale systems.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

  • Program Officer
    Almadena Chtchelkanova, achtchel@nsf.gov, 7032927498
  • Min Amd Letter Date
    8/26/2021
  • Max Amd Letter Date
    8/26/2021
  • ARRA Amount

Institutions

  • Name
    Illinois Institute of Technology
  • City
    Chicago
  • State
    IL
  • Country
    United States
  • Address
    10 West 35th Street
  • Postal Code
    60616-3717
  • Phone Number
    3125673035

Investigators

  • First Name
    Zhiling
  • Last Name
    Lan
  • Email Address
    lan@iit.edu
  • Start Date
    8/26/2021 12:00:00 AM
  • First Name
    Kai
  • Last Name
    Shu
  • Email Address
    kshu@iit.edu
  • Start Date
    8/26/2021 12:00:00 AM

Program Element

  • Text
    Software & Hardware Foundation
  • Code
    7798

Program Reference

  • Text
    SMALL PROJECT
  • Code
    7923
  • Text
    HIGH-PERFORMANCE COMPUTING
  • Code
    7942
  • Text
    WOMEN, MINORITY, DISABLED, NEC
  • Code
    9102