Collaborative Research: NSF-MeitY: CSR: Small: Eco-LLM: Energy-Efficient Computation and Communication for Large Language Models with CXL-based Chip Architecture and Software

Information

  • NSF Award
  • 2415201
  • Award Id
    2415201
  • Award Effective Date
    10/1/2024
  • Award Expiration Date
    9/30/2027
  • Award Amount
    $400,000.00
  • Award Instrument
    Standard Grant

Abstract

Although crucial for advanced Artificial Intelligence (AI) applications due to their language understanding and generation capabilities, Large Language Models (LLMs) are energy intensive. This project’s goals and novelty are to enhance the efficiency of training and inference associated with LLMs by leveraging emerging high-speed networks and computing architecture. The project’s broader significance and importance are to (1) enable a broad range of LLMs to efficiently operate, advancing AI applications at a low energy cost; (2) strengthen international research collaboration between U.S. and India researchers; and (3) provide educational opportunities for graduate students.

This project addresses the energy efficiency challenges of LLMs by optimizing their energy consumption in heterogeneous Compute Express Link (CXL)-enabled hardware environments. By leveraging High-Performance Computing (HPC) middleware and the high-bandwidth, low-latency features of CXL, the project aims to ensure sustainable and efficient AI operations. This project seeks to find solutions to the following set of fundamental issues in training and using LLMs at scale: 1) identifying and characterizing idleness in the LLM workloads; 2) using the knowledge of long idleness to insert low-overhead Dynamic Voltage and Frequency Scaling (DVFS) control and undervolting to save static energy consumption; 3) designing CXL-aware and energy-efficient Message Passing Interface (MPI)-based communication runtime for LLM training and inference; and 4) studying the overall impact of the integrated systems on the energy consumption of LLM training and inference. The results are disseminated to collaborating organizations to impact their HPC/AI software applications and hardware chip designs, promoting broader societal advancement through improved technological capabilities.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
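To make the second fundamental issue more concrete, the sketch below illustrates one possible shape of idleness-driven DVFS control: watch GPU utilization and cap clocks during sustained idle phases, restoring them when work resumes. This is a minimal illustration only, not the project's runtime; it assumes an NVIDIA GPU that supports locked clocks, sufficient privileges, the pynvml (NVML) bindings, and illustrative thresholds and clock values.

# Minimal sketch (illustrative, not the project's implementation) of
# idleness-driven DVFS: poll GPU utilization via NVML and lock clocks to a
# lower range during long idle phases, lifting the cap when work resumes.
# Assumptions: NVIDIA GPU supporting locked clocks, root/admin privileges,
# and made-up thresholds and clock values chosen only for illustration.

import time
import pynvml

IDLE_UTIL_PCT = 5            # below this utilization we call the GPU "idle" (assumed)
IDLE_SECS_BEFORE_SCALE = 2.0 # scale down only after this much sustained idleness (assumed)
LOW_CLOCK_MHZ = (210, 600)   # illustrative low-power min/max graphics clock range

def monitor(device_index: int = 0, poll_interval: float = 0.5) -> None:
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
    idle_since = None
    scaled_down = False
    try:
        while True:
            util = pynvml.nvmlDeviceGetUtilizationRates(handle).gpu
            now = time.monotonic()
            if util < IDLE_UTIL_PCT:
                idle_since = idle_since if idle_since is not None else now
                if not scaled_down and now - idle_since >= IDLE_SECS_BEFORE_SCALE:
                    # Long idle phase detected: cap clocks to cut energy draw.
                    pynvml.nvmlDeviceSetGpuLockedClocks(handle, *LOW_CLOCK_MHZ)
                    scaled_down = True
            else:
                idle_since = None
                if scaled_down:
                    # Work resumed: remove the cap so training/inference runs at full speed.
                    pynvml.nvmlDeviceResetGpuLockedClocks(handle)
                    scaled_down = False
            time.sleep(poll_interval)
    finally:
        if scaled_down:
            pynvml.nvmlDeviceResetGpuLockedClocks(handle)
        pynvml.nvmlShutdown()

if __name__ == "__main__":
    monitor()

In practice, as the abstract notes, such control would be driven by characterization of idleness in the LLM workload itself (e.g., communication or pipeline stalls) rather than by simple utilization polling, and would be integrated with the CXL-aware MPI communication runtime.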

  • Program Officer
    Daniela Oliveira (doliveir@nsf.gov, (703) 292-0000)
  • Min Amd Letter Date
    8/26/2024
  • Max Amd Letter Date
    8/26/2024
  • ARRA Amount

Institutions

  • Name
    Ohio State University
  • City
    COLUMBUS
  • State
    OH
  • Country
    United States
  • Address
    1960 KENNY RD
  • Postal Code
    43210-1016
  • Phone Number
    (614) 688-8735

Investigators

  • First Name
    Dhabaleswar
  • Last Name
    Panda
  • Email Address
    panda@cse.ohio-state.edu
  • Start Date
    8/26/2024 12:00:00 AM
  • First Name
    Hari
  • Last Name
    Subramoni
  • Email Address
    subramoni.1@osu.edu
  • Start Date
    8/26/2024 12:00:00 AM
  • First Name
    Aamir
  • Last Name
    Shafi
  • Email Address
    shafi.16@osu.edu
  • Start Date
    8/26/2024 12:00:00 AM
  • First Name
    Mustafa
  • Last Name
    Abduljabbar
  • Email Address
    abduljabbar.1@osu.edu
  • Start Date
    8/26/2024 12:00:00 AM

Program Element

  • Text
    GVF - Global Venture Fund
  • Text
    CSR-Computer Systems Research
  • Code
    735400

Program Reference

  • Text
    US-India Collaborative Research
  • Text
    INDIA
  • Code
    6194
  • Text
    SMALL PROJECT
  • Code
    7923