Collaborative Research: CSR: Small: Cross-layer learning-based Energy-Efficient and Resilient NoC design for Multicore Systems

Information

  • Award Id
    2321224
  • Award Effective Date
    10/1/2023
  • Award Expiration Date
    9/30/2026
  • Award Amount
    $375,000.00
  • Award Instrument
    Standard Grant

The proliferation of multiple cores on the chip has signaled the advent of communication-centric, rather than computation-centric, systems. Consequently, the design of low-latency, high-bandwidth, power-efficient, and reliable Networks-on-Chip (NoCs) is proving to be one of the most critical challenges to achieving the performance potential of future multicore systems. However, while multicores offer an enormous integration capacity, rapid transistor scaling has led to a steady degradation of device and circuit reliability: unpredictable device behavior will become more common and will result in a significant increase in faults (both permanent and transient) and hardware failures. The ramifications for the NoC are immense: a single fault in the NoC may paralyze the working of the entire chip. While considerable efforts have been made to tackle the reliability challenge of NoCs, most current solutions concentrate on local optimizations within individual NoC abstraction layers (e.g., the circuit, message, and network layers). These solutions tend to possess limited knowledge of the overall system and are therefore reactive in behavior, making worst-case assumptions and overprovisioning. As a result, they introduce significant power, area, and performance overheads while not completely solving the reliability challenge.

This research project tackles the critical NoC reliability challenge by developing a comprehensive, cooperative, and adaptive multi-layer approach for designing reliable NoCs from fault-susceptible components, with globally optimized power, performance, and cost. To achieve this goal, the project is organized into four interrelated research tasks. First, the project conducts a comprehensive study of the fundamental mechanisms that underlie the reliability issues across NoC abstractions, including a detailed analysis of the dynamic interactions among NoC abstractions and the associated design trade-offs. Second, the project develops a cross-layer NoC architecture for resilient on-chip communication with machine-learning-based optimization. Third, the project aims to incorporate the application layer and off-chip communications, and develops a holistic design framework that can automatically capture and adapt to the various computation and communication requirements of different applications with optimized performance, power, and reliability. Finally, the project evaluates the designed framework by developing a cycle-accurate simulation framework and an FPGA prototype. This project will significantly advance the fundamental understanding of the interplay between the NoC and the rest of the components on the chip (cores, memory, etc.), as well as the design trade-offs between performance, power, reliability, and cost in future massively defective nanometer technologies. The developed NoC framework will benefit future multicore architectures and computing systems with system-level performance and reliability improvements.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

  • Program Officer
    Karen Karavanic, kkaravan@nsf.gov, 7032922594
  • Min Amd Letter Date
    8/7/2023
  • Max Amd Letter Date
    8/7/2023
  • ARRA Amount

Institutions

  • Name
    George Washington University
  • City
    WASHINGTON
  • State
    DC
  • Country
    United States
  • Address
    1918 F ST NW
  • Postal Code
    20052-0042
  • Phone Number
    2029940728

Investigators

  • First Name
    Ahmed
  • Last Name
    Louri
  • Email Address
    louri@email.gwu.edu
  • Start Date
    8/7/2023 12:00:00 AM

Program Element

  • Text
    CSR-Computer Systems Research
  • Code
    7354

Program Reference

  • Text
    SMALL PROJECT
  • Code
    7923