Collaborative Research: CNS Core: Small: A Compilation System for Mapping Deep Learning Models to Tensorized Instructions (DELITE)

Information

  • Award Id
    2341378
  • Award Effective Date
    10/1/2023
  • Award Expiration Date
    9/30/2026
  • Award Amount
    $299,999.00
  • Award Instrument
    Standard Grant

Abstract

As Machine Learning (ML) workloads, and especially Deep Neural Network (DNN) workloads, have rapidly become prominent, many existing architectures have been enriched with instructions and/or processing capabilities targeting them. Examples include AMX instructions from Intel, Tensor Cores from NVIDIA, DOT instructions from AMD, and many others. The emergence of such tensorized instructions raises many common and related challenges regarding how they can be used for production-level modern DNNs. The current state of the art for exploiting these instruction sets for DNN workloads is very limited: existing systems either ignore these instructions entirely, do not address global optimizations for complex DNNs, or are limited in other ways. The premise of this work is that a compilation system that is cognizant of the latest DNN trends and can optimize across different tensorized instruction sets will provide large efficiency gains for modern ML computations. The resulting agenda is likely to have significant technical, economic, and societal impacts. On the technical side, the work affects areas such as High-Performance Computing (HPC), compilers, and systems supporting AI/ML workloads. As DNNs become an integral part of applications that most people use, this work is poised to have a large economic and societal impact. On the education side, the research at the intersection of systems and ML will be incorporated into multiple courses and will help increase diversity at all levels of computing education and research, particularly by involving members of underrepresented groups.

This project addresses the following challenges associated with modern DNNs and recent and emerging tensorized instructions:

1) Local Instruction Selection for Dense Models -- To improve the execution efficiency of each operator, a critical first issue is selecting tensorized instructions (and associated data layouts), which will be addressed for arbitrary operator shapes.

2) Global Optimizations for DNNs -- After local operator optimizations, each operator may prefer its own tensorized instruction and data layout, incurring significant data-layout transformation costs during the execution of an entire DNN. This project formulates and solves a global optimization problem that chooses the right trade-off between local operator execution costs and data transformation costs (a minimal sketch of this formulation follows the list).

3) Optimizations for Dynamic DNNs -- The project also considers various forms of dynamism in modern DNN models, including dynamic input shapes, dynamic control flow, and dynamic data structures. It proposes new optimizations, such as those for effective memory management, while revisiting others, such as local and global instruction selection, in the presence of these forms of dynamism.

4) Mapping Sparse Models to Emerging Instructions -- The project also plans to improve the efficiency of using various types of tensorized instructions when sparsity is involved, building on earlier work for optimizing kernels like SpMM (and other sparse computations) on GPUs and SIMD instruction sets.

5) (Semi-)Automatic Support for New Instructions -- To minimize optimization and programming effort, the project also introduces a module to automatically optimize DNN computations with new tensorized instructions or features.
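To make challenge 2 concrete, here is a minimal sketch of the kind of global optimization the abstract describes, under strong simplifying assumptions: the model is a linear chain of operators (real DNNs are DAGs), and the candidate configurations (amx_nhwc, avx512_nchw) and all costs are invented for illustration. Under those assumptions, choosing one (instruction, layout) configuration per operator so as to minimize the sum of local execution costs and layout-transformation costs reduces to a Viterbi-style dynamic program:

```python
# Hypothetical per-operator candidates: configuration name -> local
# execution cost (say, cycles). Both names and numbers are invented.
candidates = [
    {"amx_nhwc": 10.0, "avx512_nchw": 14.0},   # operator 0
    {"amx_nhwc": 22.0, "avx512_nchw": 18.0},   # operator 1
    {"amx_nhwc": 9.0,  "avx512_nchw": 11.0},   # operator 2
]

def transform_cost(cfg_a: str, cfg_b: str) -> float:
    """Hypothetical cost of converting the data layout between two
    adjacent operators' configurations; zero if the layouts match."""
    return 0.0 if cfg_a.split("_")[1] == cfg_b.split("_")[1] else 6.0

def select_configs(candidates):
    """Viterbi-style DP over a chain: best[cfg] holds the cheapest total
    cost of any assignment for the prefix that ends in configuration cfg,
    along with the assignment itself."""
    best = {cfg: (cost, [cfg]) for cfg, cost in candidates[0].items()}
    for layer in candidates[1:]:
        nxt = {}
        for cfg, local in layer.items():
            prev_cfg, (prev_cost, path) = min(
                best.items(),
                key=lambda kv: kv[1][0] + transform_cost(kv[0], cfg),
            )
            nxt[cfg] = (
                prev_cost + transform_cost(prev_cfg, cfg) + local,
                path + [cfg],
            )
        best = nxt
    return min(best.values())

total, assignment = select_configs(candidates)
print(total, assignment)   # e.g., 41.0 ['amx_nhwc', 'amx_nhwc', 'amx_nhwc']
```

Note how the globally cheapest assignment keeps operator 1 on its locally worse configuration to avoid two layout conversions. For a general DAG, choices interact across branches, so the project's actual formulation would need to go beyond this chain-structured special case.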
Beyond addressing the problems above, a critical component of the project is incorporating their implementations, together with code generation for multiple back-ends, into a reusable system. The system will take a computational graph representation as input and produce Tensor and LLVM IRs as output, building around three representations widely used in industry (a toy illustration of this lowering appears after this abstract).

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
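To make the three-level structure (computational graph, tensor IR, LLVM IR) concrete, here is a purely illustrative sketch. The graph node, the tensor-IR syntax, and the 16x16x16 tensorized instruction are all invented for illustration and do not come from the project; real systems at each level are far richer.

```python
# A toy illustration (all names hypothetical) of lowering a
# computational-graph node to a tensor-IR-style loop nest whose inner
# tile maps onto a fixed-shape tensorized instruction; a real system
# would further lower this to LLVM IR for a concrete back-end.
from dataclasses import dataclass

@dataclass
class MatMulNode:
    """Computational-graph level: C[m, n] = A[m, k] @ B[k, n]."""
    m: int
    n: int
    k: int

def lower_to_tensor_ir(node: MatMulNode, tile=(16, 16, 16)) -> str:
    """Emit a tensor-IR-like loop nest whose innermost step is a call to
    a hypothetical 16x16x16 tensorized matrix instruction. Arbitrary
    shapes (challenge 1) would additionally require padding or masked
    residue tiles when m, n, k are not multiples of the tile."""
    tm, tn, tk = tile
    return "\n".join([
        f"for io in range(0, {node.m}, {tm}):",
        f"  for jo in range(0, {node.n}, {tn}):",
        f"    for ko in range(0, {node.k}, {tk}):",
        f"      tensorize.mma_{tm}x{tn}x{tk}(C[io, jo], A[io, ko], B[ko, jo])",
    ])

print(lower_to_tensor_ir(MatMulNode(m=64, n=64, k=32)))
```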

  • Program Officer
    Karen Karavanic, kkaravan@nsf.gov, (703) 292-2594
  • Min Amd Letter Date
    9/14/2023
  • Max Amd Letter Date
    9/14/2023

Institutions

  • Name
    University of Georgia Research Foundation Inc
  • City
    Athens
  • State
    GA
  • Country
    United States
  • Address
    310 E CAMPUS RD RM 409
  • Postal Code
    30602-1589
  • Phone Number
    (706) 542-5939

Investigators

  • First Name
    Gagan
  • Last Name
    Agrawal
  • Email Address
    gagrawal@uga.edu
  • Start Date
    9/14/2023

Program Element

  • Text
    CSR-Computer Systems Research
  • Code
    7354