RI: Medium: Scalable Second-order Methods for Training, Designing, and Deploying Machine Learning Models

Information

  • NSF Award
  • 2107000
Owner
  • Award Id
    2107000
  • Award Effective Date
    10/1/2021
  • Award Expiration Date
    9/30/2025
  • Award Amount
    $1,200,000.00
  • Award Instrument
    Standard Grant

Abstract

Scalable optimization algorithms that can handle large-scale modern datasets are an integral part of many applications of machine learning. Optimization methods that use only first-derivative information, so-called first-order methods, are the most common. However, many first-order methods come with inherent disadvantages, e.g., slow convergence, poor communication efficiency, and the need for laborious manual tuning. On the other hand, so-called second-order methods, i.e., methods that use second-derivative information, can mitigate many of these disadvantages, but they are far less widely used within machine learning. This project will develop, implement, and apply novel methods that, through innovative use of second-order information, allow for enhanced design, diagnostics, and training of machine learning models.

Technical work will focus on theoretical developments, efficient implementations, and applications in multiple settings. Theoretical developments will tackle challenges involved in training large-scale non-convex machine learning models from four general angles: high-quality local minima; distributed computing environments; generalization performance; and acceleration. Work will also develop efficient Hessian-based diagnostic tools for analyzing both the training process and already-trained models. Finally, improvements and applications of the proposed methods will be developed in a variety of settings: improved communication properties; exploiting adversarial data; and exploring how these ideas can be used for more challenging problems, such as improving neural architecture design and search. In all cases, high-quality, user-friendly implementations for both shared-memory and distributed computing environments will be made available.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
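As a minimal illustration of the first-order vs. second-order distinction the abstract draws (this is a generic textbook sketch, not the project's actual algorithms), the code below compares a plain gradient-descent step, whose fixed step size must be tuned by hand, with a Newton step, which uses second-derivative (curvature) information to scale the step automatically. The test function and all names are hypothetical.

```python
import math


def f(x):
    # Simple strictly convex 1-D test function (illustrative only).
    return x**4 + x**2


def grad(x):
    # First derivative of f.
    return 4 * x**3 + 2 * x


def hess(x):
    # Second derivative of f; strictly positive, so Newton steps descend.
    return 12 * x**2 + 2


def gradient_descent(x, lr=0.1, steps=50):
    # First-order method: the learning rate lr is a manually tuned constant.
    for _ in range(steps):
        x -= lr * grad(x)
    return x


def newton(x, steps=50):
    # Second-order method: the local curvature sets the step scale,
    # so no learning rate needs to be tuned.
    for _ in range(steps):
        x -= grad(x) / hess(x)
    return x
```

Starting from the same point, both methods approach the minimizer at x = 0, but Newton's method converges far faster near the solution because each step incorporates curvature; this is the kind of convergence advantage the abstract attributes to second-order information.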

  • Program Officer
    Rebecca Hwa
    rhwa@nsf.gov
    703-292-7148
  • Min Amd Letter Date
    7/23/2021
  • Max Amd Letter Date
    7/23/2021
  • ARRA Amount

Institutions

  • Name
    International Computer Science Institute
  • City
    Berkeley
  • State
    CA
  • Country
    United States
  • Address
    2150 Shattuck Ave, Suite 1100
  • Postal Code
    94704-1345
  • Phone Number
    510-666-2900

Investigators

  • First Name
    Amir
  • Last Name
    Gholaminejad
  • Email Address
    amirgh@berkeley.edu
  • Start Date
    7/23/2021
  • First Name
    Michael
  • Last Name
    Mahoney
  • Email Address
    mmahoney@icsi.berkeley.edu
  • Start Date
    7/23/2021

Program Element

  • Text
    Robust Intelligence
  • Code
    7495

Program Reference

  • Text
    ROBUST INTELLIGENCE
  • Code
    7495
  • Text
    MEDIUM PROJECT
  • Code
    7924