BIGDATA: Collaborative Research: F: Stochastic Approximation for Subspace and Multiview Representation Learning

Information

  • Award Id
    1546500
  • Award Effective Date
    9/1/2015
  • Award Expiration Date
    8/31/2019
  • Award Amount
    $394,518.00
  • Award Instrument
    Standard Grant

Abstract

Unsupervised learning of useful features, or representations, is one of the most basic challenges of machine learning. Unsupervised representation learning techniques capitalize on unlabeled data, which is often cheap and abundant, and sometimes virtually unlimited. The goal of these ubiquitous techniques is to learn a representation that reveals intrinsic low-dimensional structure in the data, disentangles underlying factors of variation by incorporating universal AI priors such as smoothness and sparsity, and is useful across multiple tasks and domains.

This project aims to develop new theory and methods for representation learning that scale easily to large datasets. In particular, the project is concerned with methods for large-scale unsupervised feature learning, including Principal Component Analysis (PCA) and Partial Least Squares (PLS). To capitalize on massive amounts of unlabeled data, the project will develop appropriate computational approaches and study them in the "data-laden" regime. Rather than viewing representation learning as dimensionality reduction and focusing on an empirical objective over finite data, these methods are studied with the goal of optimizing a population objective based on samples. This view suggests Stochastic Approximation approaches, such as Stochastic Gradient Descent (SGD) and Stochastic Mirror Descent, which are incremental in nature and process each new sample with a computationally cheap update. Furthermore, this view enables a rigorous analysis of the benefits of stochastic approximation algorithms over traditional finite-data methods. The project aims to develop stochastic approximation approaches to PCA, PLS, and related problems and extensions, including deep and sparse variants, and to analyze these problems in the data-laden regime.
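To illustrate the kind of method the abstract describes, here is a minimal sketch of stochastic approximation for PCA using Oja's rule, a classic incremental update that processes one sample at a time with a cheap rank-one step. This is an illustrative example of the general approach, not the project's actual algorithms; the function name and parameters are my own.

```python
import numpy as np

def oja_top_eigenvector(sample_stream, dim, lr=0.005):
    """Estimate the top principal direction of the population
    covariance from a stream of samples, without ever storing
    the data or forming a covariance matrix (Oja's rule)."""
    w = np.random.default_rng(0).normal(size=dim)
    w /= np.linalg.norm(w)
    for x in sample_stream:
        # Stochastic gradient step on the Rayleigh quotient,
        # followed by projection back onto the unit sphere.
        w += lr * x * (x @ w)
        w /= np.linalg.norm(w)
    return w

# Toy usage: Gaussian data whose dominant variance lies along axis 0.
rng = np.random.default_rng(1)
cov = np.diag([5.0, 1.0, 0.5])
stream = (rng.multivariate_normal(np.zeros(3), cov) for _ in range(20000))
w = oja_top_eigenvector(stream, dim=3)
```

Each update costs O(dim) time and memory, which is what makes such methods attractive in the data-laden regime: the bottleneck becomes drawing fresh samples, not optimizing over a fixed dataset.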

  • Program Officer
    Jack Snoeyink
  • Min Amd Letter Date
    9/3/2015
  • Max Amd Letter Date
    9/3/2015
  • ARRA Amount

Institutions

  • Name
    Toyota Technological Institute at Chicago
  • City
    Chicago
  • State
    IL
  • Country
    United States
  • Address
    6045 S. Kenwood Avenue
  • Postal Code
    60637-2902
  • Phone Number
    (773) 834-0409

Investigators

  • First Name
    Nathan
  • Last Name
    Srebro
  • Email Address
    nati@ttic.edu
  • Start Date
    9/3/2015

Program Element

  • Text
    Big Data Science & Engineering
  • Code
    8083

Program Reference

  • Text
    CyberInfra Frmwrk 21st (CIF21)
  • Code
    7433
  • Text
    Big Data Science & Engineering
  • Code
    8083