This EArly Concept Grant for Exploratory Research (EAGER) investigates new machine learning techniques for discovering sub-word units in speech for use in automatic speech recognition (ASR). The representation of words in terms of sub-word units is rarely learned from data, despite significant disagreement among linguists as to the sub-word unit inventory. This project represents exploratory work toward a larger goal of making all aspects of ASR learnable, drawing on scientific insights while being discriminatively trained.

In contrast with prior work, speech segments are clustered into units using discriminatively learned segmental similarities, rather than via dynamic time warping or hidden Markov models. Rather than presupposing phoneme-like units, multiple heterogeneous unit types are learned. The project also leverages multi-modal data (video, articulatory, and so on) to improve unit discovery by sharing information across modalities. In this exploratory work, the learned units are used in a discriminative model that rescores initial outputs from a standard phone-based recognizer, and the experiments focus on small- and medium-vocabulary recognition.

This project explores new ways of discovering the basic units of speech. Beyond improvements to speech recognition, this project has the potential for broad impact on other research areas involving sequences with segmental sub-structure (such as text, video, biological data, and financial data) or involving clustering. The results may also include new representations for the study of speech in linguistics and speech science.
From a societal perspective, making speech recognition more learnable will, in the long term, enable improved porting of the technology to under-served linguistic communities, which do not have the benefit of large data sets or other resources.