RI: Small: From acoustics to semantics: Embedding speech for a hierarchy of tasks

Information

NSF Award
1816627

Owner

Toyota Jidosha Kabushiki Kaisha

Award Id
1816627
Award Effective Date
8/15/2018
Award Expiration Date
7/31/2021
Award Amount
$ 449,984.00
Award Instrument
Continuing grant

Information

There is an increasingly large array of spoken language interfaces available, such as virtual assistants and telephone customer service interfaces. These technologies both (1) recognize the words spoken by a user and (2) extract actionable information, such as the topic of the user's query and the degree of match between the query and documents in a database. Such applications are typically treated as a pipeline of automatic speech transcription followed by text processing to extract the meaning. This project aims to develop technology that directly extracts meaning from speech, while using a variety of linguistic information along the way. This approach is intended to mitigate the effects of speech recognition errors, as well as to use all of the meaning-bearing information in speech, such as intonation. This work is expected to have long-term broad impact through technological advances, as well as immediate broad impact through the PI's involvement in local schools and mentoring for a diverse set of visiting students.

The technical goals of this work are (1) to do high-quality natural language processing directly on speech; (2) to seamlessly integrate domain knowledge into end-to-end speech models; (3) to improve the performance-vs.-resources tradeoff; and (4) to develop models for embedding arbitrary speech signals into meaning-bearing representations. The process of mapping from speech to meaning can be viewed as a hierarchy of tasks, from the most basic acoustic-phonetic tasks to the deepest semantic tasks. The experimental work will focus on two task hierarchies: a "retrieval" hierarchy including query-by-example search, keyword spotting, and semantic speech search; and a "recognition" hierarchy including phonetic recognition, word recognition, parsing, and topic identification. The main technical approaches to be developed include hierarchical multitask learning methods for incorporating domain knowledge and mitigating low-data settings, as well as new models for acoustic-semantic speech embedding.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Program Officer
Tatiana D. Korelsky
Min Amd Letter Date
8/2/2018
Max Amd Letter Date
9/13/2018
ARRA Amount

Institutions

Name
Toyota Technological Institute at Chicago
City
Chicago
State
IL
Country
United States
Address
6045 S. Kenwood Avenue
Postal Code
606372803
Phone Number
7738340409

Investigators

First Name
Karen
Last Name
Livescu
Email Address
klivescu@ttic.edu
Start Date
8/2/2018 12:00:00 AM

Program Element

Text
ROBUST INTELLIGENCE
Code
7495

Program Reference

Text
ROBUST INTELLIGENCE
Code
7495

Text
SMALL PROJECT
Code
7923
