CAREER: Personalized Speech Enhancement: Test-Time Adaptation Using No or Few Private Data

Information

NSF Award
2512987

Owner

UNIVERSITY OF ILLINOIS

Award Id
2512987
Award Effective Date
10/1/2024
Award Expiration Date
5/31/2026
Award Amount
$244,179.00
Award Instrument
Continuing Grant

Information

CAREER: Personalized Speech Enhancement: Test-Time Adaptation Using No or Few Private Data

Current general-purpose speech enhancement systems employ large models trained on big datasets of audio signals, and these models are too bulky to run on small personal devices. A personalized model can be a resource-efficient solution because it focuses on a particular user and a specific test environment, for which a smaller model architecture can be good enough. However, training a personalized model requires clean voice data from the test-time user in advance, and such data are not always available because of the user's privacy concerns or problems with recording. This CAREER project develops machine-learning methods to achieve the personalization goal while requiring no or few data samples from the test-time users. Because the project achieves the personalization goal in a privacy-preserving and resource-efficient way, it is a step towards a more available and affordable use of artificial intelligence for all members of society.

The project circumvents the lack of personal data in the context of personalized speech enhancement using no- and few-shot learning frameworks with help from adversarial and self-supervised learning. First, it verifies that a personalized system with reduced computational complexity can still compete with a generic model in speech enhancement performance. To this end, the training algorithm divides the potentially large model into multiple sub-modules, each of which handles a particular sub-problem (e.g., a particular user's utterance). If the sub-problems are defined to be mutually exclusive, test-time inference can be made efficient by using only the most suitable sub-module. Since the sub-module selection is done on noisy speech, it achieves personalization with no additional training on the test user's data. Second, the project explores a no-shot learning approach, in which the fundamental challenge lies in optimizing a machine learning model with no available target. To this end, an already-trained general-purpose model is fine-tuned for an unseen test environment using adversarial optimization. Third, the project handles the case when a small amount of the user's clean speech is available, which falls in the category of few-shot learning. The project overcomes data shortage via a self-supervised learning method that learns effective features from noisy speech data, which are more available than clean ones. That way, the model can be prepared for a subsequent fine-tuning step, which can be done with only a few clean user-specific speech utterances.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
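The sub-module selection idea described in the abstract (route noisy input to the single most suitable specialist, then run only that specialist) can be illustrated with a toy sketch. This is not the project's method: the centroid-based gate, the `select_specialist` and `enhance` helpers, and the scalar-gain "specialists" are all hypothetical stand-ins chosen only to show the control flow, assuming each specialist is associated with a cluster of noisy-speech features.

```python
import numpy as np


def select_specialist(noisy_features, centroids):
    """Pick the specialist whose centroid is nearest to the utterance summary.

    noisy_features: (T, D) array of per-frame features from NOISY speech.
    centroids:      (K, D) array, one hypothetical centroid per specialist.
    """
    summary = noisy_features.mean(axis=0)                     # (D,)
    distances = np.linalg.norm(centroids - summary, axis=1)   # (K,)
    return int(np.argmin(distances))


def enhance(noisy_features, centroids, specialists):
    """Run only the selected specialist; the other sub-modules stay idle."""
    k = select_specialist(noisy_features, centroids)
    return k, specialists[k](noisy_features)


# Toy setup (all values are illustrative assumptions, not real models).
rng = np.random.default_rng(0)
centroids = np.array([[0.0, 0.0], [5.0, 5.0]])
specialists = [lambda x: 0.5 * x, lambda x: 0.9 * x]  # stand-in "enhancers"
noisy = rng.normal(loc=5.0, scale=0.1, size=(20, 2))

chosen, enhanced = enhance(noisy, centroids, specialists)
```

Because the gate looks only at noisy features, no clean speech from the test user is needed to choose a specialist, and the test-time cost is that of one small sub-module plus the gate rather than the whole model.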

Program Officer
Tatiana Korelsky, tkorelsk@nsf.gov, 7032920000
Min Amd Letter Date
12/16/2024
Max Amd Letter Date
12/16/2024
ARRA Amount

Institutions

Name
University of Illinois at Urbana-Champaign
City
URBANA
State
IL
Country
United States
Address
506 S WRIGHT ST
Postal Code
618013620
Phone Number
2173332187

Investigators

First Name
Minje
Last Name
Kim
Email Address
minje@indiana.edu
Start Date
12/16/2024 12:00:00 AM

Program Element

Text
Robust Intelligence
Code
749500

Program Reference

Text
CAREER-Faculty Erly Career Dev
Code
1045

Text
ROBUST INTELLIGENCE
Code
7495
