SBIR Phase I: Automatic Language Detection Using Fast Wordspotting

Information

NSF Award
0441490

Owner

NEXIDIA INC

Award Id
0441490
Award Effective Date
1/1/2005
Award Expiration Date
6/30/2005
Award Amount
$ 0.00
Award Instrument
Standard Grant

Information

SBIR Phase I: Automatic Language Detection Using Fast Wordspotting

This Small Business Innovation Research Phase I project will perform the research and development necessary to integrate additional information gathered from an existing phonetic wordspotting technology into a language and dialect identification system, thereby improving the identification system. The research objective is to use Nexidia's existing wordspotting technology to improve a state-of-the-art language identification system. Wordspotting is the technique of searching audio for a word or phrase; it returns a set of timestamps where the word or phrase might have occurred, along with a confidence score for each timestamp. State-of-the-art language identification systems are currently based on Gaussian Mixture Models and phoneme statistics of each candidate language. They cannot use full speech recognition for computational reasons. Wordspotting, however, is lightweight, needing only a fraction of a CPU. If a list of several thousand common words and phrases is generated, it is very likely that speech more than a few seconds long will contain an item from this list. This project therefore proposes to begin with a state-of-the-art language identification system and augment it with such a search for each candidate language. The expected result is a language identification system capable of outperforming current state-of-the-art systems.

The ability to automatically classify which language is being spoken in a segment of speech would be a highly desirable feature in many speech communications systems. The proposed method is an extension of state-of-the-art systems, so the current state of the art serves as the performance baseline, and it is probable that the proposed research will yield better classification accuracy than is currently reported in the literature. If better accuracy is achieved, the proposed structure could become a standard. Further, no commercial product is available at this time to perform language classification; existing systems are all in the research lab. Were the proposed research to be even moderately successful, a new class of commercial offering would emerge. Possible applications include routing, monitoring, and quality assurance in call centers; data mining and intelligence applications; and selecting the proper speech recognition system for the detected language. Call centers could automatically route incoming calls to appropriate customer service representatives (CSRs), and surveillance operations could add filtering criteria to their intercepted records. Integrating this feature with the original functionality of fast phonetic keyword spotting would greatly enhance data-mining capability.
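
The abstract does not specify how the wordspotting results would be combined with the baseline identifier. As a rough illustration only, the sketch below assumes one simple possibility: each candidate language's baseline score is increased by the summed confidence of wordspotting hits from that language's word list. The names `Hit`, `fuse_scores`, and `identify_language`, along with the `weight` and `threshold` parameters, are hypothetical and are not taken from the award record or from any Nexidia product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One wordspotting result: a putative occurrence of a term in the audio."""
    term: str
    time_s: float       # timestamp of the putative occurrence, in seconds
    confidence: float   # confidence score, assumed here to lie in [0, 1]


def wordspot_evidence(hits, threshold=0.5):
    """Sum the confidences of hits at or above the (assumed) threshold."""
    return sum(h.confidence for h in hits if h.confidence >= threshold)


def fuse_scores(baseline, hits_by_language, weight=1.0, threshold=0.5):
    """Add weighted wordspotting evidence to each language's baseline score.

    baseline: dict mapping language code -> baseline score (higher is better),
              e.g. a log-likelihood from a GMM/phoneme-statistics system.
    hits_by_language: dict mapping language code -> list of Hit objects found
              when searching that language's common-word list.
    """
    fused = {}
    for lang, base_score in baseline.items():
        hits = hits_by_language.get(lang, [])
        fused[lang] = base_score + weight * wordspot_evidence(hits, threshold)
    return fused


def identify_language(baseline, hits_by_language, weight=1.0, threshold=0.5):
    """Return the language code with the highest fused score."""
    fused = fuse_scores(baseline, hits_by_language, weight, threshold)
    return max(fused, key=fused.get)


if __name__ == "__main__":
    # Made-up numbers for illustration; not results from the project.
    baseline = {"en": -1.0, "es": -0.8}
    hits = {
        "en": [Hit("hello", 0.4, 0.9), Hit("thank you", 1.2, 0.7)],
        "es": [Hit("hola", 0.4, 0.3)],
    }
    print(identify_language(baseline, hits))
```

In this toy case the baseline alone slightly prefers "es", while two confident English hits shift the fused decision to "en". A real system would likely learn the fusion weights from held-out data instead of fixing them by hand.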

Program Officer
Sara B. Nerlove
Min Amd Letter Date
11/5/2004
Max Amd Letter Date
3/2/2005
ARRA Amount

Institutions

Name
NEXIDIA INC.
City
ATLANTA
State
GA
Country
United States
Address
PIEDMONT RD NE BUILDING 2 STE 40
Postal Code
303051567
Phone Number
4044957239

Investigators

First Name
Jon
Last Name
Arrowood
Email Address
jarrowood@nexidia.com
Start Date
11/5/2004 12:00:00 AM

FOA Information

Name
Computer Science
Code
912

Program Element

Text
SMALL BUSINESS PHASE I
Code
5371

Program Reference

Text
ADVANCED SOFTWARE TECH & ALGOR
Code
9216

Text
HIGH PERFORMANCE COMPUTING & COMM
