SBIR Phase I: Automatic Language Detection Using Fast Wordspotting

Information

  • NSF Award
  • 0441490
Owner
  • Award Id
    0441490
  • Award Effective Date
    1/1/2005
  • Award Expiration Date
    6/30/2005
  • Award Amount
    $ 0.00
  • Award Instrument
    Standard Grant

This Small Business Innovation Research Phase I project will perform the research and development needed to integrate additional information from an existing phonetic wordspotting technology into a language and dialect identification system, thereby enhancing the identification system. The research objective is to use Nexidia's existing wordspotting technology to improve a state-of-the-art language identification system. Wordspotting searches audio for a word or phrase and returns a set of timestamps where it might have occurred, along with a confidence score for each timestamp. Current state-of-the-art language identification systems are based on Gaussian Mixture Models and on phoneme statistics of each candidate language. They cannot use full speech recognition for computational reasons. Wordspotting, however, is lightweight, needing only a fraction of a CPU. Given a list of several thousand common words and phrases, it is very likely that speech more than a few seconds long will contain an item from the list. This project therefore proposes to begin with a state-of-the-art language identification system and augment it with such a search for each candidate language. The expected result is a language identification system that outperforms current state-of-the-art systems.

The ability to automatically classify which language is being spoken in a segment of speech would be a highly desirable feature in many speech communications systems. The proposed method extends state-of-the-art language identification systems, so the current state of the art serves as the performance baseline, and the proposed research will probably yield better classification accuracy than is currently reported in the literature. If better accuracy is achieved, the proposed structure could become a standard.

Further, no commercial product for language classification is available at this time; existing systems all remain in research labs. Were the proposed research even moderately successful, a new class of commercial offering would emerge. Possible applications include routing, monitoring, and quality assurance in call centers; data mining and intelligence applications; and selection of the appropriate speech recognition system. Call centers could automatically route incoming calls to appropriate customer service representatives, and surveillance operations could add filtering criteria to their intercepted records. Integrating this feature with the original fast phonetic keyword spotting functionality would greatly enhance data-mining capability.
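
The augmentation described in the abstract (a baseline language identifier combined with per-language wordspotting results) can be illustrated with a minimal sketch. This is not Nexidia's method, which the abstract does not specify. The `Hit` record, the linear fusion rule, the `weight` and `min_confidence` parameters, and all example scores are assumptions introduced purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One wordspotting result (hypothetical record layout)."""
    language: str      # candidate language whose word list produced the hit
    term: str          # word or phrase that was searched for
    time_s: float      # timestamp where the term might have occurred
    confidence: float  # wordspotter confidence, assumed to lie in [0, 1]


def fuse_scores(baseline, hits, weight=0.5, min_confidence=0.6):
    """Linearly combine baseline scores with wordspotting evidence.

    baseline: dict mapping language -> score from the baseline identifier.
    hits:     iterable of Hit objects from searching each language's word list.
    The linear rule and both parameters are illustrative assumptions.
    """
    # Sum the confidence of sufficiently confident hits per language.
    evidence = defaultdict(float)
    for hit in hits:
        if hit.confidence >= min_confidence:
            evidence[hit.language] += hit.confidence

    # Normalize the evidence so it is comparable across languages.
    total = sum(evidence.values())
    fused = {}
    for lang, score in baseline.items():
        spot = evidence.get(lang, 0.0) / total if total > 0 else 0.0
        fused[lang] = (1.0 - weight) * score + weight * spot
    return fused


def identify(baseline, hits, **kwargs):
    """Return the language with the highest fused score."""
    fused = fuse_scores(baseline, hits, **kwargs)
    return max(fused, key=fused.get)


if __name__ == "__main__":
    # Made-up scores: the baseline alone slightly prefers Spanish.
    baseline = {"en": 0.40, "es": 0.45, "fr": 0.15}
    hits = [
        Hit("en", "thank you", 1.2, 0.9),
        Hit("en", "please", 3.4, 0.8),
        Hit("es", "gracias", 2.0, 0.7),
        Hit("fr", "merci", 2.5, 0.3),  # below min_confidence, ignored
    ]
    print(identify(baseline, hits))
```

In this made-up example the wordspotting evidence shifts the decision from the baseline's preferred language to the one with the most confident word hits. With no hits at all, the fused ranking matches the baseline ranking.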

  • Program Officer
    Sara B. Nerlove
  • Min Amd Letter Date
    11/5/2004
  • Max Amd Letter Date
    3/2/2005
  • ARRA Amount

Institutions

  • Name
    NEXIDIA INC.
  • City
    ATLANTA
  • State
    GA
  • Country
    United States
  • Address
    PIEDMONT RD NE BUILDING 2 STE 40
  • Postal Code
    303051567
  • Phone Number
    4044957239

Investigators

  • First Name
    Jon
  • Last Name
    Arrowood
  • Email Address
    jarrowood@nexidia.com
  • Start Date
    11/5/2004 12:00:00 AM

FOA Information

  • Name
    Computer Science
  • Code
    912

Program Element

  • Text
    SMALL BUSINESS PHASE I
  • Code
    5371

Program Reference

  • Text
    ADVANCED SOFTWARE TECH & ALGOR
  • Code
    9216
  • Text
    HIGH PERFORMANCE COMPUTING & COMM