ITR-Collaborative Research: Development and Evaluation of a Hybrid Concatenative/Rule-Based Visual Speech Synthesis System

Information

  • Award Id
    0312434
  • Award Effective Date
    7/15/2003
  • Award Expiration Date
    6/30/2007
  • Award Amount
    $216,822.00
  • Award Instrument
    Standard Grant

This project's goal is to develop a synthetic talking face. Humans developed sophisticated abilities to perceive and integrate auditory and visual (AV) speech information long before they were required to read printed text presented by computers. Seeing as well as hearing speech reduces cognitive workload and improves comprehension over hearing the talker alone. Realizing the advantages of AV speech for human-computer interaction requires synthesizing visual speech, thereby providing an unlimited supply of visual speech images without having to pre-record data. The approach here is to drive optical speech synthesis with speech acoustics: computational methods obtain models of the transformation from acoustics to optics. The method capitalizes on the speech-production coarticulatory information captured by diphones to produce naturalistic visual speech images, and is applied directly to natural acoustic speech features to obtain coordination between the acoustic and optical signals. The synthesized visual speech is based on a texture-mapped wire-frame model. A natural speech corpus on which to base the synthesis is being obtained via simultaneously recorded 3-D optical, audio, and video data. Synthesis development is guided by human perceptual testing. The DVD-archived corpus will be disseminated.

The project will lead to expanded access to information and improved acquisition of knowledge by diverse groups of individuals, for example: children still acquiring literacy skills; adults with inadequate literacy; individuals using a second language; and individuals with hearing losses who rely on audiovisual speech. Results will be disseminated broadly through professional outlets. Graduate and undergraduate students will participate.
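The acoustics-to-optics transformation described above could be sketched, in greatly simplified form, as a regression learned from frame-aligned acoustic and optical features recorded in the corpus. The sketch below is illustrative only, not the project's actual method: the linear form, the ridge regularizer, and all function and variable names are assumptions, and the synthetic data merely stands in for a recorded 3-D optical/audio corpus.

```python
# Minimal sketch (assumed, not the project's code): learn a linear map
# from acoustic feature frames to optical (face-model) parameter frames,
# then drive visual synthesis directly from unseen natural acoustics.
import numpy as np


def fit_acoustic_to_optical(A, O, ridge=1e-3):
    """Solve for W such that O ~= A @ W (regularized least squares).

    A: (frames, n_acoustic) acoustic features, e.g. spectral envelopes
    O: (frames, n_optical)  optical features, e.g. 3-D marker coordinates
    ridge: small regularizer for numerical stability (hypothetical choice)
    """
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + ridge * np.eye(n), A.T @ O)


def synthesize_optical(A_new, W):
    """Predict optical trajectories for new acoustic input frames."""
    return A_new @ W


# Toy demonstration with synthetic data standing in for a recorded corpus.
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 12))                      # acoustic frames
W_true = rng.normal(size=(12, 6))                   # hidden ground truth
O = A @ W_true + 0.01 * rng.normal(size=(200, 6))   # noisy optical frames

W = fit_acoustic_to_optical(A, O)
O_hat = synthesize_optical(A, W)
print(float(np.max(np.abs(O_hat - O))))
```

In the actual system the mapping would be trained per diphone segment to preserve coarticulatory detail, and the predicted optical parameters would drive the texture-mapped wire-frame face model.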

  • Program Officer
    Ephraim P. Glinert
  • Min Amd Letter Date
    7/16/2003
  • Max Amd Letter Date
    1/9/2006
  • ARRA Amount

Institutions

  • Name
    House Ear Institute
  • City
    Los Angeles
  • State
    CA
  • Country
    United States
  • Address
    2100 West Third Street
  • Postal Code
    90057-1922
  • Phone Number
    (213) 483-4431

Investigators

  • First Name
    Sigfrid
  • Last Name
    Soli
  • Start Date
    10/6/2004
  • End Date
    1/9/2006
  • First Name
    Edward
  • Last Name
    Auer
  • Email Address
    eauer@gwu.edu
  • Start Date
    7/24/2003
  • End Date
    10/6/2004
  • First Name
    Lynne
  • Last Name
    Bernstein
  • Email Address
    lbernste@gwu.edu
  • Start Date
    1/9/2006

FOA Information

  • Name
    Information Systems
  • Code
    104000
  • Name
    Human Subjects
  • Code
    116000

Program Element

  • Text
    ITR SMALL GRANTS
  • Code
    1686

Program Reference

  • Text
    ADVANCED SOFTWARE TECH & ALGOR
  • Code
    9216
  • Text
    HIGH PERFORMANCE COMPUTING & COMM