1510563(Stepp) & 1509791 (Koch Fager)<br/><br/>This work will develop and evaluate a system to allow individuals with unintelligible speech due to severe paralysis to control a speech synthesizer that includes prosody (changes in the pitch, loudness, and duration in speech that convey meaning). This advancement to the synthetic speech and the ease of its control by users will facilitate improved functionality of clinical communication systems, thus improving the quality of life of users. Natural and intelligible speech production in these individuals will increase their ability to participate actively in society and empower them to self-advocate for their own medical management.<br/><br/>The research objective of this proposal is to test the hypothesis that providing users of alternative and augmentative communication (AAC) with a method for prosodic control will result in speech synthesis that is more natural to listeners and provides greater function to users. Up to 1.2% of the population is unable to meet daily communication needs using typical speech due to stroke or other neurological injury, requiring AAC to meet their communication needs. Their quality of life is strongly dependent on access to this communication, both for social interaction as well as to relay information about urgent medical needs. The most advanced AAC devices incorporate speech synthesis, allowing the users to communicate orally with others. However, the resulting synthetic speech is both unnatural and difficult for others to understand, and is often described as "robotic". Specifically, synthetic speech does not vary in pitch, loudness, or rhythm, the prosodic features utilized in typical speech to relay emotional state, utterance form (statement vs. question), irony, and emphasis.<br/>Asking AAC users to control each of these dimensions individually would result in an intractably slow and complex system, an unacceptable burden for individuals who already have considerably reduced communication rates. Instead, this project will leverage the fact that typical speech predictably uses these prosodic markers (pitch, loudness, rhythm) in concert. A novel AAC interface will be developed to allow users to modify the overall "stress" of synthetic speech output as a single dimension, in order to provide easily controlled, natural, and intelligible speech synthesis. The co-PIs will use their combined expertise in speech technology, clinical application of AAC, and real-time control of human-machine-interfaces to enable essential advancements in AAC technology to achieve three goals. In Research Goal 1, a multi-stress speech bank for concatenative speech synthesis will be created via a novel interactive procedure in which speech productions of healthy speakers are "misunderstood", thus prompting speakers to naturally emphasize specific target sounds in their repeated responses. This will result in a bank of triphones (sounds with a specific left and right context, based on surrounding sounds) with all potential combinations of sounds and stresses. Research Goal 2 is to develop an AAC interface that allows users to select phonemes (individual sounds of speech) using two-dimensional cursor control (e.g., head-tracking, eye-tracking) in which the stress of individual phonemes will be based on cursor dwell time. In Research Goal 3, the functionality of the AAC interface will be evaluated by testing its effect on the naturalness of communicative interactions.