Speech impairment or loss due to paralysis, stroke, or injury affects thousands of individuals each year, often accompanied by loss of mobility. For such individuals, in-person learning sessions with a speech or audio therapist/expert are less frequent than needed because of mobility constraints, limited medical coverage, and cost. It is very important for such individuals to receive expert guidance during their recovery and to practice the exercises given by the experts. Today, the expert/therapist has very limited means of monitoring learners' progress between visits.
In a paralytic stroke, the degree of damage in the brain determines the impact on sensory and neural pathways. Speech and audio therapy is used to recover speech, along with physical therapy to recover limb movements. Existing systems for speech therapy are quite rigid, non-adaptive, and expensive. One recent application, iSwallow, which is available for Apple iPhones to assist in speech therapy, is a positive development in this space.
However, these existing applications are not interactive in a multi-modal format, and they are not integrated with the entire therapy/learning cycle. There is no learning platform where multiple Lessons and practice sessions are recorded for later review and evaluation.
This invention is an Integrated Multi-Modal Interactive Framework for Speech Therapy.
This invention provides a framework in which lessons are prepared and recorded by the expert for the Learner to practice by interacting via multi-modal interfaces. The Learner's practice is also recorded for review by the expert and the Learner.
Further, this invention provides the platform on which learning sessions are created with differing levels of multi-modal interaction, complexity, and game-playing, to engage the Learner and enhance the learning experience. The role of the expert/therapist remains paramount; hence this invention is intended to provide a framework to assist the expert (speech therapist).
In the present invention, multi-modal interfaces provide mechanisms for Learners to interact while practicing lessons: simple and intuitive visual cues assist in learning audio/speech, and tactile or mouse/pointer interfaces allow direct interaction. As the individual progresses, game-playing exercises of increasing complexity are constructed from short fragments of audio, to engage the Learner and provide additional feedback. To overcome the limitations of individual devices, this framework provides integration with a common repository and compute environment, such as a public or private cloud, and software to transfer recorded lessons and practice sessions between devices and the cloud. Such information is transferred using well-known, widely supported technologies such as HTTP and TCP.
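As a rough illustration of the transfer step, the sketch below prepares an HTTP upload of one recorded practice session using Python's standard library. The repository address, the path layout, and the `build_upload_request` helper are hypothetical and chosen only for illustration; the framework does not prescribe a particular endpoint or API. The request is constructed but not sent.

```python
import urllib.request

# Hypothetical repository address; any public or private cloud endpoint could stand in.
REPOSITORY_URL = "https://repository.example.com"


def build_upload_request(learner_id: str, session_name: str, audio: bytes) -> urllib.request.Request:
    """Prepare (but do not send) an HTTP PUT carrying one recorded session."""
    # Assumed path layout: one folder per learner, one file per session.
    url = f"{REPOSITORY_URL}/learners/{learner_id}/sessions/{session_name}.wav"
    return urllib.request.Request(
        url,
        data=audio,
        headers={"Content-Type": "audio/wav"},
        method="PUT",
    )


# Placeholder bytes stand in for a real recording.
request = build_upload_request("learner-01", "lesson-3-practice-2", b"RIFF....WAVE")
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual transfer against a real server; retrieval from the repository would use a corresponding GET request.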
This invention also has application for non-speech-impaired individuals for learning a new language or improving proficiency in a foreign language, as well as for language assistance while traveling in a foreign land.
The drawings enclosed within this document are described briefly in relation to the text.
The following description is presented to enable any person skilled in the art to make and use the invention, and is provided in the context of particular applications of the invention and their individual requirements. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art and the general principles described herein may be applied to other embodiments and applications without departing from the spirit and scope of the present invention. Thus, the present invention is not intended to be limited to the embodiments shown, but to be accorded the widest scope consistent with the principles and properties disclosed herein.
This invention describes an interactive multi-modal framework to assist in speech and audio therapy, with mechanisms for visual rendering, tactile manipulation of recordings, feedback, and game-playing.
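One simple way to picture the visual rendering of audio is an amplitude envelope drawn as bars. The sketch below is a minimal example under stated assumptions: the choice of an RMS envelope, the function names `amplitude_envelope` and `render_bars`, and the synthetic tone are all illustrative and are not specified by the invention, which may use any visual form.

```python
import math


def amplitude_envelope(samples, frame_size):
    """Return the RMS amplitude of each consecutive frame of samples."""
    envelope = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        envelope.append(rms)
    return envelope


def render_bars(envelope, width=20):
    """Turn an envelope into text bars, scaled so the loudest frame fills `width`."""
    peak = max(envelope) or 1.0  # avoid dividing by zero for silence
    return ["#" * round(width * value / peak) for value in envelope]


# Synthetic stand-in for a recorded fragment: a 220 Hz tone that fades in.
sample_rate = 8000
samples = [
    (n / sample_rate) * math.sin(2 * math.pi * 220 * n / sample_rate)
    for n in range(sample_rate)
]

for bar in render_bars(amplitude_envelope(samples, frame_size=1000)):
    print(bar)
```

A Learner's recording and the Guide's recording rendered this way can be placed side by side, so that differences in loudness and timing are visible at a glance.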
In the following description, Guide refers to the Expert or Therapist or Teacher, and Learner refers to the Student or Patient.
The environment of the framework includes:
The framework includes:
A typical workflow for speech therapy using this framework is described below to illustrate how each mechanism in the framework works in conjunction with the other mechanisms.
The Guide examines how the Learner has adjusted the audio fragment and compares it with the Guide's own audio, to infer how the Learner has progressed in this practice session. The Guide annotates these sessions as well.
The Learner and Guide meet in person to review practice sessions and for in-person speech therapy.
The Learner and Guide review recent and past lessons and practice sessions, either together or each at their own convenience.
For the more advanced Learner, the Guide prepares Lessons of greater complexity, using longer speech segments to represent words and sentences. The Guide also uses the framework to construct simple games.
Lessons and practice sessions are archived for safe-keeping and evaluations that span multiple months.
For a distant Learner, a live internet-streaming session is used by the Guide and Learner, in lieu of a face-to-face in-person session.
1. Capture and record audio segments as individual lessons.
2. Mechanism for the Learner to vocalize audio in an attempt to match the Guide's audio, and then record it alongside the Guide's audio.
3. Mechanism to collect audio practice sessions and convert them to a visual rendering for the Speech or Audio Guide to review.
4. Mechanisms for the Learner to manipulate the visual form to generate associated audio. This is intended as a feedback mechanism to assist the Learner in distinguishing related sounds.
5. Mechanism for the Learner to record manipulated audio and visuals;
6. Mechanism for each of these to be replayed.
7. Mechanism to group recorded speech fragments and visuals by criteria such as Lesson number, Date/Time, Practice session count, and so on. In one embodiment of this framework, an online filing system is presented to assist in storage and retrieval of Lessons and practice sessions, by Date, patient and other criteria.
8. Mechanism to share via upload/download over the computer network, for non-immediate feedback from the Guide. In one embodiment of this mechanism, an upload interface is presented to the user to save a recording or session into a common repository, and to retrieve chosen items from the common repository. The common repository is made available via this framework using a public or private cloud.
9. Mechanism to share via live streaming over a computer network, for immediate feedback from the Guide. In one embodiment of this mechanism, an internet link between the Learner and Guide is established to conduct a live session without requiring them to be co-located.
10. Mechanism to retain the sequence of Lessons and practice sessions for ongoing reviews, to track progress over time. Typically the Learner would practice such a sequence of sessions; the Guide would evaluate the Learner's progress in the recorded sessions and provide further instructions to refine or repeat some of the sessions.
11. Augmented lesson and session storage and retrieval based on cloud-based technologies.
12. Augmented compute resources to process recordings for feature extraction, manipulation, rendering and adaptation, based on computational resources of a private or public cloud. In one embodiment of this framework, additional compute resources are made available to offload the processing from the handheld device or computer, such that processing of recordings for feature extraction, manipulation, rendering, adaptation is done using compute resources from a public or private cloud.
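The grouping and sequencing mechanisms in the list above (items 7 and 10) can be sketched as a small in-memory catalog. This is only an illustration under stated assumptions: the `Recording` fields, the helper names `group_by` and `practice_sequence`, and the sample entries (including their dates and file names) are invented for the example and do not describe the actual filing system of any embodiment.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Recording:
    """One stored item; the field names are illustrative only."""
    learner: str
    lesson_number: int
    practice_count: int  # 0 marks the Guide's original lesson recording
    recorded_on: date
    filename: str


def group_by(recordings, key):
    """Group recordings by any criterion, e.g. lesson number or date."""
    groups = defaultdict(list)
    for rec in recordings:
        groups[key(rec)].append(rec)
    return dict(groups)


def practice_sequence(recordings, learner, lesson_number):
    """Return one Learner's recordings for a lesson in practice order."""
    matches = [
        r for r in recordings
        if r.learner == learner and r.lesson_number == lesson_number
    ]
    return sorted(matches, key=lambda r: r.practice_count)


# Made-up sample entries, deliberately out of order.
catalog = [
    Recording("learner-01", 1, 2, date(2015, 10, 14), "l1_p2.wav"),
    Recording("learner-01", 1, 0, date(2015, 10, 12), "l1_guide.wav"),
    Recording("learner-01", 2, 1, date(2015, 10, 14), "l2_p1.wav"),
    Recording("learner-01", 1, 1, date(2015, 10, 13), "l1_p1.wav"),
]

by_date = group_by(catalog, key=lambda r: r.recorded_on)
sequence = practice_sequence(catalog, "learner-01", 1)
print([r.filename for r in sequence])
```

Keeping the practice count on each record lets the Guide replay a lesson's attempts in order when tracking progress over time, while the same records can be regrouped by date or lesson for retrieval.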
1. Provisional Patent Application Number 62/285,260
Number | Date | Country
---|---|---
62285260 | Oct 2015 | US