The present invention relates to an interactive system and method for learning a select new language.
Our world is a multi-lingual world. With continued globalization and cross-border collaboration, the ability to speak more than one language is becoming increasingly important to succeed at both international and national levels. Communicating clearly and effectively in someone's native language not only reduces communication errors, but also improves efficiency and productivity.
Language learning computer applications (“apps”) have become ubiquitous as learners increasingly look to technological platforms to facilitate language learning versus in-person tutoring, for both convenience and cost. While these apps can be effective for basic language learning, they rely on visual prompts and user interaction with display screens. Such apps do not provide for interactivity and, most importantly, real-time actionable correction and feedback in the same way that in-person tutoring can provide. Therefore, there is a need for a system and method for learning a new language that does not rely on visual prompts or display screen interaction, and provides interactivity and real-time actionable correction and feedback.
A system and method for assisting a user in learning a targeted non-native language is disclosed according to an embodiment of the present invention. In one embodiment of the present invention a system comprising one or more processors executes instructions stored on a computer-readable medium. The executed instructions cause the system to provide the user with an audible presentation of a word or a phrase in the targeted non-native language, prompting the user to respond audibly, thereby producing speech data. The system captures the speech data and converts the speech data into text data using a speech recognition system that analyzes the speech data. The system then evaluates the text data by comparing text characters in the text data to anticipated text data contained in a database and calculating the number of incorrect characters to determine the accuracy of the evaluated text data. The evaluated text data is converted back into an audio file with a text-to-speech conversion subsystem. The system then reads back the audio file to the user, thereby providing audible feedback to the user relating to the accuracy of the evaluated text data.
In an embodiment of the present invention a system for interactive language learning includes an audio input device, an audio to text converter coupled to the audio input device, a processor coupled to the audio to text converter, a predetermined set of instructions on a storage medium and readable by the processor, a speech generator coupled to the processor, and an audio output device coupled to the speech generator. A word or phrase spoken in a select language is detected by the audio input device and converted to a corresponding input electrical signal by the audio input device, then further converted to corresponding input text by the audio to text converter. The processor analyzes and evaluates the input text in comparison to predetermined reference text representing the correct pronunciation of the word or phrase in the select language, the processor outputting to the speech generator a text analysis evaluation of the comparison. The speech generator provides to the audio output device an output electrical signal corresponding to the text analysis, and the audio output device produces an audio signal corresponding to the text analysis.
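The capture, recognition, evaluation, and read-back loop summarized above can be sketched as follows. This is a minimal illustration only: the recognition and synthesis stages are stubbed out, and all function names here are assumptions for illustration, not part of the disclosure.

```python
# Minimal sketch of the pipeline: capture speech -> convert to text ->
# count incorrect characters against the anticipated answer -> read back.
# recognize() and synthesize() stand in for the audio-to-text converter
# and speech generator; real implementations would wrap actual
# speech-recognition and text-to-speech engines.

def recognize(audio: bytes) -> str:
    """Stand-in for the audio to text converter: pretend the captured
    audio already decodes to the recognized text."""
    return audio.decode("utf-8")

def synthesize(text: str) -> bytes:
    """Stand-in for the speech generator / text-to-speech subsystem."""
    return text.encode("utf-8")

def count_incorrect(heard: str, expected: str) -> int:
    """Number of incorrect characters versus the anticipated text."""
    longer = max(len(heard), len(expected))
    return longer - sum(a == b for a, b in zip(heard, expected))

def lesson_step(audio: bytes, expected: str) -> tuple[bytes, int]:
    heard = recognize(audio)                  # speech -> text
    errors = count_incorrect(heard, expected) # accuracy evaluation
    return synthesize(heard), errors          # audio file for read-back

readback, errors = lesson_step(b"bonjour", "bonjour")
```

The read-back audio is generated from what was *recognized*, not from the reference text, which is what lets the user hear how their pronunciation was understood.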
The currently disclosed invention provides for an innovative and efficient system and method for learning a new language. The readback element of the currently-claimed system and method provides several advantages over the prior art. For example, it allows the user to receive immediate feedback, which in turn allows the user to correct their understanding and pronunciation accordingly. The readback element also provides a learning experience similar to that provided by in-person classroom lessons with the convenience of accessibility from any place at any time. Moreover, the audible or spoken interaction between the system and the user provides for hands-free interactivity, simplifying the learning process. It also reduces the need for physical interaction between the user and the system's input controls, which allows the user to multitask while learning a new language.
Further features of the present invention will become apparent to those skilled in the art to which the present invention relates from reading the following specification with reference to the accompanying drawings, in which:
A system and method for assisting a user in learning a targeted non-native language is disclosed according to an embodiment of the present invention. In one embodiment the system and method comprises one or more processors executing instructions stored on a computer-readable medium. The computer-readable medium may include permanent memory storage devices, such as computer hard drives or servers. Examples of machines and devices that utilize processors, memory, and instructions stored on computer-readable mediums include, but are not limited to, servers, computers, mobile devices, such as cellular telephones, and terminals.
Details of a non-limiting system 10 to facilitate learning a new language are shown in
Audio input device 12 may be any suitable transducer configured to convert audio signals to a corresponding input electrical signal, such as one or more microphones. Audio input device 12 may optionally include audio enhancing features in hardware and/or software form such as audio processors, noise limiters, compressors, equalizers, amplifiers, and filters. The input electrical signal may be in any analog or digital form readable by audio to text converter 14, and may be stored as an audio file in a suitable storage medium.
Audio to text converter 14 converts the electrical signal from audio input device 12 to corresponding input text in a form and format that can be recognized by processor 16. Converter 14 may be implemented in dedicated hardware, software operated on a generic platform, or a combination of hardware and software.
Processor 16 may be any suitable type of computing device including, without limitation, one or more central or distributed microprocessors, microcontrollers, or computers. Processor 16 may be implemented in dedicated hardware, software operated on a generic platform, or a combination of hardware and software.
Instructions 18 and database 20 may be in any form compatible with processor 16 including, without limitation, a computer-readable storage medium with a standard computing language or a proprietary or custom computing language stored thereon, as well as predetermined logic arrays and other hardware-only implementations of the instructions. As previously noted, the computer-readable medium upon which instructions 18 and database 20 are stored may include, without limitation, permanent memory storage devices, such as computer hard drives or servers. Portable memory storage devices such as USB drives and external hard drives may also be utilized.
In some embodiments database 20 is configured to collect user performance information such as correct answers, incorrect answers, and number of trials. This information helps construct the lesson flow. User performance information may be saved for future analysis, or used temporarily by system 10 during a lesson.
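A per-user performance record of the kind database 20 might collect can be sketched as a small data structure. The field names here are illustrative assumptions, not part of the disclosure.

```python
# Illustrative per-user performance record: correct answers, incorrect
# answers, and number of trials, as named in the text.
from dataclasses import dataclass

@dataclass
class UserPerformance:
    correct: int = 0
    incorrect: int = 0
    trials: int = 0

    def record(self, was_correct: bool) -> None:
        """Update the tallies after one evaluated answer."""
        self.trials += 1
        if was_correct:
            self.correct += 1
        else:
            self.incorrect += 1

perf = UserPerformance()
perf.record(True)
perf.record(False)
```

Such a record could either be persisted for future analysis or held only for the duration of a lesson, matching the two uses described above.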
Speech generator 22 receives from processor 16 predetermined signals resulting from the analysis performed by the processor and converts the signals to an output electrical signal representing speech. Speech generator 22 may be implemented in dedicated hardware, software operated on a generic platform, or a combination of hardware and software. The output electrical signal may be in any analog or digital form readable by audio output device 24, and may be stored as an audio file in a suitable storage medium.
Audio output device 24 receives the audio speech output electrical signal and acts as a transducer to convert the electrical speech output signal to an audio signal that can be perceived by a user of system 10. Audio output device 24 may be a transducer such as one or more speakers. Audio output device 24 may also include audio processing features such as amplifiers and filters.
The foregoing components of system 10 may be realized using discrete subsystems that are mechanically and electrically coupled together to form the system. Alternatively, some or all of the components of system 10 may be integrated together and placed on a common substrate such as a chassis or printed circuit assembly. Example system 10 configurations may include, without limitation, one or more of: servers; computers; mobile devices such as cellular telephones; vehicle audio and entertainment systems; “smart” speakers; “smart” televisions and other “smart” appliances; augmented reality (AR), virtual reality (VR) and cross reality (XR) devices such as goggles, headsets, glasses and other wearable intelligence; and terminals. In some embodiments of the present invention some portions of system 10 may be located remotely from the others. For example, processor 16, instructions 18 and database 20 may be located remotely and coupled to the other components of system 10 and in communication with the other components using any suitable devices, such as a wired or wireless transmitter-receiver arrangement.
With reference now to
Instructions 18, which can be stored on any suitable computer-readable medium, involve lessons for learning a non-native language. Each lesson may involve individual words. Alternatively, or additionally, each lesson may involve phrases. Furthermore, each lesson may involve tests or quizzes.
With reference now to
The user then repeats the word or phrase into audio input device 12 at s104, the audio input device capturing the user's speech data in an electrical input signal such as an audio file. Speech recognition system 14 converts the speech data into input text data at s106. The input text data is analyzed by processor 16 by turning the input text data into characters and comparing said characters to answers in database 20. Then, the accuracy of the input text data is determined by processor 16 at s108, resulting in evaluated text data.
Text to speech system 22 converts the evaluated text data into speech data at s110. The evaluated speech data is read back to the user by audio output device 24 at s112 in a computer-generated audio file. In addition, feedback may be provided as to whether the user's response was correct, incorrect or partially correct.
The user may interact with system 10 using voice commands via audio input device 12 and/or any suitable user input device 26 (
With reference to
The lesson begins at s102 by system 10 providing the user with a word or a phrase. The grade of difficulty of the word or phrase provided to the user in the target non-native language may depend on the level of expertise of the user. The expertise of the user may be classified as beginner, intermediate, or advanced. The user's expertise may be determined by processor 16 analyzing the accuracy of the user's responses or evaluated text data as the language lesson progresses. As the lesson advances, the grade of difficulty of the word or phrase provided may increase as the accuracy of the evaluated text data increases. Similarly, the grade of difficulty of the word or phrase provided may decrease as the accuracy of the evaluated text data decreases.
Alternatively, or in addition, the user may select their own expertise level; thus, selecting the grade of difficulty of the word or phrase provided. As the lesson progresses, the user may change their expertise level. Alternatively, or in addition, as the lesson progresses, the system 10 may prompt the user to adjust their expertise level to a higher or lower classification. Said prompt may be based on the accuracy of the evaluated text data.
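One way the expertise level could track response accuracy is sketched below. The promotion and demotion thresholds are illustrative assumptions; the beginner/intermediate/advanced classifications come from the text.

```python
# Hypothetical difficulty adjustment: promote the user's expertise level
# as response accuracy rises, demote it as accuracy falls. Thresholds
# (0.85 and 0.50) are illustrative, not specified in the disclosure.

LEVELS = ["beginner", "intermediate", "advanced"]

def adjust_level(current: str, accuracy: float) -> str:
    """accuracy is the fraction of recent answers judged correct."""
    i = LEVELS.index(current)
    if accuracy > 0.85 and i < len(LEVELS) - 1:
        return LEVELS[i + 1]   # promote to a harder grade
    if accuracy < 0.50 and i > 0:
        return LEVELS[i - 1]   # demote to an easier grade
    return current

# e.g. a beginner answering 90% correctly moves to intermediate
```

A system prompting the user to adjust their own level, as described above, could use the same accuracy signal to decide when to issue the prompt.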
Further, the grade of difficulty of the provided word or phrase may further depend on a complexity level determined by comparing the user's native language and the target language. The complexity level between the user's native language and the target language is determined based on several factors, including but not limited to, the similarity between the languages by comparing each language's root, syntax, and alphabet. Moreover, the complexity level classification between specific native/target languages combinations may be updated as data is collected from users' evaluated text data.
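One possible way to derive a native/target complexity level from the factors named above (shared root, alphabet, and syntax) is sketched here. The equal weighting and the type-1 to type-3 buckets are illustrative assumptions, not part of the disclosure.

```python
# Illustrative complexity classification between a native and a target
# language: each dissimilar factor (root, alphabet, syntax) raises the
# level, from type 1 (closely related) to type 3 (distant).

def complexity_level(same_root: bool, same_alphabet: bool,
                     similar_syntax: bool) -> int:
    differences = sum([not same_root, not same_alphabet,
                       not similar_syntax])
    return max(1, differences)  # type 1 .. type 3

# e.g. two languages sharing root, alphabet, and similar syntax -> type 1
```

As the text notes, such a classification could be refined over time as evaluated text data is collected from users of specific native/target language combinations.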
As illustrated in
As explained above, once the word or phrase is provided to the user, system 10 will ask the user to answer or repeat the word or phrase, generating speech data at s104. The speech recognition system 14 converts the speech data into text data at s106. The text data is converted to text characters and its accuracy is evaluated by processor 16.
The accuracy of the evaluated text data is determined by processor 16 of system 10 comparing the generated text characters with anticipated text data or answers stored in database 20. A user's answer may be classified as correct (
In one embodiment a user's answer or evaluated text data is considered to be correct (
The complexity level between the native language and the target language may also be considered when determining the accuracy of a user's answer or evaluated text data. As the complexity level increases, the number of accepted incorrect characters may increase. For example, the number of acceptable incorrect characters involving a type 3-complexity level may be double the number of acceptable incorrect characters involving a type 1-complexity level.
A user's answer or evaluated text data may be classified as partially correct (
Other factors may be considered when determining if a user's answer is partially correct. For example, the user's level of expertise, the target non-native language classification, grade of difficulty of the provided word or phrase, and the user's native language may be considered when determining the acceptable number of incorrect characters.
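The correct / partially correct / incorrect classification described above can be sketched as follows. The base tolerance of two incorrect characters is an illustrative assumption; the doubling of the tolerance at a type-3 complexity level follows the example in the text.

```python
# Sketch of answer classification by counting incorrect characters
# against the anticipated answer, with a tolerance that widens for
# more distant native/target language pairs.

def count_errors(answer: str, expected: str) -> int:
    """Number of incorrect characters versus the anticipated text."""
    longer = max(len(answer), len(expected))
    return longer - sum(a == b for a, b in zip(answer, expected))

def classify(answer: str, expected: str, complexity: int = 1,
             base_tolerance: int = 2) -> str:
    # per the example in the text, a type-3 complexity level accepts
    # double the incorrect characters of a type-1 level
    allowed = base_tolerance * 2 if complexity >= 3 else base_tolerance
    errors = count_errors(answer, expected)
    if errors == 0:
        return "correct"
    if errors <= allowed:
        return "partially correct"
    return "incorrect"
```

Other factors named above, such as the user's expertise level or the grade of difficulty of the word, could be folded in by adjusting `base_tolerance`.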
Once the text data is evaluated for accuracy, the evaluated text data is converted to evaluated speech data or audio file at s110 by text to speech system 22. The evaluated speech data is then read back to the user via audio output device 24 at s112. The electrical output signal audio file is based on what the system 10 “understood” from the user's speech data, e.g., the fidelity of the user's pronunciation of the word or phrase in comparison to the correct pronunciation of the word or phrase stored in database 20 and emitted at s102 by audio output device 24. The readback comprises a representation of how the spoken word or phrase provided by the user would be perceived by a speaker of the select target language. For example, an accent introduced by the user may affect the user's pronunciation of a word or phrase in the target language. Thus, the audio readback function of s112 provides the user with further understanding and feedback on how their answer is being perceived and evaluated by a speaker of the target language, and the user may change and correct their answer accordingly, if needed. This unique readback function provides the user with immediate feedback on how their answer was understood, which in turn allows the user to self-correct in real time as if they were interacting with a live tutor.
In some embodiments of system 10, when the user has to answer a question and does not speak for several seconds, system 10 may assist the user by speaking out loud via audio output device 24 the first several words of the answer. When the user speaks only the first part of a phrase, system 10 may acknowledge that the answer is partially correct, and then help by speaking the last part of the answer. System 10 is also able to stress the pronunciation of certain words, so the user can understand how to accentuate a word, or that a certain word needs to be used.
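The two assistance behaviors just described, speaking the opening words after a silence and completing a partially spoken phrase, can be sketched as simple string operations. Function names and the default word count are illustrative assumptions.

```python
# Illustrative hint helpers: an opening hint after a silence, and a
# completion hint when the user spoke only the first part of a phrase.

def opening_hint(expected: str, n_words: int = 2) -> str:
    """First several words of the expected answer, spoken as a prompt."""
    return " ".join(expected.split()[:n_words])

def completion_hint(spoken: str, expected: str) -> str:
    """If the user's words match the start of the expected phrase,
    return the remainder; otherwise return nothing."""
    spoken_words = spoken.split()
    expected_words = expected.split()
    if expected_words[:len(spoken_words)] == spoken_words:
        return " ".join(expected_words[len(spoken_words):])
    return ""
```

In a full system the returned text would be passed to speech generator 22 for audible output rather than handled as a string.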
As shown in
After a certain number of correct user answers, for example three to five correct answers, the lesson will continue with a test (
In one embodiment, a test may involve giving the user questions relating to the previously provided words or phrases, but in a different order or sequence. The user's test answers are evaluated in a similar manner to the words or phrases at the beginning of the lesson. For example, at the end of each test, if all of the user's test answers are correct, then a new lesson may be started. Alternatively, or in addition, if at the end of the test a certain number of answers are considered incorrect, for example three or more, then a new test may be automatically generated. Alternatively, or in addition, if at the end of the test a certain number of answers are considered incorrect, for example two or fewer, then a review lesson may be generated.
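The post-test routing just described follows the thresholds given in the text: all answers correct starts a new lesson, three or more incorrect answers triggers a new test, and otherwise a review lesson is generated. A sketch, with illustrative return values:

```python
# Sketch of lesson-flow routing after a test, per the thresholds in
# the text. The string labels are illustrative.

def next_step(incorrect: int) -> str:
    if incorrect == 0:
        return "new lesson"      # all test answers correct
    if incorrect >= 3:
        return "new test"        # three or more incorrect
    return "review lesson"       # one or two incorrect
```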
A review lesson (
If the user's answer or evaluated text data is considered to be incorrect (
If the user's answer or text data is considered to be partially correct (
“Correct answers” may also include alternate answers. Alternate answers comprise answers that do not match what was taught but are considered correct for the language being taught.
In some embodiments of the present invention system 10 may include gamification features to add to the user's enjoyment. For example, the user may earn and collect points and awards based on their performance. Users may also be linked together using any suitable communication devices to share information relating to earned points for the purpose of listing on a leaderboard available to one or more users.
In addition to the test words or phrases and readback discussed above, system 10 may provide a user with visual and/or aural information including, but not limited to, instructions, test results, suggestions for improvement, updates, system status, responses to user input and controls, error messages, gamification points and awards, and encouragement. In some embodiments of the present invention system 10 may initially present the information to the user in the user's native (or known) language, then gradually begin providing at least a portion of the information in the target language as the user becomes more proficient with the target language. In this way the user becomes more and more interactively immersed in the target language as the user's proficiency in the target language increases.
As described above, the currently disclosed invention provides a system and method for learning a new language. In some embodiments of the present invention the system 10 may be implemented in a mobile-enabled application, such as for a cellular telephone or tablet computer, wherein the interaction between the system and the learner is hands-free, increasing convenience and ease while imitating real-life learning interactions such as tutoring by providing immediate feedback by a readback function.
From the above description of the invention, those skilled in the art will perceive improvements, changes, and modifications in the invention. Such improvements, changes, and modifications within the skill of the art are intended to be covered.
This PCT application claims priority to U.S. Provisional Patent App. No. 63/046,748, filed on Jul. 1, 2020, herein incorporated by reference.
Filing Document: PCT/EP2021/068177; Filing Date: Jul. 1, 2021; Country: WO
Provisional Application Number: 63/046,748; Date: Jul. 2020; Country: US