The presently disclosed subject matter is directed towards audio-visual learning devices useful for learning subjects such as foreign languages, physics, and chemistry.
For many people, mastering a foreign language is an extremely difficult and challenging process. Not only can learning the vocabulary and grammatical rules of a foreign language be difficult, but developing the ability to fluently hear, comprehend, and speak a foreign language requires far more than mere rote recitation. Fluent speaking requires the ability to use a language in such a comfortable, natural way that processing the language does not intrude on the message being delivered. However difficult it may be, actually mastering a foreign language can be both personally and financially rewarding.
There are many different foreign language learning tools. Books, flash cards, audio recorders and tapes, video recorders and tapes, computers, classroom instruction, individual tutoring, and language immersion are a few examples of tools commonly used to learn a foreign language. Using many different language learning tools is highly beneficial because language learning is a cumulative process in which new skills are added to old, and in which the old skills become even more useful and natural when new skills are added. Whatever tools are used, they work best if the learner actively participates in the learning process. For example, while reading to oneself is helpful, reading out loud is better. While listening is helpful, writing out what was heard is better.
Learning a foreign language is not the only highly challenging learning activity. Medical and veterinary schools frequently teach materials that are both new and complex. Military science, physics, chemistry, architecture, and many other fields also require students to learn complex materials so thoroughly that the materials become intuitive and can be applied and manipulated to provide superior results. Such fields also make use of many of the same learning tools used to learn a foreign language.
Of the many different learning tools available, some are particularly well suited to individual study and instruction. Computers and audio-visual (AV) systems can be singled out as being particularly useful for individual study because they can be used at any time, they are highly flexible, they can be used without any embarrassment to the learner, and they have infinite patience.
Computer-based learning programs are usually designed to operate interactively. Furthermore, modern computer-based learning programs have the ability to implement and integrate AV systems, auxiliary data, and system controls into one common package. Some computer programs can play AV content while also providing additional data and enabling user feedback and interaction. However, such prior-art computer-based learning programs are usually not particularly realistic and impart somewhat limited new skills to the learner. Prior-art computer-based learning programs tend to be more useful for repeating already-known information or presenting information that can be easily obtained from other sources.
One of the more interesting ways of learning a foreign language is to watch movies, television programs, news reports, and similar content in the native language. This approach lends itself to “real-life” learning in which the foreign language flows naturally and in a normal context using ordinary dialects, inflections, speeds, tones, pauses, jargon, and other factors. Likewise, learning other materials could benefit from watching AV content that is supplemented by additional information and user interactions.
While learning using movies and other AV content is not new, in the prior art AV content was usually presented on a television or other AV imaging device using an AV player of some sort that was less than optimal for learning. For example, while standard AV players provide Forward-Reverse-Pause and Stop capabilities, such capabilities are far less than optimal for learning a foreign language, in which quickly and easily returning to a specific scene, dialog, or sentence, or skipping to the next desired scene, dialog, or sentence, would be highly useful. Furthermore, being able to play AV content in both a standard AV mode and an enhanced learning mode would be useful. In addition, enhanced “user controls” that enable playing, forwarding, fast forwarding, reverse, and fast reverse by scenes, dialogs, sentences, or words would be beneficial. An AV player having enhanced playback controls useful for learning could also benefit from incorporating a timer during periods of non-verbal content, along with a control that enables skipping of the non-verbal content.
Also useful would be an AV device that uses meta data: additional learning data that is presented along with AV content. Being able to select among alternative sets of meta data would be beneficial. An AV learning system that also allows dictation interactions and storing of learner progress information would be helpful.
The principles of the present invention provide for audio-visual learning devices for playing audio-visual content in a manner that is beneficial for learning. Such audio-visual learning devices include memory for storing audio-visual content and meta data as well as a media player having a display. The media player selectively images the AV content as well as user-selected meta data. The AV content can be played in a continuous play mode, which mimics a standard media player, or in an enhanced Stop-and-Go mode. When in continuous play mode the media player implements Forward-Reverse-Pause controls and, optionally, Stop controls for a user. The Forward and Reverse functionality may be present in the form of a slide/progress bar, as in state-of-the-art software media player applications; there, the user can drag the playback to the desired time location. In the Stop-and-Go mode the media player implements Continuous Play, Play Segment, Play Next Segment, and Navigation controls for a user. Beneficially the Pause control switches the media player from continuous play mode to Stop-and-Go mode, while the Continuous Play control switches the media player from Stop-and-Go mode to continuous play mode. The Play Segment control causes the media player to play a segment of the AV content. Beneficially, in the Stop-and-Go mode the AV content and meta data are played by segments.
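A minimal sketch of the two-mode behavior described above, assuming a simple state machine (the class and method names are illustrative, not taken from the disclosure):

```python
from enum import Enum, auto

class Mode(Enum):
    CONTINUOUS_PLAY = auto()
    STOP_AND_GO = auto()

class MediaPlayer:
    """Minimal mode-switching skeleton for the two playback modes."""

    def __init__(self):
        self.mode = Mode.CONTINUOUS_PLAY

    def pause(self):
        # Per the description, Pause switches from continuous play
        # into the segment-oriented Stop-and-Go mode.
        if self.mode is Mode.CONTINUOUS_PLAY:
            self.mode = Mode.STOP_AND_GO

    def continuous_play(self):
        # The Continuous Play control returns to standard playback.
        self.mode = Mode.CONTINUOUS_PLAY
```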
An audio-visual learning device can be implemented either completely or partially on a computer, such as a laptop computer, a desktop computer, or a tablet computer. Suitable input devices include mice, keyboards, touch screens, or audio input with an appropriate speech-to-text converter. An audio-visual learning device can also be implemented either completely or partially using a television, beneficially one with a remote control, or a game box. Usefully, an audio-visual learning device can make use of memory distributed over the internet.
Meta data can include segment numbers, time stamps, timers, subtitle tracks such as translations and closed-captions or other transcription of AV content, phonetic transcriptions, and additional information related to the AV content such as its difficulty. Meta data can be output in one or more text boxes and can be input using a keyboard, beneficially a soft keyboard such as those on touch screens. Special meta data controls such as segment selectors, meta data selectors, slide controls, skip controls, information controls, and typing controls are beneficial.
Preferably, AV content comprises a plurality of compressed data frames in which all segments start on an I-frame. Beneficially, starting segments are synchronized with conversation markers.
The advantages and features of the present invention will become better understood with reference to the following detailed descriptions and claims when taken in conjunction with the accompanying drawings, in which:
a) illustrates representative MPEG-4 frames without frame processing;
b) illustrates MPEG-4 frames after frame processing in accord with the principles of the present invention;
The principles of the present invention will be described hereinafter with reference to the accompanying drawings.
All publications mentioned herein are incorporated by reference for all purposes to the extent allowable by law. In addition, in the figures like numbers refer to like elements throughout. Additionally, the terms “a” and “an” as used herein do not denote a limitation of quantity, but rather denote the presence of at least one of the referenced items.
The principles of the present invention provide for novel, useful, and non-obvious Audio-Visual Learning Devices (hereinafter referred to individually as “AVLD”) that implement methods and systems for learning using Audio-Visual (AV) content. In some embodiments AV content is supplemented with additional useful learning information that is generically referred to as meta data. Examples of meta data would be subtitles for AV content or explanatory information presented in a text box (other examples of meta data are provided below). While the inventive methods and systems can be implemented using a wide variety of hardware, software, and firmware, any device or set of devices that implements the principles of the present invention will be generically referred to as an AVLD.
An AVLD will have a display for viewing AV content (and possibly meta data), a set of user inputs or controls for interacting with the AVLD, and memory to retain AV content and possibly meta data. Interactions with an AVLD include not only controlling the operation of the AVLD device but also inputting information such as foreign language text. User Inputs may be made by a keyboard, a touch screen, a computer mouse, a trackpad, a microphone, or by another input device.
Another system well suited for implementing an AVLD is a tablet computer 30, reference
It should be noted that the configuration with two physical devices exhibiting screens (here: the remote device 18 shown in
From the foregoing it should be obvious that AVLDs can be implemented in a very wide variety of hardware, software, and firmware configurations using existing and/or specialized hardware and software. It should be clearly understood that AVLDs can make use of internet and intranet connections. In fact, locating AV content and meta data and/or all or parts of the media player on one or several remote servers of some sort is highly beneficial, as it enables easy updating and distribution of content. Additionally, using wireless communications between remotes, mice, tablets, computers, and displays is also beneficial, as doing so reduces set-up and operational problems and enhances user convenience.
Most AVLDs in accord with the principles of the present invention will operate in at least two different modes: a “Continuous Play” mode and a “Stop-and-Go” Mode.
As shown in
Still referring to
Turning now to
It should also be understood that learning segments can overlap in time. For instance, segment n can be a contiguous piece of AV content corresponding to a sentence spoken by a first person in the audio-visual content. Segment n+1 can correspond to a sentence spoken by a second person in the audio-visual content, but that sentence can start while segment n is still in progress. Then segment n can be defined as the content covering the first person's speaking from beginning to end, which may include an early part of the second person's sentence. Also, the subsequent segment n+1 can then be defined as the content covering the second person's speaking, which may contain a late part of the first person's sentence. This is explained in greater detail with reference to
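The overlapping-segment definition can be captured with simple interval arithmetic. A sketch follows; the `Segment` and `overlaps` names and the timings are hypothetical illustrations:

```python
from dataclasses import dataclass

@dataclass
class Segment:
    index: int
    start: float  # seconds into the AV content
    end: float

def overlaps(a: Segment, b: Segment) -> bool:
    """True if two learning segments share any span of AV content."""
    return a.start < b.end and b.start < a.end

# Speaker B starts before speaker A has finished, so the two
# learning segments intentionally share the overlapping speech.
seg_n = Segment(index=0, start=10.0, end=14.5)
seg_n1 = Segment(index=1, start=13.8, end=17.2)
assert overlaps(seg_n, seg_n1)
```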
Still referring to
To play back segment n, a user presses a specialized user interface Play Segment control 52. This activates the Play Segment control 52, causing the AVLD 40 to transition to SG2 system state 54. While having a user 34 activate a control to initiate a state transition is contemplated, in some AVLDs there can be less explicit ways to trigger state transitions (such as a remote instructor control, automatic triggering, time delays, interaction with the context subtitles discussed below, etc.).
When in SG2 system state 54 the AVLD 40 automatically plays back segment n, beneficially until segment n ends. Upon reaching the end of segment n the AVLD 40 automatically transitions into SG3 system state 56. In SG3 system state 56 the AV content is paused at the end of segment n. That is, the last image before the subsequent learning segment n+1 is displayed until another AVLD 40 control input or signal is received.
Still referring to
In one embodiment, activating a left navigation triangle of the Navigation/skip control 62 will first cause a transition from segment n to segment n−1; that is, the former segment n−1 will become the new segment n, and the AVLD will make the previous learning segment active. Continued clicking of the left navigation triangle steps back one learning segment at a time, while continuous activation of the left navigation triangle of Navigation/skip control 62 causes faster and faster backward stepping of learning segments. Similarly, activating the right navigation triangle of the Navigation/skip control 62 will make a transition from segment n to segment n+1. Continued clicking of the right navigation triangle steps forward one learning segment at a time, while continuous activation of the right navigation triangle causes faster and faster forward stepping of learning segments. Once in SG1 system state 50, activating the Navigation/skip control 62 causes the same functioning as described above. Different versions of the Navigation/skip control 62 are possible, such as controls that allow larger skip distances to be accomplished with a single activation of the control. For instance, controls can be provided to skip to the beginning of the preceding or subsequent dialogue or scene.
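One plausible reading of the accelerating navigation behavior is a stride that grows with how long the control is held; the linear acceleration curve below is an assumption for illustration only:

```python
def step_segment(current: int, direction: int, hold_time: float,
                 total: int) -> int:
    """Step backward (-1) or forward (+1) through learning segments.

    hold_time is how long (seconds) the navigation triangle has been
    held down; longer holds take larger strides, giving the
    "faster and faster" stepping behavior.
    """
    stride = 1 + int(hold_time)  # assumed linear acceleration curve
    new_index = current + direction * stride
    return max(0, min(total - 1, new_index))
```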
It should be understood that the user controls available in the Stop-and-Go mode 44, such as the Navigation/skip controls 62 or the Play Segment control 52, may be present during the Continuous Play mode 42. They will, however, assume slightly different roles. If a Navigation/skip control 62 is activated while in Continuous Play mode 42, the playback position will skip to the beginning of the previous segment or the next segment, depending on the skip direction. Playback will then continue without interruption at the new playback position. Similarly, if the Stop-and-Go mode 44 is in progress and a control targeted for Continuous Play mode 42 is activated, such as a conventional Fast Forward or Reverse, or a user moving a slide/progress bar, the playback position is relocated to a new segment at the desired location, while the system stays in the Stop-and-Go mode 44.
The AVLD 40 is further capable of additional functionality. For example
Once the user 34 has typed the textual information he/she has heard in segment n, he/she submits the typed information for evaluation by pressing a button such as “Enter” on the keyboard. This causes a transition from state SG3a to state SG4, where an evaluation of the correctness of the typed information is displayed. After user confirmation, the system automatically transitions the AVLD 40 from SG4 system state 68 back to SG3a system state 59. The results of the dictation mode can be stored for progress tracking (discussed in more detail subsequently). If the user has difficulty hearing the spoken content to be typed, he/she can repeat the segment at will using the Repeat control 58.
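The disclosure does not prescribe how correctness of the typed dictation is evaluated; the following is a minimal sketch assuming a simple similarity score between the typed text and the segment's subtitle track (the normalization and `difflib` ratio are illustrative choices, not the claimed method):

```python
import difflib

def evaluate_dictation(typed: str, reference: str) -> float:
    """Score the user's typed dictation against the segment's
    subtitle text; 1.0 means a perfect match."""
    def normalize(s: str) -> str:
        return " ".join(s.lower().split())
    return difflib.SequenceMatcher(
        None, normalize(typed), normalize(reference)).ratio()

score = evaluate_dictation("Voy a conseguir otro trabajo",
                           "Voy a conseguir otro trabajo.")
print(f"dictation score: {score:.2f}")  # close to 1.0
```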
As previously noted, using a touch screen tablet computer (reference
It should be understood that AVLD 40 is capable of additional functionality. For example,
The foregoing introduced the concept of meta data: information outside of the AV content itself that is useful for learning. Meta data comprises information such as time-stamp information and various subtitle tracks that can be applied to each learning segment. It should be understood that meta data can come from different places than the AV content. For example, AV content might be stored locally to a playback system while the meta data might be downloaded over the internet. Meta data is thus distinct from the AV content itself.
Meta data can further include things such as subtitle information. A subtitle may comprise a textual representation of the spoken content in a given learning segment. That might be the traditional subtitles (closed captions or translations) used in a foreign language movie, reference column 83; a semantic translation into the user's mother tongue, reference column 84; a word-by-word translation, reference column 85; an international phonetic transcription, reference column 86; or a simple phonetic transcription, reference column 87. Note that there can be many different subtitle tracks in meta data. For instance, there can be subtitle tracks representing semantic translations (translations optimized to express the meaning of the original content), or more direct word-by-word translations, for various user mother tongues. Depending on the user's mother tongue (which might be selected using a high-level user interface control), only a subset of subtitle tracks may be offered to the user at any given time, namely those most appropriate for the user based on his or her specified mother tongue or linguistic skills in general.
Meta data can also contain additional information. For example, each learning segment's meta data might also contain an indication of its measure of difficulty, reference column 88, which provides a gauge of the complexity or difficulty of the learning segment. Alternatively, in place of difficulty, “lesson information” (such as lesson 1, 2, 3) that connects each learning segment to one or more lessons may be included in the meta data, again reference column 88. In that case the AVLD 40 can then be directed (using a user control) to selectively play only learning segments associated with a selected difficulty or lesson, and thereby automatically skip all segments not belonging to the specified difficulty or lesson, when the system is in Continuous Play mode or when the system is in Stop-and-Go mode and the Play Next Segment control 52 (or 60) is activated.
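Selective playback by lesson or difficulty could look like the following sketch, assuming each segment's meta data carries a `lesson` field as in column 88 (the dictionary layout is hypothetical):

```python
def playable_segments(segments, selected_lesson):
    """Yield only the learning segments tagged with the selected
    lesson (or difficulty), skipping all others, as described for
    Continuous Play and Play Next Segment."""
    for seg in segments:
        if seg.get("lesson") == selected_lesson:
            yield seg

segments = [
    {"id": 1, "lesson": 1, "subtitle": "Hola."},
    {"id": 2, "lesson": 2, "subtitle": "Y que piensa hacer?"},
    {"id": 3, "lesson": 1, "subtitle": "Adios."},
]
print([s["id"] for s in playable_segments(segments, 1)])  # [1, 3]
```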
The manner of storing meta data in general, and subtitle information in particular, can be of particular importance. Subtitle information can simply be stored in the same storage alongside the AV Content. In that case the subtitles are similar to “closed captioning” tracks. Preferably, however, meta data (which can include the subtitle track information) is stored as part of a table 80 which is part of a database. This is beneficial because state-of-the-art databases are usually based on a standardized database language such as SQL (Structured Query Language). Such standardized database languages implement the roles of a Data Description Language, a Data Manipulation Language, and a Query Language, and they enable rapid database searches.
If the table 80 is part of a database, a segment can be a database table row and the columns 81, 82, etc., can be database table columns. Then, by using a database query (such as a SELECT in the SQL language), a media player application can easily retrieve meta data for a desired segment to be played back next, or can obtain a list of segments that fulfill certain criteria for selective playback, such as segments belonging to a certain difficulty level, exhibiting certain grammatical constructs, or including an occurrence of specific words in a subtitle track. The media player can then time-synchronize segment playback of AV content using queries and information retrieved from the database.
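As a concrete illustration of the database approach, here is a minimal sketch using Python's built-in `sqlite3` module; the table name, column names, and sample row are assumptions modeled loosely on Table 80, not the actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE meta (
        segment    INTEGER PRIMARY KEY,  -- one row per learning segment
        t_start    REAL,                 -- segment start (seconds)
        t_end      REAL,                 -- segment end (seconds)
        original   TEXT,                 -- original-language subtitle
        semantic   TEXT,                 -- semantic translation track
        difficulty INTEGER
    )""")
conn.execute("INSERT INTO meta VALUES (1, 10.0, 14.5, "
             "'Voy a conseguir otro trabajo', "
             "'I am going to get another job', 2)")

# Retrieve the meta data needed to play back segment 1 ...
row = conn.execute(
    "SELECT t_start, t_end, original FROM meta WHERE segment = ?",
    (1,)).fetchone()

# ... or list all segments at or below a chosen difficulty level.
easy = conn.execute(
    "SELECT segment FROM meta WHERE difficulty <= ?", (2,)).fetchall()
```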
In one preferred embodiment the AV Content contains conversation markers that enable the media player to retrieve subtitle information for a desired next segment using a database query. If the order of segment playback is controlled by the user 34 using navigation controls or playback controls, the media player will be able to request the meta data for the next desired segment n by sending a SELECT request to the database to retrieve the desired meta data information.
In a database table embodiment, the meta data represented by Table 80 can be stored as one or more database tables, which can also contain a column of pointers to the AV Content corresponding to a given segment. For instance, a media player could receive a user input from the user 34 to play back a desired segment n. By sending a query to the database-based meta data, the media player receives the requested subtitle information for segment n, along with a pointer to the AV content representing the next segment to be played back. That information is used by the media player to request the AV Content for the desired next segment from the AV Content storage. The main flow control takes place between the meta data, the media player, and the User Inputs, while the AV Content is retrieved and played back on an as-needed basis.
It should be noted that progress status can also be implemented as a table in a database.
Beneficially meta data can be established, updated, or improved by a user community. For instance, if AV content is made available on an internet website, the members and/or users of that website may be allowed to add subtitle tracks to the Table 80. For example, someone might provide a correct pronunciation, a new or alternative translation, or add cultural information about the AV content. This allows the meta data to improve and grow over time. Ideally the content of the Table 80 is made available for editing by qualified users.
It should be noted that the meta data need not be exclusively based on textual information. There may also be extra audio information for each segment. For instance, for spoken content that is hard to hear, there may be additional audio information rendering the same spoken content as in the original AV Content but with a clearer, standardized pronunciation, for additional learning effect.
Meta data may also contain additional information, reference column 89, such as mathematical formulas, molecular representations, cultural information, or linguistic facts that may be useful to the user when viewing a learning segment.
It should be understood that learning segments can be “Non-verbal Segments” (NV Segments). An NV learning segment is one that does not contain spoken linguistic content or that contains spoken content deemed inappropriate for learning. Thus the AVLD 40 may well encounter silence periods. A user control can be added so that short silences between learning segments, such as those spanning up to a few seconds, can be added to the current, previous, or subsequent learning segment. Silence periods can be highly useful when the visual content is more important than sound. Such periods of silence, especially when they exceed a certain minimum length such as a few seconds, can be marked as NV learning segments in meta data.
In some AVLD devices a countdown timer can be displayed that shows how long the periods of silence will last, reference timer 90 in
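A countdown over an NV segment, with a user-operable skip, might be sketched as follows; `skip_requested` stands in for whatever control polling the AVLD provides and is purely hypothetical:

```python
import time

def play_nv_segment(duration: float, skip_requested) -> None:
    """Count down through a non-verbal (NV) segment, showing the
    remaining silence time and honoring a user skip request.

    skip_requested: zero-argument callable polled for the state of
    the skip control (a stand-in for the AVLD's input handling).
    """
    remaining = duration
    while remaining > 0:
        print(f"\rsilent scene: {remaining:5.1f}s remaining", end="")
        if skip_requested():
            break  # user chose to skip the rest of the NV segment
        time.sleep(0.1)
        remaining -= 0.1
    print()
```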
Referring now back to
In yet another embodiment, such scene descriptions for each segment are provided as an additional subtitle track that can be selected by the user 34 on a screen not only during NV segments but during any segment. This presents additional learning experiences for the user 34 by matching up observation of, for instance, a child picking up a ball from the ground while a scene description subtitle “child picks up ball from the ground” is displayed. Context subtitles may be similar in form and content to a film script. In fact, a subtitle track may be selected by the user 34 that takes the form of a film script, containing a mix of spoken subtitle information and scene descriptions depending on the segment.
Screen shots may lead to a better understanding of the principles of the present invention.
When activated the Subtitle Select control 114 presents a drop-down list of available subtitle tracks (see
As described above the AVLD 40 includes a “Dictation” mode, reference SG4 system state 68 of
As noted above, in some ways “soft” keyboards, such as those on the tablet computer 30, reference
A soft keyboard is best used in connection with a touch screen, as the specialized keyboard layout can be accessed directly by typing on the special keys on the screen. However, even when no touch screen is available, keys on a soft keyboard can be accessed by using, for example, a mouse pointer on a conventional PC. In that latter case, it may also be possible to display only a small, partial soft keyboard on the screen for choosing special characters, while the regular “hard” keyboard is used to enter all other letters. Note that the soft keyboard may be displayed on top, as an overlay to the Video image, in state SG3a. In the case where there is a remote input device 38 as in
The use of meta data is highly advantageous in AVLDs.
Referring now to
Another benefit of using a touch screen and context subtitles (see tablet computer 30,
In one preferred embodiment, clicking or tapping on a given previous or subsequent segment subtitle such as 152, 153, 154, 155, in the context subtitle textbox 150, will make the textbox 150 scroll the selected segment so that it is moved to the center of the textbox 150. The selected segment becomes the new current segment n, being displayed with special highlighting and thereby replacing what used to be segment subtitle 151. The previous and subsequent segments displayed are adjusted accordingly so that some new segments might become visible while others disappear. In one preferred embodiment, after selecting a new current segment 151 the system transitions to SG1 system state 50, showing the still video image representing the beginning of the new segment and allowing immediate segment playback of that segment. In yet another embodiment clicking or tapping, or double clicking or double tapping on a specific segment subtitle in the context subtitle textbox automatically transitions the AVLD into SG2 system state 54 to play back the chosen segment.
Larger scale navigation can also be accomplished using context subtitles, especially when using touch screens. Either or both a slide/scrolling control (as used in many programs such as Microsoft Word and on web sites) or a swipe control (as used on the Apple iPhone) can be used for rapid navigation. For example, still referring to
Another embodiment uses a “Swipe” control. In “Swiping”, the finger of the user 34 (or mouse pointer) is put on a specific location on the screen inside the context subtitle textbox 150 and then dragged to another (vertical) position in the textbox 150 and the finger or mouse pointer is released. The displayed subtitles move by the same amount, thereby making some subtitles shown previously disappear and others appear while moving a new set of context subtitles into the context subtitle textbox 150. Swipe controlling can be used for more localized navigation around the current segment n, while Slide control—or repeated use of a swipe control—can be used for larger scale navigation. Both can be used jointly. Note that the Slide control bar 164 can also be arranged horizontally as in state-of-the-art video player applications.
In the preceding description the current segment n (here: subtitle 151) will typically be in the vertical center of the context subtitle textbox 150, where it is made especially visible by its position and possible extra highlighting. In another embodiment, context subtitles may behave more like an “electronic book reader” (e-reader) device. Instead of scrolling the content of the textbox 150 to always center the current segment as subtitle 151, the textual content of the textbox 150 would remain static while segments are played back. Once the last segment on a given page has been played back, the page turns to a new page displaying a new set of subtitles. At any given time, the subtitle representing the currently playing segment could be highlighted on the screen, one after another, by showing a box around it, using a different font or color, or using another highlighting technique. In some applications it might be beneficial to “synch” AV content and meta data together. For example, to improve navigation in relation to context subtitles or standard subtitles, if learning segment 589 corresponds to the 23rd scene in the AV content and is the 78th sentence of that scene, then when segment 589 is selected “Scene 23, Sent 78” or some equivalent information could be displayed. This could be stored in a meta table in column 89, see
As previously noted the AVLD 40 can have selectable subtitle tracks.
The AVLD 40 can be configured to provide rapid useful information.
The AVLD 40 processes two general types of data: AV content in the form of learning segments and meta data. Referring to
While
Other memory storage configurations are also possible. For example,
In many modern digital players AV content is stored in a digital format referred to as MPEG-4. Even if a particular AVLD uses AV content in a different format that format will usually be similar to MPEG-4. This is because MPEG-4 type digital formats compress data to a manageable size while still allowing high-enough quality AV content. Without compression the memory size of AV content could become so large that it would simply take too long for transmission to be useful.
Compressed video is usually comprised of three different types of frames: I-frames, P-frames, and B-frames. I-frames (or intra-frames) present a complete image at one particular time without temporal dependence on any previous or subsequent frames; they can be thought of as snapshots at a given time, like fully self-contained still images such as digital jpg images. P-frames, in contrast, are predictive frames in which only information relative to a previous video frame is stored and/or transmitted. So, an I-frame presents something of a snapshot at one time, the next P-frame contains only the changes from that I-frame, and the following P-frame contains only the changes from the prior I-frame plus the prior P-frame. B-frames typically contain small data changes that leverage temporal redundancy, making use of bidirectional “predictions” from previous and subsequent video frames. Oftentimes the overall screen image of a digital video stream is subdivided into so-called macroblocks: square pieces of each image containing a certain number of image pixels, such as 16×16 pixels. In the state of the art, the decision to use I-frame, P-frame, or B-frame encoding may be made separately for each macroblock.
Compression takes place by converting raw AV content into a sequence of I-frames and P-frames and/or B-frames. I-frames refresh the content at somewhat regular intervals, while P or B frames are used to interpolate between I-frames in order to reduce the amount of data needed to represent the video content.
A consequence of the interleaved compression approach using I-frames and P/B-frames is that playback of conventional AV content cannot begin at an arbitrary time. For instance, if playback is desired at a given time corresponding to the start time of a learning segment, the video frame sequence may be in the middle of a sequence of P/B frames. It will therefore take some time into the playback to capture an I-frame and restore good image quality; otherwise video “tearing” occurs. Such tearing is a drawback of using conventionally encoded AV content in the context of an advanced language learning system as described in this invention.
b illustrates how to overcome this problem. In
One method of obtaining learning segments synchronized with I-frames is illustrated in
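Where re-encoding the AV content so that every segment begins on an I-frame is not possible, a related approximation is to snap each segment's start time to the nearest preceding I-frame of the existing stream. A sketch, assuming a sorted list of I-frame timestamps is available from the container (this snapping strategy is an illustrative alternative, not the frame processing described above):

```python
import bisect

def snap_to_iframe(segment_start: float, iframe_times: list) -> float:
    """Move a learning segment's start time to the nearest preceding
    I-frame so playback begins on a self-contained image (avoiding
    the "tearing" described above).

    iframe_times must be sorted in ascending order."""
    i = bisect.bisect_right(iframe_times, segment_start) - 1
    return iframe_times[max(i, 0)]

iframes = [0.0, 2.0, 4.0, 6.0, 8.0]
print(snap_to_iframe(5.3, iframes))  # 4.0
```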
When discussing various controls, input devices, and actions of AVLDs in accord with the principles of the present invention, reference
The foregoing discussions also tend to imply that the segment length is not adjustable. However, some embodiments of the present invention will have an adjustable segment length. In such cases a user control will be included to enable changing the typical length of a segment. For instance, the default segment length could be one phrase, expression, or sentence. Upon user request, or after the material being learned has been mastered to some degree, multiple consecutive segments can be joined to form a super-segment (see the sketch below). All controls discussed so far would then apply not to each individual segment, but to super-segments. The Meta Info table 80 illustrated in
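Joining consecutive segments into super-segments might be sketched as follows; the grouping-by-count strategy is an assumption (grouping could equally follow scenes or dialogs):

```python
def make_super_segments(segments, group_size=2):
    """Join consecutive learning segments into super-segments once
    the material is partially mastered; all segment controls then
    operate on the joined spans."""
    supers = []
    for i in range(0, len(segments), group_size):
        group = segments[i:i + group_size]
        supers.append({
            "start": group[0]["start"],
            "end": group[-1]["end"],
            "members": [g["id"] for g in group],
        })
    return supers

segs = [{"id": 1, "start": 0.0, "end": 4.0},
        {"id": 2, "start": 4.0, "end": 7.5},
        {"id": 3, "start": 7.5, "end": 11.0}]
print(make_super_segments(segs))  # two super-segments: [1,2] and [3]
```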
In another embodiment, playback in Continuous Mode or Stop-and-Go mode will allow speed control. If the spoken speed in a given segment is too fast, the user could reduce the playback speed. In a preferred embodiment, the speed adjustment will also utilize pitch preservation. Such techniques are well known in the prior art. If a given audio content is simply played back more slowly, the pitch will go down proportionally. State-of-the-Art pitch preservation technology can be used to change the speed while maintaining the pitch of the spoken audio content, thereby avoiding distortion of voices.
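Pitch-preserving slowdown is available off the shelf; for example, librosa's phase-vocoder time stretch changes speed while keeping pitch. A minimal sketch, with a hypothetical segment audio file name:

```python
import librosa
import soundfile as sf

# Load the segment's audio, then slow it to 75% speed while keeping
# the original pitch (librosa's time stretch is a phase vocoder, so
# voices are not distorted downward as with naive slow playback).
y, sr = librosa.load("segment_n.wav", sr=None)
y_slow = librosa.effects.time_stretch(y, rate=0.75)  # rate < 1 slows
sf.write("segment_n_slow.wav", y_slow, sr)
```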
In yet another embodiment, the voice of the user can be recorded when he/she recites a given segment by reading the corresponding subtitle. That recorded vocal information can then be matched and paired with the video content for later playback. Such playback may be beneficial for comparing one's own voice in another language with that of the original audio track in the audio-visual content. That way, the user may adopt the role of the voice of one of the characters in the audio-visual content.
It should be emphasized again that in the preferred configuration embodiment shown on
The foregoing operational descriptions of AVLDs can be viewed as embodiments of a generic AVLD 250 illustrated in
Finally, it will be understood by those skilled in the art that the transition from Continuous Play mode system state 42 to Stop-and-Go-mode system state 44 or vice versa (as shown in
For instance, when the system is in Continuous Play mode system state 42 and is subsequently paused by the user via activation of Pause control 48 in the middle of the current segment n (“within” segment n, that is, somewhere between its beginning and end), playback will stop with the Video image that was displayed at the very moment the Pause control 48 was activated being held on the screen. Once halted in this state, namely SG5 system state 66, the following functionality will beneficially be linked to the user controls in Stop-and-Go mode: a first activation of a Navigation/skip control 62 may jump to the beginning, or end, of the current segment n, as opposed to skipping to a previous segment n−1 or subsequent segment n+1. This will cause a transition to SG1 system state 50 or SG3 system state 56, depending on the direction of the navigation/skip control activated by the user.
Still referring to
SG1 system state 50, SG2 system state 54, SG3 system state 56 (or SG3a system state 59 and SG4 system state 68 if dictation mode is enabled), SG5 system state 66, and SG6 system state 70 in their entirety represent the Stop-and-Go-mode system state 44. It should be understood that a transition from Stop-and-Go-mode system state 44 to Continuous-Play-mode system state 42 can happen at any time by activation of Continuous Play control 46. In that event, a transition will occur from the currently active Stop-and-Go-mode state SG1 system state 50, SG2 system state 54, SG3 system state 56 (or SG3a system state 59 and SG4 system state 68 if dictation mode is enabled), SG5 system state 66, or SG6 system state 70, back to Continuous-Play-mode state 42. (Those transitions back into Continuous-Play-mode state 42 are not shown in
The concept of overlapping learning segments is highly beneficial. In the embodiment shown in
Dialogs DA 2701 and DB 2702 partially overlap. For example, segment 2730 starts at time 2705 and ends at time 2706 and does not overlap with anything said by speaker B. However, speaker B's spoken segment 2731 does not end before speaker A starts speaking segment 2732; they overlap. In segment 2731 speaker B says “Y que piensa hacer?” from time 2706 to time 2710. Meanwhile, speaker A says “Voy a conseguir otro trabajo” between times 2708 and 2709. There is an overlap 2741 between time 2708 and time 2710. Other overlaps occur, such as 2742.
Such overlaps are part of normal speaking, and someone learning a language must deal with them. Yet a learner nonetheless still wants to make use of the meta data of a sentence without the overlap. Therefore, defining learning segments that deal with overlapping speakers is beneficial to improving the user experience and learning success. For instance, if segments 2731 and 2732 were defined as non-overlapping, abrupt and unnatural dialogs with missing content would result.
When using overlapping segments the meta information table 80 such as shown in
A refinement for playback of overlapping segments is illustrated in the bottom half of
The ramp-up and/or ramp-down control is beneficially contained in the meta data table 80. In a straightforward practical implementation, the table 80 will contain a column of yes/no information describing whether the segment contains a ramp-up interval, and another column of yes/no information describing whether the segment contains a ramp-down interval. The exact definition of the ramp configurations, such as duration and start/stop levels of the playback parameter P, can then be made elsewhere in the AVLD and will thus apply to all learning segments. It should finally be noted that the above ramp-up or ramp-down mechanisms can be used in conjunction with other segments, with or without overlap.
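Interpreting the playback parameter P as audio volume, a ramp over the overlap interval might look like the following sketch; the linear shape and the default ramp duration are assumptions:

```python
import numpy as np

def apply_ramps(audio: np.ndarray, sr: int,
                ramp_up: bool, ramp_down: bool,
                ramp_secs: float = 0.5) -> np.ndarray:
    """Fade a segment's audio in and/or out so that a neighboring
    speaker's overlapping words are attenuated rather than cut.

    ramp_up / ramp_down mirror the yes/no columns described for
    the meta data table 80."""
    out = audio.astype(float).copy()
    n = int(ramp_secs * sr)
    if ramp_up and n <= len(out):
        out[:n] *= np.linspace(0.0, 1.0, n)   # ramp playback level up
    if ramp_down and n <= len(out):
        out[-n:] *= np.linspace(1.0, 0.0, n)  # ramp playback level down
    return out
```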
It is beneficial to distinguish the operating mode (dictation versus subtitle-based navigation) by using the orientation of the handheld device 290: if the handheld device 290 is held horizontally it acts as a typing keyboard 39; if it is held vertically it acts as a device for reading subtitles and implementing subtitle navigation.
In a beneficial modification of the system shown in
At all times, if playback or navigation updates the current segment n to be a new segment n′, the caching mechanism will ensure that the desired number of segments before and after the new segment n′ will be cached locally in the media player 211, 202 by retrieving the corresponding AV Content & meta data from the containers 213, 214. This caching arrangement is particularly useful in a system configuration where the AV Content and/or the meta data reside on geographically remote storage systems that are accessible via a network such as the Internet while the Media Player 202, 211 is operational at the location of the student.
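A sketch of the described caching window, assuming a `fetch` callable that retrieves a segment's AV content and meta data from the remote containers (the class and parameter names are illustrative):

```python
class SegmentCache:
    """Keep a window of segments around the current one cached
    locally, fetching from the remote AV Content and meta data
    containers on demand."""

    def __init__(self, fetch, radius=3):
        self.fetch = fetch      # fetch(n) -> (av_bytes, meta) from remote
        self.radius = radius    # number of segments cached on each side
        self.cache = {}

    def set_current(self, n: int):
        wanted = set(range(n - self.radius, n + self.radius + 1))
        for k in wanted:
            if k >= 0 and k not in self.cache:
                self.cache[k] = self.fetch(k)  # pull from remote storage
        # Drop segments that fell outside the window.
        for k in list(self.cache):
            if k not in wanted:
                del self.cache[k]
```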
A particularly beneficial embodiment combines elements from
It should be understood that while the foregoing illustrates numerous embodiments of the present invention, those embodiments are exemplary only. They are not intended to be exhaustive or to limit the invention to the precise forms disclosed, and obviously many modifications and variations are possible in light of the above teachings. Others who are skilled in the applicable arts will recognize numerous modifications and adaptations of the illustrated embodiments that remain within the principles of the present invention. Therefore, the present invention is to be limited only by the appended claims.
To the extent allowed by law this application claims priority to and the benefit of U.S. provisional application No. 61/504,173, entitled “AUDIO-VISUAL LEARNING SYSTEM,” which was filed on Jul. 2, 2011 for inventor Joachim S. Hammerschmidt. That application and any publications thereof are hereby incorporated by reference to the fullest extent allowed by law.