This invention relates generally to presenting videos, and more particularly to presenting videos in a structured manner as controlled by a user.
Most older prior art devices, such as VCRs, present videos to a user according to a single compositional structure inherent in the temporally organized frames, which can only be accessed sequentially on a linear tape. The modes of presentation are limited to play, reverse, pause, stop, fast forward, and fast reverse. Some VCRs allow the user to put index marks on the tape at arbitrary points along the video timeline. The user can then jump forward or backward to the marks.
Newer prior art devices, such as DVD players, also provide prerecorded compositional structures for the user, such as chapters and scenes, which are directly accessible. Additionally, some DVDs provide alternative versions, e.g., cut and uncut versions, or versions in different languages. However, DVD players do not provide the user with a simple, uniform method of choosing and moving between these various compositional structures. Typically, DVD players do not show how the various versions relate to the base video content.
Some very recent personal video recorders (PVRs) allow the user to generate compositional structures based on classified segments so that the user can play the video while skipping content, e.g., commercials.
The invention provides a method and system for presenting a video using multiple compositional structures. A compositional structure identifies and labels segments of the video. Example compositional structures are a list of commercials in a comedy program, a list of story items in a news program, and a list of baseball batters in a sports program. The user can dynamically select any compositional structure, and the video is then presented according to the selected compositional structure.
As shown in
The compositional structures 200 can be either generated 110 locally by a feature extractor operating on audio and visual features of the video, or downloaded 120 from a remote location via the network 121. The compositional structures can be generated automatically or manually. The compositional structures can be stored in a memory, e.g., the same memory storing the video, or a memory of the presentation system 100, as described below in greater detail.
Compositional Structures
Generally, the compositional structures 200 shown in
For example, a simple structure partitions a conventional broadcast video into program segments and commercial segments. Similarly, a simple composition of a sports video includes play segments and break segments, e.g., pre-game, time-out, and post-game segments, or just scoring opportunities. Another simple structure partitions the video into its audio and visual portions.
A hierarchical composition of a baseball game video includes game and commercial segments, and within the game segments, innings, and within innings, batters, and within batters, pitches, and within pitches, base hits, and perhaps, within base hits, home runs.
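The nesting described above can be pictured as a small tree of labeled segments. The following is only an illustrative sketch: the `Node` class, its field names, the helper functions, and the sample times are assumptions made for this example and are not part of the described system.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """One labeled segment that may contain nested sub-segments (hypothetical)."""
    label: str
    start: float      # seconds from the beginning of the video
    duration: float   # seconds
    children: List["Node"] = field(default_factory=list)


def walk(node: Node, depth: int = 0) -> Iterator[Tuple[int, Node]]:
    """Yield (depth, node) pairs in depth-first order."""
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


def segments_at_depth(root: Node, depth: int) -> List[Node]:
    """Collect every segment found at one level of the hierarchy."""
    return [n for d, n in walk(root) if d == depth]


# Made-up sample: video > game/commercials > inning > batter > pitch.
pitch = Node("pitch 1", 65.0, 10.0)
batter = Node("batter 1", 60.0, 120.0, [pitch])
inning = Node("inning 1", 60.0, 900.0, [batter])
game = Node("game", 60.0, 900.0, [inning])
video = Node(
    "video",
    0.0,
    1020.0,
    [Node("commercial", 0.0, 60.0), game, Node("commercial", 960.0, 60.0)],
)
```

Under these assumptions, selecting a level of the tree (for example, all batters) yields a flat list of segments that could be presented like any simple structure.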
A compound structure can use both simple and hierarchical compositions, e.g., the intersection of just the game without commercials, and further innings within the game.
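One way to picture such an intersection is as an operation on two lists of time intervals. The sketch below is an assumption made for illustration: the `(start, end)` tuple representation, the `intersect` function, and the sample times are hypothetical, and the description does not prescribe how a compound structure is computed.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds; hypothetical representation


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two lists of intervals, each sorted by start and non-overlapping."""
    result: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            result.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


# Made-up sample: game segments with a commercial break from 600 s to 720 s,
# intersected with one inning that spans the break.
game_only = [(60.0, 600.0), (720.0, 1200.0)]
one_inning = [(500.0, 800.0)]
compound = intersect(game_only, one_inning)
```

With this sample data, `compound` holds the two parts of the inning that fall on either side of the commercial break.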
A particular video can have multiple compositional structures, and the user can present the video according to different selected compositional structures. The selected compositional structure can change while the video is presented.
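Changing the selected structure during presentation can be thought of as re-resolving the current playback position against a different segment list. The sketch below assumes segments are `(start, duration)` pairs sorted by start and non-overlapping; the `next_position` function and the sample structures are hypothetical and only illustrate the idea.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

Segment = Tuple[float, float]  # (start, duration) in seconds; hypothetical


def next_position(segments: List[Segment], position: float) -> Optional[float]:
    """Return where playback should continue under the given structure.

    If `position` lies inside a segment, playback continues from there.
    Otherwise it jumps to the start of the next segment. None means that
    no segment remains after `position`.
    """
    starts = [start for start, _ in segments]
    i = bisect_right(starts, position) - 1  # last segment starting at or before position
    if i >= 0:
        start, duration = segments[i]
        if position < start + duration:
            return position
    if i + 1 < len(segments):
        return segments[i + 1][0]
    return None


# Made-up structures for the same video.
program_only = [(0.0, 300.0), (420.0, 600.0)]
highlights = [(100.0, 20.0), (500.0, 30.0)]
```

For example, a viewer at 350 seconds who is watching with `program_only` would resume at 420 seconds, while the same viewer switching to `highlights` would resume at 500 seconds.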
As shown in
The label 221 describes or ‘names’ the structure, e.g., “Red Sox vs. Yankees 9/13/04.” The label can be a text string, an image, an icon, or a short video and/or audio clip. The program segments 222 can be ordered. The ordering can be according to time, or according to subjective importance based on, for example, the percentage of cheering. The ordering can also be hierarchical, as described above.
The start 223 is a time or frame relative to the beginning of the video 101. The optional duration is the length of the segment in terms of time or frames.
The attributes 225 further identify each program segment. The attributes can be a color, icon, or sound that represents content-specific information about the segment, for example, that the segment contains a “scoring play” or that the “crowd reaction was intense.” Relative importance is another possible attribute. Attributes can also include classifications.
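The fields described above (label, ordered program segments, start, optional duration, and attributes) can be summarized in a small data structure. This is a minimal sketch under stated assumptions: the class names, field names, the `importance` and `classification` attribute keys, and the sample values are hypothetical, and times are assumed to be in seconds.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ProgramSegment:
    start: float                      # relative to the beginning of the video
    duration: Optional[float] = None  # optional, as in the description
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class CompositionalStructure:
    label: str
    segments: List[ProgramSegment] = field(default_factory=list)

    def by_time(self) -> List[ProgramSegment]:
        """Order segments by start time."""
        return sorted(self.segments, key=lambda s: s.start)

    def by_importance(self) -> List[ProgramSegment]:
        """Order segments from most to least important.

        Segments without an "importance" attribute are treated as 0.0.
        """
        return sorted(
            self.segments,
            key=lambda s: s.attributes.get("importance", 0.0),
            reverse=True,
        )


# Made-up sample data.
highlights = CompositionalStructure(
    "Red Sox vs. Yankees 9/13/04",
    [
        ProgramSegment(1200.0, 45.0, {"classification": "scoring play", "importance": 0.9}),
        ProgramSegment(300.0, 30.0, {"importance": 0.4}),
        ProgramSegment(2400.0),
    ],
)
```

Keeping the attributes in a free-form dictionary is a design choice of this sketch; it lets one structure carry classifications, importance scores, or display hints without changing the segment type.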
Example Presentation
The list 210 of available compositional structures 200 can also describe the content. Examples of such structures include highlights in a sports video, program-only segments, pitches in baseball, home runs, etc.
As shown in
It should be noted that the remote controller according to the invention has only five buttons, yet it gives the user much greater control of the presentation than prior art devices with many more buttons.
Although the invention has been described by way of examples of preferred embodiments, it is to be understood that various other adaptations and modifications may be made within the spirit and scope of the invention. Therefore, it is the object of the appended claims to cover all such variations and modifications as come within the true spirit and scope of the invention.