The invention relates in general to multimedia highlight content, and in particular, to generating multimedia highlight content from a full-length multimedia broadcast presentation.
The personal video recorder (PVR) and digital video recorder (DVR) have become increasingly popular devices in today's age of digital media. More than ever, consumers are appreciating the value in being able to time shift broadcast content. In addition, the ability to skip advertisements and navigate recorded content has generally been well received by consumers.
Despite the advantages offered by DVR technology, broadcast programs can often be large and cumbersome to navigate. This is particularly true for lengthy sports programming. This is why there is a significant amount of television programming devoted to sports highlights. The problem with such highlight content is that it is currently being created manually by media editors using non-linear editing systems. Media editors are able to use their subjective judgment to key material in a full-length program and assemble it into a highlights program. This is of course a very laborious process which largely substitutes the subjectivity of the media editors for that of the individual viewers.
Thus, there is still an unsatisfied need for a system and method for generating multimedia highlight content in an automatic fashion based on one or more user-defined parameters.
Systems and methods for generating multimedia highlight content are disclosed and claimed herein. In one embodiment, a method includes receiving one or more user highlight parameters, parceling multimedia content into video packets and closed-caption packets where the video packets include a plurality of frames, and processing the video packets to identify graphical changes within a predetermined frame location between two or more frames. In one embodiment, the graphical changes are indicative of a potential highlight segment. The method further includes processing the closed-caption packets to identify highlight keywords that match at least one of the user highlight parameters, and compiling a plurality of potential highlight segments into a highlight program based on one of the identified graphical changes and the identified highlight keywords.
Other embodiments are disclosed and claimed herein.
The invention relates to a system and method for generating multimedia highlight content based on a recorded full-length broadcast program. In one embodiment, one or more user-defined parameters and/or default parameters are used to detect the presence of a potential highlight within a recorded version of the full-length broadcast program. Locations within the recorded program which satisfy any of the user or default parameters may then be added to a highlight list, which is usable to generate a highlight program.
One aspect of the invention is to provide an algorithm set which operates on the video content to identify potential highlights, as defined by user and/or default parameters. In one embodiment, a user may provide one or more keywords particular to the type of highlight desired. In another embodiment, a default set of keywords may be used instead of or in addition to user-defined keywords.
Another aspect of the invention is to parcel out the video packets, audio packets and closed-caption packets from a multimedia stream. Once separated, particular frames (e.g., I-frames) within the video packets may be analyzed for changes indicative of a potential highlight. Closed-caption packets may also be analyzed for the occurrence of keywords which match default or user-defined keywords. Similarly, the audio packets may be analyzed for speech containing highlight-indicative keywords.
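By way of illustration only, the parceling step described above might be sketched as follows. This is a minimal sketch under stated assumptions: the packet representation (a dictionary with hypothetical "kind", "frame_type" and "time_s" fields) is invented for the example and does not reflect any actual MPEG demultiplexer interface.

```python
from collections import defaultdict


def parcel_stream(packets):
    """Separate a mixed packet sequence into video, audio and closed-caption lists.

    Each packet is assumed (hypothetically) to be a dict with a "kind" key
    whose value is "video", "audio" or "cc".
    """
    parcels = defaultdict(list)
    for packet in packets:
        parcels[packet["kind"]].append(packet)
    return parcels["video"], parcels["audio"], parcels["cc"]


def select_i_frames(video_packets):
    """Keep only the video packets marked as I-frames (assumed "frame_type" field)."""
    return [p for p in video_packets if p.get("frame_type") == "I"]


stream = [
    {"kind": "video", "frame_type": "I", "time_s": 0.0},
    {"kind": "audio", "time_s": 0.0},
    {"kind": "video", "frame_type": "P", "time_s": 0.04},
    {"kind": "cc", "time_s": 0.1, "text": "just hit a homerun"},
]
video, audio, cc = parcel_stream(stream)
i_frames = select_i_frames(video)
```

Restricting the video analysis to I-frames, as in the example given above, keeps the comparison on frames that can be decoded without reference to neighboring frames.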
Still another aspect of the invention is to tabulate the locations and descriptions of potential highlights within the full-length broadcast program. In one embodiment, a video list containing the locations and descriptions of potential highlights identified by video analysis is generated. Similarly, an audio list and/or closed-caption list may be generated where each contains the locations and descriptions of potential highlights identified by speech recognition and closed-caption text analysis, respectively. In one embodiment, these three lists are correlated and compiled into a single highlight list. The highlight list may then be used to access the various identified highlights from the recorded full-length broadcast program, and to present them in sequence on a display device.
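One possible way to correlate the three lists is sketched below. The disclosure does not specify a correlation rule, so the rule used here (entries from any source that fall within a hypothetical tolerance_s seconds of the preceding entry are treated as the same highlight) is an assumption made purely for illustration, as are the names Highlight and compile_highlight_list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Highlight:
    time_s: float      # location within the recorded program, in seconds
    description: str   # e.g. "score change"
    source: str        # "video", "audio" or "cc"


def compile_highlight_list(video, audio, cc, tolerance_s=10.0):
    """Merge the per-source lists into one time-ordered highlight list.

    Assumed rule: an entry joins the current group when it lies within
    tolerance_s of the previous entry in that group.
    """
    merged = sorted([*video, *audio, *cc], key=lambda h: h.time_s)
    groups = []
    for h in merged:
        if groups and h.time_s - groups[-1][-1].time_s <= tolerance_s:
            groups[-1].append(h)
        else:
            groups.append([h])
    return [
        {
            "time_s": g[0].time_s,
            "descriptions": sorted({h.description for h in g}),
            "sources": sorted({h.source for h in g}),
        }
        for g in groups
    ]


highlight_list = compile_highlight_list(
    video=[Highlight(100.0, "score change", "video")],
    audio=[Highlight(104.0, "homerun", "audio")],
    cc=[Highlight(500.0, "homerun", "cc")],
)
```

In this sketch, a highlight reported by more than one source keeps a record of every source that detected it, which could be used to rank or filter entries.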
Another aspect of the invention is to enable a user to edit and customize the various identified highlights to create a final highlight “program.” While in one embodiment, the customized highlight program may be stored separately on a local storage device, in another embodiment, the resulting highlight program may be generated “on the fly” by successively accessing the identified highlights in the recorded content and displaying them on a display device as if it were a separately existing program.
When implemented in software, the elements of the invention are essentially the code segments to perform the necessary tasks. The program or code segments can be stored in a processor readable medium or transmitted by a computer data signal embodied in a carrier wave over a transmission medium or communication link.
Referring now to the figures,
The input module 102 provides a media stream to the media switch 112. The input module 102 may also be used to tune the channel to a particular program, extract a specific MPEG program out of it, and feed it into the rest of the system. Analog video signals may be encoded into a similar MPEG format using separate video and audio encoders, such that the remainder of the system is unaware of how the signal was obtained. Information may be modulated into the Vertical Blanking Interval (VBI) of the analog video signal in a number of standard ways. For example, the North American Broadcast Teletext Standard (NABTS) may be used to modulate information onto lines 10 through 20 of an NTSC signal, while the FCC mandates the use of line 21 for Closed Caption (CC) and Extended Data Services (EDS). Such signals may be decoded by the input module 102 and passed to the other modules as if they were delivered via a private data channel.
In one embodiment, the media switch 112 mediates between a microprocessor CPU 106, hard disk or other storage device 108, which may or may not include the DVR system's live cache 114, and volatile memory 110. Input streams are converted to an MPEG stream and sent to the media switch 112. The media switch 112 buffers the MPEG stream into memory. If the user is watching real time broadcast content, the media switch 112 may send the stream to the output module 104, as well as simultaneously write it to the hard disk or storage device 108.
The output module 104 may take the MPEG streams as input and produce an analog video signal according to a particular standard (e.g., NTSC, PAL, or other video standard). In one embodiment, the output module 104 contains an MPEG decoder, on-screen display (OSD) generator, analog video encoder and audio logic. The OSD generator may be used to supply images which will be overlaid on top of the resulting analog video signal. Additionally, the output module 104 can modulate information supplied by the program logic onto the VBI of the output signal in a number of standard formats, including NABTS, CC, and EDS.
Memory 110 may further contain instructions to cause CPU 106 to insert programming information directly into the MPEG data stream(s). The user may input control instructions for displaying such programming information via a button on a remote control device, for example. It should equally be appreciated that a user may provide instructions to the DVR system 100 using any other known user input means. As will be described in more detail below, memory 110 may also include one or more instructions for generating multimedia highlight content based on broadcast content received by the input module 102.
In another embodiment, user-defined parameters 230 may be provided for financial programming content. In this case, users can define particular companies, currencies, fund managers, etc. to be highlight worthy. Similarly, user-defined parameters 230 may be provided for news programming to key in on particular countries, states, world leaders, world events, local news events, etc. It should be appreciated that the variety of possible user-defined parameters 230 is limitless, as is the type of programming which can be used to create highlights in accordance with the invention. Moreover, the parameters 230 may be provided by a user using any number of input devices, such as keyboards, remote controls, etc.
In addition to the recorded content 220 and user parameters 230, the MHE 210 may also make use of one or more default highlight parameters 240 based on the type of programming being processed. For example, in the case of a baseball game, any homerun may be considered a default highlight even though the user has not specifically added a user parameter 230 for homeruns. Similarly, any score change in a football game may be considered a highlight and, as such, a default highlight key 240 for score changes may be provided to the MHE 210.
Continuing to refer to
The output from the MHE 210 is highlight content 260. In one embodiment, highlight content 260 is comprised of a plurality of individual media segments selected from the recorded content 220 based on their probability of being a highlight, as defined either by the user or by the default settings. As will be described in more detail below with reference to
Once extracted, the frame data may be passed to the video search engine 520. Using the algorithm set library 535 (which is comprised of the highlight algorithm set 250), the video search engine 520 compares the video or text at a given coordinate of the frame for two successive frames. This comparison is performed to detect a change indicative of a potential highlight. For example, in the case of a sports broadcast, the top-right corner of the frame may contain a score box. By analyzing successive frames, the video search engine 520 can detect changes in that area of the screen, thereby indicating a score change. Assuming that score changes are either a default highlight or a user-defined highlight, this location within the program may then be identified as a highlight and this information may then be tabulated in a video list 545. In one embodiment, the video list is a table of potential highlight locations and their descriptions (e.g., type of highlight).
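The successive-frame comparison described above can be illustrated with the following minimal sketch. It is offered under several assumptions that are not part of the disclosure: frames are modeled as small grayscale pixel grids (lists of rows) paired with a timestamp, the score box location is given as a hypothetical (top, left, height, width) tuple, and a change is declared when a hypothetical fraction of the pixels in the box differ.

```python
def crop(frame, box):
    """Return the pixels of frame inside box = (top, left, height, width)."""
    top, left, height, width = box
    return [row[left:left + width] for row in frame[top:top + height]]


def region_changed(prev, curr, box, min_changed_fraction=0.1):
    """True when enough pixels inside box differ between two frames."""
    a, b = crop(prev, box), crop(curr, box)
    total = sum(len(row) for row in a)
    if total == 0:
        return False
    changed = sum(
        1 for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b) if pa != pb
    )
    return changed / total >= min_changed_fraction


def build_video_list(frames, box, description="score change"):
    """Tabulate (timestamp, description) for each change within box.

    frames is a list of (timestamp_s, pixel_grid) pairs in playback order.
    """
    video_list = []
    for (_, prev), (ts, curr) in zip(frames, frames[1:]):
        if region_changed(prev, curr, box):
            video_list.append((ts, description))
    return video_list


blank = [[0] * 4 for _ in range(4)]
scored = [[0, 0, 9, 9], [0, 0, 9, 9], [0, 0, 0, 0], [0, 0, 0, 0]]
elsewhere = [[0, 0, 9, 9], [0, 0, 9, 9], [0, 0, 0, 0], [7, 7, 0, 0]]
score_box = (0, 2, 2, 2)  # top-right corner of the 4x4 example frame
frames = [(0.0, blank), (0.5, blank), (1.0, scored), (1.5, elsewhere)]
video_list = build_video_list(frames, score_box)
```

Here only the transition at 1.0 seconds is tabulated, since the later change falls outside the score box. A practical system would likely need a tolerance for compression noise rather than exact pixel equality.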
Continuing to refer to the video packet processing, it should equally be appreciated that changes in the video stream may be detected using other frame comparisons, and not necessarily a comparison of successive frames. In addition, sports broadcast score boxes often contain other information, such as which team is in possession of the ball, which bases have a man on, etc. Thus, changes in any of the data provided graphically can be detected and used to identify a potential highlight. In the case of financial programming, for example, a stock ticker can be analyzed to detect when a particular stock symbol comes up. Similarly, many news programs have graphical text at the bottom of the screen detailing the topic of discussion. This area can be analyzed by the video search engine 520 to identify a particular word or graphic based on the previously provided user parameters 230 and/or default keys 240.
Referring now to the audio processing portion of the highlight extractor,
Continuing to refer to
In addition to identifying the occurrence of a keyword in the closed caption and audio feeds, in another embodiment context logic can be used to filter out false positives. For example, in a baseball game an announcer may use the word “homerun” despite the fact that a homerun had not been scored. One way to filter out such false positives is to perform a context analysis of how the keyword was used. For example, a predetermined number of words before and after the keyword may be analyzed. If the word “needs” appears in the same sentence before the word “homerun,” this is likely to be a false positive. On the other hand, if the words “just hit” appear before the word “homerun,” this is more likely to be an actual score change highlight. Another way to filter out potential false positives is to cross-reference against the graphical score change, as detected by the video search engine 520.
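The context analysis described above might be sketched as follows. This is an illustrative sketch only: the cue lists contain just the two examples given in the text, the window size and the labels are hypothetical, and for simplicity only the words preceding the keyword are examined, without sentence boundary detection.

```python
import re

NEGATIVE_CUES = ("needs",)     # suggests the event has not happened
POSITIVE_CUES = ("just hit",)  # suggests the event has just happened


def classify_keyword_uses(text, keyword, window=5):
    """Label each occurrence of keyword by the words that precede it.

    Returns a list with one label per occurrence: "likely false positive",
    "likely highlight" or "undetermined".
    """
    words = re.findall(r"[a-z']+", text.lower())
    labels = []
    for i, word in enumerate(words):
        if word != keyword.lower():
            continue
        # Pad with spaces so cues only match on whole-word boundaries.
        before = " " + " ".join(words[max(0, i - window):i]) + " "
        if any(f" {cue} " in before for cue in NEGATIVE_CUES):
            labels.append("likely false positive")
        elif any(f" {cue} " in before for cue in POSITIVE_CUES):
            labels.append("likely highlight")
        else:
            labels.append("undetermined")
    return labels


wishful = classify_keyword_uses("He needs a homerun here to tie it.", "homerun")
actual = classify_keyword_uses("Jones just hit a homerun to left field!", "homerun")
```

Occurrences labeled "undetermined" could then be resolved by the cross-reference against the graphical score change mentioned above.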
Referring now to
Since the highlight list 260 is comprised of specific locations and descriptions, in order to capture the entire highlight, it is necessary to define a window around the highlight timestamp. This window may be highlight specific, user definable, or a combination of the two. In addition, the highlight editor 710 may contain a learning algorithm which adjusts the size of the highlight windows depending on user actions. By way of example, if a user consistently extends the highlight window of “score change” highlights, the highlight editor 710 may adjust the default window size for all “score change” highlights.
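A minimal sketch of such window handling is given below. The disclosure does not specify the learning algorithm, so the update rule used here (moving the stored per-type window a fixed fraction of the way toward each user-edited window) is an assumption chosen for simplicity, and the class and parameter names are hypothetical.

```python
class HighlightWindows:
    """Per-highlight-type windows around a highlight timestamp, in seconds."""

    def __init__(self, default_before_s=10.0, default_after_s=10.0, learning_rate=0.5):
        self.default = (default_before_s, default_after_s)
        self.learning_rate = learning_rate
        self.by_type = {}  # highlight type -> (before_s, after_s)

    def window_for(self, timestamp_s, highlight_type):
        """Return (start_s, end_s) of the segment to play for a highlight."""
        before, after = self.by_type.get(highlight_type, self.default)
        return (max(0.0, timestamp_s - before), timestamp_s + after)

    def record_user_edit(self, highlight_type, timestamp_s, start_s, end_s):
        """Nudge the stored window for this type toward the user's edit."""
        before, after = self.by_type.get(highlight_type, self.default)
        rate = self.learning_rate
        self.by_type[highlight_type] = (
            before + rate * ((timestamp_s - start_s) - before),
            after + rate * ((end_s - timestamp_s) - after),
        )


windows = HighlightWindows()
initial = windows.window_for(100.0, "score change")
# The user extends the end of a "score change" highlight from 110 s to 130 s.
windows.record_user_edit("score change", 100.0, 90.0, 130.0)
adjusted = windows.window_for(200.0, "score change")
```

With a learning rate below 1, a single unusual edit shifts the default only partway, while consistent edits of the same kind move it progressively closer to the user's preference.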
Referring now to
The end result of a user editing the originally detected highlight segments 9201-920n is shown in
While the invention has been described in connection with various embodiments, it will be understood that the invention is capable of further modifications. This application is intended to cover any variations, uses or adaptations of the invention following, in general, the principles of the invention, and including such departures from the present disclosure as come within the known and customary practice within the art to which the invention pertains.