This application claims the benefit of Taiwan Patent Application No. 101134755, filed on Sep. 21, 2012, which is hereby incorporated by reference for all purposes as if fully set forth herein.
1. Field of Invention
The present invention relates to a media retrieval system and method, and more particularly to a media retrieval system and method for acquiring a demanded media frame based on scene description data.
2. Related Art
In the prior art, media data is usually played in a linear manner, and playing software provides a corresponding timeline for the media data. During use, a position on the timeline can be clicked, or a slider on the timeline can be dragged, so as to determine an image playing interval. However, the precision of dragging the slider depends on the length of the timeline, and that precision in turn affects how accurately the timepoint of a demanded image can be positioned. Generally speaking, the longer the timeline, the higher the precision of dragging the slider. Therefore, if a user intends to acquire a targeted image or sound from the media data, manual timeline operations are required to position the slider at the timepoint of the demanded image before that image can be played.
To solve the above problems, the present invention discloses a media scene playing system and method that use auxiliary data describing a scene as the reference for playback, so as to retrieve and play the demanded scene section media.
The media scene playing system disclosed in the present invention includes a media providing module, an input module, and a media retrieval module.
The media providing module is used for providing media data and scene description information corresponding thereto. The input module is used for inputting navigation data. The media retrieval module is used for comparing the navigation data with each piece of scene description information to acquire at least one scene period matching the navigation data, so as to retrieve at least one piece of scene section media corresponding to the at least one scene period from the media data and play the scene section media.
In the media scene playing method disclosed in the present invention, a media providing module provides media data and at least one piece of scene description data corresponding thereto. A media retrieval module then compares each piece of scene description data with the navigation data to find at least one piece of target scene description data matching the navigation data, and retrieves a target media frame corresponding to the target scene description data from the media data.
The present invention also discloses a recording medium that stores a program code readable by an electronic device. When an electronic device reads the program code, a media scene playing method is executed. The method is as discussed above.
In the present invention, first, by means of targeted retrieval of media frames, a user can find a demanded video scene within a relatively short time. Secondly, the user's operation on the media data is not limited by the length of a timeline, which enhances the precision of acquiring the demanded media data and avoids the operational trouble of having to drag a slider exactly to a demanded point. Thirdly, the user can acquire all demanded media frames at once to form self-generated media, so that a customized media presentation satisfying the demand of the user is formed and the operational complexity for the user is reduced.
The present invention will become more fully understood from the detailed description given herein below, which is for illustration only and thus is not limitative of the present invention, and wherein:
The embodiments of the present invention are illustrated below in detail with reference to the accompanying drawings.
The media providing module 10a is used for providing media data 41 and more than one piece of scene description information 42 corresponding to the media data 41. The media providing module 10a refers to hardware, or a combination of software and hardware, in the form of one or more of a unit, component, device, apparatus, or system having a media providing capability.
The scene description information 42 is annotation data for annotating the media data 41, or further for annotating the media frames 411 included in the media data 41, for example explanatory data such as a brief description, the playing duration, and the title of the playing content of the media data 41.
The input module 20 is used for a user to input navigation data 51, and refers to a data input interface through which the user inputs data; the presentation of the interface depends on the demands of designers and is not limited herein.
The media data 41 includes multiple media frames 411 of different contents. For example, when the media data 41 is image data, the image data comprises image frames whose contents include one or more of targets, scenes, and characters. When the media data 41 is voice data, the voice data comprises voice frames whose contents include one or more of high pitch, low pitch, speech, and music.
The navigation data 51 refers to a retrieval demand condition 511 input by a user when the user intends to retrieve a specific scene interval from the media data 41. For example, when the media data 41 is recorded images of a basketball game, the user may input a retrieval demand condition 511 such as scoring pictures of his favorite players, scoring pictures of three-point shots by all players in a game, or scoring pictures of three-point shots by his favorite players. As another example, when the media data 41 is music data such as an opera, the user may input a retrieval demand condition 511 such as solos of the female protagonist of the opera or the purely instrumental portions of the performance.
The media retrieval module 30 is formed of software, hardware, or a combination thereof, for example application software executed by an arithmetic processor, a chip, an integrated circuit (IC), or firmware that runs in combination with a chip or an IC; this is not limited herein and depends on the demands of designers.
The media retrieval module 30 acquires the media data 41 and the scene description information 42 provided by the media providing module 10a, and acquires the navigation data 51 from the input module 20. The media retrieval module 30 compares the retrieval demand condition 511 of the navigation data 51 with each piece of scene description information 42 to acquire more than one scene period 421 matching the navigation data 51, and then retrieves all scene section media 43 corresponding to the scene periods 421 from the acquired media data 41. The manner of acquiring the scene periods 421 is illustrated with the following examples, without being limited thereto:
(1) The scene description information 42 records a plurality of scene playing durations, and the retrieval demand conditions 511 of the navigation data 51 include more than one demand timepoint. The media retrieval module 30 matches these demand timepoints and the scene playing durations to acquire scene periods 421 that meet the demands.
(2) The scene description information 42 includes a plurality of scene description instructions and playing durations corresponding to the scene description instructions. The retrieval demand conditions 511 of the navigation data 51 include more than one piece of scene designation information. The scene designation information refers to the instruction of a demanded scene of a user. The media retrieval module 30 matches each piece of scene designation information and the scene description instruction to acquire the demanded scene periods 421 from all the playing durations.
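The two matching manners above can be sketched as follows. This is an illustrative sketch only, assuming scene description information is modeled as a list of records with start/end times and a textual description; all names and the data layout are assumptions, not taken from the specification.

```python
# Hypothetical sketch of the two matching manners. Each scene record carries
# a playing duration (start, end) and a description string; field names are
# illustrative, not from the specification.

def match_by_timepoint(scene_infos, demand_timepoints):
    """Manner (1): keep scenes whose playing duration covers a demand timepoint."""
    periods = []
    for info in scene_infos:
        start, end = info["start"], info["end"]
        if any(start <= t <= end for t in demand_timepoints):
            periods.append((start, end))
    return periods

def match_by_designation(scene_infos, designations):
    """Manner (2): keep scenes whose description matches a designated keyword."""
    periods = []
    for info in scene_infos:
        if any(d in info["description"] for d in designations):
            periods.append((info["start"], info["end"]))
    return periods

scenes = [
    {"start": 0, "end": 12, "description": "opening tip-off"},
    {"start": 12, "end": 30, "description": "three-point shot by player 7"},
    {"start": 30, "end": 45, "description": "free throw"},
]
print(match_by_timepoint(scenes, [15]))               # [(12, 30)]
print(match_by_designation(scenes, ["three-point"]))  # [(12, 30)]
```

In both manners the result is a list of scene periods 421 that the media retrieval module can then use to cut scene section media out of the media data.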
However, the navigation data 51 is not limited to the two types of retrieval demand condition 511 above, and may also include multiple different retrieval demand conditions 511. The media retrieval module 30 retrieves multiple periods of scene section media 43 from the media data 41 according to the retrieval demand conditions 511. The media retrieval module 30 may directly retrieve video/audio intervals from the media data 41, or retrieve the media frames 411 corresponding to the scene periods 421 from the media data 41 and combine the media frames into the scene section media 43.
The media retrieval module 30 may construct one or more pieces of self-generated media from the retrieved pieces of scene section media 43 according to rules such as a playing sequence, the media frames 411, the playing durations of the scene section media 43 in the media data 41, or the sequence of the retrieval demand conditions 511. Furthermore, a media playing tree structure may be constructed depending on the data dependence relationships of the retrieval demand conditions 511 or the types of the data attributes. Alternatively, an interface of the input module 20 may present input fields for the media playing tree structure, and the user merely inputs the retrieval demand conditions 511 into the fields according to the data dependence relationships of the retrieval demand conditions 511 or the types of the data attributes, which serves as a reference for retrieval and classification of the media frames 411 by the media retrieval module 30.
Alternatively, the media providing module 10a may provide a plurality of pieces of media data 41, provide scene description information 42 corresponding to each piece of media data 41, and provide the media data 41 and the scene description information 42 to the media retrieval module 30 together. When inputting the navigation data 51 by means of the input module 20, the user may set different retrieval demand conditions 511 for each piece of media data 41 or set one retrieval demand condition 511 for all the media data 41, depending on the demands of the user. The media retrieval module 30 compares the relevant scene description information 42 according to the navigation data 51 to find the scene periods 421, and then retrieves the scene section media 43 from the media data 41 according to the scene periods 421.
Subsequently, the media retrieval module 30 can be designed to construct a playing media that meets the demand of the user with the scene section media 43 or even a group of media frames 411 formed of the scene section media 43.
Moreover, the media retrieval module 30 can store retrieved results such as the scene section media 43, the group of media frames 411, the scene periods 421, the media playing tree structure, and the playing media in a storage module 80 (as shown in
When the media data 41 is stream media, the data processing unit 12 directly transmits each received media frame 411 or media interval to the media retrieval module 30, so that the media retrieval module 30 performs the media retrieval action on the received data. If the media data 41 is a complete piece of video/audio data, the data processing unit 12 stores the received interval data in a register unit 13, and provides the media data 41 to the media retrieval module 30 after reception of the media data 41 is finished.
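The two delivery modes of the data processing unit can be sketched as below. This is a minimal sketch under assumed interfaces: `forward` stands in for handing data to the media retrieval module, and the list used as a register unit is illustrative.

```python
# Hypothetical sketch of the data processing unit's two delivery modes:
# stream media is forwarded frame by frame as it arrives, while a complete
# video/audio file is buffered in a register and handed over whole.

def deliver(frames, is_stream, forward):
    if is_stream:
        for frame in frames:   # forward each frame immediately
            forward(frame)
    else:
        register = []
        register.extend(frames)  # buffer until reception finishes
        forward(register)        # then hand over the whole media data

streamed = []
deliver(["f1", "f2"], is_stream=True, forward=streamed.append)
whole = []
deliver(["f1", "f2"], is_stream=False, forward=whole.extend)
print(streamed, whole)  # ['f1', 'f2'] ['f1', 'f2']
```

The distinction matters because, in the stream case, retrieval can begin before the media data has been fully received.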
The storage module 80 stores one or more lists of scene description information 42. Each list of scene description information 42 records a group of scene description information 42 corresponding to specific navigation data 51 after a retrieval operation. Furthermore, the list of scene description information 42 also records the playing sequence of the scene section media 43 as well as the level and node of the navigation data 51 in the media playing tree structure.
When acquiring the navigation data 51, the media retrieval module 30 compares the retrieval demand condition 511 included in the navigation data 51 with all the lists of scene description information 42 in the storage module 80 to retrieve a target list. The scene description information 42 included in the target list is the target scene description information 42. Subsequently, the media retrieval module 30 retrieves the scene section media 43 from the media data 41 according to the target list.
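The reuse of stored lists can be sketched as a simple lookup keyed by the navigation data that produced each list. This sketch assumes an exact-match key; the cache keys and entries are illustrative, not from the specification.

```python
# Hypothetical sketch of reusing stored retrieval results: each cached list
# maps the demand condition that produced it to its scene description
# entries (timepoint, scene identifier). All values are illustrative.

cached_lists = {
    "three-point, third quarter": [("11:39", "scene-a"), ("09:16", "scene-b")],
    "free throws": [("05:02", "scene-c")],
}

def find_target_list(condition):
    """Return the stored target list matching the demand condition, if any."""
    return cached_lists.get(condition)

print(find_target_list("free throws"))  # [('05:02', 'scene-c')]
```

A hit avoids re-comparing the navigation data against every piece of scene description information; on a miss, the full comparison is performed and its result may be stored as a new list.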
A media providing module 10 provides media data 41 and at least one piece of scene description information 42 corresponding thereto (Step S110). According to different manners of providing the media data 41 and the scene description information 42, the detailed implementation of this step is also different.
As shown in
As shown in
As shown in
The media retrieval module 30 compares each piece of scene description information 42 according to the navigation data 51 to find at least one scene period 421 matching the navigation data 51 (Step S120). According to different manners of providing the media data 41 and the scene description information 42, the detailed implementation of this step is also different.
As shown in
As shown in
The media retrieval module 30 retrieves at least one piece of scene section media 43 corresponding to the scene periods 421 from the media data 41 (Step S130). As described above, the media retrieval module 30 compares the retrieval demand condition 511 of the navigation data 51 with each piece of scene description information 42 to acquire more than one scene period 421, and then retrieves all scene section media 43 corresponding to the scene periods 421 from the acquired media data 41. The manner of acquiring the scene periods 421 is described above and is not repeated here.
However, the navigation data 51 may include more than one type of retrieval demand condition 511, and may also include multiple different types of retrieval demand conditions 511. The media retrieval module 30 retrieves multiple groups of media frames 411 from the media data 41 according to the retrieval demand conditions 511. Furthermore, the media retrieval module 30 constructs one or more pieces of self-generated media with each piece of retrieved scene section media 43 according to the navigation data 51. Or, the media retrieval module 30 constructs playing media that meets the demands of the user with the scene section media 43 or the group of media frames 411 formed of the scene section media 43.
The method further includes: constructing, by the media retrieval module 30, a media playing tree structure with each piece of scene section media 43 corresponding to each retrieval demand condition 511 according to data dependences of retrieval demand conditions 511, data attributes, and data level relationships (Step S140).
When a user merely wants to watch “scoring pictures in the third quarter”, this demand condition may be set in the navigation data 51. A media retrieval module 30 acquires the third-quarter countdown timepoints corresponding to the “scoring scenes in the third quarter”, namely “11:39”, “10:50”, “10:49”, “09:39”, “09:16”, “08:58”, “08:44”, “08:29”, “08:07”, “07:47”, “07:35”, and so on, takes an interval before or after each timepoint, or centered on it, as a scene period 421 described above, and then uses the scene periods 421 to retrieve the corresponding scene interval images (namely, the scene section media 43 described above) to be played by the relevant playing module. Alternatively, when the demand condition set by the user is “scoring pictures of three-point shots in the third quarter”, the media retrieval module 30 takes the third-quarter countdown timepoints corresponding to the “scoring scenes of three-point shots in the third quarter”, namely “11:39”, “09:16”, “08:58”, “07:47”, and so on, calculates the demanded scene periods 421 based on those timepoints, and then uses the scene periods 421 to retrieve the corresponding scene interval images to be played by the relevant playing module.
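Turning the acquired countdown timepoints into scene periods can be sketched as follows. The ±5-second window is an assumed value chosen for illustration; the specification only says an interval before, after, or centered on each timepoint is used.

```python
# Sketch of computing scene periods 421 from countdown timepoints: each
# "MM:SS" string is converted to seconds, then a fixed window (an assumed
# value) is taken around it.

def to_seconds(mmss):
    m, s = mmss.split(":")
    return int(m) * 60 + int(s)

def scene_periods(timepoints, window=5):
    """Return (start, end) pairs centered on each timepoint."""
    return [(to_seconds(t) - window, to_seconds(t) + window) for t in timepoints]

print(scene_periods(["11:39", "09:16"]))  # [(694, 704), (551, 561)]
```

Each resulting pair is then used to cut the corresponding scene interval images out of the game recording for playback.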
The first level of the media playing tree structure is the whole game image. The second level comprises branches of the first level, namely the images of the two teams. The third level comprises branches of the second level, namely close-up images of the two teams in the game. The fourth level comprises branches of the third level, namely close-up images of specific players of the two teams in the game.
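The four levels described above can be rendered as nested dictionaries, one nesting depth per level. The level names and player numbers below are paraphrased or invented for illustration only.

```python
# Illustrative rendering of the four-level media playing tree. Labels are
# paraphrased from the text; the player entries are hypothetical.

playing_tree = {
    "whole game": {                           # level 1: whole game image
        "Team A images": {                    # level 2: per-team images
            "Team A close-ups": {             # level 3: team close-ups
                "player 2 close-ups": [],     # level 4: specific players
            },
        },
        "Team B images": {
            "Team B close-ups": {
                "player 9 close-ups": [],
            },
        },
    },
}

# Walking down the levels reaches the close-ups of a specific player:
node = playing_tree["whole game"]["Team A images"]["Team A close-ups"]
print("player 2 close-ups" in node)  # True
```

A demand condition aimed at level four (a specific player) descends the full path, while one aimed at level two (a whole team) stops earlier, which is what allows mixing levels in a single retrieval.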
After a user sets the navigation data 51, the media retrieval module 30 retrieves the demanded media frame 411 from the media data 41 through the media playing tree structure according to the demand condition included in the navigation data 51, so as to form the above self-generated image to be played by a relevant playing module.
However, the media frames 411 retrieved by the media retrieval module 30 need not come from the same image level. For example, a user may want to watch “all scoring pictures of No. 2 player of Team A” and then watch “whole images of the fourth quarter”. The media retrieval module 30 retrieves the media frames 411 or scene description information 42 corresponding to “all scoring pictures of No. 2 player of Team A” according to the fourth level of the media playing tree structure, retrieves the media frames 411 or scene description information 42 corresponding to the “whole images of the fourth quarter” according to the second level of the media playing tree structure, and forms the demanded self-generated images according to the media retrieval manner to be played by the relevant playing module. That is, the media retrieval module 30 may retrieve scene section media from the same level, different levels, or partially same and partially different levels of the media playing tree structure, and integrate the scene section media to play the scene intervals.
The invention being thus described, it will be obvious that the same may be varied in many ways. Such variations are not to be regarded as a departure from the spirit and scope of the invention, and all such modifications as would be obvious to one skilled in the art are intended to be included within the scope of the following claims.
Number | Date | Country | Kind |
---|---|---|---|
101134755 | Sep 2012 | TW | national |