The present principles generally relate to an apparatus and method for recording transition history and selecting next playback from the transition history. In particular, the transition history may be recorded by a server or a playback device, and if the current playback media content segment is a source media content segment or the source media content segment is similar to the current playback media content segment in the transition history, a destination media content segment transitioned from the source media content segment in the transition history is selected for playback.
With an increasing amount of video on demand (VOD) content available, a major problem for viewers is to find interesting video segments without searching full length videos exhaustively. A central challenge is to save users' time and effort in browsing, searching, navigating, and consuming full length video content, while avoiding several segments which are not of interest.
One traditional approach for VOD viewing of content is a video playlist. A playlist often contains full-length videos. The disadvantage of this approach is that coarse playlists do not take into account users' interests in highlighted sections of a video, and often users jump ahead during playback of a video to another portion of the same video or to another video in a playlist. Crowdsourcing is becoming a necessity for image and video tasks which are normally too compute-intensive for ordinary video processing techniques. Thus, there is a need to improve selection of a next video or video segment for playback.
According to an exemplary embodiment of the present principles, a method is presented. The method comprises playing back a first media content segment; determining if a first source media content segment, which is one of the first media content segment and a segment similar to the first media content segment, exists in a transition history, the transition history including transitions from at least one source media content segment to at least one destination media content segment; and if the first source media content segment exists, playing back a first destination media content segment transitioned from the first source media content segment in the transition history.
According to an exemplary embodiment of the present principles, an apparatus is presented. The apparatus comprises a processor configured to play back a first media content segment; determine if a first source media content segment, which is one of the first media content segment and a segment similar to the first media content segment, exists in a transition history, the transition history including transitions from at least one source media content segment to at least one destination media content segment; and if the first source media content segment exists, play back a first destination media content segment transitioned from the first source media content segment in the transition history.
According to an exemplary embodiment of the present principles, a computer program product stored in non-transitory computer-readable storage media is presented, comprising computer-executable instructions for playing back a first media content segment; determining if a first source media content segment, which is one of the first media content segment and a segment similar to the first media content segment, exists in a transition history, the transition history including transitions from at least one source media content segment to at least one destination media content segment; and if the first source media content segment exists, playing back a first destination media content segment transitioned from the first source media content segment in the transition history.
The above-mentioned and other features and advantages of the present principles and the manner of attaining them, will become more apparent and the present principles will be better understood by reference to the following description of embodiments of the present principles taken in conjunction with the accompanying drawings, wherein:
The examples set out herein illustrate exemplary embodiments of the present principles. Such examples are not to be construed as limiting the scope of the present principles in any manner.
The inventors recognize that viewers of video on demand (VOD) frequently transition between a sequence of video segments, rather than watch one video continuously from beginning to end. Examples of transitions can be found across diverse video genres: (1) music videos; (2) instrumental and tutorial videos; (3) how-to videos; (4) sports segments; (5) comedy clips, etc. The inventors recognize that transitions encode information about which video segments are of high-interest to viewers (i.e., in a way similar to links between web pages).
According to the principles of this disclosure, a crowdsourcing system and/or a smart playback device can be used to collect user transitions between media content segments, such as video segments. The crowdsourcing system or the smart playback device, for example, can provide an interface to collect user input by prompting users to switch between different media content segments in a media content series/playlist. The switching is guided by previous user selections, for example, from the user alone, from users using the same playback device, or from users of a social network, or is guided by similarity of media content segments. From all user transitions, a media content segment rank can be computed based on the transition history, which may be represented by a transition graph (web of transitions). This rank may supplement other metrics to evaluate the popularity of a particular media content segment. The disclosure also describes optimization algorithms (e.g., dynamic programming) for constructing a mixtape, a CD, a DVD, or a file including an optimized number of media content segments having the highest ranks/scores under the constraint that the total viewing time is bounded.
A crowdsourcing system can be provided by a media content provider, for example, a streaming media website such as Netflix, Amazon, M-GO, or another streaming provider. The system may also be provided by a social networking provider, such as MySpace, Facebook, or LinkedIn. A social networking site provides a service or system that facilitates the creation, maintenance and management of social networks. For example, Facebook is a social networking site that allows users to create social networks such as interest groups therein. In one example, a social network may comprise a group of the user's friends. In another example, a social network may comprise a group of users sharing a similar interest, for example, in travel.
As used herein, “media content” may be defined to include any type of media, including any type of audio, video, and/or image media content received from any source. For example, “media content” may include Internet content, streaming services (e.g., M-GO, Netflix, Hulu, Amazon), recorded video content, video-on-demand content, broadcasted content, television content, television programs (or programming), advertisements, commercials, music, movies, video clips, interactive games, network-based entertainment applications, and other media assets. Media assets may include any and all kinds of digital media formats, such as audio files, image files or video files.
The term refers to a media content that is logically or physically segmented into segments. For example, each segment may last for a predetermined playback interval, such as but not limited to 90 seconds. The segmentation is important, so that when a user specifies a point in the media content for playback, the portion of the media content can be identified. The segmentation may also be based on different scenes, each constituting a segment. A media content segment may be a part of a media content or the media content itself, for example, if the media content is short. Examples of short media content are audio and video clips.
The present principles recognize that a user device may also record transition history and automatically select the next media content segment for playback according to a recorded transition history or a transition history obtained from a crowdsourcing system.
Some examples of videos include tutorial videos for learning an instrument, music videos, comedy clips, and sports videos. In these exemplary videos, users normally jump from one video to the next in search of the most relevant or interesting content. Full-length videos are difficult to consume in one viewing session. After specific segments of a video are viewed, a user may not be interested in the remaining content of the video. According to principles of the disclosure, the sequence of the viewed segments forms a mixtape, a CD, a DVD, or a file that the user can view to obtain the most pertinent and engaging content. According to the principles of the disclosure, the switching or transitioning from one media content segment to another media content segment is guided by a recorded transition history, as will be explained in more detail later in this disclosure.
The present description illustrates the present principles. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the present principles and are included within its spirit and scope.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the present principles and the concepts contributed by the inventors to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions.
Moreover, all statements herein reciting principles, aspects, and embodiments of the present principles, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.
Thus, for example, it will be appreciated by those skilled in the art that the block diagrams presented herein represent conceptual views of illustrative circuitry embodying the present principles. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes which may be substantially represented in computer readable media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term “processor” or “controller” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (“DSP”) hardware, read-only memory (“ROM”) for storing software, random access memory (“RAM”), and non-volatile storage.
Other hardware, conventional and/or custom, may also be included. Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
In the claims hereof, any element expressed as a means for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function. The present principles as defined by such claims reside in the fact that the functionalities provided by the various recited means are combined and brought together in the manner which the claims call for. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
Reference in the specification to “one embodiment”, “an embodiment”, or “an exemplary embodiment” of the present principles, as well as other variations thereof, means that a particular feature, structure, characteristic, and so forth described in connection with the embodiment is included in at least one embodiment of the present principles. Thus, the appearances of the phrase “in one embodiment”, “in an embodiment”, or “in an exemplary embodiment”, as well as any other variations, appearing in various places throughout the specification are not necessarily all referring to the same embodiment.
It is to be appreciated that the use of any of the following “/”, “and/or”, and “at least one of”, for example, in the cases of “A/B”, “A and/or B” and “at least one of A and B”, is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of both options (A and B). As a further example, in the cases of “A, B, and/or C” and “at least one of A, B, and C”, such phrasing is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of the third listed option (C) only, or the selection of the first and the second listed options (A and B) only, or the selection of the first and third listed options (A and C) only, or the selection of the second and third listed options (B and C) only, or the selection of all three options (A and B and C). This may be extended, as readily apparent by one of ordinary skill in this and related arts, for as many items listed.
Various exemplary smart user devices 160-1 to 160-n in
Smart user devices 160-1 to 160-n shown in
An exemplary smart user device 160-1 in
The exemplary device 160-1 in
Exemplary device 160-1 also comprises a memory 185 which may represent both a transitory memory such as RAM, and a non-transitory memory such as a ROM, a hard drive or a flash memory, for processing and storing different files and information as necessary, including computer program products and software (e.g., for performing a method as represented by a flow chart diagram of
Smart user devices 160-1 to 160-n in
The server 105 comprises the processor 110 which controls the various functions and components of the server 105 via a control bus 107 as shown in
In addition, server 105 is connected to network 150 through a communication interface 120 for communicating with other servers or web sites (not shown) and to one or more smart user devices 160-1 to 160-n, as shown in
The method 200 is invoked at 210 of
The term, as previously stated, refers to a media content that is logically or physically segmented into segments. For example, each segment may last for a predetermined playback interval, such as but not limited to 90 seconds. The segmentation is important, so that when a user specifies a point in the media content for playback, the portion of the media content can be identified. The segmentation may also be based on different scenes, each constituting a segment. A media content segment may be a part of a media content or the media content itself, for example, if the media content is short. Examples of short media content are audio and video clips.
The identification of a logical segment in a media content may be the starting and ending time, the starting time and the duration, etc.
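By way of illustration only, the two identification forms mentioned above (starting and ending time, or starting time and duration) and the fixed-interval segmentation can be sketched as follows. The names `Segment` and `segment_at`, the content identifier, and the use of seconds as the time unit are assumptions made for this sketch and do not appear in the disclosure; the 90-second default is the exemplary interval given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A logical segment identified by content ID, start time, and duration (seconds)."""
    content_id: str
    start: float
    duration: float

    @property
    def end(self) -> float:
        # The (start, end) form is derivable from the (start, duration) form.
        return self.start + self.duration


def segment_at(content_id: str, position: float, total_length: float,
               interval: float = 90.0) -> Segment:
    """Return the fixed-interval segment that contains a playback position."""
    if not 0 <= position < total_length:
        raise ValueError("position lies outside the media content")
    index = int(position // interval)
    start = index * interval
    # The last segment may be shorter than the interval.
    duration = min(interval, total_length - start)
    return Segment(content_id, start, duration)
```

Scene-based segmentation, also contemplated above, would replace the fixed `interval` with scene boundaries and is not shown here.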
If segment information is not available for a media content segment, the processor 165 may automatically identify segments according to the principles stated previously.
The meaning of the term “playing back” is conventional for the processor 165 of the user device 160-1. The term as applied to the server 105 is used herein to mean streaming or broadcasting.
At step 220, the processor 165 is operative or configured to determine if a first source media content segment, which is one of the first media content segment and a segment similar to the first media content segment, exists in a transition history, the transition history including transitions from at least one source media content segment to at least one destination media content segment.
In one embodiment, the processor 165 first determines if the first media content segment itself exists in the transition history as a source media content segment. Only if it does not, the processor 165 then determines if a source media content segment similar to the first media content segment exists in the transition history.
Conventional methods can be used to establish similarity. For example, if the major scene of the first media content segment and the major scene of a source media content segment in the transition history have a similar structure, they may be considered similar. In this method, if the major scene of the first media content segment includes a ball and the source media content segment includes a ball or a globe, they are structurally similar because they both have a similar geometric feature; in this case, they are primarily circular. In another example, if both segments include the same semantic content, e.g., presence of a dog, the two are considered similar. In some embodiments, the presence/absence of other specific aesthetic elements like repetitive patterns, textures, etc., can also be considered for aesthetic similarity.
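As a non-limiting sketch of the two-stage determination described above, the lookup may be expressed as follows. The function name `find_source`, the dictionary representation of the transition history, and the caller-supplied `is_similar` predicate are assumptions made for illustration; the disclosure does not prescribe a data structure or a particular similarity test.

```python
from typing import Callable, Dict, Hashable, List, Optional

# Assumed representation: source segment -> list of destination segments.
History = Dict[Hashable, List[Hashable]]


def find_source(current: Hashable, history: History,
                is_similar: Callable[[Hashable, Hashable], bool]) -> Optional[Hashable]:
    """Return the source segment to use for the current segment, or None."""
    # First preference: the current segment itself is a recorded source.
    if current in history:
        return current
    # Fallback: the first recorded source judged similar to the current segment.
    for source in history:
        if is_similar(current, source):
            return source
    return None
```

In this sketch the first similar source found is returned; an implementation could equally rank several similar candidates before choosing one.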
Referring to transition history 310 made by user A, it is clear that user A makes a transition from segment 1 of Video 1 to segment 3 of Video 1 during the playback. User A does not interrupt the playback of segment 3 of Video 1 and the playback continues into segment 4 of Video 1, where user A transitions the playback to segment 2 of Video 2. Since user A does not make a transition at segment 3 of Video 1 to segment 2 of Video 2, it indicates that user A likes both segments 3 and 4 of Video 1. As such, in one embodiment, it can be interpreted that user A may make a transition from segment 1 of Video 1 to segment 3 of Video 1 or directly to segment 4 of Video 1.
User A also does not make a transition to segment 4 of Video 2 at segment 2 of Video 2. Rather, user A does not make the transition until the playback of segment 3 of Video 2. As such, in one embodiment, it can be interpreted that user A may make a transition from segment 4 of Video 1 to segment 2 of Video 2 or directly to segment 3 of Video 2. Similarly, after making a transition from segment 3 of Video 2 to segment 4 of Video 2, user A does not make a transition to segment 2 of Video 3 until segment 5 of Video 2. As such, in one embodiment, it can be interpreted that user A may make a transition from segment 3 of Video 2 to segment 4 of Video 2 or directly to segment 5 of Video 2.
Referring to transition history 330 made by user B, it is clear that user B makes a transition from segment 2 of Video 2 to segment 1 of Video 3 during the playback. User B does not interrupt the playback of segment 1 of Video 3 and the playback continues into segment 2 of Video 3, where user B transitions the playback to segment 2 of Video 4. Since user B does not make a transition at segment 1 of Video 3, it can be interpreted that user B may make a transition from segment 2 of Video 2 to segment 1 of Video 3 or directly to segment 2 of Video 3.
Referring to transition history 350 made by user C, it is clear that user C makes a transition from segment 2 of Video 1 to segment 3 of Video 1 during the playback. User C does not interrupt the playback of segment 3 of Video 1 and the playback continues into segment 4 of Video 1, where user C transitions the playback to segment 1 of Video 3. Since user C does not make a transition at segment 3 of Video 1, it can be interpreted that user C may make a transition from segment 2 of Video 1 to segment 3 of Video 1 or directly to segment 4 of Video 1.
User C also does not make a transition to segment 1 of Video 5 at segment 1 of Video 3. Rather, user C does not make the transition until the playback of segment 2 of Video 3. As such, in one embodiment, it can be interpreted that user C may make a transition from segment 4 of Video 1 to segment 1 of Video 3 or directly to segment 2 of Video 3. Similarly, after making a transition from segment 2 of Video 3 to segment 1 of Video 5, user C does not make a transition to segment 3 of Video 5 until segment 2 of Video 5. As such, in one embodiment, it can be interpreted that user C may make a transition from segment 2 of Video 3 to segment 1 of Video 5 or directly to segment 2 of Video 5.
The processor 165 may collect and record a transition history for each user if the processor 165 can identify each user, for example, by the login name of the user. The processor 165 may combine/merge the three transition histories into one representing the combined transition history of all the users of the user device 160-1. The combined history can also be obtained by collecting and recording all transitions without identifying the users of the user device 160-1.
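For illustration, the merging of per-user transition histories may be sketched as below. Only the transitions explicitly described above for users A, B, and C are listed; the inferred "or directly to" transitions are omitted. The `(video number, segment number)` tuple encoding and the names `combine_histories`, `per_user`, and `combined` are assumptions of this sketch.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Segment = Tuple[int, int]             # (video number, segment number)
Transition = Tuple[Segment, Segment]  # (source, destination)


def combine_histories(histories: Iterable[List[Transition]]) -> Counter:
    """Merge transition lists into one multiset that counts each transition."""
    combined: Counter = Counter()
    for transitions in histories:
        combined.update(transitions)
    return combined


per_user: Dict[str, List[Transition]] = {
    "A": [((1, 1), (1, 3)), ((1, 4), (2, 2)), ((2, 3), (2, 4)), ((2, 5), (3, 2))],
    "B": [((2, 2), (3, 1)), ((3, 2), (4, 2))],
    "C": [((1, 2), (1, 3)), ((1, 4), (3, 1)), ((3, 2), (5, 1)), ((5, 2), (5, 3))],
}
combined = combine_histories(per_user.values())
```

Keeping a count per transition, rather than a plain set, is a design choice of the sketch; it preserves how often a transition was made in case that information is later used for ranking.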
In one embodiment, if a transition is from a first segment to a second segment and before the completion of the playback of the second segment, the user switches to a third segment, the processor 165 is operative or configured to add a transition from the first segment to the third segment in the transition history and optionally delete the transition from the first segment to the second segment in the transition history. The deletion is optional because the switching in the middle of playing the second segment may indicate that the user does not like the second segment or just likes a portion of the second segment.
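A minimal sketch of this update rule follows, assuming the transition history is held as a set of (source, destination) pairs. The function name `record_early_switch` and the flag `drop_abandoned` are hypothetical names introduced only for this example.

```python
from typing import Hashable, Set, Tuple

Transition = Tuple[Hashable, Hashable]  # (source, destination)


def record_early_switch(history: Set[Transition], first: Hashable,
                        second: Hashable, third: Hashable,
                        drop_abandoned: bool = False) -> None:
    """Update the history after the user left `second` early for `third`."""
    # Always record the transition from the first segment to the third segment.
    history.add((first, third))
    if drop_abandoned:
        # Optional deletion: the user may still like part of the second
        # segment, so the original transition is kept by default.
        history.discard((first, second))
```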
A transition history can be stored in any form. For example, the combined transition history of users A, B, C illustrated above is shown in
It is recognized that the more source nodes transition into a destination node, the more popular or favorable that destination node is to users. For example, node 412 has only one source node, node 410, transitioning into it, while node 411 has two source nodes, nodes 410 and 413, transitioning into it. As such, node 411 should be more popular or favorable than node 412. As used herein, a source node is connected to a destination node by an edge directed from the source node to the destination node.
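The in-degree measure just described can be illustrated with the three edges named above (410 to 412, 410 to 411, and 413 to 411). The function name `in_degree_popularity` is an assumption of this sketch, and the remaining edges of the combined graph are not reproduced.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Set, Tuple


def in_degree_popularity(
        edges: Iterable[Tuple[Hashable, Hashable]]) -> Dict[Hashable, int]:
    """Count the distinct source nodes transitioning into each destination node."""
    sources: Dict[Hashable, Set[Hashable]] = defaultdict(set)
    for source, destination in edges:
        sources[destination].add(source)
    return {node: len(srcs) for node, srcs in sources.items()}


edges = [(410, 412), (410, 411), (413, 411)]
popularity = in_degree_popularity(edges)
```

With these edges, node 411 receives a count of two and node 412 a count of one, matching the comparison made above.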
In one embodiment, the popularity or rank of a node in the combined transition history can be determined by an algorithm similar to Google's PageRank. In this application, the rank of a segment is the rank determined by the PageRank algorithm in relation to the combined graph shown in
The processor 165 is operative or configured to store a copy of at least one transition history in memory 185. The processor 165 is also operative or configured to update a transition history when a new transition is detected.
The server 105 can also collect a transition history for each user and/or a combined history for all users including the users of the user device 160-1 and users of other user devices. The processor 165 may default to use one of the transition histories stored locally or one of the transition histories stored in the server 105. The processor 165 may identify a user playing back a media content segment and use the transition history stored locally for that user for automatic transition to the next media content segment or for presenting a list of suggested next media content segments arranged in the order of popularity for the user to select one for playback. As stated above, the popularity or rank of a destination media content segment may be determined by the number of source media content segments transitioning into that destination media content segment: the more source media content segments, the more popular the destination. PageRank and other algorithms can be used to rank a segment as well.
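As one possible illustration of a PageRank-style ranking over a transition graph, a plain power-iteration sketch is given below. It is not asserted to be the ranking used by any embodiment: the function name `page_rank`, the damping factor of 0.85, the fixed iteration count, and the even redistribution of rank from nodes without outgoing transitions are all assumptions of this example.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def page_rank(edges: Iterable[Tuple[Hashable, Hashable]],
              damping: float = 0.85, iterations: int = 50) -> Dict[Hashable, float]:
    """Rank nodes of a directed transition graph by power iteration."""
    edge_list = list(edges)
    # Collect nodes in first-seen order without requiring them to be sortable.
    nodes = list(dict.fromkeys(node for edge in edge_list for node in edge))
    out_links: Dict[Hashable, List[Hashable]] = {node: [] for node in nodes}
    for source, destination in edge_list:
        out_links[source].append(destination)

    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing transition is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / count + damping * dangling / count
                    for node in nodes}
        for source in nodes:
            for destination in out_links[source]:
                new_rank[destination] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank
```

Applied to the edges 410 to 412, 410 to 411, and 413 to 411, this sketch ranks node 411 above node 412, in agreement with the source-count measure.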
In one embodiment, the processor 165 may present all versions of transition history, for example, individual versions, combined version for all users of the user device 160-1, and combined version for users including users of the user device 160-1 and users of at least another user device, and allow the user to select one of them.
At step 240, if the first source media content segment exists, the processor 165 is operative or configured to play back a first destination media content segment transitioned from the first source media content segment in the transition history.
If the first source media content segment is not the first media content segment but is similar to the first media content segment, the processor 165 may be operative or configured to update the transition history to record the new transition from the first media content segment to the first destination media content segment.
If the first source media content segment transitions to more than one destination media content segment, the processor 165 may automatically select the most popular destination media content segment for the next playback at end of the playback of current media content segment. The popularity of a media content segment can be measured, for example, by the number of source media content segments transitioning to that media segment in the transition history. For example, in
As shown in
In one embodiment, if the first source media content segment does not exist in the transition history, the processor 165 is operative or configured to present at least one media content segment having a same type as the first media content segment. Two media content segments have the same type, for example, if they have the same actors/actresses or belong to the same category, theme, and/or scene. After the user has selected one of the suggested media content segments, the processor 165 is operative or configured to play back the selected suggested media content segment, and optionally, update the transition history to record the new transition from the first media content segment to the selected suggested media content segment.
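The selection behavior described in the preceding paragraphs, namely choosing the most popular destination when the source exists and otherwise suggesting segments of the same type, might be sketched as follows. The names `in_degree`, `choose_next`, and `suggest_same_type`, the dictionary form of the history, and the `catalog` mapping from segment to a single type label are hypothetical and introduced only for illustration.

```python
from typing import Dict, Hashable, List, Optional

# Assumed representation: source segment -> list of destination segments.
History = Dict[Hashable, List[Hashable]]


def in_degree(history: History, segment: Hashable) -> int:
    """Number of source segments that transition into the given segment."""
    return sum(1 for destinations in history.values() if segment in destinations)


def choose_next(source: Hashable, history: History) -> Optional[Hashable]:
    """Pick the most popular destination of a source, or None if there is none."""
    destinations = history.get(source, [])
    if not destinations:
        return None
    return max(destinations, key=lambda d: in_degree(history, d))


def suggest_same_type(current: Hashable, catalog: Dict[Hashable, str]) -> List[Hashable]:
    """List other segments that share the current segment's type label."""
    wanted = catalog.get(current)
    return [seg for seg, kind in catalog.items() if kind == wanted and seg != current]
```

A caller would try `choose_next` first and present the result of `suggest_same_type` only when it returns `None`; ties between equally popular destinations are resolved here by list order, which is a simplification.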
If the server 105 is updating a transition history, the processor 165 is operative or configured to transmit the updating information to the server.
In one embodiment, the user may specify a duration and the processor 165 is operative or configured to compile a mixtape that includes an optimized number of media content segments that have the highest ranks. The ranks may be the ranks described previously or derived by other methods. The ranks can be a combination of ranks from different ranking methods. The ranks can be determined from any version of the transition history.
Assume that the user-specified duration for the mixtape, the CD, the DVD, or the file is $T$, that each segment of all the media contents is identified by an integer $i$, that the rank and the duration of segment $i$ are represented by $S_i$ and $t_i$, and that $X_i$ represents whether segment $i$ is selected (having a value of 1) or not selected (having a value of 0). The selection of segments for inclusion into the mixtape, the CD, the DVD, or the file can then be determined by solving the following optimization problem:
Maximize $\sum_{i} S_i X_i \,(t_i/T)$
Subject to: $\sum_{i} t_i X_i \le T$, where each $X_i$ is 0 or 1
This is a 0/1 knapsack optimization problem, which can be solved by dynamic programming. After the selection of the segments, the processor 165 compiles the mixtape, the CD, the DVD, or the file with all the selected segments. The user can then select the mixtape, the CD, the DVD, or the file for playback.
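A minimal dynamic programming sketch of the selection problem stated above is given below for illustration. It assumes that segment durations and the limit T are positive whole numbers of seconds, which the disclosure does not require, and the function name `compile_mixtape` is hypothetical. Each segment contributes its rank multiplied by its duration divided by T to the objective, and the total duration of the selected segments may not exceed T.

```python
from typing import List, Sequence, Tuple


def compile_mixtape(ranks: Sequence[float], durations: Sequence[int],
                    limit: int) -> Tuple[float, List[int]]:
    """0/1 knapsack: maximize sum(S_i * X_i * t_i / T) with sum(t_i * X_i) <= T."""
    n = len(ranks)
    values = [ranks[i] * durations[i] / limit for i in range(n)]
    # best[i][c]: best objective using the first i segments within c seconds.
    best = [[0.0] * (limit + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        t, v = durations[i - 1], values[i - 1]
        for c in range(limit + 1):
            best[i][c] = best[i - 1][c]
            if t <= c and best[i - 1][c - t] + v > best[i][c]:
                best[i][c] = best[i - 1][c - t] + v
    # Walk back through the table to recover which segments were selected.
    chosen: List[int] = []
    c = limit
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen.append(i - 1)
            c -= durations[i - 1]
    return best[n][limit], sorted(chosen)
```

The function returns the objective value together with the indices of the selected segments; the table uses on the order of n times T entries, so very long limits may call for a coarser time unit.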
While several embodiments have been described and illustrated herein, those of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the functions and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the present embodiments. More generally, those skilled in the art will readily appreciate that all parameters, dimensions, materials, and configurations described herein are meant to be exemplary and that the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the teachings herein is/are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments described herein. It is, therefore, to be understood that the foregoing embodiments are presented by way of example only and that, within the scope of the appended claims and equivalents thereof, the embodiments disclosed may be practiced otherwise than as specifically described and claimed. The present embodiments are directed to each individual feature, system, article, material and/or method described herein. In addition, any combination of two or more such features, systems, articles, materials and/or methods, if such features, systems, articles, materials and/or methods are not mutually inconsistent, is included within the scope of the present embodiment.
This application claims the benefit of U.S. Provisional Patent Application 62/352,592, filed Jun. 21, 2016.
Number | Date | Country
---|---|---
62/352,592 | Jun 2016 | US