The present systems and methods relate generally to analyzing trends and patterns in video consumption, and more particularly to identifying trends in video viewer activity as a function of embedded video metadata for purposes of optimizing content associated with video media.
Information relating to viewer interaction with video media, whether that media is Internet videos, DVDs, television programs, etc., is invaluable for a variety of purposes, including advertising, editing video content, and the like. Current systems enable tracking of viewer behavior during videos, such as whether a viewer rewinds a video to a certain point to watch part of the video again, or if a viewer pauses the video at a particular time, etc. These behaviors are often tracked for Internet videos by simply recording user interaction with a media player, but may also be tracked and recorded on home television sets via a set-top-box attached to the television or through a digital cable system. This viewer behavior data (generally referred to as “viewing metrics” or “video consumption data”) provides information that enables content associated with videos to be edited or adapted to meet a desired objective, such as by targeting advertising to frequently-watched scenes, or editing a video to remove portions that are regularly ignored.
Typically, video consumption data comprises common “output states” for videos, which describe viewer actions associated with the video, such as i) fast-forwarding a video, ii) rewinding a video, iii) pausing a video, iv) closing a video player, v) navigating to a new video program (i.e. changing the channel or selecting a new program in a video player), vi) engaging with an advertisement (i.e. clicking on an Internet advertisement or otherwise interacting with an advertisement), and other similar actions. These output states are tracked and recorded for a wide range of videos across many viewers over time to develop trends in viewer behavior for each video. For example, a trend may emerge that indicates that 40% of viewers rewind and watch a certain portion of a particular video more than once. The output states are relatively easy to track for Internet videos by simply monitoring user interaction with a media player. Also, with the advent of digital video recording, more viewing metric data is becoming available for television use.
However, these viewing metrics only tell part of the story. They do not explain why viewers engage in certain behaviors while watching videos. Thus, while video consumption data may be helpful for one particular video, that same information generally cannot be readily applied to other videos, even if those videos are related to the given video (i.e. same actors, similar subject matter, etc.), because there is no direct link between the viewer behavior and the content of the video. At best, viewing metrics can be compared to the corresponding video content on a video-by-video basis, and a guess can be made as to why certain viewer behavior occurs. For example, it may be assumed that viewers fast-forward through certain scenes because the scenes are boring, depressing, or just too long. Or, it may be assumed that viewers rewind and watch a certain portion of a video repeatedly because a popular actor or actress is in that portion of the video. However, the assumptions made about video content to explain viewer behavior are merely best guesses: they are imprecise, often time-consuming to generate, and frequently inaccurate. Therefore, because it is difficult to link viewer behavior with common video concepts (such as a specific actor, setting, dialogue, etc.), viewing metrics are typically only helpful on a per-video basis. If a new video is introduced, targeted advertisements or other video content generally cannot be applied to the video until viewing metrics are obtained, over time, for the specific video.
Further, many viewer behaviors may be triggered by a combination of several video content elements happening in a video at once, and thus no direct correlation can be drawn between one particular content element and a resulting viewer behavior. For instance, viewers may consistently choose to stop watching a particular video at a certain point in the video not because of any one element, but because a combination of many elements may make the video no longer appealing. As an example, a certain actor may be very popular in one video, causing a high rate of viewer interest in the video. However, the same actor in another video, based on the actor's character, a particular setting, and the overall subject matter of the video, may cause the video (or a scene in the video) to be highly unpopular, causing many viewers to exit the video. Accordingly, analysis of the second video may reveal that it was the combination of the character, setting, and subject matter of the video that caused viewers to exit the video. However, because traditional viewing metrics do not link content of videos to viewer behavior, the particular combination of content attributes that made the scene within the video unpopular may never be discovered.
Additionally, for advertising purposes, pure viewing metrics alone are often insufficient to optimize user interaction with or attention to advertisements. For example, Internet videos may display banner or pop-up advertisements while the videos are playing. An advertiser may elect to display such advertisements during the most-watched portions of a video (as indicated by video consumption data) because the advertiser believes the viewer is paying a great deal of attention to this portion of video, and thus will be likely to see the advertisement. However, it may actually be the case that because the viewer is highly-interested in the content of the video itself, the viewer pays little or no attention to the displayed advertisement. Thus, while simply tracking viewer behavioral trends may provide some helpful information to an advertiser or video editor, the reasons why viewers engage in certain behaviors, such as why a viewer clicks on an advertisement during a video, or what it is about the video that makes it popular, could be far more important.
If it were available, information relating to the causes behind viewer behavior could be applied across a wide range of videos and media, including new videos for which no viewing metrics are available. If correlations could be drawn between certain aspects or attributes of videos and corresponding viewer behavior, then advertisers, video editors, and other content providers could tailor future videos and advertisements accordingly.
Therefore, there is a long-felt but unresolved need for systems and/or methods that compare the behavior of viewers of video media with the associated content of the video media to generate and identify correlations, rules, and trends between specific content elements of the media and corresponding viewer behavior.
Briefly described, and according to one embodiment, the present disclosure is directed to systems and methods for analyzing video content in conjunction with historical video consumption data, and identifying and generating relationships, rules, and correlations between the video content and viewer behavior. According to one aspect, a system receives video consumption data associated with one or more output states for one or more videos. The output states generally comprise tracked and recorded viewer behaviors during videos such as pausing, rewinding, fast-forwarding, clicking on an advertisement (for Internet videos), and other similar actions. Next, the system receives metadata associated with the content of one or more videos. The metadata is associated with video content such as actors, places, objects, dialogue, etc. The system then analyzes the received video consumption data and metadata via a multivariate analysis engine to generate an output analysis of the data. The output may be a scatter plot, chart, list, or other similar type of output that is used to identify patterns associated with the metadata and the one or more output states. Finally, the system generates one or more rules incorporating the identified patterns, wherein the one or more rules define relationships between the video content (i.e. metadata) and viewer behavior (i.e. output states).
The accompanying drawings illustrate one or more embodiments of the disclosure and, together with the written description, serve to explain the principles of the disclosure. Wherever possible, the same reference numbers are used throughout the drawings to refer to the same or like elements of an embodiment, and wherein:
For the purpose of promoting an understanding of the principles of the present disclosure, reference will now be made to the embodiments illustrated in the drawings and specific language will be used to describe the same. It will, nevertheless, be understood that no limitation of the scope of the disclosure is hereby intended; any alterations and further modifications of the described or illustrated embodiments, and any further applications of the principles of the disclosure as illustrated herein are contemplated as would normally occur to one skilled in the art to which the disclosure relates.
Aspects of the present disclosure generally relate to systems and methods for analyzing video content in conjunction with historical video consumption data, and identifying and generating relationships, rules, and correlations between the video content and viewer behavior. In one embodiment, the present system compares metadata associated with video content for a plurality of videos with various output state data of those videos via a multivariate analysis (MVA) engine. The MVA engine analyzes the metadata in conjunction with the output states to identify patterns between the metadata and output states. Once patterns are identified, rules and correlations can be generated based on predefined parameters that link specific video content to specific viewer behaviors for subsequent advertising, content editing, and other similar purposes.
Overall, one purpose of the present system is to develop explicit correlations between metadata elements or combinations of metadata elements (linked to specific video content) and specific viewer behaviors (in the form of output states). These correlations may be used to determine why viewers engage in certain behaviors during videos, such that those behaviors can be utilized for a variety of purposes. For example, if a direct correlation can be drawn between a specific metadata element or group of elements and a high percentage of viewers interacting with an advertisement, then similar advertisements can be incorporated into videos at specific time-codes when the metadata element(s) are present. The benefits and uses of specific correlations between content metadata and viewer behaviors will be appreciated by those of ordinary skill in the art, and are further described herein.
Referring now to the drawings,
In the embodiment shown, the server 105 provides processing functionality for the system 100, including receiving instructions from an operator 102, retrieving videos and viewing metric data, extracting embedded metadata from videos (or obtaining metadata otherwise associated with the videos), providing information to the MVA engine 125, and a host of other operations that will be or become apparent to one of ordinary skill in the art. Additionally, while only one server 105 is shown, it will be understood that a plurality of servers may be incorporated within embodiments of a computerized system 100. It will also be understood that such server(s) 105 include suitable hardware and software components for performing the functions and/or steps and taking the actions described herein.
In one embodiment, the server 105 interacts with the video database 110, which stores a plurality of videos for use within the system 100. The stored videos may be any multimedia content, such as movies, television programs, music videos, short clips, commercials, internet-created or personal videos, and other similar types of video media or multimedia. In some embodiments, these stored videos are embedded with metadata related to elements or content of the videos, such as actors or characters within the videos, products in the videos, places and settings shown or described in the videos, subject matter, dialogue, audio, titles, and other similar video attributes. In other embodiments, metadata is previously associated with the respective video but not embedded in the video per se. In other embodiments, some or all of the videos in the video database 110 do not previously have metadata embedded or associated with the video, and thus the system 100 must assign metadata attributes to the content of the videos (described in greater detail below).
Within embodiments of the present system 100, the server 105 extracts or obtains metadata from the videos and stores the metadata in the metadata database 115. In one embodiment, the metadata is further stored in metadata files that are associated with each specific video. Thus, the metadata database 115 includes one or more separate metadata files for each video in the video database 110, such that each metadata file or files includes all of the metadata for its respective video. Generally, the metadata includes identifiers or tags that provide descriptions and/or identification means for each item of metadata. For example, an identifier for metadata signifying a particular actor could be the actor's name. The identifiers or tags may describe a basic understanding or provide a detailed description of the associated video. The metadata identifiers enable the metadata to be easily located and utilized within the system 100.
Additionally, in a preferred embodiment, the metadata is time-coded, such that some or all of each item of metadata is associated with a time-code or range of time-codes within a given video. For example, an item of metadata for a certain actor within a video may indicate that the actor is on screen in the video from the 2 minute, 36 second mark of the video to the 4 minute, 41 second mark, and then does not appear in the video again. Another item of metadata may indicate that an object within a video, such as a car, is seen multiple times throughout the video, and at varying time-codes. On the other hand, some metadata may be associated with the entire video, such as metadata associated with the overall subject matter of a video, or with the title of a video, in which case it would not be tied to a time code. In one embodiment, the video consumption data is similarly time-coded to provide a baseline for comparison between the metadata and video consumption data. As will be appreciated, other embodiments of the system 100 rely on metadata and video consumption data that is not time-coded.
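By way of illustration only, time-coded metadata of the kind described above may be sketched as follows. The class name `MetadataItem`, the function `tags_at`, the tag strings, and the representation of time-codes as seconds are all assumptions made for this example and do not appear in the disclosure. An item with no time-code ranges is treated as applying to the entire video, as with a title or overall subject matter.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class MetadataItem:
    """One item of metadata (hypothetical structure for illustration)."""
    tag: str
    # Inclusive (start, end) time-code ranges in seconds.
    # An empty list means the item applies to the whole video.
    ranges: List[Tuple[float, float]] = field(default_factory=list)

    def present_at(self, t: float) -> bool:
        """Return True if this item is associated with time-code t."""
        if not self.ranges:
            return True
        return any(start <= t <= end for start, end in self.ranges)


def tags_at(items, t):
    """Return the sorted tags of all items present at time-code t."""
    return sorted(item.tag for item in items if item.present_at(t))


# Example data: an actor on screen from 2:36 to 4:41, a car seen at
# several points, and a subject-matter tag not tied to any time-code.
items = [
    MetadataItem("actor:Jane Doe", [(156.0, 281.0)]),
    MetadataItem("object:car", [(30.0, 45.0), (200.0, 210.0)]),
    MetadataItem("subject:road trip"),
]

print(tags_at(items, 205.0))
print(tags_at(items, 300.0))
```

In this sketch, a query at the 205-second mark returns the actor, the car, and the subject-matter tag, while a query at the 300-second mark returns only the subject-matter tag.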
As mentioned, in one embodiment, some or all of the videos in the video database 110 are not embedded with metadata. For these videos, the system 100 must associate metadata with the videos that require analysis. Recently-developed technologies utilize facial recognition technology, textual analyzers, sound and pattern recognition technology, and other similar mechanisms to identify components within a video, and then associate time-coded metadata attributes automatically with those identified components. Metadata may also be associated with videos manually by viewing videos and associating metadata with items recognized by the viewer. One exemplary method for associating metadata with videos is described in U.S. Patent Publication No. 2004/0237101 to Davis et al., entitled “Interactive Promotional Content Management System and Article of Manufacture Thereof,” which is incorporated herein by reference in its entirety and made a part hereof. Once metadata has been associated with content components of a video, it may then be extracted (if necessary) and stored in the metadata database 115 for further use within the system 100.
Still referring to
Further, in one embodiment, viewing metrics may also include viewer demographic information indicating the types of viewer behaviors that are more common in certain viewer groups. For example, viewer demographic information may indicate that males are more likely to interact with sports advertisements than females, etc. This information may be obtained by tracking user-entered profiles, recently-viewed webpages, IP addresses, and other similar viewer indicia. Thus, this viewer demographic information may be used in conjunction with output state information to provide highly-specialized or tailored correlations between video content and viewer behavior.
Also connected to the server 105 within the computerized system 100 is a multivariate analysis (MVA) engine 125 for analyzing video consumption data in conjunction with content metadata to identify patterns or trends between the consumption data and the metadata. Multivariate analysis describes the observation and analysis of more than one variable at a time. Generally, MVA is used to perform studies across multiple dimensions while taking into account the effects of all variables on the responses of interest. In one embodiment, the MVA engine 125 comprises a proprietary software program. Such software programs may be written in and utilized via commercially available applications such as MATLAB®, available from The MathWorks, Inc., having a corporate headquarters at 3 Apple Hill Drive, Natick, Mass. 01760-2098, and other similar applications or software programs.
In a common multivariate analysis problem, a plurality of data points each having K variables is analyzed, where the number of variables K is only limited by the processing capabilities of the MVA engine 125. Typically, the plurality of data points is represented by a multidimensional array, wherein each row represents a given data point and each column represents one of the K variables. The data points in the multidimensional array are plotted in a K-space, such that each of the K variables defines one orthogonal axis in a coordinate system. Although a K-space of greater than three dimensions is not easily visualized or conceptualized, it has a real existence analogous to that of the two- and three-dimensional spaces. For ease of visualization,
As described, conventional systems merely track video consumption data in response to viewer behavior. Essentially, these conventional viewing metrics systems merely record time-coded output states in response to viewer activity. Accordingly, these systems generally utilize a univariate approach, as the only input variable is a video. Embodiments of the present system 100, however, utilize a multivariate approach to analyze videos because a multiplicity of inputs are analyzed (i.e. the metadata). In fact, for videos with large amounts of metadata, the MVA engine 125 may analyze thousands of variables at once to produce a desired output. In one embodiment, data points in the present system 100 are represented as a multidimensional array (as described above), wherein K represents each element of metadata, and each data point (i.e. each row of the multidimensional array) represents each of the separate combinations of K variables within a given scene, video, or other selected data set that has contributed to produce one instance of a selected output state. In one embodiment, a plurality of output states may be analyzed to produce a plurality of data points, each including some combination of metadata attributes. As will be understood, the metadata attributes in the array may be represented as binary values (either 1 or 0, indicating a positive or negative presence of the attribute in the given data point), numerical values, percentages, or some other similar representative value.
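The array representation described above may be sketched as follows, assuming the binary (1 or 0) encoding of metadata attributes. The function name `build_matrix` and the example tags are hypothetical. Each input set holds the metadata attributes present for one instance of a selected output state, each row of the result is one data point, and each column is one of the K metadata variables.

```python
def build_matrix(data_points):
    """Build a binary data-point-by-variable array.

    data_points: list of sets, each holding the metadata tags present
    when one instance of the selected output state occurred.
    Returns (columns, rows), where columns is the sorted list of the
    K distinct tags and rows[i][j] is 1 if data point i contains
    columns[j], and 0 otherwise.
    """
    columns = sorted(set().union(*data_points))
    rows = [[1 if tag in point else 0 for tag in columns]
            for point in data_points]
    return columns, rows


# Three hypothetical instances of a "pause" output state.
pause_points = [
    {"actor:A", "object:car"},
    {"actor:A"},
    {"setting:beach", "object:car"},
]

columns, rows = build_matrix(pause_points)
print(columns)
for row in rows:
    print(row)
```

Numerical values or percentages could be substituted for the binary entries without changing the row and column layout.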
In one embodiment, the MVA engine 125 compares the video metadata with corresponding video consumption data as a function of time-codes associated with each. As explained, each item of metadata preferably includes a time-code or range of time-codes indicating the point or points in a video in which its associated content occurs. Generally, video consumption data also includes such time-codes, indicating at which point or points during a video a certain viewer behavior (e.g., pause, rewind, stop, etc.) often occurred. Thus, the MVA engine 125 uses the time-codes as a baseline to compare the metadata and viewing metrics. For example, an element of video consumption data may indicate that 45% of viewers paused a particular video at the 5-minute mark of the video. The metadata file for the particular video may indicate that a certain actor was on screen at the 5-minute mark. Thus, the MVA engine 125 may suggest some correlation between the actor and the video being paused. While this one example may not be statistically significant enough to support a conclusion regarding the viewer action and the actor, other videos with the same actor can be analyzed to determine if a pattern develops between video pausing and the actor (again, based on similar time-codes).
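A minimal sketch of the time-code comparison across several videos follows. It is a simple co-occurrence count and not the multivariate analysis itself. The dictionary layout, the function names `tags_at` and `pause_cooccurrence`, and the example values are assumptions made for illustration.

```python
from collections import Counter


def tags_at(metadata, t):
    """Return the set of tags whose time-code range contains t.

    metadata: list of (tag, start_seconds, end_seconds) tuples.
    """
    return {tag for tag, start, end in metadata if start <= t <= end}


def pause_cooccurrence(videos):
    """Count, per tag, the number of videos in which that tag was
    present at one or more frequently-paused time-codes."""
    counts = Counter()
    for video in videos:
        seen = set()
        for t in video["pause_times"]:
            seen |= tags_at(video["metadata"], t)
        counts.update(seen)  # each tag counted at most once per video
    return counts


# Hypothetical data: the same actor is on screen at the paused
# time-code in two of three videos.
videos = [
    {"metadata": [("actor:A", 280.0, 320.0), ("object:car", 0.0, 60.0)],
     "pause_times": [300.0]},
    {"metadata": [("actor:A", 10.0, 50.0), ("setting:beach", 10.0, 50.0)],
     "pause_times": [30.0]},
    {"metadata": [("setting:beach", 0.0, 100.0)],
     "pause_times": [400.0]},
]

counts = pause_cooccurrence(videos)
print(counts["actor:A"], counts["setting:beach"], counts["object:car"])
```

A tag that recurs at paused time-codes across many videos, as the actor does here, is a candidate for the further analysis described herein.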
During analysis by the MVA engine 125, certain parameters are applied to shape the output 400 and corresponding relationships drawn between the video content and viewer behavior. The predefined parameters are defined by the parameter generator 130 as entered into the system 100 by the operator 102. For example, a parameter may be defined that instructs the MVA engine 125 to identify any metadata that occurs within 5 seconds before a video is exited. Another parameter may instruct the MVA engine 125 to identify any metadata that is present when a viewer interacts with an advertisement. Further, because the MVA engine 125 generally analyzes historical average video consumption data, certain percentage parameters may be applied. For example, the system operator 102 may instruct the MVA engine 125 to assume that any user behavior that occurred more than 20% of the time is statistically significant—thus, if only 10% of viewers interacted with a particular advertisement during a video, then that interaction is ignored. As will be understood, these exemplary parameters are presented for illustrative purposes only, and a user or operator 102 of the system 100 may define whatever parameters he or she deems important or appropriate.
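The two exemplary parameters above (a 5-second window before an exit, and a 20% significance threshold) may be sketched as follows. The function names, the record layout, and the default values are hypothetical and merely mirror the examples given.

```python
def significant_events(events, min_fraction=0.20):
    """Keep only events that occurred for more than min_fraction of viewers."""
    return [e for e in events if e["fraction"] > min_fraction]


def tags_before(metadata, event_time, window=5.0):
    """Return sorted tags present at any point in the `window` seconds
    up to and including event_time.

    metadata: list of (tag, start_seconds, end_seconds) tuples.
    """
    window_start = event_time - window
    return sorted({tag for tag, start, end in metadata
                   if start <= event_time and end >= window_start})


# Hypothetical consumption data and metadata for one video.
events = [
    {"state": "exit", "time": 120.0, "fraction": 0.35},
    {"state": "ad_click", "time": 60.0, "fraction": 0.10},
]
metadata = [
    ("actor:A", 100.0, 118.0),
    ("setting:office", 0.0, 90.0),
    ("object:car", 119.0, 130.0),
]

kept = significant_events(events)
print([e["state"] for e in kept])
print(tags_before(metadata, 120.0))
```

Here the advertisement interaction at 10% is ignored, and only the metadata overlapping the five seconds before the exit is reported.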
Still referring to
Referring first to
Additionally, in some embodiments, the MVA engine 125 incorporates principal component analysis (PCA) and/or factor analysis (FA) to discover sets of variables that form coherent subsets that are relatively independent of each other. The subsets are determined by locating and analyzing dense groupings of data points in an output 400. These subsets help determine which variables or groups of variables provide the largest contribution to various output states. For example, metadata relating to products in videos may cause higher percentages of viewer interaction with advertisements than metadata relating to characters in videos. Further, use of PCA and/or FA helps identify combinations of variables that alone have little or no impact on output states, but when taken in combination have a statistically significant correlation to one or more output states. In an alternative embodiment, partial least squares (PLS) regression analysis can be used instead of or in combination with PCA.
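For illustration, the first principal component of a small binary metadata array can be computed as sketched below. The disclosure does not specify how PCA is carried out; this sketch assumes power iteration on the sample covariance matrix, and the function name and example data are hypothetical. The sign of the returned loadings is arbitrary.

```python
import math


def first_principal_component(rows, iterations=200):
    """Estimate the loadings of the first principal component.

    rows: list of equal-length numeric lists (data points by variables).
    Uses power iteration on the sample covariance matrix. A start
    vector exactly orthogonal to the leading component would not
    converge to it; the uneven start below makes that unlikely.
    """
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    centered = [[r[j] - means[j] for j in range(k)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(k)]
           for i in range(k)]
    v = [float(j + 1) for j in range(k)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance along the current direction
        v = [x / norm for x in w]
    return v


# Columns: [actor:A, setting:hospital, object:car]. The first two
# attributes always occur together; the third varies separately.
data = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
]

loadings = first_principal_component(data)
print([round(x, 3) for x in loadings])
```

In this example the two co-occurring attributes receive equal loadings of larger magnitude than the third, which is the kind of coherent subset of variables that PCA is used to reveal.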
As the plotted data in the output 400 is analyzed, it may be used to create rules or correlations between specific items of metadata and specific output states. For example, the plotted data in the output 400 may indicate some connection between a fast-forwarding output state and metadata associated with a particular subject matter. Thus, a rule may be generated that dictates that the given subject matter leads to fast-forwarding in some percentage of cases. This fast-forwarding may be an indication that viewers are uninterested in this type of subject matter. Regardless of the reason, however, it becomes understood that this type of subject matter is frequently ignored. Thus, video content for other videos, even videos with no video consumption data available, may be edited to avoid or remove that type of subject matter.
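One simple way such a rule might be derived is sketched below. It assumes that each scene is reduced to its set of metadata tags and the set of output states observed for it, and it emits a rule when a tag is followed by the output state in at least a chosen fraction of the scenes containing that tag. The function name, thresholds, and data are hypothetical, and this rate calculation stands in for the richer analysis of the output 400.

```python
from collections import defaultdict


def generate_rules(scenes, output_state, min_rate=0.5, min_scenes=2):
    """Return (tag, output_state, rate) rules.

    scenes: list of (tags, states) pairs, where tags is a set of
    metadata tags and states is the set of output states observed.
    A rule is emitted for a tag seen in at least min_scenes scenes
    when the output state occurred in at least min_rate of them.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for tags, states in scenes:
        for tag in tags:
            totals[tag] += 1
            if output_state in states:
                hits[tag] += 1
    rules = []
    for tag in sorted(totals):
        rate = hits[tag] / totals[tag]
        if totals[tag] >= min_scenes and rate >= min_rate:
            rules.append((tag, output_state, rate))
    return rules


# Hypothetical scenes: courtroom subject matter is usually skipped.
scenes = [
    ({"subject:courtroom", "actor:A"}, {"fast_forward"}),
    ({"subject:courtroom"}, {"fast_forward"}),
    ({"subject:courtroom", "actor:B"}, set()),
    ({"actor:A"}, set()),
    ({"actor:A", "actor:B"}, {"pause"}),
]

for rule in generate_rules(scenes, "fast_forward"):
    print(rule)
```

A rule produced this way can then be applied to a video that has metadata but no video consumption data.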
Further, it will be understood that rules may be generated that link not only singular items of metadata to output states, but combinations or groups of metadata to particular output states. As mentioned earlier, a given output state may be caused by a particular combination of video content elements, whereas any one of those elements, when taken alone, would not cause the noted output state. Conventional systems that merely track time-coded viewer behavior to videos are incapable of identifying such response-causing metadata combinations. Thus, the ability to generate rules based on multiple metadata variables via a multivariate and principal components approach makes aspects of the present system 100 particularly useful for targeting or editing video content in response to the generated rules.
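The effect of a combination of metadata elements may be sketched by comparing the rate of an output state for a pair of tags against the rates for each tag taken over all scenes in which it appears. This pairwise comparison is an illustrative simplification and not the principal components approach itself; the function names, the `min_gain` threshold, and the data are assumptions.

```python
from itertools import combinations


def rate(scenes, required):
    """Fraction of scenes containing all `required` tags in which the
    output state occurred, or None if no scene contains them."""
    outcomes = [hit for tags, hit in scenes if required <= tags]
    return sum(outcomes) / len(outcomes) if outcomes else None


def combination_effects(scenes, min_gain=0.3):
    """Return ((tag_a, tag_b), pair_rate) for tag pairs whose rate
    exceeds both single-tag rates by at least min_gain."""
    all_tags = sorted(set().union(*(tags for tags, _ in scenes)))
    found = []
    for a, b in combinations(all_tags, 2):
        pair_rate = rate(scenes, {a, b})
        if pair_rate is None:
            continue
        best_single = max(rate(scenes, {a}), rate(scenes, {b}))
        if pair_rate - best_single >= min_gain:
            found.append(((a, b), pair_rate))
    return found


# Hypothetical scenes: (tags, viewer_exited). Viewers exit only when
# the actor and the hospital setting appear together.
scenes = (
    [({"actor:X", "setting:hospital"}, True)] * 3
    + [({"actor:X"}, False)] * 3
    + [({"setting:hospital"}, False)] * 3
)

print(combination_effects(scenes))
```

Here the actor and the setting are each followed by an exit in only half of the scenes containing them, while every scene containing both is followed by an exit.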
In addition to a scatter-plot type of output, as shown in
Still referring to
In the advertising or marketing context, viewer interaction with an advertisement can be an important output state. For example, if the output 400 and/or corresponding loading plot 500 identify a high correlation between certain metadata and viewer interaction with an advertisement, then this information can be used to incorporate that advertisement (or similar types of advertisements) into videos with content matching that metadata. Even videos with no previous viewing metric data may be effectively utilized to display the advertisement, assuming those videos contain the correlated metadata.
Often, identified trends between video content and viewer behavior will be consistent amongst various “classes” of videos (i.e. amongst a particular television series, or a specific movie genre, etc.). As an example, data points associated with different classes are represented by varying types of shapes (e.g. triangles, squares, diamonds, stars, etc.) in the output 400 shown in
As will also be understood, the trends and correlations between metadata and viewer behavior can be used for a variety of purposes, including advertising, editing or creating video content, and other similar uses. Generally, if trends are tightly coupled to metadata, then a rule can be created linking the trend to the content associated with the metadata, such that viewer behavior can be predicted. If a trend in video consumption data is not tightly coupled to any metadata elements, then the video consumption trend may be an aberration, or the cause for the viewer behavior may be irreconcilable with pure metadata alone. Further, over time as trend data is collected and analyzed for many videos and output states, trends of trends may be determined that provide even more detailed analysis linking video attributes to corresponding viewer output states.
The foregoing description of the exemplary embodiments has been presented only for the purposes of illustration and description and is not intended to be exhaustive or to limit the inventions to the precise forms disclosed. Many modifications and variations are possible in light of the above teaching.
The embodiments were chosen and described in order to explain the principles of the inventions and their practical application so as to enable others skilled in the art to utilize the inventions and various embodiments and with various modifications as are suited to the particular use contemplated. Alternative embodiments will become apparent to those skilled in the art to which the present inventions pertain without departing from their spirit and scope.
This application claims the benefit under 35 U.S.C. §119(e) of U.S. provisional patent application No. 61/117,454, entitled “SYSTEMS AND METHODS FOR ANALYZING TRENDS IN VIDEO CONSUMPTION BASED ON EMBEDDED VIDEO METADATA,” filed Nov. 24, 2008, which is incorporated herein by reference in its entirety as if set forth in full herein.
Number | Date | Country
---|---|---
61117454 | Nov 2008 | US