This invention relates to computer systems for analyzing media content using artificial intelligence (AI) or machine learning, and more particularly, to a system for detecting and visualizing demographics, diversity, and disparity in user-generated videos.
User-generated content (UGC), alternatively known as user-created content (UCC), is any form of content, such as images, videos, text and audio, that has been posted by users on online platforms such as social media.
For consumers, UGC, particularly user-generated videos (UGVs), has become an increasingly common source of information on brands and their related products and services. UGVs can include reviews, tutorials and demonstrations that directly educate consumers about products or services, or they can indirectly expose consumers to brands through product placement, wherein products or services are incorporated into content such as home videos or web series. On YouTube, the largest online video platform, over 500 hours of videos are uploaded every minute, with over 5 billion videos watched every day, many of which are UGVs featuring brands, products or services.
There is a growing need for companies to better understand and control the influence UGVs have on consumer perception of their brands. Companies are particularly interested in monitoring demographics (e.g., race, gender and age) of individuals appearing in UGVs alongside their brands and in finding ways to improve upon the diversity of those demographics.
While prior art methods exist for quantifying diversity, these methods fall short of providing meaningful and actionable insights to brand managers or content creators. In addition, there is a need for a scalable system that can score and visualize diversity among large and ever-growing catalogs of UGVs hosted on platforms such as YouTube.
The present invention provides a system for detecting and visualizing demographics, diversity and disparity in UGVs.
The system is configured to generate diversity and disparity scores for UGVs, where diversity is a measure of the inclusion of different types of people (e.g., race, gender and age) in a video or collection of videos and disparity is a measure that compares the types of people in one video or collection of videos to those in another video or collection of videos. The system visualizes diversity and disparity scores, along with other demographic information, in meaningful and actionable ways, and it can be used by both brand managers and content creators to analyze and take strategic action based on the diversity and disparity scores.
Exemplary embodiments of the invention will now be described with reference to the accompanying drawings, in which:
There are several known methods of quantifying diversity in a collection, such as a collection of individuals in a video. One of the most common measurements of diversity is the Herfindahl-Hirschman Index (HHI). The HHI typically has a value ranging from 1 to 1/N, where N is the number of different demographic categories being analyzed in a collection. For example, in calculating the racial diversity of individuals in a video across five different racial groups (N=5), a video composed entirely of individuals from just one of the racial groups, such as Asian, would have an HHI of 1, while a video made up equally of people from all five racial groups (most diverse) would have an HHI of 0.2. The HHI is calculated using the following formula:

$$\mathrm{HHI} = \sum_{i=1}^{N} s_i^{2}$$
where N is the number of different categories in a demographic being analyzed and s_i is the proportion of individuals that belong to a given one of those categories (i).
The most obvious shortcoming of the HHI is that the index is unable to effectively convey the difference between two collections having the same HHI, but different demographic distributions. For example, a video containing only Asian individuals would have the same HHI as a video containing only black individuals (HHI = 1). In addition, the range of HHI values (1 to 1/N) can be confusing to understand and to visualize.
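The shortcoming described above can be illustrated with a minimal sketch of the standard HHI calculation (the sum of squared category shares). The function name `hhi` and the group labels other than Asian, black and white are placeholders chosen for illustration and do not come from the specification.

```python
def hhi(counts):
    """Herfindahl-Hirschman Index: sum of squared category shares.

    counts maps a category label to the number of individuals detected
    in that category.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no individuals counted")
    return sum((n / total) ** 2 for n in counts.values())


# Two single-group videos with different groups (illustrative counts).
only_asian = {"Asian": 10}
only_black = {"black": 10}

# Five groups represented equally; "group_4" and "group_5" are placeholders.
uniform = {"Asian": 2, "black": 2, "white": 2, "group_4": 2, "group_5": 2}

print(hhi(only_asian), hhi(only_black))  # identical values for different groups
print(hhi(uniform))                      # approximately 1/N for N = 5
```

Both single-group videos produce the same index even though they feature different groups, which is the ambiguity that a separate disparity measure is meant to address.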
The system of the present invention utilizes an improved scoring of both diversity and disparity in videos. Compared to prior art indexes, such as the HHI, the improved scoring can be more easily understood and visualized in a software user interface, and the combination of both diversity and disparity scoring can provide a more complete picture of demographic distribution.
The system’s improved diversity scoring is generally calculated in a manner similar to the HHI, but the values are normalized to a range of 0 to 100 (or, alternatively, 0 to 1 or 0% to 100%), such that 0 means not diverse and 100 is most diverse, for example:
Again, where N is the number of different categories in a demographic being analyzed.
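As an illustration only, one normalization that satisfies the stated properties (0 for a single-category video, 100 for an even spread across all N categories) is 100 × (1 − HHI) / (1 − 1/N). This particular expression is an assumption made for the sketch below and is not necessarily the formula used by the system; the function name `diversity_score` is likewise hypothetical.

```python
def diversity_score(counts, n_categories=None):
    """Normalized diversity on a 0-100 scale (assumed normalization).

    counts: number of individuals per category. Categories with no
    individuals should appear as zeros, or N can be passed explicitly
    through n_categories.
    """
    n = len(counts) if n_categories is None else n_categories
    total = sum(counts)
    if n < 2 or total == 0:
        # A single category (or an empty video) is treated as not diverse.
        return 0.0
    hhi = sum((c / total) ** 2 for c in counts)
    # Map HHI = 1 to 0 and HHI = 1/N to 100.
    return 100.0 * (1.0 - hhi) / (1.0 - 1.0 / n)


print(diversity_score([10, 0, 0, 0, 0]))  # one group only: least diverse
print(diversity_score([2, 2, 2, 2, 2]))   # even spread: most diverse
print(diversity_score([8, 2, 0, 0, 0]))   # somewhere in between
```

Dividing the result by 100 gives the alternative 0 to 1 range mentioned above.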
This normalized diversity scoring can be applied to each demographic of interest in a video or collection of videos, for example, age, gender and race, such that a video or collection of videos can have a separate age diversity score, gender diversity score and race diversity score. In addition, the scores for different demographics can be combined as a weighted average to calculate an overall or cumulative diversity score for a video or collection of videos. It should be understood that the diversity scoring can be performed on any demographic of interest, such as relationship status, nationality, education or profession, and on any parameter such as product type, topic, language or country, and that any suitable method of calculating an overall or cumulative diversity score can be used.
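The weighted averaging of per-demographic scores can be sketched as follows. The scores and weights shown are invented for illustration, and `overall_score` is a hypothetical helper rather than a function defined by the system.

```python
def overall_score(scores, weights):
    """Weighted average of per-demographic scores.

    scores and weights are dictionaries keyed by demographic name;
    every demographic in scores must have a weight.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Illustrative per-demographic diversity scores on the 0-100 scale.
scores = {"age": 60.0, "gender": 90.0, "race": 30.0}

# Illustrative weights: race counts twice as much as age or gender.
weights = {"age": 1.0, "gender": 1.0, "race": 2.0}

print(overall_score(scores, weights))  # 52.5
```

With equal weights the result reduces to the plain mean of the per-demographic scores.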
The system’s disparity scoring quantifies the difference between the types of individuals featured in a first video or collection of videos and the types of individuals featured in another video or collection of videos. The disparity scoring can be used, for example, to quantify the “uniqueness” of individuals featured in a product video found on YouTube (first video) compared to individuals found in all other videos on YouTube featuring the same product or category of product (collection of videos). The disparity scoring is generally performed using the following formula:

$$\mathrm{Disparity} = \sum_{i=1}^{N} (1 - c_i)\, s_i$$
where c_i represents the share of people in category i across a whole collection (e.g., all videos in a product category), s_i represents the share of people in category i in an individual video (e.g., the first video), and N is the number of demographic categories being considered.
The disparity scoring allows certain demographic categories to be weighed more heavily than others, such that greater weight can be given to demographic categories that are more “unique” or that occur less regularly in the videos. For example, if we consider only three categories in the race demographic, white, black and Asian, and if the distribution of race within those categories is 80% white, 15% black and 5% Asian among a collection of videos, the disparity scoring of a single video could place more emphasis on the number of Asian individuals featured in the video than on the number of white individuals featured in the video, thus resulting in a higher disparity score for videos featuring “unique” or less commonly occurring types of individuals.
Therefore, disparity scoring involves a summation of (1 - c_i) * s_i over all categories, where c_i is the percentage of people in a category in the reference video or collection of videos and s_i is the percentage of people in the same category in the video or collection of videos being scored. It should be understood that any suitable alternative method of weighting demographic categories can be applied to the disparity scoring.
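The summation of (1 - c_i) * s_i can be sketched using the 80% white, 15% black and 5% Asian collection from the example above. The individual video counts are invented for illustration, the helper names `shares` and `disparity` are hypothetical, and the result is left on a 0 to 1 scale because the specification does not state how the disparity score is scaled.

```python
def shares(counts):
    """Convert per-category head counts into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no individuals counted")
    return {category: n / total for category, n in counts.items()}


def disparity(collection_counts, video_counts):
    """Sum of (1 - c_i) * s_i over all categories.

    c_i: share of category i in the reference collection.
    s_i: share of category i in the video being scored.
    """
    c = shares(collection_counts)
    s = shares(video_counts)
    categories = set(c) | set(s)
    return sum((1.0 - c.get(k, 0.0)) * s.get(k, 0.0) for k in categories)


# Reference collection matching the 80/15/5 distribution in the text.
collection = {"white": 80, "black": 15, "Asian": 5}

# Illustrative individual videos.
all_white = {"white": 4}
all_asian = {"Asian": 4}
mixed = {"white": 2, "Asian": 2}

for name, video in [("all_white", all_white), ("all_asian", all_asian), ("mixed", mixed)]:
    print(name, round(disparity(collection, video), 3))
```

The video featuring only the least common category scores highest, since each individual is weighted by how rarely that category appears in the reference collection.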
The disparity scoring can be performed for any demographic of interest in a video or collection of videos, and the scores for different demographics can be combined as a weighted average to calculate an overall or cumulative disparity score for comparing a video or collection of videos to another video or collection of videos.
The diversity and disparity scoring allow for many different types of comparisons to be made. Diversity scoring, for example, can be performed on a single video featuring a single product (e.g., Conair Ceramic Hair Dryer), on all the videos on YouTube featuring a single product, on all the videos on YouTube featuring products in the same category (e.g., all Hair Dryers) or on all videos from a single content creator (e.g., YouTube personality or influencer). Disparity scoring can be performed on videos or collections of videos to compare, for example, a single video featuring a product against all videos on YouTube featuring that product, a single video featuring a product against all videos on YouTube featuring products in the same category, or a single video featuring a product against all videos from a single content creator.
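The groupings listed above (single product, product category, content creator) can be sketched as a pooling step that sums per-video counts before any score is computed. The record layout, field names, creator identifiers and counts below are all hypothetical; the specification does not describe how video records are stored.

```python
from collections import Counter

# Hypothetical per-video records with detected head counts per race category.
videos = [
    {"product": "Conair Ceramic Hair Dryer", "category": "Hair Dryers",
     "creator": "creator_a", "race_counts": {"white": 2, "black": 1}},
    {"product": "Conair Ceramic Hair Dryer", "category": "Hair Dryers",
     "creator": "creator_b", "race_counts": {"Asian": 1}},
    {"product": "Other Hair Dryer", "category": "Hair Dryers",
     "creator": "creator_a", "race_counts": {"white": 3}},
]


def pooled_counts(records, field, value, demographic="race_counts"):
    """Sum per-category counts over all records whose field equals value."""
    pooled = Counter()
    for record in records:
        if record[field] == value:
            pooled.update(record[demographic])  # adds counts, does not replace
    return pooled


by_product = pooled_counts(videos, "product", "Conair Ceramic Hair Dryer")
by_category = pooled_counts(videos, "category", "Hair Dryers")
by_creator = pooled_counts(videos, "creator", "creator_a")

print(by_product)
print(by_category)
print(by_creator)
```

The pooled counts for a collection can then serve as the input to a diversity calculation, or as the reference distribution when scoring the disparity of a single video.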
In one implementation of the system, Company X plans to target customers from demographic groups A and B for a marketing campaign. The goal of the company is to generate 50,000 click-throughs from videos related to a selected product, assuming an average conversion rate of 5% from the click-throughs and an average order size of $40. That would generate revenue of $100,000 ($40 × 0.05 × 50,000). It is assumed that each video generates 5,000 click-throughs, thus requiring 10 videos featuring the product directed at the target demographic. After analyzing 10,000 videos featuring the selected product, the system determines that only 4 videos include content featuring the targeted demographic. The system can then recommend content creators and the type of content to produce the 6 additional videos that are needed to meet the company’s revenue goal. The system can automatically or manually launch a campaign by contacting the identified content creators to produce the required videos. Content creators may be fans, social media influencers or shoppers who have purchased the product.
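The arithmetic in this example can be reproduced with a short sketch. The function `campaign_plan` and its parameter names are hypothetical; only the input figures come from the example above.

```python
import math


def campaign_plan(target_clicks, conversion_rate, avg_order_value,
                  clicks_per_video, existing_videos):
    """Return (expected revenue, videos required, additional videos needed)."""
    revenue = avg_order_value * conversion_rate * target_clicks
    # Round up so that the click-through target is fully covered.
    videos_required = math.ceil(target_clicks / clicks_per_video)
    additional = max(0, videos_required - existing_videos)
    return revenue, videos_required, additional


revenue, required, additional = campaign_plan(
    target_clicks=50_000,
    conversion_rate=0.05,
    avg_order_value=40,
    clicks_per_video=5_000,
    existing_videos=4,
)
print(f"revenue ${revenue:,.0f}, videos required {required}, additional {additional}")
```

With the figures from the example, this yields the $100,000 revenue target, 10 required videos and 6 additional videos to commission.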
In another implementation of the system, when a customer having a known profile, e.g., membership in a defined demographic of a population, searches for a product or a product feature, the system can recommend videos culled from a collection of videos for viewing by the customer based on the demographic and diversity content of each of the videos in the collection. The video recommendations might be across a product category, across one brand or multiple categories and products across multiple brands. In addition to video demographics and disparity scoring, the system can analyze other dimensions such as sentiment, topics, scenes, type of creator, and a video’s visual quality. These are merely examples of the types of diversity and disparity scoring that can be performed and the types of comparisons that can be made. Further examples will now be explored with respect to certain embodiments of the system of the invention.
Referring to
The system of the invention is configured to allow a user to analyze one or more videos; for example, a user can monitor and analyze all of the videos on YouTube that feature one or more products from a single company. The exemplary dashboard (100) of
The system of the invention is configured to include a database of profiles for various content creators or influencers (i.e., users that upload content to one or more social media or content hosting platforms). Each content creator profile contains information about videos uploaded to one or more social media or content hosting platforms by that content creator.
The system of the invention is configured to enable a user to create a promotional campaign based on the demographic information, diversity scoring information or disparity scoring information detected, generated and/or made available by the system. For example, in another view (920) of dashboard (100) as shown in
It should be understood that the system of the invention can be configured to incorporate improvements or alternatives to the diversity and disparity scoring described herein, and that the system of the invention can be configured to analyze diversity and disparity in UGC other than recorded videos, including text, images, audio, video games, virtual reality or augmented reality applications, and live audio or video streams.
There have thus been described and illustrated certain embodiments of a system for detecting and visualizing demographics, diversity and disparity in UGVs according to the invention. These embodiments are merely example implementations of the invention and are not intended to limit the scope of the invention to their particular details. Alternative embodiments of the invention not expressly disclosed herein will be evident to persons of ordinary skill in the art.
Number | Date | Country
---|---|---
63227321 | Jul 2021 | US