This application claims priority from Korean Patent Application No. 10-2014-0002592, filed on Jan. 8, 2014, in the Korean Intellectual Property Office, the disclosure of which is incorporated herein by reference in its entirety.
1. Field
The following description relates to broadcasting services, and more particularly, to technology for analyzing viewership history for broadcasting services.
2. Description of the Related Art
With the recent switchover to digital broadcasting, growing attention has been given to customized advertising (i.e. targeted advertising), which differs from conventional advertising services that simply expose the advertisements to viewers. For a Video-on-Demand (VoD) service on an IPTV, advertisements are inserted at the beginning and/or the end of the video. Recently, many attempts have been made to customize such advertisements to individual audiences. With the development of technology for not only IPTVs, but also smart TVs, such as hybrid TV that combines the Internet and terrestrial broadcast content, Internet connectivity of such TVs made it possible to depart from the traditional unidirectional broadcast services and unilateral terrestrial broadcast content services and provide viewers with bidirectional, interactive services. With the growing popularity of the bidirectional, interactive broadcast services, techniques for customizing advertising services are also being developed.
Generally, an advertiser provides demographical profiles of target consumers, such as the age range and gender of the target, as the requirement of the advertisement. The broadcast media or broadcast advertising agencies, however, have no information about the demographical profiles of individual audiences as the target consumers of the advertisement, and they thus schedule and execute the advertisement for a broadcast program popular to viewers of target gender/age groups, based on data, such as audience rating statistics, and experiences and history of executing advertisements. However, the audience rating statistics for broadcast content or the advertising execution history are merely statistical data, and do not reflect the preferences or needs of individual audiences. Hence, it is not possible to provide effective targeted advertising to each individual viewer, based on such data. Meanwhile, even without the knowledge of a current audience, if the gender/age ranges of members of a family are known, a family-targeted advertising, which is customized to the family members, is possible. Generally, a service provider, however, has only access to a profile of a representative subscriber, and no access to profiles of individual family members. Further, for the sake of privacy protection, the representative subscriber information cannot be utilized for any purpose, other than for subscription to services.
Korean published patent application No. 10-2008-0106799 discloses a method of providing content to audiences by collecting and processing viewing behaviors of the audiences. In this patent application, a system includes all unique information of the individual members of a family, and viewership histories are collected and viewing behaviors are analyzed after authenticating each member through a login. Thus, for a family whose member profiles are not known, it is not possible to collect viewership histories or analyze viewing behaviors.
The following description relates to an apparatus and method for inferring a user profile by analyzing viewership history so that the preferences or needs of each individual user can be reflected in targeted advertising.
In one general aspect, there is provided an apparatus for inferring a user profile, including: a sample family data processor configured to analyze viewing patterns of sample families from received sample family data, extract viewing pattern characteristics from the analyzed viewing patterns, and generate one or more sorters by classifying the viewing pattern characteristics into groups; a target family data processor configured to generate target family viewing pattern information based on received target family data; and a profile inference component configured to generate a primary inference result by classifying the target family viewing pattern information through the one or more sorters and inferring a specific group of members present in the target family based on the viewing pattern characteristics.
The sample family data processor may be configured to calculate one or more probabilities related to TV viewing by analyzing viewership history contained in the received sample family data, and the profile inference component may be configured to calculate viewership probability distributions of individual groups of viewers from the one or more calculated probabilities related to TV viewing, and, in response to receiving a request for TV viewing from a viewer that is a member of the target family, generate a secondary inference result by calculating conditional probabilities for individual groups of viewers from a probability distribution of viewing TV in a specific time interval on a specific day of week corresponding to the received request and a probability distribution of viewing a specific type of program corresponding to the received request. The profile inference component may be configured to infer, based on a likelihood of presence of family member according to the primary inference result and viewership probability distributions according to the secondary inference result, that a group with a largest conditional probability value is an audience member group of the target family.
The profile inference component may be configured to, in a case where family member profiles of the target family are known, infer, based on a likelihood of presence of the family member profiles of the target family and the viewership probability distributions according to the secondary inference result, that a group of viewers with a largest conditional probability value is an audience member group of the target family.
The sample family data processor may be configured to calculate at least one of viewership probabilities according to an amount of TV viewing by type of program, an amount of TV viewing by time of day or an amount of TV viewing by time of day and type of program, a viewership distribution by type of program, a viewership distribution by time of day, or a viewership distribution by time of day and type of program.
The sample family data processor may be configured to generate the one or more sorters by classifying the viewing pattern characteristics into groups according to at least one of gender of viewers, age range of viewers, or type of program. In addition, the sample family data processor may be configured to analyze the sample family data that contains the viewership history and profiles of sample families. The target family data processor may be configured to generate the target family viewing pattern information from target family data that only contains viewership history of the target family. The profile inference component may be configured to generate the primary inference result by inferring a specific group of viewers present in the target family by classifying the target family viewing pattern information using sorters that correspond to viewing patterns that are not duplicated among the viewing patterns, which have been classified into the groups for generating the sorters.
In another general aspect, there is provided a method of inferring a user profile, including: analyzing viewing patterns of sample families from received sample family data;
extracting viewing pattern characteristics from the analyzed viewing patterns and generating one or more sorters by classifying the viewing pattern characteristics into groups;
generating target family viewing pattern information from received target family data; and generating a primary inference result by classifying the target family viewing pattern information through the sorters and inferring a specific group of members present in the target family. In addition, the method may further include inferring, based on a likelihood of presence of family member according to the primary inference result and viewership probability distributions according to the secondary inference result, that a group of viewers with a largest conditional probability value is an audience member group of the target family. In a case where family member profiles of the target family are known, the method may further include inferring, based on a likelihood of presence of the family member profiles of the target family and the viewership probability distributions according to the secondary inference result, that a group of viewers with a largest conditional probability value is an audience member group of the target family.
Other features and aspects will be apparent from the following detailed description, the drawings, and the claims.
Throughout the drawings and the detailed description, unless otherwise described, the same drawing reference numerals will be understood to refer to the same elements, features, and structures. The relative size and depiction of these elements may be exaggerated for clarity, illustration, and convenience.
The following description is provided to assist the reader in gaining a comprehensive understanding of the methods, apparatuses, and/or systems described herein. Accordingly, various changes, modifications, and equivalents of the methods, apparatuses, and/or systems described herein will be suggested to those of ordinary skill in the art. Also, descriptions of well-known functions and constructions may be omitted for increased clarity and conciseness.
Referring to
The sample family data processor 110 collects sample family data. The sample family data contains viewership history and profiles of sample families. The sample families are audiences whose compositions and member profiles, including the age range and gender of each member, are known, and they are generally registered in advance for audience rating measurement. The sample family data processor 110 may collect sample family data including viewership history and profiles of individual family members. The sample family data may distinguish each family by a family ID, and distinguish each viewer in each family by an individual ID. Thus, it is possible to classify the viewership history of the sample families by family as a whole, and also by individual family member.
In addition, the sample family data processor 110 analyzes viewing patterns and viewership history of each sample family based on the received sample family data. The received sample family data includes information about viewership history that corresponds to the profile of the entire family of each sample family and the profiles of their individual family members. The analysis method of the sample family data processor 110 to analyze the viewership history is not limited to the aforementioned method, and various viewing-history analysis methods may be applicable according to the environment or types of broadcast services.
The viewership history of each sample family is identified according to individual audiences, thereby making it possible to analyze viewing patterns according to gender, age range, and group of viewers. On the other hand, the gender and age distributions of the members of the target family are unknown, and the viewership history of audiences in the target family is all mixed together. Therefore, viewing patterns cannot be analyzed with respect to an individual audience member, but can only be analyzed with respect to the target family as a whole. In other words, the age range and gender of each member of the sample families can be identified based on the individual family member profiles, whereas no specific profile of each member is provided for the target family, and thus it is not possible to identify the age range and gender of each member of the target family. Because the apparatus 100 in accordance with the exemplary embodiment infers the age range and gender distribution of audiences by comparing the sample family data and the target family data, the sample family data processor 110 needs to analyze the viewing patterns not based on each audience member but based on each family.
The sample family data processor 110 extracts viewing pattern characteristics of audiences of each age range and gender group from the viewing patterns of each family amongst the sample families. In general, audiences of the same age range and gender are more likely to exhibit similar viewing patterns. The viewing pattern characteristics of audiences of a specific gender/age range group are extracted by comparing the viewing patterns of families that include the corresponding audiences of the specific gender/age range with the viewing patterns of families that do not include the pertinent audiences. To this end, the sample family data processor 110 categorizes the viewing patterns of the sample families by age range and gender and divides the sample families into two groups: one group of families that include members of the specific gender and age range; and the other group of families that do not include members of the specific gender and age range. For example, if a total of 200 sample families consists of 50 families that each include at least one man in his 20s and 150 families that do not include any men in their 20s, the sample families are split 50 to 150 for the group of male viewers aged 20 to 29; likewise, if the 200 sample families consist of 30 families that each include at least one woman in her 20s and the remaining 170 families, the sample families are split 30 to 170 for the group of female viewers aged 20 to 29. As such, with respect to N specific age range and gender groups of viewers intended to be sorted, the sample family data processor 110 generates 2N data groups, including N groups of viewer families that include the corresponding specific age range and gender groups and N groups of the other viewer families that include none of the corresponding specific age range and gender groups. In addition, the sample family data processor 110 extracts the viewing pattern characteristics of each data group from the viewing patterns of the 2N data groups divided by gender and age.
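As a non-limiting illustration of the 2N data groups described above, the following Python sketch divides a set of sample families into a "with" group and a "without" group for each gender/age range group. The family IDs, the group labels such as `M20` and `F20`, and the function name `split_by_group` are hypothetical and are introduced only for this example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical sample data: family ID -> gender/age groups present in that family.
SampleProfiles = Dict[str, Set[str]]


def split_by_group(
    profiles: SampleProfiles, groups: List[str]
) -> Dict[str, Tuple[List[str], List[str]]]:
    """For each group, return (families including it, families not including it)."""
    result = {}
    for group in groups:
        with_group = sorted(f for f, members in profiles.items() if group in members)
        without_group = sorted(f for f, members in profiles.items() if group not in members)
        result[group] = (with_group, without_group)
    return result


profiles = {
    "fam1": {"M20", "F50"},
    "fam2": {"F20"},
    "fam3": {"M20", "F20"},
    "fam4": {"M50", "F50"},
}
# N = 2 groups to be sorted, so 2N = 4 data groups are produced.
splits = split_by_group(profiles, ["M20", "F20"])
```

Each family appears in exactly one of the two data groups for a given gender/age range group, so the two lists for a group always add up to the total number of sample families.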
Sorter training is carried out with respect to viewer families that include “men in their 20s” and the other viewer families that include no “men in their 20s,” so that a sorter for determining the presence of men in their 20s in a target family can be created. Individual sorters are also generated for the other viewers of different gender/age ranges. As many sorters are generated as the number N of the gender/age range groups of viewers to be sorted. The sorter training algorithm of the sample family data processor 110 may vary according to the purpose and use, and is not limited to a specific algorithm.
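Because the description leaves the sorter algorithm open, the following Python sketch shows only one possible choice, a nearest-centroid classifier, and is not the method of the embodiment. It assumes that each family's viewing pattern has been summarized as a numeric feature vector (here, hypothetical shares of viewing time for sports, drama, and news); the class name `CentroidSorter` and all values are illustrative assumptions.

```python
from typing import List, Sequence


def centroid(rows: Sequence[Sequence[float]]) -> List[float]:
    """Column-wise mean of a list of equally sized feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class CentroidSorter:
    """Binary sorter for one gender/age group (assumed algorithm: nearest centroid)."""

    def __init__(self, with_group, without_group):
        self.pos = centroid(with_group)     # families that include the group
        self.neg = centroid(without_group)  # families that do not include it

    def predict(self, pattern: Sequence[float]) -> bool:
        """Return True if the pattern is closer to the 'with group' centroid."""
        return sq_dist(pattern, self.pos) < sq_dist(pattern, self.neg)


# Hypothetical features: share of viewing time for [sports, drama, news].
with_m20 = [[0.6, 0.1, 0.3], [0.5, 0.2, 0.3]]
without_m20 = [[0.1, 0.6, 0.3], [0.2, 0.5, 0.3]]
m20_sorter = CentroidSorter(with_m20, without_m20)

likely_has_m20 = m20_sorter.predict([0.5, 0.2, 0.3])
likely_no_m20 = m20_sorter.predict([0.1, 0.7, 0.2])
```

One such sorter would be trained per gender/age range group, giving N sorters in total; any other binary classifier could be substituted without changing the surrounding procedure.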
The sample family data processor 110 delivers to the profile inference component 130 the viewing pattern characteristics information of the sample families, which include the sorters that have been generated by analyzing the viewing patterns in a primary inference process.
Then, for a secondary inference process, the sample family data processor 110 may analyze the amount of each type of programs being watched, the amount of TV viewing by time of day, the amount of TV viewing by type of program and by time of day, the distribution of TV viewing by type of program, the distribution of TV viewing by time of day, and the distribution of TV viewing by type of program and by time of day. The sample family data processor 110 may analyze viewership history of the sample families whose member profiles, including the gender/age range of each member, are known, and calculate the viewership probability by time and day for each gender/age range group of viewers, and the viewership probability of a type of program for each age range and gender group of viewers. The calculated viewership probabilities are re-calculated into viewership probability distribution of individual groups. For example, the viewership probabilities by type of program and by time of day and day of week may be represented as conditional probabilities of the probability distribution of TV viewing by time of day and day of week and the probability distribution of TV viewing by type of program. The sample family data processor 110 delivers the generated probability distribution data to the profile inference component 130.
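The calculation of viewership probabilities from the sample families' viewership history can be sketched as follows. This Python example assumes that each history record identifies the viewer's gender/age group, the day of week, the time interval, and the program type, and it estimates each probability as a relative frequency of records. The record layout, the field names, and the use of record counts rather than viewing durations are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def distribution_by_key(records, key_fn):
    """Return {group: {key: relative frequency}} over the viewing records."""
    counts = defaultdict(Counter)
    for rec in records:
        counts[rec["group"]][key_fn(rec)] += 1
    return {
        group: {key: n / sum(ctr.values()) for key, n in ctr.items()}
        for group, ctr in counts.items()
    }


# Hypothetical viewership history of sample families with known member profiles.
records = [
    {"group": "M20", "day": "Sat", "slot": "21-24", "type": "sports"},
    {"group": "M20", "day": "Sat", "slot": "21-24", "type": "sports"},
    {"group": "M20", "day": "Sun", "slot": "18-21", "type": "news"},
    {"group": "M20", "day": "Sat", "slot": "21-24", "type": "drama"},
    {"group": "F50", "day": "Mon", "slot": "09-12", "type": "drama"},
    {"group": "F50", "day": "Mon", "slot": "09-12", "type": "drama"},
]

# Viewership probability by day of week and time interval, per group.
by_slot = distribution_by_key(records, lambda r: (r["day"], r["slot"]))
# Viewership probability by type of program, per group.
by_type = distribution_by_key(records, lambda r: r["type"])
```

Weighting each record by its viewing duration instead of counting records would follow the same structure.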
The target family data processor 120 gathers (receives) target family data from a target family for inferring the distribution of genders and age ranges of viewers. The age range/gender profile of each family member included in the sample family data allows for identification of individual family members' age range and gender, whereas the number of family members and the age range and gender of each family member of the target family are not known. In addition, the viewership histories of the members within a target family are combined together, so that it is not possible to analyze the viewing patterns of each viewer. Therefore, the target family data processor 120 analyzes the viewing patterns of each target family as a whole based on the viewership history. The target family data processor 120 delivers, to the profile inference component 130, target family viewing pattern information generated by analyzing the viewership history.
The profile inference component 130 infers a profile of the target family based on the sample families' viewing pattern characteristics information received from the sample family data processor 110 and the target family viewing pattern information received from the target family data processor 120. The procedures of the profile inference component 130 to infer the profile of the target family based on the received sample family viewing pattern characteristics information and target family viewing pattern information will be described with reference to
Referring to
The profile inference component 130 infers the presence of viewers by classifying the target family's viewing pattern information by use of the sample families' viewing pattern characteristics information, which is classified by gender and age. The inference process of the profile inference component 130 may include the primary inference process and the secondary inference process. The primary inference process compares and analyzes the sample family data and the target family data to determine whether the target family includes a member who belongs to a specific group. Then, the secondary inference process infers the presence of a specific viewer based on the result of the primary inference, the viewership probability of each group of viewers of the sample families and the target family, and the viewership probability of a type of program for each group of viewers of the sample families and the target family.
The result that is obtained during the primary inference of the profile inference component 130 by using the viewing pattern characteristics information as sorters indicates whether characteristic viewing patterns of viewers of a specific age range and gender are present. That is, the process of classifying the target family's viewing pattern information using the sample families' viewing pattern characteristics information is similar in principle to filtering. The viewership histories of two or more viewers may be mixed in the target family's viewership history. It is determined whether there are characteristic viewing patterns by classifying the target family's viewership history information based on the sample families' viewership history characteristics, and then it is further determined whether there are viewers with the characteristic viewing patterns. When inferring such profiles as gender and age of the members of each target family based on the target family's viewership history, the apparatus 100 in accordance with the exemplary embodiment identifies viewing patterns of each target family by parallel comparison using the gender/age sorters (sample families' viewing pattern characteristics information). That is, the profile inference component 130 classifies the target family's viewership history information based on the sample families' viewership history characteristics information, and analyzes gender/age-specific characteristics of the viewership history contained in the sample families' viewership history characteristics information, thereby making it possible to infer the gender/age groups of members of the target family. The secondary inference process of the profile inference component 130 will be described below with reference to
The profile inference component 130 classifies the viewing patterns 310 of the target family based on the sorters 320 and 330 set by gender and age by the sample family data processor 110. More specifically, the profile inference component 130 compares the viewing patterns 310 of the target family with the viewing patterns 320 of the first-group viewer to determine similarities, and compares the viewing patterns 310 with the viewing patterns 330 of the second-group viewer to determine similarities. By comparing the viewing patterns 310 of the target family, in which the viewing patterns of various viewers are combined, with each of the viewing patterns 320 of the first-group viewer and the viewing patterns 330 of the second-group viewer, it may be determined whether patterns of a viewer of a specific group are present in the viewing patterns 310 of the target family. When comparing the viewing patterns 320 of the first-group viewer and the viewing patterns 330 of the second-group viewer with the viewing patterns 310 of the target family, the profile inference component 130 may compare all viewing patterns or compare only the characteristic viewing patterns among all of the included patterns. Viewing pattern 1-d among the viewing patterns 320 of the first-group viewer and viewing pattern 2-d among the viewing patterns 330 of the second-group viewer overlap with viewing patterns of a different group. The viewing patterns overlapping with a different group's viewing patterns do not exhibit characteristic values, since many values associated with viewers of various groups are combined therein. Thus, only the viewing patterns other than viewing patterns 1-d and 2-d are compared with the viewing patterns 310 of the target family.
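The exclusion of overlapping viewing patterns before comparison can be illustrated with set operations. In the following Python sketch, each viewing pattern is represented by a hypothetical text label, a pattern shared by more than one group is discarded, and the remaining characteristic patterns are compared with the target family's patterns. The labels and the simple match ratio are assumptions chosen for illustration and are not the comparison measure of the embodiment.

```python
def distinctive_patterns(group_patterns):
    """Keep, for each group, only the patterns that no other group exhibits."""
    result = {}
    for group, patterns in group_patterns.items():
        others = set()
        for other_group, other_patterns in group_patterns.items():
            if other_group != group:
                others |= other_patterns
        result[group] = patterns - others
    return result


def match_ratio(distinctive, target_patterns):
    """Fraction of a group's characteristic patterns found in the target family."""
    if not distinctive:
        return 0.0
    return len(distinctive & target_patterns) / len(distinctive)


# Hypothetical pattern labels; "weekday-evening drama" is shared by both groups.
group_patterns = {
    "group1": {"weekend-night sports", "weekday-late news", "weekday-evening drama"},
    "group2": {"weekday-morning drama", "weekday-noon shopping", "weekday-evening drama"},
}
target = {"weekend-night sports", "weekday-late news", "weekday-evening drama"}

distinct = distinctive_patterns(group_patterns)
ratios = {group: match_ratio(pats, target) for group, pats in distinct.items()}
```

In this example the shared pattern plays the role of patterns 1-d and 2-d: it is present in the target family's history but contributes nothing to either group's ratio.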
Viewing behaviors of viewers are analyzed from a target viewer family (a target family) whose member profile is unknown, and the analysis result is input to N sorters which are based on the sample families' viewership history characteristics information, so that it is determined whether characteristic viewing patterns of each group are included in the viewing patterns of the target family. The profile inference component 130 infers that a viewer who corresponds to a sorter, which is determined as including the characteristic viewing pattern, belongs to the target family.
Referring to
The profile inference component 130 which has received the calculated probabilities from the sample family data processor 110 re-calculates the received probabilities as the probability distributions for individual groups in 404. For example, the viewership probabilities by type of program, and time and day may be represented as conditional probabilities of the viewership probability distribution by time and day and the viewership probability distribution by type of program. Then, in response to receiving, in 405, a request for viewing a specific type of program in a specific time interval on a specific day of week from a viewer 10 of a target family, the profile inference component 130 calculates conditional probabilities for each age range and gender group of viewers from the viewership probability distribution of each group of viewers in the specific time interval on the specific day of week and the viewership probability distributions of the specific type of program for each group of viewers and obtains the viewership probability distribution of the corresponding group of viewers watching the specific type of program in the specific time interval of the specific day, which results from the secondary inference in 406. The obtained viewership probability distribution as the outcome of the secondary inference is represented as the viewership probability distribution of the different age range and gender groups of viewers.
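The description does not give a closed-form expression for the conditional probabilities of the secondary inference. One plausible reading, shown in the Python sketch below, multiplies each group's viewership probability for the requested day and time interval by its viewership probability for the requested program type and normalizes the products over all groups. Treating the two distributions as independent and weighting all groups equally are assumptions of this sketch, as are the function name `secondary_inference` and the numeric values.

```python
def secondary_inference(slot_dist, type_dist, slot, program_type):
    """Return a normalized score per group for a (day/time, program type) request.

    Assumption: score(group) is proportional to
    P(slot | group) * P(program_type | group).
    """
    scores = {
        group: slot_dist[group].get(slot, 0.0)
        * type_dist.get(group, {}).get(program_type, 0.0)
        for group in slot_dist
    }
    total = sum(scores.values())
    if total == 0.0:
        return {group: 0.0 for group in scores}
    return {group: score / total for group, score in scores.items()}


# Hypothetical per-group distributions derived from sample family data.
slot_dist = {
    "M20": {("Sat", "21-24"): 0.4},
    "F50": {("Sat", "21-24"): 0.1},
}
type_dist = {
    "M20": {"sports": 0.5},
    "F50": {"sports": 0.2},
}

secondary = secondary_inference(slot_dist, type_dist, ("Sat", "21-24"), "sports")
most_likely = max(secondary, key=secondary.get)
```

The resulting dictionary corresponds to a viewership probability distribution over the gender/age range groups for the requested program type, day, and time interval.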
Referring to
The group/day/time viewership probability distribution table 510, the program-type/group viewership probability distribution table 520, and the program-type/group/time/day viewership probability distribution table 530 are based on thirteen groups of people, eight 3-hour time intervals, and seven days, wherein the 13 groups include a group of children under 10 (U10), a group of teenage boys (M10), a group of teenage girls (F10), a group of men in their 50s (M50), a group of women in their 50s (F50), a group of men aged 60 and over (M60), a group of women aged 60 and over (F60), and the like.
In 501, the profile inference component 130 infers an audience member by taking into consideration the primary inference result obtained through the procedures shown in
The profile inference component 130 may infer the audience based on the primary inference result that infers the group which is present in the target family, or may infer the audience based on family member information of the target family. When using the member information of the target family, the probability of the presence of a group belonging to the target family is 1, and the probability of the presence of a group not belonging to the target family is 0. When receiving the actual family member information of the target family, instead of the primary inference result, the profile inference component 130 infers that a group with the largest probability distribution value relative to TV viewing is the current audience. Depending on whether the table is mapped with the type of program, either viewership probability distributions by time and day or viewership probability distributions by time and day and type of program are used. If there is program type information, the viewership probability distributions by time and day and type of program may be used. In the same manner, the profile inference component 130 infers that the age and gender group with the largest conditional probability value is the current audience, wherein the conditional probability value is obtained from the product of the probability of the presence of each family member and the viewership probability distribution.
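For the case in which the actual family member information of the target family is available, the selection of the current audience can be sketched as follows. The Python example sets the presence probability of each group to 1 or 0 according to the known family composition, multiplies it by the group's viewership probability, and selects the largest product. The group labels, probability values, and the function name `infer_with_known_members` are hypothetical.

```python
def infer_with_known_members(members, viewing):
    """Pick the group with the largest presence (1 or 0) times viewing probability."""
    scores = {
        group: (1.0 if group in members else 0.0) * prob
        for group, prob in viewing.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical known composition of the target family.
members = {"M50", "F50", "F10"}
# Hypothetical viewership probabilities for the requested day, time, and program type.
viewing = {"M20": 0.5, "M50": 0.2, "F50": 0.25, "F10": 0.05}

current_audience, scores = infer_with_known_members(members, viewing)
```

Although `M20` has the largest viewership probability in this example, it is excluded because no such member belongs to the family, and `F50` is selected instead.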
In response to inferring the current audience belonging to the target family based on the primary inference and the secondary inference, the audience inference profile of the target family is generated in 502.
Referring to
In S602, viewing pattern characteristics of audiences of each gender/age range group are extracted from the viewing patterns of each family amongst the sample families based on a first viewing pattern generated from the received sample family data. In general, audiences of the same gender/age range group are more likely to exhibit similar viewing patterns. The viewing pattern characteristics of audiences of a specific gender/age range group are extracted by comparing the viewing patterns of families that include the corresponding audiences of the specific gender/age range with the viewing patterns of families that do not include the pertinent audiences. To this end, the viewing patterns of the sample families are categorized by age range and gender and the sample families are divided into two groups of families: one group of families that include members of the specific age range and gender; and the other group of families that do not include members of the specific age range and gender. As many sorters are generated as the number of the age range and gender groups of viewers to be sorted. For example, if a total of 200 sample families consists of 50 families that each include at least one man in his 20s and 150 families that do not include any men in their 20s, the sample families are split 50 to 150 for the group of male viewers aged 20 to 29; likewise, if the 200 sample families consist of 30 families that each include at least one woman in her 20s and the remaining 170 families, the sample families are split 30 to 170 for the group of female viewers aged 20 to 29. As such, with respect to N specific age range and gender groups of viewers who are to be sorted, the sample family data processor 110 (refer to
In response to the second viewing pattern being generated, the target families' viewing pattern information is categorized using the sample families' viewing pattern characteristics information classified by gender/age range, and a group of people corresponding to members of each target family is inferred in S604. The outcome obtained using the viewing pattern characteristics information as the sorters indicates the absence or presence of the characteristic viewing patterns of specific gender/age range audiences. That is, the process of classifying the target families' viewing pattern information based on the sample families' viewing pattern characteristics information is similar to the concept of filtering. The viewership history of the target families may include the viewership history of two or more audience members. It is determined whether the target families' viewing patterns include a characteristic viewing pattern, based on the result of classifying the target families' viewership history information using the sample families' viewership history characteristics, and it is further determined whether each target family has an audience showing the characteristic viewing pattern. To infer profiles, for example, the gender/age range, of each member of the target families from the target families' viewership history, the viewing patterns of the target families are compared with each gender/age range sorter (sample families' viewing pattern characteristics information) in a parallel fashion. That is, the target families' viewership history information is classified using the sample families' viewership history characteristics information, and the gender/age-associated characteristics of viewership history included in the sample families' viewership history characteristics information are analyzed, and thereby the members of each gender/age range can be inferred from the target families' viewership history information.
Operation S604 is equivalent to the primary inference process described with reference to
In S606, the calculated probabilities are re-calculated into probability distributions for individual groups. For example, the viewership probabilities by time and day and type of program may be represented as conditional probabilities of the viewership probability distribution by time and day and the viewership probability distribution by type of program. In addition, in response to a request for viewing a specific type of program in a specific time interval on a specific day of week being received from a target family member, a probability distribution of each age range and gender group of viewers viewing the specific type of program in the specific time interval on the specific day of week is obtained as a secondary inference result by calculating conditional probabilities for individual gender/age range groups of viewers from a probability distribution of viewing TV in the specific time interval on the specific day of week and a probability distribution of viewing the specific type of program in S607. The obtained secondary inference result may be represented as a viewership probability distribution of the individual gender/age range groups of viewers.
Then, based on the primary inference result and the secondary inference result, an audience of the target family is inferred in S607. The audience may be inferred by taking into consideration the secondary inference result generated through operation S605. In the primary inference result table with respect to members of the target family, the probability of presence of each group belonging to the target family, which is contained in family member information obtained from the primary inference result, reflects the precision of inference of each group. The precision refers to the likelihood of the inference being correct. If the precision of a group inferred as being included in the target family is PYa and the precision of a group inferred as not being included in the target family is PNa, the probability of the presence of the group inferred as being included in the target family is PYa and the probability of the presence of the group inferred as not being included in the target family is (1−PNa). The profile inference component 130 infers that the gender/age group with the largest conditional probability is the current audience, wherein the conditional probability is obtained from the product of the probability of the presence of a family member and the probability distribution of TV viewing.
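The combination of the primary and secondary inference results described above can be sketched as follows. In this Python example, the presence probability of a group is PYa when the group was inferred as present and (1−PNa) when it was inferred as absent, and the current audience is the group that maximizes the product of the presence probability and the viewership probability. The precision values, viewership probabilities, and function names are hypothetical and serve only to illustrate the arithmetic.

```python
def presence_probability(inferred_present, precision_yes, precision_no):
    """PYa if the group was inferred as present, otherwise (1 - PNa)."""
    return precision_yes if inferred_present else 1.0 - precision_no


def infer_current_audience(primary, precisions, viewing):
    """Return the group maximizing presence probability times viewing probability."""
    scores = {}
    for group, inferred_present in primary.items():
        p_yes, p_no = precisions[group]
        scores[group] = presence_probability(inferred_present, p_yes, p_no) * viewing[group]
    return max(scores, key=scores.get), scores


# Hypothetical primary inference result: is each group present in the target family?
primary = {"M20": True, "F50": False}
# Hypothetical sorter precisions per group: (PYa, PNa).
precisions = {"M20": (0.8, 0.9), "F50": (0.7, 0.9)}
# Hypothetical secondary inference result: viewership probability per group.
viewing = {"M20": 0.3, "F50": 0.7}

audience, audience_scores = infer_current_audience(primary, precisions, viewing)
```

Here `F50` has the larger viewership probability, but because the primary inference indicates with high precision that no such member is present, its product (0.1 × 0.7) falls below that of `M20` (0.8 × 0.3), and `M20` is inferred as the current audience.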
In S607, the current audience may be inferred using the primary inference result with respect to a group belonging to the target family, or may be inferred using actual family member information of the target family without using the primary inference result. In the case of using the family member information of the target family, the probability of the presence of the group belonging to the target family is 1 and the probability of the presence of the group not belonging to the target family is 0. In a case where the actual family member information of the target family is input, instead of the primary inference result, a group with the largest probability distribution value relative to TV viewing is inferred as current audiences. Depending on whether the table is mapped with type of program, viewership probability distribution by time and day or viewership probability distribution by time of day, day of week and type of program is used, and if information of type of program is present, viewership probability distribution by time and day, and type of program may be used.
According to the apparatus and method for inferring an audience profile in accordance with the exemplary embodiments of the present disclosure, it is possible to infer audience profiles, such as age range and gender of an audience member from viewership history of the family. Also, by using both viewership probability distribution of each gender/age range group of viewers and inference result of a member of a target family, it is possible to improve the precision of inference of the age range and gender of a current audience, when compared with the inference of the profile of the current audience only using viewership probability distribution. Further, without having to collect family member information of all audience families, the family member information and current audiences can be inferred from viewership history, and the inferred family member information and current audience information may be utilized for targeted advertising.
A number of examples have been described above. Nevertheless, it will be understood that various modifications may be made. For example, suitable results may be achieved if the described techniques are performed in a different order and/or if components in a described system, architecture, device, or circuit are combined in a different manner and/or replaced or supplemented by other components or their equivalents. Accordingly, other implementations are within the scope of the following claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2014-0002592 | Jan 2014 | KR | national |