In the following, the preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.
The program keyword acquiring unit 120 has a program keyword extracting portion 121 and acquires keywords contained in the program information file 510. The program information file 510 stores information on programs such as the titles and detailed descriptions of the programs, as will be described below. In digital television broadcasts, the content of the program information file 510 is stored as an Event Information Table (EIT) in Service Information (SI) in an MPEG (Moving Picture Experts Group)-2 transport stream.
The program keyword extracting portion 121 performs morphological analysis on the content of each program contained in the program information file 510 to extract keywords (hereinafter referred to as “program keywords”).
Morphological analysis herein refers to a process of separating a text into the smallest units called morphemes and analyzing the attributes of the individual morphemes. Such attributes generally include word classes such as “verb”, “adjective”, and “noun”. In this embodiment, nouns can be categorized into “common noun”, “place name”, “person name”, “organization name”, “proper noun”, etc. The program keyword acquiring unit 120 provides the genre determining unit 130 and the media personality information generating unit 150 with program keywords and corresponding attributes.
The genre determining unit 130 determines a program genre for the program keywords acquired by the program keyword acquiring unit 120 from the program information. In digital television broadcasts, program genres can be acquired by referring to a content descriptor in an EIT. The program genre may also be estimated from the program keywords acquired by the program keyword acquiring unit 120 from the program information.
The program genres include, for example, “movie”, “drama”, “sports”, “variety”, and “news”. If sufficient information on a media personality is not obtained from only the program information file 510, the genre determining unit 130 determines that it is necessary to obtain keywords from the caption information file 520. A media personality herein refers to a person who appears on a program, such as an entertainer, an athlete, or a newscaster. Program genres for which the keywords from the caption information file 520 are necessary include, for example, “sports” and “news”.
For example, in many cases, only names of teams are contained in program information of a sports program, and names of athletes are contained only in caption information. In the case of a news program, only names of announcers are contained in program information, and names of people appearing on the news program are contained only in caption information. For such program genres, the genre determining unit 130 notifies the caption keyword acquiring unit 140 that keywords need to be acquired from the caption information.
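The genre-based decision described above can be sketched as follows. This is a minimal illustration in Python, assuming that genres are represented as plain strings; the names `CAPTION_GENRES` and `needs_caption_keywords` are hypothetical and do not appear in the embodiment.

```python
# Genres for which program information alone usually lacks the names of
# the people appearing on the program (per the description: "sports" and
# "news"). The constant and function names are illustrative assumptions.
CAPTION_GENRES = frozenset({"sports", "news"})


def needs_caption_keywords(genre: str) -> bool:
    """Return True if keywords should also be acquired from caption information."""
    return genre in CAPTION_GENRES
```

In such a sketch, the genre determining unit 130 would notify the caption keyword acquiring unit 140 only when the function returns True.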
The caption keyword acquiring unit 140 has a caption keyword extracting portion 141 and a caption keyword generating portion 142 and acquires keywords contained in the caption information file 520. The caption information file 520 stores captions of each program such as closed captions in time series, as will be described below. In digital television broadcasts, the content of the caption information file 520 is stored in an MPEG-2 transport stream together with audio information and video information.
The caption keyword extracting portion 141 performs morphological analysis on the content of a caption contained in the caption information file 520 to extract keywords (hereinafter referred to as “caption keywords”). The caption keywords extracted by the caption keyword extracting portion 141 are stored in the caption index file 530. The caption index file 530 stores the caption keywords together with corresponding attributes as pairs, as will be described below.
When the genre determining unit 130 determines that keywords from the caption information file 520 are necessary, the caption keyword generating portion 142 acquires the keywords from the caption information file 520 and generates caption keywords. The caption keywords generated by the caption keyword generating portion 142 are supplied to the media personality information generating unit 150.
The media personality information generating unit 150 generates information on media personalities in programs, on the basis of the program keywords supplied by the program keyword acquiring unit 120 and the caption keywords supplied as necessary by the caption keyword acquiring unit 140. The media personality information generating unit 150 has a program keyword narrowing portion 151, a character name linking portion 152, and a caption keyword narrowing portion 153.
The program keyword narrowing portion 151 narrows down the program keywords supplied by the program keyword acquiring unit 120 to keywords relating to “person”, “last name”, and “first name”. The keywords narrowed down by the program keyword narrowing portion 151 are supplied to the character name linking portion 152.
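The narrowing operation can be illustrated with a short Python sketch. It assumes that each program keyword arrives as a pair of text and attribute, as supplied by the program keyword acquiring unit 120; the type `Keyword`, the constant `PERSON_ATTRIBUTES`, and the function `narrow_to_person` are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple


class Keyword(NamedTuple):
    text: str
    attribute: str  # attribute assigned by morphological analysis


# Attributes retained by the narrowing step, as named in the description.
PERSON_ATTRIBUTES = frozenset({"person", "last name", "first name"})


def narrow_to_person(keywords: list[Keyword]) -> list[Keyword]:
    """Keep only keywords whose attribute relates to a person, preserving order."""
    return [kw for kw in keywords if kw.attribute in PERSON_ATTRIBUTES]
```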
The character name linking portion 152 links a name of a media personality (media personality name) to a corresponding name of a character (character name) in a program, on the basis of the program keywords narrowed down by the program keyword narrowing portion 151. When the media personality is an actor or the like, the media personality name may be the screen name of the media personality or may be his or her real name, and the character name in the program may be the name of a person (character) whom the media personality performs, if the program is a movie program or a drama program.
Note that a media personality name and a character name can be distinguished from each other on the basis of a configuration of program information. However, to increase the precision of the distinction, for example, a performer dictionary 540 can be employed. Thus, if a name of a person is found in the performer dictionary 540, the name can be regarded as the screen name of the person and not as a character name.
The media personality name (screen name) and the corresponding character name that are linked by the character name linking portion 152 are stored together with a corresponding program identifier as media personality information in the media personality information file 550.
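The linking step, including the optional check against the performer dictionary 540, might be sketched as follows. The sketch assumes that the layout of the program description has already yielded a tentative (media personality name, character name) pair, and that the performer dictionary is available as a set of known screen names; `PersonalityRecord` and `link_names` are hypothetical names.

```python
from typing import NamedTuple, Optional


class PersonalityRecord(NamedTuple):
    program_id: str
    personality_name: Optional[str]
    character_name: Optional[str]


def link_names(program_id: str,
               layout_guess: tuple[str, str],
               performer_dictionary: set[str]) -> PersonalityRecord:
    """Build one record from a (personality, character) guess.

    The guess comes from the layout of the program description. If only
    the supposed character name is found in the performer dictionary,
    the two names are swapped, since a dictionary hit is regarded as a
    screen name rather than a character name.
    """
    personality, character = layout_guess
    if character in performer_dictionary and personality not in performer_dictionary:
        personality, character = character, personality
    return PersonalityRecord(program_id, personality, character)
```

Each returned record corresponds to one entry of the media personality information file 550 in this illustration.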
The caption keyword narrowing portion 153 narrows down the caption keywords supplied by the caption keyword acquiring unit 140 to keywords relating to “person”. The caption keywords can include either media personality names or character names, depending on the genre of the program. For example, in general, character names are used in movie programs and drama programs, and media personality names are used in sports programs and news programs. The caption keywords narrowed down by the caption keyword narrowing portion 153, i.e., media personality names or character names, are stored together with corresponding program identifiers as media personality information in the media personality information file 550.
As described above, in the media-personality information acquiring apparatus according to an embodiment of the present invention, pairs of media personality names and character names, or either media personality names or character names are stored together with corresponding program identifiers in the media personality information file 550.
The program identifier 511 indicates an identifier for allowing a corresponding program to be uniquely identified. The program start date and time 512 indicates the date and time when the broadcast of the program starts. The station code 513 indicates a code of a broadcast station that broadcasts the program. Note that a station code needs to be assigned to each broadcast station beforehand.
The program title 514 indicates the title of the program. The program description 515 indicates a detailed description of the content of the program. The format of the program description 515 is not particularly standardized. However, the format can roughly be classified as follows.
Referring to
Thus, when the program keyword narrowing portion 151 in the media personality information generating unit 150 extracts keywords related to “person”, “last name”, “first name”, and “symbol”, it is preferable that those symbols described in the examples of
The caption identifier 521 indicates an identifier allowing a caption to be uniquely identified. The program start date and time 522 indicates the date and time when the broadcast of a program containing the caption starts. The station code 523 indicates a code of a broadcast station that broadcasts the program containing the caption. The time code 524 indicates a time point at which the caption is presented during the program. The time code 524 may be presented as an elapsed time to the second since the beginning of the program. However, the time code 524 is not limited to being such an elapsed time.
The caption text 525 indicates the content (text) of a caption. The caption keyword extracting portion 141 performs keyword extraction on the caption text 525. The color code 526 indicates a code of a color in which the caption is presented. The color code 526 is set when the caption is created, such that different colors are assigned to individual speakers.
In the caption text 525, the character name of each speaker is often placed in parentheses at the beginning of a caption. The caption keyword extracting portion 141 extracts the character name in parentheses as a keyword. However, the name of a speaker may not be provided for every caption and may be omitted for the second time and thereafter. Thus, when a speaker name is not provided in the caption text 525, the caption keyword extracting portion 141 preferably refers to the color code 526 to obtain the omitted speaker name so as to extract keywords.
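The recovery of omitted speaker names from the color code 526 can be sketched in Python as follows. The sketch assumes that captions are processed in time order as (caption text, color code) pairs and that both ASCII and full-width parentheses may enclose a speaker name; the function name `resolve_speakers` and the regular expression are illustrative assumptions, not part of the embodiment.

```python
import re
from typing import Iterable, Optional

# Speaker name in parentheses at the beginning of a caption.
# Accepting full-width parentheses as well is an assumption.
_SPEAKER = re.compile(r"^\s*[(（]([^)）]+)[)）]")


def resolve_speakers(captions: Iterable[tuple[str, str]]) -> list[Optional[str]]:
    """Return the speaker name for each caption, or None if it is unknown.

    When a caption carries an explicit name, the name is remembered for
    that caption's color code. When the name is omitted, the name most
    recently seen with the same color code is used instead.
    """
    speaker_by_color: dict[str, str] = {}
    speakers: list[Optional[str]] = []
    for text, color in captions:
        match = _SPEAKER.match(text)
        if match:
            name = match.group(1).strip()
            speaker_by_color[color] = name
        else:
            name = speaker_by_color.get(color)
        speakers.append(name)
    return speakers
```

A caption whose color code has not yet been associated with any name yields None in this sketch, so no keyword would be extracted for it.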
For example, in the caption information file 520 illustrated in
The caption identifier 531 is an identifier allowing a caption corresponding to a caption keyword to be uniquely identified. Thus, the same caption identifier 531 is provided for individual caption keywords extracted from the same caption. The keyword 537 indicates the content of the caption keyword. The attribute 538 indicates the attribute of the caption keyword. The attribute 538 is based on morphological analysis and can include more detailed information than general word classes.
The caption index file 530 is not only referred to by the caption keyword generating portion 142 but also used in a search for a caption, as will be described below. This allows a rapid search for a media personality name or a character name from captions.
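One way such an index could support a rapid search is a lookup table from keyword to caption identifiers. The following Python sketch assumes that the records of the caption index file 530 are available as (caption identifier, keyword, attribute) tuples; the in-memory dictionary and the name `build_keyword_lookup` are assumptions made for illustration, since the storage layout of the file is not specified here.

```python
from collections import defaultdict
from typing import Iterable


def build_keyword_lookup(
    index_records: Iterable[tuple[int, str, str]],
) -> dict[str, list[int]]:
    """Map each caption keyword to the identifiers of captions containing it."""
    lookup: defaultdict[str, list[int]] = defaultdict(list)
    for caption_id, keyword, _attribute in index_records:
        lookup[keyword].append(caption_id)
    return dict(lookup)
```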
The program identifier 551 indicates an identifier allowing a program on which a corresponding media personality appears to be uniquely identified. The media personality name 552 indicates the name of the media personality (screen name, etc.). The character name 553 indicates the character name of the media personality in the program.
For a program such as a movie program or a drama program, in which character names are used, the character name of the media personality is indicated as the character name 553. However, for a program such as a sports program or a news program, in which character names are not used, no character name is indicated.
The operation receiving unit 210 serves as a user interface for receiving a search operation performed by a user for a search relating to a media personality. The search operation includes input of the name of the media personality to be searched for. The user may designate a program to be searched for or all programs may be searched for.
The program searching unit 220 searches for the media personality name which has been set as a search object in the operation receiving unit 210, in the media personality information file 550. In the media personality information file 550, the media personality information generating unit 150 stores, as one record, either a pair of a media personality name and a character name, or a media personality name or a character name alone, together with a corresponding program identifier. The program searching unit 220 searches for the media personality name to be searched for in the media personality information file 550 and outputs a corresponding record.
The genre determining unit 230 determines the genre of a program searched by the program searching unit 220. In digital television broadcasts, the program genre can be acquired by referring to a content descriptor in an EIT or estimated from the content of a record acquired by the program searching unit 220 from the media personality information file 550.
Program genres to be determined by the genre determining unit 230 include, for example, “movie”, “drama”, “sports”, “variety”, and “news”, as in the case of the genre determining unit 130. The genre determining unit 230 determines whether a media personality is searched for by a media personality name or a corresponding character name in a search performed by the caption searching unit 270.
For example, in a movie program and a drama program, character names are often presented in captions. On the other hand, in other programs including a sports program, a variety program, and a news program, media personality names are presented in captions. Thus, the genre determining unit 230 determines that the search is performed on the basis of the character name of the media personality if the program to be searched for is a movie program or a drama program. On the other hand, the genre determining unit 230 determines that the search is performed on the basis of the media personality name if the program to be searched for is a sports program, a variety program, or a news program.
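The choice of the name used for the caption search can be sketched as follows in Python. Genres are assumed to be plain strings, and the names `CHARACTER_NAME_GENRES` and `select_search_name` are hypothetical. Falling back to the media personality name when no character name is recorded is an assumption of this sketch, not a stated feature of the embodiment.

```python
from typing import Optional

# Genres whose captions usually present character names.
CHARACTER_NAME_GENRES = frozenset({"movie", "drama"})


def select_search_name(genre: str,
                       personality_name: str,
                       character_name: Optional[str]) -> str:
    """Choose the name by which the caption index is searched."""
    if genre in CHARACTER_NAME_GENRES and character_name:
        return character_name
    # Sports, variety, news, and (by assumption) records without a
    # character name are searched by the media personality name.
    return personality_name
```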
The media personality keyword extracting unit 260 extracts a media personality name or a character name from a record supplied by the program searching unit 220 as media personality keywords. In the record supplied by the program searching unit 220, a media personality name or a character name is recorded with the first name and last name not being separated. The media personality keyword extracting unit 260 divides such a media personality name or character name into “name”, “first name”, and “last name”, and sets the divided words as media personality keywords.
The caption searching unit 270 refers to the caption index file 530 on the basis of media personality keywords supplied by the media personality keyword extracting unit 260 and outputs the time code 524 of a corresponding caption. At this time, the caption searching unit 270 performs a search by a media personality name or a character name in accordance with a result of determination performed by the genre determining unit 230. The time code 524 of the caption can be derived from the caption identifier 521 in the caption information file 520 which corresponds to the caption identifier 531 corresponding to a record searched by the caption searching unit 270.
In addition, the caption searching unit 270 receives from the program searching unit 220 program identifiers to be searched for so as to narrow down search results using the program identifiers. Specifically, the caption searching unit 270 derives the program start date and time 512 and station code 513 corresponding to each of the program identifiers supplied by the program searching unit 220 from the program identifier 511 in the program information file 510 which corresponds to each of the program identifiers. Then, the caption searching unit 270 refers to the caption information file 520 to search for the program start date and time 522 and the station code 523 which correspond to the program start date and time 512 and station code 513, respectively, so as to acquire a range of the caption identifiers 521. Thus, by narrowing down the search results (time codes 524) so that the search results correspond to the individual caption identifiers within the acquired range, the time codes in the program to be searched for can be generated as the search result 590. The program title can be acquired from the program title 514 in the program information file 510.
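The narrowing of search results to one program can be illustrated with the following Python sketch. It assumes simplified in-memory forms of the program information file 510 and the caption information file 520, and it filters matching captions directly instead of first computing a range of caption identifiers; the function name `time_codes_for_program` and the data shapes are hypothetical.

```python
from typing import Iterable


def time_codes_for_program(
    program_id: str,
    programs: dict[str, tuple[str, str]],
    captions: Iterable[tuple[int, str, str, int]],
    hit_caption_ids: set[int],
) -> list[int]:
    """Return time codes of index hits that belong to the given program.

    programs maps a program identifier to (start date and time, station code).
    captions holds (caption id, start date and time, station code, time code).
    hit_caption_ids are the caption identifiers found via the caption index.
    """
    start, station = programs[program_id]
    return [
        time_code
        for caption_id, cap_start, cap_station, time_code in captions
        if cap_start == start
        and cap_station == station
        and caption_id in hit_caption_ids
    ]
```

A caption is attributed to a program here when both its program start date and time and its station code match those of the program, mirroring the correspondence between the two files described above.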
When the selection of the item is instructed by the user, a corresponding media personality can be selected in the media personality information 620, as described below.
Thus, a scene in which the selected media personality performs his or her character can be searched for, and a result of the search is displayed in a search result 640 in the lower left of the screen. In this example, three captions are displayed together with corresponding time codes in the search result 640.
As described above, a list of media personalities corresponding to a program on the program list is displayed so that a user selects a media personality to be searched for from among the media personality list. With this arrangement, it is no longer necessary for the user to enter the name of a media personality to be searched for. In the above example, the case is described where a search is performed on the basis of a program corresponding to a media personality list. However, a media personality can be searched for by searching through all programs.
The file access module 740 first accesses the program information file 510 to acquire the program title 514 and the program description 515 of a program (701) and returns the operation result to the main routine 720 (702). Then, the keyword extraction module 730 extracts program keywords from the program description 515 (703) and returns the operation result to the main routine 720 (704). The file access module 740 retrieves media personality names and character names and links the media personality name to the corresponding character name (705) and returns the operation result to the main routine 720 (706). Thus, the file access module 740 registers the media personality names and the corresponding character names in the media personality information file 550 (707). A notification of success or failure of the registration is returned to the main routine 720 (708).
Subsequently, in the main routine 720, the genre of the program is determined (709). If the genre of the program is determined to be a genre for which keywords need to be acquired from caption information, the file access module 740 accesses the caption index file 530 to acquire caption keywords (711) and returns the operation result to the main routine 720 (712). Then, the file access module 740 retrieves media personality names (715) and returns the operation result to the main routine 720 (716). Thus, the file access module 740 registers the media personality names in the media personality information file 550 (717). A notification of success or failure of the registration is returned to the main routine 720 (718).
Note that in this example, it is assumed that the caption index file 530 has been generated beforehand.
When a program is designated by a user through the title list 610 (see,
Then, the keyword extraction module 730 extracts program keywords from the program description 515 (725) and returns the operation result to the main routine 720 (726). Subsequently, the file access module 740 retrieves media personality names and character names and links the media personality names to the corresponding character names (727) and returns the operation result to the main routine 720 (728). As a result, information indicating the linkage of the media personality names and the corresponding character names is displayed in the GUI (729) so as to be presented to the user (731).
When the user designates a media personality in the media personality information 620 (see,
Then, the file access module 740 narrows down a range of caption identifiers on the basis of a correspondence relationship between the caption information file 520 and the program information file 510 (756), and returns the operation result to the main routine 720 (757). Thus, the search results for the caption index file 530 are narrowed down (758). The search results are displayed in the GUI (759) so as to be presented to the user (761).
In this example, programs to be executed on a computer are similar to those in the example of
When the main routine 720 receives a search request, the file access module 740 searches through the media personality information file 550 to acquire program identifiers of programs in which the media personality appears (771) and returns the operation result to the main routine 720 (772). For each of the program identifiers, the operation procedure from (751) to (758) described using
As described above, according to an embodiment of the present invention, in generating the media personality information file 550, the content of the caption information file 520 is also reflected in the media personality information file 550 if the genre determining unit 130 determines the genre of a program to be a sports program or a news program. With this arrangement, media personality information can be obtained which could not be extracted if only the content of the program information file 510 were used.
In addition, when scenes in a program in which a media personality appears are searched for, the caption index file 530 is searched through by the name of the media personality if the genre determining unit 230 determines the genre of the program to be “sports”, “variety”, or “news”. On the other hand, the caption index file 530 is searched through by a character name of the media personality if the genre determining unit 230 determines the genre of the program to be “movie” or “drama”. This arrangement allows a user to precisely search for scenes where a desired media personality appears.
It should be understood that the above embodiments of the present invention illustrate examples for implementing the present invention. The examples illustrated in the embodiments correspond to elements in the claims. However, the embodiments of the present invention are not limited to these examples and various modifications may be made without departing from the scope of the present invention.
Specifically, according to an aspect of the present invention, a program keyword acquiring unit corresponds, for example, to the program keyword acquiring unit 120; a program genre determining unit corresponds, for example, to the genre determining unit 130; a caption keyword acquiring unit corresponds, for example, to the caption keyword acquiring unit 140; and a media personality information generating unit corresponds, for example, to the media personality information generating unit 150.
According to an aspect of the present invention, an operation receiving unit corresponds, for example, to the operation receiving unit 210; a program searching unit corresponds, for example, to the program searching unit 220; a program genre determining unit corresponds, for example, to the genre determining unit 230; and a caption searching unit corresponds, for example, to the caption searching unit 270.
According to an aspect of the present invention, a program keyword acquiring unit corresponds, for example, to the program keyword acquiring unit 120; a first program genre determining unit corresponds, for example, to the genre determining unit 130; a caption keyword acquiring unit corresponds, for example, to the caption keyword acquiring unit 140; a media personality information generating unit corresponds, for example, to the media personality information generating unit 150; an operation receiving unit corresponds, for example, to the operation receiving unit 210; a program searching unit corresponds, for example, to the program searching unit 220; a second program genre determining unit corresponds, for example, to the genre determining unit 230; and a caption searching unit corresponds, for example, to the caption searching unit 270.
According to an aspect of the present invention, a program keyword acquiring step corresponds, for example, to the operation (703); a program genre determining step corresponds, for example, to the operation (709); a caption keyword acquiring step corresponds, for example, to the operation (711); and a media personality information generating step corresponds, for example, to the operation (715).
According to an aspect of the present invention, an operation receiving step corresponds, for example, to the operation (741); a program searching step corresponds, for example, to the operation (771); a program genre determining step corresponds, for example, to the operation (751); and a caption searching step corresponds, for example, to the operations (752) to (758).
The operation steps described in the above embodiments may be considered as a method including a series of operation steps or as a program for causing a computer to execute the series of operation steps or a recording medium for storing the program.
It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and alterations may occur depending on design requirements and other factors insofar as they are within the scope of the appended claims or the equivalents thereof.
Number | Date | Country | Kind |
---|---|---|---|
P2006-192309 | Jul 2006 | JP | national |