The present invention relates to a content register device, a content register method and a content register program, and particularly relates to a content register device, a content register method and a content register program for registering content after adding a tag for search to the content.
In a database for managing content such as images, the content is stored with metadata, such as keywords, associated with the content, and the target content is obtained by searching the keywords. The keywords are registered by the person who registers the content. When there is a large amount of content to be registered, registering the keywords is cumbersome. In addition, the registered keywords are selected based on the subjectivity of the person who registers the content, and the keywords used for search are selected based on the subjectivity of the people who search the content (hereinafter, searchers). When the person who registers the content and the searcher select different keywords for identical content, the target content may be difficult to find.
In order to solve the difficulty of search caused by keyword selection, in Japanese Patent Laid-open Publication No. 10-049542, one part of an input image is analyzed, and keywords such as “tree”, “human face” and the like are extracted from the shape, colors, size, texture, and so on of this part. The keywords are then registered in association with the image. In Japanese Patent Laid-open Publication No. 2002-259410, metadata of content such as an image and a feature quantity of the content are managed separately. When a new image is registered in a database, metadata of a previously input image whose feature quantity is similar to that of the new image is given to the new image.
According to the invention disclosed in Japanese Patent Laid-open Publication No. 10-049542, since the keywords are automatically extracted, the keywords can be inferred when the extraction method is understood, and therefore the hit rate in search can be improved. However, since the keywords are limited to those extracted from the image, a broad-ranging search cannot be performed.
According to the invention disclosed in Japanese Patent Laid-open Publication No. 2002-259410, since previously registered metadata is used for the newly input content, a considerable amount of content needs to be stored before adequate metadata can be given to the newly input content; otherwise the search accuracy cannot be improved.
It is an object of the present invention to provide a content register device, a content register method and a content register program, for automatically providing content with keywords which enable an accurate, broad-ranging search of the content even with a small amount of registered data.
In order to achieve the above and other objects, a content register device of the present invention includes a content input device, a tag production device, a thesaurus, an associated word acquiring device, a score acquiring device, and a content database. When content is input by the content input device, the tag production device automatically produces a tag in which a keyword representing characteristics of the content is described. In the thesaurus, words are sorted and arranged in groups that have similar meanings. The associated word acquiring device acquires an associated word of the keyword by searching the thesaurus. The score acquiring device acquires a score representing the degree of association between the associated word and the keyword with use of the thesaurus. The content database registers the content, the tag, the associated word and the score in association with each other.
The tag production device includes a characteristics extracting section, a word table, and a keyword selecting section. The characteristics extracting section extracts the characteristics that can become the keyword by analyzing the content or metadata attached to the content. In the word table, the characteristics and a word are stored in association with each other. The keyword selecting section selects a word corresponding to the characteristics by searching the word table and describes the word as the keyword in the tag.
When the content is an image, the characteristics extracting section extracts at least one characteristic color of the image. The word table stores the characteristic color and a color name in association with each other. The keyword selecting section selects a color name corresponding to the characteristic color by searching the word table and describes the color name as the keyword in the tag.
The tag production device may include an image recognizing section and an object name table. The image recognizing section recognizes a kind and/or a shape of an object in the image. In the object name table, the object's kind is stored in association with an object name and/or the object's shape is stored in association with a shape name. At this time, the keyword selecting section selects an object name corresponding to the object's kind and/or a shape name corresponding to the object's shape by searching the word table and describes the object name and/or the shape name as the keyword in the tag.
The tag production device may include a color name conversion table in which the object name and/or the shape name, an original color name of the object, and a common color name corresponding to the original color name are stored in association with each other. At this time, the keyword selecting section selects a corresponding original color name by searching the color name conversion table based on the object name and/or the shape name, and the color name of the characteristic color, and describes the corresponding original color name as the keyword in the tag.
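The lookup in the color name conversion table can be pictured as a mapping keyed by the object name and the common color name. The following is only a minimal sketch under that assumption; the object names, the original color names, and the function name `original_color_name` are all hypothetical and do not come from the disclosure.

```python
# Hypothetical color name conversion table:
# (object name, common color name) -> original color name of that object.
COLOR_NAME_CONVERSION = {
    ("CAR MODEL A", "RED"): "SUNSET RED",
    ("CAR MODEL A", "BLUE"): "OCEAN BLUE",
    ("CAMERA MODEL B", "RED"): "RUBY",
}


def original_color_name(object_name, color_name):
    """Return the original color name for the object, or None if no entry exists."""
    return COLOR_NAME_CONVERSION.get((object_name, color_name))
```

When an entry is found, the returned name would be described in the tag as an additional keyword; when none is found, the tag simply keeps the common color name.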
The tag production device may include a color impression table in which a plurality of color combinations and color impressions obtained from the color combinations are stored in association with each other. At this time, the keyword selecting section selects a corresponding color impression by searching the color impression table based on the characteristic colors extracted by the characteristics extracting section, and describes the corresponding color impression as the keyword in the tag.
The characteristics extracting section may extract time information such as created date and time of the content. At this time, the keyword selecting section selects a word associated with the time information by searching the word table that stores words related to date and time. The word selected by the keyword selecting section is described as the keyword in the tag.
The characteristics extracting section may extract location information such as a created place of the content. At this time, the keyword selecting section selects a word associated with the location information by searching the word table that stores words related to location and place. The word selected by the keyword selecting section is described as the keyword in the tag.
According to another embodiment of the present invention, the content register device further includes a schedule management device having an event input device and an event memory device. The event input device inputs a name of an event, and date and time of the event. The event memory device memorizes the event's name and the event's date and time in association with each other. At this time, the tag production device includes a schedule associating section for selecting an event's name and an event's date and time corresponding to time information such as created date and time of the content by searching the event memory device based on the time information, and describes the event's name and the event's date and time as the keywords in the tag.
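The schedule association can be sketched as a search of the stored events for one whose time range contains the content's creation time. This is an illustration only: the event entries are invented, and storing an end time for each event is an assumption, since the text states only that an event's date and time are memorized.

```python
from datetime import datetime

# Hypothetical event memory: (event name, start, end). The end time is an assumption.
EVENTS = [
    ("ATHLETIC MEET", datetime(2006, 10, 9, 9, 0), datetime(2006, 10, 9, 15, 0)),
    ("BIRTHDAY PARTY", datetime(2006, 12, 24, 18, 0), datetime(2006, 12, 24, 21, 0)),
]


def schedule_keywords(created, events=EVENTS):
    """Return event names and event start times whose range contains `created`."""
    keywords = []
    for name, start, end in events:
        if start <= created <= end:
            keywords.append(name)
            keywords.append(start.strftime("%Y-%m-%d %H:%M"))
    return keywords
```

For an image created at 10:30 on October 9, 2006, this sketch would yield the event name and its start time as two keywords.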
In the thesaurus, the words are arranged in a tree structure according to the conceptual broadness of the words. The score acquiring device acquires the score according to the number of words between the keyword and the associated word.
The content register device may further include a weighting device for assigning a weight to the keyword. The weighting device assigns the weight based on the number of the keywords existing in the content database.
A content register method and a content register program of the present invention include the steps of: inputting content; automatically producing a tag in which a keyword representing characteristics of the content is described; acquiring an associated word of the keyword by searching a thesaurus having words sorted and arranged in groups that have similar meanings; acquiring a score representing the degree of association between the associated word and the keyword with use of the thesaurus; and registering the content, the tag, the associated word and the score in association with each other.
According to the present invention, the keywords are automatically added to the content when the content is registered. Owing to this, the content registration can be facilitated. In addition, since the keywords are selected according to a predetermined rule, the keywords used by the person who registers the content and the searcher do not differ based on their subjectivity. Accordingly, search accuracy and the hit rate in search can be improved.
Since the associated words are also automatically selected and registered with the keywords, the content can be searched even with ambiguous keywords by utilizing the associated words. Accordingly, a broad-ranging search can be performed. Moreover, since the score of the associated word and the weight of the keyword are also registered, an accurate search can be performed based on the degree of association between the associated word and the keyword, the level of importance of the keyword, and the like.
The keywords included in the tag are selected from a variety of characteristics such as the characteristic color extracted from the content, the time information, the location information, the object's kind and/or shape according to the image recognition, the original color of the object, the color impressions produced from various color combinations, and the like. Owing to this, a broad-ranging search can be performed. Moreover, since the event's name recorded in the schedule management device can be described as the keyword, a search based on a user's personal activity can also be performed.
The above and other objects and advantages of the present invention will be more apparent from the following detailed description of the preferred embodiments when read in connection with the accompanying drawings, wherein like reference numerals designate like or corresponding parts throughout the several views, and wherein:
In
As shown in
The CPU 3 operates as an image registering section 21 shown in
The tag production section 23 is composed of a characteristics extracting section 29, a word table 30, and a keyword selecting section 31. The tag production section 23 produces a tag 35 for data search and adds the tag 35 to the image data 18, like an analyzed image file 34 shown in
The characteristics extracting section 29 analyzes the input image file 17 and extracts characteristics that can be keywords. For example, the characteristics extracting section 29 extracts a characteristic color of an image from the image data 18 and obtains the time information such as shooting date and time and the location information such as latitude and longitude of the shooting place from the EXIF data 19. A color having the highest number of pixels (the color having the maximal area), a color having the highest pixel density, or the like may be selected as the characteristic color. The characteristic color may be extracted according to the frequency of appearance in color samples as described in Japanese Patent Laid-open Publication No. 10-143670. Note that more than one characteristic color may be extracted.
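Selecting the color with the highest number of pixels can be sketched as a histogram over the pixel values. The sketch below is an illustration, not the method of the cited publication: it assumes the image is already available as a list of (R, G, B) tuples, and the quantization step that groups near-identical shades is an added assumption.

```python
from collections import Counter


def quantize(pixel, step=64):
    """Snap each RGB channel to the low edge of its bucket so similar shades count together."""
    return tuple((channel // step) * step for channel in pixel)


def characteristic_colors(pixels, count=1, step=64):
    """Return the `count` most frequent quantized colors, largest area first."""
    histogram = Counter(quantize(pixel, step) for pixel in pixels)
    return [color for color, _ in histogram.most_common(count)]
```

Passing `count=2` or more corresponds to the case in which more than one characteristic color is extracted.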
The word table 30 stores the characteristics extracted by the characteristics extracting section 29 and the words used as the keywords in association with each other. As shown in
The keyword selecting section 31 searches the word table 30 based on the input characteristic color, time information and/or location information, and selects corresponding words. Then, the keyword selecting section 31 produces the tag 35 having the selected words as the keywords and inputs the tag 35 to the associated word acquiring section 25.
The associated word acquiring section 25 searches the thesaurus 24 for words associated with the keywords described in the tag 35 and inputs the words to the score acquiring section 26. In the thesaurus 24, words are sorted and arranged in groups that have similar meanings, and the words are arranged in a tree structure according to the conceptual broadness of the words. As shown in
The associated words acquired in the associated word acquiring section 25 are added as associated word data 36 to the analyzed image file 34, as shown in
The score acquiring section 26 acquires a score representing the degree of association of the associated word and the keyword with use of the thesaurus 24. As shown in
Hereinafter, the operation of the above embodiment will be explained with reference to the flow charts shown in
The characteristics extracting section 29 extracts the characteristic color of the image from the image data 18 of the image file 17. The characteristics extracting section 29 may also extract the time information such as shooting date and time and/or the location information such as shooting place from the EXIF data 19 of the image file 17. The keyword selecting section 31 searches the word table 30 and selects words corresponding to the characteristics extracted by the characteristics extracting section 29, as the keywords.
For example, when the characteristic color of the image data 18 has the RGB value of FF0000 representing the color red, the color name “RED” is selected from the color table 40 as the keyword. When the time information is “JANUARY 1ST”, words like “NEW YEAR” and/or “NEW YEAR'S DAY” are selected from the time information table 41 as the keyword. Based on the latitude and longitude of the location information, the city name like “SAPPORO-SHI” is selected from the location information table 42 as the keyword. The keyword selecting section 31 selects such words as the keywords and produces the tag having these keywords described. The tag is input to the associated word acquiring section 25.
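The keyword selection in this example can be sketched as three table lookups. The table contents below are small hypothetical excerpts, the bounding box used for the city is a rough illustrative value, and exact matching of the RGB value is a simplifying assumption (a practical color table would presumably map ranges of values to a color name).

```python
# Hypothetical excerpts of the color table, time information table and location information table.
COLOR_TABLE = {"FF0000": "RED", "00FF00": "GREEN", "0000FF": "BLUE"}
TIME_TABLE = {(1, 1): ["NEW YEAR", "NEW YEAR'S DAY"]}  # keyed by (month, day)
# (city name, lat_min, lat_max, lon_min, lon_max); the box is illustrative only.
LOCATION_TABLE = [("SAPPORO-SHI", 42.8, 43.2, 141.0, 141.6)]


def select_keywords(rgb_hex=None, month_day=None, lat_lon=None):
    """Collect the keywords for whichever characteristics were extracted."""
    keywords = []
    if rgb_hex is not None and rgb_hex.upper() in COLOR_TABLE:
        keywords.append(COLOR_TABLE[rgb_hex.upper()])
    if month_day is not None:
        keywords.extend(TIME_TABLE.get(month_day, []))
    if lat_lon is not None:
        lat, lon = lat_lon
        for name, lat_min, lat_max, lon_min, lon_max in LOCATION_TABLE:
            if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max:
                keywords.append(name)
    return keywords
```

The returned list plays the role of the keywords described in the tag.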
The associated word acquiring section 25 searches the thesaurus 24 for words associated with the keywords of the tag and selects the associated words. For example, from the keyword “RED”, associated words like “AKA”, “CRIMSON”, “VERMILLION” and so on, and similar color names like “PINK”, “ORANGE” and so on are selected. From the keywords “NEW YEAR” and/or “NEW YEAR'S DAY”, associated words like “MORNING OF NEW YEAR'S DAY”, “COMING SPRING” and so on are selected. From the keyword “SAPPORO-SHI”, associated words like “HOKKAIDO”, “CENTRAL HOKKAIDO” and the like are selected. The associated words and the tag are input to the score acquiring section 26.
The score acquiring section 26 acquires a score representing the degree of association of the associated word and the keyword with use of the thesaurus 24. The score is calculated according to the internodal distance between the keyword and the associated word. For example, the score of the associated word “AKA” to the keyword “RED” is “1”, and the score of the associated word “CRIMSON” to the keyword “RED” is “2”. The score is input to the image database 5 together with the tag and the associated words.
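A score based on internodal distance can be sketched as a breadth-first walk over the thesaurus tree, counting the edges between the keyword and each word reached. The tree below is hypothetical; it is arranged only so that “AKA” and “CRIMSON” receive the scores 1 and 2 given in the example, and the `max_distance` cut-off is an added assumption.

```python
from collections import deque

# Hypothetical thesaurus fragment: broader word -> narrower words.
THESAURUS = {
    "COLOR": ["RED", "BLUE"],
    "RED": ["AKA", "DEEP RED"],
    "DEEP RED": ["CRIMSON"],
}


def build_adjacency(tree):
    """Turn the parent -> children mapping into an undirected adjacency map."""
    adjacency = {}
    for parent, children in tree.items():
        for child in children:
            adjacency.setdefault(parent, set()).add(child)
            adjacency.setdefault(child, set()).add(parent)
    return adjacency


def associated_words_with_scores(keyword, tree=THESAURUS, max_distance=2):
    """Return {associated word: number of edges from the keyword}, up to max_distance."""
    adjacency = build_adjacency(tree)
    scores = {}
    seen = {keyword}
    queue = deque([(keyword, 0)])
    while queue:
        word, distance = queue.popleft()
        if distance == max_distance:
            continue  # do not expand beyond the cut-off
        for neighbor in adjacency.get(word, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                scores[neighbor] = distance + 1
                queue.append((neighbor, distance + 1))
    return scores
```

In this sketch a smaller score means a closer association, which matches the example values above.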
The image database 5 adds the tag, the associated words, and the score to the image file 17 input from the image input section 22 and produces the analyzed image file 34, and stores this image file 34 to a predetermined memory area. The keywords and associated words in the tags enable the image file search.
In this way, since the keywords representing the characteristics of the input image are automatically added to the image file, the person who registers the image does not need to input the keywords. Owing to this, the image registration is facilitated. In addition, since the keywords are selected according to the predetermined rule, the keywords can be easily inferred, which improves search accuracy and the hit rate in search. Since the image search can be performed not only with the keywords but also with the associated words, a broad-ranging search can be performed. When the score, which assigns a weight to the keyword, is used to output the image search result, the image search can be performed with higher accuracy.
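One way the score might be used when outputting search results is sketched below. The record layout and the ranking rule (a direct keyword match first, then associated-word matches in increasing order of score) are assumptions made for illustration; the text does not specify how results are ordered.

```python
# Hypothetical registered records: keywords plus associated words with their scores.
RECORDS = [
    {"file": "a.jpg", "keywords": ["RED"], "associated": {"AKA": 1, "CRIMSON": 2}},
    {"file": "b.jpg", "keywords": ["CRIMSON"], "associated": {"RED": 2}},
    {"file": "c.jpg", "keywords": ["BLUE"], "associated": {"NAVY": 1}},
]


def search(term, records=RECORDS):
    """Return matching file names, closest association first."""
    hits = []
    for record in records:
        if term in record["keywords"]:
            hits.append((0, record["file"]))  # direct keyword match ranks first
        elif term in record["associated"]:
            hits.append((record["associated"][term], record["file"]))
    return [name for _, name in sorted(hits)]
```

Searching for “CRIMSON” in this sketch returns `b.jpg` before `a.jpg`, because the first carries the term as a keyword and the second only as an associated word with score 2.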
In the above embodiment, the characteristic colors are extracted from the image data 18. It is also possible to recognize and use a kind and/or a shape of an object in the image as the keywords. As shown in
Each product may use original color names. The image search may be performed with use of such original color names. As shown in
The image management program 4 may be operated on a general-purpose personal computer (PC). A schedule management program is commonly installed on the PC to manage a schedule. The schedule input to the schedule management program may be used for the image management.
As shown in
It is known that various color impressions can be obtained from a plurality of color combinations. For example, a color combination mainly composed of reddish and bluish colors having low brightness may provide an impression of elegance. A color combination mainly composed of grayish colors having medium brightness may provide impressions of being natural, ecological, and the like. Such color impressions can be used for the image search.
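The two examples above can be written as entries of a small impression table. This sketch assumes that the characteristic colors have already been classified into hue families and a brightness level; that classification, the labels used, and the function name are hypothetical.

```python
# Hypothetical color impression table: (set of hue families, brightness level) -> impressions.
COLOR_IMPRESSION_TABLE = {
    (frozenset({"REDDISH", "BLUISH"}), "LOW"): ["ELEGANT"],
    (frozenset({"GRAYISH"}), "MEDIUM"): ["NATURAL", "ECOLOGICAL"],
}


def color_impressions(hue_families, brightness):
    """Return the impressions for a color combination, or an empty list if none is stored."""
    key = (frozenset(hue_families), brightness)
    return list(COLOR_IMPRESSION_TABLE.get(key, []))
```

Using a `frozenset` as part of the key makes the lookup independent of the order in which the characteristic colors were extracted.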
As shown in
It is also possible to assign a weight to the keyword. As shown in
When the image search results are displayed on the monitor 10, the keywords are displayed in decreasing order of weight from the top. Owing to this, the level of importance of each keyword is reflected in the search results, which facilitates a more broad-ranging search. When the weights are determined according to the number of keywords in the image database 5, the weights change as images are newly registered. It is therefore preferable to reevaluate the weight assigned to each keyword every time an image is registered. Although the weights of the keywords are registered separately from the scores of the associated words, the weights and the scores may be connected (associated) using some sort of calculation technique.
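Reevaluating the weights on every registration can be sketched with a running count of keywords. The weighting formula here (the fraction of registered images that carry the keyword) is purely an assumption; the text states only that the weight depends on the number of keywords in the database, not the direction or form of that dependence.

```python
from collections import Counter


class KeywordWeights:
    """Tracks keyword counts so weights reflect the current database contents."""

    def __init__(self):
        self.counts = Counter()
        self.total_images = 0

    def register(self, keywords):
        """Record the keywords of one newly registered image."""
        self.total_images += 1
        self.counts.update(set(keywords))  # count each keyword once per image

    def weight(self, keyword):
        """Assumed weight: share of registered images that carry the keyword."""
        if self.total_images == 0:
            return 0.0
        return self.counts[keyword] / self.total_images

    def ordered(self, keywords):
        """Sort keywords in decreasing order of weight, ties broken alphabetically."""
        return sorted(keywords, key=lambda k: (-self.weight(k), k))
```

Because the weight is computed from the counts at the time it is requested, no separate recalculation pass is needed after each registration in this sketch.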
Although the present invention is applied to the image management device in the above embodiments, the present invention can be applied to other kinds of devices that deal with images, such as digital cameras, printers, and the like. Moreover, the present invention can be applied to content management devices that deal not only with images but also with other kinds of data such as audio data and the like.
Various changes and modifications are possible in the present invention and may be understood to be within the present invention.
Number | Date | Country | Kind |
---|---|---|---|
2006-351157 | Dec 2006 | JP | national |