Exemplary embodiments of the present invention will be described in detail below with reference to the accompanying drawings. In the description and drawings, components having substantially the same function and configuration are designated by the same reference numerals, and repeated descriptions thereof are omitted.
An image interpretation apparatus and an image interpretation method according to a first embodiment of the invention will be described below.
(Configuration of Image Interpretation Apparatus)
A configuration of the image interpretation apparatus of the first embodiment will be described in detail below with reference to the drawings.
The image interpretation apparatus of the first embodiment includes a registration section 100, an image search section 120, an image interpretation section 140, and a subsequent-stage processing section 160. Although not shown in the drawings, the function of each section described below may be realized by hardware such as a CPU and a storage device included in a computer, by software which uses such hardware, or by a combination thereof.
(Registration Section 100)
The registration section 100 includes a registration image input section 102, a feature (characteristic) extraction section 104, an attribute input section 106, and a registration image information storage section 108. The registration section 100 is used to register image data of an object image which is necessary to interpret an image input by a user, and various pieces of information correlated with the object image.
As used herein, the object image may be, for example, an image for expressing a single object, and specifically an image for expressing a single substance or scene. Obviously the object image may be a single character, a character string, or an image set having a common abstract or conceptual feature. The various pieces of information may be semantics, a shape, a color, a name, and/or other information of the object image to be registered. A user can register various pieces of information according to a utilization mode of the image interpretation apparatus. Therefore, information which is not directly or indirectly related with an object expressed in the object image to be registered may be registered while being intentionally correlated with the object.
The registration image input section 102 is used to input the object image to be registered. The registration image input section 102 may be a keyboard, a mouse, a touch pen, an image scanner, a digital camera, and/or other input units, and further may be an image processing program and/or a drawing program which runs in conjunction with such input sections. The registration image input section 102 may also be an apparatus or a program for automatically or manually downloading the object image from a database server or the like (not shown) connected to a network.
Using an edge filter or the like, the feature extraction section 104 extracts at least one feature (characteristic) from the object image inputted by the registration image input section 102. For example, the feature extraction section 104 may scan brightness or a gradation level of the object image to detect a characteristic highlight portion or an outline portion of the object image. When the feature is detected, the feature extraction section 104 transmits the feature extracted from the object image together with the object image to the registration image information storage section 108, which is described later.
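As an illustration only, the following sketch shows one way such an edge-based feature could be computed from a grayscale image array; the function name, the use of a Sobel gradient, and the orientation-histogram summary are assumptions made for this example and are not required by the apparatus described here.

```python
import numpy as np
from scipy import ndimage

def extract_edge_feature(gray_image: np.ndarray, bins: int = 16) -> np.ndarray:
    """Return a coarse edge-orientation histogram as a feature vector.

    gray_image is a 2-D array of brightness values (for example 0-255).
    """
    gx = ndimage.sobel(gray_image.astype(float), axis=1)  # horizontal gradient
    gy = ndimage.sobel(gray_image.astype(float), axis=0)  # vertical gradient
    magnitude = np.hypot(gx, gy)       # edge strength at every pixel
    orientation = np.arctan2(gy, gx)   # edge direction at every pixel

    # Histogram of edge directions weighted by edge strength: a compact
    # summary of the highlight/outline structure of the object image.
    hist, _ = np.histogram(orientation, bins=bins,
                           range=(-np.pi, np.pi), weights=magnitude)
    total = hist.sum()
    return hist / total if total > 0 else hist
```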
The attribute input section 106 is an input unit for inputting at least one attribute of the object image inputted by the registration image input section 102, the attribute being registered in correlation with the object image. The attribute input section 106 may be a keyboard, a mouse, and/or an information processing program which runs in conjunction with such input units. The attribute input section 106 transmits the inputted attribute information to the registration image information storage section 108. The attribute information may be, for example, semantics, a shape, a color, a name, and/or other information of an object expressed by the object image. The other information may include information which is not directly or indirectly correlated with the object expressed in the object image to be registered. For example, the other information may be the name of a person, numerical information such as an amount of money, and/or a place-name, none of which is related with the object image. Any piece of attribute information can be inputted according to the utilization mode of the image interpretation apparatus. The attribute input section 106 may also be, for example, an apparatus or a program for automatically or manually downloading the attribute information related with the object image from a database server or the like (not shown) connected to a network.
The registration image information storage section 108 includes an object database 110, in which registered are the object image inputted by the registration image input section 102, the feature of the object image extracted by the feature extraction section 104, and the attribute information of the object image inputted by the attribute input section 106. The registration image information storage section 108 can register various pieces of information including the object image, the feature, and the attribute information in the object database 110 in correlation with each other, and can also retrieve other related information using one or more pieces of information included in the various pieces of information as key information. Although the registration image information storage section 108 is described as being included in the registration section 100, it can also be referred to from the image search section 120 described later, and may therefore be regarded as a component of the image search section 120 as well.
(Image Search Section 120)
The image search section 120 includes an image obtaining section 122, a feature extraction section 124, a feature (characteristic) comparison section 126, and a component information storage section 128. As described above, the registration image information storage section 108 may also be regarded as a component of the image search section 120. The image search section 120 searches the input image for the object images registered in the object database 110.
The image obtaining section 122 is an input unit for inputting an image for which the user requires interpretation. The image obtaining section 122 may be, for example, a keyboard, a mouse, a touch pen, an image scanner, a digital camera, and/or other input units, and further may be an image processing program or a drawing program which runs in conjunction with such input units. Hereinafter, the image for which the user requests interpretation is referred to as the “input image”. The input image may be any image including one or more object images, and may be, for example, a photograph, a character, a graphic, a diagram, and the like.
Using an edge filter or the like, the feature extraction section 124 extracts at least one feature from the input image inputted by the image obtaining section 122. For example, the feature extraction section 124 can scan the brightness or gradation level of the input image to detect a characteristic highlight portion or outline portion of the input image. When the feature is detected, the feature extraction section 124 transmits the feature extracted from the input image together with the input image to the feature comparison section 126 which will be described later.
The feature comparison section 126 retrieves, from the object database 110, the object image having at least one feature identical with or similar to that of the input image. As described above, one or more object images and the feature of each object image are registered in the object database 110. The feature comparison section 126 detects, from among the features of the input image, features identical with or similar to those of a registered object image. Further, the feature comparison section 126 transmits detection information obtained by detecting the object image to the component information storage section 128. The detection information may include, for example, a position at which the object image is detected in the input image, a size of the object image, and a matching rate between the detected object image and the registered object image. That is, the feature comparison section 126 is one example of the object image extraction section and the arrangement information obtaining section.
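For illustration, a minimal sketch of such a comparison is shown below; the candidate-region representation, the cosine-similarity matching rate, and the threshold value are assumptions introduced for this example only.

```python
import numpy as np

def matching_rate(f_input: np.ndarray, f_registered: np.ndarray) -> float:
    """Cosine similarity between two feature vectors (0.0 to 1.0 for
    non-negative features)."""
    denom = np.linalg.norm(f_input) * np.linalg.norm(f_registered)
    return float(np.dot(f_input, f_registered) / denom) if denom else 0.0

def detect_objects(candidate_regions, object_database, threshold=0.8):
    """Return detection information for registered object images found in
    the input image.

    candidate_regions: list of dicts {"position": (x, y), "size": (w, h),
                       "feature": np.ndarray} extracted from the input image.
    object_database:   dict mapping object id -> {"feature": np.ndarray, ...}.
    """
    detections = []
    for region in candidate_regions:
        for obj_id, record in object_database.items():
            rate = matching_rate(region["feature"], record["feature"])
            if rate >= threshold:
                detections.append({
                    "object_id": obj_id,             # which registered object matched
                    "position": region["position"],  # where it was detected
                    "size": region["size"],          # how large it appears
                    "matching_rate": rate,           # similarity to the registration
                })
    return detections
```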
The component information storage section 128 includes a component database 130, and registers, in the component database 130, the object images detected by the feature comparison section 126, the attribute information related with each of the object images, and the detection information obtained in the detection. At this time, the component information storage section 128 refers to the object database 110 to retrieve the attribute information for each of the detected object images. The component information storage section 128 registers the object image, the attribute information, the detection information, and/or other information in the component database 130 in correlation with each other such that each piece of information can be used as key information for retrieving the other information.
(Image Interpretation Section 140)
The image interpretation section 140 includes a grammatical rule input section 142, an arrangement rule information storage section 144, and an image information interpretation section 148. On the basis of the information on the input image retrieved by the image search section 120, the image interpretation section 140 interprets the semantics expressed by the input image according to at least one grammatical rule set in advance.
The grammatical rule input section 142 is an input unit for inputting semantic information which is given to morphology of the object image included in the input image. Particularly, because the first embodiment is characterized in that the semantics is given to the arrangement of the object image, the grammatical rule input section 142 is an input unit for inputting semantic information corresponding to the arrangement of the object image in the input image. The grammatical rule input section 142 may be a keyboard and/or a mouse. The grammatical rule input section 142 may also be an apparatus or a program for automatically or manually downloading arrangement information and semantic information related with the arrangement information from a database server or the like (not shown) connected to a network.
The arrangement information may include, for example, a position (vertical or horizontal) of the object image in the input image, a size of the object image (large or small, or an area proportion of the object image to the input image), a rotation angle (gradient), and/or a horizontal to vertical ratio. The semantic information may include, for example, information such as “which piece of attribute information registered in the object database 110 is selected (selection of item)?”, “what is a level of importance?”, and/or “what is a degree of satisfaction?” Thus, at least one rule which defines semantic information according to the morphology of the object image relative to the input image is referred to as “grammatical rule”. When the grammatical rule is inputted, the grammatical rule input section 142 transmits the inputted grammatical rule to the arrangement rule information storage section 144.
The arrangement rule information storage section 144 includes an arrangement rule database 146, and registers the grammatical rule inputted by the grammatical rule input section 142 in the arrangement rule database 146. At this time, the arrangement rule information storage section 144 registers the arrangement information on the object image and the semantic information corresponding to the object image in the arrangement rule database 146 in correlation with each other.
The image information interpretation section 148 may refer to the component database 130 and the arrangement rule database 146 to interpret the semantic information on the input image based on the arrangement information on the object image included in the input image. As described above, the object image included in the input image and the attribute information and the detection information on the object image are registered in the component database 130. On the other hand, the arrangement information on the object image and the semantic information correlated with the arrangement information are registered in the arrangement rule database 146. Therefore, the image information interpretation section 148 collates the detection information on the object image with the arrangement information to retrieve the semantic information corresponding to the object image. The image information interpretation section 148 can retrieve desired information from the attribute information on the object image based on the retrieved semantic information. When plural pieces of arrangement information correspond to the object image, the image information interpretation section 148 retrieves plural pieces of information included in the attribute information, based on the semantic information corresponding to each of the plural pieces of arrangement information, to obtain an interpretation result by a combination of the plural pieces of information. After the interpretation result is obtained, the image information interpretation section 148 transmits the interpretation result to the subsequent-stage processing section 160.
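The following minimal sketch illustrates this collation step under the assumption that the component database and the arrangement rule database are represented as simple dictionaries; the data layout and key names are hypothetical.

```python
def interpret_by_arrangement(detections, object_database, arrangement_rules):
    """Collate detection information with the arrangement rules and retrieve
    the attribute information selected by the resulting semantic information.

    detections:        list of dicts with "object_id" and "arrangement" keys,
                       as registered in the component database.
    arrangement_rules: dict mapping arrangement information (e.g. "upper left
                       region") to semantic information (e.g. "creator").
    """
    interpretation = {}
    for det in detections:
        attributes = object_database[det["object_id"]]["attributes"]
        for arrangement in det["arrangement"]:
            semantic = arrangement_rules.get(arrangement)
            if semantic is None:
                continue
            # The semantic information names the piece of attribute
            # information to be reported for this object image.
            interpretation[semantic] = attributes.get(semantic)
    return interpretation

# Example: an object registered with creator "Tanaka", detected in the
# upper left region, yields {"creator": "Tanaka"}.
db = {"001": {"attributes": {"creator": "Tanaka", "type": "frog"}}}
rules = {"upper left region": "creator"}
dets = [{"object_id": "001", "arrangement": ["upper left region"]}]
print(interpret_by_arrangement(dets, db, rules))  # {'creator': 'Tanaka'}
```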
(Subsequent-Stage Processing Section 160)
The subsequent-stage processing section 160 may be an output unit which outputs the interpretation result outputted by the image information interpretation section 148 and/or a storage unit in which the interpretation result is stored. The output unit may be, for example, a display device and/or an audio output section. The storage unit may be, for example, a magnetic storage device and/or an optical storage device.
Thus, the configuration of the image interpretation apparatus of the first embodiment has been described in detail.
(Image Interpretation Method)
An object image registration procedure, an arrangement rule registration procedure, and an input image interpretation procedure of the image interpretation method of the first embodiment will be described in detail with reference to the drawings.
(Object Image Registration Procedure)
The registration procedure in the image interpretation method of the first embodiment will be described in detail with reference to the drawings.
Through the registration image input section 102, a user inputs an object image produced with a digital camera, an image producing tool, or the like (S102). The object image may be, for example, a photograph, an illustration, a graphic, a logo, and/or a handwritten picture.
When the object image is inputted, the feature extraction section 104 extracts at least one specific feature of the object image using an image processing filter or the like (S104). The features may be, for example, edge intensity and/or an edge position. A wavelet filter, for example, can be used as the image processing filter.
Then, the user inputs attribute information to be related with the object image through the attribute input section 106 (S106). The attribute information may be, for example, the semantics, shape, color, and/or name of the object image.
When the object image and the attribute information thereon are inputted, the object image, the extracted feature, and the inputted attribute information are correlated with each other and registered in the object database 110 included in the registration image information storage section 108 (S108).
Due to the above registration procedure, the user can register the object image desired to be utilized for the image interpretation and the attribute information on the object image in the object database 110, and also can retrieve the information associated with the object image with reference to the object database 110 during the image interpretation.
Here, a specific configuration of the object database 110 will briefly be described.
The object database 110 includes, for example, an ID field, a type field, a creator field, a feature amount field, and an object image field.
The index which is uniquely determined for each object image is described in the ID field. The index is an indicator which is sequentially assigned to the object images at the registration thereof. The type of object specifically indicated by the object image is described in the type field. For example, the name of the object indicated by the object image, as well as other classification types (such as movable estate, real estate, ship, automobile, airplane, animal, plant, amphibian, reptile, primate, Order Primates, Japanese Macaque, Cercopithecidae, Hominidae, and the like), may be described in the type field. The name of the creator of an image to which the object image is added is described in the creator field. In other words, a personal name assigned to each object image is described in the creator field. Data (a feature amount) obtained by digitizing the feature extracted by the feature extraction section 104 is described in the feature amount field. That is, the feature amount is numerical data obtained by quantification so as to specify the object image, which is image data. The inputted object image is attached as image data in the object image field.
For example, referring to the record whose ID field is “001”, a “picture of frog” is registered as the object image. “Frog” is registered in the type field, “Tanaka” is registered in the creator field, and the numerical data “0101001110” is registered in the feature amount field. These pieces of data are correlated with each other, and the user can use one or more pieces of the data as key information for searching the other data. Accordingly, it is possible to find the object image based on the feature amount, specify the creator from the object image, and so on.
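For illustration, the record above could be held in memory as follows; the dictionary layout, field names, and the image file name are assumptions made for this sketch.

```python
# A hypothetical in-memory representation of the object database 110,
# following the ID, type, creator, feature amount, and object image fields
# described above.
object_database = {
    "001": {
        "type": "frog",                  # type field
        "creator": "Tanaka",             # creator field
        "feature_amount": "0101001110",  # digitized feature amount
        "object_image": "frog.png",      # placeholder for the image data
    },
}

# Any field can serve as key information for retrieving the others, e.g.
# finding the creator of the object whose feature amount matches.
def find_by_feature(db, feature_amount):
    return [rec for rec in db.values() if rec["feature_amount"] == feature_amount]

print(find_by_feature(object_database, "0101001110")[0]["creator"])  # Tanaka
```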
(Arrangement Rule Registration Procedure)
Next, the arrangement rule registration procedure will be specifically described.
First, the arrangement rule itself will be described with a few examples.
The drawings referenced here (reference numerals 182, 184, 192, and 194) show examples in which the same object image is placed in the input image with different arrangements, such as different positions or sizes.
As described above, the arrangement rule indicates the relative relationship between the input image and the object image. The arrangement information is information which includes the positional information, size information, rotation information and the like of the object image with respect to the input image. In other words, the arrangement information is classification information which can clearly define the relative relationship between the input image and the object image.
Next, a data structure of the arrangement rule database 146 and the grammatical rule correlated with each piece of the arrangement information will be described in detail.
The arrangement rule database 146 registers each piece of arrangement information in correlation with a corresponding grammatical rule.
The row in which the arrangement information is “upper left region” and the grammatical rule is “creator” will be specifically described by way of example. The description of the row means that the grammatical rule of “creator” is applied when the object image is positioned in the “upper left region”. That is, when the image information interpretation section 148 refers to the component database 130 and recognizes that a certain object image is positioned in the upper left region of the input image, the image information interpretation section 148 obtains, as the key information, the grammatical rule of “creator” corresponding to the arrangement information of “upper left region” in the arrangement rule database 146. Note that only a conceptual description using key words is shown here.
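A minimal sketch of how a detected position might be mapped to such arrangement information and then to a grammatical rule is shown below; the simple 2x2 division of the input image into regions is an assumption for this example, and actual coordinate ranges could be defined differently.

```python
def classify_region(position, image_size):
    """Map a detected position (x, y) to coarse arrangement information
    such as "upper left region", using a simple 2x2 split of the page."""
    x, y = position
    width, height = image_size
    vertical = "upper" if y < height / 2 else "lower"
    horizontal = "left" if x < width / 2 else "right"
    return f"{vertical} {horizontal} region"

# Conceptual contents of the arrangement rule database 146.
arrangement_rules = {
    "upper left region": "creator",
    # further rows would correlate other regions or sizes with other rules
}

region = classify_region(position=(120, 80), image_size=(1000, 1400))
print(region, "->", arrangement_rules.get(region))
# upper left region -> creator
```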
Thus, even with the same object image, various semantics can be given according to the position, the size, and the like thereof. These various semantics enable a wider range of input image interpretation processes and of subsequent-stage processes performed after the input image interpretation process. The input image interpretation procedure will be described in detail below.
(Input Image Interpretation Procedure)
The interpretation procedure in the image interpretation method of the first embodiment will be described in detail with reference to the drawings.
A user inputs an image desired to be interpreted (hereinafter referred to as the input image) to the image interpretation apparatus through the image obtaining section 122 (S112). The input image is transmitted from the image obtaining section 122 to the feature extraction section 124, and the feature extraction section 124 extracts at least one feature (S114). Information on the feature is transmitted to the feature comparison section 126 and compared with the features of the object images registered in the object database 110. Thus the feature comparison section 126 detects the object images included in the input image (S116). At this time, the feature comparison section 126 detects the arrangement information such as the position, size, and degree of coincidence of each object image. Further, the feature comparison section 126 refers to the object database 110 to retrieve the attribute information and the like related with each detected object image, and transmits the attribute information, the arrangement information, and the like to the component information storage section 128. The component information storage section 128 registers the received attribute information, arrangement information, and the like in the component database 130 (S118 and S120).
When the registration of various pieces of information in the component database 130 is completed, the image information interpretation section 148 interprets the semantics of the input image based on the arrangement information and the like of the detected object image by referring to the arrangement rule database 146 and the component database 130 (S122). At this time, the image information interpretation section 148 collates the arrangement information registered in the component database 130 with the at least one grammatical rule registered in the arrangement rule database 146, and obtains the semantic information corresponding to the arrangement information. Thus the image information interpretation section 148 can retrieve the information registered in the component database 130 based on the semantic information. As a result, the object image included in the input image and the arrangement information of the object image constitute a relationship analogous to that between a phrase and grammar in a linguistic expression.
When the semantics of the input image is interpreted, the image information interpretation section 148 outputs the interpretation result through the subsequent-stage processing section 160 (S124). For example, the interpretation result may be displayed on a display unit such as a display device, or may be outputted to a print medium via a print unit such as a printer. The interpretation result may also be stored as electronic data in a magnetic storage medium and the like.
Here, the interpretation procedure will be further described with reference to a specific example in which a business trip report 202 is the input image.
The image interpretation apparatus obtains the business trip report 202 which is the input image through the image obtaining section 122, and transmits the business trip report 202 to the feature extraction section 124. The feature extraction section 124 extracts a feature from the image of the obtained business trip report 202, and transmits the feature amount obtained by digitizing the feature to the feature comparison section 126. The feature comparison section 126 compares the feature amount registered in the object database 110 with the transmitted feature amount and recognizes that the object image of the type of “frog” is included in the business trip report 202. Further, the feature comparison section 126 detects the arrangement information indicating the position, size, and gradient of the “frog” type object image. Then, the feature comparison section 126 transmits the “frog” type object image and the detected arrangement information to the component information storage section 128. The component information storage section 128 registers, in the component database 130, the “frog” type object image transmitted from the feature comparison section 126, the detected arrangement information, and the attribute information retrieved from the object database 110 based on these pieces of information.
At this time, in the component database 130, at least the type “frog” and the creator of “Tanaka” are registered as the attribute information on the object image included in the business trip report 202, and at least the arrangement of “upper left region” and the size of “normal” are registered as the arrangement information.
When the process of registering information in the component database 130 is completed, the image information interpretation section 148 retrieves the grammatical rule from the arrangement rule database 146 described above.
The image information interpretation section 148 interprets that the creator is “Tanaka” based on the grammatical rule of “creator” by referring to the component database 130. The image information interpretation section 148 further interprets that the “level of importance” of the business trip report 202 is “middle” because the “size” registered as the arrangement information is “normal”. As a result, on the basis of the arrangement information on the object image, the image information interpretation section 148 can interpret the creator of the business trip report 202 as “Tanaka” and the level of importance of the business trip report 202 as “middle”. The interpretation result 204 is transmitted to the subsequent-stage processing section 160 and outputted to a display or the like.
Thus, the image interpretation apparatus and image interpretation method of the first embodiment have been described. According to the first embodiment, even if a single input image includes only a single object image, different semantics can be expressed by giving the semantics to the arrangement of the object image, so that the image interpretation can be performed over a wider range. Further, the subsequent-stage process can be changed according to the result of the image interpretation.
An image interpretation apparatus and an image interpretation method according to a second embodiment of the invention will be described below. Here, the same components as those of the first embodiment are designated by the same numerals and the descriptions thereof are omitted, and only the differences are described in detail.
(Configuration of Image Interpretation Apparatus)
A configuration of the image interpretation apparatus of the second embodiment will be described below with reference to the drawings.
(Image Interpretation Section 140)
The image interpretation section 140 of the second embodiment includes a grammatical rule input section 142, a combination rule information storage section 212, and an image information interpretation section 148.
The grammatical rule input section 142 is an input unit for receiving semantic information to be given to the morphology of an object image included in an input image. Particularly, because the second embodiment is characterized in that the semantics is given to a combination of the object images, the grammatical rule input section 142 is an input unit for inputting semantic information corresponding to the combination of the object images in the input image. The grammatical rule input section 142 may be composed of, for example, a keyboard and a mouse, or may be an apparatus or a program for automatically or manually downloading combination information and semantic information correlated therewith from a database server (not shown) connected to a network.
The combination information is information which indicates a relative positional relationship of plural object images included in the input image. The combination information may include “vertical positional relationship information” indicating whether an object image is positioned in a relatively upper or lower position in the input image, “overlap information” indicating whether or not the plural object images overlap each other, “foreground/background information” indicating whether an overlapping object image is in the foreground or the background, and “magnitude relation information” indicating a relative magnitude relation between the object images. Thus, a rule which defines the semantic information according to the relative morphology of the plural object images is referred to as a grammatical rule. When the grammatical rule is inputted, the grammatical rule input section 142 transmits the inputted grammatical rule to the combination rule information storage section 212.
The combination information will specifically be described below.
The drawings referenced here (reference numerals 232, 242, 252, and 254) show examples of combinations of object images, such as differences in overlap, vertical positional relationship, foreground/background relationship, and relative size.
As described above, the image interpretation apparatus and image interpretation method of the second embodiment are configured such that the subsequent-stage process can be changed based on the combination information indicating the correlation of the plural object images included in the input image. The combination information may be detected by the feature comparison section 126 included in the image search section 120 and registered in the component database 130. That is, the feature comparison section 126 is one example of the object image extraction section and the combination information obtaining section.
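As a sketch only, the overlap information, the vertical positional relationship information, and the magnitude relation information could be derived from the bounding boxes of two detected object images roughly as follows; the box representation is an assumption, and foreground/background information would require additional data such as drawing order, so it is omitted here.

```python
def boxes_overlap(box_a, box_b):
    """True when two axis-aligned bounding boxes (x, y, w, h) overlap."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def combination_information(box_a, box_b):
    """Derive simple combination information for two detected object images.

    Foreground/background information cannot be derived from the boxes alone
    (it would come from drawing order or occlusion analysis), so only the
    other three kinds of combination information are computed here.
    """
    center_a_y = box_a[1] + box_a[3] / 2
    center_b_y = box_b[1] + box_b[3] / 2
    return {
        "overlap": boxes_overlap(box_a, box_b),            # overlap information
        "upper": "a" if center_a_y < center_b_y else "b",  # vertical relation
        "larger": "a" if box_a[2] * box_a[3] > box_b[2] * box_b[3] else "b",
    }
```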
The combination rule information storage section 212 includes a combination rule database 214, and registers the grammatical rule inputted by the grammatical rule input section 142 in the combination rule database 214 in correlation with the corresponding combination information.
Here, a data configuration of the combination rule database 214 will specifically be described.
In the combination rule database 214, each piece of combination information is registered in correlation with a grammatical rule. For example, overlap information indicating that object images overlap each other may be correlated with a grammatical rule that groups those object images, vertical positional relationship information may be correlated with a grammatical rule in which the upper object image serves as a modifier and the lower object image serves as a noun, and foreground/background information may be correlated with a grammatical rule in which the background object image specifies an item name of the foreground object image.
(Image Interpretation Method)
Next, the input image interpretation method performed by the image information interpretation section 148 will be specifically described with reference to several examples.
In the input image 262, an object image showing the type of “summer” and an object image showing the type of “butterfly” are drawn so as to overlap each other, with the “summer” object image positioned in the upper region and the “butterfly” object image positioned in the lower region.
The image information interpretation section 148 recognizes, by referring to the component database 130, the overlap information indicating that “overlap exists” in the overlap relationship of the object images, and recognizes the vertical positional relationship information indicating that the object image of “summer” is positioned in the upper region while the object image of “butterfly” is positioned in the lower region. Then, the image information interpretation section 148 refers to the combination rule database 214 and recognizes that both of the object images are grouped based on the overlap information (corresponding to “overlap relationship (1)”). Similarly, on the basis of the vertical positional relationship information, the image information interpretation section 148 recognizes a language formation in which “summer” is a modifier and “butterfly” is a noun. As a result, the image information interpretation section 148 can interpret the input image 262 as “butterfly of summer”. The interpretation result is transmitted to the subsequent-stage processing section 160 and outputted to a display or the like.
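The interpretation just described could be sketched in code roughly as follows; the rule that grouped (overlapping) images combine as modifier and noun is taken from this example, and the phrase template and data layout are assumptions.

```python
def apply_combination_rules(obj_a, obj_b, info):
    """Interpret two detected object images according to the combination
    rules used in this example.

    obj_a, obj_b: dicts with a "type" entry, e.g. {"type": "summer"}.
    info:         combination information with "overlap" and "upper" entries.
    """
    if not info["overlap"]:
        # Non-overlapping object images are not grouped and are kept apart.
        return [obj_a["type"], obj_b["type"]]
    # Grouped images: the upper one acts as a modifier of the lower one.
    modifier, noun = (obj_a, obj_b) if info["upper"] == "a" else (obj_b, obj_a)
    return [f"{noun['type']} of {modifier['type']}"]

print(apply_combination_rules({"type": "summer"}, {"type": "butterfly"},
                              {"overlap": True, "upper": "a"}))
# ['butterfly of summer']
```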
In the input image 272, an object image showing the type of “ABC electric” and an object image showing the type of “address” are drawn so as to overlap each other, with the “ABC electric” object image positioned in the foreground and the “address” object image positioned in the background.
The image information interpretation section 148 recognizes, by referring to the component database 130, the overlap information indicating that “overlap exists” in the overlap relationship of the object images, and recognizes the foreground/background information indicating that the object image of “ABC electric” is positioned in the foreground while the object image of “address” is positioned in the background. Then, the image information interpretation section 148 refers to the combination rule database 214 and recognizes that both of the object images are grouped based on the overlap information (corresponding to “overlap relationship (1)”). Similarly, on the basis of the foreground/background information, the image information interpretation section 148 recognizes a search condition that “address” is the item name. As a result, the image information interpretation section 148 can interpret the input image 272 as “New York” described in the item “address” of the object image of “ABC electric”. The interpretation result is transmitted to the subsequent-stage processing section 160 and outputted to a display or the like.
In the input image 282, an object image showing the type of “ABC electric” and an object image showing the type of “butterfly” are each drawn so as to overlap a separate object image showing the type of “address”.
The image information interpretation section 148 obtains from the component database 130 the overlap information indicating that “overlap exists” in the overlap relationship of the object image showing the type of “address” and the object image showing the type of “ABC electric”, and recognizes these object images as a group image (1). The image information interpretation section 148 further obtains from the component database 130 the overlap information indicating that “overlap exists” in the overlap relationship of the object image showing the type of “address” and the object image showing the type of “butterfly”, and recognizes the object images as a group image (2). At the same time, the image information interpretation section 148 obtains the overlap information indicating that “overlap does not exist” in the overlap relationship between the group image (1) and the group image (2). Further, the image information interpretation section 148 obtains the foreground/background information indicating that the object image of “ABC electric” and the object image of “butterfly” are positioned in the foreground while each of the object images of “address” is positioned in the background.
From these pieces of information, the image information interpretation section 148 interprets the group (1) as “New York” and the group (2) as “Appalachia”. The image information interpretation section 148 can interpret the input image 282 as “New York and Appalachia” based on the recognition that the group (1) and the group (2) are not grouped. The interpretation result is transmitted to the subsequent-stage processing section 160 and outputted to a display or the like. Thus, the grammatical rule can also be applied to the object image group which is formed by grouping plural object images.
Thus, the second embodiment of the invention is described in detail. According to the second embodiment, the semantics can be given to the combination of the plural object images included in the input image, so that a number of pieces of semantic information corresponding to the number of combinations of the registered object images can be expressed by a single input image. Accordingly, interpretation results having more variations can be obtained in the second embodiment compared with the first embodiment and with a general image interpretation apparatus. Additionally, the second embodiment can perform the subsequent-stage process based on this interpretation result.
An image interpretation apparatus and an image interpretation method according to a third embodiment of the invention will be described below. Here, the same components as those of the first and second embodiments are designated by the same numerals and the descriptions thereof are omitted, and only the differences are described in detail.
(Configuration of Image Interpretation Apparatus)
A configuration of the image interpretation apparatus of the third embodiment will be described below with reference to the drawings.
The image interpretation apparatus of the third embodiment includes both the arrangement rule database 146 described in the first embodiment and the combination rule database 214 described in the second embodiment.
As in the first embodiment, the at least one grammatical rule correlated with the arrangement information is registered in the arrangement rule database 146. For example, arrangement information of “upper left region” may be correlated with a grammatical rule of “destination”, and arrangement information of “upper right region” may be correlated with a grammatical rule of “sender”.
As in the second embodiment, the at least one grammatical rule correlated with combination information is registered in the combination rule database 214. For example, overlap information may be correlated with a grammatical rule that groups the overlapping object images, and foreground/background information may be correlated with a grammatical rule in which the background object image specifies an item name of the foreground object image.
(Image Interpretation Method)
A method of interpreting an input image 292 will be described with reference to a specific example.
In the input image 292, an object image showing the type of “ABC electric” and an object image showing the type of “butterfly” are each drawn so as to overlap a separate object image showing the type of “address”; the pair including “ABC electric” is positioned in the upper left region of the input image, and the pair including “butterfly” is positioned in the upper right region.
The image information interpretation section 148 first obtains, from the component database 130, the overlap information indicating that “overlap exists” in the overlap relationship between the object image showing the type of “address” and the object image showing the type of “ABC electric”, and recognizes these object images as a group image (1). The image information interpretation section 148 further obtains, from the component database 130, the overlap information indicating that “overlap exists” in the overlap relationship between the object image showing the type of “address” and the object image showing the type of “butterfly”, and recognizes these object images as a group image (2). At the same time, the image information interpretation section 148 obtains the overlap information indicating that “overlap does not exist” in the overlap relationship between the group image (1) and the group image (2). The image information interpretation section 148 further obtains the foreground/background information indicating that the object image of “ABC electric” and the object image of “butterfly” are positioned in the foreground while each of the object images of “address” is positioned in the background. The image information interpretation section 148 obtains the arrangement information indicating that the group image (1) is positioned in the upper left region of the input image 292 while the group image (2) is positioned in the upper right region.
The image information interpretation section 148 refers to the combination rule database 214, interprets the group (1) as “New York” and interprets the group (2) as “Appalachia”. The image information interpretation section 148 further interprets the group (1) and the group (2) as not grouped. The image information interpretation section 148 refers to the arrangement rule database 146 and interprets “New York”, which is the semantics of the group image (1), and “Appalachia”, which is the semantics of the group image (2), as “destination” and “sender”, respectively. As a result, the image information interpretation section 148 interprets the input image 292 as “from Appalachia to New York”. The interpretation result is transmitted to the subsequent-stage processing section 160 and outputted to a display or the like.
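A hedged sketch of this combined use of the two rule databases is shown below; the rule table, the group representation, and the output phrase template are illustrative assumptions based on the example just described.

```python
# Conceptual contents of the arrangement rule database 146 in this example.
arrangement_rules = {
    "upper left region": "destination",
    "upper right region": "sender",
}

def combine_group_and_arrangement(groups, rules):
    """groups: list of dicts {"meaning": ..., "region": ...} for group images
    that are not themselves grouped with one another."""
    roles = {}
    for group in groups:
        role = rules.get(group["region"])   # e.g. "destination" or "sender"
        if role is not None:
            roles[role] = group["meaning"]
    return f"from {roles['sender']} to {roles['destination']}"

groups = [
    {"meaning": "New York", "region": "upper left region"},     # group image (1)
    {"meaning": "Appalachia", "region": "upper right region"},  # group image (2)
]
print(combine_group_and_arrangement(groups, arrangement_rules))
# from Appalachia to New York
```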
Thus, the third embodiment of the invention has been described above. According to the third embodiment, the semantics can be given according to the positional information and the combination information on the object images, and the third embodiment can support subsequent-stage processes having more variations compared with the first and second embodiments.
An image interpretation apparatus and an image interpretation method according to a fourth embodiment of the invention will be described below. Here, the same components as those of the first, second, and third embodiments are designated by the same numerals and the descriptions thereof are omitted, and only the differences are described in detail.
(Configuration of Image Interpretation Apparatus)
A configuration of the image interpretation apparatus of the fourth embodiment will be described below with reference to the drawings.
(Image Interpretation Section 140)
The image interpretation section 140 of the fourth embodiment includes a grammatical rule input section 142, a missing rule information storage section 302, and an image information interpretation section 148.
The grammatical rule input section 142 is an input unit which receives the semantic information to be given to the morphology of the object image included in the input image. Particularly, the fourth embodiment is characterized in that the semantics is given to missing information on the object image. Therefore, the grammatical rule input section 142 is an input unit for inputting the semantic information corresponding to the missing information on the object image in the input image. The grammatical rule input section 142 may be composed of, for example, a keyboard, a mouse and the like.
The missing information includes, for example, missing area information indicating a missing area where a part of the object image is blacked out, missing area information indicating the percentage of the missing area relative to the area of the object image, missing area information indicating a missing area which is painted in a color other than black, and missing area information indicating an area where a part or the whole of the object image is simply and distinguishably partitioned by other colors. The missing information may also be missing positional information indicating a position of a missing region in the input image.
The missing information will specifically be described below.
The drawings referenced here (reference numerals 322 and 324) show examples in which a part of the object image is hidden by a missing region.
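For illustration, the percentage of the missing area could be computed from two pixel masks as in the following sketch; the mask representation is an assumption for this example.

```python
import numpy as np

def missing_fraction(object_mask: np.ndarray, missing_mask: np.ndarray) -> float:
    """Fraction of the object image hidden by the missing (e.g. blacked-out)
    region.

    object_mask:  boolean array marking pixels belonging to the object image.
    missing_mask: boolean array marking pixels of the missing region.
    Both masks share the coordinate system of the input image.
    """
    object_area = object_mask.sum()
    if object_area == 0:
        return 0.0
    hidden = np.logical_and(object_mask, missing_mask).sum()
    return float(hidden) / float(object_area)
```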
(Missing Rule Database 304)
A configuration of the missing rule database 304 included in the missing rule information storage section 302 will be described below.
In the missing rule database 304, each piece of missing information is registered in correlation with semantic information. For example, missing area information indicating the percentage of the missing area may be correlated with a grammatical rule of “loss amount”.
(Image Information Interpretation Section 148)
The image information interpretation section 148 refers to the component database 130 in which the object image extracted from the input image, the attribute information, the arrangement information and the like are registered, and further refers to the missing rule database 304 to interpret the semantics of the input image.
(Image Interpretation Method)
The method of interpreting an input image 332 will be described with reference to a specific example.
Referring to the input image 332, the object image showing the type of “money” is drawn, and a part of the lower left region of the object image is hidden behind a black rectangular region. The area of the hidden black region, which is the missing region, accounts for a quarter of the object image.
The image information interpretation section 148 refers to the component database 130 and recognizes that the object image showing the type of “money” is included in the input image 332 and the object image has the semantics of the amount of “one million yen”. On the basis of the arrangement information registered in the component database 130, the image information interpretation section 148 recognizes that the missing amount of the object image is a quarter. Then, the image information interpretation section 148 refers to the missing rule database 304 and recognizes that the missing amount has the semantics of “loss amount”, and interprets the “loss amount” of the amount of “one million yen” of the object image as two hundred and fifty thousand yen. As a result, the image information interpretation section 148 interprets the semantics of the input image 332 as “seven hundred and fifty thousand yen” (the quarter (two hundred and fifty thousand yen) of one million yen is lost). The interpretation result is transmitted to the subsequent-stage processing section 160 and outputted to a display or the like.
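The arithmetic of this example can be written out as a small sketch; the function name and the numeric representation of the amount are assumptions.

```python
def interpret_money_with_loss(amount_yen: int, missing: float) -> int:
    """Apply the "loss amount" grammatical rule: the missing fraction of the
    object image is subtracted from the amount the object image expresses."""
    return int(amount_yen * (1 - missing))

# One million yen with a quarter of the object image missing.
print(interpret_money_with_loss(1_000_000, 0.25))  # 750000
```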
Thus, the image interpretation apparatus and image interpretation method of the fourth embodiment are described. According to the fourth embodiment, the semantics can be given to the missing state (hidden state) of the object image, and various pieces of semantics can be given to the input image by performing a simple operation of painting out the object image.
Although the exemplary embodiments of the present invention are described above with reference to the accompanying drawings, obviously the invention is not limited to the above embodiments. It should be understood that various changes and modifications can be made by a person skilled in the art without departing from the scope of the invention, and these changes and modifications are of course included in the scope of the invention.
For example, in the above embodiments, the feature extraction section 104 included in the registration section 100 and the feature extraction section 124 included in the image search section 120 are described as being implemented in the same device. However, they may be separate sections having different functions and configurations in order to detect, from the input image, an object image arranged in a size and/or position different from those of the registered object image.
Further, the above embodiments are described as being directed to digital-format contents. However, the invention is not limited to digital-format contents, and can also be applied to analog-format contents (such as a picture drawn on paper or a whiteboard, and/or a photograph).
The present application claims priority from Japanese Patent Application No. 2006-221215, filed in August 2006.