1. Field of the Invention
The present invention relates to a technique for searching for desired information among information stored in a storage medium.
2. Description of the Related Art
In recent years, along with the popularization of digital cameras and camera-equipped cellular phones, and further with the use of large-capacity memory cards, users tend to store captured images in the memory cards and to select and reproduce a desired image whenever they want. However, it is not easy to search for a desired image from among many images.
Conventionally, there have been methods for searching for an image based on metadata that is added to the image. Images captured by digital cameras, for example, include metadata in the exchangeable image file format (Exif). Thus, numerical information such as the shooting time and date, as well as character string information such as scene information, is added to the images.
The metadata may be manually added by the user or automatically added by a system. Japanese Patent Application Laid-Open No. 2006-166193 discusses a technique by which, if the user designates the shooting time and date of the starting point and the end point of a search range, images with time and date information that falls within the search range are retrieved.
However, if the user does not remember the shooting time and date, it is difficult to search for the desired image efficiently.
As another method, an image can be searched for by the user designating information that relates to the scene of the image. In this case, however, the images that can be found are limited to those having the scene-related information designated by the user.
The present invention is directed to an information search apparatus and method for efficiently searching for images based on numerical information and character string information included in metadata that is associated with the information (files) of the images.
According to an aspect of the present invention, an information search apparatus configured to search a plurality of files including numerical information includes a processor, wherein the processor performs inputting a first numerical value and a keyword as a query used for determining a range, determining a unit of the first numerical value, acquiring a second numerical value of the unit that corresponds to the keyword, and searching the plurality of files and outputting a file included in the range determined based on the first and second numerical values.
Further features and aspects of the present invention will become apparent from the following detailed description of exemplary embodiments with reference to the attached drawings.
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features, and aspects of the invention and, together with the description, serve to explain the principles of the invention.
Various exemplary embodiments, features, and aspects of the invention will be described in detail below with reference to the drawings.
The above-described information search apparatus includes an information database 101, a query input unit 102, a semantic information extraction unit 103, a first information search unit 104, a time range determination unit 105, a second information search unit 106, and a search result output unit 107.
Although the information database 101 is included in the information search apparatus according to the present embodiment, the information database 101 can be arranged outside of the information search apparatus and connected to it via a network.
Further, metadata describing the time and date, scene, creator, and creation conditions is associated with each file. A case where a plurality of such files are searched will be described according to the present embodiment.
The query input unit 102, the semantic information extraction unit 103, the first information search unit 104, the time range determination unit 105, the second information search unit 106, and the search result output unit 107 are modules used for searching the files. The functions of these modules are realized by a central processing unit (CPU) loading a program stored in a read-only memory (ROM) into a random access memory (RAM) and executing the program.
The query input unit 102 is used for inputting a query that is used for searching for the information (file). The query is a request for processing that is performed when information (a file) satisfying the designated condition is searched for in the information database, and is expressed as data in which a plurality of words are connected.
The semantic information extraction unit 103 acquires semantic information such as a keyword used for determining time information and the information (file) based on the query. The time information is information used for designating time and date and includes numerical information and time unit information. The keyword is, for example, a character string that corresponds to the metadata that is associated with the information (file).
The metadata may be included in a table that is associated with IDs that represent the information (file) and may also be information that is added to the information (file) like the known Exif. The Exif information includes information that is automatically added when an image is generated or information that the user can manually and arbitrarily add to the image. Information indicating time and date, scene, and image capture conditions can be included in the Exif information.
The first information search unit 104 searches the information database 101 for the information (file) that is associated with the metadata that corresponds to the extracted keyword. Further, the first information search unit 104 acquires metadata that describes the time and date and the scene that are associated with the searched information (file).
The time range determination unit 105 determines a time range as a search range based on the time information extracted by the semantic information extraction unit 103 and the metadata that describes the time and date searched by the first information search unit 104.
Based on the time range determined by the time range determination unit 105, the second information search unit 106 searches the information database 101 for the information (file) with which the metadata that describes the time and date that corresponds to the determined time range is associated.
The search result output unit 107 outputs information regarding the information (files) found by the second information search unit 106 as a search result.
In step S201, the query input unit 102 accepts a query as an input. Although the query can take various forms, such as text or voice, a query in text form is described in the present embodiment.
In step S202, the semantic information extraction unit 103 extracts semantic information from the query. In step S203, the first information search unit 104 searches the information using a keyword included in the semantic information.
In step S204, metadata describing the time and date that is associated with the information (file), which is searched by the first information search unit 104, is acquired, and the acquired time and date information is output to the time range determination unit 105.
In step S205, the time range determination unit 105 determines the time range based on the time information extracted by the semantic information extraction unit 103 and the metadata that describes time and date associated with the information (file) searched by the first information search unit 104.
The time information extracted by the semantic information extraction unit 103 includes time unit information (e.g., year, month, day, hour, minute, and second). Further, the time information includes numerical information (first numerical information) of the designated time (e.g., 1 to 12 if the time unit is “month” and 0 to 59 if the time unit is “minute” or “second”).
Based on this time unit information, the granularity used for determining the time range from the time and date information associated with the information (files) found by the first information search unit 104 is determined. When the granularity and, further, the time range are determined, in step S206, the second information search unit 106 searches the information database 101 for the information (files) that correspond to the time range.
In step S207, the search result output unit 107 outputs information of the searched information (file) as a search result.
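The flow of steps S201 through S207 can be sketched as follows. This is an illustrative Python sketch, not the claimed implementation: the information database is reduced to a plain list of dictionaries, the semantic information extraction is reduced to a regular expression over queries of the form "from &lt;month&gt; to &lt;keyword&gt;", and all names (`DATABASE`, `search`) are hypothetical.

```python
import re

# Toy "information database": each file carries metadata describing
# a shooting month and a scene (cf. the Exif-style metadata in the text).
DATABASE = [
    {"name": "IMG_001.jpg", "month": 8,  "scene": "birthday"},
    {"name": "IMG_002.jpg", "month": 9,  "scene": "athletic meet"},
    {"name": "IMG_003.jpg", "month": 10, "scene": "athletic meet"},
    {"name": "IMG_004.jpg", "month": 12, "scene": "christmas"},
]

def search(query):
    # S202: extract semantic information -- here a pattern like
    # "from August to <keyword>", where the month name supplies the
    # first numerical value and the time unit "month".
    months = {"january": 1, "february": 2, "march": 3, "april": 4,
              "may": 5, "june": 6, "july": 7, "august": 8,
              "september": 9, "october": 10, "november": 11,
              "december": 12}
    m = re.match(r"from (\w+) to (.+)", query.lower())
    first_value = months[m.group(1)]   # first numerical value
    keyword = m.group(2)               # keyword, e.g. "athletic meet"

    # S203-S204: first search -- find files whose scene metadata matches
    # the keyword, and collect their month values (time and date metadata).
    hits = [f["month"] for f in DATABASE if f["scene"] == keyword]

    # S205: determine the time range so that every first-search hit is
    # included (the second numerical value is the farthest hit month).
    second_value = max(hits) if max(hits) >= first_value else min(hits)
    lo, hi = sorted((first_value, second_value))

    # S206-S207: second search over the determined range, output result.
    return [f["name"] for f in DATABASE if lo <= f["month"] <= hi]

print(search("from August to athletic meet"))
# ['IMG_001.jpg', 'IMG_002.jpg', 'IMG_003.jpg']
```

Note that the birthday image is found even though its scene metadata does not match the keyword: it falls inside the time range that the keyword helped determine, which is the point of the two-stage search.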
The query input in text form is divided into words by morphological analysis.
The semantic information extraction unit 103 extracts semantic information that corresponds to each word from each of the words. Semantic information that corresponds to each word is included in the word dictionary used for the morphological analysis. By reading out the word dictionary, the semantic information corresponding to each word can be extracted.
A keyword 304 is included in semantic information 303. The first information search unit 104 searches the information database 101 for information (a file) that is associated with metadata describing a scene that corresponds to the keyword (the character string “athletic meet” in this example).
Time unit information 305 is included in the semantic information. The time unit of the time unit information is, for example, year, month, day, hour, minute, or second. The time unit information 305 is used for determining the granularity that is used out of the metadata describing the time and date associated with the information (files) found by the first information search unit 104. The granularity is a unit that is used when the data is segmented, and the time granularity includes year, month, day, and hour.
Information (files) 401 stored in the information database 101 are each associated with metadata 402 that describes the time and date.
Metadata 403 describes a scene associated with the searched information (file) 401. If information (files) having metadata describing a scene that corresponds to the keyword (“athletic meet”) is searched for in this state, two pieces of information (files) are found.
The first information search unit 104 extracts the metadata 402 describing the time and date associated with the searched information (file) 401 and outputs the extracted metadata 402 to the time range determination unit 105.
The time range determination unit 105 acquires the semantic information 303 from the semantic information extraction unit 103 and acquires the metadata 402 describing the time and date that is associated with the information (file) that is searched by the first information search unit 104.
Then, the time range determination unit 105 sets the time information determined based on the metadata 402 that describes the time and date in the portion of the keyword 304 included in the semantic information 303. Then, the time range determination unit 105 determines the range of the time information.
The unit that is set for the keyword 304 is determined based on the time unit information 305 that is included in the semantic information 303.
Since the time unit information 305 indicates “month” in this example, the numerical values “10” and “9” that correspond to “month” are extracted from the metadata 402 describing the time and date.
Next, by using the extracted numerical information (the second numerical information) “10” and “9”, the time range that includes all the information (files) found by the first information search unit 104 is determined.
At this time, the numerical information (the second numerical information) is determined to be either “10” or “9” so that both of the two pieces of information (files) found by the first information search unit 104 are included in the range. Thus, in this case, “10” is employed as the numerical information (the second numerical information), and the time range will be “from August to October”.
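The choice of “10” over “9” as the second numerical information can be expressed as a small helper. This is a sketch under the assumption that all first-search hits lie on the same side of the first value; the function name `second_value` is hypothetical.

```python
def second_value(first, hit_months):
    """Pick the hit month that, paired with `first`, spans all hits."""
    lo, hi = min(hit_months), max(hit_months)
    # If the hits lie above the first value, the far boundary is the
    # largest hit; if below, the smallest. Either way, every hit month
    # falls inside the resulting range.
    return hi if hi >= first else lo

# Files tagged "athletic meet" were shot in October and September;
# the query supplied "August" (month 8) as the first value.
print(second_value(8, [10, 9]))   # -> 10, so the range is August to October
```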
The time range 501 determined in this way is “month: 8 to 10” (from August to October).
At this time, unit information such as year, day, and hour can be additionally set based on the metadata describing the time and date associated with the searched information (file), and a predetermined time range such as the current year can be set as the search target.
By performing the setting as described above, not all of the files stored in the information database 101 but only the information (files) associated with metadata describing a time and date in the current year can be set as the search target.
Next, the determined time range is output to the second information search unit 106. The second information search unit 106 searches the information database 101 for the information (file) that is associated with the metadata describing the time and date that satisfies the condition, based on the information corresponding to the time range output from the time range determination unit 105.
Thus, if the information (files) is searched for using the time range “month: 8 to 10”, the files that are associated with metadata describing a time and date corresponding to August to October become the search target.
Further, if a query such as “from athletic meet to November 3rd” is input, the time unit information 305 (“month” and “day” in this case) is obtained from the words “November” and “3rd”. Thus, by using the values that correspond to “month” and “day” out of the metadata 402 describing the time and date, a time range of “month/day: 9/28 to 11/3” is set.
In this case, the files that are associated with metadata describing a time and date corresponding to September 28 to November 3 will be the target of the search. Further, if a query such as “from 7 o'clock to athletic meet” is input, the time unit information 305 (“hour” in this case) is acquired from the expression “7 o'clock”.
Thus, the range is set as “hour: 7 to 13” by using the value that corresponds to “hour” out of the metadata 402 that describes the time and date. In this case, the files whose metadata describes a time and date that corresponds to the time “from 7 o'clock to 13 o'clock” will be the search target.
In other words, even if the same keyword (“athletic meet”) is used, a time range of different granularity is set depending on the time unit information included in the query. Further, a word that holds time unit information as semantic information need not directly indicate a time such as “7 o'clock” or “August”.
For example, semantic information of “hour=6 to 10” is set in advance for the word “morning”. Then, if a query such as “from morning to athletic meet” is input, a time range of “hour: 6 to 13” is determined.
In this case, the information (files) whose metadata 402 describes a time and date that corresponds to 6 o'clock to 13 o'clock will be the search target. In this way, a file whose metadata corresponds to the keyword is searched for based on the keyword that is included in the query.
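The handling of a word such as “morning” can be sketched as follows. The `WORD_SEMANTICS` dictionary and its “evening” entry are illustrative assumptions, not part of the described embodiment; only the “morning” range of hour 6 to 10 comes from the text.

```python
# Hypothetical word dictionary: words that are not literal clock times
# can still carry time-unit semantic information (cf. "morning").
WORD_SEMANTICS = {
    "morning": {"unit": "hour", "range": (6, 10)},
    "evening": {"unit": "hour", "range": (17, 21)},  # assumed entry
}

def hour_range(word, keyword_hours):
    """Combine a word's preset hour range with the hours of the files
    matched by the keyword, so that both are covered by the result."""
    lo, hi = WORD_SEMANTICS[word]["range"]
    return min(lo, *keyword_hours), max(hi, *keyword_hours)

# "from morning to athletic meet": the matched file was shot at 13:00,
# so the preset range 6-10 widens to cover hour 13.
print(hour_range("morning", [13]))   # -> (6, 13)
```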
Further, by extracting the metadata that describes the time and date from the information (file) and, further, by determining the time range based on the time unit information included in the query, a flexible search using tag information can be realized.
According to the above-described exemplary embodiment, the query input unit inputs a query in the form of a text and then the semantic information extraction unit 103 extracts the semantic information by dividing the text of the query into words. However, in another exemplary embodiment, the query can be input in the form of a voice. In this case, the voice query is voice-recognized and semantic information is extracted from the result of the voice recognition.
A functional block diagram of the present exemplary embodiment is illustrated in the drawings.
By adding semantic information to each recognition word of the recognition grammar in advance, the semantic information extraction unit can extract semantic information without using morphological analysis or a word dictionary.
According to the above-described exemplary embodiments, the time range is determined for the time unit that is included in the semantic information. However, a range can also be determined for a time unit that is not included in the semantic information and combined with the determined time range.
In step S1001, the time range determination unit 105 determines the range of the time unit that is not included in the semantic information. For example, if the time unit is “year”, then the range can be set as “2007” based on the current time and date, or the range can be set as “2006 to 2007” based on the metadata 402 that describes the time and date.
In step S1002, the time range determination unit 105 determines the range of the time unit that is included in the semantic information. As in the above-described exemplary embodiments, a time range of “August to October” is obtained.
In step S1003, the time ranges are combined. For example, if the year is set based on the current time and date, “from August 2007 to October 2007” can be obtained. Further, if the year is set based on the metadata 402 that describes the time and date, “from August 2006 to October 2006 or from August 2007 to October 2007” can be obtained.
Further, the above-described flowchart can also be executed separately for each piece of the metadata 402 that describes the time and date.
In other words, in step S1001, if the time range concerning the year is obtained from each piece of the metadata 402 that describes the time and date, then “2007” and “2006” will be obtained. Further, if the range of the time unit that is included in the semantic information is obtained from each piece of the metadata 402 in step S1002, then “August to October” and “August to September” are obtained.
In combining the time ranges in step S1003, the time ranges are combined for each piece of the metadata 402 that describes the time and date, and then “August 2007 to October 2007” and “August 2006 to September 2006” are obtained. Further, by combining these into a time range that covers both, a time range of “August 2007 to October 2007 or August 2006 to September 2006” is obtained.
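The per-metadata combination described above can be sketched as follows, assuming each first-search hit contributes a (year, month) pair and the query contributes the first month; the function name `per_file_ranges` is hypothetical.

```python
def per_file_ranges(first_month, file_dates):
    """For each first-search hit, pair its year with a month range
    spanning from the query's first month to the hit's own month."""
    ranges = []
    for year, month in file_dates:
        lo, hi = sorted((first_month, month))
        ranges.append((year, lo, hi))  # e.g. (2007, 8, 10) = Aug-Oct 2007
    return ranges

# Hits: one file shot October 2007, one shot September 2006;
# the query supplied "August" (month 8) as the first value.
print(per_file_ranges(8, [(2007, 10), (2006, 9)]))
# -> [(2007, 8, 10), (2006, 8, 9)]
```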
Further, according to the above-described exemplary embodiments, the time range is determined so that it includes all of the metadata 402 that describes the plurality of times and dates obtained from the plurality of pieces of information found by the first information search unit 104.
However, the time range of the present invention is not limited to this and, for example, the time range can be determined by using only the metadata 402 that describes the time and date that falls in the predetermined time period such as “the current year” or “a predetermined year”.
Further, the time range can be determined by using only the metadata 402 that describes the time and date that is closest to the current time or a predetermined time.
The present exemplary embodiment is realized by the first information search unit 104 performing search, based on a keyword, of only the information (file) of the current year or of only the information (file) that is closest to the current time.
According to the above-described exemplary embodiments, the granularity of the time and date is determined based on the time unit information included in the query. However, the granularity of the present invention is not limited to time, and other numerical information can be used so long as a range can be designated.
For example, the information (files) can be searched for based on information such as global positioning system (GPS) information that includes position information (e.g., numerical information of latitude and longitude). In this case, the granularity of the position will be the degree, minute, and second of the latitude/longitude. Address units such as prefecture, municipality, ward, street, and house number can also be used.
A functional block diagram of a case where position information is used in determining the range is illustrated in the drawings. An information database 801 stores information (files) that is associated with metadata describing position.
A first information search unit 802 searches information based on the keyword extracted by the semantic information extraction unit 103. A position range determination unit 803 determines a position range used for the search based on the semantic information extracted by the semantic information extraction unit 103 and the metadata (latitude information, longitude information) that describes position and included in the information (file) that is searched by the first information search unit 802.
A position information database 804 stores position information that is used for matching position information such as GPS information with address information including prefecture, city, and ward. A second information search unit 805 searches the information database 801 for information (file) based on the position range determined by the position range determination unit 803.
In step S1101, the position range determination unit 803 extracts the metadata (latitude information, longitude information) that describes position from the information (file) searched by the first information search unit 802.
In step S1102, the position range determination unit 803 determines the position range based on the metadata (latitude information, longitude information) that describes the position, which is extracted from the information (file) searched by the first information search unit 802, and the semantic information extracted in step S202.
The address information 904 can be obtained by converting the metadata 902 that describes the position, which is used when the position range determination unit 803 obtains the position range, or the address information 904 can be stored in advance in the information (file) 901 as metadata (metadata that describes address).
Position unit information 905 is information of position unit such as prefecture, city, and chome included in the semantic information that is extracted by the semantic information extraction unit 103. The position range determination unit 803 converts the metadata that describes the position, which is extracted from the information (file) that is searched by the first information search unit 802, into the address information 904 by referring to the position information database 804. The position range determination unit 803 determines the position range based on the address information 904 and the position unit information included in the semantic information.
For example, if the query is “1 chome to so-and-so tower”, then, since “chome” is obtained as the position unit information, “3 chome” is extracted from the address information 904, and the search range is determined as “chome: 1 to 3” (1 chome to 3 chome). The granularity of the position at this time is “chome”.
Based on the position range determined in this way, in step S1103, the second information search unit 805 searches the information database 801 for information. The process in step S207 is similar to that described in the above-described exemplary embodiments.
As described above, the information (file) that is associated with the metadata (tag information) corresponding to the keyword included in the query is searched for. Further, the metadata that describes the position is extracted from the information (file). Then, by determining the position range based on the position unit information included in the query, a flexible search of the position range becomes possible.
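The position-range determination of this embodiment can be sketched in the same style. The coordinate key and its address entry in `POSITION_DB` are purely hypothetical stand-ins for the position information database 804, and the function name `position_range` is likewise an assumption.

```python
# Hypothetical position-information database: maps a (lat, lon) pair
# to address units, down to "chome" (city-block) granularity.
POSITION_DB = {
    (35.6586, 139.7454): {"prefecture": "Tokyo", "ward": "Minato",
                          "chome": 3},
}

def position_range(unit, query_value, file_latlon):
    """Determine the search range for the address unit named in the
    query, using the address converted from the file's GPS metadata."""
    address = POSITION_DB[file_latlon]   # GPS -> address conversion
    lo, hi = sorted((query_value, address[unit]))
    return unit, lo, hi

# Query "1 chome to so-and-so tower": the matched file's GPS metadata
# converts to 3 chome, so the search range is chome 1 to 3.
print(position_range("chome", 1, (35.6586, 139.7454)))
# -> ('chome', 1, 3)
```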
Aspects of the present invention can also be realized by a computer of a system or apparatus (or devices such as a CPU or MPU) that reads out and executes a program recorded on a memory device to perform the functions of the above-described embodiment(s), and by a method, the steps of which are performed by a computer of a system or apparatus by, for example, reading out and executing a program recorded on a memory device to perform the functions of the above-described embodiment(s). For this purpose, the program is provided to the computer, for example, via a network or from a recording medium of various types serving as the memory device (e.g., a computer-readable storage medium).
While the present invention has been described with reference to exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.
This application claims the benefit of Japanese Patent Application No. 2008-281864, filed Oct. 31, 2008, which is hereby incorporated by reference herein in its entirety.