The present invention relates to an ontology creation assistance device for assisting creation of an ontology described in an ontology description language.
With progress in information processing technology in recent years, knowledge bases and knowledge databases have grown large in scale. In particular, when a knowledge base system or a knowledge database system distributed on a network is used, a large amount of data must be processed efficiently. Such a knowledge base or knowledge database system is created in accordance with the creation rules of the corresponding system.
As described above, in order to use data of a knowledge base system or a knowledge database system distributed on a network, it is convenient if the data can be classified and hierarchized in accordance with a certain standard. The ontology is attracting attention as a powerful method for doing so.
The ontology includes concepts themselves or terms themselves related to a domain and information that clearly defines a relationship between the concepts or between the terms, and is described using an ontology description language.
A first example of the ontology description language is Web Ontology Language (OWL), a technology recommended by the World Wide Web Consortium (W3C) for systematically expressing vocabulary and knowledge existing on the Web and the relationships among them. OWL expresses a predictable class system of vocabulary with a set called a triple, constituted by a subject that is an element equivalent to “who”, a predicate equivalent to “what”, and an object equivalent to “what value” in Resource Description Framework (RDF). Here, the subject and the object are each called a node, and the predicate is called a property. Hereinafter, this triple will be described as a triple described in the ontology description language.
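The triple structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Triple` type, the helper functions, and the sample data (`apple`, `is_a`, and so on) are hypothetical and do not appear in the specification, and a real OWL/RDF store would use IRIs rather than plain strings.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: two nodes joined by a property (hypothetical type)."""
    subject: str    # node, equivalent to "who"
    predicate: str  # property, equivalent to "what"
    obj: str        # node, equivalent to "what value"


# Hypothetical sample data, not taken from the specification.
triples = [
    Triple("apple", "is_a", "fruit"),
    Triple("banana", "is_a", "fruit"),
    Triple("apple", "has_color", "red"),
]


def nodes_of(items):
    """Collect every name used as a subject or an object (the nodes)."""
    return {t.subject for t in items} | {t.obj for t in items}


def properties_of(items):
    """Collect every predicate name (the properties)."""
    return {t.predicate for t in items}
```

Here `nodes_of(triples)` gathers both ends of every statement, which mirrors the convention that the subject and the object are both nodes while the predicate is a property.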
Work of creating an original ontology has mostly been done manually. However, creating a highly accurate ontology requires familiarity with ontologies, and thus it is disadvantageously difficult for anyone other than a few experts to create one. Furthermore, as the number of data items constituting data increases, selecting a property corresponding to each data item disadvantageously takes more labor and time. Therefore, conventionally, as illustrated in, for example, Patent Literature 1, there has been an ontology creation assistance device for associating existing ontologies with each other.
Patent Literature 1: JP 2009-70133 A
However, the conventional ontology creation assistance device requires metadata as prior knowledge that is specialized for association of ontologies and is used for associating a data item with an existing ontology, and thus the load of ontology creation work cannot necessarily be reduced.
The present invention has been achieved in order to solve such a problem, and an object of the present invention is to provide an ontology creation assistance device capable of reducing the load of ontology creation work.
An ontology creation assistance device according to the present invention includes: a processor to execute a program; and a memory to store the program which, when executed by the processor, performs processes of: searching an ontology database in which a triple described in an ontology description language is registered for a node set similar to a given word; creating, from common properties and nodes possessed by nodes included in the node set that has been searched for, a template for creating a triple including a node to be newly added; and creating a triple by connecting the created template to a node having the given word as a name, using the created triple as display data, and when a triple is given, registering the given triple in the ontology database.
The ontology creation assistance device according to the present invention creates, from common properties and nodes possessed by nodes included in a node set similar to a given word, a template for creating a triple including a node to be newly added, creates a triple by connecting the template to a node having the given word as a name, uses the created triple as display data, and when a triple that has been corrected with respect to the display data, for example, is given, registers the given triple in the ontology database. As a result, the load of the ontology creation work can be reduced.
Hereinafter, in order to describe the present invention in more detail, an embodiment for carrying out the present invention will be described with reference to the attached drawings.
As illustrated, an ontology creation assistance device 1 includes an ontology database 11, a word vector database 12, a search unit 13, a template creating unit 14, and an additional information determining unit 15. The ontology database 11 is a database in which a set of triples described in an ontology description language is registered. The word vector database 12 is a database in which a set of word vectors used by the search unit 13 for searching for a similar node set is registered. This word vector database 12 may be constituted, for example, using the technology described in the following literature: Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean, Efficient Estimation of Word Representations in Vector Space, ICLR 2013. The search unit 13 is a processing unit for searching existing ontologies stored in the ontology database 11 for a set of nodes each having a name similar to a word input from an input device 2. The template creating unit 14 is a processing unit for creating a template used for creating a triple to which common properties possessed by nodes in a search result output by the search unit 13 are newly added. The additional information determining unit 15 is a processing unit for creating a triple by connecting the template created by the template creating unit 14 to a node having a given word as its name, outputting the triple as display data to a display device 3, and, when a triple that has been corrected with respect to the display data, for example, is given from the input device 2, registering the given triple in the ontology database 11.
Next, the operation of the ontology creation assistance device according to the first embodiment will be described.
The search unit 13 searches, by using the word vector database 12, the ontology database 11 for a node name similar to a word input by a user. A specific operation of the search unit 13 is illustrated in the flowchart of
Note that $\vec{a} = (a_1, a_2, \ldots, a_n)$ and $\vec{b} = (b_1, b_2, \ldots, b_n)$ hold.
An example of the calculation of the similarity in step ST13 will be described. When triples are constituted by connecting a plurality of nodes with different names to a specific node via the same property, the plurality of nodes with different names is defined as a node set. When the search unit 13 searches the ontology database 11, the average vector of the nodes included in each node set, calculated in advance, is used for calculating the similarity.
An example of similarity calculation is illustrated in
In this way, by using the similarity between the vector of a word input from a user and the average vector of nodes regarded as a node set in the ontology, it is possible to search for a node and a property of an ontology having a high probability of connection to the word input from the user.
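The search described above can be sketched as follows. This is a minimal sketch under several assumptions: cosine similarity is used as the similarity measure (the surviving text does not state the formula), the 3-dimensional vectors and the triples are hypothetical stand-ins for the word vector database 12 and the ontology database 11, and the `threshold` parameter is an illustrative way of selecting node sets with "high similarity".

```python
import math
from collections import defaultdict

# Hypothetical word vectors standing in for the word vector database.
WORD_VECTORS = {
    "apple": [0.9, 0.1, 0.0],
    "banana": [0.8, 0.2, 0.0],
    "car": [0.0, 0.1, 0.9],
    "truck": [0.1, 0.0, 0.8],
}

# Hypothetical ontology triples as (subject, property, object).
TRIPLES = [
    ("apple", "is_a", "fruit"),
    ("banana", "is_a", "fruit"),
    ("car", "is_a", "vehicle"),
    ("truck", "is_a", "vehicle"),
]


def cosine(a, b):
    """Cosine similarity (assumed measure); 0.0 if either vector is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def node_sets(triples):
    """Group nodes connected to the same node via the same property."""
    groups = defaultdict(set)
    for subject, prop, obj in triples:
        groups[(prop, obj)].add(subject)
    # A node set needs a plurality of nodes with different names.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


def average_vector(names, vectors):
    """Component-wise mean of the vectors of the named nodes."""
    known = [vectors[n] for n in names if n in vectors]
    return [sum(column) / len(known) for column in zip(*known)]


def search(word_vector, triples, vectors, threshold=0.8):
    """Return (score, (property, object), node names), best match first."""
    results = []
    for key, names in node_sets(triples).items():
        score = cosine(word_vector, average_vector(names, vectors))
        if score >= threshold:
            results.append((score, key, names))
    return sorted(results, reverse=True)
```

For instance, calling `search([0.85, 0.1, 0.05], TRIPLES, WORD_VECTORS)` with a hypothetical vector for a fruit-like word keeps only the node set connected to `fruit` via `is_a`. In practice the average vectors would be computed in advance, as the description states, rather than on every query.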
Next, the template creating unit 14 creates a template for creating a triple to be registered in the ontology database 11 from the list of node sets with high similarity output from the search unit 13. A specific operation is illustrated in the flowchart of
First, the template creating unit 14 acquires a list of node sets with high similarity from the search unit 13 (step ST21). Next, the template creating unit 14 acquires lists of the properties and nodes possessed by the nodes included in each acquired node set (step ST22). Properties and nodes commonly possessed by the nodes are extracted from the acquired lists, duplicates are removed, and a template is created (step ST23). Here, as a condition on the properties and nodes to be extracted, a threshold such as a number of nodes is set in advance. For example, the properties and nodes as extraction targets are properties and nodes commonly possessed by two or more nodes. In the example of
As described above, by using, as a template, properties and nodes commonly possessed by nodes included in a node set with high similarity output by the search unit 13, the template creating unit 14 can automatically connect a property and a node that are possessed with high probability by a node of an ontology having a meaning close to a word input by a user. The ontology creation assistance device 1 can therefore simplify node registration work.
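Steps ST21 to ST23 can be sketched as follows. This is an illustrative sketch only: the function name `create_template`, the `min_nodes` parameter, and the sample triples are hypothetical, and the threshold of two nodes follows the example condition given in the description.

```python
from collections import Counter

# Hypothetical ontology triples as (subject, property, object).
TRIPLES = [
    ("apple", "is_a", "fruit"),
    ("banana", "is_a", "fruit"),
    ("cherry", "is_a", "fruit"),
    ("apple", "grows_on", "tree"),
    ("banana", "grows_on", "tree"),
    ("cherry", "grows_on", "tree"),
    ("apple", "has_color", "red"),
]


def create_template(node_names, triples, min_nodes=2):
    """Extract (property, node) pairs shared by at least min_nodes nodes.

    node_names: the nodes of one node set returned by the search step.
    min_nodes:  extraction threshold (assumed parameter name).
    """
    members = set(node_names)
    seen = set()        # guards against counting a repeated triple twice
    counts = Counter()  # (property, object) -> number of nodes possessing it
    for subject, prop, obj in triples:
        if subject in members and (subject, prop, obj) not in seen:
            seen.add((subject, prop, obj))
            counts[(prop, obj)] += 1
    # Counter keys are unique, so duplicates are already removed here.
    return sorted(pair for pair, n in counts.items() if n >= min_nodes)


template = create_template(["apple", "banana", "cherry"], TRIPLES)
```

With the sample data, `has_color`/`red` is possessed by only one node and is therefore left out of the template, while the two pairs shared by all three nodes are kept.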
The additional information determining unit 15 creates a triple by connecting a template output from the template creating unit 14 to a node having the name of a word input by a user, and presents the triple to the user using the display device 3. The user corrects the presented triple using the mouse 103 and the keyboard 104 if necessary. The additional information determining unit 15 registers the corrected triple in the ontology database 11. A specific operation is illustrated in the flowchart of
First, the additional information determining unit 15 acquires a template from the template creating unit 14 (step ST31). Next, the additional information determining unit 15 creates a node having a word input by the input device 2 as a node name and creates a triple by connecting the node to the template (step ST32). For example, as illustrated in
The user confirms the created triple. When the user determines that correction is necessary, the user corrects the triple on the GUI using the mouse 103 and the keyboard 104. The additional information determining unit 15 determines whether a correction input has been made by the input device 2 (step ST34). If a correction input has been made (step ST34: YES), the correction is reflected in the triple (step ST35), and the corrected triple is registered in the ontology database 11 (step ST36). If a correction input has not been made by the input device 2 (step ST34: NO), the additional information determining unit 15 registers the triple displayed on the display device 3 in step ST33 in the ontology database 11 as it is (step ST36).
As a result, the user only needs to determine whether or not each property and node connected to the node to be newly added is necessary, and therefore an ontology can be created more efficiently than in the prior art.
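The flow of steps ST31 to ST36 can be sketched as follows. This is a minimal sketch, assuming the ontology database is modeled as a plain Python list and the user's correction arrives as an optional list of triples; the function names and sample data are hypothetical, and the display step is omitted.

```python
def build_candidates(word, template):
    """Connect each (property, node) pair of the template to the new node."""
    return [(word, prop, obj) for prop, obj in template]


def register(database, candidates, corrected=None):
    """Register corrected triples if a correction was given, else the candidates.

    database:  list of triples standing in for the ontology database.
    corrected: triples after user correction, or None if no correction was made.
    """
    final = corrected if corrected is not None else candidates
    for triple in final:
        if triple not in database:  # avoid registering the same triple twice
            database.append(triple)
    return final


# Hypothetical sample data.
database = [("apple", "is_a", "fruit")]
template = [("is_a", "fruit"), ("grows_on", "tree")]

candidates = build_candidates("orange", template)
# Suppose the user removes the second triple before registration.
registered = register(database, candidates, corrected=[("orange", "is_a", "fruit")])
```

Passing `corrected=None` corresponds to the branch in which no correction input is made and the displayed triples are registered as they are.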
When the created triple is displayed, the additional information determining unit 15 may change a display method depending on the number of properties and nodes stored in the template created by the template creating unit 14 and the number of properties and nodes that are a source of the template. For example, in the case of the template illustrated in
Due to this, it is possible to visually determine the ratio of commonly connected properties and nodes stored in the template to connected properties and nodes in the node set that is the source of the template, thereby assisting determination of necessity at the time of correction of a triple.
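One possible reading of this display variation can be sketched as follows. The sketch assumes that the ratio is computed per template entry as the share of nodes in the source node set that possess that property and node, and that the ratio is mapped to a line width; both the per-entry interpretation and the line-width styling are assumptions for illustration, and all names and data are hypothetical.

```python
# Hypothetical ontology triples as (subject, property, object).
TRIPLES = [
    ("apple", "is_a", "fruit"),
    ("banana", "is_a", "fruit"),
    ("cherry", "is_a", "fruit"),
    ("apple", "has_color", "red"),
    ("cherry", "has_color", "red"),
]


def support_ratios(node_names, triples, template):
    """For each template entry, the share of node-set members possessing it."""
    members = set(node_names)
    ratios = {}
    for prop, obj in template:
        owners = {s for s, p, o in triples if s in members and p == prop and o == obj}
        ratios[(prop, obj)] = len(owners) / len(members) if members else 0.0
    return ratios


def line_width(ratio, min_width=1.0, max_width=5.0):
    """Map a ratio in [0, 1] linearly to a drawing width (assumed styling)."""
    return min_width + (max_width - min_width) * ratio


ratios = support_ratios(
    ["apple", "banana", "cherry"],
    TRIPLES,
    [("is_a", "fruit"), ("has_color", "red")],
)
```

An entry shared by every node in the node set would then be drawn with the thickest line, giving the user a visual cue about how strongly the source node set supports each proposed property and node.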
As described above, the ontology creation assistance device according to the first embodiment includes: a search unit for searching an ontology database in which a triple described in an ontology description language is registered for a node set similar to a given word; a template creating unit for creating, from common properties and nodes possessed by nodes included in the node set that has been searched for by the search unit, a template for creating a triple including a node to be newly added; and an additional information determining unit for creating a triple by connecting the template created by the template creating unit to a node having the given word as a name, using the created triple as display data, and when a triple is given, registering the given triple in the ontology database. Therefore, a load of ontology creation work can be reduced.
In addition, in the ontology creation assistance device according to the first embodiment, the additional information determining unit, when a corrected triple with respect to the display data is given, registers the corrected triple in the ontology database. Therefore, the triple can be corrected, and this corrected triple can be registered in the ontology database.
In addition, in the ontology creation assistance device according to the first embodiment, the template creating unit extracts, from the common properties and nodes, a property and a node satisfying a set extraction target condition, and creates the template, and the additional information determining unit uses, as the display data, data in a display form indicating a ratio of the number of the common properties and nodes to the number of the extracted property and node. Therefore, this assists determination of necessity at the time of correction of a triple and makes it possible to further reduce the load of ontology creation work.
Note that any component in the embodiment can be modified, or any component in the embodiment can be omitted within the scope of the present invention.
As described above, the ontology creation assistance device according to the present invention relates to a configuration for assisting creation of an ontology including concepts themselves or terms themselves related to a domain and information that clearly defines a relationship between the concepts or between the terms, and described using an ontology description language, and is suitable for use for a knowledge base or a knowledge database system.
1: Ontology creation assistance device, 2: Input device, 3: Display device, 11: Ontology database, 12: Word vector database, 13: Search unit, 14: Template creating unit, 15: Additional information determining unit.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2017/008766 | 3/6/2017 | WO | 00 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2018/163241 | 9/13/2018 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
8335754 | Dawson | Dec 2012 | B2 |
11188580 | Osmon | Nov 2021 | B2 |
20030158851 | Britton | Aug 2003 | A1 |
20080294644 | Liu | Nov 2008 | A1 |
20090077074 | Hosokawa | Mar 2009 | A1 |
20110246407 | Lee | Oct 2011 | A1 |
20110320187 | Motik | Dec 2011 | A1 |
20120271787 | Lee | Oct 2012 | A1 |
20130260358 | Lorge | Oct 2013 | A1 |
20130262361 | Arroyo | Oct 2013 | A1 |
20140297653 | Liu | Oct 2014 | A1 |
20150019589 | Arroyo | Jan 2015 | A1 |
20160034538 | Oberle | Feb 2016 | A1 |
20160092554 | Srinivasan | Mar 2016 | A1 |
20160132572 | Chang | May 2016 | A1 |
20160371355 | Massari | Dec 2016 | A1 |
20170337268 | Ait-Mokhtar | Nov 2017 | A1 |
20190034811 | Cuddihy | Jan 2019 | A1 |
Number | Date | Country |
---|---|---|
2001-14166 | Jan 2001 | JP |
2009-70133 | Apr 2009 | JP |
Entry |
---|
Abedjan, Ziawasch., “Improving RDF data with data mining”, 2014, Diss. Universität Potsdam, 2014, pp. 1-123. |
Mikolov et al. “Efficient Estimation of Word Representations in Vector Space” ICLR, Sep. 7, 2013, p. 1-12. |
Number | Date | Country | |
---|---|---|---|
20200050657 A1 | Feb 2020 | US |