This application claims priority to and benefits of Chinese Patent Application Serial No. 201810897535.1, filed on Aug. 8, 2018, the entire content of which is incorporated herein by reference.
The present disclosure relates to a field of data visualization technology, and more particularly to a method and an apparatus for generating a knowledge graph, and an electronic device, and a computer readable storage medium.
Data visualization refers to a graphical representation of information or data that presents the data in a visual form, such as charts, maps, and graphs, to help people understand the meaning of the data. Data visualization may include static visualization and dynamic visualization. Static visualization means that no dynamic modification can be performed after visualization, while dynamic visualization means that updates or adjustments may be performed dynamically based on dynamic data.
A knowledge graph is a common form of data visualization. It is a semantic representation that reveals the relationships between entities, and it can visualize objects in the real or virtual world and the relationships between those objects. The knowledge graph is essentially a semantic network with a graph-based data structure. It is mainly composed of nodes and edges, both of which can have various settable properties. In the knowledge graph, each node can represent one entity in the real or virtual world, and each edge can represent a relationship between two entities.
Embodiments of the present disclosure provide a method and an apparatus for generating a knowledge graph, an electronic device and a computer readable storage medium.
According to a first aspect of the present disclosure, a method for generating a knowledge graph is provided. The method includes: establishing a graph database based on a set of entities and relationships among the entities in given content; receiving a graph query for the given content from a user; and generating, based on the graph database, a knowledge graph of the given content by using a predefined formatted layout, the knowledge graph having a network structure.
According to a second aspect of the present disclosure, an apparatus for generating a knowledge graph is provided. The apparatus includes: an establishing module, configured to establish a graph database based on a set of entities and relationships among the entities in given content; a receiving module, configured to receive a graph query for the given content from a user; and a generating module, configured to generate, based on the graph database, a knowledge graph of the given content by using a predefined formatted layout, the knowledge graph having a network structure.
According to a third aspect of the present disclosure, an electronic device is provided. The electronic device includes: one or more processors; and a memory device, configured to store one or more programs that, when executed by the one or more processors, cause the electronic device to perform the method or the process according to embodiments of the present disclosure.
According to a fourth aspect of the present disclosure, a computer readable storage medium is provided. The computer readable storage medium has computer programs stored thereon that, when executed by a processor, cause the processor to perform the method according to embodiments of the present disclosure.
It should be understood that, the content described in the summary of the present disclosure is not intended to limit the key features or important features of the embodiments of the present disclosure, and is not intended to limit the scope of the disclosure. Other features of the present disclosure will be readily understood by the following description.
Above and other features, advantages and aspects of embodiments of the present disclosure will become more apparent from the following descriptions made with reference to the drawings. In the drawings, the same or similar elements and the elements having same or similar functions are denoted by like reference numerals throughout the descriptions.
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided for a more complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of the disclosure.
In the description of the embodiments of the present disclosure, the term “comprise” and the like are to be understood as open-ended, i.e., “including but not limited to”. The term “based on” should be understood to mean “based at least in part on”. The term “one embodiment” or “an embodiment” should be understood to mean “at least one embodiment”. Other explicit and implicit definitions may also be included below.
The inventors of the present disclosure noticed that users have a strong demand for knowledge graphs (such as relationships among TV play characters) when searching. In the related art, this demand is mainly met in the form of question-and-answer cards, which lack a complete knowledge graph presentation. In the related art, there are two ways of generating a knowledge graph for given content (such as a TV play). The first way is to edit a graph manually and present it as a picture. However, a graph produced in this way cannot be updated automatically by technical means. When the content of the TV play is updated or new content is generated, re-editing the graph is costly and slow. The second way is to generate a knowledge graph automatically. However, a graph generated in this way generally has a tree structure that extends from one node, so relationships between child nodes of that node cannot be presented. In addition, a graph generated in the second way usually has only simple edge relationships among characters and cannot fully show the relationships among characters in the TV play, so its expressiveness and presentation effect are poor. Therefore, the ways of generating a graph in the related art are either inefficient or have a poor presentation effect, and cannot meet the needs of users.
Embodiments of the present disclosure provide a solution for generating a knowledge graph. In embodiments of the present disclosure, a knowledge graph for given content may be automatically generated based on an established graph database. At the same time, the knowledge graph generated in embodiments of the present disclosure uses a formatted layout and is presented as a network structure, thus can improve presentation effect and further improve user experience. In the following, some embodiments of the present disclosure may be described in detail with reference to
The given content 110 (such as the TV play 115) may involve many characters with intricate relationships among them. Based on an analysis of the TV play 115, relatively important entities (such as key characters in the TV play) and the relationships among those entities (such as character relationships in the TV play) may be extracted to establish the graph database 120. It should be understood that, in most scenarios, only key characters and the relationships among them need to be stored; however, all characters and the relationships among them may also be stored in the graph database 120. During the process of establishing the graph database 120, the characters may be taken as nodes in the graph database 120, and the relationships among the characters may be taken as edges in the graph database 120. As illustrated in
As illustrated in
With embodiments of the present disclosure, the generated knowledge graph uses the network structure, so there is no restriction on the number of levels of entity relationships. The graph can therefore cover the relationships among entities in the given content to a large extent, improving the completeness of the entity relationships presented for the given content.
At block 202, a graph database is established based on a set of entities and relationships among the entities in given content. That is, as illustrated in
At block 204, a graph query for the given content is received from a user. In some embodiments, a character graph query for a TV play may be received in a search engine. For example, when the user searches for “ABC character graph” through the search engine, the search engine may interpret “ABC character graph” as a character graph query for a TV play ABC. In an embodiment, a query request for the given content (such as a certain TV play) may be received in a specific graph website or an application. In some embodiments, the user may issue the graph query by inputting keywords. In addition, the user may express the graph query through a menu or a button.
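The keyword-based interpretation described above can be illustrated with a minimal sketch. The suffix list `QUERY_SUFFIXES` and the function `parse_graph_query` are hypothetical names introduced here for illustration only; the disclosure does not specify how a search engine recognizes a graph query, and a real system would likely rely on more robust query understanding.

```python
# Hypothetical trigger phrases; the disclosure only gives "ABC character graph" as an example.
QUERY_SUFFIXES = ("character graph", "character relationships")


def parse_graph_query(text):
    """Return the content title if `text` looks like a graph query, else None."""
    normalized = " ".join(text.split())      # collapse repeated whitespace
    lowered = normalized.lower()
    for suffix in QUERY_SUFFIXES:
        if lowered.endswith(" " + suffix):
            # Strip the trigger phrase and keep the title in its original case.
            return normalized[: -len(suffix)].strip()
    return None


title = parse_graph_query("ABC character graph")
```

A query that does not end with one of the assumed trigger phrases returns `None` and would be handled as an ordinary search.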
At block 206, a knowledge graph of the given content is generated by using a predefined formatted layout based on the graph database, the knowledge graph having a network structure. For example, a layout template such as a template of honeycomb hexagon structure may be preset. Nodes are then projected to respective parts of the template, such that the generated knowledge graph is located at predetermined positions on the template, thereby ensuring the regularity and aesthetics of the knowledge graph and improving the presentation effect and user experience. The formatted layout may refer to a regular layout having a predetermined format, which is different from traditional irregular layouts. Because the knowledge graph is generated based on the formatted layout, the nodes and edges in the knowledge graph are clear, and edges do not cross.
It should be understood that the method according to embodiments of the present disclosure may be implemented in any electronic device (e.g., a server). After the knowledge graph is generated, the knowledge graph may be sent to the user equipment, so as to be displayed on the display screen of the user equipment. For example, the knowledge graph may be displayed in a browser of the user equipment, or may be displayed in an application installed on the user equipment. The user equipment may be any electronic device, including fixed equipment such as a desktop computer or a mobile device such as a smart phone.
In some embodiments, the knowledge graph may be generated by using the honeycomb hexagon structure. Each hexagon may correspond to six edge nodes and one center node. In this way, it is possible to satisfy the association and extension between nodes, and to facilitate browsing by users, especially browsing on mobile devices with smaller screens. The entire character graph may visualize the character relationships of the whole TV play by taking a character as a center node and extending outward from the center node to connect other nodes. In an embodiment, an octagon or other structures may be used for generating the knowledge graph.
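As a minimal sketch of the honeycomb hexagon structure, the following code places a center node at the origin and assigns up to six neighboring nodes to the vertices of a regular hexagon around it. The function names, the unit radius, and the choice to ignore neighbors beyond the sixth are assumptions made for illustration; the disclosure does not specify coordinates or how additional hexagons are tiled.

```python
import math


def hexagon_positions(center=(0.0, 0.0), radius=1.0):
    """Return the six edge-node positions of a regular hexagon around `center`."""
    cx, cy = center
    positions = []
    for k in range(6):
        angle = math.radians(60 * k)  # vertices are spaced 60 degrees apart
        positions.append((cx + radius * math.cos(angle),
                          cy + radius * math.sin(angle)))
    return positions


def layout(center_node, neighbors, radius=1.0):
    """Place the center node at the origin and up to six neighbors on the hexagon."""
    slots = hexagon_positions(radius=radius)
    placed = {center_node: (0.0, 0.0)}
    # zip stops after six slots; further neighbors are left out of this hexagon.
    for node, slot in zip(neighbors, slots):
        placed[node] = slot
    return placed


positions = layout("A", ["B", "C", "D", "E", "F", "G", "H"])
```

In this sketch the seventh neighbor is simply not placed; an actual layout would presumably assign it to an adjacent hexagon so that the graph extends outward from the center node.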
In some embodiments, the graph database may store all or most of the relationships among entities, and each relationship has a relative weight depending on the type of the relationship. For example, the weight of a marital relationship is greater than the weight of a friend relationship. When generating the knowledge graph, only the relationships with relatively high weights may be presented. For example, the character relationships may be ranked, and a predetermined number of top-ranked relationship edges may be presented based on the ranking result. That is, the knowledge graph may present only the relatively important relationships stored in the graph database, so that the generated knowledge graph is not so dense that it degrades the user's experience.
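The weight-based selection can be sketched as follows. The numeric weights in `RELATION_WEIGHTS`, the default weight, and the function name `top_relationships` are hypothetical; the disclosure states only that a marital relationship outweighs a friend relationship and that a top number of edges is presented.

```python
# Hypothetical weights per relationship type (only spouse > friend comes from the text).
RELATION_WEIGHTS = {"spouse": 3.0, "parent": 2.5, "friend": 1.0}


def top_relationships(edges, limit, weights=RELATION_WEIGHTS, default_weight=0.5):
    """Rank (source, target, relation) edges by relation weight and keep the top `limit`."""
    ranked = sorted(
        edges,
        key=lambda edge: weights.get(edge[2], default_weight),
        reverse=True,  # highest weight first; ties keep their original order
    )
    return ranked[:limit]


edges = [
    ("A", "C", "friend"),
    ("A", "B", "spouse"),
    ("A", "D", "colleague"),  # unknown type, falls back to default_weight
    ("B", "E", "parent"),
]
shown = top_relationships(edges, limit=2)
```

With these assumed weights, only the spouse and parent edges would be presented, while the friend and colleague edges remain stored in the graph database.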
In some embodiments, when the given content is updated, the graph database may be updated correspondingly. When a new graph query for the given content is received, an updated knowledge graph may be generated based on the updated graph database. In this way, because the data is retrieved online, the online display of the knowledge graph can be updated synchronously, which greatly reduces the manual editing cost and allows a large number of knowledge graphs to be updated in batches at the same time.
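The update behavior can be sketched by generating the knowledge graph at query time from whatever the graph database currently holds. The class `GraphService` and its method names are hypothetical and stand in for the graph database and the generating step; this is an illustration under those assumptions, not the disclosed implementation.

```python
class GraphService:
    """Toy service: stores edges per content title and builds a graph on each query."""

    def __init__(self):
        self.edges = {}  # content title -> list of (source, target, relation)

    def update_content(self, title, edges):
        # Called when the given content (e.g., a TV play) is updated.
        self.edges[title] = list(edges)

    def handle_query(self, title):
        # The graph is built from the current database state, so a query issued
        # after an update reflects the update without any manual editing.
        edges = self.edges.get(title, [])
        nodes = sorted({name for source, target, _ in edges for name in (source, target)})
        return {"nodes": nodes, "edges": list(edges)}


service = GraphService()
service.update_content("ABC", [("A", "B", "spouse")])
first = service.handle_query("ABC")
service.update_content("ABC", [("A", "B", "spouse"), ("A", "C", "friend")])
second = service.handle_query("ABC")
```

Because nothing is cached between queries in this sketch, the second result includes character C as soon as the database has been updated.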
As illustrated in
The character graph 320 presents a part of the entire character graph for the TV play ABC. It should be understood that the entire character graph may be presented at one time on a large screen. The presented character graph 320 includes a node of character A, a node of character B, a node of character C, a node of character D, a node of character E, a node of character F, a node of character G, a node of character H, a node of character I, and a node of character J. In some embodiments, the node of each character may present an avatar or an image of the character. Character A is the most important character (i.e., the hero) of the TV play ABC, so the node of character A may be determined as the central node. The character graph 320 is presented by taking the node of character A as the center. For example, the node of character A may be enlarged when presented.
Among the nodes shown in
As illustrated in
For example, as illustrated in
In some embodiments, the given content is a film and television work, the entities are characters in the film and television work, the relationships are character relationships among the characters, and the establishing module 810 includes an establishing unit. The establishing unit is configured to establish the graph database by using the characters as nodes and using the character relationship as edges.
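A minimal in-memory sketch of establishing such a graph database is shown below, with characters as nodes and character relationships as edges. The class `GraphDatabase` and its methods are hypothetical names for illustration; the disclosure does not name a particular graph database product or schema.

```python
from dataclasses import dataclass, field


@dataclass
class GraphDatabase:
    """Toy in-memory graph store: characters are nodes, relationships are edges."""
    nodes: dict = field(default_factory=dict)  # character name -> property dict
    edges: list = field(default_factory=list)  # (source, target, relation) tuples

    def add_character(self, name, **properties):
        self.nodes[name] = properties

    def add_relationship(self, source, target, relation):
        # Both endpoints must already exist as nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both characters must be added before relating them")
        self.edges.append((source, target, relation))

    def neighbors(self, name):
        """Return (other character, relation) pairs for every edge touching `name`."""
        result = []
        for source, target, relation in self.edges:
            if source == name:
                result.append((target, relation))
            elif target == name:
                result.append((source, relation))
        return result


db = GraphDatabase()
for character in ("A", "B", "C"):
    db.add_character(character)
db.add_relationship("A", "B", "spouse")
db.add_relationship("A", "C", "friend")
db.add_relationship("B", "C", "sibling")  # an edge between two non-central characters
```

The edge between B and C is the kind of relationship a tree structure rooted at A could not present, which is why the store keeps a general edge list rather than parent and child links.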
In some embodiments, the generating module 830 includes: a determining unit configured to determine a node corresponding to a specific entity in the set of entities as a central node; and a first generating unit configured to generate the knowledge graph by taking the central node as a center.
In some embodiments, the generating module 830 further includes: a first providing unit configured to provide a set of candidate central nodes to a user interface; and an adjusting unit, configured to, in response to selecting one from the set of candidate central nodes by the user, adjust the knowledge graph by taking the selected candidate central node as the center.
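Adjusting the knowledge graph around a newly selected central node can be sketched as a breadth-first traversal that groups nodes by their hop distance from the selected center. The function `recenter` and the ring-based grouping are assumptions made for illustration; the disclosure does not prescribe a traversal algorithm.

```python
from collections import deque


def recenter(edges, center):
    """Group nodes by hop distance from `center` (ring 0 is the center itself)."""
    adjacency = {}
    for source, target, _relation in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)
    if center not in adjacency:
        raise KeyError(center)

    distance = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(adjacency[node]):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)

    rings = {}
    for node, hops in distance.items():
        rings.setdefault(hops, []).append(node)
    return {hops: sorted(nodes) for hops, nodes in rings.items()}


edges = [("A", "B", "spouse"), ("A", "C", "friend"), ("C", "D", "sibling")]
around_a = recenter(edges, "A")
around_c = recenter(edges, "C")
```

Selecting character C as the center moves A and D into the first ring and B into the second, while the underlying edges are unchanged.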
In some embodiments, the apparatus 800 further includes a providing module. The providing module is configured to, in response to selecting a node from the knowledge graph by the user, provide a profile of an entity corresponding to the selected node, to replace the set of candidate central nodes in the user interface.
In some embodiments, the generating module 830 further includes: a second providing unit configured to provide a thumbnail of an entire knowledge graph corresponding to the graph database; and a marking providing unit configured to mark a portion associated with the knowledge graph in the thumbnail.
In some embodiments, the generating module 830 further includes: a second generating unit configured to generate the knowledge graph by using a honeycomb hexagon structure, the central node being a center of the hexagon.
In some embodiments, the apparatus 800 further includes: a first updating module configured to, in response to determining that the given content is updated, update the graph database; and a second updating module configured to, in response to receiving the graph query for the given content, generate an updated knowledge graph based on the updated graph database.
It should be understood that, the establishing module 810, the receiving module 820, and the generating module 830 shown in
A plurality of components in device 900 are connected to the I/O interface 905. The components include: an input unit 906, such as a keyboard, mouse, etc., an output unit 907, such as various types of displays, speakers, etc., a storage unit 908, such as a disk, an optical disk, etc., and a communication unit 909, such as a network card, a modem, a wireless communication transceiver, and the like. The communication unit 909 allows the device 900 to exchange information/data with other devices over a computer network such as the Internet and/or various telecommunication networks.
The processing unit 901 performs the various methods and processes described above, such as the method 200. For example, in some embodiments, the method may be implemented as a computer software program that is tangibly embodied in a machine readable medium, such as the storage unit 908. In some embodiments, some or all of the computer programs can be loaded and/or installed onto the device 900 via the ROM 902 and/or the communication unit 909. One or more acts or steps of the method described above may be performed when a computer program is loaded into the RAM 903 and executed by the CPU 901. Alternatively, in other embodiments, the CPU 901 may be configured to perform the method by any other suitable means (e.g., by means of firmware).
The functions described above may be performed at least in part by one or more hardware logic components. For example, and without limitation, exemplary types of hardware logic components that may be used include: a field programmable gate array (FPGA), an application specific integrated circuit (ASIC), an application specific standard product (ASSP), a system on chip (SOC), a complex programmable logic device (CPLD), and the like.
Program codes for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program codes may be provided to a processor or controller of a general-purpose computer, a special-purpose computer or other programmable data processing apparatus, such that the program codes, when executed by the processor or the controller, enable the functions specified in the flow charts and/or block diagrams to be implemented. The program codes may be entirely executed on a machine, partially executed on the machine, partially executed on the machine as a stand-alone software package and partially executed on a remote machine, or entirely executed on the remote machine or a server.
In the context of the present disclosure, a machine-readable medium can be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine readable medium can be a machine readable signal medium or a machine readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of the machine readable storage media may include electrical connections based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In addition, although the acts or steps are described in a particular order, this should not be understood as requiring that such acts or steps be performed in the particular order or in the sequence shown, or that all illustrated acts or steps be executed to achieve a desired result. Multitasking and parallel processing may be advantageous in certain circumstances. Likewise, although several specific implementation details are included in the above description, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation can be implemented in a plurality of implementations, either individually or in any suitable sub-combination.
Although the embodiments of the present disclosure have been described in terms of specific structural features and/or methodological acts, it is understood that the subject matters defined in the appended claims are not limited to the specific features or acts described above. Instead, the specific features and acts described above are merely exemplary forms of implementing the claims.
Number | Date | Country | Kind |
---|---|---|---|
201810897535.1 | Aug 2018 | CN | national |