KNOWLEDGE GRAPH-BASED METHOD FOR RECOMMENDING TRADITIONAL CHINESE MEDICINE PRESCRIPTIONS

CROSS REFERENCE TO RELATED APPLICATION

This patent application claims the benefit and priority of Chinese Patent Application No. 202310690324.1 filed with the China National Intellectual Property Administration on Jun. 12, 2023, the disclosure of which is incorporated by reference herein in its entirety as part of the present application.

TECHNICAL FIELD

The present disclosure relates to the field of traditional Chinese medicine (TCM) prescription recommendations, and in particular to a knowledge graph-based method for recommending TCM prescriptions.

BACKGROUND

Traditional Chinese medicine (TCM) originated in ancient China and is a medical theoretical system formed through thousands of years of practice and summarization. It is a valuable asset and the crystallization of wisdom of the Chinese nation. In recent years, the Chinese government has attached great importance to the development and inheritance of TCM, actively taking measures to promote its dissemination and application both domestically and internationally, and thereby furthering the inheritance, open innovation, and development of TCM. TCM, as a unique medical system, uses diagnostic methods such as looking, listening and smelling, asking, and touching to obtain disease and symptom information and then carries out treatment based on syndrome differentiation. In clinical TCM practice, appropriate prescriptions must be selected based on a scientific differential diagnosis foundation, according to individual patient characteristics, and in compliance with the principles of herbal medicine compatibility. The efficacy of a TCM prescription is not the simple sum of the effects of its individual herbs; it is achieved through the comprehensive actions of the herbs to harmonize the balance of Yin and Yang in the human body, leading towards recovery from diseases.

However, TCM diagnosis and treatment currently lacks a clear differentiation between primary and secondary symptoms, prescriptions are often not appropriate for the actual symptoms, and the compatibility of herbal medicines is disorderly. There is no standardized knowledge mode for diagnosis and treatment based on classical TCM works and famous prescriptions. Because primary healthcare physicians often lack experience, subjective factors frequently make treatment and prescriptions less objective. TCM knowledge is scattered throughout numerous ancient texts and literature, and its loose and unstructured nature results in low utilization of TCM knowledge. Additionally, TCM takes a holistic approach, considering various factors such as the body, age, gender, and medical history of the patient, rather than just treating individual symptoms or diseases. Some studies have used attention-based neural networks to detect different herbal medicine groupings in TCM prescriptions, but such approaches often lack consideration for personalization. Furthermore, due to the potential side effects of herbal medicines, herbs must be properly combined to achieve a balance in medicine effects and to prevent the medicine effects from canceling each other or generating side effects. However, some research, such as the use of Bidirectional Recurrent Neural Networks (BRNN) for learning herbal medicine representations, has shown limited effectiveness in handling herbal medicine compatibility. These issues limit the application and development of TCM.

The application of a TCM knowledge graph can significantly enhance the dissemination of TCM knowledge. By integrating TCM knowledge graphs with artificial intelligence algorithms, it is possible to restructure the complex semantic relationships within the TCM theoretical system and uncover underlying associations, thereby making the TCM theory more scientific and standardized. Intelligent-assisted diagnostic and treatment systems can utilize TCM knowledge graphs to semantically infer the content of medical records, providing doctors with the most relevant diagnostic evidence and treatment options for patients. This, in turn, improves the efficiency of doctors, reduces misdiagnoses and missed diagnoses, and offers personalized diagnostic and treatment plans for patients to enhance the quality of patient care. Therefore, TCM-assisted diagnosis and treatment play a crucial role in modernizing TCM diagnosis and treatment. The key to promoting the intelligent development of TCM diagnosis and treatment is to integrate TCM knowledge graph information and build an intelligent TCM-assisted diagnostic and treatment decision-making system, with a focus on differential diagnosis and treatment, to provide decision-making assistance for TCM clinical diagnosis and treatment.

Currently, because the vast and complex TCM knowledge system is scattered across numerous ancient texts and literature, existing TCM prescription recommendations cannot effectively handle loose and unstructured knowledge, resulting in low utilization of TCM knowledge. TCM emphasizes a holistic approach, considering various factors such as the body, age, gender, and medical history of the patient, rather than simply treating a single symptom or disease. However, current TCM prescription recommendations lack consideration for personalization. Additionally, since traditional Chinese herbs have certain side effects, herbs must be properly combined during formulation of herb prescriptions to achieve a balance in medicine effects and to prevent the medicine effects from canceling each other or generating side effects. However, existing TCM prescription recommendations have limitations in handling herbal medicine compatibility, leading to various problems.

SUMMARY

To address the issues of low utilization of existing TCM prescription recommendation knowledge, inadequate consideration of personalization for patients, and ineffective handling of herbal medicine compatibility, the present disclosure provides a knowledge graph-based method for recommending TCM prescriptions.

To solve the foregoing technical problems, the present disclosure adopts the following technical solution: a knowledge graph-based method for recommending TCM prescriptions, including the following acts:

- act 1: collecting TCM data and preprocessing the data to remove duplicate data and standardize entity names;
- act 2: performing named entity recognition and relation extraction on the preprocessed TCM data to obtain an entity set E and a relation set R, and constructing a TCM knowledge graph by using entity elements in the entity set E as nodes and relation elements in the relation set R as connection lines between the nodes;
- act 3: performing representation learning for the TCM knowledge graph by using a ComplEx model, selecting all symptom nodes and all Chinese materia medica nodes in the TCM knowledge graph, selecting age information nodes, gender information nodes, efficacy information nodes, medicine property information nodes, syndrome information nodes, and treatment and principle nodes that have connection lines with the symptom nodes or Chinese materia medica nodes, representing the selected nodes as complex vectors, and then separately calculating symptom node vector representations s′ fused with other information and Chinese materia medica node vector representations h′ fused with other information;

$s^{'} = s + W_{a} \cdot a + W_{g} \cdot g + W_{tr} \cdot tr$

$h^{'} = h + W_{ef} \cdot ef + W_{pr} \cdot p + W_{sy} \cdot sy + W_{tr} \cdot tr;$

- where s denotes a symptom node vector representation; h denotes a Chinese materia medica node vector representation; a denotes an age vector representation; W_a denotes an age weight matrix; g denotes a gender vector representation; W_g denotes a gender weight matrix; tr denotes a treatment and principle vector representation; W_tr denotes a treatment method and principle weight matrix; ef denotes an efficacy vector representation; W_ef denotes an efficacy weight matrix; p denotes a medicine property vector representation; W_pr denotes a medicine property weight matrix; sy denotes a syndrome vector representation; and W_sy denotes a syndrome weight matrix;
- performing tensor-based knowledge graph embedding on the symptom node vector representations s′ fused with other information and the Chinese materia medica node vector representations h′ fused with other information by using an embedding layer of the ComplEx model, mapping the symptom node vector representations s′ fused with other information to a lower-dimensional vector space to obtain symptom node vector representations e_s′ for which the graph embedding model has been applied, and mapping the Chinese materia medica node vector representations h′ fused with other information to a lower-dimensional vector space to obtain Chinese materia medica node vector representations e_h′ for which the graph embedding model has been applied;
- training the ComplEx model by using a scoring function P(e_s′, r, e_h′) of the ComplEx model, and generating a recommendation training set; and
- denoting the symptom node vector representations e_s′ for which the graph embedding model has been applied and the Chinese materia medica node vector representations e_h′ for which the graph embedding model has been applied as follows:

$e_{s}^{'} = Re (e_{s}^{'}) + i Im (e_{s}^{'})$

$e_{h}^{'} = Re (e_{h}^{'}) + i Im (e_{h}^{'});$

- where Re(e_s′) is a real part of e_s′, Re(e_h′) is a real part of e_h′, Im(e_s′) is an imaginary part of e_s′, and Im(e_h′) is an imaginary part of e_h′.
- the scoring function P(e_s′, r, e_h′) is expressed by the following formula:

$P (e_{s}^{'}, r, e_{h}^{'}) = \sigma (Re \langle e_{s}^{'}, r, {\overline{e}}_{h}^{'} \rangle);$

$Re \langle e_{s}^{'}, r, {\overline{e}}_{h}^{'} \rangle = Re (\sum_{k = 1}^{K} e_{s k}^{'} r_{k} {\overline{e}}_{h k}^{'}) = \langle Re (e_{s}^{'}), Re (r), Re (e_{h}^{'}) \rangle + \langle Re (e_{s}^{'}), Im (r), Im (e_{h}^{'}) \rangle + \langle Im (e_{s}^{'}), Re (r), Im (e_{h}^{'}) \rangle - \langle Im (e_{s}^{'}), Im (r), Re (e_{h}^{'}) \rangle;$

- where r is a relation vector representation between e_s′ and e_h′, σ is an activation function, Re(r) is a real part of r, and Im(r) is an imaginary part of r;
- act 4: based on an entity coverage of knowledge graph (KG) in the recommendation training set, freezing or fine-tuning entity embedding vectors learned through training the ComplEx model;

$f (e_{h, t}) = \begin{cases} \text{Freeze} (v_{e_{h, t}}), & \text{if } coverage (e_{h, t}) < threshold \\ \text{Fine-tune} (v_{e_{h, t}}), & \text{otherwise} \end{cases};$

- where Freeze(v_e_h,t) represents freezing an entity embedding vector v_e_h,t, Fine-tune(v_e_h,t) represents fine-tuning the entity embedding vector v_e_h,t, and threshold represents a determining threshold for the entity coverage;
- act 5: learning feature information r_s of the symptom nodes and feature information r_h of the Chinese materia medica nodes by using a graph convolutional neural network, where the feature information r_s of the symptom nodes is obtained through vector representations of the symptom nodes at each layer in the graph convolutional neural network, and the feature information r_h of the Chinese materia medica nodes is obtained through vector representations of the Chinese materia medica nodes at each layer in the graph convolutional neural network;
- where for a symptom node s, a set of one-hop neighbor Chinese materia medica nodes thereof is denoted as N_s, and a message from neighbor nodes in the k-th layer is as follows:

$r_{N_{s}}^{k - 1} = \tanh ({AGGREGATE}_{MEAN} ({q_{h \to s}^{k - 1}})) = \tanh (\frac{1}{| N_{s} |} \sum_{h \in N_{s}} q_{h \to s}^{k - 1});$

- where a vector representation of the symptom node s at the k-th layer in the graph convolutional neural network is denoted as follows:

$r_{s}^{k} = \tanh (W_{s}^{k} \cdot CONCAT (r_{s}^{k - 1}, r_{N_{s}}^{k - 1}) + b_{s}^{k});$

- where |N_s| represents a number of adjacent nodes for the symptom node, W_s^k represents a weight matrix for the symptom node at the k-th layer, b_s^k represents a bias term, tanh represents an activation function, CONCAT represents a vector concatenation operation, and q_h→s^k-1 represents information transmitted from Chinese materia medica nodes in the (k-1)-th layer to the symptom node;
- for a Chinese materia medica node h, a set of one-hop neighbor symptom nodes thereof is denoted as N_h, and a message from neighbor nodes in the k-th layer is as follows:

$r_{N_{h}}^{k - 1} = \tanh ({AGGREGATE}_{MEAN} ({q_{s \to h}^{k - 1}})) = \tanh (\frac{1}{| N_{h} |} \sum_{s \in N_{h}} q_{s \to h}^{k - 1});$

- where a vector representation for the Chinese materia medica node h at the k-th layer in the graph convolutional neural network is denoted as follows:

$r_{h}^{k} = \tanh (W_{h}^{k} \cdot CONCAT (r_{h}^{k - 1}, r_{N_{h}}^{k - 1}) + b_{h}^{k});$

- where |N_h| represents a number of adjacent nodes for the Chinese materia medica node, W_h^k represents a weight matrix for the Chinese materia medica node at the k-th layer, b_h^k represents a bias term, tanh represents the activation function, CONCAT represents the vector concatenation operation, and q_s→h^k-1 represents information transmitted from symptom nodes in the (k-1)-th layer to the Chinese materia medica node;
- act 6: combining graph features with a recommendation system by using an attention mechanism, and calculating a Query matrix, a Key matrix, and a Value matrix for the symptom nodes according to the symptom node vector representations e_s′ for which the graph embedding model has been applied and the feature information r_s of the symptom nodes:

$Q_{s} = W_{Q}^{(s)} e_{s}^{'}; K_{s} = W_{K}^{(s)} r_{s}; V_{s} = W_{V}^{(s)} e_{s}^{'};$

- calculating a Query matrix, a Key matrix, and a Value matrix for the Chinese materia medica nodes according to the Chinese materia medica node vector representations e_h′ for which the graph embedding model has been applied and the feature information r_h of the Chinese materia medica nodes:

$Q_{h} = W_{Q}^{(h)} e_{h}^{'}; K_{h} = W_{K}^{(h)} r_{h}; V_{h} = W_{V}^{(h)} e_{h}^{'};$

- where W_Q^(s), W_K^(s), W_V^(s), W_Q^(h), W_K^(h), and W_V^(h) are parameter matrices learned from three linear transformation layers in a multi-head attention layer;
- calculating an attention matrix A_s for the symptom nodes and an attention matrix A_h for the Chinese materia medica nodes by using a softmax function:

$\begin{matrix} A_{s} = softmax (\frac{Q_{s} K_{s}^{T}}{\sqrt{d_{k}}}) \\ A_{h} = softmax (\frac{Q_{h} K_{h}^{T}}{\sqrt{d_{k}}}) \end{matrix}$

- where d_k is a dimension value; and

- calculating fused representation vectors e_s* of the symptom nodes and fused representation vectors e_h* of the Chinese materia medica nodes:

$e_{s}^{*} = A_{s} V_{s}$

$e_{h}^{*} = A_{h} V_{h};$

- act 7: collecting a set of symptom entities sc from a target patient and constructing a multi-hot vector x_sc:

$x_{s c} [i] = \begin{cases} 1, & \text{if } i \in s c \\ 0, & \text{otherwise} \end{cases},$

- where 1≤i≤K, and K represents a number of the symptom nodes in the TCM knowledge graph;
- calculating an overall symptom matrix E_s* based on the fused representation vectors e_s* of the symptom nodes:

$E_{s}^{*} = [\begin{matrix} e_{s_{1}}^{*} \\ e_{s_{2}}^{*} \\ ⋮ \\ e_{s_{k}}^{*} \end{matrix}],$

- extracting information from the overall symptom matrix E_s* by using the multi-hot vector x_sc as a mask, to obtain an identified syndrome matrix M_sc:

$M_{s c} = E_{s}^{*} \cdot diag (x_{s c}),$

- where the multi-hot vector x_sc is transformed into a diagonal matrix through a diag function, and non-zero rows in the identified syndrome matrix M_sc correspond to the fused representation vectors e_s* of symptom nodes in the set of symptom entities sc;
- performing single induction on the identified syndrome matrix M_sc by using an average pooling operation, to obtain single representation vectors e_sc:

$e_{s c} = \frac{1}{k} \sum_{i = 1}^{k} e_{s_{i}}^{*};$

- inputting the single representation vectors e_sc into a multi-layer perceptron for syndrome induction to obtain final syndrome representations,
- where the expression of the multi-layer perceptron is shown as follows:

$\begin{matrix} h_{1} = ReLU (W_{1} e_{sc} + b_{1}) \\ h_{2} = ReLU (W_{2} h_{1} + b_{2}) \\ \dots \\ h_{L} = ReLU (W_{L} h_{L - 1} + b_{L}); \end{matrix}$

- where W_L represents a weight matrix of the L-th layer, b_L represents a bias term of the L-th layer, and ReLU represents a non-linear activation function; and
- using an output of the multi-layer perceptron as final syndrome representation vectors e_z, that is, e_z=h_L;
- act 8: calculating an overall Chinese materia medica matrix E_H* based on the fused representation vectors e_h* of the Chinese materia medica nodes:

$E_{H}^{*} = [\begin{matrix} e_{h_{1}}^{*} \\ e_{h_{2}}^{*} \\ ⋮ \\ e_{h_{M}}^{*} \end{matrix}];$

- wherein M represents a total number of candidate Chinese materia medica nodes, and the candidate Chinese materia medica nodes are Chinese materia medica nodes in the TCM knowledge graph which have connection lines with symptom nodes in the set of symptom entities sc;
- calculating prediction probability vectors m(sc) based on the final syndrome representation vectors e_z and the overall Chinese materia medica matrix E_H*:
- m(sc)=σ(E_H*e_z^T), where σ is an activation function; and
- for each candidate Chinese materia medica, calculating a binary cross-entropy loss between the prediction probability and a true label, and summing the losses for all candidate Chinese materia medica; and for the set of symptom entities sc, obtaining a prediction probability vector based on m(sc), denoted as ŷ=[ŷ₁, ŷ₂, . . . , ŷ_M]^T, where each element in the prediction probability vector represents a prediction probability of a corresponding candidate Chinese materia medica node, and outputting, by a TCM recommending device, candidate Chinese materia medica nodes each with a prediction probability greater than a predetermined value as a recommended Chinese materia medica prescription for the set of symptom entities sc.

Preferably, in act 1, both structured and unstructured data are collected; the structured data includes TCM dictionaries, databases, and ontologies, while the unstructured data includes TCM literature, clinical records, and expert knowledge.

With the above technical solutions, the present disclosure achieves the following beneficial effects:

The present disclosure applies a knowledge graph embedding model, a multi-head attention mechanism, graph convolutions, and other techniques to combine graph features with a recommendation system. It comprehensively considers patient conditions for TCM prescription recommendations. Taking TCM case description texts as the research object and integrating with TCM knowledge graph information, the present disclosure draws from the clinical experience of renowned TCM experts and fully considers various individual factors such as medicinal properties, efficacy, medical conditions, and patient constitutions. Based on the holistic principles of TCM and the ideology of syndrome differentiation and treatment, the method selects different herbal combinations based on varying symptoms and medical conditions of patients. The proposed TCM prescription recommendation method, incorporating the knowledge graph, considers the complex relationships between individual signs as well as symptoms and medications. It analyzes specific patterns in each prescription comprehensively, providing doctors and patients with more reliable prescription recommendations. The present disclosure assists in clinical decision-making for TCM diagnosis and treatment.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flow chart of a knowledge graph-based method for recommending TCM prescriptions according to the present disclosure.

FIG. 2 is a schematic block diagram of a computer that can be used for implementing the method and the system according to the embodiments of the present disclosure.

DETAILED DESCRIPTION OF THE EMBODIMENTS

Embodiment 1

The present disclosure relates to a knowledge graph-based method and system for recommending information. The method includes the following acts 101 to 108:

In act 101, associated data for specific applications are collected and preprocessed to remove duplicate data and standardize entity names.
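By way of non-limiting illustration, the preprocessing of act 101 may be sketched in Python as follows. The alias table, the record layout (head, relation, tail), and the function names `standardize` and `preprocess` are hypothetical choices made for this sketch and are not prescribed by the present disclosure.

```python
import unicodedata

# Hypothetical alias table mapping variant names to one canonical entity name.
ALIASES = {"licorice root": "licorice", "gan cao": "licorice"}

def standardize(name: str) -> str:
    """Normalize Unicode form, case, and whitespace, then apply the alias table."""
    cleaned = " ".join(unicodedata.normalize("NFKC", name).lower().split())
    return ALIASES.get(cleaned, cleaned)

def preprocess(records):
    """Standardize (head, relation, tail) records and drop duplicates, keeping order."""
    seen, result = set(), []
    for head, relation, tail in records:
        triple = (standardize(head), relation.strip().lower(), standardize(tail))
        if triple not in seen:
            seen.add(triple)
            result.append(triple)
    return result

# Illustrative records only; not data from the disclosure.
raw = [
    ("Cough", "treated_by", "Gan Cao"),
    ("cough ", "treated_by", "licorice root"),
    ("Fever", "treated_by", "Licorice"),
]
clean = preprocess(raw)
```

In this sketch the first two records collapse into a single standardized record, so `clean` holds two entries.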

In act 102, the preprocessed associated data are subjected to named entity recognition and relation extraction to obtain an entity set E and a relation set R, and a knowledge graph is constructed by using entity elements in the entity set E as nodes and relation elements in the relation set R as connection lines between the nodes.
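As a minimal sketch of the graph construction in act 102, assuming that named entity recognition and relation extraction have already produced (head, relation, tail) triples, the entity set E, the relation set R, and the connection lines may be assembled as follows. The triples and the function name `build_graph` are illustrative assumptions.

```python
from collections import defaultdict

# Illustrative triples (head entity, relation, tail entity); not data from the disclosure.
triples = [
    ("cough", "treated_by", "licorice"),
    ("fever", "treated_by", "licorice"),
    ("licorice", "has_efficacy", "relieves cough"),
]

def build_graph(triples):
    """Return the entity set E, the relation set R, and an adjacency map of labeled edges."""
    entities, relations = set(), set()
    adjacency = defaultdict(list)
    for head, relation, tail in triples:
        entities.update((head, tail))   # both endpoints become nodes
        relations.add(relation)         # each relation type is recorded once
        adjacency[head].append((relation, tail))
    return entities, relations, dict(adjacency)

E, R, graph = build_graph(triples)
```

An adjacency map is used here only because it keeps one-hop neighbor lookups simple; any graph store would serve the same purpose.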

In act 103, representation learning is performed for the knowledge graph by using a ComplEx model. All head information nodes and all tail information nodes are selected in the knowledge graph, and head auxiliary information nodes that have connection lines with the head information nodes and tail auxiliary information nodes that have connection lines with the tail information nodes are selected. The selected nodes are represented as complex vectors, and then head information node vector representations s′ fused with other information and tail information node vector representations h′ fused with other information are separately calculated.

$\begin{matrix} s^{'} = s + W_{a 1} \times a 1 + W_{a 2} \times a 2 + \dots + W_{a i} \times ai \\ h^{'} = h + W_{b 1} \times b 1 + W_{b 2} \times b 2 + \dots + W_{b j} \times b j \end{matrix}$

- where s denotes a head information node vector representation; h denotes a tail information node vector representation; a1 denotes a first head auxiliary information vector representation; W_a1 denotes a first head auxiliary information weight matrix; a2 denotes a second head auxiliary information vector representation; W_a2 denotes a second head auxiliary information weight matrix; ai denotes an i-th head auxiliary information vector representation; W_ai denotes an i-th head auxiliary information weight matrix; b1 denotes a first tail auxiliary information vector representation; W_b1 denotes a first tail auxiliary information weight matrix; b2 denotes a second tail auxiliary information vector representation; W_b2 denotes a second tail auxiliary information weight matrix; bj denotes a j-th tail auxiliary information vector representation; and W_bj denotes a j-th tail auxiliary information weight matrix, where i and j are both natural numbers.
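The fusion of a node vector with its auxiliary information can be illustrated with NumPy as below. This is a sketch under stated assumptions: the embedding dimension, the random initialization, and the names `fuse` and `random_complex` are hypothetical, and only two auxiliary vectors (for example, age and gender) are shown.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # embedding dimension (assumed for illustration)

def random_complex(*shape):
    """Draw a complex array with independent normal real and imaginary parts."""
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)

def fuse(base, auxiliaries):
    """Return base + sum_i W_i @ a_i for (W_i, a_i) pairs of auxiliary information."""
    fused = base.astype(complex)
    for weight, vector in auxiliaries:
        fused = fused + weight @ vector
    return fused

s = random_complex(d)                               # head information node vector
age, gender = random_complex(d), random_complex(d)  # auxiliary vectors a1, a2
W_a, W_g = random_complex(d, d), random_complex(d, d)

s_fused = fuse(s, [(W_a, age), (W_g, gender)])
```

The same helper applies to a tail information node by passing its own auxiliary pairs.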

Tensor-based knowledge graph embedding is performed on the head information node vector representations s′ fused with auxiliary information and the tail information node vector representations h′ fused with auxiliary information by using an embedding layer of the ComplEx model. The head information node vector representations s′ fused with other information are mapped to a lower-dimensional vector space to obtain head information node vector representations e_s′ for which the graph embedding model has been applied, and the tail information node vector representations h′ fused with other information are mapped to a lower-dimensional vector space to obtain tail information node vector representations e_h′ for which the graph embedding model has been applied.

The ComplEx model is trained by using a scoring function P(e_s′, r, e_h′) of the ComplEx model, and a recommendation training set is generated.

The head information node vector representations e_s′ for which the graph embedding model has been applied and the tail information node vector representations e_h′ for which the graph embedding model has been applied are denoted as follows:

$\begin{matrix} e_{s}^{'} = Re (e_{s}^{'}) + i Im (e_{s}^{'}) \\ e_{h}^{'} = Re (e_{h}^{'}) + i Im (e_{h}^{'}); \end{matrix}$

- where Re(e_s′) is a real part of e_s′, Re(e_h′) is a real part of e_h′, Im(e_s′) is an imaginary part of e_s′, and Im(e_h′) is an imaginary part of e_h′.

The scoring function P(e_s′, r, e_h′) is expressed by the following formula:

$P (e_{s}^{'}, r, e_{h}^{'}) = \sigma (Re \langle e_{s}^{'}, r, {\bar{e}}_{h}^{'} \rangle);$

$\begin{matrix} Re \langle e_{s}^{'}, r, {\bar{e}}_{h}^{'} \rangle = Re (\sum_{k = 1}^{K} e_{s k}^{'} r_{k} {\bar{e}}_{h k}^{'}) \\ = \langle Re (e_{s}^{'}), Re (r), Re (e_{h}^{'}) \rangle + \langle Re (e_{s}^{'}), Im (r), Im (e_{h}^{'}) \rangle + \langle Im (e_{s}^{'}), Re (r), Im (e_{h}^{'}) \rangle - \langle Im (e_{s}^{'}), Im (r), Re (e_{h}^{'}) \rangle; \end{matrix}$

- where r is a relation vector representation between e_s′ and e_h′, σ is an activation function, Re(r) is a real part of r, and Im(r) is an imaginary part of r.
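The scoring function can be checked numerically with plain Python complex numbers. The sketch below computes the real part of the trilinear product directly and again through the four-term expansion, and the two agree. The logistic sigmoid is assumed for the activation σ, and the sample vectors are arbitrary illustrative values.

```python
import math

def trilinear(a, b, c):
    """Real trilinear product <a, b, c> = sum_k a_k * b_k * c_k."""
    return sum(x * y * z for x, y, z in zip(a, b, c))

def complex_score(e_s, r, e_h):
    """Re(sum_k e_s[k] * r[k] * conj(e_h[k])), computed directly."""
    return sum(s * rel * h.conjugate() for s, rel, h in zip(e_s, r, e_h)).real

def expanded_score(e_s, r, e_h):
    """The same quantity written as four real trilinear products."""
    re = lambda v: [z.real for z in v]
    im = lambda v: [z.imag for z in v]
    return (trilinear(re(e_s), re(r), re(e_h))
            + trilinear(re(e_s), im(r), im(e_h))
            + trilinear(im(e_s), re(r), im(e_h))
            - trilinear(im(e_s), im(r), re(e_h)))

def probability(e_s, r, e_h):
    """Apply a logistic sigmoid, assumed here as the activation sigma."""
    return 1.0 / (1.0 + math.exp(-complex_score(e_s, r, e_h)))

# Arbitrary two-dimensional example vectors.
e_s = [1 + 2j, 0.5 - 1j]
r = [0.3 + 0.1j, -1 + 0.5j]
e_h = [2 - 1j, 1 + 1j]
```

Because the tail vector enters through its complex conjugate, swapping e_s and e_h generally changes the score, which is what allows asymmetric relations to be modeled.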

In act 104, based on coverage of knowledge graph (KG) entities in the recommendation training set, entity embedding vectors learned through training the ComplEx model are frozen or fine-tuned.

$f (e_{h, t}) = \begin{cases} \text{Freeze} (v_{e_{h, t}}), & \text{if } coverage (e_{h, t}) < threshold \\ \text{Fine-tune} (v_{e_{h, t}}), & \text{otherwise} \end{cases};$

- where Freeze(v_e_h,t) represents freezing an entity embedding vector v_e_h,t, Fine-tune(v_e_h,t) represents fine-tuning the entity embedding vector v_e_h,t, and threshold represents a determining threshold for the entity coverage.
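The piecewise rule of act 104 may be sketched as follows. The disclosure does not define how coverage is measured, so this example assumes, purely for illustration, that coverage is the fraction of training samples in which an entity appears. The names `entity_coverage` and `plan_updates` and the threshold value are likewise hypothetical.

```python
from collections import Counter

def entity_coverage(training_set):
    """Fraction of training samples in which each entity appears (assumed definition)."""
    counts = Counter(entity for sample in training_set for entity in set(sample))
    return {entity: n / len(training_set) for entity, n in counts.items()}

def plan_updates(entities, coverage, threshold):
    """Map each entity to 'freeze' or 'fine-tune', following the piecewise rule."""
    return {
        e: "freeze" if coverage.get(e, 0.0) < threshold else "fine-tune"
        for e in entities
    }

# Illustrative training samples, each a list of entity names.
training_set = [["cough", "licorice"], ["fever", "licorice"], ["cough", "ginger"]]
coverage = entity_coverage(training_set)
plan = plan_updates(["licorice", "ginger", "ginseng"], coverage, threshold=0.5)
```

Entities that rarely occur in the recommendation training set keep their pretrained embeddings, while well-covered entities are allowed to adapt.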

In act 105, feature information of the head information nodes and feature information of the tail information nodes are learnt by using a graph convolutional neural network, where the feature information of the head information nodes is obtained through vector representations of the head information nodes at each layer in the graph convolutional neural network, and the feature information of the tail information nodes is obtained through vector representations of the tail information nodes at each layer in the graph convolutional neural network.

For a head information node s, a set of one-hop neighbor tail information nodes thereof is denoted as N_s, and a message from neighbor nodes in the k-th layer is as follows:

$r_{N_{s}}^{k - 1} = \tanh ({AGGREGATE}_{MEAN} ({q_{h \to s}^{k - 1}})) = \tanh (\frac{1}{| N_{s} |} \sum_{h \in N_{s}} q_{h \to s}^{k - 1});$

- where a vector representation of a head information node s at the k-th layer in the graph convolutional neural network is denoted as follows:

$r_{s}^{k} = \tanh (W_{s}^{k} \cdot CONCAT (r_{s}^{k - 1}, r_{N_{s}}^{k - 1}) + b_{s}^{k});$

- where |N_s| represents the number of adjacent nodes for the head information node, W_s^k represents a weight matrix for the head information node at the k-th layer, b_s^k represents a bias term, tanh represents an activation function, CONCAT represents a vector concatenation operation, and q_h→s^k-1 represents information transmitted from tail information nodes in the (k-1)-th layer to the head information node.

For a tail information node h, a set of one-hop neighbor head information nodes thereof is denoted as N_h, and a message from neighbor nodes in the k-th layer is as follows:

$r_{N_{h}}^{k - 1} = \tanh ({AGGREGATE}_{MEAN} ({q_{s \to h}^{k - 1}})) = \tanh (\frac{1}{| N_{h} |} \sum_{s \in N_{h}} q_{s \to h}^{k - 1});$

- where a vector representation for the tail information node h at the k-th layer in the graph convolutional neural network is denoted as follows:

$r_{h}^{k} = \tanh (W_{h}^{k} \cdot CONCAT (r_{h}^{k - 1}, r_{N_{h}}^{k - 1}) + b_{h}^{k});$

- where |N_h| represents the number of adjacent nodes for the tail information node, W_h^k represents a weight matrix for the tail information node at the k-th layer, b_h^k represents a bias term, tanh represents an activation function, CONCAT represents a vector concatenation operation, and q_s→h^k-1 represents information transmitted from head information nodes in the (k-1)-th layer to the tail information node.
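One layer of the message passing in act 105 can be sketched with NumPy as follows. The form of the messages q is not specified above, so the sketch assumes they are already available as real-valued vectors. The dimensions, the random values, and the function names are illustrative.

```python
import numpy as np

def neighbor_message(messages):
    """tanh of the mean of the messages q sent by the one-hop neighbors."""
    return np.tanh(np.mean(messages, axis=0))

def update_node(r_prev, r_neighbors, W, b):
    """r^k = tanh(W^k . CONCAT(r^{k-1}, r_N^{k-1}) + b^k)."""
    return np.tanh(W @ np.concatenate([r_prev, r_neighbors]) + b)

rng = np.random.default_rng(0)
d = 3
r_s_prev = rng.normal(size=d)            # head node state at layer k-1
q_from_tails = rng.normal(size=(2, d))   # messages from |N_s| = 2 neighbor tail nodes
W_s = rng.normal(size=(d, 2 * d))        # maps the concatenation back to d dimensions
b_s = np.zeros(d)

r_N_s = neighbor_message(q_from_tails)
r_s_next = update_node(r_s_prev, r_N_s, W_s, b_s)
```

The update for a tail information node is the same computation with N_h, W_h^k, and b_h^k in place of N_s, W_s^k, and b_s^k.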

In act 106, the graph features are combined with a recommendation system by using an attention mechanism, and a Query matrix, a Key matrix, and a Value matrix for the head information nodes are calculated according to the head information node vector representations for which the graph embedding model has been applied and the feature information of the head information nodes:

$Q_{s} = W_{Q}^{(s)} e_{s}^{'}; K_{s} = W_{K}^{(s)} r_{s}; V_{s} = W_{V}^{(s)} e_{s}^{'} .$

A Query matrix, a Key matrix, and a Value matrix for the tail information nodes are calculated according to the tail information node vector representations e_h′ for which the graph embedding model has been applied and the feature information r_h of the tail information nodes:

$Q_{h} = W_{Q}^{(h)} e_{h}^{'}; K_{h} = W_{K}^{(h)} r_{h}; V_{h} = W_{V}^{(h)} e_{h}^{'} .$

- where W_Q^(s), W_K^(s), W_V^(s), W_Q^(h), W_K^(h), and W_V^(h) are parameter matrices learned from three linear transformation layers in a multi-head attention layer.

An attention matrix A_s for the head information nodes and an attention matrix A_h for the tail information nodes are calculated by using a softmax function:

$A_{s} = softmax (\frac{Q_{s} K_{s}^{T}}{\sqrt{d_{k}}})$

$A_{h} = softmax (\frac{Q_{h} K_{h}^{T}}{\sqrt{d_{k}}}),$

where d_k is a dimension value.

Fused representation vectors e_s* of the head information nodes and fused representation vectors e_h* of the tail information nodes are calculated:

$e_{s}^{*} = A_{s} V_{s}$

$e_{h}^{*} = A_{h} V_{h} .$
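The attention-based fusion of act 106 may be sketched for a single attention head as follows. The sketch assumes real-valued inputs (for example, the real and imaginary parts of each embedding concatenated), stores one node per row, and takes d_k as the width of the Key projection. All sizes and the function names are hypothetical.

```python
import numpy as np

def softmax(x):
    """Row-wise softmax with max-subtraction for numerical stability."""
    shifted = x - x.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def fuse_with_attention(emb, feats, W_Q, W_K, W_V):
    """Rows of emb are embedded nodes e'; rows of feats are their GCN features r."""
    Q, K, V = emb @ W_Q, feats @ W_K, emb @ W_V
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # scaled dot-product attention
    return A, A @ V                              # attention matrix and fused vectors

rng = np.random.default_rng(0)
n, d, d_k = 5, 8, 4                      # nodes, input width, projection width
emb = rng.normal(size=(n, d))
feats = rng.normal(size=(n, d))
W_Q, W_K, W_V = (rng.normal(size=(d, d_k)) for _ in range(3))

A_s, E_s_star = fuse_with_attention(emb, feats, W_Q, W_K, W_V)
```

A multi-head variant would repeat this computation with separate projection matrices per head and combine the results; that step is omitted here for brevity.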

In act 107, a multi-hot vector x_sc is constructed based on a set sc of head information entities to be identified:

$x_{sc} [i] = \begin{cases} 1, & \text{if } i \in sc \\ 0, & \text{otherwise} \end{cases} .$

An overall head information matrix E_s* is calculated based on the fused representation vectors e_s* of the head information nodes:

$E_{s}^{*} = [\begin{matrix} e_{s_{1}}^{*} \\ e_{s_{2}}^{*} \\ ⋮ \\ e_{s_{k}}^{*} \end{matrix}] .$

Information is extracted from the overall head information matrix E_s* by using the multi-hot vector x_sc as a mask, to obtain a head information characterization matrix M_sc:

$M_{sc} = E_{s}^{*} \cdot diag (x_{sc}),$

- where the multi-hot vector x_sc is transformed into a diagonal matrix through a diag function, and non-zero rows in the head information characterization matrix M_sc correspond to the fused representation vectors e_s* of head information nodes in the set of head information entities sc.

Single induction is performed on the head information characterization matrix M_sc by using an average pooling operation, to obtain single representation vectors e_sc:

$e_{sc} = \frac{1}{k} \sum_{i = 1}^{k} e_{s_{i}}^{*} .$
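The masking and pooling steps may be illustrated as below. Two assumptions are made and should be read as such: the fused vectors are stored as rows, so the diagonal mask is applied on the left to zero out whole rows, and the average is taken over the number of entities actually present in sc. Indices are zero-based in the code.

```python
import numpy as np

def multi_hot(indices, size):
    """x[i] = 1 if node i is in the observed set, else 0."""
    x = np.zeros(size)
    x[list(indices)] = 1.0
    return x

def pool_selected(E_star, x):
    """Mask rows of E_star with diag(x), then average the selected rows."""
    M = np.diag(x) @ E_star          # rows outside the set become zero
    return M, M.sum(axis=0) / x.sum()

E_star = np.arange(12, dtype=float).reshape(4, 3)  # 4 nodes, 3-dim fused vectors
x_sc = multi_hot({0, 2}, size=4)
M_sc, e_sc = pool_selected(E_star, x_sc)
```

Here nodes 0 and 2 are selected, so `e_sc` is the mean of the first and third rows of `E_star`.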

The single representation vectors e_sc are input into a multi-layer perceptron for head information characterization induction to obtain a final head information characterization.

The expression of the multi-layer perceptron is shown as follows:

$h_{1} = ReLU (W_{1} e_{sc} + b_{1})$

$h_{2} = ReLU (W_{2} h_{1} + b_{2})$

$\dots$

$h_{L} = ReLU (W_{L} h_{L - 1} + b_{L});$

- where W_L represents a weight matrix of the L-th layer, b_L represents a bias term of the L-th layer, and ReLU represents a non-linear activation function.

An output of the multi-layer perceptron is used as final head information characterization vectors e_z, that is, e_z=h_L.
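A minimal forward pass of such a multi-layer perceptron is sketched below. The layer widths and random weights are assumptions for illustration; in practice the weights would be learned during training.

```python
import numpy as np

def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)

def mlp(e_sc, layers):
    """Apply h_l = ReLU(W_l h_{l-1} + b_l) for each (W_l, b_l); return h_L."""
    h = e_sc
    for W, b in layers:
        h = relu(W @ h + b)
    return h

rng = np.random.default_rng(0)
sizes = [3, 8, 4]  # input, hidden, and output widths (assumed)
layers = [(rng.normal(size=(m, n)), np.zeros(m)) for n, m in zip(sizes, sizes[1:])]

e_z = mlp(np.array([3.0, 4.0, 5.0]), layers)
```

Because every layer ends in ReLU, including the last, the resulting vector e_z has no negative components.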

In act 108, an overall tail information matrix E_H* is calculated based on the fused representation vectors e_h* of the tail information nodes:

$E_{H}^{*} = [\begin{matrix} e_{h_{1}}^{*} \\ e_{h_{2}}^{*} \\ ⋮ \\ e_{h_{M}}^{*} \end{matrix}] .$

- wherein M represents a total number of candidate tail information nodes, and the candidate tail information nodes are tail information nodes in the knowledge graph which have connection lines with head information nodes in the set of head information entities sc.

Prediction probability vectors m(sc) are calculated based on the final head information characterization vectors e_z and the overall tail information matrix E_H*:

- m(sc)=σ(E_H*e_z^T), where σ is an activation function.

For each candidate tail information node, a binary cross-entropy loss between the prediction probability and a true label is calculated, and the losses for all candidate tail information nodes are summed. For the set of head information entities sc, a prediction probability vector is obtained based on m(sc) and denoted as ŷ=[ŷ₁, ŷ₂, . . . , ŷ_M]^T, where each element in the prediction probability vector represents a prediction probability of a corresponding candidate tail information node. The model outputs candidate tail information nodes each with a prediction probability greater than a predetermined value as recommended information for the set of head information entities sc.
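The prediction, loss, and selection steps of act 108 may be sketched as follows. The logistic sigmoid is assumed for σ, the cutoff of 0.5 stands in for the predetermined value, and the candidate names, matrix entries, and labels are illustrative only.

```python
import numpy as np

def predict(E_H, e_z):
    """m(sc) = sigmoid(E_H @ e_z): one probability per candidate tail node."""
    return 1.0 / (1.0 + np.exp(-(E_H @ e_z)))

def bce_loss(y_hat, y_true, eps=1e-12):
    """Binary cross-entropy summed over all candidates."""
    y_hat = np.clip(y_hat, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.sum(y_true * np.log(y_hat) + (1 - y_true) * np.log(1 - y_hat)))

def recommend(names, y_hat, cutoff=0.5):
    """Return candidates whose predicted probability exceeds the cutoff."""
    return [name for name, p in zip(names, y_hat) if p > cutoff]

names = ["licorice", "ginger", "ginseng"]               # illustrative candidates
E_H = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])  # one row per candidate
e_z = np.array([2.0, -1.0])                             # final characterization vector
y_hat = predict(E_H, e_z)
loss = bce_loss(y_hat, np.array([1.0, 0.0, 0.0]))
picked = recommend(names, y_hat)
```

The loss is used only during training; at inference time only `predict` and `recommend` are needed.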

In addition, knowledge graph-based system for recommending information include data processing and knowledge graph construction module, knowledge graph feature extraction module, multi head attention mechanism feature fusion module, and recommendation module implemented by the computer. Among them, the data processing and knowledge graph construction module is run by the computer to perform acts 101 to 102 described above. The knowledge graph feature extraction module is run by a computer to perform acts 103 to 105 described above. The multi head attention mechanism feature fusion module is run by a computer to perform act 106 describe above. And the recommendation module is run by a computer to perform acts 107 to 108.

In Embodiment 1, by constructing a knowledge graph, a large amount of unstructured and structured data can be integrated into a coherent knowledge network, improving the availability, retrievability, and comprehensibility of entities. The system is therefore able to effectively organize and utilize the complex relationships between entities.

In addition, the present application integrates a knowledge graph with a graph neural network (GNN) architecture. Specifically, a graph neural network architecture based on the ComplEx model is designed, which combines knowledge graph representation learning with GNN technology. The architecture not only learns low-dimensional complex vector representations of entities and relations, but also integrates different types of auxiliary information and dynamically assigns the importance of each type of information using type weight matrices, which enhances the model's consideration of object differences and requirements. This fusion also improves the ability of the graph neural network structure to capture and understand high-dimensional semantic associations between different entities, which helps to improve the accuracy of the recommended information.

A multi-head attention mechanism is adopted to integrate the semantic associations between head nodes and tail nodes in the knowledge graph and to dynamically assign node importance. This ensures that the interactions between different head nodes and the synergies among tail nodes are considered in the information recommendation process, improving the accuracy and interpretability of the recommendation system.

A multi-layer perceptron is used to characterize the nonlinear interactions within a set of head nodes and to extract an overall head information representation from the fused representations of multiple head nodes, so that the correspondence between head information and head information representation is predicted accurately, providing a more accurate basis for personalized information recommendation.

According to the disclosed method, both knowledge graph representation learning and information recommendation tasks were optimized during the training process. The performance of the model in both knowledge representation and actual information recommendation was ensured by a joint loss function. The experimental results showed that it outperformed other comparative models in various evaluation indicators, demonstrating high accuracy and practicality.

Embodiment 2

Given the advantages of the method in Embodiment 1 in information recommendation, it is particularly suitable for recommending TCM prescriptions.

A knowledge graph-based method for recommending TCM prescriptions includes the following acts 101 to 108.

In act 101, TCM data are collected and preprocessed to remove duplicate data and standardize entity names. Both structured and unstructured data are collected. The structured data includes TCM dictionaries, databases, and ontologies, while the unstructured data includes TCM literature, clinical records, and expert knowledge.

In act 102, the preprocessed TCM data are subjected to named entity recognition and relation extraction to obtain an entity set E and a relation set R, and a TCM knowledge graph is constructed by using entity elements in the entity set E as nodes and relation elements in the relation set R as connection lines between the nodes.
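As a rough illustration of acts 101 and 102, the following sketch builds an entity set, a relation set, and an adjacency structure from already-extracted triples. The triples, entity names, and relation names are placeholders invented for this example, and the named entity recognition and relation extraction steps themselves are assumed to have run upstream.

```python
from collections import defaultdict

# Hypothetical (head, relation, tail) triples produced by entity recognition
# and relation extraction; names are placeholders, not real TCM data.
triples = [
    ("symptom_1", "treated_by", "herb_1"),
    ("symptom_1", "belongs_to", "syndrome_1"),
    ("herb_1", "has_efficacy", "efficacy_1"),
    ("symptom_1", "treated_by", "herb_1"),  # duplicate, removed below
]

def build_graph(triples):
    """Return (entity set E, relation set R, adjacency) from deduplicated triples."""
    unique = sorted(set(triples))  # act 101: remove duplicate data
    entities = sorted({e for h, _, t in unique for e in (h, t)})
    relations = sorted({r for _, r, _ in unique})
    adjacency = defaultdict(list)  # node -> list of (relation, neighbor)
    for h, r, t in unique:
        adjacency[h].append((r, t))
    return entities, relations, dict(adjacency)

entities, relations, adjacency = build_graph(triples)
```

Entity-name standardization (for example, mapping synonyms to one canonical name) would be applied before deduplication and is omitted here.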

In act 103, representation learning is performed for the TCM knowledge graph by using a ComplEx model. All symptom nodes and all Chinese materia medica nodes are selected in the TCM knowledge graph, and age information nodes, gender information nodes, efficacy information nodes, medicine property information nodes, syndrome information nodes, and treatment and principle nodes that have connection lines with the symptom nodes or Chinese materia medica nodes are selected. The selected nodes are represented as complex vectors, and then symptom node vector representations s′ fused with other information and Chinese materia medica node vector representations h′ fused with other information are separately calculated.

$s^{'} = s + W_{a} \cdot a + W_{g} \cdot g + W_{tr} \cdot tr$

$h^{'} = h + W_{ef} \cdot ef + W_{pr} \cdot p + W_{sy} \cdot sy + W_{tr} \cdot tr;$

- where s denotes a symptom node vector representation; h denotes a Chinese materia medica node vector representation; a denotes an age vector representation; W_a denotes an age weight matrix; g denotes a gender vector representation; W_g denotes a gender weight matrix; tr denotes a treatment and principle vector representation; W_tr denotes a treatment and principle weight matrix; ef denotes an efficacy vector representation; W_ef denotes an efficacy weight matrix; p denotes a medicine property vector representation; W_pr denotes a medicine property weight matrix; sy denotes a syndrome vector representation; and W_sy denotes a syndrome weight matrix.
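The two fusion formulas above amount to adding weighted auxiliary vectors to a base node vector. The sketch below shows this with NumPy under several assumptions that the text does not fix: an embedding dimension of 4, randomly initialized complex vectors, real-valued square weight matrices, and a helper named `fuse`.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # hypothetical embedding dimension

def cvec():
    # Random complex vector standing in for a learned node representation.
    return rng.normal(size=d) + 1j * rng.normal(size=d)

def wmat():
    # Real-valued type weight matrix (an assumption; the text does not say
    # whether the weight matrices are real or complex).
    return rng.normal(size=(d, d))

def fuse(base, weighted_parts):
    """Return base + sum_i W_i @ v_i for each (W_i, v_i) pair."""
    out = base.copy()
    for W, v in weighted_parts:
        out = out + W @ v
    return out

s, a, g, tr = cvec(), cvec(), cvec(), cvec()
h, ef, p, sy = cvec(), cvec(), cvec(), cvec()
W_a, W_g, W_tr = wmat(), wmat(), wmat()
W_ef, W_pr, W_sy = wmat(), wmat(), wmat()

s_fused = fuse(s, [(W_a, a), (W_g, g), (W_tr, tr)])                 # s'
h_fused = fuse(h, [(W_ef, ef), (W_pr, p), (W_sy, sy), (W_tr, tr)])  # h'
```

Note that W_tr is shared between the two formulas, exactly as written in the text.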

Tensor-based knowledge graph embedding is performed on the symptom node vector representations s′ fused with other information and the Chinese materia medica node vector representations h′ fused with other information by using an embedding layer of the ComplEx model. The symptom node vector representations s′ fused with other information are mapped to a lower-dimensional vector space to obtain symptom node vector representations e_s′ for which the graph embedding model has been applied, and the Chinese materia medica node vector representations h′ fused with other information are mapped to a lower-dimensional vector space to obtain Chinese materia medica node vector representations e_h′ for which the graph embedding model has been applied.

The ComplEx model is trained by using a scoring function P(e_s′, r, e_h′) of the ComplEx model, and a recommendation training set is generated.

The symptom node vector representations e_s′ for which the graph embedding model has been applied and the Chinese materia medica node vector representations e_h′ for which the graph embedding model has been applied are denoted as follows:

$e_{s}^{'} = Re (e_{s}^{'}) + i Im (e_{s}^{'})$

$e_{h}^{'} = Re (e_{h}^{'}) + i Im (e_{h}^{'});$

where Re(e_s′) is a real part of e_s′, Re(e_h′) is a real part of e_h′, Im(e_s′) is an imaginary part of e_s′, and Im(e_h′) is an imaginary part of e_h′.

The scoring function P(e_s′, r, e_h′) is expressed by the following formula:

- where r is a relation vector representation between e_s′ and e_h′, σ is an activation function, Re(r) is a real part of r, and Im(r) is an imaginary part of r.
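The formula itself does not appear in the text as reproduced here. As a hedged illustration only, the sketch below assumes the commonly used ComplEx form, in which the score is σ applied to the real part of the trilinear product of r, e_s′, and the complex conjugate of e_h′, with σ taken to be the logistic sigmoid. It uses only the symbols named in the clause above; the example vectors are arbitrary.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def complex_score(e_s, r, e_h):
    """Assumed form: sigma(Re(sum_j r_j * e_s_j * conj(e_h_j)))."""
    return sigmoid(np.real(np.sum(r * e_s * np.conj(e_h))))

def complex_score_expanded(e_s, r, e_h):
    """Same score written with real and imaginary parts only."""
    raw = (np.sum(r.real * e_s.real * e_h.real)
           + np.sum(r.real * e_s.imag * e_h.imag)
           + np.sum(r.imag * e_s.real * e_h.imag)
           - np.sum(r.imag * e_s.imag * e_h.real))
    return sigmoid(raw)

# Arbitrary example vectors (not taken from the disclosure).
e_s = np.array([1 + 1j, 0.5 - 0.5j])
r = np.array([0.5 + 0j, 0 + 1j])
e_h = np.array([1 - 1j, 0.5 + 0.5j])

score = complex_score(e_s, r, e_h)
```

The expanded version makes explicit how Re(r) and Im(r) weight the four real-valued products, which is why both parts of r appear in the clause above.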

In act 104, based on coverage of knowledge graph (KG) entities in the recommendation training set, entity embedding vectors learned through training the ComplEx model are frozen or fine-tuned.

$f (e_{h, t}) = \begin{cases} \mathrm{Freeze} (v_{e_{h, t}}), & \text{if } \mathrm{coverage} (e_{h, t}) < \mathrm{threshold} \\ \mathrm{Fine\text{-}tune} (v_{e_{h, t}}), & \text{otherwise} \end{cases};$

where Freeze(v_e_h,t) represents freezing an entity embedding vector v_e_h,t, Fine-tune(v_e_h,t) represents fine-tuning an entity embedding vector v_e_h,t, and threshold represents a determining threshold for the entity coverage.
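The piecewise rule above can be sketched as a simple per-entity decision. The text does not define how coverage is measured, so this example assumes one plausible definition (the fraction of training samples that mention the entity); the sample data, the 0.5 threshold, and the function names are likewise hypothetical.

```python
def entity_coverage(entity, samples):
    """Assumed definition: share of training samples that contain the entity."""
    if not samples:
        return 0.0
    return sum(entity in sample for sample in samples) / len(samples)

def embedding_policy(entities, samples, threshold):
    """Map each entity to 'freeze' (coverage < threshold) or 'fine-tune'."""
    return {
        e: "freeze" if entity_coverage(e, samples) < threshold else "fine-tune"
        for e in entities
    }

# Hypothetical recommendation training set: each sample is a set of entities.
samples = [
    {"symptom_1", "herb_1"},
    {"symptom_1", "herb_2"},
    {"symptom_1", "herb_1"},
    {"symptom_2", "herb_1"},
]
policy = embedding_policy(
    ["symptom_1", "symptom_2", "herb_1", "herb_2"], samples, threshold=0.5
)
```

In a deep learning framework, a "freeze" decision would typically be realized by excluding the corresponding embedding rows from gradient updates.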

In act 105, feature information of the symptom nodes and feature information of the Chinese materia medica nodes are learnt by using the graph convolutional neural network, where the feature information of the symptom nodes is obtained through vector representations of the symptom nodes at each layer in the graph convolutional neural network, and the feature information of the Chinese materia medica nodes is obtained through vector representations of the Chinese materia medica nodes at each layer in the graph convolutional neural network.

For a symptom node s, a set of one-hop neighbor Chinese materia medica nodes thereof is denoted as N_s, and a message from neighbor nodes in the k-th layer is as follows:

$r_{N_{s}}^{k - 1} = \tanh (\mathrm{AGGREGATE}_{\mathrm{MEAN}} (\{q_{h \to s}^{k - 1}\})) = \tanh (\frac{1}{|N_{s}|} \sum_{h \in N_{s}} q_{h \to s}^{k - 1});$

where a vector representation of a symptom node s at the k-th layer in the graph convolutional neural network is denoted as follows:

$r_{s}^{k} = \tanh (W_{s}^{k} \cdot CONCAT (r_{s}^{k - 1}, r_{N_{s}}^{k - 1}) + b_{s}^{k});$

- where |N_s| represents the number of adjacent nodes for the symptom node, W_s^k represents a weight matrix for the symptom node at the k-th layer, b_s^k represents a bias term, tanh represents an activation function, CONCAT represents a vector concatenation operation, and q_h→s^(k-1) represents information transmitted from Chinese materia medica nodes in the (k-1)-th layer to the symptom node.

For a Chinese materia medica node h, a set of one-hop neighbor symptom nodes thereof is denoted as N_h, and a message from neighbor nodes in the k-th layer is as follows:

$r_{N_{h}}^{k - 1} = \tanh (\mathrm{AGGREGATE}_{\mathrm{MEAN}} (\{q_{s \to h}^{k - 1}\})) = \tanh (\frac{1}{|N_{h}|} \sum_{s \in N_{h}} q_{s \to h}^{k - 1});$

- where a vector representation for the Chinese materia medica node h at the k-th layer in the graph convolutional neural network is denoted as follows:

$r_{h}^{k} = \tanh (W_{h}^{k} \cdot CONCAT (r_{h}^{k - 1}, r_{N_{h}}^{k - 1}) + b_{h}^{k});$

- where |N_h| represents the number of adjacent nodes for the Chinese materia medica node, W_h^k represents a weight matrix for the Chinese materia medica node at the k-th layer, b_h^k represents a bias term, tanh represents an activation function, CONCAT represents a vector concatenation operation, and q_s→h^(k-1) represents information transmitted from symptom nodes in the (k-1)-th layer to the Chinese materia medica node.
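One propagation layer of act 105 can be sketched as follows. This is an illustrative reading, not the disclosed implementation: it assumes that the message q sent by a neighbor is simply that neighbor's vector from the previous layer, that every node has at least one neighbor, and that the toy graph, dimension, and random weights below are placeholders.

```python
import numpy as np

def aggregate_mean(messages):
    # tanh of the mean of the neighbor messages (AGGREGATE_MEAN above).
    return np.tanh(np.mean(messages, axis=0))

def update(r_prev, r_neigh, W, b):
    # tanh(W . CONCAT(r_prev, r_neigh) + b)
    return np.tanh(W @ np.concatenate([r_prev, r_neigh]) + b)

def gcn_layer(r_s, r_h, nbrs_s, nbrs_h, W_s, b_s, W_h, b_h):
    """Compute layer-k vectors for symptom and herb nodes from layer k-1."""
    new_s = {s: update(r_s[s], aggregate_mean([r_h[h] for h in nbrs_s[s]]), W_s, b_s)
             for s in r_s}
    new_h = {h: update(r_h[h], aggregate_mean([r_s[s] for s in nbrs_h[h]]), W_h, b_h)
             for h in r_h}
    return new_s, new_h

rng = np.random.default_rng(0)
d = 4  # hypothetical feature dimension
r_s = {"s1": rng.normal(size=d), "s2": rng.normal(size=d)}
r_h = {"h1": rng.normal(size=d), "h2": rng.normal(size=d)}
nbrs_s = {"s1": ["h1", "h2"], "s2": ["h1"]}  # N_s for each symptom node
nbrs_h = {"h1": ["s1", "s2"], "h2": ["s1"]}  # N_h for each herb node
W_s, W_h = rng.normal(size=(d, 2 * d)), rng.normal(size=(d, 2 * d))
b_s, b_h = np.zeros(d), np.zeros(d)

r_s_next, r_h_next = gcn_layer(r_s, r_h, nbrs_s, nbrs_h, W_s, b_s, W_h, b_h)
```

Both node types are updated from the same layer k-1 snapshot, so the order in which symptom and herb nodes are processed does not affect the result.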

In act 106, the graph features are combined with a recommendation system by using the attention mechanism model, and a Query matrix, a Key matrix, and a Value matrix for the symptom nodes are calculated according to the symptom node vector representations for which the graph embedding model has been applied and the feature information of the symptom nodes:

$Q_{s} = W_{Q}^{(s)} e_{s}^{'}; K_{s} = W_{K}^{(s)} r_{s}; V_{s} = W_{V}^{(s)} e_{s}^{'} .$

A Query matrix, a Key matrix, and a Value matrix for the Chinese materia medica nodes are calculated according to the Chinese materia medica node vector representations e_h′ for which the graph embedding model has been applied and the feature information r_h of the Chinese materia medica nodes:

$Q_{h} = W_{Q}^{(h)} e_{h}^{'}; K_{h} = W_{K}^{(h)} r_{h}; V_{h} = W_{V}^{(h)} e_{h}^{'} .$

- W_Q^(s), W_K^(s), W_V^(s), W_Q^(h), W_K^(h), W_V^(h) are parameter matrices learned from three linear transformation layers in a multi-head attention layer.

An attention matrix A_s for the symptom nodes and an attention matrix A_h for the Chinese materia medica nodes are calculated by using a softmax function:

$\begin{matrix} A_{s} = softmax (\frac{Q_{s} K_{s}^{T}}{\sqrt{d_{k}}}) \\ A_{h} = softmax (\frac{Q_{h} K_{h}^{T}}{\sqrt{d_{k}}}), \end{matrix}$

where d_k is a dimension value.

Fused representation vectors e_s* of the symptom nodes and fused representation vectors e_h* of the Chinese materia medica nodes are calculated:

$e_{s}^{*} = A_{s} V_{s}$

$e_{h}^{*} = A_{h} V_{h}.$
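The attention fusion of act 106 can be sketched for one node type as below. Several simplifications are assumptions of this example: a single attention head is shown rather than multiple heads, the embeddings are treated as real-valued (for instance, with real and imaginary parts concatenated), nodes are stacked one per row, and d_k is taken as the key dimension.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(E, R, W_Q, W_K, W_V):
    """Queries and values from embeddings E, keys from graph features R.

    E, R: (n_nodes, d) arrays with one node per row.
    Returns the fused vectors A @ V and the attention matrix A.
    """
    Q = E @ W_Q.T  # Q = W_Q e'
    K = R @ W_K.T  # K = W_K r
    V = E @ W_V.T  # V = W_V e'
    d_k = K.shape[-1]
    A = softmax(Q @ K.T / np.sqrt(d_k), axis=-1)
    return A @ V, A

rng = np.random.default_rng(0)
n, d = 5, 4  # hypothetical node count and dimension
E_s = rng.normal(size=(n, d))  # embeddings e_s' (real-valued stand-in)
R_s = rng.normal(size=(n, d))  # graph convolution features r_s
W_Q, W_K, W_V = (rng.normal(size=(d, d)) for _ in range(3))

E_s_fused, A_s = attention_fuse(E_s, R_s, W_Q, W_K, W_V)
```

The same function would be applied to the Chinese materia medica nodes with their own parameter matrices.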

In act 107, a set of symptom entities sc is collected from a target patient and a multi-hot vector x_sc is constructed:

$x_{sc} [i] = \begin{cases} 1, & \text{if } i \in sc \\ 0, & \text{otherwise} \end{cases},$

wherein 1≤i≤K, and K represents the number of the symptom nodes in the TCM knowledge graph.

An overall symptom matrix E_s* is calculated based on the fused representation vectors e_s* of the symptom nodes:

$E_{s}^{*} = [\begin{matrix} e_{s_{1}}^{*} \\ e_{s_{2}}^{*} \\ \vdots \\ e_{s_{K}}^{*} \end{matrix}] .$

Information is extracted from the overall symptom matrix E_s* by using the multi-hot vector x_sc as a mask, to obtain an identified syndrome matrix M_sc:

$M_{s c} = E_{s}^{*} \cdot diag (x_{s c}),$

where the multi-hot vector x_sc is transformed into a diagonal matrix through a diag function, and non-zero rows in the identified syndrome matrix M_sc correspond to the fused representation vectors e_s* of symptom nodes in the set of symptom entities sc.

Single induction is performed on the identified syndrome matrix M_sc by using an average pooling operation, to obtain single representation vectors e_sc:

$e_{sc} = \frac{1}{k} \sum_{i = 1}^{k} e_{s_{i}}^{*} .$
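The masking and pooling steps can be sketched as follows. Two choices here are assumptions of this example rather than statements of the text: indices are 0-based (the text uses 1≤i≤K), and because E_s* is stored with one node per row, the diagonal mask is applied from the left so that whole rows are zeroed. The symbol k is read as the number of symptoms in sc.

```python
import numpy as np

def multi_hot(sc, K):
    """Build x_sc of length K with ones at the (0-based) indices in sc."""
    x = np.zeros(K)
    x[sorted(sc)] = 1.0
    return x

def pooled_representation(E_s, sc):
    """Mask the overall symptom matrix with x_sc, then average the kept rows."""
    K = E_s.shape[0]
    x_sc = multi_hot(sc, K)
    M_sc = np.diag(x_sc) @ E_s  # rows of nodes outside sc become zero
    k = int(x_sc.sum())         # number of symptoms in the patient's set
    e_sc = M_sc.sum(axis=0) / k
    return e_sc, M_sc

# Toy fused symptom matrix with K = 4 nodes and dimension 3.
E_s = np.arange(12, dtype=float).reshape(4, 3)
e_sc, M_sc = pooled_representation(E_s, sc={0, 2})
```

Dividing by k rather than K means the zeroed rows do not dilute the average, which matches the 1/k factor in the pooling formula.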

The single representation vectors e_sc are input into the multi-layer perceptron for syndrome induction to obtain final syndrome representations.

The expression of the multi-layer perceptron is shown as follows:

$\begin{matrix} h_{1} = \mathrm{ReLU} (W_{1} e_{sc} + b_{1}) \\ h_{2} = \mathrm{ReLU} (W_{2} h_{1} + b_{2}) \\ \dots \\ h_{L} = \mathrm{ReLU} (W_{L} h_{L - 1} + b_{L}); \end{matrix}$

- where W_L represents a weight matrix of the L-th layer, b_L represents a bias term of the L-th layer, and ReLU represents a non-linear activation function.

An output of the multi-layer perceptron is used as final syndrome representation vectors e_z, that is, e_z=h_L.

In act 108, an overall Chinese materia medica matrix E_H* is calculated based on the fused representation vectors e_h* of the Chinese materia medica nodes:

$E_{H}^{*} = [\begin{matrix} e_{h_{1}}^{*} \\ e_{h_{2}}^{*} \\ \vdots \\ e_{h_{M}}^{*} \end{matrix}],$

wherein M represents a total number of candidate Chinese materia medica nodes, and the candidate Chinese materia medica nodes are Chinese materia medica nodes in the TCM knowledge graph which have connection lines with symptom nodes in symptom entities in the set sc of symptom entities.

Prediction probability vectors m(sc) are calculated based on the final syndrome representation vectors e_z and the overall Chinese materia medica matrix E_H*.

m(sc)=σ(E_H*e_z^T), where σ is an activation function.

For each candidate Chinese materia medica node, a binary cross-entropy loss between the prediction probability and a true label is calculated, and the losses for all candidate Chinese materia medica nodes are summed. For the set of symptom entities sc, a prediction probability vector is obtained based on m(sc) and is denoted as ŷ=[ŷ₁, ŷ₂, . . . , ŷ_M]^T, where each element in the prediction probability vector represents a prediction probability of a corresponding candidate Chinese materia medica node. The TCM recommending device outputs the candidate Chinese materia medica nodes each having a prediction probability greater than a predetermined value as a recommended Chinese materia medica prescription for the set of symptom entities sc.
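The prediction, loss, and thresholding steps of act 108 can be sketched together. This example assumes that σ is the logistic sigmoid (the text says only that it is an activation function) and uses a predetermined value of 0.5; the candidate names, matrix values, and labels are placeholders.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict(E_H, e_z):
    """m(sc) = sigma(E_H* e_z^T): one probability per candidate herb node."""
    return sigmoid(E_H @ e_z)

def bce_loss(y_hat, y_true, eps=1e-12):
    """Binary cross-entropy summed over all candidate nodes."""
    y_hat = np.clip(y_hat, eps, 1.0 - eps)
    return float(-np.sum(y_true * np.log(y_hat)
                         + (1.0 - y_true) * np.log(1.0 - y_hat)))

def recommend(y_hat, names, threshold=0.5):
    """Keep candidates whose predicted probability exceeds the threshold."""
    return [n for n, p in zip(names, y_hat) if p > threshold]

# Toy values: M = 3 candidate herb nodes, dimension 2.
names = ["herb_1", "herb_2", "herb_3"]
E_H = np.array([[2.0, 0.0], [0.0, -2.0], [1.0, 1.0]])
e_z = np.array([1.0, 1.0])
y_true = np.array([1.0, 0.0, 1.0])  # hypothetical ground-truth prescription

y_hat = predict(E_H, e_z)
loss = bce_loss(y_hat, y_true)
prescription = recommend(y_hat, names)
```

During training the summed loss would be minimized; at inference time only `predict` and `recommend` are needed.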

FIG. 2 shows a schematic block diagram of a computer that can be used for implementing the method and the system according to the embodiments of the present disclosure.

In FIG. 2, a central processing unit (CPU) 201 executes various processing according to a program stored in a read-only memory (ROM) 202 or a program loaded from a storage part 208 to a random access memory (RAM) 203. In the RAM 203, data needed at the time of execution of various processing and the like by the CPU 201 is also stored according to requirements. The CPU 201, the ROM 202 and the RAM 203 are connected to each other via a bus 204. An input/output interface 205 is also connected to the bus 204.

The following components are connected to the input/output interface 205: an input part 206 (including a keyboard, a mouse and the like); an output part 207 (including a display, such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD) and the like, as well as a loudspeaker and the like); the storage part 208 (including a hard disc and the like); and a communication part 209 (including a network interface card such as a LAN card, a modem and so on). The communication part 209 performs communication processing via a network such as the Internet. According to requirements, a driver 210 may also be connected to the input/output interface 205. A detachable medium 211 such as a magnetic disc, an optical disc, a magneto-optical disc, a semiconductor memory and the like may be installed on the driver 210 according to requirements, such that a computer program read therefrom is installed in the storage part 208 according to requirements.

In the case of carrying out the foregoing series of processing by software, programs constituting the software are installed from a network such as the Internet or a storage medium such as the detachable medium 211.

Those skilled in the art should appreciate that such a storage medium is not limited to the detachable medium 211 storing therein a program and distributed separately from the apparatus to provide the program to a user as shown in FIG. 2. Examples of the detachable medium 211 include a magnetic disc (including a floppy disc (registered trademark)), an optical disc (including a compact disc read-only memory (CD-ROM) and a digital versatile disc (DVD)), a magneto-optical disc (including a mini disc (MD) (registered trademark)), and a semiconductor memory. Alternatively, the storage medium may be the ROM 202, a hard disc included in the storage part 208, or the like, in which programs are stored and which is distributed to users together with the apparatus including it.

The present disclosure further proposes a program product storing therein a machine-readable instruction code that, when read and executed by a machine, can implement the aforesaid method according to the embodiment of the present disclosure.

Correspondingly, a storage medium for carrying the program product storing therein the machine-readable instruction code is also included in the present disclosure. The storage medium includes but is not limited to a floppy disc, an optical disc, a magneto-optical disc, a memory card, a memory stick and the like.
