METHOD FOR PREDICTING ADVERSE REACTIONS BETWEEN DRUGS BASED ON MULTI-ATTRIBUTE AND MULTI-KERNEL REPRESENTATION LEARNING

Description

CROSS REFERENCE TO THE RELATED APPLICATIONS

This application is based upon and claims priority to Chinese Patent Application No. 2023116173052, filed on Nov. 29, 2023, the entire contents of which are incorporated herein by reference.

TECHNICAL FIELD

The present invention belongs to the technical field of prediction of adverse reactions between drugs, and particularly relates to a method for predicting adverse reactions between drugs based on multi-attribute and multi-kernel representation learning.

BACKGROUND

An adverse reaction between drugs occurs when, after two drugs are used in combination, the efficacy or pharmacological action of one drug is impaired by the other, which alters the original in-vivo metabolic and absorption processes of the drugs and produces adverse reactions or toxic side effects harmful to the human body. For example, the combined use of the triazole antifungal agent voriconazole and a synthetic adrenal corticosteroid will induce adverse reactions such as osteoporosis, ichorrhemia, and attenuation of vision. The combined use of the polyamine sevelamer and the sulfomethylaminobenzoic acid derivative furosemide will cause adverse reactions such as bradycardia, asystole, and adynamic ileus. The identification and prediction of adverse reactions between drugs have gradually received attention from researchers in various disciplines and are a research hotspot in the current fields of clinical pharmacy and medical treatment.

To address the problem of adverse reactions between drugs, pharmaceutical enterprises all over the world invest heavily in clinical tests during the research and development stage, before a new drug is listed, so as to reduce the risk of adverse reactions of the newly listed drug and improve its safety. Despite this, adverse reaction events still occur frequently every year. This increases the costs borne by pharmaceutical enterprises in the safety monitoring stage of a newly listed drug, and it also increases the risk that the drug is discarded or withdrawn. On the one hand, the factors affecting adverse reactions between drugs are complicated and diverse. For example, the biochemical attributes (molecular structure, target, enzyme, pathway, and side effect) of different drugs differ, and patients differ in their capability to absorb and metabolize drugs, so clinical tests cannot extensively analyze all conditions of use of the drugs and all states of the human body. On the other hand, the large number of drugs yields a huge space of possible combinations, which makes clinical tests costly and time-consuming. Therefore, clinical test methods cannot efficiently identify and predict adverse reactions between drugs on a large scale, and relying merely on clinical tests before a new drug is listed is not enough to find potential adverse reactions between drugs. Predicting adverse reactions between drugs thus compensates for the limitations of clinical test methods and for the lag of safety monitoring after a new drug is listed, thereby lowering the research and development cost of the new drug and the safety monitoring cost of the newly listed drug.

Research work for predicting the adverse reactions between drugs at present is mainly divided into two categories: methods for predicting adverse reactions between drugs based on a knowledge base and methods for predicting adverse reactions between drugs based on drug attributes.

Public medical and sanitary institutions collect drug safety monitoring data submitted by drug users and medical workers through a spontaneous reporting system to construct a knowledge base for the adverse reactions between drugs. Researchers proposed the methods for predicting adverse reactions between drugs based on a knowledge base to detect the adverse reactions between drugs in biochemical texts, electronic medical records, and a biological heterogeneous database.

As pharmaceutical experts have studied drug attributes in depth (molecular structure, side effect, target, enzyme, pathway, phenotype, gene, disease, and the like), researchers extract drug attribute information from heterogeneous drug databases, design methods for predicting adverse reactions between drugs based on drug attributes, and reveal the attribute rules of the adverse reactions of the drugs. Common research methods include multi-task learning, feature selection, graph convolutional neural networks, and the like.

Although the above methods lay a good foundation for predicting the adverse reactions between drugs, they have the following defects. The current methods model the relationship of adverse reactions between drugs by calculating the similarities of multi-attribute representations between drugs with similarity measure functions. Specifically, a similarity measure function can be regarded as a kernel function that first maps the attribute representations of the drugs to a high-dimensional space by using a projection strategy and then calculates the inner product of the attribute representations of the two drugs in the high-dimensional space, which is taken as the adverse reaction score between the drugs. Because different attributes usually differ in how they reveal the underlying characteristics of adverse reactions between drugs, the kernel function most compatible with these potential characteristics should preferably be selected to model the adverse reactions between drugs. The current methods usually try various kernel functions (such as the linear kernel, the Gaussian kernel, and the Sigmoid kernel) and select the kernel function with the best result as the optimal kernel function for measuring the attribute similarity between drugs. However, because the potential characteristics of different drug attributes are diverse and complex, the selected optimal kernel function is limited by its representation capability: it can only approximate the potential characteristics of the attributes and cannot fully reflect them. Moreover, from the perspective of the kernel functions, different kernel functions have their own preferences and tendencies when calculating the attribute representation similarities between drugs.

Because different attributes usually differ in how they reveal the underlying characteristics of adverse reactions between drugs, and different kernels typically have their respective preferences and tendencies in similarity estimation, it is proposed, for predicting the adverse reactions between drugs based on multiple attributes of the drugs, to construct an optimal kernel function combination that better reveals the potential characteristics of different attributes in modeling the adverse reactions between drugs, that is, to find an integrated similarity measure function that is compatible with the diverse and complex characteristics of the different attributes. Specifically, this optimal kernel function combination integrates the preferences and tendencies of various kernel functions in calculating the multi-attribute similarities, so that it can better reveal the relationship between the multi-attribute similarities of the drugs and the adverse reactions between drugs, thereby improving the accuracy of predicting the adverse reactions between drugs. In view of the differences among multiple attributes in revealing the potential characteristics of the adverse reactions between drugs, the problem to be solved in the present invention is to construct the optimal kernel function combination for accurate prediction of adverse reactions between drugs.

SUMMARY

To address the above problems, the present invention provides a method for predicting adverse reactions between drugs based on multi-attribute and multi-kernel representation learning. Based on multi-attribute information (molecular structure, target, pathway, side effect, phenotype, and disease) of the drugs, and in view of the high-dimensional and sparse multi-attribute feature spaces of the drugs and the differing feature dimensions of the attributes, the method in the present invention includes: first, projecting the multi-attribute feature spaces of the drugs to a same low-dimensional dense feature space to learn their shared and private representations; then developing a multi-kernel representation learning model, and designing a distance learning strategy and a reconstruction strategy of the kernel functions by calculating an incidence relationship among the kernel functions so as to select representative kernel functions; and finally, constructing an optimal kernel function combination based on the incidence relationship between the representative kernel functions and the original kernel functions for revealing the relationship between the multi-attribute similarities of the drugs and the adverse reactions between drugs, so as to predict the adverse reactions between drugs.

A technical solution for the present invention is as follows:

A method for predicting adverse reactions between drugs based on multi-attribute and multi-kernel representation learning, including the following steps:

- S1: collecting data of adverse reactions between drugs and multi-attribute information of the drugs to construct vectors of adverse reactions between drugs and feature spaces of the multi-attribute information of the drugs, specifically including: defining a drug set as D={d₁, d₂, . . . , d_N}, where N is the number of drugs; constructing a vector r^ij ∈ {0, 1}^K to denote an adverse reaction relationship between the i^th drug d_i and the j^th drug d_j, where K denotes the number of types of adverse reactions, and if the k^th adverse reaction is induced by the interaction between d_i and d_j, then r_k^ij=1; otherwise, r_k^ij=0, k=1, 2, . . . , K; and constructing a matrix X^m ∈ R^(N×L_m) to denote a feature space of the m^th attribute of the drugs, where L_m denotes a feature dimension of the m^th attribute, m=1, 2, . . . , M, and M denotes the number of attributes;
- S2: learning the shared and private representations of the multi-attribute information of the drugs, where the shared representation means that different attributes have consistency information for prediction of adverse reactions between drugs, the private representation means that different attributes contain specific supplementary information of each attribute, the feature space of each attribute consists of the shared and private representations after the multi-attribute feature spaces of the drugs are projected to a same low-dimensional dense space, and an objective function is constructed based on multi-attribute representation learning so as to obtain solutions of the shared and private representations of M attribute spaces of the drugs;
- S3: constructing a distance learning strategy of kernel functions and a reconstruction strategy of the kernel functions, specifically including: using the shared and private representations of the drugs as inputs of a kernel function set; performing similarity measures on the shared and private representations of two drugs by the kernel function set to calculate the distances among different kernel functions, where the smaller the distance between two kernel functions is, the greater their similarity is; obtaining similarity matrices of the shared and private representations from the distances among the kernel functions of the shared and private representations of the drugs; and finally, constructing a kernel function learning strategy according to the similarity matrices and a kernel function incidence matrix, where the kernel function incidence matrix is a probability matrix whose entries denote the probability that one kernel function represents another kernel function; and the reconstruction strategy of the kernel functions is constructed for estimating the kernel function incidence matrix, in which the shared representation matrix of the adverse reactions between drugs of a given kernel function can be reconstructed from the shared representation matrices of the adverse reactions between drugs of the other kernel functions, and the private representation matrix of the adverse reactions between drugs of a given kernel function can likewise be reconstructed from the private representation matrices of the adverse reactions between drugs of the other kernel functions, so that the reconstruction strategy of the kernel functions is obtained;
- S4: constructing a multi-kernel representation learning model, selecting representative kernel functions, and constructing an optimal kernel function combination, specifically including: constructing an objective function of the multi-kernel representation learning model according to the distance learning strategy of the kernel functions and the reconstruction strategy of the kernel functions, solving the objective function to obtain the kernel function incidence matrix, so as to obtain a weight of each kernel function, sequencing the kernel functions according to the weights, selecting the kernel functions with the maximum weights as the representative kernel functions to further obtain a representative kernel function set and derive the optimal kernel function combination, and finally obtaining the multi-kernel representations of the shared and private representations of the drugs based on the optimal kernel function combination;
- S5: constructing a model for predicting adverse reactions between drugs by using the optimal kernel function combination, specifically including: based on the vector r^ij, mining a potential relationship between the multi-attribute and multi-kernel representations of the drugs and the adverse reactions by using an R-layer neural network to obtain a mapping relationship between the multi-attribute and multi-kernel representations of the drugs and the adverse reactions between drugs, using the concatenated multi-kernel representations of the shared and private representations of drugs d_i and d_j as an input of the neural network, using an output vector of the R^th layer of the neural network as a predicted value r̄^ij of the adverse reactions between drugs d_i and d_j, estimating a difference between the true value r^ij and the predicted value r̄^ij of the adverse reaction vectors between drugs d_i and d_j by using a mean square error loss function, and performing training by using the data collected in S1 to obtain a trained adverse reaction prediction model; and
- S6: acquiring multi-attribute information of any two drugs, calculating the multi-kernel representations of the shared and private representations of two drugs, and inputting the multi-kernel representations into the trained adverse reaction prediction model to obtain a predicted result of the adverse reactions between drugs.

Further, the objective function constructed in S2 is as follows:

$\begin{matrix} \min_{P, Q^{m}, U^{m}} \sum_{m = 1}^{M} \left[ \left\| X^{m} - (P + Q^{m}) U^{m} \right\|_{F}^{2} + α^{m} \left\| U^{m} \right\|_{0} \right] \\ \mathrm{s.t.}\ P \geq 0, Q^{m} \geq 0, U^{m} \geq 0, m = 1, \dots, M \end{matrix}$

- where ∥⋅∥_F² denotes the squared Frobenius norm of a matrix, ∥⋅∥₀ denotes the l₀ norm of a matrix (i.e., the number of non-zero elements in the matrix), a matrix P ∈ R^(N×E) denotes the shared representation of multiple attributes of the drugs, a matrix Q^m ∈ R^(N×E) denotes the private representation of the m^th attribute of the drugs, E denotes a dimension of the shared and private representations, U^m ∈ R^(E×L_m) denotes a coefficient matrix for reconstructing the original feature space X^m of the m^th attribute from the shared and private representations, and α^m denotes a sparsity regularization parameter for the m^th attribute; and

the objective function can be reformulated as an augmented Lagrangian function and further solved by the alternating direction method of multipliers and a non-negative matrix factorization optimization method so as to obtain iterative update solutions of the shared representation P and the private representations Q^m of the multi-attribute feature spaces of the drugs and of the reconstruction coefficient matrices U^m, m=1, . . . , M; a maximum number of iterations or a minimum change threshold of the objective function is set, and the variables are iteratively updated to obtain an optimal solution of the objective function.
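As an illustration only, the value of the S2 objective can be evaluated for candidate factors as in the following minimal NumPy sketch. The function names `shared_private_objective` and `is_feasible` and the random toy data are assumptions made for this example and do not come from the specification; the sketch evaluates the objective and checks the non-negativity constraints, and it does not implement the augmented Lagrangian solver.

```python
import numpy as np

def shared_private_objective(X_list, P, Q_list, U_list, alphas):
    """Sum over attributes m of ||X^m - (P + Q^m) U^m||_F^2 + alpha^m * ||U^m||_0."""
    total = 0.0
    for X, Q, U, alpha in zip(X_list, Q_list, U_list, alphas):
        residual = X - (P + Q) @ U          # reconstruction error of attribute m
        total += np.sum(residual ** 2)      # squared Frobenius norm
        total += alpha * np.count_nonzero(U)  # l0 "norm": number of non-zero entries
    return float(total)

def is_feasible(P, Q_list, U_list):
    """Check the constraints P >= 0, Q^m >= 0, U^m >= 0."""
    return bool(
        (P >= 0).all()
        and all((Q >= 0).all() for Q in Q_list)
        and all((U >= 0).all() for U in U_list)
    )

# Toy data (hypothetical sizes): N drugs, E latent dimensions, two attributes.
rng = np.random.default_rng(0)
N, E = 5, 3
dims = [4, 6]                               # L_1, L_2
P = rng.random((N, E))
Q_list = [rng.random((N, E)) for _ in dims]
U_list = [rng.random((E, L_m)) for L_m in dims]
# Build X^m so that the reconstruction is exact; only the sparsity term remains.
X_list = [(P + Q) @ U for Q, U in zip(Q_list, U_list)]
alphas = [0.1, 0.1]

value = shared_private_objective(X_list, P, Q_list, U_list, alphas)
feasible = is_feasible(P, Q_list, U_list)
```

In this toy setting the residual vanishes by construction, so the objective reduces to the sparsity penalty on the two coefficient matrices. In the described method the matrices X^m would be the binary attribute matrices from S1 rather than synthetic products.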

Further, the distance learning strategy of the kernel functions constructed in S3 is as follows:

$\begin{matrix} \min_{Y} \sum_{l, s = 1}^{L} (D_{ls}^{P} + D_{ls}^{Q}) Y_{ls} = \mathrm{tr} [(D^{P} + D^{Q}) Y] \\ \mathrm{s.t.}\ Y^{T} \mathbf{1} = \mathbf{1}, \mathrm{diag} (Y) = \mathbf{0}, Y \geq 0, \end{matrix}$

- where L denotes the number of kernel functions, D_ls^P and D_ls^Q denote the distances between the kernel functions κ_l and κ_s in terms of the shared representation and the private representation, respectively, Y_ls is an element of the kernel function incidence matrix Y and denotes a probability that kernel function κ_l can be used to represent kernel function κ_s, matrices D^P, D^Q ∈ R^(L×L) respectively denote similarity matrices of the L kernel functions in terms of the shared representation and the private representation, 1 and 0 respectively denote an all-1 vector and an all-0 vector, Y≥0 denotes non-negativity of all elements in the kernel function incidence matrix Y, and diag(Y)=0 denotes that the diagonal elements of the matrix Y are 0;
- the weight w_l of the kernel function κ_l is defined as

$w_{l} = \frac{1}{L} \sum_{s = 1}^{L} Y_{ls};$

and

- the reconstruction strategy of kernel function is as follows:

$\min_{Y} \sum_{l = 1}^{L} \left[ \left\| κ_{l}^{P} - \sum_{s = 1}^{L} Y_{sl} κ_{s}^{P} \right\|_{F}^{2} + \left\| κ_{l}^{Q} - \sum_{s = 1}^{L} Y_{sl} κ_{s}^{Q} \right\|_{F}^{2} \right]$

- where κ_l^P is the shared representation matrix of the adverse reactions between drugs based on the kernel function κ_l, κ_l^Q is the private representation matrix of the adverse reactions between drugs based on the kernel function κ_l, κ_s^P is the shared representation matrix of the adverse reactions between drugs based on the kernel function κ_s, and κ_s^Q is the private representation matrix of the adverse reactions between drugs based on the kernel function κ_s.
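The quantities defined above can be illustrated with the following minimal NumPy sketch. It assumes, for the purpose of the example only, that each κ_l^P and κ_l^Q is stored as an N×N matrix and that the L matrices are stacked into arrays of shape (L, N, N); the function names and the toy values are hypothetical. The sketch only evaluates the constraint check, the distance term, the weights, and the reconstruction term for a given Y; it does not solve for Y.

```python
import numpy as np

def is_valid_incidence(Y, tol=1e-9):
    """Check Y^T 1 = 1 (columns sum to one), diag(Y) = 0, and Y >= 0."""
    L = Y.shape[0]
    columns_sum_to_one = np.allclose(Y.T @ np.ones(L), np.ones(L), atol=tol)
    zero_diagonal = np.allclose(np.diag(Y), 0.0, atol=tol)
    return bool(columns_sum_to_one and zero_diagonal and (Y >= -tol).all())

def distance_term(D_P, D_Q, Y):
    """tr[(D^P + D^Q) Y]; equals the element-wise sum of (D^P + D^Q) * Y
    whenever D^P + D^Q is symmetric."""
    return float(np.trace((D_P + D_Q) @ Y))

def kernel_weights(Y):
    """w_l = (1/L) * sum_s Y_ls, i.e. the mean of row l of Y."""
    return Y.mean(axis=1)

def reconstruction_term(K_P, K_Q, Y):
    """Sum over l of ||K_l - sum_s Y_sl K_s||_F^2 for shared and private parts."""
    recon_P = np.einsum("sl,sij->lij", Y, K_P)  # recon_P[l] = sum_s Y[s, l] * K_P[s]
    recon_Q = np.einsum("sl,sij->lij", Y, K_Q)
    return float(np.sum((K_P - recon_P) ** 2) + np.sum((K_Q - recon_Q) ** 2))

# Toy example: three kernels that all produce the same matrix.
L, N = 3, 4
Y = (np.ones((L, L)) - np.eye(L)) / (L - 1)   # uniform off-diagonal incidence
D_P = np.ones((L, L)) - np.eye(L)             # unit distance between distinct kernels
D_Q = D_P.copy()
G = np.eye(N)
K_P = np.stack([G, G, G])
K_Q = np.stack([G, G, G])

valid = is_valid_incidence(Y)
dist = distance_term(D_P, D_Q, Y)
w = kernel_weights(Y)
recon = reconstruction_term(K_P, K_Q, Y)
```

Because the three toy kernels are identical, each one is reconstructed exactly by the other two and the reconstruction term is zero, while every kernel receives the same weight.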

Further, the objective function of the multi-kernel representation learning model constructed in S4 is as follows:

$\begin{matrix} \min_{Y} \sum_{l = 1}^{L} \left[ \left\| κ_{l}^{P} - \sum_{s = 1}^{L} Y_{sl} κ_{s}^{P} \right\|_{F}^{2} + \left\| κ_{l}^{Q} - \sum_{s = 1}^{L} Y_{sl} κ_{s}^{Q} \right\|_{F}^{2} \right] + λ\, \mathrm{tr} [(D^{P} + D^{Q}) Y] \\ \mathrm{s.t.}\ Y^{T} \mathbf{1} = \mathbf{1}, \mathrm{diag} (Y) = \mathbf{0}, Y \geq 0 \end{matrix}$

- where the regularization parameter λ controls the weight of the kernel function distance learning; the objective function of the multi-kernel representation learning model can be reformulated as an augmented Lagrangian function and further optimized by the alternating direction method of multipliers and a non-negative matrix factorization optimization method to obtain the iterative update solution of the kernel function incidence matrix Y, and by setting a maximum number of iterations or a minimum change threshold of the objective function, the matrix Y is iteratively updated and the optimal solution of the kernel function incidence matrix Y is obtained; and
- the weight of the kernel function κ_l is obtained based on the optimal solution of Y and the definition of the weight w_l, the weights of the L kernel functions are sorted, the r_L kernel functions with the largest weights are selected as the representative kernel function set, and the multi-kernel representations κ_w(P_i⋅, P_j⋅) and κ_w(Q_i⋅, Q_j⋅) of the shared representation and the private representation of drugs d_i and d_j are obtained by using the representative kernel function set:

$κ_{w} (P_{i \cdot}, P_{j \cdot}) = {[\begin{matrix} w_{1} \sum_{\overline{l} = 1}^{r_{L}} Y_{\overline{l} 1} κ_{\overline{l}} (P_{i \cdot}, P_{j \cdot}), \\ w_{2} \sum_{\overline{l} = 1}^{r_{L}} Y_{\overline{l} 2} κ_{\overline{l}} (P_{i \cdot}, P_{j \cdot}), \\ \dots \\ w_{L} \sum_{\overline{l} = 1}^{r_{L}} Y_{\overline{l} L} κ_{\overline{l}} (P_{i \cdot}, P_{j \cdot}) \end{matrix}]}^{T}, κ_{w} (Q_{i \cdot}, Q_{j \cdot}) = {[\begin{matrix} w_{1} \sum_{\overline{l} = 1}^{r_{L}} Y_{\overline{l} 1} κ_{\overline{l}} (Q_{i \cdot}, Q_{j \cdot}), \\ w_{2} \sum_{\overline{l} = 1}^{r_{L}} Y_{\overline{l} 2} κ_{\overline{l}} (Q_{i \cdot}, Q_{j \cdot}), \\ \dots \\ w_{L} \sum_{\overline{l} = 1}^{r_{L}} Y_{\overline{l} L} κ_{\overline{l}} (Q_{i \cdot}, Q_{j \cdot}) \end{matrix}]}^{T}$

- where Y_(l̄l) denotes the element in the l̄^th row and the l^th column of the kernel function incidence matrix Y, l̄=1, 2, . . . , r_L indexes the representative kernel functions, and l=1, 2, . . . , L.
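The selection of representative kernel functions and the resulting multi-kernel representation can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the three kernels (linear, Gaussian, Sigmoid) and their parameters `gamma`, `scale`, and `offset` are example choices, the incidence matrix Y is a hand-picked toy matrix rather than a learned one, and the function names are hypothetical.

```python
import numpy as np

def linear_kernel(a, b):
    return float(a @ b)

def gaussian_kernel(a, b, gamma=0.5):
    return float(np.exp(-gamma * np.sum((a - b) ** 2)))

def sigmoid_kernel(a, b, scale=0.1, offset=0.0):
    return float(np.tanh(scale * (a @ b) + offset))

def select_representatives(Y, r_L):
    """Return the weights w_l = mean of row l of Y and the indices of the
    r_L kernels with the largest weights."""
    w = Y.mean(axis=1)
    rep_idx = np.argsort(-w, kind="stable")[:r_L]
    return w, rep_idx

def multi_kernel_representation(a, b, kernels, Y, w, rep_idx):
    """Entry l is w_l * sum over representative kernels lbar of Y[lbar, l] * kappa_lbar(a, b)."""
    k_rep = np.array([kernels[l](a, b) for l in rep_idx])  # values of representative kernels
    return w * (Y[rep_idx, :].T @ k_rep)                   # vector of length L

kernels = [linear_kernel, gaussian_kernel, sigmoid_kernel]
# Toy incidence matrix: columns sum to one, zero diagonal, non-negative.
Y = np.array([[0.0, 0.7, 0.6],
              [0.6, 0.0, 0.4],
              [0.4, 0.3, 0.0]])
w, rep_idx = select_representatives(Y, r_L=2)

P_i = np.array([1.0, 0.0])   # hypothetical shared representations of two drugs
P_j = np.array([1.0, 0.0])
kappa_w = multi_kernel_representation(P_i, P_j, kernels, Y, w, rep_idx)
```

The same function would be applied to the private representations Q_i⋅ and Q_j⋅, and the two resulting vectors would then be concatenated to form the network input κ_w^ij.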

Further, the mapping relationship between the multi-attribute and multi-kernel representations between drugs and the adverse reactions of the R-layer neural network in S5 is as follows:

$\begin{matrix} h^{(1)} = σ (E^{(1)} κ_{w}^{ij} + b^{(1)}), \\ h^{(r)} = σ (E^{(r)} h^{(r - 1)} + b^{(r)}), r = 2, 3, \dots, R \end{matrix}$

- where h^(r), E^(r), and b^(r) respectively denote an output vector, a coefficient matrix, and an offset vector of the r^th layer of the neural network, σ(⋅) denotes the activation function, κ_w^ij denotes the concatenation of the multi-kernel representations of the shared and private representations of drugs d_i and d_j, i.e., κ_w^ij=[κ_w(P_i⋅, P_j⋅), κ_w(Q_i⋅, Q_j⋅)], the output vector h^(R) of the R^th layer of the neural network is the predicted value r̄^ij of the adverse reactions between drugs d_i and d_j, and the difference between the true value r^ij and the predicted value r̄^ij of the adverse reaction vectors between drugs d_i and d_j is estimated by using the mean square error loss function:

$L = \sum_{{ r^{ij} }_{2} \geq 0} { r^{ij} - {\overline{r}}^{ij} }_{2}^{2}$

- the larger an element r̄_k^ij in the predicted value r̄^ij is, the higher the probability that the k^th adverse reaction is induced by drugs d_i and d_j.
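The forward mapping and the loss for a single drug pair can be sketched as follows. The activation σ is not specified in the text, so the logistic function is assumed here purely for illustration; the layer sizes, the zero-initialised parameters, and the function names are likewise hypothetical. The sketch shows one pair only, whereas the loss above sums over all pairs, and no training step is included.

```python
import numpy as np

def logistic(z):
    """Assumed activation sigma; the specification does not name a particular one."""
    return 1.0 / (1.0 + np.exp(-z))

def predict(kappa_ij, weights, biases):
    """h^(1) = sigma(E^(1) kappa + b^(1)); h^(r) = sigma(E^(r) h^(r-1) + b^(r))."""
    h = np.asarray(kappa_ij, dtype=float)
    for E_r, b_r in zip(weights, biases):
        h = logistic(E_r @ h + b_r)
    return h  # output of the last layer, used as the predicted reaction vector

def squared_error_loss(r_true, r_pred):
    """Squared l2 distance between the true and predicted reaction vectors of one pair."""
    r_true = np.asarray(r_true, dtype=float)
    return float(np.sum((r_true - r_pred) ** 2))

# Toy two-layer network (R = 2) with all-zero parameters.
input_dim, hidden_dim, K = 6, 5, 4
weights = [np.zeros((hidden_dim, input_dim)), np.zeros((K, hidden_dim))]
biases = [np.zeros(hidden_dim), np.zeros(K)]

kappa_ij = np.ones(input_dim)            # stands in for the concatenated kappa_w^ij
r_pred = predict(kappa_ij, weights, biases)
loss = squared_error_loss([1, 0, 0, 1], r_pred)
```

With all-zero parameters every output equals 0.5, which makes the loss easy to verify by hand; in practice the coefficient matrices and offset vectors would be fitted on the data collected in S1.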

The present invention has the following beneficial effects: compared with a conventional prediction method, in light of the differences of multiple attributes of the drugs in revealing the potential characteristics of the adverse reactions between drugs, the consistent optimal kernel function combination is constructed. The optimal kernel function embodies preference and tendencies of various kernel functions when calculating the multi-attribute similarities, and the potential relationship between the multi-attribute similarities of the drugs and the adverse reactions between drugs is established, thereby improving the accuracy of predicting the adverse reactions between drugs. The present invention is capable of providing data support to experimental research on the adverse reactions between drugs, improves the clinical experimental research of the adverse reactions between drugs, and is of great significance in promoting clinical medication safety.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an overall framework of the present invention.

FIG. 2 is an overall flowchart of the present invention.

FIGS. 3A-3C are schematic diagrams of partially predicted adverse reactions between drugs.

DETAILED DESCRIPTION OF THE EMBODIMENTS

The present invention will be described in detail below in combination with drawings and specific embodiments.

As shown in FIG. 1, in the present invention, prediction of adverse reactions between drugs is conducted based on multi-attribute and multi-kernel representation learning. The method includes: based on shared representations and private representations of multiple attributes of drugs, providing a multi-kernel representation learning model; designing a distance learning strategy of kernel functions and a reconstruction strategy of the kernel functions to select representative kernel functions; constructing an optimal kernel function combination from an incidence relationship among the kernel functions; and finally, predicting the adverse reactions between drugs based on multi-attribute and multi-kernel representations through a neural network. By constructing the optimal kernel function combination to reveal the potential characteristics of different attributes in modeling the adverse reactions between drugs, and by integrating the preferences and tendencies of various kernel functions when calculating the multi-attribute similarities, the method improves the accuracy of predicting the adverse reactions between drugs. The present invention is capable of providing data support to experimental research on the adverse reactions between drugs, improves the clinical experimental research of the adverse reactions between drugs, and is of great significance in promoting clinical medication safety.

Embodiment

As shown in FIG. 2, the embodiment includes the following main steps:

- S1: collecting data of adverse reactions between drugs and multi-attribute information of the drugs;
- S2: learning shared and private representations of the multi-attribute information of drugs;
- S3: constructing a distance learning strategy of kernel functions and a reconstruction strategy of the kernel functions;
- S4: designing a multi-kernel representation learning model, selecting representative kernel functions, and constructing an optimal kernel function combination;
- S5: constructing a model for predicting adverse reactions between drugs by using the optimal kernel function combination; and
- S6: predicting the adverse reactions between drugs by multi-attribute information of any two drugs.

In the step S1, the data of the adverse reactions between drugs and the multi-attribute information of the drugs are collected. In the embodiment, data of the adverse reactions between drugs is collected from the TWOSIDES database [N. P. Tatonetti, P. P. Ye, D. Roxana, R. B. Altman, Data-driven prediction of drug effects and interactions, Science Translational Medicine 4 (125) (2012) 1-26]. The TWOSIDES database records adverse reactions of the combined use of two drugs. In the embodiment, a relationship for predicting the adverse reactions between drugs is established by means of attribute data such as molecular structures, targets, pathways, side effects, phenotypes, and diseases of the drugs. Molecular structure and target information of the drugs is obtained from the DrugBank database [D. S. Wishart, Y. D. Feunang, A. C. Guo, et al. DrugBank 5.0: A major update to the DrugBank database for 2018[J]. Nucleic Acids Research, 2018, 46(D1): D1074-D1082.]. Pathway and disease information of the drugs is obtained from the KEGG database [M. Kanehisa, M. Furumichi, Y. Sato, et al. KEGG: Integrating viruses and cellular organisms[J]. Nucleic Acids Research, 2021, 49(D1): D545-D551.]. Side effect information of the drugs is obtained from the SIDER database [M. Kuhn, I. Letunic, L. J. Jensen, et al. The SIDER database of drugs and side effects[J]. Nucleic Acids Research, 2016, 44(D1): D1075-D1079.]. Phenotype information of the drugs is obtained from the CTD database [A. P. Davis, C. J. Grondin, R. J. Johnson, et al. Comparative toxicogenomics database (CTD): Update 2021[J]. Nucleic Acids Research, 2021, 49(D1): D1138-D1143.]. Based on the data of the adverse reactions between drugs and the multi-attribute data of the drugs, a total of 1188258 adverse reactions between drugs are acquired in the embodiment, involving N=567 drugs and K=258 types of adverse reactions, which basically cover the common drugs and adverse reactions. The data collected by this collection method is of high reliability.

Given a drug set D={d₁, d₂, . . . , d_N}, for the adverse reactions between drugs d_i and d_j, a vector r^ij ∈ {0, 1}^K is constructed to denote the adverse reaction relationship between the i^th drug d_i and the j^th drug d_j, where K denotes the number of types of adverse reactions; if the k^th adverse reaction is induced by the interaction between d_i and d_j, then r_k^ij=1; otherwise, r_k^ij=0. For the attribute data such as molecular structures, targets, pathways, side effects, phenotypes, and diseases of the drugs, binary vectors are used to denote the attribute representations of the drugs. A vector element of 1 or 0 denotes that the drug does or does not contain the corresponding attribute feature. The source databases and representation dimensions of the multi-attribute information of the drugs are shown in Table 1. A matrix X^m ∈ R^(N×L_m) is used to denote the feature space of the m^th attribute, where L_m denotes the feature dimension of the m^th attribute, m=1, 2, . . . , M, and M denotes the number of attributes. Taking the feature space of the molecular structure of the drugs as an example (m=1), molecular structure information of the drugs is collected from the DrugBank database to construct a feature space X¹ ∈ R^(N×L₁) of the molecular structure of the drugs, with the feature dimension L₁=881 of the molecular structure. Therefore, the molecular structure attribute of the drug d_i can be denoted by an 881-dimensional binary vector. If d_i contains the j^th sub-structure, X_ij¹=1; and if d_i does not contain the j^th sub-structure, X_ij¹=0.
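The construction of the binary vector r^ij and of a binary attribute matrix X^m can be illustrated with the following minimal NumPy sketch. All drug, reaction, and sub-structure names below are placeholder identifiers invented for the example, and the helper functions are hypothetical; the actual embodiment draws these records from the databases cited above.

```python
import numpy as np

def build_reaction_vector(observed, reaction_index):
    """Return r^ij in {0, 1}^K: entry k is 1 if reaction k is recorded for the pair."""
    r = np.zeros(len(reaction_index), dtype=np.int8)
    for reaction in observed:
        r[reaction_index[reaction]] = 1
    return r

def build_attribute_matrix(drugs, drug_features, feature_index):
    """Return X^m in {0, 1}^(N x L_m): entry (i, j) is 1 if drug i has feature j."""
    X = np.zeros((len(drugs), len(feature_index)), dtype=np.int8)
    for i, drug in enumerate(drugs):
        for feature in drug_features.get(drug, ()):
            X[i, feature_index[feature]] = 1
    return X

# Placeholder records (not real data).
drugs = ["drug_a", "drug_b", "drug_c"]
reaction_index = {"reaction_0": 0, "reaction_1": 1, "reaction_2": 2, "reaction_3": 3}
feature_index = {"substructure_0": 0, "substructure_1": 1, "substructure_2": 2}
pair_reactions = {("drug_a", "drug_b"): ["reaction_1", "reaction_3"]}
drug_features = {
    "drug_a": ["substructure_0", "substructure_2"],
    "drug_b": ["substructure_1"],
}

r_ab = build_reaction_vector(pair_reactions[("drug_a", "drug_b")], reaction_index)
X_1 = build_attribute_matrix(drugs, drug_features, feature_index)
```

Here `r_ab` has K=4 entries and `X_1` has one row per drug; a drug without any recorded feature, such as `drug_c`, simply yields an all-zero row.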

In the step S2 of the embodiment, the shared and private representations of the multi-attribute information of the drugs are learned. As shown in Table 1, because the representations of different attributes of the drugs are high-dimensional and sparse and their dimensions differ, the multi-attribute feature spaces of the drugs are projected to a same low-dimensional dense space to learn the shared and private representations of the multi-attribute information of the drugs.

TABLE 1

Source database and representation dimension information of the multi-attribute information of the drugs

| Drug attribute | Source database | Characteristic dimension L_m |
| --- | --- | --- |
| Molecular structure | DrugBank | 881 |
| Target | DrugBank | 497 |
| Pathway | KEGG | 396 |
| Side effect | SIDER | 3687 |
| Phenotype | CTD | 2193 |
| Disease | KEGG | 482 |

In the step, the shared representations denote that different attributes carry consistent information for predicting the adverse reactions between drugs, and the private representations denote that different attributes contain information specific to each attribute, which plays a supplementing role in predicting the adverse reactions between drugs. In the step, the potential relationship between the shared and private representations and the original feature spaces of the drug attributes is established, that is, the feature space of each attribute consists of the shared representations and the respective private representations in the projected low-dimensional dense space, so that the consistency information and supplementary information of the multi-attribute feature spaces are revealed. Therefore, an objective function for learning the shared representations and private representations of the multi-attribute information of the drugs can be written as follows:

$\begin{matrix} \begin{matrix} \min_{P, Q^{m}, U^{m}} \sum_{m = 1}^{M} \left[ \left\| X^{m} - (P + Q^{m}) U^{m} \right\|_{F}^{2} + α^{m} \left\| U^{m} \right\|_{0} \right] \\ \mathrm{s.t.}\ P \geq 0, Q^{m} \geq 0, U^{m} \geq 0, m = 1, \dots, M \end{matrix} & (1) \end{matrix}$

- where ∥⋅∥_F² denotes the squared Frobenius norm of a matrix, ∥⋅∥₀ denotes the l₀ norm of a matrix (i.e., the number of non-zero elements in the matrix), a matrix P ∈ R^(N×E) denotes the shared representations of multiple attributes of the drugs, a matrix Q^m ∈ R^(N×E) denotes the private representations of the m^th attribute of the drugs, E denotes a dimension of the shared and private representations, and U^m ∈ R^(E×L_m) denotes a coefficient matrix for reconstructing the original feature space X^m of the m^th attribute from the shared and private representations. For the m^th attribute, the number of features contained in each drug is limited and is far less than the representation dimension L_m of the m^th attribute, so the feature space of the attribute is highly sparse. Therefore, an l₀-norm constraint on the coefficient matrix U^m is introduced in the equation (1) to control the sparsity of the reconstruction matrix (P+Q^m)U^m of the original feature space based on the shared and private representations, where α^m denotes a sparsity regularization parameter. The constraint conditions P≥0, Q^m≥0, and U^m≥0 are used to maintain the non-negativity of the original feature space X^m. In the step, the feature spaces of the drug attributes are divided into the shared representations P and the private representations Q^m: the feature spaces of the M attributes share the same shared representations P, and different feature spaces X^m have respective private representations Q^m. The original feature spaces are reconstructed from the shared representations and the private representations through the sparse coefficient matrices U^m.

The equation (1) can be formulated as an augmented Lagrangian function and optimized by the alternating direction method of multipliers (ADMM) together with a non-negative matrix factorization optimization method, so as to acquire iterative update solutions of the shared representations P, the private representations Q^m, and the reconstruction coefficient matrices U^m of the multi-attribute feature space of the drugs, m=1, . . . , M. A maximum number of iterations or a minimum change threshold of the objective function is set, and the variables are iteratively updated to obtain an optimal solution of the objective function.
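To make the quantities in the equation (1) concrete, the following minimal sketch evaluates the objective for given factors, checks the non-negativity constraints, and applies the two stopping rules mentioned above. It does not implement the ADMM updates themselves. The function names, the toy sizes, and the default values of `max_iter` and `tol` are assumptions chosen for illustration only.

```python
import numpy as np

def shared_private_objective(X_list, P, Q_list, U_list, alphas):
    """Evaluate the objective of equation (1) for given factors."""
    total = 0.0
    for X, Q, U, alpha in zip(X_list, Q_list, U_list, alphas):
        residual = X - (P + Q) @ U            # N x L_m reconstruction error
        total += np.sum(residual ** 2)        # squared Frobenius norm
        total += alpha * np.count_nonzero(U)  # l0 term: number of non-zero entries
    return total

def is_feasible(P, Q_list, U_list):
    """Check the constraints P >= 0, Q^m >= 0, U^m >= 0."""
    return bool(np.all(P >= 0)
                and all(np.all(Q >= 0) for Q in Q_list)
                and all(np.all(U >= 0) for U in U_list))

def should_stop(history, max_iter=200, tol=1e-6):
    """Stop at the iteration cap or when the objective changes by less than tol."""
    if len(history) >= max_iter:
        return True
    return len(history) >= 2 and abs(history[-2] - history[-1]) < tol

# Toy data (hypothetical): N=4 drugs, E=2 latent dimensions, M=2 attributes
# with feature dimensions L_1=5 and L_2=3.
rng = np.random.default_rng(0)
N, E, dims = 4, 2, [5, 3]
P = rng.random((N, E))
Q_list = [rng.random((N, E)) for _ in dims]
U_list = [rng.random((E, L_m)) for L_m in dims]
X_list = [(P + Q) @ U for Q, U in zip(Q_list, U_list)]  # exactly reconstructable
alphas = [0.1, 0.1]

value = shared_private_objective(X_list, P, Q_list, U_list, alphas)
```

Because the toy feature spaces are built from the factors themselves, the reconstruction error vanishes and only the sparsity term contributes to `value`.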

In the step S3 of the embodiment, the distance learning strategy of kernel functions and the reconstruction strategy of the kernel functions are designed. Based on the shared representations P and the private representations Q^m of the M attributes acquired in the step S2, the shared representations of the drug d_i are denoted as P_i⋅, and the private representations of the m^th attribute of the drug d_i are denoted as Q_i⋅^m, m=1, . . . , M. The private representations of the M attribute spaces of the drug d_i are concatenated to acquire the private representation Q_i⋅=[Q_i⋅¹, Q_i⋅², . . . , Q_i⋅^M] of the drug d_i.
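The concatenation of the private representations can be illustrated in a few lines. This is a sketch with random stand-in matrices; the sizes N, E, and M are hypothetical.

```python
import numpy as np

N, E, M = 3, 2, 4                      # hypothetical sizes
rng = np.random.default_rng(1)
Q_list = [rng.random((N, E)) for _ in range(M)]  # stand-ins for Q^1, ..., Q^M

# Row i of Q is [Q_i^1, Q_i^2, ..., Q_i^M], so Q has shape (N, M*E).
Q = np.hstack(Q_list)
Q_i = Q[1]                             # concatenated private representation of drug d_i, i=1
```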

K={κ₁, κ₂, . . . , κ_L} denotes a set of L kernel functions, where κ_l denotes the l^th kernel function. First, the shared and private representations of the drugs are projected to a high-dimensional space with a projection function ϕ_l(⋅), and the inner product of two drug attribute representations is calculated in the high-dimensional space. For the adverse reactions between the drugs d_i and d_j, the shared representations P_i⋅, P_j⋅ are taken as inputs of the kernel function κ_l to obtain the similarity measure κ_l(P_i⋅, P_j⋅)=ϕ_l(P_i⋅)^Tϕ_l(P_j⋅), and the private representations Q_i⋅, Q_j⋅ of d_i and d_j are likewise taken as inputs of the kernel function κ_l to obtain the similarity measure κ_l(Q_i⋅, Q_j⋅)=ϕ_l(Q_i⋅)^Tϕ_l(Q_j⋅), l=1, 2, . . . , L. w=[w₁, w₂, . . . , w_L] denotes the weight vector of the L kernel functions. Therefore, multi-kernel representations of the shared representations and private representations of the drugs d_i and d_j can be written as follows:

$\begin{matrix} κ_{w} (P_{i \cdot}, P_{j \cdot}) = {[\begin{matrix} w_{1} κ_{1} (P_{i \cdot}, P_{j \cdot}), \\ w_{2} κ_{2} (P_{i \cdot}, P_{j \cdot}), \\ \dots \\ w_{L} κ_{L} (P_{i \cdot}, P_{j \cdot}) \end{matrix}]}^{T}, κ_{w} (Q_{i \cdot}, Q_{j \cdot}) = {[\begin{matrix} w_{1} κ_{1} (Q_{i \cdot}, Q_{j \cdot}), \\ w_{2} κ_{2} (Q_{i \cdot}, Q_{j \cdot}), \\ \dots \\ w_{L} κ_{L} (Q_{i \cdot}, Q_{j \cdot}) \end{matrix}]}^{T} & (2) \end{matrix}$

For the kernel function κ_l, the matrices κ_l^P, κ_l^Q∈R^N×N respectively denote the similarity matrices of the shared representations and the private representations of the adverse reactions between drugs based on the kernel function κ_l, with matrix elements [κ_l^P]_ij=κ_l(P_i⋅, P_j⋅) and [κ_l^Q]_ij=κ_l(Q_i⋅, Q_j⋅).
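The kernel similarity matrices and the weighted vector of the equation (2) can be sketched as follows. The source does not name the kernel functions in the set K, so the linear, polynomial, and Gaussian (RBF) kernels and their parameters are assumptions chosen only for illustration, as are the function names and the uniform weights.

```python
import numpy as np

def linear_kernel(A, B):
    return A @ B.T

def polynomial_kernel(A, B, degree=2, coef0=1.0):
    return (A @ B.T + coef0) ** degree

def rbf_kernel(A, B, gamma=0.5):
    # Squared Euclidean distances between all rows of A and all rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

KERNELS = [linear_kernel, polynomial_kernel, rbf_kernel]  # assumed kernel set, L = 3

def kernel_similarity_matrices(R, kernels=KERNELS):
    """Stack the N x N matrices [kappa_l]_ij = kappa_l(R_i, R_j) into shape (L, N, N)."""
    return np.stack([k(R, R) for k in kernels])

def multi_kernel_representation(K_stack, w, i, j):
    """Equation (2): the vector [w_1 kappa_1(i, j), ..., w_L kappa_L(i, j)]."""
    return w * K_stack[:, i, j]

rng = np.random.default_rng(2)
P = rng.random((5, 3))   # toy shared representations, N=5, E=3
Q = rng.random((5, 6))   # toy concatenated private representations, M*E=6
K_P = kernel_similarity_matrices(P)
K_Q = kernel_similarity_matrices(Q)
w = np.full(len(KERNELS), 1 / len(KERNELS))  # placeholder uniform weights
rep_P = multi_kernel_representation(K_P, w, 0, 1)
```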

To select appropriate representative kernel functions for constructing the optimal kernel function combination, the distance between the kernel functions κ_l and κ_s is used as an inverse measure of the similarity between them. The smaller the distance between κ_l and κ_s is, the larger their similarity is, and the higher the probability that the kernel function κ_l can be used to represent the kernel function κ_s. A kernel function incidence matrix Y is designed, in which the matrix element Y_ls is the probability that the kernel function κ_l can be used to represent the kernel function κ_s. Therefore, for the similarity matrices κ_l^P of the shared representations and κ_l^Q of the private representations in terms of the kernel function κ_l, the distances between the kernel functions κ_l and κ_s on the shared representations and the private representations can be denoted as follows:

$\begin{matrix} \begin{matrix} D_{ls}^{P} = \| κ_{l}^{P} - κ_{s}^{P} \|_{F}^{2} \\ D_{ls}^{Q} = \| κ_{l}^{Q} - κ_{s}^{Q} \|_{F}^{2} \end{matrix} & (3) \end{matrix}$
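A minimal sketch of the equation (3), assuming the L kernel similarity matrices are stored as one array of shape (L, N, N). The function name and the random symmetric stand-in matrices are hypothetical.

```python
import numpy as np

def kernel_distance_matrix(K_stack):
    """D_ls = ||K_l - K_s||_F^2 for a stack of shape (L, N, N)."""
    diff = K_stack[:, None, :, :] - K_stack[None, :, :, :]  # shape (L, L, N, N)
    return np.sum(diff ** 2, axis=(2, 3))

rng = np.random.default_rng(3)
A = rng.random((3, 4, 4))
K_stack = (A + A.transpose(0, 2, 1)) / 2   # three symmetric toy similarity matrices
D = kernel_distance_matrix(K_stack)        # shape (L, L), here (3, 3)
```

The broadcasted difference needs memory proportional to L²N², which is fine for a small kernel set; for many drugs a loop over kernel pairs would be the more economical choice.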

The matrices D^P, D^Q∈R^L×L respectively collect the pairwise distances of the equation (3) among the L kernel functions on the shared representations and the private representations. Therefore, in combination with the kernel function incidence matrix Y, the distance learning strategy of the kernel functions can be written as follows:

$\begin{matrix} \begin{matrix} \min_{Y} \sum_{l, s}^{L} (D_{ls}^{P} + D_{ls}^{Q}) Y_{ls} = tr [(D^{P} + D^{Q}) Y] \\ s . t . Y^{T} 1 = 1, diag (Y) = 0, Y \geq 0, \end{matrix} & (4) \end{matrix}$

Here, 1 and 0 respectively denote an all-ones vector and an all-zeros vector, Y≥0 denotes the non-negativity of the elements in the kernel function incidence matrix Y, diag(Y)=0 denotes that the diagonal elements of the matrix Y are 0, and

$\sum_{l = 1}^{L} Y_{ls} = 1$

guarantees that the probabilities of the L kernel functions serving as representative kernel functions to represent the kernel function κ_s sum to 1, which is equivalent to the constraint Y^T1=1 in the equation (4). Since Y_ls is the probability that the kernel function κ_l represents the kernel function κ_s, the weight w_l of the kernel function κ_l can be regarded as the mean value of the probabilities that κ_l serves as the representative kernel function to represent the L kernel functions, that is,

$w_{l} = \frac{1}{L} \sum_{s = 1}^{L} Y_{ls} .$
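The constraints of the equation (4) and the weight formula can be checked on a small example. The matrix Y below is a hand-made hypothetical incidence matrix for L=3 kernels, and the function names are illustrative.

```python
import numpy as np

def check_incidence_constraints(Y, atol=1e-8):
    """Verify Y^T 1 = 1 (columns sum to 1), diag(Y) = 0, and Y >= 0."""
    cols_sum_to_one = np.allclose(Y.sum(axis=0), 1.0, atol=atol)
    zero_diag = np.allclose(np.diag(Y), 0.0, atol=atol)
    nonneg = bool(np.all(Y >= -atol))
    return cols_sum_to_one and zero_diag and nonneg

def kernel_weights(Y):
    """w_l = (1/L) * sum_s Y_ls, i.e. the row means of Y."""
    return Y.mean(axis=1)

Y = np.array([[0.0, 0.7, 0.5],
              [0.6, 0.0, 0.5],
              [0.4, 0.3, 0.0]])   # column s holds the probabilities of representing kappa_s
w = kernel_weights(Y)
```

Since every column of Y sums to 1, the L weights themselves sum to 1, so they can be read as a distribution over the kernel functions.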

To estimate the kernel function incidence matrix Y, the similarity matrix κ_l^P of the shared representations of the adverse reactions between drugs based on the kernel function κ_l is represented by the similarity matrices κ_s^P of the shared representations based on the other kernel functions κ_s, and the similarity matrix κ_l^Q of the private representations of the adverse reactions between drugs based on the kernel function κ_l is likewise represented by the similarity matrices κ_s^Q of the private representations based on the other kernel functions κ_s, s=1, 2, . . . , L, s≠l. Therefore, the reconstruction strategy of the kernel functions can be written as the following regularization term:

$\begin{matrix} \min_{Y} \sum_{l = 1}^{L} \left[ \| κ_{l}^{P} - \sum_{s = 1}^{L} Y_{sl} κ_{s}^{P} \|_{F}^{2} + \| κ_{l}^{Q} - \sum_{s = 1}^{L} Y_{sl} κ_{s}^{Q} \|_{F}^{2} \right] & (5) \end{matrix}$
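As a worked step, the reconstruction term of the equation (5) can be expanded in terms of inner products between the kernel similarity matrices. The symbol G^P is introduced here only for illustration and is not part of the original notation; it is assumed to be the L×L Gram matrix with elements [G^P]_ls=tr((κ_l^P)^T κ_s^P). Under that assumption, the shared-representation part becomes:

$\begin{matrix} \sum_{l = 1}^{L} \| κ_{l}^{P} - \sum_{s = 1}^{L} Y_{sl} κ_{s}^{P} \|_{F}^{2} = \sum_{l = 1}^{L} \left[ G_{ll}^{P} - 2 \sum_{s = 1}^{L} Y_{sl} G_{ls}^{P} + \sum_{s, t = 1}^{L} Y_{sl} Y_{tl} G_{st}^{P} \right] \\ = tr (G^{P}) - 2 tr (G^{P} Y) + tr (Y^{T} G^{P} Y) \end{matrix}$

The private-representation part expands in the same way with a Gram matrix G^Q, and the distances of the equation (3) follow from the same quantities as $D_{ls}^{P} = G_{ll}^{P} + G_{ss}^{P} - 2 G_{ls}^{P}$. In this form the N×N similarity matrices are needed only once, to build the L×L Gram matrices.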

In the step S4 of the embodiment, the multi-kernel representation learning model is designed, the representative kernel functions are selected, and the optimal kernel function combination is constructed. Based on the distance learning strategy of the kernel functions given in the equation (4) and the reconstruction strategy of the kernel functions given in the equation (5), the multi-kernel representation learning model is constructed, and the objective function of the model can be written as follows:

$\begin{matrix} \begin{matrix} \min_{Y} \sum_{l = 1}^{L} \left[ \| κ_{l}^{P} - \sum_{s = 1}^{L} Y_{sl} κ_{s}^{P} \|_{F}^{2} + \| κ_{l}^{Q} - \sum_{s = 1}^{L} Y_{sl} κ_{s}^{Q} \|_{F}^{2} \right] + λ tr [(D^{P} + D^{Q}) Y] \\ s . t . Y^{T} 1 = 1, diag (Y) = 0, Y \geq 0 \end{matrix} & (6) \end{matrix}$

The regularization parameter λ controls the weight of the distance learning of the kernel functions. The objective function can be formulated as an augmented Lagrangian function and optimized by ADMM to acquire the iterative update solution of the kernel function incidence matrix Y. By setting a maximum number of iterations or a minimum change threshold of the objective function, the matrix Y is iteratively updated to finally acquire the optimal solution of the objective function.
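The source solves the equation (6) with ADMM and does not list the update rules. As a simpler stand-in, and explicitly not the method of the embodiment, the sketch below minimizes the same objective with projected gradient descent: each column of Y is projected onto the probability simplex with its diagonal entry fixed to 0. It assumes the objective is rewritten with the Gram matrix G of Frobenius inner products between kernel similarity matrices (a symbol introduced here), from which the distances D_ls = G_ll + G_ss - 2G_ls also follow. All function names, the step size rule, and the toy inputs are illustrative assumptions.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def project_incidence(Y):
    """Project every column onto the simplex while keeping diag(Y) = 0."""
    L = Y.shape[0]
    out = np.zeros_like(Y)
    for s in range(L):
        mask = np.arange(L) != s
        out[mask, s] = project_simplex(Y[mask, s])
    return out

def gram(K_stack):
    """G_ls = <K_l, K_s>_F for a stack of shape (L, N, N)."""
    flat = K_stack.reshape(K_stack.shape[0], -1)
    return flat @ flat.T

def objective(Y, G, D, lam):
    """Equation (6) with G = G^P + G^Q and D = D^P + D^Q."""
    return (np.trace(G) - 2 * np.trace(G @ Y) + np.trace(Y.T @ G @ Y)
            + lam * np.trace(D @ Y))

def fit_incidence(K_P, K_Q, lam=0.1, max_iter=500, tol=1e-10):
    G = gram(K_P) + gram(K_Q)
    g = np.diag(G)
    D = g[:, None] + g[None, :] - 2 * G           # summed squared Frobenius distances
    L = G.shape[0]
    Y = project_incidence(np.ones((L, L)))        # uniform feasible starting point
    step = 1.0 / (2 * np.linalg.eigvalsh(G)[-1])  # inverse Lipschitz constant of the gradient
    history = [objective(Y, G, D, lam)]
    for _ in range(max_iter):                     # maximum number of iterations
        grad = -2 * G + 2 * G @ Y + lam * D
        Y = project_incidence(Y - step * grad)
        history.append(objective(Y, G, D, lam))
        if abs(history[-2] - history[-1]) < tol:  # minimum change threshold
            break
    return Y, history

rng = np.random.default_rng(4)
K_P = rng.random((4, 6, 6))   # toy stand-ins for L=4 shared-representation similarity matrices
K_Q = rng.random((4, 6, 6))   # toy stand-ins for the private-representation counterparts
Y, history = fit_incidence(K_P, K_Q, lam=0.1)
```

The objective is a convex quadratic in Y and the feasible set is convex, so this step size keeps the recorded objective values from increasing. An ADMM implementation as described in the embodiment would reach the same constrained problem by a different route.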

Based on the optimized kernel function incidence matrix Y, the weight w_l of the kernel function κ_l can be denoted as

$w_{l} = \frac{1}{L} \sum_{s = 1}^{L} Y_{ls} .$

The L kernel functions are sorted by weight, the r_L kernel functions with the largest weights are selected as the representative kernel functions, and the representative kernel function set can be denoted as K={κ_r₁, κ_r₂, . . . , κ_r_L}. The similarity matrix of the shared representations of the adverse reactions between drugs of the kernel function κ_l is reconstructed by the similarity matrices of the shared representations of the adverse reactions between drugs of the representative kernel functions, i.e.,

$κ_{l}^{P} = w_{l} \sum_{\overline{l} = 1}^{r_{L}} Y_{\overline{l} l} κ_{\overline{l}}^{P},$

and the similarity matrix of the private representations of the adverse reactions between drugs of the kernel function κ_l can also be reconstructed by the similarity matrices of the private representations of the adverse reactions between drugs of the representative kernel functions, i.e.,

$κ_{l}^{Q} = w_{l} \sum_{\overline{l} = 1}^{r_{L}} Y_{\overline{l} l} κ_{\overline{l}}^{Q},$

where Y_l̄l denotes the element in the l̄^th row and l^th column of the kernel function incidence matrix Y. The expression

$κ_{l}^{P} = w_{l} \sum_{\overline{l} = 1}^{r_{L}} Y_{\overline{l} l} κ_{\overline{l}}^{P}$

denotes the reconstruction of the similarity matrix κ_l^P of the shared representations of the adverse reactions between drugs of the kernel function κ_l by the similarity matrices κ_l̄^P of the shared representations of the adverse reactions between drugs of the representative kernel functions κ_l̄ and the elements Y_l̄l, each denoting the probability of the representative kernel function κ_l̄ representing the kernel function κ_l, l̄=1, 2, . . . , r_L. Therefore, based on the selected representative kernel functions, multi-kernel representations of the shared representations and the private representations of the drugs d_i and d_j can be written as follows:

$\begin{matrix} κ_{w} (P_{i \cdot}, P_{j \cdot}) = {[\begin{matrix} w_{1} \sum_{\overline{l} = 1}^{r_{L}} Y_{\overline{l} 1} κ_{\overline{l}} (P_{i \cdot}, P_{j \cdot}), \\ w_{2} \sum_{\overline{l} = 1}^{r_{L}} Y_{\overline{l} 2} κ_{\overline{l}} (P_{i \cdot}, P_{j \cdot}), \\ \dots \\ w_{L} \sum_{\overline{l} = 1}^{r_{L}} Y_{\overline{l} L} κ_{\overline{l}} (P_{i \cdot}, P_{j \cdot}) \end{matrix}]}^{T}, & (7) \end{matrix}$

$κ_{w} (Q_{i \cdot}, Q_{j \cdot}) = {[\begin{matrix} w_{1} \sum_{\overline{l} = 1}^{r_{L}} Y_{\overline{l} 1} κ_{\overline{l}} (Q_{i \cdot}, Q_{j \cdot}), \\ w_{2} \sum_{\overline{l} = 1}^{r_{L}} Y_{\overline{l} 2} κ_{\overline{l}} (Q_{i \cdot}, Q_{j \cdot}), \\ \dots \\ w_{L} \sum_{\overline{l} = 1}^{r_{L}} Y_{\overline{l} L} κ_{\overline{l}} (Q_{i \cdot}, Q_{j \cdot}) \end{matrix}]}^{T}$

The representative kernel function set K={κ_r₁, κ_r₂, . . . , κ_r_L} is regarded as the optimal kernel function combination.
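The selection of the representative kernel functions and the vector of the equation (7) can be sketched as follows. The incidence matrix Y, the kernel values for one drug pair, and the choice r_L=2 are hypothetical toy values, and the function names are illustrative.

```python
import numpy as np

def select_representatives(Y, r_L):
    """Return the weights w_l and the indices of the r_L kernels with the largest weights."""
    w = Y.mean(axis=1)                          # w_l = (1/L) * sum_s Y_ls
    reps = np.argsort(-w, kind="stable")[:r_L]  # descending by weight
    return w, reps

def representative_multi_kernel(kappa_ij, Y, w, reps):
    """Equation (7): entry l is w_l * sum over representatives lbar of Y[lbar, l] * kappa_lbar(i, j)."""
    return w * (kappa_ij[reps] @ Y[reps, :])

Y = np.array([[0.0, 0.7, 0.5],
              [0.6, 0.0, 0.5],
              [0.4, 0.3, 0.0]])
kappa_ij = np.array([2.0, 1.0, 0.5])   # toy values of kappa_1, kappa_2, kappa_3 for one drug pair
w, reps = select_representatives(Y, r_L=2)
vec = representative_multi_kernel(kappa_ij, Y, w, reps)
```

With these toy values the first two kernel functions have the largest weights, so only their similarity values enter the L-dimensional vector `vec`; the third kernel function contributes only through its column of Y and its weight.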

In the step S5 of the embodiment, the model for predicting adverse reactions between drugs is constructed by using the optimal kernel function combination. A vector r^ij is given to denote the adverse reaction relationship between the drugs d_i and d_j, and an R-layer neural network is designed to mine a potential relationship between the multi-attribute and multi-kernel representations of the drugs and the adverse reactions. The equation (7) gives the multi-kernel representations κ_w(P_i⋅, P_j⋅), κ_w(Q_i⋅, Q_j⋅) of the shared representations and the private representations of the drugs d_i and d_j based on the representative kernel functions, so that the mapping relationship between the multi-attribute and multi-kernel representations of the drugs and the adverse reactions can be written as follows:

$\begin{matrix} \begin{matrix} h^{(1)} = σ (E^{(1)} κ_{w}^{ij} + b^{(1)}), \\ h^{(r)} = σ (E^{(r)} h^{(r - 1)} + b^{(r)}), r = 2, 3, \dots, R \end{matrix} & (8) \end{matrix}$

h^(r), E^(r), and b^(r) respectively denote the output vector, the coefficient matrix, and the offset vector of the r^th layer of the neural network, and σ(⋅) denotes the activation function. κ_w^ij denotes the concatenation of the multi-kernel representations of the shared representations and the private representations of the drugs d_i and d_j, that is, κ_w^ij=[κ_w(P_i⋅, P_j⋅), κ_w(Q_i⋅, Q_j⋅)]. In particular, the output vector h^(R) of the R^th layer of the neural network is regarded as the predicted value r̄^ij of the adverse reactions between the drugs d_i and d_j. The difference between the true value r^ij and the predicted value r̄^ij of the adverse reaction vectors between the drugs d_i and d_j is estimated by the mean square error loss function, and the error function can be written as follows:

$\begin{matrix} L = \sum_{\| r^{ij} \|_{2} \geq 0} \| r^{ij} - {\overline{r}}^{ij} \|_{2}^{2} & (9) \end{matrix}$
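The forward pass of the equation (8) and the loss of the equation (9) can be sketched in NumPy as follows. The source does not specify the activation σ, the layer widths, or the initialization, so the sigmoid activation, the hidden width of 8, the small random weights, and all function names are assumptions for illustration. Training by backpropagation is omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_layers(sizes, rng):
    """Create (E^(r), b^(r)) pairs for layer sizes [input, hidden..., K]."""
    return [(rng.normal(0.0, 0.1, (n_out, n_in)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(kappa_w_ij, layers):
    """Equation (8): h^(r) = sigma(E^(r) h^(r-1) + b^(r)), starting from kappa_w^ij."""
    h = kappa_w_ij
    for E_r, b_r in layers:
        h = sigmoid(E_r @ h + b_r)
    return h                                   # h^(R), the predicted reaction vector

def squared_error_loss(R_true, R_pred):
    """Equation (9): sum of squared l2 distances between true and predicted vectors."""
    return float(np.sum((R_true - R_pred) ** 2))

rng = np.random.default_rng(5)
L, K = 3, 4                                    # hypothetical: 3 kernels, 4 reaction types
layers = init_layers([2 * L, 8, 8, K], rng)    # R = 3 layers; input is [kappa_w(P), kappa_w(Q)]
kappa_w_ij = rng.random(2 * L)                 # toy concatenated multi-kernel representation
r_pred = forward(kappa_w_ij, layers)
r_true = np.array([1.0, 0.0, 0.0, 1.0])
loss = squared_error_loss(r_true, r_pred)
```

The input has dimension 2L because the equation (7) yields one L-dimensional vector for the shared representations and one for the private representations, which are then concatenated.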

In the step S6 of the embodiment, the multi-attribute information X_a⋅^m and X_b⋅^m, m=1, 2, . . . , M, of the drugs d_a and d_b is given to predict the adverse reactions between the drugs d_a and d_b. Through the step S2 of the embodiment, the shared representations P_a⋅ and P_b⋅ and the private representations Q_a⋅^m and Q_b⋅^m, m=1, 2, . . . , M, of the multi-attribute information of the drugs d_a and d_b are obtained, and the private representations of the M attributes of d_a and d_b are concatenated, i.e., Q_a⋅=[Q_a⋅¹, Q_a⋅², . . . , Q_a⋅^M], Q_b⋅=[Q_b⋅¹, Q_b⋅², . . . , Q_b⋅^M]. Based on the kernel function set K={κ₁, κ₂, . . . , κ_L}, the similarity measures κ_l(P_a⋅, P_b⋅) and κ_l(Q_a⋅, Q_b⋅) of the shared representations and the private representations of the drugs d_a and d_b are calculated, l=1, 2, . . . , L. Then, based on the representative kernel function set K={κ_r₁, κ_r₂, . . . , κ_r_L} and the kernel function incidence matrix Y obtained in the step S4 of the embodiment, the multi-kernel representations κ_w(P_a⋅, P_b⋅) and κ_w(Q_a⋅, Q_b⋅) of the shared representations and the private representations of the drugs d_a and d_b are calculated by the equation (7) to obtain the concatenated vector κ_w^ab=[κ_w(P_a⋅, P_b⋅), κ_w(Q_a⋅, Q_b⋅)]. Finally, by taking the vector κ_w^ab as the input of the R-layer neural network constructed in the step S5 of the embodiment, a predicted vector r̄^ab of the adverse reactions between the drugs d_a and d_b is calculated. The larger the vector element r̄_k^ab is, the more likely the k^th adverse reaction is to be caused between the drugs d_a and d_b.
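Reading off the most probable adverse reactions from a predicted vector amounts to sorting its elements in descending order. In the sketch below, the predicted values and the reaction labels are hypothetical placeholders, not outputs of the described model.

```python
import numpy as np

def top_reactions(r_pred, names, top_k=3):
    """Return the top_k (name, score) pairs, largest predicted value first."""
    order = np.argsort(-r_pred, kind="stable")[:top_k]
    return [(names[k], float(r_pred[k])) for k in order]

r_ab = np.array([0.12, 0.91, 0.47, 0.78])   # hypothetical predicted vector for drugs d_a and d_b
names = ["reaction_1", "reaction_2", "reaction_3", "reaction_4"]  # placeholder labels
ranked = top_reactions(r_ab, names, top_k=2)
```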

The method for predicting adverse reactions between drugs based on multi-attribute and multi-kernel representation learning provided in the embodiment, by constructing the optimal kernel function combination, explores the potential characteristic rules of different attributes in modeling the adverse reactions between drugs, and reveals the relationship between the multi-attribute similarities of the drugs and the adverse reactions between drugs, thereby realizing prediction of the adverse reactions between drugs. The predicted results can provide data support for research on the adverse reactions between drugs based on bio-experimental methods and for research on the safety of new drugs. FIGS. 3A-3C show some of the predicted adverse reactions between drugs, where FIG. 3A shows that combined use of a strong synthesized analgesic Methadone and an anticoagulant Apixaban will cause adverse reactions such as gingival bleeding, abnormal liver function, and malnutrition; FIG. 3B shows that simultaneous use of a hypotensive drug Carvedilol and an antidiabetic Gliclazide will induce adverse reactions such as pathoglycemia, ankylosis, uroclepsia, and angina; combined use of an opioid analgesic tramadol and an antifungal infection drug fluconazole will induce adverse reactions such as skin ulcer, hepatitis, and intestinal obstruction; and FIG. 3C shows that simultaneous use of an N-methyl-D-aspartic acid receptor antagonist Memantine for Alzheimer's disease and a histamine type II receptor antagonist Cimetidine will cause adverse reactions such as psychiatric disorders and nervous system tension.

Claims

1. A method for predicting adverse reactions between drugs based on a multi-attribute and multi-kernel representation learning, comprising the following steps:
S1: collecting data of the adverse reactions between the drugs and a multi-attribute information of the drugs to construct vectors of the adverse reactions between the drugs and the multi-attribute information of the drugs, comprising: defining a drug set as D={d₁, d₂, . . . , d_N}, wherein N is a number of the drugs; constructing a vector r^ij∈{0, 1}^K to denote an adverse reaction relationship between an i^th drug d_i and a j^th drug d_j, wherein K denotes a number of types of the adverse reactions, and if a k^th adverse reaction is induced by an interaction between the i^th drug d_i and the j^th drug d_j, then r_k^ij=1; otherwise, r_k^ij=0, k=1, 2, . . . , K; and constructing a matrix X^m∈R^N×L_m to denote a feature space of an m^th attribute of the drugs, wherein L_m denotes a feature dimension of the m^th attribute, m=1, 2, . . . , M and M denotes a number of attributes;
S2: learning a shared representation and a private representation of the multi-attribute information of the drugs, wherein the shared representation means that different attributes have a consistency information for a prediction of the adverse reactions between the drugs, the private representation means that the different attributes contain a specific supplementary information of each attribute, a feature space of the each attribute consists of the shared representation and the private representation after multi-attribute feature spaces of the drugs are projected to a same low-dimensional dense space, and an objective function is constructed based on a multi-attribute representation learning so as to obtain solutions of the shared representation and the private representation of M attribute spaces of the drugs;
S3: constructing a distance learning strategy of kernel functions and a reconstruction strategy of the kernel functions, comprising: using the shared representation and the private representation of the drugs as an input of a kernel function set, performing a similarity measure on the shared representation and the private representation between the drugs by the kernel function set to calculate distances among the kernel functions, setting a similarity to increase as the distances among the kernel functions decrease, thus obtaining a similarity matrix of the shared representation and the private representation using the distances among the kernel functions of the shared representation and the private representation of the drugs, and finally, constructing a kernel function learning strategy according to the similarity matrix and a kernel function incidence matrix, wherein the kernel function incidence matrix is a probability matrix, and matrix entries denote a probability of a first kernel function representing a second kernel function; and the reconstruction strategy of the kernel functions is constructed for estimating the kernel function incidence matrix, a shared representation matrix of the adverse reactions between the drugs of a predetermined kernel function is configured to be reconstructed by a shared representation matrix of the adverse reactions between the drugs of other kernel functions, and a private representation matrix of the adverse reactions between the drugs of the predetermined kernel function is configured to be reconstructed by a private representation matrix of the adverse reactions between the drugs of the other kernel functions, so that the reconstruction strategy of the kernel functions is obtained;
S4: constructing a multi-kernel representation learning model, selecting a representative kernel function, and constructing an optimal kernel function combination, comprising: constructing an objective function of the multi-kernel representation learning model according to the distance learning strategy of the kernel functions and the reconstruction strategy of the kernel functions, solving the objective function of the multi-kernel representation learning model to obtain the kernel function incidence matrix, so as to obtain a weight of each kernel function, sequencing the kernel functions according to the weight of each kernel function, selecting a kernel function with a maximum weight as the representative kernel function to further obtain a representative kernel function set as the optimal kernel function combination, and finally obtaining multi-kernel representations of the shared representation and the private representation of the drugs based on the optimal kernel function combination;
S5: constructing a model for predicting the adverse reactions between the drugs by using the optimal kernel function combination, comprising: based on the vector r^ij, mining a potential relationship between multi-attribute and multi-kernel representations of the drugs and the adverse reactions by using an R-layer neural network to obtain a mapping relationship between the multi-attribute and multi-kernel representations of the drugs and the adverse reactions between the drugs, using concatenated multi-kernel representations of the shared representation and the private representation of the i^th drug d_i and the j^th drug d_j as an input of the R-layer neural network, using an output vector of an R^th layer of the R-layer neural network as a predicted value r̄^ij of the adverse reactions between the i^th drug d_i and the j^th drug d_j, estimating a difference between a true value r^ij and the predicted value r̄^ij of adverse reaction vectors between the i^th drug d_i and the j^th drug d_j by using a mean square error loss function, and performing a training by using the data collected in the S1 to obtain a trained adverse reaction prediction model; and
S6: acquiring the multi-attribute information of two drugs, calculating the multi-kernel representations of the shared representation and the private representation of the two drugs, and inputting the multi-kernel representations into the trained adverse reaction prediction model to obtain a predicted result of the adverse reactions between the two drugs.
2. The method for predicting the adverse reactions between the drugs based on the multi-attribute and multi-kernel representation learning according to claim 1, wherein the objective function constructed in the S2 is as follows:
3. The method for predicting the adverse reactions between the drugs based on the multi-attribute and multi-kernel representation learning according to claim 2, wherein the distance learning strategy of the kernel functions constructed in the S3 is as follows:
4. The method for predicting the adverse reactions between the drugs based on the multi-attribute and multi-kernel representation learning according to claim 3, wherein the objective function of the multi-kernel representation learning model constructed in the S4 is as follows:
5. The method for predicting the adverse reactions between the drugs based on the multi-attribute and multi-kernel representation learning according to claim 4, wherein the mapping relationship between the multi-attribute and multi-kernel representations of the drugs and the adverse reactions of the R-layer neural network in the S5 is as follows:

Priority Claims (1)

Number	Date	Country	Kind
2023116173052	Nov 2023	CN	national
