CROSS-REFERENCE TO RELATED APPLICATION
This application claims the priority benefit of China application serial no. 202310822440.4, filed on Jul. 5, 2023. The entirety of the above-mentioned patent application is hereby incorporated by reference herein and made a part of this specification.
TECHNICAL FIELD
The present invention belongs to the technical field of application programming interface (API) recommendation, and relates to a cloud-native API recommendation method fusing data augmentation and contrastive learning.
BACKGROUND
In the current cloud-native era, many companies have embraced moving enterprise workloads onto the cloud. The emergence of the cloud-native concept frees enterprises from having to build their own infrastructure, so how to adapt applications to cloud components and how to make reasonable use of the elasticity and the computation, storage, separation and other functions of the cloud have become crucial. Traditional manufacturers whose businesses are moving online need to transform digitally, and in order to handle problems such as high concurrency and high throughput, the use of an Internet architecture is inevitable. An increasing number of enterprises are choosing to provide service-based applications. The software-as-a-service (SaaS) mode, in which software takes the leading role and services form the core of the business, provides computing and data resources for developers and offers mature cloud software solutions for enterprises. With an open service platform, a developer can rapidly satisfy complex user requirements by combining reusable and replaceable third-party services. A mashup service is a typical service composition mode that generates a new application by rapidly combining existing Web application programming interfaces (APIs); being efficient and convenient to use, it is favored by developers and has developed rapidly.
At present, Web APIs are mainly built with the representational state transfer (RESTful) service technology. As the business requirements of applications grow more complex, a single RESTful service can hardly satisfy user requirements. In view of this, a service composition technology called mashup has been proposed. A mashup service is mostly formed by a combination of APIs, and its core is to reorganize single APIs or different data sources. An increasing number of enterprises are choosing to provide service-based applications, and choosing the appropriate APIs to build a combined service is crucial. The rapid growth in the number of candidate APIs and the large number of services with similar functions further increase the difficulty for a user to choose the appropriate API. These challenges have given rise to service recommendation technology.
Many solutions have been proposed for the service recommendation problem. A graph neural network (GNN) propagates information through a graph structure. The GNN has attracted great attention as an effective representation learning method, and is widely used in tasks that process non-Euclidean data, such as drug recognition, text classification and temporal recommendation. In collaborative-filtering-based recommendation, the GNN can take the graph structure formed by invocation information as input so that adjacent nodes affect each other, thereby improving the capability to learn from that information. For service recommendation, the invocation relation between mashups and APIs can be represented as a bipartite graph structure. Moreover, additional information such as tags can also be naturally represented as a graph association structure, and is well suited to being processed as a non-Euclidean data task.
Contrastive learning is a typical discriminative self-supervised learning method that learns representations by comparing data. Original data is augmented to form new data, and the newly generated data is compared such that the distance between a pair augmented from the same datum is minimized while the distance between pairs augmented from different data is maximized, so that differences between data are learned without labels.
SUMMARY OF THE DISCLOSURE
In order to overcome the defects of the prior art, make better use of the information contained in services and improve the effect of a service recommendation system, the present invention provides a cloud-native application programming interface (API) recommendation method fusing data augmentation and contrastive learning. Service information is incorporated on the basis of a double-graph structure of service information, and a mutual attention mechanism is designed to compute the importance of each layer of information. A data optimization method for sequence information based on functional similarity and a method for computing similarity between services based on two kinds of information are provided. On this basis, data of service invocation sequences is augmented following the idea of contrastive learning to form augmented sequence pairs. A contrastive loss function is combined with a pair-wise recommendation loss function to optimize the overall model, thereby improving the effect of the service recommendation model. According to the feature embedding representation of each service, pair-wise recommendation scores are computed to complete service recommendation.
The technical solution used by the present invention is as follows:
- a cloud-native API recommendation method fusing data augmentation and contrastive learning. The method includes:
- step 1, on the basis of service invocation matrix information and function association information between APIs, constructing an invocation graph structure, service invocation matrix graph (SIMG), between mashup services and API services and an invocation graph structure, connecting API functional graph (CAFG), between the APIs to form a double-graph structure;
- step 2, according to the graph structure SIMG, designing a multi-layer graph propagation network structure to learn mashup feature representations and API feature representations;
- step 3, according to the graph structure CAFG, designing a corresponding graph neural network structure to compute feature representations corresponding to APIs under the graph neural network structure;
- step 4, designing a mutual attention mechanism layer to retain multi-layer output information, and computing attention weights of multi-layer output and merging feature representations of the mashup services by taking association information of the APIs as importance guidance;
- step 5, carrying out weighted combination on API feature representations of double graphs through a gate mechanism to generate a new API feature representation;
- step 6, computing training similarity and prior information similarity between services, and generating data augmented sequence pairs through a data augmentation method;
- step 7, computing pair-wise scores through the obtained feature representation, and computing an overall loss function result through the pair-wise scores and the data augmented sequence pairs to optimize parameters of an overall recommendation model; and
- step 8, matching a user request, sorting the pair-wise scores and completing service recommendation.
Further, step 1 includes:
- 1.1 representing mashup services in a mashup service set M as a mashup type of graph nodes, representing APIs in an API set A as an API type of graph nodes, and representing an overall graph node set as Vmα;
- 1.2 setting an empty set εmα as a storage set of side information in the graph structure SIMG;
- 1.3 traversing the mashup services in M, setting a single mashup service traversed currently as m, and obtaining an API set minvoke invoked by m from an invocation interaction matrix MA between the mashups and the APIs, where the service invocation interaction matrix MA has a size of |M|×|A|, where |M| indicates the number of elements in the set M, and |A| indicates the number of elements in the set A;
- 1.4 traversing the API set minvoke obtained in the previous step, setting the API traversed currently as α, and when the value at the corresponding position [m, α] in the invocation interaction matrix MA is 1, which indicates that an invocation relation exists between m and α, storing a side combination (m, α) into εmα, which indicates that an undirected connection side exists between the graph nodes corresponding to m and α;
- 1.5 completing traversal, and combining a graph node set Vmα and a side information set εmα to complete construction of a graph structure SIMG (Vmα, εmα);
- 1.6 converting the API set A into a graph node set, which is represented as Vα;
- 1.7 setting an empty set εα as a storage set of side information in the graph structure CAFG;
- 1.8 traversing the APIs in the set A, setting a single API traversed currently as α1, on this basis, traversing the APIs in A apart from α1, and setting a single API traversed currently as α2;
- 1.9 with a tag information set Atag of the APIs including tags corresponding to the APIs, setting a tag set corresponding to the APIs α1 as Aα1tag, setting a tag set corresponding to the APIs α2 as Aα2tag and when Aα1tag∩Aα2tag is not an empty set, storing a side combination (α1, α2) into εα, which indicates that an undirected connection side exists between graph nodes corresponding to α1 and α2; and
- 1.10 completing traversal, and combining the graph node set Vα and the side information set εα to complete construction of a graph structure CAFG(Vα, εα).
Furthermore, step 2 includes:
- 2.1 constructing an embedding layer to store the embedding vector information of the mashup services and the API nodes, where Emashup indicates the embedding representations corresponding to the mashup services, and Eα indicates the embedding representations corresponding to the APIs; and combining Emashup and Eα to obtain an embedding layer embedding matrix E;
- 2.2 computing an adjacency matrix of the graph structure SIMG, which is set as Imα:

$$I_{m\alpha} = \begin{bmatrix} 0 & MA \\ MA^{T} & 0 \end{bmatrix}$$

- where MA is the service invocation interaction matrix, the adjacency matrix has a size of (|M|+|A|)×(|M|+|A|) and includes the connection information of the nodes in the graph structure, and T indicates the matrix transpose operation;
- 2.3 defining the degree corresponding to the mashup service m as dm and the degree of the node corresponding to the API α as dα, combining the degree information corresponding to the mashup services and the APIs into a degree matrix Dmα, and computing a normalized form of the adjacency matrix:

$$\tilde{I}_{m\alpha} = D_{m\alpha}^{-1/2} I_{m\alpha} D_{m\alpha}^{-1/2}$$

- where the element corresponding to a position (i, j) in the normalized matrix Ĩmα has a value of $1/\sqrt{d_i^m d_j^\alpha}$, where $d_i^m$ indicates the degree corresponding to the mashup service having a sequence number of i, and $d_j^\alpha$ indicates the degree corresponding to the API node having a sequence number of j;
- 2.4 traversing node information in the graph structure SIMG, setting nodes traversed currently as ν including the mashup type of graph nodes and the API type of graph nodes, setting a neighbor node set of ν in the graph structure, i.e., a node set having a side connection with the nodes, as Nν, and computing a propagation result of each νt∈Nν:
- where l indicates the number of layers of a graph propagation structure of a multi-layer graph neural network, Wl1 and Wl2 are both trainable parameter matrixes of a current layer, ⊗ indicates multiplication operation of vector elements, eν indicates an embedding representation corresponding to the node ν, and eνt indicates an embedding representation of a neighbor node νt of the node ν;
- 2.5 aggregating propagation information of the node ν and the neighbor node of the node to form node feature embedding output of the current layer:
- where σ is an activation function, eνl-1 indicates node embedding output of a previous layer, and Σ indicates vector summation operation; and
- 2.6 according to properties of the node ν, dividing an output node embedding result into a mashup service feature representation eml and an API feature representation eαl for output.
Step 3 includes:
- 3.1 one-hot representation is a vectorized representation method in which each feature is represented by an independent binary dimension: only one dimension has a value of 1, the remaining dimensions all have a value of 0, and the dimensions are configured to represent categorical or discrete variables; constructing an API embedding representation based on one-hot representation, and converting the high-dimensional one-hot representation corresponding to the API set into a dense embedding representation AOE∈R|A|×d through an embedding layer, where d is the dimension of the embedding representation;
- 3.2 obtaining an adjacency matrix Iα∈R|A|×|A| of the graph structure CAFG, where when Iα[u, w]=1, it is indicated that the API αu having a sequence number of u and the API αw having a sequence number of w share the same tag information;
- 3.3 computing a corresponding degree matrix Dα∈R|A|×|A|, where the matrix is a diagonal matrix, where Dα[u, u] indicates the total number of the APIs having the same tag as αu;
- 3.4 computing a normalized adjacency matrix $\tilde{I}_\alpha = D_\alpha^{-1/2} I_\alpha D_\alpha^{-1/2}$ as the convolution weight of the input vector set; and
- 3.5 carrying out graph convolution computation on the embedding representation through the normalized adjacency matrix:

$$C = \sigma\left(\tilde{I}_\alpha\, \sigma\left(\tilde{I}_\alpha A_{OE} W_5\right) W_6\right)$$

- where σ indicates an activation function, W5∈Rd×d and W6∈Rd×d are the weights of the two layers respectively, the computed output C is used as the API feature representation output under this network structure and has a size of C∈R|A|×d, and the feature representation corresponding to the API α is defined as cα.
Step 4 includes:
- 4.1 representing a matrix consisting of all feature representation vectors of the mashup services of the l-th layer output obtained in step 2.6 as Eml, representing a matrix consisting of the feature representation vectors corresponding to the APIs as Eαl, and combining the output of the current propagation layer: $E_q^l = E_m^l \oplus E_\alpha^l$;
- 4.2 combining Eml and the feature matrix C output from step 3.5: $E_{key}^l = E_m^l \oplus C$;
- 4.3 the attention mechanism is a common method in machine learning configured to assign weights to different elements in a sequence; it includes a query part and a key part, and computes the similarity between the query and the key to carry out a weighted summation over the elements, thereby achieving feature extraction and weighted fusion of the sequence; designing the mutual attention mechanism to connect the feature representations of the mashup services output by the multi-layer structure, and computing a query representation of the mutual attention mechanism:
$$q^l = W_q^l E_q^l$$

- where $W_q^l$∈Rd×(|M|+|A|) is a trainable weight parameter, and ql is used as the attention query corresponding to the output result of the l-th propagation layer;
- 4.4 computing a key representation of the mutual attention mechanism:
$$k^l = W_k E_{key}^l$$
- where Wk∈Rd*(|M|+|A|) is a trainable weight parameter, and kl indicates an attention key corresponding to an output result of the l-th propagation layer;
- 4.5 computing the mutual attention weight information corresponding to the l-th layer:

$$\alpha^l = \frac{\exp\left((q^l)^T k^l\right)}{\sum_{l'=1}^{L} \exp\left((q^{l'})^T k^{l'}\right)}$$
- where exp( ) indicates exponential power by using a natural constant e as a base, L is the total number of graph propagation layers, and T indicates vector transpose; and
- 4.6 weighting the output of the graph propagation layers:

$$\alpha E_m^l = \alpha^l \otimes E_m^l$$

- where ⊗ is the operation of computing the product of each element, and αEml indicates the weighted result of the feature vectors of the l-th layer output; and then connecting the weighted output of the layers, and converting the weighted output into a final mashup feature representation:

$$FE_m = f_{cm}\left(\alpha E_m^1 \oplus \alpha E_m^2 \oplus \cdots \oplus \alpha E_m^L\right)$$
- where ⊕ indicates vector connection operation, ƒcm indicates a multi-layer perceptron having an input size of L*d and output of d, and the multi-layer perceptron is a model based on a neural network, and consists of a plurality of fully-connected hidden layers and output layers; and setting a mashup feature vector result corresponding to the mashup m in the matrix FEm as ƒem.
Step 5 includes:
- 5.1 computing, by the API feature representation eαl output by the graph neural network structure based on the graph structure SIMG and the API feature representation cα output by the graph neural network structure based on the graph structure CAFG, a gate weight corresponding to the gate mechanism:
- where Wg is a trainable weight, and Eαl indicates the set of eαl corresponding to all the APIs; and
- 5.2 with the weight for connection, computing a final output result:
- where FEα is a matrix consisting of all the API feature representations obtained by weighting feature representations in a form of a matrix; and setting a feature vector result corresponding to the API α in the matrix FEα as ƒeα.
Step 6 includes:
- 6.1 computing the training similarity between the services, using the dot product of the embedding vectors of two APIs or two mashup services as the similarity computation result:

$$Sim_{train}^{api}(\alpha_1, \alpha_2) = e_{\alpha_1}^{T} e_{\alpha_2}, \qquad Sim_{train}^{mashup}(m_1, m_2) = e_{m_1}^{T} e_{m_2}$$

- where α1, α2 indicate any two different APIs, Simtrainapi (α1, α2) indicates the training similarity between the APIs α1 and α2, m1, m2 indicate any two different mashup services, and Simtrainmashup (m1, m2) indicates the training similarity between the mashups m1 and m2; and
- 6.2 computing the prior information similarity between the services on the basis of similarity computation over the service description documents and the service function tags, first computing the similarity between the service description documents:

$$Sim_{description}^{api}(\alpha_1, \alpha_2) = \cos(e_{\alpha_1}^{lm}, e_{\alpha_2}^{lm}) = \frac{e_{\alpha_1}^{lm} \cdot e_{\alpha_2}^{lm}}{\|e_{\alpha_1}^{lm}\|\,\|e_{\alpha_2}^{lm}\|}, \qquad Sim_{description}^{mashup}(m_1, m_2) = \cos(e_{m_1}^{lm}, e_{m_2}^{lm})$$

- where cos( ) indicates cosine similarity, eαlm indicates the service function representation obtained by computation of a language model corresponding to the API α, Simdescriptionapi (α1, α2) indicates the prior information similarity between the APIs α1 and α2, ∥eαlm∥ indicates the modulus of the vector eαlm, emlm indicates the service function representation obtained by computation of a language model corresponding to the mashup m, Simdescriptionmashup (m1, m2) indicates the prior information similarity between the mashups m1 and m2, and ∥emlm∥ indicates the modulus of the vector emlm;
- 6.3 computing similarity between service tags:
- where mtag indicates a function tag set of the mashup, αtag indicates a tag set of the APIs, |mtag| indicates the number of mashup function tags, and |αtag| indicates the number of the API function tags;
- 6.4 using the larger result between the function description similarity and the tag similarity as the prior information similarity between the services:

$$Sim_{prior}(\cdot) = \max\left(\widetilde{Sim}_{description}(\cdot), \widetilde{Sim}_{tag}(\cdot)\right)$$

- where $\widetilde{Sim}$( ) indicates a normalized similarity result;
- 6.5 using the larger result between the training similarity and the prior information similarity as the overall similarity information between the services, $Sim(\cdot) = \max\left(Sim_{train}(\cdot), Sim_{prior}(\cdot)\right)$, and
- computing this similarity information between the services as the basis for data augmentation;
- 6.6 designing four service sequence data augmentation methods as follows:
- (1) service cutting: for a service sequence mαc={α1, α2, . . . , α|mαc|}, selecting a continuous subsequence having a length of Lsc=⌊μ*|mαc|⌋ from a random position as an augmented sequence, where μ∈(0,1) is a random cutting parameter and |mαc| is the number of the APIs in the service invocation sequence;
- (2) service occlusion: for a service sequence mαc={α1, α2, . . . , α|mαc|}, randomly discarding Lsm=⌊η*|mαc|⌋ APIs, where η∈(0,1) is a random occlusion parameter;
- (3) association replacement: for a service invocation sequence mαc={α1, α2, . . . , α|mαc|}, randomly selecting Lrc=⌊γ*|mαc|⌋ items and replacing each with a service having high function correlation, where γ∈(0,1) indicates a replacement probability parameter;
- (4) association expansion: for a service invocation sequence mαc={α1, α2, . . . , α|mαc|}, randomly selecting Lre=⌊ε*|mαc|⌋ items and inserting a service having high function correlation with each item at the position in which the item is located, where ε∈(0,1) indicates an expansion probability parameter; and
- in the augmentation methods (3) and (4), determining the service function correlation through the similarity information between the services obtained in step 6.5; with respect to the association replacement method (3), for the Lrc services randomly selected, obtaining, through the similarity computation result of step 6.5, the services having the highest function similarity among the candidate services, apart from the services already existing in the original sequence, as new services to replace the randomly selected services; and with respect to the association expansion method (4), for the Lre services randomly selected, obtaining new services in the same way, and adding the new services after the original positions, in their sequences, of the randomly selected services to complete the sequence augmentation;
- 6.7 for the mashup service set M, traversing the mashup services in M, setting a single mashup service traversed currently as m, and, for the service invocation sequence mαm corresponding to m, selecting two sequence data augmentation methods provided in step 6.6 to process the sequence, so as to generate two newly augmented sequence results, which are represented as αsm1 and αsm2 respectively; for the API set A, traversing the APIs in A, setting a single API traversed currently as α, setting the mashup sequence that invoked the API as αmα, and selecting two sequence data augmentation methods provided in step 6.6 to process the sequence, so as to generate two newly augmented sequence results, which are represented as msα1 and msα2 respectively;
Preferably, in step 6.7, a process of selecting the augmentation methods and generating the augmented sequence pairs includes:
- 6.7.1 setting a set for storing combined data augmentation methods, which is set as auglist and is initially an empty set; and storing the four service sequence data augmentation methods in step 6.6 into a set DA, setting the number of elements in the set as nd, and setting a count index as 1;
- 6.7.2 traversing DA, and setting a method object traversed currently as dα;
- 6.7.3 for each traversal, assigning index+1 to target;
- 6.7.4 when target is less than nd, storing da into auglist, storing augmentation methods having a sequence number of target in DA into auglist, and adding 1 to target;
- 6.7.5 repeating steps 6.7.3 and 6.7.4 until target has a value greater than or equal to nd;
- 6.7.6 adding 1 to index;
- 6.7.7 ending traversal, and completing initialization of the set auglist;
- 6.7.8 assigning 0 to a counter now, and setting a set for storing augmented sequence pairs, which is set as ASP, and is initially an empty set;
- 6.7.9 traversing the API invocation sequences mα of the mashups, and storing the invocation sequences corresponding to all the mashups in the set M into a set, which is set as SM;
- 6.7.10 traversing the set SM, and setting a current traversal sequence as sm;
- 6.7.11 obtaining data augmentation methods having a sequence number of now in auglist to process sm, and setting a generated augmented sequence as αsm1;
- 6.7.12 adding 1 to now, and setting now as 0 when now has a size equal to nd;
- 6.7.13 obtaining data augmentation methods having a sequence number of now in auglist to process sm, and setting a generated augmented sequence as αsm2;
- 6.7.14 adding 1 to now, and setting now to 0 when now has a size equal to nd;
- 6.7.15 storing pair-wise augmented sequence results (αsm1, αsm2) into an augmented sequence pair set ASP;
- 6.7.16 completing traversal of the set SM;
- 6.7.17 traversing the mashup invocation sequences αmα of the APIs, and storing the invoked sequences corresponding to all the APIs in the set A into a set, which is set as SA;
- 6.7.18 traversing the set SA, and setting a current traversal sequence as sa;
- 6.7.19 obtaining data augmentation methods having a sequence number of now in auglist to process sa, and setting a generated augmented sequence as msα1;
- 6.7.20 adding 1 to now, and setting now as 0 when now has a size equal to nd;
- 6.7.21 obtaining data augmentation methods having a sequence number of now in auglist to process sa, and setting a generated augmented sequence as msα2;
- 6.7.22 adding 1 to now, and setting now to 0 when now has a size equal to nd;
- 6.7.23 storing pair-wise augmented sequence results (msα1, msα2) into an augmented sequence pair set ASP; and
- 6.7.24 completing traversal, and outputting a pair-wise augmented sequence result set ASP.
- 6.8 obtaining the API feature representations corresponding to all the APIs in αsm1 from the API feature representation matrix FEα, and carrying out mean-pooling on all the obtained feature representations as the representation result corresponding to the sequence αsm1, which is set as emaug1, where mean-pooling averages each dimension of the feature vectors to obtain a new feature vector; carrying out the same processing on αsm2 to obtain a representation result, which is set as emaug2; obtaining the mashup feature representations corresponding to all the mashup services in msα1 from the mashup feature representation matrix FEm, and carrying out mean-pooling on all the obtained feature representations as the representation result corresponding to the sequence msα1, which is set as eαaug1; and carrying out the same processing on msα2 to obtain a representation result, which is set as eαaug2; and
- 6.9 completing traversal, and computing a contrastive loss result for augmented sequence pairs generated by all the mashup services:
- computing a contrastive loss result for augmented sequence pairs corresponding to all the APIs:
- an overall loss result is as follows:
where τ is a temperature coefficient configured to control the model's discrimination of negative samples so as to adjust the generalization capability of the model, and log indicates a logarithmic function.
Step 7 includes:
- 7.1 computing a pair-wise recommendation score between the mashup m and the candidate API α″: $\hat{\gamma}_{m,\alpha''} = fe_m^{T} fe_{\alpha''}$, where T indicates vector transpose;
- 7.2 for the mashup service m, optimizing parameters through a pair-wise Bayesian personalized ranking (BPR) loss on the basis of the service invocation information in the training data, where the APIs invoked by the mashup in the data set are called positive invocation examples, which are defined as Xm+, the remaining candidate APIs are called negative invocation examples Xm−, and the overall loss function is computed as:
- where O={(m, α′, b′)|(m, α′)∈Xm+, (m, b′)∈Xm−}, and α′ and b′ indicate a positive API and a negative API obtained by sampling, respectively;
- 7.3 computing an overall loss result according to steps 6.9 and 7.2:
- where θ is a hyper-parameter for controlling intensity of contrastive learning; and
- 7.4 cycling the overall model from steps 1 to 7, and optimizing the overall model through the loss function in step 7.3 to fit the data set by model parameters.
Step 8 includes:
- 8.1 converting the user request for constructing a new combined service into a vector representation through a language model, matching the similarity between the vector representation and the mashup service function descriptions in the existing data set, and using the q objects having the highest similarity as associated mashup services, which are represented as relationM={rm1, rm2, . . . , rmq};
- 8.2 carrying out a mean-pooling operation on the representations eml corresponding to the mashup services in the set relationM to construct a feature representation enewR corresponding to the user request, where mean-pooling is a common pooling operation that averages each dimension of the feature vectors to obtain a new feature vector;
- 8.3 transmitting enewR into the recommendation model optimized through steps 6 and 7 to output a corresponding feature representation result ƒenewR;
- 8.4 traversing the candidate APIs, setting the API traversed currently as α, computing the corresponding pair-wise recommendation scores $\hat{\gamma}_{newR,\alpha}$ according to the method in step 7.1, and forming, from all the pair-wise recommendation scores, a set YnewR; and
- 8.5 sorting YnewR in descending order, and outputting the top k APIs to complete the API service recommendation for the new combined request.
The present invention has the following beneficial effects: (1) according to the service data characteristics and the invocation structure, the double-graph structure is designed to fuse various kinds of information and better learn service function characteristics; (2) on the basis of the graph neural network, the service recommendation model is optimized end to end, thereby improving the effect of service recommendation; (3) a multi-layer graph neural network that uses the invocation graph structure as input is designed to fuse representation results between services by means of different model structures; (4) according to the characteristics of the multi-layer graph propagation results, the mutual attention mechanism is designed to integrate the feature representation results of the multi-layer output; and (5) a contrastive learning method is designed to augment the sequence data, thereby alleviating the problem of data sparsity and improving the effect of service recommendation.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic diagram of an overall structure of a service recommendation model;
FIG. 2 is an effect diagram of a service recommendation model under changes of various parameters, where (a) shows a change trend of results of recommendation indexes along with changes of dimensions of an embedding representation; (b) shows a change trend of results of recommendation indexes along with changes of the number of graph propagation layers; (c) shows a change trend of results of recommendation indexes along with changes of probability parameters γ and ε in a data augmentation method; and (d) shows a change trend of results of recommendation indexes along with changes of intensity of contrastive learning;
FIG. 3 is a graph of a result of an ablation experiment of a service recommendation model; and
FIG. 4 is a graph of a result of an ablation experiment of a mutual attention mechanism.
DETAILED DESCRIPTION OF THE EMBODIMENTS
The present invention will be further described below with reference to the accompanying drawings.
With reference to FIGS. 1-4, a cloud-native application programming interface (API) recommendation method fusing data augmentation and contrastive learning includes:
- step 1, on the basis of service invocation matrix information and function association information between APIs, construct an invocation graph structure, service invocation matrix graph (SIMG), between mashup services and API services and an invocation graph structure, connecting API functional graph (CAFG), between the APIs to form a double-graph structure. The step includes:
- 1.1 represent mashup services in a mashup service set M as a mashup type of graph nodes, represent APIs in an API set A as an API type of graph nodes, and represent an overall graph node set as Vmα;
- 1.2 set an empty set εmα as a storage set of side information in the graph structure SIMG;
- 1.3 traverse the mashup services in M, set a single mashup service traversed currently as m, and obtain an API set minvoke invoked by m from an invocation interaction matrix MA between the mashups and the APIs, where the service invocation interaction matrix MA has a size of |M|×|A|, where |M| indicates the number of elements in the set M, and |A| indicates the number of elements in the set A;
- 1.4 traverse the API set minvoke obtained in the previous step, set the API traversed currently as α, and when the value at the corresponding position [m, α] in the invocation interaction matrix MA is 1, which indicates that an invocation relation exists between m and α, store a side combination (m, α) into εmα, which indicates that an undirected connection side exists between the graph nodes corresponding to m and α;
- 1.5 complete traversal, and combine a graph node set Vmα and a side information set εmα to complete construction of a graph structure SIMG (Vmα, εmα)
- 1.6 convert the API set A into a graph node set, which is represented as Vα;
- 1.7 set an empty set εα as a storage set of side information in the graph structure CAFG;
- 1.8 traverse the APIs in the set A, set a single API traversed currently as α1, and on this basis, traverse the APIs in A apart from α1, and set a single API traversed currently as α2;
- 1.9 with a tag information set Atag of the APIs including tags corresponding to the APIs, set a tag set corresponding to the APIs α1 as Aα1tag, set a tag set corresponding to the APIs α2 as Aα2tag, and when Aα1tag∩Aα2tag is not an empty set, store a side combination (α1, α2) into εα, which indicates that an undirected connection side exists between graph nodes corresponding to α1 and α2; and
- 1.10 complete traversal, and combine the graph node set Vα and the side information set εα to complete construction of a graph structure CAFG(Vα, εα).
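To make the double-graph construction of step 1 concrete, the following is a minimal Python sketch, assuming the invocation matrix MA is a binary numpy array and the API tags are given as Python sets; the function and variable names are illustrative, not part of the claimed method.

```python
import numpy as np

def build_double_graph(MA, api_tags):
    """Sketch of step 1: SIMG edges come from the invocation matrix MA
    (|M| x |A|, binary), CAFG edges from shared API tags.
    api_tags: list of tag sets, one per API (illustrative input format)."""
    n_mashups, n_apis = MA.shape
    # SIMG: an undirected side (m, a) for every invocation MA[m, a] == 1
    simg_edges = {(m, a) for m in range(n_mashups)
                  for a in range(n_apis) if MA[m, a] == 1}
    # CAFG: an undirected side (a1, a2) whenever two distinct APIs share a tag
    cafg_edges = {(a1, a2) for a1 in range(n_apis) for a2 in range(n_apis)
                  if a1 != a2 and api_tags[a1] & api_tags[a2]}
    return simg_edges, cafg_edges
```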
- Step 2, according to the graph structure SIMG, design a multi-layer graph propagation network structure to learn mashup feature representations and API feature representations. As shown in part ① in FIG. 1, a schematic diagram of a three-layer graph propagation structure is provided. The process is as follows:
- 2.1 construct an embedding layer to store the embedding vector information of the mashup services and the API nodes, where Emashup indicates the embedding representations corresponding to the mashup services, and Eα indicates the embedding representations corresponding to the APIs; and combine Emashup and Eα to obtain an embedding layer embedding matrix E;
- 2.2 compute an adjacency matrix of the graph structure SIMG, which is set as Imα:

$$I_{m\alpha} = \begin{bmatrix} 0 & MA \\ MA^{T} & 0 \end{bmatrix}$$

- where MA is the service invocation interaction matrix, the adjacency matrix has a size of (|M|+|A|)×(|M|+|A|) and includes the connection information of the nodes in the graph structure, and T indicates the matrix transpose operation;
- 2.3 define the degree corresponding to the mashup service m as dm and the degree of the node corresponding to the API α as dα, combine the degree information corresponding to the mashup services and the APIs into a degree matrix Dmα, and compute a normalized form of the adjacency matrix:

$$\tilde{I}_{m\alpha} = D_{m\alpha}^{-1/2} I_{m\alpha} D_{m\alpha}^{-1/2}$$

- where the element corresponding to a position (i, j) in the normalized matrix Ĩmα has a value of $1/\sqrt{d_i^m d_j^\alpha}$, where $d_i^m$ indicates the degree corresponding to the mashup service having a sequence number of i, and $d_j^\alpha$ indicates the degree corresponding to the API node having a sequence number of j;
- 2.4 traverse node information in the graph structure SIMG, set nodes traversed currently as ν including the mashup type of graph nodes and the API type of graph nodes, set a neighbor node set of ν in the graph structure, i.e., a node set having a side connection with the nodes, as Nν, and compute a propagation result of each νt∈Nν:
where l indicates the number of layers of a graph propagation structure of a multi-layer graph neural network, Wl1 and Wl2 are both trainable parameter matrixes of a current layer, ⊗ indicates multiplication operation of vector elements, eν indicates an embedding representation corresponding to the node ν, and eνt indicates an embedding representation of a neighbor node νt of the node ν;
- 2.5 aggregate propagation information of the node ν and the neighbor node of the node to form node feature embedding output of the current layer:
- where σ is an activation function, eνl-1 indicates node embedding output of a previous layer, and Σ indicates vector summation operation; and
- 2.6 according to properties of the node ν, divide an output node embedding result into a mashup service feature representation eml and an API feature representation eαl for output.
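The propagation and aggregation formulas of steps 2.4 and 2.5 are not reproduced above. The sketch below therefore assumes an NGCF-style layer, which is consistent with the trainable matrices W1 and W2 and the element-wise product ⊗ named in the text; it is a sketch under that assumption, not a definitive implementation, and all names are illustrative.

```python
import numpy as np

def propagate_simg(E, norm_adj, W1, W2, L=3):
    """Sketch of steps 2.4-2.5 assuming an NGCF-style layer:
    E^l = sigma((A~ + I) E^{l-1} W1^l + (A~ E^{l-1} * E^{l-1}) W2^l).
    E: (|M|+|A|) x d embedding matrix; norm_adj: the normalized adjacency
    ~I_ma of step 2.3; W1, W2: lists of L per-layer d x d weight matrices."""
    I = np.eye(norm_adj.shape[0])
    outputs, e = [], E
    for l in range(L):
        neigh = norm_adj @ e                    # propagated neighbor features
        e = np.tanh((norm_adj + I) @ e @ W1[l]  # linear part with self-loop
                    + (neigh * e) @ W2[l])      # element-wise interaction term
        outputs.append(e)                       # keep every layer for step 4
    return outputs  # rows are split into mashup / API parts in step 2.6
```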
- Step 3, according to the graph structure CAFG, design a corresponding graph neural network structure to compute feature representations corresponding to APIs under the graph neural network structure. As shown in part ② in FIG. 1, a schematic diagram of a double-layer graph convolution structure is provided. The process is as follows:
- 3.1 one-hot representation is a vectorized representation method in which each feature is represented by an independent binary dimension: only one dimension has a value of 1, the remaining dimensions all have a value of 0, and the dimensions are configured to represent categorical or discrete variables; construct an API embedding representation based on one-hot representation, and convert the high-dimensional one-hot representation corresponding to the API set into a dense embedding representation AOE∈R|A|×d through an embedding layer, where d is the dimension of the embedding representation;
- 3.2 obtain an adjacency matrix Iα∈R|A|×|A| of the graph structure CAFG, where when Iα[u, w]=1, it is indicated that the API αu having a sequence number of u and the API αw having a sequence number of w share the same tag information;
- 3.3 compute a corresponding degree matrix Dα∈R|A|×|A|, where the matrix is a diagonal matrix, where Dα[u, u] indicates the total number of the APIs having the same tag as αu;
- 3.4 compute a normalized adjacency matrix $\tilde{I}_\alpha = D_\alpha^{-1/2} I_\alpha D_\alpha^{-1/2}$ as the convolution weight of the input vector set; and
- 3.5 carry out graph convolution computation on the embedding representation through the normalized adjacency matrix:

$$C = \sigma\left(\tilde{I}_\alpha\, \sigma\left(\tilde{I}_\alpha A_{OE} W_5\right) W_6\right)$$

- where σ indicates an activation function, W5∈Rd×d and W6∈Rd×d are the weights of the two layers respectively, the computed output C is used as the API feature representation output under this network structure and has a size of C∈R|A|×d, and the feature representation corresponding to the API α is defined as cα.
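A compact sketch of the CAFG branch (steps 3.2 to 3.5), assuming the standard two-layer graph convolution form; the exact activation is not specified in the text, so tanh stands in for σ, and all names are illustrative.

```python
import numpy as np

def cafg_features(A_OE, I_a, W5, W6):
    """Sketch of steps 3.2-3.5: two-layer graph convolution over CAFG.
    A_OE: |A| x d dense API embeddings; I_a: |A| x |A| tag-sharing
    adjacency; W5, W6: d x d layer weights (illustrative)."""
    deg = I_a.sum(axis=1)                        # diagonal of degree matrix D_a
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg + 1e-12))
    norm = D_inv_sqrt @ I_a @ D_inv_sqrt         # normalized ~I_a, step 3.4
    H = np.tanh(norm @ A_OE @ W5)                # first convolution layer
    return np.tanh(norm @ H @ W6)                # output C, size |A| x d
```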
- step 4, design a mutual attention mechanism layer to retain the multi-layer output information, and compute attention weights of the multi-layer output and merge the feature representations of the mashup services by taking the association information of the APIs as importance guidance. As shown in part ③ in FIG. 1, a schematic diagram of the mutual attention mechanism layer is provided. The process is as follows:
- 4.1 represent a matrix consisting of all feature representation vectors of the mashup services of the l-th layer output obtained in step 2.6 as Eml, represent a matrix consisting of the feature representation vectors corresponding to the APIs as Eαl, and combine the output of the current propagation layer: $E_q^l = E_m^l \oplus E_\alpha^l$;
- 4.2 combine Eml and the feature matrix C output from step 3.5: $E_{key}^l = E_m^l \oplus C$;
- 4.3 the attention mechanism is a common method in machine learning configured to assign weights to different elements in a sequence; it includes a query part and a key part, and computes the similarity between the query and the key to carry out a weighted summation over the elements, thereby achieving feature extraction and weighted fusion of the sequence; design the mutual attention mechanism to connect the feature representations of the mashup services output by the multi-layer structure, and compute a query representation of the mutual attention mechanism:

$$q^l = W_q^l E_q^l$$

- where $W_q^l$∈Rd×(|M|+|A|) is a trainable weight parameter, and ql is used as the attention query corresponding to the output result of the l-th propagation layer;
- 4.4 compute a key representation of the mutual attention mechanism:

$$k^l = W_k E_{key}^l$$

- where Wk∈Rd×(|M|+|A|) is a trainable weight parameter, and kl indicates the attention key corresponding to the output result of the l-th propagation layer;
- 4.5 compute the mutual attention weight information corresponding to the l-th layer:

$$\alpha^l = \frac{\exp\left((q^l)^T k^l\right)}{\sum_{l'=1}^{L} \exp\left((q^{l'})^T k^{l'}\right)}$$
- where exp( ) indicates exponential power by using a natural constant e as a base, L is the total number of graph propagation layers, and T indicates vector transpose; and
- 4.6 weight the output of the graph propagation layers:

$$\alpha E_m^l = \alpha^l \otimes E_m^l$$

- where ⊗ is the operation of computing the product of each element, and αEml indicates the weighted result of the feature vectors of the l-th layer output; and then connect the weighted output of the layers, and convert the weighted output into a final mashup feature representation:

$$FE_m = f_{cm}\left(\alpha E_m^1 \oplus \alpha E_m^2 \oplus \cdots \oplus \alpha E_m^L\right)$$

- where ⊕ indicates the vector connection operation, ƒcm indicates a multi-layer perceptron having an input size of L*d and an output size of d, where the multi-layer perceptron is a neural-network-based model consisting of a plurality of fully-connected hidden layers and output layers; and set the mashup feature vector result corresponding to the mashup m in the matrix FEm as ƒem.
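The layer-weighting formula of step 4.5 is read here as a softmax over the L layer outputs, which matches the exp( ) and the sum over layers named in the text; the sketch below fuses the per-layer mashup features under that assumption, with illustrative names.

```python
import numpy as np

def fuse_layers(Em_layers, q_layers, k_layers, mlp):
    """Sketch of steps 4.5-4.6: alpha^l = softmax over layers of q^l . k^l,
    then concatenate the weighted layer outputs and map them to d dims.
    Em_layers: L matrices of |M| x d mashup features; q_layers, k_layers:
    the per-layer query/key representations of steps 4.3-4.4 (flattened to
    vectors here); mlp: a callable mapping L*d columns to d (the f_cm MLP)."""
    scores = np.array([q @ k for q, k in zip(q_layers, k_layers)])
    alpha = np.exp(scores - scores.max())       # numerically stable softmax
    alpha /= alpha.sum()
    weighted = np.concatenate(
        [a * Em for a, Em in zip(alpha, Em_layers)], axis=1)  # |M| x (L*d)
    return mlp(weighted)                        # FE_m, |M| x d
```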
- step 5, carry out weighted combination on API feature representations of double graphs through a gate mechanism to generate a new API feature representation. The process is as follows:
- 5.1 compute, by the API feature representation eαl output by the graph neural network structure based on the graph structure SIMG and the API feature representation cα output by the graph neural network structure based on the graph structure CAFG, a gate weight corresponding to the gate mechanism:
- where Wg is a trainable weight, and Eαl indicates a set of eαl corresponding to all the services; and
- 5.2 with the weight for connection, compute a final output result:
- where FEα is a matrix consisting of all the API feature representations obtained by weighting feature representations in a form of a matrix; and set a feature vector result corresponding to the API α in the matrix FEα as ƒeα.
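The gate equations of step 5 are not reproduced above; a standard sigmoid gate consistent with the description (a learned weight blending the SIMG-side and CAFG-side API features) would look as follows. This is a sketch under that assumption; the shape of Wg is illustrative.

```python
import numpy as np

def gate_combine(Ea_l, C, Wg):
    """Sketch of step 5: a sigmoid gate g in (0,1) blends the SIMG-side
    features Ea_l with the CAFG-side features C, per API and per dimension.
    Ea_l, C: |A| x d feature matrices; Wg: d x 2d trainable weight (assumed)."""
    z = np.concatenate([Ea_l, C], axis=1) @ Wg.T   # |A| x d pre-activation
    g = 1.0 / (1.0 + np.exp(-z))                   # gate weight
    return g * Ea_l + (1.0 - g) * C                # FE_a, |A| x d
```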
- step 6, compute training similarity and prior information similarity between services, and generate data augmented sequence pairs through a data augmentation method. The process is as follows:
- 6.1 compute the training similarity between the services, using the dot product of the embedding vectors of two APIs or two mashup services as the similarity computation result:

$$Sim_{train}^{api}(\alpha_1, \alpha_2) = e_{\alpha_1}^{T} e_{\alpha_2}, \qquad Sim_{train}^{mashup}(m_1, m_2) = e_{m_1}^{T} e_{m_2}$$

- where α1, α2 indicate any two different APIs, Simtrainapi (α1, α2) indicates the training similarity between the APIs α1 and α2, m1, m2 indicate any two different mashup services, and Simtrainmashup (m1, m2) indicates the training similarity between the mashups m1 and m2; and
- 6.2 compute the prior information similarity between the services on the basis of similarity computation over the service description documents and the service function tags, first computing the similarity between the service description documents:

$$Sim_{description}^{api}(\alpha_1, \alpha_2) = \cos(e_{\alpha_1}^{lm}, e_{\alpha_2}^{lm}) = \frac{e_{\alpha_1}^{lm} \cdot e_{\alpha_2}^{lm}}{\|e_{\alpha_1}^{lm}\|\,\|e_{\alpha_2}^{lm}\|}, \qquad Sim_{description}^{mashup}(m_1, m_2) = \cos(e_{m_1}^{lm}, e_{m_2}^{lm})$$

- where cos( ) indicates cosine similarity, eαlm indicates the service function representation obtained by computation of a language model corresponding to the API α, Simdescriptionapi (α1, α2) indicates the prior information similarity between the APIs α1 and α2, ∥eαlm∥ indicates the modulus of the vector eαlm, emlm indicates the service function representation obtained by computation of a language model corresponding to the mashup m, Simdescriptionmashup (m1, m2) indicates the prior information similarity between the mashups m1 and m2, and ∥emlm∥ indicates the modulus of the vector emlm;
- 6.3 compute similarity between service tags:
- where mtag indicates a function tag set of the mashup, αtag indicates a tag set of the APIs, |mtag| indicates the number of mashup function tags, and |αtag| indicates the number of the API function tags;
- 6.4 use the larger result between the function description similarity and the tag similarity as the prior information similarity between the services:

$$Sim_{prior}(\cdot) = \max\left(\widetilde{Sim}_{description}(\cdot), \widetilde{Sim}_{tag}(\cdot)\right)$$

- where $\widetilde{Sim}$( ) indicates a normalized similarity result;
- 6.5 use the larger result between the training similarity and the prior information similarity as the overall similarity information between the services, $Sim(\cdot) = \max\left(Sim_{train}(\cdot), Sim_{prior}(\cdot)\right)$, and
- compute this similarity information between the services as the basis for data augmentation;
- 6.6 design four service sequence data augmentation methods as follows:
- (1) service cutting: for a service sequence mαc={α1, α2, . . . , α|mαc|}, select a continuous subsequence having a length of Lsc=⌊μ*|mαc|⌋ from a random position as an augmented sequence, where μ∈(0,1) is a random cutting parameter and |mαc| is the number of the APIs in the service invocation sequence;
- (2) service occlusion: for a service sequence mαc={α1, α2, . . . , α|mαc|}, randomly discard Lsm=⌊η*|mαc|⌋ APIs, where η∈(0,1) is a random occlusion parameter;
- (3) association replacement: for a service invocation sequence mαc={α1, α2, . . . , α|mαc|}, randomly select Lrc=⌊γ*|mαc|⌋ items and replace each with a service having high function correlation, where γ∈(0,1) indicates a replacement probability parameter;
- (4) association expansion: for a service invocation sequence mαc={α1, α2, . . . , α|mαc|}, randomly select Lre=⌊ε*|mαc|⌋ items and insert a service having high function correlation with each item at the position in which the item is located, where ε∈(0,1) indicates an expansion probability parameter; and
- in the augmentation methods (3) and (4), determine the service function correlation through the similarity information between the services obtained in step 6.5; with respect to the association replacement method (3), for the Lrc services randomly selected, obtain, through the similarity computation result of step 6.5, the services having the highest function similarity among the candidate services, apart from the services already existing in the original sequence, as new services to replace the randomly selected services; and with respect to the association expansion method (4), for the Lre services randomly selected, obtain new services in the same way, and add the new services after the original positions, in their sequences, of the randomly selected services to complete the sequence augmentation (a minimal sketch of these four operations is given below);
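A minimal sketch of the four augmentation operations of step 6.6. The floor-style lengths and the most_similar helper (which implements the step-6.5 similarity lookup while excluding services already in the sequence) are assumptions; only the operations themselves follow the text.

```python
import random

def service_cut(seq, mu=0.6):
    """(1) service cutting: keep a random contiguous subsequence."""
    length = max(1, int(mu * len(seq)))
    start = random.randint(0, len(seq) - length)
    return seq[start:start + length]

def service_occlude(seq, eta=0.6):
    """(2) service occlusion: randomly discard eta * |seq| items."""
    drop = set(random.sample(range(len(seq)), int(eta * len(seq))))
    return [s for i, s in enumerate(seq) if i not in drop]

def assoc_replace(seq, most_similar, gamma=0.6):
    """(3) association replacement: swap selected items for their most
    function-similar candidates not already present in the sequence."""
    idxs = random.sample(range(len(seq)), max(1, int(gamma * len(seq))))
    out = list(seq)
    for i in idxs:
        out[i] = most_similar(out[i], exclude=set(out))
    return out

def assoc_expand(seq, most_similar, eps=0.6):
    """(4) association expansion: insert a similar service after selected items."""
    idxs = sorted(random.sample(range(len(seq)), max(1, int(eps * len(seq)))),
                  reverse=True)   # insert from the back so indices stay valid
    out = list(seq)
    for i in idxs:
        out.insert(i + 1, most_similar(out[i], exclude=set(out)))
    return out
```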
- 6.7 for the mashup service set M, traverse the mashup services in M, set a single mashup service traversed currently as m, and, for the service invocation sequence mαm corresponding to m, select two sequence data augmentation methods provided in step 6.6 to process the sequence, so as to generate two newly augmented sequence results, which are represented as αsm1 and αsm2 respectively; for the API set A, traverse the APIs in A, set a single API traversed currently as α, set the mashup sequence that invoked the API as αmα, and select two sequence data augmentation methods provided in step 6.6 to process the sequence, so as to generate two newly augmented sequence results, which are represented as msα1 and msα2 respectively. A process of selecting the augmentation methods and generating the augmented sequence pairs includes:
- 6.7.1 set a set for storing combined data augmentation methods, which is set as auglist and is initially an empty set; and store the four service sequence data augmentation methods in step 6.6 into a set DA, set the number of elements in the set as nd, and set a count index as 1;
- 6.7.2 traverse DA, and set a method object traversed currently as dα;
- 6.7.3 for each traversal, assign index+1 to target;
- 6.7.4 when target is less than nd, store dα into auglist, store augmentation methods having a sequence number of target in DA into auglist, and add 1 to target;
- 6.7.5 repeat steps 6.7.3 and 6.7.4 until target has a value greater than or equal to nd;
- 6.7.6 add 1 to index;
- 6.7.7 end traversal, and complete initialization of the set auglist;
- 6.7.8 assign 0 to a counter now, and set a set for storing augmented sequence pairs, which is set as ASP, and is initially an empty set;
- 6.7.9 traverse the API invocation sequences mα of the mashups, and store the invocation sequences corresponding to all the mashups in the set M into a set, which is set as SM;
- 6.7.10 traverse the set SM, and set a current traversal sequence as sm;
- 6.7.11 obtain data augmentation methods having a sequence number of now in auglist to process sm, and set a generated augmented sequence as αsm1;
- 6.7.12 add 1 to now, and set now as 0 when now has a size equal to nd;
- 6.7.13 obtain data augmentation methods having a sequence number of now in auglist to process sm, and set a generated augmented sequence as αsm2;
- 6.7.14 add 1 to now, and set now to 0 when now has a size equal to nd;
- 6.7.15 store pair-wise augmented sequence results (αsm1, αsm2) into an augmented sequence pair set ASP;
- 6.7.16 complete traversal of the set SM;
- 6.7.17 traverse the mashup invocation sequences αmα of the APIs, and store the invoked sequences corresponding to all the APIs in the set A into a set, which is set as SA;
- 6.7.18 traverse the set SA, and set a current traversal sequence as sa;
- 6.7.19 obtain data augmentation methods having a sequence number of now in auglist to process sa, and set a generated augmented sequence as msα1;
- 6.7.20 add 1 to now, and set now as 0 when now has a size equal to nd;
- 6.7.21 obtain data augmentation methods having a sequence number of now in auglist to process sa, and set a generated augmented sequence as msα2;
- 6.7.22 add 1 to now, and set now to 0 when now has a size equal to nd;
- 6.7.23 store pair-wise augmented sequence results (msα1, msα2) into an augmented sequence pair set ASP; and
- 6.7.24 complete traversal, and output a pair-wise augmented sequence result set ASP.
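Steps 6.7.1 to 6.7.24 amount to enumerating the ordered pairings of distinct augmentation methods and cycling through that list while producing one augmented pair per sequence. A condensed sketch, assuming each method is a one-argument callable (parameters can be bound with functools.partial):

```python
from itertools import combinations

def augmented_pairs(sequences, methods):
    """Sketch of step 6.7: auglist holds every ordered pairing of two
    distinct methods (6.7.1-6.7.7); the counter `now` then cycles through
    it, applying two methods to each sequence (6.7.8-6.7.24).
    methods: the four callables of step 6.6, parameters already bound."""
    auglist = [m for pair in combinations(methods, 2) for m in pair]
    now, pairs = 0, []
    for seq in sequences:
        aug1 = auglist[now](seq)
        now = (now + 1) % len(auglist)
        aug2 = auglist[now](seq)
        now = (now + 1) % len(auglist)
        pairs.append((aug1, aug2))   # stored into the set ASP in the text
    return pairs
```

For example, methods could be [partial(service_cut, mu=0.6), partial(service_occlude, eta=0.6), ...] using the operations sketched after step 6.6.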
- 6.8 obtain the API feature representations corresponding to all the APIs in αsm1 from the API feature representation matrix FEα, and carry out mean-pooling on all the obtained feature representations as the representation result corresponding to the sequence αsm1, which is set as emaug1, where mean-pooling is a common pooling operation that averages each dimension of the feature vectors to obtain a new feature vector; carry out the same processing on αsm2 to obtain a representation result, which is set as emaug2; obtain the mashup feature representations corresponding to all the mashup services in msα1 from the mashup feature representation matrix FEm, and carry out mean-pooling on all the obtained feature representations as the representation result corresponding to the sequence msα1, which is set as eαaug1; and carry out the same processing on msα2 to obtain a representation result, which is set as eαaug2; and
- 6.9 complete traversal, and compute a contrastive loss result for augmented sequence pairs generated by all the mashup services:
- compute a contrastive loss result for augmented sequence pairs corresponding to all the APIs:
- an overall loss result is as follows:
- where τ is a temperature coefficient configured to control the model's discrimination of negative samples so as to adjust the generalization capability of the model, and log indicates a logarithmic function.
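The contrastive loss formulas of step 6.9 are not reproduced above. A standard InfoNCE form consistent with the surrounding description (augmented pairs as positives, other sequences' augmentations as negatives, temperature τ) would be the following; this is an assumed form, not a quotation of the patent's equations:

$$
L_{cl}^{mashup} = -\sum_{m \in M} \log
\frac{\exp\left(\mathrm{sim}(e_m^{aug1}, e_m^{aug2})/\tau\right)}
{\sum_{m' \neq m} \exp\left(\mathrm{sim}(e_m^{aug1}, e_{m'}^{aug2})/\tau\right)}
$$

with $L_{cl}^{api}$ defined analogously over the API-side pairs $(e_\alpha^{aug1}, e_\alpha^{aug2})$, $\mathrm{sim}(\cdot,\cdot)$ a dot-product or cosine similarity, and the overall result $L_{cl} = L_{cl}^{mashup} + L_{cl}^{api}$.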
- Step 7, compute pair-wise scores through the obtained feature representation, and compute an overall loss function result through the pair-wise scores and the data augmented sequence pairs to optimize parameters of an overall recommendation model. The process is as follows:
- 7.1 compute a pair-wise recommendation score between the mashup m and the candidate API α″: $\hat{\gamma}_{m,\alpha''} = fe_m^{T} fe_{\alpha''}$, where T indicates vector transpose;
- 7.2 for the mashup service m, optimize parameters through a pair-wise Bayesian personalized ranking (BPR) loss on the basis of the service invocation information in the training data, where the APIs invoked by the mashup in the data set are called positive invocation examples, which are defined as Xm+, the remaining candidate APIs are called negative invocation examples Xm−, and the overall loss function is computed as:
- where O={(m, α′, b′)|(m, α′)∈Xm+, (m, b′)∈Xm−}, and α′ and b′ indicate a positive API and a negative API obtained by sampling, respectively;
- 7.3 compute an overall loss result according to steps 6.9 and 7.2:
- where θ is a hyper-parameter for controlling intensity of contrastive learning; and
- 7.4 cycle the overall model from steps 1 to 7, and optimize the overall model through the loss function in step 7.3 to fit the data set by model parameters.
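Likewise, the loss formulas of steps 7.2 and 7.3 are omitted above; the standard pair-wise BPR form and the weighted combination implied by the text would read as follows, under the same caveat that this is an assumed reconstruction:

$$
L_{BPR} = -\sum_{(m,\,\alpha',\,b') \in O} \ln \sigma\left(\hat{\gamma}_{m,\alpha'} - \hat{\gamma}_{m,b'}\right),
\qquad
L = L_{BPR} + \theta\, L_{cl}
$$

where $\sigma$ is the sigmoid function and $\theta$ is the contrastive-intensity hyper-parameter of step 7.3.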
- Step 8, match a user request, sort the pair-wise scores and complete service recommendation.
- 8.1 convert the user request for constructing a new combined service into a vector representation through a language model, such as bidirectional encoder representations from transformers (BERT) or word to vector (Word2vec), match the similarity between the vector representation and the mashup service function descriptions in the existing data set, and use the q objects having the highest similarity as associated mashup services, which are represented as relationM={rm1, rm2, . . . , rmq};
- 8.2 carry out a mean-pooling operation on the representations eml corresponding to the mashup services in the set relationM to construct a feature representation enewR corresponding to the user request, where mean-pooling is a common pooling operation that averages each dimension of the feature vectors to obtain a new feature vector;
- 8.3 transmit enewR into the recommendation model optimized through steps 6 and 7 to output a corresponding feature representation result ƒenewR;
- 8.4 traverse the candidate APIs, set the API traversed currently as α, compute the corresponding pair-wise recommendation scores $\hat{\gamma}_{newR,\alpha}$ according to the method in step 7.1, and form, from all the pair-wise recommendation scores, a set YnewR; and
- 8.5 sort YnewR in descending order, and output the top k APIs to complete the API service recommendation for the new combined request.
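Step 8 reduces to a nearest-neighbor match on description vectors followed by dot-product scoring. A minimal sketch, assuming all representations are rows of numpy arrays and the description vectors are L2-normalized; names are illustrative:

```python
import numpy as np

def recommend(request_vec, mashup_desc_vecs, FE_m, FE_a, q=5, k=10):
    """Sketch of step 8: find the q mashups whose descriptions best match
    the request (8.1), mean-pool their features into e_newR (8.2), score
    all candidate APIs pair-wise by dot product (7.1 / 8.4), return top-k (8.5)."""
    sims = mashup_desc_vecs @ request_vec   # cosine match on unit vectors
    top_q = np.argsort(-sims)[:q]           # the associated set relationM
    e_newR = FE_m[top_q].mean(axis=0)       # mean-pooled request feature
    scores = FE_a @ e_newR                  # pair-wise recommendation scores
    return np.argsort(-scores)[:k]          # indexes of the top k APIs
```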
The actual effect of the invention is analyzed below according to specific service data.
- 1) A mashup composition and API data set is selected. The contents of the data set are shown in Table 1: 1423 mashups and 1032 candidate APIs having invocation relations with the mashups are included. Corresponding function description documents and tags are collected for each mashup and each API, and the description information is pre-segmented and converted into dense vectors through a pre-trained BERT language model.
TABLE 1

Item                        Mashups    APIs
Number of elements          1423       1032
Number of tags              297        301
Document vector dimension   768        768
- 2) All the services are divided into 5 parts, each containing 20% of the mashup services, by using a cross-validation method, and one part of the services is used as the test set for validation each time.
- 3) The effect of service recommendation is evaluated with multiple indexes. Hit rate (HR):
- where RecAm indicates the recommended API list for a mashup m, and ObsAm indicates the observable API invocation list of m based on real data. Normalized discounted cumulative gain (NDCG) index:
- where N indicates the number of recommended services, n indicates the service at the n-th position in the recommendation list, and the ideal discounted cumulative gain (IDCG) is the sum of the discounted cumulative gain (DCG) values obtained from all recommendation results. Mean average precision (MAP):
- where numm indicates the number of APIs invoked by the mashup m, and P(t) computes the ratio of successfully hit recommended services before the current position t in the recommendation list.
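The HR, NDCG and MAP formulas are omitted above; their standard definitions, consistent with the symbols in the text, are as follows (an assumed reconstruction):

$$
HR = \frac{|RecA_m \cap ObsA_m|}{|ObsA_m|}, \qquad
NDCG@N = \frac{1}{IDCG}\sum_{n=1}^{N}\frac{rel_n}{\log_2(n+1)},
$$

$$
MAP = \frac{1}{|M|}\sum_{m \in M}\frac{1}{num_m}\sum_{t}P(t)\cdot rel_t,
$$

where $rel_n \in \{0,1\}$ marks whether the service at position $n$ is actually invoked.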
- 4) A parameter experiment related to the service recommendation model is designed. The main parameters affecting the effect of the model include the embedding representation dimension, the number of graph information propagation layers, the data augmentation probability, and the contrastive learning intensity parameter in the loss function. FIG. 2 shows the effect of the service recommendation model under various parameter changes; (a) of FIG. 2 shows the change trend of recommendation index results along with changes of the embedding representation dimension. Too small an embedding dimension may prevent the vectors from accurately representing the feature information of each service and may reduce the representation learning capability of the model, so the effect is poor. Along with the increase in the embedding dimension, the model training result is gradually improved; at a value of 64, the model balances representation and generalization capabilities and achieves excellent results.
- (b) of FIG. 2 shows the trend of recommendation index results along with changes of the number of graph propagation layers. It may be seen that along with the increase in the number of layers, the model effect is improved and reaches a peak at 3 layers. With further increases in the number of layers, the effect changes tend to be gentle but are not significantly reduced, which indicates that the mutual attention mechanism designed by the present invention can effectively fuse the output of the multi-layer structure and learn useful information in the multi-layer invocation relations.
- (c) of FIG. 2 shows a change trend of results of recommendation indexes along with changes of probability parameters γ and ε in a data augmentation method. Along with increase in values of the probability parameters, the model effect is gradually improved. When the value is around 0.6, as the value increases, the indicator results of model training remain basically stable. (d) of FIG. 2 shows a change trend of results of recommendation indexes along with changes of intensity of contrastive learning. The larger θ is, the greater contrastive intensity is. Before θ reaches a certain threshold, the effect of the service recommendation model is improved.
- 5) An ablation experiment related to the structure of the service recommendation model is designed. The function of each part of structure of the service recommendation model provided in the present invention is evaluated by providing three variations of the service recommendation model provided in the present invention. Service recommendation-1 is a model having no CAFG structure, has no additional information of the APIs, but retains feature vectors corresponding to service function description of the APIs. Service recommendation-2 is a model having no graph information propagation mechanism, and is a 0-layer structure without mutual learning of a service invocation structure. Service recommendation-3 is a model having no gate connection mechanism, which directly uses API representation output of the SIMG. FIG. 3 shows results of variant models on different experimental indexes.
The service recommendation-1 variant, which loses the additional inter-API information included in the CAFG, has a lower effect than the complete model, and especially shows a significant decrease in the last two indexes related to recommendation order. Observing the training results of service recommendation-2, its performance in various indexes decreases significantly after the information propagation process based on the graph neural network is eliminated. According to the experimental results of service recommendation-3, the results obtained without the gate mechanism are worse than those obtained by connecting the two feature representations using the gate mechanism. The results indicate that the structural design of the service recommendation model is effective and improves the effect of service recommendation.
- 6) Model training results that use the mutual attention mechanism for multi-layer output integration are compared with results that do not, so as to determine the function of the mutual attention mechanism. FIG. 4 shows the changes of the model training results on various indexes caused by the two multi-layer output strategies when the number of graph propagation layers is changed. Noa indicates removal of the mutual attention mechanism and direct use of a multi-layer connection method.
From the change trend of the recommendation index results, it may be seen that when the total number of layers is low, the simple strategy of connecting multi-layer results for output can also improve the model effect to some extent by utilizing the additional information output from the multi-layer structure. Along with the increase in the number of structural layers, however, the model effect may be reduced due to the incapability to select among the multi-layer output results. The mutual attention mechanism can better generalize the output results of the multiple layers and fuse the multi-layer invocation relations between services into the feature representations. As shown in the figure, the model can still maintain an excellent recommendation effect in multi-layer situations, and the various recommendation result indexes are not significantly reduced. Use of the mutual attention mechanism can better fuse the multi-layer output results, overcome interference caused by missing invocation information, and improve the effect of the recommendation results.
The content in the examples of the description is only a listing of the implementation forms of the invention concept and is only for illustrative purposes. The scope of protection of the present invention should not be considered as limitations to the specific form stated in the examples, and the scope of protection of the present invention also extends to equivalent technical means conceivable by those of ordinary skill in the art according to the concept of the present invention.