CROSS-REFERENCE TO RELATED APPLICATION
This application claims the priority benefit of China application serial no. 202310822440.4, filed on Jul. 5, 2023. The entirety of the above-mentioned patent application is hereby incorporated by reference herein and made a part of this specification.
TECHNICAL FIELD
The present invention belongs to the technical field of application programming interface (API) recommendation, and relates to a cloud-native API recommendation method fusing data augmentation and contrastive learning.
BACKGROUND
In the current cloud-native era, many companies have embraced moving enterprise workloads onto the cloud. The emergence of the cloud-native concept frees enterprises from having to build their own infrastructure, so how to adapt applications to cloud components and how to make reasonable use of the elasticity and the computation, storage, separation and other functions of the cloud have become crucial. Traditional manufacturers whose businesses are moving online need to transform digitally, and in order to handle problems such as high concurrency and high throughput, the use of an Internet architecture is inevitable. An increasing number of enterprises are choosing to provide service-based applications. The software-as-a-service (SaaS) mode, in which software takes the leading role and services form the core of the business, provides computing and data resources for developers and offers mature cloud software solutions for enterprises. With an open service platform, a developer can rapidly satisfy complex user requirements by combining reusable and replaceable third-party services. A mashup service is a typical service composition mode that generates a new application by rapidly combining existing Web application programming interfaces (APIs); being efficient and convenient to use, it is favored by developers and has developed rapidly.
At present, Web APIs are mainly built with the representational state transfer (RESTful) service technology. As the business requirements of applications grow more complex, a single RESTful service can hardly satisfy user requirements. In view of this, a service composition technology called mashup has been proposed. A mashup service is mostly formed by a combination of APIs, and its core is to reorganize single APIs or different data sources. An increasing number of enterprises are choosing to provide service-based applications, and choosing the appropriate APIs to build a combined service is crucial. The rapid growth in the number of candidate APIs and the large number of services with similar functions further increase the difficulty for a user to choose the appropriate API. These challenges have given rise to service recommendation technology.
Many solutions have been proposed for the service recommendation problem. A graph neural network (GNN) propagates information through a graph structure. The GNN has attracted great attention as an effective representation learning method, and is widely used in tasks that process non-Euclidean data, such as drug recognition, text classification and temporal recommendation. In collaborative-filtering-based recommendation, the GNN can take the graph structure formed by invocation information as input so that adjacent nodes affect each other, thereby improving the capability to learn from that information. For service recommendation, the invocation relation between mashups and APIs can be represented as a bipartite graph structure. Moreover, additional information such as tags can also be naturally represented as a graph association structure, and is well suited to being processed as a non-Euclidean data task.
Contrastive learning is a typical discriminative self-supervised learning method that learns representations by comparing data. Original data is augmented to form new data, and the newly generated data is compared such that the distance between a pair augmented from the same datum is minimized while the distance between pairs augmented from different data is maximized, so that differences between data are learned without labels.
SUMMARY OF THE DISCLOSURE
In order to overcome the defects of the prior art, make better use of the information contained in services and improve the effect of a service recommendation system, the present invention provides a cloud-native application programming interface (API) recommendation method fusing data augmentation and contrastive learning. Service information is incorporated on the basis of a double-graph structure of service information, and a mutual attention mechanism is designed to compute the importance of each layer of information. A data optimization method for sequence information based on functional similarity and a method for computing similarity between services based on two kinds of information are provided. On this basis, data of service invocation sequences is augmented following the idea of contrastive learning to form augmented sequence pairs. A contrastive loss function is combined with a pair-wise recommendation loss function to optimize the overall model, thereby improving the effect of the service recommendation model. According to the feature embedding representation of each service, pair-wise recommendation scores are computed to complete service recommendation.
The technical solution used by the present invention is as follows:
- a cloud-native API recommendation method fusing data augmentation and contrastive learning. The method includes:
- step 1, on the basis of service invocation matrix information and function association information between APIs, constructing an invocation graph structure, service invocation matrix graph (SIMG), between mashup services and API services and an invocation graph structure, connecting API functional graph (CAFG), between the APIs to form a double-graph structure;
- step 2, according to the graph structure SIMG, designing a multi-layer graph propagation network structure to learn mashup feature representations and API feature representations;
- step 3, according to the graph structure CAFG, designing a corresponding graph neural network structure to compute feature representations corresponding to APIs under the graph neural network structure;
- step 4, designing a mutual attention mechanism layer to retain multi-layer output information, and computing attention weights of multi-layer output and merging feature representations of the mashup services by taking association information of the APIs as importance guidance;
- step 5, carrying out weighted combination on API feature representations of double graphs through a gate mechanism to generate a new API feature representation;
- step 6, computing training similarity and prior information similarity between services, and generating data augmented sequence pairs through a data augmentation method;
- step 7, computing pair-wise scores through the obtained feature representation, and computing an overall loss function result through the pair-wise scores and the data augmented sequence pairs to optimize parameters of an overall recommendation model; and
- step 8, matching a user request, sorting the pair-wise scores and completing service recommendation.
Further, step 1 includes:
- 1.1 representing mashup services in a mashup service set M as a mashup type of graph nodes, representing APIs in an API set A as an API type of graph nodes, and representing an overall graph node set as Vmα;
- 1.2 setting an empty set εmα as a storage set of side information in the graph structure SIMG;
- 1.3 traversing the mashup services in M, setting a single mashup service traversed currently as m, and obtaining an API set minvoke invoked by m from an invocation interaction matrix MA between the mashups and the APIs, where the service invocation interaction matrix MA has a size of |M|×|A|, where |M| indicates the number of elements in the set M, and |A| indicates the number of elements in the set A;
- 1.4 traversing the API set minvoke obtained in the previous step, setting the API traversed currently as α, and when the value at the corresponding position [m, α] in the invocation interaction matrix MA is 1, which indicates that an invocation relation exists between m and α, storing a side combination (m, α) into εmα, which indicates that an undirected connection side exists between the graph nodes corresponding to m and α;
- 1.5 completing traversal, and combining a graph node set Vmα and a side information set εmα to complete construction of a graph structure SIMG (Vmα, εmα);
- 1.6 converting the API set A into a graph node set, which is represented as Vα;
- 1.7 setting an empty set εα as a storage set of side information in the graph structure CAFG;
- 1.8 traversing the APIs in the set A, setting a single API traversed currently as α1, on this basis, traversing the APIs in A apart from α1, and setting a single API traversed currently as α2;
- 1.9 with a tag information set Atag of the APIs including tags corresponding to the APIs, setting a tag set corresponding to the APIs α1 as Aα1tag, setting a tag set corresponding to the APIs α2 as Aα2tag and when Aα1tag∩Aα2tag is not an empty set, storing a side combination (α1, α2) into εα, which indicates that an undirected connection side exists between graph nodes corresponding to α1 and α2; and
- 1.10 completing traversal, and combining the graph node set Vα and the side information set εα to complete construction of a graph structure CAFG(Vα, εα).
Furthermore, step 2 includes:
- 2.1 constructing an embedding layer to store the embedding vector information of the mashup services and the API nodes, where Emashup indicates the embedding representations corresponding to the mashup services, and Eα indicates the embedding representations corresponding to the APIs; and combining Emashup and Eα to obtain an embedding layer embedding matrix E;
- 2.2 computing an adjacency matrix of the graph structure SIMG, which is set as Imα:

$$I_{m\alpha} = \begin{bmatrix} 0 & MA \\ MA^{T} & 0 \end{bmatrix}$$

- where MA is the service invocation interaction matrix, the adjacency matrix has a size of (|M|+|A|)×(|M|+|A|) and includes the connection information of the nodes in the graph structure, and T indicates the matrix transpose operation;
- 2.3 defining the degree corresponding to the mashup service m as dm and the degree of the node corresponding to the API α as dα, combining the degree information corresponding to the mashup services and the APIs into a degree matrix Dmα, and computing a normalized form of the adjacency matrix:

$$\tilde{I}_{m\alpha} = D_{m\alpha}^{-1/2} I_{m\alpha} D_{m\alpha}^{-1/2}$$

- where the element corresponding to a position (i, j) in the normalized matrix Ĩmα has a value of $1/\sqrt{d_i^m d_j^\alpha}$, where $d_i^m$ indicates the degree corresponding to the mashup service having a sequence number of i, and $d_j^\alpha$ indicates the degree corresponding to the API node having a sequence number of j;
- 2.4 traversing node information in the graph structure SIMG, setting nodes traversed currently as ν including the mashup type of graph nodes and the API type of graph nodes, setting a neighbor node set of ν in the graph structure, i.e., a node set having a side connection with the nodes, as Nν, and computing a propagation result of each νt∈Nν:
- where l indicates the number of layers of a graph propagation structure of a multi-layer graph neural network, Wl1 and Wl2 are both trainable parameter matrixes of a current layer, ⊗ indicates multiplication operation of vector elements, eν indicates an embedding representation corresponding to the node ν, and eνt indicates an embedding representation of a neighbor node νt of the node ν;
- 2.5 aggregating propagation information of the node ν and the neighbor node of the node to form node feature embedding output of the current layer:
- where σ is an activation function, eνl-1 indicates node embedding output of a previous layer, and Σ indicates vector summation operation; and
- 2.6 according to properties of the node ν, dividing an output node embedding result into a mashup service feature representation eml and an API feature representation eαl for output.
Step 3 includes:
- 3.1 one-hot representation is a vectorized representation method in which each feature is represented by an independent binary dimension: only one dimension has a value of 1, the remaining dimensions all have a value of 0, and the dimensions are configured to represent categorical or discrete variables; constructing an API embedding representation based on one-hot representation, and converting the high-dimensional one-hot representation corresponding to the API set into a dense embedding representation AOE∈R|A|×d through an embedding layer, where d is the dimension of the embedding representation;
- 3.2 obtaining an adjacency matrix Iα∈R|A|×|A| of the graph structure CAFG, where when Iα[u, w]=1, it is indicated that the API αu having a sequence number of u and the API αw having a sequence number of w share the same tag information;
- 3.3 computing a corresponding degree matrix Dα∈R|A|×|A|, where the matrix is a diagonal matrix, where Dα[u, u] indicates the total number of the APIs having the same tag as αu;
- 3.4 computing a normalized adjacency matrix $\tilde{I}_\alpha = D_\alpha^{-1/2} I_\alpha D_\alpha^{-1/2}$ as the convolution weight of the input vector set; and
- 3.5 carrying out graph convolution computation on the embedding representation through the normalized adjacency matrix:

$$C = \sigma\left(\tilde{I}_\alpha\, \sigma\left(\tilde{I}_\alpha A_{OE} W_5\right) W_6\right)$$

- where σ indicates an activation function, W5∈Rd×d and W6∈Rd×d are the weights of the two layers respectively, the computed output C is used as the API feature representation output under this network structure and has a size of C∈R|A|×d, and the feature representation corresponding to the API α is defined as cα.
Step 4 includes:
- 4.1 representing a matrix consisting of all feature representation vectors of the mashup services of the l-th layer output obtained in step 2.6 as Eml, representing a matrix consisting of the feature representation vectors corresponding to the APIs as Eαl, and combining the output of the current propagation layer: $E_q^l = E_m^l \oplus E_\alpha^l$;
- 4.2 combining Eml and the feature matrix C output from step 3.5: $E_{key}^l = E_m^l \oplus C$;
- 4.3 the attention mechanism is a common method in machine learning configured to assign weights to different elements in a sequence; it includes a query part and a key part, and computes the similarity between the query and the key to carry out a weighted summation over the elements, thereby achieving feature extraction and weighted fusion of the sequence; designing the mutual attention mechanism to connect the feature representations of the mashup services output by the multi-layer structure, and computing a query representation of the mutual attention mechanism:
$$q^l = W_q^l E_q^l$$

- where $W_q^l$∈Rd×(|M|+|A|) is a trainable weight parameter, and ql is used as the attention query corresponding to the output result of the l-th propagation layer;
- 4.4 computing a key representation of the mutual attention mechanism:
$$k^l = W_k E_{key}^l$$
- where Wk∈Rd*(|M|+|A|) is a trainable weight parameter, and kl indicates an attention key corresponding to an output result of the l-th propagation layer;
- 4.5 computing the mutual attention weight information corresponding to the l-th layer:

$$\alpha^l = \frac{\exp\left((q^l)^T k^l\right)}{\sum_{l'=1}^{L} \exp\left((q^{l'})^T k^{l'}\right)}$$
- where exp( ) indicates exponential power by using a natural constant e as a base, L is the total number of graph propagation layers, and T indicates vector transpose; and
- 4.6 weighting the output of the graph propagation layers:

$$\alpha E_m^l = \alpha^l \otimes E_m^l$$

- where ⊗ is the operation of computing the product of each element, and αEml indicates the weighted result of the feature vectors of the l-th layer output; and then connecting the weighted output of the layers, and converting the weighted output into a final mashup feature representation:

$$FE_m = f_{cm}\left(\alpha E_m^1 \oplus \alpha E_m^2 \oplus \cdots \oplus \alpha E_m^L\right)$$
- where ⊕ indicates vector connection operation, ƒcm indicates a multi-layer perceptron having an input size of L*d and output of d, and the multi-layer perceptron is a model based on a neural network, and consists of a plurality of fully-connected hidden layers and output layers; and setting a mashup feature vector result corresponding to the mashup m in the matrix FEm as ƒem.
Step 5 includes:
- 5.1 computing, by the API feature representation eαl output by the graph neural network structure based on the graph structure SIMG and the API feature representation cα output by the graph neural network structure based on the graph structure CAFG, a gate weight corresponding to the gate mechanism:
- where Wg is a trainable weight, and Eαl indicates the set of eαl corresponding to all the APIs; and
- 5.2 with the weight for connection, computing a final output result:
- where FEα is a matrix consisting of all the API feature representations obtained by weighting feature representations in a form of a matrix; and setting a feature vector result corresponding to the API α in the matrix FEα as ƒeα.
Step 6 includes:
- 6.1 computing the training similarity between the services, using the dot product of the embedding vectors of two APIs or two mashup services as the similarity computation result:

$$Sim_{train}^{api}(\alpha_1, \alpha_2) = e_{\alpha_1}^{T} e_{\alpha_2}, \qquad Sim_{train}^{mashup}(m_1, m_2) = e_{m_1}^{T} e_{m_2}$$

- where α1, α2 indicate any two different APIs, Simtrainapi (α1, α2) indicates the training similarity between the APIs α1 and α2, m1, m2 indicate any two different mashup services, and Simtrainmashup (m1, m2) indicates the training similarity between the mashups m1 and m2; and
- 6.2 computing the prior information similarity between the services on the basis of similarity computation over the service description documents and the service function tags, first computing the similarity between the service description documents:

$$Sim_{description}^{api}(\alpha_1, \alpha_2) = \cos(e_{\alpha_1}^{lm}, e_{\alpha_2}^{lm}) = \frac{e_{\alpha_1}^{lm} \cdot e_{\alpha_2}^{lm}}{\|e_{\alpha_1}^{lm}\|\,\|e_{\alpha_2}^{lm}\|}, \qquad Sim_{description}^{mashup}(m_1, m_2) = \cos(e_{m_1}^{lm}, e_{m_2}^{lm})$$

- where cos( ) indicates cosine similarity, eαlm indicates the service function representation obtained by computation of a language model corresponding to the API α, Simdescriptionapi (α1, α2) indicates the prior information similarity between the APIs α1 and α2, ∥eαlm∥ indicates the modulus of the vector eαlm, emlm indicates the service function representation obtained by computation of a language model corresponding to the mashup m, Simdescriptionmashup (m1, m2) indicates the prior information similarity between the mashups m1 and m2, and ∥emlm∥ indicates the modulus of the vector emlm;
- 6.3 computing similarity between service tags:
- where mtag indicates a function tag set of the mashup, αtag indicates a tag set of the APIs, |mtag| indicates the number of mashup function tags, and |αtag| indicates the number of the API function tags;
- 6.4 using the larger result between the function description similarity and the tag similarity as the prior information similarity between the services:

$$Sim_{prior}(\cdot) = \max\left(\widetilde{Sim}_{description}(\cdot), \widetilde{Sim}_{tag}(\cdot)\right)$$

- where $\widetilde{Sim}$( ) indicates a normalized similarity result;
- 6.5 using the larger result between the training similarity and the prior information similarity as the overall similarity information between the services, $Sim(\cdot) = \max\left(Sim_{train}(\cdot), Sim_{prior}(\cdot)\right)$, and
- computing this similarity information between the services as the basis for data augmentation;
- 6.6 designing four service sequence data augmentation methods as follows:
- (1) service cutting: for a service sequence mαc={α1, α2, . . . , α|mαc|}, selecting a continuous subsequence having a length of Lsc=⌊μ*|mαc|⌋ from a random position as an augmented sequence, where μ∈(0,1) is a random cutting parameter and |mαc| is the number of the APIs in the service invocation sequence;
- (2) service occlusion: for a service sequence mαc={α1, α2, . . . , α|mαc|}, randomly discarding Lsm=⌊η*|mαc|⌋ APIs, where η∈(0,1) is a random occlusion parameter;
- (3) association replacement: for a service invocation sequence mαc={α1, α2, . . . , α|mαc|}, randomly selecting Lrc=⌊γ*|mαc|⌋ items and replacing each with a service having high function correlation, where γ∈(0,1) indicates a replacement probability parameter;
- (4) association expansion: for a service invocation sequence mαc={α1, α2, . . . , α|mαc|}, randomly selecting Lre=⌊ε*|mαc|⌋ items and inserting a service having high function correlation with each item at the position in which the item is located, where ε∈(0,1) indicates an expansion probability parameter; and
- in the augmentation methods (3) and (4), determining the service function correlation through the similarity information between the services obtained in step 6.5; with respect to the association replacement method (3), for the Lrc services randomly selected, obtaining, through the similarity computation result of step 6.5, the services having the highest function similarity among the candidate services, apart from the services already existing in the original sequence, as new services to replace the randomly selected services; and with respect to the association expansion method (4), for the Lre services randomly selected, obtaining new services in the same way, and adding the new services after the original positions, in their sequences, of the randomly selected services to complete the sequence augmentation;
- 6.7 for the mashup service set M, traversing the mashup services in M, setting a single mashup service traversed currently as m, and, for the service invocation sequence mαm corresponding to m, selecting two sequence data augmentation methods provided in step 6.6 to process the sequence, so as to generate two newly augmented sequence results, which are represented as αsm1 and αsm2 respectively; for the API set A, traversing the APIs in A, setting a single API traversed currently as α, setting the mashup sequence that invoked the API as αmα, and selecting two sequence data augmentation methods provided in step 6.6 to process the sequence, so as to generate two newly augmented sequence results, which are represented as msα1 and msα2 respectively;
Preferably, in step 6.7, a process of selecting the augmentation methods and generating the augmented sequence pairs includes:
- 6.7.1 setting a set for storing combined data augmentation methods, which is set as auglist and is initially an empty set; and storing the four service sequence data augmentation methods in step 6.6 into a set DA, setting the number of elements in the set as nd, and setting a count index as 1;
- 6.7.2 traversing DA, and setting a method object traversed currently as dα;
- 6.7.3 for each traversal, assigning index+1 to target;
- 6.7.4 when target is less than nd, storing da into auglist, storing augmentation methods having a sequence number of target in DA into auglist, and adding 1 to target;
- 6.7.5 repeating steps 6.7.3 and 6.7.4 until target has a value greater than or equal to nd;
- 6.7.6 adding 1 to index;
- 6.7.7 ending traversal, and completing initialization of the set auglist;
- 6.7.8 assigning 0 to a counter now, and setting a set for storing augmented sequence pairs, which is set as ASP, and is initially an empty set;
- 6.7.9 traversing the API invocation sequences mα of the mashups, and storing the invocation sequences corresponding to all the mashups in the set M into a set, which is set as SM;
- 6.7.10 traversing the set SM, and setting a current traversal sequence as sm;
- 6.7.11 obtaining data augmentation methods having a sequence number of now in auglist to process sm, and setting a generated augmented sequence as αsm1;
- 6.7.12 adding 1 to now, and setting now as 0 when now has a size equal to nd;
- 6.7.13 obtaining data augmentation methods having a sequence number of now in auglist to process sm, and setting a generated augmented sequence as αsm2;
- 6.7.14 adding 1 to now, and setting now to 0 when now has a size equal to nd;
- 6.7.15 storing pair-wise augmented sequence results (αsm1, αsm2) into an augmented sequence pair set ASP;
- 6.7.16 completing traversal of the set SM;
- 6.7.17 traversing the mashup invocation sequences αmα of the APIs, and storing the invoked sequences corresponding to all the APIs in the set A into a set, which is set as SA;
- 6.7.18 traversing the set SA, and setting a current traversal sequence as sa;
- 6.7.19 obtaining data augmentation methods having a sequence number of now in auglist to process sa, and setting a generated augmented sequence as msα1;
- 6.7.20 adding 1 to now, and setting now as 0 when now has a size equal to nd;
- 6.7.21 obtaining data augmentation methods having a sequence number of now in auglist to process sa, and setting a generated augmented sequence as msα2;
- 6.7.22 adding 1 to now, and setting now to 0 when now has a size equal to nd;
- 6.7.23 storing pair-wise augmented sequence results (msα1, msα2) into an augmented sequence pair set ASP; and
- 6.7.24 completing traversal, and outputting a pair-wise augmented sequence result set ASP.
- 6.8 obtaining the API feature representations corresponding to all the APIs in αsm1 from the API feature representation matrix FEα, and carrying out mean-pooling on all the obtained feature representations as the representation result corresponding to the sequence αsm1, which is set as emaug1, where mean-pooling averages each dimension of the feature vectors to obtain a new feature vector; carrying out the same processing on αsm2 to obtain a representation result, which is set as emaug2; obtaining the mashup feature representations corresponding to all the mashup services in msα1 from the mashup feature representation matrix FEm, and carrying out mean-pooling on all the obtained feature representations as the representation result corresponding to the sequence msα1, which is set as eαaug1; and carrying out the same processing on msα2 to obtain a representation result, which is set as eαaug2; and
- 6.9 completing traversal, and computing a contrastive loss result for augmented sequence pairs generated by all the mashup services:
- computing a contrastive loss result for augmented sequence pairs corresponding to all the APIs:
- an overall loss result is as follows:
where τ is a temperature coefficient configured to control the model's discrimination of negative samples so as to adjust the generalization capability of the model, and log indicates a logarithmic function.
Step 7 includes:
- 7.1 computing a pair-wise recommendation score between the mashup m and the candidate API α″: $\hat{\gamma}_{m,\alpha''} = fe_m^{T} fe_{\alpha''}$, where T indicates vector transpose;
- 7.2 for the mashup service m, optimizing parameters through a pair-wise Bayesian personalized ranking (BPR) loss on the basis of the service invocation information in the training data, where the APIs invoked by the mashup in the data set are called positive invocation examples, which are defined as Xm+, the remaining candidate APIs are called negative invocation examples Xm−, and the overall loss function is computed as:
- where O={(m, α′, b′)|(m, α′)∈Xm+, (m, b′)∈Xm−}, and α′ and b′ indicate a positive API and a negative API obtained by sampling, respectively;
- 7.3 computing an overall loss result according to steps 6.9 and 7.2:
- where θ is a hyper-parameter for controlling intensity of contrastive learning; and
- 7.4 cycling the overall model from steps 1 to 7, and optimizing the overall model through the loss function in step 7.3 to fit the data set by model parameters.
Step 8 includes:
- 8.1 converting the user request for constructing a new combined service into a vector representation through a language model, matching the similarity between the vector representation and the mashup service function descriptions in the existing data set, and using the q objects having the highest similarity as associated mashup services, which are represented as relationM={rm1, rm2, . . . , rmq};
- 8.2 carrying out a mean-pooling operation on the representations eml corresponding to the mashup services in the set relationM to construct a feature representation enewR corresponding to the user request, where mean-pooling is a common pooling operation that averages each dimension of the feature vectors to obtain a new feature vector;
- 8.3 transmitting enewR into the recommendation model optimized through steps 6 and 7 to output a corresponding feature representation result ƒenewR;
- 8.4 traversing the candidate APIs, setting the API traversed currently as α, computing the corresponding pair-wise recommendation scores $\hat{\gamma}_{newR,\alpha}$ according to the method in step 7.1, and forming, from all the pair-wise recommendation scores, a set YnewR; and
- 8.5 sorting YnewR in descending order, and outputting the top k APIs to complete the API service recommendation for the new combined request.
The present invention has the following beneficial effects: (1) according to the service data characteristics and the invocation structure, the double-graph structure is designed to fuse various kinds of information and better learn service function characteristics; (2) on the basis of the graph neural network, the service recommendation model is optimized end to end, thereby improving the effect of service recommendation; (3) a multi-layer graph neural network that uses the invocation graph structure as input is designed to fuse representation results between services by means of different model structures; (4) according to the characteristics of the multi-layer graph propagation results, the mutual attention mechanism is designed to integrate the feature representation results of the multi-layer output; and (5) a contrastive learning method is designed to augment the sequence data, thereby alleviating the problem of data sparsity and improving the effect of service recommendation.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic diagram of an overall structure of a service recommendation model;
FIG. 2 is an effect diagram of a service recommendation model under changes of various parameters, where (a) shows a change trend of results of recommendation indexes along with changes of dimensions of an embedding representation; (b) shows a change trend of results of recommendation indexes along with changes of the number of graph propagation layers; (c) shows a change trend of results of recommendation indexes along with changes of probability parameters γ and ε in a data augmentation method; and (d) shows a change trend of results of recommendation indexes along with changes of intensity of contrastive learning;
FIG. 3 is a graph of a result of an ablation experiment of a service recommendation model; and
FIG. 4 is a graph of a result of an ablation experiment of a mutual attention mechanism.
DETAILED DESCRIPTION OF THE EMBODIMENTS
The present invention will be further described below with reference to the accompanying drawings.
With reference to FIGS. 1-4, a cloud-native application programming interface (API) recommendation method fusing data augmentation and contrastive learning includes:
- step 1, on the basis of service invocation matrix information and function association information between APIs, construct an invocation graph structure, service invocation matrix graph (SIMG), between mashup services and API services and an invocation graph structure, connecting API functional graph (CAFG), between the APIs to form a double-graph structure. The step includes:
- 1.1 represent mashup services in a mashup service set M as a mashup type of graph nodes, represent APIs in an API set A as an API type of graph nodes, and represent an overall graph node set as Vmα;
- 1.2 set an empty set εmα as a storage set of side information in the graph structure SIMG;
- 1.3 traverse the mashup services in M, set a single mashup service traversed currently as m, and obtain an API set minvoke invoked by m from an invocation interaction matrix MA between the mashups and the APIs, where the service invocation interaction matrix MA has a size of |M|×|A|, where |M| indicates the number of elements in the set M, and |A| indicates the number of elements in the set A;
- 1.4 traverse the API set minvoke obtained in the previous step, set the API traversed currently as α, and when the value at the corresponding position [m, α] in the invocation interaction matrix MA is 1, which indicates that an invocation relation exists between m and α, store a side combination (m, α) into εmα, which indicates that an undirected connection side exists between the graph nodes corresponding to m and α;
- 1.5 complete traversal, and combine a graph node set Vmα and a side information set εmα to complete construction of a graph structure SIMG (Vmα, εmα)
- 1.6 convert the API set A into a graph node set, which is represented as Vα;
- 1.7 set an empty set εα as a storage set of side information in the graph structure CAFG;
- 1.8 traverse the APIs in the set A, set a single API traversed currently as α1, and on this basis, traverse the APIs in A apart from α1, and set a single API traversed currently as α2;
- 1.9 with a tag information set Atag of the APIs including tags corresponding to the APIs, set a tag set corresponding to the APIs α1 as Aα1tag, set a tag set corresponding to the APIs α2 as Aα2tag, and when Aα1tag∩Aα2tag is not an empty set, store a side combination (α1, α2) into εα, which indicates that an undirected connection side exists between graph nodes corresponding to α1 and α2; and
- 1.10 complete traversal, and combine the graph node set Vα and the side information set εα to complete construction of a graph structure CAFG(Vα, εα).
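To make the double-graph construction of step 1 concrete, the following is a minimal Python sketch, assuming the invocation matrix MA is a binary numpy array and the API tags are given as Python sets; the function and variable names are illustrative, not part of the claimed method.

```python
import numpy as np

def build_double_graph(MA, api_tags):
    """Sketch of step 1: SIMG edges come from the invocation matrix MA
    (|M| x |A|, binary), CAFG edges from shared API tags.
    api_tags: list of tag sets, one per API (illustrative input format)."""
    n_mashups, n_apis = MA.shape
    # SIMG: an undirected side (m, a) for every invocation MA[m, a] == 1
    simg_edges = {(m, a) for m in range(n_mashups)
                  for a in range(n_apis) if MA[m, a] == 1}
    # CAFG: an undirected side (a1, a2) whenever two distinct APIs share a tag
    cafg_edges = {(a1, a2) for a1 in range(n_apis) for a2 in range(n_apis)
                  if a1 != a2 and api_tags[a1] & api_tags[a2]}
    return simg_edges, cafg_edges
```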
- Step 2, according to the graph structure SIMG, design a multi-layer graph propagation network structure to learn mashup feature representations and API feature representations. As shown in part ① in FIG. 1, a schematic diagram of a three-layer graph propagation structure is provided. The process is as follows:
- 2.1 construct an embedding layer to store the embedding vector information of the mashup services and the API nodes, where Emashup indicates the embedding representations corresponding to the mashup services, and Eα indicates the embedding representations corresponding to the APIs; and combine Emashup and Eα to obtain an embedding layer embedding matrix E;
- 2.2 compute an adjacency matrix of the graph structure SIMG, which is set as Imα:

$$I_{m\alpha} = \begin{bmatrix} 0 & MA \\ MA^{T} & 0 \end{bmatrix}$$

- where MA is the service invocation interaction matrix, the adjacency matrix has a size of (|M|+|A|)×(|M|+|A|) and includes the connection information of the nodes in the graph structure, and T indicates the matrix transpose operation;
- 2.3 define the degree corresponding to the mashup service m as dm and the degree of the node corresponding to the API α as dα, combine the degree information corresponding to the mashup services and the APIs into a degree matrix Dmα, and compute a normalized form of the adjacency matrix:

$$\tilde{I}_{m\alpha} = D_{m\alpha}^{-1/2} I_{m\alpha} D_{m\alpha}^{-1/2}$$

- where the element corresponding to a position (i, j) in the normalized matrix Ĩmα has a value of $1/\sqrt{d_i^m d_j^\alpha}$, where $d_i^m$ indicates the degree corresponding to the mashup service having a sequence number of i, and $d_j^\alpha$ indicates the degree corresponding to the API node having a sequence number of j;
- 2.4 traverse node information in the graph structure SIMG, set nodes traversed currently as ν including the mashup type of graph nodes and the API type of graph nodes, set a neighbor node set of ν in the graph structure, i.e., a node set having a side connection with the nodes, as Nν, and compute a propagation result of each νt∈Nν:
where l indicates the number of layers of a graph propagation structure of a multi-layer graph neural network, Wl1 and Wl2 are both trainable parameter matrixes of a current layer, ⊗ indicates multiplication operation of vector elements, eν indicates an embedding representation corresponding to the node ν, and eνt indicates an embedding representation of a neighbor node νt of the node ν;
- 2.5 aggregate propagation information of the node ν and the neighbor node of the node to form node feature embedding output of the current layer:
- where σ is an activation function, eνl-1 indicates node embedding output of a previous layer, and Σ indicates vector summation operation; and
- 2.6 according to properties of the node ν, divide an output node embedding result into a mashup service feature representation eml and an API feature representation eαl for output.
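The propagation and aggregation formulas of steps 2.4 and 2.5 are not reproduced above. The sketch below therefore assumes an NGCF-style layer, which is consistent with the trainable matrices W1 and W2 and the element-wise product ⊗ named in the text; it is a sketch under that assumption, not a definitive implementation, and all names are illustrative.

```python
import numpy as np

def propagate_simg(E, norm_adj, W1, W2, L=3):
    """Sketch of steps 2.4-2.5 assuming an NGCF-style layer:
    E^l = sigma((A~ + I) E^{l-1} W1^l + (A~ E^{l-1} * E^{l-1}) W2^l).
    E: (|M|+|A|) x d embedding matrix; norm_adj: the normalized adjacency
    ~I_ma of step 2.3; W1, W2: lists of L per-layer d x d weight matrices."""
    I = np.eye(norm_adj.shape[0])
    outputs, e = [], E
    for l in range(L):
        neigh = norm_adj @ e                    # propagated neighbor features
        e = np.tanh((norm_adj + I) @ e @ W1[l]  # linear part with self-loop
                    + (neigh * e) @ W2[l])      # element-wise interaction term
        outputs.append(e)                       # keep every layer for step 4
    return outputs  # rows are split into mashup / API parts in step 2.6
```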
- Step 3, according to the graph structure CAFG, design a corresponding graph neural network structure to compute feature representations corresponding to APIs under the graph neural network structure. As shown in part ② in FIG. 1, a schematic diagram of a double-layer graph convolution structure is provided. The process is as follows:
- 3.1 one-hot representation is a vectorized representation method in which each feature is represented by an independent binary dimension: only one dimension has a value of 1, the remaining dimensions all have a value of 0, and the dimensions are configured to represent categorical or discrete variables; construct an API embedding representation based on one-hot representation, and convert the high-dimensional one-hot representation corresponding to the API set into a dense embedding representation AOE∈R|A|×d through an embedding layer, where d is the dimension of the embedding representation;
- 3.2 obtain an adjacency matrix Iα∈R|A|×|A| of the graph structure CAFG, where when Iα[u, w]=1, it is indicated that the API αu having a sequence number of u and the API αw having a sequence number of w share the same tag information;
- 3.3 compute a corresponding degree matrix Dα∈R|A|×|A|, where the matrix is a diagonal matrix, where Dα[u, u] indicates the total number of the APIs having the same tag as αu;
- 3.4 compute a normalized adjacency matrix $\tilde{I}_\alpha = D_\alpha^{-1/2} I_\alpha D_\alpha^{-1/2}$ as the convolution weight of the input vector set; and
- 3.5 carry out graph convolution computation on the embedding representation through the normalized adjacency matrix:

$$C = \sigma\left(\tilde{I}_\alpha\, \sigma\left(\tilde{I}_\alpha A_{OE} W_5\right) W_6\right)$$

- where σ indicates an activation function, W5∈Rd×d and W6∈Rd×d are the weights of the two layers respectively, the computed output C is used as the API feature representation output under this network structure and has a size of C∈R|A|×d, and the feature representation corresponding to the API α is defined as cα.
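A compact sketch of the CAFG branch (steps 3.2 to 3.5), assuming the standard two-layer graph convolution form; the exact activation is not specified in the text, so tanh stands in for σ, and all names are illustrative.

```python
import numpy as np

def cafg_features(A_OE, I_a, W5, W6):
    """Sketch of steps 3.2-3.5: two-layer graph convolution over CAFG.
    A_OE: |A| x d dense API embeddings; I_a: |A| x |A| tag-sharing
    adjacency; W5, W6: d x d layer weights (illustrative)."""
    deg = I_a.sum(axis=1)                        # diagonal of degree matrix D_a
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg + 1e-12))
    norm = D_inv_sqrt @ I_a @ D_inv_sqrt         # normalized ~I_a, step 3.4
    H = np.tanh(norm @ A_OE @ W5)                # first convolution layer
    return np.tanh(norm @ H @ W6)                # output C, size |A| x d
```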
- step 4, design a mutual attention mechanism layer to retain the multi-layer output information, and compute attention weights of the multi-layer output and merge the feature representations of the mashup services by taking the association information of the APIs as importance guidance. As shown in part ③ in FIG. 1, a schematic diagram of the mutual attention mechanism layer is provided. The process is as follows:
- 4.1 represent a matrix consisting of all feature representation vectors of the mashup services of the l-th layer output obtained in step 2.6 as Eml, represent a matrix consisting of the feature representation vectors corresponding to the APIs as Eαl, and combine the output of the current propagation layer: $E_q^l = E_m^l \oplus E_\alpha^l$;
- 4.2 combine Eml and the feature matrix C output from step 3.5: $E_{key}^l = E_m^l \oplus C$;
- 4.3 the attention mechanism is a common method in machine learning configured to assign weights to different elements in a sequence; it includes a query part and a key part, and computes the similarity between the query and the key to carry out a weighted summation over the elements, thereby achieving feature extraction and weighted fusion of the sequence; design the mutual attention mechanism to connect the feature representations of the mashup services output by the multi-layer structure, and compute a query representation of the mutual attention mechanism:

$$q^l = W_q^l E_q^l$$

- where $W_q^l$∈Rd×(|M|+|A|) is a trainable weight parameter, and ql is used as the attention query corresponding to the output result of the l-th propagation layer;
- 4.4 compute a key representation of the mutual attention mechanism:

$$k^l = W_k E_{key}^l$$

- where Wk∈Rd×(|M|+|A|) is a trainable weight parameter, and kl indicates the attention key corresponding to the output result of the l-th propagation layer;
- 4.5 compute the mutual attention weight information corresponding to the l-th layer:

$$\alpha^l = \frac{\exp\left((q^l)^T k^l\right)}{\sum_{l'=1}^{L} \exp\left((q^{l'})^T k^{l'}\right)}$$
- where exp( ) indicates exponential power by using a natural constant e as a base, L is the total number of graph propagation layers, and T indicates vector transpose; and
- 4.6 weight the output of the graph propagation layers:

$$\alpha E_m^l = \alpha^l \otimes E_m^l$$

- where ⊗ is the operation of computing the product of each element, and αEml indicates the weighted result of the feature vectors of the l-th layer output; and then connect the weighted output of the layers, and convert the weighted output into a final mashup feature representation:

$$FE_m = f_{cm}\left(\alpha E_m^1 \oplus \alpha E_m^2 \oplus \cdots \oplus \alpha E_m^L\right)$$

- where ⊕ indicates the vector connection operation, ƒcm indicates a multi-layer perceptron having an input size of L*d and an output size of d, where the multi-layer perceptron is a neural-network-based model consisting of a plurality of fully-connected hidden layers and output layers; and set the mashup feature vector result corresponding to the mashup m in the matrix FEm as ƒem.
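The layer-weighting formula of step 4.5 is read here as a softmax over the L layer outputs, which matches the exp( ) and the sum over layers named in the text; the sketch below fuses the per-layer mashup features under that assumption, with illustrative names.

```python
import numpy as np

def fuse_layers(Em_layers, q_layers, k_layers, mlp):
    """Sketch of steps 4.5-4.6: alpha^l = softmax over layers of q^l . k^l,
    then concatenate the weighted layer outputs and map them to d dims.
    Em_layers: L matrices of |M| x d mashup features; q_layers, k_layers:
    the per-layer query/key representations of steps 4.3-4.4 (flattened to
    vectors here); mlp: a callable mapping L*d columns to d (the f_cm MLP)."""
    scores = np.array([q @ k for q, k in zip(q_layers, k_layers)])
    alpha = np.exp(scores - scores.max())       # numerically stable softmax
    alpha /= alpha.sum()
    weighted = np.concatenate(
        [a * Em for a, Em in zip(alpha, Em_layers)], axis=1)  # |M| x (L*d)
    return mlp(weighted)                        # FE_m, |M| x d
```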
- step 5, carry out weighted combination on API feature representations of double graphs through a gate mechanism to generate a new API feature representation. The process is as follows:
- 5.1 compute, by the API feature representation eαl output by the graph neural network structure based on the graph structure SIMG and the API feature representation cα output by the graph neural network structure based on the graph structure CAFG, a gate weight corresponding to the gate mechanism:
- where Wg is a trainable weight, and Eαl indicates a set of eαl corresponding to all the services; and
- 5.2 with the weight for connection, compute a final output result:
- where FEα is a matrix consisting of all the API feature representations obtained by weighting feature representations in a form of a matrix; and set a feature vector result corresponding to the API α in the matrix FEα as ƒeα.
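The gate equations of step 5 are not reproduced above; a standard sigmoid gate consistent with the description (a learned weight blending the SIMG-side and CAFG-side API features) would look as follows. This is a sketch under that assumption; the shape of Wg is illustrative.

```python
import numpy as np

def gate_combine(Ea_l, C, Wg):
    """Sketch of step 5: a sigmoid gate g in (0,1) blends the SIMG-side
    features Ea_l with the CAFG-side features C, per API and per dimension.
    Ea_l, C: |A| x d feature matrices; Wg: d x 2d trainable weight (assumed)."""
    z = np.concatenate([Ea_l, C], axis=1) @ Wg.T   # |A| x d pre-activation
    g = 1.0 / (1.0 + np.exp(-z))                   # gate weight
    return g * Ea_l + (1.0 - g) * C                # FE_a, |A| x d
```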
- step 6, compute training similarity and prior information similarity between services, and generate data augmented sequence pairs through a data augmentation method. The process is as follows:
- 6.1 compute the training similarity between the services, using the dot product of the embedding vectors of two APIs or two mashup services as the similarity computation result:

$$Sim_{train}^{api}(\alpha_1, \alpha_2) = e_{\alpha_1}^{T} e_{\alpha_2}, \qquad Sim_{train}^{mashup}(m_1, m_2) = e_{m_1}^{T} e_{m_2}$$

- where α1, α2 indicate any two different APIs, Simtrainapi (α1, α2) indicates the training similarity between the APIs α1 and α2, m1, m2 indicate any two different mashup services, and Simtrainmashup (m1, m2) indicates the training similarity between the mashups m1 and m2; and
- 6.2 compute the prior information similarity between the services on the basis of similarity computation over the service description documents and the service function tags, first computing the similarity between the service description documents:

$$Sim_{description}^{api}(\alpha_1, \alpha_2) = \cos(e_{\alpha_1}^{lm}, e_{\alpha_2}^{lm}) = \frac{e_{\alpha_1}^{lm} \cdot e_{\alpha_2}^{lm}}{\|e_{\alpha_1}^{lm}\|\,\|e_{\alpha_2}^{lm}\|}, \qquad Sim_{description}^{mashup}(m_1, m_2) = \cos(e_{m_1}^{lm}, e_{m_2}^{lm})$$

- where cos( ) indicates cosine similarity, eαlm indicates the service function representation obtained by computation of a language model corresponding to the API α, Simdescriptionapi (α1, α2) indicates the prior information similarity between the APIs α1 and α2, ∥eαlm∥ indicates the modulus of the vector eαlm, emlm indicates the service function representation obtained by computation of a language model corresponding to the mashup m, Simdescriptionmashup (m1, m2) indicates the prior information similarity between the mashups m1 and m2, and ∥emlm∥ indicates the modulus of the vector emlm;
- 6.3 compute similarity between service tags:
- where mtag indicates a function tag set of the mashup, αtag indicates a tag set of the APIs, |mtag| indicates the number of mashup function tags, and |αtag| indicates the number of the API function tags;
- 6.4 use the larger result between the function description similarity and the tag similarity as the prior information similarity between the services:

$$Sim_{prior}(\cdot) = \max\left(\widetilde{Sim}_{description}(\cdot), \widetilde{Sim}_{tag}(\cdot)\right)$$

- where $\widetilde{Sim}$( ) indicates a normalized similarity result;
- 6.5 use the larger result between the training similarity and the prior information similarity as the overall similarity information between the services, $Sim(\cdot) = \max\left(Sim_{train}(\cdot), Sim_{prior}(\cdot)\right)$, and
- compute this similarity information between the services as the basis for data augmentation;
- 6.6 design four service sequence data augmentation methods as follows:
- (1) service cutting: for a service sequence mαc={α1, α2, . . . , α|mαc|}, select a continuous subsequence having a length of Lsc=⌊μ*|mαc|⌋ from a random position as an augmented sequence, where μ∈(0,1) is a random cutting parameter and |mαc| is the number of the APIs in the service invocation sequence;
- (2) service occlusion: for a service sequence mαc={α1, α2, . . . , α|mαc|}, randomly discard Lsm=⌊η*|mαc|⌋ APIs, where η∈(0,1) is a random occlusion parameter;
- (3) association replacement: for a service invocation sequence mαc={α1, α2, . . . , α|mαc|}, randomly select Lrc=⌊γ*|mαc|⌋ items and replace each with a service having high function correlation, where γ∈(0,1) indicates a replacement probability parameter;
- (4) association expansion: for a service invocation sequence mαc={α1, α2, . . . , α|mαc|}, randomly select Lre=⌊ε*|mαc|⌋ items and insert a service having high function correlation with each item at the position in which the item is located, where ε∈(0,1) indicates an expansion probability parameter; and
- in the augmentation methods (3) and (4), determine the service function correlation through the similarity information between the services obtained in step 6.5; with respect to the association replacement method (3), for the Lrc services randomly selected, obtain, through the similarity computation result of step 6.5, the services having the highest function similarity among the candidate services, apart from the services already existing in the original sequence, as new services to replace the randomly selected services; and with respect to the association expansion method (4), for the Lre services randomly selected, obtain new services in the same way, and add the new services after the original positions, in their sequences, of the randomly selected services to complete the sequence augmentation (a minimal sketch of these four operations is given below);
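A minimal sketch of the four augmentation operations of step 6.6. The floor-style lengths and the most_similar helper (which implements the step-6.5 similarity lookup while excluding services already in the sequence) are assumptions; only the operations themselves follow the text.

```python
import random

def service_cut(seq, mu=0.6):
    """(1) service cutting: keep a random contiguous subsequence."""
    length = max(1, int(mu * len(seq)))
    start = random.randint(0, len(seq) - length)
    return seq[start:start + length]

def service_occlude(seq, eta=0.6):
    """(2) service occlusion: randomly discard eta * |seq| items."""
    drop = set(random.sample(range(len(seq)), int(eta * len(seq))))
    return [s for i, s in enumerate(seq) if i not in drop]

def assoc_replace(seq, most_similar, gamma=0.6):
    """(3) association replacement: swap selected items for their most
    function-similar candidates not already present in the sequence."""
    idxs = random.sample(range(len(seq)), max(1, int(gamma * len(seq))))
    out = list(seq)
    for i in idxs:
        out[i] = most_similar(out[i], exclude=set(out))
    return out

def assoc_expand(seq, most_similar, eps=0.6):
    """(4) association expansion: insert a similar service after selected items."""
    idxs = sorted(random.sample(range(len(seq)), max(1, int(eps * len(seq)))),
                  reverse=True)   # insert from the back so indices stay valid
    out = list(seq)
    for i in idxs:
        out.insert(i + 1, most_similar(out[i], exclude=set(out)))
    return out
```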
- 6.7 for the mashup service set M, traverse the mashup services in M, set a single mashup service traversed currently as m, and, for the service invocation sequence mαm corresponding to m, select two sequence data augmentation methods provided in step 6.6 to process the sequence, so as to generate two newly augmented sequence results, which are represented as αsm1 and αsm2 respectively; for the API set A, traverse the APIs in A, set a single API traversed currently as α, set the mashup sequence that invoked the API as αmα, and select two sequence data augmentation methods provided in step 6.6 to process the sequence, so as to generate two newly augmented sequence results, which are represented as msα1 and msα2 respectively. A process of selecting the augmentation methods and generating the augmented sequence pairs includes:
- 6.7.1 set a set for storing combined data augmentation methods, which is set as auglist and is initially an empty set; and store the four service sequence data augmentation methods in step 6.6 into a set DA, set the number of elements in the set as nd, and set a count index as 1;
- 6.7.2 traverse DA, and set a method object traversed currently as dα;
- 6.7.3 for each traversal, assign index+1 to target;
- 6.7.4 when target is less than nd, store dα into auglist, store augmentation methods having a sequence number of target in DA into auglist, and add 1 to target;
- 6.7.5 repeat steps 6.7.3 and 6.7.4 until target has a value greater than or equal to nd;
- 6.7.6 add 1 to index;
- 6.7.7 end traversal, and complete initialization of the set auglist;
- 6.7.8 assign 0 to a counter now, and set a set for storing augmented sequence pairs, which is set as ASP, and is initially an empty set;
- 6.7.9 traverse the API invocation sequences mα of the mashups, and store the invocation sequences corresponding to all the mashups in the set M into a set, which is set as SM;
- 6.7.10 traverse the set SM, and set a current traversal sequence as sm;
- 6.7.11 obtain data augmentation methods having a sequence number of now in auglist to process sm, and set a generated augmented sequence as αsm1;
- 6.7.12 add 1 to now, and set now as 0 when now has a size equal to nd;
- 6.7.13 obtain data augmentation methods having a sequence number of now in auglist to process sm, and set a generated augmented sequence as αsm2;
- 6.7.14 add 1 to now, and set now to 0 when now has a size equal to nd;
- 6.7.15 store pair-wise augmented sequence results (αsm1, αsm2) into an augmented sequence pair set ASP;
- 6.7.16 complete traversal of the set SM;
- 6.7.17 traverse the mashup invocation sequences αmα of the APIs, and store the invoked sequences corresponding to all the APIs in the set A into a set, which is set as SA;
- 6.7.18 traverse the set SA, and set a current traversal sequence as sa;
- 6.7.19 obtain data augmentation methods having a sequence number of now in auglist to process sa, and set a generated augmented sequence as msα1;
- 6.7.20 add 1 to now, and set now as 0 when now has a size equal to nd;
- 6.7.21 obtain data augmentation methods having a sequence number of now in auglist to process sa, and set a generated augmented sequence as msα2;
- 6.7.22 add 1 to now, and set now to 0 when now has a size equal to nd;
- 6.7.23 store pair-wise augmented sequence results (msα1, msα2) into an augmented sequence pair set ASP; and
- 6.7.24 complete traversal, and output a pair-wise augmented sequence result set ASP.
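Steps 6.7.1 to 6.7.24 amount to enumerating the ordered pairings of distinct augmentation methods and cycling through that list while producing one augmented pair per sequence. A condensed sketch, assuming each method is a one-argument callable (parameters can be bound with functools.partial):

```python
from itertools import combinations

def augmented_pairs(sequences, methods):
    """Sketch of step 6.7: auglist holds every ordered pairing of two
    distinct methods (6.7.1-6.7.7); the counter `now` then cycles through
    it, applying two methods to each sequence (6.7.8-6.7.24).
    methods: the four callables of step 6.6, parameters already bound."""
    auglist = [m for pair in combinations(methods, 2) for m in pair]
    now, pairs = 0, []
    for seq in sequences:
        aug1 = auglist[now](seq)
        now = (now + 1) % len(auglist)
        aug2 = auglist[now](seq)
        now = (now + 1) % len(auglist)
        pairs.append((aug1, aug2))   # stored into the set ASP in the text
    return pairs
```

For example, methods could be [partial(service_cut, mu=0.6), partial(service_occlude, eta=0.6), ...] using the operations sketched after step 6.6.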
- 6.8 obtain the API feature representations corresponding to all the APIs in αsm1 from the API feature representation matrix FEα, and carry out mean-pooling on all the obtained feature representations as the representation result corresponding to the sequence αsm1, which is set as emaug1, where mean-pooling is a common pooling operation that averages each dimension of the feature vectors to obtain a new feature vector; carry out the same processing on αsm2 to obtain a representation result, which is set as emaug2; obtain the mashup feature representations corresponding to all the mashup services in msα1 from the mashup feature representation matrix FEm, and carry out mean-pooling on all the obtained feature representations as the representation result corresponding to the sequence msα1, which is set as eαaug1; and carry out the same processing on msα2 to obtain a representation result, which is set as eαaug2; and
- 6.9 complete traversal, and compute a contrastive loss result for augmented sequence pairs generated by all the mashup services:
- compute a contrastive loss result for augmented sequence pairs corresponding to all the APIs:
- an overall loss result is as follows:
- where τ is a temperature coefficient configured to control the model's discrimination of negative samples so as to adjust the generalization capability of the model, and log indicates a logarithmic function.
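The contrastive loss formulas of step 6.9 are not reproduced above. A standard InfoNCE form consistent with the surrounding description (augmented pairs as positives, other sequences' augmentations as negatives, temperature τ) would be the following; this is an assumed form, not a quotation of the patent's equations:

$$
L_{cl}^{mashup} = -\sum_{m \in M} \log
\frac{\exp\left(\mathrm{sim}(e_m^{aug1}, e_m^{aug2})/\tau\right)}
{\sum_{m' \neq m} \exp\left(\mathrm{sim}(e_m^{aug1}, e_{m'}^{aug2})/\tau\right)}
$$

with $L_{cl}^{api}$ defined analogously over the API-side pairs $(e_\alpha^{aug1}, e_\alpha^{aug2})$, $\mathrm{sim}(\cdot,\cdot)$ a dot-product or cosine similarity, and the overall result $L_{cl} = L_{cl}^{mashup} + L_{cl}^{api}$.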
- Step 7, compute pair-wise scores through the obtained feature representation, and compute an overall loss function result through the pair-wise scores and the data augmented sequence pairs to optimize parameters of an overall recommendation model. The process is as follows:
- 7.1 compute a pair-wise recommendation score between the mashup m and the candidate API α″: $\hat{\gamma}_{m,\alpha''} = fe_m^{T} fe_{\alpha''}$, where T indicates vector transpose;
- 7.2 for the mashup service m, optimize parameters through a pair-wise Bayesian personalized ranking (BPR) loss on the basis of the service invocation information in the training data, where the APIs invoked by the mashup in the data set are called positive invocation examples, which are defined as Xm+, the remaining candidate APIs are called negative invocation examples Xm−, and the overall loss function is computed as:
- where O={(m, α′, b′)|(m, α′)∈Xm+, (m, b′)∈Xm−}, and α′ and b′ indicate a positive API and a negative API obtained by sampling, respectively;
- 7.3 compute an overall loss result according to steps 6.9 and 7.2:
- where θ is a hyper-parameter for controlling intensity of contrastive learning; and
- 7.4 cycle the overall model from steps 1 to 7, and optimize the overall model through the loss function in step 7.3 to fit the data set by model parameters.
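Likewise, the loss formulas of steps 7.2 and 7.3 are omitted above; the standard pair-wise BPR form and the weighted combination implied by the text would read as follows, under the same caveat that this is an assumed reconstruction:

$$
L_{BPR} = -\sum_{(m,\,\alpha',\,b') \in O} \ln \sigma\left(\hat{\gamma}_{m,\alpha'} - \hat{\gamma}_{m,b'}\right),
\qquad
L = L_{BPR} + \theta\, L_{cl}
$$

where $\sigma$ is the sigmoid function and $\theta$ is the contrastive-intensity hyper-parameter of step 7.3.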
- Step 8, match a user request, sort the pair-wise scores and complete service recommendation.
- 8.1 convert the user request for constructing a new combined service into a vector representation through a language model, such as bidirectional encoder representations from transformers (BERT) or word to vector (Word2vec), match the similarity between the vector representation and the mashup service function descriptions in the existing data set, and use the q objects having the highest similarity as associated mashup services, which are represented as relationM={rm1, rm2, . . . , rmq};
- 8.2 carry out a mean-pooling operation on the representations eml corresponding to the mashup services in the set relationM to construct a feature representation enewR corresponding to the user request, where mean-pooling is a common pooling operation that averages each dimension of the feature vectors to obtain a new feature vector;
- 8.3 transmit enewR into the recommendation model optimized through steps 6 and 7 to output a corresponding feature representation result ƒenewR;
- 8.4 traverse the candidate APIs, set the API traversed currently as α, compute the corresponding pair-wise recommendation scores $\hat{\gamma}_{newR,\alpha}$ according to the method in step 7.1, and form, from all the pair-wise recommendation scores, a set YnewR; and
- 8.5 sort YnewR in descending order, and output the top k APIs to complete the API service recommendation for the new combined request.
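Step 8 reduces to a nearest-neighbor match on description vectors followed by dot-product scoring. A minimal sketch, assuming all representations are rows of numpy arrays and the description vectors are L2-normalized; names are illustrative:

```python
import numpy as np

def recommend(request_vec, mashup_desc_vecs, FE_m, FE_a, q=5, k=10):
    """Sketch of step 8: find the q mashups whose descriptions best match
    the request (8.1), mean-pool their features into e_newR (8.2), score
    all candidate APIs pair-wise by dot product (7.1 / 8.4), return top-k (8.5)."""
    sims = mashup_desc_vecs @ request_vec   # cosine match on unit vectors
    top_q = np.argsort(-sims)[:q]           # the associated set relationM
    e_newR = FE_m[top_q].mean(axis=0)       # mean-pooled request feature
    scores = FE_a @ e_newR                  # pair-wise recommendation scores
    return np.argsort(-scores)[:k]          # indexes of the top k APIs
```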
The actual effect of the invention is analyzed below according to specific service data.
- 1) A mashup composition and API data set is selected. The contents of the data set are shown in Table 1: 1423 mashups and 1032 candidate APIs having invocation relations with the mashups are included. Corresponding function description documents and tags are collected for each mashup and each API, and the description information is pre-segmented and converted into dense vectors through a pre-trained BERT language model.
TABLE 1

Item                        Mashups    APIs
Number of elements          1423       1032
Number of tags              297        301
Document vector dimension   768        768
- 2) All the services are divided into 5 parts, each containing 20% of the mashup services, by using a cross-validation method, and one part of the services is used as the test set for validation each time.
- 3) The effect of service recommendation is evaluated with multiple indexes. Hit rate (HR):
- where RecAm indicates the recommended API list for a mashup m, and ObsAm indicates the observable API invocation list of m based on real data. Normalized discounted cumulative gain (NDCG) index:
- where N indicates the number of recommended services, n indicates the service at the n-th position in the recommendation list, and the ideal discounted cumulative gain (IDCG) is the sum of the discounted cumulative gain (DCG) values obtained from all recommendation results. Mean average precision (MAP):
- where numm indicates the number of APIs invoked by the mashup m, and P(t) computes the ratio of successfully hit recommended services before the current position t in the recommendation list.
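The HR, NDCG and MAP formulas are omitted above; their standard definitions, consistent with the symbols in the text, are as follows (an assumed reconstruction):

$$
HR = \frac{|RecA_m \cap ObsA_m|}{|ObsA_m|}, \qquad
NDCG@N = \frac{1}{IDCG}\sum_{n=1}^{N}\frac{rel_n}{\log_2(n+1)},
$$

$$
MAP = \frac{1}{|M|}\sum_{m \in M}\frac{1}{num_m}\sum_{t}P(t)\cdot rel_t,
$$

where $rel_n \in \{0,1\}$ marks whether the service at position $n$ is actually invoked.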
- 4) A parameter experiment related to the service recommendation model is designed. The main parameters affecting the effect of the model include the embedding representation dimension, the number of graph information propagation layers, the data augmentation probability, and the contrastive learning intensity parameter in the loss function. FIG. 2 shows the effect of the service recommendation model under various parameter changes; (a) of FIG. 2 shows the change trend of recommendation index results along with changes of the embedding representation dimension. Too small an embedding dimension may prevent the vectors from accurately representing the feature information of each service and may reduce the representation learning capability of the model, so the effect is poor. Along with the increase in the embedding dimension, the model training result is gradually improved; at a value of 64, the model balances representation and generalization capabilities and achieves excellent results.
- (b) of FIG. 2 shows the trend of recommendation index results along with changes of the number of graph propagation layers. It may be seen that along with the increase in the number of layers, the model effect is improved and reaches a peak at 3 layers. With further increases in the number of layers, the effect changes tend to be gentle but are not significantly reduced, which indicates that the mutual attention mechanism designed by the present invention can effectively fuse the output of the multi-layer structure and learn useful information in the multi-layer invocation relations.
- (c) of FIG. 2 shows a change trend of results of recommendation indexes along with changes of probability parameters γ and ε in a data augmentation method. Along with increase in values of the probability parameters, the model effect is gradually improved. When the value is around 0.6, as the value increases, the indicator results of model training remain basically stable. (d) of FIG. 2 shows a change trend of results of recommendation indexes along with changes of intensity of contrastive learning. The larger θ is, the greater contrastive intensity is. Before θ reaches a certain threshold, the effect of the service recommendation model is improved.
- 5) An ablation experiment related to the structure of the service recommendation model is designed. The function of each part of structure of the service recommendation model provided in the present invention is evaluated by providing three variations of the service recommendation model provided in the present invention. Service recommendation-1 is a model having no CAFG structure, has no additional information of the APIs, but retains feature vectors corresponding to service function description of the APIs. Service recommendation-2 is a model having no graph information propagation mechanism, and is a 0-layer structure without mutual learning of a service invocation structure. Service recommendation-3 is a model having no gate connection mechanism, which directly uses API representation output of the SIMG. FIG. 3 shows results of variant models on different experimental indexes.
The service recommendation-1 variant, which loses the additional inter-API information included in the CAFG, has a lower effect than the complete model, and especially shows a significant decrease in the last two indexes related to recommendation order. Observing the training results of service recommendation-2, its performance in various indexes decreases significantly after the information propagation process based on the graph neural network is eliminated. According to the experimental results of service recommendation-3, the results obtained without the gate mechanism are worse than those obtained by connecting the two feature representations using the gate mechanism. The results indicate that the structural design of the service recommendation model is effective and improves the effect of service recommendation.
- 6) Model training results that use the mutual attention mechanism for multi-layer output integration are compared with results that do not, so as to determine the function of the mutual attention mechanism. FIG. 4 shows the changes of the model training results on various indexes caused by the two multi-layer output strategies when the number of graph propagation layers is changed. Noa indicates removal of the mutual attention mechanism and direct use of a multi-layer connection method.
From the change trend of the recommendation index results, it may be seen that when the total number of layers is low, the simple strategy of connecting multi-layer results for output can also improve the model effect to some extent by utilizing the additional information output from the multi-layer structure. Along with the increase in the number of structural layers, however, the model effect may be reduced due to the incapability to select among the multi-layer output results. The mutual attention mechanism can better generalize the output results of the multiple layers and fuse the multi-layer invocation relations between services into the feature representations. As shown in the figure, the model can still maintain an excellent recommendation effect in multi-layer situations, and the various recommendation result indexes are not significantly reduced. Use of the mutual attention mechanism can better fuse the multi-layer output results, overcome interference caused by missing invocation information, and improve the effect of the recommendation results.
The content in the examples of the description is only a listing of the implementation forms of the invention concept and is only for illustrative purposes. The scope of protection of the present invention should not be considered as limitations to the specific form stated in the examples, and the scope of protection of the present invention also extends to equivalent technical means conceivable by those of ordinary skill in the art according to the concept of the present invention.