The present invention belongs to the technical field of Federated Graph Learning, and in particular relates to edge-client collaborative federated graph learning with adaptive neighbor generation.
With powerful expressive capabilities, graphs have been widely used to depict real-world application scenarios such as social networks and knowledge graphs. In the area of graph learning, the emerging Graph Neural Networks (GNNs) have gained significant attention due to their exceptional performance in dealing with graph-related tasks. GNNs efficiently utilize feature propagation by employing multiple graph convolutional layers for node classification tasks, where structural knowledge is distilled into discriminative representations from complex graph-oriented data in diverse domains such as prediction modeling, malware detection, and resource allocation. Commonly, the training performance of GNNs depends on substantial graph data distributed among clients. However, due to privacy and overhead concerns, it is impractical to assemble graph data from all clients for GNN training.
Following a distributed training mode, Federated Graph Learning (FGL) aims to deal with the problem of graph data islands by promoting cooperative training among multiple clients. To protect privacy, FGL offers generalized graph mining models over distributed subgraphs without sharing raw data. Many studies have verified the feasibility of FGL in various domains such as transportation, computer vision, and edge intelligence. Recently, some studies also adopted FGL-based frameworks for semi-supervised classification tasks. These approaches typically join an edge server with multiple clients to train a globally-shared classifier for downstream tasks, where the clients and the edge server undertake local updating and global aggregation, respectively.
In real-world FGL application scenarios, there are potential links between the subgraphs of a client and others, since these subgraphs contain significant information about neighbor clients. However, previous FGL-related studies overlooked such important links among clients, as shown in the accompanying drawings.
The purpose of the present invention is to provide an edge-client collaborative federated graph learning method with adaptive neighbor generation: consider the typical FGL scenario with distributed graph datasets; based on this setting, first propose an improved centralized FGL framework, named FedGL; next, extend the FedGL to the scenario of multi-edge collaboration and propose a novel distributed FGL framework, named SpreadFGL.
Consider an edge server to communicate with M clients; the FedGL leverages the edge server S_j as an intermediary to facilitate the information flow among clients, where S_j covers all clients, denoted by M_j = M; incorporate a graph imputation generator to construct learnable links, thereby generating the latent links between subgraphs; employ an L-layer GNN model with the local node classifier F_i^j.
For every edge-client communication in FedGL, each client trains its local node classifier F_i^j, parameterized by W^{(j,i)}, in parallel over the local training rounds.
After local training, S_j aggregates the local parameters {W^{(j,i)} | i ∈ [M_j]} to update the global parameters W_j, and then broadcasts W_j to all clients at each edge-client communication.
The clients upload the processed embeddings {H^{(j,i)} | i ∈ [M_j]} to the edge server at every interval of edge-client communication, where the originally linked nodes remain proximate in the low-dimensional space; next, the graph imputation generator performs fusion on the processed embeddings to obtain the globally-shared information H^j ∈ ℝ^{|V^j|×c}, where |V^j| is the total number of nodes across all clients covered by S_j.
The graph imputation generator utilizes the distance to evaluate the node similarity and construct the global topology graph, referred to as Ā^j = H^j(H^j)ᵀ.
The assessor adopts a fully-connected neural network to evaluate the quality of the reconstructed embeddings.
The training processes of the autoencoder and assessor are performed simultaneously, where the assessor guides the autoencoder to learn more discriminative reconstructed data and potential features through back-propagation.
Based on the proposed versatile assessor, we first set a threshold θ ∈ (0, 1) in every training iteration of the autoencoder and select the attributes in h_u^j that are less than θ; these attributes are deemed negative and their feedback from the assessor is 0; next, zero-regularization is used to process these negatives, and thus both the autoencoder and the assessor can spotlight the representations that are meaningful for downstream tasks; hence, the loss function of the assessor is updated and redefined accordingly.
The edge server S_j divides the learnable potential graph G̃^j into subgraphs, denoted by the set {G̃_i^j = (Ṽ_i^j, Ẽ_i^j) | i ∈ [M_j]}, where Ṽ_i^j is the node set of the i-th client and Ẽ_i^j = P_i^j(G̃_i^j) is the edge set fixed by the graphic patcher; by collaborating with the edge server, clients are expected to acquire diverse neighbor features from the globally-shared information, thereby fixing cross-subgraph missing links.
Propose a novel distributed FGL framework, named SpreadFGL, that extends the FedGL to a multi-edge environment; the SpreadFGL is able to facilitate more efficient FGL training and better load balancing in a multi-edge collaborative environment; consider that there are N edge servers, and an edge server S_j is equipped with a global node classifier F_j parameterized by W_j; besides, a client only communicates with its closest edge server; there exist neighbor relationships among the servers, denoted by the matrix A ∈ {0, 1}^{N×N}; if S_i and S_j are neighbors, a_ij = 1; otherwise, a_ij = 0; moreover, parameter transmission is permitted between neighbor servers.
In SpreadFGL, the clients adopt L-layer GNNs; the edge servers exchange information with the covered clients in each edge-client communication; at every K intervals of edge-client communication, the clients and their nearest edge servers collaboratively utilize the shared information to extract the potential links based on the proposed graph imputation generator and negative sampling mechanism.
To better explore the potential cross-subgraph links by using the information from other servers, adopt the topology structure at the edge layer to facilitate the parameter transmission between neighbor servers; this enables the information flow among clients via gradient propagation at every interval of edge-client communication; specifically, S_j first aggregates the model parameters of its neighbor servers; next, S_j averages the parameters and broadcasts them to the covered clients.
Compared with the prior art, the present invention has the following beneficial effects:
The technical solution of the present invention is described in detail in combination with the accompanying drawings.
Proposed in the present invention is an edge-client collaborative federated graph learning method with adaptive neighbor generation. The framework is shown in the accompanying drawings.
The method specifically comprises the following design process:
To address these essential challenges, we propose FedGL, an improved centralized FGL framework, to explore potential cross-subgraph links by leveraging the global information flow, as illustrated in the accompanying drawings.
Graph Neural Networks [18] have drawn considerable attention in recent years due to their remarkable capabilities. As an emerging technique in semi-supervised learning, GNNs propagate and aggregate node features along the graph topology to learn discriminative node representations.
The GAT incorporates GCNs with attention mechanisms to adaptively assign the weights α_{uv}^{(l+1)} for the neighbors of the node u, and the inference vector is defined as follows.
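The referenced equation is not reproduced legibly in this text; a plausible reconstruction of the standard GAT layer, consistent with the notation above (inclusion of the self-loop term is an assumption), is:

```latex
h_u^{(l+1)} = \sigma\Big(\sum_{v \in \mathcal{N}(u) \cup \{u\}} \alpha_{uv}^{(l+1)}\, W^{(l+1)}\, h_v^{(l)}\Big)
```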
The GraphSAGE aggregates node features by sampling from neighbor nodes, and the inference vector is defined as follows.
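Likewise, a plausible reconstruction of the standard GraphSAGE update, where AGG denotes the chosen aggregator (e.g., mean) and CONCAT denotes concatenation, is:

```latex
h_{\mathcal{N}(u)}^{(l+1)} = \mathrm{AGG}\big(\{ h_v^{(l)},\ \forall v \in \mathcal{N}(u) \}\big), \qquad
h_u^{(l+1)} = \sigma\big(W^{(l+1)} \cdot \mathrm{CONCAT}(h_u^{(l)},\, h_{\mathcal{N}(u)}^{(l+1)})\big)
```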
There is an urgent need to study the restoration of missing cross-subgraph links to better handle semi-supervised node classification.
Federated Graph Learning (FGL) has emerged as a captivating topic in recent years. Different from the classic GNN that relies on centralized feature propagation across the entire graph, FGL enables distributed clients to collectively maintain a globally-shared model through gradient aggregation. Many efforts have contributed to this topic. For instance, He et al. proposed a graph-level scheme that distributed graph datasets across multiple clients, catering to various downstream tasks. Wu et al. designed an FGL framework for recommendation tasks, where subgraphs contain overlapped items. Xie et al. developed an FGL-based framework to mitigate the heterogeneity among features and graphs. They employed clustering techniques to aggregate clients based on the GNN gradients, aiming to enhance the collaboration efficiency of federated learning.
However, the above studies overlooked the pervasive missing links between clients that occur in real-world scenarios, which may cause undesired performance in downstream tasks.
To the best of our knowledge, few studies have adequately considered and tackled the problem of missing cross-subgraph links. Zhang et al. utilized a local linear predictor to explore the potential relationships between clients according to the local subgraph structure. However, the cross-subgraph relationships rely on important information from neighbor clients, which makes it hard to find the potential links using only local subgraphs, thereby leading to inefficient recovery of cross-client information. Moreover, prior studies commonly adopted the classic FedAvg for training, ignoring the overload of a single node (e.g., edge server), especially when the number of clients expands.
In this section, we consider the typical FGL scenario with distributed graph datasets. Based on this setting, we first propose an improved centralized FGL framework, named FedGL. Next, we extend the FedGL to the scenario of multi-edge collaboration and propose a novel distributed FGL framework, named SpreadFGL.
A graph dataset is denoted as D = (G, Y), where G = (V, ε, X) is a global graph. V is the node set with |V| = n. ε = {e_uv} is the edge set that stores the link relationship between nodes u and v, where ∀u, v ∈ V. X ∈ ℝ^{n×d} indicates the node feature matrix, where x_i ∈ ℝ^d is the feature vector of the i-th node. Y ∈ {0, 1}^{n×c} is the label matrix, where c is the number of classes. Consider that there are N edge servers and M clients. The edge server S_j covers M_j local clients {C_i^j | i ∈ [M_j]} to conduct the FGL training, where Σ_{j=1}^{N} M_j = M. The client C_i^j owns part of the samples of the graph dataset, denoted by D_i^j = {G_i^j, Y_i^j}, where G_i^j = (V_i^j, ε_i^j, X_i^j) is a local subgraph and Y_i^j is the sub-label matrix of the nodes V_i^j. To simulate the real-world scenario of missing links between clients, we consider that there are no shared nodes and no connected links among clients, formulated by V_i^j ∩ V_r^ĵ = ∅, where ∀i, r ∈ [M_j] and i ≠ r if j = ĵ, and ∀i ∈ [M_j], ∀r ∈ [M_ĵ] if j ≠ ĵ. The subgraphs of all clients form the complete graph, defined by Σ_{j=1}^{N} Σ_{i=1}^{M_j} |V_i^j| = n. Thus, there is no link between any two clients, and a client cannot directly retrieve the node features from another client. For clarity, Table I lists the main notations used in this application.
Based on the above scenario, the client C_i^j owns a local node classifier F_i^j and a graphic patcher P_i^j, and all clients can jointly learn graph representations for semi-supervised node classification. Generally, the proposed SpreadFGL aims to conduct collaborative learning on independent subgraphs across all clients, prioritizing the privacy of raw data. Therefore, the SpreadFGL obtains the global node classifiers {F_j | j ∈ [N]} parameterized by {W_j | j ∈ [N]} in the edge servers for downstream tasks. With this consideration, we formulate the optimization problem as minimizing the aggregated risks to find the optimal weights {W_j | j ∈ [N]}.
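A plausible reconstruction of this objective, consistent with the definitions that follow (the node-level loss ℓ, e.g., cross-entropy, and the prediction symbol ŷ_v^{ji} are assumptions), is:

```latex
\min_{\{W_j \mid j \in [N]\}} \; \sum_{j=1}^{N} \sum_{i=1}^{M_j} \sum_{v \in \mathcal{T}_i^{j}} \ell\big(\hat{y}_v^{\,ji},\, y_v^{\,ji}\big)
```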
where T_i^j ⊆ V_i^j is the labeled training set in the i-th client, and y_v^{ji} is the ground truth of node v in the i-th client.
Since clients cannot directly capture cross-subgraph links that contain important neighbor information, the feature propagation from higher-order neighbors becomes inadequate, resulting in degraded classification performance. Therefore, it is crucial to explore the potential topology links among clients. To achieve this goal, we propose an improved centralized FGL framework, named FedGL. In FedGL, we consider an edge server to communicate with M clients. The FedGL leverages the edge server S_j as an intermediary to facilitate the information flow among clients, where S_j covers all clients, denoted by M_j = M. Specifically, we incorporate a graph imputation generator to construct learnable links, thereby generating the latent links between subgraphs. To enhance feature propagation in local tasks and facilitate subsequent inference with the global model, we employ an L-layer GNN model with the local node classifier F_i^j.
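Eq. (3) is referenced throughout but is not reproduced legibly in this text; a plausible reconstruction of the L-layer propagation, where Â^{ji} denotes a normalized adjacency matrix of the local subgraph and σ a nonlinearity (both assumed symbols), is:

```latex
H^{(j,i,\,l+1)} = \sigma\big(\hat{A}^{ji}\, H^{(j,i,\,l)}\, W^{(j,i,\,l+1)}\big), \qquad
\hat{Y}_u^{\,ji} = \mathrm{softmax}\big(h_u^{(j,i,\,L)}\big)
```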
where Ŷ_u^{ji} is the inference vector of the node u produced by local training.
For every edge-client communication in FedGL, each client trains its local node classifier F_i^j, parameterized by W^{(j,i)}, in parallel over the local training rounds.
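As a sketch, a standard local gradient step consistent with this description (η, the learning rate, and L_i^j, the local loss, are assumed symbols) would be:

```latex
W^{(j,i)} \leftarrow W^{(j,i)} - \eta\, \nabla_{W^{(j,i)}}\, \mathcal{L}_i^{j}\big(W^{(j,i)};\, \mathcal{D}_i^{j}\big)
```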
After local training, S_j aggregates the local parameters {W^{(j,i)} | i ∈ [M_j]} to update the global parameters W_j, and then broadcasts W_j to all clients at each edge-client communication.
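A minimal sketch of this aggregate-and-broadcast step, assuming FedAvg-style weighting by local sample counts (the exact weighting used in this application is not specified here):

```python
def aggregate_and_broadcast(local_params, sample_counts):
    """Edge server S_j: weighted average of client parameter dicts,
    then every client is reset to the new global parameters W_j."""
    total = sum(sample_counts)
    keys = local_params[0].keys()
    W_global = {
        k: sum(p[k] * (n / total) for p, n in zip(local_params, sample_counts))
        for k in keys
    }
    return W_global  # broadcast W_global back to all M_j clients

# usage: two clients with 10 and 30 labeled samples
clients = [{"w": 1.0}, {"w": 3.0}]
print(aggregate_and_broadcast(clients, [10, 30]))  # {'w': 2.5}
```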
C. Graph Imputation Generator with Versatile Assessor
To capture the potential cross-subgraph links, we design a graph imputation generator and incorporate it with a versatile assessor to explore a learnable potential graph G̃^j = (Ṽ^j, Ẽ^j).
Graph Imputation Generator. To construct the globally-shared information without revealing raw data, the clients upload the processed embeddings {H^{(j,i)} | i ∈ [M_j]} to the edge server at every interval of edge-client communication, where the originally linked nodes remain proximate in the low-dimensional space. Next, the graph imputation generator performs fusion on the processed embeddings to obtain the globally-shared information H^j ∈ ℝ^{|V^j|×c}, where |V^j| is the total number of nodes across all clients covered by S_j. Based on this, H^j is obtained by fusing the embeddings of all covered clients.
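A plausible form of H^j, consistent with the stated dimension ℝ^{|V^j|×c}, is the row-wise concatenation (stacking) of the uploaded client embeddings; the concatenation order is an assumption:

```latex
H^{j} = \big[\, H^{(j,1)};\; H^{(j,2)};\; \dots;\; H^{(j,M_j)} \,\big] \in \mathbb{R}^{|\mathcal{V}^{j}| \times c}
```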
In real-world application scenarios of FGL, it is possible for each node in clients to own potential cross-subgraph links, and it may be insufficient for clients to propagate features among multi-hop neighbors if these cross-subgraph links are missing. In response to this problem, the graph imputation generator utilizes the distance to evaluate the node similarity and construct the global topology graph, referred to as Ā^j = H^j(H^j)ᵀ.
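As a minimal sketch of this topology construction: the fused embeddings yield an inner-product similarity matrix, which is then sparsified by keeping each node's k most similar peers. The k-nearest-neighbor sparsification is an assumption motivated by the hyperparameter k ∈ [3, 20] reported in the experiments; the exact rule in the original may differ.

```python
import numpy as np

def build_global_topology(H, k=5):
    """Inner-product similarity A_bar = H @ H.T, then keep the k most
    similar peers of every node as candidate cross-subgraph links."""
    A_bar = H @ H.T
    np.fill_diagonal(A_bar, -np.inf)          # exclude self-links
    A = np.zeros_like(A_bar)
    topk = np.argsort(-A_bar, axis=1)[:, :k]  # k largest entries per row
    rows = np.arange(H.shape[0])[:, None]
    A[rows, topk] = 1.0
    return np.maximum(A, A.T)                 # symmetrize the topology

H = np.random.randn(10, 4)                    # fused embeddings H^j
A_global = build_global_topology(H, k=3)
```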
where W_a^{(j,l+1)} ∈ ℝ^{d_l×d_{l+1}} denotes the learnable weights of the (l+1)-th layer of the autoencoder.
Versatile Assessor. Since the conditional distribution of the reconstructed features is difficult to evaluate directly, we design a versatile assessor that scores the quality of the reconstructed embeddings.
The training processes of the autoencoder and assessor are performed simultaneously, where the assessor guides the autoencoder to learn more discriminative reconstructed data and potential features through back-propagation.
Negative Sampling. To extract more refined potential features, we develop a negative sampling mechanism to concentrate on the pertinent information for node classification. Based on the proposed versatile assessor, we first set a threshold θ ∈ (0, 1) in every training iteration of the autoencoder and select the attributes in h_u^j that are less than θ. These attributes are deemed negative and their feedback from the assessor is 0. Next, zero-regularization is used to process these negatives, and thus both the autoencoder and the assessor can spotlight the representations that are meaningful for downstream tasks. Hence, the loss function of the assessor is updated and redefined as
where e_u is a c-dimensional vector that judges whether h_{ui}^j ∈ h_u^j is higher than θ (e_{ui} = 1) or not (e_{ui} = 0), and ⊙ denotes element-wise multiplication.
where h_u^j is defined as above and 1 is an indicator vector with all values equal to 1.
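A minimal sketch of the thresholding step described above, assuming h_u^j is a c-dimensional (e.g., softmax-normalized) attribute vector per node; the array layout and function name are illustrative:

```python
import numpy as np

def negative_sampling_mask(h, theta):
    """e_u[i] = 1 if attribute i of h_u exceeds theta, else 0.
    Sub-threshold attributes are deemed negative: they are zeroed out
    (zero-regularization), and their assessor feedback target is 0."""
    e = (h > theta).astype(h.dtype)   # indicator vector e_u per node
    h_masked = h * e                  # element-wise product  h ⊙ e
    return h_masked, e

c = 6
h = np.random.dirichlet(np.ones(c), size=4)      # 4 nodes, softmax-like rows
h_masked, e = negative_sampling_mask(h, theta=1.0 / c)  # theta = 1/c as in the experiments
```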
Through the above operations, the learnable potential graph G̃^j = (Ṽ^j, Ẽ^j) is constructed.
Graph Fixing. The edge server S_j divides G̃^j into subgraphs, denoted by the set {G̃_i^j = (Ṽ_i^j, Ẽ_i^j) | i ∈ [M_j]}, where Ṽ_i^j is the node set of the i-th client and Ẽ_i^j = P_i^j(G̃_i^j) is the edge set fixed by the graphic patcher. This process simulates the missing links, thereby promoting the feature propagation of local tasks in Eq. (3). By collaborating with the edge server, clients are expected to acquire diverse neighbor features from the globally-shared information, thereby fixing cross-subgraph missing links. Moreover, these cross-subgraph links contribute to training a global node classifier F_j, aligning with the overall optimization objective in Eq. (4).
In real-world application scenarios, a single edge server may encounter the problem of excessive costs and degraded performance as the number of clients expands, particularly when clients are geographically dispersed. To address this problem, we propose a novel distributed FGL framework, named SpreadFGL, that extends the FedGL to a multi-edge environment. The SpreadFGL is able to facilitate more efficient FGL training and better load balancing in a multi-edge collaborative environment. We consider that there are N edge servers, and an edge server S_j is equipped with a global node classifier F_j parameterized by W_j. Besides, a client only communicates with its closest edge server. There exist neighbor relationships among the servers, denoted by the matrix A ∈ {0, 1}^{N×N}. If S_i and S_j are neighbors, a_ij = 1; otherwise, a_ij = 0. Moreover, parameter transmission is permitted between neighbor servers.
In SpreadFGL, the clients adopt L-layer GNNs and conduct feature propagation via Eq. (3) during local training. The edge servers exchange information with the covered clients in each edge-client communication. At every K intervals of edge-client communication, the clients and their nearest edge servers collaboratively utilize the shared information to extract the potential links based on the proposed graph imputation generator and negative sampling mechanism.
However, the potential cross-subgraph links strictly depend on the information provided by all clients. This not only violates the core idea of the SpreadFGL but is also impractical if the information must be transmitted from clients under the coverage of other servers. In light of these concerns, we design a weight regularizer during local training. Based on trace normalization, the regularizer is used to enhance the network learning capability of the local node classifiers. Specifically, the loss function of the i-th client under the coverage of S_j augments the task loss with a trace-based penalty term.
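A minimal sketch of such a trace-normalization regularizer added to the local task loss; the squared-Frobenius form tr(WᵀW) and the coefficient lam are assumptions, since the exact loss is not reproduced here:

```python
import numpy as np

def local_loss_with_regularizer(task_loss, weights, lam=1e-3):
    """Local loss of client i under S_j: task term plus a trace-based
    penalty tr(W^T W) (equal to the squared Frobenius norm) over all
    weight matrices, constraining the local classifier without requiring
    information from clients covered by other edge servers."""
    reg = sum(np.trace(W.T @ W) for W in weights)
    return task_loss + lam * reg

# usage: two weight matrices of a 2-layer local classifier
weights = [np.random.randn(4, 8), np.random.randn(8, 3)]
loss = local_loss_with_regularizer(0.42, weights)
```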
To better explore the potential cross-subgraph links by using the information from other servers, we adopt the topology structure at the edge layer to facilitate the parameter transmission between neighbor servers. This enables the information flow among clients via gradient propagation at every interval of edge-client communication. Specifically, S_j first aggregates the model parameters of its neighbor servers. Next, S_j averages the parameters and broadcasts them to the covered clients. This process can be described as follows.
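A minimal sketch of this edge-layer exchange, assuming uniform averaging over S_j's own parameters and those of its neighbors given the adjacency matrix A (the exact weighting is an assumption):

```python
import numpy as np

def edge_layer_average(params, A, j):
    """S_j aggregates the parameter vectors of its neighbor servers
    (a_ij = 1) together with its own, averages them, and the result
    is broadcast to the clients covered by S_j."""
    neighbors = [i for i in range(len(params)) if A[j, i] == 1]
    stacked = np.stack([params[j]] + [params[i] for i in neighbors])
    return stacked.mean(axis=0)

A = np.array([[0, 1, 1],   # ring topology over N = 3 edge servers,
              [1, 0, 1],   # matching the experimental setup
              [1, 1, 0]])
params = [np.random.randn(8) for _ in range(3)]
W_j_new = edge_layer_average(params, A, j=0)
```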
The procedure of the proposed SpreadFGL is elaborated in Algorithm 1, whose core components have been described in detail before.
We conduct ablation experiments to further verify the superiority of the core components designed in the proposed frameworks.
Real-world Testbed. The experiments are conducted on a real-world testbed, as shown in the accompanying drawings.
Datasets. The following four benchmark graph datasets are used in our experiments, as shown in Table II, where c is the number of classes.
Comparison Algorithms. We compare our proposed FedGL and SpreadFGL with the following state-of-the-art algorithms.
It is worth noting that there are few studies handling the FGL scenario with completely missing cross-subgraph links between clients. FedSage+ is deemed the state-of-the-art algorithm for studying missing cross-subgraph links in the FGL field. However, it still suffers from performance bottlenecks, and the problem has not been well solved in real-world scenarios.
Parameter Settings. For the proposed SpreadFGL and FedGL, we adopt the GraphSAGE with two layers and use the GCN aggregator as local node classifiers. The autoencoder employs 4 fully-connected layers, where the numbers of neurons in the encoder and decoder are {c, 16, d} and {d, 16, c}, respectively. In the autoencoder, the Softmax is used as the activation function in the last layer. The assessor adopts a fully-connected neural network, where the numbers of neurons are {c, 128, 16, 1}. In the assessor, the Sigmoid is used as the activation function in the last layer while the ReLU is used in the remaining layers. The training iterations of the autoencoder and assessor are T_ae = 5 and T_as = 3, respectively, and the Adam optimizer is used to update their parameters with a learning rate of 0.001. The threshold θ is set to 1/c and k ranges in [3, 20]. Moreover, we select [20%, 60%] of samples as the training set and randomly choose 20% as the testing set. The Louvain algorithm is used to measure the subgraph similarity for clients. The FedGL uses one edge server and the SpreadFGL adopts three edge servers for collaborative training with a ring topology structure, where the number of clients ranges in [6, 15]. The Adam optimizer is used to update the parameters of local classifiers with a learning rate of lr = 0.01. Besides, we use the well-known accuracy (ACC) and macro F1-score (F1) as performance metrics.
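For reference, these settings can be gathered into a single configuration sketch; the key names below are illustrative assumptions, while the values are those stated above:

```python
config = {
    "gnn": {"model": "GraphSAGE", "layers": 2, "aggregator": "GCN"},
    "autoencoder": {"encoder_units": ["c", 16, "d"],
                    "decoder_units": ["d", 16, "c"],
                    "last_activation": "softmax", "iters_T_ae": 5},
    "assessor": {"units": ["c", 128, 16, 1],
                 "last_activation": "sigmoid", "hidden_activation": "relu",
                 "iters_T_as": 3},
    "optimizer": {"name": "Adam", "lr_generator": 0.001, "lr_classifier": 0.01},
    "theta": "1/c", "k_range": [3, 20],
    "train_ratio": [0.2, 0.6], "test_ratio": 0.2,
    "edge_servers": {"FedGL": 1, "SpreadFGL": 3, "topology": "ring"},
    "clients_range": [6, 15],
}
```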
Node Classification Accuracy. As shown in Table III, the proposed SpreadFGL and FedGL can both achieve higher classification accuracy than other state-of-the-art algorithms under different datasets, indicating the superiority of the proposed frameworks for node classification tasks. Specifically, the significant performance gap between the LocalFGL and SpreadFGL verifies the advantages of using the proposed edge-client collaboration mechanism. The FedGL and SpreadFGL outperform the FedSage+ by around 12.78% and 14.71% in terms of ACC and F1, respectively. This demonstrates that the FedGL and SpreadFGL gain more generalized potential cross-subgraph links through the global information flow, further validating the effectiveness of the proposed graph imputation generator. Moreover, compared to the FedGL, the SpreadFGL achieves better performance on most of the datasets under various scenarios with different numbers of clients. This indicates that the information flow between clients and edge servers utilized in the SpreadFGL effectively promotes the repair of missing links among clients even though the scenario becomes complex with more clients.
(Table III: node classification results on different datasets with M = 6, 9, 12, 15.)
Performance with Different Labeled Ratios.
Parameter Sensitivity. We analyze the parameter sensitivity of the proposed SpreadFGL on different datasets with respect to the hyperparameters K and T_l. As shown in the accompanying drawings, with a smaller K, the graph imputation generator can better repair the missing links in subgraphs to promote feature propagation in local models within fewer edge-client communications, thereby improving the training of the global node classifiers. In this regard, the suggested values of K range from 1 to 10.
Ablation Study. The results of the ablation study are shown in the accompanying drawings.
Convergence Validation.
In this application, we propose a novel FGL-based framework named FedGL and its extended framework SpreadFGL, addressing the challenges of generating cross-subgraph links and single-node overloading. First, we design the FedGL to repair the missing links between clients, where a new graph imputation generator is developed that incorporates a versatile assessor and negative sampling mechanism to explore a refined global information flow, extracting unbiased latent links and thus improving the training effect. Next, to alleviate the overloading issue at the edge layer, we extend the FedGL and propose the SpreadFGL with multi-edge collaboration to enhance the global information exchange. Extensive experiments are conducted on a real-world testbed and benchmark graph datasets to verify the superiority of the proposed FedGL and SpreadFGL. The results show that the FedGL and SpreadFGL outperform state-of-the-art algorithms in terms of model accuracy. Further, through ablation experiments and convergence analysis, we validate the effectiveness of the core components designed in the proposed frameworks and the advantage of the SpreadFGL in achieving faster convergence speed.
This application is the continuation application of International Application No. PCT/CN2023/132495, filed on Nov. 20, 2023, the entire contents of which are incorporated herein by reference.
Parent application: PCT/CN2023/132495, filed November 2023 (WO). Child application: U.S. Ser. No. 18/399,696.