The present disclosure relates generally to a first node and methods performed thereby, for handling predictive models. The present disclosure also relates generally to a second node and methods performed thereby, for handling predictive models. The present disclosure further relates generally to a third node and methods performed thereby, for handling predictive models.
Computer systems in a communications network may comprise one or more network nodes, which may also be referred to simply as nodes. A node may comprise one or more processors which, together with computer program code, may perform different functions and actions, a memory, a receiving port and a sending port. A node may be, for example, a server. Nodes may perform their functions entirely in the cloud.
The performance of a communications network may be measured by the analysis of data indicating its performance, such as, for example, Key Performance Indicators (KPIs).
Federated Learning (FL) [1] has recently emerged as a paradigm for distributed model training without the need to share training data. Various Artificial Intelligence (AI)-enabled telecommunication use-cases may benefit from FL. An example is Managed Services for Networks (MSN), where use-cases may involve computing Key Performance Indicators (KPIs) from Performance Management (PM) counter data, training Machine Learning (ML) model(s) to predict a target Key Performance Indicator (KPI) using feature KPIs, and using the trained model(s) to predict the target KPI from online stream(s) of counter data. Based on the prediction(s) during configured Reporting Operating Periods (ROPs), an operator may execute actuation(s) to restore the target KPI, whose degradation may otherwise reduce network performance and affect end-users.
Existing methods may train Machine Learning (ML) models for MSN use-cases for specific operators, technology, e.g., Third Generation (3G)/Fourth Generation (4G)/Fifth Generation (5G), geography or frequency band(s). However, distributed model training may not be performed, as KPI and PM data may be understood to be private and may often not be shared. FL may emerge as a viable approach to improving performance since many models may be understood to predict similar KPIs, and their aggregation using FL may update models with degraded performance, without the need to share data.
It may further be efficient to leverage FL at a cell, or location area, level, since actuation(s) to redress KPI degradations may be understood to be executed at cells. In this document, the term “local model(s)” may be used to refer to ML model(s) trained and deployed at such nodes, that is, client nodes, while the term “global model” may be used to refer to an ML model trained, or deployed, at a server node, whose parameters may be derived from one or more model(s) at the local nodes. Additionally, multiple global model(s) may exist when multiple target KPIs may need to be predicted, and each may be derived from respective local model(s) that may predict the same KPI.
Conventional FL approaches may involve aggregation of local models at a server node by simple Federated Averaging (FedAvg) to realize the global model, but this may have sub-optimal performance or slower convergence. Other strategies [2-3] may be computationally expensive. Some methods for training of predictive models have used drift detection and resolution mechanisms at local nodes [2], in conjunction with approaches such as Principal Components Analysis (PCA) and k-means clustering at local nodes [3] for handling drift in federated learning settings. These have in turn been used to determine the model update policy at the global node [3]. These methods may be computationally expensive to implement and may impact energy efficiency at scale. Causality in ML may be understood to involve understanding model predictions, that is, the effect, as a result of changes in underlying assumptions or data, that is, the cause. This principle may also be applied in an FL setting, in which case it may be referred to as causal FL. Work on causal federated learning [6] may involve minimization of the loss computed from hidden layer representations from the participating client nodes. WO2021121585A1 and WO2021089429A2 are also examples of leveraging FL for distributed model training and Life-Cycle Management (LCM) use-cases in the telecommunications domain, presenting alternative approaches, such as aggregation based on performance metrics or selector neural models. However, these approaches may also result in sub-optimal performance of the models.
As part of the development of embodiments herein, one or more challenges with the existing technology will first be identified and discussed.
The approaches for training of predictive models discussed in the Background section, or other approaches based on federated learning, do not leverage representations of the explaining attributes of well-performing local models in updating the global model to restore, or improve, local models with degraded performance.
In conventional FL, a global model M, with parameters denoted by w, may be updated at iteration (t+1) using the gradient gk computed from the loss function Fk at k client nodes, which may have models with parameters wk trained on nk data samples, denoted as xk, with corresponding target, that is, a label to be predicted by the model from the data, e.g., target KPI for MSN, denoted collectively by yk, as follows:

w^(t+1) ← w^t − η Σk (nk/n) gk, where gk = ∇Fk(w^t; xk, yk)
The model parameters at all the local nodes may then be updated as wk^(t+1) ← w^t − η gk. Here, η may be understood to be a suitable learning rate, which may be a user-defined parameter, n may be understood to be the total number of samples at all local nodes, and ∇ may be understood to denote the gradient operator. The above equations may constitute an update policy.
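As a non-limiting illustration, this conventional update policy may be sketched in Python as follows; the function and variable names are hypothetical, and the local gradients gk are assumed to have already been computed at the client nodes:

```python
import numpy as np

def fedavg_update(w_t, client_grads, client_sizes, lr):
    """Conventional FL update: w^(t+1) = w^t - lr * sum_k (n_k / n) * g_k.

    w_t          -- current global parameters, flattened into one array
    client_grads -- list of local gradients g_k, one per client node
    client_sizes -- list of local sample counts n_k, one per client node
    lr           -- learning rate eta
    """
    n = float(sum(client_sizes))  # total number of samples at all local nodes
    aggregate = sum((n_k / n) * g_k
                    for g_k, n_k in zip(client_grads, client_sizes))
    return w_t - lr * aggregate  # updated global parameters w^(t+1)
```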
It may be noted here that the global model may be updated based on the relative number of samples (nk/n) that may have been used to train the respective local model(s). Therefore, if a local model has been trained using many samples, it may dominate the global, and subsequently local, model updates, and eventually the performance. However, the global model update is not determined by the important attributes of the local models that are performing well. Further, multiple iterations may be needed for convergence of local and global model training, that is, parameter updates, which may increase compute, and hence energy, requirements, especially when many local nodes are involved, such as when FL may be implemented for a cell-level MSN use-case.
These problems with existing solutions motivate requirements that may be summarized as follows. For scalable FL, models, global and local, may be understood to need to generalize quickly on data representations. Also, energy efficiency, e.g., in terms of compute cost and number of iterations, may be understood to be needed. Further, a global model update mechanism may be understood to need to capture representative attributes that may be able to determine predictions of well-performing local models. It may be understood that the explaining attributes of the models in an FL setup, e.g., as determined in [5], may change based on the update policies and incoming data streams.
It is an object of embodiments herein to improve the handling of predictive models. In a scenario where FL is employed at cell-level for MSN, an FL strategy may be required that may quickly update local models having degraded performance using parameters of the global model for that KPI with better performance, while also requiring less computation, e.g., faster convergence of training loss function with fewer iterations, to determine such a global model. This may be understood to be also important from an energy-efficiency perspective of such federated model training systems, even for other use cases.
According to a first aspect of embodiments herein, the object is achieved by a computer-implemented method performed by a first node. The method is for handling predictive models. The first node operates in a communications system. The first node updates, using machine learning, a first predictive model of an indicator of performance of the communications system. The updating is based on respective explainability values respectively obtained from a first subset of a plurality of second nodes operating in the communications network. The respective explainability values correspond to a first subset of respective second predictive models of the indicator of performance of the communications system, respectively determined by the first subset of the plurality of second nodes. The models in the first subset of respective second predictive models have a respective performance value above a threshold. The first node then provides an indication of the updated first predictive model to a third node comprised in the plurality of second nodes and excluded from the first subset, or to another node operating in the communications system.
According to a second aspect of embodiments herein, the object is achieved by a computer-implemented method performed by a third node. The method is for handling predictive models. The third node operates in a communications system. The third node receives the indication from the first node operating in the communications system. The indication indicates the updated first predictive model of the indicator of performance of the communications system. The updated first predictive model is based on the respective explainability values respectively obtained from the first subset of the plurality of second nodes operating in the communications network. The respective explainability values correspond to the first subset of the respective second predictive models of the indicator of performance of the communications system, respectively determined by the first subset of the plurality of second nodes. The models in the first subset of respective second predictive models have the respective performance value above the threshold. The respective second predictive model of the indicator of performance of the communications system of the third node has the respective performance value below the threshold. The third node is comprised in the plurality of second nodes but excluded from the first subset of the plurality of second nodes. The third node also replaces the respective second predictive model of the indicator of performance of the communications system of the third node with the updated first predictive model indicated by the received indication.
According to a third aspect of embodiments herein, the object is achieved by a computer-implemented method performed by a second node. The method is for handling predictive models. The second node operates in a communications system. The second node sends, to the first node operating in the communications system, the respective explainability values corresponding to the respective second predictive model of the indicator of performance of the communications system. The respective second predictive model has been determined by the second node. The respective second predictive model has a respective performance value above the threshold.
According to a fourth aspect of embodiments herein, the object is achieved by the first node. The first node is for handling predictive models. The first node is configured to operate in the communications system. The first node is configured to update, using machine learning, the first predictive model of the indicator of performance of the communications system. The updating is configured to be based on the respective explainability values configured to be respectively obtained from the first subset of the plurality of second nodes configured to be operating in the communications network. The respective explainability values are configured to correspond to the first subset of respective second predictive models of the indicator of performance of the communications system, configured to be respectively determined by the first subset of the plurality of second nodes. The models in the first subset of respective second predictive models have the respective performance value above the threshold. The first node is also configured to provide the indication of the first predictive model configured to be updated to the third node configured to be comprised in the plurality of second nodes and excluded from the first subset, or to the another node configured to operate in the communications system.
According to a fifth aspect of embodiments herein, the object is achieved by the third node. The third node is for handling predictive models. The third node is configured to operate in the communications system. The third node is configured to receive the indication from the first node configured to operate in the communications system. The indication is configured to indicate the updated first predictive model of the indicator of performance of the communications system. The updated first predictive model is configured to be based on the respective explainability values configured to be respectively obtained from the first subset of the plurality of second nodes configured to operate in the communications network. The respective explainability values are configured to correspond to the first subset of respective second predictive models of the indicator of performance of the communications system, configured to be respectively determined by the first subset of the plurality of second nodes. The models in the first subset of respective second predictive models are configured to have the respective performance value above the threshold. The respective second predictive model of the indicator of performance of the communications system of the third node is configured to have the respective performance value below the threshold. The third node is configured to be comprised in the plurality of second nodes but excluded from the first subset of the plurality of second nodes. The third node is also configured to replace the respective second predictive model of the indicator of performance of the communications system of the third node with the updated first predictive model configured to be indicated by the indication configured to be received.
According to a sixth aspect of embodiments herein, the object is achieved by the second node. The second node is for handling predictive models. The second node is configured to operate in the communications system. The second node is further configured to send, to the first node configured to operate in the communications system, the respective explainability values configured to correspond to the respective second predictive model of the indicator of performance of the communications system. The respective second predictive model is configured to have been determined by the second node. The respective second predictive model is configured to have the respective performance value above the threshold.
By the first node updating the first predictive model based on the respective explainability values of the first subset of respective second predictive models having the respective performance value above the threshold, the first node may enable to update the parameters of the first predictive model with loss computed on explainability values, e.g., SHAP values, and corresponding model predictions, that is, with explaining KPIs, of multiple local models having optimal performance. The first node may thereby ensure that the first predictive model may learn from the explaining features of the respective second predictive models and, consequently, may enable to obtain an improvement in the performance of the first predictive model, with fewer iterations of the updates of the parameters of the first predictive model. This may be understood to be by updating the first predictive model excluding the explainability of local models having degraded performance, such as that of the third node.
The loss function may also be enabled to rapidly converge in comparison with existing methods, which may be understood to alleviate the need to run further iterations. These benefits may be understood to in turn account for energy optimization, as computation time and cost may be lower, while performance may be improved.
By providing the indication to the third node, the first node may enable the third node to replace its degraded respective second model with the updated global model. This may enable to address the degradation of the respective second model of the third node, and thereby ensure that the first predictive model is enabled to predict the indicator of performance of the communications system with higher accuracy.
By providing the indication to the another node, the first node may enable the another node to execute the updated global model to predict the indicator of performance of the communications system for any use case, with the highest accuracy.
By receiving the first indication indicating the updated first predictive model, the third node may be enabled to then replace its degraded respective second predictive model of the indicator of performance of the communications system with the updated first predictive model indicated by the received indication.
By replacing the respective second predictive model of the third node with the updated first predictive model, the third node may enable that only the performance degraded local model may be replaced, for improved generalization at the third node, that is, the local node.
The replacement action may be understood to be beneficial locally, when the local model performance may have degraded because of parameter updates due to interim undesirable training data, such as noise. The replacement may be understood to help to restore the “corrupted” model. Restoring local performance may be understood to also be advantageous for overall FL system stability.
By the second node sending the respective explainability values corresponding to the respective second predictive model having the respective performance value above the threshold to the first node, the second node enables the first node to then update the first predictive model with the respective explainability values.
Examples of embodiments herein are described in more detail with reference to the accompanying drawings, and according to the following description.
Certain aspects of the present disclosure and their embodiments may provide solutions to the challenges discussed in the Background and Summary sections. There are, proposed herein, various embodiments which address one or more of the issues disclosed herein.
As a summarized overview, embodiments herein may be understood to relate to explainability driven model federation for scalable managed services for networks. Embodiments herein may leverage explainability of local models to define the loss computed while updating the global model in an FL setup, due to which the FL implementation may require fewer training iterations, converge faster, and perform better.
Given a stream of samples comprising a set of features and a target KPI at multiple local node(s), along with the local model(s) used to compute the target KPI, the relative importance of the feature KPIs in determining the target may be determined by using explainability at each of the local node(s).
In one of the embodiments, this may be realized by examining the aggregated SHapley Additive exPlanations (SHAP) values [5] of the sample(s) [4]. Other explainability algorithms, such as Locally Interpretable Model-agnostic Explanations (LIME) [7] and DeepLIFT [8], may be used in alternative embodiments.
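As a non-limiting sketch, the aggregated SHAP values at a local node may be computed along the following lines, assuming the shap Python package is available; the function and variable names are illustrative:

```python
import numpy as np
import shap  # SHapley Additive exPlanations package, assumed available

def aggregated_shap_values(local_model, X_background, X_samples):
    """Explain a local model's predictions of the target KPI and aggregate
    the per-sample SHAP values into one importance score per feature KPI."""
    explainer = shap.KernelExplainer(local_model.predict, X_background)
    shap_matrix = explainer.shap_values(X_samples)  # (n_samples, n_features)
    return np.abs(shap_matrix).mean(axis=0)  # mean |SHAP| per feature KPI
```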
The local model parameters, model performance and explainability values may be sent to a global node. The global model may initially be chosen as one of the local models having the best performance metric on the target KPI. It may be noted here that raw data, e.g., the feature KPIs, from the local node(s) may be understood not to be shared with the global node. When the performance of a local model degrades, that is, when it may fall below a configured threshold, which may be suitably defined based on the target KPI, the local node may request the global node for an update. The global model parameters may be updated by computing the loss using the explainability values of the local model(s) where performance has not degraded. The local model with the degraded performance may then be replaced by the updated global model to improve performance at that node.
Some of the embodiments contemplated will now be described more fully hereinafter with reference to the accompanying drawings, in which examples are shown. In this section, the embodiments herein will be illustrated in more detail by a number of exemplary embodiments. Other embodiments, however, are contained within the scope of the subject matter disclosed herein. The disclosed subject matter should not be construed as limited to only the embodiments set forth herein; rather, these embodiments are provided by way of example to convey the scope of the subject matter to those skilled in the art. It should be noted that the exemplary embodiments herein are not mutually exclusive. Components from one embodiment may be tacitly assumed to be present in another embodiment and it will be obvious to a person skilled in the art how those components may be used in the other exemplary embodiments.
Note that although terminology from Long Term Evolution (LTE)/5G has been used in this disclosure to exemplify the embodiments herein, this should not be seen as limiting the scope of the embodiments herein to only the aforementioned system. Other wireless systems with similar features may also benefit from exploiting the ideas covered within this disclosure. It may also be noted that the use-case involving MSN may be understood to be exemplary. Embodiments herein may be used for other applications involving FL as well.
In some examples, the telecommunications system may for example be a network such as a 5G system, e.g., 5G Core Network (CN), 5G New Radio (NR), an Internet of Things (IoT) network, an LTE network, e.g. LTE Frequency Division Duplex (FDD), LTE Time Division Duplex (TDD), LTE Half-Duplex Frequency Division Duplex (HD-FDD), LTE operating in an unlicensed band, or a newer system supporting similar functionality. The telecommunications system may also support other technologies, such as, e.g., Wideband Code Division Multiple Access (WCDMA), Universal Terrestrial Radio Access (UTRA) TDD, Global System for Mobile communications (GSM) network, GSM/Enhanced Data Rate for GSM Evolution (EDGE) Radio Access Network (GERAN) network, Ultra-Mobile Broadband (UMB), EDGE network, a network comprising any combination of Radio Access Technologies (RATs) such as e.g. Multi-Standard Radio (MSR) base stations, multi-RAT base stations etc., any 3rd Generation Partnership Project (3GPP) cellular network, Wireless Local Area Network/s (WLAN) or WiFi network/s, Worldwide Interoperability for Microwave Access (WiMax), IEEE 802.15.4-based low-power short-range networks such as IPv6 over Low-Power Wireless Personal Area Networks (6LowPAN), Zigbee, Z-Wave, Bluetooth Low Energy (BLE), or any cellular network or system. The telecommunications system may for example support a Low Power Wide Area Network (LPWAN). LPWAN technologies may comprise Long Range physical layer protocol (LoRa), Haystack, SigFox, LTE-M, and Narrow-Band IoT (NB-IoT).
The communications system 100 comprises a first node 111, which is depicted in
Any of the nodes in the first subset of the plurality of second nodes 112 and the third node 113 may be separate nodes. In some embodiments, any of the first node 111 and the another node 114 may be independent and separated nodes from each other, or from any of the nodes in the first subset of the plurality of second nodes 112 and the third node 113. In other embodiments, any of the first node 111 and the another node 114 may be co-located with, or be the same node as, any of the nodes in the first subset of the plurality of second nodes 112 and the third node 113. All the possible combinations are not depicted in
Any of the first node 111, the plurality of second nodes 112 and the third node 113 may be understood to be a node having a capability to train one or more predictive models using ML. Particularly, the first node 111 may have a capability to train a global predictive model, whereas any of the plurality of second nodes 112 and the third node 113 may have a capability to train a respective local model. The first node 111 may be called a central or server node. The plurality of second nodes 112, e.g., the third node 113, may be called local nodes or client nodes to the central node. These local node(s) may be at cell level, eNodeB level, or location area level, depending on the use-case.
Any of the first node 111, the plurality of second nodes 112, the third node 113 and the another node 114 may be a network node. In particular examples, any of the first node 111 and the another node 114 may be core network nodes. In some examples, the another node 114 may be a device, such as any of the devices 141, 142, 143 described below. Any of the plurality of second nodes 112 and the third node 113 may be, respectively, a radio network node, as depicted in panel b) of
The communications system 100 may cover a geographical area, which in some embodiments may be divided into cell areas, wherein each cell area may be served by a radio network node, although one radio network node may serve one or several cells. In the example of
The communications system 100 may comprise a plurality of devices whereof a first device 141, a second device 142 and a third device 143 are depicted in panel b) of
The first node 111 may communicate with the another node 114 over a first link 151, e.g., a radio link or a wired link. The first node 111 may communicate with each of the nodes in the plurality of second nodes 112 over a respective second link 152, e.g., a radio link or a wired link. In the particular non-limiting example of panel b) in
In general, the usage of “first”, “second”, “third”, “fourth”, “fifth”, “sixth”, “seventh” and/or “eighth” herein may be understood to be an arbitrary way to denote different elements or entities, and may be understood to not confer a cumulative or chronological character to the nouns these adjectives modify.
Although terminology from Long Term Evolution (LTE)/5G has been used in this disclosure to exemplify the embodiments herein, this should not be seen as limiting the scope of the embodiments herein to only the aforementioned system. Other wireless systems supporting similar or equivalent functionality may also benefit from exploiting the ideas covered within this disclosure. In future telecommunication networks, e.g., in the sixth generation (6G), the terms used herein may need to be reinterpreted in view of possible terminology changes in future technologies.
Generally, all terms used herein are to be interpreted according to their ordinary meaning in the relevant technical field, unless a different meaning is clearly given and/or is implied from the context in which it is used. All references to a/an/the element, apparatus, component, means, step, etc. are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise. The steps of any methods disclosed herein do not have to be performed in the exact order disclosed, unless a step is explicitly described as following or preceding another step and/or where it is implicit that a step must follow or precede another step. Any feature of any of the embodiments disclosed herein may be applied to any other embodiment, wherever appropriate. Likewise, any advantage of any of the embodiments may apply to any other embodiments, and vice versa. Other objectives, features and advantages of the enclosed embodiments will be apparent from the following description.
Several embodiments are comprised herein. It should be noted that the examples herein are not mutually exclusive. Components from one embodiment may be tacitly assumed to be present in another embodiment and it will be obvious to a person skilled in the art how those components may be used in the other exemplary embodiments.
Embodiments of a computer-implemented method, performed by the first node 111, will now be described with reference to the flowchart depicted in
The method may comprise the actions described below. In some embodiments some of the actions may be performed. In some embodiments, all the actions may be performed. In
Components from one example may be tacitly assumed to be present in another example and it will be obvious to a person skilled in the art how those components may be used in the other examples.
In the course of operations in the communications system 100, it may be of interest to predict an indicator of performance of the communications system 100, e.g., y, such as a Key Performance Indicator (KPI). The indicator of performance of the communications system 100, y, may be understood to be a target indicator of performance. For the purpose of predicting the indicator of performance y, the first node 111 may generate and train a first predictive model, MG, of the indicator of performance y of the communications system 100. The first node 111, which may be understood as a central or global node, may ultimately generate the first predictive model, MG, e.g., with FL, in a co-operative fashion with the plurality of second nodes 112. The plurality of second nodes 112, Ci, may be understood to comprise N local or client nodes. Each of the second nodes 112 may generate a respective second predictive model, e.g., MCi, which may also be referred to as a respective local model. Each respective second predictive model, MCi, may be trained to predict, and subsequently predict, the target indicator of performance y of the communications system 100, e.g., a target KPI, computed from feature indicators of performance, e.g., feature KPIs such as Uplink Received Signal Strength Indicator (RSSI), Physical Resource Block (PRB) Utilization (kpi_prb_util_calc), LTE downlink transmission time interval (kpilte_se_dl_tti), etc. The feature indicators of performance may be generated, for example, by suitable KPI creation processes from Performance Management (PM) counter data. The plurality of second nodes 112, Ci, may communicate with the first node 111 by sharing model parameters MCi, e.g., neural network weights.
Embodiments herein may be understood to advantageously comprise the computation, by each of the nodes in the plurality of second nodes 112, Ci, of explainability values, SCi, for the respective second predictive models, using, for example, a SHAP explainer. These explainability values, SCi, may then also be shared with the first node 111.
In this Action 201, the first node 111 may obtain, from each of the second nodes in the plurality of second nodes 112, Ci, as obtained after a first number of iterations of training the respective second predictive models, respectively, by the plurality of second nodes 112: i) first respective parameters, e.g., MCi, of a first version of the respective second predictive models, ii) first respective indicators of performance, e.g., PCi, of the first version of the respective second predictive models, and iii) first respective explainability values, e.g., SCi, of the first version of the respective second predictive models.
Each iteration of training a predictive model may result in a version of the respective model, that is, in a set of weights for each of the feature indicators, e.g., feature KPIs, in the respective model. The respective model may be considered to be trained when a certain performance value may be obtained, e.g., an accuracy may be obtained, that may exceed a certain performance threshold. The performance may be measured with a performance metric, such as percentage improvement in Root Mean Squared Error (RMSE), number of iterations required, and/or convergence of the model loss function. Each may be understood to have a respective threshold. The first version of the respective second predictive model may refer to that obtained after a certain number, or first number, of iterations of training, not necessarily a single iteration.
Obtaining may be understood as receiving, or retrieving, e.g., via the respective second link 152.
Each of the predictive models, that is, any of the respective second predictive models and the first predictive model, may be understood to be an ML model, such as a multi-layer perceptron model for regression.
By obtaining the first respective parameters, e.g., MCi, of the first version of the respective second predictive models, ii) the first respective indicators of performance, e.g., PCi, of the first version of the respective second predictive models, and iii) the first respective explainability values, e.g., SCi, of the first version of the respective second predictive models in this Action 201, the first node 111 may be enabled to pick, in the next Action 202, the best performing local model to initialize the first predictive model, that is, the general or global model MG, which may enable it to ultimately predict the indicator of performance of the communications system 100, e.g., the target KPI.
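As a non-limiting sketch of what each second node may compute and share in this Action 201, a local model may be trained and its report packaged as follows; the multi-layer perceptron hyper-parameters and the dictionary keys are illustrative assumptions:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def train_and_report(node_id, X_feat, y_target, shap_values):
    """Train the respective second predictive model M_Ci on feature KPIs and
    package its parameters, performance P_Ci (here, RMSE) and explainability
    values S_Ci for the first node 111; raw feature KPIs are not shared."""
    model = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=500,
                         random_state=0)
    model.fit(X_feat, y_target)
    rmse = np.sqrt(np.mean((model.predict(X_feat) - y_target) ** 2))
    report = {
        "node_id": node_id,
        "coefs": [w.copy() for w in model.coefs_],            # parameters M_Ci
        "intercepts": [b.copy() for b in model.intercepts_],
        "performance": rmse,                                  # P_Ci, lower is better
        "shap": shap_values,                                  # S_Ci
    }
    return model, report
```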
In this Action 202, the first node 111 may initialize the first predictive model, e.g., MG, that is, the central or global model, with the first version of one of the respective second predictive models. The first version of the one of the respective second predictive models may correspond to the best performing model of the first version of the respective second predictive models after the first number of iterations of training of the first version of the respective second predictive models. That is, the first node 111 may choose, out of all the second predictive models obtained in Action 201, the respective second predictive model having the highest first respective indicator of performance to initialize the first predictive model with, in other words, the best performing local model.
Initializing may be understood as setting the parameters of the first predictive model, that is, the global model, as the parameters of the selected respective second predictive model, e.g., the local model.
By initializing the first predictive model with the first version of the one of the respective second predictive models in this Action 202, the first node 111 may be enabled to build a global model that may best predict the indicator of performance of the communications system 100, e.g., the target KPI, from among the local models available for predicting it. Compared to randomly selecting a local model, the advantage may be understood to be that the current global model may be the best representative model for that indicator of performance of the communications system 100, e.g., that KPI, in the FL system.
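A minimal sketch of this initialization, reusing the illustrative report structure from the earlier sketch, may look as follows:

```python
def initialize_global_model(reports):
    """Action 202 sketch: pick the best-performing local model, i.e., the
    report with the lowest RMSE on the target KPI, to initialize M_G."""
    best = min(reports, key=lambda r: r["performance"])
    return best["coefs"], best["intercepts"]
```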
Each of the nodes in the plurality of second nodes 112 may continue to collect data on the feature indicators used to predict the target indicator of the performance of the communications system 100, and thereby continue to train the respective second predictive model with the newly collected data. Each of the nodes in the plurality of second nodes 112 may continue to monitor the respective performance of the respective second predictive model during the training. At some point, at least one of the nodes in the plurality of second nodes 112, referred to herein as the third node 113, may detect a degradation in the performance of its respective second predictive model. The degradation in the performance of the respective second predictive model of the third node 113 may be due, for example, to local factors, such as drift or noise in training data, e.g., feature KPIs, or abrupt changes in usage patterns due to physical or environmental factors, malicious users, or devices, among others. The degradation may usually be measured as an increase in the error on the model predictions.
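As a non-limiting example of how such a degradation may be detected locally, e.g., at the third node 113, the prediction error on recent samples may be compared against a configured threshold; the function and parameter names are assumptions:

```python
import numpy as np

def performance_degraded(local_model, X_recent, y_recent, rmse_threshold):
    """Return True when the local model's RMSE on recent samples exceeds the
    configured threshold, i.e., when a global model update should be requested."""
    rmse = np.sqrt(np.mean((local_model.predict(X_recent) - y_recent) ** 2))
    return rmse > rmse_threshold
```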
In this Action 203, the first node 111 may receive a first indication from the third node 113. The first indication may request an update of a second version of the respective second predictive model of the third node 113.
To update may be understood as to replace model parameters, e.g., to replace weights of one or more features.
The requested update may be due to the detected degradation of the second version of the respective second predictive model of the third node 113, e.g., after an additional number of iterations of training of the respective second predictive model trained by the third node 113.
The first indication may indicate, for example, a degradation, e.g., an increase in error, on the prediction of a target KPI such as downlink or uplink throughput, latency, number of users, block error rate, physical resource block utilization, etc. by the respective second predictive model, e.g., in the context of MSN use-case.
By receiving the first indication in this Action 203, the first node 111 may be enabled to know that it may need to send a new version of the predictive model to the third node 113, so the third node 113 may replace its degraded version of its respective predictive model with one with better performance. Furthermore, the first node 111 may be enabled to then update the first predictive model excluding the explainability of local models having degraded performance.
In this Action 204, the first node 111 may determine that the update to the first predictive model, e.g., MG, is to be performed, based on a detected degradation of the second version of one of the respective second predictive models. That is, the second version of the respective second predictive model of the third node 113.
Determining may be understood as calculating, deriving, or similar.
In some embodiments, the determining in this Action 204 may be based on the received first indication from the third node 113 in Action 203. In other examples, the determining in this Action 204 may be performed by the first node 111 detecting the degradation itself, e.g., based on a periodic report of a second respective indicator of performance of the second version of the respective second predictive model from the third node 113.
By determining that the update to the first predictive model is to be performed in this Action 204, the first node 111 may be enabled to drive the (re)-training of the first predictive model, that is, the global model, based on performance degradations at local nodes such as the third node 113. This may in turn ensure that the model parameters of the first predictive model may be then updated using the respective second predictive models of second nodes 112 which may be performing well on predicting the indicator of performance of the communications system 100, e.g., the target KPI, excluding the respective second predictive models of nodes whose performance may be degraded. This may be understood to ultimately lead to faster convergence of the first predictive model, with fewer iterations, as will be described later.
While the degradation may have been detected in the respective second predictive model trained by the third node 113, other respective second predictive models trained, respectively, by other second nodes in the plurality of second nodes 112 may not have degraded. Particularly, a first subset, Cj, of the plurality of second nodes 112 may have determined a first subset, k, of respective second predictive models of the indicator of performance of the communications system 100, having a respective performance value above the threshold.
In this Action 205, the first node 111 may send, to the second nodes in the first subset, Cj, of the plurality of second nodes 112, a second indication. The second indication may request to provide respective explainability values, e.g., SCj, as obtained after a second number of iterations of training the first subset, k, of the respective second predictive models, respectively, by the first subset, Cj, of the plurality of second nodes 112. That is, each of the second nodes in the first subset, Cj, of the plurality of second nodes 112, may train a respective second predictive model having a respective performance value above the threshold.
The sending, e.g., transmitting, may be performed by the respective second link 152.
The second indication may be, for example, a command or a trigger.
The explainability values SCj may have been respectively computed by each of the j second nodes in the first subset, Cj, of the plurality of second nodes 112, from model predictions y on the feature indicators, e.g., feature KPIs, using a suitable algorithm, such as a SHAP explainer.
By sending the second indication in this Action 205, the first node 111 may then be enabled to obtain explainability values from the local models to update the parameters of the first predictive model, that is, the global model. As explained earlier, this may in turn ensure that the model parameters of the first predictive model may be then updated using the respective second predictive models of second nodes 112 which may be performing well on predicting the indicator of performance of the communications system 100, e.g., the target KPI, excluding the respective second predictive models of nodes whose performance may be degraded. This may be understood to ultimately lead to faster convergence of the first predictive model, with fewer iterations, as will be described later.
In this Action 206, the first node 111 may obtain, from each of the second nodes in the first subset, Cj, of the plurality of second nodes 112, respective explainability values, e.g., SCj, as obtained after a second number of iterations of training the first subset, k, of the respective second predictive models, respectively, by the first subset, Cj, of the plurality of second nodes 112.
The obtaining, e.g., receiving, may be performed by the respective second link 152.
The respective explainability values, e.g., SCj, may be obtained in this Action 206 in response to the sent second indication in Action 205.
It may be noted that while the respective explainability values are obtained from the second nodes in the first subset, Cj, of the plurality of second nodes 112, the feature indicators, e.g., KPIs, generated are kept private, that is, not shared with the first node 111.
By obtaining the respective explainability values, SCj, from each of the second nodes in the first subset, Cj, of the plurality of second nodes 112, the first node 111 may be enabled, at that instant, to learn from representative attributes, captured by explainability, of local models that may be understood to be performing well in predicting the indicator of performance y, e.g., the target KPI. The global model parameters may be influenced by these representative attributes, as opposed to by factors such as the relative number of samples at the local nodes in conventional FedAvg, allowing the first node 111 to dynamically adapt to the performance of the local nodes. The use of explainability values may be understood to additionally result in energy efficiency, since it may be understood that the use of these values may result in faster convergence of the loss function in fewer iterations, as these may have been determined from local models that may be understood to have been performing well on predicting the indicator of performance, e.g., the target KPI. Thus, multiple benefits may be realized in the FL setup.
In this Action 207, the first node 111 updates, using ML, the first predictive model, MG, of the indicator of performance of the communications system 100. The updating in this Action 207 is based on the respective explainability values SCj, respectively obtained from the first subset Cj of the plurality of second nodes 112 operating in the communications network 100. The respective explainability values SCj correspond to the first subset, k, of respective second predictive models of the indicator of performance of the communications system 100, respectively determined by the first subset, Cj, of the plurality of second nodes 112. The models in the first subset, k, of the respective second predictive models have the respective performance value above the threshold.
Updating the first predictive model may comprise updating the parameters of the first predictive model, that is, the global model parameters. The updating of the parameters of the first predictive model may be performed using a loss function.
Machine learning in this Action 207 may be, e.g., FL. Non-limiting examples of algorithms that may be used to perform the ML, e.g., FL, in this Action 207 may be, e.g., Federated Averaging, Federated Stochastic Gradient Descent, Federated Learning with Dynamic Regularization, among others.
The updating in this Action 207 may be performed using a loss function, F, computed using the respective explainability values SCj, that is, the explainability values respectively obtained from the first subset, Cj, of the plurality of second nodes 112. In other words, the respective explainability values of local models having good performance. The explainability values may have been computed on local model(s) predictions of target KPI(s) using feature KPIs.
Expressed differently, the loss function, F, may be computed using explainability values of all local nodes that may not have degraded performance (k≠i), or:

F = Σk≠i Fk(w; SCk, yCk)

where SCk may denote the explainability values of local node k, and yCk the corresponding predictions of the target KPI by the respective local model.
In other words, in this Action 207, the first node 111 may update parameters of the first predictive model, MG, using explainability, e.g., SHAP, values SCj, of other nodes Cj having good performance. This may be done only for one epoch for each node Cj, that is, by a partial_fit. Updating over one epoch may be understood to mean updating the first predictive model parameters with one iteration over each of the sets of explainability values from the local nodes successively, rather than performing multiple iterations, that is, epochs, with each set. This may be understood to allow the global model to be updated partially by the explainability values of each of the local nodes. That is, the global model may be partially updated by the explaining features of each local model.
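A minimal sketch of this explainability-driven update may look as follows, assuming each non-degraded node shares its per-sample SHAP matrix together with the corresponding local predictions of the target KPI (an assumed additional report field, here named "predictions"); partial_fit performs a single epoch per node:

```python
from sklearn.neural_network import MLPRegressor

def update_global_model(global_model: MLPRegressor, healthy_reports):
    """Action 207 sketch: partially update the global model M_G with the
    explainability values S_Cj and corresponding predictions y_Cj of every
    node C_j whose performance has not degraded (k != i)."""
    for report in healthy_reports:
        S_Cj = report["shap"]         # per-sample explainability values
        y_Cj = report["predictions"]  # local predictions of the target KPI
        global_model.partial_fit(S_Cj, y_Cj)  # one epoch per node C_j
    return global_model
```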
In some embodiments, the updating in this Action 207 may comprise refraining from updating the first predictive model, e.g., MG, with respective explainability values corresponding to a second subset of respective second predictive models of the indicator of performance of the communications system 100, respectively determined by a second subset of the plurality of second nodes 112. The models in the second subset of respective second predictive models may have the respective performance value below the threshold.
The updating in this Action 207 may be performed based on a result of the determination in Action 204 that the update is to be performed.
It may be understood that Action 201 may be performed prior to the updating 207 of the first predictive model, e.g., MG.
Since explainability may be understood to capture the important attributes that may determine the model predictions on the indicator of performance that may be desired to be predicted, e.g., the target KPI, updating the model loss using these values may be expected to drive model weights to optimality faster than determining the local contribution based on a ratio of the number of contributing data samples, as in existing methods.
By the first node 111 updating the first predictive model, MG, of the indicator of performance of the communications system 100 based on the respective explainability values SCj respectively obtained from the first subset Cj of the plurality of second nodes 112, the first node 111 may enable to update the parameters of the first predictive model, that is, the global model parameters, with loss computed on explainability, e.g., SHAP, values and corresponding model predictions, that is, with explaining KPIs, of multiple local models having optimal performance. The first node 111 may thereby ensure that the first predictive model may learn from the explaining features and consequently, may enable to obtain an improvement in the performance of the first predictive model, with fewer iterations of the global model parameter updates. This may be understood to be by updating the first predictive model excluding explainability of local models having degraded performance, such as that of the third node 113.
The loss function may also be enabled to rapidly converge in comparison with existing methods, which may be understood to alleviate the need to run further iterations. These benefits may be understood to in turn account for energy optimization, as computation time and cost may be lower, while performance may be improved.
In this Action 208, the first node 111 provides an indication of the updated first predictive model, M̂G, to the third node 113 comprised in the plurality of second nodes 112 and excluded from the first subset, that is, the ith node that may have reported degradation in performance, or to another node 114 operating in the communications system 100.
The provided indication may be understood to be a third indication.
Providing may be understood as e.g., sending or transmitting, e.g., via the third link 153 to the third node 113, or via the first link 151 to the another node 114.
The indication, that is, the third indication may be, for example, a command or trigger.
By providing the indication to the third node 113, the first node 111 may enable the third node 113 to replace its degraded respective second model MCi with the updated global model M̂G, using the updated parameters w^(t+1). This may enable to address the degradation of the respective second model of the third node 113, and thereby ensure that the first predictive model is enabled to predict the indicator of performance of the communications system 100, y, with higher accuracy.
By providing the indication to the another node 114, the first node 111 may enable the another node 114 to execute the updated global model M̂G, using the updated parameters w^(t+1), to predict the indicator of performance of the communications system 100, y, for any use case, with higher accuracy.
Embodiments of a computer-implemented method, performed by the third node 113, will now be described with reference to the flowchart depicted in
The method may comprise the following actions. Several embodiments are comprised herein. In some embodiments, the method may comprise all actions. In other embodiments, the method may comprise some of the actions. One or more embodiments may be combined, where applicable. All possible combinations are not described to simplify the description. It should be noted that the examples herein are not mutually exclusive. Components from one example may be tacitly assumed to be present in another example and it will be obvious to a person skilled in the art how those components may be used in the other examples. In
A non-limiting example of the method performed by the third node 113 is depicted in
In this Action 301, the third node 113 may send, to the first node 111, as obtained after the first number of iterations of training the respective second predictive model: i) the first respective parameters of the first version of the respective second predictive model, ii) the first respective indicator of performance of the first version of the respective second predictive model, and iii) the first respective explainability values of the first version of the respective second predictive model.
When the performance of the local model of the third node 113, that is, of its respective second predictive model, may degrade, the third node 113 may request a model update from the central node, that is, from the first node 111.
In this Action 302, the third node 113 may send the first indication to the first node 111. As explained earlier, the first indication may request the update of the second version of the respective second predictive model of the third node 113. The requested update may be due to the detected degradation of the second version of the respective second predictive model of the third node 113.
In this Action 303, the third node 113 receives the indication from the first node 111 operating in the communications system 100. The indication indicates the updated first predictive model of the indicator of performance of the communications system 100. The updated first predictive model is based on the respective explainability values respectively obtained from the first subset of the plurality of second nodes 112 operating in the communications network 100. The respective explainability values correspond to the first subset of respective second predictive models of the indicator of performance of the communications system 100, respectively determined by the first subset of the plurality of second nodes 112. As explained earlier, the models in the first subset of the respective second predictive models have the respective performance value above the threshold. The respective second predictive model of the indicator of performance of the communications system 100 of the third node 113 has the respective performance value below the threshold. The third node 113 is comprised in the plurality of second nodes 112 but excluded from the first subset of the plurality of second nodes 112.
The received indication may be understood to be the third indication.
As also explained earlier, the first predictive model may be understood to be a global model, and the respective second predictive models may be understood to be local models.
Action 301 may be understood to be performed prior to the receiving of the third indication in this Action 303.
The receiving in this Action 303 of the indication may be based on the sent first indication in Action 302.
In this Action 304, the third node 113 replaces the respective second predictive model of the indicator of performance of the communications system 100 of the third node 113 with the updated first predictive model, M̂G, indicated by the received indication.
By performing this Action 304, the third node 113 may enable that only the performance degraded local model may be replaced, for improved generalization at the third node 113, that is, the local node.
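As a non-limiting sketch, the replacement in Action 304 may amount to overwriting the degraded local model's parameters with the updated global parameters w^(t+1); the attribute names follow the earlier scikit-learn-based sketches:

```python
from sklearn.neural_network import MLPRegressor

def replace_local_model(local_model: MLPRegressor, global_coefs,
                        global_intercepts):
    """Overwrite only the degraded local model's parameters with the updated
    global model's parameters, leaving well-performing local models untouched."""
    local_model.coefs_ = [w.copy() for w in global_coefs]
    local_model.intercepts_ = [b.copy() for b in global_intercepts]
    return local_model
```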
Embodiments of a computer-implemented method, performed by the second node 112, will now be described with reference to the flowchart depicted in
The method may comprise the following actions. Several embodiments are comprised herein. In some embodiments, the method may comprise all actions. In other embodiments, the method may comprise one or more of the actions. One or more embodiments may be combined, where applicable. All possible combinations are not described to simplify the description. It should be noted that the examples herein are not mutually exclusive. Components from one example may be tacitly assumed to be present in another example and it will be obvious to a person skilled in the art how those components may be used in the other examples. In
A non-limiting example of the method performed by the second node 112 is depicted in
It may be understood that the method described herein in relation to
In this Action 401, the second node 112 may send, to the first node 111, as obtained after the first number of iterations of training the respective second predictive model: i) the first respective parameters of the first version of the respective second predictive model, ii) the first respective indicator of performance of the first version of the respective second predictive model, and iii) the first respective explainability values of the first version of the respective second predictive model.
The respective second predictive model may be understood to be a local model.
In this Action 402, the second node 112 may receive, from the first node 111, the second indication requesting to provide the respective explainability values as obtained after the second number of iterations of training the respective second predictive model.
In this Action 403, the second node 112 sends, to the first node 111 operating in the communications system 100, the respective explainability values corresponding to the respective second predictive model of the indicator of performance of the communications system 100. The respective second predictive model has been determined by the second node 112. The respective second predictive model has a respective performance value above the threshold.
The respective explainability values may be sent in response to the received second indication. The respective explainability values may be obtained after the second number of iterations of training of the respective second predictive model.
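Purely as a non-limiting sketch, Actions 401-403 at a second node 112 may be pictured as follows, where `train`, `evaluate`, `explain`, `send` and `recv` are hypothetical placeholders for the node's training loop, performance evaluation, explainability computation and messaging interfaces:

```python
def second_node_round(t1, t2, threshold, train, evaluate, explain, send, recv):
    """Sketch of Actions 401-403 at a second node 112 (illustrative only)."""
    # Action 401: after a first number t1 of training iterations, send the
    # parameters, performance indicator and explainability values
    params = train(iterations=t1)
    send("first_report", params=params,
         performance=evaluate(), shap_values=explain())
    # Action 402: receive the second indication requesting explainability
    # values as obtained after a second number t2 of training iterations
    if recv() == "second_indication":
        train(iterations=t2)
        # Action 403: only a node whose respective performance value is
        # above the threshold contributes its explainability values
        if evaluate() > threshold:
            send("explainability", shap_values=explain())
```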
Without loss of generality, embodiments herein, e.g., in relation to
Further, the performance of conventional FL, with FedAvg, and of the approach followed by the embodiments herein was compared in Table 2, while the convergence plot for the loss function is shown in
Hence, even when sub-optimal models are used in the FL approach followed by embodiments herein, better performance and efficiency may still be obtained in terms of the number of iterations required for convergence of the loss.
It may be understood that embodiments herein may be used in different use cases involving FL in telecommunications. A first such use case may be for the problem of active causal inferencing in real-time network twins for interference root cause identification among cells. Causal inferencing may be understood to involve determining “cause-effect” relationships between predictions of the target by a model and features used to obtain them. A network digital twin may be understood to refer to a computer simulation model of a communication network, along with its operating environment and the application traffic that it may carry. The digital twin may be used to study the behaviour of its physical counterpart under a diverse set of operating conditions. The training of models on radio node software may be centralized, e.g., on a master node, while the execution may be performed in a decentralized way. The global information may be used to train policies for each cell. Post training, each cell may obtain a decentralized policy, which may be implemented based on the local observations of the cell. This architecture may enable the cells to take decisions co-operatively, based on both the local and the global conditions. The local node explainability data may be used according to embodiments herein to effectively train the global models and potentially achieve better performance.
Another non-limiting embodiment may be for the use-case involving model-based hybrid beamforming in Millimetre Wave (mmWave) Multiple Input Multiple Output (MIMO) systems [10]. Here, local beamformers may be designed using model-based manifold optimization algorithms. FL may be used to train a learning model on the local dataset of users, who may estimate the beamformers by feeding the model with their channel data. Explainability may then be leveraged in such a context as well, in a similar manner.
As a summarized overview of the foregoing, embodiments herein may be understood to use explainability of local model(s) to update the model parameters of a global model by using them to compute the loss function. The actions performed may comprise that the local nodes share model parameters M_Ck(w) and explainability values (S_Ck) with a central node. The central node may update the model when local node i may report degraded performance (P_Ci). Then the central model may be partially updated by the loss F computed using the explainability values of all local nodes that do not have degraded performance (k ≠ i), or:

w(t+1) = w(t) − η·∇_w F({S_Ck : k ≠ i}),

where η may be a learning rate. The loss function used in updating the global model parameters may be determined using the explainability values of the k local model(s), excluding the i-th node that reported degradation in performance. The degraded local model may then be replaced by the updated w(t+1).
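A minimal numerical sketch of this update is given below. It assumes, for illustration only, that the loss F is the squared deviation of the global model's explainability values from the mean of the non-degraded local nodes' values, and that the gradient is taken by finite differences; the disclosure does not fix a particular form of F, and the names here (`global_update`, `numeric_grad`, `shap_global_fn`) are hypothetical:

```python
import numpy as np

def numeric_grad(F, w, eps=1e-6):
    """Central finite-difference gradient of a scalar function F at w."""
    g = np.zeros_like(w)
    for j in range(w.size):
        d = np.zeros_like(w)
        d[j] = eps
        g[j] = (F(w + d) - F(w - d)) / (2 * eps)
    return g

def global_update(w, eta, shap_locals, shap_global_fn, degraded):
    """One partial update w(t+1) = w(t) - eta * grad_w F, with F computed
    from the explainability values of the non-degraded nodes only."""
    # Keep only the explainability vectors S_Ck of nodes k != i
    kept = np.array([s for k, s in shap_locals.items() if k not in degraded])
    target = kept.mean(axis=0)  # aggregated local attributions
    # Assumed form of F(w): squared deviation of the global model's
    # attributions from the aggregated local attributions
    F = lambda w_: float(np.sum((shap_global_fn(w_) - target) ** 2))
    return w - eta * numeric_grad(F, w)

# Toy usage: a linear global model whose per-feature attribution is w * spread
spread = np.array([1.0, 0.5, 2.0])
shap_global_fn = lambda w_: w_ * spread
shap_locals = {0: np.array([0.9, 0.2, 1.1]),
               1: np.array([1.1, 0.3, 0.9]),
               2: np.array([9.0, 9.0, 9.0])}   # node 2 reported degradation
w_next = global_update(np.zeros(3), eta=0.1, shap_locals=shap_locals,
                       shap_global_fn=shap_global_fn, degraded={2})
```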
One advantage of embodiments herein in FL may be understood to be energy efficiency, due to faster convergence of the global model loss function in a few iterations, and improved performance.
Another advantage of embodiments herein may be understood to be that an asynchronous local model update may only be performed when the performance of the model on a target KPI at a cell may degrade, rather than irrespective of performance, enabling the node to learn from model(s) of other cells.
A further advantage of embodiments herein may be understood to be the ability to handle drift at the local nodes implicitly. If there is drift, e.g., a change in the distribution of the training data, e.g., feature KPIs, that alters the local model parameters and degrades the model's performance, embodiments herein may update the degraded model using the global model, which may in turn be updated from the explainability values of other local models on that KPI whose performance has not degraded. This may be understood to restore the performance of the local model and redress the ill-effect of the drift.
Yet another advantage of embodiments herein may be understood to be that they may enable a selective local model update based on performance, and a periodic global model update, which may allow the global model to frequently switch to a robust model architecture based on performance of the local model(s), redressing the degradation in performance of the models on predicting the target KPI across local node(s).
Furthermore, embodiments herein may advantageously support heterogeneous model topologies at multiple client nodes, since the approach used may be understood to involve updating global model parameters based on a loss function computed on explainability values, and replacing the local model whose performance may have degraded. Different ML model architectures may be supported at the nodes, such as neural networks, decision trees, graph networks, etc. Embodiments herein may be understood to not be restricted to only Neural Networks (NNs).
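To illustrate this point about heterogeneous topologies, the sketch below uses the open-source `shap` and `scikit-learn` packages on toy data, as an illustrative assumption rather than a mandated implementation: two local models of different architectures both reduce to per-feature attribution vectors of identical shape, which the global loss F may consume irrespective of model type:

```python
import numpy as np
import shap
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import GradientBoostingRegressor

X = np.random.rand(100, 3)              # toy feature KPIs
y = X @ np.array([1.0, 0.0, 2.0])       # toy target KPI

# Two heterogeneous local models predicting the same target KPI
linear = LinearRegression().fit(X, y)
trees = GradientBoostingRegressor(n_estimators=20).fit(X, y)

# Architecture-specific explainers, architecture-agnostic outputs
s_lin = shap.LinearExplainer(linear, X).shap_values(X)
s_tree = shap.TreeExplainer(trees).shap_values(X)

# Both reduce to per-feature attribution vectors of the same shape, so the
# global loss F can consume them irrespective of the local model topology
S_lin = np.abs(s_lin).mean(axis=0)
S_tree = np.abs(s_tree).mean(axis=0)
```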
The approach followed by embodiments herein may also be used in additional embodiments involving FL use cases in telecommunications, such as for estimation of coverage at cellular sites where multiple models may be used to determine respective cellular coverage, and may be trained in a federated setup with a model at the tower location, or for predicting pro-active shutdown at cells based on degradation of performance parameters for energy efficiency, or in models used for creating real-time network twins used in capacity planning use cases, or for multi-node causal inference required for traffic analytics, among others [10].
Several embodiments are comprised herein. It should be noted that the examples herein are not mutually exclusive. One or more embodiments may be combined, where applicable. All possible combinations are not described to simplify the description. Components from one embodiment may be tacitly assumed to be present in another embodiment and it will be obvious to a person skilled in the art how those components may be used in the other exemplary embodiments. In
The first node 111 is configured to, e.g., by means of an updating unit 1301 within the first node 111 configured to, update, using machine learning, the first predictive model of the indicator of performance of the communications system 100. The updating is configured to be based on the respective explainability values configured to be respectively obtained from the first subset of the plurality of second nodes 112 configured to be operating in the communications system 100. The respective explainability values are configured to correspond to the first subset of the respective second predictive models of the indicator of performance of the communications system 100, configured to be respectively determined by the first subset of the plurality of second nodes 112. The models in the first subset of respective second predictive models are configured to have the respective performance value above the threshold.
The first node 111 is further configured to, e.g., by means of a providing unit 1302 within the first node 111 configured to, provide the indication of the first predictive model configured to be updated to the third node 113 configured to be comprised in the plurality of second nodes 112 and excluded from the first subset, or to another node 114 configured to operate in the communications system 100.
In some embodiments, the updating may be configured to comprise refraining from updating the first predictive model with the respective explainability values configured to correspond to the second subset of respective second predictive models of the indicator of performance of the communications system 100, configured to be respectively determined by the second subset of the plurality of second nodes 112. The models in the second subset of respective second predictive models may be configured to have the respective performance value below the threshold.
In some embodiments, the first node 111 may be further configured to, prior to the updating of the first predictive model, e.g., by means of an obtaining unit 1303 within the first node 111 configured to, obtain, from each of the second nodes in the plurality of second nodes 112, as obtained after the first number of iterations of training the respective second predictive models, respectively, by the plurality of second nodes 112: i) the first respective parameters of the first version of the respective second predictive models, ii) the first respective indicators of performance of the first version of the respective second predictive models, and iii) the first respective explainability values of the first version of the respective second predictive models.
In some embodiments, the first node 111 may be further configured to, prior to the updating of the first predictive model, e.g., by means of an initializing unit 1304 within the first node 111 configured to, initialize the first predictive model with the first version of one of the respective second predictive models. The first version of the one of the respective second predictive models may be configured to correspond to the best performing model of the first version of the respective second predictive models after the first number of iterations of training of the first version of the respective second predictive models.
In some embodiments, the first node 111 may be further configured to, e.g., by means of a determining unit 1305 within the first node 111 configured to, determine that the update to the first predictive model is to be performed, based on the degradation configured to be detected of the second version of the one of the respective second predictive models. The updating may be configured to be performed based on the result of the determination that the update is to be performed.
In some embodiments, the first node 111 may be further configured to, e.g., by means of a receiving unit 1306 within the first node 111 configured to, receive the first indication from the third node 113. The first indication may be configured to request the update of the second version of the respective second predictive model of the third node 113. The requested update may be configured to be due to the detected degradation of the second version of the respective second predictive model of the third node 113. The determining may be configured to be based on the first indication configured to be received.
In some embodiments, the first node 111 may be further configured to, e.g., by means of the obtaining unit 1303 within the first node 111 configured to, obtain, from each of the second nodes in the first subset of the plurality of second nodes 112, the respective explainability values as configured to be obtained after the second number of iterations of training the first subset of the respective second predictive models, respectively, by the first subset of the plurality of second nodes 112.
In some embodiments wherein the indication configured to be provided may be configured to be the third indication, the first node 111 may be further configured to, e.g., by means of a sending unit 1307 within the first node 111 configured to, send, to the second nodes in the first subset of the plurality of second nodes 112, the second indication. The second indication may be configured to request to provide the respective explainability values as configured to be obtained after the second number of iterations of training the first subset of the respective second predictive models, respectively, by the first subset of the plurality of second nodes 112. The respective explainability values may be configured to be obtained in response to the second indication configured to be sent.
In some embodiments, the updating may be configured to be performed using the loss function configured to be computed using the respective explainability values.
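For orientation only, the first node 111 behavior configured above may be sketched as a single round in Python, with the indications modeled as hypothetical callables; `request_shap`, `update_global` and `send_model` are placeholders, not interfaces defined by the disclosure:

```python
from dataclasses import dataclass

@dataclass
class ClientReport:        # what each second node 112 shares (assumed shape)
    params: object         # first respective parameters, M_Ck(w)
    performance: float     # respective performance value, P_Ck
    shap_values: object    # respective explainability values, S_Ck

def first_node_round(reports, threshold, request_shap, update_global,
                     send_model):
    """One update round at the first node 111 (server-side sketch)."""
    # Initialize the global model from the best-performing local model
    best = max(reports, key=lambda cid: reports[cid].performance)
    global_params = reports[best].params
    # Split nodes by the performance threshold
    healthy = {cid for cid, r in reports.items() if r.performance > threshold}
    degraded = set(reports) - healthy
    if degraded:
        # Second indication: ask the first subset for fresh explainability
        # values after the second number of training iterations
        shap = {cid: request_shap(cid) for cid in healthy}
        # Update the global parameters with a loss computed on those values
        global_params = update_global(global_params, shap)
        # Third indication: push the updated model to each degraded node
        for cid in degraded:
            send_model(cid, global_params)
    return global_params
```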
The embodiments herein in the first node 111 may be implemented through one or more processors, such as a processor 1308 in the first node 111 depicted in
The first node 111 may further comprise a memory 1309 comprising one or more memory units. The memory 1309 is arranged to be used to store obtained information, store data, configurations, schedulings, and applications etc. to perform the methods herein when being executed in the first node 111.
In some embodiments, the first node 111 may receive information from, e.g., the plurality of second nodes 112, the third node 113, the another node 114 and/or any of the first device 141, the second device 142 and the third device 143 through a receiving port 1310. In some embodiments, the receiving port 1310 may be, for example, connected to one or more antennas in the first node 111. In other embodiments, the first node 111 may receive information from another structure in the communications system 100 through the receiving port 1310. Since the receiving port 1310 may be in communication with the processor 1308, the receiving port 1310 may then send the received information to the processor 1308. The receiving port 1310 may also be configured to receive other information.
The processor 1308 in the first node 111 may be further configured to transmit or send information to e.g., the plurality of second nodes 112, the third node 113, the another node 114, any of the first device 141, the second device 142 and the third device 143 and/or another structure in the communications system 100, through a sending port 1311, which may be in communication with the processor 1308, and the memory 1309.
Those skilled in the art will also appreciate that the units 1301-1307 described above may refer to a combination of analog and digital circuits, and/or one or more processors configured with software and/or firmware, e.g., stored in memory, that, when executed by the one or more processors such as the processor 1308, perform as described above. One or more of these processors, as well as the other digital hardware, may be included in a single Application-Specific Integrated Circuit (ASIC), or several processors and various digital hardware may be distributed among several separate components, whether individually packaged or assembled into a System-on-a-Chip (SoC).
Also, in some embodiments, the different units 1301-1307 described above may be implemented as one or more applications running on one or more processors such as the processor 1308.
Thus, the methods according to the embodiments described herein for the first node 111 may be respectively implemented by means of a computer program 1312 product, comprising instructions, i.e., software code portions, which, when executed on at least one processor 1308, cause the at least one processor 1308 to carry out the actions described herein, as performed by the first node 111. The computer program 1312 product may be stored on a computer-readable storage medium 1313. The computer-readable storage medium 1313, having stored thereon the computer program 1312, may comprise instructions which, when executed on at least one processor 1308, cause the at least one processor 1308 to carry out the actions described herein, as performed by the first node 111. In some embodiments, the computer-readable storage medium 1313 may be a non-transitory computer-readable storage medium, such as a CD ROM disc, or a memory stick. In other embodiments, the computer program 1312 product may be stored on a carrier containing the computer program 1312 just described, wherein the carrier is one of an electronic signal, optical signal, radio signal, or the computer-readable storage medium 1313, as described above.
The first node 111 may comprise a communication interface configured to facilitate, or an interface unit to facilitate, communications between the first node 111 and other nodes or devices, e.g., the plurality of second nodes 112, the third node 113, the another node 114, any of the first device 141, the second device 142 and the third device 143 and/or another structure in the communications system 100. The interface may, for example, include a transceiver configured to transmit and receive radio signals over an air interface in accordance with a suitable standard.
In other embodiments, the first node 111 may comprise the following arrangement depicted in
Hence, embodiments herein also relate to the first node 111 operative to operate in the communications system 100. The first node 111 may comprise the processing circuitry 1308 and the memory 1309, said memory 1309 containing instructions executable by said processing circuitry 1308, whereby the first node 111 is further operative to perform the actions described herein in relation to the first node 111, e.g., in
Several embodiments are comprised herein. It should be noted that the examples herein are not mutually exclusive. One or more embodiments may be combined, where applicable. All possible combinations are not described to simplify the description.
Components from one embodiment may be tacitly assumed to be present in another embodiment and it will be obvious to a person skilled in the art how those components may be used in the other exemplary embodiments. In
The detailed description of some of the following corresponds to the same references provided above, in relation to the actions described for the third node 113 and will thus not be repeated here. For example, the explainability values may be configured to be computed using a SHAP explainer.
The third node 113 is configured to, e.g., by means of a receiving unit 1401 within the third node 113, configured to receive the indication from the first node 111 configured to operate in the communications system 100. The indication is configured to indicate the updated first predictive model of the indicator of performance of the communications system 100. The updated first predictive model is configured to be based on the respective explainability values configured to be respectively obtained from the first subset of the plurality of second nodes 112 configured to operate in the communications system 100. The respective explainability values are configured to correspond to the first subset of respective second predictive models of the indicator of performance of the communications system 100, configured to be respectively determined by the first subset of the plurality of second nodes 112. The models in the first subset of respective second predictive models are configured to have the respective performance value above the threshold. The respective second predictive model of the indicator of performance of the communications system 100 of the third node 113 is configured to have the respective performance value below the threshold. The third node 113 is configured to be comprised in the plurality of second nodes 112 but excluded from the first subset of the plurality of second nodes 112.
The third node 113 is further configured to, e.g., by means of a replacing unit 1402 within the third node 113 configured to, replace the respective second predictive model of the indicator of performance of the communications system 100 of the third node 113 with the updated first predictive model configured to be indicated by the indication configured to be received.
In some embodiments wherein the indication configured to be received may be configured to be the third indication, the third node 113 may be further configured to, prior to the receiving of the third indication, e.g., by means of a sending unit 1403 within the third node 113 configured to, send, to the first node 111, as configured to be obtained after the first number of iterations of training the respective second predictive model: i) the first respective parameters of the first version of the respective second predictive model, ii) the first respective indicator of performance of the first version of the respective second predictive model, and iii) the first respective explainability values of the first version of the respective second predictive model.
In some embodiments wherein the indication configured to be received may be configured to be the third indication, the third node 113 may be further configured to, e.g., by means of the sending unit 1403 within the third node 113 configured to, send the first indication to the first node 111. The first indication may be configured to request the update of the second version of the respective second predictive model of the third node 113. The requested update may be configured to be due to the detected degradation of the second version of the respective second predictive model of the third node 113. The receiving of the indication may be configured to be based on the first indication configured to be sent.
In some embodiments, the first predictive model may be configured to be the global model, and the respective second predictive models may be configured to be the local models.
The embodiments herein in the third node 113 may be implemented through one or more processors, such as a processor 1404 in the third node 113 depicted in
The third node 113 may further comprise a memory 1405 comprising one or more memory units. The memory 1405 is arranged to be used to store obtained information, store data, configurations, schedulings, and applications etc. to perform the methods herein when being executed in the third node 113.
In some embodiments, the third node 113 may receive information from, e.g., the first node 111, the plurality of second nodes 112, the another node 114 and/or the first device 141, through a receiving port 1406. In some embodiments, the receiving port 1406 may be, for example, connected to one or more antennas in the third node 113. In other embodiments, the third node 113 may receive information from another structure in the communications system 100 through the receiving port 1406. Since the receiving port 1406 may be in communication with the processor 1404, the receiving port 1406 may then send the received information to the processor 1404. The receiving port 1406 may also be configured to receive other information.
The processor 1404 in the third node 113 may be further configured to transmit or send information to e.g., the first node 111, the plurality of second nodes 112, the another node 114 and/or the first device 141, and/or another structure in the communications system 100, through a sending port 1407, which may be in communication with the processor 1404, and the memory 1405.
Those skilled in the art will also appreciate that the units 1401-1403 described above may refer to a combination of analog and digital circuits, and/or one or more processors configured with software and/or firmware, e.g., stored in memory, that, when executed by the one or more processors such as the processor 1404, perform as described above. One or more of these processors, as well as the other digital hardware, may be included in a single Application-Specific Integrated Circuit (ASIC), or several processors and various digital hardware may be distributed among several separate components, whether individually packaged or assembled into a System-on-a-Chip (SoC).
Also, in some embodiments, the different units 1401-1403 described above may be implemented as one or more applications running on one or more processors such as the processor 1404.
Thus, the methods according to the embodiments described herein for the third node 113 may be respectively implemented by means of a computer program 1408 product, comprising instructions, i.e., software code portions, which, when executed on at least one processor 1404, cause the at least one processor 1404 to carry out the actions described herein, as performed by the third node 113. The computer program 1408 product may be stored on a computer-readable storage medium 1409. The computer-readable storage medium 1409, having stored thereon the computer program 1408, may comprise instructions which, when executed on at least one processor 1404, cause the at least one processor 1404 to carry out the actions described herein, as performed by the third node 113. In some embodiments, the computer-readable storage medium 1409 may be a non-transitory computer-readable storage medium, such as a CD ROM disc, or a memory stick. In other embodiments, the computer program 1408 product may be stored on a carrier containing the computer program 1408 just described, wherein the carrier is one of an electronic signal, optical signal, radio signal, or the computer-readable storage medium 1409, as described above.
The third node 113 may comprise a communication interface configured to facilitate, or an interface unit to facilitate, communications between the third node 113 and other nodes or devices, e.g., the first node 111, the plurality of second nodes 112, the another node 114 and/or the first device 141, and/or another structure in the communications system 100. The interface may, for example, include a transceiver configured to transmit and receive radio signals over an air interface in accordance with a suitable standard.
In other embodiments, the third node 113 may comprise the following arrangement depicted in
Hence, embodiments herein also relate to the third node 113 operative to operate in the communications system 100. The third node 113 may comprise the processing circuitry 1404 and the memory 1405, said memory 1405 containing instructions executable by said processing circuitry 1404, whereby the third node 113 is further operative to perform the actions described herein in relation to the third node 113, e.g.,
Several embodiments are comprised herein. It should be noted that the examples herein are not mutually exclusive. One or more embodiments may be combined, where applicable. All possible combinations are not described to simplify the description. Components from one embodiment may be tacitly assumed to be present in another embodiment and it will be obvious to a person skilled in the art how those components may be used in the other exemplary embodiments. In
The detailed description of some of the following corresponds to the same references provided above, in relation to the actions described for the second node 112 and will thus not be repeated here. For example, the explainability values may be configured to be computed using a SHAP explainer.
The second node 112 is configured to, e.g., by means of a sending unit 1501 within the second node 112 configured to, send, to the first node 111 configured to operate in the communications system 100, the respective explainability values configured to correspond to the respective second predictive model of the indicator of performance of the communications system 100. The respective second predictive model may be configured to have been determined by the second node 112. The respective second predictive model may be configured to have the respective performance value above the threshold.
In some embodiments, the second node 112 may be further configured to, e.g., by means of the sending unit 1501 within the second node 112 configured to, send, to the first node 111, as configured to be obtained after the first number of iterations of training the respective second predictive model: i) the first respective parameters of the first version of the respective second predictive model, ii) the first respective indicator of performance of the first version of the respective second predictive model, and iii) the first respective explainability values of the first version of the respective second predictive model. The respective explainability values may be configured to be obtained after the second number of iterations of training of the respective second predictive model.
In some embodiments, the second node 112 may be further configured to, e.g., by means of a receiving unit 1502 within the second node 112, configured to receive, from the first node 111, the second indication configured to request to provide the respective explainability values as configured to be obtained after the second number of iterations of training the respective second predictive model. The respective explainability values may be configured to be sent in response to the second indication configured to be received.
In some embodiments, the respective second predictive model may be configured to be a local model.
The embodiments herein in the second node 112 may be implemented through one or more processors, such as a processor 1503 in the second node 112 depicted in
The second node 112 may further comprise a memory 1504 comprising one or more memory units. The memory 1504 is arranged to be used to store obtained information, store data, configurations, schedulings, and applications etc. to perform the methods herein when being executed in the second node 112.
In some embodiments, the second node 112 may receive information from, e.g., the first node 111, the other second nodes in the plurality of second nodes 112, the third node 113, the another node 114 and/or any of the second device 142 and the third device 143, through a receiving port 1505. In some embodiments, the receiving port 1505 may be, for example, connected to one or more antennas in the second node 112. In other embodiments, the second node 112 may receive information from another structure in the communications system 100 through the receiving port 1505. Since the receiving port 1505 may be in communication with the processor 1503, the receiving port 1505 may then send the received information to the processor 1503. The receiving port 1505 may also be configured to receive other information.
The processor 1503 in the second node 112 may be further configured to transmit or send information to e.g., the first node 111, the other second nodes in the plurality of second nodes 112, the third node 113, the another node 114 and/or any of the second device 142 and the third device 143 and/or another structure in the communications system 100, through a sending port 1506, which may be in communication with the processor 1503, and the memory 1504.
Those skilled in the art will also appreciate that the units 1501-1502 described above may refer to a combination of analog and digital circuits, and/or one or more processors configured with software and/or firmware, e.g., stored in memory, that, when executed by the one or more processors such as the processor 1503, perform as described above. One or more of these processors, as well as the other digital hardware, may be included in a single Application-Specific Integrated Circuit (ASIC), or several processors and various digital hardware may be distributed among several separate components, whether individually packaged or assembled into a System-on-a-Chip (SoC).
Also, in some embodiments, the different units 1501-1502 described above may be implemented as one or more applications running on one or more processors such as the processor 1503.
Thus, the methods according to the embodiments described herein for the second node 112 may be respectively implemented by means of a computer program 1507 product, comprising instructions, i.e., software code portions, which, when executed on at least one processor 1503, cause the at least one processor 1503 to carry out the actions described herein, as performed by the second node 112. The computer program 1507 product may be stored on a computer-readable storage medium 1508. The computer-readable storage medium 1508, having stored thereon the computer program 1507, may comprise instructions which, when executed on at least one processor 1503, cause the at least one processor 1503 to carry out the actions described herein, as performed by the second node 112. In some embodiments, the computer-readable storage medium 1508 may be a non-transitory computer-readable storage medium, such as a CD ROM disc, or a memory stick. In other embodiments, the computer program 1507 product may be stored on a carrier containing the computer program 1507 just described, wherein the carrier is one of an electronic signal, optical signal, radio signal, or the computer-readable storage medium 1508, as described above.
The second node 112 may comprise a communication interface configured to facilitate, or an interface unit to facilitate, communications between the second node 112 and other nodes or devices, e.g., the first node 111, the other second nodes in the plurality of second nodes 112, the third node 113, the another node 114 and/or any of the second device 142 and the third device 143 and/or another structure in the communications system 100. The interface may, for example, include a transceiver configured to transmit and receive radio signals over an air interface in accordance with a suitable standard.
In other embodiments, the second node 112 may comprise the following arrangement depicted in
Hence, embodiments herein also relate to the second node 112 operative to operate in the communications system 100. The second node 112 may comprise the processing circuitry 1503 and the memory 1504, said memory 1504 containing instructions executable by said processing circuitry 1503, whereby the second node 112 is further operative to perform the actions described herein in relation to the second node 112, e.g.,
When using the word “comprise” or “comprising”, it shall be interpreted as non-limiting, i.e. meaning “consist at least of”.
The embodiments herein are not limited to the above described preferred embodiments. Various alternatives, modifications and equivalents may be used. Therefore, the above embodiments should not be taken as limiting the scope of the invention.
Generally, all terms used herein are to be interpreted according to their ordinary meaning in the relevant technical field, unless a different meaning is clearly given and/or is implied from the context in which it is used. All references to a/an/the element, apparatus, component, means, step, etc. are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise. The steps of any methods disclosed herein do not have to be performed in the exact order disclosed, unless a step is explicitly described as following or preceding another step and/or where it is implicit that a step must follow or precede another step. Any feature of any of the embodiments disclosed herein may be applied to any other embodiment, wherever appropriate. Likewise, any advantage of any of the embodiments may apply to any other embodiments, and vice versa. Other objectives, features and advantages of the enclosed embodiments will be apparent from the following description.
As used herein, the expression “at least one of:” followed by a list of alternatives separated by commas, and wherein the last alternative is preceded by the “and” term, may be understood to mean that only one of the list of alternatives may apply, more than one of the list of alternatives may apply or all of the list of alternatives may apply. This expression may be understood to be equivalent to the expression “at least one of:” followed by a list of alternatives separated by commas, and wherein the last alternative is preceded by the “or” term.
Any of the terms processor and circuitry may be understood herein as a hardware component.
As used herein, the expression “in some embodiments” has been used to indicate that the features of the embodiment described may be combined with any other embodiment or example disclosed herein.
As used herein, the expression “in some examples” has been used to indicate that the features of the example described may be combined with any other embodiment or example disclosed herein.