The present disclosure relates to machine learning and artificial intelligence. In particular, the present disclosure relates to generation of synthetic data for training machine learning models.
In recent years, there has been increasing interest in the use of Machine Learning (ML) techniques for Quality of Experience (QoE) modeling due to the increasingly complex interdependence and high dimensionality of features that are important to QoE models.
In QoE studies, datasets that can be used to train ML models are typically collected through both time- and energy-consuming user studies. Such datasets include not only Quality of Service (QoS) metrics, but also information about a user's perceived quality of experience on applications and services. At least some of the subjective data includes important factors for the model outcome, such as user background and application usage behaviour.
Training ML models with acceptable accuracy requires the availability of large training datasets. Due to the sensitive nature of many user datasets that are used to evaluate QoE, such datasets are not easily shared amongst QoE researchers, often due to restrictions such as the General Data Protection Regulation (GDPR) and intellectual property rights (IPR). Such restrictions often make it difficult for training datasets to be validated or reused in other QoE studies.
One way of handling this limitation is through the use of Collaborative Learning (CL) or Federated Learning (FL) techniques in which collaborators exchange and aggregate model parameters without exchanging the underlying training data that was used to obtain the model parameters.
The FL approach requires that participating collaborators start with some acceptable accuracy, as many iterations of exchanging model parameters between the collaborating entities are needed. More iterations require more network resources, which increases the cost of training. The problems of poor starting accuracy and high network footprint can be addressed by using synthetic but realistic training data to supplement the local workers' training datasets.
Larger training datasets help to train more robust generic machine learning models. In particular, large amounts of training data are needed to train Neural Networks. In CL/FL, the most widely used algorithms are based on Neural Network models. In cases where there is a limited amount of data to train a Neural Network, it is known to generate synthetic datasets with other ensemble algorithms, such as Random Forest models, that can generate structured tabular datasets.
Generative models can be used to generate synthetic training data from a small representative training dataset by interpolating/extrapolating the small dataset to a larger training dataset. This enables the development/training of more robust and/or generic machine learning models. Hence, the benefit of using a generative model to create synthetic training data is not only applicable in the field of FL, but can also be used to develop more robust machine learning models in general. This approach may be particularly valuable in the case of edge computing where worker models are separated/isolated from one another and must be trained in isolation.
Some embodiments provide a method of generating a synthetic training dataset for training a machine learning model using an original training dataset including a plurality of features. The method includes selecting a feature ci of the original training dataset as a target vector yi, selecting remaining features of the original training dataset as a set of training input vectors X\i, where X\i includes all features of the training dataset other than a feature corresponding to the selected feature ci, and training a prediction model f(yi|X\i). The method generates an estimate y′i of the target vector yi by applying the prediction model to the set of training vectors X\i, and inserts a synthetic feature c′i corresponding to the estimate y′i of the target vector yi into a synthetic training dataset.
The method may further include repeating, for a plurality of features of the original training dataset, operations of selecting a feature of the training dataset, selecting remaining features of the training dataset, training the prediction model, generating the estimate of the target vector and inserting the synthetic feature into the synthetic training dataset.
The features may be provided as columns in a table.
The prediction model may include a bagging or boosting algorithm, such as a random forest prediction model or a gradient boosting tree model.
Generating the estimate y′i of the target vector yi may include running an inference on the prediction model using the set of training vectors X\i.
Generating the estimate y′i of the target vector yi may include generating an estimate y′i of the target vector yi by applying the prediction model as f(X\i)->y′i.
The method may further include appending the synthetic training dataset to the training dataset to form a hybrid training dataset and training a machine learning model using the hybrid training dataset. Appending the synthetic training dataset to the training dataset to form the hybrid training dataset and training the machine learning model may be performed in response to an indication from a master node in a federated learning system.
Training the machine learning model may include generating trained weights for a neural network, the method further including transmitting the trained weights to the master node.
The method may further include providing a preliminary training dataset, splitting the preliminary training dataset into the training dataset and a verification dataset before generating the synthetic training dataset, and verifying the neural network using the verification dataset.
The method may further include performing feature reduction on the preliminary training dataset before splitting the preliminary training dataset into the training dataset and the verification dataset.
The method may further include sorting the preliminary training dataset in descending order according to an importance of the features.
The method may further include computing a Kullback-Leibler divergence between the training dataset and the synthetic training dataset to determine a quality of the synthetic training dataset.
Generating the synthetic training dataset may be performed by a worker in a federated learning system.
The method may further include receiving a message from a master node in the federated learning system, wherein the message instructs the worker node to generate the synthetic training dataset, and generating the synthetic training dataset is performed in response to the message.
The method may further include generating a quality metric that represents a quality of the synthetic training dataset, and transmitting the quality metric to the master node.
A computing device according to some embodiments is configured to perform operations including selecting a feature ci of the original training dataset as a target vector yi, selecting remaining features of the original training dataset as a set of training input vectors X\i, where X\i includes all features of the training dataset other than a feature corresponding to the selected feature ci, and training a prediction model f(yi|X\i), generating an estimate y′i of the target vector yi by applying the prediction model to the set of training vectors X\i, and inserting a synthetic feature c′i corresponding to the estimate y′i of the target vector yi into a synthetic training dataset.
Some embodiments provide a computing device including a processing circuit, and a memory coupled to the processing circuit. The memory includes computer readable program instructions that, when executed by the processing circuit, cause the computing device to perform operations including selecting a feature ci of the original training dataset as a target vector yi, selecting remaining features of the original training dataset as a set of training input vectors X\i, where X\i includes all features of the training dataset other than a feature corresponding to the selected feature ci, and training a prediction model f(yi|X\i), generating an estimate y′i of the target vector yi by applying the prediction model to the set of training vectors X\i, and inserting a synthetic feature c′i corresponding to the estimate y′i of the target vector yi into a synthetic training dataset.
A computer program according to some embodiments includes program code to be executed by processing circuitry of a computing device, whereby execution of the program code causes a computing device to perform operations including selecting a feature ci of the original training dataset as a target vector yi, selecting remaining features of the original training dataset as a set of training input vectors X\i, where X\i includes all features of the training dataset other than a feature corresponding to the selected feature ci, and training a prediction model f(yi|X\i), generating an estimate y′i of the target vector yi by applying the prediction model to the set of training vectors X\i, and inserting a synthetic feature c′i corresponding to the estimate y′i of the target vector yi into a synthetic training dataset.
A computer program product according to some embodiments includes a non-transitory storage medium including program code to be executed by processing circuitry of a computing device, whereby execution of the program code causes the computing device to perform operations including selecting a feature ci of the original training dataset as a target vector yi, selecting remaining features of the original training dataset as a set of training input vectors X\i, where X\i includes all features of the training dataset other than a feature corresponding to the selected feature ci, and training a prediction model f(yi|X\i), generating an estimate y′i of the target vector yi by applying the prediction model to the set of training vectors X\i, and inserting a synthetic feature c′i corresponding to the estimate y′i of the target vector yi into a synthetic training dataset.
A computing device according to some embodiments includes a training dataset collection module that obtains a training dataset, the training dataset including a plurality of features, and a synthetic dataset generation module that generates a synthetic training dataset by performing operations including selecting a feature ci of the training dataset as a target vector yi, selecting remaining features of the training dataset as a set of training vectors X\i, where X\i includes all features of the training dataset other than feature ci, training a prediction model f(yi|X\i), generating an estimate y′i of the target vector yi by applying the prediction model to the set of training vectors X\i, and inserting a feature c′i corresponding to the estimate y′i of the target vector yi into the synthetic training dataset.
The synthetic dataset generation module may be further configured to perform operations including repeating, for a plurality of features of the original training dataset, operations of selecting a feature of the training dataset, selecting remaining features of the training dataset, training the prediction model, generating the estimate of the target vector and inserting the synthetic feature into the synthetic training dataset.
The features may be provided as columns in a table.
The computing device may further include a machine learning model training module that trains a machine learning model using the synthetic training dataset.
A method of operating a master in a federated learning system including a plurality of workers that communicate with the master via a message bus includes transmitting, via the message bus, a message to at least one of the workers instructing the at least one worker to generate synthetic training data, and receiving, via the message bus, model parameters of a machine learning model from the at least one worker that were generated using the synthetic tabular training data.
The model parameters received from the worker may include trained neural network weights.
The method may further include receiving from the at least one worker a set of preliminary neural network weights that were trained without using the synthetic training data, and evaluating the set of preliminary neural network weights. Transmitting the message to the at least one worker instructing the at least one worker to generate synthetic tabular training data may be performed in response to evaluating the set of preliminary neural network weights.
The method may further include, after instructing the at least one worker to generate the synthetic training data, receiving a quality metric from the at least one worker, wherein the quality metric measures a quality of the synthetic training dataset, and instructing the worker to proceed with training a machine learning model using the synthetic training dataset in response to the quality metric.
The machine learning model may include a neural network.
A master node in a federated learning system according to some embodiments is configured to perform operations of transmitting, via a message bus, a message to at least one of a plurality of workers instructing the at least one worker to generate synthetic training data, and receiving, via the message bus, model parameters of a machine learning model from the at least one worker that were generated using the synthetic tabular training data.
A master node in a federated learning system includes a processing circuit, and a memory coupled to the processing circuit, wherein the memory includes computer readable program instructions that, when executed by the processing circuit, cause the master node to perform operations of transmitting, via a message bus, a message to at least one of a plurality of workers instructing the at least one worker to generate synthetic training data, and receiving, via the message bus, model parameters of a machine learning model from the at least one worker that were generated using the synthetic tabular training data.
Some embodiments provide a computer program including program code to be executed by processing circuitry of a computing device, whereby execution of the program code causes a computing device to perform operations of transmitting, via a message bus, a message to at least one of a plurality of workers instructing the at least one worker to generate synthetic training data, and receiving, via the message bus, model parameters of a machine learning model from the at least one worker that were generated using the synthetic tabular training data.
A computer program product according to some embodiments includes a non-transitory storage medium including program code to be executed by processing circuitry of a computing device, whereby execution of the program code causes the computing device to perform operations of transmitting, via a message bus, a message to at least one of a plurality of workers instructing the at least one worker to generate synthetic training data, and receiving, via the message bus, model parameters of a machine learning model from the at least one worker that were generated using the synthetic tabular training data.
The initial training accuracy of workers in an FL system may be increased when synthetic data generated according to some embodiments is used for training. This may result in faster convergence with fewer training cycles, thereby reducing network footprint and/or energy consumption.
Inventive concepts will now be described more fully hereinafter with reference to the accompanying drawings, in which examples of embodiments of inventive concepts are shown. Inventive concepts may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of present inventive concepts to those skilled in the art. It should also be noted that these embodiments are not mutually exclusive. Components from one embodiment may be tacitly assumed to be present/used in another embodiment.
The following description presents various embodiments of the disclosed subject matter. These embodiments are presented as teaching examples and are not to be construed as limiting the scope of the disclosed subject matter. For example, certain details of the described embodiments may be modified, omitted, or expanded upon without departing from the scope of the described subject matter.
Synthetic data generation methods have been employed in the areas of audio, image and text synthesis. Synthetic data generation has been less successful in the area of tabular data generation, however. Given that the majority of QoE datasets are tabular and consist of continuous and discrete features with multi-modal or otherwise differing distributions, there is a need for improved synthetic data generation methods for generating synthetic tabular data.
Generative Adversarial Networks (GANs), which combine Neural Network models with concepts from game theory, are good candidates for synthetic data generation. GANs consist of a generative model and a discriminator model that are trained in turns. First, the discriminator model is trained with real samples to differentiate between real and synthetic data samples. Next, the generative model is trained to generate synthetic but realistic samples. The generated samples are fed into the discriminator model, and the goal of the generative model is to fool the discriminator model. The effectiveness of the generative model is measured by a generative loss, while the effectiveness of the discriminator model is measured by a discriminative loss. The generative model is trained such that the generative loss decreases while the discriminator model's loss increases. Many different GAN techniques exist that are capable of generating synthetic tabular datasets. For example, the CT-GAN technique has a pre-processing phase that consists of a variational Gaussian Mixture Model for mode detection of features, and a conditional generator model that prevents mode collapse.
TableGAN is another GAN-based method that is used to generate synthetic tables that are similar to the original (real) tables. GAN-based techniques are hard to train due to the known challenges of training a deep neural network, plus the additional challenges arising from the game-theoretic setup. In addition to the increased model engineering time, the training time and resource consumption during training are often demanding. The principle of a GAN is that both the discriminator and generative models should be just good enough that each can train the other well. If the generative model is too poor, it becomes difficult to generate realistic synthetic samples. Hence, it is typically preferred for the discriminator model to be good in the beginning, while the generative model performs poorly. Over time, the generative model becomes better while the discriminator model's performance becomes poorer, but only up to some point. If the discriminator is too poor, the generative model will be trained with noisy loss values, and the training will not benefit the generative model. The main challenge of the GAN-based approach is to monitor the performance of the models and decide when to stop training. If the training is not stopped at the right time, it could result in mode collapse.
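By way of illustration only, the alternating training scheme described above can be sketched as follows on a toy one-dimensional dataset; the framework (PyTorch), network sizes, and hyperparameters are arbitrary assumptions and are not part of the embodiments described herein.

```python
# Toy sketch of alternating GAN training (illustrative only; sizes and rates are assumptions).
import torch
import torch.nn as nn

real_data = torch.randn(1000, 1) * 2.0 + 5.0  # toy "real" one-dimensional distribution
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))                 # generative model
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())   # discriminator model
loss_fn = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)

for step in range(200):
    # Train the discriminator to separate real samples from generated ones.
    z = torch.randn(64, 8)
    fake = G(z).detach()
    real = real_data[torch.randint(0, len(real_data), (64,))]
    d_loss = loss_fn(D(real), torch.ones(64, 1)) + loss_fn(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Train the generative model to fool the discriminator.
    z = torch.randn(64, 8)
    g_loss = loss_fn(D(G(z)), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    # Monitoring d_loss / g_loss and deciding when to stop is the difficulty noted above.
```

Even in this toy setting, deciding when to stop the alternating loop requires monitoring both losses, which illustrates the stopping-criterion challenge noted above.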
These and other challenges are addressed by embodiments described herein, which provide a non-GAN-based approach to the generation of synthetic datasets.
Some embodiments described herein provide a method for generating a tabular synthetic QoE dataset that can be used to improve the performance of an existing local QoE model in a Federated Learning environment, which may reduce the network footprint over the communication channel between the collaborators. The method described herein may be described as a leave-one-out (LOO) algorithm for tabular synthetic data generation.
Federated training methods according to some embodiments are illustrated in
By increasing the size of the dataset used to train the model, the accuracy of the model may be improved. In a Federated Learning environment, increasing the initial model accuracy and/or decreasing the required number of iterations of communication between collaborators can improve model accuracy, decrease training time and/or reduce the network footprint needed for model training.
In particular, some embodiments provide systems and/or methods for generating a synthetic but realistic training tabular dataset in order to improve Neural Network (NN) based model performance in situations where available training data is limited. As a result, higher starting model performance may be achieved in a Federated Learning environment which can result in faster model convergence. Moreover, even when Federated Learning is not employed, the availability of synthetic data can improve the model performance in situations where suitable training data is otherwise difficult to obtain.
Some embodiments described herein provide a light-weight method for generating a synthetic tabular dataset that may contain a mix of continuous and categorical columns, using Boosting and/or Bagging methods that are known to perform well on structured tabular datasets.
In particular, some embodiments provide a method of generating a synthetic training dataset for training a machine learning model. The method may be referred to as a leave-one-out (LOO) method. In the LOO method, an initial (real) training dataset is provided in tabular form. The training dataset is provided in the form of a table that includes a plurality of columns that correspond to respective features of a system. According to some embodiments, each column ci of the training dataset is selected as a target vector yi. Remaining columns of the training dataset are selected as a set of training vectors X\i, where X\i includes all features of the training dataset other than a feature corresponding to the column ci. The method then trains a prediction model f(yi|X\i) that predicts values of the elements of target vector yi based on the training vectors X\i.
The method generates an estimate y′i of the target vector yi by applying the prediction model to the set of training vectors X\i. Once the estimate y′i of the target vector yi is generated, the method inserts a column c′i corresponding to the estimate y′i of the target vector yi into a synthetic training dataset.
This process is repeated until all of the columns of the training dataset have been reproduced with synthetic data in the synthetic training dataset. The synthetic dataset may be combined with the training dataset, effectively doubling the size of the training dataset.
Some embodiments described herein may have one or more advantages.
The initial training accuracy of workers in an FL system may be increased when synthetic data generated according to some embodiments is used for training. This may result in faster convergence with fewer training cycles, thereby reducing network footprint and/or energy consumption.
In non-FL cases, i.e., cases where neither sharing of data nor sharing of neural weights is possible, some embodiments can nevertheless provide additional training data based on a known data distribution. The accuracy of an isolated NN model may thereby be improved.
Since data is generated using a bagging or boosting model according to some embodiments, the approaches described herein can be considered light-weight in terms of computation time and model engineering effort as compared to GAN-based approaches, since GAN-based approaches require at least two NN models to be trained alternately.
Since the data generation used in some embodiments makes the training dataset somewhat noisy, a model trained with the generated dataset is expected to be more robust to over-fitting issues.
Finally, the use of generated synthetic tabular training data samples can address privacy issues that may arise when using training data.
Some potential advantages of using the LOO method of generating synthetic training data versus using a Generative Adversarial Network (GAN) or Recurrent Neural Network (RNN) to generate synthetic training data are summarized in Table 1 below.
Some embodiments provide systems/methods that perform synthetic tabular data generation referred to as leave-one-out (LOO). Some embodiments use a bagging or boosting algorithm, such as a Random Forest algorithm, to train and generate the synthetic dataset. A Light Gradient Boosting Tree algorithm may also be used, since such methods are known for high performance on tabular datasets. However, Gradient Boosting Machine (GBM) trees are trained sequentially and are therefore slower to train compared to bagging algorithms such as Random Forests. There are also more hyperparameters to tune in GBM models compared to RF models.
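As a non-limiting sketch of this model choice, a helper such as the hypothetical make_column_model below could select a Random Forest as the per-column prediction model; a gradient boosting tree model could be substituted at the cost of sequential training and additional hyperparameter tuning. The function name and hyperparameters are illustrative assumptions.

```python
# Illustrative per-column model choice (function name and hyperparameters are assumptions).
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

def make_column_model(is_categorical: bool):
    # A bagging model (Random Forest) is used here; a gradient boosting tree model
    # (e.g., LightGBM's LGBMRegressor/LGBMClassifier) could be substituted, at the cost of
    # sequential training and more hyperparameters to tune.
    if is_categorical:
        return RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=0)
    return RandomForestRegressor(n_estimators=100, n_jobs=-1, random_state=0)
```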
To generate a synthetic dataset S′ for testing, a tabular training dataset S containing real (non-synthetic) data is provided. Initially, the dimensionality of the real dataset may be decreased by reducing the number of input features. This may be accomplished by applying feature importance measurement (e.g., feature selection) techniques. The remaining features are then sorted with respect to feature importance in descending order from left to right, together with the labels.
The real dataset is then split into a training dataset and a test dataset. For example, the data may be randomly divided into a training dataset and a test dataset according to a predetermined percentage (e.g., 70% training, 30% test).
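A minimal sketch of this preprocessing, assuming a pandas DataFrame and scikit-learn, is shown below; the importance-based selection, the number of retained features, and the 70/30 split ratio are illustrative assumptions.

```python
# Sketch of importance-based feature reduction, importance sorting, and a random split.
# The number of retained features and the 70/30 split ratio are example values only.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

def reduce_and_split(df: pd.DataFrame, label: str, keep: int = 10, test_size: float = 0.3):
    X, y = df.drop(columns=[label]), df[label]
    rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
    importance = pd.Series(rf.feature_importances_, index=X.columns).sort_values(ascending=False)
    selected = importance.index[:keep].tolist()       # most important features, left to right
    reduced = pd.concat([X[selected], y], axis=1)     # label kept as the right-most column
    return train_test_split(reduced, test_size=test_size, random_state=0)
```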
Once the training dataset has been defined, a synthetic dataset is generated as illustrated in
Accordingly, referring to
First, set a column index i to 0.
On the training dataset S, select the column ci as a target vector yi (block 404). The remaining columns of the training dataset are selected to form a set of input vectors X\i (block 406). A model is then trained at block 408 that fits f(yi|X\i), where ci stands for the column i (indexed from left to right) that is selected as the target vector yi for this iteration. The "\" symbol denotes exclusion, such that "\i" refers to all column indices except for column index i. The model may, for example, be a bagging or boosting algorithm, such as a Random Forest model, or a Light Gradient Boosting Tree algorithm.
Next, the method runs an inference on the model using the same training set X\i to generate an estimate y′i of the target vector yi via f(X\i)->y′i (block 410).
The estimate y′i of the target vector yi is then appended as a new column in the synthetic dataset S′ (block 412).
The column index i is then incremented by 1 (i.e., the column index is shifted to the right), and the previous steps are repeated until all columns of the synthetic dataset S′ have been generated, i.e., until i=I−1; where I is the total number of columns in the original dataset.
Once all columns of the training dataset S have been processed, a tabular synthetic dataset S′ having the same size as the original training dataset S has been generated.
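A minimal sketch of the column-by-column loop described above (blocks 404-412) is given below, assuming a pandas DataFrame whose columns are continuous and sorted by importance; a classifier would be used for categorical columns. The library choices and hyperparameters are illustrative assumptions.

```python
# Minimal sketch of the leave-one-out (LOO) generation loop described above, assuming a
# pandas DataFrame S whose columns are continuous and sorted by importance. For a
# categorical column, a classifier (e.g., RandomForestClassifier) would be used instead.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

def generate_loo_synthetic(S: pd.DataFrame) -> pd.DataFrame:
    S_prime = pd.DataFrame(index=S.index)
    for i, c_i in enumerate(S.columns):          # column index i = 0 .. I-1, left to right
        y_i = S[c_i]                             # target vector y_i (block 404)
        X_not_i = S.drop(columns=[c_i])          # all columns except c_i (block 406)
        model = RandomForestRegressor(n_estimators=100, random_state=i)
        model.fit(X_not_i, y_i)                  # train f(y_i | X_\i) (block 408)
        S_prime[c_i] = model.predict(X_not_i)    # inference f(X_\i) -> y'_i, appended as c'_i (blocks 410-412)
    return S_prime                               # same shape as the original training dataset S
```

For example, S_prime = generate_loo_synthetic(train_df) would produce a synthetic table with the same columns and number of rows as train_df.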
The method may then compute a quality metric that measures a quality of the synthetic training dataset S′, such as the Kullback-Leibler (KL) divergence between the original training dataset S and the synthetic dataset S′, to make sure the values are neither too small (e.g., close to 0) nor too big. A too-small distance score may indicate that the model is not creating data samples that are far enough from the original dataset, while a too-large distance may indicate that the generated dataset is very different from the original dataset.
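One possible, non-limiting way to compute such a per-column KL-based quality metric using histograms is sketched below; the bin count and the acceptance thresholds are illustrative assumptions.

```python
# One possible per-column quality check: a histogram-based Kullback-Leibler divergence
# between the original dataset S and the synthetic dataset S'. The bin count and the
# acceptance thresholds are illustrative assumptions.
import numpy as np
import pandas as pd
from scipy.stats import entropy

def kl_per_column(S: pd.DataFrame, S_prime: pd.DataFrame, bins: int = 20) -> pd.Series:
    scores = {}
    for col in S.columns:
        combined = np.concatenate([S[col].to_numpy(), S_prime[col].to_numpy()])
        edges = np.histogram_bin_edges(combined, bins=bins)
        p, _ = np.histogram(S[col], bins=edges, density=True)
        q, _ = np.histogram(S_prime[col], bins=edges, density=True)
        scores[col] = entropy(p + 1e-9, q + 1e-9)   # KL(p || q), smoothed to avoid zero bins
    return pd.Series(scores)

def acceptable(kl_scores: pd.Series, low: float = 0.01, high: float = 2.0) -> bool:
    # Neither too close to 0 (synthetic samples too similar) nor too large (too different).
    return bool(((kl_scores > low) & (kl_scores < high)).all())
```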
The synthetic dataset may then be appended to the original dataset to provide a larger combined training dataset. The larger combined training dataset may be used to train a NN model.
Finally, the performance of the trained model may be evaluated on the real test set.
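The final steps of appending S′ to S, training a neural network on the combined set, and evaluating on the real test set might be sketched as follows; the MLP architecture and the reported metrics (MAE, R2) are illustrative assumptions.

```python
# Sketch of the final steps: append S' to S, train a neural network on the hybrid set, and
# evaluate on the held-out real test set. Architecture and metrics are illustrative choices.
import pandas as pd
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error, r2_score

def train_and_evaluate(S: pd.DataFrame, S_prime: pd.DataFrame, test_df: pd.DataFrame, label: str):
    hybrid = pd.concat([S, S_prime], ignore_index=True)          # roughly doubles the training set
    X_train, y_train = hybrid.drop(columns=[label]), hybrid[label]
    X_test, y_test = test_df.drop(columns=[label]), test_df[label]
    nn = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=1000, random_state=0)
    nn.fit(X_train, y_train)
    pred = nn.predict(X_test)
    return mean_absolute_error(y_test, pred), r2_score(y_test, pred)
```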
Some embodiments described herein may be advantageously implemented to improve the operation of a Federated Learning system. For example, in some embodiments, referring again to
For example,
At block 506, the worker 200-N generates a synthetic training dataset S′ using, for example, the methods described herein. The worker 200-N generates a quality metric for the synthetic dataset, such as the KL divergence of the synthetic training dataset, and transmits the quality metric to the master 100 in a message 507. The master evaluates the quality metric and, in this example, decides that the worker 200-N should proceed with using the synthetic training dataset S′, which it indicates to the worker 200-N in a message 509.
The worker 200-N then combines the real training dataset S with the synthetic training dataset S′ at block 510 and trains the ML model using the combined dataset (S, S′) at block 512.
The worker 200-N then sends the re-trained weights to the master 100 in a message 513. The master 100 combines the re-trained weights with trained weights from other workers 200 at block 514, and transmits the combined weights to the workers 200 in a message 515.
It will be appreciated that all or only a subset of the workers 200 may be instructed to generate and use synthetic data in any given FL system in various embodiments. The decision to require synthetic data may depend on the quality of weights provided by a given worker and/or on other considerations. For example, in some embodiments, the master 100 may set a limit on the number of training iterations allowed and may require the use of synthetic training data to ensure that the number of training iterations stays under the limit.
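Purely as a hypothetical sketch of such master-side logic, the following shows one way a master could decide, per worker, whether to request synthetic data generation based on evaluated preliminary weights and an iteration budget; the function and instruction names, thresholds, and the evaluate_weights helper are all assumptions and do not correspond to any specific framework.

```python
# Hypothetical sketch of master-side orchestration: decide, per worker, whether to request
# synthetic data generation. The thresholds, instruction strings, and the evaluate_weights
# helper are assumptions, not part of any specific framework.
ACCURACY_THRESHOLD = 0.7   # assumed minimum acceptable starting accuracy
MAX_ITERATIONS = 20        # assumed limit on federated training rounds

def plan_round(worker_weights, evaluate_weights, current_iteration):
    """worker_weights: dict mapping worker id -> preliminary neural-network weights."""
    instructions = {}
    for worker_id, weights in worker_weights.items():
        score = evaluate_weights(weights)                          # e.g., validation accuracy
        if score < ACCURACY_THRESHOLD and current_iteration < MAX_ITERATIONS:
            instructions[worker_id] = "generate_synthetic_data"    # worker should augment its data first
        else:
            instructions[worker_id] = "train_and_send_weights"     # worker trains and returns weights
    return instructions
```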
Operations according to some embodiments are illustrated in
As shown, the device 200 includes a communication interface 32 (also referred to as a network interface) configured to provide communications with other devices. The device 200 also includes a processor circuit 34 (also referred to as a processor) and a memory circuit 36 (also referred to as memory) coupled to the processor circuit 34. According to other embodiments, processor circuit 34 may be defined to include memory so that a separate memory circuit is not required.
As discussed herein, operations of the device 200 may be performed by processing circuit 34 and/or communication interface 32. For example, the processing circuit 34 may control the communication interface 32 to transmit communications through the communication interface 32 to one or more other devices and/or to receive communications through the network interface from one or more other devices. Moreover, modules may be stored in memory 36, and these modules may provide instructions so that when instructions of a module are executed by processing circuit 34, processing circuit 34 performs respective operations (e.g., operations discussed herein with respect to example embodiments).
As shown, the device 100 includes a communication interface 42 (also referred to as a network interface) configured to provide communications with other devices. The device 100 also includes a processor circuit 44 (also referred to as a processor) and a memory circuit 46 (also referred to as memory) coupled to the processor circuit 44. According to other embodiments, processor circuit 44 may be defined to include memory so that a separate memory circuit is not required.
As discussed herein, operations of the device 100 may be performed by processing circuit 44 and/or communication interface 42. For example, the processing circuit 44 may control the communication interface 42 to transmit communications through the communication interface 42 to one or more other devices and/or to receive communications through the network interface from one or more other devices. Moreover, modules may be stored in memory 46, and these modules may provide instructions so that when instructions of a module are executed by processing circuit 44, processing circuit 44 performs respective operations (e.g., operations discussed herein with respect to example embodiments).
The methods described herein are applied to multiple datasets with varying training data sizes to gather indicative results. For dataset 1 (the KPI Degradation Prediction use case), over 100 iterations are performed with different original training datasets. The training set and the test set sizes are given in Table 2.
The KPI degradation use case dataset consists of 41 features (after further dimensionality reduction). In order to evaluate the model, four experiments are performed, each with different data sizes, where a bagging or boosting algorithm is used during synthetic tabular data generation.
Two evaluation methods are used to quantify the benefits of the LOO approach described herein, namely, MAE (Mean Absolute Error) and Area Under the ROC Curve (AUC). FIGS. 9(a) and 9(b) illustrate the AUC and MAE scores obtained from the ML models trained with the original training set and with the synthetic & original training set. The figures indicate that the ML model performance can be further improved with larger amounts of synthetically generated training data samples, up to a certain extent.
The blue and the orange curves in
To evaluate whether the LOO approach works on other datasets, a publicly available QoE dataset is chosen. Indicative features related to spatio-temporal video quality, such as stalling events, presentation video resolution and bitrate, were extracted from the dataset. All columns in the dataset are quasi-continuous. The dataset consists of 9 indicative/descriptive QoE features, such as initial bitrate, mean bitrate, number of stalling events, etc. The target variable (the label) is a MOS score on an ABR scale (0-100), where higher scores indicate better QoE than lower ones. In total there are 450 samples in the whole dataset.
Experiments are performed with varying training set sizes in the range between 50 and 220. The training set size could not be increased beyond 220 due to the need to allocate test and validation sets from a total of 450 samples. When the requested training set size is set to 50, 50 more samples are generated, hence the training set size becomes twice as large (50+50=100). Next, a neural network is trained with this larger number of samples, which is a blend of real and synthetic samples. The results from the experiments are given in the table below; overall, the accuracy of the model is increased and the prediction error is decreased when a larger training set is used.
The experiments are repeated 50 times to be able to compare whether the proposed solution works or not within the confidence intervals.
The model accuracy (R2 score) and the MAE are improved when the synthetically generated tabular dataset is used as part of the training set for the model, as given in Table 3. The MAE decreased and the R2 score increased. The computation time of the LOO approach is given in Table 4. The computation time increases linearly with the number of features (see
For simpler datasets, the model that is used in data generation, i.e., the Random Forest, can be further simplified and replaced with a Lasso Regression, i.e., a linear regression with L1 regularization, to reduce the chance of overfitting.
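A minimal sketch of this substitution, assuming scikit-learn, is shown below; the regularization strength is an illustrative assumption.

```python
# For simpler datasets, the per-column Random Forest can be replaced with Lasso regression
# (linear regression with L1 regularization); the alpha value is an illustrative assumption.
from sklearn.linear_model import Lasso

def make_simple_column_model(alpha: float = 0.1):
    return Lasso(alpha=alpha)
```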
Hyperparameters of the GAN model used in the experiments are shown in Table 6.
The LOO approach can be integrated into the existing FL proof-of-concept (POC) framework as an additional function, "generate_data( )", which may be invoked instead of "train_send( )". The master node can orchestrate a worker node to generate a synthetic dataset in order to improve its starting isolated-learning accuracy. This is applicable in cases where the worker does not contribute to the federation (i.e., to the overall model accuracy during FL) due to its noisy starting model, and is instead asked to generate its own data up to a certain accuracy threshold. This may apply for a temporary period of time, for instance where a worker does not have enough data to train and join the federation in the early phases, and hence is first asked to generate data up to a certain quality and quantity, and only after that is allowed to join the federation as shown in
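A heavily hedged sketch of how a worker might expose such a generate_data( ) function alongside train_send( ) is shown below; only those two function names come from the description above, while the class structure, the injected generation and quality functions, and the gating thresholds are assumptions.

```python
# Hedged sketch of a worker exposing generate_data() alongside train_send(). Only these two
# function names come from the description; the class, the injected generate_fn/quality_fn,
# and the gating thresholds are assumptions.
class Worker:
    def __init__(self, real_data, generate_fn, quality_fn, min_rows=200, max_quality=2.0):
        self.real_data = real_data
        self.generate_fn = generate_fn      # e.g., an LOO generator such as the one sketched above
        self.quality_fn = quality_fn        # e.g., a KL-based quality metric
        self.synthetic = None
        self.min_rows = min_rows            # assumed minimum data quantity before joining the federation
        self.max_quality = max_quality      # assumed upper bound on the quality metric

    def generate_data(self):
        """Invoked by the master instead of train_send() until data quality/quantity suffices."""
        self.synthetic = self.generate_fn(self.real_data)
        return self.quality_fn(self.real_data, self.synthetic)    # reported back to the master

    def ready_to_join(self, quality_score):
        rows = len(self.real_data) + (len(self.synthetic) if self.synthetic is not None else 0)
        return rows >= self.min_rows and quality_score <= self.max_quality

    def train_send(self):
        """Train the local model on real + synthetic data and return weights (omitted in this sketch)."""
        raise NotImplementedError
```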
Referring to
The LOO approach is also applicable in cases where the worker does not benefit from the overall FL model being learned collaboratively. This may happen when a worker's data distribution is significantly different from the data distributions of the majority of the other workers. In that case, the worker can choose to generate more data from its own distribution, and can still improve its model performance while being trained in an isolated manner. An example case, even if worker n in
In the above-description of various embodiments of present inventive concepts, it is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of present inventive concepts. Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which present inventive concepts belong. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of this specification and the relevant art.
When an element is referred to as being “connected”, “coupled”, “responsive”, or variants thereof to another element, it can be directly connected, coupled, or responsive to the other element or intervening elements may be present. In contrast, when an element is referred to as being “directly connected”, “directly coupled”, “directly responsive”, or variants thereof to another element, there are no intervening elements present. Like numbers refer to like elements throughout. Furthermore, “coupled”, “connected”, “responsive”, or variants thereof as used herein may include wirelessly coupled, connected, or responsive. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. Well-known functions or constructions may not be described in detail for brevity and/or clarity. The term “and/or” includes any and all combinations of one or more of the associated listed items.
It will be understood that although the terms first, second, third, etc. may be used herein to describe various elements/operations, these elements/operations should not be limited by these terms. These terms are only used to distinguish one element/operation from another element/operation. Thus, a first element/operation in some embodiments could be termed a second element/operation in other embodiments without departing from the teachings of present inventive concepts. The same reference numerals or the same reference designators denote the same or similar elements throughout the specification.
As used herein, the terms “comprise”, “comprising”, “comprises”, “include”, “including”, “includes”, “have”, “has”, “having”, or variants thereof are open-ended, and include one or more stated features, integers, elements, steps, components, or functions but does not preclude the presence or addition of one or more other features, integers, elements, steps, components, functions, or groups thereof.
Example embodiments are described herein with reference to block diagrams and/or flowchart illustrations of computer-implemented methods, apparatus (systems and/or devices) and/or computer program products. It is understood that a block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions that are performed by one or more computer circuits. These computer program instructions may be provided to a processor circuit of a general purpose computer circuit, special purpose computer circuit, and/or other programmable data processing circuit to produce a machine, such that the instructions, which execute via the processor of the computer and/or other programmable data processing apparatus, transform and control transistors, values stored in memory locations, and other hardware components within such circuitry to implement the functions/acts specified in the block diagrams and/or flowchart block or blocks, and thereby create means (functionality) and/or structure for implementing the functions/acts specified in the block diagrams and/or flowchart block(s).
These computer program instructions may also be stored in a tangible computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions which implement the functions/acts specified in the block diagrams and/or flowchart block or blocks. Accordingly, embodiments of present inventive concepts may be embodied in hardware and/or in software (including firmware, resident software, micro-code, etc.) that runs on a processor such as a digital signal processor, which may collectively be referred to as “circuitry,” “a module” or variants thereof.
It should also be noted that in some alternate implementations, the functions/acts noted in the blocks may occur out of the order noted in the flowcharts. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Moreover, the functionality of a given block of the flowcharts and/or block diagrams may be separated into multiple blocks and/or the functionality of two or more blocks of the flowcharts and/or block diagrams may be at least partially integrated. Finally, other blocks may be added/inserted between the blocks that are illustrated, and/or blocks/operations may be omitted without departing from the scope of inventive concepts. Moreover, although some of the diagrams include arrows on communication paths to show a primary direction of communication, it is to be understood that communication may occur in the opposite direction to the depicted arrows.
Many variations and modifications can be made to the embodiments without substantially departing from the principles of the present inventive concepts. All such variations and modifications are intended to be included herein within the scope of present inventive concepts. Accordingly, the above disclosed subject matter is to be considered illustrative, and not restrictive, and the examples of embodiments are intended to cover all such modifications, enhancements, and other embodiments, which fall within the spirit and scope of present inventive concepts. Thus, to the maximum extent allowed by law, the scope of present inventive concepts are to be determined by the broadest permissible interpretation of the present disclosure including the examples of embodiments and their equivalents, and shall not be restricted or limited by the foregoing detailed description.
Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/SE2021/050172 | 3/2/2021 | WO |

Number | Date | Country
---|---|---
62984090 | Mar 2020 | US