METHOD AND DEVICE FOR PRESERVING PRIVACY OF LINEAR REGRESSION DISTRIBUTED LEARNING

Information

  • Patent Application
  • Publication Number: 20230421371
  • Date Filed: November 02, 2021
  • Date Published: December 28, 2023
Abstract
The present disclosure relates to a method and device for preserving privacy of linear regression distributed learning, in particular for preserving privacy of linear regression distributed learning using a LASSO (least absolute shrinkage and selection operator)-VAR (vector autoregressive) model, further in particular for preserving privacy of distributed learning using a LASSO-VAR model using convex optimization like alternating direction method of multipliers (ADMM) or using coordinate descent optimization. It is disclosed a device and method for preserving privacy of a linear regression model used in distributed learning by a set of agents indirectly sharing covariate data or target data when sharing coefficient matrixes for said model, comprising obtaining a perturbation matrix for encrypting the data being shared.
Description
TECHNICAL FIELD

The present disclosure relates to a method and device for preserving privacy of linear regression distributed learning, in particular for preserving privacy of linear regression distributed learning using a LASSO (least absolute shrinkage and selection operator)-VAR (vector autoregressive) model, further in particular for preserving privacy of distributed learning using a LASSO-VAR model using convex optimization like alternating direction method of multipliers (ADMM) or using coordinate descent optimization.


BACKGROUND

The forecasting skill of renewable energy sources (RES) has been improved in the past two decades through R&D activities across the complete model chain, i.e., from numerical weather predictions (NWP) to statistical learning methods that convert weather variables into power forecasts [1]. The need to bring forecasting skill to significantly higher levels is widely recognized in the majority of roadmaps that deal with high RES integration scenarios for the next decades. This is expected not only to facilitate RES integration in the system operation and electricity markets but also to reduce the need for flexibility and associated investment costs on remedies that aim to hedge RES variability and uncertainty like storage, demand response, and others.


In this context, intraday and hour-ahead electricity markets are becoming increasingly important to handle RES uncertainty, and thus accurate hours-ahead forecasting methods are essential. Recent findings showed that feature engineering, combined with statistical learning models, can extract relevant information from spatially distributed weather and RES power time series and improve hours-ahead forecasting skill. Indeed, for very short-term time horizons (from 15 minutes to 6 hours ahead), the vector autoregressive (VAR) model, when compared to univariate time series models, has shown competitive results for wind and solar power forecasting.


The VAR model forecasts the power output of multiple RES power plants by linearly combining their historical (or past) power values. Four important challenges for RES forecasting have been identified when using VAR: (a) sparse structure of the coefficients' matrix, (b) uncertainty forecasting, (c) distributed and online learning, and (d) data privacy. The focus of the present disclosure is on (d), for which a recent review showed that additional research is needed to develop robust techniques for privacy-preserving forecasting [2].


Sparse structure of VAR coefficients is important to produce interpretable models in terms of spatial and temporal dependency, and also to avoid noisy estimates and unstable forecasts. Sparsity can be induced by methods such as LASSO (Least Absolute Shrinkage and Selection Operator) [3] or partial spectral coherence together with Bayesian information criterion, among others.


Uncertainty forecasts can be generated with different models, such as non-parametric quantile regression or a semiparametric approach that transforms power data with the logit-normal distribution [1].


Distributed learning can be based on convex optimization using the alternating direction method of multipliers (ADMM) [4]. For example, ADMM can be used for distributed learning of LASSO-VAR applied to wind power forecasting [3]. Online learning, with online ADMM and adaptive mirror descent algorithms, is proposed in [5] for high-dimensional autoregressive models with exogenous inputs (AR-X).


Data privacy is a critical barrier to the application of collaborative RES forecasting models. Even though spatio-temporal time series models offer forecasting skill improvement and the possibility of implementing distributed learning schemes (like in [3]), the lack of a privacy-preserving mechanism makes data owners unwilling to cooperate. The VAR model fails to provide data privacy because the covariates are the lags of the target variable of each RES site, which means that agents (or data owners) cannot provide covariates without also providing their target (power measurements) variables.


Zhang and Wang described a privacy-preserving approach for wind power forecasting with off-site time series, which combined ridge linear quantile regression with ADMM [6]. However, privacy with ADMM is not always guaranteed, since it requires intermediate calculations, allowing curious competitors to recover the data after a number of iterations [2]. Moreover, the central node can also recover the original and private data. For the online learning algorithms in [5], Sommer et al. considered an encryption layer, which consists of multiplying the data by a random matrix. However, the focus of this work was not data privacy (but rather online learning), and the private data are revealed to the central agent who performs intermediary computations. Berdugo et al. described a method based on local and global analog-search (i.e., template matching) that uses solar power time series from neighboring sites [7]. However, as recognized by the authors, the main goal is not to produce the forecast with minimum error, but rather to keep power measurements private, since each site only receives reference timestamps and normalized weights of the analogs identified by its neighbors; note that the concept of neighborhood is also not defined.


More generally, a critical analysis of privacy-preserving techniques for VAR has grouped these techniques as (a) data transformation, (b) secure multi-party computation, and (c) decomposition-based methods [2]. The main conclusions were that data transformation requires a trade-off between privacy and accuracy, secure multi-party computations either result in computationally demanding techniques or do not fully preserve privacy in VAR models, and that decomposition-based methods rely on iterative processes and after a number of iterations, the agents will have enough information to recover private data.


There is thus a need for a privacy-preserving distributed learning framework where original data cannot be recovered by a central agent or peers (this represents a more robust approach compared to the ADMM implementation in [5], [6]), without decreasing forecasting skill, where asynchronous communication between peers is addressed both in the model fitting and operational phases, and where a flexible collaborative model can be implemented with centralized communication with a neutral node or peer-to-peer (P2P) communication.


There is thus also a need for a privacy-preserving distributed learning framework apt for VAR and LASSO-VAR models, in particular when applied to renewable energy sources (RES) power forecasting, in particular wind and solar power forecasting.


These facts are disclosed in order to illustrate the technical problem addressed by the present disclosure.


REFERENCES





    • [1] R. J. Bessa, C. Mohrlen, V. Fundel, M. Siefert, J. Browell, S. H. E. Gaidi, B.-M. Hodge, U. Cali, and G. Kariniotakis, “Towards improved understanding of the applicability of uncertainty forecasts in the electric power industry,” Energies, vol. 10, no. 9, p. 1402, Sep. 2017.

    • [2] C. Gonçalves, R. J. Bessa, and P. Pinson, “A critical overview of privacy-preserving approaches for collaborative forecasting,” International Journal of Forecasting, In Press, 2020.

    • [3] L. Cavalcante, R. J. Bessa, M. Reis, and J. Browell, “LASSO vector autoregression structures for very short-term wind power forecasting,” Wind Energy, vol. 20, no. 4, pp. 657-675, Apr. 2017.

    • [4] S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Eckstein et al., “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends® in Machine Learning, vol. 3, no. 1, pp. 1-122, 2011.

    • [5] B. Sommer, P. Pinson, J. Messner, and D. Obst, “Online distributed learning in wind power forecasting,” International Journal of Forecasting, In Press, 2020.

    • [6] Y. Zhang and J. Wang, “A distributed approach for wind power probabilistic forecasting considering spatio-temporal correlation without direct access to off-site information,” IEEE Trans. on Power Systems, vol. 33, no. 5, pp. 5714-5726, Sep. 2018.

    • [7] V. Berdugo, C. Chaussin, L. Dubus, G. Hebrail, and V. Leboucher, “Analog method for collaborative very-short-term forecasting of power generation from photovoltaic systems,” in Proc. Next Gener. Data Min. Summit, Greece, Sep. 2011, pp. 1-5.

    • [8] C. Dwork and A. Smith, “Differential privacy for statistics: What we know and what we want to learn,” Journal of Privacy and Confidentiality, vol. 1, no. 2, pp. 135-154, Apr. 2010.

    • [9] T. Zhang and Q. Zhu, “Dynamic differential privacy for ADMM based distributed classification learning,” IEEE Trans. on Information Forensics and Security, vol. 12, no. 1, pp. 172-187, Jan. 2017.

    • [10] W. B. Nicholson, D. S. Matteson, and J. Bien, “Structured regularization for large vector autoregressions,” Cornell University, 2014.





General Description

The present disclosure relates to a method and device for preserving privacy of linear regression distributed learning, in particular for preserving privacy of linear regression distributed learning using a LASSO (least absolute shrinkage and selection operator)-VAR (vector autoregressive) model, further in particular for preserving privacy of distributed learning using a LASSO-VAR model using convex optimization like alternating direction method of multipliers (ADMM) or using coordinate descent optimization.


It is presently disclosed a privacy-preserving distributed VAR method comprising the following contributions: (a) combination of data transformation and decomposition-based methods so that the VAR model is fitted in another feature space, without decreasing the forecasting skill and in a way that the original data cannot be recovered by the central agent or peers (this represents a more robust approach compared to the ADMM implementations in [5], [6]); (b) asynchronous communication between peers is addressed in both the model fitting and operational phases; (c) a flexible collaborative model that can implement two different schemes: centralized communication with a neutral node, and peer-to-peer (P2P) communication (which was not covered by [5], [6]).


As discussed, concerns about data privacy inhibit the communication and sharing of data between companies and third parties, impairing the accuracy of current forecasting methods. The present disclosure tackles the data privacy problem of linear regression-based methods by using an equivalent linear system with specific dimensions. The present disclosure describes using an encryption matrix or matrices which transform the original data into an equivalent linear system. The construction of these matrices is essential to obtain data privacy. The present disclosure is ready to work with vertical database partitioning, which is more difficult to protect, since each entity records different parameters (variables).


Because of the data structure, a protocol is proposed to define the encryption matrix or matrices, unknown by the agents but at the same time built by all agents. This protocol does not assume the existence of third parties. The present disclosure allows collaboration using both a centralized model (with entities sharing data with a neutral third party) and a decentralized model (whereby no third party is required). The present disclosure can, at least, be applied to solve the regression problems of linear regression (e.g., the ordinary least squares estimator), ridge linear regression, or LASSO linear regression through the ADMM algorithm [4] or a coordinate descent-based algorithm [10].


The following discusses the distributed learning framework that enables different agents or data owners (e.g., RES power plant, market players, forecasting service providers) to exploit geographically distributed time series data (power and/or weather measurements, NWP, etc.) and improve forecasting skill while keeping data private. In this context, data privacy can either refer to commercially sensitive data from grid connected RES power plants or personal data (e.g., under European Union General Data Protection Regulation) from households with RES technology. Distributed learning (or collaborative forecasting) means that instead of sharing their data, learning problems for model fitting are solved in a distributed manner. Two collaborative schemes (depicted in FIG. 1) are possible: centralized communication with a central node (central hub model) and peer-to-peer communication (P2P model).


In the central hub model, the scope of the calculations performed by the agents is limited by their local data, and the information transmitted to the central node relates to functions and statistics of that data. The central node is responsible for combining these local estimators and, when considering iterative solvers like ADMM, coordinating the individual optimization processes to solve the main optimization problem. The communication scheme fits in the following business models:

    • Transmission or distribution system operator (TSO or DSO) operating the collaborative platform as a central node, fostering collaboration between competitive RES power plants to improve the forecasting skill and reduce system balancing costs. Moreover, TSO or DSO can use this model to produce hierarchical forecasts (grid node, region, etc.) that use private measurements from RES power plants. The advanced metering infrastructure of Smart Grids can also feed the collaborative platform and bring additional benefits to agents.
    • Forecasting service provider that hosts the central node and makes available APIs and protocols for information (not data) exchange during model fitting and receives a payment for this service. Two examples are: SingularityNET (singularitynet.io) as an open-source protocol and collection of smart contracts for a decentralized market of data services; Ocean Protocol (oceanprotocol.com) as an ecosystem for sharing data and associated services.


In the P2P model, agents equally conduct a local computation of their estimators, but share their information with peers, meaning that each agent acts as both agent and central node. While P2P tends to be more robust (i.e., there is no single point of failure), it is usually difficult to make it as efficient as the central hub model in terms of communication costs—when considering n agents, each agent communicates with the remaining n−1.


The P2P model is suitable for data owners that do not want to rely (or trust) upon a neutral agent. Potential business models are related to P2P forecasting between prosumers or RES power plants, as well as to Smart Cities characterized by an increasing number of sensors and devices installed at houses, buildings, and transportation network.


In order to make these collaborative schemes feasible, the following fundamental principles must be respected: (a) ensure improvement (compared to a scenario without collaboration) in forecasting skill; (b) guarantee data privacy, i.e., agents and central node cannot have access or recover original data; (c) consider synchronous and asynchronous communication between agents. The formulation that will be described in the present disclosure fully guarantees these three core principles.


The following describes the VAR models, as well as the most common model fitting algorithms as used in the present disclosure. Throughout this disclosure, matrices are represented by bold uppercase letters, vectors by bold lowercase letters and scalars by lowercase letters. Also, a=[a1, a2] represents a column vector, while the column-wise operation between two vectors or matrices is denoted as [a, b] or [A, B], respectively.


The following describes the VAR model formulation. Let {y_t}_{t=1}^T be an n-dimensional multivariate time series, where n is the number of data owners. Then, {y_t}_{t=1}^T follows a VAR model with p lags, denoted by VAR_n(p), when

$$y_t = \eta + \sum_{l=1}^{p} y_{t-l} B^{(l)} + \varepsilon_t \qquad (1)$$

for t = 1, . . . , T, where η = [η_1, . . . , η_n] is the constant intercept (row) vector, η ∈ ℝ^n; B^{(l)} represents the coefficient matrix at lag l = 1, . . . , p, B^{(l)} ∈ ℝ^{n×n}, and the coefficient associated with lag l of time series i, to estimate time series j, is at position (i, j) of B^{(l)}, for i, j = 1, . . . , n; and ε_t = [ε_{1,t}, . . . , ε_{n,t}], ε_t ∈ ℝ^n, denotes a white noise vector that is independent and identically distributed with mean zero and nonsingular covariance matrix. For simplicity, y_t is assumed to follow a centered process, η = 0, i.e., η is a vector of zeros of appropriate dimension. A VAR_n(p) model can be written in matrix form as











$$Y = ZB + E, \quad \text{where} \quad Y = \begin{bmatrix} y_1 \\ \vdots \\ y_T \end{bmatrix},\; B = \begin{bmatrix} B^{(1)} \\ \vdots \\ B^{(p)} \end{bmatrix},\; Z = \begin{bmatrix} z_1 \\ \vdots \\ z_T \end{bmatrix},\; E = \begin{bmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_T \end{bmatrix} \qquad (2)$$

are obtained by joining the vectors row-wise, and define, respectively, the T×n response matrix, the np×n coefficient matrix, the T×np covariate matrix and the T×n error matrix, with z_t = [y_{t−1}, . . . , y_{t−p}].


The following describes the VAR model estimation as used in the present disclosure.


Usually, when the number of covariates, np, is substantially smaller than the records, T, the VAR model is estimated through the multivariate least squares,












$$\hat{B}^{LS} = \arg\min_{B} \left( \lVert Y - ZB \rVert_2^2 \right), \qquad (3)$$

where ∥·∥_r represents both vector and matrix L_r norms. However, as the number of data owners increases, as well as the number of lags, it becomes indispensable to use regularization techniques, such as LASSO, aiming to introduce sparsity into the coefficient matrix estimated by the model. In the standard LASSO-VAR approach, the coefficients are estimated by











$$\hat{B} = \arg\min_{B} \left( \lVert Y - ZB \rVert_2^2 + \lambda \lVert B \rVert_1 \right), \qquad (4)$$

where λ > 0 is a scalar penalty parameter.


The LASSO regularization term makes the objective function in (4) non-differentiable, limiting the variety of optimization techniques that can be employed. In this domain, ADMM is a popular and computationally efficient technique allowing parallel estimation for data divided by records or features, which is an appealing property when designing a privacy preserving approach.

    • (1) Standard ADMM and LASSO-VAR: The ADMM solution for (4) is obtained by splitting the B variable into two variables (B and H) and adding the constraint H=B,











$$\hat{B} = \arg\min_{B} \left( \lVert Y - ZB \rVert_2^2 + \lambda \lVert H \rVert_1 \right) \quad \text{subject to} \quad H = B, \qquad (5)$$
Then, based on the augmented Lagrangian of (5), the solution is provided by the following system of equations—see [3],






$$B^{k+1} = (Z^T Z + \rho I)^{-1} (Z^T Y + \rho (H^k - U^k)) \qquad (6a)$$

$$H^{k+1} = S_{\lambda/\rho}(B^{k+1} + U^k) \qquad (6b)$$

$$U^{k+1} = U^k + B^{k+1} - H^{k+1} \qquad (6c)$$

where U is the scaled dual variable associated with the constraint H = B, I is the identity matrix with proper dimension, and S_{λ/ρ} is the soft thresholding operator.
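By way of illustration only, the following minimal Python sketch runs the iterations (6a)-(6c) for the LASSO problem (4); the matrix sizes, the fixed iteration count and the random toy data are assumptions made for the example, not part of the disclosure.

```python
import numpy as np

def soft_threshold(X, tau):
    """Element-wise soft-thresholding operator S_tau used in (6b)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def admm_lasso(Z, Y, lam, rho=1.0, n_iter=200):
    """Iterate (6a)-(6c) for argmin_B ||Y - Z B||_2^2 + lam ||B||_1."""
    q, n = Z.shape[1], Y.shape[1]
    B, H, U = (np.zeros((q, n)) for _ in range(3))
    ZtZ_rho = Z.T @ Z + rho * np.eye(q)   # fixed across iterations
    ZtY = Z.T @ Y
    for _ in range(n_iter):
        B = np.linalg.solve(ZtZ_rho, ZtY + rho * (H - U))   # (6a)
        H = soft_threshold(B + U, lam / rho)                # (6b)
        U = U + B - H                                       # (6c)
    return H  # the sparse iterate

# Toy usage: 3 series, 2 lags each (q = 6 covariates), 500 records.
rng = np.random.default_rng(0)
Z = rng.standard_normal((500, 6))
Y = rng.standard_normal((500, 3))
B_hat = admm_lasso(Z, Y, lam=5.0)
```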

    • (2) Distributed ADMM and LASSO-VAR: When defining a VAR model, each time series is collected by a specific data owner, meaning that data are split by features, i.e., Y = [Y_{A_1}, . . . , Y_{A_n}] and Z = [Z_{A_1}, . . . , Z_{A_n}], where Y_{A_i} ∈ ℝ^{T×1} and Z_{A_i} ∈ ℝ^{T×p} denote the target and covariate matrix for the i-th data owner, respectively. Furthermore, B = [B_{A_1}^T, . . . , B_{A_n}^T]^T, as illustrated in FIG. 2.


Consequently, the problem in (4) can be re-written as









$$\arg\min_{B} \left( \lVert Y - \sum_i Z_{A_i} B_{A_i} \rVert_2^2 + \lambda \sum_i \lVert B_{A_i} \rVert_1 \right). \qquad (7)$$

This decomposition of the objective function allows parallel computation of B_{A_i}, with the ADMM solution provided by the system of equations (8)—see [3],











$$B_{A_i}^{k+1} = \arg\min_{B_{A_i}} \left( \lVert Z_{A_i} B_{A_i}^{k} + \bar{H}^{k} - \overline{ZB}^{k} - U^{k} - Z_{A_i} B_{A_i} \rVert_2^2 + \lambda \lVert B_{A_i} \rVert_1 \right), \qquad (8a)$$

$$\bar{H}^{k+1} = \frac{1}{N + \rho} \left( Y + \rho\, \overline{ZB}^{k+1} + U^{k} \right), \qquad (8b)$$

$$U^{k+1} = U^{k} + \overline{ZB}^{k+1} - \bar{H}^{k+1}, \qquad (8c)$$

where

$$\overline{ZB}^{k+1} = \frac{1}{n} \sum_{j=1}^{n} Z_{A_j} B_{A_j}^{k+1},$$

B_{A_i}^{k+1} ∈ ℝ^{p×n}, Z_{A_i} ∈ ℝ^{T×p}, Y, H, U ∈ ℝ^{T×n}, i = 1, . . . , n. B_{A_i} is estimated through ADMM by adapting (5),










$$\arg\min_{B} \left( \lVert \hat{Y}_{A_i} - \sum_i Z_{A_i} B_{A_i} \rVert_2^2 + \hat{\lambda} \sum_i \lVert H_{A_i} \rVert_1 \right), \qquad (9)$$

where Ŷ_{A_i} = Z_{A_i} B_{A_i}^{k} + \bar{H}^{k} − \overline{ZB}^{k} − U^{k} and λ̂ = λ/ρ.
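For concreteness, the following Python sketch mirrors the feature-split loop (8a)-(8c). It is a toy illustration: the local subproblem (8a) is approximated here by a few proximal-gradient steps rather than by the inner ADMM problem (9), and all sizes, step lengths and iteration counts are assumptions. In the privacy-preserving variant described further below, the same loop runs on transformed matrices rather than on Z and Y directly.

```python
import numpy as np

def soft_threshold(X, tau):
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def local_update(Z_i, B_i, V, lam, inner=50):
    """Approximate (8a): argmin_B ||V - Z_i B||_2^2 + lam ||B||_1,
    with V = Z_i B_i^k + H_bar^k - ZB_bar^k - U^k (proximal gradient)."""
    step = 1.0 / (2.0 * np.linalg.norm(Z_i, 2) ** 2)   # 1/Lipschitz constant
    B = B_i.copy()
    for _ in range(inner):
        B = soft_threshold(B - step * 2.0 * Z_i.T @ (Z_i @ B - V), step * lam)
    return B

def distributed_lasso_var(Z_parts, Y, lam, rho=1.0, n_iter=100):
    """Feature-split sharing-ADMM: Z_parts[i] holds agent i's covariates."""
    T, n = Y.shape
    N = len(Z_parts)
    B = [np.zeros((Z_i.shape[1], n)) for Z_i in Z_parts]
    H_bar, U = np.zeros((T, n)), np.zeros((T, n))
    for _ in range(n_iter):
        ZB_bar = sum(Z_i @ B_i for Z_i, B_i in zip(Z_parts, B)) / N
        # (8a): every agent updates its own coefficient block in parallel.
        B = [local_update(Z_i, B_i, Z_i @ B_i + H_bar - ZB_bar - U, lam)
             for Z_i, B_i in zip(Z_parts, B)]
        ZB_bar = sum(Z_i @ B_i for Z_i, B_i in zip(Z_parts, B)) / N
        H_bar = (Y + rho * ZB_bar + U) / (N + rho)   # (8b)
        U = U + ZB_bar - H_bar                       # (8c)
    return B
```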

    • (3) Privacy issues: Regarding the collaboration schemes, each agent determines and transmits (8a); then it is up to the central agent or peers (depending on the adopted structure) to determine the quantities in (8b) and (8c). Although there is no direct exchange of sensitive data, the computation of (8b) and (8c) provides indirect information about these data, meaning that confidentiality breaches can occur after a number of iterations. The term “confidentiality breach” is hereafter taken to mean the reconstruction of the entire private dataset by another party.


To reduce the possibility of such confidentiality breaches, recent work combined distributed ADMM with differential privacy, which consists of adding random noise (with certain statistical properties) to the data itself or coefficients [8], [9]. However, these mechanisms can deteriorate the performance of the model even under moderate privacy guarantees [2].


The following describes the disclosed privacy-preserving collaborative forecasting method, which combines multiplicative randomization of the data with, in an embodiment, the distributed ADMM for generalized LASSO-VAR model. Communication issues are also addressed since they are common in distributed systems.


The following describes the disclosed data transformation with multiplicative randomization as used in the present disclosure.


Multiplicative randomization of the data comprises multiplying the data matrix X ∈ ℝ^{T×ns} by full rank perturbation matrices, where T is the number of records, n is the number of agents and s is the number of variables observed by each agent (for simplicity, the disclosed equations use the same number of variables for all agents; however, it is straightforward to adapt them to a different number of variables per agent). If the perturbation matrix M ∈ ℝ^{T×T} pre-multiplies X, i.e., MX, the records are randomized. On the other hand, if the perturbation matrix Q ∈ ℝ^{ns×ns} post-multiplies X, i.e., XQ, then the features are randomized. The challenges related to such transformations are two-fold: (i) M and Q are algebraic encryption keys, and consequently should be fully unknown by the agents; (ii) the data transformations need to preserve the relationship between the original time series.


When X is split by features, as is the case with matrices Z and Y when defining VAR models, Q can be constructed as a block-diagonal matrix—see (10)—whose diagonal blocks Q_{A_i} ∈ ℝ^{s×s} are diagonal matrices privately defined by each agent i = 1, . . . , n, where s is the number of covariates observed from each agent. Then, agents post-multiply their private data without sharing Q_{A_i}, since XQ is given by













$$XQ = \underbrace{[X_{A_1}, \ldots, X_{A_n}]}_{=\,X} \underbrace{\begin{bmatrix} Q_{A_1} & & 0 \\ & \ddots & \\ 0 & & Q_{A_n} \end{bmatrix}}_{=\,Q} = [X_{A_1} Q_{A_1}, \ldots, X_{A_n} Q_{A_n}], \qquad (10)$$
where X_{A_i} ∈ ℝ^{T×s} is the data to be protected of each i-th agent.
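As a short illustration of (10), the sketch below (toy sizes; the uniform draw for the diagonal entries is an assumption that merely guarantees invertibility) shows that concatenating the locally randomized blocks X_{A_i} Q_{A_i} reproduces XQ without any agent revealing its Q_{A_i}.

```python
import numpy as np

rng = np.random.default_rng(1)
T, s, n = 8, 2, 3                                   # records, variables per agent, agents
X_parts = [rng.standard_normal((T, s)) for _ in range(n)]        # private data X_Ai
Q_parts = [np.diag(rng.uniform(0.5, 2.0, s)) for _ in range(n)]  # private diagonal keys Q_Ai

# Each agent shares only X_Ai @ Q_Ai; the concatenation equals X Q with
# Q the block-diagonal matrix of the private Q_Ai, as in (10).
XQ = np.hstack([X_i @ Q_i for X_i, Q_i in zip(X_parts, Q_parts)])

# Verify against the explicitly assembled block-diagonal Q.
Q_full = np.zeros((n * s, n * s))
for i, Q_i in enumerate(Q_parts):
    Q_full[i * s:(i + 1) * s, i * s:(i + 1) * s] = Q_i
assert np.allclose(XQ, np.hstack(X_parts) @ Q_full)
```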


Unfortunately, the same reasoning is not possible when defining M, because all elements of the j-th column of M multiply all elements of the j-th row in X (which contains data from every agent). Therefore, the challenge is to define a random matrix M that is unknown to, but at the same time built by, all agents. We propose to define M as






$$M = M_{A_1} \cdots M_{A_n} \qquad (11)$$


where the random matrix M_{A_i} ∈ ℝ^{T×T} is privately and randomly generated by the i-th agent. This means that MX is given by:









$$MX = \left[ \underbrace{M_{A_1} \cdots M_{A_n} X_{A_1}}_{=\,M X_{A_1}}, \ \ldots, \ \underbrace{M_{A_1} \cdots M_{A_n} X_{A_n}}_{=\,M X_{A_n}} \right]. \qquad (12)$$
Some linear algebra-based protocols exist for secure matrix products, but they were designed for matrices with independent observations and have proven to fail when applied to matrices such as Z and Y (see [6] for a proof). Our proposal for computing M X_{A_i} is as follows:

    • Step 1: The i-th agent generates random invertible matrices C_{A_i} ∈ ℝ^{T×(r−s)} and D_{A_i} ∈ ℝ^{r×r}, and shares W_{A_i} ∈ ℝ^{T×r} with the n-th agent,

$$W_{A_i} = [X_{A_i}, C_{A_i}] D_{A_i} \qquad (13)$$

    • Step 2: The n-th agent receives W_{A_i} and shares M_{A_n} W_{A_i} with the (n−1)-th agent. Repeat until the 1-st agent receives M_{A_2} \cdots M_{A_n} W_{A_i} and computes M W_{A_i} = M_{A_1} M_{A_2} \cdots M_{A_n} W_{A_i}.
    • Step 3: The i-th agent receives M W_{A_i} from the 1-st agent and recovers M X_{A_i},

$$[M C_{A_i}, M X_{A_i}] = M W_{A_i} D_{A_i}^{-1} \qquad (14)$$


The privacy of this protocol depends on the integer r, which is chosen according to the number of unique values in X_{A_i} and represents the size of the new variable space. The optimal value of r is discussed further below, as well as the range of values of r required by the methods of the present disclosure to ensure data privacy.
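The following Python walk-through illustrates (13)-(14) for three agents computing M X_{A_1}. The sizes and the value of r are toy assumptions (in practice r must respect the bounds discussed below), and the full matrix M is assembled here only so the result can be verified; in the protocol it is never materialized by any single agent.

```python
import numpy as np

rng = np.random.default_rng(2)
T, s, r = 12, 2, 4                       # toy sizes; r > s is required
M_parts = [rng.standard_normal((T, T)) for _ in range(3)]   # private M_Aj, one per agent
M = M_parts[0] @ M_parts[1] @ M_parts[2]                    # (11); formed only to verify

X_A1 = rng.standard_normal((T, s))       # agent 1's private data
C_A1 = rng.standard_normal((T, r - s))   # random padding, private to agent 1
D_A1 = rng.standard_normal((r, r))       # random invertible mixer, private to agent 1

W = np.hstack([X_A1, C_A1]) @ D_A1       # (13): the only matrix agent 1 sends out
W = M_parts[2] @ W                       # agent 3 applies M_A3 and forwards
W = M_parts[1] @ W                       # agent 2 applies M_A2 and forwards
W = M_parts[0] @ W                       # agent 1 applies M_A1: W now equals M W_A1

recovered = W @ np.linalg.inv(D_A1)      # (14): only agent 1 knows D_A1
MX_A1 = recovered[:, :s]                 # block aligned with X_A1 in the [X, C] order of (13)
assert np.allclose(MX_A1, M @ X_A1)      # agent 1 holds M X_A1 without learning M
```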


The following discloses the formulation of the collaborative forecasting model as used in the present disclosure.


When applying the ADMM algorithm, the protocol presented above should be applied to transform matrices Z and Y in such a way that: (i) the estimated coefficients do not coincide with the original ones; instead, they are a secret transformation of them; (ii) agents are unable to recover the private data through the exchanged information; and (iii) cross-correlations cannot be obtained, i.e., agents are unable to recover Z^T Z nor Y^T Y.


To fulfil these requirements, both covariate and target matrices are transformed through multiplicative noise. Both M and Q are assumed to be invertible, which is guaranteed for invertible MAi and QAi, i=1, . . . , n.

    • 1) Formulation: Let ZQ be the covariate matrix obtained through (10), and Y the target matrix. Covariate matrix ZQ is split by features, and the optimization problem which allows recovering the solution of (7) is











$$\arg\min_{B^{post}} \left( \frac{1}{2} \lVert Y - \sum_i Z_{A_i} Q_{A_i} B_{A_i}^{post} \rVert_2^2 + \lambda \sum_i \lVert Q_{A_i} B_{A_i}^{post} \rVert_1 \right). \qquad (15)$$

After some algebra, the relation between the ADMM solutions of (7) and (15) is






$$B_{A_i}^{post,\,k+1} = Q_{A_i} B_{A_i}^{k+1}, \qquad (16)$$


suggesting coefficient privacy. However, the limitations identified for (7) remain valid for (15). That is, a curious agent can obtain both Y and ZQ, and because Y and Z share a large proportion of values, Z can also be recovered.


Taking covariate matrix MZQ and target MY, the ADMM solution for the optimization problem











$$\arg\min_{B} \left( \frac{1}{2} \lVert MY - \sum_i M Z_{A_i} Q_{A_i} B_{A_i} \rVert_2^2 + \lambda \sum_i \lVert Q_{A_i} B_{A_i} \rVert_1 \right) \qquad (17)$$
preserves the relation between the original time series if M is orthogonal, i.e., M M^T = I, where B′_{A_i} = Q_{A_i} B_{A_i}. In this case, although the data are protected, there is still sensitive information to be shared: MY can be recovered without compromising Y, but (MY)^T MY = Y^T Y. That is, a curious agent is able to obtain the covariance (and cross-correlation) matrix.


The problem of the previous approach is the orthogonality of M, which is necessary while computing BAi to ensure that











$$Q_{A_i}^T Z_{A_i}^T M^T \left[ M Z_{A_i} Q_{A_i} B_{A_i}^{k} - M \overline{ZQB}^{k} + \cdots \right] = Q_{A_i}^T Z_{A_i}^T \left[ Z_{A_i} Q_{A_i} B_{A_i}^{k} - \overline{ZQB}^{k} + \cdots \right] \qquad (18)$$

We deal with this limitation by using Z_{A_i}^T M^{-1} instead of Z_{A_i}^T M^T. Our proposal requires agents to compute M Z_{A_i}, M Y_{A_i} and Z_{A_i}^T M^{-1}. Algorithm 1 summarizes our proposal for estimating a privacy-preserving LASSO-VAR model; see FIG. 6, and see also Equations (19)-(22) in FIG. 6.


Z_{A_i}^T M^{-1} is obtained by adapting the protocol in (13)-(14). In this case, the value of r is even more restrictive, because we need to ensure that the i-th agent does not obtain both Y_{A_i}^T M^{-1} and M Y_{A_i}. Otherwise, the covariance and cross-correlation matrices are again vulnerable. Let us assume that Z_{A_i} Q_{A_i} has u unique values to recover and Y_{A_i} has v unique unknown values that are not in Z_{A_i}. Then, privacy is ensured by computing M Z_{A_i} Q_{A_i} and Q_{A_i}^T Z_{A_i}^T M^{-1} using the smallest integer r such that √(Tp − u) < r < T/2 ∧ r > p, and then M Y_{A_i} with √(−u + Tp − r² − v + T) < r′ < T − 2r ∧ r′ > 1 (see Disclosed r determination method 2).
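A small helper, given below as an illustrative sketch (the search loop and the None return convention are assumptions), finds the smallest integers satisfying these bounds on r and r′.

```python
import math

def smallest_r(T, p, u):
    """Smallest integer r with sqrt(T*p - u) < r < T/2 and r > p."""
    r = max(math.isqrt(max(T * p - u, 0)), p) + 1   # strictly above both lower bounds
    return r if r < T / 2 else None                  # None: the bounds cannot be met

def smallest_r_prime(T, p, u, v, r):
    """Smallest integer r' with sqrt(-u + T*p - r^2 - v + T) < r' < T - 2r and r' > 1."""
    inner = -u + T * p - r * r - v + T
    rp = max(math.isqrt(max(inner, 0)), 1) + 1
    return rp if rp < T - 2 * r else None

T, p, u, v = 1000, 3, 1002, 900          # toy values
r = smallest_r(T, p, u)                   # 45 for these toy values
print(r, smallest_r_prime(T, p, u, v, r))
```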


Finally, it is important to underline that the presently disclosed method can be applied to both the central hub model and the P2P scheme without any modification—the only difference is who receives M Z_{A_i} Q_{A_i} B_{A_i}^{k} and computes H^k and U^k.


Malicious agents in the ADMM iterative process: The present disclosure assumes that agents should only trust themselves. This assumption requires the use of control mechanisms, since agents can share wrong estimates of their coefficients, compromising the global model. Since MY and MZQB′^k can be known by the agents without exposing private data, a malicious agent can be detected through the analysis of ∥MY − MZQB′^k∥²₂. That is, during the iterative process, this global error should converge smoothly, as depicted in FIG. 3 (left plot), and the same is expected for the individual errors ∥MY − Σ_i M Z_{A_i} Q_{A_i} B′_{A_i}^{k}∥²₂, ∀i. In the example of FIG. 3, two agents are assumed to add random noise to their coefficients. This results in the erratic curve for the global error shown in FIG. 3. An analysis of the individual errors, in FIG. 3 (right plot), shows that all agents have smooth curves, except the two who shared distorted information.
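A possible sketch of this check is given below (the threshold and the synthetic error curves are illustrative assumptions): an agent whose individual error sequence does not decrease smoothly is flagged.

```python
import numpy as np

def erratic(errors, tol=0.5):
    """Flag an error curve whose successive relative changes exceed tol."""
    e = np.asarray(errors, dtype=float)
    rel = np.abs(np.diff(e)) / np.maximum(e[:-1], 1e-12)
    return bool(np.any(rel > tol))

rng = np.random.default_rng(3)
honest = 100.0 * 0.9 ** np.arange(50)                  # smoothly decaying ||MY - MZQB'||^2
cheater = honest * np.abs(1 + rng.uniform(-2, 2, 50))  # distorted coefficients -> jumpy curve
print(erratic(honest), erratic(cheater))               # expected: False True
```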


The following describes the asynchronous communication as used in the present disclosure.


When applying the ADMM, the matrices in (20)-(22) combine the individual solutions of all data owners, meaning that the “slowest” agent dictates the duration of each iteration. Since communication delays may occur because of computation or communication issues, the proposed algorithm should be robust to this scenario. Otherwise, the convergence to the optimal solution may require too much time. Besides, some information may never be transmitted.


The proposed LASSO-VAR approach deals with communication issues considering the last information sent by agents, but different strategies are assumed according to the adopted collaborative scheme.


Regarding the centralized scheme, let Ω_i^k be the set of iterations for which agent i communicated its information, up to the current iteration k. After receiving the local contributions, the central agent computes H^k and U^k in (21)-(22) by using Σ_i M Z_{A_i} Q_{A_i} B′_{A_i}^{max(Ω_i^k)}. Then, the central agent returns H^k and U^k, informing the agents about max(Ω_i^k). To proceed, B′_{A_i}^{k+1} is updated by using M Z_{A_i} Q_{A_i} B′_{A_i}^{max(Ω_i^k)} in (19).


For the P2P approach, let Λ_i^k be the set of agents sharing information computed at iteration k with agent i, i.e., Λ_i^k = {j: agent j sent M Z_{A_j} Q_{A_j} B′_{A_j}^{k} to agent i}. After computing and sharing M Z_{A_i} Q_{A_i} B′_{A_i}^{k}, a second round of peer-to-peer communication is proposed, where agents share both Λ_i^k and Σ_{j∈Λ_i^k} M Z_{A_j} Q_{A_j} B′_{A_j}^{k}. After this extra communication round, agent i is able to obtain missing information when Λ_i^k ≠ Λ_j^k, ∀i, j.
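The toy sketch below illustrates this second round (scalar stand-ins replace the matrices M Z_{A_j} Q_{A_j} B′_{A_j}^{k}; the data structures and the singleton-difference case handled are assumptions): an agent that missed a contribution in the first round isolates it from a peer's shared set Λ_j^k and partial sum.

```python
contribs = {1: 10.0, 2: 20.0, 3: 30.0}        # stand-ins for M Z_Aj Q_Aj B'_Aj^k
Lambda = {1: {1, 2}, 2: {1, 2, 3}, 3: {2, 3}} # Lambda_i^k: who reached agent i in round one

def recover_missing(i):
    """Second round: merge peers' (Lambda_j, partial-sum) pairs into agent i's view."""
    known = {l: contribs[l] for l in Lambda[i]}          # round-one individual terms
    for j, Lam_j in Lambda.items():
        S_j = sum(contribs[l] for l in Lam_j)            # peer j shares Lambda_j and this sum
        extra = Lam_j - known.keys()
        if len(extra) == 1:                              # a single missing term can be isolated
            (m,) = extra
            known[m] = S_j - sum(known[l] for l in Lam_j & known.keys())
    return known

print(recover_missing(1))   # all three contributions recovered
```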





BRIEF DESCRIPTION OF THE DRAWINGS

The following figures provide preferred embodiments for illustrating the disclosure and should not be seen as limiting the scope of the invention.



FIG. 1: Schematic representation of two structural organizations of collaborative RES forecasting.



FIG. 2: Schematic representation of an embodiment of a VAR model and respective data structure.



FIG. 3: Error evolution (left: global error; right: error by agent, with black lines representing the two agents who add random noise to their information).



FIG. 4: Schematic representation of the results of relative NRMSE improvement [%] over the baseline models.



FIG. 5: Loss trajectory while fitting LASSO-VAR model.



FIG. 6: Schematic representation of an embodiment of a pseudo-code algorithm, hereby “algorithm 1”, with a particular embodiment of the presently disclosed methods.





DETAILED DESCRIPTION

It is disclosed a method for preserving privacy of a linear regression model used in distributed learning by a set of agents sharing covariate data and/or target data for said model, comprising obtaining an invertible random perturbation matrix, as an algebraic encryption key given by multiplication of a plurality of invertible randomly generated perturbation sub-matrixes, there being a perturbation sub-matrix for each respective agent,

    • wherein the invertible random perturbation matrix is to be used as an algebraic encryption key by left-multiplication of the data to be encrypted,
    • wherein an invertible perturbation sub-matrix is privately and randomly generated by each individual agent,


      by the steps of:


      for each individual agent, the individual agent privately generating a first random invertible matrix and a second random invertible matrix, multiplying a horizontally concatenated matrix of the data to be encrypted by said individual agent and said first matrix, with the second matrix, and sharing the resulting multiplication with a receiving agent selected from one of the remaining agents;
    • the receiving agent multiplying the perturbation sub-matrix of the receiving agent by the received multiplication, and sharing the resulting multiplication with another receiving agent selected from one of the remaining agents, repeating until a last remaining agent receives the multiplication, and then the last remaining agent multiplying the perturbation sub-matrix of the last remaining agent by the received multiplication, and sharing the resulting multiplication with the individual agent;


      for each individual agent, the individual agent recovering the encrypted data of said individual agent by multiplying the resulting multiplication shared with the individual agent by the inverse of the second matrix, and horizontally decatenating from the computed multiplication a left-side matrix and a right-side matrix, wherein the right-side matrix is the encrypted data of said individual agent as encrypted by the perturbation matrix;


      for each individual agent, the individual agent sharing the encrypted data of said individual agent with all other agents.


It is disclosed a method for preserving privacy of a linear regression model used in distributed learning by a set of agents sharing covariate data and/or target data for said model, comprising obtaining a random perturbation matrix M, as an algebraic encryption key, for pre-multiplying the data to be encrypted.


In an embodiment, said method steps are repeated for encrypting M−1, the inverse of the perturbation matrix M.


In an embodiment, the set of agents indirectly shares covariate data and/or target data when sharing coefficient matrixes for said model.


In an embodiment, the set of agents directly shares covariate data and/or target data when sharing raw covariate data and/or target data for said model.


In an embodiment, said perturbation matrix is used for pre-multiplying the data to be encrypted for each agent.


It is also disclosed a method for preserving privacy of a linear regression model used in distributed learning by a set of agents sharing covariate data and/or target data for said model, comprising obtaining a perturbation matrix Q, as an algebraic encryption key, for post-multiplying the data to be encrypted.


It is also disclosed a method for preserving privacy of linear regression distributed learning using a LASSO (least absolute shrinkage and selection operator)-VAR (vector autoregressive) model by using said random perturbation matrix M, as an algebraic encryption key, for pre-multiplying the data to be encrypted and by using said perturbation matrix Q, as an algebraic encryption key, for post-multiplying the data to be encrypted, in particular for preserving privacy of distributed learning using a LASSO-VAR model using convex optimization such as alternating direction method of multipliers (ADMM) or using coordinate descent optimization.


In an embodiment, the disclosed methods can be used for forecasting methods using said linear regression model by preserving privacy of shared data between said set of agents, in particular for wind or solar power forecasting.


The disclosed methods are computer-implemented methods.


It is disclosed a computer-implemented method for preserving privacy of a linear regression model used in distributed learning by a set of n agents sharing covariate data and/or target data for said model, comprising obtaining an invertible random perturbation matrix M, as an algebraic encryption key M ∈ ℝ^{T×T},

    • given by multiplication of sub-matrixes M_{A_i} ∈ ℝ^{T×T} given by:

$$M = M_{A_1} \cdots M_{A_n} \qquad (11)$$

    • such that MX is given by:

$$MX = \left[ \underbrace{M_{A_1} \cdots M_{A_n} X_{A_1}}_{=\,M X_{A_1}}, \ \ldots, \ \underbrace{M_{A_1} \cdots M_{A_n} X_{A_n}}_{=\,M X_{A_n}} \right], \qquad (12)$$


    • where n is the number of agents, i is each agent, T is the number of observations, X ∈ ℝ^{T×ns} is the data to be encrypted of all agents, X = [X_{A_1}, . . . , X_{A_n}], X_{A_i} ∈ ℝ^{T×s} is the data to be encrypted of each i-th agent, s is the number of covariates observed from each agent, and wherein the invertible matrix M_{A_i} is privately and randomly generated by each i-th agent,


      by the steps of:

    • the i-th agent privately generating random matrices C_{A_i} ∈ ℝ^{T×(r−s)}, D_{A_i} ∈ ℝ^{r×r}, and sharing W_{A_i} ∈ ℝ^{T×r} with the n-th agent, where W_{A_i} is obtained by:

$$W_{A_i} = [X_{A_i}, C_{A_i}] D_{A_i} \qquad (13)$$

    • the n-th agent receiving W_{A_i} and sharing M_{A_n} W_{A_i} with the (n−1)-th agent, which then receives M_{A_n} W_{A_i} and shares M_{A_{n−1}} M_{A_n} W_{A_i} with the (n−2)-th agent; and repeating until the 1-st agent receives M_{A_2} ⋯ M_{A_n} W_{A_i} and computes M W_{A_i} = M_{A_1} M_{A_2} ⋯ M_{A_n} W_{A_i};
    • the 1-st agent sending M W_{A_i} to the i-th agent;
    • the i-th agent recovering M X_{A_i} from:

$$[M C_{A_i}, M X_{A_i}] = M W_{A_i} D_{A_i}^{-1} \qquad (14)$$


and subsequently each agent sending the recovered MXAi as the encrypted data to be transmitted to the other agents.


An embodiment comprises obtaining a perturbation matrix Q ∈ ℝ^{ns×ns}, for post-multiplying the data to be encrypted X = [X_{A_1}, . . . , X_{A_n}], where X_{A_i} ∈ ℝ^{T×s} is the data to be encrypted of each i-th agent, as an algebraic encryption key Q, such that XQ is given by:

$$XQ = \underbrace{[X_{A_1}, \ldots, X_{A_n}]}_{=\,X} \underbrace{\begin{bmatrix} Q_{A_1} & & 0 \\ & \ddots & \\ 0 & & Q_{A_n} \end{bmatrix}}_{=\,Q} = [X_{A_1} Q_{A_1}, \ldots, X_{A_n} Q_{A_n}], \qquad (10)$$

    • where Q is a block-diagonal matrix formed by diagonal matrices Q_{A_i} ∈ ℝ^{s×s}, which are random matrices privately generated by each agent i, the method comprising the steps of:
      • each i-th agent generating the random matrix Q_{A_i},
      • and sharing X_{A_i} Q_{A_i} with every other agent.





An embodiment comprises using a LASSO (least absolute shrinkage and selection operator)-VAR (vector autoregressive) model.


An embodiment comprises using a LASSO-VAR model using convex optimization such as alternating direction method of multipliers (ADMM) or using coordinate descent optimization.


In an embodiment, Z=[ZA1, . . . , ZAn] is a covariate matrix and Y=[YA1, . . . , YAn] is a target matrix, where Z and Y are data to be encrypted and transmitted, where covariate matrix Z is split by model features, comprising each of said agents computing MZAi, MYAi and ZAiTM−1 for obtaining the encrypted data to be transmitted to the other agents.


In an embodiment, ZQ is a covariate matrix and Y is a target matrix, where Z=[ZA1, . . . , ZAn] and Y=[YA1, . . . , YAn] are data to be encrypted and transmitted, where covariate matrix ZQ is split by model features, comprising each of said agents computing MZAiQAi, MYAi and QAiTZAiT M−1 for obtaining the encrypted data to be transmitted to the other agents.


In an embodiment, Z_{A_i} Q_{A_i} ∈ ℝ^{T×s} has u unique values to recover and Y_{A_i} ∈ ℝ^{T×g} has v unique unknown values that are not in Z_{A_i}, comprising the steps of computing M Z_{A_i} Q_{A_i} and Q_{A_i}^T Z_{A_i}^T M^{-1} using the smallest integer r such that √(Ts − u) < r < T/2 ∧ r > p, and computing M Y_{A_i} using the smallest integer r′ such that √(−u + Ts − r² − v + Tg) < r′ < T − 2r ∧ r′ > g.


An embodiment for obtaining a non-encrypted LASSO-VAR model further comprises each agent:

    • obtaining said LASSO-VAR model, where the coefficient matrix B^{(l)} ∈ ℝ^{n×n}, which represents the coefficient matrix at lag l = 1, . . . , p, is split into variables B and H with the added constraint H = B to obtain model coefficients B^{k+1}, where

$$B^{k+1} = (Z^T Z + \rho I)^{-1} (Z^T Y + \rho (H^k - U^k)) \qquad (6a)$$

$$H^{k+1} = S_{\lambda/\rho}(B^{k+1} + U^k) \qquad (6b)$$

$$U^{k+1} = U^k + B^{k+1} - H^{k+1} \qquad (6c)$$

and where U is the scaled dual variable associated with the constraint H = B, I is the identity matrix with proper dimension, k is an iteration of the optimization method, and S_{λ/ρ} is the soft thresholding operator.


An embodiment comprises computing BAi in parallel as BAik+1 which is obtained by:











$$B_{A_i}^{k+1} = \arg\min_{B_{A_i}} \left( \lVert Z_{A_i} B_{A_i}^{k} + \bar{H}^{k} - \overline{ZB}^{k} - U^{k} - Z_{A_i} B_{A_i} \rVert_2^2 + \lambda \lVert B_{A_i} \rVert_1 \right), \qquad (8a)$$

$$\bar{H}^{k+1} = \frac{1}{N + \rho} \left( Y + \rho\, \overline{ZB}^{k+1} + U^{k} \right), \qquad (8b)$$

$$U^{k+1} = U^{k} + \overline{ZB}^{k+1} - \bar{H}^{k+1}, \qquad (8c)$$

    • where

$$\overline{ZB}^{k+1} = \frac{1}{n} \sum_{j=1}^{n} Z_{A_j} B_{A_j}^{k+1},$$

    • and B_{A_i}^{k+1} ∈ ℝ^{p×n}, Z_{A_i} ∈ ℝ^{T×p}, Y, H, U ∈ ℝ^{T×n}, i = 1, . . . , n, and B_{A_i} is estimated through ADMM by













$$\arg\min_{B} \left( \lVert \hat{Y}_{A_i} - \sum_i Z_{A_i} B_{A_i} \rVert_2^2 + \hat{\lambda} \sum_i \lVert H_{A_i} \rVert_1 \right), \qquad (9)$$
    • where Ŷ_{A_i} = Z_{A_i} B_{A_i}^{k} + \bar{H}^{k} − \overline{ZB}^{k} − U^{k} and λ̂ = λ/ρ,

    • wherein BAi is the non-encrypted solution to the LASSO-VAR model.





An embodiment comprises obtaining the ADMM solution for the optimization problem:










$$\arg\min_{B} \left( \frac{1}{2} \lVert MY - \sum_i M Z_{A_i} Q_{A_i} B_{A_i} \rVert_2^2 + \lambda \sum_i \lVert Q_{A_i} B_{A_i} \rVert_1 \right) \qquad (17)$$
    • where B′Ai=QAiBAi is the encrypted solution to the LASSO-VAR model.





It is also disclosed a non-transitory storage media comprising computer program instructions for implementing a method for preserving privacy of a linear regression model used in distributed learning by a set of agents indirectly sharing covariate data or target data when sharing coefficient matrixes for said model, the computer program instructions including instructions which, when executed by a processor for each agent, cause the processors to carry out the method of any of the disclosed embodiments.


The following describes a case-study and the respective data description and experimental setup, where the disclosure is applied to forecast solar power up to 6 hours ahead. The data is publicly available in [2] and consists of hourly time series of solar power from 44 microgeneration units located in a Portuguese city, covering the period from Feb. 1, 2011 to Mar. 6, 2013. Since the VAR model requires the data to be stationary, the solar power is normalized through a clear sky model [26], which gives an estimate of the solar power in clear sky conditions at any given time. In addition, night-time hours are excluded by removing data for which the solar zenith angle is larger than 90°.


Based on previous work [2], a LASSO-VAR model using lags 1, 2 and 24 is fitted with a sliding window of one month, and the training period consists of 12 months. For simulation purposes, communication delays are modeled as exponential random variables D_it with rate λ_i^exp, D_it ~ E(λ_i^exp), and communication failures are modeled through Bernoulli random variables F_it, with failure probability p_i, F_it ~ Bern(p_i), for each agent i = 1, . . . , n, at each communication time t.
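For reference, the sketch below reproduces this communication model in Python (the rates, probabilities and sizes are the illustrative assumptions of a toy run):

```python
import numpy as np

rng = np.random.default_rng(4)
n_agents, n_times = 44, 100
rate = 1.0                    # lambda_i^exp, here equal for all agents
p_fail = 0.3                  # p_i, here equal for all agents

delays = rng.exponential(scale=1.0 / rate, size=(n_agents, n_times))  # D_it ~ E(rate)
failures = rng.random((n_agents, n_times)) < p_fail                   # F_it ~ Bern(p_i)
delivered = ~failures          # messages that actually reach their destination
```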


When compared to other problems, e.g., wind power forecasting, solar power forecasting is more challenging because lags 1 and 2 are zero for the first daylight hours, i.e., there are fewer unknown data values.


The ADMM process stops when all agents achieve ∥B_{A_i}^{k+1} − B_{A_i}^{k}∥²₂ / max(1, min(|B_{A_i}^{k+1}|, |B_{A_i}^{k}|)) < ε, i = 1, . . . , n, where ε is the tolerance parameter.


The performance of the models is assessed through the normalized root mean squared error (NRMSE), calculated for the i-th agent and lead-time t+h, h = 1, . . . , 6, as










$$\mathrm{NRMSE}_{i,t+h} = \frac{\sqrt{\frac{1}{k}\sum_{t=1}^{k} \left( \hat{y}_{i,t+h} - y_{i,t+h} \right)^2}}{\max\left(\{y_{i,t}\}_{t=1}^{T}\right) - \min\left(\{y_{i,t}\}_{t=1}^{T}\right)} \qquad (23)$$
where ŷ_{i,t+h} represents the forecast generated at time t for lead-time t+h.
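A direct transcription of (23) into Python reads as follows (the toy series is an assumption; y_hist supplies the historical range used for normalization):

```python
import numpy as np

def nrmse(y_true, y_pred, y_hist):
    """RMSE of the forecasts, normalized by the historical range of the series, per (23)."""
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return np.sqrt(np.mean(err ** 2)) / (np.max(y_hist) - np.min(y_hist))

y_hist = np.array([0.0, 0.2, 0.8, 1.0])          # {y_i,t}, t = 1..T
print(nrmse([0.5, 0.6], [0.45, 0.70], y_hist))   # ≈0.079
```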


The following describes the benchmark models used for the case-study. The autoregressive (AR) model is implemented to assess the impact of collaboration over a model without collaboration.


Also, the analog method in [7] is implemented, since it enables collaborative forecasting without data disclosure. First, agent i searches for the k situations most similar to the current power production values y_{i,t−l+1}, . . . , y_{i,t}; similarity is measured through the Euclidean distance. Second, these k most similar situations (called analogs) are weighted according to the corresponding Euclidean distance: agent i attributes the weight w_{A_i}(a) to analog a. The forecast for h steps ahead is obtained by applying the computed weights to the h values registered immediately after the k analogs. The collaboration between agents requires the exchange of the time series indexes for the selected analogs and corresponding weights. Two analogs belong to the same situation if they occur at the same or at close timestamps. Agent i scores analog a at timestamp t_a by performing












$$s_{A_i}(a) = \underbrace{(1-\alpha)\, w_{A_i}(a)}_{\text{own contribution}} + \underbrace{\frac{\alpha}{n} \sum_{i=1}^{n} \sum_{j=1}^{k} w_{A_j}(j)\, I_{\varepsilon}(t_a, t_j)}_{\text{others' weights for close timestamps}}, \qquad (24)$$

    • where α is the weight given to neighbor information, j indexes the analogs from other agents, registered at timestamps t_j, and I_ε(t_a, t_j) is the indicator function taking value 1 if |t_j − t_a| ≤ ε, with ε being the maximum time difference for two analogs to be considered part of the same global situation.
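A compact sketch of the scoring rule (24) follows (the inputs and names are toy assumptions): the agent's own analog weight is blended with the peers' weights for analogs registered at nearby timestamps.

```python
def score(own_weight, peer_analogs, t_a, alpha, eps, n):
    """s_Ai(a) per (24); peer_analogs pools (timestamp, weight) pairs over all
    other agents and their k analogs."""
    close = sum(w for t_j, w in peer_analogs if abs(t_j - t_a) <= eps)
    return (1 - alpha) * own_weight + (alpha / n) * close

peers = [(100, 0.4), (103, 0.3), (250, 0.9)]  # (t_j, w_Aj(j)) from neighbors
print(score(own_weight=0.5, peer_analogs=peers, t_a=101, alpha=0.3, eps=3, n=4))  # 0.4025
```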












TABLE I
Normalized RMSE for synchronous models.

Lead-time [h]     1      2      3      4      5      6
Analogs [7]     10.44  13.05  14.76  15.78  16.28  16.49
AR*             10.10  13.17  14.29  14.75  14.92  14.99
LASSO-VAR†       9.23  12.36  13.85  14.51  14.69  14.84

*without collaboration; †with collaboration














TABLE II
Mean relative NRMSE improvement [%] over the AR model.

        h = 1          h = 2          h = 3          h = 4          h = 5          h = 6
pi    central  P2P   central  P2P   central  P2P   central  P2P   central  P2P   central  P2P
0       8.41  8.41    6.05  6.05    2.95  2.95    1.95  1.95    1.39  1.39    0.95  0.95
0.1     7.93  8.41    5.98  6.05    2.91  2.95    1.49  1.52    1.35  1.39    0.89  0.93
0.3     7.45   —      5.89   —      2.89   —      1.40   —      1.18   —      0.69   —
0.5     6.69   —      5.77   —      2.88   —      1.30   —      1.00   —      0.52   —
0.7     5.71   —      5.54   —      2.84   —      1.24   —      0.89   —      0.33   —
0.9     3.75  8.10    5.19  5.75    2.74  2.78    0.75  1.47    0.62  1.38   −0.82  0.88
















TABLE III
Mean running times (in sec) per iteration.

Non-distributed        Central LASSO-VAR            P2P LASSO-VAR
LASSO-VAR              Enc. data    ADMM            Enc. data    ADMM
0.035 (≈410)           10.91        0.052 (≈300)    10.91        0.1181 (≈300)

Values in parentheses: number of iterations.









The following describes the numerical results of the case study. To assess the quality of the proposed collaborative forecasting model, the synchronous LASSO-VAR is compared with the benchmark models. Both the central hub and P2P models have the same accuracy when considering synchronous communication.


Table I presents the NRMSE error for all agents, distinguishing between lead-times. In general, the smaller the forecasting horizon, the larger the NRMSE improvement, i.e., (NRMSE_Bench. − NRMSE_VAR)/NRMSE_Bench. · 100%. Besides, since the proposed VAR and the AR models have similar NRMSE for h > 3, the Diebold-Mariano test is applied to test the superiority of the proposal, assuming a 5% significance level. This test showed that the improvement is statistically significant for all horizons. FIG. 4 depicts the relative improvement in terms of NRMSE for the 44 agents. According to the Diebold-Mariano test, the LASSO-VAR model outperforms the benchmarks in all lead-times for at least 25 of the 44 agents.


For asynchronous communication, equal failure probabilities p_i are assumed for all agents. Since a specific p_i can generate various distinct failure sequences, 20 simulations were performed for each p_i, p_i ∈ {0.1, 0.3, 0.5, 0.7, 0.9}. Table II shows the mean NRMSE improvement for different failure probabilities p_i, i = 1, . . . , n. In general, the greater the p_i, the smaller the improvement. Although the model's accuracy decreases slightly, the LASSO-VAR model continues to outperform the AR model in both collaborative schemes, which demonstrates high robustness to communication failures.



FIG. 5 depicts the evolution of the loss while training the LASSO-VAR model, considering p_i ∈ {0.5, 0.9}. For the centralized approach, the loss tends to stabilize around larger values. In general, the results are better for the P2P scheme, since in the centralized approach, if an agent fails, the algorithm proceeds with no chance of obtaining its information. In P2P, this agent may have communicated its contribution to some peers, and the probability of losing information is smaller.


Finally, Table III depicts the mean running times and the number of iterations of both the non-distributed and distributed approaches. The proposed schemes require larger execution times. This was expected because they require estimating B′_{A_i}^{k} through a second ADMM cycle (FIG. 6). However, the distributed LASSO-VAR requires a smaller number of iterations to achieve the stopping criterion (ε = 5×10⁻⁴).


In conclusion, RES forecast models can be improved by combining data from multiple geographical locations. One of the simplest and most effective collaborative models for very short-term forecasts is the vector autoregressive model. However, different data owners might be unwilling to share their time series data. In order to ensure data privacy, this work combined the advantages of the ADMM decomposition method with data encryption through linear transformations of the data. It is important to mention that the coefficient matrix obtained with the privacy-preserving protocol proposed in this work is the same as that obtained without any privacy protection.


This novel method also included an asynchronous distributed ADMM algorithm, making it possible to update the forecast model based on information from a subset of agents and improve the computational efficiency of the proposed model. The mathematical formulation is flexible enough to be applied in two different collaboration schemes (central hub model and P2P) and paved the way for learning models distributed by features, instead of observations.


The results obtained for a solar energy dataset show that the privacy-preserving VAR model delivers a forecasting skill comparable to a model without privacy protection and outperformed a state-of-the-art method based on analog search. Furthermore, it exhibited high robustness to communication failures, in particular for the P2P scheme. Two aspects not addressed in this disclosure were uncertainty forecasting and the application to non-linear models (and consequently longer lead times). Uncertainty forecasts can be readily generated by transforming the original data using a logit-normal distribution, and we plan to investigate the application to longer time horizons. The following discusses how to verify or obtain an optimal value of r.


Disclosed r determination method 1. Let X_{A_i} ∈ ℝ^{T×s} be the sensitive data from the i-th agent, with u unique values, and M_{A_j} ∈ ℝ^{T×T} be the private encryption matrix from the j-th agent. If the agents compute M_{A_j} X_{A_i} applying the protocol in (13)-(14), then two matrices D_{A_i} ∈ ℝ^{r×r} and C_{A_i} ∈ ℝ^{T×(r−s)} are generated by the i-th agent, and data privacy is ensured for an integer r such that





$$\sqrt{Ts - u} < r < T \;\wedge\; r > s. \qquad (25)$$


Proof. Since the i-th agent only receives M_{A_j}[X_{A_i}, C_{A_i}] D_{A_i} ∈ ℝ^{T×r}, the matrix M_{A_j} ∈ ℝ^{T×T} is protected if r < T. On the other hand, the j-th agent receives W_{A_i} ∈ ℝ^{T×r} and does not know C_{A_i} ∈ ℝ^{T×(r−s)} and D_{A_i} ∈ ℝ^{r×r}, r > s. Although X_{A_i} ∈ ℝ^{T×s}, we assume this matrix has u unique values whose positions are known by all agents—because, when defining a VAR model with p consecutive lags, Z_{A_i} has T + p − 1 unique values, see FIG. 2—meaning there are fewer values to recover.


Given that, the j-th agent receives Tr values and wants to determine u + T(r−s) + r² unknowns. The solution of the inequality






$$Tr < u + T(r - s) + r^2 \qquad (26)$$


in r determines that the data from the i-th agent are protected when






$$r > \sqrt{Ts - u}. \qquad (27)$$


Disclosed r determination method 2. Let X_{A_i} ∈ ℝ^{T×s} and G_{A_i} ∈ ℝ^{T×g} be private data matrices, such that X_{A_i} has u unique values to recover and G_{A_i} has v unique values that are not in X_{A_i}. Assume the protocol in (13)-(14) is applied to compute M X_{A_i}, X_{A_i}^T M^{-1} and M G_{A_i}, with M as defined in (11). Then, to ensure privacy while computing M X_{A_i} and X_{A_i}^T M^{-1}, the protocol requires





$$\sqrt{Ts - u} < r < T/2 \;\wedge\; r > s. \qquad (28)$$


In addition, to compute MGAi, the protocol should take





$$\sqrt{-u + Ts - r^2 - v + Tg} < r' < T - 2r \;\wedge\; r' > g. \qquad (29)$$


Proof. (i) To compute M X_{A_i}, the i-th agent shares W_{A_i} = [X_{A_i}, C_{A_i}] D_{A_i} ∈ ℝ^{T×r} with the n-th agent, C_{A_i} ∈ ℝ^{T×(r−s)}, D_{A_i} ∈ ℝ^{r×r}, r > s. Then, the process repeats until the 1-st agent receives M_{A_2} ⋯ M_{A_n} W_{A_i} and computes M W_{A_i} = M_{A_1} M_{A_2} ⋯ M_{A_n} W_{A_i}. Consequently, agent j = 1, . . . , n receives Tr values during the protocol.

    • (ii) X_{A_i}^T M^{-1} is computed using the matrix W_{A_i} defined before. Since M^{-1} = M_{A_n}^{-1} M_{A_{n-1}}^{-1} ⋯ M_{A_1}^{-1}, the n-th agent computes W_{A_i}^T M_{A_n}^{-1}. Then, the process repeats until the 1-st agent receives W_{A_i}^T M_{A_n}^{-1} M_{A_{n-1}}^{-1} ⋯ M_{A_2}^{-1} and computes W_{A_i}^T M_{A_n}^{-1} M_{A_{n-1}}^{-1} ⋯ M_{A_2}^{-1} M_{A_1}^{-1}. Again, the j-th agent receives Tr values related to the unknown data from the i-th agent. In summary, the n-th agent receives Tr values related with X_{A_i} and u + T(r−s) + r² unknowns (from X_{A_i}, C_{A_i} and D_{A_i}). The solution for Tr < u + T(r−s) + r² allows to infer that X_{A_i} is protected if √(Ts − u) < r. On the other hand, the i-th agent receives 2Tr values (M W_{A_i}, W_{A_i}^T M^{-1}) and does not know the T² values from M, meaning that r < T/2.
    • (iii) Finally, to compute M G_{A_i}, the i-th agent should define new matrices C′_{A_i} ∈ ℝ^{T×(r′−g)}, D′_{A_i} ∈ ℝ^{r′×r′}, sharing W′_{A_i} = [G_{A_i}, C′_{A_i}] D′_{A_i} ∈ ℝ^{T×r′}, r′ > g. The computation of M W′_{A_i} provides Tr′ new values, meaning that after computing M X_{A_i}, X_{A_i}^T M^{-1} and M G_{A_i}, the n-th agent has Tr + Tr′ values and does not know u + T(r−s) + r² + v + T(r′−g) + r′² values (from X_{A_i}, C_{A_i}, D_{A_i}, G_{A_i}, C′_{A_i} and D′_{A_i}, respectively). The solution of the inequality Tr + Tr′ < u + T(r−s) + r² + v + T(r′−g) + r′² allows to infer that r′ > √(−u + Ts − r² − v + Tg). On the other hand, the i-th agent receives 2Tr + Tr′ values and does not know T², meaning that r′ < T − 2r.


Global Privacy Analysis. While encrypting sensitive data $X_{A_i} \in \mathbb{R}^{T\times s}$ and $G_{A_i} \in \mathbb{R}^{T\times g}$, such that $X_{A_i}$ has u unique values to recover and $G_{A_i}$ has v unique values that are not in $X_{A_i}$, the 1-st agent receives $M[X_{A_i}, C_{A_i}]D_{A_i} \in \mathbb{R}^{T\times r}$, $([X_{A_i}, C_{A_i}]D_{A_i})^T M^{-1} \in \mathbb{R}^{r\times T}$ and $W'_{A_i} = [G_{A_i}, C'_{A_i}]D'_{A_i} \in \mathbb{R}^{T\times r'}$, ∀i, which provides 2nTr + nTr′ values. At this stage, the agent does not know








$$\underbrace{T^2}_{M} + \underbrace{(n-1)\,u}_{X_{A_i},\ i \neq 1} + \underbrace{(n-1)\,v}_{G_{A_i},\ i \neq 1} + \underbrace{(n-1)\,T(r-s)}_{C_{A_i},\ i \neq 1} + \underbrace{(n-1)\,r^2}_{D_{A_i},\ i \neq 1} + \underbrace{(n-1)\,T(r'-g)}_{C'_{A_i},\ i \neq 1} + \underbrace{(n-1)\,r'^2}_{D'_{A_i},\ i \neq 1}$$
values. Then, while fitting the LASSO-VAR model, the 1-st agent can recover $MX \in \mathbb{R}^{T\times ns}$ and $MG \in \mathbb{R}^{T\times ng}$, as shown in [2]. That said, the 1-st agent receives 2nTr + nTr′ + nTs + nTg values in total, and a confidentiality breach occurs if $T(2nr + nr' + ns + ng) \geq T^2 + (n-1)[u + v + T(r-s) + r^2 + T(r'-g) + r'^2]$. After a little algebra, it is possible to verify that, taking (28) and (29), the previous inequality has no solution in $\mathbb{R}_0^+$. Thus, global privacy is assured by the present disclosure.
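The breach inequality above can also be evaluated numerically for a candidate configuration before running the protocol. The following sketch is illustrative only: the helper name and the sample sizes (a two-agent setting with a VAR of p = 3 lags, so u = T + p − 1) are assumptions, not values from the disclosure; a negative margin means the attacker's system of equations remains under-determined.

```python
def breach_margin(T, n, s, g, u, v, r, rp):
    """Values received by the 1-st agent minus the values it does not know.
    A non-negative margin would signal a potential confidentiality breach."""
    received = T * (2 * n * r + n * rp + n * s + n * g)
    unknown = T**2 + (n - 1) * (u + v + T * (r - s) + r**2 + T * (rp - g) + rp**2)
    return received - unknown

# Illustrative two-agent configuration with r, r' picked inside (28)-(29):
# T=100, s=3, u=T+p-1=102 (p=3 lags), g=1, v=100, r=15, r'=2.
print(breach_margin(T=100, n=2, s=3, g=1, u=102, v=100, r=15, rp=2))  # -4531
```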


The term “comprising” whenever used in this document is intended to indicate the presence of stated features, integers, steps, components, but not to preclude the presence or addition of one or more other features, integers, steps, components or groups thereof.


A pseudo-code algorithm of particular embodiments of the presently disclosed methods is depicted in the figures. The pseudo-code algorithm illustrates the functional information one of ordinary skill in the art requires to perform said methods in accordance with the present disclosure. It will be appreciated by those of ordinary skill in the art that, unless otherwise indicated herein, the particular sequence of steps described is illustrative only and can be varied without departing from the disclosure. Thus, unless otherwise stated, the steps described are unordered, meaning that, when possible, the steps can be performed in any convenient or desirable order.


It is to be appreciated that certain embodiments of the disclosure as described herein may be incorporated as code (e.g., a software algorithm or program) residing in firmware and/or on computer useable medium having control logic for enabling execution on a computer system having a computer processor, such as any of the servers described herein.


It is to be appreciated that certain implementations of the disclosure as described herein can be incorporated as code (e.g., a software algorithm or program) residing in firmware and/or on computer useable medium having control logic for enabling execution on a plurality of computer systems, each having a computer processor, such as any of the servers described herein. Such a computer system typically includes memory storage configured to provide output from execution of the code which configures a processor in accordance with the execution.


The disclosure can be realized by way of a plurality of computer processors, in particular general-purpose computer processors or purpose-specific computer processors such as a microcontroller, on a purpose-specific card or module, embedded in a circuit or chip, such as a custom-built chip, an FPGA (field-programmable gate array) or FPGA-like chip, or as a firmware program recorded in media such as ROM, EPROM, or the like. Examples include general-purpose hardware like Atmel™ devices, Intel™ based devices, ARM™ based devices, or custom-purpose systems like a custom-built SoC (system on a chip), namely as a semiconductor intellectual property core (SIP core), IP core, or IP block (a reusable unit of logic, cell, or integrated circuit layout to be used in chip manufacture). The plurality of processors may be physically distanced or physically close, for example when virtualized in the same physical processor, provided that each agent's private data is kept private from the other agents.


The code can be arranged as firmware or software, and can be organized as a set of modules, including the various modules and algorithms described herein, such as discrete code modules, function calls, procedure calls or objects in an object-oriented programming environment. If implemented using modules, the code can comprise a single module or a plurality of modules that operate in cooperation with one another to configure the machine in which it is executed to perform the associated functions, as described herein.


The disclosure should not be seen in any way restricted to the embodiments described, and a person of ordinary skill in the art will foresee many possible modifications thereof. The above-described embodiments are combinable.


The following claims further set out particular embodiments of the disclosure.


This work has been financed by the ERDF European Regional Development Fund through the Operational Programme for Competitiveness and Internationalisation—COMPETE 2020 Programme, and by National Funds through the Portuguese funding agency, FCT—Fundação para a Ciência e a Tecnologia, within project ESGRIDS—Desenvolvimento Sustentável da Rede Elétrica Inteligente/SAICTPAC/0004/2015-POCI-01-0145-FEDER-016434.

Claims
  • 1. A computer implemented method for preserving privacy of a linear regression model used in distributed learning by a set of agents sharing covariate data and/or target data for said model, comprising obtaining an invertible random perturbation matrix, as an algebraic encryption key given by multiplication of a plurality of invertible randomly generated perturbation sub-matrixes, there being a perturbation sub-matrix for each respective agent, wherein the invertible random perturbation matrix is to be used as an algebraic encryption key by left-multiplication of the data to be encrypted, wherein an invertible perturbation sub-matrix is privately and randomly generated by each individual agent.
  • 2. A computer implemented method for preserving privacy of a linear regression model used in distributed learning by a set of n agents sharing covariate data and/or target data for said model, according to the previous claim, comprising obtaining an invertible random perturbation matrix M, as an algebraic encryption key $M \in \mathbb{R}^{T\times T}$, given by multiplication of sub-matrixes $M_{A_i} \in \mathbb{R}^{T\times T}$ given by: $M = M_{A_1} \cdots M_{A_n}$  (11), such that MX is given by:
  • 3. The computer implemented method according to claim 2, wherein said method steps are repeated for encrypting $M^{-1}$, the inverse of the perturbation matrix M.
  • 4. The computer implemented method according to claim 2, further comprising obtaining a perturbation matrix $Q \in \mathbb{R}^{ns\times ns}$, for post-multiplying the data to be encrypted $X = [X_{A_1}, \ldots, X_{A_n}]$, where $X_{A_i} \in \mathbb{R}^{T\times s}$ is the data to be encrypted of each i-th agent, as an algebraic encryption key Q, such that XQ is given by:
  • 5. The computer implemented method according to claim 2, wherein privacy of linear regression distributed learning is by using a LASSO (least absolute shrinkage and selection operator)-VAR (vector autoregressive) model.
  • 6. The computer implemented method according to claim 5, wherein the LASSO-VAR model uses convex optimization.
  • 7. The computer implemented method according to claim 6, wherein $Z = [Z_{A_1}, \ldots, Z_{A_n}]$ is a covariate matrix and $Y = [Y_{A_1}, \ldots, Y_{A_n}]$ is a target matrix, where Z and Y are data to be encrypted and transmitted, where the covariate matrix Z is split by model features, comprising each of said agents computing $MZ_{A_i}$, $MY_{A_i}$ and $Z_{A_i}^T M^{-1}$ for obtaining the encrypted data to be transmitted to the other agents.
  • 8. The computer implemented method according to claim 4, wherein ZQ is a covariate matrix and Y is a target matrix, where $Z = [Z_{A_1}, \ldots, Z_{A_n}]$ and $Y = [Y_{A_1}, \ldots, Y_{A_n}]$ are data to be encrypted and transmitted, where the covariate matrix ZQ is split by model features, comprising each of said agents computing $MZ_{A_i}Q_{A_i}$, $MY_{A_i}$ and $Q_{A_i}^T Z_{A_i}^T M^{-1}$ for obtaining the encrypted data to be transmitted to the other agents.
  • 9. The computer implemented method according to claim 8, wherein $Z_{A_i}Q_{A_i} \in \mathbb{R}^{T\times s}$ has u unique values to recover and $Y_{A_i} \in \mathbb{R}^{T\times g}$ has v unique unknown values that are not in $Z_{A_i}$, comprising the steps of computing $MZ_{A_i}Q_{A_i}$ and $Q_{A_i}^T Z_{A_i}^T M^{-1}$ using the smallest integer r such that $\sqrt{Ts-u} < r < T/2 \wedge r > s$, and computing $MY_{A_i}$ using the smallest integer r′ such that $\sqrt{Ts - u - r^2 + Tg - v} < r' < T - 2r \wedge r' > g$.
  • 10. The computer implemented method for obtaining a non-encrypted LASSO-VAR model, according to claim 8, further comprising each agent: obtaining said LASSO-VAR model, where the coefficient matrix $B^{(l)} \in \mathbb{R}^{n\times n}$, which represents the coefficient matrix at lag l = 1, . . . , p, is split into variables B and H with the added constraint H = B to obtain model coefficients $B^{k+1}$, where $B^{k+1} = (Z^T Z + \rho I)^{-1}(Z^T Y + \rho(H^k - U^k))$  (6a), $H^{k+1} = S_{\lambda/\rho}(B^{k+1} + U^k)$  (6b), $U^{k+1} = U^k + B^{k+1} - H^{k+1}$  (6c)
  • 11. The computer implemented method according to claim 10, further comprising computing $B_{A_i}$ in parallel as $B_{A_i}^{k+1}$, which is obtained by:
  • 12. The computer implemented method according to claim 11, further comprising obtaining the ADMM solution for the optimization problem:
  • 13. The computer implemented method according to claim 1, wherein said linear regression model is for forecasting wind or solar power.
  • 14. A non-transitory storage media comprising computer program instructions for implementing a method for preserving privacy of a linear regression model used in distributed learning by a set of agents indirectly sharing covariate data or target data when sharing coefficient matrixes for said model, the computer program instructions including instructions which, when executed by a processor of each agent of said set of agents, cause said processors to carry out the method of claim 1.
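For illustration of the ADMM updates (6a)-(6c) recited in claim 10, a minimal non-limiting Python sketch follows; all names, the data-generation step, and the parameter values are illustrative assumptions, and the soft-thresholding operator $S_{\lambda/\rho}$ is implemented element-wise:

```python
import numpy as np

def soft_threshold(A, kappa):
    """Element-wise soft-thresholding operator S_kappa(A)."""
    return np.sign(A) * np.maximum(np.abs(A) - kappa, 0.0)

def lasso_var_admm(Z, Y, lam=0.1, rho=1.0, iters=200):
    """ADMM iterations (6a)-(6c) for min ||Y - Z B||^2 + lam * ||B||_1."""
    m = Z.shape[1]
    H = np.zeros((m, Y.shape[1]))
    U = np.zeros_like(H)
    ZtZ_rho_inv = np.linalg.inv(Z.T @ Z + rho * np.eye(m))  # cached factor
    ZtY = Z.T @ Y
    for _ in range(iters):
        B = ZtZ_rho_inv @ (ZtY + rho * (H - U))   # (6a) B-update
        H = soft_threshold(B + U, lam / rho)      # (6b) H-update
        U = U + B - H                             # (6c) dual update
    return H                                      # sparse coefficient estimate

# Example: recover a sparse coefficient matrix from synthetic data.
rng = np.random.default_rng(2)
Z = rng.normal(size=(100, 6))
Y = Z @ np.diag([1.0, 0, 0.5, 0, 0, 0]) + 0.01 * rng.normal(size=(100, 6))
print(np.round(lasso_var_admm(Z, Y), 2))
```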
Priority Claims (2)
Number Date Country Kind
116866 Oct 2020 PT national
20215836.6 Dec 2020 EP regional
PCT Information
Filing Document Filing Date Country Kind
PCT/EP2021/080427 11/2/2021 WO