PRIVACY-PRESERVING COMPUTATION METHOD AND SYSTEM FOR SECURE THREE-PARTY LINEAR REGRESSION

Information

  • Patent Application
  • Publication Number
    20250106004
  • Date Filed
    February 22, 2024
  • Date Published
    March 27, 2025
Abstract
Disclosed is a privacy-preserving computation method and system for secure three-party linear regression, relating to the technical field of privacy-preserving computation. The method includes: processing, by two participants, two private transposed matrices and two private matrices by using a 2PHMP to obtain ∂1 and ∂2; processing, by the two participants, ∂1 and ∂2 by using a 2PIP to obtain u1 and u2; splitting, by the first participant, a first private transposed matrix into v1 and Δ; obfuscating and overlaying, by the second participant, a second private transposed matrix with Δ to obtain v2; and processing v2, u1, u2, v1, and a third private matrix by using a 3PHMP to obtain a regression coefficient matrix.
Description
CROSS REFERENCE TO RELATED APPLICATION

This patent application claims the benefit and priority of Chinese Patent Application No. 202311227154X, filed with the China National Intellectual Property Administration on Sep. 21, 2023, the disclosure of which is incorporated by reference herein in its entirety as part of the present application.


TECHNICAL FIELD

The present disclosure relates to the technical field of privacy-preserving computation, and in particular, to a privacy-preserving computation method and system for secure three-party linear regression.


BACKGROUND

A privacy-preserving computation technology not only realizes secure circulation of data but also effectively ensures separation of data ownership and data use right, on the premise that the privacy of original data is guaranteed not to be disclosed. When privacy-preserving computation is applied to multi-party joint regression modeling in the field of data governance, the conflict between user privacy security and joint modeling performance can be resolved. Research on joint linear regression modeling based on privacy preservation therefore holds both theoretical and practical significance.


Currently, there are four main types of privacy-preserving computing solutions for linear regression problems: differential privacy, homomorphic encryption, secure multi-party computation, and federated learning.


Differential Privacy:

Kamalika C and others proposed a differential privacy protection algorithm for regression analysis, which is called DiffPETs. It enhances the security of differential privacy by adding noise to an objective function of an optimization problem. Pan and others introduced an adaptive differentially private regression (ADPR) model that dynamically allocates privacy budgets using a strongly correlated noise mechanism. This effectively reduces the noise in the objective function when input features have a significant impact on the model. However, while differential privacy-based solutions can achieve privacy protection by introducing noise to data or the objective function, the introduction of a large amount of noise can lead to a decrease in model accuracy and reduced usability. Therefore, the differential privacy technology is unreliable and imprecise.


Homomorphic Encryption:

Miran and others designed an optimized homomorphic encryption scheme for real-number computations, using packing and parallelization techniques. It implements regression training on ciphertext data using gradient descent and approximates the logistic function through least squares to improve model accuracy and computational efficiency. Giacomelli and others proposed a system that trains a ridge linear regression model using only linearly homomorphic encryption (LHE). A cryptographic service provider (CSP) is responsible for initializing encryption parameters in the system, while a machine learning engine (MLE) is responsible for collecting encrypted datasets provided by data owners DOi and collaborating with the CSP to train the model. Dong and others introduced a locally weighted linear regression (LWLR) privacy-preserving computation system, which expands the exponential function in the Paillier encryption using the Taylor series, encrypts data using Paillier homomorphic encryption, and then performs LWLR computations on ciphertext data using a stochastic gradient descent algorithm. While homomorphic encryption-based solutions significantly enhance the privacy security of data by introducing ciphertext computations, the complexity of ciphertext models directly leads to an exponential increase in computational overhead. Additionally, in the case of large-scale data, an excessively high frequency of encryption operations can lead to a geometric increase in ciphertext noise, resulting in significant errors in computation results and making it impossible to recover the plaintext through decryption. Therefore, the homomorphic encryption technology faces issues of low speed and low accuracy.


Secure Multi-Party Computation:

Mohassel and others designed a stochastic gradient descent algorithm for secure two-party computation based on Garbled Circuits (GC) and Oblivious Transfer (OT) protocols. This approach implements a model training solution called SecureML for linear regression, logistic regression, and neural networks. In this approach, a data owner distributes private data to two servers using secret sharing, and the two servers train the model through secure multi-party computation, achieving distributed regression computation. Demmler and others proposed a privacy-preserving linear regression framework called ABY3, which is based on the secret sharing technology and the stochastic gradient descent (SGD) method. This framework generates multiplication triples by using three technical means, namely arithmetic sharing, Boolean sharing, and Yao's garbled circuits, in an honest-but-curious environment, presenting a regression protocol based on efficient circuit transformation. Wei and others achieved fast training and modeling on vertically partitioned datasets through Asynchronous Gradient Sharing (AGS) and local multi-round computations in the presence of two parties. Although secure multi-party computation schemes ensure both input privacy and computational accuracy, protocols of this type offset expensive computational overhead through extensive communication interactions. Consequently, an excessive increase in the number of participants leads to a significant increase in communication time. Therefore, the secure multi-party computation technology has the issue of long communication time.


Federated Learning:

Shokri and others proposed a privacy-preserving regression scheme based on federated learning by selectively sharing model gradients between servers and data owners. All computations are performed in a plaintext format. Bonawitz and others introduced a practical and secure federated learning model using secret sharing and key agreement protocols. This solution is implemented through secret sharing of gradients between non-colluding servers. While federated learning effectively prevents privacy leakage of local data of a data owner, the encryption of local data and interaction of gradient parameters increase local storage overhead and pose risks of model information leakage. Therefore, the federated learning technology fails to achieve dual protection for data and models.


In summary, there is a need for a privacy-preserving computation method that addresses the problems of the four types of privacy-preserving computation solutions mentioned above.


SUMMARY

An objective of the present disclosure is to provide a privacy-preserving computation method and system for secure three-party linear regression that can improve reliability and accuracy, reduce communication time, and achieve dual protection for data and models.


To achieve the above objective, the present disclosure provides the following technical solutions.


A privacy-preserving computation method for secure three-party linear regression is provided, including:


computing, by a first participant, a transposed matrix of a first private matrix to obtain a first private transposed matrix, where the first private matrix is a private matrix of the first participant;


computing, by a second participant, a transposed matrix of a second private matrix to obtain a second private transposed matrix, where the second private matrix is a private matrix of the second participant;


processing, by the first participant and the second participant, the first private transposed matrix, the second private transposed matrix, the first private matrix, and the second private matrix by using a secure two-party matrix hybrid multiplication protocol (2PHMP), to obtain a matrix ∂1 and a matrix ∂2, where the matrix ∂1 is stored in the first participant, and the matrix ∂2 is stored in the second participant;


processing, by the first participant and the second participant, the matrix ∂1 and the matrix ∂2 by using a secure two-party matrix inversion protocol (2PIP), to obtain a matrix u1 and a matrix u2, where the matrix u1 is stored in the first participant, and the matrix u2 is stored in the second participant;


splitting, by the first participant, the first private transposed matrix into a random matrix v1 and a difference matrix Δ, and sending the difference matrix Δ to the second participant;


secretly obfuscating and overlaying, by the second participant, the second private transposed matrix with the difference matrix Δ to obtain a private matrix v2; and


processing, by a requester of secure three-party linear regression computation, the first participant, the second participant, and a third participant, a first private matrix sequence, a second private matrix sequence, and a third private matrix by using a secure three-party matrix hybrid multiplication protocol (3PHMP), to obtain a regression coefficient matrix of a secure three-party linear regression model, where the third private matrix is a private matrix of the third participant, the first private matrix sequence includes the matrix u1 and the random matrix v1, and the second private matrix sequence includes the matrix u2 and the random matrix v2.


A privacy-preserving computation system for secure three-party linear regression is provided, including:


a first private transposed matrix computing module configured to allow a first participant to compute a transposed matrix of a first private matrix to obtain a first private transposed matrix, where the first private matrix is a private matrix of the first participant;


a second private transposed matrix computing module configured to allow a second participant to compute a transposed matrix of a second private matrix to obtain a second private transposed matrix, where the second private matrix is a private matrix of the second participant;


a 2PHMP module configured to allow the first participant and the second participant to process the first private transposed matrix, the second private transposed matrix, the first private matrix, and the second private matrix by using a secure two-party matrix hybrid multiplication protocol, to obtain a matrix ∂1 and a matrix ∂2, where the matrix ∂1 is stored in the first participant, and the matrix ∂2 is stored in the second participant;


a 2PIP module configured to allow the first participant and the second participant to process the matrix ∂1 and the matrix ∂2 by using a secure two-party matrix inversion protocol, to obtain a matrix u1 and a matrix u2, where the matrix u1 is stored in the first participant, and the matrix u2 is stored in the second participant;


a splitting module configured to allow the first participant to split the first private transposed matrix into a random matrix v1 and a difference matrix Δ, and send the difference matrix Δ to the second participant;


an obfuscation and overlay computation module configured to allow the second participant to secretly obfuscate and overlay the second private transposed matrix with the difference matrix Δ to obtain a private matrix v2; and


a 3PHMP module configured to allow a requester of secure three-party linear regression computation, the first participant, the second participant, and a third participant to process a first private matrix sequence, a second private matrix sequence, and a third private matrix by using a secure three-party matrix hybrid multiplication protocol, to obtain a regression coefficient matrix of a secure three-party linear regression model, where the third private matrix is a private matrix of the third participant, the first private matrix sequence includes the matrix u1 and the random matrix v1, and the second private matrix sequence includes the matrix u2 and the random matrix v2.


According to specific embodiments provided in the present disclosure, the present disclosure has the following technical effects:


The present disclosure solves the privacy-preserving computation problem of secure three-party linear regression using the secure two-party matrix hybrid multiplication protocol, secure two-party matrix inversion protocol, and secure three-party matrix hybrid multiplication protocol, to improve the speed and accuracy, and reduce the communication time. The three computation protocols include a secure two-party matrix multiplication protocol and secure three-party matrix multiplication protocol, enhancing the reliability. Moreover, the present disclosure achieves dual protection for data and models through data exchange among the participants, without relying on any external cloud servers or computing services.





BRIEF DESCRIPTION OF THE DRAWINGS

To describe the technical solutions in embodiments of the present disclosure or in the prior art more clearly, the accompanying drawings required in the embodiments are briefly described below. Apparently, the accompanying drawings in the following description show merely some embodiments of the present disclosure, and other drawings can be derived from these accompanying drawings by those of ordinary skill in the art without creative efforts.



FIG. 1 is a schematic diagram of a secure three-party linear regression problem according to an embodiment of the present disclosure;



FIGS. 2A-D are schematic diagrams of four division formats of a multi-source heterogeneous dataset according to an embodiment of the present disclosure;



FIGS. 3A-D are schematic diagrams of four federated computing models based on a multi-source heterogeneous dataset according to an embodiment of the present disclosure;



FIG. 4 is a flowchart of a privacy-preserving computation method for secure three-party linear regression according to an embodiment of the present disclosure;



FIG. 5 is a schematic diagram of a data disguising technical solution according to an embodiment of the present disclosure; and



FIG. 6 is a diagram of an implementation apparatus of privacy-preserving computation for secure three-party linear regression applied to a computing node according to an embodiment of the present disclosure.





DETAILED DESCRIPTION OF THE EMBODIMENTS

The technical solutions of the embodiments of the present disclosure are clearly and completely described below with reference to the drawings in the embodiments of the present disclosure. Apparently, the described embodiments are merely a part rather than all of the embodiments of the present disclosure. All other embodiments obtained by those skilled in the art based on the embodiments of the present disclosure without creative efforts shall fall within the protection scope of the present disclosure.


In order to make the above objective, features and advantages of the present disclosure clearer and more comprehensible, the present disclosure will be further described in detail below in combination with accompanying drawings and particular implementation modes.


The secure three-party linear regression protocol (3PLRP), as the name suggests, assumes the presence of three mutually distrusting participants: Alice, Bob, and Carol. A dataset X={X1|X2} is composed of a private data matrix X1 and a private data matrix X2, which are obtained by vertical partitioning and have the same number of samples. The participant Alice holds the private data matrix X1, the participant Bob holds the private data matrix X2, and the participant Carol holds a private data matrix Y that contains the label information corresponding to the dataset. The three participants of the computation jointly execute a three-party regression model f(X1, X2, Y)=Output(Va′, Vb′, Vc)=β. Eventually, the participants obtain respective output matrices Va′, Vb′, Vc, which are random splits of the model regression coefficient β, and the outputs satisfy β=Va′+Vb′+Vc=(XTX)−1XTY. Throughout the entire computation process, each participant node only has knowledge of the input and output data involved in its own computation flow and cannot access any intermediate computation results of other participant nodes. Privacy-preserving computation refers to a set of information technologies that allow for the analysis and computation of data while ensuring that the data provider does not disclose the original data, so that data remains "available but invisible" during circulation and integration processes.
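As a point of reference, the regression target described above can be sketched in plaintext with NumPy (no privacy protection here; the shapes and random data are purely illustrative):

```python
import numpy as np

# Plaintext baseline of the three-party regression target (NOT the secure
# protocol): Alice holds X1, Bob holds X2 (vertical partition of X), and
# Carol holds the label matrix Y. The goal of the protocol is to compute
# beta = (X^T X)^(-1) X^T Y without any party revealing its private matrix.
rng = np.random.default_rng(0)
n, d1, d2 = 100, 3, 2                  # samples, Alice's features, Bob's features
X1 = rng.standard_normal((n, d1))      # Alice's private matrix
X2 = rng.standard_normal((n, d2))      # Bob's private matrix
Y = rng.standard_normal((n, 1))        # Carol's private label matrix

X = np.hstack([X1, X2])                # X = {X1 | X2}, vertical partitioning
beta = np.linalg.inv(X.T @ X) @ X.T @ Y

# In the protocol, beta is never materialized by a single party: the three
# participants end with additive shares Va', Vb', Vc with Va'+Vb'+Vc = beta.
```

The secure protocol below reproduces exactly this result, but as a sum of three random shares.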


On this basis, an embodiment of the present disclosure provides a privacy-preserving computation method for secure three-party linear regression, which includes the following steps:


A first participant computes a transposed matrix of a first private matrix to obtain a first private transposed matrix, where the first private matrix is a private matrix of the first participant.


A second participant computes a transposed matrix of a second private matrix to obtain a second private transposed matrix, where the second private matrix is a private matrix of the second participant.


The first participant and the second participant process the first private transposed matrix, the second private transposed matrix, the first private matrix, and the second private matrix by using a secure two-party matrix hybrid multiplication protocol, to obtain a matrix ∂1 and a matrix ∂2, where the matrix ∂1 is stored in the first participant, and the matrix ∂2 is stored in the second participant.


The first participant and the second participant process the matrix ∂1 and the matrix ∂2 by using a secure two-party matrix inversion protocol, to obtain a matrix u1 and a matrix u2, where the matrix u1 is stored in the first participant, and the matrix u2 is stored in the second participant;


The first participant splits the first private transposed matrix into a random matrix v1 and a difference matrix Δ, and sends the difference matrix Δ to the second participant.


The second participant secretly obfuscates and overlays the second private transposed matrix with the difference matrix Δ to obtain a private matrix v2.


A requester of the secure three-party linear regression computation, the first participant, the second participant, and a third participant process a first private matrix sequence, a second private matrix sequence, and a third private matrix by using a secure three-party matrix hybrid multiplication protocol, to obtain a regression coefficient matrix of a secure three-party linear regression model, where the third private matrix is a private matrix of the third participant, the first private matrix sequence includes the matrix u1 and the random matrix v1, and the second private matrix sequence includes the matrix u2 and the random matrix v2.


The secure two-party matrix hybrid multiplication protocol (2PHMP), as the name suggests, assumes the presence of two mutually distrusting participants. The two participants hold secret input matrix pairs, (X1, X1T) and (X2, X2T), respectively, and jointly execute a two-party hybrid multiplication protocol f((X1, X1T), (X2, X2T))=Output(∂1, ∂2)=(X1T+X2T)·(X1+X2). Eventually, the two participants obtain corresponding outputs ∂1, ∂2 respectively, and the outputs satisfy ∂1+∂2=(X1T+X2T)·(X1+X2). Throughout the entire computation process, each participant node only has knowledge of the input and output data involved in its own computation process and cannot access any intermediate computation results of the other participant. Therefore, in practical applications, the step of processing, by the first participant and the second participant, the first private transposed matrix, the second private transposed matrix, the first private matrix, and the second private matrix by using the secure two-party matrix hybrid multiplication protocol to obtain the matrix ∂1 and the matrix ∂2 specifically includes the following steps:


The first participant and the second participant process the first private transposed matrix and the second private matrix by using a secure two-party matrix multiplication protocol, to obtain a matrix Va1 and a matrix Vb1. The matrix Va1 is stored in the first participant, and the matrix Vb1 is stored in the second participant.


The first participant computes a product of the first private matrix and the first private transposed matrix, to obtain a matrix Va0.


The first participant and the second participant process the first private matrix and the second private transposed matrix by using the secure two-party matrix multiplication protocol, to obtain a matrix Va2 and a matrix Vb2. The matrix Va2 is stored in the first participant, and the matrix Vb2 is stored in the second participant.


The second participant computes a product of the second private matrix and the second private transposed matrix, to obtain a matrix Vb0.


The first participant obtains the matrix ∂1 based on the matrix Va1, the matrix Va2, and the matrix Va0.


The second participant obtains the matrix ∂2 based on the matrix Vb1, the matrix Vb2, and the matrix Vb0.
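The decomposition above can be checked in plaintext. The sketch below simulates the 2PMP sub-protocol by a random additive split (so it is NOT secure) and assumes the local products are taken as X1ᵀ·X1 and X2ᵀ·X2, which makes every term the same size; X1 and X2 are taken to have the same shape (e.g., zero-padded vertical partitions):

```python
import numpy as np

# Plaintext sketch of the 2PHMP decomposition (assumptions noted above).
rng = np.random.default_rng(1)
n, d = 8, 3
X1, X2 = rng.standard_normal((n, d)), rng.standard_normal((n, d))

def split_product(x, y):
    """Stand-in for the secure 2PMP: a random additive split of x @ y."""
    r = rng.standard_normal((x.shape[0], y.shape[1]))
    return x @ y - r, r

Va1, Vb1 = split_product(X1.T, X2)   # cross term X1^T · X2
Va2, Vb2 = split_product(X2.T, X1)   # cross term X2^T · X1
Va0 = X1.T @ X1                      # first participant's local product
Vb0 = X2.T @ X2                      # second participant's local product

s1 = Va0 + Va1 + Va2                 # the text's ∂1, held by the first participant
s2 = Vb0 + Vb1 + Vb2                 # the text's ∂2, held by the second participant
assert np.allclose(s1 + s2, (X1.T + X2.T) @ (X1 + X2))
```

Neither share alone reveals the cross products, since each 2PMP output is masked by fresh randomness.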


The secure two-party matrix inversion protocol (2PIP), as the name suggests, assumes the presence of two mutually distrusting participants. The two participants hold secret input matrices ∂1 and ∂2, respectively, and jointly execute a two-party inversion protocol f(∂1, ∂2)=Output(u1, u2)=(∂1+∂2)−1. Eventually, the two participants obtain corresponding outputs u1, u2 respectively, and the outputs satisfy u1+u2=(∂1+∂2)−1. Throughout the entire computation process, each participant node only has knowledge of the input and output data involved in its own computation process and cannot access any intermediate computation results of the other participant. Therefore, in practical applications, the step of processing, by the first participant and the second participant, the matrix ∂1 and the matrix ∂2 by using the secure two-party matrix inversion protocol, to obtain the matrix u1 and the matrix u2 specifically includes the following steps:


The first participant randomly generates a secret matrix P, and obtains a matrix IA based on the secret matrix P and the matrix ∂1.


The second participant randomly generates a secret matrix Q, and obtains a matrix IB based on the matrix ∂2 and the secret matrix Q.


The first participant and the second participant process the matrix IA and the secret matrix Q by using the secure two-party matrix multiplication protocol, to obtain a matrix Va1′ and a matrix Vb1′. The matrix Va1′ is stored in the first participant, and the matrix Vb1′ is stored in the second participant.


The first participant and the second participant process the secret matrix P and the matrix IB by using the secure two-party matrix multiplication protocol, to obtain a matrix Va2′ and a matrix Vb2′. The matrix Va2′ is stored in the first participant, and the matrix Vb2′ is stored in the second participant.


The first participant obtains the matrix Va based on the matrix Va1′ and the matrix Va2′, and sends the matrix Va to the second participant.


The second participant obtains a matrix IB* based on the secret matrix Q, the matrix Va, the matrix Vb1′, and the matrix Vb2′.


The first participant and the second participant process the secret matrix P and the matrix IB* by using the secure two-party matrix multiplication protocol, to obtain the matrix u1 and the matrix u2.
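The steps above rest on a masked-inversion identity that can be checked in plaintext. The sketch below assumes the message forms IA=P·∂1 and IB=∂2·Q (consistent with the listed steps, though the text does not spell them out): with invertible masks P, Q and M=∂1+∂2, the parties form T′=P·M·Q, the second participant computes IB*=Q·(T′)⁻¹, and the final multiplication yields M⁻¹, since Q·(P·M·Q)⁻¹·P = Q·Q⁻¹·M⁻¹·P⁻¹·P = M⁻¹:

```python
import numpy as np

# Plaintext check of the masked-inversion identity behind the 2PIP
# (the exact message forms are assumptions; see the lead-in above).
rng = np.random.default_rng(2)
d = 4
a1 = rng.standard_normal((d, d)) + 5 * np.eye(d)   # share ∂1 (ridge keeps M well-conditioned)
a2 = rng.standard_normal((d, d))                   # share ∂2
P, Q = rng.standard_normal((d, d)), rng.standard_normal((d, d))  # secret random masks

M = a1 + a2
T = P @ M @ Q                  # safe to reveal: M is masked on both sides
IB_star = Q @ np.linalg.inv(T)
u = IB_star @ P                # equals M^(-1); split into u1 + u2 in the protocol
assert np.allclose(u, np.linalg.inv(M))
```

Because T′ is masked on both sides by matrices unknown to the opposite party, revealing it leaks neither ∂1 nor ∂2.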


In practical applications, the step of processing, by the first participant and the second participant, the matrix ∂1 and the matrix ∂2 by using the secure two-party matrix inversion protocol, to obtain the matrix u1 and the matrix u2 specifically includes the following steps:


The first participant randomly generates a reversible matrix P′ and a reversible matrix Q′.


The first participant and the second participant obtain a matrix Va1′″ and a matrix Vb1′″ based on the reversible matrix P′ and the matrix ∂2 by using the secure two-party matrix multiplication protocol. The matrix Va1′″ is stored in the first participant, and the matrix Vb1′″ is stored in the second participant.


The first participant and the second participant obtain a matrix Va2′″ and a matrix Vb2′″ based on the reversible matrix Q′ and the matrix Vb1′″ by using the secure two-party matrix multiplication protocol. The matrix Va2′″ is stored in the first participant, and the matrix Vb2′″ is stored in the second participant.


The first participant obtains a matrix Va″ based on the matrix Va1′″, matrix Q′, matrix Va2′″, matrix P′, and matrix ∂1, and sends the matrix Va″ to the second participant.


The second participant obtains a matrix T″ based on the matrix Va″ and the matrix Vb2′″, and computes an inverse matrix of the matrix T″.


The first participant and the second participant obtain a matrix Va3′ and a matrix Vb3′ based on the matrix Q′ and the inverse matrix of the matrix T″ by using the secure two-party matrix multiplication protocol. The matrix Va3′ is stored in the first participant, and the matrix Vb3′ is stored in the second participant.


The first participant and the second participant obtain a matrix Va4 and a matrix Vb4 based on the matrix P′ and the matrix Vb3′ by using the secure two-party matrix multiplication protocol.


The second participant computes the matrix u2 based on the matrix Vb4.


The first participant obtains the matrix u1 based on the matrix Va3′, the matrix P′, and the matrix Va4.


The secure three-party matrix hybrid multiplication protocol (3PHMP), as the name suggests, assumes the presence of three mutually distrusting participants. The three participants hold corresponding secret input matrices (u1, v1), (u2, v2), z, respectively, and jointly execute a three-party hybrid multiplication protocol f((u1, v1), (u2, v2), z)=Output(Va′, Vb′, Vc)=(u1+u2)·(v1+v2)·z. Eventually, the participants obtain their corresponding outputs Va′, Vb′, Vc respectively, and the outputs satisfy Va′+Vb′+Vc=(u1+u2)·(v1+v2)·z. Throughout the entire computation process, each participant node only has knowledge of the input and output data involved in its own computation process and cannot access any intermediate computation results of the other participants. Therefore, in practical applications, the step of processing, by the requester of the secure three-party linear regression computation, the first participant, the second participant, and the third participant, the first private matrix sequence, the second private matrix sequence, and the third private matrix by using the secure three-party matrix hybrid multiplication protocol, to obtain the regression coefficient matrix of the secure three-party linear regression model specifically includes the following steps:


The first participant computes a matrix A* based on the first private matrix sequence, while the second participant computes a matrix B* based on the second private matrix sequence.


The first participant and the third participant obtain a matrix Va0′ and a matrix Vc0 based on the matrix A* and the third private matrix by using the secure two-party matrix multiplication protocol. The matrix Va0′ is stored in the first participant, and the matrix Vc0 is stored in the third participant. The second participant and the third participant obtain a matrix Vb3 and a matrix Vc3 based on the matrix B* and the third private matrix by using the secure two-party matrix multiplication protocol. The matrix Vb3 is stored in the second participant, and the matrix Vc3 is stored in the third participant. Additionally, the first participant, the second participant, and the third participant obtain a matrix Va1″, a matrix Vb1″, and a matrix Vc1 based on the matrix u1, the random matrix v2, and the third private matrix by using a secure three-party matrix multiplication protocol. The matrix Va1″ is stored in the first participant, the matrix Vb1″ is stored in the second participant, and the matrix Vc1 is stored in the third participant. Meanwhile, the second participant, the first participant, and the third participant obtain a matrix Va2″, a matrix Vb2″, and a matrix Vc2 based on the matrix u2, the random matrix v1, and the third private matrix by using the secure three-party matrix multiplication protocol. The matrix Va2″ is stored in the first participant, the matrix Vb2″ is stored in the second participant, and the matrix Vc2 is stored in the third participant.


The first participant obtains a matrix Va′ based on the matrix Va0′, the matrix Va1″, and the matrix Va2″ and sends the matrix Va′ to the requester of the secure three-party linear regression computation.


The second participant obtains a matrix Vb′ based on the matrix Vb1″, the matrix Vb2″, and the matrix Vb3 and sends the matrix Vb′ to the requester of the secure three-party linear regression computation.


The third participant obtains a matrix Vc based on the matrix Vc0, the matrix Vc1, the matrix Vc2, and the matrix Vc3 and sends the matrix Vc to the requester of the secure three-party linear regression computation.


The requester of the secure three-party linear regression computation obtains the regression coefficient matrix of the secure three-party linear regression model based on the matrix Va′, the matrix Vb′, and the matrix Vc.
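The recombination of shares described above follows from a four-term expansion of the 3PHMP output. The plaintext sketch below assumes A*=u1·v1 and B*=u2·v2 (a reading consistent with the four product terms, though the text does not state the forms explicitly):

```python
import numpy as np

# Plaintext check of the 3PHMP decomposition into its four product terms.
rng = np.random.default_rng(3)
d, m = 3, 2
u1, u2 = rng.standard_normal((d, d)), rng.standard_normal((d, d))
v1, v2 = rng.standard_normal((d, d)), rng.standard_normal((d, d))
z = rng.standard_normal((d, m))        # third participant's private matrix

term_a = (u1 @ v1) @ z                 # Va0' + Vc0  (A* = u1·v1, assumed, with z)
term_b = (u2 @ v2) @ z                 # Vb3 + Vc3   (B* = u2·v2, assumed, with z)
term_1 = u1 @ v2 @ z                   # Va1'' + Vb1'' + Vc1, via the 3PMP
term_2 = u2 @ v1 @ z                   # Va2'' + Vb2'' + Vc2, via the 3PMP

beta = term_a + term_b + term_1 + term_2
assert np.allclose(beta, (u1 + u2) @ (v1 + v2) @ z)
```

The requester only ever sees the three aggregated shares Va′, Vb′, Vc, whose sum equals this β.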


The secure two-party matrix multiplication protocol (2PMP), as the name suggests, assumes the presence of two mutually distrusting participants. The two participants hold secret input matrices, x and y, respectively, and jointly execute a two-party multiplication protocol f(x, y)=Output(F1, F2)=x·y. Eventually, the two participants obtain their corresponding outputs F1, F2 respectively, and the outputs satisfy F1+F2=x·y. Throughout the entire calculation process, each participant only has knowledge of the input and output data involved in its own computation process and cannot access any intermediate computation results of the other participants. Therefore, in practical applications, the step of processing, by the first participant and the second participant, the first private transposed matrix and the second private matrix by using the secure two-party matrix multiplication protocol to obtain the matrix Va1 and the matrix Vb1 specifically includes the following steps:


An auxiliary computing node generates two random matrix pairs (Ra, ra) and (Rb, rb), and sends the random matrix pair (Ra, ra) to the first participant and the random matrix pair (Rb, rb) to the second participant.


The first participant obtains a matrix Â based on the matrix Ra and the first private transposed matrix, and sends the matrix Â to the second participant.


The second participant obtains a matrix B̂ based on the matrix Rb and the second private matrix, and sends the matrix B̂ to the first participant.


The second participant randomly generates a matrix Vb1, generates a matrix T based on the matrix Â, the second private matrix, the matrix rb, and the matrix Vb1, and sends the matrix T to the first participant.


The first participant obtains the matrix Va1 based on the matrix T, the matrix ra, the matrix Ra, and the matrix B̂.
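These 2PMP steps can be walked through in plaintext. The sketch below assumes the auxiliary node's random pairs carry the correlation ra+rb=Ra·Rb and that the messages take the forms Â=X1T+Ra, B̂=X2+Rb, T=Â·X2+rb−Vb1, and Va1=T−Ra·B̂+ra; the text only says the pairs are random, so these forms are assumptions under which the shares recombine correctly:

```python
import numpy as np

# Plaintext walk-through of the 2PMP (Beaver-style masking; forms assumed).
rng = np.random.default_rng(4)
n, d = 6, 3
X1T = rng.standard_normal((d, n))      # first participant's input
X2 = rng.standard_normal((n, d))       # second participant's input

Ra = rng.standard_normal((d, n))       # auxiliary node's randomness
Rb = rng.standard_normal((n, d))
ra = rng.standard_normal((d, d))
rb = Ra @ Rb - ra                      # assumed correlation: ra + rb = Ra·Rb

A_hat = X1T + Ra                       # sent to the second participant (masked)
B_hat = X2 + Rb                        # sent to the first participant (masked)
Vb1 = rng.standard_normal((d, d))      # second participant's random output share
T = A_hat @ X2 + rb - Vb1              # sent to the first participant
Va1 = T - Ra @ B_hat + ra              # first participant's output share

assert np.allclose(Va1 + Vb1, X1T @ X2)
```

The mask terms Ra·X2 and Ra·Rb cancel exactly because of the assumed correlation, so each party sees only masked matrices yet the shares sum to X1T·X2.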


In practical applications, the step of obtaining, by the second participant, the matrix IB* based on the secret matrix Q, the matrix Va, the matrix Vb1′, and the matrix Vb2′ specifically includes the following steps:


The second participant obtains a matrix Vb based on the matrix Vb1′ and the matrix Vb2′.


Based on the matrix Vb and the matrix Va, a matrix T′ is obtained.


Based on an inverse matrix of the matrix T′ and the secret matrix Q, the matrix IB* is obtained.


The secure three-party matrix multiplication protocol (3PMP), as the name suggests, assumes the presence of three mutually distrusting participants. The three participants hold secret input matrices x′, y′, and z′, respectively, and jointly execute a three-party multiplication protocol f(x′, y′, z′)=Output (E1, E2, E3)=x′·y′·z′. Eventually, the three participants obtain their corresponding outputs E1, E2, E3 respectively, and the outputs satisfy E1+E2+E3=x′·y′·z′. Throughout the entire calculation process, each participant only has knowledge of the input and output data involved in its own computation process and cannot access any intermediate computation results of the other participants. Therefore, in practical applications, the step of obtaining, by the first participant, the second participant, and the third participant, the matrix Va1″, the matrix Vb1″, and the matrix Vc1 based on the matrix u1, the random matrix v2, and the third private matrix by using the secure three-party matrix multiplication protocol specifically includes the following steps:


An auxiliary computing node randomly generates three random matrix pairs (Ra′, ra′), (Rb′, rb′), and (Rc, rc), and sends the random matrix pair (Ra′, ra′) to the first participant, the random matrix pair (Rb′, rb′) to the second participant, and the random matrix pair (Rc, rc) to the third participant.


The first participant obtains a matrix Â′ based on the matrix Ra′ and the matrix u1 and sends the matrix Â′ to the second participant.


The third participant obtains a matrix Ĉ based on the matrix Rc and the third private matrix and sends the matrix Ĉ to the second participant.


The second participant obtains a matrix {circumflex over (B)}′ based on the matrix Rb′ and the random matrix v2 and determines whether the matrix {circumflex over (B)}′ is full rank, to obtain a first determining result.


If the first determining result is no, the process returns to the step that “an auxiliary computing node randomly generates three random matrix pairs (Ra′, ra′), (Rb′, rb′), and (Rc, rc), and sends the random matrix pair (Ra′, ra′) to the first participant, the random matrix pair (Rb′, rb′) to the second participant, and the random matrix pair (Rc, rc) to the third participant.”


If the first determining result is yes, the second participant obtains a matrix Mb based on the matrix Â′, the matrix Rb′, and the matrix Ĉ, computes a matrix φ1 based on the matrix Â′ and the matrix {circumflex over (B)}′, obtains a matrix γ1 based on the matrix Â′ and the matrix Rb′, computes a matrix φ2 based on the matrix {circumflex over (B)}′ and the matrix Ĉ, and computes a matrix γ2 based on the matrix Rb′ and the matrix Ĉ. The matrix φ1 and the matrix γ1 are sent to the third participant, while the matrix φ2 and the matrix γ2 are sent to the first participant.


The first participant computes a matrix Sa based on the matrix Ra′ and the matrix γ2, and computes a matrix Ma based on the matrix u1 and the matrix φ2.


The third participant computes a matrix Sc based on the matrix γ1 and the matrix Rc, and computes a matrix Mc based on the matrix φ1 and the matrix Rc.


The second participant performs a full-rank decomposition on the matrix {circumflex over (B)}′ to obtain a column full-rank matrix and a row full-rank matrix, and sends the column full-rank matrix to the first participant, and the row full-rank matrix to the third participant.


The first participant randomly generates the matrix Va1″, obtains a matrix Ta based on the matrix Ma, the matrix Sa, the matrix Va1″, and the matrix ra′, obtains a matrix t1 based on the matrix Ra′ and the column full-rank matrix, and sends the matrix Ta and the matrix t1 to the second participant.


The third participant obtains a matrix t2 based on the row full-rank matrix and the matrix Rc, and sends the matrix t2 to the second participant.


The second participant randomly generates the matrix Vb1″, obtains a matrix Sb based on the matrix t1 and the matrix t2, obtains a matrix Tb based on the matrix Ta, the matrix Mb, the matrix Sb, the matrix Vb1″, and the matrix rb′, and sends the matrix Tb to the third participant.


The third participant obtains the matrix Vc1 based on the matrix Tb, the matrix Mc, the matrix Sc, and the matrix rc.


The present disclosure provides a more specific embodiment to detail the above method, which is specifically as follows:


Linear regression is an important and fundamental machine learning algorithm that characterizes the relationship between an output and multiple inputs. It is one type of supervised learning and finds extensive applications in data mining. In practice, training a linear regression model typically relies on a large amount of data, which is often held by different organizations and contains private information of users. Therefore, when multiple computing parties aim to obtain a better-performing model through a substantial amount of data without revealing each other's data privacy, it inevitably involves the privacy-preserving computation of the secure multi-party linear regression problem.


The secure three-party linear regression problem is defined as follows:


A dataset M∈Rm×(n+1) containing m sample points {X, Y}={(xi,1, xi,2, . . . , xi,n, yi)|1≤i≤m} is given, where xi,1˜xi,n correspond to the n feature variables of the i-th sample point, and yi corresponds to the label variable of the i-th sample point. X∈Rm×n represents the sample set matrix with n features, and Y∈Rm×1 represents the label vector corresponding to the m sample points. It is known that there exist three independent and mutually distrusting computation participants: Alice, Bob, and Carol. In an environment where the dataset is heterogeneously and randomly split, X is randomly split into Am×n and Bm×n. The participant Alice possesses a private data matrix Am×n={xi,j|1≤i≤m, 1≤j≤n} containing n feature variables. The participant Bob possesses a private data matrix Bm×n={xi,j|1≤i≤m, 1≤j≤n} containing n feature variables. The participant Carol possesses a private data matrix Ym×1={y1, y2, y3, . . . , ym} containing m sample point labels. The private data matrices A, B, and Y satisfy relationship 1 in terms of set element distribution: {A∪B∪Y=M}; the private matrices A and B satisfy relationship 2 in terms of the sample index IDsample and the feature index IDfeature: {[IDsample(A)∩IDsample(B)=Ø] & [IDfeature(A)∩IDfeature(B)=Ø]}. The three participants aim to jointly execute a secure three-party linear regression protocol to achieve secure inference f(A, B, Y)=β=Wa+Wb+Wc of the parameters of the linear regression model over the heterogeneously distributed data sets without exposing their individual data information. Eventually, each computation participant obtains its corresponding n×1-dimensional regression coefficient slice matrix {Wi|i∈{a, b, c}}, and the regression coefficient result β of the secure three-party linear regression model is obtained after the regression coefficient slice matrices sent to the computation requester are aggregated. Throughout the entire computation process, the private data matrices of the participants remain local.
Each node is semi-honest, strictly follows the computation protocol steps, and cannot access any intermediate computation results or original data information held by the other participants. The formal description of the problem is shown in FIG. 1.
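In the clear (without any privacy constraint), the quantity the three parties jointly target is the ordinary least-squares solution of the normal equation, β=(XTX)−1XTY. The following numpy fragment is an illustrative plaintext reference only, not part of the secure protocol; the sizes and variable names are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Plaintext reference for the quantity the secure three-party protocol
# computes jointly: beta = (X^T X)^{-1} X^T Y, where X is the combined
# sample matrix (Alice's and Bob's splits) and Y is Carol's label vector.
m, n = 8, 3                      # illustrative sizes only
X = rng.standard_normal((m, n))
Y = rng.standard_normal((m, 1))

beta = np.linalg.inv(X.T @ X) @ X.T @ Y   # n x 1 regression coefficients
```

In the secure setting, no party ever materializes X or Y in one place; each party only ever holds an additive slice Wi of β.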


The secure two-party matrix multiplication protocol (2PMP) problem is defined as follows:


It is known that there are two computation participants, Alice and Bob, which are independent of each other and mutually distrusting. Alice holds an n×s-dimensional private data matrix A that is only stored in its own computing node, and Bob holds an s×m-dimensional private data matrix B. The two participants expect to achieve f(A, B)=AB=Qa+Qb by jointly executing a secure matrix multiplication protocol. Eventually, the computation participant nodes obtain their corresponding n×m-dimensional output matrices Qa and Qb, and send the output matrices to the computation requester, who aggregates the output matrices to obtain an expected two-party matrix multiplication result. During the computation process, each participant node can only obtain its own input/output information, and cannot access any intermediate computation results or data information held by the other participant.


The protocol flow specifically includes the following steps:


Step 1: An auxiliary computing node, also referred to as a commodity server (CS), generates two random matrix pairs, which are an n×s-dimensional random matrix Ra, an s×m-dimensional random matrix Rb, and two n×m-dimensional random matrices ra, rb. These random matrices need to strictly meet the following constraint ra+rb=Ra·Rb. Then, the CS auxiliary node sends the random matrix pair (Ra, ra) to the participant computing node Alice, and the random matrix pair (Rb, rb) to the participant computing node Bob.


Step 2: After receiving the corresponding random matrix pair (Ra, ra), the participant node Alice computes Â=A+Ra internally and sends it to the participant node Bob.


Step 3: After receiving the corresponding random matrix pair (Rb, rb), the participant node Bob computes {circumflex over (B)}=B+Rb internally and sends it to the participant node Alice.


Step 4: After receiving the matrix Â sent from the node Alice, the participant node Bob internally generates a random matrix Qb∈Rn×m secretly, secretly computes a matrix T=Â·B+(rb−Qb) locally, and sends it to the node Alice.


Step 5: After receiving T, the participant node Alice secretly computes a matrix Qa=T+ra−(Ra·{circumflex over (B)}) locally.
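Steps 1 to 5 above can be sketched in numpy as follows. This is an illustrative sketch only: floating-point matrices stand in for whatever number representation a deployment uses, the `rng` calls stand in for the CS node's and Bob's local randomness, and all identifiers are assumptions, not names from the disclosure.

```python
import numpy as np

rng = np.random.default_rng(1)
n, s, m = 3, 4, 2

# Private inputs (illustrative sizes only)
A = rng.standard_normal((n, s))   # Alice's matrix
B = rng.standard_normal((s, m))   # Bob's matrix

# Step 1 (CS node): random pairs satisfying ra + rb = Ra @ Rb
Ra = rng.standard_normal((n, s))
Rb = rng.standard_normal((s, m))
ra = rng.standard_normal((n, m))
rb = Ra @ Rb - ra

# Step 2: Alice masks A and sends A_hat to Bob
A_hat = A + Ra
# Step 3: Bob masks B and sends B_hat to Alice
B_hat = B + Rb
# Step 4: Bob picks his random output share Qb and sends T to Alice
Qb = rng.standard_normal((n, m))
T = A_hat @ B + (rb - Qb)
# Step 5: Alice derives her output share; Qa + Qb = A @ B
Qa = T + ra - Ra @ B_hat
```

The correctness follows from Qa+Qb = Â·B+rb+ra−Ra·B̂ = AB+(ra+rb−Ra·Rb) = AB.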


The secure three-party matrix multiplication protocol (3PMP) problem is defined as follows:


It is known that there are three participants, Alice, Bob, Carol, which are independent of each other and mutually distrusting. Alice holds an n×s-dimensional private data matrix A′ that is only stored in its own computing node, Bob holds an s×t-dimensional private data matrix B′, and Carol holds a t×m-dimensional private data matrix C′. The three participants jointly perform a three-party matrix multiplication protocol computation f(A′, B′, C′)=A′B′C′=Q′a+Q′b+Q′c, and finally the computation participant nodes obtain their respective n×m-dimensional output matrices Q′a, Q′b, Q′c, which are sent to the computation requester, who aggregates the output matrices to obtain an expected result of three-party matrix multiplication. During the computation process, each participant node can only obtain its own input/output information, and cannot access any intermediate computation results or data information held by the other participants.


The protocol flow specifically includes the following steps:


Step 1: An auxiliary computing node, also referred to as a commodity server (CS) node, generates three random matrix pairs. Specific forms of the three random matrix pairs are as follows: an n×s-dimensional random matrix Ra′, an s×t-dimensional random matrix Rb′, a t×m-dimensional random matrix Rc, and three n×m-dimensional random matrices ra′, rb′, and rc. These random matrices need to strictly meet the constraint ra′+rb′+rc=Ra′·Rb′·Rc. Then, the CS auxiliary node sends the random matrix pair (Ra′, ra′) to the participant computing node Alice, the random matrix pair (Rb′, rb′) to the participant computing node Bob, and the random matrix pair (Rc, rc) to the participant computing node Carol. While the computation protocol is performed, the CS auxiliary node needs to strictly meet the following three requirements: (1) it never touches private data related to Alice, Bob, or Carol, whether an input, an output, or an intermediate computation result; (2) it does not collude with any participant computing node; and (3) it strictly follows the protocol flow to correctly perform its assigned subtask. The CS auxiliary node does not directly participate in the subsequent actual computation of the secure three-party multiplication; it only provides, in the initial phase of the protocol, random matrix pairs that are independent of the private data matrices, thereby protecting the private matrices of the participants and ensuring the security of the raw data in the subsequent computation. Therefore, the CS auxiliary node may generate a large quantity of mutually independent random matrix pairs offline in advance and, in the initial phase of the protocol, send random seeds to the Alice, Bob, and Carol computing nodes in a manner similar to commodity sale, so that each computing node can obtain the corresponding random matrix information; this is how the commodity server gets its name.


Step 2: After receiving the corresponding random matrix pair (Ra′, ra′), the participant node Alice computes Â′=A′+Ra′ internally and sends it to the participant node Bob.


Step 3: After receiving the corresponding random matrix pair (Rc, rc), the participant node Carol computes Ĉ=C′+Rc internally and sends it to the participant node Bob.


Step 4: After receiving the corresponding random matrix pair (Rb′, rb′), the participant node Bob computes {circumflex over (B)}′=B′+Rb′ internally and synchronously verifies whether the matrix {circumflex over (B)}′ is full rank; if not, Bob returns to step 1 to reselect random matrix pairs until the condition is met. Bob then computes a matrix Mb=Â′·Rb′·Ĉ, sends φ1=Â′·{circumflex over (B)}′ and γ1=Â′·Rb′ to the node Carol, and sends φ2={circumflex over (B)}′·Ĉ and γ2=Rb′·Ĉ to the node Alice.


Step 5: After receiving the matrices φ2, γ2 sent from the node Bob, the participant node Alice successively computes Sa=Ra′·γ2=Ra′·Rb′·Ĉ and Ma=A′·φ2=A′·{circumflex over (B)}′·Ĉ locally.


Step 6: After receiving the matrices φ1, γ1 sent from the node Bob, the participant node Carol successively computes Sc=γ1·Rc=Â′·Rb′·Rc and Mc=φ1·Rc=Â′·{circumflex over (B)}′·Rc locally.


Step 7: The participant node Bob internally splits the matrix {circumflex over (B)}′ in a manner of full rank decomposition, to obtain a column full rank matrix B1 ∈Rs×r and a row full rank matrix B2∈Rr×t, where ranks of the non-zero matrix {circumflex over (B)}′ and the split matrices B1, B2 meet a constraint condition rank ({circumflex over (B)}′)=rank(B1)=rank(B2)=r. The node Bob sends the matrix B1 to the node Alice, and sends the matrix B2 to the node Carol.


Step 8: After receiving the matrix B1 from the node Bob, the participant node Alice internally generates a random matrix Qa∈Rn×m secretly, computes Ta=Ma+Sa−Qa−ra′ and t1=Ra′B1 locally, and sends Ta and t1 to the node Bob.


Step 9: After receiving the matrix B2 from the node Bob, the participant node Carol secretly computes t2=B2Rc, and sends a result t2 to the node Bob.


Step 10: After receiving the matrices Ta and t1 sent from the node Alice and the matrix t2 sent from the node Carol, the participant node Bob internally generates a random matrix Qb∈Rn×m secretly, secretly computes a matrix Sb=t1·t2=Ra′B1·B2Rc=Ra′{circumflex over (B)}′Rc locally, finally computes Tb=Ta−Mb+Sb−Qb−rb′, and sends it to the node Carol.


Step 11: After receiving Tb, the participant node Carol secretly computes a matrix Qc=Tb−Mc+Sc−rc locally.
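Steps 1 to 11 above can be sketched end to end in numpy. This is an illustrative sketch under the same caveats as before: floating-point matrices stand in for the deployed encoding, `rng` stands in for the CS node's and each party's local randomness, and the full-rank decomposition is realized here via an SVD (any factorization {circumflex over (B)}′=B1·B2 with matching ranks works). All identifiers are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, s, t, m = 2, 3, 3, 2

# Private inputs of Alice, Bob, Carol (illustrative sizes only)
A_ = rng.standard_normal((n, s))
B_ = rng.standard_normal((s, t))
C_ = rng.standard_normal((t, m))

# Step 1 (CS node): random pairs with ra_ + rb_ + rc = Ra_ @ Rb_ @ Rc
Ra_ = rng.standard_normal((n, s))
Rb_ = rng.standard_normal((s, t))
Rc = rng.standard_normal((t, m))
ra_ = rng.standard_normal((n, m))
rb_ = rng.standard_normal((n, m))
rc = Ra_ @ Rb_ @ Rc - ra_ - rb_

# Steps 2-3: Alice and Carol mask their inputs and send them to Bob
A_hat = A_ + Ra_
C_hat = C_ + Rc

# Step 4 (Bob): mask B'; a random B_hat is full rank with probability 1,
# otherwise step 1 would be re-run. Then distribute the masked products.
B_hat = B_ + Rb_
Mb = A_hat @ Rb_ @ C_hat
phi1, gamma1 = A_hat @ B_hat, A_hat @ Rb_   # sent to Carol
phi2, gamma2 = B_hat @ C_hat, Rb_ @ C_hat   # sent to Alice

# Steps 5-6: local products at Alice and Carol
Sa, Ma = Ra_ @ gamma2, A_ @ phi2
Sc, Mc = gamma1 @ Rc, phi1 @ Rc

# Step 7 (Bob): full-rank decomposition B_hat = B1 @ B2 (via SVD here)
U, sv, Vt = np.linalg.svd(B_hat)
r = int(np.sum(sv > 1e-10))
B1, B2 = U[:, :r] * sv[:r], Vt[:r, :]

# Steps 8-9: masked partial results from Alice and Carol
Qa = rng.standard_normal((n, m))
Ta, t1 = Ma + Sa - Qa - ra_, Ra_ @ B1
t2 = B2 @ Rc

# Step 10 (Bob): combine, mask with his own share Qb, forward to Carol
Qb = rng.standard_normal((n, m))
Sb = t1 @ t2                     # = Ra' @ B_hat @ Rc
Tb = Ta - Mb + Sb - Qb - rb_

# Step 11 (Carol): her share completes Qa + Qb + Qc = A' @ B' @ C'
Qc = Tb - Mc + Sc - rc
```

Expanding Qa+Qb+Qc term by term and using ra′+rb′+rc=Ra′·Rb′·Rc, all mask terms cancel and the shares sum to A′·B′·C′.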


The secure two-party matrix inversion protocol (2PIP) problem is defined as follows:


It is known that there are two computation participants, Alice and Bob, which are independent of each other and mutually distrusting. Alice holds a private data matrix ∂1 ∈Rn×n that is only stored in its own computing node, and Bob holds a private data matrix ∂2∈Rn×n. The two participants jointly execute a two-party matrix inversion protocol computation f(∂1, ∂2)=(∂1+∂2)−1=u1+u2. Eventually, the two participant nodes of the computation obtain their corresponding output matrices u1, u2 ∈Rn×n respectively, and send the output matrices to the computation requester, who aggregates the output matrices to obtain an expected result of the two-party matrix inversion. During the computation process, each participant node can only access input/output information related to its own computation process, and cannot access any intermediate computation results or private data information held by the other participant.


The protocol flow specifically includes the following steps:


When the private invertible matrices P, Q∈Rn×n belong to different participant nodes, without loss of generality, the case where the private matrix P belongs to the participant node Alice and the private matrix Q belongs to the participant node Bob is taken as an example for description herein. It is known that the participant node Alice holds a private matrix ∂1∈Rn×n, and the participant node Bob holds a private matrix ∂2∈Rn×n.


Step 1: The participant node Alice secretly generates a random invertible matrix P∈Rn×n locally, and secretly computes a matrix IA=P×∂1 locally.


Step 2: The participant node Bob secretly generates a random invertible matrix Q∈Rn×n locally, and secretly computes a matrix IB=∂2×Q locally.


Step 3: The participant node Alice and the participant node Bob input their respective private matrices IA and Q based on the secure two-party matrix multiplication protocol 2PMP, to perform the first round of secure two-party matrix multiplication. After execution of the 2PMP protocol is completed, the computation result will be randomly split into matrices Va1′, Vb1′∈Rn×n according to the random obfuscation technique, which are sent to the participant node Alice and the participant node Bob respectively. The two private output matrices satisfy the following relation: Va1′+Vb1′=(P∂1)×Q.


Step 4: The participant node Alice and the participant node Bob input their respective private matrices P and IB based on the secure two-party matrix multiplication protocol 2PMP, to perform the second round of secure two-party matrix multiplication. After execution of the 2PMP protocol is completed, the computation result will be randomly split into matrices Va2′, Vb2′∈Rn×n according to the random obfuscation technique, which are sent to the participant node Alice and the participant node Bob respectively. The two private output matrices satisfy the following relation: Va2′+Vb2′=P×(∂2Q).


Step 5: The participant node Alice secretly computes a matrix Va=Va1′+Va2′ locally, and sends the private matrix Va∈Rn×n to the participant node Bob.


Step 6: The participant node Bob secretly computes a matrix Vb=Vb1′+Vb2′ locally, aggregates the local matrix with the private matrix sent from the node Alice to obtain T′=Va+Vb=P(∂1+∂2)Q, then performs inversion secretly to obtain a matrix T′−1=Q−1(∂1+∂2)−1P−1, where T′−1∈Rn×n, and further computes a private matrix IB*=Q·T′−1.


Step 7: The participant node Alice and the participant node Bob input their respective private matrices P and IB* based on the secure two-party matrix multiplication protocol, to perform the third round of secure two-party matrix multiplication. After execution of the 2PMP protocol is completed, the computation result will be randomly split into matrices u1, u2∈Rn×n according to the random obfuscation technique, which are sent to the participant node Alice and the participant node Bob respectively. The two private output matrices satisfy the following relation: u1+u2=IB*×P=Q×T′−1×P=(∂1+∂2)−1.
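Steps 1 to 7 of this inversion protocol can be sketched in numpy. The helper `pmp2` below is my stand-in for the 2PMP protocol, collapsed into a single function that returns additive shares (the real protocol exchanges masked matrices as described earlier and never reveals either input); random n×n matrices are invertible with probability 1, which the sketch relies on. All identifiers are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3

def pmp2(X, Y):
    # Stand-in for the 2PMP protocol: returns additive shares of X @ Y.
    share = rng.standard_normal((X.shape[0], Y.shape[1]))
    return X @ Y - share, share

# Private inputs: Alice holds d1, Bob holds d2; target is shares of (d1+d2)^{-1}
d1 = rng.standard_normal((n, n))
d2 = rng.standard_normal((n, n))

# Steps 1-2: random invertible masks, P at Alice and Q at Bob
P = rng.standard_normal((n, n))
Q = rng.standard_normal((n, n))
IA = P @ d1          # Alice, local
IB = d2 @ Q          # Bob, local

# Steps 3-4: two rounds of 2PMP
Va1, Vb1 = pmp2(IA, Q)        # shares of P d1 Q
Va2, Vb2 = pmp2(P, IB)        # shares of P d2 Q

# Steps 5-6: Bob reconstructs T' = P (d1 + d2) Q and inverts it
Va = Va1 + Va2                # Alice sends this aggregate to Bob
Vb = Vb1 + Vb2
T_inv = np.linalg.inv(Va + Vb)   # = Q^{-1} (d1+d2)^{-1} P^{-1}
IB_star = Q @ T_inv

# Step 7: final 2PMP round yields u1 + u2 = IB* @ P = (d1 + d2)^{-1}
u1, u2 = pmp2(IB_star, P)
```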


The foregoing secure two-party matrix inversion protocol is designed based on the case where the private invertible matrices P, Q∈Rn×n belong to different computation participants. Specifically, the private matrix P belongs to the participant node Alice, and the private matrix Q belongs to the participant node Bob. Therefore, the present disclosure further provides a secure two-party matrix inversion protocol based on the case where private invertible matrices P′, Q′∈Rn×n belong to the same computation participant Alice. It is known that the participant node Alice holds a private matrix ∂1∈Rn×n, and the participant node Bob holds a private matrix ∂2∈Rn×n. The specific process of the solution is as follows:


Step 1: The participant node Alice secretly generates two random invertible matrices P′, Q′∈Rn×n locally.


Step 2: The participant node Alice and the participant node Bob input their respective private matrices P′ and ∂2 based on the secure two-party matrix multiplication protocol 2PMP, to perform the first round of secure two-party matrix multiplication. After execution of the 2PMP protocol is completed, the computation result will be randomly split into matrices Va1′″, Vb1′″∈Rn×n according to the random obfuscation technique, which are sent to the participant node Alice and the participant node Bob respectively. The two private output matrices satisfy the following relation: Va1′″+Vb1′″=P′×∂2.


Step 3: The participant node Alice and the participant node Bob jointly perform the second round of secure two-party matrix multiplication, where an input from the participant node Bob is the private data matrix Vb1′″ returned in step 2, and an input from the participant node Alice is the private data matrix Q′. After execution of the 2PMP protocol is completed, the computation result will be randomly split into two private matrices Va2′″, Vb2′″∈Rn×n, which are sent to the participant nodes Alice and Bob respectively. The two private matrices satisfy the following relation: Va2′″+Vb2′″=Vb1′″×Q′.


Step 4: The participant node Alice secretly computes a matrix Va″=Va1′″×Q′+Va2′″+P′×∂1×Q′ locally, and then sends the private matrix Va″∈Rn×n to the participant node Bob.


Step 5: The participant node Bob secretly computes a matrix T″=Va″+Vb2′″=(Va2′″+Va1′″×Q′+P′×∂1×Q′)+Vb2′″=P′(∂1+∂2)Q′ locally, where the matrix T″∈Rn×n.


Step 6: The participant node Bob further secretly computes an inverse matrix T″−1=Q′−1(∂1+∂2)−1P′−1 of the matrix T″ locally, where T″−1∈Rn×n.


Step 7: The participant node Alice and the participant node Bob jointly perform the third round of secure two-party matrix multiplication, where the participant node Alice inputs the matrix Q′ based on the 2PMP protocol, and the participant node Bob inputs the matrix T″−1 based on the 2PMP protocol. After execution of the 2PMP protocol is completed, the computation result will be randomly split into two private matrices Va3′, Vb3′∈Rn×n, which are sent to the participant nodes Alice and Bob respectively. The two private matrices satisfy the following relation: Va3′+Vb3′=Q′×T″−1.


Step 8: The participant node Alice and the participant node Bob jointly perform the fourth round of secure two-party matrix multiplication, where an input from the participant node Alice is the private data matrix P′ generated in step 1, and an input from the participant node Bob is the private data matrix Vb3′ returned in step 7. After execution of the 2PMP protocol is completed, the computation result will be randomly split into two private matrices Va4, Vb4∈Rn×n, which are sent to the participant nodes Alice and Bob respectively. The two private matrices satisfy the following relation: Va4+Vb4=Vb3′×P′.


Step 9: The participant node Bob locally stores the private matrix Vb4 returned in Step 8 as u2, and the participant node Alice secretly computes a matrix u1=Va3′×P′+Va4 locally.
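The same-owner variant (steps 1 to 9 above) can be sketched in numpy as well, again with `pmp2` as my stand-in for a full 2PMP round and with all identifiers being assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3

def pmp2(X, Y):
    # Stand-in for the 2PMP protocol: returns additive shares of X @ Y.
    share = rng.standard_normal((X.shape[0], Y.shape[1]))
    return X @ Y - share, share

d1 = rng.standard_normal((n, n))   # Alice's private matrix
d2 = rng.standard_normal((n, n))   # Bob's private matrix

# Step 1: Alice generates both invertible masks P' and Q'
P_ = rng.standard_normal((n, n))
Q_ = rng.standard_normal((n, n))

# Steps 2-3: two 2PMP rounds
Va1, Vb1 = pmp2(P_, d2)   # shares of P' d2
Va2, Vb2 = pmp2(Vb1, Q_)  # shares of Vb1 Q'  (Bob inputs Vb1, Alice inputs Q')

# Step 4: Alice's local aggregate, sent to Bob
Va_pp = Va1 @ Q_ + Va2 + P_ @ d1 @ Q_

# Steps 5-6: Bob reconstructs T'' = P'(d1+d2)Q' and inverts it
T_inv = np.linalg.inv(Va_pp + Vb2)   # = Q'^{-1} (d1+d2)^{-1} P'^{-1}

# Steps 7-8: two more 2PMP rounds
Va3, Vb3 = pmp2(Q_, T_inv)   # shares of Q' T''^{-1}
Va4, Vb4 = pmp2(Vb3, P_)     # shares of Vb3 P'

# Step 9: final local shares; u1 + u2 = (d1 + d2)^{-1}
u2 = Vb4
u1 = Va3 @ P_ + Va4
```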


The secure two-party matrix hybrid multiplication protocol (2PHMP) problem is defined as follows:


It is known that there are two computation participants, Alice and Bob, which are independent of each other and mutually distrusting. Alice holds a set of private data matrices X1T∈Rm×t and X1∈Rt×n that are only stored in its own computing node, and Bob holds a set of private data matrices X2T∈Rm×t and X2∈Rt×n that are only stored in its own computing node. The two participants jointly execute a two-party matrix hybrid multiplication protocol f((X1T, X1), (X2T, X2))=(X1T+X2T)×(X1+X2)=∂1+∂2. Eventually, the two participant nodes of the computation obtain their corresponding output matrices ∂1, ∂2∈Rm×n respectively, and send the output matrices to the computation requester, who aggregates the output matrices to obtain an expected result of the two-party matrix hybrid multiplication. During the computation process, each participant node can only access input/output information related to its own computation process, and cannot access any intermediate computation results or private data information held by the other participant.


The protocol is specifically described as follows:


The secure two-party matrix hybrid multiplication problem typically arises in the intermediate computation process of multi-party modeling problems, where the form of two-party hybrid multiplication is quite common. For instance, in the intermediate computation of scenarios such as regression and clustering problems, the form of overlaying and multiplying the intermediate computation result matrices of the two parties appears frequently. Without loss of generality, it is assumed that in this protocol, the computation participant node Alice has initial input matrices with dimensions X1T∈Rm×t and X1∈Rt×n, and the computation participant node Bob has initial input matrices with dimensions X2T∈Rm×t and X2∈Rt×n. Based on this, a high-speed parallel secure two-party hybrid multiplication protocol is proposed, which includes the following process.


Step 1: The participant node Alice and the participant node Bob input their respective private matrices X1T∈Rm×t and X2∈Rt×n based on the secure two-party matrix multiplication protocol 2PMP, to perform the first round of secure two-party matrix multiplication. After execution of the 2PMP protocol computation is completed, the intermediate computation result of this round will be randomly split into matrices Va1, Vb1∈Rm×n according to the random obfuscation technique, which are sent to the participant node Alice and the participant node Bob respectively. The two private output matrices satisfy the following relation: Va1+Vb1=X1T×X2.


Step 2: The participant node Alice locally performs private matrix multiplication Va0=X1T×X1 in parallel, and after the computation is finished, stores the result in a private storage space within the node Alice.


Step 3: The participant node Alice and the participant node Bob input their respective private matrices X1∈Rt×n and X2T∈Rm×t based on the secure two-party matrix multiplication protocol 2PMP, to perform the second round of secure two-party matrix multiplication. After execution of the 2PMP protocol computation is completed, the intermediate result of this round of computation will be randomly split into matrices Va2, Vb2∈Rm×n according to the random obfuscation technique, which are sent to the participant node Alice and the participant node Bob respectively. The two private output matrices satisfy the following relation: Va2+Vb2=X2T×X1.


Step 4: The participant node Bob locally performs private matrix multiplication Vb0=X2T×X2 in parallel, and after the computation is finished, stores the result in a private storage space within the node Bob.


Step 5: The participant node Alice secretly aggregates the intermediate random split matrices Va1, Va2 generated in the previous two rounds of 2PMP protocol execution, and performs private computation ∂1=Σ0≤i≤2Vai=Va0+Va1+Va2 locally, where ∂1∈Rm×n.


Step 6: The participant node Bob secretly aggregates the intermediate random split matrices Vb1, Vb2 generated in the previous two rounds of 2PMP protocol execution, and performs private computation ∂2=Σ0≤j≤2Vbj=Vb0+Vb1+Vb2 locally, where ∂2∈Rm×n.
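The 2PHMP flow above reduces to two 2PMP rounds for the cross terms plus two purely local products. A numpy sketch, with `pmp2` as my stand-in for a full 2PMP round and all identifiers being assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
m, t, n = 2, 3, 2

def pmp2(X, Y):
    # Stand-in for the 2PMP protocol: returns additive shares of X @ Y.
    share = rng.standard_normal((X.shape[0], Y.shape[1]))
    return X @ Y - share, share

# Alice's pair (X1T, X1) and Bob's pair (X2T, X2); the target is
# (X1T + X2T) @ (X1 + X2) = d1 + d2
X1T, X1 = rng.standard_normal((m, t)), rng.standard_normal((t, n))
X2T, X2 = rng.standard_normal((m, t)), rng.standard_normal((t, n))

# Steps 1 and 3: two 2PMP rounds for the cross terms
Va1, Vb1 = pmp2(X1T, X2)   # shares of X1T @ X2
Va2, Vb2 = pmp2(X2T, X1)   # shares of X2T @ X1

# Steps 2 and 4: purely local products, run in parallel with the rounds above
Va0 = X1T @ X1   # at Alice
Vb0 = X2T @ X2   # at Bob

# Steps 5-6: each party aggregates its three pieces into its output share
d1 = Va0 + Va1 + Va2
d2 = Vb0 + Vb1 + Vb2
```

Expanding (X1T+X2T)(X1+X2) gives exactly the four products computed above, which is why two communication rounds suffice.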


The secure three-party matrix hybrid multiplication protocol (3PHMP) problem is defined as follows:


It is known that there are three computation participants, Alice, Bob, and Carol, which are independent of each other and mutually distrusting. Alice holds a set of private data matrices u1∈Rm×t and v1∈Rt×s that are only stored in its own computing node, Bob holds a set of private data matrices u2∈Rm×t and v2∈Rt×s that are only stored in its own computing node, and Carol holds a private data matrix z∈Rs×n that is only stored in its own computing node. The three participants jointly execute a secure three-party matrix hybrid multiplication protocol f((u1, v1), (u2, v2), z)=(u1+u2)×(v1+v2)×z=Va′+Vb′+Vc. Eventually, the three participant nodes of the computation obtain their corresponding output matrices Va′, Vb′, Vc∈Rm×n respectively, which are sent to the computation requester, who aggregates the output matrices to obtain an expected result of the three-party matrix hybrid multiplication. During the computation process, each participant node can only access input/output information related to its own computation process, and cannot access any intermediate computation results or private data information held by the other participants.


The protocol is specifically described as follows:


The secure three-party matrix hybrid multiplication problem typically arises in multi-party modeling scenarios where label information, as private data, is not publicly disclosed. For instance, in service scenarios involving joint modeling, such as a joint assessment model for residents' health levels using multi-party medical and health examination data, an assessment model for customer risk control using customers' multi-party asset and credit data from banks, and a quantitative evaluation for multi-dimensional operational status of cities using electronic government data from organizations at all levels in smart cities, there are often numerous cases of three-party hybrid addition and multiplication of intermediate computation results. Therefore, the secure three-party matrix hybrid multiplication problem is an extension of the previous secure two-party matrix hybrid multiplication problem, and is suitable for joint modeling scenarios with higher data security requirements and stronger privacy constraints. Without loss of generality, it is assumed that in this protocol, the computation participant node Alice has initial input matrices with dimensions u1∈Rm×t and v1∈Rt×s, the computation participant node Bob has initial input matrices with dimensions u2∈Rm×t and v2∈Rt×s, and the computation participant node Carol has an input matrix with dimensions z∈Rs×n. Based on this, a high-speed parallel secure three-party hybrid multiplication protocol is proposed, which includes the following process.


Step 1: The participant node Alice locally performs private matrix multiplication A*=u1×v1, and after the computation is finished, stores the private secret matrix A* ∈Rm×s in a private storage space within the node Alice.


Step 2: The participant node Bob locally performs private matrix multiplication B*=u2×v2 in parallel with step 1, and after the computation is finished, stores the private secret matrix B* ∈Rm×s in a private storage space within the node Bob.


Step 3: The participant node Alice and the participant node Carol input their respective private matrices A* ∈Rm×s and z ∈Rs×n based on the secure two-party matrix multiplication protocol 2PMP, to perform the first round of secure two-party matrix multiplication. After execution of the 2PMP protocol computation is completed, the intermediate computation result of this round will be randomly split into matrices Va0′, Vc0∈Rm×n according to the random obfuscation technique, which are sent to the participant node Alice and the participant node Carol respectively. The two private output matrices satisfy the following relation: Va0′+Vc0=A*×z.


Step 4: The participant node Bob and the participant node Carol input their respective private matrices B*∈Rm×s and z∈Rs×n based on the secure two-party matrix multiplication protocol 2PMP, to perform the second round of secure two-party matrix multiplication in parallel with step 3. After execution of the 2PMP protocol computation is completed, the intermediate computation result of this round will be randomly split into matrices Vb3, Vc3∈Rm×n according to the random obfuscation technique, which are sent to the participant node Bob and the participant node Carol respectively. The two private output matrices satisfy the following relation: Vb3+Vc3=B*×z.


Step 5: The participant node Alice, the participant node Bob, and the participant node Carol input their respective private matrices u1∈Rm×t, v2∈Rt×s and z ∈Rs×n based on the secure three-party matrix multiplication protocol 3PMP, to perform the third round of secure three-party matrix multiplication in parallel with steps 3 and 4. After execution of the 3PMP protocol computation is completed, the intermediate computation result of this round will be randomly split into matrices Va1″, Vb1″, Vc1∈Rm×n according to the random obfuscation technique, which are sent to the participant node Alice, the participant node Bob, and the participant node Carol respectively. The three private output matrices satisfy the following relation: Va1″+Vb1″+Vc1=u1×v2×z.


Step 6: The participant node Alice, the participant node Bob, and the participant node Carol input their respective private matrices v1∈Rt×s, u2∈Rm×t and z ∈Rs×n based on the secure three-party matrix multiplication protocol 3PMP, to perform the fourth round of secure three-party matrix multiplication in parallel with steps 3 to 5. After execution of the 3PMP protocol computation is completed, the intermediate result of this round of computation will be randomly split into matrices Va2″, Vb2″, Vc2∈Rm×n according to the random obfuscation technique, which are sent to the participant node Alice, the participant node Bob, and the participant node Carol respectively. The three private output matrices satisfy the following relation: Va2″+Vb2″+Vc2=u2×v1×z.


Step 7: The participant node Alice secretly aggregates the intermediate random split results Va0′, Va1″, Va2″ generated in the previous four rounds of parallel multiplication protocol execution, and performs private computation Va′=Va0′+Va1″+Va2″ locally, where Va′∈Rm×n.


Step 8: The participant node Bob secretly aggregates the intermediate random split results Vb1″, Vb2″, Vb3 generated in the previous four rounds of parallel multiplication protocol execution, and performs private computation Vb′=Vb1″+Vb2″+Vb3 locally, where Vb′∈Rm×n.


Step 9: The participant node Carol secretly aggregates the intermediate random split results Vc0, Vc1, Vc2, Vc3 generated in the previous four rounds of parallel multiplication protocol execution, and performs private computation Vc=Σ0≤k≤3[Vck]=Vc0+Vc1+Vc2+Vc3 locally, where Vc∈Rm×n.
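The nine steps above can be checked with a minimal plaintext simulation in NumPy. For illustration only, the 2PMP and 3PMP sub-protocols are replaced here by a trusted additive-splitting helper (an assumption of this sketch; in the real protocols no single party ever sees the unsplit product):

```python
import numpy as np

rng = np.random.default_rng(0)

def additive_split(product, parties, rng):
    """Stand-in for the random-obfuscation split of 2PMP/3PMP:
    return `parties` shares that sum to `product`."""
    shares = [rng.standard_normal(product.shape) for _ in range(parties - 1)]
    shares.append(product - sum(shares))
    return shares

m, t, s, n = 4, 3, 3, 2
u1, v1 = rng.standard_normal((m, t)), rng.standard_normal((t, s))  # Alice
u2, v2 = rng.standard_normal((m, t)), rng.standard_normal((t, s))  # Bob
z = rng.standard_normal((s, n))                                    # Carol

# Steps 1-2: local products A* = u1 x v1 and B* = u2 x v2
A_star, B_star = u1 @ v1, u2 @ v2
# Steps 3-4: two 2PMP rounds (Alice/Carol, then Bob/Carol)
Va0, Vc0 = additive_split(A_star @ z, 2, rng)
Vb3, Vc3 = additive_split(B_star @ z, 2, rng)
# Steps 5-6: two 3PMP rounds for the cross terms
Va1, Vb1, Vc1 = additive_split(u1 @ v2 @ z, 3, rng)
Va2, Vb2, Vc2 = additive_split(u2 @ v1 @ z, 3, rng)
# Steps 7-9: each node aggregates its own shares locally
Va = Va0 + Va1 + Va2
Vb = Vb1 + Vb2 + Vb3
Vc = Vc0 + Vc1 + Vc2 + Vc3

# The three aggregated shares reconstruct (u1+u2)(v1+v2)z
assert np.allclose(Va + Vb + Vc, (u1 + u2) @ (v1 + v2) @ z)
```

The four parallel rounds cover the four terms of the expansion u1v1z + u2v2z + u1v2z + u2v1z = (u1+u2)(v1+v2)z, which is why the per-node aggregates sum to the hybrid product.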


Multi-Source Heterogeneous Data Co-Reconstruction Model (MHDCM)
1) Four Common Data Partition Forms for Multi-Source Data Sets

In the process of multi-party secure joint modeling, the key to achieving federated modeling lies in securely and efficiently conducting multi-party collaborative computing without leaking each party's raw data out of local databases. One of the core factors influencing the protocol design paradigm is the variety of data storage structures across complex multi-party data sources. In real-world scenarios, since initial data sets come from multiple different computation participants, such as regulatory agencies, banks, hospitals, and other functional departments, data sets from different data sources have different data partition forms. Different data sources store different sample information or different feature information, and effective preprocessing methods tailored to different data distributions, i.e., data set partition forms, greatly impact the efficiency, accuracy, and security of the entire secure federated computation protocol. These partition forms mainly manifest as horizontal partition, vertical partition, and hybrid partition, and can also be classified into two types based on data structure features: homogeneous partition and heterogeneous partition. Usually, in joint modeling scenarios that do not involve data source privacy and security, the data sets from multiple parties are integrated in a centralized manner with one node representing data from all parties, as shown in FIG. 2(A). This traditional centralized data source model, as the most common data partition manner, is widely used in existing machine learning, deep neural network, and data mining modeling processes that do not involve privacy and security issues. However, in the scenarios involved in the present disclosure, it is necessary to break the limitations of data silos and achieve joint modeling without disclosing data, so the traditional centralized approach is no longer suitable.


FIG. 2(B) shows a data source distribution model of a horizontal partition form, also known as homogeneous partition, where different data sources hold data sets with the same feature space, differing only in the number of sample points. In this partition manner, data sources are partitioned into n data sets with different numbers of sample points (M1˜Mn), and these data sets are stored in n participant nodes. Similarly, FIG. 2(C) shows a data source distribution model of a vertical partition form, also known as heterogeneous partition, where different data sources hold data sets with the same sample sequences, but the feature spaces of the different parties' data sets are entirely different. In this partition manner, data sources are partitioned into n data sets with different features (M1˜Mn), and these data sets are stored in n participant nodes. For more complex data distribution scenarios, where different parties hold data sets with inconsistent sample spaces and feature spaces, a mixed random partition form, as shown in FIG. 2(D), is required to store n data sets (M1˜Mn) across n participant nodes distributively. This classification is a more general case of heterogeneous partition.


2) Multi-Source Heterogeneous Data Co-Reconstruction Model

In scenarios with distributed multi-source heterogeneous data sets, it is necessary to establish paradigms for secure multi-party federated computing models corresponding to different data partition forms. In other words, it is necessary to establish a multi-source heterogeneous data co-reconstruction model (MHDCM). Without loss of generality, a secure two-party regression federated computing problem is taken as an example, where an original data set is partitioned into two parts. M represents a sample space set of observed variables; Y represents a label space set of response variables; M1 and M2 are private sample matrix units, which are partitioned from the sample space set and stored in the computing node Alice and the computing node Bob, respectively; Y1 and Y2 are private label matrix units, which are partitioned from the label space set and stored in the computing node Alice and the computing node Bob, respectively.


Firstly, for joint modeling scenarios where exposure of multi-party data privacy is not taken into consideration, the traditional centralized data source aggregation method mentioned above can be directly applied. This approach stores the different multi-party heterogeneous data sets M1, M2 in the form of a matrix [M] on a trusted server node of either Alice or Bob, and establishes a general centralized model analysis paradigm as shown in FIG. 3(A). Secondly, in joint scenarios that involve the privacy security of two-party homogeneous data sources, such as the form








[M1]   [M1]   [ 0]
[M2] = [ 0] + [M2],




the corresponding multi-source data, i.e., two different homogeneous data sets M1, M2, is stored separately in the participant nodes Alice and Bob in a horizontal partition manner. The analytical expression of the federated computing model is as shown in FIG. 3(B). Similarly, in joint scenarios that involve the privacy security of two-party heterogeneous data sources, such as the form [M1 M2]=[M1 0]+[0 M2], the corresponding multi-source data, i.e., two different heterogeneous data sets M1, M2, is stored separately in the two participant nodes in a vertical partition manner. The analytical expression of the federated computing model is as shown in FIG. 3(C). Furthermore, for more complex scenarios where multi-source heterogeneous data is distributed in a random mixed partition form, such as M=M1+M2, it is necessary to consider both horizontal partition of the sample space and vertical partition of the feature space. In this mixed partition form, two different heterogeneous data sets, M1, M2, are stored separately in two participant nodes, and the analytical expression of the federated computing model is as shown in FIG. 3(D). Clearly, the analytical expressions for two-party data sources in both horizontal and vertical partition scenarios can be generalized as the federated computing analytical expression under the mixed partition form shown in FIG. 3(D). Therefore, whether it is homogeneous or heterogeneous partition form, data subsets can be described using the multi-source heterogeneous analytical expression derived from the mixed partition form. This data representation will be uniformly adopted in the subsequent research on secure three-party linear regression protocol in the present disclosure.
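All three non-centralized partition forms reduce to the same additive expression M = M1 + M2 once each party zero-pads its block to the full shape, which is what makes the mixed-partition representation the general case. A small NumPy illustration (shapes chosen arbitrarily for this sketch):

```python
import numpy as np

M = np.arange(24, dtype=float).reshape(6, 4)  # full data set: 6 samples, 4 features

# Horizontal (homogeneous) partition: same features, disjoint samples.
H1 = np.vstack([M[:3], np.zeros((3, 4))])   # Alice holds the first 3 rows
H2 = np.vstack([np.zeros((3, 4)), M[3:]])   # Bob holds the last 3 rows
assert np.array_equal(H1 + H2, M)

# Vertical (heterogeneous) partition: same samples, disjoint features.
V1 = np.hstack([M[:, :2], np.zeros((6, 2))])  # Alice holds the first 2 columns
V2 = np.hstack([np.zeros((6, 2)), M[:, 2:]])  # Bob holds the last 2 columns
assert np.array_equal(V1 + V2, M)

# Mixed random partition: any additive split M = M1 + M2 subsumes both cases.
rng = np.random.default_rng(1)
M1 = rng.standard_normal(M.shape)
M2 = M - M1
assert np.allclose(M1 + M2, M)
```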


The secure three-party linear regression protocol (3PLRP) is a process that enables three computation participants, Alice, Bob, and Carol, to jointly model multi-source heterogeneous data subsets under the premise that prediction label variables remain undisclosed. In comparison to traditional secure two-party linear regression, the secure three-party linear regression further enhances the protection of label data privacy and is suitable for complex scenarios where a higher level of security is required for joint modeling. Without loss of generality, an original data set {X ∈Rm×n; Y ∈Rm×1} with m sample points is considered. The data set {X; Y} is partitioned into three heterogeneous data subsets using the random mixed partition form described in FIGS. 2A-D, where two heterogeneous data matrices, X1∈Rm×n and X2∈Rm×n, are stored in private databases of the node Alice and the node Bob, respectively. The data set partitioning in the sample space satisfies the following relationship: X=X1+X2. The private label matrix Y ∈Rm×1 is separately assigned and stored in the private database of the participant node Carol. The key challenge in the secure three-party linear regression problem is to collaboratively establish an efficient and secure three-party linear regression model X·β=Y while ensuring that Alice, Bob, and Carol do not expose each other's computation data, and to find an analytical expression for the regression coefficient matrix β∈Rn×1 of the model. The design process of this protocol consists of the following four core secure computation modules:

    • 1. Sample data obfuscation and multiplication module f(X1, X2)=(X1T+X2T)×(X1+X2)=∂1+∂2. This module achieves first-stage random obfuscation encryption of the original data information by obfuscating and multiplying the private original matrices X1, X2 held by the participant node Alice and the participant node Bob.
    • 2. Sample data obfuscation and inversion module f(∂1, ∂2)=(∂1+∂2)−1=u1+u2. This module achieves second-stage random obfuscation encryption of the original data information by obfuscating and inverting the intermediate computation results ∂1, ∂2 of the first-stage encryption held by the participant node Alice and the participant node Bob.
    • 3. Sample data obfuscation and transposition module f(X1, X2)=(X1+X2)T=v1+v2. This module achieves third-stage information encryption of the original data by obfuscating and transposing the private original matrices X1, X2 held by the participant node Alice and the participant node Bob.
    • 4. Label data obfuscation and multiplication module f((u1, v1), (u2, v2), Y)=(u1+u2)·(v1+v2)·Y=Va′+Vb′+Vc. This module achieves fourth-stage random obfuscation encryption of the original data information, yielding the final regression coefficient result, by obfuscating and multiplying the second-stage intermediate results u1, u2 and the third-stage intermediate results v1, v2 held by the participant node Alice and the participant node Bob with the private original label data Y held by the participant node Carol.
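Composed in the clear (with no obfuscation), the four modules reproduce ordinary least squares. A NumPy sketch, under the assumption that XTX is invertible:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 50, 3
X1, X2 = rng.standard_normal((m, n)), rng.standard_normal((m, n))  # Alice, Bob
Y = rng.standard_normal((m, 1))                                    # Carol
X = X1 + X2

# Module 1: obfuscated Gram matrix (X1^T + X2^T)(X1 + X2)
gram = (X1.T + X2.T) @ (X1 + X2)
# Module 2: obfuscated inversion
u = np.linalg.inv(gram)
# Module 3: obfuscated transposition
v = (X1 + X2).T
# Module 4: multiplication with the private labels
beta = u @ v @ Y

# The composition equals the normal-equation solution (X^T X)^{-1} X^T Y.
beta_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
assert np.allclose(beta, beta_ols)
```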


Since the four core secure computation modules mentioned above are built on basic secure two-party and three-party computation protocols, the final secure three-party regression model and the analytical expression f(X, Y)=f(X1, X2, Y)=β=(XT·X)−1·XT·Y=[(X1T+X2T)·(X1+X2)]−1·(X1T+X2T)·Y=Va′+Vb′+Vc for the regression coefficient matrix β can be obtained by combining the previously proposed 2PMP, 3PMP, 2PHMP, 3PHMP, and 2PIP protocols, where Va′, Vb′, and Vc represent private regression coefficient matrix slices held by the node Alice, the node Bob, and the node Carol, respectively. A more specific process of the secure three-party linear regression protocol (3PLRP) is shown in FIG. 4:


Step 1: The participant node Alice and the participant node Bob input their respective private matrices X1∈Rm×n and X2∈Rm×n based on the secure two-party matrix hybrid multiplication protocol 2PHMP, to collaboratively perform first-round secure computation in the sample data obfuscation and multiplication module. After execution of the 2PHMP protocol computation is completed, the intermediate computation result of this round will be randomly split into matrices ∂1, ∂2 ∈Rn×n according to the random obfuscation technique, which are sent to the participant node Alice and the participant node Bob respectively. Based on the 2PHMP protocol, the two private output matrices satisfy the following relation: ∂1+∂2=(X1T+X2T)·(X1+X2).


Step 2: The participant node Alice and the participant node Bob use the intermediate computation results ∂1, ∂2, which are encrypted based on random obfuscation and obtained after the first round of secure computation, as input for second-round secure computation in the sample data obfuscation and inversion module. The computation is performed based on the secure two-party matrix inversion protocol 2PIP. The computation result of this round is randomly split into matrices u1, u2 ∈Rn×n based on the random obfuscation technique, which are then sent to the participant node Alice and the participant node Bob. Based on the 2PIP protocol, the two private output matrices satisfy the following relation: u1+u2=(∂1+∂2)−1.
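The input/output relation of this round can be mimicked with a trusted-dealer stand-in for 2PIP (purely illustrative; the actual protocol computes the inverse without reconstructing ∂1 + ∂2 at any single node):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3
d1 = rng.standard_normal((n, n))   # Alice's share of the Gram matrix from round one
d2 = rng.standard_normal((n, n))   # Bob's share of the Gram matrix from round one

# Stand-in for 2PIP: invert the (never actually reconstructed) sum,
# then re-split the inverse into fresh random shares u1, u2.
inverse = np.linalg.inv(d1 + d2)
u1 = rng.standard_normal((n, n))
u2 = inverse - u1

assert np.allclose(u1 + u2, np.linalg.inv(d1 + d2))
```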


Step 3: The participant node Alice secretly splits a transposed result X1T of the private matrix X1 into a random matrix v1 ∈Rn×m and a difference matrix Δ∈Rn×m, where the matrix X1T satisfies the following relation: X1T=v1+Δ; after the computation is finished, the node Alice sends the difference matrix Δ to the participant node Bob and stores, in a local database of the node Alice, v1 as an encryption output result of third-round secure computation of the sample data obfuscation and transposition module.


Step 4: Upon receiving the difference matrix Δ from the node Alice, the participant node Bob first transposes the private matrix X2 into a matrix X2T, and then secretly obfuscates and overlays the matrix X2T with the difference matrix Δ to obtain a private matrix v2∈Rn×m, where the private matrix satisfies the following relationship: v2=X2T+Δ, and the private matrix v2 is stored in a local database of the node Bob as an encryption output result of third-round secure computation of the sample data obfuscation and transposition module.
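Steps 3 and 4 amount to one re-randomization of the joint transpose; a short NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 5, 3
X1 = rng.standard_normal((m, n))   # Alice's private matrix
X2 = rng.standard_normal((m, n))   # Bob's private matrix

# Step 3 (Alice): split X1^T into a random matrix v1 and a difference matrix Delta.
v1 = rng.standard_normal((n, m))
delta = X1.T - v1                  # X1^T = v1 + delta; only delta leaves Alice

# Step 4 (Bob): overlay the received difference matrix onto X2^T.
v2 = X2.T + delta

# Neither v1 nor v2 alone reveals X1 or X2, yet their sum is the joint transpose.
assert np.allclose(v1 + v2, (X1 + X2).T)
```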


Step 5: The participant node Alice, the participant node Bob, and the participant node Carol input their respective private matrix sequences (u1, v1), (u2, v2), Y based on the secure three-party matrix hybrid multiplication protocol 3PHMP, to collaboratively execute fourth-round secure computation in the label data obfuscation and multiplication module. After execution of the 3PHMP protocol is completed, the computation result of this round will be randomly split into three random slice matrices Va′, Vb′, Vc∈Rn×1 according to the random obfuscation technique, which are sent to the participant node Alice, the participant node Bob, and the participant node Carol respectively. Based on the 3PHMP protocol, the three regression coefficient matrices satisfy the following relation: (u1+u2)·(v1+v2)·Y=u1v1Y+u1v2Y+u2v1Y+u2v2Y=Va′+Vb′+Vc.


Step 6: After the local computations in step 5 are completed, the participant node Alice, the participant node Bob, and the participant node Carol send their private regression coefficient matrices Va′, Vb′, Vc to a requester of secure three-party linear regression computation, where the requester aggregates the private regression coefficient matrices to obtain a final computation result β=(XT·X)−1·XT·Y=Va′+Vb′+Vc. It is evident that:








Va′+Vb′+Vc=(Wa0+Wc0)+(Wa1+Wb1+Wc1)+(Wa2+Wb2+Wc2)+(Wb3+Wc3)=u1v1Y+u1v2Y+u2v1Y+u2v2Y=(u1+u2)·(v1+v2)·Y=(∂1+∂2)−1×(X1T−Δ+X2T+Δ)×Y=[(X1T+X2T)·(X1+X2)]−1×(X1T+X2T)×Y=(XT·X)−1·XT·Y.
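The correctness of the derivation can be checked end to end with a plaintext NumPy simulation in which every sub-protocol is replaced by a trusted additive split (an assumption of this sketch only; the real 2PHMP, 2PIP, and 3PHMP protocols never reconstruct the intermediate values at any node):

```python
import numpy as np

rng = np.random.default_rng(5)

def split2(mat, rng):
    """Stand-in for the random-obfuscation split into two additive shares."""
    r = rng.standard_normal(mat.shape)
    return r, mat - r

m, n = 40, 3
X1, X2 = rng.standard_normal((m, n)), rng.standard_normal((m, n))  # Alice, Bob
Y = rng.standard_normal((m, 1))                                    # Carol
X = X1 + X2

# Step 1 (2PHMP): shares of the Gram matrix
d1, d2 = split2(X.T @ X, rng)
# Step 2 (2PIP): shares of the inverse
u1, u2 = split2(np.linalg.inv(d1 + d2), rng)
# Steps 3-4: transpose split and overlay, v1 + v2 = X^T
v1 = rng.standard_normal((n, m))
delta = X1.T - v1
v2 = X2.T + delta
# Step 5 (3PHMP): three shares of (u1+u2)(v1+v2)Y
total = (u1 + u2) @ (v1 + v2) @ Y
Va = rng.standard_normal(total.shape)
Vb = rng.standard_normal(total.shape)
Vc = total - Va - Vb

# Step 6: the requester aggregates the three regression coefficient slices.
beta = Va + Vb + Vc
assert np.allclose(beta, np.linalg.inv(X.T @ X) @ X.T @ Y)
```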











Security Analysis

The secure three-party linear regression protocol is a process that enables three parties to achieve joint modeling under the assumption that all participant nodes are in a semi-honest model and data is not disclosed. According to the definition of security in information theory, for a systematic algorithmic process, if the computation security of each sub-step of the protocol can be ensured, the entire protocol is considered process-level secure. If it can be ensured that the execution process leading to the final protocol result does not leak original data or intermediate results, the entire protocol is considered result-level secure. The security of this protocol in terms of process-level security and result-level security is explained separately below.


1) Process-Level Security

According to the definition of process-level security and the execution flow of 3PLRP provided in this embodiment, if the security of each step from step 1 to step 5 can be guaranteed, it can be declared that the protocol as a whole complies with process-level security. Specifically, for step 1, the participant node Alice and the participant node Bob strictly follow the computation process of secure two-party matrix hybrid multiplication protocol 2PHMP, while the 2PHMP protocol is composed of the secure two-party matrix multiplication protocol 2PMP, the security and reliability of which have been verified. Therefore, for the semi-honest participant nodes Alice and Bob, there is no risk of leaking their private data matrices when there is no collusion. Hence, the process-level security of step 1 can be guaranteed. For step 2, the participant node Alice and the participant node Bob strictly follow the computation process of the secure two-party matrix inversion protocol 2PIP, while the protocol 2PIP is also composed of the secure two-party matrix multiplication protocol 2PMP, the security and reliability of which have been verified. Therefore, for the semi-honest participant nodes Alice and Bob, there is no risk of leaking their private data matrices when there is no collusion. Hence, the process-level security of step 2 can be guaranteed. For step 3 and step 4, because the transposition and splitting are performed locally, without involving channel security or collusion issues, there is no risk of data leakage. Thus, for the participant node Alice and the participant node Bob, the process-level security of step 3 and step 4 can be strictly guaranteed. 
For step 5, the participant node Alice, the participant node Bob, and the participant node Carol strictly follow the computation process of the secure three-party matrix hybrid multiplication protocol 3PHMP, while the 3PHMP protocol is composed of the secure two-party matrix multiplication protocol 2PMP and the secure three-party matrix multiplication protocol 3PMP, the security and reliability of which have been verified. Therefore, for the semi-honest participant nodes Alice, Bob, and Carol, there is no risk of leaking their private data matrices when there is no collusion. Therefore, the process-level security of step 5 can be guaranteed. In summary, based on the definition of security in the information theory, since the process-level security of each of the foregoing steps can be ensured, the entire secure three-party linear regression protocol strictly complies with process-level security.


2) Result-Level Security

Each output slice produced by the random obfuscation technique is a random share that, on its own, carries no information about the private inputs. Therefore, for the final matrix sequences of regression coefficient results in the secure three-party linear regression joint modeling, Va′, Vb′, Vc, whether before or after aggregation, the leakage of any single slice matrix will not result in the leakage of the original data or intermediate data of the three participants. This makes the protocol strictly comply with result-level security. At the same time, the protocol also complies with process-level security. Under this dual security protection, the protocol can be considered as meeting the security declaration of multi-party computation with semi-honest participant nodes under information theory.


Data disguising methodology is a means of data protection for safeguarding intermediate results in secure multi-party computations. It achieves one-time-one-secret protection by randomly splitting computation results into linear combinations. For most multi-party computations, achieving secure computation typically involves more than one step, so ensuring the safety of intermediate results is an unavoidable problem. For example, when a two-party matrix product A×B is used as an intermediate result of the computation, if either the participant node Alice or the participant node Bob obtains the final matrix A×B, it is possible to reversely deduce the other party's data information. Therefore, not only the security of the original data input but also the security of intermediate values should be ensured during the privacy-preserving computation process. In order to solve this problem, the present disclosure proposes a data obfuscation encryption technique, in which an arbitrary multi-item operation is disassembled into a new multi-item addition method for obfuscating and computing the result of an intermediate value. To illustrate its principle more easily, a basic two-party operation type is exemplified in the embodiment of the present disclosure, and its principle is shown in FIG. 5. It is assumed that Sk=Fk(Ai, Bi), where Fk is a target computation function, Ai is private data belonging to the organization Alice, and Bi is private data belonging to the organization Bob. During execution of each step of the secure multi-party computation protocol, the intermediate result Sk strictly follows the following constraint: Alice only knows its own computation result Ak, Bob only knows Bk, and Ak+Bk=Sk. The formula [Ai:Bi]→[Ak:Bk|Ak+Bk=Fk(Ai, Bi)] represents a transfer process of the intermediate value, during which Alice and Bob are not allowed to exchange each other's data information, including Ak and Bk split from the intermediate computation result.
As long as it is ensured that the intermediate value is divided into two random data items at each step during the computation, it is possible to ensure that no one can reversely deduce an original data item from obfuscated and encrypted data, so that the whole process of privacy-preserving computation is highly secure.
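One round of this intermediate-value disguising can be sketched as follows (Fk here is an arbitrary example function chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(6)

def disguised_step(Ai, Bi, Fk, rng):
    """Compute Sk = Fk(Ai, Bi) but release only random shares Ak, Bk
    with Ak + Bk = Sk; neither share alone reveals Sk, Ai, or Bi."""
    Sk = Fk(Ai, Bi)
    Ak = rng.standard_normal(Sk.shape)   # fresh randomness at every step
    Bk = Sk - Ak                         # one-time additive disguise
    return Ak, Bk

Ai = rng.standard_normal((3, 3))   # Alice's private data
Bi = rng.standard_normal((3, 3))   # Bob's private data
Ak, Bk = disguised_step(Ai, Bi, lambda a, b: a @ b, rng)
assert np.allclose(Ak + Bk, Ai @ Bi)
```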


To better illustrate the technical flow of the present disclosure, an embodiment of the present disclosure provides an apparatus for implementing the secure three-party linear regression model, as shown in FIG. 6. A process of implementation thereof is as follows:


Firstly, a corresponding distributed computing framework needs to be deployed at the computation participant nodes participating in a secure three-party matrix regression task. The framework consists of five modules: a task acquisition module, a secure computation module, a rule generation module, a consensus computation module, and a data transmission module. The task acquisition module is responsible for receiving and decoding a privacy-preserving computation request from a client. The secure computation module automatically matches a corresponding multi-party secure computation protocol according to the parsed computation request. The rule generation module splits a computation task according to an asynchronous instruction set of the secure computation protocol, and different computation nodes perform collaborative computation according to their respective sub-rules. The consensus computation module, after receiving the assigned sub-rules, ensures synchronization of the computation and result consistency through a consensus mechanism. After the computation is completed, the data transmission module collects the computation results of the participant nodes and transmits them to the computation requester.


A specific implementation is as follows: An external client sends, through an HTTP or gRPC communication protocol, a request for regression computation over the original matrices of three parties to a network terminal deployed with the distributed computation service. When the task acquisition module of a node on the network receives the request for matrix regression computation, it parses the request and starts the secure computation service processes of the corresponding computation participants: node 1, node 2, and node 3. After finishing parsing the computation request, the task acquisition module transmits the result to the secure computation module, which performs a joint query through an internal interface and, after matching the corresponding secure computation protocol, synchronizes the result to the rule generation modules in the three participant nodes. The rule generation module generates different asynchronous parallel execution flows according to the different subtasks undertaken by the three participant nodes, and maintains communication with the consensus computation module at each step of execution. The consensus computation module broadcasts and maintains consistency of the results of the distributed computation nodes on a chain and controls the stability of the execution flow while the three participant nodes perform each computation instruction. After execution of the computation protocol is eventually finished and the three participant nodes hold their respective computation sub-results, the three participant nodes send, by using the data transmission module, the sub-matrices split from the obfuscation results to the computation requester, which aggregates them to obtain the correct computation result.


Corresponding to the foregoing method embodiment, an embodiment of the present disclosure provides a privacy-preserving computation system for secure three-party linear regression, including: a first private transposed matrix computing module, a second private transposed matrix computing module, a 2PHMP module, a 2PIP module, a splitting module, an obfuscation and overlay computation module, and a 3PHMP module.


The first private transposed matrix computing module is configured to allow a first participant to compute a transposed matrix of a first private matrix to obtain a first private transposed matrix, where the first private matrix is a private matrix of the first participant.


The second private transposed matrix computing module is configured to allow a second participant to compute a transposed matrix of a second private matrix to obtain a second private transposed matrix, where the second private matrix is a private matrix of the second participant.


The 2PHMP module is configured to allow the first participant and the second participant to process the first private transposed matrix, the second private transposed matrix, the first private matrix, and the second private matrix by using a secure two-party matrix hybrid multiplication protocol, to obtain a matrix ∂1 and a matrix ∂2, where the matrix ∂1 is stored in the first participant, and the matrix ∂2 is stored in the second participant.


The 2PIP module is configured to allow the first participant and the second participant to process the matrix ∂1 and the matrix ∂2 by using a secure two-party matrix inversion protocol, to obtain a matrix u1 and a matrix u2, where the matrix u1 is stored in the first participant, and the matrix u2 is stored in the second participant.


The splitting module is configured to allow the first participant to split the first private transposed matrix into a random matrix v1 and a difference matrix Δ, and send the difference matrix Δ to the second participant.


The obfuscation and overlay computation module is configured to allow the second participant to secretly obfuscate and overlay the second private transposed matrix with the difference matrix Δ to obtain a private matrix v2.


The 3PHMP module is configured to allow a requester of secure three-party linear regression computation, the first participant, the second participant, and a third participant to process a first private matrix sequence, a second private matrix sequence, and a third private matrix by using a secure three-party matrix hybrid multiplication protocol, to obtain a regression coefficient matrix of a secure three-party linear regression model, where the third private matrix is a private matrix of the third participant, the first private matrix sequence includes the matrix u1 and the random matrix v1, and the second private matrix sequence includes the matrix u2 and the random matrix v2.


In practical application, the 2PHMP module specifically includes a first 2PMP unit, a matrix Va0 computing unit, a second 2PMP unit, a matrix Vb0 computing unit, a matrix ∂1 computing unit, and a matrix ∂2 computing unit.


The first 2PMP unit is configured to allow the first participant and the second participant to process the first private transposed matrix and the second private matrix by using a secure two-party matrix multiplication protocol, to obtain a matrix Va1 and a matrix Vb1. The matrix Va1 is stored in the first participant, and the matrix Vb1 is stored in the second participant.


The matrix Va0 computing unit is configured to allow the first participant to compute a product of the first private transposed matrix and the first private matrix, to obtain a matrix Va0.


The second 2PMP unit is configured to allow the first participant and the second participant to process the first private matrix and the second private transposed matrix by using the secure two-party matrix multiplication protocol, to obtain a matrix Va2 and a matrix Vb2. The matrix Va2 is stored in the first participant, and the matrix Vb2 is stored in the second participant.


The matrix Vb0 computing unit is configured to allow the second participant to compute a product of the second private transposed matrix and the second private matrix, to obtain a matrix Vb0.


The matrix ∂1 computing unit is configured to allow the first participant to obtain the matrix ∂1 based on the matrix Va1, the matrix Va2, and the matrix Va0.


The matrix ∂2 computing unit is configured to allow the second participant to obtain the matrix ∂2 based on the matrix Vb1, the matrix Vb2, and the matrix Vb0.
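The six units of the 2PHMP module combine as follows; the two 2PMP invocations are again mocked by a trusted additive split (illustrative only, not the actual 2PMP construction):

```python
import numpy as np

rng = np.random.default_rng(7)

def mock_2pmp(left, right, rng):
    """Stand-in for 2PMP: split left @ right into two additive shares."""
    r = rng.standard_normal((left.shape[0], right.shape[1]))
    return r, left @ right - r

m, n = 6, 3
X1 = rng.standard_normal((m, n))   # first participant's private matrix
X2 = rng.standard_normal((m, n))   # second participant's private matrix

Va1, Vb1 = mock_2pmp(X1.T, X2, rng)   # first 2PMP unit: shares of X1^T X2
Va2, Vb2 = mock_2pmp(X2.T, X1, rng)   # second 2PMP unit: shares of X2^T X1
Va0 = X1.T @ X1                       # local Gram block at the first participant
Vb0 = X2.T @ X2                       # local Gram block at the second participant

d1 = Va0 + Va1 + Va2                  # matrix ∂1 held by the first participant
d2 = Vb0 + Vb1 + Vb2                  # matrix ∂2 held by the second participant
assert np.allclose(d1 + d2, (X1 + X2).T @ (X1 + X2))
```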


The present disclosure has the following technical effects:

    • 1) The present disclosure proposes an end-to-end cascaded three-party federated regression scheme based on the secure two-party matrix hybrid multiplication protocol, secure two-party matrix inversion protocol, and secure three-party matrix hybrid multiplication protocol. Based on modular cascaded coupling, it achieves efficient joint regression modeling, addressing the issues of high client-side computational overhead and long communication time caused by the introduction of cryptographic techniques such as homomorphic encryption, circuit obfuscation, and secret sharing, as seen in the prior art. The present disclosure provides an efficient and secure three-party matrix regression solution in a semi-honest environment.
    • 2) The present disclosure employs 2PMP and 3PMP, which support computation at float64 (double) precision and offer result-reliability verification, as the underlying support technologies for the entire solution. This resolves the problem in the prior art where fixed-length ciphertext computation in circuit obfuscation and homomorphic encryption schemes causes a loss of numerical precision in floating-point computations, and avoids the unreliable result precision caused by the noise injected by differential privacy schemes. This broadens the applicability of secure matrix multiplication, with computation results whose numerical precision is comparable to that of the centralized computation method.
    • 3) Using a matrix transformation approach with random obfuscation, the present disclosure randomly splits the input and output of each stage of computation across different participant nodes. This allows for end-to-end distributed computing with dual protection for models and data throughout the entire process without relying on third-party cloud services. It addresses the security risk in the prior art where federated learning requires interaction of a significant amount of gradient parameters, potentially leading to model leakage. It also fills the technical gap of secure three-party regression protocols in scenarios where label information is not disclosed.
    • 4) In view of the complex heterogeneous data sources in multi-party joint modeling scenarios, the present disclosure establishes a standardized, unified analytical expression of the multi-source heterogeneous data partitioning paradigm, covering three partitioning forms for data sets and label information: horizontal partitioning, vertical partitioning, and mixed partitioning. This further enables the quantification and standardization of the joint regression modeling method.
    • 5) The present disclosure carries out model parsing based on the random mixed partitioning form of data sets in scenarios with distributed multi-source heterogeneous data sources, and designs a unified paradigm for quantified regression models covering three partitioning scenarios: horizontal partitioning, vertical partitioning, and mixed partitioning. This enhances the versatility of the solution for secure three-party linear regression problems.
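The 2PMP primitive referenced in effect 2) is the building block cascaded throughout the scheme. One common realization uses an auxiliary computing node that distributes correlated random matrix pairs (Ra, ra) and (Rb, rb) satisfying ra + rb = Ra·Rb, as recited in the claims. The following plaintext simulation is a sketch under that assumption; the function and variable names are illustrative, not from the disclosure:

```python
import numpy as np

def pmp2(A, B, rng):
    """Plaintext simulation of a two-party secure matrix multiplication:
    returns shares Va, Vb held by the two parties with Va + Vb == A @ B.
    Only the masked matrices A_hat, B_hat, and T would cross the wire."""
    m, k = A.shape
    _, n = B.shape
    # Auxiliary computing node: correlated randomness with ra + rb = Ra @ Rb
    Ra = rng.standard_normal((m, k))
    Rb = rng.standard_normal((k, n))
    ra = rng.standard_normal((m, n))
    rb = Ra @ Rb - ra
    A_hat = A + Ra                     # party A masks its input, sends A_hat to B
    B_hat = B + Rb                     # party B masks its input, sends B_hat to A
    Vb = rng.standard_normal((m, n))   # party B's random output share
    T = A_hat @ B + rb - Vb            # sent from B to A
    Va = T - Ra @ B_hat + ra           # A removes the masks
    return Va, Vb

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 2))
B = rng.standard_normal((2, 3))
Va, Vb = pmp2(A, B, rng)
assert np.allclose(Va + Vb, A @ B)   # shares reconstruct the true product
```

Cascading such shared products, followed by the two-party inversion stage and the three-party hybrid multiplication stage, keeps every intermediate result split across parties, which is what enables the end-to-end dual protection described in effect 3).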


Each embodiment in this description is described in a progressive manner, with each embodiment focusing on its differences from the other embodiments; for identical or similar parts, reference may be made between the embodiments. Since the system disclosed in an embodiment corresponds to the method disclosed in an embodiment, its description is relatively brief, and for related details, reference may be made to the description of the method.


Particular examples are used herein to illustrate the principles and implementation modes of the present disclosure. The descriptions of the above embodiments are merely intended to assist in understanding the method of the present disclosure and its core ideas. In addition, those of ordinary skill in the art may, in accordance with the ideas of the present disclosure, make modifications to the particular implementation modes and the scope of application. In conclusion, the content of this description shall not be construed as limiting the present disclosure.

Claims
  • 1. A privacy-preserving computation method for secure three-party linear regression, comprising: computing, by a first participant, a transposed matrix of a first private matrix to obtain a first private transposed matrix, wherein the first private matrix is a private matrix of the first participant; computing, by a second participant, a transposed matrix of a second private matrix to obtain a second private transposed matrix, wherein the second private matrix is a private matrix of the second participant; processing, by the first participant and the second participant, the first private transposed matrix, the second private transposed matrix, the first private matrix, and the second private matrix by using a secure two-party matrix hybrid multiplication protocol, to obtain a matrix ∂1 and a matrix ∂2, wherein the matrix ∂1 is stored in the first participant, and the matrix ∂2 is stored in the second participant; processing, by the first participant and the second participant, the matrix ∂1 and the matrix ∂2 by using a secure two-party matrix inversion protocol, to obtain a matrix u1 and a matrix u2, wherein the matrix u1 is stored in the first participant, and the matrix u2 is stored in the second participant; splitting, by the first participant, the first private transposed matrix into a random matrix v1 and a difference matrix Δ, and sending the difference matrix Δ to the second participant; secretly obfuscating and overlaying, by the second participant, the second private transposed matrix with the difference matrix Δ to obtain a random matrix v2; and processing, by a requester of secure three-party linear regression computation, the first participant, the second participant, and a third participant, a first private matrix sequence, a second private matrix sequence, and a third private matrix by using a secure three-party matrix hybrid multiplication protocol, to obtain a regression coefficient matrix of a secure three-party linear regression model, wherein the third private matrix is a private matrix of the third participant, the first private matrix sequence comprises the matrix u1 and the random matrix v1, and the second private matrix sequence comprises the matrix u2 and the random matrix v2.
  • 2. The privacy-preserving computation method for secure three-party linear regression according to claim 1, wherein said processing, by the first participant and the second participant, the first private transposed matrix, the second private transposed matrix, the first private matrix, and the second private matrix by using the secure two-party matrix hybrid multiplication protocol to obtain the matrix ∂1 and the matrix ∂2 specifically comprises: processing, by the first participant and the second participant, the first private transposed matrix and the second private matrix by using a secure two-party matrix multiplication protocol, to obtain a matrix Va1 and a matrix Vb1, wherein the matrix Va1 is stored in the first participant, and the matrix Vb1 is stored in the second participant; computing, by the first participant, a product of the first private matrix and the first private transposed matrix, to obtain a matrix Va0; processing, by the first participant and the second participant, the first private matrix and the second private transposed matrix by using the secure two-party matrix multiplication protocol, to obtain a matrix Va2 and a matrix Vb2, wherein the matrix Va2 is stored in the first participant, and the matrix Vb2 is stored in the second participant; computing, by the second participant, a product of the second private matrix and the second private transposed matrix, to obtain a matrix Vb0; obtaining, by the first participant, the matrix ∂1 based on the matrix Va1, the matrix Va2, and the matrix Va0; and obtaining, by the second participant, the matrix ∂2 based on the matrix Vb1, the matrix Vb2, and the matrix Vb0.
  • 3. The privacy-preserving computation method for secure three-party linear regression according to claim 1, wherein said processing, by the first participant and the second participant, the matrix ∂1 and the matrix ∂2 by using the secure two-party matrix inversion protocol, to obtain the matrix u1 and the matrix u2 specifically comprises: randomly generating, by the first participant, a secret matrix P, and obtaining a matrix IA based on the secret matrix P and the matrix ∂1; randomly generating, by the second participant, a secret matrix Q, and obtaining a matrix IB based on the matrix ∂2 and the secret matrix Q; processing, by the first participant and the second participant, the matrix IA and the secret matrix Q by using the secure two-party matrix multiplication protocol, to obtain a matrix Va1′ and a matrix Vb1′, wherein the matrix Va1′ is stored in the first participant, and the matrix Vb1′ is stored in the second participant; processing, by the first participant and the second participant, the secret matrix P and the matrix IB by using the secure two-party matrix multiplication protocol, to obtain a matrix Va2′ and a matrix Vb2′, wherein the matrix Va2′ is stored in the first participant, and the matrix Vb2′ is stored in the second participant; obtaining, by the first participant, a matrix Va based on the matrix Va1′ and the matrix Va2′, and sending the matrix Va to the second participant; obtaining, by the second participant, a matrix IB* based on the secret matrix Q, the matrix Va, the matrix Vb1′, and the matrix Vb2′; and processing, by the first participant and the second participant, the secret matrix P and the matrix IB* by using the secure two-party matrix multiplication protocol, to obtain the matrix u1 and the matrix u2.
  • 4. The privacy-preserving computation method for secure three-party linear regression according to claim 1, wherein said processing, by the first participant and the second participant, the matrix ∂1 and the matrix ∂2 by using the secure two-party matrix inversion protocol, to obtain the matrix u1 and the matrix u2 specifically comprises: randomly generating, by the first participant, a reversible matrix P′ and a reversible matrix Q′; obtaining, by the first participant and the second participant, a matrix Va1′″ and a matrix Vb1′″ based on the reversible matrix P′ and the matrix ∂2 by using the secure two-party matrix multiplication protocol, wherein the matrix Va1′″ is stored in the first participant, and the matrix Vb1′″ is stored in the second participant; obtaining, by the first participant and the second participant, a matrix Va2′″ and a matrix Vb2′″ based on the reversible matrix Q′ and the matrix Vb1′″ by using the secure two-party matrix multiplication protocol, wherein the matrix Va2′″ is stored in the first participant, and the matrix Vb2′″ is stored in the second participant; obtaining, by the first participant, a matrix Va″ based on the matrix Va1′″, the matrix Q′, the matrix Va2′″, the matrix P′, and the matrix ∂1, and sending the matrix Va″ to the second participant; obtaining, by the second participant, a matrix T″ based on the matrix Va″ and the matrix Vb2′″, and computing an inverse matrix of the matrix T″; obtaining, by the first participant and the second participant, a matrix Va3′ and a matrix Vb3′ based on the matrix Q′ and the inverse matrix of the matrix T″ by using the secure two-party matrix multiplication protocol, wherein the matrix Va3′ is stored in the first participant, and the matrix Vb3′ is stored in the second participant; obtaining, by the first participant and the second participant, a matrix Va4 and a matrix Vb4 based on the matrix P′ and the matrix Vb3′ by using the secure two-party matrix multiplication protocol; computing, by the second participant, the matrix u2 based on the matrix Vb4; and obtaining, by the first participant, the matrix u1 based on the matrix Va3′, the matrix P′, and the matrix Va4.
  • 5. The privacy-preserving computation method for secure three-party linear regression according to claim 1, wherein said processing, by the requester of the secure three-party linear regression computation, the first participant, the second participant, and the third participant, the first private matrix sequence, the second private matrix sequence, and the third private matrix by using the secure three-party matrix hybrid multiplication protocol, to obtain the regression coefficient matrix of the secure three-party linear regression model specifically comprises: computing, by the first participant, a matrix A* based on the first private matrix sequence, and computing, by the second participant, a matrix B* based on the second private matrix sequence; obtaining, by the first participant and the third participant, a matrix Va0′ and a matrix Vc0 based on the matrix A* and the third private matrix by using the secure two-party matrix multiplication protocol, wherein the matrix Va0′ is stored in the first participant, and the matrix Vc0 is stored in the third participant; obtaining, by the second participant and the third participant, a matrix Vb3 and a matrix Vc3 based on the matrix B* and the third private matrix by using the secure two-party matrix multiplication protocol, wherein the matrix Vb3 is stored in the second participant, and the matrix Vc3 is stored in the third participant; obtaining, by the first participant, the second participant, and the third participant, a matrix Va1″, a matrix Vb1″, and a matrix Vc1 based on the matrix u1, the random matrix v2, and the third private matrix by using a secure three-party matrix multiplication protocol, wherein the matrix Va1″ is stored in the first participant, the matrix Vb1″ is stored in the second participant, and the matrix Vc1 is stored in the third participant; obtaining, by the second participant, the first participant, and the third participant, a matrix Va2″, a matrix Vb2″, and a matrix Vc2 based on the matrix u2, the random matrix v1, and the third private matrix by using the secure three-party matrix multiplication protocol, wherein the matrix Va2″ is stored in the first participant, the matrix Vb2″ is stored in the second participant, and the matrix Vc2 is stored in the third participant; obtaining, by the first participant, a matrix Va′ based on the matrix Va0′, the matrix Va1″, and the matrix Va2″, and sending the matrix Va′ to the requester of the secure three-party linear regression computation; obtaining, by the second participant, a matrix Vb′ based on the matrix Vb1″, the matrix Vb2″, and the matrix Vb3, and sending the matrix Vb′ to the requester of the secure three-party linear regression computation; obtaining, by the third participant, a matrix Vc based on the matrix Vc0, the matrix Vc1, the matrix Vc2, and the matrix Vc3, and sending the matrix Vc to the requester of the secure three-party linear regression computation; and obtaining, by the requester of the secure three-party linear regression computation, the regression coefficient matrix of the secure three-party linear regression model based on the matrix Va′, the matrix Vb′, and the matrix Vc.
  • 6. The privacy-preserving computation method for secure three-party linear regression according to claim 2, wherein said processing, by the first participant and the second participant, the first private transposed matrix and the second private matrix by using the secure two-party matrix multiplication protocol, to obtain the matrix Va1 and the matrix Vb1 specifically comprises: generating, by an auxiliary computing node, two random matrix pairs (Ra, ra) and (Rb, rb), and sending the random matrix pair (Ra, ra) to the first participant and the random matrix pair (Rb, rb) to the second participant; obtaining, by the first participant, a matrix Â based on the matrix Ra and the first private transposed matrix, and sending the matrix Â to the second participant; obtaining, by the second participant, a matrix B̂ based on the matrix Rb and the second private matrix, and sending the matrix B̂ to the first participant; randomly generating, by the second participant, a matrix Vb1, generating a matrix T based on the matrix Â, the second private matrix, the matrix rb, and the matrix Vb1, and sending the matrix T to the first participant; and obtaining, by the first participant, the matrix Va1 based on the matrix T, the matrix ra, the matrix Ra, and the matrix B̂.
  • 7. The privacy-preserving computation method for secure three-party linear regression according to claim 3, wherein said obtaining, by the second participant, the matrix IB* based on the secret matrix Q, the matrix Va, the matrix Vb1′, and the matrix Vb2′ specifically comprises: obtaining, by the second participant, a matrix Vb based on the matrix Vb1′ and the matrix Vb2′; obtaining a matrix T′ based on the matrix Vb and the matrix Va; and obtaining the matrix IB* based on an inverse matrix of the matrix T′ and the secret matrix Q.
  • 8. The privacy-preserving computation method for secure three-party linear regression according to claim 5, wherein said obtaining, by the first participant, the second participant, and the third participant, the matrix Va1″, the matrix Vb1″, and the matrix Vc1 based on the matrix u1, the random matrix v2, and the third private matrix by using the secure three-party matrix multiplication protocol specifically comprises: randomly generating, by an auxiliary computing node, three random matrix pairs (Ra′, ra′), (Rb′, rb′), and (Rc, rc), and sending the random matrix pair (Ra′, ra′) to the first participant, the random matrix pair (Rb′, rb′) to the second participant, and the random matrix pair (Rc, rc) to the third participant; obtaining, by the first participant, a matrix Â′ based on the matrix Ra′ and the matrix u1, and sending the matrix Â′ to the second participant; obtaining, by the third participant, a matrix Ĉ based on the matrix Rc and the third private matrix, and sending the matrix Ĉ to the second participant; obtaining, by the second participant, a matrix B̂′ based on the matrix Rb′ and the random matrix v2, and determining whether the matrix B̂′ is full rank, to obtain a first determining result; if the first determining result is no, returning to the step of "randomly generating, by an auxiliary computing node, three random matrix pairs (Ra′, ra′), (Rb′, rb′), and (Rc, rc), and sending the random matrix pair (Ra′, ra′) to the first participant, the random matrix pair (Rb′, rb′) to the second participant, and the random matrix pair (Rc, rc) to the third participant"; if the first determining result is yes, obtaining, by the second participant, a matrix Mb based on the matrix Â′, the matrix Rb′, and the matrix Ĉ, computing a matrix φ1 based on the matrix Â′ and the matrix B̂′, obtaining a matrix γ1 based on the matrix Â′ and the matrix Rb′, computing a matrix φ2 based on the matrix B̂′ and the matrix Ĉ, and computing a matrix γ2 based on the matrix Rb′ and the matrix Ĉ, wherein the matrix φ1 and the matrix γ1 are sent to the third participant, while the matrix φ2 and the matrix γ2 are sent to the first participant; computing, by the first participant, a matrix Sa based on the matrix Ra′ and the matrix γ2, and computing a matrix Ma based on the matrix u1 and the matrix φ2; computing, by the third participant, a matrix Sc based on the matrix γ1 and the matrix Rc, and computing a matrix Mc based on the matrix φ1 and the matrix Rc; performing, by the second participant, a full-rank decomposition on the matrix B̂′ to obtain a column full-rank matrix and a row full-rank matrix, and sending the column full-rank matrix to the first participant and the row full-rank matrix to the third participant; randomly generating, by the first participant, the matrix Va1″, obtaining a matrix Ta based on the matrix Ma, the matrix Sa, the matrix Va1″, and the matrix ra′, obtaining a matrix t1 based on the matrix Ra′ and the column full-rank matrix, and sending the matrix Ta and the matrix t1 to the second participant; obtaining, by the third participant, a matrix t2 based on the row full-rank matrix and the matrix Rc, and sending the matrix t2 to the second participant; randomly generating, by the second participant, the matrix Vb1″, obtaining a matrix Sb based on the matrix t1 and the matrix t2, obtaining a matrix Tb based on the matrix Ta, the matrix Mb, the matrix Sb, the matrix Vb1″, and the matrix rb′, and sending the matrix Tb to the third participant; and obtaining, by the third participant, the matrix Vc1 based on the matrix Tb, the matrix Mc, the matrix Sc, and the matrix rc.
  • 9. A privacy-preserving computation system for secure three-party linear regression, comprising: a first private transposed matrix computing module configured to allow a first participant to compute a transposed matrix of a first private matrix to obtain a first private transposed matrix, wherein the first private matrix is a private matrix of the first participant; a second private transposed matrix computing module configured to allow a second participant to compute a transposed matrix of a second private matrix to obtain a second private transposed matrix, wherein the second private matrix is a private matrix of the second participant; a secure two-party matrix hybrid multiplication protocol (2PHMP) module configured to allow the first participant and the second participant to process the first private transposed matrix, the second private transposed matrix, the first private matrix, and the second private matrix by using a secure two-party matrix hybrid multiplication protocol, to obtain a matrix ∂1 and a matrix ∂2, wherein the matrix ∂1 is stored in the first participant, and the matrix ∂2 is stored in the second participant; a secure two-party matrix inversion protocol (2PIP) module configured to allow the first participant and the second participant to process the matrix ∂1 and the matrix ∂2 by using a secure two-party matrix inversion protocol, to obtain a matrix u1 and a matrix u2, wherein the matrix u1 is stored in the first participant, and the matrix u2 is stored in the second participant; a splitting module configured to allow the first participant to split the first private transposed matrix into a random matrix v1 and a difference matrix Δ, and send the difference matrix Δ to the second participant; an obfuscation and overlay computation module configured to allow the second participant to secretly obfuscate and overlay the second private transposed matrix with the difference matrix Δ to obtain a random matrix v2; and a secure three-party matrix hybrid multiplication protocol (3PHMP) module configured to allow a requester of secure three-party linear regression computation, the first participant, the second participant, and a third participant to process a first private matrix sequence, a second private matrix sequence, and a third private matrix by using a secure three-party matrix hybrid multiplication protocol, to obtain a regression coefficient matrix of a secure three-party linear regression model, wherein the third private matrix is a private matrix of the third participant, the first private matrix sequence comprises the matrix u1 and the random matrix v1, and the second private matrix sequence comprises the matrix u2 and the random matrix v2.
  • 10. The privacy-preserving computation system for secure three-party linear regression according to claim 9, wherein the 2PHMP module specifically comprises: a first secure two-party matrix multiplication protocol (2PMP) unit configured to allow the first participant and the second participant to process the first private transposed matrix and the second private matrix by using a secure two-party matrix multiplication protocol, to obtain a matrix Va1 and a matrix Vb1, wherein the matrix Va1 is stored in the first participant, and the matrix Vb1 is stored in the second participant; a matrix Va0 computing unit configured to allow the first participant to compute a product of the first private matrix and the first private transposed matrix, to obtain a matrix Va0; a second 2PMP unit configured to allow the first participant and the second participant to process the first private matrix and the second private transposed matrix by using the secure two-party matrix multiplication protocol, to obtain a matrix Va2 and a matrix Vb2, wherein the matrix Va2 is stored in the first participant, and the matrix Vb2 is stored in the second participant; a matrix Vb0 computing unit configured to allow the second participant to compute a product of the second private matrix and the second private transposed matrix, to obtain a matrix Vb0; a matrix ∂1 computing unit configured to allow the first participant to obtain the matrix ∂1 based on the matrix Va1, the matrix Va2, and the matrix Va0; and a matrix ∂2 computing unit configured to allow the second participant to obtain the matrix ∂2 based on the matrix Vb1, the matrix Vb2, and the matrix Vb0.
Priority Claims (1)
Number Date Country Kind
202311227154X Sep 2023 CN national