One or more embodiments of this specification relate to the field of computer technologies, and in particular, to data processing methods and apparatuses.
Data often include a large amount of private and confidential information, collectively referred to as private data. Many institutions, such as enterprises and hospitals, need to protect private data. How to share data over the Internet without disclosing private information is an important problem in cryptography. Against this background, secure multi-party computation (MPC) has emerged. MPC allows a group of participants who do not trust each other to perform collaborative computing while protecting data privacy. Each such participant is referred to as an MPC computation party.
The data provider randomly splits the private data into a plurality of data components, and provides the data component to each MPC computation party through a secure channel established between the data provider and the MPC computation party. A principle in which the data provider provides a data component to each MPC computation party is that each MPC computation party obtains only some data components rather than all of private data, and the private data can be restored after at least two MPC computation parties exchange the data components. Therefore, it can be ensured that each MPC computation party accesses only some data components. Even if an attacker breaks through an MPC computation party and steals or modifies some data components for a long time period, valid information cannot be obtained.
Because the data provider and the MPC computation parties communicate through a public network, a data processing manner is urgently needed that reduces the pressure placed on the public network by transmission of data components between the data provider and the MPC computation parties.
One or more embodiments of this specification describe a data processing method, to reduce pressure placed by transmission of a data component on transmission in a public network.
According to a first aspect, a data processing method is provided, applied to a system including a data provider and N secure multi-party computation MPC computation parties, where N is an integer greater than 3, and the method includes: Each MPC computation party obtains a data message sent by the data provider, and obtains a first data component based on the data message; and each MPC computation party performs arithmetic sharing processing by using the first data component, to obtain a second data component, so as to perform MPC processing. N data messages received by the N MPC computation parties include: a data message sent after the data provider splits private data into M data components, and M data messages each are used to carry one data component, where M is greater than 1 and is less than or equal to N, and M is a positive integer.
According to some implementable manners of the embodiments of this application, if M=N, and each data message carries one data component, the obtaining a first data component based on the data message includes: a data component carried in the obtained data message is used as the first data component; or if M is greater than 1 and is less than N, the M data messages each carry one data component, and the remaining data messages carry zero data components, the obtaining a first data component based on the data message includes: using the carried data component as the first data component if the obtained data message carries one data component, where the first data component is empty if the obtained data message carries zero data components.
According to some implementable manners of the embodiments of this application, the performing arithmetic sharing processing by using the first data component carried in the data message includes: if M=N, each MPC computation party performs arithmetic sharing processing by using the first data component as to-be-shared data, to obtain the second data component; or if 1<M<N, each MPC computation party performs zero-sharing processing, to obtain a third data component; combining the obtained third data component and the first data component carried in the data message, to obtain a fourth data component; and performing arithmetic sharing processing by using the fourth data component as to-be-shared data, to obtain the second data component.
According to some implementable manners of the embodiments of this application, the zero-sharing processing includes: Each MPC computation party generates a first derived value by using a locally held first zero-sharing key, and generates a second derived value by using a locally held second zero-sharing key; and obtains the third data component based on a difference between the first derived value and the second derived value.
According to some implementable manners of the embodiments of this application, the arithmetic sharing processing includes: sharing local to-be-shared data with a next MPC computation party after encrypting the local to-be-shared data, and receiving and decrypting data shared by a previous MPC computation party; combining the received decrypted data and the local to-be-shared data, to obtain the second data component; and performing, by each MPC computation party, arithmetic sharing processing in a cyclic order.
According to some implementable manners of the embodiments of this application, the second data component is a logical component; and the method further includes: The MPC computation party converts the second data component from a logical component to an arithmetic component, to obtain a fifth data component, so as to perform MPC processing.
According to some implementable manners of the embodiments of this application, the N MPC computation parties include a first MPC computation party, a second MPC computation party, and a third MPC computation party; and that the MPC computation party converts the second data component from a logical component to an arithmetic component includes: Each MPC computation party performs zero-sharing processing, to obtain a sixth data component, where the sixth data component is an arithmetic component; the first MPC computation party performs a first conversion and a second conversion on an arithmetic value by using a locally held logical component, to obtain two options, where the two options are arithmetic components; the first MPC computation party performs an oblivious transfer to the third MPC computation party by using the two options; and each MPC computation party performs arithmetic sharing processing by using a locally obtained arithmetic component as to-be-shared data, to obtain the fifth data component.
According to some implementable manners of the embodiments of this application, the locally held logical component includes a first logical component and a second logical component; that the first MPC computation party performs a first conversion and a second conversion on an arithmetic value by using a locally held logical component includes: The first MPC computation party generates a random value by using an interaction key; and performs the first conversion and the second conversion by using the first logical component and the second logical component that are locally held, the random value, and a quantity of decimal places of a fixed-point number used for the MPC processing, to obtain the two options; and the method further includes: The second MPC computation party generates the random value by using the interaction key.
According to some implementable manners of the embodiments of this application, the second data component is an address-geocoded component; and the method further includes: The MPC computation party converts the second data component from the address-geocoded component to a one-hot encoded component, to obtain a seventh data component, so as to perform the MPC processing.
According to some implementable manners of the embodiments of this application, that the MPC computation party converts the second data component from the address-geocoded component to a one-hot encoded component includes: determining a value of the ith bit of the one-hot encoded component in the following manner: for the jth bit of address geocoding, where j starts from 0, if the jth bit of a binary value of i is 1, determining that a current one-hot encoded component value of the ith bit is addr[0]; otherwise, determining that a current one-hot encoded component value of the ith bit is a complement value of addr[0]; increasing a value of j by 1, and if the jth bit of the binary value of i is 1, multiplying the current one-hot encoded component value of the ith bit by addr[j], and updating the current one-hot encoded component value of the ith bit by using a value obtained through multiplication; otherwise, multiplying the current one-hot encoded component value of the ith bit by a complement value of addr[j], and updating the current one-hot encoded component value of the ith bit by using a value obtained through multiplication; and performing the step of increasing a value of j by 1, until j is a highest-order bit of the address-geocoded component, to obtain a one-hot encoded component value of the ith bit, where addr[0] is a value of the 0th bit of the address-geocoded component, and addr[j] is a value of the jth bit of the address-geocoded component.
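The bit-by-bit rule above can be illustrated with a minimal plaintext sketch. It assumes that the complement value of addr[j] is 1 − addr[j] and that i is counted from 0. The function name address_to_one_hot and the parameter width are hypothetical. In an actual MPC deployment each multiplication would be a secure multiplication on data components, which is not shown here.

```python
def address_to_one_hot(addr_bits, width):
    """Expand address bits (least significant first) into `width` one-hot values."""
    one_hot = []
    for i in range(width):
        value = 1
        for j, a in enumerate(addr_bits):
            # Use addr[j] when bit j of i is 1, otherwise its complement 1 - addr[j].
            value *= a if (i >> j) & 1 else 1 - a
        one_hot.append(value)
    return one_hot


addr = [1, 1, 0, 0]  # address 3, written least significant bit first
one_hot = address_to_one_hot(addr, 13)
```

In this sketch only the position whose binary value equals the address keeps a product of 1, and every other position contains at least one factor of 0.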
According to a second aspect, a data processing method is provided, applied to a system including a data provider and N secure multi-party computation MPC computation parties, where N is an integer greater than 3, and the method includes: The data provider splits private data into M data components, where M is greater than 1 and less than or equal to N, and M is a positive integer; and individually sends the M data components to the N MPC computation parties by using a data message, so that each MPC computation party receives one or zero data components, uses the one or zero data components as a first data component, and performs arithmetic sharing processing by using the first data component, to obtain a second data component, so as to perform MPC processing.
According to some implementable manners of the embodiments of this application, the data component is a logical component or an address-geocoded component.
According to a third aspect, a data processing apparatus is provided, applied to a system including a data provider and N secure multi-party computation MPC computation parties, where N is an integer greater than 3, and the apparatus is disposed on the MPC computation party, and includes: a data obtaining unit, configured to: obtain a data message sent by the data provider, and obtain a first data component by using the data message; and an arithmetic sharing unit, configured to perform arithmetic sharing processing by using the first data component, to obtain a second data component, so as to perform MPC processing, where N data messages received by the N MPC computation parties include: a data message sent after the data provider splits private data into M data components, and M data messages each are used to carry one data component, where 1<M≤N, and M is a positive integer.
According to a fourth aspect, a data processing apparatus is provided, applied to a system including a data provider and N secure multi-party computation MPC computation parties, where N is an integer greater than 3, and the apparatus is disposed on the data provider, and includes: a data splitting unit, configured to split private data into M data components, where 1<M≤N, and M is a positive integer; and a data sending unit, configured to individually send the M data components to the N MPC computation parties by using a data message, so that each MPC computation party receives one or zero data components, uses the one or zero data components as a first data component, and performs arithmetic sharing processing by using the first data component, to obtain a second data component, so as to perform MPC processing.
According to a fifth aspect, a computing device is provided, including a memory and a processor. The memory stores executable code, and when executing the executable code, the processor implements the method in the first aspect.
According to the methods and apparatuses provided in the embodiments of this specification, a quantity of data components transmitted between the data provider and the MPC computation party through a public network is reduced to M. Compared with the original (N−1)*N data components, pressure on transmission in the public network is clearly reduced. The effect is particularly obvious when there is a large quantity of data providers. In addition, the MPC computation party performs arithmetic sharing processing, so that each MPC computation party obtains only some data components, and private data can be restored only by using at least two MPC computation parties.
To describe the technical solutions in embodiments of this application or in the existing technology more clearly, the following briefly describes the accompanying drawings needed for describing the embodiments or the existing technology. Clearly, the accompanying drawings in the following description show some embodiments of this application, and a person of ordinary skill in the art can still derive other drawings from these accompanying drawings without creative efforts.
The terms used in the embodiments of this application are merely used to describe specific embodiments, and are not intended to limit this application. The terms “a”, “said”, and “the” of singular forms used in the embodiments of this application and the appended claims are also intended to include plural forms, unless otherwise specified in the context clearly.
It should be understood that the term “and/or” used in this specification merely describes an association relationship between associated objects and indicates that three relationships can exist. For example, A and/or B can indicate the following three cases: Only A exists, both A and B exist, and only B exists. In addition, the character “/” in this specification usually indicates an “or” relationship between the associated objects.
Depending on the context, for example, the word “if” used here can be interpreted as “while”, “when”, “in response to determining”, or “in response to detecting”. Similarly, depending on the context, the phrase “if determining . . . ” or “if detecting (the condition or event stated)” can be explained as “when determining . . . ”, “in response to determining . . . ”, “when detecting (the condition or event stated)”, or “in response to detecting (the condition or event stated)”.
In an existing data component transmission manner, the data provider splits private data into N data components, and then sends N−1 data components to each MPC computation party, ensuring that each MPC computation party obtains a different set of N−1 data components. The data provider and the MPC computation parties communicate through a public network. In such a transmission manner, each data provider needs to transmit (N−1)*N data components, so network transmission pressure is high. The pressure is especially obvious when there are many data providers.
Solutions provided in this specification are described below with reference to the accompanying drawings.
Step 203: Individually send the M data components to N MPC computation parties by using a data message, so that each MPC computation party receives one or zero data components, and uses the one or zero data components as a first data component.
Step 303: Perform arithmetic sharing processing by using the first data component, to obtain a second data component, so as to perform MPC processing, where N data messages received by the N MPC computation parties include: a data message sent after the data provider splits private data into M data components, and M data messages each are used to carry one data component, where 1<M≤N, and M is a positive integer.
According to the data processing methods shown in
In some implementable manners, M=N, the data provider respectively sends N data components to the N MPC computation parties. In other words, each MPC computation party receives one data component, namely, the first data component, and each MPC computation party receives a different data component.
In this case, after receiving the first data component, the MPC computation party only needs to perform arithmetic sharing processing by using the received first data component.
The arithmetic sharing processing is: sharing local to-be-shared data with a next MPC computation party after encrypting the local to-be-shared data, and receiving and decrypting data shared by a previous MPC computation party; combining the received decrypted data and the local to-be-shared data, to obtain the second data component. In other words, the arithmetic sharing processing is a process in which each MPC computation party performs data sharing in a cyclic order. In addition, a key used when the MPC computation party performs encryption is the same as a key used when the next MPC computation party performs decryption. The key is preconfigured or pre-agreed upon.
This implementation is described by using the three MPC computation parties shown in
The MPC computation party A, the MPC computation party B, and the MPC computation party C jointly perform one time of arithmetic sharing processing. All MPC computation parties pre-agree upon an interaction key, so that each MPC computation party locally has an interaction key pair (share_rng_d, share_rng_u). share_rng_d of the MPC computation party A is the same as share_rng_u of the MPC computation party C, share_rng_d of the MPC computation party B is the same as share_rng_u of the MPC computation party A, and share_rng_d of the MPC computation party C is the same as share_rng_u of the MPC computation party B.
During arithmetic sharing processing, the MPC computation party A encrypts u1 by using share_rng_d, and then transmits the encrypted u1 to the MPC computation party C. The MPC computation party C decrypts the encrypted u1 by using share_rng_u, to obtain u1.
The MPC computation party B encrypts u2 by using share_rng_d, and then transmits the encrypted u2 to the MPC computation party A. The MPC computation party A decrypts the encrypted u2 by using share_rng_u, to obtain u2.
The MPC computation party C encrypts u3 by using share_rng_d, and then transmits the encrypted u3 to the MPC computation party B. The MPC computation party B decrypts the encrypted u3 by using share_rng_u, to obtain u3.
After arithmetic sharing processing, the MPC computation party A locally has u1 and u2, the MPC computation party B locally has u2 and u3, and the MPC computation party C locally has u3 and u1. Only three data components need to be transmitted in the public network. In other words, the six data components that originally need to be transmitted are reduced to three data components. The arithmetic sharing among the MPC computation parties places very little pressure on network transmission, because arithmetic sharing is performed in a high-speed network.
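The exchange among the three MPC computation parties can be sketched as follows. This is a minimal simulation under stated assumptions, not the cipher used in the embodiments: the ring size MOD, the helper names split and ring_share, the seed values, and the use of an additive mask derived from the shared key in place of encryption are all hypothetical.

```python
import random

MOD = 2 ** 32  # assumed ring for data components


def split(secret, n, rng):
    """Split `secret` into n random components that sum to it modulo MOD."""
    parts = [rng.randrange(MOD) for _ in range(n - 1)]
    parts.append((secret - sum(parts)) % MOD)
    return parts


def ring_share(local_values, link_seeds):
    """Party i sends its value to party (i - 1) % n, so A -> C, B -> A, C -> B.

    link_seeds[i] stands in for the key that sender i holds as share_rng_d
    and that its receiver holds as share_rng_u.
    """
    n = len(local_values)
    held = []
    for receiver in range(n):
        sender = (receiver + 1) % n
        # Sender side: mask the value with a pad derived from the shared key.
        pad_out = random.Random(link_seeds[sender]).randrange(MOD)
        ciphertext = (local_values[sender] + pad_out) % MOD
        # Receiver side: derive the same pad from the same key and remove it.
        pad_in = random.Random(link_seeds[sender]).randrange(MOD)
        received = (ciphertext - pad_in) % MOD
        held.append((local_values[receiver], received))
    return held


secret = 123456
u = split(secret, 3, random.Random(1))         # u1, u2, u3 from the data provider
held = ring_share(u, link_seeds=[11, 22, 33])  # index 0 = A, 1 = B, 2 = C
```

After the call, held[0] is (u1, u2), held[1] is (u2, u3), and held[2] is (u3, u1), so any two parties together hold all three components while a single party does not.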
In some other implementable manners, 1<M<N, and the data provider individually sends the M data components to the N MPC computation parties. In other words, some MPC computation parties each receive one data component by using a data message, and each MPC computation party receives a different data component. The other MPC computation parties receive zero data components. For example, an obtained data message carries no data component.
In this case, the carried data component is used as the first data component if a data message obtained by the MPC computation party carries a data component; or the first data component is empty if the obtained data message carries no data component.
In this case, after receiving the first data component, the MPC computation party performs zero-sharing processing, to obtain a third data component; combines the obtained third data component and the received first data component, to obtain a fourth data component; and performs arithmetic sharing processing by using the fourth data component as to-be-shared data, to obtain the second data component.
Zero-sharing processing means that each MPC computation party generates a data component, and the sum of data components generated by all MPC computation parties is 0. Specifically, zero-sharing processing includes: The MPC computation party generates a first derived value by using a locally held first zero-sharing key, and generates a second derived value by using a locally held second zero-sharing key; and obtains a third data component based on a difference between the first derived value and the second derived value. All the MPC computation parties pre-agree upon a zero-sharing key, so that each MPC computation party locally has a key pair (prng, prngu) including a first zero-sharing key and a second zero-sharing key. prng of the MPC computation party A is the same as prngu of the MPC computation party C, prng of the MPC computation party B is the same as prngu of the MPC computation party A, and prng of the MPC computation party C is the same as prngu of the MPC computation party B.
This implementation is described by using the three MPC computation parties shown in
The MPC computation party A, the MPC computation party B, and the MPC computation party C jointly perform one time of zero-sharing processing, to obtain three components x1, x2, and x3 of 0. Specifically, each MPC computation party generates a first derived value buf1 by using prng, generates a second derived value buf2 by using prngu, and uses a value of buf1−buf2 as the third data component obtained through zero-sharing processing.
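The key arrangement and the buf1−buf2 computation can be sketched as follows. This is a minimal sketch: the ring size MOD, the seed values, and the function name zero_share are hypothetical, and Python's random.Random stands in for whatever pseudorandom generator the parties actually key with prng and prngu. A real deployment would keep the generators stateful so that each round yields fresh values.

```python
import random

MOD = 2 ** 32  # assumed ring for data components


def zero_share(prng_seed, prngu_seed):
    """Return buf1 - buf2, where each buffer is derived from one zero-sharing key."""
    buf1 = random.Random(prng_seed).randrange(MOD)
    buf2 = random.Random(prngu_seed).randrange(MOD)
    return (buf1 - buf2) % MOD


# Pre-agreed keys: prng of A equals prngu of C, prng of B equals prngu of A,
# and prng of C equals prngu of B.
k_ac, k_ba, k_cb = 101, 202, 303
keys = {"A": (k_ac, k_ba), "B": (k_ba, k_cb), "C": (k_cb, k_ac)}  # (prng, prngu)

x1, x2, x3 = (zero_share(*keys[party]) for party in "ABC")
```

Each derived value appears once as buf1 at one party and once as buf2 at another, so the three results cancel and x1+x2+x3 is 0 in the ring.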
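Here, y1 and y2 denote the data components received by the MPC computation party A and the MPC computation party B from the data provider, and the MPC computation party C receives zero data components. The MPC computation party A combines the zero-sharing result x1 and the received data component y1, to obtain a fourth data component x1+y1; the MPC computation party B combines the zero-sharing result x2 and the received data component y2, to obtain a fourth data component x2+y2; and the MPC computation party C combines the zero-sharing result x3 and the received zero data components, to obtain a fourth data component x3.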
Then, the MPC computation party A, the MPC computation party B, and the MPC computation party C jointly perform one time of arithmetic sharing processing.
During arithmetic sharing processing, the MPC computation party A encrypts x1+y1 by using share_rng_d, and then transmits the encrypted x1+y1 to the MPC computation party C. The MPC computation party C decrypts the encrypted x1+y1 by using share_rng_u, to obtain x1+y1.
The MPC computation party B encrypts x2+y2 by using share_rng_d, and then transmits the encrypted x2+y2 to the MPC computation party A. The MPC computation party A decrypts the encrypted x2+y2 by using share_rng_u, to obtain x2+y2.
The MPC computation party C encrypts x3 by using share_rng_d, and then transmits the encrypted x3 to the MPC computation party B. The MPC computation party B decrypts the encrypted x3 by using share_rng_u, to obtain x3.
After arithmetic sharing processing, the MPC computation party A locally has u1=x1+y1 and u2=x2+y2, the MPC computation party B locally has u2=x2+y2 and u3=x3, and the MPC computation party C locally has u3=x3 and u1=x1+y1. Only two data components need to be transmitted in the public network. In other words, the six data components that originally need to be transmitted are reduced to two data components. The arithmetic sharing among the MPC computation parties places very little pressure on network transmission, because arithmetic sharing is performed in a high-speed network.
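The flow for M=2 and three MPC computation parties can be checked end to end with a small simulation. This is a sketch under assumptions: the ring size MOD and the variable names are hypothetical, the zero shares are generated centrally here only for brevity, and the encryption on each link is omitted.

```python
import random

MOD = 2 ** 32  # assumed ring for data components
rng = random.Random(7)

secret = 987654321
y1 = rng.randrange(MOD)
y2 = (secret - y1) % MOD  # only these M = 2 components cross the public network

# Zero shares x1, x2, x3 that sum to 0 modulo MOD.
f = [rng.randrange(MOD) for _ in range(3)]
x = [(f[i] - f[(i + 1) % 3]) % MOD for i in range(3)]

# Fourth data components: A holds x1+y1, B holds x2+y2, C holds x3.
fourth = [(x[0] + y1) % MOD, (x[1] + y2) % MOD, x[2]]

# Ring exchange: party i keeps its own value and receives the value of party i+1.
held = [(fourth[i], fourth[(i + 1) % 3]) for i in range(3)]

# A and B together hold u1, u2, u3 and can restore the private data.
restored = (held[0][0] + held[0][1] + held[1][1]) % MOD
```

Because x3 is a random-looking zero share, the MPC computation party C still contributes a meaningful component even though it received no data component from the data provider.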
In a model training or prediction scenario such as machine learning, private data provided by a data provider includes one-hot encoded data of sample feature data. Correspondingly, the first data component transmitted by the data provider is a logical component into which the one-hot encoded data are split, and, because of a requirement of some application scenarios, an arithmetic component corresponding to the logical component is also transmitted. The logical component is a data component whose elements are binary values, and the arithmetic component is a data component whose elements are integer values. For example, for a data set with d features, b bins, and n samples, X[d][b][n] can be used to represent one-hot encoding of sample feature data, and a value of each element of X[d][b][n] is 0 or 1. Each element occupies 1 bit, and X[d][b][n] is split into logical components, each of which has the same size as X[d][b][n]. In addition, the data provider needs to provide an arithmetic component corresponding to each logical component, in which each element of X[d][b][n] is represented by an integer and occupies 32 bits. Such a manner clearly places excessive transmission pressure on the public network. Taking d=200, b=13, and n=1,000,000 as an example, the six data components in full sending total approximately 60 GB, which is difficult to accept. In the manner described in the above-mentioned embodiment, the quantity of data components can be reduced from 6 to 3 or 2. The data amount transmitted between the data provider and the MPC computation party can be further reduced in the following embodiments.
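The size estimate can be reproduced with a short calculation. It assumes that the sample quantity in the example is 1,000,000, that each element costs 1 bit in the logical component plus 32 bits in the arithmetic component, and that the stated figure is in binary gigabytes; the variable names are hypothetical.

```python
d, b, n = 200, 13, 1_000_000   # features, bins, samples (n is an assumed reading)
elements = d * b * n

bits_per_element = 1 + 32      # logical bit plus 32-bit arithmetic value
component_bytes = elements * bits_per_element / 8

full_sending_gib = 6 * component_bytes / 2 ** 30     # six components in full sending
reduced_gib = {m: m * component_bytes / 2 ** 30 for m in (3, 2)}
```

Under these assumptions the full sending comes to roughly 60 GiB, and sending three or two components cuts the amount to one half or one third of that.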
In some embodiments, the first data component sent by the data provider is a logical component, and includes no arithmetic component. In other words, when the second data component is obtained, the second data component is also a logical component, and includes no arithmetic component.
In these embodiments, the data provider sends only a logical component, and sends no arithmetic component, so that a data transmission amount of the public network can be greatly reduced. Correspondingly, when obtaining the second data component, the MPC computation party needs to convert the second data component from a logical component to an arithmetic component, to obtain the fifth data component.
A zero-sharing processing manner is described in detail in the above-mentioned embodiment, and details are omitted here for simplicity. The sixth data component obtained through zero-sharing processing is an arithmetic component. Details are described by using an example in the following embodiments.
Step 603: The first MPC computation party performs a first conversion and a second conversion on an arithmetic value by using a locally held logical component, to obtain two options, where the two options are arithmetic components.
In some implementable manners, the first MPC computation party can generate a random value by using an interaction key; and perform the first conversion and the second conversion by using a first logical component and a second logical component that are locally held, the random value, and a quantity of decimal places of a used fixed-point number, to obtain the two options. In addition, the second MPC computation party can also generate the random value by using the interaction key.
In this step, two options m0 and m1 can be respectively obtained based on the following formulas:
Here, ^ is an exclusive OR operator, << is a left-shift operator, and B is the quantity of decimal places of the fixed-point number used for the MPC processing. The fixed-point number is used in an MPC algorithm, and the fixed-point number is usually a fixed-point decimal. Most numeric data processed by a computer are decimals, and a decimal point is usually implied at a fixed location. This is referred to as a fixed-point representation, and is briefly referred to as a fixed-point number. rnd is the random value, and can be generated by using a locally held interaction key share_rng_u.
Step 605: The first MPC computation party performs an oblivious transfer to the third MPC computation party by using the two options.
The oblivious transfer (OT) is a cryptographic protocol, and is currently widely applied to MPC. A purpose is that an MPC computation party sends m0 or m1 to another MPC computation party, and the other MPC computation party can obtain only one of m0 or m1. The MPC computation party that sends m0 or m1 cannot determine which one of m0 or m1 is obtained by the other MPC computation party.
In some implementable manners, the oblivious transfer in this step can be an oblivious transfer among three parties. The first MPC computation party serves as a sending party, the second MPC computation party serves as a helping party, and the third MPC computation party serves as a receiving party. The first MPC computation party performs one oblivious transfer by using m0 and m1 as two options and using, as the selection value, u3 locally held by the second MPC computation party and the third MPC computation party. Details are described by using an example in the following embodiments.
Step 607: Each MPC computation party performs arithmetic sharing processing by using a locally obtained arithmetic component as to-be-shared data, to obtain the fifth data component.
Specifically, during arithmetic sharing processing, each MPC computation party shares local to-be-shared data with a next MPC computation party after encrypting the local to-be-shared data, and receives and decrypts data shared by a previous MPC computation party. The decrypted data and the local to-be-shared data are combined, to obtain the fifth data component. Each MPC computation party performs the sharing processing in a cyclic order.
This implementation is described by using the three MPC computation parties shown in
The MPC computation party A, the MPC computation party B, and the MPC computation party C jointly perform one time of zero-sharing processing, to respectively obtain sixth data components r1, r2, and r3. Here, r1, r2, and r3 are all arithmetic components.
The MPC computation party A generates a random value rnd by using the local share_rng_u, and determines
Here, “1<<B” is arithmetic component conversion processing. In other words, (1^x.b^x.bu) is converted into an arithmetic value.
The MPC computation party B generates a random value rnd by using the local share_rng_d, and the value is the same as the random value generated by the MPC computation party A. In this case, the MPC computation party B locally holds r2′. Here, r2′=r2+rnd.
The MPC computation party A serves as a sending party of the oblivious transfer, the MPC computation party B serves as a helping party, and the MPC computation party C serves as a receiving party. The MPC computation party A performs one oblivious transfer by using m0 and m1 as two options and using, as the selection value, u3 locally held by the MPC computation party B and the MPC computation party C.
Specifically, the MPC computation party A and the MPC computation party B interact to generate common random values W0 and W1. The MPC computation party A sends m0^W0 and m1^W1 to the MPC computation party C. The MPC computation party B selects Wc by using u3, and sends Wc to the MPC computation party C. Here, Wc is W0 or W1. The MPC computation party C applies Wc to m0^W0 and m1^W1, and can decrypt only one of m0^W0 or m1^W1, to obtain mi. Here, mi is a value in m0 or m1. After the oblivious transfer, the MPC computation party C locally holds r3′. Here, r3′=r3+mi.
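The message flow of this three-party oblivious transfer can be sketched as follows. This is a simplified single-process simulation: the function name three_party_ot and the width BITS are hypothetical, and the sketch assumes that the receiving party indexes the masked option by the selection value it already holds.

```python
import secrets

BITS = 32  # assumed width of the options


def three_party_ot(m0, m1, choice):
    """Simulate sender A, helper B, and receiver C; `choice` is held by B and C."""
    # A and B agree on common random masks W0 and W1.
    w = (secrets.randbits(BITS), secrets.randbits(BITS))
    # A -> C: both options, each masked with its own random value.
    masked = (m0 ^ w[0], m1 ^ w[1])
    # B -> C: only the mask that matches the selection value.
    w_c = w[choice]
    # C: remove the mask from the selected option; the other option stays hidden.
    return masked[choice] ^ w_c


received_0 = three_party_ot(5, 9, choice=0)
received_1 = three_party_ot(5, 9, choice=1)
```

In this sketch the sender never sees the selection value, and the receiver never sees the mask of the option that was not selected.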
The MPC computation party A uses r1 as to-be-shared data, the MPC computation party B uses r2′ as to-be-shared data, and the MPC computation party C uses r3′ as to-be-shared data, to perform one time of arithmetic sharing processing, so as to obtain respective arithmetic components. In other words, the MPC computation party A obtains an arithmetic component r1+r2′, the MPC computation party B obtains an arithmetic component r2′+r3′, and the MPC computation party C obtains an arithmetic component r1+r3′.
The following demonstrates whether an arithmetic component is obtained in the above-mentioned process: Because a value of u3 is 0 or 1:
Here, (1<<B) is actually processing of converting a logical value into an arithmetic value. To be specific, r1+r2′+r3′ equals the logical value (u3^u1^u2) converted into an arithmetic value. Arithmetic sharing is performed once on r1, r2′, and r3′, and the three MPC computation parties each have an arithmetic component.
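The relationship r1+r2′+r3′ = (u3^u1^u2)<<B can be checked with a small plaintext simulation. The option formulas used below, m0 = ((0^u1^u2)<<B) − rnd and m1 = ((1^u1^u2)<<B) − rnd, are an assumption inferred from the surrounding description (r2′=r2+rnd and r3′=r3+mi) and are not quoted from this specification. The 64-bit ring, the value of `FRAC_BITS`, and all function names are likewise hypothetical.

```python
import secrets

MOD = 1 << 64    # assumed ring size for arithmetic components
FRAC_BITS = 16   # assumed value of B, the fixed-point fraction width


def zero_shares() -> tuple[int, int, int]:
    """Return r1, r2, r3 with r1 + r2 + r3 == 0 (mod MOD)."""
    r1 = secrets.randbelow(MOD)
    r2 = secrets.randbelow(MOD)
    return r1, r2, (-r1 - r2) % MOD


def logical_to_arithmetic(u1: int, u2: int, u3: int) -> tuple[int, int, int]:
    """Simulate the conversion of the bit u1^u2^u3 to additive shares."""
    r1, r2, r3 = zero_shares()
    rnd = secrets.randbelow(MOD)  # generated identically by A and B

    # ASSUMED option formulas, computed by A from u1, u2, and rnd.
    m0 = (((0 ^ u1 ^ u2) << FRAC_BITS) - rnd) % MOD
    m1 = (((1 ^ u1 ^ u2) << FRAC_BITS) - rnd) % MOD

    r2_prime = (r2 + rnd) % MOD   # held by B
    mi = m1 if u3 else m0         # C's output of the oblivious transfer
    r3_prime = (r3 + mi) % MOD    # held by C
    return r1, r2_prime, r3_prime
```

Under these assumptions, rnd cancels in the sum, so the three values add up to the logical value shifted left by B bits.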
In this example, the data provider only needs to send a logical component of a sample, but does not need to send an arithmetic component. Compared with a case in which the logical component and the arithmetic component both need to be transmitted, in this example, a data transmission amount is reduced to 1/33.
In some other embodiments, the first data component sent by the data provider can be an address-geocoded component of a logical component. The logical component is a data component whose element is a binary value. In other words, when sending the logical component, the data provider uses address geocoding to represent one-hot encoding, to compact a transmitted data amount. Address geocoding is using a binary integer to represent a location of 1 in one-hot encoding. The case in which a three-dimensional array X[d][b][n] is used to represent sample feature data is still used as an example. If the originally used b-bit one-hot encoding is changed to address geocoding for representation, the data amount of each encoding can be compacted to ceil(log2 b) bits. Here, ceil( ) is a round-up function.
In b-bit one-hot encoding, only 1 bit is 1, and the other bits are 0. Address geocoding of ceil(log2 b) bits is used to represent the location of 1. For example, for one-hot encoding “0000000001000”, address geocoding “1010” can be used to represent that 1 is the 10th bit in one-hot encoding.
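The compaction can be sketched as follows. This example assumes a 0-based index for the location of 1 and stores the address bits least-significant bit first; both conventions, as well as the function names, are choices made for the illustration and are not fixed by the text.

```python
def address_width(b: int) -> int:
    """Number of address bits for b-bit one-hot encoding: ceil(log2 b)."""
    if b < 2:
        raise ValueError("b must be at least 2")
    return (b - 1).bit_length()  # exact integer form of ceil(log2 b)


def one_hot_to_address(one_hot: list[int]) -> list[int]:
    """Return the address bits (least-significant first) of the single 1."""
    if sum(one_hot) != 1 or any(v not in (0, 1) for v in one_hot):
        raise ValueError("input is not a valid one-hot encoding")
    index = one_hot.index(1)  # 0-based location of the 1 (assumption)
    width = address_width(len(one_hot))
    return [(index >> j) & 1 for j in range(width)]
```

For a 16-bit one-hot encoding, `address_width(16)` is 4, so only 4 of the original 16 bits are transmitted.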
Therefore, for the MPC computation party, the obtained second data component is an address-geocoded representation of one-hot encoding, and needs to be converted from the address-geocoded component to a one-hot encoded component, to obtain a seventh data component, so as to perform MPC processing.
The conversion relies on the fact that a value of each bit of the one-hot encoded component can be represented as a product in which each bit of the address-geocoded component contributes either its value or its complement value (namely, a value obtained through complementing). Therefore, the value of each bit of the one-hot encoded component can be obtained by computing the product that corresponds to that bit.
Step 701: Set an initial value of j to 0, where j indexes the jth bit of address geocoding.
Step 703: Determine whether the jth bit of a binary value of i is 1, and perform step 705 if yes; otherwise, perform step 707.
Step 705: Determine that a current one-hot encoded component value one_hot[i] of the ith bit is addr[0], and perform step 709. That is,
one_hot[i]=addr[0].
Here, addr[0] is a value of the 0th bit of the address-geocoded component.
Step 707: Determine that a current one-hot encoded component value one_hot[i] of the ith bit is a complement value of addr[0], and perform step 709. That is,
one_hot[i]=˜addr[0].
Here, ˜ denotes the logical complement operation in MPC processing.
Step 709: Increase the value of j by 1.
Step 711: Determine whether the jth bit of the binary value of i is 1, and perform step 713 if yes; otherwise, perform step 715.
Step 713: Multiply the current one-hot encoded component value of the ith bit by addr[j], and update the current one-hot encoded component value of the ith bit by using a value obtained through multiplication, to perform step 717. That is,
one_hot[i]=one_hot[i]×addr[j].
Here, × is a multiplication operation in MPC processing.
Step 715: Multiply the current one-hot encoded component value of the ith bit by a complement value of addr[j], and update the current one-hot encoded component value of the ith bit by using a value obtained through multiplication, to perform step 717. That is,
one_hot[i]=one_hot[i]×˜addr[j].
Step 717: Determine whether j is the index of the highest-order bit of the address-geocoded component, and perform step 719 if yes; otherwise, perform step 709.
Step 719: Obtain a one-hot encoded component value of the ith bit.
The above-mentioned procedure is executed for each of values of i from 0 to b−1, so that a one-hot encoded component value of each bit can be obtained, and finally, one-hot encoded component values of b bits are obtained.
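The procedure of steps 701 to 719 can be sketched on plaintext bits as follows. In MPC processing the same operations would be applied to data components rather than to plaintext values. Modeling the complement as 1 − x, modeling multiplication as ordinary integer multiplication, and the function name are assumptions made for this illustration.

```python
def address_to_one_hot(addr: list[int], b: int) -> list[int]:
    """Convert address bits (addr[0] is the 0th bit) to b-bit one-hot."""
    one_hot = []
    for i in range(b):
        # Steps 703 to 707: start from addr[0] or its complement,
        # depending on the 0th bit of the binary value of i.
        value = addr[0] if i & 1 else 1 - addr[0]
        # Steps 709 to 717: multiply in the remaining address bits.
        for j in range(1, len(addr)):
            factor = addr[j] if (i >> j) & 1 else 1 - addr[j]
            value = value * factor
        # Step 719: value is the one-hot component value of the ith bit.
        one_hot.append(value)
    return one_hot
```

Only the position whose binary value matches every address bit yields a product of 1; every other position contains at least one factor equal to 0.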
Here, it should be noted that the first data component obtained by each MPC computation party actually includes a plurality of data components. In scenarios shown in
Based on the above-mentioned compaction algorithm, the data provider uses the address-geocoded component to replace the one-hot encoded component for transmission in the public network, to reduce a data transmission amount in the public network to ceil(log2 b)/b of an original amount.
For example, a 16-bit one-hot encoded component is represented by using a 4-bit address-geocoded component. It is assumed that address geocoding received by the MPC computation party is addr[4]={a, b, c, d}, and upper-case letters A, B, C, and D respectively represent logical complement values of MPC processing of a, b, c, and d.
Based on a procedure shown in
“ABCD” is a representation of “A×B×C×D” whose multiplication symbol is omitted. The following representations are all made in such a manner.
During the above-mentioned computation, multiplication can be optimized. Because interaction between MPC computation parties does not need to be performed for an addition operation in MPC processing, and interaction between MPC computation parties needs to be performed for a multiplication operation, multiplication can be converted into addition as much as possible, to reduce multiplication operations as much as possible. Specifically, when multiplication is performed, a product of some values of addr[j] in the address-geocoded component can be computed to obtain a computed product value, and multiplication of the other values of addr[j] is converted into processing of addition of a plurality of addition items. Here, the addition item includes the computed product value. Some addition items can be computed product values, or all addition items can be computed product values.
For example, ab, abc, Abc, aBc, abcd, Abcd, aBcd, ABcd, abCd, AbCd, and aBCd are computed by using a multiplication operation, and other multiplication is converted into addition of the above-mentioned computed results:
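How the remaining terms can follow from the 11 computed products by addition alone can be checked with the plaintext sketch below. It works on single bits, with exclusive OR standing in for the addition of MPC processing and AND standing in for multiplication. The specific identities, such as ABc = c ^ abc ^ Abc ^ aBc, are derived for this illustration and are not quoted from this specification. The function name and the mapping of each term to a one-hot index are likewise assumptions.

```python
def one_hot16(a: int, b: int, c: int, d: int) -> tuple[list[int], int]:
    """Return the 16 one-hot values and the number of multiplications."""
    mults = 0

    def mul(x: int, y: int) -> int:
        nonlocal mults
        mults += 1  # in MPC processing, each call would need interaction
        return x & y

    ab = mul(a, b)
    Ab, aB, AB = b ^ ab, a ^ ab, 1 ^ a ^ b ^ ab          # addition only
    abc, Abc, aBc = mul(ab, c), mul(Ab, c), mul(aB, c)
    ABc = c ^ abc ^ Abc ^ aBc                            # addition only
    abC, AbC, aBC, ABC = ab ^ abc, Ab ^ Abc, aB ^ aBc, AB ^ ABc
    abcd, Abcd = mul(abc, d), mul(Abc, d)
    aBcd, ABcd = mul(aBc, d), mul(ABc, d)
    abCd, AbCd, aBCd = mul(abC, d), mul(AbC, d), mul(aBC, d)
    ABCd = d ^ abcd ^ Abcd ^ aBcd ^ ABcd ^ abCd ^ AbCd ^ aBCd

    terms = {
        "abcd": abcd, "Abcd": Abcd, "aBcd": aBcd, "ABcd": ABcd,
        "abCd": abCd, "AbCd": AbCd, "aBCd": aBCd, "ABCd": ABCd,
        # Terms with D: x*(1^d) equals x ^ x*d, so no new multiplication.
        "abcD": abc ^ abcd, "AbcD": Abc ^ Abcd,
        "aBcD": aBc ^ aBcd, "ABcD": ABc ^ ABcd,
        "abCD": abC ^ abCd, "AbCD": AbC ^ AbCd,
        "aBCD": aBC ^ aBCd, "ABCD": ABC ^ ABCd,
    }
    one_hot = [0] * 16
    for name, value in terms.items():
        # A lower-case letter at position j means bit j of the index is 1.
        index = sum(1 << j for j, ch in enumerate(name) if ch.islower())
        one_hot[index] = value
    return one_hot, mults
```

In this sketch, all 16 values are obtained with 11 multiplications, compared with three multiplications per value when each four-factor product is computed separately.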
Specific embodiments of this specification are described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps described in the claims can be performed in an order different from that in the embodiments, and the desired results can still be achieved. In addition, the process depicted in the accompanying drawings does not necessarily need a particular sequence or consecutive sequence to achieve the desired results. In some implementations, multi-tasking and parallel processing are feasible or may be advantageous.
According to some embodiments of another aspect, a data processing apparatus is provided.
The arithmetic sharing unit 802 is configured to perform arithmetic sharing processing by using the first data component, to obtain a second data component, so as to perform MPC processing.
The N data messages received by the N MPC computation parties include M data messages that are sent after the data provider splits private data into M data components, and the M data messages each carry one data component. Here, 1<M≤N, and M is a positive integer.
In some implementable manners, M=N, each data message carries one data component, and the data obtaining unit 801 can use a data component carried in the obtained data message as the first data component. The arithmetic sharing unit 802 can be specifically configured to perform arithmetic sharing processing by using the first data component as to-be-shared data, so as to obtain the second data component.
In some other implementable manners, M is greater than 1 and less than N, the M data messages each carry one data component, and the remaining data messages carry zero data components. The data obtaining unit 801 can be specifically configured to use the carried data component as the first data component if the obtained data message carries a data component, where the first data component is empty if the obtained data message carries zero data components. Correspondingly, the zero-sharing unit 803 can first perform zero-sharing processing, to obtain a third data component. Then, the arithmetic sharing unit 802 combines the third data component obtained by the zero-sharing unit 803 and the first data component carried in the data message, to obtain a fourth data component; and performs arithmetic sharing processing by using the fourth data component as to-be-shared data, to obtain the second data component.
Specifically, the zero-sharing unit 803 can generate a first derived value by using a locally held first zero-sharing key, and generate a second derived value by using a locally held second zero-sharing key; and obtain the third data component based on a difference between the first derived value and the second derived value.
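The zero-sharing processing described above can be sketched as follows. The derivation function (SHA-256 over the key and a counter), the 64-bit ring, and the assumption that neighboring parties hold one zero-sharing key in common are hypothetical choices for this example. The text specifies only that two derived values are generated from two locally held keys and that their difference is taken.

```python
import hashlib

MOD = 1 << 64  # assumed ring size


def derive(key: bytes, counter: int) -> int:
    """Derive a pseudo-random ring element from a key (assumed PRF)."""
    digest = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big")


def zero_share(first_key: bytes, second_key: bytes, counter: int) -> int:
    """Third data component: difference of the two derived values."""
    return (derive(first_key, counter) - derive(second_key, counter)) % MOD


def simulate_zero_sharing(keys: list[bytes], counter: int) -> list[int]:
    """Party i holds keys[i] and keys[i+1] (cyclically); no messages sent."""
    n = len(keys)
    return [zero_share(keys[i], keys[(i + 1) % n], counter) for i in range(n)]
```

Because each derived value appears once with a plus sign and once with a minus sign across the parties, the components sum to zero without any interaction.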
When performing arithmetic sharing processing, the arithmetic sharing unit 802 shares local to-be-shared data with a next MPC computation party after encrypting the local to-be-shared data, and receives and decrypts data shared by a previous MPC computation party; and combines the received decrypted data and the local to-be-shared data, to obtain the second data component. Each MPC computation party performs arithmetic sharing processing in a cyclic order.
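The cyclic exchange can be sketched as follows. This simulation omits the encryption and decryption of the shared data and models the combination of the received data and the local to-be-shared data as a pair. Both simplifications, the direction of the cycle, and the function name are assumptions made for the illustration.

```python
def arithmetic_share(to_be_shared: list[int]) -> list[tuple[int, int]]:
    """Simulate one round of arithmetic sharing in a cyclic order.

    Party i sends its value to party (i + 1) % n and receives the value
    of party (i - 1) % n; encryption on the channel is not modeled.
    """
    n = len(to_be_shared)
    components = []
    for i in range(n):
        received = to_be_shared[(i - 1) % n]  # from the previous party
        components.append((to_be_shared[i], received))
    return components
```

With three parties, each party ends up holding two of the three values, so no single party holds all of them, while any two parties together do.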
In some implementable manners, the second data component is a logical component. The first conversion unit 804 is configured to convert the second data component from a logical component to an arithmetic component, to obtain a fifth data component, so as to perform MPC processing.
In some implementable manners, the N MPC computation parties in the system include a first MPC computation party, a second MPC computation party, and a third MPC computation party. The first conversion unit 804 can be specifically configured to perform zero-sharing processing, to obtain a sixth data component, where the sixth data component is an arithmetic component. If the apparatus is disposed on the first MPC computation party, the first conversion unit 804 performs a first conversion and a second conversion on an arithmetic value by using a locally held logical component, to obtain two options, where the two options are arithmetic components; and performs an oblivious transfer to the third MPC computation party by using the two options.
The first conversion unit 804 is further configured to perform arithmetic sharing processing by using a locally obtained arithmetic component as to-be-shared data, to obtain the fifth data component.
The locally held logical component includes a first logical component and a second logical component.
When performing the first conversion and the second conversion on the arithmetic value by using the locally held logical component, the first conversion unit 804 can generate a random value by using an interaction key; and perform the first conversion and the second conversion by using the first logical component and the second logical component that are locally held, the random value, and a quantity of decimal places of a fixed-point number used for MPC processing, to obtain the two options.
If the apparatus is disposed on the first MPC computation party, the first conversion unit 804 is further configured to generate the random value by using the interaction key.
If the apparatus is disposed on the third MPC computation party, the first conversion unit 804 is further configured to obtain an option obliviously transferred by the first MPC computation party.
The first conversion unit 804 can respectively obtain options m0 and m1 according to the following formulas:
Here, ^ is an exclusive OR operator; << is a left-shift operator; u1 and u2 are respectively the first logical component and the second logical component; rnd is the random value; and B is the quantity of decimal places of the used fixed-point number.
In some other implementable manners, the second data component is an address-geocoded component. The second conversion unit 805 is configured to convert the second data component from an address-geocoded component to a one-hot encoded component, to obtain a seventh data component, so as to perform MPC processing.
In some implementable manners, the second conversion unit 805 can be specifically configured to obtain a value of each bit of the one-hot encoded component based on a combination product that is of a value or a complement value of each bit of an address-geocoded component and that corresponds to each bit of the one-hot encoded component.
The second conversion unit 805 can be specifically configured to determine a value of the ith bit of the one-hot encoded component in the following manner: for the jth bit of address geocoding, where j is 0 at a start, if the jth bit of a binary value of i is 1, determining that a current one-hot encoded component value of the ith bit is addr[0]; otherwise, determining that a current one-hot encoded component value of the ith bit is a complement value of addr[0]; increasing a value of j by 1, and if the jth bit of the binary value of i is 1, multiplying the current one-hot encoded component value of the ith bit by addr[j], and updating the current one-hot encoded component value of the ith bit by using a value obtained through multiplication; otherwise, multiplying the current one-hot encoded component value of the ith bit by a complement value of addr[j], and updating the current one-hot encoded component value of the ith bit by using a value obtained through multiplication; and performing the step of increasing a value of j by 1, until j is a highest-order bit of the address-geocoded component, to obtain a one-hot encoded component value of the ith bit, where addr[0] is a value of the 0th bit of the address-geocoded component, and addr[j] is a value of the jth bit of the address-geocoded component.
According to some embodiments of another aspect, a data processing apparatus is provided.
In some implementable manners, the data component is a logical component. That is, the data component can include only the logical component, and include no arithmetic component.
In some other implementable manners, the logical component is a one-hot encoded component.
In some preferred implementations, the logical component can be an address-geocoded component.
According to some embodiments of another aspect, a computer-readable storage medium is further provided. The computer-readable storage medium stores a computer program, and when the computer program is executed on a computer, the computer is enabled to perform the method described with reference to
According to some embodiments of still another aspect, a computing device is further provided, including a memory and a processor. The memory stores executable code, and when the processor executes the executable code, the method described with reference to
With development of time and technology, a computer-readable storage medium has a broader meaning. A propagation path of a computer program is not limited to a tangible medium, and the computer program can also be directly downloaded from a network, etc. Any combination of one or more computer-readable storage media can be used. The computer-readable storage medium can be, by way of example rather than limitation, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of the computer-readable storage medium include the following: an electrical connection with one or more leads, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (an EPROM or a flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage component, a magnetic storage device, or any suitable combination thereof. In this specification, the computer-readable storage medium can be any tangible medium that includes or stores a program, and the program can be used by or in combination with an instruction execution system, apparatus, or component.
The processor can include one or more single-core processors or a multi-core processor. The processor can include any combination of a general-purpose processor or a dedicated processor (for example, an image processor, an application processor, or a baseband processor).
Embodiments of this specification are all described in a progressive manner. For same or similar parts in the embodiments, mutual references can be made to the embodiments. Each embodiment focuses on a difference from other embodiments. In particular, the apparatus embodiment is basically similar to the method embodiment, and therefore is described briefly. For related parts, references can be made to related descriptions in the method embodiment.
A person skilled in the art should be aware that, in the above-mentioned one or more examples, functions described in this application can be implemented by hardware, software, firmware, or any combination thereof. When implemented by using software, these functions can be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium.
The specific implementations mentioned above provide further detailed explanations of the objectives, technical solutions, and beneficial effects of this application. It should be understood that the previously mentioned descriptions are merely specific implementations of this application and are not intended to limit the protection scope of this application. Any modifications, equivalent replacements, improvements, etc. made on the basis of the technical solutions of this application shall all fall within the protection scope of this application.
Number | Date | Country | Kind |
---|---|---|---|
202210227410.4 | Mar 2022 | CN | national |
This application is a continuation of PCT Application No. PCT/CN2023/071520, filed on Jan. 10, 2023, which claims priority to Chinese Patent Application No. 202210227410.4, filed on Mar. 8, 2022, and each application is hereby incorporated by reference in its entirety.
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/CN2023/071520 | Jan 2023 | WO |
Child | 18826983 | | US |