HIGH-THROUGHPUT PRIVACY-FRIENDLY HARDWARE ASSISTED MACHINE LEARNING ON EDGE NODES

Information

  • Patent Application
  • 20190332814
  • Publication Number
    20190332814
  • Date Filed
    April 27, 2018
  • Date Published
    October 31, 2019
Abstract
A device, including: a memory; a processor configured to implement an encrypted machine learning model configured to: evaluate the encrypted machine learning model based upon received data to produce an encrypted machine learning model output; and produce verification information; and a tamper resistant hardware configured to: verify the encrypted machine learning model output based upon the verification information; and decrypt the encrypted machine learning model output when the encrypted machine learning model output is verified.
Description
TECHNICAL FIELD

Various exemplary embodiments disclosed herein relate generally to high-throughput privacy-friendly hardware assisted machine learning on edge nodes.


BACKGROUND

Machine learning is a technique which enables a wide range of applications such as forecasting or classification. However, in the current age of the Internet of Things, where the gathered user-sensitive data is used as input to train the models used in such machine learning, privacy becomes an important topic. This means privacy for both the user, who is providing their data, as well as for the entity providing the machine learning model, because this entity has invested a lot of time and effort to train the model and acquire the data needed to produce the model.


SUMMARY

A summary of various exemplary embodiments is presented below. Some simplifications and omissions may be made in the following summary, which is intended to highlight and introduce some aspects of the various exemplary embodiments, but not to limit the scope of the invention. Detailed descriptions of an exemplary embodiment adequate to allow those of ordinary skill in the art to make and use the inventive concepts will follow in later sections.


Various embodiments relate to a device, including: a memory; a processor configured to implement an encrypted machine learning model configured to: evaluate the encrypted machine learning model based upon received data to produce an encrypted machine learning model output; and produce verification information; and a tamper resistant hardware configured to: verify the encrypted machine learning model output based upon the verification information; and decrypt the encrypted machine learning model output when the encrypted machine learning model output is verified.


Various embodiments are described, wherein verification information is a signature and verifying the encrypted machine learning model output includes verifying the signature.


Various embodiments are described, wherein verification information is a signature and producing the verification information includes producing the signature.


Various embodiments are described, wherein verification information is a proof of work and verifying the encrypted machine learning model output includes verifying the proof of work is correct.


Various embodiments are described, wherein verification information is a proof of work and producing the verification information includes producing the proof of work.


Various embodiments are described, wherein the tamper resistant hardware stores a decryption key to decrypt outputs of the encrypted machine learning model.


Various embodiments are described, wherein received data is from an Internet of Things device.


Various embodiments are described, wherein the device is an edge node.


Various embodiments are described, wherein the encrypted machine learning model is encrypted using homomorphic encryption.


Various embodiments are described, wherein the encrypted machine learning model is encrypted using somewhat homomorphic encryption.


Further various embodiments relate to a method of evaluating an encrypted learning model, including: evaluating, by a processor, the encrypted learning model based upon received data to produce an encrypted machine learning model output; producing, by the processor, verification information; verifying, by a tamper resistant hardware, the encrypted machine learning model output based upon the verification information; and decrypting, by a tamper resistant hardware, the encrypted machine learning model output when the encrypted machine learning model output is verified.


Various embodiments are described, wherein verification information is a signature and verifying the encrypted machine learning model output includes verifying the signature.


Various embodiments are described, wherein verification information is a signature and producing the verification information includes producing the signature.


Various embodiments are described, wherein verification information is a proof of work and verifying the encrypted machine learning model output includes verifying the proof of work is correct.


Various embodiments are described, wherein verification information is a proof of work and producing the verification information includes producing the proof of work.


Various embodiments are described, wherein the tamper resistant hardware stores a decryption key to decrypt outputs of the encrypted machine learning model.


Various embodiments are described, wherein received data is from an Internet of Things device.


Various embodiments are described, wherein the processor and the tamper resistant hardware are in an edge node.


Various embodiments are described, wherein the encrypted machine learning model is encrypted using homomorphic encryption.


Various embodiments are described, wherein the encrypted machine learning model is encrypted using somewhat homomorphic encryption.





BRIEF DESCRIPTION OF THE DRAWINGS

In order to better understand various exemplary embodiments, reference is made to the accompanying drawings, wherein:



FIG. 1 illustrates a system including an edge node receiving data from an IoT device; and



FIG. 2 illustrates an exemplary hardware diagram for implementing either the encrypted machine learning model or the tamper resistant hardware.





To facilitate understanding, identical reference numerals have been used to designate elements having substantially the same or similar structure and/or substantially the same or similar function.


DETAILED DESCRIPTION

The description and drawings illustrate the principles of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within its scope. Furthermore, all examples recited herein are principally intended expressly to be for pedagogical purposes to aid the reader in understanding the principles of the invention and the concepts contributed by the inventor(s) to furthering the art and are to be construed as being without limitation to such specifically recited examples and conditions. Additionally, the term, “or,” as used herein, refers to a non-exclusive or (i.e., and/or), unless otherwise indicated (e.g., “or else” or “or in the alternative”). Also, the various embodiments described herein are not necessarily mutually exclusive, as some embodiments can be combined with one or more other embodiments to form new embodiments.


Machine learning is a technique which enables a wide range of applications such as forecasting or classification. However, in the current age of the Internet of Things, where the gathered user-sensitive data is used as input to train the machine learning models as well as used in such machine learning models, privacy becomes an important topic. This means privacy for both the user, who is providing their data, as well as for the entity providing the machine learning model, because this entity has invested a lot of time and effort to train their model and acquire the data needed to produce the model.


Enhancing the privacy behavior of such machine learning algorithms is not new. Most approaches focus on overcoming certain privacy problems or enhancing the performance of the machine learning operations in the cloud setting. In this setting, it is assumed that the generated data is provided by the user, or on behalf of the user by a device (such as an Internet of Things device); this data is subsequently transferred to the cloud in order to perform some computations on this data. Examples include forecasting (e.g., to determine a preference) or classification (e.g., in the context of medical data, predicting whether a person is at high risk of a certain disease). In this context, the user data should be protected because this data may contain private and very sensitive information. The machine learning model, which is used as the main algorithm to compute the output, is often not considered to be sensitive, or the cloud provider is assumed to be trusted. Examples of techniques which have been applied to increase security include homomorphic encryption or multi-party computation.


Embodiments will now be described that illustrate how to protect the machine learning model when it is used on an edge node in an Internet of Things (IoT) Network. These embodiments use a small tamper resistant hardware module to assist an encrypted machine learning model. This allows for high-throughput and low-latency evaluation of the machine learning model in order to compute, for instance, classifications while protecting both the privacy of the user generated data as well as the valuable data stored inside the machine learning model. This is accomplished because the encrypted machine learning model may be implemented on a faster and unsecure processor or processors.


In the edge computing environment, the data generated by the user, or by the Internet of Things device owned by the user, does not necessarily need to be protected, because by assumption it does not leave its own network. However, if the machine learning model is installed on such an edge node in order to provide very fast prediction or classification, then the machine learning model itself may require protection because it contains valuable information. The embodiments described herein provide a way of computing the outcome of the machine learning algorithm even when the machine learning model is protected using a small tamper resistant piece of hardware. This allows for high-throughput predictions and classifications in the setting of edge computing by using fast processors for the predictions and classifications.


The embodiments described herein combine the characteristics of a small tamper resistant piece of hardware in the edge node in a large Internet of Things network to protect the machine learning model used for (among others) fast and efficient classification and prediction in the network of the user. Another advantage, over for instance storing the model inside such a secure piece of hardware, is that this allows for easy and convenient replacement/upgrade of the model. Moreover, depending on the characteristics of the machine learning model, this approach lowers the size of the secure hardware needed and increases the throughput of the machine learning algorithm when the bandwidth to and the processing power of the secure element is restricted.


The embodiments described herein focus on enhancing the privacy of both the user and the owner of the machine learning model in a setting that differs from the commonly considered cloud setting. It is assumed the machine learning model is transferred to an edge node in the Internet of Things network. Such edge computing has multiple advantages because it reduces the communication bandwidth needed between IoT devices and the cloud computing servers by performing analytics close to the source of the data. Moreover, it is assumed the machine learning model is static: hence, no training is done in real-time on the model installed on this edge node.


The embodiments described herein provide a fast and efficient solution to protect the machine learning model as well as the data generated by the Internet of Things device. Note that this latter property is satisfied “for free” when a machine learning model is used that is installed on such an edge node. In this scenario the user data simply never leaves its own network. If adversaries can eavesdrop on this network, then this data could even be encrypted with the public key of the owner of the machine learning model for additional security and privacy guarantees.


The weights used inside an artificial neural network can be the outcome of a long and sophisticated training algorithm on a large dataset which is exclusively owned or acquired by the machine learning model owner. The same observation holds for other machine learning algorithms. Hence, it is assumed that the machine learning model M is transferred to the edge node in a form such that the user cannot deduce any useful information about it. This is called the encryption of the machine learning model and is denoted by encrypt(M). However, just providing the model in this state is insufficient. The encrypted machine learning model should be able to process data that is a function of the user generated data. Let f be a function which takes as input the output of an Internet of Things device, say x in some set X₁, and converts the data x to a form which can be used as input by the encrypted machine learning model, say an element of the set X₂. Hence, the edge node is able to compute the (encrypted) output of the machine learning model encrypt(M)(f(x)), which maps values from X₂ to the possibly encrypted output set Y₁, given access to the encrypted machine learning model encrypt(M) and the converted Internet of Things device output f(x).
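
As an illustration of the conversion function f described above, the following minimal Python sketch maps raw sensor readings to non-negative integers, a form that integer-based encryption schemes can typically consume. The fixed-point scale, the restriction to non-negative readings, and the name SCALE are assumptions made for this example only; the description does not prescribe any particular encoding.

```python
from typing import List

SCALE = 100  # hypothetical fixed-point scale (two decimal places); not from the description


def f(x: List[float], scale: int = SCALE) -> List[int]:
    """Convert raw device output into the integer form assumed for the model input."""
    encoded = []
    for reading in x:
        if reading < 0:
            # Negative values would need an agreed encoding; this sketch leaves that out.
            raise ValueError("this sketch only handles non-negative readings")
        encoded.append(round(reading * scale))
    return encoded


readings = [21.5, 0.25, 3.0]   # example output of an Internet of Things device
model_input = f(readings)      # integers that an encrypted model could take as input
```

When the device already produces values in the form the model expects, f reduces to the identity function and no conversion step is needed.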


In practice, the encryption used to represent the encrypted machine learning model encrypt(M) can be based on a fully or somewhat homomorphic encryption scheme, and the function f is the identity function: i.e., X₁ is identical to X₂ and f(x)=x for all input values x. In this scenario, the edge node may compute the outcome of the machine learning model, but this result will be encrypted under the same key used to encrypt the model M. Fully homomorphic encryption allows one to perform an arbitrary number of arithmetic operations on encrypted data. When using somewhat homomorphic encryption, one can only do a limited number of arithmetic operations on encrypted data. When this exact number of arithmetic operations is fixed, one can optimize all parameters such that the performance is significantly better (compared to a fully homomorphic encryption scheme). Because the user, or the Internet of Things device owned by the user, has no access to this key, the encrypted result cannot be used directly by these Internet of Things devices.
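
The idea of evaluating a model whose weights stay encrypted can be sketched in a few lines of Python. The sketch below is an assumption-laden toy: it uses the Paillier cryptosystem, which is only additively homomorphic, as a stand-in for the fully or somewhat homomorphic schemes mentioned above, it uses tiny hard-coded primes that offer no security, and it restricts the model to a linear function with non-negative integer weights and inputs. All names are hypothetical.

```python
import math
import random

# Toy Paillier parameters: tiny fixed primes, insecure, for illustration only.
P, Q = 1789, 1861
N = P * Q
N_SQ = N * N
G = N + 1
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # private key of the model owner
MU = pow(LAMBDA, -1, N)                               # valid because G = N + 1


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < N under the public key (N, G)."""
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c: int) -> int:
    """Decrypt with the private key; the edge node software never calls this."""
    return ((pow(c, LAMBDA, N_SQ) - 1) // N * MU) % N


def evaluate_encrypted_model(enc_weights, enc_bias, x):
    """Return an encryption of bias + sum(w_i * x_i) using only ciphertext operations."""
    acc = enc_bias
    for c_w, x_i in zip(enc_weights, x):
        # Multiplying ciphertexts adds plaintexts; raising to x_i multiplies by x_i.
        acc = (acc * pow(c_w, x_i, N_SQ)) % N_SQ
    return acc


# Model owner side: encrypt the weights before shipping them to the edge node.
weights, bias = [3, 5, 2], 7
enc_weights = [encrypt(w) for w in weights]
enc_bias = encrypt(bias)

# Edge node side: plain input (f is the identity), encrypted model, encrypted output.
x = [4, 1, 6]
enc_output = evaluate_encrypted_model(enc_weights, enc_bias, x)
```

The edge node ends up holding enc_output but cannot read it, which is exactly the situation the paragraph above describes: the result is encrypted under the model owner's key.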


The embodiments described herein overcome this problem by using a small tamper resistant piece of hardware that has enough memory to hold the private key of the machine learning model owner. This hardware will take encrypted messages as input from the set Y₁, use the decryption key k to decrypt this message, and output the decrypted message. Hence, the tamper resistant hardware module should compute g(y) for y ∈ Y₁ where g: Y₁ → Y₂. However, slightly more functionality is needed to make this secure because, as presented, a malicious user could simply ask this secure hardware to decrypt the entire encrypted model encrypt(M) ∈ Y₁, which defeats the entire purpose of this approach.
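
The weakness of an unrestricted decryption function can be made concrete with a short Python sketch. The stream cipher below is a hypothetical toy built from SHA-256 and is not homomorphic; it only serves to show that a hardware module which decrypts whatever it is handed will also decrypt the model itself. The class and function names are assumptions for illustration.

```python
import hashlib
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a toy keystream from SHA-256 (illustration only, not a vetted cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    stream = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    nonce, body = ciphertext[:16], ciphertext[16:]
    stream = keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


class NaiveHardware:
    """Decrypts any request with the stored key and performs no checks at all."""

    def __init__(self, key: bytes):
        self._key = key

    def g(self, y: bytes) -> bytes:
        return toy_decrypt(self._key, y)


owner_key = secrets.token_bytes(32)
model_bytes = b"weights: 3, 5, 2; bias: 7"
encrypted_model = toy_encrypt(owner_key, model_bytes)

hardware = NaiveHardware(owner_key)
leaked = hardware.g(encrypted_model)  # a malicious user submits the model itself
```

Here leaked equals the plaintext model, so the hardware needs some way to tell legitimate decryption requests from this kind of query.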


One solution to this problem is to create a secure communication channel between the software computing the machine learning algorithm and the hardware module. This could be achieved by, for instance, signing the messages going to the hardware module. The hardware module checks the signatures and can in this way ensure the messages (decryption requests) are indeed coming from the software running the machine learning model.
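
A minimal sketch of such an authenticated channel is shown below in Python. It swaps in an HMAC (a symmetric message authentication code from the standard library) for the signature mentioned above, and it assumes a hypothetical key, CHANNEL_KEY, shared between the software running the model and the hardware module. How that key is provisioned and protected on the unsecure processor is not addressed by this sketch.

```python
import hashlib
import hmac
import secrets

# Hypothetical key shared by the model software and the hardware module.
CHANNEL_KEY = secrets.token_bytes(32)


def sign_request(channel_key: bytes, ciphertext: bytes) -> bytes:
    """Software side: attach an authentication tag to a decryption request."""
    return hmac.new(channel_key, ciphertext, hashlib.sha256).digest()


def request_is_authentic(channel_key: bytes, ciphertext: bytes, tag: bytes) -> bool:
    """Hardware side: recompute the tag and compare in constant time."""
    expected = hmac.new(channel_key, ciphertext, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


# A legitimate request produced by the software that evaluated the model.
model_output_ct = b"ciphertext of a model output"
tag = sign_request(CHANNEL_KEY, model_output_ct)
accepted = request_is_authentic(CHANNEL_KEY, model_output_ct, tag)

# A malicious request for the encrypted model, with a guessed tag and a replayed tag.
encrypted_model_ct = b"ciphertext of the model itself"
guessed = request_is_authentic(CHANNEL_KEY, encrypted_model_ct, secrets.token_bytes(32))
replayed = request_is_authentic(CHANNEL_KEY, encrypted_model_ct, tag)
```

Because the tag is bound to the exact ciphertext, a tag captured from a legitimate request cannot be reused to request decryption of a different ciphertext.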


Another solution is to provide not only the ciphertext to be decrypted but also a proof of work. This allows the hardware module to verify that the machine learning model has been applied to the input data: i.e., that the ciphertext is a valid outcome based on some known inputs. This also ensures that decryption queries for the model itself are no longer possible. In both of these approaches verification information is created, either in the form of a signature or a proof of work.


One of the additional advantages of this approach is that it allows for an easy way to replace or upgrade the model. The owner of the model can simply push another encrypted version to the edge node. This is significantly more difficult when the machine learning model is installed inside the secure hardware.


Another advantage is that the memory requirement for the secure hardware is small: the memory needs sufficient space to hold the private key of the machine learning model owner. In practice this is likely to be significantly smaller as compared to the machine learning model itself. Moreover, this limits the total communication with the secure hardware, which in certain scenarios might be significantly lower compared to the communication within the unprotected processor implementing the encrypted machine learning model.



FIG. 1 illustrates a system including an edge node receiving data from an IoT device. The system 100 may include an IoT device 105 and an edge node 115. The IoT device 105 produces data 110 that is sent to the edge node 115. The edge node 115 may include an encrypted machine learning model 120 and a tamper resistant hardware 130. The encrypted machine learning model 120 receives the data 110 from the IoT device 105. The encrypted machine learning model 120 then produces an encrypted ML output 125 that is sent to the tamper resistant hardware 130.


The encrypted ML output 125 may be sent using a secure channel where the encrypted ML output 125 is signed. The tamper resistant hardware 130 may then check the signed message and verify that the message is coming from the encrypted machine learning model 120 to prevent unauthorized use of the tamper resistant hardware 130. Alternatively, the encrypted ML output 125 may be sent together with a proof of work to verify that the machine learning model has been applied to the input data: i.e., that this is a valid outcome based on some known inputs. These two approaches ensure that decryption queries for the model itself are no longer possible.


The tamper resistant hardware 130 includes a processor 135 and a key storage 140. The key storage 140 stores the key used to decrypt outputs of the encrypted machine learning model 120. The tamper resistant hardware 130 receives the encrypted ML output 125. The tamper resistant hardware 130 also receives verification information, which may be, for example, either a signature or a proof of work, that is used to verify that the data received is valid. The tamper resistant hardware 130 verifies the signature or proof of work and then, if verified, decrypts the encrypted ML output 125 and outputs the decrypted ML output 145. The processor 135 implements the verification and decryption of the encrypted ML output 125 using the decryption key stored in key storage 140.
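
The data flow of FIG. 1 can be sketched end to end in Python as follows. This is an illustrative toy under several assumptions: Paillier with tiny insecure primes stands in for the homomorphic scheme, an HMAC stands in for the signature, the model is a linear function over non-negative integers, and the class names merely mirror the reference numerals of FIG. 1. None of these choices is prescribed by the description.

```python
import hashlib
import hmac
import math
import random
import secrets
from typing import List, Optional, Tuple

P, Q = 1789, 1861                     # toy primes, insecure, illustration only
N, N_SQ = P * Q, (P * Q) ** 2         # public modulus and its square


def paillier_encrypt(m: int) -> int:
    """Model owner side: encrypt a weight under the public key."""
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (pow(N + 1, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def _tag(channel_key: bytes, ciphertext: int) -> bytes:
    return hmac.new(channel_key, str(ciphertext).encode(), hashlib.sha256).digest()


class TamperResistantHardware:
    """Element 130: verification and decryption (135) plus key storage (140)."""

    def __init__(self, channel_key: bytes):
        lam = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
        self._key_storage = {"lam": lam, "mu": pow(lam, -1, N), "channel": channel_key}

    def process(self, encrypted_output: int, verification: bytes) -> Optional[int]:
        """Verify first; decrypt only when the verification information checks out."""
        if not hmac.compare_digest(_tag(self._key_storage["channel"], encrypted_output), verification):
            return None
        lam, mu = self._key_storage["lam"], self._key_storage["mu"]
        return ((pow(encrypted_output, lam, N_SQ) - 1) // N * mu) % N


class EncryptedModel:
    """Element 120: holds encrypted weights only, never the decryption key."""

    def __init__(self, enc_weights: List[int], channel_key: bytes):
        self.enc_weights = enc_weights
        self._channel_key = channel_key

    def evaluate(self, data: List[int]) -> Tuple[int, bytes]:
        acc = 1  # 1 is a valid (trivial) encryption of zero
        for c_w, x_i in zip(self.enc_weights, data):
            acc = (acc * pow(c_w, x_i, N_SQ)) % N_SQ
        return acc, _tag(self._channel_key, acc)  # encrypted ML output 125 plus verification


channel_key = secrets.token_bytes(32)
hardware = TamperResistantHardware(channel_key)
model = EncryptedModel([paillier_encrypt(w) for w in [3, 5, 2]], channel_key)

data = [4, 1, 6]                                   # data 110 from the IoT device 105
encrypted_output, verification = model.evaluate(data)
decrypted_output = hardware.process(encrypted_output, verification)   # decrypted ML output 145

# A direct request to decrypt an encrypted weight, without valid verification, is refused.
blocked = hardware.process(model.enc_weights[0], b"\x00" * 32)
```

The private key material appears only inside TamperResistantHardware, while the bulk of the arithmetic runs in EncryptedModel, which reflects the split between the fast unsecure processor and the small secure module.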



FIG. 2 illustrates an exemplary hardware diagram 200 for implementing either the encrypted machine learning model 120 or the tamper resistant hardware 130 described above. As shown, the device 200 includes a processor 220, memory 230, user interface 240, network interface 250, and storage 260 interconnected via one or more system buses 210. It will be understood that FIG. 2 constitutes, in some respects, an abstraction and that the actual organization of the components of the device 200 may be more complex than illustrated.


The processor 220 may be any hardware device capable of executing instructions stored in memory 230 or storage 260 or otherwise processing data. As such, the processor may include a microprocessor, field programmable gate array (FPGA), application-specific integrated circuit (ASIC), or other similar devices. For the tamper resistant hardware, the processor may be tamper resistant.


The memory 230 may include various memories such as, for example L1, L2, or L3 cache or system memory. As such, the memory 230 may include static random-access memory (SRAM), dynamic RAM (DRAM), flash memory, read only memory (ROM), or other similar memory devices. Further, for the memory in the tamper resistant hardware, the memory may be secure memory that resists tampering.


The user interface 240 may include one or more devices for enabling communication with a user such as an administrator. For example, the user interface 240 may include a display, a mouse, and a keyboard for receiving user commands. In some embodiments, the user interface 240 may include a command line interface or graphical user interface that may be presented to a remote terminal via the network interface 250. In some embodiments, no user interface may be present.


The network interface 250 may include one or more devices for enabling communication with other hardware devices. For example, the network interface 250 may include a network interface card (NIC) configured to communicate according to the Ethernet protocol. Additionally, the network interface 250 may implement a TCP/IP stack for communication according to the TCP/IP protocols. Various alternative or additional hardware or configurations for the network interface 250 will be apparent.


The storage 260 may include one or more machine-readable storage media such as read-only memory (ROM), random-access memory (RAM), magnetic disk storage media, optical storage media, flash-memory devices, or similar storage media. In various embodiments, the storage 260 may store instructions for execution by the processor 220 or data upon which the processor 220 may operate. For example, the storage 260 may store a base operating system 261 for controlling various basic operations of the hardware 200. Further, software for the machine learning model 262, verification 263, and decryption 264 may be stored in the memory, depending on whether the device implements the encrypted machine learning model 120 or the tamper resistant hardware 130. This software may implement the various embodiments described above.


It will be apparent that various information described as stored in the storage 260 may be additionally or alternatively stored in the memory 230. In this respect, the memory 230 may also be considered to constitute a “storage device” and the storage 260 may be considered a “memory.” Various other arrangements will be apparent. Further, the memory 230 and storage 260 may both be considered to be “non-transitory machine-readable media.” Further, this memory may be tamper resistant. As used herein, the term “non-transitory” will be understood to exclude transitory signals but to include all forms of storage, including both volatile and non-volatile memories.


While the host device 200 is shown as including one of each described component, the various components may be duplicated in various embodiments. For example, the processor 220 may include multiple microprocessors that are configured to independently execute the methods described herein or are configured to perform steps or subroutines of the methods described herein such that the multiple processors cooperate to achieve the functionality described herein.


Any combination of specific software running on a processor to implement the embodiments of the invention constitutes a specific dedicated machine.


As used herein, the term “non-transitory machine-readable storage medium” will be understood to exclude a transitory propagation signal but to include all forms of volatile and non-volatile memory.


It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative circuitry embodying the principles of the invention.


Although the various exemplary embodiments have been described in detail with particular reference to certain exemplary aspects thereof, it should be understood that the invention is capable of other embodiments and its details are capable of modifications in various obvious respects. As is clear to those skilled in the art, variations and modifications can be effected while remaining within the spirit and scope of the invention. Accordingly, the foregoing disclosure, description, and figures are for illustrative purposes only and do not in any way limit the invention, which is defined only by the claims.

Claims
  • 1. A device, comprising: a memory; a processor configured to implement an encrypted machine learning model configured to: evaluate the encrypted machine learning model based upon received data to produce an encrypted machine learning model output; producing verification information; a tamper resistant hardware configured to: verify the encrypted machine learning model output based upon the verification information; and decrypt the encrypted machine learning model output when the encrypted machine learning model output is verified.
  • 2. The device of claim 1, wherein verification information is a signature and verifying the encrypted machine learning model output includes verifying the signature.
  • 3. The device of claim 1, wherein verification information is a signature and producing the verification information includes producing the signature.
  • 4. The device of claim 1, wherein verification information is a proof of work and verifying the encrypted machine learning model output includes verifying the proof of work is correct.
  • 5. The device of claim 1, wherein verification information is a proof of work and producing the verification information includes producing the proof of work.
  • 6. The device of claim 1, wherein the tamper resistant hardware stores a decryption key to decrypt outputs of the encrypted machine learning model.
  • 7. The device of claim 1, wherein received data is from an Internet of Things device.
  • 8. The device of claim 1, wherein the device is an edge node.
  • 9. The device of claim 1, wherein the encrypted machine learning model is encrypted using homomorphic encryption.
  • 10. The device of claim 1, wherein the encrypted machine learning model is encrypted using somewhat homomorphic encryption.
  • 11. A method of evaluating an encrypted learning model, comprising: evaluating, by a processor, the encrypted learning model based upon received data to produce an encrypted machine learning model output; producing, by the processor, verification information; verifying, by a tamper resistant hardware, the encrypted machine learning model output based upon the verification information; and decrypting, by a tamper resistant hardware, the encrypted machine learning model output when the encrypted machine learning model output is verified.
  • 12. The method of claim 11, wherein verification information is a signature and verifying the encrypted machine learning model output includes verifying the signature.
  • 13. The method of claim 11, wherein verification information is a signature and producing the verification information includes producing the signature.
  • 14. The method of claim 11, wherein verification information is a proof of work and verifying the encrypted machine learning model output includes verifying the proof of work is correct.
  • 15. The method of claim 11, wherein verification information is a proof of work and producing the verification information includes producing the proof of work.
  • 16. The method of claim 11, wherein the tamper resistant hardware stores a decryption key to decrypt outputs of the encrypted machine learning model.
  • 17. The method of claim 11, wherein received data is from an Internet of Things device.
  • 18. The method of claim 11, wherein the processor and the tamper resistant hardware are in an edge node.
  • 19. The method of claim 11, wherein the encrypted machine learning model is encrypted using homomorphic encryption.
  • 20. The method of claim 11, wherein the encrypted machine learning model is encrypted using somewhat homomorphic encryption.