The development of cloud storage and services (sometimes referred to as “utility computing services”) has allowed users to offload both storage of their data and associated computations on that data. As a result, businesses can choose to forego the expensive proposition of maintaining their own data centers, relying instead on cloud storage and computational services. However, concerns over the loss of privacy (e.g., the loss of the value of private data and computation) present significant challenges to the adoption of cloud services by consumers and businesses alike. Accordingly, many cloud storage solutions employ a level of encryption on the user's data to preserve data privacy. Unfortunately, it is difficult to efficiently perform meaningful computations on encrypted data without decrypting the data first. As such, substantial privacy concerns remain.
Implementations described and claimed herein address the foregoing problems by providing an encryption scheme that allows meaningful, efficient computation on encrypted data. Further, the data providers, computational services, and results consumers work in concert using a somewhat homomorphic encryption scheme to preserve the secrecy while providing practical computational performance. For example, a user's data is transmitted and stored in the cloud in an encrypted format that allows meaningful computations to be performed on the data, without decrypting the data, and the computational constraints for a given application domain allow acceptable computational performance by a cloud-based computational service.
In one implementation, encrypted data is stored within network-accessible storage. The data is encrypted using an encryption scheme that allows predictive analysis on the encrypted data without decrypting the encrypted data. The predictive analysis includes evaluation of polynomials of bounded degree on elements of the encrypted data. The evaluation includes ciphertext addition compositions and a bounded number of ciphertext multiplication compositions. The predictive analysis is performed on the encrypted data without decrypting the encrypted data to create encrypted results, which are transmitted to an entity possessing a decryption key capable of decrypting the encrypted results.
In some implementations, articles of manufacture are provided as computer program products. One implementation of a computer program product provides a tangible computer program storage medium readable by a computing system and encoding a processor-executable program. Other implementations are also described and recited herein.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
One method of maintaining the secrecy of a user's data in a cloud-based computational environment is to store all data in an encrypted format and to perform the computations on the encrypted data, without decrypting the data first. However, previous approaches have proven to be intractable. In contrast, technology described herein provides practical schemes for offloading storage and computation of secret data without decrypting the data by supporting a bounded number of ciphertext multiplication compositions in combination with a potentially very large number of ciphertext addition compositions. Generally, the term “ciphertext” refers to an encrypted data set (e.g., an encrypted message, an encrypted data bit, encrypted text, etc.). For certain selected applications, a somewhat homomorphic encryption (SwHE) scheme, which allows a bounded number of ciphertext multiplication compositions, provides efficiency improvements over fully homomorphic approaches.
In one example application, that of a cloud service for managing electronic medical records (EMR), a potential scenario exists in which many devices continuously or periodically measure and/or collect vital health information about a user (e.g., a patient). The devices stream the health information to a computation system, which can reside in the cloud or within an arbitrary communications network. Over time, the computation system can compute statistics over the collected health information and provide useful feedback pertaining to the care of the patient. For example, the statistics may suggest a change in a course of treatment (e.g., a change in a medicine dosage). Accordingly, the computational system may send an alert to the patient or his or her caregiver to adjust the dosage.
Typically, in such scenarios, the volume of collected data is large and the user would prefer not to store the data locally, thereby suggesting a role for cloud storage. Accordingly, to protect patient privacy within the cloud storage environment, the health information is uploaded to cloud storage in encrypted form. The computational system performs operations on the encrypted health information and returns feedback in the form of encrypted alerts, predictions, recommendations, and/or summaries of the results to the patient or his or her caregiver. Other example applications represent variations of this theme and are detailed below.
Before turning to example implementations in specific application domains (e.g., healthcare, financial analysis, advertising), an introduction of the somewhat homomorphic encryption (SwHE) scheme is warranted. The SwHE scheme, represented by the expression SHE=(SH.Keygen, SH.Enc, SH.Add, SH.Mult, SH.Dec), is associated with a number of parameters:
In one implementation, the SwHE scheme is a function of the following component operations:
and (2) outputting the message as $\tilde{m} \pmod{t}$.
In addition, the SwHE scheme is a function of two homomorphic operations, SH.Add and SH.Mult. In one implementation, in order to homomorphically compute an arbitrary function ƒ, an arithmetic circuit for ƒ (made of addition and multiplication operations over $\mathbb{Z}_t$) may be constructed. The SH.Add and SH.Mult operations are used to iteratively compute ƒ on encrypted inputs. Although the ciphertexts produced by SH.Enc contain two ring elements, the homomorphic operations increase the number of ring elements in the ciphertext. In general, the SH.Add and the SH.Mult operations get as input two ciphertexts $ct=(c_0, c_1, \ldots, c_\delta)$ and $ct'=(c'_0, c'_1, \ldots, c'_\gamma)$. The output of SH.Add contains $\max(\delta+1, \gamma+1)$ ring elements, whereas the output of SH.Mult contains $\delta+\gamma+1$ ring elements.
The output ciphertext is $ct_{multi}=(\hat{c}_0, \ldots, \hat{c}_{\delta+\gamma})$.
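The growth in ciphertext length described above can be illustrated with a short sketch. It assumes, as is usual when a ciphertext is viewed as a polynomial in a symbolic variable, that addition is component-wise and multiplication is a convolution of the components; the text here does not spell out those formulas. The names `sh_add`, `sh_mult`, and the modulus `Q` are hypothetical, and plain integers stand in for ring elements, so this shows only the element counts, not a working cipher.

```python
# Hypothetical modulus; real ring elements are polynomials, not integers.
Q = 2**31 - 1

def sh_add(ct, ct2):
    """Component-wise addition, padding the shorter ciphertext with zeros."""
    n = max(len(ct), len(ct2))
    a = list(ct) + [0] * (n - len(ct))
    b = list(ct2) + [0] * (n - len(ct2))
    return [(x + y) % Q for x, y in zip(a, b)]

def sh_mult(ct, ct2):
    """Convolution of the components (assumed form of the product)."""
    out = [0] * (len(ct) + len(ct2) - 1)
    for i, x in enumerate(ct):
        for j, y in enumerate(ct2):
            out[i + j] = (out[i + j] + x * y) % Q
    return out

# Fresh ciphertexts hold two elements each (delta = gamma = 1).
fresh_a = [5, 7]
fresh_b = [11, 13]

prod = sh_mult(fresh_a, fresh_b)   # delta + gamma + 1 = 3 elements
total = sh_add(prod, fresh_a)      # max(3, 2) = 3 elements
cube = sh_mult(prod, fresh_b)      # 2 + 1 + 1 = 4 elements

print(len(prod), len(total), len(cube))
```

Each multiplication lengthens the ciphertext by one element per fresh operand, while additions never lengthen it beyond the longer input, which is one reason a bounded number of multiplications keeps the cost manageable.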
Accordingly, given the mathematical foundation above, the described technology applies an SwHE scheme to provide predictive analysis including evaluation of polynomials of bounded degree on elements of encrypted data. Generally, predictive analysis uses computational tools, often statistical tools including modeling, data mining, game theory, etc., to analyze data to make predictions about future events, trends, values, etc. In one implementation, predictive analysis employing statistical computations, such as an average, a standard deviation, and a logistic regression, among other computations, may be performed:
is the average
returned as a pair that consists of the numerator and denominator of the expression before taking the square root
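Returning a statistic as a numerator and denominator pair avoids division and square roots, which are not polynomial operations. The sketch below works on plain integers to show the arithmetic only; it assumes the population form of the variance, written as n·Σx² − (Σx)² over n², which needs a single level of multiplication. The function names are hypothetical, and in a deployment the sums and products would be carried out on ciphertexts, with the final division and square root done by the key holder after decryption.

```python
import math

def average_pair(values):
    """Numerator and denominator of the mean; uses additions only."""
    return sum(values), len(values)

def variance_pair(values):
    """Numerator and denominator of the expression under the square root.

    Assumed form: (n * sum(x^2) - (sum x)^2) / n^2, a degree-2 polynomial
    in the inputs, so no input is multiplied more than once.
    """
    n = len(values)
    s = sum(values)
    sq = sum(v * v for v in values)
    return n * sq - s * s, n * n

def finish_std(pair):
    """Done by the key holder after decryption: divide, then take the root."""
    num, den = pair
    return math.sqrt(num / den)

samples = [2, 4, 4, 4, 5, 5, 7, 9]
print(average_pair(samples))                 # (40, 8)
print(variance_pair(samples))                # (256, 64)
print(finish_std(variance_pair(samples)))    # 2.0
```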
Furthermore, using an implementation of the SwHE scheme, the cloud storage system 105 may also perform computations on the uploaded encrypted data on behalf of the patient without decrypting the data itself. In the scenario illustrated in
An encryption operation 204 encrypts the collected data using somewhat homomorphic encryption (SwHE) based on a private key of the data provider. A storing operation 206 uploads the encrypted data to network-accessible storage, such as a cloud storage system. While the encrypted data is stored in the network-accessible storage, it remains encrypted.
A computation operation 208 performs addition and multiplication computations on the encrypted data within the network-accessible storage without decrypting the data. Typically, the computation functions are provided to the network-accessible storage from a function database or service. In one implementation, the computation functions include a number of addition operations and a fixed set of multiplication operations. A communication operation 210 communicates the encrypted results, which remain encrypted based on the data provider's private key, from the computations to the data provider. A decryption operation 212 decrypts the results using the data provider's private key.
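The flow of operations 204 through 212 can be sketched end to end with a toy cipher. The sketch swaps in a simple integer-based symmetric scheme (ciphertext = message + T·noise + key·mask) in place of the ring-based scheme described earlier, purely because it fits in a few lines; the parameters are hypothetical and far too small to be secure. It does share the relevant property: additions are cheap, while each multiplication squares the noise, so only a bounded number of multiplications decrypt correctly.

```python
import secrets

T = 1 << 20        # hypothetical plaintext modulus
KEY_BITS = 256     # hypothetical sizes, not secure parameters
NOISE_BITS = 16
MASK_BITS = 64

def keygen():
    """Secret key: a random odd integer of KEY_BITS bits."""
    return secrets.randbits(KEY_BITS) | (1 << (KEY_BITS - 1)) | 1

def encrypt(key, m):
    """Operation 204: hide m under small noise and a multiple of the key."""
    r = secrets.randbits(NOISE_BITS)
    q = secrets.randbits(MASK_BITS)
    return (m % T) + T * r + key * q

def decrypt(key, c):
    """Operation 212: strip the key multiple, then reduce the noise mod T.

    Correct only while the accumulated noise stays below key / 2.
    """
    noise = c % key
    if noise > key // 2:
        noise -= key
    return noise % T

def cloud_compute(ciphertexts):
    """Operation 208: runs without the key; many additions, one multiplication level."""
    total = sum(ciphertexts)
    total_sq = sum(c * c for c in ciphertexts)
    return total, total_sq

readings = [120, 130, 125]                        # collected data
key = keygen()
stored = [encrypt(key, m) for m in readings]      # operations 204 and 206
enc_total, enc_sq = cloud_compute(stored)         # operation 208
results = (decrypt(key, enc_total), decrypt(key, enc_sq))  # operations 210 and 212
print(results)
```

With these sizes the noise after one multiplication level is roughly 74 bits against a 256-bit key, so the sum and the sum of squares both decrypt correctly; a long chain of multiplications would eventually push the noise past the key and corrupt the result.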
In the example environment 300, one or more data providers are represented by an analyst 302, a market data source 304, and an inventory system 306. Each data provider encrypts its data 308, 310, or 312 using a public key associated with the results consumer entity, such as the CEO of a company. The encrypted data 308, 310, and 312 is uploaded to one or more storage devices 316 of a cloud storage system 318. In addition, financial computation functions 311 are also encrypted using a public key of the results consumer and uploaded as encrypted functions 320 to the one or more devices 316 in the cloud storage system 318. It should be understood that although the uploaded data 308, 310, and 312 and encrypted functions 320 are represented in
In this manner, the results consumer controls his or her private encryption key(s) and, therefore, controls access to both the data 308, 310, and 312 and the encrypted functions 320. A computation system 320 of the cloud storage system 318 can execute the computations within the SwHE scheme without decrypting either the data 308, 310, or 312 or the encrypted functions 320.
In the scenario illustrated in
An encryption operation 404 encrypts the collected data using somewhat homomorphic encryption (SwHE) based on a public key of the results consumer, such as a CEO of a company. A storing operation 406 uploads the encrypted data to network-accessible storage, such as a cloud storage system. While the encrypted data is stored in the network-accessible storage, it remains encrypted.
A computation operation 408 performs addition and multiplication computations on the encrypted data within the network-accessible storage without decrypting the data. An example computation may predict a stock price or a product sales amount. Typically, the computation functions are provided in encrypted form (e.g., using the results consumer's public key) to the network-accessible storage from a function database or service and may be private and proprietary computation functions. In one implementation, the computation functions include a number of addition operations and a fixed set of multiplication operations. A communication operation 410 communicates the encrypted results, which remain encrypted based on the results consumer's public key, from the computations to the results consumer. A decryption operation 412 decrypts the results using the results consumer's private key.
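When the computation function itself is encrypted, its coefficients are ciphertexts too, so even a linear predictor consumes one multiplication per term. The sketch below makes that budget visible with a plaintext stand-in: the `Enc` class performs no encryption at all and only records a value modulo a hypothetical `T` together with the polynomial degree reached so far. `MAX_DEGREE`, `Enc`, and `predict` are illustrative names and values, not part of the described scheme.

```python
from dataclasses import dataclass

T = 65537        # hypothetical plaintext modulus
MAX_DEGREE = 2   # hypothetical bound on multiplication compositions

@dataclass(frozen=True)
class Enc:
    """Plaintext stand-in for a ciphertext: NOT encrypted, tracks degree only."""
    value: int
    degree: int = 1

    def __add__(self, other):
        # Addition keeps the larger degree of its two inputs.
        return Enc((self.value + other.value) % T, max(self.degree, other.degree))

    def __mul__(self, other):
        # Multiplication adds degrees and is refused past the bound.
        degree = self.degree + other.degree
        if degree > MAX_DEGREE:
            raise ValueError("multiplication bound exceeded")
        return Enc((self.value * other.value) % T, degree)

def predict(weights, features):
    """Weighted sum of features; weights play the role of the encrypted function."""
    acc = weights[0] * features[0]
    for w, x in zip(weights[1:], features[1:]):
        acc = acc + w * x
    return acc

weights = [Enc(3), Enc(2), Enc(5)]
features = [Enc(10), Enc(20), Enc(1)]
forecast = predict(weights, features)
print(forecast.value, forecast.degree)   # 75 2
```

Under this bound the forecast can still be added to other results freely, but multiplying it by a further encrypted value would exceed the degree limit, which is the kind of constraint an application designer would check before choosing scheme parameters.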
Using SwHE, the consumer can encrypt the contextual information 505 before uploading it to the cloud storage system 504, thereby protecting against privacy breaches. In addition, the advertising company uploads ads 514 to the cloud storage system 504. The computation system 512 computes one or more functions on the encrypted contextual data stored in the storage 506 to determine which ads 514 to encrypt and send to the consumer. The selected ads 510 and any contextual information in the ads 510 are encrypted to the consumer's public key. Accordingly, the consumer can decrypt the received, encrypted ad 510 using his or her private key 516.
An encryption operation 604 encrypts the collected data using somewhat homomorphic encryption (SwHE) based on a public key of the data provider. A storing operation 606 uploads the encrypted data to network-accessible storage, such as a cloud storage system. While the encrypted data is stored in the network-accessible storage, it remains encrypted.
A computation operation 608 performs addition and multiplication computations on the encrypted data within the network-accessible storage without decrypting the data. An example computation may select an advertisement or coupon (collectively, “promotions”) to be presented to the data provider; promotions are typically encrypted using the data provider's public key. The computation functions may be provided in encrypted or unencrypted form to the network-accessible storage from a function database or service. In one implementation, the computation functions include a number of addition operations and a fixed set of multiplication operations. A communication operation 610 communicates the selected promotion to the data provider in encrypted form, based on the data provider's public key. A decryption operation 612 decrypts the promotion using the data provider's private key.
The system bus 23 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, a switched fabric, point-to-point connections, and a local bus using any of a variety of bus architectures. The system memory may also be referred to as simply the memory, and includes read only memory (ROM) 24 and random access memory (RAM) 25. A basic input/output system (BIOS) 26, containing the basic routines that help to transfer information between elements within the computer 20, such as during start-up, is stored in ROM 24. The computer 20 further includes a hard disk drive 27 for reading from and writing to a hard disk, not shown, a magnetic disk drive 28 for reading from or writing to a removable magnetic disk 29, and an optical disk drive 30 for reading from or writing to a removable optical disk 31 such as a CD ROM, DVD, or other optical media.
The hard disk drive 27, magnetic disk drive 28, and optical disk drive 30 are connected to the system bus 23 by a hard disk drive interface 32, a magnetic disk drive interface 33, and an optical disk drive interface 34, respectively. The drives and their associated computer-readable media provide nonvolatile storage of computer-readable instructions, data structures, program engines and other data for the computer 20. It should be appreciated by those skilled in the art that any type of computer-readable media which can store data that is accessible by a computer, such as magnetic cassettes, flash memory cards, digital video disks, random access memories (RAMs), read only memories (ROMs), and the like, may be used in the example operating environment.
A number of program engines may be stored on the hard disk, magnetic disk 29, optical disk 31, ROM 24, or RAM 25, including an operating system 35, one or more application programs 36, other program engines 37, and program data 38. A user may enter commands and information into the personal computer 20 through input devices such as a keyboard 40 and pointing device 42. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 21 through a serial port interface 46 that is coupled to the system bus, but may be connected by other interfaces, such as a parallel port, game port, or a universal serial bus (USB). A monitor 47 or other type of display device is also connected to the system bus 23 via an interface, such as a video adapter 48. In addition to the monitor, computers typically include other peripheral output devices (not shown), such as speakers and printers.
The computer 20 may operate in a networked environment using logical connections to one or more remote computers, such as remote computer 49. These logical connections are achieved by a communication device coupled to or a part of the computer 20; the invention is not limited to a particular type of communications device. The remote computer 49 may be another computer, a server, a router, a network PC, a client, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 20, although only a memory storage device 50 has been illustrated in
When used in a LAN-networking environment, the computer 20 is connected to the local network 51 through a network interface or adapter 53, which is one type of communications device. When used in a WAN-networking environment, the computer 20 typically includes a modem 54, a network adapter, or any other type of communications device for establishing communications over the wide area network 52. The modem 54, which may be internal or external, is connected to the system bus 23 via the serial port interface 46. In a networked environment, program engines depicted relative to the personal computer 20, or portions thereof, may be stored in the remote memory storage device. It is appreciated that the network connections shown are examples and that other means of, and communications devices for, establishing a communications link between the computers may be used.
In an example implementation, an encryption module, a storage system, a computation system, and other engines and services may be embodied by instructions stored in memory 22 and/or storage devices 29 or 31 and processed by the processing unit 21. Collected data, computation functions, promotions, computation results, public/private keys, and other data may be stored in memory 22 and/or storage devices 29 or 31 as persistent datastores. Example storage, computation, encryption/decryption, and data collection services described may be implemented using a general-purpose computer and specialized software (such as a server executing service software), a special purpose computing system and specialized software (such as a mobile device or network appliance executing service software), or other computing configurations.
The embodiments of the invention described herein are implemented as logical steps in one or more computer systems. The logical operations of the present invention are implemented (1) as a sequence of processor-implemented steps executing in one or more computer systems and (2) as interconnected machine or circuit engines within one or more computer systems. The implementation is a matter of choice, dependent on the performance requirements of the computer system implementing the invention. Accordingly, the logical operations making up the embodiments of the invention described herein are referred to variously as operations, steps, objects, or engines. Furthermore, it should be understood that logical operations may be performed in any order, unless explicitly claimed otherwise or a specific order is inherently necessitated by the claim language.
The above specification, examples, and data provide a complete description of the structure and use of exemplary embodiments of the invention. Since many embodiments of the invention can be made without departing from the spirit and scope of the invention, the invention resides in the claims hereinafter appended. Furthermore, structural features of the different embodiments may be combined in yet another embodiment without departing from the recited claims.