The present invention relates generally to software, and more particularly to analytics software such as IoT (internet of things) analytics.
Conventional PET (Privacy Enhancing Technology) and Privacy Model technologies are described in the following references A-F:
Privacy metrics for IT systems are known including sophisticated privacy metrics like entropy, indistinguishability and weighting of goal values.
Example PETs, and application of privacy metrics thereto, are described in Funke et al., “Constrained PET Composition for Measuring Enforced Privacy”, 2017, available on the Internet. Generally, what is to be achieved in terms of privacy is deemed a goal, whereas the technology used to achieve the goal/s is deemed a PET.
Other state of the art technologies are described in:
“PrivOnto: A semantic framework for the analysis of privacy policies”, A. Oltramari, D. Piraviperumal, F. Schaub et al., 2017, content.iospress.com.
In this document henceforth, “PETs” is used to denote the plural of PET (Privacy Enhancing Technologies).
Privacy metrics for single PETs are known, such as Tor's set-based anonymity metric and Delta-Privacy. Privacy metrics are used in the field to measure the fulfillment of a specific privacy goal a PET has (e.g. Unlinkability for Tor's set-based anonymity metric; differential privacy and k-anonymity measure how uniquely identifiable a user who stores a data-record in a database of other records is).
The disclosures of all publications and patent documents mentioned in the specification, and of the publications and patent documents cited therein directly or indirectly, are hereby incorporated by reference. Materiality of such publications and patent documents to patentability is not conceded.
Certain embodiments seek to provide a Privacy Enhancing Technologies (PETs) combiner system typically characterized by a set-based taxonomy and/or composition algebra where Privacy Enhancing Technologies (PETs) are intended to include any typically well-defined, typically domain-specific technology operative to preserve privacy in computerized systems and technologies such as but not limited to Intelligent Software Agents (ISAs), networks, human computer interfaces, public key infrastructures, cryptography and data mining and matching.
The combiner system may be characterized by all or any subset of the following characteristics:
Certain embodiments seek to enhance existing technologies such as but not limited to Heed, Heed verticals, Scale, CityMind, PBR, Smart Vehicles, with privacy features, in fields such as OTT content, (Social) IoT, or Smart Cities.
Certain embodiments seek to provide a domain-independent solution operative to provide privacy-enhanced services composed out of multiple PETs, including formally describing properties of the composed PET system, especially for use cases where PETs are dynamically selected and change on the fly, e.g. by contextual policy driven infrastructures. Typically, different PETs have different characteristics such as different scenarios, application domains, threat models, privacy goals etc. so, for example, when composing two PETs, PET1 on pseudonymity and PET2 on un-traceability, it is not necessarily the case that the resulting PET system preserves all properties of each of the two individual PETs, PET1 and PET2. Instead, a weakness of one of the PETs, e.g. reversibility, may override the resilience of the other PET in the composed system.
Certain embodiments seek to provide technical solutions to all or any subset of the following problems:
Certain embodiments of the composer system enable enforcing a freely configurable balance between user privacy and service provider business goals. Typically, the system employs a formal, set-based and domain-independent taxonomy model for PETs, and/or an algebra for constrained composition of PETs. Typically, the system is operative for measurement of enforced privacy in service infrastructures.
Certain embodiments seek to provide a PET ecosystem, which would work similarly to an online mobile-app store (e.g. Apple App Store/Google Play Store), where contributors can curate values assigned to PETs.
Certain embodiments seek to provide a system and method for fulfilling multiple privacy requirements or goals (e.g. unlinkability+confidentiality+awareness) including combining multiple raw or individual or basic PETs (for each of which, a formal definition of which privacy requirements are covered by that raw PET is available) and generating, computationally, a formal definition of which privacy requirements are and are not covered by this combination, since certain requirements might be perturbed by certain PET combination choices (e.g. if weak attributes taint stronger ones) which would, absent the system and methods shown and described herein, be unknown.
Certain embodiments are also applicable to areas of context other than privacy benefitting from formal building block composition, such as but not limited to (a) data analytics, where analytic modules, rather than PETs, are combined, e.g. into an analytics chain, and (b) business workflows, where workflow operations, rather than PETs, may be combined into a workflow, e.g. a chain of workflow operations.
When implementing these and other examples, fields and/or values of the taxonomy shown and described herein may be augmented or removed, and/or labels may be changed to labels meaningful in the analytics context and/or dimensions may be suitably classified, not necessarily as classified herein merely by way of example, as either properties or capabilities (or hybrid). Also, constrained unions may, if deemed useful, be suitably defined.
At least the following embodiments are thus provided:
A system configured to combine plural PETs, the system comprising:
I. A user interface operative to provide evaluations of each of plural raw aka known aka input PETs, in accordance with a PET taxonomy comprising a tree defining ordered and unordered dimensions comprising sub-trees along which PETs are classified; and
II. A processor configured to automatically generate an evaluation of at least one composition of at least two of the plural raw PETs.
It is appreciated that the taxonomy (and associated algebra) may if desired be based on graphs generally, rather than, necessarily, on a tree.
The “taxonomy” may for example include any PET taxonomy which includes (a) dimensions, e.g. some or all of the dimensions of the taxonomy of
Typically, Reversibility (by way of example) is not inherently ordered over {full, partial}, and instead the order is introduced by the respective constrained set union that treats (say) “full” as bigger than “partial”.
It is appreciated that the taxonomy of
A particular advantage of the taxonomy (“s3”) of
It is appreciated that, as exemplified by the mixed Reversibility dimension, dimension/s in
One use-case of the above system is as a middleware-component aka privacy middleware in a service-provisioning chain, which adds privacy features to usage of the services. Some or all of the services themselves may be agnostic to the existence of the PAPI system shown and described herein.
Each service typically stores and processes end users' data. The privacy features may comprise privacy goals of PETs in the middleware such as, say, Unlinkability, Confidentiality, Indistinguishability (or may comprise other PET dimensions such as Aspects, Foundation, etc.).
Typically, the PET Descriptions are provided by the middleware that may derive suitable PETs to use from a negotiated policy. The Configuration descriptions may then be delivered either at setup time or later during maintenance by the system administrator. A PET system description, representing a PET system currently employed, may be delivered back to the middleware. The privacy metric if generated by the system of
Other mappings are possible, e.g. one policy of the middleware may represent multiple policies in the service or vice versa, one policy of the service may represent multiple policies in the middleware.
Another use case in which the service provider is aware of the middleware is when the service provider incorporates the PAPI backend into the service provider's system, being directly responsible for some or all of its operation, policies, PETs.
Alternatively, however, the service providers may be agnostic to or unaware of the existence of the privacy middleware in which case policy may be negotiated between a PAPI client (e.g. on the mobile phone) and a PAPI backend (e.g. in the Cloud), e.g. as described herein. The PAPI backend then ensures that the PAPI policies match the policies of the various services' providers. Typically, negotiation is not directly with the end-user, but rather with the end-user's device (which has PAPI installed). The end-user may, once or on occasion, select a subset of policies (from a given set of policies which may be presented e.g. as a list, via a suitable user interface) that s/he may agree to. This set of policies may then (typically automatically and in the background) be compared to the set of policies that are accepted by the service provider. If a matching policy is found (a policy both selected by the end-user and accepted by the service provider), that matching policy is selected. Otherwise, typically, the user either does not use the service or selects/is prompted to select more or other policies
from the given set of policies. The given set of policies may for example be preinstalled with the client side PAPI installation. Alternatively or in addition, users may create their own policies. Also, policies may be exchanged with other users/in a community of users.
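The matching step just described may, purely by way of illustration, be sketched as follows, assuming policies are identified by plain strings; the function name and the deterministic tie-break are assumptions for this sketch and not the actual PAPI interface:

def match_policy(user_selected, provider_accepted):
    # Intersect the end-user's selected policies with the provider's
    # accepted policies; return one match, or None so the user can be
    # prompted to select more or other policies.
    matches = set(user_selected) & set(provider_accepted)
    return min(matches) if matches else None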
A system according to any of the embodiments herein wherein the taxonomy is formalized using nested sets, thereby to generate a set-based formalized taxonomy in which each PET comprises a set of attributes each represented by a set of possible values and wherein at least one possible value itself comprises a set.
A system according to any of the embodiments herein wherein the taxonomy defines privacy dimensions aka attributes and defines quantifiability of at least one dimension aka capability from among the dimensions, where quantifiability comprises assigning one of at least 2 (or even 3) degrees or levels of strength to a capability.
Quantifiability allows plural degrees or levels of fulfillment or existence of a given attribute, in a given PET or other data operator, to be represented. For example, pet1 may be characterized by a high degree of reversibility whereas pet2 and pet3 have low and medium degrees of reversibility, respectively.
A system according to any of the embodiments herein wherein the quantifiability of at least one dimension of the taxonomy uses at least one of the N, Q or R sets where R=real numbers, Q=rational numbers, N=natural numbers.
For example, the set N of natural numbers may be used for the G dimension of
It is appreciated that there is no need for the sets of quantifiers to be numbers at all. E.g. the set of all possible words over the 26 letters A-Z is infinite, well defined and even has an inherent (lexicographic) order. Note that “word” need not be a word in any natural language and may instead be any combination of arbitrary letters in arbitrary order and of arbitrary length, where each letter may appear any number of times. Example words: A, AA, AAA, . . . , AB, ABA, ABAA, . . . , ABB, ABBA, etc.
Even if the quantifiers are not numbers the quantifiers still may be used for the algebra e.g. by defining a one-to-one (bijective) mapping between the given set and a number set. E.g. the set of words defined above may be mapped to the natural numbers by interpreting the words as numbers in a p-adic system where p=26 as is known in the art.
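For instance, the following minimal sketch maps non-empty words over A-Z bijectively to the natural numbers (bijective base-26, A=1 . . . Z=26), in the spirit of the p-adic interpretation mentioned above; the function name is illustrative:

def word_to_number(word):
    # Bijective base-26: every non-empty word over A-Z maps to a
    # distinct natural number, so the words can serve as quantifiers.
    n = 0
    for ch in word:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n

# e.g. word_to_number("A") == 1, word_to_number("AA") == 27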
A system according to any of the embodiments herein wherein the quantifiability of at least one dimension of the taxonomy uses a set with a finite number of elements.
For example, the {Full, Partial} set (number of elements=2) or any other finite set may be used for the R dimension of
It is appreciated that if there is no “natural” or intuitive order on the elements of the set, an order may be defined within the algebra to allow comparison and/or constrained set union.
A system according to any of the embodiments herein wherein the processor uses an algebra, for composition of the plural PETs, which is defined on the formalized taxonomy.
In privacy, reversibility is typically “undesirable”, i.e. the higher its level, the worse: full reversibility is bad for privacy while no reversibility is good for privacy, i.e. it is typically not desirable for privacy measures to be reversible.
It is appreciated that it is not necessarily the case that the resulting PET system preserves all properties of each of the two individual PETs, pet1 and pet2. Instead, a weakness of one of the PETs, e.g. high reversibility, may override the resilience of the other PET in the composed system. This situation in which weaker attributes override stronger ones is an example of constrained composition or constrained union. The weakest link composition shown and described herein guarantees that the resulting PET system is “at least as strong as its weakest link” e.g. that a combined PET is at least as strong, along each dimension or along each capability, as the weakest value along that dimension or capability possessed by any of the input PETs. Typically, weaker and stronger dimensions d_w and d_s refer to dimensions whose strengths, for a given PET, are quantified as weaker and stronger respectively.
It is appreciated that other constrained compositions, other than weakest link composition shown and described herein, may be used, such as but not limited to:
A. Strongest link composition—where, in the combined PET's formal description in terms of the taxonomy, strong attributes of the raw PETs survive whereas weaker attributes of the raw PETs are eliminated from the combined PET's description. Strongest link composition may be used, for example, if all the input PETs provide confidentiality by encrypting the content in a cascade, e.g. each PET encrypts the complete payload output by the preceding PET. Then, the strongest encryption from among the input PETs is the one that “matters” (that characterizes the combined PET).
B. Average goal strength composition—where the strengths of identical goals are averaged, e.g. by computing the arithmetic mean or other central tendency of the input strengths. This constrained composition is not as “safe” as the weakest link composition, because the actual strength of the composition may be lower than the average. However, this constrained composition may give a closer, hence, in certain use-cases e.g. low-criticality use-cases, more useful estimation (or automatically generated evaluation) of the actual strength of the composition, relative to just using the minimum as a safe (especially in high-criticality use-cases) yet less accurate estimation (or automatically generated evaluation) of the actual strength of the composition.
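Purely by way of illustration, the weakest link composition and the two alternatives above may be sketched as follows, assuming that goal strengths are represented as a mapping from goal name to a natural number; the function name and representation are assumptions for this sketch rather than the formal algebra itself:

def constrained_union(g1, g2, mode="weakest"):
    # Goals present in only one input PET are kept as-is (ordinary union);
    # goals present in both are merged according to the chosen constraint.
    merged = dict(g1)
    for goal, s in g2.items():
        if goal not in merged:
            merged[goal] = s
        elif mode == "weakest":
            merged[goal] = min(merged[goal], s)    # weakest link
        elif mode == "strongest":
            merged[goal] = max(merged[goal], s)    # strongest link (e.g. cascaded encryption)
        else:
            merged[goal] = (merged[goal] + s) / 2  # average goal strength
    return merged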
Typically, the algebra automatically determines which attributes of individual PETs, pet1 and pet2 are preserved in the combined PET (the combination of pet1 and pet2) and which are not. Typically, the algebra achieves this by providing a suitable constrained union, e.g. the weakest link composition or strongest link composition or average goal strength composition, by way of example.
The embodiment described herein refers extensively to weakest link composition; other constrained unions may involve other methods for building the set unions such as but not limited to:
A system according to any of the embodiments herein wherein the algebra automatically determines which attributes of raw PETs being combined, are preserved in a composition of the raw PETs by using at least one use-case specific constraint, introduced into the algebra thereby to allow for special composition behavior characterizing (or unique to) less than all use-cases.
For example, using at least one use-case specific constraint may include using a weakest link constraint W for G and R. For example, all attributes aka dimensions may be composed using ordinary set union other than, say, G and R. G, R dimensions may be composed using, respectively, special unions ∪min and ∪to.
A “use case specific constraint” typically comprises a constraint applying to a union of PETs or to a composition of PETs which, when applied, causes the composition of PETs to behave in a manner appropriate to only some use cases. For example, in some use-cases but not others, it may be desired for each composition of PETs to be “as strong as the weakest link (as the weakest input PET)” along some or all dimensions of the taxonomy used to characterize the PETs.
Typically, constrained composition (such as, for example, the weakest or strongest link composition) is operative assuming a taxonomy which has capabilities. For example, in the example embodiment described herein, constrained composition is defined only over capabilities, whereas properties are composed by ordinary set unions. However, alternatively, constrained composition may be defined over properties, e.g. by transforming properties into capabilities, i.e. making the properties quantifiable or defining an order on the individual attributes of a property. For example, for “scenario” define the following order: “untrusted client” < “untrusted server” < “external”. Then, put a min or max constraint on the union.
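A minimal sketch of this property-to-capability transformation, assuming the order suggested above is encoded as a list (the list and the function name are illustrative assumptions):

SCENARIO_ORDER = ["untrusted client", "untrusted server", "external"]

def scenario_weakest(a, b):
    # Impose the order on the formerly unordered property, so that a
    # min-constrained union can be applied to it.
    return min(a, b, key=SCENARIO_ORDER.index)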
It is appreciated that constrained compositions may be defined as being affected by plural dimensions/attributes rather than just one, e.g., say, security model “computational” overrides security goal “confidentiality”.
Typically, the weakest link constraint for G differs from the weakest link constraint for R. While, for example, for G the weakest link constraint operates over N using a minimum, on R a minimum is not used; instead, within the constrained union there may be a definition of which of the non-numerical values is considered smaller than the others.
A system according to any of the embodiments herein wherein the at least one use-case specific constraint comprises a Weakest link constraint W applied to at least one dimension of the taxonomy.
A system according to any of the embodiments herein wherein the Weakest link constraint W is applied to dimension G.
A system according to any of the embodiments herein wherein the Weakest link constraint W for dimension G comprises minimum goal strength composition (∪min).
A system according to any of the embodiments herein wherein the Weakest link constraint W is applied to dimension R.
A system according to any of the embodiments herein wherein the Weakest link constraint W for dimension R comprises maximum reversibility degree composition (∪to).
A system according to any of the embodiments herein wherein the system is also configured to compute a privacy metric, from the formalized taxonomy, which computes the strength achieved by all goals (the goal dimension) in the combined PET relative to the maximum achievable goal strength.
It is appreciated that with the given formula, the maximum number of goals constitutes the maximum achievable level of goal strength possible within the given taxonomy.
According to some embodiments, the privacy index is defined only over the goal dimension. According to other embodiments, the privacy index is defined, alternatively or in addition, over other capability dimension/s.
Typically the metric provides a formal indication of minimum privacy guarantees if a particular combination of PETs is used.
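One plausible formalization of such a metric is sketched below, assuming natural-number goal strengths with a known per-goal maximum; the exact formula used by the system may differ, and the function name is illustrative:

def privacy_index(goal_strengths, all_goals, max_strength):
    # Achieved strength summed over all goals of the taxonomy, relative to
    # the maximum achievable (every goal present at maximum strength).
    achieved = sum(goal_strengths.get(g, 0) for g in all_goals)
    return achieved / (len(all_goals) * max_strength)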
A system according to any of the embodiments herein wherein the algebra is a commutative monoid.
A system according to any of the embodiments herein wherein an attribute-value pair based representation is used in an API to exchange data by exporting data structures represented using the attribute-value pair based representation from the system.
A system according to any of the embodiments herein wherein the user interface imports PETs including data structures represented using an attribute-value pair based representation.
It is appreciated that use of an attribute-value pair based representation such as JSON or XML is but one possible technical solution to express a formalized taxonomy on a computer. JSON is suitable on the API level and also works within the program running on the processor. It is appreciated that many, typically more efficient, solutions other than JSON exist to represent sets in computer memory, since many modern programming languages have a concept of sets, hence may be utilized.
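By way of example only, sets may be held natively in memory and converted to an attribute-value representation such as JSON only at the API boundary; the record layout below is an assumption for illustration:

import json

pet = {"Name": "ExamplePET", "goals": {"Unlinkability", "Confidentiality"}}

def to_api_json(record):
    # JSON has no set type, so sets are converted to sorted lists on export.
    encodable = {k: sorted(v) if isinstance(v, (set, frozenset)) else v
                 for k, v in record.items()}
    return json.dumps(encodable)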
A system according to any of the embodiments herein wherein service providers' end-users' respective digital privacy is adjusted automatically during service-usage by negotiating PET combination/s and/or configuration policies.
Typically, negotiations are conducted between PAPI on the client side and the PAPI server, typically after end users have selected privacy policies they deem appropriate.
Typically, each configuration policy comprises a combination of PETs derived by the middleware from a user's privacy policy or privacy requirements, which may have been defined by the end-user in the course of the end-user's registration to one or more services which store and process end users' data.
A method configured to combine plural data operators, the method comprising:
I. Providing evaluations of each of plural raw aka known aka input data operators, in accordance with a data operator taxonomy comprising a tree defining ordered and unordered dimensions comprising sub-trees along which data operators are classified; and
II. Providing a processor configured to automatically generate an evaluation of at least one composition of at least two of the plural raw data operators.
It is appreciated that the data operators may comprise any operation performed on data, such as but not limited to PETs or analytics. In the illustrated embodiments, the data operators comprise PETs; however, fields and/or values of the taxonomy shown and described herein may be augmented or removed, and/or labels may be changed to labels meaningful in the analytics context. Once the taxonomy is thus modified, and dimensions suitably classified as either properties or capabilities, the system shown and described herein may be used to combine raw analytics modules into an analytics chain.
According to certain embodiments, the method is operative for checking for valid input before processing. For example, self-combining may be performed prior to combining data operators (say, combining PET A with PET B). In self-combining, the method may combine PET A with PET A into PET A′ and combine PET B with PET B into PET B′. The method may then combine PET A′ and PET B′ to yield a result PET AB. This self-combining is advantageous in eliminating problems, e.g. if individual PET descriptions are erroneous, e.g. contain the same goal twice with the same or even different strength.
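A sketch of this self-combining pre-check, where combine stands for whichever constrained union is configured (names are illustrative):

def validated_combination(pet_a, pet_b, combine):
    pet_a_prime = combine(pet_a, pet_a)  # collapses e.g. a goal listed twice in A
    pet_b_prime = combine(pet_b, pet_b)  # likewise for B
    return combine(pet_a_prime, pet_b_prime)  # the actual result, PET AB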
Alternatively or in addition, the Taxonomy Module of
A method according to any of the embodiments herein wherein the data operators comprise PETs.
A system according to any of the embodiments herein wherein the tree also comprises at least one hybrid dimension which is neither purely ordered nor purely unordered, and/or neither purely quantified nor purely unquantified.
A system according to any of the embodiments herein and also comprising a Combination Builder.
A computer program product, comprising a non-transitory tangible computer readable medium having computer readable program code embodied therein, the computer readable program code adapted to be executed to implement a method configured to combine plural data operators, the method comprising:
I. Providing a user interface operative to provide evaluations of each of plural raw aka known aka input data operators, in accordance with a data operator taxonomy comprising a tree defining ordered and unordered dimensions comprising sub-trees along which data operators are classified; and
II. Providing a processor configured to automatically generate an evaluation of at least one composition of at least two of the plural raw data operators.
A particular advantage of certain embodiments is that a new or legacy technology in fields such as OTT content or (Social) IoT may be conveniently provided with tailored privacy features.
According to certain embodiments, privacy goals (such as but not limited to some or all of the following goals: Unlinkability, Indistinguishability, Confidentiality, Deniability, Trust-establishment, Awareness) are embedded in a privacy policy engine, and map to real-world privacy problems such as but not limited to tracking/profiling of users, disclosure of sensitive data, or mistrust due to lack of transparency, in any information communication technology (ICT). According to certain embodiments, privacy goals may be enforced not only by “raw” PETs but also or alternatively by a combination of those “raw” privacy-preserving/enhancing technologies (PETs) that may be combined based on a policy that is specific to the service/ICT. This yields a computerized system that effectively maintains privacy of data stored therein or transmitted thereby.
Also provided, excluding signals, is a computer program comprising computer program code means for performing any of the methods shown and described herein when the program is run on at least one computer; and a computer program product, comprising a typically non-transitory computer-usable or -readable medium e.g. non-transitory computer-usable or -readable storage medium, typically tangible, having a computer readable program code embodied therein, the computer readable program code adapted to be executed to implement any or all of the methods shown and described herein. The operations in accordance with the teachings herein may be performed by at least one computer specially constructed for the desired purposes or a general purpose computer specially configured for the desired purpose by at least one computer program stored in a typically non-transitory computer readable storage medium. The term “non-transitory” is used herein to exclude transitory, propagating signals or waves, but to otherwise include any volatile or non-volatile computer memory technology suitable to the application.
Any suitable processor/s, display and input means may be used to process, display e.g. on a computer screen or other computer output device, store, and accept information such as information used by or generated by any of the methods and apparatus shown and described herein; the above processor/s, display and input means including computer programs, in accordance with some or all of the embodiments of the present invention. Any or all functionalities of the invention shown and described herein, such as but not limited to operations within flowcharts, may be performed by any one or more of: at least one conventional personal computer processor, workstation or other programmable device or computer or electronic computing device or processor, either general-purpose or specifically constructed, used for processing; a computer display screen and/or printer and/or speaker for displaying; machine-readable memory such as optical disks, CDROMs, DVDs, BluRays, magnetic-optical discs or other discs; RAMs, ROMs, EPROMs, EEPROMs, magnetic or optical or other cards, for storing, and keyboard or mouse for accepting. Modules shown and described herein may include any one or combination or plurality of: a server, a data processor, a memory/computer storage, a communication interface, a computer program stored in memory/computer storage.
The term “process” as used above is intended to include any type of computation or manipulation or transformation of data represented as physical, e.g. electronic, phenomena which may occur or reside e.g. within registers and/or memories of at least one computer or processor. Use of nouns in singular form is not intended to be limiting; thus the term processor is intended to include a plurality of processing units which may be distributed or remote, the term server is intended to include plural typically interconnected modules running on plural respective servers, and so forth.
The above devices may communicate via any conventional wired or wireless digital communication means, e.g. via a wired or cellular telephone network or a computer network such as the Internet.
The apparatus of the present invention may include, according to certain embodiments of the invention, machine readable memory containing or otherwise storing a program of instructions which, when executed by the machine, implements some or all of the apparatus, methods, features and functionalities of the invention shown and described herein. Alternatively or in addition, the apparatus of the present invention may include, according to certain embodiments of the invention, a program as above which may be written in any conventional programming language, and optionally a machine for executing the program such as but not limited to a general purpose computer which may optionally be configured or activated in accordance with the teachings of the present invention. Any of the teachings incorporated herein may, wherever suitable, operate on signals representative of physical objects or substances.
The embodiments referred to above, and other embodiments, are described in detail in the next section.
Any trademark occurring in the text or drawings is the property of its owner and occurs herein merely to explain or illustrate one example of how an embodiment of the invention may be implemented.
Unless stated otherwise, terms such as “processing”, “computing”, “estimating”, “selecting”, “ranking”, “grading”, “calculating”, “determining”, “generating”, “reassessing”, “classifying”, “producing”, “stereo-matching”, “registering”, “detecting”, “associating”, “superimposing”, “obtaining”, “providing”, “accessing”, “setting” or the like, refer to the action and/or processes of at least one computer/s or computing system/s, or processor/s or similar electronic computing device/s or circuitry, that manipulate and/or transform data which may be represented as physical, such as electronic, quantities e.g. within the computing system's registers and/or memories, and/or may be provided on-the-fly, into other data which may be similarly represented as physical quantities within the computing system's memories, registers or other such information storage, transmission or display devices or may be provided to external factors e.g. via a suitable data network. The term “computer” should be broadly construed to cover any kind of electronic device with data processing capabilities, including, by way of non-limiting example, personal computers, servers, embedded cores, computing systems, communication devices, processors (e.g. digital signal processor (DSP), microcontrollers, field programmable gate array (FPGA), application specific integrated circuit (ASIC), etc.) and other electronic computing devices. Any reference to a computer, controller or processor is intended to include one or more hardware devices e.g. chips, which may be co-located or remote from one another. Any controller or processor may for example comprise at least one CPU, DSP, FPGA or ASIC, suitably configured in accordance with the logic and functionalities described herein.
The present invention may be described, merely for clarity, in terms of terminology specific to, or references to, particular programming languages, operating systems, browsers, system versions, individual products, protocols and the like. It will be appreciated that this terminology or such reference/s is intended to convey general principles of operation clearly and briefly, by way of example, and is not intended to limit the scope of the invention solely to a particular programming language, operating system, browser, system version, or individual product or protocol. Nonetheless, the disclosure of the standard or other professional literature defining the programming language, operating system, browser, system version, or individual product or protocol in question, is incorporated by reference herein in its entirety.
Elements separately listed herein need not be distinct components and alternatively may be the same structure. A statement that an element or feature may exist is intended to include (a) embodiments in which the element or feature exists; (b) embodiments in which the element or feature does not exist; and (c) embodiments in which the element or feature exists selectably, e.g. a user may configure or select whether the element or feature does or does not exist.
Any suitable input device, such as but not limited to a sensor, may be used to generate or otherwise provide information received by the apparatus and methods shown and described herein. Any suitable output device or display may be used to display or output information generated by the apparatus and methods shown and described herein. Any suitable processor/s may be employed to compute or generate information as described herein and/or to perform functionalities described herein and/or to implement any engine, interface or other system described herein. Any suitable computerized data storage e.g. computer memory may be used to store information received by or generated by the systems shown and described herein. Functionalities shown and described herein may be divided between a server computer and a plurality of client computers. These or any other computerized components shown and described herein may communicate between themselves via a suitable computer network.
Certain embodiments of the present invention are illustrated in the following drawings:
Methods and systems included in the scope of the present invention may include some (e.g. any suitable subset) or all of the functional blocks shown in the specifically illustrated implementations by way of example, in any suitable order e.g. as shown.
Computational, functional or logical components described and illustrated herein can be implemented in various forms, for example, as hardware circuits such as but not limited to custom VLSI circuits or gate arrays or programmable hardware devices such as but not limited to FPGAs, or as software program code stored on at least one tangible or intangible computer readable medium and executable by at least one processor, or any suitable combination thereof. A specific functional component may be formed by one particular sequence of software code, or by a plurality of such, which collectively act or behave as described herein with reference to the functional component in question. For example, the component may be distributed over several code sequences such as but not limited to objects, procedures, functions, routines and programs and may originate from several computer files which typically operate synergistically.
Arrows between modules may be implemented as APIs.
Each functionality or method herein may be implemented in software, firmware, hardware or any combination thereof. Functionality or operations stipulated as being software-implemented may alternatively be wholly or fully implemented by an equivalent hardware or firmware module and vice-versa. Firmware implementing functionality described herein, if provided, may be held in any suitable memory device and a suitable processing unit (aka processor) may be configured for executing firmware code. Alternatively, certain embodiments described herein may be implemented partly or exclusively in hardware in which case some or all of the variables, parameters, and computations described herein may be in hardware.
Any module or functionality described herein may comprise a suitably configured hardware component or circuitry e.g. processor circuitry. Alternatively or in addition, modules or functionality described herein may be performed by a general purpose computer or more generally by a suitable microprocessor, configured in accordance with methods shown and described herein, or any suitable subset, in any suitable order, of the operations included in such methods, or in accordance with methods known in the art. Any logical functionality described herein may be implemented as a real time application if and as appropriate and which may employ any suitable architectural option such as but not limited to FPGA, ASIC or DSP or any suitable combination thereof.
Any hardware component mentioned herein may in fact include either one or more hardware devices e.g. chips, which may be co-located or remote from one another.
Any method described herein is intended to include within the scope of the embodiments of the present invention also any software or computer program performing some or all of the method's operations, including a mobile application, platform or operating system e.g. as stored in a medium, as well as combining the computer program with a hardware device to perform some or all of the operations of the method.
Data may be stored on one or more tangible or intangible computer readable media stored at one or more different locations, different network nodes or different storage devices at a single node or location.
It is appreciated that any computer data storage technology, including any type of storage or memory and any type of computer components and recording media that retain digital data used for computing for an interval of time, and any type of information retention technology, may be used to store the various data provided and employed herein. Suitable computer data storage or information retention apparatus may include apparatus which is primary, secondary, tertiary or off-line; which is of any type or level or amount or category of volatility, differentiation, mutability, accessibility, addressability, capacity, performance and energy use; and which is based on any suitable technologies such as semiconductor, magnetic, optical, paper and others.
An example end to end privacy architecture for IoT is now described which may provide a privacy framework for IoTA (IoT analytics) platforms. There is thus provided architecture that combines one, all or any subset of the following:
There is also provided according to an embodiment, a workflow design realizing one, all or any subset of the following:
A particular advantage of certain embodiments is that the IoT privacy architecture described allows PETs to achieve privacy based on their existing solutions.
Policy determination may be based on policy context description matching to the context instance resolved for a data request.
Enforcement of multiple fine-grained privacy requirements may be provided.
The policy module typically distinguishes between context descriptions and context instances.
Typically, middleware is distributed to provide end-to-end privacy.
Policy enforcement is typically end-to-end distributed, not limited to access control, and policies are bound to context modalities.
Typically, privacy policies are bound to context modalities. Typically, enforcement mechanisms are end-to-end.
Typically, mechanisms are extensible to cover arbitrary privacy requirements.
A method and system for contextual policy based privacy enforcement using a distributed Privacy Enhancing Technology (PET) framework may be provided.
It is appreciated that a composed system may include two interacting modules: contextual privacy policies and privacy middleware.
Existing privacy solutions are tailored to a certain domain, bound to a certain context and work with hard-wired methods. In order to achieve both privacy and business goals simultaneously, a more flexible and reactive solution is described herein.
From IoT characteristics, e.g. a resource-constrained, heterogeneous device and protocol landscape, and IoT middleware, the following are sought:
The above are derived for (Social) IoT privacy, and an actual infrastructure that provides all of them, or alternatively any subset thereof, is described.
The architecture shown and described herein enhances privacy in service infrastructures end-to-end from devices (data source) to services (data sink) with distributed Privacy Enhancing Technology (PET) middleware. The middleware typically includes Privacy API (PAPI) and PAPI Backend components.
The middleware's set of enabled PETs and their configuration are determined on the fly (3) according to a contextual configuration policy (dotted control flow arrows and boxes). PAPI resolves, per request and in real-time, the device/user's privacy context (2), e.g. location, time, client id, user id, etc., and determines a context-matching policy locally (cached) or remotely with the policy service in PAPI Backend. The user's and regulatory privacy requirements associated with a policy may be negotiated out-of-band or in an authorized user interface in PAPI Backend.
It is appreciated that the illustrated embodiment is merely exemplary. For example, the middleware components (privacy API and backend PAPI modules) may be deployed in arbitrary service layers, e.g. in/outside the service client, in/outside the device, in/outside the service network or in/outside the service server.
According to certain embodiments, the control flows are also tunneled through the middleware. The backward/response flows may be treated with the same privacy enhancing chain as the forward flow, but in reverse order. Both control flow and response flow may have different PET chains than the forward flow and their respective PET chains may be different from each other.
PAPI, PAPI Backend and their PETs may each function as independent services. Hence, the system is not limited to traditional client-server scenarios. In Peer-to-Peer (P2P) scenarios, the middleware (e.g. PAPI) typically privatizes communication from PAPI to PAPI instances. In service cooperation scenarios, the middleware typically privatizes communication from PAPI Backend to PAPI Backend instances and single PETs may be used as a service.
Operation of PETs at the server side may be generally the same as for client-side PETs. The operation of PETs at the client side may vary, depending on the PET. Typically, from the point of view of PAPI, PETs are black boxes with specified interfaces and functionality. Once the PETs are selected, the data feeds into the PETs yielding results, and the respective results are returned to the system and/or to the next PET in the chain. PETs may be implemented as plug-ins, e.g. using a state of the art plugin-system such as provided by OSGI, JPF or Kharrat and Quadri.
Typically, from the point of view of PAPI, the structure and operation of PET connections on data are black boxes. Client-side PETs typically run on the client, connection PETs typically run outside the client, e.g. as a cloud service. Typically, connection PETs are called after client side PETs and before server side PETs.
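The black-box view may be sketched as follows; the class and method names are assumptions for illustration and not the actual PAPI plug-in interface:

from abc import ABC, abstractmethod

class PET(ABC):
    # From PAPI's point of view, each PET is a black box with a specified
    # interface: data in, privatized data out.
    @abstractmethod
    def apply(self, payload):
        """Consume data and return its privatized form."""

def run_chain(payload, client_pets, connection_pets, server_pets):
    # Client-side PETs run first, then connection PETs, then server-side
    # PETs, each PET feeding its result to the next PET in the chain.
    for pet in (*client_pets, *connection_pets, *server_pets):
        payload = pet.apply(payload)
    return payload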
The PAPI, when operating on data, typically intercepts data from other applications on the device, and applies privacy measures via PETs, thereby to yield a “private” output, and forwards the private output to the originally intended recipients.
Typically, all building blocks work together e.g. as depicted in
An Identity & Authentication Service may be provided, e.g. such as those provided by Google, Facebook or Twitter. Any suitable Identity & Authentication software may be employed such as, say, Ping Identity: On-Prem and Cloud Based IdP (and SP) for Enterprise, OIDC/SAML/OAuth; software which provides email-based OIDC passwordless authentication such as Cierge; Keycloak being an example of a Java-based OIDC/SAML IdP; Auth0 being another example of an OIDC IdP; and Gluu being another example of an OIDC/SAML IdP. Any suitable programming language may be employed such as but not limited to C, C++, C#, Java, Scala or Kotlin. Suitable design patterns are known—e.g. as described in Gamma, Erich et al. (1995). Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley. ISBN 0-201-63361-2, or in Buschmann, Frank et al. (1996). Pattern-Oriented Software Architecture, Volume 1: A System of Patterns. John Wiley & Sons. ISBN 0-471-95869-7. Any suitable identity and authentication protocols may be employed such as, say, those of Kerberos, OpenID, SAML or Shibboleth.
Client, Server, Authentication, and Identity Provider components may use any standard state-of-the-art technologies. Any suitable technology may be used for interconnecting these functional components in a suitable sequence or order e.g. via a suitable API/Interface. For example, state of the art tools, such as but not limited to Apache Thrift and Avro which provide remote call support, may be employed. Or, a standard communication protocol may be employed, such as but not limited to HTTP or MQTT, and may be combined with a standard data format, such as but not limited to JSON or XML. Interfaces 1.2, 1.3 and 6.1 deal with establishing/resolving pseudonyms. Interfaces 3, 3.1, 3.2, 3.3, 3.4, 3.5, and 3.6 deal with establishing/resolving policies and any suitable implementations for using policies may be employed. Interfaces 9 and 9.1 deal with data store access which may be standardized (e.g. SQL) or may be defined by a suitable data base technology. Interfaces 4, 5, 6, 7, and 8 deal with the traffic that is routed through a PET chain. Typically, the PET chain is embedded into a standard client-server setting and uses conventional protocols and data structures.
Black dashed flows (arrows) and modules (blocks) like Trusted Third Party (TTP), Connection PETs and a Personal Data Store (PDS) may or may not be provided depending e.g. on the set of PETs and/or their respective infrastructure requirements.
Device layer middleware (PAPI) may for example be implemented as Android library with 3 PETs. Service layer middleware (PAPI Backend) may for example be implemented as a Web Service with 2 PETs.
Certain embodiments include mapping of contextual privacy requirements (e.g., weak/medium/strong confidentiality, usage and participation un-linkability, indistinguishability, trust, deniability and awareness) to PET configurations with fine-granular privacy goals. Due to the expressiveness of (composite) context modalities, policies may be bound to very fine-grained constraints, e.g. may be made sensitive to all kinds of conditions. For example: data is to be anonymized if and only if the user does not work daytime and the device is off company premises; and/or data is to be left untouched if the user is in a meeting; and/or data is to be dropped if the user is in the bathroom; and/or location blurring is to be applied on Mondays; or any operation may be applied to the data, given any other logical condition.
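Such fine-grained bindings may, purely by way of illustration, be encoded as predicate/action pairs over the resolved context instance; the field names and actions below are assumptions for this sketch:

RULES = [
    (lambda ctx: not ctx["works_daytime"] and not ctx["on_premises"], "anonymize"),
    (lambda ctx: ctx["in_meeting"], "leave_untouched"),
    (lambda ctx: ctx["in_bathroom"], "drop"),
    (lambda ctx: ctx["weekday"] == "Monday", "blur_location"),
]

def resolve_action(ctx):
    # The first matching context condition determines the privacy operation.
    for predicate, action in RULES:
        if predicate(ctx):
            return action
    return "default_policy"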
Furthermore, PETs on all deployed devices may be centrally extended over the PAPI Backend's PET registry. Users and authorities may negotiate their privacy preferences for policies associated to their specific context instances, e.g. user id, in the PAPI Backend.
A Content Agnostic PET typically comprises a PET which operates without knowledge of the actual content or format of the payload, e.g. does not need to look into the actual data and instead may operate on protocol level, may only work on meta data or may only work on the ASCII/binary representation of the payload without needing to understand it.
The PAPI is typically deployed between a regular client and a regular server. The server response arrives e.g. from whatever server the client chooses to connect to. For example, a web browser (client) may request a certain web page from a web server (server). The server's response may then be this web page (or access denied, or pop up for authentication or any other suitable response from the server to that request).
Embodiments of the invention are applicable in any service infrastructure and may enable privacy sensitive service domains.
For example, one embodiment is the application in an Event Participation Tracking service where end-users, aka “customers”, wish to track and measure quantity or quality of their live events, e.g. favorite concerts or activities based on corresponding emotions and feelings. This embodiment's service infrastructure is shown with PAPI as privacy enhancing middleware. Customers are equipped with wearables and mobile phones to collect, process and share personal data, e.g. heart rate, temperature, video, audio, location, social network events, etc. The service application transfers the data through PAPI to the service cloud to track and quantify events with analytics and artificial intelligence services.
Thus, customers' privacy needs in sensitive contexts, e.g. off topic encounters, bathroom breaks, etc. are considered automatically throughout the service provisioning chain owing to a negotiated privacy policy for these (composite) contexts in the PAPI Backend.
For example, the customer's identity is replaced with pseudonyms, sensitive data is removed or perturbed (e.g. geolocation), the source network/IP of the device is anonymized (Tor), personal (inferred) data is encrypted and stored in the PDS to be shared only with customer consent, etc.
It is appreciated that, generally, privacy may require removal of whichever portions of data are not needed, blurring of whichever portions of the data do not need full accuracy, such that the blurred version is sufficient, and leaving untouched (only) whatever data is truly needed.
Examples of how the above cases may be used include but are not limited to the following scenarios:
1: Removing all PII except age allows for fully anonymized profiling of behavior of age groups.
2: Pseudonymization of all IDs allows for privacy aware analytics with the possibility to reveal real ID with user consent.
3: Pseudonymization of all IDs allows for privacy aware detection of unlawful behavior with de-anonymization by court order.
4: Detecting usage patterns of public infrastructure, e.g. public parks, by video analytics while blurring faces for anonymization.
The following prior art technologies, providing Contextual Policies, may be used as a possible implementation for the above described operations:
The following prior art technologies providing Privacy Middleware, architectures and frameworks, may be used as a possible implementation for the above described operations.
It is appreciated that the Privacy Architecture of
A may be any end-user application/client,
B is data privatizing PET-middleware and
C may be any end-service that interacts with application A (e.g. to store and process the end-users data).
Typically, a user sends data from application A to service C. This data is intercepted by middleware B; typically but not necessarily, the interception is transparent to A. Middleware B derives a configuration-policy (combination of PETs) from the user's chosen/assigned privacy policy (the policy is, once or occasionally, selected by the user from a given set of policies, e.g. prior to registering for service C). After applying the ad hoc instantiated PET combination on the data, the middleware forwards the privatized data to end-service C. Typically, the PET Descriptions come from the privacy middleware, which in turn derived the PETs to use from the negotiated policy. Typically, configuration descriptions are delivered either at setup time or later during maintenance by the system administrator. Typically, the PET system description is delivered back to the privacy middleware, representing the PET system currently employed. Typically, the privacy metric indicating the level of privacy that has been achieved is (also) given back to the privacy middleware. Typically, both outputs may be given to the end user or the service, thereby to inform the user or service of applied privacy measures.
Backward data flow from C to A may work analogously; response data from C typically goes through B where the data is privatized according to the established PETs that have been derived from the policy established during negotiation. From there, the data may be delivered to application A, where it is handled by A according to A's typical behavior.
A PET composer method and system for composing and metering PET systems is now described. As shown in
Computers & Security, Volume 53, September 2015, Pages 1-17, “A taxonomy for privacy enhancing technologies” by Johannes Heurix et al., describes that privacy-enhancing technologies (PETs) are technical measures preserving the privacy of individuals or groups of individuals. PETs are difficult to compare. Heurix provided a tool for systematic comparison of PETs.
The components, data, and flow may be as follows:
The input to the system of
The input (e.g. descriptions or data describing at least one PET) is provided to the Taxonomy Module of
It is appreciated that the taxonomy typically includes binary attributes aka properties which each PET either has or does not have, as well as multi-level attributes, aka capabilities, whose levels indicate the “strength” of that attribute possessed by a given PET e.g. strong confidentiality vs. weak confidentiality. Typically, the “strength” of an attribute is a scalar indicating an ordinal level (aka value) of a certain attribute possessed by a given PET.
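For illustration only, such a split between properties and capabilities might be held in memory as follows; the attribute names and the 1..3 strength scale are assumptions:

example_pet = {
    "properties": {"content-agnostic", "TTP-based"},             # binary: has / has not
    "capabilities": {"Confidentiality": 3, "Unlinkability": 1},  # 1 = weak .. 3 = strong
}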
The respective data structures (e.g. set structures) may be those described below in the description of “Model Definition” herein.
The state of the art teaches how to translate the formal definitions herein, for dimensions p, g, d, r, a, f etc., and their respective properties and capabilities, into computer legible data structures, using a suitable data format which may include attribute-value pairs, suitably nested. Since sets are prevalent in so many programming languages, any suitable constructs in any suitable programming language may be used to store these structures. Typically, although not necessarily, JSON, XML, ASN.1 are not optimal for representation of the data structures in computer memory (as opposed to APIs). An example JSON structure or syntax, useful on the API level, for representing an individual PET using the taxonomy of
Example syntax, using JSON:
An example PET description which uses the above syntax to describe a PET that provides TTP-based pseudonymization for authentication and authorization is as follows:
The above example is used herein in the description, for clarity. To see example values for the above, see the table of
Features and tweaks which accompany the JSON format may be allowed, e.g. use of arrays using square brackets. Also, the syntax may be extended (or shrunk) by adding (or removing) fields. The same is true for the possible values that may be assigned to the fields. Currently, the existing sub-fields for the PTM field and their possible values are those defined herein with reference to
ID: unique alpha-numeric identifier for a PET description input in
DB: Boolean, indicates if the actual PET description is already present in the system's Data Store (e.g., perhaps the PET description entered the DB when a previous combination of PETs was being processed). If DB=true, all subsequent fields may be ignored or skipped.
CTS: day and time of creation of the PET description [e.g. ISO 8601].
Name: alpha-numeric display name of PET description.
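Purely as a hypothetical illustration of a PET description carrying the fields just listed (the PTM sub-fields and all values below are assumptions for illustration, not the actual syntax defined herein):

{
  "ID": "pet-0001",
  "DB": false,
  "CTS": "2018-06-01T12:00:00Z",
  "Name": "TTP-based pseudonymization",
  "PTM": {
    "G": [["Pseudonymity", 2], ["Confidentiality", 1]],
    "R": ["partial"]
  }
}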
Conventional technology may be used to create an API to ingest the data structures shown above, such as but not limited to REST, SOAP.
It is appreciated that converting ingested data into a given target format, such as but not limited to JSON or XML, may be performed as is conventional.
A Composition Splitter may be provided in
It is appreciated that inspecting data in a specified data format and splitting it according to given rules may be performed as is conventional, e.g. the rules may be:
Properties are sent to the Property Module.
Capabilities are sent to the Capability Module.
The Property Module of
The Capability Module of
The Composition Merger of
The Metrics Module of
A respective JSON structure for a privacy index for a single PET is shown as an example of a possible implementation. Additional information, such as identifiers, time-stamps and others, may be added for easier handling of the data within the system.
where the fields ID, CTS, and Name are defined generally as described elsewhere herein in the context of the Taxonomy Module of
The Configuration Descriptions of
Whether or not to store various data in the data store of
Retention time of various data in data store of
Selecting a constrained union, in case there is more than the example (weakest link union) provided herein.
Which privacy metrics to compute (e.g., say, goal strength vs. counting the number of Attacker Models that are provided by the combined PET or other indicator of level of resilience against attacks, vs. both of the above).
In case there are more than two input PETs, a parameter indicating which combinations to build, e.g. if there are 3 inputs, perhaps build a combination of the first 2, then of the 3rd with the combination of the first 2 (see the fold sketch after this list).
Which kind of graphical output to produce, e.g. black-and-white tables or color-coded tables.
The Configuration Module of
Any available configuration-management technology may be employed by the configuration module, e.g. java.util.Properties, Apache Commons Configuration, or the Java Preferences API.
The configuration format may for example be as follows:
With the meaning of ID, DB, TS, and Name as in the other JSONs. The actual configuration, which may be an arbitrary and flexible number of key-value pairs, both keys and values alphanumeric, may be, say:
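The following is a sketch only; the configuration keys merely echo the options listed above, and all names and values are hypothetical:

    {
      "ID":   "CFG-001",
      "DB":   false,
      "TS":   "2017-02-27T12:00:00Z",
      "Name": "Default combiner configuration",
      "Configuration": {
        "storeResults":    "true",
        "retentionDays":   "365",
        "constraintUnion": "weakest-link",
        "metrics":         "goal-strength",
        "combineOrder":    "pairwise-left-to-right",
        "output":          "color-coded-table"
      }
    }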
The Data Store of
Regarding the internal taxonomy and the privacy index, any state-of-the-art data store for discrete data is suitable such as but not limited to Files, Databases, Cloud Stores.
For example, if JSON is used, Elasticsearch (a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents, which supports storage of JSON documents out of the box) may be used to store and index data and to provide a search interface therefor. If XML is used, a DB that has native XML support may be used, such as BaseX or eXist.
Regarding the configuration, suitable data stores may be used, depending on the choice of state-of-the-art configuration management. As an example, if JSON is a supported format, Elasticsearch may be used. If XML is used, a DB that has native XML support may be used, such as BaseX or eXist.
A PET System Description is an optional output of the system of
Conventional technology may be used to create a respective API to export the data structures, such as but not limited to REST, SOAP. Typically, the API outputs the composed PET in a specified data format e.g. the JSON used for PET input as described herein.
Privacy Index is an optional output of the system which typically includes a respective privacy index for each PET. The data format is typically the format provided by the Metric Module. Conventional technology may be used to create a respective API to export the data structures, such as but not limited to REST, SOAP. Typically, the API outputs the privacy index in a specified data format (e.g. the JSON used for privacy index as described herein).
Example use of the system of
P1: inverse whitelisting (=blacklisting) would prevent sharing of sensitive location data.
P2: pseudonyms would prevent tracks/routes of single individuals from being linked over time.
P5: differential privacy would prevent sharing of exposed routes (a route used by only a single person would expose this person),
which, in sum, yields a privacy index of:
where “+” is the weakest-link composition shown and described herein.
Any suitable method, even manual, may be employed, given a PET, to assign values to each attribute thereof. For example, values may be assigned by an individual provider or developer of a PET, or by a community of privacy experts/PET developers, based on human expertise in the field of PETs, thereby to provide a formal description of the PET, using all or any portion or variation of the taxonomy of
Example PETs mapped to the taxonomy of
Name of example PET: DC-Net
Short description: DC-Nets are an information-theoretically secure approach to hide the sender of a message within a group of senders.
Name of Example PET: k-Anonymity
Short description: k-Anonymity (and its improvements l-diversity and m-invariance) anonymizes the records of a database (example: medical records) before releasing the database to a third party or an analytical process.
The system of
The embodiment of
Besides technical goals, such as Unlinkability and Indistinguishability, socio-economic goals are missing in the original scheme of
Trust-enhancing technologies and metrics to measure or increase trust in data and entities are typically provided. Therefore, the goal dimension is extended with Trust and Awareness, and trust-enhancing technologies may be implemented as PETs in this model.
It is possible to distinguish between Unlinkability of Participants and Unlinkability of Usage. Hence, the embodiment of
A new property-dimension called Principle is introduced in the embodiment of
The Principle dimension may be used to bridge between non-technical and technical PET stakeholders. Furthermore, the principle dimension enables extensions for mappings between legal and technical privacy policies that are based on composed PET properties.
The Foundation dimension is extended, in
TTP and Scenario: in the embodiment of
The Reversibility dimension is especially notable in that it mixes capability and property characteristics. The Cooperation sub-dimension originally contained the characteristics Required and Not Required. Not Required may be removed, which leaves the sub-dimension with a single property; hence, it moves up as a new Reversibility property called Co-operational. The Degree sub-dimension may be seen as a capability set, with the exception of the None and Deniable characteristics: None is obsolete and Deniable is a property of the reversibility, thus it moves up. Currently, Degree is quantified by the two-element set {Full, Partial}. A finer-grained quantifier set may be used, either finite, such as {Full, Largely, Medium, Barely}, or infinite, such as ℕ, ℚ or ℝ. In the case of a finite quantifier set, there are more elements, but an order may still be defined, e.g. by determining a ranking extending from a “biggest” down to a “smallest” element and including all elements. In the case of infinite quantifiers by numbers, the order is defined by the number system itself. In the case of infinite non-number quantifiers, an order may be defined by a bijective mapping to a number system, adopting the number system's order. An example of the latter is mapping all alphanumeric strings to ℕ by interpreting the strings' binary representation (e.g. ASCII) as a binary number.
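For instance, the string-to-ℕ mapping just described may be sketched as follows (an illustration only; the function name is hypothetical):

    def string_rank(s):
        # Interpret the string's binary (ASCII) representation as a binary
        # number, inheriting the order of the natural numbers.
        return int.from_bytes(s.encode("ascii"), byteorder="big")

    # e.g. deriving an order over a non-number quantifier set:
    ranked = sorted(["Full", "Largely", "Medium", "Barely"], key=string_rank)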
The table in
An example Model Definition is now described in detail: The taxonomy dimensions are formalized as sets. Elements are dimensional leaves of the extended taxonomy tree. Deeper structures (tree depth greater than two) will be defined as subsets for readability. A concrete PET is defined as a set of leaves from the taxonomy tree, respectively as a set of sets. To keep the information contained in the leaf path, element subscript labels may be introduced to specify the element's parent characteristic and identity. For example, weak usage-unlinkability will be expressed as weakUnlink
P is the Principle property set.
G is the Goal capability set. The ordered strength level set STR is defined as the countably infinite set of natural numbers ℕ. To model the path, goal labels GLABEL are introduced and defined here explicitly for reuse in later definitions; e.g., for the scale between weak (1), medium (2) and strong (3): 1Aware ∈ G would state weak awareness. Labels allow strength level distinction of different privacy goals. For example, a PET with {1Aware, 1Conf} ⊂ G would provide weak awareness and confidentiality.
D = {Stored, Transmitted, Processed}
D is the Data dimension's property set.
R is the property/capability set of the Reversibility dimension.
A is the Aspect dimension's property set. For easier readability, the subdimensions Anonymity (ANO) and Pseudonymity (PSEUDO) are defined as subsets but follow the same label convention as the privacy goals. Hence, the identity aspect of single directed anonymity may be written as
F is the Foundation dimension's property set. Analogous to the previous definition, the subdimension “Category” has depth greater than two; hence a subset CAT is introduced.
T is the TTP dimension's property set.
S is the Scenario dimension's property set.
For simplicity, the subscripted labels of the elements are abbreviated (Conf., Aware., etc.). The notation {{x1, x2, . . . , xn}Label} shall be the short form of {x1Label, x2Label, . . . , xnLabel}.
Technically, goal strengths are determined by the corresponding PET's configuration; e.g. k-anonymity provides weak indistinguishability for a small k, but strong indistinguishability for a large k. Such cases may be supported with “virtual” PETs that reflect the different strength levels, e.g. as sketched below. An example configuration of k-anonymity could exist as w-k-anonymity (1Indis ∈ G with k=10), m-k-anonymity (2Indis ∈ G with k=100) and s-k-anonymity (3Indis ∈ G with k=1000).
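Such virtual PETs might be recorded as follows; the k thresholds come from the example above, while the structure, identifiers and (strength, label) goal encoding are assumptions for illustration:

    # Hypothetical "virtual" PET entries reflecting configuration-dependent
    # goal strengths of k-anonymity.
    K_ANONYMITY_VARIANTS = {
        "w-k-anonymity": {"k": 10,   "G": {(1, "Indis")}},  # weak
        "m-k-anonymity": {"k": 100,  "G": {(2, "Indis")}},  # medium
        "s-k-anonymity": {"k": 1000, "G": {(3, "Indis")}},  # strong
    }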
The complete taxonomy model may be defined as follows:
The PET Taxonomy Model (PTM) is an 8-tuple of sets:
PTM = (P, G, D, R, A, F, T, S)
Similar to the taxonomy dimension classification, PTM components may be classified as:
Property-set components={P, D, A, F, T, S}
Capability-set components={G, R}
℘(PTM) = (℘(P), ℘(G), ℘(D), ℘(R), ℘(A), ℘(F), ℘(T), ℘(S))
℘(PTM) = PTM′ = (P′, G′, D′, R′, A′, F′, T′, S′)
PTM′, the space of all possible PETs, is defined as:
where ℘ denotes the power set and the apostrophe in PTM′ components shall be used as short notation herein. Intuitively, this models the set of all possible PET characteristic combinations. According to one embodiment, although this is not intended to be limiting, the power sets ℘(G) and ℘(R) contain sets with multiple goal/reversibility strengths/degrees and the same label; e.g., {5Co-op, 1Co-op} ∈ ℘(G)
The identity element P0∈PTM′ is defined as:
P0=({ }, { }, { }, { }, { }, { }, { }, { }) or simply Ø
Combination of properties typically includes a relation on property-(sub)dimensional PTM characteristics using naive set union (∪).
Composition of capabilities typically includes a relation on capability-(sub)dimensional PTM characteristics using a C-constrained set union (∪C), where C is defined for the individual capability dimension.
Typically, the composition-function is a union of the PETs' dimensions.
Optionally, this composition-function may be adjusted to reflect use-case-specific, real-life constraints when implementing the combination of multiple PETs. For example:
In a framework with such PETs as plugins with defined interfaces, the resulting privacy properties of an overall system which may include combinations of PETs, may be estimated based on the privacy metric shown and described herein which is derived from the algebra.
The PET algebra <PTM′, Σ> is typically defined as a tuple of the PET-space PTM′, which serves as the universal set, and the signature Σ = (⊕), typically containing one binary operation ⊕ on PTM′. In short: <PTM′, ⊕>.
PTM′ element composition typically results in another PTM′ element. The relation is typically defined as characteristic combination and composition of its terms.
Definition
A ⊕ B with A, B ∈ PTM′ is typically defined as:
For all A, B ∈ PTM′ the weakest-link constrained composition ⊕w may for example be defined as:
A ⊕w B = (PA ∪ PB, GA ∪min GB, DA ∪ DB, RA ∪TO RB, AA ∪ AB, FA ∪ FB, TA ∪ TB, SA ∪ SB)
where the subscripts A and B denote the respective components of A and B.
For every property-dimension, ⊕ may be defined as set-union (combination). And for every capability dimension (Goal and Reversibility) this symbol may be defined as constraint union (∪min and ∪TO).
With the exception of Goal and Reversibility, all other dimensions are typically conventionally united (e.g. simple set union) for the resulting composed PET (i.e. the result of applying the PET-algebra), aka the PET-system (PETS). This design decision makes it possible to list properties of a composed PET system and preserves the original PET characteristics.
Minimum goal strength composition ∪min for all G1, G2∈G′ is typically defined as:
Intuitively, the Goal sets may be united, typically, with a minimum strength level restriction. ∪min is a set union where the “smallest” elements of the same label may be selected, e.g. {1Conf, 2Aware} ∪min {3Conf, 3Indis} = {1Conf, 2Aware, 3Indis}. This decision typically means, though, that PETs with the same goals but weaker strength levels semantically taint the composed system. Hence, privacy goals of the composition are typically downgraded in this embodiment.
Maximum reversibility degree composition ∪TO for all R1, R2 ∈ R′ is typically defined as:
Reversibility is typically a hybrid PET dimension containing properties (Co-operational and Deniable) and capabilities (Degree). Intuitively, the Reversibility sets are typically united for their properties, and for their capabilities with a trade-off (TO). The degree typically has two capability states, fully and partially; ∪TO computes their logical conjunction. E.g. composition of a fully reversible PET {FullDegree} with a co-operational, deniable and partially reversible PET {PartialDegree, Deniable, Co-operational} results in {Co-operational, Deniable, PartialDegree}.
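By way of illustration only, the operator ⊕ and the two constrained unions may be sketched in Python, assuming goals are modeled as (strength, label) pairs and Reversibility as a set of tags in which "Full"/"Partial" stand for FullDegree/PartialDegree; all identifiers are illustrative, not part of the formal model:

    def union_min(g1, g2):
        # ∪min: keep, per goal label, the smallest strength in either set.
        strengths = {}
        for strength, label in g1 | g2:
            strengths[label] = min(strength, strengths.get(label, strength))
        return {(s, label) for label, s in strengths.items()}

    def union_to(r1, r2):
        # ∪TO: properties are simply united; the Degree capability is Full
        # only if no input PET is partially reversible.
        merged = r1 | r2
        props = merged - {"Full", "Partial"}
        degrees = merged & {"Full", "Partial"}
        if degrees:
            props.add("Full" if degrees == {"Full"} else "Partial")
        return props

    def compose(pet_a, pet_b):
        # ⊕: plain set union on property dimensions; constrained unions on G, R.
        result = {d: pet_a[d] | pet_b[d] for d in ("P", "D", "A", "F", "T", "S")}
        result["G"] = union_min(pet_a["G"], pet_b["G"])
        result["R"] = union_to(pet_a["R"], pet_b["R"])
        return result

With these definitions, union_min({(1, "Conf"), (2, "Aware")}, {(3, "Conf"), (3, "Indis")}) returns {(1, "Conf"), (2, "Aware"), (3, "Indis")}, and union_to({"Full"}, {"Partial", "Deniable", "Co-operational"}) returns {"Co-operational", "Deniable", "Partial"}, matching the examples above.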
With the previously defined algebra, PETs may be formally composed. Resulting PET systems typically contain, among others, composed goal characteristics that may be used to describe and estimate an underlying IT system's enforced privacy. The description herein relates to utilization of the Goal characteristics by way of example, but the description is applicable to other or all characteristics.
The algebra is typically used to derive an enforceable privacy metric that measures which privacy requirements are minimally covered by a combination of PETs, when used in an application-to-service data flow.
Enforced privacy metric for a PETS P may be:
Privacy(P) = gs(P) / (|G| · max(STR*)), where gs(P) = Σ xi over all xi ∈ g(P)
xi ∈ g(P) is the i-th goal's strength; gs(P) is the strength-level weighted number of goals for a PETS P, and |G| is the maximum count of possible privacy goals in the PTM. STR is defined as ℕ and is open; for the metric, a closed subset STR* must be used. E.g., for an IT system using the model only with the strength levels weak (1), medium (2) and strong (3), the subset is STR* = {s ∈ ℕ | 1 <= s <= 3}.
Alternatively, instead of limiting the subset STR* to a predefined maximum (3 in the case above), the highest actually occurring strength may be used as the limiting element.
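A minimal sketch of this metric in Python, under the same (strength, label) goal encoding assumed in the composition sketch above; the function name and signature are illustrative:

    def privacy_index(goals, num_possible_goals, max_strength=None):
        # Sum of goal strengths, normalized by the strength reference
        # (the closed STR* maximum if given, else the highest occurring
        # strength) and by the number of goals representable in the taxonomy.
        if not goals:
            return 0.0
        ref = max_strength if max_strength else max(s for s, _ in goals)
        return sum(s / ref for s, _ in goals) / num_possible_goals

E.g., privacy_index({(1, "Conf"), (2, "Aware"), (3, "Indis")}, num_possible_goals=7) yields (1/3 + 2/3 + 3/3)/7 = 2/7 ≈ 0.29.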
Metrics for other PET dimensions may be defined analogously, e.g. as follows: For capabilities that are quantified over number sets, such as {1, 2, 3}, ℕ, ℝ or ℚ, the formulas given above may be re-utilized (working on the capability in scope instead of G). For capabilities that are quantified over non-number sets, a bijective mapping to a number set may be applied to make that capability quantified over a number set. Properties may be treated as capabilities by assigning quantifiers to their values; metrics may then be computed as shown herein for capabilities.
Use cases include but are not limited to:
It is appreciated that the description herein is in some ways specific, for clarity, to user centric privacy scenarios in (Social) IoT scenarios such as Quantified Self-based analytics. However, the combiner system shown and described herein may also be employed in group centric privacy scenarios such as but not limited to community deriving or establishment and/or to non-privacy scenarios such as machine or traffic monitoring with contextual data gathering restrictions.
Advantages of certain embodiments include that PET and Privacy Models are formalized using a flexible, general model including a formal PET algebra and composition, and are not specific to any particular definition of privacy or any particular privacy goals. Privacy metrics are general, rather than being tailored to specific domains.
Conventional PET and Privacy Model technologies may be combined with the teachings herein, such as but not limited to the conventional PET and Privacy Model technologies described in references A-F set out in the Background section, above.
For example, Heurix et al. propose a generic multi-dimensional PET taxonomy [A] to compare previously-thought-incomparable PETs. Their model neither formalizes the taxonomy nor investigates composition of PETs, such that combining it with the relevant taxonomy formalization and/or composition investigation teachings herein is advantageous.
Other related formal models, which are privacy-definition specific and may formalize specific privacy goals and typically relations therebetween, may also be combined with the teachings herein. For example, the work of Pfitzmann [B] defines many privacy notions and their relationships, such as anonymity and observability. Bohli et al. [C] unify anonymity, pseudonymity, unlinkability, and indistinguishability. Backes et al. [D] analyze and quantify anonymity properties in a formal framework. A generalizing Pi-calculus based privacy model is described by Dong et al. [E] to enforce user privacy even when the user is collaborating with the adversary. Kifer et al. [F] define a formal framework to create new application-specific privacy definitions. These works are applicable to only a subset of PETs, privacy goals or definitions, but may be combined with the teachings shown and described herein.
Conventional PET Composition technologies may be combined with the teachings herein, such as but not limited to the following PET Composition technologies:
[G] Yannis Tzitzikas, Anastasia Analyti, Nicolas Spyratos, and Panos Constantopoulos, [2004], An algebraic approach for specifying compound terms in faceted taxonomies. In Information Modelling and Knowledge Bases XV, 13th European-Japanese Conference on Information Modelling and Knowledge Bases, EJC′03. 67-87.
[H] Dominik Raub and Rainer Steinwandt. 2006. An algebra for enterprise privacy policies closed under composition and conjunction. In Emerging Trends in Information and Communication Security. Springer, 130-144.
In particular, Tzitzikas et al. propose a general composition algebra for faceted taxonomies [G], but this approach requires prior declaration of invalid compositions; hence it may be improved by suitable combining with the teachings herein. Raub et al. define an algebra to compose privacy policies [H], yet their approach is specialized to policy composition and does not include a formal PET algebra or composition; hence it may be improved by suitable combining with the teachings herein.
Conventional Privacy metrics technologies may be combined with the teachings herein, such as but not limited to the following Privacy metrics technologies: [I] Latanya Sweeney. 2002. k-anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 10, 05 (2002), 557-570.
[J] Andrei Serjantov and George Danezis. 2002. Towards an information theoretic metric for anonymity. In Privacy Enhancing Technologies. Springer, 41-53.
[K] Sebastian Clauß and Stefan Schiffner. 2006. Structuring Anonymity Metrics. In Proceedings of the Second ACM Workshop on Digital Identity Management (DIM '06). ACM, New York, N.Y., USA, 55-62.
It is known that metrics may be defined based on set-size, probability, entropy or Euclidean distance, e.g., for indistinguishability and anonymous communication. For example, the size of a subject set wherein an individual subject is not identifiable [B]. This anonymity-set metric is applicable to Mix-Nets and DC-Nets. Indistinguishability may be measured with set and distance metrics as well, e.g., k-anonymity [I]. Probability and entropy-based metrics such as the Shannon [J] or Rényi entropy [K], enable modeling of additional attacker knowledge to measure information-theoretic privacy. Rebollo et al. [M] propose a theoretical framework endowed with a general privacy definition in terms of the estimation error incurred by an attacker. Yet set-based privacy metrics are not applicable in arbitrary privacy domains and lack fine-grained attacker knowledge modeling. Indistinguishability based approaches have weaknesses against attackers capable of running statistical de-anonymization attacks resulting in reduced anonymity sets. Entropy approaches are heavily influenced by outliers. Therefore, each of these technologies may be improved by suitable combining with the teachings herein.
One of the many applications for the methods and systems shown herein is the Internet of Things. IoT maps the real world into the virtual world by mapping uniquely identifiable physical objects to virtual resources. RFID, Auto-ID and optical tags such as QR- or barcodes are among the technologies that may be employed to achieve this. Passive RFID and NFC chips in credit cards, door access cards or retail items may be used for tagging, identification and tracking of people and items. Interactions between objects, people, and the surrounding environment are captured via passive/active sensors and transferred with semantic analytics services to physical actuators and visual presentation. Autonomous, physical-world events are reflected by virtual-world events, e.g., haptic events with actuators or cognitive events with knowledge visualization. Examples include autonomous industrial control (Industry 4.0, SCADA); sensors for detecting pollution, radiation etc. (smart city and environment); participatory sensing with smartphone sensors, e.g., microphone, gyroscope, and location; but also cloud-based service architectures and social networks with virtual entities and events. Any suitable communication technologies and protocols may be employed. For example, classical client-server and P2P communication may be used to connect devices and services over the Internet. Wireless physical-layer communication technologies enable dynamic, mobile, M2M networks such as WSN, RSN, Mobile Ad Hoc Networks (MANET), Vehicular Ad Hoc Networks (VANET) and hybrid cloud-based infrastructures. Wireless technologies from low to high power consumption and connectivity range include NFC, Bluetooth (Low Energy), WiFi, WiMax, WiFi-Direct, ZigBee, Z-Wave, 3G, 4G, and LTE-Advanced. When end-to-end connections are established on lower TCP/IP layers, specialized application-layer protocols may be employed, such as but not limited to RESTful HTTP or event-based/real-time protocols such as MQ Telemetry Transport (MQTT), Extensible Messaging and Presence Protocol (XMPP) and WebSockets.
Privacy Enhancing Technology (PET) systems are applicable to IoT, inter alia, and applicability may be improved using the methods shown and described herein, e.g. to provide advanced IT systems using multiple PETs. These systems' privacy may or may not be quantified with a set-based privacy metric, e.g. as described herein. Such systems are advantageous relative to conventional PETs such as, say, Tor's anonymous network communication or Differential Privacy-based privacy-preserving analytics. The contextual and policy-based privacy architecture for IoT infrastructures shown and described herein is, however, merely by way of example, particularly since the applicability of the PET taxonomy-based embodiments and algebra-based PET composition methods shown and described herein is not limited to mobile IoT or to IoT at all.
Example PETs that may be composed or combined, using methods shown and described herein, are described e.g. in the Heurix publication mentioned herein. Examples include:
Classical k-anonymity proposed by Sweeney, with its two well-known extensions, l-diversity (Machanavajjhala et al.) and t-closeness (Li et al.);
Private information retrieval (PIR), proposed by Chor et al.;
Oblivious transfer, e.g. a protocol proposed by Rabin, that allows the transfer of a secret between a sender and a receiver without the sender knowing what the receiver has received;
The Boneh scheme, which allows an untrusted server to verify the existence of an encrypted keyword that has been encrypted with a receiver's public key by a sender;
Proxy re-encryption, e.g. a cryptographic protocol proposed by Blaze et al., which allows the re-encryption of data by a third party so that data initially encrypted with the sender's public key may then be decrypted with the receiver's private key, without the third party having access to any of the secret keys or to the data's content;
Proxy re-encryption used by Ateniese et al. as a basis for a secure file system;
Deniable encryption e.g. by Canetti et al;
Steganography, operative for hiding the content of a piece of data, including hiding the existence of sensitive data by embedding or encoding the data within uncritical carrier data, e.g. as proposed by Katzenbeisser and Petitcolas;
Mix Net, e.g. as proposed by Chaum, for anonymized communication between sender and receiver based on encapsulation of a message in multiple encryption layers to hide the message's route, combined with mixing the messages (sequence, delay, dummy traffic);
Techniques to hide the identity of individuals by application of group signatures, as proposed by Chaum, e.g. a group signature scheme where members of the group are able to create a valid signature while the actual individual is kept hidden from the verifier, including the version with fixed-size public key proposed by Camenisch and Stadler;
Anonymous credential systems, which prove possession of a certain credential without disclosing the user's identity, introduced by Chaum in 1985, in which the sender uses different pseudonyms for communication with a credential provider and a credential receiver, where the pseudonym for the receiver is blinded before it is sent to the credential provider;
Tor Browser which makes available the Tor anonymization framework by providing an easy-to-use Tor-enabled web browser;
PET tools added to existing web browsers as plugins such as HTTPS Everywhere by The Tor Project and the Electronic Frontier Foundation (HTTPS Everywhere) or Privacy Bird or Ghostery, e.g., which identifies trackers on each visited web site and optionally blocks them to hide the user's browsing habits;
End-user PETs including GNU Privacy Guard (GnuPG or GPG) which applies asymmetric and symmetric cryptography to messages and data to primarily encrypt the content of emails. This also includes a key management system based on the web-of-trust concept;
Data-at-rest PETs such as disk-encryption tools, e.g. Truecrypt, which allows both creating encrypted disk volumes and full-disk encryption.
Typically, although not necessarily, Privacy(P) is the privacy score in [0, 1] of an IT system using the PETS P, where zero is defined to mean that no PETs, or only PETs without privacy goals, are applied, such that no technological provisions at all are provided for the benefit of customers, service providers and other end-users of such an IT system. Conversely, a score of one may be defined to mean maximal technically guaranteed privacy, where all possible PET-based privacy technologies, mechanisms and/or privacy goals described herein are covered by the PETs deployed in the IT system. A particular advantage of certain embodiments is that this may occur, and the resulting privacy nonetheless be maximal, even if the IT system was not designed a priori using “privacy by design” principles or other privacy policy compliance mechanisms. Typically, the privacy metric assumes that the PETs in P are correctly deployed in the system to enforce the respective privacy goals of each PET.
It is appreciated that, typically, the PET description input to the PET combiner of
It is appreciated that in the taxonomy of
While JSON is one suitable representation of a set-based PET system, JSON is merely exemplary. For example, JSON may not be used at all, or may be used for an API (e.g. an API for exporting data structures) but not to store the taxonomy in memory. To represent the taxonomy in memory, set constructs provided by the utilized programming language may be employed, since JSON is usually not optimal, such as but not limited to:
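For instance (an illustrative sketch; the dimension keys follow the PTM components, and all leaf values are hypothetical), Python's native set types may hold the taxonomy directly:

    # Illustrative in-memory representation of a single PET using native
    # Python sets; goals are (strength, label) pairs as sketched earlier.
    example_pet = {
        "P": frozenset({"DataMinimization"}),
        "G": frozenset({(2, "Unlink")}),
        "D": frozenset({"Transmitted"}),
        "R": frozenset({"Deniable"}),
        "A": frozenset({"Identity"}),
        "F": frozenset({"Crypto"}),
        "T": frozenset({"Required"}),
        "S": frozenset({"Network"}),
    }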
It is appreciated that the specific embodiments shown and described herein are merely exemplary. More generally, the following method (
i. provide manual or automatic evaluations of each of plural “raw” or “known” or “input” PETs, in accordance with any suitable taxonomy e.g. the taxonomies shown and described herein
ii. use some or all of the modules of the system of
iii. the system of
This index typically indicates the level of privacy achieved by the composition of “input” PETs, typically relative to a maximum achievable level defined by what is representable by the taxonomy, i.e. the number of possible values for the dimension in which the privacy index is computed. For example, consider a privacy index for the “goal” dimension, assuming a combined PET with the following goals
There are 7 different privacy goals, by way of example, in the taxonomy of
The highest Strength in the combined PET may be the reference for normalization: normalization reference = 3. Then normalize all Strengths in the combined PET using the
So, typically, a normalization step is involved. If the Strength is defined over an infinite set (e.g. the natural numbers in the case at hand), the highest occurring Strength may be used as the reference. If the Strength set is finite, the highest possible Strength may be the reference for normalization.
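For example (with hypothetical goal values): for combined goals {1Conf, 2Aware, 3Indis}, a normalization reference of 3 and |G| = 7, the index is (1/3 + 2/3 + 3/3)/7 = 2/7 ≈ 0.29.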
Typically, a user has the possibility to select the privacy policies he deems adequate from various available policies, which may be presented via a GUI, e.g. as a list. By selecting policies, the user defines his requirements: for example, if the user has very relaxed privacy requirements, he may express that by selecting a very relaxed policy; conversely, if the user has very strict privacy requirements, he selects a very strict policy. Typically, given the policies selected by the user, and typically after a negotiation process, the middleware shown and described herein automatically selects the appropriate PETs to implement/enforce this policy. These PETs may then be combined using the algebra shown and described herein, yielding the level of privacy defined by the policy.
It is appreciated that a system such as the system of
An advantage of the above flow is that PETs whose operation is well enough known to enable their evaluation in accordance with the taxonomy of
The system may include a display generator operative to generate a table or graph or any other description of any or all compositions in which the end user is interested, e.g. as shown in the examples of
It is appreciated that this system would allow an end-user a larger selection or library of PETs from which to choose on an informed basis (i.e. in accordance with each PET's properties according to the taxonomy of
Each combination of PETs itself constitutes a PET, in effect, even though the PETs (e.g. programs, networks, libraries) are executed separately, one after another, and even though the individual PETs need not be co-located on a single node in a communication network and instead may be scattered over the network. So, this system provides data regarding not only each of p plural “raw” or “known” or “input” PETs, but also regarding some or all compositions of the p PETs, yielding, in total, a large number of PETs whose properties according to the taxonomy of
It is appreciated that references herein to the taxonomy of
References to the taxonomy of
A particular advantage of certain embodiments is that end-users do not (or need not, since the system herein may be configured to allow advanced users to choose to do this as an option) directly upload or deal with PET descriptions in any way; instead, PET descriptions are handled internally.
It is appreciated that terminology such as “mandatory”, “required”, “need” and “must” refer to implementation choices made within the context of a particular implementation or application described herewithin for clarity and are not intended to be limiting since in an alternative implementation, the same elements might be defined as not mandatory and not required or might even be eliminated altogether.
Components described herein as software may, alternatively, be implemented wholly or partly in hardware and/or firmware, if desired, using conventional techniques, and vice-versa. Each module or component or processor may be centralized in a single physical location or physical device or distributed over several physical locations or physical devices.
Included in the scope of the present disclosure, inter alia, are electromagnetic signals in accordance with the description herein. These may carry computer-readable instructions for performing any or all of the operations of any of the methods shown and described herein, in any suitable order including simultaneous performance of suitable groups of operations as appropriate; machine-readable instructions for performing any or all of the operations of any of the methods shown and described herein, in any suitable order; program storage devices readable by machine, tangibly embodying a program of instructions executable by the machine to perform any or all of the operations of any of the methods shown and described herein, in any suitable order i.e. not necessarily as shown, including performing various operations in parallel or concurrently rather than sequentially as shown; a computer program product comprising a computer useable medium having computer readable program code, such as executable code, having embodied therein, and/or including computer readable program code for performing, any or all of the operations of any of the methods shown and described herein, in any suitable order; any technical effects brought about by any or all of the operations of any of the methods shown and described herein, when performed in any suitable order; any suitable apparatus or device or combination of such, programmed to perform, alone or in combination, any or all of the operations of any of the methods shown and described herein, in any suitable order; electronic devices each including at least one processor and/or cooperating input device and/or output device and operative to perform e.g. in software any operations shown and described herein; information storage devices or physical records, such as disks or hard drives, causing at least one computer or other device to be configured so as to carry out any or all of the operations of any of the methods shown and described herein, in any suitable order; at least one program pre-stored e.g. in memory or on an information network such as the Internet, before or after being downloaded, which embodies any or all of the operations of any of the methods shown and described herein, in any suitable order, and the method of uploading or downloading such, and a system including server/s and/or client/s for using such; at least one processor configured to perform any combination of the described operations or to execute any combination of the described modules; and hardware which performs any or all of the operations of any of the methods shown and described herein, in any suitable order, either alone or in conjunction with software. Any computer-readable or machine-readable media described herein is intended to include non-transitory computer- or machine-readable media.
Any computations or other forms of analysis described herein may be performed by a suitable computerized method. Any operation or functionality described herein may be wholly or partially computer-implemented e.g. by one or more processors. The invention shown and described herein may include (a) using a computerized method to identify a solution to any of the problems or for any of the objectives described herein, the solution optionally including at least one of a decision, an action, a product, a service or any other information described herein that impacts, in a positive manner, a problem or objectives described herein; and (b) outputting the solution.
The system may, if desired, be implemented as a web-based system employing software, computers, routers and telecommunications equipment as appropriate. Any suitable deployment may be employed to provide functionalities e.g. software functionalities shown and described herein. For example, a server may store certain applications, for download to clients, which are executed at the client side, the server side serving only as a storehouse. Some or all functionalities e.g. software functionalities shown and described herein may be deployed in a cloud environment. Clients e.g. mobile communication devices such as smartphones may be operatively associated with, but external to, the cloud.
The scope of the present invention is not limited to structures and functions specifically described herein and is also intended to include devices which have the capacity to yield a structure, or perform a function, described herein, such that even though users of the device may not use the capacity, they are, if they so desire, able to modify the device to obtain the structure or function.
Any “if-then” logic described herein is intended to include embodiments in which a processor is programmed to repeatedly determine whether condition x, which is sometimes true and sometimes false, is currently true or false, and to perform y each time x is determined to be true, thereby to yield a processor which performs y at least once, typically on an “if and only if” basis e.g. triggered only by determinations that x is true and never by determinations that x is false.
Features of the present invention, including operations, which are described in the context of separate embodiments may also be provided in combination in a single embodiment. For example, a system embodiment is intended to include a corresponding process embodiment and vice versa. Also, each system embodiment is intended to include a server-centered “view” or client centered “view”, or “view” from any other node of the system, of the entire functionality of the system, computer-readable medium, apparatus, including only those functionalities performed at that server or client or node. Features may also be combined with features known in the art and particularly, although not limited, to those described in the Background section or in publications mentioned therein.
Conversely, features of the invention, including operations, which are described for brevity in the context of a single embodiment or in a certain order, may be provided separately or in any suitable subcombination, including with features known in the art (particularly although not limited to those described in the Background section or in publications mentioned therein) or in a different order. “e.g.” is used herein in the sense of a specific example which is not intended to be limiting. Each method may comprise some or all of the operations illustrated or described, suitably ordered e.g. as illustrated or described herein.
Devices, apparatus or systems shown coupled in any of the drawings may in fact be integrated into a single platform in certain embodiments or may be coupled via any appropriate wired or wireless coupling such as but not limited to optical fiber, Ethernet, Wireless LAN, HomePNA, power line communication, cell phone, Smart Phone (e.g. iPhone), Tablet, Laptop, PDA, Blackberry GPRS, Satellite including GPS, or other mobile delivery. It is appreciated that in the description and drawings shown and described herein, functionalities described or illustrated as systems and sub-units thereof may also be provided as methods and operations therewithin, and functionalities described or illustrated as methods and operations therewithin may also be provided as systems and sub-units thereof. The scale used to illustrate various elements in the drawings is merely exemplary and/or appropriate for clarity of presentation and is not intended to be limiting. Headings and sections herein as well as numbering thereof, is not intended to be interpretative or limiting.
Priority is claimed from U.S. provisional application No. 62/463,874, entitled End Privacy Architecture for IoT and filed on 27 Feb. 2017, the disclosure of which application is hereby incorporated by reference.