This section is intended to introduce the reader to various aspects of art, which may be related to various aspects of the present invention that are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present invention. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.
In the field of processor-based systems, such as computer systems, it may be desirable for information and/or electronic data to be transferred from one system to another system via a network or other electronic means. For example, networks may be arranged to allow information, such as files or programs, to be shared across an office, a building, or any geographic boundary. The Internet, for example, is a global network that may allow for private e-mail communications, business transactions between multiple parties, targeted advertising, commerce, and the like. While networks such as the Internet may be used to increase productivity and convenience, they also may expose communications and computer systems to security risks (e.g., interception of confidential data by unauthorized parties, loss of data integrity, data manipulation, and unauthorized access to accounts).
Commitments may be used in interactive protocols between mutually distrusting parties. In some network communications, it may be desirable to have one party commit to a set of features such that the party can later prove that selected queries are satisfied by the committed-to set of features. Additionally, the committing entity may wish to prove that certain features were missing from the previously committed group. An improved method for providing such commitment is desirable.
One or more specific embodiments of the present invention will be described below. In an effort to provide a concise description of these embodiments, not all features of an actual implementation are described in the specification. It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another. Moreover, it should be appreciated that such a development effort might be complex and time consuming, but would nevertheless be a routine undertaking of design, fabrication, and manufacture for those of ordinary skill having the benefit of this disclosure. It should be noted that illustrated embodiments of the present invention throughout this text may represent a general case. For example, features illustrated as integer features may be turned into a series of binary features (e.g., “X=5” becomes “X's first bit is on” and “X's third bit is on”).
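As an illustration of the binary expansion mentioned above, the following minimal Python sketch turns an integer feature into one binary feature per set bit. The function name `binary_features`, the output wording, and the convention that the first bit is the least significant bit are assumptions made for this example only.

```python
def binary_features(name: str, value: int) -> list[str]:
    """Expand an integer feature into one binary feature per set bit.

    Assumption: bit 1 is the least significant bit, so X=5 (binary 101)
    yields features for bit 1 and bit 3.
    """
    features = []
    bit = 1
    while value:
        if value & 1:
            features.append(f"{name}'s bit {bit} is on")
        value >>= 1
        bit += 1
    return features
```

For instance, `binary_features("X", 5)` yields one entry for the first bit and one for the third bit.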
Commitment methods may be used to commit to some data privately and (optionally) later prove what data was previously committed to. Such methods may include three parts: (1) a procedure for mapping the data to a small value, which may be referred to as a commitment token, and possibly some secrets (pieces of data held privately by the committer); (2) a procedure which takes the data D and a commitment token C produced earlier by (1) along with any associated secrets and produces a proof of the form “The data used to produce C is D”; and, (3) a procedure for verifying statements of the form produced by (2). These procedures may be arranged so that possessing C alone effectively reveals nothing about D.
An exemplary use of commitment methods may be illustrated by two parties attempting to flip a fair imaginary coin over a telephone as part of a game. This may be done as follows: each party flips a real coin, commits to the result, and sends the resulting commitment token to the other party. After exchanging the commitment tokens, each party may send a proof revealing the value to which that party committed. If both proofs verify, then the result of the imaginary coin is the exclusive-or (xor) of the two committed values. If one party's proof fails, or if that party refuses to follow this procedure, that party is considered to be cheating. Note that if it were possible to learn about the other party's coin flip from the commitment token alone, or to lie at reveal time about one's previously committed-to coin value, a party would be able to unfairly influence the result of the imaginary coin flip.
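The coin-flipping exchange can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function names `commit`, `verify`, and `imaginary_coin` are hypothetical, SHA-256 stands in for whatever hash function an embodiment would use, and both parties are simulated inside a single function rather than over a telephone line.

```python
import hashlib
import secrets

def commit(bit: int) -> tuple[bytes, bytes]:
    """Return (commitment token, secret) for a coin value of 0 or 1."""
    secret = secrets.token_bytes(32)  # random secret kept private until reveal time
    token = hashlib.sha256(bytes([bit]) + secret).digest()
    return token, secret

def verify(token: bytes, bit: int, secret: bytes) -> bool:
    """Check a proof (bit, secret) against a previously received token."""
    return hashlib.sha256(bytes([bit]) + secret).digest() == token

def imaginary_coin(flip_a: int, flip_b: int) -> int:
    # Step 1: each party commits and sends only its token to the other party.
    token_a, secret_a = commit(flip_a)
    token_b, secret_b = commit(flip_b)
    # Step 2: after both tokens are exchanged, each party reveals (bit, secret).
    if not (verify(token_a, flip_a, secret_a) and verify(token_b, flip_b, secret_b)):
        raise ValueError("a proof failed to verify; that party is cheating")
    # Step 3: the shared result is the xor of the two committed values.
    return flip_a ^ flip_b
```

Because a token is fixed before either value is revealed, a party who tries to reveal a different bit than the one committed to fails the `verify` check.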
Set commitment methods are a specialization of commitment methods that may be used to commit to a set of values (e.g., a set of numbers, a set of names, or a set of web sites visited) privately and to later (optionally) prove information about the committed-to set's contents. In particular, they may allow proving that particular values are in the committed-to set. They may also allow proving that particular values are not in the committed-to set. Ideally, these proofs do not reveal anything else about the set's contents. An exemplary set commitment method is discussed in S. Micali, M. O. Rabin and J. Kilian, Zero-Knowledge Sets, The Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science (2003). An example relating to set commitment can be found in U.S. patent application Ser. No. 10/909,161 by Mark David Lillibridge and Rajan Mathew Lukose, filed Jul. 30, 2004, entitled “System and Method for Targeted Advertising Via Commitment”, which is presently incorporated by reference herein.
A technique for building set commitment methods is the use of HDAGs (hash-based directed acyclic graphs). An HDAG may be defined as a DAG (directed acyclic graph) wherein pointers hold cryptographic hashes (defined below) instead of addresses. A DAG may be defined as a data structure having directed edges and no path that returns to the same node. The node from which an edge emerges is called the parent of the node to which that edge points, which in turn is called the child of the original node. Each node in a DAG may be either a leaf or an internal node. An internal node has one or more child nodes, whereas a leaf node has none. The children of a node, their children, and so forth are the descendants of that node, and all children of the same parent node are siblings. If every child node has no more than one parent node in a DAG and every node in the DAG is reachable from a single node (called the root node), then that DAG is a tree. HDAGs that are trees are sometimes referred to as Merkle Trees. A binary tree may be a tree wherein every node in the tree has at most two children.
A cryptographic hash (shortened to hash in this document) may be defined as a small number produced from arbitrarily-sized data by a mathematical procedure called a hash function (e.g., MD5) such that (1) any change to the input data (even one as small as flipping a single bit) with extremely high probability changes the hash and (2) given a hash, it is infeasible to find any data that maps to that hash that is not already known to map to that hash. Because it is essentially impossible to find two pieces of data that have the same hash, a hash can be used as a reference to the piece of data that produced it; such references may be called intrinsic references because they depend on the content being referred to, not where the content is located. Traditional addresses, by contrast, are called extrinsic references because they depend on where the content is located.
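The distinction between intrinsic and extrinsic references can be made concrete with a small content-addressed store. The sketch below is illustrative only: the names `store`, `put`, and `get` are hypothetical, and SHA-256 is used as an assumed stand-in for the hash function.

```python
import hashlib

# Hypothetical in-memory store keyed by the hash of the content it holds.
store: dict[str, bytes] = {}

def put(data: bytes) -> str:
    """Store data and return its intrinsic reference (the hash of the content)."""
    ref = hashlib.sha256(data).hexdigest()
    store[ref] = data
    return ref

def get(ref: str) -> bytes:
    """Fetch data by intrinsic reference and re-check that the content still matches."""
    data = store[ref]
    if hashlib.sha256(data).hexdigest() != ref:
        raise ValueError("stored content does not match its reference")
    return data
```

The same bytes always yield the same reference no matter where they are stored, whereas changing a single bit of the input (for example `b"a"` versus `` b"`" ``) yields an unrelated reference.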
To prove that a particular element, say five, is in the set whose commitment token the committer announced, the committer need merely supply the contents of all of the nodes (inclusive) on the path from the root node to the node containing that element; in the case of five, this would be node 101 followed by node 102. The advantage of sending just a path to the node containing that element instead of the contents of the entire HDAG is that a path is often exponentially smaller than the entire HDAG. A skeptical observer may verify such proofs by checking that the first node hashes to C, that each succeeding node's hash is contained in the preceding node as a pointer value, and that the final node contains the element whose presence is being proved. This method of proof is quite general: the presence of an arbitrary subset of nodes in an HDAG can be proved by supplying them and all their ancestors' contents.
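The three checks performed by the skeptical observer can be sketched as follows. The node layout (a dictionary with `elements` and `pointers` fields), the JSON serialization, the use of SHA-256, and the two-node example HDAG are all assumptions made for illustration; they are not the layout of any figure.

```python
import hashlib
import json

def node_hash(node: dict) -> str:
    """Hash a node's full contents (assumed serialization: canonical JSON)."""
    return hashlib.sha256(json.dumps(node, sort_keys=True).encode()).hexdigest()

def verify_presence(commitment: str, path: list[dict], element: int) -> bool:
    """Verify a proof consisting of the nodes on the path from root to target."""
    # Check 1: the first node must hash to the commitment token C.
    if not path or node_hash(path[0]) != commitment:
        return False
    # Check 2: each succeeding node's hash must appear as a pointer in its predecessor.
    for parent, child in zip(path, path[1:]):
        if node_hash(child) not in parent["pointers"]:
            return False
    # Check 3: the final node must contain the element being proved present.
    return element in path[-1]["elements"]

# Hypothetical two-level HDAG: a root pointing at two leaves.
leaf = {"elements": [5], "pointers": []}
other = {"elements": [9], "pointers": []}
root = {"elements": [], "pointers": [node_hash(other), node_hash(leaf)]}
C = node_hash(root)
```

Here `verify_presence(C, [root, leaf], 5)` succeeds, while a path ending in a forged leaf fails the second check because its hash is not among the root's pointers.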
Embodiments of the present invention relate to a method parameterized by a length of features L and in some cases a factor K that relates to a number of features that may be proved absent from a given committed-to set of features. In one embodiment of the present invention, a committed-to set may be obtained beginning with a set of L-bit strings representing features that act as input. Such a set may be obtained from a set of arbitrarily-sized features by cryptographically hashing each feature and keeping the first L bits of each hash. The set may be committed-to by first constructing an HDAG that encodes the set of L-bit features and then publishing the root hash of the constructed HDAG to those who wish to confirm the commitment. The exact data structure used for the committed-to HDAG is a crucial factor in determining the resulting set commitment method's properties.
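The reduction of arbitrarily-sized features to L-bit strings can be sketched as shown below. The function name `l_bit_feature` is hypothetical, SHA-256 is an assumed choice of hash function, and "first L bits" is interpreted here as the leading bits of the digest in byte order.

```python
import hashlib

def l_bit_feature(feature: str, L: int) -> str:
    """Hash a feature and keep the first L bits of the hash as a bit string."""
    digest = hashlib.sha256(feature.encode()).digest()
    if not 0 < L <= 8 * len(digest):
        raise ValueError("L must be between 1 and the digest length in bits")
    bits = "".join(f"{byte:08b}" for byte in digest)
    return bits[:L]
```

A committed-to set could then be formed as `{l_bit_feature(f, L) for f in features}`. Smaller values of L shrink the tree but make it more likely that two distinct features collide on the same L-bit string.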
Each of the possible L-bit features (2^L in all) has a corresponding leaf node; that node contains “Y” if that feature is in the committed-to set and “N” otherwise. Prefix tree 200 corresponds to committed-to set S1={1, 5, 6} comprising three-bit number values (i.e., L is three). The tree 200, in accordance with embodiments of the present invention, comprises a root node 204, middle nodes 206, 208, 210, 212, 214, and 216, and leaf nodes 218, 220, 222, 224, 226, 228, 230, 232, and 234 having values as shown. The nodes of prefix tree 200 may be given implicit keys in accordance with the rule previously mentioned. Thus, a search of prefix tree 200 may be performed for the feature (key) five, which corresponds to bit string “101” (i.e., 5 = 101 in binary) and whose presence information is stored in leaf node 230. Accordingly, at the root (node 204) a search would proceed to the right because the first bit value in bit string 101 is 1. At the next node (node 208), the search would proceed to the left because the next value in the bit string 101 is 0. Then the search would proceed right again at node 214 because the last bit is a 1 in bit string 101. The search would reach leaf node 230 wherein information about the presence of 5 in the set is stored. The presence of 5 in the set is indicated by the bit value illustrated by “Y” stored in leaf node 230. If five were not present in set S1, the leaf node 230 would contain a bit value represented by “N.”
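The search just described depends only on the bits of the feature being looked up, which the following sketch makes explicit for S1={1, 5, 6} and L=3. The function names `search_path` and `leaf_values` are hypothetical, and leaves are identified by their bit strings rather than by reference numerals.

```python
def search_path(feature: int, L: int) -> list[str]:
    """Return the branch taken at each level: a 1 bit goes right, a 0 bit goes left."""
    bits = format(feature, f"0{L}b")
    return ["right" if b == "1" else "left" for b in bits]

def leaf_values(members: set[int], L: int) -> dict[str, str]:
    """Map every possible L-bit string to 'Y' (present) or 'N' (absent)."""
    return {
        format(v, f"0{L}b"): ("Y" if v in members else "N")
        for v in range(2 ** L)
    }

S1 = {1, 5, 6}
leaves = leaf_values(S1, 3)
```

For the feature five, `search_path(5, 3)` gives right, left, right, and `leaves["101"]` holds "Y". The tree has 2^3 = 8 leaves regardless of how many features are present.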
Alternate embodiments may place additional information in the leaf nodes corresponding to present features (e.g., the “Y” nodes of
The only actual information about what the set S1 contains resides in the leaf nodes: the superstructure or internal nodes of the tree may be exactly the same (modulo exact pointer values) no matter what features are actually present in the set. Moreover, the path (including whether to branch right or left) from the root node to the node containing the information about whether or not a given feature is present in the set depends only on that feature's name. Thus, supplying that path may constitute proof that that feature is/is not in the committed set. (If the relevant information about that feature's presence could be in multiple places, then multiple paths might have to be supplied.) These proofs may leak information because it is sometimes possible to figure out what a hash pointer references. In particular, if an adversary can guess to what a hash pointer points, he can easily check his guess by hashing his guess and seeing if its hash is the same as the value of the hash pointer. Guessing may become relatively easier near the bottom of the tree because there are so few possibilities for small subtrees.
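The guessing attack on an unblinded hash pointer can be sketched as follows. This is an illustrative assumption about how leaves might be encoded (the single bytes "Y" and "N", hashed with SHA-256); the function name `guess_leaf` is hypothetical.

```python
import hashlib
from typing import Optional

def h(data: bytes) -> bytes:
    """Assumed hash function for pointers (SHA-256)."""
    return hashlib.sha256(data).digest()

def guess_leaf(pointer: bytes) -> Optional[str]:
    """Try every plausible leaf value and report the one whose hash matches."""
    for guess in ("Y", "N"):
        if h(guess.encode()) == pointer:
            return guess
    return None  # the pointer does not reference a bare "Y" or "N" leaf
```

A pointer to a bare leaf is identified at once, for example `guess_leaf(h(b"N"))` returns "N". If unpredictable random bytes are mixed into the hashed content, the same enumeration no longer succeeds.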
While this data structure works well for small L, it may use too much storage for larger L because there is a node for every possible string: the required storage space increases exponentially with the length L. Using cryptographic hashing to reduce the size of features may reduce L somewhat.
Prefix tree 300 comprises a root node 304, middle nodes 306, 308, 310, 312, and 314, and leaf nodes 316, 318, and 320. Null pointers 322 are indicated by a slash. In accordance with embodiments of the present invention, the null pointers 322 may be holding a special hash value null (e.g., 0) that corresponds to no known data. Proofs are the same here as for the previous embodiment except that proofs that features are missing may end early in a null pointer instead of a leaf node containing an “N”. This may create a new information leak: as soon as a party recognizes a null pointer 322 (required for the proof to work), that party may know the associated sub-tree is empty (e.g., it has no “Y” leaves) and thus that the associated features are missing. This may not happen for proofs of presence.
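The early termination of non-membership proofs, and the leak it causes, can be sketched with a sparse prefix tree built as nested lists. The representation (`[left, right]` lists, the constant `NULL`, and the names `build` and `lookup`) is an assumption for illustration, and the example set {001, 101, 110} is assumed to be the same three-bit set used earlier.

```python
NULL = 0  # assumed special value standing for a null pointer

def build(strings: set[str]):
    """Build a sparse prefix tree; empty sub-trees collapse to NULL, present leaves are 'Y'."""
    if not strings:
        return NULL
    if strings == {""}:
        return "Y"
    left = {s[1:] for s in strings if s[0] == "0"}
    right = {s[1:] for s in strings if s[0] == "1"}
    return [build(left), build(right)]

def lookup(tree, bits: str) -> tuple[str, int]:
    """Follow bits from the root; report the outcome and how many bits were followed."""
    node = tree
    for depth, b in enumerate(bits):
        if node == NULL:
            return "absent", depth  # the trace ends early in a null pointer
        node = node[int(b)]
    return ("present" if node == "Y" else "absent"), len(bits)

tree = build({"001", "101", "110"})
```

Looking up 011 stops after two bits, which tells the verifier that every string beginning with 01 is absent, not just 011. Looking up 101 follows all three bits to a "Y" leaf.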
To prevent information from leaking due to guessing what hash pointers reference, randomness may be added to the data structure to which a commitment is being made. This may be done in many different ways. For example, a different random number may be placed in each leaf node in
Simply adding randomness to nodes in the embodiment depicted in
Blinded pointers may be created by storing (ordinary, non-set) commitments to hashes instead of just hashes in pointers. When a blinded pointer needs to be followed in a proof, a sub-proof revealing its underlying hash may be included. Any ordinary commitment method may be used. For concreteness, in our examples we use the following simple method: to commit to a value v, a random secret r is chosen yielding commitment token hash(v, r) where hash(−) is some cryptographic hash function; the proof that the committed value was v is (v, r); and the verification procedure is to check that the proof hashes to the commitment token.
c3=hash(r3,c7,c8) Equation 1
Similarly, the following two equations represent calculation of c7 (the value of null pointer 426) and c8:
c7=hash(r7,0) Equation 2
c8=hash(r8,“Y”) Equation 3
The hash c7 is calculated using the assumption that zero is the underlying value of a null pointer.
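Equations 1 through 3 can be evaluated with a short Python sketch. The helper names `commit_hash` and `verify_trace`, the length-prefixed encoding of the hash inputs, the 16-byte secrets, and the use of SHA-256 are assumptions made for this example; only the structure of the three equations comes from the text.

```python
import hashlib
import secrets

def commit_hash(r: bytes, *values: bytes) -> bytes:
    """Compute hash(r, v1, v2, ...) with each part length-prefixed to avoid ambiguity."""
    h = hashlib.sha256()
    for part in (r, *values):
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.digest()

# Random secrets held privately by the committer.
r3, r7, r8 = (secrets.token_bytes(16) for _ in range(3))

c7 = commit_hash(r7, b"\x00")  # Equation 2: blinded null pointer (underlying value zero)
c8 = commit_hash(r8, b"Y")     # Equation 3: blinded pointer to a "Y" leaf
c3 = commit_hash(r3, c7, c8)   # Equation 1: blinded pointer to the node holding c7 and c8

def verify_trace(c3: bytes, r3: bytes, c7: bytes, c8: bytes, r8: bytes, leaf: bytes) -> bool:
    """Verify a trace through c3 and c8 without ever learning r7."""
    return commit_hash(r3, c7, c8) == c3 and commit_hash(r8, leaf) == c8
```

A verifier given r3 and r8 can confirm the "Y" leaf, yet c7 looks like random bytes to that verifier because r7 is never revealed.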
Using blinded pointers as described above in accordance with embodiments of the present invention may prevent information from leaking. For example, a second party may attempt to determine whether c7 is a null pointer without authorization from a first party. However, the second party may not be able to recognize c7 as a null pointer because the second party may not have the value of r7, and thus the commitment may prevent an information leak. If a first party chooses to provide a proof of certain values in an HDAG, the first party may provide a node trace along with the secrets (e.g., r3, r8) associated with the pointers followed in the trace. For example, if a first party provides proof to a second party that leaf node 416 has a value of “Y”, the second party will receive c7 as part of the associated node trace because c7 is part of search node 410. However, in accordance with embodiments of the present invention c7 may appear random to the second party because the second party does not have r7, which may be required to interpret c7. The second party may only be given the secrets necessary to confirm the value of leaf node 416. Accordingly, the use of random values can essentially hide the destinations of all pointers (e.g., pointer 426), except those of pointers followed by a particular node trace.
It should be noted that in a prefix tree, the use of null pointers as illustrated by
If two non-membership proofs are issued involving the same commitment token, information may leak because the node traces of the two proofs may reveal that some node is reachable from the root in different ways. (A node is recognizable wherever it occurs because of its unique ordinary commitment tokens.) If an adversary discovers that a node can be reached in two different ways, he can be sure that no descendant of that node is a “Y” leaf. Discovering that a node is reachable in two different ways is the only way that information can leak when using the embodiment illustrated by HDAG 600 because of its use of blinded pointers and lack of null pointers. Because proofs of membership never traverse nodes reachable in multiple ways (remember that only nodes that contain “N” or have only “N” leaves are combined), if the committer limits himself to issuing at most one non-membership proof per commitment token, then no node will be revealed to be reachable in multiple ways.
Organizing a tree into regions such as the illustrated regions 710 may allow for more than one non-membership proof to be issued without leaking information. The regions allow division of the tree and reuse of nodes only within a region. If a tree is divided into regions based on the first K bits of each string, 2^K regions may be obtained with each region having L−K+1 special nodes that encode each of the possible sub-trees having no leaf node descendants with a “Y” value. Under a construction such as this, two non-membership proofs can have a reused node in common (and hence potentially reveal a node reachable in multiple ways) only if they are for two strings whose first K bits are the same. Thus, at the cost of O(N*L + 2^K*(L−K+1)) time and space, where N is the number of features in the set to be committed to, the committer can issue up to 2^K non-membership proofs without leaking information so long as each is for a string with a different K-bit prefix.
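The trade-off controlled by K can be worked through numerically. The sketch below simply evaluates the quantities named above for chosen values of N, L, and K; the function name `region_costs` and the dictionary keys are hypothetical, and the "bound" entry is the expression inside the O(...) with constant factors ignored.

```python
def region_costs(N: int, L: int, K: int) -> dict[str, int]:
    """Evaluate the region counts and the cost expression N*L + 2^K*(L-K+1)."""
    if not 0 <= K <= L:
        raise ValueError("K must be between 0 and L")
    regions = 2 ** K                  # one region per K-bit prefix
    special_per_region = L - K + 1    # special nodes for the all-"N" sub-trees of a region
    return {
        "regions": regions,
        "special_nodes": regions * special_per_region,
        "bound": N * L + regions * special_per_region,
        "max_nonmembership_proofs": regions,
    }
```

For example, with N=3, L=6, and K=2 there are 4 regions with 5 special nodes each (20 in total), the cost expression evaluates to 38, and up to 4 non-membership proofs can be issued provided each falls in a different region.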
The restriction on which non-membership proofs can be issued may be made less onerous by randomly assigning features to each region. To do this, a random permutation P may be applied to all of the features before they are added to a set. The permutation should be published or agreed upon before commitment time. For example, to commit to the binary features 000000 and 000001 using the procedure described above, the committer publishes P together with the set commitment token resulting from committing to the set {P(000000), P(000001)}. To prove that the binary strings 100000 and 100001 are not in the committed-to set, the committer may show proofs that P(100000) and P(100001) are not in the committed-to set. This may be done without risking privacy loss as long as P(100000) and P(100001) differ in their first K bits, which will happen with a probability of approximately 1 − 2^(−K). Note that the probability that the committer can do this is independent of the other contents of the committed-to set. Therefore, whether or not the committer can provide a proof does not leak extra information. If a party who will be requiring proofs is allowed to choose the permutation and possibly K as well, that party may be able to choose a permutation that definitely maps each of the non-membership proofs it might want to different regions. This does provide a little advance information about what proofs might be desired, but almost certainly not enough to matter.
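The probability that two permuted strings land in different regions can be computed exactly under one stated assumption: that P is a uniformly random permutation of all 2^L strings of length L. Under that assumption the image of the second string is uniform over the 2^L − 1 strings not taken by the first, of which 2^(L−K) − 1 share the first string's K-bit prefix. The function name `prob_different_regions` is hypothetical.

```python
from fractions import Fraction

def prob_different_regions(L: int, K: int) -> Fraction:
    """Exact probability that two distinct inputs map to different K-bit prefixes."""
    if not 0 <= K <= L or L < 1:
        raise ValueError("need L >= 1 and 0 <= K <= L")
    same_region = Fraction(2 ** (L - K) - 1, 2 ** L - 1)
    return 1 - same_region
```

With L=6 and K=2 this gives 16/21 (about 0.762), slightly above 1 − 2^(−2) = 0.75. The gap shrinks as L grows, so 1 − 2^(−K) serves as a close lower estimate.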
If the committer must be able to issue two or more proofs of non-membership under any circumstances, however unlikely, he can commit to his set multiple times, producing a different HDAG 800 each time, agreeing that his actual committed-to set is the intersection of all the sets he committed to. That is, under this scheme, a valid membership proof of feature f consists of one membership proof for feature f for each of the committed-to sets, and a valid non-membership proof of feature f consists of a proof that feature f is not a member of one of the committed-to sets. Since the non-membership proof limits are per HDAG 800, this means he is guaranteed to be able to issue at least one non-membership proof per committed-to set. This scheme, however, uses more storage and time than a single HDAG 800 with many regions.
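The bookkeeping behind the multiple-commitment scheme can be sketched as follows. This models only which proofs may be issued, not the underlying HDAGs: the class name `MultiCommitment`, its methods, and the simplifying assumptions that every copy commits to the same set and that each copy permits exactly one non-membership proof are all made for illustration.

```python
from typing import Optional

class MultiCommitment:
    """Track proofs across several independent commitments to the same set."""

    def __init__(self, actual_set: set[int], copies: int) -> None:
        # Each copy stands for a separately constructed HDAG over the same set.
        self.sets = [set(actual_set) for _ in range(copies)]
        self.used = [False] * copies  # has this copy issued its non-membership proof?

    def prove_member(self, feature: int) -> bool:
        # Membership in the intersection needs a membership proof from every copy.
        return all(feature in s for s in self.sets)

    def prove_nonmember(self, feature: int) -> Optional[int]:
        # Non-membership needs a proof from just one copy; use one not yet spent.
        for i, s in enumerate(self.sets):
            if not self.used[i] and feature not in s:
                self.used[i] = True
                return i  # index of the copy that supplies the proof
        return None  # no unused copy can supply a proof
```

With two copies of {1, 5, 6}, two non-membership proofs can be issued (one per copy) and a third request returns `None`, while membership proofs are unaffected by how many non-membership proofs have been issued.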
Data structure 900 (or any of the similar data structures that accomplish the same result) may be utilized in parallel with a data structure such as that represented by
HDAGs such as prefix tree 900 may be used alone to commit to a set if only membership proofs are needed and the size of the set must be provably limited. This type of HDAG has the advantage that it places no limits on the size of features. Additionally, multiple HDAGs such as HDAG 900 may be used to enforce different limits on the number of features belonging to different types. For example, one HDAG may be used to limit the number of URLs in a set to 10,000 and a different HDAG may be used to limit the number of keywords searched in the same set to 1,000. The proof of membership for a URL or keyword would be accompanied by the appropriate limit proof(s). It should be noted that while there are benefits to using the type of HDAG presented in
While the invention may be susceptible to various modifications and alternative forms, specific embodiments have been shown by way of example in the drawings and will be described in detail herein. However, it should be understood that the invention is not intended to be limited to the particular forms disclosed. Rather, the invention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the invention as defined by the following appended claims.
This application is a continuation-in-part of U.S. patent application Ser. No. 10/639,140, by Rajan M. Lukose and Joshua R. Tyler, entitled “Targeted Advertisement with Local Consumer Profile,” filed on Aug. 12, 2003, now abandoned, which is incorporated herein by reference.
Number | Date | Country |
---|---|---|
20050038774 A1 | Feb 2005 | US |

Relation | Number | Date | Country |
---|---|---|---|
Parent | 10639140 | Aug 2003 | US |
Child | 10934557 | | US |