The disclosure relates generally to the field of secure storage and retrieval of information.
This section introduces aspects that may be helpful in facilitating a better understanding of the inventions. Accordingly, the statements of this section are to be read in this light and are not to be understood as admissions about what is in the prior art or what is not in the prior art.
The proliferation of malicious software has made it increasingly difficult to ensure the secure retrieval of information from a database server. While various methods have been used to increase security, such methods typically rely on hardening the entities in the data retrieval transaction against malicious attack, e.g. by patching vulnerabilities as they are discovered. This approach is undesirable because it is inherently reactive, and is often expensive to implement.
One embodiment provides an apparatus, e.g. a database verifier, that includes an instruction memory and a processor operatively coupled to the instruction memory. The processor is configured by instructions on the memory to verify that a record set is authorized to be transmitted by comparing a received first authenticator value to a calculated second authenticator value determined from the record set and a received verification key.
Another embodiment provides a method, e.g. for verifying a dataset received from a database. The method includes receiving a database record set, a first authenticator value, and a verification key. A second authenticator value is computed from the record set and the verification key. The record set is transmitted only on the condition that the second authenticator value is equal to the first authenticator value.
Yet another embodiment provides a method, e.g. for forming a database verifier. The method includes placing a memory in operable communication with a processor. The memory is configured with instructions adapted to implement a method. The method includes receiving by the processor a database record set, a first authenticator value, and a verification key. A second authenticator value is computed by the processor from the record set and the verification key. The second authenticator value is compared by the processor with the first authenticator value.
In some embodiments the first and second authenticator values are each an aggregated message authentication code (MAC). In some such embodiments the aggregated MAC is a modulo sum of a plurality of MACs each determined for a single record of a database from which the record set is extracted. In some embodiments the instructions are further adapted to configure the processor to transmit the record set only on the condition that the first and second authenticator values are equal. In some embodiments the determination of the second authenticator value by the processor includes computing MAC tokens of each record in the record set, and determining the aggregated MAC from the recomputed MAC tokens. In some embodiments the memory is an unmodifiable memory. In some embodiments the processor is configured to receive the verification key via a network path different from the network path over which the database record set is received.
A more complete understanding of the present invention may be obtained by reference to the following detailed description when taken in conjunction with the accompanying drawings wherein:
The disclosure is directed to, e.g. apparatus, systems and methods for secure storage and retrieval of information from a database server.
Embodiments presented herein describe apparatus and methods to provide improved security of information retrieval from a database. An attacker may attempt to access entries from a database by installing a malicious program on a database server. The malicious program may then access entries of the database and send these to a destination of the attacker's choosing. Some conventional attempts to secure a database rely on detecting the presence of the malicious software, removing the malicious software, and patching the server against future installation of the malicious software. This strategy may be expensive and, worse, sometimes ineffective, as malicious actors continue to seek and exploit vulnerabilities in the server.
Embodiments herein provide improved methods, apparatus and systems for protecting such data by ensuring that a database server can only send data to one or more recipients in an established set of allowed recipients. The database server is constrained to send requested data to a verifier. The verifier utilizes a small nonvolatile memory or hardware-encoded instructions that ensure the verifier cannot be easily compromised. Attempts by the database server to send data to recipients outside this set are detected and blocked by the verifier. In some known database systems, an attacker could succeed by compromising only the database server. For embodiments described herein the attacker would need to compromise both the database server and the verifier to successfully intercept data from the database server. Because the verifier is robust to attack, the likelihood of a successful diversion of data is substantially reduced relative to such known database systems. Moreover, the database held by the database server, as well as other components of the system to access the database, may be modified without any need to modify the verifier. Also, the functions provided by the verifier may be limited to only those needed to receive the data, verify the authority of the recipient, and resend the data, thereby minimizing undesirable performance degradation such as increased latency of data requests.
The owner 110 may be a computing device or system under control of an ownership entity with an interest in controlling distribution of data in a database 125 stored by and/or accessible to the server 120. As described further below, the owner 110 provides database data and an authenticated access control list (ACL) 115 to the server 120. The ACL 115 may be accessible to the owner 110 in any manner, e.g. by transient or persistent local storage, or by network access.
In response to a query from the client 130, the server 120 provides a query result to the verifier 140, along with an authenticator value. The verifier 140 determines the validity of the query result, in part using a verification key received from the owner 110. In various embodiments the verification key is a secret key. In the event that the verifier 140 determines that the query result is valid, the verifier 140 provides the query result to the client 130. If instead the verifier 140 determines that the query result is not valid, then the verifier 140 may block transmission of the result to the client 130.
The verifier 140 may be a small computational entity, e.g. a program, with a limited interface sufficient to receive the requested record from the server 120 and provide the verified record to the client 130. The verifier 140 includes a communications interface (not shown) to, e.g. receive data from the server 120, and transmit data to the client 130. The verifier 140 may be implemented in a general purpose or specialized computing platform, in either case including a processor 146 and an instruction memory 148. A specialized computing platform may include, e.g. a finite state machine or dedicated microcontroller. The instruction memory 148 may be or include a medium that cannot be easily modified, e.g. hardware or read-only memory, to ensure that the verifier 140 cannot be tampered with by a potential attacker.
In various embodiments the verifier 140 receives the verification key from the owner 110 via a different network path than the path over which the verifier 140 receives the query result. In some embodiments the verifier 140 receives the verification key directly from the owner 110, e.g. without intermediate reception by another server entity. In particular it may be preferable to prevent the verification key from being transmitted via the database server 120, so that the database server 120 cannot tamper with the verification key. When the verification key is not transmitted via the database server 120, it may be transmitted via a network path different from the network path over which the database record set is received, e.g. as illustrated by the data path from the owner 110 to the verifier 140. Herein and in the claims, the verification key is received “directly” from the owner 110 when the verification key is received via a network path different from the network path over which the database record set is received from the database server 120.
In one embodiment, for each of the j entries for each row of the ACL 115, the owner 110 determines an ACL token τ as
$$\tau_{ij} = \mathrm{MAC}_k(R_i, ID_{ij}),$$
where the subscript k denotes the use of the verification key k in computing each token τ. In this embodiment
The set of all tokens $\tau_{ij}$ constitutes the authenticated ACL, which as previously described is sent to the server 120 by the owner 110. In some embodiments an access control policy may be represented more concisely by the owner 110, but any such representation can of course be converted to the ACL 115 as described above.
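The token computation described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: HMAC-SHA256 is assumed as the MAC, and the names `mac_token`, `build_authenticated_acl`, the length-prefixed message encoding, the key bytes, and the sample records and client IDs are all hypothetical.

```python
import hashlib
import hmac

def mac_token(key: bytes, record: bytes, client_id: bytes) -> bytes:
    """tau_ij = MAC_k(R_i, ID_ij), with HMAC-SHA256 standing in for the MAC (assumption)."""
    # Length-prefix the record so each (record, client_id) pair encodes unambiguously.
    message = len(record).to_bytes(4, "big") + record + client_id
    return hmac.new(key, message, hashlib.sha256).digest()

def build_authenticated_acl(key, records, acl):
    """Map (row index, client id) to its token for every entry of the ACL.

    records: list of record payloads, one per database row.
    acl:     list of lists; acl[i] holds the client ids allowed to read row i.
    """
    return {
        (i, client_id): mac_token(key, records[i], client_id)
        for i, allowed in enumerate(acl)
        for client_id in allowed
    }

# Hypothetical sample data.
key = b"hypothetical-verification-key"
records = [b"row-0 payload", b"row-1 payload"]
acl = [[b"alice", b"bob"], [b"alice"]]

tokens = build_authenticated_acl(key, records, acl)
```

In this sketch a token exists only for (row, client) pairs that the ACL authorizes, so the pair (row 1, `bob`) has no entry in `tokens`.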
When responding to a query by a client with ID equal to the contents of table 300 location ij, the server 120 determines a result set of database entries that meet the parameters of the query. The server 120 authenticates the result set by computing an aggregated MAC of the ACL tokens corresponding to the result set and the client ID. Those skilled in the pertinent art will appreciate that an aggregated MAC, or aMAC, combines a number of MACs into one small aggregated value from which the validity of the constituent MACs may be determined. Thus, as used herein and in the claims, an aggregated MAC is such a value.
As previously described, the verifier 140 is configured to be substantially impervious to alteration by virtue of its implementation in physically unchangeable features, e.g. hardware-encoded memory. Such implementation is typically significantly more expensive than implementations using transient memory. It may therefore be desirable to limit the size of the verifier 140 by reducing the storage and/or computational demands on the verifier 140. Consequently it may not be possible, practical or desirable to store the ACL 115 on the verifier 140.
Instead, the verifier 140 receives the aMAC from the server 120. The verifier 140 may test the validity of the data set received from the server 120 by recomputing the aMAC from the received data set and the verification key received from the owner 110. If the recomputed aMAC is not equal to the aMAC received from the server 120, it may be presumed that at least one datum in the received set is not authorized to be accessed by the recipient client. The verifier 140 may then block the transmission of all the data in the received data set. While the verifier 140 may not have sufficient information to determine which entries in the received set have been tampered with, blocking delivery of the entire set serves the purpose of ensuring the security of the data in the database.
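The all-or-nothing check performed by the verifier can be sketched as below. The sketch assumes HMAC-SHA256 as the MAC, aggregation by modular sum, and the Mersenne prime 2^61 − 1 as the modulus; the names `token_value`, `verify_and_forward`, the message encoding, and the sample data are hypothetical and not taken from the disclosure.

```python
import hashlib
import hmac

# Assumed modulus: the Mersenne prime 2**61 - 1 (a 61-bit prime).
P = 2**61 - 1

def token_value(key: bytes, record: bytes, client_id: bytes) -> int:
    """Recompute one MAC token as an integer (HMAC-SHA256 is an assumed MAC)."""
    message = len(record).to_bytes(4, "big") + record + client_id
    return int.from_bytes(hmac.new(key, message, hashlib.sha256).digest(), "big")

def verify_and_forward(key, records, client_id, received_amac):
    """Return the records if the recomputed aMAC matches, otherwise None.

    Returning None models blocking the entire result set: the verifier does
    not try to identify which individual record caused the mismatch.
    """
    recomputed = sum(token_value(key, r, client_id) for r in records) % P
    return list(records) if recomputed == received_amac else None

# Hypothetical sample data.
key = b"hypothetical-verification-key"
authorized = [b"row-0 payload"]
honest_amac = token_value(key, authorized[0], b"bob") % P

forwarded = verify_and_forward(key, authorized, b"bob", honest_amac)
# A result set padded with an extra record no longer matches the same aMAC.
blocked = verify_and_forward(key, authorized + [b"row-1 payload"], b"bob", honest_amac)
```

Note that the verifier in this sketch stores only the key and the modulus, not the ACL, which is consistent with keeping its storage demands small.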
It is preferable that the aMAC be computed in a manner that provides a high likelihood that alteration of any of the MAC values used to determine the aMAC will result in a modified aMAC value distinguishable from the authentic aMAC value. In a nonlimiting embodiment, the aMAC may be computed by summing the individual MAC values. In various such embodiments the sum is computed with modulo arithmetic, e.g., modulo an n-bit prime $p_n$. In some embodiments n can be selected to be at least 50. In various embodiments the operands of the aMAC computation are limited to tokens specific to the identity of the requester, e.g. a single client $C_{ij}$. In various embodiments the aMAC is determined as
$$\mathrm{aMAC}_{ij} = \sum_{i,j} \tau_k \bmod p_n$$
where the subscript i denotes that the sum is over all rows of the returned data set, and the subscript j denotes the identity of a single, specific client.
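The modular-sum aggregation can be illustrated with a toy example. The numbers below are hypothetical: the token values 5, 11 and 9 and the 4-bit prime 13 are chosen only so the arithmetic can be followed by hand, and the function name `aggregate_mac` is not from the disclosure.

```python
from typing import Iterable

def aggregate_mac(token_values: Iterable[int], prime: int) -> int:
    """Aggregate integer MAC tokens into one value by summation modulo a prime."""
    total = 0
    for value in token_values:
        total = (total + value) % prime
    return total

TOY_PRIME = 13  # illustration only; far too small for real use

# Tokens for one client across three returned rows (made-up values).
amac = aggregate_mac([5, 11, 9], TOY_PRIME)       # 25 mod 13 = 12

# Changing one token changes the aggregate here ...
tampered = aggregate_mac([5, 10, 9], TOY_PRIME)   # 24 mod 13 = 11

# ... but with so small a modulus a collision is easy to find:
# 24 is congruent to 11 modulo 13, so the aggregate is unchanged.
collision = aggregate_mac([5, 24, 9], TOY_PRIME)  # 38 mod 13 = 12
```

The collision in the last line suggests why a large modulus is preferred: the chance that an altered token leaves the aggregate unchanged shrinks as the number of bits n grows.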
The value of n may conveniently be related to the number of bits of a storage register in the specific implementation of the server 120 and/or the verifier 140. In some cases, n=7, but of course other values of n are possible, and are contemplated by the scope of the disclosure.
In some embodiments the owner 110 may compute the ACL tokens in a manner that results in reduced computational burden on the owner 110 and the verifier 140. In the previously described computation, the tokens τ were determined by summing along each ith database row, incrementing j. In some cases, such as when the number of authorized clients is large for a particular database row, the computation of τ may be computationally expensive. In such cases it may be more efficient to calculate the aMAC along the shorter index i. In such embodiments, the owner 110 and/or the verifier 140 may compute the token $\tau_{i0} = \mathrm{MAC}_k(i, R_i)$ and the tokens $\tau_{i1} = \mathrm{MAC}_k(i, id_{i1})$, $\tau_{i2} = \mathrm{MAC}_k(i, id_{i2})$, . . . , $\tau_{ij} = \mathrm{MAC}_k(i, id_{ij})$. The owner 110 and the verifier 140 then compute
$$\mathrm{aMAC}_N = \left(\tau_{i0} + \tau_{i1} + \tau_{i2} + \cdots + \tau_{iN}\right) \bmod p_n.$$
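A sketch of this per-row variant follows, in which the record and each client ID are authenticated by separate tokens bound to the row index i. HMAC-SHA256 as the MAC, the modulus 2^61 − 1, the one-byte tags that distinguish record tokens from client tokens, and the names `mac_int` and `row_amac` are all assumptions made for illustration.

```python
import hashlib
import hmac

# Assumed modulus: the Mersenne prime 2**61 - 1.
P = 2**61 - 1

def mac_int(key: bytes, row_index: int, payload: bytes) -> int:
    """MAC_k(i, payload) as an integer, with HMAC-SHA256 as an assumed MAC."""
    message = row_index.to_bytes(8, "big") + payload
    return int.from_bytes(hmac.new(key, message, hashlib.sha256).digest(), "big")

def row_amac(key: bytes, row_index: int, record: bytes, client_ids) -> int:
    """aMAC_N = (tau_i0 + tau_i1 + ... + tau_iN) mod p_n for a single row i."""
    # The b"R" and b"C" prefixes are a hypothetical way to keep the record
    # token and the client tokens from sharing an input encoding.
    tau_0 = mac_int(key, row_index, b"R" + record)
    client_taus = [mac_int(key, row_index, b"C" + cid) for cid in client_ids]
    return (tau_0 + sum(client_taus)) % P

# Hypothetical sample data.
key = b"hypothetical-verification-key"
value = row_amac(key, 0, b"row-0 payload", [b"alice", b"bob"])
```

Because addition is commutative, the order in which the client tokens are summed does not affect the result in this sketch.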
The computation of the aMAC, and the underlying MAC values, may be efficiently implemented due to the use of the verification key. This efficiency may reduce the computational complexity of the verifier 140, and therefore reduce the complexity of the physical implementation of the verifier 140. Moreover, because only a single aMAC value is transmitted by the database server 120 to the verifier 140, efficiency of the verification process is significantly enhanced, e.g. by reducing communication overhead.
Turning to
In a step 510 a database owner, e.g. the owner 110, determines an authenticated ACL associated with a database, e.g. the database 125. In a step 520 the owner transmits the database and the authenticated ACL to a database server, e.g. the server 120. The owner also transmits to a database query verifier, e.g. the verifier 140, a verification key k used to determine the authenticated ACL. In a step 530 the server receives from a database client, e.g. the client 130, a query for a dataset from the database. In a step 540 the server determines a query result and an authenticator value associated with the dataset, and in a step 550 transmits the query result and authenticator value to the verifier. In a step 560 the verifier determines a verification authenticator value from the received dataset and verification key. In a step 570 the verifier compares the received authenticator value and the verification authenticator value. In a step 580 the verifier transmits the dataset to the client only if the received authenticator value is equal to the calculated verification authenticator value.
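Steps 510 through 580 can be sketched end to end as a small simulation. The sketch is illustrative only: HMAC-SHA256 as the MAC, the modulus 2^61 − 1, the function names `owner_setup`, `server_answer` and `verifier_release`, and the sample data are assumptions. The server function deliberately takes no key argument, reflecting that the verification key is delivered to the verifier and not to the server.

```python
import hashlib
import hmac

P = 2**61 - 1  # assumed modulus (Mersenne prime)

def tau(key: bytes, record: bytes, client_id: bytes) -> int:
    """One ACL token reduced modulo P (HMAC-SHA256 is an assumed MAC)."""
    message = len(record).to_bytes(4, "big") + record + client_id
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % P

def owner_setup(key, records, acl):
    """Steps 510-520: the owner derives the authenticated ACL for the server."""
    return {(i, cid): tau(key, records[i], cid)
            for i, allowed in enumerate(acl) for cid in allowed}

def server_answer(records, tokens, row_indices, client_id):
    """Steps 530-550: the server builds a result set and aggregates stored tokens."""
    result = [records[i] for i in row_indices]
    # The server can only add tokens it was given; a missing token contributes 0.
    amac = sum(tokens.get((i, client_id), 0) for i in row_indices) % P
    return result, amac

def verifier_release(key, result, client_id, received_amac):
    """Steps 560-580: recompute, compare, and release only on equality."""
    expected = sum(tau(key, record, client_id) for record in result) % P
    return result if expected == received_amac else None

# Hypothetical sample data.
key = b"hypothetical-verification-key"
records = [b"row-0", b"row-1"]
acl = [[b"alice", b"bob"], [b"alice"]]
tokens = owner_setup(key, records, acl)

# alice is authorized for both rows, so her result set is released.
result_a, amac_a = server_answer(records, tokens, [0, 1], b"alice")
released = verifier_release(key, result_a, b"alice", amac_a)

# bob is not authorized for row 1, so the whole result set is blocked.
result_b, amac_b = server_answer(records, tokens, [0, 1], b"bob")
blocked = verifier_release(key, result_b, b"bob", amac_b)
```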
Although multiple embodiments of the present invention have been illustrated in the accompanying Drawings and described in the foregoing Detailed Description, it should be understood that the present invention is not limited to the disclosed embodiments, but is capable of numerous rearrangements, modifications and substitutions without departing from the invention as set forth and defined by the following claims.