The present disclosure relates to systems, components, and methodologies for providing network access control, forensics capabilities to identify network attackers and compromised clients or servers, enhanced protection for at-rest data in the event of a network breach, and other network security features.
Attacks on information networks have been increasing in frequency and success in recent years. Attack methods are becoming increasingly sophisticated, and network defense systems have not kept pace. Intrusion Detection Systems (“IDS”) and Intrusion Prevention Systems (“IPS”) utilizing signature- and statistics-based methods are not always sufficiently agile to address modern network attacks. With the rise of the Internet and computer networks, network security has become increasingly important. Similarly, increased use by organizations of centralized secure datacenters has made network security increasingly important.
Yet network attacks persist, showing that existing information/cyber security technology is not sufficient. These continuing attacks are reminders of how vulnerable network-connected computer systems are, and the regularity with which they are breached. Many of these breaches are the result of the exploitation of zero-day and metamorphic attacks, using previously unseen attack vectors, or metamorphic variants of known attacks, to strike at the vulnerable underbellies of networks.
Insider leaks of confidential information have also become more prevalent, as have losses of employee laptops and mobile devices containing proprietary information. These activities highlight the need for data networks with defenses against malicious insider behavior, and for data networks that minimize the effects of memory-scraping and unauthorized information access.
Cloud and mobile devices have become increasingly prevalent as well. Their increasing popularity highlights the need for information to be securely stored and accessible only by the intended user or authorized users. While passwords and tokens can offer some protection and authentication, a password can be compromised by social engineering, key loggers, or zero-day malware. Additionally, because notebook personal computers are increasingly used for e-Commerce, there is a growing need to make the notebook platform more trustworthy. In fact, in the mobile computing context, stolen data is often regarded as being more valuable than the mobile hardware itself.
According to the present disclosure, systems, components, and methodologies are provided for network access control, forensics capabilities to identify network attackers and compromised clients or servers, enhanced protection for at-rest data in the event of a network breach, and other network security features.
Disclosed embodiments address the above-described technical problems by providing a network security system that exploits space-time separation; that is, the network security system implements authentication and protection in multiple spatial positions, and implements authentication and protection mechanisms that vary over time in a joint spatial relationship. For example, files, directories, and users are identified and protected using space-time varying identities instead of fixed identities. The use of space-time separated and jointly-evolving relationships provides network defenses that can defend against a variety of attacks, including zero-day and metamorphic attacks.
In illustrative embodiments, the disclosed systems, components, and methodologies utilizing the space-time separated and jointly-evolving relationships also include sophisticated traceback and logging features that allow for identification of an attack's origin, the attack's culprits, and compromised botnets.
In illustrative embodiments, the disclosed systems, components, and methodologies utilizing the space-time separated and jointly-evolving relationships also provide enhanced protection of at-rest data stored within the network and traceback to the source of leakage.
In illustrative embodiments, the disclosed systems, components, and methodologies accept a request by a user to access data stored in a database; identify a sequence of security agents that will participate in authenticating the access of the data by the user; generate a sequence of passwords; check, at each one of the servers, a corresponding one of the passwords; determine that the user is permitted to access the data if all the servers accept the corresponding password; and vary the passwords over time. The security agents provide mutual support for each other using the space-time varying relationship. Because it is infeasible for attackers to compromise the space-time varying relationship, even when they become the superuser of an agent or client through a zero-day attack, any attempt to steal protected resources will violate that relationship. Hence, the attack and the data leak can be prevented. Furthermore, zero-day attacks involved in the attempt can be identified in real time.
Disclosed embodiments also address the above-described technical problems by providing systems, components, and methodologies that enhance security by splitting sensitive information (e.g., files or folders) into encrypted components and storing each encrypted component in respective spatially separated memory positions. Information regarding positions at which the data is split may be stored in a map, which itself is split into encrypted components stored in respective spatially separated memory positions. In illustrative embodiments, space/time-varying identifiers are assigned to each encrypted component of the data, and the space/time-varying identifiers are used to authenticate whether a given user is authorized to access the data. This provides fine-grained access control in an automatic manner, even for shared data. Using space/time-varying identifiers and associated protection (such as mutated ciphertext based on the space/time-varying relationship), data leaks can be prevented and any insider who is selling the information can be identified in real time.
In illustrative embodiments, the systems, components, and methodologies provide authentication by authenticating the user with multiple devices and passwords.
Disclosed embodiments also address the above-described technical problems by providing systems, components, and methodologies that provide a TPM-enhanced (or equivalent hardware-based security processor) cloud-based file protection system, rather than, for example, a solely-software security implementation.
The AES-GCM program and the file splitting-merging program work together. Both can be performed multiple times, depending on the required security strength. First, the targeted files are encrypted and authenticated; the encrypted file is then split, and the file pieces are distributed to the mobile device and PC as well as datacenter servers. Next, the index file is encrypted, split into pieces, and distributed to the server and the client PC as well as datacenter servers. The decryption-merging process is generally the inverse process. The mutated ciphertext is resistant to crypto side-channel attacks.
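The splitting-merging step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the input is treated as opaque bytes that have already been encrypted and authenticated (the AES-GCM step is omitted), and the names split_with_index and merge_with_index, the fixed piece size, and the shuffled slot numbering are all hypothetical.

```python
import secrets


def split_with_index(ciphertext: bytes, piece_size: int):
    """Split already-encrypted bytes into pieces stored under shuffled slot numbers.

    Returns (stored, index): stored maps slot -> piece, and index[pos] is the
    slot holding the pos-th piece. The index plays the role of the index file.
    """
    if piece_size <= 0:
        raise ValueError("piece_size must be positive")
    pieces = [ciphertext[i:i + piece_size]
              for i in range(0, len(ciphertext), piece_size)]
    index = list(range(len(pieces)))
    # Shuffle with a CSPRNG so slot numbers reveal nothing about piece order.
    secrets.SystemRandom().shuffle(index)
    stored = {slot: pieces[pos] for pos, slot in enumerate(index)}
    return stored, index


def merge_with_index(stored: dict, index: list) -> bytes:
    """Inverse of split_with_index: reassemble the pieces in their original order."""
    return b"".join(stored[slot] for slot in index)
```

In this sketch each slot could be sent to a different device (mobile device, PC, or datacenter server), and the index itself could be serialized, encrypted, and passed through split_with_index again, mirroring the treatment of the index file described above.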
Additional features of the present disclosure will become apparent to those skilled in the art upon consideration of illustrative embodiments exemplifying the best mode of carrying out the disclosure as presently perceived.
The detailed description makes reference to the accompanying figures in which:
Claim Terminology
Access Control List: One or more files or other programmatic representations that indicate which users or clients are permitted to access which respective pieces of information stored in one or more databases.
Client Security Ticket: A programmatic representation that includes one-time passwords, PIDs, and a merchandise request.
Cryptographic seed information: A value, such as a number, that is provided as an input to a mathematical operation to generate a security token. For example, one or more seeds may be provided as an input to a hash operation to generate a one-time password or a Pseudo-ID.
Database: A system or component containing computer-readable memory that stores data of interest to one or more users in a programmatically organized manner.
Log records: One or more files that contain information regarding network usage, including records of the clients or servers over which given data packets have traversed.
Map: A programmatic representation capable of being stored in computer-readable memory that contains the memory positions at which respective components of a data unit, such as a file or folder, are stored.
Network Security Ticket: A programmatic representation of messages passed between servers that include client security tickets and Xchain values.
One-time password: A password taking on a value capable of authenticating a user or client with a server either once or for a predetermined time subsequent to a first use of the password value. A one-time password may repeatedly take on new values capable of authenticating a user or client with a server subsequent to expiration of a given password value.
Password: A sequence of identifiers, such as characters, numbers, or words, that are uniquely associated with one or more users and that are used to identify, confirm the identity of, or authenticate actions taken by the one or more users.
Server: A system or component that includes software executing on hardware and that performs services in response to requests from one or more users or clients. Multiple servers may be provided in separate respective hardware units or multiple servers may be provided as separate software objects that run on a single hardware unit.
Storing data in spatially separated memory positions: Storing the data in disparate positions on one or more computer-readable memory components as compared to where a file system would store the data in a single write operation. The disparate positions may reside on one memory component in a single device or may reside on multiple memory components on multiple different devices, such as for example a PC, a cloud server, and a smartphone.
Time-dependent authentication vectors: A sequence of security tokens that vary over time and that are used by servers to identify, confirm the identity of, or authenticate other services.
Time-varying identifier: A security token that varies over time and that is used by servers to identify, confirm the identity of, or authenticate actions taken by the one or more users. For example, the time-varying identifier may vary after the lapse of a predetermined period of time, after transmission of a data packet, after a request from a user to access data has been serviced, or after a user logs off from a usage session.
Trusted Platform Module: A security chip, embodied in hardware, that can perform security operations, including to create and store cryptographic keys.
Other Terms Appearing in Disclosure
Binding: Encrypting a message using a key.
Botnet: A group of computers compromised by an attacker.
Client/client computing device: A device containing software, memory, and a processor that is accessed by a user to interface, directly or indirectly, with a database. For example, a client/client computing device may include a PC, laptop, workstation, or smartphone.
Customer: A programmatic representation of a combination of a user and/or client and one or more security tokens associated with the user and/or client.
Location: A physical computing device. Examples of locations include clients, servers, and databases.
Merchandise Request: A programmatic representation of the type of data request a user seeks to perform. For example, a merchandise request may be to read, write, or execute operations on data in a database.
NP-complete problem: A problem for which, in a worst-case scenario, there is no known algorithm that can solve the problem in polynomial time. Generally, if a problem is NP-complete, there is no known algorithm for solving all instances of the problem efficiently in less than exponential time.
Pseudo-ID (PID): A unique identifier used for identification or authentication that may change over time and that is generated through a mathematical operation based on a permanent identifier. For example, a PID may be used to identify a user, a client, an application, content, or pieces of content.
Security Agent: A type of server that provides network security functionality in connection with a user request to access data and that is in networked communication with one or more other security agents or clients.
Security token: A physical object or programmatic construct that is used to identify, confirm the identity of, or authenticate one or more users. Examples of security tokens include passwords, PIDs, PINs, badges, and smart cards.
State: Memory contents of a location or virtual location, or the combination of contents of locations or virtual locations.
Super Security Agent: A type of server that provides network security functionality and is in networked communication with one or more security agents, super security agents, or databases.
Super state: A type of state that is a combination of states.
Transform: A mathematical operation which accepts a set of states and/or locations or virtual location as inputs and produces a set of states and/or locations or virtual locations as outputs.
User Agent Software: Software that runs on a client and that interfaces with security agents, super security agents, and/or databases.
Virtual location: A software object with memory storage and data processing capabilities. A virtual location is capable of residing in one or more different physical locations.
Xchain values: Values that are the product of mathematical operations and used by servers to identify, confirm the identity of, or authenticate actions taken by other servers.
Slave device: A member of a botnet.
Xslices: Portions of data removed from encrypted ciphertext and stored separately, in either contiguous or non-contiguous locations.
Xbits: Bits of seeds that are removed from seeds and stored in a separate location.
The figures and descriptions provided herein may have been simplified to illustrate aspects that are relevant for a clear understanding of the herein described devices, systems, and methods, while eliminating, for the purpose of clarity, other aspects that may be found in typical devices, systems, and methods. Those of ordinary skill may recognize that other elements and/or operations may be desirable and/or necessary to implement the devices, systems, and methods described herein. Because such elements and operations are well known in the art, and because they do not facilitate a better understanding of the present disclosure, a discussion of such elements and operations may not be provided herein. However, the present disclosure is deemed to inherently include all such elements, variations, and modifications to the described aspects that would be known to those of ordinary skill in the art.
At least one disclosed embodiment utilizes the concept of a space-time separated and jointly-evolving relationship to provide network defenses that can defend against attacks including zero-day and metamorphic attacks. A description thereof may be provided with reference to an exemplary implementation called the Intrusion-resilient, Denial-of-Service resistant, Agent-assisted Cybersecurity System (IDACS), but it should be understood that the IDACS implementation described herein is merely an illustrative example in accordance with the present disclosure.
In one respect, according to illustrative embodiments, network security systems may be designed by mathematically defining “correct” network access behavior for protected information and services, and blocking all other behavior. The mathematically-governed access behaviors may provide sufficient complexity to be unpredictable to attackers, but may be easily verified by the security system. This design may provide three mathematically-related capabilities: i) rigorous but fast network access control; ii) efficient real-time forensics capabilities; and iii) further protection of at-rest data in case of a network breach.
The mathematical design that provides this level of protection may be based on the theory of the Space-Time Separated and Jointly Evolving relationship. This theory calls for space-time evolving relationships between authentication credentials, file/database systems, and protected data in the realms of space and time to render the breaking of the access control system mathematically infeasible. Furthermore, this space-time separated and evolving relationship may be encoded into network application layer packets, and become a means for rapidly tracing attacks back to the source attacker, thus providing real-time forensics capability. The relationship may also determine the storage locations of protected data (e.g., in a cloud) and authentication credentials (e.g., on security tokens) in a time-evolving manner so that it becomes infeasible for attackers to decode the dynamic relationships. Hence, three distinct capabilities (or modules) of a security system may be described by a single principle of the space-time separated and evolving relationship.
IDACS leverages the space-time separated and jointly-evolving relationship to defend against these types of leaks of at-rest data. It also provides detection, traceback and accountability for the sources of data leaks. By separating encrypted data into pieces that are useless by themselves and storing them in separate and time-changing locations, IDACS can greatly increase the security of stored data. The principles and methods by which IDACS provides this data security are described herein, along with proofs of the mathematical strength of these methods. Additionally, simulations will demonstrate the real-world effectiveness of such a system, even in the presence of a high number of insider traitors.
As mentioned above, IDACS may provide network security in three key areas: attack detection and prevention, digital forensics to identify the origin of the attack, and deep protection of at-rest encrypted data in case of a successful network breach and traceback to the source of leakage. IDACS combines these three aspects into a complex space-time relationship that provides mutual reinforcement between these aspects. A mathematical analysis of IDACS reveals that several facets of its network defense are NP-complete, presenting a potential attacker with an incredibly complex problem to solve. Multiple simulations of a fielded IDACS system demonstrate the high attack detection rate, network traitor identification rate, and data protection capabilities provided by this system.
IDACS similarly implements the concept of the space-time separated and jointly evolving relationship to achieve a high level of security in computer and information networks. Three aspects of IDACS facilitate this functionality. First, the space-time separated and evolving relationship is used as a basis for the IDACS Network Access Control protocol. By using multiple space-separated and time-evolving items for identifying an information or service access, e.g., file name and user ID, IDACS can efficiently allow legal access and block illegal access to the IDACS network. Second, the mathematical properties of the space-time separated and evolving relationship of the IDACS Network Access Control protocol provide a number of built-in forensics capabilities. Attacks by unauthorized users can be detected, blocked, traced back to the origin of the attack, and analyzed to determine what authentication items have been compromised, all in a very quick and efficient manner using the properties of this relationship. Third, IDACS uses the space-time separated and time-evolving relationship to protect at-rest encrypted data stored on network-connected devices (e.g., in the cloud or on PCs or mobile devices such as tablets or smartphones). IDACS uses jointly space-separated and time-evolving storage to store critical pieces of at-rest ciphertext in the IDACS network so that reassembling and decrypting the mutated ciphertext without access to the distributed pieces spread in the cloud is mathematically infeasible.
The space-time separated and evolving relationship aspect of authentication seeds is transparent to legitimate users, but it presents a virtually insurmountable barrier to attackers due to the NP-completeness of generating authentication credentials as well as the encoded file/database systems using space/time-varying IDs, locations, and protections. Additionally, this relationship aspect of authentication seeds and states contributes to the speed of the IDACS forensics capabilities.
Space separation can be understood by way of reference to computer access systems in which a user is required to have a password. One method would involve giving each user a unique password, such as a password tied to a user-specific username. By issuing different login credentials to different users, space separation of login credentials is achieved.
Another space separation concept is realized in a computer access system that has multiple authentication agents. For example, a system may require a user to authenticate with several authentication servers. The user may need to authenticate with each authentication server before access is granted to the system. Each authentication server may require a unique password or other authentication credential from the user; thus, possession of multiple passwords may be required for the user to use the system. In this manner, space separation of login credentials may be accomplished.
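The multi-agent form of space separation described above can be sketched as follows. This is a minimal illustration, assuming each authentication server holds its own expected credential; the function name all_agents_accept and the agent labels are hypothetical and do not come from the disclosure.

```python
import hmac


def all_agents_accept(expected_by_agent: dict, presented_by_agent: dict) -> bool:
    """Grant access only if every authentication server accepts its own credential."""
    results = []
    for agent, expected in expected_by_agent.items():
        presented = presented_by_agent.get(agent, "")
        # Constant-time comparison avoids leaking how many characters matched.
        results.append(hmac.compare_digest(expected.encode(), presented.encode()))
    # An empty set of agents must never grant access.
    return bool(results) and all(results)
```

Under this sketch, an attacker holding only one of the credentials is rejected, because the remaining servers do not accept.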
Time separation can be explained with reference to One-Time Passwords (OTP). In an OTP authentication system, a user may be given an OTP that may allow access to the computer system. Once the OTP has been used, it may be valid for a short period of time (e.g., t=60 seconds). After the OTP time period has expired, the OTP may no longer be used to log in, either by a legitimate user or an attacker that has managed to steal the OTP.
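Time separation of this kind can be sketched with a keyed hash over a time-window counter. This is an illustrative simplification, not the disclosed scheme: the validity window here is tied to the clock rather than to the first use of the OTP, and the names otp_for_window and otp_is_valid, the shared secret, and the 16-character truncation are assumptions.

```python
import hashlib
import hmac


def otp_for_window(secret: bytes, now: float, window: int = 60) -> str:
    """Derive the OTP for the time window that contains `now` (in seconds)."""
    counter = int(now // window)
    digest = hmac.new(secret, counter.to_bytes(8, "big"), hashlib.sha256)
    return digest.hexdigest()[:16]


def otp_is_valid(secret: bytes, otp: str, now: float, window: int = 60) -> bool:
    """Accept the OTP only while its time window is still current."""
    return hmac.compare_digest(otp_for_window(secret, now, window), otp)
```

With window=60, an OTP derived at t=120 seconds is accepted through t=179 and rejected from t=180 onward, whether it is presented by the legitimate user or by an attacker who has stolen it.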
The systems, components, and methodologies discussed herein also provide benefits and improvements in connection with real-time forensics for attack traceback capabilities and attack report correlation and aggregation capabilities. In contrast to alternative systems for digital forensics and attack report correlation, the space/time relationships exploited in accordance with the present disclosure have not been previously leveraged to provide speed and accuracy and to avoid ambiguity.
The systems, components, and methodologies provide still other benefits and improvements in connection with distributed data storage. Whereas alternative distributed data storage systems focus on scalability and redundancy for integrity and availability, the present disclosure addresses distributed storage for security purposes.
The following characterizations and notation are used as the basis for the description of the exemplary IDACS network discussed herein. As explained, the IDACS network is merely an illustrative embodiment in accordance with the present disclosure, and the characterizations provided below are to facilitate an explanation of the exemplary IDACS network.
Characterization 1: A location for the purpose of this exemplary description is a physical device with an associated physical location. The physical device includes memory storage and data processing capabilities. A virtual location for the purpose of this exemplary description is a software object with memory storage and data processing capabilities. A virtual location is capable of residing in different physical locations.
Characterization 2: A state for the purpose of this exemplary description represents the PID (Characterization 3) and memory contents associated with a piece of data that can change over time. It can also represent the memory contents of a physical location. The relationship between states and locations is further explained in Characterization 17.
Characterization 3: A location or state may be represented by a permanent, well-protected ID, or by a time-changing Pseudo-ID (PID). The PID may be computed according to a variety of methods. In illustrative embodiments, both a user and a client are assigned several different permanent IDs upon registration with the IDACS system. The user and client may hash these permanent IDs together with other pieces of secret and time-dependent information to generate time-changing PIDs. These PIDs may be used for both identification and authentication when the User attempts to log into the IDACS system. These PIDs may change between secure communication sessions as the secret time-dependent information changes. On user login, the user and/or client may exchange these PIDs with security agents via an encrypted tunnel. According to an embodiment in accordance with the present disclosure, the PID is derived by
PID(A)=hash(ID(A), crypto seeds, time-changing sequence number)
However, other computational techniques for generating the PID are within the scope of the present disclosure.
As used herein, PID(A) may also be represented implicitly as A. Specific exemplary applications of PIDs are discussed in Characterization 21.
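The PID derivation given above may be sketched as follows. SHA-256, the length-prefixed field encoding, and the function name pseudo_id are assumptions chosen for illustration; the disclosure does not fix a particular hash function or encoding.

```python
import hashlib


def pseudo_id(permanent_id: str, crypto_seeds: bytes, sequence_number: int) -> str:
    """Compute PID(A) = hash(ID(A), crypto seeds, time-changing sequence number)."""
    h = hashlib.sha256()
    for field in (permanent_id.encode(), crypto_seeds, str(sequence_number).encode()):
        # Length-prefix each field so that different inputs cannot
        # concatenate to the same byte string.
        h.update(len(field).to_bytes(4, "big"))
        h.update(field)
    return h.hexdigest()
```

Because the sequence number changes between sessions, the same permanent ID and seeds yield a different PID each time, while both sides of the exchange can recompute the expected value independently.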
Characterization 4: A transform for the purpose of this exemplary description is a mathematical operation which accepts a set of states and/or locations as inputs and produces a set of states and/or locations as outputs. In this disclosure, transforms may be represented by the notation F-box( ). In this notation, the parentheses contain a number of parameters which are inputs to the transform. The first parameter defines the actual internal operation of the transform. For example, a transform that computes a cryptographic hash of the inputs would be called F-box(hash), with “hash” being represented as Hash; the remaining parameters would detail the inputs to the hash function.
output=F-box(Hash, input data)
Transforms may be combined in a particular order to form new transforms. For example, a given transform may involve a lookup (Lookup) followed by a concatenate (Concat) of the outputs of the lookup. Transforms may be combined according to the following notation:
output=F-box(Lookup·Concat, input_1, input_2, input_3)
Many transforms make changes to their input superstates (e.g., Custψ as discussed in Characterization 17), although these changes are abstracted in this notation.
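The F-box( ) notation and the chaining of transforms can be sketched as ordinary function composition. This is an illustration only: the helper names lookup, concat, hash_, and f_box, and the use of SHA-256, are assumptions, and the changes that transforms make to their input superstates are not modeled.

```python
import hashlib


def lookup(table: dict, keys: list) -> list:
    """Lookup transform: fetch the value stored under each key."""
    return [table[k] for k in keys]


def concat(items: list) -> bytes:
    """Concatenate transform: join byte strings in order."""
    return b"".join(items)


def hash_(data: bytes) -> bytes:
    """Hash transform: SHA-256 digest of the input."""
    return hashlib.sha256(data).digest()


def f_box(operations: list, *inputs):
    """Apply the first operation to the inputs, then feed each result to the next."""
    first, *rest = operations
    result = first(*inputs)
    for op in rest:
        result = op(result)
    return result
```

For example, f_box([lookup, concat], table, keys) corresponds to a lookup followed by a concatenate of the lookup outputs, and appending hash_ to the list hashes the concatenated result.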
Characterization 5: Some variables discussed in connection with this exemplary description are a function of other variables; that is, if the value of variable A is a function of the values of variable B and time t, then the value of A depends on the value of B at time t. For the purpose of this exemplary description, this relationship is represented by the notation A:f(B, t). This relationship implies that B is the input to an F-box( ) that is used to calculate the value of output A.
Characterization 6: A set of elements Ē={E1, E2, . . . Ex} is used in this exemplary description to refer to a collection of elements. An ordered set, for the purpose of this exemplary description, shall be characterized as a set where the ordinality (order) of the elements in the set is one of the attributes of the set. Changing the ordinality of the members of Ē creates a different set Ē′. Therefore, if Ē={E1, E2, E3} and Ē′={E3, E1, E2}, then Ē≠Ē′. Unless specified, all sets are unordered.
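The distinction between ordered and unordered sets can be illustrated with a short sketch, in which a tuple stands in for an ordered set and a frozenset for an unordered one. The variable names are illustrative only.

```python
# Ordered sets: the order of the elements is an attribute of the set.
E = ("E1", "E2", "E3")
E_prime = ("E3", "E1", "E2")
ordered_equal = E == E_prime

# Unordered sets: only membership matters.
unordered_equal = frozenset(E) == frozenset(E_prime)
```

Here ordered_equal is False while unordered_equal is True, matching the statement that reordering the members of an ordered set produces a different set.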
Certain properties apply for the exemplary IDACS network.
Property 1: In the exemplary IDACS network, the Pwdθϵ
Characterization 15: Given the IDACS Network, when Userω seeks to use Clientρ to communicate with the IDACS servers at time t, Clientρ downloads a unique User Agent software program UAβ from the network. This UAβ handles communications between Clientρ and the IDACS servers. UAβ is considered a virtual location. UAβ is a function of Userω, Clientρ, and time, thus UAβ: f(Userω, Clientρ, t). UAβ is the entity that performs most of the operations on the client side in the IDACS Network, so the following characterizations and procedures in this illustrative discussion reference a single UAβ.
Characterization 16: Given the IDACS Network, at time t there are c sets of Userω, Clientρ, Badgeζ, Pwdθ, PINλ, and UAβ (denoted as {Userω, Clientρ, Badgeζ, Pwdθ, PINλ, UAβ}) that are authorized to access the network. These combinations are termed Customers Custψ, ψϵ[1, c]. Custψ is considered a state. Since Custψ represents a combination of the other parameters, Custψ: f(Userω, Clientρ, Badgeζ, Pwdθ, PINλ, UAβ, t).
Characterization 17: Given the locations characterized in the IDACS network, some of the following characterizations depend on the state that describes the configuration and memory contents of a combination of certain locations. These states represent a combination of other states as characterized in Characterization 2, so for purposes of the present illustrative discussion, they are termed super-states. The symbol represents the super-state covering the entire IDACS system, with other symbols representing more narrowly-defined super-states that are subsets of , e.g., Clientρ represents the state of Clientρ in combination with UAβ.
Clientρ:f(Clientρ, UAβ, t)
The characterization of depends mainly on the memory contents of different locations and the results of the lookup transform as characterized in Characterization 24. Similar notation is used for Badgeζ,
As explained, the locations, states, transforms, notations, and characterizations are merely provided to facilitate discussion of the illustrative IDACS network. They are summarized in Table 2 for reference.
Client-side operations of IDACS. Details are now provided regarding how the IDACS Network Access Control protocol is handled for Customer authentication and authorization to allow customers to access data or services residing on a DB.
Characterization 18: Given the set
Characterization 19: Ticketψ uses a Merchandise Request Reqψ which communicates the specifics of the desired network action. Reqψ is considered a state. Reqψ specifies the request type (e.g., Read/Write/Execute a piece of data on DBγ), the unique PID for Custψ, the Content(PIDε) tied to the specified data (as characterized in Characterization 22), and the data itself. The mechanics of the formation of Reqψ also depend on Custψ; Reqψ: f(Custψ,
Characterization 20: Ticketψ, uses a set
Since OTPχ are data structures, they are considered states. These OTPχ are used for pairwise authentication between Custψ and each SAχ. Each calculated OTPχ is a function of the Custψ calculating it, the SAχ which will be verifying it, and time t; thus, OTPχ:f(Custψ, SAχ, χ, t). The set
Characterization 21: Ticketψ uses a set
Characterization 22: Given
The Content PID indicates the data being accessed in a Read or Execute operation, or establishes a data PID for future reference in a Write operation. Permission is granted to different Custψ to access different pieces of data residing on DBγ; checking the permissions of Custψ to access a requested piece of data is part of the IDACS Network Access Control mechanism. To protect Content(PIDε) for data residing in
Access to sensitive information may also be controlled by means of authorization privileges (permissions). The SSAs may maintain an Access Control List, which may specify which clients and which users are permitted to access which pieces of sensitive information. The SSAs may also share this list with the SAs. Whenever an SA or an SSA handles an information access request, the calling client and/or user may be checked against the Access Control List for the requested piece of information.
Pieces of information residing in the database may be tied to a unique Content ID, and accessible by one or more, though perhaps not all, user/client combinations, which provides space separation. When a user attempts to access a piece of information, the user may be required to provide a collection of different authorization items proving permission to access the information. Each SA and SSA may thus possess a copy of an Access Control List (ACL), which may contain entries corresponding to the information on the Database and the related Content IDs. The ACL record may consist of several time-varying authorization PIDs, which provides time separation, associated with both the information's permanent Content ID and the user/client identity, all of which the user/client may be required to provide correctly to be authenticated by the SA or SSA. Because the ACL records may change with space and time, attacks against IDACS may be exponentially more difficult, as discussed herein.
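The Access Control List check described above can be sketched as follows. This is a minimal illustration with hypothetical names (AclEntry, is_authorized) and a reduced set of fields: only a user PID, a host PID, a Content PID, and a validity period are modeled, and the PIDs are treated as opaque strings that the SA or SSA already knows for the current period.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclEntry:
    user_pid: str
    host_pid: str
    content_pid: str
    valid_from: float   # start of the period in which access is allowed
    valid_until: float  # end of that period (exclusive)


def is_authorized(acl, user_pid: str, host_pid: str, content_pid: str, now: float) -> bool:
    """Return True only if some entry matches every presented item at time `now`."""
    for entry in acl:
        if (entry.user_pid == user_pid
                and entry.host_pid == host_pid
                and entry.content_pid == content_pid
                and entry.valid_from <= now < entry.valid_until):
            return True
    return False
```

In this sketch a request fails if any single item is wrong or stale, which reflects the requirement that the user/client provide all authorization items correctly.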
The ACL may contain entries with the following fields: User PID, Host PID, Source IP Address, Destination IP Address, Current Application PID, Parent Application PID, Content PID, Network protocol PID, Host Statement of Health, Host OS PID, Network path (PIDs of SAs and SSAs), Valid Time Period when Information Can Be Accessed. All of the above-mentioned PIDs may be generated by hashing different pieces of information tied to a particular PID, such as the permanent ID associated with that PID, the time-varying secret associated with that PID (changed each time a new client-SA security tunnel is established), a transaction number that may monotonically increase with each transaction (read or write operation), and a publicly known permanent string associated with that type of PID. The different types of PIDs may equate to the different “flavors” (F) as discussed herein. In illustrative embodiments, for the above-mentioned PIDs, a PIDk may be computed in the following way:
PIDk=h2(PID_secretk, IDk, Transaction numberk, PID−specific_stringk)
Here, “h2(M)” indicates M being hashed twice. When known and secret items are hashed together, it may be easier to reverse engineer the secret items if the known information is put at the beginning of the hash. Therefore, in this illustrative embodiment, the secrets are placed at the beginning of the hash. These four items are generated by one of the SSAs and distributed to the client and relevant SAs and SSAs at the client's request. The method for distributing these items is discussed further herein. Additionally, these items may be stored in a distributed (space-separated) manner on both the client and SAs. The distributed storage may contribute to the security of the system, and may also assist in the traceback algorithms to be discussed herein.
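The PID computation above may be sketched as follows. This is a minimal illustration, assuming SHA-256 as the hash function and a length-prefixed byte encoding of the four items; the encoding, the 8-byte transaction counter, and the function names are assumptions and are not specified by the description.

```python
import hashlib


def h2(data: bytes) -> bytes:
    """Hash the input twice (SHA-256 is assumed here)."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def compute_pid(pid_secret: bytes, permanent_id: bytes,
                transaction_number: int, pid_specific_string: bytes) -> bytes:
    # The secret is placed first and the publicly known string last,
    # following the ordering given in the description.
    fields = [
        pid_secret,
        permanent_id,
        transaction_number.to_bytes(8, "big"),  # assumed fixed-width counter
        pid_specific_string,
    ]
    # Length prefixes (an assumption) keep the concatenation unambiguous.
    encoded = b"".join(len(f).to_bytes(4, "big") + f for f in fields)
    return h2(encoded)
```

Because the transaction number is one of the hashed items, each read or write operation yields a different PID in this sketch, even when the secret and permanent ID are unchanged.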
Property 2: Although
Notation: When a piece of data A is stored at a location or state B at time t, for purposes of the present description, this is indicated by the notation A⋄(B, t). However, the time parameter is often abstracted, so the notation may be simplified to A⋄B.
Characterization 23: Given
Characterization 24: Given the sets
where Clientρ can be replaced by Badgeζ, PINλ, or Pwdθ, and (OTP, χ) may be replaced by (PID, ε). For some combinations of inputs, the output set may be an empty set, i.e. j=0 and
Property 3: The members of
Characterization 25: Given a group of n generic objects O1, O2, . . . , On, the F-box(concatenate) transform accepts this group of objects as input and outputs the objects concatenated into an ordered set. The generic objects may be individual objects, or they may be sets of objects. In equation notation, the “concatenate” is represented by Concat. For example,
Characterization 26: The F-box(random) transform generates a random byte array. For this illustrative embodiment, the array may be 256 bits long (corresponding to the SHA-256 hash algorithm).
XV1=F-box(rand)
Characterization 27: Given a generic set of inputs, the F-box(hash) transform applies a cryptographic hash function (e.g., SHA-256) to a byte array representation of the inputs and outputs the resulting byte array.
output=F-box(Hash, inputs)
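The random and hash transforms of Characterizations 26 and 27 may be sketched as follows. The function names and the length-prefixed byte representation of the inputs are assumptions made for illustration; the description only requires a 256-bit random array and a cryptographic hash such as SHA-256.

```python
import hashlib
import os


def fbox_random(n_bytes: int = 32) -> bytes:
    """Return a random byte array; 32 bytes corresponds to 256 bits."""
    return os.urandom(n_bytes)


def fbox_hash(*inputs: bytes) -> bytes:
    """Hash a byte-array representation of the inputs with SHA-256."""
    h = hashlib.sha256()
    for item in inputs:
        # Assumed encoding: each input is length-prefixed before hashing.
        h.update(len(item).to_bytes(4, "big"))
        h.update(item)
    return h.digest()
```

For example, `fbox_hash(b"a", b"bc")` and `fbox_hash(b"ab", b"c")` produce different outputs in this sketch because of the length prefixes.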
A specific instance of this transform operates as follows. Given an item type OTP or PID of index χ or ε, the transform accepts the item type (OTP or PID), the index (χ or ε), and the associated set of seeds i.e
OTPχ=F-box(Hash, Custψ,
Property 4: Due to Property 3, the outputs of the F-box(Lookup) transform and the composition of
Given Characterization 17, Characterization 24, and the related Property 3, the following Theorems may be stated to show the benefits of the systems, components, and methodologies in accordance with the present disclosure. Both of these Theorems are proved below:
Theorem 1: Consider the F-box(Lookup) transform, which takes as inputs (a) a super-state Clientρ, Badgeζ, PINλ, or Pwdθ (each of which contains cryptographic seeds), (b) the super-state Custψ, and (c) an OTP or PID index; (b) and (c) are used together to determine the identity and order of the seeds returned by the transform. This transform returns an ordered subset of the seeds derived from (a). An attacker who wishes to recreate the F-box(Lookup) transform and has access to (c) and all or part of (a), but not (b), faces an NP-complete problem due to the order of the output seeds.
Theorem 2: Given the IDACS system and an attacker who is trying to calculate
As mentioned,
The IDACS network 100 includes databases 126. In operation, the users 102, 104, and 106 seek to access data stored in databases 126, write data to memory locations in databases 126, and/or execute operations on data stored in databases 126. To do so, the users 102, 104, and 106 access the IDACS network through client computing devices 128, 130, and 132, respectively. The users 102, 104, and 106 and the client computing devices 128, 130, and 132 may need to register with the IDACS network. The IDACS network 100 may include two sets of servers, security agents 134 and super security agents 136. The IDACS network 100 may include a configurable number of security agents 134 and super security agents 136, which may act as security barriers for accessing data in the databases 126. The data of interest may reside on one of the databases 126 or may be split, as described in more detail herein, among multiple databases 126.
The security agents 134 and the super security agents 136 play a role in authenticating the users 102, 104, and 106 to ensure that the users 102, 104, and 106 are authorized to access the databases 126 and/or are authorized to access or execute operations on the data of interest within databases 126. In certain embodiments, the client computing devices 128, 130, and 132 may authenticate with each of security agents 134 individually, with the authentication process being monitored and further authenticated by the super security agents 136, providing space separation of authentication. Authentication credentials to be discussed herein may change per transaction, per session, and/or per packet, which provides time separation and joint evolution of authentication credentials. When the User/Client successfully authenticates with IDACS, access is granted to the information, which may be stored in one location on one Database, or may be spread across multiple Databases (space separation of information).
To interface with the security agents 134, the super security agents 136, and the databases 126, user agent software components 138, 140, and 142 are downloaded from the network (e.g., from security agents 134, super security agents 136, or elsewhere) and run on the client computing devices 128, 130, and 132. In certain embodiments, a different application must be downloaded for each new session, thus providing the system with time evolution. Each application that is downloaded may have a random, unique Application PID. The security agents 134 and/or super security agents 136 may maintain logs detailing which Application PIDs were issued to which client computing devices 128, 130, and 132 at which times. The user agent software components 138, 140, and 142 may handle all IDACS communication between the client computing devices 128, 130, and 132 and the rest of IDACS.
The security agents 134 and super security agents 136 are depicted in
In other implementations, the security agents 134 and super security agents 136 may merely reference hardware or software components that are part of a single device, such as a router, gateway, or other network-enabled electronics component. In such instances, all the network security functions disclosed with respect to the security agents 134 and super security agents 136 disclosed herein may be provided within a single network component.
The databases 126 can be conventional databases, such as those operating in Oracle®, DB2, or SQL Server environments. The databases 126 may represent cloud storage solutions, such as Google® Drive, Microsoft® Cloud, Amazon® Cloud Drive, or others. The databases 126 may generally represent any device with memory capable of storing programmatic data. In exemplary implementations in which the security agents 134 and super security agents 136 are provided within a single network-enabled electronics component, the databases 126 may be provided within that same network-enabled electronics component.
The client computing devices 128, 130, and 132 may be implemented as a mobile smartphone (e.g., Android®, Apple® iOS device, Windows® Phone device, Blackberry®, etc.), tablet, a Personal Data Assistant (PDA), a PC/workstation/laptop (e.g., running Windows®, Unix®, Linux®, Apple® operating systems, etc.), and the like. The client computing devices 128, 130, and 132 will generally include network connectivity, such as cellular network connectivity and/or wireless local area networking capabilities (i.e., “WiFi”) or Ethernet. The client computing devices 128, 130, and 132 will generally include a processor, a memory (e.g., RAM), and a hard drive. Client computing devices 128, 130, and 132 will operate according to program logic implemented by computer source code that is compiled into object code and stored on a memory, from where it is read and executed by a processor. Certain programming languages are disclosed herein, but any suitable programming language can be used, such as C, C++, Java, scripts, and the like.
According to certain illustrative embodiments, the IDACS system utilizes one or more of three basic types of authentication, alone or in combination with other features disclosed herein. These forms of authentication may be used to ensure that a User/Client combination is only allowed to access information it is authorized to access. The first is the One-Time Password (OTP), which provides user-SA authentication to verify the identity of the user. The second is the Access Control List (ACL) PIDs, which are used by the SAs and SSAs to further verify the identity of the user as well as the user's information access permissions (the OTPs and the PIDs may collectively be referred to as the Client Security Ticket). The third is the Network Security Ticket, which may be used for SA-SSA authentication to prove that an information access request has previously been authenticated by a genuine SA or SSA.
The systems, components, and methodologies by which the IDACS network 100 operates are discussed in further detail herein.
Exemplary implementations for the Client-side operations in accordance with the present disclosure for IDACS network authentication and authorization are now provided with reference to the above-provided characterizations.
Note that for purposes of the present disclosure, the notation A→B:C indicates that message C is being sent from party A to party B. Parenthetical notations are provided in the following descriptions with reference to algorithm line numbers.
In Algorithm 1, UAβ first calculates the
The
In Algorithm 2, if
OTP, χ = F-box(Lookup · , Custψ, OTP, χ)
PID, ε = F-box(Lookup · , Custψ, PID, ε)
With the above-provided understanding of an exemplary IDACS client-side authentication and authorization procedures for gaining access to data stored on a DB in mind, details are now provided regarding corresponding exemplary network-side procedures that verify
Characterization 28: Given
Characterization 29: Given Ticketψ,
Characterization 30: Given Ticketψ which is sent by Custψ to the IDACS network, the
χ=F-box(Next, Ticketψ, SAχ,
PID(SAχ) or PID(SSAκ) are shared among all SAχ or SSAκ, but in this exemplary implementation they are not shared with Custψ. In this exemplary implementation, the F-box(next-SA-SSA) transform only occurs at SA or SSA locations.
Characterization 31: Given Ticketψ and XV values, the complete messages passed between multiple SAχ and SSAκ are termed Network Security Tickets, denoted TKA, A∈[1, 5]. Details regarding the Network Security Tickets are shown in Algorithm 4. TKA is a concatenation of the relevant Ticketψ and Xchain values.
Characterization 32: Given network message B, any SAχ or SSAκ processing B will record a Security Ticket Log Record \B\ detailing the vital information regarding B (e.g., the time B was processed, the IP address of the Custψ that sent B, etc.). B may be any Ticketψ or TK network message. The logs residing on SAχ or SSAκ are part of SAχ or SSAκ.
Characterization 33: Given Characterization 32, any SA or SSA that processes a network message (e.g., TK) records a log record \TK\ using the F-box(insert-log-record) transform. This transform accepts SAχ or SSAκ and TK as inputs and outputs an updated version of SAχ or SSAκ which contains \TK\. The F-box(insert-log-record) transform is represented in equation notation by
SAχ=F-box(Insert, SAχ, TK)
Characterization 34: Given Characterization 33, any SAχ or SSAκ may search its own log entries for a given \TK\ that matches certain input parameters such as time, IP address of sending Custψ, etc. These input parameters are not rigidly defined, and may exist in many combinations. The F-box(retrieve-log-record) transform accepts SAχ or SSAκ and a list of conditions as inputs and outputs one or more matching log records \TK\, or ‘null’ if no matching records are found. This transform is represented in equation notation by
\TK\=F-box(Rtrv, SAχ, {conditions})
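The insert-log-record and retrieve-log-record transforms of Characterizations 33 and 34 may be sketched as follows. The record fields, class names, and exact-match search conditions are hypothetical; the description leaves the search parameters open.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    # Hypothetical "vital information" recorded for a processed message.
    time: float
    source_ip: str
    ticket_digest: str


@dataclass
class SecurityAgent:
    name: str
    log: list = field(default_factory=list)


def fbox_insert(agent: SecurityAgent, record: LogRecord) -> SecurityAgent:
    """Return the agent updated to contain the new log record."""
    agent.log.append(record)
    return agent


def fbox_retrieve(agent: SecurityAgent, **conditions):
    """Return the matching log records, or None if no record matches."""
    matches = [
        r for r in agent.log
        if all(getattr(r, key) == value for key, value in conditions.items())
    ]
    return matches or None
```

A call such as `fbox_retrieve(sa, source_ip="10.0.0.5")` returns every record logged for that address, and returns `None` when the log holds no such record.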
At the beginning of Algorithm 4, values set forth below reside at the indicated locations after the last iteration of this algorithm, or are sent to (Ticketψ) or generated at (XV1 and XV4) the indicated locations during the first iteration of the algorithm.
The exemplary network-side authentication and authorization process described in Algorithm 1 is carried out in the function described in Algorithm 4.
The initial inputs to Algorithm 4 are handled separately depending on whether this is the first call of the function (Algorithm 1(8) with n=1) or a subsequent call. For the first call of the function, SA1 has been randomly selected by Custψ, Ticketψ has been sent from Custψ to SA1 (Algorithm 1(5) connected to Algorithm 4(2)), and XV1 and XV4 have been randomly generated by SA1 and SSA1, respectively (Algorithm 1(6) connected to Algorithm 4(1) and (7)). For subsequent function calls, SA1 and SSA1 in the current Algorithm 4 function call are SA2 and SSA2 from the previous Algorithm 4 function call, and Ticketψ resides at SA1 as a consequence; XV1 and XV4 in the current Algorithm 4 function call are XV2 and XV5 from the previous Algorithm 4 function call, respectively (Algorithm 1(8) connected to Algorithm 4(1) and (3)) (see
While SA1 and SSA1 verify OTPχ and
Algorithm 5 outlines the procedure used by SAs and SSAs to verify the
The following properties explain the operation and purpose of the Xchain values.
Property 5: The procedure outlined in Algorithm 4 provides mutually-supported authentication between the SAs and SSAs authenticating Ticketψ.
Property 6: When the run_auth_chain( ) algorithm is cascaded, XV2 and XV5 of one iteration are, in fact, the XV1 and XV4, respectively, for the next iteration; cascaded iterations of the run_auth_chain( ) algorithm are seamlessly integrated (
In this way, consecutive iterations provide mutually-connected authentication for each other.
M1 ⇒ M1′ means M1′=CBC-MAC(K, M1)
Here, K is the key shared between the machine performing the operation and the machine that verifies the operation (in this case, SAα and SAβ). In this case, the result of the operation is
IVSA ⇒ XSAα,1 (1)
Here, “α,1” correlates with the time and location parameters. SAα may then pass [IVSA, XSAα,1, Client Security Ticket] to SAβ and [XSAα,1, Client Security Ticket] to SSAa. SSAa, after checking the Client Security Ticket PIDs, may perform the operation
XSAα,1 ⇒ XSAα,2 (2)
The key shared between SSAa and SAβ may be used. SSAa may then pass [XSAα,2, Client Security Ticket] on to SAβ. SAβ may now be able to verify the correctness of the values of XSAα,1 and XSAα,2 based on the previous calculations. Verifying the relationship (IVSA ⇒ XSAα,1) can authenticate SAα, while verifying the relationship (XSAα,1 ⇒ XSAα,2) can authenticate SAα and SSAa together.
As the database access request passes through the SA-SSA layer, each three-machine combination in the process may perform an authentication process like the one described above.
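The three-machine check described above may be sketched as follows. Because the Python standard library does not include a block cipher, this sketch substitutes HMAC-SHA256 for CBC-MAC as the keyed MAC; that substitution, the function names, and the key handling are assumptions made only for illustration.

```python
import hashlib
import hmac


def mac(key: bytes, message: bytes) -> bytes:
    # Stand-in for CBC-MAC(K, M); HMAC-SHA256 is used only for illustration.
    return hmac.new(key, message, hashlib.sha256).digest()


def sa_alpha_step(key_alpha_beta: bytes, iv_sa: bytes) -> bytes:
    """SA-alpha derives the first chain value from the initial value."""
    return mac(key_alpha_beta, iv_sa)


def ssa_step(key_ssa_beta: bytes, x1: bytes) -> bytes:
    """SSA-a, after checking the ticket PIDs, derives the second chain value."""
    return mac(key_ssa_beta, x1)


def sa_beta_verify(key_alpha_beta: bytes, key_ssa_beta: bytes,
                   iv_sa: bytes, x1: bytes, x2: bytes) -> bool:
    """SA-beta recomputes both links of the chain and compares them."""
    link1_ok = hmac.compare_digest(x1, mac(key_alpha_beta, iv_sa))
    link2_ok = hmac.compare_digest(x2, mac(key_ssa_beta, x1))
    return link1_ok and link2_ok
```

In this sketch, a second chain value produced without the key shared between the SSA and the verifying SA fails the second comparison, which corresponds to the case in which the SSA did not approve the ticket.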
If an attacker controls a single SA or SSA, this condition can be detected by the IDACS network. For example, if SAα is controlled by an attacker and clears an unauthorized Client Security Ticket (unauthorized due to OTP or ACL PID violations) and provides a correct XSAα,1 value, SSAa may not authorize the Client Security Ticket (based on OTP/PID) and may not generate the correct XSAα,2 value. Thus, SAβ can quickly detect that the Client Security Ticket was not correctly authorized by SAα and SSAa in combination. If an attacker can control both SAα and SSAa, then SAβ may be fooled into believing the Client Security Ticket was approved. Alternatively, the SAs and SSAs may be fooled if the Client Security Ticket could be successfully forged. However, forging a Client Security Ticket without access to the user's security token (e.g., password, smart card, client machine, etc.) is an NP-complete problem. This cross-connected authorization checking provides a means to detect the compromise of the user's security tokens and can therefore provide an effective defense against zero-day malware attacks.
In illustrative embodiments, the process of choosing SAα, SAβ, SSAa, etc. may be pseudo-random, but predictable. At the beginning of the authentication chain, the client randomly chooses SAα and may send it the Client Security Ticket. SAα may then perform an operation to determine the path of the Client Security Ticket. SAα may hash the Client Security Ticket and then compare the result to the PIDs of all other SAs that they use to identify themselves to each other.
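One way such a pseudo-random but predictable selection might work is sketched below. The description does not state how the hash is compared to the SA PIDs, so the rule used here (choose the SA whose PID has the smallest XOR distance to the SHA-256 hash of the ticket) is purely an assumption, as are the function and parameter names.

```python
import hashlib
from typing import Mapping


def next_sa(ticket: bytes, sa_pids: Mapping[str, bytes]) -> str:
    """Pick the next SA deterministically from the ticket contents."""
    digest = int.from_bytes(hashlib.sha256(ticket).digest(), "big")
    # Assumed comparison rule: smallest XOR distance between hash and PID.
    # The name is included in the key so that ties resolve deterministically.
    return min(
        sa_pids,
        key=lambda name: (digest ^ int.from_bytes(sa_pids[name], "big"), name),
    )
```

Any machine holding the same ticket and the same PID list reaches the same choice in this sketch, which is what allows a later SA or SSA to check that the ticket followed the expected path.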
Consider the following scenario: an attacker wishes to replicate the IDACS Network Access Control procedure for a legitimate Custψ to impersonate Custψ and gain access to Custψ's data residing on DBγ. To impersonate Custψ, the attacker requires correctly generated
The attacker faces an order-reassembly problem; this problem can be represented using graph theory. The group of seeds
Consider a second scenario. The same attacker does not have access to Clientρ, Badgeζ, Pwdθ, or PINλ (and therefore not Custψ either). This attacker must recreate Clientρ, Badgeζ, Pwdθ, and PINλ to gain access to both (a) and (b); therefore, the attacker must correctly reassemble the memory contents of Clientρ and Badgeζ and an analogous representation of Pwdθ or PINλ (these memory contents are the characterization of Clientρ, Badgeζ, Pwdθ, and PINλ). Each of these items is represented by b memory locations, each of which is Σ bits long; therefore, each memory location contains one of 2^Σ possible values. This situation can be represented using an undirected “colored” graph. The possible values for a given memory location can be represented by a group of 2^Σ vertices v of the same “color”, vϵ, and each memory location can be represented by a different “color” group (represented as different shapes in
While the discussion above was set forth in terms of an attacker that must recreate Clientρ, Badgeζ, Pwdθ, and PINλ, it should be understood that a similar result would apply with respect to an attacker that must recreate the Access Control List, discussed above, which may include similar information. Particularly, as mentioned, each SA and SSA may possess a copy of the ACL, which may contain multiple records that specify who is allowed to access which data (which may be specified by the Content PID). When a user seeks to access a piece of data, the request may be checked by every SA or SSA in the process against the ACL to authorize the request.
In illustrative embodiments, reconstructing an ACL record is an NP-complete problem. This can be understood by considering an IDACS system that uses the above-described ACL format. This ACL format uses p different types of ACL record items (e.g., User PID, Client PID, Content PID, etc.). These different types of ACL record items may be referred to as “flavors”, symbolized by F, such that there are p different “flavors” in this system. Each of these p “flavors” may be x bits long; therefore, each item can exist as one of 2^x possible values, or “states”, symbolized by S(F). The challenge to the attacker, in forging a single ACL record, is to find the correct S(F) for each of the F in the ACL record.
To frame this problem in terms of graph theory, one can consider vertices and edges as follows:
“Flavors” and “states” may be described as:
of the same F. A specified length path represents a
The challenge to the attacker would be to solve this graph problem. It should be noted that due to the time-changing nature of the ACL, the attacker may be faced with a new ACL record graph problem each time the ACL record changes.
Given a pile of data fragments (analogous to the S(F) in the ACL record problem), a PPM model can be used to automate the reassembly of the fragmented file(s). Given fragments A and B, the PPM model can analyze both fragments and generate a probability that fragments A and B are adjacent in the original file.
It is now possible to show that the ACL record graph problem derives from the file reassembly PPM problem. The states (S(F)) in the ACL record problem are analogous to the data fragments in the PPM problem. The edges (E) and edge weights (W) in both problems represent probabilities of relationships between the vertices (V). In the PPM problem, the goal is to isolate the Maximum Weight Hamiltonian Path (a path that covers all of the vertices), whereas the goal of the ACL record problem is to isolate a Maximum Weight Path of Specified Length (covering only p vertices). Both graphs would be solved in a similar manner.
Given that an attacker can use this graph method to attack the ACL record, the question of interest becomes, how efficient is PPM/ACL record graph reassembly, and what are the implications for IDACS security? To answer this question, one can examine the complexity of the reassembly algorithm. For purposes of this analysis, it is assumed the time needed to generate the W is constant for each pair of V; thus, the running time to construct the complete graph is equal to the number of E in the graph. In the ACL record reassembly problem, the total number of E is 2^(2x−1)(p^2−p); therefore the complexity of assembling the graph is O((p·2^x)^2). The main question, however, is how long does it take to determine the Maximum Weight Path of length p (which should provide the solution to the ACL record item matching problem)? This problem can be represented by the Maximum Weight Path of Specified Length problem. This problem is intractable, or NP-complete.
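The edge count used in the complexity estimate can be checked on small cases. The sketch below assumes, consistent with the description, that p flavors each have 2^x states and that edges join only states of different flavors; it compares a brute-force count with the closed form 2^(2x−1)(p^2−p). The function names are illustrative.

```python
from itertools import combinations


def edge_count_formula(p: int, x: int) -> int:
    """Closed form: 2^(2x-1) * (p^2 - p)."""
    return 2 ** (2 * x - 1) * (p * p - p)


def edge_count_brute_force(p: int, x: int) -> int:
    """Count vertex pairs that belong to different flavors."""
    vertices = [(flavor, state) for flavor in range(p) for state in range(2 ** x)]
    return sum(1 for a, b in combinations(vertices, 2) if a[0] != b[0])
```

The count grows as the square of p·2^x, so even building the graph is costly for realistic x before any path search begins.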
Thus, in the ACL record reconstruction problem, if F1=F2, then there is no E connecting S(F1) and S(F2). Assuming that the “correct” solution to the ACL record reconstruction problem is an MWP, then the ACL record reconstruction problem is equivalent to the Maximum Weight Path of Specified Length problem. Therefore, the ACL record reconstruction problem is NP-complete.
Since this problem has been proven NP-complete, it may be regarded as providing a high level of security provided that x is sufficiently large. Table 4 demonstrates that x in a realistic IDACS system is sufficiently large as to make the problem solution run-time prohibitively large. Additional security is provided if the ACL record graph W are sufficiently uniform.
The NP-completeness set forth above with respect to ACL recreation and similarly set forth above in Theorem 1 and Theorem 2 with reference to Custψ or any other super-state points to a high level of security for IDACS, since NP-completeness is associated with an exponential increase in the problem solution complexity. However, NP-completeness speaks only to the worst-case (for the attacker) situation. It may be that the problem solution can be found with significantly less than exponential complexity. As mentioned,
The National Institute of Standards and Technology (NIST) has provided a battery of tests that analyze the outputs of Random Number Generators (RNGs) to measure their “randomness” by looking for patterns. This battery of tests has also been used on ciphertext from various encryption algorithms to measure how closely it matches truly random data. This battery contains 15 individual tests, each of which measures different aspects of “randomness” in a set of data. Each test, when analyzing a data sample, asks this general question: “If the algorithm that generated this data sample was truly random, what is the probability that this specific data sample could have been generated?” The test responds with a p-score for the analyzed data sample; this p-score is a probability in the range [0, 1]. NIST recommends interpreting these p-scores using a “significance level” of 0.01; if a data sample's p-score is above 0.01, then the data sample has passed the randomness test. Some data samples that are truly random will generate a failing p-score, which would be a “false negative” for randomness; this is due to inherent weaknesses in the tests. There are two ways to interpret the results of these tests. The first way is to look at the proportion, or percentage, of data samples with passing p-scores. According to the parameters in A. R. et al., “A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications,” Gaithersburg, Md., 2010, the contents of which are incorporated herein by reference, for a set of tests run with 1000 data samples, a truly random RNG will have a minimal proportion of 0.9805068, i.e., a minimum 98.05% pass rate. The second way is to look at the distribution of the test p-scores. For a set of truly random data samples being subjected to a test, it is expected that the p-scores of the data samples should be evenly distributed. 
Evenness of distribution can be measured by calculating P-valueT based on the chi-square statistic for each test as discussed in A. R. et al. If each test has P-valueT≥0.0001, then the p-scores are considered to be evenly distributed.
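The two interpretations described above may be sketched as follows. The sketch computes the proportion of passing p-scores and the chi-square statistic over equal-width bins; the choice of ten bins and the function names are assumptions, and the conversion of the chi-square statistic into P-valueT (which requires an incomplete gamma function) is omitted.

```python
def pass_proportion(p_scores, significance=0.01):
    """Fraction of data samples whose p-score is at or above the significance level."""
    return sum(1 for p in p_scores if p >= significance) / len(p_scores)


def chi_square_uniformity(p_scores, bins=10):
    """Chi-square statistic of the p-scores against a uniform distribution on [0, 1]."""
    counts = [0] * bins
    for p in p_scores:
        # A p-score of exactly 1.0 is placed in the last bin.
        counts[min(int(p * bins), bins - 1)] += 1
    expected = len(p_scores) / bins
    return sum((c - expected) ** 2 / expected for c in counts)
```

A statistic near zero indicates p-scores spread evenly across the bins; larger values indicate clustering and would correspond to a smaller P-valueT.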
To determine the “randomness” of
Of the 15 tests in the battery, two of the tests were run twice during the course of the battery. Results for both tests are reported here. Three of the tests were run a number of times; results for two randomly chosen instances of those tests are reported here. All other tests were run once, and the results are reported here. There are a total of 20 separate test results.
For the second analysis, the P-valueT for each of the tests is presented in Table 5. It can be seen that all tests exceed the minimum pass value of 0.0001.
It can similarly be shown that the Network Security Ticket approval chain, discussed above with respect to Xchain values and other features, is secure. As mentioned, the Network Security Ticket may make use of XChain values, which may be generated using the CBC-MAC encryption, and the HMAC. Thus, this discussion of the security of the Network Security Ticket addresses these items. The Network Security Ticket approval chain system is secure because it is more difficult to break than the underlying CBC encryption mode used for the XChain values, and the HMAC, which are generally known to be secure. Moreover, the network security system as a whole is at least as strong as the security provided by the Network Security Ticket approval chain. Thus, the entire network is secure as well.
Implications of the disclosed embodiments may be better understood based on the following disclosure. Consider the attacker in the scenarios; an attacker who has access to cryptographic seeds but not Custψ (Theorem 1) OR no access to any memory locations at all (Theorem 2) faces the NP-complete reassembly problem. There is no known solution to these problems with a complexity polynomial in the problem size (number of seeds or memory locations in the graph). A polynomial-time solution could exist for certain situations meeting special constraints; however, due to the demonstrated randomness of the
By way of illustration, consider a Theorem 1 situation where the attacker has access to all of the Seedσ needed to calculate
During all simulations, background traffic was introduced into the network to simulate normal operating conditions. It was determined that introducing network traffic on the slower network connections did not affect the simulation results (but made the simulation running time prohibitively long). Therefore, all background traffic was introduced between the SAs and the SSAs. Uniformly distributed background traffic equal to 80 Kbps/Client was divided equally between the SAs and sent from each SA to each SSA. An equal amount of traffic was also sent from each SSA to each SA. This rate of background traffic ranged from a one-way 80% load on a 10 Gbps link for a full-sized network (102,400 Clients) to a much smaller load for smaller networks (0.8% load for 1000 Clients). This rate of background traffic affected both SA/SSA security log size (which has a substantial effect on real-time forensics, which is discussed herein) and packet transit time in the datacenter (due to network congestion). Additionally, realistic packet delay times for routers were obtained from router manufacturer documentation and incorporated into the simulation. Packet processing delays for Clients, SAs, SSAs, and Databases were estimated; when the IDACS prototype implementation is completed by the researchers, more precise packet processing times will be measured and incorporated into the simulation.
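The background traffic figures can be reproduced with simple arithmetic. The sketch below assumes decimal units (1 Kbps = 1,000 bits per second, 1 Gbps = 10^9 bits per second) and treats the aggregate one-way traffic as carried on a single 10 Gbps link; both are assumptions made for illustration.

```python
def one_way_load(clients: int, kbps_per_client: float = 80.0,
                 link_gbps: float = 10.0) -> float:
    """Fraction of link capacity consumed by one-way background traffic."""
    traffic_bps = clients * kbps_per_client * 1_000
    capacity_bps = link_gbps * 1_000_000_000
    return traffic_bps / capacity_bps
```

Under these assumptions, 1000 Clients give a load of 0.8%, and 102,400 Clients give roughly 82%, in line with the approximately 80% load stated for the full-sized network.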
Each simulation consisted of two phases. In the first phase, each Attacker would build a set of compromised Slaves (a botnet) gathered from a pool of vulnerable Clients. The attacker would compromise the Clients (turning them into bots) by sending a Compromise packet to each Slave candidate. During the second phase, each Attacker would send out a specified number of Read and Write attacks using a random-length Attack Chain of chained Slaves (the details of the attack scenarios used in this simulation are discussed herein). The start times for these attacks were uniformly distributed over a 20 millisecond period. These attack packets would be checked for
In the simulation, if the attack packet did not possess the proper
Additional probability variables were also used to govern other factors in the simulation. During chained attacks (in which an Attacker uses a chain of Slaves (bots) to launch an attack; this is discussed herein below), the attacker was given an 80% probability of stealing the cryptographic seeds needed to calculate
During the simulations, one of the statistics that were tracked was the traceback time. Each time an attack was detected, the simulation time was recorded as T1 for that attack. When the SSA completed the traceback to identify the attacker, that time was recorded as T2. When the SSA completed the log search and correlation to identify all slaves of that attacker, that time was recorded as T3. These three times were used to compile statistics about the traceback speed of the IDACS system (which are shown in the following graphs). The two times of interest are the traceback time (T2−T1) and the All Slaves Identified time (T3−T1). These times demonstrate the real-time capability for forensics reporting in IDACS. It should be noted that for any given attack, (T2−T1) will always be shorter than (T3−T1), since (T3−T1)=(T2−T1)+(T3−T2). Tracebacks in this simulation were based solely on the log correlation method; other traceback methods such as PID examination can be examined in the following paper.
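The two statistics of interest may be derived from the recorded times as sketched below. The class and function names are hypothetical, and the use of a simple mean is an assumption; the description does not state which summary statistics were compiled.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class AttackTimes:
    t1: float  # simulation time at which the attack was detected
    t2: float  # time at which the traceback identified the attacker
    t3: float  # time at which all slaves of the attacker were identified

    @property
    def traceback_time(self) -> float:
        return self.t2 - self.t1

    @property
    def all_slaves_time(self) -> float:
        return self.t3 - self.t1


def summarize(attacks):
    """Mean traceback time and mean All Slaves Identified time over all attacks."""
    return {
        "mean_traceback": mean(a.traceback_time for a in attacks),
        "mean_all_slaves": mean(a.all_slaves_time for a in attacks),
    }
```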
All tests were based on 1000 attacks for a given test case; 500 Read attacks and 500 Write attacks. Some tests used 1 SSA, and some used 2 SSAs. The first set of tests (
As expected,
An advantage of the IDACS system is its ability to identify the attacker and all of the attacker's slaves quickly.
Another benefit of the IDACS system is that traceback can be improved by adding additional machines to the SA/SSA barrier.
One of the main features of IDACS is the real-time forensics capability. Through log examination and correlation, IDACS is able to trace back and correctly identify the origin of an attack, whether the attack is launched directly by the attacker or indirectly using a botnet of legitimate IDACS users.
In today's network security environment, it is important to detect and prevent network intrusions. It is also important to trace network attacks to their origins and identify the culprits and their methods. This allows the guilty parties to be held liable for their actions; it also allows network administrators to focus their resources once they know the weak spots in their defenses.
Thus, disclosed embodiments provide real-time digital forensics capabilities that can identify network attackers as well as their collaborators, and even traitors within IDACS itself. Explanation is now provided regarding those capabilities possessed by IDACS and how they can be used to detect, block, and trace attacks to their origins. Additionally, simulations demonstrate the ability of IDACS to detect attacks and self-heal even when the network contains a high percentage of insider traitors.
When an attacker wishes to defeat the IDACS Network Access Control Protocol to gain access to protected data or services residing within the IDACS datacenter, there are several general attack vectors available. Three exemplary attack vectors (which may be used alone or in combination) include:
Attack vector 1: Forge legitimate Custψ credentials (Clientρ, Badgeζ, Pwdθ, and PINλ) to impersonate a legitimate Custψ
Attack vector 2: Steal/hack credentials for a legitimate Custψ
Attack vector 3: Hack and gain control over one or more SAs and/or SSAs to manipulate the authentication process
Attack vector 1) requires brute-force guessing of Clientρ, Badgeζ, Pwdθ, and PINλ; this is generally infeasible according to Theorem 2. Attack vector 2) may be more effective, although the space-separation of Clientρ, Badgeζ, Pwdθ, and PINλ makes it more difficult for an attacker to collect them all and acquire a botnet of complete Custψ. By using attack vector 3), an attacker can use a botnet of SAs and SSAs to bypass
By means of attack vectors 2) and 3), any Custψ, SAχ, or SSAκ can be turned into a traitor machine. When this happens, the machine becomes a Byzantine actor (i.e. a malicious system actor that actively works to defeat the correct operation of the system). As known in the art, it is of interest to be able to prove that a given system is Byzantine-resistant, able to operate correctly in the presence of a given number of Byzantine actors.
The incorporation of the space-separated time-evolving relationship into the IDACS Network is based on a principle which affects its real-time forensics capabilities.
Principle 1: Any Custψ, SAχ, or SSAκ in IDACS can be hacked and turned into a traitor/Byzantine actor. Any Customer (Custψ), authenticating machine (SAχ or SSAκ), or real-time forensics machine (SSAκ) can be turned into a traitor/Byzantine actor.
This principle is a reason for the decentralized approach of separating authentication capabilities in space and time. With a design that keeps this principle in mind, IDACS is able to detect and prevent almost all illegal Ticketψ that are passed to it. In fact, IDACS is demonstrably secure against any illegal Ticketψ under certain conditions. The following illustrative capabilities of this exemplary implementation generally hold under certain assumptions outlined below.
Assumption 1: For purposes of the present illustrative discussion, assume that any Custψ can only communicate with
Assumption 2: For purposes of the present illustrative discussion, assume that any member of
Assumption 3: For purposes of the present illustrative discussion, assume that any member of
Assumption 4: For purposes of the present illustrative discussion, assume that any member of
Assumption 5: For purposes of the present illustrative discussion, assume that an attacker who is forming Ticketψ has access to all Seedσ stored on traitor SAχ or SSAκ.
Assumption 6: For purposes of the present illustrative discussion, assume that a spoofed SAχ or SSAκ does not have access to the Seedσ stored on the machine it is spoofing.
Assumption 7: For purposes of the present illustrative discussion, assume that any DBγ that receives a Ticketψ can verify whether or not the SSAκ that sent it was the correct SSAκ at the end of the calculated authentication chain.
Assumption 8: For purposes of the present illustrative discussion, assume that when processing Ticketψ, IDACS performs
Assumption 9: For purposes of the present illustrative discussion, assume that any attack Ticketψ falls into one of two categories: a) contains incorrect
These assumptions are provided to facilitate a description of the IDACS network's advantages. The scope of the present disclosure is not limited solely to networks satisfying these assumptions.
Based on these assumptions, certain capabilities regarding attack detection and prevention in IDACS can be set forth. These capabilities help to illustrate the advantages and improvements of the systems, components, and methodologies in accordance with the present disclosure. Recall that N is the Authentication Chain Length; there are N SAs and N SSAs in the approach authentication chain and N SAs and N SSAs in the return authentication chain.
Capability 1: A Ticketψ with incorrect
Justification for Capability 1: According to Assumption 5, if any SAχ or SSAκ is a traitor, then the attacker will have access to the Seedσ necessary to calculate
However, the strength of IDACS illustrated above is qualified by the limitation set forth in Capability 2.
Capability 2: A Ticketψ with incorrect
Justification for Capability 2: Under certain circumstances, IDACS cannot guarantee detection of an attack Ticketψ with one traitor SA and two traitor SSAs in IDACS if authentication chain path manipulation is allowed. The attacker is allowed to choose the first SA in the authentication chain, so he chooses a traitor SA. Since this SA is a Byzantine actor, it calculates the authentication chain path based on Ticketψ and checks to see whether the last SSA in the authentication chain is also a traitor.
Capability 1 and Capability 2 address a) in Assumption 9; similar capabilities can be offered to address b) in Assumption 9.
Capability 3: A Ticketψ with correct
Justification for Capability 3: Since Ticketψ contains correct
Capability 4: A Ticketψ with correct
Justification for Capability 4: The justification for Capability 4 is similar to the justification for Capability 2.
There is also one final Capability that can be set forth regarding spoofed IDACS machines.
Capability 5: A spoofed SAχ or SSAκ is detected as soon as it communicates with a loyal SAχ or SSAκ.
Justification for Capability 5: According to Assumption 6, a spoofed SAχ or SSAκ does not have access to the Seedσ of the machine it is spoofing, including the Seedσ needed to calculate XV when communicating with other SAχ or SSAκ. Therefore, it is unable to correctly calculate the requisite XV; this situation is detected by a loyal SAχ or SSAκ.
When an attack is detected by IDACS, it may fall into one of several categories, with each category having corresponding root causes. If an attack is detected based on an OTPχ or
When an attack is detected in Algorithm 4, the function report_and_trace_attack( ) (Algorithm 6) is called to invoke the IDACS real-time digital forensics suite. The inputs to Algorithm 6 are
In connection with the following description about the different digital forensics functions used by IDACS, the following Property informs the disclosure:
Property 7: The different design elements of IDACS (the distribution of PID seeds, the design of Xchain values, the design of security log records, etc.) are carefully crafted to facilitate the real-time digital forensics capabilities of IDACS. Therefore, IDACS is able to provide high-speed forensic services in real-time with minimal overhead.
When an attack is detected by IDACS, the real-time digital forensics suite is able to trace the attack to the root attacker by correlating the security log records on IDACS machines. In a fully-realized IDACS system, all data packets (including Client-Client packets such as are used in attack chains) may be required to pass through the SA barrier. Even in a less-complete IDACS system, the Clientρ may still maintain security logs for all of the data packets they send and receive. These security logs are a key component to the attacker traceback capability.
In illustrative embodiments, every time an SA, SSA, or client receives a packet of any type, a record of information related to that packet is saved in a log. The log record contains some basic information about that packet, such as origin, destination, packet type (i.e. Remote Terminal, HTML, FTP, etc.), and content PID of the information on the Database that the packet was seeking to access (if applicable). In certain implementations, all packets (even those not directly accessing the database) may pass through the SA/SSA layer, so all packets may leave records in their logs. In other implementations, client-to-client communications will not pass through any SAs or SSAs. In such cases, client logs may also be used to gain a more complete picture of all network communications. When an attack is detected, the SSA may search through these logs to identify the origin of the attack.
Maintaining a log with records for each packet received over an unduly long period of time may be prohibitively expensive and time-consuming. Therefore, each SA and SSA may maintain log records based on a sliding time window of length t, e.g. t=15 minutes. Logs may be maintained for the most recent t time of traffic for fast availability, while older logs can be stored on a backup server.
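The sliding-window log described above might be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the `LogRecord` fields mirror the log contents listed earlier, while the class names, the in-memory `archive` list standing in for the backup server, and the assumption that records arrive in timestamp order are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogRecord:
    timestamp: float                   # seconds; when the packet was received
    origin: str
    destination: str
    packet_type: str                   # e.g. "FTP", "HTML"
    content_pid: Optional[str] = None  # PID of the requested content, if any


class SlidingWindowLog:
    """Keeps the most recent `window` seconds of records; older ones are archived."""

    def __init__(self, window: float = 15 * 60):
        self.window = window
        self.recent = deque()   # fast-access records inside the window
        self.archive = []       # hypothetical stand-in for the backup server

    def add(self, record: LogRecord) -> None:
        # Assumes records are added in non-decreasing timestamp order.
        self.recent.append(record)
        cutoff = record.timestamp - self.window
        while self.recent and self.recent[0].timestamp < cutoff:
            self.archive.append(self.recent.popleft())


log = SlidingWindowLog(window=900)  # t = 15 minutes
for ts in (0, 600, 1000):
    log.add(LogRecord(ts, "client-a", "sa-1", "FTP"))
```

After the third record arrives, the record received at time 0 lies outside the 900-second window and is moved to the archive, while the two newer records remain available for fast searching.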
As mentioned, systems in accordance with the present disclosure may provide attack traceback capability. This capability can accomplish multiple purposes. For example, it can identify the origin attacker of a detected attack. As another example, it can detect all slave clients (i.e. botnet members) controlled by an attacker. IDACS can also provide partial tracebacks that provide leads and clues for human investigators to pursue in the identification of attacker locations.
One example of the type of traceback that IDACS can provide is through log correlation. As mentioned, SAs and SSAs may maintain log records of all packets that pass through them. Client machines may also maintain logs of sent and received packets. In certain implementations, client-to-client communication may all pass through the SA-SSA barrier. Thus, all packets being sent by an attacker to compromise clients may leave records in the SA and SSA logs. In this case, it may be possible for IDACS to identify the origin of an attack and provide this information in a real-time forensics report.
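The general idea of tracing an attack by correlating log records can be sketched as below. This is a deliberately simplified illustration and not Algorithm 7: it links records only by sender, receiver, and time, whereas the disclosed traceback also uses parameters such as packet type and PID information. The `Packet` type and `trace_back` function are hypothetical names.

```python
from typing import List, NamedTuple


class Packet(NamedTuple):
    time: float
    sender: str
    receiver: str


def trace_back(logs: List[Packet], detected: Packet) -> List[str]:
    """Walk the logs backwards from the detected packet to a presumed origin.

    At each step the current candidate is replaced by the sender of the most
    recent earlier packet that the candidate received.
    """
    chain = [detected.sender]
    candidate, cutoff = detected.sender, detected.time
    while True:
        earlier = [
            p for p in logs
            if p.receiver == candidate and p.time < cutoff and p.sender not in chain
        ]
        if not earlier:
            return chain
        previous = max(earlier, key=lambda p: p.time)
        chain.append(previous.sender)
        candidate, cutoff = previous.sender, previous.time


logs = [
    Packet(1.0, "attacker", "bot1"),
    Packet(1.5, "client-x", "client-y"),  # unrelated traffic
    Packet(2.0, "bot1", "bot2"),
    Packet(3.0, "bot2", "sa-1"),          # the detected attack packet
]
chain = trace_back(logs, logs[-1])
```

In this toy log the chain is recovered as bot2, bot1, attacker. A real network carries legitimate traffic that would mislead such a naive rule, which is why additional correlation parameters are needed in practice.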
IDACS processing may begin from the assumption that a given packet was part of an attack chain (see
The details of this traceback are shown in the “trace_attack( )” function set forth in Algorithm 7. The SSA executing the traceback receives the log record TK of the detected attack packet as input. The critical trace parameters are extracted from this record: the Custψ who sent TK is marked as the candidate for the attacker (1), and the time (2), Parent UA(PIDε) (3), and Content(PIDε) (4) are isolated. Additionally, the
Immediately following the call of “trace_attack( )” to identify the root traitor Client and the traitor Client bots in the attack chain, the SSA running the real-time digital forensics suite may run the “identify_bots( )” function to identify all traitor Client bots controlled by the attacker, even those in a dormant state. In the example shown in
Other methods can be used to expand the traceback analysis. For example,
Algorithm 8 details the “identify_bots( )” algorithm. The function receives the identity of the root Traitor Client, any suspicious packet types as determined by the “trace_attack( )” algorithm, and
The space-time separated and jointly evolving relationship built into IDACS can be used to assist the real-time digital forensics capabilities. The elements of
An example of the “identify_compromised_items( )” algorithm corresponding to
Property 8: Due to the seed distribution and PIDε formation (as discussed generically in Property 7), the checks performed in “identify_compromised_items( )” are performed very quickly with very little overhead.
Property 9: The seed distribution shown in
Additionally, each SA2 and SSA2 in a given iteration of Algorithm 4 are the SA1 and SSA1 for the next iteration, and will also be performing OTPχ and
Due to the design of
Algorithm 10 illustrates how the digital forensics suite performs this detection.
Property 10: The relationships between the XV values (as discussed generically in Property 7) and also
The above-described algorithms may also be used to simulate the attack traceback time for the illustrative IDACS network. When an attack was detected (i.e. (4), (11), (17), or (23) in Algorithm 4), this point in time was recorded as T1. After the attack had been reported to an SSA and the traceback to identify the root attacker was completed (i.e. the “trace_attack( )” algorithm called at (7) of Algorithm 6 completes), this point in time was recorded as T2. After the attacker's botnet was identified (i.e. the “identify_bots( )” function called at (8) of Algorithm 6 completes), this point in time was recorded as T3. The statistics of interest in this situation were the (T2−T1) time and the (T3−T1) time. The (T2−T1) time is termed the “Attack Traceback Time”, since it represents the time it takes for the root attacker to be identified after the attack is detected. The (T3−T1) time is termed the “Botnet Detection Time”, since it represents the time it takes for the root attacker's botnet to be detected after the attack is detected. It should be noted here that the “Botnet Detection Time” will always be greater than the “Attack Traceback Time”, since (T3−T1)=(T2−T1)+(T3−T2).
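As a small illustration of how the two statistics relate, the following sketch computes them from the three recorded timestamps. The function name and the sample values are hypothetical and are not taken from the simulation results.

```python
def traceback_stats(t1: float, t2: float, t3: float) -> dict:
    """Derive both statistics from the detection (T1), traceback (T2),
    and botnet-identification (T3) timestamps."""
    if not (t1 <= t2 <= t3):
        raise ValueError("expected T1 <= T2 <= T3")
    return {
        "attack_traceback_time": t2 - t1,
        "botnet_detection_time": t3 - t1,
    }


# Hypothetical timestamps, in arbitrary simulation time units.
stats = traceback_stats(t1=10, t2=14, t3=25)
```

With these sample values the Attack Traceback Time is 4 units and the Botnet Detection Time is 15 units, the difference being the 11 units spent on the botnet search (T3−T2).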
An additional benefit of the IDACS system is that attack traceback time can be improved by scaling the network.
In addition to the above-described simulations, a second simulation suite performed for this research addresses the effects of the attack traceback combined with quarantine and healing for Byzantine traitor agents. Given an exemplary IDACS network under attack by parties that are able to steal Client authentication items and turn SAs and SSAs into traitors, how well will the attack traceback protect the IDACS datacenter from illegal (no permission) access? The simulation results presented here attempt to answer that question.
To fully test the capabilities of the illustrative IDACS network of the present disclosure against real-world threats and attacks, attack scenarios were constructed based on the latest and most lethal real-world attack vectors. Therefore, these simulations were carried out under the assumption that all attempts to hack SAs and SSAs and turn them into traitors would be accomplished using zero-day attacks. Since zero-day attacks have not been previously observed, it is difficult to defend against them. Additionally, through the use of metamorphic evolution techniques, it is possible to generate endless variants of these zero-day turn-traitor-attacks, each of which has a unique signature. This method can be used to defeat security systems that use signature-based scans to detect known turn-traitor-attacks. Since both of these attack methods are widely in use today, they will both be considered in this simulation.
This simulation is based on a number of assumptions, each of which approximates real-world conditions. The first assumptions on which this simulation is based are as follows:
Assumption 10: Previously unobserved zero-day turn-traitor-attacks used to gain control over network machines will require a relatively long time (weeks) for a patch that successfully secures that attack's entry point to be issued.
Assumption 11: A zero-day attack used to gain control over network machines, once detected and analyzed, can be blacklisted with a signature-scanning security system very quickly. A zero-day turn-traitor-attack or a metamorphic variation thereof, once detected and blacklisted, is detected and blocked thereafter.
Assumption 10 is seen to be true across almost all computer security vulnerabilities that are being discovered today. Assumption 11 reflects the strengths of signature-based scanners, although their strength may be slightly overstated to simplify this simulation. Based on Assumption 11, it follows that attacker behavior will reflect this reality.
Assumption 12: A zero-day turn-traitor-attack or a metamorphic variation thereof, once detected and blocked, will not be reused by the attacker.
Based on the relative importance of Custψ,
Assumption 13: Completely turning a Client into a traitor (with access to Clientρ, Badgeζ, Pwdθ, and PINλ) through theft or coercion is difficult. An exception would be in an active battlefield scenario, where a number of human users (soldiers) could be captured and coerced into turning over all of the elements of Custψ.
Assumption 14: Custψ are easier to turn into traitor bots than SAs, and SAs are easier to turn into traitor bots than SSAs.
Finally, any attacker, being intelligent and wishing to maximize his chances of success, will not launch attempts to access the IDACS datacenter until he has a certain chance of success. Thus,
Assumption 15: An attacker will not launch access-DB-attacks against the IDACS datacenter until he controls a certain number of traitor Custψ, SAχ and SSAκ.
These assumptions are provided to facilitate a description of the IDACS network's advantages. The scope of the present disclosure is not limited solely to networks satisfying these assumptions.
This simulation was implemented in MATLAB, and examined an exemplary IDACS network consisting of 500 Custψ, 40 SAs, 20 SSAs, and 10 DB (see
The simulation consists of two phases. In Phase 1, the attacker uses turn-traitor-attacks to build a botnet of traitor Custψ, SAs, and SSAs for use in IDACS access-DB-attacks. According to Assumption 15, the attacker builds a botnet consisting of traitor SAs equaling 60% of all SAs in IDACS, and traitor SSAs equaling 60% of all SSAs in IDACS (15% of the traitor SAs and SSAs were spoofed machines). Additionally, the attacker builds a botnet consisting of four traitor Custψ for each traitor SA and SSA (this number was experimentally determined to provide a sufficient number of traitor Custψ to launch a sufficient number of access-DB-attacks for the duration of the simulation). In accordance with Assumption 13, this simulation is assumed to represent an active battlefield situation, so 15% of the traitor Custψ have full access to their authentication credentials (Clientρ, Badgeζ, Pwdθ, and PINλ). Once a sufficient number of bots have been obtained, the attacker launches Phase 2.
In Phase 2, the attacker sends a large burst of access-DB-attacks. The logic behind the burst is that an attacker maximizes his chance of successful datacenter accesses if he sends them quickly; detected and prevented access-DB-attacks will result in the detection and quarantine of traitor Custψ, SAs, and SSAs. By sending a burst of access-DB-attacks, the attacker makes full use of these bots before they are detected and quarantined and the advantage gained from Assumption 15 begins to slip away. During Phase 2, access-DB-attacks are launched at an average rate of one attempt per 15 clock cycles (the actual start times of the access-DB-attacks are randomized using a normal distribution over the complete period of Phase 2). If an illegal datacenter access is detected and traced to one or more traitor machines based on the methods discussed herein, each such machine is quarantined (removed from the IDACS network), healed over a period of 100 clock cycles, and then returned to IDACS as a loyal Custψ, SAχ or SSAκ. During Phase 2, the attacker continues to attack Custψ, SAχ and SSAκ and turn them into traitors, thus replenishing the botnet even as bots are detected and quarantined. In accordance with Assumption 14, one new Custψ is turned traitor every 150 clock cycles, one new SAχ is turned traitor every 300 clock cycles, and one new SSA is turned traitor every 600 clock cycles (this is on average; the actual times the machines turn traitor are randomized over the period of Phase 2 using a normal distribution). In accordance with Assumption 15, if the percentage of traitor SAs or SSAs in IDACS falls below 10%, the attacker stops launching access-DB-attacks until both of those percentages rise above 10%. As long as any traitor Custψ is available, access-DB-attacks continue.
All traitor Custψ without access to complete authentication credentials (Clientρ, Badgeζ, Pwdθ, and PINλ) launched access-DB-attacks against data/services that particular Custψ had permissions to access (the access-DB-attack was illegal due to incorrect
In this simulation, the botnet-building activity of Phase 1 was compressed into a period of 100 clock cycles (in reality, this botnet building could occur in a “low-and-slow” turn-traitor-attack strategy over the course of weeks or months). Phase 2 activity was simulated over a period of 4000 clock cycles. The length of both the approach and return authentication chains was 4 (N=4), as would be expected in a fielded IDACS implementation.
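To make the scale of the simulation concrete, the stated parameters can be turned into rough expected counts. The sketch below is plain arithmetic on the figures quoted above (it is not the MATLAB simulation). The reading of "four traitor Custψ for each traitor SA and SSA" as four per traitor machine is an assumption, and all variable names are hypothetical.

```python
# Parameters quoted in the description of the simulation.
PHASE2_CYCLES = 4000
TOTAL_SAS, TOTAL_SSAS = 40, 20
TRAITOR_PERCENT = 60

# Phase 1 botnet size (integer arithmetic to avoid rounding surprises).
traitor_sas = TOTAL_SAS * TRAITOR_PERCENT // 100
traitor_ssas = TOTAL_SSAS * TRAITOR_PERCENT // 100
# Assumed reading: four traitor Cust per traitor SA and per traitor SSA.
traitor_custs = 4 * (traitor_sas + traitor_ssas)


def expected_events(mean_period_cycles: int) -> float:
    """Average number of events in Phase 2 for a given mean spacing."""
    return PHASE2_CYCLES / mean_period_cycles


expected = {
    "access_db_attacks": expected_events(15),
    "new_traitor_cust": expected_events(150),
    "new_traitor_sa": expected_events(300),
    "new_traitor_ssa": expected_events(600),
}
```

Under these figures the attacker starts Phase 2 with 24 traitor SAs and 12 traitor SSAs, launches roughly 267 access-DB-attacks on average, and gains roughly 27 new traitor Custψ, 13 new traitor SAs, and 7 new traitor SSAs over the 4000 clock cycles.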
To address the question of zero-day turn-traitor-attacks with metamorphic variants, the simulation was divided into three scenarios. In Scenario 1, whenever an access-DB-attack is detected and prevented, one or more traitor Custψ, SAχ or SSAκ is identified. This traitor is quarantined and healed, but no attempt is made to analyze the zero-day attack used to turn that machine into a traitor. Therefore, the same zero-day turn-traitor-attack can be used again to turn other machines into traitors during Phase 2. In Scenario 2, there are 20 different zero-day turn-traitor-attacks used to turn machines into traitors. When a traitor machine is identified, the IDACS forensics suite analyzes the zero-day turn-traitor-attack that was used to turn this machine into a traitor. A signature for the zero-day turn-traitor-attack is identified and added to each Custψ, SAχ and SSAκ's blacklist in accordance with Assumption 11, and will not be successful in turning any more machines into traitors during Phase 2. Therefore, the attacker will stop using that zero-day turn-traitor-attack according to Assumption 12. In Scenario 3, the attacker begins with 20 different zero-day turn-traitor-attacks and 20 metamorphic variants of each zero-day attack. Each analyzed and blacklisted metamorphic turn-traitor-attack variant will no longer be used, but many other metamorphic variants are available. In short, Scenario 1 represents the “simplified case” situation, Scenario 2 represents the “best case” situation, and Scenario 3 represents the “realistic case” situation. Results from these three Scenarios are presented herein.
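The difference between Scenario 2 and Scenario 3 can be pictured as the size of the attacker's pool of usable signatures as entries are blacklisted. The following sketch is illustrative only: it models each attack as an (attack, variant) pair and assumes the Scenario 3 pool consists of 20 × 20 such pairs, a detail the text does not spell out. All names are hypothetical.

```python
from itertools import product


def attack_pool(n_attacks: int, variants_each: int = 1) -> set:
    """Each signature is modelled as an (attack_id, variant_id) pair."""
    return set(product(range(n_attacks), range(variants_each)))


class SignatureBlacklist:
    def __init__(self):
        self._blocked = set()

    def add(self, signature) -> None:
        self._blocked.add(signature)

    def blocks(self, signature) -> bool:
        return signature in self._blocked


def usable(pool: set, blacklist: SignatureBlacklist) -> set:
    """Signatures the attacker can still use (per Assumptions 11 and 12)."""
    return {sig for sig in pool if not blacklist.blocks(sig)}


scenario2 = attack_pool(20)       # 20 zero-day attacks
scenario3 = attack_pool(20, 20)   # assumed: 20 variants of each attack

blacklist = SignatureBlacklist()
blacklist.add((0, 0))             # one traitor machine analyzed
```

After a single signature is blacklisted, the Scenario 2 attacker has lost 5% of the pool, whereas the Scenario 3 attacker has lost only 0.25% of it, which is why Scenario 3 is the harder case for the defender.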
Each of these three scenarios was simulated 10 times, and the results averaged together. This was done to gain a better view of broad trends and mask random variations in different simulation runs.
The purpose of this simulation was to demonstrate how well IDACS could protect the datacenter from illegal access. This can be measured in two ways. First, the number of successful illegal accesses over the period of the simulation indicates the success of the IDACS defense. Second, the number of undetected traitor Custψ, SAs, and SSAs remaining in IDACS over the course of the simulation demonstrates how effectively IDACS is detecting, quarantining, and healing traitor machines.
Characterization 23 pertains to the concept of cryptographic seeds Seedσ and explains that Seedσ are spread across different locations (i.e. Seedσ⋄Clientρ,
Characterization 35: The cryptographic keys used to encrypt data residing on Clientρ have a certain number of bits removed and stored in a different location. These bits are termed Xbits, corresponding to the relevant Custψ.
Each time the cryptographic key is used for encryption or decryption, Clientρ reforms the cryptographic keys and calculates new Xbit locations based on the updated version of Custψ. As such, an attacker who manages to derive the Xbit insertion locations for a given cryptographic key at time t will not possess the correct Xbit insertion points at a later time t′ after Custψ has been adjusted. Thus, the space-separated time-evolving relationship is used to protect the integrity of cryptographic keys. This difficulty faced by the attacker is summarized in the following Theorem, which is proved herein below.
Theorem 3: An attacker who possesses a cryptographic key and the corresponding Xbits, but does not possess the Custψ necessary to determine the Xbit insertion points, faces an NP-complete problem to determine the Xbit insertion points.
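The Xbit mechanism of Characterization 35 can be illustrated with a toy sketch: a secret value determines which bit positions are removed from a key, and only a holder of the same secret can put them back. Everything here is an assumption made for illustration, including the use of SHA-256 and Python's non-cryptographic `random.Random` to derive positions, the bit-string key representation, and the function names; the disclosure does not specify these details.

```python
import hashlib
import random


def xbit_positions(secret: bytes, key_len: int, n_xbits: int) -> list:
    """Derive n_xbits distinct positions in [0, key_len) from the secret.

    Illustration only: random.Random is NOT a cryptographic generator.
    """
    seed = int.from_bytes(hashlib.sha256(secret).digest(), "big")
    return sorted(random.Random(seed).sample(range(key_len), n_xbits))


def remove_xbits(key_bits: str, secret: bytes, n_xbits: int):
    """Split a key into the bits kept locally and the removed Xbits."""
    positions = set(xbit_positions(secret, len(key_bits), n_xbits))
    kept = "".join(b for i, b in enumerate(key_bits) if i not in positions)
    xbits = "".join(key_bits[i] for i in sorted(positions))
    return kept, xbits


def restore_key(kept: str, xbits: str, secret: bytes) -> str:
    """Re-insert the Xbits at the positions derived from the secret."""
    total = len(kept) + len(xbits)
    positions = set(xbit_positions(secret, total, len(xbits)))
    kept_iter, xbit_iter = iter(kept), iter(xbits)
    return "".join(
        next(xbit_iter) if i in positions else next(kept_iter)
        for i in range(total)
    )


key = "1011001110001111"
state_t0 = b"hypothetical Cust state at time t"
kept, xbits = remove_xbits(key, state_t0, n_xbits=4)
restored = restore_key(kept, xbits, state_t0)
```

Once the secret is updated (playing the role of the adjusted Custψ), the derived positions are recomputed, so positions learned at time t do not help at a later time t′.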
In accordance with at least one disclosed embodiment, the IDACS system may be designed to store and protect data or services in the IDACS datacenter. However, this design can also be used to help protect data stored on Client devices. Data stored on Clientρ is encrypted using encryption keys stored across multiple locations (Clientρ, Badgeζ, Pwdθ, and PINλ); this guarantees that an attacker must have access to all of these items to decrypt the data.
Characterization 36: When encrypted data is stored on Clientρ, segments of data are removed from the ciphertext and stored in a physically different location. These removed segments are called Xslices.
All of the Xslices that are removed from a Client-side ciphertext are stored in the IDACS datacenter (
To decrypt data stored on Clientρ, one must have access to the ciphertext stored on Clientρ; all of the locations storing pieces of the encryption keys (Clientρ, Badgeζ, Pwdθ, and PINλ); the Xbits for the encryption keys, which are stored across multiple SAs in IDACS; and the Xslices that are stored across multiple DB in the IDACS datacenter. Additionally, each time the data stored on Clientρ is decrypted to be viewed, the value of Custψ is updated, and the data is re-encrypted with new cryptographic keys that have new Xbits, and new Xslices are removed from the ciphertext and stored at new locations in the datacenter. By combining space-separation and time-evolving characteristics, this IDACS encryption scheme can achieve a much higher level of security than simple encryption.
The location and length of the Xslices in the ciphertext are pseudorandom; they are calculated based on Custψ, according to Characterization 37 and Characterization 38. This pseudorandomness contributes to the strength of the IDACS encryption, as addressed above.
Characterization 37: Given Custψ, a block of ciphertext to have Xslices removed, and the PID of that data block, the F-box(data-block-offset) transform returns the length between the beginning of the block of data or the end of the previous Xslice, and the beginning of the next Xslice. This transform also updates Custψ so that the next call to the transform will return the length of the next sub-block. This transform must produce the same sequence of lengths for consecutive transform calls for a given data block PID after Custψ has been reinitialized, so that data blocks may be disassembled and reassembled. The sub-block lengths are determined based on the cryptographic hash of secret seeds stored in Custψ. This transform is represented by
local_data_block_length=F-box(Offset, Custψ)
Characterization 38: Given Custψ, a block of data to have Xslices removed, and the PID of that data block, the F-box(xslice-length) transform returns the length of the next Xslice to be removed from the data block. This transform also updates Custψ so that the next call to the transform will return the length of the next Xslice. This transform must produce the same sequence of lengths for consecutive transform calls for a given data block PID after Custψ has been reinitialized, so that data blocks may be disassembled and reassembled. The Xslice lengths are determined based on the cryptographic hash of secret seeds stored in Custψ. This transform is represented by
local_xslice_length=F-box(Lth, Custψ)
Characterization 39: Given an input string or byte array, the F-box(substring) transform returns a substring or sub-array based on specified indices. The transform is represented by
local_xslice=F-box(tring, data_block, offset, length)
The “data_block” parameter is the input string or byte array, the “offset” parameter is the index indicating where in “data_block” the desired substring begins, and the “length” parameter indicates the length of the desired substring.
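The interplay of the offset, length, and substring transforms can be sketched as follows. This is a hedged illustration, not the disclosed F-box: `LengthStream` is a hypothetical stand-in that derives lengths from SHA-256 over a seed, the block PID, and a counter (the counter playing the role of the evolving Custψ state), and the length ranges of 4 to 16 and 1 to 4 bytes are arbitrary assumptions.

```python
import hashlib


class LengthStream:
    """Deterministic length source; re-creating it replays the same sequence."""

    def __init__(self, seed: bytes, pid: bytes):
        self.seed, self.pid, self.counter = seed, pid, 0

    def next_length(self, label: bytes, low: int, high: int) -> int:
        material = self.seed + self.pid + label + self.counter.to_bytes(8, "big")
        digest = hashlib.sha256(material).digest()
        self.counter += 1  # state update, analogous to updating Cust
        return low + int.from_bytes(digest[:4], "big") % (high - low + 1)


def extract_xslices(block: bytes, seed: bytes, pid: bytes):
    """Return (bytes kept on the client, list of removed Xslices)."""
    stream = LengthStream(seed, pid)
    kept, xslices, pos = bytearray(), [], 0
    while pos < len(block):
        gap = stream.next_length(b"offset", 4, 16)     # like F-box(offset)
        kept += block[pos:pos + gap]
        pos += gap
        length = stream.next_length(b"length", 1, 4)   # like F-box(length)
        xslice = block[pos:pos + length]               # like F-box(substring)
        pos += length
        if xslice:
            xslices.append(xslice)
    return bytes(kept), xslices


def reassemble(kept: bytes, xslices: list, seed: bytes, pid: bytes) -> bytes:
    """Replay the same length sequence to re-insert the Xslices."""
    stream = LengthStream(seed, pid)
    out, pos = bytearray(), 0
    for xslice in xslices:
        gap = stream.next_length(b"offset", 4, 16)
        out += kept[pos:pos + gap]
        pos += gap
        stream.next_length(b"length", 1, 4)  # advance state as in extraction
        out += xslice
    out += kept[pos:]  # any data after the last Xslice
    return bytes(out)


block = bytes(range(100))  # stand-in for a ciphertext block
kept, xslices = extract_xslices(block, b"hypothetical-seed", b"pid-1")
rebuilt = reassemble(kept, xslices, b"hypothetical-seed", b"pid-1")
```

Because both directions draw from an identically initialized stream, the block can be disassembled and reassembled exactly, which is the reproducibility requirement stated in Characterizations 37 and 38.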
Characterization 40: Given a block of data and Custψ, the F-box(encrypt) transform encrypts the block of data using encryption keys provided by Custψ and returns the ciphertext along with the updated version of Custψ. This transform is represented by
{Custψ, ciphertext}=F-box(Encrypt, Custψ, data_block)
The use of Xslices in the IDACS Client-side data encryption scheme leads to several theoretical implications which demonstrate the security of this encryption scheme. First, consider the situation where the Xslices extracted from a given piece of data's ciphertext are stored in a contiguous block in a single location (Storage Method 1 in
Theorem 4: An attacker who possesses a ciphertext block requiring Xslice insertion, but does not possess the SCustψ necessary to determine the Xslice insertion points, faces an NP-complete problem to determine the Xslice insertion points.
Theorem 5: An attacker who possesses a block of concatenated Xslices extracted from a ciphertext, but does not possess the SCustψ necessary to determine the lengths of and separate the individual Xslices, faces an NP-complete problem to separate the individual Xslices.
A second situation, where individual Xslices are stored across multiple DB in the IDACS datacenter (Storage Method 2 in
Theorem 6: An attacker who possesses all Xslices extracted from a ciphertext, but does not possess the Custψ necessary to determine the order in which these Xslices should be re-inserted into the ciphertext, faces an NP-complete problem to correctly order the individual Xslices.
The preceding Theorems are proved herein.
As explained above, Xslices may be used to protect the confidentiality of encrypted Client-side data. Additionally, distributed Xslices can be combined with data segmentation to gain additional security by minimizing both the level of decrypted data exposure and the damage caused by an attacker who is able to successfully pass several illegal IDACS datacenter access requests.
The standard approach to file encryption and decryption is to decrypt an entire protected file at the time of access. Unfortunately, this exposes the entire contents of the protected file to an attacker who can steal a Clientρ on which a currently decrypted file is being viewed. The concept of the space-time evolving relationship can be used to minimize this risk.
Segmenting an encrypted data file in this manner enhances data security in several ways. First, in the event that a Clientρ being used to decrypt and view data is stolen, the amount of decrypted data residing on Clientρ is limited. Additionally, if an attacker is able to force a few illegal IDACS datacenter access requests through IDACS, the encrypted data that attacker can recover is limited to a few file segments rather than the same number of files.
Segmenting data on the single file-level provides benefits in terms of both security and performance, which are summarized in Table 8. If a single segment of a file is decrypted, the number of Xslices retrieved from the datacenter as well as the amount of decrypted plaintext exposed is less than if the encrypted file were non-segmented. Additionally, the time required to complete this operation is constant (O(1)) rather than linear (O(x)). If all segments of the file are decrypted, then there is no relative advantage over a non-segmented approach. Table 8 provides a comparison of segmented vs. non-segmented data encryption for a file of length x.
The results shown in Table 8 may be used as justification for the following:
Claim 5: Segmenting encrypted files and decrypting and issuing Xslices one segment at a time increases security and performance if one or a few pages are decrypted, but has no effect on security or performance if an entire file is decrypted.
As explained above, data segmentation can be used to protect a single encrypted data file; the same concept can also be used to protect and encrypt a file directory tree.
Navigating through the encrypted File Directory Tree is similar to navigating through any file explorer program on a PC.
Table 9 illustrates the performance of a non-segmented File Directory Tree compared to the segmented version. Table 9 provides a comparison of performance of segmented vs. non-segmented File Directory Trees containing x data files. The performance of the segmented version exceeds that of the non-segmented version if a single file is retrieved; however, the performance of the segmented version drops if all of the files in the Directory Tree are retrieved. For its application in IDACS, this tradeoff in performance is considered acceptable in return for the corresponding increase in security, which is demonstrated in Table 10.
Table 10 compares the security provided (in terms of how much file data and pointers are exposed) for non-segmented and segmented File Directory Trees. Table 10 provides a comparison of security of segmented vs. non-segmented File Directory Trees containing x data files. If a single file is accessed, the segmented version provides higher security by not exposing the file data and pointers for non-accessed files; of course, this advantage is lost if all of the files in the directory tree are accessed. In either case, the segmented version provides a higher level of security by forcing more authentication and permissions checks by a factor of log x. Since the user must potentially navigate the depth of the File Directory Tree for each file accessed, retrieving Xbits and Xslices from the IDACS datacenter for each zone accessed, the segmented version forces more authentication and permissions checks.
The results displayed in Tables 9 and 10 may be taken as justification for the following claims.
Claim 6: Segmenting the File Directory Tree and allowing a user to decrypt one zone at a time, as compared to a File Directory Tree system that provides the entire directory tree at once, for a single file access in a tree containing x files:
Claim 7: Segmenting the File Directory Tree and allowing a user to decrypt one zone at a time, as compared to a File Directory Tree system that provides the entire directory tree at once, for accessing every file in a tree containing x files:
Mathematical proofs for Theorems 3-6 are now provided. However, first, a short review of the Theorems to be proved is provided.
Theorem 3: An attacker who possesses a cryptographic key and the corresponding Xbits, but does not possess the SCustψ necessary to determine the Xbit insertion points, faces an NP-complete problem to determine the Xbit insertion points.
Theorem 4: An attacker who possesses a ciphertext block requiring Xslice insertion, but does not possess the SCustψ necessary to determine the Xslice insertion points, faces an NP-complete problem to determine the Xslice insertion points.
Theorem 5: An attacker who possesses a block of concatenated Xslices extracted from a ciphertext, but does not possess the SCustψ necessary to determine the lengths of, and to separate, the individual Xslices, faces an NP-complete problem to separate the individual Xslices.
All three cases represent a “splitting” problem, where a block of data must be split at certain points to re-insert extracted information (Theorem 3 and Theorem 4) or to separate the extracted information into pieces for re-insertion (Theorem 5). In essence, the problem requires the attacker to recreate the sequence of outputs from repeated calls to the F-box(Lth) or F-box(ffset) transforms, as demonstrated in
Now, solving the problem posed in Theorem 3, Theorem 4, or Theorem 5 is equivalent to solving this Maximum Weight Path problem. This problem may be formalized by specifying that a path of length Z (where Z is the number of columns) must be found. This is now the Maximum Weight Directed Path of Specified Length (MWDPSL) problem, which is proved NP-Complete herein. Thus, the NP-Completeness of Theorem 3, Theorem 4, and Theorem 5 is proved.
Theorem 6: An attacker who possesses all Xslices extracted from a ciphertext, but does not possess the SCustψ necessary to determine the order in which these Xslices should be re-inserted into the ciphertext, faces an NP-complete problem to correctly order the individual Xslices.
The proof for Theorem 6 is identical to the proof for Theorem 1, with the Seedσ in Theorem 1 replaced by the Xslices in Theorem 6. The Xslice ordering problem in Theorem 6 may also be represented by the Maximum Weight Path of Specified Length (MWPSL) problem, which is proved NP-Complete herein. Thus, the NP-Completeness of Theorem 6 is proved.
While Theorem 3, Theorem 4, Theorem 5, and Theorem 6 have been proved NP-complete, the value of this proof must be qualified. NP-completeness speaks only to the worst-case complexity of a given decision problem (which is that the complexity grows exponentially with the problem size); there may be other factors that can significantly reduce the complexity of a problem.
Consider the example of reassembling fragmented data. Generally, highly-patterned data will result in stronger pattern recognition, which will result in a graph with a few high-weight edges. Therefore, highly-patterned data will result in a Maximum Weight Path reassembly problem that has a complexity significantly less than the worst-case exponential. It has been reported that highly-patterned data does indeed lead to faster and more accurate file reassembly. Thus, a relevant inquiry is whether the fragmented, distributed ciphertext (Xslices and their associated ciphertext) that is present in IDACS produces a uniform or non-uniform edge weight distribution in the graph.
To address this issue, one may analyze the “randomness” of the type of ciphertext fragments that are present in IDACS to judge what type of edge weight distribution a Maximum Weight Path model applied to these fragments would generate. This analysis was performed using a software package created by the National Institute of Standards and Technology (NIST), which provides a battery of tests (referenced above) that analyzes the outputs of Random Number Generators (RNGs) to measure their “randomness” by looking for patterns in the outputs. The battery consists of 15 individual tests, each of which measures different aspects of “randomness” in the data. Each of these tests asks the question: “If the algorithm that generated this data sample was truly random, what is the probability that this specific data sample could have been generated?” The individual tests respond with a p-score in the probability range [0, 1]. The NIST standard recommends using a passing score, or “significance level”, of 0.01. Some truly random data samples will fail the tests and generate a “false positive” for randomness due to weaknesses in the test; therefore, two types of statistics are recommended for analyzing the test p-scores.
The first statistic looks at the proportion, or percentage, of tests with passing p-scores. According to the parameters in A. Rukhin et al., “A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications,” Gaithersburg, Md., 2010, for a set of tests run with 1000 data samples, a truly random RNG will have a minimal proportion of 0.9805068, i.e. a minimum pass rate of 98.05%. The second statistic looks at the distribution of the p-scores. For a set of truly random data samples, it is expected that the p-scores should be evenly distributed. The evenness of the distribution can be measured by calculating the P-valueT for each test based on the chi-square statistic; if each test has a P-valueT≥0.0001, then the p-scores are considered to be evenly distributed.
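As an illustration of how a single test in such a battery turns a bit sequence into a p-score, the following sketch implements the frequency (monobit) test, the simplest test in the NIST suite, and compares the result with the 0.01 significance level. It is a minimal sketch rather than the NIST software itself, and the function names are hypothetical.

```python
import math


def monobit_p_value(bits) -> float:
    """Frequency (monobit) test: p-score for the balance of ones and zeros."""
    n = len(bits)
    if n == 0:
        raise ValueError("bit sequence must not be empty")
    # Map 1 -> +1 and 0 -> -1, then sum; a random sequence sums near zero.
    s = sum(1 if b else -1 for b in bits)
    s_obs = abs(s) / math.sqrt(n)
    # Two-sided tail probability of the normalized sum.
    return math.erfc(s_obs / math.sqrt(2))


def passes(bits, alpha: float = 0.01) -> bool:
    """A sample passes when its p-score is at least the significance level."""
    return monobit_p_value(bits) >= alpha


sample = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
print(round(monobit_p_value(sample), 6), passes(sample))
```

For this ten-bit sample the p-score is approximately 0.527, which passes at the 0.01 level, whereas a heavily biased sequence such as one hundred consecutive ones yields a p-score near zero and fails.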
To measure the randomness of ciphertext blocks, the NIST battery was applied to two sets of data. The first set of data consisted of samples of normal AES ciphertext blocks (representing a segment of AES ciphertext with the correct Xslice re-inserted), and the second set consisted of samples of two normal AES ciphertext blocks encrypted with two different AES keys back-to-back with each other (representing a segment of AES ciphertext with an incorrect Xslice re-inserted). This test was designed to determine whether there was any discernible difference between the “pattern” (or randomness) of a correctly- and incorrectly-re-inserted Xslice. Both data sets consisted of 1000 samples, each of which was $10^6$ bits ($1.25 \times 10^5$ bytes) long. The samples in the first data set consisted of plaintext encrypted using AES in CBC mode, using a unique key for each sample. The samples in the second data set consisted of the same plaintext encrypted using AES in CBC mode, but each sample was split into two halves, each of which was encrypted using a unique key.
The NIST battery of tests consists of 15 individual tests. Two of these tests are run twice during the course of the battery; results for both of the tests are reported here. Three of the tests are run a number of times; results for two randomly selected instances of those tests are reported here. All other tests were run once; the results are reported here. In total, 20 separate test results are reported.
Table 11 lists the P-valueT for all of the NIST tests for both the “matched” and “mismatched” ciphertext fragments. It can be seen that all tests for both data sets pass the minimum score of 0.0001. Thus, the two data sets may be considered equally “random”. Therefore, it can be concluded that both matching and mismatching ciphertext fragments would generate uniform edge weights in a weighted graph. This indicates that there is no discernible difference (pattern-wise) between adjacent ciphertext blocks (ciphertext joined with associated Xslices) and non-adjacent blocks (concatenated Xslices or ciphertext with Xslices removed), and that the graphs generated to solve Theorem 3 through Theorem 6 would have uniform edge weights, maximizing the effect of the NP-complete property.
Table 11 provides a comparison of the P-valueT for NIST tests for “matched” ciphertext fragments and “mismatched” ciphertext fragments.
The simulations used to examine the effect of Xbits, Xslices, Single File Data Segmentation, and File Directory Tree Segmentation on increasing IDACS security used the same simulation suite discussed above. To simulate the use of Xbits, Xslices, and Data Segmentation, the simulation was expanded to make the access-DB-attacks directed towards accessing the contents of a Segmented File Directory Tree and the data files it stores.
Each Traitor Custψ in IDACS possessed the same encrypted File Directory Tree; however, each Traitor Custψ's File Directory Tree was encrypted using different encryption keys, different Xslices, and different Content(PIDϵ) associated with the stored Xbits and Xslices. Therefore, Traitor Custψ were not able to collaborate with each other by sharing Content(PIDϵ) and thus “skipping over” the retrieval of a given folder; all Traitor Custψ were forced to completely decrypt their own File Directory Trees. Additionally, each Traitor Custψ contained only the portions of the File Directory Tree that were “ancestors” of the Data Files that particular Custψ had permissions to access; in this simulation, Traitor Custψ were unable to access files for which they did not have permissions. However, the group of Traitor Custψ was able to collaborate with each other in a limited way; if one Traitor Custψ had retrieved a particular Data File Segment, all other Traitor Custψ would seek to retrieve other Data File Segments, rather than pursue data that had already been successfully recovered. Additionally, the group of Traitor Custψ would give priority for retrieval attempts to the Traitor Custψ with the deepest exploration into the File Directory Tree, thus focusing the resources of Traitor SAs and SSAs towards the Traitor Custψ with the highest likelihood of successfully accessing Data File Segments.
During this simulation, two separate IDACS datacenter accesses were completed to decrypt a single folder/data file pointer/data file segment; one access to retrieve the Xbits, and one access to retrieve the Xslices. The length of the approach and return authentication chains was 2 (N=2); this parameter was shortened from the previous value of 4 to allow more access-DB-attacks to succeed during this simulation. This simulation used the turn-traitor-attack vectors and metamorphic variations defined for Scenario 3 herein, with 40 attack vectors and 100 variations per attack vector (these parameters were increased to mask the limiting effects of these parameters from the results of this simulation). Phase 2 of the simulation began after a threshold of 90% of the SAs and SSAs had been turned into traitors, with a single Traitor Custψ for each Traitor SA or SSA. This 90% threshold is much higher than what we would expect to see in a real-world situation; however, it was set at that level for two purposes. First, the 90% threshold represents a catastrophic scenario; if IDACS is capable of defending against this type of scenario, then the real-world performance is expected to be much higher. Second, it was necessary to raise the threshold to 90% for an appreciable number of access-DB-attacks to succeed so that the effect on the Segmented File Directory Tree could be observed. Access-DB-attacks would cease if the percentage of active SAs or SSAs that were traitors fell below 15%. Additionally, no SAs or SSAs in this simulation were spoofed; all traitor SAs or SSAs were completely functional traitors.
During Phase 2 of this simulation, if a Traitor Custψ was identified, it would be quarantined and the entire File Directory Tree residing on that Custψ would be “re-encrypted”. Therefore, if that Custψ was later turned into a Traitor, it would have none of the File Directory Tree decrypted, and would have to start again from the “root”. However, once a Data File segment was retrieved, it was considered to be owned by the attacker regardless of whether that Custψ was detected in the future or not, and that data was added to the pool of data that had been successfully retrieved by the collaborating Traitor Custψ.
Because a threshold of 90% of SAs and SSAs had to be turned traitor before Phase 2 of the simulation began, a unique situation presented itself. In most cases, the percentage of Traitor SAs and SSAs in IDACS would drop quickly after the start of Phase 2 of the simulation (
It should be noted that Runaway Botnet simulations occurred in only 1 out of 10 simulation runs with an SA/SSA Traitor threshold of 90%; Runaway Botnets were not observed in simulations with a threshold of 80% or less. Therefore, a Runaway Botnet represents a highly unlikely, but very catastrophic situation. For the sake of completeness, the results for Runaway Botnet simulations are included in discussions herein.
The results of this simulation were analyzed to discover the effectiveness of the protection provided by the Segmented File Directory Tree against illegal access of the Client-side encrypted data.
One of the advantages of the IDACS Segmented File Directory Tree approach is that it allows previous illegal IDACS datacenter accesses to be detected. When a Traitor Custψ is detected, the IDACS forensics engine reports to the System Administrator that all Data File Segments previously retrieved by that Custψ have been stolen. It is very useful, in the aftermath of a network breach, to know where data leakages occurred, and what data was leaked.
Detecting a Traitor Custψ allows IDACS to hold that Custψ accountable for stealing files.
The Client Device (Clientρ) may be implemented in a number of ways.
In addition to being scalable, this implementation of IDACS uses a network communications protocol that achieves reliable delivery over UDP. Given the particulars of the IDACS algorithm and the per-message overhead
The BlackBerry application implements the concepts of space/time-separation and also Xbits and Xslices to protect encrypted data. When the BlackBerry application is run, it is given a file to encrypt. The application begins with a few randomly-generated cryptographic seeds that are the basis for all following actions. These cryptographic seeds are used to seed a pseudo-random process which divides the file data into pseudo-random-sized blocks and encrypts each block with a unique pseudo-random AES key. Next, the resulting ciphertext is divided into pseudo-random-sized blocks, and a certain percentage of those blocks are removed as Xslices. Xbits are also pseudo-randomly removed from these cryptographic seeds (using the User Password as a seed for the pseudo-randomness). The post-Xslice ciphertext is then divided into 1 KB blocks, which are stored in alternating data files (
As explained above, disclosed embodiments may provide an illustrative integrated security system, IDACS, that utilizes the space-time separated and jointly evolving relationship to provide multiple layers of constantly-changing barriers that are mathematically infeasible for attackers to predict. The implementation of these ideas can successfully detect and defeat different types of network attacks, including zero-day attacks. Table 12 details several common network attacks that systems, components, and methodologies in accordance with the present disclosure can address. Mathematical analysis demonstrates that it is generally infeasible to recreate the IDACS authentication protocol, and simulations also reinforce the strength of these space-time relationships. Table 12 lists the types of network attacks defeated by IDACS.
In addition to detecting and preventing attacks, systems, components, and methodologies in accordance with the present disclosure provide real-time forensics capabilities, allowing traitorous network actors to be identified quickly and accurately. Simulations demonstrate that forensics are efficient and effective. Also, systems, components, and methodologies in accordance with the present disclosure use the space-time separated and jointly-evolving relationship to protect at-rest mutated encrypted data. Space-time-changing Xbits and Xslices, which mutate the ciphertext and are stored across multiple locations, together with data segmentation, provide greater security for encrypted data. Once again, mathematical analysis demonstrates the theoretical strength of this system, and simulation provides a more concrete expression of this security.
Thus, systems, components, and methodologies in accordance with the present disclosure implement the space-time separated and jointly-evolving relationship across multiple aspects of the system to provide a complete end-to-end network and data protection system that has strong mathematical properties.
As explained above, the problem of finding the highest sum W(e) path is defined as the Maximum Weight Directed Path of Specified Length (MWDPSL) problem; this problem will now be proven NP-complete. The process of proving a given decision problem C to be NP-complete has two operations: 1) Show that C is in NP; 2) Show that every problem in NP is reducible to C in polynomial time.
The first operation can be shown by demonstrating that a candidate solution to C can be checked for correctness in polynomial (or better) time. The second operation can be shown by demonstrating that any one known NP-complete problem B is reducible to C. Because every problem in NP is reducible to B and polynomial-time reductions compose, if B can be reduced to C, then every problem in NP can be reduced to C. A problem B is reducible to problem C if there is a polynomial-time, many-one reduction from B to C; that is, there is a reduction that can transform any instance of B into an instance of C with the same answer. Consequently, any algorithm that can be used to solve all instances of problem C can be used to solve all instances of problem B.
The process of proving that the MWPSL and MWDPSL problems (C) are NP-complete begins with a proven NP-complete problem, the Hamiltonian Path problem (B).
As explained herein, each of the reductions is composed of a series of indicated operations. Subsequently, the MWPSL and MWDPSL (C) problems are proven to be NP-complete.
Starting Point: The Hamiltonian Path is NP-Complete
Hamiltonian Path: Given an undirected graph $G=(V, E)$, where $V$ is a set of vertices $\{v_1, v_2, \ldots\}$ and every $e \in E$ is an unordered set of vertices $\{v_1, v_2\}$ called an edge, does $G$ contain a Hamiltonian path, which is a sequence $\langle v_1, v_2, \ldots, v_\eta \rangle$ of distinct vertices from $V$ such that $\{v_i, v_{i+1}\} \in E$ for $1 \le i < \eta$ and every member of $V$ appears once and only once in the sequence?
The Hamiltonian Path problem has been proven NP-complete.
Operation 1: Show that the Maximum Weight Hamiltonian Path problem is NP-complete.
Maximum Weight Hamiltonian Path: Given an undirected graph $G=(V, E)$, where every $e \in E$ is an unordered set of vertices $\{v_1, v_2\}$ called an edge and has a weight $W(e) \in \mathbb{Q}^+$, and a number $R \in \mathbb{Q}^+$, is there a Hamiltonian path $\langle v_1, v_2, \ldots, v_i, \ldots, v_\eta \rangle$ in $G$, where $\eta = |V|$, such that $\sum_{i=1}^{\eta-1} W(v_i, v_{i+1}) \ge R$, where $\{v_i, v_{i+1}\} \in E$?
Operation 1.1: Show that the Maximum Weight Hamiltonian Path is in NP.
A candidate solution to this problem can be checked by tracing the path, verifying that each vertex is touched once and only once, and summing the weights of the edges in the path and checking the final sum. The candidate solution is checked in linear time.
Operation 1.2: Show that the Hamiltonian Path problem is reducible to the Maximum Weight Hamiltonian Path problem in polynomial time.
The Hamiltonian Path problem is a special case of the Maximum Weight Hamiltonian Path problem, so the first can be reduced to the second. Given an instance $G=(V, E)$ of the Hamiltonian Path problem, create an instance of the Maximum Weight Hamiltonian Path problem on the same graph by setting $W(e)=1$ for all $e \in E$ and $R=|V|-1$. Every Hamiltonian path contains exactly $|V|-1$ edges, so the weighted instance has a solution if and only if $G$ contains a Hamiltonian path. The reduction is accomplished in linear time.
Result: The Maximum Weight Hamiltonian Path problem is NP-complete.
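The reduction in Operation 1.2 can be sketched in a few lines. The function names and the dictionary-of-`frozenset` edge representation are assumptions made for illustration, and the brute-force solver is included only to check the equivalence on very small graphs; it is not an efficient algorithm.

```python
from itertools import permutations


def hamiltonian_to_mwhp(vertices, edges):
    """Map a Hamiltonian Path instance to a Maximum Weight Hamiltonian Path
    instance: unit weight on every edge and threshold R = |V| - 1."""
    weights = {frozenset(e): 1 for e in edges}
    return weights, len(vertices) - 1


def mwhp_brute_force(vertices, weights, r) -> bool:
    """Exhaustively decide the weighted instance (tiny graphs only)."""
    for order in permutations(vertices):
        pairs = [frozenset(p) for p in zip(order, order[1:])]
        # Every consecutive pair must be an edge, and the total weight
        # along the path must reach the threshold R.
        if all(p in weights for p in pairs) and sum(weights[p] for p in pairs) >= r:
            return True
    return False


# A simple chain a-b-c has a Hamiltonian path.
chain_v, chain_e = ["a", "b", "c"], [("a", "b"), ("b", "c")]
# A star with three leaves does not: the path would have to pass
# through the center more than once.
star_v, star_e = ["c", "x", "y", "z"], [("c", "x"), ("c", "y"), ("c", "z")]

print(mwhp_brute_force(chain_v, *hamiltonian_to_mwhp(chain_v, chain_e)))
print(mwhp_brute_force(star_v, *hamiltonian_to_mwhp(star_v, star_e)))
```

The transformation itself touches each edge once, which is the linear-time cost referred to above; only the illustrative solver is exponential.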
Operation 2: Show that the Maximum Weight Path of Specified Length (MWPSL) problem is NP-complete.
Maximum Weight Path of Specified Length (MWPSL): Given an undirected graph $G=(V, E)$, where every $e \in E$ is an unordered set of vertices $\{v_1, v_2\}$ called an edge and has a weight $W(e) \in \mathbb{Q}^+$, a number $R \in \mathbb{Q}^+$, and an integer $N \le |V|$, is there a path $P=\langle v_1, v_2, \ldots, v_i, \ldots, v_N \rangle$ in $G$ such that any $v \in V$ appears at most once in $P$ and $\sum_{i=1}^{N-1} W(v_i, v_{i+1}) \ge R$, where $\{v_i, v_{i+1}\} \in E$?
Operation 2.1: Show that the MWPSL problem is in NP.
A candidate solution that connects some or all of the vertices can be checked by tracing the path, verifying each vertex in the path is touched at most once, verifying that there are N vertices in the path, and summing the path edge weights and comparing the sum to R. This candidate solution is checked in linear time.
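The check described in Operation 2.1 can be sketched as follows. The representation of the graph as a dictionary from `frozenset` edges to weights, and the name `verify_mwpsl`, are assumptions made for illustration.

```python
def verify_mwpsl(weights, path, n, r) -> bool:
    """Check a candidate MWPSL solution in time linear in the path length.

    weights: dict mapping frozenset({u, v}) to a positive edge weight.
    path:    candidate sequence of vertices.
    n, r:    required number of vertices and minimum total weight.
    """
    # The path must contain exactly N vertices, each at most once.
    if len(path) != n or len(set(path)) != n:
        return False
    total = 0
    for u, v in zip(path, path[1:]):
        w = weights.get(frozenset((u, v)))
        if w is None:  # consecutive vertices must be joined by an edge
            return False
        total += w
    return total >= r


graph = {frozenset(("a", "b")): 2, frozenset(("b", "c")): 3}
print(verify_mwpsl(graph, ["a", "b", "c"], n=3, r=5))  # valid path, weight 5
print(verify_mwpsl(graph, ["a", "c", "b"], n=3, r=1))  # a-c is not an edge
```

Each vertex and edge of the candidate is inspected once, so the verification cost grows linearly with the path length, which is what membership in NP requires.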
Operation 2.2: Show that the Maximum Weight Hamiltonian Path problem is reducible to the MWPSL problem.
The Maximum Weight Hamiltonian Path problem is a special case of the MWPSL problem, so the first can be reduced to the second. Create an instance of the MWPSL problem and set $N=|V|$. This is now an instance of the Maximum Weight Hamiltonian Path problem; this reduction is accomplished in linear time.
Result: The Maximum Weight Path of Specified Length (MWPSL) problem is NP-complete.
Operation 3: Show that the Maximum Weight Directed Path of Specified Length (MWDPSL) problem is NP-complete.
Maximum Weight Directed Path of Specified Length (MWDPSL): Given a directed graph $G=(V, E)$, where every $e \in E$ is an ordered pair of vertices $(v_1, v_2)$ called an arc and has a weight $W(e) \in \mathbb{Q}^+$, a number $R \in \mathbb{Q}^+$, and an integer $N \le |V|$, is there a path $P=\langle v_1, v_2, \ldots, v_N \rangle$ in $G$ such that any $v \in V$ appears at most once in $P$ and $\sum_{i=1}^{N-1} W(v_i, v_{i+1}) \ge R$, where $(v_i, v_{i+1}) \in E$?
Operation 3.1: Show that the MWDPSL problem is in NP.
A candidate solution that connects some or all of the vertices can be checked by tracing the path, verifying each vertex in the path is touched at most once, verifying that there are N vertices in the path, and summing the path edge weights and comparing the sum to R. This candidate solution is checked in linear time.
Operation 3.2: Show that the MWPSL problem is reducible to the MWDPSL problem.
The MWPSL problem is a special case of the MWDPSL problem, so the first can be reduced to the second. Create an instance of the MWDPSL problem corresponding to an instance of the MWPSL problem, in which every edge $e \in E$ with a given $W(e)$ in the MWPSL problem is replaced by a pair of opposite-direction arcs in the MWDPSL problem, both with the same $W(e)$ as in the MWPSL problem. This now equates to an instance of the MWPSL problem; this reduction is accomplished in linear time.
Result: The Maximum Weight Directed Path of Specified Length (MWDPSL) problem is NP-complete.
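The edge-doubling construction used in Operation 3.2 can be sketched as follows. The name `to_directed` and the dictionary representations are assumptions made for illustration.

```python
def to_directed(undirected_weights):
    """Turn an MWPSL instance into an MWDPSL instance.

    Each undirected edge {u, v} with weight w becomes two arcs,
    (u, v) and (v, u), both carrying the same weight w.
    """
    directed = {}
    for edge, w in undirected_weights.items():
        u, v = tuple(edge)
        directed[(u, v)] = w
        directed[(v, u)] = w
    return directed


undirected = {frozenset(("a", "b")): 2, frozenset(("b", "c")): 3}
print(sorted(to_directed(undirected).items()))
```

Because both directions carry the original weight, any path in the undirected graph corresponds to a directed path of the same weight and vice versa, and the construction visits each edge once.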
The MWPSL and MWDPSL problems represent a reassembly-due-to-space-separation problem at a given instant in time. Thus, the space separation of IDACS provides the NP-completeness to the systems. However, due to the joint time-evolution of IDACS, the problem evolves into a completely new MWPSL or MWDPSL problem each time the system states change (which can occur every few seconds). Therefore, the time-evolution greatly increases the complexity of the problem.
In accordance with disclosed embodiments, systems, components, and methodologies also provide a security scheme that incorporates the cloud and mobile devices possessed by a user to give the user the required data confidentiality, integrity and user authentication at substantial security strength, e.g., more than 768-bit security strength, and an NP-complete problem without sacrificing performance. This is possible due to the randomness of wrapping, splitting, encrypting files, and storing the different pieces in different locations on the PC, mobile devices, and in the cloud.
More particularly, systems, components, and methodologies in accordance with the present disclosure provide security and authentication of encrypting files with more than a password and a simple location. They overcome problems such as password cracking, cloud storage, zero-day malware, Trojans, phishing, and information leakage that are drawbacks to alternative implementations. Generally, one way the disclosed systems, components, and methodologies solve these problems is by not only using “what a person knows”, like a password, but also “what a person has”. The present disclosure employs cloud storage and mobile devices together to provide the benefits and improvements described above.
The systems, components, and methodologies in accordance with the present disclosure provide authentication by authenticating the user with multiple devices and passwords. Also, by splitting and storing the encrypted data on multiple devices such as a server, an Android mobile device, a PC, and cloud servers, the user will have a higher degree of integrity and confidentiality because the attacker will have to have access to all encrypted pieces, devices, and passwords.
Factors behind the systems and methodologies described herein include: what the user knows, what the user owns, where the encrypted pieces are, and encryption, such as by way of example AES-GCM encryption.
With the user's password and devices, a key may be constructed as input to the AES-GCM encryption and the encrypted output is split up and randomly distributed among the devices including the cloud service DropBox. The location of these splits is stored in a list or map that will also be protected.
According to one exemplary implementation, various tools may be used including: an Android 2.2-4.2 device, the Android SDK, a PC, the DropBox API, JAVA, Eclipse JUNO, and the RSA JAVA Share package for GCM. Software using JAVA was created for the PC, mobile device, and server. The software design may be broken into three sections, the salt/key generation, protection (encryption, splitting, and mapping), and unprotection (gathering, merging, and decryption).
If a user wants to protect a folder, a 256-bit key may be created for the AES-GCM encryption. This key may be made from a randomly generated 256-bit salt seeded with the time stamp of the PC. The salt is concatenated with the PC password and hashed using SHA-256 n times.
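The key construction described above can be sketched as follows. This is a minimal sketch under stated assumptions: the salt is drawn from the operating system's random source via `secrets.token_bytes` rather than from a timestamp-seeded generator, the password is encoded as UTF-8 and appended to the salt, and the function name `derive_key` and the iteration count of 1000 are hypothetical.

```python
import hashlib
import secrets


def derive_key(password: str, salt: bytes, n: int) -> bytes:
    """Concatenate salt and password, then apply SHA-256 n times.

    Returns a 32-byte (256-bit) value usable as an AES-256 key.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    digest = salt + password.encode("utf-8")
    for _ in range(n):
        digest = hashlib.sha256(digest).digest()
    return digest


salt = secrets.token_bytes(32)  # 256-bit salt (assumption: OS random source)
key = derive_key("example password", salt, n=1000)
print(len(key) * 8, "bit key")
```

The same salt, password, and iteration count always reproduce the same key, which is why the salt must be recoverable at unprotection time.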
Details regarding encryption are now provided. AES with GCM (Galois/Counter Mode) is an advantageous option because of its efficiency, performance, and built-in authentication. Its high throughput makes it favorable for high-speed data transfer. GCM can take full advantage of instruction pipelining, in contrast to alternatives such as Cipher Block Chaining (CBC), which incurs pipeline stalls. GCM improves on counter mode (CTR) by using finite-field arithmetic to add authentication to the encryption process. GCM belongs to a class of cipher modes called Authenticated Encryption with Associated Data (AEAD). The RSA BSAFE Share library is used since the JAVA cipher class has not implemented GCM yet. However, it should be understood that the scope of the present disclosure is not limited solely to AES with GCM.
After encryption, the bulk is split randomly into pieces, e.g., 4-9 pieces. This may be done with a randomly generated 512-bit number created using the JAVA SecureRandom class. Each split is then encrypted with keys generated with the password and salt. The smain, sbits, and random number from the salt generation are concatenated, encrypted, and split as well.
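The random splitting step, and the corresponding merge, can be sketched as follows. This is an illustrative sketch only: it draws randomness from Python's `secrets.SystemRandom` instead of a 512-bit value from the JAVA SecureRandom class, it omits the per-split encryption, and the function names are hypothetical.

```python
import secrets


def split_randomly(data: bytes, min_pieces: int = 4, max_pieces: int = 9) -> list:
    """Split data into a random number of non-empty pieces at random offsets."""
    if len(data) < max_pieces:
        raise ValueError("data too short for the requested number of pieces")
    rng = secrets.SystemRandom()  # backed by the OS random source
    count = rng.randint(min_pieces, max_pieces)
    # Choose count - 1 distinct interior cut points, in increasing order.
    cuts = sorted(rng.sample(range(1, len(data)), count - 1))
    bounds = [0] + cuts + [len(data)]
    return [data[start:end] for start, end in zip(bounds, bounds[1:])]


def merge(pieces) -> bytes:
    """Reassemble the pieces; they must be supplied in their original order."""
    return b"".join(pieces)


ciphertext = bytes(range(256))  # stand-in for the encrypted bulk
pieces = split_randomly(ciphertext)
print(len(pieces), "pieces; round trip ok:", merge(pieces) == ciphertext)
```

Merging succeeds only when every piece is present and the original order is known, which is the role played by the protected list or map of split locations.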
The tests were run with total file sizes of 36 KB, 1.08 MB, and 10.8 MB that contain various numbers of files and folders (Tables 14 and 15). The average total running time and the average AES-GCM/splitting time were calculated. However, as the files became bigger, the total average time of the protection became nearly 28 seconds. This was due to the uploading of the larger splits to DropBox, which is limited by the upload speed of the internet connection. To combat this problem, a second series of tests was done to see the effect of limiting the DropBox upload size to just 10 KB. This decreased the running time by nearly 300%. Nevertheless, the AES-GCM encryption and splitting of the group of files was very fast, even for larger file sizes.
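With the data from the original timed runs, the speed of the encryption was estimated as:
$\approx (\text{FileSize} \times 3.80764 \times 10^{-8}\ \text{sec}) / 1\ \text{Byte}$.
The speed of the decryption was estimated as:
$\approx (\text{FileSize} \times 3.40814 \times 10^{-8}\ \text{sec}) / 1\ \text{Byte}$.
The speeds of splitting and merging were also calculated at $\approx (\text{FileSize} \times 3.6064 \times 10^{-8}\ \text{sec}) / 1\ \text{Byte}$ and $\approx (\text{FileSize} \times 3.47196 \times 10^{-8}\ \text{sec}) / 1\ \text{Byte}$, respectively.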
With the speed and performance of AES-GCM, encrypting and decrypting large files is not a problem on orders of $10^{-8}$ seconds/byte. The same goes for splitting and merging files.
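To make these per-byte rates concrete, the following sketch applies them to the three tested file sizes. It assumes decimal units (1 KB = 1000 bytes, 1 MB = 10^6 bytes), which the text does not state, and the names `RATES` and `estimate_seconds` are hypothetical.

```python
# Per-byte rates reported in the text, in seconds per byte.
RATES = {
    "encrypt": 3.80764e-8,
    "decrypt": 3.40814e-8,
    "split": 3.6064e-8,
    "merge": 3.47196e-8,
}


def estimate_seconds(file_size_bytes: int, operation: str) -> float:
    """Estimated local processing time, excluding any upload or download."""
    return file_size_bytes * RATES[operation]


# Tested sizes, assuming decimal units.
for size in (36_000, 1_080_000, 10_800_000):
    times = {op: estimate_seconds(size, op) for op in RATES}
    print(size, {op: f"{t:.4f} s" for op, t in times.items()})
```

Under these assumptions, encrypting the 10.8 MB test set is estimated at roughly 0.41 seconds, which is small compared with the nearly 28 seconds of total protection time attributed to the cloud upload.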
An objective of the systems, components, and methodologies in accordance with the present disclosure is to provide high security to the user's information without sacrificing performance. By combining a password with the distribution of the encrypted data among various devices and locations, retrieving protected information depends not only on “what the user knows”, but also on “what the user has”.
When it comes to protecting the passwords, the password to the mobile device may be hashed and saved on the mobile device, while the password to the PC is not saved at all. One reason for not saving the password on the PC is that, if the password entered when unprotecting differs from the one entered when protecting, an improper key would be created and the unprotecting process would fail due to the GCM built-in authentication. Since the mobile device uses AES-CBC, built-in authentication is not available and the password must be saved. Since a password is used for both the PC and mobile device, two different passwords can be used. The password+salt combination makes dictionary attacks unreliable against the PC because the salt is protected by encryption and splitting.
A strength of the disclosed systems is the spreading of encrypted splits, smain, and sbits among various devices and locations. This addresses the situation that if some splits are found either by eavesdropping or device compromise, no plaintext data can be recovered. All of the splits from all locations are used to recover data.
Tables 16 and 17 show the different strengths of each part of the security scheme. The total cryptographic key strength is 768-bits because without the 768-bit protected map, then the splits cannot be located.
An explanation is now provided of an exemplary usage scenario of a system in accordance with the present disclosure.
According to this exemplary usage scenario, the folder that the user wants to protect is selected and a password is entered. After protecting, the encrypted data is split and stored in the different places. The “kf*.txt” files are the splits of the encrypted data.
The trade-off between performance and security is excellent for the encryption, decryption, splitting, and merging. For large files, performance is limited by the upload to the cloud; it can be improved either with a faster upload connection or with a design change that limits the size of the split sent to the cloud, and doing so does not hamper security whatsoever. AES-GCM makes the 256-bit encryption and decryption process very efficient, even for large file sizes. With the multiple encryptions, the overall encryption strength of this scheme is 768 bits, which is unbreakable by technology today and in the near future. Security can be improved without hindering performance if the key size is increased and strong passwords are enforced.
The systems, components, and methodologies disclosed herein provide technical solutions to the problems described earlier. Storing encrypted files in multiple places protects against zero-day malware and Trojans on the PC, and also protects against insecure cloud storage. Since all devices and the password plus protected salts are needed for unprotection, the scheme protects against password leakage, brute forcing, phishing against the user, and lost devices. It also protects against information leakage (via insecure networks or device compromise) because, if only a few splits are found, no information about the plaintext can be recovered. These design properties make this implementation very secure against outside attackers.
The systems, components, and methodologies disclosed herein make data protection more secure by adding a further form of authentication and file protection that fully utilizes the mobile devices a user already possesses (in this exemplary illustration, an Android mobile device). The high security strength and the spreading of parts of the encrypted file among multiple devices and the cloud make this scheme extremely difficult to break. An attacker not only needs to attack the password or keys, but also needs to possess all of the pieces from all locations. According to the above-described exemplary illustration, the scheme provides at least 768-bit security with an NP problem without sacrificing performance.
In accordance with at least one disclosed embodiment, systems, components, and methodologies may provide a TPM-enhanced cloud-based file protection system. Such systems, components, and methodologies address the need for better cloud computing system security. According to the present disclosure, a file distribution design can be introduced into the information protection system to add another layer of security. Distributing the file pieces can obfuscate the data and prevent attackers from recovering the whole file.
The present disclosure addresses shortcomings of alternatives, including alternatives that rely solely on software applications, by introducing a Trusted Computing design that utilizes the Trusted Platform Module (TPM). The TPM is a security chip that can create and store cryptographic keys, generate random numbers, and so on. In one aspect in accordance with the present disclosure, the TPM's security features are deliberately designed into the system implementations disclosed herein. The TPM is first used to bind a cryptographic key to provide a root of trust. Then, to support the cloud computing design, the TPM provides identity attestation for the client.
In accordance with illustrative embodiments, the file protection system, consisting of a server, a client, and an Android mobile device, provides five layers of security. First, in accordance with disclosed embodiments, the logon scheme is protected by obfuscated inputs on the client and the Android device and is authenticated on the server. In this exemplary implementation, no one device has a password or password hash, and as such, this scheme can effectively defeat key loggers and screenshot capturers. Second, in accordance with disclosed embodiments, an AES-GCM scheme is used for file encryption and decryption. Third, in accordance with disclosed embodiments, an encrypted file splitting and hiding scheme can be implemented in the cloud storage to avoid side channel attacks on encrypted files. Fourth, in accordance with disclosed embodiments, the TPM is used to create a 2048-bit RSA binding key to protect the encrypting key for the index file, which is the starting point of file unprotection. Fifth, in accordance with disclosed embodiments, the TPM is used to create a 2048-bit RSA Attestation Identity Key to provide identity authentication for the client. Finally, the encrypted file and encrypted index file are distributed to the server, the Android device, and the client. Only the authorized client with the original TPM, which carries the RSA signing key and binding key, can retrieve the distributed file pieces and finally unprotect them. Possession of the devices alone is not sufficient to recover the information.
Traditionally, for security software that resides only in a software process, the cryptographic keys have to be stored in the clear on the hard drive. Running the whole security process in software is therefore like leaving a spare front door key somewhere in the yard: one is relying on being able to think of a key-sized hiding place that a burglar won't find. That is the unavoidable weak point of software security. Incorporating the TPM into the crypto system resolves this problem and raises the file protection system to the hardware level. The Trusted Platform Module can implement security features and can be used as a reference point to provide a root of trust for cryptographic processes. Based on its key management infrastructure and root of trust features, the TPM provides the cryptographic ability to secure critical data and acts as a reference point for information protection, which addresses the weakness of solely-software cryptographic processing.
The cloud storage of information presents additional security concerns, including the identity proof of different parties. To ensure full access to the information pieces in cloud storage, each party of the system should be able to trust the others. Thus, the identity attestation ability of the TPM provides advantages over alternative implementations. Certain alternative implementations use a Role Based Access Control Model as the security binding for the cloud storage. In systems, methods, and components in accordance with the present disclosure, the Attestation Identity Key inside the TPM, which is strongly protected by the Storage Root Key and suited to personal information protection, is generated and used as the identity proof of the specific TPM, and this helps ensure the safety of the cloud storage data. Because the TPM is based on relatively independent hardware modules, the disclosed systems do not require significant CPU resources, and can improve the performance of cryptographic computation processing.
For the information security system, password protection is another important consideration. Passwords are often the primary source of protection. But passwords may be vulnerable for two reasons. First, the traditional input methods for passwords are so weak that they may be easily captured by a key logger or screen capturer. Second, many passwords chosen by people can be cracked by dictionary attacks or social engineering.
There are different alternative models designed for password protection. Many rely on complicated mathematical processing of user passwords and include biometric features and distributed computing features. In accordance with the present disclosure, a multi-end synchronous password logon scheme spanning a specific server, the client PC, and the Android device is designed based on random number sequence projection, and fully utilizes personal cloud computing power. This scheme can effectively defeat key loggers and off-line dictionary attacks.
Thus, to protect users, the present disclosure describes a highly secure, cloud-based information storage infrastructure enhanced by TPM to meet security demands for data confidentiality, integrity, and user authentication.
In certain implementations, Java is used to implement the disclosed systems because it is a suitable multi-platform environment that provides ease of software development with efficient applications. In accordance with certain implementations of the present disclosure, SHARE FOR RSA library is used to implement AES-GCM mode encryption of files under the Java environment. Java's robustness, ease of use, cross-platform capabilities, and security features provide beneficial World Wide Web solutions. The ability to run the same program on many different systems is beneficial to World Wide Web software, and Java succeeds at this by being platform independent at both the source and binary levels. However, it should be understood that Java is merely an exemplary mode of implementation, and other suitable development environments may be used.
According to an illustrative embodiment, the TPM design is based on public key infrastructure, and thus may utilize 1024-bit or 2048-bit RSA to encrypt files. A potential downside of RSA is efficiency: when the file size gets large, RSA-based implementations may become slow and unwieldy. Thus, in accordance with illustrative embodiments, AES with 256-bit keys is used as the file encryption method; it provides a security level at least as high as 2048-bit RSA with much better efficiency. The TPM is then used to protect the AES keys. While improving performance, this approach also resolves the problem of storing cryptographic keys, which is problematic in solely-software security designs.
According to this illustrative embodiment, the TPM is first used to create 2048-bit binding keys to wrap and store the index file encryption key. Because the private key can never leave the TPM and is protected by the storage root key, which is stored in non-volatile memory, the whole file system is well protected.
As discussed above, to ensure the safety of cloud storage, it is desirable that every access to information pieces in the cloud storage be fully authenticated. In accordance with this exemplary implementation, the TPM Attestation Identity Key (AIK) is used to provide identity attestation to the server for recovery of data pieces. In other alternatives, CA-issued certificates and TPM signing keys are recommended to bind with data for authentication. The present system, in contrast, provides efficiency and features suited to personal use, as the AIK is used to provide identity proof to the server. The AIK is well protected by the SRK inside the TPM; in addition, a unique identifier (UUID) is used to refer to it, and the UUID also serves as the user identity required to use a specific AIK. Thus, this design meets the security requirements for cloud storage identity attestation.
According to another aspect of the present disclosure, splitting and spreading the encrypted files provides protection of sensitive information. Exemplary illustrative embodiments use a splitting-merging program that provides splitting and merging functionality of files. This may be accomplished by storing the paths and keys of the pieces to an index file. The index file is further used for merging the pieces back to the file. A system generated time stamp may provide a unique seed for the generation of AES keys.
Generally, a password is often the weakest point of a security system. It may be easily captured by a key logger or screenshot capturer, and sometimes even guessed by dictionary attack or social engineering. To enhance the protection of the password, exemplary systems in accordance with the present disclosure use a multipart synchronous logon system in which the password may be split and input on the Android device and the PC separately; at the same time, the input may be transformed into random number sequences and sent to the server. On the server side, when the server receives the random sequences, it reorganizes all the sequences based on their timestamps and generates the SHA-256 hash value of the combined random number sequence. Finally, the hash value may be compared to the correct one and verified. This design can effectively defeat key loggers and screenshot capturers.
Random and varying salts may provide protection against brute force attacks, dictionary attacks, and birthday attacks. The salts themselves can be protected by encrypting, splitting, and spreading them so that they remain offline for their entire lifetime. The salt is a sequence of bytes that is added to the password before being digested. This makes the digests different from what they would be if the password alone were digested, and as a result protects against dictionary attacks. Adding salt to the key or password can vastly extend the key range, which may make an exhaustive search of keys by brute force attack more difficult.
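The effect of salting described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the choice of SHA-256 as the digest, the 16-byte salt length, and the function names are all assumptions.

```python
import hashlib
import secrets


def new_salt(length: int = 16) -> bytes:
    """Generate a fresh random salt (the length is an assumed parameter)."""
    return secrets.token_bytes(length)


def salted_digest(password: str, salt: bytes) -> bytes:
    """Digest of salt || password; SHA-256 is an assumed choice of hash."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


if __name__ == "__main__":
    salt_a, salt_b = new_salt(), new_salt()
    # The same password yields unrelated digests under different salts,
    # so a precomputed dictionary of unsalted digests is of no use.
    print(salted_digest("correct horse", salt_a).hex())
    print(salted_digest("correct horse", salt_b).hex())
```

Because each salt value requires its own dictionary, an attacker who does not hold the salt cannot precompute candidate digests in advance.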
In accordance with at least one embodiment of the present disclosure, three devices may be used for the security consideration of file protection: a server, an Android® device, and a PC equipped with TPM. Thus, the overall system is easy to implement and applicable for practical and commercial use. In exemplary implementations, the only knowledge required by the system on the part of the user is the passwords and the input orders on the Android device and the PC. Then all the remaining jobs are implemented by those three devices. So the overall design is easy to use but can provide comparatively high level security.
After the user registers an account, the system operation may be divided into two parts based on user choice, Protection and Unprotection.
In total, the overall design can be split into 5 modules:
In a synchronous logon scheme module, an Android® phone and the client PC are the input devices and implement the mapping from input digits to random numbers. Each digit of the password may be input alternately on the Android phone and the client PC, and at the same time each of the digits may be mapped into a random number sequence of a certain length and sent to the server for verification. The server stores the SHA-256 hash value of the correct random number sequence, and the hash value of the received random number sequence may be compared to it. The result of the comparison may be sent back to the PC and the Android device for the next operations.
An AES-GCM encryption-decryption scheme module uses the AES-256 encryption method, which is considered to be strong enough for exemplary security requirements. The AES keys are generated from a unique seed that is based on a system-generated time stamp. To guard against side channel attacks and timing attacks, GCM mode is introduced into this design. The security of GCM mode relies on the fact that the underlying block cipher cannot be distinguished from a random permutation. Finally, a randomly generated salt is added to guard against brute force attacks.
For a file splitting-merging module, a specific program is written to split the encrypted files into pieces and bulks based on random numbers. Junk data are also injected into the split files. The paths and keys of the pieces are stored into an index file. The index file is further used for merging the pieces back to the original file. The index file may also be encrypted by AES-GCM and split by the splitting program. In the decryption process, the recovery of all the files will start from the index file.
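The splitting and merging flow described above can be sketched in memory as follows. This is a simplified illustration under stated assumptions: the piece naming, the size limits, and the index layout (piece name, junk length, data length) are hypothetical, pieces are kept in a dictionary rather than written to separate storage locations, and the per-piece keys and index encryption mentioned in the text are omitted.

```python
import secrets


def split_with_index(data: bytes, max_piece: int = 64, max_junk: int = 8):
    """Split data into random-sized pieces, each prefixed with random junk.

    Returns (pieces, index). Each index entry is
    (piece_name, junk_length, data_length), which is enough to merge.
    """
    pieces, index = {}, []
    offset = 0
    while offset < len(data):
        size = 1 + secrets.randbelow(max_piece)  # 1..max_piece real bytes
        chunk = data[offset:offset + size]
        junk = secrets.token_bytes(secrets.randbelow(max_junk + 1))
        name = f"piece_{len(index):04d}.bin"  # hypothetical naming scheme
        pieces[name] = junk + chunk
        index.append((name, len(junk), len(chunk)))
        offset += len(chunk)
    return pieces, index


def merge_from_index(pieces, index) -> bytes:
    """Rebuild the original data by walking the index and skipping junk."""
    out = bytearray()
    for name, junk_len, data_len in index:
        blob = pieces[name]
        out += blob[junk_len:junk_len + data_len]
    return bytes(out)


if __name__ == "__main__":
    original = b"ciphertext bytes produced by the encryption step" * 10
    pieces, index = split_with_index(original)
    print(len(pieces), "pieces; round trip ok:",
          merge_from_index(pieces, index) == original)
```

Without the index, an observer holding the pieces sees neither their order nor where the junk ends, which is why the text treats the index file as the starting point of recovery and protects it separately.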
A TPM key binding-unbinding module may provide binding and unbinding functionality. Binding generally includes the operation of encrypting a message using a public key. That is, the sender uses the public key of the intended recipient to encrypt the message. The message is recoverable by decryption using the recipient's private key. When the private key is managed by the TPM as a non-migratable key, only the TPM that created the key may use it. Hence, a message encrypted with the public key is “bound” to a particular instance of a TPM. Keys may be considered communication endpoints, and improperly managed keys can result in loss of security. Thus, the TPM in accordance with the present disclosure aids in improving security by providing key management. The final index file encryption key is the root of the cryptographic process, so it is bound to the TPM for concrete protection. In the decryption process, the index file encryption key may be recovered using the corresponding private key by providing the correct identity to the TPM.
Finally, a TPM signature authorization module may be utilized so that, given the nature of cloud storage, only authenticated users with a unique AIK can access the data on the server and recover the whole file. The TPM signature authorization scheme may defeat unauthorized recovery of distributed index file pieces from the server. At the end of the encryption process, the TPM generates a 2048-bit RSA Attestation Identity Key, and the public key may be sent to the server for storage. At the start of the decryption process, the TPM has to provide the server with a signature generated using the previously generated key. After the signature is verified by the server using the stored public key, the index file pieces can be sent back to the client for the next operation of the decryption process.
According to exemplary implementations, to use the information system application, the first operation is user registration. User registration includes several parts, including password registration and cryptographic key registration. Table 18 lists exemplary user registration information.
To register the AIK, a user may assign proper usage and migration policies to it, and the policies are protected by a user-created secret. After the AIK is created, the TPM binding key also has to be created for key binding. As with the AIK, the user has to register a unique UUID to refer to the binding key and, for the assigned usage and migration policies, an authentication secret that authorizes using and migrating the binding key.
Alternative methods of user password input may be vulnerable to a key logger or screen capturer. For example, the password can be stolen by a key logger, by phishing or by shoulder-surfing. For example, the key logger code may log all the key strokes at the operating system level so that such logs are delivered to some adversaries who analyze what the victim has keyed in their system, and then try to extract the user password. Such a key logger may be very effective if the user typed their password in an unsafe machine on which the key logger is installed.
In accordance with the present disclosure, a password protection scheme is designed which involves random number generation and server-client collaborative authentication. In this exemplary implementation, no single one of these machines possesses the password, so this scheme can effectively guard against key loggers and screen capturers.
Each digit of the password may be input alternately on the Android phone and the client PC, and at the same time each of the digits may be mapped into a random number sequence of a certain length and sent to the server for verification. On the server side, the SHA-256 hash value of the correct random number sequence is compared to the hash value of the received random number sequence. The result of the comparison is sent back to the PC and the Android device for the next operation. If the result is correct, a confirmation signal may be sent back to the PC and the Android device, and the cryptographic processing interface may be displayed. If the result is wrong, a system exit signal may be sent and an error message may be given.
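The logon flow above can be sketched as follows. The text does not specify how digits are mapped to random number sequences, so this sketch assumes a per-user table fixed at registration; that table, the 8-byte sequence length, the use of a counter in place of real timestamps, and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets

SEQ_LEN = 8  # assumed length of each random number sequence, in bytes


def make_digit_table() -> dict:
    """Hypothetical registration step: one random sequence per digit 0-9."""
    return {str(d): secrets.token_bytes(SEQ_LEN) for d in range(10)}


def enroll(password: str, table: dict) -> bytes:
    """Reference value kept by the server: SHA-256 over the sequences
    of the correct password, in input order."""
    return hashlib.sha256(b"".join(table[c] for c in password)).digest()


def device_messages(password: str, table: dict):
    """Alternate digits between phone and PC, tagging each with a timestamp.

    The loop counter stands in for a real timestamp in this sketch.
    """
    phone, pc = [], []
    for t, ch in enumerate(password):
        (phone if t % 2 == 0 else pc).append((t, table[ch]))
    return phone, pc


def server_verify(phone_msgs, pc_msgs, reference: bytes) -> bool:
    """Reorder by timestamp, hash the combined sequence, and compare."""
    ordered = sorted(phone_msgs + pc_msgs, key=lambda m: m[0])
    digest = hashlib.sha256(b"".join(seq for _, seq in ordered)).digest()
    return hmac.compare_digest(digest, reference)


if __name__ == "__main__":
    table = make_digit_table()
    reference = enroll("493817", table)
    phone, pc = device_messages("493817", table)
    print("accepted:", server_verify(phone, pc, reference))
```

In this sketch each input device only ever handles half of the digits, and the server only sees random sequences and a hash, which mirrors the property that no single machine holds the password.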
In exemplary implementations in accordance with the present disclosure, enhanced security is provided by selecting Advanced Encryption Standard (AES) as the encryption-decryption scheme. AES provides a robust replacement for the Data Encryption Standard (DES) and to a lesser degree Triple DES. AES supports key sizes of 128, 192 and 256 bits. It is implementable in hardware and software, as well as in restricted environments (for example, in a smart card) and offers good defenses against various attack techniques. Until May 2009, the only successful published attacks against the full AES were side-channel attacks on some specific implementations. The design and strength of all key lengths of the AES algorithm (i.e., 128, 192 and 256) are sufficient to protect classified information up to the SECRET level. TOP SECRET information will require use of either the 192 or 256 key lengths.
The generation of a master key has two operations in this illustrative embodiment. First, the Android phone and the client PC will both generate an AES key based on password and random salts. Then the Android phone generated AES key may be sent to the client PC and the two AES keys will derive a master key by an XORing operation.
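The two-step master key generation can be sketched as follows. The text does not name the key derivation function, so PBKDF2-HMAC-SHA256, the iteration count, and the function names here are assumptions chosen only to make the sketch runnable.

```python
import hashlib
import secrets


def derive_device_key(password: str, salt: bytes,
                      iterations: int = 100_000) -> bytes:
    """Derive a 256-bit key from a password and salt.

    PBKDF2-HMAC-SHA256 is an assumed KDF; the source only says the key
    is based on the password and random salts.
    """
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                               salt, iterations, dklen=32)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Combine two equal-length keys by bytewise XOR."""
    if len(a) != len(b):
        raise ValueError("keys must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


if __name__ == "__main__":
    phone_key = derive_device_key("phone-password", secrets.token_bytes(16))
    pc_key = derive_device_key("pc-password", secrets.token_bytes(16))
    master_key = xor_bytes(phone_key, pc_key)
    print(len(master_key) * 8, "bit master key")
```

Because XOR is its own inverse, neither device key alone reveals the master key, and the master key can only be rebuilt when both device keys are available.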
After the master key has been generated, the encryption process can take place. To offer protection from side channel attacks, disclosed embodiments use GCM mode operation. Galois/Counter Mode (GCM) is a block cipher mode of operation that uses universal hashing over a binary Galois field to provide authenticated encryption. GCM was designed originally as a way of supporting very high data rates, since it can take advantage of pipelining and parallel processing techniques to bypass the normal limits imposed by feedback MAC algorithms. This allows authenticated encryption at data rates of many tens of Gbps, permitting high grade encryption and authentication on systems which previously could not be fully protected. Software implementations can achieve enhanced performance by using table-driven field operations. AES-GCM is an authenticated encryption algorithm designed to provide both authentication and privacy. The AES-GCM mode has four inputs (the secret key, the initialization vector (IV), the plaintext, and the additional authenticated data) and two outputs (the ciphertext and the authentication tag).
According to exemplary implementations in accordance with the present disclosure, a file splitting and merging scheme can further defeat the attacker trying to access the full information. Only the owner of files knows where the pieces are and can recover them.
After the index file is encrypted and split, the index file encryption key may be left relatively unprotected. The storage and protection of this key is potentially problematic for security software that resides only in a software process: the cryptographic keys can only be stored on the hard drive, which means that running the whole security process in software is akin to leaving a spare front door key somewhere in the yard, where one is relying on being able to think of a key-sized hiding place that a burglar won't find. Thus, a hardware security feature is introduced in systems in accordance with the present disclosure.
A Trusted Platform (TP) may include a computing platform that has a trusted component, which is used to create a foundation of trust for software processes. The TPM provides the root of trust for identity based on the endorsement key inside it, which is created uniquely by the manufacturer and can never be read or modified. Based on the root of trust, a tree of trust, which is the name for the key infrastructure inside the TPM, can be created inside the TPM, where it is protected from malicious attacks by any outside software. At the root of the trust tree is the Storage Root Key (SRK). Any newly created key can be protected by the SRK or by a previously created key, and all the keys can only be used inside the TPM by an authenticated user based on the settings. This means the TPM provides a reference point for protection.
In accordance with the present disclosure, based on these features of the TPM, the disclosed systems utilize the binding function of the TPM. Binding generally includes the operation of encrypting a message using a public key. That is, the sender uses the public key of the intended recipient to encrypt the message. The message is only recoverable by decryption using the recipient's private key. When the private key is managed by the TPM as a non-migratable key, only the TPM that created the key may use it. Hence, a message encrypted with the public key is “bound” to a particular instance of a TPM. In accordance with the present disclosure, after the index file is encrypted, the index file encryption key is bound to the TPM by using a TPM-generated 2048-bit RSA binding key, and this binding key is further protected by the SRK inside the TPM.
The TPM generates cryptographic keys but due to the low cost nature the internal memory (i.e. number of key slots) is limited. Nevertheless applications might need to store keys permanently. With the key management component of the TSS it is possible to store keys in a persistent storage (file system) outside the TPM encrypted under a parent key. To do so the user must provide this parent key before the TPM can create a new key pair. Before the TPM writes to the persistent storage it encrypts the new private key under its parent key to ensure that no unencrypted key leaves the TPM.
The root of the key hierarchy is the storage root key (SRK) which is generated at taking ownership and then stored inside the TPM permanently.
In the creation of the binding key, disclosed systems may assign a (possibly globally) unique identifier called a UUID to the key and register the key with the UUID. The key blobs are then stored in persistent storage in the OS file system. Later, the program can use this UUID as a reference to the requested key. The disclosed systems may also assign a non-migratable policy to the key object, which means the key can never be migrated and can only be used by this specific TPM.
As explained above, TPM is based on root of trust. That is, much of the value (or trust) associated with the TPM comes from the fact that the EK is unique and that it is protected within the TPM at all times. This property is certified by the Endorsement Certificate (Cert).
The Endorsement Key (EK) is a public/private key-pair. The key-pair will generally have a modulus (a.k.a. key size) of 2048 bits. The private component of the key-pair is generated within the TPM by the manufacturer and is never exposed outside the TPM. TPM manufacturers will provide the endorsement key pair and store it in tamper resistant non-volatile memory before shipping the TPM. A certificate, or endorsement credential, can then be created which contains the public EK and information about the security properties of the TPM. The EK is unique to the particular TPM, and therefore to the particular platform, which supports the root of trust. Based on these features, the TPM can be used to provide identity authentication.
To add another layer of security to the disclosed systems, a specific server-client authentication scheme may be used that utilizes the TPM with which the client PC is equipped. Generally, in the decryption process, the server-stored index .piece file will not be sent back to the client unless the TPM identity has been authenticated. Implementations in accordance with the present disclosure can thereby prevent an attacker from bypassing the protection of the TPM and recovering the index file. The exemplary features providing for such a design are discussed below.
In certain implementations, the AIK is kept secret and is unique, so that it represents the user and the TPM. First, it is created as a 2048-bit RSA key. Second, it is protected by the SRK, which is the root of the trust tree. Third, it is registered with a unique identifier (UUID), and both the UUID and the AIK are saved on persistent storage, e.g., a USB stick or a hard drive, so only the user possesses them. Finally, a usage policy with a secret and a migration policy are assigned to the AIK. As illustrated above,
In the decryption process, the identity authentication may take place first between server and client before the index .piece file can be sent back to the client PC, according to the following operations.
a) The server uses a random number generator program, which uses the system time as its seed, to generate a new nonce. At the same time, the TPM uses its internal random number generator to generate a new nonce.
b) The server and the client each send their newly generated nonce to the other.
c) Both the server and the client XOR the self-generated nonce with the received nonce to obtain a new XORed nonce, and then each generates the SHA-1 hash value of it.
d) The TPM retrieves the AIK based on the UUID, signs the hash value using the private part of the AIK, and then sends the signature to the server for verification.
e) The server receives the signature and uses the saved public key and its own generated hash value to verify the signature; the result is sent back to the client for the next operations.
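Operations (a) through (c) can be sketched as follows; the signing and verification in operations (d) and (e) take place with the AIK inside the TPM and are not modeled here. The 20-byte nonce length and the function names are assumptions, and the sketch draws both nonces from the operating system's random source rather than from a time-seeded generator or a TPM.

```python
import hashlib
import secrets

NONCE_LEN = 20  # assumed nonce length in bytes


def new_nonce() -> bytes:
    """Operation (a): generate a fresh nonce (OS randomness in this sketch)."""
    return secrets.token_bytes(NONCE_LEN)


def challenge_hash(own_nonce: bytes, received_nonce: bytes) -> bytes:
    """Operation (c): XOR the two nonces, then take the SHA-1 hash."""
    if len(own_nonce) != len(received_nonce):
        raise ValueError("nonces must have equal length")
    xored = bytes(a ^ b for a, b in zip(own_nonce, received_nonce))
    return hashlib.sha1(xored).digest()


if __name__ == "__main__":
    server_nonce = new_nonce()
    client_nonce = new_nonce()
    # Operation (b): each side sends its nonce to the other.
    server_side = challenge_hash(server_nonce, client_nonce)
    client_side = challenge_hash(client_nonce, server_nonce)
    print("hashes agree:", server_side == client_side)
```

Because XOR is commutative, both sides arrive at the same hash without transmitting it, and because each run uses fresh nonces, a signature captured in one session cannot be replayed in another.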
The Trusted Computing Group (TCG) publishes specifications defining architectures, functions and interfaces that provide a baseline for a wide variety of computing platform implementations. Additionally, the TCG will publish specifications describing specific platform implementations such as the personal computer, PDA, cellular telephones and other computing equipment.
A Trusted Platform may include a computing platform that has a trusted component, sometimes in the form of built-in hardware, which it uses to create a foundation of trust for software processes. Platforms based on the TCG specifications will generally meet functional and reliability standards that allow increased assurances of trust. The TCG will publish evaluation criteria and platform specific profiles that may be used as a common yard stick for evaluating devices incorporating TCG technology. Achieving improved trust also requires operational integrity of maintenance processes after deployment.
The TPM provides a set of crypto capabilities that allow certain crypto functions to be executed within the TPM hardware. Hardware and software agents outside of the TPM do not have access to the execution of these crypto functions within the TPM hardware, and as such, can only provide I/O to the TPM.
In case of the PC platform, the hardware TPM is part of the mainboard and may not easily be removed or replaced. It is typically connected to the rest of the system via the LPC bus. The functionality of this hardware device resembles that of a smart card. A tamper resistant casing contains low-level blocks for asymmetric key cryptography, key generation, cryptographic hashing (SHA-1) and random number generation. With these components it is able to keep secret keys protected from any remote attacker. Additional high-level functionality consists of protected non-volatile storage, integrity collection, integrity reporting (attestation) and identity management. TPM is a passive device, a receiver of external commands. It does not measure system activity by itself but rather represents a trust anchor that cannot be forged or manipulated.
The I/O component manages information flow over the communications bus. It performs protocol encoding/decoding suitable for communication over external and internal buses. It routes messages to appropriate components. The I/O component enforces access policies associated with the Opt-In component as well as other TPM functions requiring access control.
Non-volatile storage is used to store the Endorsement Key (EK), the Storage Root Key (SRK), owner authorization data and persistent flags. Platform Configuration Registers (PCRs) can be implemented in either volatile or non-volatile storage. They are reset at system start or whenever the platform loses power. TCG specifies a minimum number of registers to implement (16). Registers 0-7 are reserved for TPM use. Registers 8-15 are available for operating system and application use.
Attestation Identity Keys (AIK) must be persistent, but it is recommended that AIK keys be stored as Blobs in persistent external storage (outside the TPM), rather than stored permanently inside TPM non-volatile storage. TCG hopes TPM implementers will provide ample room for many AIK Blobs to be concurrently loaded into TPM volatile memory as this will speed execution.
Program code contains firmware for measuring platform devices. Logically, this is the Core Root of Trust for Measurement (CRTM). Ideally, the CRTM is contained in the TPM, but implementation decisions may require it be located in other firmware.
The TPM contains a true random-bit generator (RNG) used to seed random number generation. The RNG is used for key generation, nonce creation and to strengthen pass phrase entropy.
A SHA-1 message digest engine is used for computing signatures, creating key Blobs and for general purpose use.
TCG provides the RSA algorithm for use in TPM modules. Its recent release into the public domain combined with its long track record makes it a good candidate for TCG. The RSA key generation engine is used to create signing keys and storage keys. TCG requires a TPM to support RSA keys up to a 2048-bit modulus, and mandates that certain keys (the SRK and AIKs, for example) must have at least a 2048-bit modulus.
An Opt-In component implements TCG policy providing that TPM modules are shipped in the state the customer desires. This ranges from disabled and deactivated to fully enabled and ready for an owner to take possession. The Opt-In mechanism maintains logic and (if necessary) interfaces to determine physical presence state and ensure disabling operations are applied to other TPM components as needed.
An execution engine runs program code. It performs TPM initialization and measurement taking.
The TCG main specification may not provide the communications interfaces or bus architectures. These may be considered implementation decisions documented in the Platform Specific Specification(s). However, TCG does provide an interface serialization transformation that can be transported over virtually any bus or interconnect.
TCG provides that the TPM be physically protected from tampering. This includes physically binding the TPM module to the other physical parts of the platform (e.g., motherboard) such that it cannot be easily disassembled and transferred to other platforms. These mechanisms are intended to resist tampering. Tamper evidence measures are to be employed. Such measures enable detection of tampering upon physical inspection.
To implement the root of trust, the TPM may use an internal tree of trust for key management to extend its trust to other parts of the platform. There are different types of keys, including the Storage Root Key (SRK), Endorsement Key (EK), Attestation Identity Key (AIK), Signing Key, Storage Key, Bind Key, Legacy Key, and Authentication Key.
The Endorsement Key (EK) is a public/private key-pair. The size of the key-pair is mandated to have a modulus (a.k.a. key size) of 2048 bits. The private component of the key-pair is generated within the TPM and is never exposed outside the TPM. The EK is unique to the particular TPM and therefore the particular platform.
Much of the value (or trust) associated with the TPM comes from the fact that the EK is unique and that it is protected within the TPM at all times. This property is certified by the Endorsement Certificate (Cert). The same party that provides the EK may not provide the Endorsement Cert.
AIKs are used to provide platform authentication to a service provider. This is also called pseudo-anonymous authentication and is different from user authentication.
Signing keys are asymmetric general purpose keys used to sign application data and messages. Signing keys can be migratable or non-migratable. Migratable keys may be exported/imported between TPM devices. The TPM can sign application data and enforce migration restrictions.
Storage keys are asymmetric general purpose keys used to encrypt data or other keys. Storage keys are used for wrapping keys and data managed externally.
Bind keys may be used to encrypt small amounts of data (such as a symmetric key) on one platform and decrypt it on another.
Legacy Keys are keys created outside the TPM. They are imported to the TPM, after which they may be used for signing and encryption operations. They are migratable.
Authentication Keys are symmetric keys used to protect transport sessions involving the TPM.
The TPM may become a low cost commodity component, suitable for consumer class computing platforms. Therefore, the TPM itself may have limited runtime (volatile) and persistent (non-volatile) storage. TCG usage scenarios suggest unlimited storage may be advantageous. For this reason TPM external storage and a cache manager may be provided.
To allow for virtually unlimited keys and storage areas, the RTS packages keys destined for external storage into encrypted key Blobs. Key Blobs are opaque outside the TPM and may be stored on any available storage device (e.g., Flash, Disk, and Network File Server). Blob structures are bound to a particular TPM and may be sealed to a particular platform configuration as well. Blobs are referenced using a cryptographic hash of their contents, by handle, or by another suitable referencing mechanism. Reference identifiers disambiguate Blobs externally to the KCM or other application program performing the storage functions. Other information, including Key Type and Key Attribute, is available externally.
The TPM exposes interfaces that allow external programs the ability to manage the limited storage resources of the TPM. Management function is distinguished from application function by separating the ability to cache keys from the ability to use a key. Key Cache Managers (KCM) will generally only be concerned with caching keys, while applications may be concerned about key usage. A noted exception is storage keys which are used to protect other keys. The KCM will likely control both caching and use of storage keys.
Keys sealed to a particular platform configuration may be loaded even when the platform is outside the intended configuration. This allows flexibility in transitioning the platform between readiness states without impacting its ability to obtain needed keys. Security is maintained because the configuration is checked each time the key is used, hence loading need not be checked. The KCM tracks available key slots and determines when it is appropriate to expel a key and replace it with another. The TPM does not provide proactive notification when Key Slots are depleted or when applications need to use a particular key. As such, application programs may need to inform the KCM when such events occur, or the KCM may need to implement a TPM interface layer through which applications obtain TPM services. The TPM provides interfaces to prepare keys for transitioning between TPM and Storage Device. The KCM generally will not render keys in the clear.
Designers of secure distributed systems, when considering exchange of information between systems, should identify the endpoints of communication. The composition and makeup of the endpoint is as important to the overall security of the system as is the communications protocol. TCG designers assert endpoints are generally comprised of asymmetric keys, key storage and processing that protects protocol data items.
Classic message exchange based on asymmetric cryptography suggests that messages intended for one and only one individual can be encrypted using a public key. Furthermore, the message can be protected from tampering by signing with the private key. Keys are communication endpoints and improperly managed keys can result in loss of security. Additionally, improperly configured endpoints may also result in loss of security. The TPM aids in improving security by providing both key management and configuration management features such as Protected Storage, Measurement and Reporting. These features can be combined to “seal” keys and platform configuration making endpoint definition stronger.
TCG provides four classes of protected message exchange: Binding, Signing, Sealed-Binding (Sealing) and Sealed-Signing.
Binding is the traditional operation of encrypting a message using a public key. That is, the sender uses the public key of the intended recipient to encrypt the message. The message is only recoverable by decryption using the recipient's private key. When the private key is managed by the TPM as a non-migratable key, only the TPM that created the key may use it. Hence, a message encrypted with the public key is “bound” to a particular instance of a TPM. It is also possible to create migratable private keys that are transferable between multiple TPM devices. For such keys, binding has no special significance beyond encryption.
Signing, also in the traditional sense, associates the integrity of a message with the key used to generate the signature. The TPM tags some managed keys as signing-only keys, meaning these keys are only used to compute a hash of the signed data and encrypt the hash. Hence, they cannot be misconstrued as encryption keys.
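By way of non-limiting illustration, the binding operation described above can be sketched in software with the standard Java cryptography API. The sketch is not TPM code: the private key lives in ordinary process memory, whereas a TPM would keep it inside the chip. The class name BindingSketch, its method names, and the choice of OAEP padding are assumptions made for the example.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.util.Arrays;

/** Software-only illustration of binding; a real TPM never releases the private key. */
final class BindingSketch {
    // OAEP padding is an assumption of this sketch, not a detail taken from the text above.
    private static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-1AndMGF1Padding";

    /** Generates a 2048-bit RSA pair that stands in for a TPM bind key. */
    static KeyPair newBindingKeyPair() throws GeneralSecurityException {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        return gen.generateKeyPair();
    }

    /** Encrypts a small secret (such as a symmetric key) with the recipient's public key. */
    static byte[] bind(PublicKey recipient, byte[] smallSecret) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, recipient);
        return cipher.doFinal(smallSecret);
    }

    /** Recovers the secret; only the holder of the private key can do this. */
    static byte[] unbind(PrivateKey owner, byte[] bound) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, owner);
        return cipher.doFinal(bound);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPair pair = newBindingKeyPair();
        KeyGenerator aes = KeyGenerator.getInstance("AES");
        aes.init(256);
        SecretKey fileKey = aes.generateKey();

        byte[] bound = bind(pair.getPublic(), fileKey.getEncoded());
        byte[] recovered = unbind(pair.getPrivate(), bound);
        System.out.println("round trip ok: " + Arrays.equals(recovered, fileKey.getEncoded()));
    }
}
```

Because RSA can only encrypt a payload smaller than its modulus, the sketch binds a symmetric key rather than a whole message, which matches the use of bind keys for small amounts of data.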
Sealing takes binding one operation further. Sealed messages are bound to a set of platform metrics specified by the message sender. Platform metrics specify platform configuration state that must exist before decryption is allowed. Sealing associates the encrypted message (actually the symmetric key used to encrypt the message) with a set of PCR register values and a non-migratable asymmetric key.
A sealed message is created by selecting a range of PCR register values and asymmetrically encrypting the PCR values plus the symmetric key used to encrypt the message. The TPM with the asymmetric decryption key may only decrypt the symmetric key when the platform configuration matches the PCR register values specified by the sender. Sealing is a powerful feature of the TPM. It provides assurance that a protected message is only recoverable when the platform is functioning in a very specific known configuration.
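For illustration only, the relationship between PCR values and a sealed key can be simulated in plain Java. The sketch below is a software model under stated assumptions: the registers are an in-memory array, the extend rule is taken to be SHA-1 over the old value followed by the measurement, the storage key is an ordinary RSA key pair, and the caller passes the PCR selection again at unseal time. The class SealingSketch and all of its members are hypothetical names; a real TPM performs these operations inside the chip.

```java
import javax.crypto.Cipher;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.util.Arrays;

/** Software simulation of PCR-gated sealing; not a TPM interface. */
final class SealingSketch {
    private static final String RSA_OAEP = "RSA/ECB/OAEPWithSHA-1AndMGF1Padding";
    private static final int DIGEST_LEN = 20; // SHA-1 output size in bytes

    private final byte[][] pcrs = new byte[16][DIGEST_LEN]; // simulated registers, zero after reset
    private final KeyPair storageKey;                        // stand-in for a non-migratable key

    SealingSketch() throws GeneralSecurityException {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        storageKey = gen.generateKeyPair();
    }

    /** Extend: new register value = SHA-1(old register value || measurement). */
    void extend(int index, byte[] measurement) throws GeneralSecurityException {
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        sha1.update(pcrs[index]);
        pcrs[index] = sha1.digest(measurement);
    }

    /** Digest over the selected registers, standing in for a PCR composite. */
    private byte[] composite(int[] selection) throws GeneralSecurityException {
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        for (int i : selection) {
            sha1.update(pcrs[i]);
        }
        return sha1.digest();
    }

    /** Encrypts (composite digest || symmetric key) under the storage public key. */
    byte[] seal(int[] selection, byte[] symmetricKey) throws GeneralSecurityException {
        byte[] payload = Arrays.copyOf(composite(selection), DIGEST_LEN + symmetricKey.length);
        System.arraycopy(symmetricKey, 0, payload, DIGEST_LEN, symmetricKey.length);
        Cipher cipher = Cipher.getInstance(RSA_OAEP);
        cipher.init(Cipher.ENCRYPT_MODE, storageKey.getPublic());
        return cipher.doFinal(payload);
    }

    /** Releases the symmetric key only if the current registers match the sealed digest. */
    byte[] unseal(int[] selection, byte[] sealed) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(RSA_OAEP);
        cipher.init(Cipher.DECRYPT_MODE, storageKey.getPrivate());
        byte[] payload = cipher.doFinal(sealed);
        byte[] expected = Arrays.copyOf(payload, DIGEST_LEN);
        if (!MessageDigest.isEqual(expected, composite(selection))) {
            throw new SecurityException("platform configuration does not match sealed PCR values");
        }
        return Arrays.copyOfRange(payload, DIGEST_LEN, payload.length);
    }
}
```

In this model, extending any selected register after sealing changes the composite digest, so a later unseal attempt is refused until the platform is reset and measured into the same state again.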
Signing operations can also be linked to PCR registers as a way of increasing the assurance that the platform that signed the message meets a specific configuration requirement. The verifier mandates that a signature must include a particular set of PCR registers. The signer, during the signing operation, collects the values for the specified PCR registers and includes them in the message, and as part of the computation of the signed message digest. The verifier can then inspect the PCR values supplied in the signed message, which is equivalent to inspecting the signing platform's configuration at the time the signature was generated.
As illustrated in the system design discussions above, this exemplary implementation consists of five modules, each of which provides a different layer of security. Java is chosen as the computer language to provide all these functions, although other programming languages are within the scope of the present disclosure. The RSA BSAFE® Share for Java security library provides the programming API in this exemplary implementation. To communicate with the TPM security chip, the IAIK TCG Java software stack, which is a Java implementation of the TCG software stack, was chosen as the API for TPM programming. More exemplary implementation details are discussed herein.
In a synchronous logon system design, the password is never transmitted directly among the client, the Android device, and the server, and no one of the three devices possesses the whole password. Instead, random number sequences are used to represent the digits of the password and are sent to the server for verification. Before the logon process starts, a random number sequence table should therefore be generated on both the client PC and the Android device. The tables on the two devices are different.
After the random number sequence tables have been generated on both the client and the Android device, the random number sequence mapping of the password is sent to the server. The server calculates the SHA-256 hash value of the mapping and stores it on the server side for password verification.
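The table generation and server-side check described above might be sketched as follows. This is a minimal sketch under several assumptions that do not come from the disclosure: each digit maps to an 8-byte sequence, even-numbered digit positions are entered on the client PC and odd-numbered positions on the Android device, and the mapping is assembled in one place for brevity rather than being sent from two devices. LogonMappingSketch and its members are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

/** Illustrative digit-to-random-sequence mapping with SHA-256 verification. */
final class LogonMappingSketch {
    static final int SEQUENCE_LENGTH = 8; // bytes per digit; an assumed value

    /** Builds a table holding one random sequence for each digit 0-9. */
    static byte[][] newTable(SecureRandom rng) {
        byte[][] table = new byte[10][SEQUENCE_LENGTH];
        for (byte[] row : table) {
            rng.nextBytes(row);
        }
        return table;
    }

    /** Maps the password digits alternately through the PC table and the phone table. */
    static byte[] map(String digits, byte[][] pcTable, byte[][] phoneTable) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < digits.length(); i++) {
            int d = Character.digit(digits.charAt(i), 10);
            if (d < 0) {
                throw new IllegalArgumentException("password must contain digits only");
            }
            byte[][] table = (i % 2 == 0) ? pcTable : phoneTable;
            out.write(table[d], 0, SEQUENCE_LENGTH);
        }
        return out.toByteArray();
    }

    /** Hash stored by the server at enrollment and recomputed at logon. */
    static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
    }

    /** Server-side check: compares the stored hash with the hash of the received mapping. */
    static boolean verify(byte[] storedHash, byte[] receivedMapping) {
        return MessageDigest.isEqual(storedHash, sha256(receivedMapping));
    }

    public static void main(String[] args) {
        SecureRandom rng = new SecureRandom();
        byte[][] pcTable = newTable(rng);
        byte[][] phoneTable = newTable(rng);
        byte[] stored = sha256(map("4711", pcTable, phoneTable));
        System.out.println("correct password accepted: " + verify(stored, map("4711", pcTable, phoneTable)));
        System.out.println("wrong password accepted:   " + verify(stored, map("4712", pcTable, phoneTable)));
    }
}
```

With this arrangement the server holds only a hash of random sequences, and each device holds only the table for the digits entered on it.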
The key used for user file encryption and decryption is generated by a random number generator that uses the system time, with a random salt added, as the seed for random number generation.
To meet the requirement for protecting sensitive data and to achieve a higher security level, AES in GCM mode (AES-GCM) was chosen as the encryption method for this exemplary implementation.
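As a non-authoritative sketch, AES-GCM encryption and decryption can be expressed with the standard javax.crypto API as shown below. The 12-byte IV, the 128-bit authentication tag, and the use of the platform KeyGenerator (rather than the time-seeded generator described above) are assumptions of the example, and the class name AesGcmSketch is hypothetical.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

/** Illustrative AES-256-GCM helper built on the standard Java API. */
final class AesGcmSketch {
    static final int IV_BYTES = 12;  // assumed IV length
    static final int TAG_BITS = 128; // assumed authentication tag length

    /** Creates a 256-bit AES key using the platform's default secure random source. */
    static SecretKey newKey() throws GeneralSecurityException {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(256);
        return gen.generateKey();
    }

    /** Creates a fresh IV; an IV must not be reused with the same key. */
    static byte[] newIv() {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    /** Returns the ciphertext with the authentication tag appended. */
    static byte[] encrypt(SecretKey key, byte[] iv, byte[] plaintext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        return cipher.doFinal(plaintext);
    }

    /** Verifies the tag and returns the plaintext; throws if the data was modified. */
    static byte[] decrypt(SecretKey key, byte[] iv, byte[] ciphertext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        return cipher.doFinal(ciphertext);
    }
}
```

The key and IV produced here correspond to the values that the index file is described as recording, since both are needed again for decryption.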
The file splitting may take place after the file encryption, and the file merging may be run before the file is sent to decryption. The file splitting and merging can each be run twice: once for the encrypted personal file and once for the encrypted index file. In this exemplary implementation, the splitting function randomly takes ⅕ of the data from the original file and replaces it with randomly generated bytes. The data taken out is stored in a piece file. The file with the modified contents is then stored in a bulk file. The original file name, the key and IV used for encryption, and the random byte locations are all stored in the index file, which is split later.
Random number generation is important for cryptographic key creation. To get better randomness and therefore better security, the 64-bit system time (GMT) combined with a salt and the password is used in this implementation as the seed for the random number generator.
In the file splitting process, the random number generator is used to provide the random position that serves as the start point for content extraction.
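The splitting and merging steps can be illustrated on in-memory byte arrays. The sketch below assumes that the extracted fifth is one contiguous run starting at the random position, and it uses SecureRandom in place of the seeded generator described above; both are simplifications for the example. SplitSketch, Split, and the field names are hypothetical.

```java
import java.security.SecureRandom;
import java.util.Arrays;

/** Illustrative split of encrypted data into a bulk part and a piece part. */
final class SplitSketch {

    /** Bulk data (with filler), extracted piece, and the offset recorded in the index file. */
    static final class Split {
        final byte[] bulk;
        final byte[] piece;
        final int offset;

        Split(byte[] bulk, byte[] piece, int offset) {
            this.bulk = bulk;
            this.piece = piece;
            this.offset = offset;
        }
    }

    /** Extracts one fifth of the data at a random offset and overwrites it with random bytes. */
    static Split split(byte[] encrypted, SecureRandom rng) {
        int pieceLength = encrypted.length / 5;
        int offset = rng.nextInt(encrypted.length - pieceLength + 1);
        byte[] piece = Arrays.copyOfRange(encrypted, offset, offset + pieceLength);

        byte[] filler = new byte[pieceLength];
        rng.nextBytes(filler);
        byte[] bulk = encrypted.clone();
        System.arraycopy(filler, 0, bulk, offset, pieceLength);
        return new Split(bulk, piece, offset);
    }

    /** Restores the original data by writing the piece back over the filler. */
    static byte[] merge(Split split) {
        byte[] restored = split.bulk.clone();
        System.arraycopy(split.piece, 0, restored, split.offset, split.piece.length);
        return restored;
    }

    public static void main(String[] args) {
        SecureRandom rng = new SecureRandom();
        byte[] encrypted = new byte[100];
        rng.nextBytes(encrypted);
        Split s = split(encrypted, rng);
        System.out.println("piece bytes: " + s.piece.length + ", offset: " + s.offset);
        System.out.println("merge restores input: " + Arrays.equals(encrypted, merge(s)));
    }
}
```

Without the piece and the recorded offset, the bulk data alone contains filler where the extracted bytes used to be, which is why the index file must hold the offset.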
As illustrated above, the index file plays an important role in the system design. It contains paths to the file pieces and the keys to decrypt them. Thus, the index file may be encrypted by AES-GCM to be protected. The protection of the index file encryption key is also important. In this design, the TPM security chip provides hardware protection for the index file encryption key: TPM binding is used to bind the key to the TPM.
After the index file is split into a .piece file and a .bulk file, the .piece file is sent to the server for storage, which makes it more difficult for an attacker to recover the index file pieces. In this illustrative embodiment, the TPM is used to further prevent attackers from recovering the whole index file without the identity of the TPM.
To do this, the TPM Attestation Identity Key (AIK) is used as the identity key of the TPM. Every time the PC requests the .piece file back, it has to identify the particular TPM to the server, which has the public AIK stored.
To show that the TPM is the specific TPM that the user owns, the AIK is used to generate a signature, which is sent to the server. When the server has verified the signature and confirmed that the TPM is the correct one, it sends the index .piece file back to the client PC for the file unprotection process.
Before the TPM uses the AIK to sign, a new Nonce has to be generated; the hash value of the Nonce is what the AIK signs.
To generate the new Nonce, the client and server each generate a nonce separately. The client uses the TPM random number generator to generate its nonce, and the server uses a software random number generator that uses the system time as the seed. After the two nonces are generated, the client and server exchange them, and the final Nonce is then generated on both sides by XORing the nonce from the client with the nonce from the server.
Finally, the SHA-1 hash value of the Nonce is generated and the AIK signs it for server verification, as shown in
Implementations in accordance with the present disclosure use Java as the programming language due to its advantages. Java is a general-purpose, concurrent, class-based, object-oriented computer programming language that is designed to have few implementation dependencies. It is intended to let application developers “write once, run anywhere” (WORA), meaning that code that runs on one platform does not need to be recompiled to run on another. Java applications are typically compiled to bytecode (class files) that can run on any Java virtual machine (JVM) regardless of computer architecture. Java is one of the most popular programming languages in use. Java's robustness, ease of use, cross-platform capabilities and security features make it well suited to World Wide Web solutions. The ability to run the same program on many different systems is crucial to World Wide Web software, and Java succeeds at this by being platform independent at both the source and binary levels. In addition, the Java-based Android phone is a key element of this design for achieving better security, which makes Java a natural choice. However, it should be understood that systems, components, and methodologies in accordance with the present disclosure may also be implemented with any other suitable programming language.
In at least one exemplary implementation, instead of using the Sun security library, RSA BSAFE® Share for Java Platform (Share for Java) may be chosen as the toolkit for security implementation.
Share for Java provides various security features including cryptography, Public Key Infrastructure (PKI), and Transport Layer Security (TLS). Using cryptography, algorithms provide encryption, digital signatures, message digests and Pseudo Random Number Generation (PRNG). Using PKI technology, Digital Certificates may be used to identify secure servers on the Internet and are used with encrypted and signed email. TLS technology provides the security for secure https connections over the Internet.
Share for Java is a Java security toolkit. Share for Java contains two jar files: shareCrypto.jar: Cryptographic and PKI functionality implemented as a Java Cryptographic Extension (JCE) provider; and shareTLS.jar: SSL v3.0, TLS v1.0, v1.1 and v1.2 functionality implemented as a Java Secure Sockets Extension (JSSE) provider.
Before installation of the Share for Java toolkit, the correct Java Cryptography Extension (JCE) Jurisdiction Policy Files may be downloaded and installed in two operations: (1) extract the local_policy.jar and US_export_policy.jar files from the downloaded .zip file; and (2) copy local_policy.jar and US_export_policy.jar to the <jdk install dir>/jre/lib/security directory, overwriting the existing policy files.
In accordance with Share for Java installation procedures, the Share for Java binary distribution directory structure is copied into a suitable location on the target system, and the Share for Java toolkit .jar files, shareCrypto.jar and shareTLS.jar, are added to the class path.
To statically register the Share for Java JCE and JSSE providers, shareCrypto.jar and shareTLS.jar are copied to the <jdk install directory>/jre/lib/ext directory, and the JCE and JSSE providers are added to the provider list in the <jdk install directory>/jre/lib/security/java.security file using the two lines below:
All of the subsequent provider entries are then modified, changing the value of n in security.provider.n so that the providers are in ascending order and each provider has a unique number.
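One way to check the resulting provider order, offered here only as a sketch, is to list the registered providers at run time through the standard java.security.Security class. The class name ProviderListSketch is hypothetical, and the names printed depend on the JDK and on which toolkits are installed; no Share for Java provider names are assumed.

```java
import java.security.Provider;
import java.security.Security;

/** Lists the registered security providers in preference order. */
final class ProviderListSketch {

    /** Returns provider names; index 0 is the provider with the highest preference. */
    static String[] providerNames() {
        Provider[] providers = Security.getProviders();
        String[] names = new String[providers.length];
        for (int i = 0; i < providers.length; i++) {
            names[i] = providers[i].getName();
        }
        return names;
    }

    public static void main(String[] args) {
        String[] names = providerNames();
        for (int i = 0; i < names.length; i++) {
            // Position i + 1 corresponds to the n in security.provider.n.
            System.out.println("position " + (i + 1) + ": " + names[i]);
        }
    }
}
```

If two entries in java.security were given the same number, or a provider class could not be loaded, the listing would show a provider missing from its expected position.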
The Trusted Computing Group (TCG) specifies the Trusted Platform Module (TPM) and the accompanying software infrastructure called TCG Software Stack (TSS). This system software defines interfaces to applications written in the C language. IAIK Java TCG Software Stack makes the TSS available to Java developers in a consistent and object oriented way.
The Trusted Computing Group (TCG) designed the TSS as the default mechanism for applications to interact with the TPM. In addition to forwarding application requests to the TPM the TSS provides a number of other services such as concurrent TPM access or a persistent storage on the hard disk for cryptographic keys generated inside the TPM.
TPMs are required to provide protected capabilities and at the same time are designed as low cost devices. Due to their inexpensive nature, the internal resources and external interfaces are kept to a minimum.
The TPM device driver (TDD) resides in the Kernel space. For a 1.1b TPM this driver is vendor specific since it just offers a proprietary interface to upper layers, whereas 1.2 TPMs support generic TPM Interface Specification (TIS) drivers. TIS drivers provide a vendor-independent interface to access TPM functionality. Depending on the platform and the operating system, the TDD may also support additional functionality such as power management. Nowadays, all major operating systems ship with TIS drivers or at least support them.
The TSS Device Driver Library (TDDL) resides in User space. From the user's point of view it exposes an OS and TPM independent set of functions that allow a basic interaction with the TPM. This includes sending commands as byte streams to the TPM and receiving the TPM's responses. The TCG specifies the TDDL Interface (TDDLI) as a required set of functions implemented in the TDDL. The intention was to offer a standardized TPM interface regardless of the TPM vendor and the accompanying TPM device driver. This ensures that different TSS implementations can communicate with any given TPM. In contrast, the communication between the TDDL and the TPM is vendor specific. The TDDL is designed as a single-instance and single-threaded component.
The jTSS can operate on major Operating Systems used today, including releases of Windows, such as Windows 8 or Windows Server 2012.
The Linux OS implements the TDDL such that it opens the TPM device file (/dev/tpm*) provided by the underlying driver. Microsoft ships Windows Vista with a generic TIS driver that accesses the TPM via the so called TPM Base Services (TBS). This service interface should allow similar access to the TPM as the device file under Linux does.
For the implementation, a context object serves as entry point to all functionality such as authorized and validated TPM commands, policy and key handling, data hashing, encryption, and PCR composition. The TSP can also be used to integrate the TPM in cryptographic libraries like PKCS#11.
The Java programming language has evolved in recent years into a commonly accepted environment. Its main advantages are strict type safety and memory safety, which make it well suited for security-relevant applications.
Although the basic concepts and functionality of the native TSP remain the same in its Java counterpart, several aspects were changed to fit the object-oriented nature of Java. TSS entities such as contexts, keys, hashes, or the TPM are represented by actual Java objects. This relieves developers from the object handles and memory management required in the original TSP. The Java interface provides all the flexibility and features of the underlying stack to Java developers. Existing resources such as TSPI-based C code can therefore easily be mapped to Java. Some relevant classes are described below.
TclContext: A context represents a connection to the TSS Core Services. One can either connect to a local or a remote TCS. A context allows specifying the connection host. The context creates all further TSS objects like policy objects and registers, loads or unregisters keys from the persistent storage. The context can close objects (release their handles), get information (capabilities) about the TCS as well as free TSS memory.
TclTPM: This class represents the TPM and parts of its functionality. It provides methods to take or clear TPM ownership, read and set the TPM status, obtain random numbers from the TPM, access time stamping functions, or read and extend PCR registers. Aside from low level functions, e.g., trigger a TPM self-test, it offers functions to create “attestation identities”. Further, it can do quote operations to attest the current state of the platform represented by the contents of the PCR registers.
TclRsaKey: Instances of this class represent keys in the TPM's key hierarchy. It provides functionality to create a new key, load a key into a key slot of the TPM, or certify keys.
TclEncData: This class provides access to the TPM's bind/unbind and seal/unseal functions, which encrypt data with a TPM key. If this key is not migratable, only the TPM that did the bind operation is able to unbind the data. It is computationally infeasible to decrypt the data if the TPM, and therefore the corresponding private key, is no longer available. Sealing takes this concept one operation further: it includes the platform configuration when encrypting data with a TPM key. As a result, the sealed data can only be unsealed if the platform is in the state specified at seal time. The platform configuration is represented by the content of the TPM's PCRs.
TclHash: This class provides access to the TSS's hash algorithm SHA1. That includes unkeyed hash calculation and verification as well as keyed hash functions, e.g., create signatures of data blocks with a TPM key.
TclPcrComposite: The platform configuration registers (PCRs) can be used to attest the state of a platform (quote operation) or to seal data to a specific configuration. Instances of this class select one or more PCRs and hand them to the quote or seal functions.
TclPolicy: The policy class handles authorization data for TSS objects such as keys. The authorization data consists of the SHA-1 hash of the user password. Note that different character encodings (ASCII, UTF-16LE Unicode, etc.) will hash to different values; UTF-16LE Unicode without a zero string termination should be used. As an alternative to setting a password, a pop-up window will ask the user to enter the appropriate secret.
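The dependence of the authorization data on character encoding can be demonstrated with the standard MessageDigest class alone, without any TSS calls. In the sketch below, AuthDataSketch and its methods are hypothetical helper names; the only behavior shown is that the same password hashes to different SHA-1 values under ASCII and UTF-16LE.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Shows how password encoding changes the SHA-1 value used as authorization data. */
final class AuthDataSketch {

    /** SHA-1 over the password bytes in the given encoding, with no string terminator. */
    static byte[] secretDigest(String password, Charset encoding) {
        try {
            return MessageDigest.getInstance("SHA-1").digest(password.getBytes(encoding));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is required on every Java platform", e);
        }
    }

    /** Lowercase hexadecimal rendering for display. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String password = "abc";
        System.out.println("ASCII:    " + hex(secretDigest(password, StandardCharsets.US_ASCII)));
        System.out.println("UTF-16LE: " + hex(secretDigest(password, StandardCharsets.UTF_16LE)));
    }
}
```

A secret set through one tool in one encoding would therefore fail to authorize when entered through another tool that hashes a different encoding, which is why a single convention is recommended.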
TclNvRam: This class stores the attributes of a region of non-volatile RAM inside the TPM. It can be used for defining, releasing, reading or writing such a region. An example is the Endorsement Key certificate shipped with Infineon TPMs.
At the same time, the index file is created, and then it is encrypted and split into .piece and .bulk file.
Then, the usage policy and secret policy are assigned to the binding key object, and the index file encryption key is bound to the TPM.
Following these operations, the Protection process may be considered complete.
Turning to TPM signature verification,
Then the server and the TPM each generate a new nonce and send it to the other side. The two nonces are XORed together into a new nonce on both the client and the server. After that, a SHA-1 hash value is generated from the newly generated nonce.
Then the TPM signs the SHA-1 hash value and sends the signature to the server for verification. After the TPM identity has been verified by the server, the index .piece file can be sent back to the client for the merging and decryption process.
After the file has been decrypted, it is saved back to the original folder, and the user can then reset the application status by clicking the “Reset” button. All the intermediate files are then deleted.
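The nonce combination and signature check described above can be sketched with standard Java classes. This is a software stand-in, not TPM code: an ordinary RSA key pair plays the role of the AIK, both nonces come from SecureRandom, and the 20-byte nonce length is an assumption. The standard "SHA1withRSA" signature algorithm is used because it hashes its input with SHA-1 before signing, matching the hash-then-sign order in the text. AttestationSketch and its members are hypothetical names.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.security.Signature;

/** Illustrative nonce exchange with hash-then-sign verification. */
final class AttestationSketch {
    static final int NONCE_BYTES = 20; // assumed nonce length

    static byte[] newNonce(SecureRandom rng) {
        byte[] nonce = new byte[NONCE_BYTES];
        rng.nextBytes(nonce);
        return nonce;
    }

    /** Final nonce = client nonce XOR server nonce; both sides compute the same value. */
    static byte[] combine(byte[] clientNonce, byte[] serverNonce) {
        if (clientNonce.length != serverNonce.length) {
            throw new IllegalArgumentException("nonce lengths differ");
        }
        byte[] out = new byte[clientNonce.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) (clientNonce[i] ^ serverNonce[i]);
        }
        return out;
    }

    /** Client side: SHA-1 hash of the final nonce, signed with the identity key stand-in. */
    static byte[] sign(PrivateKey identityKey, byte[] finalNonce) throws GeneralSecurityException {
        Signature signer = Signature.getInstance("SHA1withRSA");
        signer.initSign(identityKey);
        signer.update(finalNonce);
        return signer.sign();
    }

    /** Server side: checks the signature against the stored public key and its own final nonce. */
    static boolean verify(PublicKey storedPublicKey, byte[] finalNonce, byte[] signature)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance("SHA1withRSA");
        verifier.initVerify(storedPublicKey);
        verifier.update(finalNonce);
        return verifier.verify(signature);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair identity = gen.generateKeyPair();

        SecureRandom rng = new SecureRandom();
        byte[] clientNonce = newNonce(rng);
        byte[] serverNonce = newNonce(rng);
        byte[] signature = sign(identity.getPrivate(), combine(clientNonce, serverNonce));
        boolean ok = verify(identity.getPublic(), combine(serverNonce, clientNonce), signature);
        System.out.println("identity verified: " + ok);
    }
}
```

Because each side contributes its own nonce, a signature captured from an earlier session does not verify against the final nonce of a later one.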
A description is now provided regarding efficiency measurements in terms of latency for an exemplary implementation. Measurements were taken by dividing the exemplary implementation into four parts to measure the execution latency for each part, after which the total protection latency and unprotection latency were separately determined.
To set up the measurement, different file sizes and file numbers were provided as different test groups. For each group, execution latency was measured 10 times in seconds. The results are set forth herein.
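A measurement loop of the kind described, in which each operation is timed repeatedly and reported in seconds, might look like the sketch below. The timing source (System.nanoTime), the class name LatencySketch, and the placeholder workload in main are assumptions for illustration and do not reproduce the measurement harness or the results of the exemplary implementation.

```java
/** Times a task repeatedly and reports latencies in seconds. */
final class LatencySketch {

    /** Runs the task the given number of times and returns each elapsed time in seconds. */
    static double[] measureSeconds(Runnable task, int runs) {
        double[] latencies = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            latencies[i] = (System.nanoTime() - start) / 1_000_000_000.0;
        }
        return latencies;
    }

    /** Arithmetic mean of the measured latencies. */
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for one stage such as encryption and splitting.
        Runnable stage = () -> {
            byte[] buffer = new byte[1024 * 1024];
            java.util.Arrays.fill(buffer, (byte) 1);
        };
        double[] latencies = measureSeconds(stage, 10);
        System.out.printf("mean latency over %d runs: %.6f s%n", latencies.length, mean(latencies));
    }
}
```

Timing each stage separately in this way allows the per-stage values to be summed into the total protection and unprotection latencies.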
This exemplary implementation utilized three devices: Server PC, Client PC and Android Phone. The hardware configurations for this exemplary implementation are provided in Tables 19-21.
The synchronous logon system execution includes the interaction among the Android Device, the client PC and server. For testing, different file size and file numbers were used as different test groups. For each group, measurements of execution latency in seconds were taken 10 times. The difference between each group is mainly due to the user input speed. Here, Logon Latency=User Input Latency+Network Latency+Verification Latency. Table 22 shows logon system measurement (Unit: Seconds, KB: Kilo Bytes)
In the protection process, the encryption and splitting efficiency is given in terms of execution latency. Here, File Encryption & Splitting Latency=AES/GCM Encryption Latency+File Splitting Latency. Table 23 illustrates file encryption and split measurement (Unit: Seconds, KB: Kilo Bytes)
In the unprotection process, the decryption and merging efficiency is measured. Here, File Merging & Decryption Latency=AES/GCM Decryption Latency+File Merging Latency. Table 24 shows File Decryption & Merge Measurement (Unit: Seconds, KB: Kilo Bytes).
In the protection process, the TPM binding efficiency is given in terms of execution latency. The total time includes the user input time for the binding key usage secret and migration secret. Thus, the difference mainly depends on the user input speed. Here, Key Binding Latency=SRK Generation Latency+Binding Key retrieve Latency+Key Binding Latency. Table 25 shows TPM binding measurement (Unit: Seconds, KB: Kilo Bytes)
In the unprotection process, the TPM unbinding latency is measured and the total time includes the user input time for binding key usage secret. So the difference mainly depends on the user input speed. Here, Key Unbinding Latency=SRK Generation Latency+Binding Key retrieve Latency+Key Unbinding Latency. Table 26 shows TPM unbinding measurement (Unit: Seconds, KB: Kilo Bytes)
The process of TPM identification to the server mainly includes two operations. First, the client and server exchange their newly generated nonces, and each produces a final Nonce using the received nonce and its self-created nonce. Second, TPM signature signing and server verification are performed. The total time includes the user input time for the AIK usage secret. Here, Identity Attestation Latency=SRK Generation Latency+AIK Retrieve Latency+RSA Signing Latency+Signature Verification Latency+Network Latency. Table 27 includes identity attestation measurement (Unit: Seconds, KB: Kilo Bytes)
The protection latency may be measured and given in seconds. The protection process latency mainly includes the latencies of file encryption and splitting and of the TPM binding process. Here, Total protection latency=Password Logon System Latency+File Encryption & Splitting Latency+Key Binding Latency+AIK & File Distribution Latency. Table 28 shows protection measurement (Unit: Seconds, KB: Kilo Bytes)
The unprotection latency may be measured and given in seconds. The unprotection process latency mainly includes the latencies of TPM identity attestation, file decryption and merging, and the TPM unbinding process. Here, Total Unprotection latency=Password Logon System Latency+TPM Identity Attestation Latency+Key Unbinding Latency+File Merging & Decryption Latency. Table 29 shows unprotection measurement (Unit: Seconds, KB: Kilo Bytes)
In terms of security design, the illustrative embodiment of the present disclosure may be thought of as mainly five modules that, when combined smoothly with each other, provide security protection over personal information. Each module adds a layer of security to the overall design. The synchronous log on scheme may utilize the Android Device and the client PC as the input devices and to implement the input digits to random number mapping. Each digit of password is input alternately on android phone and the client PC, and at the same time each digits is mapped into certain length random number sequence and sent to server for verification. On the server side, the SHA-256 hash value of correct random number sequence, which the hash value of received random number sequence is compared to, is stored. The result of the compare is sent back to PC and android for the next operation.
An encryption/decryption scheme is provided that may utilize AES-256 as the encryption method, which is considered strong enough for current security requirements. The AES keys are generated from a unique seed that is based on a system-generated time stamp. To counter side-channel and timing attacks, GCM mode is used in this design. The security of GCM mode relies on the underlying block cipher being indistinguishable from a random permutation. Finally, a randomly generated salt is added to counter brute-force attacks.
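The disclosure does not specify how the time-stamp seed and the random salt are combined into a key. One possible approach, shown here purely as an assumption, is to feed both into PBKDF2-HMAC-SHA256 from the Python standard library to obtain a 256-bit key. The 16-byte salt, the iteration count, the nanosecond time stamp, and the function name are illustrative choices; the AES-GCM encryption step itself is omitted because it requires a third-party library.

```python
import hashlib
import os
import time


def derive_aes256_key(seed: bytes, salt: bytes, iterations: int = 100_000) -> bytes:
    """Stretch a seed and a salt into 32 bytes, the size of an AES-256 key.

    PBKDF2 is an assumed choice here; the source names no key derivation function.
    """
    return hashlib.pbkdf2_hmac("sha256", seed, salt, iterations, dklen=32)


seed = str(time.time_ns()).encode()  # system-generated time stamp used as the seed
salt = os.urandom(16)                # random salt, kept so the key can be re-derived
key = derive_aes256_key(seed, salt)
print(len(key))  # 32
```

Because the salt is random, two keys derived from the same time stamp still differ, and the salt must be retained alongside the protected data for the key to be regenerated.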
A program is provided to split the encrypted files into pieces and bulks based on random numbers. Junk data is also injected into the split files. The paths and keys of the pieces are stored in an index file, which is later used to merge the pieces back into the original file. The index file may also be encrypted by AES-GCM and split by the splitting program. In the decryption process, the recovery of all the files may start from the index file.
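The splitting and merging steps can be illustrated with the following Python sketch. It is a simplified model under stated assumptions, not the disclosed program: pieces are held in an in-memory dictionary rather than written to file paths, junk bytes are prepended to each piece, the index is a list of records giving the piece name, the number of junk bytes to skip, and the payload length, and the size limits and function names are hypothetical.

```python
import os
import secrets


def split_with_junk(data: bytes, min_piece=16, max_piece=64, max_junk=16):
    """Cut data into random-length pieces, prepend random junk to each piece,
    and record in an index how to find the real bytes again."""
    pieces, index, pos = {}, [], 0
    while pos < len(data):
        size = min_piece + secrets.randbelow(max_piece - min_piece + 1)
        chunk = data[pos:pos + size]
        junk = os.urandom(secrets.randbelow(max_junk + 1))
        name = secrets.token_hex(8)  # random piece name, stands in for a path
        pieces[name] = junk + chunk
        index.append({"piece": name, "skip": len(junk), "length": len(chunk)})
        pos += len(chunk)
    return pieces, index


def merge(pieces, index):
    """Walk the index in order, drop the junk, and concatenate the payloads."""
    out = bytearray()
    for entry in index:
        blob = pieces[entry["piece"]]
        start = entry["skip"]
        out += blob[start:start + entry["length"]]
    return bytes(out)


ciphertext = os.urandom(1000)  # placeholder for an already encrypted file
pieces, index = split_with_junk(ciphertext)
print(merge(pieces, index) == ciphertext)  # True
```

Without the index, the piece order, the piece boundaries, and the junk offsets are all unknown, which is why the index file itself is encrypted and split in turn.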
A TPM key binding/unbinding process may be provided. Binding generally includes the operation of encrypting a message using a public key. That is, the sender uses the public key of the intended recipient to encrypt the message. The message is recoverable by decryption using the recipient's private key. When the private key is managed by the TPM as a non-migratable key, only the TPM that created the key may use it. Hence, a message encrypted with the public key is “bound” to a particular instance of a TPM. Keys are communication endpoints, and improperly managed keys can result in loss of security. Thus, the TPM in this exemplary implementation aids in improving security by providing key management. In detail, the final index file encryption key is the root of the whole cryptographic process, so it is bound to the TPM for concrete protection. In the decryption process, the index file encryption key is recovered using the corresponding private key by providing the correct identity to the TPM.
A TPM signature authorization process may be provided. To defeat attackers who pretend to be the correct client with the unique TPM in order to obtain the distributed index file pieces from the server, a TPM identity attestation scheme is designed in this illustrative embodiment. At the end of the encryption process, the TPM generates an Attestation Identity Key, which is a 2048-bit RSA signing key pair, and the public key is sent to the server for storage. At the start of the decryption process, the TPM must provide the server with a signature generated by the previously generated Attestation Identity Key. After the server verifies the signature using the stored public key, the index file pieces can be sent back to the client for the next operation of the decryption process.
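The message flow of this attestation step can be sketched in Python as follows. Because the Python standard library offers no RSA implementation and no TPM interface, this sketch substitutes textbook RSA over two well-known Mersenne primes for the 2048-bit AIK held by a real TPM. That substitution is insecure and serves only to show the order of operations. Combining the two nonces with SHA-256 and all function names are likewise assumptions.

```python
import hashlib

# Toy textbook-RSA key from two Mersenne primes. NOT secure: it only stands in
# for the 2048-bit AIK that a real TPM would generate and keep internally.
P, Q, E = 2**521 - 1, 2**607 - 1, 65537
N = P * Q                           # public modulus, stored by the server with E
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, never leaves the "TPM"


def final_nonce(client_nonce: bytes, server_nonce: bytes) -> bytes:
    """Both sides derive the same final nonce from the two exchanged nonces."""
    return hashlib.sha256(client_nonce + server_nonce).digest()


def tpm_sign(message: bytes) -> int:
    """Client side: sign the SHA-256 hash of the message with the private key."""
    h = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return pow(h, D, N)


def server_verify_signature(message: bytes, signature: int) -> bool:
    """Server side: check the signature with the stored public key (N, E)."""
    h = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return pow(signature, E, N) == h


client_nonce, server_nonce = b"c" * 16, b"s" * 16   # fixed here for repeatability
nonce = final_nonce(client_nonce, server_nonce)
signature = tpm_sign(nonce)
print(server_verify_signature(nonce, signature))  # True
```

Signing a nonce that both parties contributed to means a captured signature cannot be replayed in a later session, since the server's nonce will have changed.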
Through the overall design, the TPM binding protection and TPM identity attestation provide advantages over alternative security protection software. They offer a solution for the storage and protection of the software key, which is problematic for alternative security software designs. The TPM identity attestation adds another layer of security: it prevents an attacker from recovering the index file by bypassing the TPM.
As shown above, the disclosed exemplary system operated smoothly and proved stable and trustworthy over multiple tests.
For the overall system design, each module may add a layer of security to the overall protection, and the modules rely on one another.
In fact, the cloud storage and TPM hardware security features are layered over all the cryptographic processes. The cloud storage of file pieces, together with the identity attestation function provided by the TPM, ensures that the file pieces can only be recovered by the authorized user. The TPM adds hardware security features to the whole cryptographic process: it binds the final index encryption key to stable and strongly protected hardware and ensures that only the authorized user in possession of the specific TPM can reverse the whole process. The security features of each module are provided in Table 31.
Table 32 lists the security abilities and weaknesses of each module.
The systems, methodologies, and components disclosed above create a TPM-enhanced cloud-based file protection system. Cloud computing will increase in importance going forward. Because users often possess multiple personal computing devices, such as a laptop, desktop, phone, and tablet, cloud computing security that integrates the computing power of all these devices becomes available. Cloud storage of file pieces can confuse hackers. In cloud computing, to have full access to the file system, each cloud endpoint should provide proof of identity to the other endpoints to establish trust. Based on its unique Endorsement Key and root of trust, the TPM can provide this proof of identity by using an Attestation Identity Key. The Attestation Identity Key could be issued by a CA upon proof of possession of the unique Endorsement Key in the TPM.
Alternative systems with only software cryptographic processes face a conundrum: the storage and protection of the cryptographic keys. Storing the keys in plain form on a hard drive is like leaving a spare front door key somewhere in the yard; security relies only on a key-sized hiding place that the hacker cannot find. This presents a weakness for alternative implementations. Incorporating the TPM into the cryptographic system provides a solution to this technical problem and elevates the file protection system to the hardware level. Thus, in accordance with the present disclosure, the drawbacks of alternative software cryptographic processing implementations are addressed by the TPM's root of trust and key binding feature, and cloud computing can be fully accessed by providing identity attestation.
Password protection can also benefit from cloud computing. In the disclosed illustrative embodiments, a password is entered separately on the client PC and the Android Device and mapped into random number sequences that are sent to the server for verification. This design ensures that no single cloud endpoint has the full password or password hash. It can defeat key loggers and screen capturers, and it is also resistant to dictionary attacks and social engineering.
Finally, the disclosed systems, components, and methodologies, which in this illustrative embodiment consist of five layers of protection, combine security, reliability, availability, efficiency, and ease of use. AES-GCM is used as the encryption-decryption scheme, which provides a high level of symmetric cryptographic security. Due to the popularity and variety of personal computing devices, there is a need and a trend to better utilize multiple mobile computing devices to improve both efficiency and security. This information system, which utilizes the computing power of PCs and the Android Device, exploits cloud computing security features by using a file splitting-merging scheme to further obfuscate the protected information through distribution of the file pieces. As the major part of the design, to address the cryptographic key protection weakness of software-only cryptographic processes, the TPM is introduced into the system design to provide a key binding function and an identity attestation function that protects the file distribution process. For password protection, a synchronous logon system that fully utilizes the PCs and the Android Device is designed so that possession of any one of these computing devices cannot reveal the whole original password.
The system latency measurements indicate that the disclosed systems are efficient and best suited to protecting small files, because network transmission is highly likely to fail for files over 300 kilobytes. As shown by way of screenshots above, the system is easy to use: the user only needs to click a button to proceed, and all underlying cryptographic processes are transparent to the user.
Other improvements and features are within the scope of this disclosure. In the synchronous logon system, the random numbers mapped to password digits could be updated and saved after each user logon and used with a random salt to generate cryptographic keys. The password logon process could then further confuse hackers and better protect the user password. The random sequence could also further extend the randomness of key creation.
In addition, in the AIK creation process, the AIK could be registered and authorized through a Certificate Authority by proving possession of the unique Endorsement Key. The AIK could then be used in place of the Endorsement Key, avoiding disclosure of the Endorsement Key.
According to illustrative embodiments, various mechanisms discussed above may combine to provide intrusion detection and prevention capabilities. By way of example, in illustrative embodiments, all of the OTPs and ACL items that are required for network access may need to be completely correct, such that a failure at any single point is detected as an intrusion and blocked. Moreover, even if one or more of the authentication checkpoints are compromised, intrusions may still be detected. According to one benefit of illustrative embodiments in accordance with the present disclosure, there are no false positives for intrusion detection. Although not all intrusion events may be attacks (e.g., a valid client attempting to log in with corrupted authentication information is correctly detected as an intrusion), in certain implementations there may be no detected intrusions that are false positives. Moreover, the ACL system implemented by IDACS as discussed herein may provide high intrusion detection performance and can be used to trace back to the source of attacks and generate real-time forensics reports based on specific ACL violations.
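The all-or-nothing rule described above, in which every OTP and ACL item must match and any single mismatch is treated as an intrusion, can be sketched in Python as follows. The item names, the values, and the function name are hypothetical and are not part of IDACS as disclosed; the returned list of failed items illustrates how specific violations could feed a forensics report.

```python
import hmac


def check_access(presented: dict, expected: dict):
    """Return (granted, violations).

    Access is granted only if every expected item is present and equal;
    each missing or mismatched item is reported by name.
    """
    violations = [
        name
        for name, value in expected.items()
        if not hmac.compare_digest(
            str(presented.get(name, "")).encode(), str(value).encode()
        )
    ]
    return (not violations, violations)


# Hypothetical checkpoint items for one client.
expected = {"otp_user": "482913", "otp_device": "771204", "acl_source_ip": "10.0.0.7"}

print(check_access(dict(expected), expected))
# (True, [])
print(check_access(dict(expected, otp_device="000000"), expected))
# (False, ['otp_device'])
```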
Network security systems in accordance with the present disclosure can be used for a variety of applications, and generally may be suitable for any networked environment. In one example, network security systems in accordance with the present disclosure can be used in connection with industrial networks.
Industrial Networks may be divided into three general areas, each of which should be logically partitioned from the others by some security mechanism, such as a firewall.
The depicted Industrial Network includes an Industrial Control System consisting of Programmable Logic Controllers (PLCs), which control the elements of an industrial process, such as sensors, motors, and pumps. The Industrial Control System also contains a Control Console, which is used to issue instructions or write firmware to the PLCs, or to collect performance data from them. The depicted Industrial Network also includes a Supervisory Network, also known as Supervisory Control And Data Acquisition (SCADA), which provides external control and performance data recording for the Industrial Control System. Finally, the depicted Industrial Network connects these networks to company intranets and/or the Internet, enabling the packaging of performance data into real-time reports that can be viewed remotely.
The Network connects Customers and Endpoints through a series of security servers, including Security Agents and Super Security Agents, such as those described above.
Whenever a Customer or an Endpoint connects to the Network, it may download a User Agent (UA), which operates in connection with other network components in the manner described above.
Although certain embodiments have been described and illustrated in exemplary forms with a certain degree of particularity, it is noted that the description and illustrations have been made by way of example only. Numerous changes in the details of construction, combination, and arrangement of parts and operations may be made. Accordingly, such changes are intended to be included within the scope of the disclosure, the protected scope of which is defined by the claims.
This application is a continuation of U.S. patent application Ser. No. 14/293,350, filed Jun. 6, 2014, which claims domestic priority to and the benefit of U.S. Provisional Patent Application Ser. No. 61/878,694, filed Sep. 17, 2013, which applications are hereby incorporated by reference.
Number | Date | Country | |
---|---|---|---|
20160182486 A1 | Jun 2016 | US |
Number | Date | Country | |
---|---|---|---|
61878694 | Sep 2013 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 14293350 | Jun 2014 | US |
Child | 14960798 | US |