This invention relates to the protection of software systems, and in particular to technology for protecting the integrity and usage of software systems and associated devices.
Software fortification allows software systems to control their functionality, their usage and their integrity. The two principal attacks on software integrity are tampering and spoofing. Tampering involves changing the codes, data, authorizations or relationships in the software system. Spoofing involves replacing a software component with an imposter. Fortification can use up to four different methods to protect the software. The first is that all the programs are tamper-proofed by networks of internal and external guards, including separate guard programs. The second is that all system components have secure identities for positive dynamic identification. The third is that components of the system protect each other as well as themselves, and some of the components may be entirely devoted to that protection. The fourth is explicit policies that determine the fortification and establish the system relationships. The software system preferably operates within a secure environment and infrastructure, in which the original code is correct, the hardware performs properly, and the external authorizations and identifications are reliable. Fortification provides stronger security than merely tamper-proofing all system components because it also protects against viruses and dynamic attacks.
A software system is a set of computational components that interact to perform one or more tasks. The system components can include programs, procedures, devices and data that communicate through transfers of control and exchanges of data. The software system may include components which are: a) software within a simple computer with a processor and associated memory; b) software distributed within a complex computer with multiple processors, operating systems and associated memories; c) physical devices with little or no software, such as a device with hard-wired computations; d) objects, including people and instruments, that produce data for and interact with other components; or e) any combination of the above components. The software system may be packaged in a single physical device or distributed among a network of various devices. The logical and physical structure, including hardware and networking configuration, is assumed to be fixed during the operation of a software system. The components of the system are completely defined, and fortification implements detailed policies to provide protection. Fortification is used to preserve the integrity and functionality of the system, and to control the usage of the system. Fortification also provides some, often very substantial, capabilities to prevent extraction of software subsets from the system and to protect the data of the system.
Fortification creates an integrated, coordinated protection of the system. The system is a completely defined set of software components plus interfaces to external devices or objects. These external devices or objects may be other software modules, hardware, people or anything that interfaces with the system. The system may include components whose only purpose is to protect other components. Fortification of an operational system can include adding protection both inside and outside the system to create a fortified system. Fortification allows for some components to be untrusted. Unless a system is fairly simple, it is better to develop the system and its fortification together. The fortification of a system uses detailed knowledge of that system and may enlarge the system substantially to create a fortified version thereof.
Fortification is achieved using four (4) technologies: tamper-proofing, secure identification, interacting protections and systematic policy enforcement.
A related patent application, U.S. patent application Ser. No. 11/178,710, filed Jul. 11, 2005, entitled “Combination Guard Technology for Tamper-Proofing Software,” is hereby incorporated by reference, and describes various types of guards, obfuscation techniques and special protections. Many of the guards described can be used for both external and internal guarding. The different obfuscation techniques can be used for both internal and external guards as well. And the special protection techniques, which are neither purely guards nor purely obfuscations, are also useful for tamper-proofing software.
The technology of internal guarding has matured rapidly in the past few years, and provides versatile and powerful tools to create and insert internal guards into a program. These guards can be very dynamic and continually check the program during its execution. If a program is tampered with, then the correctness tests detect the tampering and the appropriate responses are taken.
External guarding is at a more primitive stage of development. The security products in current use include Tripwire and Vormetrics. The Tripwire process computes a complete checksum of a program once a day and compares that with the correct value. This is normally done on very large sets of programs simultaneously. Vormetrics computes a complete checksum of a program as it is loaded from secondary memory, for example from a hard drive, to primary memory and compares that with the checksum from the last time the program was loaded. It is not difficult to tamper with a program so as to circumvent such protections. Advancing the technology of external guards is one of the objects of the fortified software technology.
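The complete-checksum style of external checking described above can be sketched as follows. This is a minimal illustration in Python, not the actual mechanism of either product; the function names and the use of SHA-256 are assumptions for the sketch.

```python
import hashlib

def program_checksum(image: bytes) -> str:
    """Complete checksum of a program image, in the style of the
    external checking tools described above."""
    return hashlib.sha256(image).hexdigest()

def periodic_check(image: bytes, known_good: str) -> bool:
    """Compare the current checksum against the stored known-good value;
    in practice this runs on a schedule (e.g., once a day) or at the
    time the program is loaded from secondary memory."""
    return program_checksum(image) == known_good
```

As the text notes, a check that runs only once a day, or only at load time, leaves long windows during which tampering goes undetected, which is why such protections are not difficult to circumvent.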
Software fortification uses a definition of the structure of the fortified system and checks it thoroughly and often. One of the ways of accomplishing this is by making positive, secure identifications of the software components, computers, devices, people, and other entities that interact with the system. Identification methodology is highly developed and can be made very secure. Software fortification has higher efficiency requirements than usual in identification, and a secure identification technology is disclosed which provides both high efficiency and high security. Note that this higher efficiency is required because an external guard may execute every millisecond or every microsecond in some applications.
Additional features and advantages of the present invention will be evident from the following description of the drawings and exemplary embodiments.
FIGS. 8A-C show an example of preserving privacy through use of signatures;
FIGS. 16A-E show an example of hiding and protecting data with the use of silent and non-silent guards.
A software system is a set of computer programs that interact to perform a set of tasks. The components of a fortified software system can include programs, procedures, data, people and other items that communicate through transfers of control and exchanges of data. The components may be distributed within a simple machine, a complex machine or a network. The machine might be a general purpose programmable computer, a single purpose fixed program device, or anything in between.
A fortified system has three relevant elements: (a) the original codes of all of its programs; (b) the external interfaces of the system; and (c) the hardware that supports the software execution. The original code is the software of the system before it is fortified or protected from attacks. Hardware may execute programs, so we distinguish between software and hardware by the assumption that the operation of the hardware is fixed and unchangeable over the lifetime of the fortified system. We address hardware security by verifying the hardware's identity. The code within fortified software may be changed by an attacker, and we protect against such changes through security measures. The external interfaces handle the input data to, and the output from, the fortified software as specified in the original code. This data can originate from a person, be provided by a device, or be provided by a program that is not part of the fortified software. Of particular interest for security is identification and authorization data for the system. These data consist of things like passwords, fingerprint images, hardware serial numbers, and similar identifiers.
The fortified software is a complete software system if its execution only interacts with other software through its external interface. Of special concern for security are the low-level software support modules that are incorporated into the system as a convenience. These modules are an easy point at which to introduce malware into a system or to launch attacks on a system.
We assume fortified software has a secure infrastructure, including the hardware, networks, communication and other systems. This means the fortified software is complete, and its elements perform properly. We also assume there are no bugs or malware in the original software.
There are five goals of software security, and fortification primarily focuses on the first two of these. The first goal is to preserve the integrity and functionality of the system by preventing changes to a software component or substitution by unauthorized components. This is called fraud protection or tamper-proofing. The second goal is to control the use of the system by preventing unauthorized entities (people, software or devices) from using the software. This is called piracy protection. The third goal is to prevent extraction of software subsets by preventing the extraction of code, software subsets or methods from the fortified software. This is called fragmentation protection. The fourth goal is the protection of system data by preventing system data from being provided to unauthorized entities. This data could be one number (e.g., a password or key) or a huge file (e.g., a book, a chapter or a song). Note that software subsets are executable code, while system data does not execute. This is called media protection. The fifth goal is to protect the intellectual property of the fortified software by preventing anyone from understanding or extracting the processes, methods or algorithms in the fortified software. This is called intellectual property or IP protection, or reverse engineering protection.
The general goal of software system fortification is to preserve the integrity and functionality of the system, and to control the use of the system, which operates in a secure infrastructure. Fortification also provides substantial help in preventing extraction of software subsets, protecting system data, and protecting the intellectual property of the software. Fortification is achieved through the use of four technologies: tamper-proofing, secure identification, interacting protections and systematic policy enforcement. All of the programs are protected from tampering by a network of internal and external guards. Fortification uses both internal guards, which protect code inside the guard's own software component, and external guards, which also provide tamper-proofing for code outside the guard's component. External guards can be located both in other system components and within independent guard programs; they can prevent viruses from infecting fortified software and prevent dynamic attacks on its components. Secure identification is used so that all system components can be positively identified throughout the operation of the system. This is required to secure the interfaces and to prevent spoofing. Interacting protections enable the various components to protect themselves and each other, as well as the programs. Some components may be devoted entirely to protection. Systematic policy enforcement is performed using a policy system that is installed during the fortification process. The policy system controls external communication, the relationships among the system components, and the checking and protection procedures used.
The fortification process assumes that the original codes are secure, that is (1) the hardware infrastructure operates properly; (2) the interfaces are correct and complete; and (3) the original software is complete and correct.
The fortification process has three components. The first component is tamper-proofing the system codes. This means that either the code cannot be changed because physical barriers prevent access to the code involved or, more likely, any change in the code will be detected and an appropriate protective response taken. Example responses would be to terminate the computations, notify various external systems or people, or repair the changed code. The responses made are dependent on the nature of the system and its environment.
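The check-and-respond behavior described above can be sketched as follows. This is an illustrative Python sketch, not the patented mechanism; the digest function, the constant, and the callback interface are assumptions made for the example.

```python
import hashlib

# Recorded at fortification time: the digest of the unmodified code segment.
EXPECTED_DIGEST = hashlib.sha256(b"original code segment").hexdigest()

def guard(code_segment: bytes, on_tamper) -> bool:
    """Check a code segment; if it has been changed, invoke a protective
    response (e.g., terminate the computation, notify an external system
    or person, or repair the changed code)."""
    if hashlib.sha256(code_segment).hexdigest() != EXPECTED_DIGEST:
        on_tamper()
        return False
    return True
```

The response passed as `on_tamper` would depend, as the text says, on the nature of the system and its environment.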
The second component is to provide secure positive identification of components. When one component of a system contacts another, there are mechanisms to provide positive identification. These identities can have high complexity, such as natural biometrics. These identities may also present different appearances each time to prevent spoofing. There may be several exchanges of information in the identification process so that generating these appearances remains reasonably efficient.
The third component is to embed security policies in the system. The security policy system is the central entity for managing the security, identity, and authorizations of the system. It applies both to the particular application and to the general software security. Security policies have two parts: generic system protection measures to be used; and policies about who, how and when authorizations are made or modified.
Tamper-proofing is a technology that uses networks of guards to protect the code of the program from change. The guards systematically and continually check the program's code and each other to see if any changes have been made. If a change is detected, then an appropriate response is made. This technology is described more fully in U.S. patent application Ser. No. 11/178,710, entitled “Combination Guard Technology for Tamper-Proofing Software” which is incorporated herein by reference. Software fortification can be viewed in part as extending this technology to software systems.
Some obfuscation is required in tamper-proofing to protect the guards. If an attacker can identify all the guards exactly, then they can delete them simultaneously and break the protection. The selection of several obfuscation techniques plus specialized guards makes it more difficult to find and remove the guards. This protection can be made stronger and stronger by applying more and more iterations of obfuscation. Special protection techniques are similar to obfuscations in that they preserve the protection of the guards even though they do not necessarily preserve the semantics of the program.
Encryption is a special form of obfuscation for data. The capabilities of encryption are well understood and there are many very strong encryption algorithms. Encryption is very good at hiding information but unfortunately the information must be decrypted before it can be used. Once decrypted, the information is vulnerable to theft or change. Thus, encryption is most suitable for hiding constants within software and for exchanging information over networks.
There are a variety of other security tools that can be used to achieve some of the secure infrastructure goals. The assumption of a secure infrastructure is difficult to achieve. Perhaps the most difficult part of this assumption is that the original code is error-free, which suggests that absolutely secure software systems are very difficult to achieve. The following are some of these supporting tools. Malware checkers check for the presence of varieties of code in a program that can undermine security; these tools can be quite effective for detecting trap doors, spyware, and key loggers, and they should be applied to or included in the original code of the components going into the fortified software system. Disk-RAM transfer monitors are specialized programs that monitor and protect the communications internal to computers. External communication monitors examine the items and patterns of communication to detect and/or combat various kinds of attacks, for example, denial of service or spyware. Firewalls examine the communication coming into a fortified software system and filter out various classes of communication and content which might be destructive or unwanted. Intrusion detection tools examine the behavior of a system and its communication to detect attempts to insert malware, viruses, spyware, and other unwanted software into the system. Machine and person identification tools help authenticate the identity of machines and people that attempt to access the system; these can include simple password checks, multiple biometrics, or sophisticated challenge-response exchanges. Fortification uses a specialized set of identification tools for systems that have distributed components and to be sure that an entire subsystem has not been replaced.
One of the key components of fortification is guards that continuously check the system for attacks, changes and problems. These guards are networked together so that they guard each other, and they are integrated into the fortified software so that they are very difficult to identify accurately and cannot be removed without detection. Networks of internal and external guards are inserted into individual programs so that any tampering is detected. This technology is the foundation of the fortification process.
Internal guards check observed data against required data. The comparisons can be for equality, which is normal for integer and symbolic information, or for being "close enough", which is appropriate for numerically measured data such as biometrics or for the results of floating point computations. The definition of "close enough" is specified in the policy system. Machine codes are normally checked by computing a hash checksum of the machine instructions interpreted as integers. One of the tasks in guarding code is to identify exactly which machine words are instructions that must not be changed. Guards and devices are usually in simpler computing environments, where it is easier to identify the executable codes. However, the guarding must be tailored to the devices, as they may use specialized conventions or constructions. It should also be determined how the device serial numbers or other hardware identifications are accessed. The very simplest devices might have no special hardware identification, so the security may have to rely entirely on software guarding.
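The two kinds of comparison described above, a hash checksum over machine words interpreted as integers and a tolerance test for measured data, can be sketched in Python as follows. The word width, digest algorithm and function names are assumptions of the example.

```python
import hashlib

def code_checksum(instruction_words):
    """Hash checksum of machine instructions interpreted as integers.
    Only words identified as instructions (not data) would be included."""
    h = hashlib.sha256()
    for word in instruction_words:
        h.update(word.to_bytes(4, "little"))  # assume 32-bit words
    return h.hexdigest()

def close_enough(observed: float, required: float, tolerance: float) -> bool:
    """Comparison for numerically measured data such as biometrics or
    floating point results; the tolerance comes from the policy system."""
    return abs(observed - required) <= tolerance
```

Any change to an instruction word changes the checksum, while a biometric reading need only fall within the policy-specified tolerance.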
External guards are used to detect viruses, malware and other undesired software that are usually inserted at the very beginning of a program. They can also detect various kinds of dynamic and clone attacks because their checking is not synchronized with the program's execution in any way. For example, program statements 4,025 to 4,167 can be checked externally while statements 11,720 to 11,988 are executing. Indeed, the external guards can check a program while it is idle, as long as its code is accessible in memory. The external guards are either within other components of the fortified system or are independent guard agents dedicated to guarding other programs. The external guards use data about the checksum values derived from within the programs when they are being tamper-proofed. Of course, external guard agents may also be tamper-proofed.
External guards can be distributed over several components of a fortified system. First, a guard can check several different programs at once, combine the results, and then test. For example, a guard could checksum one statement from each of thirty-seven programs and then test the resulting hash value. Second, the code of the guard itself could be distributed over several different programs.
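The first approach, combining checksums of segments drawn from many programs into a single tested value, can be sketched as follows. This is an illustrative Python sketch; the folding method and names are assumptions of the example.

```python
import hashlib

def combined_checksum(segments):
    """Checksum one code segment from each of several programs and fold
    them into a single hash; only the combined result is tested, so no
    individual program's checksum value is exposed by itself."""
    h = hashlib.sha256()
    for segment in segments:  # e.g., one segment from each of 37 programs
        h.update(hashlib.sha256(segment).digest())
    return h.hexdigest()
```

A change to any one of the contributing segments changes the combined result, so a single test covers all of the programs at once.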
There are different approaches for implementing external guards and the communication for external guarding. Higher security results can be obtained by mixing these different types of communication in fortified software.
One way of communicating is through direct reading of the code. The external guard G reads code from another program P and computes the checksum of some code segments just like an ordinary internal guard does. The guard G can locate the program P through the standard mechanism for invoking programs. The disadvantage of this approach is that the external guard has a signature that can be used by an attacker. The guard G is reading the instructions of another program. This is an unusual action which might give an attacker clues about the identity of the external guards.
Another communication mechanism is communication via arguments. Here the external guard G calls, or is called by, another program P, and communication is through the arguments of the call. A guard G can invoke a program P and pass an argument A which the program P uses to return a computed checksum value to guard G. No test of this value should be made within the program P, and this value is probably not otherwise used within the program P. The technology for creating secure identities can be applied to this value so that the actual value returned changes from time to time.
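The guard-invokes-P variant of communication via arguments can be sketched as follows. This Python sketch is illustrative only; the stand-in code bytes, the dictionary-style argument, and the names are assumptions of the example.

```python
import hashlib

P_CODE = b"program P code"  # illustrative stand-in for P's own code bytes

def program_p(argument: dict) -> None:
    """Program P: when invoked with the argument, it fills in a checksum
    of its own code. P makes no test of the value and does not
    otherwise use it."""
    argument["checksum"] = hashlib.sha256(P_CODE).hexdigest()

def guard_g(expected_digest: str) -> bool:
    """Guard G invokes P, receives the checksum through the argument,
    and performs the test itself, outside of P."""
    argument = {}
    program_p(argument)
    return argument["checksum"] == expected_digest
```

Because the test happens in G rather than P, an attacker examining P sees only an ordinary computed value being returned, not a guard.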
An example of communication via arguments is shown in
This process of communication via argument can be reversed to have P contact the guard G. The advantage of this second approach is that it makes it more difficult to identify external guards. Of course, more sophisticated interactions and networking can be used to increase the difficulty of identifying the external guards. Checking via arguments can also be incorporated into normal interactions among the components of the fortified system as illustrated in the example of
Another communication mechanism is to piggy-back guarding onto normal communications. An example of this mechanism is shown in
The fourth communication mechanism is communication via bulletin boards or files. Here a program P and a guard G agree to use a file F or similar entity as a bulletin board for passing information back and forth. An example of this is shown in
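The bulletin-board mechanism can be sketched as follows. This Python sketch uses an ordinary file as the agreed bulletin board; the file location, stand-in code bytes, and function names are assumptions of the example.

```python
import hashlib, os, tempfile

def post_checksum(board_path: str, code: bytes) -> None:
    """Program P posts a checksum of its code to the agreed bulletin
    board file F."""
    with open(board_path, "w") as f:
        f.write(hashlib.sha256(code).hexdigest())

def read_and_check(board_path: str, expected_digest: str) -> bool:
    """Guard G reads the board at some later time, unsynchronized with
    P's execution, and tests the posted value."""
    with open(board_path) as f:
        return f.read() == expected_digest

# Demonstration with a temporary file standing in for the bulletin board.
board = os.path.join(tempfile.gettempdir(), "bulletin_board_demo.txt")
post_checksum(board, b"P code bytes")
ok = read_and_check(board, hashlib.sha256(b"P code bytes").hexdigest())
```

Since P and G never call one another directly, an attacker tracing control flow sees no connection between them; the file F is their only link.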
There is a potential problem from having a guard in one program guarding code in another program. The information the external guard uses affects the guards protecting it wherever it is located. Thus there can be a cyclic effect, where guard A depends on information about guard B, which depends on information about guard C, which depends on information about guard A. The guarding technology disclosed in U.S. patent application Ser. No. 11/178,710, entitled “Combination Guard Technology for Tamper-Proofing Software” includes techniques to handle the cyclic effect and is applicable to external guards as well.
Internal virus guards provide some protection against viruses and some dynamic or clone attacks by immediately checking the first few statements of a program. Some examples of these types of guards are provided later in the application. An internal guard cannot usually detect tampering of the first few statements within a program because it does not have the opportunity to execute before the malware executes. Using a dynamic attack, the malware can be inserted, execute, and then repair the beginning of the program so that internal guards do not detect the attack. In fact, malware can be inserted at any point in a program that is executed before it is guarded. It is often quite difficult to identify such locations in a program, which creates difficulties for both an attacker and for the guarding. One way to protect against such attacks is to have the first guard of a program check the entire program. There is a large penalty in execution speed for such a guard, but it may be done in some critical cases. Alternatively, a network of interlocking guards can overcome this weakness by including one guard very close to the beginning of the program that checks the start, plus some guards to check the empty spaces in the code. That guard is then protected by all the guards in the network.
External virus guards are external guards specialized to provide protection against viruses and other malware inserted into a component of the fortified system without affecting the normal action or code of the component. Unlike the internal virus guards discussed earlier, they just check the start of each component plus the end and empty spaces. This checking must be done before the components execute, for example as they are installed or brought into working memory from disk storage. These guards can be organized as an independent network, as part of the overall external guard network, as individual guards (one per component), or as a single global virus guard that protects all the components. Making these part of the overall external guard network is the most secure option, and the single global virus guard is the least secure. Microguards are well-suited for use as external virus guards. Microguards are very short guards (one or two statements) that can check one item in a program; they are very hard to detect and execute very fast.
Distributed guards and networks of external guards can provide protection of a component P that cannot be removed without removing all the guards simultaneously from the component. Attacks on fortified software are likely to first focus on identifying and disabling the internal guards. This protection is extended in the fortification of fortified software and is, in fact, even stronger. A distributed guard is one whose parts are distributed over a number of programs, including the program P, and these parts communicate just as external guards communicate. To remove such a guard requires that all of its parts be removed; otherwise the guard's protection will be triggered.
A network of external guards is created by linking sets of internal and external guards in several components of the fortified software. This creates two types of guard networks; those inside a single component and the external guards. There can be external guards checking a component's guard network silently in the sense that a component does not have any awareness of an external guard computing a checksum of its code. There are also external guards which use stealthy access to internal information for guarding. It requires very sophisticated analysis of the system's operation even to identify such an external guard. Further, the timing of external checking is not synchronized with the component's execution.
The external network should include guards that merely check the completeness of the network. A set of very lightweight guards (for example, microguards) can just check for the presence of larger external guards and of each other. These execute very rapidly and thus they impact computer performance very little. In a high security application there can be hundreds of such guards that would have to be removed or disabled within a very short time in order to avoid detection of the attack. Overall, the security of fortified software is greatly enhanced compared to just tamper-proofing its components one by one.
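A lightweight completeness check of the kind described above can be sketched as follows. The registry structure and names are assumptions of this illustrative Python sketch; a real microguard would be one or two machine statements, not a function.

```python
def microguard_sweep(guard_registry, required_guards):
    """Very lightweight presence check: confirm that the larger external
    guards (and fellow microguards) are still present. Removal of any
    one of them is detected on the next sweep."""
    return all(name in guard_registry for name in required_guards)
```

Because each sweep is only a membership test, hundreds of such checks can run with very little impact on performance, and an attacker would have to disable all of them within a very short time to avoid detection.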
Viruses are an example of malware. The virus guards provide protection against other attacks and against the insertion of malware in general. A virus guard can protect against dynamic and clone attacks. External microguards are also very useful to protect against these attacks.
Hardware and environment guards are also useful for more global protection of fortified software. There are two primary types of hardware and environment guards: guards that check to see if certain hardware devices are present, and guards that are implemented in hardware to check certain simple properties of the fortified software. Some of these simple properties can include connectivity of some components of the fortified software, presence of some devices, or presence of some codes. These just make simple, common sense checks that the fortified system is all there and in reasonable shape.
Data protection has two primary aspects. The first aspect is detecting if data items have been changed, and the second aspect is preventing unauthorized access to data. The first aspect of data protection is essentially the same as tamper-proofing code. One has guards to check if data has changed. Thus, this aspect is subsumed under guarding, either internal or external. The second aspect is one of the more difficult tasks of software security. Using passwords as an example, the password must be available for use but must not be visible for an outsider to see while examining or executing the software.
There are three distinct types of data to hide. Internal data is used within the component and can include passwords and encryption keys. System data is used only internally within the system, for example private and shared identification information; all of the identification information used for name security is of this type. External data is data to be provided outside the system, for example bank accounts, IP addresses and telephone numbers.
Hiding data internal to the fortified software system is quite feasible but may not be easy. Hiding external data is not feasible since the data must eventually be presented outside the fortified system. Outside the system it is vulnerable to being observed and discovered. If the external data is to be protected, then normal security measures can be used but the fortification of the system should not depend on this being secure. Note that system data is actually handled just like internal data. However, the system components must collaborate to use the data without exposing it. This collaboration requires planning and special handling but can be made as secure as the hiding of internal data. In many cases, it is sufficient to encrypt the data before it leaves one component and to decrypt it once it is received by another component. In some environments this security might be applied automatically for all communication between some or all of the system components.
There are two general information hiding technologies available to hide data items: encryption and obfuscation. Encryption can hide data very securely, except that care must be taken that the data need not be decrypted in order to be used. If it is decrypted, then monitoring the execution of the software can allow the data to be seen while it is not encrypted. But the encrypted form of the data, for example a password, can be used directly: the password presented by an external contact can be encrypted and the result compared with the encryption of the true password.
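The compare-in-encrypted-form idea can be sketched as follows. This Python sketch uses a one-way hash as the encrypting transformation; the choice of SHA-256 and the names are assumptions of the example.

```python
import hashlib

# Stored at fortification time: only the transformed (here, hashed) form
# of the true password ever appears in the software.
STORED_FORM = hashlib.sha256(b"true-password").hexdigest()

def password_ok(presented: bytes) -> bool:
    """Transform the presented password and compare with the stored form;
    the true password is never present in decrypted form, so monitoring
    the execution does not reveal it."""
    return hashlib.sha256(presented).hexdigest() == STORED_FORM
```

The comparison succeeds exactly when the presented password matches, yet neither the program text nor its execution exposes the password itself.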
Obfuscation provides ways of data item hiding by transforming computations or information so that one cannot discover what is being done. For example, a simple password test might be made by transforming the password several times to compute several or hundreds of different numbers. Then computations are introduced whose correctness depends on these numbers being correct. This is an instance of silent guarding techniques where checks are made silently if the data has been changed. If the data has been changed, then the program's operation is corrupted and this corruption often takes place in unpredictable ways.
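The silent guarding idea, where a computation's correctness depends on password-derived numbers and tampering corrupts the result rather than triggering an explicit test, can be sketched as follows. This Python sketch is illustrative; the derivation and the arithmetic are assumptions of the example.

```python
import hashlib

# Baked in at fortification time: a number derived from the true secret.
_BAKED = int.from_bytes(hashlib.sha256(b"true-secret").digest(), "big")

def silently_guarded(x: int, secret: bytes) -> int:
    """No explicit test and no alarm: a number derived from the secret
    feeds directly into the arithmetic, so a changed secret silently
    corrupts the result in an unpredictable way."""
    live = int.from_bytes(hashlib.sha256(secret).digest(), "big")
    return x + (live - _BAKED)  # correct secret -> x; tampered secret -> corrupted x
```

There is nothing for an attacker to find and disable: no comparison, no branch, no response code, only ordinary-looking arithmetic whose correctness depends on the unmodified data.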
The level of difficulty of retrieving the information measures the level of security of the information hiding. One simple example of obfuscation is to hide the numbers 867,193 and 30,541 by computing their product 26,484,941,413. Factoring the resulting long product is very difficult if both 867,193 and 30,541 are prime numbers. This type of data hiding is the basis of many encryption schemes. Other simple examples are to translate text from English into the Navajo language, or to translate a program from a high level computer language such as C++ into the absolute machine language of a 1960s computer. The results can be very effective ways to obfuscate the original content. Data hiding for software can use both language techniques and computational (mathematical) techniques. The level of security possible is known to be quite high, and it is widely believed that the security can be increased by applying more and more obfuscation.
Reliable identification and authentication is an essential component of fortified software and of any software security system. A system can be attacked by spoofing, in which an unauthorized component (person, program, etc.) gains access by masquerading as an authorized component, and then carrying out an attack to obtain information, to provide bogus information, to obtain services, to pirate code, or for other purposes. There is a very large body of technology for identifying the components that might be in a computer system. This technology can be tailored to the requirements of fortification of computer systems.
The term “component” is used to refer to programs, systems, persons or other entities that are a single entity as far as the system is concerned. An insider component is part of the fortified system and an outsider component is not. Components interact via contacts. A contact means different things depending on the capabilities and nature of the components involved. One component may be invoked by another component or it may communicate via email or message boards. In any of these cases a name is used to identify the component being contacted.
We introduce three different types of names for components: public, shared and private. A public name of a component A can be widely known and can be used by any entity to contact the component A. Each software component has a public name which is generally publicly known though it does not have to be. A shared name is known outside of the system, but it is intended to be known by a limited number of outsiders; and steps are taken to ensure that an outsider using the name is actually authorized to do so. A private name is only known within the fortified system itself and no outsiders are supposed to be aware of it. Stronger steps are taken to ensure that a system component using the private name is actually an insider. A component may have many names (pseudonyms or aliases) for each type. One purpose of multiple levels of identifiers is to combat spoofing. A component might respond to the use of its public name in some situations and not in others.
Software components are the building blocks of software systems, and one of the principal attacks on the security of software systems is to modify or replace a system component. This can be done by changing the identity of one of the components of the system. The identities of the software components for a fortified system should be both secure and efficient. Providing secure identities can be done through many different methods such as providing a secure hash function of a program's code to provide the identification. However, it is expensive to continually compute hash functions to verify identity. Testing identity can be done securely using, for example, zero-knowledge comparisons. Such comparisons however involve many rounds of communication depending on the level of security that is desired and each round may involve significant computation. The security system should be able to provide secure identification that is efficient and which allows for privacy in the sense that the software component can safely use pseudonyms which do not reveal its true identity.
There are three fundamental differences between software identification and personal identification. First is the fact that software can be copied easily and exactly whereas people cannot. Thus, maintaining a unique identity for software includes an issue involving physical and electronic security. Second, identity for people in practice involves both identification and certification. Examples of certification are: (a) I have a valid driver's license; (b) I am a citizen of France; and (c) I have rented a car until December 29th. Third, identities for people, in both electronic and physical representations, can be copied and/or loaned, which means that certifications can be loaned. This combination of identification and certification creates considerable complexity for personal identification which is not present for software identification.
The fundamental mechanism for highly reliable software and personal identification is the same. One has a very complex identification structure from which a small subset or signature suffices to establish identity. For a person, the identification structure includes physical characteristics (e.g., fingerprints, voiceprints, face prints, walking gait, keystroke behavior) and internal information (e.g., knowledge of passwords and personal history). For software, there are no physical characteristics but a complex internal information structure can be created to form the basis for secure identification. These structures can be both efficient and secure in the sense that they cannot be broken or reverse engineered by observing and analyzing the signatures, are secure against typical attacks like replay, provide for essentially an unlimited number of pseudonyms, and allow complete privacy.
A program has a name, many pseudonyms, and an identification. The identification is the complex structure embedded within a program from which it generates the signatures used for identification. The signatures can be derived directly from the program's innate identity.
For example, consider a simple program named P with instructions in a fixed format (e.g., an executable object file). Then its identification is its set of machine instructions, indexed 1 through N. A signature of program P is a subset S of the program's instructions, for example instructions k1 through kj. In this example, assume that N is 8,000, j is 5, and each instruction has 32 bits. Then a signature has about 5*(13+32) = 225 bits. There are potentially about 10^75 different signatures possible for the program P, but the bits are not actually random, so the actual number of different signatures is much smaller. Even so, the number of different signatures is very large, probably more than 10^12.
As another example, the program P can identify itself with a pseudonym and select a signature S = (ki, Ii) for i = 1 to 5, with five random indices ki, where Ii is the corresponding instruction of program P. This creates another name for the program P which has the identifying signature S. If the program has only forty instructions and uses five of them per signature, then it can generate over 650,000 distinct signature and pseudonym pairs. It can then pass a pair (P, S) to another program Q, and later use the pair for communication with the program Q.
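A minimal Python sketch of these (index, instruction) signatures follows. The "program" is simulated by a list of random 32-bit integers; all names and sizes are illustrative assumptions. With forty instructions and five per signature there are C(40, 5) = 658,008 possible index sets, which is the "over 650,000" figure above.

```python
import random
from math import comb

# A toy "program" P of forty 32-bit machine instructions.
rng = random.Random(42)
INSTRUCTIONS = [rng.getrandbits(32) for _ in range(40)]

def make_signature(instructions, j=5, seed=None):
    # A signature is j (index, instruction) pairs drawn from the code;
    # each freshly drawn signature can back a new pseudonym.
    r = random.Random(seed)
    indices = sorted(r.sample(range(len(instructions)), j))
    return [(k, instructions[k]) for k in indices]

def verify(instructions, signature):
    # The holder of the code can check every claimed pair against it.
    return all(instructions[k] == word for k, word in signature)

sig = make_signature(INSTRUCTIONS, seed=7)
```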
When a program establishes contact with another program, there is a registration event where the identity information is exchanged. In practice, the registration normally occurs when the programs are assembled into a system and is carried out by the system builder. For example, if a program P is to establish contact with another program Q, then the program P gives the pair (P, S) to the program Q where S is the signature of the program P which the program Q can use to identify it. A simple example of this communication protocol is shown in
These protocols illustrate basic mechanisms for using identification signatures. More complicated protocols are used to increase the security and to foil other types of attacks. Even so, this basic mechanism makes it difficult for one program to fool another by some type of exhaustive trial and error or pattern analysis of possible signatures.
The identification discussed above is actually very efficient in that it requires very little memory and computation. By using more (index, instruction) pairs, the program identification can be complicated to the point that brute force attempts or exhaustive search to find a correct signature can become pointless. However, this method can have shortcomings in certain instances. One instance is if the program does not have a built-in index of its instructions. Another instance is that the number of possible pseudonyms may be quite limited if the program is short, especially if an (index, instruction) pair is never reused in a signature. Yet another instance is the potential for leaking the code of the program if there is collusion among programs interacting with it. That is, all or almost all of the program's instructions could be collected by other programs which pool their knowledge to discover the program's instructions.
One alternative for the identification structure that does not have these shortcomings is to create signatures using data lists. Instead of the actual code of the program P being the identification data list, a separate list of random content, call it IDlist, is inserted into the program P to identify it. The IDlist can be tailored to the application and security level requirements. Thus, the IDlist can be a random list of 10,000 8-bit numbers, or a list of 1,000 80-bit numbers, or a list of 10,000 80-bit numbers, etc. The size of the list and the number of items in the signature can be used in the tailoring. This approach may be expensive in memory usage for a short program, however for a program with hundreds of kilobytes of code this approach may increase the length very little and it is very fast to compute and verify a signature.
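The IDlist alternative can be sketched as follows; it differs from the code-based scheme only in that the identification data is a separate random list inserted into the program. The list size, item width and seed here are illustrative, not prescribed values.

```python
import random

def make_idlist(n_items: int, bits: int, seed: int) -> list[int]:
    # A separate list of random content (IDlist) inserted into the
    # program solely for identification; size and item width are
    # tailored to the application and required security level.
    r = random.Random(seed)
    return [r.getrandbits(bits) for _ in range(n_items)]

IDLIST = make_idlist(10_000, 80, seed=2024)  # ~100 KB of raw data

def signature(idlist, indices):
    return [(k, idlist[k]) for k in indices]

def verify(idlist, sig):
    return all(idlist[k] == v for k, v in sig)
```

For a program of hundreds of kilobytes this list adds little relative length, and computing or verifying a signature is a handful of list lookups.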
Another alternative identification structure is to create signatures with random number generators. Instead of having a list of random numbers one might simply use a random number generator. Compared to the above example, one is trading off memory usage for computing time. However, the amount of computing time required is low and essentially fixed, and the complexity of the random number generator can be made extremely high. The technique of the one (1) pass random number generator can be used. An example of this type of identification structure is shown in
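A sketch of the generator-based alternative follows: the k-th identification item is computed on demand from an embedded secret rather than stored, trading memory for a small, fixed amount of computing time. The seed-mixing arithmetic and the secret value are illustrative assumptions, not a prescribed construction.

```python
import random

SECRET = 0x5EED_CAFE  # the program's embedded secret seed (illustrative)

def id_item(secret: int, k: int, bits: int = 80) -> int:
    # The k-th identification item is generated on demand instead of
    # being stored; the effective "list" can be made as long as desired.
    return random.Random(secret * 1_000_003 + k).getrandbits(bits)

def respond(challenge_indices):
    # Answer a challenge exactly as if a stored IDlist were consulted.
    return [id_item(SECRET, k) for k in challenge_indices]
```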
Checksums and hash functions can also be used as alternatives for identification structures. The idea of using a hash function to checksum data lists can be applied in many other ways. First, one can checksum any list of numbers including those of a signature, i.e., the data lists, the random numbers used in the preceding examples or the object code of a program. The advantages are: (1) the checksum is shorter than the data itself, so there is less to communicate, (2) the source of the signature is further obscured, so it is impractical to determine the original signatures, (3) the need for security in communication is reduced, and (4) it is faster to check the signature. The disadvantages are: (1) it is more work to compute the checksum and its hash function, and (2) if enormous numbers of signatures are needed, there is a very small risk that they are repeated.
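A hashed signature can be sketched as follows; SHA-256 is an assumed hash choice for the illustration, and the 80-bit item width matches the earlier IDlist examples. Only the short digest travels, so the source items stay obscured and the comparison is fast.

```python
import hashlib

def checksum_signature(sig_items, nonce: bytes = b"") -> str:
    # Hash the signature items (with an optional nonce) so that only a
    # short digest is communicated rather than the items themselves.
    h = hashlib.sha256(nonce)
    for v in sig_items:
        h.update(int(v).to_bytes(10, "big"))  # 80-bit items
    return h.hexdigest()
```

The receiver recomputes the digest from its own copy of the items and compares; any change to any item changes the digest.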
There are various security levels of identification information, IDs. When component A is contacted, the contacting entity uses a name and may also provide some auxiliary information about its identity and authorization. This identification information determines the identification security level of an ID and there might be a sequence of challenges or exchanges of information as in a challenge response situation. When A is contacted, it examines the identification information. Even when component A is in the public mode it may examine this information to detect erroneous contacts such as being provided a character sequence when a number is required, or being provided a negative number when a positive number is required. The identification information is to provide component A with the means to check the authorization for the contact. A password is the simplest and most common means of providing some security when contact is made. The security of transferring identification information between components is preferably handled by a secure infrastructure.
We identify four levels of identification security for components: none, password secure, semi-secure and secure. The first is no identification security which is where component A may check that the identification information is operationally valid but otherwise assumes the contact is authorized. If the contents of the identification information can be ascertained from easily available knowledge, then there is no intrinsic security in its content.
The second level is password secure which is where component A checks the identification information to make sure that it has the correct content such as a password. This content is invariant, so that, once compromised, any outsider with this content is authorized to use A. Obviously there can be a wide variety of actual security strengths within this level.
The third level is semi-secure, which is where the component A is contacted by a component B and then there is an exchange of information of a challenge-response type. The exchange is said to be simple if the logic behind it is simple, that is, if the rules for the response could be guessed by observing a fair or perhaps large number of exchanges. A simple example is for A to send B a number N and then B to return a password plus the date N days in the future. Another example is for A to send B a number N and then B to return the result of a logical exclusive-or operation on the password with the date N days in the future. This definition depends on the meaning of simple. We say the rules are simple if a person who knows them could easily remember them for several days without writing them down, or if someone observing ten to a thousand examples of exchanges could derive the password algorithm. Thus a person who knows something about the rules of B could imitate B and gain access to the component A.
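The second semi-secure example above can be sketched directly. This is an illustrative Python fragment; the shared password is invented, and the exchange is deliberately "simple" in the sense just defined, which is exactly why this level is only semi-secure.

```python
import datetime

PASSWORD = b"hunter2!"  # shared secret, purely illustrative

def response(n_days: int, today: datetime.date) -> bytes:
    # B's reply: the password combined by exclusive-or with the
    # date N days in the future.
    future = today + datetime.timedelta(days=n_days)
    date_bytes = future.isoformat().encode()  # e.g. b"2025-01-31"
    return bytes(p ^ d for p, d in zip(PASSWORD, date_bytes))

def check(n_days: int, today: datetime.date, reply: bytes) -> bool:
    # A recomputes the expected reply and compares.
    return reply == response(n_days, today)
```

A different challenge N yields a different date and hence a different reply, so a captured reply cannot simply be replayed against a fresh challenge.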
A secure identification security level is where component A interacts with component B in a way that requires very large amounts of information and logic in order for B's identity to be accepted. This would require at least dozens of lines of code to compute the data and/or dozens of complicated data items. Examples are where B is a person and provides his fingerprints or a similar biometric, or where B is a program that receives a set of K numbers Ni from the program A and returns the K words found at locations Ni of a particular secret book. We assume that communication and transport in the infrastructure are secure.
The dividing lines between password secure, semi-secure and secure can be fuzzy but are useful for determining a security level. Nevertheless, these definitions do illustrate general ranges of security in identifications and the security of a fortified system is dependent on secure identifications of the components. The principal danger is that an ID is compromised so a program or person can spoof the fortified system using a false ID to gain some advantage.
The automatic creation of secure IDs from machines and software components is preferred for large and/or dynamic systems. High security requires that these identities have the privacy properties similar to personal biometrics. Fortified software usually needs identification that is efficient in both computation and communication as components might check identities very frequently, on the order of every millisecond or microsecond. Some techniques using random number generators can be used to achieve this secure identification of software necessary for fortified software just as biometrics have inherent random characteristics. When a new component or device is introduced into a fortified system, new secure identities are created for it. A very simple model of this would be to use a random number generator to create a new 16-character alphanumeric password for a password-secure component.
This approach is made highly secure by increasing the complexity of the information and the protocols for the exchange of information. If there is no predictable relationship between the input and the identification values, then a secure ID exists.
A fortified software system is similar to an organization that wants to assure its integrity, i.e., that all its members are exactly the ones expected. Such a software system might require very high security and have ten, a thousand or a million components operating on various devices (PCs, fingerprint readers, network servers, optical scanners, etc.). There are many aspects to fortifying such a system and one of these is that each software component must be positively identified. Many of the components need several pseudonyms, each to be used for communication with a different class of other programs. The system must even be able to differentiate among several “identical” programs which run on different PCs or devices. Highly secure operations may require that the identities of programs be verified more often than just once per use. For example, external security monitoring components of the fortified system might verify software identities every few minutes, seconds or milliseconds. Such a system is likely to be static in nature; that is, it is set up or updated infrequently and then operated very frequently.
A typical component needs to interact with other components of the system, components of other “trusted” systems, with entities that have the authority to modify certain of its parameters or properties, and external “untrusted” objects (people, programs, devices, etc.). The component should use a pseudonym and signature for interacting with each class of programs or components. Different levels of identity security are required, for example, none is needed when interacting with an untrusted entity.
Preservation of privacy means that no collection of signatures that occurs is sufficient to reveal the “true” identification information about a program. This concern is very important for people (e.g., fingerprints) but it is also important even for some software. An example of the technique for protecting privacy is shown in
When program P calls program Q, program P provides program Q with the M items of its signature. Program Q checks these against its set and, if correct, recognizes P. If program P wants to test the identity of Q, it can ask Q for the indices (k1, …, kM) at the start.
When program Q calls program P, program P asks program Q for its signature as above. Then program Q provides program P with its set of indices (k1, …, kM) of Q's signature and program P responds with the correct values (I1, …, IM) to be recognized by program Q.
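The registration and challenge steps above can be sketched as a small Python protocol. The class, method names and index choices are illustrative assumptions; registration is shown as a direct call, standing in for the system builder's assembly step.

```python
import random

class Program:
    def __init__(self, name, id_items):
        self.name = name
        self.id_items = id_items   # this program's own identification list
        self.known = {}            # signatures registered by other programs

    def register_with(self, other, indices):
        # Registration: normally carried out by the system builder when
        # the programs are assembled into a system.
        other.known[self.name] = {k: self.id_items[k] for k in indices}

    def prove(self, indices):
        # Answer a challenge by producing the items at the asked indices.
        return [self.id_items[k] for k in indices]

    def identify(self, caller_name, caller):
        # Challenge the caller with the registered indices and check
        # the returned values against the registered signature.
        sig = self.known[caller_name]
        indices = list(sig)
        return caller.prove(indices) == [sig[k] for k in indices]

rng = random.Random(1)
P = Program("P", [rng.getrandbits(80) for _ in range(1000)])
Q = Program("Q", [rng.getrandbits(80) for _ in range(1000)])
P.register_with(Q, indices=[17, 256, 512, 700, 901])
```

An impostor that knows P's name but not P's identification items fails the challenge, since it cannot produce the registered values.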
The security lies in the fact that there are so many possible signatures that none is ever reused, and even collusion among thousands of programs provides little information about the signature of program P. For example, if N = 10,000 and M = 5, then there are about 10^18 signatures possible. Even if 10,000 signatures are chosen at random, there is a substantial probability that many items are never used. By managing the assignment of items to signatures, a huge number of signatures can be created without compromising the security. If a random number generator is used instead of a data list, then the list effectively has millions of items and there is no risk of revealing the entire set of items.
The program P can create and launch other programs, call them agents, to help with various tasks. These agents can be used to search the net for information, to monitor devices or sensors that detect certain events, or to collect data on events occurring in a wide environment. These agents are probably somewhat autonomous and they must have names for a program to contact them, and identifying signatures for contacting a program. These agents must also be able to identify a program. In some applications, the agents can contact the desired program using a pseudonym, and in other applications, the agents simply wait to be contacted by the program.
As the use of software agents matures, agents will create new agents, which in turn, will create even more agents. These agents obviously need to interact with their creators; they may need to be able to interact in some way with the original or an intermediate creator in their ancestry, and to be able to recognize other agents that are descendants of the original or an intermediate creator in their ancestry. There might be thousands of such agents, each with a separate pseudonym and identity signature.
This identification technology places no constraints on the organizational form of the agents. The organization can have 2-way communication (each agent knows the other's identification), 1-way communication (only one agent knows the other's identification), or a mixture of these. Communication can be restricted to be “up” only, “down” only or “horizontal” only. The organization can be very structured (all agents know the entire organization and its structure) or amorphous (agents know they belong to the organization but do not know their position in it). Each agent needs an address book with perhaps a few entries or perhaps a very large directory. But each entry is of a reasonable size, perhaps a few dozen bytes. The organization can change dynamically with agents added or deleted easily. There can be a central information service to provide addresses for large organizations provided measures are taken to secure the service.
Assume that a program P is in charge of a search for terrorists and uses agents sent out over the internet. Each agent has
the ID of its creator,
the IDs of its siblings,
the pseudonym (flycatcher) of Program P but without the signature,
the signature of the agent network.
The network has a tree structure with Program P at the root. Agents may create sub-agents to extend the network. The agents have some detection technique to identify potential terrorists. Once a potential terrorist is identified the agent:
sends a message to flycatcher,
provides all the information to its creator who sends it up the network,
provides all its siblings with all the information.
These communications all use the agent IDs and network signature for secure identification of the participants. In case the network is damaged, Program P has the information and IDs to contact all the surviving agents. It is clear that such a network can be organized in many ways and use many protocols as suited for the network's goals.
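The tree-structured agent network and its reporting steps can be sketched as follows. This is a hypothetical Python illustration; the class, the network signature string and the reporting rules are invented for the example, with Program P at the root under its pseudonym flycatcher.

```python
class Agent:
    def __init__(self, name, creator, network_sig):
        self.name = name
        self.creator = creator          # None for Program P at the root
        self.network_sig = network_sig  # signature shared by the network
        self.children = []
        self.inbox = []

    def spawn(self, name):
        # Agents may create sub-agents to extend the network.
        child = Agent(name, self, self.network_sig)
        self.children.append(child)
        return child

    def siblings(self):
        if self.creator is None:
            return []
        return [a for a in self.creator.children if a is not self]

    def report(self, info):
        # On a detection: send the report up the network to the root
        # (flycatcher) and copy every sibling, as in the protocol above.
        msg = (self.name, self.network_sig, info)
        node = self.creator
        while node is not None:
            node.inbox.append(msg)
            node = node.creator
        for sib in self.siblings():
            sib.inbox.append(msg)

root = Agent("flycatcher", None, network_sig="net-077")
a1 = root.spawn("a1")
a2 = root.spawn("a2")
b1 = a1.spawn("b1")
b1.report("suspicious activity")
```

Because every message carries the agent's name and the network signature, each recipient can check that a report really originates inside the network.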
Software can also be used as an aid in identifying people. Reliable identification of people depends on assessing complex biometric characteristics of people such as faces, fingerprints and speech patterns. People have built-in mental facilities to support remembering some types of biometric identification but these facilities are not always reliable. Thus, society has generated mechanisms to support identification such as photo IDs and passports. In most situations the person produces his identity (produces credentials and/or allows biometric data to be measured) and this is compared with reference identity data. This approach is very reliable but there is the risk of the biometric data being stolen. There are various methods for securing the biometric data by allowing identifications to be made using subsets of the biometric identification. This process is the same as using signatures to identify software.
Personal identification that provides high levels of privacy and security requires computational support. People cannot perform the measurements, computations and transformations mentally. Further, there is an ever growing need to make secure identifications at a distance, e.g., over the network. Thus various computational aids have been developed to assist people with managing their identity data. The most common are smart cards that include both computational power and memory. Protocols and systems to protect personal identity information primarily use encryption and other standard security techniques. The personal identification problem using these aids has two components: problem (a): secure identification of the computational aid, and problem (b): reliable association of the computational aid with a person. If problem (b) can be solved then there is no need to use biometrics in the identification process.
There have been several solutions proposed for solving problem (b). One solution is embedding the aid as a computer chip in a person's body. Such a device has been approved recently by the FDA, but it is extremely simple. Another solution is using a challenge-response conversation to verify that the aid and the person both “know” the appropriate information. This expands the password concept into something that is both more reliable and more natural for people. Yet another solution is having people transmit transformed biometric information securely so that the aid can identify the person but no one else can interpret or use the transmitted information. This topic is discussed later as such transmissions are also needed for securing the integrity of software systems.
It is practical, even required, for people to use a computational aid for identification. Even though the use of smart cards is now widespread, the losses due to electronic identification fraud are still enormous and growing. The software identification technology presented here can then be used to provide high levels of security for people and organizations. Further, they can create private and secure software agents to aid their activities.
The purpose of security transformations is to protect against replay attacks or spoofing in communication. For these transformations we assume: (a) that both software components involved, say programs P and Q, have access to some shared or global information that changes continuously; and (b) that programs P and Q share a private function or procedure that uses the shared information to transform the signature each time it is sent. The transformation procedure itself need not be particularly secure. A simple example is a transformation based on time and random numbers. Let the global information be universal time T. Assume the frequency of communication is low, no more often than once an hour. Then T can be used as the seed for a random number generator RNG shared by both components to obtain a random sequence Rand = {Ri}, i = 1, 2, 3, …. The transformation is then for Rand to be added to the signature S by the sender and subtracted from the received values by the receiver. That is, P sends {Si + Ri} and Q computes {(Si + Ri) − Ri} = {Si}. This transformation is simple and effective in many cases. Its weakness is that it depends on the frequency of communication and the synchronization of clocks. The clock can be replaced by other items.
One alternative is to use information from the communication history of programs P to Q. For example, maintain a message count M, and use M as the seed for RNG instead of T. Or use some item from the content of the previous message from P to Q. For example, use every seventh character of that message to generate an eight character seed for RNG.
Another alternative is to use information from the current message between program P and Q. For example, use the first 8 characters of the message as the RNG seed to generate the sequence Rand and then use Rand to transform the remainder of the message which is its actual content. That is, program P sends Q the message {Ai=Ci+Ri} and Q computes {Ai−Ri}={Ci}, the original message. The first 8 characters of the message are ignored.
Yet another alternative is to use information that is universally available, such as yesterday's Dow Jones closing average, as the seed for Rand.
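The add-and-subtract transformation described above can be sketched in a few lines; the seed value stands in for universal time truncated to the hour (or any of the alternative seeds just listed), and the signature values are invented for the example.

```python
import random

def transform(signature, seed):
    # Sender side: add a seed-derived random sequence to the signature.
    r = random.Random(seed)
    return [s + r.getrandbits(32) for s in signature]

def recover(transformed, seed):
    # Receiver side: regenerate the same sequence and subtract it.
    # A replayed capture is useless once the shared seed has moved on.
    r = random.Random(seed)
    return [a - r.getrandbits(32) for a in transformed]

SIG = [101, 202, 303, 404, 505]
hour_seed = 485_112  # e.g. universal time truncated to the hour
wire = transform(SIG, hour_seed)
```

Both sides must hold the same seed at the same moment, which is exactly the clock-synchronization dependence noted above.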
So far security has been taken to mean that one cannot “break the code” that generates the signatures. This is, of course, essential for secure identification but it is not sufficient. We consider three other attacks on the security of software component identification: replay attacks, reverse engineering and physical attacks.
Replay attacks capture the identity information as it is transmitted and use it later. This attack is to copy the information transmitted and then replay it to “impersonate” the software component. This type of attack is widely used against the security of software systems. Fortunately, it can be defeated rather easily using transformations of the signatures; the defense techniques are presented in some detail below.
Reverse engineering involves the study of the program code to determine how the signature is created and then synthesize or copy the identification mechanism used by the program. Recall that an exact copy of the program cannot be distinguished from the original. However, copying is not a great danger if internal procedures are put into the program that prevent its misuse by copying. A complete security compromise can occur if all the code associated with generating the signatures can be recreated for another program to use. Protection against reverse engineering is a security issue orthogonal to identification. The measures used to prevent reverse engineering use a combination of obfuscation and tamperproofing (guarding) technologies.
Physical attacks modify the hardware of the machine that executes the program to alter its behavior, extract information, or for other unauthorized purposes. Again, these attacks are orthogonal to identification and sufficient measures must be taken to assure the integrity of the hardware that executes the program. One important type of protection is to include code in the program that tests hardware identity and its characteristics thoroughly.
Reliability refers to a loss of functionality as opposed to a loss of security. Thus, if communication is lost within parts of a fortified software system, the identification becomes unreliable although still secure. Consider the following examples: (1) Suppose the program P is executing on the machine Atlas and Atlas is destroyed by a lightning strike. How can the fortified system be reconstituted without P? (2) Suppose the cable between two machines is cut. How can the fortified system be restored? Will the entire system be disabled by this break? (3) Suppose the encryption between two machines is accidentally disabled (by an entity outside the system). How can security be restored? These are important issues that must be addressed by the fortification of a software system.
Reasonable responses to these events are as follows: (1) Programs that communicate with program P recognize, after some time, that program P does not respond. There is code within the system to react to this information and an entity that has the authority to restore the system or to modify its operation. (2) The procedure that handles “lost” machines can equally well handle “lost” connections. Often a system has multiple connectivity so one lost connection is easily or automatically replaced by another. (3) A fortified software system should use the general encryption of a secure network but it should also use its own encryption of messages in addition.
The theme of these responses is that events that cause loss of functionality must be anticipated and responses incorporated into the system in advance. A byproduct of these reliability steps is that there must be system backups. This, in turn, creates yet another security problem: one must protect the backups. This can be very important if the code involved is the computational aid for a person's identity. If that code is lost then the person may have very severe difficulties in recovering everything needed. Again, this is not an issue of identity protection specifically, but it is a related issue that must be addressed.
The policy system of a fortified software system has two distinct parts: the parts specific to the particular application, and the parts that provide general software security. The policy system is also a central entity for managing the security, identity and authorizations used by the system. In practice it is preferable to have a single entity managing policy even though this is not essential to security in principle. Otherwise, there is significant overhead in updating security controls and security errors become more likely. Policies can fall into three general categories: (1) policies specific to a particular application of the system, (2) generic system protection measures, and (3) policies about who, how and when authorizations are to be made or modified.
Once the policies are made, then the policy system manages the creation of identities associated with verification information. These identities are inserted at the appropriate places within the system components. The policy system also manages changes in policies. There is preferably someone authorized to change the policies and an audit trail is maintained of the changes.
Guard responses are coded into the program and determined by the security policy. These can be gentle reminders that something might be wrong, urgent messages to security authorities, locks on the entire system, repairing the changed code, or corrupting program execution.
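As an illustrative sketch (not part of the original specification), a guard of this kind can be modeled as a checksum comparison whose failure dispatches a policy-selected response; the digest, policy names and response messages below are invented:

```python
import hashlib

# Hypothetical internal guard sketch. The "protected" bytes, policy names
# and response messages are illustrative, not from the specification.
EXPECTED_DIGEST = hashlib.sha256(b"protected code bytes").hexdigest()

def guard_check(code_bytes, response_policy):
    """Verify a guarded code region and react per the security policy."""
    if hashlib.sha256(code_bytes).hexdigest() == EXPECTED_DIGEST:
        return "ok"
    # Responses range from gentle to severe, as determined by policy.
    responses = {
        "remind": "warning: possible tampering detected",
        "alert": "urgent message sent to security authority",
        "lock": "system locked pending restart",
        "repair": "restoring original code from a protected copy",
    }
    return responses[response_policy]
```

The point of the sketch is only that the detection step and the response step are separate, so the same guard can serve different policies.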
Dynamic policies are those that can be changed while the system is deployed, even while it is operating. These are policies that can be modified by changing a few data items in the system software. For example, changing the identity of the person guarding the bank vault can be effected by changing a few items within the code; adding a fourth person to run the ski lift can be done by adding a new entry, along with identifying information, to a database of lift operators; or a fingerprint reader can be replaced by updating the stored serial numbers. Practical operational efficiency requires that it be easy to make security changes. Otherwise, people will try to avoid making changes even when they are necessary for high security.
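As a sketch of such a dynamic change (the operator names and fields are invented), the ski-lift example amounts to editing a small authorization table while the system runs, with no code change at all:

```python
# Illustrative authorization table for the ski-lift example.
# All names, badge numbers and fingerprint IDs are invented.
lift_operators = {
    "alice": {"badge": 1001, "fingerprint_id": "fp-17"},
    "bob":   {"badge": 1002, "fingerprint_id": "fp-23"},
    "carol": {"badge": 1003, "fingerprint_id": "fp-31"},
}

def add_operator(name, badge, fingerprint_id):
    """Adding a fourth operator is a data change, not a code change."""
    lift_operators[name] = {"badge": badge, "fingerprint_id": fingerprint_id}

def is_authorized(name, badge):
    entry = lift_operators.get(name)
    return entry is not None and entry["badge"] == badge
```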
Static policies are intrinsic to the system and cannot be changed without rebuilding some components of the system. For example, changing from a one-level challenge response mode to a two-level challenge response mode requires that new code be added to the components to generate and process the new types of challenges and responses. Of course, several different modes can be included in the system and then a switch can be used to change dynamically between them. It is often impractical to build a highly flexible capability for all of the changes in a system. The system designer must decide which policies are to be dynamic and which are to be static. In practice, it is expected to take several iterations to identify a good balance between the two choices. It is sometimes feasible to automate the rebuilding of certain components so that changing static policies is less burdensome on the system support staff.
The system policy manager has responsibility for all the dynamic policies. Logically, the system policy manager can be thought of as a separate system component with global connections to the other system components. In practice, however, there are at least two advantages to distributing the manager throughout the system. First, especially for a large system, there are simple things that are more efficient to do locally. For example, giving a sixth person the authority to access the fourth floor storage closet should probably be implemented by the software of the building facilities supervisor rather than that of the company's chief security officer. Second, and more important, the security of the system is stronger if the security policy is distributed throughout the system. Thus, instead of having a single system policy manager that can be attacked, an attacker has to deal with many system components where the security policy functions are mingled with all the other operations.
It is a substantial and technically difficult task to fortify a large or even a medium-sized software system. Two systems are involved in fortification: first, the software being fortified, and second, the system that creates the fortification. The fortified software is of course modified during the fortification process. In principle, fortified software can be created in many ways as long as the result is secure. In practice, unless the fortified software system is rather simple, it is much more efficient to use a systematic and deliberate approach to create it.
An outline of a systematic and deliberate approach is shown in
Steps 1 and 2 are standard in software development. In Step 1, the goals and methods of the software system are defined, and Step 2 is the beginning of the parallel design of the software system and the fortification.
Step 3 is where a skeleton version of the fortified software is created for use in the fortification design and development. It is at this point that some of the data protection policies are developed.
Step 4 includes two parallel actions: the prototype system code is written and the prototype security plan of Step 3 is implemented. It is in Step 4 that the security policies are put into the skeleton code.
In Step 5, the markers for the special security information and the actual special authorization code are inserted into the system. Simultaneously, in Step 5, the security policies are tested and validated using the prototype system code. This is where parts of security policies are transferred into the system code.
In Step 6, the system code is tested and validated. This includes the security authorization codes but not the other security items. In parallel, in Step 6, the policy manager and guards are created, the skeleton security is validated and the security testing is defined. The final structure of the fortified software is used to validate the security plan. Also in Step 6, special security items are implemented.
Step 7 is the integration of the system code with the security. In this step, the fortified software is implemented and the fortification is completed. Typical specific steps that are performed here include:
Source code obfuscation, if any.
Create and insert source code for identity creation and testing, if any.
Insert any policy manager code distributed into system components.
Compile source code.
Obfuscate machine code, hide data items identified by markers.
Tamperproof binary codes; create both internal guards and guards in one component that guard another. More obfuscation of machine code and hiding data items.
Compute data for external guards.
Compile external guards and policy manager.
Tamperproof external guards, policy manager, etc.
Step 8 includes system and security tests and is when final acceptance tests for the fortified system are performed.
As an example, consider an airport passenger check-in system that identifies passengers, accesses existing ID databases and screens the passengers for potentially dangerous people. The system is to protect the privacy of individual data, not delay passengers unduly and be secure against hacker attacks. The description is simplified here to concentrate on the “be secure against hacker attacks” requirement. We assume that the biometric identification used, called BioID, is fingerprints. The basic requirements of the check-in procedure are:
The components and interfaces of the counter check-in system are shown in
There are three types of attacks that could compromise the security of this system. First is spying and spoofing on connections 140, 142, 144 and 146. Our assumption of a secure infrastructure means that spying is not a concern because the communication is secure. However, spoofing is a concern, and we must ensure that the devices connected are the correct devices. This is done using the secure identities and challenge-response identity verification procedures. Second is impersonation (by people or programs) at components 130, 132, 134, 136, 138, 139 or by an agent or a passenger. Again, secure identities are used to prevent this. However, some of these identities are not electronic, so other means must be used; typical examples are:
All of the system programs are tamperproofed, as with the Arxan EnforcIT tool. This includes components 130-139. The tamperproofing creates a network of internal guards within each of these programs. When tampering is detected, the responses programmed into these components follow the policies set in the policy system. At a minimum, these responses notify the agent and the overall airline passenger management system, and the local check-in system stops processing passengers until the supervisor restarts it. The internal guards in the fingerprint reader 132, the ticket reader 134, the keyboard 138 and the display 139 check codes, data, and machine IDs. These guards are in simpler computing environments where it is easier to identify the executable code. One must also ascertain exactly how the device serial numbers and other identification information are accessed. The fingerprint reader 132, the ticket reader 134 and the agent's computer 136 have small internal memory files of IDs and relevant policies (installed by the fortification process).
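The challenge-response identity verification used between these components can be sketched as a keyed-hash exchange; the shared key and function names below are hypothetical, and a real deployment would follow the system's own identity protocols:

```python
import hashlib
import hmac
import os

# Hypothetical shared secret installed in both components by the
# fortification process; the value here is purely illustrative.
SHARED_KEY = b"device-secret-key"

def make_challenge():
    """The verifier sends a fresh random challenge."""
    return os.urandom(16)

def respond(challenge, key=SHARED_KEY):
    """The genuine device proves it holds the key without revealing it."""
    return hmac.new(key, challenge, hashlib.sha256).hexdigest()

def verify(challenge, response, key=SHARED_KEY):
    """Constant-time comparison against the expected response."""
    return hmac.compare_digest(respond(challenge, key), response)
```

Because each challenge is fresh, a spoofing device cannot replay an earlier response; it would need the shared key itself.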
The agent's computer 136 and the local passenger database 134 have internal guards to check themselves. In addition they act as external guards to check each other and components 130, 132, 138, 139 and the agent. They have substantial memory files of IDs and policies installed by the fortification process. They also have independent external guards.
The computers, components 134 and 136, have public IDs and are attached to various networks. All the devices have local private or shared IDs. The entities in the check-in system are listed along with their identifications of various types.
Sample application, generic and authorization policies for this system are listed below. The components and connections are identified by the numbers in
Application Specific
Generic Protection
Authorization
Another example of a fortified system can be illustrated by an election voting system at a voting site. The fortified system must: (i) identify people: the voters and the staff (poll workers, party representatives and political authorities), (ii) access voter record databases, (iii) allow voting and (iv) collect the results. The system is to protect the privacy of individual data, not delay voters unduly and be secure against hacker attacks. The description is simplified here to concentrate on the “be secure against hacker attacks” requirement. We assume that the biometric identification used, called BioID, is face-prints and fingerprints. The basic requirements of the voting procedures are as follows:
The structure of the overall voting system is illustrated in
There are four types of people involved in this process: (1) political authority, the entity running the election; (2) party representatives, one for each party involved, running the voting site; (3) poll workers, one for each terminal of the system; and (4) voters. Only the political authority is fixed; the other staff may change during the voting, but they all must be identified and registered in advance, and then be recognized and authorized as they assume their roles. They may come and go during the voting. Face-prints are checked from time to time for those using the poll control machine and terminals. The fortified system has no external network connections. Its software and databases are initialized in advance by the political authority using physical storage devices carried to the polling place. A complete audit record is kept of the events at the voting site. We assume these records are secure to simplify the discussion.
There are many potential attack points in the voting system. The voting site system has the K+N+2 physical components seen in
All programs in the fortified system are tamperproofed, as with the Arxan EnforcIT tool. This includes the poll control, terminals, voting machines and BioID devices. The tamperproofing creates a network of internal guards within each of these programs. When tampering is detected, the responses programmed into these components follow the policies set in the policy system. At a minimum, these responses notify the party representatives, create an entry in the voting audit record, and the voting site system stops processing voters until the party representatives restart it. The internal guards in all components, except BioID devices, check codes, data, and machine IDs. All these machines have internal memory files of hardware identification information and relevant policies (installed during the fortification process). In addition, the poll control machine contains external guards to check all the other components. It has a memory file of identification information and policies installed during the fortification process. The terminals have external guards that protect the poll control machine software.
The computers have public IDs. All the devices have local private or shared IDs. The entities in the voting site system are listed below along with their various types of identifications.
Application, generic and authorization policies used for the fortified system are listed below. When a time interval is given for checking, it means an average value. Actual values are preferably varied randomly within about twenty percent of this average. The generic word “machines” includes the poll control, the terminals and the voting machines.
Application Specific
Generic Protection
Authorization
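The randomized check intervals prescribed by these policies, varied within about twenty percent of a stated average, might be computed as in this minimal sketch:

```python
import random

def check_interval(average_seconds, spread=0.20):
    """Return a check interval varied randomly within about twenty
    percent of the stated average, so guard checks are not predictable.
    The function name and default spread are illustrative."""
    return average_seconds * random.uniform(1.0 - spread, 1.0 + spread)
```

Randomizing the interval prevents an attacker from timing an attack to fall between two known check points.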
As an example of the use of multiple IDs, consider a function MyID(Input) where the value computed is not related to Input in any predictable way. MyID could, for example, just look up numbers from a table of 10,000 numbers (they need not even be different). Identities with different names are generated for different contacts, and each is given a key (password) with which to verify my identity. A table as shown in
When I contact MyBank the exchange is as shown in
This approach is made highly secure by increasing the complexity of MyID and the protocols for exchange of information. MyID could use a 12-digit input and produce four values, each with 12 digits. This provides 10^36 potential ID values and 10^12 possible names; with only 10^12 inputs to MyID there can actually be only 10^12 different outputs. If there is no predictable relationship between the input and the ID values, then a secure ID exists. A wide variety of communication applications can be made secure using this technology.
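A minimal sketch of such a MyID function follows, scaled down to a 1,000-entry table of random numbers; the names, seed and sizes here are illustrative, not from the specification:

```python
import random

# Scaled-down MyID sketch: a lookup table of random numbers, so the
# output bears no predictable relationship to the input. A fixed seed
# makes this illustration reproducible; a real table would be secret.
random.seed(12345)
_TABLE = [random.randrange(10**6) for _ in range(1000)]

def my_id(input_value):
    """Look up an ID value; nothing about the input predicts the output."""
    return _TABLE[input_value % len(_TABLE)]
```

Because the table entries are arbitrary, knowing many (input, output) pairs reveals nothing about the outputs for unseen inputs.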
An example of hiding and protecting data is described with reference to
First, a simple statement in the software, say X=DATA+1, is randomly selected and replaced with the statements shown in
The test could be even more explicit such as shown in
Note that neither the number 360194 nor the string 0a+ appears anywhere in the resulting software. Of course, this simple example does not hide 0a+ very well, but one can extend this approach extensively and then obfuscate the resulting code to make it very difficult to determine the correct password from the information in the software.
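To illustrate the technique with constants of our own choosing (not those of the figure), a password test can be folded into arithmetic so that neither the password string nor an obviously meaningful constant appears literally:

```python
# Illustrative sketch only: the password "0a+" is never stored as a
# string, and its folded value never appears as a single constant.
def check_password(entered):
    if len(entered) != 3:
        return False
    # Fold the three character codes into one number.
    x = ord(entered[0]) * 65536 + ord(entered[1]) * 256 + ord(entered[2])
    # The comparison value is assembled from innocuous-looking
    # arithmetic: 1259 * 2518 + 441 equals the folded secret.
    key = 1259 * 2518 + 441
    return x == key
```

Obfuscating the surrounding code further, as the text describes, makes recovering the password from these fragments much harder still.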
Data can be protected from tampering by using both internal and external guards. External guards provide stronger protection because they are harder to find and their anti-tamper actions are not synchronized with the execution of the program containing the data. Micro guards are useful to provide special protection to particularly important data items. Micro guards are very short guards (1 or 2 statements) which check one “item” in a program. They are very hard to detect and execute very fast, which makes them very well suited for use in external virus guards.
Special guards can be used to protect against viruses, dynamic attacks and clone attacks. There is a class of attacks that involves inserting malware into code at the very beginning (or elsewhere). Special guards are needed which focus on the common properties of these attacks. The basic steps in these protections are as follows:
The goal of virus guards is to protect against viruses being inserted into a program. Internal virus guards do exactly the things described above. External virus guards can also check the start and transfer points plus other empty spaces. External virus guards provide additional protection because the guarding is not synchronized with the program's execution. In particular, they are able to check the initial statements of P before they execute to initiate a virus attack. A network of guards can be created that makes these checks both before and after the program executes and at random times during the program's execution, thus providing complete virus protection. Virus guards can also provide much better defenses against dynamic and clone attacks, which involve inserting “virus-like” code into the program.
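An external virus guard's scan of empty space can be sketched as follows; the image layout, region list and fill value are assumptions made for illustration:

```python
# Illustrative external virus guard: scan the "empty" padding regions of
# another program's image for inserted code. Padding is assumed zero-filled.
EMPTY_FILL = 0x00

def scan_empty_regions(image, regions):
    """Return regions whose padding is no longer empty, i.e. where an
    attacker may have parked virus or dynamic-attack code."""
    suspicious = []
    for start, end in regions:
        if any(b != EMPTY_FILL for b in image[start:end]):
            suspicious.append((start, end))
    return suspicious
```

Run at random times, such a scan catches attack code regardless of where the guarded program happens to be in its execution.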
A dynamic attack against a program P proceeds as follows. One finds a spot S#1 in P that is not checked before it executes [the first statement always qualifies], copies S#1's code to empty space and inserts new code which performs step #1 of the attack. One then locates a spot S#2 which is not checked between the time S#1 is executed and S#2 is reached, copies S#2's code to empty space and inserts new code which performs step #2 of the attack. This chain is continued until the attack is complete. The final step may include erasing all the inserted code and restoring the original code to remove the evidence of the attack; the original code may also be restored step by step. The dynamic attack is always “on the move” to avoid detection. At some crucial time the attack's action is taken. Such an attack can be used to steal $10 million from Mr. X's bank account. The attack starts after the bank's system has identified Mr. X making a transaction, e.g., an ATM withdrawal. The system is hijacked to (a) send $10 million to a safe offshore account, (b) update all records to show Mr. X authorized the transfer, (c) continue with the ATM withdrawal, and (d) erase all traces of the attack. Such attacks appear complex at first, but following the details of one makes it easy to see how to mount one in general.
An external guard can check all the empty spots in program P to detect the code that such an attack uses. Further, the external guard's checking of P's code is not synchronized with the execution of P so that the attacker is unable to avoid detection by being always “on the move” away from the guarding. A dynamic attack on a well tamper-proofed (by internal guards) program is very difficult. One must identify the guards and other protections of program P in detail and then devise a strategy to move code around to avoid detection. Nevertheless, a dynamic attacker can probably succeed no matter how well P is protected by internal guards (including silent, repair and other types of internal guards). Using external virus guards makes it easy (and relatively cheap) to prevent dynamic attacks. A successful dynamic attacker must defeat both the internal and external guarding.
A clone attack on the code P operates as follows:
An external guard normally cannot locate the program Q but it can observe that statement 1 of program P is wrong. Thus, a virus guard can detect a clone attack and take appropriate action. Note that the guard must check P rather often; the checking interval should be substantially less than the time to execute P. The clone attack can also be detected by the fact that many variables in program P are changing while program Q executes and an external guard can check these.
Anti-cloning guards are repair guards used in a special way to defend against clone attacks. Early in the program repair guards are inserted that correct deliberate errors in code executed later. These corrections take place in the program P and not in the program copy Q. As a result, the cloned code has errors and does not execute properly. To help hide the guard, the code can be re-damaged later so the repair is not revealed by a postmortem dump. Note that silent guards are also anti-cloning guards as their protection is unaffected by cloning.
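The repair-guard idea can be sketched as follows, with invented values: a constant is shipped deliberately damaged, repaired in place early in execution, and re-damaged afterward, so an unrepaired clone computes a wrong result:

```python
# Illustrative anti-cloning repair guard. The "image" dictionary stands
# in for the program's mutable state; all names and values are invented.
def run(image):
    """Execute program P against its image, repairing and re-damaging
    a deliberately wrong constant along the way."""
    # Repair guard, executed early: fix the deliberate error in place.
    image["rate"] = image["rate"] + 37   # damaged value 63 becomes 100
    result = image["rate"] * 2           # later code relies on the repair
    image["rate"] = image["rate"] - 37   # re-damage so a postmortem dump
                                         # does not reveal the repair
    return result

# The stored (clonable) image ships with the deliberate error in place.
stored_image = {"rate": 63}
```

A clone Q built from the stored image, executing the later code without the repair guard, works from the damaged value and misbehaves.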
While the invention has been illustrated and described in detail in the drawings and foregoing description, such illustration and description is to be considered as exemplary and not restrictive in character, it being understood that only exemplary embodiments have been shown and described and that all changes and modifications that come within the spirit of the invention and the attached claims are desired to be protected.
This application claims the benefit of U.S. Provisional Application Ser. No. 60/592,039, filed Jul. 29, 2004.
Number | Date | Country
---|---|---
60592039 | Jul 2004 | US