This invention relates generally to computer systems, and deals more particularly with detection of malicious computer attacks such as those caused by computer viruses, worms and hackers.
Malicious computer attacks, such as manual “hacker” attacks, computer viruses and worms are common today. They may attempt to delete, corrupt or steal important data, disable a computer or conduct a denial of service attack on another computer.
A manual attempt to “hack” a victim's server or workstation begins when a person (a hacker) at a remote workstation attempts in real time to gain access to the victim's server or workstation. This typically begins with the hacker entering many combinations of user IDs and passwords, hoping that one such combination will gain access to sensitive software or data in the server or workstation. A hacker may also transmit an exploitation program which automatically exploits vulnerabilities in a victim's server, as the hacker would otherwise do manually.
A computer virus is a computer program that is normally harmful in nature to a computer. Computer viruses are received via several media, such as a computer diskette, e-mail or vulnerable program. Once a virus is received by a user, it remains dormant until it is executed by the user or another program. A computer worm is a computer program similar to a computer virus, except that a computer worm does not require action by a person or another program to become active. A computer worm exploits some vulnerability in a system to gain access to that system. Once the worm has infected a particular system, it replicates by executing itself. Normally, worms execute themselves and spawn a process that searches for other computers on nearby networks. If a vulnerable computer is found, the worm infects this computer and the cycle continues.
Computer attacks are typically received via the network intranet or Internet, and are targeted at an operating system. Often, a computer virus or worm is contained in a file attached to an e-mail. Computer firewalls can prevent some types of attacks transmitted through a network. However, a computer exploit can use encryption technologies to transmit information through firewalls. Alternately, the computer exploit may be embedded in an image that can pass through the firewall.
Most computer attacks have a characteristic “signature” by which the attack can be identified. An intrusion detection system can also be used to detect known computer attacks by matching key words of the attack program to a known signature. However, until a computer attack becomes known and its signature determined, it can avoid the intrusion detection system.
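By way of illustration only, signature matching of the kind described above can be sketched as follows. The sketch assumes that a signature is a plain byte pattern; the signature names, the patterns and the function name `match_signatures` are hypothetical and are not taken from any particular intrusion detection system.

```python
# Hypothetical signature database: name -> byte pattern (illustrative values only).
KNOWN_SIGNATURES = {
    "example-worm": b"EXAMPLE_WORM_MARKER",
    "example-virus": b"\xde\xad\xbe\xef",
}

def match_signatures(payload, signatures=KNOWN_SIGNATURES):
    """Return the sorted names of all known signatures found in the payload."""
    return sorted(name for name, pattern in signatures.items() if pattern in payload)
```

An attack whose pattern is not yet in the table returns an empty list, which is the limitation noted above: an unknown attack is not detected.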
Another known method for identifying malicious software is to heuristically check system operation to identify unusual behavior. For example, if a system iterates through all files or all files of a certain category, and changes or deletes them, this may be considered unusual behavior. As another example, it may be considered unusual behavior for software to iterate sequentially through the file system overwriting the start of each executable file. As another example, if an application connects to the Internet when the current workload of the system does not require such a connection, this would be considered unusual behavior. Heuristic checking software was previously known to detect these types of unusual/suspicious behavior. The heuristic checking software monitors all programs that are executing to detect these types of behavior and flags an alert to the user, hopefully before too much damage is done.
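One of the heuristics mentioned above, a program that changes or deletes an unusually large number of files, might be approximated as shown below. This is a minimal sketch under stated assumptions: the class name, the threshold value and the idea of counting distinct files per process are illustrative choices, not a description of any existing heuristic checking product.

```python
from collections import defaultdict

class MassModificationHeuristic:
    """Flag a process once it has changed or deleted many distinct files."""

    def __init__(self, threshold=100):
        self.threshold = threshold          # hypothetical cut-off, tuned per system
        self._touched = defaultdict(set)    # process name -> set of modified paths

    def record(self, process, path):
        """Note one change or deletion; return True if the process now looks suspicious."""
        self._touched[process].add(path)
        return len(self._touched[process]) >= self.threshold
```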
It was also known to identify suspicious network communications as follows. Firewalls detect all attempts to connect to the Internet and are capable of blocking certain types of messages. The user is required to configure the security policy, although most systems have default settings that are suitable for the average user. The security policy determines the type of connections that are allowed to pass through the firewall. For example, the security policy may allow HTTP access on a particular port to download HTML. However, the firewall will block other types of messages. If an attempt is made to pass such a message through the firewall, a dialogue box is generated that alerts the user to the prohibited message. The dialogue box may also ask the user whether such a message should be allowed to pass through the firewall.
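For illustration, the policy check and user prompt described above can be modelled as follows. The sketch assumes a policy is a set of allowed protocol and port pairs; the names `Rule`, `DEFAULT_POLICY`, `check_connection` and the `ask_user` callback are hypothetical, and the callback stands in for the dialogue box.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    protocol: str
    port: int

# Hypothetical default policy: allow only HTTP on port 80.
DEFAULT_POLICY = frozenset({Rule("HTTP", 80)})

def check_connection(protocol, port, policy=DEFAULT_POLICY, ask_user=None):
    """Return True if the connection may pass through the firewall."""
    if Rule(protocol, port) in policy:
        return True                             # permitted by the security policy
    if ask_user is not None:
        return bool(ask_user(protocol, port))   # stand-in for the dialogue box
    return False                                # blocked by default
```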
Information flow diagrams are known for use in analyzing software during development. See “Certification of Programs for Secure Information Flow” by Dorothy E. Denning and Peter J. Denning in Communications of the ACM, 20(7):504-513, July 1977 for details of information flow techniques. This publication is hereby incorporated by reference as part of the present disclosure.
An object of the present invention is to facilitate the detection of malicious software within or attacking a system.
The invention resides in a system, method and program product for detecting malicious software within or attacking a computer system. In response to a system call, a hook routine is executed at a location of the system call to (a) determine a data flow or process requested by the call, (b) determine another data flow or process for data related to that of the call, and (c) automatically generate a consolidated information flow diagram showing the data flow or process of the call and the other data flow or process. After steps (a) to (c), a routine is called to perform the data flow or process requested by the call. A user monitors the information flow diagram and compares the data flow or process of steps (a) and (b) with a data flow or process expected by said user. If there are differences, the user may investigate the matter or shut down the computer to prevent damage.
The information flow diagram may represent the physical and virtual locations of information entities at stages of a processing activity. The set of system functions which are monitored may include: open file, copy file to memory, copy memory to register, mathematical functions, write to file, and network or communication functions.
The present invention provides a system, method and program product for monitoring and displaying the activity of a computer system in real time as an information flow diagram. This permits an operator to determine if the activity appears consistent with the bona fide work requested of the computer system. The information flow diagrams show the physical and virtual location of information entities at all stages of their processing and the operations, such as the copying, encryption and transmission of information. The information flow diagrams are generated in real time as the information is being processed and moved about the computer.
In order to automatically construct the information flow diagrams, system calls are “hooked”. “Hooking” is the insertion of an additional routine at a call location in an operating system or other program and the relocation of the original, called routine from the call location. In accordance with the present invention, the additional routine is used to monitor system activity. After the additional routine is executed, it calls the operating system routine to perform the function requested by the software application, so that the function of the operating system is preserved. After the operating system function executes, it returns to the application, so the application will not realize any difference in function. Software interrupt hooking is used to study internal operations of the operating system, movement of data and any other activity characteristic of malicious software. Memory hooking involves copying DOS subroutines to a different memory location and writing an alternative subroutine in their place. The additional routine generally calls the original routine once it has completed its processing. In this way, the underlying function of the operating system is maintained.
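The hooking pattern described above can be sketched in a few lines. This is an illustrative model only: the dictionary `syscalls` stands in for the call locations of an operating system, and `install_hook`, `open_file` and `observed` are hypothetical names. A real hook would patch an interrupt vector or a system call table rather than a Python dictionary.

```python
import functools

def install_hook(table, name, monitor):
    """Replace table[name] with a wrapper that runs `monitor`, then the original routine."""
    original = table[name]                    # the original routine is kept, not discarded

    @functools.wraps(original)
    def hooked(*args, **kwargs):
        monitor(name, args)                   # additional routine: observe the call
        return original(*args, **kwargs)      # then perform the requested function

    table[name] = hooked
    return original

# Hypothetical "system call" table standing in for an operating system.
def open_file(path):
    return f"handle:{path}"

syscalls = {"open_file": open_file}
observed = []
install_hook(syscalls, "open_file", lambda name, args: observed.append((name, args)))
result = syscalls["open_file"]("report.txt")
```

The caller receives the same return value it would have received without the hook, while `observed` holds a record of the call.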
In DOS and early Windows environments, system calls were named “software interrupts”. Software interrupts are program generated interrupts that stop the current processing in order to request a service provided by an interrupt handler. Applications typically use this mechanism to get different services from the operating system. A software interrupt causes the processor to stop what it is doing and start a new subroutine. It does this by suspending the execution of the code on which it was working, saving its place and states, and then executing the program code of the subroutine. Once the subroutine has been executed, the program code is continued from where it was interrupted. The software interrupts comprise mainly BIOS and DOS subroutines called by programs to perform system functions.
To call a given interrupt handler, a calling program needs to be able to find the program code that carries out the function. Part of the processor memory is reserved for a map called an interrupt vector table providing the addresses for the program code for carrying out an interrupt. Calling a software interrupt requires setting up registers and then executing the interrupt. For example, interrupt “21H” in the DOS operating system is used for the main file I/O functions. Interrupt “21H” instructions can be used to open a file and read the contents of a file into memory. Using further interrupts, individual bytes of data can be copied from specific memory locations to registers, and mathematical operations can be performed. The resulting data can then be copied from the register back to alternative memory locations and ultimately written to file or communications ports. Software written in a high level language such as Java or C++ is compiled into these low level instructions by the compiler. This low level of programming at which software interrupts operate is the level at which many computer viruses work. Later versions of Windows operating systems include similar functions for hooking operating system calls. Later versions of Windows, which are not based on DOS operating systems, are based on higher level system calls. Low level software interrupts still exist and are available to be hooked. It is desirable in the present invention to hook the lowest level calls where possible.
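The lookup through an interrupt vector table can be illustrated with a toy model. The sketch below assumes registers are held in a dictionary and handlers are ordinary functions; `dos_services`, `software_interrupt` and the returned handle value are hypothetical. The function codes 3DH (open an existing file) and 3FH (read from a file or device) are the DOS interrupt 21H codes for those services.

```python
def dos_services(regs):
    """Stand-in handler for interrupt 21H; dispatches on the function code in AH."""
    if regs["AH"] == 0x3D:          # DOS function 3DH: open an existing file
        regs["AX"] = 5              # hypothetical file handle returned to the caller
    elif regs["AH"] == 0x3F:        # DOS function 3FH: read from a file or device
        regs["AX"] = regs["CX"]     # pretend every requested byte was read
    return regs

# Toy interrupt vector table: interrupt number -> handler.
interrupt_vector_table = {0x21: dos_services}

def software_interrupt(number, regs):
    """Look up the handler in the vector table and run it on a copy of the registers."""
    handler = interrupt_vector_table[number]
    return handler(dict(regs))
```

In this model, hooking an interrupt amounts to replacing the entry for 0x21 with a routine that records the register contents and then calls `dos_services`.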
FIGS. 7A-C further illustrate the process of
In the present invention, a series of interrupt hooks such as hook 207 are located and implemented to automatically generate an information flow diagram in real time. One example is to track an individual byte of information by hooking interrupts for loading data into memory, copying data from one location to another and writing the data back to a file. In this and other examples, the hooks monitor and display system operation. There is sufficient detail in the information flow diagrams to monitor and display the operation of the computer system without flooding the user with excessive information. In one embodiment of the present invention, the interrupts which are hooked include open file, copy file to memory, copy memory to register, mathematical functions, write to file, and network functions. Each one of these hooks generates an icon or other graphical representation of the current operation to be performed by the original routine at the call address. The following are examples. When a file is to be opened by an original routine at the call address, the hooking routine will create an icon which represents the file, a label on the icon for the file name, and a memory location for the file. When the file is to be moved from one location to another location by an original routine at another call address, the hooking routine will create an adjacent icon which represents the file, a label on the adjacent icon for the file name, a new memory location of the file, and an arrow between the two icons pointing to the adjacent icon to indicate a file transfer. When the file is to be sent out on the Internet to a destination IP address by the original routine at another call address, the hooking routine will generate a third icon which illustrates the Internet and a fourth icon which illustrates the destination device on the Internet, and an arrow leading from the second icon to the Internet.
Other icons can represent an encryption operation, a mathematical operation, an insertion of an IP address operation, etc. These information flows can instantly reveal suspicious activities.
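The open, move and send examples above can be modelled as follows. This is a sketch under stated assumptions: icons are stored as (label, location) pairs and arrows as (source icon, destination icon, operation) triples, and the names `FlowDiagram`, `on_open`, `on_move` and `on_send` are hypothetical. No graphical rendering is attempted.

```python
from dataclasses import dataclass, field

@dataclass
class FlowDiagram:
    icons: list = field(default_factory=list)    # (label, location) pairs
    arrows: list = field(default_factory=list)   # (from_icon, to_icon, operation)

    def add_icon(self, label, location):
        self.icons.append((label, location))
        return len(self.icons) - 1               # index identifies the icon

    def add_arrow(self, src, dst, operation):
        self.arrows.append((src, dst, operation))

def on_open(diagram, name, location):
    """Open-file hook: one icon labelled with the file name and its location."""
    return diagram.add_icon(name, location)

def on_move(diagram, src_icon, new_location):
    """Move hook: an adjacent icon for the same file plus an arrow pointing to it."""
    name, _ = diagram.icons[src_icon]
    dst_icon = diagram.add_icon(name, new_location)
    diagram.add_arrow(src_icon, dst_icon, "transfer")
    return dst_icon

def on_send(diagram, src_icon, ip_address):
    """Network hook: Internet icon, destination icon, and an arrow to the Internet."""
    internet = diagram.add_icon("Internet", None)
    destination = diagram.add_icon("destination", ip_address)
    diagram.add_arrow(src_icon, internet, "send")
    return destination

d = FlowDiagram()
first = on_open(d, "report.doc", "memory A")      # hypothetical file and locations
second = on_move(d, first, "memory B")
on_send(d, second, "192.0.2.10")                  # documentation-only IP address
```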
As another example, by hooking a series of calls, it is possible to track and illustrate when an individual byte of data is read from disk to a memory location, the data from that memory location is copied to a register, a mathematical operation is performed on the data in the register, and the result of the mathematical operation is written to an alternative location (i.e. memory or disk). Even at the byte level, a meaningful information flow diagram can be generated.
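A byte-level trace of the four steps just described might look like the following. The sketch assumes memory is a dictionary keyed by address and uses a single hypothetical register named "R0"; the function name `trace_byte` and the tuple layout of each flow entry are illustrative only.

```python
def trace_byte(disk, offset, memory, src_addr, dst_addr, operation):
    """Follow one byte through the four hooked steps and return the recorded flow."""
    flow = []
    memory[src_addr] = disk[offset]                  # read from disk to a memory location
    flow.append(("disk", offset, "memory", src_addr))
    register = memory[src_addr]                      # copy the memory location to a register
    flow.append(("memory", src_addr, "register", "R0"))
    register = operation(register) & 0xFF            # mathematical operation, kept to one byte
    flow.append(("register", "R0", "register", "R0"))
    memory[dst_addr] = register                      # write the result to an alternative location
    flow.append(("register", "R0", "memory", dst_addr))
    return flow
```

Each entry in the returned list corresponds to one arrow of the information flow diagram for that byte.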
Another hook 610 is located at the call location for writing file F from memory location 504 to memory location 506. When called by the malicious software B (step 611), hook 610 creates another record for the file stating the name of the file, its size, the location from which it will be copied and the location to which it will be written (step 612). Then, hook 610 compares the parameters of the file, i.e. name of the file, its size and its current location to the existing records to learn if there is a related data flow or process indicating that the flow of this file up to the present time is currently displayed (decision 613). In the illustrated example, this is the case; such a related record was made by hook 600. So hook 610 generates the icon for file F in memory location 506 and the arrow from memory location 504 to memory location 506 (step 614). Then, hook 610 calls the actual routine to write file F from memory location 504 to memory location 506 (step 618). (Referring again to decision 613, no branch, if there was no related data, then hook 610 would begin a new flow diagram.)
A third hook 620 is located at the call location for deleting file F from memory location 504. When called by the malicious software B (step 621), hook 620 creates another record for file F stating the name of the file, its size and the location from which it will be deleted (step 622). Then, hook 620 compares the parameters of the file, i.e. name of the file, its size and its current location to the existing records to determine if there is a related data flow or process, and therefore whether the flow of file F up to the present time is currently displayed (decision 623). In the illustrated example, this is the case. So, hook 620 generates the icon for the deleted file F from memory location 504 and the arrow from existing file F at memory location 504 to deleted file F at memory location 504 (step 625). Then, hook 620 calls the actual routine to delete file F from memory location 504 (step 628).
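The record keeping and the related-flow decision performed by hooks 610 and 620 can be sketched as below. This is a minimal model under stated assumptions: a record is considered related when an earlier record for a file of the same name and size ended at the file's current location, and each diagram is a list of (source, destination) arrows. The names `FlowRecord`, `find_related` and `copy_hook` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class FlowRecord:
    name: str
    size: int
    source: str
    destination: str
    diagram: int = -1      # index of the diagram on which this record is drawn

def find_related(records, name, size, current_location):
    """Decision step: did an earlier flow leave this file at its current location?"""
    for rec in records:
        if (rec.name, rec.size, rec.destination) == (name, size, current_location):
            return rec
    return None

def copy_hook(records, diagrams, name, size, src, dst, original_routine):
    record = FlowRecord(name, size, src, dst)            # create the record
    related = find_related(records, name, size, src)     # compare with existing records
    if related is None:
        diagrams.append([])                              # no branch: begin a new diagram
        record.diagram = len(diagrams) - 1
    else:
        record.diagram = related.diagram                 # yes branch: extend the shown flow
    diagrams[record.diagram].append((src, dst))          # icon and arrow for this step
    records.append(record)
    return original_routine(src, dst)                    # finally call the actual routine
```

A delete hook would follow the same shape, with the destination denoting the deleted state of the file rather than a new location.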
In another example, an information flow diagram illustrates a file being read from disk and written to a database (i.e. attaching a document to an e-mail), and then the file being read from the database and written to a communications port (i.e. replicating databases). Both of these activities would be expected if the user had just created and sent an e-mail. However, if the file being copied to a database had not recently been attached by the user and some form of encryption was shown by the information flow diagram, the activity would not be expected, and therefore, would be suspicious. Consider another example where malicious software e-mails a confidential presentation to a competitor. An information flow diagram will reveal this activity, although the destination may not be shown. The user can identify malicious activity if the user has not attempted to e-mail the presentation to anyone. If the computer system 110 is not expected to be carrying out the activity illustrated by any of the respective information flow diagrams 300, 330 or 500, the user can investigate the matter or shut down the computer system before damage occurs.
The present invention is typically implemented as a computer program product, comprising a set of program instructions for controlling a computer or similar device. These instructions can be supplied preloaded into a system or recorded on a storage medium such as a CD-ROM, or made available for downloading over a network such as the Internet or a mobile telephone network.
Improvements and modifications can be made to the foregoing without departing from the scope of the present invention.