The present invention generally relates to the field of computing and malicious software or software threats, such as for example a computer virus, and more particularly to a method, system, computer readable medium of instructions and/or computer program product for providing automated threat analysis.
As used herein a “threat” includes malicious software, also known as “malware” or “pestware”, which includes software that is included or inserted in a part of a processing system for a harmful purpose. Types of malware can include, but are not limited to, malicious libraries, viruses, worms, Trojans, adware, malicious active content and denial of service attacks. In the case of invasion of privacy for the purposes of fraud or theft of identity, malicious software that passively observes the use of a computer is known as “spyware”.
A hook (also known as a hook procedure or hook function), as used herein, generally refers to a callback function provided by a software application that receives certain data before the normal or intended recipient of the data. A hook function can thus examine or modify certain data before passing on the data. Therefore, a hook function allows a software application to examine data before the data is passed to the intended recipient.
An API (“Application Programming Interface”) hook (also known as an API interception), as used herein as a type of hook, refers to a callback function provided by an application that replaces functionality provided by an operating system's API. An API generally refers to an interface that is defined in terms of a set of functions and procedures, and enables a program to gain access to facilities within an application. An API hook can be inserted between an API call and an API procedure to examine or modify function parameters before passing parameters on to an actual or intended function. An API hook may also choose not to pass on certain types of requests to an actual or intended function.
A process, as used herein, is at least one of a running software program or other computing operation, or a part of a running software program or other computing operation, that performs a task.
A hook chain as used herein, is a list of pointers to special, application-defined callback functions called hook procedures. When a message occurs that is associated with a particular type of hook, the operating system passes the message to each hook procedure referenced in the hook chain, one after the other. The action of a hook procedure can depend on the type of hook involved. For example, the hook procedures for some types of hooks can only monitor messages, others can modify messages or stop their progress through the chain, restricting them from reaching the next hook procedure or a destination window.
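The behaviour of a hook chain described above can be sketched in a few lines. This is a minimal illustration only, not operating system code: the message representation (a dictionary), the function `run_hook_chain` and the three example hook procedures are all assumptions introduced here for clarity.

```python
from typing import Callable, List, Optional

# A hook procedure receives a message and returns the (possibly modified)
# message, or None to stop it from travelling further down the chain.
HookProc = Callable[[dict], Optional[dict]]

seen: List[str] = []  # keys observed by the monitoring hook


def run_hook_chain(chain: List[HookProc], message: dict) -> Optional[dict]:
    """Pass the message to each hook in order; stop early if a hook blocks it."""
    current: Optional[dict] = message
    for hook in chain:
        current = hook(current)
        if current is None:
            return None  # blocked before reaching the destination
    return current


def monitor(msg: dict) -> dict:
    """A hook that can only observe the message."""
    seen.append(msg["key"])
    return msg


def uppercase(msg: dict) -> dict:
    """A hook that modifies the message before passing it on."""
    return {**msg, "key": msg["key"].upper()}


def block_escape(msg: dict) -> Optional[dict]:
    """A hook that stops certain messages from progressing."""
    return None if msg["key"] == "ESC" else msg
```

Calling `run_hook_chain([monitor, uppercase, block_escape], {"key": "a"})` passes the message through all three procedures, whereas a message that the last procedure blocks never reaches its destination.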
A kernel, as used herein, refers to the core part of an operating system, responsible for resource allocation, low-level hardware interfaces, security, etc.
An interrupt, as used herein, is at least one of a signal to a processing system that stops the execution of a running program so that another action can be performed, or a circuit that conveys a signal stopping the execution of a running program.
A library is a file containing executable code and data which can be loaded by a process at load time or run time, rather than during linking. There are several forms of a library including, but not limited to, Dynamic Link Libraries (DLLs) and ActiveX technologies.
In a networked information or data communications system, a user has access to one or more terminals which are capable of requesting and/or receiving information or data from local or remote information sources. In such a communications system, a terminal may be a type of processing system, computer or computerised device, personal computer (PC), mobile, cellular or satellite telephone, mobile data terminal, portable computer, Personal Digital Assistant (PDA), pager, thin client, or any other similar type of digital electronic device. The capability of such a terminal to request and/or receive information or data can be provided by software, hardware and/or firmware. A terminal may include or be associated with other devices, for example a local data storage device such as a hard disk drive or solid state drive.
An information source can include a server, or any type of terminal, that may be associated with one or more storage devices that are able to store information or data, for example in one or more databases residing on a storage device. The exchange of information (i.e. the request and/or receipt of information or data) between a terminal and an information source, or other terminal(s), is facilitated by a communication means. The communication means can be realised by physical cables, for example a metallic cable such as a telephone line, semi-conducting cables, electromagnetic signals, for example radio-frequency signals or infra-red signals, optical fibre cables, satellite links or any other such medium or combination thereof connected to a network infrastructure.
A system registry is a database used by modern operating systems, for example Windows™ platforms. The system registry includes information needed to configure the operating system. The operating system refers to the registry for information ranging from user profiles, to which applications are installed on the machine, to what hardware is installed and which ports are registered.
Manual Threat Analysis
Known techniques that seek to protect users against unwanted threats or malicious software rely on anti-virus (“AV”) software that first attempts to identify a threat. Once the threat is identified, it is blocked from affecting the user environment, for example by being disinfected, deleted or quarantined. This process normally requires the following steps:
This is the typical process that is presently followed to identify threats and update AV products. Even when AV products rely on identifying potential threats by suspicious behaviour, such suspicious behaviour-based AV products are generally considered to be prone to false positives. Thus, the known manual approach remains the most effective solution, whereby a potential threat is submitted to and analysed by a human analyst, prior to updating AV software products and producing documentation describing removal procedures, threat characteristics, replication mechanisms, etc.
This known process is illustrated in
In practice a new threat is normally discovered relatively quickly, for example by being intercepted by a proactive detection system or by a suspicious file being submitted by a cautious user. The main “bottle-neck” of the presently known process is the AV product vendor response time. During the period of time in which an AV product vendor is identifying a threat, a user environment remains vulnerable to that threat because virus dictionaries have not yet been updated.
The threat identification phase is the most important and critical stage. The major reason why it normally takes at least hours for an AV product vendor to respond is because the threat identification phase involves extensive manual analysis performed by specialist malicious software analysts. Once a threat is identified, for example as a spybot, a new virus dictionary update can be created and delivered to AV software product installations and a user environment is then secured against the threat.
However, once a new threat is identified it is still required to be described. Users/customers may now have a new set of concerns, for example: where did the threat come from (e.g. country of origin)? Is the threat based on other threats in its functionality (e.g. are there any similarities with other threats)? What sort of exploits/vulnerabilities does the threat employ? What are the side effects or what was the actual damage caused? How can a system be reverted to its pre-infection state (e.g. removal instructions)? What sort of confidential information may have been stolen? What sort of reputation damage may have been caused? How vulnerable is a system to future threats similar to the identified threat? There may be many other concerns.
Preferably, any threat mitigation task is associated with not only threat identification, but also the important task of threat description. Some AV product vendors follow a practice of providing generic detections, for example when a single virus name represents thousands of virus variations. In practice, this means that a user/customer receives a virus dictionary update to detect a new threat with no clarification regarding the threat functionality, removal instructions, and many other threat mitigation issues.
Thus, the two manual activities of “threat identification” and “threat description” both require extensive manual analysis, and therefore make the largest contribution to delays in the overall response time for updating AV products. Threat identification and threat description can together be considered as a single concept, that of “threat analysis”.
Threat analysts around the world employ various techniques in threat analysis. However, presently threat analysis is essentially a manual process and typically involves the following manual actions:
There exists a need for a method, system, computer readable medium of instructions, and/or a computer program product to provide automated threat analysis which addresses or at least ameliorates one or more problems inherent in the prior art.
The reference in this specification to any prior publication (or information derived from it), or to any matter which is known, is not, and should not be taken as an acknowledgment or admission or any form of suggestion that that prior publication (or information derived from it) or known matter forms part of the common general knowledge in the field of endeavour to which this specification relates.
According to a first broad form, there is provided an automated threat analysis system comprising a core, the core associated with an input interface and an output interface and the core comprising: one or more core components; and, an operating system having at least one library hooked to at least one of the one or more core components; wherein, when a threat is passed into the core and the threat is executed in the core, report data is generated and the report data is passed out of the core via the output interface.
According to a second broad form, there is provided a computer program product for providing automated threat analysis, the computer program product comprising a core, the core associated with an input interface and an output interface and the core comprising: one or more core components; and, an operating system having at least one library hooked to at least one of the one or more core components; wherein, the computer program product is configured such that when a threat is passed into the core and the threat is executed in the core, report data is generated and the report data is passed out of the core via the output interface.
According to a third broad form, there is provided a method of providing automated threat analysis by utilising a core, the core associated with an input interface and an output interface, the core comprising one or more core components and an operating system having at least one library hooked to at least one of the one or more core components, the method comprising the steps of, in a processing system: passing a threat into the core; executing the threat in the core; generating report data using the one or more core components; and, passing the report data out of the core via the output interface.
According to a particular embodiment, an Automated Threat Analysis System (ATAS) is provided and is designed to accelerate threat identification and threat description phases for new threats, real or potential, thereby providing a significant reduction in time for the entire threat analysis response cycle. This assists an AV product vendor to respond accurately and in a timely manner to new threats. ATAS, in one form, can provide answers to questions that users/customers or AV product vendors may have regarding threat functionality, such as a description of threat characteristics, removal instructions and/or replication mechanisms.
In another form, as ATAS is automated, the system may automatically build descriptions for various threats. These descriptions can be used to update a comprehensive forensics database with search capabilities, such as the ability to search possible side effects for all known threats. If a new threat reveals a certain set of side effects then a search for those features in the database may assist in identifying a threat family to which the new threat belongs, and therefore reveal any additional features/characteristics the new threat may have. This can help security agencies to obtain more information about specific threats and not only those threats that are published by AV product vendors.
According to another embodiment, ATAS can be used to automatically build a threat removal tool, because the scope of side effects caused by a threat is known. In another non-limiting form, the report data is passed out of the core via the output interface according to a predefined format.
According to other forms, the present invention provides a computer readable medium of instructions or a computer program product for giving effect to any of the methods or systems mentioned herein. In one particular, but non-limiting, form, the computer readable medium of instructions is embodied as a software program.
An example embodiment of the present invention should become apparent from the following description, which is given by way of example only, of a preferred but non-limiting embodiment, described in connection with the accompanying figures.
The following modes, given by way of example only, are described in order to provide a more precise understanding of the subject matter of a preferred embodiment or embodiments.
In the figures, incorporated to illustrate features of an example embodiment, like reference numerals are used to identify like parts throughout the figures.
Processing System
A particular embodiment of the present invention can be realised using a processing system, an example of which is shown in
Input device 106 receives input data 118 and can include, for example, a keyboard, a pointer device such as a pen-like device or a mouse, audio receiving device for voice controlled activation such as a microphone, data receiver or antenna such as a modem or wireless data adaptor, data acquisition card, etc. Input data 118 could come from different sources, for example keyboard instructions in conjunction with data received via a network. Output device 108 produces or generates output data 120 and can include, for example, a display device or monitor in which case output data 120 is visual, a printer in which case output data 120 is printed, a port for example a USB port, a peripheral component adaptor, a data transmitter or antenna such as a modem or wireless network adaptor, etc. Output data 120 could be distinct and derived from different output devices, for example a visual display on a monitor in conjunction with data transmitted to a network. A user could view data output, or an interpretation of the data output, on, for example, a monitor or using a printer. The storage device 114 can be any form of data or information storage means, for example, volatile or non-volatile memory, solid state storage devices, magnetic devices, etc.
In use, processing system 100 is adapted to allow data or information to be stored in and/or retrieved from, via wired or wireless communication means, the at least one database 116, and also for processes or software modules to be executed. The interface 112 may allow wired and/or wireless communication between processing unit 102 and peripheral components that may serve a specialised purpose. The processor 102 receives instructions as input data 118 via input device 106 and can display processed results or other output to a user by utilising output device 108. More than one input device 106 and/or output device 108 can be provided. It should be appreciated that the processing system 100 may be any form of terminal, server, specialised hardware, or the like.
Processing system 100 may be an isolated system when analysing a threat. However, if appropriate, processing system 100 may be a part of a networked communications system. Processing system 100 could connect to a network, for example the Internet or a WAN. Input data 118 and/or output data 120 could be communicated to other devices via the network. The transfer of information and/or data over the network can be achieved using wired communications means or wireless communications means. A server can facilitate the transfer of data between the network and one or more databases. A server and one or more databases provide an example of an information source.
Automated Threat Analysis System
Referring to
When a threat 340 is passed into core 305 via input interface 310 and threat 340 is executed in core 305 using operating system 325 this results in report data 345 being generated by the one or more core components 320. Report data 345 is then passed out of core 305 via output interface 315, which in one non-limiting example may be according to a predefined format. For example, a predefined format of report data 345 can be used to further isolate threat 340 so that threat 340 cannot escape or send output data from core 305 thereby maintaining core 305 as an isolated environment.
A predefined format of report data 345 is not essential. Where one is used, however, a threat that attempts to escape core 305 by infecting report data 345 that core 305 delivers back into the clean environment will eventually violate the format of the data, because threat 340 is not aware of that format. Data with a corrupted format would simply be discarded and analysis of such a threat can be considered as failed.
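The idea of discarding report data whose format has been violated can be sketched as follows. The layout used here (a marker, a length field, a SHA-256 digest and a JSON payload) and the names `MAGIC`, `pack_report` and `unpack_report` are assumptions made purely for illustration; the specification does not define the actual internal format.

```python
import hashlib
import json
from typing import Optional

MAGIC = b"ATASRPT1"  # hypothetical marker, not taken from the specification
HEADER_LEN = len(MAGIC) + 4 + 32  # marker + payload length + SHA-256 digest


def pack_report(fields: dict) -> bytes:
    """Serialise report fields into the assumed fixed layout."""
    payload = json.dumps(fields, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(payload).digest()
    return MAGIC + len(payload).to_bytes(4, "big") + digest + payload


def unpack_report(blob: bytes) -> Optional[dict]:
    """Return the report fields, or None if the format is violated in any way."""
    if len(blob) < HEADER_LEN or not blob.startswith(MAGIC):
        return None
    size = int.from_bytes(blob[len(MAGIC):len(MAGIC) + 4], "big")
    digest = blob[len(MAGIC) + 4:HEADER_LEN]
    payload = blob[HEADER_LEN:]
    if len(payload) != size or hashlib.sha256(payload).digest() != digest:
        return None  # appended, truncated or altered data is discarded
    try:
        fields = json.loads(payload.decode("utf-8"))
    except ValueError:  # covers both decoding and JSON parsing errors
        return None
    return fields if isinstance(fields, dict) else None
```

In this sketch a report to which code has been appended, or whose contents have been overwritten, fails the length or digest check and is rejected, so the analysis would be treated as failed. The check only helps against a threat that does not know the layout, which is the situation the text above describes.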
System 300 can also be provided with a snapshot manager to record the state of at least part of core 305 before and after execution of threat 340. At least some of any differences in the state of core 305, for example in the state of operating system 325, before execution of threat 340 and after execution of threat 340 can form part of report data 345. The snapshot manager can also include or be associated with a database of exclusions of known differences in state before and after execution to filter out normal changes caused by normal operation of operating system 325.
Furthermore, system 300 can include at least one or more service components 350 and each particular service component 355 can be used to monitor at least one port associated with operating system 325. A service component 355 can also emulate response data at a port using a particular protocol. One or more core components 320 can be used to record at least part of any data transferred via a port using a protocol. Such recorded data can then form part of report data 345.
System 300 can be associated with a searchable database to store report data 345 from various threats. Operating system 325 may be a modified Windows® operating system. Preferably, operating system 325 functions and parameters used by threat 340 are logged by the one or more core components 320. It is also possible that at least some return data from operating system 325 functions is modified by the one or more core components 320.
A core manager can also be provided which at least in part supplies threat 340 to core 305 and receives report data 345 from core 305. System 300 may also include a wrapper acting as an interface between the core manager and the searchable database. The core manager can also be used to control return data on ports to core 305 that may be used by threat 340. The return data to ports can be provided in accordance with a protocol associated with a specific port. For example, the protocol may be HTTP, SMTP, DNS, Time, SNTP, IRC or RPC DCOM.
Referring to
The following example provides a more detailed description of a particular embodiment. The example is intended to be merely illustrative and not limiting to the scope of the present invention.
Referring to
Core Manager 525 provides the Core 505 component with a threat sample 530 via the Input Interface 535. Core Manager 525 then instructs Core 505 to execute the threat in a fully isolated hardware or hardware-emulated (i.e. virtual) environment. Software that runs inside Core monitors the threat and inspects the threat's behaviour. The collected information can then be placed into the reports 540 which are delivered back to the Core Manager 525 via Output Interface 545. The interfaces are built in such a way that a threat cannot “escape” from the isolated environment. This task is achieved by employing strictly defined internal formats for the reports that are delivered via a file sharing mechanism. There are no network communications used to accomplish this task (in the case of the virtual environment, the NAT service is fully disabled).
A Wrapper coordinates work between the Core Manager and the Database components to establish a forensics database update with the newly obtained information.
Modified Operating System (OS) and Hooks
The operating system inside Core 505 is modified in such a way that many of the system libraries 550 are hooked to forward their functionality into the Core's own components. This serves two major purposes:
An example implementation of an API hook is as follows: a system DLL's export entry is patched with an export forward. The forwarded export is then handled by the Core's own DLL: the call is either served entirely by that DLL, or it is forwarded back into the native DLL. In either case, the call handler is capable of modifying parameters and/or logging the function call itself. If a native Windows system DLL performs hash-based checks (such as file contents or export table CRC checks), then the native DLL logic should also be patched so that it allows itself to be loaded in spite of its file being physically modified. Windows file integrity checks should also be disabled in this case to prevent the patched system DLLs from being restored from the Windows DLL cache.
For example, by hooking the Windows system API User32.SetWindowsHookEx( ), it is possible to reveal the following parameters: hook procedure and the handle to the DLL that contains the hook procedure. By knowing the handle to the hook module, it is possible to reveal the filename of the module that was requested as a hook handler. This way, it becomes possible to reveal any attempts to install keystroke monitors that are used by keyloggers. Once logged, the intercepted API call is then forwarded back to the native system DLL to be served in a proper manner.
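The log-then-forward pattern described above can be illustrated with a small sketch. It does not patch any real DLL export: the dictionary `api_table` stands in for an export table, `set_windows_hook_ex` is a stand-in for the native function rather than the real Windows API, and `logged_hook` and `call_log` are hypothetical names introduced here.

```python
import functools
from typing import Any, Callable, Dict, List, Tuple

call_log: List[Tuple[str, tuple]] = []  # what the handler would put in a report


def logged_hook(native: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a 'native' function so each call is logged and then forwarded."""
    @functools.wraps(native)
    def handler(*args: Any) -> Any:
        call_log.append((native.__name__, args))  # reveal the parameters
        return native(*args)                      # forward to the native code
    return handler


def set_windows_hook_ex(hook_id: int, hook_proc: str, module: str, thread_id: int) -> int:
    """Stand-in for the native call; returns a dummy hook handle."""
    return 1


# The 'export table': replacing an entry plays the role of the export forward.
api_table: Dict[str, Callable[..., Any]] = {"SetWindowsHookEx": set_windows_hook_ex}
api_table["SetWindowsHookEx"] = logged_hook(api_table["SetWindowsHookEx"])

WH_KEYBOARD = 2  # keyboard hook type
result = api_table["SetWindowsHookEx"](WH_KEYBOARD, "KeyProc", "evil.dll", 0)
```

After the call, `call_log` holds the hook type, the hook procedure and the module name, which is the information needed to flag an attempt to install a keystroke monitor, while the caller still receives the result of the native function.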
An example of how the invoked function return may be modified is as follows: the hooks installed on the system APIs RasEnumConnections( ) and RasGetConnectStatus( ) of rasapi32.dll allow Core to fake the presence of a valid RAS connection in the system, should a threat rely on this fact in its logic. In this case the Core DLL itself returns from the API call to the caller. That is, the intercepted API call is never forwarded back to the native DLL.
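By contrast with a logging hook, a handler that fakes a result never calls the native function at all. The sketch below shows only that control flow; `native_enum_connections`, `fake_enum_connections` and `threat_is_online` are hypothetical stand-ins and do not reproduce the real rasapi32.dll signatures.

```python
from typing import Callable, Dict, List


def native_enum_connections() -> List[str]:
    """Stand-in for the native call: the isolated machine has no RAS connection."""
    return []


def fake_enum_connections() -> List[str]:
    """Core-side handler: answers directly and never calls the native function."""
    return ["Dial-up Connection"]


def threat_is_online(api: Dict[str, Callable[[], List[str]]]) -> bool:
    """How a threat might test for connectivity before acting."""
    return len(api["RasEnumConnections"]()) > 0


unhooked = {"RasEnumConnections": native_enum_connections}
hooked = {"RasEnumConnections": fake_enum_connections}
```

With the unhooked table the check fails and the threat might stay dormant; with the hooked table the threat is led to believe a connection exists and proceeds with its network-dependent behaviour, which can then be observed.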
Service Providers & Monitors
Core Manager's service providers 515 can include:
These servers listen on corresponding ports and serve incoming requests in strict accordance with the relevant protocol specification. For example, the RPC DCOM Provider listens on ports 135/445 with the native Windows server switched off (such as LSASS, the Local Security Authority Subsystem Service). As soon as a threat attempts to establish a new connection on ports 135/445, the installed RPC DCOM Provider accepts the connection and provides the connected client with legitimate response SMB packets according to the protocol. Accepted SMB packets are then logged and wrapped into the reports that are then delivered back to Core Manager. The “dumped” traffic is then analysed by Core Manager to reveal any attempts by the connected clients to rely on existing RPC DCOM exploits. If exploit signatures are detected in the intercepted traffic, then the threat that generated such traffic can be identified as an RPC DCOM worm (such as Spybot, Randex, IRC bot, etc.).
Appendix A provides an example report resulting from a Spybot and contains information about an MS04-12 exploit detected in the outbound traffic on port 135/tcp.
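The analysis of “dumped” traffic for exploit signatures can be sketched as a simple byte-pattern search. The patterns in `SIGNATURES` are placeholders invented for this illustration and are not real exploit signatures; the function name `scan_traffic` is likewise an assumption.

```python
from typing import Dict, List

# Placeholder patterns for illustration only; these are NOT real exploit signatures.
SIGNATURES: Dict[str, bytes] = {
    "example-rpc-dcom-exploit": b"\xde\xad\xbe\xef",
    "example-overlong-path": b"\\\\" + b"A" * 32,
}


def scan_traffic(packets: List[bytes]) -> List[str]:
    """Return the names of all signatures found in the logged packets.

    The packets are joined first so that a pattern split across a packet
    boundary is still detected.
    """
    stream = b"".join(packets)
    return sorted(name for name, pattern in SIGNATURES.items() if pattern in stream)
```

A non-empty result would be recorded in the report as evidence that the sample attempted to use a known exploit on the monitored port.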
The Time/SNTP Servers can be used to serve any possible threat attempts to rely on a time factor in functionality (such as the Sober worm).
Appendix B provides an example report resulting from the Sober worm and relies on the date Jan. 5, 2006—the last day when the Sober worm still replicated; the next day its mass-mailing routine was stopped.
The HTTP Server monitors any possible HTTP Get/Post requests that a threat may generate.
The DNS Server supplies a client that makes a DNS query with a fake MX record for the recipient's domain name, which is a host name of a mail exchange server accepting incoming mail for that domain. This is required to reveal any mass mailers that rely on DNS servers in their mass mailing functionality (such as Netsky, Sober).
The SMTP Server communicates with the clients acting like a legitimate SMTP Server: a threat is convinced that it communicates with the real SMTP server. The intercepted SMTP traffic is then delivered back to Core Manager for further analysis and parsing.
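The parsing of an intercepted message can be sketched with Python's standard `email` package. This assumes the message data has already been extracted from the SMTP session as text; the function name `summarise_message` and the returned field names are hypothetical.

```python
from email import message_from_string
from typing import Dict, List


def summarise_message(raw: str) -> Dict[str, object]:
    """Extract sender, recipient, subject, body text and attachment names."""
    msg = message_from_string(raw)
    body = ""
    attachments: List[str] = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        filename = part.get_filename()
        if filename:
            attachments.append(filename)
        elif part.get_content_type() == "text/plain" and not body:
            body = part.get_payload(decode=True).decode("utf-8", errors="replace").strip()
    return {
        "sender": msg["From"],
        "recipient": msg["To"],
        "subject": msg["Subject"],
        "body": body,
        "attachments": attachments,
    }
```

The resulting fields correspond to the characteristics of mass-mailer traffic that a report would typically list: sender, recipient, subject, message body and attachment name.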
The IRC Server accepts incoming requests to join IRC channels and generates responses that are common for legitimate IRC servers. Moreover, the IRC Server attempts to issue hacker commands to the connected client. The commands it sends are common for IRC bots, such as Randex and Spybot. If the connected bot does not rely on password-protected authentication, then the IRC Server may cause the connected bot to initiate DoS attacks inside the isolated environment, to confirm that the connected bot is capable of initiating such attacks.
Snapshot Manager
Snapshot Manager 520 makes snapshots before and after a threat is run. Snapshot Manager 520 then compares two snapshots and reveals any differences that may have taken place in the system. The snapshots may be taken for the following Windows objects:
If the Snapshot Manager reveals any changes in the file system after running a threat, it is assumed that the file changes were induced by that threat. Any modifications in the state of the kernel components, such as modified contents of the System Service Descriptor Table, or modified addresses of the Major I/O Request Packet Functions, are designed to reveal a possible rootkit component of the threat. The Snapshot Manager contains a large database of exclusions to filter out those changes that are normally caused by the operating system itself.
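The comparison of two snapshots with a database of exclusions can be sketched as follows. A snapshot is modelled here as a mapping from file path to content hash; that representation, the wildcard patterns in `EXCLUSIONS` and the function name `diff_snapshots` are assumptions for illustration only.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

Snapshot = Dict[str, str]  # file path -> content hash

# Hypothetical exclusion patterns for changes the operating system makes itself.
EXCLUSIONS = ["C:\\Windows\\Prefetch\\*", "*\\pagefile.sys"]


def diff_snapshots(before: Snapshot, after: Snapshot,
                   exclusions: List[str]) -> Dict[str, List[str]]:
    """List created, modified and deleted paths, skipping excluded ones."""
    changes: Dict[str, List[str]] = {"created": [], "modified": [], "deleted": []}
    for path in sorted(set(before) | set(after)):
        # Compare case-insensitively, as Windows paths are not case sensitive.
        if any(fnmatchcase(path.lower(), pat.lower()) for pat in exclusions):
            continue
        if path not in before:
            changes["created"].append(path)
        elif path not in after:
            changes["deleted"].append(path)
        elif before[path] != after[path]:
            changes["modified"].append(path)
    return changes
```

The same comparison could in principle be applied to registry values, services or open ports by changing what the snapshot keys and values represent.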
The file system and registry changes, changes in the services, and open ports are all wrapped into the reports that are delivered to the Core Manager. Memory is handled in the following way: the Snapshot Manager reveals any newly created processes and/or any newly loaded modules. For every newly created process/module, a mapped executable/DLL filename is revealed to check if the retrieved filename is among the newly created files.
This approach reveals only newly created processes/modules that correspond to the newly created files. Then, the Snapshot Manager dumps the new processes/modules and delivers the dumps back into the Core Manager for further analysis. This allows the Core Manager to accomplish heuristics analysis over the memory dumps to detect any additional characteristics, as memory dumps represent memory images of the malicious code in the unpacked/decoded/unencrypted form, the form that the malicious code obtains at some point in order to run. The threat must be capable of decrypting itself in order to run. Once decrypted, the threat is dumped and the dump is studied and searched for signatures.
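The selection of processes to dump, as described in the two preceding paragraphs, can be sketched as a filter over two process listings. Processes are modelled as a mapping from process ID to image filename; this model and the name `processes_to_dump` are assumptions, and the actual memory dumping is outside the scope of the sketch.

```python
from typing import Dict, Iterable

ProcessList = Dict[int, str]  # process ID -> mapped executable filename


def processes_to_dump(procs_before: ProcessList, procs_after: ProcessList,
                      created_files: Iterable[str]) -> ProcessList:
    """Keep only new processes whose image is among the newly created files."""
    created = {path.lower() for path in created_files}
    new = {pid: image for pid, image in procs_after.items()
           if pid not in procs_before}
    return {pid: image for pid, image in sorted(new.items())
            if image.lower() in created}
```

A new process started from a file that already existed before the threat ran is ignored by this filter, which matches the statement above that only processes or modules corresponding to newly created files are dumped.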
The Snapshot Manager is also capable of detecting any newly created windows in the system. The Snapshot Manager then snapshots the screen contents, cuts out the background and delivers the image back in the reporting system.
If a threat starts generating SMTP traffic, then the Snapshot Manager loads a Graphical User Interface (GUI) that fakes the look of an email client application. Then, it loads into the GUI all the characteristics of the intercepted SMTP traffic, such as email sender, recipient, subject, message body and attachment name. Once the GUI is populated, a new snapshot image is created and delivered back to the Core Manager. The final report can then include a screen capture designed to simulate how a new mass-mailer would look in an email client application.
In another form, ATAS can be used to provide for the detection of rootkit files/ADS and registry entries. This can be achieved if the second snapshot of an affected system is taken from a clean primary partition by reading the affected (secondary) partition's files/registry. Automatic partition mounting is achievable both for a physical machine (by using relays) and a virtual machine (by modifying files that represent virtual drives and machine configuration).
Appendices A and B demonstrate many of the aforementioned features. The reports are produced by an example implementation of the Automated Threat Analysis System.
The embodiments discussed may be implemented separately or in any combination as a software package or components. Such software can then be used to notify, restrict, and/or prevent malicious activity being performed. Various embodiments can be implemented for use with the Microsoft Windows operating system or any other operating system.
Optional embodiments of the present invention may also be said to broadly consist in the parts, elements and features referred to or indicated herein, individually or collectively, in any or all combinations of two or more of the parts, elements or features, and wherein specific integers are mentioned herein which have known equivalents in the art to which the invention relates, such known equivalents are deemed to be incorporated herein as if individually set forth.
Although a preferred embodiment has been described in detail, it should be understood that various changes, substitutions, and alterations can be made by one of ordinary skill in the art without departing from the scope of the present invention.
Submission Summary:
Technical Details:
To mark its presence in the system, the sample created the following Mutex object:
aleks001
The following file was created in the system:
Note:
%System% is a variable that refers to the System folder. By default, this is C:\Windows\System (Windows 95/98/Me), C:\Winnt\System32 (Windows NT/2000), or C:\Windows\System32 (Windows XP)
Attention! There was outbound traffic produced on port 135/tcp with the following characteristics:
Automated Threat Analysis System has performed Heuristics Analysis of the created process and detected the following:
Automated Threat Analysis System has established that the sample is capable to steal CD keys of the following games:
The following ports were open in the system:
The following Host Name was requested from a host database:
scv.unixirc.de
There registered attempts to establish connection with the remote IP addresses. The connection details are:
Attention! There was a new connection established with a remote IRC Server. The generated outbound IRC traffic is provided below:
Submission Summary:
Technical Details:
Possible Country of Origin:
The new window was created, as shown below:
The following files were created in the system:
The following directories were created:
There were new processes created in the system:
The newly created Registry Values are:
The following ports were open in the system:
The following Host Names were requested from a host database:
Message Body:
| Number | Date | Country | Kind |
|---|---|---|---|
| 2006-100099 | Feb 2006 | AU | national |