The present disclosure generally relates to determining a malign or non-malign behavior of an executable file.
Since the appearance of the ‘iPhone’ smartphone, the word “app” has become a synonym for applications that smartphone users can use for various tasks. Although applications can be pre-stored on a smartphone (or similar device), the most common case is that a user downloads an application and installs it on his/her device. This is, for example, true for iPhone™, Android™, or Windows™ Phone devices. For instance, iPhone™ users download their applications from Apple™'s App Store, and Android™ phone users from Google™ Play, although the latter users can also load their applications from other sources.
A problem with the possibility to download and install programs is the threat that the programs may be malware, that is, they contain functionality that is designed to exhibit malign functions (e.g. theft of user information, corruption of user data, or code that modifies the working of other applications or even system functions) or they function in a way that does not conform to the wishes of the owner of the device (e.g. a phone used by a corporate user).
Apple™ has partially addressed this issue by introducing mandatory digital signing of applications through its application distribution point App™ Store and by screening applications before releasing them on App™ Store. For Android™ phones, Google™ operates Google™ Play as a common distribution point and recently started automated screening of applications before they are made available via Google™ Play. Android™ applications are also digitally signed.
When detecting anomalies of application or system execution, it is often desired to have or collect information about a host device. Scenarios where this is crucial involve targeted attacks where malware triggers its malicious functionality when located in a specific world region or when using a specific connectivity provider, but also the debugging of crashed executables.
For the sake of simplicity in this disclosure, the term “malware” denotes all types of programs that either contain or entirely consist of functionality aimed at a task that, when performed successfully, causes harm to the owner of the device where it operates or to the organization where the device is in operation. Well-known types of malware are virus and Trojan-horse programs, but programs like remote device managers can also become malware when installed and operated without proper authorization.
While the screening efforts of, for example, Apple™ and Google™ have a sanitizing effect on the set of applications that a user can choose from at the official distribution points, there are still applications at the distribution points that are malware.
There are several reasons for this and most importantly one has to deal with:
There exist reputation-based systems for applications where people can express their opinion on an application, and there are solutions where people's approval can be secured through digital signatures. Such schemes can help to prevent a wide spread of bad applications but have the disadvantages that they are slow in detecting malware, that they are susceptible to false opinion insertion, and that it is difficult in practice to set up schemes that are reliable/secure. Today most distribution points have means by which users can rate the app and leave any opinion in an unprotected way.
Accordingly, there is a need for an implementation of a scheme that avoids one or more of the problems discussed above, or other related problems.
In a first aspect, there is provided a method for determining a malign or non-malign behavior of an executable file, wherein the method comprises the steps of first acquiring a first behavior profile of the executable file, the first behavior profile comprising a first observable execution trace of the executable file from an emulated environment, second acquiring a second behavior profile of the executable file, the second behavior profile comprising a second observable execution trace of the executable file from a real environment and comparing the first and second observable execution traces so as to determine the malign or non-malign behavior of the executable file.
In optional refinements of the first aspect, there is or are provided at least one of the following:
In a second aspect, there is provided a method for anonymously collecting behavior data of an executable file, wherein the executable file is resident on each of two or more file-execution devices and distributed by a distribution point, and wherein the method is performed in an entity different from the distribution point and two or more file-execution devices, and comprises the steps of determining a trigger condition; first collecting, responsive to the trigger condition, a first behavior profile of the executable file from a first one of the two or more file-execution devices, the first behavior profile comprising a first observable execution trace of the executable file, and the first observable execution trace being non-mapped to the first file-execution device; and second collecting, responsive to the trigger condition, a second behavior profile of the executable file from a second one of the two or more file-execution devices, the second behavior profile comprising a second observable execution trace of the executable file, and the second observable execution trace being non-mapped to the second file-execution device.
Concerning the terminology used for the second to third, sixth and seventh aspects (as well as the other aspects insofar related to the first-named aspects), the following applies:
In optional refinements of the second aspect, there is or are provided at least one of the following:
In a third aspect, there is provided a method for anonymously collecting behavior data of an executable file distributed by a distribution point, wherein the method is performed in a file-execution device, the executable file is resident on the file-execution device, and comprises the steps of receiving, from an entity different from the distribution point and the file-execution device, a request for anonymous collection of behavior data of the executable file, and collecting, responsive to the received request, a behavior profile of the executable file, the behavior profile comprising an observable execution trace of the executable file, and the observable execution trace being non-mapped to the file-execution device.
In optional refinements of the third aspect, there is or are provided at least one of the following:
In a fourth aspect, there is provided a computer program product comprising program code portions for performing a method according to any one of the first to third aspects, when the computer program product is executed on one or more computing devices.
In an optional refinement of the fourth aspect, the computer program product is stored on a computer readable recording medium.
In a fifth aspect, there is provided an apparatus for determining a malign or non-malign behavior of an executable file, the apparatus comprising at least one processor configured to acquire a first behavior profile of the executable file, the first behavior profile comprising a first observable execution trace of the executable file from an emulated environment, acquire a second behavior profile of the executable file, the second behavior profile comprising a second observable execution trace of the executable file from a real environment, and compare the first and second observable execution traces so as to determine the malign or non-malign behavior of the executable file.
In a sixth aspect, there is provided an apparatus for anonymously collecting behavior data of an executable file, wherein the apparatus is constituted by an entity different from a distribution point and two or more file-execution devices, the executable file is resident on each of the two or more file-execution devices, and the apparatus comprises at least one processor configured to determine a trigger condition, collect, responsive to the trigger condition, a first behavior profile of the executable file from a first one of the two or more file-execution devices, the first behavior profile comprising a first observable execution trace of the executable file, and the first observable execution trace being non-mapped to the first file-execution device, and collect, responsive to the trigger condition, a second behavior profile of the executable file from a second one of the two or more file-execution devices, the second behavior profile comprising a second observable execution trace of the executable file, and the second observable execution trace being non-mapped to the second file-execution device.
In a seventh aspect, there is provided an apparatus for anonymously collecting behavior data of an executable file, wherein the apparatus is constituted by a file-execution device, the executable file is resident on the file-execution device, and the apparatus comprises at least one processor configured to receive, from an entity different from a distribution point and the file-execution device, a request for anonymous collection of behavior data of the executable file, and collect, responsive to the received request, a behavior profile of the executable file, the behavior profile comprising an observable execution trace of the executable file, and the observable execution trace being non-mapped to the file-execution device.
In an eighth aspect, there is provided a system, comprising the apparatus according to the fifth aspect being functionally split between a distribution point and a file-execution device, wherein the comparing operation is performed in at least one of the distribution point and the file-execution device, and a secure channel is established between the distribution point and the file-execution device.
In a ninth aspect, there is provided a data structure for storing observable execution traces of an executable file in a behavior profile, the data structure comprising at least one entry per system call performed by the executable file, the entry comprising an argument-to-function/method call and a timestamp of when the system call occurred.
In an optional refinement of the ninth aspect, there is or are provided at least one of the following:
The embodiments of the technique presented herein are described herein below with reference to the accompanying drawings, in which:
In the following description, for purposes of explanation and not limitation, specific details are set forth (such as particular signaling steps) in order to provide a thorough understanding of the technique presented herein. It will be apparent to one skilled in the art that the present technique may be practiced in other embodiments that depart from these specific details. For example, the embodiments will primarily be described in the context of so-called “apps” as an example for executable files; however, this does not rule out the use of the present technique in connection with other file systems or formats.
For the purpose of this disclosure, the terms “apparatus” and “system” have been introduced. Without being restricted thereto, the “system” may be implemented as a wireless communication network or a portion thereof. Moreover, the “apparatus” or the “wireless communication device” may be functionally split into a “distribution point” and a “file-execution device”. In turn, the “distribution point” may be implemented as functionality in the Internet, for example in the IT/Telecommunications cloud. Moreover, the “file-execution device” may be fixed/wirebound or mobile, such as a fixed workstation, or a fixed or wireless desktop/laptop, or a fixed or mobile Machine-to-Machine (M2M) interface, or a mobile terminal, such as a smartphone. However, those implementation examples are only illustrative; the person skilled in the art can readily devise various additional or supplemental implementations of the “system” and “wireless communication device”.
Moreover, those skilled in the art will appreciate that the services, functions and steps explained herein may be implemented using software functioning in conjunction with a programmed microprocessor, or using an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP) or general purpose computer. It will also be appreciated that while the following embodiments are described in the context of methods and devices, the technique presented herein may also be embodied in a computer program product as well as in a system comprising a computer processor and a memory coupled to the processor, wherein the memory is encoded with one or more programs that execute the services, functions and steps disclosed herein.
The present disclosure, without being restricted thereto, may be summarized as exploiting the fact that the app is screened, preferably dynamically by executing it, and that there is a known trusted distribution point that could convey the findings of the screening in a more relevant way. Today, the fact that an app is made downloadable via a distribution point implies only that the screening, having tested the code, did not find anything harmful in it. In the case of Android™, the permissions (to other functions) that the app requires are also listed. But this is basically all the information that is available. According to the present disclosure, the user of the device can actually be informed of the expected (and approved observed) behavior of the application. In the device, the app can be monitored when it executes and its behavior compared with the behavior it showed during the screening. This makes it possible to identify security-relevant deviations from the approved behavior and to take countermeasures, e.g. blocking the app from further execution or notifying the user and/or the distribution point. Through digital signing, the distribution point can convey the observed behavior in a secure (integrity-protected) way. When deviating behavior is identified, device-specific information may be collected from the device in order to determine parameters that may have caused the abnormal behavior.
Likewise, the file-executing device 1002 comprises a core functionality (e.g., one or more of a Central Processing Unit (CPU), dedicated circuitry and/or a software module) 10021, an optional memory (and/or database) 10022, an optional transmitter 10023 and an optional receiver 10024. Moreover, the device 1002 comprises an acquirer 10025, a comparator 10026, an optional separator 10027, an optional installer 10028, an optional linker 10029, an optional executioner 100210, an optional query 100211, an optional updater 100212 and an optional simulator 100213.
Finally, the (other) entity 1003 comprises a core functionality (e.g., one or more of a Central Processing Unit (CPU), dedicated circuitry and/or a software module) 10031, an optional memory (and/or database) 10032, an optional transmitter 10033 and an optional receiver 10034. Moreover, the entity 1003 comprises an acquirer 10035 and a comparator 10036.
In the following paragraphs, assume that x=1, 2 or 3. As partly indicated by the dashed extensions of the functional blocks of the CPUs 100x1, the acquirer 10015, the comparator 10016, the updater 10017 and the creator 10018 (of the Distribution point 1001), the acquirer 10025, the comparator 10026, the separator 10027, the installer 10028, the linker 10029, the executioner 100210, the query 100211, the updater 100212 and the simulator 100213 (of the device 1002) and the acquirer 10035 and the comparator 10036 (of the entity 1003) as well as the memory 100x2, the transmitter 100x3 and the receiver 100x4 may at least partially be functionalities running on the CPUs 100x1, or may alternatively be separate functional entities or means controlled by the CPUs 100x1 and supplying the same with information. The transmitter and receiver components 100x3, 100x4 may be realized to comprise suitable interfaces and/or suitable signal generation and evaluation functions.
The CPUs 100x1 may be configured, for example, using software residing in the memories 100x2, to process various data inputs and to control the functions of the memories 100x2, the transmitter 100x3 and the receiver 100x4 (as well as the acquirer 10015, the comparator 10016, the updater 10017 and the creator 10018 (of the Distribution point 1001), the acquirer 10025, the comparator 10026, the separator 10027, the installer 10028, the linker 10029, the executioner 100210, the query 100211, the updater 100212 and the simulator 100213 (of the device 1002) and the acquirer 10035 and the comparator 10036 (of the entity 1003)). The memory 100x2 may serve for storing program code for carrying out the methods according to the aspects disclosed herein, when executed by the CPUs 100x1.
It is to be noted that the transmitter 100x3 and the receiver 100x4 may be provided as an integral transceiver, as is indicated in
The embodiment may be based on a secure channel between the mobile 1002 and a Trusted Service 1001 in the network. Part of the embodiment may reside in establishing a security context between the mobile 1002 and the trusted network service 1001. As a best mode, there is disclosed a setup where the trusted service 1001 (or the distribution point) signs the profile data with the secret key of a public-key cryptosystem. The public key that can be used for verifying the signature may either be already stored on the device 1002 or be sent as part of a so-called (digital) certificate whose content can be verified by a chain of certificates in a PKI (Public Key Infrastructure) scheme whose root certificate is stored on the device 1002.
One way to analyze an app(lication) is to perform a static code analysis. Such an analysis can detect leaks of private information. However, static analysis is limited, as it may not cover the dynamics of the application as it executes or may be rendered ineffective by hiding and code obfuscation techniques.
The compilation of the app behavior profile P1, P2 is conducted by a behavior analysis in the distribution point before releasing it to the public. The app is executed in an emulated environment. During its execution, any interactions with the underlying operating system by the app are observed and stored in the profile P1, P2 using e.g. taint analysis. But other methods to capture the behavior are also possible, as long as they deliver a digitally observable execution trace of the behavior. The data of this profile P1 may be referred to as the Reference Application Behavior Profile (RABP).
This profile P1 may later be verified against a profile P2 generated by the same app on a real mobile device 1002. To augment the effectiveness of the profiles P1, P2 one can even watch the argument to function/method calls, e.g.
Call_profile_entry E=timestamp+syscall nr+argument 1+ . . . +argument n or
Call_profile_entry E=hash (timestamp+syscall nr+argument 1+ . . . +argument n)
for each system call that the app generates. Instead of the argument itself, it would also be possible to first perform a classification of the argument and then optionally include the classification of the argument in the hash computation. Such a classification could be a list of data elements like, e.g., argument origin, security label, or argument range constraints.
Call_profile_entry E=timestamp+syscall nr+Classify(argument 1)+ . . . +Classify(argument n) or
Call_profile_entry E=hash (timestamp+syscall nr+Classify(argument 1)+ . . . +Classify(argument n))
The classification may render the call-profile-entries more suitable in capturing generic arguments rather than specific values.
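The call-profile-entry variants above can be sketched in Python as follows. This is only an illustrative sketch: the `classify` rules, the class name `CallProfileEntry`, the helper `make_entry`, the field separator and the choice of SHA-256 are assumptions made for the example and are not prescribed by the disclosure.

```python
import hashlib
from dataclasses import dataclass
from typing import Any, Sequence, Tuple


def classify(argument: Any) -> str:
    """Hypothetical classifier: maps a concrete argument to a coarse class."""
    if isinstance(argument, bool):
        return "bool"
    if isinstance(argument, int):
        return "int:negative" if argument < 0 else "int:non-negative"
    if isinstance(argument, str):
        return "str:path" if argument.startswith("/") else "str:other"
    return type(argument).__name__


@dataclass(frozen=True)
class CallProfileEntry:
    timestamp: float
    syscall_nr: int
    arguments: Tuple[str, ...]  # raw arguments (as text) or their classes

    def digest(self) -> str:
        """hash(timestamp + syscall nr + argument 1 + ... + argument n)."""
        fields = [repr(self.timestamp), str(self.syscall_nr), *self.arguments]
        # A separator keeps ("1", "23") distinct from ("12", "3").
        return hashlib.sha256("\x1f".join(fields).encode("utf-8")).hexdigest()


def make_entry(timestamp: float, syscall_nr: int, args: Sequence[Any],
               use_classification: bool = True) -> CallProfileEntry:
    """Build one entry, storing either classified or raw argument values."""
    values = tuple(classify(a) if use_classification else repr(a) for a in args)
    return CallProfileEntry(timestamp, syscall_nr, values)
```

With classification enabled, two calls that differ only in the concrete path or integer value yield the same entry, which illustrates why classified entries capture generic arguments rather than specific values. Because the timestamp is part of the hashed fields, entries from different runs only coincide if the timestamps do; how timestamps are normalized is left open here.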
The app behavior profile itself may be composed of information collected during a behavioral analysis of the app during runtime. To this end, different technologies can be used. As example embodiments, we mention here treemaps in
One further feature of these tree maps and behavior graphs resides in rendering the same e.g. multi-dimensional, so they can capture the behavior in a richer way.
Returning to
In the device 1002, the app file is dissected (S2-1d, 10027) into the normal app part and the signed RABP. The former is processed by the existing procedures for installing (S2-1e, 10028) the app, with the additional restriction that the signature of the profile data must be successfully verified as being correct. The latter, the RABP P1, may be stored in the device 1002 and linked (S2-1f, 10029) to the app so that when the app is executed its behavior profile P1 can be found in the device.
When the app executes, the device 1002 may also trace the app as it proceeds and construct an Observed Application Behavior Profile (OABP) P2. The OABP P2 may be compared (S1-2, S2-2, S3-3; 10016, 10026, 10036) to the RABP P1, and if the comparison reveals significant deviations, the app may be stopped or halted and the user informed and asked for consent to proceed.
Note that this allows for a mechanism where the RABP P1 may be updated by the gained insight through the user consent so the user is not bothered the next time when the same condition occurs at a later instant.
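The comparison of the OABP P2 against the RABP P1, together with the consent-based update, might be sketched as follows. This is a minimal sketch under stated assumptions: a profile is modeled as a collection of (syscall number, classified arguments) pairs, a "significant deviation" is taken to mean more than `max_deviations` unapproved events, and all function names are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Event = Tuple[int, Tuple[str, ...]]  # (syscall number, classified arguments)


def find_deviations(rabp: Iterable[Event], oabp: Iterable[Event]) -> List[Event]:
    """Return observed events that are absent from the reference profile."""
    approved: Set[Event] = set(rabp)
    deviations: List[Event] = []
    for event in oabp:
        if event not in approved and event not in deviations:
            deviations.append(event)
    return deviations


def decide(rabp: Iterable[Event], oabp: Iterable[Event],
           max_deviations: int = 0) -> str:
    """Halt and ask the user when the deviation count exceeds the threshold."""
    deviations = find_deviations(rabp, oabp)
    return "halt-and-ask-user" if len(deviations) > max_deviations else "continue"


def apply_user_consent(rabp: Iterable[Event], deviations: Iterable[Event],
                       consented: bool) -> Set[Event]:
    """Fold consented deviations into the reference profile for next time."""
    return set(rabp) | set(deviations) if consented else set(rabp)
```

After `apply_user_consent` has been called with consent, the same observed trace no longer produces a deviation, so the user is not asked again for the same condition.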
In an alternative mode, the device 1002 could first simulate (S2-1b, 100213) the upcoming behavior, i.e., opening of external connections, and compare the resulting behavior profile P2 with the reference P1 before committing to the actual actions. This avoids the situation where improper behavior can only be detected after it has already occurred.
Instead of relying only on the analysis by the distribution point 1001, the distribution point could use a set of trusted devices 1002 that have already downloaded and installed the app to improve the correctness of its behavior profile P1. These devices can report (S1-2a, 10014) their results (updated RABPs) P1, and the distribution point 1001 can compare the reports and compile a behavior profile (S1-2c, 10018) or augment a basic profile (S1-2b, 10017) that it has already established from a basic screening of the app. In this way, collective learning improves the quality of the protection that the reference profile provides.
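The simulate-before-commit mode can be illustrated by a small gate that checks all planned actions against the reference before any of them is carried out. The sketch assumes that the simulator simply yields the list of planned actions; the `Action` representation, the function name `run_guarded` and the host names are hypothetical.

```python
from typing import Callable, List, Set, Tuple

Action = Tuple[str, str]  # (kind, target), e.g. ("connect", "host:port")


def run_guarded(planned: List[Action], reference: Set[Action],
                commit: Callable[[Action], None]) -> List[Action]:
    """Dry-run the planned actions; commit them only if all are approved.

    Returns the list of blocked actions (empty if everything was committed).
    """
    simulated_profile = list(planned)  # stand-in for a real simulator
    blocked = [a for a in simulated_profile if a not in reference]
    if blocked:
        return blocked  # nothing has been committed at this point
    for action in planned:
        commit(action)
    return []
```

In this sketch a single unapproved action prevents all planned actions from being committed, so no external connection is opened before the comparison has succeeded.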
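One conceivable way for the distribution point to augment its basic profile from device reports is a simple quorum rule, sketched below. The quorum threshold and the function name `augment_profile` are assumptions for illustration; the disclosure does not specify how the reports are compared.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

Event = Tuple[int, Tuple[str, ...]]  # (syscall number, classified arguments)


def augment_profile(basic: Set[Event], reports: Iterable[Set[Event]],
                    quorum: int) -> Set[Event]:
    """Add events to the basic profile that at least `quorum` devices report."""
    votes: Counter = Counter()
    for report in reports:
        votes.update(set(report))  # each device votes at most once per event
    agreed = {event for event, count in votes.items() if count >= quorum}
    return set(basic) | agreed
```

An event reported by only one trusted device stays out of the reference profile in this sketch, which limits the influence of a single faulty report.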
The collected information must be securely transmitted from the trusted devices to the distribution point. One solution for that may be SSL/TLS or VPN secured connections.
When an anomaly has been detected during application or system execution (S3, 100214), the execution host (file-execution device) 1002 is pulled for device information (S2, S4, S4a). In another use-scenario where the execution profiles are compared locally, the host device pushes (S4, S4a) this information to the entity 1003 responsible for collecting it (abbreviated, e.g., as Host Information Collector, HIC).
The HIC 1003 can be deployed as a service in the cloud. Collected information can be merged (S5) by increasing a counter for each specific parameter present in the device-information. This information may have been sent to the HIC 1003 encrypted (S4, S4a) and may use a homomorphic encryption or another scheme that prevents the information from being mapped to (e.g., from being usable to identify) a specific user or device 1002 #1, 1002 #2 in order to preserve privacy. Sending information from a trusted application residing in a trusted execution environment would prevent tampering with the device-information. If the goal is to monitor a system, a hypervisor solution can be responsible for sending (S4, S4a) the information.
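The counter-based merging (S5) can be sketched on plaintext reports as follows. The report layout (a mapping from parameter name to value) and the function name `merge_device_info` are assumptions; the sketch deliberately stores no device identifier, so the merged result only contains counts.

```python
from collections import Counter
from typing import Dict, Iterable


def merge_device_info(reports: Iterable[Dict[str, str]]) -> Counter:
    """Increase one counter per (parameter, value) pair seen in the reports."""
    merged: Counter = Counter()
    for report in reports:
        for parameter, value in report.items():
            merged[(parameter, value)] += 1
    return merged
```

For example, two reports that both contain the hypothetical parameter `region` with value `EU` result in a count of 2 for `("region", "EU")`, without recording which device sent which report.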
The behavior above can be generalized so that trigger conditions are defined by application developers (third party) 1004 (S1). In the case when trigger conditions are met, certain device-information is pushed encrypted (S4, S4a) to the HIC 1003. The HIC 1003 buffers and/or mixes (obfuscates, S5) the data in a way that prevents the developer from mapping received data with a certain user or device 1002 when retrieving (S6) the statistical data from the HIC 1003. In order for the HIC 1003 to merge encrypted data (S5), a homomorphic encryption scheme or other similar schemes can be applied.
More specifically, when a trigger condition is met (S3), possibly based on hypervisor monitoring, a trusted application encrypts the device-specific data requested by the application developer with a public key (Pub key) supplied by the developer. This encrypted information is then sent to the HIC 1003 (S4), where the third party 1004 can retrieve (S6) the merged data and decrypt it with its private key (S7). This behavior prevents the HIC 1003 from reading sensitive data and mapping users 1002 to the read data, and the third party 1004 is only able to retrieve merged data (S6) and is therefore unable to map individual users.
As a non-limiting example, the third party 1004 might want to collect location-information from all devices using its app when a certain condition is fulfilled, for example, to retrieve information about where customers live. One solution would be to request permission to retrieve location updates. However, this does not prevent the third party 1004 from mapping individual app users to location data, which may be a privacy concern for the user. Another solution to this particular example is to ask the connectivity provider for location-data, but the approach suggested here is more flexible in terms of monitoring host-device execution and collecting device-information.
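The flow in which devices encrypt with the developer's public key, the HIC merges ciphertexts without being able to read them, and the developer decrypts only the merged result can be illustrated with an additively homomorphic scheme. The sketch below uses the Paillier cryptosystem purely as an assumed example (the disclosure only speaks of "a homomorphic encryption scheme or other similar schemes"), with toy-sized primes that are far too small for real use; the function names are hypothetical.

```python
import math
import secrets
from typing import Iterable, Tuple

PrivateKey = Tuple[int, int, int]  # (n, lambda, mu)


def keygen(p: int, q: int) -> Tuple[int, PrivateKey]:
    """Developer side: derive public modulus n and the private key from primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator g = n + 1 is used
    return n, (n, lam, mu)


def encrypt(n: int, m: int) -> int:
    """Device side: encrypt the integer m (0 <= m < n) under public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def merge(n: int, ciphertexts: Iterable[int]) -> int:
    """HIC side: multiplying ciphertexts adds the hidden plaintexts."""
    n2 = n * n
    total = 1
    for c in ciphertexts:
        total = (total * c) % n2
    return total


def decrypt(private_key: PrivateKey, c: int) -> int:
    """Developer side: recover the (merged) plaintext."""
    n, lam, mu = private_key
    return ((pow(c, lam, n * n) - 1) // n * mu) % n
```

If each device sends an encrypted 1 when a condition holds and an encrypted 0 otherwise, the developer obtains only the total count after decrypting the merged ciphertext. Buffering or mixing of the individual ciphertexts at the HIC (S5) is not modeled in this sketch.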
The present disclosure provides one or more of the following advantages:
It is believed that the advantages of the technique presented herein will be fully understood from the foregoing description, and it will be apparent that various changes may be made in the form, constructions and arrangement of the exemplary aspects thereof without departing from the scope of the invention or without sacrificing all of its advantageous effects. Because the technique presented herein can be varied in many ways, it will be recognized that the invention should be limited only by the scope of the claims that follow.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2012/076161 | 12/19/2012 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
61670000 | Jul 2012 | US |