A portion of the disclosure of this patent document contains material to which a claim for copyright is made. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the United States Patent and Trademark Office patent file or records, but reserves all other copyright rights whatsoever.
The present invention relates generally to mobile device security, and more particularly but not exclusively to dynamic taint tracking.
Mobile devices, such as smartphones and tablets, have become commonplace and are now employed not just to make voice calls over traditional mobile telephone networks, but also to browse the Internet, watch streamed video, and play online games. The number of mobile apps for mobile operating systems is growing each day.
Despite their increasing sophistication, mobile devices remain resource-constrained relative to laptop and desktop computers. Accordingly, mobile devices run mobile operating systems, such as the ANDROID and the iOS operating systems. An application program suitable for a mobile operating system is referred to as a “mobile app” or simply as an “app.” Apps may be obtained from an app store, such as the GOOGLE PLAY app store and AMAZON app store for ANDROID-based mobile devices and the APPLE app store for iOS-based mobile devices.
Governmental requirements and general privacy concerns have prompted evaluation of mobile devices for leakage of sensitive data. More particularly, there is a need to evaluate apps for conformance with privacy policies, such as whether apps misuse sensitive data. Examples of such misuse include transmitting location information, contacts information, etc., out of the mobile device in violation of a privacy policy.
In one embodiment, taint virtual instructions are added to virtual instructions of a control-flow graph (CFG). A taint virtual instruction has a taint operand that corresponds to an operand of a virtual instruction and has a taint output that corresponds to an output of the virtual instruction in a block of the CFG. Registers are allocated for the taint virtual instruction and the virtual instructions. After register allocation, the taint virtual instruction and the virtual instructions are converted to native code, which is executed to track taint on the mobile device.
These and other features of the present invention will be readily apparent to persons of ordinary skill in the art upon reading the entirety of this disclosure, which includes the accompanying drawings and claims.
The use of the same reference label in different drawings indicates the same or like components.
In the present disclosure, numerous specific details are provided, such as examples of apparatus, components, and methods, to provide a thorough understanding of embodiments of the invention. Persons of ordinary skill in the art will recognize, however, that the invention can be practiced without one or more of the specific details. In other instances, well-known details are not shown or described to avoid obscuring aspects of the invention.
Apps 161 may be received from a variety of sources including from an app store 160, which in the example of
There is a concern that the untrusted app 171 may violate one or more privacy policies. More particularly, the mobile device 100 may store various sensitive data, such as location information of the mobile device 100, contacts records, accelerometer values, short message service (SMS) messages, and so on. The untrusted app 171 may access and leak sensitive data by transmitting the sensitive data out of the mobile device 100 and onto an external computer 150 (see arrow 152) in violation of a privacy policy. The privacy policy may be based on governmental requirements, such as the European Union (EU) General Data Protection Regulation (GDPR) and the United States Children's Online Privacy Protection Rule (COPPA). The privacy policy may also be based on the preferences of the user of the mobile device 100 or on other privacy requirements.
In the context of computer security, labeling or marking data for tracking purposes is referred to as “tainting.” In the present disclosure, “taint” is a label or marker applied to sensitive data for purposes of tracking the sensitive data. Dynamic taint analysis refers to tracking the propagation of taint during runtime. Because of resource constraints, dynamic taint analysis solutions that are employed for laptop and desktop computers are not readily applicable to mobile devices.
In legacy ANDROID mobile operating systems, apps are executed in a Dalvik virtual machine runtime environment, which interprets opcode at runtime according to a just-in-time (JIT) compilation strategy. Generally speaking, a just-in-time compilation strategy makes dynamic taint analysis relatively easy. In newer ANDROID mobile operating systems, the Dalvik virtual machine runtime environment has been replaced with the ANDROID RunTime (ART) environment, which uses an ahead-of-time (AOT) compilation strategy. In an ahead-of-time compilation strategy, apps are compiled during the installation stage, which makes it difficult to dynamically track taint at runtime.
In the example of
The mobile device 100 may track the propagation of a taint 173 as the taint 173 propagates through the untrusted app 171 and other components of the mobile device 100. This advantageously allows the mobile device 100 to detect when the taint 173, and thus a sensitive data 172 marked with the taint 173, is being leaked out of the mobile device 100. A component where a taint is forwarded or stored for leaking out of the mobile device 100 is also referred to as a “taint sink.” The taint sink may be a network communications output of the mobile device 100. The mobile device 100 as configured with the dynamic taint tracker 170 may monitor taint sinks for the presence of the taint 173 to detect a data leak.
In the example of
The output of the block building step 201 is a CFG comprising a first set of virtual instructions. The block building step 201 is followed by an optimization step 202, which optimizes the CFG to improve runtime performance. The output of the optimization step 202 is an optimized CFG comprising a second set of virtual instructions. The first and second sets of virtual instructions may differ, e.g., when virtual instructions in the first set are removed during optimization and are thus not included in the second set.
In a traditional compilation process, the optimization step 202 is immediately followed by a register allocation step 204 whereby registers needed by the app at runtime are allocated. The register allocation step 204 is followed by a native code generation step 205 whereby the virtual instructions of the optimized CFG are converted to native code suitable for execution by the runtime environment.
In the example of
In the example of
As can be appreciated, the block building step 201, optimization step 202, register allocation step 204, and native code generation step 205 may be implemented as in the ART compiler. Accordingly, embodiments of the present invention may be implemented by performing suitable modifications to the ART compiler to accommodate the added taint virtual instructions as disclosed herein.
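The ordering of the compilation steps can be illustrated with a minimal Python sketch. It is not the ART compiler: the `VInstr` tuple, the `_taint` naming suffix, and the toy register allocator and code generator are all assumptions made for illustration. The point it shows is that taint virtual instructions added to the optimized CFG before register allocation receive registers and native code in the same way as the original virtual instructions.

```python
from typing import NamedTuple


class VInstr(NamedTuple):
    """Hypothetical virtual instruction of the form: dst = op(srcs)."""
    op: str
    dst: str
    srcs: tuple


def add_taint_instructions(block):
    """Pair every virtual instruction with a taint virtual instruction."""
    out = []
    for ins in block:
        out.append(ins)
        # The taint instruction mirrors the operands and output of `ins`
        # using shadow names with a "_taint" suffix (an assumed convention).
        out.append(VInstr("or", ins.dst + "_taint",
                          tuple(s + "_taint" for s in ins.srcs)))
    return out


def allocate_registers(block):
    """Toy allocator: one register per distinct value, in first-use order."""
    regs = {}
    for ins in block:
        for name in (*ins.srcs, ins.dst):
            regs.setdefault(name, f"r{len(regs)}")
    return regs


def generate_native(block, regs):
    """Toy code generator: rewrite value names as register names."""
    return [f"{ins.op} {regs[ins.dst]}, " + ", ".join(regs[s] for s in ins.srcs)
            for ins in block]


# Optimized CFG block -> taint instructions -> registers -> native code.
optimized_block = [VInstr("add", "c", ("a", "b"))]
instrumented = add_taint_instructions(optimized_block)
registers = allocate_registers(instrumented)
native = generate_native(instrumented, registers)
```

In this sketch the taint values `a_taint`, `b_taint`, and `c_taint` are ordinary values from the allocator's point of view, which is why the instrumentation is placed ahead of register allocation.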
The example of
In the example of
A taint virtual instruction may be added just before or just after a corresponding virtual instruction. In the example of
In one embodiment, when a first virtual instruction with a corresponding first taint virtual instruction outputs to a second virtual instruction, a second taint virtual instruction is added to receive the output of the first taint virtual instruction. In the example of
A taint input of a taint virtual instruction may receive a taint when an input of a corresponding virtual instruction is configured to receive data marked with the taint. That is, when tainted data is received by the app at runtime, the code logic of the taint virtual instructions receives the taint. A taint virtual instruction may store an operand as an output. A variable or other storage location assigned to receive the output of the taint virtual instruction may be read to check whether the taint has propagated through the block.
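The chaining of taint virtual instructions within a block can be sketched as a small Python interpreter. The four-field tuple format, the value names, and the choice of OR for every taint instruction are assumptions for illustration only. The second taint instruction takes the output of the first taint instruction as an operand, and the final taint location is read to check whether the taint has propagated through the block.

```python
import operator

# Hypothetical opcode table for the toy interpreter.
OPS = {"add": operator.add, "mul": operator.mul, "or": operator.or_}


def run(block, env):
    """Execute (op, dst, src1, src2) tuples in order against an environment."""
    for op, dst, a, b in block:
        env[dst] = OPS[op](env[a], env[b])
    return env


block = [
    ("add", "t", "x", "y"),                     # first virtual instruction
    ("or", "t_taint", "x_taint", "y_taint"),    # first taint virtual instruction
    ("mul", "out", "t", "z"),                   # second virtual instruction, consumes t
    ("or", "out_taint", "t_taint", "z_taint"),  # second taint instruction, consumes t_taint
]

# Only x is tainted on entry (taint bit 0x1).
env = run(block, {"x": 2, "y": 3, "z": 4,
                  "x_taint": 0x1, "y_taint": 0x0, "z_taint": 0x0})

taint_reached_output = env["out_taint"] != 0
```

The original computation is unaffected by the added instructions, while `out_taint` records that the tainted input reached the block's output.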
In the example of
In one embodiment, a taint may be assigned a single bit in a 32-bit word to identify different sensitive data. For example, taints may be defined as shown in Table 1 below:
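Independently of the particular values listed in Table 1, the single-bit encoding can be sketched in Python as follows. Only `TAINT_LOCATION` with value 0x00000001 appears in this disclosure; the other names and bit positions are hypothetical placeholders chosen for illustration.

```python
# One bit per kind of sensitive data within a 32-bit word.
TAINT_NONE = 0x00000000
TAINT_LOCATION = 0x00000001       # value used in this disclosure
TAINT_CONTACTS = 0x00000002       # hypothetical
TAINT_SMS = 0x00000004            # hypothetical
TAINT_ACCELEROMETER = 0x00000008  # hypothetical


def has_taint(word, label):
    """Return True when the given taint bit is set in the taint word."""
    return bool(word & label)


# A single word can carry several taints at once.
combined = TAINT_LOCATION | TAINT_SMS
```

Because each taint occupies its own bit, data derived from several sensitive sources keeps every label in one word, and any individual label can be tested with a bitwise AND.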
As a particular example, the virtual instruction 253 may be an ADD instruction that receives inputs 251 and 252 as operands, and outputs the sum of the operands as output/input 254. In that example, the taint virtual instruction 353 may be a pre-defined OR instruction that has been designated to be added to the CFG for an ADD instruction that receives two operands. Assuming the TAINT_LOCATION identifying a taint for the location information of the mobile device 100 is present at the input 251 or input 252, the virtual instruction 253 would perform an ADD operation on the taint, and possibly change it. On the other hand, the taint virtual instruction 353 would simply OR the taint inputs 351 and 352, thereby effectively passing the TAINT_LOCATION through and allowing for its detection and tracking. And because the taint virtual instruction 353 is inserted in the optimized CFG prior to the register allocation step and native code generation step (see
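The difference between applying ADD and applying OR to taint words can be checked with a short Python sketch. The function names are hypothetical stand-ins for the virtual instruction 253 and the taint virtual instruction 353, and the case shown assumes that both operands carry the location taint.

```python
TAINT_LOCATION = 0x00000001


def add_values(a, b):
    """Stand-in for the ADD performed by virtual instruction 253."""
    return a + b


def or_taints(ta, tb):
    """Stand-in for the OR performed by taint virtual instruction 353."""
    return ta | tb


# Both operands are marked with the location taint.
taint_351 = TAINT_LOCATION
taint_352 = TAINT_LOCATION

changed = add_values(taint_351, taint_352)   # 0x2: the location bit is lost
preserved = or_taints(taint_351, taint_352)  # 0x1: the location bit survives
```

Adding the two taint words yields 0x00000002, which no longer has the `TAINT_LOCATION` bit set, whereas the OR leaves the label intact.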
An example of an original opcode that will be modified for taint tracking is shown in Table 2 below.
In the example of Table 2, the code block of the function “calculator” receives variables “width” and “length” as inputs, performs a multiply operation on “width” and “length”, and outputs a variable “result” equal to the multiplication of “width”, “length”, and “height.” An example code logic of a taint virtual instruction inserted in the code block of Table 2 is shown in Table 3 below.
The example of Table 3 is in original code format, instead of virtual instruction format, to facilitate understanding of an example code logic of an added taint virtual instruction. As can be appreciated, in practice, Table 3 will be in virtual instruction format, i.e., intermediate representation. In the example of Table 3, the taint virtual instruction receives taint from the variables “width_taint” and “length_taint”, which correspond to the variables “width” and “length” of the virtual instruction. Whereas the original code logic of the virtual instruction performs a multiply operation on the variables “width” and “length”, the code logic of the taint virtual instruction performs an OR operation on the taint variables (compare “temp” to “temp_taint” in Table 3). The output of the code block is set to “delegate_result” (an object of “Jvalue” type) so as to pass the taint through the code block. “delegate_result” may be read to check for the taint.
In the example of Table 3, the function stack frame is extended during compilation to pass the taint parameters (width_taint and length_taint), and an additional member variable (height_taint) is allocated in the object for storing taint. A taint variable (temp_taint), which represents the taint of the temp variable in the function, is also allocated during compilation. In the example of Table 3, the function return value is wrapped by Jvalue (delegate_result), and a taint member is added at the end of Jvalue. This way, the format of the return value is not changed and taint values are appended to the end of Jvalue. An example data structure for Jvalue is shown in Table 4 below.
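The code logic described for Tables 2 through 4 can be approximated by the following Python sketch, which is a rough rendering and not a reproduction of those tables. The names `calculator`, `width`, `length`, `height`, `temp`, `result`, `width_taint`, `length_taint`, `height_taint`, `temp_taint`, `delegate_result`, and `Jvalue` come from the description; the enclosing `Box` class, the `result_taint` variable, the treatment of `height` as an object member, and the field layout of `Jvalue` are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Jvalue:
    """Return-value wrapper; the taint member comes last so that the
    original payload keeps its position (assumed layout)."""
    value: int
    taint: int = 0


class Box:
    """Hypothetical object that owns the `height` member variable."""

    def __init__(self, height, height_taint=0):
        self.height = height
        self.height_taint = height_taint  # additional member for taint

    def calculator(self, width, length, width_taint=0, length_taint=0):
        # Original code logic: multiply the values.
        temp = width * length
        result = temp * self.height
        # Taint code logic: OR the corresponding taint variables.
        temp_taint = width_taint | length_taint
        result_taint = temp_taint | self.height_taint
        # Wrap the return value so the taint passes through the code block.
        delegate_result = Jvalue(result, result_taint)
        return delegate_result


out = Box(height=2).calculator(3, 4, width_taint=0x00000001)
```

Reading `out.taint` after the call indicates whether any tainted input contributed to the result, while `out.value` is the same product the original function would have returned.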
Taint virtual instructions may be added in untrusted apps, as well as in system services and libraries. Generally speaking, taint virtual instructions may be added in software modules that serve as a “taint tag”, i.e., where presence of taint is to be tracked. A “taint source” comprises data that is marked with a taint.
In the example of
In an example operation, the taint source 472 may comprise location data that has been marked with the taint “TAINT_LOCATION:0x00000001”. The location data may indicate the global positioning system (GPS) coordinates of the mobile device 100, which is output by a location manager system service 454 (see arrow 401). The tainted location data will be detected at the taint tag 480. The untrusted app 452 may call the system service 454 to get the last known location of the mobile device 100. The untrusted app 452 will get the tainted location data (see arrow 406) through the binder IPC 456 (see arrow 402), the kernel binder driver 458 (see arrow 403), the binder IPC 457 (see arrow 404), and the framework library 453 (see arrow 405). Along the way, the taint is detected at taint tags 480-485. The untrusted app 452 and/or other modules traversed by the tainted location data may encode or modify (e.g., splice data to) the tainted location data to avoid detection. However, the taint on the location data is still tracked as propagating through the taint tags 480-485, and is detected to have been received and sent out by the untrusted app 452. The untrusted app 452 may attempt to transmit (see arrow 407) the tainted location data out of the mobile device 100 by way of the taint sink 475, e.g., by HTTP/HTTPS/SOCKET/SMS. The taint may be detected at the taint sink 475.
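The reason encoding or splicing does not defeat tracking can be sketched in Python: the taint is carried alongside the data rather than inside it. The `Tainted` record, the `transform` and `sink_detects` helpers, and the placeholder payload are hypothetical and are not part of the ANDROID framework.

```python
import base64
from dataclasses import dataclass

TAINT_LOCATION = 0x00000001


@dataclass(frozen=True)
class Tainted:
    """Hypothetical pairing of a payload with its taint word."""
    payload: bytes
    taint: int


def transform(data, fn):
    """Apply fn to the payload; the taint word is carried over unchanged."""
    return Tainted(fn(data.payload), data.taint)


def sink_detects(data, watched):
    """Check at a taint sink whether a watched taint bit is present."""
    return bool(data.taint & watched)


# Placeholder location data marked at the taint source.
location = Tainted(b"lat=0.0,lon=0.0", TAINT_LOCATION)

# An app may encode the data and splice other bytes onto it.
encoded = transform(location, base64.b64encode)
spliced = transform(encoded, lambda p: b"junk" + p)

leak_detected = sink_detects(spliced, TAINT_LOCATION)
```

The payload that reaches the sink no longer resembles the original coordinates, yet the taint word still identifies it as location data.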
Dynamic taint tracking is often performed in test environments, such as when evaluating untrusted apps for compliance with privacy policies. A mobile device in a test environment may first be checked to make sure that no data leak occurs. Then, one untrusted app at a time may be installed and executed with dynamic taint tracking on the mobile device. This way, any data leak detected at a taint sink may be readily attributed to the untrusted app. The untrusted app responsible for the data leak at the taint sink may also be identified by its process identifier (ID).
In the example of
The native code is executed to evaluate the untrusted app for data leakage (step 506). A sensitive data is marked with taint (step 507), and the taint is tracked through the untrusted app and other software modules that have taint virtual instructions (step 508). The untrusted app is deemed to be leaking data in violation of a privacy policy when the taint is detected at a taint sink (step 509). In response to detecting the taint at the taint sink, a response action is performed against the untrusted app (step 510). The response action may include preventing execution of the untrusted app on a mobile device, such as preventing installation of the untrusted app on other mobile devices, blocking availability of the untrusted app in app stores, etc.
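The evaluation and response steps can be summarized in a brief Python sketch. The `evaluate_apps` function, the `run_with_taint_tracking` callback, and the app names are hypothetical; the callback stands in for installing and executing one untrusted app at a time and returning the taint word observed at the taint sink.

```python
def evaluate_apps(apps, run_with_taint_tracking):
    """Run each app on its own and return the apps to act against."""
    blocked = []
    for name in apps:
        sink_taint = run_with_taint_tracking(name)  # taint word seen at the sink
        if sink_taint != 0:
            # Taint at the sink: the app is deemed to be leaking data.
            blocked.append(name)
    return blocked


def fake_run(name):
    """Stand-in for a test run; the values are made up for illustration."""
    return {"weather_app": 0x00000001, "notes_app": 0x00000000}[name]


blocked = evaluate_apps(["weather_app", "notes_app"], fake_run)
```

Running one app per pass keeps the attribution simple, since any taint seen at the sink during that pass belongs to the app under test.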
The mobile device 100 is a particular machine as programmed with one or more software modules 110, comprising instructions stored in a non-transitory manner in the main memory 108 for execution by the processor 101 to cause the mobile device 100 to perform corresponding programmed steps. An article of manufacture may be embodied as a computer-readable storage medium including instructions that, when executed by the processor 101, cause the mobile device 100 to be operable to perform the functions of the one or more software modules 110. The software modules 110 may comprise a mobile operating system, dynamic taint tracker, apps, etc.
Mobile devices and methods for dynamic taint tracking have been disclosed. While specific embodiments of the present invention have been provided, it is to be understood that these embodiments are for illustration purposes and not limiting. Many additional embodiments will be apparent to persons of ordinary skill in the art reading this disclosure.