Embodiments of the present invention relate to computer systems, and more particularly to dynamic information flow tracking in such systems.
As computer systems become more complex, security has become a major concern. Authorization and privacy are two principal issues within the security domain. Authorization issues relate to unauthorized access to computer systems or privilege escalation within a system via exploitation of holes in software. Privacy issues relate to access to sensitive data and leaking of such data via access control security holes or propagation.
In an effort to resolve security issues, dynamic information flow tracking has been used to protect systems from authorization violations and compromised privacy. Such flow tracking is typically implemented using a hardware-based approach. These approaches typically include additional hardware support for performing tracking of secure data throughout its lifetime in a system. As an example, data may be tagged with a sensitivity level, which may be located in the dedicated hardware support. During program execution, the system dynamically propagates the sensitivity level for the tagged data and detects violations of user-specified rules. However, by implementing dynamic information flow tracking using a hardware-based approach, legacy systems lacking such specialized hardware cannot perform dynamic information flow tracking. Furthermore, there is added expense and computation complexity in performing a hardware-based dynamic information flow tracking process.
Another issue with respect to current dynamic information flow tracking processes is that they cannot adapt to legacy code. That is, code written without extensions for implementing dynamic information flow tracking cannot take advantage of the hardware support present for such tracking operations.
Embodiments of the present invention may use dynamic binary translation (DBT) to perform dynamic information flow tracking. DBT may be used to convert instructions from a source instruction set architecture (ISA) to a target ISA. A DBT may also perform run-time activities with regard to the translated program. For example, a DBT can instrument and optimize code, and furthermore perform profiling of the run-time behavior of the translated code. Based on such activities, particularly active portions of a program (i.e., hot spots) can be dynamically optimized to improve performance.
In various implementations, a two-phase dynamic binary translator (also referred to herein as DBT) may be used to identify and optimize frequently executed code. More specifically, a first phase (i.e., a profiling phase) may be used to profile the code to determine hot spots within a program. Then in a second phase (i.e., an optimization phase), these hot spots may be optimized in various manners.
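As a rough illustration of the profiling phase, the following minimal sketch counts block executions and reports a block as a hot spot once it has run more than a selected number of times. The class name `Profiler`, its methods, and the default threshold value are hypothetical and chosen only for this example; they are not taken from any particular translator.

```python
from collections import Counter


class Profiler:
    """Hypothetical profiling-phase helper: counts executions per code block."""

    def __init__(self, threshold=1000):
        # threshold is an assumed tuning value, not specified by the text
        self.threshold = threshold
        self.counts = Counter()

    def record(self, block_id):
        """Call on every entry to a code block."""
        self.counts[block_id] += 1

    def is_hot(self, block_id):
        """A block is treated as a hot spot once it has run more than
        `threshold` times."""
        return self.counts[block_id] > self.threshold


profiler = Profiler(threshold=2)
for _ in range(3):
    profiler.record("bb1")
profiler.record("bb2")
```

In this sketch only blocks that cross the threshold would be handed to the optimization phase; everything else keeps running unoptimized.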
Using DBT, flow tracking may be implemented in a pure-software based approach so that the tracking can be performed on machines lacking hardware support for tracking. Furthermore, embodiments may be used to perform dynamic information flow tracking during execution of legacy code (for example, code developed for a 32-bit machine) on more advanced platforms, e.g., a 64-bit machine, although the scope of the present invention is not so limited.
Dynamic information flow tracking in accordance with an embodiment of the present invention may be used to protect various data. For example, in some embodiments some or all user input data may be protected using such flow tracking. Embodiments may further seek to reduce the amount of tracking computation needed based on an analysis of incoming data. Such redundant tracking elimination may be referred to as just enough tracking (JET). More specifically, based upon a pattern of the incoming data, some embodiments may eliminate redundant tracking where a pattern of the input data has been seen previously. Accordingly, upon a first pass of input data into a portion of code, e.g., a basic block, information flow tracking may be performed. A summary of the tracking information computed may be stored upon conclusion of the basic block. Then, when a similar input data pattern is provided to the basic block, information flow tracking may be avoided, as instead the summary corresponding to the input data pattern may be accessed and provided at an output of the basic block.
While described primarily herein with respect to a dynamic binary translation engine, it is to be understood that the scope of the present invention is not so limited, and in other embodiments other manners of performing software-based information flow tracking, along with elimination of redundant tracking, may be realized.
Referring now to
Still referring to
Translation engine 55 may be adapted to receive incoming source code 20 and translate it into target code 40. More specifically, translation engine 55 may translate source code 20 into the language used in a given environment to be able to perform the desired operations using the ISA of the target machine.
Instrumentation engine 60 may be used to instrument target code 40 with additional instructions to perform various functions. With respect to embodiments of the present invention, instrumentation engine 60 may be adapted to insert code to perform dynamic information flow tracking. In various embodiments, each target instruction may be instrumented with additional code to perform the information flow tracking. Accordingly, instrumentation engine 60 may generate additional code to be inserted into target code 40. To avoid the computation expense of performing the instrumented code in every execution, in some embodiments instrumentation engine 60 may generate instrumented code to be stored as a fat block within target code 40, while the original translated code (without instrumentation) may also be stored in target code 40. In this way, when dynamic information flow tracking is not needed for a given code block during execution, the computation expense of executing the instrumented code (e.g., the fat block) can be avoided.
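To make the distinction between the original translated block and the fat block concrete, the following sketch models a basic block as a list of three-operand add instructions. The instruction format, the boolean tag per register, and the function names `run_original` and `run_fat` are assumptions made for illustration only; an actual instrumentation engine would emit target-ISA instructions rather than interpret a list.

```python
def run_original(block, regs):
    """Translated block without instrumentation: computes values only."""
    for dst, a, b in block:
        regs[dst] = regs[a] + regs[b]


def run_fat(block, regs, tags):
    """Fat block: every instruction is followed by added tracking code.

    Assumed propagation rule: the destination is tagged if either
    source operand is tagged.
    """
    for dst, a, b in block:
        regs[dst] = regs[a] + regs[b]   # original operation
        tags[dst] = tags[a] or tags[b]  # inserted flow tracking code


block = [("r2", "r0", "r1"), ("r3", "r2", "r2")]

regs_plain = {"r0": 1, "r1": 2, "r2": 0, "r3": 0}
run_original(block, regs_plain)

regs_fat = {"r0": 1, "r1": 2, "r2": 0, "r3": 0}
tags = {"r0": True, "r1": False, "r2": False, "r3": False}
run_fat(block, regs_fat, tags)
```

Both versions produce the same data values; the fat block does roughly twice the work per instruction in this sketch, which is the expense avoided when tracking is not needed.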
To determine whether or not the instrumented code is to be executed, translator 50 may further include a dynamic analysis engine 65 which may be used to dynamically analyze incoming data to a code block, e.g., a basic block or a trace which may be formed of a plurality of basic blocks. Based on whether a pattern of the input data has been previously seen by a code block, dynamic analysis engine 65 will provide the input data to either the original translated code block in target code 40 or the instrumented fat block in target code 40. While described with this particular implementation in the embodiment of
Referring now to
After translation and instrumentation, the program (i.e., translated code) corresponding to the target code may be executed (block 130). In various embodiments, a DBT may be used to execute the code on a target platform. During execution of code, it may be determined, e.g., upon entry to a given basic block or other code segment, whether the code block is a hot spot (diamond 140). That is, it may be determined whether the code block to be executed has been run more than a selected number of times, as determined by instrumentation code or the like. If it is determined that the code to be executed is not a hot spot, control passes to block 150, where the instrumented code may be executed. Accordingly, a fat instrumented block including flow tracking code may be executed so that upon conclusion of the executed code, data values can be passed to the next code block. Furthermore, a tracking summary corresponding to that data may also be passed to the next code block. In various implementations, the tracking summary may further be stored in a storage. From block 150, control passes back to block 130 for execution of further code, e.g., a next code block.
Still referring to
Referring now to
Still referring to
Thus as shown in
As mentioned above, redundant tracking elimination may further be implemented on a larger scale, e.g., at a program-region or trace level. As an example, a program region may be a collection of basic blocks that are executed frequently, may contain multiple branches, have a single entry point, and may contain multiple exits. Referring now to
Instrumented program region 310, which may correspond to a fat program region, includes additional code to perform dynamic flow tracking. By such instrumentation, the complexity and length of original program region 300 is thus expanded. Accordingly, when flow tracking information is already available for a given input data pattern to a selected program region, embodiments may seek to execute original program region 300 rather than instrumented program region 310.
Still referring to
As further shown in
Referring now to
If instead at diamond 520 it is determined that an input data pattern has been seen before, control passes to block 560. There, an original code segment (i.e., translated but uninstrumented code) may be executed using the input data (block 560). By executing the original code segment, the expense of performing the instrumented code can be eliminated. At the conclusion of code execution, a tracking summary may be applied (block 570). That is, a tracking summary previously stored (e.g., at block 540) when the corresponding instrumented code block was performed for input data having the same security data pattern may be applied to the output data. Then as discussed above, continued program execution may occur at block 550. While described with this particular implementation in the embodiment of
Thus according to various embodiments, only a limited amount of flow tracking may be performed based on an input data pattern, i.e., just enough tracking (JET). When used in a DBT, this limited flow tracking may be referred to as just enough tracking dynamic binary translation (JETDBT). In this way, embodiments of the present invention may incur low run-time overhead, allowing a pure software-based dynamic information flow tracking approach. Furthermore, using embodiments of the present invention security may be enhanced for legacy code, e.g., 32-bit code, when that code is translated into a 64-bit environment.
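The decision made at diamond 520 and the two paths that follow it can be sketched as a small dispatcher. This is a minimal sketch under stated assumptions: the class `JetDispatcher`, its attributes, and the two example callables are hypothetical, tags are modeled as booleans, and the code segment is modeled as a Python callable rather than translated machine code.

```python
class JetDispatcher:
    """Chooses between the instrumented and original versions of one
    code segment based on whether the input tag pattern was seen before."""

    def __init__(self, original, instrumented):
        self.original = original          # translated but uninstrumented code
        self.instrumented = instrumented  # returns (outputs, output tags)
        self.summaries = {}               # input tag pattern -> output tags
        self.tracked_runs = 0             # times the instrumented code ran

    def execute(self, inputs, in_tags):
        pattern = tuple(in_tags)
        if pattern not in self.summaries:
            # Diamond 520: pattern not seen, so run the instrumented code
            # and store the resulting tracking summary (block 540).
            outputs, out_tags = self.instrumented(inputs, in_tags)
            self.summaries[pattern] = tuple(out_tags)
            self.tracked_runs += 1
        else:
            # Pattern seen: run the original code (block 560) and apply
            # the stored tracking summary to the output (block 570).
            outputs = self.original(inputs)
        return outputs, self.summaries[pattern]


def add_original(inputs):
    """Example segment: adds its two inputs."""
    return (inputs[0] + inputs[1],)


def add_instrumented(inputs, in_tags):
    """Same segment with assumed OR-style tag propagation added."""
    return (inputs[0] + inputs[1],), (in_tags[0] or in_tags[1],)


dispatcher = JetDispatcher(add_original, add_instrumented)
r1 = dispatcher.execute((1, 2), (True, False))
r2 = dispatcher.execute((5, 6), (True, False))
```

In this sketch the second call carries different data values but the same tag pattern, so only the uninstrumented code runs and the stored summary supplies the output tags.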
Embodiments may be implemented in code and may be stored on a storage medium having stored thereon instructions which can be used to program a system to perform the instructions. The storage medium may include, but is not limited to, any type of disk including floppy disks, optical disks, compact disk read-only memories (CD-ROMs), compact disk rewritables (CD-RWs), and magneto-optical disks, semiconductor devices such as read-only memories (ROMs), random access memories (RAMs) such as dynamic random access memories (DRAMs), static random access memories (SRAMs), erasable programmable read-only memories (EPROMs), flash memories, electrically erasable programmable read-only memories (EEPROMs), magnetic or optical cards, or any other type of media suitable for storing or transmitting electronic instructions.
Now referring to
The processor 610 may be coupled over a host bus 615 to a memory hub 630 in one embodiment, which may be coupled to a system memory 620 (e.g., a dynamic random access memory (DRAM)) via a memory bus 625. Programs such as a dynamic binary translator in accordance with an embodiment of the present invention may be stored in system memory 620 during operation, along with program data such as tracking summaries generated during code execution. The memory hub 630 may also be coupled over an Accelerated Graphics Port (AGP) bus 633 to a video controller 635, which may be coupled to a display 637, which may be a flat panel display in some embodiments. The AGP bus 633 may conform to the Accelerated Graphics Port Interface Specification, Revision 2.0, published May 6, 1998, by Intel Corporation, Santa Clara, Calif.
The memory hub 630 may also be coupled (via a hub link 638) to an input/output (I/O) hub 640 that is coupled to an input/output (I/O) expansion bus 642 and a Peripheral Component Interconnect (PCI) bus 644, as defined by the PCI Local Bus Specification, Production Version, Revision 2.1 dated June 1995. The I/O expansion bus 642 may be coupled to an I/O controller 646 that controls access to one or more I/O devices. As shown in
The PCI bus 644 may also be coupled to various components including, for example, a network controller 660 that is coupled to a network port (not shown). Additional devices may be coupled to the I/O expansion bus 642 and the PCI bus 644, such as an input/output control circuit coupled to a parallel port, serial port, a non-volatile memory, and the like.
Although the description makes reference to specific components of the system 600, it is contemplated that numerous modifications and variations of the described and illustrated embodiments may be possible. Moreover, while
While the present invention has been described with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of the present invention.