Code instrumentation is a method for analyzing and evaluating program code performance. Source instrumentation modifies a program's original source code, while binary instrumentation modifies an existing binary executable. In one approach to binary code instrumentation, new instructions or probe code are added to an executable program, and consequently, the original code in the program is changed and/or relocated. Some examples of probe code include adding values to a register, moving the address of some data to some registers, and adding counters to determine how many times a function is called. The changed and/or relocated code is referred to as instrumented code, or more generally, as an instrumented process.
One specific type of code instrumentation is referred to as dynamic binary instrumentation. Dynamic binary instrumentation allows program instructions to be changed on-the-fly. Measurements such as basic-block coverage and function invocation counting can be accurately determined using dynamic binary instrumentation. Additionally, dynamic binary instrumentation, in contrast to static instrumentation, is performed at run-time of a program and only instruments those parts of an executable that are actually executed. This minimizes the overhead imposed by the instrumentation process itself. Furthermore, performance analysis tools based on dynamic binary instrumentation require no special preparation of an executable such as, for example, a modified build or link process.
One embodiment of the present invention may comprise a system for branch profiling an executable program. The system may comprise a branch profiler that assigns branch integers to branches in a loop and a plurality of path values that correspond to sums of the branch integers through respective execution paths in the loop. The system may also comprise a probe instrument that inserts an integer add instruction into branches of the loop, a set of path counter instructions after a last branch in the loop, and a set of loop path counter array update instructions after an exit point of the loop.
Another embodiment may comprise a method of branch profiling of an executable program. The method may comprise inserting an integer add instruction in branches of a loop of the executable program, inserting path counter instructions after a last branch of the loop and prior to an exit point of the loop, and inserting loop counter array update instructions after the exit point of the loop.
Another embodiment relates to a computer readable medium having computer executable instructions for performing a method. The method may comprise performing a branch profiling on at least one loop in an executable program to assign branch integers to branches and path values to execution paths in the at least one loop, inserting integer add instructions to branches of the at least one loop, inserting path counter instructions at an end of the at least one loop, and inserting loop path counter array update instructions after an exit point of the at least one loop.
Still another embodiment may relate to a dynamic instrumentation system. The dynamic instrumentation system may comprise means for generating an intermediate representation of a function associated with an executable program, means for analyzing the intermediate representation to identify at least one loop in the function, and means for performing branch profiling on the identified at least one loop to assign branch integers to branches and path values to execution paths of the identified at least one loop. The system may further comprise means for inserting code into the identified at least one loop. The means for inserting code may insert integer add instructions into branches of the identified at least one loop, and path counter instructions at an end of the identified at least one loop. The system may also comprise means for encoding the inserted code and the intermediate representation of the function to produce an instrumented function.
This disclosure relates generally to dynamic instrumentation systems and methods. Branch profiling is performed on at least one loop of an executable program. The branch profiling determines the possible paths through branches of the at least one loop. The branch profiling then assigns branch integers to branches contained within the at least one loop, and a plurality of path values for respective possible execution paths through the branches. The path values correspond to the sum of the branch integers over a respective path. Integer add instructions are inserted in respective branches. The integer add instructions sum up the branch integers of a given execution path for a given loop execution. Path counter instructions are inserted at an end of the at least one loop. Upon completion of a loop iteration, program execution is directed to the path counter instructions. The path counter instructions compare the value of an integer adder with the plurality of path values to determine which path of the plurality of execution paths has been taken during the given single loop execution. A corresponding path counter associated with the execution path that has been taken is incremented for a respective loop iteration.
A loop path counter array update instruction is inserted after an exit point of the loop. The loop path counter array update instruction updates loop path counter entries in a loop path counter array with the number of executions for respective paths taken through the loop. One or more loops can be assigned loop path counter arrays. The loop path counter array update instructions can be embedded in a multi-thread safe set of ownership instructions, such as a spinlock operation. A spinlock operation provides a thread with ownership of the loop path counter array entries stored in memory, preventing other threads from modifying the loop path counter entries associated with the loop path counter array, until the ownership is released.
The dynamic instrumentation tool 12 is operative to perform branch profiling on the executable program 14. The branch profiling includes analyzing at least one loop to assign integer values to branches within the at least one loop, and path values that correspond to unique execution paths through the at least one loop. The sum of the branch integers through an execution path is equal to a respective path value, such that an execution path through the at least one loop can be determined by decoding the path value. The branch profiling can be an algorithm (e.g., Ball-Larus algorithm) for assigning integers to branches and path values to unique execution paths.
The dynamic instrumentation tool 12 is operative to assign instrumentation counters to at least one loop associated with the executable program 14. The instrumentation counters can include an integer adder that adds branch integers associated with branch executions to provide a unique path value for a given loop execution. The instrumentation counters can include path counters that maintain a count associated with a number of times a given execution path has been taken over the total loop iterations of a respective loop. The dynamic instrumentation tool 12 is operative to assign a free register to the at least one loop for integer adding, and a plurality of free registers for path counting.
A free register can be found by analyzing the executable program 14 to determine which registers are used for storing data by the executable program and which registers are not used. Additionally, the code can be analyzed to determine which registers are currently available for use that would not interfere with the executable program execution. It is to be appreciated that a variety of techniques can be employed to find a free register.
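One of the techniques mentioned above, scanning the code for registers that are never referenced, can be sketched as follows. This is a minimal illustration only: the register names, the `(opcode, operands)` instruction tuples, and the function `find_free_registers` are hypothetical and do not correspond to any particular instruction set or to the tool described here.

```python
# Hypothetical register file, for illustration only.
ALL_REGISTERS = {f"r{i}" for i in range(8)}


def find_free_registers(instructions, all_registers=ALL_REGISTERS):
    """Return the registers that no instruction in the sequence reads or writes."""
    used = set()
    for _opcode, operands in instructions:
        # Operands that are not register names (e.g. addresses) are ignored.
        used.update(op for op in operands if op in all_registers)
    return sorted(all_registers - used)


# Hypothetical decoded function body as (opcode, operands) tuples.
function_body = [
    ("load", ["r0", "addr"]),
    ("add", ["r1", "r0", "r2"]),
    ("store", ["r1", "addr"]),
]

free = find_free_registers(function_body)
print(free)
```

A real analysis would also have to account for calling conventions and registers that are live across the loop, which this sketch does not attempt.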
The dynamic instrumentation tool 12 can load the executable program 14 and insert breaks at a beginning of each function under the control of a debugging interface, which is provided by the operating system (e.g., ttrace() on HP-UX® Operating System, ptrace() on LINUX® Operating System, Extended Debugging Interface (eXDI) on MICROSOFT WINDOWS® Operating System). The executable program 14 is then executed. The debugging interface makes it possible to transfer control from the target application to the dynamic instrumentation tool 12 whenever a break is encountered in the executable program.
As the executable program 14 encounters a break corresponding to a newly reached function, control is passed to the dynamic instrumentation tool 12. The dynamic instrumentation tool 12 loads the function. The dynamic instrumentation tool 12 then converts the function into an intermediate representation by decoding the binary code associated with the function and converting the decoded binary code via an intermediate representation instrument. A control flow graph constructor then generates a control flow graph from the intermediate representation. A loop analysis is then performed on the intermediate representation by a loop recognition algorithm.
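The loop recognition algorithm is not specified further. As one common possibility, loops in a control flow graph can be located by finding back edges during a depth-first traversal. The sketch below assumes a control flow graph given as a hypothetical dictionary mapping each block name to its successor blocks; `find_back_edges` is an illustrative helper, not part of the described tool.

```python
def find_back_edges(successors, entry):
    """Return edges (u, v) whose target v is still on the depth-first stack
    when u is explored. Each such edge closes a loop whose header is v."""
    back_edges = []
    state = {}  # node -> "active" while on the DFS stack, "done" afterwards

    def dfs(node):
        state[node] = "active"
        for succ in successors.get(node, []):
            if state.get(succ) == "active":
                back_edges.append((node, succ))
            elif succ not in state:
                dfs(succ)
        state[node] = "done"

    dfs(entry)
    return back_edges


# Hypothetical control flow graph: a single loop with header "head".
cfg = {
    "entry": ["head"],
    "head": ["body", "exit"],
    "body": ["head"],
    "exit": [],
}

back_edges = find_back_edges(cfg, "entry")
print(back_edges)
```

In this example the edge from `body` back to `head` identifies the loop. Irreducible control flow would need a more careful treatment than this sketch provides.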
Branch profiling can be performed on the control flow graph to identify branches and possible execution paths through the one or more loops. A branch profiling algorithm (e.g., Ball-Larus algorithm) can be employed to assign branch integers to branches and path values to execution paths through one or more loops, such that the branch integers can be summed to determine an execution path that has been taken through a given loop during a single loop execution. The branch integer sum is a unique value that corresponds to a specific execution path through a set of branches.
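The assignment of branch integers can be illustrated with a short sketch in the style of the Ball-Larus numbering named above. It assumes the loop body has been reduced to an acyclic graph (back edge removed) and is supplied as a hypothetical dictionary of successor lists; the block names and the functions `assign_branch_integers` and `path_sums` are illustrative assumptions.

```python
def assign_branch_integers(successors, entry):
    """Assign an integer to every edge of an acyclic graph so that each
    entry-to-exit path has a distinct sum in the range 0..num_paths-1."""
    num_paths = {}
    edge_value = {}

    def visit(node):
        if node in num_paths:
            return num_paths[node]
        succs = successors.get(node, [])
        total = 0
        for succ in succs:
            # The edge value is the number of paths through earlier siblings.
            edge_value[(node, succ)] = total
            total += visit(succ)
        num_paths[node] = total if succs else 1
        return num_paths[node]

    visit(entry)
    return edge_value, num_paths[entry]


def path_sums(successors, edge_value, node, acc=0):
    """Enumerate the branch integer sum of every path starting at node."""
    succs = successors.get(node, [])
    if not succs:
        return [acc]
    sums = []
    for succ in succs:
        sums += path_sums(successors, edge_value, succ,
                          acc + edge_value[(node, succ)])
    return sums


# Hypothetical loop body: two consecutive if/else constructs.
body = {
    "A": ["B", "C"], "B": ["D"], "C": ["D"],
    "D": ["E", "F"], "E": ["G"], "F": ["G"], "G": [],
}

edge_value, total_paths = assign_branch_integers(body, "A")
sums = sorted(path_sums(body, edge_value, "A"))
print(total_paths, sums)
```

Each of the four paths receives a different sum, so the sum alone is enough to tell the paths apart.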
The dynamic instrumentation tool 12 can then insert one or more instrumentation counters via a probe code instrumenter. The one or more instrumentation counters include an integer adder that adds a branch integer corresponding to an executed branch to the integer adder. The integer adder can be a free register. The probe instrument can insert instructions in branches to increment the free register by the value of the branch integer when the branch is executed. At the end of the loop, the free register will provide a path value associated with execution through a set of branches corresponding to the execution path through the loop.
A path counter can be employed for each unique path through a loop iteration. A respective path counter can be incremented each time its corresponding execution path is taken by comparing the sum of the integer adder with a set of assigned path values that identify which execution path has been taken. A path counter can include a free register that is incremented if its associated path has been taken. The probe instrument can insert instructions that increment a respective path counter if the integer adder value is equal to the path value associated with the respective path counter. In this manner, a respective path counter is incremented through each loop iteration based on the execution path through the loop. Once execution of the loop has completed (e.g., after a plurality of loop iterations), the path counters will contain values indicative of the number of times a particular path through the loop has been taken for each possible execution path.
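The interaction of the integer adder and the path counters can be simulated in a few lines. The sketch below uses ordinary Python variables as stand-ins for the free registers; the branch names, branch integers, and path values are hypothetical, and `count_paths` is an illustrative function rather than the inserted probe code itself.

```python
def count_paths(iterations, branch_integer, path_values):
    """Simulate the probe code for one execution of a loop.

    iterations: one list of executed branch names per loop iteration.
    branch_integer: branch name -> assigned branch integer.
    path_values: one path value per possible execution path.
    """
    path_counters = [0] * len(path_values)    # stand-ins for the path counter registers
    for taken_branches in iterations:
        adder = 0                             # integer adder, reset each iteration
        for branch in taken_branches:
            adder += branch_integer[branch]   # effect of an integer add instruction
        for i, path_value in enumerate(path_values):
            if adder == path_value:           # effect of the path counter instructions
                path_counters[i] += 1
    return path_counters


# Hypothetical branch integers for two if/else constructs.
branch_integer = {"then1": 0, "else1": 2, "then2": 0, "else2": 1}
path_values = [0, 1, 2, 3]

iterations = [
    ["then1", "then2"],   # sum 0
    ["else1", "else2"],   # sum 3
    ["then1", "else2"],   # sum 1
    ["else1", "else2"],   # sum 3
]

counters = count_paths(iterations, branch_integer, path_values)
print(counters)
```

After the four iterations, the counters show that the path with value 3 was taken twice and the path with value 2 was never taken.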
As previously stated, a number of different techniques can be employed to generate free registers. Free registers are inherently multi-thread safe. Typically, loop counters are employed to count loop iterations by utilizing atomic memory update instructions. The atomic memory update instructions are multi-thread safe, but are substantially time intensive (e.g., about 20 clock cycles) as compared to a register add instruction (e.g., about 1 clock cycle).
A loop path counter array update instruction is inserted after an exit point of the loop. The loop path counter array update instructions update loop path counter entries in a loop path counter array with the number of executions for the paths taken through a respective loop associated with the executable program. The loop path counter array update instructions can be embedded in a multi-thread safe set of ownership instructions, such as a spinlock operation. A spinlock operation provides a thread with ownership of the loop path counter array stored in memory, preventing other threads from modifying the loop path counter array until the ownership is released. The loop path counter array maintains a count associated with total path executions of each possible execution path over one or more loop executions associated with a given loop. A respective loop path counter array can be assigned to respective loops for one or more loops in the executable program 14.
The dynamic instrumentation tool 12 then encodes the modified function code to provide an instrumented function in binary form. The instrumented function is stored in a shared memory 18. The original entry point of the function (where the break point was placed) is patched with a branch/jump to the instrumented version of the function. Execution is then resumed at the address of the instrumented function (e.g., resume can be an option in the debug interface). Control is thereby transferred back to the executable program, which continues to execute until a breakpoint at a function not yet encountered is reached. The process then repeats for the next function until all functions have been instrumented. Once the executable program 14 and instrumented functions have completed execution, the dynamic instrumentation tool 12 can retrieve the loop path counter entries from one or more loop path counter arrays from the shared memory 18.
The dynamic instrumentation tool 40 includes a branch profiler 48. The branch profiler 48 analyzes the branches in one or more loops to identify potential execution paths through the one or more loops. Each branch is assigned a branch integer and each execution path is assigned a path value, such that the sum of branch integers through each possible execution path of a set of branches provides a unique path value. In this manner, the particular execution path taken through the branches can be reconstructed by comparing the integer sum values with the different path values.
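Reconstructing a path from its path value can be sketched as a walk over the graph that, at each block, follows the outgoing edge with the largest branch integer not exceeding the remaining value. This works under the assumption that the branch integers were assigned in the Ball-Larus manner; the graph, the edge values, and the function `decode_path` below are hypothetical illustrations.

```python
def decode_path(successors, edge_value, entry, path_value):
    """Recover the sequence of blocks whose branch integers sum to path_value."""
    path = [entry]
    node, remaining = entry, path_value
    while successors.get(node):
        # Follow the edge with the largest value that still fits.
        succ = max(
            (s for s in successors[node] if edge_value[(node, s)] <= remaining),
            key=lambda s: edge_value[(node, s)],
        )
        remaining -= edge_value[(node, succ)]
        node = succ
        path.append(node)
    return path


# Hypothetical loop body with two if/else constructs and its branch integers.
body = {
    "A": ["B", "C"], "B": ["D"], "C": ["D"],
    "D": ["E", "F"], "E": ["G"], "F": ["G"], "G": [],
}
edge_value = {
    ("A", "B"): 0, ("A", "C"): 2, ("B", "D"): 0, ("C", "D"): 0,
    ("D", "E"): 0, ("D", "F"): 1, ("E", "G"): 0, ("F", "G"): 0,
}

print(decode_path(body, edge_value, "A", 3))
print(decode_path(body, edge_value, "A", 0))
```

With these values, path value 3 decodes to the path through blocks C and F, and path value 0 decodes to the path through blocks B and E.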
The dynamic instrumentation tool 40 also includes a probe code instrumenter 50. The probe code instrumenter 50 can insert integer add instructions in the branches. The integer add instructions can add the corresponding branch integer to an integer adder for an executed branch. The probe code instrumenter 50 can insert path counter instructions that can increment respective path counters that maintain a path count for a respective execution path. The path counter instructions can be inserted at the end of a loop prior to an exit point of the loop. The probe code instrumenter 50 can insert loop path counter array update instructions after the exit point of a loop. The loop path counter array update instruction updates loop path counter entries of the loop path counter array to maintain path execution counts over one or more executions of a given loop. A plurality of loops can be assigned respective loop path counter arrays.
Free registers can be employed to implement the integer adder and the path counters. The probe code instrumenter 50 can insert register initialization instructions at the loop entry point. The probe code instrumenter 50 can generate free registers associated with the integer add instructions and the path counter instructions for one or more loops, as long as free registers are available. If free registers are no longer available, the probe code instrumenter 50 can insert path counter instructions that store the path count values to memory. The dynamic instrumentation tool 40 includes an encoder 50 that encodes the IR instrumented function into a binary instrumented function. The dynamic instrumentation tool 40 includes a process control 52 that stores the binary instrumented function in shared memory, patches a branch/jump instruction in the executable program where the break point was placed, and passes control back to the executable program.
Therefore, the shared memory 60 includes a loop path counter array access flag, labeled C1AF, associated with the loop path counter array. The access flag is employed to maintain ownership of the loop path counter array memory spaces by a single process at a time, so that loop path counter array integrity is maintained. For example, if a process desires to overwrite the loop path counter array, the process will request control of the loop path counter array by checking the corresponding counter access flag. If the counter access flag is not set, the process will set the flag and update the corresponding loop path counter array. The process will then reset the flag and release control of the loop path counter array, so that other processes may access the loop path counter array in shared memory 60. In this manner, updates to the loop path counter array are multi-thread safe and the integrity of the array is maintained. It is to be appreciated that a respective loop path counter array can be assigned to a respective loop for one or more loops in an executable program to maintain path counts for the one or more loops.
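The check, set, update, and reset sequence for the access flag can be sketched with threads. In this illustration a non-blocking acquire on a `threading.Lock` stands in for an atomic test-and-set of the flag, and a Python list stands in for the loop path counter array in shared memory; the class `AccessFlag` and the function `update_array` are hypothetical names.

```python
import threading


class AccessFlag:
    """Spin-style access flag. The non-blocking Lock acquire is an assumed
    stand-in for an atomic test-and-set instruction on the flag."""

    def __init__(self):
        self._flag = threading.Lock()

    def acquire(self):
        # Keep polling while another thread has the flag set.
        while not self._flag.acquire(blocking=False):
            pass

    def release(self):
        self._flag.release()


loop_path_counter_array = [0, 0, 0]   # stand-in for the array in shared memory
flag = AccessFlag()


def update_array(local_path_counts):
    """Add one thread's path counts to the shared array while owning the flag."""
    flag.acquire()
    try:
        for i, count in enumerate(local_path_counts):
            loop_path_counter_array[i] += count
    finally:
        flag.release()


threads = [threading.Thread(target=update_array, args=([1, 2, 3],))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(loop_path_counter_array)
```

Because each thread adds its counts only while it owns the flag, the four updates of `[1, 2, 3]` accumulate without interfering with one another.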
The shared memory 60 also retains a plurality of instrumented functions, labeled 1 through K, where K is an integer greater than or equal to one. The dynamic instrumentation tool stores the encoded instrumented functions in shared memory 60 to provide ready access to both the instrumentation tool and the executable program. The address associated with a given instrumented function is placed before the associated function in the executable program, such that the instrumented function is executed in place of the given function via a jump or call instruction. Once the executable program is completely instrumented, a substantial portion of executable program execution occurs in shared memory 60 via the instrumented functions.
The loop 70 includes instrumentation code provided by a dynamic instrumentation tool. The dynamic instrumentation tool assigns a set of free registers to the loop 70. A first register Rx is employed as an integer adder, while the remaining registers R1-RN are employed as path counters. The dynamic instrumentation tool inserts a set of register initialization instructions 72 (R1=0; R2=0; . . . ;RN=0) at lines 001-003 prior to a loop start or entry point (004) to reset the path counters. The dynamic instrumentation tool inserts an integer adder register initialization instruction (Rx=0) at line 005 after a loop start or entry point (004) to reset the integer adder for each loop iteration.
The dynamic instrumentation tool inserts a register integer add instruction in one or more control flow branches of the loop 70. For example, a first register integer add instruction (Rx=Rx+INT0) is inserted at line 007 within a first branch 74 indicated as extending from lines 006 to 008, and a second register integer add instruction (Rx=Rx+INT1) is inserted at line 009 within a second branch 75 indicated as extending from lines 008 to 010. If cond1 is true at line 006, the first branch 74 will be executed and the integer adder (Rx) will be incremented by INT0. If cond1 is false at line 006, the second branch 75 will be executed and the integer adder (Rx) will be incremented by INT1.
The process is repeated for B number of branches. For example, a B-2 integer add instruction (Rx=Rx+INTB-2) is inserted at line 012 within a B-2 branch 76 indicated as extending from lines 011 to 013, and a B-1 integer add instruction (Rx=Rx+INTB-1) is inserted at line 014 within a B-1 branch 77 indicated as extending from lines 013 to 015. If condk is true at line 011, the B-2 branch 76 will be executed and the integer adder (Rx) will be incremented by INTB-2. If condk is false at line 011, the B-1 branch 77 will be executed and the integer adder (Rx) will be incremented by INTB-1.
A set of path counter instructions 78 is inserted after the last branch 77 of the loop 70 and prior to an exit point 80 of the loop 70. The exit point 80 of the loop 70 includes a set of instructions, illustrated in lines 019-021, in which a loop condition is evaluated: the loop continues execution for another iteration if the loop condition is true and terminates if the loop condition is false. The set of path counter instructions 78 is illustrated in lines 016 to 018, in which the integer adder value is compared with a plurality of path values, each associated with a unique execution path that can be taken. For example, if the value of the integer adder (Rx) is equal to the path value PV0, the value of a first path counter (R1) is incremented by one, indicating that the execution path associated with the path value PV0 has been executed. If the value of the integer adder (Rx) is equal to the path value PV1, the value of a second path counter (R2) is incremented by one, indicating that the execution path associated with the path value PV1 has been executed. Finally, if the value of the integer adder (Rx) is equal to the path value PVN-1, the value of a final path register RN is incremented by one, indicating that the execution path associated with the path value PVN-1 has been executed, such that there are N possible unique execution paths.
The dynamic instrumentation tool also inserts a set of loop path counter array update instructions 82 after the exit point 80 of the loop 70 beginning at line 022. Execution of the loop path counter array update instructions 82 causes the loop path counter array entries in shared memory to be updated by adding the values of the path registers to the respective loop path counter entries of the loop path counter array in shared memory.
Since the loop path counter array resides in shared memory, the loop path counter entries are not multi-thread safe. Therefore, the loop path counter array update instructions are embedded in memory ownership instructions, such that ownership of the loop path counter array memory locations is requested prior to updating of the loop path counter entries. For example, a spinlock access command is a set of instructions that requests access to a loop path counter array by checking the state of a loop access flag via a set of spinlock access instructions illustrated at line 022. The loop path counter entries are then updated by execution of the loop path counter update instructions 84. The loop access flag is then reset via a set of spinlock release instructions illustrated at line 026, thus releasing ownership control of the memory locations associated with the loop path counter array. Although a single instruction is shown for illustrating a spinlock access instruction set, a loop path counter array update instruction and a spinlock release instruction set, a plurality of instructions can be employed to execute any of a spinlock access, a loop counter update and a spinlock release.
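The placement of the inserted instructions relative to the loop can be summarized in a high-level sketch. Local variables stand in for the registers Rx and R1-RN, a `threading.Lock` stands in for the spinlock access and release instructions, and the branch integers, path values, loop conditions, and the function `instrumented_loop` are all hypothetical choices made for this illustration with two if/else constructs.

```python
import threading

# Hypothetical branch integers and the path values they sum to.
INT0, INT1, INT2, INT3 = 0, 2, 0, 1
PV0, PV1, PV2, PV3 = 0, 1, 2, 3

loop_path_counter_array = [0, 0, 0, 0]   # stand-in for the array in shared memory
array_lock = threading.Lock()             # stand-in for the spinlock / access flag


def instrumented_loop(items):
    r1 = r2 = r3 = r4 = 0        # path counter initialization, before the loop entry
    for x in items:              # loop entry point
        rx = 0                   # integer adder reset on every iteration
        if x % 2 == 0:           # first condition (hypothetical)
            rx += INT0           # integer add in the first branch
        else:
            rx += INT1           # integer add in the second branch
        if x > 10:               # second condition (hypothetical)
            rx += INT2
        else:
            rx += INT3
        # Path counter instructions: after the last branch, before the exit test.
        if rx == PV0:
            r1 += 1
        if rx == PV1:
            r2 += 1
        if rx == PV2:
            r3 += 1
        if rx == PV3:
            r4 += 1
    # Loop path counter array update, after the exit point, under ownership.
    with array_lock:
        for i, count in enumerate((r1, r2, r3, r4)):
            loop_path_counter_array[i] += count


instrumented_loop([2, 3, 12, 15, 4])
print(loop_path_counter_array)
```

The shared array is touched only once per loop execution, after the exit point, while the per-iteration work is confined to the local stand-ins for the registers.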
In view of the foregoing structural and functional features described above, certain methods will be better appreciated with reference to
At 130, the dynamic instrumentation tool decodes the executable function and generates an intermediate representation of the given function, and generates a control flow graph from the intermediate representation. The dynamic instrumentation tool then performs loop recognition analysis on the control flow graph to identify loops in the given function at 150. After the loops have been identified, the methodology proceeds to 160.
At 160, branch profiling is performed on one or more loops. Branch profiling can be performed on the control flow graph to identify branches and possible execution paths through the one or more loops. A branch profiling algorithm (e.g., Ball-Larus algorithm) can be employed to assign branch integers to branches and path values to execution paths through one or more loops, such that the branch integers can be summed to determine an execution path that has been taken through a given loop during a single loop execution. The methodology then proceeds to 170.
At 170, one or more instrumentation counter instructions are inserted into one or more loops associated with the given function. The one or more instrumentation counter instructions include register initialization instructions, integer adder instructions, path counter instructions and loop path counter array update instructions for one or more respective loops. The instrumentation execution speed can be improved by generating free registers for implementing an integer adder and path counters for a given loop.
Register initialization instructions for initializing the path counters can be inserted prior to an entry point in the loop. A register initialization instruction for initializing the integer adder can be inserted after the entry point in the loop. The one or more inserted instructions include integer adder instructions, inserted in branches, that increment an integer adder free register by the value of the branch integer when the branch is executed. The one or more inserted instructions also include path counter instructions that increment respective path counter free registers by one if the integer adder value is equal to the path value associated with the respective path counter. The path counter instructions can be inserted at an end of the loop after a last branch and prior to an exit point of a loop. In this manner, a respective path counter is incremented through each loop iteration based on the execution path through the loop. Once execution of the loop has completed (e.g., after a plurality of loop iterations), the registers will contain values indicative of the number of times a particular path through the loop has been taken for each possible execution path.
The one or more inserted instructions include inserting loop path counter array update instructions after the exit point of a loop. The loop path counter array update instructions update loop path counter entries in a loop path counter array with the number of executions for the paths taken through a respective loop associated with the executable program. The loop path counter array update instructions can be embedded in a multi-thread safe set of ownership instructions, such as a spinlock operation. The loop path counter array maintains a count associated with total path executions of each possible execution path over one or more loop executions associated with a given loop.
At 180, the modified instrumented executable function is encoded into a binary executable, and stored in shared memory. At 190, the break in the executable program associated with the given function is replaced with a branch/jump to the instrumented function and control is returned to the executable program. The methodology then proceeds to 200 where execution is continued at the start of the instrumented function. The methodology then returns to 110 until the next breakpoint is encountered.
The computer system 320 includes a processing unit 321, a system memory 322, and a system bus 323 that couples various system components including the system memory to the processing unit 321. Dual microprocessors and other multi-processor architectures also can be used as the processing unit 321. The system bus may be any of several types of bus structure including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. The system memory includes read only memory (ROM) 324 and random access memory (RAM) 325. A basic input/output system (BIOS) can reside in memory containing the basic routines that help to transfer information between elements within the computer system 320.
The computer system 320 can include a hard disk drive 327, a magnetic disk drive 328, e.g., to read from or write to a removable disk 329, and an optical disk drive 330, e.g., for reading a CD-ROM disk 331 or to read from or write to other optical media. The hard disk drive 327, magnetic disk drive 328, and optical disk drive 330 are connected to the system bus 323 by a hard disk drive interface 332, a magnetic disk drive interface 333, and an optical drive interface 334, respectively. The drives and their associated computer-readable media provide nonvolatile storage of data, data structures, and computer-executable instructions for the computer system 320. Although the description of computer-readable media above refers to a hard disk, a removable magnetic disk and a CD, other types of media which are readable by a computer, such as magnetic cassettes, flash memory cards, digital video disks and the like, may also be used in the operating environment, and any such media may contain computer-executable instructions.
A number of program modules may be stored in the drives and RAM 325, including an operating system 335, one or more executable programs 336, other program modules 337, and program data 338. A user may enter commands and information into the computer system 320 through a keyboard 340 and a pointing device, such as a mouse 342. Other input devices (not shown) may include a microphone, a joystick, a game pad, a scanner, or the like. These and other input devices are often connected to the processing unit 321 through a corresponding port interface 346 that is coupled to the system bus, but may be connected by other interfaces, such as a parallel port, a serial port or a universal serial bus (USB). A monitor 347 or other type of display device is also connected to the system bus 323 via an interface, such as a video adapter 348.
The computer system 320 may operate in a networked environment using logical connections to one or more remote computers, such as a remote client computer 349. The remote computer 349 may be a workstation, a computer system, a router, a peer device or other common network node, and typically includes many or all of the elements described relative to the computer system 320. The logical connections can include a local area network (LAN) 351 and a wide area network (WAN) 352.
When used in a LAN networking environment, the computer system 320 can be connected to the local network 351 through a network interface or adapter 353. When used in a WAN networking environment, the computer system 320 can include a modem 354, or can be connected to a communications server on the LAN. The modem 354, which may be internal or external, is connected to the system bus 323 via the port interface 346. In a networked environment, program modules depicted relative to the computer system 320, or portions thereof, may be stored in the remote memory storage device 350.
What have been described above are examples of the present invention. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the present invention, but one of ordinary skill in the art will recognize that many further combinations and permutations of the present invention are possible. Accordingly, the present invention is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims.
This application is related to the following commonly assigned co-pending patent application entitled: “SYSTEMS AND METHODS FOR INSTRUMENTING LOOPS OF AN EXECUTABLE PROGRAM,” Attorney Docket No. 200313028-1, which is filed contemporaneously herewith and is incorporated herein by reference.