A buffer overflow occurs when a program or process tries to store more data in a buffer (a temporary data storage area) than the buffer was intended to hold. When a program writes past the bounds of a buffer, the extra data typically overflows into one or more adjacent buffers, corrupting or overwriting the valid data held there. Buffer overflows often occur as a result of programming errors. Buffer overflows also are exploited by malicious code developers to bypass standard intrusion security measures and attack computer systems.
There are four general types of buffer overflow avoidance techniques: secure coding, non-executable stacks, array bounds checking, and pointer integrity checking. Only array bounds checking, however, can provide complete protection against all forms of buffer overflow risks. Some compilers introduce a software bounds check before each array reference. The bounds check verifies the array subscript expression against the lower and upper bounds of the array. If a bounds violation is detected, program execution is terminated and an error is reported. This type of complete array bounds checking, however, is difficult to implement and incurs high overhead that significantly slows system performance.
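The software bounds check described above can be illustrated with a minimal sketch in C. The helper names index_in_bounds and checked_store are hypothetical and are not taken from any particular compiler; the sketch only shows the kind of per-reference test whose overhead is at issue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 when index lies in [lower, upper], else 0. */
static int index_in_bounds(long index, long lower, long upper)
{
    return index >= lower && index <= upper;
}

/* Hypothetical checked store: verifies the subscript before every write and
 * terminates execution with an error report on a bounds violation. */
static void checked_store(int *array, long length, long index, int value)
{
    assert(array != NULL);
    if (!index_in_bounds(index, 0, length - 1)) {
        fprintf(stderr, "bounds violation: index %ld, length %ld\n",
                index, length);
        abort();
    }
    array[index] = value;
}
```

Because the comparison runs on every array reference, its cost grows with the number of accesses, which is the overhead the paragraph above refers to.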
In one aspect, the invention features a machine-implemented method of processing program code in accordance with which machine-executable code is generated. The machine-executable code includes machine-readable instructions for detecting a memory address bounds violation by the program code based on a determination that a boundary memory address stored in a hardware table has been accessed during execution of the program code. The boundary memory address delimits a boundary for a set of memory addresses allocated for execution of the program code. The machine-executable code is stored in a machine-readable medium.
The invention also features a machine-readable medium storing machine-readable instructions for causing a machine to implement the inventive program code processing method described above.
In another aspect, the invention features a machine-implemented method of processing program code in accordance with which a boundary memory address delimiting a boundary for a set of memory addresses allocated for execution of the program code is stored in a hardware table. The program code is executed. A memory address bounds violation by the program code is detected based on a determination that the boundary memory address stored in the hardware table has been accessed during execution of the program code.
Other features and advantages of the invention will become apparent from the following description, including the drawings and the claims.
In the following description, like reference numbers are used to identify like elements. Furthermore, the drawings are intended to illustrate major features of exemplary embodiments in a diagrammatic manner. The drawings are not intended to depict every feature of actual embodiments nor relative dimensions of the depicted elements, and are not drawn to scale.
The bounds checking embodiments described in detail below leverage the data speculation hardware functionality present in modern computer processor designs to detect array bounds violations. In this way, these bounds checking embodiments avoid the performance penalties associated with software-based bounds checking schemes that check the value of every array access or that only load as much data as will fit in a designated buffer. In addition, the bounds checking embodiments described herein can detect memory address bounds violations from a calling routine in a program.
Data speculation is a process that a compiler uses to load data earlier than originally scheduled. In this way, the data required by an instruction will be available in a register when it is needed, thereby reducing or avoiding memory latency delays. Data speculation is enabled by processor instruction sets that include advanced load instructions. For example, the Intel IA-64 architecture uses an instruction called an advanced load that is executed by the processor earlier in the instruction stream than a corresponding original load instruction. In particular, the advanced load of the value at some address A may be executed before a write to potentially the same address A. When executed, an advanced load allocates an entry in a hardware structure called the advanced load address table (ALAT), which is managed by hardware. The load address, the load type and the size of the load are stored in the ALAT entry. A compiler typically inserts a load checking instruction at the instruction stream location of the original load instruction to validate the advanced load entry in the ALAT. The load checking instruction specifies the same register number, address and operand size as the corresponding advanced load instruction. When executed, the load checking operation searches the ALAT for a matching entry. If the matching entry is found, the value in the destination register is valid. If no matching entry is found, for instance because the intervening write did store to address A, the value loaded in the destination register is invalid. In that case, the required data may be reloaded from memory or some other recovery routine may be initiated, depending on the type of load checking instruction that is used.
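The interplay of advanced load, intervening store, and load check can be modeled in software. The following C sketch is a simplified model for illustration only, not a description of the actual IA-64 hardware: the table capacity, the structure layout, and the function names (alat_init, alat_advanced_load, alat_store, alat_check) are all assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ALAT_ENTRIES 32 /* assumed capacity for this sketch */

typedef struct {
    bool valid;
    unsigned reg;    /* destination register number */
    uintptr_t addr;  /* load address */
    size_t size;     /* size of the load in bytes */
} alat_entry;

typedef struct {
    alat_entry entry[ALAT_ENTRIES];
} alat;

static void alat_init(alat *t)
{
    for (size_t i = 0; i < ALAT_ENTRIES; i++)
        t->entry[i].valid = false;
}

/* Models an advanced load: records (reg, addr, size) in a free slot.
 * Returns false if the table is full (real hardware would replace an entry). */
static bool alat_advanced_load(alat *t, unsigned reg, uintptr_t addr, size_t size)
{
    assert(t != NULL && size > 0);
    size_t free_slot = ALAT_ENTRIES;
    for (size_t i = 0; i < ALAT_ENTRIES; i++) {
        if (t->entry[i].valid && t->entry[i].reg == reg)
            t->entry[i].valid = false; /* one tracked load per register */
        if (!t->entry[i].valid && free_slot == ALAT_ENTRIES)
            free_slot = i;
    }
    if (free_slot == ALAT_ENTRIES)
        return false;
    t->entry[free_slot] = (alat_entry){ true, reg, addr, size };
    return true;
}

/* Models a store: removes every entry whose byte range overlaps the store. */
static void alat_store(alat *t, uintptr_t addr, size_t size)
{
    for (size_t i = 0; i < ALAT_ENTRIES; i++) {
        alat_entry *e = &t->entry[i];
        if (e->valid && addr < e->addr + e->size && e->addr < addr + size)
            e->valid = false;
    }
}

/* Models a load check: true if a matching entry is still present,
 * meaning the speculatively loaded value is still valid. */
static bool alat_check(const alat *t, unsigned reg, uintptr_t addr, size_t size)
{
    for (size_t i = 0; i < ALAT_ENTRIES; i++) {
        const alat_entry *e = &t->entry[i];
        if (e->valid && e->reg == reg && e->addr == addr && e->size == size)
            return true;
    }
    return false;
}
```

In this model, a check that returns false corresponds to the case in which the intervening write stored to address A and the data must be reloaded or a recovery routine run.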
The execution resources 68 receive data and instructions from the various memory caches 70-74 and the system memory 14, which are arranged in a memory hierarchy with lower cache levels being closer to the processor core. Load and store operations respectively transfer data to and from register files. A load operation searches the memory hierarchy for data at a specified memory address, and returns the data to a register file, whereas a store operation writes data from a register file to one or more levels of the memory hierarchy.
The code generator 84 determines one or more boundary memory addresses delimiting respective boundaries for the set of memory addresses (block 96). As shown in
After the one or more boundary memory addresses have been determined (block 96), the code generator 84 generates machine-readable instructions for storing each of the boundary memory addresses in a hardware table (block 106). Code generator 84 generates machine-readable instructions for executing the program code 80 (block 108). Code generator 84 also generates machine-readable instructions for validating each of the boundary memory addresses stored in the hardware table (block 110).
For example, assume that a contiguous chunk of memory with addresses ranging from 4 to 8 is allocated by the code generator 84 for a particular program code 80. In this example, the code generator 84 determines that the upper and lower boundary memory addresses are 9 and 3, respectively, which are adjacent to the maximum and minimum addresses in the allocated memory range. In one exemplary implementation that is compatible with the Intel IA-64 architecture, the code generator 84 generates the following sequence of machine-readable instructions in accordance with the memory of FIG. 5. An address is inserted into the hardware table by the ld.a instruction. Any store instruction to that address removes the entry from the table. A subsequent check instruction determines whether the entry is still in the hardware table.
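One way a code generator might emit such a sequence is sketched below in C. The function emit_guarded_sequence is hypothetical, and the ordering of the emitted instructions is an assumption. The mnemonics follow the simplified "ld.a [x]" and "ld.c [x], target" notation of this description rather than actual IA-64 assembler syntax.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical emitter: given the minimum and maximum allocated addresses,
 * derives the two boundary addresses (one below, one above) and writes the
 * guard sequence into out. Returns the number of characters written, or -1
 * if the buffer is too small. */
static int emit_guarded_sequence(char *out, size_t out_size,
                                 long min_addr, long max_addr)
{
    assert(out != NULL && min_addr <= max_addr);
    long lower = min_addr - 1; /* lower boundary memory address */
    long upper = max_addr + 1; /* upper boundary memory address */
    int n = snprintf(out, out_size,
                     "ld.a [%ld]\n"
                     "ld.a [%ld]\n"
                     "// ... instructions for program code 80 ...\n"
                     "ld.c [%ld], target\n"
                     "ld.c [%ld], target\n",
                     lower, upper, lower, upper);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return n;
}
```

For the allocated range 4 to 8, this sketch produces advanced loads of addresses 3 and 9 ahead of the program code and checks of the same two addresses after it.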
In this implementation, the machine-readable instruction “ld.a [x]” is an advanced load instruction that enters the memory address x into the ALAT hardware table. The machine-readable instruction “ld.c [x], target” is a speculation (or advanced) check instruction that validates the boundary memory address x stored in the ALAT hardware table. The speculation check instruction determines whether the entry for address x is still present in the hardware table, that is, whether a store to that address has occurred since the advanced load. Since the addresses being checked are outside the specified bounds, the absence of an address from the table denotes a bounds violation. In that case, the code branches to the address specified by “target” to take remedial action.
In the code example presented above, if the machine-readable instructions corresponding to program code 80 store to memory address 3, then memory address 3 is removed from the ALAT hardware table and the corresponding “ld.c [3]” ALAT entry check fails. Similarly, if the machine-readable instructions corresponding to program code 80 store to memory address 9, then memory address 9 is removed from the ALAT hardware table and the corresponding “ld.c [9]” ALAT entry check fails.
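The behavior of the example with allocated addresses 4 to 8 and boundary addresses 3 and 9 can be simulated in software. The following C sketch is only a model under stated assumptions: the machine structure, the sixteen-cell memory, and the functions guard_insert, machine_store, guard_check, and run_example are hypothetical stand-ins for the hardware table and the instructions described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MEM_CELLS 16  /* assumed size of the toy memory */
#define MAX_GUARDS 4  /* assumed capacity of the toy hardware table */

typedef struct {
    int mem[MEM_CELLS];
    size_t guard_addr[MAX_GUARDS];
    bool guard_valid[MAX_GUARDS];
} machine;

static void machine_init(machine *m)
{
    for (size_t i = 0; i < MEM_CELLS; i++)
        m->mem[i] = 0;
    for (size_t i = 0; i < MAX_GUARDS; i++)
        m->guard_valid[i] = false;
}

/* Models "ld.a [addr]": enters a boundary address into the table. */
static bool guard_insert(machine *m, size_t addr)
{
    for (size_t i = 0; i < MAX_GUARDS; i++) {
        if (!m->guard_valid[i]) {
            m->guard_addr[i] = addr;
            m->guard_valid[i] = true;
            return true;
        }
    }
    return false; /* table full */
}

/* Models a store: writes memory and removes any table entry for addr. */
static void machine_store(machine *m, size_t addr, int value)
{
    assert(addr < MEM_CELLS);
    m->mem[addr] = value;
    for (size_t i = 0; i < MAX_GUARDS; i++)
        if (m->guard_valid[i] && m->guard_addr[i] == addr)
            m->guard_valid[i] = false;
}

/* Models "ld.c [addr], target": true if the entry is still present. */
static bool guard_check(const machine *m, size_t addr)
{
    for (size_t i = 0; i < MAX_GUARDS; i++)
        if (m->guard_valid[i] && m->guard_addr[i] == addr)
            return true;
    return false;
}

/* Guards addresses 3 and 9, stores to every address in [first, last],
 * then checks both guards. Returns true if a bounds violation is detected. */
static bool run_example(size_t first, size_t last)
{
    machine m;
    machine_init(&m);
    if (!guard_insert(&m, 3) || !guard_insert(&m, 9))
        return true; /* treat a full table as a failure in this sketch */
    for (size_t addr = first; addr <= last; addr++)
        machine_store(&m, addr, 0xAB);
    return !guard_check(&m, 3) || !guard_check(&m, 9);
}
```

In this model, stores confined to addresses 4 to 8 leave both entries in place, whereas a run of stores that reaches address 3 or address 9 removes the corresponding entry and the check reports a violation. A store that skips over a boundary address without touching it is not detected by this sketch.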
Referring back to
In some current implementations of the Intel IA-64 architecture, the ALAT hardware table is not part of the state of a process and entries may be removed at any time. When used to embody or implement the bounds checking embodiments described herein, the Intel IA-64 architecture should be modified so that use of the ALAT is deterministic. In this regard, the instruction set may be modified to include an additional bit that denotes that an entry in the ALAT is part of the process state to be saved and restored across context switches and that such an entry may not be replaced, only explicitly cleared.
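The effect of such an additional bit on entry replacement can be sketched as follows. This C model is an illustration under assumptions: the pinned flag, the two-entry table, and the functions table_insert, table_contains, and table_clear are hypothetical, and the sketch covers only the rule that marked entries are not replaced. It does not model saving and restoring entries across context switches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_ENTRIES 2 /* deliberately tiny so replacement is easy to trigger */

typedef struct {
    bool valid;
    bool pinned;  /* hypothetical "part of the process state" bit */
    size_t addr;
} entry;

typedef struct {
    entry slot[TABLE_ENTRIES];
} table;

static void table_init(table *t)
{
    for (size_t i = 0; i < TABLE_ENTRIES; i++)
        t->slot[i] = (entry){ false, false, 0 };
}

/* Inserts addr, preferring a free slot and otherwise replacing the first
 * unpinned entry. Fails if every slot holds a pinned entry. */
static bool table_insert(table *t, size_t addr, bool pinned)
{
    assert(t != NULL);
    size_t victim = TABLE_ENTRIES;
    for (size_t i = 0; i < TABLE_ENTRIES; i++) {
        if (!t->slot[i].valid) {
            victim = i; /* a free slot always wins */
            break;
        }
        if (!t->slot[i].pinned && victim == TABLE_ENTRIES)
            victim = i; /* remember the first replaceable entry */
    }
    if (victim == TABLE_ENTRIES)
        return false;
    t->slot[victim] = (entry){ true, pinned, addr };
    return true;
}

static bool table_contains(const table *t, size_t addr)
{
    for (size_t i = 0; i < TABLE_ENTRIES; i++)
        if (t->slot[i].valid && t->slot[i].addr == addr)
            return true;
    return false;
}

/* Explicit clear: the only way a pinned entry leaves the table. */
static void table_clear(table *t, size_t addr)
{
    for (size_t i = 0; i < TABLE_ENTRIES; i++)
        if (t->slot[i].valid && t->slot[i].addr == addr)
            t->slot[i].valid = false;
}
```

With entries marked in this way, a missing boundary address can be attributed to a store or an explicit clear rather than to an arbitrary replacement.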
Other embodiments are within the scope of the claims. For example,
The systems and methods described herein are not limited to any particular hardware or software configuration, but rather they may be implemented in any computing or processing environment, including in digital electronic circuitry or in computer hardware, firmware, or software. In general, the systems may be implemented, in part, in a computer process product tangibly embodied in a machine-readable storage device for execution by a computer processor. In some embodiments, these systems preferably are implemented in a high level procedural or object oriented processing language; however, the algorithms may be implemented in assembly or machine language, if desired. In any case, the processing language may be a compiled or interpreted language. The methods described herein may be performed by a computer processor executing instructions organized, for example, into process modules to carry out these methods by operating on input data and generating output. Suitable processors include, for example, both general and special purpose microprocessors. Generally, a processor receives instructions and data from a read-only memory and/or a random access memory. Storage devices suitable for tangibly embodying computer process instructions include all forms of non-volatile memory, including, for example, semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM. Any of the foregoing technologies may be supplemented by or incorporated in specially designed ASICs (application-specific integrated circuits).