This application is related to the following:
U.S. patent application Ser. No. 10/101,296, filed Mar. 18, 2002 in the name of Eduard de Jong, entitled “Method and Apparatus for Deployment of High Integrity Software Using Initialization Order and Calling Order Constraints”, commonly assigned herewith.
U.S. patent application Ser. No. 10/101,289, filed Mar. 18, 2002 in the name of Eduard de Jong, entitled “Method and Apparatus for Deployment of High Integrity Software Using Reduced Dynamic Memory Allocation”, commonly assigned herewith.
The present invention relates to the field of computer science. More particularly, the present invention relates to a method and apparatus for deployment of high integrity software using static procedure return addresses.
High integrity software is software that must be trusted to work dependably in some critical function, and whose failure to do so may have catastrophic results, such as serious injury, loss of life or property, business failure or breach of security. Some examples include software used in safety systems of nuclear power plants, medical devices, electronic banking, air traffic control, automated manufacturing, and military systems. The importance of high-quality, low-defect software is apparent in such critical situations. However, high integrity software is also important in more mundane business areas where defective software is often the norm.
Formal verification is the process of checking whether a design satisfies some requirements or properties. In order to formally verify a design, it must first be converted into a more condensed, verifiable format. The design is specified as a set of interacting systems, each having a finite number of configurations or states. States and transitions between states constitute finite state machines (FSMs). The entire system is an FSM that can be obtained by composing the FSMs associated with each component. The first step in verification consists of obtaining a complete FSM description of the system. Given a present state (or current configuration), the next state (or successive configuration) of an FSM can be written as a function of its present state and inputs (transition function or transition relation). Formal verification attempts to execute every possible computational path with every possible state value to prove every possible state is consistent.
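By way of illustration only, the exhaustive exploration described above may be sketched in Python as follows. The two-component system (a hypothetical 2-bit counter composed with a latch), the input names `tick` and `reset`, and the checked property are assumptions introduced for this sketch and are not taken from any particular verification tool.

```python
from collections import deque


def reachable_states(initial, inputs, next_state):
    """Breadth-first exploration of every state reachable from `initial`."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for symbol in inputs:
            successor = next_state(state, symbol)
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


def next_state(state, symbol):
    """Transition function of the composed system: (counter, latch)."""
    count, latched = state
    if symbol == "tick":
        count = (count + 1) % 4
    elif symbol == "reset":
        count = 0
    if count == 3:
        latched = True       # the latch is set when the counter reaches 3
    if symbol == "reset":
        latched = False      # only a reset clears the latch
    return (count, latched)


STATES = reachable_states((0, False), ["tick", "reset"], next_state)

# Property checked on every reachable state: counter == 3 implies latch set.
PROPERTY_HOLDS = all(count != 3 or latched for count, latched in STATES)
```

In this sketch seven of the eight conceivable states are reachable, and the property holds in each of them. The cost of such exploration grows with the size of the state space, which is why reducing that space aids verification.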
Software programs typically include multiple procedures. Each procedure may call another procedure or be called by another procedure. A program is recursive if at least one of its procedures may call itself, either directly or indirectly. When a calling procedure calls a called procedure, the address where program execution will resume upon completion of the called procedure needs to be stored. How this is done typically depends upon whether the programming language supports recursion.
Cyber FORTRAN (CDC Cyber 205 Fortran 66, Control Data Corporation, 1983) is an example of a non-recursive programming language. In Cyber FORTRAN, each procedure is associated with a static location used for storing its return address. When a calling procedure calls a called procedure, the calling procedure places the procedure return address in the static location used for storing the return address of the called procedure. When the called procedure completes execution, the procedure return address is obtained from the static location associated with the called procedure and program control is transferred to the procedure return address.
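The static-location scheme described above may be modeled, purely as a sketch, by a small interpreter in which each procedure owns exactly one return-address slot. The instruction names (`emit`, `call`, `ret`, `halt`), the program, and the procedure names `A` and `B` are hypothetical and do not reproduce Cyber FORTRAN or any real instruction set.

```python
def run(program, entry_points):
    """Interpret `program` with one static return-address slot per procedure."""
    return_slot = {name: None for name in entry_points}  # static locations
    trace = []
    pc = 0
    while True:
        op, arg = program[pc]
        if op == "emit":
            trace.append(arg)
            pc += 1
        elif op == "call":
            return_slot[arg] = pc + 1   # caller stores the resume address
            pc = entry_points[arg]
        elif op == "ret":
            pc = return_slot[arg]       # callee reads its own static slot
        elif op == "halt":
            return trace, return_slot
        else:
            raise ValueError(f"unknown instruction {op!r}")


PROGRAM = [
    ("emit", "main:start"),  # address 0
    ("call", "A"),           # address 1
    ("emit", "main:end"),    # address 2
    ("halt", None),          # address 3
    ("emit", "A:start"),     # address 4, entry of A
    ("call", "B"),           # address 5
    ("emit", "A:end"),       # address 6
    ("ret", "A"),            # address 7
    ("emit", "B"),           # address 8, entry of B
    ("ret", "B"),            # address 9
]
ENTRY_POINTS = {"A": 4, "B": 8}

TRACE, SLOTS = run(PROGRAM, ENTRY_POINTS)
```

Because each procedure has a single slot, a second activation of the same procedure before the first returns would overwrite the stored address, which is why this scheme applies only where recursion cannot occur.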
Computer languages that allow recursion typically use a portion of memory called a stack to maintain program state information between procedure calls. A stack is a last-in-first-out storage structure. One can put a new item on top of the stack at any time, and whenever one attempts to retrieve an item from the top of the stack it is always the one most recently added to the stack. A call stack is a stack used primarily to store procedure return addresses. When a calling procedure calls another procedure, the calling procedure places on the call stack the address where program execution should resume once the called procedure completes execution. When the called procedure completes execution, the procedure return address is obtained from the call stack and program execution resumes at the procedure return address.
One or more other stacks may be used to store parameter values (parameter stack) and local variables declared in called procedures (local stack). However, memory management complexity increases as the number of program stacks increases. An improvement is made possible by merging multiple stacks such as the call stack, parameter stack and local stack into a smaller number of stacks.
Turning now to
Turning now to
Still referring to
Unfortunately, using a call stack to store procedure return addresses makes the program susceptible to program execution flow manipulation through call stack modification. For example, a malicious programmer might manipulate program execution flow by writing procedure code that pushes a value onto the call stack. The value is pushed on top of the valid procedure return address. When execution of the procedure completes, a “return-from-procedure” instruction is executed, which pops the new value from the stack and transfers program control to the address represented by the value. Manipulating a call stack in this way allows the programmer to transfer program control to any address, regardless of the address validity and regardless of whether the proper initialization has been performed prior to transferring program control. The ability to modify the call stack in such a manner makes the program unpredictable and makes program verification relatively difficult.
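The manipulation described above may be illustrated with a toy interpreter that can run the same program either with a call stack shared with procedure code or with static return-address slots. The instruction set, the addresses, and the `static_slots` switch are assumptions made for this sketch only.

```python
def run(program, entry_points, static_slots):
    """Run `program` with a writable call stack or with static return slots."""
    stack = []                                   # stack that code may push to
    slot = {name: None for name in entry_points}
    trace = []
    pc = 0
    while True:
        op, arg = program[pc]
        if op == "emit":
            trace.append(arg)
            pc += 1
        elif op == "push":
            stack.append(arg)                    # arbitrary value from code
            pc += 1
        elif op == "call":
            if static_slots:
                slot[arg] = pc + 1
            else:
                stack.append(pc + 1)
            pc = entry_points[arg]
        elif op == "ret":
            pc = slot[arg] if static_slots else stack.pop()
        elif op == "halt":
            return trace
        else:
            raise ValueError(f"unknown instruction {op!r}")


PROGRAM = [
    ("call", "A"),            # address 0
    ("emit", "after call"),   # address 1, legitimate resume point
    ("halt", None),           # address 2
    ("emit", "privileged"),   # address 3, never intended to follow A
    ("halt", None),           # address 4
    ("push", 3),              # address 5, entry of A: forged address
    ("ret", "A"),             # address 6
]
ENTRY_POINTS = {"A": 5}

WITH_CALL_STACK = run(PROGRAM, ENTRY_POINTS, static_slots=False)
WITH_STATIC_SLOTS = run(PROGRAM, ENTRY_POINTS, static_slots=True)
```

With the call stack, the forged value is popped by the return and control reaches address 3. With static slots, the pushed value never participates in the return, so execution resumes at address 1.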
Accordingly, what is needed is a solution that increases program verifiability. A further need exists for such a solution that prevents manipulation of program control flow merely by changing the call stack. A further need exists for such a solution that reduces stack management complexity.
A method for statically allocating a procedure return address includes separating a software program including multiple procedures into a cyclic part and an acyclic part, allocating a static address for the return address of a procedure in the acyclic part and modifying at least one of the procedures to refer to the static address for the procedure return address.
The accompanying drawings, which are incorporated into and constitute a part of this specification, illustrate one or more embodiments of the present invention and, together with the detailed description, serve to explain the principles and implementations of the invention.
In the drawings:
Embodiments of the present invention are described herein in the context of a method and apparatus for deployment of high integrity software using static procedure return addresses. Those of ordinary skill in the art will realize that the following detailed description of the present invention is illustrative only and is not intended to be in any way limiting. Other embodiments of the present invention will readily suggest themselves to such skilled persons having the benefit of this disclosure. Reference will now be made in detail to implementations of the present invention as illustrated in the accompanying drawings. The same reference indicators will be used throughout the drawings and the following detailed description to refer to the same or like parts.
In the interest of clarity, not all of the routine features of the implementations described herein are shown and described. It will, of course, be appreciated that in the development of any such actual implementation, numerous implementation-specific decisions must be made in order to achieve the developer's specific goals, such as compliance with application- and business-related constraints, and that these specific goals will vary from one implementation to another and from one developer to another. Moreover, it will be appreciated that such a development effort might be complex and time-consuming, but would nevertheless be a routine undertaking of engineering for those of ordinary skill in the art having the benefit of this disclosure.
In the context of the present invention, the term “network” includes local area networks, wide area networks, the Internet, cable television systems, telephone systems, wireless telecommunications systems, fiber optic networks, ATM networks, frame relay networks, satellite communications systems, and the like. Such networks are well known in the art and consequently are not further described here.
In accordance with one embodiment of the present invention, the components, processes and/or data structures may be implemented using C or C++ programs running on high performance computers (such as an Enterprise 2000™ server running Sun Solaris™ as its operating system; the Enterprise 2000™ server and Sun Solaris™ operating system are products available from Sun Microsystems, Inc. of Palo Alto, Calif.). Different implementations may be used and may include other types of operating systems, computing platforms, computer programs, firmware, computer languages and/or general-purpose machines. In addition, those of ordinary skill in the art will recognize that devices of a less general purpose nature, such as hardwired devices, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), or the like, may also be used without departing from the scope and spirit of the inventive concepts disclosed herein.
According to embodiments of the present invention, a software program is separated into a cyclic part and an acyclic part and the software program is modified to use a static address for at least one procedure return address in the acyclic part.
Many other devices or subsystems (not shown) may be connected in a similar manner. Also, it is not necessary for all of the devices shown in
For purposes of the present disclosure, a program specification refers to a model of a program design, expressed in terms of a strictly formalized language that is directly amenable to analysis using formal mathematical logic. A program specification may include one or more module specifications, where each module specification indicates other modules callable by the module. A program implementation refers to a software program written using a particular programming language.
Turning now to
Turning now to
Kernel component 410 includes executable code modules that include one or more procedures. Each module (440–460) includes an initialization procedure that initializes the module. The initialization procedure must be called before other procedures within the module (440–460) are called. When apparatus 400 is reset, reset indicator 435 sends a signal to boot manager 440. Boot manager 440 calls the initialization procedure of at least one module (440–460) in a predetermined order. As shown in the example illustrated by
Calling order constraints in system 400 correspond to the initialization order constraints. A calling module may call any module that occurs before the calling module in the initialization sequence. A special case exists for embodiments where the boot manager module 440 is an actual module rather than a placeholder. If the boot manager module 440 is an actual module, it is limited to calling the initialization procedure for any module (440–460). In the example illustrated by
Still referring to
The call stack manager 450 allocates space for static, pre-allocated return addresses. The call stack manager 450 allocates the space by making a procedure call to the memory manager 445, including the memory allocation request. Since the call stack manager 450 must call or use the services of the memory manager 445, the call stack manager 450 is placed after the memory manager in the initialization sequence. Placing the call stack manager 450 formally early in the initialization sequence guarantees memory allocation for the static return addresses. It also guarantees static allocation of a memory area for a call stack. The call allows the memory manager 445 to reserve space for the static return addresses in its formal model of memory. The logic of the call stack manager is a call stack tool, which may rewrite modules to use static locations to store procedure return addresses, much like the dynamic memory tool may rewrite modules to replace dynamic memory requests with static memory allocations.
Turning now to
System 500 may be further organized into columns of related functionality. Four columns of related functionality (520, 525, 530, 535) are shown in
Turning now to
The software program may be separated into a cyclic part and an acyclic part using a call graph of the program, where every node of the graph represents a procedure and every directed edge connecting two nodes represents one procedure calling another procedure. The call graph of a software program may include one or more cyclic parts and one or more acyclic parts. The acyclic part of the software program is represented by a directed acyclic graph (DAG) within the call graph. A calling sequence in the DAG is represented by a path from the root to a leaf. Because the calling sequence represented by such a DAG is nonrecursive, the return address of any procedure in the calling sequence occurs at most once in any call stack corresponding to the calling sequence. This allows a static address to be used for the procedure return address of each procedure in the DAG.
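One way the separation described above might be performed is sketched below. The sketch classifies a procedure as cyclic when it can reach itself through the call graph. This simple reachability test is an assumption chosen for brevity, not a method stated in this disclosure, and the procedure names in the example graph are hypothetical.

```python
def split_call_graph(calls):
    """Return (cyclic, acyclic) sets of procedure names.

    `calls` maps each procedure name to the procedures it calls directly.
    """
    def reaches_itself(start):
        pending = list(calls.get(start, ()))
        seen = set()
        while pending:
            node = pending.pop()
            if node == start:
                return True          # found a call path back to `start`
            if node not in seen:
                seen.add(node)
                pending.extend(calls.get(node, ()))
        return False

    procedures = set(calls)
    for callees in calls.values():
        procedures.update(callees)
    cyclic = {p for p in procedures if reaches_itself(p)}
    return cyclic, procedures - cyclic


# Hypothetical call graph: `expr` and `term` are mutually recursive.
CALLS = {
    "main": ["parse", "log"],
    "parse": ["expr"],
    "expr": ["term"],
    "term": ["expr", "log"],
    "log": [],
}

CYCLIC, ACYCLIC = split_call_graph(CALLS)
```

In this example `expr` and `term` form the cyclic part, while `main`, `parse` and `log` form the acyclic part and are candidates for static return addresses. For large programs a strongly connected components algorithm would give the same classification more efficiently.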
Turning now to
Referring again to
Still referring to
Referring again to reference numeral 610 of
According to one embodiment of the present invention, the procedure return address is based upon the maximum depth of the procedure return address in all configurations of the call stack that include the procedure return address. This is illustrated in more detail below with respect to
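As a minimal sketch of the maximum-depth computation, the following Python fragment walks an acyclic call graph from its root and records, for each procedure, the greatest depth at which it can appear in any calling sequence. Using that depth directly as the index of the static return-address location is one plausible reading of the embodiment above, stated here as an assumption. The graph and procedure names are hypothetical.

```python
def max_call_depths(calls, root):
    """Map each procedure reachable from `root` to its maximum call depth.

    `calls` must describe an acyclic call graph; a cycle would make the
    recursion below non-terminating.
    """
    depth = {}

    def visit(proc, current):
        if depth.get(proc, -1) >= current:
            return                   # an equal or deeper path is known
        depth[proc] = current
        for callee in calls.get(proc, ()):
            visit(callee, current + 1)

    visit(root, 0)
    return depth


# Hypothetical acyclic call graph.
CALLS = {
    "main": ["init", "work"],
    "init": ["log"],
    "work": ["helper"],
    "helper": ["log"],
    "log": [],
}

DEPTHS = max_call_depths(CALLS, "main")
```

Under this reading, `init` and `work` can share location 1: two procedures with the same maximum depth are never active at the same time, because if one called the other, even indirectly, the callee would have a strictly greater maximum depth.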
Turning now to
Still referring to
Turning now to
According to one embodiment of the present invention, a “return N” instruction is used to return control to the calling procedure, where “N” refers to the static address for the procedure return address associated with the called procedure. Depending upon the program size, the return address may be built into the encoding of the “return N” instruction. By way of example, if the depth of the call stack is less than 16 in an implementation that uses 8-bit instructions, four bits of the instruction may be used to indicate 16 different locations that contain return addresses. Encoding return address locations in this way makes manipulation of program control flow more difficult because the return address location is part of the instruction. According to another embodiment of the present invention, the “N” refers to an index into a table that includes return address locations. This is explained in more detail below with reference to
Turning now to
The illustration above with respect to an 8-bit instruction is not intended to be limiting in any way. Those of ordinary skill in the art will recognize that instructions having other sizes may be used.
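The 8-bit illustration may be sketched as follows, with the upper four bits of the instruction holding an opcode and the lower four bits holding the index “N” of one of 16 return-address locations. The opcode value `0xA`, the bit layout, and the function names are assumptions made for this sketch and are not specified by this disclosure.

```python
RETURN_OPCODE = 0xA  # hypothetical 4-bit opcode for "return N"


def encode_return(slot):
    """Pack a 4-bit slot index into an 8-bit "return N" instruction."""
    if not 0 <= slot < 16:
        raise ValueError("slot index must fit in four bits")
    return (RETURN_OPCODE << 4) | slot


def decode(instruction):
    """Split an 8-bit instruction into (opcode, slot index)."""
    return instruction >> 4, instruction & 0x0F


def execute_return(instruction, return_slots):
    """Return the address at which execution resumes for "return N"."""
    opcode, slot = decode(instruction)
    if opcode != RETURN_OPCODE:
        raise ValueError("not a return instruction")
    return return_slots[slot]


# Static return-address locations; location 3 holds resume address 0x42.
RETURN_SLOTS = [None] * 16
RETURN_SLOTS[3] = 0x42

INSTRUCTION = encode_return(3)
RESUME_ADDRESS = execute_return(INSTRUCTION, RETURN_SLOTS)
```

Because the location index is fixed in the instruction encoding, code executing within the procedure cannot change which location the return consults.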
According to one embodiment of the present invention, code modification is performed automatically by a tool such as a compiler or optimizer or the like. Those of ordinary skill in the art will recognize that other tools may be used to perform the code modification.
According to another embodiment of the present invention, a software program is modified to indicate the maximum size of a call stack during execution of the program. The maximum size of the call stack may be indicated using a parameter included in the software program. Alternatively, the maximum size of the call stack may be indicated using a “Set Call Stack Size” instruction in the software program.
According to another embodiment of the present invention, a processor such as a CPU, virtual machine or the like accepts a program for execution where the program itself includes an indication of the maximum size of the program's call stack during execution of the program. The maximum size of the call stack may be indicated using a parameter included in the software program. Alternatively, the maximum size of the call stack may be indicated using a “Set Call Stack Size” instruction in the software program. The processor may use this size information to build or allocate an area of memory for static procedure return addresses.
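A processor's use of the size indication may be sketched as below. The program is represented as a dictionary whose `max_call_stack_size` field stands in for the parameter or “Set Call Stack Size” instruction; that field name, the function names, and the bounds check are hypothetical choices made for this sketch.

```python
def allocate_return_area(program):
    """Allocate the static return-address area before execution starts."""
    size = program["max_call_stack_size"]
    if size < 0:
        raise ValueError("call stack size must be non-negative")
    return [None] * size             # fixed size for the whole execution


def store_return_address(area, slot, address):
    """Store `address` in `slot`, rejecting slots beyond the declared size."""
    if not 0 <= slot < len(area):
        raise IndexError("slot outside the declared call stack size")
    area[slot] = address


# Hypothetical program declaring that at most two return addresses are live.
PROGRAM = {"max_call_stack_size": 2, "code": []}

AREA = allocate_return_area(PROGRAM)
store_return_address(AREA, 0, 0x10)
store_return_address(AREA, 1, 0x24)
```

Since the area never grows after allocation, the memory devoted to return addresses is known before the program runs.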
According to one embodiment of the present invention, a software program modified to use a static address for at least one procedure return address is targeted for execution on a resource-constrained device. Resource-constrained devices are generally considered to be those that are relatively restricted in memory and/or computing power or speed, as compared to typical desktop computers and the like. According to one embodiment of the present invention, the resource-constrained device comprises a smart card. According to another embodiment of the present invention, the smart card comprises a Java Card™ technology-enabled smart card. The invention can be used with other resource-constrained devices including, but not limited to, cellular telephones, boundary scan devices, field programmable devices, personal digital assistants (PDAs) and pagers, as well as other miniature or small-footprint devices. The invention can also be used on devices that are not resource-constrained.
Embodiments of the present invention have a number of advantages. Preventing program control flow modification through call stack manipulation increases program predictability and verifiability. Fixing the amount of memory used for procedure return addresses also increases program verifiability by reducing the state space subject to verification. Additionally, eliminating the call stack reduces stack memory allocation and deallocation.
While embodiments and applications of this invention have been shown and described, it would be apparent to those skilled in the art having the benefit of this disclosure that many more modifications than mentioned above are possible without departing from the inventive concepts herein. The invention, therefore, is not to be restricted except in the spirit of the appended claims.
Number | Date | Country |
---|---|---|
20030177474 A1 | Sep 2003 | US |