This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2006-152717, filed May 31, 2006, the entire contents of which are incorporated herein by reference.
1. Field
One embodiment of the present invention relates to a compile method for a computer program and, more particularly, to a medium for recording a compiler program that processes pieces which are obtained by dividing a program and which can be processed without branch, a compile method, and an information processing apparatus therewith.
2. Description of the Related Art
A compile process is performed in execution of a computer program. In relation to this, control flow analysis or the like is known.
Patent Document 1 (Jpn. Pat. Appln. KOKAI Publication No. 2003-330741) discloses a technique for dividing a program in units of procedures and unloading less frequently used modules to improve efficiency in use.
The conventional technique in Patent Document 1 has the following problem. A less frequently used procedure module in the program can be unloaded, but an error process or the like that is infrequently used within a single procedure cannot be excluded by itself.
A general architecture that implements the various features of the invention will now be described with reference to the drawings. The drawings and the associated descriptions are provided to illustrate embodiments of the invention and not to limit the scope of the invention.
Various embodiments according to the invention will be described hereinafter with reference to the accompanying drawings. In general, according to one embodiment of the invention, there is provided a recording medium which records, in a storage region, a compiler program which is loaded on a memory of a computer and executed to have the following functions: analyzing a source program to detect a branched part in control of the program; and dividing the source program into divided programs which are program pieces executed without the branched part and parts executed while being branched.
An embodiment of the present invention provides a compile method and an information processing apparatus which compile pieces which are obtained by dividing a program and can be processed without branch, load the pieces on a cache, and process the pieces to improve execution efficiency of the program.
One embodiment provides a recording medium which records a compiler program which is loaded on a memory of a computer and executed to have the following functions: analyzing a source program to detect a branched part in control of the program; and dividing the source program into divided programs P1 to P5 which are program pieces executed without the branched part and parts C1 to C5 executed while being branched.
There is further provided a compile method comprising: analyzing a source program to detect a branched part in control of the program; and dividing the source program into divided programs (P1 to P5) which are program pieces executed without the branched part and parts (C1 to C5) executed while being branched to compile the source program.
In this manner, an instruction code or the like of a program such as a less-frequently used error process is less frequently loaded on a cache memory or the like. For this reason, efficiency in execution of the program can be improved.
An embodiment of the present invention will be described below in detail with reference to the accompanying drawings.
In general, a large part of the execution time of a program is spent in only a small part of the program, and this part is preferably executed at high speed. Frequently executed instruction code is therefore commonly placed in a high-speed cache memory for execution. However, when the code size of a program increases, the instruction cache runs short of capacity, and cache misses occur frequently, which decreases execution efficiency. In particular, in an embedded device, the cache memory occupies a large area of an LSI and cannot easily be increased in size, so the cache memory must be used efficiently.
Looking at the details of a large-scale program, a large part of the program code is occupied by error handling in many cases. These processes are not executed on the normal path. For this reason, when the line size of the cache memory is sufficiently small, this program code does not occupy much of the cache memory. However, with an increase in memory speed, the line size of a cache tends to increase. For this reason, some error handling code may be loaded on the cache.
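The effect of the line size can be illustrated with a small calculation. The following is a minimal sketch under assumed numbers, not a measurement: the code layout (16 bytes of normal-path code followed by 48 bytes of error handling, repeated four times), the two line sizes, and the helper names `lines_touched` and `fetched_bytes` are all hypothetical.

```python
def lines_touched(hot_ranges, line_size):
    """Return the set of cache-line indices covered by the executed byte ranges."""
    lines = set()
    for start, end in hot_ranges:  # each range is [start, end) in bytes
        lines.update(range(start // line_size, (end - 1) // line_size + 1))
    return lines


def fetched_bytes(hot_ranges, line_size):
    """Bytes brought into the cache when only the hot ranges are executed."""
    return len(lines_touched(hot_ranges, line_size)) * line_size


# Assumed layout: every 64-byte block holds 16 bytes of normal-path code
# followed by 48 bytes of error handling that is never executed.
hot_ranges = [(i * 64, i * 64 + 16) for i in range(4)]
hot_bytes = sum(end - start for start, end in hot_ranges)

small_line_total = fetched_bytes(hot_ranges, 16)
large_line_total = fetched_bytes(hot_ranges, 64)

# Cold (error handling) bytes that end up in the cache for each line size.
cold_with_small_lines = small_line_total - hot_bytes
cold_with_large_lines = large_line_total - hot_bytes
```

Under these assumptions, the 16-byte lines bring in only normal-path code, while the 64-byte lines bring in three bytes of unused error handling for every byte that is executed.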
In a case in which a programmer explicitly uses a local memory such as a scratch pad or a local storage, code is generally DMA-transferred from a main memory onto the local memory in units of parts of a program to execute the program. In this case, when error handling is written with a syntax such as if-then-else, the error handling code is transferred to the local memory together with the code of the normal path, which lowers the use efficiency of the local memory.
In general, a configuration of a program can be classified into procedures A, B, . . . , as will be described in
Therefore, in the compile method according to the embodiment of the present invention, as shown in
In this manner, error handling which is not normally executed is not loaded on a cache, and the program of the normal path can be arranged in contiguous regions. For this reason, the execution efficiency of the program is improved. When program code is executed while being swapped in and out of a limited memory space such as a scratch pad or a local storage, the number of program replacements can be reduced.
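The reduction in the number of replacements can be sketched with a simple simulation. This is an illustration under stated assumptions only: the procedure sizes (1 KB of normal path plus 3 KB of error handling each), the 4 KB local memory, the least-recently-used replacement policy, and the function name `count_loads` are hypothetical and are not taken from the embodiment.

```python
from collections import OrderedDict


def count_loads(trace, sizes, capacity):
    """Count transfers into a byte-limited local memory with LRU replacement."""
    resident = OrderedDict()  # unit name -> size in bytes, oldest first
    used = 0
    loads = 0
    for unit in trace:
        if unit in resident:
            resident.move_to_end(unit)  # already loaded: mark as recently used
            continue
        # Evict least recently used units until the new unit fits.
        while resident and used + sizes[unit] > capacity:
            _, freed = resident.popitem(last=False)
            used -= freed
        resident[unit] = sizes[unit]
        used += sizes[unit]
        loads += 1
    return loads


CAPACITY = 4096  # assumed local memory size in bytes

# Transfer in units of whole procedures: normal path and error handling together.
whole_sizes = {"A": 4096, "B": 4096, "C": 4096}
# Transfer in units of pieces: only the normal-path pieces are ever requested.
piece_sizes = {"A": 1024, "B": 1024, "C": 1024}

trace = ["A", "B", "C"] * 10  # procedures executed in a loop, no errors occur

loads_whole = count_loads(trace, whole_sizes, CAPACITY)
loads_pieces = count_loads(trace, piece_sizes, CAPACITY)
```

With these assumed sizes, every whole-procedure access replaces the previous procedure, whereas the three normal-path pieces fit in the local memory together and are each transferred once.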
(Information Processing Apparatus to Realize Compile Method)
An example of an information processing apparatus 20 to realize a compile method according to the embodiment of the present invention will be described below with reference to
(Operation)
First, a compile process performed in the information processing apparatus 20 will be described in detail with reference to the flow chart in
On the basis of this, a program is divided into parts (to be referred to as basic blocks) which are serially executed by control of the program without branch (step S13). In
The divided pieces are encoded into object code as PIC code. At the same time, code is generated that expresses the control flow as a virtual instruction sequence of a virtual machine. The pieces 13 of the program code and an instruction code 14 of the virtual machine constitute the program substance. In general, the pieces 13 and the instruction code 14 are stored in the secondary storage 21 such as a hard disk. These compile processes are performed on all the programs (step S14).
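The division into straight-line pieces and a separate control-flow sequence can be sketched as follows. This is a minimal illustration and not the compiler of the embodiment: the toy instruction format (`"op"`, `"branch"`, `"jump"`, `"label"` tuples), the `("run", index)` entry, and the function name `split_program` are assumptions introduced here.

```python
def split_program(instructions):
    """Split a flat instruction list into branch-free pieces and a control sequence.

    Plain operations are collected into pieces. Every branch, jump, or label
    closes the current piece and is kept in the control sequence, which refers
    to the pieces by index through ("run", index) entries.
    """
    pieces = []    # each piece is a list of operations executed without branching
    control = []   # virtual instruction sequence describing the control flow
    current = []

    def close_piece():
        if current:
            pieces.append(list(current))
            control.append(("run", len(pieces) - 1))
            current.clear()

    for ins in instructions:
        if ins[0] == "op":
            current.append(ins[1])
        else:
            close_piece()
            control.append(ins)
    close_piece()
    return pieces, control


# Hypothetical source: a normal path with one error check.
program = [
    ("op", "a = read()"),
    ("op", "b = a + 1"),
    ("branch", "b < 0", "error"),
    ("op", "write(b)"),
    ("jump", "end"),
    ("label", "error"),
    ("op", "log('negative')"),
    ("label", "end"),
]

pieces, control = split_program(program)
```

In this sketch the error handling ends up in its own piece, so a loader that follows the control sequence never needs to transfer it unless the branch is actually taken.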
In execution of the program, as shown in the flow chart in
In execution of the program piece X by the instruction J, more specifically, as shown in the flow chart in
These processes are executed by processing a virtual code by a virtual machine such as Java™. More specifically, as shown in the flow chart in
In this case, the virtual code can be executed by an interpreter of the virtual machine, or can be compiled into native code by a JIT compiler and then executed.
Furthermore, in the JIT compiling, the code arrangement in a local memory can also be optimized according to the calling flow of the program pieces.
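The interpreter-style execution described above can be sketched as a small virtual machine that walks the control sequence and fetches a piece into the local memory only when it is about to run. This is a hedged illustration: the class `LocalMemory`, the function `run`, the instruction names (`"run"`, `"branch_if"`, `"jump"`, `"label"`), the piece names, and the least-recently-used eviction are assumptions, and the pieces are modeled as Python functions rather than native code.

```python
from collections import OrderedDict


class LocalMemory:
    """Fixed number of piece slots with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.resident = OrderedDict()  # piece name -> loaded piece
        self.loads = 0

    def fetch(self, name, secondary):
        """Return the piece, transferring it from secondary storage if absent."""
        if name in self.resident:
            self.resident.move_to_end(name)
        else:
            if len(self.resident) >= self.capacity:
                self.resident.popitem(last=False)
            self.resident[name] = secondary[name]
            self.loads += 1
        return self.resident[name]


def run(control, secondary, memory, state):
    """Interpret the control sequence, loading pieces on demand."""
    labels = {ins[1]: i for i, ins in enumerate(control) if ins[0] == "label"}
    pc = 0
    while pc < len(control):
        ins = control[pc]
        if ins[0] == "run":
            memory.fetch(ins[1], secondary)(state)
        elif ins[0] == "branch_if" and ins[1](state):
            pc = labels[ins[2]]
            continue
        elif ins[0] == "jump":
            pc = labels[ins[1]]
            continue
        pc += 1
    return state


# Hypothetical pieces kept in secondary storage.
def piece_double(state):
    state["x"] = state["input"] * 2

def piece_finish(state):
    state["out"] = state["x"] + 1

def piece_error(state):
    state["out"] = "error"

secondary = {"P1": piece_double, "P2": piece_finish, "E1": piece_error}

control = [
    ("run", "P1"),
    ("branch_if", lambda s: s["x"] < 0, "err"),
    ("run", "P2"),
    ("jump", "end"),
    ("label", "err"),
    ("run", "E1"),
    ("label", "end"),
]

memory = LocalMemory(capacity=2)
result = run(control, secondary, memory, {"input": 3})
```

On the normal path in this sketch, only the pieces `P1` and `P2` are transferred, and the error piece `E1` stays in secondary storage.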
An example of a concrete compile process of a program will be described below with reference to
When codes are transferred to a high-speed memory in units of modules, both the codes are loaded on the memory. In the embodiment of the present invention, only one of the codes is loaded on the high-speed memory.
An example obtained after the program is divided is shown in the right part in
Since the execution code on the VM side can be executed at high speed by being just-in-time compiled, the VM code itself does not necessarily have to be used.
In the above embodiment, the program is divided into pieces by a compiler. However, it is preferable that the control structure and the program sequences be described separately at the stage of writing the program. Furthermore, in this case, by using a language which describes parts that can be executed in parallel with each other, a VM which executes the control flow can effectively assign the parts which can be executed in parallel to respective threads.
As described above, by a compile process and an execution process of a program according to the embodiment of the present invention, cache misses for the instruction code of a large-scale program can be reduced. In an environment in which a program is executed while being loaded onto and unloaded from a high-speed memory such as a scratch pad or a local storage, the number of times of loading/unloading the program can be reduced.
According to various embodiments described above, a person skilled in the art can realize the present invention. Furthermore, various modifications of these embodiments can be easily conceived by the person skilled in the art. The present invention can be applied to various embodiments without inventive ability. Therefore, the present invention ranges in scope without departing from the disclosed principles and novel characteristics, and the present invention is not limited to the embodiments described above.
While certain embodiments of the inventions have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel methods and systems described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the methods and systems described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.
Number | Date | Country | Kind |
---|---|---|---|
2006-152717 | May 2006 | JP | national |