This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2018-116777, filed on Jun. 20, 2018, the entire contents of which are incorporated herein by reference.
The embodiments discussed herein are related to an information processing apparatus, a computer-readable recording medium storing therein a compiler program, and a compiling method.
For example, a compiler (hereinafter also referred to as a compiler program) that compiles source code written in C++ generates object code from selected source code used as an input. Specifically, the compiler generates an intermediate language representation from the selected source code by, for example, analyzing syntax used in the source code and semantics of the source code. The compiler then, for example, optimizes the generated intermediate language representation and generates object code from the optimized intermediate language representation. Compilation performed in this way makes it possible, for example, to reduce the time taken to execute the object code and the resources used to execute it (see, for example, Japanese Laid-open Patent Publication Nos. 2008-217134, 2007-328692, and 2004-362216).
According to an aspect of the embodiments, an information processing apparatus includes a memory and a processor coupled to the memory. The processor is configured to: when source code includes an instruction for storing units of data in an area of an N-dimensional variable-length array (N being an integer equal to or greater than 2), generate object code in the memory to cause the units of data to be stored in an area of an N-dimensional fixed-length array instead of the area of the N-dimensional variable-length array; and, when the source code includes an instruction for successively accessing the units of data stored in the area of the N-dimensional variable-length array, generate the object code in the memory to cause the units of data stored in the area of the N-dimensional fixed-length array to be stored contiguously in an area of a one-dimensional fixed-length array.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
The source code described above may include, for example, an instruction for accessing units of data stored in discrete memory locations (for example, data stored in a multidimensional variable-length array). The instructions may include “for”, “do while”, and/or “while” statements, and the like. The instructions may further include a function and/or an expression. Executing object code generated from source code including such an instruction may cause a large number of cache misses, and as a result, a degradation in the efficiency of executing the instruction may occur in a central processing unit (CPU) coupled to a data cache.
The compiler therefore preferably generates object code with which it is possible to suppress a decrease in the cache hit rate when the object code is generated from source code including an instruction with which units of data stored in discrete memory locations are successively accessed.
Configuration of Information Processing System
In the information processing apparatus 1, various functions including, for example, a compiler 11 and a linker 12, are implemented by a CPU (not illustrated) and various programs organically cooperating with each other.
The compiler 11 generates, for example, object code 133 from source code 131 stored in an information storage area 130. The compiler 11 includes, for example, an analysis unit 11a, an optimization unit 11b, and a code generation unit 11c. The analysis unit 11a generates an intermediate language representation 132 from the source code 131 by analyzing syntax and semantics of sequences of characters in the source code 131. The optimization unit 11b optimizes the intermediate language representation 132 generated by the analysis unit 11a. The code generation unit 11c generates the object code 133 from the intermediate language representation 132 optimized by the optimization unit 11b.
Specifically, the analysis unit 11a generates, for example, the intermediate language representation 132 of a library (hereinafter also referred to as the intermediate language representation 132a) from the source code 131 of the library (hereinafter also referred to as the source code 131a) as illustrated in
The analysis unit 11a also generates, for example, the intermediate language representation 132 of an application (hereinafter also referred to as the intermediate language representation 132b) from the source code 131 of the application (hereinafter also referred to as the source code 131b) as illustrated in
The linker 12, for example, generates an executable file 134 by linking the object code 133a generated from the source code 131a and the object code 133b generated from the source code 131b as illustrated in
The source code 131b described above may include, for example, an instruction for accessing units of data stored in discrete memory locations (for example, data stored in a multidimensional variable-length array). Executing the object code 133b generated from the source code 131b including such an instruction may cause a large number of cache misses, and as a result, a degradation in the efficiency of executing the instruction may occur in the CPU (not illustrated) of the information processing apparatus 1.
The compiler 11 therefore preferably generates object code with which it is possible to suppress a decrease in the cache hit rate when the object code 133b is generated from the source code 131b including such an instruction with which units of data stored in discrete memory locations are successively accessed.
In this respect, when compiling the source code 131b, in a case where the source code 131b includes an instruction for storing units of data (hereinafter also referred to as particular data) in an N-dimensional variable-length array (N is an integer equal to or greater than 2), the compiler 11 of the present embodiment generates the object code 133b so as to cause the particular data to be stored in an N-dimensional fixed-length array instead of the N-dimensional variable-length array.
When compiling the source code 131b, in a case where the source code 131b includes an instruction for successively accessing units of data stored in an N-dimensional variable-length array, the compiler 11 generates the object code 133b so as to cause the units of data stored in the N-dimensional fixed-length array to be stored contiguously in a one-dimensional fixed-length array and cause the units of data stored in the one-dimensional fixed-length array to be successively accessed.
In other words, in a case where the source code 131 includes an instruction for storing units of data in an N-dimensional variable-length array, the compiler 11 of the present embodiment generates the object code 133 by converting the instruction for storing units of data in an N-dimensional variable-length array into an instruction for storing units of data contiguously in a one-dimensional fixed-length array.
In this manner, the compiler 11 is able to generate the object code 133b in which access to data stored in non-contiguous memory locations is suppressed. The compiler 11 is therefore able to generate the object code 133b with which a higher cache hit rate in execution of the object code 133b is achieved.
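The effect described above can be illustrated by hand, outside any compiler. The following is a minimal sketch, not the embodiment's generated code: it assumes the N-dimensional variable-length array is a `std::vector<std::vector<int>>`, and the names `FlatArray`, `flatten_rows`, `sum_successively`, and the capacity `kCapacity` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical capacity of the one-dimensional fixed-length array.
constexpr std::size_t kCapacity = 64;

struct FlatArray {
    std::array<int, kCapacity> values{};  // contiguous storage
    std::size_t size = 0;                 // number of units actually stored
};

// Copies every unit of a jagged (variable-length) two-dimensional array
// into one contiguous buffer, row by row.
inline FlatArray flatten_rows(const std::vector<std::vector<int>>& v) {
    FlatArray flat;
    for (const auto& row : v) {
        for (int unit : row) {
            assert(flat.size < kCapacity && "assumed capacity exceeded");
            flat.values[flat.size++] = unit;
        }
    }
    return flat;
}

// Successive access now walks a single contiguous range instead of
// following one separately allocated buffer per row.
inline long sum_successively(const FlatArray& flat) {
    long total = 0;
    for (std::size_t k = 0; k < flat.size; ++k) total += flat.values[k];
    return total;
}
```

Each inner `std::vector` normally owns its own heap allocation, so a nested loop over the original array may touch unrelated cache lines at every row boundary, whereas the loop in `sum_successively` reads adjacent addresses only.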
Hardware Configuration of Information Processing System
Next, a hardware configuration of the information processing system 10 is described.
The information processing apparatus 1 includes a CPU 101 as a processor, a memory 102, an external interface (an input/output (I/O) unit) 103, and a recording medium 104. These components are coupled to each other via a bus 105. The processor may be a single CPU, multiple CPUs, or a multi-core CPU. The CPU 101 may include a data cache. The data cache is coupled to the CPU 101 and the memory 102.
The recording medium 104 includes, for example, a program storage area (not illustrated) for storing a program 110 for performing processing for compiling the source code 131 (hereinafter also referred to as compiling processing). The recording medium 104 also includes, for example, the information storage area 130 (hereinafter also referred to as the memory unit 130 or the memory 130) that is used when compiling processing is performed. The recording medium 104 may be, for example, a hard disk drive (HDD).
The CPU 101 performs compiling processing by running the program 110 loaded into the memory 102 from the recording medium 104.
The external interface 103 communicates with, for example, the operating terminal 3.
Functions of Information Processing System
Next, functions of the information processing system 10 are described.
In the information processing apparatus 1, as illustrated in
The information processing apparatus 1 stores the source code 131, the intermediate language representation 132, the object code 133, the executable file 134, and address information 135 in the information storage area 130 as illustrated in
The array specification unit 111 of the analysis unit 11a specifies, among the statements of the source code 131b stored in the information storage area 130, a declaration about an N-dimensional variable-length array, an instruction for storing units of data in an N-dimensional variable-length array, and an instruction for successively accessing units of data in an N-dimensional variable-length array.
When the array specification unit 111 specifies the declaration about an N-dimensional variable-length array and the instructions, the type conversion unit 112 of the optimization unit 11b converts information about an N-dimensional variable-length array into information about an N-dimensional fixed-length array in the intermediate language representation 132b generated from the source code 131b by the analysis unit 11a.
When the array specification unit 111 specifies the declaration about an N-dimensional variable-length array and the instructions, the storage changing unit 113 of the optimization unit 11b modifies the intermediate language representation 132b generated by the analysis unit 11a so as to cause units of data to be stored in an N-dimensional fixed-length array instead of an N-dimensional variable-length array.
When the array specification unit 111 specifies the declaration about an N-dimensional variable-length array and the instructions, the access destination changing unit 114 of the optimization unit 11b modifies the intermediate language representation 132b generated by the analysis unit 11a so as to cause units of data stored in an N-dimensional fixed-length array to be stored contiguously in a one-dimensional fixed-length array and cause the units of data stored in the one-dimensional fixed-length array to be successively accessed. The address information 135 will be described later.
Next, an outline of a first embodiment is described.
As illustrated in
When the timing of starting compilation of the source code 131b is reached (YES in S1), the compiler 11 determines whether the source code 131b (the intermediate language representation 132b) includes an instruction for storing particular data in an N-dimensional variable-length array (S2).
In a case where it is determined that an instruction for storing particular data in an N-dimensional variable-length array is included (YES in S3), the compiler 11 generates the object code 133b from the source code 131b (the intermediate language representation 132b) so as to cause the particular data to be stored in an N-dimensional fixed-length array instead of an N-dimensional variable-length array (S4).
The compiler 11 subsequently determines whether the source code 131b includes an instruction for successively accessing units of data stored in an N-dimensional variable-length array (S5).
In a case where it is determined that an instruction for successively accessing units of data stored in an N-dimensional variable-length array is included (YES in S6), the compiler 11 generates the object code 133b so as to cause the units of data stored in an N-dimensional fixed-length array to be stored contiguously in a one-dimensional fixed-length array and cause the units of data stored in the one-dimensional fixed-length array to be successively accessed (S7).
Specifically, as illustrated in
Furthermore, as illustrated in
In addition, the optimization unit 11b optimizes the intermediate language representation 132b generated by the analysis unit 11a further as desired. Subsequently, the code generation unit 11c generates the object code 133b from the intermediate language representation 132b optimized by the optimization unit 11b.
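The decisions in S2 to S7 can be summarized as a small planning function. This is a sketch under stated assumptions: `SourceFeatures` and `plan_rewrites` are hypothetical names, the rewrites are represented only as text labels, and the sketch assumes that S5 is reached only after S4, which is the path the outline describes (the behavior on a NO branch is not modeled beyond skipping the step).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical summary of what the compiler found in the source code.
struct SourceFeatures {
    int dimensions;              // N of the variable-length array
    bool stores_into_array;      // result of the determination in S2/S3
    bool accesses_successively;  // result of the determination in S5/S6
};

// Returns, as labels, the rewrites that the outline (S2 to S7) would select.
inline std::vector<std::string> plan_rewrites(const SourceFeatures& f) {
    assert(f.dimensions >= 1);
    std::vector<std::string> plan;
    if (f.dimensions < 2) return plan;  // the technique targets N >= 2
    if (f.stores_into_array) {          // YES in S3
        plan.push_back("S4: store into an N-dimensional fixed-length array");
        if (f.accesses_successively) {  // YES in S6
            plan.push_back("S7: copy contiguously into a one-dimensional "
                           "fixed-length array and access that array");
        }
    }
    return plan;
}
```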
Specific Example of Processing in S4
Next, a specific example of the processing in S4 is described.
In the examples in
Firstly, the array v illustrated in
The array iv1 illustrated in
The array iv2 illustrated in
The array iv3 illustrated in
Next, the array data illustrated in
The array data illustrated in
The array data illustrated in
For example, when the source code 131b includes an instruction for storing units of data in the array v, the compiler 11 modifies the intermediate language representation 132b so as to cause units of data to be stored not in the array v (not stored as illustrated in
Specific Example of Processing in S7
Next, a specific example of the processing in S7 is described.
For example, when the source code 131b includes an instruction for successively accessing the units of data stored in the array v, the compiler 11 causes the units of data stored in the array data as a two-dimensional fixed-length array (stored as illustrated in
In other words, in a case where the source code 131 includes an instruction for storing units of data in an N-dimensional variable-length array, the compiler 11 of the present embodiment generates the object code 133 by converting the instruction for storing units of data in an N-dimensional variable-length array into an instruction for storing units of data contiguously in a one-dimensional fixed-length array.
In this manner, the compiler 11 is able to generate the object code 133b in which access to data stored in non-contiguous memory locations is suppressed. The compiler 11 is therefore able to generate the object code 133b with which a higher cache hit rate in execution of the object code 133b is achieved.
Next, the first embodiment is described in detail.
Compiling Processing Performed by Analysis Unit of Compiler
Firstly, compiling processing performed by the analysis unit 11a of the compiler 11 is described.
As illustrated in
When the timing of starting compiling processing is reached (YES in S11), the array specification unit 111 of the analysis unit 11a specifies, among the statements in the source code 131b for which the compiling processing has started in S11, a declaration about an N-dimensional variable-length array, an instruction for storing units of data in an N-dimensional variable-length array, and an instruction for successively accessing units of data in an N-dimensional variable-length array (S12). A specific example of the processing in S12 is described below.
Specific Example of Processing in S12
CODE1 in the source code 131b illustrated in
Referring back to
Compiling Processing Performed by Optimization Unit of Compiler
Secondly, compiling processing performed by the optimization unit 11b of the compiler 11 is described.
As illustrated in
When the generation of the intermediate language representation 132b has been completed (YES in S21), the optimization unit 11b determines whether a declaration about an N-dimensional variable-length array, an instruction for storing units of data in an N-dimensional variable-length array, and an instruction for successively accessing units of data in an N-dimensional variable-length array are specified in the processing in S12 (S22).
In a case where it is determined that the declaration about an N-dimensional variable-length array and the other instructions are all specified (YES in S23), the type conversion unit 112 of the optimization unit 11b modifies the intermediate language representation 132b generated in the processing in S13 so as to convert the declaration about an N-dimensional variable-length array included in the intermediate language representation 132b into a declaration about an N-dimensional fixed-length array (S24).
Specifically, the type conversion unit 112 modifies the intermediate language representation 132b so as to convert, among the statements in the source code 131b illustrated in
Subsequently, the storage changing unit 113 of the optimization unit 11b modifies the intermediate language representation 132b generated in the processing in S13 so as to convert, among the instructions included in the intermediate language representation 132b, an instruction for storing units of data in an N-dimensional variable-length array into an instruction for storing units of data in an N-dimensional fixed-length array and an instruction for generating the address information 135 indicating addresses at which units of data are stored in an N-dimensional fixed-length array (S25). A specific example of the address information 135 will be described later.
Since the declaration about the array v (an N-dimensional variable-length array) included in the intermediate language representation 132b generated in the processing in S13 is converted into the declaration about the array data (an N-dimensional fixed-length array) in the processing in S24, a function push_back that is called in execution of the object code 133b generated from the intermediate language representation 132b (a function included in CODE2 in the source code 131b illustrated in
In regard to this point, for example, before performing the compiling processing of the present embodiment, the operator creates an Array Extend class inheriting the Array class in the standard template library (STL). The operator adds to this class in advance, as a new member function, a function push_back that includes an instruction for storing units of data in the array data (an N-dimensional fixed-length array) and an instruction for generating the address information 135 indicating addresses at which the units of data are stored in the array data. As a result, it is possible to cause the optimization unit 11b to perform the processing in S25 without modifying the portion of the intermediate language representation 132b generated in the processing in S13 that corresponds to CODE2.
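A rough sketch of such an extended class is shown below. It is a simplified, standalone stand-in rather than the class of the embodiment: it does not inherit from an STL class, the sizes `kRows` and `kCols` are hypothetical, `push_back` takes the row index as an explicit argument for brevity, and `AddressRecord` models only the address and unit count of the address information 135.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical sizes; the text does not state the fixed dimensions.
constexpr std::size_t kRows = 3;
constexpr std::size_t kCols = 8;

// One record of address information for a row.
struct AddressRecord {
    const int* address = nullptr;  // beginning address of the row
    std::size_t count = 0;         // number of units stored in the row
};

// Stand-in for the extended array class: it keeps a push_back member
// while storing into a two-dimensional fixed-length array.
class ArrayExtend {
public:
    ArrayExtend() = default;
    // The records hold pointers into data_, so copying is disabled.
    ArrayExtend(const ArrayExtend&) = delete;
    ArrayExtend& operator=(const ArrayExtend&) = delete;

    // Appends one unit to the given row and updates the address information.
    void push_back(std::size_t row, int unit) {
        assert(row < kRows && info_[row].count < kCols);
        data_[row][info_[row].count] = unit;
        info_[row].address = data_[row].data();
        ++info_[row].count;
    }

    const AddressRecord& record(std::size_t row) const { return info_[row]; }
    int at(std::size_t row, std::size_t col) const { return data_[row][col]; }

private:
    std::array<std::array<int, kCols>, kRows> data_{};  // fixed-length storage
    std::array<AddressRecord, kRows> info_{};           // per-row records
};
```

Because the call site still invokes a member named `push_back`, the code corresponding to CODE2 can stay unchanged while the storage behind it differs.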
As illustrated in
Specifically, as illustrated in
Here, before performing the compiling processing of the present embodiment, the operator may, for example, create an Array Extend class inheriting the Array class and add in advance, as a member function, a new function that performs the processing in S31 and subsequent steps by referring to the address information 135. Specifically, for example, in execution of the object code 133b, when a function begin as an Array class member function (a function included in CODE3 in the source code 131b illustrated in
In this manner, the operator is able to cause the optimization unit 11b to perform the processing in S31 and subsequent steps without, for example, changing the portion enclosed as a loop in CODE3 included in the intermediate language representation 132b generated in the processing in S13.
The access destination changing unit 114 then modifies the intermediate language representation 132b generated in the processing in S13 so as to cause the units of data stored in the N-dimensional fixed-length array to be stored contiguously in a one-dimensional fixed-length array (S32).
Specifically, the access destination changing unit 114 modifies the intermediate language representation 132b so as to, for example, cause the units of data stored in the array data to be stored contiguously in the array data2.
Hereinafter, in the processing performed along with the execution of the object code 133b generated from the intermediate language representation 132b (hereinafter also referred to as the object execution processing), a portion of the processing in regard to the intermediate language representation 132b modified in the processing in S32 is described.
Object Execution Processing
As illustrated in
The CPU 101 causes a unit of data stored in a location specified by subscripts indicated as the value of the variable i and the value of the variable j in an N-dimensional fixed-length array to be stored in a location specified by a subscript indicated as the value of the variable counter in a one-dimensional fixed-length array (S42).
The CPU 101 then determines whether, in the address information 135 stored in the information storage area 130, the maximum value that is set for the vector count is equal to the value of a variable i+1 and the number of units of information in which the value set for the vector count is equal to the value of the variable i is equal to the value of a variable j+1 (S43).
In a case where it is determined that the maximum value and the number of units of information are not both equal to the respective target values (NO in S44), the CPU 101 determines whether, in the address information 135 stored in the information storage area 130, the number of units of information in which the value set for the vector count is equal to the value of the variable i is equal to the value of the variable j+1 (S45).
As illustrated in
In a case where it is determined that both the maximum value and the number of units of information are equal to the respective target values (YES in S44), the CPU 101 generates a new N-dimensional fixed-length array as illustrated in
The CPU 101 then causes the units of data stored in the one-dimensional fixed-length array to be stored in the Nth dimension of the new N-dimensional fixed-length array and causes, with respect to each of the first to N−1st dimensions, information of the beginning address of locations in a particular dimension to be stored in another dimension of the first to N−1st dimensions, where the particular dimension is the dimension one dimension higher than the other dimension in which the information of the corresponding beginning address is stored (S62).
The CPU 101 sets the units of data stored in the new N-dimensional fixed-length array as access target data (S63).
To be specific, the CPU 101 sets, as access target data, the units of data stored in the new N-dimensional fixed-length array generated in the processing in S61 instead of the units of data stored in the one-dimensional fixed-length array in the processing in S42.
In this manner, the operator does not need to change the portion enclosed as a loop in CODE3 included in the intermediate language representation 132b generated in the processing in S13.
Referring back to
In a case where it is determined in the processing in S22 that at least one of the declaration about an N-dimensional variable-length array and the other instructions is not specified (NO in S23), the optimization unit 11b similarly ends optimizing the intermediate language representation 132b.
First Specific Example in Execution of Instruction Modified in Processing in S25
Next, among the instructions modified in the processing in S25 in the intermediate language representation 132b, a specific example in execution of the instruction for storing units of data in an N-dimensional fixed-length array (in execution of the object code 133b) is described.
Firstly, a specific example of the array v is described.
The array v illustrated in
Next, a specific example of the array data is described.
The array data illustrated in
For example, when the CPU 101 executing the object code 133b (hereinafter also simply referred to as the CPU 101) executes the instruction modified in the processing in S25, the CPU 101 causes the units of data to be stored in the array data (stored as illustrated in
Second Specific Example in Execution of Instruction Modified in Processing in S25
Next, among the instructions modified in the processing in S25 in the intermediate language representation 132b, a specific example in execution of the instruction for generating the address information 135 (in execution of the object code 133b generated from the source code 131b) is described.
The address information 135 illustrated in
Specifically, the array data illustrated in
Accordingly, the CPU 101 sets 1 as the vector count, 0x0100 as the address, and 4 as the number of units of data with regard to the record whose record number is 1 as illustrated in
The CPU 101 also sets 2 as the vector count, 0x0200 as the address, and 2 as the number of units of data with regard to the record whose record number is 2.
The CPU 101 also sets 3 as the vector count, 0x0300 as the address, and 3 as the number of units of data with regard to the record whose record number is 3.
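The three records above can be reproduced with a short sketch. The struct and function names are hypothetical, and the base address `0x0100` and row stride `0x0100` are assumptions chosen only so that the computed addresses match the values quoted in the text; the actual layout of the array data is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One record of the address information 135.
struct AddressInfoRecord {
    int vector_count;        // 1-based number of the row (inner vector)
    std::uintptr_t address;  // beginning address of the row
    std::size_t units;       // number of units of data in the row
};

// Builds one record per row. `base` and `row_stride` are assumptions that
// model a fixed-length array whose rows are equally spaced in memory.
inline std::vector<AddressInfoRecord> build_address_info(
        const std::vector<std::size_t>& units_per_row,
        std::uintptr_t base, std::uintptr_t row_stride) {
    assert(row_stride > 0);
    std::vector<AddressInfoRecord> info;
    for (std::size_t r = 0; r < units_per_row.size(); ++r) {
        const std::uintptr_t address =
            base + static_cast<std::uintptr_t>(r) * row_stride;
        info.push_back({static_cast<int>(r + 1), address, units_per_row[r]});
    }
    return info;
}
```

Calling `build_address_info({4, 2, 3}, 0x0100, 0x0100)` yields records with vector counts 1, 2, and 3, addresses 0x0100, 0x0200, and 0x0300, and unit counts 4, 2, and 3.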
Specific Example in Execution of Instruction Modified in Processing in S31
Next, a specific example in execution of the instruction modified in the processing in S31 (in execution of the object code 133b generated from the source code 131b) is described.
In the address information 135 illustrated in
Specific Example in Execution of Instruction Modified in Processing in S32
Next, a specific example in execution of the instruction modified in the processing in S32 (in execution of the object code 133b generated from the source code 131b) is described.
The array data illustrated in
In a case where the value of the variable i is 0, the value of the variable j is 0, and the value of the variable counter is 0, the CPU 101 causes 0 stored in the array data [0][0] to be stored in the array data2 [0] as illustrated in
Here, as the number of units of data in the record whose vector count is 1 in the address information 135 illustrated in
The CPU 101 then repeats the processing in S42 and subsequent steps, and as a result, causes the units of data stored in the array data illustrated in
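The copy loop driven by the variables i, j, and counter can be sketched as follows. This is a hand-written approximation, not the generated object code: the sizes `kRows` and `kCols` are hypothetical, `units_in_row` stands in for the "number of units of data" field of the address information 135, the maximum vector count is assumed to equal `kRows`, and the way i, j, and counter are advanced after each check is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kRows = 3;  // hypothetical fixed sizes
constexpr std::size_t kCols = 8;
using Fixed2D = std::array<std::array<int, kCols>, kRows>;
using Fixed1D = std::array<int, kRows * kCols>;

// Copies the used units of `data` into `data2` contiguously and returns the
// number of units copied. Every row is assumed to hold at least one unit.
inline std::size_t copy_contiguously(
        const Fixed2D& data,
        const std::array<std::size_t, kRows>& units_in_row,
        Fixed1D& data2) {
    std::size_t i = 0, j = 0, counter = 0;
    while (true) {
        assert(units_in_row[i] >= 1 && units_in_row[i] <= kCols);
        data2[counter] = data[i][j];                        // S42
        ++counter;
        const bool last_row = (i + 1 == kRows);             // first check in S43
        const bool last_unit = (j + 1 == units_in_row[i]);  // second check in S43, also S45
        if (last_row && last_unit) break;                   // YES in S44
        if (last_unit) { ++i; j = 0; }                      // assumed: go to the next row
        else           { ++j; }                             // assumed: go to the next unit
    }
    return counter;
}
```

With rows holding 4, 2, and 3 units, the first iteration copies `data[0][0]` to `data2[0]`, and the loop ends after nine units have been copied.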
Since the variable-length array specified in the processing in S12 is a two-dimensional array, the CPU 101 then generates an array data2_box as a new two-dimensional fixed-length array (S61).
The CPU 101 causes the units of data stored in the array data2 in the processing in S42 to be stored in the second dimension of the array data2_box as illustrated in
It is noted that, when the variable-length array specified in the processing in S12 is three-dimensional or higher, the CPU 101 may generate an array dataN_box as a new fixed-length array having three or more dimensions.
Specifically, in a case where the variable-length array specified in the processing in S12 is, for example, a three-dimensional array, the CPU 101 generates an array data3_box as a new three-dimensional fixed-length array. The CPU 101 then causes the units of data stored in the array data2 in the processing in S42 to be stored in the third dimension of the array data3_box as illustrated in
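For the two-dimensional case, the relationship between the two dimensions of the array data2_box can be sketched as below. The layout is an assumption made for illustration: the struct `Data2Box` and the functions `make_box` and `box_at` are hypothetical, and the first dimension stores row-begin offsets into the contiguous storage instead of raw beginning addresses, which keeps the sketch valid when the struct is copied.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kRows = 3;  // hypothetical sizes
constexpr std::size_t kCapacity = 16;

// Hypothetical layout for data2_box: the second dimension holds the units
// contiguously, and the first dimension records where each row begins.
struct Data2Box {
    std::array<int, kCapacity> second{};         // contiguous units of data
    std::array<std::size_t, kRows + 1> first{};  // row r spans [first[r], first[r + 1])
};

// Builds the box from the one-dimensional array and the per-row unit counts.
inline Data2Box make_box(const std::array<int, kCapacity>& data2,
                         const std::array<std::size_t, kRows>& units_in_row) {
    Data2Box box;
    box.second = data2;
    std::size_t begin = 0;
    for (std::size_t r = 0; r < kRows; ++r) {
        box.first[r] = begin;
        begin += units_in_row[r];
    }
    assert(begin <= kCapacity);
    box.first[kRows] = begin;
    return box;
}

// Reads element (r, c) through the row table, as a loop written against the
// original two-dimensional shape would.
inline int box_at(const Data2Box& box, std::size_t r, std::size_t c) {
    assert(r < kRows && box.first[r] + c < box.first[r + 1]);
    return box.second[box.first[r] + c];
}
```

A loop that still indexes by row and column can therefore run unchanged while every unit it reads lies in one contiguous region.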
As described above, when compiling the source code 131b, in a case where the source code 131b includes an instruction for storing particular data in an N-dimensional variable-length array, the compiler 11 of the present embodiment generates the object code 133b so as to cause the particular data to be stored in an N-dimensional fixed-length array instead of the N-dimensional variable-length array.
When compiling the source code 131b, in a case where the source code 131b includes an instruction for successively accessing units of data stored in an N-dimensional variable-length array, the compiler 11 generates the object code 133b so as to cause the units of data stored in the N-dimensional fixed-length array to be stored contiguously in a one-dimensional fixed-length array and cause the units of data stored in the one-dimensional fixed-length array to be successively accessed.
In other words, in a case where the source code 131 includes an instruction for storing units of data in an N-dimensional variable-length array, the compiler 11 of the present embodiment generates the object code 133 by converting the instruction for storing units of data in an N-dimensional variable-length array into an instruction for storing units of data contiguously in a one-dimensional fixed-length array.
In this manner, the compiler 11 is able to generate the object code 133b in which access to units of data stored in discrete memory locations is avoided. The compiler 11 is therefore able to generate the object code 133b with which a higher cache hit rate in execution of the object code 133b is achieved.
All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
JP2018-116777 | Jun 2018 | JP | national |
Number | Name | Date | Kind |
---|---|---|---|
5586325 | MacDonald | Dec 1996 | A |
7062761 | Slavin | Jun 2006 | B2 |
8015556 | Cui | Sep 2011 | B2 |
9250878 | McCallum | Feb 2016 | B1 |
9733908 | Slavin | Aug 2017 | B2 |
10360003 | Slavin | Jul 2019 | B2 |
20030014588 | Hu | Jan 2003 | A1 |
20030014607 | Slavin | Jan 2003 | A1 |
20040168027 | Hu | Aug 2004 | A1 |
20040221281 | Suganuma | Nov 2004 | A1 |
20060195661 | Hu | Aug 2006 | A1 |
20060221747 | Slavin | Oct 2006 | A1 |
20070240137 | Archambault | Oct 2007 | A1 |
20070299892 | Nakahara | Dec 2007 | A1 |
20100174876 | Kasahara et al. | Jul 2010 | A1 |
20130007721 | Slavin | Jan 2013 | A1 |
20140304691 | Slavin | Oct 2014 | A1 |
20150046913 | Cui | Feb 2015 | A1 |
20170322784 | Slavin | Nov 2017 | A1 |
20180157470 | Matsuura | Jun 2018 | A1 |
20190114150 | Ogasawara | Apr 2019 | A1 |
Number | Date | Country |
---|---|---|
2004-362216 | Dec 2004 | JP |
2007-328692 | Dec 2007 | JP |
2008-217134 | Sep 2008 | JP |
Number | Date | Country | |
---|---|---|---|
20190391795 A1 | Dec 2019 | US |