This application is based on Japanese Patent Application No. 2015-072814 filed on Mar. 31, 2015, the disclosure of which is incorporated herein by reference.
The present disclosure relates to a parallelization compiling method, a parallelization compiler, and a vehicular device.
With the mounting of a multi-core processor on a vehicular device, programs can be distributed to the respective cores, and throughput can therefore be improved. In a vehicular device on which a multi-core processor is mounted, a sequential program designed for a conventional single-core processor needs to be replaced with a parallelized program that can be executed in parallel by the multi-core processor. Up to now, in order to facilitate the generation of the parallelized program, a parallelization compiler that automatically generates the parallelized program from the sequential program has been proposed. As a parallelization compiler of this type, the parallelization compiler disclosed in JP 2015-001807 A (corresponding to US 2014/0372995 A1) has been proposed. The parallelization compiler disclosed in JP 2015-001807 A analyzes the data dependency and control dependency between processes included in the sequential program, and automatically generates the parallelized program from the sequential program on the basis of the analysis result.
Meanwhile, when a part of the processing in the sequential program is changed, for example due to a version upgrade, the data dependency and control dependency between the processes configuring the sequential program may also change. For that reason, when the changed sequential program is compiled by the parallelization compiler without any restriction, the execution order in the parallelized program may change significantly, not only in the changed portion but also in the other portions. In other words, a small change to the sequential program is likely to spread into a change of the overall parallelized program. In that case, because the overall parallelized program needs to be verified again, the verification cost of the parallelized program may increase.
In view of the foregoing difficulties, it is an object of the present disclosure to provide a parallelization compiling method, a parallelization compiler, and a vehicular device, each of which is capable of easily verifying a parallelized program when a modification or change is made to a program.
According to a first aspect of the present disclosure, a parallelization compiling method includes analyzing a sequential program prepared for a single-core processor; dividing the sequential program into a plurality of processes based on an analysis result; and generating a parallelized program, which is subjected to a parallelized execution by a multi-core processor, from the plurality of processes. The generating of the parallelized program includes compiling the plurality of processes under an execution order restriction defined based on a predetermined parameter.
According to a second aspect of the present disclosure, a parallelization compiler, which is stored in a non-transitory tangible computer readable storage medium as a program product, includes instructions to be executed by a compiling device. The instructions are for implementing analyzing a sequential program prepared for a single-core processor; dividing the sequential program into a plurality of processes based on an analysis result; and generating a parallelized program, which is subjected to a parallelized execution by a multi-core processor, from the plurality of processes. The generating of the parallelized program includes compiling the plurality of processes under an execution order restriction defined based on a predetermined parameter.
According to a third aspect of the present disclosure, a vehicular device includes a multi-core processor performing a control process based on the parallelized program generated by the parallelization compiling method according to the first aspect of the present disclosure.
According to a fourth aspect of the present disclosure, a vehicular device includes a multi-core processor performing a control process based on the parallelized program generated by the parallelization compiler according to the second aspect of the present disclosure.
With the above parallelization compiling method, parallelization compiler, and vehicular device, when a modification or change is made to a parallelized program, verification of the parallelized program can be carried out in an easy and effective manner.
The above and other objects, features and advantages of the present disclosure will become more apparent from the following detailed description made with reference to the accompanying drawings. In the drawings:
The following will describe a parallelization compiling method, a parallelization compiler, and a vehicular device according to a first embodiment with reference to the drawings.
As illustrated in
The parallelization compiler CP generates a multi-core processor source program for a built-in system from a single-core processor source program for a built-in system. Hereinafter, the single-core processor source program is called a “sequential program”, and the multi-core processor source program is called a “parallelized program”.
The storage medium 1 may be provided by an optical disk such as a Digital Versatile Disc (DVD) or a Compact Disc Read-Only Memory (CD-ROM), a magnetic disk, or a semiconductor memory such as a Universal Serial Bus (USB) memory or a memory card (registered trademark). The parallelization compiler CP stored in the storage medium 1 is installed into a compiling device 2.
The compiling device 2 is used for the development of the parallelized program for a built-in system of the vehicular device. The compiling device 2 includes a display unit 20, a hard disk drive (HDD) 21, a central processing unit (CPU) 22, a read-only memory (ROM) 23, a random access memory (RAM) 24, an input device 25, and a reading unit 26.
The display unit 20 displays an image based on an image signal output from the CPU 22.
The input device 25 includes a keyboard and a mouse. The input device 25 outputs a signal corresponding to user's operation to the CPU 22.
The reading unit 26 reads out the parallelization compiler CP from the storage medium 1.
The RAM 24 is used as a storage area for temporarily storing a program and a storage area for temporarily storing calculation processing data when the CPU 22 executes the program stored in the ROM 23 or the HDD 21.
The CPU 22 reads out an operating system (OS) from the HDD 21, and executes the operating system, to thereby execute various programs stored in the HDD 21 as a process on the OS. The CPU 22, for example, receives an input of a signal from the input device 25, controls an output of an image signal to the display unit 20, and reads out data from the RAM 24 and the HDD 21 or writes data into the RAM 24 and the HDD 21 in this process.
The parallelization compiler CP, which is read out from the storage medium 1 by the reading unit 26, is installed into the compiling device 2. The installed parallelization compiler CP is stored in the HDD 21, and functions as one of the applications executed as a process on the OS.
In the compiling device 2, the parallelization compiler CP executes a parallelizing process in response to a user's instruction. As illustrated in
The input/output port 32 captures detection signals from various sensors 5 mounted in a vehicle, and transmits a control signal to a control target 4 such as a vehicle actuator. The driving of the control target 4 is controlled according to the control signal transmitted from the ECU 3 to the control target 4.
The multi-core processor 30 includes a ROM 301, a RAM 302, multiple cores 303a, 303b and so on. Binary data of the parallelized program P2 generated by the compiling device 2 is stored in the ROM 301. The multi-core processor 30 operates on the basis of the parallelized program P2 stored in the ROM 301, and comprehensively controls the driving of the control target 4.
The communication unit 31 is configured to communicate with another in-vehicle ECU connected to the communication unit 31 through an in-vehicle network such as a controller area network (CAN, registered trademark).
The following will describe a process of generating the parallelized program P2 from the sequential program P1 by the parallelization compiler CP in detail.
As illustrated in
In this situation, the parallelization compiler CP performs an inline expansion on the sequential program P1. The inline expansion is a process of replacing a description that calls a subroutine in the program with the description of the process defined in the subroutine. Because a built-in system program is generally composed of fine-grained processes, parallelization with coarse granularity is difficult. However, by performing the inline expansion, the parallelism within a subroutine can be used effectively.
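As a rough illustration of how inline expansion works, the following sketch splices a subroutine body into its caller. The representation of a program as lists of statement strings, and the names `main` and `sub`, are hypothetical simplifications, not the compiler's actual intermediate form:

```python
# Toy model of inline expansion: a program is a mapping from routine
# names to lists of statement strings (a hypothetical representation).
def inline_expand(program, caller):
    """Replace each 'call NAME' statement in the caller with NAME's body."""
    expanded = []
    for stmt in program[caller]:
        if stmt.startswith("call "):
            callee = stmt.split()[1]
            expanded.extend(program[callee])  # splice the subroutine body in place
        else:
            expanded.append(stmt)
    return expanded

program = {
    "main": ["a = in1", "call sub", "d = a + c"],
    "sub":  ["b = a * 2", "c = b + 1"],
}
print(inline_expand(program, "main"))
# ['a = in1', 'b = a * 2', 'c = b + 1', 'd = a + c']
```

After expansion, the statements of `sub` become visible to the dependency analysis of the caller, which is why the parallelism inside the subroutine can then be exploited.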
In addition, the parallelization compiler CP identifies, in each function block of the sequential program P1, process blocks each of which uses a local variable having the same name. The parallelization compiler CP then modifies the sequential program P1 so that a local variable having an individual name is used in each of the identified process blocks. Each process block is, for example, an aggregation of descriptions of loop-processing and branch-processing statements, such as an if-statement or a switch-case statement, and the assignment statements associated with those statements.
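The effect of this renaming can be pictured with a similar toy sketch (hypothetical; an actual compiler renames at the symbol-table level, not by string substitution): giving each process block its own copy of a shared local name removes the false dependency that the common name would otherwise create between the blocks.

```python
def rename_locals(blocks, local="tmp"):
    """Give each process block a uniquely named copy of a shared local
    variable, e.g. 'tmp' -> 'tmp_0' in block 0 and 'tmp_1' in block 1.
    Naive string replacement is used here purely for illustration."""
    return [
        [stmt.replace(local, f"{local}_{i}") for stmt in block]
        for i, block in enumerate(blocks)
    ]

blocks = [
    ["tmp = x + 1", "y = tmp"],
    ["tmp = z * 2", "w = tmp"],
]
print(rename_locals(blocks))
# [['tmp_0 = x + 1', 'y = tmp_0'], ['tmp_1 = z * 2', 'w = tmp_1']]
```

Once the two blocks no longer share `tmp`, they carry no dependency through that variable and may be assigned to different cores.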
At S2, the parallelization compiler CP divides the sequential program P1 into multiple processes (macro tasks) A, B and so on, on the basis of the lexical and syntactic/semantic analysis performed at S1. The respective processes A, B and so on are each a series of processes including, for example, various calculating, assigning, and branching processes, and function calls. For example, as illustrated in
As illustrated in
Condition A1: When the process x is data-dependent on the process y, the process x cannot be executed until the execution of the process y is completed.
Condition A2: When the condition branch destination of the process y is determined, the process x, which is control-dependent on the process y, can be executed even if the execution of the process y is not completed.
The processing graph MTG represents the data dependencies among all of the processes included in the sequential program P1. For example, the parallelization compiler CP generates the processing graph MTG as illustrated in
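The construction of such a dependency graph from per-process read/write sets can be sketched as follows. This is a simplified stand-in that records only flow dependencies (a later process reading a variable an earlier one writes); anti- and output dependencies are omitted, and the process names and variable sets are hypothetical:

```python
def build_mtg(tasks):
    """tasks: ordered list of (name, reads, writes) with reads/writes as sets.
    Returns dependency edges (earlier, later) where the later process reads
    a variable that the earlier process writes (flow dependency only)."""
    edges = []
    for i, (n1, _, w1) in enumerate(tasks):
        for n2, r2, _ in tasks[i + 1:]:
            if w1 & r2:            # the later task consumes a produced value
                edges.append((n1, n2))
    return edges

tasks = [
    ("A", set(),        {"x"}),
    ("B", {"x"},        {"y"}),
    ("C", set(),        {"z"}),   # independent of A and B
    ("D", {"y", "z"},   set()),
]
print(build_mtg(tasks))
# [('A', 'B'), ('B', 'D'), ('C', 'D')]
```

In this hypothetical graph, C has no edge to A or B, so C may run in parallel with them, while D must wait for both B and C.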
Subsequent to S5, the parallelization compiler CP, at S6, analyzes the restriction applied on the execution order of the respective processes A, B and so on, on the basis of process priority information I1 including priority information of each process included in the sequential program P1. The process priority information I1 is preliminarily determined to represent, for example, priorities of the process units A, B, which are illustrated in
Subsequent to the process in S6, the parallelization compiler CP, at S7, performs scheduling for assigning the respective processes A, B and so on to multiple process groups PG1, PG2 and so on, on the basis of the processing graph MTG generated in S5, and the restriction applied on the execution order of the respective processes A, B and so on analyzed in S6. The number of process groups PG1, PG2 and so on corresponds to the number of cores 303a, 303b and so on of the multi-core processor 30. For example, when the number of cores provided in the multi-core processor 30 is two, two process groups are provided. Specifically, the parallelization compiler CP allocates all or a part of the processes executable in the parallelizing process to different process groups while satisfying the following conditions B1 and B2.
Condition B1: The process having higher priority is executed prior to the process having lower priority.
Condition B2: A start time of the process having lower priority is posterior to an end time of the process having higher priority.
The parallelization compiler CP inserts a waiting duration into the process groups PG1, PG2 and so on as necessary to satisfy the above-described conditions. As a result, the parallelization compiler CP generates the process groups PG1 and PG2 illustrated in
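A greedy sketch of scheduling under conditions B1 and B2 might look as follows. The task names, priorities, and durations are hypothetical, and it is assumed that equal-priority processes may run in parallel while a lower-priority process never starts before every higher-priority process has ended (the inserted waiting duration):

```python
from itertools import groupby

def schedule(tasks, n_groups=2):
    """tasks: list of (name, priority, duration); a smaller priority number
    means higher priority.  Returns {group: [(name, start, end), ...]}."""
    plan = {g: [] for g in range(n_groups)}
    free = [0.0] * n_groups              # time at which each group is next idle
    barrier = 0.0                        # latest end time of higher-priority tasks
    ordered = sorted(tasks, key=lambda t: t[1])   # higher priority first (B1)
    for _, level in groupby(ordered, key=lambda t: t[1]):
        for name, _, dur in level:
            g = free.index(min(free))    # assign to the earliest-idle group
            start = max(free[g], barrier)  # insert waiting duration (B2)
            plan[g].append((name, start, start + dur))
            free[g] = start + dur
        barrier = max(free)              # the next priority level must wait
    return plan

tasks = [("A", 0, 2), ("B", 0, 3), ("C", 1, 1)]
print(schedule(tasks))
# {0: [('A', 0.0, 2.0), ('C', 3.0, 4.0)], 1: [('B', 0.0, 3.0)]}
```

In the printed plan, C waits until B ends at time 3.0 even though group 0 is idle from time 2.0; that gap is the inserted waiting duration required by condition B2.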
The parallelization compiler CP may generate multiple candidate sets of the process groups PG1, PG2 and so on, and select an optimum set of process groups PG1, PG2 and so on from the multiple candidate sets. The determination of whether one set of the process groups PG1, PG2 and so on is optimum is performed, for example, with the use of the respective execution durations of the multiple sets of process groups PG1, PG2 and so on, and of evaluation values calculated on the basis of a predetermined evaluation function.
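One possible evaluation function, offered only as a hypothetical example since the disclosure does not specify the actual one, scores each candidate by its overall execution duration (makespan) plus a small penalty for inserted waiting time:

```python
def evaluate(plan, idle_weight=0.1):
    """plan: {group: [(name, start, end), ...]} with each group's list in
    start-time order.  Hypothetical score: makespan plus a weighted
    penalty for waiting gaps inside each group (lower is better)."""
    makespan = max((end for seq in plan.values() for _, _, end in seq),
                   default=0.0)
    idle = 0.0
    for seq in plan.values():
        t = 0.0
        for _, start, end in seq:
            idle += max(0.0, start - t)   # waiting duration before the task
            t = end
    return makespan + idle_weight * idle

# Two hypothetical candidates: two parallel groups vs. fully serialized.
p1 = {0: [("A", 0.0, 2.0)], 1: [("B", 0.0, 3.0)]}
p2 = {0: [("A", 0.0, 2.0), ("B", 2.0, 5.0)], 1: []}
best = min([p1, p2], key=evaluate)
print(best is p1)
# True
```

Here the parallel candidate scores 3.0 against 5.0 for the serialized one, so it is selected; any function that trades off makespan against waiting time in this way could serve the same role.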
After the generation of the process groups PG1, PG2 and so on has been completed, the parallelization compiler CP generates binary data of the parallelized program P2 on the basis of the process groups PG1, PG2, and so on. The parallelized program P2 generated as described above is stored in the ROM 301 of the multi-core processor 30.
The following will describe the operation and advantages of the parallelization compiling method, the parallelization compiler CP, and the ECU 3 according to the present embodiment.
As illustrated in
In some cases, a program specification defines the execution order on a function basis. When the allocation of the processes to the process groups is performed in consideration of only the data dependency of the processing graph MTG, the specifications of the processes assigned to the process groups PG1, PG2 and so on may be mutually inconsistent. In other words, there is a possibility that the specifications of the processes included in the parallelized program P2 are mutually inconsistent. According to the parallelization compiler CP of the present embodiment, by setting the priority information I1 according to the specification of each process, such an inconsistency among the specifications of the processes in the parallelized program P2 can be avoided in a case where, for example, the process E is changed.
The following will describe a parallelization compiling method, a parallelization compiler CP, and an ECU 3 according to a second embodiment. Hereinafter, differences from the first embodiment will be mainly described.
As illustrated in
The following will describe the operation and advantages of the parallelization compiling method, the parallelization compiler CP, and the ECU 3 according to the present embodiment.
As illustrated in
The following will describe a parallelization compiling method, a parallelization compiler CP, and an ECU 3 according to a third embodiment. Hereinafter, differences from the first embodiment will be mainly described.
As illustrated in
As illustrated in
Subsequent to the process in S6, the parallelization compiler CP performs, at S7, scheduling for assigning the respective processes A, B and so on to multiple process groups PG1, PG2 and so on, on the basis of the processing graph MTG generated in S5 and the restriction applied on the execution order of the respective processes A, B and so on which are analyzed in S6. Specifically, the parallelization compiler CP allocates all or a part of the processes executable in the parallelizing process to different process groups PG1, PG2 and so on while satisfying the following conditions C1 and C2.
Condition C1: The function having higher priority is executed prior to the function having lower priority.
Condition C2: A start time of the function having lower priority is posterior to an end time of the function having higher priority.
As a result, the parallelization compiler CP generates the process groups PG1 and PG2 illustrated in
The following will describe the operation and advantages of the parallelization compiling method, the parallelization compiler CP, and the ECU 3 according to the present embodiment.
As illustrated in
The following will describe a parallelization compiling method, a parallelization compiler CP, and an ECU 3 according to a fourth embodiment. Hereinafter, differences from the first embodiment will be mainly described.
As illustrated in
According to the above configuration, since the priority order can be set for the respective processes A, B and so on as in the first embodiment, the same operation and advantages as those in the first embodiment can be obtained.
The following will describe a parallelization compiling method, a parallelization compiler CP, and an ECU 3 according to a fifth embodiment. Hereinafter, differences from the first embodiment will be mainly described.
As illustrated in
For example, it is assumed that the sequential program P1 includes the process folders FDa, FDb, and FDc. The process files in which the contents of the processes A to C are described are stored in the process folder FDa. The process files in which the contents of the processes D to F are described are stored in the process folder FDb. The process files in which the contents of the processes G and H are described are stored in the process folder FDc. The parallelization compiler CP executes the process files stored in the process folders FDa to FDc. In that case, for example, as illustrated in
Subsequent to the process in S6, the parallelization compiler CP performs, at S7, scheduling for assigning the respective processes A, B and so on to multiple process groups PG1, PG2 and so on, on the basis of the processing graph MTG generated in S5 and the restriction applied on the execution order of the respective processes A, B and so on which are analyzed in S6. Specifically, the parallelization compiler CP allocates all or a part of the processes executable in the parallelizing process to the different process groups PG1, PG2 and so on while satisfying the following conditions D1 and D2.
Condition D1: The process folder having higher priority is executed prior to the process folder having lower priority.
Condition D2: A start time of the process folder having lower priority is posterior to an end time of the process folder having higher priority.
As a result, the parallelization compiler CP generates the process groups PG1 and PG2 illustrated in
The following will describe the operation and advantages of the parallelization compiling method, the parallelization compiler CP, and the ECU 3 according to the present embodiment.
As illustrated in
The following will describe a parallelization compiling method, a parallelization compiler CP, and an ECU 3 according to a sixth embodiment. Hereinafter, differences from the first embodiment will be mainly described.
As illustrated in
For example, it is assumed that a sequential program P1 includes the processing files FLa, FLb, and FLc. The contents of the processes A to C are described in the process file FLa. The contents of the processes D to F are described in the process file FLb. The contents of the processes G and H are described in the process file FLc. The parallelization compiler CP executes the processes stored in the process files FLa to FLc. In that case, for example, as illustrated in
Subsequent to the process in S6, the parallelization compiler CP performs, at S7, scheduling for assigning the respective processes A, B and so on to multiple process groups PG1, PG2 and so on, on the basis of the processing graph MTG generated in S5 and the restriction applied on the execution order of the respective processes A, B and so on which are analyzed in S6. Specifically, the parallelization compiler CP allocates all or a part of the processes executable in the parallelizing process to the different process groups PG1, PG2 and so on while satisfying the following conditions E1 and E2.
Condition E1: The process file having higher priority is executed prior to the process file having lower priority.
Condition E2: A start time of the process file having lower priority is posterior to an end time of the process file having higher priority.
As a result, the parallelization compiler CP generates the process groups PG1 and PG2 illustrated in
The following will describe operation and advantages of the parallelization compiling method, the parallelization compiler CP, and the ECU 3 according to the present embodiment.
As illustrated in
The above-described embodiments may also be implemented by the following configurations.
The parallelization compiler CP according to the first embodiment may employ the following condition B3 instead of the condition B2 for performing scheduling at S7.
Condition B3: A start time of the process having lower priority is posterior to a start time of the process having higher priority.
Based on the above condition, the parallelization compiler CP can generate process groups PG1 and PG2 illustrated in
The indicator Ind according to the fourth embodiment may add a dependency between the preceding and following processes in a sequential program P1. Even with the above configuration, operation and advantages similar to those in the fourth embodiment can be obtained.
The parallelization compiler CP may be installed into a compiling device 2 through a network.
The compiling device 2 is not limited to the development of the parallelized program for the vehicular device, but can also be used in the development of a parallelized program for a built-in system intended for various applications such as information appliances, or in the development of a parallelized program for applications other than the built-in system.

According to the foregoing embodiments, the compiling is performed in a state where the execution order of the multiple processes is restricted, so that, when any one of the multiple processes is changed, the influence of the change is restricted to a local extent corresponding to the restriction applied on the execution order. In other words, the change in the process merely affects a local extent of the parallelized program without affecting the overall parallelized program. Hence, because only the local extent of the parallelized program needs to be verified, the verification of the program is facilitated.
Further, according to the foregoing embodiments of the present disclosure, the parallelized program can be easily verified in a case where a program change is made.
In the present disclosure, it is noted that a flowchart or the processing of the flowchart in the present disclosure includes sections (also referred to as steps), each of which is represented, for instance, as S1. Further, each section can be divided into several sub-sections while several sections can be combined into a single section. Furthermore, each of thus configured sections can be also referred to as a circuit, device, module, or means.
While only the selected exemplary embodiments have been chosen to illustrate the present disclosure, it will be apparent to those skilled in the art from this disclosure that various changes and modifications can be made therein without departing from the scope of the disclosure as defined in the appended claims. Furthermore, the foregoing description of the exemplary embodiments according to the present disclosure is provided for illustration only, and not for the purpose of limiting the disclosure as defined by the appended claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---
2015-72814 | Mar 2015 | JP | national |