The embodiment discussed herein is related to an information processing device, an information processing method, and a recording medium.
In circuit design, the number of design steps for circuit modules has been reduced by improvements in Register Transfer Level (RTL) design tools and high-level synthesis tools.
Japanese Laid-open Patent Publication No. 2004-54641 is disclosed as related art.
According to an aspect of the embodiments, an information processing device includes: a memory; and a processor coupled to the memory and configured to: perform, based on input descriptions of a first circuit module that performs processing of a first task and a second circuit module that receives data output from the first circuit module and performs processing of a second task, high-level synthesis of the first circuit module and the second circuit module; synthesize an interface circuit that includes a memory to and from which the data is input and output and that performs data transfer between the first circuit module and the second circuit module based on write information of the data that is written to the interface circuit by the first circuit module and read information of the data which is read from the interface circuit by the second circuit module; calculate a minimum operation start interval of the interface circuit based on the write information of the data and the read information of the data; and provide, when the calculated minimum operation start interval is larger than a minimum operation start interval of each of the first circuit module and the second circuit module, a storage element that is different from the memory and that stores data which is input to or output from the memory in the interface circuit based on the minimum operation start intervals of the first circuit module and the second circuit module.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
For example, design tools provide little support for the design of an interface circuit that connects circuit modules and is responsible for data transfer between them. Since an interface circuit between circuit modules affects both the performance of the entire system and the circuit area, it is important to consider how to realize the interface circuit at the time of design.
As a method of creating an interface circuit between circuit modules, there is a method of designing the interface circuit manually as a dedicated circuit adapted to the operation of the circuit modules. According to this method, the interface circuit is created for dedicated use. Thus, an interface circuit that is excellent in terms of performance and circuit area can be created, but it takes a lot of time and labor for design. In addition, when the interface of circuit modules is changed, the interface circuit also needs to be changed, which is time-consuming.
As another method of creating an interface circuit, there is a method of adopting a standard system as the interface between circuit modules and using an IP macro. According to this method, the number of design steps can be reduced, but the feasible range of customization for the purpose of performance improvement and circuit area reduction is limited. Further, as another method of creating an interface circuit, a technique for automatically synthesizing an interface circuit between circuit modules has been proposed.
As illustrated in
Here, the operation start interval (interval at which the same processing is repeated) of the interface circuit 2602 is determined depending on the write operation by the circuit module 2601, the read operation by the circuit module 2603, and the memory capacity of the interface circuit 2602. If the memory capacity of the interface circuit 2602 is reduced in order to avoid an increase in circuit area, the operation start interval of the interface circuit 2602 increases. When the operation start interval of the interface circuit 2602 is larger than the operation start intervals of the circuit module 2601 and the circuit module 2603, the interface circuit 2602 is a bottleneck. As a result, the period from the time when data is output from the circuit module 2601 to the time when the data is input to the circuit module 2603 becomes long, which degrades the performance of the entire system.
An information processing device capable of automatically synthesizing an interface circuit that allows improvement of performance while suppressing an increase in circuit area may be provided.
Hereinafter, an embodiment of the present invention will be described based on the drawings.
In the following, as a hardware circuit to be designed, a circuit that performs the processing of a task A, and performs the processing of a task B using received output data of the task A as illustrated in
The hardware circuit based on the design data 122 is automatically synthesized by executing software such as a high-level synthesis tool 131, a logic synthesis tool 132, a schedule information extraction tool 133, and an interface circuit synthesis tool 134 on the server 121. The high-level synthesis tool 131 is a tool that synthesizes, from an input description (operation description), a circuit module (RTL description) that performs the processing described in the input description. The logic synthesis tool 132 is a tool that generates gate-level circuit data based on the RTL description.
The schedule information extraction tool 133 is a tool that obtains information of data input/output timing (write trace data and read trace data) from schedule information. The interface circuit synthesis tool 134 is a tool that synthesizes an interface circuit (RTL description) based on the write trace data and the read trace data obtained from the schedule information.
Here, the schedule information is obtained as one piece of log information output at the time of synthesis of the circuit module, and is information indicating the flow of processing of the circuit module. The write trace data is data write information indicating which memory element in the interface circuit the data is to be written to at which timing, and the read trace data is data read information indicating which memory element in the interface circuit the data is to be read from at which timing.
The high-level synthesizer 201 synthesizes circuit modules (RTL description) that perform processing described in an input description (operation description). The high-level synthesizer 201 has a pipeline synthesis function, and synthesizes circuit modules capable of operating in a pipeline. When an input description 211A of the task A is input, the high-level synthesizer 201 synthesizes a circuit module that performs the processing of the task A based on the input description 211A, and outputs an RTL description 212A of the circuit module (task A) and log information 213A. In addition, when an input description 211B of the task B is input, the high-level synthesizer 201 synthesizes a circuit module that performs the processing of the task B based on the input description 211B, and outputs an RTL description 212B of the circuit module (task B) and log information 213B. The log information 213A and 213B each include schedule information indicating the flow of processing in the circuit module.
The schedule information extractor 202 obtains information of data input/output timing (write trace data and read trace data) from schedule information in log information output by the high-level synthesizer 201. The schedule information extractor 202 obtains write trace data 214 from the schedule information in the log information 213A of the circuit module (task A) that is the data output side (writes data), and outputs the write trace data 214. The schedule information extractor 202 also obtains read trace data 215 from the schedule information in the log information 213B of the circuit module (task B) that is the data input side (reads data), and outputs the read trace data 215.
The interface circuit synthesizer 203 synthesizes an interface circuit (RTL description), which is responsible for data transfer between circuit modules, based on the write trace data and the read trace data output by the schedule information extractor 202. The interface circuit synthesizer 203 analyzes the lifetime of the memory in the interface circuit 2602 using the write trace data 214 and the read trace data 215. The interface circuit synthesizer 203 also synthesizes the interface circuit based on the analysis result, and outputs an RTL description 216 of the interface circuit. The lifetime starts when data is written to a memory element and ends when the data is read last, and indicates a period for which the memory element holds the data.
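As a non-limiting illustration, the lifetime defined above may be sketched in Python as follows. The trace format used here (dictionaries that map a memory element index to cycle numbers) and the function name `lifetimes` are assumptions made only for this sketch; the actual format of the write trace data 214 and the read trace data 215 is not specified here.

```python
def lifetimes(write_trace, read_trace):
    """Return {element: (write_cycle, last_read_cycle)}.

    write_trace: {element: cycle of its single write}        (assumed format)
    read_trace:  {element: list of cycles at which it is read} (assumed format)
    """
    result = {}
    for element, write_cycle in write_trace.items():
        reads = read_trace.get(element, [])
        # An element that is never read holds its data only at the write cycle.
        last_read = max(reads) if reads else write_cycle
        result[element] = (write_cycle, last_read)
    return result


# Hypothetical traces: element 0 is written in cycle 0 and read in cycles 2 and 9.
print(lifetimes({0: 0, 1: 1}, {0: [2, 9], 1: [2, 3]}))
# {0: (0, 9), 1: (1, 3)}
```

In this sketch, each memory element is written once and may be read several times, so only the last read determines the end of the lifetime.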
The logic synthesizer 204 logically synthesizes circuit information (RTL description) of each of the circuits to generate gate-level circuit data. The logic synthesizer 204 generates gate-level circuit data 217 of a hardware circuit to be designed based on the RTL descriptions 212A and 212B of the circuit modules output by the high-level synthesizer 201 and the RTL description 216 of the interface circuit output by the interface circuit synthesizer 203, and outputs the gate-level circuit data 217.
When the processing starts, in step S301, the high-level synthesizer 201 of the information processing device performs high-level synthesis of the circuit module that performs the processing of the task A and the circuit module that performs the processing of the task B based on the input descriptions thereof. Next, in step S302, the schedule information extractor 202 of the information processing device obtains write trace data from the schedule information of the circuit module that performs the processing of the task A, and obtains read trace data from the schedule information of the circuit module that performs the processing of the task B. The circuit modules have been obtained by the high-level synthesis in step S301.
Subsequently, in step S303, the interface circuit synthesizer 203 of the information processing device performs lifetime analysis of a memory in the interface circuit using the write trace data and the read trace data obtained in step S302. Next, in step S304, the interface circuit synthesizer 203 determines an operation start interval of each of the circuit modules and the interface circuit based on the schedule information of the corresponding circuit module and the analysis result of the lifetime. Here, an operation start interval is an interval of repeated processing of the same task, and is a time interval from the start of processing of a task to the next start of processing of the same task.
Next, in step S305, the interface circuit synthesizer 203 evaluates the processing performance and the circuit area of the entire circuit to be designed. If the processing performance and the circuit area do not satisfy predetermined conditions, the processing proceeds to step S306. For example, when the interface circuit synthesizer 203 determines that the minimum operation start interval of the interface circuit is larger than the minimum operation start interval of each circuit module so that the interface circuit is a bottleneck, the processing proceeds to step S306.
In step S306, the interface circuit synthesizer 203 selects one memory element of the memory in the interface circuit and separates the memory element into a separated memory device (storage element). The interface circuit synthesizer 203 then performs lifetime analysis of the memory in the interface circuit in a state where the memory element is separated into a separated memory device, and updates the analysis result. The processing then returns to step S304.
If the interface circuit synthesizer 203 determines that the processing performance and the circuit area of the entire circuit satisfy the predetermined conditions in step S305, the interface circuit synthesizer 203 generates an RTL description of the interface circuit by synthesizing the interface circuit in step S307. Subsequently, in step S308, the logic synthesizer 204 of the information processing device logically synthesizes the RTL descriptions of the circuit modules obtained in step S301 and the RTL description of the interface circuit obtained in step S307 and outputs circuit data. The processing then ends.
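As a non-limiting illustration, the loop formed by steps S304 to S306 may be sketched in Python as follows. The callables `interface_interval` and `pick_element` and the cap `max_separations` are hypothetical stand-ins: the first represents the lifetime analysis, the second the selection of a memory element in step S306, and the third the circuit-area condition of step S305, none of which is specified in this form above.

```python
def refine_interface(module_intervals, interface_interval, pick_element, max_separations):
    """Repeat steps S304 to S306 until the interface circuit is no longer the bottleneck.

    module_intervals:   minimum operation start intervals of the circuit modules
    interface_interval: hypothetical callable, set of separated elements -> minimum interval
    pick_element:       hypothetical callable, set of separated elements -> next element
    max_separations:    assumed cap standing in for the circuit-area condition of S305
    """
    separated = set()
    while True:
        # S304: every circuit runs at the largest of the minimum intervals.
        target = max(module_intervals)
        interval = max(target, interface_interval(separated))
        # S305: stop when the interface circuit no longer limits the interval,
        # or when no further storage elements may be added.
        if interval <= target or len(separated) >= max_separations:
            return interval, separated
        # S306: move one memory element into a separated storage element.
        separated.add(pick_element(separated))


# Toy model: separating element 0 lowers the interface interval from 10 to 3.
result = refine_interface([8, 8], lambda s: 3 if 0 in s else 10, lambda s: 0, 4)
print(result)  # (8, {0})
```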
Hereinafter, circuit synthesis performed by the information processing device according to the present embodiment will be specifically described by taking, as an example, a case where circuits that perform the processing described in the input descriptions illustrated in
In the present embodiment, it is assumed that a task does not perform both writing and reading to/from the memory in one interface circuit, and can perform only one of writing and reading to/from the memory in one interface circuit. In addition, it is assumed that a task can perform writing on each memory element in the memory in an interface circuit only once during one performance of the task, and can perform reading on each memory element in the memory in an interface circuit more than once during one performance of the task.
The input description 401 of the task A indicates that the task A uses data in[0] to in[7] as input and data tmp12[0] to tmp12[7] as output, and repeatedly performs arithmetic processing using data in[j] to output an arithmetic result as data tmp12[j] eight times with increment of the value j from 0 by 1. The input description 402 of the task B indicates that the task B uses data tmp12[0] to tmp12[7] as input, and data tmp23[0] to tmp23[7] as output, and repeatedly performs arithmetic processing using data tmp12[j] and tmp12[(j+1)%8] (% is a modulo operator) to output an arithmetic result as data tmp23[j] eight times with increment of the value j from 0 by 1.
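As a non-limiting illustration, the behavior described for the input descriptions 401 and 402 may be sketched in Python as follows. The arithmetic operations `op_a` and `op_b` are placeholders chosen only for this sketch, because the actual arithmetic processing of the task A and the task B is not stated above.

```python
N = 8


def op_a(x):
    # Placeholder arithmetic; the actual operation of the task A is not given.
    return x + 1


def op_b(x, y):
    # Placeholder arithmetic; the actual operation of the task B is not given.
    return x + y


def task_a(inp):
    """tmp12[j] is computed from in[j] for j = 0 to 7."""
    tmp12 = [0] * N
    for j in range(N):
        tmp12[j] = op_a(inp[j])
    return tmp12


def task_b(tmp12):
    """tmp23[j] is computed from tmp12[j] and tmp12[(j + 1) % 8] for j = 0 to 7."""
    tmp23 = [0] * N
    for j in range(N):
        tmp23[j] = op_b(tmp12[j], tmp12[(j + 1) % N])
    return tmp23


tmp12 = task_a(list(range(N)))
tmp23 = task_b(tmp12)
print(tmp12)  # [1, 2, 3, 4, 5, 6, 7, 8]
print(tmp23)  # [3, 5, 7, 9, 11, 13, 15, 9]
```

The sketch makes visible that tmp12[0] is read twice by the task B, once for j = 0 and once for j = 7, while every element is written only once by the task A.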
When the input descriptions 401 and 402 are input to the high-level synthesizer 201 with specification that input data is stored in a memory (RAM) and output data is to be stored in the memory (RAM), the high-level synthesizer 201 synthesizes a circuit module that performs the processing of the task A and a circuit module that performs the processing of the task B. An example of the synthesis result of the high level synthesis by the high-level synthesizer 201 is illustrated in
The register (reg0) 511 is a register that stores data read from the memory 520 as input data. The combinational logic circuit (logic) 512 is a circuit that performs arithmetic processing using the data stored in the register (reg0) 511, and data obtained as the arithmetic result is stored in the register (reg1) 513. The register (reg1) 513 is a register that stores data to be written to the memory 530 as output data.
As illustrated in
Since the circuit module 510 operates in a pipeline, data is read from the memory element in[1] of the memory 520 in cycle 1, data is read from the memory element in[2] of the memory 520 in cycle 2, and in the following cycles, data is similarly read from the memory elements in[3] to in[7] of the memory 520 sequentially, and the above-described processing is repeated. Then, in cycle 10, data is written from the register 513 to the memory element tmp12[7] of the memory 530, and thus one performance of the processing of the task A is completed.
As described above, although one performance of the processing of the task A takes eleven cycles, the circuit module 510 can perform pipeline operation at the task level, and therefore, as illustrated in
The register (reg0) 611 is a register that stores data read from the memory 620 as input data. The combinational logic circuit (logic) 612 is a circuit that performs arithmetic processing using the data stored in the register (reg0) 611, and data obtained as the arithmetic result is stored in the register (reg1) 613. The register (reg1) 613 is a register that stores data to be written to the memory 630 as output data.
As illustrated in
Since the circuit module 610 operates in a pipeline, data is read from the memory elements tmp12[1] and tmp12[2] of the memory 620 in cycle 1, and in the following cycles, data is similarly read from the memory elements of the memory 620 sequentially, and the above-described processing is repeated. Then, data is read from the memory elements tmp12[7] and tmp12[0] of the memory 620 in cycle 7, and arithmetic processing using the read data is performed, and in cycle 10, data is written from the register 613 to the memory element tmp23[7] of the memory 630, and thus one performance of the processing of the task B is completed.
As described above, although one performance of the processing of the task B takes eleven cycles, the circuit module 610 can perform pipeline operation at the task level, and therefore, as illustrated in
When the schedule information of each of the task A and task B as described above is obtained, the schedule information extractor 202 obtains the write trace data and the read trace data related to the memory of the interface circuit based on the schedule information. The write trace data can be obtained by extracting the writing of data from the circuit module 510 to the memory 530 from the schedule information of the task A on the data output side, as illustrated in
Next, the interface circuit synthesizer 203 analyzes the lifetime of the memory using the obtained write trace data and read trace data related to the memory of the interface circuit to obtain information regarding the time period during which data is held in each memory element. The task A and the task B have a dependency that the processing of the task B is started after the processing of the task A is started. Therefore, assuming that the processing of the task A is started in cycle 0 and the processing of the task B is started in cycle t (t>0), the write trace data and the read trace data can be converted as follows.
Since data in a memory in an interface circuit is read after it is written, and the processing of the task B is preferably started as soon as possible from the viewpoint of processing performance, the minimum value t that satisfies the following expression is obtained.
For example, the expression is expanded as follows.
By solving this expression, t=2 is obtained. Therefore, as a result of the memory lifetime analysis, the writing and reading of data to/from the memory of the interface circuit are performed at timings as illustrated in
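As a non-limiting illustration, the search for the minimum value t may be sketched in Python as follows. The traces used here are assumptions made only for this sketch: the task A is assumed to write the j-th memory element in cycle j counted from its first write, and the task B is assumed to read the j-th and ((j+1)%8)-th memory elements in its own cycle j. The condition is taken to be that every read occurs strictly after the corresponding write.

```python
N = 8

# Assumed write trace: element j is written in cycle j.
write_trace = {j: j for j in range(N)}

# Assumed read trace as (cycle relative to the start of task B, element) pairs.
read_trace = [(j, j) for j in range(N)] + [(j, (j + 1) % N) for j in range(N)]


def min_start_offset(write_trace, read_trace):
    """Smallest t > 0 such that each read at cycle t + r follows the write of its element."""
    t = 1
    while not all(t + cycle > write_trace[element] for cycle, element in read_trace):
        t += 1
    return t


print(min_start_offset(write_trace, read_trace))  # 2
```

Under these assumed traces, the binding constraint is the read of the (j+1)-th memory element in cycle t + j, which has to follow its write in cycle j + 1.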
When the minimum operation start interval in the interface circuit is determined based on the lifetime analysis result, the interface circuit synthesizer 203 determines an operation start interval of each of the circuit module 510 that performs processing of the task A, the circuit module 610 that performs processing of the task B, and the interface circuit that connects the circuit module 510 and the circuit module 610. In this example, the minimum operation start interval of the circuit modules 510 and 610 is eight cycles while the minimum operation start interval of the interface circuit is ten cycles. Therefore, the operation start intervals of the circuit module 510, the circuit module 610, and the interface circuit are determined to be ten cycles, which is the largest among these minimum operation start intervals.
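As a non-limiting illustration, the determination described above may be sketched in Python as follows. The access timings are assumptions made only for this sketch (the j-th memory element written in cycle j, and read in cycles t + j and t + j - 1 with t = 2), and the rule that a memory element may be rewritten only after its last read is one plausible reading of the lifetime analysis, not a statement of the actual calculation.

```python
N = 8
T = 2  # start offset of the task B, as obtained in the text above

# Assumed access timings of the memory in the interface circuit.
write_cycle = {j: j for j in range(N)}
read_cycles = {j: [] for j in range(N)}
for j in range(N):
    read_cycles[j].append(T + j)            # read as tmp12[j]
    read_cycles[(j + 1) % N].append(T + j)  # read as tmp12[(j + 1) % 8]


def interface_min_interval(write_cycle, read_cycles):
    # The next write to an element has to come after its last read, so the
    # interval has to exceed the longest (last read - write) distance.
    return max(max(read_cycles[e]) - write_cycle[e] for e in write_cycle) + 1


def operation_start_interval(*min_intervals):
    # All circuits run at the largest of the minimum operation start intervals.
    return max(min_intervals)


iface = interface_min_interval(write_cycle, read_cycles)
print(iface)                                  # 10
print(operation_start_interval(8, 8, iface))  # 10
```

Under these assumptions, the 0-th memory element dominates, because it is written in cycle 0 and read last in cycle 9.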
In this case, the operation timings of the circuit module that performs the processing of the task A, the interface circuit, and the circuit module that performs the processing of the task B are as illustrated in
While the minimum operation start interval of the circuit modules 510 and 610 is eight cycles, the minimum operation start interval of the interface circuit is ten cycles. In such a case where the minimum operation start interval of the interface circuit is larger than the minimum operation start interval of the circuit module, it is possible to prevent the interface circuit from being a bottleneck by providing a separated memory device (storage element) in the interface circuit based on the minimum operation start interval of the circuit modules.
Hereinafter, a method of improving the minimum operation start interval in the interface circuit by providing a separated memory device in the interface circuit will be described. The interface circuit synthesizer 203 selects the memory element whose access timing to the memory of the interface circuit is the first to collide with the processing of the next performance of the task. In this example, as illustrated in
Reading is then controlled such that after data is read from the 0-th memory element of the memory in the interface circuit in cycle 2, the data is written to a separated memory device x0, and in cycle 9, reading is not performed on the 0-th memory element but data is read from the separated memory device x0. With this control, the access timings in the interface circuit illustrated in
The minimum operation start interval of the interface circuit calculated based on the changed lifetime is three cycles. Here, the lifetime of the memory device x0 is not considered. Therefore, in the case of pipeline operation, access to the interface circuit at access timings as illustrated in
When the interface circuit is accessed as illustrated in
As described above, the minimum operation start interval in the interface circuit can be improved by providing a separated memory device x0 in the interface circuit. The interface circuit synthesizer 203 determines the operation start interval of each of the circuit module 510, the circuit module 610, and the interface circuit using the improved minimum operation start interval of the interface circuit again. Since the minimum operation start interval of the circuit modules 510 and 610 is eight cycles and the minimum operation start interval of the interface circuit is three cycles, the operation start interval of the circuit module 510, the circuit module 610, and the interface circuit is determined to be eight cycles, which is the largest among the minimum operation start intervals.
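As a non-limiting illustration, the effect of the separated memory device x0 may be sketched in Python as follows. The access timings are assumptions made only for this sketch (the j-th memory element written in cycle j, and read in cycles j + 2 and j + 1), and the model assumes that only the first read of a separated memory element reaches the memory while later reads are served by the separated memory device, whose own lifetime is not considered.

```python
N = 8
T = 2  # start offset of the task B in this example


def reads_from_memory(separated):
    """Cycles at which each memory element is read from the memory itself."""
    reads = {j: [] for j in range(N)}
    for j in range(N):
        reads[j].append(T + j)
        reads[(j + 1) % N].append(T + j)
    for element in separated:
        # Later reads are served by the separated memory device (assumed model).
        reads[element] = [min(reads[element])]
    return reads


def interface_min_interval(separated=frozenset()):
    reads = reads_from_memory(separated)
    # Element j is assumed to be written in cycle j.
    return max(max(reads[j]) - j for j in range(N)) + 1


before = interface_min_interval()
after = interface_min_interval({0})
print(before, after)                        # 10 3
print(max(8, 8, before), max(8, 8, after))  # 10 8
```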
In this case, it is sufficient that the operation start interval of the interface circuit is set to eight cycles, and the access timings in the interface circuit are as illustrated in
Thus, the interface circuit synthesizer 203 generates an RTL description of the interface circuit having an improved operation start interval due to provision of the separated memory device x0. The logic synthesizer 204 then logically synthesizes the RTL description of the interface circuit and the RTL descriptions of the circuit modules 510 and 610 to generate circuit data of a circuit that performs the processing described by the input descriptions 401 and 402 illustrated in
The circuit module 1220 receives a write ready signal WRDY from the interface circuit 1210, and outputs a write enable signal WEN, a write address WA, and write data WD to the interface circuit 1210. The circuit module 1230 receives a read ready signal RRDY and read data RD from the interface circuit 1210, and outputs a read enable signal REN and a read address RA to the interface circuit 1210. The number of ports related to the read address RA and the read data RD and provided in the interface circuit 1210 and the circuit module 1230 is set in accordance with the number of data pieces to be read simultaneously in the processing of the task B.
The interface circuit 1210 includes a control circuit 1211, a memory 1212, a FIFO buffer 1213, and multiplexers 1214 and 1215. The control circuit 1211 notifies the circuit module 1220 that data can be written to the interface circuit 1210 by outputting the write ready signal WRDY, and notifies the circuit module 1230 that data can be read from the interface circuit 1210 by outputting a read ready signal RRDY. The control circuit 1211 also controls the FIFO buffer 1213 and the multiplexers 1214 and 1215 to control writing and reading of data to/from the FIFO buffer 1213. The write ready signal WRDY indicates that data can be written when the value is “1”, and the read ready signal RRDY indicates that data can be read when the value is “1”.
The memory 1212 and the FIFO buffer 1213 store data to be transferred from the circuit module 1220 to the circuit module 1230. When the write enable signal WEN that is input is “1”, the memory 1212 stores the write data WD in a memory element specified by the write address WA. When a read enable signal REN that is input is “1”, the memory 1212 reads data from a memory element specified by a read address RA and outputs the read data as a first piece of read data RD.
When a FIFO write enable signal FWEN received from the control circuit 1211 is “1”, the FIFO buffer 1213 stores data output from the multiplexer 1215. When a FIFO read enable signal FREN received from the control circuit 1211 is “1”, the FIFO buffer 1213 outputs the oldest piece of data among the stored pieces of data as a second piece of read data RD.
According to a control signal SEL2 output from the control circuit 1211, the multiplexer 1214 outputs either the first piece of read data RD output from the memory 1212 or the second piece of read data RD output from the FIFO buffer 1213 to the circuit module 1230 as read data RD. The multiplexer 1214 outputs the first piece of read data RD as the read data RD to the circuit module 1230 when the control signal SEL2 is “0”, and outputs the second piece of read data RD as the read data RD to the circuit module 1230 when the control signal SEL2 is “1”. Here, the control circuit 1211 sets the FIFO read enable signal FREN and the control signal SEL2 to “1” at a predetermined timing determined in advance based on the schedule information.
According to a control signal SEL1 output from the control circuit 1211, the multiplexer 1215 outputs either the first piece of read data RD output from the memory 1212 or the write data WD output from the circuit module 1220 to the FIFO buffer 1213. The multiplexer 1215 outputs the first piece of read data RD to the FIFO buffer 1213 when the control signal SEL1 is “0”, and outputs the write data WD to the FIFO buffer 1213 when the control signal SEL1 is “1”. By providing the multiplexer 1215, even data that is not read a plurality of times, and thus is read only once in the processing of the task B can be stored in the FIFO buffer 1213.
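As a non-limiting illustration, the data path formed by the memory 1212, the FIFO buffer 1213, and the multiplexers 1214 and 1215 may be modeled behaviorally in Python as follows. The class name `InterfaceDatapath`, the single read port, the combinational memory read, and the ordering of read before write within one cycle are simplifying assumptions made only for this sketch.

```python
from collections import deque


class InterfaceDatapath:
    """Behavioral sketch of memory 1212, FIFO buffer 1213, and multiplexers 1214/1215."""

    def __init__(self, size):
        self.memory = [None] * size
        self.fifo = deque()

    def cycle(self, wen=0, wa=0, wd=None, ren=0, ra=0,
              fwen=0, fren=0, sel1=0, sel2=0):
        # Memory read: first piece of read data.
        mem_rd = self.memory[ra] if ren else None
        # FIFO read: second piece of read data (oldest stored entry).
        fifo_rd = self.fifo.popleft() if fren and self.fifo else None
        # Multiplexer 1215: SEL1 = 0 selects memory read data, 1 selects write data.
        if fwen:
            self.fifo.append(wd if sel1 else mem_rd)
        # Memory write.
        if wen:
            self.memory[wa] = wd
        # Multiplexer 1214: SEL2 = 0 selects memory read data, 1 selects FIFO data.
        return fifo_rd if sel2 else mem_rd


dp = InterfaceDatapath(8)
dp.cycle(wen=1, wa=0, wd=42)            # task A writes element 0
print(dp.cycle(ren=1, ra=0, fwen=1))    # 42, and a copy enters the FIFO
print(dp.cycle(fren=1, sel2=1))         # 42, served from the FIFO
```

The last call corresponds to the second read of the same data, which is served by the FIFO buffer without accessing the memory.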
The read ready signal generation circuit 1303 operates to set the read ready signal RRDY to “1” after a predetermined number of cycles have elapsed since the write ready signal WRDY is changed to “1”. The read ready signal generation circuit 1303 counts the number of cycles elapsed after the write ready signal WRDY is changed to “1” by a counter including a combinational circuit 1304, a flip flop 1305, and an adder 1307. Further, the read ready signal generation circuit 1303 determines whether the predetermined number of cycles have elapsed by comparing the count value of the number of elapsed cycles with a predetermined parameter value using a comparator 1306.
Upon determination that the predetermined number of cycles has elapsed, the read ready signal generation circuit 1303 sets the output of a flip flop 1308 for the read ready signal RRDY to “1”. The output of the flip flop 1308 is output to the circuit module 1230 as the read ready signal RRDY. The parameter value used for comparison by the comparator 1306 is a constant value determined based on schedule information or the like. Similarly, the other signals output from the control circuit 1211 may be generated from a counter value or a state by providing a counter or the like in the control circuit 1211.
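As a non-limiting illustration, the behavior of the read ready signal generation circuit 1303 may be modeled in Python as follows. The function name `rrdy_waveform` and the exact cycle in which the signal RRDY rises relative to the comparator match are assumptions made only for this sketch; the register-level timing of the actual circuit may differ.

```python
def rrdy_waveform(wrdy, delay):
    """Return the value of RRDY in each cycle for a given WRDY waveform.

    wrdy:  list of 0/1 values of WRDY, one per cycle
    delay: parameter value compared by the comparator (number of cycles)
    """
    count = None  # counter state; None until WRDY has been observed at 1
    out = []
    for w in wrdy:
        if count is None and w:
            count = 0                      # counting starts when WRDY becomes 1
        elif count is not None and count < delay:
            count += 1                     # adder increments the counter
        # Comparator: RRDY is 1 once the count has reached the parameter value.
        out.append(1 if count is not None and count >= delay else 0)
    return out


print(rrdy_waveform([0, 1, 1, 1, 1, 1], 2))  # [0, 0, 0, 1, 1, 1]
```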
When a start signal START indicating start of processing is changed to “1”, the control circuit 1211 sets the write ready signal WRDY to “1”. Thereafter, the processing of the task A is performed in the circuit module 1220, the write enable signal WEN output from the circuit module 1220 is changed to “1” in cycle 1, and the output data of the circuit module 1220 is written to the 0-th memory element of the memory 1212 specified by the write address WA. Thereafter, pieces of output data of the circuit module 1220 are written to memory elements of the memory 1212 specified by the write addresses WA similarly.
A period from the start of the processing of the task A (after the write ready signal WRDY is changed to “1”) to the writing of data to the memory 1212 that allows the start of the processing of the task B (five cycles in this example) has elapsed by cycle 2. In cycle 2, the control circuit 1211 sets the read ready signal RRDY to “1”. In the next cycle, cycle 3, the read enable signal REN output from the circuit module 1230 that has received the read ready signal RRDY of “1” is changed to “1”, and pieces of data are read from the 0-th memory element and the first memory element of the memory 1212 specified by read addresses RA0 and RA1 and output to the circuit module 1230 via the multiplexer 1214. Thereafter, pieces of data are similarly read from memory elements of the memory 1212 specified by the read addresses RA0 and RA1 and output to the circuit module 1230 via the multiplexer 1214.
In cycle 3, the control circuit 1211 sets the FIFO write enable signal FWEN to “1”. Thus, the data read from the 0-th memory element of the memory 1212 is written to the FIFO buffer 1213 via the multiplexer 1215. Then, in cycle 10 in which the 0-th memory element of the memory 1212 is specified again as the read address RA1, the control circuit 1211 sets both the FIFO read enable signal FREN and the control signal SEL2 to “1”. Thus, the data written to the FIFO buffer 1213 in cycle 3 is output to the circuit module 1230 via the multiplexer 1214 as data of the 0-th memory element of the memory 1212.
Since the operation start interval of the interface circuit 1210 and the circuit modules 1220 and 1230 is eight cycles, each signal is similarly controlled and processing is performed every eight cycles thereafter.
The control circuit 1211 sets the write ready signal WRDY to “0” when the number of cycles corresponding to the operation start interval of the circuit module 1220 (task A) has elapsed after the write ready signal WRDY is changed to “1”. In addition, the control circuit 1211 sets the write ready signal WRDY to “1” when the number of cycles corresponding to (operation start interval of the interface circuit 1210 - operation start interval of the circuit module 1220) has elapsed after the write ready signal WRDY is changed to “0”. However, since the operation start interval of the interface circuit 1210 and the operation start interval of the circuit module 1220 are the same in this example, the write ready signal WRDY remains “1”.
Further, the control circuit 1211 sets the read ready signal RRDY to “0” when the number of cycles corresponding to the operation start interval of the circuit module 1230 (task B) has elapsed after the read ready signal RRDY is changed to “1”. In addition, the control circuit 1211 sets the read ready signal RRDY to “1” when the number of cycles corresponding to (operation start interval of the interface circuit 1210 - operation start interval of the circuit module 1230) has elapsed after the read ready signal RRDY is changed to “0”. However, since the operation start interval of the interface circuit 1210 and the operation start interval of the circuit module 1230 are the same in this example, the read ready signal RRDY remains “1”.
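As a non-limiting illustration, the periodic control of a ready signal described above may be sketched in Python as follows. The function name `ready_waveform` is hypothetical, cycle 0 of the waveform is taken to be the cycle in which the ready signal is first changed to “1”, and the interface interval is assumed to be no smaller than the module interval.

```python
def ready_waveform(module_interval, interface_interval, cycles):
    """Ready signal per cycle: 1 for module_interval cycles, then 0 for the
    remaining (interface_interval - module_interval) cycles of each period."""
    return [1 if (c % interface_interval) < module_interval else 0
            for c in range(cycles)]


# Equal intervals, as in this example: the ready signal remains 1.
print(ready_waveform(8, 8, 16))
# Hypothetical case with an interface interval of ten cycles: 0 for two cycles per period.
print(ready_waveform(8, 10, 12))  # [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1]
```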
In the embodiment described above, the larger of the minimum operation start interval of the circuit module that performs the processing of the task A and the minimum operation start interval of the circuit module that performs the processing of the task B is used as the operation start interval of each of the circuit module performing the processing of the task A, the circuit module performing the processing of the task B, and the interface circuit. However, the operation start interval can be made smaller by using a plurality of instances. For example, the operation start interval may be set to three cycles, which is the minimum operation start interval of the interface circuit.
As illustrated in
In
The circuit modules 1601-1, 1601-2, and 1601-3 supply output data obtained as the processing result to the circuit modules 1603-1, 1603-2, and 1603-3 each performing the processing of the task B via the interface circuit 1602. The circuit modules 1603-1, 1603-2, and 1603-3 perform processing using output data output from the circuit modules 1601-1, 1601-2, and 1601-3 respectively, and output the processing result to a memory 1605.
With this configuration, it is possible to reduce the time required for transferring data from the circuit modules that perform the processing of the task A to the circuit modules that perform the processing of the task B. Also, since the number of instances of the interface circuit responsible for data transfer is one, the circuit area can be small.
Assuming that the operation start interval of the circuit module that performs the processing of the task A, the circuit module that performs the processing of the task B, and the interface circuit is eight cycles, the access timings in the interface circuit are as illustrated in
In the above description, a case where circuits that perform the processing described in the input descriptions illustrated in
The values of a function schedule (task, resource, t) are listed in a matrix M having circuit resources in the row direction and time (cycle) in the column direction, and an element in row i and column j of the matrix M is represented by M[i, j]. The function schedule (task, resource, t) is a function that returns 1 when the circuit resource is used in the task at time t, and returns 0 when the circuit resource is not used. For example, the schedule information of the task A described above can be represented by a matrix having 19 rows and 11 columns as illustrated in
The operation start interval of a task is an index value indicating the time interval at which the task is started when the processing is repeated at the task level. In a case where the task having schedule information represented by the matrix M starts at time 0 and then starts again at time t, the circuit resources used at each time can be obtained by calculating M+shiftright (M, t). The function shiftright (M, t) is a function that returns a matrix obtained by shifting the elements of the matrix M to the right by t columns, and indicates the operating status of the circuit resources when the task having schedule information represented by the matrix M starts at time t. In addition, the values of matrix elements that become empty due to the shift by the function shiftright are set to 0. Thus, when the matrix M has i rows and j columns, the function is expressed as follows.
When the matrix M and the matrix obtained by shifting the matrix elements to the right are added, matrices with different numbers of columns are added. In such addition, a matrix element having no corresponding matrix element is added with an element value 0.
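The shift and the zero-padded addition described above may be sketched as follows. The helper names `shiftright` and `add_padded`, the representation of a matrix as a list of rows, and the small matrix `m` are assumptions for illustration; the shifted matrix is assumed to grow by t columns, which is why the two operands have different numbers of columns.

```python
def shiftright(m, t):
    """Shift every row of m right by t columns, filling the vacated columns with 0."""
    return [[0] * t + list(row) for row in m]


def add_padded(a, b):
    """Add two matrices with equal row counts; missing elements count as 0."""
    width = max(len(a[0]), len(b[0]))
    result = []
    for row_a, row_b in zip(a, b):
        padded_a = list(row_a) + [0] * (width - len(row_a))
        padded_b = list(row_b) + [0] * (width - len(row_b))
        result.append([x + y for x, y in zip(padded_a, padded_b)])
    return result


# Hypothetical schedule: 2 circuit resources (rows) over 3 cycles (columns).
m = [[1, 1, 0],
     [0, 1, 1]]
print(add_padded(m, shiftright(m, 2)))  # [[1, 1, 1, 1, 0], [0, 1, 1, 1, 1]]
print(add_padded(m, shiftright(m, 1)))  # [[1, 2, 1, 0], [0, 1, 2, 1]]
```

In this hypothetical schedule, an element value of 2 at a shift of 1 indicates that the same circuit resource would be used by two overlapping executions in the same cycle.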
For example, M+shiftright (M, 11) representing a case where the operation start interval is set to 11 for the above-described task A is as illustrated in
The write trace data of the first task on the data output side is represented by write (x)=wx (x=0, 1, 2, . . . , n−1, where wx are real numbers that represent time). write (x)=wx indicates that data is written to the x-th memory element (memory address x) at time wx. The read trace data of the second task on the data input side that receives the output data of the first task is represented by read (x)={rx0, rx1, . . . , rxk−1} (x=0, 1, 2, . . . , n−1, where rxj are real numbers that represent time). read (x)={rx0, rx1, . . . , rxk−1} indicates that data is read from the x-th memory element (memory address x) at times rx0, rx1, . . . , rxk−1. The number of reads may vary for each memory element (memory address). It is assumed, without loss of generality, that the elements {rx0, rx1, . . . , rxk−1} on the right side of read (x) are sorted in ascending order. Thus, rxj<rxj+1 (0≤j≤k−2). The minimum value rx0 is represented by rxmin, and the maximum value rxk−1 is represented by rxmax.
When the operation start time of the first task is represented by t1 and the operation start time of the second task is represented by t2, the data of all memory elements (memory addresses) have to satisfy a dependency that the reading is performed by the second task after writing is performed by the first task. This is a condition for correct data transfer from the first task to the second task. With interpretation of each of write trace data and read trace data as time, the minimum value of t is calculated based on the following simultaneous inequalities that satisfy t1+write (x)<t2+read (x) (x=0, 1, 2, . . . , n−1).
This can be simplified as follows.
$$ t_1 + w_x < t_2 + r_x^{\min} \qquad (x = 0, 1, 2, \ldots, n-1) $$ [Expression 7]
Here, by using a variable t that satisfies t2-t1=t, the expression can be converted as follows.
$$ (t_2 - t) + w_x < t_2 + r_x^{\min} $$
$$ w_x < t + r_x^{\min} \qquad (x = 0, 1, 2, \ldots, n-1) $$ [Expression 8]
The minimum value of t among the solutions is represented by twrite. For example, twrite is 5 for the task A and the task B described above, with t1=−3 and t2=2, because it is assumed for convenience that data is written to the 0-th memory element (memory address 0) at time 0. The lifetime of the x-th memory element, lifetime (x), is expressed as lifetime (x)=[t1+wx, t2+rxmax]. rxmax is the maximum value of read (x), that is, max ({rx0, rx1, . . . , rxk−1}). lifetime (x)=[t1+wx, t2+rxmax] indicates that data of the x-th memory element (memory address x) needs to be held from time t1+wx to time t2+rxmax.
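The calculation of twrite and of the lifetimes may be sketched as follows. The sketch assumes that all times are integer cycle counts, so that the smallest t satisfying the strict inequality wx<t+rxmin for every x is max(wx−rxmin)+1. The function names and the trace values are hypothetical and are not the traces of the task A and the task B.

```python
def min_start_offset(write, read):
    """Smallest integer t such that write[x] < t + min(read[x]) for every x."""
    return max(w - min(r) for w, r in zip(write, read)) + 1


def lifetimes(write, read, t1, t2):
    """lifetime(x) = [t1 + wx, t2 + rxmax] for each memory element x."""
    return [(t1 + w, t2 + max(r)) for w, r in zip(write, read)]


# Hypothetical traces for n = 3 memory elements (integer cycles assumed).
write = [3, 4, 5]            # write(x) = wx
read = [[0, 2], [1], [3]]    # read(x) = {rx0, ..., rxk-1}

t_write = min_start_offset(write, read)
print(t_write)  # 4
# With t1 = 0 the second task starts at t2 = t1 + t_write.
print(lifetimes(write, read, t1=0, t2=t_write))  # [(3, 6), (4, 5), (5, 7)]
```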
Here, for the interface circuit between the circuit module performing the processing of the first task and the circuit module performing the processing of the second task, trace data trace (if12, x) indicating when data is written to and read from the x-th memory element (memory address x) is expressed as trace (if12, x)=[t1+wx, {t2+rx0, t2+rx1, . . . , t2+rxk−1}]. Thus, the trace data includes the write trace data of the first task that starts operation at time t1 and the read trace data of the second task that starts operation at time t2.
The operation start interval when the processing of the interface circuit is repeatedly performed is calculated assuming that the processing starts for the first time at time 0 and starts for the second time at time t. At this time, in order for data to be normally transferred from the first task to the second task, the order relationship has to be satisfied in which the lifetimes of the memory elements in the second processing start after the lifetimes of the memory elements in the first processing end. Thus, since lifetime (x)=[t1+wx, t2+rxmax], the minimum value of t that satisfies the following simultaneous inequalities is the minimum value of the operation start interval of the interface circuit.
$$ t_2 + r_x^{\max} < t + t_1 + w_x \qquad (x = 0, 1, 2, \ldots, n-1) $$ [Expression 9]
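Expression 9 may be evaluated as in the following sketch. It assumes integer cycle times, so the smallest t satisfying the strict inequality for every x is max(t2+rxmax−t1−wx)+1. The function name and the trace values are hypothetical.

```python
def min_interface_interval(write, read, t1, t2):
    """Smallest integer t with t2 + max(read[x]) < t + t1 + write[x] for every x."""
    return max(t2 + max(r) - (t1 + w) for w, r in zip(write, read)) + 1


# Hypothetical traces for n = 3 memory elements (integer cycles assumed).
write = [3, 4, 5]
read = [[0, 2], [1], [3]]
print(min_interface_interval(write, read, t1=0, t2=4))  # 4
```

In this hypothetical case the 0-th memory element, whose lifetime is [3, 6], determines the result: its next write at time t+3 has to come after time 6, which gives t=4.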
When an array variable corresponding to the data to be transferred from the first task to the second task is represented by tmp12, the schedule information of the interface circuit can be represented by the function schedule (if, tmp12[x], t) (x=0, 1, 2, . . . , n−1). The size of tmp12 is n. The function schedule (if, tmp12[x], t) returns 1 when data is written to the circuit resource tmp12[x] at time t, returns 4 when data is read from the circuit resource tmp12[x] at time t, returns 16 when data is held in the circuit resource tmp12[x] at time t, and returns 0 otherwise as illustrated in
When schedule information of the interface circuit between the circuit module that performs the processing of the first task and the circuit module that performs the processing of the second task is expressed in a matrix having circuit resources in the row direction and time (cycle) in the column direction, the matrix is represented by M (if12). The number of columns of the matrix M (if12) is represented by c (M (if12)). For example, the schedule information of the interface circuit between the task A and the task B described above is as illustrated in
The operation start interval when the processing of the interface circuit is repeatedly performed is calculated based on M (if12)+shiftright (M (if12), t). If all values of the matrix elements as a result of calculating M (if12)+shiftright (M (if12), t) are any one of 0, 1, 4, and 16, the schedule is possible, but if any value other than the values (for example, 2, 5, 8, 17, and 20) is included, the schedule is impossible.
First, time t is defined as t=c (M (if12)). Next, it is determined whether a schedule of M (if12)+shiftright (M (if12), t) is possible. If the schedule of M (if12)+shiftright (M (if12), t) is possible as a result, the time t is updated to t=t−1, and it is determined whether schedule M (if12)+shiftright (M (if12), t) is possible. The determination is repeated with decrement of the value of t by 1 until it is determined that the schedule of M (if12)+shiftright (M (if12), t) is impossible. When it is determined that the schedule of M (if12)+shiftright (M (if12), t) is impossible, the memory element (memory address) of the row where any value of the matrix elements becomes 2, 5, 8, 17, or 20 is determined as a memory element to be separated.
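The determination procedure described above may be sketched as follows, using the encoding in which 1 denotes a write, 4 a read, 16 a hold, and 0 no use. The function names, the stopping guard at t=1, and the small matrix `m` are assumptions for illustration and do not reproduce the schedule of the task A and the task B.

```python
ALLOWED = {0, 1, 4, 16}  # idle, write, read, hold


def overlap(m, t):
    """Compute M + shiftright(M, t), treating missing elements as 0."""
    result = []
    for row in m:
        base = list(row) + [0] * t
        shifted = [0] * t + list(row)
        result.append([a + b for a, b in zip(base, shifted)])
    return result


def is_schedulable(m, t):
    """True if every element of M + shiftright(M, t) is 0, 1, 4, or 16."""
    return all(v in ALLOWED for row in overlap(m, t) for v in row)


def find_min_interval(m):
    """Decrement t from the column count of M until the schedule becomes impossible.

    Returns the smallest feasible t found and the rows (memory elements) that
    conflict at t - 1, which are the candidates to be separated.
    """
    t = len(m[0])
    while t > 1 and is_schedulable(m, t - 1):
        t -= 1
    conflict_rows = []
    if t > 1:
        conflict_rows = [i for i, row in enumerate(overlap(m, t - 1))
                         if any(v not in ALLOWED for v in row)]
    return t, conflict_rows


# Hypothetical interface schedule: 2 memory elements over 5 cycles.
m = [[1, 16, 16, 4, 0],   # element 0: written at 0, held at 1-2, read at 3
     [0, 1, 4, 0, 0]]     # element 1: written at 1, read at 2
print(find_min_interval(m))  # (4, [0])
```

In this hypothetical schedule, a shift of 3 makes the read of the 0-th memory element coincide with its next write (1+4=5), so that row is reported as the memory element to be separated.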
For example, in the interface circuit between the task A and the task B described above, the state of M (if12)+shiftright (M (if12), 10) is illustrated in
A method of improving the operation start interval of the interface circuit by providing a separated memory device will be described. A row vector for the 0-th memory element of the matrix M (if12) indicating the schedule information of the interface circuit is represented by M (if12) [x0]. A separated memory device for storing data of the 0-th memory element is newly introduced.
The time when writing to the 0-th memory element is performed is represented by wt0, the time when reading from the memory element is performed for the last time is represented by rt0, and the time when reading from the memory element is performed for the second-to-last time is represented by rt1. The operation of the interface circuit is then changed as follows. The data read from the 0-th memory element at time rt1 is written to the separated memory device x0, and the lifetime of the 0-th memory element is regarded as ending at that point. Thus, lifetime (x0)=[wt0, rt1]. In addition, at time rt0, data is read from the separated memory device x0 and passed to the second task. The schedule information of the memory in the interface circuit that operates in this way is represented by M2 (if12), and the schedule information of the separated memory device is represented by M2 (FIFO (x0)).
In a case where the data of the 0-th memory element is read only once by the second task, the operation of the interface circuit is changed as follows. At time wt0, no data is written to the 0-th memory element, and data is written to the separated memory device x0. Then, at time rt0, data is read from the separated memory device x0 and output to the second task. The schedule information of the memory in the interface circuit that operates in this way is represented by M2 (if12), and the schedule information of the separated memory device is represented by M2 (FIFO (x0)).
Next, a method of determining the size of the memory device provided separately from the memory in the interface circuit will be described. As described below, row vectors are added with L set to an integer of 1 or more, and the maximum value among the matrix elements is obtained. Then, the obtained maximum value is divided by 16 and rounded up to the nearest integer to obtain the size of the separated memory device.
$$ \sum_{i=0}^{n-1} \mathrm{shiftright}(M2(\mathrm{FIFO}(x0)),\ L \times i) $$ [Expression 10]
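The size determination may be sketched as follows. The sketch assumes that the row vector of the separated memory device uses the same encoding as the interface schedule (1 for a write, 16 for a hold, 4 for a read) and that the number of summed copies is given as a parameter. The function name `fifo_size` and the row values are hypothetical.

```python
import math


def fifo_size(fifo_row, interval, repeats):
    """Sum `repeats` copies of fifo_row, the i-th shifted right by interval * i,
    then divide the maximum element by 16 and round up."""
    width = len(fifo_row) + interval * (repeats - 1)
    total = [0] * width
    for i in range(repeats):
        offset = interval * i
        for j, value in enumerate(fifo_row):
            total[offset + j] += value
    return math.ceil(max(total) / 16)


# Hypothetical row: written at cycle 0, held during cycles 1-5, read at cycle 6.
row = [1, 16, 16, 16, 16, 16, 4]
print(fifo_size(row, interval=3, repeats=3))  # 2
```

In this hypothetical case, with L=3, at most two held values overlap in any cycle (maximum element 32), so two entries would suffice under the stated assumptions.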
In the example of the task A and the task B described above, assuming that the operation start interval of the interface circuit is 3, the sum row illustrated in
According to the present embodiment, when the minimum operation start interval of the interface circuit is larger than the minimum operation start interval of the circuit module, it is possible to improve processing performance while suppressing increase in circuit area by providing a memory device (storage element) separated from the memory provided in the interface circuit to reduce the operation start interval of the interface circuit. In addition, since write trace data and read trace data are obtained from schedule information, and interface circuits are automatically synthesized based on the write trace data and the read trace data, design steps may be reduced and an interface circuit having an appropriate circuit area may be designed.
When software realizes functions related to configuration synthesis or logic synthesis, and functions related to automatic synthesis (including schedule information extraction) of the interface circuit between modules, the CPU 2501 realizes the functions based on a program by reading data from the main storage 2502 or the auxiliary storage 2503 and using the main storage 2502 as a work area. In the computer illustrated in
In the computer illustrated in
In addition, the embodiment described above is merely one example of implementing the present invention, and the technical scope of the present invention should not be interpreted as limited by the embodiment. That is, the present invention can be implemented in various forms without departing from the technical idea or the main features thereof.
All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2017-048999 | Mar 2017 | JP | national |
This application is a continuation application of International Application PCT/JP2018/002091 filed on Jan. 24, 2018 and designated the U.S., the entire contents of which are incorporated herein by reference. The International Application PCT/JP2018/002091 is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2017-048999, filed on Mar. 14, 2017, the entire contents of which are incorporated herein by reference.
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/JP2018/002091 | Jan 2018 | US |
Child | 16559674 | | US |