The present application is based upon and claims the benefit of priority from the prior Japanese Patent Application No. 2009-229252 filed on Oct. 1, 2009; the entire contents of which are incorporated herein by reference.
1. Field of the Invention
The present invention relates to a distributed processing system.
2. Description of the Related Art
In some conventional distributed processing systems, an application required by a user is input to the system as a program or interconnection information to perform parallel processing. A program written in a high-level programming language can be written without regard to whether the processing system performs sequential processing or parallel processing. If such a program is to be executed in parallel, the portions of the sequentially executed program that can be executed in parallel are automatically extracted, the data and the program are divided, and it is then determined whether or not communication between computation modules is required. This type of method of generating a parallel program is disclosed, for example, in Japanese Patent Application Laid-Open No. 8-328872.
A distributed processing system according to the present invention is a distributed processing system for executing an application, including a processing element capable of performing parallel processing, a control unit, and a client that makes a request for execution of an application to the control unit, wherein the processing element comprises, at least at the time of executing the application: one or more processing blocks that respectively process one or more tasks to be executed by the processing element; a processing block control section that calculates the number of parallel processes based on an index for controlling the number of parallel processes received from the control unit; a division section that, under control of the processing block control section, divides data to be processed that is input to the processing blocks in accordance with the number of parallel processes; and an integration section that, under control of the processing block control section, integrates processed data output from the processing blocks in accordance with the number of parallel processes.
In the following, embodiments of the distributed processing system according to the present invention will be described in detail with reference to the accompanying drawings. It should be understood that this invention is not limited to the embodiments.
In the following description, a JPEG decoding process will be discussed as an application executed by the distributed processing system according to the embodiment. However, the invention can be applied to processes other than the JPEG decoding process.
The distributed processing system according to the embodiment has a parallel processing virtualization processing element (VPE) 30, which is a computation module capable of parallel processing. The VPE 30 has a single input stream and/or a single output stream as an input/output interface to/from processing elements (PE) 21, 22, and 23 as external devices. A client 10 makes a request for execution of an application to a control unit (CU) 40.
The VPE 30 has a control section 31, a division section 32, an integration section 33, and one or more processing blocks (PB) 34, 35, 36, 37 in it.
In the case shown in
The control section (processing block control section) calculates the number of parallel processes according to a policy (an index for controlling the number of parallel processes) given by the control unit. The control section controls the data stream division section or the data stream integration section in accordance with the calculated number of parallel processes to divide the input stream (i.e. data input to the processing blocks) or integrate the output stream (i.e. data output from the processing blocks).
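The division and integration described above can be sketched as follows. This is a minimal illustration, assuming the input stream is an indexable sequence split round-robin across the processing blocks; the function names are illustrative and not taken from the specification.

```python
# Hypothetical sketch of the VPE's division and integration sections.
def divide_stream(input_stream, num_parallel):
    # Division section: one sub-stream per processing block,
    # split round-robin by index.
    return [input_stream[i::num_parallel] for i in range(num_parallel)]

def integrate_stream(sub_streams, num_parallel):
    # Integration section: the inverse of divide_stream, restoring
    # the original ordering of the output stream.
    total = sum(len(s) for s in sub_streams)
    output = [None] * total
    for i, sub in enumerate(sub_streams):
        output[i::num_parallel] = sub
    return output
```

Because the split is deterministic in the stream index, the integration section can restore the original order without tagging each item with metadata.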
In the following, an example of execution of an application will be described with reference to
As shown in
An application, such as the JPEG decoding process, requested by a user to be executed will be referred to as a “service”. A sub-process, such as the entropy decoding, that constitutes a part of the JPEG decoding process, will be referred to as a “task”. In other words, tasks are one or more unit processes that constitute an application.
A unique service ID and a unique task ID are assigned to each service and each task respectively in order to identify the content of their processing. In the case discussed herein, the service ID of the JPEG decoding process is SV-823 and the task IDs of the tasks that constitute the JPEG decoding process are TK-101 to TK-106.
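For illustration, the service-task correspondence could be held as a simple lookup table keyed by service ID; the table and helper names below are hypothetical, with the IDs taken from the example above.

```python
# Illustrative service-task correspondence table (names are assumptions).
SERVICE_TASK_TABLE = {
    # JPEG decoding service and the ordered tasks that constitute it.
    "SV-823": ["TK-101", "TK-102", "TK-103", "TK-104", "TK-105", "TK-106"],
}

def tasks_for_service(service_id):
    # Look up the ordered sequence of tasks that constitutes the service.
    return SERVICE_TASK_TABLE[service_id]
```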
When a request for execution of the JPEG decoding process as a service is made by the client 10, the control unit 40 divides the JPEG decoding process, for example, into a series of tasks TK-101 to TK-106 in accordance with a service-task correspondence table shown in
The tasks are assigned to the processing elements that can execute the respective tasks. The processing elements to which the tasks are assigned include the processing elements PE21, PE22, and PE23 and the VPE 30.
A path between processing elements has a unique pair of input and output, to which an ID that identifies the path (a path ID) is assigned. The control unit 40 generates information on the configuration of the processing path (execution transition information). An example of the execution transition information is shown in
For example, the processing path in performing the JPEG decoding may be configured in a manner shown in
After the execution transition information is created, allocation of computational resources necessary for the process and allocation of processing paths are performed.
The execution transition information in this embodiment includes a task policy. Policies are restrictions placed on the execution of the entire service and on the execution of the tasks in the processing elements, including the parallel processing virtualization processing element (VPE). Policies include a service policy (criteria concerning the execution of an application) that restricts the execution of the entire service and a task policy (an index for controlling the number of parallel processes) that restricts the execution of the task in each processing element. For example, the multiplicity of the parallel processes (i.e., the number of parallel processes) can be specified directly for each task using a task policy. Alternatively, the processing time of the entire service may be specified as a performance index using a service policy; in that case, the control unit optimizes the processing paths so as to complete the process within the specified processing time, automatically determines the number of parallel processes for task execution in each processing element, and thereby automatically generates a task policy. In either case, the task policy that regulates the task execution in the processing elements is eventually determined and notified to the processing elements.
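As a sketch of the second case, in which the control unit derives a task policy from a service-policy limit on processing time, one could pick the smallest number of parallel processes meeting the limit. The linear speedup model and all names here are assumptions for illustration; the actual path optimization is not specified in this text.

```python
# Hedged sketch: derive a task policy (number of parallel processes) from a
# service-policy upper limit on processing time, assuming linear speedup.
def derive_task_policy(sequential_time, max_parallel, time_limit):
    # Choose the smallest parallel count whose estimated time meets the limit.
    for n in range(1, max_parallel + 1):
        if sequential_time / n <= time_limit:
            return {"parallel_min": n, "parallel_max": n}
    # The limit cannot be met with the available parallelism; for the
    # guarantee type of quality assurance this would become an error.
    return None
```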
Now, examples of parameters that can be used in the service policy and the task policy will be described. Table 1 shows parameters contained in the policy.
Examples of the service policy parameters include the type of quality assurance, the upper and lower limits of electrical energy consumption, the upper and lower limits of processing time, and the upper and lower limits of output throughput, as shown in
The service policy is registered in the control unit, together with an ID for identifying the client, before a service execution request is made by the client. If the service(s) to which the service policy applies is (are) to be designated, the service(s) should be registered in the control unit with the service ID(s). If the service(s) to which the service policy applies is (are) not designated, the registered service policy is applied to all the services requested by the client, and the execution transition information including the task policy is created.
The task policy is created at the time of creating the execution transition information regardless of whether or not the service policy is registered. The parameters of the task policy may include choices to be chosen by the client. The control unit may determine the most appropriate task policy and use it without consent of the client.
In the following, embodiments of the distributed processing system according to the above mode will be described. In the following description, the characteristic configurations, operations, and effects will mainly be discussed, and configurations, operations, and effects that have already been described will not be described in some cases. The numbers of processing elements arranged antecedent and subsequent to the VPE may differ from those in the case shown in
In the following embodiments, the processing elements are basically assumed to be implemented as software. However, the processing elements may be implemented as hardware. The outline of the latter case will be described later.
A first embodiment relates to processing in a case in which the client specifies the task policy.
Among the processes shown in
The IDCT (step S104) corresponds to input of the data to be processed to the VPE (PE4), in which the number of parallel processes is equal to 2, division of the data (step S204), processing in two processing blocks (PB) in the VPE (steps S205 and S206), and integration and output of the data processed in the two processing blocks (step S207).
The upsampling (step S105) corresponds to a process in the processing element PE5 (step S208) in which the number of parallel processes is equal to 1, and the color signal conversion (step S106) corresponds to a process in the processing element PE6 (step S209) in which the number of parallel processes is equal to 1. Thus, the JPEG decoding process can be divided into sequential processes of steps S201 to S209 as with steps S101 to S106 shown in
In the following, the sequence according to the first embodiment will be described.
In step 300 of the sequence, the processing element PE1 sends PE (processing element) registration information shown in
Here, the processing element PE1 can provide the function having an FID of FN-101 (JPEG file analysis) with the maximum number of parallel processes equal to 1, which means that the processing element PE1 is not capable of parallel processing.
In step 301 of the sequence, the VPE serving as the processing element PE4 sends the PE registration information shown in
The processing element PE4 (VPE) can provide the function having an FID of FN-104 (IDCT) with the maximum number of parallel processes equal to 2. Though not shown in the drawings, the other processing elements also send their PE registration information at the time of start-up to register themselves to the control unit CU.
In step 302 of the sequence, the client makes a request for execution of the JPEG decoding, as a service execution request, to the control unit with designation of a service ID of SV-823.
In step 303 of the sequence, the control unit creates execution transition information based on the registration information of the processing elements. The control unit may keep monitoring the dynamically changing PE registration information among the registration information of the processing elements and reflect the monitoring result on the created execution transition information.
It is assumed that the execution transition information containing the same path information as that shown in
In step 304 of the sequence, the control unit sends the execution transition information including the task policy to the client. In step 305 of the sequence, the client chooses a certain value (task policy) for each parameter for which a range of values is presented. Here, the only policy that allows a choice is the upper and lower limits of the number of parallel processes. The other policies are assumed to be inherent to each processing element, and the client cannot specify values for them. If the combination of the values of the policies is improper or impossible, it is either checked on the GUI (fifth embodiment) or checked by the control unit, which returns an error.
Here, it is assumed that the client determines, in step 305 of the sequence, the values of the parameters as shown in
In step 306 of the sequence, the client transmits the execution transition information containing the chosen task policy to the control unit CU.
Then, the control unit sends the execution transition information containing the task policy to the processing elements designated in the execution transition information, thereby requesting allocation of computational resources. In step 307 of the sequence, the control unit CU sends the execution transition information to the processing element PE1 and requests allocation of the computational resources.
Before sending the execution transition information to the processing elements PE, the control unit checks the values of the parameters and the combination of the values of the parameters. If the values and/or the combination are improper or impossible, the control unit returns an error and terminates the processing of the service.
In step 308 of the sequence, upon receiving the execution transition information, the processing element PE1 recognizes the assigned task and allocates the computational resources, such as memory, necessary for the task execution. The processing element PE1 applies the task policy and changes its internal configuration as needed. In the first embodiment, the processing element PE1 recognizes that both the upper and lower limits of the number of parallel processes are equal to 1 and determines the number of parallel processes as 1. If the processing element PE1 has no processing block, the processing element PE1 newly creates a process or a thread, or loads program information. If the processing element PE1 is implemented as hardware, the processing element PE1 performs dynamic reconfiguration such as switching as needed.
In step 309 of the sequence, after allocating the computational resources and applying the policy, the processing element PE1 notifies the control unit of completion of the computational resource allocation.
In step 310 of the sequence, the processing element PE4 receives the computational resource allocation request from the control unit as with the processing element PE1.
In step 311 of the sequence, the processing element PE4 recognizes that both the upper and lower limits of the number of parallel processes are equal to 2 and determines the number of parallel processes as 2. If the processing element PE4 does not have two processing blocks, the processing element PE4 newly creates a process(es) or a thread(s), or loads program information. If there is an unnecessary processing block, it may be deleted. If the processing element PE4 is implemented as hardware, the processing element PE4 performs dynamic reconfiguration such as switching of the path to a processing block(s) as needed.
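For a software implementation, growing or shrinking the set of processing blocks to match the determined number of parallel processes might look like the following sketch; `ProcessingElement` and `ProcessingBlock` are hypothetical stand-ins for the processes or threads the text describes.

```python
# Hypothetical sketch of a software processing element adjusting its
# processing blocks to the number of parallel processes in the task policy.
class ProcessingBlock:
    def __init__(self, block_id):
        # In practice a process or thread would be created (or program
        # information loaded) here.
        self.block_id = block_id

class ProcessingElement:
    def __init__(self):
        self.blocks = []

    def apply_task_policy(self, num_parallel):
        # Create processing blocks that are missing.
        while len(self.blocks) < num_parallel:
            self.blocks.append(ProcessingBlock(len(self.blocks)))
        # Delete unnecessary processing blocks.
        del self.blocks[num_parallel:]
```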
In step 312 of the sequence, after allocating the computational resources and applying the policy, the processing element PE4 notifies the control unit of completion of the computational resource allocation.
The processing elements other than the processing elements PE1 and PE4 also perform allocation of computational resources.
After verifying the completions of the computational resource allocation by all the processing elements designated in the execution transition information, the control unit makes a request to the respective processing elements for allocation of processing paths between the processing elements (step 313 of the sequence). Each processing element allocates processing paths with the adjacent processing elements. Then, the control unit makes a request to the client for connection to the processing elements, thereby notifying the client that the processing of the service can be started.
In step 314 of the sequence, after the processing paths between the processing elements have been allocated, the client transmits data. The processing elements perform data flow processing along the allocated processing paths. In this case, the client transmits the data to the processing element PE1, and the processing element PE1 processes the data and transmits the processed data to the processing element PE2. Similarly, the processing elements PE2 to PE5 receive data from the respective antecedent processing elements and transmit the processed data to the respective subsequent processing elements, and the processing element PE6 outputs the result.
In step 315 of the sequence, the processing element PE1 receives the data from the client, retrieves a JPEG file, and analyzes the header. The processing element PE1 transmits the retrieved image information to the processing element PE2.
In step 316 of the sequence, after the processing element PE2 performs entropy decoding, the processing element PE3 receives the data from the processing element PE2 and performs inverse quantization, and the processing element PE4 receives the data from the processing element PE3.
In step 317 of the sequence, the processing element PE4 performs the IDCT by parallel processes. For example, suppose that the processing element PE4 performs the processes on an MCU (Minimum Coded Unit)-by-MCU basis and that sequential, unique numbers are assigned to the MCUs in accordance with their coordinate values in the image. The division section then divides the MCUs into even-numbered MCUs and odd-numbered MCUs so that they are processed in different processing blocks in parallel, and the integration section reintegrates the data by performing synchronizing processing before transmitting the data to the next processing element PE5.
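The even/odd MCU division of step 317 can be sketched as follows; `idct_block` is a hypothetical placeholder for the per-MCU IDCT computation, and the thread join stands in for the synchronizing processing performed by the integration section.

```python
import threading

def idct_block(mcus, out, key):
    # Placeholder for one processing block's IDCT over its share of MCUs.
    out[key] = [("idct", n) for n in mcus]

def parallel_idct(mcu_numbers):
    # Division section: even-numbered MCUs to one block, odd to the other.
    evens = [n for n in mcu_numbers if n % 2 == 0]
    odds = [n for n in mcu_numbers if n % 2 == 1]
    results = {}
    threads = [
        threading.Thread(target=idct_block, args=(evens, results, "even")),
        threading.Thread(target=idct_block, args=(odds, results, "odd")),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # synchronizing processing before reintegration
    # Integration section: restore MCU-number order before forwarding to PE5.
    return sorted(results["even"] + results["odd"], key=lambda r: r[1])
```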
In step 318 of the sequence, after the processing element PE5 performs upsampling, the processing element PE6 performs color signal conversion and returns the result data to the client, and the processing of the service is terminated.
In step 319 of the sequence, the processing paths and the computational resources are deallocated, and then the completion processing is executed.
In step 320 of the sequence, the control unit sends a notification of completion of service execution to the client, and terminates execution of the service.
In the second to seventh embodiments, it is assumed that the computational resource allocation in steps 307 to 309 and steps 310 to 312 of the above-described sequences is performed appropriately in all the processing elements necessary for the execution of the service, unless specifically stated otherwise. The processes from the processing path allocation (step 313) to the completion of service execution (step 320) are performed in the same manner in all the embodiments, and detailed descriptions of these processes will be omitted in some cases by simply stating "the execution of the service is continued" or the like.
A second embodiment relates to processing in a case in which a client specifies the service policy as a best effort type quality assurance.
In step 400 of the sequence, the processing element PE1 sends the PE registration information shown in
In step 401 of the sequence, as with the processing element PE1, the processing element PE4 (VPE) sends the PE registration information (shown in
In step 402 of the sequence, the client registers the policy to be applied to the service to the control unit together with its own client ID. This policy may be applied to all the services or only to a specific service(s) by specifying a service ID(s).
In the second embodiment, the service ID is not specified, and the service policy shown in
In step 403 of the sequence, the client makes a request for execution of the JPEG decoding, as a service execution request, to the control unit with designation of a service ID of SV-823.
In step 404 of the sequence, upon receiving the service execution request, the control unit determines whether or not the service policy is applicable and a task policy can be determined. In the second embodiment, since the quality assurance type in the service policy is the best effort type, the control unit can automatically determine the task policy and proceed with the task execution even if a condition(s) such as electrical energy consumption is (are) not satisfied.
In step 405 of the sequence, the control unit determines the task policy based on the PE registration information and creates execution transition information. The control unit may keep monitoring the dynamically changing PE registration information among the PE registration information and reflect the monitoring result on the created execution transition information. As a result, the control unit creates the execution transition information that is the same as the execution transition information shown in
In step 406 of the sequence, the control unit transmits the execution transition information containing the created task policy to the processing elements designated in the execution transition information, thereby requesting allocation of computational resources.
In step 407 of the sequence, upon receiving the execution transition information, the processing elements recognize the assigned tasks and allocate the computational resources, such as memory, necessary for the task execution. The processing elements apply the task policy and change their internal configurations as needed.
In the second embodiment, if a processing element does not have as many processing blocks as the number of parallel processes, the processing element newly creates a process(es) or a thread(s), or loads program information. If the processing element is implemented as hardware, the processing element performs dynamic reconfiguration such as switching as needed. Since the upper limit and the lower limit of the number of parallel processes are different from each other in the task policy of the task 4 shown in
In step 408 of the sequence, the control unit receives notification of the completion of computational resource allocation from the processing elements, allocates the processing paths, and proceeds with the processing of the service.
A third embodiment relates to processing in a case in which the client specifies the service policy as a guarantee type quality assurance, but the specified service policy is not applicable.
In step 500 of the sequence, the client registers the service policy shown in
In step 501 of the sequence, the client makes a request to the control unit for execution of the service with designation of the ID of the requested service.
In step 502 of the sequence, the control unit determines whether or not the service policy registered in advance by the client is applicable. In the case of the third embodiment, the control unit determines that the service policy is not applicable.
In step 503 of the sequence, the control unit returns an error to the client and terminates the processing of the service.
In the third embodiment, since the selected quality assurance type is the guarantee type, if the quality cannot be assured, the control unit terminates the execution of the service and returns an error to the client, because continuing would contradict the client's intent.
The following fourth and fifth embodiments relate to procedures followed by the distributed processing system if it is determined in step 502 of the sequence in
(1) presenting an alternative service policy (fourth embodiment); and
(2) restricting or checking the entry of task policy on the GUI of the client (fifth embodiment).
In the following, the fourth and fifth embodiments will be described.
The fourth embodiment relates to the presentation of an alternative service policy among the measures taken by the distributed processing system when it is determined that the service policy is not applicable.
First, in step 600 of the sequence, it is assumed that the registration of the processing elements has been completed. The client has registered in advance the same service policy as in step 500 of the sequence in
In step 601 of the sequence, the client makes a request to the control unit for execution of the service with the designation of the ID of the requested service.
In step 602 of the sequence, the control unit determines whether or not the service policy that is registered in advance by the client is applicable to the service. In the fourth embodiment, the control unit determines that the service policy is not applicable.
In step 603 of the sequence, the control unit creates an alternative service policy. The control unit presents the created service policy to the client (step 604). A graphical image of the presentation of the alternative policy displayed on the GUI of the client is shown in
The example of the GUI shown in
In step 605 of the sequence, the client makes a selection between acceptance and refusal of the alternative. In the fourth embodiment, the client accepts the alternative.
In step 606 of the sequence, the client transmits the result of the selection to the control unit. The control unit makes a determination as to whether or not the result of the selection is acceptable (step 607).
If the control unit accepts the result of the selection in step 607 of the sequence, the control unit determines the task policy and creates execution transition information (step 608). In step 609 of the sequence, the control unit transmits the execution transition information to the processing elements together with a computational resource allocation request.
On the other hand, if the control unit does not accept the result of the selection in step 607 of the sequence, the control unit returns an error to the client and terminates the execution of the service.
The fifth embodiment relates to the restriction or checking of the entry of the task policy on the GUI of the client, among the measures taken by the distributed processing system when it is determined that the service policy is not applicable.
In step 700 of the sequence, it is assumed that the registration of the processing elements has been completed. The client makes a request to the control unit for execution of the service with designation of the ID of the requested service.
In step 701 of the sequence, the control unit creates execution transition information including a candidate task policy.
In step 702 of the sequence, the control unit transmits the execution transition information, which contains choices of the values of parameters of the task policy, to the client. When transmitting the execution transition information, the control unit also transmits allowable combinations of the values of parameters of the task policy.
In step 703 of the sequence, when setting the parameters of the task policy, the client can set the value of the parameters of the task policy within the range of allowable values displayed on the GUI shown in
In step 704 of the sequence, after determining the task policy, the client sends the task policy to the control unit.
In step 705 of the sequence, the control unit configures execution transition information based on the determined task policy and starts allocation of computational resources necessary for the execution of the service.
A sixth embodiment relates to a process of specifying the policy by designating a priority level for each of the tasks.
Since a priority level cannot be applied to a service, the client specifies the priority levels for the tasks after the execution transition information is created by the control unit in response to a service execution request made by the client. Alternatively, a priority level(s) is registered in advance for a specific processing element(s). In this embodiment, JPEG encoding is requested as a service, and the maximum numbers of parallel processes of the processing element PE1 and the processing element PE4 have been registered as 2 and 4 respectively (the same conditions as in the second embodiment). In this situation, it is assumed that the client specifies priority levels for the execution transition information created by the control unit as shown in
The relationship between the priority level and the number of parallel processes at the time when a task is solely executed while no other task is executed at the same time is defined as follows.
(1) High priority level: the task is executed with the maximum number of parallel processes.
(2) Middle priority level: the task is executed with the upper limit number of parallel processes. If the upper limit is not specified, the task is executed with the maximum number of parallel processes.
(3) Low priority level: the task is executed with the lower limit number of parallel processes. If the lower limit is not specified, the task is executed with the number of parallel processes equal to 1.
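Rules (1) to (3) above can be transcribed directly as follows; the function and parameter names are illustrative only.

```python
# Sketch of the priority-to-parallelism rules for sole execution of a task.
def sole_parallelism(priority, max_parallel, upper=None, lower=None):
    if priority == "high":
        # (1) Execute with the maximum number of parallel processes.
        return max_parallel
    if priority == "middle":
        # (2) Execute with the upper limit, or the maximum if unspecified.
        return upper if upper is not None else max_parallel
    if priority == "low":
        # (3) Execute with the lower limit, or 1 if unspecified.
        return lower if lower is not None else 1
    raise ValueError("unknown priority level")
```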
In step S800, a determination is made as to whether or not there is a task that has been executed before the execution of the subject task A in the same processing element. If no task has been executed in the same processing element (N in step S800), the process proceeds to step S808, where the task is executed with the aforementioned number of parallel processes at the time of sole execution. Then, the task execution is terminated.
On the other hand, if there is a task that has been executed before the execution of the subject task A in the same processing element (Y in step S800), the process proceeds to step S801. In step S801, a determination is made as to whether or not the sum of the number of parallel processes at the time of execution of task A and the number of parallel processes at the time of execution of task B exceeds the maximum number of parallel processes of the processing element. If the sum does not exceed the maximum number of parallel processes (N in step S801), the process proceeds to step S809. In step S809, the tasks A and B are executed at the same time. Then, the task execution is terminated.
On the other hand, if the sum exceeds the maximum number of parallel processes (Y in step S801), the process proceeds to step S802. In step S802, a determination is made as to whether or not the priority level of the subject task A is lower than the priority level of the task B that has already been executed. If the priority level of the subject task A is lower (Y in step S802), the process proceeds to step S803. In step S803, the task A is qualified as L, and the task B is qualified as H.
If the priority level of the subject task A is equal to or higher than the priority level of the task B (N in step S802), the process proceeds to step S810. In step S810, the task B is qualified as L, and the task A is qualified as H.
Both step S803 and step S810 proceed to step S804 after the processing. In step S804, a determination is made as to whether or not the maximum number of parallel processes of the processing element is larger than the number of parallel processes at the time of execution of the task qualified as H.
If the maximum number of parallel processes of the processing element is larger than the number of parallel processes at the time of execution of the task qualified as H (Y in step S804), the number of parallel processes at the time of execution of the task qualified as L is replaced by the difference between the maximum number of parallel processes of the processing element and the number of parallel processes at the time of execution of the task qualified as H, in step S811. Then, in step S812, the tasks A and B are executed at the same time. Thereafter, the task execution is terminated.
If the determination in step S804 is negative, the process proceeds to step S805. In step S805, the task qualified as L is suspended. In step S806, a determination is made as to whether or not the task qualified as H has been terminated.
If the task qualified as H has not been terminated (N in step S806), the process returns to step S805, where the suspension of the task qualified as L is continued. If the task qualified as H has been terminated (Y in step S806), the execution of the task qualified as L is restarted with the number of parallel processes equal to that at the time of sole execution of the task. Then, the task execution is terminated.
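A condensed sketch of the flow from step S800 onward follows, assuming numeric priority ranks (larger means higher priority) and the sole-execution parallel counts defined above; the data layout and names are assumptions for illustration only.

```python
# Sketch of scheduling a newly arriving task A when task B may already be
# running in the same processing element (steps S800 onward).
def schedule(task_a, task_b, pe_max):
    # task_a / task_b: dicts with "priority_rank" and "parallel" (the number
    # of parallel processes at sole execution); task_b may be None.
    if task_b is None:
        # S800 -> S808: run A with its sole-execution parallel count.
        return {"run": [("A", task_a["parallel"])], "suspended": None}
    if task_a["parallel"] + task_b["parallel"] <= pe_max:
        # S801 -> S809: both tasks fit, so execute them at the same time.
        return {"run": [("A", task_a["parallel"]), ("B", task_b["parallel"])],
                "suspended": None}
    # S802 / S803 / S810: qualify the lower-priority task as L, the other as H.
    if task_a["priority_rank"] < task_b["priority_rank"]:
        low, high = ("A", task_a), ("B", task_b)
    else:
        low, high = ("B", task_b), ("A", task_a)
    spare = pe_max - high[1]["parallel"]
    if spare > 0:
        # S804 -> S811 / S812: shrink L to the remaining capacity and run both.
        return {"run": [(high[0], high[1]["parallel"]), (low[0], spare)],
                "suspended": None}
    # S805 / S806: suspend L until H terminates; L then restarts with its
    # sole-execution parallel count.
    return {"run": [(high[0], high[1]["parallel"])], "suspended": low[0]}
```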
The task having a task ID of TK-104 is assigned to the processing element PE4 (VPE) with a priority level of “low”. Since the lower limit of the number of parallel processes is not specified, the task is executed with the number of parallel processes equal to 1 even if no other task is in execution (as illustrated by solid lines in
A seventh embodiment relates to a case of copying a processing block by itself, including a case in which the processing block is a general-purpose processing block. In the case described as the seventh embodiment, processing blocks that have different functions are not present in a processing element at the same time. Furthermore, the description will be made taking an example case in which a processing block is not loaded from outside, regardless of whether the processing block is a special-purpose processing block or a general-purpose processing block, and only program information and library information including reconfiguration information can be loaded from outside.
In the first to the sixth embodiments, all the processing blocks are special-purpose processing blocks that provide functions specialized for applications, regardless of whether they are implemented as software or hardware. However, a general-purpose processing block, which can provide the same function as a normal processing block specialized for an application by loading a library, may be implemented in a processing element. This general-purpose processing block will be referred to as a GP-PB. In the following description, an expression like "loading a library" means loading the program information of a library.
In the following, examples of the following three cases will be described:
(1) a case in which a GP-PB and a library are copied in combination;
(2) a case in which a GP-PB and a library are copied separately; and
(3) a case in which only a library is copied.
(1) A case in which a GP-PB and a library are copied in combination (
In this case, in the initial state, there is no processing element that has a special-purpose processing block for executing JPEG encoding as a service. Therefore, a library is dynamically downloaded to a processing element (VPE) having a GP-PB. The initial state of the processing element having the GP-PB is shown in
In step 900 of the sequence, the processing element having GP-PB is registered in the control unit with a function ID (FID) of FN-999. An example of the PE registration information is shown in
In step 901 of the sequence, the client makes a request for execution of the JPEG decoding, as a service execution request, to the control unit with designation of a service ID of SV-823.
In step 902 of the sequence, the control unit creates execution transition information containing a candidate task policy based on the PE registration information. The task of TK-104 can be assigned to the processing element having the GP-PB. In other words, by downloading a library to the GP-PB, the processing element can provide the same function as a special-purpose processing block for the task of TK-104, so the control unit designates the processing element having the GP-PB in the execution transition information. The execution transition information created here is assumed to be the same as that shown in
In step 903 of the sequence, the control unit sends the execution transition information containing the candidate task policy to the client.
In step 904 of the sequence, the client only makes a choice on the number of parallel processes. Here, the client specifies a number of parallel processes of 2 for task 4, which corresponds to the task having a task ID of TK-104, as with the case shown in
In step 905 of the sequence, the client sends the execution transition information containing the chosen task policy to the control unit.
In step 906 of the sequence, the control unit firstly checks the execution transition information containing the task policy. The control unit knows the function of the library loaded to the processing element having GP-PB and the operation state of the library. The control unit determines whether or not the library is necessary based on the execution transition information containing the task policy.
In the seventh embodiment, since the library providing the function of FN-104 necessary for the execution of the task of TK-104 has not been downloaded to the processing element having GP-PB, the control unit determines that it is necessary to dynamically deliver the library providing the function of FN-104.
In step 907 of the sequence, the control unit obtains the program information of the library providing the function of FN-104 from, for example, a database server and downloads the library to the GP-PB through the load section of the processing element (
In step 908 of the sequence, the processing element receives a computational resource allocation request from the control unit. In step 909 of the sequence, the processing element recognizes the task to be executed and determines the number of parallel processes at the time of execution.
At this stage, the processing element has only one processing block even though the number of parallel processes is 2. The processing element therefore copies the GP-PB and the library in combination to reconfigure the internal configuration so that the parallel processes can be performed with the number of parallel processes equal to 2 (
In step 910 of the sequence, when the processing element becomes ready for execution of the task, it may be concluded that the allocation of computational resources has been completed, and the processing element returns a notification of the completion of computational resource allocation to the control unit.
In step 911 of the sequence, the processing of the service is continued.
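The copying of a GP-PB together with its library in steps 907 through 910 can be modeled in a few lines. The class names, the `deepcopy`-based copying, and the string used as a library stand-in are all assumptions made for illustration, not the embodiment's interfaces.

```python
import copy

class GPPB:
    """General-purpose processing block; it provides a function
    only after a library is loaded into it."""
    def __init__(self):
        self.library = None

    def load_library(self, library):   # step 907: dynamic download
        self.library = library

class ProcessingElement:
    def __init__(self):
        self.blocks = [GPPB()]         # initial state: one GP-PB

    def allocate(self, parallel):      # steps 908 and 909
        # Copy the GP-PB and its library in combination until the
        # block count matches the requested number of parallel
        # processes.
        while len(self.blocks) < parallel:
            self.blocks.append(copy.deepcopy(self.blocks[0]))
        return "completed"             # step 910: completion notice

pe = ProcessingElement()
pe.blocks[0].load_library("FN-104")    # library for the task TK-104
pe.allocate(2)                         # number of parallel processes: 2
```

After `allocate(2)`, the element holds two independent GP-PB/library pairs, mirroring the reconfiguration described above.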
(2) A case in which a GP-PB and a library are copied separately (
In this case, in the initial state, a library that provides the function of FN-500 has already been loaded to the GP-PB (
(3) A case in which only a library is copied (
In this case, the task of TK-104 is to be executed on the processing element (VPE) having the GP-PB with the number of parallel processes equal to 2. If the processing element has two GP-PBs already (
If the processing element already has the library and the GP-PB in combination as shown in
All the GP-PBs can be copied if needed, as described in cases (1) and (2). In addition, a GP-PB and a library can be copied in combination (as described in case (1)), or copied separately (as described in cases (2) and (3)). It is also possible to delete a library and newly load another library having another function from outside. In any case, the number of parallel processes can be dynamically controlled as desired by copying or deleting the processing block.
The GP-PB and the library can be unloaded to the holding sections. All the GP-PBs and the libraries can be unloaded without causing any problem. The control section can hold the GP-PB alone, the library alone, or the GP-PB and the library in combination in the holding sections (
In the seventh embodiment, the processing block that provides a specific function using a GP-PB and a library in combination has been described. Since the combination of the GP-PB and the library is functionally equivalent to a special-purpose processing block PB, the above-described embodiment can also be applied to a special-purpose processing block implemented as software. More specifically, a special-purpose processing block providing a certain function can be copied, unloaded, or deleted (
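The copy, delete, and unload operations discussed above admit a compact sketch. The holding-section model, the dictionary representation of a block/library pair, and every method name here are assumptions made only for illustration.

```python
class ProcessingElementModel:
    """Toy model of a processing element whose parallel count is
    controlled by copying or deleting blocks, with holding sections
    for unloaded blocks and libraries."""
    def __init__(self, block):
        self.blocks = [block]   # active GP-PB/library pairs
        self.holding = []       # holding sections

    def copy_block(self):       # raise the number of parallel processes
        self.blocks.append(dict(self.blocks[0]))

    def delete_block(self):     # lower the number of parallel processes
        if len(self.blocks) > 1:
            self.blocks.pop()

    def unload_block(self):     # keep a block in a holding section
        self.holding.append(self.blocks.pop())

    def reload_block(self):     # restore a block from a holding section
        self.blocks.append(self.holding.pop())

pe = ProcessingElementModel({"gp_pb": True, "library": "FN-104"})
pe.copy_block()                 # parallel count: 2
pe.delete_block()               # back to 1
pe.unload_block()               # block moved to a holding section
```

The same four operations apply unchanged whether the block is a GP-PB/library pair or a special-purpose processing block implemented as software, which is the point made above.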
An eighth embodiment relates to implementation of a processing element as hardware.
In the first to seventh embodiments, the processing element is implemented as software, and the processing blocks can be increased/decreased as desired up to the maximum number of parallel processes. Although the maximum number of parallel processes depends on the memory capacity, it may be regarded as unlimited unless the blocks have significantly large sizes relative to the memory capacity.
On the other hand, the processing element may be implemented as hardware. In the case of the processing element implemented as hardware, since the processing blocks to be used are circuits that have been built in advance, the maximum number of parallel processes is limited by the number of the already built processing blocks. The processing blocks may be fixedly connected to the division section and the integration section. However, the processing paths may be dynamically configured by providing switches to the processing blocks as shown in
As shown in
Furthermore, as shown in
In the following, a process of determining the number of parallel processes in the processing element will be described with reference to
In step S1000, it is checked whether or not the upper limit of the number of parallel processes is set (or specified) in the task policy. If the upper limit is set (Y in step S1000), the process proceeds to step S1020. If the upper limit is not set (N in step S1000), the process proceeds to step S1010. In step S1010, the upper limit of the number of parallel processes is set to the maximum number of parallel processes. Then, the process proceeds to step S1020.
In step S1020, it is checked whether or not the lower limit of the number of parallel processes is set (or specified) in the task policy. If the lower limit is set (Y in step S1020), the process proceeds to step S1040. If the lower limit is not set (N in step S1020), the process proceeds to step S1030. In step S1030, the lower limit of the number of parallel processes is set to 1. Then, the process proceeds to step S1040.
In step S1040, it is determined whether or not the upper limit of the number of parallel processes is larger than the lower limit of the number of parallel processes. If the upper limit of the number of parallel processes is larger than the lower limit of the number of parallel processes (Y in step S1040), the process proceeds to step S1060. If the upper limit of the number of parallel processes is equal to or smaller than the lower limit of the number of parallel processes (N in step S1040), the process proceeds to step S1050.
In step S1050, it is determined whether or not the upper limit of the number of parallel processes is equal to the lower limit of the number of parallel processes. If the upper limit of the number of parallel processes is equal to the lower limit of the number of parallel processes (Y in step S1050), the process proceeds to step S1070, because there is no option in the number of parallel processes. If the upper limit of the number of parallel processes is not equal to the lower limit of the number of parallel processes (N in step S1050), the process proceeds to step S1150, where an error notification is sent to the control unit.
In step S1060, it is checked whether or not parameters of the task policy other than the number of parallel processes and the priority level such as the electrical energy consumption, the processing time, and the output throughput are set. If any one of the parameters is set (Y in step S1060), the process proceeds to step S1080. If none of the parameters is set (N in step S1060), the process proceeds to step S1070.
In step S1070, the number of parallel processes is set to the upper limit value of the number of parallel processes, and the process proceeds to step S1130.
In step S1080, the upper and lower limits of the number of parallel processes are calculated based on the profile of the electrical energy consumption to determine a range A of the number of parallel processes, and then the process proceeds to step S1090.
In step S1090, as with step S1080, the upper and lower limits of the number of parallel processes are calculated based on the profile of the processing time to determine a range B of the number of parallel processes, and then the process proceeds to step S1100.
In step S1100, as with steps S1080 and S1090, the upper and lower limits of the number of parallel processes are calculated based on the profile of the output throughput to determine a range C of the number of parallel processes, and then the process proceeds to step S1110.
In step S1110, it is determined whether or not a common range D can be extracted from the ranges A, B and C of the number of parallel processes. If the common range D can be extracted (Y in step S1110), the process proceeds to step S1120. If the common range D cannot be extracted (N in step S1110), the process proceeds to step S1150. For example, if the range A of the number of parallel processes includes 1, 2, and 3, the range B includes 2 and 3, and the range C includes 2, 3, and 4, the common range D includes 2 and 3.
In step S1120, a common range is further extracted from the range between the upper and lower limits of the number of parallel processes and the common range D. The largest number in the extracted range is determined as the number of parallel processes to be used. Then, the process proceeds to step S1130. For example, if the common range D includes 2 and 3, the upper limit of the number of parallel processes is 4, and the lower limit of the number of parallel processes is 2, the number of parallel processes to be used is determined to be 3.
In step S1130, it is checked whether or not the priority level is specified. If the priority level is specified (Y in step S1130), the process proceeds to step S1140. If the priority level is not specified (N in step S1130), the process is terminated.
In step S1140, the number of parallel processes is adjusted based on the priority level in accordance with, for example, the process shown in
In step S1150, since there is no allowable number of parallel processes, an error is returned to the control unit, and the process is terminated.
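The flow of steps S1000 through S1150 (omitting the priority adjustment of step S1140) can be condensed as follows. The dictionary-based task policy and the precomputed ranges A, B, and C, which stand in for the profile calculations of steps S1080 through S1100, are assumptions made for this sketch only.

```python
def decide_parallel(policy, max_parallel, ranges=None):
    """Return the number of parallel processes to use, or raise
    ValueError as the error notification of step S1150."""
    upper = policy.get("upper", max_parallel)   # S1000 / S1010
    lower = policy.get("lower", 1)              # S1020 / S1030
    if upper < lower:                           # S1040 / S1050 (N)
        raise ValueError("no allowable number of parallel processes")
    if upper == lower or not ranges:            # S1050 (Y) / S1060 (N)
        return upper                            # S1070
    # S1080-S1100: ranges A, B, C derived from the profiles of
    # electrical energy consumption, processing time, and output
    # throughput (precomputed here).
    common = set(range(lower, upper + 1))
    for r in ranges:                            # S1110: common range D
        common &= set(r)
    if not common:                              # S1150
        raise ValueError("no allowable number of parallel processes")
    return max(common)                          # S1120: largest value
```

With the example values above (A = {1, 2, 3}, B = {2, 3}, C = {2, 3, 4}, upper limit 4, lower limit 2), the common range is {2, 3} and the function returns 3, matching the worked example in step S1120.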
Though not shown in the drawings, consider the case where the lower limit of the output throughput and the upper limit of the electrical energy consumption are specified in the task policy. If the type of quality assurance is the best-effort type and the output throughput and the electrical energy consumption cannot take values within the range limited by the task policy, the control section dynamically adjusts the number of parallel processes so that an optimum trade-off between the output throughput and the electrical energy consumption is achieved. If the type of quality assurance is the guarantee type, the number of parallel processes is adjusted so that the limits placed on the values of these parameters by the task policy are maintained.
As described above, the distributed processing system according to the present invention can be advantageously applied to a distributed processing system in which the most suitable number of parallel processes is desired to be specified.
In a system defined by a data flow type module network including computation modules that provide one or more specific functions in order to implement a service required by a user, the present invention can provide an advantageous distributed processing system in which not only the number of parallel processes and the overall performance but also the electrical energy consumption and the processing time are used as indexes in performing virtual parallel processing in the modules, and the optimum number of parallel processes is determined based on these indexes.
Furthermore, according to the present invention, there can be provided a distributed processing system in which the number of processing blocks in a computation module can be dynamically increased/decreased based on the dynamically defined number of parallel processes.
Furthermore, according to the present invention, an application execution environment optimal for a user can be created by specifying, as a policy, an index that is considered to be important at the time of executing the application, without need for designation of the number of parallel processes by the user.