This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2013-210289, filed on Oct. 7, 2013, the entire contents of which are incorporated herein by reference.
The embodiment discussed herein is related to a non-transitory computer-readable storage medium, a method for data processing, and a processing management apparatus.
In recent years, the range and amount of information treated in business have remarkably increased. Data processing is performed using a large amount of data (big data) generated one after another.
As systems for processing a large amount of data, there are, for example, a batch system and an incremental system. The batch system is a system for processing entire accumulated data. The incremental system is a system for, when new data (hereinafter referred to as new arrival data) arrives, sequentially processing data related to the new arrival data. The incremental system is useful for analysis processing that makes use of the new arrival data in the large amount of data.
As a model of a parallel calculation for performing distributed processing, an actor model is known. In incremental processing for a large amount of data, for example, data is distributedly stored on a plurality of disks and related data is sequentially processed according to the actor model. In the actor model, each of calculation entities called actors performs an operation for 1) processing a received message and updating an internal state according to necessity, 2) transmitting a limited number of messages to the other actors, and 3) generating a limited number of new actors, whereby distributed processing is performed.
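The three actor behaviors enumerated above can be sketched as follows (a minimal illustration only; the names `Actor`, `ActorSystem`, and the message fields are assumptions for this sketch, not the embodiment's implementation):

```python
from collections import deque

class Actor:
    def __init__(self, name):
        self.name = name
        self.state = 0  # internal state, updated as messages are processed

    def receive(self, message, system):
        # 1) process the received message and update the internal state
        self.state += message["value"]
        # 2) transmit a limited number of messages to the other actors
        for target in message.get("forward_to", []):
            system.send(target, {"value": message["value"]})
        # 3) generate a limited number of new actors
        for child in message.get("spawn", []):
            system.actors[child] = Actor(child)

class ActorSystem:
    def __init__(self):
        self.actors = {}
        self.mailbox = deque()

    def send(self, name, message):
        self.mailbox.append((name, message))

    def run(self):
        # deliver messages until the mailbox is empty
        while self.mailbox:
            name, message = self.mailbox.popleft()
            self.actors[name].receive(message, self)
```

In this sketch each delivered message may update state, fan out to other actors, and spawn new actors, which is how the distributed processing proceeds.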
In recent years, processing targeting social data (social networking services: SNS, etc.) has been increasing. The social data has, for example, a characteristic that elements of the data (hereinafter referred to as node data) have relations with other node data. The number of related node data differs depending on the node data. Data having relations among node data is described in, for example, Japanese Patent Application Laid-open No. 2008-134688.
When the large amount of data is processed in the incremental system, after generation of the new arrival data, parallel processing is applied according to an amount of the data processing in order to reduce time consumed for the data processing.
In the processing of the data having the relations among the node data such as the social data, as explained above, the number of related node data differs depending on the node data. In the data processing, starting from start node data at a start point, node data branching from the start node data, and node data further branching from that node data, are sequentially traced and processed. Therefore, a total number of related node data may be unable to be grasped in advance, and it is difficult to estimate an amount of the data processing.
Therefore, the parallel processing is applied to the processing of the branching node data. However, when the parallel processing is performed, generation of processes and copying of messages corresponding to the number of branches are performed, and a consumed amount of resources increases. Therefore, when the parallelism is enormous, depletion of resources occurs.
According to one aspect, there is provided a non-transitory computer-readable storage medium, a method for data processing, and a processing management apparatus for efficiently executing data processing.
According to a first aspect of the embodiment, a non-transitory computer-readable storage medium storing therein a program that causes a computer to execute a process includes managing a data processing by a processing target node among a plurality of nodes in which respective nodes have relations with other nodes, the processing target node being traced from a start node on the basis of the relations, and calculating a total number of nodes linked to the start node on the basis of numbers of stages indicating distances of processed nodes and the processing target node from the start node, and numbers of branches from the processed nodes and the processing target node, while the processing target node performs the data processing.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
[Configuration of a Parallel Processing Management System]
The parallel processing management system illustrated in
The actor model is one of the parallel calculation models and is represented by a set of calculation entities called actors. The actors indicate, for example, processing processes of the incremental processing engines 21a to 21n executed by the parallel processing management apparatuses 10a to 10n. The individual actors (processing processes) behave as explained below. The actor (the processing process) transmits a limited number of messages to the other actors (processing processes). The actor (the processing process) generates a limited number of new actors (processing processes). Upon receiving a message, the actor (the processing process) processes the message and updates an internal state according to necessity.
Description will be made correspondingly to the parallel processing management system illustrated in
[Configuration of the Parallel Processing Management Apparatus]
The storage medium 15 includes a data storing unit 20a. The data storing unit 20a stores processing target data and new arrival data. As explained above with reference to
[Block Diagram of the Parallel Processing Management Apparatus]
The incremental processing engine 21a of the parallel processing management apparatus illustrated in
[Data Having Relations Among Elements]
Target data of the parallel processing management system in this embodiment is, for example, data in which elements have relations with other elements. The data in which elements have relations with other elements is, for example, social data (social networking services: SNS, etc.). The data having relations among elements is explained.
As illustrated in
When the data having relations among elements illustrated in
For example, the parallel processing management system traces the node data having relations from the start node data using a depth-first search of the tree structure, and sequentially performs processing for the node data. That is, in an example illustrated in
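The depth-first tracing described above can be sketched as follows (the tree contents and the function name `depth_first_order` are hypothetical and only illustrate the traversal order, not the embodiment's data):

```python
def depth_first_order(tree, start):
    """Return node data in the order a depth-first search processes it."""
    order = []
    stack = [start]
    while stack:
        node = stack.pop()
        order.append(node)
        # push children in reverse so the leftmost branch is traced first
        stack.extend(reversed(tree.get(node, [])))
    return order

# A small tree with relations among node data, rooted at start node data "A"
tree = {"A": ["B", "C", "D"], "B": ["G", "H"], "C": ["I"]}
```

Starting from "A", the search descends to a leaf along each branch before moving to the next branch, which is why leaves are reached early.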
When fifteen node data A (nA) to O (nO) included in the tree structure illustrated in
A processing time consumed for data processing is explained. As illustrated in
Specifically, when the fifteen node data A (nA) to O (nO) included in the tree structure illustrated in
[Parallel Processing]
In the example illustrated in
As illustrated in
Therefore, the parallel processing management apparatus in this embodiment estimates the number of processing target node data in data processing and suppresses imbalance of the processing time. Alternatively, the parallel processing management apparatus reduces an average of turnaround times on the basis of the number of remaining node data (the number of pieces of remaining processing) based on the processing target node data. Details are explained with reference to
Specifically, the parallel processing management apparatus in this embodiment performs data processing for processing target node data and calculates, on the basis of the number of stages and the number of branches of the processed node data and the processing target node data, a total number of node data linked to start node data. On the basis of propagated information concerning the processed node data and information concerning the processing target node data, the parallel processing management apparatus can estimate the total number of node data linked to the start node data. The total number of node data is the number of processing target data (hereinafter referred to as the total number of pieces of processing). The number of stages of node data indicates, for example, a distance from the start node data.
Further, when the total number of pieces of processing exceeds a reference number, the parallel processing management apparatus sets, as a target of the parallel processing, processing for node data branching from the processing target node data. When the total number of pieces of data processing is large, the parallel processing management apparatus can suppress imbalance of a processing time among pieces of data processing by setting processing of a part of node data as a target of parallelization.
In the example illustrated in
In the example illustrated in
As illustrated in
On the other hand,
Specifically, in
As illustrated in
A flow of processing in the parallel processing management system in this embodiment is explained.
[Flowchart]
Subsequently, the terminal apparatuses 30a to 30c transmit the generated messages to any one of the parallel processing management apparatuses 10 through a network (S13). The message control unit 24 of the incremental processing engine 21 of the parallel processing management apparatus 10, which receives the messages, analyzes the messages and determines, on the basis of key information included in the messages, whether the processing target node data is stored in the data storing unit 20 of the parallel processing management apparatus 10 (S14).
When the processing target node data is not stored in the data storing unit 20 of the parallel processing management apparatus 10 (NO in S14), the message control unit 24 transmits, on the basis of an address, a message to the parallel processing management apparatus 10 that stores the processing target node data in the data storing unit 20 (S15). The data processing unit 22 of the incremental processing engine 21 of the parallel processing management apparatus 10, which stores the processing target node data in the data storing unit 20, performs node data processing on the basis of the message (S16). Details of the processing are explained below with reference to a flowchart of
When one or more messages are generated as a result of the processing in step S16 (YES in S17), the message control unit 24 of the incremental processing engine 21 of the parallel processing management apparatus 10 transmits, on the basis of the key information included in the message, the generated message to the parallel processing management apparatus 10 that stores the node data in the data storing unit 20 (S18).
The processing of the incremental processing engine 21 in the parallel processing management apparatuses 10 in the flowchart of
Upon receiving the message, the data processing unit 22 analyzes the message and acquires key information indicating processing target node data and a function name representing processing content (S22). Subsequently, the data processing unit 22 reads out the processing target node data from the data storing unit 20 on the basis of the key information (S23). The data processing unit 22 calls a function on the basis of the acquired function name and applies processing to the read-out node data (S24). The data processing unit 22 writes an execution result of the processing in the data storing unit 20 (S25).
The data processing unit 22 calculates, on the basis of number of pieces of processing estimation information included in the message, a total number of node data linked to start node data and performs estimation of a total number of pieces of processing (S26). Details of estimation processing for a total number of pieces of processing are explained below with reference to
The data processing unit 22 determines, on the basis of the calculated total number of pieces of processing, whether processing for node data branching from the processing target node data is set as a target of parallelization (S27). Subsequently, the data processing unit 22 generates a message of the node data branching from the processing target node data and transmits a message to the parallel processing management apparatus 10 that retains node data to be set as a processing target next (S28). In this case, when the following node data is a target of the parallel processing, the data processing unit 22 transmits the message to the plurality of parallel processing management apparatuses 10.
An overview of propagation of a message between the parallel processing management apparatuses 10 in the data processing explained with reference to the flowcharts of
[Flow of Message Propagation]
In an example illustrated in
Note that the incremental processing engine 21 may include, for example, a correspondence table of node data and the parallel processing management apparatuses 10, which retain the node data, and detect, referring to the correspondence table, the parallel processing management apparatus 10 that retains the node data. In this case, the parallel processing management apparatuses 10 respectively retain correspondence tables.
Alternatively, the incremental processing engine 21 may calculate a hash value of a node data ID or the like capable of uniquely identifying node data and detect, on the basis of the calculated hash value, the parallel processing management apparatus 10 that retains the node data.
For example, the incremental processing engine 21 detects, on the basis of a remainder obtained by dividing the hash value of the node data ID by the number of the parallel processing management apparatuses 10, the parallel processing management apparatus 10 that retains the node data. It is assumed that the hash value of the node data ID is “105” and the number of the parallel processing management apparatuses 10 is “10”. In this case, the parallel processing management apparatus 10 corresponding to an identification number of the remainder “5 (=105 mod 10)” corresponds to the parallel processing management apparatus 10 that retains the node data having the node data ID. That is, the node data ID is given to the node data retained by the parallel processing management apparatus 10 corresponding to the identification number “5” such that a remainder obtained by dividing the hash value by “10” is “5”.
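The remainder-based mapping can be sketched as follows (the hash function `zlib.crc32` and the function name `apparatus_for` are assumptions for illustration; the embodiment does not specify a particular hash function):

```python
import zlib

def apparatus_for(node_id: str, num_apparatuses: int) -> int:
    """Identify the apparatus retaining a node data by hash(ID) mod count."""
    hash_value = zlib.crc32(node_id.encode("utf-8"))
    return hash_value % num_apparatuses
```

With the text's example values, a hash value of 105 and 10 apparatuses yields the remainder 105 mod 10 = 5, so the apparatus with identification number 5 retains the node data.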
Referring back to
The incremental processing engine 21b calculates, on the basis of the number of pieces of processing estimation information, a total number of node data linked to the node data A and performs estimation of a total number of pieces of processing (S26). In this case, the number of pieces of processing estimation information includes the number of stages of the node data A and the number of branches from the node data A. Details of estimation processing for a total number of pieces of processing are explained below with reference to
Subsequently, the incremental processing engine 21b generates messages of the node data B to the node data F, which are branching destinations of the node data A, and transmits a message M2 to the parallel processing management apparatus 10c that retains the node data B to be set as a processing target next (S28). The message M2 includes information indicating the processing target node data B (Key=B), the processing content p1 for node data, information concerning the remaining node data (node data C to F), and the number of pieces of processing estimation information.
Similarly, the incremental processing engine 21c of the parallel processing management apparatus 10c analyzes the received message M2 and specifies the processing target node data B and the processing content p1 for the node data B (S22). The incremental processing engine 21c accesses the data storing unit 20c and acquires the node data B and information (node data G and node data H) n2 concerning connection destinations (branching destinations) of the node data B and executes the processing p1 on the node data B (S23 and S24). The incremental processing engine 21c then writes a result of the processing p1 for the node data B in the data storing unit 20c of the incremental processing engine 21c (S25).
The incremental processing engine 21c calculates, on the basis of the number of pieces of processing estimation information, a total number of node data linked to the node data A and performs estimation of a total number of pieces of processing (S26). In this case, the number of pieces of processing estimation information includes the numbers of stages of the node data A and B and the numbers of branches from the node data A and B. The incremental processing engine 21c generates messages of the node data G and the node data H, which are branching destinations of the node data B, and transmits a message M3 to the parallel processing management apparatus 10 that retains the node data G to be set as a processing target next (S28).
The processing explained with reference to
A determination condition for the parallel processing in the specific example is “a total number of node data is equal to or larger than ten and the number of branches is equal to or larger than three”. For example, after reaching a leaf (a terminal end) of a tree structure, when a total number of node data calculated by the estimation processing for a total number of node data is equal to or larger than ten and the number of branches obtained from processing target node data is equal to or larger than three, the parallel processing management apparatus 10 sets, as a target of parallelization, processing for node data branching from the processing target node data.
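The determination condition of this specific example can be sketched as a predicate (the threshold values come from the text; the function and parameter names are hypothetical):

```python
TOTAL_THRESHOLD = 10   # reference number for the total pieces of processing
BRANCH_THRESHOLD = 3   # reference number of branches from the target node data

def parallelize(total_estimate: float, num_branches: int,
                reached_leaf: bool) -> bool:
    """Decide whether processing of the branching node data is parallelized."""
    return (reached_leaf
            and total_estimate >= TOTAL_THRESHOLD
            and num_branches >= BRANCH_THRESHOLD)
```

For example, an estimated total of 16 with 3 branches after reaching a leaf satisfies the condition, while 2 branches or an estimate of 9 does not.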
In the specific example, the node data corresponding to the start node data is the node data A (nA). As explained with reference to the flowcharts of
Specifically, the incremental processing engine 21 accesses the node data A (nA) in the data storing unit 20, acquires node data (the node data B (nB) to the node data F (nF)) at connection destinations of the node data A (nA), and acquires the number of branches “5” from the node data A (nA). Therefore, the incremental processing engine 21 assumes that the number of branches from a zero-th stage L0 to a first stage L1 is “5”, and calculates the number of node data “5” in the first stage L1. The number of node data in the zero-th stage L0, to which the node data A (nA) serving as the start node data belongs, is “1”.
The incremental processing engine 21 adds up the numbers of node data in the zero-th stage L0 and the first stage L1 and calculates a total number of node data “6 (=1+5)”. In this case, since the node data A (nA) is already processed, the incremental processing engine 21 subtracts the number of processed node data “1” from the total number of node data “6” and calculates the number of unprocessed node data “5 (=6-1)”.
Subsequently, the incremental processing engine 21 generates messages of the five node data at the branching destinations of the node data A (nA) and transmits a message to the parallel processing management apparatus 10 that retains node data to be set as a processing target next (S28 in
The incremental processing engine 21 of the parallel processing management apparatus 10, which receives the message, applies processing to the node data B (nB) (S22 to S25 in
Specifically, the incremental processing engine 21 accesses the node data B (nB) in the data storing unit 20 and acquires node data (the node data G (nG) and the node data H (nH)) at connection destinations of the node data B (nB) and the number of branches “2” from the node data B (nB). Therefore, the incremental processing engine 21 assumes that the number of branches from the first stage L1 to a second stage L2 is “2”, multiplies together the number of node data “5” in the first stage L1 and the number of branches “2” from the first stage L1 to the second stage L2, and calculates the number of node data “10 (=5×2)” in the second stage L2.
The incremental processing engine 21 adds up the numbers of node data in the zero-th stage L0 to the second stage L2 and calculates a total number of node data “16 (=1+5+10)”. In this case, since the node data A (nA) and B (nB) are already processed, the incremental processing engine 21 subtracts the number of processed node data “2” from the total number of node data “16” and calculates the number of unprocessed node data “14 (=16-2)”.
Subsequently, the incremental processing engine 21 generates messages of the two node data at the branching destinations of the node data B (nB) and transmits a message to the parallel processing management apparatus 10 that retains the node data G (nG) to be set as a processing target next (S28 in
The incremental processing engine 21 of the parallel processing management apparatus 10, which receives the message, applies processing to the node data G (nG) (S22 to S25 in
Specifically, the incremental processing engine 21 accesses the node data G (nG) in the data storing unit 20 and finds that the node data G (nG) has no connection destinations, that is, the number of branches from the node data G (nG) is “0”. Therefore, the incremental processing engine 21 assumes that the number of branches from the second stage L2 to a third stage is “0” and calculates the number of node data “0” in the third stage.
The incremental processing engine 21 adds up the numbers of node data in the zero-th stage L0 to the second stage L2 and calculates a total number of node data “16 (=1+5+10)”. In this case, since the node data A (nA), B (nB), and G (nG) are already processed, the incremental processing engine 21 subtracts the number of processed node data “3” from the total number of node data “16” and calculates the number of unprocessed node data “13 (=16-3)”.
Since the processing target node data reaches the leaf, the incremental processing engine 21 determines, on the basis of the calculated total number of pieces of processing, whether processing for node data branching from the processing target node data is set as a target of parallelization (S27 in
Subsequently, the incremental processing engine 21 transmits a message to the incremental processing engine 21 of the parallel processing management apparatus 10 that retains the node data H (nH) to be set as a processing target next (S28).
As explained with reference to
Similarly, the incremental processing engine 21 of the parallel processing management apparatus 10, which receives the message, applies processing to the node data H (nH) (S22 to S25 in
Specifically, the incremental processing engine 21 accesses the node data H (nH) in the data storing unit 20 and finds that the node data H (nH) has no connection destinations, that is, the number of branches from the node data H (nH) is “0”. Therefore, the incremental processing engine 21 assumes that the number of branches from the second stage L2 to the third stage is “0” and calculates the number of node data “0” in the third stage.
The incremental processing engine 21 adds up the numbers of node data in the zero-th stage L0 to the second stage L2 and calculates a total number of node data “16 (=1+5+10)”. In this case, since the node data A (nA), B (nB), G (nG), and H (nH) are already processed, the incremental processing engine 21 subtracts the number of processed node data “4” from the total number of node data “16” and calculates the number of unprocessed node data “12 (=16-4)”.
Since the number of branches from the node data H (nH) is smaller than three, the incremental processing engine 21 does not set the following processing as a target of parallelization. Subsequently, the incremental processing engine 21 transmits a message to the incremental processing engine 21 of the parallel processing management apparatus 10 that retains the node data C (nC) to be set as a processing target next (S28).
Similarly, the incremental processing engine 21 of the parallel processing management apparatus 10, which receives the message, applies processing to the node data C (nC) (S22 to S25 in
The number of branches from the node data C (nC) is “1”, while the number of branches from the node data B (nB), which belongs to the first stage L1 like the node data C (nC), is “2”. Therefore, the incremental processing engine 21 assumes that the number of branches from the first stage L1 to the second stage L2 is the average “1.5 (=(2+1)/2)”, multiplies together the calculated number of node data “5” in the first stage L1 and the number of branches “1.5” from the first stage L1 to the second stage L2, and calculates the number of node data “7.5 (=5×1.5)” in the second stage L2. The incremental processing engine 21 adds up the numbers of node data in the zero-th stage L0 to the second stage L2 and calculates a total number of node data “13.5 (=1+5+7.5)”. In this case, since the node data A (nA), B (nB), G (nG), H (nH), and C (nC) are already processed, the incremental processing engine 21 subtracts the number of processed node data “5” from the total number of node data “13.5” and calculates the number of unprocessed node data “8.5 (=13.5-5)”.
Since the number of branches from the processing target node data C (nC) is smaller than three, the incremental processing engine 21 does not set the following processing as a target of parallelization. Subsequently, the incremental processing engine 21 transmits a message to the incremental processing engine 21 of the parallel processing management apparatus 10 that retains the node data I (nI) to be set as a processing target next (S28).
Similarly, the incremental processing engine 21 of the parallel processing management apparatus 10, which receives the message, applies processing to the node data I (nI) (S22 to S25 in
Specifically, the incremental processing engine 21 accesses the node data I (nI) in the data storing unit 20 and finds that the node data I (nI) has no connection destinations, that is, the number of branches from the node data I (nI) is “0”. Therefore, the incremental processing engine 21 assumes that the number of branches from the second stage L2 to the third stage is “0” and calculates the number of node data “0” in the third stage.
The incremental processing engine 21 adds up the numbers of node data in the zero-th stage L0 to the second stage L2 and calculates a total number of node data “13.5 (=1+5+7.5)”. In this case, since the node data A (nA), B (nB), G (nG), H (nH), C (nC), and I (nI) are already processed, the incremental processing engine 21 subtracts the number of processed node data “6” from the total number of node data “13.5” and calculates the number of unprocessed node data “7.5 (=13.5-6)”.
Since the node data I (nI) is node data of a leaf that does not branch, the incremental processing engine 21 does not set the following processing as a target of parallelization. Subsequently, the incremental processing engine 21 transmits a message to the incremental processing engine 21 of the parallel processing management apparatus 10 that retains the node data D (nD) to be set as a processing target next (S28).
As explained with reference to
The parallel processing management apparatus 10 in this embodiment recalculates a total number of pieces of processing on the basis of the number of stages and the number of branches of the processing target node data in addition to the number of stages and the number of branches of the propagated processed node data. Therefore, even when the number of branches of the processing target node data is not fixed, it is possible to gradually increase accuracy of a total number of pieces of processing.
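The recalculation described above can be sketched as a stage-by-stage estimate in which the number of node data in each stage is the previous stage's count multiplied by the average number of branches observed from that stage (a sketch consistent with the walkthrough's arithmetic; the function name is hypothetical):

```python
def estimate_total(branch_counts_per_stage):
    """Estimate the total number of node data linked to the start node data.

    branch_counts_per_stage[i] lists the numbers of branches observed so far
    from node data in stage i (stage 0 contains only the start node data).
    """
    total = 1.0        # stage 0 holds the single start node data
    stage_count = 1.0  # estimated number of node data in the current stage
    for counts in branch_counts_per_stage:
        average = sum(counts) / len(counts)  # average branches from this stage
        stage_count *= average               # estimated size of the next stage
        total += stage_count
    return total
```

For example, after the node data in the first stage with branch counts 2 and 1 have been observed, the observed counts are [5] from stage 0 and [2, 1] from stage 1, giving the estimate 1 + 5 + 7.5 = 13.5 as in the walkthrough; subtracting the number of processed node data then yields the number of pieces of remaining processing.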
The parallel processing management system in this embodiment traces related node data in a depth-first manner. By tracing the related node data in a depth-first manner, the parallel processing management system reaches node data of a leaf early. Therefore, the parallel processing management system can acquire the numbers of branches in all the stages of a tree structure at an early stage. Therefore, the parallel processing management apparatus 10 can estimate a total number of pieces of processing at an early stage of the data processing.
Similarly, the incremental processing engine 21 of the parallel processing management apparatus 10, which receives the message, applies processing to the node data D (nD) (S22 to S25 in
Specifically, the incremental processing engine 21 accesses the node data D (nD) in the data storing unit 20 and acquires node data (the node data J (nJ) to L (nL)) at connection destinations of the node data D (nD) and the number of branches “3” from the node data D (nD). Therefore, the incremental processing engine 21 assumes that the number of branches from the first stage L1 to the second stage L2 is the average “2 (=(2+1+3)/3)”, multiplies together the calculated number of node data “5” in the first stage L1 and the number of branches “2” from the first stage L1 to the second stage L2, and calculates the number of node data “10 (=5×2)” in the second stage L2.
The incremental processing engine 21 adds up the numbers of node data in the zero-th stage L0 to the second stage L2 and calculates a total number of node data “16 (=1+5+10)”. In this case, since the node data A (nA), B (nB), G (nG), H (nH), C (nC), I (nI), and D (nD) are already processed, the incremental processing engine 21 subtracts the number of processed node data “7” from the total number of node data “16” and calculates the number of unprocessed node data “9 (=16-7)”.
Since the total number of pieces of processing is equal to or larger than ten and the number of branches from the processing target node data D (nD) is equal to or larger than three, the incremental processing engine 21 sets processing of the node data J (nJ) to L (nL) as a target of parallelization. Therefore, the incremental processing engine 21 transmits messages respectively to the incremental processing engines 21 of the parallel processing management apparatuses 10 that retain the node data J (nJ), the node data K (nK), and the node data L (nL) to be set as processing targets next (S28).
Processing of the node data J (nJ) is explained below; processing of the node data K (nK) and the node data L (nL) is the same. Similarly, the incremental processing engine 21 of the parallel processing management apparatus 10, which receives the message, applies processing to the node data J (nJ) (S22 to S25 in
Specifically, the incremental processing engine 21 accesses the node data J (nJ) in the data storing unit 20 and finds that the node data J (nJ) has no connection destinations, that is, the number of branches from the node data J (nJ) is “0”. Therefore, the incremental processing engine 21 assumes that the number of branches from the second stage L2 to the third stage is “0” and calculates the number of node data “0” in the third stage.
The incremental processing engine 21 adds up the numbers of node data in the zero-th stage L0 to the second stage L2 and calculates a total number of node data “16 (=1+5+10)”. In this case, since the node data A (nA), B (nB), G (nG), H (nH), C (nC), I (nI), D (nD), and J (nJ) are already processed, the incremental processing engine 21 subtracts the number of processed node data “8” from the total number of node data “16” and calculates the number of unprocessed node data “8 (=16-8)”.
Since the node data J (nJ) is node data of a leaf that does not branch, the incremental processing engine 21 does not set the following processing as a target of parallelization. Since connection destinations of the node data J (nJ) are absent, the incremental processing engine 21 does not transmit a message.
The incremental processing engine 21 of the parallel processing management apparatus 10, which receives the message from the parallel processing management apparatus 10 that retains the node data D (nD), applies processing to the node data E (nE) (S22 to S25 in
Specifically, the incremental processing engine 21 accesses the node data E in the data storing unit 20 and acquires the node data at a connection destination of the node data E (the node data M) and the number of branches "1" from the node data E. Therefore, the incremental processing engine 21 multiplies together the deemed number of node data "5" in the first stage L1 and the average "1.75 (=(2+1+3+1)/4)" of the numbers of branches from the first stage L1 to the second stage L2 and calculates the number of node data "8.75 (=5×1.75)" in the second stage L2.
The incremental processing engine 21 adds up the numbers of node data in the zero-th stage L0 to the second stage L2 and calculates a total number of node data "14.75 (=1+5+8.75)". In this case, since the node data A, B, G, H, C, I, D, J to L, and E are already processed, the incremental processing engine 21 subtracts the number of processed node data "11" from the total number of node data "14.75" and calculates the number of unprocessed node data "3.75 (=14.75−11)".
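The averaged estimate above can be sketched as follows (hypothetical variable names; the sample branch counts and the processed count are taken from the example). The second-stage count is estimated as the deemed first-stage count multiplied by the average number of branches observed so far.

```python
# Sketch of the averaged estimate (hypothetical names, values from the
# example): stage-2 count = deemed stage-1 count x average branch count.
deemed_stage1 = 5
branch_samples = [2, 1, 3, 1]                             # observed so far
avg_branches = sum(branch_samples) / len(branch_samples)  # 1.75 (= 7 / 4)
est_stage2 = deemed_stage1 * avg_branches                 # 8.75 (= 5 x 1.75)
est_total = 1 + deemed_stage1 + est_stage2                # 14.75
unprocessed = est_total - 11                              # 3.75 (11 done)
print(est_total, unprocessed)
```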
Since the number of branches is smaller than three, the incremental processing engine 21 does not set processing for the node data M branching from the node data E as a target of parallelization (S27 in
As explained with reference to
Note that, when the remaining number of pieces of processing is equal to or larger than a reference value, the parallel processing management apparatus 10 may set processing of node data at branching destinations of the processing target node data as a target of the parallel processing. Consequently, when the total number of pieces of processing is large and the remaining number of pieces of processing is large, the parallel processing management apparatus 10 can set unprocessed node data processing as a target of the parallel processing.
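The threshold tests described here can be sketched as a single predicate. The function name and the default threshold values below are illustrative assumptions, not values recited by the embodiment; they only show the shape of the decision (estimated total, remaining count, and branch count each compared against a reference).

```python
def should_parallelize(est_total, remaining, num_branches,
                       total_ref=10, remaining_ref=5, branch_ref=3):
    """Hypothetical decision sketch: set the processing of nodes at
    branching destinations as a target of parallel processing only when
    the estimated total, the remaining count, and the branch count all
    reach their reference values (thresholds here are illustrative)."""
    return (est_total >= total_ref
            and remaining >= remaining_ref
            and num_branches >= branch_ref)
```

For example, with an estimated total of 16, 8 pieces remaining, and 3 branches the predicate holds, while a single branch (as at node E) fails the branch-count test.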
When an estimated total number of kinds of processing is equal to or larger than a reference value and when the number of branches is equal to or larger than a reference number of branches (in the example illustrated in
Note that, in the examples illustrated in
On the other hand,
As illustrated in
As explained above, the parallel processing management program in this embodiment includes managing data processing by a processing target node among a plurality of nodes each having relations with other nodes, the processing target node being traced from a start node on the basis of the relations, and calculating, while the processing target node performs the data processing, a total number of nodes linked to the start node on the basis of the numbers of stages indicating the distances of the processed nodes and the processing target node from the start node and the numbers of branches from the processed nodes and the processing target node.
With the parallel processing management program in this embodiment, a total number of pieces of processing is recalculated on the basis of the number of stages and the number of branches of the propagated processed node data and the number of stages and the number of branches of the processing target node data. Consequently, even during data processing, it is possible to estimate a total number of pieces of processing of the data processing. Therefore, with the parallel processing management program, even during the data processing, it is possible to control node data on the basis of the total number of pieces of processing.
With the parallel processing management program in this embodiment, every time data processing is performed, a total number of pieces of processing is recalculated on the basis of the number of stages and the number of branches of the propagated and accumulated processed node data and the number of stages and the number of branches of the processing target node data. Consequently, it is possible to correct the total number of pieces of processing and gradually reduce an error in a process of the data processing. Therefore, with the parallel processing management program in this embodiment, the total number of pieces of processing is recalculated on the basis of related information of the node data accumulated by propagation. Consequently, it is possible to obtain a highly accurate total number of pieces of processing at an earlier stage in the process of the data processing.
With the parallel processing management program in this embodiment, a total number of pieces of processing is estimated on the basis of the number of stages and the number of branches of the processed node data accumulated by propagation and the number of stages and the number of branches of the processing target node data. Consequently, even when the number of branches of the processing target node data is not fixed, it is possible to estimate a more highly accurate total number of pieces of processing.
In the managing of the parallel processing management program in this embodiment, further, when the calculated total number of nodes exceeds a reference number, processing for nodes branching from the processing target node is set as a target of the parallel processing.
With the parallel processing management program in this embodiment, when an estimated value of a total number of node data (a total number of pieces of processing) is equal to or larger than a reference value (in the example illustrated in
In the managing of the parallel processing management program in this embodiment, further, when the number of branches from the processing target node exceeds a reference number of branches, processing for nodes branching from the processing target node is set as a target of the parallel processing. With the parallel processing management program in this embodiment, further, when the number of branches is equal to or larger than a reference number of branches (in the example illustrated in
In the calculating of the parallel processing management program in this embodiment, further, the number of unprocessed nodes is calculated on the basis of the calculated total number of nodes linked to the start node and the number of processed nodes. With the parallel processing management program in this embodiment, it is possible to estimate the remaining processing time on the basis of the total number of pieces of processing. Therefore, it is possible to reduce the average of the turnaround times of data processing by controlling the execution of the data processing on the basis of the remaining processing time calculated by the parallel processing management program.
In the process of the parallel processing management program in this embodiment, the numbers of stages and the numbers of branches of processed nodes are received from a previously processed node. The numbers of stages and the numbers of branches of the processed nodes, with the number of stages and the number of branches of the processing target node added, are transmitted to the node to be set as a processing target next. With the parallel processing management program in this embodiment, the number of stages and the number of branches of the processed node data are propagated to node data to be set as a processing target next time. Consequently, even during data processing, it is possible to estimate a total number of pieces of processing of the data processing.
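The propagation described here can be sketched as a small message-building step. The function and record layout below are hypothetical illustrations, assuming each record is a (stage, branch-count) pair accumulated along the traced path.

```python
# Hypothetical sketch of the propagation: a processed node forwards the
# accumulated (stage, branch_count) records it received, appending its
# own record, so the next processing target can recalculate the total.
def make_message(received_records, my_stage, my_branches):
    # Non-destructive append: the received list is not mutated.
    return received_records + [(my_stage, my_branches)]

# A node at stage 2 with 1 branch extends the records it received.
msg = make_message([(0, 5), (1, 2)], 2, 1)
print(msg)
```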
In the parallel processing management program in this embodiment, nodes are traced in a depth first manner from a start node on the basis of relations among nodes. With the parallel processing management program in this embodiment, since the nodes are traced in a depth first manner, it is possible to acquire the numbers of branches of all the numbers of stages in a tree structure at an early stage. Therefore, it is possible to estimate a total number of pieces of processing at an early stage of a process of data processing.
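The depth-first tracing can be sketched as follows. The graph representation (an adjacency dict) and function name are illustrative assumptions; the point is that following one branch down to a leaf before backtracking visits every stage (depth) of the tree early, so branch counts for all stages are sampled soon after the start.

```python
# Hypothetical DFS sketch: trace nodes depth first from the start node,
# recording each node together with its stage (distance from the start).
def trace_depth_first(graph, start):
    order, stack, seen = [], [(start, 0)], {start}
    while stack:
        node, stage = stack.pop()
        order.append((node, stage))
        # reversed() keeps the original left-to-right branch order.
        for nxt in reversed(graph.get(node, [])):
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, stage + 1))
    return order

print(trace_depth_first({'A': ['B', 'C'], 'B': ['D']}, 'A'))
```

For the graph `{'A': ['B', 'C'], 'B': ['D']}` the trace visits D (stage 2) before C (stage 1), reaching the deepest stage before the breadth of the tree is exhausted.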
All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2013-210289 | Oct 2013 | JP | national |