This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2022-111887, filed on Jul. 12, 2022, the entire contents of which are incorporated herein by reference.
The embodiments discussed herein are related to a management device, an arithmetic processing device, a load distribution method of the arithmetic processing device, and a load distribution program of the arithmetic processing device.
A technology has been proposed that performs processing on large-scale data at high speed by executing tasks with loads distributed among a plurality of nodes by high-performance computing (HPC). For example, screening performed in an initial stage of drug discovery research has to deal with a very large number of compound groups. Therefore, a technology of virtual screening using the HPC has been proposed.
Japanese Laid-open Patent Publication No. 11-053328, Japanese Laid-open Patent Publication No. 05-298272, and “VirtualFlow”, [online], [retrieved on Jun. 29, 2022], Internet <https://docs.virtual-flow.org/documentation/-LdE8RH9UN4HKpckqkX3/> are disclosed as related art.
According to an aspect of the embodiments, a management device includes: a memory; and a processor coupled to the memory and configured to: classify a plurality of arithmetic processing devices that executes a plurality of tasks in parallel by distributing loads into a plurality of arithmetic processing device groups; select a representative arithmetic processing device from arithmetic processing devices that belong to each arithmetic processing device group; notify the representative arithmetic processing device of identification information of other arithmetic processing devices that belong to an arithmetic processing device group to which the representative arithmetic processing device belongs; instruct the representative arithmetic processing device to acquire information regarding tasks to be executed by the arithmetic processing devices that belong to the arithmetic processing device group from a first task list as a shared file in which information regarding the plurality of tasks is stored, and to generate a second task list; notify each of the other arithmetic processing devices of identification information of the representative arithmetic processing device; and instruct each of the other arithmetic processing devices to acquire information regarding tasks to be executed by the representative arithmetic processing device and each of the other arithmetic processing devices from the second task list generated by the representative arithmetic processing device, and to generate a third task list.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
Furthermore, as a technology related to load distribution among a plurality of nodes, a parallel computer system has been proposed in which a loop is divided and executed in parallel by each node processor. In this system, a host processor obtains a division loop length for an original loop described in a program by dividing, by the number of node processors, the maximum integer that is less than the loop length of the original loop and divisible by the number of node processors. Then, the same number of division loops as the number of node processors, each having this division loop length, are generated from the original loop; the execution of each of these division loops is assigned to a node processor, and the remaining execution of the original loop is assigned to the host processor itself.
Furthermore, a parallel processing apparatus has been proposed that executes an iterative sequence of instructions by arranging the sequence into subtasks and allocating those subtasks to processors. The division and the allocation are conducted in such a manner as to minimize data contention among the processors and also to maximize locality of data to the processors.
For example, in the technology of virtual screening using the HPC described above, information regarding compounds (ligands) to be screened is managed in a task list as a single shared file. Therefore, each of a plurality of nodes that executes parallel processing needs to generate, from the shared file, a subtask list of tasks to be executed by the own node. Since contention in accessing the shared file has to be avoided, the generation of the subtask list by each node is sequential processing. Therefore, as the number of nodes that execute the parallel processing increases, the standby time for accessing the shared file increases, and the generation of the subtask list by each node becomes an overhead when tasks are executed in the HPC.
Furthermore, the technology related to load distribution among a plurality of nodes described above focuses on efficiently dividing loop processing among processors during program execution, and may not resolve the problem that the standby time for accessing a shared file increases.
As one aspect, the disclosed technology aims to reduce a standby time for accessing a task list shared by a plurality of arithmetic processing devices that executes parallel processing of tasks.
Hereinafter, examples of embodiments according to the disclosed technology will be described with reference to the drawings. In each of the following embodiments, an example of applying the disclosed technology to virtual screening will be described.
First, before each embodiment is described, virtual screening, the existing technology of virtual screening using high-performance computing (HPC) described above, and a problem of the existing technology will be described.
Virtual screening searches, by calculation using a computer, for a ligand that binds to a substance of interest (target protein) from among a large number of candidates. For example, as illustrated in
An actual library of ligands is managed in one file as a ligand group in which a plurality of ligands is grouped together. For example, in a library of ligands illustrated in a left diagram of
In a case where the HPC is used for the virtual screening, ligand groups are distributed to each node as illustrated in
For example, as illustrated in
For example, as illustrated in
Therefore, in each of the following embodiments, by setting a group of a plurality of nodes as a node group, determining a representative of the node group, and accessing a task list as a shared file by the representative, a standby time for accessing the task list is reduced. Hereinafter, each embodiment will be described.
In
In
The classification unit 12 classifies a plurality of nodes into a plurality of node groups. Hereinafter, each node group is also referred to as a “team”, and a node belonging to the team is also referred to as a “member”. For example, the classification unit 12 acquires a list (hereinafter referred to as an “identifier (ID) list”) of identification information (hereinafter referred to as “node ID”) of all nodes reserved for executing a program of the HPC. The classification unit 12 classifies all nodes into a plurality of teams based on the number of members belonging to each team, which is specified in advance by a user.
For example, as illustrated in
The selection unit 14 selects a representative node from nodes belonging to each team. For example, in a case where the node ID is a number, the selection unit 14 selects a node with the smallest node ID among the nodes 20 belonging to each team as the representative node. Note that the method of selecting the representative node is not limited to this, and a node with the largest node ID among the nodes 20 belonging to each team may be selected as the representative node, or the representative node may be randomly selected.
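The classification into teams and the selection of a representative described above can be sketched as follows. This is a minimal sketch assuming integer node IDs and a user-specified number of members per team; the function names are hypothetical and not part of the embodiment.

```python
# Sketch of the classification unit 12 and selection unit 14 (names of the
# functions are hypothetical; integer node IDs are an assumption).
def classify_into_teams(node_ids, members_per_team):
    """Split the reserved node IDs into teams of the specified size."""
    ids = sorted(node_ids)
    return [ids[i:i + members_per_team]
            for i in range(0, len(ids), members_per_team)]

def select_representative(team):
    """Select the node with the smallest node ID as the representative."""
    return min(team)

teams = classify_into_teams([3, 1, 4, 0, 2, 5], members_per_team=3)
reps = [select_representative(team) for team in teams]
# teams -> [[0, 1, 2], [3, 4, 5]], reps -> [0, 3]
```

As noted above, selecting the largest node ID or a random member would work equally well; only the smallest-ID rule is shown here.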
The instruction unit 16 notifies a representative node that it is the representative of a team, and notifies it of the node IDs of the other members belonging to that team. Furthermore, the instruction unit 16 instructs the representative node to acquire information regarding tasks to be executed by the members belonging to the team from the task list 32, which is a shared file that stores information regarding a plurality of tasks, and to generate a team task list. Furthermore, the instruction unit 16 notifies each of the members other than the representative node of the node ID of the representative node of the team to which the member belongs. Furthermore, the instruction unit 16 instructs each member to acquire the information regarding the tasks to be executed by the respective members from the team task list generated by the representative node, and to generate a subtask list.
In
The generation unit 22 generates a team task list from the task list 32 in a case where it is notified from the management device 10 that the node 20 is a representative of a team and is notified of node IDs of other members belonging to the team. For example, in a case where the task list 32 is accessible, the generation unit 22 locks the task list 32 and then accesses the task list 32. Then, the generation unit 22 extracts information regarding tasks one by one (line by line) from the task list 32, and adds the information to the team task list. The generation unit 22 generates the team task list by ending the acquisition of the information regarding the tasks at a stage when a total size of the acquired information regarding the tasks exceeds a threshold according to the number of members of the team. The generation unit 22 unlocks the task list 32 when the generation of the team task list is completed.
Furthermore, in a case where the node 20 functions as the representative node, the generation unit 22 acquires information regarding tasks to be executed by the own node from the generated team task list, and generates a member task list. Furthermore, in a case where the node 20 functions as a member other than the representative node, the generation unit 22 acquires information regarding tasks to be executed by the own node from the team task list, and generates a member task list. The member other than the representative node generates the member task list of the own node in a case where the member is notified by the notification unit 24 (details will be described later) of the representative node that the team task list has been generated. The team task list is a shared file shared by members belonging to the same team. Therefore, similarly to a case where the representative node accesses the task list 32, the generation unit 22 of each member locks the team task list and then accesses the team task list, and when the generation of the member task list is completed, unlocks the team task list.
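The lock, extract, and unlock flow of the generation unit 22 can be sketched as follows. The one-task-per-line file layout, the byte-size threshold, and the use of POSIX advisory locks via `fcntl` are assumptions of this sketch, not details fixed by the embodiment.

```python
# Sketch of a representative node moving tasks from the shared task list
# into its team task list under an exclusive lock (POSIX-only via fcntl;
# one task per line and a byte-size budget are assumptions).
import fcntl

def generate_team_task_list(task_list_path, team_list_path,
                            threshold, num_members):
    """Move task lines from the shared task list into a team task list
    until their total size exceeds threshold * num_members."""
    budget = threshold * num_members
    taken, size = [], 0
    with open(task_list_path, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)      # lock the shared task list
        lines = f.readlines()
        while lines and size <= budget:    # stop once the budget is exceeded
            line = lines.pop(0)
            taken.append(line)
            size += len(line)
        f.seek(0)
        f.truncate()
        f.writelines(lines)                # leave remaining tasks for others
        fcntl.flock(f, fcntl.LOCK_UN)      # unlock the shared task list
    with open(team_list_path, "w") as out:
        out.writelines(taken)
```

Each member would then apply the same lock-and-extract pattern to the team task list 34 to build its member task list 36.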
In a case where the node 20 functions as a representative node, the notification unit 24 notifies the other members belonging to the team that the team task list has been generated when the generation unit 22 completes the generation of the team task list. Note that it is assumed that notification destinations such as internet protocol (IP) addresses may be identified from the node IDs of the members.
With reference to
Hereinafter, the team task lists 34A and 34B will be referred to as “team task list 34” in a case where they are described without distinction, and the member task lists 36A and 36B will be referred to as “member task list 36” in a case where they are described without distinction. Note that the task list 32 is an example of a “first task list” of the disclosed technology, the team task list 34 is an example of a “second task list” of the disclosed technology, and the member task list 36 is an example of a “third task list” of the disclosed technology.
When the generation of the member task list 36 by the generation unit 22 is completed, the execution unit 26 executes screening of tasks included in the member task list 36, for example, ligands. For example, the execution unit 26 extracts information regarding the tasks line by line from the member task list 36, accesses information regarding ligand groups indicated by the information regarding the tasks, and calculates a score indicating a degree of binding between each ligand included in each ligand group and a substance of interest.
The management device 10 may be implemented by, for example, a computer 40 illustrated in
The storage device 43 is, for example, a hard disk drive (HDD), a solid state drive (SSD), a flash memory, or the like. The storage device 43 as a storage medium stores a first load distribution program 50 for causing the computer 40 to function as the management device 10. The first load distribution program 50 includes a classification process control instruction 52, a selection process control instruction 54, and an instruction process control instruction 56.
The CPU 41 reads the first load distribution program 50 from the storage device 43, expands the first load distribution program 50 in the memory 42, and sequentially executes the control instructions included in the first load distribution program 50. The CPU 41 executes the classification process control instruction 52 so as to operate as the classification unit 12 illustrated in
The node 20 may be implemented by, for example, a computer 60 illustrated in
The storage device 63 stores a second load distribution program 70 for causing the computer 60 to function as the node 20. The second load distribution program 70 includes a generation process control instruction 72, a notification process control instruction 74, and an execution process control instruction 76.
The CPU 61 reads the second load distribution program 70 from the storage device 63, expands the second load distribution program 70 in the memory 62, and sequentially executes the control instructions included in the second load distribution program 70. The CPU 61 executes the generation process control instruction 72 so as to operate as the generation unit 22 illustrated in
Note that the first load distribution program 50 and the second load distribution program 70 are examples of a “load distribution program of the arithmetic processing device” of the disclosed technology. Furthermore, the functions implemented by each of the first load distribution program 50 and the second load distribution program 70 may be implemented by, for example, a semiconductor integrated circuit, for example, an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or the like.
Next, operation of the load distribution system 100 according to the first embodiment will be described. First, the management device 10 executes management processing illustrated in
First, the management processing illustrated in
In Step S10, the classification unit 12 acquires an ID list of all nodes reserved for executing a program of the HPC. Next, in Step S12, the classification unit 12 classifies all nodes into a plurality of teams based on the number of members specified in advance by a user. Next, in Step S14, the selection unit 14 selects a node with the smallest node ID among the nodes 20 belonging to each team as a representative node.
Next, in Step S16, the instruction unit 16 notifies the representative node that it is the representative of a team, notifies it of the node IDs of the other members belonging to that team, and instructs the representative node to generate the team task list 34 and the member task list 36 of the own node. Furthermore, the instruction unit 16 notifies each of the members other than the representative node of the node ID of the representative node of the team to which the member belongs, and instructs the member to wait for notification from the representative node and to generate the member task list 36, and the management processing ends.
Next, the representative node processing illustrated in
In Step S20, the generation unit 22 determines whether or not the task list 32 of the shared file system 30 is accessible. In a case where the task list 32 is locked and is not accessible, the processing proceeds to Step S22, stands by for a certain period of time, and returns to Step S20. In a case where the task list 32 is accessible, the processing proceeds to Step S24.
In Step S24, the generation unit 22 locks the task list 32 and then accesses the task list 32. Next, in Step S26, the generation unit 22 determines whether or not there is a ligand in the task list 32, for example, whether or not there is a remaining line in the task list 32. In a case where there is a ligand in the task list 32, the processing proceeds to Step S28, and in a case where there is no ligand, the representative node processing ends.
In Step S28, the generation unit 22 repeats processing of extracting the ligand group (information regarding tasks) in the first line of the task list 32 and adding the extracted ligand group to the team task list 34, until the total size of the added ligand groups exceeds a threshold × the number of members. Next, in Step S30, the generation unit 22 unlocks the task list 32.
Next, in Step S32, the generation unit 22 determines whether or not the team task list 34 is accessible. In a case where it is not accessible, the processing proceeds to Step S34, stands by for a certain period of time, and returns to Step S32. In a case where it is accessible, the processing proceeds to Step S36.
In Step S36, the generation unit 22 locks the team task list 34 and then accesses the team task list 34. Next, in Step S38, the generation unit 22 determines whether or not there is a ligand in the team task list 34. In a case where there is a ligand in the team task list 34, the processing proceeds to Step S40, and in a case where there is no ligand, the processing returns to Step S20.
In Step S40, the generation unit 22 repeats processing of extracting the ligand group in the first line of the team task list 34 and adding the extracted ligand group to the member task list 36 of the own node, until the total size of the added ligand groups exceeds a threshold. Alternatively, the generation unit 22 executes the processing until the total size of the added ligand groups reaches the capacity of the member task list 36 of the own node. Next, in Step S42, the generation unit 22 unlocks the team task list 34.
Next, in Step S44, it is determined whether or not the notification unit 24 has notified the other members belonging to the team that the generation of the team task list 34 has been completed. In a case where the completion has not been notified, the processing proceeds to Step S46, and in a case where the completion has already been notified, the processing proceeds to Step S48. In Step S46, the notification unit 24 notifies the other members belonging to the team that the team task list 34 has been generated. In Step S48, the execution unit 26 executes screening based on the member task list 36 of the own node, and returns to Step S32 in a case where the member task list 36 becomes empty.
Next, the member processing illustrated in
In Step S60, the generation unit 22 stands by until the notification of the completion of the generation of the team task list 34 is received from the representative node. When the notification is received, in Steps S62 to S72, the generation unit 22 generates the member task list 36 of the own node similarly to the processing in Steps S32 to S42 of the representative node processing (
It is sufficient that the results of the screening executed in the respective nodes 20 be aggregated by the management device 10 so that candidates for ligands that bind to the substance of interest are presented in order of their scores.
As described above, according to the load distribution system according to the first embodiment, the management device classifies a plurality of nodes into a plurality of teams, and selects a representative node from nodes belonging to each team. Then, the management device notifies the representative node that it is a representative and of node IDs of other members, and notifies the other members of a node ID of the representative node. Then, the representative node generates a team task list to be shared by the team from a task list as a shared file, each node generates a member task list of the own node from the team task list, and each node executes tasks based on the member task list of the own node. With this configuration, it is possible to reduce a standby time for accessing a task list shared by a plurality of arithmetic processing devices that executes parallel processing of tasks, and to shorten a task list generation time in all nodes.
With reference to
In an existing method, since sequential processing is performed in which all nodes sequentially access the task list 32 as a shared file, a time needed to acquire tasks increases in proportion to the number of nodes. On the other hand, in the method of the present embodiment (hereinafter referred to as “present method”), only a representative node of a team acquires tasks from the task list 32 of the shared file system. Then, each member acquires tasks from the team task list 34 generated by the representative node of the team (arrows in
Furthermore, when it is assumed that the number of teams is G, the number of members of the team is M, and a time needed to acquire tasks by one node 20 is t, in the existing method, a time T until all nodes complete generation of the member task list 36 is T=G*M*t. On the other hand, in the present method, a time until the generation of the team task list 34 is completed, for example, a time T′ for the representative node to access the task list 32 is T′=G*t. Furthermore, the time T until all nodes complete the generation of the member task list 36 is T=(G+M)*t, which may be greatly shortened compared to the existing method.
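The effect of these formulas can be checked with concrete numbers; the values of G, M, and t below are hypothetical and chosen only for illustration.

```python
# Worked example of the timing formulas above, with hypothetical values:
# G = 8 teams, M = 16 members per team, t = 1 unit of acquisition time.
G, M, t = 8, 16, 1
T_existing = G * M * t   # existing method: all nodes access the shared file
T_team = G * t           # time until all team task lists are generated
T_present = (G + M) * t  # present method: all member task lists completed
# T_existing = 128, T_team = 8, T_present = 24
```

With these values the present method finishes task list generation in less than a fifth of the existing method's time.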
Next, a second embodiment will be described. Note that, in a load distribution system according to the second embodiment, similar components to those of the load distribution system 100 according to the first embodiment are denoted by the same reference signs, and detailed description thereof will be omitted.
As illustrated in
In
The determination unit 213 divides tasks included in the task list 32 by the number of teams, and determines a range of tasks in the task list 32 assigned to each team. For example, as illustrated in
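One possible way for the determination unit 213 to compute the assigned ranges is an even split of the task list's lines; spreading the remainder over the first teams, as below, is an assumption, since the embodiment does not fix the split rule.

```python
# Sketch of dividing the task list into per-team assigned line ranges
# (end index exclusive); the even-split rule is an assumption.
def assign_ranges(num_tasks, num_teams):
    base, extra = divmod(num_tasks, num_teams)
    ranges, start = [], 0
    for i in range(num_teams):
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges

# e.g. 10 tasks over 3 teams -> [(0, 4), (4, 7), (7, 10)]
```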
The instruction unit 216 notifies, in addition to the notification and the instruction to each node 220 by the instruction unit 16 of the first embodiment, a representative node of each team of an assigned range for the team determined by the determination unit 213. Then, the instruction unit 216 instructs each representative node to generate a team task list 34 from the assigned range for the team in the task list 32.
In
The generation unit 222 generates, in a case where an assigned range is notified from the management device 210, for example, in a case where the node 220 functions as a representative node, the team task list 34 from the notified assigned range in the task list 32. For example, by accessing the assigned range for the own team in the task list 32, extracting information regarding tasks in the assigned range from the task list 32, and adding the extracted information to the team task list 34, the generation unit 222 generates the team task list 34. At this time, the task list 32 does not need to be locked because the assigned ranges for the respective teams are divided in advance and there is no access contention. Therefore, each representative node may access the task list 32 at the same time.
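Because the assigned ranges are disjoint, a representative node can read its range with no lock at all; a minimal sketch, again assuming one task per line.

```python
# Sketch of the generation unit 222 reading only its team's assigned line
# range [start, end) from the shared task list; no lock is needed because
# the per-team ranges do not overlap.
def generate_team_task_list_from_range(task_list_path, start, end):
    with open(task_list_path) as f:
        lines = f.readlines()
    return lines[start:end]   # the tasks assigned to this team only
```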
Furthermore, the generation unit 222 generates a member task list 36 similarly to the generation unit 22 of the first embodiment.
With reference to
The management device 210 may be implemented by, for example, a computer 40 illustrated in
A CPU 41 reads the first load distribution program 250 from the storage device 43, expands the first load distribution program 250 in a memory 42, and sequentially executes the control instructions included in the first load distribution program 250. The CPU 41 executes the determination process control instruction 253 so as to operate as the determination unit 213 illustrated in
The node 220 may be implemented by, for example, a computer 60 illustrated in
A CPU 61 reads the second load distribution program 270 from the storage device 63, expands the second load distribution program 270 in a memory 62, and sequentially executes the control instructions included in the second load distribution program 270. The CPU 61 executes the generation process control instruction 272 so as to operate as the generation unit 222 illustrated in
Note that the first load distribution program 250 and the second load distribution program 270 are examples of the “load distribution program of the arithmetic processing device” of the disclosed technology. Furthermore, the functions implemented by each of the first load distribution program 250 and the second load distribution program 270 may be implemented by, for example, a semiconductor integrated circuit, for example, an ASIC, an FPGA, or the like.
Next, operation of the load distribution system 200 according to the second embodiment will be described. First, the management device 210 executes management processing illustrated in
First, the management processing illustrated in
After Steps S10 and S12, in the next Step S213, the determination unit 213 divides the tasks included in the task list 32 by the number of teams, and determines the range of tasks in the task list 32 assigned to each team. After Step S14, in the next Step S216, the instruction unit 216 notifies a representative node that it is the representative of a team, of the node IDs of the other members belonging to that team, and of the assigned range for the team. Then, the instruction unit 216 instructs the representative node to generate the team task list 34 and the member task list 36 of the own node. Furthermore, the instruction unit 216 notifies each of the members other than the representative node of the node ID of the representative node of the team to which the member belongs, and instructs the member to wait for notification from the representative node and to generate the member task list 36, and the management processing ends.
Note that the processing of Step S213 and the processing of Step S14 may be executed in a reversed order.
Next, the representative node processing illustrated in
In Step S220, the generation unit 222 accesses the task list 32. Next, in Step S222, by extracting information regarding tasks in the assigned range for the own team notified from the management device 210 from the task list 32, and adding the extracted information to the team task list 34, the generation unit 222 generates the team task list 34. Hereinafter, processing of Steps S32 and S48 is executed similarly to that in the first embodiment, and the representative node processing ends.
As described above, according to the load distribution system according to the second embodiment, the management device divides tasks included in a task list as a shared file by the number of teams, determines a range of tasks in the task list assigned to each team, and notifies a representative node of the assigned range. Then, the representative node generates a team task list from the range of the own team in the task list. With this configuration, it is possible to reduce a standby time for accessing a task list shared by a plurality of arithmetic processing devices that executes parallel processing of tasks, and to shorten a task list generation time in all nodes.
With reference to
Furthermore, when it is assumed that the total number of nodes is N, the number of teams is G, and a time needed to acquire tasks by one node 20 is t, in the existing method, a time T until all nodes complete generation of the member task list 36 is T=N*t. On the other hand, in the present method, a time until the generation of the team task list 34 is completed, for example, a time T′ for the representative node to access the task list 32 is T′=t. Furthermore, the time T until all nodes complete the generation of the member task list 36 is T=(N/G+1)*t, which may be greatly shortened compared to the existing method.
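As with the first embodiment, these formulas can be checked with concrete numbers; the values of N, G, and t below are hypothetical.

```python
# Worked example comparing the timing formulas, with hypothetical values:
# N = 128 nodes, G = 8 teams (so M = N / G = 16 members), t = 1.
N, G, t = 128, 8, 1
T_existing = N * t            # existing method: sequential access by all nodes
T_second = (N // G + 1) * t   # second embodiment: ranges read in parallel
T_first = (G + N // G) * t    # first embodiment, for comparison
# T_existing = 128, T_second = 17, T_first = 24
```

Removing the lock contention on the task list 32 thus shortens the generation time further compared to the first embodiment.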
Note that, in each of the embodiments described above, the case where the disclosed technology is applied to the virtual screening has been described. However, the embodiments are not limited to this. The disclosed technology may be applied to load distribution processing that needs access to a shared file. For example, it is effective in a case where a size of each task is different.
Furthermore, in each of the embodiments described above, the case where the number of members of each team is equal has been described. However, the number of members of each team may be different from each other.
Furthermore, in each of the embodiments described above, the first load distribution program and the second load distribution program are stored (installed) in the storage device in advance. However, the embodiments are not limited to this. The programs according to the disclosed technology may be provided in a form stored in a storage medium such as a compact disc read only memory (CD-ROM), a digital versatile disc read only memory (DVD-ROM), or a universal serial bus (USB) memory.
All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---
2022-111887 | Jul 2022 | JP | national |