The present invention relates to a scheduling method and system for remote object procedure calls, and more particularly to a scheduling method and system based on the management of stateful invocations and stateless invocations, which is especially suitable for remote object procedure calls made over a TCP (Transmission Control Protocol) channel.
Distributed object-oriented environments have become important platforms for parallel and distributed service frameworks. Connected by high-speed networks, components located in different networks can share resources and serve one another. Improving the performance and utility of the distributed object-oriented computing environment has therefore become an important issue.
Grid environments have been under development for a long time, and researchers have focused on building a global distributed object-oriented service environment. In that effort, establishing standards and protocols is the most time-consuming step, and it is gradually being completed. As the platforms take shape and become practical, combining future applications with advanced network technologies becomes the critical point for integrating diverse technologies.
In developing distributed object-oriented services for the grid environment, optimizing the distributed components is an important topic. First, cooperation must be set up among different distributed components, such as the objects of Java Remote Method Invocation (Java RMI), .NET Remoting, Common Component Architecture (CCA) and Open Grid Services Architecture (OGSA), all of which support remote calls. If cooperation among components on different platforms is established, the reuse and application of the components will improve. Overcoming the barriers among the different languages is therefore an important topic. Replacing components dynamically according to the environment, to improve performance and meet security requirements, has also become a mainstream research direction, for example Component Management Services (CMS). In such a distributed component environment, traditional distributed load-balancing mechanisms cannot meet the requirements (i.e., security, performance, and so on), because different protocols and platforms are involved and the process becomes more difficult.
Currently, stateful invocations pose an unsolved issue in remote procedure call services. A stateful invocation has to save its state on the server for the next call (i.e., the next stateful invocation); that is, the next stateful invocation depends on it. A server is therefore required to keep the state of the stateful invocation in the distributed component environment. However, traditional scheduling methods either cannot deal with this issue or cannot optimize the scheduling result based on the characteristics of the components.
In contrast to the stateful invocations that have to be assigned to the servers in advance, the stateless invocations are not assigned to the servers in advance.
Many scheduling methods that use workflow information have been proposed for the distributed component environment, but they still cannot solve the aforesaid issue effectively. To be applied efficiently to programs written in object-oriented languages with remote procedure call support, such as .NET Remoting and Java RMI, a scheduling method needs a mechanism that supports stateful invocations and stateless invocations simultaneously.
The objective of the present invention is to provide a scheduling method and system for remote object procedure calls that use a two-phase scheduling mechanism and refer to the workflow of an application program, so as to handle stateful invocations and stateless invocations simultaneously and to optimize the scheduling of invocations.
In order to achieve the objective, the present invention discloses a scheduling method for remote object procedure call, which comprises the steps of prescheduling and dynamic scheduling. The step of prescheduling, the first phase, comprises: (A1) grouping plural stateful invocations in a workflow into plural groups of stateful tasks; (A2) assigning the groups of stateful tasks to the servers according to the loads of the servers; (A3) repeating Steps (A1) and (A2) until all the groups of stateful tasks are assigned; and (A4) determining the rank of each invocation (including stateful and stateless invocations), where the rank is an approximation of the length of the longest path from the invocation to the exit invocation in the workflow. The larger the rank is (i.e., the longer the path is), the earlier the associated invocation is handled.
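Steps (A1) to (A3) can be illustrated with a minimal sketch. It rests on assumptions that the text does not state: stateful invocations are grouped by a hypothetical `object_id` (the remote object whose state they share), a server's load is approximated by the summed computation cost of the groups already placed on it, and each group goes to the currently least-loaded server. The names `preschedule`, `object_id` and `load` are illustrative only.

```python
from collections import defaultdict

def preschedule(stateful_invocations, servers):
    """Group stateful invocations and place each group on the least-loaded server.

    stateful_invocations: iterable of (task_id, object_id, cost) tuples.
    servers: list of server names.
    Returns (group -> server mapping, server -> accumulated load mapping).
    """
    # (A1) Group invocations that share state (assumed key: object_id).
    groups = defaultdict(list)
    for task_id, object_id, cost in stateful_invocations:
        groups[object_id].append((task_id, cost))

    # (A2)/(A3) Assign groups one by one until none is left.
    load = {server: 0.0 for server in servers}
    assignment = {}
    for object_id, tasks in groups.items():
        target = min(servers, key=lambda server: load[server])
        assignment[object_id] = target
        # Assumed load metric: total computation cost placed on the server.
        load[target] += sum(cost for _, cost in tasks)
    return assignment, load


example = [("n2", "A", 4), ("n3", "A", 2), ("n5", "B", 3), ("n7", "C", 1)]
assignment, load = preschedule(example, ["s1", "s2"])
```

Because every invocation in a group lands on the same server, the state saved by one stateful invocation is available to the next one in that group.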
The step of dynamic scheduling, the second phase, is performed according to the rank of each invocation. If a stateful invocation was estimated to time out but has not actually timed out, the stateful invocation is scheduled on the server to which the associated group of stateful tasks has been assigned. Then, the step of prescheduling is executed for the remaining stateful invocations in the associated group of stateful tasks. If a stateful invocation has actually timed out, the stateful invocation is scheduled on a server according to the result of the prescheduling step. For each stateless invocation, an estimated finish time on each server is calculated, and the stateless invocation is assigned to the server with the minimum estimated finish time.
The present invention also discloses a scheduling system for remote object procedure call, comprising plural clients, plural servers, and a middle gateway. The servers respond to plural invocations from the clients. The middle gateway connects the servers and the clients to dispatch the invocations to the servers according to a load-balancing mechanism that comprises the two-phase scheduling mechanism.
The scheduling method of the present invention arranges a scheduling object (i.e., a middle gateway) between the servers and the clients. When a client requests a service from the servers (i.e., an invocation is sent from the client to the servers), the scheduling object dispatches the invocation to the proper server according to the two-phase scheduling mechanism. The stateful invocations and the execution status of the scheduling system both affect the performance of the scheduling system, and the two-phase scheduling mechanism is used to meet the load-balancing requirement. The scheduling method of the present invention is applicable to a distributed component environment.
The scheduling method and scheduling system present the following features. First, the scheduling of invocations can be performed by a third party using only the operating parameters provided by the clients and the servers. Thus, the middle gateway, which performs the scheduling method of the present invention, can be disposed in the servers or at other nodes. Second, the current load of each server is considered in order to optimize the scheduling of the invocations, and the measure of load depends on the properties of the invocations. For example, if a program needs more computation capability, the computation cost is considered; if the program needs more network communication, the bandwidth is considered. As a result, the performance benefit of the scheduling method of the present invention becomes more noticeable. Third, the scheduling method and the scheduling system of the present invention are applicable to any remote invocation method that uses a TCP (Transmission Control Protocol) channel, for example, .NET Remoting or Java RMI.
To make the scheduling method for remote object procedure call of the present invention easier to understand, the scheduling system for remote object procedure call of the present invention is described first.
Some terminology is defined below to help explain the content of the present invention.
TCT: stands for the TCP (Transmission Control Protocol) connection table, which is maintained to track existing TCP connections. Each row in the TCT contains four columns: the source IP, the source port, the destination IP and the destination port of one TCP connection. When packets pass through the middle gateway 11 from the servers 13 via the channel 15, the TCT is used to record the content of a session table and to pick up the header information of the packets as indices for successive connections.
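A TCT of this kind can be sketched as a small lookup structure. The four columns come from the definition above; everything else is an assumption made for illustration: the class names `Connection` and `ConnectionTable`, and the idea of storing the chosen server next to each row so that later packets of the same connection can be routed consistently.

```python
from typing import Dict, NamedTuple, Optional

class Connection(NamedTuple):
    """One TCT row: the four header fields that identify a TCP connection."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

class ConnectionTable:
    """Hypothetical TCT keyed by the four-column row."""

    def __init__(self) -> None:
        self._rows: Dict[Connection, str] = {}

    def record(self, conn: Connection, server: str) -> None:
        # Remember which server this connection was dispatched to (assumed use).
        self._rows[conn] = server

    def lookup(self, conn: Connection) -> Optional[str]:
        # Returns None for a connection that has not been seen yet.
        return self._rows.get(conn)


tct = ConnectionTable()
conn = Connection("10.0.0.5", 40312, "10.0.0.1", 8080)
tct.record(conn, "s2")
```

Using the header fields as the key means a packet can be matched to its connection without inspecting its payload.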
Rank of a task: stands for the priority of a task in the scheduling order. The rank of a task is determined from the workflow of an application using the concept of critical paths. The task that most seriously affects the execution of the application is regarded as the highest-rank task, which is executed first to avoid delaying other tasks that depend on it (i.e., the dependency issue). All the tasks in the workflow are divided by level. When the tasks in one level have been executed, the tasks in the next lower level are ready to execute, which avoids the dependency issue.
EFT: stands for Estimated Finish Time, which is a formula to estimate the finish time of a task utilizing the remaining resources of the servers. To use this formula, the time that each task enters each server is required; the application needs profiling to pick up the estimated computation cost, communication cost and the dependencies between tasks, and the clock rate and the bandwidth of each server also need to be considered.
where si is the i-th server, Rk is the remaining computation time of task nk, gj is the j-th group of stateful tasks, and gt is the new group of stateful tasks.
After Step S22, the rank of each invocation is determined (Step S23). The rank rank(ni) of a task ni in the workflow is an approximation of the length of the longest path from the task ni to the exit task (i.e., the task n11 in
where wi is the computation cost of the task ni, succ (ni) is the set of the immediate successors of the task ni, and cij is the communication cost associated with the tasks ni and nj. That is, the rank of each invocation can be determined by the computation cost of each invocation and the communication cost associated with one of the immediate successors of each invocation.
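The rank formula itself is not reproduced in this text. The sketch below assumes the common upward-rank recurrence used in list scheduling, which is consistent with the symbols defined above: rank(ni) = wi + max over nj in succ(ni) of (cij + rank(nj)), with the rank of the exit task equal to its own computation cost. The function name `compute_ranks` and the example workflow are hypothetical.

```python
from functools import lru_cache

def compute_ranks(cost, succ, comm):
    """Return {task: rank} under the assumed upward-rank recurrence.

    cost: {task: computation cost w_i}
    succ: {task: list of immediate successors}
    comm: {(task, successor): communication cost c_ij}
    """
    @lru_cache(maxsize=None)
    def rank(task):
        # Longest remaining path through any immediate successor.
        tails = [comm[(task, nxt)] + rank(nxt) for nxt in succ.get(task, ())]
        return cost[task] + (max(tails) if tails else 0.0)

    return {task: rank(task) for task in cost}


cost = {"n1": 2, "n2": 3, "n3": 1, "n4": 2}
succ = {"n1": ["n2", "n3"], "n2": ["n4"], "n3": ["n4"]}
comm = {("n1", "n2"): 1, ("n1", "n3"): 4, ("n2", "n4"): 2, ("n3", "n4"): 1}
ranks = compute_ranks(cost, succ, comm)
# Invocations are then handled in order of decreasing rank.
order = sorted(ranks, key=ranks.get, reverse=True)
```

In this small workflow the entry task n1 gets the largest rank and the exit task n4 the smallest, so a task is never handled before a task it depends on.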
According to the result of Step S301, if the invocation is stateless, an EFT (Estimated Finish Time) of the stateless invocation on each server is calculated (Step S309). The EFT is determined by the estimated computation cost and the communication cost of the stateless invocation, its dependency on the previous invocation, and the associated task assignment table. The EFT(ni, sj) of the i-th stateless invocation ni on the j-th server sj is defined by formula (4) below.
where pred(ni) is the set of immediate predecessor tasks of the task ni, and avail[sj] is the earliest time at which the server sj is ready for task execution. AFT(nm) is the actual finish time of the task nm and cm,i is the communication cost between the tasks nm and ni. Exec (wi, avail[sj], k) is the execution cost of the task ni with computation cost wi executed from time avail[sj] on the server sj, which can execute at most k tasks in parallel. After Step S309, a second target server, the one with the minimum EFT among all the servers, is determined (Step S310) and the stateless invocation is assigned to the second target server. Then, the task assignment table in the middle gateway that connects to the second target server is updated (Step S311). Next, the destination IP of the stateless invocation is updated (Step S312). The step of dispatching the tasks (Step S40) follows Steps S305, S308 or S312. At Step S40, each assigned invocation is dispatched to the corresponding server by changing the destination IP of the invocation to that of the assigned server, which completes the scheduling.
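The stateless branch (Steps S309 to S311) can be sketched as follows. Formula (4) is not reproduced in this text, so the sketch assumes a common list-scheduling form: a task starts once the server is available and all predecessor results have arrived, and finishes after its execution cost. It further assumes each server runs one task at a time (k = 1) and models Exec with a hypothetical `exec_cost` callable; the function names are illustrative.

```python
def estimated_finish_time(task, server, cost, pred, comm, aft, avail, exec_cost):
    """Assumed EFT: earliest start on the server plus execution cost."""
    # Time at which the results of all immediate predecessors have arrived.
    data_ready = max(
        (aft[m] + comm[(m, task)] for m in pred.get(task, ())),
        default=0.0,
    )
    start = max(avail[server], data_ready)
    return start + exec_cost(cost[task], server)

def assign_stateless(task, servers, cost, pred, comm, aft, avail, exec_cost):
    """Pick the minimum-EFT server and update its availability."""
    def eft(server):
        return estimated_finish_time(
            task, server, cost, pred, comm, aft, avail, exec_cost
        )

    target = min(servers, key=eft)
    finish = eft(target)
    # Stand-in for updating the task assignment table (assumes k = 1).
    avail[target] = finish
    return target, finish


speed = {"s1": 2.0, "s2": 1.0}  # hypothetical relative clock rates
avail = {"s1": 5.0, "s2": 2.0}
target, finish = assign_stateless(
    "n3", ["s1", "s2"],
    cost={"n3": 4.0}, pred={"n3": ["n1"]}, comm={("n1", "n3"): 1.0},
    aft={"n1": 3.0}, avail=avail,
    exec_cost=lambda w, server: w / speed[server],
)
```

In this example the server that frees up later is still chosen, because its faster execution outweighs the wait; comparing finish times rather than current load is what allows that outcome.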
Between Step S23 of
One embodiment of the scheduling method is given below. Referring to FIGS. 3(a) and 3(b), we take the workflow of
The above parameters are explained as follows. Number of tasks (v) is the number of nodes (i.e., tasks) in the workflow. Shape of graph (s) is used to control the shape of graphs. The levels of generated graphs form a normal distribution with the mean value equal to √v/s. The numbers of the nodes of each level also form a normal distribution with the mean value equal to √v×s. Out degree (o) means out edges of each node, which is used to control the dependence degrees between two nodes. CCR is the ratio of the communication cost to the computation cost. We can generate computation-intensive application graphs by assigning low values to CCR. Stateful task ratio means the ratio of the number of stateful tasks to the number of all the tasks in the workflow. Under the parameter settings in Table 1, the performance results of two different stateful task ratios, 25% and 50%, are shown in
The above-described embodiments of the present invention are intended to be illustrative only. Numerous alternative embodiments may be devised by persons skilled in the art without departing from the scope of the following claims.
Number | Date | Country | Kind |
---|---|---|---|
094144000 | Dec 2005 | TW | national |