The present disclosure relates to systems, apparatuses and methods for data processing systems to collaborate and accomplish dynamic workflows in a distributed environment.
More particularly, the present disclosure relates to techniques for managing dynamic production workflows through persistence-based, message-driven asynchronous communication between applications in a distributed environment. In addition, the workflows may be orchestrated in such a manner that the processing applications accomplish the tasks in a timely manner through efficient utilization of resources.
In general, production workflows in computer-based applications such as data processing, supply chain management, data publishing systems, etc. comprise a set of jobs to be executed among computational nodes or to deliver information on multiple client systems. Each job may in turn require one or more tasks to be executed on the computational nodes. The workflow typically starts with the receipt of a task or a job from a sender application to a receiver application. The receiver application acknowledges the receipt of the task and after completion of the job communicates the exit status to the sender application. If the exit status indicates a success, the sending application schedules one of the subtasks to another receiver application running on a different computational node. The final deliverables are generated once all the tasks in the workflow are completed as per the desired order. In case the exit status indicates an error, an alarm is raised, and another task is taken up for processing. In a typical production scenario a predetermined number of requests in the pipeline need to be completed within a stipulated timeline. In the above scenarios, a workflow manager application manages the tasks by selecting appropriate processing application based on the parameters in the user request.
A workflow manager implemented through a client-server architecture often possesses limitations, such as tight coupling among software components. In addition, such a configuration may lead to inefficient utilization of resources, as client applications need to wait for the server process to provide the data.
The implementation of product generation workflows using asynchronous communication with non-persistent messaging would pose serious problems, because a receiver application, running on a node connected to the sender application through the network, may go on or off in random order. This in turn would affect the delivery of the messages and may lead to failures. If an exit status is not available, the workflow cannot proceed further, leading to non-fulfillment of the user request. Also, the computational resources in the distributed environment may not be fully exploited merely by employing message-based asynchronous methods of communication between the workflow manager and the processing application. If a large number of products are in the pipeline, this would result in an exponential increase in the number of workflows pending completion. Further, this would lead to unpredictable product delivery timelines if appropriate steps were not taken in managing the workflows. Moreover, this may lead to suboptimal utilization of resources, as some of the products may never get a chance to execute, leading to unacceptably long delays in providing deliverables to users.
In accordance with certain embodiments disclosed herein, methods and systems are disclosed for optimizing processing and management of dynamic production workflows utilizing asynchronous persistent message driven communication between the processing applications and the workflow manager.
To further optimize the workflows, certain embodiments incorporate methods that would ensure quality of service (QoS) from the processing systems in terms of improved turnaround time (TAT) and optimized throughput from the systems. In other embodiments, techniques are disclosed for managing and monitoring the dynamic production workflows.
In certain exemplary embodiments, techniques are disclosed for managing dynamic production workflows in distributed scheduling and transaction processing in a computer-based system. Distributed computational node processing and routing of the tasks by the workflow manager may be integrated using a persistent message queuing system to provide asynchronous communication between the applications.
In product generation workflows, a first application may send a communication to a second application for processing the requests pertaining to the users. The second application inserts the request into a database, leading to a tuple-level change that triggers a stored procedure to generate a message. The message may be appended to the in-queue of the message queue (MQ) pertaining to a third application. The third application acknowledges the receipt of the messages and prepares the workflows for each of these products. If an acknowledgment is not received from the receiving application, the message is retried for a specific number of attempts. Based on the tasks in the workflow, the third application consults the local resource manager and generates a message that is appended to the MQ of a fourth application. The fourth application, which may reside on a node, sends an acknowledgment of the message and schedules a list of subtasks to be performed on the node. The workflow preferably comes to a halt only when the exit status of any of the applications is false, or when all the tasks are completed without an exit status being false. The product in the pipeline is considered successfully completed if all the tasks in the workflow are completed and it is ready to be delivered to the user.
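By way of illustration only, the persistence and retry behaviour described above may be sketched as follows; the class, table layout, and retry limit are hypothetical assumptions, not part of the disclosure:

```python
import sqlite3

# Illustrative sketch: a persistent message queue backed by SQLite.
# Messages are stored before delivery, so they survive restarts; a
# message is retried until acknowledged or a retry limit is reached.

MAX_RETRIES = 3  # assumed value; the disclosure leaves this configurable

class PersistentQueue:
    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS mq ("
            " id INTEGER PRIMARY KEY, payload TEXT,"
            " status TEXT DEFAULT 'undelivered', retries INTEGER DEFAULT 0)")

    def enqueue(self, payload):
        # Persist the message before any delivery attempt is made.
        cur = self.db.execute("INSERT INTO mq (payload) VALUES (?)", (payload,))
        self.db.commit()
        return cur.lastrowid

    def deliver(self, msg_id, agent):
        """Attempt delivery; retry until acked or moved to exception state."""
        payload, retries = self.db.execute(
            "SELECT payload, retries FROM mq WHERE id=?", (msg_id,)).fetchone()
        while retries <= MAX_RETRIES:
            if agent(payload):  # the agent returns True to acknowledge receipt
                self.db.execute("UPDATE mq SET status='delivered' WHERE id=?",
                                (msg_id,))
                self.db.commit()
                return "delivered"
            retries += 1
        # Retry limit exhausted: park the message for manual resolution.
        self.db.execute("UPDATE mq SET status='exception' WHERE id=?", (msg_id,))
        self.db.commit()
        return "exception"
```

A message whose agent never acknowledges is parked in the exception state after the retry limit, mirroring the exception-queue behaviour of the disclosure.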
In addition, message queues may be managed such that the priority is periodically updated automatically by an auto prioritize application so that all the workflows receive the required computational resources and are completed as per specified timelines.
On availability of one or more computational nodes, a load balancer application may automatically scale the performance of the workflow system by optimizing the distribution of load among the nodes based on weights obtained from parameters such as the resources on the node, the resource requirement of the tasks, and the type of processing required for generation of the product.
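A minimal sketch of weight-based load distribution follows; the field names and the headroom-based scoring formula are illustrative assumptions, not taken from the disclosure:

```python
# Hypothetical sketch: each node is scored by how well its free resources
# match the task's requirement, and the task goes to the highest-weight node.

def node_weight(node, task):
    """Higher weight = better fit; nodes lacking capacity score 0."""
    if node["free_cpu"] < task["cpu"] or node["free_mem"] < task["mem"]:
        return 0.0
    # Favour nodes with the most headroom remaining after placing the task.
    return (node["free_cpu"] - task["cpu"]) + (node["free_mem"] - task["mem"])

def pick_node(nodes, task):
    best = max(nodes, key=lambda n: node_weight(n, task))
    return best if node_weight(best, task) > 0 else None

nodes = [
    {"name": "n1", "free_cpu": 2, "free_mem": 4},
    {"name": "n2", "free_cpu": 8, "free_mem": 16},
]
task = {"cpu": 4, "mem": 8}
```

Here `pick_node(nodes, task)` selects node n2, the only node with sufficient capacity; if no node can host the task, the job would wait in the queue.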
A dispatch engine may receive a message from an application after it completes the required processing on a computation node. On receipt of the message, the dispatch engine consults a knowledge base for generating a message to the next application based on the rules set for the job.
A reporting engine, issue tracker and an analytical engine may complement the workflow by providing means for monitoring, tracking and assessing the production environment.
An auto prioritize engine may build a model from the past data on the production environment to prioritize the requests currently pending in the workflow. The engine may first identify products waiting for allocation of resources, and subsequently build a model based on the parameters such as time spent in the workflow, probable time of completion etc., to prioritize the queues so that the delivery timelines meet the user requirement.
The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:
The following discussion is aimed at disclosing architectural elements and providing a concise general description of the computing infrastructure in which the various embodiments may be implemented.
Real-world problems are generally solved by divide-and-conquer strategies; i.e., each problem can be divided independently into sub-problems and subsequently into tasks that can be executed on any computing infrastructure. Those experienced and skilled in the present art will appreciate that the embodiments disclosed herein can be practiced not only on networked personal computers but also on multiprocessor/multi-core machines, mainframe computers, handheld devices and the like. One may even practice the invention in a distributed processing environment wherein the real processing is done by applications running on a system connected through a network. The data and the programs required for processing may be located on the local computer or on a remote system. In a data-centric approach, the processing applications may access the data from a centralized storage infrastructure such as a storage area network and utilize the remote computing infrastructure to accomplish a task.
With reference to
Users can access the system through input devices such as a keyboard (18) and mouse (19). In general, these input devices are connected to the processing unit through a serial port interface (38) via the system bus, but they may also be connected through a universal serial bus (USB) (21) or optical interfaces (22). An external hard disk (37) may be connected through an interface to the system bus. Output devices such as video monitors (23) may be connected through video adapters (35) via the system bus. In addition, a multimedia kit such as a speaker (25) and microphone (26) is connected to the system through an adapter (36) to the processing unit via the system bus. A printer (18) may be configured through a parallel port interface (24) for taking hard-copy outputs from the system.
The system may interact with other remote computers over a network environment through a network switch (29) via a network interface adapter (28) for connecting to the systems on the network. The communication between the processing nodes (30) may be implemented through network protocols. Applications residing on the processing nodes may in turn utilize a group of systems (31) for executing the tasks. It should be appreciated that the system shown in the
In one exemplary embodiment, a workflow management system is disclosed in a network environment comprising message-driven communication through a queuing mechanism for receiving messages from, and transmitting messages to, different applications. Messages may be generated by sensing a tuple-level change in the database and transmitting the required information to the applications. A message may contain information specific to the application and is preferably added to a preconfigured message queue. Each message payload may contain data in the form of an object (business object), or it may include only control information pointing to the data stored in the centralized repository. A typical application may comprise a software agent for sending and receiving messages and an interface module to invoke the processing modules required to accomplish the tasks by accessing data from centralized storage. The messages are made persistent by storing them in a database or in a file until a confirmation is received from the respective applications.
Archiving the messages in persistent storage before transmission in asynchronous mode ensures the delivery of the message payload even if the application is not in service at a certain point in time. The sending and receiving applications may be on the same machine or on different machines connected by a network. Although a point-to-point communication is shown, those skilled in the art would appreciate that messages published by the workflow manager can be sent to all those applications that have subscribed to certain specific messages. Also, those skilled in the art should appreciate that messages can be delivered through a secured channel over a network. Further, one can extend the present embodiment to distribute the jobs to a remote workflow manager by routing the messages through a server. The remote workflow manager may in turn schedule jobs to applications on a different network of computer systems. The rerouting of jobs may be accomplished by incorporating appropriate processing rules to harness the distributed computational resources.
Each message preferably contains an identification number, time, status, priority (38) and/or a payload (39). An instance of the business object may be appended to the message by the workflow manager for delivery to the applications. In addition, one can even append an extensible markup language (XML) file as the message payload. The message is received by a software agent (65), which in turn invokes the processing modules of the application. The software agent is implemented as a daemon process. As soon as the message is en-queued, the agent listening to the queue receives the message if the application (45) is configured in point-to-point mode. If the agent is not available at the time the message arrives, the status is retained as undelivered. When the agent comes online, it checks the availability of the messages through a queue look-up service (64). The agent acknowledges (47) receipt of the messages, and the status in the middleware is updated as received. If an acknowledgment is received from the agent for the message, the status is updated as delivered; on the contrary, if an acknowledgment is not received from the agent, the same message is sent again (retransmitted) after a certain time gap. If the number of retries exceeds a predetermined value, the messages are assigned to an exception queue (65). The messages in the exception queue are automatically shown on an issue tracker (114) user interface. Messages are recovered from the exception queue to the main queue once the error is resolved and updated using the issue tracker (114) interface. Under another embodiment, only the location of the data is sent to the applications (45) along with the message, whereupon on its receipt the application may initiate processing of jobs utilizing a group of compute nodes (31) by accessing the data from a centralized storage (16). Some of the applications (44) may even store the message payload in a local database for subsequent processing or onward transmission.
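The recovery path from the exception queue back to the main queue may be sketched as follows; the list-based queues and field names are assumptions made for illustration:

```python
# Hypothetical sketch of exception-queue recovery: once the operator
# resolves the error through the issue tracker interface, the message is
# moved back to the main queue with its retry counter reset.

def recover(exception_queue, main_queue, msg_id):
    """Move a resolved message from the exception queue to the main queue."""
    for i, msg in enumerate(exception_queue):
        if msg["id"] == msg_id:
            msg = exception_queue.pop(i)
            msg["status"] = "undelivered"  # redelivery will be attempted
            msg["retries"] = 0             # reset the retry counter
            main_queue.append(msg)
            return True
    return False                           # unknown message id
```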
One can even deliver the same message to multiple recipient applications (44) in a subscription mode under one embodiment. Also, the messages can be delivered in secured mode of transmission by incorporating required agents using services such as SSL and HTTPS for communication between the applications (46).
In case a database table is accessed by the processing application, the end application acknowledges the receipt of the message by updating the status of the tuple in the table. The processing applications, after completing the job, would insert a message into the queue through an agent or update the status in the database.
The dispatcher engine of the workflow manager, on receipt of the messages, applies the business rules to route the request to other applications. User requests may be routed to the applications until all the required processing is completed.
We now focus on
The throughputs of different applications are measured and the timelines of delivery of products are updated in the KB. The products which require attention are monitored and resolved through an issue tracker (117). The updated timelines (118) are propagated back to the user to keep him abreast of the current situation.
Turning now to
For the kth job, denoted by (Jk), in the workflow waiting for assignment to a processing application, a method is provided to check whether the job is running as per schedule. If a deviation is found, a preventive measure is to prioritize the job. Let Tglobal represent the total time spent by Jk in the workflow, Ti be the time taken by the ith application to complete the sub-task of the job, and Tn be the waiting time of Jk at the nth processing application. We compute (603) the total time spent by Jk as

Tglobal(Jk)=T1+T2+ . . . +Tn−1+Tn. (1)
In Step 604, a method for computing the nominal timelines of generation pertaining to jobs already processed in the workflow is presented. Let Tglobal′ represent the nominal timeline, h the total number of instances of a similar job order in the history, n the total number of processing applications required for the kth job Jk, and Tpq the time taken by the pth instance of a similar job order at the qth application; the nominal timeline is computed as an average of the sum of the time taken by similar job orders at different applications in the previous time steps. The Tglobal′ for the kth job Jk is computed as

Tglobal′(Jk)=(1/h)Σp=1hΣq=1nTpq. (2)
A simple comparison in Step 605 of Tglobal and Tglobal′ leads to Step 606. Let ΔTglobal denote the difference in timelines between the present job and the nominal time taken for delivery of a similar job. One can compute ΔTglobal as
ΔTglobal(Jk)=Tglobal(Jk)−Tglobal′(Jk). (3)
The quantity ΔTglobal>0 indicates that the user request is being delayed and a preventive action needs to be initiated. Accordingly, in an aspect of the current invention, the new priority of the job order Jk is recomputed in Step 606 as
Pglobal(Jk)=P(Jk)+LPCF(P(Jk),ΔTglobal(Jk)) (4)
where Pglobal(Jk) and P are the updated global priority and initial priority of the job order, respectively. The LPCF in Equation 4 represents a linear piecewise polynomial function. Those skilled in the art would appreciate that other forms of curve-fitting methods, such as spline, rational polynomial functions, etc., may be adopted to fine-tune the relationship between P and ΔT.
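The priority update of Equations 3 and 4 may be sketched as follows; the breakpoints and slopes of the correction function are assumed values, since the disclosure only specifies that LPCF is piecewise linear:

```python
# Sketch of the Equation 4 priority update with an assumed piecewise
# linear correction function (LPCF).

def lpcf(priority, delta_t):
    """Piecewise linear boost: the longer the delay, the larger the bump."""
    if delta_t <= 0:
        return 0.0                        # on schedule: no correction
    if delta_t <= 10:
        return 0.5 * delta_t              # mild delay: gentle slope
    return 5.0 + 2.0 * (delta_t - 10)     # severe delay: steeper slope

def updated_priority(p_initial, t_global, t_nominal):
    delta = t_global - t_nominal          # Equation 3
    return p_initial + lpcf(p_initial, delta)   # Equation 4
```

The local priority update of Equation 8 is analogous, with ΔTlocal(Ar,Jk) in place of ΔTglobal(Jk).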
In
Tlocal(Ar,Jk)=Tcur(Ar,Jk)−Tin(Ar,Jk). (5)
In Step 706, the nominal time of generation Tlocal′ for a similar type of job order (Jk) in the application queue of Ar is computed from the workflow history as the average time taken for similar jobs Jk by the processing application Ar:

Tlocal′(Ar,Jk)=(1/h)Σi=1hTi(Ar,Jk), (6)

where h is the total number of instances of similar job orders processed earlier by the application Ar and Ti(Ar,Jk) is the time taken by the ith instance of a similar job order Jk by the processing application Ar.
A comparison of Tlocal(Ar,Jk) and Tlocal′(Ar,Jk) is shown in Step 707. The difference between Tlocal(Ar,Jk) and Tlocal′(Ar,Jk), represented as ΔTlocal, is a measure of local variations in completing a job of type Jk by the application Ar, computed in Step 708 as
ΔTlocal(Ar,Jk)=Tlocal(Ar,Jk)−Tlocal′(Ar,Jk). (7)
Based on ΔTlocal(Ar,Jk), one can prioritize the user request in Step 709 as
Plocal(Ar,Jk)=P(Jk)+LPCF(P(Jk),ΔTlocal(Ar,Jk)), (8)
where Plocal and P are the updated local priority and initial priority of the job order respectively. The function LPCF represents a linear piecewise model.
Turning to
A transaction in a database (102) may act as a trigger for invocation of the load balancer. A trigger initiates a message as soon as the transaction database is updated, and the stored procedure adds the messages to the message queue of the load balancer application. On completion of the job, the application updates the status (success/failure) in the database, leading to a message generation for the Job Dispatcher (111). The dispatcher consults the KB for routing the job to the next application. If an incoming job is of higher priority, a need may arise for the load balancer to preempt some of the existing jobs (which are not under process) if the queue is already full. In case of node failure, the automatic node monitoring software generates a message to update the status of the node in the KB. On update of the tuple in the KB, a message is generated for the load balancer. On receipt of the message, the load balancer fetches back all the jobs pending at that processing node and redistributes them among other available compute nodes. If the node again becomes available, it redistributes the work orders to attain equilibrium of load.
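The failover path may be sketched as follows, assuming a round-robin reassignment policy; the disclosure does not fix the redistribution rule, so this policy is an illustrative choice:

```python
# Hypothetical sketch: when the knowledge base marks a node as down, the
# load balancer pulls back that node's pending jobs and redistributes
# them round-robin among the remaining live nodes.

def redistribute_on_failure(queues, failed):
    """queues: {node: [jobs]}. Returns updated queues without the failed node."""
    orphans = queues.pop(failed, [])
    live = sorted(queues)                 # deterministic order for the sketch
    if not live:
        raise RuntimeError("no compute nodes available")
    for i, job in enumerate(orphans):
        queues[live[i % len(live)]].append(job)  # round-robin reassignment
    return queues
```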
Jobs, in general, comprise both normal and emergency types. Referring to
Turning to
If the source application rejects the request with a specific reason, the dispatcher routes the request to the appropriate destination application.
The dispatcher may then check whether a counter for the next processing center exceeds a predefined limit (511). If so, the job has exceeded its limit for that processing center and is thus a problematic case; to avoid infinite looping, it is to be sent to an issue tracker for manual analysis. Therefore, a message is generated for resolving the issue in processing the job at the issue tracker application (512). The dispatcher accordingly updates the metadata for the job to indicate the updated processing center (513). The job is then removed from the compute node out-queue (514). It may also check whether all jobs in a queue are finished (515). In case job(s) are pending for dispatch, the loop continues until all the jobs in the group are dispatched as a single unit.
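The hop-limit check of Steps 511-514 may be sketched as follows; the limit of three hops and the data-structure names are assumptions for illustration:

```python
# Illustrative sketch of the dispatch check: if a job has bounced to the
# same processing center more than a fixed number of times, it is
# diverted to the issue tracker instead of looping forever.

HOP_LIMIT = 3  # assumed limit; the disclosure leaves this configurable

def dispatch(job, center, issue_tracker):
    """Route job to center, or to the issue tracker if the hop limit is hit."""
    hops = job.setdefault("hops", {})
    hops[center] = hops.get(center, 0) + 1
    if hops[center] > HOP_LIMIT:
        issue_tracker.append(job)        # manual analysis required (Step 512)
        return "issue_tracker"
    job["processing_center"] = center    # update job metadata (Step 513)
    return "dispatched"
```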
The estimated time (115) is computed based on the historical information on the timelines taken by the processing application to complete a similar type of Job. The database table also contains the standard deviations along with the average time taken for Job completion. When the ingest engine (101) makes an entry of the request into the database the estimated timelines are computed as
and then transmitted back to the user. The variable T(P) represents the time taken for the product P at the workcenter i denoted by wi.
As per the preferred embodiment, the delivery timeline (117) of the product will be maintained in the transaction database (102) corresponding to the user request. The delivery timeline (117) is recomputed whenever a product takes a hop from one processing application (44) to another, depending upon the actual time taken by the application to generate the product. Let TO denote the outgoing time of the product and TI the time at which the product is assigned for processing. For each product p the delivery time may be computed as
where ai represents the ith application involved in the workflow, n denotes the total number of processing applications required to be invoked for completing the workflow, and k≦n denotes the number of applications that have completed the process.
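The recomputation of the delivery timeline may be sketched as follows, under the assumption that actual elapsed times (TO−TI) are used for the k completed hops and historical averages fill in the remaining n−k hops; the combination rule is an assumption consistent with the description above:

```python
# Hypothetical sketch of delivery-time recomputation for a product that
# has completed k of its n processing hops.

def delivery_time(actual, averages):
    """actual: list of (t_in, t_out) pairs for completed hops;
    averages: historical mean times for the hops still pending."""
    elapsed = sum(t_out - t_in for t_in, t_out in actual)  # completed hops
    remaining = sum(averages)                              # pending hops
    return elapsed + remaining
```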
In view of the above detailed description, it can be appreciated that the invention provides a method and system for driving a workflow through message-driven communication with persistence in a dynamic production environment. The operations involved in the workflow are coordinated by sending and receiving acknowledgments from the processing applications. The orchestration of workflows keeping in view the performance of different components is disclosed. Reliable distribution of messages and workload optimization leads to effective utilization of resources. The disclosed methods would help the business to obtain customer satisfaction by paving the way for dynamic customer relationship management.
The Abstract of the Disclosure is provided to comply with 37 C.F.R. §1.72(b), requiring an abstract that will allow the reader to quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing Detailed Description, it can be seen that various features are grouped together in a single embodiment for streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus, the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment.