This disclosure relates generally to a method and system for timeline-based planning and scheduling. More specifically, the disclosure relates to the on-line state-space planning of operations and actions to achieve pre-defined goals.
Planning is directed to the problem of finding a sequential or parallel sequence of actions that, when executed from a known initial state, achieves all pre-defined goals. There are many different methods of planning used in various applications, e.g., academic planners that are normally offline and deterministic planners where relevant planning data is known. The input to a deterministic planning system consists of a set of variables, a set of actions, the initial state, and the desired goal condition. Each action is represented by its lists of conditions and effects. Conditions are constraints on variables that need to be satisfied for the action to be executed. The planner finds a logically consistent sequence of actions (a plan) that connects the initial state to the goal state. The planner does not account for issues such as: (1) whether variables are affected by actions outside of the planner's control (e.g., by actions from another plan being executed); (2) how variables may change values during the planning time needed to find the plan; and (3) new goals arriving in real time. These issues are associated with online planning, where the planner must account for the passing of time.
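By way of non-limiting illustration, the inputs of a deterministic planning system described above (variables, actions with conditions and effects, an initial state, and a goal condition) may be sketched as follows. The names `Action`, `applicable`, `apply_action`, and `find_plan`, the variable names, and the use of breadth-first search are assumptions made for this sketch only and are not taken from the disclosure.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical action: conditions and effects are (variable, value) pairs."""
    name: str
    conditions: tuple
    effects: tuple


def applicable(state, action):
    # An action can execute only if every condition holds in the state.
    return all(state.get(var) == val for var, val in action.conditions)


def apply_action(state, action):
    # Effects overwrite the values of the affected variables.
    new_state = dict(state)
    new_state.update(action.effects)
    return new_state


def find_plan(initial_state, actions, goal):
    """Breadth-first search for a sequence of actions reaching the goal."""
    queue = deque([(initial_state, [])])
    seen = {frozenset(initial_state.items())}
    while queue:
        state, plan = queue.popleft()
        if all(state.get(var) == val for var, val in goal.items()):
            return plan
        for action in actions:
            if applicable(state, action):
                nxt = apply_action(state, action)
                key = frozenset(nxt.items())
                if key not in seen:
                    seen.add(key)
                    queue.append((nxt, plan + [action.name]))
    return None  # no plan connects the initial state to the goal


# Illustrative domain (assumed): a crane carries a package from L1 to L2.
actions = [
    Action("load", (("package_at", "L1"), ("crane_at", "L1"), ("holding", False)),
           (("holding", True),)),
    Action("move", (("crane_at", "L1"), ("holding", True)),
           (("crane_at", "L2"), ("package_at", "L2"))),
    Action("unload", (("crane_at", "L2"), ("holding", True)),
           (("holding", False),)),
]
initial = {"package_at": "L1", "crane_at": "L1", "holding": False}
goal = {"package_at": "L2", "holding": False}
plan = find_plan(initial, actions, goal)
```

Such a sketch is offline and deterministic: it ignores the passing of time, externally caused changes, and newly arriving goals, which are the issues associated with online planning.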
In continual on-line planning, goals and system (or world) state are continually updated over time to account for plan executions of previously planned for goals that overlap with the planning process. Current online planners known in the art use domain-specific guidance techniques to guide the planner, making it time-consuming to adapt to new applications. These limitations make it difficult to develop a traditional action-based general-purpose planning heuristic to guide the search for a plan.
An on-line forward state-space planning system and method adds actions in the form of tokens, at fixed wall clock times, to partial plans representing a potential final plan. The adding of the actions is repeated until a final sequence of actions satisfies a defined goal, wherein during the planning process all actions in the partial plans and the tokens introduced by the actions are constrained to happen at the fixed wall-clock times.
The planner 22 is suitably embodied as operating within a digital processing device or system, such as an illustrative computer 70, executing suitable software, firmware, or other instructions. The digital processing device includes suitable user interfacing components such as to receive the on-line messages 18 and to output instructions/data to controller 60, visualizer 64—in the case of the illustrative computer 70, these include an illustrative display device 74 providing visual output (and, if embodied as a “touch screen”, optionally also providing for user input), an illustrative keyboard 74, and an illustrative mouse 76 (or a trackball, track pad, or other pointing device). Instead of the illustrative computer 70, the planner system can be embodied by another digital processing device such as a network server, a personal data assistant (PDA), a laptop or notebook computer, a tablet device such as an iPad (available from Apple Corporation, Cupertino, Calif., USA), or so forth. In a portable system, wireless connectivity is suitably used.
It is also to be appreciated that the planner 22 may be embodied by a storage medium storing instructions executable by a digital processing device to implement the planner. By way of illustrative example, the storage medium may be a hard disk (not shown) of the computer 70 or some other magnetic storage medium, or an optical disk or other optical storage medium, or random access memory (RAM, not illustrated) of the computer 70 or FLASH memory or some other electronic storage medium, or so forth.
Timelines for several variables modified in the example depicted in
Each timeline for a variable ν consists of a value cν ∈ D(ν), with D(ν) being the value domain of ν, which contains all possible values of ν. The timeline for ν consists of the current value of ν at the current wall-clock time tc and a set of tokens representing future events affecting the value of ν. The tokens are added due to actions in the plans found for previous goals. The three tokens 212, 214, 216 depicted in
Each token tk is represented by:
Given that tokens represent conditions and changes caused by actions, there can be temporal relations between tokens that represent either: (1) an execution condition or effect of the same action a; or (2) a condition or effect of actions that are related to one another. For example, before moving the package from L1 to L2 using Crane1, it first needs to be loaded. Thus, the tokens caused by the load action need to finish before the tokens added by the move action. Therefore, there are temporal ordering relations between the tokens. In a valid plan, temporal relations between all tokens within a timeline and between timelines for all variables are consistent.
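By way of non-limiting illustration, a timeline holding a current value and a set of tokens, together with a temporal ordering check between tokens, may be sketched as follows. The field names (`start`, `end`, `start_value`, `end_value`), the use of fixed numeric times, and the helper names `before` and `values_chain` are assumptions made for this sketch; they are not the token representation of the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """Hypothetical token: one condition or change on a variable."""
    action: str
    start: float
    end: float
    start_value: object
    end_value: object


@dataclass
class Timeline:
    """Current value of a variable plus tokens for future events."""
    variable: str
    current_value: object
    tokens: list = field(default_factory=list)

    def add(self, token):
        # Keep tokens ordered by start time.
        self.tokens.append(token)
        self.tokens.sort(key=lambda t: t.start)


def before(first, second):
    # Temporal ordering relation: `first` finishes no later than `second` starts.
    return first.end <= second.start


def values_chain(timeline):
    # Each token must start from the value left by its predecessor.
    value = timeline.current_value
    for tok in timeline.tokens:
        if tok.start_value != value:
            return False
        value = tok.end_value
    return True


# The package is loaded onto Crane1 and then moved from L1 to L2.
package = Timeline("package_location", "L1")
load = Token("load", 0.0, 2.0, "L1", "Crane1")
move = Token("move", 2.0, 5.0, "Crane1", "L2")
package.add(move)
package.add(load)
```

In this sketch, `before(load, move)` expresses the requirement that the load tokens finish before the move tokens, and `values_chain(package)` is one possible, simplified consistency test for a single timeline.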
The set of timelines for all variables is consistent if each timeline is:
A valid plan must achieve the desired goal or set of goals. For a given goal g = ⟨νg, x⟩ (i.e., νg = x), a consistent timeline for νg achieves the desired goal if, at the end of the timeline, the end value of the last token matches x. Alternatively, we say that the timeline achieves g at some point in time if there exists a token T such that the end value of T matches x. For a given goal set G, if for all g ∈ G a consistent timeline for νg satisfies g, then we say that the set of timelines TL for all variables satisfies G, or TL ⊨ G.
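By way of non-limiting illustration, the goal tests described above may be sketched as follows. The dictionary layout of a timeline and the function names are assumptions made for this sketch, as is the choice to fall back on the current value when a timeline has no tokens.

```python
def end_value(timeline):
    """End value of the last token (assumed: current value if no tokens)."""
    tokens = timeline["tokens"]
    return tokens[-1]["end_value"] if tokens else timeline["current_value"]


def achieves(timeline, x):
    # The timeline achieves the goal if its final value matches x.
    return end_value(timeline) == x


def achieves_at_some_point(timeline, x):
    # Weaker test: some token T has an end value matching x.
    return any(tok["end_value"] == x for tok in timeline["tokens"])


def satisfies(timelines, goals):
    # TL satisfies G if every goal (variable, x) is achieved by its timeline.
    return all(var in timelines and achieves(timelines[var], x)
               for var, x in goals.items())


# Tokens are assumed to be listed in temporal order.
timelines = {
    "package_location": {
        "current_value": "L1",
        "tokens": [{"end_value": "Crane1"}, {"end_value": "L2"}],
    },
    "crane_location": {"current_value": "L1", "tokens": []},
}
```

With these assumed inputs, `satisfies(timelines, {"package_location": "L2"})` holds, whereas the value `"Crane1"` is achieved only at some point in time and not at the end of the timeline.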
The planner takes as an input 410 a consistent timeline set TL, representing all changes happening from the current wall-clock time to all state variables, and a goal set G. The planner attempts to find a plan P such that (1) adding T(P) to TL does not cause any inconsistency, (2) P achieves all goals, and (3) P is executable (i.e., all tokens caused by this plan should be able to start after the wall-clock time when the plan is found) 424. To achieve this, the planner starts with an empty plan 412 and continually generates revisions until a valid plan is found. It does so by maintaining a queue (SQ) of plan states, each containing a potential incomplete plan P and the corresponding timeline containing tokens representing actions in P. SQ is initialized with an empty plan and the current timelines at the planning time 412. The planner then picks the best state s = ⟨TLs, Ps⟩ from the queue according to the objective function of the planning process in 414. If, in 416, the state contains a consistent plan Ps, then it is returned for execution 424. If not, then the planner will create zero or more revisions P′ of the partial plan Ps in 418. It also generates the corresponding timeline set TL′ for each new P′ in 420. The new states combining newly generated plans P′ and timelines TL′ are, in 422, put back into the state queue SQ and the process is repeated back to 414.
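By way of non-limiting illustration, the loop of steps 412 through 424 may be sketched as a best-first search over a priority queue. The function `plan_search`, its callback parameters `is_valid`, `revise`, and `score`, and the toy domain (an integer standing in for the timeline set) are assumptions made for this sketch and are not the planner of the disclosure.

```python
import heapq
from itertools import count


def plan_search(tl0, is_valid, revise, score):
    """Best-first search over states (TL, P), lowest score first."""
    tie = count()  # tie-breaker so the heap never compares TL or P directly
    sq = [(score(tl0, ()), next(tie), tl0, ())]    # 412: empty plan, current TL
    while sq:
        _, _, tl, plan = heapq.heappop(sq)         # 414: pick the best state
        if is_valid(tl, plan):                     # 416: valid plan found?
            return plan                            # 424: return for execution
        for tl_new, plan_new in revise(tl, plan):  # 418/420: revisions P', TL'
            heapq.heappush(
                sq, (score(tl_new, plan_new), next(tie), tl_new, plan_new))  # 422
    return None  # queue exhausted without a valid plan


# Toy domain (assumed): the "timeline set" is one integer, the goal is 6.
def revise(tl, plan):
    if len(plan) >= 6:  # bound the depth so the search terminates
        return []
    return [(tl + 1, plan + ("inc",)), (tl * 2, plan + ("double",))]


plan = plan_search(
    1,
    is_valid=lambda tl, plan: tl == 6,
    revise=revise,
    score=lambda tl, plan: len(plan),  # "best" here means fewest actions
)
```

Replacing `score` changes which state is considered best, and replacing `revise` changes how partial plans are extended, which is how the same loop can accommodate different plan representations.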
The concepts of the above described
It is to be understood that the above algorithm is sufficiently general to capture both systematic and local-search style planning for different plan representations, and for different planners that can handle different sets of variables and constraints. In that regard, the specific revising of P; determining what is the best plan; and the representation of the plan during the planning process can and will vary dependent upon particular implementations. So, in one embodiment, for example, “best” is understood to be the plan that meets more of the predetermined criteria (e.g. shortest execution time, lowest total execution cost) than other potential plans.
Algorithm 1 (only fragments of the listing survive): … ⟨P0, TL0⟩; … ⟨Ps, TLs⟩ from SQ; … ⟨TL′, P′⟩ to generated state set SQ;
Turning now to implementing forward state-space (FSS) planners on a timeline, it is understood FSS planners search for a plan by moving forward through time. FSS planners start with an empty plan and gradually add actions at some fixed wall-clock time to the end of the currently expanding partial plan. This process is repeated until the final sequence of actions satisfies the defined goals. Thus, during the planning process, all actions in the partial plans and the tokens introduced by them are constrained to happen at some fixed wall-clock time. This set of constraints and the fact that FSS planners move forward, therefore not considering actions happening before a given time-stamp, simplifies plan state representation and reduces the branching factor compared to other algorithms.
A flow diagram 500 represents the operations performed by the FSS planner. The planner uses the estimated planning time p and the expected time to conduct one planning step e. To start the planning process, the planner moves to the expected time at which the planner can start to execute the eventually found plan: tp = tc + p 512. The planner also “freezes” all tokens in all timelines before tp and removes them from the initial timeline set 514. This step simplifies the token and timeline representation and also reduces their sizes. Like the planner in
Next, the planner selects a subset of promising actions 524, removing irrelevant actions (i.e., actions that do not lead towards a goal). There are several methods to implement this step, the simplest being to select all applicable actions or to select only a single best action according to a heuristic function. For each action a in the candidate set, tokens are then created to represent the conditions and effects of action a and are added to the timeline set for the plan 526. The actions are added to the plan at the wall-clock time ta found in the previous step, and the resulting state containing the newly created timelines and plan is added to the state queue (SQ) 528.
Next, to create one additional resulting state, the time-stamp is moved forward 530. This is a special action that helps to move the state time-stamp forward closer to the goal. When moving the time-stamp forward, the function sets a newer lower-bound on the future action execution time, which: (1) limits the branching factor; (2) simplifies the timelines by removing all tokens before the new time-stamp; and (3) reduces the interactions between tokens and future actions, leading to shorter heuristic computation time. Then the process moves from 530 back to 520.
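By way of non-limiting illustration, the two kinds of successor state described above (adding an action at a fixed wall-clock time, and moving the time-stamp forward) may be sketched as follows. The `State` and `Token` layouts, the function names, and the decision to fold the end value of each removed token into a `values` mapping are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical fixed-time token changing one variable."""
    variable: str
    start: float
    end: float
    end_value: object


@dataclass(frozen=True)
class State:
    ts: float      # wall-clock time-stamp of the search state
    values: dict   # variable values as of the time-stamp
    tokens: tuple  # tokens not yet removed from the timelines
    plan: tuple    # (action name, fixed start time) pairs


def add_action(state, name, start, duration, effects):
    # Actions may not start before the state time-stamp (lower bound).
    if start < state.ts:
        raise ValueError("action would start before the state time-stamp")
    new_tokens = tuple(Token(var, start, start + duration, val)
                       for var, val in effects.items())
    return State(state.ts, state.values, state.tokens + new_tokens,
                 state.plan + ((name, start),))


def advance_time_stamp(state, new_ts):
    # Special action: move forward and drop tokens that are already over.
    if new_ts <= state.ts:
        raise ValueError("time-stamp must move forward")
    values = dict(state.values)
    kept = []
    for tok in sorted(state.tokens, key=lambda t: t.end):
        if tok.end <= new_ts:
            values[tok.variable] = tok.end_value  # keep the effect, drop the token
        else:
            kept.append(tok)
    return State(new_ts, values, tuple(kept), state.plan)


s0 = State(0.0, {"package": "L1"}, (), ())
s1 = add_action(s0, "load", 0.0, 2.0, {"package": "Crane1"})
s2 = add_action(s1, "move", 2.0, 3.0, {"package": "L2"})
s3 = advance_time_stamp(s2, 2.0)
```

After the advance, only the move token remains in `s3`, so fewer tokens can interact with future actions, which corresponds to the simplification and reduced branching described above.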
Given that the plan returned by the FSS algorithm has all actions and tokens tied to some fixed wall-clock times, the FSS planning algorithm may not return a plan in which all actions start at the earliest possible time. As an optional step, it is possible to convert from the “fixed-time” plan into a plan with temporal ordering between tokens and actions 562. This can be accomplished using an extension of the approach specified in Do, M., & Kambhampati, S., “Improving the Temporal Flexibility of Position Constrained Metric Temporal Plans”, in Proceedings of the 13th International Conference on Automated Planning and Scheduling (ICAPS), 2003.
Turning to Algorithm 2 below, the above concepts are detailed in pseudocode. In Algorithm 2, it may be seen that lines 8-21 mimic the main steps in general Algorithm 1 that use a best-first-search framework (with lines 20-21 providing the option of converting from fixed-time tokens to tokens with temporal constraints). Corresponding to the discussion of
Algorithm 2 (only fragments of the listing survive): p: estimated planning time; e: estimated node expansion time; s = ⟨ts, TL, Ps⟩, where ts is a wall-clock time-stamp of s; … p; ⟨t0, TL0, …⟩; … e then … ⟨a, ta⟩ ∈ As′ …
Going back to the example shown in
Attention is now turned to a partial-order planner implemented on a timeline according to the present disclosure. It is to be noted that, while this disclosure uses the term partial-order planner and such a term is used in the literature, there are significant differences, particularly as this partial-order planner is designed to operate on timelines. An FSS planner finds a plan by moving forward through a sequence of consistent timelines until a given timeline set satisfies all goals. Conversely, a partial-order planner searches backward from the goals. The partial-order planner creates special tokens representing the goals and has the objective of creating enough tokens, by adding actions to the plan, to support all tokens in the set of unsupported tokens, which initially contains only the special goal tokens. The partial-order planner may therefore start with an inconsistent timeline set and systematically refine it until it becomes consistent. Instead of finding applicable actions as in the FSS planner, the partial-order planner finds relevant actions. Relevant actions are those actions that can contribute tokens that support currently unsupported tokens.
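By way of non-limiting illustration, the selection of relevant actions and the backward refinement of the set of unsupported tokens may be sketched as follows. Tokens are reduced to (variable, value) pairs, and the names `relevant_actions` and `support`, the action dictionary layout, and the rule that conditions already met by the current values need no further support are assumptions made for this sketch.

```python
def relevant_actions(unsupported, actions):
    """Pairs (token, action name) where the action has an effect supporting the token."""
    pairs = []
    for tk in unsupported:
        variable, value = tk
        for name, spec in actions.items():
            if variable in spec["effects"] and spec["effects"][variable] == value:
                pairs.append((tk, name))
    return pairs


def support(unsupported, tk, action_name, actions, current):
    """Support `tk` with the action; its unmet conditions become unsupported."""
    remaining = [t for t in unsupported if t != tk]
    for cond in actions[action_name]["conditions"].items():
        variable, value = cond
        if current.get(variable) != value and cond not in remaining:
            remaining.append(cond)
    return remaining


# Illustrative domain (assumed): load the package, then move it to L2.
actions = {
    "load": {"conditions": {"package_at": "L1"}, "effects": {"holding": True}},
    "move": {"conditions": {"holding": True}, "effects": {"package_at": "L2"}},
}
current = {"package_at": "L1", "holding": False}

unsupported = [("package_at", "L2")]            # special goal token
step1 = relevant_actions(unsupported, actions)  # the move action is relevant
unsupported = support(unsupported, step1[0][0], step1[0][1], actions, current)
step2 = relevant_actions(unsupported, actions)  # the load action is relevant
unsupported = support(unsupported, step2[0][0], step2[0][1], actions, current)
```

The search proceeds backward: the goal token is supported by the move action, whose condition is in turn supported by the load action, after which no unsupported tokens remain in this toy example.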
The flow diagram 700 in
Further describing the above flow diagram, shown below is the pseudocode of Algorithm 3 corresponding to the described partial order planning (POP) algorithm.
The main loop of the POP algorithm uses a search to find the plan (lines 6-16) which is similar to Algorithm 1 and Algorithm 2. However, particular differences between this POP algorithm and the FSS algorithm are:
Algorithm 3 (only fragments of the listing survive): ⟨TL0, Ps0 = …⟩; … e then … ⟨tk, a⟩ ∈ As′ … ⟨tka, a⟩ …
The partial-order planner and the FSS planner have different attributes. In the FSS planner, the fixed action times and the association of a time-stamp with each search state during the planning process lead to:
Attributes of the POP-style algorithm include:
The foregoing has described a timeline-based planning approach that operates by maintaining timelines that capture how the values of system variables change over time. The planner builds and maintains consistent plans by adding tokens to affected timelines, each token representing different types of operation and/or change affecting the variable represented by that timeline. The application supports many types of variables and various operations on those variables through different tokens, all of which can be shared between different planning episodes for different goals. Given that different planning algorithms are more suitable for different applications, the overall framework is designed to allow multiple planning algorithms to be used for a given task. In turn, different planning algorithms can call different search algorithms and constraint solvers (e.g., temporal reasoning, uncertainty reasoning, etc.) to solve the planning or replanning tasks.
The disclosed embodiments provide examples of improved solutions to the problems noted in the above Background discussion and the art cited therein. There is shown in these examples an improved online continual automated planning framework based on timelines. In one embodiment, a timeline-based continual on-line planning and scheduling method is provided for determination of a sequence of actions that, when executed from a known initial state, achieves all pre-defined goals. The method is performed by a planner residing within a computer control system having a memory storage. The planner builds and maintains a consistent valid plan by adding tokens to affected timelines. The plan is defined by a sequence of actions and each timeline represents a variable. All variables and their values represent a state, and each timeline comprises the current value of the variable and a set of tokens representing constraints and changes on the value of that variable over time. A token represents a condition or effect of an action affecting the variable, and tokens are added to timelines due to actions in the plan that affect the value of different variables. Each token has an earliest time point and a latest time point at which the action can occur. The planner takes as an input a goal set and a consistent set of timelines representing all operations occurring after the current wall-clock time that affect any state variables.
It will be appreciated that variants of the above-disclosed and other features and functions, or alternatives thereof, may be combined into many other different systems or applications. Various presently unforeseen or unanticipated alternatives, modifications, variations or improvements therein may be subsequently made by those skilled in the art which are also intended to be encompassed by the following claims.
The following co-pending and commonly assigned applications, the disclosures of each being totally incorporated herein by reference, are mentioned: U.S. patent application Ser. No. [Atty. Dkt. No. 20100583-US-NP], filed XXXXX, entitled, “Online Continual Automated Planning Framework Based on Timelines”, by Minh Binh Do; and U.S. patent application Ser. No. [Atty. Dkt. No. 20100585-US-NP], filed XXXXXX, entitled, “Partial-Order Planning Framework Based On Timelines”, by Minh Binh Do.