Method and system for dispatching semiconductor lots to manufacturing equipment for fabrication

Information

  • Patent Grant
  • Patent Number
    6,584,369
  • Date Filed
    Thursday, January 25, 2001
  • Date Issued
    Tuesday, June 24, 2003
Abstract
A method for dispatching available lots to unallocated machines is provided that includes receiving metrics data (26) providing performance measurements for a plurality of machines (18). The method next provides for determining a state for each of the machines (18) based on the metrics data (26). The method next provides for receiving one or more lots (22) to be dispatched where each lot (22) has a lot type and each lot type is associated with one of a plurality of models. The method next provides for selecting a preferred lot type (50) for each of the plurality of models associated with each of the machines (18) based on the state of the machine. The method next provides for selecting a preferred model (52) based on a time since a last run of the model, a cost of switching to a new model and lot type, and the state of the machine (18). The method next provides for resolving conflicts between selected preferred lot type/preferred model combinations when insufficient lots (22) are available to fill the selections. The method next provides for assigning each lot (22) to one of the machines (18) according to the preferred model (52) and preferred lot type (50) selections.
Description




TECHNICAL FIELD OF THE INVENTION




This invention relates generally to the field of process control, and more particularly to a method and system for dispatching semiconductor lots to manufacturing equipment for fabrication.




BACKGROUND OF THE INVENTION




In a semiconductor fabrication facility with a production-to-order type operation, many machines and various processes must be managed. The major goals are short cycle times and precise delivery schedules for the purpose of satisfying customer expectations. Difficulties encountered include a complex set of choices of process mix and product mix, unscheduled machine down times, and equipment arrangements. How to effectively schedule and dispatch lots has therefore become a very important topic in manufacturing management. Within each such fabrication facility, a scheduler typically drives a buffer of available lots to be processed. The buffer of available lots is typically fed to a dispatcher that allocates available lots to specific machines or processes.




In general, dispatch systems can be open loop or closed loop. Open loop dispatch systems make dispatch decisions without regard to the actual performance of the system or to the outcome of previous dispatch decisions. Open loop dispatch systems rely solely on the open loop model of the process and assume that the process does what is expected of it. Closed loop dispatch systems make dispatch decisions and then feed back appropriate information to help improve future dispatch decisions. A closed loop dispatch system learns, with feedback, what is occurring in the system and then changes its dispatch behavior based upon that information.




SUMMARY OF THE INVENTION




In accordance with the present invention, a method and system for dispatching semiconductor lots to manufacturing equipment for fabrication are provided that substantially reduce or eliminate disadvantages and problems associated with conventional lot dispatchers. More particularly, the dispatch system and method implement a closed-loop system that minimizes the influence of process and inventory disturbances and optimizes process performance.




According to an embodiment of the present invention, there is provided a method for dispatching available lots to unallocated machines that includes receiving metrics data providing performance measurements for a plurality of machines. The method next provides for determining a state for each of the machines based on the metrics data. The method next provides for receiving one or more lots to be dispatched where each lot has a lot type and each lot type is associated with one of a plurality of models. The method next provides for selecting a preferred lot type for each of the plurality of models associated with each of the machines based on the state of the machine. The method next provides for selecting a preferred model based on a time since a last run of the model, a cost of switching to a new model and lot type, and the state of the machine. The method next provides for resolving conflicts between selected preferred lot type/preferred model combinations when insufficient lots are available to fill the selections. The method next provides for assigning each lot to one of the machines according to the preferred model and preferred lot type selections.




Technical advantages include providing an improved dispatch system for semiconductor manufacturing facilities. In particular, the dispatch system uses process performance information to dispatch available lots to available machines. As a result, process performance is optimized.




In addition, the dispatcher takes the cost of switching between processes into account in making dispatch decisions. The dispatch system assures adequate sampling of machines in the semiconductor fabrication facility that in turn provides for optimal machine, or process, performance.




Other technical advantages may be readily apparent to one skilled in the art from the following figures, description, and claims.











BRIEF DESCRIPTION OF THE DRAWINGS




For a more complete understanding of the present invention and the advantages thereof, reference is now made to the following description taken in conjunction with the accompanying drawings, wherein like reference numbers represent like parts, and in which:





FIG. 1 is a block diagram illustrating a processing system including a scheduler and a dispatcher in accordance with one embodiment of the present invention;

FIG. 2 is a block diagram illustrating the dispatcher of FIG. 1 including a preferred lot type module, a preferred model module and an arbitrator in accordance with one embodiment of the present invention;

FIG. 3 is a flow diagram illustrating operation of the dispatcher of FIG. 2 in accordance with one embodiment of the present invention;

FIG. 4 is a flow diagram illustrating the process of the arbitrator of FIG. 2 in accordance with one embodiment of the present invention;

FIG. 5 is a block diagram illustrating additional details of the preferred lot type module of FIG. 2 in accordance with one embodiment of the present invention; and

FIG. 6 is a block diagram illustrating additional details of the preferred model module of FIG. 2 in accordance with one embodiment of the present invention.











DETAILED DESCRIPTION OF THE INVENTION




Referring to FIG. 1, a work cell is generally indicated at 10. Work cell 10 is a system designed to perform a work function using various machines, or processes, assigned to work cell 10. In an exemplary embodiment, work cell 10 is a group of machines in a semiconductor fabrication facility organized such that a series of fabrication processes may be performed on semiconductor wafers using the various machines. However, any suitable work cell may be used, such as a work cell to select a telecommunications channel. The present invention may be used with any process that may be sampled.




Work cell 10 includes a scheduler 12, a buffer 14, a dispatcher 16, and a plurality of machines 18. Each machine 18 may have an associated controller 20. However, the controller 20 functions may be performed by the dispatcher 16. By using a controller 20 for each machine 18, the processes of the present invention may be distributed between controller 20 and dispatcher 16.




Scheduler 12 develops a list of lots 13 to be processed by work cell 10 and places the list of lots 13 in buffer 14. The list of lots 13 includes one or more lots 22. Dispatcher 16 takes the available lots 22 in buffer 14 and dispatches each lot 22 to optimize operation of the work cell 10. Dispatcher 16 bases its dispatch decisions in part on feedback data 24 from each machine 18. Feedback data 24 includes machine metrics 26 from each machine 18 and downstream metric data 28. In the exemplary embodiment, machine metrics 26 and downstream metric data 28 are forwarded to an associated controller 20 that then forwards the feedback data 24 to dispatcher 16. Controller 20 may use the metric information for process control purposes such as controlling the associated machine 18.




Machine metrics 26 includes measurements on a particular machine 18 such as temperature, vacuum pressure, electrical current flow, and any other suitable machine measurements. Downstream metric data 28 may be gathered from testing devices designed to measure critical dimensions, product functionality, device errors, or any other suitable metric data.




The work cell 10 implements a closed-loop dispatch system to make dispatch decisions and then to feed back appropriate information to help improve future dispatch (allocation) decisions in work cell 10. The closed-loop dispatch system learns, with feedback, what is occurring in work cell 10 and then changes the dispatcher 16 behavior based on that information. This allows the work cell 10 to account for various disturbances that affect operations. Cell disturbances include buffer disturbances. For example, work cell 10 performance may be affected by changes in the product, or lot, mix appearing in buffer 14. The proportion of different types of lots in buffer 14 may change on a daily basis. Another type of disturbance affecting work cell 10 performance is equipment drift. Equipment drift may be caused by equipment going out of alignment, having temperature sensors fail, or other equipment disturbances. These equipment disturbances cause changes in product fabrication that may not affect the current lot 22 being processed by a particular machine 18, but that may significantly affect a later lot 22 dispatched to the particular machine 18.




In an embodiment, the closed-loop dispatch system is non-idling, meaning that if sufficient lots 22 are available, none of the available machines 18 will be idle. This maximizes throughput through work cell 10. The closed-loop dispatch system also allocates low running lots 22 first so that adequate sampling is obtained on which to make future dispatch decisions. The closed-loop dispatch system soft dedicates machines 18, preferring to run a given type of low running lot 22 on a limited set of machines 18. This results in better control due to higher sampling rates on a limited set of machines 18. Thus, for low volume lots, the dispatcher 16 tries to allocate lots 22 to machines 18 best suited (in a performance sense) to run the lots 22. High volume lots 22 are run on many machines 18, thereby providing sampling data on all available machines 18. If a specific lot 22 (process) is run on multiple machines 18, and if one of those machines 18 has a process shift that significantly degrades its performance, the dispatcher 16 will try to limit lots 22 run on that machine 18. Once appropriate corrective action is taken for the machine 18, the dispatcher 16 will again start loading that machine 18.
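The three policies above (low-running lots first, soft dedication, non-idling) can be sketched as a single selection rule. This is an illustrative sketch only, not the patented algorithm; the `Lot` record, the run-count threshold, and the `dedicated_machines` map are assumptions introduced for illustration.

```python
from collections import defaultdict, namedtuple

Lot = namedtuple("Lot", "id type")  # hypothetical minimal lot record

run_counts = defaultdict(int)  # (machine, lot_type) -> past run count (sampling history)

def pick_lot(available_lots, machine, dedicated_machines, low_volume_cutoff=5):
    """Prefer the least-sampled lot types first; soft-dedicate low-volume
    lot types to a limited machine set; never idle if any lot is available."""
    # Rank so the lot types with the least sampling on this machine come first.
    ranked = sorted(available_lots, key=lambda lot: run_counts[(machine, lot.type)])
    for lot in ranked:
        low_volume = run_counts[(machine, lot.type)] < low_volume_cutoff
        # Soft dedication: keep low-volume lot types on their preferred machines.
        if low_volume and machine not in dedicated_machines.get(lot.type, {machine}):
            continue
        return lot
    # Non-idling: if lots are available but none passed the filter, take one anyway.
    return ranked[0] if ranked else None
```

A real dispatcher would derive the ranking from the state estimate rather than raw run counts; the sketch only shows how the three policies compose.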




The cost for a particular machine 18 to switch to a new process for a new lot 22 includes the cost involved in switching product mixes, machine settings, reticle mixtures, and any other suitable variables. The cost for a particular machine 18 not to switch to a new process includes having insufficient sampling data for running a type of lot 22 on that particular machine 18. Future dispatch decisions rely on sufficient feedback data 24 in the form of process sampling.




In operation, scheduler 12 determines lots 22 that need to be processed within the next period of time, where the period of time may be any suitable period including hours or days. Upon receiving a dispatch event, such as one of the machines 18 becoming available to accept a new lot 22, dispatcher 16 then assigns a lot 22 from buffer 14 to a particular machine 18 for processing. The dispatcher 16 bases its dispatch decisions on feedback data 24 and on the cost for a particular machine 18 to switch to a new process as compared to that machine 18 not switching to the new process.




After a machine 18 completes an assigned lot 22 or process, machine 18, through feedback data 24, informs dispatcher 16 that it requires a lot 22 for processing. Dispatcher 16 then allocates available lots 22 in buffer 14 to idle, or available, machines 18. Reallocating lots 22 to machines 18 with each request for a lot 22 by a machine 18 allows dispatcher 16 to make sure that changes in buffer 14 are taken into account prior to allocating a lot 22 to a machine 18. Scheduler 12 can change lots 22 in buffer 14 at any time. By doing this, scheduler 12 handles priority lots 22 by removing non-priority lots 22 from buffer 14 to ensure that priority lots 22 are dispatched next.
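Because lots are reallocated on every machine request, the scheduler can promote priority lots simply by editing the buffer between dispatch events. A minimal sketch of that buffer edit, assuming each lot is a dict with a boolean `priority` flag (an invented field):

```python
def promote_priority(buffer_lots, max_size):
    """Keep priority lots in the buffer by evicting non-priority lots first.
    Sorting on `not priority` puts priority lots at the front, so they
    survive truncation to the buffer capacity."""
    prioritized = sorted(buffer_lots, key=lambda lot: not lot["priority"])
    return prioritized[:max_size]
```

The sort is stable, so lots of equal priority keep their original (scheduler-chosen) order.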




Referring to FIG. 2, details of dispatcher 16 are illustrated. Dispatcher 16 includes a machine preference module 40 for each machine 18 in work cell 10, an arbitrator 42, and metrics history and switching cost data 44. Each machine preference module 40 retrieves data from metrics history and switching cost data 44 and determines a preferred lot type from available lot types represented in buffer 14. The preferred lot types are submitted to arbitrator 42 where any conflicts between determined preferred lot types are resolved. Each machine preference module 40 includes a preferred lot type module 46 for each model that may be processed by a particular machine 18 and a preferred model module 48. As described in more detail below, each preferred lot type module 46 generates a preferred lot type 50 from the available lots 22 associated with a particular model. The preferred model module 48 generates a preferred model 52 from the plurality of preferred lot types 50, where each preferred lot type 50 is associated with a particular model. In one embodiment, dispatcher 16, machine preference module 40, arbitrator 42, preferred lot type module 46, and preferred model module 48 are implemented in one or more software modules stored in a computer readable medium.
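The FIG. 2 structure, a per-machine preference stage feeding a shared arbitrator, can be pictured as one object that runs a lot-type selector per model and then a model selector over those results. A structural sketch only; the class name, the callback signatures, and the dict-shaped lots are assumptions, not the patent's implementation:

```python
class Dispatcher:
    def __init__(self, machines, select_lot_type, select_model):
        # machines: {machine_id: [models it can process]}
        # select_lot_type(machine, model, lots) -> preferred lot type (module 46 role)
        # select_model(machine, {model: lot_type}) -> preferred model (module 48 role)
        self.machines = machines
        self.select_lot_type = select_lot_type
        self.select_model = select_model

    def requests(self, buffer_lots):
        """For each machine, pick a preferred lot type per model, then one
        preferred model; the arbitrator later resolves colliding requests."""
        reqs = {}
        for machine, models in self.machines.items():
            by_model = {}
            for model in models:
                lots = [l for l in buffer_lots if l["model"] == model]
                if lots:
                    by_model[model] = self.select_lot_type(machine, model, lots)
            if by_model:
                model = self.select_model(machine, by_model)
                reqs[machine] = (model, by_model[model])
        return reqs
```

In the patent the two selectors are driven by the state estimator and value function of FIGS. 5 and 6; here they are injected as plain callbacks to show only the data flow.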




Each lot 22 in buffer 14 is a member of a particular lot type. A lot 22 is a specific instance of a lot type. A lot type represents any classification that is used to identify differences between lots within a given model. In the exemplary embodiment, a lot type may represent the time period and target thicknesses for an oxidation layer grown in a furnace. Other examples of lot types may include lots with different device pattern densities that have the same targets but need different exposures to achieve a particular target line width.




Each lot type is a member of a model. Thus, the lot types are grouped into lot types associated with a particular model. The model identifies a common characteristic of the group of lot types associated with the model. In an exemplary embodiment, a model represents the expected output from a machine 18 running a recipe on a lot 22 of wafers. In that embodiment, the model may represent a specific temperature, pressure and deposition time combination for a furnace and the associated deposition thicknesses resulting from that model. Other examples of models may include the over-etch time necessary to hit a target polysilicon line width in an etch machine 18 where the incoming material comes from the same photo-stepper. Any other suitable model may be used.




A machine 18 may process one or more models. Each model may include one or more lot types. Each lot type may be represented by one or more lots 22 in buffer 14. Although the present invention is discussed with reference to lots, lot types, models, and machines as used in a semiconductor fabrication facility, the present invention is applicable to a variety of processes including other types of manufacturing or allocation of channels in a telecommunications system.




In the exemplary embodiment, the inventory of lots 22 in buffer 14 is divided according to the model associated with specific lots 22. Each model is further divided by lot types. For example, in the case of oxidation furnaces, different ambients would imply different models. Each model in turn could account for different target thicknesses by using different lot types. In certain cases, for example overlay situations, different models could correspond to different pattern levels with each model having a single lot type.
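The model/lot-type partition of the buffer amounts to a two-level grouping. A sketch, assuming each lot carries `model` and `lot_type` fields (invented names for the classification described above):

```python
from collections import defaultdict

def partition_buffer(buffer_lots):
    """Group the buffer inventory by model, then by lot type within each model."""
    inventory = defaultdict(lambda: defaultdict(list))
    for lot in buffer_lots:
        inventory[lot["model"]][lot["lot_type"]].append(lot)
    return inventory
```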




In operation, each machine preference module 40, associated with a particular machine 18, selects and requests a particular lot 22 from buffer 14. Each machine preference module 40 first groups the lots 22 in buffer 14 by a model associated with lots 22 that may be processed by machine 18. Preferred lot type module 46 then selects a preferred lot type 50 for a particular model from the lot types of the available lots 22 for that model. Preferred lot type module 46 uses information from metrics history and switching cost data 44 to assist in the decision.




After each preferred lot type module 46 has selected a preferred lot type 50 for its associated model, preferred model module 48 selects one of the models associated with the preferred lot types 50 as a preferred model 52. Preferred model 52 includes the preferred lot type 50 for the selected model. Preferred model module 48 uses metrics history and switching cost data 44 to assist in selecting and requesting the preferred model 52.




After each machine preference module 40 has selected and requested a preferred model 52, arbitrator 42 resolves conflicts between machine preference modules 40 that request the same preferred model 52 and preferred lot type 50 when insufficient lots 22 exist in buffer 14 to fill the requests. Arbitrator 42 produces an optimal list of lot assignments 54. Lot assignments 54 allocate, or assign, a particular lot 22 in buffer 14 to a particular machine 18.




Referring to FIG. 3, the general processing flow of dispatcher 16 is illustrated. The process commences at step 80 where dispatcher 16 receives a dispatch event. A dispatch event occurs when a machine 18 completes its assigned process (lot) and is available to accept a new lot 22 from buffer 14.




The process proceeds to step 82 where dispatcher 16 determines the lot types of unassigned lots 22 in buffer 14. Each lot 22 is associated with a lot type, and each lot type is associated with a model. Unassigned lots 22 are those lots that are not committed to a particular machine 18. In one embodiment, lots 22 are committed once they are assigned to a particular machine 18. However, in another embodiment, lots 22 are not committed until they are in process on the assigned machine 18.




The process proceeds to step 84 where the lots 22 associated with relevant lot types are passed to each machine preference module 40. The relevant lot types are lot types belonging to models that may be processed by the particular machine 18 associated with the machine preference module 40.




The process proceeds to step 86 where the lot types associated with a common model are passed to the preferred lot type module 46 for that particular model. Each lot type is associated with a particular model and is therefore forwarded to the preferred lot type module 46 for that particular model.




The process proceeds to step 88 where each preferred lot type module 46 determines a preferred lot type 50 from the available lot types of the relevant lots 22. The preferred lot type is the lot type that will assist in assuring that the associated machine 18 continues to perform at an optimal level. The process of determining a preferred lot type 50 will be discussed in more detail with reference to FIG. 5.




The process proceeds to step 90 where preferred model module 48 selects a preferred model 52 from the preferred lot types 50 generated in step 88. Each machine preference module 40 selects one preferred model 52. The preferred model 52 has a preferred lot type 50 associated with it. Multiple lots 22 in buffer 14 may satisfy the requested preferred model 52 and preferred lot type 50. In the exemplary embodiment, multiple lots of the same wafer may be scheduled by scheduler 12 and placed in buffer 14. Identical lots will have the same lot type and model. A method for selecting a preferred model 52 will be discussed in more detail with reference to FIG. 6.




The process proceeds to step 92 where the preferred model 52 from each machine preference module 40 is passed to arbitrator 42. Preferred model 52 also includes an associated preferred lot type 50.




The process proceeds to decisional step 94 where a decision is made regarding whether there are sufficient lots 22 available in buffer 14 to fill all preferred model 52 requests. If there are insufficient lots available to fill all preferred model 52 requests, the NO branch of decisional step 94 proceeds to step 96 where arbitrator 42 resolves conflicts between machine preference modules 40 requesting the same preferred model 52 and preferred lot type 50.




The process proceeds to step 98 where lots 22 that are subject to a conflict are assigned to the machine 18 indicated in step 96. The lot assignments are stored in lot assignments 54.




The process proceeds to step 100 where the dispatch process is repeated for unassigned lots 22 and machines 18 without assigned lots. Machines 18 that did not receive an assigned lot 22 select another lot choice from buffer 14. Arbitrator 42 then resolves any conflicts between the new selections received from machines 18 that did not receive an assigned lot 22 in the previous pass through the dispatch process. The method of resolving conflicts between selected preferred model 52 and preferred lot type 50 will be described in more detail with reference to FIG. 4 regarding arbitrator 42.




The process proceeds to decisional step 102 where a decision is made regarding whether all lots 22 in buffer 14 are fully assigned or whether each machine 18 has a predefined number of assigned lots. This decisional step allows each machine preference module 40 to make a number of ordered selections from buffer 14. Thus, each machine 18 may select a first choice, a second choice, and so on until a predefined number of choices is made. This allows a certain number of predefined lots to be frozen and statically assigned to a particular machine 18. If the buffer is not fully assigned or each machine 18 does not have a predefined number of assigned lots, the NO branch of decisional step 102 proceeds to step 82 where the dispatch process is repeated. If the buffer is fully assigned or each machine 18 has a predefined number of assigned lots, the YES branch of decisional step 102 terminates the process.




Referring again to decisional step 94, if there are sufficient lots 22 available in buffer 14 to fill all preferred model 52 and preferred lot type 50 requests from machine preference modules 40, the YES branch of decisional step 94 proceeds to step 104 where lots 22 in buffer 14 are assigned to machines 18 and placed in lot assignments 54 according to the preferred model 52 and preferred lot type 50. After step 104, the process proceeds to decisional step 102.
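The FIG. 3 flow is essentially a loop: gather one request per open machine, arbitrate, commit the granted assignments, and repeat until the buffer is exhausted or every machine holds its quota of lots. A condensed sketch under assumed helper signatures (the `request_fn`/`arbitrate_fn` callbacks stand in for the machine preference modules and arbitrator, and are not the patent's interfaces):

```python
def dispatch(buffer_lots, machines, request_fn, arbitrate_fn, quota=1):
    """request_fn(machine, lots) -> (model, lot_type) or None;
    arbitrate_fn(requests, lots) -> {machine: lot} granting some requests."""
    assignments = {m: [] for m in machines}
    while buffer_lots and any(len(a) < quota for a in assignments.values()):
        open_machines = [m for m in machines if len(assignments[m]) < quota]
        requests = {m: request_fn(m, buffer_lots) for m in open_machines}
        requests = {m: r for m, r in requests.items() if r is not None}
        if not requests:
            break
        granted = arbitrate_fn(requests, buffer_lots)
        if not granted:
            break  # nothing could be granted; avoid spinning forever
        for machine, lot in granted.items():
            assignments[machine].append(lot)
            buffer_lots.remove(lot)  # committed lots leave the buffer
    return assignments
```

Machines that lose an arbitration round simply issue a new request against the reduced buffer on the next pass, matching steps 100 and 102.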




Referring to FIG. 4, the method of arbitrator 42 in dispatcher 16 is illustrated. The method commences at step 120 where the list of requests in the form of preferred models 52 and associated preferred lot types 50 is received from the machine preference modules 40. Each machine preference module 40 is associated with a particular machine 18 and selects a preferred model and lot type associated with one of the lots 22 in buffer 14 for processing.




The method proceeds to step 122 where the machines 18 are grouped by the requested preferred model 52 and associated preferred lot type 50 in increasing order of an upper value function.




The method proceeds to decisional step 124 where arbitrator 42 determines whether any conflicts exist. A conflict is defined as more than one machine 18 requesting a particular model and lot type. If there are no conflicts, the NO branch of decisional step 124 proceeds to step 126 where lots 22 in buffer 14 are assigned to the requesting machines 18. After step 126 the method terminates.




Returning to decisional step 124, if there are conflicts, the YES branch of decisional step 124 proceeds to decisional step 128 where a decision is made regarding whether there are sufficient lots 22 to fill all requested preferred models 52 and associated preferred lot types 50. If there are sufficient lots, the YES branch of decisional step 128 proceeds to step 126. If there are insufficient lots, the NO branch of decisional step 128 proceeds to step 130 where all machines 18 that have requested preferred models 52 with insufficient lots 22 to fill those requests are selected as a group of machines 18 with conflicts.




The method proceeds to step 132 where the requested models 52 associated with the group of machines 18 with conflicts selected in step 130 are sorted by their maximum cost (upper value) function. The maximum cost function is described in further detail below.




The method proceeds to step 134 where lots 22 in buffer 14 are allocated from the model with the minimum maximum cost function as determined in step 132.




The method proceeds to step 136 where the machines 18 that received a lot 22 allocation are removed from the group of machines 18 with conflicts. In addition, all lots 22 that are allocated are removed from the lots 22 that are available for allocation.




The method proceeds to decisional step 138 where a decision is made regarding whether all machines 18 have an allocated lot 22 or whether all lots 22 are allocated. If all machines 18 have an allocated lot 22 or all lots 22 are allocated, the YES branch of decisional step 138 terminates the method. If all machines 18 do not have an allocated lot or all lots 22 are not allocated, the NO branch of decisional step 138 proceeds to step 140 where the reduced set of available lots 22 is forwarded to the unallocated machines 18 for selection of a new preferred model 52 and associated preferred lot type 50. After step 140, the method proceeds to step 120 where the arbitration process is repeated for machines 18 without allocated lots 22. An exemplary embodiment of arbitrator 42 will be described in more detail below.
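Steps 130 through 136 resolve a shortage by sorting the contested models by their maximum cost (upper value) function and feeding lots to the model with the smallest maximum cost first. A sketch of one allocation pass, with `max_cost` standing in for the patent's upper value function (its exact form is given by the equations later in the document):

```python
def resolve_conflicts(conflicts, supply, max_cost):
    """conflicts: {model: [machines requesting it]}; supply: {model: lot count};
    max_cost(model) -> upper value. Allocate from the model with the minimum
    maximum cost first, until its lots run out."""
    allocations = {}
    for model in sorted(conflicts, key=max_cost):  # cheapest worst case first
        for machine in conflicts[model]:
            if supply.get(model, 0) <= 0:
                break  # this model's lots are exhausted; remaining machines re-request
            allocations[machine] = model
            supply[model] -= 1
    return allocations
```

Machines left out of `allocations` correspond to the NO branch of step 138: they pick a new preferred model against the reduced supply and the pass repeats.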




Referring to FIG. 5, details of preferred lot type module 46 are illustrated. Preferred lot type module 46 includes a state estimator 160, a V mapping module 162, an optimizer 164, and a control map module 166. In FIG. 5, the lots 22 in buffer 14 are represented by $\bar{b}$.




The inputs to preferred lot type module 46 are feedback data measurements from metrics history and switching cost data 44 and the lots 22 from the buffer 14 ($\bar{b}$). The feedback data measurements from metrics history and switching cost data 44 are supplied by feedback data 24. Since feedback data 24 is recursive, there is a one step delay between the current processing and the feedback data measurements 24. The feedback data measurements 24 are fed to a state estimator 160. In the exemplary embodiment, state estimator 160 is implemented by an equation (4) in general or an equation (17) more specifically, as described in more detail below. The output of state estimator 160 is an information state ($p_k$). Equation (4) is a cost biased estimator which is optimal for solving a dynamic game equation (3).




At the same time, the buffer value ($\bar{b}$) is fed to V mapping module 162 that outputs a function $V(\cdot,\bar{b})$. The V mapping module 162 is constructed by solving an equation (7) and storing the solution V for later retrieval as described in more detail below. This function $V(\cdot,\bar{b})$ and the information state $p_k(\cdot,\bar{b})$ are fed to an optimizer 164. In the exemplary embodiment, optimizer 164 is implemented by an equation (8) as described in further detail below. Optimizer 164 generates an optimized state estimate 165.




Via the dynamic programming principle, $V(\cdot,\bar{b})$ represents the best worst-case cost to go, and $p_k(\cdot,\cdot)$ represents the worst-case cost, up to time k, given measurements and inputs from and to work cell 10. By maximizing the sum of the two costs, one tries to gauge the value of the state that would have resulted if the disturbances were behaving in a worst-case sense. The states estimated by state estimator 160 are frozen until a lot belonging to the specific machine/model combination being processed by preferred lot type module 46 is run. At that time, the process illustrated by state estimator 160, V mapping module 162, and optimizer 164 is performed.




The optimized state estimate 165 from optimizer 164 and the lots 22 in buffer 14 ($\bar{b}$) are forwarded to control map module 166 that includes the control map ($u_f$) obtained from solving equation (7) as described in more detail below. From optimization theory, this represents the best action to take given the state $x_k$ and the buffer state $b_k$. The output of control map module 166 is the preferred lot type 50 that should be run next from all the lots 22 of a particular model in buffer 14 for optimal performance.




Following is an exemplary embodiment of selecting a preferred lot type 50. First, notation used in describing the exemplary embodiment is introduced.




|·| denotes the Euclidean norm.

∥·∥ denotes the l_2 norm.

ℝ denotes the reals.

ℤ denotes the integers.

ℤ_+ denotes the nonnegative integers.

B_n = {b : b ∈ ℤ_+^n\0}.

x^(i) denotes the i-th element of the vector x ∈ ℝ^n.

For x, y ∈ ℝ^n, x ⪰ y if y^(i) = 0 whenever x^(i) = 0, i = 1, …, n. Also, x ≻ y if x ⪰ y and there exists j ∈ {1, …, n} such that y^(j) = 0 but x^(j) ≠ 0. The relations ⪯ and ≺ are similarly defined.

{overscore (1)}_n is the vector [1 1 1 … 1]^T ∈ ℝ^n.

e_i^n is the i-th basis vector in ℝ^n.

x_{i,j} and x_{[i,j]} denote the sequence {x_i, x_{i+1}, …, x_j}.

l^2_{[0,k]} denotes the set of sequences {x_{0,k} : ∥x∥ is finite}.

<x> denotes the dimension of a vector x.

For b ∈ B_n define

U_m(b) = { Σ_{i=1}^n α_i e_i : 0 < Σ_{i=1}^n α_i ≤ m, α_i ∈ ℤ_+, b^(i) ≥ α_i, i = 1, …, n }.

π(N) denotes the set of all possible permutations of {1, 2, …, N}. The supremum and infimum over an empty set are taken to be negative infinity (−∞) and positive infinity (+∞), respectively.
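The set U_m(b) just defined can be made concrete with a small enumeration. The following is an illustrative sketch (the function name and the tuple representation of buffers are assumptions, not part of the patented method): it lists every allocation of at most m lots that the buffer b can support, with no lot type over-drawn.

```python
from itertools import product

def admissible_allocations(b, m):
    """Enumerate U_m(b): allocation vectors (alpha_1, ..., alpha_n) with
    0 < sum(alpha) <= m and alpha_i <= b[i] for every lot type i."""
    # Each alpha_i ranges over 0..min(m, b[i]); keep totals in (0, m].
    ranges = [range(min(m, bi) + 1) for bi in b]
    return [alpha for alpha in product(*ranges) if 0 < sum(alpha) <= m]

allocs = admissible_allocations((2, 0, 1), m=2)
# No allocation draws from the empty lot type (index 1).
assert all(a[1] == 0 for a in allocs)
assert (1, 0, 1) in allocs and (2, 0, 0) in allocs
```

For a single-lot dispatch decision, U_1(b) is simply the set of unit vectors on the buffer's non-empty types.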




Single Machine Case




The system under consideration is defined as:











x_{k+1} = f(x_k, u_k, w_k),  x_0 = {overscore (x)}

y_{k+1} = g(x_k, u_k, w_k)

z_{k+1} = l(x_k, u_k, w_k)

b_{k+1} = b_k − u_k + η_k,  b_0 = b,  k = 0, 1, …   [1]













where x_k ∈ ℝ^n are the states, u_k ∈ U_1(b_k) is the specific lot that was chosen for run k, y_k ∈ ℝ^t are the measurements, w_k ∈ ℝ^r and η_k ∈ ℤ^s are exogenous disturbances, b_k ∈ B_s is the buffer, and z_k ∈ ℝ^q are the quantities we want to regulate.




The dynamics f denote the recursions for both the controller and any weighting filters on the disturbances driving the process. The output map g is typically a model based on response surfaces. l is the quantity we want to control, and it typically represents targeting errors. The buffer b_k is a listing of the number of different types of material available. The i-th element of b_k, denoted b_k^(i), is the number of lots of type i available for dispatch. Given M models, this listing can always be reordered such that b_k^(1), b_k^(2), …, b_k^(i_1) belong to model 1, b_k^(i_1+1), …, b_k^(i_2) to model 2, and so on up to model M.




η_k denotes lots of different types added or deleted by the scheduler. We allow deletion to handle, for example, priority lots, in which case the scheduler pulls all non-priority lots out of the buffer and forces the dispatcher to allocate priority lots only.
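The buffer recursion b_{k+1} = b_k − u_k + η_k, including the scheduler's ability to delete lots, can be sketched minimally as follows (the function name and list representation are illustrative assumptions):

```python
def step_buffer(b, u, eta):
    """One buffer update b_{k+1} = b_k - u_k + eta_k.
    u:   unit vector for the lot type dispatched this run.
    eta: lots added (positive) or pulled (negative) by the scheduler."""
    nxt = [bi - ui + ei for bi, ui, ei in zip(b, u, eta)]
    if any(x < 0 for x in nxt):
        raise ValueError("scheduler removed more lots than available")
    return nxt

# Dispatch one lot of type 0 while the scheduler pulls both lots of type 2,
# e.g. to force priority material through on another machine.
b1 = step_buffer([3, 1, 2], u=[1, 0, 0], eta=[0, 0, -2])
assert b1 == [2, 1, 0]
```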




Let us define the set of sufficiently full buffers Λ_s^m by

Λ_s^m = { b ∈ B_s : b^(i) ≥ m, i = 1, …, s }




for some m > 0. Let there also exist a buffer penalty function ϑ : B_s × U_1(B_s) × ℤ^s → ℝ*. Define the set Ψ^s_{(0,∞)}(b, u) = { η_{(0,∞)} ∈ ℤ^s_{(0,∞)} : ∥ϑ(b, u, η)∥ < ∞ }. We will assume that x = 0 is an equilibrium point for the system. Namely, there exists a u_φ ∈ U_1(Λ_s^m) such that

f(0, u_φ, 0) = 0

l(0, u_φ, 0) = 0.

Also, let N_φ = { η ∈ ℤ^s_{(0,∞)} : ϑ(b, u, η) = 0, b ∈ B_s, u ∈ U_1(B_s) }.




We now state basic assumptions that capture the dispatch process, and impose a detectability condition on the system (1):




A1. The dispatcher is required to dispatch only one machine at a time.




A2. Detectability. If z_k → 0 as k → ∞, then x_k → 0 as k → ∞.




Assumption A2 is required to infer stability in measurement feedback control problems, and is weaker than observability. Assumption A1 implies that the system iterates only when a request for material is received by the dispatcher. At that time, the cost contributed by the machine is accounted for. This assumption makes sense in an environment where the dispatcher maintains a queue of requests it is being asked to service, and takes up one request at a time.




Consider the space of non-anticipating measurement feedback policies, i.e. policies based on observations of the outputs y, the buffer b, and the performance z alone (and not the system states x). The objective we are trying to achieve can now be stated as follows. Given a γ > 0, find a dispatch policy u* in this space such that the following two conditions are satisfied:




C1. If x_0 = 0 and b_0 ∈ Λ_s^m, then for any b ∈ B^s_{[0,∞)},

sup_{ w ∈ l^2_{[0,∞)}, η ∈ Ψ^s_{[0,∞)}(b, u*), ϑ(b, u*, η) ≠ 0 }  ∥z∥² / ( ∥w∥² + ∥ϑ(b, u*, η)∥² )  ≤  γ².   [2]













C2. If w = 0 and η ∈ N_φ, then for any x_0 ∈ ℝ^n we have x_k → 0 as k → ∞.




Condition C1 denotes that, starting from equilibrium and with sufficiently full buffers, the l_2 gain of the system from the disturbances to the regulated output is bounded by γ. This is also the nonlinear H∞ condition [9]. Condition C2 deals with system stability in the absence of any disturbances. The minimizing (sub-optimal) value of γ in C1 can be obtained via a bisection search.
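That bisection search can be sketched as follows. The predicate gain_bounded(gamma) is a hypothetical stand-in for the (expensive) check of whether condition C1 holds at a given γ; the sketch assumes the predicate is monotone, i.e. once it holds for some γ it holds for all larger γ.

```python
def bisect_gamma(gain_bounded, lo=0.0, hi=100.0, tol=1e-3):
    """Bisection for the smallest gamma for which the l2-gain
    condition C1 holds, assuming `gain_bounded` is monotone in gamma."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gain_bounded(mid):
            hi = mid      # condition holds: try a smaller gain bound
        else:
            lo = mid      # condition fails: gamma must be larger
    return hi

# Toy predicate: condition holds exactly for gamma >= 2.5.
g = bisect_gamma(lambda gamma: gamma >= 2.5)
assert abs(g - 2.5) < 1e-2
```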




One would have preferred to minimize the l∞ norm of the regulated output [2], but given that the buffers are typically unbounded, one ends up with minimizing the l∞ norm assuming (fictitious) bounds on the buffer. However, how one should impose such bounds is not very clear at this time. The l_2 solution, as we shall see, is surprisingly independent of the actual buffer contents.




The Multi-Machine Case




One aspect of the multi-machine case not found in the single machine case is contention for the finite buffer b. This induces coupling between the individual solutions. Given our impetus for deriving indexed policies, this is particularly disturbing. In this case the quantity of various lot types in the buffer does matter, since a specific machine could face starvation and, due to the non-idling requirement, be forced into a highly non-optimal allocation. An intuitive behavior that one would desire is to spread high running types across all machines, keeping them well sampled. For low running material it makes sense to focus on a limited number of machines. This ensures that sampling rates for these are reasonably high, and minimizes the risk of misprocessing due to running on a machine which has never run this material type before. The control policies can be designed to account for this on a limited basis. We say limited, since the dispatcher has no knowledge of the product mix, and the only information it gets regarding this mix is what it sees in its buffer. Thus a spurt in a typically low running material type could fill up the buffer, and the dispatcher has no option (due to non-idling requirements) but to spread this material out across multiple machines.




The multi-machine case will also involve ensuring that the l_2 gain criterion C1 and stability C2 are met. However, in this case we need to ensure that all machine-model combinations get sampled infinitely often for the stability requirement to make sense. One way to ensure this is to restrict b_k ∈ B_s such that b_k^(i) ≥ m, i = 1, …, s, where m is the number of machines. This is clearly a very strong condition and we will refrain from requiring it to hold. Instead we argue that one need only consider those machine-model combinations that are run infinitely often. The other machine-model combinations represent system transients and do not influence stability, or the l_2 gain, after a period of time.




Review of General Solution and Certainty Equivalence




The solution to the nonlinear l_2 gain problem involves reposing the original problem as a soft-constrained dynamic game:












inf_u  sup_{ η ∈ Ψ^s_{[0,∞)}(b, u), w ∈ l^2_{[0,∞)} }  { Σ_{k=0}^∞ ( |l(x_k, u_k, w_k)|² − γ²( |w_k|² + |ϑ(b_k, u_k, η_k)|² ) ) }.   [3]













At any time k > 0, let y_{1,k+1} be the measurement trajectory, b_{0,k} denote the buffer trajectory, and η_{0,k} be the (observable) trajectory of buffer disturbances. Let E* denote the space of functions

p : ℝ^n → ℝ ∪ {+∞}.

Then the information state p_k ∈ E* for the system (Σ) is defined as:











p_{k+1}(x, b) =

sup_{ξ ∈ ℝ^n} { p_k(ξ, b_k) + sup_{w ∈ ℝ^r} ( |l(ξ, u_k, w)|² − γ²( |w|² + |ϑ(b_k, u_k, η_k)|² ) : x = f(ξ, u_k, w), y_{k+1} = g(ξ, u_k, w) ) }  if x ∈ ℝ^n and b = b_k − u_k + η_k,

−∞  otherwise.   [4]













with p_0 ∈ E*. We can write [4] succinctly as a functional recursion

p_{k+1} = H(p_k, u_k, y_{k+1}, η_k).   [5]






Denote by ρ_b(p) the (b + η) component of p in [4]. Then the solution to the measurement feedback problem involves solving the following infinite dimensional dynamic programming equation










W(p) = inf_{u ∈ U_m(ρ_b(p))}  sup_{y ∈ ℝ^t}  sup_{η ∈ ℤ^s}  { W( H(p, u, y, η) ) }.   [6]













In particular, equation [6] with additional side conditions is both a necessary and sufficient condition for the solvability of the measurement feedback l_2 gain problem. The catch, of course, is that it is infinite dimensional in general.




This infinite dimensionality has motivated the search for output injection schemes based on certainty equivalence [8]. The certainty equivalence controller involves first solving the state feedback problem, and using the upper value function for this problem to approximate the upper value function for the output feedback dynamic programming equation [6]. The state feedback problem (i.e. the states x are also observed) involves the existence of a function V : ℝ^n × B_s → ℝ* with the following properties: (i) V(x,b) ≥ 0 for all x ∈ ℝ^n, b ∈ B_s, (ii) V(0,b) = 0 for all b ∈ Λ_s^m, and V satisfies










V(x, b) = inf_{u ∈ U_m(b)}  sup_{w ∈ ℝ^r, η ∈ ℤ^s}  { |l(x, u, w)|² − γ²( |w|² + |ϑ(b, u, η)|² ) + V( f(x, u, w), b + η − u ) }   [7]













with V(x,b) = −∞ for all b ∉ B_s. For any x ∈ ℝ^n and b ∈ B_s we define the state feedback policy u_F(x,b) to be the value of u that infimizes the right hand side in [7]. Then at any time k, given the information state p_k and V, one computes the estimate











{circumflex over (x)}_k ∈ arg max_{x ∈ ℝ^n} { p_k(x, b_k) + V(x, b_k) }   [8]













and uses the control action u_k = u_F({circumflex over (x)}_k, b_k). Conditions for the optimality and uniqueness of certainty equivalence controllers exist [1].




The state feedback problem [7] can be solved, for example, via the method of value iteration. This is typically done off-line, and the control policy and upper value function stored either via table look-up [15], linear approximations, or nonlinear functional approximation (e.g. neuro-dynamic programming [4]).
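For intuition, the following is a minimal value-iteration sketch on a finite toy abstraction of a state feedback problem. The discount factor 0.9, the two-point disturbance set {−1, 1}, and the toy dynamics are illustrative assumptions for convergence and brevity, not part of the formulation above; the stored argmin plays the role of the control map u_F.

```python
def value_iteration(states, actions, step, stage_cost, iters=200):
    """Min-max value iteration on a finite abstraction:
    V(x) = min_u max_w [ cost(x,u,w) + 0.9 * V(step(x,u,w)) ]
    is iterated to a fixed point; the disturbance w ranges over the
    toy set {-1, 1}. The argmin is stored as the control map u_F."""
    V = {x: 0.0 for x in states}
    for _ in range(iters):
        V = {x: min(max(stage_cost(x, u, w) + 0.9 * V[step(x, u, w)]
                        for w in (-1, 1))
                    for u in actions)
             for x in states}
    # Control map: best action at each state under the converged V.
    u_F = {x: min(actions,
                  key=lambda u: max(stage_cost(x, u, w) + 0.9 * V[step(x, u, w)]
                                    for w in (-1, 1)))
           for x in states}
    return V, u_F

# Toy quadratic targeting problem on a clamped integer state.
states = (-1, 0, 1)
clamp = lambda v: max(-1, min(1, v))
V, u_F = value_iteration(states, actions=(-1, 0, 1),
                         step=lambda x, u, w: clamp(x + u),
                         stage_cost=lambda x, u, w: (x + u) ** 2)
assert u_F[1] == -1 and u_F[-1] == 1   # control map drives the state to 0
```

In practice, as the text notes, V and u_F would be computed off-line and stored via table look-up or a functional approximation.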




It appears that one still has a computationally hard problem given the fact that vector b is countably infinite. However, we will show in the next section that the buffer adds very little to the complexity of the original state feedback problem. Furthermore, in the discussion of Output Injection below, we will show that for the types of problems under consideration, not only does the information state have a nice finite dimensional parameterization, but it can also be propagated efficiently.




Referring to FIG. 6, details of preferred model module 48 are illustrated. Preferred model module 48 includes a penalty function module 180, a switching cost adjustment module 182, and an optimized state estimate adder 184 for each model of the associated machine preference module 40. Preferred model module 48 also includes a max selector 186 for selecting the preferred model 52 from the penalized and adjusted optimized state estimate 165 for each preferred model 52 and associated preferred lot type 50. The input to the max selector 186 for the selected preferred model 52 is the upper value function for that machine/model combination.




For a particular model, penalty function module 180 obtains the last time that model was run from metrics history and switching cost data 44 and compares it to the current time 188 to determine the elapsed time since the last run of the particular model. This elapsed time is fed into a penalty function to determine a penalty value for penalizing the model for the time since the last run on that model.




Next, switching cost adjustment module 182 obtains the cost of switching from the previous model run on the associated machine 18 to the particular model being considered, and the switching cost is added to the penalty value determined in penalty function module 180. The penalty value and switching cost adjustment are added to the optimized state estimate 165 in optimized state estimate adder 184 to obtain a total model value 190. The resulting model value 190 from each model is forwarded to max selector 186, where the model with the maximum model value 190 is selected as the preferred model 52. In the exemplary embodiment, the max selector is implemented using theorem (3) as described in more detail below. Once the preferred model 52 is determined by max selector 186, the preferred lot type 50 associated with the preferred model 52 is coupled with the preferred model 52 to create a preferred model and lot type.
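The per-model scoring performed by penalty function module 180, switching cost adjustment module 182, and max selector 186 can be sketched as follows. All names, the linear penalty form, and the rate alpha are illustrative assumptions; the patented embodiment uses the penalty and switching-cost terms of theorem (3) below.

```python
def select_preferred_model(estimates, last_run, now, prev_model,
                           switch_cost, alpha=0.1):
    """Score each model as (optimized state estimate) plus an idle-time
    penalty minus the cost of switching from the previous model, then
    pick the maximum (the max-selector step)."""
    def score(model):
        idle_penalty = alpha * (now - last_run[model])
        return estimates[model] + idle_penalty - switch_cost[(prev_model, model)]
    return max(estimates, key=score)

est = {"modelA": 4.0, "modelB": 3.5}
last = {"modelA": 90, "modelB": 40}
costs = {("modelA", "modelA"): 0.0, ("modelA", "modelB"): 1.0}
# modelB's long idle time (penalty 6.0) outweighs the 1.0 switch cost.
assert select_preferred_model(est, last, now=100, prev_model="modelA",
                              switch_cost=costs) == "modelB"
```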




Following is an exemplary embodiment of preferred model module 48.




State Feedback Policies




The Review of General Solution subsection essentially laid out the solution to the exact problem under the most general conditions. This section focuses on state feedback policies u ∈ S, i.e. policies based on complete information at any time k regarding the system states x_k and the buffer contents b_k. The solution to the state feedback case is required to facilitate feedback via certainty equivalence. We place additional restrictions in order to simplify equation [7], and to ease selection of the next model to sample for the single machine case. We will employ the concept of uniform ordering to decouple the buffer in the case of multiple machines, and show that these policies indeed have the desirable property of reducing the number of machines sampled for low running model types. We first consider the single machine, single model case. In this situation, the problem is to determine which lot type belonging to the model should be run next. After considering this case, we extend it to the case where we are selecting across multiple models, though still with a single machine. The last subsection then extends the results to the multi-machine case. Note that the single machine-single model case is the basic computational block for our dynamic dispatch policies. The solutions to the other cases are obtained by optimally ordering the single machine-single model solutions.




The approach followed yields a hierarchical distributed architecture where (i) each machine decides what model and lot type it needs to run next, and (ii) the dispatcher looks at all the machine requests, and optimally resolves contention for a limited supply of lots.




Single Machine-Single Model




In this subsection, we derive the structure of the value function and optimal policies for the single machine-single model case. It turns out that this structure has tremendous computational implications. Since there is only one model, there are no issues with switching across models, or with not sampling a model enough. The problem facing us is to come up with an appropriate plan for sampling the model in order to maximize controller performance. Note that in this case we will have u_k ∈ U_1(b_k). We place additional assumptions on the form of the buffer penalty function ϑ.




A3. ϑ(b,u,η) = ϑ(b − u + η), i.e. we penalize η based on the next buffer state.




Lemma 1. For all b_1, b_2 ∈ B_s with b_1 ⪰ b_2, the solution V(x,b) to [7] satisfies

V(x, b_2) ≥ V(x, b_1),  ∀x ∈ ℝ^n.






Proof. From [7] we have

V(x, b_1) = inf_{u ∈ U_1(b_1)} sup_{w ∈ ℝ^r, η ∈ ℤ^s} { |l(x,u,w)|² − γ²( |w|² + |ϑ(b_1 − u + η)|² ) + V( f(x,u,w), b_1 − u + η ) }.

For any η define η* such that b_1 = b_2 − η + η*. Then

V(x, b_1) = inf_{u ∈ U_1(b_1)} sup_{w ∈ ℝ^r, η* ∈ ℤ^s} { |l(x,u,w)|² − γ²( |w|² + |ϑ(b_2 − u + η*)|² ) + V( f(x,u,w), b_2 − u + η* ) }

≤ inf_{u ∈ U_1(b_2)} sup_{w ∈ ℝ^r, η* ∈ ℤ^s} { |l(x,u,w)|² − γ²( |w|² + |ϑ(b_2 − u + η*)|² ) + V( f(x,u,w), b_2 − u + η* ) }

= V(x, b_2).


Corollary 1. For any b_1, b_2 ∈ Λ_s^m and for any x ∈ ℝ^n, V(x, b_1) = V(x, b_2).

Proof. We have b_1 ⪰ b_2 and b_2 ⪰ b_1. The proof follows from Lemma 1.




The following corollary has important implications on the structure of the optimal upper value function.




Corollary 2. For all α_1, α_2 ∈ ℤ_+ with α_1, α_2 ≠ 0, and for any i ∈ {1, …, s},

V(x, α_1 e_i^s) = V(x, α_2 e_i^s)

for any x ∈ ℝ^n.

Proof. Same as the proof of Corollary 1, noting that α_1 e_i^s ⪰ α_2 e_i^s and α_2 e_i^s ⪰ α_1 e_i^s.




Lemma 2. Let b = b_1 + b_2 for any b_1, b_2 ∈ B_s. Then for any x ∈ ℝ^n, the solution V(x,b) to equation [7] satisfies

V(x, b) = inf{ V(x, b_1), V(x, b_2) }.






Proof.

V(x, b) = inf_{u ∈ U_1(b)} sup_{w ∈ ℝ^r, η ∈ ℤ^s} { |l(x,u,w)|² − γ²( |w|² + |ϑ(b + η − u)|² ) + V( f(x,u,w), b + η − u ) }

= inf_{u ∈ U_1(b_1 + b_2)} sup_{w ∈ ℝ^r, η ∈ ℤ^s} { |l(x,u,w)|² − γ²( |w|² + |ϑ(b_1 + b_2 + η − u)|² ) + V( f(x,u,w), b_1 + b_2 + η − u ) }

= inf{ inf_{u ∈ U_1(b_1)} sup_{w ∈ ℝ^r, η ∈ ℤ^s} { |l(x,u,w)|² − γ²( |w|² + |ϑ(b_1 + b_2 + η − u)|² ) + V( f(x,u,w), b_1 + b_2 + η − u ) },

inf_{u ∈ U_1(b_2)} sup_{w ∈ ℝ^r, η ∈ ℤ^s} { |l(x,u,w)|² − γ²( |w|² + |ϑ(b_1 + b_2 + η − u)|² ) + V( f(x,u,w), b_1 + b_2 + η − u ) } }.

Thus, substituting η* with b_1 + η* = b_1 + b_2 + η on the first branch and {overscore (η)} with b_2 + {overscore (η)} = b_1 + b_2 + η on the second,

V(x, b) = inf{ inf_{u ∈ U_1(b_1)} sup_{w ∈ ℝ^r, η* ∈ ℤ^s} { |l(x,u,w)|² − γ²( |w|² + |ϑ(b_1 + η* − u)|² ) + V( f(x,u,w), b_1 + η* − u ) },

inf_{u ∈ U_1(b_2)} sup_{w ∈ ℝ^r, {overscore (η)} ∈ ℤ^s} { |l(x,u,w)|² − γ²( |w|² + |ϑ(b_2 + {overscore (η)} − u)|² ) + V( f(x,u,w), b_2 + {overscore (η)} − u ) } }

= inf{ V(x, b_1), V(x, b_2) }.















The following theorem is the main representation result for the value function in terms of the buffer (b).




Theorem 1.








Let b = Σ_{i=1}^s α_i e_i^s, α_i ∈ ℤ_+, i = 1, …, s. Then

V(x, b) = inf_{l ∈ {1, …, s}} { V(x, e_l^s) : α_l ≠ 0 }.












Proof. The proof follows via induction, and repeated application of Lemma 2 and Corollary 2. Let 2 of the elements of α be non-zero. Then b = α_1 e_1^s + α_2 e_2^s. By Lemma 2,

V(x, b) = inf_{l ∈ {1,2}} { V(x, α_l e_l^s) } = inf_{l ∈ {1,2}} { V(x, e_l^s) }

via Corollary 2.












Let this hold for the case where j − 1 entries in α are non-zero. For the case where j entries are non-zero we have

V(x, b) = inf{ V(x, α_j e_j^s), V(x, Σ_{i=1}^{j−1} α_i e_i^s) }   (via Lemma 2)

= inf{ V(x, e_j^s), inf_{i ∈ {1, …, j−1}} { V(x, e_i^s) } }   (via the induction hypothesis and Corollary 2)

= inf_{i ∈ {1, …, j}} { V(x, e_i^s) }.

The values of i for which α_i = 0 are excluded since e_i^s does not contribute to b.




The alert reader will notice that Theorem 1 gives a tremendous reduction in the computational complexity of solving [7]. Given that the buffer is s dimensional, one need not evaluate the upper value function for all possible (countably infinite) values of the buffer entries. One need only track values corresponding to specific lot types. The value of V(x,b) for any b can be obtained from this finite set via Theorem 1.
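That reduction can be sketched directly: given a hypothetical table basis_values holding V(x, e_l^s) for each lot type l at the current state x, V(x, b) for an arbitrary buffer b is the minimum over the types actually present in b.

```python
import math

def value_from_basis(basis_values, b):
    """Theorem 1: V(x, b) = inf over supported types l (b[l] != 0)
    of V(x, e_l^s). `basis_values[l]` is the stored value V(x, e_l^s)
    for the current state x; an empty buffer yields +infinity
    (the infimum over an empty set)."""
    supported = [basis_values[l] for l, count in enumerate(b) if count > 0]
    return min(supported) if supported else math.inf

basis = [3.0, 1.5, 2.2]          # V(x, e_1^s), V(x, e_2^s), V(x, e_3^s)
assert value_from_basis(basis, [4, 0, 7]) == 2.2   # types 1 and 3 present
assert value_from_basis(basis, [0, 9, 0]) == 1.5   # only type 2 present
```

Note the lot counts themselves never enter the computation, only their supports, which is exactly the buffer-independence the text emphasizes.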




We now consider two limiting cases of the buffer cost function ϑ. These two correspond to the cases where (i) the buffer incurs no cost, ϑ(b) ≡ 0, and (ii) the buffer incurs infinite cost, ϑ(b) = ∞, for all b for which b ⪰ {overscore (1)}_s fails (i.e. whenever some lot type is exhausted).

Theorem 2. Consider V(x,b), the solution to [7] under assumption A3.

i. Let ϑ(b) = 0 for all b ∈ B_s. Then V(x,b) satisfies

V(x, b) = inf_{u ∈ U_1(b)} sup_{w ∈ ℝ^r} { |l(x,u,w)|² − γ² |w|² + sup_{l = 1, …, s} { V( f(x,u,w), e_l^s ) } }.   [9]

ii. Let ϑ(b) = ∞ for all b for which b ⪰ {overscore (1)}_s fails. Then V(x,b) satisfies

V(x, b) = inf_{u ∈ U_1(b)} sup_{w ∈ ℝ^r} { |l(x,u,w)|² − γ² |w|² + inf_{l = 1, …, s} { V( f(x,u,w), e_l^s ) } }.   [10]

Equation [9] refers to the case where no buffer penalties are considered. This is the case most commonly found in practice. Equation [10] considers the other extreme, where infinite penalty is imposed on the buffer whenever it leaves Λ_s^1. Let {overscore (γ)} be the infimizing value of γ that ensures that V(0, b_0) = 0 for b_0 ∈ Λ_s^1 in [9], and let {underscore (γ)} be the infimizing value that achieves V(0, b_0) = 0 for b_0 ∈ Λ_s^1 in [10]. Then we have {overscore (γ)} ≥ {underscore (γ)}.




Single Machine-Multiple Models




We now consider the case where the machine could run one of M possible models at any given time. Given the current (at time k) buffer states {b_{i,k}}_{i=1}^M and current states {x_{i,k}}_{i=1}^M, let the individual upper value functions be denoted by {V_i(x_{i,k}, b_i)}_{i=1}^M. At any given time, assuming that we know the states, we will simply write these as V_i.

Where there is no ambiguity, we will drop the model subscript. In that case, it is assumed that the states and buffers in question are restricted to the model being considered.




For each model i consider the following function for any policy u ∈ S,

J_i^k(x, b; u) = sup_{ w ∈ l^2_{[k,∞)}, η ∈ Ψ^{s_i}_{[k,∞)}({circumflex over (b)}, u) } { a_k^i + Σ_{j=k}^∞ ( |l_i(x_j, u_j, w_j)|² − γ²( |w_j|² + |ϑ_i({circumflex over (b)}_j, u_j, η_j)|² ) ) : x_k = x, {circumflex over (b)}_k = b }.   [11]













Then by the dynamic programming principle, a_k^i + V_i(x, b) = inf_{u ∈ S} J_i^k(x, b; u). Hence for the multi-model case we uniformize the problem by asking that the policy u ∈ S solve

inf_{u ∈ S}  sup_{l ∈ {1, …, M}}  { J_l^k(x, b; u) }.   [12]


















Lemma 3. Given V_i(x,b), the upper value functions for the individual models, the optimal model j_k* to run next, given x_k and b_k, is

j_k* ∈ arg max_{i ∈ {1, …, M}} { a_k^i + V_i(x_k, b_k) }.

Proof. The result of the theorem follows from the dynamic programming principle, and the mutually exclusive nature of the models, i.e. the machine can only run one model at a time.




Switching Costs and Performance Degradation

We now discuss how switching costs and performance degradation on idle are incorporated into determining the next model to run. Given that model i incurred the last run and model j is chosen for the current run, model i incurs a switching cost c_ij ≥ 0. Also, c_ii = 0 for all models i ∈ {1, …, M}.

Note that the cost of switching across types within a model can be accommodated in l, and we shall not concern ourselves with this cost here.




We are also assuming that the states of inactive models are frozen, because no additional information regarding an inactive model is available since it was last run. However, in practice there will be performance degradation. The effect of this is to bump up the cost to go. We will denote this increase by a stage cost τ_i(k, k_i) ≥ 0, where k_i is the run index at which the switch from model i to another model j ≠ i took place. Also, τ_i(k_i, k_i) = 0. At a given time k, the total degradation for model i is denoted by:

{overscore (τ)}_i(k, k_i) = Σ_{h=k_i}^{k} τ_i(h, k_i)

Now let i be the model active in the last run. For any model j with j ≠ i we set a_k^j = {overscore (τ)}_j(k, k_j). For model i we set a_k^i = c_ij, where j is the model that is chosen in the current run.
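The degradation bookkeeping can be sketched numerically. The exponential stage-cost form used here is one illustrative choice (one such function is discussed later in this section); the closed form for the cumulative cost below is derived under that assumption and checked against the direct sum.

```python
import math

def stage_cost(k, k_i, alpha, delta):
    """Single-stage degradation cost: 0 up to and including the switch
    index k_i, then alpha * e^{delta (k - k_i)} (illustrative form)."""
    return 0.0 if k <= k_i else alpha * math.exp(delta * (k - k_i))

def cumulative_cost(k, k_i, alpha, delta):
    """Closed form of sum_{h=k_i}^{k} stage_cost(h, k_i), i.e. the
    total degradation tau-bar accumulated while the model sat idle."""
    if k < k_i:
        return 0.0
    if delta == 0:
        return alpha * (k - k_i)
    e = math.exp(delta)
    return alpha * e * (1 - math.exp(delta * (k - k_i))) / (1 - e)

# Closed form agrees with the direct sum of stage costs.
direct = sum(stage_cost(h, 10, alpha=0.5, delta=0.2) for h in range(10, 21))
assert abs(direct - cumulative_cost(20, 10, alpha=0.5, delta=0.2)) < 1e-9
```

With delta > 0 the cumulative cost grows without bound, pressing an idle model back into service; with delta < 0 it saturates, de-emphasizing long-dormant models.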




Theorem 3. Suppose model i was run at time k − 1. The optimal model to run at time k, given x_k and b_k, is

arg max_{i,j} { V_i(x_k, b_k) + c_ij, V_j(x_k, b_k) + {overscore (τ)}_j(k, k_j) }

= arg max_{i,j} { V_i(x_k, b_k), V_j(x_k, b_k) + {overscore (τ)}_j(k, k_j) − c_ij }

= arg max_{i,j} { V_i(x_k, b_k) + {overscore (τ)}_i(k, k_i) − c_ii, V_j(x_k, b_k) + {overscore (τ)}_j(k, k_j) − c_ij }

since {overscore (τ)}_i(k, k_i) = c_ii = 0.

Proof. Follows from Lemma 3 and the definition of a_k^j above: given that model i ran last, the preferred model is

j*_{k|i} ∈ arg max_{j ∈ {1, …, M}} { V_j(x_k, b_k) + {overscore (τ)}_j(k, k_j) − c_ij }.

Here we briefly discuss the form of τ_i(k, k_i). One such function is

τ_i(k, k_i) = 0 if k ≤ k_i, and τ_i(k, k_i) = α e^{δ(k − k_i)} otherwise, with α ∈ ℝ_+ and δ ∈ ℝ.

The factor α denotes the rate of degradation, and δ denotes the urgency one feels to run an idle model. For δ > 0, as k − k_i increases, the single stage cost τ_i(k, k_i) also increases, tending to ∞. For δ < 0, as k − k_i increases, the single stage cost τ_i(k, k_i) decreases, and tends to 0. Hence we have for the cumulative cost {overscore (τ)}_i(k, k_i):

{overscore (τ)}_i(k, k_i) = 0  if k < k_i,

{overscore (τ)}_i(k, k_i) = α (k − k_i)  if δ = 0 and k ≥ k_i,

{overscore (τ)}_i(k, k_i) = α e^δ ( 1 − e^{δ(k − k_i)} ) / ( 1 − e^δ )  if δ ≠ 0 and k ≥ k_i.

Multiple Machines-Multiple Models




We now come to the general case, where the workstation has multiple machines {1, . . . , m}, and each machine can run multiple models. Without loss of generality we will assume that each machine can run any of M models. This assumption is made only in order to simplify the notation. Note that unlike the single machine cases considered thus far, the event that a given model runs is no longer mutually exclusive, since the different machines could run different models, or contend for a limited supply of lots belonging to a single model. This results in coupling between the policies.




In what follows, we assume that there are sufficiently many lots available, so that if all the machines were to ask for material, there would be no idling. In addition, we assume that we are allowed to reserve lots to be run on machines that need to sample a given model the most. In case the number of lots falls below m, this could result in a machine being idled, since a lot is being held for another machine to run. A special case is that of hot (priority) lots, which are assumed to be an exception and are allocated to an idle machine. In case a priori information is available about the arrival of a hot lot, the appropriate machine could be idled and waiting for processing. There is also the issue of how one should run low volume models. The cleanest way of dealing with this is to ensure that such a model is only allowed to run on a limited number of machines. The dispatcher has no knowledge of the actual volume mix in the fab, since all it sees are its buffers. However, the policies derived here do ensure soft dedication.




Let the buffer at time k be given by b_k ∈ B^s. Let W_k^i(b_k) denote the upper value function, including degradation and switching costs, picked by machine i at time k, given a buffer b_k. W_k^i(b_k) is given by theorem 3,

W_k^i(b_k) = max over j_2 ∈ {1, . . . , M} of { V_{j_2}^i(x_k, b_k) + τ̄_{j_2}^i(k, k_{j_2}) − c_{j_1 j_2}^i : j_1 = model run at time k−1 }

where V^i, τ̄^i, and c^i indicate that these functions correspond to models for machine i. We are ignoring the states x_k in the expression of W_k^i, since although they do determine the value of W_k^i, they are assumed frozen during this assignment phase. Let u^i ∈ U_1(b_k) denote the lot type (and model) selected by machine i at time k from buffer b_k. The space of assignment policies for the multi-machine case is denoted by U_1^m(b_k) and is defined by

U_1^m(b_k) = { u ∈ {0, 1}^{s×m} : Σ_{t=1}^{s} u_{t,i} = 1, i = 1, . . . , m; Σ_{i=1}^{m} u_{j,i} ≦ b_k(j), j = 1, . . . , s }

The set U_1^m(b_k) has as its columns the vector u_{·,i} chosen for each machine. Clearly, each of these columns has to sum to 1, since each machine can only run 1 lot type at a time. Also, the sum across each row denotes the total number of lots of a specific type selected, and this cannot be greater than the number of lots of that specific type available in the buffer.


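To make the structure of U_1^m(b_k) concrete, the following sketch (an illustration, not part of the patent's embodiment; all names are invented) checks whether a candidate 0/1 assignment matrix u, with s lot-type rows and m machine columns, satisfies the two constraints above:

```python
def is_feasible_assignment(u, buffer_counts):
    """u: list of s rows, each a list of m 0/1 entries (u[j][i] = 1 iff
    machine i is assigned a lot of type j); buffer_counts: b_k(j) per type."""
    s = len(u)
    m = len(u[0]) if s else 0
    # Each machine (column) runs exactly one lot type at a time.
    for i in range(m):
        if sum(u[j][i] for j in range(s)) != 1:
            return False
    # Total lots of each type selected cannot exceed the buffer contents.
    for j in range(s):
        if sum(u[j]) > buffer_counts[j]:
            return False
    return True

# Two machines, three lot types, buffer holds [1, 2, 0] lots of each type.
u = [[1, 0], [0, 1], [0, 0]]
print(is_feasible_assignment(u, [1, 2, 0]))  # → True
```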


The aim of the assignment problem can now be stated as follows:




Find a permutation κ*(b_k) of the machines so that the following is achieved.

κ*(b_k) ∈ arg min over π(m) of max over i ∈ {1, . . . , m} of { W_k^{π_i(m)}(b_k − Σ_{j=1}^{i} u^{π_{j−1}(m)}) }  [13]


where u^{π_0(m)} = φ. Then u_k ∈ U_1^m(b_k) can be determined as follows. Start with κ*(b_k)(1) and assign that machine the lot of its choosing (say u^{κ*(b_k)(1)}). Now assign the lot chosen by machine κ*(b_k)(2), with the buffer value b_k − u^{κ*(b_k)(1)}. Let this lot be denoted by u^{κ*(b_k)(2)}. Next assign κ*(b_k)(3) the lot type of its choice given buffer b_k − u^{κ*(b_k)(1)} − u^{κ*(b_k)(2)}. Repeat this procedure until κ*(b_k)(m) has been assigned.


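The sequential procedure above can be sketched as follows. This is an illustrative reading of the text, not the patent's embodiment: `choose_lot` stands in for each machine's preferred lot-type selection u^i(b_k), and machines are visited in the order given by the permutation κ*.

```python
from collections import Counter

def sequential_assign(machine_order, choose_lot, buffer_counts):
    """Visit machines in machine_order; each picks a lot type from the
    remaining buffer via choose_lot(machine, buffer), which returns a lot
    type or None. The chosen lot is removed before the next machine picks."""
    remaining = Counter(buffer_counts)
    assignment = {}
    for machine in machine_order:
        lot_type = choose_lot(machine, remaining)
        if lot_type is not None and remaining[lot_type] > 0:
            assignment[machine] = lot_type
            remaining[lot_type] -= 1  # buffer b_k minus already-assigned lots
    return assignment

# Hypothetical preferences: each machine takes its first available choice.
prefs = {"m1": ["A", "B"], "m2": ["A", "C"]}
pick = lambda mach, buf: next((t for t in prefs[mach] if buf[t] > 0), None)
print(sequential_assign(["m1", "m2"], pick, {"A": 1, "C": 1}))
# → {'m1': 'A', 'm2': 'C'}
```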


Consider the following dynamic programming equation











Z_{m−1}(b_k, M_{m−1}) = min over i ∈ M_{m−1} of max{ W_k^i(b_k), Z_m(b_k − u^i(b_k), M_m \ i) }

= min over i_{m−1} ∈ M_{m−1} of max{ W_k^{i_{m−1}}(b_k), min over i_m ∈ M_{m−1} \ i_{m−1} of W_k^{i_m}(b_k − u^{i_{m−1}}(b_k)) }

= min over i_m, i_{m−1} ∈ M_{m−1} of max{ W_k^{i_{m−1}}(b_k), W_k^{i_m}(b_k − u^{i_{m−1}}(b_k)) }


Theorem 4. Suppose there exists a solution Z to the dynamic programming equation [14], and let i_j be the minimizing value of i on the RHS. Consider the sequence {Z_j(b_k − u^{i_j}, M_j)}_{j=1}^{m}. Then κ(b_k) = {i_1, . . . , i_m} is a solution to [13].




Proof. We prove this assertion via an induction argument. For j=m−1 one has











Z_j(b_k, M_j) = min over i_j ∈ M_j of max{ W_k^{i_j}(b_k), Z_{j+1}(b_k − u^{i_j}(b_k), M_j \ i_j) }

= min over i_j ∈ M_j of max{ W_k^{i_j}(b_k), min over i_{j+1}, . . . , i_m of max over q ∈ {i_{j+1}, . . . , i_m} of { W_k^q(b_k − Σ_{l=j}^{q−1} u^{i_l}) } }

= min over i_j, . . . , i_m ∈ M_j of max over q ∈ {i_j, . . . , i_m} of { W_k^q(b_k − Σ_{l=j}^{q−1} u^{i_l}) }.

This shows that an optimal 2-element sequence is picked with elements from b_k. Assume this holds for j+1. Then for j we have












Z_j(b, M_j) = min over i ∈ M_j of max{ W^i(b), Z_{j+1}(b − u^i(b), M_j \ i) }, j = 1, . . . , m−1

Z_m(b, M_m) = min over i ∈ M_m of W^i(b), b ∈ B^s, M_m ⊂ . . . ⊂ M_j ⊂ . . . ⊂ M_1 = {1, 2, . . . , m}.  [14]













In the general case the dynamic programming equation is in the class NP. However, it does substantially reduce the complexity of the original problem. To estimate the complexity note that we have to execute m iterations in [14]. The total steps (S_{m,b}) we have to execute are










S_{m,b} = Σ_{j=1}^{m} (m − j + 1) C_{m−j+1}^{m} b^m = Σ_{j=1}^{m} (m − j + 1) [m!/((m − j + 1)!(j − 1)!)] b^m


which in the loop corresponding to j=m/2 yields

[m!/((m/2)!(m/2)!)] b^m.











This shows that the complexity of solving [14] is at least O(2^m m).




Although the dynamic programming equation [14] gives the optimal allocation over all models, it is computationally intractable for real-time applications. One could argue that Z and its associated policy be computed off-line, and stored. However, given the possible number of machines and buffer sizes, the complexity would still be enormous, and the result would be extremely difficult to maintain, especially if additional machines or models are introduced on a regular basis.


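A direct (exponential-time) reading of recursion [14] can be sketched as follows; this is an illustration under invented names, with `W` standing for the per-machine upper value functions W_k^i and buffers represented as tuples of lot counts.

```python
def Z(buffer, machines, W, u):
    """Dynamic programming over machine subsets: pick the machine whose
    allocation minimizes the worst (max) cost among all machines served.
    W(i, buffer) -> cost for machine i; u(i, buffer) -> lot vector chosen."""
    best = float("inf")
    for i in machines:
        cost_i = W(i, buffer)
        rest = machines - {i}
        if rest:
            reduced = tuple(b - c for b, c in zip(buffer, u(i, buffer)))
            cost_i = max(cost_i, Z(reduced, rest, W, u))
        if cost_i < best:
            best = cost_i
    return best

# Toy instance: two machines, one lot type, cost grows as the buffer shrinks.
W = lambda i, buf: i + (2 - buf[0])          # invented cost model
u = lambda i, buf: (1,) if buf[0] > 0 else (0,)
print(Z((2,), {1, 2}, W, u))  # → 2
```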


Consider the case where the idling machine j is requesting a lot from model v_1, and we want to restrict allocation to machines requesting lots from v_1 alone. Let b_k^{v_1} denote the buffer b_k restricted to lots belonging to model v_1. We can start solving the dynamic programming recursion [14] by setting b_k = b_k^{v_1}. If the machine j is allocated we stop, else we next pick the model v_2 which is the model of choice for W_k^j(b_k − b_k^{v_1}), and repeat the allocation procedure on machines requesting model v_2. This is continued until machine j is assigned a lot. This substantially cuts down the complexity associated with solving [14]. Note however, that this is not an optimal solution by virtue of the fact that we preclude non-idle machines from switching to model v_1. Below we consider a special case, which under the assumption of uniformly ordered costs yields an optimal polynomial time allocation rule.




Single Model Allocation and Uniform Ordering




Let τ_v denote the set of types for a model v, with T_v = ⟨τ_v⟩. Suppose at time k, we have m_v machines that opt for a lot from model v, with their individual value functions denoted by W_k^i, i = 1, . . . , m_v. Let b_k^v denote the buffer b_k restricted to model v. We now define the concept of a uniform ordering amongst these value functions.




Let W_k^i(b_k^v) and W_k^j(b_k^v) denote the upper value functions for the 2 machines for model v. We say that these 2 costs W_k^i and W_k^j are uniformly ordered, if one of the following holds








W_k^i(b_k^v) ≦ W_k^j(b_k^v), ∀ b_k^v ∈ B^{t_v}, or W_k^i(b_k^v) ≧ W_k^j(b_k^v), ∀ b_k^v ∈ B^{t_v}.  [15]






One could justify this assumption on the basis that since all machines are requesting lot types in order to uniformly bound individual model costs [12], a machine will either be uniformly better than another one for a given model type, or be uniformly worse.




The following lemma captures general properties of the function W_k^i.



Lemma 4. Consider W_k^i(b_k), and assume that the optimal lot request corresponds to model v. Let b_k^v denote the buffer b_k restricted to model v, and assume that b_k^v has more than 1 lot available. Then the following inequalities follow:




i. W_k^i(b_k − u) ≧ W_k^i(b_k) for all u ∈ U_1(b_k^v)




ii. W_k^i(b_k − b_k^v) ≦ W_k^i(b_k)




The following lemma is then used to justify the indexed policy for the single-model case.




Lemma 5. Assume machines i and j are requesting an assignment for model v. Further assume that at least 2 lots are available from model v. Let μ_1 be









μ_1 ∈ arg max over μ ∈ {i, j} of { W_k^μ(b_k^v) }














and let μ_2 be the remaining machine index. Then, if W_k^i and W_k^j are uniformly ordered, we have




max{ W_k^{μ_1}(b_k^v), W_k^{μ_2}(b_k^v − u^{μ_1}(b_k^v)) } ≦ max{ W_k^{μ_2}(b_k^v), W_k^{μ_1}(b_k^v − u^{μ_2}(b_k^v)) }.




Proof. First note that by lemma 4








W_k^i(b_k^v) ≦ W_k^i(b_k^v − u), u ∈ U_1(b_k^v).






Consider the following cases:




i. If u^{μ_1}(b_k^v) ≠ u^{μ_2}(b_k^v) we have








W_k^{μ_2}(b_k^v − u^{μ_1}(b_k^v)) = W_k^{μ_2}(b_k^v) ≦ W_k^{μ_1}(b_k^v) = W_k^{μ_1}(b_k^v − u^{μ_2}(b_k^v)).






Hence the assertion of the lemma holds with equality.




ii. If u^{μ_1}(b_k^v) = u^{μ_2}(b_k^v) = ū, then the assertion of the lemma holds if








W_k^{μ_2}(b_k^v − u^{μ_1}(b_k^v)) = W_k^{μ_2}(b_k^v − ū) ≦ W_k^{μ_1}(b_k^v − u^{μ_2}(b_k^v)) = W_k^{μ_1}(b_k^v − ū).






But the latter follows from the uniformly ordered assertion.




The following corollary is an immediate consequence of lemma 5, following from a pairwise interchange argument commonly found in scheduling theory [14].




Corollary 3. Assume that machines 1, . . . , μ are requesting model v. Assume that there are at least μ lots available in model v. Let the machines be ordered such that







W_k^1(b_k^v) ≧ W_k^2(b_k^v) ≧ . . . ≧ W_k^μ(b_k^v).




If the costs are uniformly ordered, then the optimal lot allocation rule is to allocate machines in the sequence 1, 2, . . . , μ.




Proof. The proof of this assertion follows via an induction argument coupled with a pairwise interchange.




We now derive, in a manner similar to lemma 5 and corollary 3, the appropriate allocation scheme in case the buffer is starved for lots of the model requested. Specifically, we look at the case where the number of lots (ζ) of a model v is strictly less than the number of machines μ requesting lots from this model. An intuitive scheme would, for example, allocate machines based on corollary 3, except for the μ−ζ machines with the highest upper value functions W_k^i(b_k^v). We now make this intuition precise.




Lemma 6. Let W_k^1(b_k^v) ≦ W_k^2(b_k^v) ≦ W_k^3(b_k^v), and be uniformly ordered. Assume that these machines are each asking for lots from model v. Then







max{ W_k^2(b_k^v), W_k^1(b_k^v − u^2(b_k^v)) } ≦ max over i ∈ {1, 2} of { W_k^3(b_k^v), W_k^i(b_k^v − u^1(b_k^v)) }.












Proof. If u^2(b_k^v) ≠ u^3(b_k^v) then







W_k^1(b_k^v − u^2(b_k^v)) ≦ W_k^3(b_k^v − u^2(b_k^v)) = W_k^3(b_k^v).




Hence the lemma follows. In case u^2(b_k^v) = u^3(b_k^v) one has








W_k^1(b_k^v − u^2(b_k^v)) = W_k^1(b_k^v − u^3(b_k^v)) ≦ W_k^2(b_k^v − u^3(b_k^v)).






Since W_k^2(b_k^v) ≦ W_k^2(b_k^v − u^3(b_k^v)) via lemma 4, the result follows.




As in case of corollary 3, the following result follows via induction and pairwise interchange.




Corollary 4. Let W_k^1(b_k^v) ≦ W_k^2(b_k^v) ≦ . . . ≦ W_k^μ(b_k^v) be the sequence of uniformly ordered upper value functions. Let there be ζ < μ available lots in buffer b_k^v. Then the optimal allocation scheme is to allocate machines 1, 2, . . . , ζ using the ordered assignment in corollary 3, and to roll over the remaining μ−ζ machines into another model.




Having stated this, we are now in a position to present a general algorithm for solving the allocation problem [13].




Following is an exemplary embodiment of arbitrator




Polynomial Time Allocation Algorithm




Suppose machine i requests a lot from model v. Then clearly, W_k^i(b_k^v) = W_k^i(b_k). We will thus denote the buffer argument by restricting it to the specific model being requested. Also, let ζ_v denote the number of lots in buffer b_k^v for model v. The algorithm can now be stated as follows:




Algorithm A




S1. Sort each model v=1, . . . , M such that











W_k^{1,1}(b_k^1) ≦ W_k^{2,1}(b_k^1) ≦ . . . ≦ W_k^{ζ_1,1}(b_k^1)

W_k^{1,2}(b_k^2) ≦ W_k^{2,2}(b_k^2) ≦ . . . ≦ W_k^{ζ_2,2}(b_k^2)

⋮

W_k^{1,M}(b_k^M) ≦ W_k^{2,M}(b_k^M) ≦ . . . ≦ W_k^{ζ_M,M}(b_k^M)














S2. If each model specific buffer b_k^v has ζ_v ≧ m_v lots, assign them according to corollary 3. Go to step S6.




S3. Assign all models v such that ζ_v < m_v via corollary 4. Keep track of the max cost for the machines assigned {1_v, . . . , ζ_v}.




S4. Sort the buffers assigned in S3 by their max costs. Consider the model (or models) with the minimal max cost (denoted by v*). Freeze all allocations for this model (v*) and roll over unassigned machines into their next model and sort these models. Delete the machines assigned to model v* from the list of available machines. Delete the model v* from the list of models.




S5. If all machines are allocated or all buffers are empty, go to S6. Else, go to S2.




S6. Stop.


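Steps S1 through S6 can be sketched in code as follows. This is an illustrative rendering under the uniformly-ordered assumption, with invented names: each model's requesters come with their upper value functions W_k^i(b_k^v), and `lots[v]` plays the role of ζ_v. The roll-over of unserved machines into their next-choice model is elided in this sketch.

```python
def algorithm_a(requests, lots):
    """requests: {model: [(machine, cost), ...]} for the machines asking for
    each model; lots: {model: available lot count}. Returns machine -> model.
    Starved models are settled lowest-max-cost first (steps S3/S4)."""
    assigned = {}
    # S1: sort each model's requesters by increasing upper value function.
    pending = {v: sorted(r, key=lambda mc: mc[1]) for v, r in requests.items()}
    while pending:
        # S2/S3: serve up to lots[v] cheapest machines per model (cor. 3/4).
        served = {v: r[: lots[v]] for v, r in pending.items() if r[: lots[v]]}
        if not served:
            break  # S5: all buffers empty
        # S4: freeze the model whose served machines have the minimal max cost.
        v_star = min(served, key=lambda v: max(c for _, c in served[v]))
        for machine, _ in served[v_star]:
            assigned[machine] = v_star
        del pending[v_star]
        # Delete machines assigned to v_star from the other request lists.
        for v in pending:
            pending[v] = [mc for mc in pending[v] if mc[0] not in assigned]
    return assigned  # S6

reqs = {"A": [("m1", 1.0), ("m2", 3.0)], "B": [("m2", 2.0)]}
print(algorithm_a(reqs, {"A": 1, "B": 1}))  # → {'m1': 'A', 'm2': 'B'}
```

Here model A is starved (one lot, two requesters), so its cheapest machine m1 is frozen first and m2 is served from its other model B.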


The algorithm presented above yields an optimal allocation under the uniformly ordered assumption. Furthermore, this algorithm has polynomial time complexity. Observe that at each iteration i (assuming only 1 machine got assigned in the previous one), the algorithm requires m−i+1 assignments in S3, sorting of M−i+1 models, and sorting of m−i machines in step S4. This repeats for m iterations until all the machines are exhausted. Hence the number of operations S_{m,M} is given by







S_{m,M} = Σ_{i=1}^{m} [(M − i + 1) log_2(M − i) + (m − i) log_2(m − i)]














and hence the complexity of the algorithm is O(mM log_2(M) + m^2 log_2(m)).




The complexity reduction from [14] is tremendous. Whereas the original dynamic programming problem had exponential complexity, the algorithm presented has only polynomial complexity.




Theorem 5. Assume that all the value functions W_k^i, i = 1, . . . , m are uniformly ordered. Then Algorithm A yields an optimal allocation and solves [13].




Proof. If all models have enough lots to satisfy requests, we have an optimal allocation via corollary 3. In case there is a model M_1 which cannot satisfy all requests, then by corollary 4 it makes sense to roll over those machines with the highest cost into another model. If these get rolled into M_2 then we need to make sure (as the algorithm does) that lots in M_2 are assigned after the machines are rolled over. If two models need their machines rolled over, then it again makes sense to allocate the lower cost model first, since by not doing so we have increased the lower bound on the achievable cost. This is exactly the sequence of actions taken by the algorithm.




The algorithm also makes clear an interesting property concerning infrequently run (low running) models. If the device types in a model are not run very frequently, one expects that there will be several machines requesting allocation from this model due to their increasing idling penalties (τ). However, if a lot does show up, it gets allocated to the machine with the lowest cost. This machine then will run this model over a period of time. Hence, in case of low runners the allocation scheme introduces a natural soft dedication of machines. In case of multiple low running models, the machines chosen to run these will be picked according to the device mix seen by the dispatcher, which will try to maintain a sufficiently high sampling rate for all machines that run these models.




Following is an exemplary embodiment of state estimator 160.




Output Injection




In the previous section it was assumed that the state x_k was known while computing the upper value function V_j(x_k, b_k). We now consider the case where this state information is not directly observable. As mentioned in subsection 2.3 we need to maintain and propagate an information state [4], [5]. Given this information state, we can obtain an estimate of the state x_k employing certainty equivalence [8]. In contrast to the upper value function V_j which can be computed off-line and stored either in a table look-up, or via functional approximation (e.g. neural networks), the information state has to be propagated on-line. This makes the complexity of computing [4] critical. Below we present a system which is not only structured to suit applications in semiconductor manufacturing, but also yields a computationally efficient information state recursion.




It is clear that we need only propagate the information state, and estimate the state of the model that was run. The other models are assumed frozen, and all they incur is an idling penalty (τ).




We will therefore focus our attention on a single model. The resulting recursion can be applied to any model that is run.




Information State Recursion




Consider the following system, which represents the behavior of a single model, and is a special case of [1]:











θ_{k+1} = K(χ_k, u_k) θ_k + E(χ_k, u_k) v_k

χ_{k+1} = f(χ_k, y_{k+1}, u_k)

y_{k+1} = g(χ_k, u_k)^T θ_k + G(χ_k, u_k) w_k

z_{k+1} = l(y_{k+1}, u_k, χ_k)

b_{k+1} = b_k − u_k + n_k, k = 0, 1, . . .  [16]













Here θ_k ∈ ℝ^{n_1} and χ_k ∈ ℝ^{n_2} represent components of the state x_k. Also, v_k ∈ ℝ^{r_1} and w_k ∈ ℝ^{r_2} represent the disturbances. The dimensions of the other quantities are as in [1]. It is assumed that we are able to directly observe χ_k. We however cannot observe θ_k. In this set-up, χ_k represents the states of a supervisory controller that has been implemented on the process, with the function f: ℝ^{n_2} × ℝ^{t} × U(B^s) → ℝ^{n_2} denoting the controller dynamics. The matrix maps








K: ℝ^{n_2} × U(B^s) → ℝ^{n_1 × n_1} and

E: ℝ^{n_2} × U(B^s) → ℝ^{n_1 × r_1}

denote the controller state (and hence process setting) and lot type driven filtering on the process drift.




These can be used to loop-shape the closed-loop response of the dispatcher. The matrix valued function

g: ℝ^{n_2} × U(B^s) → ℝ^{n_1 × t}

represents a function relating the controller states (and hence the process setting), and the lot type to inputs into a regression fit parameterized by θ_k. This encompasses not only multi-output linear models, but also multi-output nonlinear models. The matrix map








G: ℝ^{n_2} × U(B^s) → ℝ^{t × r_2}

is used to capture the relative magnitudes of both the modeling error and measurement noise on each of the outputs. Lastly the function








l: ℝ^{t} × U(B^s) × ℝ^{n_2} → ℝ^{m}

denotes the vector of quantities we want to regulate and could include the error from target (via y_{k+1} and u_k), as well as the cost of making changes in the process settings (via χ_k).




We note that the system represented in [16] has greater generality than those found in typical supervisory control applications. For example, in case of single output plants where time is the controlled variable (e.g. deposition or etch), and a controller based on the EWMA filter [16] is employed, χ_k, y_k, and z_k are all scalars. Furthermore, in this case, K(χ_k, u_k) = E(χ_k, u_k) = G(χ_k, u_k) = 1. Also, g(χ_k, u_k) = [1 (Fu_k − χ_k)/M]^t, where F = [T_1 T_2 . . . T_s] is a vector of targets, and M is a constant used to represent θ_k^{(2)}. In this case χ_k is used to track θ_k^{(1)}. Also l(y_{k+1}, u_k, χ_k) = y_{k+1} − Fu_k, the error from target of the measured process output. The controller dynamics in the EWMA case are






f(χ_k, y_{k+1}, u_k) = χ_k + λ(y_{k+1} − Fu_k).
).



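The EWMA special case can be computed directly. The sketch below is illustrative only (λ, the target vector F, and the sample values are invented, not taken from the patent): the controller state χ_k absorbs a fraction λ of the observed error from target.

```python
def ewma_step(chi, y_next, u, F, lam):
    """One EWMA controller update: chi absorbs a fraction lam of the
    observed error from target, y_{k+1} - F u_k."""
    target = sum(f * ui for f, ui in zip(F, u))  # F u_k, the lot's target
    return chi + lam * (y_next - target)

# Lot of type 2 (target 10.0); measured output 12.0; lambda = 0.5.
chi = ewma_step(chi=0.0, y_next=12.0, u=[0, 1], F=[8.0, 10.0], lam=0.5)
print(chi)  # → 1.0
```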



The system [16] also encompasses the case where one has a human in the loop (e.g. no supervisory control). In this case the function f would be such that χ_k would be constant, with its value changing only on manual updates.




Since the only unmeasurable quantity in [16] is θ_k, we need only set up the information state for this variable. The rest of the variables can be observed, e.g. the controller clearly knows its own state χ_k, and we need not estimate them. The following result presents the finite dimensional form of the information state, and the recursive update equations.




Theorem 6. Consider the nonlinear system [16], and assume that E(χ_k, u_k) and G(χ_k, u_k) have full row rank. Furthermore, assume that K(χ_k, u_k) has full rank. Then given γ>0, the information state for this system is finite dimensional, and has the form







p_k(θ) = (θ − θ̄_k)^T P_k (θ − θ̄_k) + c_k, P_k = P_k^T ∈ ℝ^{n_1 × n_1}, θ̄_k ∈ ℝ^{n_1}, k = 0, 1, . . .

with P_0 symmetric and strictly negative definite, c_0 = 0, and θ̄_0 ∈ ℝ^{n_1}. Furthermore, P_k and θ̄_k
are updated recursively by the following equations:








P_{k+1} = γ^2 Ē(χ_k, u_k)[γ^2 K(χ_k, u_k) S_k^{−1} K(χ_k, u_k)^T − E(χ_k, u_k) E(χ_k, u_k)^T] Ē(χ_k, u_k)

θ̄_{k+1} = E(χ_k, u_k) E(χ_k, u_k)^T (γ^2 K(χ_k, u_k) S_k^{−1} K(χ_k, u_k)^T − E(χ_k, u_k) E(χ_k, u_k)^T)^{−1} K(χ_k, u_k) S_k^{−1} (P_k θ̄_k − γ^2 g(χ_k, u_k) G̅(χ_k, u_k) y_{k+1})  [17]






where








S_k = −P_k + γ^2 g(χ_k, u_k) G̅(χ_k, u_k) g(χ_k, u_k)^T − γ^2 K(χ_k, u_k)^T Ē(χ_k, u_k) K(χ_k, u_k)

Ē(χ_k, u_k) = (E(χ_k, u_k) E(χ_k, u_k)^T)^{−1}; G̅(χ_k, u_k) = (G(χ_k, u_k) G(χ_k, u_k)^T)^{−1}

with S_k > 0, k = 0, 1, . . .




We intentionally do not consider the recursion for c_k since it is independent of θ and hence does not influence the certainty equivalence estimate θ̄_k.




Proof. Given the structure of the system [16], we can write








p_{k+1}(θ) = sup over ξ ∈ ℝ^{n_1} of { p_k(ξ) + l(y_{k+1}, u_k, χ_k)^T l(y_{k+1}, u_k, χ_k) − γ^2 |G(χ_k, u_k)^T G̅(χ_k, u_k)(y_{k+1} − g(χ_k, u_k)^T ξ)|^2 − γ^2 |E(χ_k, u_k)^T Ē(χ_k, u_k)(θ − K(χ_k, u_k)ξ)|^2 }.






Note that the RHS is at most quadratic in θ yielding the form of p


k


postulated in the theorem. Hence let p


k


(θ)=(θ−{overscore (θ)}


k


)


T




P




k


(θ−{overscore (θ)}


k


)+c


k


. Substituting into the equation above this yields






p_{k+1}(θ) = sup over ξ ∈ ℝ^{n_1} of { (ξ − θ̄_k)^T P_k (ξ − θ̄_k) + c_k + l(y_{k+1}, u_k, χ_k)^T l(y_{k+1}, u_k, χ_k) − γ^2 |G(χ_k, u_k)^T G̅(χ_k, u_k)(y_{k+1} − g(χ_k, u_k)^T ξ)|^2 − γ^2 |E(χ_k, u_k)^T Ē(χ_k, u_k)(θ − K(χ_k, u_k)ξ)|^2 }.






Collecting together the quadratic and linear terms in ξ this equals








p_{k+1}(θ) = sup over ξ ∈ ℝ^{n_1} of { −ξ^T S_k ξ + ξ^T R_k + R_k^T ξ + l(y_{k+1}, u_k, χ_k)^T l(y_{k+1}, u_k, χ_k) − γ^2 y_{k+1}^T G̅(χ_k, u_k) y_{k+1} − γ^2 θ^T Ē(χ_k, u_k) θ + c_k }  [18]






where








S_k = −P_k + γ^2 g(χ_k, u_k) G̅(χ_k, u_k) g(χ_k, u_k)^T − γ^2 K(χ_k, u_k)^T Ē(χ_k, u_k) K(χ_k, u_k)

R_k = −P_k θ̄_k + γ^2 g(χ_k, u_k) G̅(χ_k, u_k) y_{k+1} + γ^2 K(χ_k, u_k)^T Ē(χ_k, u_k) θ.






Maximizing [18] with respect to ξ yields







p_{k+1}(θ) = R_k^T S_k^{−1} R_k − γ^2 θ^T Ē(χ_k, u_k) θ + c_k + θ̄_k^T P_k θ̄_k − γ^2 y_{k+1}^T G̅(χ_k, u_k) y_{k+1} + l(y_{k+1}, u_k, χ_k)^T l(y_{k+1}, u_k, χ_k).






Collecting together like powers of θ yields











p_{k+1}(θ) = θ^T (γ^4 Ē(χ_k, u_k) K(χ_k, u_k) S_k^{−1} K(χ_k, u_k)^T Ē(χ_k, u_k) − γ^2 Ē(χ_k, u_k)) θ + γ^2 (−P_k θ̄_k + γ^2 g(χ_k, u_k) G̅(χ_k, u_k) y_{k+1})^T S_k^{−1} K(χ_k, u_k)^T Ē(χ_k, u_k) θ + γ^2 θ^T Ē(χ_k, u_k) K(χ_k, u_k) S_k^{−1} (−P_k θ̄_k + γ^2 g(χ_k, u_k) G̅(χ_k, u_k) y_{k+1}) + constant terms

= (θ − θ̄_{k+1})^T P_{k+1} (θ − θ̄_{k+1}) + c_{k+1}

= θ^T P_{k+1} θ − θ^T P_{k+1} θ̄_{k+1} − θ̄_{k+1}^T P_{k+1} θ + constant terms.
















From this it follows that








P_{k+1} = γ^2 Ē(χ_k, u_k)(γ^2 K(χ_k, u_k) S_k^{−1} K(χ_k, u_k)^T Ē(χ_k, u_k) − I) = γ^2 Ē(χ_k, u_k)(γ^2 K(χ_k, u_k) S_k^{−1} K(χ_k, u_k)^T − E(χ_k, u_k) E(χ_k, u_k)^T) Ē(χ_k, u_k)

θ̄_{k+1} = −P_{k+1}^{−1} γ^2 Ē(χ_k, u_k) K(χ_k, u_k) S_k^{−1} (P_k θ̄_k − γ^2 g(χ_k, u_k) G̅(χ_k, u_k) y_{k+1}) = (Ē(χ_k, u_k)[γ^2 K(χ_k, u_k) S_k^{−1} K(χ_k, u_k)^T Ē(χ_k, u_k) − I])^{−1} Ē(χ_k, u_k) K(χ_k, u_k) S_k^{−1} (P_k θ̄_k − γ^2 g(χ_k, u_k) G̅(χ_k, u_k) y_{k+1}) = E(χ_k, u_k) E(χ_k, u_k)^T (γ^2 K(χ_k, u_k) S_k^{−1} K(χ_k, u_k)^T − E(χ_k, u_k) E(χ_k, u_k)^T)^{−1} K(χ_k, u_k) S_k^{−1} (P_k θ̄_k − γ^2 g(χ_k, u_k) G̅(χ_k, u_k) y_{k+1}).






We need S_k to be positive definite for the maximum to exist in [18]. This is guaranteed provided that P_k is negative definite. This is true provided






γ^2 K(χ_k, u_k) S_k^{−1} K(χ_k, u_k)^T − E(χ_k, u_k) E(χ_k, u_k)^T < 0






or






γ^2 K(χ_k, u_k) S_k^{−1} K(χ_k, u_k)^T < E(χ_k, u_k) E(χ_k, u_k)^T.






This is equivalent to (using the full rank assumption on K(χ_k, u_k))






γ^2 < K(χ_k, u_k)^{−1} E(χ_k, u_k) E(χ_k, u_k)^T K(χ_k, u_k)^{−T} [−P_k + γ^2 g(χ_k, u_k) G̅(χ_k, u_k) g(χ_k, u_k)^T − γ^2 K(χ_k, u_k)^T Ē(χ_k, u_k) K(χ_k, u_k)].






This implies
$$
\gamma^{2}g(x_k,u_k)\bar{G}(x_k,u_k)g(x_k,u_k)^{T} - P_k > 0,
$$






which holds for any γ>0.
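In the scalar case this final condition is easy to sanity-check numerically. The sketch below is illustrative only; the specific values chosen for g, Ḡ, and P_k are hypothetical stand-ins, not values from the patent. It confirms that γ²gḠg − P_k stays positive for every γ > 0 whenever P_k < 0 (the scalar analogue of P_k negative definite).

```python
def s_condition(gamma: float, g: float, G_bar: float, P_k: float) -> float:
    """Scalar form of gamma^2 * g * G_bar * g^T - P_k, which must be > 0."""
    return gamma**2 * g * G_bar * g - P_k

# With P_k < 0, the -P_k term alone keeps the expression positive,
# so the condition holds for any gamma > 0.
for gamma in (0.1, 1.0, 10.0):
    assert s_condition(gamma, g=2.0, G_bar=0.5, P_k=-1.0) > 0
```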




Theorem 6 establishes the recursions for propagating the information states and also shows that they are finite dimensional. This makes the information state recursion [5] amenable to real-time computation. In fact, it also makes the dynamic programming equation [6] finite dimensional, which raises the possibility of solving [6] directly and bypassing certainty equivalence altogether. Note, however, that if the system is n dimensional, the dimension of the information state is $O(n^2)$, which could still pose significant numerical difficulties. Certainty equivalence helps alleviate this additional complexity, especially in light of Theorem 1.
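The $O(n^2)$ growth can be made concrete with a parameter count. Assuming (as is standard for finite-dimensional information states of this kind, though the exact parameterization here is an illustrative assumption) a quadratic information state of the form $p(x) = x^{T}Px + 2\bar{\theta}^{T}x + \phi$ with $P$ symmetric, the state is determined by $n(n+1)/2 + n + 1$ numbers:

```python
def info_state_params(n: int) -> int:
    """Parameter count of a quadratic information state
    x^T P x + 2 theta^T x + phi for an n-dimensional system:
    symmetric n x n matrix P, length-n vector theta, scalar phi."""
    return n * (n + 1) // 2 + n + 1

assert info_state_params(1) == 3    # P: 1, theta: 1, phi: 1
assert info_state_params(10) == 66  # 55 + 10 + 1, i.e. O(n^2) growth
```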




Thus, it is apparent that there has been provided, in accordance with the present invention, a closed-loop dynamic dispatch policy for optimizing process performance that satisfies the advantages set forth above. Those advantages include minimizing the influence of process and inventory disturbances on process performance and minimizing idle time for machines. Although the present invention and its advantages have been described in detail, it should be understood that various changes, substitutions, and alterations may be readily apparent to those skilled in the art and may be made without departing from the spirit and the scope of the present invention as defined by the following claims.
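A greatly simplified sketch of such a dispatch policy is given below. The function name, cost model, and greedy assignment strategy are hypothetical illustrations, not the patented method: each unallocated machine is simply given the available lot with the lowest switching cost, standing in for the cost-based selection and conflict resolution described above.

```python
def dispatch(machines, lots, cost):
    """Illustrative greedy dispatcher: each machine, in turn, takes the
    available lot that minimizes its switching cost. `cost(machine, lot)`
    is a hypothetical scalar cost of switching the machine to the lot's
    model and lot type."""
    assignments = {}
    available = list(lots)
    for m in machines:
        if not available:
            break  # no lots left to assign; remaining machines stay idle
        best = min(available, key=lambda lot: cost(m, lot))
        assignments[m] = best
        available.remove(best)  # each lot is dispatched to only one machine
    return assignments

# Example: machine A prefers lot l1, leaving l2 for machine B.
costs = {('A', 'l1'): 1, ('A', 'l2'): 3, ('B', 'l1'): 2, ('B', 'l2'): 1}
result = dispatch(['A', 'B'], ['l1', 'l2'], lambda m, lot: costs[(m, lot)])
assert result == {'A': 'l1', 'B': 'l2'}
```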



Claims
  • 1. A method for dispatching available lots to unallocated machines, comprising:
    receiving metrics data providing performance measurements for a plurality of machines;
    determining a state for each of the machines based on the metrics data;
    receiving one or more lots to be dispatched, each lot having a lot type and each lot type associated with one of a plurality of models;
    selecting a preferred lot type for each of the plurality of models associated with each of the machines based on the state of the machine;
    selecting a preferred model based on a time since a last run of the model, a cost of switching to a new model and lot type, and the state of the machine;
    resolving conflicts between selected preferred lot type/preferred model combinations when insufficient lots are available to fill the selections; and
    assigning each lot to one of the machines according to the preferred model and preferred lot type selections.
  • 2. The method of claim 1, further comprising determining a time since the last run of the model.
  • 3. The method of claim 1, further comprising determining the cost of switching from a current model and lot type to the new model and lot type.
  • 4. The method of claim 1, wherein resolving conflicts includes determining which machine of several machines will be assigned a lot based on a minimum maximum cost function.
  • 5. A method for selecting a semiconductor lot for a fabrication process, comprising:
    storing process history for a fabrication process, the historic information including lot types previously run by the process;
    receiving and storing process metrics for lot types run by the fabrication process;
    determining a cost of switching for available lot types based on the historic information including process history and process metrics; and
    selecting from the available lot types a next lot type to be run by the fabrication process based on the cost of switching.
  • 6. The method of claim 5, wherein selecting from the available lot types a next lot type to be run includes:
    each available lot type being associated with one of a plurality of models;
    selecting a preferred lot type for each of the plurality of models from the available lot types based on a state of the fabrication process; and
    selecting a preferred model from the plurality of models associated with the preferred lot types based on the historic information and the cost of switching resulting in a preferred model and lot type for the fabrication process.
  • 7. A system for performance based dispatch of lots in a closed loop dispatch system, comprising:
    one or more processes for performing functions within the closed-loop dispatch system wherein each process has one or more associated models for each function performed by the process, and wherein each model has one or more lot types defining specific details of the associated function;
    a performance metrics database for collecting performance metrics data on the one or more processes;
    a buffer for dynamically storing one or more lots available for dispatch to the one or more processes, each lot having a lot type;
    each process operable to determine a preferred lot type for each model from the lot types represented in the buffer, the determination based on performance metrics data in the performance metrics database;
    each process operable to determine a preferred model from the models associated with the preferred lot types, the determination made based on performance metrics data in the performance metrics database and a cost of switching to a new process; and
    an arbitrator for assigning a lot of a preferred lot type and preferred model to each process, the arbitrator further operable to resolve conflicts between processes requesting the same preferred lot type and preferred model when insufficient lots of the selected preferred lot type and preferred model are available in the buffer, the arbitrator resolving conflicts by selecting a number of requesting processes equal to the number of lots of a particular lot type available and assigning the available lots to the selected process.
  • 8. A system for dispatching lots to be run on a machine, comprising:
    one or more available lots to be processed by a work cell, each lot categorized by lot type, each lot type associated with a model;
    a plurality of machines operable to process the available lots; and
    a dispatcher operable to allocate each of the available lots to one of the machines, the dispatcher further operable to receive machine metrics and downstream metrics, the dispatcher further operable to base its allocation of the available lots on the machine metrics, the downstream metrics, and a cost associated with switching from the previous model and lot type to a new model and lot type associated with the allocated lot.
Parent Case Info

This application claims priority under 35 USC §119(e)(1) of provisional application No. 60/179,649 filed Feb. 2, 2000.

US Referenced Citations (14)
Number Name Date Kind
4195041 Kawamura et al. Mar 1980 A
4866628 Natarajan Sep 1989 A
5260868 Gupta et al. Nov 1993 A
5291397 Powell Mar 1994 A
5375061 Hara et al. Dec 1994 A
5559710 Shahraray et al. Sep 1996 A
5668733 Morimoto et al. Sep 1997 A
5706200 Kumar et al. Jan 1998 A
5721686 Shahraray et al. Feb 1998 A
5737728 Sisley et al. Apr 1998 A
5818716 Chin et al. Oct 1998 A
5826238 Chen et al. Oct 1998 A
5841677 Yang et al. Nov 1998 A
6351686 Iwasaki et al. Feb 2002 B1
Non-Patent Literature Citations (15)
Entry
Reduced Complexity Nonlinear H∞ Controllers: Relation to Certainty Equivalence. In Proceedings of the 13th IFAC World Congress, vol. E, pp. 383-387, J.S. Baras and N.S. Patel, 1996.
Robust Control of Set-Valued Discrete Time Dynamical Systems. IEEE Transactions on Automatic Control, 43(1):61-75, J.S. Baras and N.S. Patel, 1998.
Neuro-Dynamic Programming. Athena Scientific, Belmont, MA, D.P. Bertsekas and J.N. Tsitsiklis, 1996.
Scheduling Semiconductor Lines Using a Fluid Network Model. IEEE Transactions on Robotics and Automation, 10(2):88-98, D. Connors, G. Feigin, and D. Yao, 1994.
Multi-armed Bandit Allocation Indices, J. C. Gittins, 1989.
A Dynamic Allocation Index for the Sequential Design of Experiments. In J. Gani, editor, Progress in Statistics, pp. 241-266, J. C. Gittins and D.M. Jones, 1974.
On the Certainty Equivalence Principle for Partially Observed Dynamic Games. IEEE Transactions on Automatic Control, 39(11):2321-2324, M.R. James, 1994.
Recent Developments in Nonlinear H∞ Control. In Preprints of IFAC Nonlinear Control Design Symposium (NOLCOS'95), pp 578-589; M.R. James, 1995.
Scheduling Semiconductor Manufacturing Plants. IEEE Control Systems, pp 33-40; P.R. Kumar, 1994.
Stochastic Systems: Estimation, Identification, and Adaptive Control; by P.R. Kumar and P. Varaiya, 1986.
Efficient Scheduling Policies to Reduce Mean and Variance of Cycle-Time in Semiconductor Manufacturing Plants. IEEE Transactions on Semiconductor Manufacturing, 7(3):374-388, S.C.H. Lu, D. Ramaswamy, and P.R. Kumar, 1994.
Impact of Multi-Product and -Process Manufacturing on Run-to-Run Control, Process, Equipment and Materials Control in Integrated Circuit Manufacturing III, pp 138-146 by M.L. Miller, 1997.
Scheduling: Theory, Algorithms, and Systems, by M. Pinedo, 1995.
Feature-Based Methods for Large-Scale Dynamic Programming. Master's Thesis, Massachusetts Institute of Technology, B.V. Roy, 1994.
Run by Run Control: Combining SPC and Feedback Control. IEEE Transactions on Semiconductor Manufacturing, 8(1):26-43, 1995.
Provisional Applications (1)
Number Date Country
60/179649 Feb 2000 US