Embodiments described herein involve a method comprising monitoring a condition of an asset comprising one or more subsystems connected in series. Each subsystem comprises one or more components connected in parallel. The asset has one or more jobs. A probability of the asset surviving a predetermined amount of time is determined based on the monitoring and one or more shared resources. The one or more shared resources are configured to be shared between the subsystems. A model is established using a threshold based heuristic maintenance policy. The model is configured to maximize a number of successful jobs that the asset is able to complete based on the determined probability. The one or more shared resources are allocated to the one or more subsystems based on the model.
Embodiments involve a system comprising a processor and a memory storing computer program instructions which when executed by the processor cause the processor to perform operations. The operations comprise monitoring a condition of an asset comprising one or more subsystems connected in series. Each subsystem comprises one or more components connected in parallel. The asset has one or more jobs. A probability of the asset surviving a predetermined amount of time is determined based on the monitoring and one or more shared resources. The one or more shared resources are configured to be shared between the subsystems. A model is established using a threshold based heuristic maintenance policy. The model is configured to maximize a number of successful jobs that the asset is able to complete based on the determined probability. The one or more shared resources are allocated to the one or more subsystems based on the model.
The figures are not necessarily to scale. Like numbers used in the figures refer to like components. However, it will be understood that the use of a number to refer to a component in a given figure is not intended to limit the component in another figure labeled with the same number.
Industrial and mission critical assets such as factory assembly lines, aircraft engines, and military equipment are composed of series-parallel systems that require periodic maintenance and replacement of faulty components. The assets typically operate over a finite time window and are brought in (or shut down) for service between consecutive operational windows (e.g., missions). To prevent excessive downtime and costly breakdowns, a selective maintenance policy that judiciously selects the sub-systems and components for repair and prescribes allocation of resources may be used.
Embodiments described herein involve a threshold based system and method that provide a near-optimal maintenance policy, especially when resources (manpower, budget, spare parts) are scarce and have to be shared across the entire planning horizon. The exact planning problem is computationally intensive, whereas the heuristic method described herein produces near-optimal maintenance schedules in reasonable time, which makes it suitable for practical applications. The threshold value is expressed as the long-term expected maximum reward yielded by holding a single resource in reserve. The heuristic policy described herein determines threshold values that are a function of the resource inventory levels and the state of the system. If the immediate reward yielded by allocating a resource exceeds the long-term expected marginal yield (i.e., the threshold value), the policy dictates that the resource be allocated. In this regard, the policy trades off the immediate reward against the long-term expected reward yielded by holding the resource in reserve (i.e., for allocation at a future time). If a plurality of resources are available, the policy dictates that the resource with the maximum marginal reward be allocated first. Also, if a plurality of sub-systems/components are in need of repair, the policy dictates that those sub-systems/components whose repair will yield the maximum asset reliability be repaired first. Embodiments described herein can be used in any type of system. For example, the techniques described herein can be used in aircraft engine maintenance, assembly line selective maintenance (e.g., in printer ink facilities), industrial equipment, military asset management, and/or weapons allocation (e.g., assignment of weapons to targets).
In some cases, finite resources are allocated for each service period (or break). Embodiments described herein specifically include allocation of scarce resources across the planning horizon. In general, consumable resources (e.g., man hours, parts) can be shared and therefore optimally assigned depending on the condition of the asset and the remaining number of missions (time horizon). The selective maintenance scheduling problem for series-parallel systems is computationally intensive, and optimal schedules may take hours to compute on present-day edge devices. Embodiments described herein exploit the structure in the problem to derive an intuitive threshold based heuristic policy that is amenable to fast real time implementation. The heuristic policy employs a scalable linear backward recursion for computing the threshold values, as opposed to solving a non-linear Bellman recursion for the optimal allocation (which is typically not scalable for a system with a large number of sub-systems and components).
A probability of the asset surviving a predetermined amount of time is determined 120 based on the monitoring 110 and one or more shared resources configured to be shared between the subsystems. The shared resources may include one or both of replenishable resources and consumable resources. Specifically, the one or more shared resources include one or more of man hours, budget, parts, and equipment used to perform maintenance.
A model is established 130 using a threshold based heuristic maintenance policy. The model may be configured to maximize a number of successful jobs that the asset is able to complete based on the determined probability. According to various embodiments, the model is configured to minimize a shared resource cost. The resource cost may be calculated using a Maximal Marginal Reward (MMR) algorithm that is configured to maximize an expected payout or minimize an expected cost.
The one or more shared resources are allocated 140 to the one or more subsystems based on the model. A maintenance schedule for the asset may be determined based on the model. In some cases, the one or more components have a known failure rate and the known failure rates are used to establish the model.
If it is determined 155 that the final decision stage has not been reached, a failure probability over the interval until the next decision epoch is determined 150 for all components of the asset. The next decision epoch may be a current task that one or more subsystems of the asset is performing and/or a current mission of the asset, for example.
Components and/or subsystems are prioritized for repair and/or replacement 160. The prioritization may be based on a time to failure of the component and/or subsystem, for example. In some cases, the prioritization may be based on a cost of the replacement and/or repair.
A threshold value is computed 170 for the next available resource. It is determined 180 whether the immediate resource allocation reward is greater than the threshold. If it is determined 180 that the immediate resource allocation reward is not greater than the threshold, the process again computes 170 the threshold value for the next available resource.
If it is determined 180 that the immediate resource allocation reward is greater than the threshold, resources are allocated 190. The process iterates through the remaining resources and the components selected for repair. The process then advances 195 to the next decision epoch and returns to determining 155 whether the final decision stage has been reached.
If it is determined 155 that the final decision stage has been reached, an algorithm is used 165 to allocate all remaining resources. According to various configurations, the algorithm is the MMR algorithm. The process then ends 175.
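The flow described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function name `run_threshold_policy` and the callables `candidates_at`, `immediate_reward`, and `threshold` are hypothetical placeholders for steps 150 through 190, candidates are ranked by immediate reward (one possible prioritization), each candidate receives at most one resource per epoch, and the final stage is approximated by allocating while the immediate reward is positive.

```python
from typing import Callable, List, Sequence, Tuple


def run_threshold_policy(
    num_epochs: int,
    resources: int,
    candidates_at: Callable[[int], Sequence[str]],
    immediate_reward: Callable[[str, int], float],
    threshold: Callable[[int, int], float],
) -> List[Tuple[int, str]]:
    """Return (epoch, component) pairs; each pair consumes one shared resource."""
    allocations: List[Tuple[int, str]] = []
    for epoch in range(num_epochs):
        final_stage = epoch == num_epochs - 1  # check 155
        # Step 160 (assumed ranking): largest immediate reward first.
        ranked = sorted(
            candidates_at(epoch),
            key=lambda comp: immediate_reward(comp, epoch),
            reverse=True,
        )
        for comp in ranked:
            if resources == 0:
                break
            gain = immediate_reward(comp, epoch)
            # Step 170: value of holding the next resource in reserve.
            # At the final stage nothing can be saved, so the bar is zero
            # (a simple stand-in for step 165).
            bar = 0.0 if final_stage else threshold(resources, epoch)
            if gain <= bar:  # check 180 fails
                break  # lower-ranked candidates cannot clear the same bar
            allocations.append((epoch, comp))  # step 190
            resources -= 1
    return allocations
```

Passing the reward and threshold computations in as callables keeps the control flow separate from the reliability model, so the same loop can be exercised with either a simple stub or a full threshold recursion.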
The methods described herein can be implemented on a computer using well-known computer processors, memory units, storage devices, computer software, and other components. A high-level block diagram of such a computer is illustrated in
Establishing the model using a threshold based heuristic maintenance policy is described in further detail below.
The decision maker (DM) has at his disposal M homogeneous resources that are to be sequentially allocated to incoming tasks. Decisions are made at discrete epochs, i.e., t=0, 1, . . . . Each task i has a window of opportunity, [tsi, tfi], within which it is active and within which the DM can therefore assign resources towards completing the task. A single resource assigned to a task gets the job done at or before the next decision epoch (i.e., the task is completed) with probability p<1. So, if k resources are allocated to an incomplete task i at decision epoch t∈[tsi, tfi), the task will be completed with probability $1-(1-p)^k$ at the next decision epoch t+1. We assume “complete information” in that the task completion status is known to the DM at the beginning of each decision epoch. Upon successful completion of task i, the DM earns the positive reward ri. The DM also incurs a cost c>0 for each resource assigned. Suppose at time 0, the DM knows that N tasks shall arrive with a priori known windows of opportunity. Without loss of generality, we shall assume that at least one of the tasks starts at time 1, i.e., $\min_i t_{si}=1$. Furthermore, let $T=\max_i t_{fi}$ be the time horizon of interest. We wish to compute the optimal allocation of resources to tasks so that the DM accrues the maximal cumulative reward over the time window, t∈[1, T].
Let the task completion status (state) variable z(i)∈{0,1} indicate that task i is incomplete or complete, respectively. At decision epoch t, let V(z,k,t) indicate the optimal cumulative reward (value function) that can be achieved thereafter, where k indicates the number of resources left and the vector z encapsulates the status of each task.
A task i is active at the current time if it is incomplete, i.e., z(i)=0 and if the current time t∈[tsi, tfi]. Accordingly, we define the set of active tasks, A(z,t)={i: t∈[tsi, tfi] and z(i)=0}.
The decision variable u is a vector wherein ui indicates the number of resources to be assigned to task i. The feasible allocation set is shown in (1).
$U(z,k,t)=\{u : \sum_{i\in A(z,t)} u_i \le k,\; u_i \ge 0\ \forall i\in A(z,t),\; u_i = 0\ \forall i\notin A(z,t)\}$. (1)
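As a hedged illustration of the definitions above, the active set A(z,t), the feasible set U(z,k,t) in (1), and the completion probability for u resources can be enumerated directly for small instances. The function names and the representation of windows as (start, finish) pairs are assumptions made for this sketch only.

```python
from itertools import product
from typing import List, Sequence, Tuple


def active_tasks(z: Sequence[int], t: int,
                 windows: Sequence[Tuple[int, int]]) -> List[int]:
    """Indices of tasks that are incomplete (z[i] == 0) and whose window contains t."""
    return [i for i, (ts, tf) in enumerate(windows) if z[i] == 0 and ts <= t <= tf]


def feasible_allocations(z: Sequence[int], k: int, t: int,
                         windows: Sequence[Tuple[int, int]]) -> List[Tuple[int, ...]]:
    """All vectors u in U(z, k, t): zero on inactive tasks, total at most k."""
    active = set(active_tasks(z, t, windows))
    # Inactive tasks may only receive 0; active tasks may receive 0..k.
    ranges = [range(k + 1) if i in active else range(1) for i in range(len(z))]
    return [u for u in product(*ranges) if sum(u) <= k]


def completion_probability(p: float, u: int) -> float:
    """Probability that a task is completed by the next epoch with u resources."""
    return 1.0 - (1.0 - p) ** u
```

Exhaustive enumeration grows quickly with the number of active tasks and resources, which is one way to see why the full recursion over this set becomes expensive.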
It follows that the value function V (z,k,t) satisfies the Bellman recursion:
where q=1−p. In effect, we assign Σi∈A(z,t) ui resources in total. In the next decision epoch, the status of each task changes according to:
Since the evolution of each task is a completely independent Markov process, the transition probabilities in (3) are given by:
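Prob(z(i)=1 at t+1 | z(i)=0 at t, ui) = $1-q^{u_i}$; Prob(z(i)=0 at t+1 | z(i)=0 at t, ui) = $q^{u_i}$; Prob(z(i)=1 at t+1 | z(i)=1 at t) = 1.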
The boundary conditions for the value function are given by:
We shall address the much simpler single task scenario first.
Suppose we are only interested in the optimal allocation of resources to a single active task i. For ease of exposition, let us assume that 1=tsi<tfi=Ti. The corresponding task specific value function is given by:
with the boundary conditions:
The optimal policy is given by:
μi(k,t)=arg maxu
Suppose the DM has u+1 resources left at the terminal time Ti. We denote the marginal reward yielded by assigning 1 additional resource over and above u resources to the active task i as:
$\Delta_i^{T}(u) = -c(u+1) + (1-q^{u+1})\,r_i - \left[-cu + (1-q^{u})\,r_i\right] = -c + p\,q^{u}\,r_i.$ (11)
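The closed form in (11) can be checked numerically against the difference of payoffs it is derived from. The sketch below is illustrative only; the function names are hypothetical, and the values r=90, p=0.25, c=1 are used purely as sample inputs.

```python
def allocation_payoff(u: int, r: float, p: float, c: float) -> float:
    """Terminal payoff of assigning u resources: -c*u + (1 - q**u) * r."""
    q = 1.0 - p
    return -c * u + (1.0 - q ** u) * r


def marginal_reward(u: int, r: float, p: float, c: float) -> float:
    """Closed form of (11): reward of one more resource on top of u."""
    q = 1.0 - p
    return -c + p * q ** u * r


def first_negative_marginal(r: float, p: float, c: float) -> int:
    """Least non-negative u at which the marginal reward becomes negative.

    Terminates for 0 < p < 1 and c > 0 because q**u decays to zero.
    """
    u = 0
    while marginal_reward(u, r, p, c) >= 0.0:
        u += 1
    return u


if __name__ == "__main__":
    for u in range(5):
        diff = allocation_payoff(u + 1, 90.0, 0.25, 1.0) - allocation_payoff(u, 90.0, 0.25, 1.0)
        print(u, round(diff, 6), round(marginal_reward(u, 90.0, 0.25, 1.0), 6))
```

For the sample inputs, the marginal reward first turns negative at u=11, so an eleventh resource would still pay for itself while a twelfth would not.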
Since q<1, it follows that $\Delta_i^{T}(u)$ is monotonically decreasing in u. Let κi(Ti) be the least non-negative integer at which the marginal reward becomes negative, i.e., $\kappa_i(T_i)=\min\{u \ge 0 : p\,q^{u}\,r_i < c\}$. It follows that the optimal policy at the terminal time is given by:
In other words, we assign as many resources as is available but no more than the threshold. We will show likewise that at each decision epoch, there is an upper bound to the number of resources that the DM shall assign (optimally) for the remainder of the time horizon. Having computed the terminal value function, we are now in a position to compute the value function at the previous decision epoch. Indeed, we have from (7):
We can show that the value function Vi(k,Ti−1) has a threshold based analytical form similar to Vi(k,Ti). Let the function inside the max operator in (14) be given by,
where,
where, fT
Let κi(Ti−1)= s.t.
Lemma 1 κi(Ti−1)≤κi(Ti).
Proof. If κi(Ti)=0,
where (17) follows from the definition of κi(Ti). In other words, ri>c for all ∈ [0, κi(Ti)). It immediately follows from the definition of κi(Ti−1) that κi(Ti−1)≤κi(Ti).
From the definition of the marginal reward (16), we note that determining when it switches sign depends in part on fT
Lemma 2
Proof. By definition we have: qu*(k)fT
Theorem 1 The optimal allocation policy at Ti−1 is dictated by:
Proof. First, we note that the marginal reward function ΔT−1i(k,u) defined earlier (16) is monotonic decreasing in u. Indeed, for u∈[0, k−κi(Ti)), ΔT−1i(k,u)=−c+pqu
ΔT−1i(k,k−κi(Ti)−1)>ΔT−1i(k,k−κi(Ti)).
From the definition of κi(Ti), we have: pqκ
As before, it follows that the optimal allocation u* is the threshold value at which the marginal reward becomes negative. However, the added complexity here is the dependence on k. So, we deal with the three possible scenarios that lead to different threshold values:
where the last inequality follows from the definition of κi(Ti−1). So, it follows that the threshold value at which the marginal reward turns negative is given by u*=κi(Ti−1).
with the boundary condition u*(κi(Ti)+κi(Ti−1))=κi(Ti−1), which was established in 2).
In summary, we have:
where u*(k) is computed via the backward recursion (24).
It follows from (15) that the optimal value function at time Ti−1 is given by:
Corollary 1 The optimal allocation at time Ti−1 is such that the number of resources left at the last stage Ti is no more than the optimal threshold κi(Ti). In other words,
k−μi(k,Ti−1)≤κi(Ti), k<κi(Ti)+κi(Ti−1). (27)
Proof. This follows from (24). As k decreases by a value of 1, u*(k) either remains the same or goes down by 1. So, it cannot decrease fast enough that k−u*(k) exceeds the threshold κi(Ti). In other words, when the total number of resources is less than the sum of thresholds κi(Ti)+κi(Ti−1), no resources will be left unused at the final time Ti.
In light of Theorem 1, one can easily compute the optimal thresholds i.e., number of resources to be allocated at each decision epoch when resources are aplenty. In particular, we have the result: for any t∈[1, Ti],
where the thresholds can be computed via a backward recursion. For any t=Ti, . . . , 1:
with the boundary condition, R0=ri. The optimal value function is given by:
In other words, one can quickly compute the thresholds κi(t) over the entire time window without resorting to the non-linear Bellman recursion. At decision time t, the DM allocates exactly the threshold amount so long as the number of available resources is no less than the sum of thresholds over the remainder of the time horizon.
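A minimal sketch of such a threshold computation is given below, together with a brute-force check. Both parts are assumptions layered on the description: `thresholds_abundant` is derived here from the single-task recursion with the boundary condition R0=ri and may differ in form from recursion (29), while `optimal_allocation` solves the single-task Bellman recursion exhaustively so the two can be compared when resources are plentiful. The function names and sample parameters are illustrative.

```python
from functools import lru_cache
from typing import List


def thresholds_abundant(r: float, p: float, c: float, T: int) -> List[int]:
    """Thresholds kappa(t), t = 1..T, assuming resources are never binding."""
    q = 1.0 - p
    kappas = []
    R = r  # boundary condition R_0 = r
    for _ in range(T):  # walk backward from t = T to t = 1
        kappa = 0
        while -c + p * q ** kappa * R >= 0.0:
            kappa += 1
        kappas.append(kappa)
        # Effective reward one epoch earlier: r minus the value of continuing
        # optimally, which works out to q**kappa * R + c * kappa.
        R = q ** kappa * R + c * kappa
    return kappas[::-1]


def optimal_allocation(r: float, p: float, c: float, T: int, k: int, t: int) -> int:
    """Brute-force optimal number of resources to assign at epoch t with k left."""
    q = 1.0 - p

    @lru_cache(maxsize=None)
    def solve(k_left: int, epoch: int):
        if epoch > T:
            return 0.0, 0
        best_value, best_u = float("-inf"), 0
        for u in range(k_left + 1):
            value = -c * u + (1.0 - q ** u) * r + q ** u * solve(k_left - u, epoch + 1)[0]
            if value > best_value + 1e-12:
                best_value, best_u = value, u
        return best_value, best_u

    return solve(k, t)[1]


if __name__ == "__main__":
    kappas = thresholds_abundant(90.0, 0.25, 1.0, 10)
    brute = [optimal_allocation(90.0, 0.25, 1.0, 10, 40, t) for t in range(1, 11)]
    print(kappas, brute)
```

The threshold pass does a short search per epoch, whereas the brute-force solver visits every (resources, epoch) pair and every allocation, which illustrates the difference in effort between the two approaches.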
This also tells us that for any task i, the DM requires at most
$\sum_{t=1}^{T_i} \kappa_i(t)$
resources. Furthermore, since the rewards are strictly decreasing with ℓ (see (17) in Lemma 1), it follows that κi(Ti−ℓ) must go to zero as ℓ increases. Indeed, there exists n(i) ∈ [0, ∞) such that κi(Ti−n(i))=0. So, the DM need never allocate more than
$\sum_{\ell=0}^{n(i)} \kappa_i(T_i-\ell)$
resources to task i and, furthermore, the window of opportunity for task i needs to be no longer than n(i). If the window is any longer, it is optimal for the DM to simply hold off without assigning any resources until the remaining time window shrinks to n(i).
To illustrate the method and results, we use as an example the allocation of maintenance personnel to a single piece of failure-prone industrial equipment over a window of opportunity, T=10. The reward is r=90, the probability of resolving the fault is p=0.25, and the cost of allocating a maintainer is c=1. According to various configurations, since the cost of allocating one maintainer is c=1, minimizing this cost is equivalent to minimizing the number of maintainers. Using the backward recursion (29), we compute the threshold values, κ(j), j=1, . . . , 10 as shown in
In the previous section, we have completely characterized the optimal allocation at stages Ti and Ti−1. Note that the optimal allocation at stage Ti is linear up to the threshold κi(Ti) and a constant thereafter. The optimal allocation at stage Ti−1 is far more interesting in that it is piecewise constant and monotonic non-decreasing up to the threshold κi(Ti−1). Arguably, the optimal allocation at earlier stages will also exhibit a piecewise constant and monotonic non-decreasing behavior. The difficulty is in computing the switching points where the allocation increases by 1. When resources are abundant, we have the complete solution. The interesting case is when resources are scarce where it is not obvious how to distribute them among the different stages knowing that future stages may not come into play. Nonetheless, one can still generalize some of the results that have been shown to be true for stage Ti−1. Indeed, the optimal value function for earlier stages generalizes (26) and has the form: for >0 and k≤
The above value function reflects the fact that the DM incurs a cost of c×μ({tilde over (k)}j, Ti−+j) only if the resources allocated at all previous stages Ti−+n, n ∈ [0, j−1] are unsuccessful in completing the task. Moreover, the optimal allocation at stage Ti−+j is a function of the number of resources left at the stage, , which, by definition, is the number of resources allocated at previous stages deducted from the initial inventory of k resources. Since all k resources are allocated, the expected reward equals [1−qk]ri. The exact distribution of the k resources amongst the different stages requires solving the Bellman recursion. For the special case of k=κi(Ti−h), the result is immediate and the optimal allocation, μ({tilde over (k)}j, Ti−+j)=κi(Ti−+j), j ∈ [0, ].
From (31), we can also generalize the marginal reward function. Suppose u resources are allocated at stage Ti− and optimal allocations are made thereafter. We can write:
It follows that the marginal reward at stage Ti− is given by:
Let (k,u)=−c+cqu(k−u), where (k−u) represents the quantity inside the curly brackets in (33). Assuming that the unit incremental property of the optimal allocation (see Lemma 2) holds for all stages, we have either μ({tilde over (k)}1, Ti−+1)=μ({tilde over (k)}1−1, Ti−+1) or μ({tilde over (k)}1, Ti−+1)=μ({tilde over (k)}1−1, Ti−+1)+1. Accordingly,
where we recognize that the expression inside the square brackets in (35) is, by definition, fT
In addition to the unit incremental property, further suppose (as in Lemma 2) that:
μ({tilde over (k)}1−1,Ti−+1)=μ({tilde over (k)}1,Ti−+1)−2) if ({tilde over (k)}1,μ({tilde over (k)}1−Ti−+1))≤q. (38)
Combining (36), (37) and (38), we have:
(w)=pμ(w,T1−+1)+max{q(w−μ(w,Ti−+1))}. (39)
Recall that the optimal allocation at the last stage, μ(k,Ti)=k, k≤κi(Ti). So, we have the initial condition for the update given by:
fT
which matches with our earlier definition for fT
Theorem 2 The optimal allocation policy at Ti− for any >0 is dictated by:
The update function for (.) is given by (39) with the initial condition (40).
To prove the above result, we have to generalize the results shown for fT
We return our attention to the multiple task arrivals. The ensuing analysis is greatly simplified when the arriving tasks are temporally disjoint. In other words, suppose the time windows [tsi,tfi] for different tasks are completely disjoint. Without loss of generality, we can assume that the N tasks are ordered temporally as follows: 1=ts1<tf1<ts2<tf2< . . . <tsN<tfN=T. Furthermore, we can also assume that $t_{s,i+1}=t_{f,i}+1$ for all i<N (i.e., each window starts one epoch after the previous window ends), since the DM cannot allocate anything between two successive disjoint windows of opportunity.
This case deals with the scenario wherein there are two or more tasks such that the intersection of their windows of opportunity is non empty. The curse of dimensionality may render the dynamic programming recursion intractable. However, we can use the notion of marginal reward from the single task case to establish heuristic policies. The idea would be to assign resources greedily to tasks that yield the maximal marginal reward.
This would be the most complex scenario, wherein the windows of opportunity are no longer fixed. In other words, suppose the length of each task window Ti is known, but the start time is random. The original recursion has to be modified to account for random arrivals. For example, we could assume that at each time step, a task of type i appears with probability αi. Furthermore, we could assume that once a task arrives, no other task can arrive until either the current task is completed or its window of opportunity expires.
The allocation of resources is described in more detail in the following paragraphs. According to various implementations, an asset performs a sequence of identical missions at times t=1, 2, . . . , T and is brought in for service between missions. The asset comprises m independent sub-systems connected in series, with each sub-system i comprising ni independent, identical constant failure rate (CFR) components connected in parallel. At any point in time, a component is either functioning or has failed. A sub-system is also either functioning (if at least one of its components is functioning) or has failed. The asset is functioning and can perform a mission successfully only if all of its sub-systems are functioning. A faulty component can be repaired during the break between two consecutive missions. We assume that each component in sub-system i has reliability ri, i.e., a functioning component at the start of a mission fails during the mission with probability qi=1−ri. In this example, each component requires exactly 1 man hour to be repaired. The decision maker (DM) has at his disposal a total budget of M man hours over the entire planning horizon. There is a unit cost c>0 borne by the DM for each man hour expended. We assume “complete information” in that each component's status is known to the DM at the end of a mission (or beginning of a break). We wish to maximize the asset reliability while minimizing cumulative service cost over the remainder of the planning horizon. At the end of mission t, suppose si≤ni components are found to be faulty in sub-system i and the DM allocates ui≤si man-hours for repairing sub-system i. The reliability of sub-system i for mission t+1 is given by $1-q_i^{\,n_i-s_i+u_i}$, since the sub-system fails only if all $n_i-s_i+u_i$ functioning components fail during the mission.
Suppose we have a budget of k man-hours and a single mission for which we wish to maximize asset reliability and minimize resource use (cost). As before, let si≤ni denote the number of faulty components before the start of the mission. Let us allocate ui≤si man hours towards repairing components in sub-system i. The optimization problem becomes:
For this single stage optimization problem, it can be shown that the optimal allocation is given by the Maximal Marginal Reward (MMR) algorithm. Indeed, we assign one man-hour at a time to the sub-system that yields the highest marginal reward for the additional allocation. Suppose we have already assigned ui man-hours to sub-system i. The marginal reward i.e., increase in asset reliability yielded by assigning an additional man hour to sub-system i minus the additional cost is given by:
We assign the next available man hour to the sub-system with the highest payoff: arg maxi[Δi(u)]. We stop assigning resources when this quantity becomes negative indicating that any additional allocation yields a negative (marginal) reward.
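The greedy procedure described above can be sketched as follows. This is an illustrative reading of the text rather than a definitive implementation: asset reliability is taken as the product of the sub-system reliabilities (series connection of independent sub-systems), and the names `asset_reliability` and `mmr_allocate` and all numeric inputs are hypothetical.

```python
from math import prod
from typing import List, Sequence


def asset_reliability(n: Sequence[int], s: Sequence[int],
                      u: Sequence[int], q: Sequence[float]) -> float:
    """Product over sub-systems of 1 - q_i ** (working components after repair)."""
    return prod(1.0 - q[i] ** (n[i] - s[i] + u[i]) for i in range(len(n)))


def mmr_allocate(n: Sequence[int], s: Sequence[int], q: Sequence[float],
                 c: float, k: int) -> List[int]:
    """Assign up to k man-hours one at a time to the best marginal reward."""
    u = [0] * len(n)
    base = asset_reliability(n, s, u, q)
    for _ in range(k):
        best_gain, best_i = 0.0, None
        for i in range(len(n)):
            if u[i] >= s[i]:
                continue  # no faulty component left to repair in sub-system i
            u[i] += 1
            gain = asset_reliability(n, s, u, q) - base - c
            u[i] -= 1
            if gain > best_gain:
                best_gain, best_i = gain, i
        if best_i is None:
            break  # no allocation yields a positive marginal reward
        u[best_i] += 1
        base = asset_reliability(n, s, u, q)
    return u


if __name__ == "__main__":
    print(mmr_allocate(n=[3, 3], s=[2, 1], q=[0.2, 0.1], c=0.01, k=3))
```

One limitation of this one-at-a-time sketch is worth noting: if two or more sub-systems have no working component, any single repair leaves the asset reliability at zero, so no step shows a positive gain and the loop stops without allocating.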
Suppose at decision time t, the DM is left with k man-hours and the number of faulty components in sub-system i is given by si and the state vector, s=(s1, . . . , sm). Let the DM allocate ui≤si man-hours towards repairing components in sub-system i. The corresponding expected future payoff or value function is given by:
where the number of failed components in sub-system i at the end of mission t+1 is given by
On the other hand, if the additional resource is kept in reserve for future stages (breaks), the expected marginal future reward is given by:
where:
W(
Suppose we are only interested in the optimal allocation of resources to a single sub-system i. Recall decisions are made at t=1, . . . , T before the start of the tth mission. We assume that all components are healthy before the start of the 1st mission. Suppose at decision time t, the DM is left with k man-hours and the number of faulty components is given by s. We wish to minimize the cost of labor and maximize the number of successful missions. The unit cost c>0 can be appropriately chosen to trade off the cost of labor with mission success probability. The corresponding sub-system specific value function is given by:
where the immediate reward associated with allocation u is given by $R_i(s,u)=1-q_i^{\,n_i-s+u}-cu$.
So, the Bellman recursion becomes:
The optimal policy is given by:
At the terminal decision epoch, we have the boundary condition:
When there are no man hours left, the system evolves autonomously and we have:
Lemma 1
Proof. From the boundary condition (53), we have:
Vi(0,s,T)=1−qin
We make the induction assumption that:
for some h>1. It follows that:
We have used the induction assumption in (58) and repeatedly used
to arrive at (60).
At the terminal decision epoch T, if there are s faulty components, we will at most need only s man hours to repair them. So, the optimal allocation for k>s is the same as the optimal allocation for k=s. With this in mind, recall the boundary condition:
Suppose the DM has u+1 resources left at the terminal time T. We denote the marginal reward yielded by assigning 1 additional resource over and above u resources to sub-system i as:
$\Delta_i^{T}(u)=R_i(s,u+1)-R_i(s,u)=-c+q_i^{\,n_i-s+u}-q_i^{\,n_i-s+u+1}=-c+r_i\,q_i^{\,n_i-s+u}.$
Since qi<1, it follows that $\Delta_i^{T}(u)$ is monotonically decreasing in u. Let κi(T) be the least non-negative integer at which the marginal reward becomes negative, i.e., $\kappa_i(T)=\min\{\ell \ge 0 : q_i^{\ell}\,r_i<c\}$. It follows that the optimal policy at the terminal decision epoch is given by: for k≤s,
For k>s, the optimal assignment is given by:
In other words, we assign as many resources as possible so that ni−s+u gets close to the threshold κi(T) without exceeding it. The optimal terminal value is given by:
Vi(k,s,T)=Ri(s,μi(k,s,T))=1−qin
We will show likewise that at each decision epoch, there is an upper bound to the number of resources that the DM shall assign (optimally) for the remainder of the time horizon. Having computed the terminal value function, we are now in a position to compute the value function at the previous decision epoch. Indeed, we have from (50):
where the future expected reward associated with allocation u is given by:
We wish to show that the value function Vi(k,s,T−1) has a threshold based analytical form similar to Vi(k,s,T).
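For small instances, the single sub-system problem can be explored with a brute-force recursion, which gives a reference against which a threshold form can be compared. The sketch below rests on assumptions drawn from the description: the per-mission reward is the probability that at least one component survives minus the labor cost c per man-hour, each functioning component fails independently with probability q during a mission, and nothing is earned after mission T. The names `make_subsystem_solver` and `terminal_threshold` and the sample parameters are hypothetical.

```python
from functools import lru_cache
from math import comb
from typing import Callable, Tuple


def make_subsystem_solver(n: int, q: float, c: float,
                          T: int) -> Callable[[int, int, int], Tuple[float, int]]:
    """Return solve(k, s, t) -> (optimal value, optimal man-hours to assign)."""

    @lru_cache(maxsize=None)
    def solve(k: int, s: int, t: int) -> Tuple[float, int]:
        if t > T:
            return 0.0, 0
        best_value, best_u = float("-inf"), 0
        for u in range(min(k, s) + 1):
            working = n - s + u
            # Immediate reward: sub-system survives the mission, minus labor cost.
            value = 1.0 - q ** working - c * u
            # Expected future value over the number j of new failures.
            for j in range(working + 1):
                prob = comb(working, j) * q ** j * (1.0 - q) ** (working - j)
                value += prob * solve(k - u, s - u + j, t + 1)[0]
            if value > best_value + 1e-12:
                best_value, best_u = value, u
        return best_value, best_u

    return solve


def terminal_threshold(q: float, c: float) -> int:
    """Least l >= 0 with (1 - q) * q**l < c (requires 0 < q < 1 and c > 0)."""
    kappa = 0
    while (1.0 - q) * q ** kappa >= c:
        kappa += 1
    return kappa


if __name__ == "__main__":
    solve = make_subsystem_solver(n=4, q=0.3, c=0.05, T=3)
    print(terminal_threshold(0.3, 0.05), solve(4, 4, 3), solve(4, 4, 2))
```

With these sample inputs, the terminal allocation returned by the solver equals min(k, s, max(0, κ−(n−s))), which matches the verbal rule of repairing components until the number of working components reaches the threshold without exceeding it.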
Case 1: u<κi(T)−ni+s and k−u≥κi(T). It follows that:
n−s+u−<κi(T) and n−s+k−≥κi(T)∀. (68)
⇒μi(k−u,s−u+,T)=κi(T)−ni+s−u+. (69)
So, we have:
Case 2: n−s+k≤κi(T). It follows that:
n−s+k−≤κi(T)∀. (71)
⇒μi(k−u,s−u+,T)=k−u. (72)
So, we have:
It follows that:
R(s,u)+Q(k,s,u,T)=2−ck−qin
Unless otherwise indicated, all numbers expressing feature sizes, amounts, and physical properties used in the specification and claims are to be understood as being modified in all instances by the term “about.” Accordingly, unless indicated to the contrary, the numerical parameters set forth in the foregoing specification and attached claims are approximations that can vary depending upon the desired properties sought to be obtained by those skilled in the art utilizing the teachings disclosed herein. The use of numerical ranges by endpoints includes all numbers within that range (e.g. 1 to 5 includes 1, 1.5, 2, 2.75, 3, 3.80, 4, and 5) and any range within that range.
The various embodiments described above may be implemented using circuitry and/or software modules that interact to provide particular results. One of skill in the computing arts can readily implement such described functionality, either at a modular level or as a whole, using knowledge generally known in the art. For example, the flowcharts illustrated herein may be used to create computer-readable instructions/code for execution by a processor. Such instructions may be stored on a computer-readable medium and transferred to the processor for execution as is known in the art. The structures and procedures shown above are only a representative example of embodiments that can be used to facilitate ink jet ejector diagnostics as described above.
The foregoing description of the example embodiments has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the inventive concepts to the precise form disclosed. Many modifications and variations are possible in light of the above teachings. Any or all features of the disclosed embodiments can be applied individually or in any combination, not meant to be limiting but purely illustrative. It is intended that the scope be limited by the claims appended herein and not with the detailed description.
Number | Date | Country | |
---|---|---|---|
20220351106 A1 | Nov 2022 | US |