The invention generally relates to self-correcting closed-loop control policies learned through machine learning, such as for additive or subtractive manufacturing, e.g., for deposition of viscous materials in situations that require on-the-fly adjustment of process parameters to handle inconsistencies in the deposition process and in material formulations.
Generally speaking, additive manufacturing is the process of creating an object by building it up one layer at a time. Technically, additive manufacturing can refer to any process in which a product is created by building something up, but the term is most commonly used to refer to 3D printing.
In accordance with one embodiment of the invention, a manufacturing system comprises a tool configured to interact with or produce a product, at least one sensor that provides sensor information on the quality of the operation of the tool relative to the product, and a controller configured to control operation of the tool based on a predetermined manufacturing process and further configured to dynamically adjust at least one parameter of the predetermined manufacturing process to thereby dynamically adjust operation of the tool based on qualitative performance information derived from the sensor information applied as feedback to a closed-loop control policy learned through machine reinforcement learning.
In various alternative embodiments, the tool may perform an additive manufacturing process and/or a subtractive manufacturing process. The at least one sensor may include one or more cameras (e.g., two cameras), a 3D laser scanner, and/or a coordinate measuring machine. The tool may be a CNC machine. The controller may utilize a policy network for controlling the manufacturing process, wherein the policy network is trained using a learning environment that models the relationship between the process parameters and the result, as well as a reward function that penalizes or rewards the policy depending on how well the policy performed. In any case, the controller may be manufacturing process agnostic such that the controller can be used on different types of manufacturing processes.
In certain specific embodiments, the tool may be a 3D printer having a material dispenser, the at least one sensor may include at least one camera (e.g., two cameras) configured to provide images of a location around the deposition, and the closed-loop control policy may use qualitative performance information derived from the images. The qualitative performance information may include both the deposition and the variance of the deposition. The at least one parameter may include (1) the velocity at which the printing head is moving and/or (2) a displacement of the printing head in a direction perpendicular to the motion.
In accordance with another embodiment of the invention, a method comprises learning a self-correcting closed-loop control policy through machine reinforcement learning for a manufacturing process that involves on-the-fly adjustment of process parameters to handle inconsistencies in the manufacturing process and material formulations, and controlling operation of a tool configured to interact with or produce a product including dynamically adjusting at least one parameter of the manufacturing process to thereby dynamically adjust operation of the tool based on qualitative performance information derived from at least one sensor applied as feedback to the closed-loop control policy learned through machine reinforcement learning.
In various alternative embodiments, the tool may perform an additive manufacturing process and/or a subtractive manufacturing process. The at least one sensor may include one or more cameras (e.g., two cameras), a 3D laser scanner, and/or a coordinate measuring machine. The tool may be a CNC machine. The controller may utilize a policy network for controlling the manufacturing process, wherein the policy network is trained using a learning environment that models the relationship between the process parameters and the result, as well as a reward function that penalizes or rewards the policy depending on how well the policy performed. In any case, the controller may be manufacturing process agnostic such that the controller can be used on different types of manufacturing processes.
In certain specific embodiments, the tool may be a 3D printer having a material dispenser, the at least one sensor may include at least one camera (e.g., two cameras) configured to provide images of a location around the deposition, and the closed-loop control policy may use qualitative performance information derived from the images. The qualitative performance information may include both the deposition and the variance of the deposition. The at least one parameter may include (1) the velocity at which the printing head is moving and/or (2) a displacement of the printing head in a direction perpendicular to the motion.
Additional embodiments may be disclosed and claimed.
Those skilled in the art should more fully appreciate advantages of various embodiments of the invention from the following “Description of Illustrative Embodiments,” discussed with reference to the drawings summarized immediately below.
It should be noted that the foregoing figures and the elements depicted therein are not necessarily drawn to consistent scale or to any scale. Unless the context otherwise suggests, like elements are indicated by like numerals.
Certain embodiments are described herein with reference to additive manufacturing processes such as 3D printing using camera-based feedback, although it should be noted that embodiments are not limited to additive manufacturing or to camera-based feedback but instead can be applied more generally to a wide variety of other manufacturing processes and feedback mechanisms (essentially any type of sensor that can provide qualitative feedback of a manufacturing process, e.g., 3D laser scanner, coordinate measuring machine, etc.).
A critical component of additive manufacturing is identifying process parameters of the material and deposition system to deliver consistent, high-quality printouts. In commercial devices, this is typically achieved by expensive trial-and-error experimentation (Gao et al., 2015). To make such an optimization feasible, a critical assumption is made: there exists a set of parameters for which the deposition is consistent. However, this assumption does not hold in practice because the printing materials are unstable, non-homogenous mixtures. Their properties vary from batch to batch and, over time, as they settle or cure. These inconsistencies lead to printing imperfections that hinder the industrial adoption of additive manufacturing (Wang et al., 2020). Therefore, to achieve consistent prints, closed-loop control is essential for additive manufacturing.
Recently, there has been promising progress in learning policies for interaction with amorphous materials (Li et al., 2019b; Zhang et al., 2020). Unfortunately, in the context of additive manufacturing, discovering effective control strategies is significantly more challenging. The deposition parameters have a non-linear coupling with the dynamic material properties. To assess the severity of deposition errors, the material needs to be observed over long time horizons. Available simulators either lack predictive power (Mozaffar et al., 2018) or are too complex for learning (Tang et al., 2018; Yan et al., 2018). And learning directly on hardware is intractable, often requiring tens of thousands of printed samples. All of these challenges are further exacerbated by the limited perception of printing hardware, where only a small in-situ view is available to assess the deposition quality.
A numerical model is proposed for learning closed-loop control policies for additive manufacturing. To formulate this model, a key assumption is made: the in-situ view of a printing apparatus allows it to perceive the materials only qualitatively. Different materials can be treated the same as long as their local deposition is similar. This assumption is leveraged to design an efficient simulator based on Position-Based Dynamics. Linear Predictive Coding is used to explicitly include the parameter coupling as a noise distribution on the width of the deposited material. Furthermore, the numerical model provides privileged information about the deposition process. More specifically, it allows evaluation of the deposition quality in unobserved regions and inclusion of material changes over long horizons. As demonstrated, the proposed model can be used to learn closed-loop policies that outperform state-of-the-art controllers. Moreover, it is shown that the proposed control policies have a minimal sim-to-real gap and are readily applicable to the physical hardware. Finally, the proposed model was used to construct what is believed to be a first-of-its-kind closed-loop printing apparatus, which was used to fabricate several slices. It is believed that the numerical model enables future research on optimal deposition control entirely in simulation, without investing in specialized hardware.
To identify process parameters for additive manufacturing, it is important to understand the complex interaction between a material and a deposition process. This is typically done through trial-and-error experimentation (Kappes et al., 2018; Wang et al., 2018; Baturynska et al., 2018). Recently, optimal experiment design, and more specifically Gaussian processes, have become a tool for efficient use of samples to understand the deposition problem (Erps et al., 2021). However, even though Gaussian processes model the deposition variance, they do not offer tools to adjust the deposition on-the-fly. Another approach to improving the printing process is to design closed-loop controllers. One of the first designs was proposed by Sitthi-Amorn et al. (2015), which monitors each layer deposited by a printing process to compute an adjustment layer. Liu et al. (2017) build upon this idea and train a discriminator that can identify the type and magnitude of observed defects. A similar approach was proposed by Yao et al. (2018) that uses handcrafted features to identify when a print significantly drops in quality. The main disadvantage of these methods is that they rely on collecting in-situ observations to propose one corrective step by adjusting the process parameters. However, this means that the prints continue with sub-optimal parameters, and it can take several layers to adjust the deposition. In contrast, our system runs in-process and reacts to the in-situ views immediately. This ensures high-quality deposition and adaptability to material changes.
Recently, machine learning techniques have sparked a new interest in the design of adaptive control policies (Mnih et al., 2015). A particularly successful approach for high-quality in-process control is to adopt the Model Predictive Control (MPC) paradigm (Gu et al., 2016; Silver et al., 2017; Oh et al., 2017; Srinivas et al., 2018; Nagabandi et al., 2018). The control scheme of MPC relies on an observation of the current state and a short-horizon prediction of the future states. By manipulating the process parameters, we observe the changes in future predictions and can pick a future with desirable characteristics. It is particularly useful to utilize deep models to generate differentiable predictors that can be used to efficiently obtain derivatives with respect to control changes (de Avila Belbute-Peres et al., 2018; Schenck & Fox, 2018; Toussaint et al., 2018; Li et al., 2019a). However, addressing uncertainties of the deposition process with MPC is challenging. In a noisy environment, we can rely only on the expected prediction of the deposition. This leads to a conservative control policy that effectively executes the mean action. Moreover, reacting to material changes over time requires optimizing actions over long time horizons, which is a known weakness of the MPC paradigm (Garcia et al., 1989). As a result, MPC is not well suited for in-process control of noisy environments.
Another option for deriving control policies is to leverage deep reinforcement learning (Rajeswaran et al., 2017; Liu & Hodgins, 2018; Peng et al., 2018; Yu et al., 2019; Lee et al., 2019; Akkaya et al., 2019). The key challenge in the design of such controllers is formulating an efficient numerical model that captures the governing physical phenomena. As a consequence, deep reinforcement learning is most commonly applied to rigid body dynamics and rigid robots, where such models are readily available (Todorov et al., 2012; Bender et al., 2014; Coumans & Bai, 2016; Lee et al., 2018). In contrast, learning with non-rigid objects is significantly more challenging, as the computation time for deformable materials is higher and learning relies on some prior knowledge of the task (Clegg et al., 2018; Elliott & Cakmak, 2018; Ma et al., 2018; Wu et al., 2019). Recently, Zhang et al. (2020) proposed a numerical model for training control policies in which a rigid object interacts with amorphous materials. Similarly, in the proposed model, a rigid printing nozzle interacts with the fluid-like printing material. However, the proposed model is specialized for the printing hardware and models not only the deposition but also its variance. It is demonstrated that this is an important component in minimizing the sim-to-real gap and designing control policies readily applicable to the physical hardware.
The choice of additive manufacturing technology constrains the subsequent numerical modeling. To keep the applicability of the developed system as wide as possible, a Direct-Write needle deposition system mounted on a 3-axis Cartesian robot was selected, as shown in
To control the printing apparatus, a state-of-the-art slicer was employed. The input to the slicer is a three-dimensional object. The output of the slicer is a series of locations the printing head visits to reproduce the model as closely as possible. To generate a single slice of the object, the slicer starts by intersecting the 3D model with a Z-axis aligned plane (please note that this does not affect the generalizability of the slicer, as the input model can be arbitrarily rotated prior to slicing).
The reference control policy strictly relies on a constant width of the material. To discover policies that can adapt to the in-situ observations, the search was formulated in a reinforcement learning framework. The control problem is described by a Markov decision process (S, A, P, R), where S is a set of states, a ∈ A ⊆ ℝ^d is a d-dimensional continuous action that the control policy can take in each state, P(s′|s, a) is the transition function giving a distribution over next states s′ for a current state s and action a, and R(s, a) → ℝ is the reward function that assigns a numerical value to how good it is to be in state s and perform action a. The following section describes how these components can be designed in the context of additive manufacturing.
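By way of non-limiting illustration, the decision process above maps naturally onto a simulator interface. The following is a minimal Python sketch (not the actual implementation) of such an environment; the class name PrintingEnv and its helper methods are hypothetical placeholders for the components described in the following sections.

    import numpy as np

    class PrintingEnv:
        """Deposition environment: states s are in-situ observations and
        actions a are d-dimensional continuous parameter adjustments."""

        def __init__(self, action_dim=2):
            self.action_dim = action_dim
            self.state = None

        def reset(self):
            self.state = self._observe()
            return self.state

        def step(self, action):
            # Transition function P(s'|s, a): advance the stochastic simulator.
            assert action.shape == (self.action_dim,)
            self._simulate(action)
            next_state = self._observe()
            reward = self._reward(self.state, action)  # R(s, a)
            self.state = next_state
            return next_state, reward

        def _simulate(self, action):
            pass  # placeholder for the deposition step (see Appendix C)

        def _observe(self):
            return np.zeros((3, 84, 84), dtype=np.float32)  # in-situ view stack

        def _reward(self, state, action):
            return 0.0  # placeholder for the global quality metric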
To define the observation space, the constraints of the physical hardware were closely followed. The observation space was modeled as a small in-situ view centered at the printing nozzle location. The view has a size of 84×84 pixels, which translates to roughly 2.95×2.95 scene units (SU). The view contains either a heightmap (for infill printing) or material segmentation (for outline printing). Since the location directly under the nozzle is obscured on the physical hardware, a small central region in the view, equivalent to 0.42 SU or 1/7th of the in-situ view, is masked. Together with the local view, the printer was provided with a local image of the desired printing target and an image of the path the control policy will take in the environment. To further minimize the observation space, the in-situ view was rotated such that the printer moves along the positive X-axis in the image. These three inputs are stacked together into a 3-channel image.
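For illustration, the observation assembly can be sketched as follows (a minimal sketch under stated assumptions, not the actual implementation; build_observation is a hypothetical name, the rotation is performed with scipy.ndimage.rotate, and the masked region is approximated as a centered square of 1/7th the view size):

    import numpy as np
    from scipy.ndimage import rotate

    def build_observation(insitu_view, target_view, path_view, heading_rad):
        """Stack the three 84x84 inputs into a single 3-channel observation,
        rotated so the print head moves along the +X axis of the image."""
        view = rotate(insitu_view, angle=-np.degrees(heading_rad),
                      reshape=False, order=1)

        # Mask the central region obscured by the nozzle (~1/7 of the view).
        h, w = view.shape
        m = h // 7
        cy, cx = h // 2, w // 2
        view = view.copy()
        view[cy - m // 2:cy + m // 2, cx - m // 2:cx + m // 2] = 0.0

        return np.stack([view, target_view, path_view], axis=0)  # (3, 84, 84)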
The selection of the action space plays a critical role in adapting a controller to the real hardware. One possibility is to control and directly modify the acceleration of individual motors. However, such an approach is not readily transferable between printing devices because, generally speaking, the controls are tied too tightly to the hardware selection and would exacerbate the sim-to-real gap. Moreover, directly controlling the motor accelerations would mean that the control policy needs to learn how to trace print inputs. Instead, a strategy is proposed that leverages the body of work on designing state-of-the-art controllers. Similar to the state-of-the-art, this control policy follows a path generated by a slicer. However, the control policy enables dynamic modification of the path. At each state, the policy can modify two actions: (1) the velocity at which the printing head is moving and (2) a displacement of the printing head in a direction perpendicular to the motion.
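As a non-limiting illustration, applying these two actions to one segment of the sliced path might look as follows (a sketch only; apply_action and the nominal feed rate are assumptions, and the path is taken to be planar):

    import numpy as np

    def apply_action(p_curr, p_next, velocity_scale, offset, base_velocity=1.0):
        """Modify one step of the sliced path: (1) scale the head velocity and
        (2) displace the target point perpendicular to the motion direction."""
        p_curr, p_next = np.asarray(p_curr, float), np.asarray(p_next, float)
        direction = p_next - p_curr
        length = np.linalg.norm(direction)
        if length == 0.0:
            return p_next, base_velocity * velocity_scale
        tangent = direction / length
        normal = np.array([-tangent[1], tangent[0]])   # in-plane perpendicular
        adjusted_target = p_next + offset * normal     # action (2): displacement
        velocity = base_velocity * velocity_scale      # action (1): velocity
        return adjusted_target, velocity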
The transition function takes a state-action pair and outputs a new state of the environment. In the simulation setting, this means that the fabrication process needs to be numerically modeled, which is a notoriously difficult problem. Here, the system leverages the assumption that the observation space is so localized that it can identify the deposited materials only qualitatively. Therefore, the system can trade physical realism for visual fidelity and efficiency. This description fits the Position-Based Dynamics (PBD) (Macklin & Müller, 2013) framework, which is a geometrical approximation to the equations of motion. For numerical details on how to approximate printing materials in the PBD framework, see Appendix C, which is incorporated herein physically and by reference.
Another important choice for the numerical model is the discretization used. There are two options: (1) time-based and (2) distance-based. Time-based discretization was experimented with first. However, it was found that time discretization is not suitable for printer modeling: as the velocity in simulation approaches zero, the difference in deposited material becomes progressively smaller until the gradient information completely vanishes.
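Assuming the distance-based option is the one adopted, a sliced path can be resampled at a fixed arc-length step so that each simulation step covers equal distance regardless of the commanded velocity. The following sketch illustrates this (resample_by_distance is a hypothetical name):

    import numpy as np

    def resample_by_distance(points, step):
        """Resample a polyline so consecutive samples are a fixed arc length
        apart; each simulation step then deposits over equal path length
        regardless of the commanded velocity."""
        points = np.asarray(points, dtype=float)
        seg = np.linalg.norm(np.diff(points, axis=0), axis=1)
        s = np.concatenate([[0.0], np.cumsum(seg)])  # cumulative arc length
        stations = np.arange(0.0, s[-1], step)       # equally spaced stations
        return np.column_stack([np.interp(stations, s, points[:, d])
                                for d in range(points.shape[1])])

    # e.g., resample_by_distance([[0, 0], [1, 0], [1, 1]], step=0.25)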
An interesting design element is the orientation of the control polygons created by the g-code. When the outline is defined as points given counter-clockwise, then, due to the applied rotation, each view is split roughly into two half-spaces.
To design a realistic printing environment, the model needs to capture the deposition imprecision. The source of this imprecision is the complex non-linear coupling between the dynamic material properties and the deposition parameters. Analytical modeling of this coupling is challenging, as it requires deep understanding of the complex coupled interactions. Instead, a data-driven model was adopted. It was observed that the final effect of the deposition error is a varying width of the deposited material.
To recover such a model for our apparatus, a reference slice is printed over multiple iterations and the width variation is measured at specified locations (e.g., cross-sections). This yields observations of how the material width evolves over time. To formulate a predictive generative model, a tool from speech processing called Linear Predictive Coding (LPC) (Marple, 1980) was employed. The model assumes that a signal is generated by a buzz filtered by an auto-correlation filter. This assumption was used to recover filter coefficients that transform white Gaussian noise into realistic pressure samples.
For one example, with reference to
Viscous materials take significant time to settle after deposition. Therefore, to assess deposition errors, the deposition needs to be observed over long horizons. However, the localized nature of the in-situ view makes such observations impossible on the physical hardware. As a result, learning long-horizon planning has infeasible sample complexity. To tackle this issue, the system leverages the fact that the numerical approximation of the deposition process provides access to privileged information. At each simulation step, the model simulates the entire printing bed. This allows the reward function to be formulated as a global print-quality metric. More specifically, the metric is composed of two terms: (1) a reward term for depositing material inside the desired slice and (2) a penalty term for depositing material outside of the slice. To keep the values consistent across slices of varying size, the values were normalized by the length of the outline or by the infill area, respectively. To accelerate the training further, dense rewards were provided as the difference between the metrics evaluated at two subsequent timesteps. For details on how to compute the reward function, see Appendix E, which is incorporated herein physically and by reference.
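By way of illustration, the global metric and the dense reward derived from it can be sketched as follows (a minimal sketch; the function names and the exact normalization pairing are assumptions, since the precise formulation is given in Appendix E):

    import numpy as np

    def global_print_metric(deposited, target, outline_length, infill_area):
        """Global quality metric: reward material inside the desired slice,
        penalize material outside it, with size normalization."""
        inside = np.sum(deposited * target) / max(infill_area, 1e-9)
        outside = np.sum(deposited * (1.0 - target)) / max(outline_length, 1e-9)
        return inside - outside

    def dense_reward(metric_now, metric_previous):
        """Dense reward: difference of the global metric between two
        subsequent timesteps."""
        return metric_now - metric_previous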
This section provides results collected in both virtual and real environments. For details about the training procedure, please see Appendix F, which is incorporated herein physically and by reference. It is first shown that an adaptive policy can outperform state-of-the-art approaches in environments with constant deposition. Next, the in-process monitoring and the ability of our policy to adapt to dynamic environments are showcased. This section concludes by demonstrating that the learned controllers transfer to physical environments with minimal sim-to-real gap.
5.1 Comparison with Reference Control Policy
The optimized control scheme was evaluated on a selection of freeform models and CAD files sampled from the Thingi10K (Zhou & Jacobson, 2016) and ABC (Koch et al., 2019) datasets.
Next, the shapes for which our control policy achieves the highest and the lowest gain, respectively, were investigated. From
Finally, the control policy is compared with the fine-tuned state-of-the-art. The reference control policy uses the same parameters for each slice, yet it is possible that different process parameters are optimal for different slices. To this end, two slices were chosen (a freeform slice of a bird and a CAD slice of a bolt) and their process parameters were optimized using Bayesian optimization.
Our control policy relies on a live view of the deposition system to select the control parameters. However, the in-situ view is a technologically challenging addition to the printer hardware that requires a carefully calibrated imaging setup. With this ablation study, we verify how important the individual observations are to the final print quality. Three cases were considered: (1) no printing bed view, (2) no target view, and (3) no future path view. The performance of each case on the evaluation dataset is reported in the supplementary material below. The results were analyzed from the pre-test (full observation space M=9.74, SD=4.92) and the post-tests (no canvas M=8.75, SD=5.70, no target M=7.16, SD=5.45, no path M=8.42, SD=4.79) printing task using paired t-tests with Holm-Bonferroni correction. The analysis indicates that the availability of all three inputs, namely the printing bed, the target, and the path, resulted in an improvement in final printouts (P values<0.01 for all three cases).
To verify that our policy can adapt to printing artifacts, control policies were trained in the noisy environments for three materials of varying viscosity.
The infill policy was also evaluated in a noisy environment. As shown in
We present what is believed to be the first closed-loop controller for additive manufacturing guided by an in-situ view, which is also applicable, as discussed herein, to other manufacturing processes using different types of feedback. To learn an effective control policy, we design a custom numerical model of the deposition process where we tackle several challenges. To obtain an efficient approximation of the deposition process, we leverage the limited perception of a printing apparatus and model the deposition only qualitatively. To include non-linear coupling between process parameters and printed materials, we utilize a data-driven predictive model for the deposition width. Finally, to enable long-horizon learning with viscous materials, we use the privileged information generated by our numerical model for reward computation. We demonstrate that our model can be used to train control policies that outperform the state-of-the-art, adapt to materials of varying viscosity, and transfer to the physical apparatus with minimal sim-to-real gap. To showcase our controllers, we fabricate several printouts. We believe that our numerical model can guide future development of closed-loop policies for additive manufacturing. Thanks to its minimal sim-to-real gap, the model democratizes research on learning for additive manufacturing by limiting the need to build specialized hardware.
Appendix H includes some supplemental information including a copy of a draft publication providing additional details of various embodiments, the contents of which are incorporated herein physically and by reference.
Appendix I includes some supplemental information including a copy of an updated publication providing additional details of various embodiments, the contents of which are incorporated herein physically and by reference.
The following appendices are incorporated herein physically and by reference.
To enable real-time control of the printing process, we implemented an in-situ view of the material deposition. Ideally, we would capture a top-down view of the deposited material. Unfortunately, this is not possible, since the material is obstructed by the dispensing nozzle. As a result, the camera has to observe the printing bed from an angle. Since the nozzle would obstruct the view of any single camera, we opted to use two cameras. More specifically, we placed two CMOS cameras (Basler AG, Ahrensburg, Germany) at 45 degrees on each side of the dispensing nozzle, as shown in
The recovered in-situ view is scaled to attain the same universal scene unit size as our control policies are trained in. Since we seek to model the deposition only qualitatively, it is sufficient to rescale the in-situ view to match the scale of the virtual environments. We identify this scaling factor separately for each material. To calibrate a single material, we start by depositing a straight line at maximum velocity. The scaling factor is then the ratio required to match the observed thickness of the line with simulation.
The last assumption of our control policy is that the deposition needle is centered with respect to the in-situ view. To ensure that this assumption holds with the physical hardware, we calibrate the location of the dispensing needle within the field of view of each camera and with respect to the build platform. First, a dial indicator is used to measure the height of the nozzle in z and the fine adjustment stage
To calibrate the reference control, we follow the same procedure in simulation and physical hardware. We start by depositing a line (e.g., a straight line) at a constant velocity. Next, we measure the width of the deposited line at various locations to estimate the mean width. We use the width to generate the offset for outline printing and spacing of the infill pattern.
To model the interaction of the deposited material with the printing apparatus, we rely on Position-Based Dynamics (PBD). PBD approximates rigid, viscous, and fluid objects as collections of particles. To represent the fluid, we assume a set of N particles, where each particle is defined by its position p, velocity v, mass m, and a set of constraints C. In our setting, we consider two constraints: (1) collision with the nozzle and (2) incompressibility of the fluid material. We model the collision with the nozzle as a hard inequality constraint: C(p) = (p − q) · n ≥ 0, where q is the closest point on the nozzle surface and n is the surface normal at q.
We further tune the simulation parameters to achieve a wide range of viscosity properties. More specifically, we couple the effects of viscosity, adhesion, and energy dissipation into a single setting. By coupling these parameters, we obtain materials with optically different viscosity properties. Moreover, we noticed that the number of solving steps has a significant effect on viscosity and surface tension of the simulated fluids. Therefore, we also tweak the number of substeps from 2 for liquid-like materials to 5 for highly-viscous materials.
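For illustration only, a drastically simplified PBD update with the nozzle-collision constraint might look as follows (a sketch under stated assumptions: the nozzle is approximated as a sphere, and the fluid incompressibility constraint is omitted for brevity):

    import numpy as np

    def pbd_step(p, v, dt, nozzle_center, nozzle_radius,
                 n_substeps=5, gravity=-9.8):
        """One Position-Based Dynamics update: predict positions, project the
        nozzle-collision constraint, then recover velocities from positions.
        (The incompressibility constraint is omitted in this sketch.)"""
        h = dt / n_substeps
        for _ in range(n_substeps):
            v[:, 2] += gravity * h            # external forces
            x = p + v * h                     # predicted positions

            # Collision constraint C(x) = |x - c| - r >= 0 for a spherical
            # nozzle approximation: project penetrating particles to surface.
            d = x - nozzle_center
            dist = np.linalg.norm(d, axis=1, keepdims=True)
            inside = (dist < nozzle_radius).ravel()
            x[inside] = nozzle_center + d[inside] / dist[inside] * nozzle_radius

            v = (x - p) / h                   # velocity update
            p = x
        return p, v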
To formulate a predictive generative model, we employ a tool from speech processing called Linear Predictive Coding (LPC) (Marple, 1980). We can predict the next sample of a signal as a weighted sum of M past output samples and a noise term: x[n] = Σ_{i=1..M} a_i · x[n−i] + e[n], where a_i are the filter coefficients and e[n] is white Gaussian noise.
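As an illustrative aid, the coefficients a_i can be fit to the recorded width samples and the fitted filter can then be driven with white Gaussian noise to synthesize new samples (a minimal sketch using an ordinary least-squares fit, which is a simplification of standard LPC estimation; fit_lpc and generate_lpc are hypothetical names):

    import numpy as np

    def fit_lpc(signal, order):
        """Least-squares fit of coefficients a_1..a_M so that
        x[n] ~= sum_i a_i * x[n - i]; returns coefficients and residual std."""
        signal = np.asarray(signal, dtype=float)
        X = np.column_stack([signal[order - i:len(signal) - i]
                             for i in range(1, order + 1)])
        y = signal[order:]
        coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
        return coeffs, np.std(y - X @ coeffs)

    def generate_lpc(coeffs, noise_std, n_samples, seed=0):
        """Drive the fitted auto-regressive filter with white Gaussian noise
        to synthesize new width samples."""
        rng = np.random.default_rng(seed)
        order = len(coeffs)
        x = np.zeros(n_samples + order)
        for n in range(order, n_samples + order):
            past = x[n - order:n][::-1]       # x[n-1], ..., x[n-M]
            x[n] = coeffs @ past + rng.normal(0.0, noise_std)
        return x[order:]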
As depicted in
To print the outline (
For infill printing, we compute the reward from the heightfield of the deposited material. We start by estimating how much of the slice was covered. To this end, we use a thresholded version C of the canvas and compute the coverage as R = Σ C·T, where T is the target region. Similarly, we estimate the amount of over-deposited material as P = Σ C·(1 − T). To keep these values consistent across different slices, we normalize them by the total area of the print. Finally, to motivate deposition of flat surfaces suitable for 3D printing, we add another penalty term as the standard deviation of the canvas heightfield.
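For illustration, this infill reward can be sketched directly from the heightfield (a minimal sketch; infill_reward, the height threshold, and the flatness weight are assumptions):

    import numpy as np

    def infill_reward(canvas, target, height_threshold, w_flat=0.1):
        """Infill reward from the deposited heightfield: coverage of the target
        region, penalty for over-deposition, and a flatness penalty."""
        covered = (canvas > height_threshold).astype(float)  # thresholded canvas
        area = max(target.sum(), 1e-9)                       # total print area
        R = np.sum(covered * target) / area                  # coverage
        P = np.sum(covered * (1.0 - target)) / area          # over-deposition
        flatness = np.std(canvas[target > 0])                # heightfield spread
        return R - P - w_flat * flatness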
To train our control policy, we start with g-code generated by a state-of-the-art slicer. As inputs to the slicer, we consider a set of 3D models collected from the Thingi10K dataset. To train a controller, the input models need to be carefully selected. On the one hand, if we pick an object with features of too low frequency with respect to the printing nozzle size, then any printing errors due to the control policy will have negligible influence on the final result. On the other hand, if we pick a model with features of too high frequency with respect to the printing nozzle, then the nozzle will be physically unable to reproduce these features. As a result, we opted for a manual selection of 18 models that span a wide variety of features.
We adopt the model architecture of Mnih et al. (2015). The network input is an 84×84-pixel image. The image is passed through three hidden convolutional layers with the respective parameters: (32 filters, filter size 8, stride 4), (64 filters, filter size 4, stride 2), and (64 filters, filter size 3, stride 1). The final convolved image is flattened and passed through a fully-connected layer with 512 neurons that is connected to the output action. Each hidden layer uses a rectifier nonlinearity. We formulate our objective function as:
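As an illustrative aid, the described architecture can be sketched in PyTorch as follows (a minimal sketch, not the actual implementation; the continuous two-dimensional output head for velocity and offset is an assumption, and the training objective itself is not reproduced in this sketch):

    import torch
    import torch.nn as nn

    class PolicyNetwork(nn.Module):
        """Conv net following Mnih et al. (2015); the 3 input channels hold
        the in-situ view, the target view, and the future-path view."""

        def __init__(self, action_dim=2):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),   # 84 -> 20
                nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),  # 20 -> 9
                nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),  # 9  -> 7
                nn.Flatten(),
            )
            self.head = nn.Sequential(
                nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
                nn.Linear(512, action_dim),  # velocity and offset actions
            )

        def forward(self, x):
            return self.head(self.features(x))

    # Sanity check: a batch of one 3-channel 84x84 observation.
    net = PolicyNetwork()
    print(net(torch.zeros(1, 3, 84, 84)).shape)  # torch.Size([1, 2])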
We also experimented with training controllers for materials with varying viscosity.
While the state-of-the-art reference policy closely follows the printed boundaries, it is possible that there is a more suitable policy to maximize our objective function. To verify this, we use the environment described in Section 4 to search for a velocity and offset that maximizes the reward function. More specifically, we optimize a simplified objective of Equation 8 limited to a single shape:
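By way of illustration, such a two-parameter search can be sketched with Bayesian optimization via scikit-optimize (a sketch only; evaluate_print is a hypothetical stand-in for one simulated print of the chosen slice with a dummy reward landscape, and the parameter bounds are assumptions):

    from skopt import gp_minimize
    from skopt.space import Real

    def evaluate_print(params):
        velocity, offset = params
        # Placeholder for one simulated print of the chosen slice;
        # gp_minimize minimizes, so return the negated reward.
        reward = -(velocity - 0.6) ** 2 - (offset - 0.1) ** 2  # dummy landscape
        return -reward

    result = gp_minimize(
        evaluate_print,
        dimensions=[Real(0.1, 1.0, name="velocity"),
                    Real(-0.5, 0.5, name="offset")],
        n_calls=30,
        random_state=0,
    )
    print("best velocity, offset:", result.x)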
This Appendix provides some supplemental information including the contents of the draft publication entitled CLOSED-LOOP CONTROL OF ADDITIVE MANUFACTURING VIA REINFORCEMENT LEARNING providing additional details of various embodiments, which was incorporated physically and by reference in PCT Application No. PCT/US2022/045662 and in U.S. Provisional Patent Application No. 63/252,418, from which this patent application claims priority and which are incorporated herein by reference.
The present disclosure relates to the control of manufacturing processes by using a reinforcement learning agent to select process parameters that result in increasing the quality of manufacturing outcomes. The reinforcement learning controller enables a level of control that typical control methods cannot achieve.
Many manufacturing processes rely on selecting and controlling a set of essential process parameters to ensure that the quality of the outcome meets specifications. Traditionally, closed-loop controllers are employed to control the process parameters to ensure that the process stays in a window that produces acceptable results. However, in circumstances where the stochastic error is difficult to model, or deterministic error is difficult to remove, many controllers that are currently used have reduced performance, limiting the ability to control the manufacturing process adequately. This causes reduced efficiency in the manufacturing process and, in some cases, makes the process infeasible.
We propose a novel control system that utilizes a policy network for controlling the manufacturing process. To train the policy network, a learning environment is used that effectively models the relationship between the process parameters and result as well as a reward function that penalizes or rewards the policy depending on how well the policy performed. During training, the policy network selects a set of process parameters to test in the learning environment, and the learning environment then reports back the manufacturing outcome and reward based on those process parameters. The policy network updates during training to maximize the total reward from the environment. After training, the policy network can then be used directly as the controller for the manufacturing process.
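As a non-limiting illustration, this training interaction can be sketched as a simple policy-gradient (REINFORCE-style) loop; the actual training algorithm may differ, and the environment and policy arguments correspond to the illustrative PrintingEnv and PolicyNetwork sketches given earlier in this document:

    import torch

    def train(env, policy, optimizer, episodes=100, horizon=50, sigma=0.1):
        """The policy proposes process parameters, the environment reports the
        outcome and reward, and the policy is updated to maximize reward."""
        for _ in range(episodes):
            state = env.reset()
            log_probs, rewards = [], []
            for _ in range(horizon):
                obs = torch.as_tensor(state).unsqueeze(0)
                mean = policy(obs).squeeze(0)
                dist = torch.distributions.Normal(mean, sigma)
                action = dist.sample()
                log_probs.append(dist.log_prob(action).sum())
                state, reward = env.step(action.numpy())
                rewards.append(reward)
            # REINFORCE-style update on the total episode reward.
            loss = -sum(rewards) * torch.stack(log_probs).sum()
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    # e.g.:
    # policy = PolicyNetwork()
    # train(PrintingEnv(), policy, torch.optim.Adam(policy.parameters(), lr=3e-4))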
This new controller is manufacturing process agnostic and can be used on many different types of manufacturing processes such as additive manufacturing, CNC machining, metal forming and joining, and more. To be adapted to different processes, a new environment that adequately models the different manufacturing process, as well as an altered reward function, are needed. To adequately represent the manufacturing process, the training environment can be a physical or computational environment that models the effects the process parameters have on the outcome. Furthermore, stochastic effects on process parameters can also be included in the environment.
We formulate our reward function as a combination of local and global quality metrics. The local quality metrics ensure that the outcome of the manufacturing process is optimal by rewarding desirable performance metrics. We do this, for example, by rewarding consistent line thickness for additive manufacturing, high quality surface finishes for CNC machining, attaining welds with high penetration and minimal defects for welding, or other appropriate qualitative analysis for a given manufacturing process. The global metrics are designed to evaluate the quality of geometrical reproduction. We do this by estimating the amount of material deposited/removed from desirable and undesirable regions for additive or subtractive manufacturing. A weighted combination of these factors describes how close we are to matching a target outcome and how well the next layer can be processed.
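For illustration, the weighted combination can be sketched as follows (a trivial sketch; the function name and weights are assumptions):

    def combined_reward(local_metrics, global_metrics,
                        local_weights, global_weights):
        """Weighted combination of local quality metrics (e.g., line-thickness
        consistency) and global geometry-reproduction metrics (e.g., material
        in desired vs. undesired regions)."""
        local = sum(w * m for w, m in zip(local_weights, local_metrics))
        glob = sum(w * m for w, m in zip(global_weights, global_metrics))
        return local + glob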
Thus, embodiments can be applied to a wide range of manufacturing processes using any of a wide variety of feedback mechanisms. Manufacturing systems of various embodiments can be generalized as comprising a tool configured to interact with or produce a product, at least one sensor that provides sensor information on the quality of the operation of the tool relative to the product, and a controller configured to control operation of the tool based on a predetermined manufacturing process and further configured to dynamically adjust at least one parameter of the predetermined manufacturing process and hence to dynamically adjust operation of the tool based on qualitative performance information derived from the sensor information applied as feedback to a closed-loop control policy learned through machine reinforcement learning. Embodiments can be applied to a wide range of manufacturing processes, e.g., without limitation, additive manufacturing processes, subtractive manufacturing processes, automated welding processes, automated cutting processes (e.g., mechanical, laser, waterjet), etc. Also, embodiments can use any of a variety of qualitative feedback mechanisms, e.g., without limitation, camera, 3D laser scanner, coordinate measuring machine, etc. The reinforcement learning/training and process control described herein can be adapted for a particular manufacturing process/system, e.g., rather than depositing test samples and measuring parameters of the deposited materials as in 3D printing embodiments described above, the learning/training and process control might be based on test welds in an automated welding system or on test cuts in an automated cutting system, using transfer and reward functions that are appropriate for the particular manufacturing process. As is generally known, variance and stochastic error make many manufacturing processes difficult to control. With the described methodologies of adding variance to the training environment, many types of manufacturing processes can be controlled even though they have stochastic error.
This Appendix provides some supplemental information including the contents of an updated publication entitled CLOSED-LOOP CONTROL OF DIRECT INK WRITING VIA REINFORCEMENT LEARNING along with its Supplementary Material providing additional details of various embodiments, which was incorporated physically and by reference in PCT Application No. PCT/US2022/045662 and in U.S. Provisional Patent Application No. 63/252,418, from which this patent application claims priority and which are incorporated herein by reference.
It should be noted that headings are used above for convenience and are not to be construed as limiting the present invention in any way.
While various inventive embodiments have been described and illustrated herein, those of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the function and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the inventive embodiments described herein. More generally, those skilled in the art will readily appreciate that all parameters, dimensions, materials, and configurations described herein are meant to be exemplary and that the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the inventive teachings is/are used. Those skilled in the art will recognize or be able to ascertain using no more than routine experimentation, many equivalents to the specific inventive embodiments described herein. It is, therefore, to be understood that the foregoing embodiments are presented by way of example only and that, within the scope of the appended claims and equivalents thereto, inventive embodiments may be practiced otherwise than as specifically described and claimed. Inventive embodiments of the present disclosure are directed to each individual feature, system, article, material, kit, and/or method described herein. In addition, any combination of two or more such features, systems, articles, materials, kits, and/or methods, if such features, systems, articles, materials, kits, and/or methods are not mutually inconsistent, is included within the inventive scope of the present disclosure.
Various inventive concepts may be embodied as one or more methods, of which examples have been provided. The acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include performing some acts simultaneously, even though shown as sequential acts in illustrative embodiments.
All definitions, as defined and used herein, should be understood to control over dictionary definitions, definitions in documents incorporated by reference, and/or ordinary meanings of the defined terms.
The indefinite articles “a” and “an,” as used herein in the specification and in the claims, unless clearly indicated to the contrary, should be understood to mean “at least one.”
The phrase “and/or,” as used herein in the specification and in the claims, should be understood to mean “either or both” of the elements so conjoined, i.e., elements that are conjunctively present in some cases and disjunctively present in other cases. Multiple elements listed with “and/or” should be construed in the same fashion, i.e., “one or more” of the elements so conjoined. Other elements may optionally be present other than the elements specifically identified by the “and/or” clause, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, a reference to “A and/or B”, when used in conjunction with open-ended language such as “comprising” can refer, in one embodiment, to A only (optionally including elements other than B); in another embodiment, to B only (optionally including elements other than A); in yet another embodiment, to both A and B (optionally including other elements); etc.
As used herein in the specification and in the claims, “or” should be understood to have the same meaning as “and/or” as defined above. For example, when separating items in a list, “or” or “and/or” shall be interpreted as being inclusive, i.e., the inclusion of at least one, but also including more than one, of a number or list of elements, and, optionally, additional unlisted items. Only terms clearly indicated to the contrary, such as “only one of” or “exactly one of,” or, when used in the claims, “consisting of,” will refer to the inclusion of exactly one element of a number or list of elements. In general, the term “or” as used herein shall only be interpreted as indicating exclusive alternatives (i.e., “one or the other but not both”) when preceded by terms of exclusivity, such as “either,” “one of,” “only one of,” or “exactly one of.” “Consisting essentially of,” when used in the claims, shall have its ordinary meaning as used in the field of patent law.
As used herein in the specification and in the claims, the phrase “at least one,” in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements and not excluding any combinations of elements in the list of elements. This definition also allows that elements may optionally be present other than the elements specifically identified within the list of elements to which the phrase “at least one” refers, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, “at least one of A and B” (or, equivalently, “at least one of A or B,” or, equivalently “at least one of A and/or B”) can refer, in one embodiment, to at least one, optionally including more than one, A, with no B present (and optionally including elements other than B); in another embodiment, to at least one, optionally including more than one, B, with no A present (and optionally including elements other than A); in yet another embodiment, to at least one, optionally including more than one, A, and at least one, optionally including more than one, B (and optionally including other elements); etc.
In the claims, as well as in the specification above, all transitional phrases such as “comprising,” “including,” “carrying,” “having,” “containing,” “involving,” “holding,” “composed of,” and the like are to be understood to be open-ended, i.e., to mean including but not limited to. Only the transitional phrases “consisting of” and “consisting essentially of” shall be closed or semi-closed transitional phrases, respectively, as set forth in the United States Patent Office Manual of Patent Examining Procedures, Section 2111.03.
Although the above discussion discloses various exemplary embodiments of the invention, it should be apparent that those skilled in the art can make various modifications that will achieve some of the advantages of the invention without departing from the true scope of the invention. Any references to the “invention” are intended to refer to exemplary embodiments of the invention and should not be construed to refer to all embodiments of the invention unless the context otherwise requires. The described embodiments are to be considered in all respects only as illustrative and not restrictive.
This patent application is a continuation of PCT Application No. PCT/US2022/045662 entitled LEARNING CLOSED-LOOP CONTROL POLICIES FOR MANUFACTURING filed Oct. 4, 2022, which claims the benefit of U.S. Provisional Patent Application No. 63/252,418 entitled LEARNING CLOSED-LOOP CONTROL POLICIES FOR MANUFACTURING filed Oct. 5, 2021, each of which is hereby incorporated herein by reference in its entirety.
This invention was made with Government support under Grant Nos. IIS1815585 and IIS1815070 awarded by the National Science Foundation. The Government has certain rights in the invention.
Number | Date | Country
--- | --- | ---
63252418 | Oct 2021 | US

  | Number | Date | Country
--- | --- | --- | ---
Parent | PCT/US22/45662 | Oct 2022 | WO
Child | 18626653 | | US