The present invention relates to a production system for executing a production plan made in a higher-level management controller.
Conventionally, in an assembly line, a large number of components of plural kinds are handled to produce a product.
In this respect, the kind of components necessary to produce one product and the number of the components are included, as product information S0, in the higher-level management controller 200. The higher-level management controller 200 causes plural kinds of components, the number of which is determined in accordance with the product information S0, to be supplied to the cell 400.
Further, Japanese Unexamined Patent Publication (Kokai) No. 2013-016087 discloses that the production planning device improves the productivity based on information regarding the stock of plural kinds of components and the number of components.
In the production system shown in
Specific examples are as follows.
(1) The production capability is remarkably reduced because of a failure in at least one of the machines R1 and R2.
(2) The higher-level management controller 200 has a wide supervision area, but is less responsive. Thus, there is a delay in the supply of components, and accordingly, the machines R1 and R2 enter a standby condition, which causes time loss. In this instance, products cannot be suitably produced.
(3) There is an error in input to the higher-level management controller 200, and an excess or deficiency occurs in supply of components.
When these problems are not rapidly detected, the production efficiency of the production system decreases over time. Note that Japanese Unexamined Patent Publication (Kokai) No. 2013-016087 does not disclose rapid detection of the aforementioned problems.
The present invention was made in light of the circumstances described above and has an object to provide a production system which can rapidly detect, for example, a failure of a machine, to efficiently operate the machine.
To achieve the above object, according to a first aspect of the invention, there is provided a production system which includes at least one cell including a plurality of machines for producing products, and a plurality of machine control devices for controlling the plurality of machines, a cell control device which is communicably connected to the at least one cell, to control the cell, and a higher-level management controller which is communicably connected to the cell control device and which includes product information. The product information includes plural kinds of components to produce each product and the number of each kind of components. The cell control device includes a product information monitoring unit for monitoring the product information, a component supply state monitoring unit for monitoring the plural kinds of components to be supplied to the at least one cell and the number of each kind of components, and a notification unit which transmits a notice to the higher-level management controller when the number of each kind of components, which is monitored by the component supply state monitoring unit, deviates from a predetermined range determined for each kind of components.
According to a second aspect of the invention, in the production system according to the first aspect of the invention, the cell control device includes a product monitoring unit for monitoring the number of the products actually produced in the cell. When the number of the products to be produced in the cell, which is determined in accordance with the number of the plural kinds of components and each kind of components, which are monitored by the component supply state monitoring unit, is less than the number of the products which are monitored by the product monitoring unit and which are actually produced in the cell, the notification unit transmits a notice to the higher-level management controller.
According to a third aspect of the invention, in the production system according to the first aspect of the invention, the cell control device includes a product monitoring unit for monitoring the number of the products actually produced in the cell. When the number of the products to be produced in the cell, which is determined in accordance with the number of the plural kinds of components and each kind of components, which are monitored by the component supply state monitoring unit, is equal to the number of the products which are monitored by the product monitoring unit and which are actually produced in the cell, and the number of the products to be produced in the cell is less than the desired number of the products, the notification unit transmits a notice to the higher-level management controller.
According to a fourth aspect of the invention, in the production system according to any of the first to third aspects of the invention, the production system includes a machine learning device for learning production data of the production system. The machine learning device includes a state quantity observation unit for observing the state quantity of the production system, an operation result acquisition unit for acquiring a production result of each product in the production system, a learning unit which receives an output from the state quantity observation unit and an output from the operation result acquisition unit, to learn the production data in association with the state quantity of the production system and the production result, and a decision-making unit which outputs production data with reference to the production data learned by the machine learning device.
According to a fifth aspect of the invention, in the production system according to the fourth aspect of the invention, the cell control device includes a product monitoring unit for monitoring the number of the products actually produced in the cell. The state quantity observed by the state quantity observation unit includes at least one of the desired number of products, the product information monitored by the product information monitoring unit, the number of the plural kinds of components and each kind of components monitored by the component supply state monitoring unit, the number of the products which are monitored by the product monitoring unit and which are actually produced, and settings for the plurality of machines included in the cell.
According to a sixth aspect of the invention, in the production system according to the fourth or fifth aspect of the invention, the production data output by the decision-making unit includes at least one of the number of each kind of components to be supplied to the at least one cell and the settings for the plurality of machines included in the at least one cell.
According to a seventh aspect of the invention, in the machine learning device according to the fourth aspect of the invention, the machine learning device includes a learning model for learning production data, an error calculation unit for calculating an error between the production result acquired by the operation result acquisition unit and a predetermined target, and a learning model update unit for updating the learning model in accordance with the error.
According to an eighth aspect of the invention, in the machine learning device according to the fourth aspect of the invention, the machine learning device has a value function for determining the value of production data. The machine learning device further includes a reward calculation unit which provides a plus reward in accordance with a difference between the production result acquired by the operation result acquisition unit and a predetermined target when the difference is small, and provides a minus reward in accordance with the difference when the difference is large, and a value function update unit for updating the value function in accordance with the reward.
These objects, features, and advantages of the present invention and other objects, features, and advantages will become further clearer from the detailed description of typical embodiments illustrated in the appended drawings.
Embodiments of the present invention will be described below with reference to the accompanying drawings. In the following figures, similar members are designated with the same reference numerals. These figures are properly modified in scale to assist the understanding thereof.
The cell 40 is a set of a plurality of machines for performing predetermined operations. Examples of the machines R1 and R2 include machine tools, articulated robots, parallel link robots, manufacturing machines, industrial machines, etc. The machines may be comprised of the same kind of machines, or different kinds of machines. Further, cells 40′ and 40″ having similar configurations are connected to the cell control device 30.
In
Note that, in the present invention, the cells 40, 40′, and 40″ can be installed in, for example, a factory for manufacturing products, whereas the cell control device 30 and the higher-level management controller 20 can be installed in, for example, a building different from the factory. In this instance, the cell control device 30 and the machine control devices RC1 and RC2 can be connected via a network, such as an intranet (first communication unit 41). The higher-level management controller 20 can be installed in, for example, an office away from the factory. In this instance, the higher-level management controller 20 can be communicably connected to the cell control device 30 via a network, such as the Internet (second communication unit 42). However, this is merely an example. Any communication unit, which communicably connects the cell control device 30 and the machine control devices RC1 and RC2, can be adopted as the first communication unit 41. Any communication unit, which can communicably connect the cell control device 30 and the higher-level management controller 20, can be adopted as the second communication unit 42.
The higher-level management controller 20 is, for example, a personal computer, and functions as a production planning device which makes a production plan for the system 10 and transmits the same to the cell control device 30. As shown in
An operator uses, for example, an input unit to input the desired number N0 of products to the higher-level management controller 20. The higher-level management controller 20 controls the supply of plural kinds of the components A to C to the cells 40, 40′, and 40″ based on the feedback from the cell control device 30 and the desired number N0 of products. Note that the product information S0 may include the desired number N0 of products.
The cell control device 30 is configured to control the cells 40, 40′, and 40″. Specifically, the cell control device 30 can transmit plural kinds of commands to the machine control devices RC1 and RC2, or can acquire data regarding, for example, the operating condition of the machines R1 and R2, from the machine control devices RC1 and RC2.
As shown in
First, in step S11, the product information monitoring unit 31 of the cell control device 30 acquires the product information S0 and the desired number N0 of products in the higher-level management controller 20. Subsequently, in step S12, the component supply state monitoring unit 32 of the cell control device 30 monitors the supply state of the components A to C. In other words, the component supply state monitoring unit 32 acquires plural kinds of the components A to C to be supplied to the cell 40 and the numbers NA1, NB1, and NC1 of the components A to C.
Subsequently, in step S13, whether each of the components A to C is appropriately supplied to the cell 40 is determined. For each of the components A to C, the maximum number and the minimum number of the components to be appropriately processed in the cell 40 are set. In step S13, whether the numbers NA1 to NC1 of the components A to C remain between the corresponding maximum and minimum numbers is determined.
When, for example, the number NA1 of the component A is greater than the corresponding maximum number, or is less than the corresponding minimum number, the process shifts to step S15. In step S15, it is determined that the number of the supplied components A is excessive or insufficient, and the notification unit 34 transmits this state to the higher-level management controller 20. The other components B and C are processed in a similar manner.
As described above, in order to produce one product, all the plural kinds of the components A to C are necessary. Thus, when it is determined that the number of at least one kind of components among the plural kinds of the components A to C is excessive or insufficient, it is not possible to successfully produce the products, and the notification unit 34 transmits this information to the higher-level management controller 20.
In such a case, the higher-level management controller 20 causes the number of whichever of the components A to C is excessive or insufficient to be decreased or increased by, for example, a predetermined number. Thus, the production system 10 can be efficiently operated.
Note that, when it is determined in step S13 that the numbers NA1 to NC1 of the components A to C remain between the corresponding maximum numbers and the corresponding minimum numbers, it can be determined that products can be appropriately produced using the components A to C. Thus, in this instance, the process shifts to step S14, to continue producing products.
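The range check of steps S13 to S15 can be illustrated with a minimal sketch. The function name, the dictionary layout, and all numeric values below are assumptions made for illustration only; the text specifies only that each kind of component has its own maximum and minimum number.

```python
from typing import Dict, List, Tuple


def find_out_of_range(counts: Dict[str, int],
                      limits: Dict[str, Tuple[int, int]]) -> List[str]:
    """Return the component kinds whose supplied count lies outside [min, max]."""
    deviating = []
    for kind, number in counts.items():
        minimum, maximum = limits[kind]
        if number < minimum or number > maximum:
            deviating.append(kind)
    return deviating


# Hypothetical supplied numbers NA1, NB1, NC1 and per-kind (min, max) limits.
counts = {"A": 120, "B": 40, "C": 75}
limits = {"A": (50, 100), "B": (50, 100), "C": (50, 100)}

# A non-empty list corresponds to the branch to step S15 (notify the
# higher-level management controller); an empty list corresponds to step S14.
deviating = find_out_of_range(counts, limits)
```

In this made-up example the component A exceeds its maximum and the component B falls below its minimum, so both would be reported.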
Subsequently, in step S16, the number N1 of products to be produced in the cell 40 is calculated. The number N1 of products to be produced in the cell 40 is determined in accordance with the product information S0 acquired in step S11 and the numbers NA1 to NC1 of the components A to C acquired in step S12.
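The text does not give the formula used in step S16. One plausible reading, shown here purely as an assumption, is that the number N1 is limited by the scarcest component: for each kind, the supplied number is divided by the number required per product, and the smallest quotient is taken. The names and values are hypothetical.

```python
from typing import Dict


def producible_count(per_product: Dict[str, int], supplied: Dict[str, int]) -> int:
    """Assumed rule for N1: the scarcest component kind limits the output.

    per_product: components of each kind needed for one product (from S0);
                 every value is assumed to be a positive integer.
    supplied:    numbers NA1, NB1, NC1 actually supplied to the cell.
    """
    return min(supplied.get(kind, 0) // need for kind, need in per_product.items())


# Hypothetical product information S0 and supply state.
s0 = {"A": 2, "B": 1, "C": 3}
supply = {"A": 100, "B": 60, "C": 120}
n1 = producible_count(s0, supply)  # A allows 50, B allows 60, C allows 40
```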
Subsequently, in step S17, the product monitoring unit 33 of the cell control device 30 acquires the number N2 of products actually produced in the cell 40. Further, in step S18, whether the number N1 of products to be produced in the cell 40 is less than the number N2 of products actually produced and whether the number N1 of products to be produced in the cell 40 is greater than the number N2 of products actually produced are determined.
When the fact that the number N1 of products to be produced in the cell 40 is less than the number N2 of products actually produced is determined in step S18, the fact that at least one of the machines R1 and R2 in the cell 40 breaks down can be determined. Thus, the notification unit 34 transmits, in step S19, this information to the higher-level management controller 20. Subsequently, the higher-level management controller 20 causes, for example, the number of plural kinds of components to be decreased by the same ratio. This causes the production system 10 to efficiently operate.
Realistically, there is no possibility that the number N1 of products to be produced in the cell 40 is greater than the number N2 of products actually produced. Thus, when the aforementioned fact is determined in step S18, the notification unit 34 transmits the possibility that an abnormality may occur in the cell 40, to the higher-level management controller 20 (step S19).
In the meantime, when the fact that the number N1 of products to be produced in the cell 40 is equal to the number N2 of products actually produced is determined in step S18, the fact that no abnormality occurs in the cell 40 can be determined. In such a case, the desired number N0 of products is acquired in step S20, and whether the number N1 of products to be produced in the cell 40 is less than the desired number N0 of products is determined in step S21. Note that the operation in step S20 can be omitted.
When the number N1 of products to be produced in the cell 40 is equal to the number N2 of products actually produced, but the number N1 of products to be produced in the cell 40 is less than the desired number N0 of products, the fact that the number of the components A to C to be supplied to the cell 40 is small can be determined. This causes the notification unit 34 to transmit this information to the higher-level management controller 20. Subsequently, the higher-level management controller 20 causes the number of plural kinds of the components A to C to be increased by, for example, a predetermined ratio. This causes the production system 10 to efficiently operate.
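The comparisons of steps S18 to S21 can be summarized as a small decision function. This is only a sketch: the branch directions follow the text as written, and the returned labels are hypothetical names chosen for illustration.

```python
def classify_cell_state(n1: int, n2: int, n0: int) -> str:
    """Classify the cell state from N1 (to be produced), N2 (actually
    produced), and N0 (desired), following steps S18 to S21 as described."""
    if n1 < n2:
        # Step S18 -> S19: the text treats this as a machine breakdown.
        return "machine_failure_suspected"
    if n1 > n2:
        # Step S18 -> S19: reported as a possible abnormality in the cell.
        return "abnormality_possible"
    # N1 == N2: no abnormality in the cell; compare with the desired number.
    if n1 < n0:
        # Step S21: too few components are being supplied.
        return "component_shortage"
    return "normal"


state = classify_cell_state(n1=40, n2=40, n0=50)
```

With these made-up values the cell itself is healthy, but the supply is too small to reach the desired number, so a notice asking for more components would be sent.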
As seen above, the cell control device 30 according to the present invention uses the product information monitoring unit 31, the component supply state monitoring unit 32, and the product monitoring unit 33, to acquire various pieces of information from the higher-level management controller 20 and the cell 40. Further, the cell control device 30 determines whether an abnormality occurs, based on various pieces of information, and transmits, when an abnormality occurs, the occurrence of the abnormality to the higher-level management controller 20. This rapidly eliminates the abnormality in the present invention, and accordingly, causes the production system 10 to efficiently operate.
The learning unit 13 of the machine learning device 50 receives an output from the state quantity observation unit 11 for observing the state quantity of the production system 10 and an output (production result of a product) from the operation result acquisition unit 12 for acquiring a processing result in the production system 10, to learn production data in association with the state quantity of the production system 10 and the production result. The decision-making unit 14 decides production data with reference to the production data learned by the learning unit 13, and outputs the same to the cell control device 30.
In this respect, the state quantity observed by the state quantity observation unit 11 includes at least one of the desired number N0 of products, the product information S0 monitored by the product information monitoring unit 31, plural kinds of the components A to C and the numbers NA1 to NC1 of the components, which are monitored by the component supply state monitoring unit 32, the number N2 of products actually produced, which is monitored by the product monitoring unit 33, and settings for the machines R1 and R2 monitored by the product monitoring unit 33. Note that the settings for the machines R1 and R2 include, for example, the operation speed, the acceleration and deceleration, and the times for acceleration and deceleration of the machines R1 and R2.
Further, the production data output by the decision-making unit 14 include the numbers NA2 to NC2 of the plural kinds of components to be supplied to at least one cell 40 and/or the settings for the machines R1 and R2 included in at least one cell 40.
The learning unit 13 includes a learning model for learning different production data. The learning unit 13 includes an error calculation unit 15, which calculates an error between the production result acquired by the operation result acquisition unit 12, e.g., the number of products, the various qualities of products, etc. and a predetermined target, and a learning model update unit 16 for updating the learning model according to the error.
When products are produced based on given production data, if the quality of the products, which is received as one of outputs from the operation result acquisition unit 12, exceeds a predetermined threshold value, the error calculation unit 15 outputs a calculation result indicating that a predetermined error occurs in the production result of the production data. Further, the learning model update unit 16 updates the learning model in accordance with the calculation result.
The reward calculation unit 18 provides a plus reward according to the magnitude of a difference when the difference between the quality of products acquired by the operation result acquisition unit 12 and a target quality is small, and provides a minus reward according to the magnitude of a difference when the difference is large.
In this instance, when products are produced based on given production data, if the quality of the products, which is received as one of outputs from the operation result acquisition unit 12, exceeds a predetermined threshold value, it is preferable that the reward calculation unit 18 provides a predetermined minus reward, and the value function update unit 19 updates a value function according to the predetermined minus reward.
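The text states only that the reward is plus when the difference from the target is small and minus when it is large, each according to the magnitude of the difference. A minimal sketch of one such reward rule follows; the linear shape and the `tolerance` parameter that separates "small" from "large" are assumptions, not part of the description.

```python
def reward(result: float, target: float, tolerance: float) -> float:
    """Assumed reward rule: linear in the distance from the tolerance boundary.

    Differences within the tolerance give a plus reward that grows as the
    difference shrinks; differences beyond it give a minus reward that grows
    in magnitude as the difference grows. Exactly at the boundary it is zero.
    """
    difference = abs(result - target)
    return tolerance - difference


plus = reward(result=9.8, target=10.0, tolerance=0.5)    # small difference
minus = reward(result=12.0, target=10.0, tolerance=0.5)  # large difference
```

A value function update unit could then consume this scalar directly; any other monotone mapping from the difference to a reward would fit the description equally well.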
Finally, a learning method of the machine learning device 50 will be described. The machine learning device 50 has a function for extracting, for example, a useful algorithm, a rule, a knowledge expression, a criterion, etc. in a set of data input thereto by analysis, outputting a determination result, and learning knowledge.
Examples of machine learning include algorithms, such as supervised learning, unsupervised learning, and reinforcement learning. To implement these learning methods, there is a further technique referred to as “deep learning”, which learns the extraction of feature quantities itself.
Supervised learning is a method in which a large volume of input-output (label) paired data are given to the machine learning device 50, so that characteristics of these datasets can be learned, and a model for inferring an output value from input data, i.e., the input-output relation can be inductively acquired. In the supervised learning, input-output paired data appropriate for learning are given, so that learning is relatively easily facilitated.
Unsupervised learning is a method in which a large volume of input-only data are given to a learning apparatus, so that the distribution of the input data can be learned, and learning is performed by a device for, for example, compressing, classifying, and shaping the input data even if the corresponding teacher output data are not given. This method is different from the supervised learning in that “what to be output” is not previously determined. This method is used to extract the essential structure behind the data.
Reinforcement learning is a learning method for learning not only determinations or classifications but also actions, to learn an appropriate action based on the interaction between an action and the environment, i.e., an action to maximize rewards to be obtained in the future. In the reinforcement learning, learning is started from a state where a result of an action is totally unknown or known only incompletely. However, the reinforcement learning can also be started from a starting point having good conditions, i.e., with the state in which pre-learning has been carried out by the supervised learning set as an initial state. The reinforcement learning has the characteristic that an action for discovering unknown learning areas and an action for utilizing known learning areas can be selected with good balance. Thus, there is a possibility that appropriate target production conditions may be further found in condition areas which have been conventionally unknown. Further, outputting of production data causes the temperature etc. of machines or products to change, i.e., an action exerts an effect on the environment. Thus, adopting the reinforcement learning appears to be meaningful.
First, a learning method using supervised learning will be described. In the supervised learning, a pair of input data and output data appropriate for learning is provided, and a function (learning model) for mapping input data and output data corresponding thereto is generated.
An operation of the machine learning apparatus that performs the supervised learning includes two stages, i.e., a learning stage and a prediction stage. At the learning stage, when supervising data including a value of a state variable (explanation variable) used as input data and a value of a target variable used as output data are provided, the machine learning apparatus, which performs the supervised learning, learns outputting of the value of the target variable at the time of inputting of the value of the state variable, and constructs a prediction model for outputting the value of the target variable with respect to the value of the state variable. Then, at the prediction stage, when new input data (state variable) is provided, the machine learning apparatus, which performs the supervised learning, predicts and outputs output data (target variable) according to the learning result (constructed prediction model). In this respect, the result (label) attached data recording unit 17 can hold the result (label) attached data obtained thus far, and provide the result (label) attached data to the error calculation unit 15. Alternatively, the result (label) attached data of the cell control device 30 can be provided to the error calculation unit 15 of the cell control device 30 through a memory card, a communication line, etc.
As an example of learning of the machine learning apparatus that performs the supervised learning, a regression formula of a prediction model similar to, for example, that of following equation (1) is set, and learning proceeds to adjust values of factors a0, a1, a2, a3, . . . so as to obtain a value of a target variable y when values taken by state variables x1, x2, x3, . . . during the learning process are applied to the regression formula. Note that the learning method is not limited to this method, and varies from one supervised learning algorithm to another.
$$y = a_0 + a_1 x_1 + a_2 x_2 + a_3 x_3 + \dots + a_n x_n \qquad (1)$$
As supervised learning algorithms, there are known various methods such as a neural network, a least squares method, and a stepwise method, and any of these supervised learning algorithms may be employed as a method applied to the present invention. Each supervised learning algorithm is known, and accordingly, detailed description thereof is omitted herein.
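As one illustration of how the factors a0, a1, a2, . . . of a linear regression formula can be adjusted, the sketch below fits them by plain gradient descent on the mean squared error. This is merely one of many possible supervised learning procedures and is not stated in the text to be the method used; the function names, learning rate, step count, and training data are all assumptions.

```python
from typing import List, Sequence


def predict(a: Sequence[float], x: Sequence[float]) -> float:
    """Evaluate y = a0 + a1*x1 + ... + an*xn."""
    return a[0] + sum(aj * xj for aj, xj in zip(a[1:], x))


def fit_linear(xs: Sequence[Sequence[float]], ys: Sequence[float],
               lr: float = 0.1, steps: int = 2000) -> List[float]:
    """Adjust the factors a0..an by batch gradient descent."""
    n = len(xs[0])
    m = len(xs)
    a = [0.0] * (n + 1)
    for _ in range(steps):
        grad = [0.0] * (n + 1)
        for x, y in zip(xs, ys):
            err = predict(a, x) - y
            grad[0] += err                 # derivative with respect to a0
            for j in range(n):
                grad[j + 1] += err * x[j]  # derivative with respect to a(j+1)
        a = [aj - lr * g / m for aj, g in zip(a, grad)]
    return a


# Made-up supervising data generated from y = 1 + 2*x1 + 3*x2.
xs = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in xs]
factors = fit_linear(xs, ys)  # learning stage
y_new = predict(factors, (3, 2))  # prediction stage on new input data
```

The two calls at the end correspond to the learning stage and the prediction stage described above.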
Subsequently, a learning method using reinforcement learning will be described. Problems of the reinforcement learning may be set as follows.
As representative reinforcement learning methods, Q learning and TD learning are known. Hereinafter, the case of the Q learning will be described, but a method is not limited to the Q learning.
The Q learning is a method for learning a value Q (s, a) for selecting an action a under a given environment state s. In the state s, an action a of a highest value Q (s, a) may be selected as an optimal action. However, at first, as a correct value of the value Q (s, a) is not known for a combination of the state s with the action a, an agent (action subject) selects various actions a under the state s, and is given rewards for the actions a at the time. This way, the agent selects a better action, in other words, learns a correct value Q (s, a).
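How the agent chooses between trying various actions and exploiting the action of highest known value is not specified in the text. A common choice, shown here only as an assumed example, is an epsilon-greedy rule: with a small probability a random action is tried, and otherwise the action with the highest current value Q (s, a) is selected. The table layout and action names are hypothetical.

```python
import random
from typing import Dict, Hashable, Sequence, Tuple

QTable = Dict[Tuple[Hashable, Hashable], float]


def select_action(q: QTable, state: Hashable, actions: Sequence[Hashable],
                  epsilon: float, rng: random.Random) -> Hashable:
    """Epsilon-greedy selection over a tabular value function."""
    if rng.random() < epsilon:
        # Explore: try an arbitrary action to discover unknown areas.
        return rng.choice(list(actions))
    # Exploit: pick the action with the highest known value (unseen pairs = 0).
    return max(actions, key=lambda a: q.get((state, a), 0.0))


q_table = {("s0", "slow"): 0.2, ("s0", "fast"): 0.7}
greedy = select_action(q_table, "s0", ["slow", "fast"], epsilon=0.0,
                       rng=random.Random(0))
```

With `epsilon` set to zero the rule is purely greedy; a larger value trades known reward for exploration.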
Further, with a view to maximizing the sum of rewards obtained in the future as a result of the action, $Q(s, a) = E[\sum_t \gamma^t r_t]$ is to be finally achieved. E[ ] represents an expected value, t represents time, γ represents a parameter referred to as a discount rate described below, rt represents a reward at the time t, and Σ represents the sum over the time t. The expected value in this formula is taken when a state changes according to the optimal action, and is learned through searching because it is not known. An update formula for such a value Q (s, a) can, for example, be represented by equation (2) described below.
In other words, the value function update unit 19 updates a value function Q (st, at) by using the following equation (2):
$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left( r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right) \qquad (2)$$
where st represents a state of the environment at the time t, and at represents an action at the time t. The action at changes the state to st+1. rt+1 represents a reward that can be obtained via the change of the state. Further, the term with max is the Q value multiplied by γ for the case where the action a with the highest Q value known at that time is selected under the state st+1. γ is a parameter in the range 0<γ≤1, and is referred to as a discount rate. α is a learning factor, which is in the range 0<α≤1.
The equation (2) represents a method for updating an evaluation value Q (st, at) of the action at in the state st on the basis of the reward rt+1 returned as a result of the action at. It indicates that when the sum of the reward rt+1 and an evaluation value Q (st+1, max at+1) of the best action max a in the next state based on the action a is greater than the evaluation value Q (st, at) of the action a in the state s, Q (st, at) is increased, whereas when less, Q (st, at) is decreased. In other words, it is configured such that the value of some action in some state is made to be closer to the reward that instantly comes back as a result and to the value of the best action in the next state based on that action.
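The update just described can be sketched for a tabular value function, assuming equation (2) has the standard Q-learning form in which the current value is moved by the learning factor α toward the reward plus the discounted value of the best next action. The dictionary-based table, the default of 0 for unseen pairs, and the example numbers are assumptions for illustration.

```python
from typing import Dict, Hashable, Sequence, Tuple

QTable = Dict[Tuple[Hashable, Hashable], float]


def q_update(q: QTable, s: Hashable, a: Hashable, r: float, s_next: Hashable,
             actions: Sequence[Hashable],
             alpha: float = 0.5, gamma: float = 0.9) -> float:
    """Apply one Q-learning update to q[(s, a)] and return the new value."""
    # Value of the best known action in the next state (unseen pairs count as 0).
    best_next = max(q.get((s_next, b), 0.0) for b in actions)
    old = q.get((s, a), 0.0)
    # Increase Q when reward + discounted best next value exceeds the old
    # estimate, decrease it otherwise.
    q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return q[(s, a)]


q_table = {("s1", "x"): 2.0}
new_value = q_update(q_table, s="s0", a="x", r=1.0, s_next="s1",
                     actions=["x", "y"])
```

Here the old estimate is 0 and the target is 1.0 + 0.9 × 2.0 = 2.8, so the estimate moves halfway toward the target.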
Methods of representing Q (s, a) on a computer include a method in which the value is retained as an action value table for all state-action pairs (s, a) and a method in which a function approximate to Q (s, a) is prepared. In the latter method, the abovementioned equation (2) can be implemented by adjusting parameters of the approximation function by a technique, such as stochastic gradient descent method. The approximation function may use a neural network.
As described above, as the learning algorithm of the supervised learning or the approximation algorithm of the value function in the reinforcement learning, the neural network can be used. Thus, the machine learning device 50 preferably has the neural network.
where θ is a bias, and fk is an activation function.
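A single neuron of this kind can be sketched as follows, assuming the usual form in which the output is the activation function applied to the weighted sum of the inputs minus the bias θ. The sigmoid activation and the example values are assumptions; the text does not fix a particular activation function.

```python
import math
from typing import Callable, Sequence


def sigmoid(u: float) -> float:
    """A commonly used activation function (an assumed choice here)."""
    return 1.0 / (1.0 + math.exp(-u))


def neuron_output(x: Sequence[float], w: Sequence[float], theta: float,
                  f: Callable[[float], float] = sigmoid) -> float:
    """Compute y = f(sum_i x_i * w_i - theta) for one neuron."""
    weighted_sum = sum(xi * wi for xi, wi in zip(x, w))
    return f(weighted_sum - theta)


y = neuron_output(x=[1.0, 2.0], w=[0.5, 0.25], theta=1.0)
```

In this example the weighted sum equals the bias, so the sigmoid is evaluated at zero.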
As illustrated in
The neurons N11 to N13 output z11 to z13, respectively. In
Finally, the neurons N31 to N33 output results y1 to y3, respectively. An operation of the neural network includes a learning mode and a value prediction mode: in the learning mode, the weight w is learned by using a learning data set, and in the prediction mode, an action of outputting production data is determined by using the learned parameters. Here, the apparatus can be actually operated in the prediction mode to output the production data, instantly learn from the resulting data, and reflect the result in the subsequent action (on-line learning). Alternatively, a group of pre-collected data can be used to perform collective learning, after which the prediction mode is run with the resulting parameters for an extended period (batch learning). An intermediate case is also possible, where a learning mode is introduced each time data is accumulated to a certain degree.
The weights w1 to w3 can be learned by an error backpropagation method. Error information enters from the right side and flows to the left side. The error backpropagation method is a technique for adjusting (learning) each weight so as to minimize a difference between an output y when an input x is input and a true output y (teacher) for each neuron.
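The weight adjustment can be illustrated for the simplest case of a single sigmoid neuron, where error backpropagation reduces to the delta rule. This is a sketch under assumptions (squared-error loss, sigmoid activation, made-up data and learning rate) and does not reproduce the multi-layer network of the description.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(u: float) -> float:
    return 1.0 / (1.0 + math.exp(-u))


def train_step(w: Sequence[float], theta: float, x: Sequence[float],
               target: float, lr: float) -> Tuple[List[float], float, float]:
    """One gradient step for y = sigmoid(sum_i w_i*x_i - theta).

    Minimizes E = 0.5 * (y - target)**2 and returns the updated weights,
    the updated bias, and the output computed before the update.
    """
    u = sum(wi * xi for wi, xi in zip(w, x)) - theta
    y = sigmoid(u)
    delta = (y - target) * y * (1.0 - y)      # dE/du, the propagated error
    new_w = [wi - lr * delta * xi for wi, xi in zip(w, x)]
    new_theta = theta + lr * delta            # because du/dtheta = -1
    return new_w, new_theta, y


weights, bias = [0.0, 0.0], 0.0
x, teacher = [1.0, 0.5], 1.0
first_output = None
for _ in range(200):
    weights, bias, output = train_step(weights, bias, x, teacher, lr=0.5)
    if first_output is None:
        first_output = output
```

In a multi-layer network the same error term is passed backward layer by layer, which is the right-to-left flow of error information mentioned above.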
The number of intermediate layers (hidden layers) of the neural network illustrated in
The application of the reinforcement learning and the supervised learning has been described. However, the machine learning method applied to the present invention is not limited to these methods. Various methods usable in the machine learning device 50, such as “supervised learning”, “unsupervised learning”, “semi-supervised learning”, and “reinforcement learning”, can be applied.
The machine learning device 50 described above performs learning based on the information from the product information monitoring unit 31, the component supply state monitoring unit 32, and the product monitoring unit 33, to estimate the required number of the plural kinds of the components A to C per product to be produced. The number N1 of products which can be produced in the cell 40 is calculated from the estimated values for the components A to C, and then is compared with the number N2 of products actually produced. As in the description above, for example, which one of the machines R1 and R2 breaks down can be estimated.
Further, the machine learning device 50 learns the changes over time in the numbers NA1, NB1, and NC1 of the plural kinds of the components A to C from the component supply state monitoring unit 32 and in the number N2 of products from the product monitoring unit 33, to estimate the state of the cell 40. When the components A to C supplied to the cell 40 are exhausted, it is estimated that production is halted. When the number of the supplied components is sufficient, but the number N2 of products actually produced is small, it can be estimated that, for example, one of the machines R1 and R2 has broken down.
The machine learning device 50 has an excellent real-time property, and a local supervision area, and accordingly, can improve the accuracy of detection of the abnormality described above.
Note that, in the production system 10 in the above embodiments, as shown in
Furthermore, the machine learning device 50 may be located inside or outside the production system 10. Alternatively, a plurality of production systems 10 may share a single machine learning device 50 via communication media. Alternatively, the machine learning device 50 may be located on a cloud server.
Consequently, it is possible to share the learning effect as well as to collectively manage data and perform learning using a large high-performance processor. Thus, the learning speed and learning accuracy can be improved, and more appropriate production data can be output. Further, the time necessary to decide production data to be output can be reduced. A general-purpose computer or processor can be used for these machine learning devices 50. However, when, for example, general-purpose computing on graphics processing units (GPGPU) or large PC clusters are applied, processing can be performed at a higher speed.
Effect of the Invention
In the first aspect of the invention, when the number of each kind of components deviates from a predetermined range, it can be determined that at least one of the plural kinds of components to be supplied to the cell is too much or not enough. Thus, the higher-level management controller receives a notice, and appropriately changes the number of components which are too much or not enough, whereby the production system can be efficiently operated.
In the second aspect of the invention, the number of products to be produced in the cell is determined in accordance with the number of plural kinds of components to be supplied to the cell. When the number of products to be produced in the cell is less than the number of products which are monitored by the product monitoring unit and which are actually produced, it can be determined that at least one of the machines in the cell breaks down. Thus, the higher-level management controller receives this information, and reduces the number of the plural kinds of components by the same ratio, whereby the production system can be efficiently operated.
In the third aspect of the invention, even if the number of products to be produced in the cell is equal to the number of products which are monitored by the product monitoring unit and which are actually produced, when the number of products to be produced in the cell is less than the desired number of products, it can be determined that the number of plural kinds of components to be supplied to the cell is not enough. Thus, the higher-level management controller receives this information, and increases the number of plural kinds of components, whereby the production system can be efficiently operated.
In the fourth to eighth aspects of the invention, the accuracy in detection of an abnormality in the production system can be improved.
The present invention has been described above using exemplary embodiments. However, a person skilled in the art would understand that the aforementioned modifications and various other modifications, omissions, and additions can be made without departing from the scope of the present invention.
| Number | Date | Country | Kind |
|---|---|---|---|
| 2016-082284 | Apr 2016 | JP | national |