The present disclosure relates to a technique for learning a film formation condition by machine learning.
In recent years, in order to manufacture a cutting tool having high wear resistance, a hard film of TiN, TiAlN, CrN, or the like has been formed by physical vapor deposition (PVD) on a base material that is to become the cutting tool (e.g., Patent Literature 1). In order to manufacture a tool having high wear resistance, the film formation condition must be determined appropriately.
However, film formation conditions have conventionally been determined relying on many years of experience by a skilled technician. Therefore, it has been difficult to easily determine an appropriate film formation condition.
The present invention has been made to solve such a problem, and an object of the present invention is to provide a machine learning method and the like that can easily determine an appropriate film formation condition.
In recent years, various services related to machine learning including deep learning have been provided on a cloud, and users can easily use the services. The present inventor has found that an appropriate film formation condition can be easily determined by performing machine learning on the film formation condition and a physical quantity related to the performance evaluation of film formation, and has thereby conceived the present invention.
A machine learning method according to one aspect of the present invention is a machine learning method in which a machine learning device determines a film formation condition of a film forming device that forms a film on a workpiece that is a base material, the film forming device including a vacuum evacuation system that evacuates a chamber, a heating and cooling system that heats and cools the chamber, an evaporation source system that evaporates a target, a table system on which a workpiece is placed, a process gas system that introduces a process gas into the chamber, and an etching system, the machine learning method including: acquiring a state variable including at least one physical quantity related to performance evaluation of film formation and at least one film formation condition; calculating a reward for a determination result of the at least one film formation condition based on the state variable; updating, based on the reward, a function for determining the at least one film formation condition from the state variable; and determining a film formation condition under which the reward is obtained most by repeating update of the function, in which the at least one film formation condition is at least one of a first parameter related to the vacuum evacuation system, a second parameter related to the heating and cooling system, a third parameter related to the evaporation source system, a fourth parameter related to the table system, and a fifth parameter related to the process gas system, and the at least one physical quantity is at least one of a film quality characteristic, a mechanical characteristic, and a physical characteristic that are related to the film.
A machine learning device according to another aspect of the present invention is a machine learning device that determines a film formation condition of a film forming device that forms a film on a workpiece that is a base material, the film forming device including a vacuum evacuation system that evacuates a chamber, a heating and cooling system that heats and cools the chamber, an evaporation source system that evaporates a target, a table system on which a workpiece is placed, a process gas system that introduces a process gas into the chamber, and an etching system, the machine learning device including: a state acquisition unit that acquires a state variable including at least one physical quantity related to performance evaluation of film formation and at least one film formation condition; a reward calculation unit that calculates a reward for a determination result of the at least one film formation condition based on the state variable; an update unit that updates, based on the reward, a function for determining the at least one film formation condition based on the state variable; and a determination unit that determines a film formation condition under which the reward is obtained most by repeating update of the function, in which the at least one film formation condition is at least one of a first parameter related to the vacuum evacuation system, a second parameter related to the heating and cooling system, a third parameter related to the evaporation source system, a fourth parameter related to the table system, and a fifth parameter related to the process gas system, and the at least one physical quantity is at least one of a film quality characteristic, a mechanical characteristic, and a physical characteristic that are related to the film.
A machine learning program according to yet another aspect of the present invention is a computer-readable machine learning program that causes a computer to function as a machine learning device that determines a film formation condition of a film forming device that forms a film on a workpiece that is a base material, the film forming device including a vacuum evacuation system that evacuates a chamber, a heating and cooling system that heats and cools the chamber, an evaporation source system that evaporates a target, a table system on which a workpiece is placed, a process gas system that introduces a process gas into the chamber, and an etching system, the machine learning program causing a computer to function as: a state acquisition unit that acquires a state variable including at least one physical quantity related to performance evaluation of film formation and at least one film formation condition; a reward calculation unit that calculates a reward for a determination result of the at least one film formation condition based on the state variable; an update unit that updates, based on the reward, a function for determining the at least one film formation condition based on the state variable; and a determination unit that determines a film formation condition under which the reward is obtained most by repeating update of the function, in which the at least one film formation condition is at least one of a first parameter related to the vacuum evacuation system, a second parameter related to the heating and cooling system, a third parameter related to the evaporation source system, a fourth parameter related to the table system, and a fifth parameter related to the process gas system, and the at least one physical quantity is at least one of a film quality characteristic, a mechanical characteristic, and a physical characteristic that are related to the film.
A communication method according to still another aspect of the present invention is a communication method for a film forming device when machine learning a film formation condition of the film forming device that forms a film on a workpiece that is a base material, the film forming device including a vacuum evacuation system that evacuates a chamber, a heating and cooling system that heats and cools the chamber, an evaporation source system that evaporates a target, a table system on which a workpiece is placed, a process gas system that introduces a process gas into the chamber, an etching system, and a communication unit, the communication method including: observing a state variable including at least one physical quantity related to performance evaluation of film formation after film formation is executed and at least one film formation condition; and transmitting the state variable to a network via the communication unit and receiving at least one machine-learned film formation condition, in which the at least one film formation condition is at least one of a first parameter related to the vacuum evacuation system, a second parameter related to the heating and cooling system, a third parameter related to the evaporation source system, a fourth parameter related to the table system, and a fifth parameter related to the process gas system, and the at least one physical quantity is at least one of a film quality characteristic, a mechanical characteristic, and a physical characteristic that are related to the film.
A film forming device according to still another aspect of the present invention is a film forming device that forms a film on a workpiece that is a base material, the film forming device including: a vacuum evacuation system that evacuates a chamber; a heating and cooling system that heats and cools the chamber; an evaporation source system that evaporates a target; a table system on which a workpiece is placed; a process gas system that introduces a process gas into the chamber; an etching system; a state observation unit that observes a state variable including at least one physical quantity related to performance evaluation of film formation after film formation is executed and at least one film formation condition; and a communication unit that transmits the state variable to a network and receives at least one machine-learned film formation condition, in which the at least one film formation condition is at least one of a first parameter related to the vacuum evacuation system, a second parameter related to the heating and cooling system, a third parameter related to the evaporation source system, a fourth parameter related to the table system, and a fifth parameter related to the process gas system, and the at least one physical quantity is at least one of a film quality characteristic, a mechanical characteristic, and a physical characteristic that are related to the film.
According to the present invention, it is possible to easily determine an appropriate film formation condition for a base material.
Embodiments of the present invention will be described below with reference to the accompanying drawings. Note that the following embodiments are examples embodying the present invention and are not intended to limit the technical scope of the present invention.
The film forming device 30 includes a vacuum evacuation system 510, a heating and cooling system 520, an evaporation source system 530, a table system 540, a process gas system 550, an etching system 560, and a chamber 570.
The vacuum evacuation system 510 includes an evacuation device 511 and evacuates the inside of the chamber 570. The evacuation device 511 includes a pump or the like for evacuating air in the chamber 570.
The heating and cooling system 520 includes a heater power supply unit 521 and a heater 522, and heats a workpiece 545. The heater power supply unit 521 is a power supply circuit that supplies electric power to the heater 522. The heater 522 is provided in the chamber 570 and generates heat by electric power supplied from the heater power supply unit 521. The heating and cooling system 520 cools the workpiece 545 by stopping heat generation of the heater 522.
The evaporation source system 530 is a system that evaporates a target (film formation material). The evaporation source system 530 includes an arc cathode 531 and an arc power supply unit 532. The arc power supply unit 532 is a power supply circuit that supplies a discharge current to the arc cathode 531. The arc cathode 531 includes a target, and generates vacuum arc discharge with the inner wall of the chamber 570 by the electric power supplied from the arc power supply unit 532. When the vacuum arc discharge is started, a molten region called an arc spot having a diameter of several μm is generated on the cathode surface. A high-density current is concentrated in the arc spot, and the cathode surface is instantaneously melted and evaporated. This vacuum arc discharge forms a film on the surface of the workpiece 545.
In the example of
The table system 540 is a rotary table on which the workpiece 545 is mounted. The table system 540 includes a table 541, a table drive unit 542, and a bias power supply unit 543. The table 541 is provided in the chamber 570. The workpiece 545 is placed on the table 541. The table drive unit 542 includes a motor and the like, and rotates the table 541. The bias power supply unit 543 applies a negative potential to the workpiece 545 via the table 541.
The process gas system 550 introduces a process gas for forming a reactive film in the chamber 570.
The etching system 560 includes a discharge power supply unit 561, a pair of filament electrodes 562, and a filament (not illustrated) provided between the pair of filament electrodes 562. The discharge power supply unit 561 is a power supply circuit that supplies a discharge current to the filament via the pair of filament electrodes 562. The etching system 560 generates argon plasma between the arc cathode 531 and the filament and between the inner wall of the chamber 570 and the filament. The surface of the workpiece 545 is cleaned by this generation of the argon plasma. In this cleaning, the arc cathode 531 and the inner wall of the chamber 570 function as an anode, and the filament functions as a cathode.
The chamber 570 is a container that accommodates the workpiece 545. The inside of the chamber 570 is evacuated by the vacuum evacuation system 510 to maintain the vacuum state.
The configuration of each device will be specifically described below. The server 10 includes a processor 100 and a communication unit 101. The processor 100 is a control device including a CPU and the like. The processor 100 includes a reward calculation unit 110, an update unit 120, a determination unit 130, and a learning control unit 140. Each block included in the processor 100 may be implemented by the processor 100 executing a machine learning program for causing a computer to function as the server 10 in the machine learning system, or may be implemented by a dedicated electric circuit.
The reward calculation unit 110 calculates a reward for a determination result of at least one film formation condition based on the state variable observed by a state observation unit 321.
Based on the reward calculated by the reward calculation unit 110, the update unit 120 updates a function for determining at least one film formation condition from the state variable observed by the state observation unit 321. As the function, an action value function described later is adopted.
The determination unit 130 determines at least one film formation condition under which the reward is obtained most by repeating update of the function.
The learning control unit 140 performs overall control of machine learning. The machine learning system of the present embodiment learns film formation conditions by reinforcement learning. The reinforcement learning is a machine learning method in which an agent (action subject) selects a certain action based on an environmental situation, the agent is caused to change the environment based on the selected action, and the agent is given a reward associated with the environmental change, whereby the agent is caused to learn selection of a better action. As the reinforcement learning, Q learning and TD learning can be adopted. In the following description, Q learning will be described as an example. In the present embodiment, the reward calculation unit 110, the update unit 120, the determination unit 130, the learning control unit 140, and the state observation unit 321 described later correspond to agents.
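The update performed in Q learning can be illustrated by the following minimal sketch, which assumes a tabular action value function. The learning rate `alpha`, the discount rate `gamma`, the state labels, and the action names are hypothetical and are not taken from the present embodiment.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate toward the target by the learning rate.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical example: all action values start at 0.0.
q_table = defaultdict(float)
actions = ["raise_bias_voltage", "lower_bias_voltage"]
new_q = q_learning_update(q_table, "s0", "raise_bias_voltage", 1.0, "s1", actions)
```

In this sketch an action corresponds to a change of a film formation condition, and the reward is the value given by the reward calculation unit 110.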
The communication unit 101 includes a communication circuit that connects the server 10 to the network 40. The communication unit 101 receives, via the communication device 20, the state variable observed by the state observation unit 321. The communication unit 101 transmits the film formation condition determined by the determination unit 130 to the film forming device 30 via the communication device 20. In the present embodiment, the communication unit 101 is an example of a state acquisition unit that acquires a state variable.
The communication device 20 includes a transmitter 201 and a receiver 202. The transmitter 201 transmits the state variable transmitted from the film forming device 30 to the server 10, and transmits the film formation condition transmitted from the server 10 to the film forming device 30. The receiver 202 receives the state variable transmitted from the film forming device 30 and receives the film formation condition transmitted from the server 10.
The film forming device 30 includes a communication unit 310, a processor 320, a memory 330, a sensor unit 340, and an input unit 350 in addition to the configuration shown in
The communication unit 310 is a communication circuit for connecting the film forming device 30 to the network 40. The communication unit 310 transmits the state variable observed by the state observation unit 321 to the server 10. The communication unit 310 receives the film formation condition determined by the determination unit 130 of the server 10. The communication unit 310 also receives a film formation execution command, described later, determined by the learning control unit 140.
The processor 320 is a control device including a CPU. The processor 320 includes the state observation unit 321, a film formation execution unit 322, and an input determination unit 323. The communication unit 310 transmits the state variable acquired by the state observation unit 321 to the server 10. Each block included in the processor 320 is implemented by, for example, the CPU executing a machine learning program that causes a computer to function as the film forming device 30 of the machine learning system.
The state observation unit 321 acquires the physical quantity detected by the sensor unit 340 after execution of film formation. The state observation unit 321 observes a state variable including at least one physical quantity related to performance evaluation of film formation after execution of film formation and at least one film formation condition. Specifically, the state observation unit 321 acquires a film formation condition based on the measurement value of the sensor unit 340. The state observation unit 321 acquires a physical quantity based on the measurement value of the sensor unit 340 or the like.
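As one possible illustration, the state variable may be held as a pair of mappings, one for the film formation conditions and one for the physical quantities, and serialized before transmission. The following sketch assumes a JSON encoding; the class name, the field names, and the numerical values are hypothetical.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class StateVariable:
    """Hypothetical container for one observation after film formation."""
    film_formation_conditions: dict = field(default_factory=dict)
    physical_quantities: dict = field(default_factory=dict)

    def to_json(self) -> str:
        # Sorted keys make the payload deterministic and easy to compare.
        return json.dumps(asdict(self), sort_keys=True)


state = StateVariable(
    film_formation_conditions={"bias_voltage_v": -100.0, "gas_pressure_pa": 3.0},
    physical_quantities={"film_thickness_um": 3.2, "hardness_gpa": 30.0},
)
payload = state.to_json()                        # string to be transmitted
restored = StateVariable(**json.loads(payload))  # decoding on the receiving side
```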
The first parameter includes at least one of an evacuation speed, an ultimate pressure, a residual gas type, a residual gas partial pressure, and a P-Q characteristic. The evacuation speed is the speed at which the vacuum evacuation system 510 evacuates the air and residual gas in the chamber 570 and the introduced process gas. The evacuation speed is obtained by, for example, calculation from a performance value of a pump constituting the vacuum evacuation system 510. Alternatively, the evacuation speed may be a measurement value calculated from the measurement value of the pressure sensor and the evacuation time. The ultimate pressure is the pressure in the chamber 570 before the start of the film formation process. The ultimate pressure is obtained by, for example, calculation from a performance value of a pump constituting the vacuum evacuation system 510. Alternatively, the ultimate pressure may be a measurement value of the pressure sensor. The residual gas type is the type of gas remaining in the chamber 570 as an impurity. The residual gas type is, for example, nitrogen, oxygen, moisture, or hydrogen. The residual gas type is determined based on the partial pressure of the residual gas described later. The residual gas partial pressure is the partial pressure of each of a plurality of residual gases remaining in the chamber 570. The residual gas partial pressure is obtained by measurement with a vacuum residual gas monitor such as a quadrupole mass spectrometer. The P-Q characteristic is a characteristic indicating the relationship between the chamber internal pressure (P) and the flow rate (Q). The P-Q characteristic is obtained by calculation from, for example, the flow rate of the gas in the chamber 570 detected by the flow rate sensor and the measurement value of the pressure sensor.
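The calculation of the evacuation speed from the pressure sensor and the evacuation time can be sketched as follows. The sketch assumes a constant pumping speed and neglects outgassing and leaks, in which case the pressure decays exponentially and the speed is S = (V / t) ln(P_start / P_end). The chamber volume and all numerical values are hypothetical and are not specified in the present embodiment.

```python
import math


def effective_evacuation_speed(volume_m3, p_start_pa, p_end_pa, elapsed_s):
    """Estimate the evacuation speed in m^3/s from a pump-down measurement.

    Assumes a constant pumping speed with negligible outgassing and leaks.
    """
    if volume_m3 <= 0 or elapsed_s <= 0:
        raise ValueError("volume and elapsed time must be positive")
    if not (p_start_pa > p_end_pa > 0):
        raise ValueError("pressures must satisfy p_start > p_end > 0")
    return volume_m3 / elapsed_s * math.log(p_start_pa / p_end_pa)


# Hypothetical example: a 1 m^3 chamber pumped from 1000 Pa to 10 Pa in 46 s.
speed = effective_evacuation_speed(1.0, 1000.0, 10.0, 46.0)
```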
The second parameter includes at least one of a heater temperature, a workpiece temperature, a heater temperature rise rate, a workpiece temperature rise rate, a heater output, a heater temperature accuracy, a workpiece temperature accuracy, a heater temperature/workpiece temperature, a heater temperature distribution, a workpiece temperature distribution, a coolant gas type, a coolant gas pressure, and a workpiece cooling rate.
The heater temperature is the temperature of the heater 522. The heater temperature is a measurement value of a temperature sensor (thermocouple), for example. The workpiece temperature is the temperature of the workpiece 545. The workpiece temperature is a measurement value of a temperature sensor provided in the vicinity of the workpiece 545, for example. The heater temperature rise rate is the change rate of the heater temperature when the heater 522 rises in temperature. The heater temperature rise rate is obtained from a time-series change in the heater temperature. The workpiece temperature rise rate is the change rate of the workpiece temperature when the workpiece 545 rises in temperature. The workpiece temperature rise rate is obtained from a time-series change in the workpiece temperature.
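One way to obtain a temperature rise rate from a time-series change is a least-squares slope over the sampled temperatures, as in the following sketch. The use of a least-squares fit, the function name, and the sample values are assumptions made for illustration.

```python
def rate_of_change(times_s, values):
    """Return the least-squares slope of values against times (value units per second)."""
    n = len(times_s)
    if n != len(values) or n < 2:
        raise ValueError("need at least two samples of equal length")
    mean_t = sum(times_s) / n
    mean_v = sum(values) / n
    # Sum of squared deviations of the sampling times.
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("sampling times must not all be identical")
    # Covariance-like term between time and value.
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times_s, values))
    return sxy / sxx


# Hypothetical heater temperatures in degrees Celsius sampled every 10 s.
heater_rise_rate = rate_of_change([0, 10, 20, 30], [20.0, 30.0, 40.0, 50.0])
```

The same function applied to a cooling interval yields a negative slope, whose magnitude may serve as a cooling rate.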
The heater output is the output of the heater 522. The heater output is obtained by calculation from the setting value of the heater power supply unit 521. The heater output may be calculated from measurement values by the sensor of a current value and a voltage value supplied to the heater.
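The calculation of the heater output from the measured current value and voltage value can be sketched as below, assuming a resistive heater and DC or RMS measurement values, so that the output is simply the product of the two. The numerical values are hypothetical.

```python
def heater_output_w(voltage_v, current_a):
    """Return the heater output in watts from measured voltage and current.

    Assumes a resistive load and DC or RMS values (an assumption of this sketch).
    """
    return voltage_v * current_a


# Hypothetical measurement: 200 V and 15 A supplied to the heater.
output_w = heater_output_w(200.0, 15.0)
```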
The heater temperature accuracy is a value indicating variation in heater temperature. The heater temperature accuracy is calculated from a measurement value of past heater temperature. The workpiece temperature accuracy is a value indicating variation in workpiece temperature. The workpiece temperature accuracy is calculated from a measurement value of past workpiece temperature. The heater temperature/workpiece temperature is a response characteristic of the heater 522 with respect to the workpiece 545.
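A value indicating variation in temperature may be computed, for example, as the standard deviation of past measurement values. The following sketch adopts the population standard deviation as that value; this choice and the sample values are assumptions, since the embodiment does not fix a specific statistic.

```python
import statistics


def temperature_accuracy(past_temperatures_c):
    """Return the population standard deviation of past temperature measurements."""
    if len(past_temperatures_c) < 2:
        raise ValueError("need at least two past measurements")
    return statistics.pstdev(past_temperatures_c)


# Hypothetical heater temperatures recorded at the same set point.
accuracy = temperature_accuracy([400.0, 402.0, 398.0, 400.0])
```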
The heater temperature distribution is a temperature distribution of the heater 522. The heater temperature distribution is obtained from measurement values of a plurality of temperature sensors provided around the heater 522. The workpiece temperature distribution is a temperature distribution of the workpiece 545. The workpiece temperature distribution is obtained from measurement values of a plurality of temperature sensors provided around the workpiece 545.
The coolant gas type is information indicating the type of gas for cooling the inside of the chamber 570, and is an input value input in advance. The coolant gas pressure is the pressure of the coolant gas. The coolant gas pressure is a measurement value by a pressure sensor provided in the chamber 570. The workpiece cooling rate is a cooling rate of the workpiece 545. The workpiece cooling rate is obtained from a time-series change in the workpiece temperature detected by a temperature sensor provided in the vicinity of the workpiece 545.
The third parameter includes at least one of a target composition, a target thickness, a target manufacturing method, an arc discharge voltage, an arc discharge current, an evaporation source magnetic field, an evaporation source coil current, and an arc ignition characteristic. The target composition is a composition of a substance constituting the target. The target thickness is the thickness of the target. The target manufacturing method is a manufacturing method of the target. The target composition, the target thickness, and the target manufacturing method are input values input in advance.
The arc discharge voltage is a voltage supplied from the arc power supply unit 532 to the arc cathode 531, and is a measurement value by the sensor. The arc discharge current is a current supplied from the arc power supply unit 532 to the arc cathode 531, and is a measurement value by the sensor.
The evaporation source magnetic field is the position and strength of the magnetic field generated by the permanent magnet included in the evaporation source system 530. The evaporation source magnetic field is an input value input in advance. The evaporation source coil current is a current flowing through the coil included in the evaporation source system 530, and is a measurement value obtained by a sensor. The arc ignition characteristic is the behavior of the voltage and current on the arc surface at the time of arc ignition. The arc ignition characteristic is obtained from measurement values of the arc discharge voltage and the arc discharge current at certain timing.
The fourth parameter includes at least one of a bias voltage, a bias current, the number of times of OL, an OL time change, a bias voltage waveform, a bias current waveform, a workpiece rotation speed, a workpiece shape, a workpiece load amount, a workpiece load method, and a workpiece material.
The bias voltage is a bias voltage supplied to the workpiece 545 by the bias power supply unit 543, and is a measurement value by the sensor. The bias current is a bias current supplied to the workpiece 545 by the bias power supply unit 543, and is a measurement value by the sensor.
The number of times of OL (Over Load) is the number of times of abnormal discharge in the table system or the workpiece, and is a measurement value by the sensor. The OL time change is the number of times of OL per unit time. The bias voltage waveform is a waveform of the bias voltage, and is obtained from a measurement value by the sensor. The bias voltage waveform is a voltage waveform at the time of pulse bias in particular. The bias current waveform is a waveform of the bias current, and is obtained from a measurement value by the sensor. The workpiece rotation speed is the rotation speed per unit time of the workpiece 545, and includes the rotation speed per unit time of the table 541 and the rotation speed per unit time when the workpiece 545 rotates on the table 541. The workpiece rotation speed is a detection value by the sensor, for example. The workpiece shape is a numerical value indicating the shape of the workpiece 545 and is an input value input in advance. The workpiece load amount is a load amount (e.g., weight) of the workpiece 545, and is an input value input in advance. The workpiece load method is a load method of the workpiece 545 with respect to the table 541, and is an input value input in advance. The workpiece material is a material of the workpiece 545, and is an input value input in advance.
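The OL time change, that is, the number of times of OL per unit time, can be derived from the times at which abnormal discharges were detected. The following sketch counts events per fixed window; the window length, the function name, and the event times are hypothetical.

```python
import math


def ol_counts_per_window(event_times_s, duration_s, window_s=1.0):
    """Count OL (abnormal discharge) events in consecutive windows of window_s seconds."""
    n_windows = math.ceil(duration_s / window_s)
    counts = [0] * n_windows
    for t in event_times_s:
        if 0 <= t < duration_s:
            # Clamp the index to guard against floating-point edge cases.
            index = min(int(t // window_s), n_windows - 1)
            counts[index] += 1
    return counts


# Hypothetical OL events during a 4 s interval.
counts = ol_counts_per_window([0.5, 1.2, 1.8, 3.1], duration_s=4.0)
total_ol = sum(counts)  # the number of times of OL over the whole interval
```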
The fifth parameter includes at least one of a gas flow rate, a gas type, and a gas pressure. The gas flow rate is a flow rate of the process gas. The gas type is information indicating the type of process gas. The gas pressure is the pressure of the process gas. These are detection values of sensors, for example.
The sixth parameter includes at least one of a filament heating current, a filament heating voltage, a filament diameter, a discharge current, and a discharge voltage. The filament heating current is a heating current for heating the pair of filament electrodes 562 constituting the etching system 560, and is a measurement value by the sensor. The filament heating voltage is a heating voltage for heating the pair of filament electrodes 562, and is a measurement value by the sensor.
The filament diameter is the diameter of each of the pair of filament electrodes 562, and is an input value input in advance. Alternatively, the filament diameter may be obtained by calculation. The discharge current is a discharge current of the pair of filament electrodes 562, and is a measurement value by the sensor. The discharge voltage is a discharge voltage of the pair of filament electrodes 562, and is a measurement value by the sensor.
The film thickness is the thickness of the film. The surface texture is the form of the surface, including surface roughness. The composition is the composition of the film. The crystal structure is the crystal structure of the film. The film microstructure is used in a general sense and represents a microstructure such as crystal form and orientation. The crystallinity is the proportion of crystalline material in the film. The crystal grain size is the size of a crystal grain. The residual stress is an internal stress of the film.
The film thickness is obtained by a film thickness measuring instrument. The roughness is obtained by a roughness meter. The surface texture is obtained by a microscope or a roughness meter. The composition is obtained by X-ray spectrometry. The crystal structure, the film microstructure, the crystallinity, the crystal grain size, and the residual stress are obtained by X-ray diffractometry or an electron microscope.
The density is the density of the particles constituting the film. The particle amount is the amount of waste contained in the film. The particle size is the size of waste contained in the film. The density is obtained by an X-ray reflection method. The particle amount and the particle size are obtained by a microscope or image processing.
The mechanical characteristic includes at least one of hardness, elastic modulus, wear resistance, an erosion resistance characteristic, a high-temperature strength, and high-temperature creep. The hardness is obtained by a hardness tester or a nanoindenter. The elastic modulus is obtained by a nanoindenter. The wear resistance is obtained by a sliding test or a wear resistance test. The erosion resistance characteristic is a grind amount by sandblasting. The high-temperature strength and the high-temperature creep are obtained by a nanoindenter.
The physical characteristic includes at least one of a friction coefficient, oxidation resistance, adhesion, and thermal conductivity. The friction coefficient is obtained by a sliding test. The oxidation resistance is obtained by X-ray analysis or composition analysis. The adhesion is obtained by an indentation method or a scratch test. The thermal conductivity is obtained by thermal conductivity measurement.
Referring back to
In a case where whether or not the film forming device 30 is in a mass production process is determined manually, the input determination unit 323 determines that the film forming device 30 is in the mass production process when data indicating the mass production process is input to the input unit 350. While in the mass production process, the film forming device 30 does not perform machine learning.
The memory 330 is, for example, a nonvolatile storage device, and stores a finally determined optimal film formation condition and the like. The sensor unit 340 is various sensors used for measurement of the film formation condition shown in
In step S2, the learning control unit 140 determines at least one film formation condition and a setting value for the film formation condition. Here, the film formation condition to be set is a film formation condition other than the film formation condition described as Input among the film formation conditions listed in
Specifically, the learning control unit 140 randomly selects a setting value for each film formation condition to be set. Here, the setting value is randomly selected from a predetermined range for each film formation condition.
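The random selection of a setting value from a predetermined range can be sketched as follows. The sketch assumes uniform sampling over continuous ranges; the condition names and the ranges themselves are hypothetical.

```python
import random


def select_setting_values(ranges, rng=None):
    """Pick one setting value uniformly at random from each predetermined range."""
    if rng is None:
        rng = random.Random()
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


# Hypothetical predetermined ranges for three film formation conditions.
ranges = {
    "bias_voltage_v": (-200.0, -30.0),
    "arc_discharge_current_a": (80.0, 200.0),
    "gas_pressure_pa": (0.5, 5.0),
}
settings = select_setting_values(ranges, random.Random(0))  # seeded for repeatability
```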
In step S3, by transmitting a film formation execution command to the film forming device 30, the learning control unit 140 causes the film forming device 30 to start a film formation operation. When the film formation execution command is received by the communication unit 310, the film formation execution unit 322 sets the film formation condition in accordance with the film formation execution command and starts the film formation operation. The film formation execution command includes an input value of the film formation condition having been set in step S1 and a setting value of the film formation condition having been determined in step S2.
When the film formation operation ends, the state observation unit 321 observes the state variable (step S4). Specifically, the state observation unit 321 acquires, as state variables, a physical quantity related to the film formation evaluation described in
In step S5, the determination unit 130 evaluates the physical quantity. Here, the determination unit 130 evaluates the physical quantity by determining whether or not the physical quantity to be evaluated (hereinafter referred to as a target physical quantity) among the physical quantities acquired in step S4 has reached a predetermined reference value. The target physical quantity is one or a plurality of physical quantities among the physical quantities listed in
When determining that the target physical quantity has reached the reference value (YES in step S6), the determination unit 130 outputs the film formation condition set in step S2 as a final film formation condition (step S7). On the other hand, when determining that the target physical quantity has not reached the reference value (NO in step S6), the determination unit 130 proceeds with the processing to step S8. Note that in a case where there are a plurality of target physical quantities, the determination unit 130 is only required to determine YES in step S6 if all the target physical quantities have reached their respective reference values.
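As a non-limiting illustration, the determination of step S6 for a plurality of target physical quantities may be sketched as follows. The sketch assumes that "reached" means the measured value is equal to or greater than the reference value, which the disclosure does not state; the quantity names and values are hypothetical.

```python
def reached_reference(measured, reference):
    """Return True only if every target physical quantity has reached
    its reference value (assumed here to mean measured >= reference)."""
    return all(measured[name] >= ref for name, ref in reference.items())


# Hypothetical reference values and measurements.
reference = {"hardness_GPa": 30.0, "adhesion_N": 60.0}
all_reached = reached_reference({"hardness_GPa": 31.5, "adhesion_N": 62.0}, reference)
one_missing = reached_reference({"hardness_GPa": 31.5, "adhesion_N": 55.0}, reference)
```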
In step S8, the reward calculation unit 110 determines whether or not the target physical quantity is close to the reference value. If the target physical quantity is close to the reference value (YES in step S8), the reward calculation unit 110 increases the reward for the agent (step S9). On the other hand, if the target physical quantity is not close to the reference value (NO in step S8), the reward calculation unit 110 decreases the reward for the agent (step S10). In this case, the reward calculation unit 110 is only required to increase or decrease the reward in accordance with a predetermined increase or decrease value of the reward. Note that in a case where there are a plurality of target physical quantities, the reward calculation unit 110 is only required to perform the determination in step S8 for each of the plurality of target physical quantities. In this case, the reward calculation unit 110 is only required to increase or decrease the reward for each of the plurality of target physical quantities based on the determination result of step S8. In addition, a different value may be adopted as the increase or decrease value of the reward in accordance with the target physical quantity.
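As a non-limiting illustration, steps S8 to S10 may be sketched as follows. The closeness criterion (an absolute tolerance around the reference value), the quantity names, and all numeric values are assumptions introduced for the sketch; the disclosure specifies only that the reward is increased or decreased by a predetermined value, which may differ for each target physical quantity.

```python
def update_reward(reward, measured, reference, tolerance, step):
    """Increase or decrease the reward once per target physical quantity.

    A quantity is treated as close when it lies within `tolerance` of its
    reference value (an assumption); `step` holds the predetermined
    increase or decrease value for each quantity.
    """
    for name, ref in reference.items():
        close = abs(measured[name] - ref) <= tolerance[name]
        reward += step[name] if close else -step[name]
    return reward


# Hypothetical values: hardness is close to its reference, adhesion is not.
reward = update_reward(
    reward=0.0,
    measured={"hardness_GPa": 28.0, "adhesion_N": 40.0},
    reference={"hardness_GPa": 30.0, "adhesion_N": 60.0},
    tolerance={"hardness_GPa": 3.0, "adhesion_N": 5.0},
    step={"hardness_GPa": 1.0, "adhesion_N": 0.5},
)
```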
In step S11, the update unit 120 updates the action value function using the reward given to the agent. The Q-learning adopted in the present embodiment is a method of learning a Q-value Q(s,a), which is the value of selecting an action a under a certain environment state s. Note that the environment state s_t corresponds to the state variable of the above flow. In the Q-learning, the action a with the highest Q(s,a) is selected in the certain environment state s. Various actions a are taken under the certain environment state s by trial and error, and the correct Q(s,a) is learned using the reward obtained at that time. An update expression of the action value function Q(s_t,a_t) is expressed by the following expression (1).
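Expression (1) itself is not reproduced in this text. From the symbol definitions given for it (the learning coefficient α, the discount rate γ, the reward r_{t+1}, and the term with max), it is understood to be the standard Q-learning update, which may be written as follows:

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left( r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right)
\tag{1}
```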
Here, s_t and a_t represent the environment state and the action at time t, respectively. The action a_t changes the environment state to s_{t+1}, and a reward r_{t+1} is calculated from that change. The term with max is the Q-value Q(s_{t+1},a) obtained when the most valuable action a known at that time is selected under the environment state s_{t+1}, multiplied by γ. Here, γ is a discount rate satisfying 0<γ≤1 (normally 0.9 to 0.99), and α is a learning coefficient satisfying 0<α≤1 (normally about 0.1).
In this update expression, if γ·max Q(s_{t+1},a), which is based on the Q-value of the best action in the next environment state s_{t+1} reached by the action a_t, is larger than Q(s_t,a_t), which is the Q-value of the action a_t in the state s_t, then Q(s_t,a_t) is made large. On the other hand, if γ·max Q(s_{t+1},a) is smaller than Q(s_t,a_t), then Q(s_t,a_t) is made small. That is, the value of the certain action a_t in the certain state s_t is made close to the value of the best action in the next state s_{t+1} reached by that action. Due to this, an optimal state for forming a film on the workpiece 545, i.e., at least one optimal film formation condition, is determined.
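As a non-limiting illustration, one update of a tabular action value function may be sketched as follows, using the standard Q-learning rule in which the reward is added to the discounted best next value. The state labels, action labels, and numeric values are hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one Q-learning update to the table `q` and return the new value.

    `q` maps (state, action) pairs to Q-values; unseen pairs default to 0.
    """
    # Value of the best known action in the next state (the "max" term).
    best_next = max(q[(next_state, a)] for a in actions)
    # Move Q(s_t, a_t) toward reward + gamma * best_next by a fraction alpha.
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


q = defaultdict(float)
actions = ["raise_bias", "lower_bias"]  # hypothetical actions
q[("s1", "raise_bias")] = 2.0           # hypothetical prior knowledge
new_value = q_update(q, "s0", "raise_bias", reward=1.0,
                     next_state="s1", actions=actions)
```

With these values, Q("s0", "raise_bias") moves from 0 toward 1.0 + 0.9 × 2.0, by the fraction α = 0.1.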
When the processing of step S11 ends, the processing returns to step S2, and the setting value of the selected film formation condition is changed, or an unselected film formation condition is selected as the next film formation condition, whereby the action value function is similarly updated. Although the update unit 120 updates the action value function, the present invention is not limited thereto, and the update unit 120 may update an action value table.
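As a non-limiting illustration, the repetition of steps S2 to S11 may be sketched as follows. The film forming device is replaced by a toy function that is not based on measured data, a single condition (bias voltage) and a single target physical quantity (hardness) are assumed, and an epsilon-greedy selection is substituted for the selection of step S2 so that learned values influence later choices. All names and numeric values are hypothetical.

```python
import random
from collections import defaultdict

LEVELS = [40, 80, 120, 160]   # hypothetical bias voltage settings (V)
REFERENCE_HARDNESS = 30.0     # hypothetical reference value (GPa)


def run_film_formation(bias_v):
    """Toy stand-in for the film forming device; not real process data."""
    return 32.0 - abs(bias_v - 120) / 10.0


def learn(episodes=200, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)
    state = 0  # index of the setting used in the previous run
    for _ in range(episodes):
        # S2: choose the next setting (epsilon-greedy, an assumption).
        if rng.random() < epsilon:
            action = rng.randrange(len(LEVELS))
        else:
            action = max(range(len(LEVELS)), key=lambda a: q[(state, a)])
        # S3, S4: execute film formation and observe the physical quantity.
        hardness = run_film_formation(LEVELS[action])
        # S5 to S7: output the condition once the reference value is reached.
        if hardness >= REFERENCE_HARDNESS:
            return LEVELS[action], q
        # S8 to S10: reward depends on closeness to the reference value.
        reward = 1.0 if REFERENCE_HARDNESS - hardness <= 5.0 else -1.0
        # S11: update the action value function.
        next_state = action
        best_next = max(q[(next_state, a)] for a in range(len(LEVELS)))
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return None, q  # reference value not reached within the episode budget


final_setting, q_table = learn()
```

In this toy model only the 120 V setting reaches the reference value, so the function returns either that setting or None when the episode budget is exhausted.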
Q(s,a) may be stored in a table format that holds values for all pairs of states and actions (s,a). Alternatively, Q(s,a) may be expressed by an approximation function that approximates the values for all the pairs of states and actions (s,a). This approximation function may include a neural network having a multilayer structure. In this case, the neural network is only required to learn, in real time, data obtained by actually operating the film forming device 30, and to perform online learning that reflects the data in the next action. This achieves deep reinforcement learning.
Conventionally, in a film forming device, development of film formation condition has been performed by changing the film formation condition so as to give a good film. In order to obtain a good film, it is required to find the relationship between the evaluation of the film and the film formation condition. However, since the number of types of film formation conditions is enormous as shown in
Thus, according to the first embodiment, at least one parameter among the first to sixth parameters described above and at least one physical quantity among a film quality characteristic, a mechanical characteristic, and a physical characteristic that are related to performance evaluation of film formation are observed as state variables. Then, the reward for the determination result of the film formation condition is calculated based on the observed state variable, the action value function for determining the film formation condition from the state variable is updated based on the calculated reward, and the film formation condition under which the reward is obtained most is learned by repeating this update. Thus, in the first embodiment, the film formation condition is determined by machine learning without using the above-described physical model. As a result, the first embodiment can easily determine an appropriate film formation condition for the cutting tool.
The film forming device 30 of the second embodiment is a device that forms a decorative film on a workpiece for the purpose of enhancing decorativeness. The workpiece is, for example, a decorative article such as a wristwatch and a necklace, a housing of a mobile phone, a bumper of an automobile, and the like. The decorative film is, for example, TiN, TiAlN, TiCN, CrN, diamond-like carbon (DLC), or the like. The machine learning system of the second embodiment performs machine learning of an appropriate film formation condition related to a decorative film.
Note that in the second embodiment, the same components as those in the first embodiment are given identical reference numerals, and description thereof will be omitted. In the second embodiment, the configuration of the film forming device 30 is the same as that in
The physical quantities are roughly grouped by middle classification. The middle classification includes at least one of a film quality characteristic and a physical characteristic. The film quality characteristic is the same as that in the first embodiment. The physical characteristic includes at least one of adhesion and an optical characteristic. The adhesion indicates the degree of adhesion of the film to the base material, and is obtained by an indentation method or a scratch test. The optical characteristic indicates the color, luster, or texture of the film. The optical characteristic is measured by a spectrophotometric colorimeter.
Thus, according to the second embodiment, at least one parameter among the first to sixth parameters described above and at least one physical quantity of a film quality characteristic and a physical characteristic that are related to performance evaluation of film formation are observed as state variables. Then, the reward for the determination result of the film formation condition is calculated based on the observed state variable, the action value function for determining the film formation condition from the state variable is updated based on the calculated reward, and the film formation condition under which the reward is obtained most is learned by repeating this update. Therefore, in the second embodiment, the film formation condition is determined by machine learning without using the above-described physical model. As a result, the second embodiment can easily determine an appropriate film formation condition for the decorative film.
The film forming device 30 of the third embodiment is a device that forms a protective film on a workpiece. The workpiece is, for example, a cutting tool, a mold for injection molding, a screw, or the like. The protective film is, for example, TiN, TiAlN, TiCN, CrN, or the like. The machine learning system of the third embodiment performs machine learning of an appropriate film formation condition related to a protective film.
Note that in the third embodiment, the same components as those in the first embodiment are given identical reference numerals, and description thereof will be omitted. In the third embodiment, the configuration of the film forming device 30 is the same as that in
The physical characteristic includes at least one of a friction coefficient, oxidation resistance, adhesion, thermal conductivity, cohesion, corrosion resistance, chemical resistance, and surface chemical affinity. The friction coefficient is obtained by a sliding test. The oxidation resistance is obtained by X-ray diffractometry or composition analysis. The adhesion indicates the degree of adhesion of the film to the base material, and is obtained by an indentation method or a scratch test. The thermal conductivity is obtained by thermal conductivity measurement. The cohesion is obtained by a sliding test or microscopic observation. The corrosion resistance indicates difficulty in corrosion of the film, and is obtained by a corrosion solution spray test or an immersion test. The chemical resistance indicates difficulty in corrosion of the film due to chemicals, and is obtained by an application test or an immersion test. The surface chemical affinity indicates chemical affinity between the film surface and an external environmental substance, and is obtained by surface chemical analysis.
The physical quantity shown in
For example, for the film quality characteristic, the physical quantity required is the same for each of wear resistant applications, corrosion resistant applications, and heat resistant applications.
Regarding the wear resistant applications, for example, for the mechanical characteristic, at least one physical quantity is required among all the physical quantities listed in
Regarding the corrosion resistant applications, the mechanical characteristic may be omitted, and for the physical characteristic, a physical quantity of at least one of oxidation resistance, corrosion resistance, chemical resistance, and surface chemical affinity is required. The hyphens in the table of
Regarding the heat resistant applications, for the mechanical characteristic, at least one physical quantity is required among hardness, elastic modulus, high-temperature strength, and high-temperature creep, and for the physical characteristic, at least one physical quantity is required among oxidation resistance, adhesion, and thermal conductivity.
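As a non-limiting illustration, the application-specific requirements stated above for the corrosion resistant applications and the heat resistant applications may be sketched as a lookup table with a check that at least one physical quantity is observed in each required group. The wear resistant applications are omitted because their full list refers to a table not reproduced in this text. The dictionary layout and function name are hypothetical.

```python
# Required physical quantities per application, as stated in the text.
# An empty set means the characteristic may be omitted.
REQUIRED = {
    "corrosion_resistant": {
        "mechanical": set(),
        "physical": {"oxidation resistance", "corrosion resistance",
                     "chemical resistance", "surface chemical affinity"},
    },
    "heat_resistant": {
        "mechanical": {"hardness", "elastic modulus",
                       "high-temperature strength", "high-temperature creep"},
        "physical": {"oxidation resistance", "adhesion", "thermal conductivity"},
    },
}


def covers_application(application, observed):
    """True if at least one quantity of every non-empty group is observed."""
    groups = REQUIRED[application]
    return all(not candidates or bool(candidates & observed)
               for candidates in groups.values())


ok_corrosion = covers_application("corrosion_resistant", {"corrosion resistance"})
ok_heat = covers_application("heat_resistant", {"hardness", "adhesion"})
missing_heat = covers_application("heat_resistant", {"hardness"})
```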
Thus, according to the third embodiment, at least one parameter among the first to sixth parameters described above and at least one physical quantity of a film quality characteristic and a physical characteristic that are related to performance evaluation of film formation are observed as state variables. Then, the reward for the determination result of the film formation condition is calculated based on the observed state variable, the action value function for determining the film formation condition from the state variable is updated based on the calculated reward, and the film formation condition under which the reward is obtained most is learned by repeating this update. Therefore, in the third embodiment, the film formation condition is determined by machine learning without using the above-described physical model. As a result, the third embodiment can easily determine an appropriate film formation condition for the protective film.
The film forming device 30 of the fourth embodiment is a device that forms a sliding film on the surface of a workpiece in order to improve hardness of the workpiece surface. The workpiece is, for example, a sliding component of an engine, a piston, and the like. The sliding film is, for example, TiN, TiAlN, TiCN, CrN, diamond-like carbon (DLC), or the like. The machine learning system of the fourth embodiment performs machine learning of an appropriate film formation condition related to a sliding film.
Note that in the fourth embodiment, the same components as those in the first embodiment are given identical reference numerals, and description thereof will be omitted. In the fourth embodiment, the configuration of the film forming device 30 is the same as that in
The mechanical characteristic includes at least one of hardness, elastic modulus, and wear resistance. The hardness is obtained by a hardness tester or a nanoindenter. The elastic modulus is obtained by a nanoindenter. The wear resistance is obtained by a sliding test or a wear resistance test.
The physical characteristic includes at least one of a friction coefficient, oxidation resistance, adhesion, thermal conductivity, cohesion, corrosion resistance, and surface chemical affinity. The friction coefficient is obtained by a sliding test. The oxidation resistance is obtained by X-ray analysis or composition analysis. The adhesion indicates the degree of adhesion of the film to the base material, and is obtained by an indentation method or a scratch test. The thermal conductivity is obtained by thermal conductivity measurement. The cohesion is obtained by a sliding test or microscopic observation. The corrosion resistance indicates difficulty in corrosion of the film, and is obtained by a corrosion solution spray test or an immersion test. The surface chemical affinity indicates chemical affinity between an external environmental substance and the film surface, and is obtained by surface chemical analysis.
Thus, according to the fourth embodiment, at least one parameter among the first to sixth parameters described above and at least one physical quantity among a film quality characteristic, a mechanical characteristic, and a physical characteristic that are related to performance evaluation of film formation are observed as state variables. Then, the reward for the determination result of the film formation condition is calculated based on the observed state variable, the action value function for determining the film formation condition from the state variable is updated based on the calculated reward, and the film formation condition under which the reward is obtained most is learned by repeating this update. Therefore, in the fourth embodiment, the film formation condition is determined by machine learning without using the above-described physical model. As a result, the fourth embodiment can easily determine an appropriate film formation condition for the sliding film.
Note that the present invention can adopt the following modification.
(1)
Thus, with the machine learning system according to this modification, the optimal film formation condition can be learned by the film forming device 30A alone.
(2) In the above flow, the state variable is observed after the end of the film formation operation, but this is an example and a plurality of state variables may be observed during one film formation operation. For example, when the state variable only includes an instantaneously measurable parameter, a plurality of state variables can be observed during one film formation operation. This shortens learning time.
(3) In the first to fourth embodiments described above, the film forming device 30 is a device that forms a film by the arc ion plating method, but the present invention is not limited thereto, and the film forming device 30 may be a device that forms a film by another physical vapor deposition method such as an evaporation method.
The present embodiment is summarized as follows.
A machine learning method according to one aspect of the present invention is a machine learning method in which a machine learning device determines a film formation condition of a film forming device that forms a film on a workpiece that is a base material, the film forming device including a vacuum evacuation system that evacuates a chamber, a heating and cooling system that heats and cools the chamber, an evaporation source system that evaporates a target, a table system on which a workpiece is placed, a process gas system that introduces a process gas into the chamber, and an etching system, the machine learning method including: acquiring a state variable including at least one physical quantity related to performance evaluation of film formation and at least one film formation condition; calculating a reward for a determination result of the at least one film formation condition based on the state variable; updating, based on the reward, a function for determining the at least one film formation condition from the state variable; and determining a film formation condition under which the reward is obtained most by repeating update of the function, in which the at least one film formation condition is at least one of a first parameter related to the vacuum evacuation system, a second parameter related to the heating and cooling system, a third parameter related to the evaporation source system, a fourth parameter related to the table system, and a fifth parameter related to the process gas system, and the at least one physical quantity is at least one of a film quality characteristic, a mechanical characteristic, and a physical characteristic that are related to the film.
According to the present configuration, at least one film formation condition among the first parameter related to the vacuum evacuation system, the second parameter related to the heating and cooling system, the third parameter related to the evaporation source system, the fourth parameter related to the table system, and the fifth parameter related to the process gas system, and at least one physical quantity among a film quality characteristic, a mechanical characteristic, and a physical characteristic that are related to performance evaluation of film formation are observed as state variables. Then, the reward for the determination result of the film formation condition is calculated based on the observed state variable, the function for determining the film formation condition from the state variable is updated based on the calculated reward, and the film formation condition under which the reward is obtained most is learned by repeating this update. Therefore, the present configuration can easily determine an appropriate film formation condition for the base material.
In the above configuration, the first parameter may be at least one of an evacuation speed, an ultimate pressure, a residual gas type, a residual gas partial pressure, and a P-Q characteristic.
According to the present configuration, since machine learning is performed with at least one of the evacuation speed, the ultimate pressure, the residual gas type, the residual gas partial pressure, and the P-Q characteristic as the film formation condition related to the vacuum evacuation system, an appropriate film formation condition can be determined in consideration of the state of the vacuum evacuation system.
In the above configuration, the second parameter may be at least one of a heater temperature of a heater that constitutes the heating and cooling system, a workpiece temperature that is a temperature of the workpiece, a temperature rise rate of the heater, a temperature rise rate of the workpiece, an output of the heater, a temperature accuracy of the heater, a temperature accuracy of the workpiece, response characteristics of the heater temperature and the workpiece temperature, a temperature distribution of the heater, and a temperature distribution of the workpiece.
According to the present configuration, since machine learning is performed with at least one of the heater temperature, the workpiece temperature, the temperature rise rate of the heater, the temperature rise rate of the workpiece, the output of the heater, the temperature accuracy of the heater, the temperature accuracy of the workpiece, the response characteristic of the heater temperature, the response characteristic of the workpiece temperature, the temperature distribution of the heater, and the temperature distribution of the workpiece as the film formation condition related to the heating and cooling system, an appropriate film formation condition can be determined in consideration of the state of the heating and cooling system.
In the above configuration, the third parameter may be at least one of a composition of the target, a thickness of the target, a manufacturing method of the target, an arc discharge voltage, an arc discharge current, an evaporation source magnetic field, an evaporation source coil current, and an arc ignition characteristic.
According to the present configuration, since machine learning is performed with at least one of the composition of the target, the thickness of the target, the manufacturing method of the target, the arc discharge voltage, the arc discharge current, the evaporation source magnetic field, the evaporation source coil current, and the arc ignition characteristic as the film formation condition related to the evaporation source system, an appropriate film formation condition can be determined in consideration of the state of the evaporation source system.
In the above configuration, the fourth parameter may be at least one of a bias voltage with respect to the workpiece, a bias current with respect to the workpiece, the number of times of abnormal discharge, a time change of the abnormal discharge, a waveform of the bias voltage, a waveform of the bias current, a rotation speed of the workpiece, a shape of the workpiece, a load amount of the workpiece, a load method of the workpiece, and a material of the workpiece.
According to the present configuration, since machine learning is performed with at least one of the bias voltage, the bias current, the number of times of abnormal discharge, the time change of abnormal discharge, the waveform of the bias voltage, the waveform of the bias current, the rotational speed of the workpiece, the shape of the workpiece, the load amount of the workpiece, the load method of the workpiece, and the material of the workpiece as the film formation condition related to the table system, an appropriate film formation condition can be determined in consideration of the state of the table system.
In the above configuration, the fifth parameter may be at least one of a flow rate of the process gas, a type of the process gas, and a pressure of the process gas.
According to the present configuration, since machine learning is performed with at least one of the flow rate of the process gas, the type of the process gas, and the pressure of the process gas as the film formation condition related to the process gas system, an appropriate film formation condition can be determined in consideration of the state of the process gas system.
In the above configuration, the at least one film formation condition may further include a sixth parameter related to the etching system.
According to the present configuration, since machine learning is performed in consideration of the film formation condition related to the etching system, an appropriate film formation condition can be determined in consideration of the state of the etching system.
In the above configuration, the sixth parameter may be at least one of a heating current for heating a filament of the etching system, a heating voltage for heating the filament, a diameter of the filament, a discharge current of the filament, and a discharge voltage of the filament.
According to the present configuration, since machine learning is performed with at least one of the heating current of the filament, the heating voltage of the filament, the diameter of the filament, the discharge current of the filament, and the discharge voltage of the filament as the film formation condition related to the etching system, an appropriate film formation condition can be determined in consideration of the state of the etching system.
In the above configuration, a film for the base material may be any one of a film for a cutting tool that is the base material, a decorative film for decorating the base material, a protective film for protecting the base material, and a sliding film for improving hardness of a sliding member that is the base material.
According to the present configuration, it is possible to determine any one of an appropriate film formation condition of the film for the cutting tool, an appropriate film formation condition of the decorative film, an appropriate film formation condition of the protective film, and an appropriate film formation condition of the sliding film.
In the above configuration, the function may be updated in real time using deep reinforcement learning.
According to this configuration, since the function is updated in real time using the deep reinforcement learning, the function can be updated accurately and quickly.
Each process of the machine learning method described above may be implemented by a machine learning device, or may be implemented as a machine learning program and distributed. The machine learning device may include a server or may include a film forming device.
A communication method according to another aspect of the present invention is a communication method for a film forming device when machine learning a film formation condition of the film forming device that forms a film on a workpiece that is a base material, the film forming device including a vacuum evacuation system that evacuates a chamber, a heating and cooling system that heats and cools the chamber, an evaporation source system that evaporates a target, a table system on which a workpiece is placed, a process gas system that introduces a process gas into the chamber, an etching system, and a communication unit, the communication method including: observing a state variable including at least one physical quantity related to performance evaluation of film formation after film formation is executed and at least one film formation condition; and transmitting the state variable to a network via the communication unit and receiving at least one machine-learned film formation condition, in which the at least one film formation condition is at least one of a first parameter related to the vacuum evacuation system, a second parameter related to the heating and cooling system, a third parameter related to the evaporation source system, a fourth parameter related to the table system, and a fifth parameter related to the process gas system, and the at least one physical quantity is at least one of a film quality characteristic, a mechanical characteristic, and a physical characteristic that are related to the film.
According to the present configuration, information necessary for machine learning of film formation information is provided. Such a communication method can also be implemented in a film forming device.
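As a non-limiting illustration, the data exchanged in the communication method may be sketched as follows. The JSON message layout, the key names, and the example values are assumptions; the disclosure specifies only that the state variable is transmitted and that at least one machine-learned film formation condition is received. No actual network transmission is performed in the sketch.

```python
import json


def build_state_message(conditions, physical_quantities):
    """Serialize an observed state variable for transmission (assumed format)."""
    return json.dumps(
        {
            "film_formation_conditions": conditions,
            "physical_quantities": physical_quantities,
        },
        sort_keys=True,
    )


def parse_condition_reply(payload):
    """Extract the machine-learned film formation conditions from a reply."""
    return json.loads(payload)["film_formation_conditions"]


# Hypothetical observation after one film formation operation.
message = build_state_message(
    conditions={"bias_voltage_V": 80.0},
    physical_quantities={"hardness_GPa": 28.0},
)
# Hypothetical reply from the machine learning device.
learned = parse_condition_reply('{"film_formation_conditions": {"bias_voltage_V": 120.0}}')
```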
Number | Date | Country | Kind |
---|---|---|---|
2019-131218 | Jul 2019 | JP | national |
2019-215681 | Nov 2019 | JP | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2020/027465 | 7/15/2020 | WO |