MASK LAYOUT DETERMINING MODEL TRAINING METHOD AND APPARATUS, AND MASK LAYOUT DETERMINING METHOD AND APPARATUS

FIELD OF THE TECHNOLOGY

This application relates to the field of integrated circuit (IC) technologies, and in particular, to a mask layout determining model training method and apparatus, and a mask layout determining method and apparatus.

BACKGROUND OF THE DISCLOSURE

In an IC technology, an IC layout needs to be first designed. Then, a lithography machine is invoked to expose the IC layout on a photoresist to obtain a mask layout. Subsequently, the lithography machine is invoked again to expose the mask layout on a wafer to obtain a wafer layout.

The IC layout may alternatively be referred to as a chip layout. As design sizes continue to shrink, the design size of the chip layout is already close to or smaller than the wavelength of the light source used in the lithography process. Consequently, interference and scattering effects become more pronounced, so that the actually formed wafer layout is seriously distorted compared with the mask layout, and the wafer layout differs from the chip layout. Based on this, how to obtain the mask layout is a problem to be solved urgently.

SUMMARY

This disclosure provides a mask layout determining model training method and apparatus, and a mask layout determining method and apparatus. The quality of a mask layout can be improved, so that a wafer layout determined based on the mask layout is highly similar to a chip layout. The technical solution includes the following content.

According to a first aspect, a mask layout determining model training method is provided. The method is performed by an electronic device, and includes:

- obtaining a labeled mask layout and a sample chip layout, where the labeled mask layout is a mask layout of the sample chip layout determined under conditions of a reference lithography process parameter;
- determining a predicted mask layout of the sample chip layout through a neural network model, where the predicted mask layout is a mask layout of the sample chip layout obtained by prediction;
- determining a first wafer layout of the predicted mask layout through a first wafer layout determining model, where the first wafer layout determining model includes an actual lithography process parameter, and the first wafer layout is a wafer layout of the predicted mask layout obtained by prediction; and
- training the neural network model through the labeled mask layout, the predicted mask layout, the sample chip layout, and the first wafer layout, to obtain a mask layout determining model, where the mask layout determining model is configured for determining a target mask layout of a target chip layout.

According to a second aspect, a mask layout determining method is provided. The method is performed by an electronic device, and includes:

- obtaining a target chip layout; and
- determining a target mask layout of the target chip layout through a mask layout determining model, where the mask layout determining model is obtained by training through the mask layout determining model training method according to any content in the first aspect, and the target mask layout is a mask layout of the target chip layout obtained by prediction.

According to a third aspect, a mask layout determining model training apparatus is provided. The apparatus includes:

- an obtaining module, configured to obtain a labeled mask layout and a sample chip layout, where the labeled mask layout is a mask layout of the sample chip layout determined under conditions of a reference lithography process parameter;
- a determining module, configured to determine a predicted mask layout of the sample chip layout through a neural network model, where the predicted mask layout is a mask layout of the sample chip layout obtained by prediction;
- the determining module, further configured to determine a first wafer layout of the predicted mask layout through a first wafer layout determining model, where the first wafer layout determining model includes an actual lithography process parameter, and the first wafer layout is a wafer layout of the predicted mask layout obtained by prediction; and
- a training module, configured to train the neural network model through the labeled mask layout, the predicted mask layout, the sample chip layout, and the first wafer layout, to obtain a mask layout determining model, where the mask layout determining model is configured for determining a target mask layout of a target chip layout.

According to a fourth aspect, a mask layout determining apparatus is provided. The apparatus includes:

- an obtaining module, configured to obtain a target chip layout; and
- a determining module, configured to determine a target mask layout of the target chip layout through a mask layout determining model, where the mask layout determining model is obtained by training through the mask layout determining model training method according to any content in the first aspect, and the target mask layout is a mask layout of the target chip layout obtained by prediction.

According to a fifth aspect, an electronic device is provided. The electronic device includes a processor and a memory. The memory has at least one computer program stored therein. The at least one computer program is loaded and executed by the processor, to cause the electronic device to implement the mask layout determining model training method according to any content in the first aspect or the mask layout determining method according to any content in the second aspect.

According to a sixth aspect, a non-transitory computer-readable storage medium is further provided. The non-transitory computer-readable storage medium has at least one computer program stored therein. The at least one computer program is loaded and executed by a processor, to cause an electronic device to implement the mask layout determining model training method according to any content in the first aspect or the mask layout determining method according to any content in the second aspect.

According to a seventh aspect, a computer program or a computer program product is further provided. The computer program or the computer program product has at least one computer program stored therein. The at least one computer program is loaded and executed by a processor, to cause an electronic device to implement the mask layout determining model training method according to any content in the first aspect or the mask layout determining method according to any content in the second aspect.

According to the technical solution provided in this disclosure, a predicted mask layout of a sample chip layout is determined through a neural network model, and a first wafer layout in which an actual lithography process parameter is considered is determined based on the predicted mask layout. The neural network model is trained through a labeled mask layout, the predicted mask layout, the sample chip layout, and the first wafer layout, to obtain a mask layout determining model, so that a target mask layout determined through the mask layout determining model better conforms to the actual lithography process parameter, and a wafer layout highly similar to a target chip layout can be obtained under the conditions of the actual lithography process parameter, thereby improving the quality of the target mask layout.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of an implementation environment for a mask layout determining model training method or a mask layout determining method according to an exemplary embodiment of this disclosure.

FIG. 2 is a flowchart of a mask layout determining model training method according to an exemplary embodiment of this disclosure.

FIG. 3 is a schematic diagram of edge modification information according to an exemplary embodiment of this disclosure.

FIG. 4 is a flowchart of determining a labeled mask layout according to an exemplary embodiment of this disclosure.

FIG. 5 is a flowchart of a mask layout determining method according to an exemplary embodiment of this disclosure.

FIG. 6 is a schematic diagram of pre-training an initial network model according to an exemplary embodiment of this disclosure.

FIG. 7 is a schematic diagram of optimally training a neural network model according to an exemplary embodiment of this disclosure.

FIG. 8 is a schematic diagram of a mask process window according to an exemplary embodiment of this disclosure.

FIG. 9 is a structural diagram of a mask layout determining model training apparatus according to an exemplary embodiment of this disclosure.

FIG. 10 is a structural diagram of a mask layout determining apparatus according to an exemplary embodiment of this disclosure.

FIG. 11 is a schematic structural diagram of a terminal device according to an exemplary embodiment of this disclosure.

FIG. 12 is a schematic structural diagram of a server according to an exemplary embodiment of this disclosure.

DESCRIPTION OF EMBODIMENTS

To make the objects, technical solutions, and advantages of this disclosure clearer, the following further describes implementations of this disclosure in detail with reference to the accompanying drawings.

FIG. 1 is a schematic diagram of an implementation environment for a mask layout determining model training method or a mask layout determining method according to an exemplary embodiment of this disclosure. As shown in FIG. 1, the implementation environment includes a terminal device 101 and a server 102. The mask layout determining model training method or the mask layout determining method in this exemplary embodiment of this disclosure may be performed by the terminal device 101, or may be performed by the server 102, or may be performed jointly by the terminal device 101 and the server 102. A device for performing the mask layout determining model training method may be the same as or different from a device for performing the mask layout determining method.

The terminal device 101 may be a smartphone, a desktop computer, a tablet computer, a laptop computer, or the like. The server 102 may be one server, a server cluster formed by multiple servers, or either of a cloud computing center and a virtualization center. This is not limited in this exemplary embodiment of this disclosure. The server 102 may communicate with the terminal device 101 through a wired network or a wireless network. The server 102 may have functions such as data processing, data storage, and data transmission and reception. This is not limited in this exemplary embodiment of this disclosure. The quantity of terminal devices 101 and the quantity of servers 102 are not limited, and there may be one or more terminal devices and one or more servers.

Exemplary embodiments of this disclosure may be implemented based on an artificial intelligence (AI) technology.

With the continuous development of computer technology, IC technology also continues to develop. In the field of IC technologies, an IC layout usually needs to be designed. A mask layout is obtained based on the IC layout. A wafer layout is obtained by exposing the mask layout on a wafer.

The IC layout may alternatively be referred to as a chip layout. As design sizes continue to shrink, the design size of the chip layout is already close to or smaller than the wavelength of the light source used in the lithography process. Consequently, interference and scattering effects become more pronounced, so that the actually formed wafer layout is seriously distorted compared with the mask layout, and the wafer layout differs from the chip layout. Based on this, how to improve the quality of the mask layout is a problem to be solved urgently.

This exemplary embodiment of this disclosure provides a mask layout determining model training method. The method may be applied to the foregoing implementation environment. The quality of a mask layout can be improved, so that a wafer layout determined based on the mask layout is highly similar to a chip layout. A flowchart of a mask layout determining model training method according to an exemplary embodiment of this disclosure shown in FIG. 2 is used as an example. For ease of description, the terminal device 101 or the server 102 that performs the mask layout determining model training method in this exemplary embodiment of this disclosure is referred to as an electronic device. The method may be performed by the electronic device. As shown in FIG. 2, the method includes the following operations 201 to 204.

Operation 201: Obtain a labeled mask layout and a sample chip layout.

The labeled mask layout is a mask layout of the sample chip layout determined under conditions of a reference lithography process parameter.

The sample chip layout may be any chip layout. The chip layout is a pattern obtained after layout and connection of multiple circuit devices based on an implementation of a logic circuit. In short, the logic circuit may be first designed. For example, each function that needs to be implemented by the logic circuit is analyzed based on a requirement, and each function is simulated and verified, to obtain the logic circuit. Next, information such as a circuit device type, a quantity of circuit devices, a size of the circuit devices, a relative position between the circuit devices, and a connection relationship between the circuit devices is determined based on the logic circuit, and the circuit devices are laid out and connected according to the information. The obtained pattern is the chip layout.

The mode of obtaining the sample chip layout is not limited in this exemplary embodiment of this disclosure. For example, the electronic device may use any input chip layout as a sample chip layout, or the electronic device may capture a chip layout from a network to obtain the sample chip layout.

A mask layout of the sample chip layout under conditions of a reference lithography process parameter is calculated from the sample chip layout, and this mask layout is recorded as the labeled mask layout. The reference lithography process parameter refers to a set lithography process parameter. The lithography process parameter is a parameter of a lithography process. The lithography process is an operation in a semiconductor device manufacturing process: it mainly delineates a geometric pattern structure on a photoresist layer by exposure and development, and then transfers, by an etching process, the pattern on a mask (that is, the mask layout) to the substrate on which the photoresist layer is located. In this exemplary embodiment of this disclosure, in the process of exposing the labeled mask layout on a wafer by a lithography machine, the lithography machine may perform exposure based on a set lithography process parameter. To be specific, the reference lithography process parameter is the set lithography process parameter according to which the labeled mask layout is exposed on the wafer by the lithography machine.

In an exemplary embodiment, the reference lithography process parameter is determined through a defocus process parameter distribution function and/or an exposure dose deviation process parameter distribution function. For example, a value of an independent variable for the reference lithography process parameter is substituted into the defocus process parameter distribution function, to obtain a first dependent variable value, and the first dependent variable value is used as the reference lithography process parameter. Alternatively, a value of an independent variable for the reference lithography process parameter is substituted into the exposure dose deviation process parameter distribution function, to obtain a second dependent variable value, and the second dependent variable value is used as the reference lithography process parameter. Alternatively, a value of an independent variable for the reference lithography process parameter is substituted into the defocus process parameter distribution function, to obtain a first dependent variable value, the value of the independent variable for the reference lithography process parameter is substituted into the exposure dose deviation process parameter distribution function, to obtain a second dependent variable value, and the first dependent variable value and the second dependent variable value are used as the reference lithography process parameter.

The defocus process parameter distribution function is configured for describing energy distribution of light on a workpiece in a case that a positional relationship between a focus and the workpiece is determined. In this exemplary embodiment of this disclosure, the focus refers to a focus of the lithography machine, and the workpiece includes the wafer. The focus may be above the workpiece, may be inside the workpiece, or may be below the workpiece. The focus being above the workpiece is referred to as a positive defocus, and the focus being inside the workpiece or the focus being below the workpiece is referred to as a negative defocus. The defocus process parameter distribution function is shown in the following formula (1).

$\begin{matrix} \xi(h_{\mu}) = \exp\left(-\frac{(h_{\mu})^{2}}{2(\sigma_{h})^{2}}\right) & \text{Formula (1)} \end{matrix}$

- where ξ(h_μ) represents the defocus process parameter distribution function, exp represents an exponential function with a natural constant e as a base, and h_μ represents a distance between the wafer and a focal plane of the lithography machine. The focus of the lithography machine is located on the focal plane of the lithography machine. σ_h represents distribution function broadening. Values of h_μ and σ_h are not limited in this exemplary embodiment of this disclosure. For example, for the reference lithography process parameter, h_μ=0 nanometers (nm) and σ_h=80. The values of h_μ and σ_h for the reference lithography process parameter are substituted into formula (1), to obtain a value of ξ(h_μ). The value of ξ(h_μ) is the first dependent variable value.

The exposure dose deviation process parameter distribution function is configured for describing an energy distribution deviation of light in different areas of the workpiece. The exposure dose deviation process parameter distribution function is shown in the following formula (2).

$\begin{matrix} \zeta(t_{q}) = \exp\left(-\frac{(t_{q})^{2}}{2(\sigma_{q})^{2}}\right) & \text{Formula (2)} \end{matrix}$

- where ζ(t_q) represents the exposure dose deviation process parameter distribution function, exp represents the exponential function with the natural constant e as the base, t_q represents an exposure dose deviation of the lithography machine, and σ_q represents distribution function broadening. Values of t_q and σ_q are not limited in this exemplary embodiment of this disclosure. For example, for the reference lithography process parameter, t_q=0 and σ_q=0.1. The values of t_q and σ_q for the reference lithography process parameter are substituted into formula (2), to obtain a value of ζ(t_q). The value of ζ(t_q) is the second dependent variable value.
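
Formulas (1) and (2) share the same Gaussian form, so both can be evaluated with a single helper. The following is a minimal sketch rather than part of the disclosed method: the function name `gaussian_weight` is hypothetical, and the parameter values are the example values given above for the reference lithography process parameter.

```python
import math


def gaussian_weight(deviation: float, broadening: float) -> float:
    """Evaluate exp(-deviation^2 / (2 * broadening^2)), the form shared by formulas (1) and (2)."""
    return math.exp(-(deviation ** 2) / (2.0 * broadening ** 2))


# Reference lithography process parameter (example values from the text).
xi_ref = gaussian_weight(0.0, 80.0)   # formula (1): h_mu = 0, sigma_h = 80
zeta_ref = gaussian_weight(0.0, 0.1)  # formula (2): t_q = 0, sigma_q = 0.1

# A deviation equal to one broadening yields exp(-0.5) for either function.
xi_off = gaussian_weight(80.0, 80.0)
```

With zero deviation, both distribution functions take their maximum value of 1, which matches the role of the reference parameter as the nominal process condition.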

In an exemplary implementation, “obtaining a labeled mask layout” in operation 201 includes operations A1 to A3 (not shown).

Operation A1: Generate an initial mask layout based on the sample chip layout.

In some exemplary embodiments, the sample chip layout is linearly modified, to obtain a layout recorded as the initial mask layout. For example, a calculation formula of the initial mask layout is: M̄₀=w₁Z_t+w₂, where M̄₀ represents the initial mask layout, w₁ and w₂ represent linear modification parameters, and Z_t represents the sample chip layout. Values of the linear modification parameters are not limited in this exemplary embodiment of this disclosure. For example, w₁=0.98 and w₂=0.1. A difference between the initial mask layout calculated based on the two values and the sample chip layout is relatively small. Therefore, when a first gradient of a reference loss relative to the initial mask layout is subsequently determined, a value of the first gradient is relatively large, facilitating subsequent gradient descent processing. As a result, when the initial mask layout is adjusted based on the first gradient, an adjustment effect can be improved.
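
The linear modification in operation A1 can be sketched as follows. This is an illustrative example only: the function name `initial_mask` and the 3x3 layout are hypothetical, and the default values of `w1` and `w2` are the example values given above.

```python
def initial_mask(chip_layout, w1=0.98, w2=0.1):
    """Apply the linear modification w1 * Z_t + w2 to every pixel of the chip layout."""
    return [[w1 * pixel + w2 for pixel in row] for row in chip_layout]


# Hypothetical 3x3 binary sample chip layout containing one vertical bar.
z_t = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
m_init = initial_mask(z_t)
```

Each pixel of value 0 maps to 0.1 and each pixel of value 1 maps to 1.08, so the initial mask layout stays close to the sample chip layout.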

Operation A2: Determine a second wafer layout of the initial mask layout through a second wafer layout determining model, where the second wafer layout determining model includes the reference lithography process parameter, and the second wafer layout is a wafer layout of the initial mask layout obtained by prediction under the conditions of the reference lithography process parameter.

In some exemplary embodiments, the initial mask layout is directly inputted to the second wafer layout determining model, and the second wafer layout is outputted through the second wafer layout determining model. In some other exemplary embodiments, the initial mask layout is activated, to obtain a reference mask layout, the reference mask layout is then inputted to the second wafer layout determining model, and the second wafer layout is outputted through the second wafer layout determining model.

In some exemplary embodiments, the initial mask layout is activated according to the following formula (3) to obtain the reference mask layout.

$\begin{matrix} M_{0} = \frac{1}{1 + \exp\left(-\theta_{M} \times (\bar{M}_{0} - M_{th})\right)} & \text{Formula (3)} \end{matrix}$

- where M₀ represents the reference mask layout, exp represents the exponential function with the natural constant e as the base, θ_M and M_th represent activation processing parameters, and M̄₀ represents the initial mask layout. Values of the activation processing parameters are not limited in this exemplary embodiment of this disclosure. For example, θ_M=4 and M_th=0.225.

Formula (3) is a sigmoid function, where the sigmoid function is an activation function. In this exemplary embodiment of this disclosure, the initial mask layout is a binary mask layout (a mask layout including two parts that are transparent and non-transparent), and it is very difficult to perform non-continuous optimization on the binary mask layout. Therefore, by activating the initial mask layout into the reference mask layout through the activation function, the non-continuous optimization performed on the binary mask layout may be converted into continuous optimization performed on the reference mask layout, thereby reducing the difficulty of optimization. Furthermore, the initial mask layout is activated into the reference mask layout through the activation function, so that tiny complex structures (for example, an island structure, a hollow structure, an aliasing structure, and an extension structure) in the initial mask layout may be filtered out, thereby reducing structural complexity of the reference mask layout, and increasing manufacturability of the reference mask layout.
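
The activation in formula (3) can be sketched as below. This is a minimal illustration under stated assumptions: the function name `activate_mask` and the 2x2 input are hypothetical, and the default values of `theta_m` and `m_th` are the example values given above.

```python
import math


def activate_mask(initial_mask, theta_m=4.0, m_th=0.225):
    """Formula (3): map each pixel of the initial mask layout into (0, 1) with a sigmoid."""
    return [
        [1.0 / (1.0 + math.exp(-theta_m * (pixel - m_th))) for pixel in row]
        for row in initial_mask
    ]


# Hypothetical initial mask layout values (for example, after linear modification).
m_init = [[0.1, 1.08], [1.08, 0.1]]
m_ref = activate_mask(m_init)
```

Every output pixel lies strictly between 0 and 1 and varies smoothly with the input, which is what allows gradient-based (continuous) optimization of the reference mask layout.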

In this exemplary embodiment of this disclosure, a model structure, a model size, and the like of the second wafer layout determining model are not limited. For example, the second wafer layout determining model is a Hopkins diffraction lithography physical model based on a coherent imaging system. The model includes multiple kernel functions and an activation function. The kernel functions are obtained by performing singular value decomposition on a transmission cross coefficient (TCC) of the lithography system. In some exemplary embodiments, the lithography system is a 193 nm annular light source system. The roles of the kernel functions and the activation function are described below, and details are not repeated here.

In an exemplary embodiment, operation A2 includes operations A21 to A22 (not shown).

Operation A21: Determine reference light intensity distribution information based on the initial mask layout through the second wafer layout determining model, where the reference light intensity distribution information is a light intensity distribution obtained after the initial mask layout is imaged on a wafer under the conditions of the reference lithography process parameter.

In this exemplary embodiment of this disclosure, the second wafer layout determining model includes multiple kernel functions. The initial mask layout may be convolved with the kernel functions, to obtain the reference light intensity distribution information based on convolution processing results. Alternatively, the initial mask layout may be first activated, to obtain a reference mask layout, and the reference mask layout may be then convolved with the kernel functions, to obtain the reference light intensity distribution information based on convolution processing results. In some exemplary embodiments, the reference light intensity distribution information is shown in the following formula (4).

$\begin{matrix} I(x, y; h_{\mu}) = \sum_{k=1}^{K} \omega_{k}(h_{\mu}) \left| M(x, y) \otimes h_{k}(x, y; h_{\mu}) \right|^{2} & \text{Formula (4)} \end{matrix}$

- where M(x, y) represents the reference mask layout, and x and y are parameters of the reference mask layout. I(x, y; h_μ) represents the reference light intensity distribution information, where h_μ represents the distance between the wafer and the focal plane of the lithography machine. K represents a quantity of kernel functions, where the kernel functions are obtained by performing singular value decomposition on the transmission cross coefficient (TCC) of the lithography system. In an exemplary embodiment, the first 24 kernel functions obtained by performing singular value decomposition on the TCC of the lithography system are used. To be specific, K=24. ω_k(h_μ) represents a weight coefficient of the k-th kernel function, and h_k(x, y; h_μ) represents the k-th kernel function. Σ represents summation, ⊗ represents convolution, and |·| represents the absolute value.

Since the second wafer layout determining model includes the reference lithography process parameter, in formula (4), h_μ=0.
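
The weighted sum of squared convolutions in formula (4) can be sketched in plain Python as follows. This is an illustrative sketch, not the disclosed model: the helper names `convolve2d_same` and `light_intensity` are hypothetical, zero padding at the layout border is an assumption, and the two tiny kernels and their weights are made-up stand-ins for the K=24 kernel functions of the exemplary embodiment.

```python
def convolve2d_same(image, kernel):
    """Zero-padded 2D convolution whose output has the same size as `image`."""
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    r_off, c_off = k_rows // 2, k_cols // 2
    out = [[0j] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            acc = 0j
            for i in range(k_rows):
                for j in range(k_cols):
                    # Convolution flips the kernel relative to the image.
                    xi, yj = x + r_off - i, y + c_off - j
                    if 0 <= xi < rows and 0 <= yj < cols:
                        acc += image[xi][yj] * kernel[i][j]
            out[x][y] = acc
    return out


def light_intensity(mask, kernels, weights):
    """Formula (4): I = sum over k of w_k * |M (*) h_k|^2."""
    rows, cols = len(mask), len(mask[0])
    intensity = [[0.0] * cols for _ in range(rows)]
    for kernel, weight in zip(kernels, weights):
        field = convolve2d_same(mask, kernel)
        for x in range(rows):
            for y in range(cols):
                intensity[x][y] += weight * abs(field[x][y]) ** 2
    return intensity


# Hypothetical inputs: a 3x3 mask with one vertical bar and K = 2 toy kernels.
mask = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
kernels = [
    [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]],    # identity kernel
    [[0.0, 0.0, 0.0], [0.5j, 0.0, 0.5j], [0.0, 0.0, 0.0]],  # complex side lobes
]
weights = [0.8, 0.2]
intensity = light_intensity(mask, kernels, weights)
```

In this toy case the bar itself receives intensity 0.8 from the identity kernel, while the neighboring columns receive 0.05 from the side-lobe kernel, illustrating how light spreads beyond the mask pattern.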

Operation A22: Determine the second wafer layout based on the reference light intensity distribution information through the second wafer layout determining model.

In this exemplary embodiment of this disclosure, the second wafer layout determining model includes an activation function. The reference light intensity distribution information is activated through the activation function, to obtain the second wafer layout. In some exemplary embodiments, the activation function is a sigmoid function. The reference light intensity distribution information is activated through the sigmoid function according to the following formula (5), to obtain the second wafer layout.

$\begin{matrix} Z(x, y; h_{\mu}, t_{q}) = \operatorname{sig}(I; h_{\mu}, t_{q}) = \frac{1}{1 + \exp\left(-\theta_{Z} \times \left(I(x, y; h_{\mu}) - I_{th} / (1 + t_{q})\right)\right)} & \text{Formula (5)} \end{matrix}$

- where Z(x, y; h_μ, t_q) represents the second wafer layout, x and y are parameters of the reference mask layout, h_μ represents the distance between the wafer and the focal plane of the lithography machine, and t_q represents the exposure dose deviation of the lithography machine. sig(I; h_μ, t_q) represents the sigmoid function, where I(x, y; h_μ) represents the reference light intensity distribution information. exp represents the exponential function with the natural constant e as the base. θ_Z and I_th represent the activation processing parameters. For example, θ_Z=50 and I_th=0.225.

Since the second wafer layout determining model includes the reference lithography process parameter, in formula (5), h_μ=0, t_q=0.
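
Formula (5) can be sketched as a pixel-wise sigmoid whose threshold is scaled by the exposure dose deviation. This is a minimal illustration: the function name `wafer_layout` and the sample intensity values are hypothetical, and the default values of `theta_z` and `i_th` are the example values given above.

```python
import math


def wafer_layout(intensity, t_q=0.0, theta_z=50.0, i_th=0.225):
    """Formula (5): sigmoid of the light intensity against the threshold I_th / (1 + t_q)."""
    threshold = i_th / (1.0 + t_q)
    return [
        [1.0 / (1.0 + math.exp(-theta_z * (value - threshold))) for value in row]
        for row in intensity
    ]


intensity = [[0.05, 0.8, 0.05]]  # hypothetical light intensity values
z_ref = wafer_layout(intensity)  # reference parameter: t_q = 0

# A positive dose deviation lowers the threshold, so a borderline pixel prints.
borderline_ref = wafer_layout([[0.21]])[0][0]
borderline_over = wafer_layout([[0.21]], t_q=0.1)[0][0]
```

Because θ_Z is large, the output is close to 0 or 1 everywhere except near the threshold, so the result approximates a binary wafer layout while remaining differentiable.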

Operation A3: Adjust the initial mask layout based on the sample chip layout and the second wafer layout, to obtain the labeled mask layout.

In this exemplary embodiment of this disclosure, the initial mask layout is constantly adjusted through the sample chip layout and the second wafer layout, so that the mask layout is constantly optimized toward a direction in which the wafer layout can increasingly approach the sample chip layout, thereby achieving high accuracy of the labeled mask layout.

In an exemplary embodiment, operation A3 includes operations A31 to A34 (not shown).

Operation A31: Determine a first reference sub-loss based on first difference information between the sample chip layout and the second wafer layout.

In this exemplary embodiment of this disclosure, the first difference information between the sample chip layout and the second wafer layout may be calculated. For example, the first difference information may be calculated based on a difference between a pixel value of a pixel in the sample chip layout and a pixel value of a corresponding pixel in the second wafer layout. For example, a mean square error between the pixel value of the pixel in the sample chip layout and the pixel value of the corresponding pixel in the second wafer layout is used as the first difference information.

For example, the sample chip layout includes at least one geometrical pattern. A sum of perimeters of the geometrical patterns is used as an edge length of the sample chip layout. The first reference sub-loss is determined based on the first difference information and the edge length of the sample chip layout. An error between the sample chip layout and the second wafer layout is described through the first reference sub-loss. In some exemplary embodiments, the first reference sub-loss is determined according to the following formula (6).

$\begin{matrix} L_{\text{Aerial}} = \frac{1}{L} \sum_{x=1}^{N} \sum_{y=1}^{N} \left(Z(x, y; h_{\mu}, t_{q}) - Z_{t}(x, y)\right)^{\gamma} & \text{Formula (6)} \end{matrix}$

- where L_Aerial represents the first reference sub-loss. L represents the edge length of the sample chip layout. Z(x, y; h_μ, t_q)−Z_t(x, y) represents the first difference information. Z_t(x, y) represents the sample chip layout, where x and y represent the parameters of the sample chip layout. N represents a side length of the sample chip layout. To be specific, the sample chip layout is a square with a side length of N. Z(x, y; h_μ, t_q) represents the second wafer layout, h_μ represents the distance between the wafer and the focal plane of the lithography machine, and t_q represents the exposure dose deviation of the lithography machine. γ is an adjustable parameter. For example, γ=2.

A dimension of the edge length of the sample chip layout is a length unit, and a dimension of the first reference sub-loss is also a length unit. Furthermore, the sample chip layout is a square with a side length of N. Therefore, a dimension of the sample chip layout is also a length unit. In other words, the dimension of the edge length of the sample chip layout, the dimension of the first reference sub-loss, and the dimension of the sample chip layout remain the same.
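
The first reference sub-loss in formula (6) can be sketched as follows. This is an illustrative example under stated assumptions: the function name `aerial_loss` and the 3x3 layouts are hypothetical, and the edge length is passed in directly, measured here in pixel edges, rather than being computed from the geometrical patterns.

```python
def aerial_loss(wafer, chip, edge_length, gamma=2):
    """Formula (6): (1 / L) * sum over all pixels of (Z - Z_t) ** gamma."""
    total = 0.0
    for wafer_row, chip_row in zip(wafer, chip):
        for z, z_t in zip(wafer_row, chip_row):
            total += (z - z_t) ** gamma
    return total / edge_length


# Hypothetical sample chip layout (one 1x3-pixel bar) and predicted wafer layout.
chip = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
wafer = [[0.0, 0.9, 0.1], [0.0, 1.0, 0.0], [0.1, 0.9, 0.0]]

# The bar has a perimeter of 2 * (1 + 3) = 8 pixel edges (assumed unit: pixels).
loss = aerial_loss(wafer, chip, edge_length=8)
```

Four pixels each deviate by 0.1, so the summed squared error is 0.04 and the loss is 0.04 / 8 = 0.005. A wafer layout identical to the chip layout gives a loss of 0.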

Operation A32: Determine a reference loss based on the first reference sub-loss.

In an exemplary embodiment, the first reference sub-loss is used as the reference loss. Alternatively, the first reference sub-loss is weighted, to obtain a weighting result used as the reference loss. Alternatively, operation A32 includes operations A321 to A322 (not shown). To be specific, the reference loss is determined based on the first reference sub-loss according to the following operations A321 to A322.

Operation A321: Obtain at least one of a second reference sub-loss, a third reference sub-loss, a fourth reference sub-loss, a fifth reference sub-loss, or a sixth reference sub-loss, where the second reference sub-loss is determined based on the initial mask layout, the third reference sub-loss is determined based on second difference information between the sample chip layout and the initial mask layout, the fourth reference sub-loss is determined based on the first difference information and edge modification information, the fifth reference sub-loss is determined based on the initial mask layout and the edge modification information, the sixth reference sub-loss is determined based on the second difference information and the edge modification information, and the edge modification information is information for modifying an edge of the sample chip layout.

For example, the second reference sub-loss is directly determined based on the initial mask layout. For example, the initial mask layout is first activated, to obtain a reference mask layout, and the second reference sub-loss is then determined based on the reference mask layout. In an exemplary embodiment, a pixel value of the reference mask layout is between 0 and 1. To be specific, the reference mask layout belongs to continuous distribution. A pixel value of the binary mask layout is 0 or 1. To be specific, the binary mask layout belongs to discrete distribution. In some exemplary embodiments, 0 represents being transparent, and 1 represents being non-transparent. It can be seen based on the foregoing content that there is an error between the continuous distribution of the reference mask layout and the discrete distribution of the binary mask layout. The second reference sub-loss may be determined based on the reference mask layout according to the following formula (7), to reflect the error between the continuous distribution of the reference mask layout and the discrete distribution of the binary mask layout through the second reference sub-loss.

$\begin{matrix} R_{D} = \frac{1}{L} \sum_{x = 1}^{N} \sum_{y = 1}^{N} [1 - {(1 - 2 M (x, y))}^{2}] & Formula (7) \end{matrix}$

- where R_D represents the second reference sub-loss. M(x, y) represents the reference mask layout, where x and y represent the parameters of the reference mask layout. The reference mask layout includes at least one geometrical pattern. A sum of perimeters of the geometrical patterns is used as the edge length of the reference mask layout, and L represents the edge length of the reference mask layout. N represents a side length of the reference mask layout. To be specific, the reference mask layout is a square with a side length of N.
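As an illustration of formula (7), the following is a minimal NumPy sketch, not the implementation of this disclosure. The function name `discretization_penalty` and the argument `edge_length` (standing for L) are hypothetical names introduced here. The sketch shows that the second reference sub-loss is zero for a mask whose pixel values are all 0 or 1, and is largest when the pixel values are 0.5.

```python
import numpy as np

def discretization_penalty(M, edge_length):
    """Formula (7): R_D = (1/L) * sum over pixels of [1 - (1 - 2*M)^2].

    M is the reference mask layout with pixel values in [0, 1];
    edge_length stands for L (hypothetical argument name).
    """
    M = np.asarray(M, dtype=float)
    return float(np.sum(1.0 - (1.0 - 2.0 * M) ** 2) / edge_length)

# A fully binary mask contributes nothing to the penalty.
binary_mask = np.array([[0.0, 1.0], [1.0, 0.0]])
# A mask of 0.5 everywhere is as far from binary as possible.
gray_mask = np.full((2, 2), 0.5)

r_d_binary = discretization_penalty(binary_mask, edge_length=4.0)
r_d_gray = discretization_penalty(gray_mask, edge_length=4.0)
```

Here `r_d_binary` evaluates to 0 and `r_d_gray` evaluates to 1, because each of the four gray pixels contributes 1 and the sum is divided by the assumed edge length of 4.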

For example, difference information between the sample chip layout and the initial mask layout is directly calculated. The difference information is recorded as second difference information. The third reference sub-loss is determined based on the second difference information. For example, the initial mask layout is first activated, to obtain a reference mask layout, difference information between the sample chip layout and the reference mask layout is then calculated, the difference information is recorded as second difference information, and the third reference sub-loss is determined based on the second difference information. For example, the difference information between the two layouts (the sample chip layout and the initial mask layout, or the sample chip layout and the reference mask layout) may be calculated based on a difference between pixel values of corresponding pixels in the two layouts. For example, a mean square error between the pixel values of the corresponding pixels in the two layouts is used as the difference information between the two layouts.

Usually, many tiny structures are generated in the process of performing mask optimization based on pixel features. In other words, there is a structural difference between the sample chip layout and the reference mask layout (or the initial mask layout). This structural difference increases the complexity of the reference mask layout, which makes the reference mask layout difficult to manufacture. Therefore, the third reference sub-loss may be determined based on the second difference information, to measure the complexity of the reference mask layout through the third reference sub-loss. In some exemplary embodiments, the third reference sub-loss is determined according to the following formula (8).

$\begin{matrix} R_{TV} = \frac{1}{L} (\left\| \frac{\partial f}{\partial x} \right\|_{1} + \left\| \frac{\partial f}{\partial y} \right\|_{1}) = \frac{1}{L} (\left\| D \oplus f \right\|_{1} + \left\| f \oplus D^{T} \right\|_{1}) & Formula (8) \end{matrix}$

$f = \left| M - Z_{t} \right|$

- where M represents the reference mask layout, Z_t represents the sample chip layout, and f represents the second difference information. R_TV represents the third reference sub-loss. L represents the edge length of the sample chip layout. ∂f/∂x represents a first derivative of the second difference information relative to x, ∂f/∂y represents a first derivative of the second difference information relative to y, and x and y represent the parameters of the sample chip layout or the parameters of the reference mask layout. ∥ ∥₁ represents a 1-norm. D represents a first derivative of the second difference information, T represents a transposed matrix, and ⊕ represents matrix addition.
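The following sketch illustrates the left-hand form of formula (8), assuming that the first derivatives ∂f/∂x and ∂f/∂y are approximated by differences between adjacent pixels. This discretization, the function name `total_variation_penalty`, and the argument `edge_length` (standing for L) are assumptions made for illustration and are not specified in this disclosure.

```python
import numpy as np

def total_variation_penalty(M, Z_t, edge_length):
    """Formula (8): R_TV = (1/L) * (||df/dx||_1 + ||df/dy||_1), f = |M - Z_t|.

    The derivatives are approximated by adjacent-pixel differences,
    which is an assumption of this sketch.
    """
    f = np.abs(np.asarray(M, dtype=float) - np.asarray(Z_t, dtype=float))
    df_dx = np.diff(f, axis=1)  # differences along each row
    df_dy = np.diff(f, axis=0)  # differences along each column
    return float((np.abs(df_dx).sum() + np.abs(df_dy).sum()) / edge_length)

# Sample chip layout (all zeros) and a mask with one isolated extra pixel.
Z_t = np.zeros((3, 3))
M = Z_t.copy()
M[1, 1] = 1.0

r_tv_same = total_variation_penalty(Z_t, Z_t, edge_length=4.0)
r_tv_speck = total_variation_penalty(M, Z_t, edge_length=4.0)
```

A mask identical to the sample chip layout gives a penalty of 0. The isolated pixel produces two unit jumps along its row and two along its column, so `r_tv_speck` is 4 divided by the assumed edge length of 4, which matches the idea that tiny structures raise the third reference sub-loss.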

In this exemplary embodiment of this disclosure, the edge of the sample chip layout may be modified, to obtain edge modification information. In some exemplary embodiments, the edge of the sample chip layout is modified according to an edge modification distance and an edge modification amplitude, to obtain the edge modification information. The edge modification information includes multiple pieces of modification information, and any piece of modification information includes position information of multiple pixels. A distance between two adjacent pieces of modification information is the edge modification distance, and an amplitude of any piece of modification information is the edge modification amplitude. For example, the edge modification distance is 40 nm, and the edge modification amplitude is 30 nm.

Refer to FIG. 3. FIG. 3 is a schematic diagram of edge modification information according to an exemplary embodiment of this disclosure. A thick line in FIG. 3 represents the edge of the sample chip layout, and a thin line on the edge represents the modification information. Any piece of modification information includes position information of each pixel on the corresponding thin line. The edge modification distance is a distance between two adjacent pieces of modification information, i.e. a distance between two adjacent thin lines. The edge modification amplitude is an amplitude of any piece of modification information, i.e. a length of any thin line.

By modifying the edge of the sample chip layout, noise is introduced into the sample chip layout, and the complexity of the sample chip layout is reduced, so that the mask layout and the wafer layout can more easily approach the sample chip layout, thereby helping to improve generalization capability and accuracy of the mask layout determining model.

For example, the fourth reference sub-loss is determined based on the first difference information between the sample chip layout and the second wafer layout and the edge modification information, and the error between the sample chip layout and the second wafer layout in a case that noise exists in the sample chip layout is described through the fourth reference sub-loss. In some exemplary embodiments, the fourth reference sub-loss is determined according to the following formula (9).

$\begin{matrix} L_{Aerial - MEPE} = \frac{1}{L} \sum_{x = 1}^{N} \sum_{y = 1}^{N} MEPE ⊙ {(Z (x, y; h_{μ}, t_{q}) - Z_{t} (x, y))}^{γ} & Formula (9) \end{matrix}$

- where L_Aerial-MEPE represents the fourth reference sub-loss. L represents the edge length of the sample chip layout. Z(x, y; h_μ, t_q)−Z_t(x, y) represents the first difference information. Z_t(x, y) represents the sample chip layout, where x and y represent the parameters of the sample chip layout. N represents the side length of the sample chip layout. Z(x, y; h_μ, t_q) represents the second wafer layout, h_μ represents the distance between the wafer and the focal plane of the lithography machine, and t_q represents the exposure dose deviation of the lithography machine. γ is an adjustable parameter. A modified edge placement error (MEPE) is the edge modification information. ⊙ represents multiplication of corresponding elements of a matrix.
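The relationship between the first reference sub-loss and formula (9) can be sketched as follows. The sketch assumes that the edge modification information MEPE is stored as an array of the same shape as the layouts, so that ⊙ becomes an element-wise product. The function name `aerial_loss` and its arguments are hypothetical, and the small arrays are made-up illustrative values.

```python
import numpy as np

def aerial_loss(Z, Z_t, edge_length, gamma=2, mepe=None):
    """Sum of (Z - Z_t)^gamma over all pixels, divided by L.

    With mepe=None this is the unweighted first reference sub-loss.
    With an array mepe (assumed to have the same shape as Z) each
    pixel term is multiplied by MEPE, as in formula (9).
    """
    diff = (np.asarray(Z, dtype=float) - np.asarray(Z_t, dtype=float)) ** gamma
    if mepe is not None:
        diff = np.asarray(mepe, dtype=float) * diff
    return float(diff.sum() / edge_length)

Z = np.array([[1.0, 0.0], [0.0, 1.0]])     # second wafer layout (made-up)
Z_t = np.array([[1.0, 1.0], [0.0, 0.0]])   # sample chip layout (made-up)
mepe = np.array([[1.0, 1.0], [0.0, 0.0]])  # hypothetical edge weighting

l_aerial = aerial_loss(Z, Z_t, edge_length=2.0)
l_aerial_mepe = aerial_loss(Z, Z_t, edge_length=2.0, mepe=mepe)
```

In this toy case two pixels differ, so `l_aerial` is 1. Only one of the differing pixels lies where the hypothetical MEPE array is non-zero, so `l_aerial_mepe` is 0.5.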

For example, the fifth reference sub-loss is directly determined based on the initial mask layout and the edge modification information. For example, the initial mask layout is first activated, to obtain a reference mask layout, and the fifth reference sub-loss is then determined based on the reference mask layout and the edge modification information. In an exemplary embodiment, the fifth reference sub-loss may be determined based on the reference mask layout and the edge modification information according to the following formula (10), to reflect, through the fifth reference sub-loss, the error between the continuous distribution of the reference mask layout and the discrete distribution of the binary mask layout in a case that the reference mask layout has the same noise as that of the sample chip layout.

$\begin{matrix} R_{D - MEPE} = \frac{1}{L} \sum_{x = 1}^{N} \sum_{y = 1}^{N} [1 - {(1 - 2 M (x, y) ⊙ MEPE)}^{2}] & Formula (10) \end{matrix}$

- where R_D-MEPE represents the fifth reference sub-loss. M(x, y) represents the reference mask layout, where x and y represent the parameters of the reference mask layout. L represents the edge length of the reference mask layout. N represents the side length of the reference mask layout. MEPE is the edge modification information. ⊙ represents multiplication of corresponding elements of a matrix.

For example, the sixth reference sub-loss is determined based on the second difference information and the edge modification information, to measure, through the sixth reference sub-loss, the complexity of the reference mask layout in a case that the reference mask layout has the same noise as that of the sample chip layout. In some exemplary embodiments, the sixth reference sub-loss is determined according to the following formula (11).

$\begin{matrix} \begin{matrix} R_{TV - MEPE} = \frac{1}{L} (\left\| \frac{\partial (f ⊙ MEPE)}{\partial x} \right\|_{1} + \left\| \frac{\partial (f ⊙ MEPE)}{\partial y} \right\|_{1}) \\ = \frac{1}{L} (\left\| D \oplus (f ⊙ MEPE) \right\|_{1} + \left\| (f ⊙ MEPE) \oplus D^{T} \right\|_{1}) \end{matrix} & Formula (11) \end{matrix}$

$f = \left| M - Z_{t} \right|$

- where M represents the reference mask layout, Z_t represents the sample chip layout, and f represents the second difference information. R_TV-MEPE represents the sixth reference sub-loss. L represents the edge length of the sample chip layout. MEPE is the edge modification information. ⊙ represents multiplication of corresponding elements of a matrix. ∂(f⊙MEPE)/∂x represents a first derivative of a product of the second difference information and the edge modification information relative to x, ∂(f⊙MEPE)/∂y represents a first derivative of the product of the second difference information and the edge modification information relative to y, and x and y represent the parameters of the sample chip layout or the parameters of the reference mask layout. ∥ ∥₁ represents a 1-norm. D represents a first derivative of the second difference information, T represents a transposed matrix, and ⊕ represents matrix addition.

The foregoing describes respective obtaining modes for the second reference sub-loss, the third reference sub-loss, the fourth reference sub-loss, the fifth reference sub-loss, and the sixth reference sub-loss. During an actual application, one or more reference sub-losses may be selected from the second reference sub-loss, the third reference sub-loss, the fourth reference sub-loss, the fifth reference sub-loss, and the sixth reference sub-loss according to a requirement, and the selected reference sub-loss is obtained according to the obtaining modes described above.

Operation A322: Determine the reference loss based on the first reference sub-loss and at least one of the second reference sub-loss, the third reference sub-loss, the fourth reference sub-loss, the fifth reference sub-loss, or the sixth reference sub-loss.

For example, weighted summation is performed on the first reference sub-loss and at least one of the second reference sub-loss, the third reference sub-loss, the fourth reference sub-loss, the fifth reference sub-loss, and the sixth reference sub-loss. The reference loss is determined based on a weighted summation result. For any two reference sub-losses in the first reference sub-loss to the sixth reference sub-loss, weights of the two reference sub-losses may be the same or may be different.

In some exemplary embodiments, the reference loss is determined according to the following formula (12).

$\begin{matrix} L_{MEPE} = L_{Aerial - MEPE} + α_{1} R_{D - MEPE} + κ_{1} R_{TV - MEPE} & Formula (12) \end{matrix}$

$L_{total} = \sum_{μ}^{U} \sum_{q}^{Q} ξ (h_{μ}) ζ (t_{q}) (L_{Aerial} + α_{2} R_{D} + κ_{2} R_{TV} + β L_{MEPE})$

- where L_total represents the reference loss. L_Aerial represents the first reference sub-loss. R_D represents the second reference sub-loss, and α₂ represents the weight of the second reference sub-loss. R_TV represents the third reference sub-loss, and κ₂ represents the weight of the third reference sub-loss. L_MEPE represents the sub-loss corresponding to the edge modification information, and β represents the weight of the sub-loss corresponding to the edge modification information. L_Aerial-MEPE represents the fourth reference sub-loss, and the weight of the fourth reference sub-loss is β. R_D-MEPE represents the fifth reference sub-loss, α₁ represents a sub-weight of the fifth reference sub-loss, and a product of the sub-weight of the fifth reference sub-loss and the weight of the sub-loss corresponding to the edge modification information is the weight of the fifth reference sub-loss. To be specific, α₁β represents the weight of the fifth reference sub-loss. R_TV-MEPE represents the sixth reference sub-loss, κ₁ represents a sub-weight of the sixth reference sub-loss, and a product of the sub-weight of the sixth reference sub-loss and the weight of the sub-loss corresponding to the edge modification information is the weight of the sixth reference sub-loss. To be specific, κ₁β represents the weight of the sixth reference sub-loss. ξ(h_μ) represents the defocus process parameter distribution function, and ζ(t_q) represents the exposure dose deviation process parameter distribution function.

There is at least one h_μ, μ represents a sequence number, and U represents a quantity of h_μ. For example, for the reference lithography process parameter, h_μ=0. In this case, U=1. For an actual lithography process parameter, a value of h_μ may be −80, 0, or 80. In this case, U=3. Similarly, there is at least one t_q, q represents a sequence number, and Q represents the quantity of t_q. For example, for the reference lithography process parameter, t_q=0. In this case, Q=1. For an actual lithography process parameter, a value of t_q may be −0.1, 0, or 0.1. In this case, Q=3.

The weight and sub-weight of each reference sub-loss are not limited in this exemplary embodiment of this disclosure. For example, the weight of the second reference sub-loss and the sub-weight of the fifth reference sub-loss are the same, i.e. α₁=α₂=α. The weight of the third reference sub-loss and the sub-weight of the sixth reference sub-loss are the same, i.e. κ₁=κ₂=κ. Relative values of the reference sub-losses are balanced through the weights of the reference sub-losses. Therefore, the weights of the reference sub-losses are adjustable parameters. For example, α=0.025, κ=0.06, β=2.
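The weighted summation of formula (12) can be sketched in plain Python as follows, under stated assumptions. The dictionary keys, the function names, and the values chosen for ξ(h_μ) and ζ(t_q) are hypothetical; only the example weights α=0.025, κ=0.06, and β=2 and the example values of h_μ come from the text above. The sub-loss values are placeholders, not results.

```python
def combine_sub_losses(sub, alpha=0.025, kappa=0.06, beta=2.0):
    """Bracketed term of formula (12) for one process condition.

    `sub` maps hypothetical keys to the six reference sub-losses.
    alpha and kappa are used for both the weights and the sub-weights,
    i.e. alpha_1 = alpha_2 and kappa_1 = kappa_2.
    """
    l_mepe = sub["aerial_mepe"] + alpha * sub["d_mepe"] + kappa * sub["tv_mepe"]
    return sub["aerial"] + alpha * sub["d"] + kappa * sub["tv"] + beta * l_mepe

def total_loss(per_condition, xi, zeta):
    """Sum over (h_mu, t_q) pairs, weighted by xi(h_mu) * zeta(t_q)."""
    total = 0.0
    for (h_mu, t_q), sub in per_condition.items():
        total += xi[h_mu] * zeta[t_q] * combine_sub_losses(sub)
    return total

# Placeholder sub-loss values, identical for every process condition.
unit = {"aerial": 1.0, "d": 1.0, "tv": 1.0,
        "aerial_mepe": 1.0, "d_mepe": 1.0, "tv_mepe": 1.0}
per_condition = {(-80, 0.0): unit, (0, 0.0): unit, (80, 0.0): unit}

# Assumed distribution values; the disclosure does not fix them.
xi = {-80: 0.25, 0: 0.5, 80: 0.25}
zeta = {0.0: 1.0}

l_total = total_loss(per_condition, xi, zeta)
```

With every sub-loss set to 1, the bracketed term is 1 + 0.025 + 0.06 + 2 × 1.085 = 3.255, and because the assumed distribution values sum to 1, `l_total` is also approximately 3.255. The effective weights of the fifth and sixth reference sub-losses are αβ and κβ, as described above.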

Operation A33: Determine a first gradient of the reference loss relative to the initial mask layout.

In this exemplary embodiment of this disclosure, the reference loss includes the first reference sub-loss and at least one of the second reference sub-loss to the sixth reference sub-loss. Therefore, the first gradient of the reference loss relative to the initial mask layout includes two parts. The first part is a gradient of the at least one reference sub-loss among the second reference sub-loss to the sixth reference sub-loss relative to the initial mask layout. The second part is a gradient of the first reference sub-loss relative to the initial mask layout.

An example in which the reference loss includes the first reference sub-loss to the sixth reference sub-loss is used. The first gradient of the reference loss relative to the initial mask layout is shown in the following formula (13).

$\begin{matrix} \frac{\partial L_{total}}{\partial \bar{M}} = \sum_{μ}^{U} \sum_{q}^{Q} ξ (h_{μ}) ζ (t_{q}) (\frac{\partial L_{Aerial}}{\partial \bar{M}} + α \frac{\partial R_{D}}{\partial \bar{M}} + κ \frac{\partial R_{TV}}{\partial \bar{M}} + β \frac{\partial L_{Aerial - MEPE}}{\partial \bar{M}} + αβ \frac{\partial R_{D - MEPE}}{\partial \bar{M}} + κ β \frac{\partial R_{TV - MEPE}}{\partial \bar{M}}) & Formula (13) \end{matrix}$

- where ∂L_total/∂M̄ represents the first gradient of the reference loss relative to the initial mask layout. ξ(h_μ) represents the defocus process parameter distribution function, and ζ(t_q) represents the exposure dose deviation process parameter distribution function. ∂L_Aerial/∂M̄ represents the gradient of the first reference sub-loss relative to the initial mask layout. ∂R_D/∂M̄ represents the gradient of the second reference sub-loss relative to the initial mask layout, and α represents the weight of the second reference sub-loss. ∂R_TV/∂M̄ represents the gradient of the third reference sub-loss relative to the initial mask layout, and κ represents the weight of the third reference sub-loss. ∂L_Aerial-MEPE/∂M̄ represents the gradient of the fourth reference sub-loss relative to the initial mask layout, and β represents the weight of the fourth reference sub-loss. ∂R_D-MEPE/∂M̄ represents the gradient of the fifth reference sub-loss relative to the initial mask layout, and αβ represents the weight of the fifth reference sub-loss. ∂R_TV-MEPE/∂M̄ represents the gradient of the sixth reference sub-loss relative to the initial mask layout, and κβ represents the weight of the sixth reference sub-loss.

In this exemplary embodiment of this disclosure, the following formula (14) exists. The principle of formula (14) is the same as that of formula (3) mentioned above. In formula (3), M₀ represents the reference mask layout, and M̄₀ represents the initial mask layout. In formula (14), M represents the reference mask layout, and M̄ represents the initial mask layout. In other words, M in formula (14) is equivalent to M₀ in formula (3), and M̄ in formula (14) is equivalent to M̄₀ in formula (3).

$\begin{matrix} M = \frac{1}{1 + \exp (- θ_{M} \times (\bar{M} - M_{th}))} & Formula (14) \end{matrix}$

- where M represents the reference mask layout, exp represents the exponential function with the natural constant e as the base, θ_M and M_th represent activation processing parameters, and M̄ represents the initial mask layout. For example, θ_M=4, M_th=0.225.
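Formula (14) and the factor ∂M/∂M̄ that appears in the gradients of formula (13) can be illustrated with the following NumPy sketch. The function names `activate` and `activation_gradient` are hypothetical. The derivative θ_M·M·(1−M) is the standard derivative of the sigmoid function in formula (14).

```python
import numpy as np

def activate(M_bar, theta_M=4.0, M_th=0.225):
    """Formula (14): M = 1 / (1 + exp(-theta_M * (M_bar - M_th)))."""
    return 1.0 / (1.0 + np.exp(-theta_M * (M_bar - M_th)))

def activation_gradient(M_bar, theta_M=4.0, M_th=0.225):
    """Element-wise dM/dM_bar = theta_M * M * (1 - M)."""
    M = activate(M_bar, theta_M, M_th)
    return theta_M * M * (1.0 - M)

# Initial mask layout values below, at, and above the threshold M_th.
M_bar = np.array([-1.0, 0.225, 1.5])
M = activate(M_bar)
dM = activation_gradient(M_bar)
```

At M̄=M_th the reference mask layout value is 0.5 and the derivative is θ_M/4, which equals 1 for θ_M=4. Values far from the threshold are pushed toward 0 or 1, where the derivative becomes small.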

For the gradient of the first reference sub-loss relative to the initial mask layout in formula (13), the following formula (15) may be obtained based on formula (14).

$\begin{matrix} \begin{matrix} \frac{\partial L_{Aerial}}{\partial \bar{M}} = \frac{1}{L} γ {(Z - Z_{t})}^{γ - 1} ⊙ \frac{\partial Z}{\partial M} ⊙ \frac{\partial M}{\partial \bar{M}} \\ = \frac{1}{L} γ θ_{M} θ_{Z} \{ H^{flip} \otimes [{(Z - Z_{t})}^{γ - 1} ⊙ Z ⊙ (1 - Z) ⊙ (M \otimes H^{*})] + \\ {(H^{flip})}^{*} \otimes [{(Z - Z_{t})}^{γ - 1} ⊙ Z ⊙ (1 - Z) ⊙ (M \otimes H)] \} ⊙ \\ M ⊙ (1 - M) \end{matrix} & Formula (15) \end{matrix}$

- where ∂L_Aerial/∂M̄ represents the gradient of the first reference sub-loss relative to the initial mask layout. L represents the edge length of the sample chip layout. γ is an adjustable parameter. Z−Z_t represents the first difference information, Z_t represents the sample chip layout, and Z represents the second wafer layout. ∂Z/∂M represents the gradient of the second wafer layout relative to the reference mask layout. ∂M/∂M̄ represents the gradient of the reference mask layout relative to the initial mask layout. ⊙ represents multiplication of corresponding elements of a matrix. θ_M and θ_Z are both activation processing parameters. For example, θ_M=4, θ_Z=50. H^flip represents a function obtained by flipping the kernel function H by 180°, where the kernel functions are obtained by performing singular value decomposition on the cross transfer coefficient of the lithography system. ⊗ represents the function sign of the convolution processing. H* represents a complex conjugate of the kernel function H, and (H^flip)* represents a complex conjugate of the function H^flip. M represents the reference mask layout, and M̄ represents the initial mask layout.

For the gradient of the second reference sub-loss relative to the initial mask layout in formula (13), the following formula (16) may be obtained based on formula (14).

$\begin{matrix} \frac{\partial R_{D}}{\partial \bar{M}} = \frac{\partial R_{D}}{\partial M} ⊙ \frac{\partial M}{\partial \bar{M}} = \frac{1}{L} θ_{M} (- 8 \times M + 4) ⊙ M ⊙ (1 - M) & Formula (16) \end{matrix}$

- where ∂R_D/∂M̄ represents the gradient of the second reference sub-loss relative to the initial mask layout. ∂R_D/∂M represents the gradient of the second reference sub-loss relative to the reference mask layout. ∂M/∂M̄ represents the gradient of the reference mask layout relative to the initial mask layout. ⊙ represents multiplication of corresponding elements of a matrix. L represents the edge length of the reference mask layout, θ_M is the activation processing parameter, and M represents the reference mask layout.
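The chain rule behind formula (16) can be checked numerically. The following sketch, which is illustrative only, compares the closed form of formula (16) with a central finite-difference estimate of the gradient of formula (7) composed with formula (14). The helper names, the random test layout, and the edge length of 12 are assumptions made for this check.

```python
import numpy as np

THETA_M, M_TH = 4.0, 0.225  # example activation parameters from the text

def activate(M_bar):
    """Formula (14)."""
    return 1.0 / (1.0 + np.exp(-THETA_M * (M_bar - M_TH)))

def r_d(M_bar, edge_length):
    """Formula (7) evaluated on the activated mask."""
    M = activate(M_bar)
    return np.sum(1.0 - (1.0 - 2.0 * M) ** 2) / edge_length

def r_d_gradient(M_bar, edge_length):
    """Formula (16): (1/L) * theta_M * (-8*M + 4) * M * (1 - M)."""
    M = activate(M_bar)
    return THETA_M * (-8.0 * M + 4.0) * M * (1.0 - M) / edge_length

def numeric_gradient(fn, M_bar, h=1e-6):
    """Central finite differences, one pixel at a time."""
    grad = np.zeros_like(M_bar)
    for idx in np.ndindex(M_bar.shape):
        step = np.zeros_like(M_bar)
        step[idx] = h
        grad[idx] = (fn(M_bar + step) - fn(M_bar - step)) / (2.0 * h)
    return grad

rng = np.random.default_rng(0)
M_bar = rng.uniform(-0.5, 1.0, size=(3, 3))  # arbitrary test layout
analytic = r_d_gradient(M_bar, edge_length=12.0)
numeric = numeric_gradient(lambda m: r_d(m, 12.0), M_bar)
max_abs_error = float(np.max(np.abs(analytic - numeric)))
```

The two gradients agree to within finite-difference error, which is consistent with ∂R_D/∂M = (4 − 8M)/L and ∂M/∂M̄ = θ_M·M·(1−M).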

For the gradient of the third reference sub-loss relative to the initial mask layout in formula (13), the following formula (17) may be obtained based on formula (14).

$\begin{matrix} \begin{matrix} \frac{\partial R_{TV}}{\partial \bar{M}} = \frac{\partial R_{TV}}{\partial f} ⊙ \frac{\partial f}{\partial M} ⊙ \frac{\partial M}{\partial \bar{M}} \\ = \frac{1}{L} θ_{M} [D^{T} \oplus sign (D \oplus f) + sign (f \oplus D^{T}) \oplus D] ⊙ \\ sign (M - Z_{t}) ⊙ M ⊙ (1 - M) \end{matrix} & Formula (17) \end{matrix}$

- where ∂R_TV/∂M̄ represents the gradient of the third reference sub-loss relative to the initial mask layout. ∂R_TV/∂f represents the gradient of the third reference sub-loss relative to the second difference information. ∂f/∂M represents the gradient of the second difference information relative to the reference mask layout. ∂M/∂M̄ represents the gradient of the reference mask layout relative to the initial mask layout. ⊙ represents multiplication of corresponding elements of a matrix. L represents the edge length of the sample chip layout, θ_M is the activation processing parameter, D represents the first-order derivative of the second difference information, T represents the transposed matrix, ⊕ represents the matrix addition, f represents the second difference information, M represents the reference mask layout, and Z_t represents the sample chip layout. sign is a sign function, where the sign function satisfies: sign (x)=0, x=0; sign (x)=1, x>0; sign (x)=−1, x<0.

For the gradient of the fourth reference sub-loss relative to the initial mask layout in formula (13), the following formula (18) may be obtained based on formula (14).

$\begin{matrix} \frac{\partial L_{Aerial - MEPE}}{\partial \overline{M}} = \frac{\partial L_{Aerial}}{\partial \overline{M}} ⊙ MEPE & Formula (18) \end{matrix}$

- where ∂L_Aerial-MEPE/∂M̄ represents the gradient of the fourth reference sub-loss relative to the initial mask layout. ∂L_Aerial/∂M̄ represents the gradient of the first reference sub-loss relative to the initial mask layout. MEPE is the edge modification information. ⊙ represents multiplication of corresponding elements of a matrix.

For the gradient of the fifth reference sub-loss relative to the initial mask layout in formula (13), the following formula (19) may be obtained based on formula (14).

$\begin{matrix} \frac{\partial R_{D - MEPE}}{\partial \overline{M}} = \frac{\partial R_{D - MEPE}}{\partial M} ⊙ \frac{\partial M}{\partial \overline{M}} = \frac{1}{L} \times θ_{M} (- 8 \times MEPE ⊙ M + 4) ⊙ M ⊙ (1 - M) & Formula (19) \end{matrix}$

- where ∂R_D-MEPE/∂M̄ represents the gradient of the fifth reference sub-loss relative to the initial mask layout. ∂R_D-MEPE/∂M represents the gradient of the fifth reference sub-loss relative to the reference mask layout. ∂M/∂M̄ represents the gradient of the reference mask layout relative to the initial mask layout. ⊙ represents multiplication of corresponding elements of a matrix. L represents the edge length of the reference mask layout. θ_M is the activation processing parameter, MEPE is the edge modification information, and M represents the reference mask layout.

For the gradient of the sixth reference sub-loss relative to the initial mask layout in formula (13), the following formula (20) may be obtained based on formula (14).

$\begin{matrix} \frac{\partial R_{TV - MEPE}}{\partial \overline{M}} = \frac{\partial R_{TV - MEPE}}{\partial f} ⊙ \frac{\partial f}{\partial M} ⊙ \frac{\partial M}{\partial \overline{M}} = \frac{1}{L} θ_{M} [D^{T} \oplus sign (D \oplus (MEPE ⊙ f)) + sign ((MEPE ⊙ f) \oplus D^{T}) \oplus D] ⊙ sign (M - Z_{t}) ⊙ M ⊙ (1 - M) & Formula (20) \end{matrix}$

- where ∂R_TV-MEPE/∂M̄ represents the gradient of the sixth reference sub-loss relative to the initial mask layout. ∂R_TV-MEPE/∂f represents the gradient of the sixth reference sub-loss relative to the second difference information. ∂f/∂M represents the gradient of the second difference information relative to the reference mask layout. ∂M/∂M̄ represents the gradient of the reference mask layout relative to the initial mask layout. ⊙ represents multiplication of corresponding elements of a matrix. L represents the edge length of the sample chip layout, θ_M is the activation processing parameter, D represents the first-order derivative of the second difference information, T represents the transposed matrix, ⊕ represents the matrix addition, MEPE is the edge modification information, f represents the second difference information, M represents the reference mask layout, and Z_t represents the sample chip layout. sign is the sign function.

Formula (13) includes the gradient of the first reference sub-loss to the sixth reference sub-loss relative to the initial mask layout. Since the reference loss includes the first reference sub-loss and the at least one of the second reference sub-loss to the sixth reference sub-loss, the first gradient of the reference loss relative to the initial mask layout includes: a gradient of the at least one of the second reference sub-loss to the sixth reference sub-loss relative to the initial mask layout and a gradient of the first reference sub-loss relative to the initial mask layout. Details are not described herein again.

Operation A34: Adjust the initial mask layout based on the first gradient, to obtain the labeled mask layout.

For example, according to a steepest descent method or a conjugate gradient method, a reference gradient is determined based on the first gradient, and the initial mask layout is adjusted based on the reference gradient. For the operation of “determining a reference gradient based on the first gradient”, there are different implementations of the steepest descent method and the conjugate gradient method. The two implementations are respectively described below.

In the steepest descent method, the first gradient may be determined as the reference gradient. For example, the reference gradient is determined according to the following formula (21).

$\begin{matrix} V_{k} = g_{k} = \frac{\partial L_{total}}{\partial {\overline{M}}_{k}} & Formula (21) \end{matrix}$

- where V_k represents a reference gradient corresponding to a k^th iteration. g_k represents a first gradient corresponding to the k^th iteration, and ∂L_total/∂M̄_k represents a first gradient of the reference loss relative to a mask layout corresponding to the k^th iteration. The initial mask layout corresponds to a mask layout corresponding to iteration 0, optimization corresponding to iteration 0 is performed on the initial mask layout, and there is at least one iteration. It can be seen from formula (21) that for the steepest descent method, an optimization direction of the initial mask layout is a direction of the first gradient.

In the conjugate gradient method, for iteration 0, the first gradient is the reference gradient. For any iteration other than iteration 0, a conjugate gradient factor corresponding to the current iteration is first determined based on a first gradient corresponding to the current iteration and a first gradient corresponding to a previous iteration, where the conjugate gradient factor is in negative correlation with a weight of a reference gradient corresponding to the previous iteration. Next, a reference gradient corresponding to the current iteration is determined based on the first gradient corresponding to the current iteration, the conjugate gradient factor corresponding to the current iteration, and the reference gradient corresponding to the previous iteration. In some exemplary embodiments, the reference gradient is determined according to the following formula (22).

$\begin{matrix} \begin{matrix} V_{k} = g_{k} = \frac{\partial L_{total}}{\partial {\overline{M}}_{k}}, k = 0 \\ V_{k} = g_{k} - η_{k} V_{k - 1}, k > 0 \\ η_{k} = \frac{{\left\| g_{k} \right\|}^{2} - \sum g_{k} \cdot g_{k - 1}}{{\left\| g_{k - 1} \right\|}^{2}} \end{matrix} & Formula (22) \end{matrix}$

- where V_k represents a reference gradient corresponding to the k^th iteration, and V_k−1 represents a reference gradient corresponding to a (k−1)^th iteration. g_k represents the first gradient corresponding to the k^th iteration, and g_k−1 represents a first gradient corresponding to the (k−1)^th iteration. ∂L_total/∂M̄_k represents the first gradient of the reference loss relative to the mask layout corresponding to the k^th iteration. η_k represents a conjugate gradient factor corresponding to the k^th iteration.

The initial mask layout corresponds to a mask layout corresponding to iteration 0, optimization corresponding to iteration 0 is performed on the initial mask layout, and there is at least one iteration. It can be seen from formula (22) that for the conjugate gradient method, the optimization direction of the initial mask layout (iteration 0) is the direction of the first gradient. Each subsequent optimization direction is determined with reference to the previous optimization direction, and the conjugate gradient factor is adaptively adjusted. By automatically adjusting the conjugate gradient factor, stagnation of optimization can be avoided, and an optimization effect can be improved, thus helping to improve the accuracy of the labeled mask layout.
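The reference gradient of formula (22) can be sketched as follows. The function name `reference_gradient` and its argument names are hypothetical, the sign convention follows formula (22) as written, and the sketch assumes that the previous first gradient is not all zeros so that the division is defined. The example gradients are made-up values.

```python
import numpy as np

def reference_gradient(g_k, g_prev=None, V_prev=None):
    """Formula (22).

    Iteration 0 (no previous gradient): V_0 = g_0.
    Later iterations: V_k = g_k - eta_k * V_{k-1}, with
    eta_k = (||g_k||^2 - sum(g_k * g_{k-1})) / ||g_{k-1}||^2.
    """
    if g_prev is None or V_prev is None:
        return g_k.copy()
    eta_k = (np.sum(g_k * g_k) - np.sum(g_k * g_prev)) / np.sum(g_prev * g_prev)
    return g_k - eta_k * V_prev

# Made-up first gradients for two consecutive iterations.
g0 = np.array([1.0, 0.0])
g1 = np.array([0.0, 2.0])

V0 = reference_gradient(g0)           # iteration 0: equals g0
V1 = reference_gradient(g1, g0, V0)   # iteration 1: uses eta_1 and V0
```

For these values η₁ = (4 − 0)/1 = 4, so `V1` is g₁ − 4·V₀ = (−4, 2). Passing no previous gradient reproduces the steepest descent choice of formula (21).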

In an exemplary embodiment, operation A34 includes operations A341 to A343 (not shown).

Operation A341: Determine an adjustment step based on the first gradient.

In this exemplary embodiment of this disclosure, the reference gradient may be determined based on the first gradient according to the steepest descent method or the conjugate gradient method, and the adjustment step of the initial mask layout may be determined based on the reference gradient. In some exemplary embodiments, the adjustment step is determined according to the following formula (23).

$\begin{matrix} ω_{k} = \frac{ε}{\max (\left| V_{k} \right|)} & Formula (23) \end{matrix}$

- where ω_k represents an adjustment step corresponding to the k^th iteration. ε represents a step factor. For example, ε=0.1. max represents a function sign of a maximum function, and V_k represents the reference gradient corresponding to the k^th iteration.

Operation A342: Adjust the initial mask layout based on the adjustment step and the first gradient, to obtain an adjusted mask layout.

For example, the adjustment step is multiplied by the reference gradient determined based on the first gradient, to obtain a product result. The initial mask layout is adjusted based on the product result, to obtain the adjusted mask layout. In some exemplary embodiments, the adjusted mask layout is determined according to the following formula (24).

$\begin{matrix} {\overline{M}}_{k + 1} = {\overline{M}}_{k} - ω_{k} V_{k} & Formula (24) \end{matrix}$

- where M̄_k+1 represents a mask layout corresponding to a (k+1)^th iteration, and M̄_k represents the mask layout corresponding to the k^th iteration. ω_k represents the adjustment step corresponding to the k^th iteration. V_k represents the reference gradient corresponding to the k^th iteration. When k=0, a mask layout corresponding to iteration 0 is the initial mask layout, and a mask layout corresponding to iteration 1 is the adjusted mask layout.
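The update in formulas (23) and (24) can be sketched in a few lines. This is a minimal illustration, assuming the mask layout and the reference gradient are stored as NumPy arrays of the same shape; the function names and the handling of an all-zero gradient are assumptions made for the example and are not part of the disclosure.

```python
import numpy as np

def adjustment_step(v_k, eps=0.1):
    """Formula (23): omega_k = eps / max(|V_k|)."""
    peak = np.max(np.abs(v_k))
    if peak == 0.0:
        # Assumption for this sketch: an all-zero gradient means no update.
        return 0.0
    return eps / peak

def adjust_mask(m_bar_k, v_k, eps=0.1):
    """Formula (24): M_bar_{k+1} = M_bar_k - omega_k * V_k."""
    return m_bar_k - adjustment_step(v_k, eps) * v_k

# Illustrative values only.
m_bar_0 = np.array([[0.2, 0.8], [0.5, 0.1]])
v_0 = np.array([[4.0, -2.0], [0.0, 1.0]])
m_bar_1 = adjust_mask(m_bar_0, v_0)
```

Because the step is normalized by the largest absolute gradient value, no pixel changes by more than ε in a single iteration, whatever the scale of the gradient.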

Operation A343: Activate the adjusted mask layout in a case that the adjusted mask layout satisfies an adjustment ending condition, to obtain the labeled mask layout.

The adjustment ending condition is not limited in this exemplary embodiment of this disclosure. For example, the adjusted mask layout satisfies the adjustment ending condition when the quantity of iterations corresponding to the adjusted mask layout reaches a first set quantity, or when the first gradient (or the reference gradient) corresponding to the adjusted mask layout reaches a set gradient. The first set quantity and the set gradient are set according to experience, or are flexibly adjusted according to an application requirement. This is not limited in this exemplary embodiment of this disclosure.

If the adjusted mask layout satisfies the adjustment ending condition, the adjusted mask layout is activated through the activation function, to obtain an activated mask layout. In some exemplary embodiments, the activation function is a sigmoid function. The activated mask layout is determined according to the following formula (25). The principle of formula (25) is the same as those of formula (3) and formula (14) mentioned above. M_k+1 in formula (25) is equivalent to M in formula (14) and M₀ in formula (3), and M̄_k+1 in formula (25) is equivalent to the corresponding mask layout before activation in formula (14) and formula (3).

$\begin{matrix} M_{k + 1} = \frac{1}{1 + \exp (- θ_{M} \times ({\overline{M}}_{k + 1} - M_{th}))} & Formula (25) \end{matrix}$

- where M_k+1 represents an activated mask layout corresponding to the (k+1)^th iteration. exp represents the exponential function with the natural constant e as the base. θ_M and M_th represent activation processing parameters. M̄_k+1 represents the mask layout corresponding to the (k+1)^th iteration.

Since a pixel value of the mask layout corresponding to the (k+1)^th iteration may be greater than 1 or less than 0, the pixel value is constrained to be between 0 and 1 through the sigmoid function. To be specific, a pixel value of the activated mask layout is between 0 and 1.

Next, the activated mask layout is binarized, to obtain the labeled mask layout. In some exemplary embodiments, the labeled mask layout is determined according to the following formula (26).

$\begin{matrix} M_{bin} (x, y) = \begin{cases} 0, & M (x, y) \leq M_{t} \\ 1, & M (x, y) > M_{t} \end{cases} & Formula (26) \end{matrix}$

- where M_bin(x, y) represents the labeled mask layout, and the pixel value on the labeled mask layout is 0 or 1. M(x, y) represents the activated mask layout. M_t represents a pixel threshold. The pixel threshold may be set according to experience, or may be flexibly adjusted according to an application requirement. This is not limited in this exemplary embodiment of this disclosure. In some exemplary embodiments, the pixel threshold is 0.5.

It can be seen from formula (26) that if any pixel value in the activated mask layout is less than or equal to the pixel threshold, the pixel value is set to 0. If any pixel value in the activated mask layout is greater than the pixel threshold, the pixel value is set to 1. By using the foregoing mode of binarizing each pixel value in the activated mask layout, the labeled mask layout may be obtained.
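The activation in formula (25) and the binarization in formula (26) may be illustrated as follows. This is a sketch only: the values chosen for θ_M and M_th (4.0 and 0.5) are assumptions for the example, since this passage does not fix them, and the function names are hypothetical.

```python
import numpy as np

def activate(m_bar, theta_m=4.0, m_th=0.5):
    """Formula (25): sigmoid activation of the unactivated mask layout.

    theta_m and m_th are illustrative values, not values from the disclosure.
    """
    return 1.0 / (1.0 + np.exp(-theta_m * (m_bar - m_th)))

def binarize(m, m_t=0.5):
    """Formula (26): 0 where M(x, y) <= M_t, 1 where M(x, y) > M_t."""
    return (m > m_t).astype(np.uint8)

# Unactivated pixel values may fall outside [0, 1] after iterative adjustment.
m_bar = np.array([[-0.3, 0.5], [0.9, 1.4]])
m = activate(m_bar)    # every value now lies strictly between 0 and 1
m_bin = binarize(m)    # every value is now exactly 0 or 1
```

A pixel whose activated value equals the threshold exactly (here the entry 0.5) is set to 0, matching the "less than or equal to" branch of formula (26).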

In some exemplary embodiments, after operation A342, the method further includes operations A344 to A345 (not shown).

Operation A344: Determine a third wafer layout of the adjusted mask layout through the second wafer layout determining model in a case that the adjusted mask layout does not satisfy the adjustment ending condition, where the third wafer layout is a wafer layout of the adjusted mask layout obtained by prediction under the conditions of the reference lithography process parameter.

If the adjusted mask layout does not satisfy the adjustment ending condition, based on an implementation principle of operation A2, the third wafer layout of the adjusted mask layout is determined through the second wafer layout determining model. The adjusted mask layout in operation A344 corresponds to the initial mask layout in operation A2. The third wafer layout in operation A344 corresponds to the second wafer layout in operation A2. Since an implementation principle of operation A344 is similar to the implementation principle of operation A2, refer to the foregoing descriptions for operation A2. Details are not described herein again.

Operation A345: Adjust the adjusted mask layout based on the sample chip layout and the third wafer layout, to obtain the labeled mask layout.

For example, based on an implementation principle of operation A3, the adjusted mask layout is adjusted based on the sample chip layout and the third wafer layout, to obtain the labeled mask layout. The adjusted mask layout in operation A345 corresponds to the initial mask layout in operation A3. The third wafer layout in operation A345 corresponds to the second wafer layout in operation A3. Since an implementation principle of operation A345 is similar to the implementation principle of operation A3, refer to the foregoing descriptions for operation A3. Details are not described herein again.

It can be seen from content of operations A1 to A3 that, in this exemplary embodiment of this disclosure, the initial mask layout is iteratively adjusted based on the second wafer layout determining model of the reference lithography process parameter, to obtain the labeled mask layout. This adjustment process belongs to an inverse lithography technology (ILT). Refer to FIG. 4. FIG. 4 is a flowchart of determining a labeled mask layout according to an exemplary embodiment of this disclosure. A procedure of determining the labeled mask layout includes the following operations 401 to 408.

Operation 401: Linearly modify a sample chip layout, to obtain an initial mask layout. For an implementation of operation 401, refer to the descriptions for operation A1. Details are not described herein again.

Operation 402: Input the initial mask layout to a second wafer layout determining model, to obtain a second wafer layout. For an implementation of operation 402, refer to the descriptions for operation A2. Details are not described herein again.

Operation 403: Determine a reference loss based on the sample chip layout, the initial mask layout, and the second wafer layout. For an implementation of operation 403, refer to the descriptions for operations A31 to A32. Details are not described herein again.

Operation 404: Determine a first gradient of the reference loss relative to the initial mask layout. For an implementation of operation 404, refer to the descriptions for operation A33. Details are not described herein again.

Operation 405: Adjust the initial mask layout based on the first gradient, to obtain an adjusted mask layout. For an implementation of operation 405, refer to the descriptions for operations A341 to A342. Details are not described herein again.

Operation 406: Determine whether the adjusted mask layout satisfies an adjustment ending condition.

Operation 407: Determine, based on the adjusted mask layout, a labeled mask layout if the adjusted mask layout satisfies the adjustment ending condition. For an implementation of operation 407, refer to the descriptions for operation A343. Details are not described herein again.

Operation 408: Use the adjusted mask layout as the initial mask layout if the adjusted mask layout does not satisfy the adjustment ending condition. Next, operation 402 and subsequent operations are performed, until the adjusted mask layout satisfies the adjustment ending condition and the labeled mask layout is determined based on the adjusted mask layout.
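The control flow of operations 401 to 408 can be sketched as a loop. The sketch below is not the disclosed lithography model: `toy_wafer_model` is a hypothetical stand-in (a plain sigmoid) for the second wafer layout determining model, the loss is a simple squared error rather than the reference loss, the initial mask layout is simply a copy of the chip layout rather than the linear modification of operation A1, and the steepest descent method is used so that the reference gradient equals the first gradient. All names and default values are assumptions.

```python
import numpy as np

def sigmoid(x, theta=4.0, center=0.5):
    return 1.0 / (1.0 + np.exp(-theta * (x - center)))

def toy_wafer_model(m_bar):
    """Hypothetical stand-in for the second wafer layout determining model."""
    return sigmoid(m_bar)

def toy_loss_and_gradient(chip, m_bar, theta=4.0):
    """Squared error between chip and toy wafer layout, and its gradient."""
    wafer = toy_wafer_model(m_bar)                              # operation 402
    loss = float(np.sum((wafer - chip) ** 2))                   # operation 403
    grad = 2.0 * (wafer - chip) * theta * wafer * (1.0 - wafer) # operation 404
    return loss, grad

def determine_labeled_mask(chip, max_iters=50, grad_tol=1e-3, eps=0.1):
    m_bar = chip.astype(float)       # operation 401 (stand-in for A1)
    losses = []
    for _ in range(max_iters):       # ending condition: iteration count
        loss, grad = toy_loss_and_gradient(chip, m_bar)
        losses.append(loss)
        peak = np.max(np.abs(grad))
        if peak < grad_tol:          # ending condition: gradient small enough
            break
        m_bar = m_bar - (eps / peak) * grad   # operation 405, formulas (23)-(24)
    # Operation 407: activate, then binarize with a pixel threshold of 0.5.
    labeled = (sigmoid(m_bar) > 0.5).astype(np.uint8)
    return labeled, losses

chip = np.array([[0, 1], [1, 0]])
labeled, losses = determine_labeled_mask(chip)
```

In this toy setting the loss falls from one iteration to the next and the loop stops once the gradient drops below the set tolerance, well before the iteration limit is reached.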

In this exemplary embodiment of this disclosure, the initial mask layout is optimized step by step through the second wafer layout determining model, to obtain the labeled mask layout, so that the labeled mask layout is determined based on a deep learning method, a calculation amount of determining the labeled mask layout is reduced, and efficiency of determining the labeled mask layout is improved.

When training the mask layout determining model, the electronic device may obtain the labeled mask layout in real time through the foregoing operations A1 to A3. The electronic device may alternatively obtain and store the labeled mask layout through the foregoing operations A1 to A3 before training the mask layout determining model, so as to extract the labeled mask layout from storage when training the mask layout determining model. The electronic device may alternatively obtain the labeled mask layout from other devices when training the mask layout determining model. The other devices are configured to obtain and store the labeled mask layout through the foregoing operations A1 to A3 before the electronic device trains the mask layout determining model.

Operation 202: Determine a predicted mask layout of the sample chip layout through a neural network model.

The predicted mask layout is a mask layout of the sample chip layout obtained by prediction.

For example, the sample chip layout is inputted to the neural network model, and the predicted mask layout is determined through the neural network model. For example, the neural network model is an initial network model. In some exemplary embodiments, the initial network model is a U-net model, and the U-net model includes an encoder and a decoder. Functions of the encoder and the decoder are described below in the descriptions for operations 2021 to 2022. Alternatively, the neural network model is obtained by pre-training the initial network model through the sample chip layout and the labeled mask layout. In some exemplary embodiments, the neural network model also includes an encoder and a decoder. However, there is a model parameter difference between the encoder included in the neural network model and the encoder included in the initial network model, and there is also a model parameter difference between the decoder included in the neural network model and the decoder included in the initial network model.

In the process of pre-training the initial network model through the sample chip layout and the labeled mask layout, the sample chip layout may be inputted to the initial network model, and the predicted mask layout is determined through the initial network model. A loss of the initial network model is determined based on difference information between the labeled mask layout and the predicted mask layout determined through the initial network model, and the model parameter of the initial network model is adjusted based on the loss of the initial network model, to obtain the adjusted initial network model.

If the adjusted initial network model satisfies a pre-training ending condition, the adjusted initial network model is used as the neural network model. If the adjusted initial network model does not satisfy the pre-training ending condition, the sample chip layout is inputted to the adjusted initial network model, and the predicted mask layout is determined through the adjusted initial network model. A loss of the initial network model is determined based on difference information between the labeled mask layout and the predicted mask layout determined through the adjusted initial network model, and the model parameter of the adjusted initial network model is adjusted again based on the loss of the initial network model. The rest can be deduced by analogy, until the adjusted initial network model satisfies the pre-training ending condition and the adjusted initial network model is used as the neural network model.

For example, the adjusted initial network model satisfies the pre-training ending condition when the quantity of times the adjusted initial network model has been trained reaches a second set quantity, or the loss of the initial network model is within a first set range, or the loss of the initial network model converges. The second set quantity and the first set range are set according to experience, or are flexibly adjusted according to an application requirement. This is not limited in this exemplary embodiment of this disclosure. The second set quantity may be the same as or different from the first set quantity.

Pre-training the initial network model gives the neural network model a capability of determining the predicted mask layout based on the sample chip layout. In addition, when the neural network model is subsequently trained, a convergence speed of the neural network model can be increased and efficiency can be improved.

Functions, structures, and the like of the initial network model and the neural network model are the same, and there is only a difference in model parameters between the two network models. Therefore, an implementation principle of determining the predicted mask layout of the sample chip layout through the initial network model is the same as an implementation principle of determining the predicted mask layout of the sample chip layout through the neural network model. Refer to the following descriptions for operations 2021 to 2022. In an exemplary implementation, operation 202 includes operations 2021 to 2022 (not shown).

Operation 2021: Encode the sample chip layout through the neural network model, to obtain a layout feature of the sample chip layout.

In this exemplary embodiment of this disclosure, the neural network model is a U-net model, and the neural network model includes an encoder. The sample chip layout is encoded through the encoder, to obtain the layout feature of the sample chip layout.

In some exemplary embodiments, the encoder includes at least one convolutional neural network layer, and any convolutional neural network layer includes at least one filter. For example, the encoder includes eight convolutional neural network layers. The eight convolutional neural network layers respectively include 8, 16, 32, 64, 128, 256, 512, and 1024 3×3 filters. For any convolutional neural network layer, the convolutional neural network layer may be followed by a batch normalization layer and an activation function layer. In some exemplary embodiments, the activation function layer is a function layer of a rectified linear unit (ReLU).

After the sample chip layout is inputted to the encoder, the sample chip layout is convolved through the convolutional neural network layers sequentially, and the layout feature of the sample chip layout is obtained based on a convolution processing result of the last convolutional neural network layer. In some exemplary embodiments, the sample chip layout is a matrix of (256, 256, 1) dimensions, and the layout feature of the sample chip layout is a matrix of (1, 1, 1024) dimensions. Certainly, the sample chip layout and the layout feature may alternatively be matrices in other dimensions. This is not limited in this exemplary embodiment of this disclosure.

Operation 2022: Decode the layout feature of the sample chip layout through the neural network model, to obtain the predicted mask layout.

The neural network model further includes a decoder connected in series to the encoder. The layout feature of the sample chip layout is decoded through the decoder, to obtain the predicted mask layout.

In some exemplary embodiments, the decoder includes at least one deconvolutional neural network layer, and any deconvolutional neural network layer includes at least one filter. For example, the decoder includes eight deconvolutional neural network layers. The eight deconvolutional neural network layers respectively include 1024, 512, 256, 128, 64, 32, 16, and 1 3×3 filter. For any deconvolutional neural network layer, the deconvolutional neural network layer may be followed by a batch normalization layer and/or an activation function layer. In some exemplary embodiments, the activation function layer is a function layer of a leaky rectified linear unit (Leaky-ReLU) or a function layer of a sigmoid activation function. For example, each of the first seven deconvolutional neural network layers is followed by the batch normalization layer and the function layer of the Leaky-ReLU, and the eighth deconvolutional neural network layer is followed by the function layer of the sigmoid activation function.

After the layout feature of the sample chip layout is inputted to the decoder, the layout feature is deconvolved through the deconvolutional neural network layers sequentially, and sigmoid activation processing is performed on a deconvolution processing result of the last deconvolutional neural network layer, to obtain the predicted mask layout. In some exemplary embodiments, the layout feature of the sample chip layout is a matrix of (1, 1, 1024) dimensions. The predicted mask layout is a matrix of (256, 256, 1) dimensions, and values in the matrix are values between 0 and 1. Certainly, the layout feature of the sample chip layout and the predicted mask layout may alternatively be matrices in other dimensions. This is not limited in this exemplary embodiment of this disclosure.
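The example dimensions given for the encoder and the decoder can be checked with a short shape walk-through. The filter counts are taken from the examples above; the stride of 2 and the padding that halves (or doubles) the spatial size at each layer are assumptions, because the text states only the filter size and the input and output dimensions. Skip connections of the U-net are not modeled.

```python
ENCODER_FILTERS = [8, 16, 32, 64, 128, 256, 512, 1024]
DECODER_FILTERS = [1024, 512, 256, 128, 64, 32, 16, 1]

def encoder_shapes(size=256, stride=2):
    """Output shape after each convolutional layer (assumed stride 2)."""
    shapes = []
    for filters in ENCODER_FILTERS:
        size = -(-size // stride)  # ceiling division: each layer halves the size
        shapes.append((size, size, filters))
    return shapes

def decoder_shapes(size=1, stride=2):
    """Output shape after each deconvolutional layer (assumed stride 2)."""
    shapes = []
    for filters in DECODER_FILTERS:
        size = size * stride       # each layer doubles the size
        shapes.append((size, size, filters))
    return shapes

layout_feature_shape = encoder_shapes()[-1]
predicted_mask_shape = decoder_shapes()[-1]
```

Under these assumptions, eight halvings take a 256×256 input to 1×1 with 1024 channels, and eight doublings take the layout feature back to 256×256 with a single channel, which agrees with the (1, 1, 1024) and (256, 256, 1) dimensions in the examples.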

Operation 203: Determine a first wafer layout of the predicted mask layout through a first wafer layout determining model.

The first wafer layout determining model includes an actual lithography process parameter. The first wafer layout is a wafer layout of a predicted mask layout obtained by prediction under conditions of the actual lithography process parameter.

In an actual operation, in a process of exposing the labeled mask layout on the wafer through the lithography machine, the lithography machine may perform exposure based on any lithography process parameter. The lithography process parameter used when the lithography process is actually performed is the actual lithography process parameter.

In an exemplary embodiment, the actual lithography process parameter may be determined through a defocus process parameter distribution function and/or an exposure dose deviation process parameter distribution function. In the process of an actual lithography process, a distance between the wafer and the focal plane of the lithography machine may not be equal to 0. Based on this, for the actual lithography process, h_μ in formula (1) may be any data, and σ_h may remain unchanged. For example, a value of h_μ may be [−80 nm, 0 nm, 80 nm], σ_h=80. Furthermore, during an actual lithography process, an exposure dose deviation of the lithography machine may not be equal to 0. Based on this, t_q in formula (2) may be any data, and σ_q may remain unchanged. For example, a value of t_q may be [−0.1, 0.0, 0.1], σ_q=0.1.

It can be seen from the foregoing that the reference lithography process parameter is a set lithography process parameter. In some exemplary embodiments, the set lithography process parameter refers to: the wafer and the focal plane of the lithography machine are located in the same plane, and the exposure dose of the lithography machine has no deviation. The actual lithography process parameter is a lithography process parameter that is actually used. The lithography process parameter that is actually used covers the following cases: the wafer and the focal plane of the lithography machine are or are not located in the same plane (in the latter case, the wafer is above or below the focal plane of the lithography machine), and the exposure dose of the lithography machine has a deviation or no deviation.

For example, the predicted mask layout is directly inputted to the first wafer layout determining model, and the first wafer layout is outputted through the first wafer layout determining model. Alternatively, the predicted mask layout is first activated, to obtain a reference mask layout, the reference mask layout is then inputted to the first wafer layout determining model, and the first wafer layout is outputted through the first wafer layout determining model. An implementation of operation 203 is similar to the implementation of operation A2, and may be seen from the descriptions for operation A2. Details are not described herein again.

The predicted mask layout in operation 203 corresponds to the initial mask layout in operation A2. The reference mask layout in operation 203 corresponds to the reference mask layout in operation A2. The first wafer layout determining model in operation 203 corresponds to the second wafer layout determining model in operation A2. The first wafer layout in operation 203 corresponds to the second wafer layout in operation A2.

In an exemplary implementation, operation 203 includes operations 2031 to 2032 (not shown).

Operation 2031: Determine actual light intensity distribution information based on the predicted mask layout through the first wafer layout determining model, where the actual light intensity distribution information is a light intensity distribution obtained after the predicted mask layout is imaged on a wafer under the conditions of the actual lithography process parameter.

An implementation of operation 2031 is similar to the implementation of operation A21, and may be seen from the descriptions for operation A21. Details are not described herein again. The predicted mask layout in operation 2031 corresponds to the initial mask layout in operation A21. The first wafer layout determining model in operation 2031 corresponds to the second wafer layout determining model in operation A21. The actual light intensity distribution information in operation 2031 corresponds to the reference light intensity distribution information in operation A21.

Since the first wafer layout determining model includes an actual lithography process parameter, h_μ in formula (4) may be any data. For example, a value of h_μ may be [−80 nm, 0 nm, 80 nm].

Operation 2032: Determine the first wafer layout based on the actual light intensity distribution information through the first wafer layout determining model.

An implementation of operation 2032 is similar to the implementation of operation A22, and may be seen from the descriptions for operation A22. Details are not described herein again. The first wafer layout in operation 2032 corresponds to the second wafer layout in operation A22. The first wafer layout determining model in operation 2032 corresponds to the second wafer layout determining model in operation A22. The actual light intensity distribution information in operation 2032 corresponds to the reference light intensity distribution information in operation A22.

Since the first wafer layout determining model includes an actual lithography process parameter, h_μ in formula (5) may be any data, and t_q may be any data. For example, a value of h_μ may be [−80 nm, 0 nm, 80 nm], and a value of t_q may be [−0.1, 0.0, 0.1].
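One possible way to organize the example values of h_μ and t_q is to enumerate every pairing of a defocus value with an exposure dose deviation value. The text does not state that the values are combined exhaustively, so the enumeration below is an assumption made for illustration, and the names are hypothetical.

```python
from itertools import product

DEFOCUS_NM = [-80, 0, 80]            # example values of h_mu, in nm
DOSE_DEVIATION = [-0.1, 0.0, 0.1]    # example values of t_q

def process_conditions(defocus_values=DEFOCUS_NM, dose_values=DOSE_DEVIATION):
    """All (h_mu, t_q) pairs; an assumed way of sampling the process window."""
    return list(product(defocus_values, dose_values))

conditions = process_conditions()

# In-focus wafer and no dose deviation, as in the set (reference) parameter.
reference_condition = (0, 0.0)
```

With three values for each parameter this yields nine conditions, one of which coincides with the in-focus, no-deviation setting described for the reference lithography process parameter.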

Operation 204: Train the neural network model through the labeled mask layout, the predicted mask layout, the sample chip layout, and the first wafer layout, to obtain a mask layout determining model.

The mask layout determining model is configured for determining a target mask layout of a target chip layout.

For example, the target loss is determined through the labeled mask layout, the predicted mask layout, the sample chip layout, and the first wafer layout. The neural network model is trained through the target loss, to obtain a trained neural network model. If the trained neural network model satisfies a training ending condition, the trained neural network model is determined as the mask layout determining model. If the trained neural network model does not satisfy the training ending condition, the trained neural network model is used as the neural network model for the next round of training, and operations 202 to 204 are performed again, until the trained neural network model satisfies the training ending condition and the trained neural network model is determined as the mask layout determining model.

The training ending condition is not limited in this exemplary embodiment of this disclosure. For example, the trained neural network model satisfies the training ending condition when the quantity of times the neural network model has been trained reaches a third set quantity, or the target loss corresponding to the trained neural network model is within a second set range, or the target loss corresponding to the trained neural network model converges.

The third set quantity and the second set range are set according to experience, or are flexibly adjusted according to an application requirement. This is not limited in this exemplary embodiment of this disclosure. The third set quantity may be the same as the first set quantity or the second set quantity, or may be different from both the first set quantity and the second set quantity. The second set range may be the same as or different from the first set range.
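The three alternative ending conditions (a set quantity of training rounds, a loss within a set range, or a converged loss) can be expressed as a single check. This is a sketch under assumptions: the text does not define how convergence is detected, so the window-based test, the tolerance, and all parameter names below are hypothetical.

```python
def training_should_end(iteration, losses, max_iterations, loss_range,
                        conv_tol=1e-6, window=3):
    """Return True if any of the three ending conditions is met.

    iteration      -- quantity of training rounds completed so far
    losses         -- target loss recorded after each round
    max_iterations -- the set quantity of training rounds
    loss_range     -- (low, high) bounds of the set range for the loss
    conv_tol, window -- assumed convergence test: the loss has varied by
                        less than conv_tol over the last window+1 rounds
    """
    if iteration >= max_iterations:
        return True
    if not losses:
        return False
    low, high = loss_range
    if low <= losses[-1] <= high:
        return True
    if len(losses) > window:
        recent = losses[-(window + 1):]
        if max(recent) - min(recent) < conv_tol:
            return True
    return False
```

The same check could serve for the pre-training ending condition by passing the second set quantity and the first set range instead.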

The predicted mask layout is determined through the neural network model, so that the predicted mask layout is determined based on a deep learning method, a calculation amount of determining the predicted mask layout is reduced, and efficiency of determining the predicted mask layout is improved. Furthermore, the first wafer layout is determined through the first wafer layout determining model, so that the first wafer layout is determined based on the deep learning method, a calculation amount of determining the first wafer layout is reduced, and efficiency of determining the first wafer layout is improved. Since efficiency of determining the predicted mask layout and the first wafer layout is high and efficiency of determining the labeled mask layout is also high, efficiency of training the mask layout determining model can be improved.

In an exemplary implementation, operation 204 includes operations 2041 to 2045 (not shown).

Operation 2041: Determine a first target sub-loss through third difference information between the labeled mask layout and the predicted mask layout.

For example, the third difference information between the labeled mask layout and the predicted mask layout is calculated. In some exemplary embodiments, the third difference information may be obtained by calculation based on a difference between a pixel value of a pixel in the labeled mask layout and a pixel value of a corresponding pixel in the predicted mask layout. For example, a mean square error between the pixel value of the pixel in the labeled mask layout and the pixel value of the corresponding pixel in the predicted mask layout is used as the third difference information.

The first target sub-loss is determined based on the third difference information, to describe an error between the labeled mask layout and the predicted mask layout through the first target sub-loss. In some exemplary embodiments, the first target sub-loss is determined according to the following formula (27).

$\begin{matrix} L_{fit} = {❘ {Mask}_{pred} - Mask ❘}^{2} & Formula (27) \end{matrix}$

- where L_fit represents the first target sub-loss. Mask_pred−Mask represents the third difference information between the labeled mask layout and the predicted mask layout, Mask_pred represents the predicted mask layout, and Mask represents the labeled mask layout.
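Formula (27) can be illustrated with a small numerical example. The sketch assumes both layouts are NumPy arrays of the same shape; the `reduce` option is an assumption added to show both the summed squared difference written in formula (27) and the mean square error mentioned above as an example of the third difference information.

```python
import numpy as np

def first_target_sub_loss(mask_pred, mask, reduce="sum"):
    """Squared difference between predicted and labeled mask layouts.

    reduce="sum"  -- sum of squared pixel differences, as in formula (27)
    reduce="mean" -- mean square error over all pixels
    """
    diff = np.asarray(mask_pred, dtype=float) - np.asarray(mask, dtype=float)
    squared = diff ** 2
    return float(squared.mean() if reduce == "mean" else squared.sum())

# Illustrative values only.
mask = np.array([[0, 1], [1, 0]])                  # labeled mask layout
mask_pred = np.array([[0.1, 0.9], [0.6, 0.0]])     # predicted mask layout

l_fit_sum = first_target_sub_loss(mask_pred, mask)
l_fit_mean = first_target_sub_loss(mask_pred, mask, reduce="mean")
```

The two variants differ only by a constant factor (the pixel count), so they lead to the same minimizer; the choice mainly affects how the weight of the other sub-losses should be scaled.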

Operation 2042: Determine a second target sub-loss through fourth difference information between the sample chip layout and the first wafer layout. A mode of determining the second target sub-loss is similar to a mode of determining the first reference sub-loss, and may be seen from the descriptions for operation A31. Details are not described herein again. The “first wafer layout” in operation 2042 corresponds to the “second wafer layout” in operation A31. The “fourth difference information” in operation 2042 corresponds to the “first difference information” in operation A31. The “second target sub-loss” in operation 2042 corresponds to the “first reference sub-loss” in operation A31.

Operation 2043: Determine a target loss based on the first target sub-loss and the second target sub-loss.

In an exemplary embodiment, weighted calculation is performed on the first target sub-loss and the second target sub-loss, to obtain a weighted calculation result used as the target loss. Alternatively, the target loss is determined based on the first target sub-loss and the second target sub-loss according to an exemplary implementation shown below.

In an exemplary implementation, operation 2043 includes:
- obtaining at least one of a third target sub-loss, a fourth target sub-loss, a fifth target sub-loss, a sixth target sub-loss, or a seventh target sub-loss, where the third target sub-loss is determined based on the predicted mask layout, the fourth target sub-loss is determined based on fifth difference information between the sample chip layout and the predicted mask layout, the fifth target sub-loss is determined based on the fourth difference information and the edge modification information, the sixth target sub-loss is determined based on the predicted mask layout and the edge modification information, the seventh target sub-loss is determined based on the fifth difference information and the edge modification information, and the edge modification information is information for modifying the edge of the sample chip layout; and
- determining the target loss based on the first target sub-loss, the second target sub-loss, and at least one of the third target sub-loss, the fourth target sub-loss, the fifth target sub-loss, the sixth target sub-loss, or the seventh target sub-loss.

A mode of determining the third target sub-loss to the seventh target sub-loss in operation 2043 is similar to a mode of determining the second reference sub-loss to the sixth reference sub-loss in operation A321. A corresponding relationship between the target sub-loss and the reference sub-loss is:

A mode of determining the third target sub-loss is similar to a mode of determining the second reference sub-loss, and may be seen from the descriptions for the second reference sub-loss in operation A321. Details are not described herein again. The “predicted mask layout” in operation 2043 corresponds to the “initial mask layout” in operation A321. The “third target sub-loss” in operation 2043 corresponds to the “second reference sub-loss” in operation A321.

A mode of determining the fourth target sub-loss is similar to a mode of determining the third reference sub-loss, and may be seen from the descriptions for the third reference sub-loss in operation A321. Details are not described herein again. The “predicted mask layout” in operation 2043 corresponds to the “initial mask layout” in operation A321. The “fifth difference information” in operation 2043 corresponds to the “second difference information” in operation A321. The “fourth target sub-loss” in operation 2043 corresponds to the “third reference sub-loss” in operation A321.

A mode of determining the fifth target sub-loss is similar to a mode of determining the fourth reference sub-loss, and may be seen from the descriptions for the fourth reference sub-loss in operation A321. Details are not described herein again. The “fourth difference information” in operation 2043 corresponds to the “first difference information” in operation A321. The “fifth target sub-loss” in operation 2043 corresponds to the “fourth reference sub-loss” in operation A321.

A mode of determining the sixth target sub-loss is similar to a mode of determining the fifth reference sub-loss, and may be seen from the descriptions for the fifth reference sub-loss in operation A321. Details are not described herein again. The “predicted mask layout” in operation 2043 corresponds to the “initial mask layout” in operation A321. The “sixth target sub-loss” in operation 2043 corresponds to the “fifth reference sub-loss” in operation A321.

A mode of determining the seventh target sub-loss is similar to a mode of determining the sixth reference sub-loss, and may be seen from the descriptions for the sixth reference sub-loss in operation A321. Details are not described herein again. The “fifth difference information” in operation 2043 corresponds to the “second difference information” in operation A321. The “seventh target sub-loss” in operation 2043 corresponds to the “sixth reference sub-loss” in operation A321.

For example, weighted summation is performed on the first target sub-loss, the second target sub-loss, and at least one of the third target sub-loss, the fourth target sub-loss, the fifth target sub-loss, the sixth target sub-loss, and the seventh target sub-loss. The target loss is determined based on a weighted summation result. For any two target sub-losses in the first target sub-loss to the seventh target sub-loss, weights of the two target sub-losses may be the same or may be different.

In some exemplary embodiments, the target loss is determined according to the following formula (28).

$\begin{matrix} L_{loss} = L_{fit} + \alpha \times L_{total} = \left| {Mask}_{pred} - Mask \right|^{2} + \alpha \times L_{total} & \text{Formula (28)} \end{matrix}$

- where L_loss represents the target loss. L_fit represents the first target sub-loss. L_total represents a sum of target sub-losses obtained after weighted summation is performed on the second target sub-loss to the seventh target sub-loss. The sum of the target sub-losses corresponds to the “reference loss” mentioned above. For a mode of determining L_total, refer to formula (12). Details are not described herein again. α represents a weight of L_total, and is an adjustable parameter. A larger value of α indicates that training of the neural network model pays more attention to the actual lithography process parameter, thus helping to enlarge a process window for generating the mask layout. Mask_pred − Mask represents the third difference information, Mask_pred represents the predicted mask layout, and Mask represents the labeled mask layout.
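As a minimal illustration of formula (28), the following sketch evaluates the target loss for small arrays. It assumes that the squared term is the sum of squared element-wise differences between the two layouts; the function name `target_loss` and the numeric values are hypothetical and are used only for illustration.

```python
import numpy as np

def target_loss(mask_pred, mask_label, l_total, alpha=1.0):
    """Evaluate L_loss = |Mask_pred - Mask|^2 + alpha * L_total (formula (28)).

    Assumption: |.|^2 is taken as the sum of squared element-wise differences.
    """
    l_fit = float(np.sum((mask_pred - mask_label) ** 2))  # first target sub-loss
    return l_fit + alpha * l_total

# Hypothetical 2x2 layouts and a hypothetical L_total value.
mask_pred = np.array([[0.9, 0.1], [0.2, 0.8]])
mask_label = np.array([[1.0, 0.0], [0.0, 1.0]])
loss = target_loss(mask_pred, mask_label, l_total=0.5, alpha=2.0)
print(loss)
```

In this sketch, increasing `alpha` raises the contribution of `l_total` relative to the fitting term, which matches the role of α described above.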

Operation 2044: Determine a second gradient of the target loss relative to a model parameter of the neural network model.

The target loss includes the first target sub-loss, the second target sub-loss, and at least one of the third target sub-loss to the seventh target sub-loss. Therefore, the second gradient of the target loss relative to the model parameter of the neural network model includes three parts. The first part is a gradient of the at least one target sub-loss in the third target sub-loss to the seventh target sub-loss relative to the model parameter of the neural network model. The second part is a gradient of the first target sub-loss relative to the model parameter of the neural network model. The third part is a gradient of the second target sub-loss relative to the model parameter of the neural network model.

An example in which the target loss includes the first target sub-loss to the seventh target sub-loss is used. The second gradient of the target loss relative to the model parameter of the neural network model is shown in the following formula (29).

$\begin{matrix} \frac{\partial L_{loss}}{\partial w} = \frac{\partial L_{fit}}{\partial w} + \alpha \times \frac{\partial L_{total}}{\partial {Mask}_{pred}} \times \frac{\partial {Mask}_{pred}}{\partial w} & \text{Formula (29)} \end{matrix}$

- where ∂L_loss/∂w represents the second gradient, L_loss represents the target loss, and w represents the model parameter of the neural network model. ∂L_fit/∂w represents the gradient of the first target sub-loss relative to the model parameter of the neural network model, and L_fit represents the first target sub-loss. ∂L_total/∂Mask_pred represents the gradient, relative to the predicted mask layout, of the sum of target sub-losses obtained after weighted summation is performed on the second target sub-loss to the seventh target sub-loss, L_total represents the sum of the target sub-losses, and Mask_pred represents the predicted mask layout. α represents the weight of L_total, and ∂Mask_pred/∂w represents the gradient of the predicted mask layout relative to the model parameter of the neural network model.

The “model parameter of the neural network model” mentioned in this exemplary embodiment of this disclosure refers to a weight of each neuron included in the neural network model.
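The chain rule in formula (29) can be checked numerically on a toy model. The sketch below is illustrative only: the per-pixel model `sigmoid(w * chip)` and the stand-in for L_total are assumptions and are not the neural network model or the sub-losses of this disclosure. It compares the gradient assembled according to formula (29) with a central finite-difference estimate.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_mask(w, chip):
    # Toy stand-in for the neural network model (assumption): one weight per pixel.
    return sigmoid(w * chip)

def loss(w, chip, mask_label, alpha):
    m = predict_mask(w, chip)
    l_fit = np.sum((m - mask_label) ** 2)
    l_total = np.sum(4.0 * m * (1.0 - m))  # hypothetical stand-in for L_total
    return l_fit + alpha * l_total

def second_gradient(w, chip, mask_label, alpha):
    m = predict_mask(w, chip)
    dmask_dw = m * (1.0 - m) * chip               # dMask_pred/dw (diagonal Jacobian)
    dfit_dw = 2.0 * (m - mask_label) * dmask_dw   # dL_fit/dw
    dtotal_dmask = 4.0 * (1.0 - 2.0 * m)          # dL_total/dMask_pred
    return dfit_dw + alpha * dtotal_dmask * dmask_dw  # formula (29)

def numeric_gradient(w, chip, mask_label, alpha, eps=1e-6):
    grad = np.zeros_like(w)
    for i in range(w.size):
        step = np.zeros_like(w)
        step[i] = eps
        grad[i] = (loss(w + step, chip, mask_label, alpha)
                   - loss(w - step, chip, mask_label, alpha)) / (2.0 * eps)
    return grad

chip = np.array([1.0, 0.5, 2.0])
mask_label = np.array([1.0, 0.0, 1.0])
w = np.array([0.3, -0.2, 0.1])
analytic = second_gradient(w, chip, mask_label, alpha=0.5)
numeric = numeric_gradient(w, chip, mask_label, alpha=0.5)
print(np.max(np.abs(analytic - numeric)))
```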

Operation 2045: Adjust the model parameter of the neural network model based on the second gradient, to obtain the mask layout determining model.

For example, an adjustment step of the model parameter is determined based on the second gradient, and the model parameter of the neural network model is adjusted based on the second gradient and the adjustment step, to obtain an adjusted model parameter, so that the neural network model is trained once, to obtain a trained neural network model. The model parameter of the trained neural network model is the adjusted model parameter.

If the trained neural network model satisfies a training ending condition, the trained neural network model is determined as the mask layout determining model. If the trained neural network model does not satisfy the training ending condition, the trained neural network model is trained again in the same mode, until the trained neural network model satisfies the training ending condition and is determined as the mask layout determining model. Content of the training ending condition is described above. Details are not described herein again.

In this exemplary embodiment of this disclosure, the second gradient of the target loss relative to the model parameter of the neural network model is calculated, to minimize the target loss through a gradient descent method. The target loss includes the first target sub-loss, the second target sub-loss, and at least one of the third target sub-loss to the seventh target sub-loss. By minimizing the target loss, the first target sub-loss, the second target sub-loss, and the at least one of the third target sub-loss to the seventh target sub-loss may be minimized.

The first target sub-loss is determined based on the third difference information between the labeled mask layout and the predicted mask layout. By minimizing the first target sub-loss, the predicted mask layout outputted by the model increasingly approaches the labeled mask layout, so that a target mask layout with high accuracy can be determined through the mask layout determining model.

The second target sub-loss is determined based on the fourth difference information between the sample chip layout and the first wafer layout. By minimizing the second target sub-loss, under the conditions of the actual lithography process parameter, a wafer layout similar to the sample chip layout can be obtained based on the mask layout outputted by the model, thus helping to improve the quality of the wafer layout, so that the target mask layout determined through the mask layout determining model has high quality.

The third target sub-loss is determined based on the predicted mask layout. The third target sub-loss may reflect an error between continuous distribution of the predicted mask layout and discrete distribution of a binary mask layout. By minimizing the third target sub-loss, the predicted mask layout may increasingly approach the binary mask layout, thereby improving the accuracy of the predicted mask layout.
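The expression of the third target sub-loss is given with operation A321 and is not reproduced here. Purely as an illustration of how an error between a continuous mask and a binary mask can be measured, the sketch below assumes a quadratic penalty 4·M·(1 − M), which is zero for pixel values 0 and 1 and largest at 0.5. The function name `binarization_penalty` is hypothetical.

```python
import numpy as np

def binarization_penalty(mask_pred):
    """Assumed penalty: sum of 4*M*(1-M) over all pixels of a mask in [0, 1]."""
    return float(np.sum(4.0 * mask_pred * (1.0 - mask_pred)))

binary_mask = np.array([[0.0, 1.0], [1.0, 0.0]])  # already binary
gray_mask = np.full((2, 2), 0.5)                  # maximally non-binary

print(binarization_penalty(binary_mask))
print(binarization_penalty(gray_mask))
```

Under this assumed form, driving the penalty toward zero pushes every pixel of the predicted mask layout toward 0 or 1.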

The fourth target sub-loss is determined based on the fifth difference information between the sample chip layout and the predicted mask layout. The fourth target sub-loss may measure complexity of the predicted mask layout. By minimizing the fourth target sub-loss, the complexity of the predicted mask layout can be reduced, the manufacturability of the predicted mask layout can be improved, and a process window for generating the mask layout can be enlarged.

The fifth target sub-loss is determined based on the fourth difference information between the sample chip layout and the first wafer layout and the edge modification information. The fifth target sub-loss may describe an error between the sample chip layout and the first wafer layout in a case that noise exists in the sample chip layout. By minimizing the fifth target sub-loss, a generalization capability of the model is improved, so that the model can output a high-quality mask layout. In short, even if there is a small difference between two chip layouts, the model can output a mask layout of each chip layout, and a wafer layout highly similar to the corresponding chip layout can be obtained based on the mask layout of either chip layout, so that the two wafer layouts also reflect the difference between the two chip layouts.

The sixth target sub-loss is determined based on the edge modification information and the predicted mask layout. The sixth target sub-loss may reflect the error between the continuous distribution of the predicted mask layout and the discrete distribution of the binary mask layout in a case that the predicted mask layout has the same noise as that of the sample chip layout. By minimizing the sixth target sub-loss, the generalization capability of the model can be improved, so that the model can output an accurate mask layout. In short, even if there is a small difference between two chip layouts, the model can output a mask layout of each chip layout, and the two mask layouts reflect the difference between the two chip layouts. Each mask layout is similar to the binary mask layout and has high accuracy.

The seventh target sub-loss is determined based on the fifth difference information between the sample chip layout and the predicted mask layout and the edge modification information. The seventh target sub-loss may measure the complexity of the predicted mask layout in a case that the predicted mask layout has the same noise as that of the sample chip layout. By minimizing the seventh target sub-loss, the generalization capability of the model can be improved, so that the model can output a mask layout with low complexity. In short, even if there is a small difference between two chip layouts, the model can output a mask layout of each chip layout. The two mask layouts can reflect the difference between the two chip layouts, and each mask layout has low complexity.

The target loss includes the first target sub-loss, the second target sub-loss, and at least one of the third target sub-loss to the seventh target sub-loss. Therefore, the mask layout determining model has effects corresponding to the first target sub-loss, the second target sub-loss, and the at least one of the third target sub-loss to the seventh target sub-loss. Furthermore, the target loss may further include another target sub-loss.

For example, this target sub-loss is an eighth target sub-loss. In some exemplary embodiments, a first wafer layout of the predicted mask layout is determined through at least two first wafer layout determining models, where any two first wafer layout determining models include different actual lithography process parameters. The eighth target sub-loss is determined based on sixth difference information between every two first wafer layouts. A target loss is determined based on the eighth target sub-loss, the first target sub-loss, the second target sub-loss, and at least one of the third target sub-loss to the seventh target sub-loss. The neural network model is trained through the target loss, to obtain the mask layout determining model. The eighth target sub-loss is configured for measuring an error between the first wafer layouts generated by the predicted mask layout under the conditions of different actual lithography process parameters, so that the complexity of the predicted mask layout can be measured.

Usually, the complexity of the mask layout is positively correlated to a manufacturing cost of the mask layout. This is because the mask layout with high complexity has more tiny structures. These tiny structures include, but are not limited to, holes, extensions, and aliasing. A larger quantity of tiny structures indicates a higher manufacturing cost of the mask layout. By minimizing the target loss, the eighth target sub-loss may be minimized, so that the first wafer layouts generated by the predicted mask layout under the conditions of different actual lithography process parameters are increasingly similar to each other, thereby reducing the complexity of the predicted mask layout, and reducing the manufacturing cost of the predicted mask layout.
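The eighth target sub-loss described above compares every two first wafer layouts obtained under different actual lithography process parameters. A minimal sketch follows, assuming that the sixth difference information is measured as a sum of squared pixel differences and that the pairwise terms are simply added; the function name `process_variation_loss` and the sample layouts are hypothetical.

```python
from itertools import combinations

import numpy as np

def process_variation_loss(wafer_layouts):
    """Sum of squared differences over every pair of wafer layouts (assumed form)."""
    return float(sum(np.sum((a - b) ** 2)
                     for a, b in combinations(wafer_layouts, 2)))

# Hypothetical wafer layouts of one predicted mask layout under three
# different actual lithography process parameters.
nominal = np.array([[0.0, 1.0], [1.0, 0.0]])
defocus = np.array([[0.0, 1.0], [1.0, 0.0]])
overdose = np.array([[1.0, 1.0], [1.0, 0.0]])  # differs in one pixel

print(process_variation_loss([nominal, defocus, overdose]))
```

The value is zero only when all wafer layouts coincide, so minimizing it makes the layouts produced under the different process parameters increasingly similar to each other.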

A ninth target sub-loss may further be determined based on the sixth difference information and the edge modification information. A target loss is determined based on the first target sub-loss, the second target sub-loss, and at least one of the third target sub-loss to the ninth target sub-loss. The neural network model is trained through the target loss, to obtain the mask layout determining model. By minimizing the target loss, the ninth target sub-loss may be minimized. While the generalization capability of the model is improved, the model can output a mask layout with low complexity.

Furthermore, a second wafer layout of the initial mask layout may be determined through at least two second wafer layout determining models according to a principle of determining the eighth target sub-loss, where any two second wafer layout determining models include different reference lithography process parameters. The seventh reference sub-loss is determined based on seventh difference information between every two second wafer layouts. The eighth reference sub-loss may further be determined based on the seventh difference information and the edge modification information according to a principle of determining the ninth target sub-loss.

A reference loss is determined based on the first reference sub-loss and at least one of the second reference sub-loss to the eighth reference sub-loss. The initial mask layout is adjusted through the reference loss, to obtain the labeled mask layout. The seventh reference sub-loss is configured for measuring an error between the second wafer layouts generated by the initial mask layout under the conditions of different reference lithography process parameters, so that the complexity of the initial mask layout can be measured. The eighth reference sub-loss may measure the complexity of the initial mask layout in a case that the initial mask layout has the same noise as that of the sample chip layout. By reducing the reference loss, the complexity of the labeled mask layout can be reduced.

The information (including, but not limited to, user equipment information, user personal information, and the like), data (including, but not limited to, data for analysis, stored data, displayed data, and the like), and signals involved in this disclosure all are authorized by the user or fully authorized by each party, and the collection, use, and processing of relevant data need to comply with relevant laws and regulations of relevant regions. For example, the sample chip layout, the reference lithography process parameter, the actual lithography process parameter, and the like involved in this disclosure are all obtained under full authorization.

By using the foregoing method, a predicted mask layout of a sample chip layout is determined through a neural network model, and a first wafer layout in which an actual lithography process parameter is considered is determined based on the predicted mask layout. The neural network model is trained through a labeled mask layout, the predicted mask layout, the sample chip layout, and the first wafer layout, to obtain a mask layout determining model, so that a target mask layout determined through the mask layout determining model better conforms to the actual lithography process parameter, and a wafer layout highly similar to a target chip layout can be obtained under the conditions of the actual lithography process parameter, thereby improving the quality of the target mask layout.

Furthermore, the initial mask layout is optimized step by step through the second wafer layout determining model, to obtain the labeled mask layout, so that the labeled mask layout is determined based on a deep learning method, a calculation amount of determining the labeled mask layout is reduced, and efficiency of determining the labeled mask layout is improved. The initial mask layout is constantly adjusted through the sample chip layout and the second wafer layout, so that the mask layout is constantly optimized toward a direction in which the wafer layout can increasingly approach the sample chip layout, thereby achieving high accuracy of the labeled mask layout.

This exemplary embodiment of this disclosure provides a mask layout determining method. The method may be applied to the foregoing implementation environment. A high-quality mask layout may be determined, so that a wafer layout determined based on the mask layout is highly similar to a chip layout. A flowchart of a mask layout determining method according to an exemplary embodiment of this disclosure shown in FIG. 5 is used as an example. For ease of description, the terminal device 101 or the server 102 that performs the mask layout determining method in this exemplary embodiment of this disclosure is referred to as an electronic device. The method may be performed by the electronic device. As shown in FIG. 5, the method includes the following operations 501 and 502.

Operation 501: Obtain a target chip layout. Content of the target chip layout is similar to content of the sample chip layout, and may be seen from the descriptions for operation 201. Details are not described herein again.

Operation 502: Determine a target mask layout of the target chip layout through a mask layout determining model. The mask layout determining model is obtained by training according to the mask layout determining model training method related to FIG. 2, and the target mask layout is a mask layout of the target chip layout obtained by prediction.

A mode of determining the target mask layout is similar to a mode of determining the predicted mask layout, and may be seen from the descriptions for operation 202. Details are not described herein again. The “mask layout determining model” in operation 502 corresponds to the “neural network model” in operation 202. The “target chip layout” in operation 502 corresponds to the “sample chip layout” in operation 202. The “target mask layout” in operation 502 corresponds to the “predicted mask layout” in operation 202.

In an exemplary implementation, operation 502 includes: encoding the target chip layout through the mask layout determining model, to obtain a layout feature of the target chip layout; and decoding the layout feature of the target chip layout through the mask layout determining model, to obtain the target mask layout.

A mode of encoding the target chip layout is similar to a mode of encoding the sample chip layout, and may be seen from the descriptions for operation 2021. A mode of decoding the layout feature of the target chip layout is similar to a mode of decoding the layout feature of the sample chip layout, and may be seen from the descriptions for operation 2022. Details are not described herein again.

In this exemplary embodiment of this disclosure, the target chip layout is inputted to the mask layout determining model, and the target mask layout of the target chip layout is determined through the mask layout determining model, so that the target mask layout is determined based on a deep learning method, a calculation amount of determining the target mask layout is reduced, and efficiency of determining the target mask layout is improved.

The information (including, but not limited to, user equipment information, user personal information, and the like), data (including, but not limited to, data for analysis, stored data, displayed data, and the like), and signals involved in this disclosure all are authorized by the user or fully authorized by each party, and the collection, use, and processing of relevant data need to comply with relevant laws and regulations of relevant regions. For example, the target chip layout and the like involved in this disclosure are obtained under full authorization.

The mask layout determining model in the foregoing method is obtained by training in the following mode: determining a predicted mask layout of a sample chip layout through a neural network model, and determining, based on the predicted mask layout, a first wafer layout in which an actual lithography process parameter is considered; and training the neural network model through a labeled mask layout, the predicted mask layout, the sample chip layout, and the first wafer layout, to obtain a mask layout determining model. Since the actual lithography process parameter is considered in the training process of the mask layout determining model, a target mask layout determined through the mask layout determining model better conforms to the actual lithography process parameter, and a wafer layout highly similar to a target chip layout can be obtained under the conditions of the actual lithography process parameter, thereby improving the quality of the target mask layout.

The foregoing describes the mask layout determining model training method and the mask layout determining method in this exemplary embodiment of this disclosure from the perspective of method operations. The following describes the methods systematically and comprehensively.

In this exemplary embodiment of this disclosure, two lithography mask data sets may be obtained. There are approximately 10271 chip layouts and corresponding mask layouts in the two lithography mask data sets, and these chip layouts satisfy a 32 nm lithography process and a certain design rule. Furthermore, a chip layout of a contact hole (VIA) type and a corresponding mask layout may further be obtained. The foregoing mask layout is obtained by optimizing the chip layout according to an inverse lithography mask optimization algorithm in an inverse lithography technology and by using the wafer layout determining model (i.e. the second wafer layout determining model mentioned above) including the reference lithography process parameter. For this, refer to content of operations A1 to A3. Details are not described herein again. Any chip layout may be used as the sample chip layout, and the mask layout corresponding to the sample chip layout is the labeled mask layout.

The initial network model is trained to be the mask layout determining model based on the sample chip layout and the labeled mask layout. This training process is roughly divided into two stages. The first stage is: pre-training the initial network model based on the sample chip layout and the labeled mask layout, to obtain the neural network model. The second stage is: optimally training the neural network model based on the sample chip layout and the labeled mask layout, to obtain the mask layout determining model. The two stages are separately described below.

First, the first stage is performed. Refer to FIG. 6. FIG. 6 is a schematic diagram of pre-training an initial network model according to an exemplary embodiment of this disclosure. In the pre-training stage, a sample chip layout may be inputted to the initial network model, and a predicted mask layout is outputted through the initial network model. A loss of the initial network model is determined based on difference information between the predicted mask layout and a labeled mask layout. A model parameter of the initial network model is adjusted based on the loss of the initial network model, to obtain an adjusted initial network model. In this way, the initial network model is trained once based on the sample chip layout and the labeled mask layout.

If the quantity of trainings reaches a second set quantity, the adjusted initial network model is used as a neural network model. If the quantity of trainings does not reach the second set quantity, the adjusted initial network model is used as the initial network model, and the initial network model is trained again based on the sample chip layout and the labeled mask layout according to the foregoing content, until the quantity of trainings reaches the second set quantity and the adjusted initial network model is used as the neural network model.

The second stage is performed after the neural network model is obtained by training. Refer to FIG. 7. FIG. 7 is a schematic diagram of optimally training a neural network model according to an exemplary embodiment of this disclosure. In the optimal training stage, a sample chip layout may be inputted to the neural network model, and a predicted mask layout is outputted through the neural network model. The predicted mask layout is inputted to a first wafer layout determining model including an actual lithography process parameter, to obtain a first wafer layout. A loss (i.e. the target loss mentioned above) of the neural network model is determined based on the sample chip layout, the first wafer layout, the predicted mask layout, and a labeled mask layout. A model parameter of the neural network model is adjusted based on the loss of the neural network model, to obtain an adjusted neural network model. In this way, the neural network model is trained once based on the sample chip layout and the labeled mask layout.

If the quantity of trainings reaches a third set quantity, the adjusted neural network model is used as a mask layout determining model. If the quantity of trainings does not reach the third set quantity, the adjusted neural network model is used as the neural network model, and the neural network model is trained again based on the sample chip layout and the labeled mask layout according to the foregoing content, until the quantity of trainings reaches the third set quantity and the adjusted neural network model is used as the mask layout determining model.
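The two training stages described above can be sketched with a toy model. Everything in the sketch is an assumption made for illustration: the two-parameter model `predict_mask`, the stand-in `toy_wafer` for the first wafer layout determining model, the finite-difference gradient, the learning rates, and the set quantities. The first stage minimizes only the fitting loss; the second stage continues from the pre-trained parameters with a loss that also compares the wafer layout with the sample chip layout.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_mask(w, chip):
    # Toy stand-in for the network (assumption): affine map plus sigmoid.
    return sigmoid(w[0] * chip + w[1])

def toy_wafer(mask, steepness=4.0, threshold=0.25):
    # Toy stand-in for the first wafer layout determining model (assumption).
    return sigmoid(steepness * (mask ** 2 - threshold))

def fit_loss(w, chip, label):
    return np.sum((predict_mask(w, chip) - label) ** 2)

def target_loss(w, chip, label, alpha=1.0):
    wafer = toy_wafer(predict_mask(w, chip))
    return fit_loss(w, chip, label) + alpha * np.sum((wafer - chip) ** 2)

def numeric_grad(loss_fn, w, eps=1e-6):
    grad = np.zeros_like(w)
    for i in range(w.size):
        step = np.zeros_like(w)
        step[i] = eps
        grad[i] = (loss_fn(w + step) - loss_fn(w - step)) / (2.0 * eps)
    return grad

def train(w, loss_fn, set_quantity, lr):
    # Train until the quantity of trainings reaches the set quantity.
    for _ in range(set_quantity):
        w = w - lr * numeric_grad(loss_fn, w)
    return w

chip = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 1.0])   # sample chip layout (flattened)
label = np.array([0.0, 0.1, 0.9, 1.0, 0.0, 1.0])  # labeled mask layout (flattened)
w_init = np.zeros(2)

# Stage 1: pre-training with the fitting loss only.
w_pre = train(w_init, lambda w: fit_loss(w, chip, label), set_quantity=200, lr=0.2)
# Stage 2: optimal training with the wafer-aware loss.
w_final = train(w_pre, lambda w: target_loss(w, chip, label), set_quantity=200, lr=0.05)
print(fit_loss(w_pre, chip, label), target_loss(w_final, chip, label))
```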

After the mask layout determining model is obtained by training, a target chip layout may be inputted to the mask layout determining model, and a target mask layout is outputted through the mask layout determining model. Since the actual lithography process parameter is considered in the optimal training stage, a process window of the target mask layout is large. Refer to FIG. 8. FIG. 8 is a schematic diagram of a mask process window according to an exemplary embodiment of this disclosure. The mask process window includes an exposure dose deviation range and a defocus range.

For a chip layout of a contact hole type and a chip layout of a logic type, corresponding mask layouts may be obtained in modes 1 to 4. Mode 1 is: obtaining a mask layout by optimization based on a wafer layout determining model including a reference lithography process parameter. The mask layout is the labeled mask layout mentioned above. Mode 2 is: obtaining a mask layout by optimization based on a wafer layout determining model including an actual lithography process parameter. The mode of determining the mask layout is similar to the mode of determining the labeled mask layout. Details are not described herein again. Mode 3 is: determining a mask layout based on a neural network model obtained by pre-training. The mask layout is a predicted mask layout outputted by the neural network model. Mode 4 is: determining a mask layout based on a mask layout determining model obtained by pre-training and optimal training. The mask layout is the target mask layout mentioned above.

For the mask layouts corresponding to the chip layouts of the contact hole type that are obtained by using modes 1 to 4, statistics about relationships between two process parameters, i.e. an exposure dose deviation and defocus, of various mask layouts may be collected, to obtain (a) in FIG. 8. For the mask layouts corresponding to the chip layouts of the logic type that are obtained by using modes 1 to 4, statistics about relationships between two process parameters, i.e. an exposure dose deviation and defocus, of various mask layouts may be collected, to obtain (b) in FIG. 8.

It can be seen from (a) in FIG. 8 and (b) in FIG. 8 that, in a case of the same defocus, the target mask layout corresponds to a larger exposure dose deviation, and in a case of the same exposure dose deviation, the target mask layout corresponds to a larger defocus. Therefore, the mask layout determining model in this exemplary embodiment of this disclosure may determine a mask layout with a larger process window.

FIG. 9 is a schematic structural diagram of a mask layout determining model training apparatus according to an exemplary embodiment of this disclosure. As shown in FIG. 9, the apparatus includes:

- an obtaining module 901, configured to obtain a labeled mask layout and a sample chip layout, where the labeled mask layout is a mask layout of the sample chip layout determined under conditions of a reference lithography process parameter;
- a determining module 902, configured to determine a predicted mask layout of the sample chip layout through a neural network model, where the predicted mask layout is a mask layout of the sample chip layout obtained by prediction;
- the determining module 902, further configured to determine a first wafer layout of the predicted mask layout through a first wafer layout determining model, where the first wafer layout determining model includes an actual lithography process parameter, and the first wafer layout is a wafer layout of the predicted mask layout obtained by prediction; and
- a training module 903, configured to train the neural network model through the labeled mask layout, the predicted mask layout, the sample chip layout, and the first wafer layout, to obtain a mask layout determining model, where the mask layout determining model is configured for determining a target mask layout of a target chip layout.

In an exemplary implementation, the obtaining module 901 is configured to: generate an initial mask layout based on the sample chip layout; determine a second wafer layout of the initial mask layout through a second wafer layout determining model, where the second wafer layout determining model includes the reference lithography process parameter, and the second wafer layout is a wafer layout of the initial mask layout obtained by prediction; and adjust the initial mask layout based on the sample chip layout and the second wafer layout, to obtain the labeled mask layout.

In an exemplary implementation, the obtaining module 901 is configured to: determine reference light intensity distribution information based on the initial mask layout through the second wafer layout determining model, where the reference light intensity distribution information is a light intensity distribution obtained after the initial mask layout is imaged on a wafer under the conditions of the reference lithography process parameter; and determine the second wafer layout based on the reference light intensity distribution information through the second wafer layout determining model.

In an exemplary implementation, the second wafer layout determining model includes multiple kernel functions. The obtaining module 901 is configured to: activate the initial mask layout, to obtain a reference mask layout; and convolve the reference mask layout with the kernel functions, to obtain the reference light intensity distribution information based on convolution processing results.

In an exemplary implementation, the second wafer layout determining model includes an activation function. The obtaining module 901 is configured to activate the reference light intensity distribution information through the activation function, to obtain the second wafer layout.
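
As an illustrative sketch only, the imaging steps described in the foregoing implementations (activating the mask, convolving with multiple kernel functions, and activating the resulting light intensity) may be written in plain Python as follows. The weighted sum of squared kernel responses, the real-valued kernels, the sigmoid form of both activations, and all names and numeric values (`predict_wafer_layout`, `mask_steepness`, `resist_steepness`, `resist_threshold`) are assumptions made for illustration and are not specified by this disclosure.

```python
import math


def sigmoid(x, steepness=1.0, threshold=0.0):
    """Smooth step that maps x into (0, 1) and crosses 0.5 at `threshold`."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - threshold)))


def convolve_same(image, kernel):
    """2-D convolution with zero padding; the output has the shape of `image`."""
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    r_off, c_off = k_rows // 2, k_cols // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    rr, cc = r + r_off - i, c + c_off - j
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += kernel[i][j] * image[rr][cc]
            out[r][c] = total
    return out


def predict_wafer_layout(mask, kernels, weights, mask_steepness=4.0,
                         resist_steepness=50.0, resist_threshold=0.225):
    """Return (light intensity, wafer layout) for an unconstrained mask."""
    rows, cols = len(mask), len(mask[0])
    # Step 1: activate the mask to obtain a near-binary reference mask.
    reference_mask = [[sigmoid(v, mask_steepness) for v in row] for row in mask]
    # Step 2: convolve with every kernel and accumulate the weighted squared
    # responses (assumed form of the light intensity distribution).
    intensity = [[0.0] * cols for _ in range(rows)]
    for kernel, weight in zip(kernels, weights):
        response = convolve_same(reference_mask, kernel)
        for r in range(rows):
            for c in range(cols):
                intensity[r][c] += weight * response[r][c] ** 2
    # Step 3: activate the intensity to obtain the predicted wafer layout.
    wafer = [[sigmoid(v, resist_steepness, resist_threshold) for v in row]
             for row in intensity]
    return intensity, wafer
```

Under these assumptions, the lithography process parameter would enter through the kernels and weights, so the same function could stand in for either wafer layout determining model by swapping the kernel set.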

In an exemplary implementation, the obtaining module 901 is configured to: determine a first reference sub-loss based on first difference information between the sample chip layout and the second wafer layout; determine a reference loss based on the first reference sub-loss; determine a first gradient of the reference loss relative to the initial mask layout; and adjust the initial mask layout based on the first gradient, to obtain the labeled mask layout.

In an exemplary implementation, the obtaining module 901 is configured to: obtain at least one of a second reference sub-loss, a third reference sub-loss, a fourth reference sub-loss, a fifth reference sub-loss, or a sixth reference sub-loss, where the second reference sub-loss is determined based on the initial mask layout, the third reference sub-loss is determined based on second difference information between the sample chip layout and the initial mask layout, the fourth reference sub-loss is determined based on the first difference information and edge modification information, the fifth reference sub-loss is determined based on the initial mask layout and the edge modification information, the sixth reference sub-loss is determined based on the second difference information and the edge modification information, and the edge modification information is information for modifying an edge of the sample chip layout; and determine the reference loss based on the first reference sub-loss and at least one of the second reference sub-loss, the third reference sub-loss, the fourth reference sub-loss, the fifth reference sub-loss, or the sixth reference sub-loss.

In an exemplary implementation, the obtaining module 901 is further configured to modify the edge of the sample chip layout according to an edge modification distance and an edge modification amplitude, to obtain the edge modification information.

In an exemplary implementation, the obtaining module 901 is configured to: determine an adjustment step based on the first gradient; adjust the initial mask layout based on the adjustment step and the first gradient, to obtain an adjusted mask layout; and activate the adjusted mask layout in a case that the adjusted mask layout satisfies an adjustment ending condition, to obtain the labeled mask layout.

In an exemplary implementation, the obtaining module 901 is further configured to: determine a third wafer layout of the adjusted mask layout through the second wafer layout determining model in a case that the adjusted mask layout does not satisfy the adjustment ending condition, where the third wafer layout is a wafer layout of the adjusted mask layout obtained by prediction under the conditions of the reference lithography process parameter; and adjust the adjusted mask layout based on the sample chip layout and the third wafer layout, to obtain the labeled mask layout.
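
The iterative adjustment described in the foregoing implementations (gradient, adjustment step, adjusted mask layout, adjustment ending condition) may be sketched as a small gradient-descent loop. This is a minimal sketch under stated assumptions: `toy_wafer` is a hypothetical stand-in for the second wafer layout determining model, the gradient is approximated by finite differences rather than computed analytically, the adjustment step is chosen by normalizing with the largest gradient magnitude and halving until the loss decreases, and the ending condition is a loss tolerance or an iteration limit. None of these choices is prescribed by this disclosure.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def toy_wafer(mask):
    """Hypothetical wafer model: activate, blur with a 3x3 box, soft threshold."""
    n = len(mask)
    act = [[sigmoid(4.0 * v) for v in row] for row in mask]
    wafer = []
    for r in range(n):
        row = []
        for c in range(n):
            vals = [act[rr][cc]
                    for rr in range(max(r - 1, 0), min(r + 2, n))
                    for cc in range(max(c - 1, 0), min(c + 2, n))]
            row.append(sigmoid(20.0 * (sum(vals) / 9.0 - 0.5)))
        wafer.append(row)
    return wafer


def reference_loss(mask, chip):
    """Squared difference between the chip layout and the predicted wafer layout."""
    wafer = toy_wafer(mask)
    return sum((w - t) ** 2 for wr, tr in zip(wafer, chip) for w, t in zip(wr, tr))


def numeric_gradient(mask, chip, eps=1e-4):
    """Central finite-difference gradient of the loss relative to the mask."""
    grad = [[0.0] * len(row) for row in mask]
    for r in range(len(mask)):
        for c in range(len(mask[0])):
            original = mask[r][c]
            mask[r][c] = original + eps
            up = reference_loss(mask, chip)
            mask[r][c] = original - eps
            down = reference_loss(mask, chip)
            mask[r][c] = original
            grad[r][c] = (up - down) / (2.0 * eps)
    return grad


def optimize_mask(chip, max_iters=30, tolerance=0.05, base_step=0.5):
    # Initial mask layout generated from the chip layout: {0, 1} -> {-1, +1}.
    mask = [[2.0 * t - 1.0 for t in row] for row in chip]
    loss = reference_loss(mask, chip)
    for _ in range(max_iters):
        if loss < tolerance:  # assumed adjustment ending condition
            break
        grad = numeric_gradient(mask, chip)
        largest = max(abs(g) for row in grad for g in row)
        if largest == 0.0:
            break
        step = base_step / largest  # adjustment step derived from the gradient
        while step > 1e-8:
            trial = [[m - step * g for m, g in zip(mr, gr)]
                     for mr, gr in zip(mask, grad)]
            trial_loss = reference_loss(trial, chip)
            if trial_loss < loss:  # accept only steps that reduce the loss
                mask, loss = trial, trial_loss
                break
            step *= 0.5
        else:
            break  # no improving step was found
    # Activate the adjusted mask layout to obtain the labeled mask layout.
    labeled = [[sigmoid(4.0 * m) for m in row] for row in mask]
    return labeled, loss
```

Finite differences are used here only to keep the sketch free of dependencies; they scale poorly with layout size, so a practical implementation would more plausibly rely on an analytic or automatically differentiated gradient.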

In an exemplary implementation, the sample chip layout includes at least one geometrical pattern. The obtaining module 901 is configured to: use a sum of perimeters of the geometrical patterns as an edge length of the sample chip layout; and determine the first reference sub-loss based on the first difference information and the edge length of the sample chip layout.
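
For illustration, the edge length and an edge-normalized first reference sub-loss may be computed as in the following sketch. It assumes that each geometrical pattern is a closed polygon given as a list of `(x, y)` vertices, that the first difference information is a sum of squared pixel differences, and that the edge length enters by simple division. The function names are hypothetical, and this disclosure does not fix any of these choices.

```python
import math


def perimeter(polygon):
    """Perimeter of a closed polygon given as a list of (x, y) vertices."""
    total = 0.0
    for i, (x0, y0) in enumerate(polygon):
        x1, y1 = polygon[(i + 1) % len(polygon)]
        total += math.hypot(x1 - x0, y1 - y0)
    return total


def edge_length(patterns):
    """Edge length of a chip layout: the sum of the perimeters of its patterns."""
    return sum(perimeter(p) for p in patterns)


def first_reference_sub_loss(chip, wafer, patterns):
    """Squared chip/wafer difference divided by the edge length (assumed form)."""
    difference = sum((w - t) ** 2
                     for wr, tr in zip(wafer, chip)
                     for w, t in zip(wr, tr))
    return difference / edge_length(patterns)
```

Dividing by the edge length is one plausible way to make losses comparable across sample chip layouts that contain different amounts of pattern edge.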

In an exemplary implementation, the determining module 902 is configured to: encode the sample chip layout through the neural network model, to obtain a layout feature of the sample chip layout; and decode the layout feature of the sample chip layout through the neural network model, to obtain the predicted mask layout.

In an exemplary implementation, the determining module 902 is configured to: determine actual light intensity distribution information based on the predicted mask layout through the first wafer layout determining model, where the actual light intensity distribution information is a light intensity distribution obtained after the predicted mask layout is imaged on a wafer under the conditions of the actual lithography process parameter; and determine the first wafer layout based on the actual light intensity distribution information through the first wafer layout determining model.

In an exemplary implementation, the training module 903 is configured to: determine a first target sub-loss through third difference information between the labeled mask layout and the predicted mask layout; determine a second target sub-loss through fourth difference information between the sample chip layout and the first wafer layout; determine a target loss based on the first target sub-loss and the second target sub-loss; determine a second gradient of the target loss relative to a model parameter of the neural network model; and adjust the model parameter of the neural network model based on the second gradient, to obtain the mask layout determining model.
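
The training step described above may be sketched as follows. The sketch assumes that both kinds of difference information are sums of squared differences, that the target loss is a weighted sum with hypothetical weights `w_mask` and `w_wafer`, and that the gradient is approximated by finite differences. The callables `predict_mask` and `predict_wafer` are hypothetical placeholders for the neural network model and the first wafer layout determining model; a practical implementation would more plausibly use backpropagation in a deep learning framework.

```python
def squared_difference(a, b):
    """Sum of squared element-wise differences between two 2-D layouts."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def target_loss(labeled_mask, predicted_mask, chip, wafer,
                w_mask=1.0, w_wafer=1.0):
    # First target sub-loss: labeled mask layout vs. predicted mask layout.
    first = squared_difference(labeled_mask, predicted_mask)
    # Second target sub-loss: sample chip layout vs. first wafer layout.
    second = squared_difference(chip, wafer)
    return w_mask * first + w_wafer * second


def training_step(params, chip, labeled_mask, predict_mask, predict_wafer,
                  learning_rate=0.1, eps=1e-5):
    """One gradient-descent update of the model parameters (a list of floats)."""

    def loss_at(p):
        predicted = predict_mask(p, chip)
        wafer = predict_wafer(predicted)
        return target_loss(labeled_mask, predicted, chip, wafer)

    # Second gradient: target loss relative to each model parameter.
    grad = []
    for i in range(len(params)):
        up = params[:i] + [params[i] + eps] + params[i + 1:]
        down = params[:i] + [params[i] - eps] + params[i + 1:]
        grad.append((loss_at(up) - loss_at(down)) / (2.0 * eps))
    new_params = [p - learning_rate * g for p, g in zip(params, grad)]
    return new_params, loss_at(params)
```

Because the second target sub-loss is evaluated on the wafer layout predicted under the actual lithography process parameter, the update pulls the model toward masks that print well under that parameter, not only toward the labeled mask layout.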

In an exemplary implementation, the training module 903 is configured to: obtain at least one of a third target sub-loss, a fourth target sub-loss, a fifth target sub-loss, a sixth target sub-loss, or a seventh target sub-loss, where the third target sub-loss is determined based on the predicted mask layout, the fourth target sub-loss is determined based on fifth difference information between the sample chip layout and the predicted mask layout, the fifth target sub-loss is determined based on the fourth difference information and the edge modification information, the sixth target sub-loss is determined based on the predicted mask layout and the edge modification information, the seventh target sub-loss is determined based on the fifth difference information and the edge modification information, and the edge modification information is information for modifying the edge of the sample chip layout; and determine the target loss based on the first target sub-loss, the second target sub-loss, and at least one of the third target sub-loss, the fourth target sub-loss, the fifth target sub-loss, the sixth target sub-loss, or the seventh target sub-loss.

The foregoing apparatus determines a predicted mask layout of a sample chip layout through a neural network model, and determines, based on the predicted mask layout, a first wafer layout in which an actual lithography process parameter is considered. The neural network model is trained through a labeled mask layout, the predicted mask layout, the sample chip layout, and the first wafer layout, to obtain a mask layout determining model, so that a target mask layout determined through the mask layout determining model better conforms to the actual lithography process parameter, and a wafer layout highly similar to a target chip layout can be obtained under the conditions of the actual lithography process parameter, thereby improving the quality of the target mask layout.

When the apparatus provided in FIG. 9 implements its functions, the division into the foregoing function modules is merely used as an example for description. In practical applications, the functions may be allocated to and completed by different function modules according to requirements. That is, an internal structure of the device may be divided into different function modules to complete all or some of the functions described above. In addition, the apparatus provided in the foregoing exemplary embodiments and the method embodiments shown in FIG. 2 belong to the same concept. For details of a specific implementation process, refer to the method embodiments shown in FIG. 2. Details are not described herein again.

FIG. 10 is a schematic structural diagram of a mask layout determining apparatus according to an exemplary embodiment of this disclosure. As shown in FIG. 10, the apparatus includes:

- an obtaining module 1001, configured to obtain a target chip layout; and
- a determining module 1002, configured to determine a target mask layout of the target chip layout through a mask layout determining model, where the mask layout determining model is obtained by training through the mask layout determining model training method according to any implementation of the first aspect, and the target mask layout is a mask layout of the target chip layout obtained by prediction.

In an exemplary implementation, the determining module 1002 is configured to: encode the target chip layout through the mask layout determining model, to obtain a layout feature of the target chip layout; and decode the layout feature of the target chip layout through the mask layout determining model, to obtain the target mask layout.

The mask layout determining model in the foregoing apparatus is obtained by training in the following manner: determining a predicted mask layout of a sample chip layout through a neural network model, and determining, based on the predicted mask layout, a first wafer layout in which an actual lithography process parameter is considered; and training the neural network model through a labeled mask layout, the predicted mask layout, the sample chip layout, and the first wafer layout, to obtain the mask layout determining model. Because the actual lithography process parameter is considered in the training process of the mask layout determining model, a target mask layout determined through the mask layout determining model better conforms to the actual lithography process parameter, and a wafer layout highly similar to a target chip layout can be obtained under the conditions of the actual lithography process parameter, thereby improving the quality of the target mask layout.

When the apparatus provided in FIG. 10 implements its functions, the division into the foregoing function modules is merely used as an example for description. In practical applications, the functions may be allocated to and completed by different function modules according to requirements. That is, an internal structure of the device may be divided into different function modules to complete all or some of the functions described above. In addition, the apparatus provided in the foregoing exemplary embodiments and the method embodiments shown in FIG. 5 belong to the same concept. For details of a specific implementation process, refer to the method embodiments shown in FIG. 5. Details are not described herein again.

FIG. 11 shows a structural block diagram of a terminal device 1100 according to an exemplary embodiment of this disclosure. The terminal device 1100 includes: a processor 1101 and a memory 1102.

The processor 1101 may include one or more processing cores, for example, a 4-core processor or an 8-core processor. The processor 1101 may be implemented in at least one of the following hardware forms: a digital signal processor (DSP), a field-programmable gate array (FPGA), or a programmable logic array (PLA). The processor 1101 may alternatively include a main processor and a coprocessor. The main processor, also referred to as a central processing unit (CPU), is configured to process data in a wake-up state. The coprocessor is a low-power-consumption processor configured to process data in a standby state. In some exemplary embodiments, the processor 1101 may be integrated with a graphics processing unit (GPU). The GPU is configured to render and draw content that needs to be displayed on a display. In some exemplary embodiments, the processor 1101 may further include an AI processor. The AI processor is configured to process computing operations related to machine learning.

The memory 1102 may include one or more computer-readable storage media. The computer-readable storage medium may be non-transitory. The memory 1102 may further include a high-speed random access memory and a nonvolatile memory, for example, one or more disk storage devices or flash storage devices. In some exemplary embodiments, the non-transitory computer-readable storage medium in the memory 1102 is configured to store at least one computer program. The at least one computer program is configured to be executed by the processor 1101, to cause the terminal device 1100 to implement the mask layout determining model training method or the mask layout determining method provided in the method embodiments of this disclosure.

In some exemplary embodiments, the terminal device 1100 may alternatively include: a display 1105.

The display 1105 is configured to display a user interface (UI). The UI may include a graph, text, an icon, a video, or any combination thereof. When the display 1105 is a touch display, the display 1105 is further capable of acquiring a touch signal on or above a surface of the display 1105. The touch signal may be inputted to the processor 1101 as a control signal for processing. In this case, the display 1105 may be further configured to provide a virtual button and/or a virtual keyboard, also referred to as a soft button and/or a soft keyboard. In some exemplary embodiments, one display 1105 may be disposed on a front panel of the terminal device 1100. In some other exemplary embodiments, at least two displays 1105 may be separately disposed on different surfaces of the terminal device 1100 or in a folded design. In some other exemplary embodiments, the display 1105 may be a flexible display disposed on a curved surface or a folded surface of the terminal device 1100. The display 1105 may even be set in a non-rectangular irregular shape, that is, a special-shaped display. The display 1105 may be implemented as a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like. For example, a chip layout, a mask layout, a wafer layout, and the like may be displayed by using the display 1105.

The structure shown in FIG. 11 constitutes no limitation on the terminal device 1100, and the terminal device may include more or fewer components than those shown in the figure, or some components may be combined, or a different component deployment may be used.

FIG. 12 is a schematic structural diagram of a server according to an exemplary embodiment of this disclosure. The server 1200 may vary greatly due to different configurations or performance, and may include one or more processors 1201 and one or more memories 1202. The one or more memories 1202 have at least one computer program stored therein. The at least one computer program is loaded and executed by the one or more processors 1201, to cause the server 1200 to implement the mask layout determining model training method or the mask layout determining method provided in the foregoing method embodiments. For example, the processor 1201 is a CPU. The server 1200 may further have components such as a wired or wireless network interface, a keyboard, and an input/output interface. The server 1200 may further include other components configured to implement device functions. Details are not described herein again.

In an exemplary embodiment, a non-volatile computer-readable storage medium is further provided. The non-volatile computer-readable storage medium has at least one computer program stored therein. The at least one computer program is loaded and executed by a processor, to cause an electronic device to implement the mask layout determining model training method or the mask layout determining method according to any one of the foregoing aspects.

In some exemplary embodiments, the non-volatile computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like.

In an exemplary embodiment, a computer program or a computer program product is further provided. The computer program or the computer program product has at least one computer program stored therein. The at least one computer program is loaded and executed by a processor, to cause an electronic device to implement the mask layout determining model training method or the mask layout determining method according to any one of the foregoing aspects.

“Multiple” mentioned in the specification means two or more. “And/or” describes an association relationship between associated objects and indicates that three relationships may exist. For example, A and/or B may represent the following three cases: only A exists, both A and B exist, and only B exists. The character “/” generally indicates an “or” relationship between the associated objects.

The foregoing descriptions are merely exemplary embodiments of this application, but are not intended to limit this application. Any modification, equivalent replacement, or improvement made within the principle of this application shall fall within the protection scope of this application.

	Number	Date	Country
Parent	PCT/CN2023/128438	Oct 2023	WO
Child	19057661		US
