This application claims priority to and the benefit of Korean Patent Application No. 10-2023-0101678 filed in the Korean Intellectual Property Office on Aug. 3, 2023, the entire disclosure of which is incorporated herein by reference for all purposes.
The present disclosure relates to a method and device with image-difference reduction preprocessing.
Inspection of electronic devices for defects, for example, is an important procedure used to reduce equipment malfunctions and process accidents in various fields. To detect the defects from image data of the electronic device, a defect signal may be extracted therefrom.
The defect signal may be extracted from a difference image, i.e., an image of the difference between a specific image of the electronic device and a reference image corresponding to that specific image. The defect signal may not be extracted properly when the background noise is high relative to a fine defect signal (i.e., low SNR). As a result, performance of a defect inspection system may be sub-optimal.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
In one general aspect, a method is performed by one or more processors, the method is for preprocessing a source image, and the method includes: generating a feature map by applying at least one kernel to the source image; generating a first reconstructed source image by applying at least one mask to the feature map, the feature map corresponding to the source image; and updating the kernel based on a value of a loss function between the first reconstructed source image and a target image.
The at least one kernel may include a foreground kernel and a background kernel, and the generating of the feature map may include: applying the foreground kernel to the source image; and applying the background kernel to the source image.
The feature map may include a first feature map generated by the applying the foreground kernel to the source image and a second feature map generated by the applying the background kernel to the source image, and the generating of the first reconstructed source image may include: applying a foreground mask to the first feature map; and applying a background mask to the second feature map.
The generating of the first reconstructed source image may further include: forming a first image by applying the foreground mask to the first feature map; forming a second image by applying the background mask to the second feature map; and combining the first image with the second image.
The method may further include: generating a second reconstructed source image by applying an updated version of the kernel and the mask to the source image, the updated version of the kernel being generated based on the kernel being updated according to the value of the loss function; and determining whether to finish updating the kernel based on the second reconstructed source image being generated.
The determining of whether to finish updating the kernel may include updating the updated kernel based on a value of a loss function between the second reconstructed source image and the target image, or finishing the training through the updating of the kernel.
The determining of whether to finish updating of the kernel may include finishing training through the updating of the kernel based on the source image having been reconstructed a predetermined number of times.
The method may further include preprocessing the source image by using a trained background kernel based on the finishing of the updating of the kernel being determined.
In another general aspect, a method for detecting a defect in a defect image that has a corresponding reference image without the defect, the method performed by one or more processors and including: applying a trained background kernel to the reference image to generate a preprocessed reference image, the trained background kernel being trained based on applying a background kernel and a mask to the reference image; and detecting the defect in the defect image based on a comparison between the preprocessed reference image and the defect image.
The method may further include: outputting a feature map by applying the background kernel to the reference image, applying the mask to the feature map to generate a reconstructed reference image that is a reconstruction of the reference image; and updating the background kernel based on the reconstructed reference image.
The outputting of the feature map may include: applying a foreground kernel to the reference image; and applying the background kernel to the reference image.
The applying the foreground kernel to the reference image may generate a first feature map of the feature map, the applying the background kernel to the reference image may generate a second feature map of the feature map, and the reconstructing of the reference image may include: applying a foreground mask of the mask to the first feature map to generate a first masked feature map; applying a background mask of the mask to the second feature map to generate a second masked feature map; and reconstructing the reference image based on a combination of the first masked feature map and the second masked feature map.
In another general aspect, a device for preprocessing a reference image includes: one or more processors; and memory storing instructions configured to cause the one or more processors to perform a process including: generating a feature map by applying a kernel to the reference image; generating a first reconstructed reference image by applying a mask to the feature map, the feature map corresponding to the reference image; and updating the kernel based on a value of a loss function between the first reconstructed reference image and a defect image.
The kernel may include a foreground kernel and a background kernel, and generating the feature map may include applying the foreground kernel to the reference image and applying the background kernel to the reference image.
The mask may include a foreground mask and a background mask, and the generating the first reconstructed reference image may include forming a first masked image by applying the foreground mask to a feature map output by the foreground kernel and forming a second masked image by applying the background mask to a feature map output by the background kernel.
The first reconstructed reference image may be generated based on a combination of the first masked image and the second masked image.
The mask may include a background mask that corresponds to a foreground-background segmentation of the reference image.
The foreground mask may be trained based on the value of the loss function.
The foreground mask may be generated by applying a foreground-background segmentation model or rule to the reference image.
The kernel may include a background kernel, and the first reconstructed reference image may have reduced background noise.
Throughout the drawings and the detailed description, unless otherwise described or provided, the same or like drawing reference numerals will be understood to refer to the same or like elements, features, and structures. The drawings may not be to scale, and the relative size, proportions, and depiction of elements in the drawings may be exaggerated for clarity, illustration, and convenience.
The following detailed description is provided to assist the reader in gaining a comprehensive understanding of the methods, apparatuses, and/or systems described herein. However, various changes, modifications, and equivalents of the methods, apparatuses, and/or systems described herein will be apparent after an understanding of the disclosure of this application. For example, the sequences of operations described herein are merely examples, and are not limited to those set forth herein, but may be changed as will be apparent after an understanding of the disclosure of this application, with the exception of operations necessarily occurring in a certain order. Also, descriptions of features that are known after an understanding of the disclosure of this application may be omitted for increased clarity and conciseness.
The features described herein may be embodied in different forms and are not to be construed as being limited to the examples described herein. Rather, the examples described herein have been provided merely to illustrate some of the many possible ways of implementing the methods, apparatuses, and/or systems described herein that will be apparent after an understanding of the disclosure of this application.
The terminology used herein is for describing various examples only and is not to be used to limit the disclosure. The articles “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. As used herein, the term “and/or” includes any one and any combination of any two or more of the associated listed items. As non-limiting examples, terms “comprise” or “comprises,” “include” or “includes,” and “have” or “has” specify the presence of stated features, numbers, operations, members, elements, and/or combinations thereof, but do not preclude the presence or addition of one or more other features, numbers, operations, members, elements, and/or combinations thereof.
Throughout the specification, when a component or element is described as being “connected to,” “coupled to,” or “joined to” another component or element, it may be directly “connected to,” “coupled to,” or “joined to” the other component or element, or there may reasonably be one or more other components or elements intervening therebetween. When a component or element is described as being “directly connected to,” “directly coupled to,” or “directly joined to” another component or element, there can be no other elements intervening therebetween. Likewise, expressions, for example, “between” and “immediately between” and “adjacent to” and “immediately adjacent to” may also be construed as described in the foregoing.
Although terms such as “first,” “second,” and “third”, or A, B, (a), (b), and the like may be used herein to describe various members, components, regions, layers, or sections, these members, components, regions, layers, or sections are not to be limited by these terms. Each of these terminologies is not used to define an essence, order, or sequence of corresponding members, components, regions, layers, or sections, for example, but used merely to distinguish the corresponding members, components, regions, layers, or sections from other members, components, regions, layers, or sections. Thus, a first member, component, region, layer, or section referred to in the examples described herein may also be referred to as a second member, component, region, layer, or section without departing from the teachings of the examples.
Unless otherwise defined, all terms, including technical and scientific terms, used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains and based on an understanding of the disclosure of the present application. Terms, such as those defined in commonly used dictionaries, are to be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the disclosure of the present application and are not to be interpreted in an idealized or overly formal sense unless expressly so defined herein. The use of the term “may” herein with respect to an example or embodiment, e.g., as to what an example or embodiment may include or implement, means that at least one example or embodiment exists where such a feature is included or implemented, while all examples are not limited thereto.
A source image and a target image (a training pair) may each include a foreground and a background of a same scene or subject. The foreground may correspond to a region of interest (ROI) and the background may be another portion exclusive of the ROI. Regarding the source image and the target image, generally, differences between the backgrounds of the respective source and target images are small (the backgrounds may be close to identical); however, differences between the foregrounds (the ROIs) of the respective images may be greater, i.e., the foregrounds may not be close to identical. When the target image is, for example, a defect image (an image that includes a defect of an electronic device or other workpiece), the source image may serve as a reference image (ground truth image) that lacks the defect and that can be compared to the defect image in detecting the defect in the target image.
In an embodiment, a preprocessing system 100 may preprocess the source image to minimize differences between the backgrounds of the source image and the target image. For example, the preprocessing system 100 may reduce (de-noise) noise in a difference between the preprocessed source image and the target image, thereby increasing a signal-to-noise ratio (SNR) of the difference between the preprocessed source image and the target image.
For this purpose, given an image training pair of the source image and the target image, the preprocessing system 100 may change the source image so that the source image is closer to the target image. When a region of the target image that is different from the source image is determined to correspond to a defect, a defect inspection device may detect the defect in the target image.
Referring to
The kernel generator 110 may generate a kernel (e.g., a foreground or background kernel) that is then used for processing images. The neural network may update the kernel according to an operation result of an operation on an image by the kernel. In an embodiment, the kernel generator 110 may generate a foreground kernel and a background kernel and may input a same image (source image) to the foreground kernel and the background kernel to train the neural network (see
Further regarding training, the region designator 120 may generate a mask for distinguishing/separating a background region and a foreground region in an image and may update the mask according to a result of the neural network applying the mask to the image. In an embodiment, the region designator 120 may generate a foreground mask and a background mask and may respectively apply the foreground mask and the background mask to an image (e.g., source image) to train the neural network. The foreground kernel may be used to train the foreground mask and the background kernel may be used to train the background mask. That is, the kernel applied to the image (mentioned earlier) may be trained according to the foreground region and background region determined by the region designator 120.
The region designator 120 may generate the mask of the training image (e.g., source image) according to a binary method by which the pixels of the mask (which respectively correspond to the pixels of the image) initially have binary values of 0 or 1. The pixel values of the mask may then be updated to have numeric values ranging from 0 to 1.
The region designator 120 may also generate a second mask (e.g., foreground) from a first mask (e.g., background) by generating one first mask (e.g., background), generating a copy of the first mask, and inverting the copied mask, thus forming the second mask (e.g., foreground). For example, the region designator 120 may generate the foreground mask by first generating the background mask and then (in a copy of the background mask) subtracting respective pixel values of the background mask from 1. When the pixel value of a specific pixel of the background mask is 0.95, for example, the pixel value of a corresponding pixel of the foreground mask may be 0.05. Alternatively, the region designator 120 may generate the background mask by generating the foreground mask and subtracting the respective pixel values of the generated foreground mask from 1.
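Purely for illustration, the mask inversion described above might be sketched in Python as follows; the function name invert_mask and the 4×4 pixel values are hypothetical and merely echo the 0.95/0.05 example.

import numpy as np

def invert_mask(mask: np.ndarray) -> np.ndarray:
    # Subtract each pixel value from 1 to obtain the complementary mask,
    # so a background value of 0.95 becomes a foreground value of 0.05.
    return 1.0 - mask

# Hypothetical background mask whose pixel values have been relaxed from
# binary {0, 1} to continuous values in [0, 1] during training.
background_mask = np.array([
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 0.95, 0.90, 1.0],
    [1.0, 0.90, 0.95, 1.0],
    [1.0, 1.0, 1.0, 1.0],
])
foreground_mask = invert_mask(background_mask)  # center values become 0.05-0.10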
In an embodiment, for a training pair (a source image and a target image), the preprocessing system 100 may generate a reconstructed source image (i.e., a reconstructed version of the source image) by applying filters (e.g., a kernel) to the source image and may update weights of the filters to minimize a value of a loss function computed from the reconstructed version of the source image (as compared to the target image). The source image may be repeatedly reconstructed by correspondingly updated filters until a predetermined criterion (e.g., a maximum number of iterations or a minimum degree of loss) is satisfied. The weights may be updated by backpropagation, for example.
In an embodiment, the loss function may be expressed as a function for indicating a difference between the reconstructed source image and the target image. The filters may include a kernel and a mask (multiple of each are possible).
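As a non-limiting example, such a loss function could be realized as a pixel-wise mean-squared error between the reconstructed source image and the target image, as in the sketch below; the function name reconstruction_loss is an assumption, and any differentiable image-difference measure (e.g., L1) could be substituted.

import torch

def reconstruction_loss(reconstructed_source: torch.Tensor,
                        target: torch.Tensor) -> torch.Tensor:
    # Pixel-wise mean-squared error between the reconstructed source image
    # and the target image.
    return torch.mean((reconstructed_source - target) ** 2)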
Referring to
To further summarize, based on completion of the updating of the kernel and/or the mask, the preprocessing system 100 may generate the preprocessed image from the source image in the inferring phase, which uses the updated filter. Referring to
In detail, referring to
In an embodiment, the feature map may be generated by a convolutional operation (indicated by the “*” in
In an embodiment, the kernel applied to the source image may be an n×n kernel generated by the neural network, or it may be an n×n Gaussian kernel.
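A minimal sketch of the Gaussian option follows, assuming a 5×5 kernel and a 2-D convolution that produces the feature map; the kernel size, sigma, and function names are illustrative assumptions rather than required choices.

import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(n: int = 5, sigma: float = 1.0) -> np.ndarray:
    # Build an n x n Gaussian kernel; a kernel generated by the neural
    # network could be used in its place.
    ax = np.arange(n) - (n - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

source_image = np.random.rand(64, 64)  # stand-in for the source image
feature_map = convolve2d(source_image, gaussian_kernel(), mode="same")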
Referring to
Referring to
Referring to
In the inferring phase, the preprocessing system 100 may generate the preprocessed source image by applying the trained kernel to the source image and output the thus-preprocessed source image and the target image as a new image pair (S160).
Referring to
The preprocessing system 100 may apply the foreground kernel and the foreground mask to the source image and may apply the background kernel and the background mask to the source image.
In an embodiment, the foreground kernel may be applied to the source image, a first feature map of the source image may be output, and the foreground mask may be applied to the first feature map produced by the foreground kernel, thus updating the first feature map. Further, the background kernel may also be applied to the source image, thus forming a second feature map of the source image, and the background mask may be applied to the second feature map produced by the background kernel. In this way, the foreground mask and the background mask may allow the foreground kernel and the background kernel to learn the foreground and the background of the source image, respectively.
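A sketch of this kernel-then-mask ordering is given below; the tensor shapes, the padding choice, and the function name reconstruct_source are assumptions made only for illustration.

import torch
import torch.nn.functional as F

def reconstruct_source(source, fg_kernel, bg_kernel, fg_mask, bg_mask):
    # Assumes the image and masks are shaped [1, 1, H, W] and the kernels
    # are shaped [1, 1, k, k] with odd k so that padding preserves size.
    pad = fg_kernel.shape[-1] // 2
    first_feature_map = F.conv2d(source, fg_kernel, padding=pad)   # foreground branch
    second_feature_map = F.conv2d(source, bg_kernel, padding=pad)  # background branch
    first_image = first_feature_map * fg_mask    # foreground mask applied
    second_image = second_feature_map * bg_mask  # background mask applied
    return first_image + second_image            # combined reconstructed source image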
The kernel generator 110 may apply a predetermined condition to generate the foreground kernel and the background kernel in a fixed way. The predetermined condition of the kernel generator 110 may be as expressed in Equation 2.
Referring to Equation 2, p is a mean value, S is a source image, and E is a margin.
In an embodiment, for a set of source-target image pairs, when the difference between the source images and their corresponding target images generally fall within a predetermined image region (e.g., a region where defects commonly occur), a rule-based mask may be used. For example, when the location of the differences between the source images and the target images is predetermined to be generally provided in a 7×7 region with a location near the middle of the image, the foreground mask may pass-through only/primarily a portion of the first feature map that is in the 7×7 region (that is in the middle of the image) and the background mask may pass-through only/primarily a portion of the second feature map that is exclusive of the 7×7 region.
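Purely as an illustration of such a rule-based mask, the sketch below builds a foreground mask that passes only a centered window (7×7 by default, per the example above) and derives the background mask as its complement; the function name and image size are hypothetical.

import numpy as np

def rule_based_masks(height: int, width: int, region: int = 7):
    # Foreground mask passing only a region x region window at the image
    # center; the background mask is its complement.
    fg = np.zeros((height, width))
    top = (height - region) // 2
    left = (width - region) // 2
    fg[top:top + region, left:left + region] = 1.0
    return fg, 1.0 - fg

foreground_mask, background_mask = rule_based_masks(64, 64)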
In another embodiment, the foreground masks may be applied before the foreground kernels are applied (in
The preprocessing system 100 may update the weights of the foreground kernel and the background kernel based on the value of the loss function (a loss function indicating differences between reconstructed source images and corresponding target images).
The preprocessing system 100 may also update the foreground mask and/or the background mask based on the value of the loss function computed from the reconstructed source image. In a non-limiting example, the foreground mask may be updated based on the value of the loss function and then the background mask may be updated by inverting the foreground mask (e.g., subtracting the foreground mask from 1). This approach may be used, for example, when the foreground mask can be updated more easily than the background mask based on the value of the loss function, or when the capacity of the foreground mask is less than the capacity of the background mask; in either case, the background mask (which will be used for the next epoch) may be the inverted foreground mask.
The preprocessing system 100 may determine whether to start the training again based on a determination of whether the value of the loss function computed from the reconstructed source image is in a predetermined range (or satisfies a threshold). For example, when the value of the loss function is not within the predetermined range, the preprocessing system 100 may start the next epoch. However, when the value of the loss function is within the predetermined range, the preprocessing system 100 may finish the training. Alternatively, when the epoch is performed for a predetermined number of times, the preprocessing system 100 may finish the training and may output a trained background kernel for performing the inferring phase.
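The epoch loop and stopping criteria described above might be sketched as follows; the learning rate, loss threshold, epoch budget, and the reconstruct_fn callable (for example, the reconstruct_source sketch given earlier) are illustrative assumptions rather than prescribed values.

import torch

def train_kernels(source, target, fg_kernel, bg_kernel, fg_mask, bg_mask,
                  reconstruct_fn, max_epochs: int = 100,
                  loss_threshold: float = 1e-4, lr: float = 1e-2):
    # Repeat reconstruction and kernel updates until the loss falls within
    # the predetermined range or the epoch budget is exhausted.
    fg_kernel.requires_grad_(True)
    bg_kernel.requires_grad_(True)
    optimizer = torch.optim.SGD([fg_kernel, bg_kernel], lr=lr)
    for epoch in range(max_epochs):
        optimizer.zero_grad()
        reconstructed = reconstruct_fn(source, fg_kernel, bg_kernel,
                                       fg_mask, bg_mask)
        loss = torch.mean((reconstructed - target) ** 2)  # loss function value
        if loss.item() < loss_threshold:  # value within the predetermined range
            break
        loss.backward()    # backpropagation of the loss
        optimizer.step()   # update the kernel weights
    return bg_kernel.detach()  # trained background kernel for the inferring phase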
Referring to
The preprocessing system 100 may apply the foreground mask and the foreground kernel to the source image and may also apply the background mask and the background kernel to the source image.
In an embodiment, the foreground mask may be applied to the source image and the result thereof may be applied to the foreground kernel so that a feature map is output from the foreground kernel. Further, the background mask may be applied to the source image and a result thereof may be applied to the background kernel so that a feature map is output from the background kernel. In this way, the foreground kernel and the background kernel may learn the portions filtered by the foreground mask and the background mask, respectively.
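A sketch of this alternative mask-then-kernel ordering is shown below; as before, the tensor shapes, padding choice, and function name are assumptions for illustration only.

import torch
import torch.nn.functional as F

def reconstruct_source_mask_first(source, fg_kernel, bg_kernel, fg_mask, bg_mask):
    # Each mask selects its region of the source image before the
    # corresponding kernel is applied.
    pad = fg_kernel.shape[-1] // 2
    fg_branch = F.conv2d(source * fg_mask, fg_kernel, padding=pad)
    bg_branch = F.conv2d(source * bg_mask, bg_kernel, padding=pad)
    # Combining the branches may leave artifacts at the foreground/background
    # boundary, which may require the additional correction noted below.
    return fg_branch + bg_branch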
The preprocessing system 100 may combine the two feature map images output by the foreground kernel and the background kernel and may generate one reconstructed source image. Because the mask is applied before the kernel, an additional element may be included in the preprocessing system 100 for correcting distortion that may be generated in the respective regions (the foreground and the background) of the feature map image. For example, the preprocessing system 100 may further include a boundary processing device and/or a neural network (not shown) for processing the boundary of the foreground and the background.
The preprocessing system 100 may update the weights of the foreground kernel and/or the background kernel based on the value of the loss function indicating the difference between the reconstructed source image and the target image.
The preprocessing system 100 may also update the foreground mask and/or the background mask based on the value of the loss function computed from the reconstructed source image. For example, the foreground mask may be updated based on the value of the loss function and the background mask may be updated by inverting the foreground mask. When the capacity of the foreground mask is less than the capacity of the background mask, the foreground mask may be updated based on the value of the loss function and the background mask (which will be used for the next epoch) may be the inverted foreground mask.
The preprocessing system 100 may determine whether to start the training again based on a determination of whether the value of the loss function computed from the reconstructed source image is in a predetermined range or satisfies a threshold. For example, when the value of the loss function is not within the predetermined range, the preprocessing system 100 may start the next epoch. When the value of the loss function is within the predetermined range, the preprocessing system 100 may terminate the training. Alternatively, when a predetermined number of epochs has been performed, the preprocessing system 100 may terminate the training and may output the trained background kernel for performing the inferring phase.
Referring to
When the training is finished, the preprocessing system 100 may preprocess the source image in the inferring phase by applying the trained background kernel to the source image. The preprocessed source image may be used in detecting a defect in the target image.
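For example, the inference-phase preprocessing might look like the following sketch, in which only the trained background kernel is applied to the source (reference) image; the function name and padding choice are assumptions.

import torch.nn.functional as F

def preprocess_source(source, trained_bg_kernel):
    # Apply the trained background kernel to obtain the preprocessed source
    # image used for the subsequent comparison with the target/defect image.
    pad = trained_bg_kernel.shape[-1] // 2
    return F.conv2d(source, trained_bg_kernel, padding=pad)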
In an embodiment, a defect inspecting system 200 may photograph the electronic device to obtain a defect image and may determine defects in the defect image by using a reference image corresponding to the defect image. The reference image may be an image of the electronic device including no defect (not necessarily the same physical electronic device, but rather a copy). The defect inspecting system 200 may be applied in the field of defect detection using defect data and reference data. For example, the defect inspecting system 200 may be used in the semiconductor manufacturing process.
Referring to
The image pair generating device 210 may obtain the defect image by photographing a suspected defect portion of the electronic device and may generate an image pair of the defect image and the reference image, where the reference image is generated to correspond to the defect image.
Specifically, the preprocessing device 220 may convert the reference image (corresponding to the defect image) into a reconstructed reference image by using filters and may generate a preprocessed version of the reference image in which noise of the background of the reference image is removed or reduced based on differences between the reconstructed reference image and the defect image. The filters may include at least one kernel and at least one mask.
In an embodiment, the preprocessing device 220 may apply the background kernel and the foreground kernel to the reference image and may apply the background mask and the foreground mask to the feature maps respectively output by the background kernel and the foreground kernel. That is, the preprocessing device 220 may apply the background mask to the feature map output by the background kernel and may apply the foreground mask to the feature map output by the foreground kernel. The reference image may be reconstructed by combining the two images/maps masked by the background mask and the foreground mask. The preprocessing device 220 may update (train) the background kernel and the foreground kernel based on the value of the loss function indicating the differences between the reconstructed reference image and the defect image. Then, for continued training, the previously-updated kernel may be applied again to the reference image to reconstruct the reference image anew, and the previously-updated kernel may be updated again based on the re-reconstructed reference image. When the update of the kernel is finished, in the inferring phase the preprocessing device 220 may apply the updated background kernel to the reference image to generate a thus-preprocessed version of the reference image.
The defect inspection device 230 may detect the defect from the defect image by comparing the preprocessed reference image and the defect image.
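One simple, non-limiting way to realize this comparison is to threshold the absolute difference between the preprocessed reference image and the defect image, as sketched below; the threshold value is an illustrative assumption.

import torch

def detect_defect(preprocessed_reference, defect_image, threshold: float = 0.1):
    # Pixels whose difference exceeds the threshold are flagged as candidate
    # defect locations.
    difference = torch.abs(defect_image - preprocessed_reference)
    return difference > threshold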
Referring to
Referring to
The input layer 910 may include input nodes (x1 to xi), and the number of the input nodes (x1 to xi) may correspond to the number of independent input variables. Based on a training data set being input to the input layer 910 for the purpose of training the neural network 900, and a test data set being input to the input layer 910 of the trained neural network 900, the output layer 930 of the trained neural network 900 may output an inference result.
The hidden layer 920 may be positioned between the input layer 910 and the output layer 930 and may include at least one hidden layer (9201 to 920n). The output layer 930 may include at least one output node (y1 to yj). An activation function may be used in the hidden layer 920 and the output layer 930 to determine node outputs/activations. In an embodiment, the neural network 900 may be trained by updating the weights of the hidden nodes included in the hidden layer 920.
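By way of illustration only, such a fully connected network with input, hidden, and output layers could be expressed as follows; the layer widths and the ReLU activation are assumptions and do not reflect any particular architecture of the neural network 900.

import torch.nn as nn

neural_network = nn.Sequential(
    nn.Linear(16, 32),  # input nodes x1..xi feeding the first hidden layer
    nn.ReLU(),          # activation function in the hidden layer
    nn.Linear(32, 32),  # additional hidden layer
    nn.ReLU(),
    nn.Linear(32, 4),   # output nodes y1..yj
)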
The preprocessing device may be implemented as a computer system, for example, one executing instructions stored on a computer-readable medium (but not a signal per se). Referring to
The processor 1010 may implement functions, stages, or methods proposed in the embodiments. An operation of the computer system 1000 may be implemented by the processor 1010. The at least one processor 1010 may include at least one of a CPU, a GPU, or an NPU. In practice, the processor 1010 may be one or more processors of one or more types.
The memory 1020 may be provided inside and/or outside the processor and may be connected to the processor through various known means. The memory may represent a volatile or non-volatile storage medium in various forms, and for example, the memory may include a read-only memory (ROM) and a random access memory (RAM).
The computing apparatuses, the electronic devices, the processors, the memories, the image sensors, the displays, the information output system and hardware, the storage devices, and other apparatuses, devices, units, modules, and components described herein with respect to
The methods illustrated in
Instructions or software to control computing hardware, for example, one or more processors or computers, to implement the hardware components and perform the methods as described above may be written as computer programs, code segments, instructions or any combination thereof, for individually or collectively instructing or configuring the one or more processors or computers to operate as a machine or special-purpose computer to perform the operations that are performed by the hardware components and the methods as described above. In one example, the instructions or software include machine code that is directly executed by the one or more processors or computers, such as machine code produced by a compiler. In another example, the instructions or software includes higher-level code that is executed by the one or more processors or computer using an interpreter. The instructions or software may be written using any programming language based on the block diagrams and the flow charts illustrated in the drawings and the corresponding descriptions herein, which disclose algorithms for performing the operations that are performed by the hardware components and the methods as described above.
The instructions or software to control computing hardware, for example, one or more processors or computers, to implement the hardware components and perform the methods as described above, and any associated data, data files, and data structures, may be recorded, stored, or fixed in or on one or more non-transitory computer-readable storage media. Examples of a non-transitory computer-readable storage medium include read-only memory (ROM), random-access programmable read only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random-access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), flash memory, non-volatile memory, CD-ROMs, CD-Rs, CD+Rs, CD-RWs, CD+RWs, DVD-ROMs, DVD-Rs, DVD+Rs, DVD-RWs, DVD+RWs, DVD-RAMs, BD-ROMs, BD-Rs, BD-R LTHs, BD-REs, Blu-ray or optical disk storage, hard disk drive (HDD), solid state drive (SSD), a card type memory such as multimedia card micro or a card (for example, secure digital (SD) or extreme digital (XD)), magnetic tapes, floppy disks, magneto-optical data storage devices, optical data storage devices, hard disks, solid-state disks, and any other device that is configured to store the instructions or software and any associated data, data files, and data structures in a non-transitory manner and provide the instructions or software and any associated data, data files, and data structures to one or more processors or computers so that the one or more processors or computers can execute the instructions. In one example, the instructions or software and any associated data, data files, and data structures are distributed over network-coupled computer systems so that the instructions and software and any associated data, data files, and data structures are stored, accessed, and executed in a distributed fashion by the one or more processors or computers.
While this disclosure includes specific examples, it will be apparent after an understanding of the disclosure of this application that various changes in form and details may be made in these examples without departing from the spirit and scope of the claims and their equivalents. The examples described herein are to be considered in a descriptive sense only, and not for purposes of limitation. Descriptions of features or aspects in each example are to be considered as being applicable to similar features or aspects in other examples. Suitable results may be achieved if the described techniques are performed in a different order, and/or if components in a described system, architecture, device, or circuit are combined in a different manner, and/or replaced or supplemented by other components or their equivalents.
Therefore, in addition to the above disclosure, the scope of the disclosure may also be defined by the claims and their equivalents, and all variations within the scope of the claims and their equivalents are to be construed as being included in the disclosure.
Foreign Application Priority Data: Korean Patent Application No. 10-2023-0101678, filed Aug. 3, 2023, KR (national).