The present application claims priority to Chinese Patent Application No. CN202311706018.9, filed with the China National Intellectual Property Administration on Dec. 12, 2023, the disclosure of which is hereby incorporated herein by reference in its entirety.
The present disclosure relates to the field of computer technology, and in particular to fields of artificial intelligence, computer vision, image processing and other technologies.
In an industrial scenario of a spinning process, since a spinning workshop contains a large number of devices, the devices in the workshop need to be monitored in real time in order to ensure the quality of chemical fiber products. Manual monitoring wastes human resources, and in some areas manual detection cannot be performed at all. How to automatically monitor the spinning workshop is therefore a problem in the related art.
The present disclosure provides a training method for a fault detection model, a device fault detection method and related apparatuses, to solve or alleviate one or more technical problems in the related art.
In a first aspect, the present disclosure provides a training method for a fault detection model, including:
In a second aspect, the present disclosure provides a device fault detection method, applied to the fault detection model of the first aspect, including:
In a third aspect, the present disclosure provides a training apparatus for a fault detection model, including:
In a fourth aspect, the present disclosure provides a device fault detection apparatus, applied to the fault detection model of the third aspect, including:
In a fifth aspect, provided is an electronic device, including:
In a sixth aspect, provided is a non-transitory computer-readable storage medium storing a computer instruction thereon, and the computer instruction is used to cause a computer to execute the method according to any one of the embodiments of the present disclosure.
In a seventh aspect, provided is a computer program product including a computer program, where the computer program, when executed by a processor, implements the method according to any one of the embodiments of the present disclosure.
In the embodiments of the present disclosure, the drone formation is used to perform all-round imaging of the same device, and the model to be trained is guided, based on the position encoding of the drones, to learn a correlation between different images, so as to improve the fault detection efficiency of the model.
It should be understood that the content described in this part is not intended to identify critical or essential features of embodiments of the present disclosure, nor is it used to limit the scope of the present disclosure. Other features of the present disclosure will be easily understood through the following description.
In the accompanying drawings, the same reference numbers represent the same or similar parts or elements throughout the accompanying drawings, unless otherwise specified. These accompanying drawings are not necessarily drawn to scale. It should be understood that these accompanying drawings only depict some embodiments provided according to the present disclosure, and should not be considered as limiting the scope of the present disclosure.
The present disclosure will be described below in detail with reference to the accompanying drawings. The same reference numbers in accompanying drawings represent elements with identical or similar functions. Although various aspects of the embodiments are shown in the accompanying drawings, the accompanying drawings are not necessarily drawn to scale unless specifically indicated.
In addition, in order to better illustrate the present disclosure, numerous specific details are given in the following specific implementations. Those having ordinary skill in the art should understand that the present disclosure may be performed without certain specific details. In some examples, methods, means, elements and circuits well known to those having ordinary skill in the art are not described in detail, in order to highlight the subject matter of the present disclosure.
Moreover, the terms “first” and “second” are only for the purpose of description, and cannot be construed to indicate or imply the relative importance or implicitly point out the quantity of technical features indicated. Therefore, the feature defined with “first” or “second” may explicitly or implicitly include one or more features. In the description of the present disclosure, “a plurality of” means two or more than two, unless otherwise expressly and specifically defined.
In an industrial scenario of the spinning process, since a spinning workshop contains a large number of devices, the devices in the spinning workshop need to be monitored in order to ensure normal production of chemical fiber products.
The area of the spinning plant is relatively large, the process flow is long, and many types of devices are used in each process flow. Some devices are too large to be detected manually. Therefore, for the situations of a large detection area, many types of devices and a relatively complex detection environment, an automated detection technology is needed to improve the efficiency of fault detection.
It should be noted that main types of spinning products involved in the solution of embodiments of the present disclosure may include one or more of partially oriented yarns (POY), fully drawn yarns (FDY), draw textured yarns (DTY) (or called low-elastic yarns), etc. For example, types of yarns may specifically include polyester partially oriented yarns, polyester fully drawn yarns, polyester drawn yarns, polyester draw textured yarns, etc.
In order to automatically and accurately detect relevant devices in the spinning plant, an embodiment of the present disclosure provides a fault detection model, in which images are collected by controlling a drone queue and the fault detection model completes the fault detection.
Firstly, an embodiment of the present disclosure proposes a training method for a fault detection model, with which a model having a satisfactory detection effect can be trained. As shown in
S101: a designated device is sampled based on a preset drone formation to obtain a sample sequence.
The sample sequence is numbered according to the drone formation.
Here, the designated device may be any device in a spinning workshop, and the spinning workshop may be as shown in
Many devices in a spinning plant need to be observed from multiple angles. Therefore, in the embodiment of the present disclosure, the drone queue is used to collect images around a device to obtain the overall situation of the device at the same time. For example, a plurality of drones may be deployed in a drone formation based on the structure of the designated device, to simultaneously image the designated device at multiple angles. The drone formation is drone 1, drone 2, . . . , and drone n; the image taken by drone 1 is image 1, the image taken by drone 2 is image 2, . . . , and the image taken by drone n is image n. The sample sequence is {image 1; image 2; . . . ; image n}.
S102: position encoding is performed on the sample sequence according to the drone formation to obtain a drone formation encoding result.
The relative position information of each drone is obtained, so that position encoding may be performed on the entire drone formation to obtain the drone formation encoding result.
Position encoding is performed on the drone formation because sample images collected by each drone can be regarded as part of images in the three-dimensional imaging of the designated device. The fault of the designated device may vary at different angles, so there is a dependency between image contents of a plurality of sample images in the same sample sequence. Therefore, the model to be trained can be guided to learn relevant knowledge from the sample sequence based on the dependency through the position encoding of the drone formation, so as to facilitate fault detection.
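The disclosure does not fix a concrete encoding scheme for the drone formation. As one hedged sketch, a sinusoidal encoding indexed by each drone's position in the formation could be used; the function name and the encoding dimension below are chosen purely for illustration:

```python
import math

def drone_formation_encoding(num_drones: int, dim: int = 8) -> list:
    """Sinusoidal position encoding for a drone formation (illustrative
    sketch only; the disclosure does not specify a concrete scheme)."""
    codes = []
    for pos in range(num_drones):
        code = []
        for i in range(dim):
            # Alternate sine/cosine channels at geometrically spaced frequencies.
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            code.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        codes.append(code)
    return codes

# One encoding vector per drone in the formation.
codes = drone_formation_encoding(num_drones=4)
```

Each drone then contributes one encoding vector, and the whole list stands in for the drone formation encoding result fed to the model together with the sample sequence.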
S103: the sample sequence and the drone formation encoding result are input into a model to be trained to obtain a fault detection result output by the model to be trained.
S104: a loss value is determined based on the fault detection result and a true value of the fault detection result of the sample sequence.
S105: a model parameter of the model to be trained is adjusted based on the loss value to obtain the fault detection model.
Here, the model to be trained includes an encoder and a decoder.
The encoder is configured to perform feature extraction on each sample image in the sample sequence to obtain a multi-scale feature of each sample image; perform feature fusion on features in a same scale of the sample sequence based on the drone formation encoding result to obtain a fused feature corresponding to each scale; and perform feature fusion on fused features in all scales to obtain a target feature.
The decoder is configured to determine a fault detection result of the designated device based on the target feature, where the fault detection result includes a fault prediction type and a fault prediction box of a same fault position in a fault sample map; and splice sample images in the sample sequence with reference to the drone formation to obtain the fault sample map.
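The encoder's per-scale fusion can be sketched in miniature. The snippet below is an illustrative stand-in, not the claimed encoder: features are plain lists of floats, and the drone formation encoding result is reduced to a single scalar weight per image:

```python
def fuse_features(per_image_features, position_weights):
    """Fuse same-scale features across a sample sequence, then merge all
    scales into one target feature (illustrative sketch only).

    per_image_features: list over images; each entry is a list over scales,
    each scale being a feature vector (list of floats).
    position_weights: one scalar per image, standing in for the drone
    formation encoding result.
    """
    num_scales = len(per_image_features[0])
    total_w = sum(position_weights)
    fused_per_scale = []
    for s in range(num_scales):
        # Position-weighted average of the same-scale features of all images.
        dim = len(per_image_features[0][s])
        fused = [
            sum(w * img[s][d] for w, img in zip(position_weights, per_image_features)) / total_w
            for d in range(dim)
        ]
        fused_per_scale.append(fused)
    # Merge the fused features of all scales into a single target feature.
    return [x for scale in fused_per_scale for x in scale]
```

For two images with two scales each, the function averages the same-scale vectors and concatenates the per-scale results into the target feature that the decoder would consume.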
Here, the schematic diagram of inputting the sample sequence and the drone formation encoding result into the model to be trained is shown in
In an embodiment of the present disclosure, the image features of the sample sequence collected based on the drone queue are fused to obtain the target feature, and the target feature may describe features of the same fault from multiple angles, facilitating a comprehensive description of the fault situation, so that the model to be trained may accurately perform the fault detection. Moreover, the target feature may contain important features at different levels by extracting and fusing multi-scale features of sample images at different angles, thereby improving the efficiency of the model in fault identification.
In an embodiment of the present disclosure, in order to obtain a high-quality labeling result, the true value of the fault detection result of the sample sequence may be determined based on the following method:
Step A1: the sample sequence is spliced into the fault sample map according to the drone formation, where the fault sample map describes a state of the designated device from a plurality of drone perspectives.
Step A2: first prompt information is constructed based on the fault sample map, where the first prompt information includes a fault point of at least one fault in the fault sample map, and position information of detection boxes of a same fault in different drone perspectives in the fault sample map is used as sub-position parameters.
Step A3: position encoding is performed on the sub-position parameters of the same fault to obtain a fault position code of the same fault.
For each fault, the following operations are performed:
Step A31: a fault point of the fault and a fault position code of the fault are input, as second prompt information, into an everything segmentation model, so that the everything segmentation model segments out a fault mask map of the fault from the fault sample map.
Step A32: a true class label of the fault mask map of the fault is obtained, and a detection box label of the fault is constructed based on position information of the fault mask map in the fault sample map, to obtain the true value of the fault detection result.
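Step A32's construction of a detection box label from the position information of the fault mask map amounts to taking the bounding box of the mask's foreground pixels. A minimal sketch, with binary masks as nested lists (real masks come from the everything segmentation model):

```python
def mask_to_box(mask):
    """Return the detection box label (x_min, y_min, x_max, y_max) enclosing
    the foreground (non-zero) pixels of a binary fault mask map."""
    coords = [(x, y)
              for y, row in enumerate(mask)
              for x, v in enumerate(row) if v]
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))
```

Paired with the true class label of the mask, this box yields one entry of the true value of the fault detection result.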
The schematic diagram of inputting the fault sample map into the everything segmentation model may be shown in
For example, when the designated device includes three fault positions, images of the fault positions may be collected from multiple perspectives, the same fault position may include a plurality of detection boxes, and each detection box has a corresponding category. For example, the fault position A includes a detection box 1 from the perspective of drone 1, whose fault code is a; and further includes a detection box 2 from the perspective of drone 2, whose fault code is also a.
In the embodiment of the present disclosure, the fault sample map is segmented based on the everything segmentation model, so that the everything segmentation model segments out the fault mask map of the fault from the fault sample map, then the true class label of the fault mask map may be obtained, and then the detection box label of the fault is constructed based on the position information of the fault mask map in the fault sample map, laying a strong foundation for training the model to be trained.
In some embodiments, in order to optimize the parameter of the fault detection model and improve the fault detection efficiency, the loss function in the embodiment of the present disclosure includes the following loss items:
In some embodiments, the position loss may be measured from multiple perspectives. For example, the detection box itself is a fault position obtained by the regression task. During the model training phase, the detection box predicted for the same fault position will become more and more accurate as the model parameter is optimized.
In an embodiment of the present disclosure, the position loss may include a first position loss sub-item, a second position loss sub-item, a third position loss sub-item, a fourth position loss sub-item, and a fifth position loss sub-item. The specific calculation method of each loss sub-item is as follows:
The first position loss sub-item is used to represent center-point loss between fault prediction boxes of the same fault position in a plurality of sample images in the sample sequence and the corresponding detection box labels.
The second position loss sub-item is used to represent detection box width loss between the fault prediction boxes of the same fault position in the plurality of sample images and the corresponding detection box labels, and detection box height loss between the fault prediction boxes of the same fault position in the plurality of sample images and the corresponding detection box labels.
The third position loss sub-item is used to represent confidence loss of the fault prediction boxes of the same fault position in the plurality of sample images.
The fourth position loss sub-item is used to represent overlap rate loss between important fault prediction boxes of the same fault position in the plurality of sample images and detection boxes of important positions in corresponding detection box labels.
The fifth position loss sub-item is used to represent quantity loss between the total quantity of fault prediction boxes of the same fault position in the plurality of sample images and the total quantity of detection boxes in the corresponding detection box labels.
During implementation, due to the detection error of the fault detection model, there may be undetected detection boxes or misdetected detection boxes, and all of these detection boxes participate in calculation.
During implementation, important and unimportant labels may be marked in the detection box labels. Unimportant detection boxes may not participate in the loss calculation even though they are detected, while detected detection boxes at important positions participate in the loss calculation.
In the embodiments of the present disclosure, the loss value of the fault prediction box is comprehensively measured based on the center point, width, height and other information of the detection box to improve the accuracy of the fault detection model.
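The five sub-items above can be combined as a weighted sum. The snippet below is only a plausible stand-in, since the disclosure's concrete formulas are not reproduced in this text: each sub-item is approximated by a common choice (squared center-point distance, squared width/height differences, squared confidence error, 1 − IoU for important boxes, and an absolute box-count difference), and the weights are placeholders:

```python
def iou(a, b):
    """Intersection-over-union of two boxes (x_min, y_min, x_max, y_max)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy

    def area(r):
        return (r[2] - r[0]) * (r[3] - r[1])

    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def position_loss(preds, labels, confidences, important, weights=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Illustrative combination of the five position-loss sub-items.

    preds/labels: matched lists of boxes (x_min, y_min, x_max, y_max);
    confidences: one predicted confidence per matched box;
    important: whether each label box is at an important position.
    """
    center = wh = conf = overlap = 0.0
    for p, l, c, imp in zip(preds, labels, confidences, important):
        pc = ((p[0] + p[2]) / 2, (p[1] + p[3]) / 2)
        lc = ((l[0] + l[2]) / 2, (l[1] + l[3]) / 2)
        center += (pc[0] - lc[0]) ** 2 + (pc[1] - lc[1]) ** 2              # sub-item 1
        wh += ((p[2] - p[0]) - (l[2] - l[0])) ** 2 \
            + ((p[3] - p[1]) - (l[3] - l[1])) ** 2                          # sub-item 2
        conf += (1.0 - c) ** 2                                              # sub-item 3
        if imp:
            overlap += 1.0 - iou(p, l)                                      # sub-item 4
    count = abs(len(preds) - len(labels))                                   # sub-item 5
    w1, w2, w3, w4, w5 = weights
    return w1 * center + w2 * wh + w3 * conf + w4 * overlap + w5 * count
```

A perfectly predicted box with confidence 1.0 contributes zero to every sub-item, so the total position loss vanishes, consistent with the training objective above.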
In some embodiments, the classification loss is determined based on the following formula:
Here, due to the adjustment of the model parameter, the prediction of the detection box of the same fault position will change with the optimization of the model. Therefore, the accumulation symbol in expression (6) may be understood as the accumulation of the position prediction results of the same detection box label in multiple rounds of prediction. The amount accumulated each time may be determined as needed.
Here, the statistic is a mean value, a mode or a maximum value. During implementation, since one fault position is included in a plurality of images, there are a plurality of detection boxes, and each detection box has a prediction score. The prediction scores of the plurality of detection boxes need to be integrated. The mean value of the prediction scores of the plurality of detection boxes may be selected as the prediction score, or the mode or the maximum value may be selected as the prediction score.
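Integrating the per-box prediction scores with the chosen statistic is straightforward; the function name below is an assumption for illustration:

```python
from statistics import mean, mode

def aggregate_score(box_scores, statistic="mean"):
    """Integrate the prediction scores of a fault position's multiple
    detection boxes into a single score (mean value, mode or maximum)."""
    if statistic == "mean":
        return mean(box_scores)
    if statistic == "mode":
        return mode(box_scores)
    if statistic == "max":
        return max(box_scores)
    raise ValueError(f"unknown statistic: {statistic}")
```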
On the basis of obtaining the above-mentioned position loss and classification loss, the loss function constructed is shown in formula (7):
In some embodiments, each sample sequence is a training sample, and a plurality of sample sequences constitute a sample set. A part of the sample set may be taken as first training data, and another part thereof may be taken as second training data. The first training data is used to adjust learnable parameters in the model to be trained, and the second training data is used to adjust hyper-parameters in the model to be trained.
In an embodiment of the present disclosure, the hyper-parameters that need to be learned may include λ1, λ2, λ3, λ4, λ5, λ6 and pr. The learnable parameters are model parameters other than the hyper-parameters in the loss function.
The model to be trained of
The first training data is input into the model to be trained to obtain a fault detection result. The fault detection result includes a fault prediction type and a fault prediction box. The first loss (including position loss and classification loss) is calculated based on the difference between the fault detection result and the true value, the gradient is calculated based on the first loss, and then the learnable parameters in the model to be trained are optimized based on the gradient direction. On the basis of the model expressed by the learnable parameters after this round of adjustment, the second training data is input into the model to be trained to obtain a fault detection result. The fault detection result includes a fault prediction type and a fault prediction box. The position loss is calculated based on the difference between the fault prediction box and the detection box label, and the classification loss is calculated. Then the second loss is determined based on the position loss and the classification loss. The hyper-parameters are optimized with the goal of minimizing the second loss. Then, the learnable parameters in the model to be trained are adjusted again based on the first training data, and the cycle is iterated in sequence until the optimal hyper-parameters that minimize the second loss are obtained. At this point, the hyper-parameters are determined, and then the method shown in
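The alternating scheme above can be illustrated with a toy bilevel loop: an inner gradient descent fits a learnable parameter on the first training data, and an outer search picks the hyper-parameter, reduced here to a single learning rate, that minimizes the loss on the second training data. This is a sketch of the training schedule only, not the disclosed model or its loss function:

```python
def bilevel_train(first_data, second_data, lrs=(0.01, 0.1, 0.5), steps=50):
    """Toy alternating optimization over a scalar model y = w * x."""

    def inner_fit(lr):
        # Inner loop: adjust the learnable parameter w on the first training data.
        w = 0.0
        for _ in range(steps):
            grad = sum(2 * (w * x - y) * x for x, y in first_data) / len(first_data)
            w -= lr * grad
        return w

    def second_loss(w):
        # Outer objective: mean squared error on the second training data.
        return sum((w * x - y) ** 2 for x, y in second_data) / len(second_data)

    # Outer loop: keep the hyper-parameter that minimizes the second loss.
    best_lr = min(lrs, key=lambda lr: second_loss(inner_fit(lr)))
    return best_lr, inner_fit(best_lr)
```

With first training data drawn from y = 2x, the loop recovers w ≈ 2 and selects the learning rate under which the fitted parameter generalizes best to the second training data.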
In the embodiment of the present disclosure, the position loss and classification loss are comprehensively considered, and thus the objective function designed may provide a strong basis for training the model to be trained.
Based on the fault detection model obtained above and based on the same technical concept, an embodiment of the present disclosure further includes a device fault detection method, as shown in
S501: an initial image set of a target device is obtained based on a drone queue, where the drone queue is used to collect images of the target device from a plurality of perspectives to obtain the initial image set.
S502: each initial image in the initial image set is denoised to obtain a set of images to be detected.
During implementation, for each initial image, a target blur kernel may be obtained based on the trajectory of the drone collecting the initial image, and the initial image is denoised based on the target blur kernel, to obtain the set of images to be detected constructed from the denoised initial images.
Taking an initial image for example, firstly k key points need to be obtained from the initial image, and an initial blur kernel is obtained based on the k key points.
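One common way to model drone-motion blur, offered here only as an illustrative stand-in for the target blur kernel (the key-point-based construction described above is not reproduced), is a normalized linear motion-blur kernel along the drone's displacement direction:

```python
import math

def motion_blur_kernel(dx: float, dy: float, size: int = 5):
    """Normalized linear motion-blur kernel along the direction (dx, dy),
    as a simple proxy for a trajectory-derived blur kernel."""
    norm = math.hypot(dx, dy) or 1.0
    ux, uy = dx / norm, dy / norm
    c = size // 2
    kernel = [[0.0] * size for _ in range(size)]
    # Mark the cells along a line through the kernel center.
    for t in range(-c, c + 1):
        x = c + round(t * ux)
        y = c + round(t * uy)
        kernel[y][x] = 1.0
    total = sum(map(sum, kernel))
    # Normalize so the kernel sums to 1 and preserves image brightness.
    return [[v / total for v in row] for row in kernel]
```

A kernel like this could then parameterize a deconvolution step that removes the motion blur from the initial image.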
S503: position encoding is performed on the drone queue to obtain a drone position encoding result.
S504: the set of images to be detected and the drone position encoding result are input into the fault detection model to obtain a fault detection result of the fault detection model for the target device.
Here, the fault detection result includes a fault prediction type and a fault prediction box of a same fault position in a target map; and the target map is obtained by splicing the images to be detected in the set of images to be detected with reference to the drone queue.
In the embodiment of the present disclosure, the initial image set collected based on the drone queue is denoised to obtain the set of images to be detected, and the set of images to be detected may describe the features of the same fault from multiple angles, facilitating a comprehensive description of the fault situation, so that the fault detection model may accurately perform the fault detection. Based on this method, the target device may be automatically monitored to save human resources.
In some embodiments, in order to reduce misdetection of prediction boxes, the method may also be implemented as follows:
Step B1: at least one key prediction box is screened out from a plurality of fault prediction boxes of the same fault position.
The fault prediction boxes may be labeled with critical and non-critical labels. Detected non-critical boxes do not participate in the loss calculation, while detected boxes at critical positions participate in the loss calculation.
Step B2: an image to be detected of each key prediction box is separated out from the target map based on the at least one key prediction box.
Step B3: a three-dimensional effect graph of the same position is constructed and output based on the image to be detected of each key prediction box.
When an anomaly of an important fault prediction box is detected, a three-dimensional effect graph of this position is constructed based on the position encoding of the drone queue and output to the staff, so that the staff can manage the fault.
In the embodiments of the present disclosure, at least one key prediction box is screened out from the plurality of fault prediction boxes of the same fault position, the key detection box is used for fault prediction to reduce consumption of computing resources, and the position at which the fault is detected is rendered so that the staff may process the fault in time.
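Step B1's screening can be sketched as a simple filter; the `critical` and `score` field names and the confidence threshold are assumptions for illustration:

```python
def screen_key_boxes(boxes, min_score=0.5):
    """Screen key prediction boxes out of the fault prediction boxes of the
    same fault position: keep only boxes flagged as critical whose
    confidence score reaches the threshold."""
    return [b for b in boxes if b["critical"] and b["score"] >= min_score]
```

Only the surviving key boxes would then drive the separation of images to be detected and the rendering of the three-dimensional effect graph.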
Based on the same technical concept, an embodiment of the present disclosure proposes a training apparatus 600 for a fault detection model, as shown in
In some embodiments, the apparatus further includes an obtaining module configured to:
In some embodiments, a loss function of the model to be trained includes following loss items:
In some embodiments, the determining module is configured to determine the classification loss based on a following formula:
In some embodiments, the statistic is a mean value, a mode or a maximum value.
In some embodiments, the position loss includes:
Based on the same technical concept, an embodiment of the present disclosure proposes a device fault detection apparatus 700, applied to the fault detection model obtained in the above embodiments, as shown in
In some embodiments, the apparatus further includes a generating module configured to:
For the description of specific functions and examples of the modules and sub-modules/units of the apparatus of the embodiment of the present disclosure, reference may be made to the relevant description of the corresponding steps in the above-mentioned method embodiments, which is not repeated here.
In the technical solution of the present disclosure, acquisition, storage and application of the user's personal information involved are in compliance with relevant laws and regulations, and do not violate public order and good customs.
If the memory 810, the processor 820 and the communication interface 830 are implemented independently, the memory 810, the processor 820 and the communication interface 830 may be connected to each other and complete communication with each other via a bus. The bus may be an industry standard architecture (ISA) bus, a peripheral component interconnect (PCI) bus, or an extended industry standard architecture (EISA) bus, etc. The bus may be divided into address bus, data bus, control bus, etc. For ease of representation, the bus is represented by only one thick line in
Optionally, in a specific implementation, if the memory 810, the processor 820 and the communication interface 830 are integrated on one chip, the memory 810, the processor 820 and the communication interface 830 may communicate with each other via an internal interface.
It should be understood that the above-mentioned processor may be a central processing unit (CPU) or other general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor or any conventional processor, etc. It is worth noting that the processor may be a processor that supports an advanced RISC machines (ARM) architecture.
Further, optionally, the above-mentioned memory may include a read-only memory and a random access memory, and may also include a non-volatile random access memory. The memory may be a volatile memory or a non-volatile memory, or may include both a volatile memory and a non-volatile memory. Here, the non-volatile memory may include a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically EPROM (EEPROM) or a flash memory. The volatile memory may include a random access memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAMs are available, for example, a static RAM (SRAM), a dynamic random access memory (DRAM), a synchronous DRAM (SDRAM), a double data rate SDRAM (DDR SDRAM), an enhanced SDRAM (ESDRAM), a synchlink DRAM (SLDRAM) and a direct RAMBUS RAM (DR RAM).
The above embodiments may be implemented in whole or in part by software, hardware, firmware or any combination thereof. When implemented by software, they may be implemented in the form of a computer program product in whole or in part. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present disclosure are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from a computer readable storage medium to another computer readable storage medium. For example, the computer instructions may be transmitted from a website, computer, server or data center to another website, computer, server or data center in a wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, Bluetooth, microwave, etc.) way. The computer readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as server or data center that is integrated with one or more available media. The available media may be magnetic media (for example, floppy disk, hard disk, magnetic tape), optical media (for example, digital versatile disc (DVD)), or semiconductor media (for example, Solid State Disk (SSD)), etc. It is worth noting that the computer readable storage medium mentioned in the present disclosure may be a non-volatile storage medium, in other words, may be a non-transitory storage medium.
Those having ordinary skill in the art can understand that all or some of the steps for implementing the above embodiments may be completed by hardware, or may be completed by instructing related hardware through a program. The program may be stored in a computer readable storage medium. The above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
In the description of the embodiments of the present disclosure, the description with reference to the terms “one embodiment”, “some embodiments”, “example”, “specific example” or “some examples”, etc. means that specific features, structures, materials or characteristics described in conjunction with the embodiment or example are included in at least one embodiment or example of the present disclosure. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, those skilled in the art can integrate and combine different embodiments or examples and features of different embodiments or examples described in this specification without conflicting with each other.
In the description of the embodiments of the present disclosure, “/” represents or, unless otherwise specified. For example, A/B may represent A or B. The term “and/or” herein only describes an association relation of associated objects, which indicates that there may be three kinds of relations, for example, A and/or B may indicate that only A exists, or both A and B exist, or only B exists.
In the description of the embodiments of the present disclosure, the terms “first” and “second” are only for purpose of description, and cannot be construed to indicate or imply the relative importance or implicitly point out the quantity of technical features indicated. Therefore, the feature defined with “first” or “second” may explicitly or implicitly include one or more features. In the description of the embodiments of the present disclosure, “multiple” means two or more, unless otherwise specified.
The above descriptions are only exemplary embodiments of the present disclosure and not intended to limit the present disclosure. Any modifications, equivalent replacements, improvements and others made within the spirit and principle of the present disclosure shall be contained in the protection scope of the present disclosure.
| Number | Date | Country | Kind |
|---|---|---|---|
| 202311706018.9 | Dec 2023 | CN | national |