The present disclosure relates to the technical field of image processing, and in particular, relates to a model training and scene recognition method, an apparatus, a device, and a medium.
Machine moderation techniques (referred to as machine review) are increasingly used in large-scale short video/image moderation, where images determined to be offensive by machine moderation are pushed to staff members for review (referred to as manual review), which ultimately determines whether the images are offensive. The emergence of machine moderation has greatly improved the efficiency of image moderation. However, machine moderation tends to make an offense judgment by relying on the visual commonality of images, ignoring changes in the moderation result caused by changes in the general environment. For example, in the case of a gun offense, when machine moderation recognizes the presence of a gun in an image, the image is generally considered offensive, but the accuracy of such a machine moderation result is poor: where the gun appears in an anime or game scene, for example, the image is not offensive. Therefore, scene recognition is critical to the accuracy of machine moderation results, and a scene recognition scheme is urgently desired.
Embodiments of the present disclosure provide a model training and scene recognition method, an apparatus, a device, and a medium.
According to some embodiments of the present disclosure, a method for training a scene recognition model is provided. The scene recognition model includes a core feature extraction layer, a global information feature extraction layer connected to the core feature extraction layer, a local supervised learning (LCS) module of at least one level with an attention mechanism, and a fully-connected decision layer. The method includes:
According to some embodiments of the present disclosure, a scene recognition method for a scene recognition model acquired by training according to the method as described above is provided. The method includes:
According to some embodiments of the present disclosure, an electronic device is provided. The electronic device includes a processor and a memory, the memory storing one or more computer programs therein. The processor, when loading and running the one or more computer programs stored in the memory, is caused to perform the steps of the method for training the scene recognition model as described above, or the steps of the scene recognition method as described above.
According to some embodiments of the present disclosure, a non-transitory computer-readable storage medium, storing one or more computer programs therein is provided. The one or more computer programs, when loaded and run by a processor, cause the processor to perform the steps of the method for training the scene recognition model as described above, or the steps of the scene recognition method as described above.
Some embodiments of the present disclosure provide a model training and scene recognition method, an apparatus, a device, and a medium. The scene recognition model includes a core feature extraction layer, a global information feature extraction layer connected to the core feature extraction layer, a local supervised learning (LCS) module of at least one level with an attention mechanism, and a fully-connected decision layer. The method includes:
For clearer descriptions of the technical solutions in the embodiments of the present disclosure, the following briefly introduces the accompanying drawings required for describing the embodiments. Apparently, the accompanying drawings in the following description show merely some embodiments of the present disclosure, and persons of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative efforts.
The present disclosure is described hereinafter in conjunction with the accompanying drawings, and the described embodiments are only a portion of the embodiments of the present disclosure and not all of the embodiments. Based on the embodiments in this disclosure, all other embodiments acquired by those skilled in the art without creative efforts fall within the scope of protection of this disclosure.
The specialized acronyms or custom terms involved in the embodiments of the present disclosure are explained hereinafter.
Convolutional Neural Network: an end-to-end complex mapping for extracting image or video features and completing, based on the extracted features, visual tasks such as classification and detection, which typically consists of a stack of multiple base convolutional modules.
Convolutional Layer: an operation layer that performs weighted summation and feature extraction using convolution kernels with specific receptive fields, and is typically combined with a nonlinear activation function to improve mapping capability.
Pooling: a summarizing operation over pixel values within a specific range or a specific dimension, typically including max pooling, min pooling, average pooling, and the like.
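The pooling operation defined above can be sketched as follows (an illustrative NumPy example; the function name `pool2d` and the non-overlapping window layout are assumptions for illustration, not part of the disclosure):

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping 2-D pooling over an H x W map (H and W divisible by size)."""
    h, w = x.shape
    # Group pixels into size x size blocks, then summarize each block.
    blocks = x.reshape(h // size, size, w // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "min":
        return blocks.min(axis=(1, 3))
    return blocks.mean(axis=(1, 3))  # average pooling
```

For a 4×4 map pooled with a 2×2 window, each output pixel summarizes one 2×2 block of the input.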
Grouped Convolution: organizing feature map groups into several sub-groups by channel, where each sub-group of feature maps performs the same or different convolution operations, which is used to reduce calculation overhead.
Feature Pyramid: a multi-scale feature extraction method that typically extracts feature maps from different levels of a network, then aligns the feature maps with a certain up-sampling scheme and produces multi-scale features by fusing these feature maps.
Residual Block: a module consisting of multiple convolutional layers with cross-layer connection bypass, using which deeper convolutional neural networks can be built and the phenomenon of gradient vanishing can be avoided, accelerating the training of the network.
Heat Map: a feature map that reflects the local importance of an image; generally, the higher the importance, the higher the local heat value, and vice versa.
Local Supervised Learning: learning of parameters or extraction capabilities for some parts of the model or a local part of the feature map using directly connected labels and losses.
Attention Mechanism: a mechanism that forces the network to focus on important regions by fitting important levels of different parts and to make decisions based on features of the important regions.
Sigmoid: an activation function that does not assume a mutually exclusive relationship between categories; the activated output values fall in the interval [0, 1], completing normalization.
Deformable Convolution: a convolution operation in which a convolution kernel is not a canonical geometry, and the non-canonical geometry is usually generated from the original shape plus an offset.
Standard Cross Entropy: a conventional loss evaluation function for simple classification problems, commonly used to train classification networks, including single-label classification and multiple-label classification.
Focal Loss: a loss function for category imbalance problems that assigns larger penalties to categories with fewer samples, preventing the model from completely favoring categories with more data.
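A minimal binary focal loss can be sketched as follows (an illustrative NumPy example using the commonly cited form with modulating factor (1 - p_t)^γ; the parameter values are conventional defaults, not values specified by the disclosure):

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss; p = predicted probability of class 1, y in {0, 1}.
    Well-classified examples are down-weighted by (1 - p_t) ** gamma, so the
    loss concentrates on hard or under-represented examples."""
    p_t = np.where(y == 1, p, 1.0 - p)
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))
```

A confidently correct prediction incurs a much smaller loss than a confidently wrong one, which is the property exploited for class-imbalanced training.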
It should be noted that the embodiments of the present disclosure are not directly applied to the machine moderation session to directly generate moderation results; rather, the scenario information required by the specific machine moderation model is output in the form of scenario signals, and push results are generated together by an appropriate strategy and the machine moderation model. The videos or pictures considered to be offensive by the final push result will be pushed to the manual moderation session for multiple rounds of moderation to obtain the penalty result; while the videos or pictures considered to be normal by the final push result will also be sampled and inspected in different regions according to a sampling rate, or pushed to the manual moderation session for moderation based on the reporting result to avoid omitting videos/pictures that are in serious violation.
In S101, parameters of a core feature extraction layer and a global information feature extraction layer are acquired by training based on a first scene label of a sample image and a standard cross-entropy loss.
In S102, a weight parameter of an LCS module of each level is trained based on a loss value acquired by performing a pixel-by-pixel calculation on a feature map output from the LCS module of each level and the first scene label of the sample image.
In S103, a parameter of a fully connected decision layer is acquired by training based on the first scene label of the sample image and standard cross-entropy loss.
The scene recognition model includes the core feature extraction layer, the global information feature extraction layer connected to the core feature extraction layer, the LCS modules of each level, and the fully connected decision layer.
The scene recognition method according to some embodiments of the present disclosure is applied to an electronic device, which is a smart device such as a personal computer (PC), a tablet computer, or a server.
To adapt to the requirements of different fine-grained scene recognition, in the embodiments of the present disclosure, the scene recognition model further includes a branch expansion structure. The branch expansion structure includes a convolutional layer and a local object association relationship module.
A weight parameter of the convolutional layer of each level of the branch expansion structure is trained based on a loss value acquired by performing a pixel-by-pixel calculation on feature maps output from the convolutional layers of the branch expansion structure and a second scene label of the sample image, and a parameter of the local object association relationship module is acquired by training based on a loss function with a scene confidence regularization term. The first scene label and the second scene label have different granularities.
Generally, the first scene label is a coarse-grained scene label and the second scene label is a fine-grained scene label.
As shown in
Where the scene information corresponding to the to-be-recognized image is determined to be offensive scene information, and the moderation result of the machine moderation is that the to-be-recognized image is an offensive image, it is determined that the to-be-recognized image is an offensive image, and it is pushed to the manual moderation session. Where it is determined that the scene information corresponding to the to-be-recognized image does not belong to the offensive scene information, or the moderation result of the machine moderation is that the to-be-recognized image is not an offensive image, it is determined that the to-be-recognized image is not an offensive image; in this case, it is not pushed to the manual moderation session, or it is sampled and reviewed in different regions in accordance with a sampling rate, or it is pushed to the manual moderation session for re-examination based on a report result.
The electronic device is capable of storing in advance information about which scene information belongs to the offensive scene information. Therefore, after the scene information corresponding to the to-be-recognized image is determined, the electronic device is capable of determining whether the scene information corresponding to the to-be-recognized image belongs to the offensive scene information. The process in which the machine moderation model moderates whether the to-be-recognized image is an offensive image is performed using the related art, which is not repeated herein.
After the model body training phase is completed, the model is expanded with branches on the body structure based on subsequent fine-grained scene requirements.
Some embodiments of the present disclosure provide a scheme for image scene recognition based on a scene recognition model. In training the scene recognition model, firstly, the parameters of the core feature extraction layer and the global information feature extraction layer are acquired by training based on the first scene label of the sample image and the standard cross-entropy loss, and then, based on the loss value acquired by performing the pixel-by-pixel calculation on the feature map output by the LCS module of each level and the first scene label of the sample image, the weight parameter of the LCS module of each level is trained, and finally, the parameter of the fully-connected decision layer of the scene recognition model is acquired by training. In this way, the scene recognition model is made to have the ability to extract high-richness features, and the accuracy of scene recognition is substantially improved by performing the scene recognition based on the scene recognition model. Moreover, the scene recognition model further includes the branch expansion structure, thereby adapting to the requirements of different fine-grained scene recognition.
As can be seen from
The core feature extraction layer includes a first-class grouped multi-receptive-field residual convolution module and a second-class grouped multi-receptive-field residual convolution module.
The first-class grouped multi-receptive-field residual convolution module includes a first group, a second group, and a third group. Each of the first, second, and third groups has a different convolution size and includes a residual calculation bypass structure. Each group outputs feature maps by a convolution operation and residual calculation; the feature maps output from the groups are spliced and channel-shuffled in the channel dimension and output to the next module upon convolutional fusion.
The second-class grouped multi-receptive-field residual convolution module includes a fourth group, a fifth group, and a sixth group, which have different convolution sizes. The fifth group includes a 1×1 convolution bypass structure, and the sixth group includes a residual calculation bypass structure. Each group outputs feature maps that are spliced and channel-shuffled in the channel dimension and output to the next module upon convolutional fusion.
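The split-process-splice-shuffle flow of these modules can be sketched as follows (an illustrative NumPy example: the `branch_fns` stand in for the differently-sized convolutions of the three groups, and the final convolutional fusion step is omitted; the function names are hypothetical):

```python
import numpy as np

def channel_shuffle(x, groups):
    """x: (C, H, W). Interleave channels across groups, ShuffleNet-style,
    so information mixes between the groups in the next module."""
    c, h, w = x.shape
    return x.reshape(groups, c // groups, h, w).transpose(1, 0, 2, 3).reshape(c, h, w)

def grouped_residual_block(x, branch_fns):
    """Split channels into len(branch_fns) groups, run each group through its
    own branch with a residual bypass, then splice in the channel dimension
    and channel-shuffle."""
    parts = np.split(x, len(branch_fns), axis=0)
    outs = [fn(p) + p for fn, p in zip(branch_fns, parts)]  # residual bypass
    return channel_shuffle(np.concatenate(outs, axis=0), len(branch_fns))
```

With three groups, channel order [0, 1, 2, 3, 4, 5] is shuffled to [0, 2, 4, 1, 3, 5], which is what allows features computed under different receptive fields to interact downstream.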
The scene recognition model according to the embodiments of the present disclosure has the core feature extraction layer structure as shown on the right side of
Acquiring the parameters of the core feature extraction layer and global information feature extraction layer by training based on the first scene label of the sample image and standard cross-entropy loss includes:
performing up-sampling on the feature maps of different levels in the core feature extraction layer by an inverse convolution operation with different expansion factors, aligning the number of channels using a bilinear interpolation algorithm in the channel dimension, summing and merging the feature maps of each level channel-by-channel, convolving and fusing the merged feature map group, acquiring a global information feature vector by channel-by-channel global average pooling, splicing the global information feature vector and the fully connected layer FC feature vector, and acquiring the parameters of the core feature extraction layer and the global information feature extraction layer by training based on the standard cross-entropy loss.
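The multi-scale fusion described above can be sketched as follows (an illustrative NumPy example: nearest-neighbour repetition stands in for the inverse convolution, a simple linear interpolation along the channel axis stands in for the bilinear channel alignment, and the convolutional fusion and FC splicing steps are omitted; all names are hypothetical):

```python
import numpy as np

def upsample(x, factor):
    """Nearest-neighbour up-sampling stand-in for the deconvolution step."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def align_channels(x, c_out):
    """Interpolate along the channel axis to equalise channel counts."""
    c_in = x.shape[0]
    src = np.linspace(0, c_in - 1, c_out)
    lo = np.floor(src).astype(int)
    hi = np.minimum(lo + 1, c_in - 1)
    frac = (src - lo)[:, None, None]
    return x[lo] * (1 - frac) + x[hi] * frac

def global_vector(levels, factors, c_out):
    """Align every level to the same H x W x C, sum channel-by-channel,
    then take a per-channel global average pool."""
    aligned = [align_channels(upsample(x, f), c_out) for x, f in zip(levels, factors)]
    fused = np.sum(aligned, axis=0)      # channel-by-channel summation/merge
    return fused.mean(axis=(1, 2))       # global average pooling per channel
```

The resulting vector would then be spliced with the FC feature vector before the cross-entropy loss is applied.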
In the first round of the model body training phase, a global information feature extraction module is trained along with the core feature extraction layer of the model.
Training the weight parameters of the LCS module of each level based on the loss value acquired by performing the pixel-by-pixel calculation on the feature map output by the LCS module of each level and the first scene label of the sample image includes:
acquiring an importance weight of each channel by an activation function based on an attention mechanism of the channel dimension, and acquiring a summary heat map by performing a weighted summation on the feature maps of the channels according to the importance weight of each channel; and
calculating the loss value pixel-by-pixel based on the summary heat map, an object scene association importance, and an area of the object, and training the weight parameter of the LCS module of each level based on the loss value.
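The channel-attention summary step above can be sketched as follows (an illustrative NumPy example; in practice the attention logits would come from a learned channel-attention sub-network, which is assumed given here):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def summary_heat_map(feats, attn_logits):
    """feats: (C, H, W). The importance weight of each channel is obtained by
    an activation function (sigmoid) over per-channel attention logits; the
    summary heat map is the importance-weighted sum of the channel maps."""
    w = sigmoid(attn_logits)              # importance weight per channel
    return np.tensordot(w, feats, axes=1)  # (H, W) weighted summation
```

A channel with a strongly positive logit dominates the heat map, while a strongly negative logit effectively suppresses its channel.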
Another important step in the model body structure part is to make the model have a good extraction capability for the local object feature as well. As shown in
Upon outputting the attention-enhanced feature map groups, the LCS module accesses the "local object supervised loss" to supervise and guide the module in learning local object feature extraction. For example, the attention-enhanced feature map groups are first summed pixel-by-pixel across channels, and a heat map reflecting activation at different pixel positions is acquired. Then, the loss is acquired by using the heat map together with a mask map labeled based on bounding-box objects and an "object-scene association importance", and is back-propagated.
The mask map is a label acquired, based on the image-level scene semantic label, according to the degree of influence of an object in the scene image on scene discrimination. Each object in the image is given a mask according to the range of the bounding box it occupies. An object that has a large impact on scene discrimination is labeled as "important" (a mask value of 1.0), a public object that has a small impact on scene discrimination and appears in multiple scenes is labeled as "unimportant" (a mask value of 0.5), and the background is given a mask value of 0.0.
To achieve the effect of "local supervised learning", the loss uses a pixel-by-pixel binary sigmoid loss, and a penalty weight is selected based on the ratio of the area of the "important object" to the area of the "unimportant object." In a case where the area of the "important object" is much smaller than the area of the "unimportant object," the relative gap between the penalty weight of the "important object" and that of the "unimportant object" is enlarged, such that the LCS module increases its learning effort on the "important object" when the "important object" is a small target, avoiding a bias toward learning the "unimportant object" or "background." It should be noted that, since the goal of the LCS module is to extract local object features, the penalty weight of "background" always takes a smaller value in both cases.
The specific loss expression is as follows.
Pi,j represents an activation value of a pixel on the heat map, maski,j represents a pixel-level label, and area represents an area. In the present disclosure, λim, λunim, λ′im, λ′unim, and λback take the values of 0.8, 0.6, 1.0, 0.5, and 0.3, respectively. It should be noted that in the present disclosure, in training the LCS module, the module of each level is directly connected to the loss and back-propagated alone, and the mask map is down-sampled accordingly as needed.
H and W represent a height and a width of an image, i and j represent a row number and a column number of a pixel, lbsigmoid represents a calculation method of the loss value corresponding to each pixel, Tarea represents a threshold that triggers different calculation methods of the loss value, mask_areaim represents the area of the mask region of an important object, and mask_areaunim represents the area of the mask region of an unimportant object. The mask_areaim and mask_areaunim are manually labeled.
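Since the exact loss expression is not reproduced here, the following is only an assumed reconstruction of the described behaviour (an illustrative NumPy sketch using the λ values and mask conventions from the text; the area-ratio threshold logic and the function names are assumptions):

```python
import numpy as np

def lb_sigmoid(p, m):
    """Per-pixel binary sigmoid (log) loss between heat activation p and mask m."""
    p = np.clip(p, 1e-7, 1.0 - 1e-7)
    return -(m * np.log(p) + (1.0 - m) * np.log(1.0 - p))

def lcs_loss(heat, mask, t_area=0.25):
    """Assumed form of the LCS pixel-by-pixel loss.
    mask: 1.0 = important object, 0.5 = unimportant object, 0.0 = background.
    When important objects cover a much smaller area than unimportant ones
    (ratio below t_area), the weight gap is enlarged (1.0 / 0.5 instead of
    0.8 / 0.6); the background weight stays small (0.3) in both cases."""
    area_im = float((mask == 1.0).sum())
    area_unim = float((mask == 0.5).sum())
    small = area_unim > 0 and area_im / max(area_unim, 1.0) < t_area
    w_im, w_unim = (1.0, 0.5) if small else (0.8, 0.6)
    weights = np.where(mask == 1.0, w_im, np.where(mask == 0.5, w_unim, 0.3))
    return float(np.mean(weights * lb_sigmoid(heat, mask)))
```

A heat map that matches the mask incurs a lower loss than one that activates on the background, which is the supervision signal driving the LCS module toward important objects.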
After the LCS module completes training, the features directly extracted by the module are still feature map groups with a size of H×W×C, which carry large redundancy when used directly as features, while using a non-linear fully-connected layer to extract feature vectors results in the loss of some subtle deterministic features. Therefore, the embodiments of the present disclosure reduce the dimension of the feature map and extract the local object feature vector by using Fisher convolutional feature coding. In this way, the loss of subtle deterministic features is reduced and the interference of geometric transformations caused by the redundant features is avoided. The process of Fisher convolutional feature coding is simple: it mainly mixes the vectors at different pixels by using several general Gaussian distributions, such that the number of features in the size dimension is reduced. The steps are as follows.
The feature map is spread in the size dimension, such that it is represented as H×W C-dimensional vectors.
Each C-dimensional vector is dimensionally reduced to M-dimension by using PCA.
K Gaussian mixing parameter values are calculated by using K Gaussian distributions on the H×W M-dimensional vectors.
The H×W M-dimensional vectors are evolved into K M-dimensional Gaussian vectors.
Mean vectors and variance vectors of all the Gaussian vectors are calculated, spliced, and L2 regularized, and finally, the local object feature vector with a length of 2MK is output. Each level outputs one vector.
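The encoding steps above can be sketched as follows (an illustrative NumPy example of a Fisher-vector style encoding; the PCA step is assumed already done, the Gaussian parameters are assumed given rather than fitted, and the simplified statistics omit the usual prior-based scaling):

```python
import numpy as np

def fisher_encode(vecs, means, sigmas, priors):
    """vecs: (N, M) pixel descriptors (N = H * W after spreading the map).
    means/sigmas: (K, M) diagonal Gaussian parameters; priors: (K,).
    Returns an L2-normalised vector of length 2 * M * K."""
    d = (vecs[:, None, :] - means[None]) / sigmas[None]          # (N, K, M)
    # Log-density of each descriptor under each diagonal Gaussian.
    logp = -0.5 * (d ** 2).sum(-1) - np.log(sigmas).sum(-1) + np.log(priors)
    g = np.exp(logp - logp.max(1, keepdims=True))
    g /= g.sum(1, keepdims=True)                                 # posteriors (N, K)
    u = (g[..., None] * d).mean(0)                               # mean statistics
    v = (g[..., None] * (d ** 2 - 1)).mean(0)                    # variance statistics
    fv = np.concatenate([u.ravel(), v.ravel()])                  # length 2 * M * K
    return fv / max(np.linalg.norm(fv), 1e-12)                   # L2 regularization
```

With K = 2 Gaussians and M = 3 dimensions, the output has length 2MK = 12, matching the stated output size.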
The difference between here and the global information feature extraction is that in order to acquire some subtle local object features, the features of different levels are no longer fused, but are output separately. As shown in step 3 of
The branch expansion structure is constructed using a depth-wise separable convolutional residual block (DW). In the main path of the residual block, a DW convolutional layer is used as the middle layer, and a 1×1 convolutional layer is used before and after the DW convolution.
The local object association relationship learning module includes a deformable convolutional layer, a convolutional layer, and an average pooling layer.
The deformable convolution layer acquires a convolution kernel offset value of a current pixel position, acquires a real effective position of a convolution kernel parameter by adding the current position of the convolution kernel parameter to the offset value, acquires a feature image pixel value of the real effective position, and outputs the feature map after the convolution operation and average pooling operation.
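The offset-then-sample mechanism described above can be sketched as follows (an illustrative NumPy example for a single 3×3 output position; the offsets and kernel weights are assumed given, whereas in the disclosed module they would be produced by learned layers):

```python
import numpy as np

def bilinear_sample(fmap, y, x):
    """Bilinear read of fmap (H, W) at a real-valued position (y, x)."""
    h, w = fmap.shape
    y = np.clip(y, 0, h - 1)
    x = np.clip(x, 0, w - 1)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    fy, fx = y - y0, x - x0
    top = fmap[y0, x0] * (1 - fx) + fmap[y0, x1] * fx
    bot = fmap[y1, x0] * (1 - fx) + fmap[y1, x1] * fx
    return top * (1 - fy) + bot * fy

def deformable_response(fmap, center, offsets, weights):
    """One deformable-convolution output: each 3x3 kernel tap is shifted by
    its offset to its real effective position, the feature value there is
    read bilinearly, and the weighted sum is taken."""
    cy, cx = center
    taps = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    return sum(w * bilinear_sample(fmap, cy + dy + oy, cx + dx + ox)
               for (dy, dx), (oy, ox), w in zip(taps, offsets, weights))
```

With all offsets zero this degenerates to an ordinary 3×3 convolution tap sum; non-zero offsets let the kernel take a non-canonical geometry.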
After completing the training of the model body, the branch expansion phase is entered. Branches are usually expanded according to new fine-grained scene requirements, and a suitable network structure is used to design new branches according to the requirements. Embodiments of the present disclosure consider the multiple expandability of branches, and in order to control the overhead of each branch, a depth-wise separable convolutional residual block (DW) is used to construct branches, as shown in
In order to gain the ability to learn the association relationship based on the ability of local object feature extraction, the present disclosure embeds an “association relationship learning module” between the components of each branch module in the second round of the branch expansion phase, and these modules are trained together with the components of the original branch network. As shown in the lower side of
Lfocus represents a standard focal loss, Cscorei represents a confidence score of the image in the body part for a certain scene category i, and R is a regular term; the present disclosure uses the L2 regular term as a penalty term for the expansion. The branch expansion is performed at any level of the body recognition feature extraction layer and expanded in a tree-like manner.
Numclass represents the number of classes.
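Since the full loss expression is not reproduced here, the following is only an assumed combination of the named terms (an illustrative NumPy sketch: a multi-class focal loss on the branch plus an L2 penalty over the body-part scene confidence scores; `reg_lambda` and the exact weighting are hypothetical):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def branch_loss(logits, y, body_scores, reg_lambda=0.01, alpha=0.25, gamma=2.0):
    """Assumed branch-expansion loss: focal loss on the fine-grained branch
    prediction plus an L2 regular term over the body-part confidence scores
    Cscore, acting as the scene-confidence penalty term."""
    p = softmax(np.asarray(logits, dtype=float))
    p_t = max(p[y], 1e-7)
    l_focal = -alpha * (1.0 - p_t) ** gamma * np.log(p_t)  # focal term
    reg = reg_lambda * np.sum(np.asarray(body_scores) ** 2)  # L2 regular term
    return float(l_focal + reg)
```

A branch that assigns high probability to the correct fine-grained class incurs a much lower loss than one that does not, while the regular term couples the branch training to the body-part scene confidence.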
The embodiments of the present disclosure achieve the following beneficial effects.
The present disclosure trains the model body feature extraction part from three perspectives, namely abstract features, global information features, and local object features, using a three-phase training scheme, such that the model has the ability to extract high-richness features and is capable of making scene discriminations based on them, thus improving scene recognition accuracy.
The present disclosure combines the idea of a feature pyramid to mine the global information features from multiple scales. In this way, the loss of global spatial correlation information caused by excessive down-sampling and nonlinear transformation is avoided, high-quality global information features are provided, and the ability to recognize the background scenes is improved.
The present disclosure provides a local object feature extraction capability for different levels by local supervised learning at multiple levels, which reduces the loss of subtle scene decision information and enriches the local object features compared to local object feature extraction at a single level.
The present disclosure enhances the attention degree of the local supervised learning module to different channels by an attention mechanism, strengthens the activation of important local object features, and gives direction to subsequent Fisher coding.
The present disclosure proposes, for the first time, using a new pixel-by-pixel binary sigmoid loss for optimization based on the summary heat map, combined with the bounding-box-level importance of local objects, such that the local supervised learning module is forced to focus on learning the "important local objects" and to reduce the interference of the "unimportant local objects" and "background" in decision making.
The present disclosure extracts feature vectors from the feature map using Fisher convolutional coding, which reduces redundancy while avoiding over-abstraction and loss of information.
In the body training phase, in order to increase the richness of the features, the present disclosure uses multi-branch residual convolution as the basic module, which ensures the feature extraction capability, while in the model branch expansion phase, the present disclosure reduces the overhead by using strategies such as depth-wise separable convolution and sharing the local learning module.
The present disclosure proposes for the first time building the association relationship learning module using the deformable convolution, which uses the geometric flexibility of the deformable convolution to accurately model the association relationship of the local objects.
The present disclosure also well optimizes fine-grained scene recognition with class imbalance by using the scene confidence of the body part as the regular term, combined with focal loss.
The first round of the model body training phase only fully trains the core feature extraction layer by using the focal loss, and then the global information feature extraction module is trained separately.
The global information feature extraction module uses two layers of inverse convolution alone for both size up-sampling and channel expansion, which, however, slows down the convergence.
The global information feature extraction module achieves feature fusion by using a channel-level attention mechanism and a fully connected layer.
The local supervised learning module is trained with the fully connected layer by using the image-level semantic label combined with auxiliary loss.
The fine-grained branch expansion network is expanded on the existing branch expansion network without using the body network as the starting point for expansion.
The model body part also reduces the overhead by using a base module based on depth-wise separable convolution, while the n×n convolution is also converted to equivalent 1×n and n×1 convolutions to reduce the overhead.
It is possible to train the association relationship learning alone at multiple levels by designing a specialized loss function, with no need to mix the association relationship learning together in the branch expansion network for training.
In S201, an image to be recognized is acquired.
In S202, the image to be recognized is input into a pre-trained scene recognition model, and scene information corresponding to the image to be recognized is determined based on the scene recognition model.
The scene recognition method according to some embodiments of the present disclosure is applied to an electronic device, which is a smart device such as a PC, a tablet, or a server. The electronic device that performs the scene recognition is the same as or different from the electronic device that performs the model training in the above embodiments.
The process of model training is generally offline. Therefore, in a case where the electronic device for model training trains the model by the method in the above embodiment, the trained scene recognition model is directly saved in the electronic device for scene recognition, such that the electronic device for subsequent scene recognition is capable of directly performing corresponding processing by the trained scene recognition model.
In some embodiments of the present disclosure, the image input into the scene recognition model for processing is treated as the image to be recognized. In the case that the image to be recognized is acquired, the image to be recognized is input into the pre-trained scene recognition model, and the scene information corresponding to the image to be recognized is determined based on the scene recognition model.
a first training unit 11, configured to acquire parameters of a core feature extraction layer and a global information feature extraction layer by training based on a first scene label of a sample image and a standard cross-entropy loss;
The apparatus further includes:
The first training unit 11 is further configured to perform up-sampling on the feature maps of different levels in the core feature extraction layer by an inverse convolution operation with different expansion factors, align the number of channels using a bilinear interpolation algorithm in a channel dimension, sum and merge the feature maps of the various levels channel-by-channel, convolve and fuse the merged feature map group, acquire a global information feature vector by performing channel-by-channel global average pooling, splice the global information feature vector with a fully connected layer FC feature vector, and acquire the parameters of the core feature extraction layer and global information feature extraction layer by training based on the standard cross-entropy loss.
The second training unit 12 is further configured to acquire an importance weight of each channel by using an activation function based on an attention mechanism of the channel dimension, and acquire a summary heat map by performing a weighted summation on the feature maps of the respective channels according to the importance weights of the respective channels; and calculate a loss value pixel-by-pixel according to the summary heat map, an object scene association importance, and an area of the object, and train the weight parameter of the LCS module of each level according to the loss value.
The apparatus further includes:
Based on the above embodiments, some embodiments of the present disclosure further provide an electronic device. As shown in
The memory 303 has one or more computer programs stored therein. The one or more programs, when loaded and executed by the processor 301, cause the processor 301 to perform:
Based on the same inventive concept, an electronic device is also provided by some embodiments of the present disclosure. The principles of problem solving of the electronic device are similar to those of the training method of the scene recognition model, and thus for implementation of the electronic device, reference is made to the implementation of the method, which is not repeated herein.
Based on the above embodiments, some embodiments of the present disclosure further provide an electronic device. As shown in
The memory 403 has one or more computer programs stored therein. The one or more programs, when loaded and executed by the processor 401, cause the processor 401 to perform:
Based on the same inventive concept, an electronic device is also provided by some embodiments of the present disclosure. The principles of problem solving of the electronic device are similar to those of the scene recognition method, and thus for implementation of the electronic device, reference is made to the implementation of the method, which is not repeated herein.
Based on the above embodiments, some embodiments of the present disclosure further provide a computer-readable storage medium, storing one or more computer programs executable by an electronic device therein, such that the one or more programs, when loaded and run on the electronic device, cause the electronic device to perform:
Based on the same inventive concept, a computer-readable storage medium is also provided by some embodiments of the present disclosure. The principles of problem solving by the processor in performing the computer programs stored in the computer-readable storage medium are similar to those of the training method of the scene recognition model, and thus for the implementation of the processor in performing the computer programs stored in the computer-readable storage medium, reference is made to the implementation of the method, which is not repeated herein.
Based on the above embodiments, some embodiments of the present disclosure further provide a computer-readable storage medium, storing one or more computer programs executable by an electronic device therein, such that the one or more programs, when loaded and run on the electronic device, cause the electronic device to perform:
Based on the same inventive concept, some embodiments of the present disclosure further provide a computer-readable storage medium. The principles of the problem solving by the processor in performing the computer programs stored in the computer-readable storage medium are similar to those of the scene recognition method, and thus for the implementation of the processor in performing the computer programs stored in the computer-readable storage medium, reference is made to the implementation of the method, which is not repeated herein.
Embodiments of the present disclosure provide a model training and scene recognition method, an apparatus, a device, and a medium, providing a scene recognition solution with high accuracy.
The present disclosure is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments of the present disclosure. It should be understood that each of the processes and/or blocks in the flowchart and/or block diagram, and the combination of processes and/or blocks in the flowchart and/or block diagram, may be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data-processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data-processing device produce a device for carrying out the functions specified in one process or a plurality of processes of the flowchart and/or one block or a plurality of blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing the computer or other programmable data-processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article including an instruction device. The instruction device implements the function specified in one process or a plurality of processes of the flowchart and/or one block or a plurality of blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing. In this way, the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one process or a plurality of processes of the flowchart and/or one block or a plurality of blocks of the block diagram.
Although some embodiments of the present disclosure have been described, those skilled in the art may make additional changes and modifications to these embodiments once the underlying inventive concepts are known. Therefore, the appended claims are intended to be construed to include several embodiments as well as all changes and modifications that fall within the scope of this disclosure.
Those skilled in the art may make various changes and variations to the present disclosure without departing from the spirit and scope of the present disclosure. Thus, if such modifications and variations of the present disclosure fall within the scope of the claims of the present disclosure and its technical equivalents, the present disclosure is intended to encompass those changes and variations as well.
| Number | Date | Country | Kind |
|---|---|---|---|
| 202111174534.2 | Oct 2021 | CN | national |
This application is a U.S. national stage of international application No. PCT/CN2022/123011, filed on Sep. 30, 2022, which claims priority to Chinese Patent Application No. 202111174534.2, filed on Oct. 9, 2021, the contents of which are herein incorporated by reference in their entireties.
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/CN2022/123011 | 9/30/2022 | WO | |