This application relates to the artificial intelligence field, and in particular, to an image processing method, a neural network training method, and a related device.
Artificial intelligence (AI) is a theory, a method, a technology, and an application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend, and expand human intelligence, sense an environment, obtain knowledge, and use the knowledge to obtain an optimal result. In other words, artificial intelligence is a branch of computer science that seeks to understand the essence of intelligence and to produce a new type of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have perception, inference, and decision-making functions. Image processing using artificial intelligence is a common application of artificial intelligence.
As a bionic neural network, the spiking neural network (SNN) has attracted extensive attention in recent years. A leaky integrate and fire (LIF) module in the spiking neural network has the advantage of fast and efficient computation.
However, the spiking neural network is mainly used to process sparse data, for example, a plurality of pictures captured by a dynamic vision sensor, and cannot be directly used to execute mainstream general visual tasks.
Embodiments of this application provide an image processing method, a neural network training method, and a related device, so that feature extraction is performed on a single image by using an LIF module, and the LIF module can be used to execute mainstream general visual tasks.
To resolve the foregoing technical problem, embodiments of this application provide the following technical solutions.
According to a first aspect, an embodiment of this application provides an image processing method, where an artificial intelligence technology may be applied to the image processing field. The method includes: an execution device inputs a to-be-processed image into a first neural network, and performs feature extraction on the to-be-processed image by using the first neural network, to obtain feature information of the to-be-processed image. That the execution device performs feature extraction on the to-be-processed image by using the first neural network includes: The execution device obtains first feature information corresponding to the to-be-processed image, where the to-be-processed image includes a plurality of image blocks, the first feature information includes feature information of the plurality of image blocks in the to-be-processed image, and the first feature information is also the feature information of the to-be-processed image. The execution device sequentially inputs feature information of at least two groups of image blocks into an LIF module, to obtain target data generated by the LIF module, where feature information of a group of image blocks includes feature information of at least one image block. The execution device obtains second feature information corresponding to the to-be-processed image based on the target data, where the second feature information includes updated feature information of the image block, and the second feature information is updated feature information of the to-be-processed image.
In an embodiment, the feature information of the entire to-be-processed image is divided into feature information of a plurality of image blocks in the to-be-processed image, and the feature information of the plurality of image blocks may be divided into feature information of at least two groups of image blocks. The feature information of the at least two groups of image blocks is sequentially input into the LIF module, to implement leakage and integration processes of the LIF module, and obtain the target data generated by the LIF module, and then the updated feature information of the to-be-processed image is obtained based on the target data. In the foregoing manner, feature extraction is performed on a single image by using the LIF module, so that the LIF module can be used to execute mainstream general visual tasks. This helps improve efficiency and accuracy of a feature extraction process.
In an embodiment of the first aspect, that the execution device sequentially inputs feature information of at least two groups of image blocks into an LIF module, to obtain target data generated by the LIF module includes: The execution device sequentially inputs the feature information of the at least two groups of image blocks into the LIF module, and when an excitation condition of the LIF module is satisfied, generates the target data by using an activation function.
The target data is non-binarized data, that is, the target data output by the LIF module may not be pulse data and is not limited to two fixed values, but may be data with higher precision. For example, the target data may be floating-point data. In an embodiment, precision of the target data may be the same as that of the feature information of the image block, that is, a numerical level of the target data may be the same as that of the feature information of the image block.
In an embodiment, the target data output by the LIF module is non-binarized data, that is, precision of the target data output by the LIF module is improved, so that richer feature information of the to-be-processed image can be extracted. In this way, in the process of performing feature extraction on the to-be-processed image, the advantage of fast and efficient computation of the LIF module is retained, and richer feature information can be obtained.
In an embodiment of the first aspect, that the execution device sequentially inputs feature information of at least two groups of image blocks into an LIF module includes: The execution device sequentially inputs the feature information of the at least two groups of image blocks into the LIF module in a plurality of rounds. Further, the execution device inputs feature information of a group of image blocks into one LIF module in each round. In an embodiment, the first neural network may include M parallel LIF modules. In each round, the execution device may simultaneously input feature information of M groups of image blocks into the M parallel LIF modules, and process input data by using the M parallel LIF modules.
In an embodiment of the first aspect, the feature information of the at least two groups of image blocks includes feature information of a plurality of rows of image blocks, feature information of each row of image blocks includes feature information of a plurality of image blocks in a same row, and feature information of each group of image blocks includes feature information of at least one row of image blocks. Additionally or alternatively, the feature information of the at least two groups of image blocks includes feature information of a plurality of columns of image blocks, feature information of each column of image blocks includes feature information of a plurality of image blocks in a same column, and feature information of each group of image blocks includes feature information of at least one column of image blocks.
In an embodiment of the first aspect, the excitation condition of the LIF module may be that a value of a membrane potential in the LIF module is greater than or equal to a preset threshold. Further, because the feature information of the image block may include feature information of the image block corresponding to at least one channel, correspondingly, the excitation condition of the LIF module may include one or more thresholds, that is, thresholds corresponding to different channels may be the same or different.
In an embodiment of the first aspect, the first neural network is a multilayer perceptron MLP, a convolutional neural network, or a neural network using a self-attention mechanism, and the neural network using the self-attention mechanism may also be referred to as a transformer neural network.
In an embodiment, regardless of whether the first neural network is the MLP, the convolutional neural network, or the transformer neural network, the first neural network can be compatible with the LIF module by using the image processing method provided in an embodiment of the application. Because the MLP, the convolutional neural network, and the transformer neural network may be applied to different application scenarios, application scenarios of this solution are greatly extended and implementation flexibility is greatly improved.
In an embodiment of the first aspect, the method further includes: The execution device performs feature processing on the feature information of the to-be-processed image by using a second neural network, to obtain a prediction result corresponding to the to-be-processed image, where the first neural network and the second neural network are included in a same target neural network, and a task executed by the target neural network is any one of the following: image classification, image segmentation, performing target detection on an image, or performing super-resolution processing on an image. In embodiments of this application, a plurality of application scenarios of this solution are provided. This greatly improves implementation flexibility of this solution.
According to a second aspect, an embodiment of this application provides a neural network training method, so that an artificial intelligence technology may be applied to the image processing field. The method includes: inputting a to-be-processed image into a first neural network, performing feature extraction on the to-be-processed image by using the first neural network, to obtain feature information of the to-be-processed image, and performing feature processing on the feature information of the to-be-processed image by using a second neural network, to obtain a prediction result corresponding to the to-be-processed image; and training the first neural network and the second neural network by using a loss function based on the prediction result and a correct result that correspond to the to-be-processed image, where the loss function indicates a similarity between the prediction result and the correct result.
The performing feature extraction on the to-be-processed image by using the first neural network includes: obtaining first feature information corresponding to the to-be-processed image, where the to-be-processed image includes a plurality of image blocks, and the first feature information includes feature information of the image block; sequentially inputting feature information of at least two groups of image blocks into a leaky integrate and fire LIF module, to obtain target data generated by the LIF module, where feature information of a group of image blocks includes feature information of at least one image block; and obtaining second feature information corresponding to the to-be-processed image based on the target data, where the second feature information includes updated feature information of the image block, and both the first feature information and the second feature information are the feature information of the to-be-processed image.
In the second aspect of this application, the training device is further configured to perform the operations performed by the execution device in an embodiment of the first aspect. For implementations of the operations, meanings of terms, and beneficial effects brought in an embodiment of the second aspect of this application, refer to the first aspect. Details are not described herein again.
According to a third aspect, an embodiment of this application provides an image processing apparatus, so that an artificial intelligence technology may be applied to the image processing field. The image processing apparatus includes: an input unit, configured to input a to-be-processed image into a first neural network; and a feature extraction unit, configured to perform feature extraction on the to-be-processed image by using the first neural network, to obtain feature information of the to-be-processed image.
The feature extraction unit includes: an obtaining subunit, configured to obtain first feature information corresponding to the to-be-processed image, where the to-be-processed image includes a plurality of image blocks, and the first feature information includes feature information of the image block; and a generation subunit, configured to sequentially input feature information of at least two groups of image blocks into a leaky integrate and fire LIF module, to obtain target data generated by the LIF module, where feature information of a group of image blocks includes feature information of at least one image block. The obtaining subunit is configured to obtain second feature information corresponding to the to-be-processed image based on the target data, where the second feature information includes updated feature information of the image block, and both the first feature information and the second feature information are the feature information of the to-be-processed image.
In the third aspect of this application, the image processing apparatus is further configured to perform the operations performed by the execution device in an embodiment of the first aspect. For implementations of the operations, meanings of terms, and beneficial effects brought in an embodiment of the third aspect of this application, refer to the first aspect and an embodiment of the first aspect. Details are not described herein again.
According to a fourth aspect, an embodiment of this application provides a neural network training apparatus, so that an artificial intelligence technology may be applied to the image processing field. The neural network training apparatus includes: a feature extraction unit, configured to: input a to-be-processed image into a first neural network, and perform feature extraction on the to-be-processed image by using the first neural network, to obtain feature information of the to-be-processed image; a feature processing unit, configured to perform feature processing on the feature information of the to-be-processed image by using a second neural network, to obtain a prediction result corresponding to the to-be-processed image; and a training unit, configured to train the first neural network and the second neural network by using a loss function based on the prediction result and a correct result that correspond to the to-be-processed image, where the loss function indicates a similarity between the prediction result and the correct result.
The feature extraction unit includes: an obtaining subunit, configured to obtain first feature information corresponding to the to-be-processed image, where the to-be-processed image includes a plurality of image blocks, and the first feature information includes feature information of the image block; and a generation subunit, configured to sequentially input feature information of at least two groups of image blocks into a leaky integrate and fire LIF module, to obtain target data generated by the LIF module, where feature information of a group of image blocks includes feature information of at least one image block. The obtaining subunit is further configured to obtain second feature information corresponding to the to-be-processed image based on the target data, where the second feature information includes updated feature information of the image block, and both the first feature information and the second feature information are the feature information of the to-be-processed image.
In the fourth aspect of this application, the neural network training apparatus is further configured to perform the operations performed by the training device in an embodiment of the second aspect. For implementations of the operations, meanings of terms, and beneficial effects brought in an embodiment of the fourth aspect of this application, refer to the second aspect and an embodiment of the second aspect. Details are not described herein again.
According to a fifth aspect, an embodiment of this application provides a computer program product. The computer program product includes a program, and when the program is run on a computer, the computer is enabled to perform the method according to the first aspect or the second aspect.
According to a sixth aspect, an embodiment of this application provides a computer-readable storage medium. The computer-readable storage medium stores a computer program, and when the program is run on a computer, the computer is enabled to perform the method according to the first aspect or the second aspect.
According to a seventh aspect, an embodiment of this application provides an execution device, including a processor and a memory. The processor is coupled to the memory, the memory is configured to store a program, and the processor is configured to execute the program in the memory, so that the execution device performs the image processing method according to the first aspect.
According to an eighth aspect, an embodiment of this application provides a training device, including a processor and a memory. The processor is coupled to the memory, the memory is configured to store a program, and the processor is configured to execute the program in the memory, so that the training device performs the neural network training method according to the second aspect.
According to a ninth aspect, an embodiment of this application provides a chip system. The chip system includes a processor, and is configured to support a terminal device or a communication device in implementing functions in the foregoing aspects, for example, sending or processing data and/or information in the foregoing method. In an embodiment, the chip system further includes a memory. The memory is configured to store a program instruction and data that are necessary for the terminal device or the communication device. The chip system may include a chip, or may include a chip and another discrete component.
The following describes embodiments of this application with reference to the accompanying drawings. A person of ordinary skill in the art may learn that, with development of technologies and emergence of a new scenario, the technical solutions provided in embodiments of this application are also applicable to a similar technical problem.
In the specification, claims, and accompanying drawings of this application, the terms “first”, “second”, and so on are intended to distinguish between similar objects but do not necessarily indicate a specific order or sequence. It should be understood that the terms used in such a way are interchangeable in proper circumstances; this is merely a manner of distinguishing between objects having a same attribute when the objects are described in embodiments of this application. In addition, the terms “include”, “have”, and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, system, product, or device that includes a series of units is not necessarily limited to those units, but may include other units that are not expressly listed or that are inherent to such a process, method, product, or device.
An overall working procedure of an artificial intelligence system is first described.
The infrastructure provides computing capability support for the artificial intelligence system, implements communication with the external world, and implements support by using a basic platform. The infrastructure communicates with the external world by using a sensor. A computing capability is provided by intelligent chips. The intelligent chips may be hardware acceleration chips such as a central processing unit (CPU), an embedded neural-network processing unit (NPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), or a field-programmable gate array (FPGA). The basic platform includes related platform assurance and support such as a distributed computing framework and a network, and may include cloud storage and computing, an interconnection and interworking network, and the like. For example, the sensor communicates with the outside to obtain data, and the data is provided to an intelligent chip in a distributed computing system provided by the basic platform for computing.
Data at an upper layer of the infrastructure indicates a data source in the artificial intelligence field. The data relates to a graph, an image, a speech, and a text, further relates to internet of things data of a conventional device, and includes service data of an existing system and perception data, for example, force, displacement, a liquid level, a temperature, and humidity.
Data processing usually includes data training, machine learning, deep learning, searching, inference, decision making, and the like.
Machine learning and deep learning may mean performing symbolic and formal intelligent information modeling, extraction, preprocessing, training, and the like on data.
Inference is a process in which human intelligent inference is simulated in a computer or an intelligent system, and machine thinking and problem resolving are performed by using formal information according to an inference control policy. A typical function is searching and matching.
Decision making is a process of making a decision after intelligent information is inferred, and usually provides functions such as classification, ranking, and prediction.
After data processing mentioned above is performed on the data, some general capabilities may further be formed based on a data processing result. For example, the general capabilities may be an algorithm or a general system, for example, translation, text analysis, computer vision processing, speech recognition, and image recognition.
The intelligent product and the industry application are a product and an application of the artificial intelligence system in various fields, and are encapsulation for an overall solution of artificial intelligence, to productize intelligent information decision making and implement the application. Application fields thereof mainly include an intelligent terminal, intelligent manufacturing, intelligent transportation, a smart home, intelligent healthcare, intelligent security protection, autonomous driving, a smart city, and the like.
Embodiments of this application may be applied to various application fields in the artificial intelligence field, and may be applied to executing image processing tasks in various application fields. The tasks include but are not limited to: performing feature extraction on an image, image classification, image segmentation, performing target detection on an image, performing super-resolution processing on an image, or another type of task. This is not exhaustively listed herein.
For example, in the field of intelligent terminals, smart homes, or intelligent security protection, or another application field, there may be a requirement for performing image classification by using a neural network. For more intuitive understanding of this solution,
In another example, for example, in the autonomous driving field, there may be a requirement for performing target detection on a captured image by using a neural network, that is, inputting a to-be-processed image captured by an autonomous vehicle into the neural network, to obtain a category and a location of at least one object in the to-be-processed image output by the neural network.
In another example, for example, in the field of intelligent terminals, an image retouching application may provide a function of performing image segmentation on an input image, that is, inputting a to-be-processed image into a neural network, to obtain a category of each pixel in the to-be-processed image output by the neural network. The category of each pixel is a foreground or a background.
In another example, for example, in the fields such as intelligent security protection and a smart city, there may be a requirement for performing super-resolution processing on a captured image, that is, inputting an image captured by a monitoring device into a neural network, to obtain a processed image output by the neural network. The processed image has higher resolution.
In another example, for example, in the fields of intelligent terminals and intelligent security protection, there may be a requirement for performing facial recognition based on a captured image. In this case, feature extraction needs to be performed on a captured image of a user by using a neural network, so that extracted feature information may be matched with pre-registered feature information, to determine whether the current user is a registered user.
It should be noted that the image processing method provided in an embodiment of the application may be further applied to another application scenario. This is not exhaustively listed herein. In the foregoing application scenarios, in each process of processing an image by using a neural network, feature extraction needs to be first performed on an input image. To apply an LIF module to a feature extraction process of a single image, an embodiment of this application provides an image processing method.
The following first describes an image processing system in an embodiment of this application with reference to
The database 220 stores a training data set. The training device 210 generates a first model/rule 201, performs iterative training on the first model/rule 201 by using the training data set, to obtain a trained first model/rule 201, and deploys the trained first model/rule 201 to the calculation module 231 of the execution device 230. The first model/rule 201 may be represented as a neural network, or may be represented as a non-neural network model. In an embodiment of the application, only an example in which the first model/rule 201 is represented as a neural network is used for description.
The execution device 230 may be represented as different systems or devices, for example, a mobile phone, a tablet, a notebook computer, a virtual reality (VR) device, or a monitoring system. The execution device 230 may invoke data, code, and the like in the data storage system 240, or may store data, instructions, and the like in the data storage system 240. The data storage system 240 may be disposed in the execution device 230, or the data storage system 240 may be an external memory relative to the execution device 230.
In some embodiments of this application, refer to
With reference to the foregoing description,
In an embodiment, the feature information of the at least two groups of image blocks is sequentially input into the LIF module, to implement leakage and integration processes of the LIF module, and obtain the target data generated by the LIF module, and then the updated feature information of the to-be-processed image is obtained based on the target data. In the foregoing manner, feature extraction is performed on a single image by using the LIF module, so that the LIF module can be used to execute mainstream general visual tasks.
With reference to the foregoing description, the following starts to describe implementation procedures of an inference phase and a training phase of the image processing method provided in embodiments of this application.
In an embodiment of the application,
301: An execution device inputs a to-be-processed image into a first neural network.
In an embodiment of the application, after obtaining the to-be-processed image, the execution device may input the to-be-processed image into the first neural network, and perform feature extraction on the to-be-processed image by using the first neural network, to obtain feature information of the to-be-processed image.
The first neural network may be represented as a multilayer perceptron (MLP), a convolutional neural network (CNN), a neural network using a self-attention mechanism, or another type of neural network. The neural network using the self-attention mechanism may also be referred to as a transformer neural network, and may be determined flexibly based on an actual application scenario. This is not limited herein.
For more intuitive understanding of this solution, the following first describes an overall architecture of the first neural network with reference to
The segmentation unit in the first neural network is configured to perform feature extraction and segmentation on the to-be-processed image, to obtain initial feature information (embedding) of a plurality of image blocks (patch) included in the to-be-processed image. Because the plurality of image blocks form the to-be-processed image, the feature information of the plurality of image blocks is the feature information of the to-be-processed image. The segmentation operation is used to divide the to-be-processed image into a plurality of image blocks, and an execution sequence of the feature extraction operation and the segmentation operation may be flexibly determined based on an actual application scenario. It should be noted that the feature information of the plurality of image blocks shown in
The LIF unit in the first neural network is configured to update the feature information of the image block. The LIF unit includes at least the LIF module in an embodiment of the application. The LIF unit may further include another neural network layer. An implementation process of the LIF unit is described in detail in the following operations 302 to 304.
The channel mixing unit in the first neural network is also configured to update the feature information of the image block. Both the up-sampling unit and the down-sampling unit are configured to change a size of the feature information of the to-be-processed image. The up-sampling unit is configured to perform an up-sampling operation on the feature information of the image block, to scale up the feature information of the image block. The down-sampling unit is configured to perform a down-sampling operation on the feature information of the image block, to scale down the feature information of the image block.
It should be noted that in actual application, the first neural network may include more or fewer units, locations of the LIF unit and the channel mixing unit may be adjusted, and quantities of the LIF unit, the channel mixing unit, and the up-sampling/down-sampling unit may be the same or different, provided that there is the LIF unit in the first neural network. The first neural network shown in
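For illustration only, the following PyTorch sketch shows one way the units described above could be composed; the class names (PatchEmbed, ChannelMixing, FirstNeuralNetwork), the layer choices, and the residual connections are assumptions for this sketch rather than the implementation of the embodiments, and the LIF unit itself is sketched in a later section.

```python
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Segmentation unit: splits the to-be-processed image into image blocks (patches)
    and extracts initial feature information with a strided convolution."""
    def __init__(self, in_ch=3, dim=64, patch=4):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)

    def forward(self, x):          # x: [B, 3, H, W]
        return self.proj(x)        # [B, dim, H/patch, W/patch], one feature vector per image block

class ChannelMixing(nn.Module):
    """Channel mixing unit: updates the feature information of each image block along channels."""
    def __init__(self, dim):
        super().__init__()
        self.fc1 = nn.Conv2d(dim, dim * 4, kernel_size=1)
        self.fc2 = nn.Conv2d(dim * 4, dim, kernel_size=1)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

class FirstNeuralNetwork(nn.Module):
    """Backbone sketch: segmentation unit -> LIF unit -> channel mixing unit -> down-sampling unit.
    An up-sampling unit could be added in the same way where the feature size needs to grow."""
    def __init__(self, dim=64, lif_unit=None):
        super().__init__()
        self.embed = PatchEmbed(dim=dim)
        self.lif_unit = lif_unit if lif_unit is not None else nn.Identity()  # placeholder for the LIF unit
        self.mix = ChannelMixing(dim)
        self.down = nn.Conv2d(dim, dim * 2, kernel_size=2, stride=2)          # down-sampling unit

    def forward(self, img):
        feat = self.embed(img)              # first feature information of the image blocks
        feat = feat + self.lif_unit(feat)   # LIF unit updates the feature information
        feat = feat + self.mix(feat)        # channel mixing unit updates it again
        return self.down(feat)              # change the size of the feature information
```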
302: The execution device obtains first feature information corresponding to the to-be-processed image, where the to-be-processed image includes a plurality of image blocks, and the first feature information includes feature information of the image block.
In an embodiment of the application, before updating the initial feature information of the plurality of image blocks by using the LIF module, the execution device may first obtain the first feature information corresponding to the to-be-processed image. The to-be-processed image includes the plurality of image blocks, and the first feature information includes the feature information of the plurality of image blocks.
Further, the first feature information may include initial feature information of each image block, or may include updated feature information of each image block.
In an embodiment, after obtaining the first feature information corresponding to the to-be-processed image, the execution device may further perform a convolution operation on the feature information of the plurality of image blocks by using a convolutional neural network layer, to update the feature information of the image block and obtain updated first feature information. The convolutional neural network layer may be represented as a depthwise separable convolution layer (depthwise convolution) or another type of convolutional neural network layer. When the depthwise separable convolution layer is selected, a calculation amount of the foregoing convolution operation can be reduced.
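As a small illustration (the shapes and layer parameters are assumptions, not a mandated implementation), the depthwise convolution can be applied to the block features as follows; setting groups equal to the number of channels is what keeps the calculation amount low:

```python
import torch
import torch.nn as nn

dim = 64
first_feature = torch.randn(1, dim, 14, 14)   # feature information of 14 x 14 image blocks, dim channels each
depthwise = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)  # groups=dim -> depthwise convolution
updated_first_feature = depthwise(first_feature)   # updated first feature information, same shape as the input
```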
303: The execution device sequentially inputs feature information of at least two groups of image blocks into the LIF module, to obtain target data generated by the LIF module.
In an embodiment of the application, after obtaining the feature information (that is, the first feature information or the updated first feature information) of the plurality of image blocks included in the to-be-processed image, the execution device may divide the feature information of the plurality of image blocks included in the to-be-processed image into the feature information of the at least two groups of image blocks, and sequentially input the feature information of the at least two groups of image blocks into the LIF module, to implement leakage and integration processes of the LIF module, and obtain the target data generated by the LIF module. Feature information of a group of image blocks includes feature information of at least one image block.
In an embodiment, the execution device may sequentially input the feature information of the at least two groups of image blocks into the LIF module, and when an excitation condition of the LIF module is satisfied, generate the target data by using an activation function.
The target data may be binarized data, that is, the target data output by the LIF module may be two preset values. Alternatively, the target data may be non-binarized data, that is, the target data output by the LIF module may not be pulse data and is not limited to two fixed values, but may be data with higher precision. For example, the target data may be floating-point data. In an embodiment, precision of the target data may be the same as that of the feature information of the image block, that is, a numerical level of the target data may be the same as that of the feature information of the image block.
In an embodiment of the application, the target data output by the LIF module is non-binarized data, that is, precision of the target data output by the LIF module is improved, so that richer feature information of the to-be-processed image can be extracted. In this way, in the process of performing feature extraction on the to-be-processed image, the advantage of fast and efficient computation of the LIF module is retained, and richer feature information can be obtained.
Regarding the feature information of the at least two groups of image blocks, the to-be-processed image includes two direction dimensions: a horizontal dimension and a vertical dimension. Correspondingly, the to-be-processed image may be segmented in the two direction dimensions, namely, the horizontal and vertical dimensions, that is, the feature information of the plurality of image blocks may include feature information of a plurality of image blocks in the horizontal dimension and feature information of a plurality of image blocks in the vertical dimension.
For more intuitive understanding of this solution,
As shown in
The execution device may determine feature information of one or more rows of image blocks as feature information of a group of image blocks, or the execution device may determine feature information of one or more columns of image blocks as feature information of a group of image blocks, to divide the feature information of the plurality of image blocks into the feature information of the at least two groups of image blocks.
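For illustration, the grouping described above can be sketched as follows, assuming the feature information of the image blocks is stored as a [batch, channel, height, width] tensor with one feature vector per image block:

```python
import torch

feat = torch.randn(2, 64, 8, 8)   # feature information of 8 x 8 image blocks, 64 channels each

# One group per row of image blocks, fed to a vertical LIF module in sequence.
row_groups = [feat[:, :, i, :] for i in range(feat.shape[2])]   # each group: [B, C, W]

# One group per column of image blocks, fed to a horizontal LIF module in sequence.
col_groups = [feat[:, :, :, j] for j in range(feat.shape[3])]   # each group: [B, C, H]
```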
For a process of sequentially inputting the feature information of the at least two groups of image blocks into the LIF module, one LIF unit in the first neural network may have one or more LIF modules. For more intuitive understanding of this solution, refer to
The MLP layer is a neural network layer including at least one fully connected neuron. If the first neural network is represented as the convolutional neural network, the MLP layer may be replaced with the convolutional neural network layer. If the first neural network is represented as the transformer neural network, the MLP layer may be replaced with the transformer neural network layer. Further, the convolutional neural network layer is a neural network layer including at least one partially connected neuron, and the transformer neural network layer is a neural network layer that introduces an attention mechanism.
Feature information of each group of image blocks obtained by the vertical LIF module includes feature information of at least one row of image blocks. For more intuitive understanding of this solution,
Feature information of each group of image blocks acquired by the horizontal LIF module includes feature information of at least one column of image blocks, that is, feature information of a plurality of image blocks is grouped in a horizontal direction, and at least two obtained groups are sequentially input to the horizontal LIF module. For more intuitive understanding of this solution,
It should be noted that one LIF unit in the first neural network may include more or fewer neural network layers. The example in
In an embodiment, if the first neural network includes the vertical LIF module, the execution device may group the feature information of the plurality of image blocks in the vertical direction, and sequentially input obtained feature information of at least two groups of image blocks into the vertical LIF module, that is, input feature information of a group of image blocks into the vertical LIF module each time.
Each time after inputting feature information of at least one row of image blocks (that is, feature information of a group of image blocks) into the vertical LIF module, the execution device determines whether an excitation condition of the vertical LIF module is satisfied. If a determining result is that the excitation condition of the vertical LIF module is not satisfied, the vertical LIF module may not generate any value. If a determining result is that the excitation condition of the vertical LIF module is satisfied, the vertical LIF module may generate the target data by using the activation function, and reset a membrane potential of the vertical LIF module to 0. The execution device continues to input feature information of a next group of image blocks into the vertical LIF module, to leak and integrate the feature information of the two groups of image blocks. The execution device repeatedly performs the foregoing operations, to complete processing feature information of all the image blocks by using the vertical LIF module.
For further understanding of this solution, the following discloses a formula of an implementation of the LIF module:
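A plausible form of formula (1), reconstructed from the variable descriptions in the following paragraph and from the leakage, integration, and reset behavior described above (the exact expression used in the embodiments may differ), is:

u_{t+1}^n = τ · u_t^n · (1 - o_t^n) + y_{t+1}^n        (1)

where the factor (1 - o_t^n) models resetting the membrane potential to 0 once the excitation condition has been satisfied.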
τ represents a leakage parameter of the LIF module, and is a hyperparameter; when a value of u_{t+1}^n is greater than V_th, a value of o_t^n is 1; when a value of u_{t+1}^n is less than or equal to V_th, a value of o_t^n is 0; V_th represents the excitation condition (threshold) of the LIF module; y_{t+1}^n represents an nth value in feature information of a group of image blocks input into the LIF module in a current round (that is, a (t+1)th round); u_t^n represents a membrane potential of the LIF module in a previous round (that is, a tth round); and u_{t+1}^n represents a membrane potential of the LIF module in the current round. When the excitation condition of the LIF module is satisfied, the LIF module generates r_{t+1}^n, and a calculation formula of r_{t+1}^n is as follows:
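Similarly, a plausible form of formula (2), in which the activation function is applied once the membrane potential exceeds the threshold (again, the exact expression used in the embodiments may differ), is:

r_{t+1}^n = ReLU(u_{t+1}^n - V_th)        (2)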
u_{t+1}^n and V_th may be understood with reference to the foregoing description. ReLU is an example of the activation function. It should be understood that the examples in formula (1) and formula (2) are merely intended to facilitate understanding of this solution rather than limiting this solution.
Further, data sizes of the feature information of the two groups of image blocks are consistent, that is, values included in the feature information of the two groups of image blocks may be in a one-to-one correspondence. In this case, the LIF module may multiply feature information of an image block in a previous round by the leakage parameter, and then add the product to feature information of the image block in a current round, to obtain a plurality of target values. u_{t+1}^n represents an nth value in the plurality of target values. When a value of u_{t+1}^n is greater than a preset threshold, it is determined that u_{t+1}^n satisfies the excitation condition of the LIF module, and the LIF module generates a piece of target data by using the activation function.
Still further, because the feature information of the image block may include feature information of an image block corresponding to at least one channel, correspondingly, the excitation condition of the LIF module may include one or more thresholds. Further, threshold values corresponding to different channels may be the same or different.
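Putting the preceding description together, the following PyTorch sketch shows one possible LIF module with non-binarized output; the class name LIFModule, the learnable per-channel threshold, and the default leakage value are assumptions for this sketch only, not the implementation of the embodiments:

```python
import torch
import torch.nn as nn

class LIFModule(nn.Module):
    def __init__(self, channels, tau=0.25):
        super().__init__()
        self.tau = tau                                  # leakage parameter (a hyperparameter)
        self.v_th = nn.Parameter(torch.ones(channels))  # per-channel excitation threshold

    def forward(self, groups):
        """groups: list of [B, C, L] tensors, one group of image-block features per round."""
        u = torch.zeros_like(groups[0])                 # membrane potential, initially 0
        outputs = []
        for y in groups:                                # one round per group
            v_th = self.v_th.view(1, -1, 1)
            u = self.tau * u + y                        # leak the previous potential, integrate the input
            fired = u > v_th                            # excitation condition, checked per channel
            outputs.append(torch.relu(u - v_th))        # non-binarized target data (0 where not fired)
            u = torch.where(fired, torch.zeros_like(u), u)  # reset the membrane potential after firing
        return outputs
```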
For more intuitive understanding of this solution,
In an embodiment, one LIF unit of the first neural network may include M parallel vertical LIF modules. In each round, the execution device may simultaneously input feature information of M groups of image blocks into the M parallel vertical LIF modules, and process input data by using the M parallel vertical LIF modules.
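Continuing the LIFModule and row_groups sketches above (the names and the round-robin split are illustrative assumptions), M parallel vertical LIF modules can each be fed one group per round:

```python
M = 2
modules = [LIFModule(channels=64) for _ in range(M)]

# In round r, module m receives group r * M + m, so M groups are processed simultaneously.
per_module_groups = [row_groups[m::M] for m in range(M)]
per_module_outputs = [modules[m](per_module_groups[m]) for m in range(M)]
```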
In an embodiment, if the first neural network includes the horizontal LIF module, the execution device may group the feature information of the plurality of image blocks in the horizontal direction, and sequentially input obtained feature information of at least two groups of image blocks into the horizontal LIF module, that is, input feature information of a group of image blocks into the horizontal LIF module each time.
Each time after inputting feature information of at least one column of image blocks (that is, feature information of a group of image blocks) into the horizontal LIF module, the execution device determines whether an excitation condition of the horizontal LIF module is satisfied. If a determining result is that the excitation condition of the horizontal LIF module is not satisfied, the horizontal LIF module may not generate any value. If a determining result is that the excitation condition of the horizontal LIF module is satisfied, the horizontal LIF module may generate the target data by using the activation function, and reset a membrane potential of the horizontal LIF module to 0. The execution device continues to input feature information of a next group of image blocks into the horizontal LIF module, to leak and integrate the feature information of the two groups of image blocks. The execution device repeatedly performs the foregoing operations, to complete processing feature information of all the image blocks by using the horizontal LIF module.
It should be noted that manners of processing the input data by the “vertical LIF module” and the “horizontal LIF module” are similar. For an implementation of the horizontal LIF module, refer to the foregoing descriptions. Details are not described herein again.
In an embodiment, one LIF unit of the first neural network may include M parallel horizontal LIF modules. In each round, the execution device may simultaneously input feature information of M groups of image blocks into the M parallel horizontal LIF modules, and process input data by using the M parallel horizontal LIF modules.
For more intuitive understanding of this solution,
In a second round, the execution device inputs feature information of a second column of image blocks (that is, feature information of a group of image blocks represented by E2) into one horizontal LIF module, and inputs feature information of a fourth column of image blocks (that is, feature information of a group of image blocks represented by F2) into the other horizontal LIF module. In this way, feature information of the four groups of image blocks is input into the two parallel horizontal LIF modules. It should be understood that the example in
If the first neural network includes both the vertical LIF module and the horizontal LIF module, the execution device may separately process, by using the vertical LIF module and the horizontal LIF module, feature information of all image blocks included in the to-be-processed image. Implementation details of the vertical LIF module and the horizontal LIF module are not described herein again.
304: The execution device obtains second feature information corresponding to the to-be-processed image based on the target data, where the second feature information includes updated feature information of the image block.
In an embodiment of the application, after obtaining a plurality of pieces of target data generated by the LIF module, the execution device may obtain the second feature information corresponding to the to-be-processed image based on the plurality of pieces of target data, where the second feature information includes the updated feature information of the image block, and both the first feature information and the second feature information are the feature information of the to-be-processed image.
In an embodiment, if the first neural network includes only the vertical LIF module or the horizontal LIF module, the execution device may determine the plurality of pieces of target data output by the vertical LIF module or the horizontal LIF module as the second feature information corresponding to the to-be-processed image; or may process the plurality of pieces of output target data again by using another neural network layer, and determine processed data as the second feature information corresponding to the to-be-processed image.
If the first neural network includes both the vertical LIF module and the horizontal LIF module, the execution device may fuse the target data output by the vertical LIF module and the target data output by the horizontal LIF module, and directly determine fused data as the second feature information corresponding to the to-be-processed image. Alternatively, the execution device may perform an update operation by using another neural network layer before or after performing a fusion operation.
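As an illustrative sketch only, the fusion could be element-wise addition of the two re-assembled feature maps followed by an update layer; both the addition and the 1x1 convolution standing in for the "another neural network layer" are assumptions here, and as described below the layer type would follow the type of the first neural network:

```python
import torch
import torch.nn as nn

dim = 64
vertical_out = torch.randn(1, dim, 8, 8)    # target data of the vertical LIF module, re-assembled into a feature map
horizontal_out = torch.randn(1, dim, 8, 8)  # target data of the horizontal LIF module, re-assembled into a feature map

update_layer = nn.Conv2d(dim, dim, kernel_size=1)                          # assumed update layer
second_feature_information = update_layer(vertical_out + horizontal_out)   # fused and updated block features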
Further, if the first neural network is represented as the MLP, the another neural network layer may be the MLP layer. If the first neural network is represented as the convolutional neural network, the another neural network layer may be the convolutional neural network layer. If the first neural network is represented as the transformer neural network, the another neural network layer may be the transformer neural network layer or the like. If the first neural network uses another type of neural network, the another neural network layer may be further replaced with another type of neural network layer or the like. Details are not described herein.
In an embodiment of the application, regardless of whether the first neural network is the MLP, the convolutional neural network, or the transformer neural network, the first neural network can be compatible with the LIF module by using the image processing method provided in an embodiment of the application. Because the MLP, the convolutional neural network, and the transformer neural network may be applied to different application scenarios, application scenarios of this solution are greatly extended and implementation flexibility is greatly improved.
It should be noted that the operations described in operations 302 to 304 are operations performed by one LIF unit in the first neural network. After obtaining the second feature information corresponding to the to-be-processed image, the execution device may update the second feature information by using another neural network layer, that is, update the feature information of the to-be-processed image again.
Further, it is understood with reference to
305: The execution device performs feature processing on the feature information of the to-be-processed image by using a second neural network, to obtain a prediction result corresponding to the to-be-processed image.
In an embodiment of the application, after generating the feature information of the to-be-processed image by using the first neural network, the execution device may perform feature processing on the feature information of the to-be-processed image by using the second neural network, to obtain the prediction result corresponding to the to-be-processed image. The first neural network and the second neural network are included in a same target neural network, and a task executed by the target neural network is any one of the following: image classification, image segmentation, performing target detection on an image, performing super-resolution processing on an image, another type of task, or the like. Implementation tasks of the target neural network are not exhaustively listed herein.
A meaning of the prediction result corresponding to the to-be-processed image depends on a type of the task executed by the target neural network. For example, if the task executed by the target neural network is image classification, the prediction result corresponding to the to-be-processed image may be used to indicate a prediction category corresponding to the to-be-processed image. For another example, if the task executed by the target neural network is performing target detection on an image, the prediction result corresponding to the to-be-processed image may be used to indicate a prediction category and a prediction location of each object in the to-be-processed image. For another example, if the task executed by the target neural network is image segmentation, the prediction result corresponding to the to-be-processed image may be used to indicate a prediction category of each pixel in the to-be-processed image. For another example, if the task executed by the target neural network is performing super-resolution processing on an image, the prediction result corresponding to the to-be-processed image may include a processed image with higher resolution, and the like. This is not exhaustively listed herein.
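As a sketch of one possible target neural network for the image classification task (the pooling-plus-linear head, the feature dimension, and the class count are assumptions, not part of the embodiments):

```python
import torch
import torch.nn as nn

class TargetNeuralNetwork(nn.Module):
    def __init__(self, first_nn, feat_dim=128, num_classes=1000):
        super().__init__()
        self.first_nn = first_nn                       # first neural network: LIF-based feature extraction
        self.second_nn = nn.Sequential(                # second neural network: feature processing head
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(feat_dim, num_classes),
        )

    def forward(self, image):
        feature_information = self.first_nn(image)     # feature information of the to-be-processed image
        return self.second_nn(feature_information)     # prediction result (here, class scores)
```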
In embodiments of this application, a plurality of application scenarios of this solution are provided. This greatly improves implementation flexibility of this solution.
In an embodiment of the application, the feature information of the entire to-be-processed image is divided into feature information of a plurality of image blocks in the to-be-processed image, and the feature information of the plurality of image blocks may be divided into feature information of at least two groups of image blocks. The feature information of the at least two groups of image blocks is sequentially input into the LIF module, to obtain the target data generated by the LIF module, and then the updated feature information of the to-be-processed image is obtained based on the target data. In the foregoing manner, feature extraction is performed on a single image by using the LIF module, so that the LIF module can be used to execute mainstream general visual tasks.
In an embodiment of the application,
1101: A training device inputs a to-be-processed image into a first neural network.
1102: The training device obtains first feature information corresponding to the to-be-processed image, where the to-be-processed image includes a plurality of image blocks, and the first feature information includes feature information of the image block.
1103: The training device sequentially inputs feature information of at least two groups of image blocks into an LIF module, to implement leakage and integration processes of the LIF module, and obtain target data generated by the LIF module.
1104: The training device obtains second feature information corresponding to the to-be-processed image based on the target data, where the second feature information includes updated feature information of the image block.
1105: The training device performs feature processing on the feature information of the to-be-processed image by using the second neural network, to obtain the prediction result corresponding to the to-be-processed image.
In an embodiment of the application, a training data set may be configured on the training device. The training data set is used to train a target neural network. The target neural network includes a first neural network and a second neural network. A task executed by the target neural network is any one of the following: image classification, performing target detection on an image, image segmentation, performing super-resolution processing on an image, or another type of task. This is not exhaustively listed herein.
The training data set includes a plurality of pieces of training data, each piece of training data includes one to-be-processed image and a correct result corresponding to the to-be-processed image, and a meaning of the correct result corresponding to the to-be-processed image depends on a type of the task executed by the target neural network. The two concepts “the correct result corresponding to the to-be-processed image” and “the prediction result corresponding to the to-be-processed image” are similar. A difference lies in that “the correct result corresponding to the to-be-processed image” includes correct information, whereas “the prediction result corresponding to the to-be-processed image” includes information generated by the target neural network.
For an embodiment of operations 1101 to 1105, refer to the descriptions of operations 301 to 305 in the embodiment corresponding to
1106: The training device trains the first neural network and the second neural network by using a loss function based on the prediction result corresponding to the to-be-processed image and a correct result corresponding to the to-be-processed image, where the loss function indicates a similarity between the prediction result and the correct result.
In an embodiment of the application, the training device may generate a function value of the loss function based on the prediction result corresponding to the to-be-processed image and the correct result corresponding to the to-be-processed image, perform gradient derivation on the function value of the loss function, and backpropagate the gradient value, to update weight parameters of the first neural network and the second neural network (that is, the target neural network), and complete training on the first neural network and the second neural network. The training device repeatedly performs operations 1101 to 1106 until a convergence condition is satisfied.
The loss function indicates the similarity between the prediction result corresponding to the to-be-processed image and the correct result corresponding to the to-be-processed image. A type of the loss function may be flexibly selected with reference to an actual application scenario. For example, if the task executed by the target neural network is image classification, a cross entropy loss function, a 0-1 loss function, another type of loss function, or the like may be selected as the loss function. The example herein is merely intended to facilitate understanding of this solution rather than limiting this solution.
The convergence condition may be that a convergence condition of the loss function is satisfied, or a quantity of iterations reaches a preset quantity of times, or the like. This is not limited herein.
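A minimal training-loop sketch corresponding to operations 1101 to 1106, assuming an image classification task with a cross-entropy loss; the optimizer, learning rate, and data loader are illustrative assumptions:

```python
import torch
import torch.nn as nn

def train(target_nn, data_loader, epochs=10, lr=1e-3):
    loss_fn = nn.CrossEntropyLoss()                          # indicates similarity between prediction and correct result
    optimizer = torch.optim.AdamW(target_nn.parameters(), lr=lr)
    for _ in range(epochs):                                  # repeat until the convergence condition is satisfied
        for image, correct_result in data_loader:            # one piece of training data per step
            prediction = target_nn(image)                    # operations 1101 to 1105
            loss = loss_fn(prediction, correct_result)       # operation 1106: function value of the loss function
            optimizer.zero_grad()
            loss.backward()                                  # gradient derivation and backpropagation
            optimizer.step()                                 # update weight parameters of the target neural network
    return target_nn
```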
In an embodiment of the application, implementation operations of the first neural network in an execution phase are provided, and implementation operations of the first neural network in the training phase are further provided. This extends application scenarios of this solution, and improves comprehensiveness of this solution.
For more intuitive understanding of beneficial effects brought by this solution, the following describes the beneficial effects with reference to experimental data. First, an example in which the target neural network executes an image classification task is used. An experiment is conducted on the ImageNet dataset, and the obtained experiment results are shown in Table 1 below.
ResMLP-B24, DeiT-B, and AS-MLP-B are three existing neural networks that may be used to classify an image. It can be learned from the foregoing data that a classification result obtained by using a model provided in embodiments of this application has the highest accuracy.
The following uses an example in which the target neural network performs target detection on an image, and the obtained experiment results are shown in Table 2.
DNL, Swin-S, and OCRNet are all existing neural networks, and mIoU (mean intersection over union) is an indicator for evaluating the precision of a result of target detection performed on an image. It can be learned from the foregoing data that a target detection result obtained by using a model provided in embodiments of this application has the highest precision.
Based on the embodiments corresponding to
The feature extraction unit 1202 includes: an obtaining subunit 12021, configured to obtain first feature information corresponding to the to-be-processed image, where the to-be-processed image includes a plurality of image blocks, and the first feature information includes feature information of the image block; and a generation subunit 12022, configured to sequentially input feature information of at least two groups of image blocks into a leaky integrate and fire LIF module, to obtain target data generated by the LIF module, where feature information of a group of image blocks includes feature information of at least one image block. The obtaining subunit 12021 is configured to obtain second feature information corresponding to the to-be-processed image based on the target data, where the second feature information includes updated feature information of the image block, and both the first feature information and the second feature information are the feature information of the to-be-processed image.
In an embodiment, the generation subunit 12022 is configured to: sequentially input the feature information of the at least two groups of image blocks into the LIF module, and when an excitation condition of the LIF module is satisfied, generate the target data by using an activation function, where the target data is not binarized data.
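As a reading aid only, the following sketch illustrates one possible way to sequentially feed feature information of groups of image blocks into an LIF-style module whose output is generated by an activation function rather than binarized spikes. The class name, the leakage constant, the threshold, the sigmoid activation, and the soft reset are assumptions made for the sketch; they are not the implementation of the embodiments.

```python
# Illustrative sketch of LIF-style processing over groups of image-block features
# (assumed shapes and parameters; not the claimed implementation).
import torch
import torch.nn as nn

class LIFModule(nn.Module):
    def __init__(self, tau: float = 2.0, threshold: float = 1.0):
        super().__init__()
        self.tau = tau              # leakage constant of the membrane potential
        self.threshold = threshold  # excitation condition of the module

    def forward(self, group_features):
        # group_features: a list of tensors, one per group of image blocks, each of
        # shape (batch, blocks_per_group, channels), fed into the module sequentially.
        membrane = torch.zeros_like(group_features[0])
        outputs = []
        for x in group_features:
            # leakage and integration of the membrane potential
            membrane = membrane + (x - membrane) / self.tau
            # the sigmoid softly realizes the excitation condition and acts as the
            # activation function, so the generated target data is not binarized
            fired = torch.sigmoid(membrane - self.threshold)
            outputs.append(fired)
            # soft reset of the membrane potential where the module fired
            membrane = membrane - fired * self.threshold
        # target data for all groups, later used to obtain the second feature information
        return torch.cat(outputs, dim=1)
```

For example, first feature information of shape (batch, num_blocks, channels) could be split along the block dimension into the groups that are fed in sequentially, and the concatenated target data could then be mapped back to updated feature information of the image blocks.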
In an embodiment, the first neural network is a multilayer perceptron MLP, a convolutional neural network, or a neural network using a self-attention mechanism.
In an embodiment, the image processing apparatus 1200 further includes a feature processing unit, configured to perform feature processing on the feature information of the to-be-processed image by using a second neural network, to obtain a prediction result corresponding to the to-be-processed image, where the first neural network and the second neural network are included in a same target neural network, and a task executed by the target neural network is any one of the following: classification, segmentation, target detection, or super-resolution.
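For instance, if the task executed by the target neural network is classification, the second neural network could be as simple as the following sketch, which pools the updated feature information of the image blocks and maps it to class logits. The layer sizes and the pooling choice are illustrative assumptions, not details of the embodiments.

```python
# Sketch of a possible second neural network for a classification task (assumed sizes).
import torch
import torch.nn as nn

class ClassificationHead(nn.Module):
    def __init__(self, channels: int = 256, num_classes: int = 1000):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.fc = nn.Linear(channels, num_classes)

    def forward(self, features):
        # features: (batch, num_blocks, channels) feature information of the to-be-processed image
        pooled = self.norm(features).mean(dim=1)   # aggregate over the image blocks
        return self.fc(pooled)                     # prediction result (class logits)
```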
It should be noted that content such as information exchange and an execution process between the modules/units in the image processing apparatus 1200 is based on a same concept as the method embodiment corresponding to
The feature extraction unit 1301 includes: an obtaining subunit 13011, configured to obtain first feature information corresponding to the to-be-processed image, where the to-be-processed image includes a plurality of image blocks, and the first feature information includes feature information of the image block; and a generation subunit 13012, configured to sequentially input feature information of at least two groups of image blocks into a leaky integrate and fire LIF module, to obtain target data generated by the LIF module, where feature information of a group of image blocks includes feature information of at least one image block. The obtaining subunit 13011 is further configured to obtain second feature information corresponding to the to-be-processed image based on the target data, where the second feature information includes updated feature information of the image block, and both the first feature information and the second feature information are the feature information of the to-be-processed image.
In an embodiment, the generation subunit 13012 is configured to: sequentially input the feature information of the at least two groups of image blocks into the LIF module, and when an excitation condition of the LIF module is satisfied, generate the target data by using an activation function, where the target data is not binarized data.
It should be noted that content such as information exchange and an execution process between the modules/units in the neural network training apparatus 1300 is based on a same concept as the method embodiment corresponding to
The following describes an execution device provided in an embodiment of this application.
The memory 1404 may include a read-only memory and a random access memory, and provide instructions and data to the processor 1403. A part of the memory 1404 may further include a non-volatile random access memory (NVRAM). The memory 1404 stores operation instructions for the processor 1403, an executable module or a data structure, a subset thereof, or an expanded set thereof. The operation instructions may include various operation instructions for implementing various operations.
The processor 1403 controls an operation of the execution device. In an application, the components of the execution device are coupled together through a bus system. In addition to a data bus, the bus system may further include a power bus, a control bus, a status signal bus, and the like. However, for clear description, various types of buses in the figure are marked as the bus system.
The methods disclosed in embodiments of this application may be applied to the processor 1403 or may be implemented by the processor 1403. The processor 1403 may be an integrated circuit chip and has a signal processing capability. In an implementation process, the operations in the foregoing methods can be implemented by using a hardware integrated logic circuit in the processor 1403, or by using instructions in a form of software. The processor 1403 may be a general-purpose processor, a digital signal processor (DSP), a microprocessor, or a microcontroller. The processor 1403 may further include an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor 1403 may implement or perform the methods, the operations, and logical block diagrams that are disclosed in embodiments of this application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The operations in the methods disclosed with reference to embodiments of this application may be directly performed and completed by a hardware decoding processor, or may be performed and completed by using a combination of hardware and software modules in the decoding processor. A software module may be located in a mature storage medium in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 1404, and the processor 1403 reads information in the memory 1404 and completes the operations in the foregoing methods in combination with hardware of the processor 1403.
The receiver 1401 may be configured to receive input digital or character information, and generate signal inputs related to setting and function control of the execution device. The transmitter 1402 may be configured to output digital or character information through a first interface. The transmitter 1402 may be further configured to send instructions to a disk group through the first interface, to modify data in the disk group. The transmitter 1402 may further include a display device, for example, a display.
In an embodiment of the application, the processor 1403 is configured to perform the image processing method performed by the execution device in the embodiment corresponding to
The performing feature extraction on the to-be-processed image by using the first neural network includes: obtaining first feature information corresponding to the to-be-processed image, where the to-be-processed image includes a plurality of image blocks, and the first feature information includes feature information of the image block; sequentially inputting feature information of at least two groups of image blocks into a leaky integrate and fire LIF module, to obtain target data generated by the LIF module, where feature information of a group of image blocks includes feature information of at least one image block; and obtaining second feature information corresponding to the to-be-processed image based on the target data, where the second feature information includes updated feature information of the image block, and both the first feature information and the second feature information are the feature information of the to-be-processed image.
It should be noted that a manner in which the application processor 14031 performs the foregoing operations is based on a same concept as the method embodiments corresponding to
An embodiment of this application further provides a training device.
The training device 1500 may further include one or more power supplies 1526, one or more wired or wireless network interfaces 1550, one or more input/output interfaces 1558, and/or one or more operating systems 1541, for example, Windows Server™, Mac OS X™, Unix™, Linux™, and FreeBSD™.
In an embodiment of the application, the central processing unit 1522 is configured to perform the neural network training method performed by the training device in the embodiment corresponding to
The performing feature extraction on the to-be-processed image by using the first neural network includes: obtaining first feature information corresponding to the to-be-processed image, where the to-be-processed image includes a plurality of image blocks, and the first feature information includes feature information of the image block; sequentially inputting feature information of at least two groups of image blocks into a leaky integrate and fire LIF module, to obtain target data generated by the LIF module, where feature information of a group of image blocks includes feature information of at least one image block; and obtaining second feature information corresponding to the to-be-processed image based on the target data, where the second feature information includes updated feature information of the image block, and both the first feature information and the second feature information are the feature information of the to-be-processed image.
It should be noted that a manner in which the central processing unit 1522 performs the foregoing operations is based on a same concept as the method embodiment corresponding to
An embodiment of this application further provides a computer program product. When the computer program product runs on a computer, the computer is enabled to perform operations performed by the execution device in the method described in embodiments shown in
An embodiment of this application further provides a computer-readable storage medium. The computer-readable storage medium stores a program used to perform signal processing. When the program is run on a computer, the computer is enabled to perform operations performed by the execution device in the method described in embodiments shown in
The image processing apparatus, the neural network training apparatus, the execution device, or the training device provided in embodiments of this application may be a chip. The chip includes a processing unit and a communication unit. The processing unit may be, for example, a processor. The communication unit may be, for example, an input/output interface, a pin, or a circuit. The processing unit may execute computer-executable instructions stored in a storage unit, to enable a chip to perform the image processing method described in the embodiments shown in
In an embodiment,
In an embodiment, the operation circuit 1603 includes a plurality of processing engines (PEs). In an embodiment, the operation circuit 1603 is a two-dimensional systolic array. The operation circuit 1603 may alternatively be a one-dimensional systolic array or another electronic circuit capable of performing mathematical operations such as multiplication and addition. In an embodiment, the operation circuit 1603 is a general-purpose matrix processor.
For example, it is assumed that there is an input matrix A, a weight matrix B, and an output matrix C. The operation circuit fetches data corresponding to the matrix B from a weight memory 1602 and buffers the data on each PE in the operation circuit. The operation circuit fetches data of the matrix A from an input memory 1601, performs a matrix operation on the matrix A and the matrix B to obtain a partial result or a final result of the matrix, and stores the result in an accumulator 1608.
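The data flow described above can be pictured with the following toy sketch, in which the matrix B is held stationary (as if buffered on the PEs), tiles of the matrix A stream through, and partial results are summed in an accumulator. This is only a functional illustration of the accumulation, not a model of the hardware.

```python
# Toy, software-only illustration of accumulating partial matrix results.
import numpy as np

def systolic_style_matmul(A: np.ndarray, B: np.ndarray, tile: int = 4) -> np.ndarray:
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    accumulator = np.zeros((M, N))                            # plays the role of accumulator 1608
    for k0 in range(0, K, tile):                              # stream tiles of A against the buffered B
        partial = A[:, k0:k0 + tile] @ B[k0:k0 + tile, :]     # partial result of the matrix
        accumulator += partial                                # accumulate partial results into the final result
    return accumulator
```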
A unified memory 1606 is configured to store input data and output data. Weight data is directly transferred to the weight memory 1602 by using a direct memory access controller (DMAC) 1605. The input data is also transferred to the unified memory 1606 by using the DMAC.
A bus interface unit (BIU) 1610 is configured to perform interaction between an AXI bus, the DMAC 1605, and an instruction fetch buffer (IFB) 1609. The bus interface unit 1610 is used by the instruction fetch buffer 1609 to obtain instructions from an external memory, and is further used by the direct memory access controller 1605 to obtain original data of the input matrix A or the weight matrix B from the external memory.
The DMAC is mainly configured to: transfer input data in an external memory DDR to the unified memory 1606, transfer the weight data to the weight memory 1602, or transfer the input data to the input memory 1601.
A vector calculation unit 1607 includes a plurality of operation processing units. When necessary, further processing is performed on an output of the operation circuit, such as vector multiplication, vector addition, exponential operation, logarithmic operation, and value comparison. The vector calculation unit 1607 is mainly used for non-convolutional or fully connected layer network calculation in a neural network, for example, batch normalization, pixel-level summation, and up-sampling of a feature map.
In an embodiment, the vector calculation unit 1607 can store, into the unified memory 1606, a processed output vector. For example, the vector calculation unit 1607 may apply a linear function and/or a non-linear function to the output of the operation circuit 1603, for example, perform linear interpolation on a feature plane extracted at a convolutional layer. For another example, a linear function and/or a non-linear function is applied to a vector of an accumulated value to generate an activation value. In an embodiment, the vector calculation unit 1607 generates a normalized value, a pixel-level sum, or a normalized value and a pixel-level sum. In an embodiment, the processed output vector can be used as an activation input to the operation circuit 1603, for example, the processed output vector is used in a subsequent layer in the neural network.
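As a purely functional illustration of the kind of element-wise post-processing the vector calculation unit 1607 applies to the accumulated output, the following sketch applies an activation function and then a simple per-row normalization. The specific operations used in practice depend on the network layer being computed; this is not a description of the hardware.

```python
# Illustrative post-processing of accumulated values (assumed operations only).
import numpy as np

def vector_unit_postprocess(accumulated: np.ndarray) -> np.ndarray:
    activated = np.maximum(accumulated, 0.0)             # e.g. a ReLU-style activation on accumulated values
    mean = activated.mean(axis=-1, keepdims=True)
    std = activated.std(axis=-1, keepdims=True) + 1e-5
    return (activated - mean) / std                      # e.g. a normalized value per row
```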
The instruction fetch buffer 1609 connected to the controller 1604 is configured to store instructions used by the controller 1604.
The unified memory 1606, the input memory 1601, the weight memory 1602, and the instruction fetch buffer 1609 are all on-chip memories. The external memory is private for a hardware architecture of the NPU.
Operations at layers of the first neural network and the second neural network shown in
Any one of the foregoing processors may be a general-purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits configured to control execution of a program of the method in the first aspect.
In addition, it should be noted that the described apparatus embodiment is merely an example. The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the modules may be selected depending on actual requirements to achieve the objectives of the solutions of embodiments. In addition, in the accompanying drawings of the apparatus embodiments provided by this application, connection relationships between modules indicate that the modules have communication connections with each other, which may be implemented as one or more communication buses or signal cables.
Based on the description of the foregoing implementations, one of ordinary skill in the art may clearly understand that this application may be implemented by software in addition to necessary universal hardware, or by dedicated hardware, including a dedicated integrated circuit, a dedicated CPU, a dedicated memory, a dedicated component, and the like. Generally, any function that can be performed by a computer program can be easily implemented by using corresponding hardware. Moreover, a hardware structure used to achieve a same function may be in various forms, for example, in a form of an analog circuit, a digital circuit, or a dedicated circuit. However, as for this application, software program implementation is a better implementation in most cases. Based on such an understanding, the technical solutions of this application essentially, or the part contributing to the conventional technology, may be implemented in a form of a software product. The computer software product is stored in a readable storage medium, such as a floppy disk, a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc of a computer, and includes several instructions for instructing a computer device (which may be a personal computer, a training device, a network device, or the like) to perform the methods in embodiments of this application.
All or some of the foregoing embodiments may be implemented by using software, hardware, firmware, or any combination thereof. When software is used to implement embodiments, all or some of embodiments may be implemented in a form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on the computer, the procedures or functions according to embodiments of this application are all or partially generated. The computer may be a general-purpose computer, a dedicated computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or may be transmitted from a computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from a website, a computer, a training device, or a data center to another website, computer, training device, or data center in a wired (for example, a coaxial cable, an optical fiber, or a digital subscriber line (DSL)) or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any usable medium that can be accessed by a computer, or a data storage device, such as a training device or a data center, that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid state disk (SSD)), or the like.
This application is a continuation of International Application No. PCT/CN2023/082159, filed on Mar. 17, 2023, which claims priority to Chinese Patent Application No. 202210302717.6, filed on Mar. 25, 2022. The disclosures of the aforementioned applications are hereby incorporated by reference in their entireties.