This application claims the priority benefit of Taiwan application serial no. 111127827, filed on Jul. 25, 2022. The entirety of the above-mentioned patent application is hereby incorporated by reference herein and made a part of this specification.
The present disclosure relates to a data processing mechanism, and more particularly, to a fault-mitigating method and a data processing circuit.
The neural network is an important topic in artificial intelligence (AI), which makes decisions by simulating the operations of human brain cells. It is worth noting that there are many neurons among the human brain cells, and these neurons are connected to one another through synapses. Each of the neurons receives signals via the synapses, and a converted output of the signal is transmitted to another neuron. The conversion ability of each of the neurons is different, and through the aforementioned operations of signal transmission and conversion, human beings form the ability to think and judge. The neural network obtains a corresponding ability according to the aforementioned operation method.
The neural network is often used in image recognition. In the operation of each of the neurons, an input component and the weight of the corresponding synapse are multiplied (possibly with a bias added) and then output through the calculation of a nonlinear function (e.g. an activation function) to extract image features. Inevitably, in a memory for storing input values, weight values, and function parameters, some storage blocks may be faulty/damaged (e.g. a hard error) due to poor yield, thereby affecting the completeness or correctness of the stored data. Even for a convolutional neural network (CNN), after a convolution calculation is executed, such a faulty/damaged situation will seriously affect image recognition results. For example, if the fault occurs in higher bits, the recognition success rate may approach zero.
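By way of illustration only (this sketch is not part of the disclosure), the following Python snippet shows why a hard error in a higher bit is far more damaging than one in a lower bit, assuming 8-bit unsigned stored weights and a hypothetical stuck-at-one fault model:

```python
# Illustrative sketch (assumption: 8-bit unsigned weights, stuck-at-one fault).
def apply_stuck_at_one(value, bit):
    """Simulate a hard error: the given bit of the stored value always reads as 1."""
    return value | (1 << bit)

weight = 0b0000_0101  # original stored weight = 5

low_fault = apply_stuck_at_one(weight, 1)   # fault in bit 1 (a lower bit)
high_fault = apply_stuck_at_one(weight, 7)  # fault in bit 7 (the highest bit)

print(low_fault)   # 7   -> small deviation from 5
print(high_fault)  # 133 -> large deviation; can ruin recognition results
```

The higher-bit fault shifts the weight by 128 rather than 2, which is consistent with the observation that faults in higher bits can drive the recognition success rate toward zero.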
In light of the foregoing, embodiments of the present disclosure provide a fault-mitigating method and a data processing circuit, which replace data based on statistical characteristics of adjacent features to improve recognition accuracy.
The fault-mitigating method of the embodiment of the present disclosure is suitable for a memory having faulty bits. The fault-mitigating method includes (but is not limited to) the following. A first data is written into the memory. A computed result is determined according to one or more adjacent bits of the first data at the faulty bits. According to the computed result, new values are determined. The new values replace the values of the first data at the faulty bits to form a second data. The first data includes multiple bits. The first data is image-related data, weights used by a multiply-accumulate (MAC) for extracting features of images, and/or values used by an activation calculation. The adjacent bits are adjacent to the faulty bits. The computed result is obtained through computing the values of the first data at non-faulty bits of the memory.
The data processing circuit of the embodiment of the present disclosure includes (but is not limited to) a memory and a processor. The memory is used for storing codes and has one or more faulty bits. The processor is coupled to the memory and is configured to load and execute the following steps. A first data is written into the memory. A computed result is determined according to one or more adjacent bits of the first data at the faulty bits. According to the computed result, new values are determined. The new values replace the values of the first data at the faulty bits to form a second data. The first data includes multiple bits. The first data is image-related data, weights used by a MAC for extracting features of images, and/or values used by an activation calculation. The adjacent bits are adjacent to the faulty bits. The computed result is obtained through computing the values of the first data at non-faulty bits of the memory.
Based on the above, the fault-mitigating method and the data processing circuit of the embodiments of the present disclosure use the computed result of the values at the non-faulty bits to replace the values at the faulty bits. Accordingly, an error rate of image recognition is reduced, thereby reducing the influence of faults.
In order to make the above-mentioned features and advantages of the disclosure clearer and easier to understand, the following embodiments are given and described in detail with the accompanying drawings as follows.
The memory 11 may be a static or a dynamic random access memory (RAM), a read-only memory (ROM), a flash memory, a register, a combinational circuit or a combination of the above components. In an embodiment, the memory 11 is used for storing image-related data, weights used by a MAC for extracting features of images, and/or values used by an activation calculation, a pooling calculation, and/or other neural network calculations. In other embodiments, users may determine the type of data stored in the memory 11 according to actual needs.
In an embodiment, the memory 11 is used to store codes, software modules, configurations, data, or files (e.g. neural-network-related parameters and computed results), which will be described in detail in subsequent embodiments.
In some embodiments, the memory 11 has one or more faulty bits. The faulty bits refer to faults/damages of the bits due to process errors or other factors (may be called hard error or permanent fault), which causes access results to be different from actual stored contents. The faulty bits have been detected in advance, and location information of the faulty bits in the memory 11 is available to the processor 12 (via a wired or wireless transmission interface). On the other hand, the bits in the memory 11 without faults/damages due to process errors or other factors are referred to as non-faulty bits. That is, non-faulty bits are not faulty bits.
The processor 12 is coupled to the memory 11. The processor 12 may be a circuit composed of multiplexers, adders, multipliers, encoders, decoders, or one or more of various types of logic gates, and may be a central processing unit (CPU), a graphics processing unit (GPU), or another programmable general-purpose or special-purpose microprocessor, a digital signal processor (DSP), a programmable controller, a field programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a neural network accelerator, another similar component, or a combination of the above components. In an embodiment, the processor 12 is configured to execute all or part of the operations of the data processing circuit 100, and to load and execute the various software modules, codes, files, and data stored in the memory 11. In some embodiments, the operations of the processor 12 are implemented through software.
It should be noted that the data processing circuit 100 is not limited to applications of the deep learning accelerator 200 (e.g. inception_v3, resnet101, or resnet152), and may be applied in any technical field requiring MACs.
In the following, a method according to an embodiment of the present disclosure will be described with reference to various components or circuits in the data processing circuit 100. Each process of the method may be adjusted according to the implementation situation, and is not limited hereto.
The memory 11 with one or more faulty bits provides one or more blocks for the first data or other data to store. The blocks are used for storing input parameters and/or output parameters (e.g. features maps or weights) of the neural network. The neural network is any version of Inception, GoogleNet, ResNet, AlexNet, SqueezeNet or other models. The neural network includes one or more layers of calculation. The calculation layer may be a convolutional layer, an activation layer, a pooling layer, or other neural network related layers.
If the first data is stored in the faulty bits of the memory 11, it may affect subsequent recognition or prediction results of the neural network. As an example,
The processor 12 determines a computed result according to one or more adjacent bits of the first data at the faulty bits (step S220). Specifically, one or more bits of the first data are stored in the faulty bits of the memory 11. The adjacent bits are adjacent to the faulty bits. That is, the adjacent bits are the bits located one bit higher or one bit lower than the faulty bits.
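As an illustrative sketch (not part of the disclosure), the adjacent bits of a faulty bit position in an n-bit word may be identified as follows; the function name and word width are hypothetical:

```python
# Illustrative sketch: the adjacent bits of a faulty bit are the positions
# one bit higher and one bit lower, when those positions exist in the word.
def adjacent_bits(faulty_bit, word_width):
    neighbors = []
    if faulty_bit + 1 < word_width:
        neighbors.append(faulty_bit + 1)  # higher adjacent bit
    if faulty_bit - 1 >= 0:
        neighbors.append(faulty_bit - 1)  # lower adjacent bit
    return neighbors

print(adjacent_bits(3, 8))  # [4, 2]
print(adjacent_bits(0, 8))  # [1] (the lowest bit has no lower neighbor)
```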
As an example,
According to experimental results, for image recognition and related applications, replacing the values at the faulty bits with values derived from the non-faulty bits helps to improve accuracy or prediction ability. The computed result is obtained by computing the values of the first data at the non-faulty bits of the memory 11. That is, the processor 12 performs calculations on the values at the non-faulty bits to obtain the computed result.
In an embodiment, the processor 12 obtains a first value of the first data at one or more evaluation bits. The evaluation bits are located at the lower bits of the adjacent bits. As an example,
The processor 12 adds the first value at the evaluation bits to a random number. The carry result after adding the random number is the computed result. It is worth noting that applying stochastic rounding to block floating point (BFP) helps to minimize the impact of rounding and thus reduce losses. For example, stochastic noise is added to the mantissa to shorten the mantissa of the BFP. Furthermore, since the similarity/correlation between adjacent features of images is high, introducing stochastic noise at the adjacent bits helps to predict the values at the faulty bits. The carry result indicates a carry or no carry into the adjacent bits located at the higher bits of the evaluation bits.
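The carry-based computation described above may be sketched as follows. This is an illustrative interpretation only, not the disclosed implementation: it assumes the evaluation bits are all bits below the faulty bit, and that the random number has the same width as that field, so the carry out of the field's top predicts the faulty bit's value.

```python
import random

# Illustrative sketch (assumptions: evaluation bits = all bits below the
# faulty bit; random number spans the same width as the evaluation field).
def stochastic_carry(value, faulty_bit, rng=random):
    eval_mask = (1 << faulty_bit) - 1           # mask for the evaluation bits
    eval_value = value & eval_mask              # first value at the evaluation bits
    noise = rng.randrange(1 << faulty_bit)      # random number of equal width
    carry = (eval_value + noise) >> faulty_bit  # 1 if the addition carried out
    return carry & 1
```

A larger value in the evaluation bits makes a carry (and hence a predicted "1") more likely, which is the essence of stochastic rounding.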
Taking
In another embodiment, the adjacent bits include the higher bits and the lower bits adjacent to the faulty bits. For example,
The processor 12 determines a statistical value of the values of the first data at the higher bits and the lower bits. The statistical value is the computed result. The statistical value may be an arithmetic mean or a weighted calculation of the values of the first data at the higher bits and the lower bits. Experimental results show that there is still a certain degree of similarity or correlation between the value of a certain bit and the values of a plurality of bits adjacent to that bit. Therefore, the values at the faulty bits may be predicted with reference to more adjacent bits.
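As a minimal sketch (not part of the disclosure), the arithmetic-mean variant may be written as follows, assuming single-bit neighbors directly above and below the faulty bit and a floor-rounded mean:

```python
# Illustrative sketch: predict the faulty bit as the arithmetic mean
# (rounded down) of the bits directly above and below it.
def mean_of_neighbors(value, faulty_bit):
    hi = (value >> (faulty_bit + 1)) & 1  # higher adjacent bit
    lo = (value >> (faulty_bit - 1)) & 1  # lower adjacent bit
    return (hi + lo) // 2                 # mean of "0" and "1" -> "0"; "1" and "1" -> "1"
```

This reproduces the examples given below: the mean of "0" and "1" is "0", and the mean of "1" and "1" is "1".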
In other embodiments, the computed result may also be other mathematical calculations.
Referring to
In the embodiment of the statistical value, the processor 12 directly regards the statistical value as the new values. For example, the arithmetic mean of “0” and “1” is “0”. In another example, the arithmetic mean of “1” and “1” is “1”.
The processor 12 replaces the values of the first data at the faulty bits with the new values to form a second data (step S240). Specifically, the processor 12 accesses data as input data to a multiplier-adder or other calculation units when there is a MAC or other requirement. It is worth noting that the processor 12 skips accessing the values at the one or more faulty bits in the memory 11, because otherwise faulty values would be read from the faulty bits. Taking
That is, if there is a demand for access, the processor 12 obtains the second data. The second data is the first data in which the values corresponding to the faulty bits are changed to the new values, while the values corresponding to the non-faulty bits remain unchanged. Taking
It should be noted that "replacement" in this context means that when some bits of the first data are stored in the faulty bits, the processor 12 skips reading the values at the faulty bits and directly uses the new values as the values at the faulty bits. However, the values stored in the faulty bits are not stored in the non-faulty bits. For example, if the faulty bits are at the second location, the processor 12 replaces the values at the second location with the new values, and disables/stops/does not read the values at the second location. At this time, the values at the second location in the second data read by the processor 12 are the same as the new values.
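The replacement step above can be sketched end-to-end as follows. This is an illustrative sketch, not the disclosed circuit: it assumes a single faulty bit per word and models "ignoring" the faulty bit by masking it out of the read word before substituting the new value; nothing is written back to the faulty cell.

```python
# Illustrative sketch: form the second data by substituting the computed
# new value at the faulty bit position of the word read from memory.
def form_second_data(stored_word, faulty_bit, new_value):
    cleared = stored_word & ~(1 << faulty_bit)      # ignore the faulty bit's value
    return cleared | ((new_value & 1) << faulty_bit)  # substitute the new value

# e.g. stored word 0b1010 with faulty bit 1 and new value 0 -> 0b1000
print(bin(form_second_data(0b1010, 1, 0)))  # 0b1000
```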
To sum up, in the data processing circuit and the fault-mitigating method of the embodiments of the present disclosure, the new values for replacing the values at the faulty bits are determined according to the computed result of the values at the adjacent non-faulty bits. Accordingly, the error rate of the prediction result of the neural network is reduced.
Although the present disclosure has disclosed the embodiments in the above, it is not intended to limit the present disclosure. Those skilled in the art can make some changes and modifications without departing from the spirit and the scope of the present disclosure. The protection scope of the present disclosure shall be determined by the claims appended in the following.
Number | Date | Country | Kind |
---|---|---|---|
111127827 | Jul 2022 | TW | national |